bibliograph.parsing
Each parser accepts input in a given bibliographic reference format and outputs
a list of Python dictionaries, one for each entry listed in the input source. Each
of these dictionaries will contain some number of the following fields:
+---------------------+-----------+---------------------------------------------------+
| Field Name:         | Required: | Description of Field Contents:                    |
+=====================+===========+===================================================+
|reference_type       |Yes        |the type of content referenced by this entry       |
+---------------------+-----------+---------------------------------------------------+
|title                |Yes        |the title of the content referenced by this entry  |
+---------------------+-----------+---------------------------------------------------+
|abstract             |No         |short description or summary of the content        |
|                     |           | referenced by this entry                          |
+---------------------+-----------+---------------------------------------------------+
|publisher            |?          |name of the publishing company                     |
+---------------------+-----------+---------------------------------------------------+
|publication_year     |?          |year in which the content was published            |
+---------------------+-----------+---------------------------------------------------+
|publication_month    |?          |month in which the content was published           |
+---------------------+-----------+---------------------------------------------------+
|publication_url      |?          |fully-qualified URL pointing to an online version  |
|                     |           | of the content                                    |
+---------------------+-----------+---------------------------------------------------+
|authors              |Yes        |list of dictionaries, one for each author of the   |
|                     |           | content. The dictionaries will contain three      |
|                     |           | items: 'firstname' (given name), 'lastname'       |
|                     |           | (surname, family name), 'middlename' (any name or |
|                     |           | names between the first and last names)           |
+---------------------+-----------+---------------------------------------------------+
|journal              |No         |Title of the journal in which the content appears  |
+---------------------+-----------+---------------------------------------------------+
|volume               |No         |Volume of the periodical in which the content      |
|                     |           | appears                                           |
+---------------------+-----------+---------------------------------------------------+
|number               |No         |Number of the periodical in which the content      |
|                     |           | appears                                           |
+---------------------+-----------+---------------------------------------------------+
|pages                |No         |Page numbers within the given volume:number of the |
|                     |           | periodical in which the content appears           |
+---------------------+-----------+---------------------------------------------------+
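As an illustration of the table above, the following sketch shows what one entry dictionary might look like and how a caller could check it for the fields marked as required. The sample values and the helpers ``missing_required_fields`` and ``format_author`` are hypothetical and not part of the package; only the field names come from the table.

```python
# Field names marked "Yes" in the table above.
REQUIRED_FIELDS = ("reference_type", "title", "authors")
# Keys of each author dictionary, in display order.
AUTHOR_KEYS = ("firstname", "middlename", "lastname")

# Hypothetical entry; every value here is invented for illustration.
entry = {
    "reference_type": "article",
    "title": "An Example Title",
    "authors": [
        {"firstname": "Jane", "middlename": "Q.", "lastname": "Doe"},
    ],
    "journal": "Journal of Examples",
    "volume": "12",
    "number": "3",
    "pages": "45-67",
}


def missing_required_fields(entry):
    """Return the required field names that are absent from an entry."""
    return [name for name in REQUIRED_FIELDS if name not in entry]


def format_author(author):
    """Join the non-empty name parts of one author dictionary."""
    parts = [author.get(key, "") for key in AUTHOR_KEYS]
    return " ".join(part for part in parts if part)
```

Fields marked "?" or "No" are treated as optional here, so code consuming parser output should read them with ``entry.get(...)`` rather than by direct key access.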
Requirements
- requires Bibutils 4.6 or higher
Configuration
bibliograph.parsing honors the environment variable FIX_BIBTEX. If set, the module
will clean up BibTeX import data through the "bib2xml | xml2bib" pipeline in order to
repair improper or misformatted BibTeX data. However, you may lose some data (e.g. the
annotate field will be filtered out by Bibutils).
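The clean-up step described above can be sketched roughly as follows. This is not the package's own code: the helper names ``fix_bibtex_requested`` and ``roundtrip_bibtex`` are hypothetical, the check assumes that the mere presence of FIX_BIBTEX counts as "set", and the round trip assumes that the Bibutils tools ``bib2xml`` and ``xml2bib`` are on the PATH and read from standard input when no file is given.

```python
import os
import subprocess


def fix_bibtex_requested(environ=None):
    """Report whether FIX_BIBTEX is present in the environment.

    Assumption: any value, even an empty one, counts as "set".
    """
    environ = os.environ if environ is None else environ
    return "FIX_BIBTEX" in environ


def roundtrip_bibtex(source):
    """Pipe BibTeX text through bib2xml and then xml2bib.

    Requires Bibutils to be installed; raises CalledProcessError
    if either tool exits with a non-zero status.
    """
    xml = subprocess.run(
        ["bib2xml"], input=source, capture_output=True, text=True, check=True
    ).stdout
    return subprocess.run(
        ["xml2bib"], input=xml, capture_output=True, text=True, check=True
    ).stdout


# Usage sketch: normalize the data only when the variable is set.
# if fix_bibtex_requested():
#     source = roundtrip_bibtex(source)
```

To enable the behaviour for a process, set the variable before parsing, for example with ``os.environ["FIX_BIBTEX"] = "1"`` in Python or by exporting it in the shell.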
Sources
Formats for input files have been gleaned from a number of sources:
RIS: http://www.refman.com/support/risformat_intro.asp
Contributors
Change history
1.0.1 (2011-02-10)
- bibtex.py: add spaces to mname if more than one part is left for mname
1.0.0 (2010-03-19)
1.0.0c2 (2010-03-09)
- changed .end -> .enw for proper format detection
1.0.0c1 (2010-03-03)
1.0.0b5 (2010-02-01)
- fixed RIS tests due to changes in bibliograph.core related to RIS parameters
1.0.0b4 (2010-01-31)
- introducing FIX_BIBTEX environment variable to enable bib2bib transformation
in order to make BibTeX parsing more robust
1.0.0b3 (2010-01-31)
- made BibTeX parser more robust
1.0.0b2 (2010-01-30)
- fixed failing endnote parser test
1.0.0b1 (2010-01-28)
- new numbering schema
- minor tweaks
0.2.3 (2010-01-22)
- added explicit input encoding check for RIS files, since bibliograph.core
now expects RIS input data to be UTF-8 encoded
- updated tests with utf-8 encoded input data
0.2.2 (2009-12-12)
- now dealing correctly with all TeX escapings (and restored
the escaping support of old versions)
0.2.1 (2009-12-05)
- fixed keywords import of BibTeX files
0.2.0 (2009-12-04)
- added BibTeX parsing support for identifiers (ISBN, ASIN, PURL, URN, ISSN, DOI)
- BibTeX parser now deals correctly with keys containing a dash like 'date-modified'
- added more tests
0.1.0 (2008-09-04)
- Created package with ZopeSkel
- Ported all parsers from Products.CMFBibliographyAT
- Removed all CMF / Zope2 / Plone dependencies
- Removed obsolete/unused parsers: [CitationManager, IBSS, ISBN, pyblbibex]
- Fixed broken EndNote parser
- Established reliable checkFormat() methods for existing parsers
- Extended parser test coverage