datacite-mapping

A library for mapping DataCite XML to Ruby objects, based on xml-mapping and xml-mapping_extensions.
Full API documentation on RubyDoc.info.
Supports DataCite 4.3; backward-compatible with DataCite 3.1.
Note that although this gem maintains compatibility with multiple
versions of DataCite XML, changes to DataCite XML sometimes force
changes to the gem's internal object model. As a result, upgrading between
versions of this gem may require minor changes to how parts of the model are accessed.
Usage
The core of the Datacite::Mapping library is the Resource class, corresponding to the root <resource/> element in a Datacite document.
Reading
To create a Resource object from XML, use Resource.parse_xml or Resource.load_from_file, depending on the data source:
XML source | Method to use |
---|---|
file path | Resource.load_from_file |
String | Resource.parse_xml |
IO | Resource.parse_xml |
REXML::Document | Resource.parse_xml |
REXML::Element | Resource.parse_xml |
Example:
require 'datacite/mapping'
include Datacite::Mapping
resource = Resource.load_from_file('datacite-example-full-v4.3.xml')
abstract = resource.descriptions.find { |d| d.type == DescriptionType::ABSTRACT }
abstract.value
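As a rough illustration of the non-path sources in the table above, the sketch below prepares the same XML in each form using only the Ruby standard library. The sample document is a hand-written stand-in, not a complete DataCite record, and StringIO is used here as a stand-in for an IO; whether a given IO-like object is accepted by Resource.parse_xml is an assumption to verify against the API documentation.

```ruby
require 'rexml/document'
require 'stringio'

# Minimal stand-in document (assumption); a real DataCite record has more elements.
xml_string = <<~XML
  <resource xmlns="http://datacite.org/schema/kernel-4">
    <identifier identifierType="DOI">10.5555/12345678</identifier>
  </resource>
XML

# The same XML in each non-path form listed in the table.
sources = {
  string: xml_string,
  io: StringIO.new(xml_string), # IO-like stand-in
  document: REXML::Document.new(xml_string),
  element: REXML::Document.new(xml_string).root
}

# With the gem loaded, each value would be handed to Resource.parse_xml, e.g.:
#   resource = Datacite::Mapping::Resource.parse_xml(sources[:element])
sources.each { |kind, source| puts "#{kind}: #{source.class}" }
```

A file path, by contrast, goes to Resource.load_from_file, as in the example above.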
Note that Datacite::Mapping uses the TypesafeEnum gem to represent controlled vocabularies such as ResourceTypeGeneral and DescriptionType.
Writing
In general, a Resource object must be provided with all required attributes on initialization.
resource = Resource.new(
  identifier: Identifier.new(value: '10.5555/12345678'),
  creators: [
    Creator.new(
      name: 'Josiah Carberry',
      identifier: NameIdentifier.new(
        scheme: 'ORCID',
        scheme_uri: URI('http://orcid.org/'),
        value: '0000-0002-1825-0097'
      ),
      affiliations: [
        'Department of Psychoceramics, Brown University'
      ]
    )
  ],
  titles: [
    Title.new(value: 'Toward a Unified Theory of High-Energy Metaphysics: Silly String Theory')
  ],
  publisher: 'Journal of Psychoceramics',
  publication_year: 2008
)
To create XML from a Resource object, use Resource.write_xml, Resource.save_to_file, or Resource.save_to_xml, depending on the destination:
XML destination | Method to use |
---|---|
XML string | Resource.write_xml |
file path | Resource.save_to_file |
REXML::Element | Resource.save_to_xml |
Example:
resource.write_xml
Namespace prefix
To set a prefix for the Datacite namespace, use Resource.namespace_prefix=:
resource.namespace_prefix = 'dcs'
resource.write_xml
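A namespace prefix changes how elements are spelled in the output, not which namespace they belong to. The sketch below checks this with REXML from the standard library; the two XML strings are hand-written for illustration and are not output captured from the gem.

```ruby
require 'rexml/document'

# DataCite 4 kernel namespace URI.
DATACITE_4_NS = 'http://datacite.org/schema/kernel-4'

# Hand-written examples (assumption): one document using the default
# namespace, one using the 'dcs' prefix for the same namespace.
unprefixed = REXML::Document.new(%(<resource xmlns="#{DATACITE_4_NS}"/>)).root
prefixed   = REXML::Document.new(%(<dcs:resource xmlns:dcs="#{DATACITE_4_NS}"/>)).root

# Same local name and same namespace URI; only the prefix differs.
puts prefixed.prefix
puts unprefixed.name == prefixed.name
puts unprefixed.namespace == prefixed.namespace
```

A namespace-aware consumer should therefore treat prefixed and unprefixed output as equivalent.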
Datacite 3 compatibility
In general, Datacite::Mapping is lax on read, accepting either Datacite 3 or Datacite 4 or a mix,
and (mostly for historical reasons involving bad data its authors needed to parse) allowing some
deviations from the schema. By default, it writes Datacite 4, but can write Datacite 3 by passing
an optional argument to any of the writer methods:
resource.write_xml(mapping: :datacite_3)
When using the :datacite_3 mapping, the Datacite 4 <geoLocationPolygon/> and <fundingReference/> elements, which are not supported in Datacite 3, will be dropped, with a warning. Any <relatedIdentifier/> elements of type IGSN will be converted to Handle identifiers with prefix 10273 (the prefix of the IGSN resolver).
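The IGSN rule can be pictured with a small plain-Ruby sketch. The RelatedId struct, the to_datacite_3 helper, and the sample IGSN value are hypothetical names introduced for illustration; this is not the gem's implementation, and it assumes the usual prefix/suffix form of a Handle.

```ruby
# Handle prefix of the IGSN resolver, as stated above.
IGSN_HANDLE_PREFIX = '10273'

# Hypothetical stand-in for a related identifier (not the gem's class).
RelatedId = Struct.new(:type, :value)

# Hypothetical helper: rewrite IGSN identifiers as Handles and
# leave every other identifier type untouched.
def to_datacite_3(related_id)
  return related_id unless related_id.type == 'IGSN'

  RelatedId.new('Handle', "#{IGSN_HANDLE_PREFIX}/#{related_id.value}")
end

igsn = RelatedId.new('IGSN', 'IECUR0097') # sample value, for illustration only
doi  = RelatedId.new('DOI', '10.5555/12345678')

converted = to_datacite_3(igsn)
puts "#{converted.type}: #{converted.value}"
puts to_datacite_3(doi).type
```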
Contributing
Datacite::Mapping is released under an MIT license. When submitting a pull request,
please make sure that the Rubocop style checks pass and that the unit tests pass with 100%
coverage. You can check these individually with bundle exec rubocop and bundle exec rake coverage,
or run the default rake task, which includes both, with bundle exec rake.