
@istex/istex-merge
Library to build merged documents and generate Hal TEIs from them.
npm install @istex/istex-merge
Function to create a merged document from multiple documents with a set of rules.
Complete the resources/mapping.json file.
This JSON file's structure is as follows:
{
  "corpusName": true,
  "source": true,
  "sourceId": false,
  "sourceUid": {
    "action": "merge",
    "path": "sourceUids"
  },
  // ...
  "title.default": true,
  "title.en": true,
  "title.fr": true,
  "utKey": false,
  // ...
  "_business.duplicates": {
    "action": "merge",
    "id": "sourceUid"
  },
  // ...
  "_business.hasFulltext": false,
  "fulltextUrl": true
}
This file describes the fields that will be present in the generated merged document.
Note: istex-merge can merge the data coming from all sources. The two possible scenarios are:
- Merging into another field: the sourceUid field is merged and placed into sourceUids (plural, because the value becomes an array).
- Merging into the same field (_business.duplicates): a property (sourceUid in the example above) must be used to discriminate the values and remove potential duplicates if the values are objects.
Complete the JSON files describing the priority rules (example: rules/default.json).
This JSON file's structure is as follows:
{
  "priorities": [
    "hal",
    "crossref",
    "pubmed",
    "sudoc"
  ],
  "keys": {
    "corpusName": [/*...*/],
    "source": [/*...*/],
    "sourceId": [/*...*/],
    "sourceUid": [/*...*/],
    // ...
    "title.default": [/*...*/],
    "title.fr": [/*...*/],
    "title.en": [/*...*/],
    "utKey": [/*...*/],
    // ...
    "_business.hasFulltext": [/*...*/],
    "fulltextUrl": [/*...*/]
  }
}
The priority mechanism:
- priorities defines the default priority order. It is applied to every field without a specific priority order.
- keys.<field> defines a specific priority order for <field>. Use an empty array ([]) to tell istex-merge to use the default priority order.
This library must be integrated in an environment with direct access to the docObjects and the JSON file with the rules.
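// The package is installed under its scoped name, so it is required under that name.
const { generateMergedDocument } = require('@istex/istex-merge');
// JSON file containing the priority rules described above.
const rules = require('./myCustomFile.json');
// The docObjects to merge (contents elided here).
const docObjects = [{ /* ... */ }, { /* ... */ }, { /* ... */ }];
const mergedDocument = generateMergedDocument(docObjects, rules);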
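The priority mechanism described above can be pictured with a small sketch. It is only an illustration of the documented rule (a non-empty keys.<field> array overrides priorities, while an empty or missing one falls back to it). The helper name resolvePriorityOrder and the exampleRules object are hypothetical and are not part of istex-merge.

```javascript
// Hypothetical helper for illustration; not part of istex-merge's API.
// Returns the source order that applies to `field`:
// a non-empty keys.<field> array wins, otherwise `priorities` is used.
function resolvePriorityOrder(rules, field) {
  const specific = rules.keys ? rules.keys[field] : undefined;
  return Array.isArray(specific) && specific.length > 0
    ? specific
    : rules.priorities;
}

// Example rules, shaped like the JSON file described above.
const exampleRules = {
  priorities: ['hal', 'crossref', 'pubmed', 'sudoc'],
  keys: {
    authors: [],
    'abstract.fr': ['crossref', 'pubmed', 'sudoc', 'hal'],
  },
};

// Empty array: falls back to the default order.
console.log(resolvePriorityOrder(exampleRules, 'authors'));
// Specific order: used as is.
console.log(resolvePriorityOrder(exampleRules, 'abstract.fr'));
// Field absent from `keys`: default order again.
console.log(resolvePriorityOrder(exampleRules, 'title.en'));
```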
Considering the following list of documents:
[
  {
    "source": "hal",
    "authors": [],
    "abstract": {
      "fr": "abstract.hal.fr",
      "en": "abstract.hal.en"
    }
  },
  {
    "source": "crossref",
    "authors": [
      "authors.crossref.1",
      "authors.crossref.2"
    ],
    "abstract": {
      "fr": "abstract.crossref.fr",
      "en": "abstract.crossref.en"
    }
  },
  {
    "source": "pubmed",
    "authors": [
      "authors.pubmed.1",
      "authors.pubmed.2"
    ],
    "abstract": {
      "fr": "abstract.pubmed.fr",
      "en": "abstract.pubmed.en"
    }
  },
  {
    "source": "sudoc",
    "authors": [
      "authors.sudoc.1",
      "authors.sudoc.2"
    ],
    "abstract": {
      "fr": "abstract.sudoc.fr",
      "en": "abstract.sudoc.en"
    }
  }
]
Note: The docObjects used to create the merged document MUST contain a source field.
I want to build a merged document according to the following rules:
- For abstract.fr, use data coming from "crossref", then "pubmed" and finally "sudoc".
- For abstract.en, use data coming from "pubmed", then "sudoc".
I therefore use the following JSON file:
{
  "priorities": [
    "hal",
    "crossref",
    "pubmed",
    "sudoc"
  ],
  "keys": {
    "authors": [],
    "abstract.fr": [
      "crossref",
      "pubmed",
      "sudoc",
      "hal"
    ],
    "abstract.en": [
      "pubmed",
      "sudoc",
      "crossref",
      "hal"
    ]
  }
}
Which will give me the following result:
{
  "source": "hal",
  "authors": [
    "authors.crossref.1",
    "authors.crossref.2"
  ],
  "abstract": {
    "fr": "abstract.crossref.fr",
    "en": "abstract.pubmed.en"
  },
  "origins": {
    "authors": "crossref",
    "abstract.fr": "crossref",
    "abstract.en": "pubmed",
    "sources": [
      "hal",
      "crossref",
      "pubmed"
    ]
  }
}
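In this result, authors comes from "crossref" even though "hal" is first in the default order, because the "hal" document has an empty authors array. The sketch below shows one way such a fallback could work. It is not istex-merge's actual implementation: the helpers getPath, isEmpty and pickFromSources are hypothetical, and the definition of "empty" (undefined, null, empty string, empty array) is an assumption.

```javascript
// Hypothetical helpers for illustration; not part of istex-merge's API.

// Reads a dotted path such as "abstract.fr" from an object.
function getPath(obj, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    obj
  );
}

// Assumption: undefined, null, '' and [] all count as "no data".
function isEmpty(value) {
  return value == null || value === '' ||
    (Array.isArray(value) && value.length === 0);
}

// Walks `order` and returns the first source that holds data for `field`.
function pickFromSources(docObjects, field, order) {
  for (const source of order) {
    const doc = docObjects.find((d) => d.source === source);
    const value = doc ? getPath(doc, field) : undefined;
    if (!isEmpty(value)) return { source, value };
  }
  return { source: undefined, value: undefined };
}

// Reduced version of the documents listed earlier.
const docs = [
  { source: 'hal', authors: [], abstract: { fr: 'abstract.hal.fr' } },
  {
    source: 'crossref',
    authors: ['authors.crossref.1', 'authors.crossref.2'],
    abstract: { fr: 'abstract.crossref.fr' },
  },
];

const picked = pickFromSources(docs, 'authors', ['hal', 'crossref', 'pubmed', 'sudoc']);
console.log(picked.source); // crossref
```

Sources listed in the order but absent from the input (here "pubmed" and "sudoc") are simply skipped in this sketch.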
Description:
- source: the base source
- origins.<field>: the source whose data istex-merge used for <field>
- origins.sources: an array compiling all the sources used in the merged document
Note: if the highest-priority source has no data for a field (for example authors), istex-merge will go down the priority list until it finds a source with data for this field.
Function to generate a Hal TEI from a merged document.
Generate a merged document using the generateMergedDocument function
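// The package is installed under its scoped name, so it is required under that name.
const { generateMergedDocument, generateHalTEI } = require('@istex/istex-merge');
const rules = require('./myCustomFile.json');
// The docObjects to merge (contents elided here).
const docObjects = [{ /* ... */ }, { /* ... */ }, { /* ... */ }];
const mergedDocument = generateMergedDocument(docObjects, rules);
// The TEI is returned as a string.
const halTEIAsString = generateHalTEI(mergedDocument);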
You can also pass an options object to generateHalTEI. This object is passed as is to xmlbuilder2 (the XML builder used by istex-merge). All the available options are listed in the xmlbuilder2 documentation.
For example, you can use this options object to pretty print the TEI like so:
const prettyPrintedTEI = generateHalTEI(mergedDocument, { prettyPrint: true });