The document you are reading now is targeted at developers wanting to use or contribute to the engine of Open Terms Archive. For a high-level overview of Open Terms Archive’s wider goals and processes, please read its public homepage.
Open Terms Archive Engine
This codebase is a Node.js module enabling downloading, archiving and publishing versions of documents obtained online. It can be used independently from the Open Terms Archive ecosystem.
For documentation, visit docs.opentermsarchive.org
License
The code for this software is distributed under the European Union Public Licence (EUPL) v1.2. In short, this means you are allowed to read, use, modify and redistribute this source code, as long as you credit “Open Terms Archive Contributors” and make any change you make to it available under similar conditions.
Contact the core team over email at contact@[project name without spaces].org if you have any specific need or question regarding licensing.
0.27.0 - 2023-04-19
Full changeset and discussions: #996, #999, #1000.
Development of this release was supported by the French Ministry for Foreign Affairs through its ministerial State Startups incubator under the aegis of the Ambassador for Digital Affairs.
Changed
- Breaking: Rename CLI option --terms-types to --types in API; simply rename accordingly in your own codebase
- Breaking: Rename CLI option --refilter-only, -r to --extract-only, -e in API; simply rename accordingly in your own codebase
- Breaking: Rename class PageDeclaration to SourceDocument and its attribute noiseSelectors to insignificantContentSelectors in API; simply rename accordingly in your own codebase
- Breaking: Rename function and its parameters filter({ content, mimeType, pageDeclaration }) to extract(sourceDocument) in API; content and mimeType are now embedded as sourceDocument attributes; rename accordingly in your own codebase and set content and mimeType in the sourceDocument passed as a parameter to the function
- Breaking: Rephrase commit messages in Git storage: “Start tracking” is changed to “First record of”, “Refilter” to “Apply technical or declaration upgrade on” and “Update” to “Record new changes of”; existing data will still be loaded, but new commits will use these new messages; if you have scripts that parse commit messages directly, update them accordingly
- Breaking: Rename document attribute isRefilter to isExtractOnly in MongoDB storage; existing data will still be loaded, but new entries will use this new attribute; if you have scripts that query the Mongo database directly, update them accordingly
- Make vocabulary consistent throughout the codebase (#971)
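
For codebases and scripts affected by the renames above, the migration could be sketched roughly as follows. This is only an illustration under stated assumptions: toSourceDocument, classifyCommitMessage, isExtractOnlyRecord and EXTRACT_ONLY_QUERY are hypothetical names that do not exist in the engine, the declaration is treated as a plain object rather than an instance of the SourceDocument class, and commit messages are assumed to start with the phrases listed in this changelog.

```javascript
// Hypothetical migration helpers; none of these names are part of the engine's API.

// Old call: filter({ content, mimeType, pageDeclaration })
// New call: extract(sourceDocument), with content and mimeType set on sourceDocument.
// Assumption: the declaration is handled as a plain object for illustration.
function toSourceDocument({ content, mimeType, pageDeclaration }) {
  const { noiseSelectors, ...otherAttributes } = pageDeclaration;

  return {
    ...otherAttributes,
    insignificantContentSelectors: noiseSelectors, // renamed attribute
    content,
    mimeType,
  };
}

// Commit message prefixes after and before 0.27.0, as listed in the changelog.
const COMMIT_PREFIXES = [
  { kind: 'firstRecord', prefixes: [ 'First record of', 'Start tracking' ] },
  { kind: 'extractOnly', prefixes: [ 'Apply technical or declaration upgrade on', 'Refilter' ] },
  { kind: 'newChanges', prefixes: [ 'Record new changes of', 'Update' ] },
];

// Returns the kind of record a commit message describes, or null when no known prefix matches.
function classifyCommitMessage(message) {
  const match = COMMIT_PREFIXES.find(({ prefixes }) => prefixes.some(prefix => message.startsWith(prefix)));

  return match ? match.kind : null;
}

// True for stored records flagged with either the new or the legacy attribute.
function isExtractOnlyRecord(record) {
  return Boolean(record.isExtractOnly || record.isRefilter);
}

// The same condition written as a MongoDB query filter covering both attribute names.
const EXTRACT_ONLY_QUERY = { $or: [ { isExtractOnly: true }, { isRefilter: true } ] };
```

Accepting both the old and the new wording keeps such scripts working on repositories and databases that contain entries written before and after this release.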
Removed
- Breaking: Remove npm run extract command; use npm run start -- --extract-only instead
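
Wrapper scripts that still build argument lists with the pre-0.27.0 option names (--terms-types, --refilter-only, -r) could translate them with a small helper like the one below. It is a minimal sketch: migrateArgs and RENAMED_OPTIONS are hypothetical names, not part of the engine, and the helper matches option names only, so a value that happens to equal -r would also be rewritten.

```javascript
// Hypothetical helper for wrapper scripts; not part of the engine.
// CLI options renamed in 0.27.0, mapped from old name to new name.
const RENAMED_OPTIONS = new Map([
  [ '--terms-types', '--types' ],
  [ '--refilter-only', '--extract-only' ],
  [ '-r', '-e' ],
]);

// Returns a copy of the argument list with legacy option names replaced.
function migrateArgs(args) {
  return args.map(arg => {
    // Split on "=" so that a "--option=value" form is also covered, in case a wrapper uses it.
    const [ name, ...rest ] = arg.split('=');
    const renamed = RENAMED_OPTIONS.get(name);

    return renamed ? [ renamed, ...rest ].join('=') : arg;
  });
}
```

For example, migrateArgs([ '--refilter-only', '--terms-types', 'Privacy Policy' ]) returns [ '--extract-only', '--types', 'Privacy Policy' ], which can then be passed after npm run start -- in place of the removed npm run extract command.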