istex
This plugin provides a set of instructions for working with the ISTEX API.
installation
npm install @ezs/istex
usage
ISTEX
Take an array and return the matching documents for every value of the array.
Parameters
query
(string | Array<string>) ISTEX query (or queries) (optional, default data.query||[])
id
(string | Array<string>) ISTEX id (or ids) (optional, default data.id||[])
maxPage
number maximum number of pages to get
size
number size of each page of results
duration
string maximum duration between two requests (ex: "30s")
field
Array<Object> fields to output
Examples
.pipe(ezs('ISTEX', {
query: 'this is a test',
size: 3,
maxPage: 1,
sid: 'test'
}))
Returns Array<Object>
ISTEXFacet
Take an object containing a query string field and a facet, and output
aggregations from the ISTEX API.
Parameters
query
string ISTEX query (optional, default "*")
facet
string ISTEX facet (optional, default "corpusName")
sid
string User-agent identifier (optional, default "ezs-istex")
Examples
from([{ query: 'ezs', facet: 'corpusName' }])
.pipe(ezs('ISTEXFacet', { sid: 'test', }))
Returns Array<Object>
ISTEXFetch
Take an Object with an id and return the document's metadata.
Parameters
source
string Field to use to fetch documents (optional, default "id")
target
string
id
string ISTEX Identifier of a document (optional, default data.id)
sid
string User-agent identifier (optional, default "ezs-istex")
Examples
Input:
[{
id: '87699D0C20258C18259DED2A5E63B9A50F3B3363',
}, {
id: 'ark:/67375/QHD-T00H6VNF-0',
}]
will produce two JSON records.
.pipe(ezs('ISTEXFetch', { source: 'id' }))
Returns Array<Object>
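The two identifiers in the example above have different shapes (a 40-character hexadecimal id and an ARK). The sketch below shows one way such identifiers might be turned into request URLs. It is an illustration only: `documentUrl` is a hypothetical helper, not part of this plugin, and the `/document/` path is an assumption based on the public ISTEX API rather than on this plugin's source.

```javascript
// Hypothetical helper, not part of @ezs/istex: map an identifier to a URL.
function documentUrl(id) {
  const base = 'https://api.istex.fr';
  // Assumption: ARK identifiers are addressed directly under the API root,
  // any other identifier under /document/.
  return id.startsWith('ark:') ? `${base}/${id}` : `${base}/document/${id}`;
}

const ids = [
  '87699D0C20258C18259DED2A5E63B9A50F3B3363',
  'ark:/67375/QHD-T00H6VNF-0',
];
const urls = ids.map(documentUrl);
console.log(urls);
```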
ISTEXFiles
Take an Object with an ISTEX id and generate an object for each file.
See ISTEXScroll
Parameters
fulltext
string typology of the document to save (optional, default pdf)
metadata
string format of the files to save (optional, default json)
enrichment
string? enrichment of the document to save
sid
string User-agent identifier (optional, default "ezs-istex")
Returns Array
ISTEXFilesContent
Take an Object with an ISTEX source and check the document's file.
Warning: to access fulltext, you have to give a token parameter.
ISTEXFetch produces the stream you need to save the file.
See ISTEXFiles
Parameters
Returns Object
ISTEXFilesWrap
Take an Object with an ISTEX stream and wrap it into a single zip.
See ISTEXFiles
Returns Buffer
ISTEXParseDotCorpus
Parse the content of a .corpus file, and execute the action contained in the
.corpus file.
1. query.corpus
[ISTEX]
query = language.raw:rum
field = doi
field = author
field = title
field = language
field = publicationDate
field = keywords
field = host
field = fulltext
2. notice.corpus
[ISTEX]
id 2FF3F5B1477986B9C617BB75CA3333DBEE99EB05
Returns Object
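To make the .corpus layout above more concrete, here is a minimal sketch of how such content could be read into an object. It is not the plugin's parser: `parseDotCorpus` is a hypothetical function, and the rules it applies (a key is separated from its value by "=" or, failing that, by whitespace, and a repeated key such as field is collected into an array) are assumptions inferred from the two samples.

```javascript
// Hypothetical parser for the .corpus layout shown above.
// Not the plugin's actual code.
function parseDotCorpus(content) {
  const result = {};
  let section = null;
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue;
    const header = line.match(/^\[(.+)\]$/);
    if (header) {
      // A line such as [ISTEX] opens a new section.
      section = header[1];
      result[section] = {};
      continue;
    }
    if (section === null) continue; // ignore lines before the first section
    // Assumption: split on the first "=", otherwise on the first whitespace.
    const eq = line.indexOf('=');
    const cut = eq !== -1 ? eq : line.search(/\s/);
    if (cut === -1) continue;
    const key = line.slice(0, cut).trim();
    const value = line.slice(cut + 1).trim();
    const target = result[section];
    // Repeated keys (several "field" lines) are gathered into an array.
    target[key] = key in target ? [].concat(target[key], value) : value;
  }
  return result;
}

const queryCorpus = [
  '[ISTEX]',
  'query = language.raw:rum',
  'field = doi',
  'field = author',
  'field = title',
].join('\n');
const noticeCorpus = '[ISTEX]\nid 2FF3F5B1477986B9C617BB75CA3333DBEE99EB05';

const parsedQuery = parseDotCorpus(queryCorpus);
const parsedNotice = parseDotCorpus(noticeCorpus);
console.log(JSON.stringify(parsedQuery));
console.log(JSON.stringify(parsedNotice));
```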
ISTEXResult
Take an Object containing results of the ISTEX API, and return the hits value
(documents).
This should be placed after ISTEXScroll.
See ISTEXScroll
Parameters
source
string (optional, default data)
target
string (optional, default feed)
Returns Array<Object>
ISTEXSave
Take an Object with an ISTEX id and save the document's file.
Warning: to access fulltext, you have to give a token parameter.
ISTEXFetch produces the stream you need to save the file.
See ISTEXFetch
Parameters
directory
string path for the PDFs (optional, default current working directory)
typology
string typology of the document to save (optional, default "fulltext")
format
string format of the files to save (optional, default "pdf")
sid
string User-agent identifier (optional, default "ezs-istex")
token
string? authentication token (see documentation)
Returns Array
ISTEXScroll
Take an object containing a query string field and output records from the
ISTEX API. Every output record is merged with the input object.
Parameters
query
string ISTEX query (optional, default input)
sid
string User-agent identifier (optional, default "ezs-istex")
maxPage
number Maximum number of pages to get
size
number size of each page of results (optional, default 2000)
duration
string maximum duration between two requests (optional, default "5m")
field
Array<string> fields to get (optional, default ["doi"])
Examples
from([{ query: 'this is a test' }])
.pipe(ezs('ISTEXScroll', {
maxPage: 2,
size: 1,
sid: 'test',
}))
Returns Array<Object>
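As a rough illustration of what the parameters above stand for, the sketch below maps them onto a request URL. It is not taken from the plugin's source: `buildScrollUrl` is a hypothetical helper, and both the endpoint `https://api.istex.fr/document/` and the query-string names `q`, `scroll`, `size`, `output` and `sid` are assumptions based on the public ISTEX API. Only the default values come from the parameter list.

```javascript
// Hypothetical helper, not part of @ezs/istex: build the first scroll request.
function buildScrollUrl({
  query,
  sid = 'ezs-istex',
  size = 2000,
  duration = '5m',
  field = ['doi'],
}) {
  // Assumption: endpoint and query-string names follow the public ISTEX API.
  const url = new URL('https://api.istex.fr/document/');
  url.searchParams.set('q', query);
  url.searchParams.set('scroll', duration); // how long the scroll stays open
  url.searchParams.set('size', String(size)); // records per page
  url.searchParams.set('output', field.join(',')); // fields to get
  url.searchParams.set('sid', sid); // identifies the calling tool
  return url.href;
}

const scrollUrl = buildScrollUrl({ query: 'this is a test', size: 1, sid: 'test' });
console.log(scrollUrl);
```

Since maxPage bounds the number of pages and size the records per page, the example above (maxPage: 2, size: 1) would yield at most two records.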
ISTEXTriplify
Take an Object containing flattened hits from ISTEXResult.
If the environment variable DEBUG is set, some errors could appear on stderr.
See
data:
{
'author/0/name': 'Geoffrey Strickland',
'author/0/affiliations/0': 'University of Reading',
'host/issn/0': '0047-2441',
'host/eissn/0': '1740-2379',
'title': 'Maupassant, Zola, Jules Vallès and the Paris Commune of 1871',
'publicationDate': '1983',
'doi/0': '10.1177/004724418301305203',
'id': 'F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483',
'score': 1
}
javascript:
.pipe(ezs('ISTEXTriplify', {
property: [
'doi/0 -> http://purl.org/ontology/bibo/doi',
'language -> http://purl.org/dc/terms/language',
'author/\\d+/name -> http://purl.org/dc/terms/creator',
'author/\\d+/affiliations -> https://data.istex.fr/ontology/istex#affiliation',
],
}));
output:
<https://data.istex.fr/document/F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483>
a <http://purl.org/ontology/bibo/Document> ;
<http://purl.org/ontology/bibo/doi> "10.1177/004724418301305203" ;
<https://data.istex.fr/ontology/istex#idIstex> "F6CB7249E90BD96D5F7E3C4E80CC1C3FEE4FF483" ;
<http://purl.org/dc/terms/creator> "Geoffrey Strickland" ;
<https://data.istex.fr/ontology/istex#affiliation> "University of Reading" ;
Parameters
property
Object path to uri for the properties to output (property and uri separated by ->) (optional, default [])
source
string the root of the keys (ex: istex/) (optional, default "")
Returns string
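The property rules above pair a key path with a URI. The following sketch shows one way such rules could be matched against a flattened record. It is an illustration under stated assumptions, not the plugin's implementation: `parseRule` and `selectProperties` are hypothetical names, and the path is assumed to be a regular expression anchored at the start of the key, which is what lets author/\d+/affiliations match author/0/affiliations/0.

```javascript
// Hypothetical helpers, not part of @ezs/istex.
function parseRule(rule, source = '') {
  const [path, uri] = rule.split('->').map((part) => part.trim());
  // Assumption: the path is a regular expression anchored at the start of
  // the key, prefixed by the optional source root.
  return { pattern: new RegExp(`^${source}${path}`), uri };
}

function selectProperties(record, rules, source = '') {
  const parsed = rules.map((rule) => parseRule(rule, source));
  const pairs = [];
  for (const [key, value] of Object.entries(record)) {
    for (const { pattern, uri } of parsed) {
      if (pattern.test(key)) {
        pairs.push([uri, value]); // one (predicate, literal) pair per match
      }
    }
  }
  return pairs;
}

const record = {
  'author/0/name': 'Geoffrey Strickland',
  'author/0/affiliations/0': 'University of Reading',
  title: 'Maupassant, Zola, Jules Vallès and the Paris Commune of 1871',
  'doi/0': '10.1177/004724418301305203',
};
const rules = [
  'doi/0 -> http://purl.org/ontology/bibo/doi',
  'author/\\d+/name -> http://purl.org/dc/terms/creator',
  'author/\\d+/affiliations -> https://data.istex.fr/ontology/istex#affiliation',
];
const pairs = selectProperties(record, rules);
console.log(pairs);
```

The title key is left out of the result because no rule matches it.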
ISTEXUniq
Remove duplicate triples within a single document's set of triples (same
subject).
Assume that every triple of a document (except the first one) follows another
triple of the same document.
Input:
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <http://purl.org/dc/terms/creator> "S Corbett" .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .
Action in a `.ezs` script
[ISTEXUniq]
Output
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <http://purl.org/dc/terms/creator> "S Corbett" .
<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R> <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .
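The behaviour described above can be sketched in plain JavaScript. This is a minimal illustration, not the plugin's implementation: `uniqTriples` is a hypothetical function, and it assumes that each triple arrives as one string whose first whitespace-delimited token is the subject. Because the triples of a document are assumed to be contiguous, the set of already seen triples can be reset whenever the subject changes.

```javascript
// Hypothetical helper, not part of @ezs/istex: drop duplicate triples inside
// each run of triples that share the same subject.
function uniqTriples(triples) {
  const output = [];
  let currentSubject = null;
  let seen = new Set();
  for (const triple of triples) {
    // Assumption: the subject is the first whitespace-delimited token.
    const subject = triple.split(/\s+/, 1)[0];
    if (subject !== currentSubject) {
      // A new document starts: forget the triples of the previous one.
      currentSubject = subject;
      seen = new Set();
    }
    if (!seen.has(triple)) {
      seen.add(triple);
      output.push(triple);
    }
  }
  return output;
}

const subject = '<https://api.istex.fr/ark:/67375/NVC-JMPZTKTT-R>';
const affiliation = `${subject} <https://data.istex.fr/ontology/istex#affiliation> "Department of Public Health, University of Sydney, Australia." .`;
const triples = [
  `${subject} <http://purl.org/dc/terms/creator> "S Corbett" .`,
  affiliation,
  affiliation,
  affiliation,
];
const deduped = uniqTriples(triples);
console.log(deduped.length); // prints 2
```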
ISTEXUnzip
Take the content of a zip file, extract JSON files, and yield JSON objects.
The zip file comes from dl.istex.fr, and the manifest.json is not extracted.
Returns any Array