
in-text-citations-parser
Creates a stream from PDF
Node.js module for parsing in-text citations from streams.
Use it with pdf-stream module.
npm i in-text-citations-parser --save
new XMLTransform() converter

'use strict';
const fs = require('fs');
const pdf_stream = require('pdf-stream');
global.XMLHttpRequest = require('xhr2'); // for PDFJS
const parser = require('in-text-citations-parser');
const ParserTransform = parser.ParserTransform;
const XMLTransform = parser.XMLTransform;
const FileTransform = parser.FileTransform;

// One entry per document to process.
const configs = [{
  file: './CRIS2016_paper_40_Parinov.txt',
  uri: 'http://dspacecris.eurocris.org/bitstream/11366/526/1/CRIS2016_paper_40_Parinov.pdf',
  handle: 'repec:rus:mqijxk:43',
  prefix: 'pannot1'
},
// ...
];

// forEach rather than map: the loop only starts the streams, nothing is returned.
configs.forEach((config) => {
  fs.createReadStream(config.file)
    .pipe(new ParserTransform('gost')) // find citations in the text
    .pipe(new XMLTransform({           // convert them to XML
      uri: config.uri,
      handle: config.handle,
      prefix: config.prefix
    }))
    .pipe(new FileTransform({          // write the XML to ./out
      outdir: './out',
      prefix: config.prefix
    }));
});

new XMLinTextRefTransform() converter

'use strict';
const fs = require('fs');
const pdf_stream = require('pdf-stream');
const parser = require('in-text-citations-parser');
const ParserTransform = parser.ParserTransform;
const XMLinTextRefTransform = parser.XMLinTextRefTransform;
const FileTransform = parser.FileTransform;

const file = './CRIS2016_paper_40_Parinov.txt';
//const file = './2014_Nevolin_rfbr.txt';
//const file = './pdt-journal_112_145.txt';

const text_stream = fs.createReadStream(file);
text_stream
  .pipe(new ParserTransform('gost'))  // find citations in the text
  .pipe(new XMLinTextRefTransform())  // convert them to in-text reference XML
  .pipe(new FileTransform({           // write the XML to ./out
    outdir: './out',
    prefix: 'CRIS2016_paper_40_Parinov_intextref'
  }));

All methods are streams, use them with .pipe().
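Because every stage is a stream, a chain can also be assembled with Node's built-in `stream.pipeline`, which reports an error from any stage to a single callback (plain `.pipe()` does not forward errors). The sketch below is only an illustration: `splitLines`, `numberLines`, and `run` are hypothetical stand-ins written so the example runs without dependencies, not part of this package. In real code the stand-in stages would presumably be replaced by `ParserTransform`, `XMLTransform`, and `FileTransform`.

```javascript
'use strict';
const { pipeline, Readable, Transform, Writable } = require('stream');

// Hypothetical stand-in stage: text chunks in, one object per line out.
function splitLines() {
  let rest = '';
  return new Transform({
    readableObjectMode: true,
    transform(chunk, encoding, callback) {
      const lines = (rest + chunk.toString()).split('\n');
      rest = lines.pop(); // keep the unfinished last line for the next chunk
      lines.forEach((text) => this.push({ text }));
      callback();
    },
    flush(callback) {
      if (rest) this.push({ text: rest });
      callback();
    },
  });
}

// Hypothetical stand-in stage: object in, object out.
function numberLines() {
  let n = 0;
  return new Transform({
    objectMode: true,
    transform(line, encoding, callback) {
      callback(null, { n: ++n, text: line.text });
    },
  });
}

// Run the chain and collect the results; done(err, results) is called once.
function run(text, done) {
  const results = [];
  const sink = new Writable({
    objectMode: true,
    write(obj, encoding, callback) {
      results.push(obj);
      callback();
    },
  });
  pipeline(Readable.from([text]), splitLines(), numberLines(), sink,
    (err) => done(err, results));
}

run('first\nsecond', (err, results) => {
  if (err) throw err;
  console.log(results);
});
```

The same callback fires whether the failure comes from reading the input, from a transform, or from the final write, which keeps error handling in one place.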
alternative usage:
new ParserTransform('gost')
Find citations in text.
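To make the idea concrete, here is a minimal, dependency-free sketch of citation matching in an object-mode transform. It is not the package's implementation: the `findCitations` helper, the `CitationFinder` class, the bracketed-number pattern, and the shape of the emitted objects are all assumptions chosen for illustration. They only loosely mirror the `regexp` and `normalize` options listed under Options.

```javascript
'use strict';
const { Transform } = require('stream');

// Assumed pattern: numeric citations in square brackets, e.g. [3] or [3, p. 15].
const BRACKET_CITATION = /\[(\d+)(?:,\s*p\.\s*(\d+))?\]/g;

// Pure helper: return every match in `text` as a plain object.
// `regexp` must carry the global flag, as String.prototype.matchAll requires.
function findCitations(text, regexp = BRACKET_CITATION, normalize = (c) => c) {
  const found = [];
  for (const m of text.matchAll(regexp)) {
    found.push(normalize({
      raw: m[0],        // the matched text
      index: m.index,   // offset within this piece of text
      ref: Number(m[1]),
      page: m[2] === undefined ? null : Number(m[2]),
    }));
  }
  return found;
}

// Object-mode transform: text chunks in, citation objects out.
// Simplification: a citation split across two chunks is missed,
// and `index` is relative to the chunk, not to the whole document.
class CitationFinder extends Transform {
  constructor(regexp, normalize) {
    super({ readableObjectMode: true });
    this.regexp = regexp;
    this.normalize = normalize;
  }

  _transform(chunk, encoding, callback) {
    for (const c of findCitations(chunk.toString(), this.regexp, this.normalize)) {
      this.push(c);
    }
    callback();
  }
}

console.log(findCitations('As shown in [1] and [2, p. 15], ...'));
```

Keeping the matching logic in a pure function makes it easy to try a new pattern on a sample string before wiring it into a stream.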
Options:
options — a String naming a predefined style (gost), a RegExp, or an Object with these params:
regexp — regular expression used to match citations;
normalize — Object with functions for normalizing the matched citations.

new XMLTransform(options)
Convert annotations to XML in an object-mode stream.
Options:
xml.Linkage.Object.To.Uri;
xml.Linkage.Object.To.Handle;
xml.DocID.

new XMLinTextRefTransform()
Convert annotations to XML-format in-text references for the CiteEcCyr project, in an object-mode stream.

new FileTransform(options)
Save XML objects to XML files.
Options object:
outdir — output directory;
prefix — file prefix.

Contributors are welcome. Open an issue or submit a pull request.
Small note: If editing the README, please conform to the standard-readme specification.
Apache 2.0
© Sergey N