node-poppler

Asynchronous Node.js wrapper for the Poppler PDF rendering library
Intro
The node-poppler module was created out of a need for a PDF-to-HTML conversion module at Yeovil District Hospital NHSFT to convert clinical documents in PDF format to HTML.
Installation
Install using yarn:
yarn add node-poppler
Or npm:
npm install node-poppler
node-poppler's test scripts use yarn commands.
Linux and macOS/Darwin support
Windows and macOS/Darwin binaries are provided with this repository.
Linux users will need to download the poppler-data and poppler-utils binaries separately.
An example of downloading the binaries on a Debian system:
sudo apt-get install poppler-data
sudo apt-get install poppler-utils
If you do not wish to use the included macOS binaries, you can download the latest versions with Homebrew:
brew install poppler
Once they have been installed, pass the poppler-utils installation directory as a parameter when creating an instance of the Poppler class:
const { Poppler } = require('node-poppler');
// Absolute path to the directory containing the poppler-utils binaries
const poppler = new Poppler('/usr/bin');
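Because only Linux needs an explicit binary directory, the path can be chosen per platform. The following is a minimal sketch, not part of node-poppler: popplerBinPath is a hypothetical helper, and '/usr/bin' is an assumption based on a default Debian install of poppler-utils.

```javascript
// Hypothetical helper (not part of node-poppler): choose the directory that
// holds the Poppler binaries for a given platform.
// ASSUMPTION: '/usr/bin' is where poppler-utils lives on a default Debian install.
function popplerBinPath(platform = process.platform) {
  // Returning undefined lets the Poppler constructor fall back to the bundled
  // Windows/macOS binaries, the same as calling new Poppler() with no argument.
  return platform === 'linux' ? '/usr/bin' : undefined;
}

// Usage, with node-poppler installed:
// const { Poppler } = require('node-poppler');
// const poppler = new Poppler(popplerBinPath());
```

Adjust the returned path if your distribution installs poppler-utils elsewhere.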
API
const { Poppler } = require('node-poppler');
API Documentation can be found here
Examples
poppler.pdfToCairo
The options object requires at least one of the following to be set: jpegFile, pdfFile, pngFile, psFile, svgFile, tiffFile.
Example of calling poppler.pdfToCairo with async/await to convert only the first and second pages of a PDF file to PNG:
const { Poppler } = require('node-poppler');

const file = 'test_document.pdf';
const outputFile = 'test_document.png';
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
  pngFile: true
};

// await is only valid inside an async function in a CommonJS module
(async () => {
  const res = await poppler.pdfToCairo(file, outputFile, options);
  console.log(res);
})();
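The requirement that at least one output format is set can be checked before calling poppler.pdfToCairo. This is a rough sketch only: hasOutputFormat and OUTPUT_FORMAT_KEYS are hypothetical names, not part of node-poppler, and the check assumes a format is enabled by setting its key to true, as in the example above.

```javascript
// Output-format keys listed for the pdfToCairo options object.
const OUTPUT_FORMAT_KEYS = [
  'jpegFile',
  'pdfFile',
  'pngFile',
  'psFile',
  'svgFile',
  'tiffFile'
];

// Hypothetical helper (not part of node-poppler): returns true when at least
// one output-format key is set to true on the options object.
function hasOutputFormat(options = {}) {
  return OUTPUT_FORMAT_KEYS.some((key) => options[key] === true);
}

// Usage: guard the call so a missing format fails early with a clear message.
// if (!hasOutputFormat(options)) {
//   throw new Error('pdfToCairo options must set at least one output format');
// }
```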
poppler.pdfToHtml
Every field of the options object is optional.
Example of calling poppler.pdfToHtml with a promise:
const { Poppler } = require('node-poppler');
const file = 'test_document.pdf';
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2
};
poppler.pdfToHtml(file, options).then((res) => {
  console.log(res);
});
poppler.pdfToText
Every field of the options object is optional.
Example of calling poppler.pdfToText with a promise:
const { Poppler } = require('node-poppler');
const file = 'test_document.pdf';
const poppler = new Poppler();
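const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2
};
// As of 2.0.0 the signature is pdfToText(file, outputFile, options), so
// options is the third argument; this output file name is illustrative
const outputFile = 'test_document.txt';
poppler.pdfToText(file, outputFile, options).then((res) => {
  console.log(res);
});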
Contributing
Please see CONTRIBUTING.md for more details regarding contributing to this project.
License
node-poppler is licensed under the MIT license.
2.0.0 (2020-11-03)
- docs: enable TypeScript definition generation for all methods (cceecc8), thanks to @arthurdenner
- docs(index): correct stdout usage (be0bb49)
- docs(readme): add note about macos binaries (41c7e1e)
- test(index): correct param orders for function calls (e075a7b)
- build(deps-dev): bump dev dependencies (c450c04)
- build(travis): update osx image (0c043db)
- feat(index): add typescript definition file (d82df8b)
- feat(lib): update poppler win32 binaries from 20.10.0 to 20.11.0 (bc5478e)
- refactor(index): reorder parameters for all functions (ead466e)
- chore: add TypeScript config to generate definition (c5b4858), thanks to @arthurdenner
- chore(scripts): do not lint ts and tsx files (b1e8426)
BREAKING CHANGE
- The optional options object parameter for all functions has been moved to the end, i.e. Poppler.pdfToText(options, file, outputFile) is now Poppler.pdfToText(file, outputFile, options).
This allows for easier use of the functions, as users no longer have to pass an undefined parameter when no options are provided: Poppler.pdfToText(undefined, file, outputFile) can now be called as Poppler.pdfToText(file, outputFile).
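For code written against the old parameter order, the reordering can be illustrated with a small adapter. This is a sketch under assumptions, not something shipped with node-poppler: toV2Args is a hypothetical name, and it covers only the three-argument form shown for pdfToText above.

```javascript
// Hypothetical adapter (not part of node-poppler): takes arguments in the
// pre-2.0.0 order (options, file, outputFile) and returns them in the
// 2.0.0 order (file, outputFile, options).
function toV2Args(options, file, outputFile) {
  const args = [file, outputFile];
  // Omit options entirely when none were given, so callers no longer
  // need an undefined placeholder.
  if (options !== undefined) {
    args.push(options);
  }
  return args;
}

// Usage, with a Poppler instance named poppler:
// poppler.pdfToText(...toV2Args(undefined, 'in.pdf', 'out.txt'));
```

Updating call sites directly is the cleaner long-term fix; an adapter like this is only a stopgap while migrating.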