node-poppler
Asynchronous node.js wrapper for the Poppler PDF rendering library
Intro
Poppler is a PDF rendering library that also includes a collection of utility binaries, which allow for the manipulation and extraction of data from PDF documents, such as converting PDF files to HTML, TXT, or PostScript.
The node-poppler module provides an asynchronous node.js wrapper around said utility binaries for easier use.
It was created out of a need for a PDF-to-HTML conversion module at Yeovil District Hospital NHS Foundation Trust to convert clinical documents.
Installation
Install using npm:
npm install node-poppler
Or yarn:
yarn add node-poppler
node-poppler's test scripts use npm commands.
Linux and macOS/Darwin Support
Windows and macOS/Darwin binaries are provided with this repository.
Linux users will need to download the poppler-data and poppler-utils binaries separately.
An example of downloading the binaries on a Debian system:
sudo apt-get install poppler-data
sudo apt-get install poppler-utils
If you do not wish to use the included macOS binaries, you can download the latest versions with Homebrew:
brew install poppler
Once they have been installed, pass the poppler-utils installation directory as a parameter to an instance of the Poppler class:
const { Poppler } = require("node-poppler");
const poppler = new Poppler("/usr/bin");
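Because binaries are bundled for Windows and macOS but not for Linux, the constructor argument usually depends on the platform. The following is a minimal sketch of that choice and is not part of node-poppler: `resolvePopplerBinPath` is a hypothetical helper, and the `/usr/bin` default is an assumption about where a Debian-style poppler-utils package places its executables.

```javascript
const fs = require("fs");
const path = require("path");

/**
 * Hypothetical helper (not part of node-poppler): returns the directory to
 * pass to `new Poppler()`, or undefined to fall back to the bundled binaries.
 */
function resolvePopplerBinPath(platform = process.platform, linuxDir = "/usr/bin") {
  // Windows and macOS binaries ship with the module, so no path is needed.
  if (platform !== "linux") {
    return undefined;
  }
  // Assumption: poppler-utils installed the pdftotext executable into linuxDir.
  const binary = path.join(linuxDir, "pdftotext");
  if (!fs.existsSync(binary)) {
    throw new Error(`pdftotext not found in ${linuxDir}; is poppler-utils installed?`);
  }
  return linuxDir;
}

// Usage (requires node-poppler to be installed):
// const { Poppler } = require("node-poppler");
// const poppler = new Poppler(resolvePopplerBinPath());
```

Failing early with a clear message tends to be easier to diagnose than a spawn error raised later by the first conversion call.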
API
const { Poppler } = require("node-poppler");
API Documentation can be found here
Examples
poppler.pdfToCairo
Example of an async await call to poppler.pdfToCairo(), to convert only the first and second page of a PDF file to PNG:
const { Poppler } = require("node-poppler");
const file = "test_document.pdf";
const poppler = new Poppler();
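const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
  pngFile: true,
};
const outputFile = "test_document.png";

// Top-level await is not available in CommonJS modules, so wrap the call in an async function
(async () => {
  const res = await poppler.pdfToCairo(file, outputFile, options);
  console.log(res);
})();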
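The firstPageToConvert and lastPageToConvert options count pages from 1. As a sketch, a small helper can validate a page range before building the options object that is passed to poppler.pdfToCairo(). The name `pngPageRangeOptions` is hypothetical and not part of node-poppler; only the three option names come from the example above.

```javascript
/**
 * Hypothetical helper (not part of node-poppler): builds a PNG options
 * object for an inclusive, 1-based page range.
 */
function pngPageRangeOptions(firstPage, lastPage) {
  if (!Number.isInteger(firstPage) || !Number.isInteger(lastPage)) {
    throw new TypeError("Page numbers must be integers");
  }
  if (firstPage < 1 || lastPage < firstPage) {
    throw new RangeError("Expected 1 <= firstPage <= lastPage");
  }
  return {
    firstPageToConvert: firstPage,
    lastPageToConvert: lastPage,
    pngFile: true,
  };
}

// Same range as the example above: the first and second page.
const pngOptions = pngPageRangeOptions(1, 2);
console.log(pngOptions);
```

Rejecting an invalid range in JavaScript keeps the error close to the caller instead of surfacing as a failure from the underlying binary.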
poppler.pdfToHtml
Example of calling poppler.pdfToHtml() with a promise chain:
const { Poppler } = require("node-poppler");
const file = "test_document.pdf";
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};
poppler.pdfToHtml(file, undefined, options).then((res) => {
  console.log(res);
});
Example of calling poppler.pdfToHtml() with a promise chain, providing a Buffer as an input:
const fs = require("fs");
const { Poppler } = require("node-poppler");
const file = fs.readFileSync("test_document.pdf");
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};
poppler.pdfToHtml(file, "tester.html", options).then((res) => {
  console.log(res);
});
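When the input is a Buffer rather than a file path, there is no file extension to hint at its type. One possible safeguard, sketched below, is to check for the `%PDF-` header that PDF files start with before handing the Buffer to poppler.pdfToHtml(). `looksLikePdf` is a hypothetical helper, not part of node-poppler, and it deliberately simplifies: it rejects files that carry extra bytes before the header, which some PDF readers tolerate.

```javascript
/**
 * Hypothetical helper (not part of node-poppler): returns true when the
 * input is a Buffer whose first bytes are the "%PDF-" header.
 */
function looksLikePdf(input) {
  const magic = Buffer.from("%PDF-", "latin1");
  return (
    Buffer.isBuffer(input) &&
    input.length >= magic.length &&
    input.subarray(0, magic.length).equals(magic)
  );
}

console.log(looksLikePdf(Buffer.from("%PDF-1.7\n"))); // true
console.log(looksLikePdf(Buffer.from("<html></html>"))); // false
```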
poppler.pdfToText
Example of calling poppler.pdfToText() with a promise chain:
const { Poppler } = require("node-poppler");
const file = "test_document.pdf";
const poppler = new Poppler();
const options = {
  firstPageToConvert: 1,
  lastPageToConvert: 2,
};
// Options are passed as the third argument, after the output file (undefined here)
poppler.pdfToText(file, undefined, options).then((res) => {
  console.log(res);
});
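The underlying pdftotext utility ends each page with a form feed character unless page breaks are disabled. Assuming the promise resolves with the extracted text as a single string (an assumption about the resolved value, not something stated above), the result could be split into pages as sketched here. `splitPages` is a hypothetical helper and not part of node-poppler.

```javascript
/**
 * Hypothetical helper (not part of node-poppler): splits pdftotext-style
 * output into one string per page, using the form feed as the separator.
 */
function splitPages(text) {
  const pages = text.split("\f");
  // The last page is also terminated by a form feed, which leaves an
  // empty trailing entry after splitting; drop it.
  if (pages.length > 1 && pages[pages.length - 1].trim() === "") {
    pages.pop();
  }
  return pages;
}

// Sample text standing in for real extraction output.
const sample = "First page\n\fSecond page\n\f";
console.log(splitPages(sample).length); // 2
```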
Contributing
Contributions are welcome, and any help is greatly appreciated!
See CONTRIBUTING.md for details on how to get started.
Please adhere to this project's Code of Conduct when contributing.
Acknowledgements
License
node-poppler is licensed under the MIT license.