
A lightweight, robust PDF parsing library for Node.js written in TypeScript. Extract text, images, and metadata from PDFs — even damaged ones — with no external dependencies.
npm install pdfnano
import { PDFParser } from 'pdfnano';

async function parsePDF(filePath: string) {
  const parser = new PDFParser();
  // Parse the whole document: text, pages, images, and metadata.
  const result = await parser.parseFile(filePath);
  console.log('Full text:', result.text);
  console.log('Pages:', result.pages.length);
  console.log('Title:', result.metadata.title);
}

parsePDF('document.pdf');
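Because `parseFile` returns a promise, a caller may want to turn a rejection into a value it can inspect rather than an unhandled error. The following is a minimal sketch of such a wrapper, assuming only the `parseFile(filePath)` method shown above. `FileParser`, `ParseOutcome`, and `tryParse` are hypothetical names introduced for this illustration and are not part of pdfnano; the structural interface stands in for `PDFParser` so the snippet compiles without the package installed.

```typescript
// Hypothetical structural type: anything exposing a parseFile method like PDFParser's.
interface FileParser<T> {
  parseFile(filePath: string): Promise<T>;
}

// Hypothetical result wrapper: either the parsed value or an error message.
type ParseOutcome<T> =
  | { ok: true; value: T }
  | { ok: false; filePath: string; error: string };

// Parse one file and convert a rejection into a plain value the caller can inspect.
async function tryParse<T>(
  parser: FileParser<T>,
  filePath: string
): Promise<ParseOutcome<T>> {
  try {
    const value = await parser.parseFile(filePath);
    return { ok: true, value };
  } catch (err) {
    // Errors may be any thrown value, so normalize to a string.
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, filePath, error };
  }
}
```

With the real library, the expected call would be `tryParse(new PDFParser(), 'document.pdf')`, on the assumption that `PDFParser` satisfies this interface.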
- mimeType (e.g., image/jpeg, image/png)
- pageNumber for images
- parseFileToMarkdown, parseBufferToMarkdown
- parseFileToJSON, parseBufferToJSON (base64-encodes image data)
- extractImagesFromFile, extractImagesFromBuffer

Core parsing
- parseFile(filePath: string): Promise<PDFParseResult>
- parseBuffer(buffer: Buffer): Promise<PDFParseResult>

Markdown and JSON helpers
- parseFileToMarkdown(filePath: string): Promise<string>
- parseBufferToMarkdown(buffer: Buffer): Promise<string>
- parseFileToJSON(filePath: string): Promise<string>
- parseBufferToJSON(buffer: Buffer): Promise<string>

Images only
- extractImagesFromFile(filePath: string): Promise<PDFImage[]>
- extractImagesFromBuffer(buffer: Buffer): Promise<PDFImage[]>

- text: string - All text from the PDF
- pages: PDFPage[] - Per-page text, images, and dimensions
- images: PDFImage[] - All extracted images
- metadata: PDFMetadata - Title, author, page count, etc.

import { PDFParser } from 'pdfnano';

async function toMarkdown(filePath: string) {
  const parser = new PDFParser();
  const md = await parser.parseFileToMarkdown(filePath);
  console.log(md);
}

toMarkdown('document.pdf');
import { PDFParser } from 'pdfnano';
import * as fs from 'fs';

async function toJSON(filePath: string) {
  const parser = new PDFParser();
  // Read the file into memory and parse it from a Buffer.
  const buf = fs.readFileSync(filePath);
  // parseBufferToJSON resolves to a JSON string, so parse it before use.
  const jsonStr = await parser.parseBufferToJSON(buf);
  const json = JSON.parse(jsonStr);
  console.log(json.pages.length, 'pages');
}

toJSON('document.pdf');
import { PDFParser } from 'pdfnano';
import * as fs from 'fs';
import * as path from 'path';

async function saveImages(filePath: string, outDir: string) {
  const parser = new PDFParser();
  const images = await parser.extractImagesFromFile(filePath);

  // Create the output directory if it does not exist yet.
  if (!fs.existsSync(outDir)) fs.mkdirSync(outDir, { recursive: true });

  // Map the MIME type to a file extension; unknown types fall back to "bin".
  const ext = (m: string) => m === 'image/jpeg' ? 'jpg' : (m === 'image/png' ? 'png' : 'bin');

  images.forEach((img, i) => {
    // The file name encodes the image index, source page, and pixel dimensions.
    const name = `image_${i + 1}_p${img.pageNumber}_${img.width}x${img.height}.${ext(img.mimeType)}`;
    fs.writeFileSync(path.join(outDir, name), img.data);
  });
  console.log(`Saved ${images.length} image(s) to`, outDir);
}

saveImages('document.pdf', './out');
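Since each extracted image carries a `pageNumber`, the results can also be organized per page, for example to write one folder per page. This is a minimal sketch of that grouping step; `PageImage` and `groupImagesByPage` are hypothetical names for this illustration, and the interface declares only the one field the helper relies on rather than the library's full `PDFImage` type.

```typescript
// Minimal shape of an extracted image, limited to the field this helper needs.
interface PageImage {
  pageNumber: number;
}

// Group images by the page they came from, preserving extraction order within each page.
function groupImagesByPage<T extends PageImage>(images: T[]): Record<number, T[]> {
  const byPage: Record<number, T[]> = {};
  for (const img of images) {
    // Reuse the existing bucket for this page, or start a new one.
    const bucket = byPage[img.pageNumber] ?? [];
    bucket.push(img);
    byPage[img.pageNumber] = bucket;
  }
  return byPage;
}
```

The array returned by `extractImagesFromFile` could be passed in directly, assuming its elements expose `pageNumber` as in the example above.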
PDFNano attempts best-effort recovery when it encounters malformed XRef tables or missing markers, which lets it parse many real-world PDFs that would otherwise fail. If you prefer strict behavior (rejecting clearly invalid inputs), check for the %PDF- header yourself before calling the parser.
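One possible form of that pre-check is sketched below. It only tests that the data begins with the ASCII bytes `%PDF-` at offset 0, which is stricter than some readers that tolerate leading junk. `hasPdfHeader` and `assertPdfHeader` are hypothetical helper names and are not part of the library.

```typescript
// Bytes of the ASCII signature "%PDF-" that opens a well-formed PDF file.
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Return true only if the data starts with "%PDF-" at offset 0.
function hasPdfHeader(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_SIGNATURE.length) return false;
  return PDF_SIGNATURE.every((byte, i) => bytes[i] === byte);
}

// Hypothetical strict wrapper: throw before any recovery logic gets a chance to run.
function assertPdfHeader(bytes: Uint8Array): void {
  if (!hasPdfHeader(bytes)) {
    throw new Error('Input does not start with the %PDF- header');
  }
}
```

A Node.js `Buffer` is a `Uint8Array`, so the result of `fs.readFileSync(filePath)` can be passed to `assertPdfHeader` before handing the same buffer to `parser.parseBuffer`.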
npm install
npm run build
npm test
MIT