PDF Worker Package
A PDF text extraction utility built on pdfjs-dist. It exposes a single function, processPDF, that takes a PDF as an ArrayBuffer and returns a promise resolving to the document's text content.
Installation
npm install pdf-worker-package
Usage
import { processPDF } from 'pdf-worker-package';
import * as fs from 'fs';

// Node.js: read a PDF from disk and extract its text
async function extractPDFText() {
  try {
    const pdfBuffer = fs.readFileSync('path/to/your/file.pdf');
    const text = await processPDF(pdfBuffer);
    console.log('Extracted text:', text);
  } catch (error) {
    console.error('Error processing PDF:', error);
  }
}

// Browser: extract text from a File (for example, one chosen in a file input)
async function extractPDFTextFromFile(file: File) {
  try {
    const arrayBuffer = await file.arrayBuffer();
    const text = await processPDF(arrayBuffer);
    console.log('Extracted text:', text);
  } catch (error) {
    console.error('Error processing PDF:', error);
  }
}
API
processPDF(pdfData: ArrayBuffer): Promise<string>
Processes a PDF file and extracts its text content.
- pdfData: The PDF file as an ArrayBuffer
- Returns: A promise that resolves to the extracted text
- Throws: The returned promise rejects with an Error if PDF processing fails
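Since pdfData is documented as an ArrayBuffer, Node.js callers holding a Buffer (a Uint8Array that can be a view over a larger underlying buffer) may want to convert it first. The following is a minimal sketch of one way to do that. The helper name toArrayBuffer is hypothetical and not part of this package, and whether processPDF also accepts a Buffer directly is not stated here.

```typescript
// Hypothetical helper (not exported by pdf-worker-package): copies exactly
// the bytes covered by a Uint8Array view, such as a Node.js Buffer, into a
// standalone ArrayBuffer of the same length.
function toArrayBuffer(view: Uint8Array): ArrayBuffer {
  const out = new ArrayBuffer(view.byteLength);
  // Writing through a Uint8Array over `out` copies only the view's own
  // bytes, not the rest of whatever buffer the view is backed by.
  new Uint8Array(out).set(view);
  return out;
}
```

With this in place, a call would look like processPDF(toArrayBuffer(fs.readFileSync('path/to/your/file.pdf'))). Copying costs one extra allocation but avoids passing along unrelated bytes from a shared backing buffer.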
Error Handling
The package includes robust error handling for common PDF processing issues:
- Worker initialization failures
- Invalid PDF files
- Processing errors
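On the caller's side, all of these failures surface as a rejected promise. As a sketch of one way to consume that, the wrapper below turns a rejection into a plain result value. The names ExtractFn, ExtractResult, and tryExtract are hypothetical and not exported by the package; the extraction function is passed in as a parameter so the sketch stays self-contained, and the empty-input check is an added assumption rather than documented package behavior.

```typescript
// Signature copied from the API section; pass the package's processPDF here.
type ExtractFn = (pdfData: ArrayBuffer) => Promise<string>;

// Hypothetical result type for this sketch.
type ExtractResult =
  | { ok: true; text: string }
  | { ok: false; message: string };

// Hypothetical wrapper: converts a rejected promise into a value so callers
// can branch on `ok` instead of wrapping every call in try/catch.
async function tryExtract(
  extract: ExtractFn,
  pdfData: ArrayBuffer
): Promise<ExtractResult> {
  if (pdfData.byteLength === 0) {
    // Assumption of this sketch: empty input is reported without calling
    // the extractor at all.
    return { ok: false, message: 'PDF data is empty' };
  }
  try {
    const text = await extract(pdfData);
    return { ok: true, text };
  } catch (error) {
    // Rejections are not guaranteed to be Error instances.
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, message };
  }
}
```

A call would look like tryExtract(processPDF, arrayBuffer), after which the caller checks result.ok before reading result.text or result.message.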
License
MIT © thegreatbey