n8n PDF-Toolkit
Merge, split, render and OCR PDFs directly in your n8n workflows — developed by prokodo.

✨ Features
- 🧩 Merge multiple PDFs into one
- ✂️ Split a PDF into pages or custom ranges
- 🖼️ PDF → Images (PNG/JPEG) at a chosen DPI
- 🔎 OCR (Tesseract.js) on PDFs or images
- 🤖 Auto-detects PDF vs image input
- 📑 Per-page or combined text output
- 📎 Optional TXT attachment
✅ Requirements
- Node.js 18+ / 20 LTS
- n8n ≥ 1.103 (tested on 1.105+)
Using an older n8n (e.g. 1.88)? It may still work if you align n8n-core / n8n-workflow versions. For best results, upgrade n8n.
📦 Install
Option A — Install into your n8n “custom extensions” folder (recommended)
Local n8n (not Docker):
export N8N_CUSTOM_EXTENSIONS=~/.n8n
npm install --prefix "$N8N_CUSTOM_EXTENSIONS" @prokodo/n8n-nodes-pdf-toolkit@latest
n8n start
Docker (example Dockerfile):
FROM n8nio/n8n:latest
ENV N8N_CUSTOM_EXTENSIONS=/home/node/.n8n
ENV NODE_PATH=/home/node/.n8n/node_modules
USER node
RUN npm install --prefix /home/node/.n8n @prokodo/n8n-nodes-pdf-toolkit@latest
After starting n8n, search in the node picker for “prokodo (PDF Toolkit)”
Internal name: prokodoPdfToolkit
🛠 Dev install (build + link locally)
npm ci
npm run build
npm link
npm link @prokodo/n8n-nodes-pdf-toolkit --prefix ~/.n8n
export N8N_CUSTOM_EXTENSIONS=~/.n8n
n8n start
Publish-ready tip: This package publishes compiled JS from dist/ to npm.
You don’t need to commit dist/ to Git. To support installs straight from GitHub, add a prepare script to package.json:
"scripts": {
"prepare": "npm run build"
}
…and commit src/ (not dist/).
🔎 Node usage
Common
- Binary Property (Input): name of the incoming binary property holding your PDF/image (default data).
Merge PDFs
Operation: Merge PDFs
- Output Property (Merged PDF): binary property for output (default data)
- Output File Name: e.g. merged.pdf
- Result: a single item with the merged PDF.
Split PDF
Operation: Split PDF
- Split Mode: Every Page or By Ranges
- Ranges (if By Ranges): e.g. 1,3-5,10- (1-based, inclusive; 10- means page 10 to end)
- Output Property: binary property for each part (default data)
- Result: one item per output part.
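The range syntax above (1-based, inclusive, with an open-ended `10-`) can be illustrated with a small parser. This is only a sketch: `parsePageRanges` is a hypothetical helper, not the node's actual implementation, and clipping pages beyond the end of the document is an assumption.

```typescript
// Hypothetical helper, not taken from the toolkit's source.
// Expands a spec such as "1,3-5,10-" into sorted, unique 1-based page numbers.
function parsePageRanges(spec: string, pageCount: number): number[] {
  const pages = new Set<number>();
  for (const rawPart of spec.split(",")) {
    const part = rawPart.trim();
    if (part === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:\s*(-)\s*(\d+)?)?$/.exec(part);
    if (!match) throw new Error(`Invalid range: "${part}"`);
    const start = Number(match[1]);
    // "N" is a single page, "N-" runs to the last page, "N-M" is inclusive.
    const openEnded = match[2] !== undefined && match[3] === undefined;
    const end =
      match[2] === undefined ? start : openEnded ? pageCount : Number(match[3]);
    if (start < 1 || (!openEnded && end < start)) {
      throw new Error(`Invalid range: "${part}"`);
    }
    // Assumption: pages past the end of the document are silently clipped.
    for (let p = start; p <= Math.min(end, pageCount); p++) pages.add(p);
  }
  return Array.from(pages).sort((a, b) => a - b);
}

// For a 12-page document, "1,3-5,10-" selects pages 1, 3, 4, 5, 10, 11, 12.
const selectedPages = parsePageRanges("1,3-5,10-", 12);
```

How the node groups these pages into output parts is not shown here; per the description above, each range presumably becomes one output item.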
PDF → Images
Operation: PDF to Images
- Image Format: PNG or JPEG
- DPI: e.g. 150 (higher values produce larger images and slower rendering)
- JPEG Quality: only when format is JPEG
- Page Ranges: e.g. 1-2,5
- Result: one item per rendered page, each with binary.image.
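To get a feel for how DPI affects output size, the sketch below converts a page size in PDF points (1 point = 1/72 inch) to pixel dimensions. `renderedSize` is a hypothetical helper for estimation only; the node's renderer may round differently, and the 4-bytes-per-pixel figure assumes an uncompressed RGBA bitmap held in memory.

```typescript
// PDF page sizes are expressed in points; 1 point = 1/72 inch.
const POINTS_PER_INCH = 72;

// Hypothetical estimator, not part of the toolkit.
function renderedSize(widthPt: number, heightPt: number, dpi: number) {
  const scale = dpi / POINTS_PER_INCH;
  const width = Math.round(widthPt * scale);
  const height = Math.round(heightPt * scale);
  // Assumption: an uncompressed RGBA bitmap uses 4 bytes per pixel.
  return { width, height, rawBytes: width * height * 4 };
}

// A roughly A4-sized page (595 x 842 pt) at 150 DPI is 1240 x 1754 pixels.
const a4At150 = renderedSize(595, 842, 150);
```

Doubling the DPI roughly quadruples the pixel count, which is why lower DPI values render noticeably faster.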
OCR
Operation: OCR
- Languages (Tesseract): e.g. eng, deu, eng+deu
- Return: Single (combined text) or Per Page
- Also Attach TXT as Binary: optional .txt file output
Advanced Settings (optional):
- OCR Page Ranges (for PDFs only)
- OCR Image Format, OCR DPI, OCR JPEG Quality (rasterization settings)
- Tesseract Lang Path (custom URL/path hosting *.traineddata)
OCR auto-detects whether the incoming file is a PDF or image via MIME/bytes.
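Detection "via MIME/bytes" could work along the following lines. This is a minimal sketch, not the node's actual code: `detectInputKind` is a hypothetical function, and checking the leading bytes before falling back to the declared MIME type is an assumption. The signatures themselves are the standard file headers for PDF, PNG and JPEG.

```typescript
type InputKind = "pdf" | "png" | "jpeg" | "unknown";

// True when `bytes` begins with the given signature.
function startsWith(bytes: Uint8Array, signature: number[]): boolean {
  return signature.every((b, i) => bytes[i] === b);
}

// Hypothetical detector: magic bytes first, declared MIME type as a fallback.
function detectInputKind(bytes: Uint8Array, mimeType?: string): InputKind {
  if (startsWith(bytes, [0x25, 0x50, 0x44, 0x46, 0x2d])) return "pdf"; // "%PDF-"
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) {
    return "png"; // 8-byte PNG signature
  }
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return "jpeg"; // JPEG start-of-image
  // Fall back to the declared MIME type when the bytes are inconclusive.
  if (mimeType === "application/pdf") return "pdf";
  if (mimeType === "image/png") return "png";
  if (mimeType === "image/jpeg") return "jpeg";
  return "unknown";
}

// A PDF header is recognized even if the MIME type is generic.
const kind = detectInputKind(
  new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31]),
  "application/octet-stream",
);
```

Preferring the bytes over the MIME type is a design choice for this sketch: upstream nodes sometimes label files generically, while the file header rarely lies.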
🧯 Troubleshooting
Node doesn’t show up
Ensure N8N_CUSTOM_EXTENSIONS points to the folder where you installed the package.
Restart n8n and search for “prokodo (PDF Toolkit)”.
Verify your n8n version (≥ 1.103 recommended).
“Could not get parameter” during execution
Open the node in the editor and ensure all visible fields for the chosen operation are set.
If you switched operations, re-open and re-save the node.
canvas native dependency errors
On Debian/Ubuntu containers:
apt-get update && apt-get install -y \
libcairo2 libpango-1.0-0 libjpeg62-turbo libgif7 librsvg2-2 \
&& rm -rf /var/lib/apt/lists/*
On Alpine:
apk add --no-cache cairo pango jpeg giflib librsvg
Prefer Debian-based n8n images for smoother canvas support.
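OCR language data not found / offline
Set Tesseract Lang Path (under Advanced Settings) to a URL or local path that hosts the required *.traineddata files.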
Slow processing or high memory usage on large PDFs
- Lower DPI (e.g. 120).
- Use page ranges (process in chunks).
- Avoid very high DPI for OCR unless necessary.
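Processing in chunks means feeding the node one page range at a time. The sketch below generates range strings in the same syntax as the Page Ranges field; `chunkPageRanges` is a hypothetical helper (for example, for use in a Code node ahead of the toolkit node), and the page count is assumed to be known in advance.

```typescript
// Hypothetical helper: splits 1..pageCount into consecutive range strings.
function chunkPageRanges(pageCount: number, chunkSize: number): string[] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new Error("pageCount must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error("chunkSize must be a positive integer");
  }
  const ranges: string[] = [];
  for (let start = 1; start <= pageCount; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, pageCount);
    // A one-page chunk is written as "21" rather than "21-21".
    ranges.push(start === end ? `${start}` : `${start}-${end}`);
  }
  return ranges;
}

// A 25-page PDF in chunks of 10 yields "1-10", "11-20", "21-25".
const chunks = chunkPageRanges(25, 10);
```

Each string can then be passed as the Page Ranges (or OCR Page Ranges) value for a separate run, so only one chunk is rasterized at a time.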
🙌 Contributing
PRs welcome!
npm ci
npm run build
Open a PR that describes what changed and how to test it.
📄 License
This library is published under the MIT License.
© 2025 prokodo.
Visit us at prokodo.com.