
A powerful Node.js tool for converting PDF documents to Markdown format using advanced vision models. PDF2MD extracts text, tables, and images from PDFs and generates well-structured Markdown documents.
# Clone the repository
git clone https://github.com/yourusername/pdf2md.git
cd pdf2md/pdf2md-node
# Install dependencies
npm install
# Build
npm run build
import { parsePdf, getPageCount } from './src/index.js';
// Get PDF page count
const pageCount = await getPageCount('path/to/your.pdf');
console.log(`PDF has ${pageCount} pages`);
// Convert PDF to Markdown
const result = await parsePdf('path/to/your.pdf', {
  apiKey: 'your-api-key',
  model: 'gpt-4-vision-preview',
  useFullPage: true // Use full page processing mode
});
console.log(`Markdown file generated: ${result.mdFilePath}`);
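The example above hard-codes the API key for brevity. As a minimal sketch of one alternative, the options can be built from environment variables so the key stays out of source control. The helper name `optionsFromEnv` and the variable names `PDF2MD_API_KEY` and `PDF2MD_MODEL` are assumptions made for illustration; PDF2MD itself is not documented to read them.

```javascript
// Hypothetical helper: builds parsePdf options from environment variables.
// PDF2MD_API_KEY and PDF2MD_MODEL are illustrative names, not something
// the library reads on its own.
function optionsFromEnv(env = process.env) {
  const apiKey = env.PDF2MD_API_KEY;
  if (!apiKey) {
    throw new Error('Set PDF2MD_API_KEY before running the conversion');
  }
  return {
    apiKey,
    // Fall back to the model used in the example above.
    model: env.PDF2MD_MODEL || 'gpt-4-vision-preview',
    useFullPage: true,
  };
}

// Usage, with parsePdf imported as in the example above:
// const result = await parsePdf('path/to/your.pdf', optionsFromEnv());
```

Failing early on a missing key gives a clearer error than letting the request to the model API fail later.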
const options = {
  // Output directory for generated files
  outputDir: './output',
  // API key for the vision model
  apiKey: 'your-api-key',
  // API endpoint (if using a custom endpoint)
  baseUrl: 'https://api.example.com/v1',
  // Vision model to use
  model: 'gpt-4-vision-preview',
  // Custom prompt for the vision model
  prompt: 'Convert this PDF to well-structured Markdown',
  // Whether to use full page processing (recommended)
  useFullPage: true,
  // Whether to keep intermediate image files
  verbose: false,
  // Image scaling factor (higher = better quality but slower)
  scale: 3,
  // Whether to use OpenAI-compatible API
  openAiApicompatible: true,
  // Concurrency (number of pages that can be processed simultaneously)
  concurrency: 2,
  // Progress callback: lets the caller track processing progress. The
  // conversion task is complete only when taskStatus is finished.
  onProgress: ({ current, total, taskStatus }) => {
    console.log(`Processed: ${current}, Total pages: ${total}, Task status: ${taskStatus}`);
  }
};
const result = await parsePdf('path/to/your.pdf', options);
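To make the `concurrency` and `onProgress` options more concrete, here is a rough sketch of how a bounded-concurrency page loop with progress reporting could behave. It is a generic pattern, not PDF2MD's actual implementation: `runWithConcurrency` and `processPage` are hypothetical names, and the `'running'` status value is an assumption (only a finished status is described above).

```javascript
// Illustrative only: a generic bounded-concurrency runner, not PDF2MD internals.
// processPage is a hypothetical async function supplied by the caller.
async function runWithConcurrency(pages, concurrency, processPage, onProgress) {
  const results = new Array(pages.length);
  let next = 0;     // index of the next page to pick up
  let current = 0;  // number of pages finished so far

  // Each worker keeps pulling the next unprocessed page until none remain.
  async function worker() {
    while (next < pages.length) {
      const i = next++;
      results[i] = await processPage(pages[i]);
      current += 1;
      onProgress({
        current,
        total: pages.length,
        // 'running' is an assumed intermediate status for this sketch.
        taskStatus: current === pages.length ? 'finished' : 'running',
      });
    }
  }

  // Start at most `concurrency` workers and wait for all of them.
  const workerCount = Math.min(concurrency, pages.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With `concurrency: 2`, at most two pages are in flight at once, and results keep page order even when pages finish out of order.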
| Provider | Models |
|---|---|
| OpenAI | gpt-4-vision-preview, gpt-4o |
| Claude | claude-3-opus-20240229, claude-3-sonnet-20240229 |
| Gemini | gemini-pro-vision |
| Doubao | doubao-1.5-vision-pro-32k-250115 |
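
If a script needs to check a model name before starting a conversion, the table above can be mirrored as a small lookup. This is a hypothetical convenience, not part of PDF2MD's API; the names `MODELS_BY_PROVIDER` and `providerForModel` are made up for this sketch, and the list only reflects the models named in the table.

```javascript
// Hypothetical lookup built from the table above; not part of PDF2MD's API.
const MODELS_BY_PROVIDER = {
  OpenAI: ['gpt-4-vision-preview', 'gpt-4o'],
  Claude: ['claude-3-opus-20240229', 'claude-3-sonnet-20240229'],
  Gemini: ['gemini-pro-vision'],
  Doubao: ['doubao-1.5-vision-pro-32k-250115'],
};

// Returns the provider name for a model, or null if the model is not listed.
function providerForModel(model) {
  for (const [provider, models] of Object.entries(MODELS_BY_PROVIDER)) {
    if (models.includes(model)) return provider;
  }
  return null;
}
```
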
The project includes several test scripts to verify functionality:
# Test the full PDF to Markdown conversion process
node test/testFullProcess.js
# Test only the PDF to image conversion
node test/testFullPageImages.js
# Test specific vision models
node test/testModel.js
pdf2md-node/
├── src/
│ ├── index.js # Main entry point
│ ├── pdfParser.js # PDF parsing module
│ ├── imageGenerator.js # Image generation module
│ ├── modelClient.js # Vision model client
│ ├── markdownConverter.js # Markdown conversion module
│ └── utils.js # Utility functions
├── test/
│ ├── samples/ # Sample PDF files for testing
│ ├── testFullProcess.js # Full process test
│ └── ... (other test files)
└── package.json
PDF2MD consists of the following core modules, each responsible for specific functionality:
`index.js` coordinates the entire system.
`pdfParser.js` parses PDF files and extracts structured information.
`imageGenerator.js` renders PDF areas as images.
`modelClient.js` interacts with the various vision model APIs.
`markdownConverter.js` converts model results to standard Markdown format.
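
The way these responsibilities chain together can be sketched roughly as follows. Every stage here is a stub standing in for the corresponding file in the project structure; the function names, arguments, and return shapes are hypothetical and do not reflect PDF2MD's real exports.

```javascript
// Sketch only: each stage is a stub for the module named in its comment.
// Names and data shapes are hypothetical, not PDF2MD's real exports.
const stages = {
  // pdfParser.js: parse the PDF and describe its pages
  parse: async (pdfPath) => [{ page: 1 }, { page: 2 }],
  // imageGenerator.js: render a page (or page area) as an image
  render: async (page) => `page-${page.page}.png`,
  // modelClient.js: send the image to a vision model and get text back
  recognize: async (image) => `Text recognized from ${image}`,
  // markdownConverter.js: merge model results into one Markdown document
  toMarkdown: (chunks) => chunks.join('\n\n'),
};

// Role of index.js: run the stages in order, feeding each result to the next.
async function convert(pdfPath, s = stages) {
  const pages = await s.parse(pdfPath);
  const images = await Promise.all(pages.map((p) => s.render(p)));
  const chunks = await Promise.all(images.map((img) => s.recognize(img)));
  return s.toMarkdown(chunks);
}
```

Passing the stages in as an argument keeps the coordinator easy to test, since any stage can be swapped for a stub.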
This project is licensed under the MIT License - see the LICENSE file for details.