@opendocsg/pdf2md

A PDF to Markdown Converter

Version: 0.2.1 (latest)
Source: npm
Weekly downloads: 20K

pdf2md

JavaScript npm library to parse PDF files and convert them into Markdown

Major Changes

See Releases

Usage

Library

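const fs = require('fs')
const path = require('path')
const pdf2md = require('@opendocsg/pdf2md')

// Placeholder paths; replace with your own
const filePath = 'input.pdf'
const outputFile = 'output.md'
// Callbacks object passed to pdf2md; left empty here as an assumption
const callbacks = {}

const pdfBuffer = fs.readFileSync(filePath)
pdf2md(pdfBuffer, callbacks)
  .then(text => {
    console.log(`Writing to ${outputFile}...`)
    fs.writeFileSync(path.resolve(outputFile), text)
    console.log('Done.')
  })
  .catch(err => {
    console.error(err)
  })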
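
The library call above handles a single file. As a minimal sketch of how several PDFs in one folder could be converted in sequence, the hypothetical helper below (convertFolder is not part of pdf2md) takes the converter as a parameter. It assumes only what the usage above shows: a function that accepts a PDF buffer and returns a promise of Markdown text.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper, not part of pdf2md: convert every PDF directly inside
// inputDir and write one .md file per PDF into outputDir.
// `convert` is any function that takes a Buffer and returns a Promise of
// Markdown text, for example require('@opendocsg/pdf2md').
async function convertFolder(inputDir, outputDir, convert) {
  const written = []
  const names = fs.readdirSync(inputDir)
    .filter(name => name.toLowerCase().endsWith('.pdf'))
    .sort()
  for (const name of names) {
    const pdfBuffer = fs.readFileSync(path.join(inputDir, name))
    // Await each conversion before starting the next one,
    // so that conversions do not run concurrently
    const text = await convert(pdfBuffer)
    const baseName = path.basename(name, path.extname(name))
    const outputFile = path.join(outputDir, baseName + '.md')
    fs.writeFileSync(outputFile, text)
    written.push(outputFile)
  }
  return written
}

// Possible usage (both folders are assumed to exist already):
// convertFolder('./pdfs', './markdown', require('@opendocsg/pdf2md'))
//   .then(files => console.log(`Wrote ${files.length} file(s).`))
//   .catch(err => console.error(err))
```

Passing the converter in as an argument keeps the helper independent of the package, which also makes it easy to try out with a stub function.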

CLI tool

$ cd [project_folder]
$ npx @opendocsg/pdf2md --inputFolderPath=[your input folder path] --outputFolderPath=[your output folder path] --recursive

If you convert a large number of files recursively, you might encounter the error "Allocation failed - JavaScript heap out of memory". In that case, run the CLI script directly with Node and raise the heap limit. The Node option must come before the script path, otherwise it is passed to the script instead of to Node:

$ node --max-old-space-size=4096 lib/pdf2md-cli.js --inputFolderPath=[your input folder path] --outputFolderPath=[your output folder path] --recursive

Options:

  • --inputFolderPath - input folder path (should already exist)
  • --outputFolderPath - output folder path (should already exist)
  • --recursive - also convert PDFs in folders within folders. Pass the flag to enable recursion and omit it otherwise
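
To illustrate what the recursive option means in practice, here is a minimal sketch of a file listing that only descends into subfolders when asked to. The helper name listPdfs is hypothetical and this is not the CLI's actual implementation; it uses only Node's built-in fs and path modules.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: list the PDF files in `dir`, descending into
// subfolders only when `recursive` is true.
function listPdfs(dir, recursive = false) {
  const found = []
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name)
    if (entry.isDirectory()) {
      // Subfolders are skipped entirely unless recursion was requested
      if (recursive) found.push(...listPdfs(fullPath, true))
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith('.pdf')) {
      found.push(fullPath)
    }
  }
  return found.sort()
}
```

Calling listPdfs(folder) returns only the PDFs at the top level of the folder, while listPdfs(folder, true) also includes those in nested folders.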

Credits

pdf-to-markdown - original project by Johannes Zillmann
pdf.js - Mozilla's PDF parsing & rendering platform which is used as a raw parser

Package last updated on 12 Dec 2024
