llm-document-ocr

LLM Based OCR and Document Parsing for Node.js

Version: 1.2.0 (latest)
Maintainers: 1


Sponsored by Mercoa, the API for BillPay and Invoicing. Everything you need to launch accounts payable in your product with a single API!


LLM-based OCR and document parsing for Node.js. Uses GPT-4 and Claude 3 models for OCR and data extraction.

  • Converts PDFs (including multi-page PDFs) into PNGs for use with GPT-4
  • Automatically crops whitespace to create smaller inputs
  • Cleans up the JSON string returned by the LLM and converts it to a JSON object
  • Custom prompt support for capturing any data you need
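
The JSON cleanup step listed above can be pictured roughly as follows. This is a minimal sketch, not the library's actual implementation: `parseLlmJson` is a hypothetical helper, and it assumes the model wraps a single JSON object in Markdown code fences or surrounding chatter.

```typescript
// Hypothetical helper, not part of llm-document-ocr's documented API.
function parseLlmJson(raw: string): unknown {
  // Drop Markdown code-fence markers (three backticks, optionally followed by "json").
  const unfenced = raw.replace(/`{3}(?:json)?/gi, "");
  // Keep only the text between the first "{" and the last "}".
  const start = unfenced.indexOf("{");
  const end = unfenced.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in LLM response");
  }
  return JSON.parse(unfenced.slice(start, end + 1));
}
```

A response with no braces at all makes the helper throw instead of returning a partial result, so callers can retry or report the failure.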

Supports:

  • ✅ PNG
  • ✅ WEBP
  • ✅ JPEG / JPG
  • ✅ GIF
  • ✅ PDF
  • ✅ Multi-page PDF
  • ❌ DOC
  • ❌ DOCX
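
Since `process` expects a `mimeType`, it may help to derive one from the file name and reject unsupported formats early. The sketch below is an illustration only: `mimeTypeFor` and the lookup table are hypothetical names, and the MIME strings are the standard ones for the formats listed above.

```typescript
// Hypothetical lookup table; the extensions mirror the supported formats listed above.
const SUPPORTED_MIME_TYPES = new Map<string, string>([
  ["png", "image/png"],
  ["webp", "image/webp"],
  ["jpeg", "image/jpeg"],
  ["jpg", "image/jpeg"],
  ["gif", "image/gif"],
  ["pdf", "application/pdf"],
]);

// Returns the MIME type for a file name, or throws for unsupported types such as DOC/DOCX.
function mimeTypeFor(fileName: string): string {
  const ext = fileName.split(".").pop()?.toLowerCase() ?? "";
  const mimeType = SUPPORTED_MIME_TYPES.get(ext);
  if (mimeType === undefined) {
    throw new Error(`Unsupported file type: ${fileName}`);
  }
  return mimeType;
}
```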

Installation

npm i --save llm-document-ocr
# or
yarn add llm-document-ocr

Note: If you are deploying via Docker, see the Dockerfile for an example Alpine base image.

Usage

import { DocumentOcr, prompts } from "llm-document-ocr";

const documentOcr = new DocumentOcr({
  apiKey: 'YOUR-OpenAi/Anthropic-API-KEY', // required, defaults to process.env.OPENAI_API_KEY. OpenAI models need an OpenAI API key, Anthropic models need an Anthropic API key.
  model: "gpt-4o", // optional, defaults to "gpt-4-turbo". Options are: "gpt-4-turbo", "gpt-4o", "claude-3-opus-20240229", "claude-3-sonnet-20240229", "claude-3-haiku-20240307"
  standardFontDataUrl: "https://unpkg.com/pdfjs-dist@3.2.146/standard_fonts/", // optional, defaults to "https://unpkg.com/pdfjs-dist@3.2.146/standard_fonts/". You can use the system fonts or the fonts under ./node_modules/pdfjs-dist/standard_fonts/ as well.
});

const documentData = await documentOcr.process({
  model: "gpt-4o", // optional, defaults to model defined in constructor
  document: 'JVBERi0xLjMNCiXi48/TDQoNCjEgMCBvYmoNCjw8DQ...', // Base64 String, Base64 URI, or Buffer
  mimeType: 'application/pdf', // mime-type of the document or image
  prompt: 'invoiceStartDate, invoiceEndDate, amount', // system prompt for data extraction. See examples below.
  pageOptions: 'FIRST_AND_LAST', // optional, defaults to 'ALL'. Determines which pages of the PDF will be processed. Available options are 'ALL', 'FIRST_AND_LAST', 'FIRST', 'LAST'.
})
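
To make the `pageOptions` values concrete, here is a small sketch of which pages each option would plausibly select. `selectPages` is a hypothetical helper written for illustration; the library's internal behavior (for example, how `'FIRST_AND_LAST'` treats a single-page PDF) is an assumption and may differ.

```typescript
type PageOptions = "ALL" | "FIRST_AND_LAST" | "FIRST" | "LAST";

// Hypothetical helper: returns the 1-based page numbers an option would select.
function selectPages(pageCount: number, option: PageOptions = "ALL"): number[] {
  if (pageCount < 1) return [];
  switch (option) {
    case "FIRST":
      return [1];
    case "LAST":
      return [pageCount];
    case "FIRST_AND_LAST":
      // Assumption: a single-page document yields that page once, not twice.
      return pageCount === 1 ? [1] : [1, pageCount];
    default:
      // "ALL": every page from 1 to pageCount.
      return Array.from({ length: pageCount }, (_, i) => i + 1);
  }
}
```

Restricting processing to the first and last pages is useful for long invoices, where the header and the totals usually sit at the two ends, and it reduces the number of images sent to the model.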

Prompts

Prompts are automatically prefixed with an instruction telling the LLM to return JSON. You only need to specify the data you wish to extract, and the LLM will return a JSON object with those keys.

For example, the prompt we use at Mercoa for invoice processing is the following:

`invoice number, invoice amount, currency (as ISO 4217 code), dueDate, invoiceDate, serviceStartDate, serviceEndDate,
  vendor's [name, email with @, website],
  line items [amnt, price, qty, des, name, cur (as ISO 4217 code)]`;

And this returns a JSON object that looks like:

{
  invoiceNumber?: string | number
  invoiceAmount?: string | number
  currency?: string
  dueDate?: string
  invoiceDate?: string
  serviceStartDate?: string
  serviceEndDate?: string
  vendor: {
    name?: string
    email?: string
    website?: string
  }
  lineItems: Array<{
    des?: string
    qty?: string | number
    price?: string | number
    amnt?: string | number
    name?: string
    cur?: string
  }>
}
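
Because fields such as `invoiceAmount`, `price`, and `amnt` may come back as either a string or a number, callers will often want to normalize them before doing arithmetic. The sketch below shows one way to do that. `toAmount` is a hypothetical helper, and it assumes a dot as the decimal separator and commas as thousands separators.

```typescript
// Hypothetical helper: converts a "string | number" amount into a number,
// or undefined when the value is missing or not parseable.
function toAmount(value: string | number | undefined): number | undefined {
  if (value === undefined) return undefined;
  if (typeof value === "number") {
    return Number.isFinite(value) ? value : undefined;
  }
  // Remove currency symbols, thousands separators, and spaces;
  // keep only digits, the decimal point, and a minus sign.
  const cleaned = value.replace(/[^0-9.\-]/g, "");
  const parsed = Number.parseFloat(cleaned);
  return Number.isNaN(parsed) ? undefined : parsed;
}
```

For example, `toAmount("$1,234.50")` yields `1234.5`, while a value such as `"n/a"` yields `undefined` instead of `NaN`.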

Issues and Contributing

If you encounter a bug or want to see something added or changed, please open an issue.

If you wish to contribute to the library, thanks! Please see the CONTRIBUTING guide for more details.

License

MIT © Mercoa, Inc

Package last updated on 13 May 2024
