easyocr-unstructured

Parse unstructured text from PDFs

  • 1.1.6
  • PyPI

EasyOCR Unstructured

EasyOCR Unstructured is a library for Optical Character Recognition (OCR) that extracts text from PDFs and then groups the text based on proximity.

It is intended for PDF files whose text doesn't follow the standard left-to-right, top-to-bottom reading order.
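
To make "group the text based on proximity" concrete, here is a minimal sketch of one way such grouping could work. It is not taken from this package's source: the `Box` tuple layout, the `box_gap` and `group_by_proximity` helpers, and the `max_gap` threshold are all hypothetical names chosen for illustration, and the library's actual algorithm may differ.

```python
from itertools import combinations

# Hypothetical input type (not the library's): (x_min, y_min, x_max, y_max, text).
Box = tuple[float, float, float, float, str]


def box_gap(a: Box, b: Box) -> float:
    """Return the distance between two axis-aligned boxes (0 if they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return (dx * dx + dy * dy) ** 0.5


def group_by_proximity(boxes: list[Box], max_gap: float = 20.0) -> list[list[str]]:
    """Group texts whose boxes are linked by gaps of at most max_gap."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the root of the set containing i.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the sets of every pair of boxes that sit close together.
    for i, j in combinations(range(len(boxes)), 2):
        if box_gap(boxes[i], boxes[j]) <= max_gap:
            parent[find(i)] = find(j)

    # Collect the texts of each set, keeping the input order.
    groups: dict[int, list[str]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box[4])
    return list(groups.values())


# Made-up boxes: one isolated title and two caption lines stacked closely.
boxes = [
    (10, 10, 200, 30, "Title in the top-left corner"),
    (400, 300, 600, 320, "Caption line one"),
    (400, 325, 600, 345, "Caption line two"),
]
groups = group_by_proximity(boxes)
print(groups)
```

In this sketch the two caption lines are 5 units apart and end up in the same group, while the title stays in a group of its own. The pairwise comparison is quadratic in the number of boxes, which is usually acceptable for a single page.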

Getting Started

pip install easyocr-unstructured

from pprint import pprint

# Assumes the EasyocrUnstructured class is exported at the package's top level
from easyocr_unstructured import EasyocrUnstructured

# Initialize the EasyOCR Unstructured object
ocr = EasyocrUnstructured()

# Invoke the OCR process on your PDF file
result = ocr.invoke('/path/to/your_pdf_file.pdf')

# result is a list of lists of strings
pprint(result)

Example Output

The output will look something like this:

[
    ["This is the first piece of text. Nothing near it"],
    ["This is the second piece of text.", "This is the third piece of text that was close to the second"],
    ["This is the fourth piece of text. Nothing near it"],
    ...
]
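
Because the result is a list of groups, each holding one or more strings, a common next step is to join each group into a single paragraph. The sketch below uses hand-written sample data in the shape shown above rather than real OCR output, and `groups_to_paragraphs` is a hypothetical helper, not part of the package.

```python
# Sample data in the list-of-lists shape shown above (not real OCR output).
result = [
    ["This is the first piece of text. Nothing near it"],
    [
        "This is the second piece of text.",
        "This is the third piece of text that was close to the second",
    ],
]


def groups_to_paragraphs(groups: list[list[str]]) -> list[str]:
    """Join the strings of each group into one paragraph, skipping blank strings."""
    return [" ".join(s.strip() for s in group if s.strip()) for group in groups]


paragraphs = groups_to_paragraphs(result)
for number, paragraph in enumerate(paragraphs, start=1):
    print(f"{number}: {paragraph}")
```

Joining with a space treats every group as running text; if the groups hold table cells or labels, a different separator such as a newline may suit better.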

Prerequisites

  • Python 3.12+

Installing

pip install easyocr-unstructured

Usage

# Assumes the EasyocrUnstructured class is exported at the package's top level
from easyocr_unstructured import EasyocrUnstructured

ocr = EasyocrUnstructured()
result = ocr.invoke('/path/to/your_pdf_file.pdf')

Running the tests

No tests yet

Built With

  • Wing Pro
  • Python 3.12
  • numpy
  • easyocr
  • pdf2image
  • hashlib

Contributing

Contributions are welcome. Any sensible and safe change will be added!

Authors

Kevin Fink

License

MIT

Acknowledgments
