ocr-pro

OCR extractor for PAN and Aadhaar card details

PyPI package (pip) · Version: 0.1.2 · Maintainers: 1

OCR Extractor


A simple, efficient tool built on Tesseract OCR for extracting data from Indian PAN and Aadhaar cards.

🆕 What's New in v0.1.3

  • Corrected example usage in README:
    • print(pan_data.get_pan())
    • print(aadhaar_data.get_aadhaar())
  • Includes all features from v0.1.2:
    • Added tesseract_cmd parameter to ExtractAadhaarData and ExtractPanData for custom Tesseract paths.
    • Fixed the preprocess argument not being passed through correctly in the child classes.

(For full version history, see CHANGELOG.md)

✨ Features

  • Extract PAN card data with a single function call
  • Extract Aadhaar card data with a single function call
  • Built-in preprocessing option for better OCR accuracy
  • Cross-platform support (Windows, Linux, macOS) with configurable Tesseract path

📦 Installation

pip install ocr-pro

🚀 Usage

Extract PAN Card Data

from ocr import ExtractPanData

# Default usage (preprocess=False by default)
pan_data = ExtractPanData("pan_image.jpg", tesseract_cmd="/usr/bin/tesseract")

print(pan_data.get_pan())

Extract Aadhaar Card Data

from ocr import ExtractAadhaarData

# You can also enable preprocessing
aadhaar_data = ExtractAadhaarData("aadhaar_image.jpg", tesseract_cmd="/usr/bin/tesseract", preprocess=True)

print(aadhaar_data.get_aadhaar())
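
OCR output can contain misread characters, so it may be worth sanity-checking an extracted value before using it. The sketch below is not part of ocr-pro: looks_like_pan and looks_like_aadhaar are hypothetical helpers, and it assumes the extracted number is available as a plain string (the return types of get_pan() and get_aadhaar() are not documented here). It checks format only: a PAN is five letters, four digits, and one letter, and an Aadhaar number is twelve digits, often printed in groups of four.

```python
import re

# Format-only patterns; they do not verify that a number was actually issued.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
AADHAAR_PATTERN = re.compile(r"[0-9]{12}")


def looks_like_pan(text: str) -> bool:
    """Hypothetical helper: True if text has the PAN shape AAAAA9999A."""
    return PAN_PATTERN.fullmatch(text.strip().upper()) is not None


def looks_like_aadhaar(text: str) -> bool:
    """Hypothetical helper: True if text holds exactly 12 digits.

    Spaces and hyphens are removed first, since the number is
    commonly printed in groups of four.
    """
    digits = re.sub(r"[\s-]", "", text)
    return AADHAAR_PATTERN.fullmatch(digits) is not None
```

A value that fails these checks is a reasonable candidate for a second pass with preprocess=True or for manual review.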

Arguments

  • filepath (str) → Path to the image file
  • tesseract_cmd (str, optional) → Path to the Tesseract executable. If omitted, the path is auto-detected from the system, or "C:\Program Files\Tesseract-OCR\tesseract.exe" is used on Windows.
  • preprocess (bool, default=False) → Whether to apply preprocessing for better OCR results
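
When the same script has to run on several platforms, the value passed as tesseract_cmd can be resolved up front. The following is a minimal sketch using only the Python standard library; resolve_tesseract_cmd is a hypothetical helper, not part of ocr-pro, and it only approximates the default behaviour described in the list above.

```python
import platform
import shutil
from typing import Optional

# Windows install location quoted in the Arguments list above.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(explicit: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: choose a path to pass as tesseract_cmd.

    Order: an explicit path, then whatever is on PATH, then the
    Windows default location. Returns None if nothing is found.
    """
    if explicit:
        return explicit
    found = shutil.which("tesseract")  # searches PATH on every platform
    if found:
        return found
    if platform.system() == "Windows":
        return WINDOWS_DEFAULT
    return None


# Usage with the package (assumes ocr-pro is installed and a path was found):
#   from ocr import ExtractPanData
#   pan_data = ExtractPanData("pan_image.jpg", tesseract_cmd=resolve_tesseract_cmd())
#   print(pan_data.get_pan())
```

Returning None rather than raising leaves the caller free to decide whether to fall back to the package's own default or to stop with an error.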

⚙️ Requirements

📜 License

MIT License
