ocr-pro

OCR extractor for PAN and Aadhaar card details

PyPI

Version: 0.1.2

Maintainers: 1

OCR Extractor

A simple and efficient OCR-based data extraction tool for Indian PAN and Aadhaar cards using Tesseract OCR.

🆕 What's New in v0.1.3

- Corrected the example usage in the README:
  - print(pan_data.get_pan())
  - print(aadhaar_data.get_aadhaar())
- Includes all changes from v0.1.2:
  - Added a tesseract_cmd parameter to ExtractAadhaarData and ExtractPanData for custom Tesseract paths.
  - Fixed a bug where the preprocess argument was not passed through correctly by the child classes.

(For full version history, see CHANGELOG.md)

✨ Features

- Extract PAN card data with a single function call
- Extract Aadhaar card data with a single function call
- Built-in preprocessing option for better OCR accuracy
- Cross-platform support (Windows, Linux, macOS) with configurable Tesseract path

📦 Installation

pip install ocr-pro

🚀 Usage

Extract PAN Card Data

from ocr import ExtractPanData

# Default usage (preprocess=False by default)
pan_data = ExtractPanData("pan_image.jpg", tesseract_cmd="/usr/bin/tesseract")

print(pan_data.get_pan())

Extract Aadhaar Card Data

from ocr import ExtractAadhaarData

# You can also enable preprocessing
aadhaar_data = ExtractAadhaarData("aadhaar_image.jpg", tesseract_cmd="/usr/bin/tesseract", preprocess=True)

print(aadhaar_data.get_aadhaar())

Arguments

filepath (str) → Path to the image file
tesseract_cmd (str, optional) → Path to the Tesseract executable (default: system auto-detection or "C:\Program Files\Tesseract-OCR\tesseract.exe" on Windows)
preprocess (bool, default=False) → Whether to apply preprocessing for better OCR results
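
If auto-detection does not find Tesseract on your machine, one way to choose a value for tesseract_cmd yourself is sketched below. This is not the package's own detection logic: resolve_tesseract_cmd is a hypothetical helper, and the only value taken from this README is the Windows default path listed above. The `which` and `system` parameters exist only so the lookup can be swapped out when testing.

```python
import platform
import shutil
from typing import Callable, Optional

# Windows default path quoted in the Arguments list above.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def resolve_tesseract_cmd(
    explicit: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
    system: Callable[[], str] = platform.system,
) -> Optional[str]:
    """Hypothetical helper: choose a path to pass as tesseract_cmd.

    Order of preference (an assumption, not the library's behaviour):
    1. a path supplied by the caller,
    2. a `tesseract` executable found on PATH,
    3. the documented default location on Windows.
    Returns None when nothing suitable is found.
    """
    if explicit:
        return explicit
    found = which("tesseract")
    if found:
        return found
    if system() == "Windows":
        return WINDOWS_DEFAULT
    return None
```

The returned string can then be passed along, for example ExtractPanData("pan_image.jpg", tesseract_cmd=resolve_tesseract_cmd()). A None result means Tesseract was not located and probably needs to be installed first.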

⚙️ Requirements

Python 3.7+
Tesseract OCR installed on your system
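
A quick way to confirm both requirements before running an extraction is sketched below. check_requirements is a hypothetical helper, not part of ocr-pro; it relies only on the standard library and assumes the Tesseract executable is named `tesseract` on PATH.

```python
import shutil
import sys
from typing import List, Optional, Sequence


def check_requirements(
    python_version: Sequence[int],
    tesseract_path: Optional[str],
) -> List[str]:
    """Hypothetical helper: return a list of unmet requirements (empty if none)."""
    problems = []
    # The README asks for Python 3.7 or newer.
    if tuple(python_version)[:2] < (3, 7):
        problems.append("Python 3.7+ is required")
    # A missing path means no Tesseract executable was located.
    if tesseract_path is None:
        problems.append("Tesseract OCR was not found on PATH")
    return problems


if __name__ == "__main__":
    # Check the running interpreter and whatever `tesseract` is on PATH.
    for problem in check_requirements(sys.version_info[:2], shutil.which("tesseract")):
        print(problem)
```

If Tesseract is installed in a location that is not on PATH, this check reports it as missing even though passing that location as tesseract_cmd would still work.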

📜 License

MIT License
