
ocrxdoc
A clean, easy-to-use Python framework for OCR (Optical Character Recognition) using Qwen3-VL AI models. Supports images (JPG, PNG, JPEG), PDF, DOCX, and TXT files.
pip install ocrxdoc
pip install ocrxdoc[pdf]
pip install ocrxdoc[docx]
pip install ocrxdoc[all]
from ocrxdoc import OCREngine
# Initialize OCR engine
engine = OCREngine(model_size="4B", device="auto")
# Load model
engine.load_model()
# Process an image
result = engine.ocr("path/to/image.jpg", prompt="Extract all text from this image")
print(result)
from ocrxdoc import OCREngine
engine = OCREngine(model_size="4B")
engine.load_model()
# Process image
result = engine.ocr("image.jpg")
# Process PDF
result = engine.ocr("document.pdf")
# Process DOCX
result = engine.ocr("document.docx")
# Process TXT
result = engine.ocr("text.txt")
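When file lists come from a directory scan, it may help to separate supported inputs from everything else before calling the engine. The sketch below is one way to do that. The `SUPPORTED_EXTENSIONS` set and the `split_supported` helper are hypothetical names built from the formats listed in this README; they are not part of ocrxdoc.

```python
from pathlib import Path

# Assumption: built from the formats this README lists (JPG, PNG, JPEG, PDF,
# DOCX, TXT). ocrxdoc itself may accept or reject other extensions.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".docx", ".txt"}


def split_supported(paths):
    """Return (supported, skipped) lists, comparing extensions case-insensitively."""
    supported, skipped = [], []
    for p in paths:
        if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(p)
        else:
            skipped.append(p)
    return supported, skipped
```

The `supported` list can then be processed file by file with `engine.ocr(...)`, while `skipped` can be logged for review.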
from ocrxdoc import OCREngine
engine = OCREngine(model_size="4B")
engine.load_model()
files = ["image1.jpg", "image2.png", "document.pdf"]
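# Called once per file with the 1-based position, the total count, and the file name
def progress_callback(current, total, filename):
    print(f"Processing {current}/{total}: {filename}")

results = engine.ocr_batch(files, progress_callback=progress_callback)
# Each entry is a (file_path, ocr_result) tuple; show the first 100 characters
for file_path, result in results:
    print(f"{file_path}: {result[:100]}...")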
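Since `ocr_batch` returns a list of `(file_path, ocr_result)` tuples, persisting the output is a short loop. The following is a minimal sketch, assuming each result is a plain string. `save_results` and the `<input name>.txt` naming scheme are illustrative choices, not something ocrxdoc provides.

```python
from pathlib import Path


def save_results(results, out_dir):
    """Write each (file_path, text) pair to out_dir/<input file name>.txt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for file_path, text in results:
        # Keep the full input name so image1.jpg and image1.pdf do not collide.
        target = out / (Path(file_path).name + ".txt")
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

A call such as `save_results(results, "ocr_output")` would be made with the list returned by `engine.ocr_batch(...)`.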
from ocrxdoc import OCREngine
# Use custom model path
engine = OCREngine(
    model_path="./custom/models/Qwen3-VL-4B-Instruct",
    device="cuda:0"
)
engine.load_model()
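Because the model files are downloaded manually, a quick existence check before constructing the engine can surface a missing directory early. This sketch assumes the `Qwen3-VL-<size>-Instruct` directory naming used in this README. `find_model_dir` is a hypothetical helper, and how ocrxdoc itself reports a missing model is not covered here.

```python
from pathlib import Path


def find_model_dir(model_size="4B", base_dir="./models"):
    """Return the expected model directory as a string, or raise if it is missing."""
    # Assumption: directories follow the Qwen3-VL-<size>-Instruct pattern.
    candidate = Path(base_dir) / f"Qwen3-VL-{model_size}-Instruct"
    if not candidate.is_dir():
        raise FileNotFoundError(f"Model directory not found: {candidate}")
    return str(candidate)
```

The returned string could then be passed as `model_path` when creating `OCREngine`.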
from ocrxdoc import OCREngine
engine = OCREngine(model_size="4B")
engine.load_model()
# OCR only a specific region: (x, y, width, height)
result = engine.ocr(
    "image.jpg",
    roi=(100, 100, 500, 300)  # Crop region before OCR
)
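Many annotation tools report boxes as corner coordinates rather than `(x, y, width, height)`. The sketch below converts between the two and clamps the box to the image. It assumes pixel coordinates with the origin at the top-left corner, which this README does not state explicitly, and `box_to_roi` is a hypothetical helper name.

```python
def box_to_roi(left, top, right, bottom, image_width, image_height):
    """Convert corner coordinates to an (x, y, width, height) tuple clamped to the image."""
    # Assumption: pixel coordinates, origin at the top-left of the image.
    x0 = max(0, min(left, image_width))
    y0 = max(0, min(top, image_height))
    x1 = max(0, min(right, image_width))
    y1 = max(0, min(bottom, image_height))
    if x1 <= x0 or y1 <= y0:
        raise ValueError("box does not overlap the image")
    return (x0, y0, x1 - x0, y1 - y0)
```

For a 1920x1080 image, `box_to_roi(100, 100, 600, 400, 1920, 1080)` yields `(100, 100, 500, 300)`, the same region used in the snippet above.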
from ocrxdoc import OCREngine
engine = OCREngine(
    model_size="4B",
    max_tokens=5000,
    temperature=0.1,
    top_p=0.9
)
engine.load_model()
# Or update after initialization
engine.set_generation_params(
    max_tokens=5000,
    temperature=0.1
)
Models need to be downloaded manually due to their large size:
4B Model (Default):
./models/Qwen3-VL-4B-Instruct/
2B Model:
./models/Qwen3-VL-2B-Instruct/
OCREngine
Main OCR engine class.
__init__(model_path=None, model_size="4B", device="auto", dtype=None, poppler_path=None, max_tokens=3000, temperature=0.2, top_p=0.8, top_k=50, repetition_penalty=1.1)
Initialize OCR engine.
load_model()
Load the OCR model and processor.
ocr(file_path, prompt="...", roi=None)
Perform OCR on a file.
  file_path: Path to file
  prompt: Prompt for OCR model
  roi: Optional region of interest as (x, y, width, height)
  Returns: Extracted text string
ocr_batch(file_paths, prompt="...", progress_callback=None)
Perform OCR on multiple files.
  file_paths: List of file paths
  prompt: Prompt for OCR model
  progress_callback: Optional callback(current, total, filename)
  Returns: List of tuples (file_path, ocr_result)
set_generation_params(max_tokens=None, temperature=None, top_p=None, top_k=None, repetition_penalty=None)
Update generation parameters.
cleanup()
Clean up temporary files.
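Since `cleanup()` removes temporary files, it is easy to skip when an OCR call raises. One possible pattern is a small context manager that pairs `load_model()` with `cleanup()`. This is a sketch that relies only on those two documented methods. `loaded_engine` is a hypothetical wrapper, not part of ocrxdoc.

```python
from contextlib import contextmanager


@contextmanager
def loaded_engine(engine):
    """Load the model on entry and always call cleanup() on exit."""
    engine.load_model()
    try:
        yield engine
    finally:
        # Runs even if the body of the with-block raises.
        engine.cleanup()


# Usage sketch (requires ocrxdoc and a downloaded model, so left commented out):
# from ocrxdoc import OCREngine
# with loaded_engine(OCREngine(model_size="4B")) as engine:
#     text = engine.ocr("document.pdf")
```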
See examples/ directory for more examples.
MIT License
FAQs
Python Framework for OCR using Qwen3-VL Models
We found that ocrxdoc demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.