mb-capcha-ocr

A PyTorch-based OCR library for the MBBank library.

Version 0.1.5 (PyPI)

OCR Model Training and Prediction

This project is designed to train and use an Optical Character Recognition (OCR) model for recognizing characters in CAPTCHA images.

Project Structure

  • mb_capcha_ocr/: Contains the core OCR model and prediction logic.
  • train_model/: Contains the training script for the OCR model.

Installation and Setup for Training

  • Clone the repository:

    git clone https://github.com/thedtvn/mbbank-capcha-ocr
    cd mbbank-capcha-ocr
    cd train_model
    
  • Create and activate a virtual environment:

    python -m venv .venv
    source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`
    
  • Install the required dependencies:

    pip install -r train_requirements.txt
    

Training the Model

  • Place your training and testing images in the dataset/ directory. Name each image {label}.(png|jpg|jpeg), where {label} is the text the image contains.

  • Run the training script:

    python train.py
    
  • The trained model will be saved as model.onnx in the directory.
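
The naming convention above means each label can be recovered from its file name. The following is a minimal sketch of that idea, not the code used by train.py. The helper name collect_labeled_images and the case-insensitive extension check are assumptions made for illustration.

```python
from pathlib import Path

# Extensions allowed by the naming convention {label}.(png|jpg|jpeg).
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def collect_labeled_images(dataset_dir):
    """Return (path, label) pairs for the images found in dataset_dir.

    Hypothetical helper: the label is taken to be the file name
    without its extension. Files with other extensions are skipped.
    """
    samples = []
    for path in sorted(Path(dataset_dir).iterdir()):
        # Assumption: extensions are matched case-insensitively.
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            samples.append((path, path.stem))
    return samples
```

Running a check like this before training can catch stray files or misnamed images early.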

Using the Model for Prediction

    from PIL import Image
    from mb_capcha_ocr import OcrModel

    # model_path is optional; pass it to use a custom model
    model = OcrModel()
    img = Image.open("path_to_image.png")
    predicted_text = model.predict(img)
    print(predicted_text)
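
To check a model against labeled images, the prediction call can be wrapped in a small scoring loop. This is a rough sketch under stated assumptions: exact_match_accuracy is a hypothetical helper, not part of mb_capcha_ocr, and it assumes a prediction counts as correct only when it equals the label exactly, including case.

```python
def exact_match_accuracy(predict_fn, samples):
    """Return the fraction of samples predicted exactly right.

    predict_fn: callable that takes one image and returns a string.
    samples:    iterable of (image, label) pairs.
    Returns 0.0 when samples is empty.
    """
    total = 0
    correct = 0
    for image, label in samples:
        total += 1
        # Assumption: comparison is exact and case-sensitive.
        if predict_fn(image) == label:
            correct += 1
    return correct / total if total else 0.0


# Possible wiring with the package installed (not run here);
# labeled_paths is a hypothetical list of (path, label) pairs:
#   from PIL import Image
#   from mb_capcha_ocr import OcrModel
#   model = OcrModel()
#   samples = [(Image.open(p), label) for p, label in labeled_paths]
#   print(exact_match_accuracy(model.predict, samples))
```

Passing the predictor as a callable keeps the scoring logic independent of the model, so the same loop works for the default model and for a custom one.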

Files

  • train_model/train.py: Script to train the OCR model.
  • mb_capcha_ocr/predict.py: Script to predict text from an image using the trained OCR model.
  • requirements.txt: List of dependencies required for the project.

Dependencies

  • Python 3.x
  • numpy
  • onnxruntime
  • Pillow

Training Dependencies

  • Python 3.x
  • torch
  • torchvision
  • matplotlib
  • Pillow
  • onnx

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Credits

Many thanks to CookieGMVN for providing datasets V1 and V2.
