OCR Engine based on OCRopy and Kraken using Python 3.
It is designed to be easy to use from the command line, yet modular enough to be integrated into and customized from other Python scripts.
Documentation
The documentation of Calamari is hosted here.
Pretrained model repository
Pretrained models are available at calamari_models and calamari_models_experimental.
Current releases (with individual model tarballs) can be accessed here and here.
Installing
Calamari is available on PyPI:
pip install calamari-ocr
Read the docs for further instructions.
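To confirm that the install succeeded, one option is to ask Python for the version of the installed distribution. The snippet below is a minimal sketch, not part of Calamari: it uses only the standard library (importlib.metadata, which assumes Python 3.8 or newer), and the helper name installed_version is hypothetical. The distribution name calamari-ocr is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "calamari-ocr") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only;
    it is not part of the Calamari API.
    """
    try:
        # Looks up package metadata recorded by pip at install time.
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"calamari-ocr {found} is installed")
    else:
        print("calamari-ocr is not installed in this environment")
```

Running the script in the same environment where pip was invoked should report a version string; if it reports that the package is missing, the install most likely went into a different interpreter or virtual environment.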
Command-Line Interface
See the docs to learn how to use Calamari from the command line.
Calamari API
See the docs to learn how to adapt Calamari for your needs.
Citing Calamari
If you use Calamari in your research project, please cite:
Wick, C., Reul, C., Puppe, F.: Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition. Digital Humanities Quarterly 14(1) (2020)
@article{wick_calamari_2020,
title = {Calamari - {A} {High}-{Performance} {Tensorflow}-based {Deep} {Learning} {Package} for {Optical} {Character} {Recognition}},
volume = {14},
number = {1},
journal = {Digital Humanities Quarterly},
author = {Wick, Christoph and Reul, Christian and Puppe, Frank},
year = {2020},
}