Bagira
This is a simple gem that runs Ghostscript and Tesseract OCR from the command line to perform OCR on images and PDF documents. It currently supports JPG, PNG, and PDF files.
Installation
Before using the gem, make sure that both Ghostscript and Tesseract OCR are installed.
On Ubuntu:
sudo apt-get install libgs-dev
sudo apt-get install tesseract-ocr
On Mac (Homebrew):
brew install gs
brew install tesseract
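To confirm that both tools are reachable before using the gem, a small check along these lines may help. This is only a sketch: `executable_on_path?` and `missing_ocr_tools` are hypothetical helpers, not part of Bagira, and the sketch assumes the binaries are named `gs` and `tesseract`, as they usually are on Ubuntu and macOS.

```ruby
# Hypothetical helper (not part of Bagira): returns true when an executable
# file called `name` exists in one of the directories listed in `path`.
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Hypothetical helper: returns the tools that could not be found.
# The default binary names are an assumption.
def missing_ocr_tools(tools = %w[gs tesseract], path = ENV.fetch("PATH", ""))
  tools.reject { |tool| executable_on_path?(tool, path) }
end

missing = missing_ocr_tools
if missing.empty?
  puts "Ghostscript and Tesseract were found on PATH."
else
  warn "Missing from PATH: #{missing.join(', ')}"
end
```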
Add this line to your application's Gemfile:
gem 'bagira'
And then execute:
$ bundle
Or install it yourself as:
$ gem install bagira
Usage
bagira = Bagira.new(path_to_your_file)
output = bagira.perform_ocr
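For readers curious about what running Ghostscript and Tesseract from the command line can look like, here is a minimal sketch of one possible pipeline. It is not Bagira's actual implementation: the helper names (`ghostscript_command`, `tesseract_command`, `run_command`, `ocr_file`), the 300 DPI default, and the `eng` language default are all assumptions made for illustration. PDFs are rendered to PNG pages with Ghostscript, then each image is passed to Tesseract, which prints the recognized text to stdout.

```ruby
require "open3"
require "tmpdir"

# Hypothetical helpers for illustration only; Bagira's internals may differ.

# Builds a Ghostscript command that renders each PDF page to a PNG file.
# `output_pattern` may contain %03d, which Ghostscript replaces with the page number.
def ghostscript_command(pdf_path, output_pattern, dpi: 300)
  [
    "gs", "-q", "-dNOPAUSE", "-dBATCH",
    "-sDEVICE=png16m", "-r#{dpi}",
    "-sOutputFile=#{output_pattern}",
    pdf_path
  ]
end

# Builds a Tesseract command that writes the recognized text to stdout.
def tesseract_command(image_path, lang: "eng")
  ["tesseract", image_path, "stdout", "-l", lang]
end

# Runs a command without a shell and returns its stdout, raising on failure.
def run_command(command)
  stdout, stderr, status = Open3.capture3(*command)
  raise "#{command.first} failed: #{stderr}" unless status.success?
  stdout
end

# Dispatches on the file extension: PDFs go through Ghostscript first,
# images go straight to Tesseract.
def ocr_file(path)
  case File.extname(path).downcase
  when ".pdf"
    Dir.mktmpdir do |dir|
      run_command(ghostscript_command(path, File.join(dir, "page-%03d.png")))
      pages = Dir.glob(File.join(dir, "page-*.png")).sort
      pages.map { |png| run_command(tesseract_command(png)) }.join("\n")
    end
  when ".jpg", ".jpeg", ".png"
    run_command(tesseract_command(path))
  else
    raise ArgumentError, "unsupported file type: #{path}"
  end
end
```

Passing the command as an array to `Open3.capture3` avoids shell interpolation, so file paths containing spaces or special characters do not need to be escaped by hand.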
Development
RSpec is included with basic tests; see the spec/ folder for details.
Contributing
Bug reports and pull requests are welcome on Bitbucket at https://bitbucket.org/juanmvallejo/bagira/issues
License
The gem is available as open source under the terms of the MIT License.