rescribe.xyz/bookpipeline

  • v1.3.0
  • Go

rescribe.xyz/bookpipeline package

This package contains various tools and functions for the OCR of books, with a focus on distributed OCR using short-lived virtual servers.

This is a Go package. It can be installed in the standard Go way with the command go get rescribe.xyz/bookpipeline/... and its documentation can be read with the go doc command or online at https://pkg.go.dev/rescribe.xyz/bookpipeline.

If you just want to install and use the commands, clone the repository with git clone https://git.rescribe.xyz/bookpipeline, then run go install ./... from within the bookpipeline directory.

Commands

The commands in the cmd/ directory are at the heart of this package. For more details on their usage, use go doc or read doc.go in the package repository.

The key commands for the virtual server side are:

  • bookpipeline : processes items from queues, doing preprocessing, OCR and postprocessing, and moving items on to the next queue step on completion. This is the core command of the package.
  • booktopipeline : uploads a book to the pipeline and adds it to the appropriate queue.
  • getpipelinebook : downloads the pipeline results for a book.
  • lspipeline : prints useful information about the status of the pipeline.
  • mkpipeline : sets up storage buckets and queues for use by the pipeline.
  • spotme : starts up a short-lived virtual server running bookpipeline.

There are also some commands which are more useful in a standalone setting:

  • confgraph : creates a graph showing average word confidence of each page of hOCR in a directory.
  • pagegraph : creates a graph showing average confidence of each word in a page of hOCR.
  • pdfbook : creates a searchable PDF from a directory of hOCR and image files.

Rescribe tool for local operation

While bookpipeline was built with cloud-based operation in mind, there is also a local mode that can be used to run OCR jobs from a single computer, with all the benefits of preprocessing, choosing the best threshold for each image, graph creation, PDF creation, and so on that the pipeline provides.

Several of the commands accept a -c local flag for local operation, but now there is also a new command, named rescribe, that is designed to make things much simpler for people just wanting to do some OCR on their local computer.

More information about this, including links to prebuilt executables, can be found on our blog at https://blog.rescribe.xyz/posts/desktop-tool/.

Contributions

Comments, bug reports, patches and pull requests are all very welcome. Please email them to nick@rescribe.xyz.

License

This package is licensed under the GPLv3. See the LICENSE file for more details.

Package last updated on 10 Apr 2024
