Package ocr contains Go samples for creating OCR (Optical Character Recognition) Cloud functions.
This utility converts Chronicling America OCR batches into CSVs of the OCR text. It takes as its arguments paths to Chronicling America OCR batches which are stored as .tar.bz2 files, which in turn contain directories of text files (which we care about) and XML files (which we don't). The path to the files comprise (with modification) an ID for that page on Chronicling America. This utility reads in each batch, extracts the page text, and writes each of them as a CSV file with a column for the batch ID, page ID, and text. It will process the batches in parallel.
Package bolt implements a low-level key/value store in pure Go. It supports fully serializable transactions, ACID semantics, and lock-free MVCC with multiple readers and a single writer. Bolt can be used for projects that want a simple data store without the need to add large dependencies such as Postgres or MySQL. Bolt is a single-level, zero-copy, B+tree data store. This means that Bolt is optimized for fast read access and does not require recovery in the event of a system crash. Transactions which have not finished committing will simply be rolled back in the event of a crash. The design of Bolt is based on Howard Chu's LMDB database project. Bolt currently works on Windows, Mac OS X, and Linux. There are only a few types in Bolt: DB, Bucket, Tx, and Cursor. The DB is a collection of buckets and is represented by a single file on disk. A bucket is a collection of unique keys that are associated with values. Transactions provide either read-only or read-write access to the database. Read-only transactions can retrieve key/value pairs and can use Cursors to iterate over the dataset sequentially. Read-write transactions can create and delete buckets and can insert and remove keys. Only one read-write transaction is allowed at a time. The database uses a read-only, memory-mapped data file to ensure that applications cannot corrupt the database, however, this means that keys and values returned from Bolt cannot be changed. Writing to a read-only byte slice will cause Go to panic. Keys and values retrieved from the database are only valid for the life of the transaction. When used outside the transaction, these byte slices can point to different data or can point to invalid memory which will cause a panic.
Package captcha implements generation and verification of image and audio CAPTCHAs. A captcha solution is the sequence of digits 0-9 with the defined length. There are two captcha representations: image and audio. An image representation is a PNG-encoded image with the solution printed on it in such a way that makes it hard for computers to solve it using OCR. An audio representation is a WAVE-encoded (8 kHz unsigned 8-bit) sound with the spoken solution (currently in English, Russian, Chinese, and Japanese). To make it hard for computers to solve audio captcha, the voice that pronounces numbers has random speed and pitch, and there is a randomly generated background noise mixed into the sound. This package doesn't require external files or libraries to generate captcha representations; it is self-contained. To make captchas one-time, the package includes a memory storage that stores captcha ids, their solutions, and expiration time. Used captchas are removed from the store immediately after calling Verify or VerifyString, while unused captchas (user loaded a page with captcha, but didn't submit the form) are collected automatically after the predefined expiration time. Developers can also provide custom store (for example, which saves captcha ids and solutions in database) by implementing Store interface and registering the object with SetCustomStore. Captchas are created by calling New, which returns the captcha id. Their representations, though, are created on-the-fly by calling WriteImage or WriteAudio functions. Created representations are not stored anywhere, but subsequent calls to these functions with the same id will write the same captcha solution. Reload function will create a new different solution for the provided captcha, allowing users to "reload" captcha if they can't solve the displayed one without reloading the whole page. Verify and VerifyString are used to verify that the given solution is the right one for the given captcha id. Server provides an http.Handler which can serve image and audio representations of captchas automatically from the URL. It can also be used to reload captchas. Refer to Server function documentation for details, or take a look at the example in "capexample" subdirectory.