Package captcha implements generation and verification of image and audio CAPTCHAs. A captcha solution is the sequence of digits 0-9 with the defined length. There are two captcha representations: image and audio. An image representation is a PNG-encoded image with the solution printed on it in such a way that makes it hard for computers to solve it using OCR. An audio representation is a WAVE-encoded (8 kHz unsigned 8-bit) sound with the spoken solution (currently in English, Russian, Chinese, and Japanese). To make it hard for computers to solve audio captcha, the voice that pronounces numbers has random speed and pitch, and there is a randomly generated background noise mixed into the sound. This package doesn't require external files or libraries to generate captcha representations; it is self-contained. To make captchas one-time, the package includes a memory storage that stores captcha ids, their solutions, and expiration time. Used captchas are removed from the store immediately after calling Verify or VerifyString, while unused captchas (user loaded a page with captcha, but didn't submit the form) are collected automatically after the predefined expiration time. Developers can also provide custom store (for example, which saves captcha ids and solutions in database) by implementing Store interface and registering the object with SetCustomStore. Captchas are created by calling New, which returns the captcha id. Their representations, though, are created on-the-fly by calling WriteImage or WriteAudio functions. Created representations are not stored anywhere, but subsequent calls to these functions with the same id will write the same captcha solution. Reload function will create a new different solution for the provided captcha, allowing users to "reload" captcha if they can't solve the displayed one without reloading the whole page. Verify and VerifyString are used to verify that the given solution is the right one for the given captcha id. Server provides an http.Handler which can serve image and audio representations of captchas automatically from the URL. It can also be used to reload captchas. Refer to Server function documentation for details, or take a look at the example in "capexample" subdirectory.
Package doc
Package ocrv1 is a generated GoMock package.
Package captcha implements generation and verification of image and audio CAPTCHAs. A captcha solution is the sequence of digits 0-9 with the defined length. There are two captcha representations: image and audio. An image representation is a PNG-encoded image with the solution printed on it in such a way that makes it hard for computers to solve it using OCR. An audio representation is a WAVE-encoded (8 kHz unsigned 8-bit) sound with the spoken solution (currently in English, Russian, Chinese, and Japanese). To make it hard for computers to solve audio captcha, the voice that pronounces numbers has random speed and pitch, and there is a randomly generated background noise mixed into the sound. This package doesn't require external files or libraries to generate captcha representations; it is self-contained. To make captchas one-time, the package includes a memory storage that stores captcha ids, their solutions, and expiration time. Used captchas are removed from the store immediately after calling Verify or VerifyString, while unused captchas (user loaded a page with captcha, but didn't submit the form) are collected automatically after the predefined expiration time. Developers can also provide custom store (for example, which saves captcha ids and solutions in database) by implementing Store interface and registering the object with SetCustomStore. Captchas are created by calling New, which returns the captcha id. Their representations, though, are created on-the-fly by calling WriteImage or WriteAudio functions. Created representations are not stored anywhere, but subsequent calls to these functions with the same id will write the same captcha solution. Reload function will create a new different solution for the provided captcha, allowing users to "reload" captcha if they can't solve the displayed one without reloading the whole page. Verify and VerifyString are used to verify that the given solution is the right one for the given captcha id. Server provides an http.Handler which can serve image and audio representations of captchas automatically from the URL. It can also be used to reload captchas. Refer to Server function documentation for details, or take a look at the example in "capexample" subdirectory.
The bookpipeline package contains various tools and functions for the OCR of books, with a focus on distributed OCR using short-lived virtual servers. It also contains several tools that are useful standalone; read the accompanying README for more details. The book pipeline is a way to split the different processes that for book OCR into small jobs, which can be processed when a computer is ready for them. It is currently implemented with Amazon's AWS cloud systems, and can scale from zero to many computers, with jobs being processed faster when more servers are available. Central to the bookpipeline in terms of software is the bookpipeline command, which is part of the rescribe.xyz/bookpipeline package. Presuming you have the go tools installed, you can install it, and useful tools to control the system, with this command: All of the tools provided in the bookpipeline package will give information on what they do and how they work with the '-h' flag, so for example to get usage information on the booktopipeline tool simply run the following: To get the pipeline tools to work for you, you'll need to change the settings in cloudsettings.go, and set up your ~/.aws/credentials appropriately. Most of the time the bookpipeline is expected to be run from potentially short-lived servers on Amazon's EC2 system. EC2 provides servers which have no guaranteed of stability (though in practice they seem to be), called "Spot Instances", which we use for bookpipeline. bookpipeline can handle a process or server being suddenly destroyed without warning (more on this later), so Spot Instances are perfect for us. We have set up a machine image with bookpipeline preinstalled which will launch at bootup, which is all that's needed to launch an bookpipeline instance. Presuming the bookpipeline package has been installed on your computer (see above), the spot instance can be started with the command: You can keep an eye on the servers (spot or otherwise) that are running, and the jobs left to do and in progress, with the "lspipeline" tool (which is also part of the bookpipeline package). It's recommended to use this with the ssh private key for the servers, so that it can also report on what each server is currently doing, but it can run successfully without it. It takes a little while to run, so be patient. It can be run with the command: Spot instances can be terminated with ssh, using their ip address which can be found with lspipeline, like so: The bookpipeline program is run as a service managed by systemd on the servers. The system is fully resiliant in the face of unexpected failures. See the section "How the pipeline works" for details on this. bookpipeline can be managed like any other systemd service. A few examples: Books can be added to the pipeline using the "booktopipeline" tool. This takes a directory of page images as input, and uploads them all to S3, adding a job to the pipeline queue to start processing them. So it can be used like this: Once a book has been finished, it can be downloaded using the "getpipelinebook" tool. This has several options to download specific parts of a book, but the default case will download the best hOCR for each page, PDFs, and the best, conf and graph.png files. Use it like this: To get the plain text from the book, use the hocrtotxt tool, which is part of the rescribe.xyz/utils package. You can get the package, and run the tool, like this: The central part of the book pipeline is several SQS queues, which contain jobs which need to be done by a server running bookpipeline. The exact content of the SQS messages vary according to each queue, as some jobs need more information than others. Each queue is checked at least once every couple of minutes on any server that isn't currently processing a job. When a job is taken from the queue by a process, it is hidden from the queue for 2 minutes so that no other process can take it. Once per minute when processing a job the process sends a message updating the queue, to tell it to keep the job hidden for two minutes. This is called the "heartbeat", as if the process fails for any reason the heartbeat will stop, and in 2 minutes the job will reappear on the queue for another process to have a go at. Once a job is completed successfully it is deleted from the queue. Queue names are defined in cloudsettings.go. queuePreProc Each message in the queuePreProc queue is a bookname, optionally followed by a space and the name of the training to use. Each page of the bookname will be binarised with several different parameters, and then wiped, with each version uploaded to S3, with the path of the preprocessed page, plus the training name if it was provided, will be added to the queueOcrPage queue. The pages are binarised with different parameters as it can be difficult to determine which binarisation level will be best prior to OCR, so several different options are used, and in the queueAnalyse step the best one is chosen, based on the confidence of the OCR output. queueWipeOnly This queue works the same as queuePreProc, except that it doesn't binarise the pages, only runs the wiper. Hence it is designed for books which have been prebinarised. queuePreNoWipe This queue works the same as queuePreProc, except that it doesn'T wipe the pages, only runs the binarisation. It is designed for books which don't have tricky gutters or similar noise around the edges, but do have marginal content which might be inadventently removed by the wiper. queueOcrPage This queue contains the path of individual pages, optionally followed by a space and the name of the training to use. Each page is OCRed, and the results are uploaded to S3. After each page is OCRed, a check is made to see whether all pages that look like they were preprocessed have corresponding .hocr files. If so, the bookname is added to the queueAnalyse queue. queueAnalyse A message on the queueAnalyse queue contains only a book name. The confidences for each page are calculated and saved in the 'conf' file, and the best version of each page is decided upon and saved in the 'best' file. PDFs are then generated, and the confidence graph is generated. The queues should generally only be messed with by the bookpipeline and booktopipeline tools, but if you're feeling ambitious you can take a look at the `addtoqueue` tool. Remember that messages in a queue are hidden for a few minutes when they are read, so for example you couldn't straightforwardly delete a message which was currently being processed by a server, as you wouldn't be able to see it. At present the bookpipeline has some silly limitations of file names for book pages to be recognised. This is something which will be fixed in due course. While bookpipeline was built with cloud based operation in mind, there is also a local mode that can be used to run OCR jobs from a single computer, with all the benefits of preprocessing, choosing the best threshold for each image, graph creation, PDF creation, and so on that the pipeline provides. Several of the commands accept a `-c local` flag for local operation, but now there is also a new command, named rescribe, that is designed to make things much simpler for people just wanting to do some OCR on their local computer. Note that the local mode is not as well tested as the core cloud modes; please report any bugs you find with it.
Package lookup implements a nice, simple and fast library which helps you to lookup objects on a screen. It also includes OCR functionality. With Lookup you can do OCR tricks like recognizing any information in a screenshot from your Robot application. Which can be useful for debugging or automating things. This library is a port of the Java Lookup library (https://gitlab.com/axet/lookup) to GoLang. Details on NCC (Normalized Cross Correlation) used by this library can be found in the original library's docs folder (a lot of math).
Package captcha implements generation and verification of image CAPTCHAs with custom fonts. A captcha solution is the sequence of digits 0-9 and letters a-z, A-Z with a defined length. The captcha is a PNG-encoded image with the solution printed on it in such a way that makes it hard for computers to solve it using OCR. This package only requires font files. See https://github.com/chenzihaojie/captcha/tree/master/fontgen for details on how to get them. So, before you start generating captchas, you have to load a font: To make captchas one-time, the package includes a memory storage that stores captcha ids, their solutions, and expiration time. Used captchas are removed from the store immediately after calling Verify or VerifyString, while unused captchas (user loaded a page with captcha, but didn't submit the form) are collected automatically after the predefined expiration time. Developers can also provide custom store (for example, which saves captcha ids and solutions in database) by implementing Store interface and registering the object with SetCustomStore. Captchas are created by calling New, which returns the captcha id. Their representations, though, are created on-the-fly by calling WriteImage or WriteAudio functions. Created representations are not stored anywhere, but subsequent calls to these functions with the same id will write the same captcha solution. Reload function will create a new different solution for the provided captcha, allowing users to "reload" captcha if they can't solve the displayed one without reloading the whole page. Verify and VerifyString are used to verify that the given solution is the right one for the given captcha id. Server provides an http.Handler which can serve image and audio representations of captchas automatically from the URL. It can also be used to reload captchas. Refer to Server function documentation for details, or take a look at the example in https://github.com/chenzihaojie/captcha/tree/master/capexample