@cboulanger/abbyy-cloud-ocr
NodeJS client and CLI to interact with the ABBYY Cloud OCR service
This project provides a NodeJS client with TypeScript support and a command line interface (CLI) for the ABBYY Cloud OCR service (https://cloud.ocrsdk.com/). It currently implements a subset of the API methods available in the v1 and v2 web APIs.
To use the library in your projects, simply run npm install @cboulanger/abbyy-cloud-ocr. See the CLI script for an example of how to use the API.
git clone https://github.com/cboulanger/abbyy-cloud-ocr.git
cd abbyy-cloud-ocr
cp .env.dist ./.env
# edit .env and provide the values needed there
npm install
npm test
You can build standalone command line executables by running npm run pkg. The executables for Linux, Windows, and macOS will be written to the bin directory.
Please note that if you have set environment variables in a .env file, the build includes them in the package, where they are visible as plain text in the source! The values are used as defaults, which is convenient for personal use of the executable. Please remove the file before building if you intend to distribute the executable.
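To avoid shipping credentials by accident, a small guard can be run before the build. The following is only a sketch: check_no_env_file is a hypothetical helper, not part of this package, and it assumes the build is started from the project root where .env lives.

```shell
# Hypothetical helper (not part of the package): succeeds only when the
# given directory (default: current directory) contains no .env file.
check_no_env_file() {
  dir="${1:-.}"
  if [ -f "$dir/.env" ]; then
    echo "Refusing to build: $dir/.env exists and would be bundled in plain text." >&2
    return 1
  fi
  return 0
}

# Usage (from the project root):
#   check_no_env_file . && npm run pkg
```

Chaining the check with && means npm run pkg only starts when no .env file is present.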
The usage of the executable is:
Usage: abbyy-cloud-ocr-<platform> --help
Options:
-u, --service-url <url> The http endpoint of the Cloud OCR Service
-i, --app-id <id> The id of the application
-P, --password <password> The application password
-h, --help display help for command
Commands:
process [options] <files...> Process the given files and download the results
list [options] List ongoing or finished tasks.
info
help [command] display help for command
abbyy-cloud-ocr-<platform> process [options] file1 [file2 [file3]...]
Process the given files and download the results
Options:
-l, --language <language> Recognition language or comma-separated list of languages, defaults to "English"
-e, --export-format <format> Output format. One of: txt (default), txtUnstructured, rtf, docx, xlsx, pptx, pdfa, pdfSearchable, pdfTextAndImages, xml
-c, --custom-options <options> Other custom options passed to REST-ful call, like 'profile=documentArchiving'
-o, --output-path <path> The path to which to save the processed files
-F, --filenames Output the filenames of the processed and downloaded files
-h, --help display help for command
Note that if you don't compile in your .env file, you need to set the environment variables defined therein before calling the executable (or provide them on the command line).
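A wrapper script can check for the credentials up front instead of failing halfway through a batch. This is a minimal sketch: require_abbyy_env is a hypothetical helper, and the three variable names are taken from the example in this README; the assumption is that these are the ones your .env defines.

```shell
# Hypothetical helper (not part of the package): verifies that the three
# variables used in this README's example are set and non-empty.
require_abbyy_env() {
  missing=""
  for name in ABBYY_SERVICE_URL ABBYY_APP_ID ABBYY_APP_PASSWD; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing environment variables:$missing" >&2
    return 1
  fi
  return 0
}

# Usage:
#   require_abbyy_env && abbyy-cloud-ocr-<platform> process file1
```

If you pass the credentials with -u, -i and -P instead, the check is not needed.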
The executable lets you do something like this:
# export credentials so that you don't need to provide them as CLI options
export ABBYY_SERVICE_URL=XXXX
export ABBYY_APP_ID=YYYYYY
export ABBYY_APP_PASSWD=ZZZZZ
PAGES_BEFORE=$(abbyy-cloud-ocr info | jq ".pages")
abbyy-cloud-ocr process \
-l German \
-e docx,txtUnstructured \
-c "txtUnstructured:paragraphAsOneLine=true" \
-o ~/files/OCR \
~/files/PDF-SOURCE/*
PAGES_AFTER=$(abbyy-cloud-ocr info | jq ".pages")
echo "$(expr $PAGES_BEFORE - $PAGES_AFTER) pages used, $PAGES_AFTER left."
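The subtraction in the last line fails with an unhelpful expr error if either info call returned nothing, or if jq printed null. A slightly more defensive variant might look like the sketch below; pages_used is a hypothetical helper, and it assumes, as the script above does, that info prints JSON with a numeric pages field.

```shell
# Hypothetical helper: prints how many pages were consumed, given the page
# counts before and after a run; fails if either value is not a whole number.
pages_used() {
  before="$1"
  after="$2"
  for n in "$before" "$after"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "Not a page count: '$n'" >&2
        return 1
        ;;
    esac
  done
  echo $((before - after))
}

# Usage, with PAGES_BEFORE and PAGES_AFTER set as in the script above:
#   USED=$(pages_used "$PAGES_BEFORE" "$PAGES_AFTER") &&
#     echo "$USED pages used, $PAGES_AFTER left."
```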
We found that @cboulanger/abbyy-cloud-ocr demonstrated an unhealthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.