tesseract.js

Comparing version 2.0.0-alpha.16 to 2.0.0-beta.1

examples/browser/download-pdf.html

407

docs/api.md
# API
## TesseractWorker.recognize(image, lang, [, options]) -> [TesseractJob](#tesseractjob)
- [createWorker()](#create-worker)
- [Worker.load](#worker-load)
- [Worker.loadLanguage](#worker-load-language)
- [Worker.initialize](#worker-initialize)
- [Worker.setParameters](#worker-set-parameters)
- [Worker.recognize](#worker-recognize)
- [Worker.detect](#worker-detect)
- [Worker.terminate](#worker-terminate)
- [createScheduler()](#create-scheduler)
- [Scheduler.addWorker](#scheduler-add-worker)
- [Scheduler.addJob](#scheduler-add-job)
- [Scheduler.getQueueLen](#scheduler-get-queue-len)
- [Scheduler.getNumWorkers](#scheduler-get-num-workers)
- [setLogging()](#set-logging)
- [recognize()](#recognize)
- [detect()](#detect)
- [PSM](#psm)
- [OEM](#oem)
---
<a name="create-worker"></a>
## createWorker(options): Worker
createWorker is a factory function that creates a Tesseract worker. A worker is essentially a Web Worker in the browser and a Child Process in Node.
**Arguments:**
- `options` an object of customized options
  - `corePath` path for tesseract-core.js script
  - `langPath` path for downloading traineddata, do not include `/` at the end of the path
  - `workerPath` path for downloading worker script
  - `dataPath` path for saving traineddata in the WebAssembly file system, rarely needs to be modified
  - `cachePath` path for the cached traineddata, more useful for Node; in the browser it only changes the key in IndexedDB
  - `cacheMethod` a string to indicate the method of cache management, should be one of the following options
    - write: read the cache and write back (default method)
    - readOnly: read the cache but do not write back
    - refresh: do not read the cache, but write back
    - none: neither read the cache nor write back
  - `workerBlobURL` a boolean to define whether to use a Blob URL for the worker script, default: true
  - `gzip` a boolean to define whether the traineddata from the remote is gzipped, default: true
  - `logger` a function to log the progress, a quick example is `m => console.log(m)`
**Examples:**
```javascript
const { createWorker } = Tesseract;
const worker = createWorker({
langPath: '...',
logger: m => console.log(m),
});
```
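The four `cacheMethod` values can be read as two independent switches: whether the cache is read, and whether it is written back. The sketch below only restates the list above as a lookup table; `cachePolicy` and `CACHE_POLICIES` are hypothetical names and are not part of tesseract.js.

```javascript
// Hypothetical helper (not part of tesseract.js): spells out what each
// documented cacheMethod value implies for reading and writing the cache.
const CACHE_POLICIES = {
  write: { readCache: true, writeCache: true },
  readOnly: { readCache: true, writeCache: false },
  refresh: { readCache: false, writeCache: true },
  none: { readCache: false, writeCache: false },
};

function cachePolicy(cacheMethod = 'write') {
  // Own-property check so inherited keys such as 'toString' are rejected.
  if (!Object.prototype.hasOwnProperty.call(CACHE_POLICIES, cacheMethod)) {
    throw new Error(`Unknown cacheMethod: ${cacheMethod}`);
  }
  return CACHE_POLICIES[cacheMethod];
}

console.log(cachePolicy('refresh')); // { readCache: false, writeCache: true }
```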
## Worker
A Worker helps you do OCR-related tasks. It takes a few steps to set up a Worker before it is fully functional. The full flow is:
- load
- loadLanguage
- initialize
- setParameters // optional
- recognize or detect
- terminate
Each function is async, so using async/await or Promise is required. When it is resolved, you get an object:
```json
{
"jobId": "Job-1-123",
"data": { ... }
}
```
jobId is generated by Tesseract.js, but you can supply your own when calling any of the functions above.
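The flow above can be wrapped in one function so that `terminate` always runs, even when a step fails. This is a minimal sketch: `recognizeOnce` is a hypothetical helper name, and `worker` is assumed to be any object exposing the Worker methods listed in this section (for example, the value returned by `createWorker()`).

```javascript
// Sketch: run the documented Worker steps in order and always terminate.
// `recognizeOnce` is a hypothetical helper, not part of tesseract.js.
async function recognizeOnce(worker, image, langs = 'eng') {
  try {
    await worker.load();
    await worker.loadLanguage(langs);
    await worker.initialize(langs);
    // Each call resolves to { jobId, data }; only data.text is used here.
    const { data: { text } } = await worker.recognize(image);
    return text;
  } finally {
    // Runs whether recognition succeeded or threw.
    await worker.terminate();
  }
}
```

With tesseract.js loaded, a call would look like `recognizeOnce(createWorker(), image)`.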
<a name="worker-load"></a>
### Worker.load(jobId): Promise
Worker.load() loads the tesseract.js-core scripts (downloading them from the remote if they are not present) and makes the Web Worker/Child Process ready for the next action.
**Arguments:**
- `jobId` Please see details above
**Examples:**
```javascript
(async () => {
await worker.load();
})();
```
<a name="worker-load-language"></a>
### Worker.loadLanguage(langs, jobId): Promise
Worker.loadLanguage() loads traineddata from the cache or downloads it from the remote, and puts it into the WebAssembly file system.
**Arguments:**
- `langs` a string to indicate the languages whose traineddata should be downloaded, multiple languages are concatenated with **+**, ex: **eng+chi\_tra**
- `jobId` Please see details above
**Examples:**
```javascript
(async () => {
await worker.loadLanguage('eng+chi_tra');
})();
```
<a name="worker-initialize"></a>
### Worker.initialize(langs, oem, jobId): Promise
Worker.initialize() initializes the Tesseract API and makes sure it is ready for OCR tasks.
**Arguments:**
- `langs` a string to indicate the languages loaded by the Tesseract API, it can be a subset of the language traineddata you loaded with Worker.loadLanguage.
- `oem` an enum to indicate the OCR Engine Mode you use
- `jobId` Please see details above
**Examples:**
```javascript
(async () => {
/** You can load more languages in advance, but use only part of them in Worker.initialize() */
await worker.loadLanguage('eng+chi_tra');
await worker.initialize('eng');
})();
```
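Because `initialize` may only use languages that were loaded first, it can help to check the two `+`-separated strings before calling it. A minimal sketch, assuming plain string comparison is enough; `isLoadedSubset` is a hypothetical helper, not part of tesseract.js.

```javascript
// Hypothetical helper: checks that every language passed to initialize()
// was also passed to loadLanguage(). Both arguments use the '+' syntax.
function isLoadedSubset(initLangs, loadedLangs) {
  const loaded = new Set(loadedLangs.split('+'));
  return initLangs.split('+').every((lang) => loaded.has(lang));
}

console.log(isLoadedSubset('eng', 'eng+chi_tra'));     // true
console.log(isLoadedSubset('eng+deu', 'eng+chi_tra')); // false
```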
<a name="worker-set-parameters"></a>
### Worker.setParameters(params, jobId): Promise
Worker.setParameters() sets parameters for the Tesseract API (using SetVariable()). It changes the behavior of Tesseract, and some parameters, such as tessedit\_char\_whitelist, are very useful.
**Arguments:**
- `params` an object with key and value of the parameters
- `jobId` Please see details above
**Supported Parameters:**
| name | type | default value | description |
| ---- | ---- | ------------- | ----------- |
| tessedit\_ocr\_engine\_mode | enum | OEM.LSTM\_ONLY | Check [HERE](https://github.com/tesseract-ocr/tesseract/blob/4.0.0/src/ccstruct/publictypes.h#L268) for definition of each mode |
| tessedit\_pageseg\_mode | enum | PSM.SINGLE\_BLOCK | Check [HERE](https://github.com/tesseract-ocr/tesseract/blob/4.0.0/src/ccstruct/publictypes.h#L163) for definition of each mode |
| tessedit\_char\_whitelist | string | '' | restricts the result to the listed characters, useful when the content of the image is limited |
| preserve\_interword\_spaces | string | '0' | '0' or '1', keeps the space between words |
| tessjs\_create\_hocr | string | '1' | only 2 values, '0' or '1', when the value is '1', tesseract.js includes hocr in the result |
| tessjs\_create\_tsv | string | '1' | only 2 values, '0' or '1', when the value is '1', tesseract.js includes tsv in the result |
| tessjs\_create\_box | string | '0' | only 2 values, '0' or '1', when the value is '1', tesseract.js includes box in the result |
| tessjs\_create\_unlv | string | '0' | only 2 values, '0' or '1', when the value is '1', tesseract.js includes unlv in the result |
| tessjs\_create\_osd | string | '0' | only 2 values, '0' or '1', when the value is '1', tesseract.js includes osd in the result |
**Examples:**
```javascript
(async () => {
await worker.setParameters({
tessedit_char_whitelist: '0123456789',
});
})();
```
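The `tessjs_create_*` parameters in the table above take the strings `'0'` or `'1'` rather than booleans. The sketch below builds such a `params` object from booleans; `outputFlags` is a hypothetical helper, and the list of format names is taken from the table.

```javascript
// Hypothetical helper: turns booleans into the '0'/'1' strings that the
// tessjs_create_* parameters in the table above expect.
function outputFlags(formats) {
  const supported = ['hocr', 'tsv', 'box', 'unlv', 'osd'];
  const params = {};
  for (const [name, enabled] of Object.entries(formats)) {
    if (!supported.includes(name)) {
      throw new Error(`Unsupported output format: ${name}`);
    }
    params[`tessjs_create_${name}`] = enabled ? '1' : '0';
  }
  return params;
}

console.log(outputFlags({ hocr: false, box: true }));
// { tessjs_create_hocr: '0', tessjs_create_box: '1' }
```

The returned object can be passed directly to `worker.setParameters()`.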
<a name="worker-recognize"></a>
### Worker.recognize(image, options, jobId): Promise
Worker.recognize() provides the core function of Tesseract.js: it executes OCR.
Figures out what words are in `image`, where the words are in `image`, etc.

@@ -8,138 +177,194 @@ > Note: `image` should be sufficiently high resolution.

**Arguments:**
- `image` see [Image Format](./image-format.md) for more details.
- `lang` property with a value from the [list of lang parameters](./tesseract_lang_list.md), you can use multiple languages separated by '+', ex. `eng+chi_tra`
- `options` a flat json object that may include properties that override some subset of the [default tesseract parameters](./tesseract_parameters.md)
- `options` an object of customized options
- `rectangles` an array of objects to specify the regions you want to recognize in the image, each object should contain top, left, width and height, see example below.
- `jobId` Please see details above
Returns a [TesseractJob](#tesseractjob) whose `then`, `progress`, `catch` and `finally` methods can be used to act on the result.
**Output:**
### Simple Example:
**Examples:**
```javascript
const worker = new Tesseract.TesseractWorker();
worker
.recognize(myImage)
.then(function(result){
console.log(result);
});
const { createWorker } = Tesseract;
(async () => {
const worker = createWorker();
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(image);
console.log(text);
})();
```
### More Complicated Example:
With rectangles
```javascript
const worker = new Tesseract.TesseractWorker();
// if we know our image is of spanish words without the letter 'e':
worker
.recognize(myImage, 'spa', {
tessedit_char_blacklist: 'e',
})
.then(function(result){
console.log(result);
const { createWorker } = Tesseract;
(async () => {
const worker = createWorker();
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(image, {
rectangles: [
{ top: 0, left: 0, width: 100, height: 100 },
],
});
console.log(text);
})();
```
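Regions are often known as two corner points rather than as an origin plus a size. As a small sketch, the hypothetical helper `toRectangle` below converts two corners into the `{ top, left, width, height }` shape that `rectangles` expects; it assumes pixel coordinates with the origin at the top-left of the image.

```javascript
// Hypothetical helper: converts two corner points (x0, y0) and (x1, y1)
// into the { top, left, width, height } shape used by `rectangles`.
function toRectangle(x0, y0, x1, y1) {
  return {
    left: Math.min(x0, x1),
    top: Math.min(y0, y1),
    width: Math.abs(x1 - x0),
    height: Math.abs(y1 - y0),
  };
}

console.log(toRectangle(10, 20, 110, 70));
// { left: 10, top: 20, width: 100, height: 50 }
```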
## TesseractWorker.detect(image) -> [TesseractJob](#tesseractjob)
<a name="worker-detect"></a>
### Worker.detect(image, jobId): Promise
Figures out what script (e.g. 'Latin', 'Chinese') the words in image are written in.
Worker.detect() performs OSD (Orientation and Script Detection) on the image instead of OCR.
**Arguments:**
- `image` see [Image Format](./image-format.md) for more details.
- `jobId` Please see details above
Returns a [TesseractJob](#tesseractjob) whose `then`, `progress`, `catch` and `finally` methods can be used to act on the result of the script.
**Examples:**
```javascript
const worker = new Tesseract.TesseractWorker();
worker
.detect(myImage)
.then(function(result){
console.log(result);
});
const { createWorker } = Tesseract;
(async () => {
const worker = createWorker();
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data } = await worker.detect(image);
console.log(data);
})();
```
## TesseractJob
<a name="worker-terminate"></a>
### Worker.terminate(jobId): Promise
A TesseractJob is an object returned by a call to `recognize` or `detect`. It's inspired by the ES6 Promise interface and provides `then` and `catch` methods. It also provides `finally` method, which will be fired regardless of the job fate. One important difference is that these methods return the job itself (to enable chaining) rather than new.
Worker.terminate() terminates the worker and cleans up.
Typical use is:
**Arguments:**
- `jobId` Please see details above
```javascript
const worker = new Tesseract.TesseractWorker();
worker.recognize(myImage)
.progress(message => console.log(message))
.catch(err => console.error(err))
.then(result => console.log(result))
.finally(resultOrError => console.log(resultOrError));
(async () => {
await worker.terminate();
})();
```
Which is equivalent to:
<a name="create-scheduler"></a>
## createScheduler(): Scheduler
createScheduler() is a factory function that creates a scheduler. A scheduler manages a job queue and a pool of workers so that multiple workers can work together, which is useful when you want to speed up processing.
**Examples:**
```javascript
const worker = new Tesseract.TesseractWorker();
const job1 = worker.recognize(myImage);
const { createScheduler } = Tesseract;
const scheduler = createScheduler();
```
job1.progress(message => console.log(message));
### Scheduler
job1.catch(err => console.error(err));
<a name="scheduler-add-worker"></a>
### Scheduler.addWorker(worker): string
job1.then(result => console.log(result));
Scheduler.addWorker() adds a worker into the worker pool inside the scheduler. It is suggested to add each worker to only one scheduler.
job1.finally(resultOrError => console.log(resultOrError));
```
**Arguments:**
- `worker` see Worker above
**Examples:**
### TesseractJob.progress(callback: function) -> TesseractJob
Sets `callback` as the function that will be called every time the job progresses.
- `callback` is a function with the signature `callback(progress)` where `progress` is a json object.
For example:
```javascript
const worker = new Tesseract.TesseractWorker();
worker.recognize(myImage)
.progress(function(message){console.log('progress is: ', message)});
const { createWorker, createScheduler } = Tesseract;
const scheduler = createScheduler();
const worker = createWorker();
scheduler.addWorker(worker);
```
The console will show something like:
<a name="scheduler-add-job"></a>
### Scheduler.addJob(action, ...payload): Promise
Scheduler.addJob() adds a job to the job queue. The scheduler then waits for an idle worker to take the job.
**Arguments:**
- `action` a string to indicate the action you want to do, right now only **recognize** and **detect** are supported
- `payload` an arbitrary number of arguments, depending on the action you called.
**Examples:**
```javascript
progress is: {loaded_lang_model: "eng", from_cache: true}
progress is: {initialized_with_lang: "eng"}
progress is: {set_variable: Object}
progress is: {set_variable: Object}
progress is: {recognized: 0}
progress is: {recognized: 0.3}
progress is: {recognized: 0.6}
progress is: {recognized: 0.9}
progress is: {recognized: 1}
(async () => {
const { data: { text } } = await scheduler.addJob('recognize', image, options);
const { data } = await scheduler.addJob('detect', image);
})();
```
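To illustrate the idea of a job queue feeding idle workers, here is a self-contained sketch. It is not tesseract.js's implementation: `createMiniScheduler` is a hypothetical name, and the only assumption about a worker is that `worker[action](...payload)` returns a Promise.

```javascript
// Illustrative sketch only; this is NOT tesseract.js's implementation.
// A tiny scheduler: jobs wait in a queue until a worker is idle.
function createMiniScheduler() {
  const idle = [];   // workers currently waiting for a job
  const queue = [];  // pending jobs: { action, payload, resolve, reject }
  let numWorkers = 0;

  const dispatch = () => {
    while (idle.length > 0 && queue.length > 0) {
      const worker = idle.shift();
      const { action, payload, resolve, reject } = queue.shift();
      Promise.resolve()
        .then(() => worker[action](...payload))
        .then(resolve, reject)
        .finally(() => {
          idle.push(worker); // the worker is free again
          dispatch();        // hand it the next queued job, if any
        });
    }
  };

  return {
    addWorker(worker) {
      numWorkers += 1;
      idle.push(worker);
      dispatch();
    },
    addJob(action, ...payload) {
      return new Promise((resolve, reject) => {
        queue.push({ action, payload, resolve, reject });
        dispatch();
      });
    },
    getQueueLen: () => queue.length,
    getNumWorkers: () => numWorkers,
  };
}
```

With two workers and five jobs added at once, two jobs start immediately and three remain queued until a worker becomes idle.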
<a name="scheduler-get-queue-len"></a>
### Scheduler.getQueueLen(): number
### TesseractJob.then(callback: function) -> TesseractJob
Sets `callback` as the function that will be called if and when the job successfully completes.
- `callback` is a function with the signature `callback(result)` where `result` is a json object.
Scheduler.getQueueLen() returns the length of the job queue.
<a name="scheduler-get-num-workers"></a>
### Scheduler.getNumWorkers(): number
For example:
Scheduler.getNumWorkers() returns number of workers added into the scheduler
<a name="scheduler-terminate"></a>
### Scheduler.terminate(): Promise
Scheduler.terminate() terminates all workers added, useful to do quick clean up.
**Examples:**
```javascript
const worker = new Tesseract.TesseractWorker();
worker.recognize(myImage)
.then(function(result){console.log('result is: ', result)});
(async () => {
await scheduler.terminate();
})();
```
The console will show something like:
<a name="set-logging"></a>
## setLogging(logging: boolean)
setLogging() sets the logging flag. Call `setLogging(true)` to see detailed information, which is useful for debugging.
**Arguments:**
- `logging` boolean to define whether to see detailed logs, default: false
**Examples:**
```javascript
result is: {
blocks: Array[1]
confidence: 87
html: "<div class='ocr_page' id='page_1' ..."
lines: Array[3]
oem: "DEFAULT"
paragraphs: Array[1]
psm: "SINGLE_BLOCK"
symbols: Array[33]
text: "Hello World↵from beyond↵the Cosmic Void↵↵"
version: "3.04.00"
words: Array[7]
}
const { setLogging } = Tesseract;
setLogging(true);
```
### TesseractJob.catch(callback: function) -> TesseractJob
Sets `callback` as the function that will be called if the job fails.
- `callback` is a function with the signature `callback(error)` where `error` is a json object.
<a name="recognize"></a>
## recognize(image, langs, options): Promise
### TesseractJob.finally(callback: function) -> TesseractJob
Sets `callback` as the function that will be called regardless if the job fails or success.
- `callback` is a function with the signature `callback(resultOrError)` where `resultOrError` is a json object.
recognize() is a shortcut for running a single recognition task. It is not recommended for real applications, but it is useful when you want to save some time.
See [Tesseract.js](../src/Tesseract.js)
<a name="detect"></a>
## detect(image, options): Promise
Same background as recognize(), but it runs detect instead.
See [Tesseract.js](../src/Tesseract.js)
<a name="psm"></a>
## PSM
See [PSM.js](../src/constants/PSM.js)
<a name="oem"></a>
## OEM
See [OEM.js](../src/constants/OEM.js)

@@ -15,16 +15,14 @@ # Tesseract.js Examples

```javascript
import Tesseract from 'tesseract.js';
import { createWorker } from 'tesseract.js';
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
const worker = createWorker();
worker
.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png')
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
});
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
await worker.terminate();
})();
```

@@ -35,16 +33,16 @@

```javascript
import Tesseract from 'tesseract.js';
import { createWorker } from 'tesseract.js';
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
const worker = createWorker({
logger: m => console.log(m), // Add logger here
});
worker
.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png')
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
});
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
await worker.terminate();
})();
```

@@ -55,50 +53,36 @@

```javascript
import Tesseract from 'tesseract.js';
import { createWorker } from 'tesseract.js';
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
const worker = createWorker();
worker
.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng+chi_tra'
)
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
});
(async () => {
await worker.load();
await worker.loadLanguage('eng+chi_tra');
await worker.initialize('eng+chi_tra');
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
await worker.terminate();
})();
```
### with whitelist char (^2.0.0-beta.1)
### with whitelist char (^2.0.0-alpha.5)
Sadly, whitelist chars are not supported in tesseract v4, so in tesseract.js we need to switch to tesseract v3 mode to make it work.
```javascript
import Tesseract from 'tesseract.js';
import { createWorker } from 'tesseract.js';
const { TesseractWorker, OEM } = Tesseract;
const worker = new TesseractWorker();
const worker = createWorker();
worker
.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng',
{
'tessedit_ocr_engine_mode': OEM.TESSERACT_ONLY,
'tessedit_char_whitelist': '0123456789-.',
}
)
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
await worker.setParameters({
tessedit_char_whitelist: '0123456789',
});
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
await worker.terminate();
})();
```
### with different pageseg mode (^2.0.0-alpha.5)
### with different pageseg mode (^2.0.0-beta.1)

@@ -108,125 +92,71 @@ Check here for more details of pageseg mode: https://github.com/tesseract-ocr/tesseract/blob/4.0.0/src/ccstruct/publictypes.h#L163

```javascript
import Tesseract from 'tesseract.js';
import { createWorker, PSM } from 'tesseract.js';
const { TesseractWorker, PSM } = Tesseract;
const worker = new TesseractWorker();
const worker = createWorker();
worker
.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng',
{
'tessedit_pageseg_mode': PSM.SINGLE_BLOCK,
}
)
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
await worker.setParameters({
tessedit_pageseg_mode: PSM.SINGLE_BLOCK,
});
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
await worker.terminate();
})();
```
### with pdf output (^2.0.0-alpha.12)
### with pdf output (^2.0.0-beta.1)
In this example, the pdf file is downloaded in the browser and written to the file system in Node.js.
Please check **examples** folder for details.
```javascript
import Tesseract from 'tesseract.js';
Browser: [download-pdf.html](../examples/browser/download-pdf.html)
Node: [download-pdf.js](../examples/node/download-pdf.js)
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
### with only part of the image (^2.0.0-beta.1)
worker
.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng',
{
'tessjs_create_pdf': '1',
}
)
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
});
```
If you want to handle pdf file by yourself
```javascript
import Tesseract from 'tesseract.js';
import { createWorker } from 'tesseract.js';
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
const worker = createWorker();
const rectangles = [
{ left: 0, top: 0, width: 500, height: 250 },
];
worker
.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng',
{
'tessjs_create_pdf': '1',
'tessjs_pdf_auto_download': false, // disable auto download
'tessjs_pdf_bin': true, // add pdf file bin array in result
}
)
.progress((p) => {
console.log('progress', p);
})
.then(({ files: { pdf } }) => {
console.log(Object.values(pdf)); // As pdf is an array-like object, you need to do a little conversion first.
worker.terminate();
});
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png', { rectangles });
console.log(text);
await worker.terminate();
})();
```
### with preload language data
### with multiple workers to speed up (^2.0.0-beta.1)
```javascript
const Tesseract = require('tesseract.js');
import { createWorker, createScheduler } from 'tesseract.js';
const { TesseractWorker, utils: { loadLang } } = Tesseract;
const worker = new TesseractWorker();
const scheduler = createScheduler();
const worker1 = createWorker();
const worker2 = createWorker();
loadLang({ langs: 'eng', langPath: worker.options.langPath })
.then(() => {
worker
.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png')
.progress(p => console.log(p))
.then(({ text }) => {
console.log(text);
worker.terminate();
});
});
(async () => {
await worker1.load();
await worker2.load();
await worker1.loadLanguage('eng');
await worker2.loadLanguage('eng');
await worker1.initialize('eng');
await worker2.initialize('eng');
scheduler.addWorker(worker1);
scheduler.addWorker(worker2);
/** Add 10 recognition jobs */
const results = await Promise.all(Array(10).fill(0).map(() => (
scheduler.addJob('recognize', 'https://tesseract.projectnaptha.com/img/eng_bw.png')
)));
console.log(results);
await scheduler.terminate(); // It also terminates all workers.
})();
```
### with only part of the image (^2.0.0-alpha.12)
```javascript
import Tesseract from 'tesseract.js';
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
worker
.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng',
{
tessjs_image_rectangle_left: 0,
tessjs_image_rectangle_top: 0,
tessjs_image_rectangle_width: 500,
tessjs_image_rectangle_height: 250,
}
)
.progress((p) => {
console.log('progress', p);
})
.then(({ text }) => {
console.log(text);
worker.terminate();
});
```

@@ -6,5 +6,5 @@ FAQ

When you execute recognize() function (ex: `recognize(image, 'eng')`), the language model to download is determined by the 2nd argument of recognize(). (`eng` in the example)
The language model is downloaded by `worker.loadLanguage()` and you need to pass the langs to `worker.initialize()`.
Tesseract.js will first check if \*.traineddata already exists. (browser: [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API), Node.js: fs, in the folder you execute the command) If the \*.traineddata doesn't exist, it will fetch \*.traineddata.gz from [tessdata](https://github.com/naptha/tessdata), ungzip and store in IndexedDB or fs, you can delete it manually and it will download again for you.
While downloading a language model, Tesseract.js first checks whether \*.traineddata already exists (browser: [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API), Node.js: fs, in the folder where you execute the command). If the \*.traineddata doesn't exist, it fetches \*.traineddata.gz from [tessdata](https://github.com/naptha/tessdata), ungzips it and stores it in IndexedDB or fs. You can delete it manually and it will be downloaded again for you.
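The lookup described above (check the cache, otherwise fetch the `.gz`, ungzip, store) can be sketched in Node as follows. This is an illustration under stated assumptions, not the library's code: `getTrainedData` is a hypothetical function, `cache` is any Map-like stand-in for IndexedDB or fs, and `fetchGz` is a caller-supplied async function that returns the gzipped bytes.

```javascript
const zlib = require('zlib');

// Sketch of the lookup described above. `cache` and `fetchGz` are stand-ins
// for IndexedDB/fs and the real download; neither is a tesseract.js API.
async function getTrainedData(lang, cache, fetchGz) {
  const key = `${lang}.traineddata`;
  if (cache.has(key)) {
    return cache.get(key); // cache hit: nothing is downloaded
  }
  const gz = await fetchGz(`${key}.gz`);
  const data = zlib.gunzipSync(gz); // ungzip before storing
  cache.set(key, data);
  return data;
}
```

Deleting the cached entry makes the next call download the file again, which mirrors the manual-delete behavior described above.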

@@ -19,24 +19,26 @@ ## How can I train my own \*.traineddata?

Starting from 2.0.0-alpha.10, you can get all of this information in the final result.
Starting from 2.0.0-beta.1, you can get all of this information in the final result.
```javascript
import Tesseract from 'tesseract.js';
import { createWorker } from 'tesseract.js';
const worker = createWorker({
logger: m => console.log(m)
});
const { TesseractWorker } = Tesseract;
const worker = new TesseractWorker();
worker
.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png', 'eng', {
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
await worker.setParameters({
tessjs_create_box: '1',
tessjs_create_unlv: '1',
tessjs_create_osd: '1',
})
.then((result) => {
console.log(result.text);
console.log(result.hocr);
console.log(result.tsv);
console.log(result.box);
console.log(result.unlv);
console.log(result.osd);
});
const { data: { text, hocr, tsv, box, unlv } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
console.log(hocr);
console.log(tsv);
console.log(box);
console.log(unlv);
})();
```

@@ -12,6 +12,16 @@ ## Local Installation

```javascript
const worker = Tesseract.TesseractWorker({
workerPath: 'https://unpkg.com/tesseract.js@v2.0.0-alpha.13/dist/worker.min.js',
Tesseract.recognize(image, langs, {
workerPath: 'https://unpkg.com/tesseract.js@v2.0.0-beta.1/dist/worker.min.js',
langPath: 'https://tessdata.projectnaptha.com/4.0.0',
corePath: 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.10/tesseract-core.wasm.js',
corePath: 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.13/tesseract-core.wasm.js',
})
```
Or
```javascript
const worker = createWorker({
workerPath: 'https://unpkg.com/tesseract.js@v2.0.0-beta.1/dist/worker.min.js',
langPath: 'https://tessdata.projectnaptha.com/4.0.0',
corePath: 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.13/tesseract-core.wasm.js',
});

@@ -27,4 +37,4 @@ ```

### corePath
A string specifying the location of the [tesseract.js-core library](https://github.com/naptha/tesseract.js-core), with default value 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.10/tesseract-core.wasm.js' (fallback to tesseract-core.asm.js when WebAssembly is not available).
A string specifying the location of the [tesseract.js-core library](https://github.com/naptha/tesseract.js-core), with default value 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.13/tesseract-core.wasm.js' (fallback to tesseract-core.asm.js when WebAssembly is not available).
Another WASM option is 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.10/tesseract-core.js' which is a script that loads 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.10/tesseract-core.wasm'. But it fails to fetch at this moment.
Another WASM option is 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.13/tesseract-core.js' which is a script that loads 'https://unpkg.com/tesseract.js-core@v2.0.0-beta.13/tesseract-core.wasm'. But it fails to fetch at this moment.
# Tesseract Languages
The `lang` property of the options object passed to `Tesseract.recognize` can have one of the following values (the default is `'eng'`.):
Lang Code | Language | 4.0 traineddata
:---------| :------- | :---------------
afr | Afrikaans | [afr.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/afr.traineddata.gz)
amh | Amharic | [amh.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/amh.traineddata.gz)
ara | Arabic | [ara.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ara.traineddata.gz)
asm | Assamese | [asm.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/asm.traineddata.gz)
aze | Azerbaijani | [aze.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/aze.traineddata.gz)
aze_cyrl | Azerbaijani - Cyrillic | [aze_cyrl.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/aze_cyrl.traineddata.gz)
bel | Belarusian | [bel.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/bel.traineddata.gz)
ben | Bengali | [ben.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ben.traineddata.gz)
bod | Tibetan | [bod.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/bod.traineddata.gz)
bos | Bosnian | [bos.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/bos.traineddata.gz)
bul | Bulgarian | [bul.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/bul.traineddata.gz)
cat | Catalan; Valencian | [cat.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/cat.traineddata.gz)
ceb | Cebuano | [ceb.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ceb.traineddata.gz)
ces | Czech | [ces.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ces.traineddata.gz)
chi_sim | Chinese - Simplified | [chi_sim.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/chi_sim.traineddata.gz)
chi_tra | Chinese - Traditional | [chi_tra.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/chi_tra.traineddata.gz)
chr | Cherokee | [chr.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/chr.traineddata.gz)
cym | Welsh | [cym.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/cym.traineddata.gz)
dan | Danish | [dan.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/dan.traineddata.gz)
deu | German | [deu.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/deu.traineddata.gz)
dzo | Dzongkha | [dzo.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/dzo.traineddata.gz)
ell | Greek, Modern (1453-) | [ell.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ell.traineddata.gz)
eng | English | [eng.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/eng.traineddata.gz)
enm | English, Middle (1100-1500) | [enm.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/enm.traineddata.gz)
epo | Esperanto | [epo.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/epo.traineddata.gz)
est | Estonian | [est.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/est.traineddata.gz)
eus | Basque | [eus.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/eus.traineddata.gz)
fas | Persian | [fas.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/fas.traineddata.gz)
fin | Finnish | [fin.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/fin.traineddata.gz)
fra | French | [fra.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/fra.traineddata.gz)
frk | Frankish | [frk.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/frk.traineddata.gz)
frm | French, Middle (ca. 1400-1600) | [frm.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/frm.traineddata.gz)
gle | Irish | [gle.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/gle.traineddata.gz)
glg | Galician | [glg.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/glg.traineddata.gz)
grc | Greek, Ancient (-1453) | [grc.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/grc.traineddata.gz)
guj | Gujarati | [guj.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/guj.traineddata.gz)
hat | Haitian; Haitian Creole | [hat.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/hat.traineddata.gz)
heb | Hebrew | [heb.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/heb.traineddata.gz)
hin | Hindi | [hin.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/hin.traineddata.gz)
hrv | Croatian | [hrv.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/hrv.traineddata.gz)
hun | Hungarian | [hun.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/hun.traineddata.gz)
iku | Inuktitut | [iku.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/iku.traineddata.gz)
ind | Indonesian | [ind.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ind.traineddata.gz)
isl | Icelandic | [isl.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/isl.traineddata.gz)
ita | Italian | [ita.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ita.traineddata.gz)
ita_old | Italian - Old | [ita_old.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ita_old.traineddata.gz)
jav | Javanese | [jav.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/jav.traineddata.gz)
jpn | Japanese | [jpn.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/jpn.traineddata.gz)
kan | Kannada | [kan.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kan.traineddata.gz)
kat | Georgian | [kat.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kat.traineddata.gz)
kat_old | Georgian - Old | [kat_old.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kat_old.traineddata.gz)
kaz | Kazakh | [kaz.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kaz.traineddata.gz)
khm | Central Khmer | [khm.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/khm.traineddata.gz)
kir | Kirghiz; Kyrgyz | [kir.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kir.traineddata.gz)
kor | Korean | [kor.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kor.traineddata.gz)
kur | Kurdish | [kur.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/kur.traineddata.gz)
lao | Lao | [lao.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/lao.traineddata.gz)
lat | Latin | [lat.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/lat.traineddata.gz)
lav | Latvian | [lav.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/lav.traineddata.gz)
lit | Lithuanian | [lit.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/lit.traineddata.gz)
mal | Malayalam | [mal.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/mal.traineddata.gz)
mar | Marathi | [mar.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/mar.traineddata.gz)
mkd | Macedonian | [mkd.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/mkd.traineddata.gz)
mlt | Maltese | [mlt.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/mlt.traineddata.gz)
msa | Malay | [msa.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/msa.traineddata.gz)
mya | Burmese | [mya.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/mya.traineddata.gz)
nep | Nepali | [nep.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/nep.traineddata.gz)
nld | Dutch; Flemish | [nld.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/nld.traineddata.gz)
nor | Norwegian | [nor.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/nor.traineddata.gz)
ori | Oriya | [ori.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ori.traineddata.gz)
pan | Panjabi; Punjabi | [pan.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/pan.traineddata.gz)
pol | Polish | [pol.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/pol.traineddata.gz)
por | Portuguese | [por.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/por.traineddata.gz)
pus | Pushto; Pashto | [pus.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/pus.traineddata.gz)
ron | Romanian; Moldavian; Moldovan | [ron.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ron.traineddata.gz)
rus | Russian | [rus.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/rus.traineddata.gz)
san | Sanskrit | [san.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/san.traineddata.gz)
sin | Sinhala; Sinhalese | [sin.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/sin.traineddata.gz)
slk | Slovak | [slk.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/slk.traineddata.gz)
slv | Slovenian | [slv.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/slv.traineddata.gz)
spa | Spanish; Castilian | [spa.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/spa.traineddata.gz)
spa_old | Spanish; Castilian - Old | [spa_old.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/spa_old.traineddata.gz)
sqi | Albanian | [sqi.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/sqi.traineddata.gz)
srp | Serbian | [srp.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/srp.traineddata.gz)
srp_latn | Serbian - Latin | [srp_latn.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/srp_latn.traineddata.gz)
swa | Swahili | [swa.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/swa.traineddata.gz)
swe | Swedish | [swe.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/swe.traineddata.gz)
syr | Syriac | [syr.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/syr.traineddata.gz)
tam | Tamil | [tam.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tam.traineddata.gz)
tel | Telugu | [tel.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tel.traineddata.gz)
tgk | Tajik | [tgk.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tgk.traineddata.gz)
tgl | Tagalog | [tgl.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tgl.traineddata.gz)
tha | Thai | [tha.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tha.traineddata.gz)
tir | Tigrinya | [tir.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tir.traineddata.gz)
tur | Turkish | [tur.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/tur.traineddata.gz)
uig | Uighur; Uyghur | [uig.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/uig.traineddata.gz)
ukr | Ukrainian | [ukr.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/ukr.traineddata.gz)
urd | Urdu | [urd.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/urd.traineddata.gz)
uzb | Uzbek | [uzb.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/uzb.traineddata.gz)
uzb_cyrl | Uzbek - Cyrillic | [uzb_cyrl.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/uzb_cyrl.traineddata.gz)
vie | Vietnamese | [vie.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/vie.traineddata.gz)
yid | Yiddish | [yid.traineddata.gz](https://tessdata.projectnaptha.com/4.0.0/yid.traineddata.gz)
See the [Tesseract data files page](https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-400-november-29-2016) for the full list of supported languages.
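
Every download link in the table follows the same pattern: the base URL, the language code, and the `.traineddata.gz` suffix. As a minimal sketch, the link for a given code could be built as shown below. The helper name `traineddataUrl` and the code-format check are assumptions made for illustration; they are not part of tesseract.js.

```javascript
// Base URL copied from the links in the table above.
const TESSDATA_BASE = 'https://tessdata.projectnaptha.com/4.0.0';

// Hypothetical helper (not a tesseract.js API): builds the download URL
// for a language code such as 'fra' or 'srp_latn'.
function traineddataUrl(langCode) {
  // Assumed format: three lowercase letters, optionally followed by
  // an underscore and a lowercase suffix (e.g. 'ita_old').
  if (!/^[a-z]{3}(_[a-z]+)?$/.test(langCode)) {
    throw new Error(`Unexpected language code: ${langCode}`);
  }
  return `${TESSDATA_BASE}/${langCode}.traineddata.gz`;
}

console.log(traineddataUrl('fra'));
console.log(traineddataUrl('srp_latn'));
```

The sketch only formats the URL; it does not check that the file actually exists on the server.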
#!/usr/bin/env node
const path = require('path');
const { TesseractWorker } = require('../../');
const Tesseract = require('../../');
const [,, imagePath] = process.argv;
const image = path.resolve(__dirname, (imagePath || '../../tests/assets/images/cosmic.png'));
const tessWorker = new TesseractWorker();
console.log(`Detecting ${image}`);
console.log(`Recognizing ${image}`);
tessWorker.detect(image)
.progress((info) => {
console.log(info);
})
.then((data) => {
console.log('done', data);
process.exit();
Tesseract.detect(image, { logger: m => console.log(m) })
.then(({ data }) => {
console.log(data);
});
#!/usr/bin/env node
const path = require('path');
const { TesseractWorker } = require('../../');
const Tesseract = require('../../');
const [,, imagePath] = process.argv;
const image = path.resolve(__dirname, (imagePath || '../../tests/assets/images/cosmic.png'));
const tessWorker = new TesseractWorker();
console.log(`Recognizing ${image}`);
tessWorker.recognize(image)
.progress((info) => {
console.log(info);
})
.then((data) => {
console.log(data.text);
})
.catch((err) => {
console.log('Error\n', err);
})
.finally(() => {
process.exit();
Tesseract.recognize(image, 'eng', { logger: m => console.log(m) })
.then(({ data: { text } }) => {
console.log(text);
});
{
"name": "tesseract.js",
"version": "2.0.0-alpha.16",
"version": "2.0.0-beta.1",
"description": "Pure Javascript Multilingual OCR",
"main": "src/index.js",
"types": "types/index.d.ts",
"types": "src/index.d.ts",
"unpkg": "dist/tesseract.min.js",

@@ -13,9 +13,11 @@ "jsdelivr": "dist/tesseract.min.js",

"prepublishOnly": "npm run build",
"wait": "wait-on http://localhost:3000/package.json",
"wait": "rimraf dist && wait-on http://localhost:3000/dist/tesseract.dev.js",
"test": "npm-run-all -p -r start test:all",
"test:all": "npm-run-all wait test:browser:* test:node",
"test:node": "nyc mocha --exit --bail --require ./scripts/test-helper.js ./tests/*.test.js",
"test:browser-tpl": "mocha-headless-chrome -a incognito -a no-sandbox -a disable-setuid-sandbox -t 300000",
"test:all": "npm-run-all wait test:browser:* test:node:all",
"test:node": "nyc mocha --exit --bail --require ./scripts/test-helper.js",
"test:node:all": "npm run test:node:one -- ./tests/*.test.js",
"test:browser-tpl": "mocha-headless-chrome -a incognito -a no-sandbox -a disable-setuid-sandbox -a disable-logging -t 300000",
"test:browser:detect": "npm run test:browser-tpl -- -f ./tests/detect.test.html",
"test:browser:recognize": "npm run test:browser-tpl -- -f ./tests/recognize.test.html",
"test:browser:scheduler": "npm run test:browser-tpl -- -f ./tests/scheduler.test.html",
"lint": "eslint src",

@@ -25,3 +27,3 @@ "postinstall": "opencollective-postinstall || true"

"browser": {
"./src/node/index.js": "./src/browser/index.js"
"./src/worker/node/index.js": "./src/worker/browser/index.js"
},

@@ -58,8 +60,11 @@ "author": "",

"axios": "^0.18.0",
"check-types": "^7.4.0",
"bmp-js": "^0.1.0",
"file-type": "^12.3.0",
"idb-keyval": "^3.2.0",
"is-url": "1.2.2",
"opencollective-postinstall": "^2.0.2",
"regenerator-runtime": "^0.13.3",
"resolve-url": "^0.2.1",
"tesseract.js-core": "^2.0.0-beta.12",
"tesseract.js-utils": "^1.0.0-beta.8"
"tesseract.js-core": "^2.0.0-beta.13",
"zlibjs": "^0.3.1"
},

@@ -66,0 +71,0 @@ "repository": {

@@ -14,3 +14,3 @@ <p align="center">

<h3 align="center">
Version 2 is now available and under development in the master branch<br>
Version 2 beta is now available and under development in the master branch<br>
Check the <a href="https://github.com/naptha/tesseract.js/tree/support/1.x">support/1.x</a> branch for version 1

@@ -30,21 +30,41 @@ </h3>

```javascript
import { TesseractWorker } from 'tesseract.js';
const worker = new TesseractWorker();
import Tesseract from 'tesseract.js';
worker.recognize(myImage)
.progress(progress => {
console.log('progress', progress);
}).then(result => {
console.log('result', result);
});
Tesseract.recognize(
'https://tesseract.projectnaptha.com/img/eng_bw.png',
'eng',
{ logger: m => console.log(m) }
).then(({ data: { text } }) => {
console.log(text);
})
```
Or, in a more imperative style:
```javascript
import { createWorker } from 'tesseract.js';
const worker = createWorker({
logger: m => console.log(m)
});
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
console.log(text);
await worker.terminate();
})();
```
[Check out the docs](#docs) for a full explanation of the API.
## Major changes in v2
- Upgrade to tesseract v4
## Major changes in v2 beta
- Upgrade to tesseract v4.1 (using emscripten 1.38.45)
- Support multiple languages at the same time, e.g. eng+chi_tra for English and Traditional Chinese
- Supported image formats: png, jpg, bmp, pbm
- Support WebAssembly (falls back to ASM.js when the browser doesn't support it)
- Support TypeScript
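
The multi-language item above joins language codes with `+`. A minimal sketch of building such a string from a list of codes follows. The helper name `joinLangs` is hypothetical and not part of tesseract.js, and the commented-out worker calls assume that `loadLanguage` and `initialize` accept the combined string in the same way they accept `'eng'` in the example above.

```javascript
// Hypothetical helper (not a tesseract.js API): joins language codes
// into the "eng+chi_tra" form described in the list above.
function joinLangs(codes) {
  if (!Array.isArray(codes) || codes.length === 0) {
    throw new Error('At least one language code is required');
  }
  return codes.join('+');
}

const langs = joinLangs(['eng', 'chi_tra']);
console.log(langs);

// Assumed usage with a worker created via createWorker():
//   await worker.loadLanguage(langs);
//   await worker.initialize(langs);
```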

@@ -59,3 +79,3 @@

<!-- v2 -->
<script src='https://unpkg.com/tesseract.js@v2.0.0-alpha.16/dist/tesseract.min.js'></script>
<script src='https://unpkg.com/tesseract.js@v2.0.0-beta.1/dist/tesseract.min.js'></script>

@@ -109,3 +129,3 @@ <!-- v1 -->

The development server will be available at http://localhost:3000/examples/browser/demo.html in your favorite browser.
It will automatically rebuild `tesseract.dev.js` and `worker.min.js` when you change files in the src folder.
It will automatically rebuild `tesseract.dev.js` and `worker.dev.js` when you change files in the **src** folder.

@@ -112,0 +132,0 @@ You can also run the development server in Gitpod ( a free online IDE and dev environment for GitHub that will automate your dev setup ) with a single click.

@@ -13,3 +13,3 @@ const webpack = require('webpack');

app.use('/', express.static(path.resolve(__dirname, '..')));
app.use(middleware(compiler, { publicPath: '/dist' }));
app.use(middleware(compiler, { publicPath: '/dist', writeToDisk: true }));

@@ -16,0 +16,0 @@ module.exports = app.listen(3000, () => {

@@ -0,1 +1,2 @@

const constants = require('../tests/constants');
global.expect = require('expect.js');

@@ -5,1 +6,5 @@ global.fs = require('fs');

global.Tesseract = require('../src');
Object.keys(constants).forEach((key) => {
global[key] = constants[key];
});

@@ -24,3 +24,3 @@ const path = require('path');

devServer: {
allowedHosts: ['localhost', '.gitpod.io'],
allowedHosts: ['localhost', '.gitpod.io'],
},

@@ -37,5 +37,5 @@ });

genConfig({
entry: path.resolve(__dirname, '..', 'src', 'browser', 'worker.js'),
entry: path.resolve(__dirname, '..', 'src', 'worker-script', 'browser', 'index.js'),
filename: 'worker.dev.js',
}),
];

@@ -27,5 +27,5 @@ const path = require('path');

genConfig({
entry: path.resolve(__dirname, '..', 'src', 'browser', 'worker.js'),
entry: path.resolve(__dirname, '..', 'src', 'worker-script', 'browser', 'index.js'),
filename: 'worker.min.js',
}),
];

@@ -10,13 +10,17 @@ /**

*/
const utils = require('tesseract.js-utils');
const TesseractWorker = require('./common/TesseractWorker');
const types = require('./common/types');
require('regenerator-runtime/runtime');
const createScheduler = require('./createScheduler');
const createWorker = require('./createWorker');
const Tesseract = require('./Tesseract');
const OEM = require('./constants/OEM');
const PSM = require('./constants/PSM');
const { setLogging } = require('./utils/log');
module.exports = {
/** Worker for OCR, @see common/TesseractWorker.js */
TesseractWorker,
/** Utilities for tesseract.js, @see {@link https://www.npmjs.com/package/tesseract.js-utils} */
utils,
/** Check ./common/types for more details */
...types,
OEM,
PSM,
createScheduler,
createWorker,
setLogging,
...Tesseract,
};

