esearch-ocr
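This repository is the OCR service dependency of eSearch.
Supports local OCR (based on PaddleOCR).
Uses the onnxruntime web runtime, which runs on wasm; webgl or even webgpu may be used in the future.
Models must be converted to ONNX before use: Paddle2ONNX or online conversion.
Some models are already packaged: Releases
The js files can be debugged with Electron.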
npm i esearch-ocr onnxruntime-web
web
import * as ocr from "esearch-ocr";
import * as ort from "onnxruntime-web";
Node.js or Electron
const ocr = require("esearch-ocr");
const ort = require("onnxruntime-node");
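[!IMPORTANT] You must install onnxruntime yourself (onnxruntime-node or onnxruntime-web, depending on the platform) and pass ort in the init parameters.
This design exists because web and Electron can use different ort builds, which is hard to coordinate, so the choice is left to the developer.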
Browser or Electron example
await ocr.init({
detPath: "ocr/det.onnx",
recPath: "ocr/rec.onnx",
dic: "abcdefg...",
ort,
});
let img = document.createElement("img");
img.src = "data:image/png;base64,...";
img.onload = async () => {
let canvas = document.createElement("canvas");
canvas.width = img.width;
canvas.height = img.height;
canvas.getContext("2d").drawImage(img, 0, 0);
ocr.ocr(canvas.getContext("2d").getImageData(0, 0, img.width, img.height))
.then((l) => {})
.catch((e) => {});
};
Or
const localOCR = await ocr.init({
detPath: "ocr/det.onnx",
recPath: "ocr/rec.onnx",
dic: "abcdefg...",
ort,
});
localOCR.ocr(/* call it the same way as ocr.ocr above */);
This is very useful when OCR needs to run multiple times.
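The reuse pattern can be sketched as follows. This is only an illustration: `createOcrOnce`, `fakeInit`, and `recognizeAll` are hypothetical names that are not part of esearch-ocr, and `fakeInit` stands in for a real `ocr.init({ ... })` call so that the snippet runs without model files.

```javascript
// Hypothetical helper: wraps an init function so that it runs at most once
// and every caller shares the same initialized instance.
function createOcrOnce(initFn) {
    let instancePromise = null;
    return function getInstance() {
        if (instancePromise === null) {
            instancePromise = Promise.resolve().then(initFn);
            // If initialization fails, allow a later call to try again.
            instancePromise.catch(() => {
                instancePromise = null;
            });
        }
        return instancePromise;
    };
}

// Stand-in for `() => ocr.init({ detPath, recPath, dic, ort })`.
// The returned object only mimics the shape of an initialized instance.
let initCalls = 0;
async function fakeInit() {
    initCalls += 1;
    return {
        ocr: async (imageData) => [{ text: `image ${imageData.id}`, mean: 1, box: [] }],
    };
}

const getOcr = createOcrOnce(fakeInit);

// Runs OCR on several images while initializing only on first use.
async function recognizeAll(images) {
    const results = [];
    for (const image of images) {
        const instance = await getOcr();
        results.push(await instance.ocr(image));
    }
    return results;
}
```

In a real project, `fakeInit` would be replaced by a function that calls `ocr.init` with the actual model paths, dictionary content, and `ort` module.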
Node.js example (requires installing canvas)
init type
{
ort: typeof import("onnxruntime-web");
detPath: string;
recPath: string;
dic: string; // file content, not a path
dev?: boolean;
maxSide?: number;
imgh?: number;
imgw?: number;
detShape?: [number, number]; // ppocr v3 requires [960, 960]; v4 uses [640, 640]
canvas?: (w: number, h: number) => any; // for Node.js
imageData?: any; // for Node.js
cv?: any;
}
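As a rough sketch of how these options fit together, the helper below assembles an init object and picks `detShape` from the PP-OCR version, using the two values given in the type comment above. `buildInitOptions` and `ppocrVersion` are hypothetical names introduced for illustration, and the `ort` and dictionary values are placeholders.

```javascript
// detShape values taken from the comment in the init type.
const DET_SHAPES = new Map([
    ["v3", [960, 960]],
    ["v4", [640, 640]],
]);

// Hypothetical helper: builds the object that would be passed to ocr.init().
function buildInitOptions({ ort, detPath, recPath, dicText, ppocrVersion }) {
    const detShape = DET_SHAPES.get(ppocrVersion);
    if (!detShape) {
        throw new Error(`Unknown PP-OCR version: ${ppocrVersion}`);
    }
    // dic must hold the dictionary file's content, not its path.
    if (typeof dicText !== "string" || dicText.length === 0) {
        throw new Error("dicText must be the non-empty content of the dictionary file");
    }
    return { ort, detPath, recPath, dic: dicText, detShape: [...detShape] };
}

// Placeholder values: a real call would pass the imported onnxruntime module
// as `ort` and the text read from the dictionary file as `dicText`.
const options = buildInitOptions({
    ort: {},
    detPath: "ocr/det.onnx",
    recPath: "ocr/rec.onnx",
    dicText: "abc",
    ppocrVersion: "v4",
});
```

The resulting `options` object could then be passed to `ocr.init(options)`.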
ocr type
type PointType = [number, number]
ocr(img: ImageData): Promise<{
text: string;
mean: number;
box: [PointType, PointType, PointType, PointType]; // ↖ ↗ ↘ ↙
}[]>
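A result list of this shape can be post-processed with plain JavaScript. The sketch below assumes that `mean` is a confidence score between 0 and 1, which the source does not state. The helper names, the thresholds, and the sample data are all made up for illustration.

```javascript
// Sample data in the shape returned by ocr(); the values are invented.
const sample = [
    { text: "world", mean: 0.95, box: [[120, 10], [200, 10], [200, 30], [120, 30]] },
    { text: "noise", mean: 0.31, box: [[10, 80], [60, 80], [60, 100], [10, 100]] },
    { text: "hello", mean: 0.98, box: [[10, 12], [100, 12], [100, 32], [10, 32]] },
];

// Axis-aligned rectangle around the four corner points of a box.
function boundingRect(box) {
    const xs = box.map(([x]) => x);
    const ys = box.map(([, y]) => y);
    const left = Math.min(...xs);
    const top = Math.min(...ys);
    return { left, top, width: Math.max(...xs) - left, height: Math.max(...ys) - top };
}

// Drops items below minMean, then orders top-to-bottom and left-to-right.
// Items whose top edges differ by less than lineTolerance pixels are
// treated as being on the same line.
function toReadingOrder(items, minMean = 0.5, lineTolerance = 10) {
    return items
        .filter((item) => item.mean >= minMean)
        .map((item) => ({ ...item, rect: boundingRect(item.box) }))
        .sort((a, b) => {
            const dy = a.rect.top - b.rect.top;
            return Math.abs(dy) < lineTolerance ? a.rect.left - b.rect.left : dy;
        });
}

const text = toReadingOrder(sample).map((item) => item.text).join(" ");
console.log(text); // hello world
```

The tolerance-based comparison is a simplification. For dense or skewed layouts, grouping items into lines first and sorting within each line would be more robust.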
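Besides the ocr function, there are also det and rec functions. det can be run on its own to detect text coordinates, and rec can be run on its own to recognize text content. For the exact definitions, see the type hints in your editor.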
paddleocr models run on onnx