👑 GLiNER.js: Generalist and Lightweight Named Entity Recognition for JavaScript
GLiNER.js is a TypeScript-based inference engine for running GLiNER (Generalist and Lightweight Named Entity Recognition) models. GLiNER can identify any entity type using a bidirectional transformer encoder, offering a practical alternative to traditional NER models and large language models.
📄 Paper • 📢 Discord • 🤗 Demo • 🤗 Available models • 🧬 Official Repo
🌟 Key Features
- Flexible entity recognition without predefined categories
- Lightweight and fast inference
- Easy integration with web applications
- TypeScript support for better developer experience
🚀 Getting Started
Installation
npm install gliner
Basic Usage
// Assumes the package exposes a named `Gliner` export.
import { Gliner } from "gliner";

const gliner = new Gliner({
  tokenizerPath: "onnx-community/gliner_small-v2",
  onnxSettings: {
    modelPath: "public/model.onnx",
    executionProvider: "webgpu",
    wasmPaths: "path/to/wasm",
    multiThread: true,
    maxThreads: 4,
    fetchBinary: true,
  },
  transformersSettings: {
    allowLocalModels: true,
    useBrowserCache: true,
  },
  maxWidth: 12,
  modelType: "gliner",
});

// Load the tokenizer and the ONNX model before running inference.
await gliner.initialize();

const texts = ["Your input text here"];
const entities = ["city", "country", "person"];
const options = {
  flatNer: false,
  threshold: 0.1,
  multiLabel: false,
};

const results = await gliner.inference({
  texts,
  entities,
  ...options,
});
console.log(results);
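The threshold and flatNer options can be easier to reason about with a small standalone sketch. The code below does not call the library. It assumes that threshold drops low-scoring spans and that flat NER means overlapping spans are resolved in favor of the higher score, which is how flat decoding is commonly described; the library's actual decoding may differ. The `Span` interface and `selectFlatSpans` function are hypothetical names, with fields mirroring the Response Format section.

```typescript
// Hypothetical shape mirroring the fields shown in the Response Format section.
interface Span {
  spanText: string;
  start: number;
  end: number;
  label: string;
  score: number;
}

// Keep spans scoring at least `threshold`; when two candidates overlap,
// keep the higher-scoring one (an assumed reading of "flat" NER).
function selectFlatSpans(spans: Span[], threshold: number): Span[] {
  const candidates = spans
    .filter((s) => s.score >= threshold)
    .sort((a, b) => b.score - a.score);

  const kept: Span[] = [];
  for (const candidate of candidates) {
    const overlaps = kept.some(
      (k) => candidate.start < k.end && k.start < candidate.end
    );
    if (!overlaps) kept.push(candidate);
  }
  // Return the survivors in reading order.
  return kept.sort((a, b) => a.start - b.start);
}

const demoSpans: Span[] = [
  { spanText: "New York", start: 10, end: 18, label: "city", score: 0.95 },
  { spanText: "York", start: 14, end: 18, label: "person", score: 0.4 },
  { spanText: "USA", start: 20, end: 23, label: "country", score: 0.05 },
];

console.log(selectFlatSpans(demoSpans, 0.1));
```

With these made-up scores, only the "New York" span survives: "York" overlaps a higher-scoring span, and "USA" falls below the threshold.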
Response Format
The inference results will be returned in the following format:
[
  {
    spanText: "New York",
    start: 10,
    end: 18,
    label: "city",
    score: 0.95,
  },
];
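As a rough illustration of consuming this format, the sketch below groups entity texts by label. The `EntityResult` interface and `groupByLabel` helper are hypothetical names introduced here, not part of the library, and the sample data is the entry shown above.

```typescript
// Hypothetical type for one entry of the response shown above.
interface EntityResult {
  spanText: string;
  start: number;
  end: number;
  label: string;
  score: number;
}

// Collect the matched texts under each label, preserving input order.
function groupByLabel(entities: EntityResult[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const entity of entities) {
    const texts = groups.get(entity.label) ?? [];
    texts.push(entity.spanText);
    groups.set(entity.label, texts);
  }
  return groups;
}

const sample: EntityResult[] = [
  { spanText: "New York", start: 10, end: 18, label: "city", score: 0.95 },
];

const grouped = groupByLabel(sample);
console.log(grouped.get("city"));
```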
🛠 Setup & Model Preparation
To use GLiNER models in a web environment, you need a model in ONNX format. You can convert one yourself as described below.
Converting to ONNX Format
Use the convert_to_onnx.py script with the following arguments:
- model_path: Location of the GLiNER model
- save_path: Where to save the ONNX file
- quantize: Set to True for UInt8 quantization (optional)
Example:
python convert_to_onnx.py --model_path /path/to/your/model --save_path /path/to/save/onnx --quantize True
🌟 Use Cases
GLiNER.js offers versatile entity recognition capabilities across various domains:
- Enhanced Search Query Understanding
- Real-time PII Detection
- Intelligent Document Parsing
- Content Summarization and Insight Extraction
- Automated Content Tagging and Categorization
...
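To make the PII detection use case more concrete, here is a minimal sketch of redacting detected spans from a text. It assumes start and end are character offsets with an exclusive end, which matches the "New York" example in the Response Format section (10 to 18 covers 8 characters), and that the spans do not overlap. The `DetectedEntity` interface and `redact` function are hypothetical helpers, not part of the library.

```typescript
// Hypothetical subset of the response fields needed for redaction.
interface DetectedEntity {
  start: number;
  end: number;
  label: string;
}

// Replace each detected span with a bracketed, upper-cased label.
// Assumes non-overlapping spans and an exclusive `end` offset.
function redact(text: string, entities: DetectedEntity[]): string {
  // Work from the end of the text so earlier offsets stay valid.
  const ordered = [...entities].sort((a, b) => b.start - a.start);
  let output = text;
  for (const entity of ordered) {
    const placeholder = `[${entity.label.toUpperCase()}]`;
    output = output.slice(0, entity.start) + placeholder + output.slice(entity.end);
  }
  return output;
}

const redacted = redact("I live in New York.", [
  { start: 10, end: 18, label: "city" },
]);
console.log(redacted); // prints "I live in [CITY]."
```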
🔧 Areas for Improvement
Creating a PR
- For any changes, remember to run pnpm changeset; otherwise there will be no version bump and the PR GitHub Action will fail.
🙏 Acknowledgements
📞 Support
For questions and support, please join our Discord community or open an issue on GitHub.