Short Text Mining
BENT: Biomedical Entity Annotator
Onnx Text Recognition (OnnxTR): docTR Onnx-Wrapper for high-performance OCR on documents.
Aspose.PSD for Python via .NET is a standalone API to read, write, process, and convert Adobe Photoshop PSD and PSB files without installing Adobe Photoshop®, and AI files without Adobe Illustrator®.
A Python wrapper for the Doc2X API that comes with native text processing (to improve text recall in RAG).
Generalist model for Relation Extraction (Extract any relation types from texts)
Wrappers for including pre-trained transformers in spaCy pipelines
Convert images to character art with support for multiple character sets and formats
Python bindings for MeTA
Open-source tool for exploring, labeling, and monitoring data for NLP projects.
A text-to-intent parsing framework.
An augmentation library based on spaCy for joint augmentation of text and labels.
Simple Text-Processing and -Analytics Command Line Tool made in Python.
A Python module implementing the Rapid Automatic Keyword Extraction algorithm.
Core libraries for natural language processing
Breame is a lightweight Python package with a number of tools to aid in the detection of words that have dual spellings and meanings in British and American English.
Python port of open source text processing library for Turkish, zemberek-nlp
Text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction.
Basic tools for working with natural language text data
Toolkits for text processing and augmentation for Bangla NLP
An extensible tool to process legal citations in text
Powerful and Pythonic PDF processing library based on xpdf-4.02
Library designed as a Python wrapper that combines Rust's text-processing power with Python
A neural network intent parser
Process text for NLP
Narrative analysis add-on for the Orange 3 data mining software package.
Tools for organizing collections of text for entity-centric stream processing.
Tools, wrappers, etc. for data science, with a concentration on text processing
Tiny preprocessor for Russian text
Python ctypes bindings for reliq
A minimalist collection of text processing tools for Python 3
HuSpaCy: industrial strength Hungarian natural language processing
Artificial Intelligence: A Modern Approach, 4th ed., by Peter Norvig and Stuart Russell
A library that provides an ergonomic model for XML encoded text documents (e.g. with TEI-XML).
Interface with various cloud APIs for language processing such as translation, text to speech
Linguistic Pattern Lab using spaCy
An AI-powered tool to clean manga panels.
A Python package for text preprocessing tasks in natural language processing
A package for Kazakh-language text processing.
A FastAPI-based web server for working with LLMs, embedding models, and Pinecone Vector DB.
A package for extracting keywords from large text very quickly (much faster than regex and the original flashtext package)
Data science utilities for preprocessing data to feed various models, pipelining, and converting time data formats
Melusine is a high-level library for email processing
HanDic package for installing via pip.