Phrase Tree from Natural Language Toolkit
An augmentation library based on SpaCy for joint augmentation of text and labels.
A text-to-intent parsing framework.
Core libraries for natural language processing
A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Built with Python and FastAPI, it enables efficient, scalable, and user-friendly local deployment of MLX-based multimodal models with an OpenAI-compatible interface. Supports text, vision, and audio processing capabilities. Perfect for developers looking to run MLX models locally while maintaining compatibility with existing OpenAI-based applications.
A Python package for determining a piece of text's point of view (first, second, third, or unknown).
Effortless LLM extraction from documents
An extensible tool to process legal citations in text
Open-source tool for exploring, labeling, and monitoring data for NLP projects.
The goal of the Indic NLP Library is to build Python-based libraries for common text processing and Natural Language Processing in Indian languages. This fork is specialized for IndicTrans2.
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as field extraction, tokenization, prompting, batching, and more. Supports datasets from Hugging Face, torchdata iterables, or simple lists of dictionaries.
A neural network intent parser
Breame is a lightweight Python package with a number of tools to aid in the detection of words that have dual spellings and meanings in British and American English.
Text2Text Language Modeling Toolkit
Aspose.PSD for Python via .NET is a standalone API to read, write, process, and convert Adobe Photoshop PSD and PSB files without installing Adobe Photoshop®, and AI files without Adobe Illustrator®
A Python wrapper for the Doc2X API that comes with native text processing (to improve text recall in RAG).
A powerful text-to-EPUB conversion tool
A Python library for interacting with the Pollinations.ai API with support for text, images, and audio processing.
BENT: Biomedical Entity Annotator
Python bindings for MeTA
Advanced AI agent framework with composable assets (personas, instructions, workflows, skills), Claude Code-style CLI, multi-agent orchestration, 36+ tools, and enterprise security
Extract and convert PDF, Word, PowerPoint, Excel, images, and URLs into multiple formats (Markdown, JSON, CSV, HTML) with intelligent content extraction and advanced OCR.
A library for augmenting text for natural language processing applications.
Preprocessing and Extraction of Linguistic Information for Computational Analysis
Text processing with pandas DataFrames.
A library designed as a Python wrapper that combines Rust's text-processing power with Python
A Python module implementing the Rapid Automatic Keyword Extraction algorithm.
HuSpaCy: industrial strength Hungarian natural language processing
Text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction.
Interface with various cloud APIs for language processing, such as translation and text-to-speech
An AI-powered tool to clean manga panels.
A collection of Python utility functions for file operations, text processing, and basic image/video manipulation. Provides helper classes and convenience functions for common programming tasks.
A minimalist collection of text processing tools for Python 3
Snips Natural Language Understanding library
TextTools is a high-level NLP toolkit built on top of modern LLMs.
Simple text-processing and text-analytics command-line tool made in Python.
fenic is a Python DataFrame library for processing text data with APIs inspired by PySpark.
A package for processing complex text with mixed Chinese and English characters
Toolkits for text processing and augmentation for Bangla NLP
Powerful and Pythonic PDF processing library based on xpdf-4.02
A package for extracting keywords from large text very quickly (much faster than regex and the original flashtext package)