Natural Language Toolkit
Python package and command-line tool designed to gather text on the Web. It includes all necessary discovery and text processing components to perform web crawling, downloads, scraping, and extraction of main texts, metadata and comments.
Open-source tool for exploring, labeling, and monitoring data for NLP projects.
Textile processing for Python.
Module for automatic summarization of text documents and HTML pages.
Thai Natural Language Processing library
Microsoft Azure Text Analytics Client Library for Python
An accurate natural language detection library, suitable for short text and mixed-language text
Extract quantities from unstructured text.
Natural language processing augmentation library for deep neural networks
NeMo text processing for ASR and TTS
Functions to preprocess and normalize text.
Pyap is an MIT-licensed text processing library, written in Python, for detecting and parsing addresses. Currently it supports US, Canadian and British addresses.
Python package for Korean natural language processing.
Python library for processing Chinese text
Wrappers for several pre-processing scripts from the Moses toolkit.
NLP, before and after spaCy
A base class for wrapping text-processing tools
Identification and conversion functions for Chinese text processing
Real-time processing and delivery of sentences from a continuous stream of characters or text chunks.
Generalist model for NER (Extract any entity types from texts)
A text summarization and keyword extraction package based on TextRank
Natural Language Processing (NLP) library for the Urdu language.
A command to manage a header section for a source code tree
The goal of the Indic NLP Library is to build Python-based libraries for common text processing and Natural Language Processing in Indian languages.
Pre-processing package for text strings
Processing web text data for NLP and LLMs
Nonsense String Evaluator
SMASHED is a toolkit designed to apply transformations to samples in datasets, such as field extraction, tokenization, prompting, batching, and more. Supports datasets from Huggingface, torchdata iterables, or simple lists of dictionaries.
STAM is a library for dealing with standoff annotations on text
A CLI tool for interacting with the Vectara platform, including advanced text processing and indexing features.
An extensible tool to process legal citations in text
Short Text Mining
A library for augmenting text for natural language processing applications.
Phrase Tree from Natural Language Toolkit
Text processing with pandas DataFrames.
Python bindings for MeTA
Unsupervised Korean Natural Language Processing Toolkits
Text processing library for the Russian language
Text2Text: Crosslingual NLP/G toolkit
A Python module implementing the Rapid Automatic Keyword Extraction algorithm.
An augmentation library based on spaCy for joint augmentation of text and labels.
Snips Natural Language Understanding library