Natural Language Toolkit
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML.
An accurate natural language detection library, suitable for short text and mixed-language text
Extensive Language Pack for Tree-Sitter
Thai Natural Language Processing library
Microsoft Azure Text Analytics Client Library for Python
Functions to preprocess and normalize text.
Textile processing for Python.
Python package for Korean natural language processing.
Natural language processing augmentation library for deep neural networks
Pyap is an MIT-licensed text processing library, written in Python, for detecting and parsing addresses. It currently supports US, Canadian and British addresses.
Extract quantities from unstructured text.
Generalist model for NER (Extract any entity types from texts)
Module for automatic summarization of text documents and HTML pages.
NeMo text processing for ASR and TTS
Python library for processing Chinese text
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense chunking library
A text summarization and keyword extraction package based on TextRank
Nonsense String Evaluator
NLP, before and after spaCy
Python ctypes bindings for reliq
The goal of the Indic NLP Library is to build Python-based libraries for common text processing and Natural Language Processing in Indian languages.
uroman is a universal romanizer. It converts text in any script to the standard Latin alphabet.
A fast Voice Activity Detection and Transcription System
A base class for wrapping text-processing tools
Convert HTML to markdown
Natural Language Processing (NLP) library for Urdu language.
A Python library for a _FULL_ Zalgo experience
Wrappers for several pre-processing scripts from the Moses toolkit.
A command to manage a header section for a source code tree
Identification and conversion functions for Chinese text processing
Real-time processing and delivery of sentences from a continuous stream of characters or text chunks.
Core libraries for natural language processing
A library for extracting abbreviations from text.
Analiticcl is an approximate string matching or fuzzy-matching system that can be used to find variants for spelling correction or text normalisation
Pre-processing package for text strings
Blazing-fast Thai text processing library powered by Rust
Onnx Text Recognition (OnnxTR): docTR Onnx-Wrapper for high-performance OCR on documents.
STAM is a library for dealing with standoff annotations on text; this is the Python binding.
A text-to-intent parsing framework.
Complete library for normalizing Spanish given names and surnames, with intelligent redistribution and pattern detection
Phrase Tree from Natural Language Toolkit
Text2Text Language Modeling Toolkit
Unsupervised Korean Natural Language Processing Toolkits