Snips Natural Language Understanding library
A terminal-based text annotation tool
Basic tools for working with natural language text data
An augmentation library based on SpaCy for joint augmentation of text and labels.
Text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction.
HuSpaCy: industrial strength Hungarian natural language processing
A utility for normalizing Persian, Arabic and English texts
Interface with various cloud APIs for language processing, such as translation and text-to-speech
Open-source tool for exploring, labeling, and monitoring data for NLP projects.
A library for calculating a variety of features from text using spaCy
Toolkits for text processing and augmentation for Bangla NLP
A minimalist collection of text processing tools for Python 3
Text2Text: Crosslingual NLP/G toolkit
Library designed as a Python wrapper that combines Rust's text processing power with Python
Tools, wrappers, etc. for data science with a concentration on text processing
GATE NLP implementation in Python.
A Python package for text preprocessing tasks in natural language processing
Adapt Transformer-based language models to new text domains
Chinese text analysis library, which can perform word frequency statistics, dictionary expansion, sentiment analysis, similarity, readability, co-occurrence analysis, social calculation (attitude, prejudice, culture) on texts
An AI-powered tool to clean manga panels.
Powerful and Pythonic PDF processing library based on xpdf-4.02
A library that provides an ergonomic model for XML encoded text documents (e.g. with TEI-XML).
Puristaa (Finnish for compress) - shared prefix compression of ordered string sequences.
Python interface for the eunjeon project and MeCab-based morphological analyzer.
Parses unstructured recipe ingredient text into standardized quantities, units, and foods
Processing web text data for NLP and LLMs
Python port of open source text processing library for Turkish, zemberek-nlp
Pre-processing text in parallel for Keras in Python.
User-friendly library to find similar objects
MoverScore: Evaluating text generation with contextualized embeddings and earth mover distance
Voice-Activated Natural Language UI
A CLI tool for interacting with the Vectara platform, including advanced text processing and indexing features.
Tools for organizing collections of text for entity-centric stream processing.
A ridiculously simple search engine factory
A high-resolution image-to-PCB converter. Gerbolyze plots SVG, PNG and JPG onto existing Gerber files. It handles almost the full SVG spec and deals with text, path outlines, patterns, arbitrary paths with self-intersections and holes, etc. fully automatically. It can vectorize raster images both by contour tracing and by grayscale dithering. All processing is done at the vector level without intermediate conversions to raster images, accurately preserving the input.
Wrappers for including pre-trained transformers in spaCy pipelines
Aspose.PSD for Python via .NET is a standalone API to read, write, process, convert Adobe Photoshop PSD, PSB formats without needing to install Adobe Photoshop® and AI files without Adobe Illustrator®
A wrapper for the wordcloud module for creating word clouds in Persian (and other RTL languages).
Python client for Cognica database
Bloatectomy: a method for the identification and removal of duplicate text in the bloated notes of electronic health records and other documents.
Data extraction and rendering library for Shakespearean text.
Test package for distribution
Python bindings for MeTA
Data science utilities for preprocessing data to feed various models, pipelining, and converting time data formats