Clean text for NLP projects
A FastAPI-based web server for working with LLMs, embedding models, and Pinecone Vector DB.
Python client for Aeca database
koshort is a Python package for Korean internet spoken language crawling and processing... or maybe Korean domestic cat.
A tool to convert single or mass PDFs to datasets for language analysis, including a toolbox of text and NLP pre-processing options
awkg is an awk-like text-processing tool powered by Python
Framework to process 3 channels in one: Video, Audio & Text
Can be used to preprocess data before AI processing
A helper class for preprocessing a text corpus before applying topic modeling algorithms
A utility library for processing Vietnamese texts
Utility functions for text processing.
A REST API for running Large Language Models
An Arabic text processing library intended for use in NLP applications.
A downloader for textual corpora, for use in digital humanities, corpus linguistics, and natural language processing.
Generate code-switched texts from monolingual texts
utoken is a universal tokenizer (multilingual word segmenter) that divides text into words, punctuation and special tokens such as numbers, URLs, XML tags, email-addresses and hashtags. It comes with a companion detokenizer.
Meta package to install the PDX Python User Group utilities.
A small tool to parse and process annotated text corpora
Pretraining transformer based Thai language models
Breame is a lightweight Python package with a number of tools to aid in the detection of words that have dual spellings and meanings in British and American English.
An auto mapper that accepts a list of strings and a list of objects of the format {'code', 'name'} and returns a list of objects in which each 'code' is mapped to the most similar strings from the list of strings
Detect emotions in text.
Easy NLP library for Python
GPT2 text generation with just two lines of code!
Sketch Grammar Explorer (Sketch Engine API wrapper)
Easy longform text generation for creative writing.
Toolkit for basic Natural Language Processing tasks.
This module, part of the `abstract_essentials` package, provides a collection of utility functions for working with images and PDFs, including loading and saving images, extracting text from images, capturing screenshots, processing PDFs, and more.
Fast text processing acceleration.
High-quality multilingual speech-to-text
Jarvis UI to perform voice commands via API calls
NLP library for processing French text
Fast text processing
easytoken is an independent open-source Natural Language Processing Python library that creates tokens from both sentences and paragraphs.
Python library for SEO-friendly HTML text processing and keyword linking
Pashto Natural Language Processing Toolkit
textcleaner: text-data pre-processing library
A package for Kazakh-language text processing.
Novoic linguistics feature extraction package.
A GPT-J API for use with Python 3 to generate text, blogs, code, and more (Note: starting with version 3.0.7 the API uses the old domain again, so there might be some issues with limits)
Yet another Python implementation of TextRank: a package for creating, manipulating, and studying TextRank-based keyword extraction and summarisation
Tiny preprocessor for Russian text
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful voice assistants.
A helper tool that processes text copied from PDF, removing newlines, replacing punctuation and more.
A Python tool for splitting large Markdown files into smaller sections based on a specified token limit. This is particularly useful for processing large Markdown files with GPT models, as it allows the models to handle the data in manageable chunks.
Thai Text Generator library