Get list of common stop words in various languages in Python
English language files for gruut tokenizer/phonemizer
Extract quantities from unstructured text.
Linked Open Data Modeling Language
German language files for gruut tokenizer/phonemizer
A utility library that provides a MongoDB-like query language for querying Python collections. It's mainly intended to parse objects structured as fundamental types in a similar fashion to what is produced by JSON or YAML parsers.
Spanish language files for gruut tokenizer/phonemizer
French language files for gruut tokenizer/phonemizer
A PDF language detection and OCR tool
Microsoft Azure Cognitive Services Text Analytics Client Library for Python
RobotCode Language Server for Robot Framework
CLI tool converting Language Transfer lessons into Anki flashcards, automating content extraction for efficient language learning.
Flashcard app with support for downloading inflection tables
Official Implementation of "COLLIE: Systematic Construction of Constrained Text Generation Tasks"
Open Language Model (OLMo)
Simple text to phones converter for multiple languages
Functions to preprocess and normalize text.
Morphological analyzer (POS tagger + inflection engine) for Russian language.
Script Languages Container Tool
A domain-specific language for modeling convex optimization problems in Python.
Python client library for Cleanlab Trustworthy Language Model
A fast Python implementation of full ROUGE metrics for automatic summarization.
Event Query Language
Core training module for the Open Language Model (OLMo)
OCR, layout, reading order, and table recognition in 90+ languages
Shell interface for docopt, the command-line interface description language.
Python implementation of TextRank as a spaCy pipeline extension, for graph-based natural language work plus related knowledge graph practices; used for phrase extraction of text documents.
Multi-lingual Automatic Speech Recognition (ASR) based on Whisper models, with accurate word timestamps, access to language detection confidence, several options for Voice Activity Detection (VAD), and more.
Module for automatic summarization of text documents and HTML pages.
CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine.
CodeJail manages execution of untrusted code in secure sandboxes. It is designed primarily for Python execution, but can be used for other languages as well.
LLM-Guard is a comprehensive tool designed to fortify the security of Large Language Models (LLMs). By offering sanitization, detection of harmful language, prevention of data leakage, and resistance against prompt injection attacks, LLM-Guard ensures that your interactions with LLMs remain safe and secure.
Parse natural language time expressions in Python
Cross-language UserAgent classifier library, Python implementation
Binaries for clangd, a clang-based C++ language server (LSP)
Python library for ISO 639 standard
Finite-state grammar compilation
Standalone Python library for generating ROS message and service data structures for various languages.
A Language Server for Sphinx projects.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Evolutionary Scale Modeling (esm): Pretrained language models for proteins. From Facebook AI Research.
QUA language SDK to control a Quantum Computer
A lightweight, flexible, and expandable JSON query language
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense chunking library
Python implementation of the Riva Client API
A framework for evaluating language models - packaged by NVIDIA
The electronic structure package for quantum computers.