Speech-to-text command using the IBM Watson API
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
The official Python SDK for the Deepgram automated speech recognition platform.
Run local open-source AI models (Stable Diffusion, LLMs, TTS, STT, chatbots) in a lightweight Python GUI
A fast Voice Activity Detection and Transcription System
Desktop AI Assistant powered by models: OpenAI o1, GPT-4o, GPT-4, GPT-4 Vision, GPT-3.5, DALL-E 3, Llama 3, Mistral, Gemini, Claude, DeepSeek, Bielik, and other models supported by Langchain, Llama Index, and Ollama. Features include chatbot, text completion, image generation, vision analysis, speech-to-text, internet access, file handling, command execution and more.
An Optimized Speech-to-Text Pipeline for the Whisper Model.
ElevenLabs MCP Server
Client for communicating with the Phonexia Enhanced Speech To Text Built On Whisper microservice.
A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).
Agent Framework plugin for services using Gladia's API.
Real-time microphone transcription with Deepgram using Python.
A high-performance Python package for calculating Word Error Rate (WER), powered by Rust.
Real-time, fully local speech-to-text and speaker diarization with Whisper
Transcribe audio and generate SRT and VTT files using Whisper models and wit.ai.
Vaani is an open-source, AI-powered speech-to-text desktop app. Vaani (वाणी) refers to "speech" or "voice" in Sanskrit.
Breton language speech-to-text tools
A real-time translation tool using Whisper & Opus-MT
Push-to-talk transcription using faster-whisper
A toolkit for whisper.cpp with audio processing and model management
Real-time speech-to-text
Leopard Speech-to-Text Engine.
Aiola Speech-To-Text Python SDK
Google EMEA gTech Ads Data Science Team's solution to automatically translate and dub video ads into multiple languages using AI.
This code is for speech to text created by me
GUI for Whisper transcription & MarianMT translation
Cheetah Speech-to-Text Engine.
Audio Manipulation Detection Client
My MCP Server
Transcription Normalization Client
AllVoiceLab MCP Server
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools.
A library for sending Sinhala audio files to a Flask API and decoding the received text
Python SDK for Aurora
Transcription tool for audio files based on Whisper and Pyannote
A Python speech-to-text library
Fast GPT-3 client for Windows and Unix that supports both text and speech in any language.
The official Python SDK for the Deepgram automated speech recognition platform.
scribe is a local speech recognition tool that provides real-time transcription using Vosk and Whisper, with the goal of serving as a virtual keyboard on a computer.
at16k is a Python library to perform automatic speech recognition or speech to text conversion.
Tool for detecting and extracting text from intertitles in Swedish newsreels.
A Python client library for the Aristech Speech-to-Text API
ASRecognition: just an easy-to-use library for Automatic Speech Recognition.
SONATA: SOund and Narrative Advanced Transcription Assistant
ElevenLabs MCP Server
A real-time speech-to-text clipboard tool.
Tatt creates a uniform API for multiple speech-to-text (STT) services.