Speech to Text command using IBM Watson API
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
The official Python SDK for the Deepgram automated speech recognition platform.
A fast Voice Activity Detection and Transcription System
Desktop AI Assistant powered by models: OpenAI o1, GPT-4o, GPT-4, GPT-4 Vision, GPT-3.5, DALL-E 3, Llama 3, Mistral, Gemini, Claude, DeepSeek, Bielik, and other models supported by Langchain, Llama Index, and Ollama. Features include chatbot, text completion, image generation, vision analysis, speech-to-text, internet access, file handling, command execution and more.
A high-performance Python package for calculating Word Error Rate (WER), powered by Rust.
Run local open-source AI models (Stable Diffusion, LLMs, TTS, STT, chatbots) in a lightweight Python GUI
My MCP Server
Client for communication with Phonexia Enhanced Speech To Text Built On Whisper microservice.
Vaani is an open-source, AI-powered speech-to-text desktop app. Vaani (वाणी) refers to "speech" or "voice" in Sanskrit.
An Optimized Speech-to-Text Pipeline for the Whisper Model.
A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).
Aiola Speech-To-Text Python SDK
ElevenLabs MCP Server
Agent Framework plugin for services using Gladia's API.
Voicegain Speech-to-Text Python SDK
A Python speech-to-text library
Transcribe audio and generate SRT and VTT files using Whisper models and wit.ai technology.
Real-time, fully local speech-to-text and speaker diarization using Whisper
Speech-to-text code written by the package author
Real-time speech-to-text
Leopard Speech-to-Text Engine.
Tatt creates a uniform API for multiple speech-to-text (STT) services.
Real-time microphone transcription with Deepgram using Python.
Google EMEA gTech Ads Data Science Team's solution to automatically translate and dub video ads into multiple languages using AI.
Cheetah Speech-to-Text Engine.
ElevenLabs MCP Server
scribe is a local speech recognition tool that provides real-time transcription using vosk and whisper AI, with the goal of serving as a virtual keyboard on a computer
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools.
The official Python SDK for the Deepgram automated speech recognition platform.
Breton language speech-to-text tools
A library for sending Sinhala audio files to a Flask API and decoding the received text
SONATA: SOund and Narrative Advanced Transcription Assistant
Transcription tool for audio files based on Whisper and Pyannote
Leopard speech-to-text engine demos
A toolkit for whisper.cpp with audio processing and model management
A package for text-to-speech and speech-to-text tools
Fast GPT-3 client for Windows and Unix that supports both text and speech in any language.
Cheetah speech-to-text engine demos
tpro processes transcripts from speech-to-text services and outputs to various formats.
Push-to-talk transcription using faster-whisper
openai/whisper speech to text model + extra features
AllVoiceLab MCP Server
Python SDK for Aurora
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful voice assistants.