Speech to Text command using IBM Watson API
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
The official Python SDK for the Deepgram automated speech recognition platform.
Desktop AI Assistant powered by models: OpenAI o1, GPT-4o, GPT-4, GPT-4 Vision, GPT-3.5, DALL-E 3, Llama 3, Mistral, Gemini, Claude, Bielik, and other models supported by Langchain, Llama Index, and Ollama. Features include chatbot, text completion, image generation, vision analysis, speech-to-text, internet access, file handling, command execution and more.
A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).
A Python speech-to-text library
Client for communication with Phonexia Enhanced Speech To Text Built On Whisper microservice.
Voicegain Speech-to-Text Python SDK
Uses Whisper AI to transcribe speech from video and audio files. Also accepts URLs for YouTube, Rumble, BitChute, clear file, etc.
Google EMEA gTech Ads Data Science Team's solution to automatically translate and dub video ads into multiple languages using AI.
Breton language speech-to-text tools
Transcribe audio to text and create SRT and VTT files using Whisper models and wit.ai technology.
Leopard Speech-to-Text Engine.
The official Python SDK for the Deepgram automated speech recognition platform.
Tatt creates a uniform API for multiple speech-to-text (STT) services.
Fast GPT-3 client for Windows and Unix that supports both text and speech in any language.
An Optimized Speech-to-Text Pipeline for the Whisper Model.
Leopard speech-to-text engine demos
Cheetah Speech-to-Text Engine.
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools.
A Speech-to-Text toolkit with VAD, punctuation, and emotion classification
Cheetah speech-to-text engine demos
A Python client library for the Aristech Speech-to-Text API
A package for text-to-speech and speech-to-text tools
Transcription tool for audio files based on Whisper and Pyannote
A library for sending Sinhala audio files to a Flask API and decoding the received text
tpro processes transcripts from speech-to-text services and outputs to various formats.
Python SDK for Aurora
Framework for synchronous batch speech-to-text transcription using backends like AWS, Watson, etc.
openai/whisper speech to text model + extra features
at16k is a Python library to perform automatic speech recognition or speech to text conversion.
A simple speech-to-text application using Wit.ai
Framework for synchronous batch speech-to-text transcription using backends like AWS, Watson, etc.
ASRecognition: just an easy-to-use library for Automatic Speech Recognition.
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful voice assistants.
FrogBase simplifies the download-transcribe-embed-index workflow for multimedia content. It does so by linking content from various platforms with speech-to-text models, image and text encoders, and embedding stores.
Jarvis - Voice Personal Assistant
Using Gladia's Whisper API for transcribing YouTube videos
High-quality multilingual speech-to-text
A Python SDK for video processing, providing functionalities like speech-to-text, summarization, transcription, and chaptering.
Convert images or audio files to plain text on the command line
A fast parallel implementation of continuous integrate-and-fire (CIF) https://arxiv.org/abs/1905.11235
S.T.A.R.K. Platform Library And Community Extensions
A package for extracting text from audio and video files
Unified Speech-to-text Client
A web interface for the ScAIbe speech-to-text transcription tool