Speech to Text command using IBM Watson API
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
The official Python SDK for the Deepgram automated speech recognition platform.
Client for communication with Phonexia Enhanced Speech To Text Built On Whisper microservice.
A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).
Uses Whisper AI to transcribe speech from video and audio files. Also accepts URLs for YouTube, Rumble, BitChute, clear file, etc.
Transcribe audio and generate SRT and VTT files using Whisper models and wit.ai.
Voicegain Speech-to-Text Python SDK
Tatt creates a uniform API for multiple speech-to-text (STT) services.
Leopard Speech-to-Text Engine.
Google EMEA gTech Ads Data Science Team's solution to automatically translate and dub video ads into multiple languages using AI.
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools.
A library for sending Sinhala audio files to a Flask API and decoding the received text
Breton language speech-to-text tools
Leopard speech-to-text engine demos
Cheetah Speech-to-Text Engine.
An Optimized Speech-to-Text Pipeline for the Whisper Model.
Fast GPT-3 client for Windows and Unix that supports both text and speech in any language.
Cheetah speech-to-text engine demos
openai/whisper speech to text model + extra features
tpro processes transcripts from speech-to-text services and outputs to various formats.
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful voice assistants.
Python SDK for Aurora
framework for synchronous batch speech-to-text transcription using backends like AWS, Watson, etc.
Using Gladia's Whisper API for transcribing YouTube videos
at16k is a Python library to perform automatic speech recognition or speech to text conversion.
S.T.A.R.K. Platform Library And Community Extensions
A Python SDK for video processing, providing functionalities like speech-to-text, summarization, transcription, and chaptering.
Convert images or audio files to plain text on the command line
high quality multi-lingual speech to text
Client for communication with Phonexia Speech To Text Whisper Enhanced microservice.
Transcription tool for audio files based on Whisper and Pyannote
Jarvis - Voice Personal Assistant
Unified Speech-to-text Client
ASRecognition: just an easy-to-use library for Automatic Speech Recognition.
FrogBase simplifies the download-transcribe-embed-index workflow for multi-media content. It does so by linking content from various platforms with speech-to-text models, image & text encoders and embedding stores.
A web interface for the ScAIbe speech-to-text transcription tool
BALanced Execution through Natural Activation : a human-computer interaction methodology for code running.
Integrate OpenAI speech-to-text Whisper with your keyboard
This package extracts text from audio and video files.
A fast parallel implementation of continuous integrate-and-fire (CIF) https://arxiv.org/abs/1905.11235
WhiLa (Whisper-to-LaTeX) connects tools to convert spoken mathematics into LaTeX code. It includes a Speech-To-Text (STT) layer using OpenAI's Whisper model and a Math-To-LaTeX (MTL) layer to render mathematics in LaTeX. The MTL layer is a large language model (LLM) that converts spoken math into legible LaTeX code. WhiLa aims to bridge the gap between writing math and the digital world, particularly for education and for those unable to use conventional math writing techniques.
Named after a spell in the Harry Potter universe that amplifies the sound of a speaker. In Muggle terms, this is a repository of modules for audio and speech processing, for and on top of machine-learning-based tasks such as speech-to-text.
Inspect, modify, and add metadata to DeepSpeech (speech-to-text) datasets in CSV format.