Speech to Text command using IBM Watson API
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
The official Python SDK for the Deepgram automated speech recognition platform.
Uses Whisper AI to transcribe speech from video and audio files. Also accepts URLs for YouTube, Rumble, BitChute, clear file, etc.
Voicegain Speech-to-Text Python SDK
Client for communication with Phonexia Speech To Text Whisper Enhanced microservice.
A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools.
Transcribe text and create SRT and VTT files using Whisper models and wit.ai technology.
Leopard Speech-to-Text Engine.
Breton language speech-to-text tools
An Optimized Speech-to-Text Pipeline for the Whisper Model.
Tatt creates a uniform API for multiple speech-to-text (STT) services.
Cheetah Speech-to-Text Engine.
Python SDK for Aurora
Fast GPT-3 client for Windows and Unix that supports both text and speech in any language.
High-quality multilingual speech-to-text
at16k is a Python library to perform automatic speech recognition or speech to text conversion.
Leopard speech-to-text engine demos
ASRecognition: just an easy-to-use library for Automatic Speech Recognition.
A library for sending Sinhala audio files to a Flask API and decoding the received text
tpro processes transcripts from speech-to-text services and outputs to various formats.
Cheetah speech-to-text engine demos
openai/whisper speech to text model + extra features
Using Gladia's Whisper API for transcribing YouTube videos
This package extracts text from audio and video files.
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful voice assistants.
A fast parallel implementation of continuous integrate-and-fire (CIF) https://arxiv.org/abs/1905.11235
Convert images or audio files to plain text on the command line
A Python SDK for video processing, providing functionalities like speech-to-text, summarization, transcription, and chaptering.
FrogBase simplifies the download-transcribe-embed-index workflow for multi-media content. It does so by linking content from various platforms with speech-to-text models, image & text encoders and embedding stores.
Unified Speech-to-text Client
BALanced Execution through Natural Activation: a human-computer interaction methodology for running code.
Jarvis - Voice Personal Assistant
S.T.A.R.K. Platform Library And Community Extensions
Integrate OpenAI speech-to-text Whisper with your keyboard
framework for synchronous batch speech-to-text transcription using backends like AWS, Watson, etc.
ArmSpeech is an offline Armenian speech recognition library (speech-to-text) and CLI tool based on Coqui STT (🐸STT) and trained on the ArmSpeech dataset.
Transcription tool for audio files based on Whisper and Pyannote
Inspect, modify, and add metadata to DeepSpeech (speech-to-text) datasets in CSV format.
A simple extension to allow the creation of markdown cells via speech-to-text.
Transcribe voice recordings using the Google Cloud Speech-To-Text API, and export the results to Emacs org-mode headings.
Named after a spell in the Harry Potter universe that amplifies the sound of a speaker. In Muggle terminology, this is a repository of modules for audio and speech processing, for and on top of machine-learning-based tasks such as speech-to-text.