Whisper Speech-to-Text
Whisper Speech-to-Text is a JavaScript library that records audio from a user's microphone and transcribes it into
text using OpenAI's Whisper automatic speech recognition (ASR) system. It is designed for use in web
applications.
Features
- Real-time transcription of speech to text using OpenAI's Whisper ASR system.
- Easy-to-use API for starting, pausing, resuming, and stopping recordings.
- Automatic handling of microphone permissions and audio recording.
Installation
npm i whisper-speech-to-text
Usage
import { WhisperSTT } from "whisper-speech-to-text";
const whisper = new WhisperSTT("your-openai-api-key");
await whisper.startRecording();
await whisper.pauseRecording();
await whisper.resumeRecording();
await whisper.stopRecording((text) => {
  console.log("Transcription:", text);
});
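If you would rather receive the transcription as a value than inside a callback, the call to stopRecording can be wrapped in a Promise. The following is a minimal sketch, not part of the library: stopAndGetText is a hypothetical helper name, and the sketch assumes only that stopRecording accepts an onFinish callback as shown above and may or may not return a promise.

```javascript
// Hypothetical helper, not part of whisper-speech-to-text.
// `recorder` is any object exposing stopRecording(onFinish),
// for example a WhisperSTT instance.
function stopAndGetText(recorder) {
  return new Promise((resolve, reject) => {
    try {
      // Resolve with the transcription once the onFinish callback fires.
      const result = recorder.stopRecording((text) => resolve(text));
      // Assumption: stopRecording may return a promise (the usage example
      // awaits it). If it does and that promise rejects, forward the error.
      Promise.resolve(result).catch(reject);
    } catch (err) {
      // Also cover a synchronous throw from stopRecording.
      reject(err);
    }
  });
}
```

With this helper, the last step of the usage example could read `const text = await stopAndGetText(whisper);`.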
API
The WhisperSTT class has the following methods:
- startRecording(): Starts recording audio from the user's microphone.
- pauseRecording(): Pauses the current recording.
- resumeRecording(): Resumes a paused recording.
- stopRecording(onFinish: (text: string) => void): Stops the current recording and transcribes the audio into text. The transcription is passed to the onFinish callback.
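The four methods suggest a natural call order: start, then any number of pause and resume pairs, then stop. The library's behavior for out-of-order calls is not documented here, so an application may want to track the recording state itself, for example to enable or disable buttons. The sketch below is one possible way to do that; the state names, TRANSITIONS, nextState, and allowedActions are all hypothetical and not part of the library.

```javascript
// Hypothetical state tracking for a recorder UI; not part of the library.
// Assumed call order: start -> (pause <-> resume)* -> stop.
const TRANSITIONS = {
  idle: { start: "recording" },
  recording: { pause: "paused", stop: "idle" },
  paused: { resume: "recording", stop: "idle" },
};

// Return the state reached by applying `action` in `state`,
// or throw if the action is not allowed in that state.
function nextState(state, action) {
  const row = TRANSITIONS[state];
  if (!row || !Object.prototype.hasOwnProperty.call(row, action)) {
    throw new Error(`Cannot ${action} while ${state}`);
  }
  return row[action];
}

// List the actions that are valid in `state`, e.g. to enable buttons.
function allowedActions(state) {
  return Object.keys(TRANSITIONS[state] || {});
}
```

An application could call nextState before invoking the matching WhisperSTT method and use allowedActions to decide which controls to show.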
Contributing
Contributions to this project are welcome! If you would like to contribute, please follow these steps:
- Fork the repository on GitHub.
- Clone your fork to your local machine.
- Create a new branch for your changes.
- Make your changes and commit them to your branch.
- Push your changes to your fork on GitHub.
- Open a pull request from your branch to the main repository.
Disclaimer
This library is not officially associated with OpenAI. Please use it responsibly and ensure that your use case complies
with OpenAI's usage policies.
Support
If you encounter any problems or have any questions, please open an issue on the GitHub repository.