whisper-onnx-speech-to-text
Transcribe speech to text on Node.js using OpenAI's Whisper models converted to the cross-platform ONNX format.
Installation
- Add the dependency to your project
npm install whisper-onnx-speech-to-text
- Download the Whisper model of your choice
npx whisper-onnx-speech-to-text download
Usage
import { initWhisper } from 'whisper-onnx-speech-to-text';
const whisper = await initWhisper("base.en");
const transcript = await whisper.transcribe("example/sample.wav");
Result (JSON)
[
{
text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
chunks: [
{ timestamp: [0, 8.18], text: " And so my fellow Americans ask not what your country can do for you" },
{ timestamp: [8.18, 11.06], text: " ask what you can do for your country." }
]
}
]
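The chunk timestamps in the result above can be turned into a time-coded listing. The following is a minimal sketch, assuming the result always has the shape shown above (an array of objects with a text field and a chunks array whose timestamp entries are [start, end] in seconds). formatChunks is a hypothetical helper written for illustration; it is not part of this package.

```javascript
// Hypothetical helper (not part of whisper-onnx-speech-to-text).
// Assumes the result shape shown above: an array of { text, chunks },
// where each chunk carries timestamp: [startSeconds, endSeconds].
function formatChunks(result) {
  const lines = [];
  for (const entry of result) {
    for (const chunk of entry.chunks ?? []) {
      const [start, end] = chunk.timestamp;
      // Two decimal places for the times; trim the leading space Whisper emits.
      lines.push(`[${start.toFixed(2)}s - ${end.toFixed(2)}s] ${chunk.text.trim()}`);
    }
  }
  return lines;
}

// Sample data copied from the result shown above.
const sample = [
  {
    text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
    chunks: [
      { timestamp: [0, 8.18], text: " And so my fellow Americans ask not what your country can do for you" },
      { timestamp: [8.18, 11.06], text: " ask what you can do for your country." }
    ]
  }
];

console.log(formatChunks(sample).join("\n"));
```

In real use, the value returned by whisper.transcribe(...) would be passed in place of sample.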
API
initWhisper
The initWhisper() function takes the name of the model and returns an instance of the Whisper class initialized with the chosen model.
Whisper
The Whisper class has the following methods:
1. transcribe(filePath: string, language?: string): transcribes speech from a WAV file.
Parameters:
- filePath: path to the WAV file
- language: optional target language for recognition, given as the full English name of the language, e.g. 'spanish'
2. disposeModel(): disposes of the initialized model.
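The two methods above can be combined so that the model is always released after a transcription. This is a minimal sketch, not the package's own code: transcribeAndDispose is a hypothetical wrapper, and it assumes disposeModel() can safely be awaited whether it is synchronous or asynchronous. A stand-in object is used so the sketch runs without a downloaded model.

```javascript
// Hypothetical wrapper (not part of the library): transcribe one file and
// always release the model afterwards, even if transcription throws.
// `whisper` is any object exposing transcribe(filePath, language?) and
// disposeModel(), such as the instance returned by initWhisper().
async function transcribeAndDispose(whisper, filePath, language) {
  try {
    return await whisper.transcribe(filePath, language);
  } finally {
    // Assumption: disposeModel() may be sync or async; await covers both.
    await whisper.disposeModel();
  }
}

// Stand-in object used only to exercise the wrapper without a real model.
const fakeWhisper = {
  disposed: false,
  async transcribe(filePath, language) {
    return [{ text: `${filePath} (${language ?? "default"})`, chunks: [] }];
  },
  disposeModel() {
    this.disposed = true;
  },
};

transcribeAndDispose(fakeWhisper, "example/sample.wav", "spanish")
  .then((result) => console.log(result[0].text, fakeWhisper.disposed));
```

With the real package, the first argument would be the instance obtained from initWhisper(...), and the language argument can be omitted since it is optional.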
Made with