nodejs-whisper
Node.js bindings for OpenAI's Whisper model.
MIT License
Features
- Automatically converts input audio to WAV format at a 16000 Hz sample rate, as required by the Whisper model
- Outputs transcripts as .txt, .srt, or .vtt files
- Optimized for CPU (including Apple Silicon ARM)
- Timestamp precision down to a single word
- Split on word rather than on token (optional)
- Translate from the source language to English (optional)
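To illustrate the audio conversion feature, here is a minimal sketch of the kind of ffmpeg invocation that produces a 16000 Hz WAV file. The helper name `buildWavConversionArgs`, the mono downmix, and the 16-bit PCM codec are assumptions made for illustration; the package handles conversion internally and may do it differently.

```typescript
// Hypothetical helper (not part of nodejs-whisper): builds ffmpeg arguments
// for converting any input file to a 16000 Hz, mono, 16-bit PCM WAV file.
function buildWavConversionArgs(inputPath: string, outputPath: string): string[] {
    return [
        '-y', // overwrite the output file if it already exists
        '-i', inputPath, // input audio in any format ffmpeg can decode
        '-ar', '16000', // resample to 16000 Hz
        '-ac', '1', // downmix to one channel (assumption: mono input is expected)
        '-c:a', 'pcm_s16le', // encode as 16-bit little-endian PCM
        outputPath,
    ]
}

const args = buildWavConversionArgs('talk.mp3', 'talk.wav')
console.log(['ffmpeg', ...args].join(' '))
// prints: ffmpeg -y -i talk.mp3 -ar 16000 -ac 1 -c:a pcm_s16le talk.wav
```

You normally do not need to run this step yourself, since the conversion is automatic.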
Installation
- Install make tools (the commands below are for Debian/Ubuntu)
sudo apt update
sudo apt install build-essential
- Install nodejs-whisper with npm
npm i nodejs-whisper
- Download whisper model
npx nodejs-whisper download
- NOTE: You may need to install the make tools (first step above) before downloading a model
Usage/Examples
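import path from 'path'
import { nodewhisper } from 'nodejs-whisper'

// Provide the exact path to your audio file.
// Note: __dirname is only defined in CommonJS modules; in native ES modules,
// derive the directory from import.meta.url instead.
const filePath = path.resolve(__dirname, 'YourAudioFileName')

await nodewhisper(filePath, {
    modelName: 'base.en', // name of a downloaded model
    autoDownloadModelName: 'base.en', // (optional) model to download if it is not present
    whisperOptions: {
        outputInText: false, // write the result to a .txt file
        outputInVtt: false, // write the result to a .vtt file
        outputInSrt: true, // write the result to a .srt file
        outputInCsv: false, // write the result to a .csv file
        translateToEnglish: false, // translate from the source language to English
        wordTimestamps: false, // word-level timestamps
        timestamps_length: 20, // amount of dialogue per timestamp pair
        splitOnWord: true, // split on word rather than on token
    },
})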

// Model list
const MODELS_LIST = [
    'tiny',
    'tiny.en',
    'base',
    'base.en',
    'small',
    'small.en',
    'medium',
    'medium.en',
    'large-v1',
    'large',
]
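Since `modelName` is typed as a plain string, a typo in the model name is only caught at run time. The sketch below shows one way to check a name against the model list before calling `nodewhisper`. The names `KNOWN_MODELS`, `ModelName`, `isModelName`, and `assertModelName` are hypothetical and are not exported by the package.

```typescript
// Hypothetical validation helpers; the list mirrors the model names above.
const KNOWN_MODELS = [
    'tiny',
    'tiny.en',
    'base',
    'base.en',
    'small',
    'small.en',
    'medium',
    'medium.en',
    'large-v1',
    'large',
] as const

type ModelName = (typeof KNOWN_MODELS)[number]

// Type guard: narrows an arbitrary string to one of the known model names.
function isModelName(name: string): name is ModelName {
    return (KNOWN_MODELS as readonly string[]).indexOf(name) !== -1
}

// Throws a descriptive error for unknown names, otherwise returns the name.
function assertModelName(name: string): ModelName {
    if (!isModelName(name)) {
        throw new Error(`Unknown model "${name}". Expected one of: ${KNOWN_MODELS.join(', ')}`)
    }
    return name
}

console.log(isModelName('base.en')) // prints: true
console.log(isModelName('huge')) // prints: false
```

The result of `assertModelName('base.en')` can then be passed as `modelName`.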
Types
interface IOptions {
    modelName: string
    autoDownloadModelName?: string
    whisperOptions?: WhisperOptions
}

interface WhisperOptions {
    outputInText?: boolean
    outputInVtt?: boolean
    outputInSrt?: boolean
    outputInCsv?: boolean
    translateToEnglish?: boolean
    timestamps_length?: number
    wordTimestamps?: boolean
    splitOnWord?: boolean
}
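Because every output flag in `WhisperOptions` is optional, it can be useful to check which transcript files a given configuration asks for. The following is a minimal sketch: `OutputFlags` is a local copy of the four output fields, `enabledOutputExtensions` is a hypothetical helper, and the `.csv` extension for `outputInCsv` is an assumption based on the option name.

```typescript
// Local copy of the output-related fields of WhisperOptions,
// so this sketch runs without the package installed.
interface OutputFlags {
    outputInText?: boolean
    outputInVtt?: boolean
    outputInSrt?: boolean
    outputInCsv?: boolean
}

// Hypothetical helper: lists the file extensions the given flags request.
// Flags that are omitted are treated the same as false.
function enabledOutputExtensions(options: OutputFlags): string[] {
    const extensions: string[] = []
    if (options.outputInText) extensions.push('.txt')
    if (options.outputInVtt) extensions.push('.vtt')
    if (options.outputInSrt) extensions.push('.srt')
    if (options.outputInCsv) extensions.push('.csv') // assumed extension
    return extensions
}

console.log(enabledOutputExtensions({ outputInSrt: true, outputInText: false }))
// prints: [ '.srt' ]
```

An empty result means no output format was requested, which is worth checking before starting a long transcription.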
Run Locally
Clone the project
git clone https://github.com/ChetanXpro/nodejs-whisper
Go to the project directory
cd nodejs-whisper
Install dependencies
npm install
Start the server
npm run dev
Build the project
npm run build
Feedback
If you have any feedback, please reach out to us at chetanbaliyan10@gmail.com