
@lumen-labs-dev/whisper-node
Local audio transcription on CPU. Node.js bindings for OpenAI's Whisper, with voice activity detection (VAD) and speaker diarization.
npm install @lumen-labs-dev/whisper-node
npx whisper-node
Alternatively, the same downloader can be invoked as:
npx whisper-node download
On Windows, whisper-node downloads precompiled Whisper binaries during install (or first use) and runs them directly — no local build tools are required.
setx WHISPER_WIN_FLAVOR cpu
# or: blas | cublas-11.8 | cublas-12.4
Ensure the Microsoft Visual C++ 2015–2022 Redistributable (x64) is installed. If you see error code 0xC0000135 when starting the binary, install the redistributable and retry.
Optional: point to a custom Windows binary subfolder inside lib/whisper.cpp:
setx WHISPER_WIN_BIN_DIR Win64
# examples: Win64 | BlasWin64 | CublasWin64-11.8 | CublasWin64-12.4
Non-Windows platforms still build from source when needed.
If the package was installed without bundling lib/whisper.cpp, the downloader will automatically set up the upstream whisper.cpp assets inside node_modules/@lumen-labs-dev/whisper-node/lib/whisper.cpp. On Windows, this uses precompiled release archives; on non-Windows it may clone and build from source.
import { whisper } from '@lumen-labs-dev/whisper-node';
const transcript = await whisper("example/sample.wav");
console.log(transcript); // output: [ {start,end,speech} ]
[
  {
    "start": "00:00:14.310", // timestamp begin
    "end": "00:00:16.480",   // timestamp end
    "speech": "howdy"        // transcription
  }
]
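The shape above is parsed from Whisper's console output. A minimal sketch of that parsing step (parseTranscriptLines is a hypothetical helper for illustration, not the package's internal parser):

```javascript
// Parse Whisper console lines of the form
// "[hh:mm:ss.mmm --> hh:mm:ss.mmm] text" into { start, end, speech }.
const LINE_RE = /^\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$/;

function parseTranscriptLines(stdout) {
  const lines = [];
  for (const raw of stdout.split("\n")) {
    const m = raw.trim().match(LINE_RE);
    if (m) lines.push({ start: m[1], end: m[2], speech: m[3].trim() });
  }
  return lines;
}

// parseTranscriptLines("[00:00:14.310 --> 00:00:16.480]   howdy")
// → [{ start: "00:00:14.310", end: "00:00:16.480", speech: "howdy" }]
```

Lines that do not match the timestamp pattern (progress output, warnings) are simply skipped.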
import { whisper } from '@lumen-labs-dev/whisper-node';

const filePath = "example/sample.wav"; // required

const options = {
  modelName: "base.en", // default
  // modelPath: "/custom/path/to/model.bin", // use a model in a custom directory (cannot be used together with 'modelName')
  whisperOptions: {
    language: 'auto',          // default ('auto' = auto-detect)
    gen_file_txt: false,       // outputs .txt file
    gen_file_subtitle: false,  // outputs .srt file
    gen_file_vtt: false,       // outputs .vtt file
    // Enable per-word timestamps only if you really need them.
    // For typical sentence/segment output, leave this off.
    // When per-word output is detected, whisper-node automatically merges words into sentences.
    word_timestamps: false,
    no_timestamps: false,      // when true, Whisper prints only text (no [..] lines)
    // timestamp_size: 0       // cannot be used together with word_timestamps: true
  },
  // Forwarded to shelljs.exec (defaults shown)
  shellOptions: {
    silent: true,
    async: false,
  },
};

const transcript = await whisper(filePath, options);
whisper(filePath: string, options?: { modelName?, modelPath?, whisperOptions?, shellOptions? }) => Promise<ITranscriptLine[]>

Pass either modelName (one of the official model names) or modelPath pointing to a .bin file; do not pass both. The result is an array of { start, end, speech } objects parsed from Whisper's console output.

Notes:
- no_timestamps: true changes Whisper's console output format. Since the JSON parser expects [start --> end] text lines, no_timestamps: true will typically yield an empty array. Prefer timestamp_size (segment-level) or word_timestamps (word-level) when you need structured JSON.
- With word_timestamps, whisper-node auto-merges single-word lines into sentence-level segments using pause and punctuation heuristics. You can still access the raw lines before the merge by calling the underlying CLI yourself.
- You can generate .txt/.srt/.vtt files via the gen_file_* flags even if you don't use the JSON array.
- whisper-node automatically converts common audio/video inputs (e.g., mp3, m4a, wav, mp4) into 16 kHz mono WAV when needed, using fluent-ffmpeg and the bundled ffmpeg-static/ffprobe-static binaries. The converted file is written next to your input as <name>.wav16k.wav and used for transcription.
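The word-to-sentence auto-merge can be approximated with a small heuristic. The sketch below is illustrative only: the 0.8-second pause threshold, the end-punctuation rule, and the mergeWords helper are assumptions, not whisper-node's actual algorithm.

```javascript
// Convert an "hh:mm:ss.mmm" timestamp to seconds.
function toSeconds(ts) {
  const [h, m, s] = ts.split(":");
  return Number(h) * 3600 + Number(m) * 60 + Number(s);
}

// Merge word-level { start, end, speech } lines into sentence-level
// segments: start a new segment after a long pause or end punctuation.
function mergeWords(words, pauseSec = 0.8) {
  const segments = [];
  let current = null;
  for (const w of words) {
    const startNew =
      !current ||
      toSeconds(w.start) - toSeconds(current.end) > pauseSec ||
      /[.!?]$/.test(current.speech);
    if (startNew) {
      current = { ...w };
      segments.push(current);
    } else {
      current.speech += " " + w.speech;
      current.end = w.end;
    }
  }
  return segments;
}
```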
If your input is already a 16kHz mono WAV, it is used as-is without conversion.
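The "already 16 kHz mono" check can be sketched by reading the RIFF/fmt header of the WAV file. This is a hypothetical helper assuming the standard 44-byte canonical WAV layout; the package itself may probe files with ffprobe instead.

```javascript
// Decide whether a WAV buffer is already 16 kHz mono by inspecting the
// canonical RIFF header: channel count at byte 22, sample rate at byte 24.
function isWav16kMono(buf) {
  if (buf.length < 36) return false;
  if (buf.toString("ascii", 0, 4) !== "RIFF") return false;
  if (buf.toString("ascii", 8, 12) !== "WAVE") return false;
  const channels = buf.readUInt16LE(22);
  const sampleRate = buf.readUInt32LE(24);
  return channels === 1 && sampleRate === 16000;
}
```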
You can enrich the transcript with speaker labels without Python using a lightweight, naive diarization:
Usage:
import whisper, { DiarizationOptions } from '@lumen-labs-dev/whisper-node';

const transcript = await whisper('audio.mp3', {
  diarization: {
    enabled: true,
    numSpeakers: 2, // or omit to auto-guess a small K
  },
});

// Each transcript line may include speaker: 'S0', 'S1', ...
Notes:
- The speaker label is added as an optional field on ITranscriptLine.
- Files must be .wav and 16 kHz.
Example .mp3 file converted with an FFmpeg command: ffmpeg -i input.mp3 -ar 16000 output.wav
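To give a flavor of the 'S0', 'S1' label shape, here is a toy pause-based labeling sketch. This is purely illustrative: the package's naive diarization works on audio features, not just timestamp gaps, and labelSpeakers is not part of its API.

```javascript
// Convert an "hh:mm:ss.mmm" timestamp to seconds.
function toSec(ts) {
  const [h, m, s] = ts.split(":");
  return Number(h) * 3600 + Number(m) * 60 + Number(s);
}

// Assign a new speaker label whenever the pause between consecutive
// segments exceeds pauseSec, cycling through numSpeakers labels.
function labelSpeakers(lines, numSpeakers = 2, pauseSec = 1.0) {
  let speaker = 0;
  let prevEnd = null;
  return lines.map((line) => {
    if (prevEnd !== null && toSec(line.start) - prevEnd > pauseSec) {
      speaker = (speaker + 1) % numSpeakers; // speaker change on long pause
    }
    prevEnd = toSec(line.end);
    return { ...line, speaker: `S${speaker}` };
  });
}
```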
Run the interactive downloader (downloads into node_modules/@lumen-labs-dev/whisper-node/lib/whisper.cpp/models; non-Windows will build on first use if needed):
npx @lumen-labs-dev/whisper-node
You will be prompted to choose one of:
| Model | Disk | RAM |
|---|---|---|
| tiny | 75 MB | ~273 MB |
| tiny.en | 75 MB | ~273 MB |
| base | 142 MB | ~388 MB |
| base.en | 142 MB | ~388 MB |
| small | 466 MB | ~852 MB |
| small.en | 466 MB | ~852 MB |
| medium | 1.5 GB | ~2.1 GB |
| medium.en | 1.5 GB | ~2.1 GB |
| large-v1 | 2.9 GB | ~3.9 GB |
| large | 2.9 GB | ~3.9 GB |
If you already have a model elsewhere, pass modelPath in the API and skip the downloader.
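If you are unsure which model to download, the table above can drive a simple choice: pick the largest model whose approximate RAM footprint fits your budget. The helper and the hard-coded figures below mirror the table for illustration; pickModel is not part of whisper-node's API.

```javascript
// Approximate RAM needs (MB) from the model table above.
const MODEL_RAM_MB = {
  "tiny": 273, "tiny.en": 273,
  "base": 388, "base.en": 388,
  "small": 852, "small.en": 852,
  "medium": 2100, "medium.en": 2100,
  "large-v1": 3900, "large": 3900,
};

// Return the largest model fitting ramBudgetMb, or null if none fits.
function pickModel(ramBudgetMb, englishOnly = false) {
  const candidates = Object.entries(MODEL_RAM_MB)
    .filter(([name]) => (englishOnly ? name.endsWith(".en") : !name.endsWith(".en")))
    .filter(([, ram]) => ram <= ramBudgetMb)
    .sort((a, b) => a[1] - b[1]);
  return candidates.length ? candidates[candidates.length - 1][0] : null;
}
```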
You can configure defaults without passing options in code by creating one of the following files in your project root:
- whisper-node.config.json
- whisper.config.json

Or set an explicit path via the environment variable WHISPER_NODE_CONFIG=/abs/path/to/config.json.
Example config:
{
  "modelName": "base.en",
  "whisperOptions": {
    "language": "auto",
    "word_timestamps": true
  },
  "shellOptions": {
    "silent": true
  }
}

To use a model in a custom directory, set "modelPath" (e.g., "/custom/models/ggml-base.en.bin") instead of "modelName"; the two cannot be combined.
Notes:
- Options passed to the whisper() function always override values from the config file.
- The downloader uses modelName from the config to skip the interactive prompt when it is valid.

Control verbosity via an environment variable (defaults to INFO):
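The precedence between call-site options and config-file values can be sketched as a shallow merge where explicit options win. resolveOptions is a hypothetical helper showing the idea, not the package's actual loader:

```javascript
// Merge defaults < config file < call-site options, with whisperOptions
// merged one level deep so individual flags can be overridden.
function resolveOptions(defaults, fileConfig, callOptions) {
  return {
    ...defaults,
    ...fileConfig,
    ...callOptions,
    whisperOptions: {
      ...(defaults.whisperOptions || {}),
      ...(fileConfig.whisperOptions || {}),
      ...(callOptions.whisperOptions || {}),
    },
  };
}
```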
# ERROR | WARN | INFO | DEBUG
setx WHISPER_NODE_LOG_LEVEL DEBUG
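The level names imply a threshold filter: a message is emitted only when its level is at or below the configured one. A minimal sketch of that rule (shouldLog is illustrative, not the package's logger):

```javascript
// Levels in increasing verbosity; DEBUG includes everything.
const LEVELS = ["ERROR", "WARN", "INFO", "DEBUG"];

// Emit a message only if its level is at or below the configured threshold.
function shouldLog(messageLevel, configured = process.env.WHISPER_NODE_LOG_LEVEL || "INFO") {
  const msg = LEVELS.indexOf(messageLevel);
  const max = LEVELS.indexOf(configured);
  return msg >= 0 && max >= 0 && msg <= max;
}
```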
Build requirements (for non-Windows source builds):
- Windows: install make (see link above) or use MSYS2/Chocolatey alternatives.
- macOS: xcode-select --install.
- Linux: sudo apt-get install build-essential (Debian/Ubuntu) or the equivalent for your distro.

Troubleshooting:
- If you already have a model file, point to it with modelPath.
- If the transcript array comes back empty, avoid no_timestamps: true. The JSON parser expects timestamped lines like [00:00:01.000 --> 00:00:02.000] text.

Project structure:

src/
cli/ # CLI entrypoints (e.g., download)
config/ # constants and configuration
core/ # domain logic (whisper command builder)
infra/ # process/shell integration with whisper.cpp
utils/ # helper utilities (e.g., transcript parsing)
scripts/ # development/test scripts
Type definitions ship as a .d.ts file.

npm run build - runs tsc, outputs to /dist, and makes dist/cli/download.js executable
npm run test - runs the compiled example in dist/scripts/test.js