@ericedouard/vad-node-realtime

Powerful, user-friendly realtime voice activity detector (VAD) for node

Version: 0.2.0 (latest, npm)
Maintainers: 1

Forked from https://github.com/ricky0123/vad, which supports the web and React. The original project used to support Node, but not in real time. This fork targets Node and adds real-time processing.

See the project home for more details.

Features

  • Real-time and non-real-time voice activity detection
  • Built on the Silero VAD model
  • Easy-to-use API
  • Works completely offline
  • Efficient processing for server environments

Installation

npm install @ericedouard/vad-node-realtime

Usage

Real-time VAD

Use RealTimeVAD when you need to process audio chunks in real time, such as receiving audio from a client application:

const { RealTimeVAD } = require('@ericedouard/vad-node-realtime');

async function example() {
  // Create a new RealTimeVAD instance
  const vad = await RealTimeVAD.new({
    onSpeechStart: () => {
      console.log('Speech started');
    },
    onSpeechEnd: (audio) => {
      console.log('Speech ended, received audio of length:', audio.length);
      // Process the audio data here
    },
    // Optional: customize VAD parameters
    positiveSpeechThreshold: 0.6,
    negativeSpeechThreshold: 0.4,
    minSpeechFrames: 4,
  });

  // Start processing
  vad.start();

  // When you receive audio chunks from your source:
  // Declared async because it awaits processAudio
  async function onAudioChunkReceived(audioChunk) {
    // Process each chunk of audio data.
    // audioChunk should be a Float32Array whose sample rate matches the sampleRate option (default: 16000 Hz)
    await vad.processAudio(audioChunk);
  }

  // When you're done with the stream:
  await vad.flush(); // Process any remaining audio
  vad.destroy(); // Clean up resources
}

example();
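
Audio arriving from a client is often raw 16-bit PCM rather than a Float32Array. The sketch below shows one way to convert such a chunk before passing it to processAudio. The helper name pcm16ToFloat32 is hypothetical and not part of this package, and it assumes signed little-endian 16-bit mono input and that samples should be scaled to the range [-1, 1).

```javascript
// Hypothetical helper (not part of the package): convert signed 16-bit
// little-endian PCM bytes into Float32Array samples scaled to [-1, 1).
function pcm16ToFloat32(buffer) {
  const sampleCount = Math.floor(buffer.length / 2);
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    // readInt16LE returns an integer in [-32768, 32767]
    samples[i] = buffer.readInt16LE(i * 2) / 32768;
  }
  return samples;
}

// Example input: three samples (zero, largest positive, largest negative)
const pcmBytes = Buffer.alloc(6);
pcmBytes.writeInt16LE(0, 0);
pcmBytes.writeInt16LE(32767, 2);
pcmBytes.writeInt16LE(-32768, 4);

const floatSamples = pcm16ToFloat32(pcmBytes);
console.log(floatSamples.length); // 3
```

The resulting Float32Array could then be handed to a chunk handler such as onAudioChunkReceived above.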

Non-real-time VAD

For processing entire audio files or pre-recorded chunks:

const { NonRealTimeVAD } = require('@ericedouard/vad-node-realtime');

async function example(audioData, sampleRate) {
  const vad = await NonRealTimeVAD.new();
  
  // audioData is a Float32Array of audio samples
  // sampleRate is the sample rate of the audio
  for await (const { audio, start, end } of vad.run(audioData, sampleRate)) {
    console.log(`Speech detected from ${start}ms to ${end}ms`);
    // Process detected speech segment
  }
}
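
Because start and end are reported in milliseconds, it can be handy to map a detected segment back to sample positions in the original buffer. The following is a minimal sketch; segmentToSampleRange is a hypothetical helper, not part of the package, and the 250 ms to 1000 ms segment is made-up illustrative data.

```javascript
// Hypothetical helper (not part of the package): map a segment's
// start/end times in milliseconds to sample indices.
function segmentToSampleRange(startMs, endMs, sampleRate) {
  return {
    // Multiply before dividing to keep the arithmetic on integers where possible
    startSample: Math.floor((startMs * sampleRate) / 1000),
    endSample: Math.ceil((endMs * sampleRate) / 1000),
  };
}

// One second of silence at 16 kHz stands in for real audio here
const exampleAudio = new Float32Array(16000);
const sampleRange = segmentToSampleRange(250, 1000, 16000);

// subarray returns a view on the same memory, so no samples are copied
const segmentView = exampleAudio.subarray(sampleRange.startSample, sampleRange.endSample);
console.log(sampleRange.startSample, sampleRange.endSample, segmentView.length);
// prints: 4000 16000 12000
```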

API Reference

RealTimeVAD

  • RealTimeVAD.new(options): Create a new RealTimeVAD instance
  • start(): Start processing audio
  • pause(): Pause processing audio
  • processAudio(audioData): Process a chunk of audio data
  • flush(): Process any remaining audio and trigger final callbacks
  • reset(): Reset the VAD state
  • destroy(): Clean up resources

RealTimeVADOptions

  • sampleRate: Sample rate of the input audio in Hz (default: 16000; inputs with other sample rates are automatically resampled)
  • onSpeechStart: Callback when speech starts
  • onSpeechEnd: Callback when speech ends, with the audio data
  • onVADMisfire: Callback when speech was detected but was too short
  • onFrameProcessed: Callback after each frame is processed
  • positiveSpeechThreshold: Threshold for detecting speech (0-1)
  • negativeSpeechThreshold: Threshold for detecting silence (0-1)
  • minSpeechFrames: Minimum number of frames to consider as speech
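
To build intuition for how the two thresholds and minSpeechFrames might interact, here is a simplified sketch of two-threshold (hysteresis) segmentation over per-frame speech probabilities. This is an illustration under assumptions, not the package's actual implementation: segmentSpeech is a hypothetical function, the input probabilities are invented, and a segment ends on the first frame below the negative threshold, which may differ from how the library behaves.

```javascript
// Simplified illustration only; NOT the package's actual algorithm.
// Speech starts when a frame's probability reaches positiveSpeechThreshold
// and ends when a frame falls below negativeSpeechThreshold. Segments with
// fewer than minSpeechFrames positive frames are reported as misfires.
function segmentSpeech(probabilities, options = {}) {
  const {
    positiveSpeechThreshold = 0.6,
    negativeSpeechThreshold = 0.4,
    minSpeechFrames = 4,
  } = options;

  const events = [];
  let speaking = false;
  let speechFrames = 0;

  probabilities.forEach((p, frame) => {
    if (!speaking && p >= positiveSpeechThreshold) {
      speaking = true;
      speechFrames = 0;
      events.push({ type: 'speechStart', frame });
    }
    if (speaking && p >= positiveSpeechThreshold) {
      speechFrames++;
    }
    if (speaking && p < negativeSpeechThreshold) {
      speaking = false;
      const type = speechFrames >= minSpeechFrames ? 'speechEnd' : 'misfire';
      events.push({ type, frame });
    }
  });
  return events;
}

// Invented per-frame probabilities: one long segment, then one short blip
const demoEvents = segmentSpeech([0.1, 0.7, 0.8, 0.5, 0.9, 0.7, 0.2, 0.7, 0.1]);
console.log(demoEvents.map((e) => `${e.type}@${e.frame}`).join(' '));
// prints: speechStart@1 speechEnd@6 speechStart@7 misfire@8
```

Note that the 0.5 frame sits between the two thresholds, so it neither starts nor ends a segment; that gap between the thresholds is what keeps the detector from flickering on borderline frames.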

License

ISC

Keywords

speech-recognition

Package last updated on 17 Apr 2025
