audio-ml

A comprehensive JavaScript/TypeScript library for audio feature extraction, designed for machine learning applications and voice AI systems

Version: 1.0.2 (latest) · Source: npm · Maintainers: 1

Audio ML - Audio Analysis for Machine Learning

A JavaScript/TypeScript library for real-time audio feature extraction and processing. Works in both browsers and Node.js.

Installation

npm install audio-ml
# or
yarn add audio-ml
# or
pnpm add audio-ml

Demo

https://github.com/user-attachments/assets/aae5ff8c-120b-4c6c-a4d4-7348dacc3ca0

Analyzers

16 low-level audio analyzers, all sharing the same interface: analyzer.analyzeFrame(pcm: Float32Array).

import { FFTAnalyzer, MFCCAnalyzer } from 'audio-ml';

const fft = new FFTAnalyzer({ sampleRate: 44100, fftSize: 1024 });
const mfcc = new MFCCAnalyzer({ sampleRate: 44100 });

// pcmFrame: one frame of mono PCM samples (placeholder silence here; supply real audio)
const pcmFrame = new Float32Array(1024);

const spectrum = fft.analyzeFrame(pcmFrame);    // Float32Array
const features = mfcc.analyzeFrame(pcmFrame);   // number[]

Analyzer             | Output         | Description
---------------------|----------------|-----------------------------------------
FFT                  | Float32Array   | Magnitude spectrum
MFCC                 | number[]       | 13 mel-frequency cepstral coefficients
PLP                  | number[]       | Perceptual linear prediction
Mel Spectrogram      | number[]       | Mel-scaled power spectrum
Constant-Q Transform | Float32Array   | Log-spaced frequency analysis
Chroma Features      | number[]       | 12-tone pitch class distribution
Spectral Centroid    | number         | Frequency center of mass (Hz)
Spectral Rolloff     | number         | 85th-percentile frequency (Hz)
Spectral Bandwidth   | number         | Spectral spread around centroid (Hz)
Spectral Flatness    | number         | Noise-like vs tonal content (0–1)
Zero Crossing Rate   | number         | Rate of sign changes
RMSE                 | number         | Root mean square energy
Waveform Envelope    | Float32Array   | Amplitude envelope
Autocorrelation      | Float32Array   | Periodicity / pitch detection
LPC                  | number[]       | Linear predictive coding coefficients
Wavelet Transform    | Float32Array[] | Multi-level time-frequency decomposition
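
To make a few of the scalar descriptors in the list above concrete, here is a minimal sketch of their textbook definitions, computed directly on a frame or on a magnitude spectrum. The helper names rmsEnergy, zeroCrossingRate, and spectralCentroid are illustrative and are not part of audio-ml; the library's own analyzers may scale or normalize their outputs differently.

```typescript
// Illustrative helpers only; these are NOT audio-ml APIs.

/** Root mean square energy of one PCM frame. */
function rmsEnergy(pcm: Float32Array): number {
  if (pcm.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < pcm.length; i++) sumSquares += pcm[i] * pcm[i];
  return Math.sqrt(sumSquares / pcm.length);
}

/** Fraction of adjacent sample pairs whose signs differ (0 to 1). */
function zeroCrossingRate(pcm: Float32Array): number {
  if (pcm.length < 2) return 0;
  let crossings = 0;
  for (let i = 1; i < pcm.length; i++) {
    if ((pcm[i - 1] >= 0) !== (pcm[i] >= 0)) crossings++;
  }
  return crossings / (pcm.length - 1);
}

/**
 * Magnitude-weighted mean frequency in Hz.
 * binHz is the width of one FFT bin, i.e. sampleRate / fftSize.
 */
function spectralCentroid(magnitudes: Float32Array, binHz: number): number {
  let weighted = 0;
  let total = 0;
  for (let bin = 0; bin < magnitudes.length; bin++) {
    weighted += bin * binHz * magnitudes[bin];
    total += magnitudes[bin];
  }
  return total === 0 ? 0 : weighted / total;
}

// A square wave alternating between +0.5 and -0.5.
const squareWave = new Float32Array(8).map((_, i) => (i % 2 === 0 ? 0.5 : -0.5));
console.log(rmsEnergy(squareWave), zeroCrossingRate(squareWave)); // 0.5 1
```

Scalars like these are cheap enough to compute on every frame, which is why they are common building blocks for detectors.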

Applications

Higher-level tools built on top of the analyzers. Import from audio-ml/applications. All applications extend BaseApplication with an event-driven API: call processFrame() per audio frame, listen for events.

import { VAD, AudioDenoiser, VoicemailBeepDetector } from 'audio-ml/applications';

Voice Activity Detection (VAD)

Detects speech vs silence by combining RMSE, Zero Crossing Rate, Spectral Flatness, and Spectral Centroid with weighted scoring and temporal smoothing.

const vad = new VAD({ sampleRate: 44100 });

vad.on('speech-start', ({ confidence }) => console.log('Speaking', confidence));
vad.on('speech-end', ({ confidence }) => console.log('Silent', confidence));

// Per frame
const result = vad.processFrame(pcm); // { isSpeech, confidence, features }
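
The idea of weighted scoring with temporal smoothing can be sketched independently of the library. The weights, normalization constants, and thresholds below are invented for illustration and are not the values audio-ml uses; speechScore and createSmoothedGate are hypothetical names.

```typescript
// Hypothetical sketch of weighted scoring plus smoothing; NOT audio-ml's implementation.

interface FrameFeatures {
  rmse: number;       // linear amplitude, roughly 0..1
  zcr: number;        // zero crossing rate, 0..1
  flatness: number;   // spectral flatness, 0..1 (1 = noise-like)
  centroidHz: number; // spectral centroid in Hz
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

/** Combine per-frame features into a single 0..1 speech score (made-up weights). */
function speechScore(f: FrameFeatures): number {
  const energy = clamp01(f.rmse / 0.1);     // louder frames score higher
  const tonal = 1 - clamp01(f.flatness);    // speech is less noise-like than hiss
  const lowZcr = 1 - clamp01(f.zcr / 0.5);  // voiced speech crosses zero less often
  const inBand = f.centroidHz >= 300 && f.centroidHz <= 3500 ? 1 : 0;
  return 0.4 * energy + 0.3 * tonal + 0.15 * lowZcr + 0.15 * inBand;
}

/**
 * Exponential smoothing with two thresholds (hysteresis), so a single
 * noisy frame does not flip the speech/silence decision.
 */
function createSmoothedGate(alpha = 0.5, onThreshold = 0.6, offThreshold = 0.4) {
  let smoothed = 0;
  let speaking = false;
  return (score: number): boolean => {
    smoothed = alpha * score + (1 - alpha) * smoothed;
    if (!speaking && smoothed >= onThreshold) speaking = true;
    else if (speaking && smoothed < offThreshold) speaking = false;
    return speaking;
  };
}

const gate = createSmoothedGate();
const loudVoiced: FrameFeatures = { rmse: 0.2, zcr: 0.1, flatness: 0.1, centroidHz: 1000 };
console.log(gate(speechScore(loudVoiced))); // false: one frame is not enough to switch on
```

Using separate on and off thresholds is one common way to avoid rapid toggling near the decision boundary.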

Audio Denoiser

Removes background noise via spectral subtraction. Automatically estimates the noise profile from initial silence using RMSE and Spectral Flatness, then subtracts it in the frequency domain.

const denoiser = new AudioDenoiser({ sampleRate: 44100, fftSize: 2048 });

denoiser.on('noise-estimated', () => console.log('Noise profile ready'));
denoiser.on('denoised-frame', ({ audio, snr }) => { /* clean audio */ });

const { audio, snr, noiseReduction } = denoiser.processFrame(pcm);
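
As a rough illustration of spectral subtraction itself, the sketch below works on magnitude spectra only: it averages a few noise-only frames into a profile, then subtracts that profile from each incoming spectrum with a floor to avoid negative magnitudes. The function names, the over-subtraction factor, and the floor value are assumptions for this example; the FFT, phase handling, and resynthesis that a real denoiser needs are omitted, and AudioDenoiser's internals may differ.

```typescript
// Hypothetical sketch of magnitude-domain spectral subtraction; NOT audio-ml's code.

/** Average several noise-only magnitude spectra (all the same length) into one profile. */
function estimateNoiseProfile(noiseFrames: Float32Array[]): Float32Array {
  if (noiseFrames.length === 0) throw new Error('need at least one noise-only frame');
  const profile = new Float32Array(noiseFrames[0].length);
  for (const frame of noiseFrames) {
    for (let bin = 0; bin < profile.length; bin++) {
      profile[bin] += frame[bin] / noiseFrames.length;
    }
  }
  return profile;
}

/**
 * Subtract the noise profile from one magnitude spectrum.
 * overSubtraction > 1 removes noise more aggressively; floor keeps a small
 * fraction of the original magnitude so bins never go negative.
 */
function subtractNoise(
  magnitudes: Float32Array,
  noise: Float32Array,
  overSubtraction = 1,
  floor = 0.05,
): Float32Array {
  const cleaned = new Float32Array(magnitudes.length);
  for (let bin = 0; bin < magnitudes.length; bin++) {
    cleaned[bin] = Math.max(
      magnitudes[bin] - overSubtraction * noise[bin],
      floor * magnitudes[bin],
    );
  }
  return cleaned;
}

const noiseProfile = estimateNoiseProfile([
  new Float32Array([1, 2]),
  new Float32Array([3, 2]),
]);
console.log(subtractNoise(new Float32Array([5, 1]), noiseProfile));
```

The floor is a design choice: clamping to zero instead tends to produce isolated spectral peaks that are audible as "musical noise".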

Voicemail Beep Detector

Detects tonal beeps using FFT peak detection across configurable frequency ranges, with sustained-tone tracking and duration filtering.

const detector = new VoicemailBeepDetector({
  sampleRate: 44100,
  fftSize: 2048,
  frequencyRanges: [
    { min: 400, max: 500, name: 'Low beep' },
    { min: 900, max: 1100, name: 'Mid beep' },
  ]
});

detector.on('beep-detected', ({ frequency, duration, confidence }) => {
  console.log(`Beep at ${frequency} Hz`);
});

detector.processFrame(pcm);
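
The combination of peak detection within a frequency range and sustained-tone tracking can be sketched as follows. This is a simplified illustration operating on a precomputed magnitude spectrum; findPeakInRange, createToneTracker, and the thresholds are hypothetical and do not reflect VoicemailBeepDetector's actual internals.

```typescript
// Hypothetical sketch of range-limited peak picking plus duration filtering; NOT audio-ml's code.

interface SpectralPeak {
  frequencyHz: number;
  magnitude: number;
}

/** Strongest bin whose center frequency lies within [minHz, maxHz], or null if no bin does. */
function findPeakInRange(
  magnitudes: Float32Array,
  sampleRate: number,
  fftSize: number,
  minHz: number,
  maxHz: number,
): SpectralPeak | null {
  const binHz = sampleRate / fftSize;
  const first = Math.max(0, Math.ceil(minHz / binHz));
  const last = Math.min(magnitudes.length - 1, Math.floor(maxHz / binHz));
  let best = -1;
  for (let bin = first; bin <= last; bin++) {
    if (best < 0 || magnitudes[bin] > magnitudes[best]) best = bin;
  }
  return best < 0 ? null : { frequencyHz: best * binHz, magnitude: magnitudes[best] };
}

/**
 * Counts consecutive frames containing a strong enough peak and reports true
 * exactly once, on the frame where the run reaches minFrames.
 */
function createToneTracker(minFrames: number, minMagnitude: number) {
  let run = 0;
  return (peak: SpectralPeak | null): boolean => {
    run = peak !== null && peak.magnitude >= minMagnitude ? run + 1 : 0;
    return run === minFrames;
  };
}

// Toy spectrum: 8 kHz sample rate, fftSize 8, so bins sit at 0, 1000, 2000, 3000 Hz.
const toySpectrum = new Float32Array([0.1, 0.9, 0.2, 0.1]);
console.log(findPeakInRange(toySpectrum, 8000, 8, 900, 1100)?.frequencyHz); // 1000
```

Note that a narrow range combined with a small fftSize can contain no bin at all, which is one reason a larger fftSize helps when the target bands are narrow.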

Use Cases

  • Speech recognition: MFCC and PLP for acoustic modeling, spectral features for phone classification
  • Speaker identification: Voiceprint extraction via MFCC, LPC, and Spectral Centroid/Bandwidth
  • Voice activity detection: VAD application, or build your own with RMSE, ZCR, and Spectral Flatness
  • Noise reduction: AudioDenoiser application for real-time spectral subtraction
  • Telephony: VoicemailBeepDetector for detecting end-of-greeting tones in voicemail systems
  • Music analysis: Chroma Features for chord/key detection, Autocorrelation for tempo, CQT for pitch tracking
  • Genre / mood classification: Combine MFCC, Spectral Rolloff, Bandwidth, and Flatness as ML feature vectors
  • Onset detection: Waveform Envelope and Spectral Flatness for detecting note/event boundaries

Platform Support

  • Browser: Modern browsers with Web Audio API. Works with Vite, Webpack, Rollup, etc.
  • Node.js: 18.0.0+. Pair with audio decoding libraries (node-wav, audio-decode) for file processing.
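
For file processing, decoded audio usually arrives as one long sample buffer that has to be cut into frames before it can be passed to analyzeFrame() or processFrame(). The helper below is a minimal sketch of that step, assuming the decoder has already produced mono PCM in a Float32Array; splitIntoFrames is a hypothetical name, and the decoding itself (for example with node-wav or audio-decode) is not shown.

```typescript
// Hypothetical framing helper; NOT part of audio-ml.

/**
 * Cut a mono sample buffer into fixed-size, possibly overlapping frames.
 * Frames are views onto the original buffer (no copying); a trailing
 * partial frame is dropped.
 */
function splitIntoFrames(
  samples: Float32Array,
  frameSize: number,
  hopSize: number = frameSize,
): Float32Array[] {
  if (frameSize <= 0 || hopSize <= 0) {
    throw new Error('frameSize and hopSize must be positive');
  }
  const result: Float32Array[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += hopSize) {
    result.push(samples.subarray(start, start + frameSize));
  }
  return result;
}

// One second of placeholder silence at 44.1 kHz, framed with 50% overlap.
const decodedSamples = new Float32Array(44100);
for (const pcm of splitIntoFrames(decodedSamples, 1024, 512)) {
  // pass each frame to an analyzer or application here
  void pcm;
}
```

Because subarray returns views, any code that modifies a frame in place also modifies the source buffer; copy with slice instead if that matters.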

Development

# Run the interactive demo
cd demo && yarn install && yarn dev

The demo includes live visualizations of all 16 analyzers plus interactive pages for each application.

Contributing

To add a new analyzer, create a class in src/analysis/ implementing analyzeFrame(pcm: Float32Array) and export it from src/analysis/index.ts.
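
A new analyzer might look roughly like the sketch below. The FrameAnalyzer interface is declared locally as a stand-in, since the actual shared types in src/analysis/ are not shown here, and PeakAmplitudeAnalyzer is a made-up example rather than an existing analyzer.

```typescript
// Stand-in for whatever shared interface src/analysis/ actually defines (assumption).
interface FrameAnalyzer<T> {
  analyzeFrame(pcm: Float32Array): T;
}

/** Hypothetical analyzer: largest absolute sample value in the frame. */
class PeakAmplitudeAnalyzer implements FrameAnalyzer<number> {
  analyzeFrame(pcm: Float32Array): number {
    let peak = 0;
    for (let i = 0; i < pcm.length; i++) {
      const magnitude = Math.abs(pcm[i]);
      if (magnitude > peak) peak = magnitude;
    }
    return peak;
  }
}

// In the real repository this class would then be exported from src/analysis/index.ts.
const peakAnalyzer = new PeakAmplitudeAnalyzer();
console.log(peakAnalyzer.analyzeFrame(new Float32Array([0.25, -0.75, 0.5]))); // 0.75
```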

To add a new application, extend BaseApplication in src/applications/, implement processFrame(), and export from src/applications/index.ts.

License

MIT - See LICENSE file for details

Keywords

audio

Package last updated on 15 Mar 2026
