audio-file-decoder

About
A library for decoding audio files, including support for decoding specific timestamp ranges within files. Written with FFmpeg and compiled to WebAssembly via Emscripten. Intended for use in browser environments only.
The following audio file formats are supported:
Why?
WebAudio currently provides decodeAudioData as a means to access raw samples from audio files faster than real time. However, it only supports decoding entire audio files, which can take huge amounts of memory. For example, a 10 minute audio file with a sample rate of 44100 Hz, floating point samples, and stereo channels will occupy 44100 Hz * 600 seconds * 4 bytes * 2 channels = ~212 MB of memory when decoded.
The WebCodecs proposal is planning to address this oversight (see here for more info), but until browsers adopt it, this library can be used as a more memory-friendly alternative to WebAudio's current implementation.
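The memory arithmetic above generalizes to any file. As a rough sketch (the function name and parameters are illustrative and not part of this library), the decoded size can be estimated like so:

```javascript
// Estimate the memory needed to hold fully decoded PCM audio.
// Illustrative helper only; not part of audio-file-decoder.
function estimateDecodedBytes(sampleRate, durationSeconds, channels, bytesPerSample = 4) {
  return sampleRate * durationSeconds * channels * bytesPerSample;
}

// 10 minutes of 44100 Hz stereo audio with 4-byte floating point samples
const bytes = estimateDecodedBytes(44100, 600, 2);
console.log(`${(bytes / 1e6).toFixed(1)} MB`); // 211.7 MB
```

By the same estimate, a 30 second range of that file needs roughly 10.6 MB, which is the kind of saving that decoding timestamp ranges is meant to provide.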
Caveats/Notes
- Files still need to be stored in memory for access since the filesystem is sandboxed.
- Multiple channels are automatically downmixed into a single channel via sample averaging. Decoded audio is also NOT resampled, whereas decodeAudioData will automatically resample to the sample rate of its AudioContext.
- Sample positions may be slightly off when decoding timestamp ranges, due to timestamp precision and FFmpeg's seek behavior. FFmpeg tries to seek to the closest possible frame for a given timestamp, which may introduce an error of a few frames. Each frame contains a fixed (e.g. 1024 samples) or dynamic number of samples, depending on the audio file's encoding.
- Performance is roughly 2x slower than Chromium's implementation of decodeAudioData. Chromium's implementation also uses FFmpeg for decoding, but is able to run natively with threading and native optimizations enabled, while this library has them disabled for WebAssembly compatibility.
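To make the downmixing caveat concrete, here is a minimal sketch of sample averaging. It assumes one Float32Array per channel, all of equal length; that layout and the function name are assumptions made for illustration, and the library's internal implementation may differ.

```javascript
// Downmix per-channel sample arrays into a single channel by averaging.
// Assumes one Float32Array per channel, all the same length (an assumption
// of this sketch, not a statement about the library's internals).
function downmixByAveraging(channels) {
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    let sum = 0;
    for (const channel of channels) sum += channel[i];
    mono[i] = sum / channels.length;
  }
  return mono;
}

const left = new Float32Array([1, 0.5, -1]);
const right = new Float32Array([0, 0.5, 1]);
console.log(Array.from(downmixByAveraging([left, right]))); // [ 0.5, 0.5, 0 ]
```

Note that averaging can cancel out content that is out of phase between channels, as the last sample in the example shows.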
Usage / API
Getting Started
npm install --save audio-file-decoder
Synchronous Decoding
An example of synchronous audio file decoding in ES6:
import { getAudioDecoder } from 'audio-file-decoder';
import DecodeAudioWasm from 'audio-file-decoder/decode-audio.wasm';

// a File or ArrayBuffer containing the audio file
const fileOrArrayBuffer = ...;

getAudioDecoder(DecodeAudioWasm, fileOrArrayBuffer)
  .then(decoder => {
    // metadata about the audio file
    const sampleRate = decoder.sampleRate;
    const channelCount = decoder.channelCount;
    const encoding = decoder.encoding;
    const duration = decoder.duration;

    let samples;

    // decode the entire file
    samples = decoder.decodeAudioData();

    // decode timestamp ranges
    samples = decoder.decodeAudioData(5.5, -1);
    samples = decoder.decodeAudioData(30, 60);

    // decode with additional options (see DecodeAudioOptions under Additional Options)
    const options = {
      multiChannel: true,
    };
    samples = decoder.decodeAudioData(0, -1, options);

    // release the decoder once finished
    decoder.dispose();
  });
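Since the point of decoding timestamp ranges is to avoid holding a whole file's samples in memory, one possible pattern is to walk the file in fixed-size windows. The generator below is a hypothetical helper, not part of this library; it only computes the windows, and how `start` and `length` map onto the arguments of decodeAudioData should be checked against the library before use.

```javascript
// Split a total duration into fixed-size windows so that only one window's
// worth of samples needs to be held in memory at a time.
// Hypothetical helper; not part of audio-file-decoder.
function* chunkRanges(totalSeconds, chunkSeconds) {
  if (!(chunkSeconds > 0)) throw new RangeError('chunkSeconds must be positive');
  for (let start = 0; start < totalSeconds; start += chunkSeconds) {
    yield { start, length: Math.min(chunkSeconds, totalSeconds - start) };
  }
}

for (const { start, length } of chunkRanges(25, 10)) {
  console.log(`window at ${start}s, ${length}s long`);
}
// window at 0s, 10s long
// window at 10s, 10s long
// window at 20s, 5s long
```

Inside the .then callback, each window could be decoded and processed before moving on to the next, assuming decoder.duration is expressed in seconds.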
Asynchronous Decoding
An example of asynchronous audio file decoding in ES6:
import { getAudioDecoderWorker } from 'audio-file-decoder';
import DecodeAudioWasm from 'audio-file-decoder/decode-audio.wasm';

// a File or ArrayBuffer containing the audio file
const fileOrArrayBuffer = ...;

let audioDecoder;
getAudioDecoderWorker(DecodeAudioWasm, fileOrArrayBuffer)
  .then(decoder => {
    // metadata about the audio file
    const sampleRate = decoder.sampleRate;
    const channelCount = decoder.channelCount;
    const encoding = decoder.encoding;
    const duration = decoder.duration;

    // keep a reference so the decoder can be disposed later
    audioDecoder = decoder;

    // additional options (see DecodeAudioOptions under Additional Options)
    const options = {
      multiChannel: false,
    };
    // returns a promise that resolves with the decoded samples
    return decoder.getAudioData(15, 45, options);
  })
  .then(samples => {
    console.log(samples);
    audioDecoder.dispose();
  });
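In the promise chain above, dispose is skipped if getAudioData rejects. One way to guard against that, sketched here with async/await and try/finally, is a small wrapper that always disposes the decoder. The `withDecoder` helper and the stand-in decoder are hypothetical and exist only so the sketch runs without the library or a wasm file.

```javascript
// Run a task against a decoder and always dispose it afterwards, even if
// the task throws. `createDecoder` is any function returning a promise for
// an object with a dispose() method.
// Hypothetical helper; not part of audio-file-decoder.
async function withDecoder(createDecoder, task) {
  const decoder = await createDecoder();
  try {
    return await task(decoder);
  } finally {
    decoder.dispose();
  }
}

// Stand-in decoder so the sketch is runnable on its own.
const fakeDecoder = {
  disposed: false,
  getAudioData: async () => new Float32Array(4),
  dispose() { this.disposed = true; },
};

withDecoder(async () => fakeDecoder, (d) => d.getAudioData(15, 45))
  .then((samples) => console.log(samples.length, fakeDecoder.disposed)); // 4 true
```

With the real library, `createDecoder` would presumably be `() => getAudioDecoderWorker(DecodeAudioWasm, fileOrArrayBuffer)`.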
Additional Options
You can pass additional options when decoding audio data. Currently supported options are listed below:
interface DecodeAudioOptions {
  multiChannel?: boolean;
}
Importing WASM Assets
The getAudioDecoder and getAudioDecoderWorker factory functions expect relative paths (from your app's origin) to the wasm file or inlined versions of the wasm file provided by the library. You'll need to include this wasm file as an asset in your application, either by using a plugin/loader if you use a module bundler (e.g. file-loader for webpack) or by copying the file over in your build process.
If using a module bundler with appropriate plugins/loaders, you can simply import the required wasm asset like below:
import { getAudioDecoder, getAudioDecoderWorker } from 'audio-file-decoder';
import DecodeAudioWasm from 'audio-file-decoder/decode-audio.wasm';
getAudioDecoder(DecodeAudioWasm, myAudioFile);
getAudioDecoderWorker(DecodeAudioWasm, myAudioFile);
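For webpack with file-loader, a rule along the following lines is one common way to make such an import resolve to the emitted file's URL. This is a sketch under assumptions (file-loader installed, a CommonJS webpack.config.js); the right setup depends on your bundler and its version.

```javascript
// webpack.config.js (fragment): emit .wasm files as plain assets so that
// importing one yields its URL. Assumes file-loader is installed.
module.exports = {
  module: {
    rules: [
      {
        test: /\.wasm$/,
        type: 'javascript/auto', // bypass webpack's built-in wasm handling
        loader: 'file-loader',
      },
    ],
  },
};
```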
If you aren't using a module bundler, then you need to make sure your build process copies the asset over. The wasm file is located at:
/node_modules/audio-file-decoder/decode-audio.wasm
For example, a typical application using this library should include it as an asset like in the example file structure below:
app/
  dist/
    index.html
    index.js
    decode-audio.wasm
Make sure to then manually pass in the correct relative path (again, from your app's origin) when using getAudioDecoder or getAudioDecoderWorker.
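To illustrate what a path relative to the app's origin means, the sketch below resolves two paths with the standard URL constructor. The origin and paths are made-up examples, and the helper is not part of this library.

```javascript
// Show how a relative wasm path resolves against an app's origin.
// The origin and paths are made-up examples.
function resolveAgainstOrigin(relativePath, origin) {
  return new URL(relativePath, origin).href;
}

console.log(resolveAgainstOrigin('decode-audio.wasm', 'https://example.com'));
// https://example.com/decode-audio.wasm
console.log(resolveAgainstOrigin('assets/decode-audio.wasm', 'https://example.com'));
// https://example.com/assets/decode-audio.wasm
```

If the wasm file fails to load, comparing the resolved URL against where the file is actually served is a quick first check.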
Building
The build steps below have been tested on Ubuntu 20.04.1 LTS.
First clone the repo, then navigate to the repo directory and run the following commands:
sudo apt-get update -qq
sudo apt-get install -y autoconf automake build-essential cmake git pkg-config wget libtool
git clone https://github.com/emscripten-core/emsdk.git
./emsdk/emsdk install 3.0.0
./emsdk/emsdk activate 3.0.0
source ./emsdk/emsdk_env.sh
npm install && npm run sync && npm run build-deps
npm run build-wasm && npm run build
Commands for the WebAssembly module, which can be useful if modifying or extending the C++ wrapper around FFmpeg:
npm run build-wasm
npm run clean-wasm
Commands for FFmpeg and dependencies, which can be useful if modifying the compilation of FFmpeg and its dependencies:
npm run sync
npm run unsync
npm run build-deps
npm run clean-deps
Contributing
Contributions are welcome! Feel free to submit issues or PRs for any bugs or feature requests.
License
Licensed under LGPL v2.1 or later. See the license file for more info.