Hark is a tiny browser/CommonJS module that listens to an audio stream and emits events indicating whether the user is speaking or not.
npm install hark
If you aren't using browserify, you'll want hark.bundle.js.
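    // After `npm install hark`, require the package by name.
    // (Inside the hark repository's own example directory the path is '../hark.js'.)
    var hark = require('hark');
    var getUserMedia = require('getusermedia');

    getUserMedia(function(err, stream) {
      if (err) throw err;

      var options = {};
      var speechEvents = hark(stream, options);

      // Fired when the stream rises above the speech threshold.
      speechEvents.on('speaking', function() {
        console.log('speaking');
      });

      // Fired when the stream drops back below the threshold.
      speechEvents.on('stopped_speaking', function() {
        console.log('stopped_speaking');
      });
    });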
Hark uses the Web Audio API to run an FFT on the audio stream and get its power. If the power is above a threshold, the audio is treated as speech.
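The threshold check described above can be sketched roughly as follows. This is a minimal illustration, not hark's actual source: the helper names `maxVolume` and `isSpeech` are hypothetical, and the sketch assumes the FFT data arrives as one dB value per frequency bin, which is what the Web Audio `AnalyserNode.getFloatFrequencyData()` call fills in.

```javascript
// Hypothetical helpers for illustration only; hark's real code may differ.

// Return the loudest value (in dB) found in one frame of FFT bins.
function maxVolume(bins) {
  var max = -Infinity;
  for (var i = 0; i < bins.length; i++) {
    if (bins[i] > max) max = bins[i];
  }
  return max;
}

// Treat the frame as speech when its loudest bin is above the threshold.
function isSpeech(bins, thresholdDb) {
  return maxVolume(bins) > thresholdDb;
}

// In a browser, the bins would come from an AnalyserNode, for example:
//   var analyser = audioContext.createAnalyser();
//   var bins = new Float32Array(analyser.frequencyBinCount);
//   analyser.getFloatFrequencyData(bins);
//   isSpeech(bins, -45);
```

Taking the maximum bin is only one way to reduce a frame to a single volume; an average or a band-limited measure would be equally valid choices for a sketch like this.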
    var speech = hark(stream, options);
    speech.on('speaking', function() {
      console.log('Speaking!');
    });
Events

speaking: emitted when the stream appears to be speaking
stopped_speaking: emitted when the audio doesn't seem to be speaking
volume_change: emitted on every poll event by the event emitter with the current volume (in decibels) and the current threshold for speech

Methods

setInterval(interval_in_ms): change how frequently the analyser polls the audio stream
setThreshold(threshold_in_db): change the minimum volume at which the audio will emit a speaking event

Options

interval: (optional, default 100ms) how frequently the analyser polls the audio stream to check if speaking has started or stopped. This will also be the frequency of the volume_change events.
threshold: (optional, default -65dB for audio tags, -45dB for rtc streams) the volume at which speaking/stopped_speaking events will be fired
play: (optional, default true for audio tags, false for webrtc streams) whether the audio stream should also be piped to the speakers, or just swallowed by the analyser. Typically for audio tags you would want to hear them, but for microphone-based webrtc streams you may not want to, in order to avoid feedback.

Fine-tuning the volume threshold is the main configuration setting for how this module will behave. The levels of -65dB and -45dB for audio tags and rtc streams respectively have been chosen based on some basic experimentation on my setup, but you may wish to change them (and should if it improves your app).
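One possible way to tune the threshold is to record a few volume readings while nobody is speaking and place the threshold slightly above the loudest of them. The sketch below is an assumption-laden illustration: `suggestThreshold` and its `marginDb` parameter are hypothetical and not part of hark, and the commented usage assumes that volume_change passes the current volume as its first argument.

```javascript
// Hypothetical helper, not part of hark: pick a threshold a fixed margin
// above the loudest ambient reading.
function suggestThreshold(ambientReadingsDb, marginDb) {
  if (ambientReadingsDb.length === 0) {
    throw new Error('need at least one volume reading');
  }
  var loudest = Math.max.apply(null, ambientReadingsDb);
  var suggested = loudest + marginDb;
  // Clamp to the range described in the text: -100dB (silence) to 0dB (loudest).
  return Math.min(0, Math.max(-100, suggested));
}

// Browser usage sketch (assumes `speechEvents` came from hark(stream, options)):
//   var readings = [];
//   speechEvents.on('volume_change', function(volume) {
//     readings.push(volume);
//   });
//   // After a few seconds of background noise only:
//   setTimeout(function() {
//     speechEvents.setThreshold(suggestThreshold(readings, 10));
//   }, 3000);
```

The 10dB margin and 3 second window in the usage sketch are arbitrary starting points; they would need adjusting for a real microphone and room.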
What is dB? Decibels (dB) are the unit used to measure sound level. The loudest sounds on your system will be at 0dB, and silence in webaudio is -100dB. Speech seems to fall roughly between -65dB and -45dB, depending on the volume and type of source. If speaking events are being fired too frequently, raise the threshold (i.e. move it towards 0). If they are not firing frequently enough (you are speaking loudly but no events are firing), lower it towards -100dB.
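To make the scale concrete, the sketch below converts between a linear amplitude and dB, assuming the common convention of 20 times the base-10 logarithm relative to full scale (amplitude 1.0). The function names are hypothetical and this is not code from hark.

```javascript
// Hypothetical helpers illustrating the dB scale; assumes full scale = 1.0.

// Convert a linear amplitude (0 < amplitude <= 1) to decibels.
function amplitudeToDb(amplitude) {
  return 20 * Math.log10(amplitude);
}

// Convert decibels back to a linear amplitude.
function dbToAmplitude(db) {
  return Math.pow(10, db / 20);
}

// Full-scale amplitude maps to the top of the range.
console.log(amplitudeToDb(1)); // 0
// An amplitude of 0.00001 sits at about -100dB, the "silence" floor above.
console.log(amplitudeToDb(0.00001));
// A -45dB threshold corresponds to a small fraction of full scale.
console.log(dbToAmplitude(-45));
```

Under this convention every 20dB step is a factor of ten in amplitude, which is why small changes to the threshold can noticeably change how often events fire.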
Clone and open example/index.html or view it online
Chrome 27+ currently
MIT