octopus-web
The Picovoice Octopus library for web browsers, powered by WebAssembly.
Octopus is Picovoice's Speech-to-Index engine. It directly indexes speech without relying on a text representation. This acoustic-only approach boosts accuracy by removing the out-of-vocabulary limitation. All processing runs in WebAssembly inside Web Workers, on a separate thread.
Compatibility
- Chrome / Edge
- Firefox
- Safari
This library requires several modern browser features: WebAssembly, Web Workers, and promises. Internet Explorer is not supported.
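A page can check for these features before loading the library. The following is a minimal sketch; `missingFeatures` is a hypothetical helper name and is not part of the Octopus SDK.

```javascript
// Returns the names of required features that are missing from `scope`.
// `missingFeatures` is a hypothetical helper, not part of the Octopus SDK.
function missingFeatures(scope = globalThis) {
  const required = {
    WebAssembly: "object",
    Worker: "function",
    Promise: "function",
  };
  return Object.keys(required).filter(
    (name) => typeof scope[name] !== required[name]
  );
}

// Warn early instead of failing later when the engine is created.
const missing = missingFeatures();
if (missing.length > 0) {
  console.warn(`Octopus cannot run here; missing: ${missing.join(", ")}`);
}
```

Passing the scope as a parameter keeps the check easy to exercise outside a browser.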
Packages
The Octopus SDK for Web is split into separate worker and factory packages; import each as required.
Workers
For typical cases, use the worker package. The worker package creates complete OctopusWorker instances that can be immediately used.
Factories
Factory packages allow you to create instances of Octopus directly. This is useful for building your own custom Worker/Worklet, or for some other bespoke purpose.
Installation & Usage
Worker
To obtain an OctopusWorker, use the static create factory method of OctopusWorkerFactory. Here is a complete example that:
- Obtains an OctopusWorker from the OctopusWorkerFactory
- Passes a standard audio stream to be indexed and stores the result in the octopusMetadata object. The audio should be in the standard voice recognition format (16-bit, 16 kHz linear PCM, single-channel)
- Searches for a phrase and, if there are matches, receives the start and end time of each occurrence along with its probability
E.g.:
yarn add @picovoice/octopus-web-en-worker
import { OctopusWorkerFactory } from "@picovoice/octopus-web-en-worker";
let octopusMetadata = undefined;
// Declared at module scope so the code after startOctopus() can reach the worker.
let OctopusWorker = undefined;
// Stores the index result so it can be searched later.
function octopusIndexCallback(metadata) {
  octopusMetadata = metadata;
}
// Logs every match found for the search phrase.
function octopusSearchCallback(matches) {
  console.log(`Search results (${matches.length}):`);
  for (const match of matches) {
    console.log(
      `Start: ${match.startSec}s -> End: ${match.endSec}s (Probability: ${match.probability})`
    );
  }
}
async function startOctopus() {
  const accessKey = ...
  OctopusWorker = await OctopusWorkerFactory.create(
    accessKey,
    octopusIndexCallback,
    octopusSearchCallback
  );
}
startOctopus();
// Audio to index: 16-bit, 16 kHz, single-channel linear PCM samples.
const audioSignal = new Int16Array();
OctopusWorker.postMessage({
  command: "index",
  input: audioSignal,
});
...
const searchText = ...;
OctopusWorker.postMessage({
  command: "search",
  metadata: octopusMetadata,
  searchPhrase: searchText,
});
...
if (done) {
  OctopusWorker.postMessage({ command: "release" });
}
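The example above expects audio as 16-bit linear PCM in an Int16Array, whereas browser audio APIs typically produce Float32Array samples in the range [-1, 1]. The following is a minimal conversion sketch; `floatTo16BitPcm` is a hypothetical helper, not part of the Octopus SDK, and it does not resample, so the input is assumed to be 16 kHz mono already.

```javascript
// Hypothetical helper: converts float samples in [-1, 1] to 16-bit PCM.
// Assumes the input is already 16 kHz and single-channel (no resampling here).
function floatTo16BitPcm(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range samples so they do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    pcm[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return pcm;
}
```

The result could then be passed as the `input` of the "index" command shown above.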
Important Note: Because the workers are all-in-one packages that run an entire machine learning inference model in WebAssembly, they are approximately 9MB in size. While this is tiny for a speech recognition model, it's large for web delivery. Because of this, you likely will want to use dynamic import() instead of static import {} to reduce your app's starting bundle size. See e.g. https://webpack.js.org/guides/code-splitting/ for more information.
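One way to defer the download is to wrap the dynamic import() in a loader that runs at most once. This is a sketch under assumptions: `loadOnce` is a hypothetical helper, and the bundler is assumed to support code splitting on import().

```javascript
// Hypothetical helper: wraps an importer so the module is requested only once.
// Every call returns the same promise, so repeated calls do not re-import.
function loadOnce(importer) {
  let pending;
  return () => {
    if (pending === undefined) {
      pending = importer();
    }
    return pending;
  };
}

// Usage sketch (kept as comments because it needs a bundler and the package):
// const loadOctopus = loadOnce(() => import("@picovoice/octopus-web-en-worker"));
// startButton.addEventListener("click", async () => {
//   const { OctopusWorkerFactory } = await loadOctopus();
//   // ...create the worker as in the example above
// });
```

With this shape, the large worker bundle is fetched only when the user first needs it, rather than on initial page load.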
Factory
If you wish to build your own worker, or perhaps not use workers at all, use the factory packages. This will let you instantiate Octopus engine instances directly.
E.g.:
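import { Octopus } from "@picovoice/octopus-web-en-factory";
// Declared at module scope so the code after startOctopus() can reach the engine.
let handle = undefined;
async function startOctopus() {
  const accessKey = "";
  handle = await Octopus.create(accessKey);
}
startOctopus();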
...
const audioSignal = new Int16Array();
let octopusMetadata = await handle.index(audioSignal);
...
const searchText = "";
let octopusMatches = await handle.search(octopusMetadata, searchText);
console.log(`Search results (${octopusMatches.length}):`);
// octopusMatches is a list, so log each match individually.
for (const match of octopusMatches) {
  console.log(
    `Start: ${match.startSec}s -> End: ${match.endSec}s (Probability: ${match.probability})`
  );
}
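Since each match carries a probability, an application may want to keep only the more confident ones. The following is a minimal sketch that relies only on the startSec, endSec, and probability fields used above; `confidentMatches` and the 0.5 default threshold are illustrative assumptions, not part of the Octopus SDK.

```javascript
// Hypothetical helper: keeps matches at or above a probability threshold,
// ordered by where they start in the audio. The input array is not modified.
function confidentMatches(matches, minProbability = 0.5) {
  return matches
    .filter((m) => m.probability >= minProbability)
    .sort((a, b) => a.startSec - b.startSec);
}

// Example with made-up match objects:
const sample = [
  { startSec: 12.0, endSec: 12.8, probability: 0.9 },
  { startSec: 3.5, endSec: 4.1, probability: 0.3 },
  { startSec: 1.2, endSec: 1.9, probability: 0.7 },
];
console.log(confidentMatches(sample).map((m) => m.startSec));
```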
Build from source (IIFE + ESM outputs)
This library uses Rollup and TypeScript along with Babel and other popular Rollup plugins. There are two outputs: an IIFE version intended for script tags / CDN usage, and a JavaScript module version intended for use with modern JavaScript/TypeScript development (e.g. Angular, Create React App, Webpack).
yarn
yarn build
The output will appear in the ./dist/ folder.
For example usage, refer to the web demo.