
speechmatics
[!WARNING]
This package has been deprecated and will no longer be updated. See the packages @speechmatics/batch-client and @speechmatics/real-time-client for the best-quality integration with Speechmatics APIs.
Official JS/TS SDK for Speechmatics API.
To access the API you need to have an account with Speechmatics. You can sign up for a free trial here.
The documentation for the API can be found here.
More examples on how to use the SDK can be found in the examples folder.
Our Portal is also a good source of information on how to use the API. You can find it here. Check out the Upload and Realtime Demo sections.
npm install speechmatics
In order to use the SDK, authentication is needed. Generate an API key in the Portal. You can find more information on how to do that here.
The section below explains the different options available for authenticating using your API key.
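An API key can be used in 2 different ways for authentication:
1. Bearer authentication: the API key itself is sent in the Authorization header.
2. JWT authentication: the API key is used on the server side to obtain a short-lived JWT, which is then used to authenticate.
Bearer authentication will be used by the SDK if you pass an API key, as opposed to a JWT, when the SDK instance is created: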
import { RealtimeSession } from 'speechmatics';
const sm = new RealtimeSession(YOUR_API_KEY);
Note that in browsers, or in any other client, you should never use Bearer authentication (option 1), because it exposes your API key, which is NOT a short-lived token. The above example is meant for server-side Node code.
You can use your API key on the server side to obtain a JWT for an authenticated user. These tokens are short-lived and are no longer valid for authentication once they expire. A new JWT can be requested at any time. The HTTP request for obtaining a JWT is as follows:
POST https://mp.speechmatics.com/v1/api_keys
Query parameter: type, with possible values batch or rt
Headers: Content-Type: application/json and Authorization: Bearer YOUR_API_KEY
Body: a JSON object with the field ttl. The value for ttl is a number that indicates for how many seconds the token will be valid, between 60 and 3600.
Example of a request for a realtime JWT valid for 1 hour:
curl -L -X POST "https://mp.speechmatics.com/v1/api_keys?type=rt" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $YOUR_API_KEY" \
-d '{"ttl": 3600}'
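The same request can be sketched in server-side JavaScript. This is a minimal illustration, not part of the SDK: the helper names `buildJwtRequest` and `requestJwt` are hypothetical, the global `fetch` assumes Node 18 or later, and reading the token from a `key_value` field of the response is an assumption that should be checked against the API documentation.

```javascript
// Build the request described above. `type` is 'batch' or 'rt'; `ttl` is in seconds.
function buildJwtRequest(apiKey, type = 'rt', ttl = 3600) {
  if (type !== 'batch' && type !== 'rt') {
    throw new Error(`type must be 'batch' or 'rt', got '${type}'`);
  }
  if (!Number.isFinite(ttl) || ttl < 60 || ttl > 3600) {
    throw new Error('ttl must be a number of seconds between 60 and 3600');
  }
  return {
    url: `https://mp.speechmatics.com/v1/api_keys?type=${type}`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ ttl }),
    },
  };
}

// Send the request with the global fetch and return the token.
async function requestJwt(apiKey, type = 'rt', ttl = 3600) {
  const { url, init } = buildJwtRequest(apiKey, type, ttl);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`JWT request failed with status ${response.status}`);
  }
  const body = await response.json();
  // Assumption: the JWT is returned in the `key_value` field of the JSON body.
  return body.key_value;
}
```

Keeping the request construction separate from the network call makes it easy to check the URL, headers and body without contacting the API.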
A valid JWT can then be passed to the RealtimeSession constructor:
import { RealtimeSession } from 'speechmatics';
const session = new RealtimeSession(YOUR_JWT);
Alternatively:
const session = new RealtimeSession({apiKey: YOUR_JWT});
There is also the option to provide an async callback to fetch a JWT. This is useful if you want the SDK to refresh the JWT before it expires.
const session = new RealtimeSession({
  apiKey: async () => {
    // ... implement your JWT fetching here
  },
});
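One possible shape for such a callback is sketched below. It assumes your own backend exposes a route that returns a fresh JWT as `{ "jwt": "..." }`. The route, the response shape and the helper name `createJwtProvider` are all hypothetical and not part of the SDK.

```javascript
// Returns an async function suitable for the `apiKey` option.
// `endpoint` is a hypothetical route on your own server that responds
// with a JSON body of the form { "jwt": "..." }.
function createJwtProvider(endpoint, fetchImpl = fetch) {
  return async () => {
    const response = await fetchImpl(endpoint, { method: 'POST' });
    if (!response.ok) {
      throw new Error(`Could not fetch JWT: status ${response.status}`);
    }
    const { jwt } = await response.json();
    return jwt;
  };
}

// Possible usage (the route name is illustrative):
// const session = new RealtimeSession({
//   apiKey: createJwtProvider('/api/speechmatics-jwt'),
// });
```

Because the API key stays on the server behind that route, the client only ever handles short-lived tokens.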
This example shows how to set up and run a realtime session on a Node.js backend server using a file as input.
import { RealtimeSession } from 'speechmatics';
// imports helpful for the file streaming
import fs from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

// __dirname is not defined in ES modules, so derive it from import.meta.url
const __dirname = path.dirname(fileURLToPath(import.meta.url));

// init the session
const session = new RealtimeSession(YOUR_API_KEY);

// add listeners
session.addListener('RecognitionStarted', () => {
  console.log('RecognitionStarted');
});
session.addListener('Error', (error) => {
  console.log('session error', error);
});
session.addListener('AddTranscript', (message) => {
  console.log('AddTranscript', message);
});
session.addListener('AddPartialTranscript', (message) => {
  console.log('AddPartialTranscript', message);
});
session.addListener('EndOfTranscript', () => {
  console.log('EndOfTranscript');
});

// start the session (start() is asynchronous)
session.start().then(() => {
  // prepare file stream
  const fileStream = fs.createReadStream(
    path.join(__dirname, 'example_files/example.wav'),
  );
  // send each chunk as it is read
  fileStream.on('data', (sample) => {
    console.log('sending audio', sample.length);
    session.sendAudio(sample);
  });
  // end the session once the whole file has been sent
  fileStream.on('end', () => {
    session.stop();
  });
});
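Instead of logging each message, the listeners can also be used to assemble the final text. The sketch below is not part of the SDK: `collectTranscript` is a hypothetical helper, and it assumes that each AddTranscript message carries its text in `metadata.transcript`, which should be verified against the API documentation.

```javascript
// Collects final transcript text from a session-like object exposing addListener().
// Resolves with the joined text on EndOfTranscript and rejects on Error.
function collectTranscript(session) {
  return new Promise((resolve, reject) => {
    const parts = [];
    session.addListener('AddTranscript', (message) => {
      // Assumption: the text of each final segment is in metadata.transcript.
      parts.push(message.metadata.transcript);
    });
    session.addListener('Error', (error) => reject(error));
    session.addListener('EndOfTranscript', () => resolve(parts.join('')));
  });
}

// Possible usage, registered before session.start():
// collectTranscript(session).then((text) => console.log(text));
```

Partial transcripts are deliberately ignored here, since they are later superseded by final AddTranscript messages.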
Because our API keys are persistent, it is important to remember not to use them to authenticate on the client side. Instead, we recommend generating a short-lived JWT on the server side using your API key and providing this JWT as an argument to the RealtimeSession constructor:
const session = new RealtimeSession(YOUR_JWT);
This example shows how to run the SDK in a web app, using the browser's built-in MediaRecorder class to access the computer's microphone.
import { RealtimeSession } from 'speechmatics';

// create a session with JWT
const session = new RealtimeSession(YOUR_JWT);

// add listeners
session.addListener('RecognitionStarted', () => {
  console.log('RecognitionStarted');
});
session.addListener('Error', (error) => {
  console.log('session error', error);
});
session.addListener('AddTranscript', (message) => {
  console.log('AddTranscript', message);
});
session.addListener('AddPartialTranscript', (message) => {
  console.log('AddPartialTranscript', message);
});
session.addListener('EndOfTranscript', () => {
  console.log('EndOfTranscript');
});

// start the session (start() is asynchronous)
session.start().then(async () => {
  // set up the audio stream from the microphone
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const mediaRecorder = new MediaRecorder(stream, {
    mimeType: 'audio/webm;codecs=opus',
    audioBitsPerSecond: 16000
  });
  // emit a chunk of recorded audio every 1000 ms
  mediaRecorder.start(1000);
  mediaRecorder.ondataavailable = (event) => {
    if (event.data.size > 0) {
      session.sendAudio(event.data);
    }
  };
});
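The example above starts capturing but never stops. A possible teardown is sketched below, under the assumption that `mediaRecorder`, `stream` and `session` from the example are still in scope. `stopCapture` is a hypothetical helper, not an SDK function. It waits for the recorder's `stop` event, which the browser fires after the final `dataavailable` event, before ending the session.

```javascript
// Stops recording, releases the microphone, then ends the session.
function stopCapture(mediaRecorder, stream, session) {
  return new Promise((resolve, reject) => {
    const finish = () => {
      // Release the microphone so the browser's recording indicator goes away.
      stream.getTracks().forEach((track) => track.stop());
      // session.stop() may or may not return a promise; handle both cases.
      Promise.resolve(session.stop()).then(resolve, reject);
    };
    if (mediaRecorder.state === 'inactive') {
      finish();
      return;
    }
    mediaRecorder.addEventListener('stop', finish, { once: true });
    mediaRecorder.stop();
  });
}
```

Calling it from a "Stop" button handler, for example, ensures the last chunk of audio is handed to the `ondataavailable` callback before the session is closed.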
A TranscriptionConfig object specifies different configuration values that can be used for transcription. If a transcription config is not given, the SDK uses a default one with just the language field set to en.
A TranscriptionConfig object can be passed to the start method of a RealtimeSession object.
const session = new RealtimeSession(YOUR_API_KEY);
const transcription_config = {
  language: 'en',
  additional_vocab: [
    { content: 'gnocchi', sounds_like: ['nyohki', 'nokey', 'nochi'] },
    { content: 'CEO', sounds_like: ['C.E.O'] }
  ],
  diarization: 'speaker_change',
  enable_partials: true,
  operating_point: 'enhanced'
};
session.start({ transcription_config });
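If several parts of an application start sessions with slightly different settings, one option is to merge per-call overrides over a shared base config. The sketch below is illustrative only: `startWithConfig` and `DEFAULT_TRANSCRIPTION_CONFIG` are hypothetical names, and the base value simply mirrors the default described above.

```javascript
// Mirrors the default described above: only `language` is set.
const DEFAULT_TRANSCRIPTION_CONFIG = { language: 'en' };

// Merge caller overrides over the defaults and start a session-like object.
function startWithConfig(session, overrides = {}) {
  const transcription_config = { ...DEFAULT_TRANSCRIPTION_CONFIG, ...overrides };
  return session.start({ transcription_config });
}

// Possible usage:
// startWithConfig(session, { enable_partials: true, operating_point: 'enhanced' });
```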
More information about the available fields can be found in the documentation.
You can find more examples in the examples folder.
To run the node sample code you'll need to add your API key to a .env file or directly inside the node example file. You can generate your API key in the Speechmatics Console.
node examples/example_rt_node.js
We'd love to see your contributions! Please read our contributing guidelines for more information.