AssemblyAI Node.js SDK
The AssemblyAI Node.js SDK provides an easy-to-use interface for interacting with the AssemblyAI API,
which supports async and real-time transcription, as well as the latest LeMUR models.
Installation
You can install the AssemblyAI SDK by running one of the following commands, depending on your package manager:
npm install assemblyai
yarn add assemblyai
pnpm add assemblyai
bun add assemblyai
Usage
Import the AssemblyAI package and create an AssemblyAI object with your API key:
import { AssemblyAI } from "assemblyai";
const client = new AssemblyAI({
apiKey: process.env.ASSEMBLYAI_API_KEY,
});
You can now use the client object to interact with the AssemblyAI API.
Create a transcript
When you create a transcript, you can either pass in a URL to an audio file or upload a file directly.
Transcribe an audio file from a URL:
let transcript = await client.transcripts.transcribe({
audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});
Or upload and transcribe a local file:
let transcript = await client.transcripts.transcribe({
audio: "./news.mp4",
});
Note: You can also pass streams and buffers to the audio property.
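For example (a minimal sketch; the file path is illustrative), you could read the file into a buffer yourself and pass that in:
import { readFile } from "fs/promises";

// Read the audio file into a Buffer and pass it directly as the audio input
const data = await readFile("./news.mp4");
let transcript = await client.transcripts.transcribe({
  audio: data,
});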
transcribe queues a transcription job and polls it until the status is completed or error. You can configure the polling interval and polling timeout using these options:
let transcript = await client.transcripts.transcribe(
{
audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
},
{
pollingInterval: 1000,
pollingTimeout: 5000,
}
);
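Once transcribe resolves, the returned transcript contains the result. For example, you can check whether the job failed and otherwise print the transcribed text:
if (transcript.status === "error") {
  console.error("Transcription failed:", transcript.error);
} else {
  console.log(transcript.text);
}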
If you don't want to wait until the transcript is ready, you can use submit:
let transcript = await client.transcripts.submit({
audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});
Get a transcript
This will return the transcript object in its current state. If the transcript is still processing, the status field will be queued or processing. Once the transcript is complete, the status field will be completed.
const transcript = await client.transcripts.get(transcript.id);
If you created a transcript using submit, you can still poll until the transcript status is completed or error using waitUntilReady:
const transcript = await client.transcripts.waitUntilReady(transcript.id, {
pollingInterval: 1000,
pollingTimeout: 5000,
});
List transcripts
This will return a page of transcripts you created.
const page = await client.transcripts.list();
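The returned page contains the transcripts themselves alongside the paging details. As a small sketch, you could log the ID and status of each transcript on the page:
// Each page exposes a transcripts array alongside page_details
for (const t of page.transcripts) {
  console.log(t.id, t.status);
}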
You can also paginate over all pages.
let nextPageUrl: string | null = null;
do {
const page = await client.transcripts.list(nextPageUrl);
nextPageUrl = page.page_details.next_url;
} while (nextPageUrl !== null);
Delete a transcript
const res = await client.transcripts.delete(transcript.id);
Use LeMUR
Call LeMUR endpoints to summarize, ask questions, generate action items, or run a custom task.
Custom Summary:
const { response } = await client.lemur.summary({
transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
answer_format: "one sentence",
context: {
speakers: ["Alex", "Bob"],
},
});
Question & Answer:
const { response } = await client.lemur.questionAnswer({
transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
questions: [
{
question: "What are they discussing?",
answer_format: "text",
},
],
});
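For question & answer, the response contains one entry per question. As a small sketch, you could print each pair like this:
for (const { question, answer } of response) {
  console.log(`Question: ${question}`);
  console.log(`Answer: ${answer}`);
}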
Action Items:
const { response } = await client.lemur.actionItems({
transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
});
Custom Task:
const { response } = await client.lemur.task({
transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
prompt: "Write a haiku about this conversation.",
});
Transcribe in real time
Create the real-time service.
const rt = client.realtime.createService();
You can also pass in the following options.
const rt = client.realtime.createService({
realtimeUrl: 'wss://localhost/override',
apiKey: process.env.ASSEMBLYAI_API_KEY,
sampleRate: 16_000,
wordBoost: ['foo', 'bar']
});
You can also generate a temporary auth token for real-time.
const token = await client.realtime.createTemporaryToken({ expires_in: 60 });
const rt = client.realtime.createService({
token: token,
});
Warning: Storing your API key in client-facing applications exposes your API key. Generate a temporary auth token on the server and pass it to your client.
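As an illustration only (assuming an Express server; the route name is made up), a minimal server endpoint that issues temporary tokens could look like this:
import express from "express";
import { AssemblyAI } from "assemblyai";

const app = express();
const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

// Hypothetical route: the browser or mobile client fetches a short-lived
// token from here instead of ever seeing the API key.
app.get("/realtime-token", async (_req, res) => {
  const token = await client.realtime.createTemporaryToken({ expires_in: 60 });
  res.json({ token });
});

app.listen(3000);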
You can configure the following events.
rt.on("open", ({ sessionId, expiresAt }) => console.log('Session ID:', sessionId, 'Expires at:', expiresAt));
rt.on("close", (code: number, reason: string) => console.log('Closed', code, reason));
rt.on("transcript", (transcript: TranscriptMessage) => console.log('Transcript:', transcript));
rt.on("transcript.partial", (transcript: PartialTranscriptMessage) => console.log('Partial transcript:', transcript));
rt.on("transcript.final", (transcript: FinalTranscriptMessage) => console.log('Final transcript:', transcript));
rt.on("error", (error: Error) => console.error('Error', error));
After configuring your events, connect to the server.
await rt.connect();
Send audio data via chunks.
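// getAudio is a placeholder for however you capture audio chunks (for example, from a microphone).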
getAudio((chunk) => {
rt.sendAudio(chunk);
});
Or send audio data via a stream by piping to the realtime stream.
audioStream.pipe(rt.stream());
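Here, audioStream can be any Node.js readable stream of raw audio at the configured sample rate. As an illustration only (the file path and format are assumptions), you could stream a PCM file from disk:
import fs from "fs";

// Illustration: stream raw 16 kHz PCM audio from a local file.
// In practice, real-time audio usually comes from a live source such as a microphone.
const audioStream = fs.createReadStream("./speech.pcm");
audioStream.pipe(rt.stream());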
Close the connection when you're finished.
await rt.close();
Tests
To run the test suite, first install the dependencies, then run pnpm test:
pnpm install
pnpm test