# RATT Agent Library
Welcome to the **RATT Agent Library**! 🎧⚡
This tiny TypeScript client streams microphone (or external) audio over WebSocket, handles reconnection and heartbeats, and emits typed events for live transcription and UI state. It works in browsers (with AudioWorklets) and in Node (via external PCM).
## 🚀 Features

- Low-latency audio streaming (16 kHz, 16-bit PCM) over WebSocket
- Browser + Node: use the mic in browsers or push external PCM in any runtime
- Typed events: ready, mic state, amplitude (VAD energy), transcription, socket messages, errors
- Prebuffering & gating: capture early, start sending when the server says `start_audio`
- Resilient WS: single-flight connect, StrictMode-safe reuse, heartbeats & auto-reconnect
- Safe defaults: echo cancellation, AGC, noise suppression (browser)
- Tiny API: a single `AssistantClient` class you can drop into your app
## 📦 Installation

```sh
npm i ratt-lib
# or
pnpm add ratt-lib
```
## ✨ Quick Start (Browser)
```ts
import { AssistantClient, AssistantEvent } from "ratt-lib";

// `chatbotData`, `user`, `encodeParam`, and the *_DEFAULT_VALUE constants
// come from your own app; substitute your own values.
const requestId = { current: "" };
const chatSessionId = "test";
const clientId = "test";

const rattAgentDetails = {
  conciergeId: chatbotData?.id,
  conciergeName: chatbotData?.name ?? chatbotData?.assistantDetails?.name,
  organizationId: chatbotData?.organization,
  organizationName: chatbotData?.organizationName,
  requestId: requestId.current,
  agentSettings: {
    voiceAgentMongoId: chatbotData?.agents?.find((a: any) => a.title === "RATTAgent")?._id,
  },
  username: user?.provider?.name,
  useremailId: user?.email,
  chatSessionId,
  rlefVoiceTaskId: chatbotData?.audioTranscription?.modelId || CREATE_AUDIO_RELF_VOICE_TASK_ID_DEFAULT_VALUE,
  assistant_type: chatbotData?.assistant_type,
  isAudioRequest: true,
  client_id: clientId,
  userId: encodeParam(USER_ID_DEFAULT_VALUE),
  testQuestion: "",
  testAnswer: "",
  testVariants: JSON.stringify({ Edit: [], Add: [], Delete: [] }),
};

const client = new AssistantClient({
  // Note: a template literal (backticks), so clientId/chatSessionId interpolate
  url: `wss://dev-egpt.techo.camp/audioStreamingWebsocket?clientId=${clientId}&sessionId=${chatSessionId}`,
  requestId,
  rattAgentDetails,
  onSend: () => console.log("Transcript submitted"),
  showToast: (type, title, msg) => console.log(type, title, msg),
  pingIntervalMs: 5000,
  maxMissedPongs: 2,
  workletBasePath: "/",
});

client.on(AssistantEvent.READY, () => {
  console.log("WS ready:", client.wsReady);
});
client.on(AssistantEvent.MIC_CONNECTING, ({ detail }) => {
  console.log("Mic connecting:", detail.connecting);
});
client.on(AssistantEvent.MIC_OPEN, ({ detail }) => {
  console.log("Mic open:", detail.open);
});
client.on(AssistantEvent.AMPLITUDE, ({ detail }) => {
  console.log("Amplitude:", detail.value);
});
client.on(AssistantEvent.TRANSCRIPTION, ({ detail }) => {
  console.log("Transcript:", detail.text);
});
client.on(AssistantEvent.ERROR, ({ detail }) => {
  console.error("Assistant error:", detail.error);
});

document.querySelector("#start")!.addEventListener("click", () => {
  client.startSession();
});
document.querySelector("#stop")!.addEventListener("click", () => {
  client.stopAudio();
});
```
### What happens under the hood?

- `startSession()` sends your `rattAgentDetails` plus a new `requestId`.
- When your server responds with `{ "start_audio": true }`, the client:
  - marks the mic as open,
  - starts converting to PCM16,
  - and begins streaming.
- As your server streams partial ASR, send either
  `{"streaming_data":{"previous_transcription":"...", "new_transcription":"..."}}` (chunked delta) or
  `{"transcription":"final text"}` (full updates).
- End with `{"stop_audio": true}` and/or `{"disconnect": true}` when you're done.
## 🧩 React one-liner (optional)
```ts
useEffect(() => {
  const unsub = client.on(AssistantEvent.TRANSCRIPTION, ({ detail }) => {
    setText(detail.text);
  });
  return unsub; // unsubscribe on unmount
}, []);
```
## 🖥️ Node / External Audio (no browser mic)

If you're not in a browser (or you have your own capture pipeline), set `externalAudio: true` and push PCM yourself.
```ts
import { AssistantClient } from "ratt-lib";
import fs from "node:fs";

const clientId = "test";
const chatSessionId = "test";
const requestId = { current: "" };

const client = new AssistantClient({
  url: `wss://dev-egpt.techo.camp/audioStreamingWebsocket?clientId=${clientId}&sessionId=${chatSessionId}`,
  requestId,
  externalAudio: true,
  externalAmplitudeRms: true,
});

await client.connect();
await client.startSession();

// Raw PCM: 16 kHz, 16-bit, mono, little-endian
const pcm = fs.readFileSync("./audio.raw");
client.pushPCM16(new Int16Array(pcm.buffer, pcm.byteOffset, pcm.byteLength / 2));

await client.stopAudio();
```
**PCM format:** 16 kHz, 16-bit, mono, little-endian.
If you have a `Float32Array` of samples in `[-1, 1]`, call `pushFloat32()` instead.
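For reference, a common way to map Float32 samples in `[-1, 1]` to 16-bit PCM looks like the sketch below. This standalone helper shows the format conversion; it is not the library's internal code, and in practice you can just call `pushFloat32()`:

```typescript
// Illustrative Float32 -> PCM16 conversion (the format pushPCM16 expects).
function floatToPCM16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to [-1, 1]
    // Negative range is one wider than positive in int16, hence two scales.
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```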
## 🔔 Events

All events are emitted as standard `CustomEvent`s and strongly typed via the `AssistantEvents` type.
- `READY` — WebSocket is ready (connected & open)
- `MIC_CONNECTING` — `{ connecting: boolean }` while we prep/prompt for mic
- `MIC_OPEN` — `{ open: boolean }` mic flow is active/inactive
- `AMPLITUDE` — `{ value: number }` live energy (for a mic meter)
- `TRANSCRIPTION` — `{ text: string, delta?: string }` progressive or final
- `SOCKET_MESSAGE` — `{ raw: MessageEvent, parsed?: any }` every incoming WS message
- `ERROR` — `{ error: unknown }` any operational error

```ts
const off = client.on(AssistantEvent.TRANSCRIPTION, ({ detail }) => {
  console.log(detail.text);
});
// later, to unsubscribe:
off();
```
## ⚙️ Options

```ts
type AssistantOptions = {
  url: string;
  requestId: { current: string };
  rattAgentDetails?: Record<string, any>;
  onSend?: () => void;
  showToast?: (type: "error" | "info" | "success", title: string, msg: string) => void;
  pingIntervalMs?: number;
  maxMissedPongs?: number;
  workletBasePath?: string;
  mediaStreamProvider?: () => Promise<MediaStream>;
  audioContextFactory?: () => AudioContext | null;
  workletLoader?: (base: string) => Promise<AudioContext | null>;
  externalAudio?: boolean;
  externalAmplitudeRms?: boolean;
  pcmChunkSize?: number;
};
```
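Conceptually, `pcmChunkSize` controls how a PCM buffer is split into fixed-size frames before being sent over the socket. The standalone helper below is only a sketch of that idea, assuming the client chunks internally; you do not need to chunk yourself:

```typescript
// Illustrative chunking: split PCM into frames of at most `chunkSize` samples.
// `subarray` creates views, so no samples are copied.
function chunkPCM(pcm: Int16Array, chunkSize: number): Int16Array[] {
  const chunks: Int16Array[] = [];
  for (let i = 0; i < pcm.length; i += chunkSize) {
    chunks.push(pcm.subarray(i, Math.min(i + chunkSize, pcm.length)));
  }
  return chunks;
}
```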
## 🧪 Common Methods

```ts
// Connection & session
await client.connect();
await client.startSession();
await client.stopAudio();
client.disconnect();
client.teardown();
client.closeSocket();

// Mic & prebuffering (browser)
await client.beginPrebuffering();
client.stopPrebuffering();
await client.startMic();
client.stopMic();

// External audio (Node / custom capture)
client.pushPCM16(int16ArrayOrBuffer);
client.pushFloat32(float32Array);

// Read-only state
client.wsReady;
client.micOpen;
client.micConnecting;
client.amplitude;
client.transcription;
```
## 🧯 Troubleshooting

- **No audio sent**: ensure your server replies with `{"start_audio": true}`; the client buffers until the gate opens.
- **Mic blocked**: the browser throws `NotAllowedError`; the client emits `ERROR` and calls `showToast(...)`.
- **Noisy audio / echo**: the default constraints enable echo cancellation, AGC, and noise suppression. Override `mediaStreamProvider` if needed.
- **Multiple connects in React StrictMode**: the client reuses a global `ACTIVE_WS` and a single `CONNECT_PROMISE`, so you're safe.
- **Heartbeat timeouts**: increase `pingIntervalMs` or `maxMissedPongs` if your WS hops are choppy.
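If you want to treat a mic-permission denial differently from other failures in your `ERROR` handler, you can branch on the standard `DOMException` name. The helper below is an illustrative sketch, not part of ratt-lib:

```typescript
// Illustrative: classify errors surfaced by the ERROR event.
// "NotAllowedError" is the standard DOMException name for a denied mic prompt.
function describeAssistantError(error: unknown): string {
  if (error instanceof Error && error.name === "NotAllowedError") {
    return "Microphone access was denied. Please grant permission and retry.";
  }
  if (error instanceof Error) {
    return `Assistant error: ${error.message}`;
  }
  return "Unknown assistant error";
}
```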
## 🔐 Permissions & Security

- Browsers require a user gesture to start the microphone.
- Only minimal audio data is sent; handle it securely on your server, and use `wss://` in production.
Questions, bugs, or ideas? Open an issue in your repo or ping your team chat. Happy streaming! 🎙️💬