@pipecat-ai/gemini-live-websocket-transport

Pipecat Gemini Multimodal Live Transport Package

Source: npm
Version: 0.3.4-rc.2
Weekly downloads: 91
Maintainers: 5

Gemini Live Websocket Transport

A real-time websocket transport implementation for interacting with Google's Gemini Multimodal Live API, supporting bidirectional audio and unidirectional text communication.

Installation

npm install @pipecat-ai/client-js @pipecat-ai/real-time-websocket-transport @pipecat-ai/gemini-live-websocket-transport

Overview

The GeminiLiveWebsocketTransport class extends RealTimeWebsocketTransport to implement a fully functional RTVI Transport. It communicates in real time directly with the Gemini Multimodal Live voice-to-voice service and handles media device management, audio/video streams, and connection state.

Features

  • Real-time bidirectional communication with Gemini Multimodal Live
  • Audio streaming support
  • Text message support
  • Automatic reconnection handling
  • Configurable generation parameters
  • Support for initial conversation context

Usage

Basic Setup

import { RTVIClientOptions } from '@pipecat-ai/client-js';
import { GeminiLiveWebsocketTransport, GeminiLLMServiceOptions } from '@pipecat-ai/gemini-live-websocket-transport';

const options: GeminiLLMServiceOptions = {
  api_key: 'YOUR_API_KEY',
  generation_config: {
    temperature: 0.7,
    maxOutput_tokens: 1000
  }
};

// Pass the transport to the RTVI client through its options
const transport = new GeminiLiveWebsocketTransport(options);
const RTVIConfig: RTVIClientOptions = {
  transport,
  ...
};

Configuration Options

interface GeminiLLMServiceOptions {
  api_key: string;                    // Required: Your Gemini API key
  initial_messages?: Array<{          // Optional: Initial conversation context
    content: string;
    role: string;
  }>;
  generation_config?: {               // Optional: Generation parameters
    candidate_count?: number;
    maxOutput_tokens?: number;
    temperature?: number;
    top_p?: number;
    top_k?: number;
    presence_penalty?: number;
    frequency_penalty?: number;
    response_modalities?: string;
    speech_config?: {
      voice_config?: {
        prebuilt_voice_config?: {
          voice_name: "Puck" | "Charon" | "Kore" | "Fenrir" | "Aoede";
        };
      };
    };
  };
}
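
As a rough illustration of how these options compose, the sketch below builds a fuller configuration with initial context and a prebuilt voice. GeminiOptionsSketch is a local, trimmed copy of the interface above so that the snippet stands alone. buildOptions is a hypothetical helper, not part of the package, and the "AUDIO" modality value and the "user" role string are assumptions that should be checked against the Gemini documentation.

```typescript
// Local copy of the option shape documented above, trimmed to the fields used here.
type VoiceName = "Puck" | "Charon" | "Kore" | "Fenrir" | "Aoede";

interface GeminiOptionsSketch {
  api_key: string;
  initial_messages?: Array<{ content: string; role: string }>;
  generation_config?: {
    temperature?: number;
    response_modalities?: string;
    speech_config?: {
      voice_config?: {
        prebuilt_voice_config?: { voice_name: VoiceName };
      };
    };
  };
}

// Hypothetical helper (not part of the package): assembles options for a voice session.
function buildOptions(
  apiKey: string,
  voice: VoiceName,
  greeting?: string
): GeminiOptionsSketch {
  const built: GeminiOptionsSketch = {
    api_key: apiKey,
    generation_config: {
      temperature: 0.7,
      response_modalities: "AUDIO", // assumed value
      speech_config: {
        voice_config: { prebuilt_voice_config: { voice_name: voice } },
      },
    },
  };
  if (greeting !== undefined) {
    // The "user" role string is an assumption.
    built.initial_messages = [{ content: greeting, role: "user" }];
  }
  return built;
}

const sketchOptions = buildOptions("YOUR_API_KEY", "Kore", "Hello!");
```

If the real GeminiLLMServiceOptions accepts this shape, the result could be passed to the GeminiLiveWebsocketTransport constructor in the same way as in Basic Setup.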

Sending Messages

// Send text prompt message
rtviClient.sendMessage({
  type: 'send-text',
  data: 'Hello, Gemini!'
});

Handling Events

The transport implements the various RTVI event handlers. Check out the docs or samples for more info.

API Reference

Methods

  • initialize(): Set up the transport and establish connection
  • sendMessage(message): Send a text message
  • handleUserAudioStream(data): Stream audio data to the model
  • disconnectLLM(): Close the connection
  • sendReadyMessage(): Signal ready state

States

The transport can be in one of the following states:

  • "disconnected"
  • "initializing"
  • "initialized"
  • "connecting"
  • "connected"
  • "ready"
  • "disconnecting"
  • "error"
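
Application code that reacts to these states can model them as a string-literal union so that the compiler catches misspelled names. The following is a minimal sketch: TransportStateSketch, canSend, and describeState are hypothetical names, and treating "ready" as the only state in which messages may be sent is an assumption rather than documented behavior.

```typescript
// The state names listed above, as a string-literal union.
type TransportStateSketch =
  | "disconnected"
  | "initializing"
  | "initialized"
  | "connecting"
  | "connected"
  | "ready"
  | "disconnecting"
  | "error";

// Hypothetical helper: only "ready" is treated as safe for sending (an assumption).
function canSend(state: TransportStateSketch): boolean {
  return state === "ready";
}

// Hypothetical helper: maps each state to a label for a status indicator.
function describeState(state: TransportStateSketch): string {
  switch (state) {
    case "disconnected":
      return "Not connected";
    case "initializing":
    case "initialized":
      return "Preparing";
    case "connecting":
    case "connected":
      return "Connecting";
    case "ready":
      return "Ready";
    case "disconnecting":
      return "Closing connection";
    case "error":
      return "Something went wrong";
  }
}
```

Because the switch covers every member of the union, adding a new state to the type makes the compiler flag describeState until the new case is handled.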

Error Handling

The transport includes comprehensive error handling for:

  • Connection failures
  • Websocket errors
  • API key validation
  • Message transmission errors
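
The list above does not say how these errors are surfaced to the caller, so the sketch below shows only caller-side guards that could sit around the transport. Both checkApiKey and runGuarded are hypothetical helpers, not part of the package.

```typescript
// Hypothetical pre-flight check: rejects obviously unusable keys
// before any connection attempt is made.
function checkApiKey(apiKey: string | undefined): string {
  const trimmed = (apiKey ?? "").trim();
  if (trimmed === "" || trimmed === "YOUR_API_KEY") {
    throw new Error("Gemini API key is missing or still set to the placeholder");
  }
  return trimmed;
}

// Hypothetical wrapper: runs an async step and converts any thrown value
// into a message string; resolves to null when the step succeeds.
async function runGuarded(step: () => Promise<void>): Promise<string | null> {
  try {
    await step();
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}
```

A caller might wrap its connect step in runGuarded and show the returned message in the UI, although the exact method to wrap depends on the client API in use.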

License

BSD-2-Clause

Contributing

Feel free to submit issues and pull requests for improvements or bug fixes. Be nice :)

FAQs

Package last updated on 17 Dec 2024
