
gemini-realtime-stream
Google Gemini AI real-time streaming with audio processing capabilities
A Model Context Protocol (MCP) server that provides real-time streaming capabilities with Google's Gemini AI models, including live audio/video processing, function calling, and bidirectional WebSocket communication.
start_realtime_session
Initialize a real-time streaming session with Gemini Live API.
Parameters:
model (string, optional): Gemini model to use (default: "gemini-2.0-flash-exp")
voice (string, optional): Voice configuration for audio output
system_instruction (string, optional): System instructions for the model
tools (array, optional): Available tools for function calling

send_realtime_message
Send a message to an active real-time session.
Parameters:
session_id (string): Active session identifier
content (string): Message content to send
content_type (string, optional): Content type (default: "text")

stream_audio_input
Stream audio input to the real-time session.
Parameters:
session_id (string): Active session identifier
audio_data (string): Base64-encoded audio data
format (string, optional): Audio format (default: "pcm16")
sample_rate (number, optional): Sample rate in Hz (default: 16000)

capture_screen_stream
Capture and stream screen content to the session.
Parameters:
session_id (string): Active session identifier
region (object, optional): Screen region to capture
quality (string, optional): Capture quality ("high", "medium", "low")

get_session_status
Retrieve the current status of a real-time session.
Parameters:
session_id (string): Session identifier to check

end_realtime_session
Terminate an active real-time streaming session.
Parameters:
session_id (string): Session identifier to terminate

list_active_sessions
List all currently active real-time sessions.
Parameters: None
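
The documented defaults and required parameters can be made explicit on the client side. The following is a minimal sketch, not the server's own code: `buildToolArguments`, `TOOL_DEFAULTS`, and `NEEDS_SESSION_ID` are hypothetical names, and the server presumably applies the same defaults itself when a parameter is omitted.

```javascript
// Hypothetical client-side helper; the values mirror the tool reference above.
const TOOL_DEFAULTS = {
  start_realtime_session: { model: "gemini-2.0-flash-exp" },
  send_realtime_message: { content_type: "text" },
  stream_audio_input: { format: "pcm16", sample_rate: 16000 },
};

// Tools whose reference entry lists session_id as a parameter.
const NEEDS_SESSION_ID = new Set([
  "send_realtime_message",
  "stream_audio_input",
  "capture_screen_stream",
  "get_session_status",
  "end_realtime_session",
]);

function buildToolArguments(toolName, args = {}) {
  if (NEEDS_SESSION_ID.has(toolName) && typeof args.session_id !== "string") {
    throw new TypeError(`${toolName} requires a string session_id`);
  }
  // Caller-supplied values take precedence over the documented defaults.
  return { ...(TOOL_DEFAULTS[toolName] || {}), ...args };
}
```

Calling `buildToolArguments("stream_audio_input", { session_id, audio_data })` would return the arguments with `format` and `sample_rate` filled in, which makes the effective request easier to log and inspect.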
npm install
npm run build
export GEMINI_API_KEY="your-api-key-here"
Add the server to your MCP client configuration:
{
  "mcpServers": {
    "gemini-realtime-stream": {
      "command": "node",
      "args": ["/path/to/gemini-realtime-stream/dist/gemini-realtime-stream.js"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
// Start a new session
const session = await startRealtimeSession({
  model: "gemini-2.0-flash-exp",
  system_instruction: "You are a helpful AI assistant."
});
// Send a message
await sendRealtimeMessage({
  session_id: session.id,
  content: "Hello, how are you today?"
});
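
The helper functions in the example above are not defined in this README. In an MCP client, tools are invoked through JSON-RPC 2.0 `tools/call` requests, so the same two steps can be sketched at the protocol level. This is an illustration under assumptions: `toolCallRequest` is a hypothetical helper, and the session identifier is left as a placeholder because the shape of the server's response is not documented here.

```javascript
let nextRequestId = 1;

// Build an MCP "tools/call" request (JSON-RPC 2.0) for a named tool.
function toolCallRequest(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Request that starts a session.
const startRequest = toolCallRequest("start_realtime_session", {
  model: "gemini-2.0-flash-exp",
  system_instruction: "You are a helpful AI assistant.",
});

// Request that sends a message; the placeholder stands in for the real session id.
const sendRequest = toolCallRequest("send_realtime_message", {
  session_id: "<id returned by start_realtime_session>",
  content: "Hello, how are you today?",
});

console.log(JSON.stringify(startRequest, null, 2));
console.log(JSON.stringify(sendRequest, null, 2));
```

In practice an MCP client library builds and sends these requests for you; the sketch only shows what travels over the transport.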
// Start session with voice capabilities
const voiceSession = await startRealtimeSession({
  model: "gemini-2.0-flash-exp",
  voice: "Aoede"
});
// Stream audio input
await streamAudioInput({
  session_id: voiceSession.id,
  audio_data: base64AudioData,
  format: "pcm16",
  sample_rate: 16000
});
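
The `base64AudioData` value in the example above has to come from somewhere. A minimal sketch of producing it from floating-point samples in Node.js follows. `floatToPcm16Base64` is a hypothetical helper, and the little-endian byte order is an assumption about what the server expects for `pcm16`; the README does not state it.

```javascript
// Convert samples in [-1, 1] to base64-encoded 16-bit PCM.
// Assumption: little-endian byte order for the "pcm16" format.
function floatToPcm16Base64(samples) {
  const buffer = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range input so it cannot overflow a 16-bit integer.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Scale so that -1 maps to -32768 and +1 maps to 32767.
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    buffer.writeInt16LE(Math.round(value), i * 2);
  }
  return buffer.toString("base64");
}

// 100 ms of a 440 Hz tone at the documented default sample rate.
const sampleRate = 16000;
const tone = Float32Array.from({ length: sampleRate / 10 }, (_, i) =>
  Math.sin((2 * Math.PI * 440 * i) / sampleRate)
);
const base64AudioData = floatToPcm16Base64(tone);
```

Audio captured at a different rate would need to be resampled first, or the actual rate passed as `sample_rate`.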
// Capture and stream screen content
await captureScreenStream({
  session_id: session.id,
  region: { x: 0, y: 0, width: 1920, height: 1080 },
  quality: "high"
});
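
Before sending a capture request, the arguments can be checked against what the tool reference documents. The sketch below is a hypothetical pre-flight check, not part of the server: `checkCaptureArgs` is an invented name, and the `x`, `y`, `width`, `height` region shape is taken from the example above rather than from a documented schema.

```javascript
// Quality values listed in the capture_screen_stream reference.
const QUALITIES = ["high", "medium", "low"];

// Return a list of problems found in the arguments; an empty list means none.
function checkCaptureArgs({ session_id, region, quality } = {}) {
  const problems = [];
  if (typeof session_id !== "string" || session_id === "") {
    problems.push("session_id must be a non-empty string");
  }
  if (quality !== undefined && !QUALITIES.includes(quality)) {
    problems.push(`quality must be one of: ${QUALITIES.join(", ")}`);
  }
  if (region !== undefined) {
    const { x, y, width, height } = region;
    if (![x, y, width, height].every(Number.isFinite)) {
      problems.push("region needs numeric x, y, width, and height");
    } else if (width <= 0 || height <= 0) {
      problems.push("region width and height must be positive");
    }
  }
  return problems;
}
```

Returning a list rather than throwing on the first problem lets a caller report every issue at once.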
The server provides comprehensive error handling.
@modelcontextprotocol/sdk: MCP SDK for server implementation
@google/generative-ai: Google Generative AI SDK
ws: WebSocket library for real-time communication
This project is licensed under the MIT License.
Contributions are welcome! Please read the contributing guidelines before submitting pull requests.
For issues and questions, please use the GitHub issue tracker.
We found that gemini-realtime-stream demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.