VoiceAI SDK

🎙️ The official Node.js/TypeScript SDK for SLNG.AI - Simple, powerful voice AI for developers.


Quick Start

npm install voiceai-sdk

import { VoiceAI, tts, stt, llm } from 'voiceai-sdk';

// Initialize once in your app
new VoiceAI({ 
  apiKey: 'your-api-key' // Get yours at https://slng.ai/signup
});

// Text to Speech
const audio = await tts.synthesize('Hello world', 'orpheus');

// Speech to Text  
const transcript = await stt.transcribe(audioFile, 'whisper-v3');

// LLM Completion
const response = await llm.complete('What is the meaning of life?', 'llama-4-scout');

Why SLNG.AI?

  • 🚀 All Voice AI in One Place - TTS, STT, and LLMs through a single API
  • 🎯 Best Models - Access to Orpheus, ElevenLabs, Whisper, and more
  • 💳 Simple Pricing - Pay-as-you-go with transparent credit system
  • 👩‍💻 Developer First - Clean API, great docs, responsive founders
  • ⚡ Fast Integration - Get started in minutes, not hours
  • 🌍 Multi-language - Support for 29+ languages across models

Installation

npm install voiceai-sdk
# or
yarn add voiceai-sdk
# or
pnpm add voiceai-sdk

Authentication

Get your API key at https://slng.ai/signup

import { VoiceAI } from 'voiceai-sdk';

new VoiceAI({ 
  apiKey: process.env.VOICEAI_API_KEY,
  timeout: 60000 // Optional: custom timeout in ms (default: 30000)
});

Text-to-Speech (TTS)

Simple Usage

import { tts } from 'voiceai-sdk';

// Quick synthesis with model name
const audio = await tts.synthesize('Hello world', 'orpheus');

// Use convenience methods
const audio = await tts.orpheus('Hello world');
const audio = await tts.vui('Hello world');
const audio = await tts.koroko('Hello world');

// Orpheus Indic for Indian languages (Mumbai region - low latency)
const audio = await tts.orpheusIndic('नमस्ते', { language: 'hi' });
const audio = await tts.orpheusIndic('வணக்கம்', { language: 'ta' });

Advanced Options

// With voice and language options
const audio = await tts.orpheus('Bonjour le monde', {
  voice: 'pierre',
  language: 'fr',
  stream: false
});

// ElevenLabs models
const audio = await tts.elevenlabs.multiV2('Hello world', {
  voice: 'Rachel',
  language: 'en',
  stability: 0.5,
  similarity_boost: 0.75
});

// Voice cloning with XTTS
const audio = await tts.xtts('Hello world', {
  speakerVoice: 'base64_encoded_audio', // 6+ seconds of reference audio
  language: 'en'
});

// Voice cloning with MARS6  
const audio = await tts.mars6('Hello world', {
  audioRef: 'base64_encoded_audio',
  language: 'en-us',
  refText: 'Reference transcript' // Optional but recommended
});

Streaming

const stream = await tts.synthesize('Long text...', 'orpheus', {
  stream: true
});

// Handle streaming response
for await (const chunk of stream) {
  // Process audio chunks
}
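If the stream is an async iterable of binary chunks (as the loop above suggests), the chunks can be collected into a single buffer before saving or playing. This is a sketch, not part of the SDK: the exact chunk type depends on your SDK version, so verify it is `Uint8Array`-compatible before relying on `Buffer.concat`.

```typescript
// Collect an async-iterable audio stream into one Buffer.
// Assumes each chunk is a Uint8Array of binary audio data.
async function collectAudio(stream: AsyncIterable<Uint8Array>): Promise<Buffer> {
  const chunks: Uint8Array[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return Buffer.concat(chunks);
}
```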

Available Models

console.log(tts.models);
// ['vui', 'orpheus', 'orpheus-indic', 'koroko', 'xtts-v2', 'mars6', 'elevenlabs/multi-v2', ...]

// Get voices for a model
const voices = tts.getVoices('orpheus');
// ['tara', 'leah', 'jess', 'leo', 'dan', ...]

// Get supported languages
const languages = tts.getLanguages('orpheus');
// ['en', 'fr', 'de', 'ko', 'zh', 'es', 'it', 'hi']

// Orpheus Indic supports 8 major Indian languages
const indicLanguages = tts.getLanguages('orpheus-indic');
// ['hi', 'ta', 'te', 'bn', 'mr', 'gu', 'kn', 'ml']
// Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam

Speech-to-Text (STT)

Basic Transcription

import { stt } from 'voiceai-sdk';

// Transcribe audio file
const result = await stt.transcribe(audioFile, 'whisper-v3');
console.log(result.text);

// Convenience methods
const result = await stt.whisper(audioFile);
const result = await stt.kyutai(audioFile, { language: 'fr' });

With Options

// Whisper with options
const result = await stt.whisper(audioFile, {
  language: 'es',
  timestamps: true,
  diarization: true
});

// Kyutai - optimized for French and English (Mumbai region)
const result = await stt.kyutai(audioFile, {
  language: 'fr',  // 'en' or 'fr' only
  timestamps: true
});

// Access segments with timestamps
result.segments?.forEach(segment => {
  console.log(`[${segment.start}-${segment.end}]: ${segment.text}`);
});

Supported Input Types

// File object (browser)
const file = document.getElementById('audio-input').files[0];
const result = await stt.whisper(file);

// Blob
const blob = new Blob([audioData], { type: 'audio/wav' });
const result = await stt.whisper(blob);

// ArrayBuffer
const buffer = await fetch('audio.mp3').then(r => r.arrayBuffer());
const result = await stt.whisper(buffer);

// Base64 string
const base64Audio = 'data:audio/wav;base64,...';
const result = await stt.whisper(base64Audio);
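In Node.js, a local audio file can be turned into the base64 data-URL form shown above with a small helper. This is a sketch, not an SDK function; the MIME type here is guessed from the file extension, so adjust it for your files.

```typescript
import { readFileSync } from 'node:fs';

// Read a local audio file and encode it as a base64 data URL,
// matching the string form accepted by stt.whisper above.
function fileToBase64DataUrl(path: string, mime = 'audio/wav'): string {
  const data = readFileSync(path);
  return `data:${mime};base64,${data.toString('base64')}`;
}

// const result = await stt.whisper(fileToBase64DataUrl('recording.wav'));
```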

Language Models (LLM)

Simple Completion

import { llm } from 'voiceai-sdk';

// Single prompt
const result = await llm.complete('Explain quantum computing', 'llama-4-scout');
console.log(result.content);

// Convenience method
const result = await llm.llamaScout('What is the speed of light?');

Chat Format

const messages = [
  { role: 'system', content: 'You are a helpful assistant' },
  { role: 'user', content: 'What is the capital of France?' }
];

const result = await llm.llamaScout(messages, {
  temperature: 0.7,
  maxTokens: 500
});

Streaming Responses

const stream = await llm.llamaScout('Write a story...', {
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk);
}

Handling Cold Starts

Some models may take 60-90 seconds to start up on first use. The SDK handles this automatically with:

  • Smart timeouts: Models known to be slow get longer timeouts
  • Clear messages: Timeout errors explain cold starts
  • Warmup utilities: Pre-warm models before use

Pre-warming Models

import { warmup } from 'voiceai-sdk';

// Warm up a single model
await warmup.tts('orpheus');
await warmup.stt('whisper-v3');
await warmup.llm('llama-4-scout');

// Warm up multiple models in parallel
await warmup.multiple([
  { type: 'tts', model: 'orpheus' },
  { type: 'stt', model: 'whisper-v3' },
  { type: 'llm', model: 'llama-4-scout' }
]);

Custom Timeouts

// Global timeout for all requests
new VoiceAI({ 
  apiKey: 'your-key',
  timeout: 120000 // 2 minutes
});

// The SDK automatically uses longer timeouts for known slow models:
// - Orpheus: 90s
// - Orpheus Indic: 90s
// - XTTS-v2: 90s  
// - Whisper-v3: 120s
// - MARS6: 90s

Error Handling

try {
  const audio = await tts.synthesize('Hello', 'orpheus');
} catch (error) {
  if (error.message.includes('Authentication failed')) {
    // Invalid API key
  } else if (error.message.includes('Insufficient credits')) {
    // Need more credits
  } else if (error.message.includes('Rate limit')) {
    // Too many requests
  } else if (error.message.includes('timed out')) {
    // Model may be cold starting - retry in a moment
  }
}
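Since timeout errors often just mean a model is cold starting, a request can be wrapped in a small retry helper. This is a sketch, not part of the SDK: it retries only on errors whose message mentions a timeout, waits a fixed delay between attempts, and rethrows everything else immediately.

```typescript
// Retry an SDK call that may fail while a model cold starts.
// Only timeout-like errors are retried; other errors rethrow at once.
async function withColdStartRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 15000
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      const msg = error instanceof Error ? error.message : String(error);
      if (!msg.includes('timed out')) throw error; // not a cold start
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Usage:
// const audio = await withColdStartRetry(() => tts.synthesize('Hello', 'orpheus'));
```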

Examples

Build a Voice Assistant

import { VoiceAI, tts, stt, llm } from 'voiceai-sdk';

new VoiceAI({ apiKey: process.env.VOICEAI_API_KEY });

async function voiceAssistant(audioInput) {
  // 1. Transcribe user's speech
  const transcript = await stt.whisper(audioInput);
  
  // 2. Generate AI response
  const response = await llm.llamaScout(transcript.text);
  
  // 3. Convert response to speech
  const audio = await tts.orpheus(response.content, {
    voice: 'tara',
    language: 'en'
  });
  
  return audio;
}

Multilingual TTS

const languages = {
  en: 'Hello world',
  fr: 'Bonjour le monde',
  de: 'Hallo Welt',
  es: 'Hola mundo'
};

for (const [lang, text] of Object.entries(languages)) {
  const audio = await tts.orpheus(text, {
    language: lang,
    voice: getVoiceForLanguage(lang)
  });
  // Save or play audio
}
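The loop above calls a `getVoiceForLanguage` helper that the snippet leaves undefined. One possible implementation is a plain lookup table: 'tara' (en) and 'pierre' (fr) appear elsewhere in this README, but the fallback for other languages is a placeholder — pick a real name from `tts.getVoices('orpheus')` for each language you ship.

```typescript
// Hypothetical helper for the multilingual loop above.
// Only the 'en' and 'fr' entries come from this README; the
// fallback is a placeholder, not a verified per-language default.
function getVoiceForLanguage(lang: string): string {
  const voices: Record<string, string> = {
    en: 'tara',
    fr: 'pierre',
  };
  return voices[lang] ?? 'tara';
}
```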

Voice Cloning

// Clone voice with XTTS-v2
const referenceAudio = await loadAudioAsBase64('speaker.wav');

const clonedSpeech = await tts.xtts('This is my cloned voice', {
  speakerVoice: referenceAudio,
  language: 'en'
});

// Clone with MARS6 (supports prosody)
const clonedWithProsody = await tts.mars6('Excited speech!', {
  audioRef: referenceAudio,
  refText: 'This is how I normally speak',
  language: 'en-us',
  temperature: 0.8
});

TypeScript Support

Full TypeScript support with exported types:

import { 
  VoiceAI,
  TTSOptions,
  TTSResult,
  STTOptions,
  STTResult,
  LLMMessage,
  LLMOptions,
  LLMResult 
} from 'voiceai-sdk';

Need Help?

Feedback & Support

We're building this for developers like you. Your feedback matters!

  • 📧 Email founders: hello@slng.ai
  • 🤖 Request models: Need a specific model? Just ask!
  • 🐛 Report issues: hello@slng.ai
  • 💬 Discord: Coming soon!

Contributing

We welcome contributions! Feel free to:

  • Report bugs
  • Suggest new features
  • Submit pull requests
  • Request new models

License

MIT © SLNG.AI

Built with ❤️ by the SLNG.AI team. Making voice AI simple for developers everywhere.

Package last updated on 16 Aug 2025
