
@juspay/vokal
Production voice bot framework with TTS, STT, and AI evaluation using Neurolink
A production-ready voice bot testing and interaction framework with streaming Speech-to-Text, Text-to-Speech, and AI-powered evaluation
Vokal is a TypeScript framework for building, testing, and evaluating voice-based applications. It provides a provider-agnostic architecture for Speech-to-Text, Text-to-Speech, and AI-powered evaluation services. It currently supports Google Cloud providers (Speech-to-Text, Text-to-Speech via the Neurolink SDK, and Gemini AI), and its extensible design allows additional provider integrations.
Perfect for:
node -v # Should be 18.x or higher
npm -v # Should be 8.x or higher
npm install @juspay/vokal
Or clone and build from source:
git clone https://github.com/juspay/vokal.git
cd vokal
npm install
npm run build
Create a .env file in your project root:
# Option 1: Service Account (Recommended - Full Features)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# Option 2: API Key (Limited Features)
GOOGLE_AI_API_KEY=your_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
💡 Tip: Service account authentication provides access to advanced features like configurable VAD timeouts and enhanced STT capabilities.
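As a minimal sketch of how an application might check which of the two options is configured before it starts, the helper below inspects an environment map. The name `detectAuthMode`, the `AuthMode` type, and the rule that the service account takes precedence are assumptions made for illustration and are not part of Vokal's API; only the variable names come from the `.env` example above.

```typescript
// Hypothetical helper, not part of @juspay/vokal.
type AuthMode = 'service-account' | 'api-key' | 'none';

function detectAuthMode(env: Record<string, string | undefined>): AuthMode {
  // Option 1: a service account path is checked first (assumed precedence).
  if (env['GOOGLE_APPLICATION_CREDENTIALS']?.trim()) {
    return 'service-account';
  }
  // Option 2: either API key variable counts as API-key authentication.
  if (env['GOOGLE_AI_API_KEY']?.trim() || env['GEMINI_API_KEY']?.trim()) {
    return 'api-key';
  }
  // Nothing usable is set (empty or whitespace-only values are ignored).
  return 'none';
}
```

In a Node.js program this could be called as `detectAuthMode(process.env)` to fail early with a clear message when no credentials are present.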
import { createVoiceTest } from '@juspay/vokal';
const voiceTest = createVoiceTest();
// Generate and save speech
const audioPath = await voiceTest.generateSpeech({
  text: "Welcome to Vokal! Your voice testing framework.",
  languageCode: 'en-US',
  voiceName: 'en-US-Neural2-F'
});
console.log('Audio saved to:', audioPath);
import { VoiceInteractionService } from '@juspay/vokal';
const voiceBot = new VoiceInteractionService();
// Run complete voice interaction
const result = await voiceBot.runVoiceInteraction(
  "What is your name?",
  {
    language: 'en-US',
    voice: 'en-US-Neural2-D',
    backgroundSound: 'office',
    backgroundVolume: 0.15
  }
);
console.log('User said:', result.transcript);
console.log('Confidence:', result.confidence);
import { VoiceBotTestService } from '@juspay/vokal';
// Run test suite from configuration
const testService = VoiceBotTestService.create('./test-config.json');
const results = await testService.runTestSuite();
console.log(`✅ Pass Rate: ${results.summary.passRate}%`);
console.log(`📊 Average Score: ${results.summary.averageScore}`);
console.log(`📁 Results: ${results.summary.resultsFile}`);
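The summary fields printed above could plausibly be derived from per-question scores as in the sketch below. This is an assumption about how such a summary is computed, not Vokal's actual implementation: the `summarize` function and `SuiteSummary` type are hypothetical, scores are assumed to lie in the range 0 to 1, and a question is assumed to pass when its score meets the `passingScore` setting from the test configuration.

```typescript
// Hypothetical summary calculation, not taken from @juspay/vokal.
interface SuiteSummary {
  passRate: number;     // percentage of questions that passed, 0-100
  averageScore: number; // mean score across all questions
}

function summarize(scores: number[], passingScore: number): SuiteSummary {
  // An empty suite has nothing to average, so report zeros.
  if (scores.length === 0) {
    return { passRate: 0, averageScore: 0 };
  }
  const passed = scores.filter((score) => score >= passingScore).length;
  const total = scores.reduce((sum, score) => sum + score, 0);
  return {
    passRate: (passed / scores.length) * 100,
    averageScore: total / scores.length,
  };
}
```

For example, scores of 1, 0.5, 0.75, and 0.25 with a passing score of 0.7 give a pass rate of 50 and an average score of 0.625.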
| Service | Description | Use Case |
|---|---|---|
| VoiceTestService | Text-to-Speech with background audio via Neurolink | Generate test audio with realistic environments |
| VoiceInteractionService | Complete TTS + Listen + STT pipeline | Full conversation simulation |
| VoiceBotTestService | Automated test suite execution | Test multiple scenarios with AI evaluation |
| AIComparisonService | AI-powered response evaluation | Semantic answer validation using Gemini |
| AudioMixerService | Background audio mixing | Add realistic noise to test scenarios |
| AudioRecordingService | Microphone recording via naudiodon | Capture user responses |
| STTHandlerManager | Provider-agnostic STT management | Unified interface for multiple STT providers |
Vokal includes a comprehensive command-line interface:
# Basic TTS generation
vokal voice generate "Hello, world!" --voice en-US-Neural2-F --lang en-US
# With background audio
vokal voice generate "Welcome" --voice en-US-Neural2-D --lang en-US --bg cafe --bgvol 0.2 --play
# Advanced settings
vokal voice generate "Fast speech" --voice en-US-Neural2-A --rate 1.5 --pitch 5.0 --output speech.mp3
# List all voices
vokal voices
# Filter by language
vokal voices en-US
# JSON output
vokal voices en-IN --format json
# List available background sounds
vokal backgrounds
# Test system audio capability
vokal test-audio
# Play an audio file
vokal play ./output.wav
# Create sample configuration
vokal test --save-sample
# Run test suite
vokal test ./config.json
# Run with specific provider and debug mode
vokal test --provider google-ai --debug --verbose
# Display comprehensive usage examples
vokal example
Run vokal --help for complete CLI documentation.
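When the CLI is driven from a script, the `voice generate` flags shown above can be assembled programmatically. The sketch below is a hypothetical helper (`buildGenerateArgs` and `GenerateOptions` are not part of Vokal) that only builds the argument list. It uses the flags that appear in the examples above and assumes nothing about other options.

```typescript
// Hypothetical helper: builds argv for `vokal voice generate`.
interface GenerateOptions {
  text: string;
  voice?: string;
  lang?: string;
  bg?: string;
  bgvol?: number;
  rate?: number;
  pitch?: number;
  output?: string;
  play?: boolean;
}

function buildGenerateArgs(options: GenerateOptions): string[] {
  const args: string[] = ['voice', 'generate', options.text];
  // Value flags are appended only when the option is provided.
  if (options.voice !== undefined) args.push('--voice', options.voice);
  if (options.lang !== undefined) args.push('--lang', options.lang);
  if (options.bg !== undefined) args.push('--bg', options.bg);
  if (options.bgvol !== undefined) args.push('--bgvol', String(options.bgvol));
  if (options.rate !== undefined) args.push('--rate', String(options.rate));
  if (options.pitch !== undefined) args.push('--pitch', String(options.pitch));
  if (options.output !== undefined) args.push('--output', options.output);
  // --play is a boolean switch with no value.
  if (options.play) args.push('--play');
  return args;
}
```

Passing the result as an argument array to a process launcher, rather than concatenating a shell string, avoids quoting problems when the text contains spaces or punctuation.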
Create a JSON file to define your test scenarios:
{
  "metadata": {
    "name": "My Voice Bot Tests",
    "version": "1.0.0",
    "description": "Voice bot test suite"
  },
  "settings": {
    "defaultLanguage": "en-US",
    "defaultVoice": "en-US-Neural2-D",
    "recordingDuration": 10000,
    "passingScore": 0.7,
    "sttProvider": "google-ai",
    "ttsProvider": "google-ai",
    "aiProvider": "google-ai",
    "vadSettings": {
      "silenceThreshold": 0.02,
      "silenceDuration": 2000,
      "speechTimeout": 10000
    }
  },
  "questions": [
    {
      "id": "greeting",
      "question": "Hello! How can I help you?",
      "intent": "User greets and asks for help",
      "expectedElements": ["Greeting", "Request for assistance"],
      "sampleResponse": "Hi, I need help with my account"
    }
  ]
}
See examples/sample-config.json for a complete example.
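The shape of the configuration above can be written down as TypeScript types and checked before a run. The sketch below is illustrative only: the type names and `validateConfig` are hypothetical, the field list mirrors the sample JSON rather than Vokal's published types, and the specific rules (a `passingScore` between 0 and 1, a positive `recordingDuration`, unique question ids) are assumptions.

```typescript
// Types mirror the sample JSON above; they are not Vokal's exported types.
interface VadSettings {
  silenceThreshold: number;
  silenceDuration: number;
  speechTimeout: number;
}

interface TestQuestion {
  id: string;
  question: string;
  intent: string;
  expectedElements: string[];
  sampleResponse: string;
}

interface TestConfig {
  metadata: { name: string; version: string; description: string };
  settings: {
    defaultLanguage: string;
    defaultVoice: string;
    recordingDuration: number;
    passingScore: number;
    sttProvider: string;
    ttsProvider: string;
    aiProvider: string;
    vadSettings: VadSettings;
  };
  questions: TestQuestion[];
}

// Returns a list of human-readable problems; an empty list means no issue was found.
function validateConfig(config: TestConfig): string[] {
  const problems: string[] = [];
  const { passingScore, recordingDuration } = config.settings;
  if (passingScore < 0 || passingScore > 1) {
    problems.push('passingScore must be between 0 and 1');
  }
  if (recordingDuration <= 0) {
    problems.push('recordingDuration must be positive');
  }
  if (config.questions.length === 0) {
    problems.push('at least one question is required');
  }
  // Question ids are assumed to be unique so results can be matched back to questions.
  const seen = new Set<string>();
  for (const question of config.questions) {
    if (seen.has(question.id)) {
      problems.push(`duplicate question id: ${question.id}`);
    }
    seen.add(question.id);
  }
  return problems;
}
```

Collecting every problem in one pass, instead of throwing on the first, lets a user fix a configuration file in a single edit.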
Vokal is built with a provider-agnostic architecture using the Handler pattern for extensibility.
Google Cloud (Default)
GoogleAISTTHandler
AIComparisonService
vokal/
├── src/
│   ├── services/                      # Core voice services
│   │   ├── voice-test.ts              # TTS service with Neurolink
│   │   ├── voice-interaction.ts       # Complete pipeline orchestration
│   │   ├── voice-bot-test.ts          # Test suite execution
│   │   ├── ai-comparison.ts           # AI-powered evaluation
│   │   ├── audio-mixer.ts             # Background audio processing
│   │   └── audio-recording.ts         # Microphone capture
│   ├── providers/                     # Provider implementations
│   │   ├── google-ai-stt.handler.ts   # Google Cloud STT
│   │   ├── stt-handler-manager.ts     # Provider manager
│   │   └── stt-registry.ts            # Provider registry
│   ├── types/                         # TypeScript type definitions
│   ├── utils/                         # Utilities (logging, retry, validation, security)
│   ├── constants/                     # Audio configuration constants
│   ├── errors/                        # Custom error classes
│   └── cli/                           # Command-line interface
├── examples/                          # Example configurations
│   ├── sample-config.json             # Test suite example
│   ├── basic-example.js               # Basic usage template
│   └── stt-handler-example.ts         # STT provider example
├── assets/                            # Background audio files
│   ├── office-ambience.wav
│   ├── cafe-ambience.wav
│   ├── nature-sounds.wav
│   ├── rain-light.wav
│   ├── phone-static.wav
│   └── crowd-distant.wav
├── memory-bank/                       # AI assistant context
└── docs/                              # Documentation (coming soon)
// Handler pattern for provider abstraction
interface STTHandler {
  startStreaming(config, onResult, onSpeechStart, onSpeechEnd, onError);
  stopStreaming();
}
// Register providers
STTHandlerManager.registerHandler('google-ai', GoogleAISTTHandler);
// Get provider instance
const handler = STTHandlerManager.getHandler('google-ai');
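To make the register-then-look-up flow concrete, here is a self-contained sketch of the same pattern. It does not use Vokal's classes: `SketchSTTHandler`, `SketchRegistry`, and `FakeSTTHandler` are hypothetical names, and the callback signature is simplified to a single result callback, so the real `STTHandler` parameters may differ.

```typescript
// Illustrative types only; Vokal's real signatures may differ.
type ResultCallback = (transcript: string, isFinal: boolean) => void;

interface SketchSTTHandler {
  startStreaming(onResult: ResultCallback): void;
  stopStreaming(): void;
}

type HandlerConstructor = new () => SketchSTTHandler;

class SketchRegistry {
  private static readonly constructors = new Map<string, HandlerConstructor>();

  // Associate a provider name with a handler class.
  static registerHandler(name: string, ctor: HandlerConstructor): void {
    SketchRegistry.constructors.set(name, ctor);
  }

  // Create a fresh handler instance for the named provider.
  static getHandler(name: string): SketchSTTHandler {
    const Ctor = SketchRegistry.constructors.get(name);
    if (!Ctor) {
      throw new Error(`No STT handler registered for "${name}"`);
    }
    return new Ctor();
  }
}

// A stand-in provider that emits one fixed transcript, useful in unit tests.
class FakeSTTHandler implements SketchSTTHandler {
  private active = false;

  startStreaming(onResult: ResultCallback): void {
    this.active = true;
    onResult('hello world', true);
  }

  stopStreaming(): void {
    this.active = false;
  }

  isActive(): boolean {
    return this.active;
  }
}

SketchRegistry.registerHandler('fake', FakeSTTHandler);
```

Because callers depend only on the interface and a provider name, a new provider can be added by registering another class without touching the code that consumes transcripts.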
Available background sound presets for realistic test environments:
| Sound | Description | Recommended Volume | Use Case |
|---|---|---|---|
| office | Office ambience with typing and quiet chatter | 0.15 | Business applications, productivity bots |
| cafe | Coffee shop atmosphere with ambient noise | 0.20 | Customer service, casual conversations |
| nature | Outdoor setting with birds and gentle wind | 0.18 | Wellness apps, meditation guides |
| rain | Gentle rainfall ambience | 0.12 | Calming applications, sleep aids |
| phone | Phone line static and connection noise | 0.08 | IVR testing, call center simulations |
| crowd | Distant crowd noise and murmurs | 0.10 | Public space simulations, event apps |
All audio files are located in the assets/ directory as WAV files.
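The presets and recommended volumes above can be captured in a lookup table, and the effect of the volume value can be illustrated with simple additive mixing. This is a generic sketch and not Vokal's AudioMixerService: the names `BACKGROUND_PRESETS` and `mixSamples` are hypothetical, the preset-to-file mapping is inferred from the file names in the assets/ directory, and samples are assumed to be floats in the range -1 to 1.

```typescript
interface BackgroundPreset {
  file: string;              // inferred from asset file names (assumption)
  recommendedVolume: number; // taken from the table above
}

const BACKGROUND_PRESETS: Record<string, BackgroundPreset> = {
  office: { file: 'assets/office-ambience.wav', recommendedVolume: 0.15 },
  cafe: { file: 'assets/cafe-ambience.wav', recommendedVolume: 0.2 },
  nature: { file: 'assets/nature-sounds.wav', recommendedVolume: 0.18 },
  rain: { file: 'assets/rain-light.wav', recommendedVolume: 0.12 },
  phone: { file: 'assets/phone-static.wav', recommendedVolume: 0.08 },
  crowd: { file: 'assets/crowd-distant.wav', recommendedVolume: 0.1 },
};

// Adds a scaled background track to speech, looping the background if it is
// shorter than the speech, and clamps the result to the valid sample range.
function mixSamples(
  speech: Float32Array,
  background: Float32Array,
  volume: number,
): Float32Array {
  const mixed = new Float32Array(speech.length);
  for (let i = 0; i < speech.length; i++) {
    const bg =
      background.length > 0
        ? (background[i % background.length] ?? 0) * volume
        : 0;
    const sum = (speech[i] ?? 0) + bg;
    mixed[i] = Math.max(-1, Math.min(1, sum));
  }
  return mixed;
}
```

Low volumes such as 0.08 to 0.20 keep the background well below the speech level, which is consistent with the recommended values in the table.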
Vokal follows security best practices:
validation.ts
secure-exec.ts
# Build the project
npm run build
# Run linting
npm run lint
# Format code
npm run format
# Type checking
npm run typecheck
| Script | Description |
|---|---|
| npm run build | Build TypeScript to JavaScript (dist/) |
| npm run dev | Build in watch mode |
| npm run clean | Clean build directory |
| npm run lint | Lint code with ESLint |
| npm run format | Format code with Prettier |
| npm run typecheck | Run TypeScript type checking |
| npm run prebuild | Format and lint before build |
Contributions are welcome! Please read our Contributing Guide for details.
git checkout -b feature/amazing-feature
git push origin feature/amazing-feature
See CODE_OF_CONDUCT.md for community guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
/docs
Made with ❤️ by the Breeze Team
We found that @juspay/vokal demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 7 open source maintainers collaborating on the project.