
llmpool: production-ready LLM API pool manager with load balancing, failover, and dynamic configuration
A fault-tolerant Node.js library for managing multiple LLM API providers with intelligent load balancing, automatic failover, and dynamic configuration management.
🚀 Multi-Provider Support
⚖️ Intelligent Load Balancing
🔄 Automatic Failover
📊 Advanced Rate Limiting
🛡️ Fault Tolerance
⚙️ Dynamic Configuration
🖼️ Multi-Modal Support
📈 Comprehensive Monitoring
npm install llmpool
Create a config.json file:
{
  "providers": [
    {
      "name": "groq-primary",
      "type": "groq",
      "api_key": "your-groq-api-key",
      "base_url": "https://api.groq.com/openai/v1",
      "model": "mixtral-8x7b-32768",
      "priority": 1,
      "requests_per_minute": 30,
      "requests_per_day": 1000
    },
    {
      "name": "openai-fallback",
      "type": "openai",
      "api_key": "your-openai-api-key",
      "base_url": "https://api.openai.com/v1",
      "model": "gpt-4",
      "priority": 2,
      "requests_per_minute": 100,
      "requests_per_day": 5000
    }
  ]
}
const { LLMPool, createTextMessage } = require('llmpool');

async function main() {
  // Initialize pool
  const pool = new LLMPool({
    configPath: './config.json'
  });
  await pool.initialize();

  // Send chat request
  const response = await pool.chat({
    messages: [
      createTextMessage('system', 'You are a helpful assistant.'),
      createTextMessage('user', 'What is the capital of France?')
    ],
    temperature: 0.7,
    max_tokens: 1000
  });

  console.log('Response:', response.content);
  console.log('Provider:', response.provider);
  console.log('Tokens used:', response.usage.total_tokens);

  await pool.shutdown();
}

main().catch(console.error);
const { createImageMessage } = require('llmpool');

const response = await pool.chat({
  messages: [
    createImageMessage(
      'user',
      'What do you see in this image?',
      'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEASABIAAD...'
    )
  ]
});
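If the image is stored locally, the data URL can be built with Node's built-in fs module. A minimal sketch (the file path is illustrative):

const fs = require('fs');

// Read a local image and encode it as a base64 data URL
const imageBase64 = fs.readFileSync('./photo.jpg').toString('base64');
const imageDataUrl = `data:image/jpeg;base64,${imageBase64}`;

// Pass imageDataUrl as the third argument to createImageMessage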
const pool = new LLMPool({
  configUrl: 'https://your-domain.com/llm-config.json',
  checkInterval: 300000 // Check for updates every 5 minutes
});

pool.on('configChanged', (config) => {
  console.log('Configuration updated automatically');
});
pool.on('requestSuccess', (event) => {
  console.log(`✅ ${event.provider} succeeded on attempt ${event.attempt}`);
});

pool.on('requestError', (event) => {
  console.log(`❌ ${event.provider} failed: ${event.error}`);
});

pool.on('providersUpdated', (providers) => {
  console.log(`Updated ${providers.length} providers`);
});
// Get overall pool health
const health = pool.getPoolHealth();
console.log(`Available: ${health.availableProviders}/${health.totalProviders}`);

// Get detailed provider statistics
const stats = pool.getProviderStats();
Object.entries(stats).forEach(([name, stat]) => {
  console.log(`${name}:`);
  console.log(`  Success Rate: ${stat.performance.successRate.toFixed(2)}%`);
  console.log(`  Avg Response Time: ${stat.performance.averageResponseTime}ms`);
  console.log(`  Total Cost: $${stat.usage.totalCost.toFixed(4)}`);
});
const pool = new LLMPool({
  // Configuration source (choose one)
  configPath: './config.json',                  // Local file path
  configUrl: 'https://example.com/config.json', // Remote URL

  // Behavior settings
  timeout: 30000,          // Request timeout (ms)
  maxRetries: 3,           // Maximum retry attempts
  retryDelay: 1000,        // Initial retry delay (ms)
  checkInterval: 300000,   // Config check interval (ms)
  useTokenCounting: true   // Enable token estimation
});
{
  "name": "provider-name",          // Unique identifier
  "type": "openai",                 // Provider type
  "api_key": "your-api-key",        // API authentication
  "base_url": "https://api.openai.com/v1",
  "model": "gpt-4",                 // Model to use
  "priority": 1,                    // Selection priority (lower = higher priority)

  // Rate limiting
  "requests_per_minute": 100,       // RPM limit
  "requests_per_day": 5000,         // Daily limit

  // Circuit breaker
  "circuit_breaker_threshold": 5,   // Failure threshold
  "circuit_breaker_timeout": 60000, // Recovery timeout (ms)

  // Request defaults
  "max_tokens": 4096,               // Default max tokens
  "temperature": 0.7,               // Default temperature
  "timeout": 30000,                 // Request timeout (ms)

  // Cost tracking (optional)
  "input_token_price": 0.03,        // Cost per 1K input tokens
  "output_token_price": 0.06        // Cost per 1K output tokens
}
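With the prices above, a request that uses 500 input tokens and 200 output tokens would be tracked at (500/1000) × $0.03 + (200/1000) × $0.06 = $0.027.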
| Provider | Type | Base URL |
|---|---|---|
| OpenAI | openai | https://api.openai.com/v1 |
| Gemini | gemini | https://generativelanguage.googleapis.com/v1beta/openai |
| Anthropic | anthropic | https://api.anthropic.com/v1 |
| Groq | groq | https://api.groq.com/openai/v1 |
| Together AI | together | https://api.together.xyz/v1 |
| Cohere | cohere | https://api.cohere.ai/v1 |
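For example, a config that prefers Gemini and falls back to Together AI could combine two rows from this table (the API keys and model names below are placeholders):

{
  "providers": [
    {
      "name": "gemini-primary",
      "type": "gemini",
      "api_key": "your-gemini-api-key",
      "base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
      "model": "gemini-1.5-flash",
      "priority": 1
    },
    {
      "name": "together-fallback",
      "type": "together",
      "api_key": "your-together-api-key",
      "base_url": "https://api.together.xyz/v1",
      "model": "meta-llama/Llama-3-70b-chat-hf",
      "priority": 2
    }
  ]
}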
The library provides specific error types for different scenarios:
const {
  ProviderError,
  RateLimitError,
  ConfigurationError
} = require('llmpool');

try {
  const response = await pool.chat({ messages });
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log(`Rate limited by ${error.provider}, retry in ${error.resetTime}s`);
  } else if (error instanceof ProviderError) {
    console.log(`Provider ${error.provider} failed: ${error.message}`);
    if (error.retryable) {
      // Can retry with a different provider
    }
  } else if (error instanceof ConfigurationError) {
    console.log(`Configuration issue: ${error.message}`);
  }
}
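Because the pool already fails over between providers internally, application-level retries are mainly useful for retryable errors that have exhausted every provider. A minimal sketch (chatWithRetry and its delay are illustrative, not part of the library):

// Hypothetical helper: retry once after a short delay on retryable errors
async function chatWithRetry(pool, request, delayMs = 2000) {
  try {
    return await pool.chat(request);
  } catch (error) {
    if (error instanceof ProviderError && error.retryable) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
      return pool.chat(request); // A second failure propagates to the caller
    }
    throw error;
  }
}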
Run the test suite:
npm test
Run specific test categories:
# Unit tests only
npm test -- --testNamePattern="LLMPool|Provider|ConfigManager"
# Integration tests
npm test -- --testNamePattern="Integration"
# Performance tests
npm test -- --testNamePattern="Performance"
The pool handles concurrent requests efficiently:
// Process multiple requests simultaneously
const promises = requests.map(request =>
  pool.chat({ messages: request.messages })
);
const results = await Promise.allSettled(promises);
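Promise.allSettled never rejects, so successes and failures can be separated afterwards:

// Split settled results into responses and errors
const succeeded = results
  .filter(r => r.status === 'fulfilled')
  .map(r => r.value);
const failed = results
  .filter(r => r.status === 'rejected')
  .map(r => r.reason);

console.log(`${succeeded.length} succeeded, ${failed.length} failed`);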
// Use environment variables
const config = {
  providers: [{
    name: 'openai',
    type: 'openai',
    api_key: process.env.OPENAI_API_KEY,
    // ... other config
  }]
};
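Failing fast when a required key is missing avoids confusing provider errors at request time. A minimal sketch:

// Verify required secrets before constructing the pool
if (!process.env.OPENAI_API_KEY) {
  throw new Error('OPENAI_API_KEY environment variable is not set');
}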
All requests are validated before they are sent. For production use, also set up periodic monitoring of pool health:
// Set up periodic monitoring
setInterval(() => {
  const health = pool.getPoolHealth();
  const stats = pool.getProviderStats();

  // Log metrics to your monitoring system
  console.log('Pool Health:', health);

  // Alert on issues
  if (!health.healthy) {
    console.warn('🚨 Pool unhealthy - no available providers');
  }

  Object.entries(stats).forEach(([name, stat]) => {
    if (stat.performance.successRate < 90) {
      console.warn(`⚠️ ${name} has low success rate: ${stat.performance.successRate}%`);
    }
  });
}, 30000);
The library emits structured events that can be integrated with monitoring tools:
// Prometheus metrics example
pool.on('requestSuccess', (event) => {
  prometheus.requestsTotal
    .labels({ provider: event.provider, status: 'success' })
    .inc();
});

pool.on('requestError', (event) => {
  prometheus.requestsTotal
    .labels({ provider: event.provider, status: 'error' })
    .inc();
});
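The prometheus object above is assumed to wrap a metrics client. With prom-client, for example, the counter could be defined like this (the metric name is illustrative):

const client = require('prom-client');

// Counter keyed by provider and outcome, registered on the default registry
const prometheus = {
  requestsTotal: new client.Counter({
    name: 'llmpool_requests_total',
    help: 'LLM requests processed, by provider and status',
    labelNames: ['provider', 'status']
  })
};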
Common issues include no available providers, high failure rates, and configuration that fails to update.
Enable verbose logging:
const pool = new LLMPool({
  configPath: './config.json',
  debug: true
});

pool.on('debug', (message) => {
  console.log('DEBUG:', message);
});
Implement regular health checks:
async function healthCheck() {
  const health = pool.getPoolHealth();

  if (!health.healthy) {
    throw new Error('LLM Pool is unhealthy');
  }

  return {
    status: 'healthy',
    providers: health.availableProviders,
    total: health.totalProviders
  };
}
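This fits naturally behind an HTTP endpoint for load balancers or orchestrators such as Kubernetes. A sketch using Express (the route and port are illustrative):

const express = require('express');
const app = express();

// Respond 200 while providers are available, 503 otherwise
app.get('/healthz', async (req, res) => {
  try {
    res.json(await healthCheck());
  } catch (err) {
    res.status(503).json({ status: 'unhealthy', error: err.message });
  }
});

app.listen(3000);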
To set up a development environment:

git clone https://github.com/KTBsomen/llmpool.git
cd llmpool
npm install
npm test
MIT License - see LICENSE file for details.
For more examples and advanced usage patterns, see the examples directory.