LLM Selector
A Python library that uses any LLM (via LiteLLM) to intelligently select the best model for a given task.
Installation
From PyPI (Recommended)
pip install llm-selector
From Source
git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install .
Development Installation
git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install -e ".[test]"
Quick Start
import asyncio
from llm_selector import LLMSelector, LLMSelectorConfig

async def main():
    models = {
        "gpt-4": {
            "description": "OpenAI GPT-4 for complex reasoning",
            "weight": 2.0,
            "cost_per_token": 0.00003
        },
        "claude-3": {
            "description": "Anthropic Claude 3 for balanced tasks",
            "weight": 1.0,
            "cost_per_token": 0.000015
        }
    }

    selector_configs = [
        LLMSelectorConfig("gpt-3.5-turbo", api_key="your-openai-key")
    ]

    selector = LLMSelector(models=models, selector_configs=selector_configs)

    result = await selector.select("Write a complex Python data analysis script")
    print(f"Selected: {result}")

asyncio.run(main())
A fuller example with multiple providers, provider-prefixed model names, caching, and metrics:
import asyncio
from llm_selector import LLMSelector, LLMSelectorConfig

async def main():
    models = {
        "openai/gpt-4": {
            "description": "OpenAI GPT-4 for complex reasoning and coding",
            "weight": 2.0,
            "cost_per_token": 0.00003
        },
        "anthropic/claude-3": {
            "description": "Anthropic Claude 3 for safe and helpful responses",
            "weight": 1.5,
            "cost_per_token": 0.000015
        },
        "cerebras/llama3.3-70b": {
            "description": "Cerebras Llama 3.3 70B for fast inference",
            "weight": 1.0
        },
    }

    selector_configs = [
        LLMSelectorConfig(
            model="openrouter/openai/gpt-oss-20b",
            api_base="https://openrouter.ai/api/v1",
            api_key="env:OPENROUTER_API_KEY"
        ),
        LLMSelectorConfig(
            model="anthropic/claude-3-haiku",
            api_key="dotenv:ANTHROPIC_KEY"
        ),
    ]

    selector = LLMSelector(
        models=models,
        selector_configs=selector_configs,
        cache_enabled=True,
        debug=True
    )

    user_input = "I need to write a complex Python script for data analysis."
    selected_model = await selector.select(user_input)
    print(f"Selected model: {selected_model}")

    metrics = selector.get_metrics()
    print(f"Cache hit rate: {metrics['cache_hit_rate']:.2%}")
    print(f"Total cost: ${selector.get_total_cost():.4f}")

asyncio.run(main())
Advanced Features
Synchronous Interface
selector = LLMSelector(models=models, selector_configs=configs)
result = selector.select_sync("My query")
Caching
selector = LLMSelector(
    models=models,
    cache_enabled=True,
    cache_ttl=3600,
    max_cache_size=1000
)
selector.clear_cache()
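A quick way to observe the cache, assuming the selector above is fully configured (selector_configs omitted here for brevity) and that identical inputs produce cache hits:
# Two identical queries: the second should be answered from the cache.
selector.select_sync("Summarize this quarterly report")
selector.select_sync("Summarize this quarterly report")
print(selector.get_metrics()["cache_hits"])  # expected: 1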
Cost Tracking
selector.set_budget_limit(10.0)
if selector.is_budget_exceeded():
    print("Budget exceeded!")
total_cost = selector.get_total_cost()
cost_by_model = selector.get_metrics()['cost_by_model']
Metrics & Monitoring
metrics = selector.get_metrics()
print(f"""
Requests: {metrics['total_requests']}
Cache hits: {metrics['cache_hits']} ({metrics['cache_hit_rate']:.1%})
Success rate: {metrics['successful_selections']/metrics['total_requests']:.1%}
Average latency: {metrics['average_latency']:.2f}s
Total cost: ${metrics['total_cost']:.4f}
""")
selector.reset_metrics()
Retry Logic
selector = LLMSelector(
    models=models,
    max_retries=3,
    retry_delay=1.0
)
API Reference
LLMSelector
Constructor
LLMSelector(
    models: dict,
    preprompt: str = None,
    selector_configs: list = None,
    default_model: str = None,
    validate_configs: bool = True,
    max_retries: int = 3,
    retry_delay: float = 1.0,
    cache_enabled: bool = True,
    cache_ttl: int = 3600,
    debug: bool = False,
    max_cache_size: int = 1000,
    **litellm_kwargs
)
Methods
- select(input: str) -> str: Async model selection
- select_sync(input: str) -> str: Sync model selection
- get_metrics() -> dict: Get performance metrics
- reset_metrics(): Reset all metrics
- clear_cache(): Clear cached results
- get_total_cost() -> float: Get total API cost
- set_budget_limit(limit: float): Set cost budget
- is_budget_exceeded() -> bool: Check budget status
- validate_configuration() -> bool: Validate current config (see the sketch below)
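Most of these methods appear in the examples above; validate_configuration() does not, so here is a minimal sketch with an already-constructed selector. It assumes the method returns a bool rather than raising when the setup is unusable:
# Check the selector's configuration before serving traffic.
# Assumption: validate_configuration() returns False on an invalid setup instead of raising.
if not selector.validate_configuration():
    raise RuntimeError("LLMSelector configuration is invalid")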
LLMSelectorConfig
Constructor
LLMSelectorConfig(
    model: str,
    api_base: str = None,
    api_key: str = None,
    **kwargs
)
Methods
- validate() -> bool: Validate configuration (example below)
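A minimal sketch of building and validating a single config; the model name and environment variable are placeholders, and it assumes validate() returns a bool:
# Build a config that reads its key from the OPENAI_API_KEY environment variable.
config = LLMSelectorConfig(model="openai/gpt-3.5-turbo", api_key="env:OPENAI_API_KEY")

# Assumption: validate() returns True when the config is usable.
if not config.validate():
    print("Config is incomplete or invalid")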
Model Configuration Format
models = {
    "model_name": {
        "description": "Human-readable description",
        "weight": 1.0,
        "cost_per_token": 0.00001
    }
}
Configuration
API Key Sources
API keys can be specified in three ways, as illustrated in the sketch after this list:
- Plain text: "sk-your-api-key"
- Environment variable: "env:VARIABLE_NAME"
- .env file: "dotenv:KEY_NAME"
Environment Variables
Create a .env file in your project root:
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-key
Model Weights
Model weights influence selection probability:
- Higher weights (> 1.0) increase selection chance
- Lower weights (< 1.0) decrease selection chance
- Weights are relative to other models (see the example below)
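An illustrative models dict (names and values are placeholders): with the weights below, "big-model" is favored over "small-model", all else being equal:
models = {
    "big-model":   {"description": "Strong general-purpose model", "weight": 2.0},
    "mid-model":   {"description": "Balanced default model",       "weight": 1.0},
    "small-model": {"description": "Cheap model for simple tasks", "weight": 0.5},
}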
Deployment
Docker
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
RUN pip install .
CMD ["python", "your_app.py"]
Production Considerations
- Set an appropriate max_cache_size based on memory constraints
- Configure cache_ttl based on how often model preferences change
- Use environment variables for API keys in production
- Monitor metrics regularly for performance optimization
- Set budget limits to control costs (these settings are combined in the sketch after this list)
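A sketch pulling these recommendations together; all values are illustrative, and the model and key names are placeholders:
selector = LLMSelector(
    models=models,
    selector_configs=[
        LLMSelectorConfig(model="openai/gpt-4", api_key="env:OPENAI_API_KEY"),
    ],
    cache_enabled=True,
    cache_ttl=900,          # refresh cached selections every 15 minutes
    max_cache_size=500,     # bound the memory used by the cache
    max_retries=3,
    retry_delay=1.0,
)
selector.set_budget_limit(25.0)  # ceiling on estimated API spend, in dollars (see get_total_cost())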
Model Configuration
Each model supports:
- description: Human-readable description
- weight: Selection weight (default: 1.0)
- cost_per_token: Cost estimation (optional)
LLMSelectorConfig Parameters
- model: The model name (e.g., "openrouter/openai/gpt-4")
- api_base: Custom API base URL (optional)
- api_key: API key (plain, env:, or dotenv:)
- **kwargs: Additional litellm parameters (see the sketch below)
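For instance, temperature and max_tokens are standard litellm completion parameters; that the selector forwards extra keyword arguments to litellm unchanged is an assumption in this sketch:
config = LLMSelectorConfig(
    model="openai/gpt-3.5-turbo",
    api_key="env:OPENAI_API_KEY",
    temperature=0.0,   # deterministic selector output
    max_tokens=64,     # keep the selector's own responses short
)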
Testing
The library includes comprehensive unit tests, integration tests, and performance benchmarks.
Install Test Dependencies
pip install -e ".[test]"
Run Tests
# Run the full test suite
pytest

# Unit tests only (skip integration and performance markers)
pytest -m "not integration and not performance"

# Integration tests
pytest -m integration

# Performance benchmarks
pytest -m performance

# Coverage report (HTML output in htmlcov/)
pytest --cov=llm_selector --cov-report=html
Test Coverage
- Unit Tests: Core functionality, error handling, configuration validation
- Integration Tests: Multi-provider scenarios, cost optimization, concurrent load
- Performance Tests: Latency benchmarks, throughput testing, memory usage
- Coverage Goal: 80%+ code coverage with detailed edge case testing
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
git clone https://github.com/YoannDev90/llm-selector.git
cd llm-selector
pip install -e ".[test]"
pre-commit install
Running Tests
# All tests
pytest

# With coverage report
pytest --cov=llm_selector --cov-report=html

# By marker
pytest -m "unit"
pytest -m "integration"
pytest -m "performance"
Changelog
v0.1.0 (Current)
- Initial release with core LLM selection functionality
- Support for multiple LLM providers via LiteLLM
- Intelligent caching with TTL and size limits
- Cost tracking and budget management
- Comprehensive metrics and monitoring
- Retry logic with exponential backoff
- Async and sync interfaces
- Extensive test coverage
Publishing
For information on how to publish new versions to PyPI, see PUBLISHING.md.
The project includes automated publishing via GitHub Actions that triggers on version tags.