# ModelForge

A Python library for managing LLM providers, authentication, and model selection with seamless LangChain integration.

**🚀 Version 2.2.0 - Enhanced Model Metadata, Improved Telemetry, and Quiet Mode!**
## Installation

### Recommended: Virtual Environment

```bash
python -m venv model-forge-env
source model-forge-env/bin/activate
pip install model-forge-llm
modelforge --help
```

### Quick Install (System-wide)

```bash
pip install model-forge-llm
```
## Quick Start

### Option 1: GitHub Copilot via Device Authentication Flow

```bash
modelforge models list --provider github_copilot
modelforge auth login --provider github_copilot
modelforge config use --provider github_copilot --model claude-3.7-sonnet
modelforge test --prompt "Write a Python function to reverse a string"
```

### Option 2: OpenAI (API Key Required)

```bash
modelforge auth login --provider openai --api-key YOUR_API_KEY
modelforge config use --provider openai --model gpt-4o-mini
modelforge test --prompt "Hello, world!"
```

### Option 3: Local Ollama (No API Key Needed)

```bash
modelforge config add --provider ollama --model qwen3:1.7b
modelforge config use --provider ollama --model qwen3:1.7b
modelforge test --prompt "What is machine learning?"
```
## Common Commands - Complete Lifecycle

```bash
# Get help
modelforge --help

# Show current configuration
modelforge config show

# Discover models
modelforge models list
modelforge models search "claude"
modelforge models info --provider openai --model gpt-4o

# Authenticate
modelforge auth login --provider openai --api-key KEY
modelforge auth login --provider github_copilot
modelforge auth status
modelforge auth logout --provider openai

# Manage model configuration
modelforge config add --provider openai --model gpt-4o-mini --api-key KEY
modelforge config add --provider ollama --model qwen3:1.7b --local
modelforge config use --provider openai --model gpt-4o-mini
modelforge config remove --provider openai --model gpt-4o-mini

# Test prompts
modelforge test --prompt "Hello, how are you?"
modelforge test --prompt "Explain quantum computing" --verbose
modelforge test --input-file prompt.txt --output-file response.txt
echo "What is AI?" | modelforge test
modelforge test --prompt "Hello" --no-telemetry
modelforge test --prompt "What is 2+2?" --quiet
echo "Hello" | modelforge test --quiet > output.txt

# Refresh the model cache
modelforge models list --refresh

# Telemetry settings
modelforge settings telemetry on
modelforge settings telemetry off
modelforge settings telemetry status
```
## What's New

### v2.2.0 Features

#### 🤫 Quiet Mode for Automation

- `--quiet` flag: Minimal output showing only the model response
- Perfect for piping: Clean output for scripts and automation (see the sketch below)
- Automatic telemetry suppression: No metadata in quiet mode
- Conflict prevention: Cannot be combined with the `--verbose` flag
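For example, a script can capture the quiet-mode output directly. A minimal sketch, assuming `modelforge` is on `PATH` and a model is already configured:

```python
import subprocess

# With --quiet, stdout contains only the model response,
# so it can be captured directly in a script.
result = subprocess.run(
    ["modelforge", "test", "--prompt", "What is 2+2?", "--quiet"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```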
#### 📊 Enhanced Telemetry Display

- Context window tracking: See how much of the model's context you're using
- Token estimation: Automatic estimation for providers that don't report usage
- Capability display: Shows whether the model supports functions, vision, etc.
- Improved formatting: Cleaner, more informative telemetry output

#### 🎯 Enhanced Model Metadata (Opt-in)

- Model capabilities: Access context length, max tokens, supported features
- Cost estimation: Calculate costs before making API calls
- Parameter validation: Automatic validation against model limits
- Backward compatible: Opt-in feature enabled with `enhanced=True`

#### 🔧 Developer Experience

- Logging control: Suppress logs without the `--verbose` flag
- Better error messages: More context and helpful suggestions
- Improved callback handling: Fixed telemetry in enhanced mode
### v2.1.0 Features

#### 🔐 Environment Variable Authentication

- Zero-touch auth for CI/CD pipelines
- Support for all providers via env vars
- Automatic token handling

#### 🌊 Streaming Support

- Real-time response streaming
- Automatic auth refresh during streams
- CLI and API streaming capabilities

### v2.0 Features

#### 🎯 Telemetry & Cost Tracking

- Token usage monitoring: See exactly how many tokens each request uses
- Cost estimation: Real-time cost calculation for supported providers
- For GitHub Copilot: Shows reference costs based on equivalent OpenAI models (subscription-based service)
- Performance metrics: Request duration and model response times
- Configurable display: Enable/disable telemetry output globally or per-command

#### 📥 Flexible Input/Output

- Multiple input sources: Command line, files, or stdin
- File output: Save responses directly to files
- Streaming support: Pipe commands together for automation
- Q&A formatting: Clean, readable output for interactive use

#### 🏗️ Simplified Architecture

- Cleaner codebase: Removed complex decorators and factory patterns
- Direct error handling: Clear, actionable error messages
- Improved test coverage: Comprehensive test suite with 90%+ coverage
- Better maintainability: Simplified patterns for easier contribution

#### 🔧 Enhanced CLI

- Settings management: Global configuration for telemetry and preferences
- Improved error messages: Context and suggestions for common issues
- Better help text: More descriptive command documentation
- Consistent output: Unified formatting across all commands
- Provider name flexibility: Both `github-copilot` and `github_copilot` formats are supported
## Python API

### Basic Usage

```python
from langchain_core.prompts import ChatPromptTemplate

from modelforge.registry import ModelForgeRegistry

# Get the currently selected model as a LangChain-compatible LLM
registry = ModelForgeRegistry()
llm = registry.get_llm()

# Use it in a standard LangChain chain
prompt = ChatPromptTemplate.from_messages([("human", "{input}")])
chain = prompt | llm
response = chain.invoke({"input": "Tell me a joke"})
print(response)
```
### Advanced Usage with Telemetry (NEW in v2.0)

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

from modelforge.registry import ModelForgeRegistry
from modelforge.telemetry import TelemetryCallback, format_metrics

# Attach a telemetry callback to track tokens, duration, and cost
registry = ModelForgeRegistry(verbose=True)
telemetry = TelemetryCallback(provider="openai", model="gpt-4o-mini")
llm = registry.get_llm(
    provider_name="openai",
    model_alias="gpt-4o-mini",
    callbacks=[telemetry],
)

prompt = ChatPromptTemplate.from_template("Explain {topic} in simple terms")
chain = prompt | llm | StrOutputParser()

# Stream the response chunk by chunk
for chunk in chain.stream({"topic": "quantum computing"}):
    print(chunk, end="", flush=True)

# Batch several prompts in one call
questions = [
    "What is machine learning?",
    "Explain neural networks",
    "How does backpropagation work?",
]
responses = chain.batch([{"topic": q} for q in questions])

# Inspect the collected metrics
print(f"Tokens used: {telemetry.metrics.token_usage.total_tokens}")
print(f"Duration: {telemetry.metrics.duration_ms:.0f}ms")
print(f"Estimated cost: ${telemetry.metrics.estimated_cost:.6f}")

# Or print a formatted summary
print(format_metrics(telemetry.metrics))
```
### Enhanced Model Metadata (v2.2.0 - Opt-in Feature)

```python
from modelforge import ModelForgeRegistry

registry = ModelForgeRegistry()

# Opt in to enhanced metadata with enhanced=True
llm = registry.get_llm("openai", "gpt-4o", enhanced=True)

# Inspect model capabilities
print(f"Context window: {llm.context_length:,} tokens")
print(f"Max output: {llm.max_output_tokens:,} tokens")
print(f"Supports functions: {llm.supports_function_calling}")
print(f"Supports vision: {llm.supports_vision}")

# Check pricing and estimate cost before making the API call
pricing = llm.pricing_info
print(f"Input cost: ${pricing['input_per_1m']}/1M tokens")
print(f"Output cost: ${pricing['output_per_1m']}/1M tokens")

estimated_cost = llm.estimate_cost(input_tokens=5000, output_tokens=1000)
print(f"Estimated cost for this request: ${estimated_cost:.4f}")

# Parameters are validated against the model's limits
llm.temperature = 0.7
llm.max_tokens = 2000
```
### Configuration Management

```python
from modelforge import config

# Check the currently selected model before using it
current = config.get_current_model()
if not current:
    print("No model selected. Configure with:")
    print("modelforge config add --provider openai --model gpt-4o-mini")
else:
    print(f"Current: {current.get('provider')}/{current.get('model')}")

# Read and update global settings
settings = config.get_settings()
print(f"Telemetry enabled: {settings.get('show_telemetry', True)}")
config.update_setting("show_telemetry", False)
```
### Error Handling

```python
from modelforge.exceptions import ConfigurationError, ProviderError
from modelforge.registry import ModelForgeRegistry

try:
    registry = ModelForgeRegistry()
    llm = registry.get_llm()
    response = llm.invoke("Hello world")
except ConfigurationError as e:
    print(f"Configuration issue: {e}")
    print("Run: modelforge config add --provider PROVIDER --model MODEL")
except ProviderError as e:
    print(f"Provider error: {e}")
    print("Check: modelforge auth status")
```
### Streaming Support (v2.1+)

ModelForge provides enhanced streaming capabilities with automatic authentication handling:

```python
import asyncio
from pathlib import Path

from modelforge.registry import ModelForgeRegistry
from modelforge.streaming import stream, stream_to_file

async def main() -> None:
    registry = ModelForgeRegistry()
    llm = registry.get_llm()
    # provider_data enables automatic auth refresh during long streams
    providers = registry._config.get("providers", {})

    # Stream a response chunk by chunk
    async for chunk in stream(llm, "Write a story about AI",
                              provider_name="openai",
                              provider_data=providers.get("openai")):
        print(chunk, end="", flush=True)

    # Stream directly to a file
    await stream_to_file(llm, "Explain quantum computing",
                         Path("output.txt"),
                         provider_name="github_copilot",
                         provider_data=providers.get("github_copilot"))

asyncio.run(main())
```

**CLI Streaming:**

```bash
modelforge test --prompt "Write a story" --stream
modelforge test --prompt "Explain AI" --stream --output-file response.txt
```

**Key Features:**
- Automatic token refresh for OAuth providers during long streams
- Environment variable authentication support
- Retry on authentication errors
- Progress callbacks and buffering options

**Note on Streaming Behavior:** The actual streaming granularity depends on the provider's API implementation. Some providers (like GitHub Copilot) may return responses in larger chunks rather than token-by-token streaming, while others (like Ollama) support finer-grained streaming.
## Supported Providers

- OpenAI: GPT-4, GPT-4o, GPT-3.5-turbo
- Google: Gemini Pro, Gemini Flash
- Ollama: Local models (Llama, Qwen, Mistral)
- GitHub Copilot: Claude, GPT models via GitHub (use `github_copilot` or `github-copilot`)
## Authentication

ModelForge supports multiple authentication methods:

- API Keys: Stored securely in configuration
- Device Flow: Browser-based OAuth for GitHub Copilot
- No Auth: For local models like Ollama
- Environment Variables: Zero-touch authentication for CI/CD (NEW in v2.1)

### Authentication Methods

```bash
modelforge auth login --provider openai --api-key YOUR_KEY
modelforge auth login --provider github_copilot
modelforge auth status
```
### Environment Variable Support (v2.1+)

For CI/CD and production deployments, you can use environment variables to provide credentials without storing them in configuration files:

```bash
export MODELFORGE_OPENAI_API_KEY="sk-..."
export MODELFORGE_ANTHROPIC_API_KEY="sk-ant-..."
export MODELFORGE_GOOGLE_API_KEY="..."
export MODELFORGE_GITHUB_COPILOT_ACCESS_TOKEN="ghu_..."
modelforge test --prompt "Hello"
```

Environment variables take precedence over stored credentials and eliminate the need for interactive authentication.
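The same zero-touch flow works from the Python API. A minimal sketch, assuming a valid key is available; the `gpt-4o-mini` alias here is illustrative:

```python
import os

from modelforge.registry import ModelForgeRegistry

# Hypothetical CI snippet: in a real pipeline the key would be injected
# by the runner's secret store rather than set in code.
os.environ["MODELFORGE_OPENAI_API_KEY"] = "sk-..."

# No `modelforge auth login` step is needed
registry = ModelForgeRegistry()
llm = registry.get_llm(provider_name="openai", model_alias="gpt-4o-mini")
print(llm.invoke("Hello from CI"))
```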
## Configuration

ModelForge uses a two-tier configuration system:

- Global: `~/.config/model-forge/config.json` (user-wide)
- Local: `./.model-forge/config.json` (project-specific)

Local config takes precedence over global when both exist.
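For orientation, a `config.json` might look roughly like the sketch below. The exact schema is not shown in this README, so treat the nested keys as illustrative assumptions rather than a reference:

```json
{
  "providers": {
    "openai": {
      "models": { "gpt-4o-mini": {} }
    }
  },
  "settings": { "show_telemetry": true }
}
```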
## Model Discovery

```bash
modelforge models list
modelforge models search "gpt"
modelforge models info --provider openai --model gpt-4o
```
## Development Setup

For contributors and developers:

```bash
git clone https://github.com/smiao-icims/model-forge.git
cd model-forge
./setup.sh
uv sync --extra dev
uv run pytest
```

Requirements:

- Python 3.11+
- uv (modern Python package manager)

See CONTRIBUTING.md for detailed development guidelines.
## License

MIT License - see the LICENSE file for details.