AI Tool Calling - Selectools

Build AI agents that can call your custom Python functions and search your knowledge bases. Selectools is a production-ready framework that combines tool calling with RAG (Retrieval-Augmented Generation) to create powerful, context-aware AI agents. Connect LLMs (OpenAI, Anthropic, Gemini, Ollama) to your tools, embed and search your documents, and let the AI retrieve relevant information before answering. Works with any supported provider, includes 4 vector stores, supports 4 embedding providers, and tracks costs automatically.
Why This Library Stands Out
True Provider Agnosticism
Unlike other tool-calling libraries that lock you into a single provider's API, this library provides a unified interface across OpenAI, Anthropic, Gemini, and local providers. Switch providers with a single line change; no refactoring required. Your tool definitions remain identical regardless of the backend.
Production-Ready Robustness
Built for real-world reliability:
- Hardened parser that handles malformed JSON, fenced code blocks, and mixed content (not just perfect API responses)
- Automatic retry logic with exponential backoff for rate limits and transient failures
- Per-tool execution timeouts to prevent runaway operations
- Request-level timeouts to avoid hanging on slow providers
- Iteration caps to control agent loop costs and prevent infinite loops
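These reliability features map onto AgentConfig fields described later in this README; a minimal sketch with illustrative values (not the library defaults):
from selectools import AgentConfig

config = AgentConfig(
    max_iterations=5,            # cap the agent loop to bound cost
    max_retries=3,               # retry rate limits and transient failures
    retry_backoff_seconds=2.0,   # exponential backoff between retries
    request_timeout=30,          # provider request timeout, in seconds
    tool_timeout_seconds=15,     # per-tool execution timeout, in seconds
)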
Developer-First Ergonomics
- @tool decorator with automatic schema inference from Python type hints
- ToolRegistry for organizing and discovering tools
- Injected kwargs for clean separation of user parameters and configuration (API keys, database connections, etc.)
- Zero boilerplate: Define a function, add a decorator, done
Vision + Streaming Support
- Native vision support for providers that offer it (OpenAI GPT-4o, etc.)
- Real-time streaming with callback handlers for responsive UIs
- Unified API for both streaming and one-shot responses
Testing-Friendly Architecture
- Local provider for offline development and testing (no API calls, no costs)
- Mock injection for deterministic testing (e.g., SELECTOOLS_BBOX_MOCK_JSON)
- Fake providers included for unit testing your agent logic
- Clean separation of concerns makes components easy to test in isolation
Comprehensive RAG Support (v0.8.0)
Built-in Retrieval-Augmented Generation with production-ready components:
- 4 Embedding Providers: OpenAI, Anthropic/Voyage, Gemini (free!), Cohere
- 4 Vector Stores: In-memory (NumPy), SQLite (persistent), Chroma, Pinecone
- Smart Document Processing: Load from text, files, directories, PDFs with intelligent chunking
- Pre-built RAG Tools: Drop-in knowledge base search for your agents
- Automatic Cost Tracking: Monitor both LLM and embedding API costs
- High-Level API: Create RAG agents in 3 lines with RAGAgent.from_directory()
Library-First Design
Not a framework that takes over your application, but a library that integrates into your existing code. Use as much or as little as you need. No magic globals, no hidden state, no framework lock-in.
What's Included
- Core Agent Framework: Agent loop, parser, prompt builder, and provider adapters
- 4 LLM Providers: OpenAI, Anthropic, Gemini, Ollama with unified interface
- 4 Embedding Providers: OpenAI, Anthropic/Voyage, Gemini (free!), Cohere
- 4 Vector Stores: In-memory (NumPy), SQLite, Chroma, Pinecone
- RAG Components: Document loaders, text chunking, semantic search tools
- 120-Model Registry: Type-safe model constants with pricing and metadata
- Pre-built Toolbox: File operations, web scraping, data processing, and more
- Cost Tracking: Monitor LLM and embedding API costs automatically
- Comprehensive Examples: 13+ examples including RAG, streaming, analytics
- Production Testing: 400+ tests ensuring reliability
Install
From PyPI (Recommended)
pip install selectools
pip install selectools[rag]
pip install selectools[providers]
pip install selectools[rag,providers]
From Source (Development)
git clone https://github.com/johnnichev/selectools.git
cd selectools
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
Quick Start: RAG in 30 Seconds
from selectools import OpenAIProvider
from selectools.embeddings import OpenAIEmbeddingProvider
from selectools.models import OpenAI
from selectools.rag import RAGAgent, VectorStore
embedder = OpenAIEmbeddingProvider(model=OpenAI.Embeddings.TEXT_EMBEDDING_3_SMALL.id)
vector_store = VectorStore.create("memory", embedder=embedder)
agent = RAGAgent.from_directory(
    directory="./docs",
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id),
    vector_store=vector_store,
    chunk_size=500,
    top_k=3
)
response = agent.run("What are the main features?")
print(response)
print(agent.get_usage())
That's it! Your AI agent now searches your documents before answering.
Set API Keys
export OPENAI_API_KEY="your-api-key-here"
export ANTHROPIC_API_KEY="your-api-key-here"
export GEMINI_API_KEY="your-api-key-here"
Usage (Library)
from selectools import Agent, AgentConfig, Message, Role, Tool, ToolParameter
from selectools.providers.openai_provider import OpenAIProvider
search_tool = Tool(
    name="search",
    description="Search the web",
    parameters=[ToolParameter(name="query", param_type=str, description="query")],
    function=lambda query: f"Results for {query}",
)
from selectools.models import OpenAI
provider = OpenAIProvider(default_model=OpenAI.GPT_4O.id)
agent = Agent(tools=[search_tool], provider=provider, config=AgentConfig(max_iterations=4))
response = agent.run([Message(role=Role.USER, content="Search for Backtrack")])
print(response.content)
Common ways to use it (library-first)
- Define tools (Tool or @tool/ToolRegistry), pick a provider, run Agent.run([...]).
- Add vision by supplying image_path on Message when the provider supports it.
- For offline/testing: use the Local provider and/or SELECTOOLS_BBOX_MOCK_JSON=tests/fixtures/bbox_mock.json.
- Optional dev helpers (not required for library use): scripts/smoke_cli.py for quick provider smoke tests; scripts/chat.py for the vision demo.
Providers (incl. vision & limits)
- OpenAI: streaming; vision via Chat Completions image_url (e.g., gpt-4o); request timeout default 30s; retries/backoff via AgentConfig.
- Anthropic: streaming; vision model-dependent; set ANTHROPIC_API_KEY.
- Gemini: streaming; vision model-dependent; set GEMINI_API_KEY.
- Ollama (v0.6.0): local LLM execution; zero cost; privacy-first; supports llama3.2, mistral, codellama, etc.
- Local: no network; echoes latest user text; no vision.
- Rate limits: agent detects rate limit/429 and backs off + retries.
- Timeouts: AgentConfig.request_timeout (provider) and tool_timeout_seconds (per tool).
Agent config at a glance
- Core: model, temperature, max_tokens, max_iterations.
- Reliability: max_retries, retry_backoff_seconds, rate-limit backoff, request_timeout.
- Execution safety: tool_timeout_seconds to bound tool runtime.
- Streaming: stream=True to stream provider deltas; optional stream_handler callback.
- Analytics (v0.6.0): enable_analytics=True to track tool usage metrics, success rates, and performance.
- Observability (v0.5.2): hooks dict for lifecycle callbacks (on_tool_start, on_llm_end, etc.).
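Putting the options above together in one configuration (a sketch with illustrative values; enable_analytics and hooks require v0.6.0 and v0.5.2 respectively, and the hook callback below is a hypothetical placeholder):
from selectools import AgentConfig
from selectools.models import OpenAI

def on_tool_start(*args, **kwargs):
    print("tool starting:", args, kwargs)  # hook payload shape is defined by the library

config = AgentConfig(
    model=OpenAI.GPT_4O_MINI.id,
    temperature=0.3,
    max_tokens=1024,
    max_iterations=5,
    max_retries=3,
    retry_backoff_seconds=2.0,
    request_timeout=30,
    tool_timeout_seconds=15,
    stream=True,                             # stream provider deltas
    enable_analytics=True,                   # tool usage metrics (v0.6.0)
    hooks={"on_tool_start": on_tool_start},  # lifecycle callbacks (v0.5.2)
)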
Model Selection with Autocomplete
Use typed model constants for IDE autocomplete and type safety:
from selectools import Agent, OpenAIProvider
from selectools.models import OpenAI, Anthropic, Gemini, Ollama
provider = OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id)
agent = Agent(
    tools=[...],
    provider=provider,
    config=AgentConfig(model=OpenAI.GPT_4O.id)
)
model_info = OpenAI.GPT_4O
print(f"Cost: ${model_info.prompt_cost}/${model_info.completion_cost} per 1M tokens")
print(f"Context: {model_info.context_window:,} tokens")
print(f"Max output: {model_info.max_tokens:,} tokens")
Available model classes:
- OpenAI - 65 models (GPT-5, GPT-4o, o-series, GPT-4, GPT-3.5, etc.)
- Anthropic - 18 models (Claude 4.5, 4.1, 4, 3.7, 3.5, 3)
- Gemini - 26 models (Gemini 3, 2.5, 2.0, 1.5, 1.0, Gemma)
- Ollama - 13 models (Llama, Mistral, Phi, etc.)
All 120 models include pricing, context windows, and max token metadata. See selectools.models for the complete registry.
New in v0.7.0: Model registry with IDE autocomplete for 120 models!
New in v0.8.0: Embedding models and RAG support (see the RAG section below)
Real-World Examples
1. Quick Start: Simple Echo Tool
The simplest possible tool, great for testing your setup:
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.openai_provider import OpenAIProvider
@tool(description="Echo input back to user")
def echo(text: str) -> str:
    return text
agent = Agent(tools=[echo], provider=OpenAIProvider(), config=AgentConfig(max_iterations=3))
resp = agent.run([Message(role=Role.USER, content="Hello, world!")])
print(resp.content)
2. Customer Support: Multi-Tool Workflow
Build a customer support agent that can search a knowledge base, check order status, and create tickets:
from selectools import Agent, AgentConfig, Message, Role, ToolRegistry
from selectools.providers.openai_provider import OpenAIProvider
import json
registry = ToolRegistry()
@registry.tool(description="Search the knowledge base for help articles")
def search_kb(query: str, max_results: int = 5) -> str:
    results = [
        {"title": "How to reset password", "url": "https://help.example.com/reset"},
        {"title": "Shipping information", "url": "https://help.example.com/shipping"}
    ]
    return json.dumps(results)
@registry.tool(description="Look up order status by order ID")
def check_order(order_id: str) -> str:
    return json.dumps({
        "order_id": order_id,
        "status": "shipped",
        "tracking": "1Z999AA10123456784",
        "estimated_delivery": "2025-12-10"
    })
@registry.tool(description="Create a support ticket")
def create_ticket(customer_email: str, subject: str, description: str, priority: str = "normal") -> str:
    ticket_id = "TKT-12345"
    return json.dumps({
        "ticket_id": ticket_id,
        "status": "created",
        "message": f"Ticket {ticket_id} created successfully"
    })
from selectools.models import OpenAI
agent = Agent(
    tools=registry.all(),
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O.id),
    config=AgentConfig(max_iterations=5, temperature=0.7)
)
response = agent.run([
    Message(role=Role.USER, content="Hi, I ordered something last week (order #12345) and haven't received it yet. Can you help?")
])
print(response.content)
3. Vision AI: Bounding Box Detection
Detect and annotate objects in images using OpenAI Vision:
from selectools import Agent, AgentConfig, Message, Role
from selectools.examples.bbox import create_bounding_box_tool
from selectools.providers.openai_provider import OpenAIProvider
bbox_tool = create_bounding_box_tool()
from selectools.models import OpenAI
agent = Agent(
    tools=[bbox_tool],
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O.id),
    config=AgentConfig(max_iterations=5)
)
response = agent.run([
    Message(
        role=Role.USER,
        content="Find the laptop in this image and draw a bounding box around it",
        image_path="assets/office_desk.jpg"
    )
])
print(response.content)
4. Data Pipeline: Research Assistant
Chain multiple tools to research a topic, summarize findings, and save results:
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.openai_provider import OpenAIProvider
import json
@tool(description="Search academic papers and articles")
def search_papers(query: str, year_from: int = 2020) -> str:
    papers = [
        {"title": "Attention Is All You Need", "authors": "Vaswani et al.", "year": 2017},
        {"title": "BERT: Pre-training of Deep Bidirectional Transformers", "authors": "Devlin et al.", "year": 2018}
    ]
    return json.dumps(papers)
@tool(description="Extract key insights from text")
def extract_insights(text: str, num_insights: int = 5) -> str:
    insights = [
        "Transformers use self-attention mechanisms",
        "BERT uses bidirectional training",
        "Pre-training on large corpora improves performance"
    ]
    return json.dumps(insights)
@tool(description="Save research findings to a file")
def save_findings(filename: str, content: str) -> str:
    with open(filename, 'w') as f:
        f.write(content)
    return f"Saved findings to {filename}"
from selectools.models import OpenAI
agent = Agent(
    tools=[search_papers, extract_insights, save_findings],
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O.id),
    config=AgentConfig(max_iterations=8, temperature=0.3)
)
response = agent.run([
    Message(role=Role.USER, content="Research transformer architectures, extract key insights, and save to research_notes.txt")
])
print(response.content)
5. Streaming UI: Real-Time Chat
Build responsive UIs with streaming responses:
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.openai_provider import OpenAIProvider
import sys
@tool(description="Get current time in a timezone")
def get_time(timezone: str = "UTC") -> str:
    from datetime import datetime
    import pytz
    tz = pytz.timezone(timezone)
    return datetime.now(tz).strftime("%Y-%m-%d %H:%M:%S %Z")
def stream_to_console(chunk: str):
    """Print chunks as they arrive for responsive UX"""
    print(chunk, end='', flush=True)
from selectools.models import OpenAI
agent = Agent(
    tools=[get_time],
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O.id),
    config=AgentConfig(stream=True, max_iterations=3)
)
response = agent.run(
    [Message(role=Role.USER, content="What time is it in Tokyo and New York?")],
    stream_handler=stream_to_console
)
print("\n")
6. Secure Tool Injection: Database Access
Keep sensitive credentials out of tool signatures using injected_kwargs:
from selectools import Agent, AgentConfig, Message, Role, Tool
from selectools.tools import ToolParameter
from selectools.providers.openai_provider import OpenAIProvider
import psycopg2
def query_database(sql: str, db_connection) -> str:
    """
    Execute a SQL query. The db_connection is injected, not exposed to the LLM.
    """
    with db_connection.cursor() as cursor:
        cursor.execute(sql)
        results = cursor.fetchall()
    return str(results)
db_conn = psycopg2.connect(
    host="localhost",
    database="myapp",
    user="readonly_user",
    password="secret"
)
db_tool = Tool(
    name="query_db",
    description="Execute a read-only SQL query against the database",
    parameters=[
        ToolParameter(name="sql", param_type=str, description="SQL SELECT query to execute")
    ],
    function=query_database,
    injected_kwargs={"db_connection": db_conn}
)
agent = Agent(
    tools=[db_tool],
    provider=OpenAIProvider(),
    config=AgentConfig(max_iterations=3)
)
response = agent.run([
    Message(role=Role.USER, content="How many users signed up last week?")
])
print(response.content)
7. Testing: Offline Development
Develop and test without API calls using the Local provider:
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.stubs import LocalProvider
@tool(description="Format a todo item with priority")
def create_todo(task: str, priority: str = "medium", due_date: str = None) -> str:
    result = f"[{priority.upper()}] {task}"
    if due_date:
        result += f" (due: {due_date})"
    return result
agent = Agent(
    tools=[create_todo],
    provider=LocalProvider(),
    config=AgentConfig(max_iterations=2, model="local")
)
response = agent.run([
    Message(role=Role.USER, content="Add 'finish project report' to my todos with high priority")
])
print(response.content)
8. Provider Switching: Zero Refactoring
Switch between providers without changing your tool definitions:
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.openai_provider import OpenAIProvider
from selectools.providers.anthropic_provider import AnthropicProvider
from selectools.providers.gemini_provider import GeminiProvider
import os
@tool(description="Calculate compound interest")
def calculate_interest(principal: float, rate: float, years: int) -> str:
    amount = principal * (1 + rate/100) ** years
    return f"After {years} years: ${amount:.2f}"
from selectools.models import OpenAI, Anthropic, Gemini
provider_name = os.getenv("LLM_PROVIDER", "openai")
providers = {
    "openai": OpenAIProvider(default_model=OpenAI.GPT_4O.id),
    "anthropic": AnthropicProvider(default_model=Anthropic.SONNET_3_5_20241022.id),
    "gemini": GeminiProvider(default_model=Gemini.FLASH_2_0.id)
}
agent = Agent(
    tools=[calculate_interest],
    provider=providers[provider_name],
    config=AgentConfig(max_iterations=3)
)
response = agent.run([
    Message(role=Role.USER, content="If I invest $10,000 at 7% annual interest for 10 years, how much will I have?")
])
print(response.content)
9. Cost Tracking: Monitor Token Usage & Costs (v0.5.0)
Track token usage and estimated costs automatically:
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.openai_provider import OpenAIProvider
@tool(description="Search for information")
def search(query: str) -> str:
    return f"Results for: {query}"
@tool(description="Summarize text")
def summarize(text: str) -> str:
    return f"Summary: {text[:50]}..."
from selectools.models import OpenAI
agent = Agent(
    tools=[search, summarize],
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O.id),
    config=AgentConfig(
        max_iterations=5,
        cost_warning_threshold=0.10
    )
)
response = agent.run([
    Message(role=Role.USER, content="Search for Python tutorials and summarize the top result")
])
print(f"Total tokens: {agent.total_tokens:,}")
print(f"Total cost: ${agent.total_cost:.6f}")
print("\nDetailed breakdown:")
print(agent.get_usage_summary())
Key Features:
- Automatic token counting for all providers
- Cost estimation for 15+ models (OpenAI, Anthropic, Gemini)
- Per-tool usage breakdown
- Configurable cost warnings
- Reset usage with agent.reset_usage()
10. Pre-built Toolbox: Ready-to-Use Tools (v0.5.1)
Skip the boilerplate and use production-ready tools from the toolbox:
from selectools import Agent, AgentConfig, Message, Role
from selectools.providers.openai_provider import OpenAIProvider
from selectools.toolbox import get_all_tools, get_tools_by_category
all_tools = get_all_tools()
agent = Agent(
    tools=all_tools,
    provider=OpenAIProvider(),
    config=AgentConfig(max_iterations=8)
)
file_tools = get_tools_by_category("file")
data_tools = get_tools_by_category("data")
text_tools = get_tools_by_category("text")
datetime_tools = get_tools_by_category("datetime")
web_tools = get_tools_by_category("web")
response = agent.run([
    Message(
        role=Role.USER,
        content="""
        1. Get the current time in UTC
        2. Parse this JSON: {"users": [{"name": "Alice", "email": "alice@test.com"}]}
        3. Extract all email addresses from the JSON
        4. Write the results to a file called results.txt
        """
    )
])
print(response.content)
Available Tool Categories:
| Category | Tools | Description |
| --- | --- | --- |
| File (4) | read_file, write_file, list_files, file_exists | File system operations |
| Data (5) | parse_json, json_to_csv, csv_to_json, extract_json_field, format_table | Data parsing and formatting |
| Text (7) | count_text, search_text, replace_text, extract_emails, extract_urls, convert_case, truncate_text | Text processing |
| DateTime (4) | get_current_time, parse_datetime, time_difference, date_arithmetic | Date/time utilities |
| Web (2) | http_get, http_post | HTTP requests |
See examples/toolbox_demo.py for a complete demonstration.
11. Async Agent: Modern Python with asyncio
Build high-performance async applications with native async support:
import asyncio
from selectools import Agent, AgentConfig, Message, Role, tool, ConversationMemory
from selectools.providers.openai_provider import OpenAIProvider
@tool(description="Fetch weather data")
async def fetch_weather(city: str) -> str:
    await asyncio.sleep(0.1)
    return f"Weather in {city}: Sunny, 72°F"
@tool(description="Calculate")
def calculate(a: int, b: int) -> str:
    return f"{a} + {b} = {a + b}"
async def main():
    memory = ConversationMemory(max_messages=20)
    agent = Agent(
        tools=[fetch_weather, calculate],
        provider=OpenAIProvider(),
        config=AgentConfig(max_iterations=5),
        memory=memory
    )
    response = await agent.arun([
        Message(role=Role.USER, content="What's the weather in Seattle?")
    ])
    print(response.content)
asyncio.run(main())
FastAPI Integration:
from fastapi import FastAPI
from selectools import Agent, Message, Role, tool, OpenAIProvider
app = FastAPI()
@tool(description="Fetch data")
async def fetch_data(query: str) -> str:
    return f"Data for {query}"
@app.post("/chat")
async def chat(message: str):
    agent = Agent(tools=[fetch_data], provider=OpenAIProvider())
    response = await agent.arun([Message(role=Role.USER, content=message)])
    return {"response": response.content}
Key Async Features:
- Agent.arun() for non-blocking execution
- Async tools with async def
- All providers support async (OpenAI, Anthropic, Gemini)
- Concurrent execution with asyncio.gather() (see the sketch below)
- Works with FastAPI, aiohttp, and async frameworks
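The asyncio.gather() pattern from the list above, sketched with a trivial tool (this assumes a single Agent instance can serve concurrent arun() calls; if not, create one agent per task):
import asyncio
from selectools import Agent, AgentConfig, Message, Role, tool
from selectools.providers.openai_provider import OpenAIProvider

@tool(description="Add two numbers")
def add(a: int, b: int) -> str:
    return str(a + b)

async def ask(agent: Agent, question: str) -> str:
    response = await agent.arun([Message(role=Role.USER, content=question)])
    return response.content

async def main():
    agent = Agent(tools=[add], provider=OpenAIProvider(), config=AgentConfig(max_iterations=3))
    # Two questions answered concurrently instead of sequentially
    answers = await asyncio.gather(
        ask(agent, "What is 17 + 25?"),
        ask(agent, "What is 40 + 2?"),
    )
    print(answers)

asyncio.run(main())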
Tool ergonomics
- Use ToolRegistry or the @tool decorator to infer schemas from function signatures and register tools.
- Inject per-tool config or auth using injected_kwargs or config_injector when constructing a Tool.
- Type hints map to JSON schema; defaults make parameters optional (see the sketch below).
Streaming Tools
Tools can stream results progressively using Python generators, providing real-time feedback for long-running operations:
from typing import Generator
from selectools import tool, Agent, AgentConfig
@tool(description="Process large file line by line", streaming=True)
def process_file(filepath: str) -> Generator[str, None, None]:
    """Process file and yield results progressively."""
    with open(filepath) as f:
        for i, line in enumerate(f, 1):
            result = process_line(line)
            yield f"[Line {i}] {result}\n"
def on_tool_chunk(tool_name: str, chunk: str):
    print(f"[{tool_name}] {chunk}", end='', flush=True)
config = AgentConfig(hooks={'on_tool_chunk': on_tool_chunk})
agent = Agent(tools=[process_file], provider=provider, config=config)
Features:
- Sync generators (Generator[str, None, None])
- Async generators (AsyncGenerator[str, None]) (see the sketch below)
- Real-time chunk callbacks via on_tool_chunk hook
- Analytics tracking for chunk counts and streaming metrics
- Toolbox includes read_file_stream and process_csv_stream
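An async counterpart to the sync generator above, as a sketch (same streaming=True flag and on_tool_chunk hook; the tool itself is hypothetical):
import asyncio
from typing import AsyncGenerator
from selectools import tool

@tool(description="Report progress for a long-running job", streaming=True)
async def long_job(steps: int = 3) -> AsyncGenerator[str, None]:
    for i in range(1, steps + 1):
        await asyncio.sleep(0.1)  # stand-in for real async work
        yield f"step {i}/{steps} done\n"  # each chunk is delivered to the on_tool_chunk hook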
See examples/streaming_tools_demo.py for complete examples.
RAG (Retrieval-Augmented Generation)
v0.8.0 brings comprehensive RAG support for building agents that answer questions about your documents!
Quick Start
from selectools import OpenAIProvider
from selectools.embeddings import OpenAIEmbeddingProvider
from selectools.models import OpenAI
from selectools.rag import RAGAgent, VectorStore
embedder = OpenAIEmbeddingProvider(model=OpenAI.Embeddings.TEXT_EMBEDDING_3_SMALL.id)
vector_store = VectorStore.create("memory", embedder=embedder)
agent = RAGAgent.from_directory(
    directory="./docs",
    glob_pattern="**/*.md",
    provider=OpenAIProvider(),
    vector_store=vector_store,
    chunk_size=1000,
    top_k=3
)
response = agent.run("What are the main features?")
Embedding Providers
Choose from 4 embedding providers with 10 models:
from selectools.embeddings import (
    OpenAIEmbeddingProvider,
    AnthropicEmbeddingProvider,
    GeminiEmbeddingProvider,
    CohereEmbeddingProvider
)
from selectools.models import OpenAI, Anthropic, Gemini, Cohere
embedder = OpenAIEmbeddingProvider(model=OpenAI.Embeddings.TEXT_EMBEDDING_3_SMALL.id)
embedder = AnthropicEmbeddingProvider(model=Anthropic.Embeddings.VOYAGE_3_LITE.id)
embedder = GeminiEmbeddingProvider(model=Gemini.Embeddings.EMBEDDING_004.id)
embedder = CohereEmbeddingProvider(model=Cohere.Embeddings.EMBED_V3.id)
Vector Stores
Choose from 4 vector store backends:
from selectools.rag import VectorStore
store = VectorStore.create("memory", embedder=embedder)
store = VectorStore.create("sqlite", embedder=embedder, db_path="my_docs.db")
store = VectorStore.create("chroma", embedder=embedder, persist_directory="./chroma_db")
store = VectorStore.create("pinecone", embedder=embedder, index_name="my-index")
Document Loading
Load documents from various sources:
from selectools.rag import DocumentLoader, Document
docs = DocumentLoader.from_text("content", metadata={"source": "memory"})
docs = DocumentLoader.from_file("document.txt")
docs = DocumentLoader.from_directory("./docs", glob_pattern="**/*.md")
docs = DocumentLoader.from_pdf("manual.pdf")
docs = [
    Document(text="content", metadata={"source": "test.txt"}),
    Document(text="more content", metadata={"source": "test2.txt"})
]
Text Chunking
Split large documents into smaller chunks:
from selectools.rag import TextSplitter, RecursiveTextSplitter
splitter = TextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(long_text)
splitter = RecursiveTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", ". ", " ", ""]
)
chunks = splitter.split_text(long_text)
Cost Tracking for Embeddings
Embedding costs are automatically tracked:
print(agent.usage)
Complete Example
See examples/rag_basic_demo.py for a complete working example that demonstrates:
- Creating and embedding documents
- Setting up vector stores
- Building RAG agents
- Asking questions about documents
- Cost tracking
- Different configuration options
Installation for RAG
pip install selectools
pip install selectools[rag]
RAG Troubleshooting
Common Issues
1. ImportError: No module named 'numpy'
NumPy is required for RAG features:
pip install --upgrade selectools
2. Vector Store Setup
ChromaDB:
pip install selectools[rag]
If you get sqlite3 errors on older systems:
pip install pysqlite3-binary
Pinecone:
pip install selectools[rag]
export PINECONE_API_KEY="your-key"
export PINECONE_ENVIRONMENT="your-env"
3. Embedding Provider Issues
OpenAI:
- Ensure OPENAI_API_KEY is set
- Check quota limits
- Use text-embedding-3-small for cost efficiency
Gemini (Free):
export GEMINI_API_KEY="your-key"
Anthropic/Voyage:
pip install voyageai
export VOYAGE_API_KEY="your-key"
Cohere:
pip install cohere
export COHERE_API_KEY="your-key"
4. PDF Loading Errors
pip install pypdf
For encrypted PDFs:
pip install pypdf[crypto]
5. Memory Issues with Large Documents
Use persistent storage instead of in-memory:
store = VectorStore.create("sqlite", embedder=embedder, db_path="docs.db")
Adjust chunk size:
agent = RAGAgent.from_directory(
    directory="./docs",
    chunk_size=500,
    top_k=2
)
6. Slow Search Performance
For in-memory store:
- Consider upgrading to Chroma or Pinecone
- Reduce document count
- Use smaller embeddings (text-embedding-3-small)
For SQLite:
- Enable WAL mode for better performance
- Consider Chroma for >10k documents
7. Cost Concerns
Monitor costs:
print(agent.usage)
Use free/cheaper options:
- Gemini embeddings (free)
- Local Ollama for LLM (free)
- In-memory or SQLite storage (free)
8. Search Returns Irrelevant Results
Tune parameters:
rag_tool = RAGTool(
    vector_store=store,
    top_k=5,
    score_threshold=0.5
)
Improve chunking:
splitter = RecursiveTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", ". ", " "]
)
Performance Tips
- Batch Operations: Use embed_texts() instead of multiple embed_text() calls (see the sketch after this list)
- Caching: Keep vector store instance alive between queries
- Chunk Size: 500-1000 characters is usually optimal
- Top-K: Start with 3-5, adjust based on results
- Metadata: Add rich metadata for better filtering
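A sketch of the batching tip, assuming from_directory() returns Document objects with a .text attribute as shown in the Document Loading section:
from selectools.embeddings import OpenAIEmbeddingProvider
from selectools.models import OpenAI
from selectools.rag import DocumentLoader

embedder = OpenAIEmbeddingProvider(model=OpenAI.Embeddings.TEXT_EMBEDDING_3_SMALL.id)
docs = DocumentLoader.from_directory("./docs", glob_pattern="**/*.md")
texts = [doc.text for doc in docs]
vectors = embedder.embed_texts(texts)  # one batched call instead of one embed_text() call per chunk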
Getting Help
Tests
python tests/test_framework.py
- Covers parsing (mixed/fenced), agent loop (retries/streaming), provider mocks (Anthropic/Gemini), CLI streaming, bbox mock path, and tool schema basics.
Packaging
The project ships a pyproject.toml with console scripts and a src layout. Adjust version/metadata before publishing to PyPI.
CI workflow (.github/workflows/ci.yml) runs tests, build, and twine check. Tags matching v* attempt TestPyPI/PyPI publishes when tokens are provided.
License
This project is licensed under the GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later).
What This Means for You
You CAN:
- Use this library in commercial applications
- Profit from applications that use this library
- Import and use the library without sharing your application code
- Distribute applications that use this library
You MUST:
- Preserve copyright notices and license information
- Share any modifications you make to the library itself under LGPL-3.0
- Provide attribution to the original authors
You CANNOT:
- Relicense this library under different terms
- Claim this code as your own
- Create proprietary forks (modifications must remain open source)
In Practice: You can build and sell proprietary applications using this library via normal import/usage. Only if you modify the library's source code itself must you share those modifications. This is the same license used by popular projects like Qt and GTK.
For the full license text, see the LICENSE file.
More docs
- Single source of truth is this README.
- Examples:
python examples/search_weather.py - Simple tool with local mock provider
python examples/async_agent_demo.py - Async/await usage with FastAPI patterns
python examples/conversation_memory_demo.py - Multi-turn conversation with memory
python examples/cost_tracking_demo.py - Token counting and cost tracking (v0.5.0)
python examples/toolbox_demo.py - Using pre-built tools from toolbox (v0.5.1)
python examples/v0_5_2_demo.py - Tool validation & observability hooks (v0.5.2)
python examples/ollama_demo.py - Local LLM execution with Ollama (v0.6.0)
python examples/tool_analytics_demo.py - Track and analyze tool usage (v0.6.0)
python examples/streaming_tools_demo.py - Progressive tool results with streaming (v0.6.1)
python examples/customer_support_bot.py - Multi-tool customer support workflow
python examples/data_analysis_agent.py - Data exploration and analysis tools
- Dev helpers:
python scripts/smoke_cli.py - Quick provider smoke tests (skips missing keys)
python scripts/test_memory_with_openai.py - Test memory with real OpenAI API
Roadmap
We're actively developing new features to make Selectools the most production-ready tool-calling library. See ROADMAP.md for the complete development roadmap, including:
Completed in v0.4.0:
- Conversation Memory - Multi-turn context management
- Async Support - Agent.arun(), async tools, async providers
- Real Provider Implementations - Full Anthropic & Gemini SDK integration
Completed in v0.5.0:
- Better Error Messages - Custom exceptions with helpful context and suggestions
- Cost Tracking - Automatic token counting and cost estimation with warnings
- Gemini SDK Migration - Updated to new google-genai SDK (v1.0+)
Completed in v0.5.1:
- Pre-built Tool Library - 22 production-ready tools in 5 categories (file, web, data, datetime, text)
Completed in v0.5.2:
- Tool Validation at Registration - Catch tool definition errors during development, not production
- Observability Hooks - 10 lifecycle hooks for monitoring, debugging, and tracking agent behavior
Completed in v0.6.0:
- Local Model Support - Ollama provider for privacy-first, zero-cost local LLM execution
- Tool Usage Analytics - Track metrics, success rates, execution times, and parameter patterns
Completed in v0.6.1:
- Streaming Tool Results - Tools can yield results progressively with Generator/AsyncGenerator
- Streaming Observability - on_tool_chunk hook for real-time chunk callbacks
- Streaming Analytics - Track chunk counts and streaming-specific metrics
- Toolbox Streaming Tools - read_file_stream and process_csv_stream for large files
Completed in v0.7.0:
- Model Registry System - Single source of truth for 120 models with complete metadata
- Typed Model Constants - IDE autocomplete for all models (OpenAI, Anthropic, Gemini, Ollama)
- Rich Metadata - Pricing, context windows, max tokens for every model
- Type Safety - Catch model typos at development time
- Backward Compatible - Existing code with string model names still works
Completed in v0.8.0 (Embeddings & RAG):
- Embedding Models - Add 20+ embedding models to registry (OpenAI, Anthropic, Gemini, Cohere)
- Vector Stores - Unified interface with 4 backends (in-memory, SQLite, Chroma, Pinecone)
- Document Processing - Load, chunk, and embed documents automatically
- RAG Tools - Pre-built tools for retrieval-augmented generation and semantic search
- Cost Tracking - Extend to track embedding API costs
Coming in v0.8.x:
- Dynamic Tool Loading - Hot-reload tools without restarting the agent
- Reranking Models - Cohere and Jina rerankers for better RAG results
- Advanced Chunking - Agentic and contextual chunking strategies
Future (v0.9.0+):
- Parallel Tool Execution - Run multiple tools concurrently
- Tool Composition - Chain tools together with @compose decorator
- Advanced context management - Summarization, sliding windows
- And much more...
See ROADMAP.md for detailed feature descriptions, status tracking, and implementation notes.
Contributing
Want to help build these features? See CONTRIBUTING.md for guidelines. We'd love contributions for:
- Priority 1 features (quick wins)
- Tool implementations for the toolbox
- Examples and tutorials
- Documentation improvements
See the full comparison with LangChain in docs/LANGCHAIN_COMPARISON.md.