Selectools

An open-source project from NichevLabs.
Multi-agent orchestration in plain Python. Build agent graphs, compose pipelines with |, deploy with one command. No DSL, no compile step, no paid debugger. Works with OpenAI, Anthropic, Gemini, and Ollama.
3 Ways to Build
agent = Agent(tools=[search, calculate], provider=OpenAIProvider())
result = agent.run("What is 15 * 7?")
result = AgentGraph.chain(planner, writer, reviewer).run("Write a blog post")
What's New in v0.23
v0.23.0 - Supabase Sessions + Builder RAG
Two user-facing features plus a post-ship bug-hunt sweep that landed 8 code-generator fixes in the visual builder.
- SupabaseSessionStore - 4th SessionStore backend alongside JSON, SQLite, and Redis. Postgres-backed via Supabase PostgREST, with idempotent upserts, namespace isolation, and the same validation guards as RedisSessionStore. Optional dep: pip install selectools[supabase]. Demo: examples/96_supabase_session_store.py.
- Visual builder: first-class RAG + session nodes - drag Retriever (RAG) onto the canvas and pick any of 7 vector stores (memory, SQLite, Chroma, Pinecone, FAISS, Qdrant, pgvector), toggle Hybrid (BM25 + vector + RRF) and cross-encoder Rerank. Drag Session Store as a resource node and wire it into an agent via the new Session Store dropdown. Two new presets: Hybrid RAG and Multi-Tenant RAG. The Python and YAML code generators emit real, runnable code.
from supabase import create_client
from selectools import SupabaseSessionStore, Agent, AgentConfig
store = SupabaseSessionStore(client=create_client(URL, KEY))
agent = Agent(
    tools=[...],
    config=AgentConfig(session_store=store, session_id="u-1", max_iterations=5),
)
See CHANGELOG.md for the full entry including the 8 builder code-gen fixes.
What's New in v0.22
v0.22.0 - Competitor-Informed Bug Fixes
22 bugs identified by mining 95+ closed bug reports from Agno (39k stars) and 60+ from PraisonAI (6.9k stars), then cross-referencing the patterns against selectools v0.21.0 source code. Six were shipping blockers. All 22 are now fixed with TDD regression tests.
from typing import Literal
from selectools.tools import tool
@tool()
def set_mode(mode: Literal["fast", "slow", "auto"]) -> str:
    return f"mode={mode}"
store.save("session_123", memory_a, namespace="agent_a")
store.save("session_123", memory_b, namespace="agent_b")
results = store.search(query_embedding=emb, top_k=10, dedup=True)
agent.run("hello")
- 6 HIGH severity (shipping blockers): streaming dropped tool calls, typing.Literal crashed @tool(), asyncio.run() re-entry in 8 sync wrappers, HITL silently lost in parallel groups + subgraphs, ConversationMemory had no thread lock
- 9 MEDIUM severity: <think> tag stripping, RAG batch limits, MCP concurrent race, str->int/float/bool argument coercion, Union[str, int] support, multi-interrupt generators, GraphState fail-fast validation, session namespace isolation, summary growth cap
- 7 LOW-MED severity: cancelled-result extraction, AgentTrace lock, async observer exception logging, batch clone isolation, OTel/Langfuse observer locks, vector store search dedup, Optional[T] without default handling
- +57 new regression tests in tests/agent/test_regression.py, each with empirical fault-injection verification (the test fails without the fix, passes with it)
- Thread safety end-to-end across ConversationMemory, AgentTrace, OTelObserver, LangfuseObserver, MCPClient, FallbackProvider, and batch clone isolation
See CHANGELOG.md for the full per-bug breakdown with cross-references to every original Agno/PraisonAI issue.
What's New in v0.21
v0.21.0 - Connector Expansion
Seven new subsystems land at once: three vector stores, four document loaders, eight new toolbox tools, multimodal messages, an Azure OpenAI provider, and two observability backends.
from selectools.rag.stores import FAISSVectorStore, QdrantVectorStore, PgVectorStore
from selectools import AzureOpenAIProvider
from selectools.observe import OTelObserver, LangfuseObserver
from selectools import image_message
agent.run([image_message("./screenshot.png", "What does this UI show?")])
- Vector stores: FAISSVectorStore (in-process, persistable), QdrantVectorStore (REST + gRPC), PgVectorStore (PostgreSQL pgvector extension)
- Document loaders: DocumentLoader.from_csv, from_json, from_html, from_url
- Toolbox: execute_python, execute_shell, web_search, scrape_url, github_search_repos, github_get_file, github_list_issues, query_sqlite, query_postgres
- Multimodal: Message.content accepts list[ContentPart]; image input works on OpenAI, Anthropic, Gemini, and Ollama vision models
- Azure OpenAI: deployment-name routing, AAD token auth, env-var fallback (AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY)
- OpenTelemetry: OTelObserver emits GenAI semantic-convention spans (Jaeger, Tempo, Datadog, Honeycomb, Grafana)
- Langfuse: LangfuseObserver ships traces, generations, and spans to Langfuse Cloud or self-hosted deployments
pip install "selectools[rag]"
pip install "selectools[observe]"
pip install "selectools[postgres]"
What's New in v0.20
v0.20.1 - Visual Agent Builder + GitHub Pages
The first AI agent framework to ship a visual graph builder in a single pip install. No React. No build step. No CDN.
Try the builder in your browser - no install required.

pip install selectools
selectools serve --builder
- Drag START, END, and Agent nodes onto the canvas
- Click ports to connect agents with edges
- Add condition labels to edges (e.g. "approved") for conditional routing
- Edit provider, model, and system prompt in the properties panel
- Generated Python and YAML update live in the code panel
- Export or copy to clipboard with one click
What's New in v0.19
v0.19.3 - Stability Markers Applied to All Public APIs
Every public class and function exported from selectools now carries a stability marker:
from selectools import Agent, AgentGraph, PlanAndExecuteAgent
print(Agent.__stability__)
print(AgentGraph.__stability__)
print(PlanAndExecuteAgent.__stability__)
@stable - 60+ core symbols (Agent, AgentConfig, providers, memory, tools, evals, guardrails, sessions, knowledge, cache, cancellation)
@beta - 30+ newer symbols (AgentGraph, SupervisorAgent, Pipeline, @step, parallel, branch, all four patterns, compose)
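The markers are plain attributes readable at runtime, which is what makes the `.__stability__` introspection above possible. A minimal sketch of how such markers could be implemented (illustrative only, not the library's actual code; `stability`, `SearchAgent`, and `compose_steps` are hypothetical names):

```python
def stability(level: str):
    """Return a decorator that stamps a __stability__ attribute onto a class or function."""
    def mark(obj):
        obj.__stability__ = level
        return obj
    return mark

stable = stability("stable")
beta = stability("beta")

@stable
class SearchAgent:
    pass

@beta
def compose_steps(*steps):
    return steps

print(SearchAgent.__stability__)    # stable
print(compose_steps.__stability__)  # beta
```

Because the marker is just an attribute, tooling can scan a module and report every symbol's stability without importing anything framework-specific.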
v0.19.2 - Enterprise Hardening
from selectools.stability import stable, beta, deprecated
from selectools import trace_to_html
@stable
class MyProductionAgent: ...
@beta
class MyExperimentalFeature: ...
@deprecated(since="0.19", replacement="MyProductionAgent")
class MyOldAgent: ...
Path("trace.html").write_text(trace_to_html(result.trace))
- Stability markers - @stable, @beta, @deprecated(since, replacement) for public API signalling
- Trace HTML viewer - trace_to_html(trace) renders a standalone waterfall timeline
- Deprecation policy - 2-minor-version window, programmatic introspection via .__stability__
- Security audit - all 41 # nosec annotations reviewed and published in docs/SECURITY.md
- Quality infrastructure - property-based tests (Hypothesis), thread-safety smoke suite, 5 new production simulations (5332 tests total)
v0.19.1 - Advanced Agent Patterns
from selectools.patterns import PlanAndExecuteAgent, ReflectiveAgent, DebateAgent, TeamLeadAgent
agent = PlanAndExecuteAgent(planner=planner, executor=executor, provider=provider)
result = agent.run("Research and write a blog post about LLM safety")
agent = ReflectiveAgent(actor=actor, critic=critic, provider=provider, max_reflections=3)
result = agent.run("Draft a product announcement email")
agent = DebateAgent(agents={"optimist": opt, "skeptic": skep}, judge=judge, provider=provider)
result = agent.run("Should we migrate our infrastructure to microservices?")
agent = TeamLeadAgent(lead=lead, team={"researcher": r, "writer": w}, provider=provider)
result = agent.run("Produce a competitive analysis report")
- PlanAndExecuteAgent - Typed PlanStep list; optional replanning on step failure
- ReflectiveAgent - Actor-critic loop with ReflectionRound records per revision
- DebateAgent - N-agent debate with transcript, judge synthesis, DebateResult
- TeamLeadAgent - sequential, parallel, or dynamic delegation strategies
v0.19.0 - Serve, Deploy & Complete Composition
from selectools import compose
search_and_summarize = compose(search_web, summarize)
async for chunk in pipeline.astream("input"):
    print(chunk)
- selectools serve - HTTP deployment with SSE streaming, Playground UI, /health, /schema
- YAML config - AgentConfig.from_yaml("agent.yaml"), 5 built-in templates
- compose() - Chain tools into a composite tool; retry() and cache_step() wrappers
- PostgresCheckpointStore - Durable graph checkpointing backed by PostgreSQL
v0.18.x highlights
v0.18.0 - Multi-Agent Orchestration
from selectools import AgentGraph, SupervisorAgent, AgentConfig, OpenAIProvider, tool
graph = AgentGraph()
graph.add_node("planner", planner_agent)
graph.add_node("writer", writer_agent)
graph.add_node("reviewer", reviewer_agent)
graph.add_edge("planner", "writer")
graph.add_edge("writer", "reviewer")
graph.add_edge("reviewer", AgentGraph.END)
graph.set_entry("planner")
result = graph.run("Write a blog post about AI safety")
supervisor = SupervisorAgent(
    agents={"researcher": researcher, "writer": writer},
    provider=OpenAIProvider(),
    strategy="plan_and_execute",
)
result = supervisor.run("Write a comprehensive report on LLM safety")
- AgentGraph - Directed graph of agent nodes with plain Python routing
- 4 Supervisor Strategies - plan_and_execute, round_robin, dynamic, magentic (Magentic-One pattern)
- Human-in-the-Loop - Generator nodes with yield InterruptRequest(); resumes at the exact yield point (LangGraph restarts the whole node)
- Parallel Execution - add_parallel_nodes() with 3 merge policies (LAST_WINS, FIRST_WINS, APPEND)
- Checkpointing - 3 backends (InMemory, File, SQLite) for durable mid-graph persistence
- Subgraph Composition - Nest graphs inside graphs with explicit state mapping
- ModelSplit - Separate planner/executor models for 70-90% cost reduction
- Loop & Stall Detection - State hash tracking with observer events
- 10 New StepTypes - Full trace visibility into graph execution
- 13 New Observer Events - on_graph_start/end, on_node_start/end, on_graph_interrupt/resume, and more
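The three merge policies reduce to simple rules over the ordered list of branch outputs. An illustrative model of the semantics (assumption: `merge_results` is not the library API, just a sketch of what the policy names imply):

```python
def merge_results(results, policy="LAST_WINS"):
    """Reduce the ordered outputs of parallel branches into one value."""
    if policy == "FIRST_WINS":
        return results[0]      # keep the first branch's output
    if policy == "LAST_WINS":
        return results[-1]     # keep the last branch's output
    if policy == "APPEND":
        return list(results)   # keep everything, in branch order
    raise ValueError(f"unknown merge policy: {policy}")

branch_outputs = ["summary from node A", "summary from node B"]
print(merge_results(branch_outputs, "APPEND"))
```

LAST_WINS and FIRST_WINS are lossy but keep the downstream state shape unchanged; APPEND preserves every branch at the cost of handing the next node a list.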
v0.18.0 - Composable Pipelines
from selectools import Pipeline, step, parallel, branch
@step
def summarize(text: str) -> str:
    return agent.run(f"Summarize: {text}").content
@step
def translate(text: str, lang: str = "es") -> str:
    return agent.run(f"Translate to {lang}: {text}").content
pipeline = summarize | translate
result = pipeline.run("Long article text here...")
research = parallel(search_web, search_docs, search_db)
route = branch(
    lambda x: "technical" if "code" in x else "general",
    technical=code_review_pipeline,
    general=summarize_pipeline,
)
- Pipeline - Chain steps sequentially with the | operator or Pipeline(steps=[...])
- @step decorator - Wrap any sync/async callable into a composable pipeline step
- parallel() - Fan out to multiple steps and merge results
- branch() - Conditional routing based on input data
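This style of | composition works because each step object overloads Python's or-operator. A self-contained sketch of the idea (toy `Step` class and `truncate`/`shout` steps are illustrative, not the library's implementation):

```python
class Step:
    """Toy composable step: wraps a callable and overloads | to chain left-to-right."""
    def __init__(self, fn):
        self.fn = fn
    def __or__(self, other):
        # (a | b) feeds a's output into b and returns a new Step
        return Step(lambda x: other.fn(self.fn(x)))
    def run(self, x):
        return self.fn(x)

def step(fn):
    return Step(fn)

@step
def truncate(text):
    return text[:10]

@step
def shout(text):
    return text.upper()

pipeline = truncate | shout
print(pipeline.run("hello world"))  # HELLO WORL
```

Because `__or__` returns another Step, chains of any length compose left-to-right with no DSL and no compile step.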
v0.17.x highlights
v0.17.7 - Caching & Context
from selectools.cache_semantic import SemanticCache
from selectools.embeddings.openai import OpenAIEmbeddingProvider
cache = SemanticCache(
    embedding_provider=OpenAIEmbeddingProvider(),
    similarity_threshold=0.92,
)
config = AgentConfig(cache=cache)
config = AgentConfig(
    compress_context=True,
    compress_threshold=0.75,
    compress_keep_recent=4,
)
branch = agent.memory.branch()
store.branch("main", "experiment")
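A semantic cache differs from an exact-match cache in that it compares query embeddings rather than query strings, which is what the `similarity_threshold` above controls. A toy stdlib-only model of that idea (`TinySemanticCache` is illustrative, not the library class):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class TinySemanticCache:
    """Toy semantic cache: a lookup hits when the closest stored embedding clears the threshold."""
    def __init__(self, threshold=0.92):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer)
    def put(self, embedding, answer):
        self.entries.append((embedding, answer))
    def get(self, embedding):
        scored = [(cosine(e, embedding), answer) for e, answer in self.entries]
        if not scored:
            return None
        best_score, best_answer = max(scored)
        return best_answer if best_score >= self.threshold else None

cache = TinySemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "Paris")
print(cache.get([0.99, 0.05]))  # Paris (near-duplicate query hits)
print(cache.get([0.0, 1.0]))    # None  (unrelated query misses)
```

The threshold trades hit rate against false hits: 0.92 is strict enough that only close paraphrases reuse a cached answer.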
v0.17.6 - Quick Wins
from selectools import AgentConfig, REASONING_STRATEGIES, tool
config = AgentConfig(reasoning_strategy="react")
config = AgentConfig(reasoning_strategy="cot")
config = AgentConfig(reasoning_strategy="plan_then_act")
@tool(description="Search the web", cacheable=True, cache_ttl=60)
def web_search(query: str) -> str:
    return expensive_api_call(query)
Also: Python 3.9-3.13 CI matrix (verified zero compatibility issues).
v0.17.4 and earlier
v0.17.4 - Agent Intelligence
from selectools import AgentConfig, estimate_run_tokens, KnowledgeMemory, SQLiteKnowledgeStore
estimate = estimate_run_tokens(messages, tools, system_prompt, model="gpt-4o")
print(f"{estimate.total_tokens} tokens, {estimate.remaining_tokens} remaining")
config = AgentConfig(
    model="claude-haiku-4-5",
    model_selector=lambda i, tc, u: "claude-sonnet-4-6" if i > 2 else "claude-haiku-4-5",
)
memory = KnowledgeMemory(store=SQLiteKnowledgeStore("knowledge.db"), max_entries=50)
memory.remember("User prefers dark mode", category="preference", importance=0.9, ttl_days=30)
v0.17.3 - Agent Runtime Controls
from selectools import AgentConfig, CancellationToken, SimpleStepObserver
from selectools.tools import tool
config = AgentConfig(max_total_tokens=50000, max_cost_usd=0.20)
token = CancellationToken()
result = await agent.arun("long task", cancel_token=token)
@tool(requires_approval=True, description="Send email to customer")
def send_email(to: str, subject: str, body: str) -> str: ...
config = AgentConfig(observers=[SimpleStepObserver(
    lambda event, run_id, **data: sse_send({"type": event, **data})
)])
v0.17.1 - MCP Client/Server
from selectools.mcp import mcp_tools, MCPServerConfig
with mcp_tools(MCPServerConfig(command="python", args=["server.py"])) as tools:
    agent = Agent(provider=provider, tools=tools, config=config)
- MCPClient - stdio + HTTP transport, circuit breaker, retry, tool caching
- MultiMCPClient - multiple servers, graceful degradation, name prefixing
- MCPServer - expose @tool functions as an MCP server
v0.17.0 - Built-in Eval Framework
from selectools.evals import EvalSuite, TestCase
suite = EvalSuite(agent=agent, cases=[
    TestCase(input="Cancel account", expect_tool="cancel_sub", expect_no_pii=True),
    TestCase(input="Balance?", expect_contains="balance", expect_latency_ms_lte=500),
])
report = suite.run()
report.to_html("report.html")
- 50 Evaluators - 30 deterministic + 21 LLM-as-judge
- A/B Testing, regression detection, snapshot testing
- HTML reports, JUnit XML, CLI, GitHub Action integration
Full changelog: CHANGELOG.md
v0.16.x highlights
- v0.16.6: Gemini 3.x thought_signature crash fix - base64 round-trip for non-UTF-8 binary signatures
- v0.16.5: Design Patterns & Code Quality - terminal actions, async observers, Gemini 3.x thought signatures, agent decomposition, hooks deprecated
- v0.16.4: Parallel execution safety - coherence + screening in parallel, guardrail immutability, streaming usage tracking
- v0.16.0: Memory & Persistence - persistent sessions (3 backends), summarize-on-trim, entity memory, knowledge graph
v0.15.x highlights
- v0.15.0: Enterprise Reliability - Guardrails engine (5 built-in), audit logging (4 privacy levels), tool output screening (15 patterns), coherence checking
v0.14.x highlights
- v0.14.1: Critical streaming fix - 13 bugs fixed across all providers; 141 new tests (total: 1100)
- v0.14.0: AgentObserver Protocol (25 events), 145 models with March 2026 pricing, OpenAI max_completion_tokens auto-detection, 11 bug fixes
Coming from LangChain?
| LangChain | Selectools |
| --- | --- |
| StateGraph + add_node + add_edge + compile() | AgentGraph.chain(a, b, c).run(prompt) |
| LCEL prompt \| llm \| parser with Runnable protocol | @step + \| on plain functions |
| interrupt() restarts the whole node on resume | yield InterruptRequest() resumes at yield point |
| LangSmith (paid) for tracing and evals | Built-in: 50 evaluators + traces, zero cost |
| 5+ packages (langchain-core, langgraph, langsmith...) | 1 package: pip install selectools |
| langserve for deployment | selectools serve agent.yaml |
Full migration guide with code examples: Coming from LangChain
Why Selectools
| Feature | Description |
| --- | --- |
| Provider Agnostic | Switch between OpenAI, Anthropic, Gemini, Ollama with one line. Your tools stay identical. |
| Structured Output | Pydantic or JSON Schema response_format with auto-retry on validation failure. |
| Execution Traces | Every run() returns result.trace - a structured timeline of LLM calls, tool picks, and executions. |
| Reasoning Visibility | result.reasoning surfaces why the agent chose a tool, extracted from LLM responses. |
| Provider Fallback | FallbackProvider tries providers in priority order with circuit breaker on failure. |
| Batch Processing | agent.batch() / agent.abatch() for concurrent multi-prompt classification. |
| Tool Policy Engine | Declarative allow/review/deny rules with glob patterns. Human-in-the-loop approval callbacks. |
| Hybrid Search | BM25 keyword + vector semantic search with RRF/weighted fusion and cross-encoder reranking. |
| Advanced Chunking | Fixed, recursive, semantic (embedding-based), and contextual (LLM-enriched) chunking strategies. |
| E2E Streaming | Token-level astream() with native tool call support. Parallel tool execution via asyncio.gather. |
| Dynamic Tools | Load tools from files/directories at runtime. Add, remove, replace tools without restarting. |
| Response Caching | LRU + TTL in-memory cache and Redis backend. Avoid redundant LLM calls for identical requests. |
| Routing Mode | Agent selects a tool without executing it. Use for intent classification and request routing. |
| Guardrails Engine | Input/output validation pipeline with PII redaction, topic blocking, toxicity detection, and format enforcement. |
| Audit Logging | JSONL audit trail with privacy controls (redact, hash, omit) and daily rotation. |
| Tool Output Screening | Prompt injection detection with 15 built-in patterns. Per-tool or global. |
| Coherence Checking | LLM-based verification that tool calls match user intent - catches injection-driven tool misuse. |
| Persistent Sessions | SessionStore with JSON file, SQLite, and Redis backends. Auto-save/load with TTL expiry. |
| Entity Memory | LLM-based entity extraction with deduplication, LRU pruning, and system prompt injection. |
| Knowledge Graph | Relationship triple extraction with in-memory and SQLite storage and keyword-based querying. |
| Cross-Session Knowledge | Daily logs + persistent facts with auto-registered remember tool. |
| MCP Integration | Connect to any MCP tool server (stdio + HTTP). MCPClient, MultiMCPClient, MCPServer. Circuit breaker, retry, graceful degradation. |
| Eval Framework | 50 built-in evaluators (30 deterministic + 21 LLM-as-judge). A/B testing, regression detection, snapshot testing, HTML reports, JUnit XML, CI integration. |
| Multi-Agent Orchestration | AgentGraph for directed agent graphs, SupervisorAgent with 4 strategies, HITL via generator nodes, parallel execution, checkpointing, subgraph composition. |
| Composable Pipelines | Pipeline + @step + \| operator + parallel() + branch() to chain agents, tools, and transforms. |
| AgentObserver Protocol | 45-event lifecycle observer with run_id/call_id correlation. Built-in LoggingObserver + SimpleStepObserver. |
| Runtime Controls | Token/cost budget limits, cooperative cancellation, per-tool approval gates, model switching per iteration. |
| Production Hardened | Retries with backoff, per-tool timeouts, iteration caps, cost warnings, observability hooks + observers. |
| Library-First | Not a framework. No magic globals, no hidden state. Use as much or as little as you need. |
What's Included
- 5 LLM Providers: OpenAI, Azure OpenAI, Anthropic, Gemini, Ollama + FallbackProvider (auto-failover)
- Structured Output: Pydantic / JSON Schema response_format with auto-retry
- Execution Traces: result.trace with typed timeline of every agent step
- Reasoning Visibility: result.reasoning explains why the agent chose a tool
- Batch Processing: agent.batch() / agent.abatch() for concurrent classification
- Tool Policy Engine: Declarative allow/review/deny rules with human-in-the-loop
- 4 Embedding Providers: OpenAI, Anthropic/Voyage, Gemini (free!), Cohere
- 7 Vector Stores: In-memory, SQLite, Chroma, Pinecone, FAISS, Qdrant, pgvector
- Hybrid Search: BM25 + vector fusion with Cohere/Jina reranking
- Advanced Chunking: Semantic + contextual chunking for better retrieval
- Dynamic Tool Loading: Plugin system with hot-reload support
- Response Caching: InMemoryCache and RedisCache with stats tracking
- 152 Model Registry: Type-safe constants with pricing and metadata
- Pre-built Toolbox: 24 tools for files, data, text, datetime, web
- Persistent Sessions: 3 backends (JSON file, SQLite, Redis) with TTL
- Entity Memory: LLM-based named entity extraction and tracking
- Knowledge Graph: Triple extraction with in-memory and SQLite storage
- Cross-Session Knowledge: Daily logs + persistent memory with remember tool, pluggable stores (File, SQLite), importance scoring, TTL
- Token Budget & Cancellation: max_total_tokens, max_cost_usd hard limits; CancellationToken for cooperative stopping
- Token Estimation: estimate_run_tokens() for pre-execution budget checks
- Model Switching: model_selector callback for per-iteration model selection
- Semantic Cache: SemanticCache - embedding-based cache hits for paraphrased queries (cosine similarity, LRU + TTL)
- Prompt Compression: Auto-summarise old history when the context window fills up; compress_context, compress_threshold, compress_keep_recent
- Conversation Branching: ConversationMemory.branch() and SessionStore.branch() for A/B exploration and checkpointing
- Multi-Agent Orchestration: AgentGraph with routing, parallel execution, HITL, checkpointing; SupervisorAgent with 4 strategies (plan_and_execute, round_robin, dynamic, magentic)
- Composable Pipelines: Pipeline + @step + | operator + parallel() + branch() - chain agents, tools, and transforms
- 96 Examples: Multi-agent graphs, RAG, hybrid search, streaming, structured output, traces, batch, policy, observer, guardrails, audit, sessions (incl. Supabase), entity memory, knowledge graph, eval framework, advanced agent patterns, stability markers, HTML trace viewer, and more
- Built-in Eval Framework: 50 evaluators (30 deterministic + 21 LLM-as-judge), A/B testing, regression detection, HTML reports, JUnit XML, snapshot testing
- AgentObserver Protocol: 45 lifecycle events with run_id correlation, LoggingObserver, SimpleStepObserver, OTel export
- 5332 Tests: Unit, integration, regression, and E2E with real API calls
Install
pip install selectools
pip install selectools[rag]
pip install selectools[observe]
pip install selectools[postgres]
pip install selectools[cache]
pip install selectools[mcp]
pip install "selectools[rag,observe,cache,mcp]"
Add your provider's API key to a .env file in your project root:
OPENAI_API_KEY=sk-...
# or ANTHROPIC_API_KEY, GEMINI_API_KEY β whichever provider you use
Quick Start
New to Selectools? Follow the 5-minute Quickstart tutorial - no API key needed.
Tool Calling Agent (No API Key)
from selectools import Agent, AgentConfig, tool
from selectools.providers.stubs import LocalProvider
@tool(description="Look up the price of a product")
def get_price(product: str) -> str:
    prices = {"laptop": "$999", "phone": "$699", "headphones": "$149"}
    return prices.get(product.lower(), f"No price found for {product}")
agent = Agent(
    tools=[get_price],
    provider=LocalProvider(),
    config=AgentConfig(max_iterations=3),
)
result = agent.ask("How much is a laptop?")
print(result.content)
Tool Calling Agent (OpenAI)
from selectools import Agent, AgentConfig, OpenAIProvider, tool
from selectools.models import OpenAI
@tool(description="Search the web for information")
def search(query: str) -> str:
    return f"Results for: {query}"
agent = Agent(
    tools=[search],
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id),
    config=AgentConfig(max_iterations=5),
)
result = agent.ask("Search for Python tutorials")
print(result.content)
RAG Agent
from selectools import OpenAIProvider
from selectools.embeddings import OpenAIEmbeddingProvider
from selectools.models import OpenAI
from selectools.rag import RAGAgent, VectorStore
embedder = OpenAIEmbeddingProvider(model=OpenAI.Embeddings.TEXT_EMBEDDING_3_SMALL.id)
store = VectorStore.create("memory", embedder=embedder)
agent = RAGAgent.from_directory(
    directory="./docs",
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id),
    vector_store=store,
    chunk_size=500, top_k=3,
)
result = agent.ask("What are the main features?")
print(result.content)
print(agent.get_usage_summary())
Hybrid Search (Keyword + Semantic)
from selectools.rag import BM25, HybridSearcher, FusionMethod, HybridSearchTool, VectorStore
store = VectorStore.create("memory", embedder=embedder)
store.add_documents(chunked_docs)
searcher = HybridSearcher(
    vector_store=store,
    vector_weight=0.6,
    keyword_weight=0.4,
    fusion=FusionMethod.RRF,
)
searcher.add_documents(chunked_docs)
hybrid_tool = HybridSearchTool(searcher=searcher, top_k=5)
agent = Agent(tools=[hybrid_tool.search_knowledge_base], provider=provider)
Streaming with Parallel Tools
import asyncio
from selectools import Agent, AgentConfig
from selectools.types import StreamChunk, AgentResult
agent = Agent(
    tools=[tool_a, tool_b, tool_c],
    provider=provider,
    config=AgentConfig(parallel_tool_execution=True),
)
async for item in agent.astream("Run all tasks"):
    if isinstance(item, StreamChunk):
        print(item.content, end="", flush=True)
    elif isinstance(item, AgentResult):
        print(f"\nDone in {item.iterations} iterations")
Key Features
Hybrid Search & Reranking
Combine semantic search with BM25 keyword matching for better recall on exact terms, names, and acronyms:
from selectools.rag import BM25, HybridSearcher, CohereReranker, FusionMethod
searcher = HybridSearcher(
    vector_store=store,
    fusion=FusionMethod.RRF,
    reranker=CohereReranker(),
)
results = searcher.search("GDPR compliance", top_k=5)
See docs/modules/HYBRID_SEARCH.md for full documentation.
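Reciprocal Rank Fusion (the FusionMethod.RRF above) merges the keyword and vector rankings without having to compare their incompatible raw scores: each document earns 1/(k + rank) from every ranking it appears in. A stdlib sketch of the formula (k=60 is the common default from the original RRF paper; the document names are made up):

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum of 1 / (k + rank) over every ranking it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["gdpr.md", "policy.md", "faq.md"]   # BM25 ranking
vector_hits = ["gdpr.md", "policy.md", "misc.md"]   # embedding ranking
fused = rrf([keyword_hits, vector_hits])
print(fused[0])  # gdpr.md - top of both input rankings
```

Documents that rank well in both lists float to the top, which is exactly the "better recall on exact terms" behaviour hybrid search is after.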
Advanced Chunking
Go beyond fixed-size splitting with embedding-aware and LLM-enriched chunking:
from selectools.rag import SemanticChunker, ContextualChunker
semantic = SemanticChunker(embedder=embedder, similarity_threshold=0.75)
contextual = ContextualChunker(base_chunker=semantic, provider=provider)
enriched_docs = contextual.split_documents(documents)
See docs/modules/ADVANCED_CHUNKING.md for full documentation.
Dynamic Tool Loading
Discover and load @tool functions from files and directories at runtime:
from selectools.tools import ToolLoader
tools = ToolLoader.from_directory("./plugins", recursive=True)
agent.add_tools(tools)
updated = ToolLoader.reload_file("./plugins/search.py")
agent.replace_tool(updated[0])
agent.remove_tool("deprecated_search")
See docs/modules/DYNAMIC_TOOLS.md for full documentation.
Response Caching
Avoid redundant LLM calls with pluggable caching:
from selectools import Agent, AgentConfig, InMemoryCache
cache = InMemoryCache(max_size=1000, default_ttl=300)
agent = Agent(
    tools=[...],
    provider=provider,
    config=AgentConfig(cache=cache),
)
agent.ask("What is Python?")
agent.reset()
agent.ask("What is Python?")
print(cache.stats)
For distributed setups: from selectools.cache_redis import RedisCache
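The cache semantics above (max_size eviction plus default_ttl expiry) can be modelled in a few lines with an OrderedDict. An illustrative stand-in, not the library's InMemoryCache:

```python
import time
from collections import OrderedDict

class LRUTTLCache:
    """Toy cache: evicts the least-recently-used entry past max_size, expires entries by TTL."""
    def __init__(self, max_size=1000, default_ttl=300):
        self.max_size = max_size
        self.default_ttl = default_ttl
        self._data = OrderedDict()  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0
    def set(self, key, value, ttl=None):
        self._data[key] = (time.monotonic() + (ttl or self.default_ttl), value)
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # drop least recently used
    def get(self, key):
        item = self._data.get(key)
        if item is None or item[0] < time.monotonic():
            self._data.pop(key, None)  # expired or absent
            self.misses += 1
            return None
        self._data.move_to_end(key)  # mark as recently used
        self.hits += 1
        return item[1]

cache = LRUTTLCache(max_size=2)
cache.set("q1", "answer1")
cache.set("q2", "answer2")
cache.set("q3", "answer3")  # evicts q1, the least recently used
print(cache.get("q1"))  # None
print(cache.get("q3"))  # answer3
```

The OrderedDict gives both operations amortised O(1): move_to_end on every hit keeps recency order, and popitem(last=False) removes the coldest entry.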
Routing Mode
Agent selects a tool without executing it -- use for intent classification:
config = AgentConfig(routing_only=True)
agent = Agent(tools=[send_email, schedule_meeting, search_kb], provider=provider, config=config)
result = agent.ask("Book a meeting with Alice tomorrow")
print(result.tool_name)
print(result.tool_args)
Structured Output
Get typed, validated results from the LLM:
from pydantic import BaseModel
from typing import Literal
class Classification(BaseModel):
    intent: Literal["billing", "support", "sales", "cancel"]
    confidence: float
    priority: Literal["low", "medium", "high"]
result = agent.ask("I want to cancel my account", response_format=Classification)
print(result.parsed)
Auto-retries with error feedback when validation fails.
Execution Traces & Reasoning
See exactly what your agent did and why:
result = agent.run("Classify this ticket")
for step in result.trace:
    print(f"{step.type} | {step.duration_ms:.0f}ms | {step.summary}")
print(result.reasoning)
result.trace.to_json("trace.json")
Provider Fallback
Automatic failover with circuit breaker:
from selectools import FallbackProvider, OpenAIProvider, AnthropicProvider
provider = FallbackProvider([
    OpenAIProvider(default_model="gpt-4o-mini"),
    AnthropicProvider(default_model="claude-haiku"),
])
agent = Agent(tools=[...], provider=provider)
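The fallback pattern boils down to: try providers in priority order, and stop calling a provider once it has failed too many times in a row (the circuit opens). A toy sketch with plain callables standing in for providers (`SimpleFallback`, `flaky`, and `backup` are hypothetical names, not the library API):

```python
class SimpleFallback:
    """Toy priority-order fallback: try each provider callable; skip any whose breaker is open."""
    def __init__(self, providers, max_failures=3):
        self.providers = providers
        self.max_failures = max_failures
        self.failures = {id(p): 0 for p in providers}
    def complete(self, prompt):
        last_error = None
        for p in self.providers:
            if self.failures[id(p)] >= self.max_failures:
                continue  # circuit open: skip this provider
            try:
                result = p(prompt)
                self.failures[id(p)] = 0  # success closes the circuit
                return result
            except Exception as e:
                self.failures[id(p)] += 1
                last_error = e
        raise RuntimeError("all providers failed") from last_error

def flaky(prompt):
    raise TimeoutError("upstream timeout")

def backup(prompt):
    return f"echo: {prompt}"

provider = SimpleFallback([flaky, backup], max_failures=1)
print(provider.complete("hi"))  # echo: hi - flaky fails once, backup answers
```

After the first failure, `flaky`'s breaker is open and subsequent calls go straight to `backup` instead of paying the timeout again.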
Batch Processing
Classify multiple requests concurrently:
results = await agent.abatch(
    ["Cancel my subscription", "How do I upgrade?", "My payment failed"],
    max_concurrency=10,
)
Tool Policy & Human-in-the-Loop
Declarative safety rules with approval callbacks:
from selectools import ToolPolicy
policy = ToolPolicy(
    allow=["search_*", "read_*"],
    review=["send_*", "create_*"],
    deny=["delete_*"],
)
async def confirm(tool_name, tool_args, reason):
    return await get_user_approval(tool_name, tool_args)
config = AgentConfig(tool_policy=policy, confirm_action=confirm)
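Glob-based rule matching like the patterns above can be modelled with the stdlib's fnmatch. A sketch of one plausible precedence (assumption: deny beats review beats allow, with deny as the default for unmatched tools; this ordering is illustrative, not necessarily ToolPolicy's exact rule):

```python
from fnmatch import fnmatch

def classify_tool(name, allow=(), review=(), deny=()):
    """Toy policy check: deny wins over review, review wins over allow; unmatched tools are denied."""
    for pattern in deny:
        if fnmatch(name, pattern):
            return "deny"
    for pattern in review:
        if fnmatch(name, pattern):
            return "review"
    for pattern in allow:
        if fnmatch(name, pattern):
            return "allow"
    return "deny"

policy = dict(allow=["search_*", "read_*"], review=["send_*", "create_*"], deny=["delete_*"])
print(classify_tool("search_docs", **policy))  # allow
print(classify_tool("send_email", **policy))   # review
print(classify_tool("delete_user", **policy))  # deny
```

Putting deny first makes the safest rule authoritative even when patterns overlap, and defaulting unmatched tools to "deny" keeps new tools gated until a rule is written for them.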
AgentObserver Protocol
Class-based observability with run_id correlation for Langfuse, OpenTelemetry, Datadog, or custom integrations:
from selectools import Agent, AgentConfig, AgentObserver, LoggingObserver
class MyObserver(AgentObserver):
    def on_tool_end(self, run_id, call_id, tool_name, result, duration_ms):
        print(f"[{run_id}] {tool_name} finished in {duration_ms:.1f}ms")
    def on_provider_fallback(self, run_id, failed_provider, next_provider, error):
        print(f"[{run_id}] {failed_provider} failed, falling back to {next_provider}")
agent = Agent(
    tools=[...], provider=provider,
    config=AgentConfig(observers=[MyObserver(), LoggingObserver()]),
)
45 lifecycle events: run, LLM, tool, iteration, batch, policy, structured output, fallback, retry, memory trim, guardrail, coherence, screening, session, entity, KG, budget exceeded, cancelled, prompt compressed, plus 13 graph events (graph start/end, node start/end, routing, interrupt, resume, parallel, stall, loop, supervisor replan). See observer.py for full reference.
E2E Streaming & Parallel Execution
- agent.astream() yields StreamChunk (text deltas) then AgentResult (final)
- Multiple tool calls execute concurrently via asyncio.gather() (3 tools @ 0.15s each = ~0.15s total)
- Fallback chain: astream -> acomplete -> complete via executor
- Context propagation with contextvars for tracing/auth
See docs/modules/STREAMING.md for full documentation.
Providers
| OpenAI | Yes | Yes | Yes | Paid |
| Azure OpenAI | Yes | Yes | Yes | Paid (Azure billing) |
| Anthropic | Yes | Yes | Yes | Paid |
| Gemini | Yes | Yes | Yes | Free tier |
| Ollama | Yes | No | No | Free (local) |
| Fallback | Yes | Yes | Yes | Varies (wraps others) |
| Local | No | No | No | Free (testing) |
from selectools.models import OpenAI, Anthropic, Gemini, Ollama
model = OpenAI.GPT_4O_MINI
print(f"Cost: ${model.prompt_cost}/${model.completion_cost} per 1M tokens")
print(f"Context: {model.context_window:,} tokens")
Embedding Providers
from selectools.embeddings import (
    OpenAIEmbeddingProvider,
    AnthropicEmbeddingProvider,
    GeminiEmbeddingProvider,
    CohereEmbeddingProvider,
)
Vector Stores
from selectools.rag import VectorStore
from selectools.rag.stores import FAISSVectorStore, QdrantVectorStore, PgVectorStore
store = VectorStore.create("memory", embedder=embedder)
store = VectorStore.create("sqlite", embedder=embedder, db_path="docs.db")
store = VectorStore.create("chroma", embedder=embedder, persist_directory="./chroma")
store = VectorStore.create("pinecone", embedder=embedder, index_name="my-index")
store = FAISSVectorStore(embedder=embedder)
store = QdrantVectorStore(embedder=embedder, url="http://localhost:6333")
store = PgVectorStore(embedder=embedder, connection_string="postgresql://...")
Agent Configuration
config = AgentConfig(
    model="gpt-4o-mini",
    temperature=0.0,
    max_tokens=2000,
    max_iterations=6,
    max_retries=3,
    retry_backoff_seconds=2.0,
    request_timeout=60.0,
    tool_timeout_seconds=30.0,
    cost_warning_threshold=0.50,
    parallel_tool_execution=True,
    routing_only=False,
    stream=False,
    cache=None,
    tool_policy=None,
    confirm_action=None,
    approval_timeout=60.0,
    enable_analytics=True,
    verbose=False,
    observers=[LoggingObserver()],
    system_prompt="You are a helpful assistant...",
)
Tool Definition
@tool Decorator (Recommended)
```python
from selectools import tool

@tool(description="Calculate compound interest")
def calculate_interest(principal: float, rate: float, years: int) -> str:
    amount = principal * (1 + rate / 100) ** years
    return f"After {years} years: ${amount:.2f}"
```
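Since `@tool` wraps a plain function, the math is easy to check by hand. The same body as an undecorated function:

```python
def calculate_interest(principal: float, rate: float, years: int) -> str:
    # Compound interest: amount = principal * (1 + rate/100)^years
    amount = principal * (1 + rate / 100) ** years
    return f"After {years} years: ${amount:.2f}"

# 1000 at 5% for 10 years: 1000 * 1.05**10 ≈ 1628.89
print(calculate_interest(1000, 5, 10))  # → After 10 years: $1628.89
```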
Tool Registry
```python
from selectools import ToolRegistry

registry = ToolRegistry()

@registry.tool(description="Search the knowledge base")
def search_kb(query: str, max_results: int = 5) -> str:
    return f"Results for: {query}"

agent = Agent(tools=registry.all(), provider=provider)
```
Injected Parameters
Keep secrets out of the LLM's view:
```python
from selectools import Tool, ToolParameter

db_tool = Tool(
    name="query_db",
    description="Execute SQL query",
    parameters=[ToolParameter(name="sql", param_type=str, description="SQL query")],
    function=query_database,
    injected_kwargs={"db_connection": db_conn},  # never shown to the model
)
```
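Conceptually, `injected_kwargs` works like `functools.partial`: the model sees only the declared parameters, while the secret is bound at call time by the host. A stdlib-only illustration (the `query_database` body here is a stand-in):

```python
from functools import partial

def query_database(sql: str, db_connection: str) -> str:
    # db_connection is supplied by the host app, never exposed to the LLM.
    return f"ran {sql!r} on {db_connection}"

# Bind the secret once; callers (and the model) only ever pass `sql`.
bound = partial(query_database, db_connection="conn-1")
print(bound("SELECT 1"))  # → ran 'SELECT 1' on conn-1
```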
Streaming Tools
```python
from typing import Generator

from selectools import AgentConfig, SimpleStepObserver, tool

@tool(description="Process large file", streaming=True)
def process_file(filepath: str) -> Generator[str, None, None]:
    with open(filepath) as f:
        for i, line in enumerate(f, 1):
            yield f"[Line {i}] {line.strip()}\n"

# Print chunks as they arrive
config = AgentConfig(observers=[SimpleStepObserver(lambda event, run_id, **kw: print(kw.get("chunk", ""), end=""))])
```
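The generator itself is ordinary Python, so it can be exercised without an agent. A self-contained run against a temp file (mirrors the tool body above as a plain function):

```python
import os
import tempfile

def process_file(filepath):
    # Same body as the streaming tool above, minus the decorator.
    with open(filepath) as f:
        for i, line in enumerate(f, 1):
            yield f"[Line {i}] {line.strip()}\n"

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\n")

chunks = list(process_file(tmp.name))
os.unlink(tmp.name)
print(chunks)  # → ['[Line 1] alpha\n', '[Line 2] beta\n']
```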
Conversation Memory
```python
from selectools import Agent, ConversationMemory

memory = ConversationMemory(max_messages=20)
agent = Agent(tools=[...], provider=provider, memory=memory)

agent.ask("My name is Alice")
agent.ask("What's my name?")
```
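`max_messages=20` gives a sliding window over the conversation; behaviorally it is like a bounded deque (a toy analogy, not selectools internals):

```python
from collections import deque

# A 3-message window: the oldest message is dropped when a 4th arrives.
window = deque(maxlen=3)
for msg in ["m1", "m2", "m3", "m4"]:
    window.append(msg)

print(list(window))  # → ['m2', 'm3', 'm4']
```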
Cost Tracking
```python
result = agent.ask("Search and summarize")
print(f"Total cost: ${agent.total_cost:.6f}")
print(f"Total tokens: {agent.total_tokens:,}")
print(agent.get_usage_summary())
```
Examples
Examples are numbered by difficulty. Start from 01 and work your way up.
| # | File | Highlights | API key? |
|---|------|------------|----------|
| 01 | 01_hello_world.py | First agent, @tool, ask() | No |
| 02 | 02_search_weather.py | ToolRegistry, multiple tools | No |
| 03 | 03_toolbox.py | 24 pre-built tools (file, data, text, datetime, web) | No |
| 04 | 04_conversation_memory.py | Multi-turn memory | Yes |
| 05 | 05_cost_tracking.py | Token counting, cost warnings | Yes |
| 06 | 06_async_agent.py | arun(), concurrent agents, FastAPI | Yes |
| 07 | 07_streaming_tools.py | Generator-based streaming | Yes |
| 08 | 08_streaming_parallel.py | astream(), parallel execution, StreamChunk | Yes |
| 09 | 09_caching.py | InMemoryCache, RedisCache, cache stats | Yes |
| 10 | 10_routing_mode.py | Routing mode, intent classification | Yes |
| 11 | 11_tool_analytics.py | Call counts, success rates, timing | Yes |
| 12 | 12_observability_hooks.py | Lifecycle hooks, tool validation | Yes |
| 13 | 13_dynamic_tools.py | ToolLoader, plugins, hot-reload | Yes |
| 14 | 14_rag_basic.py | RAG pipeline, document loading, vector search | Yes + [rag] |
| 15 | 15_semantic_search.py | Pure semantic search, metadata filtering | Yes + [rag] |
| 16 | 16_rag_advanced.py | PDFs, SQLite persistence, custom chunking | Yes + [rag] |
| 17 | 17_rag_multi_provider.py | Embedding/store/chunk-size comparisons | Yes + [rag] |
| 18 | 18_hybrid_search.py | BM25 + vector fusion, RRF, reranking | Yes + [rag] |
| 19 | 19_advanced_chunking.py | Semantic and contextual chunking | Yes + [rag] |
| 20 | 20_customer_support_bot.py | Multi-tool customer support workflow | Yes |
| 21 | 21_data_analysis_agent.py | Data exploration and analysis | Yes |
| 22 | 22_ollama_local.py | Fully local LLM via Ollama | No (Ollama) |
| 23 | 23_structured_output.py | Pydantic response_format, auto-retry, JSON extraction | No |
| 24 | 24_traces_and_reasoning.py | AgentTrace timeline, reasoning visibility, JSON export | No |
| 25 | 25_provider_fallback.py | FallbackProvider, circuit breaker, failover chain | No |
| 26 | 26_batch_processing.py | batch(), abatch(), structured batch, error isolation | No |
| 27 | 27_tool_policy.py | ToolPolicy, deny_when, HITL approval, memory trimming | No |
| 28 | 28_agent_observer.py | AgentObserver, LoggingObserver, multiple observers, OTel export | No |
| 29 | 29_guardrails.py | Input/output guardrails, PII redaction, topic blocking | No |
| 30 | 30_audit_logging.py | JSONL audit logging, privacy controls, daily rotation | No |
| 31 | 31_tool_output_screening.py | Prompt injection detection in tool outputs | No |
| 32 | 32_coherence_checking.py | LLM-based intent verification for injection defense | Yes |
| 33 | 33_persistent_sessions.py | JsonFileSessionStore, cross-restart persistence | No |
| 34 | 34_summarize_on_trim.py | Summarize trimmed messages for context preservation | No |
| 35 | 35_entity_memory.py | Named entity extraction and tracking | No |
| 36 | 36_knowledge_graph.py | Triple extraction, in-memory and SQLite storage | No |
| 37 | 37_knowledge_memory.py | Cross-session facts, daily logs, remember tool | No |
| 38 | 38_terminal_tools.py | @tool(terminal=True), stop_condition callback | No |
| 39 | 39_eval_framework.py | EvalSuite, TestCase, evaluators, HTML reports | No |
| 40 | 40_eval_advanced.py | Pairwise A/B, regression detection, snapshots | No |
| 41 | 41_mcp_client.py | MCPClient, mcp_tools(), tool interop | No |
| 42 | 42_mcp_server.py | MCPServer, expose tools as MCP endpoints | No |
| 43 | 43_token_budget.py | max_total_tokens, max_cost_usd budget limits | No |
| 44 | 44_cancellation.py | CancellationToken, cooperative stopping | No |
| 45 | 45_approval_gate.py | @tool(requires_approval=True), confirm_action | No |
| 46 | 46_simple_observer.py | SimpleStepObserver, single-callback integration | No |
| 47 | 47_token_estimation.py | estimate_run_tokens(), pre-flight cost checks | No |
| 48 | 48_model_switching.py | model_selector callback, per-iteration model | No |
| 49 | 49_knowledge_stores.py | SQLite, Redis, Supabase knowledge stores | No |
| 50 | 50_reasoning_strategies.py | ReAct, Chain-of-Thought, Plan-then-Act | No |
| 51 | 51_tool_result_caching.py | @tool(cacheable=True, cache_ttl=300) | No |
| 52 | 52_semantic_cache.py | SemanticCache with embedding similarity | Yes |
| 53 | 53_prompt_compression.py | Auto-summarize old history on context fill | No |
| 54 | 54_conversation_branching.py | memory.branch(), store.branch() | No |
| 55 | 55_agent_graph_linear.py | Linear AgentGraph pipeline | No |
| 56 | 56_agent_graph_parallel.py | Parallel fan-out with merge policies | No |
| 57 | 57_agent_graph_conditional.py | Conditional routing with plain Python | No |
| 58 | 58_agent_graph_hitl.py | Human-in-the-loop with generator nodes | No |
| 59 | 59_agent_graph_checkpointing.py | Checkpoint, interrupt, resume | No |
| 60 | 60_supervisor_agent.py | SupervisorAgent with 4 strategies | No |
| 61 | 61_agent_graph_subgraph.py | Nested subgraph composition | No |
| 62 | 62_yaml_config.py | Load AgentConfig from YAML | No |
| 63 | 63_agent_templates.py | Built-in agent templates | No |
| 64 | 64_selectools_serve.py | Serve agent over HTTP with selectools serve | No |
| 65 | 65_tool_composition.py | compose() tool chaining | No |
| 66 | 66_streaming_pipeline.py | pipeline.astream() streaming composition | No |
| 67 | 67_type_safe_pipeline.py | Type-safe step contracts | No |
| 68 | 68_postgres_checkpoints.py | PostgresCheckpointStore for AgentGraph | Yes + [postgres] |
| 69 | 69_trace_store.py | Trace storage and querying | No |
| 70 | 70_plan_and_execute.py | PlanAndExecuteAgent with typed steps | No |
| 71 | 71_reflective_agent.py | ReflectiveAgent actor-critic loop | No |
| 72 | 72_debate_agent.py | DebateAgent with optimist/skeptic/judge | No |
| 73 | 73_team_lead_agent.py | TeamLeadAgent with all 3 delegation strategies | No |
Run any example:
```bash
python examples/01_hello_world.py
python examples/14_rag_basic.py
```
Documentation
Read the full documentation, hosted on GitHub Pages with search, dark mode, and easy navigation.
Also available in docs/:
| Doc | Covers |
|-----|--------|
| AGENT | Agent loop, structured output, traces, reasoning, batch, policy |
| STREAMING | E2E streaming, parallel execution, routing |
| TOOLS | Tool definition, validation, registry |
| DYNAMIC_TOOLS | ToolLoader, plugins, hot-reload |
| HYBRID_SEARCH | BM25, fusion, reranking |
| ADVANCED_CHUNKING | Semantic & contextual chunking |
| RAG | Complete RAG pipeline |
| EMBEDDINGS | Embedding providers |
| VECTOR_STORES | Storage backends |
| PROVIDERS | LLM provider adapters + FallbackProvider |
| MEMORY | Conversation memory + tool-pair trimming |
| USAGE | Cost tracking & analytics |
| MODELS | Model registry & pricing |
| SESSIONS | Persistent session stores (JSON, SQLite, Redis) |
| ENTITY_MEMORY | Entity extraction and tracking |
| KNOWLEDGE_GRAPH | Triple extraction and storage |
| KNOWLEDGE | Cross-session knowledge memory |
| GUARDRAILS | Input/output validation pipeline |
| AUDIT | JSONL audit logging |
| SECURITY | Screening & coherence checking |
| EVALS | 50 evaluators, A/B testing, regression |
| MCP | MCP client/server integration |
| BUDGET | Token/cost budget limits |
| CANCELLATION | Cooperative cancellation |
| ORCHESTRATION | AgentGraph, routing, parallel, HITL |
| SUPERVISOR | SupervisorAgent, 4 strategies |
| PATTERNS | PlanAndExecute, Reflective, Debate, TeamLead |
| PARSER | Tool call parsing |
| PROMPT | System prompt generation |
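The RRF fusion mentioned under HYBRID_SEARCH is small enough to sketch. A generic reciprocal-rank-fusion implementation using the standard `1/(k + rank)` formula (not selectools internals):

```python
def rrf(rankings, k=60):
    # rankings: ranked doc-id lists from each retriever (e.g. BM25, vector).
    # Each appearance contributes 1/(k + rank); higher total score wins.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, 1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_order = ["d1", "d2", "d3"]
vector_order = ["d3", "d1", "d2"]
print(rrf([bm25_order, vector_order]))  # → ['d1', 'd3', 'd2']
```

d1 wins because it ranks 1st and 2nd across the two lists, beating d3's 1st and 3rd.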
Tests
```bash
pytest tests/ -x -q
pytest tests/ -k "not e2e"
```
5332 tests covering parsing, agent loop, providers, RAG pipeline, hybrid search, advanced chunking, dynamic tools, caching, streaming, guardrails, sessions, memory, eval framework, budget/cancellation, knowledge stores, orchestration, pipelines, agent patterns, stability markers, trace viewer, and E2E integration with real API calls.
License
Apache-2.0. Use freely in commercial applications; no copyleft restrictions. See LICENSE.
Contributing
See CONTRIBUTING.md. We welcome contributions for new tools, providers, vector stores, examples, and documentation.
Roadmap | Changelog | Documentation