AbstractCore

Unified LLM Interface
Write once, run everywhere
AbstractCore is a Python library that provides a unified create_llm(...) API across cloud + local LLM providers (OpenAI, Anthropic, Ollama, LMStudio, and more). The default install is intentionally lightweight; add providers and optional subsystems via explicit install extras.
First-class support for:
- sync + async
- streaming + non-streaming
- universal tool calling (native + prompted tool syntax)
- structured output (Pydantic)
- media input (images/audio/video + documents) with explicit, policy-driven fallbacks (*)
- optional capability plugins (core.voice/core.audio/core.vision) for deterministic TTS/STT and generative vision (via abstractvoice / abstractvision)
- glyph visual-text compression for long documents (**)
- optional OpenAI-compatible /v1 gateway server (multi-provider) and single-model endpoint
(*) Media input is policy-driven (no silent semantic changes). If a model doesn’t support images, AbstractCore can use a configured vision model to generate short visual observations and inject them into your text-only request (vision fallback). Audio/video attachments are also policy-driven (audio_policy, video_policy) and may require capability plugins for fallbacks. See Media Handling and Centralized Config.
(**) Optional visual-text compression: render long text/PDFs into images and process them with a vision model to reduce token usage. See Glyph Visual-Text Compression (install with pip install "abstractcore[compression]"; for PDFs, also run pip install "abstractcore[media]").
Docs: Getting Started · FAQ · Docs Index · https://lpalbou.github.io/AbstractCore
AbstractFramework ecosystem
AbstractCore is part of the AbstractFramework ecosystem.
By default, AbstractCore is pass-through for tools (execute_tools=False): it returns structured tool calls in response.tool_calls, and your runtime decides whether/how to execute them (policy, sandboxing, retries, persistence). See Tool Calling and Architecture.
graph LR
  APP[Your app] --> AC[AbstractCore]
  AC --> P[Provider adapter]
  P --> LLM[LLM backend]
  AC -. tool_calls .-> RT["AbstractRuntime (optional)"]
  RT -. tool results .-> AC
Install
pip install abstractcore
pip install "abstractcore[openai]"
pip install "abstractcore[anthropic]"
pip install "abstractcore[huggingface]"
pip install "abstractcore[mlx]"
pip install "abstractcore[vllm]"
pip install "abstractcore[tools]"
pip install "abstractcore[media]"
pip install "abstractcore[compression]"
pip install "abstractcore[embeddings]"
pip install "abstractcore[tokens]"
pip install "abstractcore[server]"
pip install "abstractcore[openai,media,tools]"
pip install "abstractcore[all-apple]"
pip install "abstractcore[all-non-mlx]"
pip install "abstractcore[all-gpu]"
Quickstart
OpenAI example (requires pip install "abstractcore[openai]"):
from abstractcore import create_llm
llm = create_llm("openai", model="gpt-4o-mini")
response = llm.generate("What is the capital of France?")
print(response.content)
Conversation state (BasicSession)
from abstractcore import create_llm, BasicSession
session = BasicSession(create_llm("anthropic", model="claude-haiku-4-5"))
print(session.generate("Give me 3 bakery name ideas.").content)
print(session.generate("Pick the best one and explain why.").content)
Streaming
from abstractcore import create_llm
llm = create_llm("ollama", model="qwen3:4b-instruct")
for chunk in llm.generate("Write a short poem about distributed systems.", stream=True):
    print(chunk.content or "", end="", flush=True)
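If you also need the complete text after streaming finishes, the loop above can accumulate chunks while printing them. The following is a minimal sketch: collect_stream is a hypothetical helper (not part of AbstractCore), and the fake_stream list only stands in for what llm.generate(..., stream=True) yields, assuming each chunk exposes a .content attribute that may be empty.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable, echo: bool = True) -> str:
    """Optionally print streamed chunks as they arrive and return the full text."""
    parts = []
    for chunk in chunks:
        text = chunk.content or ""  # some chunks may carry no text
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)


# Stand-in for llm.generate(..., stream=True): any iterable of objects with .content
fake_stream = [SimpleNamespace(content=c) for c in ("Nodes ", None, "gossip ", "softly.")]
full_text = collect_stream(fake_stream, echo=False)
```

With a real provider, pass the generator returned by llm.generate(..., stream=True) in place of fake_stream.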
Async
import asyncio
from abstractcore import create_llm
async def main():
    llm = create_llm("openai", model="gpt-4o-mini")
    resp = await llm.agenerate("Give me 5 bullet points about HTTP caching.")
    print(resp.content)
asyncio.run(main())
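Because agenerate is awaited as a coroutine in the example above, several prompts can in principle be issued concurrently with asyncio.gather. The sketch below uses a hypothetical StubLLM class in place of a real client so it runs without any provider; only the agenerate(...) call shape and the .content attribute are taken from the example.

```python
import asyncio
from types import SimpleNamespace


class StubLLM:
    """Hypothetical stand-in exposing the same agenerate(...) coroutine shape."""

    async def agenerate(self, prompt: str):
        await asyncio.sleep(0.01)  # simulate network latency
        return SimpleNamespace(content=f"echo: {prompt}")


async def ask_all(llm, prompts):
    # Schedule every request at once; gather returns results in input order.
    responses = await asyncio.gather(*(llm.agenerate(p) for p in prompts))
    return [r.content for r in responses]


answers = asyncio.run(ask_all(StubLLM(), ["HTTP caching", "ETags"]))
```

Swapping StubLLM() for an object returned by create_llm(...) should work the same way, subject to provider rate limits.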
Token budgets (unified)
from abstractcore import create_llm
llm = create_llm(
    "openai",
    model="gpt-4o-mini",
    max_tokens=8000,
    max_output_tokens=1200,
)
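A quick way to reason about the two limits is as simple arithmetic. The sketch below assumes that max_tokens is the total budget (prompt plus response) and that max_output_tokens reserves part of it for the response; this reading is an assumption, so check the token documentation for the exact semantics. The input_budget helper is hypothetical.

```python
def input_budget(max_tokens: int, max_output_tokens: int) -> int:
    """Tokens left for the prompt once the output reservation is subtracted.

    Assumes max_tokens covers input + output (an assumption, not confirmed here).
    """
    if max_output_tokens > max_tokens:
        raise ValueError("max_output_tokens cannot exceed max_tokens")
    return max_tokens - max_output_tokens


remaining = input_budget(max_tokens=8000, max_output_tokens=1200)
print(remaining)  # 6800
```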
Providers (common)
Open-source-first: local providers (Ollama, LMStudio, vLLM, openai-compatible, HuggingFace, MLX) are first-class. Cloud and gateway providers are optional.
openai: OPENAI_API_KEY, optional OPENAI_BASE_URL
anthropic: ANTHROPIC_API_KEY, optional ANTHROPIC_BASE_URL
openrouter: OPENROUTER_API_KEY, optional OPENROUTER_BASE_URL (default: https://openrouter.ai/api/v1)
portkey: PORTKEY_API_KEY, PORTKEY_CONFIG (config id), optional PORTKEY_BASE_URL (default: https://api.portkey.ai/v1)
ollama: local server at OLLAMA_BASE_URL (or legacy OLLAMA_HOST)
lmstudio: OpenAI-compatible local server at LMSTUDIO_BASE_URL (default: http://localhost:1234/v1)
vllm: OpenAI-compatible server at VLLM_BASE_URL (default: http://localhost:8000/v1)
openai-compatible: generic OpenAI-compatible endpoints via OPENAI_COMPATIBLE_BASE_URL (default: http://localhost:1234/v1)
huggingface: local models via Transformers (optional HUGGINGFACE_TOKEN for gated downloads)
mlx: Apple Silicon local models (optional HUGGINGFACE_TOKEN for gated downloads)
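The environment variables and defaults listed above can be checked before creating a client. This is an illustrative sketch only: resolve_base_url and the BASE_URLS table are hypothetical names, and the lookup order (environment variable first, then the documented default) is an assumption about how the settings are meant to combine, not AbstractCore's actual resolution code.

```python
import os
from typing import Mapping, Optional

# (environment variable, documented default) per provider, copied from the list above.
BASE_URLS = {
    "openrouter": ("OPENROUTER_BASE_URL", "https://openrouter.ai/api/v1"),
    "portkey": ("PORTKEY_BASE_URL", "https://api.portkey.ai/v1"),
    "lmstudio": ("LMSTUDIO_BASE_URL", "http://localhost:1234/v1"),
    "vllm": ("VLLM_BASE_URL", "http://localhost:8000/v1"),
    "openai-compatible": ("OPENAI_COMPATIBLE_BASE_URL", "http://localhost:1234/v1"),
}


def resolve_base_url(provider: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the env override if set and non-empty, else the documented default."""
    env = os.environ if env is None else env
    var, default = BASE_URLS[provider]
    return env.get(var) or default


print(resolve_base_url("vllm", env={}))  # http://localhost:8000/v1
```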
You can also persist settings (including API keys) via the config CLI:
abstractcore --status
abstractcore --configure (alias: --config)
abstractcore --set-api-key openai sk-...
What’s inside (quick tour)
Tool calling (passthrough by default)
By default (execute_tools=False), AbstractCore:
- returns clean assistant text in response.content
- returns structured tool calls in response.tool_calls (host/runtime executes them)
from abstractcore import create_llm, tool
@tool
def get_weather(city: str) -> str:
    return f"{city}: 22°C and sunny"
llm = create_llm("openai", model="gpt-4o-mini")
resp = llm.generate("What's the weather in Paris? Use the tool.", tools=[get_weather])
print(resp.content)
print(resp.tool_calls)
If you need tool-call markup preserved/re-written in content for downstream parsers, pass tool_call_tags=... (e.g. "qwen3", "llama3", "xml"). See Tool Syntax Rewriting.
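Since tool calls are passed through rather than executed, the host needs its own dispatch step. The sketch below shows one possible shape with an allow-list and error capture. It assumes each entry in response.tool_calls can be read as a dict with "name" and "arguments" keys; that shape, the REGISTRY table, and run_tool_calls are illustrative assumptions, so adapt them to the actual objects your version returns.

```python
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    return f"{city}: 22°C and sunny"


# Allow-list: only tools registered here may run.
REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_calls(tool_calls: List[dict]) -> List[dict]:
    """Execute allowed tool calls and collect outputs or errors per call."""
    results = []
    for call in tool_calls:
        name, args = call["name"], call.get("arguments") or {}
        func = REGISTRY.get(name)
        if func is None:
            results.append({"name": name, "error": "tool not allowed"})
            continue
        try:
            results.append({"name": name, "output": func(**args)})
        except Exception as exc:  # report failures instead of crashing the host
            results.append({"name": name, "error": str(exc)})
    return results


# Hand-written calls standing in for response.tool_calls (assumed shape).
calls = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "delete_files", "arguments": {}},
]
results = run_tool_calls(calls)
```

Keeping execution in the host is what makes room for policy decisions such as sandboxing, retries, or persistence before any tool actually runs.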
Structured output
from pydantic import BaseModel
from abstractcore import create_llm
class Answer(BaseModel):
    title: str
    bullets: list[str]
llm = create_llm("openai", model="gpt-4o-mini")
answer = llm.generate("Summarize HTTP/3 in 3 bullets.", response_model=Answer)
print(answer.bullets)
Media input (images/audio/video)
Requires pip install "abstractcore[media]".
from abstractcore import create_llm
llm = create_llm("anthropic", model="claude-haiku-4-5")
resp = llm.generate("Describe the image.", media=["./image.png"])
print(resp.content)
Notes:
- Images: use a vision-capable model, or configure vision fallback for text-only models (abstractcore --config; abstractcore --set-vision-provider PROVIDER MODEL).
- Video: video_policy="auto" (default) uses native video when supported, otherwise samples frames (requires ffmpeg/ffprobe) and routes them through image/vision handling (so you still need a vision-capable model or vision fallback configured).
- Audio: use an audio-capable model, or set audio_policy="auto"/"speech_to_text" and install abstractvoice for speech-to-text.
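The video note above describes a small decision tree, which can be written out as a preflight check. This is a sketch of the documented behavior for video_policy="auto", not AbstractCore's implementation: plan_video_route and its parameters are hypothetical, and the only real dependency check is shutil.which for the ffmpeg and ffprobe binaries.

```python
import shutil


def plan_video_route(native_video: bool, vision_available: bool, which=shutil.which) -> str:
    """Return 'native' or 'frames', or raise if neither route is possible."""
    if native_video:
        return "native"  # the model accepts video directly
    # Otherwise frames are sampled, which needs both binaries on PATH.
    missing = [tool for tool in ("ffmpeg", "ffprobe") if which(tool) is None]
    if missing:
        raise RuntimeError(f"frame sampling needs: {', '.join(missing)}")
    # Sampled frames go through image handling, so vision must be available.
    if not vision_available:
        raise RuntimeError("frame sampling needs a vision-capable model or vision fallback")
    return "frames"
```

Here vision_available means either the target model accepts images or a vision fallback has been configured; the which parameter exists only so the check can be exercised without the binaries installed.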
Configure defaults (optional):
abstractcore --status
abstractcore --set-vision-provider lmstudio qwen/qwen3-vl-4b
abstractcore --set-audio-strategy auto
abstractcore --set-video-strategy auto
See Media Handling and Vision Capabilities.
HTTP server (OpenAI-compatible gateway)
pip install "abstractcore[server]"
python -m abstractcore.server.app
Use any OpenAI-compatible client, and route to any provider/model via model="provider/model":
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
resp = client.chat.completions.create(
    model="ollama/qwen3:4b-instruct",
    messages=[{"role": "user", "content": "Hello from the gateway!"}],
)
print(resp.choices[0].message.content)
See Server.
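To make the model="provider/model" convention concrete, here is a minimal sketch of how such an identifier could be split. How the gateway actually parses it is not stated here; splitting on the first slash is an assumption, chosen so that model names that themselves contain a slash stay intact. split_model_id is a hypothetical helper.

```python
from typing import Tuple


def split_model_id(model: str) -> Tuple[str, str]:
    """Split 'provider/model' on the first slash only."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


print(split_model_id("ollama/qwen3:4b-instruct"))       # ('ollama', 'qwen3:4b-instruct')
print(split_model_id("openrouter/openai/gpt-4o-mini"))  # ('openrouter', 'openai/gpt-4o-mini')
```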
Single-model /v1 endpoint (one provider/model per worker): see Endpoint (abstractcore-endpoint).
CLI (optional)
Interactive chat:
abstractcore-chat --provider openai --model gpt-4o-mini
abstractcore-chat --provider lmstudio --model qwen/qwen3-4b-2507 --base-url http://localhost:1234/v1
abstractcore-chat --provider openrouter --model openai/gpt-4o-mini
Token limits:
- startup: abstractcore-chat --max-tokens 8192 --max-output-tokens 1024 ...
- in-REPL: /max-tokens 8192 and /max-output-tokens 1024
Built-in CLI apps
AbstractCore also ships with ready-to-use CLI apps:
summarizer, extractor, judge, intent, deepsearch (see docs/apps/)
Documentation map
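The documentation is organized into four groups: start here, core features, reference and internals, and project. See the Docs Index or https://lpalbou.github.io/AbstractCore.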
License
MIT