
Company News
Andrew Becherer Joins Socket as Chief Information Security Officer
Socket’s first CISO brings deep experience securing high-growth SaaS companies as open source supply chain threats accelerate.
Open-source coding agent runtime — test-verify-fix loops, persistent memory, multi-model orchestration
Self-improving agent runtime that learns from experience and drives LLM agents through test-verify-fix loops
Quick Start · Why Prax · Usage · Results · Integration Paths · Configuration · Architecture · Contributing
Goal: install Prax, configure an AI key, run your first task — in under 5 minutes. No programming background required.
Already an experienced user? Jump to One-liner for experienced users below.
Prax needs Node.js (for the CLI wrapper) and Python 3.10+ (for the runtime). Check if you already have them:
node --version # should print v14 or higher
python3 --version # should print Python 3.10 or higher
Missing one? Install:
| OS | Install command |
|---|---|
| macOS | brew install node python@3.12 (install Homebrew first if needed) |
| Linux | sudo apt install nodejs python3 python3-pip (Debian/Ubuntu) or sudo dnf install nodejs python3 (Fedora) |
| Windows | Use WSL2 and follow the Linux commands. Native Windows is not supported yet (on the 0.5.x roadmap). |
npm install -g praxagent
Verify:
prax --version
Should print:
prax 0.5.0
(0.5.0 or higher is fine.)
See command not found?
export PATH=/opt/homebrew/bin:$PATH then add it to ~/.zshrcnpm prefix -g's bin/ is on your $PATHPrax needs two things to call any LLM: an endpoint URL and an API key. The procedure is the same whether you use an official API or a third-party proxy.
Get the base_url and api_key from your service's dashboard or docs.
Export them as environment variables (the names are arbitrary — any name your shell can export works; LLM_BASE_URL / LLM_API_KEY is just our recommended convention):
export LLM_BASE_URL="https://your-service-endpoint"
export LLM_API_KEY="your-key"
Wire them into Prax via ~/.prax/models.yaml:
mkdir -p ~/.prax
cat > ~/.prax/models.yaml <<'YAML'
providers:
default:
base_url_env: LLM_BASE_URL
api_key_env: LLM_API_KEY
format: openai # use "anthropic" if your service speaks the Anthropic protocol
models:
- name: <your-model> # the exact model name your service exposes
default_model: <your-model>
YAML
Replace <your-model> with whatever model identifier your service supports (check the service's /v1/models endpoint or its dashboard).
Verify it works:
prax providers
You should see your provider listed with the model name and a status. With LLM_API_KEY set the status reads available; if it shows missing-credentials, the env var didn't reach Prax — re-run the export and make sure echo $LLM_API_KEY echoes the key back.
mkdir -p ~/Desktop/prax-hello && cd ~/Desktop/prax-hello
prax prompt "你是谁?用一句话回答。"
Should print something like:
我是 Prax 这个智能体运行时里跑的 AI 助手,可以帮你执行代码、测试和自动化任务。
Congrats — Prax is working.
echo "hello world" > greeting.txt
prax prompt "读 greeting.txt 里的内容,然后把它改成全大写再写回去"
cat greeting.txt
Should print:
HELLO WORLD
That's the distinguishing capability — Prax doesn't just chat, it reads, writes, runs tests, and verifies in a loop. Everything below builds on this.
Got stuck at any step? Common issues:
| Symptom | Fix |
|---|---|
Error: Model 'xxx' not found | The <your-model> name in ~/.prax/models.yaml doesn't match what your service exposes. Check its /v1/models endpoint or dashboard. |
HTTP 401 / Unauthorized | Key typo or expired. Regenerate and re-export LLM_API_KEY. |
| Silent exit with no output | Your endpoint is unreachable. Try curl -s "$LLM_BASE_URL"/... to confirm, or point LLM_BASE_URL at a healthy endpoint. |
| Chinese characters look broken | Set export LANG=zh_CN.UTF-8 in your shell rc. |
Once Step 5 above works, you're ready to pick a usage mode.
Use Prax directly at the shell prompt — perfect for automation, cron jobs, CI/CD, or just batching work without opening an IDE.
prax prompt "run pytest -q, fix the first failure, and stop when tests pass"
prax prompt "read README.md and propose 3 concrete improvements as a checklist"
prax cron add --name daily-news --schedule "0 17 * * *" --prompt "..." # schedule recurring work
Pick a role-matched walkthrough to see a real-world pipeline end-to-end:
| Your role | Tutorial | What you'll build |
|---|---|---|
| PM / support lead | support-digest | Daily PII-redacted ticket digest, local-only processing |
| Content creator / knowledge-base hobbyist | ai-news-daily | Scrape X/知乎/Bilibili → compile Obsidian wiki → send digest |
| Release manager | release-notes | Git log → CHANGELOG + release announcement |
| DevEx / tech writer | docs-audit | Weekly "which docs drifted from the code?" report |
| Engineering lead | pr-triage | Per-PR triage that actually runs tests on both branches |
All five tutorials start with sample data — no external API or real PR needed to follow along.
If you use Claude Code and want Prax's skills, commands, and verification hooks available inside the IDE:
prax install --profile full
This copies Prax's bundled skills (4 commercial recipes + 1 content-automation pipeline + 4 developer workflows) into ~/.claude/, registers Prax MCP servers, and wires hooks so Claude Code runs Prax's verification loop on every code change.
Verify:
prax doctor --target claude
Now reopen your project in Claude Code. You'll see new /prax-status, /prax-doctor, /prax-plan, /prax-verify slash commands, the prax-planner agent, and Prax's rules automatically applied.
To undo: prax uninstall --target claude.
Note:
prax installonly ships a Claude Code integration today. Any LLM you configure via Step 3 still works as Prax's backend regardless of IDE choice. A skill-export path for additional IDE/CLI hosts is on the 0.5.x roadmap.
git clone https://github.com/ChanningLua/prax-agent.git && cd prax-agent
pip install -e .
export LLM_BASE_URL="https://your-service-endpoint"
export LLM_API_KEY="your-key"
# (then write ~/.prax/models.yaml as shown in Step 3 above)
prax prompt "run pytest -q, fix the failure, and stop when tests pass"
Prax can execute shell commands on your behalf. It defaults to
workspace-writemode — files outside the project are off-limits. Use--permission-mode read-onlyfor safe exploration.
Prax isn't just another LLM wrapper — it's a production-grade agent runtime built for real repository work.
Prax learns from experience and self-improves across sessions and projects:
~/.prax/experiences.json that improves performance across all your repositoriesThese capabilities are production-ready and integrated into the core runtime — not experimental plugins.
Prax automatically learns from your work and applies that knowledge to future tasks. Here's how to work with its memory system:
Prax captures experience in these situations:
.prax/solutions/Check what Prax has learned:
# Project-specific memory
cat .prax/memory.json # Facts and context (JSON backend)
cat .prax/memory.db # Or SQLite backend
ls .prax/solutions/ # Problem-solution patterns
# Global cross-project experiences
cat ~/.prax/experiences.json # Shared learnings (JSON backend)
cat ~/.prax/experiences.db # Or SQLite backend
# Session history
ls .prax/sessions/ # Past conversation transcripts
Clean up memory when needed:
# Clear project memory
rm -rf .prax/memory.json .prax/solutions/
# Clear global experiences
rm -rf ~/.prax/experiences.json
# Clear session history
rm -rf .prax/sessions/
# Full reset
rm -rf .prax/ ~/.prax/
Prax supports two memory backends:
| Backend | Storage | Best For | Search |
|---|---|---|---|
| local (JSON) | .prax/memory.json + ~/.prax/experiences.json | Zero-config, small projects | Linear scan |
| sqlite | .prax/memory.db + ~/.prax/experiences.db | Medium to large projects, full-text search | FTS5 index |
Configure in .prax/config.yaml:
memory:
backend: local # or sqlite
local:
max_facts: 100
fact_confidence_threshold: 0.7
max_experiences: 500
Most tools send a prompt and hope for the best. Prax runs a test-verify-fix loop: it executes your test suite, analyzes failures, edits code, and re-runs until tests pass. The verification layer is first-class — not an afterthought.
Benchmark-proven: 10/10 repository repair tasks solved in 29.56s average (vs 8/10 baseline across peer frameworks).
Dual Runtime Paths — Native CLI for automation and CI/CD, Claude Code integration for interactive development. Choose the right tool for the job.
Cross-Session Persistent Memory — Context persists when you close the terminal. Two memory backends: JSON (zero-config) and SQLite (full-text search).
Multi-Model Orchestration — OpenAI-compatible, Anthropic-compatible, and custom-protocol LLMs with explicit routing, fallback chains, and cost tracking. Switch models mid-session with /model <your-model>.
Security by Design — Permission modes (read-only, workspace-write, danger-full-access), schema validation, workspace boundaries, and full audit trail.
Built for Real Codebases — 25+ built-in tools, middleware pipeline (loop detection, quality gates), multi-language support, and interactive REPL mode.
Transparent & Measurable — Real-time cost tracking, session history and replay, benchmark suite included, open architecture for custom extensions.
$ prax "run pytest -q, fix the failure, and stop when tests pass"
▶ VerifyCommand {"command": "pytest -q"}
✗ FAILED test_auth.py::test_login - AssertionError
▶ Read {"file_path": "src/auth.py"}
▶ Edit {"file_path": "src/auth.py", ...}
▶ VerifyCommand {"command": "pytest -q"}
✓ 1 passed in 0.12s
Verification passed. Task complete.
prax "explain the authentication flow in login.py"
prax "refactor auth.py error handling, replace requests with httpx"
prax "analyze project architecture, list technical debt, prioritize by impact"
prax repl
> analyze the codebase structure
> fix the SQL injection in user_query.py
> /model <your-model>
> /cost
Session: 12.4K tokens ($0.04)
/model, /session list, /plan, /todo show, /doctor, /cost, /help
Prax can own scheduled work end-to-end. Declare channels in .prax/notify.yaml, jobs in .prax/cron.yaml, and prax cron install writes the system-level trigger for you (LaunchAgent on macOS, crontab line on Linux).
# 1. configure an outbound channel (Feishu / Lark / Email)
cat > .prax/notify.yaml <<YAML
channels:
daily-digest:
provider: feishu_webhook
url: "\${FEISHU_WEBHOOK_URL}"
YAML
# 2. schedule a daily job
prax cron add \
--name ai-news-daily \
--schedule "0 17 * * *" \
--prompt "触发 ai-news-daily 技能" \
--notify-on failure \
--notify-channel daily-digest
# 3. install the per-minute dispatcher
prax cron install
See docs/recipes/ai-news-daily.md for the full AI-news-automation recipe.
Skills live under skills/ (bundled) or .prax/skills/ (project-local) and inject prompt guidance when their triggers match. Content / writing helpers:
| Skill | Triggers | Purpose |
|---|---|---|
browser-scrape | 抓取 scrape twitter zhihu bilibili autocli | Drive AutoCLI to scrape 55+ sites reusing the user's Chrome login |
knowledge-compile | 整理 compile wiki digest 知识库 | Turn raw markdown into Obsidian-ready wiki (index.md + topics/ + daily-digest.md) |
ai-news-daily | ai-news-daily daily digest 日报 | End-to-end pipeline: scrape → compile → notify |
chinese-coding | 中文 注释 文档 | Chinese comments/docs style guide |
Four additional commercial recipes (pr-triage, release-notes, docs-audit, support-digest) ship under skills/ as well — see the Commercial Use Cases table below for what each one does. Project-local skills in .prax/skills/ override bundled ones with the same name.
Four recipes tuned for team/enterprise workflows — designed to ship reviewable artefacts (not "AI said so" hallucinations) and to keep destructive actions firmly in human hands.
| Case | Target user | Prax differentiator | Recipe |
|---|---|---|---|
| PR Triage Bot | Eng lead | Actually checks out the PR branch and runs tests via VerifyCommand; compares against base. No GitHub side-effects. | docs/recipes/pr-triage.md |
| Release Notes Generator | Release manager | Reads git log + issue refs, groups by Conventional Commits into Keep-a-Changelog sections, idempotent per version. Writes files; never tags/pushes/publishes. | docs/recipes/release-notes.md |
| Docs Freshness Audit | DevEx / tech writer | Diffs recently-changed source vs doc mentions, outputs an evidence-cited drift report. Never edits docs itself. | docs/recipes/docs-audit.md |
| Support Ticket Digest | PM / support lead | Zero external API calls; PII redaction runs before any LLM sees the data — compliance-grade local-only processing. | docs/recipes/support-digest.md |
Each case is 10-minute deployable, works with the cron/notify plumbing above, and has hard contractual limits baked into its SKILL.md (no auto-approve, no auto-merge, no auto-refund, no auto-edit-docs) so the agent cannot drift.
Prax achieves 10/10 success rate on repository repair tasks, completing them in 29.56s average — 49% faster than the cross-framework baseline.
| Metric | Prax | Framework Baseline | Improvement |
|---|---|---|---|
| Success Rate | 10/10 (100%) | 8/10 (80%) | +25% |
| Average Time | 29.56s | 58.44s | -49% |
| Timeouts | 0 | 2 | -100% |
What drives these results:
Benchmark methodology: 10 repeated rounds on real repository-fix tasks with session state preserved. See docs/BENCHMARKS.md for full details.
Prax offers two runtime paths — choose the right tool for the job:
| Feature | Native Runtime | Claude Code Integration |
|---|---|---|
| Execution | CLI commands | Claude Code IDE |
| Interaction | Command-line REPL | IDE conversation interface |
| Context Management | Local JSON/SQLite | Claude Code sessions |
| Tool Integration | 25+ built-in tools | Claude Code tools + Prax extensions |
| Use Cases | Automation, CI/CD | Interactive development, code review |
Installation lives under Way 2 · Inside Claude Code IDE above (
prax install --profile full, thenprax doctor --target claude).
Models — create .prax/models.yaml in your project (or edit the global ~/.prax/models.yaml from Quick Start Step 3):
default_model: <your-model>
providers:
default:
base_url_env: LLM_BASE_URL # any env var name; declared in Quick Start Step 3
api_key_env: LLM_API_KEY
format: openai # use "anthropic" if your endpoint speaks the Anthropic protocol
models:
- name: <your-model> # the model identifier your service exposes
To wire multiple endpoints (e.g. one OpenAI-compatible and one Anthropic-compatible), declare another providers: entry with its own base_url_env / api_key_env and list its models. Each provider can also override base_url directly instead of via env (see core/llm_client.py for the full schema).
Permission modes
| Mode | What it allows | Default |
|---|---|---|
read-only | No file writes, no shell commands | |
workspace-write | Modify files inside the project | ✓ |
danger-full-access | Unrestricted |
prax --permission-mode read-only "analyze security vulnerabilities"
Runtime paths
| Flag | Behavior |
|---|---|
--runtime-path auto | Uses Claude CLI bridge if claude is installed, otherwise native runtime (default) |
--runtime-path native | Always use the native runtime |
--runtime-path bridge | Always use the Claude CLI bridge; fails if claude is not installed |
Data directory
| Path | Content |
|---|---|
.prax/sessions/ | Conversation history |
.prax/memory.json | Project memory (auto-extracted facts, JSON backend) |
.prax/memory.db | Project memory (SQLite backend) |
.prax/solutions/ | Problem-solution patterns from correction detection |
.prax/todos.json | Current task list |
.prax/agents/ | Custom agent definitions |
.prax/models.yaml | Model configuration |
.prax/config.yaml | Project-level configuration (memory backend, etc.) |
~/.prax/ | Global config (cross-project) |
~/.prax/experiences.json | Global cross-project experiences (JSON backend) |
~/.prax/experiences.db | Global cross-project experiences (SQLite backend) |
~/.prax/config.yaml | User-level configuration |
Key modules:
| Path | Role |
|---|---|
core/agent_loop.py | Core orchestration cycle (25 iter max, circuit breaker) |
core/middleware.py | VerificationGuidance, LoopDetection, QualityGate, etc. |
tools/verify_command.py | Bounded verification (pytest, npm test, cargo test, go test) |
tools/sandbox_bash.py | Auto-downgrade: verify commands bypass sandbox overhead |
core/memory/ | Pluggable backends (JSON / SQLite) |
core/llm_client.py | Provider registry, multi-model routing |
agents/ | Ralph (planner), Sisyphus (executor), Team (parallel) |
workflows/ | Task decomposition and orchestration |
We welcome contributions! See CONTRIBUTING.md for:
For benchmark and reproducibility work, also see docs/BENCHMARKS.md.
MIT License — see LICENSE for details.
FAQs
Open-source coding agent runtime — test-verify-fix loops, persistent memory, multi-model orchestration
We found that praxagent demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Company News
Socket’s first CISO brings deep experience securing high-growth SaaS companies as open source supply chain threats accelerate.

Company News
Replit is integrating Socket Firewall into its AI-powered development experience to help protect builders from malicious open source packages.

Security News
npm confirmed a tooling bug incorrectly marked several one-character packages as security holders and said it was working on a rollback.