praxagent

Open-source coding agent runtime — test-verify-fix loops, persistent memory, multi-model orchestration

Source: npm · Latest version: 0.6.0 · Maintainers: 1

Prax

Self-improving agent runtime that learns from experience and drives LLM agents through test-verify-fix loops


License: MIT · Python 3.10+

Quick Start · Why Prax · Usage · Results · Integration Paths · Configuration · Architecture · Contributing


Quick Start

Goal: install Prax, configure an AI key, run your first task — in under 5 minutes. No programming background required.

Already an experienced user? Jump to One-liner for experienced users below.

Step 1 · Install prerequisites

Prax needs Node.js (for the CLI wrapper) and Python 3.10+ (for the runtime). Check if you already have them:

node --version      # should print v14 or higher
python3 --version   # should print Python 3.10 or higher

Missing one? Install:

| OS | Install command |
| --- | --- |
| macOS | brew install node python@3.12 (install Homebrew first if needed) |
| Linux | sudo apt install nodejs python3 python3-pip (Debian/Ubuntu) or sudo dnf install nodejs python3 (Fedora) |
| Windows | Use WSL2 and follow the Linux commands. Native Windows is not supported yet (on the 0.5.x roadmap). |

Step 2 · Install Prax

npm install -g praxagent

Verify:

prax --version

Should print:

prax 0.5.0

(0.5.0 or higher is fine.)

See command not found?

  • macOS with Homebrew Node: run export PATH=/opt/homebrew/bin:$PATH then add it to ~/.zshrc
  • Linux: confirm that the bin/ directory under the path printed by npm prefix -g is on your $PATH

Step 3 · Point Prax at an LLM service

Prax needs two things to call any LLM: an endpoint URL and an API key. The procedure is the same whether you use an official API or a third-party proxy.

  • Get the base_url and api_key from your service's dashboard or docs.

  • Export them as environment variables (the names are arbitrary — any name your shell can export works; LLM_BASE_URL / LLM_API_KEY is just our recommended convention):

    export LLM_BASE_URL="https://your-service-endpoint"
    export LLM_API_KEY="your-key"
    
  • Wire them into Prax via ~/.prax/models.yaml:

    mkdir -p ~/.prax
    cat > ~/.prax/models.yaml <<'YAML'
    providers:
      default:
        base_url_env: LLM_BASE_URL
        api_key_env: LLM_API_KEY
        format: openai          # use "anthropic" if your service speaks the Anthropic protocol
        models:
          - name: <your-model>  # the exact model name your service exposes
    default_model: <your-model>
    YAML
    

    Replace <your-model> with whatever model identifier your service supports (check the service's /v1/models endpoint or its dashboard).

  • Verify it works:

    prax providers
    

    You should see your provider listed with the model name and a status. With LLM_API_KEY set the status reads available; if it shows missing-credentials, the env var didn't reach Prax — re-run the export and make sure echo $LLM_API_KEY echoes the key back.
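The status check described above can be approximated in a few lines of Python. This is a minimal sketch, not Prax's implementation: the function name `provider_status` is hypothetical, and treating a missing base URL the same way as a missing key is an assumption. It only mirrors the documented behaviour that a provider reads `available` when its env vars are set and `missing-credentials` otherwise.

```python
import os
from typing import Mapping


def provider_status(provider: Mapping[str, str],
                    env: Mapping[str, str] = os.environ) -> str:
    """Return 'available' when every env var named by the provider is set.

    `provider` mirrors one entry under `providers:` in models.yaml.
    Assumption: an unset or empty variable means 'missing-credentials'.
    """
    for key in ("base_url_env", "api_key_env"):
        var_name = provider.get(key, "")
        if not var_name or not env.get(var_name):
            return "missing-credentials"
    return "available"


# Mirrors the `default` provider from the models.yaml shown above.
default_provider = {"base_url_env": "LLM_BASE_URL", "api_key_env": "LLM_API_KEY"}

if __name__ == "__main__":
    print(provider_status(default_provider))
```

Running the script in the same shell where you exported the variables is a quick way to confirm they are visible to child processes before you debug Prax itself.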

Step 4 · Your first task

mkdir -p ~/Desktop/prax-hello && cd ~/Desktop/prax-hello
prax prompt "Who are you? Answer in one sentence."

Should print something like:

I'm the AI assistant running inside the Prax agent runtime; I can help you run code, tests, and automation tasks.

Congrats — Prax is working.

Step 5 · Prove it can touch files (the real power)

echo "hello world" > greeting.txt
prax prompt "Read the contents of greeting.txt, convert them to all uppercase, and write them back"
cat greeting.txt

Should print:

HELLO WORLD

That's the distinguishing capability — Prax doesn't just chat, it reads, writes, runs tests, and verifies in a loop. Everything below builds on this.

Got stuck at any step? Common issues:

| Symptom | Fix |
| --- | --- |
| Error: Model 'xxx' not found | The <your-model> name in ~/.prax/models.yaml doesn't match what your service exposes. Check its /v1/models endpoint or dashboard. |
| HTTP 401 / Unauthorized | Key typo or expired. Regenerate and re-export LLM_API_KEY. |
| Silent exit with no output | Your endpoint is unreachable. Try curl -s "$LLM_BASE_URL"/... to confirm, or point LLM_BASE_URL at a healthy endpoint. |
| Chinese characters look broken | Set export LANG=zh_CN.UTF-8 in your shell rc. |

Two ways to use Prax

Once Step 5 above works, you're ready to pick a usage mode.

Way 1 · Standalone from the terminal (what you just did)

Use Prax directly at the shell prompt — perfect for automation, cron jobs, CI/CD, or just batching work without opening an IDE.

prax prompt "run pytest -q, fix the first failure, and stop when tests pass"
prax prompt "read README.md and propose 3 concrete improvements as a checklist"
prax cron add --name daily-news --schedule "0 17 * * *" --prompt "..."   # schedule recurring work

Pick a role-matched walkthrough to see a real-world pipeline end-to-end:

| Your role | Tutorial | What you'll build |
| --- | --- | --- |
| PM / support lead | support-digest | Daily PII-redacted ticket digest, local-only processing |
| Content creator / knowledge-base hobbyist | ai-news-daily | Scrape X/Zhihu/Bilibili → compile Obsidian wiki → send digest |
| Release manager | release-notes | Git log → CHANGELOG + release announcement |
| DevEx / tech writer | docs-audit | Weekly "which docs drifted from the code?" report |
| Engineering lead | pr-triage | Per-PR triage that actually runs tests on both branches |

All five tutorials start with sample data — no external API or real PR needed to follow along.

Way 2 · Inside Claude Code IDE

If you use Claude Code and want Prax's skills, commands, and verification hooks available inside the IDE:

prax install --profile full

This copies Prax's bundled skills (4 commercial recipes + 1 content-automation pipeline + 4 developer workflows) into ~/.claude/, registers Prax MCP servers, and wires hooks so Claude Code runs Prax's verification loop on every code change.

Verify:

prax doctor --target claude

Now reopen your project in Claude Code. You'll see new /prax-status, /prax-doctor, /prax-plan, /prax-verify slash commands, the prax-planner agent, and Prax's rules automatically applied.

To undo: prax uninstall --target claude.

Note: prax install only ships a Claude Code integration today. Any LLM you configure via Step 3 still works as Prax's backend regardless of IDE choice. A skill-export path for additional IDE/CLI hosts is on the 0.5.x roadmap.

One-liner for experienced users

git clone https://github.com/ChanningLua/prax-agent.git && cd prax-agent
pip install -e .
export LLM_BASE_URL="https://your-service-endpoint"
export LLM_API_KEY="your-key"
# (then write ~/.prax/models.yaml as shown in Step 3 above)
prax prompt "run pytest -q, fix the failure, and stop when tests pass"

Prax can execute shell commands on your behalf. It defaults to workspace-write mode — files outside the project are off-limits. Use --permission-mode read-only for safe exploration.

Why Prax

Prax isn't just another LLM wrapper — it's a production-grade agent runtime built for real repository work.

Agent Capabilities

Experience-Based Self-Improvement

Prax learns from experience and self-improves across sessions and projects:

  • Correction Detection — Automatically detects when users correct mistakes, extracts problem-solution patterns, and applies them in future sessions (multilingual support)
  • Cross-Project Experience Accumulation — Builds a global experience store at ~/.prax/experiences.json that improves performance across all your repositories
  • Structured Error Recovery — Blacklists failing approaches and tries alternatives, preventing repeated mistakes within the same session
  • Persistent Memory with Confidence Scoring: two backends (JSON/SQLite) track context, decisions, and learned patterns; each fact carries a static confidence score (time-based decay is not currently implemented)
  • Temporal Knowledge Graph — Tracks entity relationships and their evolution across sessions
  • Checkpoint/Resume — Crash recovery ensures no work is lost, even during long-running tasks
  • Trajectory Recording — Learns from execution history to identify successful patterns and avoid failure modes

These capabilities are production-ready and integrated into the core runtime — not experimental plugins.

Working with Experience & Memory

Prax automatically learns from your work and applies that knowledge to future tasks. Here's how to work with its memory system:

Automatic Learning

Prax captures experience in these situations:

  • Correction Detection — When you correct a mistake (e.g., "that's wrong", "不对", "try again"), Prax extracts the problem-solution pattern and saves it to .prax/solutions/
  • Task Completion — Facts with confidence ≥ 0.7 are persisted to project memory
  • Tool Failures — Failed approaches are blacklisted within the session to avoid repetition
  • Verification Success — Successful test-fix patterns are recorded as experiences
  • Session End — Context snapshots are saved for the next session to resume
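To make the correction-detection trigger concrete, here is a deliberately naive sketch. The phrase list contains only the three examples quoted above, and `looks_like_correction` is a hypothetical name; Prax's actual detector is not documented here and is presumably more sophisticated.

```python
# Phrases taken from the examples in the list above; a real detector would
# need a much larger, per-language set (this tuple is an assumption).
CORRECTION_PHRASES = ("that's wrong", "不对", "try again")


def looks_like_correction(message: str) -> bool:
    """Return True when the message contains a known correction phrase."""
    # casefold() gives case-insensitive matching; normalise curly apostrophes.
    normalized = message.casefold().replace("\u2019", "'")
    return any(phrase in normalized for phrase in CORRECTION_PHRASES)


if __name__ == "__main__":
    for text in ("No, that's wrong. Use httpx.", "Looks good, thanks"):
        print(text, "->", looks_like_correction(text))
```

Substring matching like this will produce false positives (for example, quoting someone else's "try again"), which is one reason a production detector would need more context than a phrase list.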

Viewing Memory

Check what Prax has learned:

# Project-specific memory
cat .prax/memory.json                    # Facts and context (JSON backend)
sqlite3 .prax/memory.db .tables          # SQLite backend is a binary file; list its tables with the sqlite3 CLI
ls .prax/solutions/                      # Problem-solution patterns

# Global cross-project experiences
cat ~/.prax/experiences.json             # Shared learnings (JSON backend)
sqlite3 ~/.prax/experiences.db .tables   # Or SQLite backend (binary file, inspect with sqlite3)

# Session history
ls .prax/sessions/             # Past conversation transcripts

Managing Memory

Clean up memory when needed:

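# Clear project memory (both backends)
rm -rf .prax/memory.json .prax/memory.db .prax/solutions/

# Clear global experiences (both backends)
rm -rf ~/.prax/experiences.json ~/.prax/experiences.db

# Clear session history
rm -rf .prax/sessions/

# Full reset. Warning: this also deletes models.yaml, config.yaml, and everything else under both directories
rm -rf .prax/ ~/.prax/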

Memory Backends

Prax supports two memory backends:

| Backend | Storage | Best For | Search |
| --- | --- | --- | --- |
| local (JSON) | .prax/memory.json + ~/.prax/experiences.json | Zero-config, small projects | Linear scan |
| sqlite | .prax/memory.db + ~/.prax/experiences.db | Medium to large projects, full-text search | FTS5 index |

Configure in .prax/config.yaml:

memory:
  backend: local  # or sqlite
  local:
    max_facts: 100
    fact_confidence_threshold: 0.7
    max_experiences: 500

Confidence & Decay

  • Facts with confidence ≥ 0.7 are persisted to memory
  • Lower-confidence observations are kept in session context only
  • Confidence scoring is static per fact (no time-based decay currently implemented)
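The threshold rule above can be sketched as a simple partition. The `Fact` class and `split_facts` function are hypothetical names, and how Prax handles facts beyond `max_facts` is not documented; keeping the highest-confidence ones is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    text: str
    confidence: float  # static score; no time-based decay is applied


def split_facts(facts, threshold=0.7, max_facts=100):
    """Return (persisted, session_only) using the documented 0.7 threshold.

    Assumption: when more than `max_facts` facts qualify, the ones with the
    highest confidence are kept.
    """
    eligible = sorted(
        (f for f in facts if f.confidence >= threshold),
        key=lambda f: f.confidence,
        reverse=True,
    )
    session_only = [f for f in facts if f.confidence < threshold]
    return eligible[:max_facts], session_only


if __name__ == "__main__":
    observed = [Fact("project uses pytest", 0.9), Fact("may use tox", 0.4)]
    persisted, session_only = split_facts(observed)
    print([f.text for f in persisted], [f.text for f in session_only])
```

The defaults match `fact_confidence_threshold: 0.7` and `max_facts: 100` from the config snippet above.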

Verification-First Architecture

Verification Loop

Most tools send a prompt and hope for the best. Prax runs a test-verify-fix loop: it executes your test suite, analyzes failures, edits code, and re-runs until tests pass. The verification layer is first-class — not an afterthought.
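As an illustration of the loop just described, the control flow can be sketched as follows. This is not Prax's code: `run_verification` and `verify_fix_loop` are hypothetical names, the `fix` callback stands in for the LLM-driven edit step, and the default of 25 iterations is borrowed from the iteration cap listed for core/agent_loop.py in the Architecture section.

```python
import subprocess
from typing import Callable, Sequence


def run_verification(command: Sequence[str]) -> tuple[bool, str]:
    """Run a test command and return (passed, combined stdout and stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def verify_fix_loop(verify: Callable[[], tuple[bool, str]],
                    fix: Callable[[str], None],
                    max_iterations: int = 25) -> bool:
    """Alternate verify and fix until verification passes or the cap is hit.

    `max_iterations` bounds the number of verification runs, so the loop
    always terminates even if `fix` never converges.
    """
    for attempt in range(max_iterations):
        passed, output = verify()
        if passed:
            return True
        if attempt < max_iterations - 1:
            fix(output)  # hypothetical hook: analyse the failure and edit code
    return False
```

A caller would pass something like `lambda: run_verification(["pytest", "-q"])` as `verify`. Keeping both steps injectable makes the loop itself easy to test without a real test suite or model.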

Benchmark-proven: 10/10 repository repair tasks solved in 29.56s average (vs 8/10 baseline across peer frameworks).

Dual Runtime Paths — Native CLI for automation and CI/CD, Claude Code integration for interactive development. Choose the right tool for the job.

Cross-Session Persistent Memory — Context persists when you close the terminal. Two memory backends: JSON (zero-config) and SQLite (full-text search).

Multi-Model Orchestration — OpenAI-compatible, Anthropic-compatible, and custom-protocol LLMs with explicit routing, fallback chains, and cost tracking. Switch models mid-session with /model <your-model>.
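A fallback chain of the kind mentioned above can be sketched generically. Everything here is illustrative: `complete_with_fallback`, `AllModelsFailed`, and the `call_model` callback are hypothetical names, not part of Prax's API.

```python
from typing import Callable, Sequence


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has failed."""


def complete_with_fallback(prompt: str,
                           models: Sequence[str],
                           call_model: Callable[[str, str], str]) -> tuple[str, str]:
    """Try each model in order; return (model_used, response) for the first success."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))
```

Returning the model name alongside the response lets the caller attribute cost and log which link of the chain actually answered.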

Security by Design — Permission modes (read-only, workspace-write, danger-full-access), schema validation, workspace boundaries, and full audit trail.

Built for Real Codebases — 25+ built-in tools, middleware pipeline (loop detection, quality gates), multi-language support, and interactive REPL mode.

Transparent & Measurable — Real-time cost tracking, session history and replay, benchmark suite included, open architecture for custom extensions.

Usage Examples

Repository Repair

$ prax "run pytest -q, fix the failure, and stop when tests pass"
▶ VerifyCommand {"command": "pytest -q"}
  ✗ FAILED test_auth.py::test_login - AssertionError
▶ Read {"file_path": "src/auth.py"}
▶ Edit {"file_path": "src/auth.py", ...}
▶ VerifyCommand {"command": "pytest -q"}
  ✓ 1 passed in 0.12s
Verification passed. Task complete.

One-off Tasks

prax "explain the authentication flow in login.py"
prax "refactor auth.py error handling, replace requests with httpx"
prax "analyze project architecture, list technical debt, prioritize by impact"

Interactive REPL

prax repl

> analyze the codebase structure
> fix the SQL injection in user_query.py
> /model <your-model>
> /cost
Session: 12.4K tokens ($0.04)

Slash Commands

/model, /session list, /plan, /todo show, /doctor, /cost, /help

Scheduled Tasks & Notifications (new in 0.4)

Prax can own scheduled work end-to-end. Declare channels in .prax/notify.yaml, jobs in .prax/cron.yaml, and prax cron install writes the system-level trigger for you (LaunchAgent on macOS, crontab line on Linux).

# 1. configure an outbound channel (Feishu / Lark / Email)
cat > .prax/notify.yaml <<YAML
channels:
  daily-digest:
    provider: feishu_webhook
    url: "\${FEISHU_WEBHOOK_URL}"
YAML

# 2. schedule a daily job
prax cron add \
  --name ai-news-daily \
  --schedule "0 17 * * *" \
  --prompt "trigger the ai-news-daily skill" \
  --notify-on failure \
  --notify-channel daily-digest

# 3. install the per-minute dispatcher
prax cron install

See docs/recipes/ai-news-daily.md for the full AI-news-automation recipe.
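For readers unfamiliar with the schedule syntax, `"0 17 * * *"` means minute 0 of hour 17 on every day, that is, 17:00 daily. The sketch below shows how a per-minute dispatcher could decide whether a job is due. It is a simplified illustration with hypothetical function names, not Prax's dispatcher: it supports only `*`, single numbers, lists, ranges, and steps, requires every field to match, and does not accept 7 as an alias for Sunday.

```python
from datetime import datetime


def _field_matches(field: str, value: int, minimum: int, maximum: int) -> bool:
    """Check one cron field (e.g. '*/15', '9-17', '1,3,5') against a value."""
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = minimum, maximum
        elif "-" in rng:
            low, high = rng.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(rng)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression: str, when: datetime) -> bool:
    """Return True when the five-field cron expression is due at `when`."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        _field_matches(minute, when.minute, 0, 59)
        and _field_matches(hour, when.hour, 0, 23)
        and _field_matches(day, when.day, 1, 31)
        and _field_matches(month, when.month, 1, 12)
        and _field_matches(weekday, cron_weekday, 0, 6)
    )


if __name__ == "__main__":
    print(cron_matches("0 17 * * *", datetime(2026, 6, 1, 17, 0)))  # True
```

A dispatcher that wakes once per minute would call `cron_matches` for each job in .prax/cron.yaml with the current time and launch the ones that return True.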

Bundled Skills

Skills live under skills/ (bundled) or .prax/skills/ (project-local) and inject prompt guidance when their triggers match. Content / writing helpers:

| Skill | Triggers | Purpose |
| --- | --- | --- |
| browser-scrape | 抓取 scrape twitter zhihu bilibili autocli | Drive AutoCLI to scrape 55+ sites reusing the user's Chrome login |
| knowledge-compile | 整理 compile wiki digest 知识库 | Turn raw markdown into Obsidian-ready wiki (index.md + topics/ + daily-digest.md) |
| ai-news-daily | ai-news-daily daily digest 日报 | End-to-end pipeline: scrape → compile → notify |
| chinese-coding | 中文 注释 文档 | Chinese comments/docs style guide |

Four additional commercial recipes (pr-triage, release-notes, docs-audit, support-digest) ship under skills/ as well — see the Commercial Use Cases table below for what each one does. Project-local skills in .prax/skills/ override bundled ones with the same name.

Commercial Use Cases (new in 0.4)

Four recipes tuned for team/enterprise workflows — designed to ship reviewable artefacts (not "AI said so" hallucinations) and to keep destructive actions firmly in human hands.

| Case | Target user | Prax differentiator | Recipe |
| --- | --- | --- | --- |
| PR Triage Bot | Eng lead | Actually checks out the PR branch and runs tests via VerifyCommand; compares against base. No GitHub side-effects. | docs/recipes/pr-triage.md |
| Release Notes Generator | Release manager | Reads git log + issue refs, groups by Conventional Commits into Keep-a-Changelog sections, idempotent per version. Writes files; never tags/pushes/publishes. | docs/recipes/release-notes.md |
| Docs Freshness Audit | DevEx / tech writer | Diffs recently-changed source vs doc mentions, outputs an evidence-cited drift report. Never edits docs itself. | docs/recipes/docs-audit.md |
| Support Ticket Digest | PM / support lead | Zero external API calls; PII redaction runs before any LLM sees the data, giving compliance-grade local-only processing. | docs/recipes/support-digest.md |

Each case is 10-minute deployable, works with the cron/notify plumbing above, and has hard contractual limits baked into its SKILL.md (no auto-approve, no auto-merge, no auto-refund, no auto-edit-docs) so the agent cannot drift.
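To illustrate the grouping step the Release Notes Generator performs, here is a rough sketch that sorts Conventional Commits subjects into Keep a Changelog sections. The type-to-section mapping and the `group_commits` name are assumptions for this example; the recipe's actual rules live in docs/recipes/release-notes.md.

```python
import re
from collections import defaultdict

# Assumed mapping from Conventional Commits types to Keep a Changelog sections.
SECTION_BY_TYPE = {"feat": "Added", "fix": "Fixed", "perf": "Changed", "refactor": "Changed"}

# Matches headers such as "feat(cron): add flag" or "fix!: drop legacy path".
HEADER = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<subject>.+)$"
)


def group_commits(subjects):
    """Group commit subject lines by changelog section, skipping unmapped types."""
    sections = defaultdict(list)
    for line in subjects:
        match = HEADER.match(line.strip())
        if not match:
            continue  # not a Conventional Commits header (e.g. merge commits)
        section = SECTION_BY_TYPE.get(match["type"])
        if section is None:
            continue  # types like docs/chore are omitted in this sketch
        scope = match["scope"]
        entry = f"{scope}: {match['subject']}" if scope else match["subject"]
        sections[section].append(entry)
    return dict(sections)


if __name__ == "__main__":
    print(group_commits(["feat(cron): add notify-on flag", "fix: handle empty models.yaml"]))
```

Because the function is pure (same input, same output), re-running it for the same version yields the same sections, which is the property the recipe calls idempotent.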

Results

Benchmark Results

Prax solved 10 of 10 repository repair tasks, averaging 29.56s per task, which is 49% less time than the cross-framework baseline.

| Metric | Prax | Framework Baseline | Improvement |
| --- | --- | --- | --- |
| Success Rate | 10/10 (100%) | 8/10 (80%) | +25% |
| Average Time | 29.56s | 58.44s | -49% |
| Timeouts | 0 | 2 | -100% |
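The Improvement column is a relative change against the baseline, which the following short calculation reproduces from the table's own numbers. Note that the success-rate figure of +25% is relative (10 vs 8 tasks); in absolute terms the gain is 20 percentage points.

```python
def relative_change(new: float, baseline: float) -> float:
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100


print(f"{relative_change(10, 8):+.0f}%")         # success count: +25%
print(f"{relative_change(29.56, 58.44):+.0f}%")  # average time:  -49%
print(f"{relative_change(0, 2):+.0f}%")          # timeouts:      -100%
```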

What drives these results:

  • Verification-First Architecture — Test-verify-fix loops catch errors early
  • Quality Gate Middleware — Loop detection and convergence guidance
  • Smart Sandbox Downgrade — Verification commands bypass unnecessary overhead
  • Experience-Based Learning — Correction detection, error pattern blacklisting, and cross-session memory accumulation

Benchmark methodology: 10 repeated rounds on real repository-fix tasks with session state preserved. See docs/BENCHMARKS.md for full details.

Integration Paths

Prax offers two runtime paths — choose the right tool for the job:

| Feature | Native Runtime | Claude Code Integration |
| --- | --- | --- |
| Execution | CLI commands | Claude Code IDE |
| Interaction | Command-line REPL | IDE conversation interface |
| Context Management | Local JSON/SQLite | Claude Code sessions |
| Tool Integration | 25+ built-in tools | Claude Code tools + Prax extensions |
| Use Cases | Automation, CI/CD | Interactive development, code review |

Claude Code Integration Advantages

  • IDE Native Experience — Use Prax capabilities directly within Claude Code
  • Seamless Integration — Deep integration via MCP servers and Hooks
  • Security Protection — Pre-write secret scanning, pre-commit quality checks
  • Session Persistence — Auto-save session state, resume from breakpoints
  • Bidirectional Collaboration — Claude Code's conversational ability + Prax's verification loop

Installation

Installation lives under Way 2 · Inside Claude Code IDE above (prax install --profile full, then prax doctor --target claude).

Configuration

Models — create .prax/models.yaml in your project (or edit the global ~/.prax/models.yaml from Quick Start Step 3):

default_model: <your-model>

providers:
  default:
    base_url_env: LLM_BASE_URL    # any env var name; declared in Quick Start Step 3
    api_key_env: LLM_API_KEY
    format: openai                # use "anthropic" if your endpoint speaks the Anthropic protocol
    models:
      - name: <your-model>        # the model identifier your service exposes

To wire multiple endpoints (e.g. one OpenAI-compatible and one Anthropic-compatible), declare another providers: entry with its own base_url_env / api_key_env and list its models. Each provider can also override base_url directly instead of via env (see core/llm_client.py for the full schema).

Permission modes

| Mode | What it allows | Default |
| --- | --- | --- |
| read-only | No file writes, no shell commands | |
| workspace-write | Modify files inside the project | Yes |
| danger-full-access | Unrestricted | |

prax --permission-mode read-only "analyze security vulnerabilities"
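The workspace boundary behind these modes can be pictured as a path check. This is a sketch under stated assumptions, not Prax's enforcement code: `is_write_allowed` is a hypothetical name, and resolving paths before comparing them is one common way to stop `..` segments and symlinks from escaping the project directory.

```python
from pathlib import Path


def is_write_allowed(mode: str, target: str, workspace: str) -> bool:
    """Decide whether a file write is permitted under the given permission mode."""
    if mode == "read-only":
        return False
    if mode == "danger-full-access":
        return True
    if mode == "workspace-write":
        # resolve() normalises '..' and symlinks so the comparison is on real paths.
        return Path(target).resolve().is_relative_to(Path(workspace).resolve())
    raise ValueError(f"unknown permission mode: {mode}")


if __name__ == "__main__":
    print(is_write_allowed("workspace-write", "src/app.py", "."))
```

`Path.is_relative_to` requires Python 3.9 or newer, which fits the runtime's stated Python 3.10+ requirement.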

Runtime paths

| Flag | Behavior |
| --- | --- |
| --runtime-path auto | Uses Claude CLI bridge if claude is installed, otherwise native runtime (default) |
| --runtime-path native | Always use the native runtime |
| --runtime-path bridge | Always use the Claude CLI bridge; fails if claude is not installed |
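The selection rules in the table reduce to a small decision function. The sketch below is illustrative only: `resolve_runtime_path` is a hypothetical name, and using `shutil.which` to detect the `claude` executable is an assumption about how "installed" is determined.

```python
import shutil
from typing import Callable, Optional


def resolve_runtime_path(flag: str = "auto",
                         which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Map a --runtime-path value to 'native' or 'bridge'."""
    claude_installed = which("claude") is not None
    if flag == "native":
        return "native"
    if flag == "bridge":
        if not claude_installed:
            raise RuntimeError("--runtime-path bridge requires the claude CLI on PATH")
        return "bridge"
    if flag == "auto":
        return "bridge" if claude_installed else "native"
    raise ValueError(f"unknown runtime path: {flag}")


if __name__ == "__main__":
    print(resolve_runtime_path())
```

One practical consequence of the `auto` default: installing the `claude` CLI on a CI machine would silently switch runs to the bridge, so pinning `--runtime-path native` in automation keeps behaviour predictable.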

Data directory

| Path | Content |
| --- | --- |
| .prax/sessions/ | Conversation history |
| .prax/memory.json | Project memory (auto-extracted facts, JSON backend) |
| .prax/memory.db | Project memory (SQLite backend) |
| .prax/solutions/ | Problem-solution patterns from correction detection |
| .prax/todos.json | Current task list |
| .prax/agents/ | Custom agent definitions |
| .prax/models.yaml | Model configuration |
| .prax/config.yaml | Project-level configuration (memory backend, etc.) |
| ~/.prax/ | Global config (cross-project) |
| ~/.prax/experiences.json | Global cross-project experiences (JSON backend) |
| ~/.prax/experiences.db | Global cross-project experiences (SQLite backend) |
| ~/.prax/config.yaml | User-level configuration |

Architecture


Key modules:

| Path | Role |
| --- | --- |
| core/agent_loop.py | Core orchestration cycle (max 25 iterations, circuit breaker) |
| core/middleware.py | VerificationGuidance, LoopDetection, QualityGate, etc. |
| tools/verify_command.py | Bounded verification (pytest, npm test, cargo test, go test) |
| tools/sandbox_bash.py | Auto-downgrade: verify commands bypass sandbox overhead |
| core/memory/ | Pluggable backends (JSON / SQLite) |
| core/llm_client.py | Provider registry, multi-model routing |
| agents/ | Ralph (planner), Sisyphus (executor), Team (parallel) |
| workflows/ | Task decomposition and orchestration |
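As a rough picture of what a loop-detection middleware might do, the sketch below flags an agent that keeps issuing the same tool call. The `LoopDetector` class, its window size, and its repeat threshold are all hypothetical; the real LoopDetection logic in core/middleware.py may work quite differently.

```python
import json
from collections import deque


class LoopDetector:
    """Flag a tool call that repeats too often within a sliding window."""

    def __init__(self, window: int = 6, max_repeats: int = 3):
        self._recent = deque(maxlen=window)  # oldest calls fall out automatically
        self._max_repeats = max_repeats

    def record(self, tool: str, arguments: dict) -> bool:
        """Record a call; return True when it has repeated `max_repeats` times."""
        # sort_keys makes the signature independent of argument order.
        signature = (tool, json.dumps(arguments, sort_keys=True))
        self._recent.append(signature)
        return self._recent.count(signature) >= self._max_repeats


if __name__ == "__main__":
    detector = LoopDetector()
    for _ in range(3):
        looping = detector.record("Read", {"file_path": "src/auth.py"})
    print(looping)  # True: the same Read call was issued three times
```

When `record` returns True, an agent loop could inject guidance to try a different approach or stop early, complementing the hard iteration cap.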

Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • Development setup
  • Code style guidelines
  • Testing requirements
  • PR process

For benchmark and reproducibility work, also see docs/BENCHMARKS.md.

License

MIT License — see LICENSE for details.

Keywords

agent


Package last updated on 01 Jun 2026
