Compound Agent

compound-agent is a Claude Code plugin that ships a self-improving development factory into your repository — persistent memory, structured multi-agent workflows, and autonomous loop execution. Fully local. Everything in git.


*Diagram: compound-agent ecosystem overview. The architect phase decomposes work via Socratic dialogue into a dependency graph; `ca loop` chains tasks with cross-model review, retry, and fresh sessions; scenario evaluation validates changes with iterative refinement. All of it is backed by persistent memory (lessons and knowledge across all sessions) and verification gates (tests, lint, and type checks on every task).*

AI coding agents forget everything between sessions. Each session starts with whatever context was prepared for it — nothing more. Because agents carry no persistent state, that state must live in the codebase itself, and any agent that reads the same well-structured context should be able to pick up where another left off. Compound Agent implements this: it captures mistakes once, retrieves them precisely when relevant, and can hand entire systems to an autonomous loop that processes epic by epic with no human intervention.

What gets installed

ca setup injects a complete development environment into your repository:

| Component | What ships |
| --- | --- |
| 16 slash commands | `/compound:architect`, `cook-it`, `spec-dev`, `plan`, `work`, `review`, `compound`, `learn-that`, `check-that`, and more |
| 26 agent role skills | TDD pair, drift detector, audit, research specialist, external reviewers, and more |
| 7 automatic hooks | Fire on session start, prompt submit, tool use, tool failure, pre-compact, phase guard, and session stop |
| 5 phase skill files | Full workflow instructions for architect, spec-dev, cook-it, work, and review |
| 5 deployed docs | Workflow reference, CLI reference, skills guide, integration guide, and overview |

This is not a memory plugin bolted onto a text editor. It is the environment your agents run inside.

How it works

Two memory systems persist across sessions:

*Diagram: a task session between two memory systems. Lessons (JSONL + SQLite with semantic and keyword search) are retrieved before and captured after each task; knowledge (project docs chunked and embedded) is queried on demand.*

  • Lessons — mistakes, corrections, and patterns stored as git-tracked JSONL, indexed in SQLite FTS5 with local embeddings for hybrid search. Retrieved at the start of each task, captured at the end.
  • Knowledge — project documentation chunked and embedded for semantic retrieval. Any phase can query it on demand.

Each task runs through five phases, with review findings looping back to rework. Each phase runs as its own slash command so instructions are re-injected fresh (surviving context compaction):

*Diagram: inside a task. Five phases (Spec, Plan, Work, Review, Compound) run in sequence, with a feedback loop from Review back to Work. Each phase runs as its own slash command with fresh instructions; lessons are retrieved at the start and captured at the end, and knowledge is queryable from any phase.*

Each cycle through the loop makes the next one smarter. The architect step is optional — use it for systems too large for a single feature cycle.

Three principles

These constraints follow from how AI agents work, and each one maps to a layer of the architecture.

| Principle | Without it | Layer |
| --- | --- | --- |
| Memory | Same mistakes every session. Architectural decisions re-derived from scratch. Knowledge locked in human heads where agents cannot reach it. | Semantic Memory |
| Feedback loops | Agents cannot verify their own work. Manual review is the only quality gate. Drift is the default at agent-scale output. | Structured Workflows |
| Navigable structure | Context windows fill with orientation work. Agents make unverifiable assumptions about dependencies and ordering. | Beads Foundation |

The three are not independent. Memory without feedback loops is unreliable. Feedback without navigable structure fires blindly. The system works as a whole or not at all.

Is this for you?

"It keeps making the same mistake every session." Capture it once. Compound Agent surfaces it automatically before the agent repeats it.

"I explained our auth pattern three sessions ago. Now it's reimplementing from scratch." Architectural decisions persist as searchable lessons. Next session, they inject into context before planning starts.

"My agent uses pandas when we standardised on Polars months ago." Preferences survive across sessions and projects. Once captured, they appear at the right moment.

"Code reviews keep catching the same class of bugs." 24 specialised review agents (security, performance, architecture, test coverage) run in parallel. Findings feed back as lessons that become test requirements in future work.

"I have no idea what my agent actually learned or if it's reliable." ca list shows all captured knowledge. ca stats shows health. ca wrong <id> invalidates bad lessons. Everything is git-tracked JSONL — you can read, diff, and audit it.

"I want structured phases, not just 'go build this'." Five workflow phases (spec-dev, plan, work, review, compound) with mandatory gates between them. Each phase searches memory and docs for relevant context before starting.

"My agent doesn't read the project docs before making decisions." ca knowledge "auth flow" runs hybrid search (vector + keyword) over your indexed docs. Agents query it automatically during planning — ADRs, specs, and standards surface before code gets written.

"I want to hand a large system spec to the machine and walk away." /compound:architect decomposes it into epics. ca loop processes them autonomously.

Levels of use

Level 1 — Memory only

Two minutes to set up. Works in any session without changing your existing workflow.

```shell
# Capture a mistake or preference
ca learn "Always use Polars, not pandas in this project" --severity high
ca learn "Auth 401 fix: add X-Request-ID header" --type solution

# Search manually anytime
ca search "polars"

# Or let hooks surface it automatically — no command needed
```

Level 2 — Structured workflow

One command runs all five phases on a single feature: spec-dev, plan, work (TDD + agent team), review (24 agents), and compound (capture lessons).

```shell
/compound:cook-it "Add rate limiting to the API"
```

Run phases individually when you want more control:

```shell
/compound:spec-dev "Add rate limiting"    # Socratic dialogue → EARS spec → Mermaid diagrams
/compound:plan                            # Tasks enriched by memory search
/compound:work                            # TDD with agent team
/compound:review                          # 24 parallel agents with severity gates
/compound:compound                        # Capture what was learned
```

Level 3 — Factory mode

For systems too large for a single feature cycle. /compound:architect decomposes the system; ca loop processes the resulting epics autonomously.

```shell
# Step 1: decompose the system into epics
/compound:architect "Multi-tenant SaaS: auth, billing, API, admin dashboard"
# → Socratic dialogue → system-level EARS spec → DDD decomposition
# → N epics with dependency graph, interface contracts, and scope boundaries

# Step 2: generate and run the loop
ca loop --reviewers claude-sonnet --review-every 3
./infinity-loop.sh
# → Processes each epic in dependency order: spec-dev → plan → work → review → compound
# → Captures lessons after every cycle, improving subsequent cycles
```

The infinity loop

*Diagram: `ca loop` chains tasks in dependency order. Tasks 1 through 4 each run a full cycle in a fresh session; cross-model review (R) gates between tasks; failed tasks retry automatically and can escalate to human-required. The generated bash script provides deterministic orchestration.*

ca loop generates a bash script that processes your beads epics sequentially, running the full cook-it cycle on each one. No human intervention required between epics.

```shell
# Generate script for all ready epics
ca loop

# With periodic review every 3 epics
ca loop --reviewers claude-sonnet --review-every 3

# Target specific epics
ca loop --epics "beads-abc,beads-def,beads-ghi" --max-retries 2

# Run it
./infinity-loop.sh
```

The loop respects beads dependency graphs — it only processes epics whose dependencies are complete. If an epic fails after --max-retries attempts, it stops and reports before proceeding.

Current maturity: the loop works and has been used to ship real projects, including compound-agent itself. Two things still required human involvement: specifications had to be written before the loop started, and a human applied fixes after the first review pass surfaced real problems (missing error handling, a migration gap, insufficient test coverage). Fully unattended long-duration runs across many epics are the current area of hardening.

The improvement loop

ca improve generates a bash script that iterates over improve/*.md program files, spawning Claude Code sessions to make focused improvements. Each program file defines what to improve, how to find work, and how to validate changes.

```shell
# Scaffold an example program file
ca improve init
# Creates improve/example.md with a linting template

# Generate the improvement script
ca improve

# Filter to specific topics
ca improve --topics lint tests --max-iters 3

# Preview without generating
ca improve --dry-run

# Run the generated script
./improvement-loop.sh

# Preview without executing Claude sessions
IMPROVE_DRY_RUN=1 ./improvement-loop.sh
```

Each iteration makes one focused improvement, commits it, and moves on. If an iteration finds nothing to improve or fails validation, it reverts cleanly and moves to the next topic. The loop tracks consecutive no-improvement results and stops early to avoid diminishing returns.

Monitor progress with ca watch --improve to see live trace output from improvement sessions.

Automatic hooks

Once installed, seven Claude Code hooks fire without any commands:

| Hook | When it fires | What it does |
| --- | --- | --- |
| SessionStart | Every new session | Loads high-severity lessons into context before you type anything |
| PreCompact | Before context compression | Saves phase state so cook-it survives compaction |
| UserPromptSubmit | Every prompt | Injects relevant memory items into context |
| PreToolUse | During cook-it | Enforces phase gates — prevents jumping ahead |
| PostToolUse | After tool success | Clears failure tracking state |
| PostToolUseFailure | After tool failure | Tracks failures; suggests memory search after repeated errors |
| Stop | Session end | Enforces phase gates — prevents skipping required steps |

No configuration needed. ca setup wires them into your .claude/settings.json.
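For reference, Claude Code hooks live under the `"hooks"` key of `.claude/settings.json`. A minimal sketch of roughly the shape such an entry takes (the exact events and commands `ca setup` writes may differ; consult the generated file):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "ca load-session" }
        ]
      }
    ]
  }
}
```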

/compound:architect

AI agents work best on well-scoped problems. When a task exceeds what fits comfortably in one context window, quality degrades — not from lack of capability but from too many competing concerns pulling in different directions.

/compound:architect addresses this before the cook-it cycle begins. It takes a large system description and produces cook-it-ready epics via a structured 4-phase process:

  • Socratic — builds a domain glossary and discovery mindmap; classifies decisions by reversibility
  • Spec — produces system-level EARS requirements, C4 architecture diagrams, and a scenario table
  • Decompose — runs 6 parallel subagents (bounded context mapping, dependency analysis, scope sizing, interface design, STPA hazard analysis, structural-semantic gap analysis) then synthesises into a proposed epic structure
  • Materialise — creates beads epics with scope boundaries, interface contracts, and wired dependencies

Three human approval gates separate the phases. Each output epic is sized for one cook-it cycle and includes an EARS subset for traceability back to the system spec.

/compound:architect "Build a data pipeline: ingestion, transformation, storage, and API layer"

Installation

```shell
# Install as dev dependency
pnpm add -D compound-agent

# One-shot setup (creates dirs, hooks, templates)
npx ca setup
```

Requirements

  • Node.js >= 18 (for npx wrapper — the CLI itself is a Go binary)
  • ~278MB disk space for the embedding model (one-time download, shared across projects)
  • Embedding runs via ca-embed Rust daemon (nomic-embed-text-v1.5 ONNX)

Windows Users

Compound-agent runs natively on Windows (amd64 and arm64). Install and use it the same way as on macOS/Linux:

```shell
pnpm add -D compound-agent
npx ca setup
```

Note: The embedding daemon (ca-embed) is not available on Windows. Search automatically falls back to keyword-only mode (FTS5). All other features work identically. WSL2 users get full functionality including vector search.

CLI Reference

The CLI binary is ca (alias: compound-agent).

Capture

| Command | Description |
| --- | --- |
| `ca learn "<insight>"` | Capture a lesson manually |
| `ca learn "<insight>" --trigger "<context>"` | Capture with trigger context |
| `ca learn "<insight>" --severity high` | Set severity (low/medium/high) |
| `ca learn "<insight>" --citation src/api.ts:42` | Attach file provenance |
| `ca capture --input <file>` | Capture from structured input file |
| `ca detect --input <file>` | Detect correction patterns in input |

Retrieval

| Command | Description |
| --- | --- |
| `ca search "<query>"` | Keyword search across memory (FTS5) |
| `ca list` | List all memory items |
| `ca list --invalidated` | List only invalidated items |
| `ca check-plan --plan "<text>"` | Semantic search for plan-time retrieval |
| `ca load-session` | Load high-severity items for session start |

Management

| Command | Description |
| --- | --- |
| `ca show <id>` | Display item details |
| `ca update <id> --insight "..."` | Modify item fields |
| `ca delete <id>` | Soft-delete an item |
| `ca wrong <id>` | Mark item as invalid |
| `ca wrong <id> --reason "..."` | Mark invalid with reason |
| `ca validate <id>` | Re-enable an invalidated item |
| `ca stats` | Database health and age distribution |
| `ca rebuild` | Rebuild SQLite index from JSONL |
| `ca compact` | Archive old items, remove tombstones |
| `ca export` | Export items as JSON |
| `ca import <file>` | Import items from JSONL file |
| `ca prime` | Load workflow context (used by hooks) |
| `ca verify-gates <epic-id>` | Verify review + compound tasks exist and are closed |
| `ca phase-check` | Manage cook-it phase state (init/status/clean/gate) |
| `ca audit` | Run audit checks against the codebase |
| `ca rules check` | Run repository-defined rule checks |
| `ca test-summary` | Run tests and output a compact summary |

Automation

| Command | Description |
| --- | --- |
| `ca loop` | Generate infinity loop script for autonomous epic processing |
| `ca loop --epics "id1,id2,id3"` | Target specific epic IDs (comma-separated) |
| `ca loop -o <path>` | Custom output path (default: ./infinity-loop.sh) |
| `ca loop --max-retries <n>` | Max retries per epic on failure (default: 1) |
| `ca loop --force` | Overwrite existing script |
| `ca loop --reviewers <names...>` | Enable review phase with specified reviewers (claude-sonnet, claude-opus, gemini, codex) |
| `ca loop --review-every <n>` | Review every N completed epics (0 = end-only, default: 0) |
| `ca loop --max-review-cycles <n>` | Max review/fix iterations (default: 3) |
| `ca loop --review-blocking` | Fail loop if review not approved after max cycles |
| `ca loop --review-model <model>` | Model for implementer fix sessions (default: claude-opus-4-6) |
| `ca improve` | Generate improvement loop script from improve/*.md programs |
| `ca improve --topics <names...>` | Run only specific topics |
| `ca improve --max-iters <n>` | Max iterations per topic (default: 5) |
| `ca improve --time-budget <seconds>` | Total time budget, 0 = unlimited (default: 0) |
| `ca improve --dry-run` | Validate and print plan without generating |
| `ca improve --force` | Overwrite existing script |
| `ca improve init` | Scaffold an example improve/*.md program file |
| `ca watch` | Tail and pretty-print live trace from loop sessions |
| `ca watch --epic <id>` | Watch a specific epic trace |
| `ca watch --improve` | Watch improvement loop traces |
| `ca watch --no-follow` | Print existing trace and exit (no live tail) |
| `ca polish` | Generate polish loop script for iterative refinement |
| `ca polish --spec-file <path>` | Specify the spec file for polish review |
| `ca polish --reviewers <names>` | Comma-separated reviewer models |
| `ca polish --cycles <n>` | Number of polish cycles (default: 1) |
| `ca polish --force` | Overwrite existing script |
| `ca info` | Show project status, phase, and telemetry summary |
| `ca info --open` | Open project dashboard in browser |
| `ca info --json` | Output as JSON |
| `ca health` | Check project health and dependencies |
| `ca feedback` | Submit feedback about compound-agent |

Knowledge

| Command | Description |
| --- | --- |
| `ca knowledge "<query>"` | Hybrid search over indexed project docs |
| `ca index-docs` | Index docs/ directory into knowledge base |

Setup

| Command | Description |
| --- | --- |
| `ca setup` | One-shot setup (hooks + templates) |
| `ca setup --skip-hooks` | Setup without installing hooks |
| `ca setup --json` | Output result as JSON |
| `ca setup claude` | Install Claude Code hooks only |
| `ca setup claude --status` | Check Claude Code integration health |
| `ca setup claude --uninstall` | Remove Claude hooks only |
| `ca setup claude --dry-run` | Preview what would change without writing |
| `ca init` | Initialize compound-agent in current repo |
| `ca init --skip-agents` | Skip AGENTS.md and template installation |
| `ca init --skip-claude` | Skip Claude Code hooks installation |
| `ca download-model --json` | Download embedding model with JSON output |
| `ca about` | Show version, animation, and recent changelog |
| `ca doctor` | Verify external dependencies and project health |

Memory Types

| Type | Trigger means | Insight means | Example |
| --- | --- | --- | --- |
| lesson | What happened | What was learned | "Polars 10x faster than pandas for large files" |
| solution | The problem | The resolution | "Auth 401 fix: add X-Request-ID header" |
| pattern | When it applies | Why it matters | `{ bad: "await in loop", good: "Promise.all" }` |
| preference | The context | The preference | "Use uv over pip in this project" |

Retrieval Ranking

```
boost  = severity_boost * recency_boost * confirmation_boost
         clamped to max 1.8
score  = vector_similarity(query, item) * boost

severity_boost:     high=1.5, medium=1.0, low=0.8
recency_boost:      last 30d=1.2, older=1.0
confirmation_boost: confirmed=1.3, unconfirmed=1.0
```

FAQ

Q: How is this different from mem0? A: mem0 is a cloud memory layer for general AI agents. Compound Agent is local-first with git-tracked storage and local embeddings — no API keys or cloud services needed. It also goes beyond memory with structured workflows, multi-agent review, and issue tracking.

Q: Does this work offline? A: Yes, completely. Embeddings run locally via the ca-embed Rust daemon (nomic-embed-text-v1.5 ONNX). No network requests after the initial model download.

Q: How much disk space does it need? A: ~278MB for the embedding model (one-time download, shared across projects) plus negligible space for lessons.

Q: Can I use it with other AI coding tools? A: The CLI (ca) works standalone with any tool. Full hook integration is available for Claude Code and Gemini CLI.

Q: What happens if the embedding model isn't available? A: Search gracefully falls back to keyword-only mode. Other commands that require embeddings will tell you what's missing. Run ca doctor to diagnose issues.

Q: Is the loop production-ready? A: The loop works and has been used to ship real projects, including compound-agent itself. Long-duration autonomous runs across many epics are the current area of hardening. For 3–5 epic sequences, it is reliable today.

Development

```shell
cd go && go build ./cmd/ca   # Build CLI binary
cd go && go test ./...       # Full test suite
cd go && go vet ./...        # Static analysis
```

Technology Stack

| Component | Technology |
| --- | --- |
| Language | Go |
| Package Manager | Go modules (+ pnpm for npm wrapper) |
| Build | `go build` with `CGO_ENABLED=0` (pure Go) |
| Testing | `go test` + table-driven tests |
| Storage | modernc.org/sqlite + FTS5 (pure Go, no CGO) |
| Embeddings | ca-embed (Rust daemon via IPC) |
| CLI | Cobra |
| Release | GoReleaser |
| Issue Tracking | Beads (bd) |

Architecture

```mermaid
graph TD
    subgraph "Claude Code Session"
        H[Hooks] -->|SessionStart| P[ca prime]
        H -->|UserPromptSubmit| UP[user-prompt hook]
        H -->|PostToolUseFailure| TF[failure tracker]
        H -->|PreToolUse| PG[phase guard]
        H -->|Stop| SA[stop audit]
    end

    subgraph "CLI (Go + Cobra)"
        CA[ca binary] --> LEARN[ca learn]
        CA --> SEARCH[ca search]
        CA --> LOOP[ca loop]
        CA --> SETUP[ca setup]
        CA --> DOCTOR[ca doctor]
    end

    subgraph "Storage"
        JSONL[".claude/lessons/index.jsonl<br/>(git-tracked source of truth)"]
        SQLITE[".claude/.cache/lessons.sqlite<br/>(FTS5 search index)"]
        JSONL -->|rebuild| SQLITE
    end

    subgraph "Embeddings"
        EMBED["ca-embed (Rust daemon)"] -->|IPC via Unix socket| VEC[Vector similarity]
    end

    UP -->|inject lessons| SEARCH
    TF -->|suggest search| SEARCH
    LEARN --> JSONL
    SEARCH --> SQLITE
    SEARCH --> VEC
```

Three layers work together:

  • Portable storage: JSONL in git for conflict-free collaboration
  • Fast index: SQLite + FTS5 for keyword search, rebuilt from JSONL on demand
  • Semantic search: Rust embedding daemon for vector similarity, falls back to keyword-only if unavailable

Documentation

| Document | Purpose |
| --- | --- |
| docs/ARCHITECTURE-V2.md | Three-layer architecture design |
| docs/MIGRATION.md | Migration guide from learning-agent |
| CHANGELOG.md | Version history |
| AGENTS.md | Agent workflow instructions |

The most direct way to explore the system is to open this repository with an AI agent and ask it to walk you through the design — the project is structured precisely for that.

Acknowledgments

Compound Agent builds on ideas and patterns from these projects:

| Project | Influence |
| --- | --- |
| Compound Engineering Plugin | The "compound" philosophy — each unit of work makes subsequent units easier. Multi-agent review workflows and skills as encoded knowledge. |
| Beads | Git-backed JSONL + SQLite hybrid storage model, hash-based conflict-free IDs, dependency graphs |
| OpenClaw | Claude Code integration patterns and hook-based workflow architecture |
Also informed by research into Reflexion (verbal reinforcement learning), Voyager (executable skill libraries), and production systems from mem0, Letta, and GitHub Copilot Memory.

Contributing

Bug reports and feature requests are welcome via Issues. Pull requests are not accepted at this time — see CONTRIBUTING.md for details.

License

MIT — see LICENSE for details.

The embedding model (nomic-embed-text-v1.5) is downloaded on-demand from Hugging Face under the Apache 2.0 license. See THIRD-PARTY-LICENSES.md for full dependency license information.
