Your coding agent remembers everything. No more re-explaining.
Built on iii engine
Persistent memory for Claude Code, Cursor, Gemini CLI, Codex CLI, pi, OpenCode, and any MCP client.
The gist extends Karpathy's LLM Wiki pattern with confidence scoring, lifecycle, knowledge graphs, and hybrid search. agentmemory is the implementation.
Works with any agent that speaks MCP or HTTP. One server, memories shared across all of them.
You explain the same architecture every session. You re-discover the same bugs. You re-teach the same preferences. Built-in memory (CLAUDE.md, .cursorrules) caps out at 200 lines and goes stale. agentmemory fixes this. It silently captures what your agent does, compresses it into searchable memory, and injects the right context when the next session starts. One command. Works across agents.
What changes: Session 1 you set up JWT auth. Session 2 you ask for rate limiting. The agent already knows your auth uses jose middleware in src/middleware/auth.ts, your tests cover token validation, and you chose jose over jsonwebtoken for Edge compatibility. No re-explaining. No copy-pasting. The agent just knows.
npx @agentmemory/agentmemory
New in v0.9.0 — Landing site at agent-memory.dev, filesystem connector (@agentmemory/fs-watcher), standalone MCP now proxies to the running server so hooks and the viewer agree, audit policy codified across every delete path, health stops flagging memory_critical on tiny Node processes. Full notes in CHANGELOG.md.
Compatibility: this release targets stable iii-sdk ^0.11.0 and iii-engine v0.11.x.
Try it in 30 seconds
# Terminal 1: start the server
npx @agentmemory/agentmemory
# Terminal 2: seed sample data and see recall in action
npx @agentmemory/agentmemory demo
demo seeds 3 realistic sessions (JWT auth, N+1 query fix, rate limiting) and runs semantic searches against them. You'll see it find "N+1 query fix" when you search "database performance optimization" — keyword matching can't do that.
Open http://localhost:3113 to watch the memory build live.
Session Replay
Every session agentmemory records is replayable. Open the viewer, pick the Replay tab, and scrub through the timeline: prompts, tool calls, tool results, and responses render as discrete events with play/pause, speed control (0.5×–4×), and keyboard shortcuts (space to toggle, arrows to step).
Already have older Claude Code JSONL transcripts you want to bring in?
# Import everything under the default ~/.claude/projects
npx @agentmemory/agentmemory import-jsonl
# Or import a single file
npx @agentmemory/agentmemory import-jsonl ~/.claude/projects/-my-project/abc123.jsonl
Imported sessions show up in the Replay picker alongside native ones. Under the hood each entry routes through the mem::replay::load, mem::replay::sessions, and mem::replay::import-jsonl iii functions — no side-channel servers.
Upgrade / Maintenance
Use the maintenance command when you intentionally want to update your local runtime:
npx @agentmemory/agentmemory upgrade
Warning: this command mutates the current workspace/runtime. It can update JavaScript dependencies, may run cargo install iii-engine --force, and may pull Docker images.
Implementation details live in src/cli.ts (see runUpgrade around the src/cli.ts:544-595 region).
Claude Code (one block, paste it)
Install agentmemory: run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server. Then run `/plugin marketplace add rohitg00/agentmemory` and `/plugin install agentmemory` — the plugin registers all 12 hooks, 4 skills, AND auto-wires the `@agentmemory/mcp` stdio server via its `.mcp.json`, so you get 51 MCP tools (memory_smart_search, memory_save, memory_sessions, memory_governance_delete, etc.) without any extra config step. Verify with `curl http://localhost:3111/agentmemory/health`. The real-time viewer is at http://localhost:3113.
OpenClaw (paste this prompt)
Install agentmemory for OpenClaw. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to my OpenClaw MCP config so agentmemory is available with all 51 MCP tools:
{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["-y", "@agentmemory/mcp"]
    }
  }
}
Restart OpenClaw. Verify with `curl http://localhost:3111/agentmemory/health`. Open http://localhost:3113 for the real-time viewer. For deeper memory-slot integration, copy `integrations/openclaw` to `~/.openclaw/extensions/agentmemory` and enable `plugins.slots.memory = "agentmemory"` in `~/.openclaw/openclaw.json`.
Hermes (paste this prompt)
Install agentmemory for Hermes. Run `npx @agentmemory/agentmemory` in a separate terminal to start the memory server on localhost:3111. Then add this to ~/.hermes/config.yaml so Hermes can use agentmemory as an MCP server with all 51 MCP tools:
mcp_servers:
  agentmemory:
    command: npx
    args: ["-y", "@agentmemory/mcp"]
memory:
  provider: agentmemory
Verify with `curl http://localhost:3111/agentmemory/health`. Open http://localhost:3113 for the real-time viewer. For deeper 6-hook memory provider integration (pre-LLM context injection, turn capture, MEMORY.md mirroring, system prompt block), copy integrations/hermes from the agentmemory repo to ~/.hermes/plugins/agentmemory.
| Agent | Setup |
| --- | --- |
| Hermes | Add to ~/.hermes/config.yaml with memory.provider: agentmemory, or use the memory provider plugin |
| Cline / Goose / Kilo Code | Add MCP server in settings |
| Claude Desktop | Add to claude_desktop_config.json: `{"mcpServers": {"agentmemory": {"command": "npx", "args": ["-y", "@agentmemory/mcp"]}}}` |
| Aider | REST API: `curl -X POST http://localhost:3111/agentmemory/smart-search -d '{"query": "auth"}'` |
| Any agent (32+) | `npx skillkit install agentmemory` |
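
Most of the MCP setups above come down to adding one `agentmemory` entry under `mcpServers` without disturbing the servers you already have. The sketch below shows that merge as a pure function. The helper name `withAgentMemory` and the `McpConfig` type are hypothetical, introduced only for illustration; the config shape is assumed from the JSON snippets in this README, and your client's actual config file may carry other fields.

```typescript
// Hypothetical types: modeled on the JSON snippets shown in this README.
interface McpServer {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServer>;
  [key: string]: unknown; // keep unrelated settings untouched
}

// Returns a new config with the agentmemory entry added.
// The input object is not mutated, and existing servers are preserved.
function withAgentMemory(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      agentmemory: { command: "npx", args: ["-y", "@agentmemory/mcp"] },
    },
  };
}

// Example: a config that already has another (made-up) server registered.
const existing: McpConfig = {
  theme: "dark",
  mcpServers: { other: { command: "node", args: ["server.js"] } },
};

const merged = withAgentMemory(existing);
console.log(JSON.stringify(merged, null, 2));
```

Returning a new object rather than editing in place makes it easy to diff the result against the original before writing it back to disk.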
From source
git clone https://github.com/rohitg00/agentmemory.git && cd agentmemory
npm install && npm run build && npm start
This starts agentmemory with a local iii-engine if iii is already installed, or falls back to Docker Compose if Docker is available. REST, streams, and the viewer bind to 127.0.0.1 by default.
Install iii-engine manually. agentmemory currently pins iii-engine to v0.11.2 — v0.11.6 introduces a new sandbox-everything-via-iii worker add model that agentmemory hasn't been refactored for yet. Pin lifts once the refactor lands. Override with AGENTMEMORY_III_VERSION=<version> if you've migrated to the sandbox model manually.
macOS x64: swap aarch64-apple-darwin for x86_64-apple-darwin
Linux x64: swap for x86_64-unknown-linux-gnu
Linux arm64: swap for aarch64-unknown-linux-gnu
Windows: download iii-x86_64-pc-windows-msvc.zip from iii-hq/iii releases v0.11.2, extract iii.exe, add to PATH
Or use Docker (the bundled docker-compose.yml pulls iiidev/iii:0.11.2). Full docs: iii.dev/docs.
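The platform swaps above amount to a lookup from OS and CPU architecture to a target triple, plus the pinned version with its override. A minimal sketch of that lookup follows. The function name `resolveEngine` is hypothetical, the platform and arch spellings are assumed to follow Node's `process.platform` and `process.arch`, and the sketch stops at the triple because the exact download URL is not spelled out here.

```typescript
// Version this README says agentmemory pins; AGENTMEMORY_III_VERSION overrides it.
const PINNED_III_VERSION = "0.11.2";

// Target triples listed in this README, keyed by "<platform>-<arch>"
// (assumed to use Node's process.platform / process.arch spellings).
const TARGET_TRIPLES: Record<string, string> = {
  "darwin-arm64": "aarch64-apple-darwin",
  "darwin-x64": "x86_64-apple-darwin",
  "linux-x64": "x86_64-unknown-linux-gnu",
  "linux-arm64": "aarch64-unknown-linux-gnu",
  "win32-x64": "x86_64-pc-windows-msvc",
  "win32-arm64": "aarch64-pc-windows-msvc",
};

interface EngineSpec {
  version: string;
  target: string;
}

// Hypothetical helper: picks the version and target triple for a machine.
function resolveEngine(
  platform: string,
  arch: string,
  env: Record<string, string | undefined> = {},
): EngineSpec {
  const target = TARGET_TRIPLES[`${platform}-${arch}`];
  if (target === undefined) {
    throw new Error(`No prebuilt iii binary listed for ${platform}/${arch}`);
  }
  const version = env["AGENTMEMORY_III_VERSION"] || PINNED_III_VERSION;
  return { version, target };
}

console.log(resolveEngine("darwin", "arm64"));
console.log(resolveEngine("win32", "x64", { AGENTMEMORY_III_VERSION: "0.11.6" }));
```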
Windows
agentmemory runs on Windows 10/11, but the Node.js package alone isn't enough: you also need the iii-engine runtime (a separate native binary) running as a background process. The official upstream installer is a sh script, and there is no PowerShell installer or scoop/winget package today, so Windows users have two ways to get the engine (Options A and B), plus an engine-free fallback (Option C):
Option A — Prebuilt Windows binary (recommended):
# 1. Open https://github.com/iii-hq/iii/releases/tag/iii%2Fv0.11.2 in your browser
# (we pin to v0.11.2 until agentmemory refactors for the new sandbox
# model that engine v0.11.6+ requires)
# 2. Download iii-x86_64-pc-windows-msvc.zip
# (or iii-aarch64-pc-windows-msvc.zip if you're on an ARM machine)
# 3. Extract iii.exe somewhere on PATH, or place it at:
# %USERPROFILE%\.local\bin\iii.exe
# (agentmemory checks that location automatically)
# 4. Verify:
iii --version
# Should print: 0.11.2
# 5. Then run agentmemory as usual:
npx -y @agentmemory/agentmemory
Option B — Docker Desktop:
# 1. Install Docker Desktop for Windows
# 2. Start Docker Desktop and make sure the engine is running
# 3. Run agentmemory — it will auto-start the bundled compose file:
npx -y @agentmemory/agentmemory
Option C — standalone MCP only (no engine): if you only need the MCP tools for your agent and don't need the REST API, viewer, or cron jobs, skip the engine entirely:
npx -y @agentmemory/agentmemory mcp
# or via the shim package:
npx -y @agentmemory/mcp
Diagnostics for Windows: if npx @agentmemory/agentmemory fails, re-run with --verbose to see the actual engine stderr. Common failure modes:
| Symptom | Fix |
| --- | --- |
| iii-engine process started then did not become ready within 15s | Engine crashed on startup; re-run with --verbose and check stderr |
| Could not start iii-engine | Neither iii.exe nor Docker is installed. See Option A or B above |
| Port conflict | Run `netstat -ano \| findstr :3111` to see what's bound, then kill it or use --port <N> |
| Docker fallback skipped even though Docker is installed | Make sure Docker Desktop is actually running (system tray icon) |
Note: there is no cargo install iii-engine — iii is not published to crates.io. The only supported install methods are the prebuilt binary above, the upstream sh install script (macOS/Linux only), and the Docker image.
Every coding agent forgets everything when the session ends. You waste the first 5 minutes of every session re-explaining your stack. agentmemory runs in the background and eliminates that entirely.
Session 1: "Add auth to the API"
Agent writes code, runs tests, fixes bugs
agentmemory silently captures every tool use
Session ends -> observations compressed into structured memory
Session 2: "Now add rate limiting"
Agent already knows:
- Auth uses JWT middleware in src/middleware/auth.ts
- Tests in test/auth.test.ts cover token validation
- You chose jose over jsonwebtoken for Edge compatibility
Zero re-explaining. Starts working immediately.
vs built-in agent memory
Every AI coding agent ships with built-in memory — Claude Code has MEMORY.md, Cursor has notepads, Cline has memory bank. These work like sticky notes. agentmemory is the searchable database behind the sticky notes.
Inspired by how human brains process memory — not unlike sleep consolidation.
| Tier | What | Analogy |
| --- | --- | --- |
| Working | Raw observations from tool use | Short-term memory |
| Episodic | Compressed session summaries | "What happened" |
| Semantic | Extracted facts and patterns | "What I know" |
| Procedural | Workflows and decision patterns | "How to do it" |
Memories decay over time (Ebbinghaus curve). Frequently accessed memories strengthen. Stale memories auto-evict. Contradictions are detected and resolved.
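To make the decay behaviour concrete, here is a minimal sketch of an Ebbinghaus-style forgetting curve, R = exp(-t / S), where t is time since last access and S is a stability value that grows each time the memory is accessed. This is an illustration under assumptions, not agentmemory's actual scoring code: the field names, the 1.5× boost, and the 0.05 eviction threshold are all made up for the example.

```typescript
// Hypothetical record shape for the sketch; not agentmemory's real schema.
interface MemoryRecord {
  id: string;
  lastAccessedMs: number;
  stabilityDays: number; // larger = decays more slowly
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Ebbinghaus-style retention: 1.0 right after access, falling toward 0.
function retention(memory: MemoryRecord, nowMs: number): number {
  const elapsedDays = Math.max(0, nowMs - memory.lastAccessedMs) / DAY_MS;
  return Math.exp(-elapsedDays / memory.stabilityDays);
}

// Accessing a memory resets the clock and strengthens it (assumed boost factor).
function touch(memory: MemoryRecord, nowMs: number, boost = 1.5): MemoryRecord {
  return { ...memory, lastAccessedMs: nowMs, stabilityDays: memory.stabilityDays * boost };
}

// Drop memories whose retention has fallen below the (assumed) threshold.
function evictStale(memories: MemoryRecord[], nowMs: number, threshold = 0.05): MemoryRecord[] {
  return memories.filter((m) => retention(m, nowMs) >= threshold);
}

// Two memories created on day 0; only one is accessed again on day 20.
const untouched: MemoryRecord = { id: "old-bug", lastAccessedMs: 0, stabilityDays: 7 };
const reinforced = touch(
  { id: "auth-decision", lastAccessedMs: 0, stabilityDays: 7 },
  20 * DAY_MS,
);

const day30 = 30 * DAY_MS;
const kept = evictStale([untouched, reinforced], day30);
console.log(kept.map((m) => m.id));
```

On day 30 the untouched memory has a retention of roughly 0.014 and is evicted, while the reinforced one sits near 0.39 and survives.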
What Gets Captured
| Hook | Captures |
| --- | --- |
| SessionStart | Project path, session ID |
| UserPromptSubmit | User prompts (privacy-filtered) |
| PreToolUse | File access patterns + enriched context |
| PostToolUse | Tool name, input, output |
| PostToolUseFailure | Error context |
| PreCompact | Re-injects memory before compaction |
| SubagentStart/Stop | Sub-agent lifecycle |
| Stop | End-of-session summary |
| SessionEnd | Session complete marker |
Key Capabilities
| Capability | Description |
| --- | --- |
| Automatic capture | Every tool use is recorded via hooks, with zero manual effort |
The real-time viewer auto-starts on port 3113 and provides a live observation stream, session explorer, memory browser, knowledge graph visualization, and health dashboard.
open http://localhost:3113
The viewer server binds to 127.0.0.1 by default. The REST-served /agentmemory/viewer endpoint follows the normal AGENTMEMORY_SECRET bearer-token rules. CSP headers use a per-response script nonce and disable inline handler attributes (script-src-attr 'none').
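As an illustration of the CSP approach described above, the sketch below builds a header with a fresh nonce per response and `script-src-attr 'none'`. It is not the viewer's actual code: the function name `buildCsp` is hypothetical, and the `default-src 'self'` directive is an assumption added so the example is a complete policy.

```typescript
import { randomBytes } from "node:crypto";

interface CspResult {
  nonce: string;
  header: string;
}

// Hypothetical helper: call once per HTTP response so every response
// gets its own unpredictable nonce.
function buildCsp(): CspResult {
  const nonce = randomBytes(16).toString("base64");
  const header = [
    "default-src 'self'", // assumption, not stated in this README
    `script-src 'nonce-${nonce}'`, // only <script nonce="..."> tags may run
    "script-src-attr 'none'", // blocks inline handlers such as onclick="..."
  ].join("; ");
  return { nonce, header };
}

const { nonce, header } = buildCsp();
console.log("Content-Security-Policy:", header);
console.log(`<script nonce="${nonce}">/* viewer bootstrap */</script>`);
```

Because the nonce changes on every response, injected markup cannot guess it, and disabling handler attributes closes the inline `onclick` route separately.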
The viewer at :3113 shows what your agent remembered. The iii console shows what your agent did — every memory op as an OpenTelemetry trace, every KV entry editable, every function invocable, every stream tappable. Two windows on the same memory: one product-shaped, one engine-shaped.
Watch a memory_smart_search fire and see the BM25 scan → embedding lookup → RRF fusion → reranker as a waterfall. Edit a stuck consolidation timer in the KV browser. Replay a PostToolUse hook with a tweaked payload. Pin the WebSocket stream and watch observations land live.
agentmemory ships this for free because every function, trigger, state scope, and stream is an iii primitive — nothing custom, nothing to instrument.
Workers page: every connected worker — including agentmemory itself — with PID, function count, runtime, and last-seen.
Already installed. The console ships with iii — no separate installer.
Launch alongside agentmemory:
# agentmemory viewer holds port 3113, so run the console on 3114.
# Engine REST (3111), WebSocket (3112), and bridge (49134) defaults match agentmemory.
iii console --port 3114
Then open http://localhost:3114. Add --enable-flow for the experimental architecture-graph page.
Override engine endpoints only if you've moved them:
| Page | What you can do |
| --- | --- |
| Workers | See every connected worker and its live metrics, including the agentmemory worker itself. |
| Functions | Invoke any of agentmemory's functions directly with a JSON payload; handy for testing memory.recall, memory.consolidate, graph.query without wiring a client. |
| Triggers | Replay HTTP, cron, event, and state triggers: fire the consolidation cron manually, retry an HTTP route, emit a state change. |
| States | KV browser with full CRUD over sessions, memory slots, lifecycle timers, and the embeddings index; edit values in place. |
| Streams | Live WebSocket monitor for memory writes, hook events, and observation updates as they flow through iii streams. |
| Queues | Durable queue topics + dead-letter management. Replay or drop failed embedding / compression jobs. |
| Traces | OpenTelemetry waterfall / flame / service-breakdown views. Filter by trace_id to see exactly which functions, DB calls, and embedding requests a single memory.search produced. |
| Logs | Structured OTEL logs filtered and correlated to trace/span IDs. |
| Config | Runtime configuration: see exactly which workers, providers, and ports your engine is running with. |
| Flow | (Optional, --enable-flow) Interactive architecture graph of every worker, trigger, and stream. |
Traces: waterfall / flame / service breakdown for every memory operation.
Traces are already on:
iii-config.yaml ships with the iii-observability worker enabled (exporter: memory, sampling_ratio: 1.0, metrics + logs). No extra config needed — the moment agentmemory starts, every memory operation emits a trace span and a structured log the console can read.
If you want to export to Jaeger/Honeycomb/Grafana Tempo instead, change exporter: memory to exporter: otlp and set the collector endpoint per iii's observability docs.
Heads-up: no auth is enforced on the console itself — keep it bound to 127.0.0.1 (the default) and never expose it publicly.
agentmemory is already a running iii instance. Functions, triggers, KV state, streams, OTEL traces — all of it is iii primitives. You didn't install Postgres, Redis, Express, pm2, or Prometheus, because iii replaces them.
That means one more command extends agentmemory with an entire new capability.
Extend agentmemory with one command
iii worker add iii-pubsub # fan memory writes out to every connected instance
iii worker add iii-cron # scheduled consolidation, decay sweeps, snapshot rotation
iii worker add iii-queue # durable retries for embedding + compression jobs
iii worker add iii-observability # OTEL traces on every memory op (default on)
iii worker add iii-sandbox # run recalled code inside an isolated microVM
iii worker add iii-database # swap in a SQL-backed state adapter
iii worker add mcp # generic MCP host alongside the agentmemory MCP
Each iii worker add registers new functions and triggers into the same engine agentmemory is already running on. The viewer and console pick them up immediately — no reload, no new integration, no new container.
Stand up extra MCP servers next to agentmemory's, share the same engine
Full registry: workers.iii.dev. Every worker there composes through the same primitives agentmemory uses — and the agentmemory you already have is one of them.
What iii replaces
| Traditional stack | agentmemory uses |
| --- | --- |
| Express.js / Fastify | iii HTTP Triggers |
| SQLite / Postgres + pgvector | iii KV State + in-memory vector index |
| SSE / Socket.io | iii Streams (WebSocket) |
| pm2 / systemd | iii engine worker supervision |
| Prometheus / Grafana | iii OTEL + health monitor |
| Custom plugin systems | iii worker add <name> |
118 source files · ~21,800 LOC · 800 tests · 123 functions · 34 KV scopes — all on three primitives. No agentmemory plugin install. The plugin system is iii itself.
LLM Providers
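agentmemory auto-detects the provider from your environment. With no API key set it uses the no-op provider; the Claude-subscription fallback needs no API key but is opt-in via AGENTMEMORY_ALLOW_AGENT_SDK=true.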
| Provider | Config | Notes |
| --- | --- | --- |
| No-op (default) | No config needed | LLM-backed compress/summarize is DISABLED. Synthetic BM25 compression + recall still work. See AGENTMEMORY_ALLOW_AGENT_SDK below if you used to rely on the Claude-subscription fallback. |
| Anthropic API | ANTHROPIC_API_KEY | Per-token billing |
| MiniMax | MINIMAX_API_KEY | Anthropic-compatible |
| Gemini | GEMINI_API_KEY | Also enables embeddings |
| OpenRouter | OPENROUTER_API_KEY | Any model |
| Claude subscription fallback | AGENTMEMORY_ALLOW_AGENT_SDK=true | Opt-in only. Spawns @anthropic-ai/claude-agent-sdk sessions; this used to cause unbounded Stop-hook recursion (#149 follow-up), so it is no longer the default. |
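
The provider table can be read as a chain of environment checks that ends at the no-op provider. The sketch below shows one way such detection could look. The function name `detectProvider` and the provider labels are hypothetical, and the precedence order when several keys are set is an assumption; this README does not state the real order.

```typescript
// Hypothetical labels for the providers listed in the table above.
type Provider = "anthropic" | "minimax" | "gemini" | "openrouter" | "agent-sdk" | "noop";

type Env = Record<string, string | undefined>;

// Assumed precedence: explicit API keys first, then the opt-in
// Claude-subscription fallback, then the no-op default.
function detectProvider(env: Env): Provider {
  if (env["ANTHROPIC_API_KEY"]) return "anthropic";
  if (env["MINIMAX_API_KEY"]) return "minimax";
  if (env["GEMINI_API_KEY"]) return "gemini";
  if (env["OPENROUTER_API_KEY"]) return "openrouter";
  if (env["AGENTMEMORY_ALLOW_AGENT_SDK"] === "true") return "agent-sdk";
  return "noop"; // no LLM calls; BM25 compression and recall still work
}

console.log(detectProvider({})); // "noop"
console.log(detectProvider({ GEMINI_API_KEY: "example-key" })); // "gemini"
console.log(detectProvider({ AGENTMEMORY_ALLOW_AGENT_SDK: "true" })); // "agent-sdk"
```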
Environment Variables
Create ~/.agentmemory/.env:
# LLM provider (pick one — default is the no-op provider: no LLM calls)
# ANTHROPIC_API_KEY=sk-ant-...
# ANTHROPIC_BASE_URL=... # Optional: Anthropic-compatible proxy / Azure
# GEMINI_API_KEY=...
# OPENROUTER_API_KEY=...
# MINIMAX_API_KEY=...
# Opt-in Claude-subscription fallback (spawns @anthropic-ai/claude-agent-sdk);
# leave OFF unless you understand the Stop-hook recursion risk (#149 follow-up):
# AGENTMEMORY_ALLOW_AGENT_SDK=true
# Embedding provider (auto-detected, or override)
# EMBEDDING_PROVIDER=local
# VOYAGE_API_KEY=...
# OPENAI_API_KEY=sk-...
# OPENAI_BASE_URL=https://api.openai.com # Override for Azure / vLLM / LM Studio / proxies
# OPENAI_EMBEDDING_MODEL=text-embedding-3-small
# OPENAI_EMBEDDING_DIMENSIONS=1536 # Required when the model is not in the known-models table
# Search tuning
# BM25_WEIGHT=0.4
# VECTOR_WEIGHT=0.6
# TOKEN_BUDGET=2000
# Auth
# AGENTMEMORY_SECRET=your-secret
# Ports (defaults: 3111 API, 3113 viewer)
# III_REST_PORT=3111
# Features
# AGENTMEMORY_AUTO_COMPRESS=false # OFF by default (#138). When on,
# every PostToolUse hook calls your
# LLM provider to compress the
# observation — expect significant
# token spend on active sessions.
# AGENTMEMORY_SLOTS=false # OFF by default. Editable pinned
# memory slots — persona,
# user_preferences, tool_guidelines,
# project_context, guidance,
# pending_items, session_patterns,
# self_notes. Size-limited; agent
# edits via memory_slot_* tools.
# Pinned slots addressable for
# SessionStart injection.
# AGENTMEMORY_REFLECT=false # OFF by default. Requires SLOTS=on.
# Stop hook fires mem::slot-reflect:
# scans recent observations, auto-
# appends TODOs to pending_items,
# counts patterns in
# session_patterns, records touched
# files in project_context. Fire-
# and-forget; does not block.
# AGENTMEMORY_INJECT_CONTEXT=false # OFF by default (#143). When on:
# - SessionStart may inject ~1-2K
# chars of project context into
# the first turn of each session
# (this is what actually reaches
# the model — Claude Code treats
# SessionStart stdout as context)
# - PreToolUse fires /agentmemory/enrich
# on every file-touching tool call
# (resource cleanup, not a token
# fix — PreToolUse stdout is debug
# log only per Claude Code docs)
# Observations are still captured via
# PostToolUse regardless of this flag.
# GRAPH_EXTRACTION_ENABLED=false
# CONSOLIDATION_ENABLED=true
# LESSON_DECAY_ENABLED=true
# OBSIDIAN_AUTO_EXPORT=false
# AGENTMEMORY_EXPORT_ROOT=~/.agentmemory
# CLAUDE_MEMORY_BRIDGE=false
# SNAPSHOT_ENABLED=false
# Team
# TEAM_ID=
# USER_ID=
# TEAM_MODE=private
# Tool visibility: "core" (8 tools) or "all" (51 tools)
# AGENTMEMORY_TOOLS=core
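
The BM25_WEIGHT and VECTOR_WEIGHT defaults above, together with the RRF fusion step this README mentions for memory_smart_search, suggest a weighted blend of a keyword ranking and a vector ranking. The sketch below shows weighted reciprocal rank fusion under that assumption. It is not agentmemory's implementation: the function name `fuseRankings` is hypothetical, whether the weights are applied this way is a guess, and k = 60 is simply the constant commonly used with RRF.

```typescript
// Weighted reciprocal rank fusion: each list contributes
// weight / (k + rank) for every document it contains (rank starts at 1).
function fuseRankings(
  bm25Ranked: string[],
  vectorRanked: string[],
  bm25Weight = 0.4,
  vectorWeight = 0.6,
  k = 60,
): string[] {
  const scores = new Map<string, number>();

  const accumulate = (ranked: string[], weight: number): void => {
    ranked.forEach((id, index) => {
      const contribution = weight / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  };

  accumulate(bm25Ranked, bm25Weight);
  accumulate(vectorRanked, vectorWeight);

  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Made-up result IDs: the keyword scan misses "n-plus-one",
// the vector search ranks it first.
const bm25Hits = ["jwt-auth", "rate-limit"];
const vectorHits = ["n-plus-one", "jwt-auth", "rate-limit"];

const fused = fuseRankings(bm25Hits, vectorHits);
console.log(fused);
```

Documents found by both retrievers accumulate two contributions, so they tend to outrank a document that only one retriever surfaced, even when that retriever ranked it first.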
107 endpoints on port 3111. The REST API binds to 127.0.0.1 by default. Protected endpoints require Authorization: Bearer <secret> when AGENTMEMORY_SECRET is set, and mesh sync endpoints require AGENTMEMORY_SECRET on both peers.
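For clients that talk to the REST API directly, the bearer-token rule above can be sketched as a small request builder for the smart-search endpoint shown earlier. The helper name `buildSmartSearchRequest` is hypothetical, and sending `Content-Type: application/json` is an assumption; the curl example in this README does not set it. The response shape is not documented here, so the sketch stops at building the request.

```typescript
interface HttpRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds a POST to /agentmemory/smart-search and adds
// the Authorization header only when a secret is configured.
function buildSmartSearchRequest(
  query: string,
  secret?: string,
  baseUrl = "http://localhost:3111",
): HttpRequest {
  const headers: Record<string, string> = {
    "Content-Type": "application/json", // assumption, see note above
  };
  if (secret) {
    headers["Authorization"] = `Bearer ${secret}`;
  }
  return {
    url: `${baseUrl}/agentmemory/smart-search`,
    method: "POST",
    headers,
    body: JSON.stringify({ query }),
  };
}

const searchRequest = buildSmartSearchRequest("auth", "your-secret");
console.log(searchRequest);
```

The result can be passed to `fetch(searchRequest.url, { method: searchRequest.method, headers: searchRequest.headers, body: searchRequest.body })` once the server is running.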
Persistent memory for AI coding agents, powered by iii-engine's three primitives
Package last updated on 11 May 2026