@shadowforge0/aquifer-memory

PG-native long-term memory for AI agents. Turn-level embedding, hybrid RRF ranking, optional knowledge graph. MCP server, CLI, and library API.

Version: 1.9.2 (latest) · Source: npm · Maintainers: 1

🌊 Aquifer

Long-term memory for AI agents, backed by PostgreSQL.

Store sessions, enrich them, and recall the exact turn where a decision happened, without adding a separate vector database.

npm version PostgreSQL 15+ pgvector License: MIT


Start Here

Aquifer is designed to have a short default path: start PostgreSQL + embeddings, run quickstart, then point your MCP client at aquifer mcp.

For library API usage, skip to API Reference. For a slightly more guided first run, see docs/getting-started.md.

1. Start the local stack

docker compose up -d
# PostgreSQL 16 + pgvector and Ollama with bge-m3 (auto-pulled).
# First run pulls the model; run `docker compose logs -f ollama-pull` to watch.

Already running PostgreSQL + pgvector and an embedding endpoint? Skip this step. quickstart picks up DATABASE_URL and embed settings from your environment if you already have them.

2. Verify end-to-end

npx --yes @shadowforge0/aquifer-memory quickstart

quickstart autodetects localhost:5432 PostgreSQL and localhost:11434 Ollama (from step 1 or your own), runs migrations, embeds a test session, recalls it, and cleans up. If it prints ✓ Aquifer is working, you're done.

For ongoing use, install it into your project so you skip the npx resolution cost: npm install @shadowforge0/aquifer-memory then npx aquifer quickstart.

Using OpenAI instead of Ollama? Export EMBED_PROVIDER=openai and OPENAI_API_KEY=sk-... before running quickstart; the model defaults to text-embedding-3-small.

3. Connect your MCP client

For Claude Code, Claude Desktop, or any MCP-capable client, drop this into .mcp.json (project-level) or claude_desktop_config.json:

{
  "mcpServers": {
    "aquifer": {
      "command": "npx",
      "args": ["--yes", "@shadowforge0/aquifer-memory", "mcp"],
      "env": {
        "DATABASE_URL": "postgresql://aquifer:aquifer@localhost:5432/aquifer",
        "EMBED_PROVIDER": "ollama",
        "AQUIFER_MEMORY_SERVING_MODE": "legacy"
      }
    }
  }
}

Or run it directly: DATABASE_URL=... EMBED_PROVIDER=ollama npx aquifer mcp. The MCP server itself stays strict about env; quickstart autodetect is the try-it path, not the production one.

Keep AQUIFER_MEMORY_SERVING_MODE=legacy for first rollout. Switch to curated only when you want compatibility session_recall to serve active curated memory and session_bootstrap to use the curated daily/current-memory contract. Curated bootstrap is daily-first and returns daily continuity by default; hosts must explicitly request current memory when they need it. Use memory_recall for explicit current-memory lookup, historical_recall for the historical/session plane, and evidence_recall for the audit/debug lane. Rollback is just flipping env or config back to legacy.

Historical dateFrom/dateTo filters are local-day windows. They match sessions whose started_at, last_message_at, or delayed-ingest created_at falls within the requested day, so session-end imports are not hidden by UTC date drift.

Curated serving is scope-bound. AQUIFER_MEMORY_ACTIVE_SCOPE_PATH is the ordered inheritance path, while AQUIFER_MEMORY_ALLOWED_SCOPE_KEYS is the caller boundary. If activeScopePath is omitted, Aquifer defaults it to global plus the configured activeScopeKey, or to global alone when no active scope is configured. If allowedScopeKeys is omitted, Aquifer defaults it to that active scope path. Runtime requests outside that boundary are rejected before reading current memory rows.
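
The scope defaults described above can be sketched as a small resolver. This is a minimal illustration, not Aquifer's implementation: the function names and the option object are hypothetical, and treating an active key of global as "no extra scope" is an assumption made to avoid a duplicated entry.

```javascript
// Sketch only: option names follow the README's prose, not Aquifer's source.
function resolveCuratedScope({ activeScopeKey, activeScopePath, allowedScopeKeys } = {}) {
  let path;
  if (Array.isArray(activeScopePath) && activeScopePath.length > 0) {
    path = [...activeScopePath];            // explicit ordered inheritance path
  } else if (activeScopeKey && activeScopeKey !== 'global') {
    path = ['global', activeScopeKey];      // default: global plus the active scope
  } else {
    path = ['global'];                      // no active scope configured
  }
  // The caller boundary defaults to the active scope path.
  const allowed = Array.isArray(allowedScopeKeys) && allowedScopeKeys.length > 0
    ? [...allowedScopeKeys]
    : [...path];
  return { path, allowed };
}

// Reject an out-of-boundary request before any current-memory row is read.
function assertScopeAllowed(requestedKey, scope) {
  if (!scope.allowed.includes(requestedKey)) {
    throw new Error(`scope "${requestedKey}" is outside the caller boundary`);
  }
  return requestedKey;
}
```

With only activeScopeKey set to project:aquifer, this sketch yields the path global, project:aquifer and the same list as the boundary, so a request for any other scope key is rejected up front.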

Common commands

| Goal | Command |
| --- | --- |
| Verify setup | npx aquifer quickstart |
| Run read-only governance diagnostics | npx aquifer doctor --json |
| Run release QC | npx aquifer qc release --json |
| Inspect selected backend capabilities without DB connection | AQUIFER_BACKEND=local npx aquifer backend-info --json |
| Start the MCP server | npx aquifer mcp |
| Search memory manually | npx aquifer recall "auth middleware" |
| Explain current-memory bootstrap selection | npx aquifer explain bootstrap --active-scope-key project:aquifer --json |
| Explain current-memory recall selection | npx aquifer explain memory --query "serving contract" --active-scope-key project:aquifer --json |
| Inspect finalized-session ledger rows | npx aquifer finalization list --status finalized --json |
| Inspect operator ledgers | npx aquifer operator status --json |
| Review current-memory feedback issues | npx aquifer review queue --scope-key project:aquifer --feedback-type incorrect --json |
| Resolve a reviewed memory queue item | npx aquifer review resolve --memory-id 42 --resolution resolved --reason "verified current" --expected-latest-issue-feedback-id 9 --json |
| Plan curated memory compaction | npx aquifer compact --cadence daily --period-start 2026-04-27T00:00:00Z --period-end 2026-04-28T00:00:00Z |
| Generate a timer synthesis prompt | npx aquifer operator compaction daily --include-synthesis-prompt --json |
| Apply reviewed timer synthesis candidates | npx aquifer operator compaction daily --synthesis-summary-file /tmp/timer-summary.json --apply --promote-candidates --json |
| Generate a finalized-session checkpoint prompt | npx aquifer operator checkpoint --scope-key project:aquifer --min-finalizations 10 --include-synthesis-prompt --json |
| Heartbeat-check an active Codex session for checkpoint work | npx aquifer codex-recovery checkpoint-heartbeat --hook-stdin --scope-key project:aquifer |
| Inspect pending Codex checkpoint spool files | npx aquifer codex-recovery checkpoint-spool-status --json --limit 10 |
| Preview a Codex UserPromptSubmit heartbeat hook install | npx aquifer codex-recovery checkpoint-heartbeat-hook --scope-key project:aquifer --hooks-path "$CODEX_HOME/hooks.json" --json |
| Check memory readiness | npx aquifer stats |
| Check saved-content preparation | npx aquifer backlog --json |
| Dry-run a saved-content policy decision | npx aquifer backlog --plan skip --status pending --source openclaw-mcp |
| Enrich pending sessions | npx aquifer backfill |

stats, backlog, MCP memory_stats, and MCP memory_pending default to the same public status surface: Aquifer status or Saved content status, plus Available, Attention, and Action where relevant. Use stats --diagnostics, backlog --diagnostics, or MCP diagnostics: true for raw counters, buckets, guidance, and samples.

Reviewed timer synthesis is gated before promotion. Candidate items must carry mergeKey, scopeClass, durability, promotionTarget, and sourceCanonicalKeys that reference sourceCurrentMemory; runtime state also needs staleAfter or validTo. The temporal distillation gate rejects workspace/operator policy, transient material, unsupported promotion targets, invalid or missing source lineage, and duplicate merge keys before current memory promotion. assistant_shaping is retained only as a compatibility, review, or provenance surface. It is not an active runtime memory type and is not pinned into bootstrap.

Timer synthesis output is candidate material until an operator applies a reviewed synthesis summary with --promote-candidates; it does not become active curated memory from the prompt or summary file alone. The deterministic daily/weekly/monthly aggregate proposals in a dry-run are source-rollup material for review and ledger lineage only, and are blocked from normal active promotion unless a reviewed synthesis summary is attached.

Checkpoint output follows the same boundary. operator checkpoint plans from finalized session summaries and only writes checkpoint_runs when you pass an explicit reviewed synthesis summary with --apply. codex-recovery checkpoint-heartbeat is the active-session hook heartbeat for Codex JSONL files: it first checks a tiny local scheduler marker, reads the transcript only when the time window is due, then writes local spool process material only when the message threshold is also due. It does not print prompt text by default and does not write DB memory. Use codex-recovery checkpoint-spool-status --json to inspect pending local spool files by session, coverage, byte size, and modified time without printing the checkpoint prompt text.

Read-only governance commands are diagnostic surfaces. doctor, finalization list|inspect, explain bootstrap|memory, and operator status|inspect do not apply migrations, promote memory, mutate finalization status, reclaim operator leases, or add MCP tools. review queue|inspect is read-only with respect to memory truth, but uses the normal Aquifer migration gate because it depends on the resolution ledger schema. It derives an operator queue from curated memory feedback such as incorrect, stale, and scope_mismatch without printing raw transcripts, feedback notes, feedback metadata, or memory payloads. review resolve is the narrow write path for this surface: it appends a snapshot-bound resolution ledger row (resolved, ignored, or deferred) without mutating memory_records or rewriting feedback history. Newer issue feedback reopens the queue item. Use these commands to answer why a session did not finalize, why a current-memory row was selected or excluded, which visible memory rows need human review, and whether operator ledgers contain stale claims before running write paths. Explain output includes stable scope inheritance details for selected and excluded rows; non-selected rows redact title and summary so diagnostics cannot become a cross-scope content probe.

Release QC is a package-maintainer surface. qc release runs the fixed release checks from the product package root: lint, package release tests, DB release gate when AQUIFER_TEST_DB_URL is set, whitespace check, npm pack dry-run, worktree private-artifact hygiene, npm version provenance, and local git tag provenance. Use --json for automation, --strict when warnings or skipped checks should fail CI, and --require-version-tag for the final publish gate after tagging the reviewed release commit.

Need LLM summarization, the knowledge graph, OpenAI embeddings, reranking, or operations details? See docs/setup.md and Environment Variables.

Why Aquifer?

Most AI memory systems bolt a vector DB on the side. Aquifer takes a different approach: PostgreSQL is the memory.

Sessions, summaries, turn-level embeddings, and the entity graph all live in one database, queried with one connection. No sync layer, no eventual consistency, no extra infrastructure.

What makes it different

| | Aquifer | Typical vector-DB approach |
| --- | --- | --- |
| Storage | PostgreSQL + pgvector | Separate vector DB + app DB |
| Granularity | Turn-level embeddings (not just session summaries) | Session or document chunks |
| Ranking | 3-way RRF: FTS + session embedding + turn embedding | Single vector similarity |
| Knowledge graph | Built-in entity extraction & co-occurrence | Usually separate system |
| Multi-tenant | tenant_id on every table, day-1 | Often an afterthought |
| Dependencies | pg + MCP SDK | Multiple SDKs |

Before and after

Without turn-level memory, search misses precise moments:

Query: "What did we decide about the auth middleware?" → Returns a 2000-word session summary that mentions auth somewhere

With Aquifer, search finds the exact turn:

Query: "What did we decide about the auth middleware?" → Returns the specific user turn: "Let's rip out the old auth middleware; legal flagged it for session token compliance"

Requirements

| Component | Required? | Purpose | Example |
| --- | --- | --- | --- |
| Node.js >= 18 | Yes | Runtime | - |
| PostgreSQL 15+ | Yes | Storage for sessions, summaries, entities | Local, Docker, or managed |
| pgvector extension | Yes | Vector similarity search | CREATE EXTENSION vector; (included in pgvector/pgvector Docker image) |
| Embedding endpoint | Yes (for recall) | Turn + session embedding | Ollama bge-m3, OpenAI text-embedding-3-small, any OpenAI-compatible API |
| LLM endpoint | Optional | Built-in summarization during enrich | Ollama, OpenRouter, OpenAI, or provide your own summaryFn |
| @modelcontextprotocol/sdk + zod | Yes (for MCP server) | MCP protocol runtime | Included in dependencies; installed automatically |

Environment Variables

| Variable | Required? | Purpose | Example |
| --- | --- | --- | --- |
| DATABASE_URL | Yes | PostgreSQL connection string | postgresql://user:pass@localhost:5432/mydb |
| AQUIFER_BACKEND | No | Backend profile selector: postgres full backend or explicit degraded local starter | postgres |
| AQUIFER_LOCAL_PATH | No | Local starter JSON store path | .aquifer/aquifer.local.json |
| AQUIFER_SCHEMA | No | PG schema name (default: aquifer) | memory |
| AQUIFER_TENANT_ID | No | Multi-tenant key (default: default) | my-app |
| AQUIFER_EMBED_BASE_URL | Yes (for recall) | Embedding API base URL | http://localhost:11434/v1 |
| AQUIFER_EMBED_MODEL | Yes (for recall) | Embedding model name | bge-m3 |
| AQUIFER_EMBED_API_KEY | Provider-dependent | API key for hosted embedding providers | sk-... |
| AQUIFER_EMBED_DIM | No | Embedding dimension override (auto-detected) | 1024 |
| AQUIFER_LLM_BASE_URL | No | LLM API base URL (for built-in summarization) | http://localhost:11434/v1 |
| AQUIFER_LLM_MODEL | No | LLM model name | llama3.1 |
| AQUIFER_LLM_API_KEY | Provider-dependent | API key for hosted LLM providers | sk-... |
| AQUIFER_ENTITIES_ENABLED | No | Enable knowledge graph (default: false) | true |
| AQUIFER_ENTITY_SCOPE | No | Entity namespace (default: default) | my-app |
| AQUIFER_RERANK_ENABLED | No | Enable cross-encoder reranking | true |
| AQUIFER_RERANK_PROVIDER | No | Reranker provider: tei, jina, openrouter | tei |
| AQUIFER_RERANK_BASE_URL | No | Reranker endpoint | http://localhost:8080 |
| AQUIFER_AGENT_ID | No | Default agent ID | main |
| AQUIFER_MEMORY_SERVING_MODE | No | Public serving mode: legacy default, or opt-in curated | curated |
| AQUIFER_MEMORY_ACTIVE_SCOPE_KEY | No | Default active curated scope for recall/bootstrap | project:aquifer |
| AQUIFER_MEMORY_ACTIVE_SCOPE_PATH | No | Ordered curated scope path for inheritance | global,project:aquifer |
| AQUIFER_MEMORY_ALLOWED_SCOPE_KEYS | No | Caller boundary for curated scope requests; defaults to the configured active scope path | global,project:aquifer |
| AQUIFER_CODEX_CHECKPOINT_CHECK_INTERVAL_MINUTES | No | Active Codex checkpoint heartbeat time gate (default: 10) | 10 |
| AQUIFER_CODEX_CHECKPOINT_EVERY_MESSAGES | No | Active Codex checkpoint message delta gate (default: 20) | 20 |
| AQUIFER_CODEX_CHECKPOINT_EVERY_USER_MESSAGES | No | Optional user-message delta gate | 10 |
| AQUIFER_CODEX_CHECKPOINT_QUIET_MS | No | Quiet period before reading due transcripts (default: 3000) | 3000 |
| AQUIFER_MIGRATIONS_MODE | No | Startup handshake mode: apply (default), check, off | apply |
| AQUIFER_MIGRATION_LOCK_TIMEOUT_MS | No | Advisory-lock wait before AQ_MIGRATION_LOCK_TIMEOUT (default 30000) | 30000 |
| AQUIFER_INSIGHTS_DEDUP_MODE | No | Insights semantic dedup mode: off (default), shadow, enforce; env wins over code for this field only, so operators can kill-switch without redeploy | shadow |
| AQUIFER_INSIGHTS_DEDUP_COSINE | No | Cosine threshold for semantic merge (default 0.88; warn outside [0.75, 0.95]) | 0.90 |
| AQUIFER_INSIGHTS_DEDUP_CLOSE_BAND_FROM | No | Lower bound for close-band logging (dedupNear); must be below threshold (default 0.85) | 0.82 |

Full env-to-config mapping is in consumers/shared/config.js.

Curated serving is opt-in. If a host needs rollback during rollout, set AQUIFER_MEMORY_SERVING_MODE=legacy and restart the MCP/CLI process; no destructive DB rollback is required.

Insights semantic dedup (1.5.10)

When a cron extractor (scripts/extract-insights-from-recent-sessions.js) or any other caller writes insights via commitInsight, the canonical-key layer (1.5.3+) dedupes rows whose canonicalClaim + entities hash to the same value. But LLMs don't always produce the same canonicalClaim across runs, so 1.5.10 adds a second tier: title + body are embedded, matched against (tenant, agent, type)-scoped active rows, and a top cosine above AQUIFER_INSIGHTS_DEDUP_COSINE triggers supersede (enforce) or metadata-only would-merge logging (shadow). Close-band hits (closeBandFrom ≤ cos < threshold) write metadata.dedupNear without supersede so operators can tune thresholds without committing.
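
The second-tier decision can be read as a small function of the top cosine score and the configured mode. The following is a hedged sketch, not the library's code: dedupDecision, its return shape, and the exact metadata values are hypothetical, and only the defaults (0.88 threshold, 0.85 close band, off mode) and the shadowMatch / dedupNear key names come from this document.

```javascript
// Sketch of the semantic dedup decision; the return shape is an assumption.
function dedupDecision(topCosine, { mode = 'off', threshold = 0.88, closeBandFrom = 0.85 } = {}) {
  if (mode === 'off') {
    return { action: 'insert' };                       // kill-switch: no semantic tier
  }
  if (topCosine >= threshold) {
    return mode === 'enforce'
      ? { action: 'supersede' }                        // merge into the matched row
      : { action: 'insert', metadata: { shadowMatch: { cosine: topCosine } } }; // shadow: log only
  }
  if (topCosine >= closeBandFrom) {
    // Close band: closeBandFrom <= cos < threshold, never supersedes.
    return { action: 'insert', metadata: { dedupNear: { cosine: topCosine } } };
  }
  return { action: 'insert' };
}
```

Under these assumptions, a score of 0.86 in enforce mode still inserts a new row and only records dedupNear, which is what lets operators tune the threshold before committing to merges.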

Recommended rollout: shadow for one weekly cycle, inspect SELECT metadata->>'shadowMatch' FROM insights WHERE metadata ? 'shadowMatch', then flip to enforce. Kill-switch: AQUIFER_INSIGHTS_DEDUP_MODE=off and restart.

Pre-1.5.3 rows with canonical_key_v2 IS NULL are caught by the semantic tier but skip the canonical path; a startup warn points at the one-shot backfill:

DATABASE_URL=... \
  node scripts/backfill-canonical-key.js --schema <schema> --agent <id>

The script is idempotent (WHERE canonical_key_v2 IS NULL guard) and race-safe with live writers.

Host Integration

MCP is the primary integration surface. Agent hosts connect to the Aquifer MCP server. The complete contract contains eleven tools: memory_recall, historical_recall, session_recall, evidence_recall, ingest_session, session_feedback, memory_feedback, memory_stats, memory_pending, feedback_stats, session_bootstrap.

The default runtime surface is read-only/status-only and registers eight tools: memory_recall, historical_recall, session_recall, evidence_recall, memory_stats, memory_pending, feedback_stats, session_bootstrap. Write tools (ingest_session, session_feedback, memory_feedback) must be enabled explicitly with AQUIFER_MCP_ENABLE_WRITES=true for trusted lifecycle or operator hosts.
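
As a rough illustration of the two surfaces, the helper below lists which tool names would be registered for a given environment. It is a sketch under stated assumptions: registeredTools is a hypothetical helper, and comparing the flag to the exact string 'true' is a guess at how the variable is parsed.

```javascript
// Tool names are taken from the lists above; the helper itself is hypothetical.
const READ_TOOLS = [
  'memory_recall', 'historical_recall', 'session_recall', 'evidence_recall',
  'memory_stats', 'memory_pending', 'feedback_stats', 'session_bootstrap',
];
const WRITE_TOOLS = ['ingest_session', 'session_feedback', 'memory_feedback'];

function registeredTools(env = process.env) {
  // Assumption: writes are enabled only by the literal value "true".
  const writesEnabled = env.AQUIFER_MCP_ENABLE_WRITES === 'true';
  return writesEnabled ? [...READ_TOOLS, ...WRITE_TOOLS] : [...READ_TOOLS];
}
```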

| Integration | Route | Status | When to use |
| --- | --- | --- | --- |
| MCP server | consumers/mcp.js | Primary | Claude Code, OpenClaw, Codex, any MCP-capable host |
| Library API | createAquifer() | Primary | Backend apps, custom pipelines, direct Node.js usage |
| CLI | consumers/cli.js | Secondary | Operations, debugging, manual recall/backfill (aquifer bootstrap, aquifer ingest-opencode, etc.) |
| OpenCode ingest | consumers/opencode.js | Secondary | Import sessions from OpenCode's SQLite DB |
| OpenClaw plugin | consumers/openclaw-plugin.js | Compatibility only | Session capture/finalization via session_end with before_reset fallback; not for tool delivery |

Claude Code

Add to your project's .claude.json or user-level MCP config:

{
  "mcpServers": {
    "aquifer": {
      "type": "stdio",
      "command": "node",
      "args": ["/path/to/aquifer/consumers/mcp.js"],
      "env": {
        "DATABASE_URL": "postgresql://...",
        "AQUIFER_EMBED_BASE_URL": "http://localhost:11434/v1",
        "AQUIFER_EMBED_MODEL": "bge-m3"
      }
    }
  }
}

By default, tools appear as mcp__aquifer__memory_recall, mcp__aquifer__historical_recall, mcp__aquifer__session_recall, mcp__aquifer__evidence_recall, mcp__aquifer__memory_stats, mcp__aquifer__memory_pending, mcp__aquifer__feedback_stats, mcp__aquifer__session_bootstrap. With AQUIFER_MCP_ENABLE_WRITES=true, mcp__aquifer__ingest_session, mcp__aquifer__session_feedback, and mcp__aquifer__memory_feedback are added.

For Codex long sessions, Aquifer exposes a UserPromptSubmit-friendly heartbeat command instead of installing a daemon:

npx aquifer codex-recovery checkpoint-heartbeat \
  --hook-stdin \
  --scope-key project:aquifer

Run that from a host hook with Codex hook JSON on stdin. The heartbeat uses a time-first gate: if the local marker says the next check is not due, it exits without validating or reading the transcript. When due, it validates the transcript_path realpath under the Codex sessions directory, waits for the quiet period, checks the configured message delta, and writes a local spool file for later review. Scheduler, claim, and spool files live under the Codex state directory by default; they are process-control files, not DB memory.

Heartbeat policy resolves as command flags first, then Aquifer env/config, then defaults. The default policy is 10 minutes, 20 safe messages, no user-message gate, 3000 ms quiet period, and 60000 ms claim TTL. In config files this lives at codex.checkpoint:

{
  "codex": {
    "checkpoint": {
      "checkIntervalMinutes": 10,
      "everyMessages": 20,
      "quietMs": 3000
    }
  }
}
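
The precedence order and the time-first gate can be sketched together. This is illustrative only: resolvePolicy, heartbeat, the marker field nextCheckAtMs, and the returned status strings are hypothetical names; the default values are the ones stated above.

```javascript
// Defaults as documented: 10 min, 20 messages, no user-message gate, 3000 ms quiet, 60000 ms claim TTL.
const DEFAULT_POLICY = {
  checkIntervalMinutes: 10,
  everyMessages: 20,
  everyUserMessages: null,
  quietMs: 3000,
  claimTtlMs: 60000,
};

// Command flags win, then env/config, then defaults.
function resolvePolicy(flags = {}, config = {}) {
  const policy = {};
  for (const key of Object.keys(DEFAULT_POLICY)) {
    policy[key] = flags[key] ?? config[key] ?? DEFAULT_POLICY[key];
  }
  return policy;
}

// Time-first gate: the transcript reader is only invoked once the time window is due.
function heartbeat(marker, policy, nowMs, countNewMessages) {
  if (nowMs < marker.nextCheckAtMs) return 'not-due';
  const delta = countNewMessages();          // stands in for reading the transcript
  return delta >= policy.everyMessages ? 'spool' : 'below-threshold';
}
```

Passing the transcript reader as a callback keeps the cheap path cheap: when the marker says the next check is not due, the callback is never called.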

To prepare the Codex hook entry, generate or apply the merged hooks.json:

npx aquifer codex-recovery checkpoint-heartbeat-hook \
  --scope-key project:aquifer \
  --hooks-path "$CODEX_HOME/hooks.json" \
  --json

The hook installer is dry-run by default. Add --apply only after reviewing the merged UserPromptSubmit command. codex-recovery doctor --json reports whether the heartbeat hook is present.

OpenClaw

Install or update Aquifer inside the OpenClaw host root, then let the installer wire both the MCP server and the optional extension from the same package root:

npm install --prefix "$OPENCLAW_HOME" @shadowforge0/aquifer-memory@1.9.2
node "$OPENCLAW_HOME/node_modules/@shadowforge0/aquifer-memory/consumers/cli.js" install-openclaw --openclaw-home "$OPENCLAW_HOME"

The installer enables plugins.entries["aquifer-memory"], adds the extension to plugins.load.paths, preserves existing mcp.servers.aquifer.env values, and backs up openclaw.json before writing. Use --dry-run --json to inspect package version, plugin config, MCP target, and extension link without changing files.

By default, tools materialize as aquifer__memory_recall, aquifer__historical_recall, aquifer__session_recall, aquifer__evidence_recall, aquifer__memory_stats, aquifer__memory_pending, aquifer__feedback_stats, aquifer__session_bootstrap (server name prefix added by the host). With AQUIFER_MCP_ENABLE_WRITES=true, aquifer__ingest_session, aquifer__session_feedback, and aquifer__memory_feedback are added.

The OpenClaw plugin (consumers/openclaw-plugin.js) is retained for session capture via session_end with a before_reset fallback, then uses the v1 finalization path when enrichment produced a summary. It is not the recommended tool delivery path. Use MCP.

Other MCP-capable hosts

Any host that supports MCP stdio can connect the same way: point it at node consumers/mcp.js with the required env vars. The MCP server is the canonical external contract.

Architecture

┌──────────────────────────────────────────────────────────────┐
│                     Agent Hosts                              │
│   Claude Code · OpenClaw · Codex · OpenCode · ...            │
└──────────────────────┬───────────────────────────────────────┘
                       │ MCP (stdio or HTTP)
┌──────────────────────▼───────────────────────────────────────┐
│              Aquifer MCP Server (canonical API)              │
│   session_recall · session_feedback · memory_stats · ...     │
└──────────────────────┬───────────────────────────────────────┘
                       │
┌──────────────────────▼───────────────────────────────────────┐
│                    createAquifer (engine)                    │
│         Config · Migration · Ingest · Recall · Enrich        │
└────────┬──────────┬──────────┬──────────┬────────────────────┘
         │          │          │          │
    ┌────▼───┐ ┌────▼────┐ ┌───▼──┐ ┌─────▼─────────┐
    │storage │ │hybrid-  │ │entity│ │   pipeline/   │
    │  .js   │ │rank.js  │ │ .js  │ │summarize.js   │
    └────────┘ └─────────┘ └──────┘ │embed.js       │
         │                     │    │extract-ent.js │
    ┌────▼────────────┐    ┌───▼──┐ │rerank.js      │
    │  PostgreSQL     │    │ LLM  │ └───────────────┘
    │  + pgvector     │    │ API  │
    └─────────────────┘    └──────┘

    ┌──────────────────────────────────┐
    │         schema/                  │
    │  001-base.sql (sessions,         │
    │    summaries, turns, FTS)        │
    │  002-entities.sql (KG)           │
    │  003-trust-feedback.sql (trust)  │
    └──────────────────────────────────┘

File Reference

| File | Purpose |
| --- | --- |
| index.js | Entry point: exports createAquifer, createEmbedder, createReranker |
| core/aquifer.js | Main facade: migrate(), ingest(), recall(), enrich() |
| core/storage.js | Session/summary/turn CRUD, FTS search, embedding search |
| core/entity.js | Entity upsert, mention tracking, relation graph, normalization |
| core/hybrid-rank.js | 3-way RRF fusion, time decay, trust multiplier, entity boost, open-loop boost |
| pipeline/summarize.js | LLM-powered session summarization with structured output |
| pipeline/embed.js | Embedding client (any OpenAI-compatible API) |
| pipeline/extract-entities.js | LLM-powered entity extraction (12 types) |
| pipeline/rerank.js | Cross-encoder reranking (TEI, Jina, OpenRouter) |
| pipeline/normalize/ | Session normalization for Claude Code / gateway noise |
| consumers/opencode.js | OpenCode SQLite ingest: reads sessions from OpenCode's local DB |
| schema/001-base.sql | DDL: sessions, summaries, turn_embeddings, FTS indexes |
| schema/002-entities.sql | DDL: entities, mentions, relations, entity_sessions |
| schema/003-trust-feedback.sql | DDL: trust_score column, session_feedback audit trail |

Core Features

3-Way Hybrid Retrieval (RRF)

Query ──┬── FTS (BM25)               ──┐
        ├── Session embedding search ──├── RRF Fusion → Time Decay → Entity Boost → Results
        └── Turn embedding search    ──┘
  • Full-text search: PostgreSQL tsvector with language-aware ranking
  • Session embedding: cosine similarity on session summaries
  • Turn embedding: cosine similarity on individual user turns
  • Reciprocal Rank Fusion: merges all three ranked lists (K=60)
  • Time decay: sigmoid decay with configurable midpoint and steepness
  • Entity boost: sessions mentioning query-relevant entities get a score boost
  • Trust scoring: multiplicative trust multiplier from explicit feedback (helpful/unhelpful)
  • Open-loop boost: sessions with unresolved items get a mild recency boost
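
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion with K=60 plus a sigmoid time decay. It is an illustration under assumptions, not the code in core/hybrid-rank.js: the function names, the decay formula, and the default midpoint and steepness are all hypothetical.

```javascript
// Standard RRF: each list contributes 1 / (k + rank) for every id it contains (rank starts at 1).
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Assumed decay shape: 1 at age 0 (approximately), 0.5 at the midpoint, approaching 0 for old sessions.
function timeDecay(ageDays, { midpointDays = 30, steepness = 0.1 } = {}) {
  return 1 / (1 + Math.exp(steepness * (ageDays - midpointDays)));
}

// Example: FTS, session-embedding, and turn-embedding result lists for one query.
const fused = rrfFuse([
  ['a', 'b', 'c'],   // full-text search ranking
  ['b', 'a'],        // session embedding ranking
  ['b', 'd'],        // turn embedding ranking
]);
```

In this example the session b ranks first because it appears near the top of all three lists, even though it is not first in the full-text ranking.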

Entity Intersection

When you know which entities you're looking for, pass them explicitly:

const results = await aquifer.recall('auth decision', {
  entities: ['auth-middleware', 'legal-compliance'],
  entityMode: 'all',  // only sessions containing BOTH entities
});
  • entityMode: 'any' (default): boost sessions matching any queried entity
  • entityMode: 'all': hard filter that only returns sessions containing every specified entity
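
The difference between the two modes can be sketched over an in-memory candidate list. This is only an illustration: applyEntityMode, the candidate shape, and the additive boost of 0.1 are assumptions, and the real ranking runs inside the recall pipeline.

```javascript
// Hypothetical candidate shape: { sessionId, score, entities: string[] }.
function applyEntityMode(candidates, entities, { entityMode = 'any', boost = 0.1 } = {}) {
  const wanted = entities.map((e) => e.toLowerCase());
  const has = (candidate, entity) =>
    candidate.entities.some((name) => name.toLowerCase() === entity);

  if (entityMode === 'all') {
    // Hard filter: keep only sessions that contain every requested entity.
    return candidates.filter((c) => wanted.every((e) => has(c, e)));
  }
  // 'any': keep everything, boost sessions that match at least one entity.
  return candidates
    .map((c) => (wanted.some((e) => has(c, e)) ? { ...c, score: c.score + boost } : c))
    .sort((a, b) => b.score - a.score);
}
```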

Trust Scoring & Feedback

Sessions accumulate trust through explicit feedback. Low-trust memories are down-weighted in rankings even when they are otherwise relevant.

// After a recall result was useful
await aquifer.feedback('session-id', { verdict: 'helpful' });

// After a recall result was irrelevant
await aquifer.feedback('session-id', { verdict: 'unhelpful' });
  • Asymmetric: helpful +0.05, unhelpful −0.10 (bad memories sink faster)
  • Multiplicative in ranking: trust=0.5 is neutral, trust=0 halves the score, trust=1.0 gives 50% boost
  • Full audit trail in session_feedback table
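
The numbers above are consistent with a simple linear multiplier. The sketch below assumes that shape: the three stated points (0 halves, 0.5 is neutral, 1.0 gives a 50% boost) fit multiplier = 0.5 + trust, but the linear form in between and the clamping to [0, 1] are assumptions, and both function names are hypothetical.

```javascript
// Asymmetric update: helpful +0.05, unhelpful -0.10, clamped to [0, 1] (clamp is assumed).
function applyFeedback(trust, verdict) {
  if (verdict !== 'helpful' && verdict !== 'unhelpful') {
    throw new Error(`unknown verdict: ${verdict}`);
  }
  const delta = verdict === 'helpful' ? 0.05 : -0.10;
  return Math.min(1, Math.max(0, trust + delta));
}

// Multiplier applied to the ranking score: 0.5 at trust=0, 1.0 at trust=0.5, 1.5 at trust=1.
function trustMultiplier(trust) {
  return 0.5 + trust;
}
```

With these rules, two helpful votes are needed to undo one unhelpful vote, which is the "bad memories sink faster" behavior described above.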

Turn-Level Embeddings

Aquifer embeds each meaningful user turn individually, not just session summaries.

  • Filters noise: short messages, slash commands, confirmations ("ok", "got it")
  • Truncates at 2000 chars, skips turns under 5 chars
  • Stores turn text + embedding + position for precise retrieval
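
A turn filter along these lines could look like the following. Treat it as a sketch: the message shape ({ role, content }), the confirmation word list, and the function name are assumptions; only the 5-character minimum, the 2000-character truncation, and the noise categories come from the list above.

```javascript
// Assumed list of low-signal confirmations; the real list is not documented here.
const CONFIRMATIONS = new Set(['ok', 'okay', 'got it', 'thanks', 'sounds good']);

function selectTurnsForEmbedding(messages) {
  return messages
    .map((message, position) => ({ ...message, position }))   // remember original position
    .filter((m) => m.role === 'user')                          // user turns only
    .map((m) => ({ position: m.position, text: String(m.content ?? '').trim() }))
    .filter((t) => t.text.length >= 5)                         // skip turns under 5 chars
    .filter((t) => !t.text.startsWith('/'))                    // skip slash commands
    .filter((t) => !CONFIRMATIONS.has(t.text.toLowerCase()))   // skip confirmations
    .map((t) => ({ ...t, text: t.text.slice(0, 2000) }));      // truncate at 2000 chars
}
```

Each surviving entry keeps its position in the session, so a later recall can point back at the exact turn rather than the whole session.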

Knowledge Graph

Built-in entity extraction and relationship tracking:

  • 12 entity types: person, project, concept, tool, metric, org, place, event, doc, task, topic, other
  • Entity normalization: NFKC + homoglyph mapping + case folding
  • Co-occurrence relations: undirected edges with frequency tracking
  • Entity-session mapping: which entities appear in which sessions
  • Entity boost in ranking: sessions with relevant entities score higher
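
Normalization and co-occurrence counting can be illustrated without a database. This sketch makes several simplifying assumptions: toLowerCase stands in for full case folding, homoglyph mapping is omitted, and the edge key format and function names are hypothetical.

```javascript
// NFKC + lowercase + whitespace collapse. Homoglyph mapping is intentionally left out.
function normalizeEntityName(name) {
  return name.normalize('NFKC').toLowerCase().replace(/\s+/g, ' ').trim();
}

// Undirected edges: sort names so (a, b) and (b, a) produce the same key.
function coOccurrenceEdges(entityNames) {
  const unique = [...new Set(entityNames.map(normalizeEntityName))].sort();
  const edges = [];
  for (let i = 0; i < unique.length; i += 1) {
    for (let j = i + 1; j < unique.length; j += 1) {
      edges.push(`${unique[i]}|${unique[j]}`);
    }
  }
  return edges;
}

// Frequency tracking: bump the count of every edge seen in one session.
function addSession(graph, entityNames) {
  for (const edge of coOccurrenceEdges(entityNames)) {
    graph.set(edge, (graph.get(edge) ?? 0) + 1);
  }
  return graph;
}
```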

Multi-Tenant

Every table includes tenant_id (default: 'default'). Isolation is enforced at the query level: no cross-tenant data leakage by design.

Schema-per-deployment

Pass schema: 'my_app' to createAquifer() and all tables live under that PostgreSQL schema. Run multiple Aquifer instances in the same database without conflicts.

API Reference

createAquifer(config)

Returns an Aquifer instance. Config:

{
  db,          // pg connection string or Pool instance (required)
  schema,      // PG schema name (default: 'aquifer')
  tenantId,    // multi-tenant key (default: 'default')
  embed: { fn, dim },      // embedding function (required for recall)
  llm: { fn },             // LLM function (required for built-in summarize)
  entities: {
    enabled,               // enable KG (default: false)
    scope,                 // entity namespace (default: 'default')
    mergeCall,             // merge entity extraction into summary LLM call (default: true)
  },
  rank: { rrf, timeDecay, access, entityBoost },  // weight overrides
}

aquifer.init()

Startup handshake: resolves pending migrations and returns a StartupEnvelope. Hosts should await this before accepting traffic. In apply mode a ready=false envelope is the signal to abort startup.

const envelope = await aquifer.init();
// {
//   ready:             true,
//   memoryMode:        'rw',        // 'rw' | 'ro' | 'off'
//   migrationMode:     'apply',     // 'apply' | 'check' | 'off'
//   pendingMigrations: [],          // migration ids still outstanding
//   appliedMigrations: ['001-base', '003-trust-feedback', '004-completion', '006-insights'],
//   error:             null,        // { code, message } on failure
//   durationMs:        1035,
// }

The MCP consumer (consumers/mcp.js) already wires aquifer.init() before server.connect() and exits non-zero if ready=false under apply mode.
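
A custom host could follow the same pattern. The sketch below assumes only the envelope fields shown above (ready, migrationMode, error); startHost and the startServer callback are hypothetical, and the aquifer argument is any object exposing init().

```javascript
// Await the startup handshake before accepting traffic; abort in apply mode when not ready.
async function startHost(aquifer, startServer) {
  const envelope = await aquifer.init();
  if (envelope.migrationMode === 'apply' && !envelope.ready) {
    const reason = envelope.error
      ? `${envelope.error.code}: ${envelope.error.message}`
      : 'not ready';
    throw new Error(`Aquifer startup aborted (${reason})`);
  }
  await startServer(envelope);   // hypothetical: bind ports, connect transports, etc.
  return envelope;
}
```

A process entry point would typically catch the thrown error, log it, and exit non-zero, mirroring what the MCP consumer is described as doing.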

aquifer.listPendingMigrations() / aquifer.getMigrationStatus()

Returns { required, applied, pending, lastRunAt } via table and column signature probes (pg_tables plus information_schema.columns for alter-only migrations). No DDL runs. Use it from a health check or from a consumer that wants to surface drift before calling init().

aquifer.migrate()

Runs SQL migrations (idempotent). Creates tables, indexes, triggers, and extensions. Uses pg_try_advisory_lock with a 250 ms poll and a lockTimeoutMs deadline (30 s default); on exhaustion throws with code: 'AQ_MIGRATION_LOCK_TIMEOUT'. On success returns { ok: true, durationMs, notices, ddlExecuted }; on failure throws an error whose err.notices / err.failedAt describe the stage that blew up. Most callers should go through aquifer.init() instead.
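
The lock-acquisition loop described above can be sketched independently of the database. This is an assumption-laden illustration, not the library's code: acquireWithDeadline and its injected tryLock, sleep, and now parameters are hypothetical. With node-postgres, tryLock might wrap a query such as SELECT pg_try_advisory_lock($1), but that wiring is also an assumption.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll a non-blocking lock attempt until it succeeds or the deadline passes.
async function acquireWithDeadline(
  tryLock,
  { lockTimeoutMs = 30000, pollMs = 250, sleep = defaultSleep, now = Date.now } = {},
) {
  const deadline = now() + lockTimeoutMs;
  for (;;) {
    if (await tryLock()) return true;                  // lock acquired
    if (now() >= deadline) {
      const err = new Error('migration lock not acquired before deadline');
      err.code = 'AQ_MIGRATION_LOCK_TIMEOUT';          // code named in the text above
      throw err;
    }
    await sleep(pollMs);                               // 250 ms poll by default
  }
}
```

Polling a try-lock instead of blocking on the lock keeps the wait bounded, so a stuck migrator in another process surfaces as a clear timeout error rather than a hung startup.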

aquifer.ensureMigrated()

Lazy idempotent wrapper: fires migrate() once on the first call and is a no-op afterwards. Honors migrations.mode: check only probes, off marks the instance migrated without touching the DB.

aquifer.commit(sessionId, messages, opts)

Stores a session. Returns { id, sessionId, isNew }.

await aquifer.commit('session-001', messages, {
  agentId: 'main',
  source: 'api',
  sessionKey: 'optional-key',
  model: 'gpt-4o',
  tokensIn: 1500,
  tokensOut: 800,
  startedAt: isoString,
  lastMessageAt: isoString,
});

aquifer.enrich(sessionId, opts)

Enriches a committed session: summarize, embed turns, extract entities. Uses optimistic locking with stale-reclaim (sessions stuck processing > 10 min are reclaimable).

const result = await aquifer.enrich('session-001', {
  agentId: 'main',
  summaryFn,          // custom summarize pipeline (bypasses built-in LLM)
  entityParseFn,      // custom entity parser
  postProcess,        // async callback after tx commit
  model: 'override',  // model metadata override
  skipSummary: false,
  skipTurnEmbed: false,
  skipEntities: false,
});
// Returns: { summary, turnsEmbedded, entitiesFound, warnings, effectiveModel, postProcessError }

postProcess hook: runs after transaction commit, receives full context (session, summary, embedding, parsedEntities, etc.). Best-effort, at-most-once. If the hook throws, the error is captured and returned as postProcessError on the enrich result; the session itself remains committed and is not retried.
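Because a failing hook does not fail enrich(), callers need to inspect the result themselves. The sketch below shows one way to do that. storeAndEnrich is a hypothetical helper; it assumes warnings is an array and that postProcessError is either an Error or a string, neither of which is confirmed here.

```javascript
// Hypothetical pipeline helper. `aquifer` is an instance from createAquifer().
// Assumes `warnings` is an array and `postProcessError` is an Error or string.
async function storeAndEnrich(aquifer, sessionId, messages, opts = {}) {
  const committed = await aquifer.commit(sessionId, messages, { agentId: opts.agentId });
  const enriched = await aquifer.enrich(sessionId, { agentId: opts.agentId });

  // The session stays committed even when the hook failed, so report it
  // instead of retrying the whole enrich.
  const problems = [...(enriched.warnings ?? [])];
  if (enriched.postProcessError) {
    const detail = enriched.postProcessError.message ?? String(enriched.postProcessError);
    problems.push(`postProcess failed: ${detail}`);
  }
  return { isNew: committed.isNew, turnsEmbedded: enriched.turnsEmbedded, problems };
}
```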

aquifer.recall(query, opts)

Hybrid search across sessions using 3-way Reciprocal Rank Fusion (RRF).

const results = await aquifer.recall('search query', {
  agentId: 'main',
  limit: 10,
  entities: ['postgres', 'migration'],
  entityMode: 'all',            // 'any' (default) or 'all'
  weights: { rrf: 0.65, timeDecay: 0.25, access: 0.10, entityBoost: 0.18 },  // per-call weights (same keys as the rank config; values illustrative)
});
// Returns: [{ sessionId, score, trustScore, summaryText, matchedTurnText, _debug, ... }]
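For readers unfamiliar with RRF, here is the textbook form of the technique, not Aquifer's ranking code. The constant k = 60 is the conventional value from the RRF literature, and the three input lists are made-up session ids; which three signals Aquifer fuses is not specified in this section.

```javascript
// Textbook Reciprocal Rank Fusion over any number of ranked id lists.
// k = 60 is the conventional constant, not a value taken from Aquifer.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Three hypothetical ranked lists of session ids, best match first.
const fused = reciprocalRankFusion([
  ['s1', 's2', 's3'],
  ['s2', 's1'],
  ['s2', 's4'],
]);
console.log(fused.map((r) => r.id)); // [ 's2', 's1', 's4', 's3' ]
```

An item that ranks well in several lists beats one that ranks first in only one, which is why s2 leads here.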

aquifer.feedback(sessionId, opts)

Records trust feedback. Returns { trustBefore, trustAfter, verdict }.

await aquifer.feedback('session-id', {
  verdict: 'helpful',   // or 'unhelpful'
  agentId: 'main',
  note: 'reason',
});

aquifer.bootstrap(opts)

Loads recent session context for a new conversation: summaries, open loops, and decisions. Time-based (no embedding search), designed for session-start injection.

const result = await aquifer.bootstrap({
  agentId: 'main',
  limit: 5,              // max sessions (default: 5)
  lookbackDays: 14,      // how far back (default: 14)
  maxChars: 4000,        // max output chars (default: 4000)
  format: 'text',        // 'text', 'structured', or 'both'
});
// format='text': result.text contains XML block ready for injection
// format='structured': result.sessions, result.openLoops, result.recentDecisions

The output is post-processed with cross-session dedup on open loops and decisions, sentinel filtering (removes 無/none/n/a), and maxChars truncation.
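The three post-processing steps can be sketched as plain functions. This is illustrative only: the case-insensitive matching, the trimming, and the plain slice used for truncation are assumptions, and cleanItems / truncate are hypothetical names.

```javascript
// Illustrative only: matching rules and truncation strategy are assumptions.
const SENTINELS = new Set(['無', 'none', 'n/a']);

// Flatten per-session item lists, dropping sentinels and cross-session repeats.
function cleanItems(sessions) {
  const seen = new Set();
  const out = [];
  for (const items of sessions) {
    for (const raw of items) {
      const text = raw.trim();
      const key = text.toLowerCase();
      if (text === '' || SENTINELS.has(key) || seen.has(key)) continue;
      seen.add(key);
      out.push(text);
    }
  }
  return out;
}

// Hard cap on output length, mirroring the maxChars option.
function truncate(text, maxChars = 4000) {
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}
```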

aquifer.insights.commitInsight(opts) / recallInsights(query, opts) / markStale(id) / supersede(oldId, newId)

Higher-order reflections distilled from session windows (preferences, patterns, frustrations, workflows). Split into two identities: a canonical key that describes what the insight is about (stable across rewordings), and an idempotency key that describes which revision of that claim was written.

await aquifer.insights.commitInsight({
  agentId:        'main',
  type:           'preference',
  canonicalClaim: 'mk prefers checking context before coding',  // required: short declarative claim
  title:          'Context-first discipline',                    // best-effort display
  body:           '…',
  entities:       ['mk', 'claude code'],
  sourceSessionIds: ['sess-a', 'sess-b'],
  evidenceWindow:  { from: isoString, to: isoString },
  importance:     0.9,
});

Write rules: duplicate (same idempotency key → return existing), revision (same canonical key + newer evidence → INSERT + inline supersede of prior active), back-fill revision (same canonical key + older evidence → INSERT without supersede), stale replay (same canonical key + same body → return existing). Rows written before 1.5.6 are not retrofitted; their canonical_key_v2 stays NULL and they age out naturally.
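The four write rules read naturally as a decision function. The sketch below is one possible encoding, not Aquifer's implementation: the field names (idempotencyKey, canonicalKey, body, evidenceTo, active), the order in which the rules are checked, and the treatment of equal evidence timestamps as a back-fill are all assumptions.

```javascript
// Sketch of the write rules as a pure decision function.
// Field names and the order of the checks are assumptions, not Aquifer's schema.
function classifyInsightWrite(incoming, existingRows) {
  if (existingRows.some((r) => r.idempotencyKey === incoming.idempotencyKey)) {
    return 'duplicate'; // return existing
  }
  const active = existingRows.find(
    (r) => r.canonicalKey === incoming.canonicalKey && r.active,
  );
  if (!active) return 'new'; // first insight for this canonical key
  if (active.body === incoming.body) return 'stale_replay'; // return existing
  // Compare the end of the evidence windows (ISO timestamps assumed).
  return Date.parse(incoming.evidenceTo) > Date.parse(active.evidenceTo)
    ? 'revision'  // INSERT + supersede the prior active row
    : 'backfill'; // INSERT without supersede
}
```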

aquifer.close()

Closes the PostgreSQL connection pool (only if Aquifer created it).

Configuration

Aquifer resolves config from three sources in priority order: config file → environment variables → programmatic overrides. See consumers/shared/config.js for the full env-to-config mapping.

Config file is auto-discovered at aquifer.config.json in the working directory, or set AQUIFER_CONFIG=/path/to/config.json.
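A small sketch of layered resolution may help when reasoning about which value ends up in effect. It assumes that sources further to the right win and that nested sections merge key by key; both are assumptions, and mergeLayers is a hypothetical helper, not the resolver in consumers/shared/config.js.

```javascript
// Sketch of layered config resolution; assumes later layers win and that
// nested sections are merged key by key. Not Aquifer's actual resolver.
function mergeLayers(...layers) {
  const out = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer ?? {})) {
      const bothObjects =
        value && typeof value === 'object' && !Array.isArray(value) &&
        out[key] && typeof out[key] === 'object' && !Array.isArray(out[key]);
      out[key] = bothObjects ? mergeLayers(out[key], value) : value;
    }
  }
  return out;
}

const fromFile = { schema: 'aquifer', migrations: { mode: 'apply', lockTimeoutMs: 30000 } };
const fromEnv = { migrations: { mode: 'check' } }; // e.g. AQUIFER_MIGRATIONS_MODE=check
const overrides = { schema: 'memory' };            // hypothetical programmatic override

const resolved = mergeLayers(fromFile, fromEnv, overrides);
console.log(resolved);
// { schema: 'memory', migrations: { mode: 'check', lockTimeoutMs: 30000 } }
```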

createAquifer({
  db: 'postgresql://user:pass@localhost/mydb',  // or an existing pg.Pool
  schema: 'aquifer',           // PG schema (default: 'aquifer')
  tenantId: 'default',         // multi-tenant key
  embed: {
    fn: myEmbedFn,             // async (texts: string[]) => number[][]
    dim: 1024,                 // optional dimension hint
  },
  llm: {
    fn: myLlmFn,               // async (prompt: string) => string
  },
  entities: {
    enabled: true,             // enable KG (default: false)
    scope: 'my-app',           // entity namespace, decoupled from agentId
    mergeCall: true,           // merge entity extraction into summary prompt
  },
  rank: {
    rrf: 0.65,                 // FTS + embedding fusion weight
    timeDecay: 0.25,           // recency weight
    access: 0.10,              // access frequency weight
    entityBoost: 0.18,         // entity match boost
  },
  migrations: {
    mode: 'apply',             // 'apply' | 'check' | 'off'
    lockTimeoutMs: 30000,      // abort init() if advisory lock held this long
    startupTimeoutMs: 60000,   // overall init() deadline (plan probe + DDL combined)
    onEvent: null,             // (e) => void, lifecycle hook, see below
  },
});

Startup observability

Set migrations.onEvent to observe the lifecycle without parsing logs. Event names: init_started, check_completed, apply_started, apply_succeeded, apply_failed. Each payload carries schema, mode, the plan, ddlExecuted, durationMs, and on failure the error / failedAt / notices. With no listener set, the hook costs nothing.
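A listener can be as small as an in-memory recorder. In the sketch below, makeStartupRecorder is a hypothetical helper, and the assumption that each payload exposes the event name as a name property is not confirmed by this section; adjust the field to whatever the payload actually uses.

```javascript
// Hypothetical listener factory. Assumes each payload carries the event name
// in a `name` property; that field name is NOT confirmed by the docs above.
function makeStartupRecorder() {
  const events = [];
  return {
    // Closure-based, so it is safe to pass as `migrations.onEvent` detached.
    onEvent(e) {
      events.push({ name: e.name, durationMs: e.durationMs ?? null });
    },
    summary() {
      const failed = events.some((e) => e.name === 'apply_failed');
      return { count: events.length, failed, last: events.at(-1)?.name ?? null };
    },
  };
}
```

Usage would look like createAquifer({ ..., migrations: { onEvent: recorder.onEvent } }), followed by recorder.summary() once init() settles.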

Entity Scope

entities.scope defines the namespace for entity identity. The unique constraint is (tenant_id, normalized_name, entity_scope), so the same entity name in different scopes creates separate entities. This decouples entity identity from agentId, allowing multiple agents to share an entity namespace.

Fallback chain: config.entities.scope → 'default'.
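To make the identity rule concrete, the sketch below builds a lookup key from the same three parts as the unique constraint. entityKey is a hypothetical helper, and the normalization shown (trim plus lowercase) is an assumption rather than Aquifer's normalizer.

```javascript
// Illustrates the uniqueness rule only; the normalization (trim + lowercase)
// is an assumption, not Aquifer's normalizer.
function entityKey({ tenantId, name, scope = 'default' }) {
  const normalizedName = name.trim().toLowerCase();
  return JSON.stringify([tenantId, normalizedName, scope]);
}

const a = entityKey({ tenantId: 'default', name: 'Postgres', scope: 'my-app' });
const b = entityKey({ tenantId: 'default', name: ' postgres ', scope: 'my-app' });
const c = entityKey({ tenantId: 'default', name: 'Postgres', scope: 'other-app' });
console.log(a === b, a === c); // true false
```

Two agents configured with the same scope resolve to the same key and therefore share the entity, while a different scope yields a separate one.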

Database Schema

001-base.sql

| Table | Purpose |
| --- | --- |
| sessions | Raw conversation data with messages (JSONB), token counts, timestamps |
| session_summaries | LLM-generated structured summaries with embeddings |
| turn_embeddings | Per-turn user message embeddings for precise retrieval |

Key indexes: GIN on messages, GiST on tsvector, ivfflat on embeddings, B-tree on tenant/agent/timestamps.

Note: the schema uses basic ivfflat indexes suitable for development and moderate-scale use. For large deployments (100k+ embeddings), consider adding HNSW indexes. This is a future optimization area, not included out of the box.

002-entities.sql

| Table | Purpose |
| --- | --- |
| entities | Normalized named entities with type, aliases, frequency, entity_scope, optional embedding |
| entity_mentions | Entity × session join with mention count and context |
| entity_relations | Co-occurrence edges (undirected, CHECK src < dst) |
| entity_sessions | Entity-session association for boost scoring |

Key indexes: trigram on entity names, GiST on embeddings, unique on (tenant_id, normalized_name, entity_scope).

003-trust-feedback.sql

| Table | Purpose |
| --- | --- |
| session_feedback | Explicit feedback audit trail (helpful/unhelpful verdicts, trust deltas) |

Also adds trust_score column to session_summaries (default 0.5, range 0 to 1).

005-entity-state-history.sql (entities enabled)

| Table | Purpose |
| --- | --- |
| entity_state_history | Temporal state-change log with partial UNIQUE (tenant, agent, entity, attribute) WHERE valid_to IS NULL to enforce at-most-one-current. Out-of-order backfill is supported via predecessor/successor overlap checks |

Opt-in pipeline (createAquifer({stateChanges: {enabled, whitelist, confidenceThreshold, timeoutMs, ...}})) extracts temporal state transitions from session text during enrich(); off by default to control LLM cost.

006-insights.sql

| Table | Purpose |
| --- | --- |
| insights | Higher-order reflections with TSTZRANGE evidence window, importance, GIN on source_session_ids, HNSW on 1024-dim embedding, and a non-unique partial index on canonical_key_v2 for the canonical/revision dedup contract |

Key indexes: idx_insights_canonical_v2_active (partial on active rows with canonical key set), idx_insights_idempotency_key (unique on revision key).

Troubleshooting

error: type "vector" does not exist: the pgvector extension is not installed. Run CREATE EXTENSION IF NOT EXISTS vector; as a superuser, or use the pgvector/pgvector Docker image, which includes it.

aquifer mcp requires @modelcontextprotocol/sdk and zod: these are now regular dependencies and should be installed automatically. If you see this error, run npm install again to ensure all deps are present.

Recall returns no results: make sure you've run enrich after commit. Raw sessions are not searchable until enriched (summarized + embedded). Check aquifer stats to see if summaries and turn embeddings exist.

OpenClaw tools not visible: use mcp.servers.aquifer in openclaw.json, not the plugin. Tools appear as aquifer__memory_recall, aquifer__historical_recall, aquifer__session_recall, etc. The plugin (consumers/openclaw-plugin.js) is for session capture only.

Embedding provider connection refused: verify your AQUIFER_EMBED_BASE_URL is reachable. For local Ollama, make sure the server is running and the model is pulled (ollama pull bge-m3).

AQ_MIGRATION_LOCK_TIMEOUT on startup: another process holds the migration advisory lock for aquifer:<schema>. Either it is a concurrent aquifer.init() racing yours (expected; one will win, the other re-runs and finds pending=[]) or a crashed worker left the lock held. Raise migrations.lockTimeoutMs, or terminate the stale backend. List candidates with SELECT pid FROM pg_locks WHERE locktype='advisory', confirm which pid is dead, then run SELECT pg_terminate_backend(<pid>) for that pid only; running pg_terminate_backend(pid) over the unfiltered pg_locks query terminates every backend that holds an advisory lock.

MCP process exits non-zero at startup: expected when migrations.mode=apply and aquifer.init() returns ready=false. Read the [aquifer-mcp] startup aborted line on stderr for the error.code / failedAt. If you need the old lazy-migrate-on-first-tool-call behaviour instead, set AQUIFER_MIGRATIONS_MODE=check (and run migrate() out of band) or =off.

Dependencies

| Package | Purpose |
| --- | --- |
| pg ≥ 8.13 | PostgreSQL client |
| @modelcontextprotocol/sdk ≥ 1.29 | MCP server protocol |
| zod ≥ 3.25 | Schema validation (MCP tools) |

LLM and embedding calls use raw HTTP; no additional SDK is required.

License

MIT

Keywords

ai


Package last updated on 22 May 2026
