memory-lancedb-pro

OpenClaw enhanced LanceDB memory plugin with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, long-context chunking, and management CLI

Version: 1.0.32 (latest)
Source: npm
Weekly downloads: 3.1K (-0.58%)
Maintainers: 1

🧠 memory-lancedb-pro · OpenClaw Plugin

Enhanced Long-Term Memory Plugin for OpenClaw

Hybrid Retrieval (Vector + BM25) · Cross-Encoder Rerank · Multi-Scope Isolation · Management CLI

OpenClaw Plugin LanceDB License: MIT

📺 Video Tutorial

Watch the full walkthrough, which covers installation, configuration, and how hybrid retrieval works under the hood.

YouTube Video 🔗 https://youtu.be/MtukF1C8epQ

Bilibili Video 🔗 https://www.bilibili.com/video/BV1zUf2BGEgn/

Why This Plugin?

The built-in memory-lancedb plugin in OpenClaw provides basic vector search. memory-lancedb-pro takes it much further:

| Feature | Built-in memory-lancedb | memory-lancedb-pro |
| --- | --- | --- |
| Vector search | ✅ | ✅ |
| BM25 full-text search | ❌ | ✅ |
| Hybrid fusion (Vector + BM25) | ❌ | ✅ |
| Cross-encoder rerank (Jina / custom endpoint) | ❌ | ✅ |
| Recency boost | ❌ | ✅ |
| Time decay | ❌ | ✅ |
| Length normalization | ❌ | ✅ |
| MMR diversity | ❌ | ✅ |
| Multi-scope isolation | ❌ | ✅ |
| Noise filtering | ❌ | ✅ |
| Adaptive retrieval | ❌ | ✅ |
| Management CLI | ❌ | ✅ |
| Session memory | ❌ | ✅ |
| Task-aware embeddings | ❌ | ✅ |
| Any OpenAI-compatible embedding | Limited | ✅ (OpenAI, Gemini, Jina, Ollama, etc.) |

Architecture

┌─────────────────────────────────────────────────────────┐
│                   index.ts (Entry Point)                │
│  Plugin Registration · Config Parsing · Lifecycle Hooks │
└────────┬──────────┬──────────┬──────────┬───────────────┘
         │          │          │          │
    ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌──▼──────────┐
    │ store  │ │embedder│ │retriever│ │   scopes    │
    │ .ts    │ │ .ts    │ │ .ts    │ │    .ts      │
    └────────┘ └────────┘ └────────┘ └─────────────┘
         │                     │
    ┌────▼───┐           ┌─────▼──────────┐
    │migrate │           │noise-filter.ts │
    │ .ts    │           │adaptive-       │
    └────────┘           │retrieval.ts    │
                         └────────────────┘
    ┌─────────────┐   ┌──────────┐
    │  tools.ts   │   │  cli.ts  │
    │ (Agent API) │   │ (CLI)    │
    └─────────────┘   └──────────┘

File Reference

| File | Purpose |
| --- | --- |
| index.ts | Plugin entry point. Registers with OpenClaw Plugin API, parses config, mounts before_agent_start (auto-recall), agent_end (auto-capture), and command:new (session memory) hooks |
| openclaw.plugin.json | Plugin metadata + full JSON Schema config declaration (with uiHints) |
| package.json | NPM package info. Depends on @lancedb/lancedb, openai, @sinclair/typebox |
| cli.ts | CLI commands: memory list/search/stats/delete/delete-bulk/export/import/reembed/migrate |
| src/store.ts | LanceDB storage layer. Table creation / FTS indexing / Vector search / BM25 search / CRUD / bulk delete / stats |
| src/embedder.ts | Embedding abstraction. Compatible with any OpenAI-API provider (OpenAI, Gemini, Jina, Ollama, etc.). Supports task-aware embedding (taskQuery/taskPassage) |
| src/retriever.ts | Hybrid retrieval engine. Vector + BM25 → RRF fusion → Jina Cross-Encoder Rerank → Recency Boost → Importance Weight → Length Norm → Time Decay → Hard Min Score → Noise Filter → MMR Diversity |
| src/scopes.ts | Multi-scope access control. Supports global, agent:<id>, custom:<name>, project:<id>, user:<id> |
| src/tools.ts | Agent tool definitions: memory_recall, memory_store, memory_forget (core) + memory_stats, memory_list (management) |
| src/noise-filter.ts | Noise filter. Filters out agent refusals, meta-questions, greetings, and low-quality content |
| src/adaptive-retrieval.ts | Adaptive retrieval. Determines whether a query needs memory retrieval (skips greetings, slash commands, simple confirmations, emoji) |
| src/migrate.ts | Migration tool. Migrates data from the built-in memory-lancedb plugin to Pro |

Core Features

1. Hybrid Retrieval

Query → embedQuery() ─┐
                      ├─→ RRF Fusion → Rerank → Recency Boost → Importance Weight → Filter
Query → BM25 FTS ─────┘
  • Vector Search: Semantic similarity via LanceDB ANN (cosine distance)
  • BM25 Full-Text Search: Exact keyword matching via LanceDB FTS index
  • Fusion Strategy: Vector score as base, BM25 hits get a 15% boost (tuned beyond traditional RRF)
  • Configurable Weights: vectorWeight, bm25Weight, minScore
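
The fusion rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the plugin's actual code: the `Hit` type and `fuseScores` name are hypothetical, the boost is assumed to be multiplicative, scores are assumed to be normalized to [0, 1], and BM25-only hits are assumed to keep their BM25 score. Only "vector score as base, 15% boost for BM25 hits" comes from the description above.

```typescript
// Hypothetical shape for illustration; the plugin's real types may differ.
interface Hit {
  id: string;
  score: number; // assumed normalized to [0, 1]
}

const BM25_BOOST = 0.15; // "BM25 hits get a 15% boost"

function fuseScores(vectorHits: Hit[], bm25Hits: Hit[]): Hit[] {
  const bm25ById = new Map<string, number>();
  for (const h of bm25Hits) bm25ById.set(h.id, h.score);

  const fused = new Map<string, number>();
  // The vector score is the base; a keyword match multiplies it by 1.15.
  for (const h of vectorHits) {
    const boosted = bm25ById.has(h.id) ? h.score * (1 + BM25_BOOST) : h.score;
    fused.set(h.id, Math.min(1, boosted)); // assumption: clamp to 1.0
  }
  // Assumption: hits found only by BM25 keep their BM25 score unchanged.
  for (const h of bm25Hits) {
    if (!fused.has(h.id)) fused.set(h.id, h.score);
  }

  const result: Hit[] = [];
  fused.forEach((score, id) => result.push({ id, score }));
  return result.sort((a, b) => b.score - a.score);
}
```

With this rule, a memory that matches both semantically and by keyword can overtake one with a slightly higher vector score but no keyword match.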

2. Cross-Encoder Reranking

  • Reranker API: Jina, SiliconFlow, Pinecone, or any compatible endpoint (5s timeout protection)
  • Hybrid Scoring: 60% cross-encoder score + 40% original fused score
  • Graceful Degradation: Falls back to cosine similarity reranking on API failure
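
As a rough sketch of the 60/40 blend and the fallback path: the function names below are hypothetical, and it is an assumption that the cosine fallback is blended with the same weights as a real cross-encoder score. The 5s timeout and the HTTP call are left out; a failed or timed-out call is represented by `null`.

```typescript
// Blend weights taken from the description above.
const CROSS_ENCODER_WEIGHT = 0.6;
const FUSED_WEIGHT = 0.4;

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// rerankScore is null when the reranker call failed or timed out.
function blendScore(
  fusedScore: number,
  rerankScore: number | null,
  queryVec: number[],
  docVec: number[],
): number {
  // Graceful degradation: substitute cosine similarity for the missing score.
  const relevance =
    rerankScore !== null ? rerankScore : cosineSimilarity(queryVec, docVec);
  return CROSS_ENCODER_WEIGHT * relevance + FUSED_WEIGHT * fusedScore;
}
```

Keeping 40% of the fused score means a single noisy reranker response cannot completely override the first-stage ranking.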

3. Multi-Stage Scoring Pipeline

| Stage | Formula | Effect |
| --- | --- | --- |
| Recency Boost | exp(-ageDays / halfLife) * weight | Newer memories score higher (default: 14-day half-life, 0.10 weight) |
| Importance Weight | score *= (0.7 + 0.3 * importance) | importance=1.0 → ×1.0, importance=0.5 → ×0.85 |
| Length Normalization | score *= 1 / (1 + 0.5 * log2(len/anchor)) | Prevents long entries from dominating (anchor: 500 chars) |
| Time Decay | score *= 0.5 + 0.5 * exp(-ageDays / halfLife) | Old entries gradually lose weight, floor at 0.5× (60-day half-life) |
| Hard Min Score | Discard if score < threshold | Removes irrelevant results (default: 0.35) |
| MMR Diversity | Cosine similarity > 0.85 → demoted | Prevents near-duplicate results |
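
The formulas in the table translate fairly directly into code. The sketch below is illustrative only: the type and function names are hypothetical, the recency boost is assumed to be additive, and the length ratio is assumed to be clamped at 1 so that entries shorter than the anchor are neither boosted nor penalized. MMR is omitted.

```typescript
interface ScoringConfig {
  recencyHalfLifeDays: number;
  recencyWeight: number;
  lengthNormAnchor: number;
  timeDecayHalfLifeDays: number;
  hardMinScore: number;
}

// Defaults as listed in the table above.
const SCORING_DEFAULTS: ScoringConfig = {
  recencyHalfLifeDays: 14,
  recencyWeight: 0.1,
  lengthNormAnchor: 500,
  timeDecayHalfLifeDays: 60,
  hardMinScore: 0.35,
};

interface EntryInfo {
  ageDays: number;
  importance: number; // 0..1
  textLength: number; // characters
}

function recencyBoost(ageDays: number, cfg: ScoringConfig): number {
  return Math.exp(-ageDays / cfg.recencyHalfLifeDays) * cfg.recencyWeight;
}

function importanceFactor(importance: number): number {
  return 0.7 + 0.3 * importance;
}

function lengthFactor(len: number, cfg: ScoringConfig): number {
  // Assumption: clamp the ratio at 1 so short entries are left unchanged.
  const ratio = Math.max(1, len / cfg.lengthNormAnchor);
  return 1 / (1 + 0.5 * (Math.log(ratio) / Math.LN2)); // log2(ratio)
}

function timeDecayFactor(ageDays: number, cfg: ScoringConfig): number {
  return 0.5 + 0.5 * Math.exp(-ageDays / cfg.timeDecayHalfLifeDays);
}

// Returns null when the entry falls below the hard minimum score.
function scoreEntry(
  baseScore: number,
  entry: EntryInfo,
  cfg: ScoringConfig = SCORING_DEFAULTS,
): number | null {
  let score = baseScore + recencyBoost(entry.ageDays, cfg); // assumption: additive
  score *= importanceFactor(entry.importance);
  score *= lengthFactor(entry.textLength, cfg);
  score *= timeDecayFactor(entry.ageDays, cfg);
  return score < cfg.hardMinScore ? null : score;
}
```

For example, a 2000-character entry with a 500-character anchor gets a length factor of 1 / (1 + 0.5 × 2) = 0.5, so it needs twice the base relevance of a short entry to rank equally.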

4. Multi-Scope Isolation

  • Built-in Scopes: global, agent:<id>, custom:<name>, project:<id>, user:<id>
  • Agent-Level Access Control: Configure per-agent scope access via scopes.agentAccess
  • Default Behavior: Each agent accesses global + its own agent:<id> scope
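
The default access rule can be pictured as a small lookup. In this sketch the function names are hypothetical, and it is an assumption that an explicit `scopes.agentAccess` entry replaces the default list rather than extending it.

```typescript
type AgentAccess = { [agentId: string]: string[] };

function accessibleScopes(agentId: string, agentAccess: AgentAccess = {}): string[] {
  // An explicit entry wins; otherwise use the documented default.
  if (Object.prototype.hasOwnProperty.call(agentAccess, agentId)) {
    return agentAccess[agentId];
  }
  return ["global", "agent:" + agentId];
}

function canAccess(agentId: string, scope: string, agentAccess: AgentAccess = {}): boolean {
  return accessibleScopes(agentId, agentAccess).indexOf(scope) !== -1;
}
```

Under these assumptions an agent with no configuration can read `global` and its own `agent:<id>` scope, but not another agent's scope or a `project:<id>` scope unless it is listed explicitly.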

5. Adaptive Retrieval

  • Skips queries that don't need memory (greetings, slash commands, simple confirmations, emoji)
  • Forces retrieval for memory-related keywords ("remember", "previously", "last time", etc.)
  • CJK-aware thresholds (Chinese: 6 chars vs English: 15 chars)
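
A simplified version of this gate might look like the sketch below. The word lists and regular expressions are hypothetical stand-ins, the emoji check is deliberately crude (anything without ASCII letters, digits, or CJK characters is skipped), and it is an assumption that the thresholds are minimum query lengths.

```typescript
const FORCE_KEYWORDS = ["remember", "previously", "last time"]; // from the list above
const SKIP_EXACT = ["hi", "hello", "hey", "ok", "okay", "yes", "no", "thanks"]; // hypothetical
const CJK_CHAR = /[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]/;
const ASCII_WORD_CHAR = /[A-Za-z0-9]/;

function shouldRetrieve(query: string): boolean {
  const text = query.trim();
  const lower = text.toLowerCase();
  if (text.length === 0) return false;

  // Memory-related keywords force retrieval regardless of length.
  for (const keyword of FORCE_KEYWORDS) {
    if (lower.indexOf(keyword) !== -1) return true;
  }

  if (text.charAt(0) === "/") return false; // slash command
  if (SKIP_EXACT.indexOf(lower) !== -1) return false; // greeting or confirmation

  const hasCjk = CJK_CHAR.test(text);
  if (!hasCjk && !ASCII_WORD_CHAR.test(text)) return false; // emoji or punctuation only

  // CJK text carries more meaning per character, so its threshold is lower.
  const minLength = hasCjk ? 6 : 15;
  return text.length >= minLength;
}
```

Skipping retrieval for short or trivial messages avoids an embedding call and keeps irrelevant memories out of the prompt.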

6. Noise Filtering

Filters out low-quality content at both auto-capture and tool-store stages:

  • Agent refusal responses ("I don't have any information")
  • Meta-questions ("do you remember")
  • Greetings ("hi", "hello", "HEARTBEAT")
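
To make the three categories concrete, here is a minimal sketch built only from the examples above. The patterns and the `isNoise` name are hypothetical; the real filter in src/noise-filter.ts presumably covers many more phrasings.

```typescript
// Hypothetical patterns derived from the examples listed above.
const NOISE_PATTERNS: RegExp[] = [
  /i don'?t have any information/i, // agent refusal
  /^do you remember\b/i, // meta-question
  /^(hi|hello|heartbeat)[\s!.]*$/i, // greeting or keep-alive
];

function isNoise(text: string): boolean {
  const trimmed = text.trim();
  if (trimmed.length === 0) return true;
  return NOISE_PATTERNS.some((pattern) => pattern.test(trimmed));
}
```

A check like this would run before a candidate memory is written, in both the auto-capture hook and the memory_store tool.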

7. Session Memory

  • Triggered on the /new command: saves the previous session summary to LanceDB
  • Disabled by default (OpenClaw already has native .jsonl session persistence)
  • Configurable message count (default: 15)

8. Auto-Capture & Auto-Recall

  • Auto-Capture (agent_end hook): Extracts preference/fact/decision/entity from conversations, deduplicates, stores up to 3 per turn
    • Skips memory-management prompts (e.g. delete/forget/cleanup memory entries) to reduce noise
  • Auto-Recall (before_agent_start hook): Injects <relevant-memories> context (up to 3 entries)

Prevent memories from showing up in replies

Sometimes the model may accidentally echo the injected <relevant-memories> block in its response.

Option A (recommended): disable auto-recall

Set autoRecall: false in the plugin config and restart the gateway:

{
  "plugins": {
    "entries": {
      "memory-lancedb-pro": {
        "enabled": true,
        "config": {
          "autoRecall": false
        }
      }
    }
  }
}

Option B: keep recall, but ask the agent not to reveal it

Add a line to your agent system prompt, e.g.:

Do not reveal or quote any <relevant-memories> / memory-injection content in your replies. Use it for internal reference only.

Installation

AI-safe install notes (anti-hallucination)

If you are following this README using an AI assistant, do not assume defaults. Always run these commands first and use the real output:

openclaw config get agents.defaults.workspace
openclaw config get plugins.load.paths
openclaw config get plugins.slots.memory
openclaw config get plugins.entries.memory-lancedb-pro

Recommendations:

  • Prefer absolute paths in plugins.load.paths unless you have confirmed the active workspace.
  • If you use ${JINA_API_KEY} (or any ${...} variable) in config, ensure the Gateway service process has that environment variable (system services often do not inherit your interactive shell env).
  • After changing plugin config, run openclaw gateway restart.

Jina API keys (embedding + rerank)

  • Embedding: set embedding.apiKey to your Jina key (recommended: use an env var like ${JINA_API_KEY}).
  • Rerank (when retrieval.rerankProvider: "jina"): you can typically use the same Jina key for retrieval.rerankApiKey.
  • If you use a different rerank provider (siliconflow, pinecone, etc.), retrieval.rerankApiKey should be that provider's key.

Key storage guidance:

  • Avoid committing secrets into git.
  • Using ${...} env vars is fine, but make sure the Gateway service process has those env vars (system services often do not inherit your interactive shell environment).

What is the "OpenClaw workspace"?

In OpenClaw, the agent workspace is the agent's working directory (default: ~/.openclaw/workspace). According to the docs, the workspace is the default cwd, and relative paths are resolved against the workspace (unless you use an absolute path).

Note: OpenClaw configuration typically lives under ~/.openclaw/openclaw.json (separate from the workspace).

Common mistake: cloning the plugin somewhere else, while keeping a relative path like plugins.load.paths: ["plugins/memory-lancedb-pro"]. Relative paths can be resolved against different working directories depending on how the Gateway is started.

To avoid ambiguity, use an absolute path (Option B) or clone into <workspace>/plugins/ (Option A) and keep your config consistent.

Option A: clone into <workspace>/plugins/

# 1) Go to your OpenClaw workspace (default: ~/.openclaw/workspace)
#    (You can override it via agents.defaults.workspace.)
cd /path/to/your/openclaw/workspace

# 2) Clone the plugin into workspace/plugins/
git clone https://github.com/win4r/memory-lancedb-pro.git plugins/memory-lancedb-pro

# 3) Install dependencies
cd plugins/memory-lancedb-pro
npm install

Then reference it with a relative path in your OpenClaw config:

{
  "plugins": {
    "load": {
      "paths": ["plugins/memory-lancedb-pro"]
    },
    "entries": {
      "memory-lancedb-pro": {
        "enabled": true,
        "config": {
          "embedding": {
            "apiKey": "${JINA_API_KEY}",
            "model": "jina-embeddings-v5-text-small",
            "baseURL": "https://api.jina.ai/v1",
            "dimensions": 1024,
            "taskQuery": "retrieval.query",
            "taskPassage": "retrieval.passage",
            "normalized": true
          }
        }
      }
    },
    "slots": {
      "memory": "memory-lancedb-pro"
    }
  }
}

Option B: clone anywhere, but use an absolute path

{
  "plugins": {
    "load": {
      "paths": ["/absolute/path/to/memory-lancedb-pro"]
    }
  }
}

Restart

openclaw gateway restart

Note: If you previously used the built-in memory-lancedb, disable it when enabling this plugin. Only one memory plugin can be active at a time.

  • Confirm the plugin is discoverable/loaded:
openclaw plugins list
openclaw plugins info memory-lancedb-pro
  • If anything looks wrong, run the built-in diagnostics:
openclaw plugins doctor
  • Confirm the memory slot points to this plugin:
# Look for: plugins.slots.memory = "memory-lancedb-pro"
openclaw config get plugins.slots.memory

Configuration

Full Configuration Example
{
  "embedding": {
    "apiKey": "${JINA_API_KEY}",
    "model": "jina-embeddings-v5-text-small",
    "baseURL": "https://api.jina.ai/v1",
    "dimensions": 1024,
    "taskQuery": "retrieval.query",
    "taskPassage": "retrieval.passage",
    "normalized": true
  },
  "dbPath": "~/.openclaw/memory/lancedb-pro",
  "autoCapture": true,
  "autoRecall": false,
  "retrieval": {
    "mode": "hybrid",
    "vectorWeight": 0.7,
    "bm25Weight": 0.3,
    "minScore": 0.3,
    "rerank": "cross-encoder",
    "rerankApiKey": "${JINA_API_KEY}",
    "rerankModel": "jina-reranker-v3",
    "rerankEndpoint": "https://api.jina.ai/v1/rerank",
    "rerankProvider": "jina",
    "candidatePoolSize": 20,
    "recencyHalfLifeDays": 14,
    "recencyWeight": 0.1,
    "filterNoise": true,
    "lengthNormAnchor": 500,
    "hardMinScore": 0.35,
    "timeDecayHalfLifeDays": 60,
    "reinforcementFactor": 0.5,
    "maxHalfLifeMultiplier": 3
  },
  "enableManagementTools": false,
  "scopes": {
    "default": "global",
    "definitions": {
      "global": { "description": "Shared knowledge" },
      "agent:discord-bot": { "description": "Discord bot private" }
    },
    "agentAccess": {
      "discord-bot": ["global", "agent:discord-bot"]
    }
  },
  "sessionMemory": {
    "enabled": false,
    "messageCount": 15
  }
}

Access Reinforcement (1.0.26)

To make frequently used memories decay more slowly, the retriever can extend the effective time-decay half-life based on manual recall frequency (spaced-repetition style).

Config keys (under retrieval):

  • reinforcementFactor (range: 0–2, default: 0.5): set to 0 to disable
  • maxHalfLifeMultiplier (range: 1–10, default: 3): hard cap, effective half-life ≤ base × multiplier

Notes:

  • Reinforcement is whitelisted to source: "manual" (i.e. user/tool initiated recall), to avoid accidental strengthening from auto-recall.
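
One way to picture how the two keys interact is the sketch below. The growth curve is an assumption: the source only states that the half-life grows with manual recall frequency and is capped at base × multiplier, so the logarithmic form and the `effectiveHalfLifeDays` name are hypothetical.

```typescript
function effectiveHalfLifeDays(
  baseHalfLifeDays: number,
  manualRecallCount: number,
  reinforcementFactor: number = 0.5,
  maxHalfLifeMultiplier: number = 3,
): number {
  if (reinforcementFactor <= 0) return baseHalfLifeDays; // 0 disables reinforcement

  // Assumption: logarithmic growth in the number of manual recalls.
  const multiplier = 1 + reinforcementFactor * Math.log(1 + manualRecallCount);

  // Hard cap: effective half-life never exceeds base * maxHalfLifeMultiplier.
  return baseHalfLifeDays * Math.min(multiplier, maxHalfLifeMultiplier);
}
```

With the defaults and a 60-day base half-life, a memory that is never recalled manually keeps the 60-day half-life, and no amount of recall can push it beyond 180 days.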

Embedding Providers

This plugin works with any OpenAI-compatible embedding API:

| Provider | Model | Base URL | Dimensions |
| --- | --- | --- | --- |
| Jina (recommended) | jina-embeddings-v5-text-small | https://api.jina.ai/v1 | 1024 |
| OpenAI | text-embedding-3-small | https://api.openai.com/v1 | 1536 |
| Google Gemini | gemini-embedding-001 | https://generativelanguage.googleapis.com/v1beta/openai/ | 3072 |
| Ollama (local) | nomic-embed-text | http://localhost:11434/v1 | provider-specific (set embedding.dimensions to match your Ollama model output) |

Rerank Providers

Cross-encoder reranking supports multiple providers via rerankProvider:

| Provider | rerankProvider | Endpoint | Example Model |
| --- | --- | --- | --- |
| Jina (default) | jina | https://api.jina.ai/v1/rerank | jina-reranker-v3 |
| SiliconFlow (free tier available) | siliconflow | https://api.siliconflow.com/v1/rerank | BAAI/bge-reranker-v2-m3, Qwen/Qwen3-Reranker-8B |
| Voyage AI | voyage | https://api.voyageai.com/v1/rerank | rerank-2.5 |
| Pinecone | pinecone | https://api.pinecone.io/rerank | bge-reranker-v2-m3 |

Notes:

  • voyage sends { model, query, documents } without top_n.
  • Voyage responses are parsed from data[].relevance_score.
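
The two Voyage notes can be illustrated with a pair of pure helpers. The names are hypothetical and no HTTP call is made; the `index` field on each response item is an assumption about the response shape, while `{ model, query, documents }` and `data[].relevance_score` come from the notes above.

```typescript
interface VoyageRerankRequest {
  model: string;
  query: string;
  documents: string[];
}

interface VoyageRerankResponse {
  // `index` (position of the document in the request) is assumed here.
  data: { index: number; relevance_score: number }[];
}

function buildVoyageRequest(
  model: string,
  query: string,
  documents: string[],
): VoyageRerankRequest {
  return { model, query, documents }; // deliberately no top_n field
}

// Returns one score per input document, in input order; missing items score 0.
function parseVoyageScores(res: VoyageRerankResponse, docCount: number): number[] {
  const scores: number[] = [];
  for (let i = 0; i < docCount; i++) scores.push(0);
  for (const item of res.data) {
    if (item.index >= 0 && item.index < docCount) {
      scores[item.index] = item.relevance_score;
    }
  }
  return scores;
}
```

Mapping scores back to input order keeps the reranker output aligned with the candidate list, whatever order the provider returns results in.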
SiliconFlow Example
{
  "retrieval": {
    "rerank": "cross-encoder",
    "rerankProvider": "siliconflow",
    "rerankEndpoint": "https://api.siliconflow.com/v1/rerank",
    "rerankApiKey": "sk-xxx",
    "rerankModel": "BAAI/bge-reranker-v2-m3"
  }
}
Voyage Example
{
  "retrieval": {
    "rerank": "cross-encoder",
    "rerankProvider": "voyage",
    "rerankEndpoint": "https://api.voyageai.com/v1/rerank",
    "rerankApiKey": "${VOYAGE_API_KEY}",
    "rerankModel": "rerank-2.5"
  }
}
Pinecone Example
{
  "retrieval": {
    "rerank": "cross-encoder",
    "rerankProvider": "pinecone",
    "rerankEndpoint": "https://api.pinecone.io/rerank",
    "rerankApiKey": "pcsk_xxx",
    "rerankModel": "bge-reranker-v2-m3"
  }
}

Optional: JSONL Session Distillation (Auto-memories from chat logs)

OpenClaw already persists full session transcripts as JSONL files:

  • ~/.openclaw/agents/<agentId>/sessions/*.jsonl

This plugin focuses on high-quality long-term memory. If you dump raw transcripts into LanceDB, retrieval quality quickly degrades.

Instead, the recommended approach (2026-02+) is a non-blocking /new pipeline:

  • Trigger: command:new (you type /new)
  • Hook: enqueue a tiny JSON task file (fast; no LLM calls inside the hook)
  • Worker: a user-level systemd service watches the inbox and runs Gemini Map-Reduce on the session JSONL transcript
  • Store: writes 0–20 high-signal, atomic lessons into LanceDB Pro via openclaw memory-pro import
  • Keywords: each memory includes Keywords (zh) with a simple taxonomy (Entity + Action + Symptom). Entity keywords must be copied verbatim from the transcript (no hallucinated project names).
  • Notify: optional Telegram/Discord notification (even if 0 lessons)

See the self-contained example files in:

  • examples/new-session-distill/

Legacy option: an hourly distiller cron that:

  • Incrementally reads only the newly appended tail of each session JSONL (byte-offset cursor)
  • Filters noise (tool output, injected <relevant-memories>, logs, boilerplate)
  • Uses a dedicated agent to distill reusable lessons / rules / preferences into short atomic memories
  • Stores them via memory_store into the right scope (global or agent:<agentId>)
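
The byte-offset cursor idea can be sketched as a pure function over the file's bytes. This is not the logic of scripts/jsonl_distill.py (which is written in Python); the names are hypothetical, and both the "stop at the last newline" rule and the reset on truncation are assumptions about how such a cursor is usually made safe.

```typescript
interface TailResult {
  chunk: Uint8Array; // bytes of the complete new lines (may be empty)
  nextOffset: number; // cursor value to persist after a successful run
}

function readNewCompleteLines(file: Uint8Array, offset: number): TailResult {
  // Assumption: if the file shrank (rotated or truncated), start over.
  const start = offset > file.length ? 0 : offset;

  // Consume only up to the last newline, so a half-written line
  // is left for the next run instead of being parsed as broken JSON.
  let end = -1;
  for (let i = file.length - 1; i >= start; i--) {
    if (file[i] === 0x0a) {
      end = i + 1;
      break;
    }
  }

  if (end === -1) {
    return { chunk: file.subarray(start, start), nextOffset: start };
  }
  return { chunk: file.subarray(start, end), nextOffset: end };
}
```

Because the cursor only advances past fully written lines, running the function twice on an unchanged file returns an empty chunk the second time.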

What you get

  • ✅ Fully automatic (cron)
  • ✅ Multi-agent support (main + bots)
  • ✅ No re-reading: cursor ensures the next run only processes new lines
  • ✅ Memory hygiene: quality gate + dedupe + per-run caps

Script

This repo includes the extractor script:

  • scripts/jsonl_distill.py

It produces a small batch JSON file under:

  • ~/.openclaw/state/jsonl-distill/batches/

and keeps a cursor here:

  • ~/.openclaw/state/jsonl-distill/cursor.json

The script is safe: it never modifies session logs.

By default it skips historical reset snapshots (*.reset.*) and excludes the distiller agent itself (memory-distiller) to prevent self-ingestion loops.

Optional: restrict distillation sources (allowlist)

By default, the extractor scans all agents (except memory-distiller).

If you want higher signal (e.g., only distill from your main assistant + coding bot), set:

export OPENCLAW_JSONL_DISTILL_ALLOWED_AGENT_IDS="main,code-agent"
  • Unset / empty / * / all → allow all agents (default)
  • Comma-separated list → only those agents are scanned
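
The allowlist semantics above can be expressed as a small parser. This sketch mirrors the documented behavior rather than the extractor script's actual code; the function names are hypothetical, and trimming whitespace around IDs is an assumption.

```typescript
// Returns null when every agent is allowed, otherwise the allowed agent IDs.
function parseAllowedAgents(raw: string | undefined): string[] | null {
  const value = (raw === undefined ? "" : raw).trim();
  if (value === "" || value === "*" || value === "all") return null;
  return value
    .split(",")
    .map((id) => id.trim()) // assumption: surrounding whitespace is ignored
    .filter((id) => id.length > 0);
}

function isAgentScanned(agentId: string, allowed: string[] | null): boolean {
  // The distiller agent is excluded by default to prevent self-ingestion loops.
  if (agentId === "memory-distiller") return false;
  return allowed === null || allowed.indexOf(agentId) !== -1;
}
```

With `OPENCLAW_JSONL_DISTILL_ALLOWED_AGENT_IDS="main,code-agent"`, only the `main` and `code-agent` sessions would be scanned under these rules.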

1) Create a dedicated agent

openclaw agents add memory-distiller \
  --non-interactive \
  --workspace ~/.openclaw/workspace-memory-distiller \
  --model openai-codex/gpt-5.2

2) Initialize cursor (Mode A: start from now)

This marks all existing JSONL files as "already read" by setting offsets to EOF.

# Set PLUGIN_DIR to where this plugin is installed.
# - If you cloned into your OpenClaw workspace (recommended):
#   PLUGIN_DIR="$HOME/.openclaw/workspace/plugins/memory-lancedb-pro"
# - Otherwise, check: `openclaw plugins info memory-lancedb-pro` and locate the directory.
PLUGIN_DIR="/path/to/memory-lancedb-pro"

python3 "$PLUGIN_DIR/scripts/jsonl_distill.py" init

3) Create an hourly cron job (Asia/Shanghai)

Tip: start the message with run ... so memory-lancedb-pro's adaptive retrieval will skip auto-recall injection (saves tokens).

# IMPORTANT: replace <PLUGIN_DIR> in the template below with your actual plugin path.
MSG=$(cat <<'EOF'
run jsonl memory distill

Goal: distill NEW chat content from OpenClaw session JSONL files into high-quality LanceDB memories using memory_store.

Hard rules:
- Incremental only: call the extractor script; do NOT scan full history.
- Store only reusable memories; skip routine chatter.
- English memory text + final line: Keywords (zh): ...
- < 500 chars, atomic.
- <= 3 memories per agent per run; <= 3 global per run.
- Scope: global for broadly reusable; otherwise agent:<agentId>.

Workflow:
1) exec: python3 <PLUGIN_DIR>/scripts/jsonl_distill.py run
2) If noop: stop.
3) Read batchFile (created/pending)
4) memory_store(...) for selected memories
5) exec: python3 <PLUGIN_DIR>/scripts/jsonl_distill.py commit --batch-file <batchFile>
EOF
)

openclaw cron add \
  --agent memory-distiller \
  --name "jsonl-memory-distill (hourly)" \
  --cron "0 * * * *" \
  --tz "Asia/Shanghai" \
  --session isolated \
  --wake now \
  --timeout-seconds 420 \
  --stagger 5m \
  --no-deliver \
  --message "$MSG"

4) Debug run

openclaw cron run <jobId> --expect-final --timeout 180000
openclaw cron runs --id <jobId> --limit 5

When distilling all agents, always set scope explicitly when calling memory_store:

  • Broadly reusable → scope=global
  • Agent-specific → scope=agent:<agentId>

This prevents cross-bot memory pollution.

Rollback

  • Disable/remove cron job: openclaw cron disable <jobId> / openclaw cron rm <jobId>
  • Delete agent: openclaw agents delete memory-distiller
  • Remove cursor state: rm -rf ~/.openclaw/state/jsonl-distill/

CLI Commands

# List memories (output includes the memory id)
openclaw memory-pro list [--scope global] [--category fact] [--limit 20] [--json]

# Search memories
openclaw memory-pro search "query" [--scope global] [--limit 10] [--json]

# View statistics
openclaw memory-pro stats [--scope global] [--json]

# Delete a memory by ID (supports 8+ char prefix)
# Tip: copy the id shown by `memory-pro list` / `memory-pro search` (or use --json for full output)
openclaw memory-pro delete <id>

# Bulk delete with filters
openclaw memory-pro delete-bulk --scope global [--before 2025-01-01] [--dry-run]

# Export / Import
openclaw memory-pro export [--scope global] [--output memories.json]
openclaw memory-pro import memories.json [--scope global] [--dry-run]

# Re-embed all entries with a new model
openclaw memory-pro reembed --source-db /path/to/old-db [--batch-size 32] [--skip-existing]

# Migrate from built-in memory-lancedb
openclaw memory-pro migrate check [--source /path]
openclaw memory-pro migrate run [--source /path] [--dry-run] [--skip-existing]
openclaw memory-pro migrate verify [--source /path]

Custom Commands (e.g. /lesson)

This plugin provides the core memory tools (memory_store, memory_recall, memory_forget, memory_update). You can define custom slash commands in your Agent's system prompt to create convenient shortcuts.

Example: /lesson command

Add this to your CLAUDE.md, AGENTS.md, or system prompt:

## /lesson command
When the user sends `/lesson <content>`:
1. Use memory_store to save as category=fact (the raw knowledge)
2. Use memory_store to save as category=decision (actionable takeaway)
3. Confirm what was saved

Example: /remember command

## /remember command
When the user sends `/remember <content>`:
1. Use memory_store to save with appropriate category and importance
2. Confirm with the stored memory ID

Built-in Tools Reference

| Tool | Description |
| --- | --- |
| memory_store | Store a memory (supports category, importance, scope) |
| memory_recall | Search memories (hybrid vector + BM25 retrieval) |
| memory_forget | Delete a memory by ID or search query |
| memory_update | Update an existing memory in-place |

Note: These tools are registered automatically when the plugin loads. Custom commands like /lesson are not built into the plugin; they are defined at the Agent/system-prompt level and simply call these tools.

Database Schema

LanceDB table memories:

| Field | Type | Description |
| --- | --- | --- |
| id | string (UUID) | Primary key |
| text | string | Memory text (FTS indexed) |
| vector | float[] | Embedding vector |
| category | string | preference / fact / decision / entity / other |
| scope | string | Scope identifier (e.g., global, agent:main) |
| importance | float | Importance score 0–1 |
| timestamp | int64 | Creation timestamp (ms) |
| metadata | string (JSON) | Extended metadata |

Troubleshooting

"Cannot mix BigInt and other types" (LanceDB / Apache Arrow)

On LanceDB 0.26+ (via Apache Arrow), some numeric columns may be returned as BigInt at runtime (commonly: timestamp, importance, _distance, _score). If you see errors like:

  • TypeError: Cannot mix BigInt and other types, use explicit conversions

upgrade to memory-lancedb-pro >= 1.0.14. This plugin now coerces these values using Number(...) before doing arithmetic (for example, when computing scores or sorting by timestamp).
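
For readers who hit the same error in their own code against LanceDB rows, the coercion pattern looks roughly like this. It is a sketch of the general technique, not the plugin's implementation; `toNumber`, `RawRow`, and `ageInDays` are hypothetical names.

```typescript
// Coerce a value that may arrive as bigint (e.g. from Apache Arrow) to number.
function toNumber(value: unknown, fallback: number = 0): number {
  if (typeof value === "bigint") return Number(value);
  if (typeof value === "number") return value;
  const parsed = Number(value);
  return isFinite(parsed) ? parsed : fallback;
}

interface RawRow {
  timestamp: unknown; // int64 column, may be bigint at runtime
  importance: unknown;
}

function ageInDays(row: RawRow, nowMs: number): number {
  // Without coercion, `nowMs - row.timestamp` throws
  // "Cannot mix BigInt and other types" when timestamp is a bigint.
  return (nowMs - toNumber(row.timestamp)) / 86400000;
}
```

Millisecond timestamps are far below 2^53, so converting them with `Number(...)` does not lose precision.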

Iron Rules for AI Agents

For OpenClaw users: copy the code block below into your AGENTS.md so your agent enforces these rules automatically.

## Rule 1: Dual-layer memory storage (iron rule)

Every pitfall/lesson learned → IMMEDIATELY store TWO memories to LanceDB before moving on:

- **Technical layer**: Pitfall: [symptom]. Cause: [root cause]. Fix: [solution]. Prevention: [how to avoid]
  (category: fact, importance ≥ 0.8)
- **Principle layer**: Decision principle ([tag]): [behavioral rule]. Trigger: [when it applies]. Action: [what to do]
  (category: decision, importance ≥ 0.85)
- After each store, immediately `memory_recall` with anchor keywords to verify retrieval.
  If not found, rewrite and re-store.
- Missing either layer = incomplete.
  Do NOT proceed to next topic until both are stored and verified.
- Also update relevant SKILL.md files to prevent recurrence.

## Rule 2: LanceDB hygiene

Entries must be short and atomic (< 500 chars). Never store raw conversation summaries, large blobs, or duplicates.
Prefer structured format with keywords for retrieval.

## Rule 3: Recall before retry

On ANY tool failure, repeated error, or unexpected behavior, ALWAYS `memory_recall` with relevant keywords
(error message, tool name, symptom) BEFORE retrying. LanceDB likely already has the fix.
Blind retries waste time and repeat known mistakes.

## Rule 4: Confirm the target codebase before editing

When working on memory plugins, confirm you are editing the intended package
(e.g., `memory-lancedb-pro` vs built-in `memory-lancedb`) before making changes;
use `memory_recall` + filesystem search to avoid patching the wrong repo.

## Rule 5: Plugin code changes require clearing the jiti cache (MANDATORY)

After modifying ANY `.ts` file under `plugins/`, MUST run `rm -rf /tmp/jiti/` BEFORE `openclaw gateway restart`.
jiti caches compiled TS; restart alone loads STALE code. This has caused silent bugs multiple times.
Config-only changes do NOT need cache clearing.

Dependencies

| Package | Purpose |
| --- | --- |
| @lancedb/lancedb ≥0.26.2 | Vector database (ANN + FTS) |
| openai ≥6.21.0 | OpenAI-compatible Embedding API client |
| @sinclair/typebox 0.34.48 | JSON Schema type definitions (tool parameters) |

Contributors

Top contributors (from GitHub's contributors list, sorted by commit contributions; bots excluded):

@win4r @kctony @Akatsuki-Ryu @AliceLJY @JasonSuz @Minidoracat @rwmjhb @furedericca-lab @joe2643 @chenjiyong

Full list: https://github.com/win4r/memory-lancedb-pro/graphs/contributors

โญ Star History

Star History Chart

License

MIT

Keywords

openclaw

Package last updated on 06 Mar 2026
