ULMA (Universal LLM Memory Architecture) Plugin
ULMA is the fundamental solution to the "Project Amnesia" problem in AI coding agents. It provides a strictly isolated, local-first memory architecture that guarantees coding efficiency and architectural consistency across long-term projects.
Unlike generic memory plugins that treat context as a flat list, ULMA understands project boundaries and session lifecycles, ensuring your agent never "hallucinates" patterns from other repos or forgets architectural decisions made weeks ago.
The Core Value: Solving Global Memory Fundamentally
Most AI agents suffer from Context Drift: as a project grows, they lose track of established patterns, or worse, bleed context from Project A into Project B. ULMA solves this at the architectural level:
1. Agent Efficiency & Zero-Shot Accuracy
- Problem: Agents often waste tokens re-learning your project structure in every session.
- Solution: With persistent, project-hashed memory, ULMA instantly provides the exact relevant context (types, utils, patterns) before the agent writes a single line. This drastically reduces "correction loops" and makes the agent "get it right the first time."
2. Architectural Consistency (The "Deep Memory")
- Problem: "Why did you use var here? We switched to const last week!" Agents forget past decisions.
- Solution: By retrieving historical context scoped strictly to the current project, the agent maintains consistent coding styles, variable naming conventions, and architectural patterns throughout the project's lifecycle—whether it's Day 1 or Day 100.
3. Absolute Project Isolation (No "Cross-Talk")
- Problem: Working on a Rust backend in the morning and a React frontend in the afternoon often confuses agents, leading to "hallucinated" imports or syntax.
- Solution: Hard directory hashing. Vectors for Project A are physically stored in .ulma/vectors/<hash_A>. Because each project's vectors live in their own hashed directory, a query against Project A can never touch Project B's context. This is not just a filter; it's a physical firewall for your context.
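The hashing scheme above can be sketched as a small pure function. This is a minimal illustration that assumes SHA-256 over the resolved project root; ULMA's actual digest, hash length, and directory layout may differ:

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Derive a stable, filesystem-safe ID from the absolute project root.
// Hypothetical sketch: the real plugin may use a different scheme.
export function projectVectorDir(projectRoot: string, baseDir = ".ulma/vectors"): string {
  const normalized = path.resolve(projectRoot); // same root => same hash
  const hash = createHash("sha256").update(normalized).digest("hex").slice(0, 16);
  return path.join(baseDir, hash); // e.g. .ulma/vectors/<hash_A>
}
```

Two different roots yield disjoint directories, so retrieval opened against Project A's directory cannot return Project B's vectors.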
Why ULMA: The Definitive Comparison
While plugins like opencode-supermemory or opencode-mem offer memory, they often rely on cloud services or loose tagging. ULMA is built for Enterprise-Grade Consistency.
| Dimension | ULMA | opencode-supermemory | opencode-mem |
| --- | --- | --- | --- |
| Global Memory Problem | Solved (Physical Isolation) | Mitigated (Tags/Filters) | Unsolved (Flat File) |
| Agent Consistency | High (Project-Scoped RAG) | Medium (Global Context Noise) | Low (Session Only) |
| Coding Efficiency | High (Precise Retrieval) | Medium (Network Latency) | Low (Limited Context) |
| Data Privacy | 100% Local (LanceDB) | Cloud Dependent | Local |
| Indexing Strategy | Incremental & Background | Full Re-upload | Manual / Sync |
Verdict: If you need an agent that "codes like a senior engineer" who remembers the project history without getting confused by other projects, ULMA is the only architectural solution.
Performance & Benchmarks (v1.2.0)
ULMA is built for speed and accuracy in real-world, medium-to-large codebases. Below are the results from our automated benchmark suite running on the Wasteland project (~122,000 lines of C#, JS, TS, Rust) on a standard developer machine.
| Metric | Result | Notes |
| --- | --- | --- |
| Indexing Speed | ~700 lines/s | Full index (122k lines) in ~173s |
| Retrieval Latency | 170 ms (avg) | P99 < 300 ms (hybrid retrieval) |
| Symbol Navigation | 70.3% recall | High accuracy for finding class/method definitions |
| Logic Query | 3.3% recall | Current limitation: semantic queries need future graph analysis |
Benchmark Environment: MacBook Pro M2, Node.js v18, LanceDB Local.
Key Improvements in v1.2:
- Tree-sitter Parsing: Replaced regex-based chunking with AST-based parsing for C#, Rust, Go, and JS/TS.
- Smart Context: Chunks now include file path and method context for better retrieval.
- Optimized Performance: Query pre-compilation reduced indexing time by 42%.
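The "Smart Context" improvement can be illustrated with a small sketch: each chunk is prefixed with its file path and enclosing symbol before embedding, so the retriever "knows" where the code lives. The header format and type names here are hypothetical, not ULMA's actual chunk format:

```typescript
interface CodeChunk {
  filePath: string;
  symbol: string | null; // enclosing class/method, if known
  body: string;
}

// Prepend lightweight context so the embedded text carries its location.
export function renderChunk(chunk: CodeChunk): string {
  const header = chunk.symbol
    ? `// file: ${chunk.filePath} | symbol: ${chunk.symbol}`
    : `// file: ${chunk.filePath}`;
  return `${header}\n${chunk.body}`;
}
```

A query mentioning a file or class name can then match the header even when the chunk body uses unrelated vocabulary.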
Future Roadmap: Advanced Semantic Understanding
While ULMA excels at structural symbol navigation (70%+ recall), Logic-Query recall remains low (~3.3%). This is a known semantic gap: a user asks "How is input handled?", but the code uses terms like ProcessPacket or OnKeyDown.
To bridge this gap and achieve "Senior Engineer" level understanding, the next phase of ULMA will focus on:
1. Code Property Graph (CPG)
We will move beyond flat vector retrieval to a graph-based approach. By constructing a lightweight CPG, ULMA will understand:
- Call Graphs: "Who calls ProcessPacket?"
- Data Flow: "Where does the input variable go?"
This allows the agent to trace logical connections that keyword search misses.
Initial CPG-lite signals are now integrated for retrieval boosting; full call/data-flow analysis remains pending.
We use the lite version to keep indexing latency and memory usage stable in real projects while the full graph pipeline is being validated.
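A "CPG-lite" call signal can be approximated without full AST analysis, e.g. by scanning chunk bodies for call sites of known symbols. This is a hypothetical sketch of the idea, not ULMA's actual pipeline; real CPG construction would walk the Tree-sitter AST, and this regex pass would break on symbol names containing regex metacharacters:

```typescript
// Build a naive caller -> callee map from pre-extracted symbols and bodies.
export function buildCallEdges(
  chunks: { symbol: string; body: string }[]
): Map<string, Set<string>> {
  const names = new Set(chunks.map((c) => c.symbol));
  const edges = new Map<string, Set<string>>();
  for (const chunk of chunks) {
    const callees = new Set<string>();
    for (const name of names) {
      // "name(" appearing in the body is treated as a call site.
      if (name !== chunk.symbol && new RegExp(`\\b${name}\\s*\\(`).test(chunk.body)) {
        callees.add(name);
      }
    }
    edges.set(chunk.symbol, callees);
  }
  return edges;
}
```

Answering "Who calls ProcessPacket?" then becomes a reverse lookup over this map, and matched neighbors can be boosted during retrieval.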
2. Query Expansion & HyDE
Using an LLM to generate hypothetical code snippets from a natural-language query (Hypothetical Document Embeddings), then embedding those snippets for retrieval.
- User: "Handle player input"
- Expansion: "Input.GetKey, OnMouseDown, EventSystem.current"
This expanded query significantly improves vector retrieval accuracy.
HyDE-lite expansion is now integrated into retrieval; deeper prompt tuning remains pending.
We keep HyDE-lite to control noise and cost, and only expand when retrieval benefit is measurable.
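The expansion step can be sketched as merging model-suggested identifiers into the retrieval query. Here the LLM call is stubbed out as a `suggest` callback, and the cap on expansion terms reflects the "control noise and cost" point above; both are illustrative assumptions:

```typescript
// HyDE-lite sketch: expand a natural-language query with hypothetical
// code identifiers before embedding. `suggest` stands in for an LLM call.
export function expandQuery(
  query: string,
  suggest: (q: string) => string[],
  maxTerms = 5
): string {
  const terms = suggest(query).slice(0, maxTerms); // cap expansion to limit noise
  return terms.length ? `${query}\n${terms.join(" ")}` : query;
}
```

For "Handle player input", a suggester returning `Input.GetKey`, `OnMouseDown`, and `EventSystem.current` yields a query whose embedding sits much closer to the actual handler code.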
3. Local-First Deep Embedding
Replacing generic embedding models with code-specialized models (e.g., fine-tuned BERT for code) that natively understand that handle input and ProcessPacket are semantically related in a game development context.
Core Features
L2 Storage Model
- Runtime: the plugin maintains in-memory task state for the active session
- Persistence:
  - Local JSON write-ahead log: .ulma/tasks_db.json (transactional layer)
  - "Dual-write" with vector tables (Tasks/Archive) for recoverability
- Integration: when integrating with the ulma-core service, L2 semantics are event-sourced (L2E/L2S) with the files tasks_events.jsonl (event log) and tasks.json (state view). The plugin aligns to core semantics without redefining them.
- Task tracking and synchronization
- Memory indexing and retrieval with LanceDB
- Project-level vector isolation via hashed project root
- Session-level retrieval filters for tasks and history
- Incremental background indexing
- Experimental CPG-lite graph signals for retrieval boosting
- Query expansion (HyDE-lite) for better semantic recall
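The event-sourced L2 idea, an append-only log plus a derived state view, can be sketched as follows. This mirrors the tasks_events.jsonl / tasks.json split described above, but the event shapes are hypothetical and the real transactional layer may differ:

```typescript
import * as fs from "node:fs";

type TaskEvent =
  | { type: "created"; id: string; title: string }
  | { type: "completed"; id: string };

// Append-only event log: one JSON object per line, never rewritten.
export function appendEvent(logPath: string, event: TaskEvent): void {
  fs.appendFileSync(logPath, JSON.stringify(event) + "\n");
}

// The state view is derived by folding the log from the beginning,
// so a crash mid-write loses at most the last partial line.
export function foldState(logPath: string): Map<string, { title: string; done: boolean }> {
  const state = new Map<string, { title: string; done: boolean }>();
  if (!fs.existsSync(logPath)) return state;
  for (const line of fs.readFileSync(logPath, "utf8").split("\n")) {
    if (!line.trim()) continue;
    const ev = JSON.parse(line) as TaskEvent;
    if (ev.type === "created") state.set(ev.id, { title: ev.title, done: false });
    else if (ev.type === "completed" && state.has(ev.id)) state.get(ev.id)!.done = true;
  }
  return state;
}
```

Because the log is the source of truth, the state view can always be rebuilt, which is what makes the dual-write with the vector tables recoverable.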
Requirements
- OpenCode CLI 1.1.37 (or later)
- Node.js 18+
Installation
Option A: Install from npm (Recommended)
- Ensure you have the OpenCode CLI installed.
- Edit your OpenCode configuration file (usually ~/.config/opencode/config.json or %USERPROFILE%\.config\opencode\config.json):
{
"$schema": "https://opencode.ai/config.json",
"plugin": [
"@alltheright121/ulma-plugin"
]
}
- Restart your OpenCode session. The plugin will be automatically downloaded and installed.
Option B: Install from GitHub (Local Path)
If you want to modify the plugin or use a local version:
- Clone this repository:
  git clone https://github.com/drpr/alex.git
  cd alex/ulma-plugin
- Install dependencies:
  npm install
- Install language parsers (important): ULMA uses Tree-sitter for advanced code analysis, so you must install the WASM parsers before running the plugin.
  node scripts/download_parsers.cjs
  This script will attempt to download parsers from a CDN or extract them from node_modules.
- Build the plugin (if applicable) or ensure plugin.mjs is ready.
Tree-sitter & WASM Configuration
ULMA relies on web-tree-sitter and language-specific WASM files (e.g., tree-sitter-c_sharp.wasm) for structural code understanding. These files are stored in the parsers/ directory.
Managing Parsers
The scripts/download_parsers.cjs script is the easiest way to manage these files. It reads configurations from src/languages/definitions/*.json and fetches the required WASM files.
If automatic download fails:
- Install the language package manually:
npm install tree-sitter-c-sharp
- Copy the .wasm file from node_modules/tree-sitter-c-sharp/ to parsers/tree-sitter-c_sharp.wasm.
Supported Languages
Currently configured languages (see src/languages/definitions/):
- C# (.cs): Supports class-based chunking and method symbol extraction.
- Rust (.rs): Supports impl/function chunking.
- Go (.go): Supports function/method chunking.
- JavaScript/TypeScript (.js, .ts): Basic support.
To add more languages, add a JSON definition file and ensure the WASM file is available.
- Finally, point your OpenCode config to the absolute path of the ulma-plugin directory:
{
"plugin": [
"/Users/yourname/path/to/alex/ulma-plugin"
]
}
Configuration
To customize indexing behavior, create a .ulma.json file in the root of the project you are working on (not the plugin directory).
{
"vectorDir": ".opencode/vectors",
"include": ["**/*.{js,ts,jsx,tsx,py,java,cs,go,rs,md,c,cpp,h}"],
"exclude": ["**/node_modules/**", "**/dist/**", "**/.git/**", "**/.vscode/**"],
"topK": 8,
"indexingDebounce": 1000
}
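Loading such a file typically means shallow-merging user values over defaults, so a partial .ulma.json only overrides the fields it names. The field names below mirror the example above, but the merge behavior and default values are assumptions about how the plugin works, not its actual implementation:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

interface UlmaConfig {
  vectorDir: string;
  include: string[];
  exclude: string[];
  topK: number;
  indexingDebounce: number;
}

// Assumed defaults, taken from the example config above.
const DEFAULTS: UlmaConfig = {
  vectorDir: ".opencode/vectors",
  include: ["**/*.{js,ts,jsx,tsx,py,java,cs,go,rs,md,c,cpp,h}"],
  exclude: ["**/node_modules/**", "**/dist/**", "**/.git/**", "**/.vscode/**"],
  topK: 8,
  indexingDebounce: 1000,
};

// Read .ulma.json from the project root, falling back to defaults for
// any missing field (and entirely when the file is absent).
export function loadConfig(projectRoot: string): UlmaConfig {
  const file = path.join(projectRoot, ".ulma.json");
  if (!fs.existsSync(file)) return { ...DEFAULTS };
  const user = JSON.parse(fs.readFileSync(file, "utf8")) as Partial<UlmaConfig>;
  return { ...DEFAULTS, ...user };
}
```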
Configuration fields
| Field | Description | Recommendation |
| --- | --- | --- |
| vectorDir | Vector storage directory | Keep the default |
| include | File patterns to index | Cover your primary languages |
| exclude | File patterns to ignore | Exclude build outputs and dependencies |
| topK | Max candidates per retrieval | 3–8 |
| indexingDebounce | Delay (ms) before background re-indexing after file changes | Keep the default (1000) |
Usage
- Install the plugin and complete configuration
- First run builds the initial index (if missing, the plugin will auto-create the codebase table)
- Retrieval is scoped by session and project automatically
- Switching projects requires no manual cleanup because isolation is enforced
Troubleshooting
v1.3.0 Sync Note
- Benchmarks relocated under benchmarks/ (taskset: benchmarks/wasteland-bench-v2.json, runner: benchmarks/ulma-bench-v2.cjs)
- Results are saved to benchmarks/results/ulma-bench-v2.csv
- Plugin default config added: .ulma-plugin.json (project-level overrides via .ulma_plugin.json)
- GitHub structure: keep only ulma-plugin/ (the legacy ulma/ directory is not required)
- Bun error overlay (macOS): use polling-based file watching instead of native fsevents (built in). If the overlay persists, restart OpenCode and retry.
- First retrieval returns "Table 'codebase' was not found": the plugin auto-detects and creates the codebase table; wait for the initial indexing to complete.
- Version alignment: ensure your host CLI and plugin interface align to @opencode-ai/plugin@1.1.37. Prefer installing via the npm package name to avoid loading stale local-path versions.
Isolation Model
- Project isolation: vectors live under .opencode/vectors/<projectId>, derived from a hash of the project root
- Session isolation: task and history retrieval are filtered by session ID
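The two isolation levels compose: retrieval first resolves a project-scoped table directory, then applies a session predicate to task/history queries. A sketch of both pieces follows; the directory layout, column name, and SQL-style filter syntax are assumptions, not ULMA's documented internals:

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Project isolation: one vector directory per hashed project root
// (hypothetical layout matching .opencode/vectors/<projectId>).
export function projectTableDir(projectRoot: string): string {
  const id = createHash("sha256")
    .update(path.resolve(projectRoot))
    .digest("hex")
    .slice(0, 16);
  return path.join(".opencode", "vectors", id);
}

// Session isolation: a SQL-style predicate for history/task retrieval.
// Single quotes are escaped so an ID cannot break out of the filter.
export function sessionFilter(sessionId: string): string {
  return `session_id = '${sessionId.replace(/'/g, "''")}'`;
}
```

Project isolation is structural (separate directories), while session isolation is a filter within a project's tables; crossing either boundary would require both lookups to agree.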
Landscape
ULMA is designed for developers who demand absolute context control.
Development
npm install
Publish
npm publish --access public
License
MIT