prism-mcp-server

Prism Coder: Cognitive memory + tool-calling intelligence for AI agents. Mind Palace persistent memory (BFCL Gold Certified, 100% Tool-Call Accuracy, 54 Agent Skills, Zero-Search HDC/HRR retrieval, HRR Semantic Drift Detection across BCBA/Coding/AAC doma

Version 17.0.1 (latest) · Source: npm · Weekly downloads: 1.7K (147.81%) · Maintainers: 1

🧠 Prism Coder

🌐 Read in your language: English · Español · Français · Português · Română · Українська · Русский · Deutsch · 日本語 · 한국어 · 中文 · العربية

Persistent memory + tool-calling intelligence for AI agents. (formerly Prism MCP)

A Model Context Protocol server that gives Claude, Cursor, and other AI tools a Mind Palace: long-term memory that survives across sessions, with semantic search, cognitive routing, a visual dashboard, and the prism-coder:1b7 / prism-coder:8b / prism-coder:14b / prism-coder:32b LLM fleet for offline tool-calling.

npm · VS Marketplace · Website · MCP Registry · Smithery · License: AGPL-3.0

Renamed in v14.0.0: the project is now Prism Coder to cover both the Mind Palace memory server and the prism-coder:1b7 / prism-coder:8b / prism-coder:14b / prism-coder:32b LLM fleet on HuggingFace + Ollama. The npm package stays prism-mcp-server so existing install URLs and mcp.json entries keep working; the prism-coder binary has been the canonical entry point since v12.

What Prism Coder does

💾 Your AI remembers across sessions

Every conversation feeds the Mind Palace. Next session, your AI agent loads the right context automatically, with no re-explaining.

🔍 Semantic search over your history

Ask "what did I decide about the auth flow last month?" and get the answer with citations. Vector search + keyword + graph traversal.

🧬 Cognitive routing

Different memory types live in different stores: episodic (what happened), semantic (what's true), procedural (how to do X). The router picks where to store and where to retrieve.

🔄 Proactive session drift detection (new in v15)

Your AI agent can now detect when it has drifted from your original goals (mid-session, automatically) and self-correct before you notice the problem.

Three direct Prism calls:

  • session_save_ledger: snapshot current state
  • session_cognitive_route: compare current work against original goals; returns on_track / minor_drift / major_drift
  • session_compact_ledger: if drifted, compress and reload only what matters

When major drift is detected, the alert routes to the Synalux portal so it's visible across sessions and devices, not just in the current conversation.

Real example it caught: A training session promised BFCL ≥90% for three AI models. The agent spent 3 hours debugging audio bugs instead. The drift check surfaced: "Training goal unmet. Layer3 corpus missing from all training sets. 0 BFCL scores measured." The session immediately re-aligned.

No scripts. No cron. No hooks. Three tool calls, Prism handles the rest.
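The three-call sequence above can be sketched as a small client-side helper. This is a minimal illustration, not Prism's implementation: the `call_tool` transport, the argument names (`project`, `summary`), and the `status` key in the response are all assumptions. Only the tool names and the three status values come from the text.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: any callable that sends (tool_name, arguments) to the
# Prism MCP server and returns the tool result as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def check_drift(call_tool: ToolCaller, project: str, summary: str) -> str:
    """Run the three-call drift check and return the reported status."""
    # 1. Snapshot the current state (argument names are assumptions).
    call_tool("session_save_ledger", {"project": project, "summary": summary})

    # 2. Compare current work against the original goals.
    result = call_tool("session_cognitive_route", {"project": project})
    status = result.get("status", "on_track")  # the "status" key is assumed

    # 3. Compact only when the session has drifted.
    if status in ("minor_drift", "major_drift"):
        call_tool("session_compact_ledger", {"project": project})
    return status


# Usage with a stand-in transport that records calls instead of contacting a server.
calls: List[str] = []


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(name)
    if name == "session_cognitive_route":
        return {"status": "major_drift"}
    return {}


status = check_drift(fake_call_tool, "demo-project", "refactored auth flow")
```

Passing the transport in as a parameter keeps the helper independent of any particular MCP client library.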

🛡 Local-first: security + speed

Free tier runs entirely on your machine: SQLite, local embedding model, no API keys, no cloud. Paid tier adds cloud sync via Synalux portal.

Why local models matter:

| | Cloud LLM | Local prism-coder |
|---|---|---|
| Tool-call latency | 200ms–3s | ~1.6s (1.7B) / ~1.1s (14B) |
| API key required | Yes | No |
| Data sent externally | Every prompt | Nothing |
| Works offline | ❌ | ✅ |
| Cost at scale | $0.002–0.06/call | $0 |
| HIPAA | Requires BAA | On-prem = no BAA |

Install in one command, with no config, no keys, and no vendor agreements:

ollama pull dcostenco/prism-coder:14b   # 9 GB   · default router · Mac M2+ / iPad Pro
ollama pull dcostenco/prism-coder:4b    # 2.5 GB · verifier · iPhone 15/16 Pro
ollama pull dcostenco/prism-coder:1b7   # 2.2 GB · ultra-low RAM / Apple Watch
ollama pull dcostenco/prism-coder:32b   # 19 GB  · complex tasks · Mac M2 Ultra+
ollama pull dcostenco/prism-coder:8b    # 4.7 GB · balanced · iPhone/iPad 8GB

Prism MCP detects both the namespaced (dcostenco/prism-coder:14b) and bare (prism-coder:14b) Ollama tag forms automatically, so there is nothing else to configure. If you want the bare tags as aliases for direct ollama run prism-coder:14b use, run:

prism register-models           # aliases */prism-coder:* → prism-coder:* via `ollama cp`
prism register-models --dry-run # preview what would be aliased

Cascade architecture

Three-tier local cascade with cloud fallback:

Query arrives
  │
  ▼
prism-coder:14b ── routes (100% eval_300) ──▶  serve  (~3s, 9GB, FREE)
  │                                              │
  │                                    knowledge_search (RAG context)
  │                                              │
  ▼                                              ▼
prism-coder:4b ── verifies claims ──────────▶  grounded response
  │                 (2.5GB, <1s)
  │
  ▼  (complex tasks only, explicit ceiling="32b")
prism-coder:32b ── deep reasoning ──────────▶  serve  (~8s, 19GB, FREE)
  │
  ▼  (cloud fallback when local insufficient)
Claude Sonnet 4 → Claude Opus 4.7 ─────────▶  serve  (cloud, ~$0.01/req)

| Tier | Model | Role | RAM | Latency | Cost |
|---|---|---|---|---|---|
| Default | prism-coder:14b | Router + general inference | 9 GB | ~3s | $0 |
| Verifier | prism-coder:4b | Grounding claims check | 2.5 GB | <1s | $0 |
| Complex | prism-coder:32b | Deep reasoning (on-demand) | 19 GB | ~8s | $0 |
| Cloud | Sonnet → Opus | Fallback for max quality | - | ~5-10s | ~$0.01 |

Mobile / offline cascade (Prism AAC iOS):

prism-coder:14b (iPad Pro 16GB) → prism-coder:4b (iPhone 8GB)
  → prism-coder:1.7b (any device, always fits)
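The mobile cascade above amounts to picking the largest model that fits the device. A minimal sketch, assuming the RAM figures in the diagram (16 GB and 8 GB) act as simple thresholds; the function name is hypothetical and the real selection logic may consider more than total RAM.

```python
def pick_mobile_model(device_ram_gb: float) -> str:
    """Return the largest tier the cascade associates with this much device RAM."""
    if device_ram_gb >= 16:       # iPad Pro 16GB
        return "prism-coder:14b"
    if device_ram_gb >= 8:        # iPhone 8GB
        return "prism-coder:4b"
    return "prism-coder:1.7b"     # any device, always fits


for ram in (16, 8, 4):
    print(ram, "GB ->", pick_mobile_model(ram))
```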

Knowledge ingestion: teach Prism your codebase

Your code knowledge lives in the knowledge graph, not in model weights. Routing stays at 100%.

bash scripts/knowledge-ingest/setup.sh   # one-time setup
# Then every git commit auto-indexes changed files into the knowledge graph

Three entry points:

  • MCP tool: knowledge_ingest (the AI says "learn this code")
  • GitHub webhook: POST /api/github/webhook (automatic on push)
  • REST API: POST /api/v1/prism/ingest (open interface)

See KNOWLEDGE_INGESTION.md for full setup guide.

Routing accuracy

Head-to-head: prism-coder:14b vs Claude Opus (25-case benchmark, production system prompt, May 2026):

| Metric | prism-coder:14b | Claude Opus 4 |
|---|---|---|
| Overall accuracy | 96% (24/25) | 88% (22/25) |
| Tool routing (15 tests) | 93% (14/15) | 80% (12/15) |
| Abstention (10 tests) | 100% (10/10) | 100% (10/10) |
| Avg latency | 0.8s | 5.5s |
| Cost per query | $0 | ~$0.017 |
| Annual @ 1K/day | $0 | ~$6,100 |

prism-coder:14b beats Opus on tool routing while running about 7x faster, free, and offline.
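The annual cost and speed figures can be checked with a little arithmetic. The sketch below uses only the rounded values from the table. With the rounded $0.017 per query it lands near $6,200 per year, close to the ~$6,100 quoted (the gap is presumably rounding in the per-query cost, which is an assumption), and the latency ratio comes out just under 7x.

```python
# Figures taken from the comparison table; all are approximate.
OPUS_COST_PER_QUERY = 0.017   # USD
QUERIES_PER_DAY = 1_000
OPUS_LATENCY_S = 5.5
LOCAL_LATENCY_S = 0.8

annual_cost = OPUS_COST_PER_QUERY * QUERIES_PER_DAY * 365
speedup = OPUS_LATENCY_S / LOCAL_LATENCY_S

print(f"annual cloud cost ~ ${annual_cost:,.0f}")   # annual cloud cost ~ $6,205
print(f"latency ratio ~ {speedup:.1f}x")            # latency ratio ~ 6.9x
```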

eval_300 (300 cases, 17 tools + NO_TOOL, 9 categories, 3-seed validated):

| Model | eval_300 strict | Size | Latency |
|---|---|---|---|
| prism-coder:32b | 300/300 (100%) | 19 GB | ~1.4s |
| prism-coder:14b | 299/300 (99.7%) | 9 GB | ~0.8s |
| prism-coder:4b | 300/300 (100%) | 2.5 GB | ~0.5s |
| prism-coder:1.7b | 300/300 (100%) | 2.2 GB | ~1.6s |

Categories: abstention, adversarial traps, cascade, disambiguation, edge cases, multi-intent, natural phrasing, parameter extraction, verifier prompts.

What this means: a child in a hospital without WiFi, a nonverbal adult on an airplane, or a family on a budget gets Claude-grade routing accuracy with zero cloud dependency. The AAC path routes correctly 100% of the time across all tiers.

What it does NOT mean: these scores measure routing precision on a 17-tool taxonomy, not general intelligence. Claude outperforms on everything outside this task. The value is offline reliability at zero cost, not replacing Claude. Code and clinical knowledge come from RAG via knowledge_search.

🔍 L3 Grounding Verifier

When prism_infer receives an evidence payload, the grounding verifier automatically checks the model's response against the provided evidence before returning to the caller. Unverified or hallucinated claims are flagged. This is the third layer (L3) of the cascade, after tool routing (L1) and confidence gating (L2).

🧠 HRR Semantic Drift Detection (v17.0)

Detects when long AI agent sessions drift from their original goal, using Holographic Reduced Representations for temporal trajectory encoding and anomaly detection.

Three domains, one detector:

| Domain | Signals | Safety |
|---|---|---|
| BCBA/Clinical | Client specificity decay, function-intervention alignment (4 functions), contraindication detection (epilepsy/pica/dysphagia/diabetes) | PHI-safe, deterministic |
| Coding | File scope entropy, summary vagueness, test coverage ratio, trajectory HRR divergence | Adaptive threshold for refactors |
| AAC | Prediction accuracy, vocabulary stagnation, topic divergence | Emergency phrases always ≥ 0.95 |

Research-backed: trajectory association (Frady et al. 2018), HDAD anomaly detection (Wang et al. 2021), unit-modulus projection (Ganesan et al. NeurIPS 2021). 306 tests across 8 files, zero failures. Use session_detect_drift with optional domain parameter.

⚑ Zero-search retrieval (new in v15.8)

Holographic Reduced Representations (HRR) via Rust WASM for instant memory retrieval without a database query.

Three adaptive strategies:

  • GloVe embeddings (offline, 50K words) β€” 87% Top-1 accuracy, stable at 200+ concepts
  • API embeddings (Gemini/Voyage) β€” 90%+ accuracy when online
  • NeurIPS 2021 projection β€” unit-modulus normalization for numerical stability

Retrieval cascade: HRR (~0.2ms) β†’ FTS5 (~50ms) β†’ Supabase (~200ms)

MetricHRR (WASM)FTS5Supabase Vector
Latency0.2ms50ms200ms
Speedup1x250x slower1000x slower
OfflineYesYesNo
Accuracy (GloVe)87% Top-195%+95%+
Hologram size8KBIndex variesCloud

HRR acts as Tier 0 β€” if confidence is high, FTS5 is skipped entirely. Falls through gracefully when HRR has no match. 97 dedicated tests (72 system + 25 API/client). Built with Rust + rustfft + wasm-bindgen (229KB binary).
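The tiered fall-through described above can be sketched as a loop over lookups ordered from fastest to slowest. This is an illustration only: the `min_confidence` value, the `(answer, confidence)` return shape, and applying the same threshold to every tier are assumptions, and the stand-in tiers below are toy dictionaries rather than real HRR, FTS5, or Supabase clients.

```python
from typing import Callable, List, Optional, Tuple

# Each tier takes a query and returns (answer, confidence), or None on a miss.
Tier = Callable[[str], Optional[Tuple[str, float]]]


def retrieve(query: str, tiers: List[Tuple[str, Tier]],
             min_confidence: float = 0.8) -> Optional[Tuple[str, str]]:
    """Try tiers in order; return (tier_name, answer) from the first confident hit."""
    for name, lookup in tiers:
        hit = lookup(query)
        if hit is not None and hit[1] >= min_confidence:
            return name, hit[0]   # later (slower) tiers are skipped
    return None                   # every tier missed


# Toy stand-ins for the three tiers.
hrr_index = {"auth flow": ("answer A", 0.93), "db schema": ("answer B", 0.40)}
fts5_index = {"db schema": ("answer B", 0.90)}
tiers = [
    ("hrr", hrr_index.get),
    ("fts5", fts5_index.get),
    ("supabase", lambda q: None),
]

print(retrieve("auth flow", tiers))   # confident HRR hit, FTS5 never queried
print(retrieve("db schema", tiers))   # low-confidence HRR hit falls through to FTS5
```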

HRR AAC prediction benchmark: real-world impact on Prism AAC word prediction (10 scenarios, 54 integration tests):

| Scenario | Baseline Top-1 | +HRR Top-1 | Top-1 Lift | MRR Lift |
|---|---|---|---|---|
| Core AAC phrases | 36.7% | 46.7% | +27.3% | +6.0% |
| Personal vocabulary | 70.4% | 81.5% | +15.8% | +9.2% |
| Mixed (all phrases) | 47.2% | 56.9% | +20.6% | +5.7% |
| Cross-session recall | 80.0% | 80.0% | +0.0% | +0.0% |

Top-1 = correct word is tile #1. MRR = Mean Reciprocal Rank. Zero Top-5 regressions in any scenario. HRR encodes bigrams + trigrams from every spoken phrase; probes take ~0.2ms, which is safe on every keystroke. All Synalux apps (clinical, AAC, PrismCoach) share HRR via the portal /api/v1/hrr endpoint.
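To make the bigram idea concrete, here is a minimal sketch of textbook HRR binding (circular convolution, after Plate) in pure Python. It is not Prism's Rust/WASM implementation: the dimension, the vocabulary, and the word pairs are invented for illustration, and the unit-modulus projection mentioned earlier is omitted. Several word pairs are superposed into one fixed-size trace, and probing with a cue word recovers its partner without any search over stored pairs.

```python
import math
import random


def random_vector(n: int, rng: random.Random) -> list:
    """Random vector with i.i.d. N(0, 1/n) elements, as in classic HRR."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]


def bind(a: list, b: list) -> list:
    """Circular convolution: out[k] = sum_j a[j] * b[(k - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def involution(a: list) -> list:
    """Approximate inverse used for unbinding: a*[j] = a[-j mod n]."""
    n = len(a)
    return [a[(-j) % n] for j in range(n)]


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


rng = random.Random(0)
N = 256  # illustrative dimension
vocab = {w: random_vector(N, rng) for w in ["i", "want", "more", "water", "help", "please"]}

# Superpose three word pairs into one trace. Plain circular convolution is
# commutative, so each pair is stored unordered; the pairs here share no words.
pairs = [("i", "want"), ("more", "water"), ("help", "please")]
memory = [0.0] * N
for cue, partner in pairs:
    memory = [m + x for m, x in zip(memory, bind(vocab[cue], vocab[partner]))]


def predict_next(word: str) -> str:
    """Unbind the trace with the cue word and return the closest vocabulary item."""
    noisy = bind(involution(vocab[word]), memory)
    return max(vocab, key=lambda w: cosine(noisy, vocab[w]))


print(predict_next("i"), predict_next("more"), predict_next("help"))
```

The O(n²) convolution keeps the sketch dependency-free; an FFT-based version, as the rustfft dependency suggests, computes the same result in O(n log n).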

Competitive comparison:

| System | Retrieval | Offline | Cost | Latency |
|---|---|---|---|---|
| Prism Coder | HRR + FTS5 + Supabase cascade | Yes | $0 | 0.2ms |
| Mem0 | Vector DB (Qdrant/Pinecone) | No | $249/mo | ~100ms |
| Zep | Vector DB + temporal graph | No | $99/mo | ~80ms |
| Hermes (NousResearch) | HRR + SQLite | Yes | Free | ~5ms |

🌐 Multi-agent Hivemind

Multiple AI agents share the same Mind Palace. Each agent has a role (dev / qa / pm / etc.) and sees scoped context. Heartbeat + roster for coordination.

Get started

# Install globally
npm install -g prism-mcp-server

# Or use npx (no install)
npx prism-mcp-server

Add to Claude Desktop / Cursor config:

{
  "mcpServers": {
    "prism": {
      "command": "npx",
      "args": ["-y", "prism-mcp-server"]
    }
  }
}

That's it. Open Claude / Cursor and your AI now has memory.

More setup details in docs/SETUP_GEMINI.md.

Monitoring & Observability (new in v16.2)

Built-in Datadog integration: every tool call is logged with tool name, project, and latency. Zero config for self-hosted users (logs to stdout); set DD_API_KEY to send structured logs to Datadog HTTP intake.

# Enable Datadog logging (optional)
export DD_API_KEY=your_datadog_api_key

# Enable OpenTelemetry tracing (optional; works with Jaeger, Zipkin, Datadog, Grafana Tempo)
export PRISM_OTEL_ENABLED=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

What's tracked automatically:

  • mcp.tool.success: tool name, project, duration (ms) on every successful call
  • mcp.tool.error: tool name, error message, stack trace on failures
  • OpenTelemetry spans with tool.name and project attributes on all 50 tool handlers

| Dashboard | What it tracks |
|---|---|
| Prism MCP - Server Analytics | Tool call volume, latency per tool (avg/p95), errors by tool, project activity, knowledge search/ingest, session memory ops |

In-app analytics for paid users (new in v16.2)

Paid Synalux subscribers get a built-in analytics dashboard at /app/memory-analytics:

┌──────────────────────────────────────────────────────────┐
│  Analytics                              [standard] plan  │
├──────────────────────────────────────────────────────────┤
│  📝 Sessions: 147  🔄 Handoffs: 23  📚 Knowledge: 89     │
│  📁 Projects: 5    💾 Memory: 42 KB                      │
├──────────────────────────────────────────────────────────┤
│  Today's Usage    🧠 47/200  🔎 12/50  💬 85/200         │
├──────────────────────────────────────────────────────────┤
│  30-Day Trend     ▂▃▅▇▆▄▃▅▆▇█▇▅▃▂▃▅▆▇▅▃▂▁▂▃▅▇▆▅▃         │
├──────────────────────────────────────────────────────────┤
│  Top Projects     prism-mcp (45) · portal (32) · ...     │
│  Compaction       3 entries > 5KB: run compact_ledger    │
└──────────────────────────────────────────────────────────┘
  • Free tier: paywall with upgrade CTA
  • Standard+: session counts, handoffs, knowledge entries, daily quotas with tier limits, 30-day activity trend, project breakdown, compaction candidates

How AI agents use it

| Tool | What it does |
|---|---|
| session_load_context | Recover prior session's state on boot |
| session_save_ledger | Append immutable session log entry |
| session_save_handoff | Save live state for the next session |
| knowledge_search | Semantic + keyword search over all memories |
| query_memory_natural | Natural-language Q&A over your Mind Palace |
| extract_entities | Pull people / projects / decisions from text |
| session_synthesize_edges | Auto-link related memories into a graph |

(35+ tools total; full TypeScript signatures in src/tools/. Architecture overview in docs/ARCHITECTURE.md.)

🔄 How Prism handles context compaction and context loss

The LLM context window is treated as ephemeral scratch space. All durable state lives in Prism's persistent store (SQLite / Supabase). Context compaction is a non-event.

Boot protocol: every session (including post-compaction) begins with a mandatory session_load_context call, enforced via CLAUDE.md. The agent is fully oriented before writing a single byte of response.

Two persistent stores:

  • session_save_ledger: immutable append-only work log (decisions, files changed, summaries)
  • session_save_handoff: versioned live-state snapshot (current task, TODOs, open context)

Ledger compaction (session_compact_ledger): when a project exceeds a threshold (default: 50 entries), Prism summarizes old entries via LLM into a rollup row, soft-archives originals, and links them via spawned_from graph edges. Runs on a 12-hour background scheduler.
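The compaction rule can be illustrated with an in-memory sketch. Everything beyond the 50-entry default is an assumption made for the example: the `Entry` fields, the `keep_recent` parameter, and the `summarize` callable standing in for the LLM call. The actual handler works against SQLite or Supabase and is documented in docs/COMPACTION.md.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Entry:
    id: int
    summary: str
    archived: bool = False                                  # soft-archive flag; rows are kept
    spawned_from: List[int] = field(default_factory=list)   # ids a rollup row replaces


def compact(entries: List[Entry], summarize: Callable[[List[str]], str],
            threshold: int = 50, keep_recent: int = 10) -> List[Entry]:
    """Roll old entries into one summary row once active entries exceed threshold."""
    active = [e for e in entries if not e.archived]
    if len(active) <= threshold:
        return entries                                # under the threshold: no-op
    old = active[:len(active) - keep_recent]          # keep_recent is an assumed knob
    rollup = Entry(
        id=max(e.id for e in entries) + 1,
        summary=summarize([e.summary for e in old]),  # stand-in for the LLM summary
        spawned_from=[e.id for e in old],
    )
    for e in old:
        e.archived = True                             # originals stay, flagged archived
    return entries + [rollup]


ledger = [Entry(id=i, summary=f"entry {i}") for i in range(1, 61)]
ledger = compact(ledger, summarize=lambda texts: f"rollup of {len(texts)} entries")
print(ledger[-1].summary, "| archived:", sum(e.archived for e in ledger))
```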

→ Full details: docs/COMPACTION.md

Models

Prism Coder inference cascades through fine-tuned models first, with Claude as a quality-gate fallback. Models route through the Synalux router (authentication + subscription required). Cascade: Cloud (OpenRouter) → Ollama local → Claude fallback.

| Model | Ollama tag | Where | Tier | Latency |
|---|---|---|---|---|
| prism-coder:1.7b | prism-coder:1b7 (v42) | On-device (Mac/local) · iOS via llama.cpp | Free | ~1.6s |
| prism-coder:8b | prism-coder:8b (v36) | On-device iPhone/iPad 8GB+ · local Mac | Free | ~0.8s |
| prism-coder:14b | prism-coder:14b (v36) | On-device Mac 24GB+ · iPad Pro · Cloud A100 | Standard+ | ~1.1s |
| prism-coder:32b | prism-coder:32b (v7 MoE) | Cloud (OpenRouter) A100 80GB via Synalux | Pro/Enterprise | ~0.8s |

Models use the Synalux SFT corpus (AAC + Prism MCP tool taxonomy + clinical workflows). Internal quality gate: ≥ 90% on the Prism 102-case eval before production promotion.

Training note: Base Qwen3 models are strong tool-routers out of the box. Heavy fine-tuning regresses tool-vs-plain-text decisions; light-touch polish recipes (small corpus, balanced tool/plain-text split) are the published path. Production adapter selection and retrain methodology are managed in the Synalux portal.

Per-category breakdown, Prism 102-case eval (3-seed mean, v36/v7 system prompt, May 2026):

| Model | Overall | Load ctx | Save | Srch mem | Handoff | Compact | Know srch | AAC | Translate | No-tool | Info | Edge | Avg lat | Inv |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| prism-coder:32b v7 | 100.0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 0.8s | 0 |
| prism-coder:8b v36 | 100.0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 0.8s | 0 |
| prism-coder:14b v36 | 100.0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 1.1s | 0 |
| Claude Opus 4.7 | 98.3% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 83% | 3.0s | 0 |
| prism-coder:1.7b v42 | 100.0% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 1.6s | 0 |

Methodology: 102-case pool across 12 categories. Scores are 3-seed mean (seeds 2027/2028/2029, zero variance across all seeds). All fine-tuned models use the Qwen3 nothink template with keyword-trigger routing prompts and -> respond directly (no tool) for the no-tool class. Full runner: tests/benchmarks/prism-routing-100/benchmark.py · Cascade runner: tests/benchmarks/cascade-14b-32b-opus/cascade_eval.py.

These are NOT general-purpose LLM benchmarks. This eval measures routing precision on 6 specific MCP tools. The prism-coder models are specialists trained on this exact task: they match or exceed Claude on routing while Claude dominates on general reasoning, coding, and open-domain QA. The value is offline reliability at zero cost, not replacing cloud AI.

iOS deployment: On-device inference via llama.cpp Swift SPM. Auto-selects by device RAM: 14B on iPad Pro 16GB (100% routing), 8B on iPhone/iPad 8GB (100%, OOM fallback to 1.7B at 100%). CoreML is not viable: coremltools doesn't support Qwen3 attention ops. Integration: LLMEngine.swift → prismNativeBridge.askAI() → token stream. WiFi fallback: Mac Ollama (OLLAMA_HOST=0.0.0.0).

Benchmarks: run them yourself

All benchmarks are open-source. Reproduce every number in this README:

git clone https://github.com/dcostenco/prism-coder
cd prism-coder
pip install anthropic requests

# Per-model solo eval (102 cases, 3 seeds)
python3 tests/benchmarks/prism-routing-100/benchmark.py --models 14b 8b 32b 1b7 opus

# Cascade eval: 14B → 32B → Opus (Claude Opus as the reference)
export ANTHROPIC_API_KEY=sk-ant-...
ollama pull dcostenco/prism-coder:14b   # `ollama pull` takes one model per invocation
ollama pull dcostenco/prism-coder:32b
python3 tests/benchmarks/cascade-14b-32b-opus/cascade_eval.py

Not a general function-calling benchmark. This measures routing precision on 6 specific MCP tools. We don't claim to beat Claude on general capabilities. We match or exceed Claude on the ONE task that matters for offline AAC: correct tool routing, every time, under 2 seconds, with zero cloud.

| Benchmark | Source | What it measures |
|---|---|---|
| Per-model BFCL | tests/benchmarks/prism-routing-100/ | Solo accuracy per model, 12 categories |
| Cascade vs Opus | tests/benchmarks/cascade-14b-32b-opus/ | Tier distribution, Opus engagement rate, cascade accuracy |
| LoCoMo-Plus (Cognitive) | dcostenco/Locomo-Plus | Long-context dialogue coherence and historical memory retention |

Cognitive Dialogue Memory (LoCoMo-Plus Benchmark)

LoCoMo-Plus is a long-context, multi-day dialogue benchmark designed to test an AI agent's memory retention, context awareness, and ability to coherently reference historical dialogue evidence.

The Cognitive subset (401 multi-day dialogue scenarios) was evaluated head-to-head comparing raw baseline models against the Prism-MCP framework (using local SQLite semantic memory). Graded by a neutral gemini-2.5-flash model acting as judge (scoring on coherence, continuity, and fact accuracy):

| Configuration | Samples | Total Score | Average Score | Absolute Delta | Relative Error Reduction |
|---|---|---|---|---|---|
| Gemini-2.5-flash (Baseline) | 401 | 278.0 / 401 | 69.33% | - | - |
| Prism-MCP (Gemini-2.5-flash + Memory) | 401 | 361.0 / 401 | 90.02% | +20.69pp | 67.5% |
| Gemini-3.1-pro-preview (Baseline) | 401 | 272.0 / 401 | 67.83% | - | - |
| Prism-MCP (Gemini-3.1-pro + Memory) | 401 | 382.0 / 401 | 95.26% | +27.43pp | 85.3% |
| Gemini-3.5-flash (Baseline) | 401 | 237.0 / 401 | 59.10% | - | - |
| Prism-MCP (Gemini-3.5-flash + Memory) | 401 | 388.0 / 401 | 96.76% | +37.66pp | 92.1% |
| Claude Sonnet 4.6 (Baseline) | 401 | 290.0 / 401 | 72.32% | - | - |
| Prism-MCP (Claude Sonnet 4.6 + Memory) | 401 | 357.0 / 401 | 89.03% | +16.71pp | 60.4% |

Key Takeaways:

  • Pure attention limits: Even the strongest frontier model tested (Claude Sonnet 4.6 at 72.32%) misses over a quarter of cognitive memory cues without external memory. Gemini 3.5 Flash baseline sits at 59.10%. Both suffer from attention dilution when parsing massive multi-day transcripts directly in active context.
  • Prism lifts every model: Prism-MCP yields large gains regardless of base model, from +16.71pp (Claude) to +37.66pp (Gemini 3.5 Flash). Even Claude's stronger native recall benefits from structured retrieval, jumping from 72.32% to 89.03%.
  • Best overall: Prism-MCP + Gemini 3.5 Flash achieves the highest score (96.76%), eliminating 92.1% of baseline errors. This makes the cheapest model + Prism more accurate than the most expensive model alone.
  • Claude vs Gemini (raw): Claude Sonnet 4.6 outperforms every Gemini baseline (+13.22pp over Flash 3.5, +4.49pp over Pro 3.1, +2.99pp over Flash 2.5), confirming stronger native long-context recall.
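The delta and error-reduction columns can be reproduced from the total scores. The sketch below assumes that "relative error reduction" means the share of the baseline's missed points that the memory-augmented run recovers, and that the absolute delta is taken between the rounded percentages; both readings are inferred from the numbers, which they match, rather than stated in the text.

```python
def locomo_row(baseline_total: float, prism_total: float, samples: int = 401):
    """Return (baseline %, prism %, delta in pp, relative error reduction %)."""
    base_pct = round(100 * baseline_total / samples, 2)
    prism_pct = round(100 * prism_total / samples, 2)
    delta_pp = round(prism_pct - base_pct, 2)
    # Points recovered divided by points the baseline missed.
    error_reduction = round(
        100 * (prism_total - baseline_total) / (samples - baseline_total), 1
    )
    return base_pct, prism_pct, delta_pp, error_reduction


print(locomo_row(278, 361))   # Gemini-2.5-flash:  (69.33, 90.02, 20.69, 67.5)
print(locomo_row(290, 357))   # Claude Sonnet 4.6: (72.32, 89.03, 16.71, 60.4)
```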

🔍 View Test Case Schema & Sample

A representative test sample from the unified_cognitive_only.json (GitHub source) dataset contains a multi-turn chat history with a memory "needle" placed days prior, followed by a cued dialogue prompt:

{
  "category": "Cognitive",
  "input_prompt": "Caroline said, \"...\"\nMelanie said, \"...\"",
  "trigger": "Melanie said, \"Hey, Caroline! Nice to hear from you! Love the necklace, any special meaning to it?\"",
  "evidence": "Swedish grandmother's necklace was gifted to Caroline",
  "answer": "Yes, this necklace was a gift from my grandmother in my home country, Sweden."
}

When evaluated:

  • Baseline models without memory frequently output a generic guess (e.g., "Thanks, it was a gift from a friend") or fail to reference the Sweden/grandmother relationship.
  • Prism-MCP automatically embeds the prior turns, stores them in SQLite, and when cued, retrieves the precise "Swedish grandmother" evidence turn via semantic vectors to inject it into active context.

💻 View How to Reproduce Publicly (Test Source & Guide)

To run and review the evaluation suite on your local setup using the benchmark runner scripts (evaluate_qa.py and llm_as_judge.py):

# 1. Clone the LoCoMo-Plus evaluation codebase
git clone https://github.com/dcostenco/Locomo-Plus /tmp/Locomo-Plus
cd /tmp/Locomo-Plus

# 2. Run Baseline Gemini 3.1 Pro Evaluation (concurrency 5)
export GOOGLE_API_KEY="your-api-key"
PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/evaluate_qa.py \
  --data-file data/unified_cognitive_only.json \
  --out-file output/gemini_3.1_pro_pred.json \
  --model gemini-3.1-pro-preview \
  --backend call_gemini \
  --concurrency 5

# 3. Run Prism-MCP powered by Gemini 3.1 Pro Evaluation (concurrency 1 to guard SQLite locks)
export PRISM_TEXT_MODEL=gemini-3.1-pro-preview
PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/evaluate_qa.py \
  --data-file data/unified_cognitive_only.json \
  --out-file output/prism_gemini_3.1_pro_pred.json \
  --model gemini-3.1-pro-preview \
  --backend call_prism \
  --concurrency 1

# 4. Run Claude Sonnet 4.6 Baseline Evaluation (concurrency 3, rate-limit safe)
export ANTHROPIC_API_KEY="your-api-key"
PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/evaluate_qa.py \
  --data-file data/unified_cognitive_only.json \
  --out-file output/claude_sonnet46_pred.json \
  --model claude-sonnet-4-6 \
  --backend call_claude \
  --concurrency 3

# 5. Grade results using the LLM-as-a-Judge script
PYTHONPATH=/tmp/Locomo-Plus python3 evaluation_framework/task_eval/llm_as_judge.py \
  --input-file output/prism_gemini_3.1_pro_pred.json \
  --out-file output/prism_gemini_3.1_pro_judged.json \
  --model gemini-2.5-flash \
  --backend call_gemini \
  --concurrency 5 \
  --summary-file output/prism_gemini_3.1_pro_summary.json

Models on HuggingFace

| Model | HuggingFace | Solo BFCL | Cascade role | Size |
|---|---|---|---|---|
| prism-coder:32b | dcostenco/prism-coder-32b | 100.0% routing (v7 MoE) | Tier 2 (catches ~1% 14B misses) | 16 GB |
| prism-coder:8b | dcostenco/prism-coder-8b | 100.0% routing (v36) | Mobile tier | 4.7 GB |
| prism-coder:14b | dcostenco/prism-coder-14b | 100.0% routing (v36) | Tier 1 (serves ~99% of traffic) | 8.4 GB |
| prism-coder:1.7b | dcostenco/prism-coder-1.7b | 100.0% routing (v42) | On-device / always-fits fallback | 1.1 GB |
| prism-ide:14b | dcostenco/prism-ide | 22/22 TypeScript eval (v1) | Code generation tier 1 (~1.1s) | 8.4 GB |
| prism-ide:32b | dcostenco/prism-ide | Complex code + multi-file (v3) | Code generation tier 2 (~0.8s MoE) | 16 GB |

Self-hosted / Local AI (Enterprise)

Run the full Prism model stack on your own hardware: zero cloud dependency, no network round trips, full data sovereignty.

Requirements: Mac M2 Pro+ (48GB recommended) or Linux with NVIDIA GPU · Ollama

# On-device tier: 1.1 GB (any machine, iPhone), 100% routing
ollama pull dcostenco/prism-coder:1b7

# Mobile tier: 4.7 GB (iPhone/iPad 8GB, Mac M1+), 100% routing
ollama pull dcostenco/prism-coder:8b

# Standard tier: 8.4 GB (Mac 24GB+, iPad Pro 16GB), 100% routing
ollama pull dcostenco/prism-coder:14b

# Reasoning tier: 16 GB (Mac M2 Ultra+, 30B-A3B MoE), 100% routing
ollama pull dcostenco/prism-coder:32b

Set LOCAL_LLM_URL=http://localhost:11434 in your portal config. Routing is automatic:

Desktop/server: 14B → 32B → Claude Opus fallback · Mobile/offline: 14B → 8B → 1.7B

iOS/mobile on same WiFi: OLLAMA_HOST=0.0.0.0 ollama serve on the Mac, then point LOCAL_LLM_URL at the Mac's IP.
Routing accuracy (May 2026, v36/v7 system prompt, 3-seed mean): 32B v7 = 100.0% · 8B v36 = 100.0% · 14B v36 = 100.0% · 1.7B v42 = 100.0%
Cascade (14B→32B): 100.0% · Opus solo: 98.3% · Opus engaged: 0% of requests → Full results

Plans

| Plan | Cloud model | Daily limit | On-device |
|---|---|---|---|
| Free | - | unlimited local | prism-coder:1.7b (100%) + 8b (100%) + 14b (100%) |
| Standard $19/mo | Claude Sonnet 4 | 200 req | + cloud fallback |
| Pro $49/mo | prism-coder:32b | 2,000 req | + reasoning tier |
| Enterprise $99/mo | prism-coder:32b priority | unlimited | + HIPAA BAA + custom fine-tuning |

All on-device models are free for every tier; no subscription is needed for local inference. Offline translation (1,261 phrases × 20 languages) is included in all plans.

Subscribe →

What you can build with it

  • Persistent coding assistant that remembers your codebase, your decisions, your team's conventions
  • Research agent that builds knowledge over time: the Auto-Scholar pipeline ingests papers / docs and synthesizes
  • Clinical scribe that retains patient context across visits (HIPAA-compliant cloud + local)
  • Customer support agent that learns from every ticket
  • Writing assistant that knows your voice, your prior drafts, and what you've already published

Companions

🌐 Website & Docs

synalux.ai/prism-mcp (full documentation, dashboard, subscription plans, and model downloads)

💻 Web IDE: Synalux Coder

Use Prism Coder directly in your browser, no install required. Local-first IDE with the prism-coder agent built in. Connects to GitHub repos, Synalux Mail, Drive, and Source for cross-product workflows.

synalux.ai/coder · also reachable at synalux.ai/prism-ide

| Feature | Detail |
|---|---|
| Agent | prism-coder:7b offline · Claude Sonnet 4 (Standard+) · Claude Opus 4 (Enterprise) |
| Integrations | GitHub repos, Synalux Mail, Drive, Source (same OAuth, no separate accounts) |
| Compliance | Audit log on every turn · PHI redaction · air-gapped offline mode (HIPAA) |

🧩 VS Code Extension: Synalux

Memory-augmented AI inside VS Code, powered by Prism. 20 multimodal tools, multi-agent orchestration, 12-language support. Works offline (Ollama) or cloud (OpenRouter). HIPAA-compliant healthcare workflows.

VS Marketplace

# Install from terminal
code --install-extension synalux-ai.synalux

Or open VS Code → Extensions (⇧⌘X) → search "Synalux" → Install.

📦 npm / npx

# Run without installing (always latest version)
npx prism-mcp-server

# Or install globally
npm install -g prism-mcp-server
prism load my-project

Package: prism-mcp-server on npm

PrismAAC

AAC communication app for non-speaking users. Powered by Prism's spreading-activation phrase ranking + on-device 7B model. macOS / iOS / Android via web. → github.com/dcostenco/prism-aac

🆕 Prism as Foundation (v14.0.0)

As of v14.0.0, Prism's algorithm exports are a stable public contract under SemVer. External systems can port actrActivation.ts (ACT-R cognitive decay), spreadingActivation.ts (the 0.7 similarity + 0.3 activation hybrid score), routerExperience.ts (experience bias with MIN_SAMPLES=5 cold-start gate), compactionHandler.ts (the 25KB prompt-budget cap), and graphMetrics.ts (warning ratios) with citations and pin a Prism version.
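As a rough illustration of two of these constants, the sketch below combines the textbook ACT-R base-level activation formula with the 0.7 / 0.3 hybrid weighting. It is not a port of actrActivation.ts or spreadingActivation.ts: using d=0.25 as the decay exponent, measuring ages in seconds, and squashing activation into (0, 1) with a logistic function are all assumptions made so the example is self-contained.

```python
import math
from typing import Sequence


def base_level_activation(ages_seconds: Sequence[float], d: float = 0.25) -> float:
    """Textbook ACT-R base-level learning: B = ln(sum_j t_j ** -d), with t_j > 0."""
    return math.log(sum(t ** -d for t in ages_seconds))


def squash(x: float) -> float:
    """Logistic squashing into (0, 1). An assumption for this sketch, not Prism's code."""
    return 1.0 / (1.0 + math.exp(-x))


def hybrid_score(similarity: float, activation: float) -> float:
    """Weighted blend named in the text: 0.7 * similarity + 0.3 * activation."""
    return 0.7 * similarity + 0.3 * activation


# A memory touched one minute and one hour ago versus one touched once, 30 days ago.
recent = base_level_activation([60, 3600])
stale = base_level_activation([30 * 86400])

# With equal semantic similarity, the recently used memory ranks higher.
score_recent = hybrid_score(0.8, squash(recent))
score_stale = hybrid_score(0.8, squash(stale))
print(round(score_recent, 3), round(score_stale, 3))
```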

Reference consumers

| Consumer | What it uses from Prism |
|---|---|
| Audit hooks framework | ACT-R decay (d=0.25 lesson rate), spreading activation hybrid score (0.7/0.3), experience bias (MIN_SAMPLES=5, MAX_BIAS_CAP=0.15), graph-metrics warning ratios (0.20 / 0.30 / 0.40), compaction's 25KB prompt-budget. 327 tests pin every constant; CI catches divergence automatically. |
| PrismAAC | Spreading-activation phrase ranking (recency × frequency × per-user history). Caregiver corrections auto-harvest into the personalization corpus via the audit-hooks postflight harvester. The on-device 7B model + this algorithm stack is what makes PrismAAC defensible. |
| Synalux portal | Tier-aware model routing using experience bias on prior outcomes per fingerprint. HIPAA-compliant clinical scribe with on-device-first privacy guarantees. |

CLI Reference

Prism Coder includes a CLI for session management, code review, and sync operations.

prism load <project>          # Load session context (same as session_load_context MCP tool)
prism save                    # Save session state (ledger + handoff)
prism ledger <project>        # Save a session log entry (same as session_save_ledger)
prism handoff <project>       # Update live project state for next session
prism push                    # Push local SQLite data to Supabase cloud
prism sync                    # Cross-backend data synchronization
prism search <query>          # Search code across repos (exact, regex, symbol, semantic)
prism review <files...>       # AI code review: security, performance, style
prism scan <files...>         # Security scan: secrets, licenses, Dockerfile
prism dora                    # Show DORA metrics for current project
prism scm                     # Source control, AI review, security scanning
prism verify                  # Manage the verification harness
prism status                  # Check verification state and config drift
prism generate                # Bless current rubric as canonical
prism register-models         # Alias dcostenco/prism-coder:* → prism-coder:*

Testing

npm test                           # 2,418 test cases across 81 files (vitest)
npm test -- --coverage             # coverage report
python3 tests/benchmarks/prism-routing-100/benchmark.py --models 1b7 14b 32b

Pinned in CI. 327 tests enforce every constant: ACT-R decay d=0.25, spreading-activation hybrid score 0.7/0.3, experience bias MIN_SAMPLES=5 / MAX_BIAS_CAP=0.15, graph-metrics warning ratios 0.20 / 0.30 / 0.40, compaction's 25KB prompt-budget. CI catches divergence automatically.
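
To make the pinned constants concrete, here is a minimal sketch of how they could plug into scoring. It assumes the standard ACT-R base-level equation (the natural log of the summed `age^-d` terms) and assumes the 0.7/0.3 split weights a similarity score against an activation score; the source does not say which term gets which weight. All function names are hypothetical and are not taken from the Prism codebase.

```typescript
// Constants as quoted in this README; everything else is an illustrative assumption.
const ACT_R_DECAY = 0.25; // d in the ACT-R base-level equation
const HYBRID_WEIGHTS = { similarity: 0.7, activation: 0.3 }; // assumed assignment
const MIN_SAMPLES = 5; // below this sample count, no experience bias is applied
const MAX_BIAS_CAP = 0.15; // bias magnitude is clamped to this value

// Standard ACT-R base-level activation: ln(sum of age^-d over past accesses).
// An empty history yields -Infinity (never accessed).
function baseLevelActivation(agesInSeconds: number[], d: number = ACT_R_DECAY): number {
  const sum = agesInSeconds.reduce(
    (acc, age) => acc + Math.pow(Math.max(age, 1e-9), -d),
    0,
  );
  return Math.log(sum);
}

// Weighted blend of two scores, both assumed to be normalized to [0, 1].
function hybridScore(similarity: number, activation: number): number {
  return HYBRID_WEIGHTS.similarity * similarity + HYBRID_WEIGHTS.activation * activation;
}

// Hypothetical experience bias: ignored until enough samples exist, then clamped.
function experienceBias(rawBias: number, samples: number): number {
  if (samples < MIN_SAMPLES) return 0;
  return Math.max(-MAX_BIAS_CAP, Math.min(MAX_BIAS_CAP, rawBias));
}
```

Keeping the constants as named values in one place is what would let a test suite pin them: a test can import the constant and fail if it drifts.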

Coverage areas:

  • HRR zero-search retrieval (97 tests: 3 embedding strategies, edge cases, persistence, adaptive cascade, API client, chat integration)
  • Knowledge ingestion (32 tests: chunker, Q&A gen, webhook, security, storage round-trip)
  • Prism infer cascade (110 tests: tier selection, cloud fallback, grounding verifier)
  • Compaction handler (rollup creation, concurrency guard, LLM failure)
  • Model picker (20 tests: 14b default ceiling, 4b verifier, RAM gating)
  • Storage round-trip (12 architectural guard tests preventing bypass)
  • BCBA skill integration
  • Deep storage tier
  • Dashboard rendering
  • Routing benchmarks (eval_300: 300 cases, 17 tools)

Migration

Local SQLite → Synalux portal

If you've been running Prism on the free tier and want to move historical session data into the paid-tier portal, use the migration script:

# dry run first: prints what would be migrated, hits no network
node scripts/migrate-local-to-portal.mjs --dry-run

# real run: pushes ledger + handoff entries through POST /api/v1/prism/memory
PRISM_SYNALUX_API_KEY=synalux_sk_... \
  node scripts/migrate-local-to-portal.mjs

# scope to one project
node scripts/migrate-local-to-portal.mjs --project=my-project

# include scholar entries (excluded by default; usually large + low-value)
node scripts/migrate-local-to-portal.mjs --include-scholar

What it does: reads ~/.prism-mcp/data.db via @libsql/client (already a runtime dep, so no extra install), exchanges the refresh token for a JWT (cached + auto-refreshed before expiry), and POSTs each ledger entry and handoff to the portal. Failures are logged with the source row id; successes are counted at the end.

Credentials: PRISM_SYNALUX_API_KEY from env. If unset, the script also checks ~/prism/.env for PRISM_SYNALUX_API_KEY=... as a convenience for dev workflows.
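
The lookup order described above (environment first, then a dotenv-style file) can be sketched as a pure function. This is an illustration only: `resolveApiKey` is a hypothetical name, the dotenv parsing is deliberately simplified, and the caller is assumed to have already read the file text (for example from ~/prism/.env) or to pass `null` if the file is missing.

```typescript
// Hypothetical helper: resolve the API key from the environment first,
// then from the text of a dotenv-style file.
function resolveApiKey(
  env: Record<string, string | undefined>,
  dotenvText: string | null,
): string | null {
  const fromEnv = env["PRISM_SYNALUX_API_KEY"];
  if (fromEnv && fromEnv.trim() !== "") return fromEnv.trim();
  if (dotenvText === null) return null;

  for (const rawLine of dotenvText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    if (line.slice(0, eq).trim() !== "PRISM_SYNALUX_API_KEY") continue;
    // Strip optional matching quotes around the value.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    return value === "" ? null : value;
  }
  return null;
}
```

Taking the environment and file text as arguments keeps the function free of filesystem access, which makes the precedence rule easy to test.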

Idempotency: handoffs are written with the portal's CRDT merge (last-write-wins per project+role); ledger entries are append-only and de-duped server-side by (project, conversation_id, summary). Re-running on the same DB is safe.
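
The two idempotency rules can be illustrated with in-memory maps. This is a sketch of the stated semantics, not the portal's server code: the type names, the `updatedAt` timestamp used to order writes, and both function names are assumptions introduced here.

```typescript
interface LedgerEntry { project: string; conversationId: string; summary: string; }
interface Handoff { project: string; role: string; updatedAt: number; state: string; }

// Append-only ledger, de-duplicated on (project, conversation_id, summary).
// Returns true if the entry was stored, false if it was a duplicate.
function appendLedger(store: Map<string, LedgerEntry>, entry: LedgerEntry): boolean {
  // Serializing a tuple avoids key collisions that a plain separator could cause.
  const key = JSON.stringify([entry.project, entry.conversationId, entry.summary]);
  if (store.has(key)) return false; // re-running the migration is a no-op here
  store.set(key, entry);
  return true;
}

// Last-write-wins merge keyed by project + role.
function mergeHandoff(store: Map<string, Handoff>, incoming: Handoff): void {
  const key = JSON.stringify([incoming.project, incoming.role]);
  const existing = store.get(key);
  if (!existing || incoming.updatedAt >= existing.updatedAt) store.set(key, incoming);
}
```

Under these rules a second run over the same database adds no ledger rows and leaves each handoff at its newest state, which is why a re-run is safe.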

One-shot only: this script is a migration tool, not a sync daemon. Once you've moved, set PRISM_STORAGE=synalux (or leave it on auto and let the resolver pick synalux when credentials are present) and the MCP server writes directly to the portal going forward.
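
The `auto` behaviour can be pictured as a small resolver. This sketch makes two assumptions that the text above does not confirm: that an unset PRISM_STORAGE behaves like `auto`, and that `auto` falls back to a backend labelled `"local"` when no credentials are present. `resolveStorage` is a hypothetical name.

```typescript
// Hypothetical resolver for the storage backend.
// An explicit PRISM_STORAGE value is returned unchanged; "auto" (or unset, by
// assumption) picks "synalux" only when an API key is available.
function resolveStorage(env: Record<string, string | undefined>): string {
  const mode = (env["PRISM_STORAGE"] || "auto").toLowerCase();
  if (mode !== "auto") return mode;
  return env["PRISM_SYNALUX_API_KEY"] ? "synalux" : "local"; // "local" is assumed
}
```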

Production Infrastructure

Architecture

  CLIENTS
  ┌──────────────────────┐  ┌─────────────────────────────┐
  │  prism-aac (iOS/web) │  │  Claude Code · Cursor · IDE │
  │  Vercel              │  │  MCP config → Railway URL   │
  └──────────┬───────────┘  └─────────────┬───────────────┘
             │ inference                  │ memory
             ▼                            ▼
  ┌──────────────────────┐  ┌─────────────────────────────┐
  │  SYNALUX ROUTER      │  │  prism-mcp SERVER           │
  │  Vercel              │  │                             │
  │  • JWT auth          │  │  Primary   - Railway        │
  │  • tier enforcement  │  │  Standby   - Fly.io         │
  │  • complexity route  │  │  Fallback  - Supabase REST  │
  │  • proxy to cloud    │  │  auto-failover chain        │
  └──────────┬───────────┘  └─────────────┬───────────────┘
             │                            │
             ▼                            ▼
  ┌──────────────────────────────┐  ┌─────────────────────────────┐
  │  OPENROUTER / LOCAL          │  │  SUPABASE                   │
  │                              │  │  session ledgers            │
  │  Cloud: Claude Sonnet 4      │  │  knowledge graph            │
  │  Routing: prism-coder        │  │  handoffs & todos           │
  │   :32b(100%) :14b(100%)      │  │                             │
  │   :8b(100%)  :1b7(100%)      │  │  source of truth            │
  │  Code:    prism-ide          │  │                             │
  │   :14b · :32b                │  │                             │
  └──────────────────────────────┘  └─────────────────────────────┘
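
The auto-failover chain in the diagram (Primary, then Standby, then Fallback) follows a common pattern: try each backend in order and return the first success. The sketch below shows that pattern in isolation. `Backend` and `withFailover` are hypothetical names, and the real server may add health checks, timeouts, or retries that are not described in this README.

```typescript
interface Backend<T> {
  name: string;
  call: () => Promise<T>;
}

// Try each backend in order and return the first successful result,
// together with the name of the backend that served it.
async function withFailover<T>(
  backends: Backend<T>[],
): Promise<{ served: string; value: T }> {
  const errors: string[] = [];
  for (const backend of backends) {
    try {
      return { served: backend.name, value: await backend.call() };
    } catch (err) {
      // Record the failure and move on to the next backend in the chain.
      errors.push(`${backend.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all backends failed (${errors.join("; ")})`);
}
```

A caller would pass the backends in priority order, for example Railway, Fly.io, then Supabase REST, each wrapped in a `call` function.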

Service Routing

LLM Backends

| Surface | Primary | Fallback | Local |
| --- | --- | --- | --- |
| AI Chat (free) | Gemini 2.5 Flash (direct API) | Claude Haiku 3.5 | prism-coder:14b via Ollama |
| AI Chat (paid) | Claude Sonnet 4 (OpenRouter) | Claude Haiku 3.5 | prism-coder:14b via Ollama |
| Prism Coder (tool-calling) | Claude Haiku 3.5 (OpenRouter) | - | prism-coder:14b via Ollama |
| Prism AAC | Local prism-coder:14b | Gemini 2.5 Flash / Claude | prism-coder:8b / :1b7 |

Web Search

| Surface | Primary | Fallback |
| --- | --- | --- |
| AI Chat @search | Firecrawl | - |
| Prism MCP agents (cloud) | Firecrawl | - |
| Prism MCP server (local) | Firecrawl (via MCP tools) | - |
| Clinical research | PubMed + ERIC + Semantic Scholar | DuckDuckGo |

TTS (Text-to-Speech)

| Tier | Engine | Offline |
| --- | --- | --- |
| 1 | Inworld TTS-2 (cloud) | - |
| 1.5 | Kokoro-82M neural (WASM) | en/es/fr/pt/ja/zh |
| 2 | OS Web Speech API | all |
| 3 | WASM espeak-ng | all |
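
One way to read the TTS tiers is as an ordered fallback list. The sketch below assumes that tiers are tried in ascending order, that the cloud engine is used whenever the device is online, and that the Offline column lists the languages each engine can speak without a network. None of this selection logic is confirmed by the README, and `pickTtsEngine` is a hypothetical name.

```typescript
interface TtsEngine {
  tier: number;
  name: string;
  needsNetwork: boolean;
  offlineLangs: "all" | string[];
}

// Engines in tier order, transcribed from the table.
const TTS_ENGINES: TtsEngine[] = [
  { tier: 1, name: "Inworld TTS-2", needsNetwork: true, offlineLangs: [] },
  { tier: 1.5, name: "Kokoro-82M", needsNetwork: false, offlineLangs: ["en", "es", "fr", "pt", "ja", "zh"] },
  { tier: 2, name: "OS Web Speech API", needsNetwork: false, offlineLangs: "all" },
  { tier: 3, name: "espeak-ng (WASM)", needsNetwork: false, offlineLangs: "all" },
];

// Return the first engine in tier order that can serve the language right now.
function pickTtsEngine(lang: string, online: boolean): TtsEngine {
  for (const engine of TTS_ENGINES) {
    if (engine.needsNetwork) {
      if (online) return engine;
      continue;
    }
    if (engine.offlineLangs === "all" || engine.offlineLangs.indexOf(lang) !== -1) {
      return engine;
    }
  }
  throw new Error("no TTS engine available");
}
```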

Other Services

| Service | Provider | Purpose |
| --- | --- | --- |
| Payments | Stripe | Subscriptions, checkout |
| Email | Resend | Transactional (invites, shares) |
| Video | LiveKit | Telehealth, case conferences |
| SMS | Twilio | Emergency alerts, caregiver notifications |
| Translation | Offline dictionary (1,261 × 20 langs) | AAC, Watch |

Synalux Inference Router

All Prism AAC model inference sits behind Synalux as a mandatory router. Models are never accessible directly; all traffic goes through Synalux for auth, billing, and rate limiting.

┌──────────────────────────────────────────────────────────┐
│  CLIENT LAYER                                            │
│  prism-aac (iOS/web)         │   Synalux Portal          │
└──────────────┬───────────────────────────────────────────┘
               │ POST /api/v1/prism-aac/inference
               │ Authorization: Bearer <user-JWT>
               ▼
┌──────────────────────────────────────────────────────────┐
│  SYNALUX ROUTER                                          │
│  1. Verify JWT (no anonymous access)                     │
│  2. Check subscription tier                              │
│  3. Enforce rate limit (per-tier daily cap)              │
│  4. Route to model tier by complexity                    │
│  5. Proxy → OpenRouter / Gemini (key never exposed)      │
│  6. Log → aac_inference_log (audit trail)                │
└──────────┬───────────────────────────────┬───────────────┘
           │                               │
           ▼                               ▼
  ┌────────────────────┐      ┌──────────────────────┐
  │  LOCAL (Ollama)    │      │  CLOUD (OpenRouter)  │
  │  prism-coder:14b   │      │  Claude Sonnet 4     │
  │  prism-coder:8b    │      │  Claude Haiku 3.5    │
  │  prism-coder:1b7   │      │  Gemini 2.5 Flash    │
  │  free, offline     │      │  paid tiers          │
  └────────────────────┘      └──────────────────────┘
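
The six router steps can be sketched as a single function with its dependencies injected. This is an illustration of the order of checks only: the type names, the dependency interface, and the 401/429 status codes are assumptions, and step 5 (proxying to the provider with a server-side key) is left out.

```typescript
type Plan = "free" | "standard" | "pro" | "enterprise";
interface User { id: string; plan: Plan; }

// Hypothetical dependencies; a real router would back these with JWT
// verification, a usage store, and the aac_inference_log table.
interface RouterDeps {
  verifyJwt: (token: string) => User | null;               // step 1
  usedToday: (userId: string) => number;                   // step 3 input
  dailyCap: (plan: Plan) => number;                        // steps 2 and 3
  pickModel: (plan: Plan, complexity: number) => string;   // step 4
  log: (entry: { userId: string; model: string }) => void; // step 6
}

type RouteResult =
  | { status: 200; model: string }
  | { status: 401 | 429; error: string };

function routeInference(
  token: string | undefined,
  complexity: number,
  deps: RouterDeps,
): RouteResult {
  const user = token ? deps.verifyJwt(token) : null;
  if (!user) return { status: 401, error: "missing or invalid JWT" };
  if (deps.usedToday(user.id) >= deps.dailyCap(user.plan)) {
    return { status: 429, error: "daily cap reached" };
  }
  const model = deps.pickModel(user.plan, complexity);
  // Step 5 (proxy to the provider) would happen here.
  deps.log({ userId: user.id, model });
  return { status: 200, model };
}
```

Rejecting before any model is chosen means that an unauthenticated or over-quota request never reaches a provider and never writes an inference log entry.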

On-device (free, offline):
  prism-coder:1b7 GGUF Q4_K_M (1.1 GB) → any Apple device
  prism-coder:8b  GGUF Q4_K_M (4.7 GB) → iPhone/iPad 8 GB+
  prism-coder:14b GGUF Q4_K_M (8.4 GB) → Mac/iPad Pro 16 GB+

HuggingFace: dcostenco/prism-coder-{14b,8b,32b,1.7b} (public GGUF weights)
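
The on-device list maps naturally to a RAM-gated lookup. The sketch below picks the largest model whose stated RAM floor fits the device. The sizes and floors are transcribed from the list above; treating "any Apple device" as a floor of 0 GB is an assumption, and `pickOnDeviceModel` is a hypothetical name rather than the project's model picker.

```typescript
interface OnDeviceModel {
  tag: string;
  sizeGb: number;
  minRamGb: number;
}

// Largest first, so the first match is the most capable model that fits.
const ON_DEVICE_MODELS: OnDeviceModel[] = [
  { tag: "prism-coder:14b", sizeGb: 8.4, minRamGb: 16 },
  { tag: "prism-coder:8b", sizeGb: 4.7, minRamGb: 8 },
  { tag: "prism-coder:1b7", sizeGb: 1.1, minRamGb: 0 }, // "any Apple device"
];

function pickOnDeviceModel(deviceRamGb: number): OnDeviceModel {
  for (const model of ON_DEVICE_MODELS) {
    if (deviceRamGb >= model.minRamGb) return model;
  }
  throw new Error("unreachable: the smallest model has no RAM floor");
}
```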

| Plan | Cloud model | Daily limit | On-device |
| --- | --- | --- | --- |
| Free | - | unlimited local | prism-coder:1.7b (100%) + 8b (100%) + 14b (100%) |
| Standard $19/mo | Claude Sonnet 4 | 200 req | + cloud fallback |
| Pro $49/mo | prism-coder:32b | 2,000 req | + reasoning tier |
| Enterprise $99/mo | prism-coder:32b priority | unlimited | + HIPAA BAA + custom fine-tuning |

All on-device models are free for every tier; no subscription is needed for local inference. Offline translation (1,261 phrases × 20 languages) is included in all plans.
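
The daily limits in the plan table can be expressed as a lookup. In this sketch the Free plan is given a cloud cap of 0 because the table lists no cloud model for it; that reading is an assumption, as are the names `DAILY_CLOUD_CAP` and `cloudRequestsLeft`. Local inference is not counted at all, matching the note that on-device models are free on every tier.

```typescript
type PlanName = "free" | "standard" | "pro" | "enterprise";

// Cloud requests per day, transcribed from the plan table (free = 0 is assumed).
const DAILY_CLOUD_CAP: Record<PlanName, number> = {
  free: 0,
  standard: 200,
  pro: 2000,
  enterprise: Infinity,
};

// Remaining cloud requests for today; never negative.
function cloudRequestsLeft(plan: PlanName, usedToday: number): number {
  return Math.max(0, DAILY_CLOUD_CAP[plan] - usedToday);
}
```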


See docs/WOW_FEATURES.md for the algorithm catalogue. Release notes in docs/releases/v14.0.0-prism-as-foundation.md.

📚 Architecture, cognitive systems, and full feature catalog

Detailed docs in this repo:

The original 1933-line README is preserved in git history. To browse the prior version (full feature catalog, Cognitive Architecture v7.8, Autonomous Cognitive OS v9.0, HRR Zero-Search, Adversarial Evaluation walkthroughs, Universal Import patterns, competitive analysis vs LangMem/MemGPT/Letta/Zep, v12.5 Unified Billing details, v11.6 Hivemind, v11.5.1 Auto-Scholar): git show HEAD~1:README.md.

License

AGPL-3.0. Open source. Same license as Prism AAC. Commercial use via Synalux subscription for hosted/managed deployment.

Keywords

mcp

Package last updated on 03 Jun 2026
