@ruvector/rvf-solver

RVF self-learning temporal solver — Thompson Sampling, PolicyKernel, ReasoningBank

Version: 0.1.8 (latest)
Source: npm
Maintainers: 1

Self-learning temporal solver with Thompson Sampling, PolicyKernel, ReasoningBank, and SHAKE-256 tamper-evident witness chains. Runs in the browser, Node.js, and edge runtimes via WebAssembly.

Install

npm install @ruvector/rvf-solver

Or via the unified SDK:

npm install @ruvector/rvf

Features

  • Thompson Sampling two-signal model — safety Beta distribution + cost EMA for adaptive policy selection
  • 18 context-bucketed bandits — 3 range x 3 distractor x 2 noise levels for fine-grained context awareness
  • KnowledgeCompiler with signature-based pattern cache — distills learned patterns into reusable compiled configurations
  • Speculative dual-path execution — runs two candidate arms in parallel, picks the winner
  • Three-loop adaptive solver — fast: constraint propagation solve, medium: PolicyKernel skip-mode selection, slow: KnowledgeCompiler pattern distillation
  • SHAKE-256 tamper-evident witness chain — 73 bytes per entry, cryptographically linked proof of all operations
  • Full acceptance test with A/B/C ablation modes — validates learned policy outperforms fixed and compiler baselines
  • ~160 KB WASM binary, no_std — runs anywhere WebAssembly does (browsers, Node.js, Deno, Cloudflare Workers, edge runtimes)
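The 18 context buckets follow from the 3 x 3 x 2 product in the feature list. As a minimal sketch of how such a product can be flattened into a single bucket index: the level encodings (0, 1, 2) and the `bucketIndex` helper below are illustrative assumptions, not the package's internal layout.

```typescript
// Level counts come from the feature list; the numeric encodings are assumed.
type RangeLevel = 0 | 1 | 2;
type DistractorLevel = 0 | 1 | 2;
type NoiseLevel = 0 | 1;

const RANGE_LEVELS = 3;
const DISTRACTOR_LEVELS = 3;
const NOISE_LEVELS = 2;
const BUCKET_COUNT = RANGE_LEVELS * DISTRACTOR_LEVELS * NOISE_LEVELS; // 18

// Hypothetical helper: flatten (range, distractor, noise) into an index in [0, 18).
function bucketIndex(range: RangeLevel, distractor: DistractorLevel, noise: NoiseLevel): number {
  return (range * DISTRACTOR_LEVELS + distractor) * NOISE_LEVELS + noise;
}

// Enumerate every combination, e.g. to check that the mapping is one-to-one.
function allBucketIndices(): number[] {
  const indices: number[] = [];
  for (let r = 0; r < RANGE_LEVELS; r++) {
    for (let d = 0; d < DISTRACTOR_LEVELS; d++) {
      for (let n = 0; n < NOISE_LEVELS; n++) {
        indices.push(bucketIndex(r as RangeLevel, d as DistractorLevel, n as NoiseLevel));
      }
    }
  }
  return indices;
}
```

Each bucket would then hold its own set of bandit arms, so statistics learned in one context do not leak into another.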

Quick Start

import { RvfSolver } from '@ruvector/rvf-solver';

// Create a solver instance (loads WASM on first call)
const solver = await RvfSolver.create();

// Train on 100 puzzles (difficulty 1-5)
const result = solver.train({ count: 100, minDifficulty: 1, maxDifficulty: 5 });
console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
console.log(`Patterns learned: ${result.patternsLearned}`);

// Run full acceptance test (A/B/C ablation)
const manifest = solver.acceptance({ cycles: 3 });
console.log(`Mode A (fixed):    ${manifest.modeA.finalAccuracy.toFixed(3)}`);
console.log(`Mode B (compiler): ${manifest.modeB.finalAccuracy.toFixed(3)}`);
console.log(`Mode C (learned):  ${manifest.modeC.finalAccuracy.toFixed(3)}`);
console.log(`All passed: ${manifest.allPassed}`);

// Inspect Thompson Sampling policy state
const policy = solver.policy();
console.log(`Context buckets: ${Object.keys(policy?.contextStats ?? {}).length}`);
console.log(`Speculative attempts: ${policy?.speculativeAttempts ?? 0}`);

// Get raw SHAKE-256 witness chain
const chain = solver.witnessChain();
console.log(`Witness chain: ${chain?.length ?? 0} bytes`);

// Free WASM resources
solver.destroy();

API Reference

RvfSolver.create(): Promise<RvfSolver>

Creates a new solver instance. Initializes the WASM module on the first call; subsequent calls reuse the loaded module. Up to 7 concurrent instances are supported.

const solver = await RvfSolver.create();

solver.train(options: TrainOptions): TrainResult

Trains the solver on randomly generated puzzles using the three-loop architecture. The fast loop applies constraint propagation, the medium loop selects skip modes via Thompson Sampling, and the slow loop distills patterns into the KnowledgeCompiler cache.

const result = solver.train({ count: 200, minDifficulty: 1, maxDifficulty: 10 });

solver.acceptance(options?: AcceptanceOptions): AcceptanceManifest

Runs the full acceptance test with training/holdout cycles across all three ablation modes (A, B, C). Returns a manifest with per-cycle metrics, pass/fail status, and witness chain metadata.

const manifest = solver.acceptance({ cycles: 5, holdoutSize: 50 });

solver.policy(): PolicyState | null

Returns the current Thompson Sampling policy state including per-context-bucket arm statistics, KnowledgeCompiler cache stats, and speculative execution counters. Returns null if no training has been performed.

const policy = solver.policy();

solver.witnessChain(): Uint8Array | null

Returns the raw SHAKE-256 witness chain bytes. Each entry is 73 bytes, and together the entries provide tamper-evident proof of all training and acceptance operations. Returns null if the chain is empty. The returned Uint8Array is a copy, so it remains safe to use after destroy().

const chain = solver.witnessChain();
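Because each entry is 73 bytes, the returned buffer can be cut into fixed-size entries. The sketch below assumes the chain is a plain concatenation of entries with no header or trailer, which the README does not state. `splitWitnessChain` is a hypothetical helper, not part of the package.

```typescript
const WITNESS_ENTRY_BYTES = 73; // entry size stated in the README

// Hypothetical helper: split a raw chain into 73-byte entries.
// Assumes the chain is a bare concatenation of entries (no header).
function splitWitnessChain(chain: Uint8Array | null): Uint8Array[] {
  if (chain === null) return [];
  if (chain.length % WITNESS_ENTRY_BYTES !== 0) {
    throw new Error(`Chain length ${chain.length} is not a multiple of ${WITNESS_ENTRY_BYTES}`);
  }
  const entries: Uint8Array[] = [];
  for (let offset = 0; offset < chain.length; offset += WITNESS_ENTRY_BYTES) {
    // subarray returns a view on the same buffer, so nothing is copied here
    entries.push(chain.subarray(offset, offset + WITNESS_ENTRY_BYTES));
  }
  return entries;
}
```

With a real solver this would be called as `splitWitnessChain(solver.witnessChain())`. The internal field layout of an entry is not documented here, so the sketch stops at entry boundaries.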

solver.destroy(): void

Frees the WASM solver instance and releases all associated memory. The instance must not be used after calling destroy().

solver.destroy();
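Since a destroyed instance must not be reused and WASM memory is not released automatically, it can help to pair every use with a guaranteed destroy(). The following is a minimal sketch of that pattern. `useAndDestroy` and the `Destroyable` interface are hypothetical names introduced for illustration; the helper relies only on the destroy() method documented above.

```typescript
// Minimal shape the helper depends on; any object with destroy() qualifies.
interface Destroyable {
  destroy(): void;
}

// Hypothetical helper: run `work` and always call destroy(), even if `work` throws.
function useAndDestroy<T extends Destroyable, R>(resource: T, work: (resource: T) => R): R {
  try {
    return work(resource);
  } finally {
    resource.destroy();
  }
}
```

Because train(), acceptance(), policy(), and witnessChain() are synchronous per the signatures above, usage could look like `useAndDestroy(await RvfSolver.create(), (s) => s.train({ count: 100 }))`.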

Types

TrainOptions

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| count | number | required | Number of puzzles to generate and solve |
| minDifficulty | number | 1 | Minimum puzzle difficulty (1-10) |
| maxDifficulty | number | 10 | Maximum puzzle difficulty (1-10) |
| seed | bigint \| number | random | RNG seed for reproducible runs |
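The TrainOptions fields above map onto a TypeScript shape with optional members. The sketch below restates that shape and adds a hypothetical `normalizeTrainOptions` helper that applies the documented defaults and rejects values outside 1-10. Whether the package itself validates its input this way is an assumption; the integer checks in particular are not stated in the README.

```typescript
// Field names, types, and defaults follow the TrainOptions table.
interface TrainOptions {
  count: number;
  minDifficulty?: number;
  maxDifficulty?: number;
  seed?: bigint | number;
}

interface NormalizedTrainOptions {
  count: number;
  minDifficulty: number;
  maxDifficulty: number;
  seed?: bigint | number;
}

// Hypothetical helper: fill in defaults (1 and 10) and validate ranges.
function normalizeTrainOptions(options: TrainOptions): NormalizedTrainOptions {
  const minDifficulty = options.minDifficulty ?? 1;
  const maxDifficulty = options.maxDifficulty ?? 10;
  if (!Number.isInteger(options.count) || options.count < 1) {
    throw new RangeError('count must be a positive integer');
  }
  for (const difficulty of [minDifficulty, maxDifficulty]) {
    if (!Number.isInteger(difficulty) || difficulty < 1 || difficulty > 10) {
      throw new RangeError('difficulty must be an integer in 1-10');
    }
  }
  if (minDifficulty > maxDifficulty) {
    throw new RangeError('minDifficulty must not exceed maxDifficulty');
  }
  const normalized: NormalizedTrainOptions = { count: options.count, minDifficulty, maxDifficulty };
  // Leave seed unset when omitted so the solver can pick a random one.
  if (options.seed !== undefined) normalized.seed = options.seed;
  return normalized;
}
```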

TrainResult

| Field | Type | Description |
| --- | --- | --- |
| trained | number | Number of puzzles trained on |
| correct | number | Number solved correctly |
| accuracy | number | Accuracy ratio (correct / trained) |
| patternsLearned | number | Patterns distilled by the ReasoningBank |

AcceptanceOptions

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| holdoutSize | number | 50 | Number of holdout puzzles per cycle |
| trainingPerCycle | number | 200 | Number of training puzzles per cycle |
| cycles | number | 5 | Number of train/test cycles |
| stepBudget | number | 500 | Maximum constraint propagation steps per puzzle |
| seed | bigint \| number | random | RNG seed for reproducible runs |

AcceptanceManifest

| Field | Type | Description |
| --- | --- | --- |
| version | number | Manifest schema version |
| modeA | AcceptanceModeResult | Mode A results (fixed heuristic) |
| modeB | AcceptanceModeResult | Mode B results (compiler-suggested) |
| modeC | AcceptanceModeResult | Mode C results (learned policy) |
| allPassed | boolean | true if Mode C passed |
| witnessEntries | number | Number of entries in the witness chain |
| witnessChainBytes | number | Total witness chain size in bytes |

AcceptanceModeResult

| Field | Type | Description |
| --- | --- | --- |
| passed | boolean | Whether this mode met the accuracy threshold |
| finalAccuracy | number | Accuracy on the final holdout cycle |
| cycles | CycleMetrics[] | Per-cycle accuracy and cost metrics |

PolicyState

| Field | Type | Description |
| --- | --- | --- |
| contextStats | Record<string, Record<string, SkipModeStats>> | Per-context-bucket, per-arm Thompson Sampling statistics |
| earlyCommitPenalties | number | Total early-commit penalty cost |
| earlyCommitsTotal | number | Total early-commit attempts |
| earlyCommitsWrong | number | Early commits that were incorrect |
| prepass | string | Current prepass strategy identifier |
| speculativeAttempts | number | Number of speculative dual-path executions |
| speculativeArm2Wins | number | Times the second speculative arm won |
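The two speculative counters suggest a simple bookkeeping scheme: count every dual-path run, and count separately how often the second arm wins. The sketch below illustrates that bookkeeping only. The winner rule (correct beats incorrect, then lower cost, ties to arm 1) and the names `ArmOutcome`, `pickWinner`, and `runSpeculative` are assumptions. It also runs the arms one after the other, whereas the feature list describes them as running in parallel.

```typescript
// Hypothetical outcome of running one candidate arm on a puzzle.
interface ArmOutcome {
  correct: boolean;
  cost: number;
}

// Counter names match the PolicyState fields above.
interface SpeculativeCounters {
  speculativeAttempts: number;
  speculativeArm2Wins: number;
}

// Assumed winner rule: correct beats incorrect; otherwise lower cost wins; ties go to arm 1.
function pickWinner(arm1: ArmOutcome, arm2: ArmOutcome): 1 | 2 {
  if (arm1.correct !== arm2.correct) return arm1.correct ? 1 : 2;
  return arm2.cost < arm1.cost ? 2 : 1;
}

// Run both arms, update the counters, and return the winning outcome.
function runSpeculative(
  runArm1: () => ArmOutcome,
  runArm2: () => ArmOutcome,
  counters: SpeculativeCounters,
): ArmOutcome {
  const first = runArm1();
  const second = runArm2();
  counters.speculativeAttempts += 1;
  const winner = pickWinner(first, second);
  if (winner === 2) counters.speculativeArm2Wins += 1;
  return winner === 1 ? first : second;
}
```

Under this reading, the ratio speculativeArm2Wins / speculativeAttempts indicates how often the second candidate was the better choice.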

Acceptance Test Modes

The acceptance test validates the solver's learning capability through three ablation modes run across multiple train/test cycles:

Mode A (Fixed) -- Uses a fixed heuristic skip-mode policy. This establishes the baseline performance without any learning. The policy does not adapt regardless of puzzle characteristics.

Mode B (Compiler) -- Uses the KnowledgeCompiler's signature-based pattern cache to select skip modes. The compiler distills observed patterns into compiled configurations but does not perform online Thompson Sampling updates.

Mode C (Learned) -- Uses the full Thompson Sampling two-signal model with context-bucketed bandits. This is the complete system: the fast loop solves, the medium loop selects arms based on safety Beta and cost EMA, and the slow loop feeds patterns back to the compiler. Mode C should outperform both A and B, demonstrating genuine self-improvement.

The test passes when Mode C achieves the accuracy threshold on holdout puzzles. The witness chain records every training and evaluation operation for tamper-evident auditability.
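The pass criterion (Mode C meets the threshold) is separate from the expectation that Mode C outperforms A and B. A caller who wants to check the second property can compare finalAccuracy across modes. The sketch below uses only fields documented for AcceptanceManifest and AcceptanceModeResult; `compareModes` and the strict "greater than" rule are illustrative assumptions.

```typescript
// Subset of the documented AcceptanceModeResult fields used here.
interface ModeSummary {
  passed: boolean;
  finalAccuracy: number;
}

// Subset of the documented AcceptanceManifest fields used here.
interface ManifestSummary {
  modeA: ModeSummary;
  modeB: ModeSummary;
  modeC: ModeSummary;
  allPassed: boolean;
}

interface ModeComparison {
  overFixed: number;    // Mode C accuracy minus Mode A accuracy
  overCompiler: number; // Mode C accuracy minus Mode B accuracy
  learnedWins: boolean; // true only if Mode C is strictly better than both
}

// Hypothetical helper: report how far the learned policy is ahead of each baseline.
function compareModes(manifest: ManifestSummary): ModeComparison {
  const overFixed = manifest.modeC.finalAccuracy - manifest.modeA.finalAccuracy;
  const overCompiler = manifest.modeC.finalAccuracy - manifest.modeB.finalAccuracy;
  return { overFixed, overCompiler, learnedWins: overFixed > 0 && overCompiler > 0 };
}
```

Since the real manifest contains these fields, `compareModes(solver.acceptance({ cycles: 3 }))` should type-check against it.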

Architecture

The solver uses a three-loop adaptive architecture:

+-----------------------------------------------+
|  Slow Loop: KnowledgeCompiler                  |
|  - Signature-based pattern cache               |
|  - Distills observations into compiled configs  |
+-----------------------------------------------+
        |                          ^
        v                          |
+-----------------------------------------------+
|  Medium Loop: PolicyKernel                     |
|  - Thompson Sampling (safety Beta + cost EMA)  |
|  - 18 context buckets (range x distractor x noise) |
|  - Speculative dual-path execution             |
+-----------------------------------------------+
        |                          ^
        v                          |
+-----------------------------------------------+
|  Fast Loop: Constraint Propagation Solver      |
|  - Generates and solves puzzles                |
|  - Reports outcomes back to PolicyKernel       |
+-----------------------------------------------+
        |
        v
+-----------------------------------------------+
|  SHAKE-256 Witness Chain (73 bytes/entry)      |
|  - Tamper-evident proof of all operations      |
+-----------------------------------------------+

The fast loop runs on every puzzle, the medium loop updates policy parameters after each solve, and the slow loop periodically compiles accumulated observations into cached patterns. All operations are recorded in the SHAKE-256 witness chain.
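To make the medium loop more concrete, here is a minimal sketch of Thompson Sampling with two signals per arm: a Beta posterior over "safety" (correct vs. incorrect solves) and an exponential moving average of cost. The README names these two signals but not how they are combined, so the scoring rule (sampled safety minus a weighted cost), the EMA rate, and the arm names are all assumptions. The Beta sampler works only for integer parameters, which is enough for a Beta(1, 1) prior with counted outcomes.

```typescript
// Small seeded PRNG (mulberry32) so runs are reproducible.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Gamma(shape, 1) for integer shape: sum of `shape` exponential draws.
function sampleGammaInt(shape: number, rng: () => number): number {
  let sum = 0;
  for (let i = 0; i < shape; i++) sum += -Math.log(1 - rng());
  return sum;
}

// Beta(alpha, beta) for integer parameters via two Gamma draws.
function sampleBeta(alpha: number, beta: number, rng: () => number): number {
  const x = sampleGammaInt(alpha, rng);
  const y = sampleGammaInt(beta, rng);
  return x / (x + y);
}

interface ArmStats {
  successes: number; // correct solves observed with this arm
  failures: number;  // incorrect solves observed with this arm
  costEma: number;   // exponential moving average of solve cost
}

// Update both signals after one solve. emaRate is an assumed smoothing factor.
function updateArm(arm: ArmStats, correct: boolean, cost: number, emaRate = 0.1): void {
  if (correct) arm.successes += 1;
  else arm.failures += 1;
  arm.costEma += emaRate * (cost - arm.costEma);
}

// Assumed scoring rule: sampled safety minus costWeight * cost EMA; highest score wins.
function selectArm(arms: Record<string, ArmStats>, rng: () => number, costWeight = 0.01): string {
  let best = '';
  let bestScore = -Infinity;
  for (const name of Object.keys(arms)) {
    const arm = arms[name];
    const safety = sampleBeta(1 + arm.successes, 1 + arm.failures, rng);
    const score = safety - costWeight * arm.costEma;
    if (score > bestScore) {
      best = name;
      bestScore = score;
    }
  }
  return best;
}
```

In the full design, one such table of arms would exist per context bucket. Sampling from the posterior instead of taking its mean is what keeps under-explored arms in play.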

Unified SDK

When using the @ruvector/rvf unified SDK, import RvfSolver from that package instead:

import { RvfSolver } from '@ruvector/rvf';

const solver = await RvfSolver.create();
const result = solver.train({ count: 100 });
console.log(`Accuracy: ${(result.accuracy * 100).toFixed(1)}%`);
solver.destroy();

| Package | Description |
| --- | --- |
| @ruvector/rvf | Unified TypeScript SDK |
| @ruvector/rvf-node | Native N-API bindings for Node.js |
| @ruvector/rvf-wasm | Browser WASM package |
| @ruvector/rvf-mcp-server | MCP server for AI agents |

License

MIT OR Apache-2.0

Keywords

solver

Package last updated on 25 Mar 2026
