@ruvector/ruvllm


Self-learning LLM orchestration with SONA adaptive learning, HNSW memory, FastGRNN routing, and SIMD inference

Source: npm
Version: 0.2.4
Weekly downloads: 144K (24.41%)
Maintainers: 1

Build AI that learns and improves from every interaction.

RuvLLM is a self-learning language model toolkit that gets smarter over time. Unlike traditional LLMs that remain static after training, RuvLLM continuously adapts to your use case while remembering what it learned before.

What Makes RuvLLM Different?

Traditional LLMs tend to forget old knowledge when they are trained on new data, a problem known as "catastrophic forgetting". RuvLLM is built around three mechanisms, the first of which targets this problem directly:

  • It Learns Without Forgetting - Uses tiny parameter updates (LoRA) and memory protection (EWC++) to learn new patterns while preserving existing knowledge

  • It Remembers Context - Built-in vector memory stores and retrieves relevant information instantly using similarity search

  • It Routes Intelligently - Automatically selects the right model size and parameters based on query complexity, saving resources on simple tasks
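The LoRA idea in the first bullet can be sketched in a few lines. This is a generic illustration of a low-rank adapter, not RuvLLM's internal implementation: the names `loraForward`, `matVec`, and `paramCounts`, and the `alpha / rank` scaling, follow the common LoRA formulation and are assumptions here.

```typescript
// Generic LoRA sketch (not RuvLLM's code): y = W*x + (alpha / r) * B*(A*x)
// W is frozen (dOut x dIn); only A (r x dIn) and B (dOut x r) are trained.
type Matrix = number[][];

function matVec(m: Matrix, v: number[]): number[] {
  return m.map(row => row.reduce((sum, w, i) => sum + w * v[i], 0));
}

function loraForward(
  W: Matrix, A: Matrix, B: Matrix, x: number[], alpha: number
): number[] {
  const rank = A.length;
  const base = matVec(W, x);            // frozen path
  const low = matVec(B, matVec(A, x));  // low-rank path: project down to r, then back up
  const scale = alpha / rank;
  return base.map((y, i) => y + scale * low[i]);
}

// Trainable parameter count: full weight update vs. low-rank update
function paramCounts(dIn: number, dOut: number, rank: number) {
  return { full: dIn * dOut, lora: rank * (dIn + dOut) };
}

const W: Matrix = [[1, 0], [0, 1]];   // 2x2 identity as the frozen weights
const A: Matrix = [[1, 1]];           // rank 1: 1x2
const B: Matrix = [[0.5], [0]];       // 2x1
console.log(loraForward(W, A, B, [2, 3], 1)); // [4.5, 3]
console.log(paramCounts(256, 256, 8));        // { full: 65536, lora: 4096 }
```

Because only `A` and `B` change, the frozen weights `W` keep whatever they encoded before, which is why the update is cheap to store and easy to discard or swap.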

Key Features

| Feature | What It Does | Why It Matters |
| --- | --- | --- |
| Adaptive Learning | Learns from user feedback in real-time | Improves accuracy over time without retraining |
| Memory System | Stores context with instant similarity search | Finds relevant information in microseconds |
| Smart Routing | Picks optimal model/settings per query | Reduces costs, improves response quality |
| SIMD Acceleration | Uses CPU vector instructions (AVX2/NEON) | 10-50x faster vector operations |
| Federated Learning | Train across devices without sharing data | Privacy-preserving distributed learning |
| LoRA Adapters | Parameter-efficient fine-tuning with low-rank matrices | Fast adaptation with minimal memory |
| EWC++ Protection | Elastic Weight Consolidation prevents forgetting | Learn new tasks without losing old knowledge |
| SafeTensors Export | HuggingFace-compatible model serialization | Share models with the ML ecosystem |
| Training Pipeline | Full training infrastructure with schedulers | Production-ready model training |
| Session Management | Stateful conversations with streaming | Build chat applications easily |
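To make the "Federated Learning" row concrete, here is a minimal sketch of weighted federated averaging (FedAvg), a common way to merge updates from several devices. The README does not say which aggregation scheme RuvLLM uses, so treat this as an assumption; `AgentUpdate` and `federatedAverage` are hypothetical names.

```typescript
// Hypothetical sketch of federated averaging (FedAvg); RuvLLM's actual
// aggregation scheme is not described in this README.
interface AgentUpdate {
  weights: number[];   // locally trained parameters (e.g. flattened adapter weights)
  sampleCount: number; // how many local examples produced them
}

function federatedAverage(updates: AgentUpdate[]): number[] {
  if (updates.length === 0) throw new Error("no updates to aggregate");
  const dim = updates[0].weights.length;
  const total = updates.reduce((n, u) => n + u.sampleCount, 0);
  if (total <= 0) throw new Error("total sample count must be positive");
  const merged = new Array<number>(dim).fill(0);
  for (const u of updates) {
    if (u.weights.length !== dim) throw new Error("dimension mismatch");
    const share = u.sampleCount / total; // agents with more data count more
    for (let i = 0; i < dim; i++) merged[i] += share * u.weights[i];
  }
  return merged;
}

// Only weights and counts leave each device; raw training data stays local.
const mergedWeights = federatedAverage([
  { weights: [1, 2], sampleCount: 1 },
  { weights: [3, 6], sampleCount: 3 },
]);
console.log(mergedWeights); // [2.5, 5]
```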

Installation

npm install @ruvector/ruvllm

Or run it directly with npx, without installing:

npx @ruvector/ruvllm info

Quick Start Tutorial

1. Basic Query

import { RuvLLM } from '@ruvector/ruvllm';

const llm = new RuvLLM();

// Ask a question - routing happens automatically
const response = llm.query('Explain neural networks simply');
console.log(response.text);
// Output: "Neural networks are computing systems inspired by..."

console.log(`Used model: ${response.model}`);
console.log(`Confidence: ${(response.confidence * 100).toFixed(1)}%`);

2. Teaching the System

// Query and get a response
const response = llm.query('What is the capital of France?');

// Provide feedback - the system learns from this
llm.feedback({
  requestId: response.requestId,
  rating: 5,  // 1-5 scale
  correction: 'Paris is the capital and largest city of France'
});

// Future similar queries will be more accurate

3. Using Memory

// Store important context
llm.addMemory('Company policy: All returns accepted within 30 days', {
  category: 'policy',
  department: 'customer-service'
});

llm.addMemory('Product X launched in March 2024 with features A, B, C', {
  category: 'product',
  name: 'Product X'
});

// Search memory for relevant context
const results = llm.searchMemory('return policy', 5);
console.log(results[0].content);
// Output: "Company policy: All returns accepted within 30 days"
console.log(`Relevance: ${(results[0].score * 100).toFixed(1)}%`);
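Conceptually, a memory search ranks stored vectors by similarity to the query vector and returns the best `k`. The sketch below does this with an exhaustive scan rather than the HNSW index the package describes, so it shows the result shape, not the performance characteristics. `MemoryItem`, `Hit`, and `topK` are hypothetical names, and the vectors are hand-written stand-ins for real embeddings.

```typescript
// Brute-force stand-in for vector memory search (the package itself describes HNSW).
interface MemoryItem { content: string; vector: number[]; }
interface Hit { content: string; score: number; }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

function topK(items: MemoryItem[], queryVec: number[], k: number): Hit[] {
  return items
    .map(item => ({ content: item.content, score: cosine(item.vector, queryVec) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}

// Toy 3-d vectors; real embeddings would come from an embedding model.
const store: MemoryItem[] = [
  { content: "returns policy", vector: [1, 0, 0] },
  { content: "product launch", vector: [0, 1, 0] },
  { content: "refund window", vector: [0.9, 0.1, 0] },
];
const hits = topK(store, [1, 0, 0], 2);
console.log(hits.map(h => h.content)); // ["returns policy", "refund window"]
```

An exhaustive scan costs time proportional to the number of stored items; an approximate index such as HNSW trades a little recall for much faster lookups on large stores.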

4. Computing Similarity

import { SimdOps } from '@ruvector/ruvllm';

const simd = new SimdOps();

// Compare two texts
const score = llm.similarity(
  'How do I reset my password?',
  'I forgot my login credentials'
);
console.log(`Similarity: ${(score * 100).toFixed(1)}%`);
// Output: "Similarity: 78.3%"

// Fast vector operations
const embedding1 = llm.embed('machine learning');
const embedding2 = llm.embed('deep learning');
const similarity = simd.cosineSimilarity(embedding1, embedding2);
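The three vector metrics listed for `SimdOps` are closely related. For unit-length vectors, cosine similarity equals the dot product, and the squared L2 distance equals 2 - 2 * cosine. The sketch below checks this with plain scalar helpers; `dotPlain`, `toUnit`, and `l2Plain` are hypothetical names and are not the package's SIMD kernels.

```typescript
// Plain scalar versions for illustration; these are not the package's SIMD kernels.
function dotPlain(a: number[], b: number[]): number {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

function toUnit(v: number[]): number[] {
  const len = Math.sqrt(dotPlain(v, v));
  return len === 0 ? v.slice() : v.map(x => x / len);
}

function l2Plain(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

const u = toUnit([3, 4]);   // approximately [0.6, 0.8]
const w = toUnit([4, 3]);   // approximately [0.8, 0.6]
const cos = dotPlain(u, w); // cosine similarity = dot product of unit vectors
const dist = l2Plain(u, w);

// Identity for unit vectors: dist^2 = 2 - 2 * cos
console.log(cos.toFixed(2));           // "0.96"
console.log((dist * dist).toFixed(2)); // "0.08"
console.log((2 - 2 * cos).toFixed(2)); // "0.08"
```

One practical consequence: if embeddings are normalized once at insert time, ranking by dot product, by cosine similarity, or by ascending L2 distance gives the same order.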

5. Batch Processing

// Process multiple queries efficiently
const batch = llm.batchQuery({
  queries: [
    'What is AI?',
    'Explain machine learning',
    'How do neural networks work?'
  ],
  config: { temperature: 0.7 }
});

batch.responses.forEach((r, i) => {
  console.log(`Query ${i + 1}: ${r.text.slice(0, 50)}...`);
});
console.log(`Total time: ${batch.totalLatencyMs}ms`);

CLI Commands

# Get system information
ruvllm info

# Query the model
ruvllm query "What is quantum computing?"

# Generate text with custom settings
ruvllm generate "Write a product description for:" --temperature 0.8 --max-tokens 200

# Memory operations
ruvllm memory add "Important fact to remember"
ruvllm memory search "fact" --k 10

# Compare texts
ruvllm similarity "hello world" "hi there"

# Get embeddings
ruvllm embed "your text here"

# Run performance benchmark
ruvllm benchmark --dims 768 --iterations 5000

# View statistics
ruvllm stats --json

Benchmarks

Benchmarked in Docker (node:20-alpine, x64) - December 2024

Core Operations

| Operation | Time | Throughput |
| --- | --- | --- |
| Query (short) | 1.49μs | 670K ops/s |
| Query (long) | 874ns | 1.14M ops/s |
| Generate | 88ns | 11.4M ops/s |
| Route | 92ns | 10.9M ops/s |
| Embed (256d) | 10.6μs | 94K ops/s |
| Embed (768d) | 7.1μs | 140K ops/s |
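The throughput column in these tables appears to be the reciprocal of the per-operation time. A small helper makes the conversion explicit and lets you check any row; `opsPerSecond` and `formatOps` are hypothetical names, not part of the package.

```typescript
// Throughput is the reciprocal of per-operation latency: ops/s = 1e9 / latency in ns.
function opsPerSecond(latencyNs: number): number {
  if (latencyNs <= 0) throw new RangeError("latency must be positive");
  return 1e9 / latencyNs;
}

function formatOps(ops: number): string {
  if (ops >= 1e6) return `${(ops / 1e6).toFixed(2)}M ops/s`;
  if (ops >= 1e3) return `${(ops / 1e3).toFixed(1)}K ops/s`;
  return `${ops.toFixed(0)} ops/s`;
}

console.log(formatOps(opsPerSecond(88)));    // "11.36M ops/s" (table: 11.4M)
console.log(formatOps(opsPerSecond(874)));   // "1.14M ops/s"  (table: 1.14M)
console.log(formatOps(opsPerSecond(10600))); // "94.3K ops/s"  (table: 94K)
```

These are single-threaded, per-call figures; they say nothing about end-to-end latency once a real model backend is involved.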

SIMD Vector Operations

| Operation | 128d | 256d | 512d | 768d |
| --- | --- | --- | --- | --- |
| Dot Product | 214ns / 4.67M ops/s | 318ns / 3.15M ops/s | 609ns / 1.64M ops/s | 908ns / 1.10M ops/s |
| Cosine Similarity | 233ns / 4.30M ops/s | 335ns / 2.99M ops/s | 652ns / 1.53M ops/s | 972ns / 1.03M ops/s |
| L2 Distance | 195ns / 5.14M ops/s | 315ns / 3.18M ops/s | 612ns / 1.63M ops/s | 929ns / 1.08M ops/s |

LoRA Adapter Performance

| Operation | 64d | 128d | 256d |
| --- | --- | --- | --- |
| Forward (r=4) | 6.09μs / 164K ops/s | 2.74μs / 365K ops/s | 4.83μs / 207K ops/s |
| Forward (r=8) | 2.17μs / 462K ops/s | 4.30μs / 233K ops/s | 8.99μs / 111K ops/s |
| Forward (r=16) | 4.85μs / 206K ops/s | 9.05μs / 111K ops/s | 18.3μs / 55K ops/s |
| Backward (r=8) | - | 110μs / 9.1K ops/s | - |
| Batch (100) | - | 467μs / 2.1K ops/s | - |

Memory Operations

| Operation | Time | Throughput |
| --- | --- | --- |
| Add Memory | 5.3μs | 189K ops/s |
| Search (k=5) | 45.6μs | 21.9K ops/s |
| Search (k=10) | 28.3μs | 35.3K ops/s |
| Search (k=20) | 33.1μs | 30.2K ops/s |

SONA Learning System

| Operation | Time | Throughput |
| --- | --- | --- |
| Pattern Store | 14.4μs | 69.5K ops/s |
| Pattern Find Similar | 224μs | 4.5K ops/s |
| EWC Register Task | 6.5μs | 154K ops/s |
| EWC Compute Penalty | 501μs | 2.0K ops/s |
| Trajectory Build | 1.24μs | 807K ops/s |

Federated Learning

| Operation | Time | Throughput |
| --- | --- | --- |
| Agent Create | 7.8μs | 128K ops/s |
| Process Task | 7.9μs | 126K ops/s |
| Apply LoRA | 12.6μs | 79.6K ops/s |
| Export State | 48.9μs | 20.4K ops/s |
| Aggregate | 5.26ms | 190 ops/s |

Session & Streaming

| Operation | Time | Throughput |
| --- | --- | --- |
| Session Create | 1.45μs | 690K ops/s |
| Session Chat | 3.28μs | 305K ops/s |
| Session Export | 3.91ms | 255 ops/s |
| Session Import | 1.60ms | 625 ops/s |

Training Pipeline

| Operation | Time |
| --- | --- |
| Pipeline Create | 70.6μs |
| Add Data (100 samples) | 70.6μs |
| Train (32 samples, 3 epochs) | 1.33s |

Export/Import

| Operation | Time | Throughput |
| --- | --- | --- |
| SafeTensors Write | 67.3μs | 14.9K ops/s |
| SafeTensors Read | 102μs | 9.8K ops/s |
| LoRA to JSON | 87.9μs | 11.4K ops/s |
| LoRA from JSON | 86.0μs | 11.6K ops/s |

Performance Highlights

  • Fastest: Generate at 11.4M ops/s, Route at 10.9M ops/s
  • Vector Ops: Up to 5.14M ops/s for L2 distance (128d)
  • LoRA Forward: Up to 462K ops/s (64d, rank-8)
  • Memory Search: 35K ops/s (k=10)
  • Session Create: 690K ops/s

Configuration

const llm = new RuvLLM({
  // Embedding settings
  embeddingDim: 768,        // Vector dimensions (384, 768, 1024)

  // Memory settings
  hnswM: 16,                // Graph connectivity (higher = better recall, more memory)
  hnswEfConstruction: 100,  // Build quality (higher = better index, slower build)
  hnswEfSearch: 64,         // Search quality (higher = better recall, slower search)

  // Learning settings
  learningEnabled: true,    // Enable adaptive learning
  qualityThreshold: 0.7,    // Min confidence to skip learning
  ewcLambda: 2000,          // Memory protection strength

  // Router settings
  routerHiddenDim: 128,     // Router network size
});
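The `ewcLambda` setting is described as "memory protection strength". In standard Elastic Weight Consolidation, that role is played by the coefficient on a quadratic penalty that discourages important weights from drifting. The sketch below shows the standard EWC form only; the README does not document how the package's EWC++ variant differs, and `ewcPenalty` is a hypothetical name.

```typescript
// Standard EWC penalty (not necessarily the package's EWC++ variant):
//   penalty = (lambda / 2) * sum_i fisher[i] * (current[i] - anchor[i])^2
function ewcPenalty(
  current: number[],  // parameters being trained on the new task
  anchor: number[],   // parameters saved after the previous task
  fisher: number[],   // per-parameter importance (diagonal Fisher estimate)
  lambda: number
): number {
  let sum = 0;
  for (let i = 0; i < current.length; i++) {
    const drift = current[i] - anchor[i];
    sum += fisher[i] * drift * drift;
  }
  return (lambda / 2) * sum;
}

const anchorWeights = [1.0, 2.0];
const importance = [4.0, 0.0]; // first weight mattered for the old task, second did not

// Moving the important weight is penalized heavily...
console.log(ewcPenalty([1.5, 2.0], anchorWeights, importance, 2000)); // 1000
// ...while the unimportant weight is free to change.
console.log(ewcPenalty([1.0, 9.0], anchorWeights, importance, 2000)); // 0
```

Under this formulation, a larger lambda protects old knowledge more strongly at the cost of slower adaptation to new patterns.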

Platform Support

Native acceleration available on:

| Platform | Architecture | SIMD Support |
| --- | --- | --- |
| macOS | Apple Silicon (M1/M2/M3) | NEON |
| macOS | Intel x64 | AVX2, SSE4.1 |
| Linux | x64 | AVX2, AVX-512, SSE4.1 |
| Linux | ARM64 | NEON |
| Windows | x64 | AVX2, SSE4.1 |

Falls back to optimized JavaScript on unsupported platforms.

Real-World Use Cases

Customer Support Bot

// Store FAQ and policies
// (faqs is assumed to be an array of { question, answer } objects loaded elsewhere)
faqs.forEach(faq => llm.addMemory(faq.answer, { question: faq.question }));

// Answer questions with context
function answerQuestion(question: string) {
  const context = llm.searchMemory(question, 3);
  const prompt = `Context:\n${context.map(c => c.content).join('\n')}\n\nQuestion: ${question}`;
  return llm.query(prompt);
}

Document Search

// Index documents
// (documents is assumed to be an array of { title, path, content } objects)
documents.forEach(doc => {
  llm.addMemory(doc.content, {
    title: doc.title,
    path: doc.path
  });
});

// Semantic search
const results = llm.searchMemory('quarterly revenue growth', 10);

Personalized Recommendations

// Learn from user interactions
function recordInteraction(userId: string, itemId: string, rating: number) {
  const response = llm.query(`User ${userId} rated ${itemId}`);
  llm.feedback({ requestId: response.requestId, rating });
}

// Get recommendations
function recommend(userId: string) {
  return llm.searchMemory(`preferences for user ${userId}`, 10);
}

API Reference

RuvLLM Class

| Method | Description |
| --- | --- |
| query(text, config?) | Query with automatic model routing |
| generate(prompt, config?) | Generate text with given prompt |
| route(text) | Get routing decision without executing |
| addMemory(content, metadata?) | Store content in vector memory |
| searchMemory(text, k?) | Find similar content (default k=10) |
| feedback(fb) | Submit feedback for learning |
| embed(text) | Get embedding vector for text |
| similarity(t1, t2) | Compute similarity between texts |
| stats() | Get engine statistics |
| forceLearn() | Trigger immediate learning cycle |

SimdOps Class

| Method | Description |
| --- | --- |
| dotProduct(a, b) | Vector dot product |
| cosineSimilarity(a, b) | Cosine similarity (0-1) |
| l2Distance(a, b) | Euclidean distance |
| normalize(v) | Normalize to unit length |
| softmax(v) | Softmax activation |
| relu(v) | ReLU activation |
| gelu(v) | GELU activation |
| layerNorm(v, eps?) | Layer normalization |
| matvec(m, v) | Matrix-vector multiply |
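For readers who want to check what a few of these operations compute, here are scalar reference versions using the textbook definitions. They are for intuition only: the `Ref` names are hypothetical, the use of `number[]` is an assumption about input types, and the default `eps` of 1e-5 is a common choice rather than a documented package default.

```typescript
// Scalar reference versions of a few operations listed above (textbook definitions).
function softmaxRef(v: number[]): number[] {
  const max = Math.max(...v);                  // subtract the max for numerical stability
  const exps = v.map(x => Math.exp(x - max));
  const total = exps.reduce((s, e) => s + e, 0);
  return exps.map(e => e / total);
}

function reluRef(v: number[]): number[] {
  return v.map(x => Math.max(0, x));
}

function layerNormRef(v: number[], eps = 1e-5): number[] {
  const mean = v.reduce((s, x) => s + x, 0) / v.length;
  const variance = v.reduce((s, x) => s + (x - mean) ** 2, 0) / v.length;
  return v.map(x => (x - mean) / Math.sqrt(variance + eps));
}

console.log(softmaxRef([0, 0]));                          // [0.5, 0.5]
console.log(reluRef([-1, 0, 2]));                         // [0, 0, 2]
console.log(layerNormRef([1, 3]).map(x => x.toFixed(3))); // ["-1.000", "1.000"]
```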

Troubleshooting

Q: Native module not loading?

ruvllm info  # Check if native is loaded

If the output shows "Native: Fallback", install the platform-specific package manually:

npm install @ruvector/ruvllm-darwin-arm64  # For Apple Silicon

Q: Memory usage too high?

Reduce the HNSW parameters:

const llm = new RuvLLM({ hnswM: 8, hnswEfConstruction: 50 });

Q: Learning not improving results?

Check that feedback is being processed:

const stats = llm.stats();
console.log(`Patterns learned: ${stats.patternsLearned}`);

License

MIT OR Apache-2.0

  • GitHub Repository
  • Documentation
  • Issue Tracker

Keywords

ruvllm


Package last updated on 02 Jan 2026
