@claude-flow/embeddings

V3 Embedding Service - OpenAI, Transformers.js, Agentic-Flow (ONNX), Mock providers with hyperbolic embeddings, normalization, and chunking

Version: 3.0.0-alpha.45 (npm) · Tags: alpha, latest, v3alpha · Maintainers: 1


High-performance embedding generation module for Claude Flow V3 - multi-provider support with persistent caching, document chunking, normalization, hyperbolic embeddings, and neural substrate integration.

SOTA at a glance (ADR-121)

Measured 2026-05-17 via scripts/benchmark-embeddings-sota.mjs, gated in CI as embeddings-sota-benchmark:

| Capability | Measured | Acceptance gate |
|---|---|---|
| Install footprint (provider: 'ruvector-onnx') | 7.13 MB | ≥30% smaller than Xenova → measured 97.1% (34.7× smaller) |
| int8 quantization recall (SearchableEmbeddingCache, quantize: 'int8') | 1.0000 mean | ≥0.98 mean recall on unit-normalized vectors ✅ |
| RaBitQ memory reduction (RabitqSnapshot) | 32× (theoretical, 10k×384) | ≥32× ✅ |
| Hyperbolic Poincaré ops (@ruvector/attention) | ~1 μs/op (native NAPI) | Backend reachable + finite output ✅ |
| DiskANN/Vamana (DiskannSnapshot, billion-scale on-SSD) | 200 vec build in 203 ms · 0.1 ms/query | Search latency <10 ms/query on 200-vec corpus ✅ |
| ruvector MCP sidecar probe | 83 tools / 15 groups (when installed) | Probe returns structured report without throwing ✅ |
| Xenova provider deprecation (provider: 'transformers') | One-shot warning per process | CLAUDE_FLOW_EMBEDDINGS_XENOVA_DEPRECATED code present ✅ |

See ADR-121 for the per-phase rationale and #2036 for the iteration log.
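
The int8 and RaBitQ rows above reduce to arithmetic on bits per dimension. The sketch below is illustrative only: `bytesFor`, `quantizeInt8`, and `dequantizeInt8` are hypothetical helpers, not the package's SearchableEmbeddingCache or RabitqSnapshot code. It assumes symmetric scalar quantization with a fixed scale of 127 (reasonable for unit-normalized vectors) and ignores any per-vector metadata a real index would store.

```typescript
// Hypothetical helpers for illustration; not the package's implementation.

/** Bytes needed to store `count` vectors of `dim` dimensions at `bitsPerDim`. */
function bytesFor(count: number, dim: number, bitsPerDim: number): number {
  return (count * dim * bitsPerDim) / 8;
}

/** Symmetric int8 quantization, assuming every component lies in [-1, 1]. */
function quantizeInt8(v: Float32Array): Int8Array {
  const q = new Int8Array(v.length);
  for (let i = 0; i < v.length; i++) {
    const clamped = Math.max(-1, Math.min(1, v[i]));
    q[i] = Math.round(clamped * 127);
  }
  return q;
}

/** Map int8 codes back to approximate float values. */
function dequantizeInt8(q: Int8Array): Float32Array {
  const v = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) v[i] = q[i] / 127;
  return v;
}

// 10k x 384 corpus, matching the RaBitQ row.
const float32Bytes = bytesFor(10000, 384, 32); // 15,360,000 bytes
const int8Bytes = bytesFor(10000, 384, 8);     // 3,840,000 bytes
const oneBitBytes = bytesFor(10000, 384, 1);   // 480,000 bytes

console.log('int8 reduction:', float32Bytes / int8Bytes);    // 4
console.log('1-bit reduction:', float32Bytes / oneBitBytes); // 32

// Round-trip error per component is at most half a quantization step (1/254).
const unit = new Float32Array([0.6, 0.8, 0]);
const roundTrip = dequantizeInt8(quantizeInt8(unit));
console.log(roundTrip);
```

The 32× figure is the ratio of 32-bit floats to 1-bit codes, which is why the table labels it theoretical.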

Features

Core Embedding

  • Multiple Providers - Agentic-Flow (ONNX), OpenAI, Transformers.js, and Mock
  • Auto-Install - Automatically installs agentic-flow when using provider: 'auto'
  • Smart Fallback - Graceful fallback chain: agentic-flow → transformers → mock
  • LRU + Disk Caching - In-memory LRU + SQLite persistent cache with TTL
  • Batch Processing - Efficient batch embedding with partial cache hits
  • Similarity Functions - Cosine, Euclidean, and dot product metrics
  • 75x Faster - Agentic-flow ONNX is 75x faster than Transformers.js

Advanced Features (New in v3.0.0-alpha.11)

  • Document Chunking - Character, sentence, paragraph, and token-based chunking with overlap
  • Multiple Normalization - L2, L1, min-max, and z-score normalization
  • Hyperbolic Embeddings - Poincaré ball model for hierarchical representations
  • Neural Substrate - Semantic drift detection, memory physics, swarm coordination
  • Persistent Cache - SQLite-backed disk cache with LRU eviction and TTL

Installation

npm install @claude-flow/embeddings

Standalone use (without the Ruflo CLI)

This package runs on its own: the recipes below need no @claude-flow/cli, no MCP server, and no API key. MockEmbeddingService produces deterministic 384-dim vectors, so the examples are reproducible. Swap to OpenAIEmbeddingService, TransformersEmbeddingService, or AgenticFlowEmbeddingService when you're ready for real embeddings.

Recipe — Embed + compare without a network call

// recipe.mjs
import {
  MockEmbeddingService,
  cosineSimilarity,
} from '@claude-flow/embeddings';

// Option names follow the Mock Provider section of the API reference
const service = new MockEmbeddingService({ provider: 'mock', dimensions: 384 });

// `embed()` returns { embedding: Float32Array, tokens, ... } —
// destructure for the raw vector.
const [{ embedding: cat }, { embedding: dog }, { embedding: car }] =
  await Promise.all([
    service.embed('a friendly cat'),
    service.embed('a friendly dog'),
    service.embed('a red sports car'),
  ]);

console.log('cat ↔ dog:', cosineSimilarity(cat, dog).toFixed(4));
console.log('cat ↔ car:', cosineSimilarity(cat, car).toFixed(4));
// Mock vectors are derived from a text hash, so these two scores carry no semantic meaning.
// With real embeddings the pet pair will score higher than the cat/car pair.

Provider swap — same surface, real vectors

import { createEmbeddingService, cosineSimilarity } from '@claude-flow/embeddings';

// OpenAI
const openai = createEmbeddingService({
  provider: 'openai',
  model: 'text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY!,
});

// Or local ONNX via @huggingface/transformers (no API key needed)
const local = createEmbeddingService({
  provider: 'transformers',
  model: 'Xenova/all-MiniLM-L6-v2', // 384-dim, ~23MB
});

await openai.initialize();
const v = await openai.embed('hello world');

cosineSimilarity, euclideanDistance, dotProduct, and computeSimilarity are exported as plain functions — useful when you already have vectors from another source and just need the math.

Quick Start

import { createEmbeddingService, cosineSimilarity } from '@claude-flow/embeddings';

// Create embedding service
const service = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
  dimensions: 1536,
});

// Generate embedding
const result = await service.embed('Hello, world!');
console.log(`Embedding dimension: ${result.embedding.length}`);
console.log(`Latency: ${result.latencyMs}ms`);

// Batch embed
const batchResult = await service.embedBatch([
  'First document',
  'Second document',
  'Third document',
]);
console.log(`Processed ${batchResult.embeddings.length} embeddings`);
console.log(`Cache hits: ${batchResult.cacheStats?.hits}`);

// Calculate similarity
const similarity = cosineSimilarity(
  batchResult.embeddings[0],
  batchResult.embeddings[1]
);
console.log(`Similarity: ${similarity.toFixed(4)}`);

CLI Usage

# Generate embedding from CLI
claude-flow embeddings embed "Your text here"

# Batch embed from file
claude-flow embeddings batch documents.txt -o embeddings.json

# Similarity search
claude-flow embeddings search "query" --index ./vectors

# Initialize agentic-flow model
claude-flow embeddings init --provider agentic-flow

API Reference

Factory Functions

import {
  createEmbeddingService,
  createEmbeddingServiceAsync,
  getEmbedding
} from '@claude-flow/embeddings';

// Sync: Create with known provider
const service = createEmbeddingService({
  provider: 'openai',
  apiKey: 'your-api-key',
  model: 'text-embedding-3-small',
});

// Async: Auto-select best provider with fallback
const autoService = await createEmbeddingServiceAsync({
  provider: 'auto',       // agentic-flow → transformers → mock
  autoInstall: true,      // Install agentic-flow if missing
  fallback: 'transformers', // Custom fallback
});

// Quick one-off embedding
const embedding = await getEmbedding('Hello world', {
  provider: 'mock',
  dimensions: 384,
});
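
The agentic-flow → transformers → mock chain amounts to "try each provider in order and keep the first that constructs". Below is a minimal synchronous sketch of that idea. `Candidate` and `firstAvailable` are hypothetical names introduced here, and the throwing factories merely simulate missing providers; the package's real entry point is the async `createEmbeddingServiceAsync` shown above.

```typescript
// Hypothetical types for illustration; not the package's factory.
interface Candidate {
  name: string;
  create: () => { embed: (text: string) => Float32Array };
}

/** Return the first candidate whose factory does not throw. */
function firstAvailable(candidates: Candidate[]) {
  const errors: string[] = [];
  for (const candidate of candidates) {
    try {
      return { name: candidate.name, service: candidate.create() };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${candidate.name}: ${reason}`);
    }
  }
  // Every provider failed: surface all reasons at once.
  throw new Error(`No provider available (${errors.join('; ')})`);
}

const chain: Candidate[] = [
  { name: 'agentic-flow', create: () => { throw new Error('not installed'); } },
  { name: 'transformers', create: () => { throw new Error('model unavailable'); } },
  { name: 'mock', create: () => ({ embed: (text) => new Float32Array([text.length]) }) },
];

const chosen = firstAvailable(chain);
console.log(chosen.name); // 'mock'
```

Collecting the failure reasons instead of discarding them makes it easier to see why a faster provider was skipped.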

OpenAI Provider

import { OpenAIEmbeddingService } from '@claude-flow/embeddings';

const service = new OpenAIEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',  // or 'text-embedding-3-large'
  dimensions: 1536,                  // Adjustable for v3 models
  baseURL: 'https://api.openai.com/v1/embeddings',
  timeout: 30000,
  maxRetries: 3,
  cacheSize: 1000,
});

const result = await service.embed('Your text here');
console.log('Tokens used:', result.usage?.totalTokens);

Agentic-Flow Provider (Fastest)

import { AgenticFlowEmbeddingService } from '@claude-flow/embeddings';

const service = new AgenticFlowEmbeddingService({
  provider: 'agentic-flow',
  modelId: 'default',     // Uses optimized ONNX model
  cacheSize: 256,
});

// 75x faster than Transformers.js (3ms vs 233ms)
const result = await service.embed('Your text here');
console.log(`ONNX embedding in ${result.latencyMs}ms`);

Transformers.js Provider (Local)

import { TransformersEmbeddingService } from '@claude-flow/embeddings';

const service = new TransformersEmbeddingService({
  provider: 'transformers',
  model: 'Xenova/all-MiniLM-L6-v2',  // Any HuggingFace model
  cacheSize: 1000,
});

// First call downloads the model
const result = await service.embed('Your text here');
console.log(`Local embedding generated in ${result.latencyMs}ms`);

Mock Provider (Testing)

import { MockEmbeddingService } from '@claude-flow/embeddings';

const service = new MockEmbeddingService({
  provider: 'mock',
  dimensions: 384,
  simulatedLatency: 10,  // Optional delay
  cacheSize: 100,
});

// Deterministic embeddings based on text hash
const result = await service.embed('Your text here');
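
To make "deterministic embeddings based on text hash" concrete, here is one way such a generator could work. This is an assumption-laden sketch, not the package's algorithm: `fnv1a`, `mulberry32`, and `mockEmbed` are names chosen for this example, the hash is an FNV-1a-style hash over UTF-16 code units, and the output is L2-normalized by choice.

```typescript
// Illustrative stand-in: the package's actual hashing scheme may differ.

/** 32-bit FNV-1a-style hash over the string's UTF-16 code units. */
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

/** mulberry32: a small seeded PRNG returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Same text in, same unit-length vector out. */
function mockEmbed(text: string, dimensions = 384): Float32Array {
  const rand = mulberry32(fnv1a(text));
  const v = new Float32Array(dimensions);
  let sumSq = 0;
  for (let i = 0; i < dimensions; i++) {
    v[i] = rand() * 2 - 1; // uniform in [-1, 1)
    sumSq += v[i] * v[i];
  }
  const norm = Math.sqrt(sumSq) || 1;
  for (let i = 0; i < dimensions; i++) v[i] /= norm;
  return v;
}

const first = mockEmbed('Your text here');
const again = mockEmbed('Your text here');
console.log(first.every((x, i) => x === again[i])); // true
```

Because the vector depends only on the hash, similar texts get unrelated vectors, which is fine for testing plumbing but not for testing retrieval quality.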

Batch Processing

const result = await service.embedBatch([
  'Document 1: Introduction to machine learning',
  'Document 2: Deep learning fundamentals',
  'Document 3: Natural language processing',
  'Document 4: Computer vision basics',
]);

console.log('Batch Results:', {
  count: result.embeddings.length,
  totalLatency: `${result.totalLatencyMs}ms`,
  avgLatency: `${result.avgLatencyMs}ms`,
  cacheHits: result.cacheStats?.hits,
  cacheMisses: result.cacheStats?.misses,
  tokensUsed: result.usage?.totalTokens,
});
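
The `cacheStats` hits and misses reflect partial cache hits: only uncached texts need a provider call. A rough sketch of that bookkeeping follows. `embedBatchWithCache`, `EmbedFn`, and `fakeEmbed` are hypothetical names, and the real service is asynchronous; this version is synchronous to keep the control flow visible.

```typescript
// Hypothetical sketch; `embedFn` stands in for any provider call.
type EmbedFn = (texts: string[]) => Float32Array[];

interface BatchOutcome {
  embeddings: Float32Array[];
  hits: number;
  misses: number;
}

function embedBatchWithCache(
  texts: string[],
  cache: Map<string, Float32Array>,
  embedFn: EmbedFn,
): BatchOutcome {
  const embeddings = new Array<Float32Array>(texts.length);
  const missIndices: number[] = [];

  // First pass: serve what we can from the cache.
  texts.forEach((text, i) => {
    const cached = cache.get(text);
    if (cached) embeddings[i] = cached;
    else missIndices.push(i);
  });

  // Second pass: one provider call for all misses, then backfill in input order.
  if (missIndices.length > 0) {
    const fresh = embedFn(missIndices.map((i) => texts[i]));
    missIndices.forEach((textIndex, j) => {
      embeddings[textIndex] = fresh[j];
      cache.set(texts[textIndex], fresh[j]);
    });
  }

  return {
    embeddings,
    hits: texts.length - missIndices.length,
    misses: missIndices.length,
  };
}

const batchCache = new Map<string, Float32Array>();
const fakeEmbed: EmbedFn = (texts) => texts.map((t) => new Float32Array([t.length]));

const cold = embedBatchWithCache(['a', 'bb'], batchCache, fakeEmbed);
const warm = embedBatchWithCache(['a', 'ccc', 'bb'], batchCache, fakeEmbed);
console.log(cold.hits, cold.misses); // 0 2
console.log(warm.hits, warm.misses); // 2 1
```

Results come back in input order regardless of which entries were cached, so callers can zip them with the original documents.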

Similarity Functions

import {
  cosineSimilarity,
  euclideanDistance,
  dotProduct,
  computeSimilarity,
} from '@claude-flow/embeddings';

// Cosine similarity (most common for embeddings)
const cosine = cosineSimilarity(embedding1, embedding2);
// Returns: -1.0 to 1.0 (higher = more similar)

// Euclidean distance
const distance = euclideanDistance(embedding1, embedding2);
// Returns: 0.0 to infinity (lower = more similar)

// Dot product
const dot = dotProduct(embedding1, embedding2);
// Returns: unbounded (higher = more similar for normalized vectors)

// Generic similarity with metric selection
const result = computeSimilarity(embedding1, embedding2, 'cosine');
// Returns: { score: 0.95, metric: 'cosine' }
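
For reference, the three metrics can be written in a few lines. The functions below are standalone definitions named `dot`, `cosine`, and `euclidean` to avoid confusion with the package's exports, whose handling of edge cases such as zero vectors may differ.

```typescript
// Reference definitions for illustration; the package exports its own versions.
function dot(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error('Dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function cosine(a: Float32Array, b: Float32Array): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom; // zero vectors: define similarity as 0
}

function euclidean(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error('Dimension mismatch');
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

const x = new Float32Array([1, 0]);
const y = new Float32Array([0, 1]);
const negX = new Float32Array([-1, 0]);

console.log(cosine(x, x));    // 1  (same direction)
console.log(cosine(x, y));    // 0  (orthogonal)
console.log(cosine(x, negX)); // -1 (opposite direction)
console.log(euclidean(x, y)); // sqrt(2), about 1.4142
```

For unit-length vectors the dot product equals the cosine, which is why L2-normalizing embeddings up front lets you use the cheaper dot product for ranking.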

Cache Management

// Get cache statistics
const stats = service.getCacheStats();
console.log('Cache Stats:', {
  size: stats.size,
  maxSize: stats.maxSize,
  hitRate: `${(stats.hitRate * 100).toFixed(1)}%`,
});

// Clear cache
service.clearCache();

// Shutdown service
await service.shutdown();
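
The `size`, `maxSize`, and `hitRate` fields map naturally onto an LRU cache. The following is a minimal sketch of one built on `Map` insertion order. `LruCache` is a hypothetical class for illustration, and the package's internal cache may be structured differently.

```typescript
// Illustrative only: not the package's cache implementation.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxSize: number;
  private hits = 0;
  private misses = 0;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) {
      this.misses++;
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    this.hits++;
    return value;
  }

  set(key: string, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  getStats() {
    const total = this.hits + this.misses;
    return {
      size: this.entries.size,
      maxSize: this.maxSize,
      hitRate: total === 0 ? 0 : this.hits / total,
    };
  }
}

const lru = new LruCache<Float32Array>(2);
lru.set('a', new Float32Array([1]));
lru.set('b', new Float32Array([2]));
lru.get('a');                        // hit; 'a' becomes most recent
lru.set('c', new Float32Array([3])); // evicts 'b', the least recently used
console.log(lru.getStats());         // { size: 2, maxSize: 2, hitRate: 1 }
```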

Event System

import type { EmbeddingEventListener } from '@claude-flow/embeddings';

// Keep a reference to the listener so it can be removed later
const listener: EmbeddingEventListener = (event) => {
  switch (event.type) {
    case 'embed_start':
      console.log(`Starting: ${event.text.substring(0, 50)}...`);
      break;
    case 'embed_complete':
      console.log(`Completed in ${event.latencyMs}ms`);
      break;
    case 'embed_error':
      console.error(`Error: ${event.error}`);
      break;
    case 'cache_hit':
      console.log('Cache hit!');
      break;
    case 'batch_start':
      console.log(`Batch of ${event.count} started`);
      break;
    case 'batch_complete':
      console.log(`Batch completed in ${event.latencyMs}ms`);
      break;
    case 'cache_eviction':
      console.log(`Cache evicted ${event.size} entries`);
      break;
  }
};

// Subscribe to embedding events
service.addEventListener(listener);

// Remove listener
service.removeEventListener(listener);

Provider Comparison

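| Provider | Latency | Quality | Cost | Offline |
|---|---|---|---|---|
| Agentic-Flow | ~3ms | Good | Free | Yes |
| OpenAI | ~50-100ms | Excellent | $0.02-0.13/1M tokens | No |
| Transformers.js | ~230ms | Good | Free | Yes |
| Mock | <1ms | N/A | Free | Yes |

Agentic-Flow Models

| Model | Dimensions | Speed | Best For |
|---|---|---|---|
| default | 384 | 3ms | General purpose, fastest |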

OpenAI Models

| Model | Dimensions | Max Tokens | Best For |
|---|---|---|---|
| text-embedding-3-small | 1536 | 8191 | General purpose, cost-effective |
| text-embedding-3-large | 3072 | 8191 | Highest quality |
| text-embedding-ada-002 | 1536 | 8191 | Legacy support |

Transformers.js Models

| Model | Dimensions | Size | Best For |
|---|---|---|---|
| Xenova/all-MiniLM-L6-v2 | 384 | 23MB | Fast, general purpose |
| Xenova/all-mpnet-base-v2 | 768 | 110MB | Higher quality |
| Xenova/bge-small-en-v1.5 | 384 | 33MB | Retrieval optimized |

TypeScript Types

import type {
  // Provider types
  EmbeddingProvider,
  EmbeddingConfig,
  OpenAIEmbeddingConfig,
  TransformersEmbeddingConfig,
  AgenticFlowEmbeddingConfig,
  MockEmbeddingConfig,
  AutoEmbeddingConfig,

  // Result types
  EmbeddingResult,
  BatchEmbeddingResult,

  // Service interface
  IEmbeddingService,

  // Event types
  EmbeddingEvent,
  EmbeddingEventListener,

  // Similarity types
  SimilarityMetric,
  SimilarityResult,
} from '@claude-flow/embeddings';

Environment Variables

# OpenAI configuration
OPENAI_API_KEY=sk-...

# Optional: Custom base URL (for Azure OpenAI, etc.)
OPENAI_BASE_URL=https://your-endpoint.openai.azure.com/

Error Handling

try {
  const result = await service.embed('Your text');
} catch (error) {
  if (error instanceof Error) {
    if (error.message.includes('API error')) {
      // Handle API errors (rate limits, auth, etc.)
    } else if (error.message.includes('Failed to initialize')) {
      // Handle model loading errors (Transformers.js)
    }
  }
}

Integration with Memory Module

import { createEmbeddingService } from '@claude-flow/embeddings';
import { HNSWIndex } from '@claude-flow/memory';

// Create embedding service
const embeddings = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
});

// Create HNSW index
const index = new HNSWIndex({
  dimensions: 1536,
  metric: 'cosine',
});

// Index documents
const documents = ['Doc 1 content', 'Doc 2 content', 'Doc 3 content'];
const { embeddings: vectors } = await embeddings.embedBatch(documents);

vectors.forEach((vector, i) => {
  index.addPoint(`doc-${i}`, new Float32Array(vector));
});

// Search
const queryResult = await embeddings.embed('Search query');
const results = await index.search(new Float32Array(queryResult.embedding), 5);
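
An HNSW index returns approximate nearest neighbours. When checking its results on a small corpus, an exact brute-force scan makes a useful baseline. The sketch below is such a baseline under stated assumptions: `bruteForceSearch`, `cosineSim`, and `SearchHit` are hypothetical names, not part of @claude-flow/memory, and the result shape of the real `index.search` is not documented here.

```typescript
// Exact baseline for illustration; hypothetical helper, not part of @claude-flow/memory.
interface SearchHit {
  id: string;
  score: number;
}

function cosineSim(a: Float32Array, b: Float32Array): number {
  let dotProd = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProd += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dotProd / denom;
}

function bruteForceSearch(
  points: Map<string, Float32Array>,
  query: Float32Array,
  k: number,
): SearchHit[] {
  const hits: SearchHit[] = [];
  points.forEach((vector, id) => {
    hits.push({ id, score: cosineSim(query, vector) });
  });
  // Highest similarity first, then keep the top k.
  return hits.sort((p, q) => q.score - p.score).slice(0, k);
}

const points = new Map<string, Float32Array>([
  ['doc-0', new Float32Array([1, 0])],
  ['doc-1', new Float32Array([0.8, 0.6])],
  ['doc-2', new Float32Array([0, 1])],
]);

const top2 = bruteForceSearch(points, new Float32Array([1, 0.1]), 2);
console.log(top2.map((h) => h.id)); // ['doc-0', 'doc-1']
```

The scan costs O(n·d) per query, which is the cost an approximate index trades a little recall to avoid.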

Document Chunking

Split long documents into overlapping chunks for embedding:

import { chunkText, estimateTokens, reconstructFromChunks } from '@claude-flow/embeddings';

// Chunk by sentence (default)
const result = chunkText(longDocument, {
  maxChunkSize: 512,
  overlap: 50,
  strategy: 'sentence',  // 'character' | 'sentence' | 'paragraph' | 'token'
  minChunkSize: 100,
});

console.log('Chunks:', result.totalChunks);
result.chunks.forEach((chunk, i) => {
  console.log(`Chunk ${i}: ${chunk.length} chars, ~${chunk.tokenCount} tokens`);
});

// Estimate tokens
const tokens = estimateTokens('Hello world');  // ~3 tokens

// Reconstruct (approximate)
const reconstructed = reconstructFromChunks(result.chunks);
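
The interaction between `maxChunkSize` and `overlap` is easiest to see with the character strategy: the window advances by `maxChunkSize - overlap` each step. The sketch below shows only that mechanic. `chunkByCharacters` is a hypothetical helper, and the package's `chunkText` additionally handles sentence, paragraph, and token boundaries and `minChunkSize`.

```typescript
// Hypothetical helper showing only a fixed-width sliding window over characters.
function chunkByCharacters(text: string, maxChunkSize: number, overlap: number): string[] {
  if (maxChunkSize <= 0) throw new Error('maxChunkSize must be positive');
  if (overlap < 0 || overlap >= maxChunkSize) {
    throw new Error('overlap must be in [0, maxChunkSize)');
  }
  const chunks: string[] = [];
  const step = maxChunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

const pieces = chunkByCharacters('abcdefghij', 4, 1);
console.log(pieces); // ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last `overlap` characters of its predecessor, so a sentence split at a boundary still appears whole in at least one chunk when the overlap is large enough.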

Normalization

Normalize embeddings for consistent similarity computation:

import {
  l2Normalize,    // Unit vector (Euclidean norm = 1)
  l1Normalize,    // Manhattan norm = 1
  minMaxNormalize, // Values in [0, 1]
  zScoreNormalize, // Mean 0, std 1
  normalize,       // Generic with type option
  l2Norm,
  isNormalized,
} from '@claude-flow/embeddings';

const embedding = new Float32Array([3, 4, 0]);

// L2 normalize (most common for cosine similarity)
const l2 = l2Normalize(embedding);  // [0.6, 0.8, 0]
console.log('L2 norm:', l2Norm(l2));  // 1.0

// Check if already normalized
console.log(isNormalized(l2));  // true
console.log(isNormalized(embedding));  // false

// Generic normalize with type
const normalized = normalize(embedding, { type: 'l2' });
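
The four normalization types differ only in what they divide by or shift by. As a rough guide, they can be defined as below. These are standalone definitions with hypothetical names (`l2`, `l1`, `minMax`, `zScore`); the z-score here assumes the population standard deviation, and the package's functions may treat edge cases such as zero vectors differently.

```typescript
// Illustrative definitions; not the package's implementations.
function l2(v: Float32Array): Float32Array {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

function l1(v: Float32Array): Float32Array {
  const norm = v.reduce((s, x) => s + Math.abs(x), 0);
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

function minMax(v: Float32Array): Float32Array {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < v.length; i++) {
    min = Math.min(min, v[i]);
    max = Math.max(max, v[i]);
  }
  const range = max - min;
  // A constant vector has no spread; map it to all zeros.
  return range === 0 ? new Float32Array(v.length) : v.map((x) => (x - min) / range);
}

function zScore(v: Float32Array): Float32Array {
  const mean = v.reduce((s, x) => s + x, 0) / v.length;
  const variance = v.reduce((s, x) => s + (x - mean) * (x - mean), 0) / v.length;
  const std = Math.sqrt(variance); // population standard deviation (an assumption)
  return std === 0 ? new Float32Array(v.length) : v.map((x) => (x - mean) / std);
}

const sample = new Float32Array([3, 4, 0]);
console.log(l2(sample));     // approximately [0.6, 0.8, 0]
console.log(l1(sample));     // approximately [0.4286, 0.5714, 0]
console.log(minMax(sample)); // [0.75, 1, 0]
console.log(zScore(sample)); // mean 0, standard deviation 1
```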

Hyperbolic Embeddings (Poincaré Ball)

Transform embeddings to hyperbolic space for better hierarchical representation:

import {
  euclideanToPoincare,
  poincareToEuclidean,
  hyperbolicDistance,
  mobiusAdd,
  isInPoincareBall,
  batchEuclideanToPoincare,
  hyperbolicCentroid,
} from '@claude-flow/embeddings';

// Convert Euclidean embedding to Poincaré ball
const euclidean = new Float32Array([0.5, 0.3, 0.2]);
const poincare = euclideanToPoincare(euclidean);

// Check if point is in the ball
console.log(isInPoincareBall(poincare));  // true

// Round-trip conversion
const back = poincareToEuclidean(poincare);

// Hyperbolic distance (geodesic in Poincaré ball)
const a = euclideanToPoincare(new Float32Array([0.1, 0.2, 0.1]));
const b = euclideanToPoincare(new Float32Array([0.3, 0.1, 0.2]));
const dist = hyperbolicDistance(a, b);

// Möbius addition (hyperbolic "plus")
const sum = mobiusAdd(a, b);

// Batch conversion
const embeddings = [
  new Float32Array([0.5, 0.3, 0.2]),
  new Float32Array([0.1, 0.2, 0.1]),
  new Float32Array([0.3, 0.1, 0.2]),
];
const hyperbolic = batchEuclideanToPoincare(embeddings);

// Hyperbolic centroid (Fréchet mean)
const centroid = hyperbolicCentroid(hyperbolic);

Why Hyperbolic?

Hyperbolic space has natural properties for representing hierarchical data:

  • Exponential growth - Tree-like structures fit naturally
  • Better hierarchy - Parent-child relationships preserved
  • Lower distortion - Taxonomies represented with less error
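
The "exponential growth" property can be seen directly from the Poincaré distance formula, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The sketch below evaluates it for curvature −1. `poincareDistance` and `sqNorm` are hypothetical names for this example, and the package's `hyperbolicDistance` may accept a curvature option or apply numerical safeguards not shown here.

```typescript
// Sketch for curvature -1; inputs must lie strictly inside the unit ball.
function sqNorm(v: Float32Array): number {
  let s = 0;
  for (let i = 0; i < v.length; i++) s += v[i] * v[i];
  return s;
}

function poincareDistance(u: Float32Array, v: Float32Array): number {
  let diffSq = 0;
  for (let i = 0; i < u.length; i++) {
    const d = u[i] - v[i];
    diffSq += d * d;
  }
  const denom = (1 - sqNorm(u)) * (1 - sqNorm(v));
  return Math.acosh(1 + (2 * diffSq) / denom);
}

// Two pairs of points, each pair 0.1 apart in Euclidean terms.
const nearOrigin = poincareDistance(
  new Float32Array([0, 0]),
  new Float32Array([0.1, 0]),
);
const nearEdge = poincareDistance(
  new Float32Array([0.85, 0]),
  new Float32Array([0.95, 0]),
);

console.log(nearOrigin.toFixed(3)); // about 0.201
console.log(nearEdge.toFixed(3));   // about 1.151
```

The same Euclidean gap is several times longer near the boundary, which is what gives deep levels of a tree room to spread out.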

Neural Substrate Integration

Access agentic-flow's neural features for advanced embedding operations:

import {
  NeuralEmbeddingService,
  createNeuralService,
  isNeuralAvailable,
  listEmbeddingModels,
  downloadEmbeddingModel,
} from '@claude-flow/embeddings';

// Check if neural features are available
const available = await isNeuralAvailable();

// Create neural service
const neural = createNeuralService({ dimension: 384 });
await neural.init();

if (neural.isAvailable()) {
  // Semantic drift detection
  await neural.setDriftBaseline('Initial context about the topic');
  const drift = await neural.detectDrift('New input to check for drift');
  console.log('Drift:', drift?.trend);  // 'stable' | 'drifting' | 'accelerating'

  // Memory with interference detection
  const stored = await neural.storeMemory('mem-1', 'Important information');
  console.log('Interference:', stored?.interference);

  // Recall by similarity
  const memories = await neural.recallMemories('query', 5);

  // Swarm coordination
  await neural.addSwarmAgent('agent-1', 'researcher');
  const coordination = await neural.coordinateSwarm('Analyze this task');

  // Coherence checking
  await neural.calibrateCoherence(['good output 1', 'good output 2']);
  const coherence = await neural.checkCoherence('Output to check');

  // Health status
  const health = neural.health();
  console.log('Memory count:', health?.memoryCount);
}

// List available ONNX models
const models = await listEmbeddingModels();
console.log(models);
// [{ id: 'all-MiniLM-L6-v2', dimension: 384, size: '23MB', ... }]

// Download model
const path = await downloadEmbeddingModel('all-MiniLM-L6-v2', '.models');

Persistent Disk Cache

SQLite-backed persistent cache for embeddings:

import { PersistentEmbeddingCache, isPersistentCacheAvailable } from '@claude-flow/embeddings';

// Check if SQLite is available
const hasSQLite = await isPersistentCacheAvailable();

// Create persistent cache
const cache = new PersistentEmbeddingCache({
  dbPath: './embeddings.db',  // SQLite database path
  maxSize: 10000,             // Max entries before LRU eviction
  ttlMs: 7 * 24 * 60 * 60 * 1000,  // 7 day TTL
});

// Initialize
await cache.init();

// Store embedding
await cache.set('my text', new Float32Array([0.1, 0.2, 0.3]));

// Retrieve
const embedding = await cache.get('my text');

// Get stats
const stats = await cache.getStats();
console.log('Cache stats:', {
  size: stats.totalEntries,
  hitRate: stats.hitRate,
  avgLatency: stats.avgLatencyMs,
});

// Close when done
await cache.close();
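
The `ttlMs` option means an entry stops being served once it is older than the configured lifetime. A minimal in-memory sketch of that check follows. `TtlCache` is a hypothetical stand-in (the real cache persists to SQLite), expiry here is lazy on read, and the injectable clock is an assumption added to make the behaviour testable.

```typescript
// In-memory stand-in for illustration; not the SQLite-backed implementation.
interface TtlEntry {
  value: Float32Array;
  storedAt: number; // epoch milliseconds
}

class TtlCache {
  private readonly entries = new Map<string, TtlEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: Float32Array): void {
    this.entries.set(key, { value, storedAt: this.now() });
  }

  get(key: string): Float32Array | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired: drop lazily on read
      return undefined;
    }
    return entry.value;
  }
}

let fakeNow = 0;
const sevenDays = 7 * 24 * 60 * 60 * 1000; // 604,800,000 ms
const ttlCache = new TtlCache(sevenDays, () => fakeNow);

ttlCache.set('my text', new Float32Array([0.1, 0.2, 0.3]));
fakeNow = sevenDays;     // exactly at the TTL: still valid
const stillFresh = ttlCache.get('my text');
fakeNow = sevenDays + 1; // one millisecond past: expired
const expired = ttlCache.get('my text');

console.log(stillFresh !== undefined, expired === undefined); // true true
```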

Enable in Embedding Service

const service = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  persistentCache: {
    enabled: true,
    dbPath: './cache/embeddings.db',
    maxSize: 50000,
    ttlMs: 30 * 24 * 60 * 60 * 1000,  // 30 days
  },
  normalization: 'l2',  // Auto-normalize embeddings
});

CLI Commands (New)

# Document chunking
claude-flow embeddings chunk document.txt --strategy sentence --max-size 512

# Normalize embedding file
claude-flow embeddings normalize embeddings.json --type l2 -o normalized.json

# Convert to hyperbolic
claude-flow embeddings hyperbolic embeddings.json -o poincare.json

# Neural operations
claude-flow embeddings neural drift --baseline "context" --input "check this"
claude-flow embeddings neural store --id mem-1 --content "data"
claude-flow embeddings neural recall "query" --top-k 5

# List/download models
claude-flow embeddings models list
claude-flow embeddings models download all-MiniLM-L6-v2

# Cache management
claude-flow embeddings cache stats
claude-flow embeddings cache clear --older-than 7d

License

MIT

Keywords

embeddings

Package last updated on 18 May 2026
