Big News: Socket raises $60M Series C at a $1B valuation to secure software supply chains for AI-driven development.Announcement →

@claude-flow/embeddings

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

@claude-flow/embeddings

V3 Embedding Service - OpenAI, Transformers.js, Agentic-Flow (ONNX), Mock providers with hyperbolic embeddings, normalization, and chunking

alpha

latest

v3alpha

npm

Version: 3.0.0-alpha.45

Version published: 3 weeks ago

Maintainers: 1

Created: 5 months ago

Source

@claude-flow/embeddings

High-performance embedding generation module for Claude Flow V3 - multi-provider support with persistent caching, document chunking, normalization, hyperbolic embeddings, and neural substrate integration.

SOTA at a glance (ADR-121)

Measured 2026-05-17 via scripts/benchmark-embeddings-sota.mjs, gated in CI as embeddings-sota-benchmark:

Capability	Measured	Acceptance gate
Install footprint (`provider: 'ruvector-onnx'`)	7.13 MB	≥30% smaller than Xenova → measured 97.1% (34.7× smaller)
int8 quantization recall (`SearchableEmbeddingCache`, `quantize: 'int8'`)	1.0000 mean	≥0.98 mean recall on unit-normalized vectors ✅
RaBitQ memory reduction (`RabitqSnapshot`)	32× (theoretical, 10k×384)	≥32× ✅
Hyperbolic Poincaré ops (`@ruvector/attention`)	~1 μs/op (native NAPI)	Backend reachable + finite output ✅
DiskANN/Vamana (`DiskannSnapshot`, billion-scale on-SSD)	200 vec build in 203 ms · 0.1 ms/query	Search latency <10 ms/query on 200-vec corpus ✅
ruvector MCP sidecar probe	83 tools / 15 groups (when installed)	Probe returns structured report without throwing ✅
Xenova provider deprecation (`provider: 'transformers'`)	One-shot warning per process	`CLAUDE_FLOW_EMBEDDINGS_XENOVA_DEPRECATED` code present ✅

See ADR-121 for the per-phase rationale and #2036 for the iteration log.

Features

Core Embedding

Multiple Providers - Agentic-Flow (ONNX), OpenAI, Transformers.js, and Mock
Auto-Install - Automatically installs agentic-flow when using provider: 'auto'
Smart Fallback - Graceful fallback chain: agentic-flow → transformers → mock
LRU + Disk Caching - In-memory LRU + SQLite persistent cache with TTL
Batch Processing - Efficient batch embedding with partial cache hits
Similarity Functions - Cosine, Euclidean, and dot product metrics
75x Faster - Agentic-flow ONNX is 75x faster than Transformers.js

Advanced Features (New in v3.0.0-alpha.11)

Document Chunking - Character, sentence, paragraph, and token-based chunking with overlap
Multiple Normalization - L2, L1, min-max, and z-score normalization
Hyperbolic Embeddings - Poincaré ball model for hierarchical representations
Neural Substrate - Semantic drift detection, memory physics, swarm coordination
Persistent Cache - SQLite-backed disk cache with LRU eviction and TTL

Installation

npm install @claude-flow/embeddings

Standalone use (without the Ruflo CLI)

This package runs on its own — no @claude-flow/cli, no MCP server, no API key required for the recipes below (the MockEmbeddingService gives deterministic 384-dim vectors so examples are reproducible). Swap to OpenAIEmbeddingService / TransformersEmbeddingService / AgenticFlowEmbeddingService when you're ready for real embeddings.

Recipe — Embed + compare without a network call

// recipe.mjs
import {
  MockEmbeddingService,
  cosineSimilarity,
} from '@claude-flow/embeddings';

const service = new MockEmbeddingService({ dimension: 384 });

// `embed()` returns { embedding: Float32Array, tokens, ... } —
// destructure for the raw vector.
const [{ embedding: cat }, { embedding: dog }, { embedding: car }] =
  await Promise.all([
    service.embed('a friendly cat'),
    service.embed('a friendly dog'),
    service.embed('a red sports car'),
  ]);

console.log('cat ↔ dog:', cosineSimilarity(cat, dog).toFixed(4));
console.log('cat ↔ car:', cosineSimilarity(cat, car).toFixed(4));
// With real embeddings the pet pair will score higher than the cat/car pair.

Provider swap — same surface, real vectors

import { createEmbeddingService, cosineSimilarity } from '@claude-flow/embeddings';

// OpenAI
const openai = createEmbeddingService({
  provider: 'openai',
  model: 'text-embedding-3-small',
  apiKey: process.env.OPENAI_API_KEY!,
});

// Or local ONNX via @huggingface/transformers (no API key needed)
const local = createEmbeddingService({
  provider: 'transformers',
  model: 'Xenova/all-MiniLM-L6-v2', // 384-dim, ~25MB
});

await openai.initialize();
const v = await openai.embed('hello world');

cosineSimilarity, euclideanDistance, dotProduct, and computeSimilarity are exported as plain functions — useful when you already have vectors from another source and just need the math.

Quick Start

import { createEmbeddingService, cosineSimilarity } from '@claude-flow/embeddings';

// Create embedding service
const service = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
  dimensions: 1536,
});

// Generate embedding
const result = await service.embed('Hello, world!');
console.log(`Embedding dimension: ${result.embedding.length}`);
console.log(`Latency: ${result.latencyMs}ms`);

// Batch embed
const batchResult = await service.embedBatch([
  'First document',
  'Second document',
  'Third document',
]);
console.log(`Processed ${batchResult.embeddings.length} embeddings`);
console.log(`Cache hits: ${batchResult.cacheStats?.hits}`);

// Calculate similarity
const similarity = cosineSimilarity(
  batchResult.embeddings[0],
  batchResult.embeddings[1]
);
console.log(`Similarity: ${similarity.toFixed(4)}`);

CLI Usage

# Generate embedding from CLI
claude-flow embeddings embed "Your text here"

# Batch embed from file
claude-flow embeddings batch documents.txt -o embeddings.json

# Similarity search
claude-flow embeddings search "query" --index ./vectors

# Initialize agentic-flow model
claude-flow embeddings init --provider agentic-flow

API Reference

Factory Functions

import {
  createEmbeddingService,
  createEmbeddingServiceAsync,
  getEmbedding
} from '@claude-flow/embeddings';

// Sync: Create with known provider
const service = createEmbeddingService({
  provider: 'openai',
  apiKey: 'your-api-key',
  model: 'text-embedding-3-small',
});

// Async: Auto-select best provider with fallback
const autoService = await createEmbeddingServiceAsync({
  provider: 'auto',       // agentic-flow → transformers → mock
  autoInstall: true,      // Install agentic-flow if missing
  fallback: 'transformers', // Custom fallback
});

// Quick one-off embedding
const embedding = await getEmbedding('Hello world', {
  provider: 'mock',
  dimensions: 384,
});

OpenAI Provider

import { OpenAIEmbeddingService } from '@claude-flow/embeddings';

const service = new OpenAIEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',  // or 'text-embedding-3-large'
  dimensions: 1536,                  // Adjustable for v3 models
  baseURL: 'https://api.openai.com/v1/embeddings',
  timeout: 30000,
  maxRetries: 3,
  cacheSize: 1000,
});

const result = await service.embed('Your text here');
console.log('Tokens used:', result.usage?.totalTokens);

Agentic-Flow Provider (Fastest)

import { AgenticFlowEmbeddingService } from '@claude-flow/embeddings';

const service = new AgenticFlowEmbeddingService({
  provider: 'agentic-flow',
  modelId: 'default',     // Uses optimized ONNX model
  cacheSize: 256,
});

// 75x faster than Transformers.js (3ms vs 233ms)
const result = await service.embed('Your text here');
console.log(`ONNX embedding in ${result.latencyMs}ms`);

Transformers.js Provider (Local)

import { TransformersEmbeddingService } from '@claude-flow/embeddings';

const service = new TransformersEmbeddingService({
  provider: 'transformers',
  model: 'Xenova/all-MiniLM-L6-v2',  // Any HuggingFace model
  cacheSize: 1000,
});

// First call downloads the model
const result = await service.embed('Your text here');
console.log(`Local embedding generated in ${result.latencyMs}ms`);

Mock Provider (Testing)

import { MockEmbeddingService } from '@claude-flow/embeddings';

const service = new MockEmbeddingService({
  provider: 'mock',
  dimensions: 384,
  simulatedLatency: 10,  // Optional delay
  cacheSize: 100,
});

// Deterministic embeddings based on text hash
const result = await service.embed('Your text here');

Batch Processing

const result = await service.embedBatch([
  'Document 1: Introduction to machine learning',
  'Document 2: Deep learning fundamentals',
  'Document 3: Natural language processing',
  'Document 4: Computer vision basics',
]);

console.log('Batch Results:', {
  count: result.embeddings.length,
  totalLatency: `${result.totalLatencyMs}ms`,
  avgLatency: `${result.avgLatencyMs}ms`,
  cacheHits: result.cacheStats?.hits,
  cacheMisses: result.cacheStats?.misses,
  tokensUsed: result.usage?.totalTokens,
});

Similarity Functions

import {
  cosineSimilarity,
  euclideanDistance,
  dotProduct,
  computeSimilarity,
} from '@claude-flow/embeddings';

// Cosine similarity (most common for embeddings)
const cosine = cosineSimilarity(embedding1, embedding2);
// Returns: 0.0 to 1.0 (higher = more similar)

// Euclidean distance
const distance = euclideanDistance(embedding1, embedding2);
// Returns: 0.0 to infinity (lower = more similar)

// Dot product
const dot = dotProduct(embedding1, embedding2);
// Returns: unbounded (higher = more similar for normalized vectors)

// Generic similarity with metric selection
const result = computeSimilarity(embedding1, embedding2, 'cosine');
// Returns: { score: 0.95, metric: 'cosine' }

Cache Management

// Get cache statistics
const stats = service.getCacheStats();
console.log('Cache Stats:', {
  size: stats.size,
  maxSize: stats.maxSize,
  hitRate: `${(stats.hitRate * 100).toFixed(1)}%`,
});

// Clear cache
service.clearCache();

// Shutdown service
await service.shutdown();

Event System

// Subscribe to embedding events
service.addEventListener((event) => {
  switch (event.type) {
    case 'embed_start':
      console.log(`Starting: ${event.text.substring(0, 50)}...`);
      break;
    case 'embed_complete':
      console.log(`Completed in ${event.latencyMs}ms`);
      break;
    case 'embed_error':
      console.error(`Error: ${event.error}`);
      break;
    case 'cache_hit':
      console.log('Cache hit!');
      break;
    case 'batch_start':
      console.log(`Batch of ${event.count} started`);
      break;
    case 'batch_complete':
      console.log(`Batch completed in ${event.latencyMs}ms`);
      break;
    case 'cache_eviction':
      console.log(`Cache evicted ${event.size} entries`);
      break;
  }
});

// Remove listener
service.removeEventListener(listener);

Provider Comparison

Provider	Latency	Quality	Cost	Offline
Agentic-Flow	~3ms	Good	Free	Yes
OpenAI	~50-100ms	Excellent	$0.02-0.13/1M tokens	No
Transformers.js	~230ms	Good	Free	Yes
Mock	<1ms	N/A	Free	Yes

Agentic-Flow (Recommended)

Model	Dimensions	Speed	Best For
`default`	384	3ms	General purpose, fastest

OpenAI Models

Model	Dimensions	Max Tokens	Best For
`text-embedding-3-small`	1536	8191	General purpose, cost-effective
`text-embedding-3-large`	3072	8191	Highest quality
`text-embedding-ada-002`	1536	8191	Legacy support

Transformers.js Models

Model	Dimensions	Size	Best For
`Xenova/all-MiniLM-L6-v2`	384	23MB	Fast, general purpose
`Xenova/all-mpnet-base-v2`	768	110MB	Higher quality
`Xenova/bge-small-en-v1.5`	384	33MB	Retrieval optimized

TypeScript Types

import type {
  // Provider types
  EmbeddingProvider,
  EmbeddingConfig,
  OpenAIEmbeddingConfig,
  TransformersEmbeddingConfig,
  AgenticFlowEmbeddingConfig,
  MockEmbeddingConfig,
  AutoEmbeddingConfig,

  // Result types
  EmbeddingResult,
  BatchEmbeddingResult,

  // Service interface
  IEmbeddingService,

  // Event types
  EmbeddingEvent,
  EmbeddingEventListener,

  // Similarity types
  SimilarityMetric,
  SimilarityResult,
} from '@claude-flow/embeddings';

Environment Variables

# OpenAI configuration
OPENAI_API_KEY=sk-...

# Optional: Custom base URL (for Azure OpenAI, etc.)
OPENAI_BASE_URL=https://your-endpoint.openai.azure.com/

Error Handling

try {
  const result = await service.embed('Your text');
} catch (error) {
  if (error instanceof Error) {
    if (error.message.includes('API error')) {
      // Handle API errors (rate limits, auth, etc.)
    } else if (error.message.includes('Failed to initialize')) {
      // Handle model loading errors (Transformers.js)
    }
  }
}

Integration with Memory Module

import { createEmbeddingService } from '@claude-flow/embeddings';
import { HNSWIndex } from '@claude-flow/memory';

// Create embedding service
const embeddings = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
});

// Create HNSW index
const index = new HNSWIndex({
  dimensions: 1536,
  metric: 'cosine',
});

// Index documents
const documents = ['Doc 1 content', 'Doc 2 content', 'Doc 3 content'];
const { embeddings: vectors } = await embeddings.embedBatch(documents);

vectors.forEach((vector, i) => {
  index.addPoint(`doc-${i}`, new Float32Array(vector));
});

// Search
const queryResult = await embeddings.embed('Search query');
const results = await index.search(new Float32Array(queryResult.embedding), 5);

Document Chunking

Split long documents into overlapping chunks for embedding:

import { chunkText, estimateTokens, reconstructFromChunks } from '@claude-flow/embeddings';

// Chunk by sentence (default)
const result = chunkText(longDocument, {
  maxChunkSize: 512,
  overlap: 50,
  strategy: 'sentence',  // 'character' | 'sentence' | 'paragraph' | 'token'
  minChunkSize: 100,
});

console.log('Chunks:', result.totalChunks);
result.chunks.forEach((chunk, i) => {
  console.log(`Chunk ${i}: ${chunk.length} chars, ~${chunk.tokenCount} tokens`);
});

// Estimate tokens
const tokens = estimateTokens('Hello world');  // ~3 tokens

// Reconstruct (approximate)
const reconstructed = reconstructFromChunks(result.chunks);

Normalization

Normalize embeddings for consistent similarity computation:

import {
  l2Normalize,    // Unit vector (Euclidean norm = 1)
  l1Normalize,    // Manhattan norm = 1
  minMaxNormalize, // Values in [0, 1]
  zScoreNormalize, // Mean 0, std 1
  normalize,       // Generic with type option
  l2Norm,
  isNormalized,
} from '@claude-flow/embeddings';

const embedding = new Float32Array([3, 4, 0]);

// L2 normalize (most common for cosine similarity)
const l2 = l2Normalize(embedding);  // [0.6, 0.8, 0]
console.log('L2 norm:', l2Norm(l2));  // 1.0

// Check if already normalized
console.log(isNormalized(l2));  // true
console.log(isNormalized(embedding));  // false

// Generic normalize with type
const normalized = normalize(embedding, { type: 'l2' });

Hyperbolic Embeddings (Poincaré Ball)

Transform embeddings to hyperbolic space for better hierarchical representation:

import {
  euclideanToPoincare,
  poincareToEuclidean,
  hyperbolicDistance,
  mobiusAdd,
  isInPoincareBall,
  batchEuclideanToPoincare,
  hyperbolicCentroid,
} from '@claude-flow/embeddings';

// Convert Euclidean embedding to Poincaré ball
const euclidean = new Float32Array([0.5, 0.3, 0.2]);
const poincare = euclideanToPoincare(euclidean);

// Check if point is in the ball
console.log(isInPoincareBall(poincare));  // true

// Round-trip conversion
const back = poincareToEuclidean(poincare);

// Hyperbolic distance (geodesic in Poincaré ball)
const a = euclideanToPoincare(new Float32Array([0.1, 0.2, 0.1]));
const b = euclideanToPoincare(new Float32Array([0.3, 0.1, 0.2]));
const dist = hyperbolicDistance(a, b);

// Möbius addition (hyperbolic "plus")
const sum = mobiusAdd(a, b);

// Batch conversion
const embeddings = [vec1, vec2, vec3];
const hyperbolic = batchEuclideanToPoincare(embeddings);

// Hyperbolic centroid (Fréchet mean)
const centroid = hyperbolicCentroid(hyperbolic);

Why Hyperbolic?

Hyperbolic space has natural properties for representing hierarchical data:

Exponential growth - Tree-like structures fit naturally
Better hierarchy - Parent-child relationships preserved
Lower distortion - Taxonomies represented with less error

Neural Substrate Integration

Access agentic-flow's neural features for advanced embedding operations:

import {
  NeuralEmbeddingService,
  createNeuralService,
  isNeuralAvailable,
  listEmbeddingModels,
  downloadEmbeddingModel,
} from '@claude-flow/embeddings';

// Check if neural features are available
const available = await isNeuralAvailable();

// Create neural service
const neural = createNeuralService({ dimension: 384 });
await neural.init();

if (neural.isAvailable()) {
  // Semantic drift detection
  await neural.setDriftBaseline('Initial context about the topic');
  const drift = await neural.detectDrift('New input to check for drift');
  console.log('Drift:', drift?.trend);  // 'stable' | 'drifting' | 'accelerating'

  // Memory with interference detection
  const stored = await neural.storeMemory('mem-1', 'Important information');
  console.log('Interference:', stored?.interference);

  // Recall by similarity
  const memories = await neural.recallMemories('query', 5);

  // Swarm coordination
  await neural.addSwarmAgent('agent-1', 'researcher');
  const coordination = await neural.coordinateSwarm('Analyze this task');

  // Coherence checking
  await neural.calibrateCoherence(['good output 1', 'good output 2']);
  const coherence = await neural.checkCoherence('Output to check');

  // Health status
  const health = neural.health();
  console.log('Memory count:', health?.memoryCount);
}

// List available ONNX models
const models = await listEmbeddingModels();
console.log(models);
// [{ id: 'all-MiniLM-L6-v2', dimension: 384, size: '23MB', ... }]

// Download model
const path = await downloadEmbeddingModel('all-MiniLM-L6-v2', '.models');

Persistent Disk Cache

SQLite-backed persistent cache for embeddings:

import { PersistentEmbeddingCache, isPersistentCacheAvailable } from '@claude-flow/embeddings';

// Check if SQLite is available
const hasSQLite = await isPersistentCacheAvailable();

// Create persistent cache
const cache = new PersistentEmbeddingCache({
  dbPath: './embeddings.db',  // SQLite database path
  maxSize: 10000,             // Max entries before LRU eviction
  ttlMs: 7 * 24 * 60 * 60 * 1000,  // 7 day TTL
});

// Initialize
await cache.init();

// Store embedding
await cache.set('my text', new Float32Array([0.1, 0.2, 0.3]));

// Retrieve
const embedding = await cache.get('my text');

// Get stats
const stats = await cache.getStats();
console.log('Cache stats:', {
  size: stats.totalEntries,
  hitRate: stats.hitRate,
  avgLatency: stats.avgLatencyMs,
});

// Close when done
await cache.close();

Enable in Embedding Service

const service = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  persistentCache: {
    enabled: true,
    dbPath: './cache/embeddings.db',
    maxSize: 50000,
    ttlMs: 30 * 24 * 60 * 60 * 1000,  // 30 days
  },
  normalization: 'l2',  // Auto-normalize embeddings
});

CLI Commands (New)

# Document chunking
claude-flow embeddings chunk document.txt --strategy sentence --max-size 512

# Normalize embedding file
claude-flow embeddings normalize embeddings.json --type l2 -o normalized.json

# Convert to hyperbolic
claude-flow embeddings hyperbolic embeddings.json -o poincare.json

# Neural operations
claude-flow embeddings neural drift --baseline "context" --input "check this"
claude-flow embeddings neural store --id mem-1 --content "data"
claude-flow embeddings neural recall "query" --top-k 5

# List/download models
claude-flow embeddings models list
claude-flow embeddings models download all-MiniLM-L6-v2

# Cache management
claude-flow embeddings cache stats
claude-flow embeddings cache clear --older-than 7d

@claude-flow/memory - HNSW indexing and vector storage
@claude-flow/providers - Multi-LLM provider system
@claude-flow/neural - SONA learning integration

License

MIT

Keywords

FAQs

What is @claude-flow/embeddings?

Is @claude-flow/embeddings well maintained?

Package last updated on 18 May 2026

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

@claude-flow/embeddings

@claude-flow/embeddings

SOTA at a glance (ADR-121)

Features

Core Embedding

Advanced Features (New in v3.0.0-alpha.11)

Installation

Standalone use (without the Ruflo CLI)

Recipe — Embed + compare without a network call

Provider swap — same surface, real vectors

Quick Start

CLI Usage

API Reference

Factory Functions

OpenAI Provider

Agentic-Flow Provider (Fastest)

Transformers.js Provider (Local)

Mock Provider (Testing)

Batch Processing

Similarity Functions

Cache Management

Event System

Provider Comparison

Agentic-Flow (Recommended)

OpenAI Models

Transformers.js Models

TypeScript Types

Environment Variables

Error Handling

Integration with Memory Module

Document Chunking

Normalization

Hyperbolic Embeddings (Poincaré Ball)

Why Hyperbolic?

Neural Substrate Integration

Persistent Disk Cache

Enable in Embedding Service

CLI Commands (New)

Related Packages

License

Keywords

Related posts

RubyGems Adds Cooldown Feature to Bundler for Newly Published Gems

pnpm 11.5 Adds Support for Recognizing npm Staged Publishes