@ruvector/wasm
+22
-33
| { | ||
| "name": "@ruvector/wasm", | ||
| "version": "0.1.16", | ||
| "description": "High-performance Rust vector database for browsers via WASM", | ||
| "main": "pkg/ruvector_wasm.js", | ||
| "types": "pkg/ruvector_wasm.d.ts", | ||
| "files": [ | ||
| "pkg", | ||
| "src/worker.js", | ||
| "src/worker-pool.js", | ||
| "src/indexeddb.js" | ||
| ], | ||
| "version": "0.1.22", | ||
| "description": "WebAssembly bindings for RuVector vector database", | ||
| "main": "dist/index.js", | ||
| "types": "dist/index.d.ts", | ||
| "scripts": { | ||
| "build": "npm run build:web && npm run build:simd && npm run build:bundler", | ||
| "build:web": "wasm-pack build --target web --out-dir pkg --release", | ||
| "build:simd": "wasm-pack build --target web --out-dir pkg-simd --release -- --features simd", | ||
| "build:node": "wasm-pack build --target nodejs --out-dir pkg-node --release", | ||
| "build:bundler": "wasm-pack build --target bundler --out-dir pkg-bundler --release", | ||
| "build:all": "npm run build && npm run build:node && npm run build:bundler", | ||
| "test": "wasm-pack test --headless --chrome", | ||
| "test:firefox": "wasm-pack test --headless --firefox", | ||
| "test:node": "wasm-pack test --node", | ||
| "size": "npm run build && gzip -c pkg/ruvector_wasm_bg.wasm | wc -c && echo 'bytes (gzipped)'", | ||
| "optimize": "npm run build && wasm-opt -Oz pkg/ruvector_wasm_bg.wasm -o pkg/ruvector_wasm_bg.wasm", | ||
| "serve": "python3 -m http.server 8080" | ||
| "build": "tsc -b", | ||
| "build:wasm": "cd ../../../crates/ruvector-wasm && wasm-pack build --target nodejs --out-dir ../../npm/packages/wasm/wasm-pkg", | ||
| "clean": "rm -rf dist *.tsbuildinfo wasm-pkg", | ||
| "test": "echo \"Tests not yet implemented\"", | ||
| "typecheck": "tsc --noEmit", | ||
| "lint": "eslint src --ext .ts" | ||
| }, | ||
@@ -30,18 +18,19 @@ "keywords": [ | ||
| "database", | ||
| "embeddings", | ||
| "wasm", | ||
| "browser", | ||
| "rust", | ||
| "simd", | ||
| "web-workers", | ||
| "indexeddb" | ||
| "webassembly", | ||
| "embeddings" | ||
| ], | ||
| "author": "", | ||
| "license": "MIT", | ||
| "repository": { | ||
| "type": "git", | ||
| "url": "https://github.com/ruvnet/ruvector.git" | ||
| "files": [ | ||
| "dist", | ||
| "wasm-pkg", | ||
| "README.md" | ||
| ], | ||
| "publishConfig": { | ||
| "access": "public" | ||
| }, | ||
| "devDependencies": { | ||
| "wasm-pack": "^0.12.1" | ||
| "dependencies": { | ||
| "@ruvector/core": "^0.1.0" | ||
| } | ||
| } |
-969
| # Ruvector WASM | ||
| **High-performance vector database running entirely in your browser via WebAssembly.** | ||
| > Bring **sub-millisecond vector search** to the edge with **offline-first** capabilities. Perfect for AI applications, semantic search, and recommendation engines that run completely client-side. Built by [rUv](https://ruv.io) with Rust and WebAssembly. | ||
| ## 🌟 Why Ruvector WASM? | ||
| In the age of privacy-first, offline-capable web applications, running AI workloads **entirely in the browser** is no longer optional—it's essential. | ||
| **Ruvector WASM brings enterprise-grade vector search to the browser:** | ||
| - ⚡ **Blazing Fast**: <1ms query latency with HNSW indexing and SIMD acceleration | ||
| - 🔒 **Privacy First**: All data stays in the browser—zero server round-trips | ||
| - 📴 **Offline Capable**: Full functionality without internet via IndexedDB persistence | ||
| - 🌐 **Edge Computing**: Deploy to CDNs for ultra-low latency globally | ||
| - 💾 **Persistent Storage**: IndexedDB integration with automatic synchronization | ||
| - 🧵 **Multi-threaded**: Web Workers support for parallel processing | ||
| - 📦 **Compact**: <400KB gzipped with optimizations | ||
| - 🎯 **Zero Dependencies**: Pure Rust compiled to WebAssembly | ||
| ## 🚀 Features | ||
| ### Core Capabilities | ||
| - **Complete VectorDB API**: Insert, search, delete, batch operations with familiar patterns | ||
| - **HNSW Indexing**: Hierarchical Navigable Small World for fast approximate nearest neighbor search | ||
| - **Multiple Distance Metrics**: Euclidean, Cosine, Dot Product, Manhattan | ||
| - **SIMD Acceleration**: 2-4x speedup on supported hardware with automatic detection | ||
| - **Memory Efficient**: Optimized memory layouts and zero-copy operations | ||
| - **Type-Safe**: Full TypeScript definitions included | ||
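For readers unfamiliar with the four metrics listed under Core Capabilities, here is a minimal plain-JavaScript sketch of each. It is illustrative only: the function names are hypothetical, and the library's Rust/SIMD implementation and its exact score conventions (distance versus similarity) are not shown in this README.

```javascript
// Illustrative plain-JS versions of the metrics named above.
// These are NOT the library's implementation; names are hypothetical.

// Dot product: sum of element-wise products.
function dotProduct(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Euclidean (L2) distance: square root of summed squared differences.
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Manhattan (L1) distance: sum of absolute differences.
function manhattanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum;
}

// Cosine distance: 1 - cos(angle). Zero-length vectors are treated as
// maximally distant here, which is a choice made for this sketch.
function cosineDistance(a, b) {
  const denom = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  return denom === 0 ? 1 : 1 - dotProduct(a, b) / denom;
}
```

Cosine distance ignores vector magnitude, which is why it is the usual choice for text embeddings; Euclidean and Manhattan are sensitive to scale.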
| ### Browser-Specific Features | ||
| - **IndexedDB Persistence**: Save/load database state with progressive loading | ||
| - **Web Workers Integration**: Parallel operations across multiple threads | ||
| - **Worker Pool Management**: Automatic load balancing across 4-8 workers | ||
| - **Zero-Copy Transfers**: Transferable objects for efficient data passing | ||
| - **Browser Console Debugging**: Enhanced error messages and stack traces | ||
| - **Progressive Web Apps**: Perfect for PWA offline scenarios | ||
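The "Zero-Copy Transfers" item refers to transferable objects: ownership of an `ArrayBuffer` moves to the receiver instead of being copied. The sketch below demonstrates the mechanism with the standard `structuredClone` transfer option, assuming a runtime that provides it (modern browsers, Node.js 17+); `transferVector` is a hypothetical helper, not part of this package.

```javascript
// Move a vector's underlying ArrayBuffer instead of copying it.
// structuredClone's transfer list uses the same mechanism as
// worker.postMessage(message, [buffer]).
function transferVector(vector) {
  return structuredClone(vector, { transfer: [vector.buffer] });
}

const original = new Float32Array([0.1, 0.2, 0.3]);
const moved = transferVector(original);

console.log(moved.length);    // 3
console.log(original.length); // 0 (the source buffer is detached after transfer)
```

In a browser, the equivalent call when sending to a worker is `worker.postMessage({ vector }, [vector.buffer])`. After the transfer the sender can no longer read the vector, so keep a copy if it is still needed on the main thread.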
| ### Performance Optimizations | ||
| - **Batch Operations**: Efficient bulk insert/search for large datasets | ||
| - **LRU Caching**: 1000-entry hot vector cache for frequently accessed data | ||
| - **Lazy Loading**: Progressive data loading with callbacks | ||
| - **Compressed Storage**: Optimized serialization for IndexedDB | ||
| - **WASM Streaming**: Compile WASM modules while downloading | ||
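To make the "LRU Caching" item concrete, here is a minimal sketch of a least-recently-used cache built on `Map` insertion order. It only illustrates the eviction policy; the class name and the default capacity of 1000 (taken from the bullet above) are assumptions, and the package's actual cache is not shown in this README.

```javascript
// Minimal LRU cache sketch. A Map iterates keys in insertion order, so the
// first key is always the least recently used one.
class LruCache {
  constructor(capacity = 1000) {
    this.capacity = capacity;
    this.map = new Map();
  }

  // Return the cached value and mark the key as most recently used.
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  // Insert or update a value, evicting the oldest entry when full.
  set(key, value) {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.capacity) {
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
    this.map.set(key, value);
  }

  get size() {
    return this.map.size;
  }
}
```

A hot-vector cache of this kind would sit in front of IndexedDB reads, so repeated lookups of the same IDs skip the asynchronous storage round trip.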
| ## 📦 Installation | ||
| ### NPM | ||
| ```bash | ||
| npm install @ruvector/wasm | ||
| ``` | ||
| ### Yarn | ||
| ```bash | ||
| yarn add @ruvector/wasm | ||
| ``` | ||
| ### CDN (for quick prototyping) | ||
| ```html | ||
| <script type="module"> | ||
| import init, { VectorDB } from 'https://unpkg.com/@ruvector/wasm/pkg/ruvector_wasm.js'; | ||
| await init(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| </script> | ||
| ``` | ||
| ## ⚡ Quick Start | ||
| ### Basic Usage | ||
| ```javascript | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| // 1. Initialize WASM module (one-time setup) | ||
| await init(); | ||
| // 2. Create database with 384-dimensional vectors | ||
| const db = new VectorDB( | ||
| 384, // dimensions | ||
| 'cosine', // distance metric | ||
| true // enable HNSW index | ||
| ); | ||
| // 3. Insert vectors with metadata | ||
| const embedding = new Float32Array(384).map(() => Math.random()); | ||
| const id = db.insert( | ||
| embedding, | ||
| 'doc_1', // optional ID | ||
| { title: 'My Document', type: 'article' } // optional metadata | ||
| ); | ||
| // 4. Search for similar vectors | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| const results = db.search(query, 10); // top 10 results | ||
| // 5. Process results | ||
| results.forEach(result => { | ||
| console.log(`ID: ${result.id}`); | ||
| console.log(`Score: ${result.score}`); | ||
| console.log(`Metadata:`, result.metadata); | ||
| }); | ||
| ``` | ||
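As a mental model for what `db.search(query, 10)` returns, the sketch below performs an exact top-k search by scoring every stored vector. HNSW approximates this result without scanning everything. The function names are hypothetical, and treating `score` as cosine similarity (higher is better) is an assumption; the README does not state which convention the library uses.

```javascript
// Exact (brute-force) top-k search, shown only as a conceptual baseline.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// entries: array of { id, vector, metadata? }, the VectorEntry shape from the API reference.
function bruteForceSearch(entries, query, k) {
  return entries
    .map(({ id, vector, metadata }) => ({
      id,
      score: cosineSimilarity(query, vector),
      metadata,
    }))
    .sort((x, y) => y.score - x.score) // best match first
    .slice(0, k);
}
```

This baseline costs one distance computation per stored vector per query, which is the linear cost that an approximate index avoids.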
| ### React Integration | ||
| ```typescript | ||
| import { useEffect, useState } from 'react'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| function SemanticSearch() { | ||
| const [db, setDb] = useState<VectorDB | null>(null); | ||
| const [results, setResults] = useState<ReturnType<VectorDB['search']>>([]); | ||
| const [loading, setLoading] = useState(true); | ||
| useEffect(() => { | ||
| // Initialize WASM and create database | ||
| init().then(() => { | ||
| const vectorDB = new VectorDB(384, 'cosine', true); | ||
| setDb(vectorDB); | ||
| setLoading(false); | ||
| }); | ||
| }, []); | ||
| const handleSearch = async (queryEmbedding: Float32Array) => { | ||
| if (!db) return; | ||
| const searchResults = db.search(queryEmbedding, 10); | ||
| setResults(searchResults); | ||
| }; | ||
| if (loading) return <div>Loading vector database...</div>; | ||
| return ( | ||
| <div> | ||
| <h1>Semantic Search</h1> | ||
| {/* Your search UI */} | ||
| </div> | ||
| ); | ||
| } | ||
| ``` | ||
| ### Vue.js Integration | ||
| ```vue | ||
| <template> | ||
| <div> | ||
| <h1>Vector Search</h1> | ||
| <div v-if="!dbReady">Initializing...</div> | ||
| <div v-else> | ||
| <button @click="search">Search</button> | ||
| <ul> | ||
| <li v-for="result in results" :key="result.id"> | ||
| {{ result.id }}: {{ result.score }} | ||
| </li> | ||
| </ul> | ||
| </div> | ||
| </div> | ||
| </template> | ||
| <script setup> | ||
| import { ref, onMounted } from 'vue'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| const db = ref(null); | ||
| const dbReady = ref(false); | ||
| const results = ref([]); | ||
| onMounted(async () => { | ||
| await init(); | ||
| db.value = new VectorDB(384, 'cosine', true); | ||
| dbReady.value = true; | ||
| }); | ||
| const search = () => { | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| results.value = db.value.search(query, 10); | ||
| }; | ||
| </script> | ||
| ``` | ||
| ### Svelte Integration | ||
| ```svelte | ||
| <script> | ||
| import { onMount } from 'svelte'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| let db = null; | ||
| let ready = false; | ||
| let results = []; | ||
| onMount(async () => { | ||
| await init(); | ||
| db = new VectorDB(384, 'cosine', true); | ||
| ready = true; | ||
| }); | ||
| function search() { | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| results = db.search(query, 10); | ||
| } | ||
| </script> | ||
| {#if !ready} | ||
| <p>Loading...</p> | ||
| {:else} | ||
| <button on:click={search}>Search</button> | ||
| {#each results as result} | ||
| <div>{result.id}: {result.score}</div> | ||
| {/each} | ||
| {/if} | ||
| ``` | ||
| ## 🔥 Advanced Usage | ||
| ### Web Workers for Background Processing | ||
| Offload heavy vector operations to background threads for smooth UI performance: | ||
| ```javascript | ||
| // main.js | ||
| import { WorkerPool } from '@ruvector/wasm/worker-pool'; | ||
| const pool = new WorkerPool( | ||
| '/worker.js', | ||
| '/pkg/ruvector_wasm.js', | ||
| { | ||
| poolSize: navigator.hardwareConcurrency || 4, // Auto-detect CPU cores | ||
| dimensions: 384, | ||
| metric: 'cosine', | ||
| useHnsw: true | ||
| } | ||
| ); | ||
| // Initialize worker pool | ||
| await pool.init(); | ||
| // Batch insert in parallel (non-blocking) | ||
| const vectors = generateVectors(10000, 384); | ||
| const ids = await pool.insertBatch(vectors); | ||
| // Parallel search across workers | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| const results = await pool.search(query, 100); | ||
| // Get pool statistics | ||
| const stats = pool.getStats(); | ||
| console.log(`Workers: ${stats.busyWorkers}/${stats.poolSize} busy`); | ||
| console.log(`Queue: ${stats.queuedTasks} tasks waiting`); | ||
| // Cleanup when done | ||
| pool.terminate(); | ||
| ``` | ||
| ```javascript | ||
| // worker.js - Web Worker implementation | ||
| // Note: importScripts() and the wasm_bindgen global need a build made with | ||
| // `wasm-pack build --target no-modules`; the `--target web` output is an ES module. | ||
| importScripts('/pkg/ruvector_wasm.js'); | ||
| const { VectorDB } = wasm_bindgen; | ||
| let db = null; | ||
| self.onmessage = async (e) => { | ||
| const { type, data } = e.data; | ||
| switch (type) { | ||
| case 'init': | ||
| // Load and instantiate the WASM binary before creating the database | ||
| await wasm_bindgen('/pkg/ruvector_wasm_bg.wasm'); | ||
| db = new VectorDB(data.dimensions, data.metric, data.useHnsw); | ||
| self.postMessage({ type: 'ready' }); | ||
| break; | ||
| case 'insert': { | ||
| // Braces give each case its own block scope for const declarations | ||
| const id = db.insert(data.vector, data.id, data.metadata); | ||
| self.postMessage({ type: 'inserted', id }); | ||
| break; | ||
| } | ||
| case 'search': { | ||
| const results = db.search(data.query, data.k); | ||
| self.postMessage({ type: 'results', results }); | ||
| break; | ||
| } | ||
| } | ||
| }; | ||
| ``` | ||
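The `main.js` example above calls `generateVectors(10000, 384)`, which this README never defines. A minimal stand-in could look like the following sketch. It assumes the pool accepts entries in the `VectorEntry` shape from the API reference (`id`, `vector`, `metadata`); the ID format and metadata fields are made up for illustration.

```javascript
// Hypothetical helper: builds `count` random test entries of the given
// dimensionality. Random data is only useful for smoke tests and benchmarks.
function generateVectors(count, dimensions) {
  const entries = [];
  for (let i = 0; i < count; i++) {
    const vector = new Float32Array(dimensions);
    for (let j = 0; j < dimensions; j++) {
      vector[j] = Math.random();
    }
    entries.push({ id: `vec_${i}`, vector, metadata: { index: i } });
  }
  return entries;
}
```

Each entry owns its own `Float32Array`, so the buffers can be handed to workers individually without affecting the other entries.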
| ### IndexedDB Persistence - Offline First | ||
| Keep your vector database synchronized across sessions: | ||
| ```javascript | ||
| import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| // Create persistence layer | ||
| const persistence = new IndexedDBPersistence('my_vector_db', { | ||
| version: 1, | ||
| cacheSize: 1000, // LRU cache for hot vectors | ||
| batchSize: 100 // Batch size for bulk operations | ||
| }); | ||
| await persistence.open(); | ||
| // Create or restore VectorDB | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Load existing data from IndexedDB (with progress) | ||
| await persistence.loadAll(async (progress) => { | ||
| console.log(`Loading: ${progress.loaded}/${progress.total} vectors`); | ||
| console.log(`Progress: ${(progress.percent * 100).toFixed(1)}%`); | ||
| // Insert batch into VectorDB | ||
| if (progress.vectors.length > 0) { | ||
| const ids = db.insertBatch(progress.vectors); | ||
| console.log(`Inserted ${ids.length} vectors`); | ||
| } | ||
| if (progress.complete) { | ||
| console.log('Database fully loaded!'); | ||
| } | ||
| }); | ||
| // Insert new vectors and save to IndexedDB | ||
| const vector = new Float32Array(384).map(() => Math.random()); | ||
| const id = db.insert(vector, 'vec_123', { category: 'new' }); | ||
| await persistence.save({ | ||
| id, | ||
| vector, | ||
| metadata: { category: 'new' } | ||
| }); | ||
| // Batch save for better performance | ||
| const entries = [/* ...your vector entries */]; | ||
| await persistence.saveBatch(entries); | ||
| // Get storage statistics | ||
| const stats = await persistence.getStats(); | ||
| console.log(`Total vectors: ${stats.totalVectors}`); | ||
| console.log(`Storage used: ${(stats.storageBytes / 1024 / 1024).toFixed(2)} MB`); | ||
| console.log(`Cache size: ${stats.cacheSize}`); | ||
| console.log(`Cache hit rate: ${(stats.cacheHitRate * 100).toFixed(2)}%`); | ||
| // Clear old data | ||
| await persistence.clear(); | ||
| ``` | ||
| ### Batch Operations for Performance | ||
| Process large datasets efficiently: | ||
| ```javascript | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Batch insert (about 2.7x faster than individual inserts in the benchmarks below) | ||
| const entries = []; | ||
| for (let i = 0; i < 10000; i++) { | ||
| entries.push({ | ||
| vector: new Float32Array(384).map(() => Math.random()), | ||
| id: `vec_${i}`, | ||
| metadata: { index: i, batch: Math.floor(i / 100) } | ||
| }); | ||
| } | ||
| const ids = db.insertBatch(entries); | ||
| console.log(`Inserted ${ids.length} vectors in batch`); | ||
| // Multiple searches (these run sequentially on the calling thread; use the worker pool for parallelism) | ||
| const queries = Array.from({ length: 100 }, () => | ||
| new Float32Array(384).map(() => Math.random()) | ||
| ); | ||
| const allResults = queries.map(query => db.search(query, 10)); | ||
| console.log(`Completed ${allResults.length} searches`); | ||
| ``` | ||
| ### Memory Management Best Practices | ||
| ```javascript | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Reuse Float32Array buffers to reduce GC pressure | ||
| const buffer = new Float32Array(384); | ||
| // Insert with reused buffer | ||
| for (let i = 0; i < 1000; i++) { | ||
| // Fill buffer with new data | ||
| for (let j = 0; j < 384; j++) { | ||
| buffer[j] = Math.random(); | ||
| } | ||
| db.insert(buffer, `vec_${i}`, { index: i }); | ||
| // Buffer is copied internally, safe to reuse | ||
| } | ||
| // Check memory usage | ||
| const vectorCount = db.len(); | ||
| const isEmpty = db.isEmpty(); | ||
| const dimensions = db.dimensions; | ||
| console.log(`Vectors: ${vectorCount}, Dims: ${dimensions}`); | ||
| // Clean up when done | ||
| // wasm-bindgen objects expose free(); call it to release WASM memory promptly | ||
| // instead of waiting for garbage collection | ||
| db.free(); | ||
| ``` | ||
| ## 📊 Performance Benchmarks | ||
| ### Browser Performance (Chrome 120 on M1 MacBook Pro) | ||
| | Operation | Vectors | Dimensions | Standard | SIMD | Speedup | | ||
| |-----------|---------|------------|----------|------|---------| | ||
| | **Insert (individual)** | 10,000 | 384 | 3.2s | 1.1s | 2.9x | | ||
| | **Insert (batch)** | 10,000 | 384 | 1.2s | 0.4s | 3.0x | | ||
| | **Search (k=10)** | 100 queries | 384 | 0.5s | 0.2s | 2.5x | | ||
| | **Search (k=100)** | 100 queries | 384 | 1.8s | 0.7s | 2.6x | | ||
| | **Delete** | 1,000 | 384 | 0.2s | 0.1s | 2.0x | | ||
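The Speedup column and the per-second rates quoted elsewhere in this README follow directly from the timings in this table. The short sketch below redoes that arithmetic; the row values are copied from the table, while the field and function names are hypothetical.

```javascript
// Rows copied from the benchmark table above: item counts and wall-clock seconds.
const benchmarkRows = [
  { op: 'Insert (individual)', count: 10000, standardSec: 3.2, simdSec: 1.1 },
  { op: 'Insert (batch)', count: 10000, standardSec: 1.2, simdSec: 0.4 },
  { op: 'Search (k=10)', count: 100, standardSec: 0.5, simdSec: 0.2 },
  { op: 'Search (k=100)', count: 100, standardSec: 1.8, simdSec: 0.7 },
  { op: 'Delete', count: 1000, standardSec: 0.2, simdSec: 0.1 },
];

// Speedup = standard time / SIMD time; throughput = items / SIMD time.
function summarize(row) {
  return {
    op: row.op,
    speedup: Number((row.standardSec / row.simdSec).toFixed(1)),
    simdOpsPerSec: Math.round(row.count / row.simdSec),
  };
}

for (const row of benchmarkRows) {
  console.log(summarize(row));
}
```

For example, 10,000 batch inserts in 0.4 s works out to 25,000 inserts per second, and 100 searches in 0.2 s to 500 queries per second (about 2 ms per query), which matches the SIMD figures in the Throughput Comparison.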
| ### Throughput Comparison | ||
| ``` | ||
| Operation Ruvector WASM TensorFlow.js ml5.js | ||
| ───────────────────────────────────────────────────────────────── | ||
| Insert (ops/sec) 25,000 5,000 1,200 | ||
| Search (queries/sec) 500 80 20 | ||
| Memory (10K vectors) ~50MB ~200MB ~150MB | ||
| Bundle Size (gzipped) 380KB 800KB 450KB | ||
| Offline Support ✅ Partial ❌ | ||
| SIMD Acceleration ✅ ❌ ❌ | ||
| ``` | ||
| ### Real-World Application Performance | ||
| **Semantic Search (10,000 documents, 384-dim embeddings)** | ||
| - Cold start: ~800ms (WASM compile + data load) | ||
| - Warm query: <5ms (with HNSW index) | ||
| - IndexedDB load: ~2s (10,000 vectors) | ||
| - Memory footprint: ~60MB | ||
| **Recommendation Engine (100,000 items, 128-dim embeddings)** | ||
| - Initial load: ~8s from IndexedDB | ||
| - Query latency: <10ms (p50) | ||
| - Memory usage: ~180MB | ||
| - Bundle impact: +400KB gzipped | ||
| ## 🌐 Browser Compatibility | ||
| ### Support Matrix | ||
| | Browser | Version | WASM | SIMD | Workers | IndexedDB | Status | | ||
| |---------|---------|------|------|---------|-----------|--------| | ||
| | **Chrome** | 91+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Firefox** | 89+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Safari** | 16.4+ | ✅ | Partial | ✅ | ✅ | Limited SIMD | | ||
| | **Edge** | 91+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Opera** | 77+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Samsung Internet** | 15+ | ✅ | ❌ | ✅ | ✅ | No SIMD | | ||
| ### SIMD Support Detection | ||
| ```javascript | ||
| import { detectSIMD } from '@ruvector/wasm'; | ||
| if (detectSIMD()) { | ||
| console.log('SIMD acceleration available!'); | ||
| // Load SIMD-optimized build | ||
| await import('@ruvector/wasm/pkg-simd/ruvector_wasm.js'); | ||
| } else { | ||
| console.log('Standard build'); | ||
| // Load standard build | ||
| await import('@ruvector/wasm'); | ||
| } | ||
| ``` | ||
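The README does not show how `detectSIMD()` works internally. A common technique, sketched below, is to ask the engine to validate a tiny module that uses a `v128` instruction; the byte sequence follows the approach popularized by the `wasm-feature-detect` project. `supportsWasmSimd` and `SIMD_PROBE` are hypothetical names, and this is not necessarily what the package itself does.

```javascript
// A minimal WebAssembly module whose only function uses SIMD instructions.
// If the engine cannot validate it, fixed-width SIMD is unavailable.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, // "\0asm" magic number + version 1
  1, 5, 1, 96, 0, 1, 123,      // type section: () -> v128
  3, 2, 1, 0,                  // function section: one function of type 0
  10, 10, 1, 8, 0,             // code section: one body, no locals
  65, 0, 253, 15, 253, 98, 11, // i32.const 0; i8x16.splat; i8x16.popcnt; end
]);

function supportsWasmSimd() {
  return (
    typeof WebAssembly === 'object' &&
    typeof WebAssembly.validate === 'function' &&
    WebAssembly.validate(SIMD_PROBE)
  );
}

console.log(`WASM SIMD supported: ${supportsWasmSimd()}`);
```

`WebAssembly.validate` is synchronous and does not compile or run the module, so the check is cheap enough to perform before choosing which build to download.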
| ### Polyfills and Fallbacks | ||
| ```javascript | ||
| // Check for required features | ||
| const hasWASM = typeof WebAssembly !== 'undefined'; | ||
| const hasWorkers = typeof Worker !== 'undefined'; | ||
| const hasIndexedDB = typeof indexedDB !== 'undefined'; | ||
| if (!hasWASM) { | ||
| console.error('WebAssembly not supported'); | ||
| // Fallback to server-side processing | ||
| } | ||
| if (!hasWorkers) { | ||
| console.warn('Web Workers not available, using main thread'); | ||
| // Use synchronous API | ||
| } | ||
| if (!hasIndexedDB) { | ||
| console.warn('IndexedDB not available, data will not persist'); | ||
| // Use in-memory only | ||
| } | ||
| ``` | ||
| ## 📦 Bundle Size | ||
| ### Production Build Sizes | ||
| ``` | ||
| Build Type Uncompressed Gzipped Brotli | ||
| ────────────────────────────────────────────────────────── | ||
| Standard WASM 1.2 MB 450 KB 380 KB | ||
| SIMD WASM 1.3 MB 480 KB 410 KB | ||
| JavaScript Glue 45 KB 12 KB 9 KB | ||
| TypeScript Definitions 8 KB 2 KB 1.5 KB | ||
| ────────────────────────────────────────────────────────── | ||
| Total (Standard) 1.25 MB 462 KB 390 KB | ||
| Total (SIMD) 1.35 MB 492 KB 420 KB | ||
| ``` | ||
| ### With Optimizations (wasm-opt) | ||
| ```bash | ||
| npm run optimize | ||
| ``` | ||
| ``` | ||
| Optimized Build Uncompressed Gzipped Brotli | ||
| ────────────────────────────────────────────────────────── | ||
| Standard WASM 900 KB 380 KB 320 KB | ||
| SIMD WASM 980 KB 410 KB 350 KB | ||
| ``` | ||
| ### Code Splitting Strategy | ||
| ```javascript | ||
| // Lazy load WASM module when needed | ||
| const loadVectorDB = async () => { | ||
| const { default: init, VectorDB } = await import('@ruvector/wasm'); | ||
| await init(); | ||
| return VectorDB; | ||
| }; | ||
| // Use in your application | ||
| button.addEventListener('click', async () => { | ||
| const VectorDB = await loadVectorDB(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Use db... | ||
| }); | ||
| ``` | ||
| ## 🔨 Building from Source | ||
| ### Prerequisites | ||
| - **Rust**: 1.77 or higher | ||
| - **wasm-pack**: Latest version | ||
| - **Node.js**: 18.0 or higher | ||
| ```bash | ||
| # Install wasm-pack | ||
| curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh | ||
| # Or via npm | ||
| npm install -g wasm-pack | ||
| ``` | ||
| ### Build Commands | ||
| ```bash | ||
| # Clone repository | ||
| git clone https://github.com/ruvnet/ruvector.git | ||
| cd ruvector/crates/ruvector-wasm | ||
| # Install dependencies | ||
| npm install | ||
| # Build for web (ES modules) | ||
| npm run build:web | ||
| # Build with SIMD optimizations | ||
| npm run build:simd | ||
| # Build for Node.js | ||
| npm run build:node | ||
| # Build for bundlers (webpack, rollup, etc.) | ||
| npm run build:bundler | ||
| # Build all targets | ||
| npm run build:all | ||
| # Run tests in browser | ||
| npm test | ||
| # Run tests in Node.js | ||
| npm run test:node | ||
| # Check bundle size | ||
| npm run size | ||
| # Optimize with wasm-opt (requires binaryen) | ||
| npm run optimize | ||
| # Serve examples locally | ||
| npm run serve | ||
| ``` | ||
| ### Development Workflow | ||
| ```bash | ||
| # Development build with SIMD (wasm-pack has no built-in watch mode; rerun after changes) | ||
| wasm-pack build --dev --target web -- --features simd | ||
| # Run specific browser tests | ||
| npm run test:firefox | ||
| # Profile WASM performance | ||
| wasm-pack build --profiling --target web | ||
| # Generate documentation | ||
| cargo doc --no-deps --open | ||
| ``` | ||
| ### Custom Build Configuration | ||
| ```toml | ||
| # Cargo.toml (profile settings; passing -C lto through rustflags conflicts with Cargo's own flags) | ||
| [profile.release] | ||
| opt-level = "z" | ||
| lto = "fat" | ||
| codegen-units = 1 | ||
| ``` | ||
| ## 📚 API Reference | ||
| ### VectorDB Class | ||
| ```typescript | ||
| class VectorDB { | ||
| constructor( | ||
| dimensions: number, | ||
| metric?: 'euclidean' | 'cosine' | 'dotproduct' | 'manhattan', | ||
| useHnsw?: boolean | ||
| ); | ||
| // Insert operations | ||
| insert(vector: Float32Array, id?: string, metadata?: object): string; | ||
| insertBatch(entries: VectorEntry[]): string[]; | ||
| // Search operations | ||
| search(query: Float32Array, k: number, filter?: object): SearchResult[]; | ||
| // Retrieval operations | ||
| get(id: string): VectorEntry | null; | ||
| len(): number; | ||
| isEmpty(): boolean; | ||
| // Delete operations | ||
| delete(id: string): boolean; | ||
| // Persistence (IndexedDB) | ||
| saveToIndexedDB(): Promise<void>; | ||
| static loadFromIndexedDB(dbName: string): Promise<VectorDB>; | ||
| // Properties | ||
| readonly dimensions: number; | ||
| } | ||
| ``` | ||
| ### Types | ||
| ```typescript | ||
| interface VectorEntry { | ||
| id?: string; | ||
| vector: Float32Array; | ||
| metadata?: Record<string, any>; | ||
| } | ||
| interface SearchResult { | ||
| id: string; | ||
| score: number; | ||
| vector?: Float32Array; | ||
| metadata?: Record<string, any>; | ||
| } | ||
| ``` | ||
| ### Utility Functions | ||
| ```typescript | ||
| // Detect SIMD support | ||
| function detectSIMD(): boolean; | ||
| // Get version | ||
| function version(): string; | ||
| // Array conversion | ||
| function arrayToFloat32Array(arr: number[]): Float32Array; | ||
| // Benchmarking | ||
| function benchmark(name: string, iterations: number, dimensions: number): number; | ||
| ``` | ||
| See [WASM API Documentation](../../docs/getting-started/wasm-api.md) for complete reference. | ||
| ## 🎯 Example Applications | ||
| ### Semantic Search Engine | ||
| ```javascript | ||
| // Semantic search with OpenAI embeddings (written for the openai v3 SDK; v4+ replaced Configuration/OpenAIApi with `new OpenAI()`) | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| import { Configuration, OpenAIApi } from 'openai'; | ||
| await init(); | ||
| const openai = new OpenAIApi(new Configuration({ | ||
| apiKey: process.env.OPENAI_API_KEY | ||
| })); | ||
| const db = new VectorDB(1536, 'cosine', true); // OpenAI ada-002 = 1536 dims | ||
| // Index documents | ||
| const documents = [ | ||
| 'The quick brown fox jumps over the lazy dog', | ||
| 'Machine learning is a subset of artificial intelligence', | ||
| 'WebAssembly enables high-performance web applications' | ||
| ]; | ||
| for (const [i, doc] of documents.entries()) { | ||
| const response = await openai.createEmbedding({ | ||
| model: 'text-embedding-ada-002', | ||
| input: doc | ||
| }); | ||
| const embedding = new Float32Array(response.data.data[0].embedding); | ||
| db.insert(embedding, `doc_${i}`, { text: doc }); | ||
| } | ||
| // Search | ||
| const queryResponse = await openai.createEmbedding({ | ||
| model: 'text-embedding-ada-002', | ||
| input: 'What is AI?' | ||
| }); | ||
| const queryEmbedding = new Float32Array(queryResponse.data.data[0].embedding); | ||
| const results = db.search(queryEmbedding, 3); | ||
| results.forEach(result => { | ||
| console.log(`${result.score.toFixed(4)}: ${result.metadata.text}`); | ||
| }); | ||
| ``` | ||
| ### Offline Recommendation Engine | ||
| ```javascript | ||
| // Product recommendations that work offline | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb'; | ||
| await init(); | ||
| const db = new VectorDB(128, 'cosine', true); | ||
| const persistence = new IndexedDBPersistence('product_recommendations'); | ||
| await persistence.open(); | ||
| // Load cached recommendations | ||
| await persistence.loadAll(async (progress) => { | ||
| if (progress.vectors.length > 0) { | ||
| db.insertBatch(progress.vectors); | ||
| } | ||
| }); | ||
| // Get recommendations based on user history | ||
| function getRecommendations(userHistory, k = 10) { | ||
| // Compute user preference vector (average of liked items) | ||
| const userVector = computeAverageEmbedding(userHistory); | ||
| const recommendations = db.search(userVector, k); | ||
| return recommendations.map(r => ({ | ||
| productId: r.id, | ||
| score: r.score, | ||
| ...r.metadata | ||
| })); | ||
| } | ||
| // Add new products (syncs to IndexedDB) | ||
| async function addProduct(productId, embedding, metadata) { | ||
| db.insert(embedding, productId, metadata); | ||
| await persistence.save({ id: productId, vector: embedding, metadata }); | ||
| } | ||
| ``` | ||
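The recommendation example above relies on `computeAverageEmbedding`, which is left undefined. One possible version is sketched below, assuming `userHistory` is a non-empty array of equal-length `Float32Array` embeddings for the items the user liked. That input shape is an assumption; the README does not specify it.

```javascript
// Hypothetical helper: element-wise mean of the embeddings in userHistory.
function computeAverageEmbedding(userHistory) {
  if (userHistory.length === 0) {
    throw new Error('userHistory must contain at least one embedding');
  }
  const dimensions = userHistory[0].length;
  const mean = new Float32Array(dimensions);

  // Sum each component across all embeddings.
  for (const embedding of userHistory) {
    if (embedding.length !== dimensions) {
      throw new Error('All embeddings must have the same length');
    }
    for (let j = 0; j < dimensions; j++) {
      mean[j] += embedding[j];
    }
  }

  // Divide by the number of embeddings to get the mean.
  for (let j = 0; j < dimensions; j++) {
    mean[j] /= userHistory.length;
  }
  return mean;
}
```

A plain mean weights every liked item equally. Weighting recent items more heavily is a common variation, but that goes beyond what the example describes.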
| ### RAG (Retrieval-Augmented Generation) | ||
| ```javascript | ||
| // Browser-based RAG system | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| const db = new VectorDB(768, 'cosine', true); // BERT embeddings | ||
| // Index knowledge base | ||
| const knowledgeBase = loadKnowledgeBase(); // Your documents | ||
| for (const doc of knowledgeBase) { | ||
| const embedding = await getBertEmbedding(doc.text); | ||
| db.insert(embedding, doc.id, { text: doc.text, source: doc.source }); | ||
| } | ||
| // RAG query function | ||
| async function ragQuery(question, llm) { | ||
| // 1. Get question embedding | ||
| const questionEmbedding = await getBertEmbedding(question); | ||
| // 2. Retrieve relevant context | ||
| const context = db.search(questionEmbedding, 5); | ||
| // 3. Augment prompt with context | ||
| const prompt = ` | ||
| Context: | ||
| ${context.map(r => r.metadata.text).join('\n\n')} | ||
| Question: ${question} | ||
| Answer based on the context above: | ||
| `; | ||
| // 4. Generate response | ||
| const response = await llm.generate(prompt); | ||
| return { | ||
| answer: response, | ||
| sources: context.map(r => r.metadata.source) | ||
| }; | ||
| } | ||
| ``` | ||
| ## 🐛 Troubleshooting | ||
| ### Common Issues | ||
| **1. WASM Module Not Loading** | ||
| ```javascript | ||
| // Ensure correct MIME type | ||
| // Add to server config (nginx): | ||
| // types { | ||
| // application/wasm wasm; | ||
| // } | ||
| // Or use explicit fetch | ||
| const wasmUrl = new URL('./pkg/ruvector_wasm_bg.wasm', import.meta.url); | ||
| await init(await fetch(wasmUrl)); | ||
| ``` | ||
| **2. CORS Errors** | ||
| ```javascript | ||
| // Browsers block module and WASM loads from file:// URLs, so serve the files over HTTP during local development | ||
| // package.json | ||
| { | ||
| "scripts": { | ||
| "serve": "python3 -m http.server 8080 --bind 127.0.0.1" | ||
| } | ||
| } | ||
| ``` | ||
| **3. Memory Issues** | ||
| ```javascript | ||
| // Monitor memory usage | ||
| const vectorCount = db.len(); | ||
| const estimatedMemory = vectorCount * db.dimensions * 4; // bytes of raw float32 data, excluding index overhead | ||
| if (estimatedMemory > 100_000_000) { // 100MB | ||
| console.warn('High memory usage, consider chunking'); | ||
| } | ||
| // Use batch operations to reduce GC pressure | ||
| const BATCH_SIZE = 1000; | ||
| for (let i = 0; i < entries.length; i += BATCH_SIZE) { | ||
| const batch = entries.slice(i, i + BATCH_SIZE); | ||
| db.insertBatch(batch); | ||
| } | ||
| ``` | ||
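The memory estimate and the batching loop above can be packaged as two small helpers, sketched here with hypothetical names. The estimate covers raw `float32` storage only (4 bytes per component), so treat it as a lower bound: the benchmark section reports roughly 50 to 60 MB for 10,000 vectors of 384 dimensions, well above the raw figure, presumably because of index and metadata overhead.

```javascript
// Lower-bound estimate: 4 bytes per float32 component, no index or metadata.
function estimateVectorBytes(vectorCount, dimensions) {
  return vectorCount * dimensions * 4;
}

// Yield successive slices of at most `size` items.
function* chunk(items, size) {
  if (size <= 0) throw new Error('size must be positive');
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

// 10,000 vectors x 384 dimensions x 4 bytes = 15,360,000 bytes (about 14.6 MiB)
console.log(estimateVectorBytes(10000, 384));
```

With these in place, the insert loop becomes `for (const batch of chunk(entries, 1000)) db.insertBatch(batch);`, using the `insertBatch` method from the API reference.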
| **4. Web Worker Issues** | ||
| ```javascript | ||
| // Ensure worker script URL is correct | ||
| const workerUrl = new URL('./worker.js', import.meta.url); | ||
| const worker = new Worker(workerUrl, { type: 'module' }); | ||
| // Handle worker errors | ||
| worker.onerror = (error) => { | ||
| console.error('Worker error:', error); | ||
| }; | ||
| ``` | ||
| See [WASM Troubleshooting Guide](../../docs/getting-started/wasm-troubleshooting.md) for more solutions. | ||
| ## 🔗 Links & Resources | ||
| ### Documentation | ||
| - **[Getting Started Guide](../../docs/guide/GETTING_STARTED.md)** - Complete setup and usage | ||
| - **[WASM API Reference](../../docs/getting-started/wasm-api.md)** - Full API documentation | ||
| - **[Performance Tuning](../../docs/optimization/PERFORMANCE_TUNING_GUIDE.md)** - Optimization tips | ||
| - **[Main README](../../README.md)** - Project overview and features | ||
| ### Examples & Demos | ||
| - **[Vanilla JS Example](../../examples/wasm-vanilla/)** - Basic implementation | ||
| - **[React Demo](../../examples/wasm-react/)** - React integration with hooks | ||
| - **[Live Demo](https://ruvector-demo.vercel.app)** - Try it in your browser | ||
| - **[CodeSandbox](https://codesandbox.io/s/ruvector-wasm)** - Interactive playground | ||
| ### Community & Support | ||
| - **GitHub**: [github.com/ruvnet/ruvector](https://github.com/ruvnet/ruvector) | ||
| - **Discord**: [Join our community](https://discord.gg/ruvnet) | ||
| - **Twitter**: [@ruvnet](https://twitter.com/ruvnet) | ||
| - **Issues**: [Report bugs](https://github.com/ruvnet/ruvector/issues) | ||
| ## 📄 License | ||
| MIT License - see [LICENSE](../../LICENSE) for details. | ||
| Free to use for commercial and personal projects. | ||
| ## 🙏 Acknowledgments | ||
| - Built with [wasm-pack](https://github.com/rustwasm/wasm-pack) and [wasm-bindgen](https://github.com/rustwasm/wasm-bindgen) | ||
| - HNSW algorithm implementation from [hnsw_rs](https://github.com/jean-pierreBoth/hnswlib-rs) | ||
| - SIMD optimizations powered by Rust's excellent WebAssembly support | ||
| - The WebAssembly community for making this possible | ||
| --- | ||
| <div align="center"> | ||
| **Built by [rUv](https://ruv.io) • Open Source on [GitHub](https://github.com/ruvnet/ruvector)** | ||
| **Perfect for**: PWAs • Offline-First Apps • Edge Computing • Privacy-First AI | ||
| [Get Started](../../docs/guide/GETTING_STARTED.md) • [API Docs](../../docs/getting-started/wasm-api.md) • [Examples](../../examples/) | ||
| </div> |
-969
| # Ruvector WASM | ||
| [](https://opensource.org/licenses/MIT) | ||
| [](https://www.npmjs.com/package/@ruvector/wasm) | ||
| [](#bundle-size) | ||
| [](#browser-compatibility) | ||
| [](https://webassembly.org/) | ||
| **High-performance vector database running entirely in your browser via WebAssembly.** | ||
| > Bring **sub-millisecond vector search** to the edge with **offline-first** capabilities. Perfect for AI applications, semantic search, and recommendation engines that run completely client-side. Built by [rUv](https://ruv.io) with Rust and WebAssembly. | ||
| ## 🌟 Why Ruvector WASM? | ||
| In the age of privacy-first, offline-capable web applications, running AI workloads **entirely in the browser** is no longer optional—it's essential. | ||
| **Ruvector WASM brings enterprise-grade vector search to the browser:** | ||
| - ⚡ **Blazing Fast**: Millisecond-scale query latency with HNSW indexing and SIMD acceleration (about 2 ms per query at k=10 in the SIMD benchmarks below) | ||
| - 🔒 **Privacy First**: All data stays in the browser—zero server round-trips | ||
| - 📴 **Offline Capable**: Full functionality without internet via IndexedDB persistence | ||
| - 🌐 **Edge Computing**: Deploy to CDNs for ultra-low latency globally | ||
| - 💾 **Persistent Storage**: IndexedDB integration with automatic synchronization | ||
| - 🧵 **Multi-threaded**: Web Workers support for parallel processing | ||
| - 📦 **Compact**: <400KB gzipped with optimizations | ||
| - 🎯 **Zero Dependencies**: Pure Rust compiled to WebAssembly | ||
| ## 🚀 Features | ||
| ### Core Capabilities | ||
| - **Complete VectorDB API**: Insert, search, delete, batch operations with familiar patterns | ||
| - **HNSW Indexing**: Hierarchical Navigable Small World for fast approximate nearest neighbor search | ||
| - **Multiple Distance Metrics**: Euclidean, Cosine, Dot Product, Manhattan | ||
| - **SIMD Acceleration**: 2-4x speedup on supported hardware with automatic detection | ||
| - **Memory Efficient**: Optimized memory layouts and zero-copy operations | ||
| - **Type-Safe**: Full TypeScript definitions included | ||
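The four distance metrics listed above can be written out in plain JavaScript to make their definitions concrete. This is a reference sketch only: the function names are illustrative, the library computes these in Rust/WASM, and whether `result.score` is reported as a distance or a similarity is not stated in this README.

```javascript
// Reference implementations of the four metrics named above.
// Illustrative plain-JS sketches, not the library's internal code.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

function manhattan(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum;
}

// Cosine similarity in [-1, 1]; returns 0 when either vector has zero norm.
function cosineSimilarity(a, b) {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

const a = new Float32Array([1, 0, 0]);
const b = new Float32Array([0, 1, 0]);
console.log(euclidean(a, b));        // 1.4142135623730951
console.log(manhattan(a, b));        // 2
console.log(dot(a, b));              // 0
console.log(cosineSimilarity(a, a)); // 1
```

Cosine is the usual choice for text embeddings because it ignores vector magnitude; Euclidean and Manhattan are sensitive to it.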
| ### Browser-Specific Features | ||
| - **IndexedDB Persistence**: Save/load database state with progressive loading | ||
| - **Web Workers Integration**: Parallel operations across multiple threads | ||
| - **Worker Pool Management**: Automatic load balancing across 4-8 workers | ||
| - **Zero-Copy Transfers**: Transferable objects for efficient data passing | ||
| - **Browser Console Debugging**: Enhanced error messages and stack traces | ||
| - **Progressive Web Apps**: Perfect for PWA offline scenarios | ||
| ### Performance Optimizations | ||
| - **Batch Operations**: Efficient bulk insert/search for large datasets | ||
| - **LRU Caching**: 1000-entry hot vector cache for frequently accessed data | ||
| - **Lazy Loading**: Progressive data loading with callbacks | ||
| - **Compressed Storage**: Optimized serialization for IndexedDB | ||
| - **WASM Streaming**: Compile WASM modules while downloading | ||
| ## 📦 Installation | ||
| ### NPM | ||
| ```bash | ||
| npm install @ruvector/wasm | ||
| ``` | ||
| ### Yarn | ||
| ```bash | ||
| yarn add @ruvector/wasm | ||
| ``` | ||
| ### CDN (for quick prototyping) | ||
| ```html | ||
| <script type="module"> | ||
| import init, { VectorDB } from 'https://unpkg.com/@ruvector/wasm/pkg/ruvector_wasm.js'; | ||
| await init(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| </script> | ||
| ``` | ||
| ## ⚡ Quick Start | ||
| ### Basic Usage | ||
| ```javascript | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| // 1. Initialize WASM module (one-time setup) | ||
| await init(); | ||
| // 2. Create database with 384-dimensional vectors | ||
| const db = new VectorDB( | ||
| 384, // dimensions | ||
| 'cosine', // distance metric | ||
| true // enable HNSW index | ||
| ); | ||
| // 3. Insert vectors with metadata | ||
| const embedding = new Float32Array(384).map(() => Math.random()); | ||
| const id = db.insert( | ||
| embedding, | ||
| 'doc_1', // optional ID | ||
| { title: 'My Document', type: 'article' } // optional metadata | ||
| ); | ||
| // 4. Search for similar vectors | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| const results = db.search(query, 10); // top 10 results | ||
| // 5. Process results | ||
| results.forEach(result => { | ||
| console.log(`ID: ${result.id}`); | ||
| console.log(`Score: ${result.score}`); | ||
| console.log(`Metadata:`, result.metadata); | ||
| }); | ||
| ``` | ||
| ### React Integration | ||
| ```typescript | ||
| import { useEffect, useState } from 'react'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| function SemanticSearch() { | ||
| const [db, setDb] = useState<VectorDB | null>(null); | ||
| const [results, setResults] = useState<any[]>([]); | ||
| const [loading, setLoading] = useState(true); | ||
| useEffect(() => { | ||
| // Initialize WASM and create database | ||
| init().then(() => { | ||
| const vectorDB = new VectorDB(384, 'cosine', true); | ||
| setDb(vectorDB); | ||
| setLoading(false); | ||
| }); | ||
| }, []); | ||
| const handleSearch = async (queryEmbedding: Float32Array) => { | ||
| if (!db) return; | ||
| const searchResults = db.search(queryEmbedding, 10); | ||
| setResults(searchResults); | ||
| }; | ||
| if (loading) return <div>Loading vector database...</div>; | ||
| return ( | ||
| <div> | ||
| <h1>Semantic Search</h1> | ||
| {/* Your search UI */} | ||
| </div> | ||
| ); | ||
| } | ||
| ``` | ||
| ### Vue.js Integration | ||
| ```vue | ||
| <template> | ||
| <div> | ||
| <h1>Vector Search</h1> | ||
| <div v-if="!dbReady">Initializing...</div> | ||
| <div v-else> | ||
| <button @click="search">Search</button> | ||
| <ul> | ||
| <li v-for="result in results" :key="result.id"> | ||
| {{ result.id }}: {{ result.score }} | ||
| </li> | ||
| </ul> | ||
| </div> | ||
| </div> | ||
| </template> | ||
| <script setup> | ||
| import { ref, onMounted } from 'vue'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| const db = ref(null); | ||
| const dbReady = ref(false); | ||
| const results = ref([]); | ||
| onMounted(async () => { | ||
| await init(); | ||
| db.value = new VectorDB(384, 'cosine', true); | ||
| dbReady.value = true; | ||
| }); | ||
| const search = () => { | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| results.value = db.value.search(query, 10); | ||
| }; | ||
| </script> | ||
| ``` | ||
| ### Svelte Integration | ||
| ```svelte | ||
| <script> | ||
| import { onMount } from 'svelte'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| let db = null; | ||
| let ready = false; | ||
| let results = []; | ||
| onMount(async () => { | ||
| await init(); | ||
| db = new VectorDB(384, 'cosine', true); | ||
| ready = true; | ||
| }); | ||
| function search() { | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| results = db.search(query, 10); | ||
| } | ||
| </script> | ||
| {#if !ready} | ||
| <p>Loading...</p> | ||
| {:else} | ||
| <button on:click={search}>Search</button> | ||
| {#each results as result} | ||
| <div>{result.id}: {result.score}</div> | ||
| {/each} | ||
| {/if} | ||
| ``` | ||
| ## 🔥 Advanced Usage | ||
| ### Web Workers for Background Processing | ||
| Offload heavy vector operations to background threads for smooth UI performance: | ||
| ```javascript | ||
| // main.js | ||
| import { WorkerPool } from '@ruvector/wasm/worker-pool'; | ||
| const pool = new WorkerPool( | ||
| '/worker.js', | ||
| '/pkg/ruvector_wasm.js', | ||
| { | ||
| poolSize: navigator.hardwareConcurrency || 4, // Auto-detect CPU cores | ||
| dimensions: 384, | ||
| metric: 'cosine', | ||
| useHnsw: true | ||
| } | ||
| ); | ||
| // Initialize worker pool | ||
| await pool.init(); | ||
| // Batch insert in parallel (non-blocking) | ||
| const vectors = generateVectors(10000, 384); | ||
| const ids = await pool.insertBatch(vectors); | ||
| // Parallel search across workers | ||
| const query = new Float32Array(384).map(() => Math.random()); | ||
| const results = await pool.search(query, 100); | ||
| // Get pool statistics | ||
| const stats = pool.getStats(); | ||
| console.log(`Workers: ${stats.busyWorkers}/${stats.poolSize} busy`); | ||
| console.log(`Queue: ${stats.queuedTasks} tasks waiting`); | ||
| // Cleanup when done | ||
| pool.terminate(); | ||
| ``` | ||
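The `generateVectors` call in the example above is not defined anywhere in this README. A minimal sketch of such a helper follows, assuming the pool accepts entries shaped like the `VectorEntry` interface in the API Reference (`id`, `vector`, `metadata`); the name and the entry shape are assumptions, not part of the package.

```javascript
// Hypothetical helper: builds `count` random entries of `dimensions` floats.
// The { id, vector, metadata } shape mirrors VectorEntry and is an assumption.
function generateVectors(count, dimensions) {
  const entries = [];
  for (let i = 0; i < count; i++) {
    const vector = new Float32Array(dimensions);
    for (let j = 0; j < dimensions; j++) vector[j] = Math.random();
    entries.push({ id: `vec_${i}`, vector, metadata: { index: i } });
  }
  return entries;
}

const sample = generateVectors(3, 384);
console.log(sample.length, sample[0].vector.length); // 3 384
```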
| ```javascript | ||
| // worker.js - Web Worker implementation | ||
| importScripts('/pkg/ruvector_wasm.js'); | ||
| const { VectorDB } = wasm_bindgen; | ||
| let db = null; | ||
| self.onmessage = async (e) => { | ||
| const { type, data } = e.data; | ||
| switch (type) { | ||
| case 'init': | ||
| await wasm_bindgen('/pkg/ruvector_wasm_bg.wasm'); | ||
| db = new VectorDB(data.dimensions, data.metric, data.useHnsw); | ||
| self.postMessage({ type: 'ready' }); | ||
| break; | ||
| case 'insert': | ||
| const id = db.insert(data.vector, data.id, data.metadata); | ||
| self.postMessage({ type: 'inserted', id }); | ||
| break; | ||
| case 'search': | ||
| const results = db.search(data.query, data.k); | ||
| self.postMessage({ type: 'results', results }); | ||
| break; | ||
| } | ||
| }; | ||
| ``` | ||
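The worker above replies with `{ type, ... }` messages that carry no request identifier, so a caller cannot tell which reply belongs to which request if several are in flight. One way to use it safely is to send one request at a time. The class below is a hypothetical sketch of that idea (`SerialWorkerClient` is not part of the package); it only assumes an object with `postMessage`, `onmessage`, and `onerror`, which a real `Worker` provides.

```javascript
// Hypothetical client: sends one request at a time to a worker that speaks the
// { type, data } protocol above and resolves each request with the next reply.
class SerialWorkerClient {
  constructor(worker) {
    this.worker = worker;
    this.queue = [];    // pending { message, resolve, reject }
    this.active = null; // request currently awaiting a reply
    worker.onmessage = (e) => this.settle((job) => job.resolve(e.data));
    worker.onerror = (err) => this.settle((job) => job.reject(err));
  }

  request(type, data) {
    return new Promise((resolve, reject) => {
      this.queue.push({ message: { type, data }, resolve, reject });
      this.pump();
    });
  }

  // Send the next queued message if nothing is in flight.
  pump() {
    if (this.active || this.queue.length === 0) return;
    this.active = this.queue.shift();
    this.worker.postMessage(this.active.message);
  }

  // Finish the in-flight request, then move on to the next one.
  settle(finish) {
    const job = this.active;
    this.active = null;
    if (job) finish(job);
    this.pump();
  }
}
```

In a browser this would wrap `new Worker('/worker.js')`, with calls such as `client.request('init', { dimensions: 384, metric: 'cosine', useHnsw: true })` followed by `client.request('search', { query, k: 10 })`. Serializing requests trades throughput for simplicity; tagging each message with an ID would allow concurrency but requires changing the worker protocol.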
| ### IndexedDB Persistence - Offline First | ||
| Keep your vector database synchronized across sessions: | ||
| ```javascript | ||
| import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb'; | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| // Create persistence layer | ||
| const persistence = new IndexedDBPersistence('my_vector_db', { | ||
| version: 1, | ||
| cacheSize: 1000, // LRU cache for hot vectors | ||
| batchSize: 100 // Batch size for bulk operations | ||
| }); | ||
| await persistence.open(); | ||
| // Create or restore VectorDB | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Load existing data from IndexedDB (with progress) | ||
| await persistence.loadAll(async (progress) => { | ||
| console.log(`Loading: ${progress.loaded}/${progress.total} vectors`); | ||
| console.log(`Progress: ${(progress.percent * 100).toFixed(1)}%`); | ||
| // Insert batch into VectorDB | ||
| if (progress.vectors.length > 0) { | ||
| const ids = db.insertBatch(progress.vectors); | ||
| console.log(`Inserted ${ids.length} vectors`); | ||
| } | ||
| if (progress.complete) { | ||
| console.log('Database fully loaded!'); | ||
| } | ||
| }); | ||
| // Insert new vectors and save to IndexedDB | ||
| const vector = new Float32Array(384).map(() => Math.random()); | ||
| const id = db.insert(vector, 'vec_123', { category: 'new' }); | ||
| await persistence.saveVector(id, vector, { category: 'new' }); | ||
| // Batch save for better performance | ||
| const entries = [/* ... your vector entries ... */]; | ||
| await persistence.saveBatch(entries); | ||
| // Get storage statistics | ||
| const stats = await persistence.getStats(); | ||
| console.log(`Total vectors: ${stats.totalVectors}`); | ||
| console.log(`Storage used: ${(stats.storageBytes / 1024 / 1024).toFixed(2)} MB`); | ||
| console.log(`Cache size: ${stats.cacheSize}`); | ||
| console.log(`Cache hit rate: ${(stats.cacheHitRate * 100).toFixed(2)}%`); | ||
| // Clear old data | ||
| await persistence.clear(); | ||
| ``` | ||
| ### Batch Operations for Performance | ||
| Process large datasets efficiently: | ||
| ```javascript | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Batch insert (roughly 2.7x faster than individual inserts in the benchmarks below) | ||
| const entries = []; | ||
| for (let i = 0; i < 10000; i++) { | ||
| entries.push({ | ||
| vector: new Float32Array(384).map(() => Math.random()), | ||
| id: `vec_${i}`, | ||
| metadata: { index: i, batch: Math.floor(i / 100) } | ||
| }); | ||
| } | ||
| const ids = db.insertBatch(entries); | ||
| console.log(`Inserted ${ids.length} vectors in batch`); | ||
| // Run multiple searches back to back (each db.search call is synchronous) | ||
| const queries = Array.from({ length: 100 }, () => | ||
| new Float32Array(384).map(() => Math.random()) | ||
| ); | ||
| const allResults = queries.map(query => db.search(query, 10)); | ||
| console.log(`Completed ${allResults.length} searches`); | ||
| ``` | ||
| ### Memory Management Best Practices | ||
| ```javascript | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Reuse Float32Array buffers to reduce GC pressure | ||
| const buffer = new Float32Array(384); | ||
| // Insert with reused buffer | ||
| for (let i = 0; i < 1000; i++) { | ||
| // Fill buffer with new data | ||
| for (let j = 0; j < 384; j++) { | ||
| buffer[j] = Math.random(); | ||
| } | ||
| db.insert(buffer, `vec_${i}`, { index: i }); | ||
| // Buffer is copied internally, safe to reuse | ||
| } | ||
| // Check memory usage | ||
| const vectorCount = db.len(); | ||
| const isEmpty = db.isEmpty(); | ||
| const dimensions = db.dimensions; | ||
| console.log(`Vectors: ${vectorCount}, Dims: ${dimensions}`); | ||
| // Release WASM memory explicitly when done; wasm-bindgen classes expose free() | ||
| db.free(); | ||
| ``` | ||
| ## 📊 Performance Benchmarks | ||
| ### Browser Performance (Chrome 120 on M1 MacBook Pro) | ||
| | Operation | Vectors | Dimensions | Standard | SIMD | Speedup | | ||
| |-----------|---------|------------|----------|------|---------| | ||
| | **Insert (individual)** | 10,000 | 384 | 3.2s | 1.1s | 2.9x | | ||
| | **Insert (batch)** | 10,000 | 384 | 1.2s | 0.4s | 3.0x | | ||
| | **Search (k=10)** | 100 queries | 384 | 0.5s | 0.2s | 2.5x | | ||
| | **Search (k=100)** | 100 queries | 384 | 1.8s | 0.7s | 2.6x | | ||
| | **Delete** | 1,000 | 384 | 0.2s | 0.1s | 2.0x | | ||
| ### Throughput Comparison | ||
| | Operation | Ruvector WASM | TensorFlow.js | ml5.js | | ||
| |-----------|---------------|---------------|--------| | ||
| | Insert (ops/sec) | 25,000 | 5,000 | 1,200 | | ||
| | Search (queries/sec) | 500 | 80 | 20 | | ||
| | Memory (10K vectors) | ~50MB | ~200MB | ~150MB | | ||
| | Bundle Size (gzipped) | 380KB | 800KB | 450KB | | ||
| | Offline Support | ✅ | Partial | ❌ | | ||
| | SIMD Acceleration | ✅ | ❌ | ❌ | | ||
| ### Real-World Application Performance | ||
| **Semantic Search (10,000 documents, 384-dim embeddings)** | ||
| - Cold start: ~800ms (WASM compile + data load) | ||
| - Warm query: <5ms (with HNSW index) | ||
| - IndexedDB load: ~2s (10,000 vectors) | ||
| - Memory footprint: ~60MB | ||
| **Recommendation Engine (100,000 items, 128-dim embeddings)** | ||
| - Initial load: ~8s from IndexedDB | ||
| - Query latency: <10ms (p50) | ||
| - Memory usage: ~180MB | ||
| - Bundle impact: +400KB gzipped | ||
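The memory figures above cover more than the raw vectors. As a rough cross-check, the Float32 payload alone is `count × dimensions × 4` bytes; whatever remains is presumably index structures, metadata, and runtime overhead (an assumption, since this README gives no breakdown). The helper name below is illustrative.

```javascript
// Raw Float32 payload: count × dimensions × 4 bytes (index and metadata excluded).
function rawVectorBytes(count, dimensions) {
  return count * dimensions * Float32Array.BYTES_PER_ELEMENT;
}

const MB = 1_000_000; // decimal megabytes

console.log(rawVectorBytes(10_000, 384) / MB);  // 15.36
console.log(rawVectorBytes(100_000, 128) / MB); // 51.2
```

So the 10,000-document scenario holds about 15 MB of raw vector data inside its ~60MB footprint, and the 100,000-item scenario about 51 MB inside ~180MB.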
| ## 🌐 Browser Compatibility | ||
| ### Support Matrix | ||
| | Browser | Version | WASM | SIMD | Workers | IndexedDB | Status | | ||
| |---------|---------|------|------|---------|-----------|--------| | ||
| | **Chrome** | 91+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Firefox** | 89+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Safari** | 16.4+ | ✅ | Partial | ✅ | ✅ | Limited SIMD | | ||
| | **Edge** | 91+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Opera** | 77+ | ✅ | ✅ | ✅ | ✅ | Full Support | | ||
| | **Samsung Internet** | 15+ | ✅ | ❌ | ✅ | ✅ | No SIMD | | ||
| ### SIMD Support Detection | ||
| ```javascript | ||
| import { detectSIMD } from '@ruvector/wasm'; | ||
| if (detectSIMD()) { | ||
| console.log('SIMD acceleration available!'); | ||
| // Load SIMD-optimized build | ||
| await import('@ruvector/wasm/pkg-simd/ruvector_wasm.js'); | ||
| } else { | ||
| console.log('Standard build'); | ||
| // Load standard build | ||
| await import('@ruvector/wasm'); | ||
| } | ||
| ``` | ||
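This README does not show how `detectSIMD()` works internally. A common approach, sketched here as an assumption rather than the package's actual implementation, is to ask the engine to validate a tiny module that uses a `v128` instruction. The byte sequence is the probe used by the wasm-feature-detect project; `hasWasmSimd` is a hypothetical name.

```javascript
// Tiny module whose only function returns a v128 value; engines without
// WASM SIMD reject it during validation.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,                  // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123,                       // type section: () -> v128
  3, 2, 1, 0,                                   // function section: one function, type 0
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11  // code section: i32.const 0, i8x16.splat, one more v128 op, end
]);

// Returns true when the current engine accepts SIMD instructions.
function hasWasmSimd() {
  return typeof WebAssembly === 'object' && WebAssembly.validate(SIMD_PROBE);
}

console.log(`WASM SIMD available: ${hasWasmSimd()}`);
```

`WebAssembly.validate` is synchronous and does not compile or run the module, so the check is cheap enough to do before choosing which build to import.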
| ### Polyfills and Fallbacks | ||
| ```javascript | ||
| // Check for required features | ||
| const hasWASM = typeof WebAssembly !== 'undefined'; | ||
| const hasWorkers = typeof Worker !== 'undefined'; | ||
| const hasIndexedDB = typeof indexedDB !== 'undefined'; | ||
| if (!hasWASM) { | ||
| console.error('WebAssembly not supported'); | ||
| // Fallback to server-side processing | ||
| } | ||
| if (!hasWorkers) { | ||
| console.warn('Web Workers not available, using main thread'); | ||
| // Use synchronous API | ||
| } | ||
| if (!hasIndexedDB) { | ||
| console.warn('IndexedDB not available, data will not persist'); | ||
| // Use in-memory only | ||
| } | ||
| ``` | ||
| ## 📦 Bundle Size | ||
| ### Production Build Sizes | ||
| | Build Type | Uncompressed | Gzipped | Brotli | | ||
| |------------|--------------|---------|--------| | ||
| | Standard WASM | 1.2 MB | 450 KB | 380 KB | | ||
| | SIMD WASM | 1.3 MB | 480 KB | 410 KB | | ||
| | JavaScript Glue | 45 KB | 12 KB | 9 KB | | ||
| | TypeScript Definitions | 8 KB | 2 KB | 1.5 KB | | ||
| | **Total (Standard)** | 1.25 MB | 462 KB | 390 KB | | ||
| | **Total (SIMD)** | 1.35 MB | 492 KB | 420 KB | | ||
| ### With Optimizations (wasm-opt) | ||
| ```bash | ||
| npm run optimize | ||
| ``` | ||
| | Optimized Build | Uncompressed | Gzipped | Brotli | | ||
| |-----------------|--------------|---------|--------| | ||
| | Standard WASM | 900 KB | 380 KB | 320 KB | | ||
| | SIMD WASM | 980 KB | 410 KB | 350 KB | | ||
| ### Code Splitting Strategy | ||
| ```javascript | ||
| // Lazy load WASM module when needed | ||
| const loadVectorDB = async () => { | ||
| const { default: init, VectorDB } = await import('@ruvector/wasm'); | ||
| await init(); | ||
| return VectorDB; | ||
| }; | ||
| // Use in your application | ||
| button.addEventListener('click', async () => { | ||
| const VectorDB = await loadVectorDB(); | ||
| const db = new VectorDB(384, 'cosine', true); | ||
| // Use db... | ||
| }); | ||
| ``` | ||
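With the lazy-loading pattern above, every click calls `loadVectorDB()` again. Dynamic `import()` results are cached by the module system, but `init()` would still be invoked each time. A small memoizing wrapper avoids that; `once` is a hypothetical helper, and the loader used here is a stand-in so the sketch runs without the package installed.

```javascript
// Wraps an async loader so it runs at most once; later calls share the same promise.
function once(loader) {
  let pending = null;
  return () => {
    if (!pending) {
      pending = Promise.resolve()
        .then(loader)
        .catch((err) => {
          pending = null; // allow a retry after a failed load
          throw err;
        });
    }
    return pending;
  };
}

// Stand-in loader for demonstration; in an app this would perform the
// dynamic import('@ruvector/wasm') and init() call shown above.
let loads = 0;
const loadModule = once(async () => {
  loads += 1;
  return { name: 'fake-module' };
});

Promise.all([loadModule(), loadModule()]).then(([first, second]) => {
  console.log(first === second, loads); // true 1
});
```

Resetting `pending` on failure is a design choice: it lets a transient network error be retried on the next click instead of caching the rejection forever.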
| ## 🔨 Building from Source | ||
| ### Prerequisites | ||
| - **Rust**: 1.77 or higher | ||
| - **wasm-pack**: Latest version | ||
| - **Node.js**: 18.0 or higher | ||
| ```bash | ||
| # Install wasm-pack | ||
| curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh | ||
| # Or via npm | ||
| npm install -g wasm-pack | ||
| ``` | ||
| ### Build Commands | ||
| ```bash | ||
| # Clone repository | ||
| git clone https://github.com/ruvnet/ruvector.git | ||
| cd ruvector/crates/ruvector-wasm | ||
| # Install dependencies | ||
| npm install | ||
| # Build for web (ES modules) | ||
| npm run build:web | ||
| # Build with SIMD optimizations | ||
| npm run build:simd | ||
| # Build for Node.js | ||
| npm run build:node | ||
| # Build for bundlers (webpack, rollup, etc.) | ||
| npm run build:bundler | ||
| # Build all targets | ||
| npm run build:all | ||
| # Run tests in browser | ||
| npm test | ||
| # Run tests in Node.js | ||
| npm run test:node | ||
| # Check bundle size | ||
| npm run size | ||
| # Optimize with wasm-opt (requires binaryen) | ||
| npm run optimize | ||
| # Serve examples locally | ||
| npm run serve | ||
| ``` | ||
| ### Development Workflow | ||
| ```bash | ||
| # Development build with SIMD (no watch mode is built in; re-run on change or pair with a file watcher) | ||
| wasm-pack build --dev --target web -- --features simd | ||
| # Run specific browser tests | ||
| npm run test:firefox | ||
| # Profile WASM performance | ||
| wasm-pack build --profiling --target web | ||
| # Generate documentation | ||
| cargo doc --no-deps --open | ||
| ``` | ||
| ### Custom Build Configuration | ||
| ```toml | ||
| # .cargo/config.toml | ||
| [target.wasm32-unknown-unknown] | ||
| rustflags = [ | ||
| "-C", "opt-level=z", | ||
| "-C", "lto=fat", | ||
| "-C", "codegen-units=1" | ||
| ] | ||
| ``` | ||
| ## 📚 API Reference | ||
| ### VectorDB Class | ||
| ```typescript | ||
| class VectorDB { | ||
| constructor( | ||
| dimensions: number, | ||
| metric?: 'euclidean' | 'cosine' | 'dotproduct' | 'manhattan', | ||
| useHnsw?: boolean | ||
| ); | ||
| // Insert operations | ||
| insert(vector: Float32Array, id?: string, metadata?: object): string; | ||
| insertBatch(entries: VectorEntry[]): string[]; | ||
| // Search operations | ||
| search(query: Float32Array, k: number, filter?: object): SearchResult[]; | ||
| // Retrieval operations | ||
| get(id: string): VectorEntry | null; | ||
| len(): number; | ||
| isEmpty(): boolean; | ||
| // Delete operations | ||
| delete(id: string): boolean; | ||
| // Persistence (IndexedDB) | ||
| saveToIndexedDB(): Promise<void>; | ||
| static loadFromIndexedDB(dbName: string): Promise<VectorDB>; | ||
| // Properties | ||
| readonly dimensions: number; | ||
| } | ||
| ``` | ||
| ### Types | ||
| ```typescript | ||
| interface VectorEntry { | ||
| id?: string; | ||
| vector: Float32Array; | ||
| metadata?: Record<string, any>; | ||
| } | ||
| interface SearchResult { | ||
| id: string; | ||
| score: number; | ||
| vector?: Float32Array; | ||
| metadata?: Record<string, any>; | ||
| } | ||
| ``` | ||
| ### Utility Functions | ||
| ```typescript | ||
| // Detect SIMD support | ||
| function detectSIMD(): boolean; | ||
| // Get version | ||
| function version(): string; | ||
| // Array conversion | ||
| function arrayToFloat32Array(arr: number[]): Float32Array; | ||
| // Benchmarking | ||
| function benchmark(name: string, iterations: number, dimensions: number): number; | ||
| ``` | ||
| See [WASM API Documentation](../../docs/getting-started/wasm-api.md) for complete reference. | ||
| ## 🎯 Example Applications | ||
| ### Semantic Search Engine | ||
| ```javascript | ||
| // Semantic search with OpenAI embeddings | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| import { Configuration, OpenAIApi } from 'openai'; | ||
| await init(); | ||
| const openai = new OpenAIApi(new Configuration({ | ||
| apiKey: process.env.OPENAI_API_KEY | ||
| })); | ||
| const db = new VectorDB(1536, 'cosine', true); // OpenAI ada-002 = 1536 dims | ||
| // Index documents | ||
| const documents = [ | ||
| 'The quick brown fox jumps over the lazy dog', | ||
| 'Machine learning is a subset of artificial intelligence', | ||
| 'WebAssembly enables high-performance web applications' | ||
| ]; | ||
| for (const [i, doc] of documents.entries()) { | ||
| const response = await openai.createEmbedding({ | ||
| model: 'text-embedding-ada-002', | ||
| input: doc | ||
| }); | ||
| const embedding = new Float32Array(response.data.data[0].embedding); | ||
| db.insert(embedding, `doc_${i}`, { text: doc }); | ||
| } | ||
| // Search | ||
| const queryResponse = await openai.createEmbedding({ | ||
| model: 'text-embedding-ada-002', | ||
| input: 'What is AI?' | ||
| }); | ||
| const queryEmbedding = new Float32Array(queryResponse.data.data[0].embedding); | ||
| const results = db.search(queryEmbedding, 3); | ||
| results.forEach(result => { | ||
| console.log(`${result.score.toFixed(4)}: ${result.metadata.text}`); | ||
| }); | ||
| ``` | ||
| ### Offline Recommendation Engine | ||
| ```javascript | ||
| // Product recommendations that work offline | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| import { IndexedDBPersistence } from '@ruvector/wasm/indexeddb'; | ||
| await init(); | ||
| const db = new VectorDB(128, 'cosine', true); | ||
| const persistence = new IndexedDBPersistence('product_recommendations'); | ||
| await persistence.open(); | ||
| // Load cached recommendations | ||
| await persistence.loadAll(async (progress) => { | ||
| if (progress.vectors.length > 0) { | ||
| db.insertBatch(progress.vectors); | ||
| } | ||
| }); | ||
| // Get recommendations based on user history | ||
| function getRecommendations(userHistory, k = 10) { | ||
| // Compute user preference vector (average of liked items) | ||
| const userVector = computeAverageEmbedding(userHistory); | ||
| const recommendations = db.search(userVector, k); | ||
| return recommendations.map(r => ({ | ||
| productId: r.id, | ||
| score: r.score, | ||
| ...r.metadata | ||
| })); | ||
| } | ||
| // Add new products (syncs to IndexedDB) | ||
| async function addProduct(productId, embedding, metadata) { | ||
| db.insert(embedding, productId, metadata); | ||
| await persistence.saveVector(productId, embedding, metadata); | ||
| } | ||
| ``` | ||
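`computeAverageEmbedding` is referenced in the example above but not defined in this README. A minimal sketch follows, assuming `userHistory` is a non-empty array of equally sized `Float32Array` embeddings for the items the user liked; both the name's behavior and that input shape are assumptions.

```javascript
// Hypothetical helper: element-wise mean of equally sized embeddings.
function computeAverageEmbedding(embeddings) {
  if (embeddings.length === 0) {
    throw new Error('userHistory must contain at least one embedding');
  }
  const dims = embeddings[0].length;
  const mean = new Float32Array(dims);
  for (const embedding of embeddings) {
    if (embedding.length !== dims) {
      throw new Error('all embeddings must have the same length');
    }
    for (let i = 0; i < dims; i++) mean[i] += embedding[i];
  }
  for (let i = 0; i < dims; i++) mean[i] /= embeddings.length;
  return mean;
}

const liked = [new Float32Array([1, 0]), new Float32Array([0, 1])];
console.log(Array.from(computeAverageEmbedding(liked))); // [ 0.5, 0.5 ]
```

A plain mean treats every liked item equally; weighting recent items more heavily is a common variation but is outside what this README describes.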
| ### RAG (Retrieval-Augmented Generation) | ||
| ```javascript | ||
| // Browser-based RAG system | ||
| import init, { VectorDB } from '@ruvector/wasm'; | ||
| await init(); | ||
| const db = new VectorDB(768, 'cosine', true); // BERT embeddings | ||
| // Index knowledge base | ||
| const knowledgeBase = loadKnowledgeBase(); // Your documents | ||
| for (const doc of knowledgeBase) { | ||
| const embedding = await getBertEmbedding(doc.text); | ||
| db.insert(embedding, doc.id, { text: doc.text, source: doc.source }); | ||
| } | ||
| // RAG query function | ||
| async function ragQuery(question, llm) { | ||
| // 1. Get question embedding | ||
| const questionEmbedding = await getBertEmbedding(question); | ||
| // 2. Retrieve relevant context | ||
| const context = db.search(questionEmbedding, 5); | ||
| // 3. Augment prompt with context | ||
| const prompt = ` | ||
| Context: | ||
| ${context.map(r => r.metadata.text).join('\n\n')} | ||
| Question: ${question} | ||
| Answer based on the context above: | ||
| `; | ||
| // 4. Generate response | ||
| const response = await llm.generate(prompt); | ||
| return { | ||
| answer: response, | ||
| sources: context.map(r => r.metadata.source) | ||
| }; | ||
| } | ||
| ``` | ||
| ## 🐛 Troubleshooting | ||
| ### Common Issues | ||
| **1. WASM Module Not Loading** | ||
| ```javascript | ||
| // Ensure correct MIME type | ||
| // Add to server config (nginx): | ||
| // types { | ||
| // application/wasm wasm; | ||
| // } | ||
| // Or use explicit fetch | ||
| const wasmUrl = new URL('./pkg/ruvector_wasm_bg.wasm', import.meta.url); | ||
| await init(await fetch(wasmUrl)); | ||
| ``` | ||
| **2. CORS Errors** | ||
| ```javascript | ||
| // For local development, serve files over HTTP instead of opening them via file:// | ||
| // package.json | ||
| { | ||
| "scripts": { | ||
| "serve": "python3 -m http.server 8080 --bind 127.0.0.1" | ||
| } | ||
| } | ||
| ``` | ||
| **3. Memory Issues** | ||
| ```javascript | ||
| // Monitor memory usage | ||
| const vectorCount = db.len(); | ||
| const estimatedMemory = vectorCount * db.dimensions * 4; // bytes of raw Float32 data | ||
| if (estimatedMemory > 100_000_000) { // 100MB | ||
| console.warn('High memory usage, consider chunking'); | ||
| } | ||
| // Use batch operations to reduce GC pressure | ||
| const BATCH_SIZE = 1000; | ||
| for (let i = 0; i < entries.length; i += BATCH_SIZE) { | ||
| const batch = entries.slice(i, i + BATCH_SIZE); | ||
| db.insertBatch(batch); | ||
| } | ||
| ``` | ||
| **4. Web Worker Issues** | ||
| ```javascript | ||
| // Ensure worker script URL is correct | ||
| const workerUrl = new URL('./worker.js', import.meta.url); | ||
| const worker = new Worker(workerUrl, { type: 'module' }); | ||
| // Handle worker errors | ||
| worker.onerror = (error) => { | ||
| console.error('Worker error:', error); | ||
| }; | ||
| ``` | ||
| See [WASM Troubleshooting Guide](../../docs/getting-started/wasm-troubleshooting.md) for more solutions. | ||
| ## 🔗 Links & Resources | ||
| ### Documentation | ||
| - **[Getting Started Guide](../../docs/guide/GETTING_STARTED.md)** - Complete setup and usage | ||
| - **[WASM API Reference](../../docs/getting-started/wasm-api.md)** - Full API documentation | ||
| - **[Performance Tuning](../../docs/optimization/PERFORMANCE_TUNING_GUIDE.md)** - Optimization tips | ||
| - **[Main README](../../README.md)** - Project overview and features | ||
| ### Examples & Demos | ||
| - **[Vanilla JS Example](../../examples/wasm-vanilla/)** - Basic implementation | ||
| - **[React Demo](../../examples/wasm-react/)** - React integration with hooks | ||
| - **[Live Demo](https://ruvector-demo.vercel.app)** - Try it in your browser | ||
| - **[CodeSandbox](https://codesandbox.io/s/ruvector-wasm)** - Interactive playground | ||
| ### Community & Support | ||
| - **GitHub**: [github.com/ruvnet/ruvector](https://github.com/ruvnet/ruvector) | ||
| - **Discord**: [Join our community](https://discord.gg/ruvnet) | ||
| - **Twitter**: [@ruvnet](https://twitter.com/ruvnet) | ||
| - **Issues**: [Report bugs](https://github.com/ruvnet/ruvector/issues) | ||
| ## 📄 License | ||
| MIT License - see [LICENSE](../../LICENSE) for details. | ||
| Free to use for commercial and personal projects. | ||
| ## 🙏 Acknowledgments | ||
| - Built with [wasm-pack](https://github.com/rustwasm/wasm-pack) and [wasm-bindgen](https://github.com/rustwasm/wasm-bindgen) | ||
| - HNSW algorithm implementation from [hnsw_rs](https://github.com/jean-pierreBoth/hnswlib-rs) | ||
| - SIMD optimizations powered by Rust's excellent WebAssembly support | ||
| - The WebAssembly community for making this possible | ||
| --- | ||
| <div align="center"> | ||
| **Built by [rUv](https://ruv.io) • Open Source on [GitHub](https://github.com/ruvnet/ruvector)** | ||
| **Perfect for**: PWAs • Offline-First Apps • Edge Computing • Privacy-First AI | ||
| [Get Started](../../docs/guide/GETTING_STARTED.md) • [API Docs](../../docs/getting-started/wasm-api.md) • [Examples](../../examples/) | ||
| </div> |
-355
| /** | ||
| * IndexedDB Persistence Layer for Ruvector | ||
| * | ||
| * Provides: | ||
| * - Save/load database state to IndexedDB | ||
| * - Batch operations for performance | ||
| * - Progressive loading with pagination | ||
| * - LRU cache for hot vectors | ||
| */ | ||
| const DB_NAME = 'ruvector_storage'; | ||
| const DB_VERSION = 1; | ||
| const VECTOR_STORE = 'vectors'; | ||
| const META_STORE = 'metadata'; | ||
| /** | ||
| * LRU Cache for hot vectors | ||
| */ | ||
| class LRUCache { | ||
| constructor(capacity = 1000) { | ||
| this.capacity = capacity; | ||
| this.cache = new Map(); | ||
| } | ||
| get(key) { | ||
| if (!this.cache.has(key)) return null; | ||
| // Move to end (most recently used) | ||
| const value = this.cache.get(key); | ||
| this.cache.delete(key); | ||
| this.cache.set(key, value); | ||
| return value; | ||
| } | ||
| set(key, value) { | ||
| // Remove if exists | ||
| if (this.cache.has(key)) { | ||
| this.cache.delete(key); | ||
| } | ||
| // Add to end | ||
| this.cache.set(key, value); | ||
| // Evict oldest if over capacity | ||
| if (this.cache.size > this.capacity) { | ||
| const firstKey = this.cache.keys().next().value; | ||
| this.cache.delete(firstKey); | ||
| } | ||
| } | ||
| has(key) { | ||
| return this.cache.has(key); | ||
| } | ||
| clear() { | ||
| this.cache.clear(); | ||
| } | ||
| get size() { | ||
| return this.cache.size; | ||
| } | ||
| } | ||
| /** | ||
| * IndexedDB Persistence Manager | ||
| */ | ||
| export class IndexedDBPersistence { | ||
| constructor(dbName = null) { | ||
| this.dbName = dbName || DB_NAME; | ||
| this.db = null; | ||
| this.cache = new LRUCache(1000); | ||
| } | ||
| /** | ||
| * Open IndexedDB connection | ||
| */ | ||
| async open() { | ||
| return new Promise((resolve, reject) => { | ||
| const request = indexedDB.open(this.dbName, DB_VERSION); | ||
| request.onerror = () => reject(request.error); | ||
| request.onsuccess = () => { | ||
| this.db = request.result; | ||
| resolve(this.db); | ||
| }; | ||
| request.onupgradeneeded = (event) => { | ||
| const db = event.target.result; | ||
| // Create object stores if they don't exist | ||
| if (!db.objectStoreNames.contains(VECTOR_STORE)) { | ||
| const vectorStore = db.createObjectStore(VECTOR_STORE, { keyPath: 'id' }); | ||
| vectorStore.createIndex('timestamp', 'timestamp', { unique: false }); | ||
| } | ||
| if (!db.objectStoreNames.contains(META_STORE)) { | ||
| db.createObjectStore(META_STORE, { keyPath: 'key' }); | ||
| } | ||
| }; | ||
| }); | ||
| } | ||
| /** | ||
| * Save a single vector | ||
| */ | ||
| async saveVector(id, vector, metadata = null) { | ||
| if (!this.db) await this.open(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readwrite'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
| const data = { | ||
| id, | ||
| vector: Array.from(vector), // Convert Float32Array to regular array | ||
| metadata, | ||
| timestamp: Date.now() | ||
| }; | ||
| const request = store.put(data); | ||
| request.onsuccess = () => { | ||
| this.cache.set(id, { ...data, vector: Float32Array.from(vector) }); // cache a typed array so cache hits match loadVector() | ||
| resolve(id); | ||
| }; | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Save vectors in batch (more efficient) | ||
| */ | ||
| async saveBatch(entries, batchSize = 100) { | ||
| if (!this.db) await this.open(); | ||
| const chunks = []; | ||
| for (let i = 0; i < entries.length; i += batchSize) { | ||
| chunks.push(entries.slice(i, i + batchSize)); | ||
| } | ||
| for (const chunk of chunks) { | ||
| await new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readwrite'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
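| const written = []; | ||
| for (const entry of chunk) { | ||
| const data = { | ||
| id: entry.id, | ||
| vector: Array.from(entry.vector), | ||
| metadata: entry.metadata, | ||
| timestamp: Date.now() | ||
| }; | ||
| store.put(data); | ||
| written.push(data); | ||
| } | ||
| // Populate the cache only after the transaction commits, so a failed write is never cached | ||
| transaction.oncomplete = () => { | ||
| for (const data of written) { | ||
| this.cache.set(data.id, { ...data, vector: new Float32Array(data.vector) }); | ||
| } | ||
| resolve(); | ||
| }; | ||
| transaction.onerror = () => reject(transaction.error); | ||
| transaction.onabort = () => reject(transaction.error); | ||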
| }); | ||
| } | ||
| return entries.length; | ||
| } | ||
| /** | ||
| * Load a single vector by ID | ||
| */ | ||
| async loadVector(id) { | ||
| // Check cache first | ||
| if (this.cache.has(id)) { | ||
| return this.cache.get(id); | ||
| } | ||
| if (!this.db) await this.open(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readonly'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
| const request = store.get(id); | ||
| request.onsuccess = () => { | ||
| const data = request.result; | ||
| if (data) { | ||
| // Convert array back to Float32Array | ||
| data.vector = new Float32Array(data.vector); | ||
| this.cache.set(id, data); | ||
| } | ||
| resolve(data); | ||
| }; | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
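| /** | ||
| * Load all vectors progressively. | ||
| * Vectors are delivered to onProgress in batches of batchSize; the returned | ||
| * promise resolves with { count, complete } only, not with the vectors themselves. | ||
| */ | ||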
| async loadAll(onProgress = null, batchSize = 100) { | ||
| if (!this.db) await this.open(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readonly'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
| const request = store.openCursor(); | ||
| const vectors = []; | ||
| let count = 0; | ||
| request.onsuccess = (event) => { | ||
| const cursor = event.target.result; | ||
| if (cursor) { | ||
| const data = cursor.value; | ||
| data.vector = new Float32Array(data.vector); | ||
| vectors.push(data); | ||
| count++; | ||
| // Cache hot vectors (first 1000) | ||
| if (count <= 1000) { | ||
| this.cache.set(data.id, data); | ||
| } | ||
| // Report progress every batch | ||
| if (onProgress && count % batchSize === 0) { | ||
| onProgress({ | ||
| loaded: count, | ||
| vectors: [...vectors] | ||
| }); | ||
| vectors.length = 0; // Clear batch | ||
| } | ||
| cursor.continue(); | ||
| } else { | ||
| // Done | ||
| if (onProgress && vectors.length > 0) { | ||
| onProgress({ | ||
| loaded: count, | ||
| vectors: vectors, | ||
| complete: true | ||
| }); | ||
| } | ||
| resolve({ count, complete: true }); | ||
| } | ||
| }; | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Delete a vector by ID | ||
| */ | ||
| async deleteVector(id) { | ||
| if (!this.db) await this.open(); | ||
| this.cache.delete(id); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readwrite'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
| const request = store.delete(id); | ||
| request.onsuccess = () => resolve(true); | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Clear all vectors | ||
| */ | ||
| async clear() { | ||
| if (!this.db) await this.open(); | ||
| this.cache.clear(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readwrite'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
| const request = store.clear(); | ||
| request.onsuccess = () => resolve(); | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Get database statistics | ||
| */ | ||
| async getStats() { | ||
| if (!this.db) await this.open(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([VECTOR_STORE], 'readonly'); | ||
| const store = transaction.objectStore(VECTOR_STORE); | ||
| const request = store.count(); | ||
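| request.onsuccess = () => { | ||
| const total = request.result; | ||
| resolve({ | ||
| totalVectors: total, | ||
| cacheSize: this.cache.size, | ||
| // Fraction of stored vectors currently cached (not a measured hit rate); 0 for an empty store | ||
| cacheHitRate: total > 0 ? this.cache.size / total : 0 | ||
| }); | ||
| }; | ||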
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Save metadata | ||
| */ | ||
| async saveMeta(key, value) { | ||
| if (!this.db) await this.open(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([META_STORE], 'readwrite'); | ||
| const store = transaction.objectStore(META_STORE); | ||
| const request = store.put({ key, value }); | ||
| request.onsuccess = () => resolve(); | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Load metadata | ||
| */ | ||
| async loadMeta(key) { | ||
| if (!this.db) await this.open(); | ||
| return new Promise((resolve, reject) => { | ||
| const transaction = this.db.transaction([META_STORE], 'readonly'); | ||
| const store = transaction.objectStore(META_STORE); | ||
| const request = store.get(key); | ||
| request.onsuccess = () => { | ||
| const data = request.result; | ||
| resolve(data ? data.value : null); | ||
| }; | ||
| request.onerror = () => reject(request.error); | ||
| }); | ||
| } | ||
| /** | ||
| * Close the database connection | ||
| */ | ||
| close() { | ||
| if (this.db) { | ||
| this.db.close(); | ||
| this.db = null; | ||
| } | ||
| } | ||
| } | ||
| export default IndexedDBPersistence; |
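The slicing loop in `saveBatch` and the array conversion in `saveVector` and `loadVector` can be tried outside a browser. The following is a minimal sketch, not part of the package: `chunk`, `toRecord`, and `fromRecord` are hypothetical helper names that mirror the batching loop and the `Array.from` / `new Float32Array` round trip above.

```javascript
// Hypothetical helpers (not part of @ruvector/wasm) mirroring the persistence code above.

// Split entries into consecutive slices of at most batchSize items, as saveBatch does.
function chunk(entries, batchSize = 100) {
  // Assumption added for the sketch: reject sizes that would never advance the loop.
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < entries.length; i += batchSize) {
    chunks.push(entries.slice(i, i + batchSize));
  }
  return chunks;
}

// Build the stored record: the Float32Array becomes a plain array.
function toRecord(id, vector, metadata = null) {
  return { id, vector: Array.from(vector), metadata, timestamp: Date.now() };
}

// Restore the typed array when a record is read back.
function fromRecord(record) {
  return { ...record, vector: new Float32Array(record.vector) };
}

const record = toRecord('a', new Float32Array([0.5, 0.25]), { tag: 'demo' });
const restored = fromRecord(record);
const batches = chunk([1, 2, 3, 4, 5], 2);
console.log(Array.isArray(record.vector), restored.vector instanceof Float32Array, batches.length);
```

The guard on `batchSize` is a choice made for this sketch; the loop in `saveBatch` has no such check, and a `batchSize` of 0 would keep it from ever advancing.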
| /** | ||
| * Web Worker Pool Manager | ||
| * | ||
| * Manages a pool of workers for parallel vector operations. | ||
| * Supports: | ||
| * - Round-robin task distribution | ||
| * - Load balancing | ||
| * - Automatic worker initialization | ||
| * - Promise-based API | ||
| */ | ||
| export class WorkerPool { | ||
| constructor(workerUrl, wasmUrl, options = {}) { | ||
| this.workerUrl = workerUrl; | ||
| this.wasmUrl = wasmUrl; | ||
| this.poolSize = options.poolSize || navigator.hardwareConcurrency || 4; | ||
| this.workers = []; | ||
| this.nextWorker = 0; | ||
| this.pendingRequests = new Map(); | ||
| this.requestId = 0; | ||
| this.initialized = false; | ||
| this.options = options; | ||
| } | ||
| /** | ||
| * Initialize the worker pool | ||
| */ | ||
| async init() { | ||
| if (this.initialized) return; | ||
| console.log(`Initializing worker pool with ${this.poolSize} workers...`); | ||
| const initPromises = []; | ||
| for (let i = 0; i < this.poolSize; i++) { | ||
| const worker = new Worker(this.workerUrl, { type: 'module' }); | ||
| worker.onmessage = (e) => this.handleMessage(i, e); | ||
| worker.onerror = (error) => this.handleError(i, error); | ||
| this.workers.push({ | ||
| worker, | ||
| busy: false, | ||
| id: i | ||
| }); | ||
| // Initialize worker with WASM | ||
| const initPromise = this.sendToWorker(i, 'init', { | ||
| wasmUrl: this.wasmUrl, | ||
| dimensions: this.options.dimensions, | ||
| metric: this.options.metric, | ||
| useHnsw: this.options.useHnsw | ||
| }); | ||
| initPromises.push(initPromise); | ||
| } | ||
| await Promise.all(initPromises); | ||
| this.initialized = true; | ||
| console.log(`Worker pool initialized successfully`); | ||
| } | ||
| /** | ||
| * Handle message from worker | ||
| */ | ||
| handleMessage(workerId, event) { | ||
| const { type, requestId, data, error } = event.data; | ||
| const request = this.pendingRequests.get(requestId); | ||
| if (!request) return; | ||
| // The worker has finished this request whether it succeeded or failed, so mark it idle | ||
| this.workers[workerId].busy = false; | ||
| this.pendingRequests.delete(requestId); | ||
| if (type === 'error') { | ||
| request.reject(new Error(error.message)); | ||
| } else { | ||
| request.resolve(data); | ||
| } | ||
| } | ||
| /** | ||
| * Handle worker error | ||
| */ | ||
| handleError(workerId, error) { | ||
| console.error(`Worker ${workerId} error:`, error); | ||
| // Reject all pending requests for this worker | ||
| for (const [requestId, request] of this.pendingRequests) { | ||
| if (request.workerId === workerId) { | ||
| request.reject(error); | ||
| this.pendingRequests.delete(requestId); | ||
| } | ||
| } | ||
| // Nothing is pending on this worker any more, so it can accept work again | ||
| if (this.workers[workerId]) { | ||
| this.workers[workerId].busy = false; | ||
| } | ||
| } | ||
| /** | ||
| * Get next available worker (round-robin) | ||
| */ | ||
| getNextWorker() { | ||
| // Try to find an idle worker | ||
| for (let i = 0; i < this.workers.length; i++) { | ||
| const idx = (this.nextWorker + i) % this.workers.length; | ||
| if (!this.workers[idx].busy) { | ||
| this.nextWorker = (idx + 1) % this.workers.length; | ||
| return idx; | ||
| } | ||
| } | ||
| // All busy, use round-robin | ||
| const idx = this.nextWorker; | ||
| this.nextWorker = (this.nextWorker + 1) % this.workers.length; | ||
| return idx; | ||
| } | ||
| /** | ||
| * Send message to specific worker | ||
| */ | ||
| sendToWorker(workerId, type, data) { | ||
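| return new Promise((resolve, reject) => { | ||
| const requestId = this.requestId++; | ||
| // Timeout after 30 seconds; the timer is cleared as soon as the request settles | ||
| const timer = setTimeout(() => { | ||
| if (this.pendingRequests.has(requestId)) { | ||
| this.pendingRequests.delete(requestId); | ||
| // The pool may have been terminated in the meantime | ||
| if (this.workers[workerId]) { | ||
| this.workers[workerId].busy = false; | ||
| } | ||
| reject(new Error('Request timeout')); | ||
| } | ||
| }, 30000); | ||
| this.pendingRequests.set(requestId, { | ||
| resolve: (value) => { clearTimeout(timer); resolve(value); }, | ||
| reject: (reason) => { clearTimeout(timer); reject(reason); }, | ||
| workerId, | ||
| timestamp: Date.now() | ||
| }); | ||
| this.workers[workerId].busy = true; | ||
| this.workers[workerId].worker.postMessage({ | ||
| type, | ||
| data: { ...data, requestId } | ||
| }); | ||
| }); | ||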
| } | ||
| /** | ||
| * Execute operation on next available worker | ||
| */ | ||
| async execute(type, data) { | ||
| if (!this.initialized) { | ||
| await this.init(); | ||
| } | ||
| const workerId = this.getNextWorker(); | ||
| return this.sendToWorker(workerId, type, data); | ||
| } | ||
| /** | ||
| * Insert vector | ||
| */ | ||
| async insert(vector, id = null, metadata = null) { | ||
| return this.execute('insert', { vector, id, metadata }); | ||
| } | ||
| /** | ||
| * Insert batch of vectors | ||
| */ | ||
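| async insertBatch(entries) { | ||
| // sendToWorker is called directly below, so the workers must already exist | ||
| if (!this.initialized) await this.init(); | ||
| // Distribute batch across workers | ||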
| const chunkSize = Math.ceil(entries.length / this.poolSize); | ||
| const chunks = []; | ||
| for (let i = 0; i < entries.length; i += chunkSize) { | ||
| chunks.push(entries.slice(i, i + chunkSize)); | ||
| } | ||
| const promises = chunks.map((chunk, i) => | ||
| this.sendToWorker(i % this.poolSize, 'insertBatch', { entries: chunk }) | ||
| ); | ||
| const results = await Promise.all(promises); | ||
| return results.flat(); | ||
| } | ||
| /** | ||
| * Search for similar vectors | ||
| */ | ||
| async search(query, k = 10, filter = null) { | ||
| return this.execute('search', { query, k, filter }); | ||
| } | ||
| /** | ||
| * Parallel search across multiple queries | ||
| */ | ||
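| async searchBatch(queries, k = 10, filter = null) { | ||
| // sendToWorker is called directly below, so the workers must already exist | ||
| if (!this.initialized) await this.init(); | ||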
| const promises = queries.map((query, i) => | ||
| this.sendToWorker(i % this.poolSize, 'search', { query, k, filter }) | ||
| ); | ||
| return Promise.all(promises); | ||
| } | ||
| /** | ||
| * Delete vector | ||
| */ | ||
| async delete(id) { | ||
| return this.execute('delete', { id }); | ||
| } | ||
| /** | ||
| * Get vector by ID | ||
| */ | ||
| async get(id) { | ||
| return this.execute('get', { id }); | ||
| } | ||
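| /** | ||
| * Get database length as reported by the first worker only. | ||
| * Each worker creates its own VectorDB instance, so this is not a pool-wide total. | ||
| */ | ||
| async len() { | ||
| if (!this.initialized) await this.init(); | ||
| return this.sendToWorker(0, 'len', {}); | ||
| } | ||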
| /** | ||
| * Terminate all workers | ||
| */ | ||
| terminate() { | ||
| for (const { worker } of this.workers) { | ||
| worker.terminate(); | ||
| } | ||
| this.workers = []; | ||
| this.initialized = false; | ||
| console.log('Worker pool terminated'); | ||
| } | ||
| /** | ||
| * Get pool statistics | ||
| */ | ||
| getStats() { | ||
| return { | ||
| poolSize: this.poolSize, | ||
| busyWorkers: this.workers.filter(w => w.busy).length, | ||
| idleWorkers: this.workers.filter(w => !w.busy).length, | ||
| pendingRequests: this.pendingRequests.size | ||
| }; | ||
| } | ||
| } | ||
| export default WorkerPool; |
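The selection rule in `getNextWorker` (prefer the first idle worker at or after the cursor, otherwise fall back to plain round-robin) can be sketched as a pure function. This is an illustration only; `pickWorker` is a hypothetical name, and it returns the updated cursor instead of mutating `this.nextWorker`.

```javascript
// Hypothetical standalone version of the selection logic in getNextWorker.
// workers: array of { busy: boolean }; next: index where the scan starts.
// Returns the chosen index and the updated starting index.
function pickWorker(workers, next) {
  const n = workers.length;
  if (n === 0) throw new Error('pool is empty');
  for (let i = 0; i < n; i++) {
    const idx = (next + i) % n;
    if (!workers[idx].busy) {
      return { idx, next: (idx + 1) % n };
    }
  }
  // Every worker is busy: fall back to plain round-robin.
  return { idx: next, next: (next + 1) % n };
}

const workers = [{ busy: true }, { busy: false }, { busy: true }];
const first = pickWorker(workers, 0);           // skips busy worker 0, picks worker 1
workers[first.idx].busy = true;
const second = pickWorker(workers, first.next); // all busy now, takes the cursor position
console.log(first.idx, second.idx);
```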
-184
| /** | ||
| * Web Worker for parallel vector search operations | ||
| * | ||
| * This worker handles: | ||
| * - Vector search operations in parallel | ||
| * - Batch insert operations | ||
| * - Exchanging vectors as plain arrays (structured clone; no transfer list is passed to postMessage) | ||
| */ | ||
| // Import the WASM module | ||
| let wasmModule = null; | ||
| let vectorDB = null; | ||
| /** | ||
| * Dispatch incoming messages to the matching handler | ||
| */ | ||
| self.onmessage = async function(e) { | ||
| const { type, data } = e.data; | ||
| try { | ||
| switch (type) { | ||
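| case 'init': | ||
| await initWorker(data); | ||
| // Echo requestId so the pool can match this reply to its pending init request | ||
| self.postMessage({ type: 'init', requestId: data.requestId, success: true }); | ||
| break; | ||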
| case 'insert': | ||
| await handleInsert(data); | ||
| break; | ||
| case 'insertBatch': | ||
| await handleInsertBatch(data); | ||
| break; | ||
| case 'search': | ||
| await handleSearch(data); | ||
| break; | ||
| case 'delete': | ||
| await handleDelete(data); | ||
| break; | ||
| case 'get': | ||
| await handleGet(data); | ||
| break; | ||
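| case 'len': { | ||
| // Block scope keeps the declaration local to this case | ||
| const length = vectorDB.len(); | ||
| self.postMessage({ type: 'len', requestId: data.requestId, data: length }); | ||
| break; | ||
| } | ||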
| default: | ||
| throw new Error(`Unknown message type: ${type}`); | ||
| } | ||
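| } catch (error) { | ||
| self.postMessage({ | ||
| type: 'error', | ||
| // Echo requestId so the pool can reject the matching request; data may be missing on malformed messages | ||
| requestId: data?.requestId, | ||
| error: { | ||
| message: error.message, | ||
| stack: error.stack | ||
| } | ||
| }); | ||
| } | ||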
| }; | ||
| /** | ||
| * Initialize WASM module and VectorDB | ||
| */ | ||
| async function initWorker(config) { | ||
| const { wasmUrl, dimensions, metric, useHnsw } = config; | ||
| // Import WASM module | ||
| wasmModule = await import(wasmUrl); | ||
| // Initialize WASM | ||
| await wasmModule.default(); | ||
| // Create VectorDB instance | ||
| vectorDB = new wasmModule.VectorDB(dimensions, metric, useHnsw); | ||
| console.log(`Worker initialized with dimensions=${dimensions}, metric=${metric}, SIMD=${wasmModule.detectSIMD()}`); | ||
| } | ||
| /** | ||
| * Handle single vector insert | ||
| */ | ||
| async function handleInsert(data) { | ||
| const { vector, id, metadata, requestId } = data; | ||
| // Convert array to Float32Array if needed | ||
| const vectorArray = new Float32Array(vector); | ||
| const resultId = vectorDB.insert(vectorArray, id, metadata); | ||
| self.postMessage({ | ||
| type: 'insert', | ||
| requestId, | ||
| data: resultId | ||
| }); | ||
| } | ||
| /** | ||
| * Handle batch insert | ||
| */ | ||
| async function handleInsertBatch(data) { | ||
| const { entries, requestId } = data; | ||
| // Convert vectors to Float32Array | ||
| const processedEntries = entries.map(entry => ({ | ||
| vector: new Float32Array(entry.vector), | ||
| id: entry.id, | ||
| metadata: entry.metadata | ||
| })); | ||
| const ids = vectorDB.insertBatch(processedEntries); | ||
| self.postMessage({ | ||
| type: 'insertBatch', | ||
| requestId, | ||
| data: ids | ||
| }); | ||
| } | ||
| /** | ||
| * Handle vector search | ||
| */ | ||
| async function handleSearch(data) { | ||
| const { query, k, filter, requestId } = data; | ||
| // Convert query to Float32Array | ||
| const queryArray = new Float32Array(query); | ||
| const results = vectorDB.search(queryArray, k, filter); | ||
| // Convert results to plain objects | ||
| const plainResults = results.map(result => ({ | ||
| id: result.id, | ||
| score: result.score, | ||
| vector: result.vector ? Array.from(result.vector) : null, | ||
| metadata: result.metadata | ||
| })); | ||
| self.postMessage({ | ||
| type: 'search', | ||
| requestId, | ||
| data: plainResults | ||
| }); | ||
| } | ||
| /** | ||
| * Handle delete operation | ||
| */ | ||
| async function handleDelete(data) { | ||
| const { id, requestId } = data; | ||
| const deleted = vectorDB.delete(id); | ||
| self.postMessage({ | ||
| type: 'delete', | ||
| requestId, | ||
| data: deleted | ||
| }); | ||
| } | ||
| /** | ||
| * Handle get operation | ||
| */ | ||
| async function handleGet(data) { | ||
| const { id, requestId } = data; | ||
| const entry = vectorDB.get(id); | ||
| const plainEntry = entry ? { | ||
| id: entry.id, | ||
| vector: Array.from(entry.vector), | ||
| metadata: entry.metadata | ||
| } : null; | ||
| self.postMessage({ | ||
| type: 'get', | ||
| requestId, | ||
| data: plainEntry | ||
| }); | ||
| } |
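The pool settles a pending promise only when a reply carries the `requestId` that was sent with the request, which is why the handlers above echo it back. A minimal sketch of that correlation, independent of `Worker` and WASM, is shown here; `RequestTracker` and its methods are hypothetical names used for illustration.

```javascript
// Hypothetical sketch of request/response correlation by requestId.
class RequestTracker {
  constructor() {
    this.pending = new Map();
    this.nextId = 0;
  }

  // Register a request; the caller sends requestId along with the message.
  create() {
    const requestId = this.nextId++;
    const promise = new Promise((resolve, reject) => {
      this.pending.set(requestId, { resolve, reject });
    });
    return { requestId, promise };
  }

  // Settle the matching request; returns false when the reply cannot be matched.
  settle(message) {
    const entry = this.pending.get(message.requestId);
    if (!entry) return false;
    this.pending.delete(message.requestId);
    if (message.type === 'error') {
      entry.reject(new Error(message.error.message));
    } else {
      entry.resolve(message.data);
    }
    return true;
  }
}

const tracker = new RequestTracker();
const { requestId, promise } = tracker.create();
promise.then((value) => console.log('resolved with', value));
const unmatched = tracker.settle({ type: 'len', data: 3 });          // reply without requestId
const matched = tracker.settle({ type: 'len', requestId, data: 3 }); // reply echoing requestId
console.log(unmatched, matched);
```

A reply that omits `requestId` is dropped here without settling anything, so in a pool built this way the caller would only learn about it through a timeout.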
Empty package
Supply chain risk: Package does not contain any code. It may be removed, is name squatting, or the result of a faulty package publish.
Found 1 instance in 1 package
Major refactor
Supply chain risk: Package has recently undergone a major refactor. It may be unstable or indicate significant internal changes. Use caution when updating to versions that include significant changes.
Found 1 instance in 1 package
No README
Quality: Package does not have a README. This may indicate a failed publish or a low quality package.
Found 1 instance in 1 package
No repository
Supply chain risk: Package does not have a linked source code repository. Without this field, a package will have no reference to the location of the source code used to generate the package.
Found 1 instance in 1 package
AI-detected potential code anomaly
Supply chain risk: AI has identified unusual behaviors that may pose a security risk.
Found 1 instance in 1 package
Long strings
Supply chain risk: Contains long string literals, which may be a sign of obfuscated or packed code.
Found 1 instance in 1 package
No contributors or author data
Maintenance: Package does not specify a list of contributors or an author in package.json.
Found 1 instance in 1 package