
native-vector-store
High-performance vector store with SIMD optimization for MCP servers and local RAG applications.
📚 API Documentation | 📦 npm | 🐙 GitHub
This vector store is designed for immutable, one-time loading scenarios common in modern cloud deployments.
This design eliminates complex state management, ensures consistent performance, and aligns perfectly with cloud-native deployment patterns where domain-specific knowledge bases are the norm.
📊 Production Case Study: Real-world deployment with 65k documents (1.5GB) on AWS Lambda achieving 15-20s cold start and 40-45ms search latency.
npm install native-vector-store
Runtime Requirements:
- Ubuntu/Debian: sudo apt-get install libgomp1
- Fedora: dnf install libgomp
- Alpine: apk add libgomp
- macOS: brew install libomp

Prebuilt binaries are included for:
If building from source, you'll need:
const { VectorStore } = require('native-vector-store');
// Initialize with embedding dimensions (e.g., 1536 for OpenAI)
const store = new VectorStore(1536);
// Load documents from directory
store.loadDir('./documents'); // Automatically finalizes after loading
// Or add documents manually then finalize
const document = {
id: 'doc-1',
text: 'Example document text',
metadata: {
embedding: new Array(1536).fill(0).map(() => Math.random()),
category: 'example'
}
};
store.addDocument(document);
store.finalize(); // Must call before searching!
// Search for similar documents
const queryEmbedding = new Float32Array(1536);
// Option 1: Vector-only search (traditional)
const results = store.search(queryEmbedding, 5); // Top 5 results
// Option 2: Hybrid search (NEW - combines vector + BM25 text search)
const hybridResults = store.search(queryEmbedding, 5, "your search query text");
// Option 3: BM25 text-only search
const textResults = store.searchBM25("your search query", 5);
// Results format - array of SearchResult objects, sorted by score (highest first):
console.log(results);
// [
// {
// score: 0.987654, // Similarity score (0-1, higher = more similar)
// id: "doc-1", // Your document ID
// text: "Example document...", // Full document text
// metadata_json: "{\"embedding\":[0.1,0.2,...],\"category\":\"example\"}" // JSON string
// },
// { score: 0.943210, id: "doc-7", text: "Another doc...", metadata_json: "..." },
// // ... up to 5 results
// ]
// Parse metadata from the top result
const topResult = results[0];
const metadata = JSON.parse(topResult.metadata_json);
console.log(metadata.category); // "example"
// Initialize once during cold start
let store;
async function initializeStore() {
if (!store) {
store = new VectorStore(1536);
store.loadDir('./knowledge-base'); // Loads and finalizes
}
return store;
}
// Handler reuses the store across invocations
export async function handler(event) {
const store = await initializeStore();
const embedding = new Float32Array(event.embedding);
return store.search(embedding, 10);
}
const { VectorStore } = require('native-vector-store');
// Load different knowledge domains at startup
const stores = {
products: new VectorStore(1536),
support: new VectorStore(1536),
general: new VectorStore(1536)
};
stores.products.loadDir('./knowledge/products');
stores.support.loadDir('./knowledge/support');
stores.general.loadDir('./knowledge/general');
// Route searches to appropriate domain
server.on('search', (query) => {
const store = stores[query.domain] || stores.general;
const results = store.search(query.embedding, 5);
return results.filter(r => r.score > 0.7);
});
#!/usr/bin/env node
const { VectorStore } = require('native-vector-store');
// Load knowledge base once
const store = new VectorStore(1536);
store.loadDir(process.env.KNOWLEDGE_PATH || './docs');
// Interactive REPL with fast responses
const repl = require('repl');
const r = repl.start('> ');
r.context.search = (embedding, k = 5) => store.search(embedding, k);
Structure your documents by category for separate vector stores:
knowledge-base/
├── products/ # Product documentation
│ ├── api-reference.json
│ └── user-guide.json
├── support/ # Support articles
│ ├── faq.json
│ └── troubleshooting.json
└── context/ # Context-specific docs
├── company-info.json
└── policies.json
Load each category into its own VectorStore:
// Create separate stores for different domains
const productStore = new VectorStore(1536);
const supportStore = new VectorStore(1536);
const contextStore = new VectorStore(1536);
// Load each category independently
productStore.loadDir('./knowledge-base/products');
supportStore.loadDir('./knowledge-base/support');
contextStore.loadDir('./knowledge-base/context');
// Search specific domains
const productResults = productStore.search(queryEmbedding, 5);
const supportResults = supportStore.search(queryEmbedding, 5);
Each JSON file contains self-contained documents with embeddings:
{
"id": "unique-id", // Required: unique document identifier
"text": "Document content...", // Required: searchable text content (or use "content" for Spring AI)
"metadata": { // Required: metadata object
"embedding": [0.1, 0.2, ...], // Required: array of numbers matching vector dimensions
"category": "product", // Optional: additional metadata
"lastUpdated": "2024-01-01" // Optional: additional metadata
}
}
Spring AI Compatibility: You can use "content" instead of "text" for the document field. The library auto-detects which field name you're using from the first document and optimizes subsequent lookups.
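For example, the same document in Spring AI style just renames the text field (the embedding array is truncated here; it must still match your store's dimensions):

```json
{
  "id": "unique-id",
  "content": "Document content...",
  "metadata": {
    "embedding": [0.1, 0.2]
  }
}
```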
Common Mistakes:
- Putting embedding at the root level instead of inside metadata
- Missing required fields (id, text, or metadata)

Validate your JSON format:
node node_modules/native-vector-store/examples/validate-format.js your-file.json
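Before calling addDocument, you can also sanity-check the shape in code. This is a minimal hypothetical helper mirroring the documented format, not part of the library; the bundled validate-format.js script remains the authoritative check:

```javascript
// Hypothetical shape check mirroring the documented JSON format:
// id, text (or Spring AI's "content"), and metadata.embedding are required.
function validateDocument(doc, dimensions) {
  if (typeof doc.id !== 'string') return 'missing string "id"';
  if (typeof doc.text !== 'string' && typeof doc.content !== 'string')
    return 'missing "text" (or "content") field';
  if (typeof doc.metadata !== 'object' || doc.metadata === null)
    return 'missing "metadata" object';
  const emb = doc.metadata.embedding;
  if (!Array.isArray(emb) || emb.length !== dimensions)
    return `metadata.embedding must be a number array of length ${dimensions}`;
  return null; // valid
}

const ok = { id: 'doc-1', text: 'hi', metadata: { embedding: new Array(4).fill(0) } };
const bad = { id: 'doc-2', text: 'hi', embedding: [0, 0, 0, 0] }; // common mistake: embedding at root
console.log(validateDocument(ok, 4));  // null
console.log(validateDocument(bad, 4)); // 'missing "metadata" object'
```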
// Load new version without downtime
const newStore = new VectorStore(1536);
newStore.loadDir('./knowledge-base-v2');
// Atomic switch
app.locals.store = newStore;
deployments/
├── v1.0.0/
│ └── documents/
├── v1.1.0/
│ └── documents/
└── current -> v1.1.0 # Symlink to active version
const fs = require('fs');
function reloadStore() {
const newStore = new VectorStore(1536);
newStore.loadDir('./documents');
global.store = newStore;
console.log(`Reloaded ${newStore.size()} documents`);
}
// Initial load
reloadStore();
// Watch for changes in development
if (process.env.NODE_ENV === 'development') {
fs.watch('./documents', { recursive: true }, reloadStore);
}
The vector store now supports hybrid search, combining semantic similarity (vector search) with lexical matching (BM25 text search) for improved retrieval accuracy:
const { VectorStore } = require('native-vector-store');
const store = new VectorStore(1536);
store.loadDir('./documents');
// Hybrid search automatically combines vector and text search
const queryEmbedding = new Float32Array(1536);
const results = store.search(
queryEmbedding,
10, // Top 10 results
"machine learning algorithms" // Query text for BM25
);
// You can also use individual search methods
const vectorResults = store.searchVector(queryEmbedding, 10);
const textResults = store.searchBM25("machine learning", 10);
// Or explicitly control the hybrid weights
const customResults = store.searchHybrid(
queryEmbedding,
"machine learning",
10,
0.3, // Vector weight (30%)
0.7 // BM25 weight (70%)
);
// Tune BM25 parameters for your corpus
store.setBM25Parameters(
1.2, // k1: Term frequency saturation (default: 1.2)
0.75, // b: Document length normalization (default: 0.75)
1.0 // delta: Smoothing parameter (default: 1.0)
);
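To build intuition for these knobs, here is the standard BM25+-style per-term score they correspond to (illustrative only; the library's internal scoring may differ in detail):

```javascript
// Illustrative BM25+ per-term score: k1 controls term-frequency saturation,
// b controls document-length normalization, delta adds lower-bound smoothing.
function bm25TermScore(tf, docLen, avgDocLen, idf, k1 = 1.2, b = 0.75, delta = 1.0) {
  const lengthNorm = 1 - b + b * (docLen / avgDocLen);
  return idf * ((tf * (k1 + 1)) / (tf + k1 * lengthNorm) + delta);
}

console.log(bm25TermScore(3, 120, 100, 2.0).toFixed(3));          // default parameters
console.log(bm25TermScore(3, 120, 100, 2.0, 1.2, 0).toFixed(3));  // b = 0: length ignored
```

With b = 0 a long document is not penalized at all; raising k1 lets repeated terms keep adding score instead of saturating quickly.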
Hybrid search is particularly effective for:
Perfect for building local RAG capabilities in MCP servers:
const { MCPVectorServer } = require('native-vector-store/examples/mcp-server');
const server = new MCPVectorServer(1536);
// Load document corpus
await server.loadDocuments('./documents');
// Handle MCP requests
const response = await server.handleMCPRequest('vector_search', {
query: queryEmbedding,
k: 5,
threshold: 0.7
});
Full API documentation is available at:
https://mboros1.github.io/native-vector-store/{version}/ (e.g., /v0.3.0/)
You can also open a local copy: open node_modules/native-vector-store/docs/index.html
VectorStore
new VectorStore(dimensions: number)
Create a store for embeddings with the given number of dimensions.
loadDir(path: string): void
Load all JSON documents from a directory and automatically finalize the store. Files should contain document objects with embeddings.
addDocument(doc: Document): void
Add a single document to the store. Only works during the loading phase (before finalization).
interface Document {
id: string;
text: string;
metadata: {
embedding: number[];
[key: string]: any;
};
}
search(query: Float32Array, k: number, normalizeQuery?: boolean): SearchResult[]
Search for the k most similar documents. Returns an array sorted by score (highest first).
interface SearchResult {
score: number; // Cosine similarity (0-1, higher = more similar)
id: string; // Document ID
text: string; // Document text content
metadata_json: string; // JSON string with all metadata including embedding
}
// Example return value:
[
{
score: 0.98765,
id: "doc-123",
text: "Introduction to machine learning...",
metadata_json: "{\"embedding\":[0.1,0.2,...],\"author\":\"Jane Doe\",\"tags\":[\"ML\",\"intro\"]}"
},
{
score: 0.94321,
id: "doc-456",
text: "Deep learning fundamentals...",
metadata_json: "{\"embedding\":[0.3,0.4,...],\"difficulty\":\"intermediate\"}"
}
// ... more results
]
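For reference, the score is plain cosine similarity; on unit-length embeddings (which finalization produces) it reduces to a dot product. A small illustrative sketch:

```javascript
// Cosine similarity between two vectors; on normalized vectors this is
// just the dot product, which is what the SIMD search kernel computes.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / Math.sqrt(normA * normB);
}

console.log(cosine([1, 0], [1, 0])); // 1 (identical direction)
console.log(cosine([1, 0], [0, 1])); // 0 (orthogonal)
```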
finalize(): void
Finalize the store: normalize all embeddings and switch to serving mode. After this, no more documents can be added, but searches become available. Called automatically by loadDir().
isFinalized(): boolean
Check whether the store has been finalized and is ready for searching.
normalize(): void
Deprecated: use finalize() instead.
size(): number
Get the number of documents in the store.
The native-vector-store achieves exceptional performance through:
Performance on typical hardware (M1 MacBook Pro):
| Operation | Documents | Time | Throughput |
|---|---|---|---|
| Loading (from disk) | 10,000 | 153ms | 65k docs/sec |
| Loading (from disk) | 100,000 | ~560ms | 178k docs/sec |
| Loading (production) | 65,000 | 15-20s | 3.2-4.3k docs/sec |
| Search (k=10) | 10,000 corpus | 2ms | 500 queries/sec |
| Search (k=10) | 65,000 corpus | 40-45ms | 20-25 queries/sec |
| Search (k=100) | 100,000 corpus | 8-12ms | 80-125 queries/sec |
| Normalization | 100,000 | <100ms | 1M+ docs/sec |
Optimal File Organization:
Memory Considerations:
Approximate per-document memory: embedding_size * 4 bytes + metadata_size + text_size
Search Performance:
Corpus Size Optimization:
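As a rough sizing aid, the per-document memory formula above can be turned into a quick estimate. The text and metadata averages below are illustrative assumptions, not measurements:

```javascript
// Back-of-envelope corpus memory: embedding_size * 4 bytes (Float32)
// plus assumed average text and metadata sizes per document.
function estimateCorpusBytes(numDocs, dims, avgTextBytes = 1024, avgMetadataBytes = 256) {
  return numDocs * (dims * 4 + avgTextBytes + avgMetadataBytes);
}

const bytes = estimateCorpusBytes(100_000, 1536);
console.log((bytes / 1024 ** 3).toFixed(2) + ' GiB'); // ~0.69 GiB for 100k docs
```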
| Feature | native-vector-store | Faiss | ChromaDB | Pinecone |
|---|---|---|---|---|
| Load 100k docs | <1s | 2-5s | 30-60s | N/A (API) |
| Search latency | 1-2ms | 0.5-1ms | 50-200ms | 50-300ms |
| Memory efficiency | High | Medium | Low | N/A |
| Dependencies | Minimal | Heavy | Heavy | None |
| Deployment | Simple | Complex | Complex | SaaS |
| Sweet spot | <100k docs | Any size | Any size | Any size |
# Install dependencies
npm install
# Build native module
npm run build
# Run tests
npm test
# Run performance benchmarks
npm run benchmark
# Try MCP server example
npm run example
Ideal for building local RAG (Retrieval-Augmented Generation) capabilities:
Perfect for personal knowledge management systems:
Suitable for academic and research projects with focused datasets:
MIT License - see LICENSE file for details.
Performance on M1 MacBook Pro with 1536-dimensional embeddings:
| Operation | Document Count | Time | Rate |
|---|---|---|---|
| Load | 10,000 | 153ms | 65.4k docs/sec |
| Search | 10,000 | 2ms | 5M docs/sec |
| Normalize | 10,000 | 12ms | 833k docs/sec |
Results may vary based on hardware and document characteristics.