embeddb

A vector-based tag system for efficient similarity search and retrieval

latest

Source

npm

Version: 0.1.10

Version published: 12 months ago

Maintainers: 0

Created: last year

EmbedDB

Welcome to EmbedDB! It's a vector-based tag system written in TypeScript: you describe items with weighted tags, and EmbedDB finds the items whose tags are most similar to your query.

Features

Powerful vector-based similarity search
Weighted tags with confidence scores (You say it's important? It's important!)
Category weights for fine-tuned search (Control which categories matter more!)
Batch operations (Handle lots of data at once, super efficient!)
Built-in query caching (Repeated queries? Lightning fast!)
Full TypeScript support (Type-safe, developer-friendly!)
Memory-efficient sparse vector implementation (Your RAM will thank you!)
Import/Export functionality (Save and restore your indexes!)
Pagination support with filter-first approach (Get filtered results in chunks!)
Advanced filtering system (Filter first, sort by similarity!)

Quick Start

First, install the package:

npm install embeddb

Let's see it in action:

import { TagVectorSystem, Tag, IndexTag } from 'embeddb';

// Create a new system
const system = new TagVectorSystem();

// Define our tag universe
const tags: IndexTag[] = [
    { category: 'color', value: 'red' },    // Red is rad!
    { category: 'color', value: 'blue' },   // Blue is cool!
    { category: 'size', value: 'large' }    // Size matters!
];

// Build the tag index (important step!)
system.buildIndex(tags);

// Add an item with its tags and confidence scores
const item = {
    id: 'cool-item-1',
    tags: [
        { category: 'color', value: 'red', confidence: 1.0 },   // 100% sure it's red!
        { category: 'size', value: 'large', confidence: 0.8 }   // Pretty sure it's large
    ]
};
system.addItem(item);

// Set category weights to prioritize color matches
system.setCategoryWeight('color', 2.0); // Color matches are twice as important

// Let's find similar items
const query = {
    tags: [
        { category: 'color', value: 'red', confidence: 0.9 }
    ]
};

// Query with pagination
const results = system.query(query.tags, { page: 1, size: 10 }); // Get first 10 results

// Export the index for later use
const exportedData = system.exportIndex();

// Import the index in another instance
const newSystem = new TagVectorSystem();
newSystem.importIndex(exportedData);

API Reference

TagVectorSystem Class

This is the main class. It builds the tag index, stores items, and answers queries.

Core Methods

buildIndex(tags: IndexTag[]): Build your tag universe

// Define your tag world!
system.buildIndex([
  { category: 'color', value: 'red' },
  { category: 'style', value: 'modern' }
]);

addItem(item: ItemTags): Add a single item

// Add something awesome
system.addItem({
  id: 'awesome-item',
  tags: [
    { category: 'color', value: 'red', confidence: 1.0 }
  ]
});

addItemBatch(items: ItemTags[], batchSize?: number): Batch add items

// Add multiple items at once for better performance!
// item1, item2, and item3 are ItemTags objects like the one passed to addItem above
system.addItemBatch([item1, item2, item3], 10);
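
The README does not say how `batchSize` is used internally. One plausible reading is that items are processed in groups of at most `batchSize`. The sketch below shows that idea with a hypothetical `chunk` helper. It is an assumption for illustration, not part of EmbedDB's API.

```typescript
// Hypothetical helper, not part of EmbedDB: split a list into groups of at most batchSize.
function chunk<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    // slice stops at the end of the array, so the last batch may be shorter
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// Five items with a batch size of 2 give two full batches and one with the remainder.
const batches = chunk(['a', 'b', 'c', 'd', 'e'], 2);
console.log(batches.length);
```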

query(tags: Tag[], options?: QueryOptions): Search for similar items

// Find similar stuff
const results = system.query([
  { category: 'style', value: 'modern', confidence: 0.9 }
], { page: 1, size: 20 });

queryFirst(tags: Tag[]): Get the most similar item

// Just get the best match
const bestMatch = system.queryFirst([
  { category: 'color', value: 'red', confidence: 1.0 }
]);

getStats(): Get system statistics

// Check out the system stats
const stats = system.getStats();
console.log(`Total items: ${stats.totalItems}`);

exportIndex() & importIndex(): Export/Import index data

// Save your data for later
const data = system.exportIndex();
// ... later ...
system.importIndex(data);

setCategoryWeight(category: string, weight: number): Set category weight

// Make color matches twice as important
system.setCategoryWeight('color', 2.0);

Development

Want to contribute? Awesome! Here are some handy commands:

# Install dependencies
npm install

# Build the project
npm run build

# Run tests (we love testing!)
npm test

# Check code style
npm run lint

# Make the code pretty
npm run format

How It Works

EmbedDB makes similarity search possible by turning tags into vectors:

Tag Indexing:
- Each category-value pair gets mapped to a unique vector position
- This lets us transform tags into numerical vectors
Vector Transformation:
- Item tags are converted into sparse vectors
- Confidence scores are used as vector weights
Similarity Calculation:
- Uses cosine similarity to measure vector relationships
- This helps us find the most similar items
Performance Optimizations:
- Sparse vectors for memory efficiency
- Query caching for speed
- Batch operations for better throughput
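
The first three steps above can be sketched in a few lines of TypeScript. This is a minimal illustration, not EmbedDB's actual source: the helper names `buildTagIndex`, `toSparseVector`, and `cosineSimilarity` are hypothetical, the tag shapes are inferred from the Quick Start example, and category weights and caching are left out.

```typescript
// Shapes inferred from the README examples; the library's real types may differ.
type SketchIndexTag = { category: string; value: string };
type SketchTag = SketchIndexTag & { confidence: number };
type SparseVector = Map<number, number>; // position -> weight, zeros are not stored

// Tag indexing: give every category-value pair a unique vector position.
function buildTagIndex(tags: SketchIndexTag[]): Map<string, number> {
  const positions = new Map<string, number>();
  for (let i = 0; i < tags.length; i++) {
    const key = `${tags[i].category}:${tags[i].value}`;
    if (!positions.has(key)) positions.set(key, positions.size);
  }
  return positions;
}

// Vector transformation: confidence scores become the non-zero entries.
function toSparseVector(tags: SketchTag[], positions: Map<string, number>): SparseVector {
  const vector: SparseVector = new Map();
  for (let i = 0; i < tags.length; i++) {
    const position = positions.get(`${tags[i].category}:${tags[i].value}`);
    if (position !== undefined) vector.set(position, tags[i].confidence); // unknown tags are skipped
  }
  return vector;
}

// Similarity calculation: cosine similarity, touching only the stored entries.
function cosineSimilarity(a: SparseVector, b: SparseVector): number {
  let dot = 0;
  a.forEach((weight, position) => {
    dot += weight * (b.get(position) ?? 0);
  });
  const norm = (v: SparseVector): number => {
    let sumOfSquares = 0;
    v.forEach((weight) => { sumOfSquares += weight * weight; });
    return Math.sqrt(sumOfSquares);
  };
  const denominator = norm(a) * norm(b);
  return denominator === 0 ? 0 : dot / denominator; // an empty vector matches nothing
}

const tagIndex = buildTagIndex([
  { category: 'color', value: 'red' },
  { category: 'color', value: 'blue' },
  { category: 'size', value: 'large' },
]);
const itemVector = toSparseVector([
  { category: 'color', value: 'red', confidence: 1.0 },
  { category: 'size', value: 'large', confidence: 0.8 },
], tagIndex);
const queryVector = toSparseVector([
  { category: 'color', value: 'red', confidence: 0.9 },
], tagIndex);

// 1 / sqrt(1.64), roughly 0.78: the item matches on color but also carries a size tag
const similarity = cosineSimilarity(itemVector, queryVector);
console.log(similarity.toFixed(2));
```

Because the dot product only visits entries stored in the first vector, the cost of a comparison depends on how many tags an item has, not on the size of the tag universe.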

Technical Details

Under the hood, EmbedDB uses several clever techniques:

Sparse Vector Implementation
- Only stores non-zero values
- Reduces memory footprint
- Perfect for tag-based systems where most values are zero
Cosine Similarity
- Measures angle between vectors
- Raw range: -1 to 1 (EmbedDB normalizes scores to 0 to 1)
- Used only for sorting, not filtering
- Ideal for high-dimensional sparse spaces
Filter-First Architecture
- Filters are applied before similarity calculation
- The number of results is determined by the filters alone
- Similarity scores used purely for sorting
- Efficient for large datasets
Category Weight Management
- Fine-grained control over category importance
- Individual and batch weight updates
- Default weights for unknown categories
- Automatic cache invalidation on weight changes
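
To make the filter-first flow and the weight-triggered cache invalidation concrete, here is a small sketch. Everything in it is hypothetical: the class `FilterFirstSketch`, the `SketchItem` shape, and the caller-supplied cache key are assumptions for illustration, and a weighted sum of per-category match strengths stands in for the weighted cosine similarity EmbedDB describes.

```typescript
// Hypothetical shapes for illustration; EmbedDB's real types and internals may differ.
type SketchItem = {
  id: string;
  inStock: boolean;                // an arbitrary field to filter on
  matches: Record<string, number>; // category -> match strength against the query
};

class FilterFirstSketch {
  private items: SketchItem[];
  private defaultWeight: number;
  private weights = new Map<string, number>();
  private cache = new Map<string, SketchItem[]>();

  constructor(items: SketchItem[], defaultWeight = 1.0) {
    this.items = items;
    this.defaultWeight = defaultWeight;
  }

  // A new weight can reorder results, so every cached ordering is discarded.
  setCategoryWeight(category: string, weight: number): void {
    this.weights.set(category, weight);
    this.cache.clear();
  }

  // Weighted sum of match strengths; categories without a weight use the default.
  private score(item: SketchItem): number {
    return Object.keys(item.matches).reduce((sum, category) => {
      const weight = this.weights.get(category) ?? this.defaultWeight;
      return sum + weight * (item.matches[category] ?? 0);
    }, 0);
  }

  // Filter first, sort the survivors by score, then slice out the requested page.
  query(filter: (item: SketchItem) => boolean, cacheKey: string, page: number, size: number) {
    let ordered = this.cache.get(cacheKey);
    if (ordered === undefined) {
      ordered = this.items.filter(filter).sort((a, b) => this.score(b) - this.score(a));
      this.cache.set(cacheKey, ordered);
    }
    const start = (page - 1) * size; // pages are 1-based, as in the README examples
    return { total: ordered.length, items: ordered.slice(start, start + size) };
  }
}

const sketch = new FilterFirstSketch([
  { id: 'a', inStock: true, matches: { color: 0.5, size: 0.9 } },
  { id: 'b', inStock: true, matches: { color: 0.9, size: 0.2 } },
  { id: 'c', inStock: false, matches: { color: 1.0, size: 1.0 } },
]);
const onlyInStock = (item: SketchItem) => item.inStock;

// Item 'c' is filtered out before scoring, so total is 2 no matter how well it matches.
const beforeWeights = sketch.query(onlyInStock, 'in-stock', 1, 1);
sketch.setCategoryWeight('color', 2.0); // clears the cache, so the next query re-sorts
const afterWeights = sketch.query(onlyInStock, 'in-stock', 1, 1);
console.log(beforeWeights.items.map((i) => i.id), afterWeights.items.map((i) => i.id));
```

With default weights, item `a` ranks first (1.4 against 1.1). Once color counts double, item `b` overtakes it (2.0 against 1.9), which is why a cached ordering cannot be reused after a weight change.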

License

MIT License - Go wild, build awesome stuff!

Need Help?

Got questions or suggestions? We'd love to hear from you:

Open an Issue
Submit a PR

Let's make EmbedDB even more awesome!

Star Us!

If you find EmbedDB useful, give us a star! It helps others discover this project and motivates us to keep improving it!

Keywords

vector

similarity

embeddb
