New Case Study:See how Anthropic automated 95% of dependency reviews with Socket.Learn More →

github.com/theapemachine/dmt

Package Overview

Dependencies

Alerts

File Explorer

Install Socket

Detect and block malicious and high-risk dependencies

Install

github.com/theapemachine/dmt

v0.0.0-20240721222602-0abf8f4b73ae
Source
Go

Version published: 7 months ago

Created: 7 months ago

Source

Distributed Memory Trie (DMT)

Welcome to the Distributed Memory Trie (DMT) project! This project aims to create a highly efficient, distributed, and fault-tolerant system for managing and querying large-scale data stored in an S3-compatible storage system. The core of the project revolves around a radix tree and prefix permutation to efficiently distribute and retrieve data.

Introduction

The Distributed Memory Trie (DMT) project provides a scalable and distributed system for managing prefixes in an S3-compatible storage. It leverages a radix tree for efficient storage and retrieval and a prefix permutation mechanism for load balancing and fault tolerance.

Concepts

Radix Tree: A compressed trie data structure used to store a set of strings. It's optimized for efficient lookups and minimizes memory usage by sharing common prefixes.
Prefix Permutation: A mechanism to generate all possible ordered combinations of a prefix's segments. This helps in distributing load and ensuring fault tolerance by enabling multiple nodes to manage different permutations of the same prefix.

Features

Efficient Storage and Retrieval: Uses a radix tree for fast and memory-efficient storage and retrieval of data.
Scalability: Designed to scale horizontally by distributing load across multiple nodes.
Fault Tolerance: Ensures data is replicated across nodes and can recover from node failures.
Prefix Permutation: Generates permutations of prefixes to distribute load evenly and make data relational.

Getting Started

Prerequisites

Go (1.16 or higher)
An S3-compatible storage system (e.g., Amazon S3, MinIO)
Docker and Kubernetes (for deploying the distributed system)

Installation

Clone the repository:

git clone https://github.com/yourusername/dmt.git
cd dmt

Build the project:

go build -o dmt main.go

Run the project:

./dmt

Usage

Inserting Data

To inset data into the tree:

tree := NewTree()
tree.Insert("key1", []byte("value1"))
tree.Insert("key2", []byte("value2"))

Retrieving Data

To retrieve data from the tree:

value, found := tree.Get("key1")
if found {
    fmt.Println(value)
}

Deleting Data

To delete data from the tree:

tree.Delete("key1")

Prefix Permuation

To generate permutations of a prefix:

perm := NewPrefixPermutator("a/b/c")
permutations := perm.Permute()
fmt.Println(permutations)

Prefix Permutations and Relational Data

Concept

The primary purpose of prefix permutations is to establish relationships between data that would otherwise be difficult to find using a single-prefix order. By permuting the prefixes, you can query different permutations to uncover related data. This makes the data "relational", allowing you to efficiently retrieve associated information based on various prefix orders.

Simple Example

Consider a scenario where you store user information with the following prefix structure:

users/{userid}/profile
users/{userid}/orders/{orderid}
users/{userid}/settings

Using prefix permutations, you can generate and query the following permutations to find related data:

Permutations for users/{userid}/profile

users/{userid}/profile
users/profile/{userid}
{userid}/users/profile
{userid}/profile/users
profile/users/{userid}
profile/{userid}/users

Permutations for users/{userid}/orders/{orderid}

users/{userid}/orders/{orderid}
users/{userid}/{orderid}/orders
users/orders/{userid}/{orderid}
users/orders/{orderid}/{userid}
users/{orderid}/{userid}/orders
users/{orderid}/orders/{userid}
{userid}/users/orders/{orderid}
{userid}/users/{orderid}/orders
{userid}/orders/users/{orderid}
{userid}/orders/{orderid}/users
{userid}/{orderid}/users/orders
{userid}/{orderid}/orders/users
orders/users/{userid}/{orderid}
orders/users/{orderid}/{userid}
orders/{userid}/users/{orderid}
orders/{userid}/{orderid}/users
orders/{orderid}/users/{userid}
orders/{orderid}/{userid}/users
{orderid}/users/{userid}/orders
{orderid}/users/orders/{userid}
{orderid}/{userid}/users/orders
{orderid}/{userid}/orders/users
{orderid}/orders/users/{userid}
{orderid}/orders/{userid}/users

Permutations for users/{userid}/settings

users/{userid}/settings
users/settings/{userid}
{userid}/users/settings
{userid}/settings/users
settings/users/{userid}
settings/{userid}/users

Querying Permutations

By storing and querying these permutations, the system can efficiently distribute the data across multiple nodes and ensure fault tolerance. If a particular node managing a specific permutation fails, other nodes managing different permutations of the same prefix can still serve the data.

For instance, if you need to retrieve all settings of a user, you could query:

users/{userid}/settings

If you want all settings, and have the user data "joined" onto the setting data, you could query:

settings/

If you want everything related to a user, you could query:

users/{userid}/

Or even simply:

users/

to retrieve all data related to all users.

By generating all the permutations of a prefix, it allows us to circumvent issues we have when data is nested in a single prefix.

Project Structure

.
├── README.md
├── tree.go
├── tree_test.go
├── tree_benchmark_test.go
├── prefix_permutator.go
├── prefix_permutator_test.go
├── prefix_permutator_benchmark_test.go
└── ...

FAQs

What is github.com/theapemachine/dmt?

Package last updated on 21 Jul 2024

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

github.com/theapemachine/dmt

Distributed Memory Trie (DMT)

Table of Contents

Introduction

Concepts

Features

Getting Started

Prerequisites

Installation

Usage

Inserting Data

Retrieving Data

Deleting Data

Prefix Permuation

Prefix Permutations and Relational Data

Concept

Simple Example

Querying Permutations

Project Structure

Related posts

github.com/theapemachine/dmt

Distributed Memory Trie (DMT)

Table of Contents

Introduction

Concepts

Features

Getting Started

Prerequisites

Installation

Usage

Inserting Data

Retrieving Data

Deleting Data

Prefix Permuation

Prefix Permutations and Relational Data

Concept

Simple Example

Querying Permutations

Project Structure

Related posts

Linux Foundation Warns Open Source Developers: Compliance with Sanctions Is Not Optional

Maven Central Adds Sigstore Signature Validation