github.com/flynnfc/bagginsdb

Package Overview

Dependencies

Alerts

File Explorer

Install Socket

Detect and block malicious and high-risk dependencies

Install

github.com/flynnfc/bagginsdb

v1.1.1

Source

Version published: 6 months ago

Created: 6 months ago

Source

Centered Image

Baggins DB

Baggins DB is a simple Cassandra-inspired wide-column db. While not production-ready, it serves as an educational project to explore low-level database internals, concurrency control, and performance-tuning techniques. Learn more on how it's made here

Features

Memtable (In-Memory Index):
Stores recently written data in a sorted skiplist for quick insertion and retrieval. Once it reaches a certain size threshold, it is flushed to disk as an immutable SSTable.
SSTables (On-Disk Storage):
Writes are organized into append-only, immutable files known as SSTables. Each SSTable is sorted by key and includes:
- A Bloom filter to quickly determine if a key might exist.
- A sparse index to jump near the desired key without scanning the entire file.
Compaction:
Over time, multiple SSTables are merged and deduplicated into a single larger SSTable. This process, known as compaction, reduces storage fragmentation and stabilises read performance by limiting the number of SSTables that must be searched.
TrueTime Integration (Mocked):
The code incorporates a truetime component, simulating reliable timestamp generation, similar in spirit to Google’s TrueTime, though far simpler and not distributed. This allows the program to avoid distributed locks and choose the newest value during compactions.

Project Structure

internal/database/
Contains the core database logic including:
- database.go: The Database struct that ties together mem-tables, SSTableManager, and timing.
- memtable.go, skiplist.go: In-memory skiplist for quick writes and reads.
- sstable.go, sstable_manager.go: Handling on-disk SSTables, building them from memtables, indexing, and merging them during compaction.
internal/truetime/
Mock time service that provides timestamps for record inserts.
simulation/
Contains load-testing and simulation scripts that run various scenarios:
- Bulk inserts and reads.
- Mixed workloads (reads/writes).
- Realistic Cassandra-like loads with random keys, values, and read/write ratios.
logger/
A simple logging wrapper configured to produce structured logs via zap.

Performance tuning and improvements

I've opted to track and log performance at quite a granular level. You can find saved graphs and performance notes in the Performance folder

Roadmap

Add delete support (tombstones).
Implement proper error handling and recovery after crashes.
Infinite scale!
Integrate more benchmarks and profiling tools to guide optimizations.

License

This project is distributed under the MIT License. See LICENSE for details.

FAQs

What is github.com/flynnfc/bagginsdb?

Package last updated on 18 Feb 2025

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

github.com/flynnfc/bagginsdb

Baggins DB

Features

Project Structure

Performance tuning and improvements

Roadmap

License

Related posts

Risky Biz Podcast: Making Reachability Analysis Work in Real-World Codebases

Nx npm Packages Compromised in Supply Chain Attack Weaponizing AI CLI Tools