Socket
Socket
Sign inDemoInstall

snap-db

Package Overview
Dependencies
0
Maintainers
1
Versions
21
Alerts
File Explorer

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

    snap-db

Fast & Durable pure javascript key-value store for NodeJS/Electron


Version published
Weekly downloads
1.5K
decreased by-12.38%
Maintainers
1
Install size
965 kB
Created
Weekly downloads
 

Changelog

Source

[1.0.8] 06-21-2019

  • Readme tweaks.
  • Added backwards compatibility for breaking change in 1.0.7.

Readme

Source

Snap-DB

Fast & Durable key-value store for NodeJS/Electron

Get a running database in a snap!

SnapDB is a pure javascript persistent key-value store that provides ordered mapping from keys to string values. Data is persisted to disk using a Log Structure Merge Tree (LSM Tree) inspired by LevelDB / RocksDB.

Uses synchronous filesystem methods to exclusively perform append writes to disk, this puts the performance of SnapDB near the theoretical maximum write performance for ACID compliant javascript databases.

Features

  • Zero dependencies.
  • Zero compiling.
  • Zero configuring.
  • ACID Compliant with transaction support.
  • Optionally control compaction manually.
  • Constant time range & offset/limit queries.
  • Typescript & Babel friendly.
  • Works in NodeJS and Electron.
  • Keys are sorted, allowing very fast range queries.
  • Data is durable in the face of application or power failure.
  • Runs in it's own thread to prevent blocking.
  • Includes event system.

Installation

npm i snap-db --save

Usage

import { SnapDB } from "snap-db";

const db = new SnapDB({
    dir: "my_db", // database folder
    key: "int", // key type, can be "int", "string" or "float"
    cache: false, // enable or disable value cache,
    autoFlush: true // automatically flush the log and perform compaction
});

// wait for db to be ready
db.ready().then(() => {
    // put a record
    return db.put(20, "hello");
}).then(() => {
    // get a record
    return db.get(20);
}).then((data) => {
    console.log(data) // "hello"
})

API

The SnapDB class accepts a single argument which is an object that has the following properties:

Arguments

ArgumentRequiredTypeDetails
dirtruestringThe folder to persist data into.
keytrue"int" | "string" | "float"The database can only use one type of key at a time. You cannot change the key after the database has been created.
cachefalseboolIf enabled, data will be loaded to/from js memory in addition to being saved to disk, allowing MUCH faster reads at the cost of having the entire database in memory.
autoFlushfalsebool | numberThe database automatically flushes the log/memtable to SSTables once the log/memtable reaches 2MB or greater in size. Set this to false to disable automatic flushes/compaction entirely. Set this to a number (in MB) to control how large the log/memtable should get before a flush/compaction is performed.

Properties

.version: number

The current version of SnapDB being used.

.isCompacting: boolean

This is true when the database is performing compaction in the background, false otherwise. Compactions are performed in a separate thread so should only affect write throughput.

.isTx: boolean

This is true when a transaction is active, false otherwise.

Methods

.ready():Promise<void>

Call on database initialization to know when the database is ready.

.put(key: any, data: string): Promise<void>

Puts data into the database at the provided key.

.get(key: any):Promise<string>

Used to get the value of a single key.

.delete(key: any): Promise<void>

Deletes a key and it's value from the database.

.getAllKeys(onKey: (key: any) => void, onComplete: (err?: any) => void, reverse?: boolean): void;

Gets all the keys in the database, use the callback functions to capture the data. Can optionally return the keys in reverse. This is orders of magnitude faster than the getAll method.

.getAll(onData: (key: any, data: string) => void, onComplete: (err?: any) => void, reverse?: boolean): void;

Gets all the keys & values in the database, use the callback functions to capture the data. Can optionally return the keys/values in reverse order.

.range(lower: any, higher: any, onData: (key: any, data: string) => void, onComplete: (err?: any) => void, reverse?: boolean)

Gets a range of rows between the provided lower and upper values. Can optionally return the results in reverse.

.offset(offset: number, limit: number, onData: (key: any, data: string) => void, onComplete: (err?: any) => void, reverse?: boolean)

Gets a section of rows provided the offset and limit you'd like. Can optionally return the results in reverse from the bottom of the key list.

.getCount(): Promise<number>

Gets the total number of records in the database. This uses a very fast lookup method.

.empty(): Promise<void>

Clears all keys and values from the datastore. All other query types will fail while the database is being emptied, wait for this to complete before attempting to write new data to the database.

.close(): Promise<void>

Closes the datastore, clears the keys from memory and kills the worker threads. This isn't reversible, you have to create a new SnapDB instance to get things going again.

.flushLog(): Promise<void>

Forces the log to be flushed into database files and clears the memtable. Normally the database waits until the log/memtable is 2MB or larger before flushing them. Once the log is flushed into disk files a compaction is performed if it's needed.

The log files and database files are both written to disk in a safe, robust manner so this method isn't needed for normal activity or to make writes more reliable. You might use this method to perform compactions at times that are more convinient than waiting for the system to perform the compactions when the log fills up. Also if you have autoFlush off you'll need this method to flush the log/memtable.

.begin_transaction(): Promise<void>

Start a database transaction.

.end_transaction(): Promise<void>

End a database transaction, making sure it's been flushed to the filesystem.

.on(event: string, callback: (eventData) => void): void

Subscribe to a specific event.

.off(event: string, callback: (eventData) => void): void

Unsubscribe from a specific event.

Supported Events

You can listen for the following events:

EventTrigger Details
readyAfter the database is ready to query.
getAfter a value is retrieved with .get.
putAfter a value is set with .put.
deleteAfter a value is deleted with .delete.
get-keysFor every key when .getAllKeys is called.
get-keys-endAfter the end of a .getAllKeys query.
get-countAfter .getCount is query is returned.
get-allFor every key value pair returned when .getAll is called.
get-all-endAfter the end of a .getAll query.
get-offsetFor every key value pair returned when .offset is called.
get-offset-endAfter the end of a .offset query.
get-rangeFor every key value pair returned when .range is called.
get-range-endAfter the end of a .range query.
tx-startAfter a transaction is started.
tx-endAfter a transaction completes.
closeAfter the database is closed.
clearAfter the database is cleared.
compact-startWhen a compaction is starting.
compact-endAfter a compaction has completed.

Tips / Limitations

  • Keys and values can technically be almost any size, but try to keep them under 10 megabytes.
  • Using transactions will batch writes/deletes together into a single disk seek, use them when you can.
  • Transactions cannot be nested. Make sure you close each transaction before starting a new one.
  • Keys are kept in javascript memory for performance, in practice the size of the database you can have with SnapDB will be limited by how much memory nodejs/electron has access to.
  • Larger transactions take more memory to compact, if you run out of memory during a transaction then break it up into smaller chunks. Transactions in the tens of thousands of puts/deletes should be fine, hundreds of thousands will likely be problematic.
  • If you need to store millions of rows or terabytes of data RocksDB/LevelDB is a much better choice.

How LSM Tree Databases Work

The architecture of SnapDB is heavily borrowed from LevelDB/RocksDB and shares many of their advantages and limitations. LSM Tree databases are written almost exclusively to solve a single problem: how do you prevent extremely costly random writes to the disk? The LSM Tree structure is used to incrementally migrate and merge updates into progressively larger "Levels" using sorted immutable files. In practice additional structures like bloom filters and red-black trees are needed to get acceptable read performance.

Writes

When you perform a write the data is loaded into an in memory cache (memtable) and appended to a log file. Once the data is stored in the log file the write is considered to be committed to the database. Deletes work much the same, except they write a special "tombstone" record so that we know the record is deleted and to ignore older writes of that same key.

Logfile writes are always append only meaning the data in the logfile is considered unsorted, however the memtable maintains a sorted mapping of all the log values to make compaction fast and efficient.

The logfile and memtable always share the same data/state so the memtable can be restored from the logfile at any point.

Log Flushing & Compaction

Compactions are only possibly performed following a log flush or when manually triggered.

Once the logfile/memtable reach a threshold in size (2MB by default) all it's data is flushed to the first of many "Levels" of database files. Each database file is an immutable SSTable containing a bloom filter, a sorted index, and a data file containing the actual values. Database/Level files are never overwritten, modified or larger than 2MB unless single database keys/values are larger than 2MB.

A limitation to the size restriction for log/memtable involves transactions. A single transaction, regardless of it's size, is committed entirely to the log before compaction begins. This guarantees that transactions are ACID but limits the transaction size to available memory.

Log flushes involve loading all Level 0 files into memory, merging their contents with the memtable, then writing all new Level 0 files. Once all Level 0 files collectively contain more than 10MB of data one of the files in Level 0 is selected and it's contents are merged with files in Level 1 that overlap the keys in the selected Level 0 file. After the merge all data in the selected Level 0 file is now in Level 1 and the original Level 0 file is marked for deletion. This cycle continues with every subsequent Level, each Level being limited to a maximum of 10^L+1 MB worth of files. (Level 0 = 10MB, Level 1 = 100MB, etc)

Since data is sorted into SSTables for each Level, it's easy to discover overlapping keys between Levels and perform compactions that only handle data in the 10s of megabytes, even if the database is storing dozens of gigabytes of data. Additionally compactions will merge redundant key values (only keeping the newest value) and drop tombstones if no older values for a given key exist.

Each compaction normally won't move records across more than one set of Levels. The result ends up being all log/compaction writes and reads are sequential and relatively minor in size regardless of how large the database gets.

Once a log flush/compaction completes these actions are performed, in this order:

  1. A new manifest file is written describing the new Level files with the expired ones removed.
  2. The log file is deleted/emptied.
  3. The memtable is emptied.
  4. Expired Level files are deleted.

The order of operations guarantees that if there is a crash at any point before, during or after compaction the database remains in a state that can easily be recovered from with no possibility of data loss. The absolute worst case is a compaction is performed that ends up being discarded and must be done again.

Additionally, individual log writes and database files are stored with checksums to guarantee file and log integrity.

Reads

Reads are performed in this order:

  1. If cache is enabled, that's checked first.
  2. If the value/tombstone is in the memtable that's returned.
  3. The manifest file is checked to discover which Level files contain a key range that include the requested key. The files are then sorted from newest to oldest. Each file's bloom filter is checked against the requested key. If a bloom filter returns a positive result, we attempt to load data from that Level file. If the data/tombstone isn't in the Level file we move to progressively older Level files until a result is found. Finally, if the key isn't found in any files we return an error that the key isn't in the database.

One of the main ways SnapDB is different from LevelDB/RocksDB is the database keys are stored in a red-black tree to make key traversal faster. This allows fast .offset and .getCount methods in the API that aren't typically available for LevelDB/RocksDB stores. The tradeoff is that all keys must fit in javascript memory.

MIT License

Copyright (c) 2019 Scott Lott

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Keywords

FAQs

Last updated on 21 Jun 2019

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc