alldata-storage-leveldb
Stability: 1 - Experimental
LevelDB-backed storage module for AllData, a distributed, master-less, append-only, immutable event store database implementing the "All Data" part of the Lambda Architecture.
Usage
var AllDataStorage = require('alldata-storage-leveldb');
var allDataStorage = new AllDataStorage('./db', {
    consolidationInterval: "P1D",
    cacheSize: 8 * 1024 * 1024,
    compression: true,
    keyEncoding: 'utf8',
    valueEncoding: 'json'
});
allDataStorage.on('interval closed', function (pathToIntervalDb) {
    console.log('new read-only interval at ' + pathToIntervalDb);
});
allDataStorage.put('20130927T005240652508858176', {foo: 'bar'}, function (error) {
    if (error) throw error;
});
Test
npm test
Overview
AllDataStorage provides a storage implementation for AllData with some particular characteristics.
Consolidation Interval
AllData is an append-only immutable event store whose keys are generated by alldata-keygen, so every key is tightly coupled to the time it was generated and keys increase monotonically. The store therefore assumes that, after a short window of time, it will stop receiving keys tied to a given moment in the past. For example, at 10am the storage is assumed not to receive keys created at 7am, whereas keys created at 9:55am may well still arrive. The length of the window in which late events are still likely to arrive is configurable through the consolidationInterval option.
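The mapping from a key to its consolidation interval can be sketched as follows. This is only an illustration, not the module's actual code: it assumes that a key begins with a UTC timestamp of the form YYYYMMDDTHHmmss (inferred from the example key in Usage) and that intervals are aligned to the clock. The helper names keyToDate and intervalStart are hypothetical.

```javascript
// Lengths, in milliseconds, of the documented consolidationInterval values.
var INTERVAL_MS = {
    P1D: 24 * 60 * 60 * 1000,
    PT3H: 3 * 60 * 60 * 1000,
    PT1H: 60 * 60 * 1000,
    PT15M: 15 * 60 * 1000,
    PT5M: 5 * 60 * 1000
};

// ASSUMPTION: the key starts with a UTC timestamp, YYYYMMDDTHHmmss.
function keyToDate(key) {
    var m = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})/.exec(key);
    if (!m) throw new Error('unrecognized key format: ' + key);
    return new Date(Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]));
}

// Start of the (clock-aligned) interval that contains the key's timestamp.
function intervalStart(key, consolidationInterval) {
    var length = INTERVAL_MS[consolidationInterval];
    if (!length) throw new Error('unsupported interval: ' + consolidationInterval);
    var time = keyToDate(key).getTime();
    return new Date(Math.floor(time / length) * length);
}

console.log(intervalStart('20130927T005240652508858176', 'PT15M').toISOString());
// 2013-09-27T00:45:00.000Z
```

With this scheme, every key generated between 00:45:00 and 00:59:59 UTC lands in the same PT15M interval.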
Two time intervals are open at any given time: the current interval and the previous interval. This allows data that falls into the previous interval to still be collected. Once time passes and a new interval is entered, the old previous interval can no longer be written to and becomes read-only. This is illustrated below:
-------------------+------------------+
   prev interval   | current interval |
-------------------+------------------+
                            ^
                            |
                           now

-------------------+-----------------------+------------------+
 ... READ ONLY ... |     prev interval     | current interval |
-------------------+-----------------------+------------------+
                                                    ^
                                                    |
                                                  later
When a previous interval turns read-only, the 'interval closed' event is emitted.
Notice that the read-only portion of the store can now be packaged up in any way that is convenient to enable batch processing by other components of the Lambda Architecture. To support this use-case, AllDataStorage (LevelDB-based) creates a new LevelDB database for each consolidation interval.
On the boundary between intervals, an event may be created with a timestamp slightly in the "future". In that case, a temporary "next" interval is created that will become the current interval once the time comes. Trying to insert future events beyond the next interval is an error.
+-----------------------+---------------------+-+-----------------------+
|     prev interval     |  current interval   | |     next interval     |
+-----------------------+---------------------+-+-----------------------+
                                           ^
                                           |
                                          now
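The interval layout above can be summarized as a small classification function. This is a minimal sketch rather than the module's implementation: the function name classify and its return values are hypothetical, and it assumes intervals are aligned to multiples of the interval length.

```javascript
// Classify an event timestamp relative to "now".
// All arguments are in milliseconds; intervalMs is the interval length.
function classify(eventTime, now, intervalMs) {
    var currentStart = Math.floor(now / intervalMs) * intervalMs;
    // Number of whole intervals between the current interval and the event's.
    var offset = Math.floor((eventTime - currentStart) / intervalMs);
    if (offset < -1) return 'read-only';   // older than the previous interval
    if (offset === -1) return 'previous';
    if (offset === 0) return 'current';
    if (offset === 1) return 'next';
    return 'error';                        // beyond the next interval
}

var HOUR = 60 * 60 * 1000;
var now = Date.UTC(2013, 8, 27, 10, 10);   // 10:10am UTC
console.log(classify(Date.UTC(2013, 8, 27, 9, 55), now, HOUR)); // previous
console.log(classify(Date.UTC(2013, 8, 27, 7, 0), now, HOUR));  // read-only
```

With one-hour intervals this matches the earlier example: at 10am a key from 9:55am is still accepted, while a key from 7am is not.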
Documentation
AllDataStorage
Public API
new AllDataStorage(location, [options])
- location: String. Path to a directory where data will be stored.
- options: Object (Default: undefined)
  - cacheSize: Integer (Default: 8 * 1024 * 1024). The size (in bytes and per interval) of the in-memory LRU cache with frequently used uncompressed block contents.
  - compression: Boolean (Default: true). If true, all compressible data will be run through the Snappy compression algorithm before being stored. Snappy is very fast, so disabling it is unlikely to gain much speed; leave this on unless you have a good reason to turn it off.
  - consolidationInterval: String (Default: P1D). ISO8601 duration specifying the length of a consolidation interval. For now, this is limited to one of: P1D, PT3H, PT1H, PT15M, PT5M.
  - keyEncoding: String (Default: utf8). Encoding for the key. One of: hex, utf8, ascii, binary, base64, ucs2, utf16le, json.
  - valueEncoding: String (Default: json). Encoding for the value. One of: hex, utf8, ascii, binary, base64, ucs2, utf16le, json.
Creates a new LevelDB-backed AllDataStorage instance.
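To make the defaults and the restriction on consolidationInterval concrete, here is a hedged sketch of how the options could be normalized before use. The helper normalizeOptions is hypothetical and is not part of the module's API; the module's own validation may behave differently.

```javascript
// Defaults as listed in the option descriptions above.
var DEFAULTS = {
    cacheSize: 8 * 1024 * 1024,
    compression: true,
    consolidationInterval: 'P1D',
    keyEncoding: 'utf8',
    valueEncoding: 'json'
};
var SUPPORTED_INTERVALS = ['P1D', 'PT3H', 'PT1H', 'PT15M', 'PT5M'];

// Hypothetical helper: fill in defaults and reject unsupported intervals.
function normalizeOptions(options) {
    options = options || {};
    var result = {};
    Object.keys(DEFAULTS).forEach(function (name) {
        result[name] = options[name] !== undefined ? options[name] : DEFAULTS[name];
    });
    if (SUPPORTED_INTERVALS.indexOf(result.consolidationInterval) === -1) {
        throw new Error('unsupported consolidationInterval: ' +
            result.consolidationInterval);
    }
    return result;
}

console.log(normalizeOptions({consolidationInterval: 'PT1H'}).valueEncoding);
// json
```

Running such a check in application code before calling the constructor surfaces a mistyped duration such as 'PT2H' early.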
allDataStorage.close([callback])
- callback: Function (Default: undefined) function (error) {}. If provided, called once storage is closed.
Closes storage.
allDataStorage.intervalCheck([now])
CAUTION: reserved for internal use
- now: Date (Default: undefined). Time to use as "now" for interval rotation check.
When this method is executed, the current time is compared with the consolidation intervals that are currently open. If the current time falls into the current interval, nothing happens. If the current time is later than the current interval, the following happens:
- previous interval is closed and becomes read-only
- current interval becomes previous interval
- next interval becomes current interval (and is created if needed)
- 'interval closed' event is emitted with the path to the interval closed in the first step
Upon creation of a new AllDataStorage instance this method is scheduled to run at regular intervals.
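The rotation steps can be modeled with a small in-memory sketch. IntervalTracker and its rotate method are hypothetical and exist only for illustration: intervals are identified here by their start time in milliseconds rather than by a LevelDB path, and no databases are opened or closed.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical in-memory model of interval rotation.
function IntervalTracker(intervalMs, now) {
    EventEmitter.call(this);
    this.intervalMs = intervalMs;
    this.current = Math.floor(now / intervalMs) * intervalMs;
    this.previous = this.current - intervalMs;
}
IntervalTracker.prototype = Object.create(EventEmitter.prototype);
IntervalTracker.prototype.constructor = IntervalTracker;

// Compare "now" with the open intervals and rotate if needed.
IntervalTracker.prototype.rotate = function (now) {
    now = now === undefined ? Date.now() : +now;
    // Rotate one interval at a time until "now" falls into the current one.
    while (now >= this.current + this.intervalMs) {
        var closed = this.previous;          // previous becomes read-only
        this.previous = this.current;        // current becomes previous
        this.current += this.intervalMs;     // next becomes current
        this.emit('interval closed', closed);
    }
};

var HOUR_MS = 60 * 60 * 1000;
var tracker = new IntervalTracker(HOUR_MS, Date.UTC(2013, 8, 27, 10, 10));
tracker.on('interval closed', function (start) {
    console.log('closed interval starting at ' + new Date(start).toISOString());
});
tracker.rotate(Date.UTC(2013, 8, 27, 11, 5));
// closed interval starting at 2013-09-27T09:00:00.000Z
```

Accepting an explicit now value, as the documented method does, makes the rotation logic easy to exercise in tests without waiting for real time to pass.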
allDataStorage.put(key, value, [options], callback)
- key: String. AllData formatted key, example: 20130927T005240652508858176.
- value: Object. Event to put.
- options: Object (Default: undefined). Optional options for this specific put operation.
  - keyEncoding: String (Default: undefined). CAUTION: not recommended. Alternative encoding for the key. One of: hex, utf8, ascii, binary, base64, ucs2, utf16le, json.
  - sync: Boolean (Default: false). Will force the Operating System to synchronize to disk prior to calling the callback with success.
  - valueEncoding: String (Default: undefined). CAUTION: not recommended. Alternative encoding for the value. One of: hex, utf8, ascii, binary, base64, ucs2, utf16le, json.
- callback: Function function (error) {}. Callback to call with error or success.
During normal operation, the "global" keyEncoding and valueEncoding values will be used. It is not recommended to have different encodings for different puts to the same underlying LevelDB database.
Trying to put to a READ ONLY portion of the store returns an error.
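A call that uses the per-operation sync option and handles a rejected write might look like the sketch below. The wrapper putSynced is hypothetical; it relies only on the put(key, value, [options], callback) signature documented above and works with any object exposing that method.

```javascript
// Hypothetical wrapper: write with sync enabled and pass any error
// (for example, a put into a read-only interval) on to the caller.
function putSynced(storage, key, value, callback) {
    storage.put(key, value, {sync: true}, function (error) {
        if (error) {
            console.error('put failed for key ' + key + ': ' + error.message);
            return callback(error);
        }
        callback(null);
    });
}
```

It would be used as putSynced(allDataStorage, '20130927T005240652508858176', {foo: 'bar'}, function (error) { ... }). Since sync waits for the Operating System to flush to disk, it is probably best reserved for events that must not be lost.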
Event interval closed
function (closedIntervalPath) {}
- closedIntervalPath: String. Path to the LevelDB database corresponding to the closed interval.
Emitted when an interval is closed and becomes read-only.
allDataStorage.on('interval closed', function (closedIntervalPath) {
    console.log("interval " + closedIntervalPath + " closed");
});