Databank
This package is an abstraction tool for document stores or key-value
stores in Node.js.
My goal is to hedge my bets by using a simple CRUD + search interface
for interacting with a datastore. If at some point I really need the
special snowflake features of Redis or MongoDB or Cassandra or Riak or
whatever, I should be able to bust out of this simple abstraction and
use their native interface without rewriting a lot of code.
I also want the data structures stored to look roughly like what
someone experienced with the datastore would expect.
I chose the name "databank" since it's not in widespread use and won't
cause name conflicts, and because it sounds like something a 1960s
robot would say.
As a note: I've used this library for a couple of big projects, and
mostly it just works.
License
Copyright 2011-2014 E14N https://e14n.com/
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Drivers
The point of the Databank interface is that applications can write
persistence code against one interface, and then decide which driver
to use at deployment time.
There are three drivers included in this package: 'memory',
'partitioning', and 'caching'. The first is great for development but
pretty bad for production.
There are a few drivers not in this package. You can search for them
on npm; they all start with 'databank-'. So, 'databank-leveldb',
'databank-mongodb', 'databank-memcached', 'databank-redis',
'databank-disk'.
Installation
I'm still not 100% on this, so comments welcome. I'd love better
instructions.
At deployment time, if you want to use a particular driver, it's going
to need to be available to the 'databank' libraries so that
Databank.get() can find it. This means you have two options:
-
Install the driver globally, like npm install -g databank-redis. This
is probably OK as long as you don't have version conflicts between apps.
-
Install the driver in the databank dir, like so:
npm install databank
cd node_modules/databank/
npm install databank-redis
If you're still stuck, there's a Databank.register() method that will
let you associate a databank driver class with a driver name. That's
probably only a last resort, though.
Built-in drivers
The built-in drivers are documented in MEMORY.md, CACHING.md, and
PARTITIONING.md respectively.
Schemata
This library assumes you have document "types" (like "person",
"chair", "photo", "bankaccount", "trainreservation") that you can
identify with a unique scalar key such as an email address, URL, UUID,
SSN, or whatever.
Your "document" is anything that can be JSON-encoded and
decoded. Scalar, array and object/tree values are all totally cool.
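Implementation classes that support schemata should support a "schema"
element on the constructor params for Databank.get() (see below). A
schema can have an entry for each type, with elements such as pkey
(the name of the primary-key field) and indices (field names to
index), as in the example under "Dotted notation".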
Dotted notation
In schemata you can use dotted notation, a la MongoDB, to define
fields that are nested inside parts of the object. For example, for an
object like this:
{ email: "evan@e14n.com", name: { last: "Prodromou", first: "Evan" } }
...you may have a schema like this:
{ person: { pkey: "email", indices: ["name.last"] } }
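
As a rough illustration of what the dotted notation means, the sketch below resolves the schema's field names against the example object. The helper getDotted is hypothetical and not part of the databank API; how a real driver extracts key and index values is up to that driver.

```javascript
// Hypothetical helper (not part of databank): resolve a dotted field
// name such as "name.last" against a plain object.
function getDotted(obj, path) {
    var parts = path.split('.');
    var current = obj;
    for (var i = 0; i < parts.length; i++) {
        if (current === null || typeof current !== 'object') {
            return undefined; // path runs past a scalar or a missing branch
        }
        current = current[parts[i]];
    }
    return current;
}

var person = { email: "evan@e14n.com", name: { last: "Prodromou", first: "Evan" } };
var schema = { person: { pkey: "email", indices: ["name.last"] } };

// Values a driver might use for the primary key and for each index.
var key = getDotted(person, schema.person.pkey);
var indexed = schema.person.indices.map(function(field) {
    return getDotted(person, field);
});
```

Here key holds the email address and indexed holds the last name; a path that does not exist in the object resolves to undefined.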
Databank
The class has a static method for initializing an instance:
-
get(driver, params)
Get an instance of DriverDatabank from the module databank-driver and
initialize it with the provided params (passed as a single object).
This is the place you should usually pass in a schema parameter.
var bank = Databank.get('redis', {schema: {person: {pkey: "email"}}});
bank.connect({}, function(err) {
    if (err) {
        console.log("Couldn't connect to databank: " + err.message);
    } else {
        // ...
    }
});
There's another static method, Databank.register(), that changes how
get() works by associating a driver class with a driver name.
The databank interface has these methods:
-
connect(params, onCompletion)
Connect to the databank. params may be used by the underlying server.
onCompletion takes one argument: a DatabankError object. Null if no
error.
-
disconnect(onCompletion)
Disconnect from the databank. onCompletion takes one argument, a
DatabankError.
-
create(type, id, value, onCompletion)
Create a databank entry of type type with id id and content value.
How type and id are mapped to keys or whatever in the DB is
unspecified. Don't mix and match.
onCompletion takes two arguments: a DatabankError (or null) and the
created object. That created object may have some extra stuff added on.
Common error type here is AlreadyExistsError.
store.create('activity', uuid, activity, function(err, value) {
    if (err instanceof AlreadyExistsError) {
        res.writeHead(409, {'Content-Type': 'application/json'});
        res.end(JSON.stringify(err.message));
    } else if (err) {
        res.writeHead(400, {'Content-Type': 'application/json'});
        res.end(JSON.stringify(err.message));
    } else {
        res.writeHead(200, {'Content-Type': 'application/json'});
        res.end(JSON.stringify(value));
    }
});
-
read(type, id, onCompletion)
Read an object of type type with id id from the databank.
onCompletion will get two arguments: a DatabankError (or null) and the
object if found.
Common error type here is NoSuchThingError if the databank has no such
object.
bank.read('Book', '978-0141439600', function(err, book) {
    if (err instanceof NoSuchThingError) {
        res.writeHead(404, {'Content-Type': 'application/json'});
        res.end(JSON.stringify(err.message));
    } else if (err) {
        res.writeHead(500, {'Content-Type': 'application/json'});
        res.end(JSON.stringify(err.message));
    } else {
        res.writeHead(200, {'Content-Type': 'application/json'});
        res.end(JSON.stringify(book));
    }
});
-
update(type, id, value, onCompletion)
Update the (existing) object of type type with id id in the databank.
onCompletion will get two arguments: a DatabankError (or null) and the
object if found.
Common error type here is NoSuchThingError if the databank has no such
object.
-
save(type, id, value, onCompletion)
Either create a new object, or update an existing object. For when
you don't care which.
-
del(type, id, onCompletion)
Delete the object of type type with id id. onCompletion takes one
argument, a DatabankError (null on success).
"delete" is a keyword, so I decided not to use that.
-
search(type, criteria, onResult, onCompletion)
Finds objects of type type which match criteria, a map of property
names to exact value matches. onResult is called one time for each
result, with a single argument, the object that matches the criteria.
Use a collector array if you want all the results in an array.
Property names can be dotted to indicate deeper structures; for
example, this object:
{name: {last: "Prodromou", first: "Evan"}, age: 43}
would match the criteria {"name.last": "Prodromou"}.
onCompletion takes one argument, a DatabankError. A search with no
results will get a NoSuchThingError. I think this is the method most
likely to elicit a NotImplementedError, since most key-value stores
don't handle this kind of thing.
You're also on your own on sorting.
function getModerators(callback) {
    var results = [];
    bank.search('user', {role: 'moderator'}, function(result) {
        results.push(result);
    },
    function(err) {
        if (err) {
            callback(err, null);
        } else {
            results.sort(function(a, b) {
                return a.created - b.created;
            });
            callback(null, results);
        }
    });
}
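
To make the matching rule concrete, here is a minimal sketch of how criteria could be checked against a candidate object. matchesCriteria is a hypothetical helper, not a databank function, and it assumes that "exact value match" means strict equality on scalar values; real drivers will usually push the comparison down to the datastore instead.

```javascript
// Hypothetical helper (not part of databank): does obj satisfy criteria,
// a map of (possibly dotted) property names to exact values?
function matchesCriteria(obj, criteria) {
    return Object.keys(criteria).every(function(prop) {
        // Walk the dotted path one segment at a time.
        var value = prop.split('.').reduce(function(current, part) {
            if (current !== null && typeof current === 'object') {
                return current[part];
            }
            return undefined;
        }, obj);
        // Assumption: "exact" means strict equality.
        return value === criteria[prop];
    });
}

var evan = {name: {last: "Prodromou", first: "Evan"}, age: 43};
var byLastName = matchesCriteria(evan, {"name.last": "Prodromou"});
var byNameAndAge = matchesCriteria(evan, {"name.last": "Prodromou", age: 44});
```

Every entry in criteria has to match, so byLastName is true while byNameAndAge is false because the age differs.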
-
scan(type, onResult, onCompletion)
Finds all objects of type type. onResult is called one time for each
result, with a single argument, the object found. Use a collector
array if you want all the results in an array.
onCompletion takes one argument, a DatabankError. A scan with no
results will get a NoSuchThingError. I think this method, like search,
is likely to elicit a NotImplementedError, since most key-value stores
don't handle this kind of thing.
This is probably most useful for off-line processing, like doing a
backup or for initializing roll-up data. At scale, this may take
days to complete. If you want to do something like searching a
range, figure out a better way, like storing an array of matches at
write time.
-
readAll(type, ids, onCompletion)
Gets all the objects of type type with ids in the array ids. Results
are an object mapping an id to the results. If an ID doesn't exist,
the mapped value will be null.
This is kind of like calling read over and over, but if the driver
supports multiple reads in one call, it can be much more performant.
onCompletion gets two arguments: an error, and the results map.
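
The "calling read over and over" behaviour can be sketched as follows. This is an illustration under assumptions, not the library's code: readAllNaive, the in-memory bank and the local NoSuchThingError are stand-ins defined only so the sketch runs on its own, and the stand-in calls its callbacks synchronously where a real driver would not.

```javascript
// Stand-in error class and in-memory bank (not the real databank ones).
function NoSuchThingError(type, id) {
    this.name = 'NoSuchThingError';
    this.message = 'No such ' + type + ' with id ' + id;
}
NoSuchThingError.prototype = Object.create(Error.prototype);

var data = {'person:evan@e14n.com': {name: 'Evan'}};
var bank = {
    read: function(type, id, onCompletion) {
        var key = type + ':' + id;
        if (data.hasOwnProperty(key)) {
            onCompletion(null, data[key]);
        } else {
            onCompletion(new NoSuchThingError(type, id), null);
        }
    }
};

// Naive readAll built on read(): missing ids map to null, and any
// other error aborts the whole call.
function readAllNaive(bank, type, ids, onCompletion) {
    var results = {};
    var pending = ids.length;
    var failed = false;
    if (pending === 0) {
        return onCompletion(null, results);
    }
    ids.forEach(function(id) {
        bank.read(type, id, function(err, value) {
            if (failed) {
                return; // an earlier read already reported an error
            }
            if (err && !(err instanceof NoSuchThingError)) {
                failed = true;
                return onCompletion(err, null);
            }
            results[id] = err ? null : value;
            pending--;
            if (pending === 0) {
                onCompletion(null, results);
            }
        });
    });
}
```

This version issues one read per id, which is exactly the cost a driver with a native multi-get can avoid by overriding readAll.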
Integers
These are special shims for integer values.
-
incrBy(type, id, n, onCompletion)
Increments the integer value at type and id by n. onCompletion takes
two params: an error, and the resulting integer value. If the integer
value doesn't exist yet, it is set to n.
Defaults to a read and an update or create, but drivers can override
to do an atomic increment.
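
The read-then-update-or-create fallback can be sketched like this. It is an assumption-laden illustration rather than the library's implementation: incrByFallback, the in-memory bank and the local NoSuchThingError are stand-ins so the sketch runs on its own, and the stand-in invokes callbacks synchronously.

```javascript
// Stand-ins so the sketch is self-contained (not the real databank classes).
function NoSuchThingError() {
    this.name = 'NoSuchThingError';
}
NoSuchThingError.prototype = Object.create(Error.prototype);

var store = {};
var bank = {
    read: function(type, id, cb) {
        var key = type + ':' + id;
        if (store.hasOwnProperty(key)) {
            cb(null, store[key]);
        } else {
            cb(new NoSuchThingError(), null);
        }
    },
    create: function(type, id, value, cb) {
        store[type + ':' + id] = value;
        cb(null, value);
    },
    update: function(type, id, value, cb) {
        store[type + ':' + id] = value;
        cb(null, value);
    }
};

// Non-atomic fallback: read the current value, then update it, or
// create it with n if nothing is stored yet.
function incrByFallback(bank, type, id, n, onCompletion) {
    bank.read(type, id, function(err, value) {
        if (err instanceof NoSuchThingError) {
            bank.create(type, id, n, onCompletion);
        } else if (err) {
            onCompletion(err, null);
        } else {
            bank.update(type, id, value + n, onCompletion);
        }
    });
}
```

Because the read and the write are separate steps, two concurrent callers can read the same value and one increment can be lost; that is the reason a driver with a native atomic increment is worth overriding for.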
-
decrBy(type, id, n, onCompletion)
Decrements the integer value at type and id by n. onCompletion takes
two params: an error, and the resulting integer value. If the integer
value doesn't exist yet, it is set to -n.
Defaults to incrBy with -1 * n, but drivers can override to do an
atomic decrement.
-
incr(type, id, onCompletion)
Increments the integer value at type and id by one. onCompletion takes
two params: an error, and the resulting integer value. If the integer
value doesn't exist yet, it is set to 1.
Defaults to incrBy with n = 1, but drivers can override to do an
atomic increment.
-
decr(type, id, onCompletion)
Decrements the integer value at type and id by one. onCompletion takes
two params: an error, and the resulting integer value. If the integer
value doesn't exist yet, it is set to -1.
Defaults to decrBy with n = 1, but drivers can override to do an
atomic decrement.
Arrays
These are special shims for array values.
-
append(type, id, toAppend, onCompletion)
Appends the value toAppend to the array at type and id. onCompletion
takes one param: an error. If the array doesn't exist yet, it becomes
a single-element array.
Defaults to calling appendAll, but drivers can override to do an
atomic append.
-
prepend(type, id, toPrepend, onCompletion)
Prepends the value toPrepend to the array at type and id. onCompletion
takes one param: an error. If the array doesn't exist yet, it becomes
a single-element array.
Defaults to calling prependAll, but drivers can override to do an
atomic prepend.
-
appendAll(type, id, items, onCompletion)
Appends the values in array items to the array at type and id.
onCompletion takes one param: an error. If the array doesn't exist
yet, it becomes a new array consisting of items.
Defaults to a read and an update or create, but drivers can override
to do an atomic append.
-
prependAll(type, id, items, onCompletion)
Prepends the values in array items to the array at type and id.
onCompletion takes one param: an error. If the array doesn't exist
yet, it becomes a new array consisting of items.
Defaults to a read and an update or create, but drivers can override
to do an atomic prepend.
-
item(type, id, index, onCompletion)
Gets the value at index in the array at type and id. onCompletion
takes two params: an error, and the resulting item value.
Defaults to reading the whole array and plucking out the value, but
some drivers might support atomic query of just one item.
-
slice(type, id, begin, end, onCompletion)
Like Array.slice(), gets the sub-array starting at index begin and
ending at index end of the array at type and id. onCompletion takes
two params: err for error, and results for the resulting slice.
Defaults to reading the whole array and plucking out the slice, but
some drivers might support atomic query of a slice.
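
The read-everything fallback for slice might look roughly like the sketch below. sliceFallback and the one-array bank are hypothetical stand-ins for illustration, the stand-in calls back synchronously, and the sketch assumes begin and end follow JavaScript's own array slice semantics (end is exclusive).

```javascript
// Stand-in bank holding a single array, so the sketch runs on its own.
var bank = {
    read: function(type, id, onCompletion) {
        onCompletion(null, ['a', 'b', 'c', 'd', 'e']);
    }
};

// Fallback: fetch the whole array, then slice it in memory.
function sliceFallback(bank, type, id, begin, end, onCompletion) {
    bank.read(type, id, function(err, value) {
        if (err) {
            onCompletion(err, null);
        } else if (!Array.isArray(value)) {
            // databank describes a WrongTypeError for this case; a plain
            // TypeError stands in for it here.
            onCompletion(new TypeError('value is not an array'), null);
        } else {
            onCompletion(null, value.slice(begin, end));
        }
    });
}

sliceFallback(bank, 'list', 'letters', 1, 3, function(err, results) {
    if (!err) {
        console.log(results.join(','));
    }
});
```

The whole array crosses the wire even when only two items are wanted, which is what a driver with a native range query can avoid.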
-
indexOf(type, id, item, onCompletion)
Like Array.indexOf(), gets the first index of item in the array at
type and id. onCompletion takes two params: err for error, and index
for the resulting index. Will give an index of -1 (like JavaScript) on
a miss.
-
remove(type, id, item, onCompletion)
Removes the first instance of item in the array at type and id.
onCompletion takes one param: err for error. NOTE: removing an item
that doesn't exist in the array does not generate an error.
Defaults to calling removeAll with item.
-
removeAll(type, id, items, onCompletion)
Removes the first instance of each member of items in the array at
type and id. onCompletion takes one param: err for error. NOTE:
removing an item that doesn't exist in the array does not generate an
error.
Defaults to call removeAll
with item
.
-
length(type, id, onCompletion)
Gets the length of the array value for type and id. Defaults to read
but drivers can override to give atomic results. onCompletion takes
two parameters: an err and the length results.
-
truncate(type, id, length, onCompletion)
Truncates the array value for type and id to new length length.
Defaults to read and update but drivers can override to give atomic
results. onCompletion takes one parameter: an err.
DatabankError
This is a subclass of Error for stuff that went wrong with a Databank.
Subclasses include:
-
NotImplementedError
That doesn't work (yet).
-
NoSuchThingError
The type/id pair you were trying to read/update/delete doesn't exist.
-
AlreadyExistsError
The type/id pair you were trying to create does exist.
-
NotConnectedError
You forgot to call connect first.
-
AlreadyConnectedError
You already called connect.
-
NoSuchItemError
There's no item in that array with that value.
-
WrongTypeError
You tried to use one of the array operators on a non-array value.
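
Because these are all subclasses of DatabankError, callers can branch on the most specific class first and fall back to a generic case. The sketch below is only an illustration: the classes are local stand-ins mirroring the names above (an application would take them from the databank module), subError and statusFor are hypothetical helpers, and the 501 mapping is an arbitrary choice for the example.

```javascript
// Stand-in base class mirroring the hierarchy listed above.
function DatabankError(message) {
    this.name = 'DatabankError';
    this.message = message;
}
DatabankError.prototype = Object.create(Error.prototype);
DatabankError.prototype.constructor = DatabankError;

// Hypothetical helper to declare a named subclass of DatabankError.
function subError(name) {
    var Cls = function(message) {
        DatabankError.call(this, message);
        this.name = name;
    };
    Cls.prototype = Object.create(DatabankError.prototype);
    Cls.prototype.constructor = Cls;
    return Cls;
}

var NoSuchThingError = subError('NoSuchThingError');
var AlreadyExistsError = subError('AlreadyExistsError');
var NotImplementedError = subError('NotImplementedError');

// Map an error (or null) to an HTTP status, most specific class first.
function statusFor(err) {
    if (!err) {
        return 200;
    } else if (err instanceof NoSuchThingError) {
        return 404;
    } else if (err instanceof AlreadyExistsError) {
        return 409;
    } else if (err instanceof NotImplementedError) {
        return 501;
    } else {
        return 500;
    }
}
```

A helper like this keeps the status-code decisions from the create and read examples in one place.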
DatabankObject
This is a utility class for objects you want to store in a
Databank. To create the class, do this:
var MyClass = DatabankObject.subClass('mytype');
This will make an object class that stores data in the 'mytype'
type. You can add more stuff to the class, of course.
The class's type is stored in MyClass.type.
The constructor takes an object as a parameter; it will copy all its
properties from this object. Good for "classifying" JSON. So:
var json = getSomeJSONfromSomewhere();
var myInst = new MyClass(json);
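
What "copy all its properties" amounts to can be sketched like this. subClassSketch is a hypothetical stand-in written for illustration; it is not DatabankObject.subClass() itself and it leaves out everything else the real class provides.

```javascript
// Hypothetical stand-in for a subClass()-style factory.
function subClassSketch(type) {
    var Cls = function(properties) {
        var self = this;
        // Shallow-copy every own property of the argument onto the instance.
        Object.keys(properties || {}).forEach(function(name) {
            self[name] = properties[name];
        });
    };
    Cls.type = type; // the type name lives on the class itself
    return Cls;
}

var MyClass = subClassSketch('mytype');

// "You can add more stuff to the class": an ordinary prototype method.
MyClass.prototype.describe = function() {
    return MyClass.type + ': ' + this.title;
};

var json = JSON.parse('{"title": "hello", "tags": ["a", "b"]}');
var myInst = new MyClass(json);
```

Note that the copy in this sketch is shallow, so nested arrays and objects are shared with the original parsed JSON.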
Each class has the following class methods:
Gets the class's databank. Used internally for making queries. By
default, gets the DatabankObject.bank property. If you want to change
how this works, replace this function with... something else.
Gets the class's primary key. By default, looks for a class attribute
"schema" and tries to get the "pkey" element of that. Otherwise, it
checks the class's schema, looks for an element that matches the type
name, and tries to get pkey element of that. If that fails, it looks
at the class's databank's "schema", and tries to get that. Otherwise,
it just returns "id". Override if you have a better plan.
Gets the object with primary key id and returns it to the callback.
search(criteria, callback)
Does a search for objects matching the criteria, collects them, and
returns an array to callback.
Finds all objects of this type and calls handler on each one. At the
end, fires callback with a single err parameter.
create(properties, callback)
Creates a new instance of class with properties
and returns it to callback.
Reads all objects from the databank with the given array of
primary-key ids, and returns a map of {id: object}.
Reads all objects from the databank with the given array of
primary-key ids, and returns an array of objects in the same order.
Each instance has the following methods:
update(properties, callback)
For an existing object, update to the provided properties, and return
the resulting object to callback. Note that you can pass only the
properties you want to change, but you can't use this method to
remove properties.
Delete the object. callback takes a single error arg.
Save the current state of the object, and return it to callback. Will
create new objects or update existing ones.
Hooks
When I started using this library, I found myself overloading the
create(), update(), and save() methods to do extra things, like add an
auto-generated ID or timestamp, or to expand attributes stored by
reference. It was a little tricky, since I had to save off the default
auto-created function, then define a new function that called that
saved one.
To make this easier, I added a hooks mechanism. Now, every
DatabankObject subclass has the option of hooking certain
functionality without having to replicate the core
functionality. Default values are all no-ops.
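
The general shape of such a hook can be sketched as follows. Everything here is a stand-in for illustration: Note is a plain object playing the role of a DatabankObject subclass, createWithHook is a hypothetical function showing where a hook would be consulted, and an array replaces the real databank write.

```javascript
// Hypothetical class object standing in for a DatabankObject subclass.
var Note = {
    // Default hook is a no-op that passes props through unchanged.
    beforeCreate: function(props, callback) {
        callback(null, props);
    }
};

// Sketch of a create() that consults the hook before storing anything.
function createWithHook(cls, props, store, callback) {
    cls.beforeCreate(props, function(err, hooked) {
        if (err) {
            return callback(err, null); // the hook vetoed the create
        }
        store.push(hooked); // stand-in for the real databank write
        callback(null, hooked);
    });
}

// Overriding the hook: reject notes without text, add a timestamp.
Note.beforeCreate = function(props, callback) {
    if (!props.text) {
        return callback(new Error('text is required'), null);
    }
    props.created = Date.now();
    callback(null, props);
};
```

The core create logic stays untouched; only the hook changes, which is the point of the mechanism compared with saving off and wrapping the original function.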
Class methods:
beforeCreate(props, callback)
Called before create(). A chance to add default values or validate.
callback takes two args: an err and the (possibly modified) props.
Instance methods:
Called after create(). Good chance to save references. callback takes
one arg: an err.
Called before get(). I don't see a lot of reason to mess with this,
but it's here if you need it. callback takes two args: an err and the
(possibly modified) id.
Called after get(). Good chance to expand references. callback takes
one arg: an err.
This is also called once for each instance returned in readAll().
beforeUpdate(props, callback)
Called before update(). Validate, preserve immutables, or add
auto-generated properties. callback takes two args: an err and the
(possibly modified) props.
Called after update(). callback takes one arg: an err.
Called before del(). Maybe prevent deleting something important?
Referential integrity? callback takes one arg: an err.
Called after del(). Delete related stuff? callback takes one arg: an
err.
Called before save(). Validate, preserve, autogenerate. callback takes
one arg: an err.
Called after save(). callback takes one arg: an err.
TODO
See https://github.com/e14n/databank/issues