Map Reduce for leveldb (via levelup)
Incremental map-reduces and real-time results.
Waat?
An "incremental map reduce" means when you update one key,
only the relevant portion of the data needs to be recalculated.
"Real-time results" means that you can listen to the database
and receive change notifications on the fly, a la level-live-stream.
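To make the incremental idea concrete, here is a minimal in-memory sketch. It is not how map-reduce is implemented (the library persists its state in leveldb); `createIncrementalView`, `put` and `get` are hypothetical names, and the sketch assumes each key emits at most one value.

```javascript
// In-memory illustration only: when a key changes, re-reduce just the
// group(s) that key contributes to, not the whole data set.
function createIncrementalView (map, reduce, initial) {
  const groups = new Map()   // group -> Map(source key -> mapped value)
  const groupOf = new Map()  // source key -> group it last emitted to
  const results = new Map()  // group -> reduced value

  function rereduce (group) {
    let acc = initial
    for (const v of (groups.get(group) || new Map()).values()) {
      acc = reduce(acc, v)
    }
    results.set(group, acc)
  }

  function put (key, value) {
    const dirty = new Set()
    // Withdraw the key's previous contribution, if it had one.
    if (groupOf.has(key)) {
      const old = groupOf.get(key)
      groups.get(old).delete(key)
      groupOf.delete(key)
      dirty.add(old)
    }
    // Record the new contribution.
    map(key, value, function emit (group, mapped) {
      if (!groups.has(group)) groups.set(group, new Map())
      groups.get(group).set(key, mapped)
      groupOf.set(key, group)
      dirty.add(group)
    })
    // Only the touched groups are recalculated.
    dirty.forEach(function (group) { rereduce(group) })
    return Array.from(dirty)
  }

  return { put: put, get: function (group) { return results.get(group) } }
}

const view = createIncrementalView(
  function (key, value, emit) { emit(value.group, value.lines) },
  function (acc, v) { return acc + v },
  0
)
view.put('a', {group: 'x', lines: 2})
view.put('b', {group: 'y', lines: 5})
view.put('c', {group: 'x', lines: 3})

// Updating 'a' recalculates group 'x' only; group 'y' is left alone.
const touched = view.put('a', {group: 'x', lines: 10})
console.log(touched, view.get('x'), view.get('y'))
```

The `put` call returns the groups it had to recalculate, which makes it easy to see that an update to one key leaves unrelated groups untouched.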
Example
Create a simple map-reduce:
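var levelup = require('levelup')
var mapReduce = require('map-reduce')

// `file` is the path to the database directory
levelup(file, {createIfMissing:true}, function (err, db) {
  if (err) throw err

  // install the map-reduce plugin on this db
  mapReduce(db)

  db.mapReduce.add({
    name  : 'example',
    start : '',
    end   : '~',
    map   : function (key, value, emit) {
      // emit each document's line count under ['all', group]
      var obj = JSON.parse(value)
      emit(['all', obj.group], ''+obj.lines.length)
    },
    reduce: function (acc, value, key) {
      // sum the counts; values are passed around as strings
      return ''+(Number(acc) + Number(value))
    },
    initial: '0'
  })
})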
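To see what the example's `map` and `reduce` compute, they can be run by hand without a database. The following is a sketch: the records are made up, and the folding loop is only a stand-in for what the library does. It groups by the emitted key and does not show any roll-up into shorter key prefixes.

```javascript
// Copies of the example's functions, run outside the database.
function map (key, value, emit) {
  const obj = JSON.parse(value)
  emit(['all', obj.group], '' + obj.lines.length)
}

function reduce (acc, value) {
  return '' + (Number(acc) + Number(value))
}

// Hypothetical records, stored as JSON strings like the example expects.
const records = {
  doc1: JSON.stringify({group: 'readme', lines: ['a', 'b']}),
  doc2: JSON.stringify({group: 'readme', lines: ['c']}),
  doc3: JSON.stringify({group: 'src', lines: ['d', 'e', 'f']})
}

// Fold every emitted value into its group, starting from initial '0'.
const totals = {}
Object.keys(records).forEach(function (key) {
  map(key, records[key], function emit (group, value) {
    const id = JSON.stringify(group)
    totals[id] = reduce(totals[id] === undefined ? '0' : totals[id], value)
  })
})

console.log(totals) // 'readme' and 'src' each total '3' lines
```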
map-reduce uses level-hooks and level-queue to make map reduces durable.
querying results.
db.mapReduce.view(viewName, {start: ['all', group]})
db.mapReduce.view(viewName, {start: ['all', true]})
db.mapReduce.view(viewName, {start: []})
db.mapReduce.view(viewName, {start: ['all', group1], end: ['all', groupN]})
db.mapReduce.view() returns an instance of level-live-stream.
By default, the stream will stay open and continue to give you the latest results.
This may be disabled by passing {tail: false}.
The stream responds correctly to stream.pause() and stream.resume().
db.mapReduce.view(viewName, {start: ['all', true], tail: false})
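Consuming a finite (`tail: false`) view might look like the sketch below. To keep it runnable without a database, a plain Node `Readable` stands in for the stream that `db.mapReduce.view()` returns. `fakeView` and `collectSlowly` are hypothetical helpers, and the `{key, value}` record shape is an assumption, not something this README specifies.

```javascript
const { Readable } = require('stream')

// Stand-in for db.mapReduce.view(viewName, {tail: false}).
// The record shape is assumed for illustration.
function fakeView (records) {
  return Readable.from(records)
}

// Collect every record, pausing the stream while each one is "processed".
function collectSlowly (stream, cb) {
  const rows = []
  stream.on('data', function (row) {
    rows.push(row)
    stream.pause()                                // stop the flow...
    setImmediate(function () { stream.resume() }) // ...and pick it up again
  })
  stream.on('end', function () { cb(null, rows) })
  stream.on('error', cb)
}

collectSlowly(
  fakeView([
    {key: 'readme', value: '3'},
    {key: 'src', value: '3'}
  ]),
  function (err, rows) {
    if (err) throw err
    console.log(rows.length + ' rows')
  }
)
```

With the default tailing behaviour the stream stays open, so an `end`-based collector like this one would only make sense when `tail: false` is passed.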
complex aggregations
map-reduce supports multiple levels of aggregation.
Suppose we are building a database of all the street food in the world.
The data looks like this:
{
country: USA | Germany | Cambodia, etc...
state: CA | NY | '', etc...
city: Oakland | New York | Berlin | Phnom Penh, etc...
type: taco | chili-dog | doner | noodles, etc...
}
We will aggregate to per-region counts that look like this:
{
'taco': 23497,
'chili-dog': 5643,
etc...
}
First we'll map the raw data to ([country, state, city], type) tuples.
Then we'll count up all the instances of each type in that region.
var levelup = require('levelup')
var mapReduce = require('map-reduce')

// `file` is the path to the database directory
levelup(file, {createIfMissing:true}, function (err, db) {
  if (err) throw err

  mapReduce(db)

  db.mapReduce.add({
    name  : 'streetfood',
    map   : function (key, value, emit) {
      var obj = JSON.parse(value)
      emit(
        [obj.country, obj.state || '', obj.city],
        JSON.stringify(obj.type)
      )
    },
    reduce: function (acc, value) {
      acc = JSON.parse(acc)
      value = JSON.parse(value)
      // a string is a single mapped type: count one more of it
      if('string' === typeof value) {
        acc[value] = (acc[value] || 0) + 1
        return JSON.stringify(acc)
      }
      // an object is an aggregate of counts: merge it into acc
      for(var type in value) {
        acc[type] = (acc[type] || 0) + value[type]
      }
      return JSON.stringify(acc)
    },
    initial: '{}'
  })
})
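The two branches of the reduce function can be exercised by hand. This sketch assumes that lower-level aggregates are fed back into the same `reduce`, which is what the object-merging branch suggests. `fold` is a hypothetical helper and the city data is made up.

```javascript
// Stand-alone copy of the streetfood reduce: a string value counts one
// item of that type, an object value is merged as partial counts.
function reduce (acc, value) {
  acc = JSON.parse(acc)
  value = JSON.parse(value)
  if (typeof value === 'string') {
    acc[value] = (acc[value] || 0) + 1
    return JSON.stringify(acc)
  }
  for (const type in value) {
    acc[type] = (acc[type] || 0) + value[type]
  }
  return JSON.stringify(acc)
}

// Fold a list of JSON-encoded values, starting from initial '{}'.
function fold (values) {
  return values.reduce(function (acc, value) { return reduce(acc, value) }, '{}')
}

// City level: the inputs are the JSON-encoded type strings emitted by map.
const oakland = fold(['"taco"', '"taco"', '"chili-dog"'])
const losAngeles = fold(['"taco"'])

// State level: the inputs are the city aggregates.
const california = fold([oakland, losAngeles])

console.log(california) // {"taco":3,"chili-dog":1}
```

Because the same function handles both raw types and partial counts, the state total can be built from city totals without revisiting the individual records.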
Then query it like this:
db.mapReduce.view('streetfood', {start: ['USA', 'CA'], tail: false})
.pipe(...)
db.mapReduce.view('streetfood', {start: ['USA', true]})
License
MIT