We have some staff-only views that expose stats about how people use our app. Eventually, our tables grew so large that MySQL could no longer aggregate them all at once, so we use Bramble to generate those stats over time.
- Set up ActiveJob with a queue named :bramble
- Set up Redis and give Bramble a connection object:
my_redis_connection = Redis.new
Bramble.config do |conf|
  conf.redis_conn = my_redis_connection
end
- Define a module with map, reduce and items(options = {}) functions:
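module LetterCount
  # Returns the inputs to process; each item is passed to map.
  def self.items(filepath)
    File.read(filepath).split(" ")
  end

  # Yields one (key, value) pair per letter of the word.
  def self.map(word)
    letters = word.upcase.each_char
    letters.each { |letter| yield(letter, 1) }
  end

  # Every observation is 1, so the number of observations is the count.
  def self.reduce(letter, observations)
    observations.length
  end

  # Reports errors to Bugsnag.
  def self.on_error(err)
    Bugsnag.notify(err)
  end
end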
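To get a feel for what this module computes, here is a minimal in-process sketch of the map, group, reduce flow that the function names suggest. It does not use Bramble, Redis or ActiveJob, and `run_locally` is a hypothetical helper written for this illustration, not part of Bramble's API. The module is a trimmed copy of the one above (no file access, no error reporting) so the sketch runs on its own.

```ruby
# Trimmed copy of LetterCount so this sketch is self-contained.
module LetterCount
  def self.map(word)
    word.upcase.each_char { |letter| yield(letter, 1) }
  end

  def self.reduce(letter, observations)
    observations.length
  end
end

# Hypothetical helper (not part of Bramble): calls map on every item,
# groups the yielded values by key, then calls reduce once per key.
def run_locally(mod, items)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  items.each do |item|
    mod.map(item) { |key, value| grouped[key] << value }
  end
  grouped.to_h { |key, values| [key, mod.reduce(key, values)] }
end

counts = run_locally(LetterCount, %w[to be or not to be])
counts["O"] # => 4
counts["B"] # => 2
```

Running everything in one process like this is only useful for checking the logic of map and reduce on a small input; the real job is expected to spread the same calls across ActiveJob workers.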
Inputs and outputs are serialized with JSON, so some Ruby types will be lost (e.g., Symbols become Strings).
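The effect of JSON serialization can be checked with Ruby's standard `json` library alone. This sketch only shows a plain JSON round trip; it assumes nothing about how Bramble invokes the serializer internally, and the `original` hash is made up for the example.

```ruby
require "json"

# A value with Symbol keys and a Symbol value, as a map or reduce step
# might be tempted to return.
original = { status: :active, count: 3 }

# Round trip through JSON, as happens to any serialized input or output.
restored = JSON.parse(JSON.generate(original))

restored == { "status" => "active", "count" => 3 } # => true
restored["status"].class                           # => String
restored.key?(:status)                             # => false
```

In practice this means keys and values coming back from a job should be looked up as Strings, even if the module produced Symbols.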
- Start a job with a handle, a module, and an (optional) argument for finding data:
handle = "shakespeare-letter-count"
hamlet_path = "./shakespeare/hamlet.txt"
Bramble.map_reduce(handle, LetterCount, hamlet_path)
- Later, fetch the result using the handle:
result = Bramble.get("shakespeare-letter-count")
result.running?
result.finished?
result.data
result.percent_finished
result.percent_mapped
result.percent_reduced
result.finished_at
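Because the job runs in the background, a caller may want to poll until the result is ready. The following is a rough sketch of one way to do that. `wait_for_result` and `FakeResult` are hypothetical names introduced here, not part of Bramble; the only things assumed about a real result are the `finished?` and `data` methods listed above. The fake result stands in for `Bramble.get` so the sketch runs without Redis.

```ruby
# Hypothetical polling helper (not part of Bramble).
# `fetch` is any callable that returns an object responding to
# `finished?` and `data`, for example -> { Bramble.get(handle) }.
def wait_for_result(fetch, interval: 1, max_attempts: 30)
  max_attempts.times do
    result = fetch.call
    return result.data if result.finished?
    sleep(interval)
  end
  nil # gave up before the job finished
end

# Stand-in for a Bramble result, used only to make the sketch runnable.
FakeResult = Struct.new(:finished, :data) do
  def finished?
    finished
  end
end

# First poll: still running. Second poll: finished with data.
responses = [FakeResult.new(false, nil), FakeResult.new(true, { "A" => 3 })]
data = wait_for_result(-> { responses.shift }, interval: 0)
data == { "A" => 3 } # => true
```

With a real job, the call would look like `wait_for_result(-> { Bramble.get("shakespeare-letter-count") })`. Capping the number of attempts keeps a stuck job from blocking the caller indefinitely.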
- Delete the saved result:
Bramble.delete("shakespeare-letter-count")
Or delete everything:
Bramble.delete_all