Benchmark Lab
Run Real Experiment and Calculate Non-Parametric Statistics.
Requirements
The ruby version required is at least 2.1
.
Installation
Install it yourself as:
$ gem install benchmark-lab
Usage
There are two ways to use it:
- classic: as Benchmark.bm does
- iterative: collects and measures separately, stores into different JSON
files, then put everything together and rank them
Classic Usage
require 'benchmark/lab'
n = 5_000_000
cases = {
'for:' => proc { for i in 1..n; a = "1"; end },
'times:' => proc { n.times do ; a = "1"; end },
'upto:' => proc { 1.upto(n) do ; a = "1"; end }
}
nbr_of_samples = 20
Benchmark.experiment(nbr_of_samples) do |x|
cases.each { |label, blk| x.report(label, &blk) }
end
The output looks like the following:
user system total real
for: [0.77,0.77,0.78] [0.00,0.00,0.00] [0.77,0.77,0.78] [0.77,0.77,0.78]
times: [0.74,0.74,0.74] [0.00,0.00,0.00] [0.74,0.74,0.74] [0.74,0.74,0.74]
upto: [0.75,0.75,0.75] [0.00,0.00,0.00] [0.75,0.75,0.75] [0.75,0.75,0.75]
The best "times:" is significantly (95%) better (total time).
Iterative Usage
require 'benchmark/lab'
n = 5_000_000
nbr_of_samples = 20
jsons = []
jsons << Benchmark.observe_and_summarize(nbr_of_samples) do |x|
x.report('for') { for i in 1..n; a = "1"; end }
end
jsons << Benchmark.observe_and_summarize(nbr_of_samples) do |x|
x.report('times') { n.times do ; a = "1"; end }
end
jsons << Benchmark.observe_and_summarize(nbr_of_samples) do |x|
x.report('upto') { 1.upto(n) do ; a = "1"; end }
end
best, is_h0_rejected = Benchmark.aggregate_and_rank(jsons.map { |json| JSON.parse(json) })
puts best
puts is_h0_rejected
The output looks like the following:
{"name"=>"total", "sample"=>[0.6899999999999977, 0.6899999999999977, 0.6899999999999977, 0.6899999999999977, 0.6900000000000013, 0.6900000000000048, 0.6900000000000048, 0.6999999999999957, 0.6999999999999957, 0.6999999999999957, 0.6999999999999957, 0.6999999999999957, 0.6999999999999993, 0.6999999999999993, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028, 0.7000000000000028], "sample_size"=>20, "minimum"=>0.6899999999999977, "maximum"=>0.7000000000000028, "first_quartile"=>0.690000000000003, "third_quartile"=>0.7000000000000028, "median"=>0.6999999999999957, "interquartile_range"=>0.009999999999999787, "label"=>"upto"}
true
Ideas
- compare two different implementations of a same function
- get the stats, then compare
- use git (commit, branch)
- use tests to check no performance regression at the same time
- annotate the tests you want to check
- decide the sample size automatically (based on the power you want to reach)
- explain correctly why we should do that
Contributing
- Fork it ( https://github.com/toch/benchmark-lab/fork )
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create a new Pull Request