Highly-Advanced Jedis client for Ruby
The raison d'être of this project can be pretty much described in one single
table:
Warming up --------------------------------------
            Postgres     1.000  i/100ms
               Redis     1.000  i/100ms
               Jedis     1.000  i/100ms
           Ruby hash     5.000  i/100ms
Calculating -------------------------------------
            Postgres      2.289  (± 0.0%) i/s -     12.000  in   5.248779s
               Redis      0.016  (± 0.0%) i/s -      1.000  in  60.470617s
               Jedis      0.047  (± 0.0%) i/s -      1.000  in  21.109477s
           Ruby hash     52.735  (±13.3%) i/s -    260.000  in   5.042344s
It looks slightly odd, but let's explain it a bit.
This is the result of a plain benchmark done on a Mac OS X development
machine using JRuby 9.2.0.0 and several libraries:
- Sequel with jdbc-postgres for the Postgres row
- Plain redis-rb for the Redis row
- For the Jedis row the jedis_rb gem was used, which is a simple wrapper over the jedis Java library
- Finally, the Ruby hash row is an in-memory baseline that used a plain Ruby hash
For the test, a table was created in a Postgres database that looked like a key-value store, something along the lines of:
SQL
CREATE TABLE IF NOT EXISTS things (
k text,
v text,
PRIMARY KEY (k)
);
TRUNCATE TABLE things;
SQL
The SQL query used:
SELECT v FROM things WHERE k='#{random_key}';
The Redis query used:
redis.get(random_key)
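As a rough illustration of the in-memory baseline, the sketch below times random lookups against a plain Ruby hash using only the standard library. The key count, the number of lookups, and the `key-N` naming are assumptions made for this example; the original benchmark's data set and harness settings are not stated here.

```ruby
require "benchmark"

# Hypothetical data set size and key format; not taken from the original benchmark.
key_count = 1_000
keys = Array.new(key_count) { |i| "key-#{i}" }
hash = keys.each_with_object({}) { |key, memo| memo[key] = "value-#{key}" }

# Hypothetical number of lookups per run.
lookups = 10_000
hits = 0

# Wall-clock time for a batch of random key lookups.
elapsed = Benchmark.realtime do
  lookups.times do
    random_key = keys.sample
    hits += 1 if hash[random_key]
  end
end

puts format("%d lookups (%d hits) in %.4fs", lookups, hits, elapsed)
```

The same loop shape applies to the other rows by swapping `hash[random_key]` for the corresponding query.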
Redis should be faster, especially when used as a key-value store, but Postgres actually blew it out of the water in this little test, which seemed odd.
First, let's talk about the elephant in the room, in this case the significant amount of time required by redis-rb. It's a pure-Ruby library and most of its time is spent in userland (280 seconds out of the 298 seconds), while the other libraries rely on faster code: both the JDBC driver and Jedis are used through Java bindings, and JRuby optimizes the hash quite nicely.
So, in this case, the language was the bottleneck, and maybe some gains can be achieved if the JRuby team looks into the redis-rb gem and suggests some optimizations.
However, it's not so obvious why jedis_rb was so slow compared to Sequel + jdbc-postgres, especially since it relies on the pretty fast jedis Java library (which is also one of the libraries recommended by Redis).
In this case, the answer lies in the real time being smaller than the total time, which probably means that there are things happening in parallel. So the answer is probably in the connection pool.
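To make the real versus total distinction concrete, here is a small sketch using the standard library's `Benchmark.measure`. The `busy_work` and `measure_work` helpers, the thread count, and the iteration count are hypothetical and only stand in for a batch of lookups. On a runtime with truly parallel threads, such as JRuby, the CPU total can exceed the wall-clock real time; on MRI the global VM lock keeps CPU-bound threads from running in parallel, so the two numbers will usually be close.

```ruby
require "benchmark"

# Hypothetical CPU-bound task standing in for a batch of lookups.
def busy_work(iterations)
  sum = 0
  iterations.times { |i| sum += i * i }
  sum
end

# Runs the same task on several threads and returns a Benchmark::Tms.
def measure_work(thread_count, iterations_per_thread)
  Benchmark.measure do
    threads = Array.new(thread_count) do
      Thread.new { busy_work(iterations_per_thread) }
    end
    threads.each(&:join)
  end
end

tms = measure_work(4, 200_000)

# total is the CPU time (user + system, including child processes);
# real is the elapsed wall-clock time.
puts format("total: %.3fs, real: %.3fs", tms.total, tms.real)
```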
There are two things you can do with a connection pool:
- Create a limited set of connections that can be reused by the application when needed.
- Parallelize the requests, when possible.
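The two points above can be sketched with a minimal, standard-library-only pool. This is not how Sequel, jedis, or HAJ implement pooling; `TinyPool` and `FakeConnection` are hypothetical names invented for this example, and the fake connection reads from a hash instead of talking to Redis.

```ruby
# Hypothetical stand-in for a Redis connection; a real pool would wrap
# actual client objects instead.
class FakeConnection
  def initialize(store)
    @store = store
  end

  def get(key)
    @store[key]
  end
end

# Minimal fixed-size pool: connections live in a thread-safe queue and are
# checked out for the duration of a block, then returned.
class TinyPool
  def initialize(size)
    @connections = Queue.new
    size.times { @connections << yield }
  end

  def with
    connection = @connections.pop # blocks until a connection is free
    yield connection
  ensure
    @connections << connection if connection
  end
end

store = { "a" => "1", "b" => "2", "c" => "3" }.freeze
pool = TinyPool.new(2) { FakeConnection.new(store) }

# Parallelize the requests: one thread per key, with at most two lookups
# in flight because the pool only holds two connections.
values = %w[a b c].map do |key|
  Thread.new { pool.with { |connection| connection.get(key) } }
end.map(&:value)

puts values.inspect
```

Using a queue keeps checkout and return thread-safe without explicit locking, and the pool size caps how many requests run concurrently.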
The Sequel gem was doing both, while jedis_rb didn't even have an option to set the number of concurrent connections. Actually, when digging a bit through Sequel and also through jedis, it becomes rather obvious that the Sequel developers spent a lot of time and effort maximizing the throughput of the library by using as many of the connections in the connection pool as possible.
This is where HAJ comes in.
The goal of this project is to have a more advanced connection pool for Jedis which can take advantage of multiple parallel connections to Redis and dramatically increase the throughput.
Obviously, this is a work in progress so please don't use it in production.