Sequel::Elasticsearch
Sequel::Elasticsearch allows you to transparently mirror your database, or specific tables, to Elasticsearch. It's especially useful if you want the search power of Elasticsearch while keeping the sanity and structure of a relational database.
Installation
Add this line to your application's Gemfile:
gem 'sequel-elasticsearch'
And then execute:
$ bundle
Or install it yourself as:
$ gem install sequel-elasticsearch
Usage
Require the gem with:
require 'sequel/plugins/elasticsearch'
You'll need an Elasticsearch cluster to sync your data to. By default the gem will try to connect to http://localhost:9200. Set the ELASTICSEARCH_URL ENV variable to the URL of your cluster.
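If you would rather set the variable from Ruby than from the shell, something like the following should work. This is a minimal sketch: the host name is a made-up placeholder, and it assumes the assignment runs before the plugin first connects to the cluster.

```ruby
# Hypothetical cluster address, used only when the variable is not already set.
ENV['ELASTICSEARCH_URL'] ||= 'http://search.example.com:9200'

puts "Elasticsearch URL: #{ENV['ELASTICSEARCH_URL']}"
```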
This is a Sequel plugin, so you can enable it DB wide:
Sequel::Model.plugin :elasticsearch
Or per model:
Document.plugin :elasticsearch
# or
class Document < Sequel::Model
  plugin :elasticsearch
end
There are a couple of options you can set:
Sequel::Model.plugin :elasticsearch,
  elasticsearch: { log: true },
  index: 'all-my-data',
  type: 'is-mine'
And that's it! Just transact as you normally would, and your records will be created and updated in the Elasticsearch cluster.
Indexing
Ensure that you create the index mappings for your data before using this plugin. Otherwise Elasticsearch will guess the field types from the first documents it receives, which can lead to unexpected search results.
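As an illustration of what such a mapping might look like, here is a hedged sketch that builds a mapping body in plain Ruby. The field names (`title`, `body`) are borrowed from the search examples in this README, and the `text` types are assumptions. The layout shown is the typeless form used by recent Elasticsearch versions; older clusters that still use mapping types nest `properties` under the type name instead.

```ruby
require 'json'

# Hypothetical mapping for a Document model with `title` and `body` columns.
# Field names and types are assumptions for illustration only.
MAPPING = {
  mappings: {
    properties: {
      title: { type: 'text' },
      body:  { type: 'text' }
    }
  }
}.freeze

# JSON body you could send to the cluster when creating the index.
puts JSON.pretty_generate(MAPPING)
```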
By default, records are indexed using the model's values call. Should you need to customize what's indexed, you can define an indexed_values method (or an as_indexed_json method if you prefer the Rails way).
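A customized indexed_values method could look roughly like this. To keep the snippet runnable without a database, `Document` here is a plain Ruby stand-in rather than a real Sequel::Model, and `internal_notes` is a hypothetical column name.

```ruby
# Stand-in for a Sequel::Model subclass so the snippet runs without a database.
# In a real application `values` is provided by Sequel.
class Document
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # Index everything except columns that should not be searchable.
  # :internal_notes is a hypothetical column name.
  def indexed_values
    values.reject { |column, _value| column == :internal_notes }
  end
end

doc = Document.new(id: 1, title: 'Sequel', internal_notes: 'draft')
p doc.indexed_values
```

In a real model, only the indexed_values method would be needed inside your existing class.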
Searching
Your model is now searchable through Elasticsearch. Just pass a string that can be parsed as an Elasticsearch query string query.
Document.es('title:Sequel')
Document.es('title:Sequel AND body:Elasticsearch')
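When the search terms come from user input, it can help to build the query string programmatically. The helper below is a hypothetical sketch, not part of the gem: it wraps each value in double quotes (so values are matched as phrases) and escapes backslashes and double quotes inside them.

```ruby
# Build a query string such as: title:"Sequel" AND body:"Elasticsearch"
# Values are wrapped in double quotes so they are matched as phrases;
# backslashes and double quotes inside a value are escaped.
def es_query(terms)
  terms.map do |field, value|
    escaped = value.to_s.gsub(/(["\\])/) { "\\#{$1}" }
    %(#{field}:"#{escaped}")
  end.join(' AND ')
end

puts es_query(title: 'Sequel', body: 'Elasticsearch')
```

The resulting string can then be passed to the es method, for example `Document.es(es_query(title: 'Sequel'))`.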
The result from the es method is an enumerable containing Sequel::Model instances of your model:
results = Document.es('title:Sequel')
results.each { |e| p e }
The result also contains the meta info about the Elasticsearch query result:
results = Document.es('title:Sequel')
p results.count
p results.total
p results.timed_out
p results.took
You can also use the scroll API to search and fetch large datasets:
scroll = Document.es('test', scroll: '5m')
p scroll.scroll_id
puts "Found #{scroll.count} of #{scroll.total} documents"
scroll.each { |e| p e }
while (scroll = Document.es(scroll, scroll: '1m')) && !scroll.empty?
  puts "Found #{scroll.count} of #{scroll.total} documents"
  scroll.each { |e| p e }
end
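If you scroll in several places, the loop above can be wrapped in a small helper. This is a sketch under the assumption that `es` behaves as in the example: it accepts either a query string or a previous result, and returns nil or an empty result once the scroll is exhausted. The name `each_scrolled` is made up for illustration.

```ruby
# Yield every record matching `query`, following the scroll until a page
# comes back nil or empty.
# `model` is any class that responds to `es` the way the example above does.
def each_scrolled(model, query, scroll: '1m')
  page = model.es(query, scroll: scroll)
  while page && !page.empty?
    page.each { |record| yield record }
    # Passing the previous result back in fetches the next page.
    page = model.es(page, scroll: scroll)
  end
end
```

Usage would then be `each_scrolled(Document, 'test') { |doc| p doc }`.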
Import
You can import the whole dataset, or specify a dataset to be imported. This will create a new, timestamped index for your dataset, and import all the records from that dataset into the index. An alias will be created (or updated) to point to the newly created index.
Document.import!
Document.import!(dataset: Document.where(active: true))
Document.import!(
  index: 'active-documents',
  dataset: Document.where(active: true),
  batch_size: 20
)
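To make the timestamped index and alias switch more concrete, here is a rough sketch of the idea in plain Ruby. It is not the gem's implementation: the naming scheme and helper names are assumptions, and the hash is only the body you would hand to Elasticsearch's alias update API.

```ruby
# Name for a fresh index, e.g. "documents-20240131-120000".
# The exact naming scheme is an assumption, not the gem's format.
def timestamped_index(alias_name, now = Time.now.utc)
  "#{alias_name}-#{now.strftime('%Y%m%d-%H%M%S')}"
end

# Actions for Elasticsearch's alias update API: drop the alias from the old
# index (if there is one) and add it to the new index in a single request.
def alias_swap_actions(alias_name, new_index, old_index = nil)
  actions = []
  actions << { remove: { index: old_index, alias: alias_name } } if old_index
  actions << { add: { index: new_index, alias: alias_name } }
  { actions: actions }
end

new_index = timestamped_index('documents')
p alias_swap_actions('documents', new_index)
```

Because the alias moves in one request, searches against the alias switch from the old index to the new one without a gap.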
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/jrgns/sequel-elasticsearch.
Features that need to be built:
License
The gem is available as open source under the terms of the MIT License.