
metacrunch-elasticsearch
This is the official Elasticsearch package for the metacrunch ETL toolkit.
NOTE: metacrunch-elasticsearch 5.x requires Elasticsearch 7.x. For older versions of Elasticsearch, use metacrunch-elasticsearch 4.x.
Include the gem in your Gemfile
gem "metacrunch-elasticsearch", "~> 5.0.0"
and run $ bundle install to install it.
Or install it manually
$ gem install metacrunch-elasticsearch
Note: For working examples of how to use this package, check out our demo repository.
Metacrunch::Elasticsearch::Source
This class provides a metacrunch source implementation that reads data from Elasticsearch into a metacrunch job.
# my_job.metacrunch
# Create an Elasticsearch connection
elasticsearch = Elasticsearch::Client.new(...)
# Set the source
source Metacrunch::Elasticsearch::Source.new(elasticsearch, OPTIONS)
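As a rough sketch of what OPTIONS could look like in the call above, the following builds a hash from the two options listed under Options below. The index name, the query, and the larger :size and :scroll values are hypothetical, and merging over a local copy of the defaults is only an illustration, not necessarily how the gem applies its defaults internally.

```ruby
# Defaults as listed in this README, copied here only for illustration.
default_search_options = { size: 100, scroll: "1m", sort: ["_doc"] }.freeze

# Hypothetical index name and query. The larger :size and longer :scroll
# are example values, not recommendations.
search_options = default_search_options.merge(
  index: "my-index",
  size: 500,
  scroll: "5m",
  body: { query: { match_all: {} } }
)

# Proc that receives the total number of hits, e.g. to size a progress bar.
reported_total = nil
total_hits_callback = ->(total_hits) do
  reported_total = total_hits
  puts "Expecting #{total_hits} documents"
end

source_options = {
  search_options: search_options,
  total_hits_callback: total_hits_callback
}

# In a job file this hash would take the place of OPTIONS:
#   source Metacrunch::Elasticsearch::Source.new(elasticsearch, source_options)

# Simulate the source invoking the callback with a made-up hit count.
source_options[:total_hits_callback].call(1234)
```

Keeping sort: ["_doc"] while raising :size is usually the cheapest way to scroll through a large result set, since no scoring or sorting work is needed.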
Options
:search_options: A hash of search options (including your query), as described in the Elasticsearch search API documentation. The following defaults are set: size: 100, scroll: "1m", sort: ["_doc"]. Depending on your use case, you may need to adjust :size and :scroll for optimal performance.
:total_hits_callback: A Proc that is called with the total number of hits your query matches. You can use this callback to set up a progress bar, for example. Defaults to nil.
Metacrunch::Elasticsearch::Destination
This class provides a metacrunch destination implementation that writes data from a metacrunch job to Elasticsearch.
The data passed to the destination must be in the expected format. You can use a transformation to convert your data before it reaches the destination.
Because Metacrunch::Elasticsearch::Destination uses the Elasticsearch bulk API, the data must match one of the formats accepted by the body parameter, as described in the Elasticsearch bulk API documentation. Note that the bulk API is not limited to indexing records: you can update or delete records as well.
# my_job.metacrunch
# Transform data into a format that the destination can understand.
# In this example `data` is some hash.
transformation ->(data) do
  {
    index: {
      _index: "my-index",
      _id: data.delete(:id),
      data: data
    }
  }
end
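Since the bulk API also accepts update and delete actions, a transformation could choose the action per record. The sketch below assumes a hypothetical convention in which a :deleted flag marks records to remove and a :partial flag marks partial updates. Neither flag is part of metacrunch or Elasticsearch. The resulting hashes follow the body format of the Elasticsearch Ruby client's bulk method.

```ruby
# Hypothetical convention: records flagged :deleted are removed from the
# index, records flagged :partial are partially updated, and everything
# else is indexed.
to_bulk_action = ->(data) do
  data = data.dup                 # avoid mutating the caller's hash
  meta = { _index: "my-index", _id: data.delete(:id) }

  if data.delete(:deleted)
    { delete: meta }
  elsif data.delete(:partial)
    { update: meta.merge(data: { doc: data }) }
  else
    { index: meta.merge(data: data) }
  end
end

indexed = to_bulk_action.call({ id: 1, title: "foo" })
updated = to_bulk_action.call({ id: 2, title: "bar", partial: true })
deleted = to_bulk_action.call({ id: 3, deleted: true })

puts indexed.inspect
puts updated.inspect
puts deleted.inspect

# In a job file the lambda would be registered with:
#   transformation to_bulk_action
```

Unlike the example above, this variant works on a copy of the record, so the original hash keeps its :id.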
Calling Elasticsearch for every single record is inefficient. Instead, use a transformation with a buffer to group records into bulks. This example uses a buffer size of 10. In production environments, and depending on your data, larger buffers may be useful.
# my_job.metacrunch
transformation ->(data) { data }, buffer: 10
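To illustrate the effect of the buffer, the sketch below groups 25 hypothetical bulk actions into batches of at most 10 using plain Ruby. It is not metacrunch's implementation. It only mimics the grouping, and it assumes the final partial buffer is flushed when the source is exhausted.

```ruby
# 25 hypothetical bulk actions, shaped like the output of the
# transformation shown earlier.
actions = (1..25).map do |i|
  { index: { _index: "my-index", _id: i, data: { title: "Record #{i}" } } }
end

# Group the actions into arrays of at most 10 elements, mimicking the
# effect of `buffer: 10`. Each array would become one bulk request.
bulks = actions.each_slice(10).to_a

bulk_sizes = bulks.map(&:size)
puts bulk_sizes.inspect # prints [10, 10, 5]
```

With a buffer of 10, the 25 records result in three bulk requests instead of 25 single-record calls.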
With these transformations in place, you can use the Metacrunch::Elasticsearch::Destination class as a destination.
# my_job.metacrunch
# Write data into Elasticsearch
destination Metacrunch::Elasticsearch::Destination.new(elasticsearch [, OPTIONS])
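The :result_callback option listed under Options below can be used to detect failed bulk items. The following is a minimal sketch, assuming the callback receives the parsed bulk response as a hash with string keys ("errors", "items"), which is the shape of an Elasticsearch bulk response. The sample response is made up for illustration.

```ruby
failed_items = []

# Collect every bulk item that Elasticsearch reported as failed.
result_callback = ->(result) do
  next unless result["errors"]    # nothing to do when no item failed

  result["items"].each do |item|
    action, details = item.first  # e.g. ["index", { "_id" => "2", ... }]
    next unless details["error"]

    failed_items << { action: action, id: details["_id"], status: details["status"] }
  end
end

# Made-up bulk response with one successful and one failed item.
sample_result = {
  "took" => 30,
  "errors" => true,
  "items" => [
    { "index" => { "_index" => "my-index", "_id" => "1", "status" => 201 } },
    { "index" => { "_index" => "my-index", "_id" => "2", "status" => 400,
                   "error" => { "type" => "mapper_parsing_exception",
                                "reason" => "failed to parse" } } }
  ]
}

result_callback.call(sample_result)
puts failed_items.inspect

# In a job file the Proc would be passed as an option:
#   destination Metacrunch::Elasticsearch::Destination.new(
#     elasticsearch, result_callback: result_callback
#   )
```

Checking the top-level "errors" flag first avoids walking the item list for bulks that succeeded completely.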
Options
:result_callback: A Proc that is called with the result of the bulk operation. Defaults to nil.
:bulk_options: A hash of options for the Elasticsearch bulk API, as described in the Elasticsearch bulk API documentation. A body set here is ignored. Defaults to {}.
metacrunch-elasticsearch is available on GitHub under the MIT license.