
metacrunch-elasticsearch
This is the official Elasticsearch package for the metacrunch ETL toolkit.
NOTE: metacrunch-elasticsearch 5.x requires Elasticsearch 7.x. For older versions of Elasticsearch, try metacrunch-elasticsearch 4.x.
Include the gem in your Gemfile
gem "metacrunch-elasticsearch", "~> 5.0.0"
and run $ bundle install to install it.
Or install it manually
$ gem install metacrunch-elasticsearch
Note: For working examples on how to use this package check out our demo repository.
Metacrunch::Elasticsearch::Source
This class provides a metacrunch source implementation that can be used to read data from Elasticsearch into a metacrunch job.
# my_job.metacrunch
# Create an Elasticsearch connection
elasticsearch = Elasticsearch::Client.new(...)
# Set the source
source Metacrunch::Elasticsearch::Source.new(elasticsearch, OPTIONS)
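As a rough sketch of what OPTIONS might look like, the hash below uses the two options documented under Options. The index name, the query, and the chosen size and scroll values are placeholders. Placing index and body inside :search_options assumes that the hash is handed to the Elasticsearch client's search call, which the README implies but does not spell out.

```ruby
# Hypothetical options for Metacrunch::Elasticsearch::Source.
# Index name, query and tuning values are placeholders, not gem defaults.
total_hits = nil

source_options = {
  search_options: {
    index: "my-index",                  # placeholder index name
    body: { query: { match_all: {} } }, # standard Elasticsearch query DSL
    size: 500,                          # documents per scroll page (README default: 100)
    scroll: "5m"                        # scroll context lifetime (README default: 1m)
  },
  # Called with the total number of hits the query will match,
  # e.g. to initialise a progress bar.
  total_hits_callback: ->(total) { total_hits = total }
}

# In a job file this hash would be passed along as:
#   source Metacrunch::Elasticsearch::Source.new(elasticsearch, source_options)
```

Storing the total in a local variable keeps the sketch free of extra dependencies; a real job would more likely hand the number to a progress bar library.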
Options
:search_options: A hash with search options (including your query) as described here. Some meaningful defaults are set: size: 100, scroll: 1m, sort: ["_doc"]. Depending on your use case, you may need to adjust :size and :scroll for optimal performance.
:total_hits_callback: A Proc that gets called with the total number of hits your query will match. You can use this callback to set up a progress bar, for example. Defaults to nil.
Metacrunch::Elasticsearch::Destination
This class provides a metacrunch destination implementation that can be used to write data from a metacrunch job to Elasticsearch.
The data passed to the destination must be in the proper format. You can use a transformation to convert your data before it reaches the destination.
As Metacrunch::Elasticsearch::Destination uses the Elasticsearch bulk API, the expected format must match one of the available options for the body parameter described here. Note that the bulk API can be used not only to index records; you can update or delete records as well.
# my_job.metacrunch
# Transform data into a format that the destination can understand.
# In this example `data` is some hash.
transformation ->(data) do
  {
    index: {
      _index: "my-index",
      _id: data.delete(:id),
      data: data
    }
  }
end
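The same shape can be used for the other bulk actions mentioned above. The sketch below shows what update and delete transformations might look like, following the body format of the elasticsearch-ruby bulk helper (an action key wrapping metadata and an optional data key). The lambda names and the index name are placeholders, and the input is copied rather than mutated, which is a design choice of this sketch.

```ruby
# Hypothetical transformations producing bulk "update" and "delete" actions.
# "my-index" and the lambda names are placeholders.

# Partial update: the changed fields go under data: { doc: ... }.
to_update = ->(data) do
  doc = data.dup          # copy so the caller's hash is left untouched
  id  = doc.delete(:id)
  { update: { _index: "my-index", _id: id, data: { doc: doc } } }
end

# Delete: only metadata is needed, no data payload.
to_delete = ->(data) do
  { delete: { _index: "my-index", _id: data[:id] } }
end

to_update.call({ id: 1, title: "foo" })
# => { update: { _index: "my-index", _id: 1, data: { doc: { title: "foo" } } } }
to_delete.call({ id: 2 })
# => { delete: { _index: "my-index", _id: 2 } }

# In a job file either lambda could be registered with:
#   transformation to_update
```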
It is not efficient to call Elasticsearch for every single record. Therefore, we can use a transformation with a buffer to group records into bulks. In this example we use a buffer size of 10. In production environments, and depending on your data, larger buffers may be useful.
# my_job.metacrunch
transformation ->(data) { data }, buffer: 10
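To illustrate the effect of a buffer size of 10, the plain-Ruby sketch below groups 25 placeholder records into batches with each_slice. This is only an analogy for what the destination receives, not metacrunch's actual buffering code, and it assumes the final partial buffer is flushed when the job ends.

```ruby
# Illustration only: each_slice stands in for metacrunch's buffer.
# The 25 records are placeholders.
records = (1..25).map { |i| { index: { _index: "my-index", _id: i, data: {} } } }

# With a buffer of 10, the destination would see arrays of bulk actions
# rather than single records.
bulks = records.each_slice(10).to_a

bulks.map(&:size) # => [10, 10, 5]
```

Under this assumption, 25 records result in three bulk requests instead of 25 single calls.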
With these transformations in place, you can use the Metacrunch::Elasticsearch::Destination class as a destination.
# my_job.metacrunch
# Write data into Elasticsearch
destination Metacrunch::Elasticsearch::Destination.new(elasticsearch [, OPTIONS])
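As a sketch of the :result_callback option listed under Options, the Proc below collects the items of a bulk response that failed. It assumes the callback receives the bulk response as a Hash with string keys, with a top-level "errors" flag and an "items" array as returned by the Elasticsearch bulk API. The sample response is hand-written for illustration.

```ruby
failed = []

# Hypothetical callback: records every failed item of a bulk response.
result_callback = ->(result) do
  next unless result["errors"] # top-level flag, true if any item failed

  result["items"].each do |item|
    action, details = item.first # e.g. "index" and its per-item result
    next unless details["error"]

    failed << { action: action, id: details["_id"], reason: details["error"]["reason"] }
  end
end

# Hand-written sample in the shape of a bulk API response, for illustration only.
sample = {
  "errors" => true,
  "items" => [
    { "index" => { "_id" => "1", "status" => 201 } },
    { "index" => { "_id" => "2", "status" => 400,
                   "error" => { "type" => "mapper_parsing_exception",
                                "reason" => "failed to parse" } } }
  ]
}

result_callback.call(sample)
failed # => [{ action: "index", id: "2", reason: "failed to parse" }]

# In a job file the Proc would be passed as an option:
#   destination Metacrunch::Elasticsearch::Destination.new(
#     elasticsearch, result_callback: result_callback
#   )
```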
Options
:result_callback: A Proc that gets called with the result from the bulk operation. Defaults to nil.
:bulk_options: A hash of options for the Elasticsearch bulk API as described here. Setting body here will be ignored. Defaults to {}.
metacrunch-elasticsearch is available on GitHub under the MIT license.