elasticsearch-helper
A Node.js module that makes it easier to query Elasticsearch clusters.
After running into many issues with the way Elasticsearch handles queries, I created this helper. It is currently used in production and has helped us drastically improve the readability and flexibility of our code.
With this helper you can query your Elasticsearch clusters very easily. Everything is chainable and the query always returns a promise.
NOTE: Even though we use this in production, we still find bugs and add improvements to the module codebase. Feel free to fork it and modify it for your own needs.
npm install --save elasticsearch-helper
const ES = require("elasticsearch-helper")
// Will create a default client
ES.addClient("127.0.0.1:9200");
// Will create a client with name "client1"
ES.addClient("client1","127.0.0.1:9200");
// Will create a client with name "client1" and will be used as default
ES.addClient("client1","127.0.0.1:9200",true);
// Alias:
ES.AddClient(...)
The client is chainable, which means you can call functions one after the other until you execute the query. The query then returns a promise.
Initialise a query:
// Querying on index "Index1"
ES.query("Index1");
// Querying on all indexes starting with "Index"
ES.query("Index*");
// Querying on index "Index1" and type "Type1"
ES.query("Index1","Type1");
// Querying on index "Index1" and type "Type1" using the client "Client1"
ES.query("Index1","Type1").use("Client1")
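The chaining pattern described above can be illustrated with a minimal sketch. `QuerySketch` is a hypothetical class written for this example, not part of elasticsearch-helper; it only shows how each method can return the builder until `run()` hands back a promise.

```javascript
// Hypothetical illustration of a chainable builder; not the library's code.
class QuerySketch {
  constructor(index, type) {
    this.state = { index: index, type: type, client: "default" };
  }
  // Each configuration method stores a value and returns the builder itself.
  use(clientName) {
    this.state.client = clientName;
    return this;
  }
  size(n) {
    this.state.size = n;
    return this;
  }
  // run() ends the chain: it returns a promise instead of the builder.
  run() {
    return Promise.resolve(Object.assign({}, this.state));
  }
}

const sketch = new QuerySketch("Index1", "Type1").use("Client1").size(10);
const pending = sketch.run();
pending.then(function (state) {
  console.log(state);
});
```

Because every method except `run()` returns the same object, the order of the configuration calls does not matter in this sketch.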
We implemented some helpers based on what we were using a lot.
New ones will be added over time.
NOTE: All those methods return a promise.
Easily copy an index/type to another client/index/type using bulk inserts.
NOTE1: You can copy based on a query; see below for how to build queries.
NOTE2: If you want to copy millions of rows, remember to set size(); Elasticsearch-helper will create a scroll.
//Copy from index1 to index2
ES.query("Index1")
.copyTo(ES.query("Index2"));
//Copy from index1 to index2 on client2
ES.query("Index1")
.copyTo(ES.query("Index2").use("client2"));
//Copy from index1, type1 to index2, type1
ES.query("Index1","Type1")
.copyTo(ES.query("Index2"));
//Copy from index1, type1 to index2, type2
ES.query("Index1","Type1")
.copyTo(ES.query("Index2","Type2"));
//Copy documents whose first name is Josh from index1 to index2
ES.query("Index1")
.must(
ES.type.term("first_name","Josh"),
)
.copyTo(ES.query("Index2"));
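copyTo is described as using bulk inserts. This README does not show how the helper builds them, so the following is only a sketch of the general shape of an Elasticsearch bulk request body: an action line followed by the document source for each hit. `toBulkBody` and the sample hits are hypothetical.

```javascript
// Hypothetical helper: turn raw search hits into a bulk "index" body
// aimed at another index. Not taken from elasticsearch-helper.
function toBulkBody(hits, targetIndex, targetType) {
  const body = [];
  hits.forEach(function (hit) {
    // Action line: where the document goes. The source type is kept
    // when no target type is given.
    body.push({
      index: {
        _index: targetIndex,
        _type: targetType || hit._type,
        _id: hit._id
      }
    });
    // Source line: the document itself.
    body.push(hit._source);
  });
  return body;
}

const sampleHits = [
  { _index: "Index1", _type: "Type1", _id: "1", _source: { first_name: "Josh" } },
  { _index: "Index1", _type: "Type1", _id: "2", _source: { first_name: "Alan" } }
];
const bulkBody = toBulkBody(sampleHits, "Index2");
console.log(JSON.stringify(bulkBody, null, 2));
```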
Delete an index
WARNING: This operation is final and cannot be reverted unless you have a snapshot. Use at your own risk.
NOTE: For security reasons, you cannot delete multiple indexes at the same time.
//Delete index1
ES.query("Index1")
.deleteIndex();
//Delete index1 from client2
ES.query("Index1")
.use("client2")
.deleteIndex();
Check if an index exists.
ES.query("Index1")
.exists();
ES.query("Index1")
.use("client2")
.exists();
A method can be created to handle errors (for logging or formatting, for example). This error method is part of a Promise chain and should return something if processing needs to continue.
Errors are always processed as Promise rejections.
// Global error handling for all queries
ES.onError(function(err){
console.log("This message will appear after every error")
return err;
})
// Query specific error handling
ES.query("Index1","Type1")
.onError(function(err){
//This onError will overwrite the global onError method for this query.
console.log("This message will appear after this query has an error")
return err;
})
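One way to read the behaviour described above: the query-specific handler, when present, replaces the global one, and whatever the handler returns becomes the rejection value. The sketch below models that with plain promises. `runWithHandlers` is a hypothetical function, and the real module may behave differently.

```javascript
// Hypothetical model of the described error flow; an assumption, not the
// module's implementation.
function runWithHandlers(operation, globalHandler, queryHandler) {
  // A query-specific handler takes precedence over the global one.
  const handler = queryHandler || globalHandler;
  return operation().catch(function (err) {
    // The handler may log or reformat; its return value is re-rejected so
    // callers still see the failure as a promise rejection.
    const processed = handler ? handler(err) : err;
    return Promise.reject(processed);
  });
}

const seen = [];
const outcome = runWithHandlers(
  function () { return Promise.reject(new Error("index_not_found")); },
  function (err) { seen.push("global"); return err; },
  function (err) { seen.push("query"); return err; }
);

outcome.catch(function (err) {
  console.log(seen, err.message);
});
```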
Running queries:
For these examples we will use the query variable 'q':
// initialise query
var q = ES.query("Index1","Type1");
q.id("ID")
.run()
.then(function(hit){
// return hit object or false if not found
console.log(hit.id()) // get Document ID
console.log(hit.index()) // get Document index
console.log(hit.type()) // get Document type
console.log(hit.data()) // get Document source
})
q.id("ID")
.delete()
.then(function(hit){
// return true
})
q.id("ID")
.body({...}) // Data object to store
.run()
.then(function(hit){
// return the data object
})
q.id("ID")
.update({...}) // Data object to update
.run()
.then(function(hit){
// return the data object
})
q.id("ID")
.upsert({...}) // Data object to upsert
.run()
.then(function(hit){
// return the data object
})
This helper includes the different search features of Elasticsearch such as must, must_not, etc.
GETs and DELETEs use the same methodology for query building. Example:
q.must(
// Term type
ES.type.term("fieldname","fieldvalue"),
// Add a sub filter in the query
ES.filter.should(
ES.type.terms("fieldname2","fieldvalues")
)
)
ES.filter.must(/* search types as arguments */);
ES.filter.must_not(/* search types as arguments */);
ES.filter.should(/* search types as arguments */);
ES.filter.filter(/* search types as arguments */);
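In the Elasticsearch query DSL, these four filters share their names with the occurrence clauses of a `bool` query. As a rough picture of what a chain like the q.must(...) example might translate to, here is the equivalent raw DSL. The helper's actual output is not shown in this README, and `bool` is a hypothetical function.

```javascript
// Hypothetical combinator that builds a bool query with one occurrence clause.
function bool(occurrence, clauses) {
  const query = { bool: {} };
  query.bool[occurrence] = clauses;
  return query;
}

// Same shape as the q.must(...) example: a term clause plus a nested
// "should" group, written as raw query DSL objects.
const dsl = bool("must", [
  { term: { fieldname: "fieldvalue" } },
  bool("should", [
    { terms: { fieldname2: ["value1", "value2"] } }
  ])
]);

console.log(JSON.stringify(dsl, null, 2));
```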
NOTE: not all types are currently implemented. Others will be added over time.
ES.type.term("fieldkey","fieldvalue");
// ex:
ES.type.term("name.first_name","josh");
ES.type.terms("fieldkey","fieldvalues as array");
// ex:
ES.type.terms("name.first_name",["josh","alan","jack"]);
ES.type.exists("fieldkey");
// ex:
ES.type.exists("name.first_name");
ES.type.range("fieldkey","range object options");
// ex:
ES.type.range("age",{
gte: 10,
lte: 30
});
ES.type.wildcard("fieldkey","fieldvalue");
// ex:
ES.type.wildcard("name.first_name","josh*");
ES.type.prefix("fieldkey","fieldvalue");
// ex:
ES.type.prefix("name.first_name","josh");
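For reference, each of the search types listed above has a counterpart of the same name in the Elasticsearch query DSL. The sketch below builds those raw clauses. The `clause` object is hypothetical and only mirrors the DSL; it is not the helper's `ES.type`.

```javascript
// Hypothetical builders for the raw query DSL clauses.
const clause = {
  term: (field, value) => ({ term: { [field]: value } }),
  terms: (field, values) => ({ terms: { [field]: values } }),
  exists: (field) => ({ exists: { field: field } }),
  range: (field, options) => ({ range: { [field]: options } }),
  wildcard: (field, pattern) => ({ wildcard: { [field]: pattern } }),
  prefix: (field, value) => ({ prefix: { [field]: value } })
};

console.log(JSON.stringify(clause.range("age", { gte: 10, lte: 30 })));
console.log(JSON.stringify(clause.exists("name.first_name")));
```

Note that `exists` is the odd one out: the field name is a value under the `field` key rather than a key itself.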
Nested is an advanced feature of Elasticsearch that allows queries on sub-documents, such as an array of objects. This type requires a specific mapping to be set up. For more information: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
In a nested query you always define the parent, and the filters always prepend the parent name. All filters are available.
This type can be combined with other types at any level and can contain sub nested queries.
ES.type.nested("parent","filter object");
// ex:
ES.type.nested("name",ES.filter.must(
ES.type.term("name.first", "josh"),
ES.type.term("name.last", "wake")
));
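In the raw query DSL, a nested query names the parent in `path` and carries the inner query in `query`. The following is a sketch of the request that the example above plausibly corresponds to; the helper's exact output is an assumption.

```javascript
// Raw query DSL for a nested query on the "name" sub-documents.
// Field names inside the inner query are prefixed with the parent path.
const nestedQuery = {
  nested: {
    path: "name",
    query: {
      bool: {
        must: [
          { term: { "name.first": "josh" } },
          { term: { "name.last": "wake" } }
        ]
      }
    }
  }
};

// Collect the inner field names to check that each starts with the parent path.
const innerFields = nestedQuery.nested.query.bool.must.map(function (c) {
  return Object.keys(c.term)[0];
});
console.log(innerFields);
```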
q.must(
// Types
).run().then(function(hits){
// return array of hits objects
var hit = hits[0];
console.log(hit.id()) // get Document ID
console.log(hit.index()) // get Document index
console.log(hit.type()) // get Document type
console.log(hit.data()) // get Document source
})
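The hit accessors map naturally onto the metadata fields of a raw Elasticsearch hit (`_id`, `_index`, `_type`, `_source`). A minimal sketch of such a wrapper follows; `wrapHit` and the sample hit are hypothetical and not taken from the module.

```javascript
// Hypothetical wrapper exposing raw hit fields through accessor methods.
function wrapHit(rawHit) {
  return {
    id: function () { return rawHit._id; },
    index: function () { return rawHit._index; },
    type: function () { return rawHit._type; },
    data: function () { return rawHit._source; }
  };
}

const rawHit = {
  _index: "Index1",
  _type: "Type1",
  _id: "42",
  _source: { first_name: "Josh" }
};
const wrapped = wrapHit(rawHit);
console.log(wrapped.id(), wrapped.index(), wrapped.type(), wrapped.data());
```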
Delete by query is only available on Elasticsearch 5.X
q.must(
// Types
).delete().then(function(hits){
// return array of hits objects
})
Count the documents
q.must(
// Types
).count().then(function(count){
// return count of documents
})
Elasticsearch has a very powerful aggregation system, but handling it can be tricky. I tried to solve this by wrapping it in what I think is the simplest way.
NOTE: Right now I only handle 2 types of aggregation, terms and date_histogram; others will be added over time.
q.aggs(
ES.agg.date_histogram("created_date")("date_created","1d")
// Child aggregation to the "created_date" aggregation
.aggs(
ES.agg.terms("first_name")("data.first_name")
)
// Add more aggregations
).run()
.then(function(response){
// retrieve the "created_date" aggregation
var arrayAggList = response.agg("created_date")
var arrayValues = arrayAggList.values() // return an array of values objects. array types values will depend on the aggregation type
var firstValue = arrayValues[0];
var valueID = firstValue.id(); // key of the value. If it is a date_histogram type it will be a moment object
var valueData = firstValue.data(); // value of the aggregation for this key.
// To retrieve a child aggregation:
// Note: Each parent aggregation value has its own aggregation so you will have to loop through to get the child aggregation
var arrayChildAggList = arrayAggList.agg("first_name");
for(var parentKeyvalue in arrayChildAggList){
arrayChildAggList[parentKeyvalue].values().forEach(function(value){
console.log(parentKeyvalue, value.id(),value.data());
})
}
})
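For comparison, this is roughly what the same traversal looks like on a raw Elasticsearch response, where each parent bucket carries its own child aggregation. The sample response is invented for illustration.

```javascript
// Invented sample shaped like a raw Elasticsearch aggregation response
// for a date_histogram with a terms sub-aggregation.
const rawResponse = {
  aggregations: {
    created_date: {
      buckets: [
        {
          key_as_string: "2017-01-01T00:00:00.000Z",
          key: 1483228800000,
          doc_count: 3,
          first_name: {
            buckets: [
              { key: "josh", doc_count: 2 },
              { key: "alan", doc_count: 1 }
            ]
          }
        }
      ]
    }
  }
};

// Flatten parent and child buckets into [date, name, count] rows.
const rows = [];
rawResponse.aggregations.created_date.buckets.forEach(function (parent) {
  parent.first_name.buckets.forEach(function (child) {
    rows.push([parent.key_as_string, child.key, child.doc_count]);
  });
});
console.log(rows);
```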
ES.agg.terms("aggregation name")("field to aggregate on"[,"options object"])
interval: a string using a time unit, for example "1d"
ES.agg.date_histogram("aggregation name")("field to aggregate on","interval")
ES.agg.average("aggregation name")("field to aggregate on")
NOTE: The aggregations below do not support sub-aggregations. An error will be thrown.
ES.agg.cardinality("aggregation name")("field to aggregate on")
ES.agg.extended_stats("aggregation name")("field to aggregate on")
ES.agg.maximum("aggregation name")("field to aggregate on")
ES.agg.minimum("aggregation name")("field to aggregate on")
ES.agg.sum("aggregation name")("field to aggregate on")
ES.agg.value_count("aggregation name")("field to aggregate on")
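As a rough guide to what these builders describe, this is the request-body form of a `date_histogram` with a `terms` child next to a `sum` metric, written in the Elasticsearch 5.x style of the aggregation DSL. Whether the helper emits exactly this is an assumption, and the `total_age` aggregation name is made up for the example.

```javascript
// Raw aggregation DSL (Elasticsearch 5.x style "interval" parameter).
const aggsBody = {
  created_date: {
    date_histogram: { field: "date_created", interval: "1d" },
    // Bucket aggregations can carry child aggregations.
    aggs: {
      first_name: { terms: { field: "data.first_name" } }
    }
  },
  // Metric aggregations return a value and take no children.
  total_age: { sum: { field: "age" } }
};

// List which top-level aggregations declare children.
const withChildren = Object.keys(aggsBody).filter(function (name) {
  return Object.prototype.hasOwnProperty.call(aggsBody[name], "aggs");
});
console.log(withChildren);
```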
// will retrieve 1000 results maximum
// all queries with a size over 500 will be converted into a scroll.
q.size(1000)
// Works with size
// will retrieve results from index 10
q.from(10)
q.fields(["name","id"])
// will change/retrieve the type
q.type("type1")
q.sort([{ "post_date" : {"order" : "asc"}}, ...])
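These options line up with standard search request parameters. The sketch below assembles a plain request body and applies the 500-document scroll threshold mentioned in the comments above. `buildRequest` is hypothetical, and mapping `fields` to `_source` filtering is an assumption about how the helper restricts returned fields.

```javascript
// From the comments above: queries with a size over 500 become a scroll.
const SCROLL_THRESHOLD = 500;

// Hypothetical function: collect the options into a search request body.
function buildRequest(options) {
  const body = {};
  if (options.size !== undefined) body.size = options.size;
  if (options.from !== undefined) body.from = options.from;
  if (options.fields) body._source = options.fields; // assumed mapping
  if (options.sort) body.sort = options.sort;
  return {
    body: body,
    useScroll: options.size !== undefined && options.size > SCROLL_THRESHOLD
  };
}

const request = buildRequest({
  size: 1000,
  fields: ["name", "id"],
  sort: [{ post_date: { order: "asc" } }]
});
console.log(request.useScroll, JSON.stringify(request.body));
```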
const ES = require("elasticsearch-helper")
ES.AddClient("client1","127.0.0.1:9200");
ES.query("Index1","Type1")
.use("client1")
.size(10)
.must(
ES.addType().term("name","John"),
ES.addType().terms("lastname",["Smith","Wake"])
)
.must_not(
ES.addType().range("age",{
gte:20,
lte:30
})
)
.run()
.then(function(hits){
//hits array
})
const ES = require("elasticsearch-helper")
ES.AddClient("client1","127.0.0.1:9200");
ES.Query("user")
.size(1001) // when an aggregation is set, size is set to 0.
.must(
ES.type.term("name","jacques"),
ES.type.range("age",{gt:20,lte:40}),
ES.filter.should(
ES.type.term("color","blue"),
ES.type.term("vehicle","car")
)
)
.aggs(
ES.agg.date_histogram("created_date")("date_created","1d")
// Child aggregation to the "created_date" aggregation
.aggs(
ES.agg.terms("first_name")("data.first_name.raw")
)
)
.run()
.then(function(response){
// retrieve the "created_date" aggregation
var arrayAggList = response.agg("created_date")
var arrayValues = arrayAggList.values() // return an array of values objects. array types values will depend on the aggregation type
var firstValue = arrayValues[0];
var valueID = firstValue.id(); // key of the value. If it is a date_histogram type it will be a moment object
var valueData = firstValue.data(); // value of the aggregation for this key.
// To retrieve a child aggregation:
// Note: Each parent aggregation value has its own aggregation so you will have to loop through to get the child aggregation
var arrayChildAggList = arrayAggList.agg("first_name");
for(var parentKeyvalue in arrayChildAggList){
arrayChildAggList[parentKeyvalue].values().forEach(function(value){
console.log(parentKeyvalue, value.id(),value.data());
})
}
}).catch(function(err){
// error
console.log(err)
})