Elasticsearch ODM
Like Mongoose but for Elasticsearch. Define models, perform CRUD operations, and build advanced search queries. Most commands and functionality that exist in Mongoose exist in this library. All asynchronous functions use Bluebird Promises instead of callbacks.
This is currently the only ODM/ORM library that exists for Elasticsearch on Node.js. Waterline has a plugin for Elasticsearch, but it is incomplete and does not fully harness its search capabilities.
Loopback has a storage plugin, but it also doesn't focus on important parts of Elasticsearch, such as mappings and efficient queries. This library automatically handles merging and updating Elasticsearch mappings based on your schema definition.
Installation
If you already have the elasticsearch npm package installed, you can remove it. If you still need the raw client, access it through the client property of this library.
$ npm install elasticsearch-odm-5
Features
- Easy to use API that mimics Mongoose, but cuts out the extras.
- Models, Schemas and Elasticsearch specific type mapping.
- Add Elasticsearch specific type options to your Schema, like boost, analyzer or score.
- Utilizes bulk and scroll features from Elasticsearch when needed.
- Easy search queries without generating your own DSL.
- Seamlessly handles updating your Elasticsearch mappings based on your model's Schema.
Quick Start
You'll find the API is intuitive if you've used Mongoose or Waterline.
Example (no schema):
let esodm = require('elasticsearch-odm-5');
let Car = esodm.model('Car');
let car = new Car({
  type: 'Ford', color: 'Black'
});
esodm.connect('my-index').then(function(){
  car.save().then(function(document){
    console.log(document);
  });
});
Example (using a schema):
let esodm = require('elasticsearch-odm-5');
let carSchema = new esodm.Schema({
  type: String,
  color: {type: String, required: true}
});
let Car = esodm.model('Car', carSchema);
API Reference
Core
Core methods can be called directly on the Elasticsearch ODM instance. These include methods to configure, connect, and get information from your Elasticsearch database. Most methods act upon the official Elasticsearch client.
.connect(String/Object options)
-> Promise
Returns a promise that is resolved when the connection is complete. Can be passed a single index name, or a full configuration object. The default host is localhost:9200 when no host is provided, or when only an index name is passed.
This method should be called at the start of your application.
If the index name does not exist, it is automatically created for you.
You can also add any of the Elasticsearch specific options, like SSL configs.
Example:
let fs = require('fs');
let esodm = require('elasticsearch-odm-5');

// Full configuration object.
esodm.connect({
  host: 'localhost:9200',
  index: 'my-index',
  logging: false,
  trace: true,
  ssl: {
    ca: fs.readFileSync('./cacert.pem'),
    rejectUnauthorized: true
  },
  options: {
    settings: {
      index: {
        number_of_shards: 1,
        number_of_replicas: 0
      }
    }
  }
});

// Or pass only an index name to use the default host.
esodm.connect('my-index');
.disconnect()
-> Promise
Returns a promise that is resolved when the disconnection is complete.
This method should be called to close the Elasticsearch connection.
Example:
let esodm = require('elasticsearch-odm-5');
esodm.connect('my-index')
  .then(function(){
    // Work with your models here.
  })
  .then(function(){
    return esodm.disconnect();
  })
  .then(function(){
    console.log('disconnected');
  });
new Schema(Object options)
-> Schema
Returns a new schema definition to be used for models.
.model(String modelName, Optional/Schema schema)
-> Model
Creates and returns a new Model, like calling Mongoose.model(). Takes a type name, which in MongoDB is known as the collection name. This is a global function and adds the model to the Elasticsearch ODM instance.
.client
-> Elasticsearch
The raw instance of the underlying Elasticsearch client. Not usually needed, but it's there if you need it, for example to run queries that aren't provided by this library.
.stats()
Returns a promise that is resolved with index stats for the current Elasticsearch connections.
.removeIndex(String index)
Takes an index name, and completely destroys the index. Resolves the promise when it's complete.
.createIndex(String index, Object mappings)
Takes an index name, and a JSON string or object representing your mapping.
Resolves the promise when it's complete.
Document
Like Mongoose, instances of models are considered documents, and are returned from calls like find() & create(). Documents include the following functions to make working with them easier.
.save()
-> Document
Saves or updates the document. If it doesn't exist it is created. Like Mongoose, Elasticsearch's internal '_id' is copied to 'id' for you. If you'd like to force a custom id, you can set the id property to something before calling save(). Every document gets a createdOn and updatedOn property set with ISO-8601 formatted time.
Note: To access a document immediately after insertion, you must pass {refresh: true} as the save() parameter. See index.refresh_interval. Forcing a refresh has a negative impact on Elasticsearch performance, but depending on your use case it can be mandatory.
Example:
let esodm = require('elasticsearch-odm-5');
let Car = esodm.model('Car');
let car = new Car({
  type: 'Ford', color: 'Black'
});
esodm.connect('my-index').then(function(){
  car
    .save({refresh: true})
    .then(function(document){
      console.log(document);
    });
});
.remove()
Removes the document and destroys the current document instance. No value is resolved, and missing documents are ignored.
.update(Object data)
-> Document
Partially updates the document. Data passed will be merged with the document, and the updated version will be returned. This also sets the current model instance with the new document.
.set(Object data)
-> Document
Completely overwrites the document with the data passed, and returns the new document. This also sets the current model instance with the new document.
Will remove any fields in the document that aren't passed.
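The difference between .update() and .set() can be illustrated with plain objects. This is only a sketch of the semantics described above, not the library's implementation: mergeUpdate and overwriteSet are hypothetical helper names, a shallow merge is assumed, and it is assumed that the id survives an overwrite.

```javascript
// Hypothetical helpers that mimic the documented semantics on plain objects.
// They do not call the library or Elasticsearch.

// .update(data): fields in `data` are merged into the existing document.
function mergeUpdate(doc, data) {
  return Object.assign({}, doc, data);
}

// .set(data): the document body is replaced; fields not passed are dropped.
// Assumption: the id is kept so the same document is overwritten.
function overwriteSet(doc, data) {
  return Object.assign({}, data, { id: doc.id });
}

const car = { id: 'abc', type: 'Ford', color: 'Black' };

console.log(mergeUpdate(car, { color: 'Red' }));
// { id: 'abc', type: 'Ford', color: 'Red' }

console.log(overwriteSet(car, { color: 'Red' }));
// { color: 'Red', id: 'abc' }
```

Note how the type field survives the merge but is gone after the overwrite.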
.toObject()
Like Mongoose, strips all non-document properties from the instance and returns a raw object.
Model
Model definitions returned from .model() in core include several static functions to help query and manage documents. Most functions are similar to Mongoose, but due to the differences in Elasticsearch, querying includes some extra advanced features.
.count()
-> Object
Object returned includes a 'count' property with the number of documents for this Model (also known as _type in Elasticsearch). See Elasticsearch count.
.create(Object data)
-> Document
A helper function. Similar to calling new Model(data).save(). Takes an object, and returns the new document.
.update(String id, Object data)
-> Document
A helper function. Similar to calling new Model().update(data). Takes an id and a partial object to update the document with.
.remove(String id)
Removes the document by its id. No value is resolved, and missing documents are ignored.
.removeByIds(Array ids)
Helper function, see remove. Takes an array of ids.
.set(String id, Object data)
-> Document
Completely overwrites the document matching the id with the data passed, and returns the new document.
Will remove any fields in the document that aren't passed.
.find(Object/String match, Object queryOptions)
-> Document
There are four ways to call .find() and its siblings. You can mix and match styles.
- Passing only a match object like
.find({name:'Joe'})
- Passing only a string to match against all document fields
.find('some string')
- Passing Query Options (match can be set to null/empty)
.find({}, {must: {active: true}, sort: 'createdOn'})
- Use chaining options (alias for QueryOptions)
.find({}).must({active: true}).sort('createdOn').then(..)
Note: the current version supports up to 10,000 results per query because of Elasticsearch's default limits.
Unlike Mongoose, finding exact matches requires the fields in your mapping to be set to 'not_analyzed'. By default {index: 'not_analyzed'} is added to all string fields in your Schema unless you override it.
Depending on the analyzer in your mapping, find queries like must, not, and matches may not find any results.
match => Optional. An alias for the 'must' Query Option. Like Mongoose this matches name/value in documents. Also, instead of an object, just a string can be passed which will match against all document fields using the power of an Elasticsearch QueryStringQuery.
queryOptions => Optional (can also use chaining instead). An object with Query Options. Here you can specify paging, filtering, sorting and other advanced options. See the Query Options section for more details. You can set the first argument to null and use only the filters from the query options if you want.
returns => Found documents, or null if nothing was found.
Example:
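let Car = esodm.model('Car');

// Match object.
Car.find({color: 'blue'}).then(function(results){
  console.log(results);
});

// Nested field using dot notation.
Car.find({'location.city': 'New York'})

// Query options only.
Car.find(null, {sort: 'createdOn'})

// String matched against all document fields.
Car.find('some text')

// Chained options.
Car.find()
  .must({color: 'blue'})
  .exists('owner')
  .sort('createdOn')
  .then(...)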
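The calling styles above all describe the same thing: one set of Query Options. The sketch below shows one way the two arguments could be folded together, based on the description of match as an alias for must and of a string as a query across all fields. normalizeFindArgs is a hypothetical name used for illustration and is not part of this library's API.

```javascript
// Hypothetical helper: folds the arguments of .find() into one options object.
// It only illustrates the documented aliases and does not call the library.
function normalizeFindArgs(match, queryOptions) {
  const options = Object.assign({}, queryOptions);
  if (typeof match === 'string') {
    // A string is searched across all document fields (the `q` option).
    options.q = match;
  } else if (match && Object.keys(match).length > 0) {
    // An object is an alias for the `must` option.
    options.must = Object.assign({}, options.must, match);
  }
  return options;
}

console.log(normalizeFindArgs({ color: 'blue' }));
// { must: { color: 'blue' } }

console.log(normalizeFindArgs('some text'));
// { q: 'some text' }

console.log(normalizeFindArgs(null, { sort: 'createdOn' }));
// { sort: 'createdOn' }
```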
.findById(String id, Object queryOptions)
-> Document
Finds a document by id. The queryOptions argument is optional; its fields option specifies the fields of the document you'd like to include.
.findByIds(Array ids, Object queryOptions)
-> Document
Same as .findById() but for multiple documents.
.findOne(Object/String match, Object queryOptions)
-> Document
Same arguments as .find(). Returns the first matching document.
.findAndRemove(Object/String match, Object queryOptions)
-> 'Object'
Same arguments as .find(). Removes all matching documents and returns their raw objects.
.findOneAndRemove(Object/String match, Object queryOptions)
-> 'Object'
Same arguments as .findAndRemove(). Removes the first found document.
.makeInstance(Object data)
-> Document
Helper function. Takes a raw object and creates a document instance out of it. The object would need at least an id property. The document returned can be used normally as if it were returned from other calls like .find().
.toMapping()
Returns a complete Elasticsearch mapping for this model based on its schema. If no schema was used, it returns nothing. Used internally, but it's there if you'd like it.
Query Options
The query options object includes several options that are normally included in mongoose chained queries, like sort, and paging (skip/limit), and also some advanced features from Elasticsearch.
The Elasticsearch Query and Filter DSL is generated using best practices.
page & per_page
Type: Integer
For most use cases, paging is better suited than skip/limit, so this library includes this instead. Pages 0 and 1 are the same thing, so either can be used. If only one of page and per_page is set, the other uses its default: page defaults to the first page, and per_page defaults to 10.
Including page or per_page will result in the response being wrapped in a metadata object like the following. You can call toJSON and toObject on this response and it'll call that method on all document instances under the hits property.
{
total: 0,
hits: [],
page: 0,
pages: 0
}
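To make the paging rules concrete, here is a minimal sketch of how page and per_page could translate into the from and size parameters of an Elasticsearch search, and how the pages value could be derived from total. toFromSize and countPages are hypothetical names, and the arithmetic is an assumption based on the description above rather than the library's actual code.

```javascript
// Hypothetical helpers illustrating the paging arithmetic.

// Translates page/per_page into Elasticsearch's from/size.
function toFromSize(options) {
  const perPage = options.per_page || 10;        // per_page defaults to 10
  const page = Math.max(options.page || 1, 1);   // page 0 and page 1 are both the first page
  return { from: (page - 1) * perPage, size: perPage };
}

// Number of pages needed to show `total` hits.
function countPages(total, perPage) {
  return Math.ceil(total / perPage);
}

console.log(toFromSize({ page: 0 }));               // { from: 0, size: 10 }
console.log(toFromSize({ page: 1 }));               // { from: 0, size: 10 }
console.log(toFromSize({ page: 3, per_page: 20 })); // { from: 40, size: 20 }
console.log(countPages(45, 20));                    // 3
```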
fields
Type: Array or String
A list of fields to include in the documents returned. For example, you could pass 'id' to only return the matching document ids. See Elasticsearch Fields.
{
fields: ['name', 'age']
}
.find()
.fields(['name', 'age'])
.then(...)
sort
Type: Array or String
A list of fields to sort on. If multiple fields are passed then they are executed in order. Adding a '-' sign to the start of the field name makes it sort descending. Default is ascending. See Elasticsearch Sort.
Example:
{
sort: ['name', 'createdOn']
}
.find()
.sort(['name', 'createdOn'])
.then(...)
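As an illustration of the '-' prefix convention, the sketch below converts a sort option into the sort clauses Elasticsearch accepts in a search body. toSortClauses is a hypothetical name; how the library performs this conversion internally is not shown in this README.

```javascript
// Hypothetical helper: turns 'name' / '-createdOn' style sort fields
// into Elasticsearch sort clauses, preserving their order.
function toSortClauses(sort) {
  const fields = Array.isArray(sort) ? sort : [sort];
  return fields.map(function (field) {
    const descending = field.charAt(0) === '-';
    const name = descending ? field.slice(1) : field;
    return { [name]: { order: descending ? 'desc' : 'asc' } };
  });
}

console.log(toSortClauses(['name', '-createdOn']));
// [ { name: { order: 'asc' } }, { createdOn: { order: 'desc' } } ]
```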
q
Type: String
A string to search all document fields with using Elasticsearch QueryStringQuery. This can be expensive, so use it sparingly.
Example:
{
q: 'Red dog run'
}
.find('Red dog run')
.then(...)
must
Type: Object
Key value pairs to match documents against. Essentially it's the same as the first argument passed to Mongoose .find(). This is also an alias to the first argument passed to .find() in this library.
This is a 'must' Bool Filter.
Elasticsearch's internal Tokenizers are used, and fields are analyzed.
You can query nested fields using dot notation.
Example:
{
must: {
name: 'Jim',
'location.country': 'Canada'
}
}
.find()
.must({name: 'Jim', 'location.country': 'Canada'})
.then(...)
not
Type: Object
The same as must, but matches documents where the key value pairs DON'T match.
This is a 'must_not' Bool Filter query.
You can query nested fields using dot notation.
Example:
{
not: {
name: 'Jim',
'location.country': 'Canada'
}
}
.find()
.not({name: 'Jim', 'location.country': 'Canada'})
.then(...)
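For readers unfamiliar with the Elasticsearch DSL, the following sketch shows roughly what a bool query built from must and not options could look like. It assumes each key value pair becomes a term clause; the clauses this library actually generates may differ. toBoolQuery is a hypothetical name.

```javascript
// Hypothetical helper: builds a bool query from `must` and `not` options.
// Assumption: every key/value pair becomes a `term` clause.
function toBoolQuery(options) {
  function toTerms(pairs) {
    return Object.keys(pairs || {}).map(function (field) {
      return { term: { [field]: pairs[field] } };
    });
  }
  return {
    bool: {
      must: toTerms(options.must),     // documents must match all of these
      must_not: toTerms(options.not)   // documents must match none of these
    }
  };
}

const query = toBoolQuery({
  must: { name: 'Jim' },
  not: { 'location.country': 'Canada' }
});
console.log(JSON.stringify(query, null, 2));
```

A term clause matches the stored token exactly, which is why exact matching depends on the field being not_analyzed.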
missing
Type: Array or String
A single field name, or array of field names. Matches documents where these field names are missing. A field is considered missing when it is null, empty, or does not exist. See MissingFilter (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html).
Example:
{
missing: ['description', 'name']
}
.find()
.missing(['description', 'name'])
.then(...)
exists
Type: Array or String
A single field name, or array of field names. Matches documents where these field names exist. The opposite of missing.
Example:
{
exists: ['description', 'name']
}
.find()
.exists(['description', 'name'])
.then(...)
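Since missing is the opposite of exists, both options can be expressed with the Elasticsearch exists query. On Elasticsearch 5.x the standalone missing query is no longer available, and the same condition is written as an exists clause under must_not. The sketch below shows that shape; toExistenceQuery is a hypothetical name, and the DSL this library emits may be different.

```javascript
// Hypothetical helper: expresses `exists` and `missing` options as a bool query.
function toExistenceQuery(options) {
  function toExists(fields) {
    // Accept a single field name or an array of field names.
    const list = fields == null ? [] : [].concat(fields);
    return list.map(function (field) {
      return { exists: { field: field } };
    });
  }
  return {
    bool: {
      must: toExists(options.exists),       // these fields must be present
      must_not: toExists(options.missing)   // these fields must be absent
    }
  };
}

const existenceQuery = toExistenceQuery({ exists: 'owner', missing: ['description', 'name'] });
console.log(JSON.stringify(existenceQuery, null, 2));
```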
Schemas
Models don't require schemas, but it's best to use them - especially if you'll be making search queries. Elasticsearch-odm will generate and update Elasticsearch with the proper mappings based off your schema definition.
The schemas are similar to Mongoose, but several new field types have been added which Elasticsearch supports. These are: float, double, long, short, byte, binary, geo_point. Generally for numbers, only the Number type is needed (which converts to Elasticsearch integer). You can read more about Elasticsearch types here.
NOTE
- Types can be defined in several ways. The regular mongoose types exist, or you can use the actual type names Elasticsearch uses.
- You can also add any of the field options you see for Elasticsearch Core Types.
- String types will default to "index": "not_analyzed". See Custom Field Mappings. This is so the .find() call acts like it does in Mongoose by only finding exact matches. However, this prevents full text search on this field. Simply set {"index": "analyzed"} if you'd like full text search instead.
Example:
let carSchema = new esodm.Schema({
  available: Boolean,
  safetyRating: 'float',
  parts: [String],
  oldPrices: {type: ['double']},
  color: {type: String, required: true},
  type: {type: String},
  owner: {
    name: String,
    age: Number,
    location: {type: 'geo_point', required: true}
  },
  inspections: [{
    date: Date,
    grade: Number
  }],
  description: {type: String, index: 'analyzed'},
  price: {type: 'double', ignore_malformed: true}
});
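To show how the rules in the notes could play out, here is a rough sketch that converts a single leaf field definition into an Elasticsearch mapping entry. It follows the defaults stated above (String becomes a not_analyzed string, Number becomes integer), but fieldToMapping is a hypothetical function, the Boolean and Date conversions are assumptions, and nested objects are deliberately not handled. For the real result, use .toMapping() on a model.

```javascript
// Assumed defaults for the Mongoose-style constructor types.
const BASIC_TYPES = new Map([
  [String, { type: 'string', index: 'not_analyzed' }],
  [Number, { type: 'integer' }],
  [Boolean, { type: 'boolean' }],
  [Date, { type: 'date' }]
]);

// Hypothetical helper: maps one leaf field definition to a mapping entry.
function fieldToMapping(definition) {
  // Elasticsearch has no array type, so an array maps to its element type.
  if (Array.isArray(definition)) return fieldToMapping(definition[0]);
  // Elasticsearch type names such as 'float' or 'geo_point' pass through.
  if (typeof definition === 'string') return { type: definition };
  if (BASIC_TYPES.has(definition)) return Object.assign({}, BASIC_TYPES.get(definition));
  // {type: ..., option: ...}: resolve the type, then let explicit options win.
  const options = Object.assign({}, definition);
  delete options.type;
  delete options.required; // assumed to be validation only, not a mapping option
  return Object.assign(fieldToMapping(definition.type), options);
}

console.log(fieldToMapping(String));
// { type: 'string', index: 'not_analyzed' }
console.log(fieldToMapping({ type: String, index: 'analyzed' }));
// { type: 'string', index: 'analyzed' }
console.log(fieldToMapping({ type: 'double', ignore_malformed: true }));
// { type: 'double', ignore_malformed: true }
```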
Hooks and Middleware
Schemas include pre and post hooks that function similarly to Mongoose. Currently, there are pre/post hooks for 'save' and 'remove'.
Pre Hooks
Same conventions as Mongoose. The function takes a done() callback that must be called when your function is finished. this is scoped to the current document. Passing an Error to done() will cancel the current operation. For example, in a pre 'save' hook, passing an error to done() will cause the document not to be saved and will return your error to the save() caller's rejection handler.
let schema = new esodm.Schema(...);
schema.pre('save', function(done){
console.log(this);
done();
});
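A pre hook that cancels the operation might look like the sketch below. The hook function follows the done() convention described above; requireColor and the error message are made up for illustration. The hook is exercised by hand with a plain object standing in for the document, so the snippet runs without an Elasticsearch connection.

```javascript
// A pre 'save' hook that rejects documents without a color.
// `this` is the current document when the library invokes the hook.
function requireColor(done) {
  if (!this.color) {
    // Passing an Error cancels the save.
    return done(new Error('color is required'));
  }
  done();
}

// Registration, as described above (needs a schema from this library):
// schema.pre('save', requireColor);

// Calling the hook by hand with a plain object in place of a document.
let received;
requireColor.call({ type: 'Ford' }, function (err) {
  received = err;
});
console.log(received.message); // color is required
```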
Post Hooks
Same conventions as Mongoose. Does not have a done() callback. Executed after the hooked method. The first argument is the current document which may or may not be a document instance (eg. post remove only receives the raw object as the document no longer exists).
let schema = new esodm.Schema(...);
schema.post('remove', function(document){
console.log(document);
});
Static and Instance Methods
Add methods to your schema with the same convention as Mongoose.
let schema = new esodm.Schema(...);
schema.methods.getFullName = function(){
  return this.firstName + ' ' + this.lastName;
};
schema.statics.findByColor = function(color){
  return this.find({color: color});
};
CHANGELOG
See here.
CONTRIBUTING
This is a library Elasticsearch desperately needed for Node.js. Currently the official npm elasticsearch client has about 23,000 downloads per week, and many of those users would benefit from this library instead. Pull requests are welcome. There are Mocha and benchmark tests in the root directory.
TODO
- Browser build.
- Add support for querying nested document arrays with dot notation syntax.
- Add scrolling
- Add a wrapper to enable streaming of document results.
- Add snapshots/backups
- Allow methods to call Elasticsearch facets.
- Performance tweak application, fix garbage collection issues, and do benchmark tests.
- Integrate npm 'friendly' for use with expanding/collapsing parent/child documents.
- Use source filtering instead of fields.
Elasticsearch 2.x :
- Find requests : use scroll api in order to get more than 10000 results if needed