contexture-elasticsearch
Elasticsearch Provider for Contexture
Usage
This provider takes a config object as a parameter and expects it to include a getClient
method, which should return an instantiated elasticsearch client.
Provider
This provider takes a config object as a parameter:
Option | Type | Description | Required |
--- | --- | --- | --- |
getClient | function | Returns an instantiated elasticsearch client | x |
searchWrapper | function | Higher order function for search caching | |
types | object | Contexture node types, like all other providers | |
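The table above describes searchWrapper only as a higher order function for search caching. The following is a minimal sketch of what such a wrapper could look like, assuming it receives the search function and returns a function with the same call shape. That signature, and the cachingSearchWrapper name, are assumptions for illustration rather than documented API.

```javascript
// Hypothetical caching wrapper. The (search) => wrappedSearch shape is an
// assumption, not a documented signature.
let cachingSearchWrapper = (search) => {
  let cache = new Map()
  return (request, ...rest) => {
    // Requests that serialize identically share one cached promise
    let cacheKey = JSON.stringify(request)
    if (!cache.has(cacheKey)) {
      let pending = Promise.resolve(search(request, ...rest)).catch((error) => {
        // Drop failed searches so a later call can retry
        cache.delete(cacheKey)
        throw error
      })
      cache.set(cacheKey, pending)
    }
    return cache.get(cacheKey)
  }
}
```

Under that assumption it would be passed as provider({ getClient, searchWrapper: cachingSearchWrapper }). A real cache would likely also need expiry and a size limit.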
Schemas
Schemas with an elasticsearch provider can specify any or all of the following properties:
Option | Type | Description | Required |
--- | --- | --- | --- |
index | string | Which ES index to use when querying | x |
Example Schema for SomeType in SomeIndex
module.exports = {
  elasticsearch: {
    index: 'SomeIndex',
  },
}
Setting up contexture
let _ = require('lodash/fp')
let Contexture = require('contexture')
let provider = require('contexture-elasticsearch')
let types = require('contexture-elasticsearch/types')
let schemas = require('./path/to/schemas')
let elasticsearch = require('elasticsearch')
let AgentKeepAlive = require('agentkeepalive')

let process = Contexture({
  schemas,
  providers: {
    elasticsearch: provider({
      // Memoized so that every call reuses the same client instance
      getClient: _.memoize(() =>
        elasticsearch.Client({
          minSockets: 1,
          maxSockets: 20,
          keepAlive: true,
          createNodeAgent: (connection, config) =>
            new AgentKeepAlive(connection.makeAgentConfig(config)),
        })
      ),
      types: types({
        geo: {
          // googleplaces is not defined in this snippet; it is assumed to be
          // a geocoding client that was set up elsewhere
          geocodeLocation: (query) =>
            googleplaces.textSearch({
              query,
            }),
        },
      }),
    }),
  },
})

// Run a search for a tree
process(tree)

// The same call with custom request options, here extra headers
process(tree, {
  requestOptions: {
    headers: {
      'custom-header-app-name': 'my-app-sent-this',
    },
  },
})
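The tree passed to process above is not shown in this README. As a rough sketch, it is a plain object of nested nodes. The group/key/join/schema/children shape below follows contexture core conventions and is assumed here, not defined by this provider. The node keys and the findByKey helper are hypothetical.

```javascript
// A sketch of a search tree. The node types (facet, results) are described
// later in this document; the schema name matches the example schema above.
let tree = {
  key: 'root',
  type: 'group',
  schema: 'SomeType',
  join: 'and',
  children: [
    { key: 'stateFilter', type: 'facet', field: 'CompanyState', values: ['FL'] },
    { key: 'searchResults', type: 'results' },
  ],
}

// Hypothetical helper: find a node anywhere in a tree by its key
let findByKey = (node, key) => {
  if (node.key === key) return node
  for (let child of node.children || []) {
    let match = findByKey(child, key)
    if (match) return match
  }
  return undefined
}
```

After processing, a helper like findByKey could be used to pull an individual node back out of the tree and read its results.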
Automatic Schema Detection
As of 0.10.0, an async getSchemas method is exposed on an instantiated provider. It reads the elasticsearch mappings and aliases to automatically generate schemas.
Generated schemas also include field definitions, which can be leveraged with something like exampleTypeSchemaMapping to make them fit for consumption by dynamic field pickers such as the one in contexture-react.
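One way detected schemas might be combined with hand-written ones is sketched below. It assumes getSchemas takes no arguments and resolves to an object keyed by schema name. Neither detail is confirmed above, and detectSchemas is a hypothetical helper, not part of this package.

```javascript
// Hypothetical helper: merge automatically detected schemas with overrides.
// Assumes provider.getSchemas() resolves to an object keyed by schema name.
let detectSchemas = async (provider, overrides = {}) => {
  let detected = await provider.getSchemas()
  // Hand-written schemas win over detected ones with the same name
  return { ...detected, ...overrides }
}
```

The merged object could then be passed as schemas when setting up Contexture.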
Default Types
Combo Filter + Result Types
These types both filter and have contextual results.
facet
Facet represents a list of dynamic choices, e.g. a checkbox list filter.
Input
Name | Type | Default | Description |
--- | --- | --- | --- |
field | string | None, required | The field it's operating on |
mode | include /exclude | include | Should this filter act as inclusion or exclusion of the values |
values | array[string] | [] | What is checked |
size | number | 12 | How many options to return |
includeZeroes | boolean | false | If true, it will include options with 0 matching documents (aka min_doc_count: 0 ) |
optionsFilter | string | '' | Filters the options further, e.g. a find box above a checkbox list |
sort | term /count | count | Sort results alphabetically or by count of matching records |
Output
{
cardinality: Number,
options: [{
name: String,
count: Number
}]
}
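To make the input table concrete, here is a sketch of a facet node with the documented defaults spelled out. withFacetDefaults is a hypothetical helper for illustration only. The provider applies its own defaults, and the key and field values are made up.

```javascript
// Defaults copied from the facet input table above
let facetDefaults = () => ({
  mode: 'include',
  values: [],
  size: 12,
  includeZeroes: false,
  optionsFilter: '',
  sort: 'count',
})

// Hypothetical helper: fill in documented defaults and require a field
let withFacetDefaults = (node) => {
  if (!node.field) throw new Error('facet nodes require a field')
  return { type: 'facet', ...facetDefaults(), ...node }
}

// Exclude Florida, keeping every other option at its default
let stateFacet = withFacetDefaults({
  key: 'state',
  field: 'CompanyState',
  mode: 'exclude',
  values: ['FL'],
})
```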
geo
Represents a geographic radius search. Requires geocoding on the client before passing up.
Input
Name | Type | Default | Description |
--- | --- | --- | --- |
field | string | None, required | The field it's operating on |
latitude | number/string | None, required | Latitude |
longitude | number/string | None, required | Longitude |
radius | number | None, required | Radius in miles |
operator | within /not within | within | Whether the filter forces inclusion or exclusion |
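Since radius is expressed in miles, callers working in kilometers need to convert before building the node. A small sketch, assuming the coordinates were already geocoded; toGeoNode, the location field, and the sample coordinates are hypothetical.

```javascript
const KM_PER_MILE = 1.609344

// Hypothetical helper: build a geo node from already geocoded coordinates
let toGeoNode = ({ field, latitude, longitude, radiusKm, operator = 'within' }) => ({
  type: 'geo',
  field,
  latitude,
  longitude,
  radius: radiusKm / KM_PER_MILE, // the geo type expects miles
  operator,
})

let nearbyNode = toGeoNode({
  field: 'location',
  latitude: 25.7617,
  longitude: -80.1918,
  radiusKm: 16.09344,
})
```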
dateRangeFacet
dateRangeFacet is like a facet, but the options correspond to named date range buckets.
Input
Name | Type | Default | Description |
--- | --- | --- | --- |
field | string | None, required | The field it's operating on |
ranges | array[{ range: NamedDateRange, key: string}] | None, required | Each range should have a 'range' prop containing the range phrase (e.g. 'allFutureDates') and a key to represent the value |
values | array[string] | [] | What is checked |
timezone | string | 'UTC' | What timezone to use |
Output
{
options: [
{
name: String,
count: Number,
},
]
}
Filter Only Types
Filter only types just filter and nothing more. They don't have contextual results of their own.
bool
Bool represents a boolean check, e.g. a checkbox for true/false.
{
field: String,
value: String|Boolean
}
date
Date represents a date range filter, with support for datemath.
{
field: String,
range: String,
from: DateString,
to: DateString,
isDateTime: Boolean
}
exists
Exists represents whether or not a field is present on results
{
field: String,
value: Boolean
}
number
Number represents a number range with inclusive bounds. This type provides the ability to determine the best range values based on percentile interval and range threshold.
Some Notes:
- An empty value as the upper boundary represents infinity.
- An empty value as the lower boundary represents negative infinity.
- Zero has to be respected as a boundary value.
- If findBestRange is true it will return the best min and max range.
Request:
{
field: String,
min: Number,
max: Number,
percentileInterval: Number,
rangeThreshold: Number,
findBestRange: Boolean
}
Response:
{
statistical: {
count: Number,
min: Number,
max: Number,
avg: Number,
sum: Number
},
percentiles: {
rangeMin: Number,
rangeMax: Number,
intervalMin: Number,
intervalMax: Number
},
bestRange: {
min: Number,
max: Number
}
}
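The boundary notes for the number type can be illustrated with a small client-side sketch. This is not the provider's implementation. The helper names are hypothetical, and treating null, undefined, and the empty string as "empty" is an assumption.

```javascript
// An empty boundary means "unbounded" on that side. Zero is a real boundary,
// so check for empty values explicitly instead of relying on falsiness.
let isEmpty = (value) => value === undefined || value === null || value === ''

let toBounds = ({ min, max }) => ({
  min: isEmpty(min) ? -Infinity : Number(min),
  max: isEmpty(max) ? Infinity : Number(max),
})

// Bounds are inclusive on both ends
let inRange = (value, node) => {
  let { min, max } = toBounds(node)
  return value >= min && value <= max
}
```

A check like !min would treat a lower bound of 0 as missing, which is the mistake the "Zero has to be respected" note warns about.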
query
Query represents a raw elasticsearch query_string.
{
field: String,
query: String,
exact: Boolean
}
tagsQuery
Tags represents one or more search terms.
{
field: String,
tags: [
{
word: String,
misspellings: Boolean (optional),
isPhrase: Boolean (optional),
distance: Number (optional)
}
],
join: 'any|all|none',
exact: Boolean
}
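As an illustration of the tags structure, the sketch below turns a search box string into tags, treating double-quoted text as a phrase. toTags, the title field, and the quoting convention are assumptions for the example, not behavior of the tagsQuery type itself.

```javascript
// Hypothetical helper: split input into tags, marking "quoted text" as a phrase
let toTags = (input) =>
  [...input.matchAll(/"([^"]+)"|(\S+)/g)].map(([, phrase, word]) =>
    phrase ? { word: phrase, isPhrase: true } : { word }
  )

let keywordNode = {
  key: 'keywords',
  type: 'tagsQuery',
  field: 'title',
  tags: toTags('"solar panel" inverter'),
  join: 'all',
  exact: false,
}
```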
text
Text implements raw text analysis like starts with, ends with, etc. These are generally regex queries.
Result-Only Types
These types don't do any filtering of their own and only have results. These often power charts or analytics pages.
results
Search result "hits", with support for highlighting, paging, sorting, etc.
xGroupStats
We have a few new nodes of the form xGroupStats, where x is a grouping (bucketing) type. Documentation here is still deeply WIP.
They all share a similar API:
Name | Type | Default | Description |
--- | --- | --- | --- |
groupField | string | None, required | The field to group by |
statsField | string | None | The field to calculate stats for |
stats | [string] | ['min', 'max', 'avg', 'sum'] | Which stats to include, can be avg, min, max, sum, or any of the other metrics supported by elasticsearch. |
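Here's a kitchen sink example, with sections for the various types along with explanations for the more mongo focused developer:
let example = {
  // Each section below is a separate node config. They are shown in one
  // object only for illustration; the repeated keys would overwrite each other.

  // fieldValuesGroupStats: group by the values of a field
  type: 'fieldValuesGroupStats',
  size: 10,
  filter: 'asdf',
  sort: {
    field: 'sum|min|max|avg|count|term',
    order: 'asc|desc',
  },
  // numberIntervalGroupStats: group by fixed numeric intervals
  type: 'numberIntervalGroupStats',
  groupField: 'price',
  interval: 500,
  // dateIntervalGroupStats: group by date intervals
  type: 'dateIntervalGroupStats',
  interval: 'year',
  // numberRangesGroupStats: group by explicit numeric ranges
  type: 'numberRangesGroupStats',
  groupField: 'price',
  ranges: [
    { from: 0, to: 500 },
    { from: 501, to: 1000 },
  ],
  // dateRangesGroupStats: group by explicit date ranges
  type: 'dateRangesGroupStats',
  ranges: [{ from, to }],
  // fieldValuePartitionGroupStats: split into matching and non-matching groups
  type: 'fieldValuePartitionGroupStats',
  groupField: 'CompanyState',
  matches: 'FL',
  // percentilesGroupStats
  type: 'percentilesGroupStats',
  statsField: 'awardAmount',
  stats: ['count|min|max|sum|avg|cardinality'],
}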
All of these types share a similar output structure. Results are on a context property called results, with stat aggs flattened on as properties of each result (bucket).
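As a rough sketch of reading that output, the example below assumes a processed node whose context.results is an array of buckets, each with a key property plus the requested stats. The key property name, the sample numbers, and the toSeries helper are assumptions for illustration.

```javascript
// Hypothetical processed node with stats flattened onto each bucket
let processedNode = {
  key: 'awardsByState',
  type: 'fieldValuesGroupStats',
  groupField: 'CompanyState',
  statsField: 'awardAmount',
  stats: ['sum', 'avg'],
  context: {
    results: [
      { key: 'FL', sum: 1500, avg: 500 },
      { key: 'NV', sum: 800, avg: 400 },
    ],
  },
}

// Pick one flattened stat off each bucket, e.g. to feed a bar chart
let toSeries = (node, stat) =>
  node.context.results.map((bucket) => ({
    label: bucket.key,
    value: bucket[stat],
  }))
```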
pivot
Supports nested groupings of xGroupStats
Name | Type | Default | Description |
--- | --- | --- | --- |
columns, rows | object[] | None | Column and row groupings to apply. Each includes field, and type can be any of the xGroupStats types. groupCounts adds a subgroup counter to the parent level. skip skips the grouping level in the aggregation query |
values | object[] | None | Metric values to compute, includes field , type , and potentially type specific fields. Type can be avg, min, max, sum, or any of the other metrics supported by elasticsearch. skip skips the value in the aggregation query |
drilldown | string[] | None | Drills down results where each entry of the array corresponds to a key from a grouping to allow progressive "drilldown"/"zooming" of groups. If a drilldown is specified, it will exclude nested groups > 1 deeper (e.g. ['a'] will filter the first group to a and expand the second, [] will only include the root group). Passing a falsey value will include all groups |
sort | object[] | None | { columnValues, valueIndex, valueProp, direction } |
sort.columnValues | string/number[] | None | Values of columns to sort by, e.g. [2017, 'Q1']. If null, it will sort by the root level values. |
sort.valueIndex | number | None | The index of the value from node.values to sort by. If null, the behavior will be to use doc_count |
sort.valueProp | string | None | For multivalue values like stats , pick a subfield to sort by. In most cases, you'd use the relevant single value metric instead. |
sort.direction | asc/desc | desc | Sort ascending or descending |
filters | [{rows: [], columns: []}] | None | Allows applying values as filters. Rows and columns work similarly to drilldown - you pass up the keys in order to filter, e.g. [{ rows: ['Nevada', 'Reno'], columns: ['2017'] }] |
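A sketch of a pivot node using the options from the table above, together with a helper for progressive drilldown. The awardDate field, the node key, and the drillInto helper are hypothetical. The option names come from the table.

```javascript
let pivotNode = {
  key: 'awardsByStateAndYear',
  type: 'pivot',
  rows: [{ type: 'fieldValuesGroupStats', field: 'CompanyState' }],
  columns: [
    { type: 'dateIntervalGroupStats', field: 'awardDate', interval: 'year' },
  ],
  values: [{ type: 'sum', field: 'awardAmount' }],
  drilldown: [], // an empty array includes only the root group
}

// Hypothetical helper: return a copy drilled one level deeper into a group key
let drillInto = (node, groupKey) => ({
  ...node,
  drilldown: [...(node.drilldown || []), groupKey],
})

let floridaOnly = drillInto(pivotNode, 'FL')
```

Returning a copy keeps the original node untouched, so a UI can step back up a level by reusing the previous node.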
Deprecated
cardinality
Use stats with stats: ['cardinality'] instead.
A cardinality aggregation. Returns the cardinality of a field.
Input
Name | Type | Default | Description |
--- | --- | --- | --- |
field | string | None, required | The field it's operating on |
Output
{
cardinality: {
cardinality: {
field: String,
},
},
}
dateHistogram
Use dateIntervalGroupStats instead.
A nested stats aggregation inside a dateHistogram aggregation.
groupedMetric
Use ?????? instead.
A more general version of esTwoLevelAggregation, used in analysis builders/pivot tables. It takes config for an array of buckets and a metric agg. The buckets are nested with the metric on the inside.
matchStats
Use fieldValuePartitionGroupStats instead.
A filters bucket which puts results into a pass and fail bucket, along with a stats metric nested inside.
rangeStats
Use numberRangesGroupStats instead.
A stats aggregation in a range aggregation.
smartIntervalHistogram
Use numberIntervalGroupStats instead.
A stats aggregation inside a histogram aggregation - divided into intelligent chunks based on the min and max and snapping to clean "smart" business friendly intervals (roughly 25% of powers of 10).
statistical
Use stats instead.
A stats aggregation.
terms_stats
Use fieldValuesGroupStats instead.
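The replacements listed in this section can be collected into a lookup, for example to flag deprecated nodes in an existing tree. This is a sketch only. findDeprecatedNodes is a hypothetical helper, it assumes nodes nest under a children array, and it only names the replacement type, since node options do not necessarily map one to one.

```javascript
// Deprecated type -> suggested replacement, as listed above.
// groupedMetric has no replacement named in this document.
let replacements = new Map([
  ['cardinality', 'stats'],
  ['dateHistogram', 'dateIntervalGroupStats'],
  ['groupedMetric', undefined],
  ['matchStats', 'fieldValuePartitionGroupStats'],
  ['rangeStats', 'numberRangesGroupStats'],
  ['smartIntervalHistogram', 'numberIntervalGroupStats'],
  ['statistical', 'stats'],
  ['terms_stats', 'fieldValuesGroupStats'],
])

// Hypothetical helper: walk a tree and collect nodes using deprecated types
let findDeprecatedNodes = (node, found = []) => {
  if (replacements.has(node.type))
    found.push({
      key: node.key,
      type: node.type,
      replacement: replacements.get(node.type),
    })
  for (let child of node.children || []) findDeprecatedNodes(child, found)
  return found
}
```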