Stretchy is a query builder for Elasticsearch. It helps you quickly construct the JSON to send to Elastic, which can get rather complicated.
Stretchy is modeled after ActiveRecord's interface and architecture - query objects are immutable and chainable, which makes quickly building the right query and caching the results easy. The goals are:
Stretchy is not:
Add this line to your application's Gemfile:
gem 'stretchy'
And then execute:
$ bundle
Or install it yourself as:
$ gem install stretchy
Stretchy is still in early development, so it does not yet support the full feature set of the Elasticsearch API. There may be bugs, though we try for solid spec coverage. We may introduce breaking changes in minor versions, though we try to stick with semantic versioning.
It does support fairly basic queries in an ActiveRecord-ish style.
See the Stretchy docs on rubydocs for fairly detailed documentation on the API. Specifically, you'll probably want the docs for the API class, which exposes the public methods for building queries.
Stretchy.client = Elasticsearch::Client.new
# returns a Stretchy::API object
api = Stretchy.query(index: 'myapp_development')
From here, you can chain the methods to build your desired query.
From here, you can chain the following query methods:
Stretchy is most useful for building composable elements of a function_score query, along with the boolean logic for the queries, filters, and boost functions within it. Whenever you call one of the API's context methods, you change the state in which subsequent filters or queries are applied.
# filters out documents matching the term filter
api = api.filter.not(term: {my_field: 'my_val'})
# constructs a bool: query with the regexp query in the should: clause
api = api.should.query(regexp: {my_field: 'aw*ome'})
# constructs a function_score query, with a boost function (weight 5)
# that boosts the score of documents matching the regexp query
# (a filter of type query:)
api = api.boost.query(regexp: {my_field: 'inter*on'}, weight: 5)
As soon as you pass parameters to one of the methods, however, the context resets. This allows setting the context by multiple method chains, then adding a query, filter, or boost function with the context applied.
api = api.filter(term: {my_field: 'my_val'}).query(match: {_all: 'hello'})
{
filtered: {
query: {match: {_all: 'hello'}},
filter: {term: {my_field: 'my_val'}}
}
}
api = api.should.query(match: {_all: 'hello'}).query(match: {_all: 'goodbye'})
{
bool: {
must: {match: {_all: 'goodbye'}},
should: {match: {_all: 'hello'}}
}
}
api = api.should.not.query(match: {_all: 'hello'})
.should.query(match: {_all: 'goodbye'})
{
bool: {
must: {},
must_not: {},
should: {
bool: {
must: {match: {_all: 'goodbye'}},
must_not: {match: {_all: 'hello'}}
}
}
}
}
Furthermore, API objects are immutable. Each chain method produces a new API object, so you never have to worry about mutation or cache busting. This is why each example has the form api = api.method.calls: you need to store the new query object to get the results you expect.
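The interplay of context methods and immutability can be illustrated with a minimal plain-Ruby sketch. This is not Stretchy's source code: the SketchQuery class and its clauses and context attributes are hypothetical names chosen for illustration. It only shows the pattern in which every chained call returns a new frozen object and the pending context resets once parameters are passed.

```ruby
# Hypothetical illustration only; this is not Stretchy's implementation.
class SketchQuery
  attr_reader :clauses, :context

  def initialize(clauses: [], context: [])
    @clauses = clauses.freeze
    @context = context.freeze
    freeze # the object itself can no longer be mutated
  end

  # Context methods return a new object with the context extended.
  def should
    self.class.new(clauses: clauses, context: context + [:should])
  end

  def not
    self.class.new(clauses: clauses, context: context + [:not])
  end

  # Passing parameters records the clause together with the pending
  # context, then resets the context on the returned object.
  def query(params)
    clause = { context: context, params: params }.freeze
    self.class.new(clauses: clauses + [clause], context: [])
  end
end

base    = SketchQuery.new
hello   = base.should.not.query(match: { _all: 'hello' })
goodbye = hello.query(match: { _all: 'goodbye' })
```

Because base is never modified, it can be reused as the starting point for any number of unrelated queries.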
api = api.match.query(
multi_match: {
query: 'super smash bros',
fields: ['developer.games', 'developer.bio']
}
)
api = api.match.not.match.query(
multi_match: {
query: 'rez',
fields: ['developer.games', 'developer.bio']
}
)
Adds arbitrary JSON as a query. If you want to use a query type not currently supported by Stretchy, you can call this method and pass in the requisite JSON fragment. You can also prefix this with the context methods so the query lands in the right place when it is sent to Elastic.
api = api.filter(
geo_polygon: {
'person.location' => {
points: [
{lat: 40, lon: -70},
{lat: 30, lon: -80},
{lat: 20, lon: -90}
]
}
}
)
Adds arbitrary JSON as a filter. If you want to use a filter type not currently supported by Stretchy, you can call this method and pass in the requisite JSON fragment. You can also prefix this with context methods.
api = api.where.not(rating: 0)
.not.match('angry')
If called after a .where or .match, .not will act like that method, but will invert the specified queries or filters. If called before some other method such as .query(), it will invert the resulting object.
api = api.match.should(name: 'Ada')
.should.not.query(regexp: {title: 'boring'})
If called after a .where or .match, .should will act like that method, but will combine other queries or filters into a bool: type, and place the resulting query or filter objects in the should: clause. If called before any query or filter method, it will take the results of that method and apply them in a should: clause.
Each should: clause inside a query boosts the relevance score.
A should: clause without a must: clause requires at least one of the should: statements to match on a document. In a bool: filter, this is really all they do.
See Elastic's documentation for BoolQuery and BoolFilter for more info.
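As a rough sketch of the JSON shape involved, the plain-Ruby helper below assembles a bool: hash and leaves out empty clauses. The bool_query helper is a hypothetical name used for illustration and is not part of Stretchy; the field values mirror the example above.

```ruby
require 'json'

# Hypothetical helper: assembles a bool query hash, omitting empty clauses.
def bool_query(must: [], must_not: [], should: [])
  clauses = { must: must, must_not: must_not, should: should }
  { bool: clauses.reject { |_name, list| list.empty? } }
end

# A should: clause holding a match query and an inverted regexp query.
query = bool_query(
  should: [
    { match: { name: 'Ada' } },
    bool_query(must_not: [{ regexp: { title: 'boring' } }])
  ]
)

puts JSON.pretty_generate(query)
```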
api = api.match('welcome to my web site')
.match(title: 'welcome to my web site')
Performs a match query for the given string. If given a hash, it will use a match query on the specified fields, otherwise it will default to '_all'. By default, a match query searches for any of the analyzed terms, and scores them using Lucene's practical scoring formula, which combines TF/IDF, the vector space model, and a few other niceties.
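The string-versus-hash behaviour described above could be sketched as follows. The match_queries helper is hypothetical and only illustrates the kind of match: fragments that would be produced; it is not how Stretchy is necessarily implemented.

```ruby
# Hypothetical helper: turns a string or a field => text hash into
# Elasticsearch match query fragments.
def match_queries(input)
  fields = input.is_a?(Hash) ? input : { '_all' => input }
  fields.map { |field, text| { match: { field.to_s => text } } }
end

all_fields = match_queries('welcome to my web site')
title_only = match_queries(title: 'welcome to my web site')
```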
api = api.fulltext('Generic user-input phrase')
Performs a query for the given string anywhere in the document. At least one of the terms must match, and the closer a document is to having the exact phrase, the higher its score. See the Elasticsearch guide's article on proximity scoring for more info on how this works.
api = api.more_like(ids: [1, 2, 3])
.more_like(docs: other_search.results)
.more_like(
like: 'puppies and kittens are great',
fields: ['about_me']
)
Finds documents similar to a list of input documents. You must pass in one of the :ids, :docs or :like_text parameters, but everything else is optional. This method accepts any of the params available in the Elasticsearch more_like_this query.
api = api.where(
name: 'alice',
email: [
'alice@company.com',
'beatrice.christine@other_company.com'
],
commit_count: 27..33,
is_robot: nil
)
Allows passing a hash of matchable options similar to ActiveRecord's where method. To be matched, the document must match each of the parameters. If you pass an array of parameters for a field, the document must match at least one of those parameters.
If you pass a string or symbol for a field, it will be converted to a Term Filter for the specified field. Since Elastic analyzes terms by default, what is stored in the Elasticsearch index may not exactly match the specified terms.
To use a match: query as a filter instead of a terms: filter, use the context methods:
api = api.filter.match(name: 'Alice', email: 'alice@company.com')
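One way to picture how a .where hash might map onto filters is the plain-Ruby sketch below. The where_filters and range_bounds helpers are hypothetical, and two of the mappings are assumptions rather than documented behaviour: that nil becomes a missing: filter and that a Ruby Range becomes a range: filter.

```ruby
# Hypothetical helper: converts a Ruby Range into range filter bounds.
def range_bounds(range)
  upper_key = range.exclude_end? ? :lt : :lte
  { gte: range.begin, upper_key => range.end }.compact
end

# Hypothetical helper: maps each field/value pair to a filter fragment.
def where_filters(conditions)
  conditions.map do |field, value|
    name = field.to_s
    case value
    when nil   then { missing: { field: name } }              # assumption
    when Range then { range: { name => range_bounds(value) } } # assumption
    when Array then { terms: { name => value } }               # any may match
    else            { term: { name => value } }
    end
  end
end

filters = where_filters(
  name: 'alice',
  email: ['alice@company.com', 'beatrice.christine@other_company.com'],
  commit_count: 27..33,
  is_robot: nil
)
```

Every fragment in the resulting list would have to match, which corresponds to placing them all in the must: clause of a bool: filter.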
api = api.range(rating: {gte: 3, lte: 5})
.range(released: {gte: Time.now - 60*60*24*100})
.range(quantity: {lt: 100})
.range(awesomeness: {gt: 89, lte: 100})
Only documents with the specified field, and within the specified range match. You can also pass in dates and times as ranges. While you could pass a normal ruby Range object to .where, this allows you to specify only a minimum or only a maximum.
api = api.geo_distance(
field: 'coords',
distance: '20mi',
origin: [135.7683, 35.0117]
)
Filters for documents where the specified geo_point field is within the given range of the origin point.
The field must be mapped as a geo_point field. See Elasticsearch types for more info.
The origin: point should be specified in one of the following formats:
'35,135' # string: lat,lon - no space
'drm3btev3e86' # geohash as a string
[135, 35] # array: [lon, lat] - two elements, longitude first
{lat: 35, lon: 135} # hash with lat: and lon: keys
Note that the lat/lon order is reversed for the array format to comply with GeoJSON. The hash uses the lon key, not lng. For more information about geohashes, see Elastic's documentation.
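The ordering differences between these formats are easy to get wrong, so here is a small plain-Ruby sketch that normalizes them to a lat/lon hash. The normalize_origin helper is hypothetical and not part of Stretchy; geohash strings are passed through untouched since decoding them is out of scope here.

```ruby
# Hypothetical helper: normalizes the accepted origin formats to a
# {lat:, lon:} hash. Anything unrecognized (e.g. a geohash) is returned as-is.
def normalize_origin(origin)
  case origin
  when Hash
    { lat: Float(origin.fetch(:lat)), lon: Float(origin.fetch(:lon)) }
  when Array
    lon, lat = origin # array format is [lon, lat], as in GeoJSON
    { lat: Float(lat), lon: Float(lon) }
  when /\A-?\d+(\.\d+)?,-?\d+(\.\d+)?\z/
    lat, lon = origin.split(',') # string format is "lat,lon"
    { lat: Float(lat), lon: Float(lon) }
  else
    origin
  end
end

from_string = normalize_origin('35,135')
from_array  = normalize_origin([135, 35])
from_hash   = normalize_origin({ lat: 35, lon: 135 })
geohash     = normalize_origin('drm3btev3e86')
```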
api = api.boost.where(category: 3, weight: 100)
.boost.range(:awesomeness, min: 10, weight: 10)
.boost.match.not('sucks')
Boosts use a Function Score Query with filters to allow you to affect the score for the document. Each condition will be applied as a filter with an optional weight.
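For orientation, a function_score query with weighted filters has roughly the shape produced by the sketch below. The function_score helper and its :filter and :weight keys are hypothetical conveniences for illustration; only the resulting hash layout follows the Elasticsearch function_score format.

```ruby
# Hypothetical helper: wraps a base query in a function_score query where
# each boost is a filter with a weight (defaulting to 1).
def function_score(query, boosts)
  functions = boosts.map do |boost|
    { filter: boost.fetch(:filter), weight: boost.fetch(:weight, 1) }
  end
  { function_score: { query: query, functions: functions } }
end

scored = function_score(
  { match_all: {} },
  [
    { filter: { term: { category: 3 } }, weight: 100 },
    { filter: { range: { awesomeness: { gte: 10 } } }, weight: 10 }
  ]
)
```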
api = api.boost.near(
field: :published_at,
origin: Time.now,
scale: '5d',
decay_function: :linear
)
api = api.boost.near(
field: :coords,
origin: [135.7683, 35.0117],
scale: '10mi',
decay: 0.33,
weight: 1000,
decay_function: :gauss
)
Boosts a document by how close a given field is to a given :origin. Accepts dates, times, numbers, and geographical points. Unlike .where.range or .boost.geo, .boost.near is not a binary operation. All documents get a score for that field, which decays the further the field's value is from the origin point.
The :scale param determines how quickly the value falls off. In the example above, if a document's :coords field is 10 miles away from the starting point, its score is about 1/3 that of a document at the origin point.
See the Function Score Query section on Decay Functions for more info.
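The relationship between :scale and :decay can be checked numerically. The sketch below reimplements the three decay curves in plain Ruby following the formulas given in the Elasticsearch function_score documentation; the decay_score helper itself is hypothetical and distances are treated as plain numbers rather than dates or geo points.

```ruby
# Hypothetical helper: computes the decay multiplier for a document whose
# field value lies `distance` away from the origin.
def decay_score(distance, scale:, decay: 0.5, offset: 0.0, function: :gauss)
  d = [0.0, distance.abs - offset].max
  case function
  when :gauss
    sigma_sq = -(scale**2) / (2.0 * Math.log(decay))
    Math.exp(-(d**2) / (2.0 * sigma_sq))
  when :exp
    Math.exp(Math.log(decay) / scale * d)
  when :linear
    s = scale / (1.0 - decay)
    [(s - d) / s, 0.0].max
  else
    raise ArgumentError, "unknown decay function: #{function}"
  end
end

at_origin = decay_score(0, scale: 10, decay: 0.33)
at_scale  = decay_score(10, scale: 10, decay: 0.33)
far_away  = decay_score(100, scale: 10, function: :linear)
```

For all three curves, a document exactly one scale away from the origin receives a multiplier equal to the decay value, which is why a decay of 0.33 gives roughly a third of the score at the scale distance.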
api = api.boost.field_value(field: :popularity)
.boost.field_value(field: :timestamp, factor: 0.5, modifier: :sqrt)
.boost.field_value(field: :votes, weight: 100)
Boosts a document by a numeric value contained in the specified fields. You can also specify a factor (an amount to multiply the field value by) and a modifier (a function for normalizing values).
See the Boosting By Popularity Guide and the Field Value Factor documentation for more info.
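To make the roles of factor and modifier concrete, the following plain-Ruby sketch computes the value that a field_value_factor function would contribute, assuming (as the Elasticsearch documentation describes) that the factor is applied before the modifier. The MODIFIERS table covers only a subset of the available modifiers, and field_value_score is a hypothetical name.

```ruby
# A subset of the Elasticsearch modifiers, expressed as Ruby lambdas.
MODIFIERS = {
  none:       ->(v) { v },
  log1p:      ->(v) { Math.log10(v + 1) },
  ln1p:       ->(v) { Math.log(v + 1) },
  sqrt:       ->(v) { Math.sqrt(v) },
  square:     ->(v) { v**2 },
  reciprocal: ->(v) { 1.0 / v }
}.freeze

# Hypothetical helper: multiplies the field value by the factor, then
# applies the modifier to the product.
def field_value_score(value, factor: 1, modifier: :none)
  MODIFIERS.fetch(modifier).call(factor * value)
end

plain    = field_value_score(9)
softened = field_value_score(8, factor: 0.5, modifier: :sqrt)
```

Modifiers such as sqrt or log1p keep a handful of extremely popular documents from dominating the results.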
api = api.boost.random(user.id)
.boost.random(seed: user.id, weight: 100)
Gives each document a randomized boost with a given seed and optional weight. This allows you to show slightly different result sets to different users, but show the same result set to that user every time.
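The idea of a per-user but repeatable ordering can be illustrated without Elasticsearch. The sketch below derives a pseudo-random score from a seed and a document id using a digest; this is only an illustration of the concept, not the algorithm Elasticsearch uses, and seeded_score is a hypothetical name.

```ruby
require 'digest'

# Hypothetical helper: maps (seed, doc_id) to a repeatable value in [0, 1).
def seeded_score(seed, doc_id)
  digest = Digest::MD5.hexdigest("#{seed}:#{doc_id}")
  digest[0, 8].to_i(16) / 2.0**32
end

doc_ids = [1, 2, 3, 4, 5]

# The same seed always yields the same ordering.
first_visit  = doc_ids.sort_by { |id| seeded_score(42, id) }
second_visit = doc_ids.sort_by { |id| seeded_score(42, id) }
```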
api = api.fields(:name, :email, :id)
Instead of returning the entire document, only return the specified fields.
query = query.limit(20).offset(1000)
# or...
query = query.page(50, per_page: 20)
Works the same way as ActiveRecord's limit and offset methods - analogous to Elasticsearch's from and size parameters. The .page method allows you to set both at once, and is compatible with the Kaminari gem.
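The arithmetic that links page numbers to from and size is sketched below. It assumes 1-based page numbers, as Kaminari uses; whether Stretchy's .page computes the offset in exactly this way is not confirmed here, and both helper names are hypothetical.

```ruby
# Hypothetical helper: converts a 1-based page number into the from/size
# parameters Elasticsearch expects.
def page_to_from_size(page, per_page:)
  page = [page.to_i, 1].max # treat anything below 1 as the first page
  { from: (page - 1) * per_page, size: per_page }
end

# Hypothetical helper: number of pages needed to show `total` hits.
def total_pages(total, per_page:)
  (total.to_f / per_page).ceil
end

first_page = page_to_from_size(1, per_page: 20)
third_page = page_to_from_size(3, per_page: 20)
pages      = total_pages(101, per_page: 20)
```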
query = query.explain.where()
Tells Elasticsearch to return an explanation of the score for each document. See the explain parameter for how this is used, and the explain API for what the explanations will look like.
.limit_value for Kaminari compatibility
query.results
Executes the query and provides the parsed json for each hit returned by Elasticsearch, along with _index, _type, _id, and _score fields.
query.ids
Provides only the ids for each hit. If your document ids are numeric (as is the case for many ActiveRecord integrations), they will be converted to integers.
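A rough picture of what these two accessors do to a raw search response is sketched below in plain Ruby. The helper names and the sample response are made up for illustration; only the hits.hits layout with _index, _type, _id, _score and _source keys follows the Elasticsearch search response format.

```ruby
# Hypothetical helper: merges a hit's source document with its metadata.
def hit_to_result(hit)
  (hit['_source'] || {}).merge(hit.slice('_index', '_type', '_id', '_score'))
end

# Hypothetical helper: converts purely numeric ids to integers.
def normalize_id(id)
  id.to_s.match?(/\A\d+\z/) ? id.to_i : id
end

# Made-up sample response, trimmed to the parts used here.
response = {
  'hits' => {
    'total' => 2,
    'hits' => [
      { '_index' => 'myapp_development', '_type' => 'user', '_id' => '27',
        '_score' => 1.5, '_source' => { 'name' => 'alice' } },
      { '_index' => 'myapp_development', '_type' => 'user', '_id' => 'abc-1',
        '_score' => 0.7, '_source' => { 'name' => 'beatrice' } }
    ]
  }
}

hits    = response['hits']['hits']
results = hits.map { |hit| hit_to_result(hit) }
ids     = hits.map { |hit| normalize_id(hit['_id']) }
```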
query.response
Executes the query, returns the raw JSON response from Elasticsearch and caches it. Use this to get at search API data not in the source documents.
query.total
Returns the total number of matches returned by the query - not just the current page. Makes plugging into Kaminari a snap.
query.explanations
Collects the '_explanation' field for each result, so you can easily see how the document scores were computed.
results = query.query_results
results.per_page
results.limit_value
results.total_pages
Included in the Results object for Kaminari compatibility.
After checking out the repo, run bundle install to install dependencies. Then, run pry for an interactive prompt that will allow you to experiment. Run specs with rspec.
For bugs and feature requests, please open a new issue.
Please see the CONTRIBUTING guide for guidelines on contributing to Stretchy.