pyelasticsearch

Flexible, high-scale API to elasticsearch

1.4.1

PyPI

Maintainers: 3

===============
pyelasticsearch
===============

.. image:: https://travis-ci.org/pyelasticsearch/pyelasticsearch.png
   :alt: Build Status
   :align: right
   :target: https://travis-ci.org/pyelasticsearch/pyelasticsearch

pyelasticsearch is a clean, future-proof, high-scale API to elasticsearch. It provides...

Transparent conversion of Python data types to and from JSON, including datetimes and the arbitrary-precision Decimal type
Translation of HTTP failure status codes into exceptions
Connection pooling
HTTP basic auth and HTTPS support
Load balancing across nodes in a cluster
Failed-node marking to avoid downed nodes for a period
Optional automatic retrying of failed requests
Thread safety
Loosely coupled design, letting you customize things like JSON encoding and bulk indexing
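
As a rough illustration of the first item in the list above, here is a stdlib-only sketch of turning datetimes, Decimals, and sets into JSON. ``SketchEncoder`` and ``encode`` are hypothetical names, and rendering a Decimal as a string is a simplification chosen for this sketch; it is not necessarily how pyelasticsearch encodes these types internally.

```python
import datetime
import json
from decimal import Decimal


class SketchEncoder(json.JSONEncoder):
    """Hypothetical encoder: teaches json about a few extra Python types."""

    def default(self, value):
        # default() is only called for objects json cannot encode itself.
        if isinstance(value, (datetime.datetime, datetime.date)):
            return value.isoformat()
        if isinstance(value, Decimal):
            # Assumption for this sketch: keep full precision as a string.
            return str(value)
        if isinstance(value, (set, frozenset)):
            return sorted(value)
        return super().default(value)


def encode(doc):
    """Serialize a document dict with the sketch encoder."""
    return json.dumps(doc, cls=SketchEncoder, sort_keys=True)


encoded = encode({
    'joined': datetime.datetime(2015, 1, 23, 12, 30),
    'balance': Decimal('10.25'),
    'tags': {'b', 'a'},
})
print(encoded)
```

Anything the encoder does not recognize still falls through to the base class, which raises ``TypeError``, so unsupported types fail loudly rather than being silently mangled.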

For more on our philosophy and history, see `Comparison with elasticsearch-py, the “Official Client” <https://pyelasticsearch.readthedocs.org/en/latest/elasticsearch-py/>`_.

A Taste of the API

Make a pooling, balancing, all-singing, all-dancing connection object::

    >>> from pyelasticsearch import ElasticSearch
    >>> es = ElasticSearch('http://localhost:9200/')

Index a document::

    >>> es.index('contacts',
    ...          'person',
    ...          {'name': 'Joe Tester', 'age': 25, 'title': 'QA Master'},
    ...          id=1)
    {u'_type': u'person', u'_id': u'1', u'ok': True, u'_version': 1, u'_index': u'contacts'}

Index a couple more documents, this time in a single request using the bulk-indexing API::

    >>> docs = [{'id': 2, 'name': 'Jessica Coder', 'age': 32, 'title': 'Programmer'},
    ...         {'id': 3, 'name': 'Freddy Tester', 'age': 29, 'title': 'Office Assistant'}]
    >>> es.bulk((es.index_op(doc, id=doc.pop('id')) for doc in docs),
    ...         index='contacts',
    ...         doc_type='person')

If we had many documents and wanted to chunk them for performance, `bulk_chunks() <https://pyelasticsearch.readthedocs.org/en/latest/api/#pyelasticsearch.bulk_chunks>`_ would easily rise to the task, dividing either at a certain number of documents per batch or, for curated platforms like Google App Engine, at a certain number of bytes. Thanks to the decoupled design, you can even substitute your own batching function if you have unusual needs. Bulk indexing is the most demanding ES task in most applications, so we provide very thorough tools for representing operations, optimizing wire traffic, and dealing with errors. See `bulk() <https://pyelasticsearch.readthedocs.org/en/latest/api/#pyelasticsearch.ElasticSearch.bulk>`_ for more.
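
Since the decoupled design lets you substitute your own batching function, the following is a minimal, library-independent sketch of one. ``chunk_actions`` is a hypothetical name, its parameters merely echo the two limits described above, and the sketch assumes each action is a string followed by one newline on the wire. It is not the implementation of ``bulk_chunks()``.

```python
def chunk_actions(actions, docs_per_chunk=300, bytes_per_chunk=None):
    """Yield lists of actions bounded by count and, optionally, by bytes.

    A single action larger than bytes_per_chunk is still yielded, alone,
    because there is no way to split it further.
    """
    chunk = []
    chunk_bytes = 0
    for action in actions:
        # Assumption: one trailing newline per action in the request body.
        size = len(action.encode('utf-8')) + 1
        too_many = docs_per_chunk is not None and len(chunk) >= docs_per_chunk
        too_big = (bytes_per_chunk is not None and
                   chunk_bytes + size > bytes_per_chunk)
        if chunk and (too_many or too_big):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(action)
        chunk_bytes += size
    if chunk:
        yield chunk


# Five made-up actions, batched two at a time.
actions = ['{"index": {"_id": %d}}\n{"n": %d}' % (i, i) for i in range(5)]
chunks = list(chunk_actions(actions, docs_per_chunk=2))
print([len(chunk) for chunk in chunks])
```

Each yielded list could then presumably be handed to ``es.bulk()`` in place of the generator shown earlier, provided the actions were built with ``es.index_op()``.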

Refresh the index to pick up the latest::

    >>> es.refresh('contacts')
    {u'ok': True, u'_shards': {u'successful': 5, u'failed': 0, u'total': 10}}

Get just Jessica's document::

    >>> es.get('contacts', 'person', 2)
    {u'_id': u'2',
     u'_index': u'contacts',
     u'_source': {u'age': 32, u'name': u'Jessica Coder', u'title': u'Programmer'},
     u'_type': u'person',
     u'_version': 1,
     u'exists': True}

Perform a simple search::

    >>> es.search('name:joe OR name:freddy', index='contacts')
    {u'_shards': {u'failed': 0, u'successful': 42, u'total': 42},
     u'hits': {u'hits': [{u'_id': u'1',
                          u'_index': u'contacts',
                          u'_score': 0.028130024999999999,
                          u'_source': {u'age': 25,
                                       u'name': u'Joe Tester',
                                       u'title': u'QA Master'},
                          u'_type': u'person'},
                         {u'_id': u'3',
                          u'_index': u'contacts',
                          u'_score': 0.028130024999999999,
                          u'_source': {u'age': 29,
                                       u'name': u'Freddy Tester',
                                       u'title': u'Office Assistant'},
                          u'_type': u'person'}],
               u'max_score': 0.028130024999999999,
               u'total': 2},
     u'timed_out': False,
     u'took': 4}

Perform a search using the `elasticsearch query DSL`_:

.. _`elasticsearch query DSL`: http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

::

    >>> query = {
    ...     'query': {
    ...         'filtered': {
    ...             'query': {
    ...                 'query_string': {'query': 'name:tester'}
    ...             },
    ...             'filter': {
    ...                 'range': {
    ...                     'age': {
    ...                         'from': 27,
    ...                         'to': 37,
    ...                     },
    ...                 },
    ...             },
    ...         },
    ...     },
    ... }
    >>> es.search(query, index='contacts')
    {u'_shards': {u'failed': 0, u'successful': 42, u'total': 42},
     u'hits': {u'hits': [{u'_id': u'3',
                          u'_index': u'contacts',
                          u'_score': 0.19178301,
                          u'_source': {u'age': 29,
                                       u'name': u'Freddy Tester',
                                       u'title': u'Office Assistant'},
                          u'_type': u'person'}],
               u'max_score': 0.19178301,
               u'total': 1},
     u'timed_out': False,
     u'took': 2}

Delete the index::

    >>> es.delete_index('contacts')
    {u'acknowledged': True, u'ok': True}

For more, see the full `API Documentation <https://pyelasticsearch.readthedocs.org/en/latest/api/>`_.

Changelog

v1.4.1 (2018-04-02)

Recognize new "index already exists" spelling so we raise the right exceptions. Close #195.
Fix CI setup.
Drop Python 2.6 support.
Drop nose for testing.

v1.4

Add support for custom certificate authorities via the ca_certs arg to the ElasticSearch constructor.
Add support for client certificates via the client_cert arg.

v1.3

Add support for HTTPS.
Add username, password, and port kwargs to the constructor so you don't have to repeat their values if they're the same across many servers.

v1.2.4 (2015-05-21)

Don't crash when the query_params kwarg is omitted from calls to send_request().

v1.2.3 (2015-04-17)

Make delete_all_indexes() work.
Fix a bug in which specifying _all as an index name sometimes caused doctype names to be treated as index names.

v1.2.2 (2015-04-10)

Correct a typo in the bulk() docs.

v1.2.1 (2015-04-09)

Update ES doc links, now that Elastic has changed domains and reorganized its docs.
Require elasticsearch lib 1.3 or greater, as that's when it started exposing ConnectionTimeout.

v1.2 (2015-03-06)

Make sure the Content-Length header gets set when calling create_index() with no explicit settings arg. This solves 411s when using nginx as a proxy.
Add doc_as_upsert() arg to update().
Make bulk_chunks() compute perfectly optimal results, no longer ever exceeding the byte limit unless a single document is over the limit on its own.

v1.1 (2015-02-12)

Introduce new bulk API, supporting all types of bulk operations (index, update, create, and delete), providing chunking via bulk_chunks(), and introducing per-action error-handling. All errors raise exceptions--even individual failed operations--and the exceptions expose enough data to identify operations for retrying or reporting. The design is decoupled in case you want to create your own chunkers or operation builders.
Deprecate bulk_index() in favor of the more capable bulk().
Make one last update to bulk_index(). It now catches individual operation failures, raising BulkError. Also add the index_field and type_field args, allowing you to index across different indices and doc types within one request.
ElasticSearch object now defaults to http://localhost:9200/ if you don't provide any node URLs.
Improve docs: give a better overview on the front page, and document how to customize JSON encoding.
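
To picture the per-action error handling introduced in this release, here is a hedged sketch of separating failed operations from successful ones so the failures can be retried or reported. ``split_bulk_items`` is a hypothetical helper, and the response shape (an ``items`` list of one-key dicts carrying a ``status`` and an optional ``error``) is an assumption modeled on Elasticsearch's bulk response. The library itself reports such failures by raising exceptions rather than through a helper like this.

```python
def split_bulk_items(response):
    """Partition bulk response items into (succeeded, failed) lists."""
    succeeded, failed = [], []
    for item in response.get('items', []):
        # Each item is assumed to be a one-key dict: {op_type: details}.
        (op_type, details), = item.items()
        if 'error' in details or details.get('status', 200) >= 400:
            failed.append((op_type, details))
        else:
            succeeded.append((op_type, details))
    return succeeded, failed


# Made-up response: one operation succeeded, one conflicted.
sample = {
    'took': 3,
    'items': [
        {'index': {'_id': '1', 'status': 201}},
        {'create': {'_id': '2', 'status': 409,
                    'error': 'DocumentAlreadyExistsException'}},
    ],
}
succeeded, failed = split_bulk_items(sample)
retry_ids = [details['_id'] for _, details in failed]
print(retry_ids)
```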

v1.0 (2015-01-23)

Switch to elasticsearch-py's transport and downtime-pooling machinery, much of which was borrowed from us anyway.
Make bulk indexing (and likely other network things) 15 times faster.
Add a comparison with the official client to the docs.
Fix delete_by_query() to work with ES 1.0 and later.
Bring percolate() es_kwargs up to date.
Fix all tests that were failing on modern versions of ES.
Tolerate errors that are non-strings and create exceptions for them properly.

.. note::

Backward incompatible:

Drop compatibility with elasticsearch < 1.0.
Redo cluster_state() to work with ES 1.0 and later. Arguments have changed.
InvalidJsonResponseError no longer provides access to the HTTP response (in the response property): just the bad data (the input property).
Change from the logger "pyelasticsearch" to "elasticsearch.trace".
Remove revival_delay param from ElasticSearch object.
Remove encode_body param from send_request(). Now all dicts are JSON-encoded, and all strings are left alone.

v0.7.1 (2014-08-12)

Bring tests up to date with update_aliases() API change.

v0.7 (2014-08-12)

When an id_field is specified for bulk_index(), don't index it under its original name as well; use it only as the _id.
Rename aliases() to get_aliases() for consistency with other methods. Original name still works but is deprecated. Add an alias kwarg to the method so you can fetch specific aliases.

.. note::

Backward incompatible:

update_aliases() no longer requires a dict with an actions key; that much is implied. Just pass the value of that key.

v0.6.1 (2013-11-01)

Update package requirements to allow requests 2.0, which is in fact compatible. (Natim)
Properly raise IndexAlreadyExistsException even if the error is reported by a node other than the one to which the client is directly connected. (Jannis Leidel)

v0.6 (2013-07-23)

.. note::

Note the change in behavior of bulk_index() in this release. This change probably brings it more in line with your expectations. But double check, since it now overwrites existing docs in situations where it didn't before.

Also, we made a backward-incompatible spelling change to a little-used index() kwarg.

bulk_index() now overwrites any existing doc of the same ID and doctype. Before, in certain versions of ES (like 0.90RC2), it did nothing at all if a document already existed, probably much to your surprise. (We removed the 'op_type': 'create' pair, whose intentions were always mysterious.) (Gavin Carothers)
Rename the force_insert kwarg of index() to overwrite_existing. The old name implied the opposite of what it actually did. (Gavin Carothers)

v0.5 (2013-04-20)

Support multiple indices and doctypes in delete_by_query(). Accept both string and JSON queries in the query arg, just as search() does. Passing the q arg explicitly is now deprecated.
Add multi_get.
Add percolate. Thanks, Adam Georgiou and Joseph Rose!
Add ability to specify the parent document in bulk_index(). Thanks, Gavin Carothers!
Remove the internal, undocumented from_python method. django-haystack users will need to upgrade to a newer version that avoids using it.
Refactor JSON encoding machinery. Now it's clearer how to customize it: just plug your custom JSON encoder class into ElasticSearch.json_encoder.
Don't crash under python -OO.
Support non-ASCII URL path components (like Unicode document IDs) and query string param values.
Switch to the nose testrunner.

v0.4.1 (2013-03-25)

Fix a bug introduced in 0.4 wherein "None" was accidentally sent to ES when an ID wasn't passed to index().

v0.4 (2013-03-19)

Support Python 3.
Support more APIs:
- cluster_state
- get_settings
- update_aliases and aliases
- update (existed but didn't work before)
Support the size param of the search method. (You can now change es_size to size in your code if you like.)
Support the fields param on index and update methods, new since ES 0.20.
Maintain better precision of floats when passed to ES.
Change endpoint of bulk indexing so it works on ES < 0.18.
Support documents whose ID is 0.
URL-escape path components, so doc IDs containing funny chars work.
Add a dedicated IndexAlreadyExistsError exception for when you try to create an index that already exists. This helps you trap this situation unambiguously.
Add docs about upgrading from pyes.
Remove the undocumented and unused to_python method.
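
Two of the fixes above, supporting an ID of 0 and URL-escaping path components, come down to how a request path is assembled. The sketch below shows one way to do that with the standard library; ``join_path`` is a hypothetical helper, not the library's internal function.

```python
from urllib.parse import quote


def join_path(*components):
    """Build a URL path, escaping each component and skipping absent ones."""
    parts = [
        # safe='' also escapes '/', so an ID cannot add path segments.
        quote(str(component), safe='')
        for component in components
        # Compare against None and '' explicitly so that 0 is kept.
        if component is not None and component != ''
    ]
    return '/' + '/'.join(parts)


print(join_path('contacts', 'person', 0))
print(join_path('contacts', 'person', 'a/b c'))
print(join_path('contacts', 'person', None))
```

Filtering on ``is not None`` rather than on truthiness is what keeps a document ID of 0 from being dropped, and it likewise avoids sending the literal string "None" when no ID is given.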

v0.3 (2013-01-10)

Correct the requests requirement to require a version that has everything we need. In fact, require requests 1.x, which has a stable API.
Add update() method.
Make send_request method public so you can use ES APIs we don't yet explicitly support.
Handle JSON translation of Decimal class and sets.
Make more_like_this() take an arbitrary request body so you can filter the returned docs.
Replace the fields arg of more_like_this with mlt_fields. This makes it actually work, as it's the param name ES expects.
Make explicit our undeclared dependency on simplejson.

v0.2 (2012-10-06)

Many thanks to Erik Rose for almost completely rewriting the API to follow best practices, improve the API user experience, and make pyelasticsearch future-proof.

.. note::

This release is backward-incompatible in numerous ways; please read the following section carefully. If in doubt, you can easily stick with pyelasticsearch 0.1.

Backward-incompatible changes:

Simplify search() and count() calling conventions. Each now supports either a textual or a dict-based query as its first argument. There's no longer a need to, for example, pass an empty string as the first arg in order to use a JSON query (a common case).
Standardize on the singular for the names of the index and doc_type kwargs. It's not always obvious whether an ES API allows for multiple indexes. This was leading me to have to look aside to the docs to determine whether the kwarg was called index or indexes. Using the singular everywhere will result in fewer doc lookups, especially for the common case of a single index.
Rename morelikethis to more_like_this for consistency with other methods.
index() now takes (index, doc_type, doc) rather than (doc, index, doc_type), for consistency with bulk_index() and other methods.
Similarly, put_mapping() now takes (index, doc_type, mapping) rather than (doc_type, mapping, index).
To prevent callers from accidentally destroying large amounts of data...
- delete() no longer deletes all documents of a doctype when no ID is specified; use delete_all() instead.
- delete_index() no longer deletes all indexes when none are given; use delete_all_indexes() instead.
- update_settings() no longer updates the settings of all indexes when none are specified; use update_all_settings() instead.
setup_logging() is gone. If you want to configure logging, use the logging module's usual facilities. We still log to the "pyelasticsearch" named logger.
Rethink error handling:
- Raise a more specific exception for HTTP error codes so callers can catch it without examining a string.
- Catch non-JSON responses properly, and raise the more specific NonJsonResponseError instead of the generic ElasticSearchError.
- Remove mentions of nonexistent exception types that would cause crashes in their except clauses.
- Crash harder if JSON encoding fails: that always indicates a bug in pyelasticsearch.
- Remove the ill-defined ElasticSearchError.
- Raise ConnectionError rather than ElasticSearchError if we can't connect to a node (and we're out of auto-retries).
- Raise ValueError rather than ElasticSearchError if no documents are passed to bulk_index.
- All exceptions are now more introspectable, because they don't immediately mash all the context down into a string. For example, you can recover the unmolested response object from ElasticHttpError.
- Remove the quiet kwarg, meaning we always expose errors.
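
The core idea of the error-handling changes above, raising a specific and introspectable exception per HTTP status instead of a message string, can be sketched as follows. ``HttpError``, ``NotFoundError``, ``SPECIFIC_ERRORS``, and ``check_response`` are hypothetical names for illustration only; they are not the library's actual classes or signatures.

```python
class HttpError(Exception):
    """Hypothetical base class: keeps the status code and error payload."""

    def __init__(self, status_code, error):
        super().__init__(status_code, error)
        self.status_code = status_code
        self.error = error


class NotFoundError(HttpError):
    """Hypothetical subclass for 404 responses."""


# Status codes that get a more specific exception class in this sketch.
SPECIFIC_ERRORS = {404: NotFoundError}


def check_response(status_code, body):
    """Return body on success; raise an HttpError subclass on failure."""
    if status_code < 400:
        return body
    error = body.get('error', body) if isinstance(body, dict) else body
    raise SPECIFIC_ERRORS.get(status_code, HttpError)(status_code, error)


# Callers can catch by type and inspect attributes, not parse a string.
try:
    check_response(404, {'error': 'IndexMissingException[[contacts] missing]',
                         'status': 404})
except HttpError as exc:
    caught = exc
    print(type(caught).__name__, caught.status_code)
```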

Other changes:

Add Sphinx documentation.
Add load-balancing across multiple nodes.
Add failover in the case where a node doesn't respond.
Add close_index, open_index, update_settings, health.
Support passing arbitrary kwargs through to the ES query string. Known ones are taken verbatim; unanticipated ones need an "es\_" prefix to guarantee forward compatibility.
Automatically convert datetime objects when encoding JSON.
Recognize and convert datetimes and dates in pass-through kwargs. This is useful for timeout.
In routines that can take either one or many indexes, don't require the caller to wrap a single index name in a list.
Many other internal improvements
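
The pass-through kwarg rule in the list above (known names taken verbatim, anything else needing an "es\_" prefix) can be sketched like this. ``KNOWN_PARAMS`` and ``split_query_params`` are hypothetical, and the set of known names is invented for the example. Raising ``TypeError`` for an unprefixed unknown name is likewise a choice made for this sketch.

```python
# Invented for this sketch; the real set of known names is not listed here.
KNOWN_PARAMS = frozenset(['routing', 'size', 'timeout'])


def split_query_params(kwargs, known=KNOWN_PARAMS):
    """Turn method kwargs into query string params using the es_ rule."""
    params = {}
    for name, value in kwargs.items():
        if name in known:
            # Known params pass through under their own name.
            params[name] = value
        elif name.startswith('es_'):
            # Unanticipated params must opt in with the prefix.
            params[name[len('es_'):]] = value
        else:
            raise TypeError('unexpected keyword argument %r' % name)
    return params


params = split_query_params({'size': 10, 'es_preference': '_local'})
print(params)
```

Requiring the prefix for unknown names means a kwarg the client later learns to handle itself cannot silently change meaning for existing callers.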

v0.1 (2012-08-30)

Initial release based on the work of Robert Eanes and other authors
