com.strapdata.elasticsearch:elasticsearch
Elassandra is a fork of Elasticsearch modified to run as a plugin for Apache Cassandra in a scalable and resilient peer-to-peer architecture. Elasticsearch code is embedded in Cassandra nodes, providing advanced search features on Cassandra tables, and Cassandra serves as an Elasticsearch data and configuration store.
Elassandra supports Cassandra vnodes and scales horizontally by adding more nodes.
Project documentation is available at doc.elassandra.io.
For Cassandra users, Elassandra provides the following features for integration with Elasticsearch:
For Elasticsearch users, Elassandra provides these useful features due to integration with Cassandra:
See the Quick Start guide to run a single-node Elassandra cluster in Docker.
Elassandra uses the Cassandra GOSSIP protocol to manage the Elasticsearch routing table. Elassandra 6.2.3.25+ adds support for compressing the X1 application state, which increases the maximum number of Elasticsearch indices. For backward compatibility, compression is disabled by default. Once all your nodes are upgraded to version 6.2.3.25+, you should enable X1 compression by adding -Des.compress_x1=true to your conf/jvm.options and performing a rolling restart of all nodes. Nodes running version 6.2.3.25+ can read both compressed and uncompressed X1.
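As an illustration of the configuration step above, here is a minimal Python sketch that appends the flag to a jvm.options file only when it is not already active. The helper name `enable_x1_compression` is hypothetical and not part of Elassandra. Only the flag itself and the conf/jvm.options location come from the text above. The rolling restart still has to be performed separately.

```python
from pathlib import Path

# Flag name taken from the paragraph above.
X1_FLAG = "-Des.compress_x1=true"


def enable_x1_compression(jvm_options: Path) -> bool:
    """Append the X1 compression flag to jvm.options if it is not already set.

    Returns True when the file was modified, False when the flag was present.
    """
    text = jvm_options.read_text() if jvm_options.exists() else ""
    # Skip commented-out lines so "#-Des.compress_x1=true" does not count.
    active = [
        line.strip()
        for line in text.splitlines()
        if not line.lstrip().startswith("#")
    ]
    if X1_FLAG in active:
        return False
    # Make sure the flag starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    jvm_options.write_text(text + separator + X1_FLAG + "\n")
    return True
```

Calling `enable_x1_compression(Path("conf/jvm.options"))` a second time leaves the file unchanged, so the step can be repeated safely on every node.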
Before version 6.2.3.21, the Cassandra replication factor for the elastic_admin keyspace (and elastic_admin_[datacenter.group]) was automatically adjusted to the number of nodes in the datacenter. Because the replication factor has a performance impact on large clusters, this is now configurable: as of version 6.2.3.21, it is up to your Elassandra administrator to adjust the replication factor for this keyspace. Keep in mind that Elasticsearch mapping updates rely on a PAXOS transaction that requires a QUORUM of nodes to succeed, so the replication factor should be at least 3 in each datacenter.
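The recommendation of a replication factor of at least 3 follows from the usual Cassandra quorum arithmetic (half of the replicas, rounded down, plus one). The short Python sketch below works through it for a single replica set. The function names are illustrative only.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM operation."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With a replication factor of 1 or 2, losing a single replica makes a quorum unreachable, so mapping updates would fail. A replication factor of 3 is the smallest value that tolerates one unavailable replica.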
As of Elassandra 6.2.3.19, the metadata version relies on the Cassandra table elastic_admin.metadata_log (elastic_admin.metadata from 6.2.3.8 to 6.2.3.18) to keep the Elasticsearch mapping update history and to recover automatically from a possible PAXOS write timeout.
When upgrading the first node of a cluster, Elassandra automatically copies the current metadata.version into the new elastic_admin.metadata_log table. To avoid Elasticsearch mapping inconsistencies, you must not update mappings while the rolling upgrade is in progress. Once all nodes are upgraded, elastic_admin.metadata is no longer used and can be removed. You can then read the mapping update history from the new elastic_admin.metadata_log table to see which node updated the mapping, when, and for what reason.
Elassandra 6.2.3.8+ fully manages the Elasticsearch mapping in the CQL schema through CQL schema extensions (see system_schema.tables, column extensions). These table extensions and the CQL schema updates resulting from Elasticsearch index creation or modification are applied in batched atomic schema updates to ensure consistency when concurrent updates occur. Moreover, these extensions are stored in binary form and support partial updates, which is more efficient. As a result, the Elasticsearch mapping is no longer stored in the elastic_admin.metadata table.
WARNING: During the rolling upgrade, Elasticsearch mapping changes are not propagated between nodes running the new and the old versions, so don't change your mapping while you're upgrading. Once all your nodes have been upgraded to 6.2.3.8+ and validated, apply the following CQL statements to remove the obsolete Elasticsearch metadata:
ALTER TABLE elastic_admin.metadata DROP metadata;
ALTER TABLE elastic_admin.metadata WITH comment = '';
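The cleanup statements above should only be run once every node has reached 6.2.3.8 or later. The Python sketch below shows one way to check that condition. It assumes you have already collected each node's Elassandra version string by some other means, and that versions are purely numeric dotted strings. The helper names are hypothetical.

```python
from typing import Iterable, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '6.2.3.8' into (6, 2, 3, 8) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def all_upgraded(node_versions: Iterable[str], minimum: str = "6.2.3.8") -> bool:
    """True when at least one node is listed and all are at or above `minimum`."""
    floor = parse_version(minimum)
    versions = list(node_versions)
    # An empty list is treated as "not upgraded" rather than vacuously true.
    return bool(versions) and all(parse_version(v) >= floor for v in versions)
```

Comparing tuples of integers rather than raw strings matters here: as plain strings, "6.2.3.18" sorts before "6.2.3.8", which would give the wrong answer.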
WARNING: Because of the CQL table extensions used by Elassandra, some old versions of cqlsh may fail with the error message "'module' object has no attribute 'viewkeys'". This comes from the old Python Cassandra driver embedded in Cassandra and has been reported in CASSANDRA-14942. Possible workarounds:
docker run -it --rm strapdata/cqlsh:0.1 node.example.com
Ensure Java 8 is installed and JAVA_HOME points to the correct location.
export CASSANDRA_HOME=<extracted_directory>
bin/cassandra -e
bin/nodetool status
curl -XGET localhost:9200/_cluster/state
Try indexing a document on a non-existing index:
curl -XPUT 'http://localhost:9200/twitter/_doc/1?pretty' -H 'Content-Type: application/json' -d '{
"user": "Poulpy",
"post_date": "2017-10-04T13:12:00Z",
"message": "Elassandra adds dynamic mapping to Cassandra"
}'
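For readers who prefer to script this step, the same request can be built with the Python standard library. This is a sketch rather than an official client: it assumes a node listening on http://localhost:9200 as in the curl command above, and the helper name `build_index_request` is made up for the example. The request is only constructed here; sending it is shown in the trailing comment.

```python
import json
import urllib.request


def build_index_request(
    base_url: str, index: str, doc_id: str, document: dict
) -> urllib.request.Request:
    """Build (but do not send) the PUT request that indexes one document."""
    url = f"{base_url}/{index}/_doc/{doc_id}?pretty"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_index_request(
    "http://localhost:9200",
    "twitter",
    "1",
    {
        "user": "Poulpy",
        "post_date": "2017-10-04T13:12:00Z",
        "message": "Elassandra adds dynamic mapping to Cassandra",
    },
)
# To send it against a running node:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```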
Then look-up in Cassandra:
bin/cqlsh -e "SELECT * from twitter.\"_doc\""
Behind the scenes, Elassandra has created a new keyspace twitter and table _doc.
admin@cqlsh>DESC KEYSPACE twitter;
CREATE KEYSPACE twitter WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1'} AND durable_writes = true;
CREATE TABLE twitter."_doc" (
"_id" text PRIMARY KEY,
message list<text>,
post_date list<timestamp>,
user list<text>
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX elastic__doc_idx ON twitter."_doc" () USING 'org.elassandra.index.ExtendedElasticSecondaryIndex';
By default, multi-valued Elasticsearch fields are mapped to Cassandra lists. Now, insert a row with CQL:
INSERT INTO twitter."_doc" ("_id", user, post_date, message)
VALUES ( '2', ['Jimmy'], [dateof(now())], ['New data is indexed automatically']);
SELECT * FROM twitter."_doc";
_id | message | post_date | user
-----+--------------------------------------------------+-------------------------------------+------------
2 | ['New data is indexed automatically'] | ['2019-07-04 06:00:21.893000+0000'] | ['Jimmy']
1 | ['Elassandra adds dynamic mapping to Cassandra'] | ['2017-10-04 13:12:00.000000+0000'] | ['Poulpy']
(2 rows)
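The query result shows the default convention at work: each field of the indexed document is stored in a list column. The Python sketch below mimics that convention for illustration only. It is not Elassandra's implementation, and the function name `to_list_columns` is hypothetical.

```python
from typing import Any, Dict


def to_list_columns(doc_id: str, source: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap every field value in a list, mirroring the list<...> columns."""
    row: Dict[str, Any] = {"_id": doc_id}
    for field, value in source.items():
        # Values that are already multi-valued stay as lists.
        row[field] = list(value) if isinstance(value, (list, tuple)) else [value]
    return row


row = to_list_columns(
    "1",
    {"user": "Poulpy", "message": "Elassandra adds dynamic mapping to Cassandra"},
)
print(row)
```

This is why the CQL INSERT has to supply `['Jimmy']` rather than `'Jimmy'` for the user column.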
Then search for it with the Elasticsearch API:
curl "localhost:9200/twitter/_search?q=user:Jimmy&pretty"
And here is a sample response:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.6931472,
"hits" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.6931472,
"_source" : {
"post_date" : "2019-07-04T06:00:21.893Z",
"message" : "New data is indexed automatically",
"user" : "Jimmy"
}
}
]
}
}
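To consume a response like this from a script, the matching documents can be pulled out of the hits.hits array. The Python sketch below parses a copy of the sample response above. The helper name `extract_hits` is illustrative.

```python
import json
from typing import Any, Dict, List, Tuple

# Copy of the sample search response shown above.
SAMPLE_RESPONSE = """
{
  "took": 3,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.6931472,
        "_source": {
          "post_date": "2019-07-04T06:00:21.893Z",
          "message": "New data is indexed automatically",
          "user": "Jimmy"
        }
      }
    ]
  }
}
"""


def extract_hits(response: Dict[str, Any]) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (document id, source document) pairs from a search response."""
    return [(hit["_id"], hit["_source"]) for hit in response["hits"]["hits"]]


for doc_id, source in extract_hits(json.loads(SAMPLE_RESPONSE)):
    print(doc_id, source["user"], source["message"])
```

Note that the `_source` returned by the search API holds plain values (`"Jimmy"`), even though the underlying Cassandra columns are lists.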
This software is licensed under the Apache License, version 2 ("ALv2"), quoted below.
Copyright 2015-2018, Strapdata (contact@strapdata.com).
Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.