# express-cassandra
Comparing version 1.1.0 to 1.1.1
package.json

```diff
 {
   "name": "express-cassandra",
-  "version": "1.1.0",
+  "version": "1.1.1",
   "dependencies": {
     "async": "^1.0.0",
```
README.md
[![Build Status](https://travis-ci.org/masumsoft/express-cassandra.svg)](https://travis-ci.org/masumsoft/express-cassandra)
### Write a Model named `PersonModel.js` inside models directory
### Let's insert some data into PersonModel
### Now let's find it
```js
    indexes: ["name"],
    custom_indexes: [
        {
            on: 'complete_name',
            using: 'org.apache.cassandra.index.sasi.SASIIndex',
            options: {}
        }
    ],
    table_name: "my_custom_table_name"
```
- `custom_indexes` is an array of objects defining the custom indexes for the table. The `on` section should contain the column name on which the index should be built, the `using` section should contain the custom indexer class path, and the `options` section should contain any options to pass to the indexer class. If no `options` are required, pass a blank {} object.
- `table_name` provides the ability to use a different name for the actual table in cassandra. By default the lowercased model name is used as the table name. If you want a different table name instead, use this optional field to specify a custom name for your cassandra table.
Ok, we are done with John, let's delete him:
## Querying your data

Ok, now you have a bunch of people in the db. How do you retrieve them?

### Find (results are model instances)

```js
models.instance.Person.find({name: 'John'}, function(err, people){
    if(err) throw err;
    //people is an array of model instances containing the persons with name `John`
    console.log('Found ', people);
});

//If you specifically expect only a single object after find, you may do this
models.instance.Person.findOne({name: 'John'}, function(err, john){
    if(err) throw err;
    //The variable `john` is a model instance containing the person named `John`
    //`john` will be undefined if no person named `John` was found
    console.log('Found ', john.name);
});
```

Note that the result objects in the callback will be model instances, so you may perform operations like `john.save`, `john.delete` etc. on them directly. If you want to extract the raw javascript object values from a model instance, use the toJSON method like `john.toJSON()`.
The above example performs the query `SELECT * FROM person WHERE name='John'`, but `find()` allows you to perform even more complex queries on cassandra. You should be aware of how to query cassandra. Every error will be reported to you in the `err` argument, while in `people` you'll find instances of `Person`.

### Find (results are raw objects)
If you don't want the orm to cast results to instances of your model, you can use the `raw` option as in the following example:

```js
models.instance.Person.find({name: 'John'}, { raw: true }, function(err, people){
    //people is an array of plain objects
});
```
### Find (A more complex query)

```js
var query = {
    // equality query, stands for name='John'; could also be written as name: { $eq: 'John' }
    name: 'John',
    // range query, stands for age>10 and age<=20. You can use $gt (>), $gte (>=), $lt (<), $lte (<=)
    age : { '$gt': 10, '$lte': 20 },
    // IN clause, means surname should either be Doe or Smith
    surname : { '$in': ['Doe','Smith'] },
    // like query supported by SASI indexes; complete_name must have a SASI index defined in custom_indexes
    complete_name: { '$like': 'J%' },
    // order results by age in ascending order.
    // $desc is also allowed, as are complex orders like $orderby: {'$asc' : ['k1','k2'] }
    $orderby: { '$asc' : 'age' },
    // limit the result set to 10 rows
    $limit: 10
}

models.instance.Person.find(query, {raw: true}, function(err, people){
    //people is an array of plain objects satisfying the query conditions above
});
```
Note that all query clauses must be Cassandra compliant. You cannot, for example, use the $in operator on a key which is not part of the primary key. Querying in Cassandra is very basic but can be confusing at first. Take a look at this [post](http://mechanics.flite.com/blog/2013/11/05/breaking-down-the-cql-where-clause/) and, obviously, at the [cql query documentation](https://docs.datastax.com/en/cql/3.3/cql/cql_using/useQueryDataTOC.html).
### Find (results to contain only selected columns)

You can also select particular columns using the select key in the options object like the following example:

```js
models.instance.Person.find({name: 'John'}, { select: ['name as username','age'] }, function(err, people){
    //people is an array of plain objects with only name and age
});
```

Note that if you use the `select` option, the results will always be raw plain objects instead of model instances.
Also **remember** that your select needs to include all the partition key columns defined for your table!
If your model key looks like this:

```js
module.exports = {
    fields: {
        //fields are not shown for clarity
    },
    key : [["columnOne","columnTwo","columnThree"],"columnFour","ColumnFive"]
}
```

Then your `select` array has to at least include the partition key columns like this: `select: ['columnOne', 'columnTwo', 'columnThree']`.
### Find (using aggregate function)

You can also use `aggregate functions` using the select key in the options object like the following example:

```js
models.instance.Person.find({name: 'John'}, { select: ['name','sum(age)'] }, function(err, people){
    //people is an array of plain objects with the sum of all ages where name is John
});
```
### Find (using distinct select)

`DISTINCT` selects are also possible:

```js
models.instance.Person.find({}, { select: ['name','age'], distinct: true }, function(err, people){
    //people is an array of plain objects with only distinct names and ages.
});
```
### Find (querying a materialized view)

If you have defined `materialized views` in your schema as described in the schema detail section, you can query your views using the same find/findOne functions. Just add an option with the materialized view name like the following:

```js
models.instance.Person.find({name: 'John'}, { materialized_view: 'view_name1', raw: true }, function(err, people){
    //people is an array of plain objects taken from the materialized view
});
```
### Find (with allow filtering)

If you want to set the allow filtering option, you may do so like this:

```js
models.instance.Person.find(query, {raw: true, allow_filtering: true}, function(err, people){
    //people is an array of plain objects
});
```

### Find (using index expression)

If you want to use custom index expressions, you may do so like this:

```js
var query = {
    $expr: {
        index: 'YOUR_INDEX_NAME',
        query: 'YOUR_CUSTOM_EXPR_QUERY'
    }
}

models.instance.Person.find(query, function(err, people){

});
```
### Find (fetching large result sets using streaming queries)

The stream() method automatically fetches the following pages, yielding the rows as they come through the network and retrieving the following page after the previous rows were read (throttling).

```js
models.instance.Person.stream({Name: 'John'}, {raw: true}, function(reader){
    var row;
    while (row = reader.readRow()) {
        //process row
    }
}, function(err){
    //emitted when all rows have been retrieved and read
});
```
With the eachRow() method, you can retrieve the following pages automatically by setting the autoPage flag to true in the query options. Because eachRow() does not handle backpressure, it is only suitable when there is minimal computation required per row and no additional I/O, otherwise it ends up buffering an unbounded amount of rows.

```js
models.instance.Person.eachRow({Name: 'John'}, {autoPage : true}, function(n, row){
    // invoked per each row in all the pages
}, function(err, result){
    // ...
});
```

If you want to retrieve the next page of results only when you ask for it (for example, in a web page or after a certain computation or job finished), you can use the eachRow() method in the following way:

```js
models.instance.Person.eachRow({Name: 'John'}, {fetchSize : 100}, function(n, row){
    // invoked per each row in all the pages
}, function(err, result){
    // called once the page has been retrieved.
    if(err) throw err;
    if (result.nextPage) {
        // retrieve the following pages
        // the same row handler from above will be used
        result.nextPage();
    }
});
```

You can also use the `pageState` property, a string token made available in the result if there are additional result pages.

```js
models.instance.Person.eachRow({Name: 'John'}, {fetchSize : 100}, function(n, row){
    // invoked per each row in all the pages
}, function(err, result){
    // called once the page has been retrieved.
    if(err) throw err;
    // store the paging state
    pageState = result.pageState;
});
```

In the next request, use the page state to fetch the following rows.

```js
models.instance.Person.eachRow({Name: 'John'}, {fetchSize : 100, pageState : pageState}, function(n, row){
    // invoked per each row in all the pages
}, function(err, result){
    // called once the page has been retrieved.
    if(err) throw err;
    // store the next paging state.
    pageState = result.pageState;
});
```

Saving the paging state works well when you only let the user move from one page to the next. But it doesn't allow random jumps (like "go directly to page 10"), because you can't fetch a page unless you have the paging state of the previous one. Such a feature would require offset queries, which are not natively supported by Cassandra.

Note: The page state token can be manipulated to retrieve other results within the same column family, so it is not safe to expose it to users.
### Find (token based pagination)

You can also use the `token` comparison function while querying a result set, using the $token operator. This is especially useful for [paging through unordered partitioner results](https://docs.datastax.com/en/cql/3.3/cql/cql_using/usePaging.html).

```js
//consider the following situation
var query = {
    $limit: 10
};

models.instance.Person.find(query, function(err, people){
    //people is an array of the first 10 persons
    //Say your PRIMARY KEY column is `name` and the 10th person has the name 'John'
    //Now to get the next 10 results, you may use the $token operator like the following:
    var query = {
        name: {
            '$token': {'$gt': 'John'}
        },
        $limit: 10
    };
    //The above query translates to `SELECT * FROM person WHERE token(name) > token('John') LIMIT 10`
    models.instance.Person.find(query, function(err, people){
        //people is an array of objects containing the 11th - 20th persons
    });
});
```
If you have a `composite partition key`, the key for the $token operator should contain the comma (,) separated partition key field names, and the value should be an array containing the values for those partition key fields. The following is an example to demonstrate that:

```js
var query = {
    'id,name': {
        '$token': {
            '$gt': [1234, 'John']
        }
    }
};

models.instance.Person.find(query, function(err, people){

});
```
## Save / Update / Delete / Batch

### Save

The save operation on a model instance will insert a new record with the attribute values provided when creating the model object, or update the record if it already exists in the database. A record is updated or inserted based on the primary key definition: if the primary key values are the same as those of an existing record, the record will be updated; otherwise it will be inserted as a new record.

```js
var john = new models.instance.Person({name: 'John', surname: 'Doe', age: 32});
john.save(function(err){
    if(err) console.log(err);
    else console.log('Yuppiie!');
});
```
You can use a find query to get an object, modify it and save it like the following:

```js
models.instance.Person.findOne({name: 'John'}, function(err, john){
    if(err) throw err;
    if(john){
        john.age = 30;
        john.save(function(err){
            if(err) console.log(err);
            else console.log('Yuppiie!');
        });
    }
});
```
The save function also takes optional parameters. By default cassandra will update the row if the primary key already exists. If you want to avoid updates on duplicate keys, you may set `if_not_exist: true`.

```js
john.save({if_not_exist: true}, function(err){
    if(err) console.log(err);
    else console.log('Yuppiie!');
});
```
You can also set an expiry ttl for the saved row if you want. In that case the row will be removed by cassandra automatically after the time to live has expired.

```js
//The row will be removed after 86400 seconds or one day
john.save({ttl: 86400}, function(err){
    if(err) console.log(err);
    else console.log('Yuppiie!');
});
```
### Update

Use the update function if your requirements are not satisfied by the `save()` function or if you want to update records directly without reading them from the db. The update function takes the following forms (options are optional):

```js
/*
    UPDATE person
    USING TTL 86400
    SET email='abc@gmail.com'
    WHERE username='abc'
    IF EXISTS
*/
var query_object = {username: 'abc'};
var update_values_object = {email: 'abc@gmail.com'};
var options = {ttl: 86400, if_exists: true};
models.instance.Person.update(query_object, update_values_object, options, function(err){
    if(err) console.log(err);
    else console.log('Yuppiie!');
});

/*
    UPDATE person
    SET email='abc@gmail.com'
    WHERE username='abc'
    IF email='typo@gmail.com'
*/
var query_object = {username: 'abc'};
var update_values_object = {email: 'abc@gmail.com'};
var options = {conditions: {email: 'typo@gmail.com'}};
models.instance.Person.update(query_object, update_values_object, options, function(err){
    if(err) console.log(err);
    else console.log('Yuppiie!');
});
```
### Delete

The delete function takes the following form:

```js
//DELETE FROM person WHERE username='abc';
var query_object = {username: 'abc'};
models.instance.Person.delete(query_object, function(err){
    if(err) console.log(err);
    else console.log('Yuppiie!');
});
```

If you have a model instance and you want to delete it, you may do so like the following:

```js
models.instance.Person.findOne({name: 'John'}, function(err, john){
    if(err) throw err;
    //Note that the returned variable john here is an instance of your model,
    //so you can call john.delete() like the following
    john.delete(function(err){
        //...
    });
});
```
### Batching ORM Operations

You can batch any number of save, update and delete operations using the `models.doBatch` function. To combine several of those operations into one batch, you need to tell each save/update/delete function that you want to get the final built query from the orm instead of executing it immediately. You can do that by adding a `return_query` parameter to the options object of the corresponding function, and building an array of operations to execute atomically like the following:

```js
var queries = [];

var event = new models.instance.Event({
    id: 3,
    body: 'hello3'
});
var save_query = event.save({return_query: true});
queries.push(save_query);

var update_query = models.instance.Event.update(
    {id: 1},
    {body: 'hello1 updated'},
    {return_query: true}
);
queries.push(update_query);

var delete_query = models.instance.Event.delete(
    {id: 2},
    {return_query: true}
);
queries.push(delete_query);

models.doBatch(queries, function(err){
    if(err) throw err;
});
```
## Complex Datatypes and Operations

### Counter Column Operations
### Collection Data Types

Cassandra collection data types (`map`, `list` & `set`) are supported in model schema definitions. An additional `typeDef` attribute is used to define the collection type.

### Frozen Collections

Frozen collections are useful if you want to use them in the primary key. A frozen collection can only be replaced as a whole; you cannot, for example, add or remove elements in a frozen collection.

### Tuple Data Type

Cassandra tuple data types can be declared using the `frozen` type.

### User Defined Types, Functions and Aggregates

User defined types (UDTs), user defined functions (UDFs) and user defined aggregates (UDAs) are supported too. The UDTs, UDFs & UDAs should be defined globally against your keyspace. You can define them in the configuration object passed when initializing express-cassandra, so that express-cassandra can create and sync them against your keyspace and you can then use them in your schema definitions and queries. The configuration object should contain additional keys representing the user defined types, functions and aggregates under `ormOptions`.

### Shared Static Columns

In a table that uses clustering columns, non-clustering columns can be declared static in the schema definition.

### Indexed Collections

Collections can be indexed and queried to find a collection containing a particular value. Sets and lists are indexed slightly differently from maps, given the key-value nature of maps.
## A few handy tools for your model

Express cassandra exposes some node driver methods for convenience. To generate uuids, e.g. in field defaults:

* `models.uuid()`
  returns a type 4 (random) uuid, suitable for Cassandra `uuid` fields, as a string
* `models.uuidFromString(str)`
  returns a uuid from the input uuid string, suitable for Cassandra `uuid` fields
* `models.timeuuid() / .maxTimeuuid() / .minTimeuuid()`
  returns a type 1 (time-based) uuid, suitable for Cassandra `timeuuid` fields, as a string. From the [Datastax documentation](https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timeuuid_functions_r.html):

  > The min/maxTimeuuid example selects all rows where the timeuuid column, t, is strictly later than 2013-01-01 00:05+0000 but strictly earlier than 2013-02-02 10:00+0000. The t >= maxTimeuuid('2013-01-01 00:05+0000') does not select a timeuuid generated exactly at 2013-01-01 00:05+0000 and is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').

  > The values returned by minTimeuuid and maxTimeuuid functions are not true UUIDs in that the values do not conform to the Time-Based UUID generation process specified by RFC 4122. The results of these functions are deterministic, unlike the now function.

* `models.consistencies`
  this object contains all the available consistency enums defined by the node cassandra driver, so you can for example use models.consistencies.one, models.consistencies.quorum etc.
* `models.datatypes`
  this object contains all the available datatypes defined by the node cassandra driver, so you can for example use models.datatypes.Long to deal with the cassandra bigint or counter field types.
## Cassandra to Javascript Datatypes

When saving or retrieving the value of a column, the value is typed according to the following table.

| Cassandra Field Types  | Javascript Types                  |
|------------------------|-----------------------------------|
| ascii                  | String                            |
| bigint                 | [models.datatypes.Long](https://google.github.io/closure-library/api/goog.math.Long.html) |
| blob                   | [Buffer](https://nodejs.org/api/buffer.html) |
| boolean                | Boolean                           |
| counter                | [models.datatypes.Long](https://google.github.io/closure-library/api/goog.math.Long.html) |
| date                   | [models.datatypes.LocalDate](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-LocalDate.html) |
| decimal                | [models.datatypes.BigDecimal](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-BigDecimal.html) |
| double                 | Number                            |
| float                  | Number                            |
| inet                   | [models.datatypes.InetAddress](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-InetAddress.html) |
| int                    | Number (Integer)                  |
| list                   | Array                             |
| map                    | Object                            |
| set                    | Array                             |
| smallint               | Number (Integer)                  |
| text                   | String                            |
| time                   | [models.datatypes.LocalTime](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-LocalTime.html) |
| timestamp              | Date                              |
| timeuuid               | [models.datatypes.TimeUuid](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-TimeUuid.html) |
| tinyint                | Number (Integer)                  |
| tuple                  | [models.datatypes.Tuple](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-Tuple.html) |
| uuid                   | [models.datatypes.Uuid](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-Uuid.html) |
| varchar                | String                            |
| varint                 | [models.datatypes.Integer](http://docs.datastax.com/en/drivers/nodejs/3.0/module-types-Integer.html) |
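The reason `bigint` and `counter` map to a Long object rather than a plain Number: JavaScript numbers are IEEE 754 doubles and silently lose precision above `Number.MAX_SAFE_INTEGER` (2^53 - 1). A quick self-contained illustration:

```javascript
// JavaScript Numbers are 64-bit floats: integer literals above 2^53 - 1 get rounded.
var big = 9007199254740993;            // 2^53 + 1, not representable as a double
console.log(Number.MAX_SAFE_INTEGER);  // 9007199254740991
console.log(big === 9007199254740992); // true - the literal was silently rounded
```

This is why bigint/counter values round-trip as Long objects; convert with `.toString()` and `models.datatypes.Long.fromString()` at the boundaries, as shown in the example below.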
For example, say you have a User model schema like the following:

```js
module.exports = {
    "fields": {
        "user_id": "bigint",
        "user_name": "text"
    },
    "key" : ["user_id"]
}
```

Now to insert data into the model, you need the Long data type. To create Long type data, you can use `models.datatypes.Long` like the following:

```js
var user = new models.instance.User({
    user_id: models.datatypes.Long.fromString('1234556567676782'),
    user_name: 'john'
});
user.save(function(err){
    //Now let's find the saved user
    models.instance.User.findOne({user_id: models.datatypes.Long.fromString('1234556567676782')}, function(err, john){
        console.log(john.user_id.toString()); // john.user_id is of type Long.
    });
});
```
### Null and unset values

When Cassandra processes a distributed DELETE operation, it replaces the deleted value with a special value called a tombstone, which can be propagated to replicas. When inserting or updating a field, you can set it to null as a way to clear its value, and this is considered a DELETE operation. In some cases, you might insert rows using null for values that are not specified, and even though the intention is to leave the value empty, Cassandra represents it as a tombstone, causing unnecessary overhead.

To avoid such tombstones, cassandra has the concept of an unset parameter value. So you can do the following to unset a field value, for example:

```js
models.instance.User.update({user_id: models.datatypes.Long.fromString('1234556567676782')}, {
    user_name: models.datatypes.unset
}, function(err){
    //user name is now unset
});
```
## Virtual fields
## Querying your data | ||
Ok, now you have a bunch of people on db. How do I retrieve them? | ||
### Find (results are model instances) | ||
```js | ||
models.instance.Person.find({name: 'John'}, function(err, people){ | ||
if(err) throw err; | ||
//people is an array of model instances containing the persons with name `John` | ||
console.log('Found ', people); | ||
}); | ||
//If you specifically expect only a single object after find, you may do this | ||
models.instance.Person.findOne({name: 'John'}, function(err, john){ | ||
if(err) throw err; | ||
//The variable `john` is a model instance containing the person named `John` | ||
//`john` will be undefined if no person named `John` was found | ||
console.log('Found ', john.name); | ||
}); | ||
``` | ||
Note that, result objects here in callback will be model instances. So you may do operations like `john.save`, `john.delete` etc on the result object directly. If you want to extract the raw javascript object values from a model instance, you may use the toJSON method like `john.toJSON()`. | ||
In the above example it will perform the query `SELECT * FROM person WHERE name='john'` but `find()` allows you to perform even more complex queries on cassandra. You should be aware of how to query cassandra. Every error will be reported to you in the `err` argument, while in `people` you'll find instances of `Person`. | ||
#### Find (results are raw objects) | ||
If you don't want the orm to cast results to instances of your model you can use the `raw` option as in the following example: | ||
```js | ||
models.instance.Person.find({name: 'John'}, { raw: true }, function(err, people){ | ||
//people is an array of plain objects | ||
}); | ||
``` | ||
#### Find (A more complex query) | ||
```js | ||
var query = {
name: 'John', // stands for name='John'
age : { '$gt':10, '$lte':20 }, // stands for age>10 AND age<=20. Supported operators are $gt, $gte, $lt, $lte and $eq
surname : { '$in': ['Doe','Smith'] }, //this is an IN clause
$orderby:{'$asc' :'age'}, //order results by age in ascending order. $desc and compound orders like $orderby:{'$asc' : ['k1','k2'] } are also allowed
$limit: 10 //limit the result set
}
models.instance.Person.find(query, {raw: true}, function(err, people){ | ||
//people is an array of plain objects satisfying the query conditions above | ||
}); | ||
``` | ||
#### Find (results to contain only selected columns) | ||
You can also select particular columns using the select key in the options object like the following example: | ||
```js | ||
models.instance.Person.find({name: 'John'}, { select: ['name as username','age'] }, function(err, people){ | ||
//people is an array of plain objects with only name and age | ||
}); | ||
``` | ||
Note that if you use the `select` option, then the results will always be raw plain objects instead of model instances. | ||
Also **remember** that your select needs to include all the partition key columns defined for your table!
If your model key looks like this: | ||
```js | ||
module.exports = { | ||
fields: { | ||
//fields are not shown for clarity | ||
}, | ||
key : [["columnOne","columnTwo","columnThree"],"columnFour","ColumnFive"] | ||
} | ||
``` | ||
Then your `select`-array has to at least include the partition key columns like this: `select: ['columnOne', 'columnTwo', 'columnThree']`. | ||
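As a quick illustration, here is a small helper (hypothetical, not part of express-cassandra) that extracts the partition key columns from a schema's `key` definition, so you can make sure your `select` array includes them:

```js
// Hypothetical helper: given a model schema's `key` definition,
// return the partition key column names that any `select` must include.
function partitionKeyColumns(key) {
    // the first element of `key` is the partition key;
    // it is an array for composite partition keys, a plain string otherwise
    var first = key[0];
    return Array.isArray(first) ? first.slice() : [first];
}

var key = [["columnOne","columnTwo","columnThree"],"columnFour","ColumnFive"];
partitionKeyColumns(key); // ['columnOne', 'columnTwo', 'columnThree']
```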
#### Find (using aggregate function) | ||
You can also use aggregate functions using the `select` key in the options object, like the following example:
```js | ||
models.instance.Person.find({name: 'John'}, { select: ['name','sum(age)'] }, function(err, people){ | ||
//people is an array of plain objects with sum of all ages where name is John | ||
}); | ||
``` | ||
#### Find (using distinct select) | ||
Also, `DISTINCT` selects are possible: | ||
```js | ||
models.instance.Person.find({}, { select: ['name','age'], distinct: true }, function(err, people){ | ||
//people is an array of plain objects with distinct name and age combinations.
}); | ||
``` | ||
#### Find (querying a materialized view) | ||
And if you have defined `materialized views` in your schema as described in the schema detail section, then you can query your views using the same find/findOne functions. Just add the materialized view name as an option like the following:
```js | ||
models.instance.Person.find({name: 'John'}, { materialized_view: 'view_name1', raw: true }, function(err, people){ | ||
//people is an array of plain objects taken from the materialized view | ||
}); | ||
``` | ||
#### Find (with allow filtering) | ||
If you want to set the allow filtering option, you may do that like this:
```js | ||
models.instance.Person.find(query, {raw:true, allow_filtering: true}, function(err, people){ | ||
//people is an array of plain objects | ||
}); | ||
``` | ||
#### Find (using index expression) | ||
If you want to use custom index expressions, you may do that like this: | ||
```js | ||
var query = { | ||
$expr: { | ||
index: 'YOUR_INDEX_NAME', | ||
query: 'YOUR_CUSTOM_EXPR_QUERY' | ||
} | ||
} | ||
models.instance.Person.find(query, function(err, people){ | ||
}); | ||
``` | ||
#### Find (fetching large result sets using streaming queries) | ||
The stream() method automatically fetches the following pages, yielding the rows as they come through the network and retrieving the following page after the previous rows were read (throttling). | ||
```js | ||
models.instance.Person.stream({name: 'John'}, {raw: true}, function(reader){
var row; | ||
while (row = reader.readRow()) { | ||
//process row | ||
} | ||
}, function(err){ | ||
//emitted when all rows have been retrieved and read | ||
}); | ||
``` | ||
With the eachRow() method, you can retrieve the following pages automatically by setting the autoPage flag to true in the query options. Because eachRow() does not handle backpressure, it is only suitable when there is minimal computation per row and no additional I/O; otherwise it ends up buffering an unbounded number of rows.
```js | ||
models.instance.Person.eachRow({name: 'John'}, {autoPage : true}, function(n, row){
// invoked per each row in all the pages | ||
}, function(err, result){ | ||
// ... | ||
}); | ||
``` | ||
If you want to retrieve the next page of results only when you ask for it (for example, in a web page or after a certain computation or job finished), you can use the eachRow() method in the following way: | ||
```js | ||
models.instance.Person.eachRow({name: 'John'}, {fetchSize : 100}, function(n, row){
// invoked per each row in all the pages | ||
}, function(err, result){ | ||
// called once the page has been retrieved. | ||
if(err) throw err; | ||
if (result.nextPage) { | ||
// retrieve the following pages | ||
// the same row handler from above will be used | ||
result.nextPage(); | ||
} | ||
}); | ||
``` | ||
You can also use the `pageState` property, a string token made available in the result if there are additional result pages. | ||
```js | ||
models.instance.Person.eachRow({name: 'John'}, {fetchSize : 100}, function(n, row){
// invoked per each row in all the pages | ||
}, function(err, result){ | ||
// called once the page has been retrieved. | ||
if(err) throw err; | ||
// store the paging state | ||
pageState = result.pageState; | ||
}); | ||
``` | ||
In the next request, use the page state to fetch the following rows. | ||
```js | ||
models.instance.Person.eachRow({name: 'John'}, {fetchSize : 100, pageState : pageState}, function(n, row){
// invoked per each row in all the pages | ||
}, function(err, result){ | ||
// called once the page has been retrieved. | ||
if(err) throw err; | ||
// store the next paging state. | ||
pageState = result.pageState; | ||
}); | ||
``` | ||
Saving the paging state works well when you only let the user move from one page to the next. But it doesn’t allow random jumps (like "go directly to page 10"), because you can't fetch a page unless you have the paging state of the previous one. Such a feature would require offset queries, which are not natively supported by Cassandra. | ||
Note: The page state token can be manipulated to retrieve other results within the same column family, so it is not safe to expose it to the users. | ||
#### Find (token based pagination) | ||
You can also use the `token` comparison function while querying a result set using the $token operator. This is especially useful for [paging through unordered partitioner results](https://docs.datastax.com/en/cql/3.3/cql/cql_using/usePaging.html).
```js | ||
//consider the following situation | ||
var query = { | ||
$limit:10 | ||
}; | ||
models.instance.Person.find(query, function(err, people){ | ||
//people is an array of first 10 persons | ||
//Say your primary key column is `name` and the 10th person has the name 'John'
//Now to get the next 10 results, you may use the $token operator like the following: | ||
var query = { | ||
name:{ | ||
'$token':{'$gt':'John'} | ||
}, | ||
$limit:10 | ||
}; | ||
//The above query translates to `SELECT * FROM person WHERE token(name) > token('John') LIMIT 10`
models.instance.Person.find(query, function(err, people){ | ||
//people is an array of objects containing the 11th - 20th person | ||
}); | ||
}); | ||
``` | ||
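The pattern above can be wrapped in a tiny helper (hypothetical, not part of the ORM) that builds the query object for the next page from the last row of the previous one:

```js
// Hypothetical helper: build the next-page query, given the
// partition key column name, the last fetched row and the page size.
function nextPageQuery(keyColumn, lastRow, pageSize) {
    var query = { $limit: pageSize };
    query[keyColumn] = { '$token': { '$gt': lastRow[keyColumn] } };
    return query;
}

var query = nextPageQuery('name', { name: 'John' }, 10);
// query can now be passed to models.instance.Person.find(...)
```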
If you have a `composite partition key`, the key for the $token operator should be the comma (,) separated partition key field names, and the value should be an array containing the values for those fields. The following is an example to demonstrate that:
```js | ||
var query = { | ||
'id,name':{ | ||
'$token':{ | ||
'$gt':[1234,'John'] | ||
} | ||
} | ||
}; | ||
models.instance.Person.find(query, function(err, people){ | ||
}); | ||
``` | ||
Note that all query clauses must be Cassandra compliant. You cannot, for example, use the $in operator for a key which is not part of the primary key. Querying in Cassandra is very basic but can be confusing at first. Take a look at this [post](http://mechanics.flite.com/blog/2013/11/05/breaking-down-the-cql-where-clause/) and, obviously, at the [CQL query documentation](https://docs.datastax.com/en/cql/3.3/cql/cql_using/useQueryDataTOC.html)
## Save / Update / Delete | ||
### Save | ||
The save operation on a model instance will insert a new record with the attribute values given when creating the model object, or update the record if it already exists in the database. Whether a record is updated or inserted is determined by the primary key definition: if the primary key values match an existing record, that record is updated; otherwise a new record is inserted.
```js | ||
var john = new models.instance.Person({name: 'John', surname: 'Doe', age: 32}); | ||
john.save(function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
``` | ||
You can use a find query to get an object, modify it, and save it like the following:
```js | ||
models.instance.Person.findOne({name: 'John'}, function(err, john){ | ||
if(err) throw err; | ||
if(john){ | ||
john.age = 30; | ||
john.save(function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
} | ||
}); | ||
``` | ||
The save function also takes optional parameters. By default Cassandra will update the row if the primary key
already exists. If you want to avoid updates on duplicate keys, you may set `if_not_exist: true`.
```js | ||
john.save({if_not_exist: true}, function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
``` | ||
You can also set an expiry TTL for the saved row if you want. In that case the row will be removed by Cassandra
automatically after the time to live has expired.
```js | ||
//The row will be removed after 86400 seconds or one day | ||
john.save({ttl: 86400}, function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
``` | ||
### Update | ||
Use the update function if the `save()` function does not satisfy your requirements or you want to update records directly without reading them from the db. The update function takes the following forms (the options parameter is optional):
```js | ||
/* | ||
UPDATE person | ||
USING TTL 86400 | ||
SET email='abc@gmail.com' | ||
WHERE username='abc'
IF EXISTS | ||
*/ | ||
var query_object = {username: 'abc'}; | ||
var update_values_object = {email: 'abc@gmail.com'}; | ||
var options = {ttl: 86400, if_exists: true}; | ||
models.instance.Person.update(query_object, update_values_object, options, function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
/* | ||
UPDATE person | ||
SET email='abc@gmail.com' | ||
WHERE username='abc'
IF email='typo@gmail.com' | ||
*/ | ||
var query_object = {username: 'abc'}; | ||
var update_values_object = {email: 'abc@gmail.com'}; | ||
var options = {conditions: {email: 'typo@gmail.com'}}; | ||
models.instance.Person.update(query_object, update_values_object, options, function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
``` | ||
### Delete | ||
The delete function takes the following form: | ||
```js | ||
//DELETE FROM person WHERE username='abc'; | ||
var query_object = {username: 'abc'}; | ||
models.instance.Person.delete(query_object, function(err){ | ||
if(err) console.log(err); | ||
else console.log('Yuppiie!'); | ||
}); | ||
``` | ||
If you have a model instance and you want to delete the instance object, you may do that like the following: | ||
```js | ||
models.instance.Person.findOne({name: 'John'}, function(err, john){ | ||
if(err) throw err; | ||
//Note that returned variable john here is an instance of your model, | ||
//so you can do john.delete() like the following | ||
john.delete(function(err){ | ||
//... | ||
}); | ||
}); | ||
``` | ||
## Raw Query | ||
## Batching ORM Operations | ||
You can batch any number of save, update and delete operations using the `models.doBatch` function. To combine several of those functions into one batch operation, you need to tell each save/update/delete call that you want the final built query from the ORM instead of executing it immediately. You can do that by adding a `return_query` parameter to the options object of the corresponding function, then build an array of operations to execute atomically like the following:
```js | ||
var queries = []; | ||
var event = new models.instance.Event({ | ||
id: 3, | ||
body: 'hello3' | ||
}); | ||
var save_query = event.save({return_query: true}); | ||
queries.push(save_query); | ||
var update_query = models.instance.Event.update( | ||
{id: 1}, | ||
{body: 'hello1 updated'}, | ||
{return_query: true} | ||
); | ||
queries.push(update_query); | ||
var delete_query = models.instance.Event.delete( | ||
{id: 2}, | ||
{return_query: true} | ||
); | ||
queries.push(delete_query); | ||
models.doBatch(queries, function(err){ | ||
if(err) throw err; | ||
}); | ||
``` | ||
## Debug Logging Queries | ||
You can log the queries generated by the ORM. Just set the `DEBUG` environment variable like the following while starting your app:
``` | ||
DEBUG=express-cassandra node app.js | ||
``` | ||
## Raw Batch Query | ||
## Closing connections to cassandra | ||