You're Invited:Meet the Socket Team at BlackHat and DEF CON in Las Vegas, Aug 4-6.RSVP →

Language Detection

github.com/apache/cassandra-gocql-driver/v2

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: By default, PasswordAuthenticator will attempt to authenticate regardless of what implementation the server returns in its AUTHENTICATE message as its authenticator, (e.g. org.apache.cassandra.auth.PasswordAuthenticator). If you wish to restrict this you may use PasswordAuthenticator.AllowedAuthenticators: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.Batch to create a new batch and then fill-in details of individual queries. Then execute the batch with Batch.Exec. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Batch.Exec prepares individual statements in the batch. If you have variable-length batches using the same statement, using Batch.Exec is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Batch.ExecCAS and Batch.MapExecCAS when executing the batch to learn about the result of the LWT. See example for Batch.MapExecCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v2.0.0-rc1-tentative • 4 weeks ago

github.com/hungtorus/cassandra-gocql-driver

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v1.7.0 • 10 months ago

github.com/hasty/pigeon

Command pigeon generates parsers in Go from a PEG grammar. From Wikipedia [0]: Its features and syntax are inspired by the PEG.js project [1], while the implementation is loosely based on [2]. Formal presentation of the PEG theory by Bryan Ford is also an important reference [3]. An introductory blog post can be found at [4]. The pigeon tool must be called with PEG input as defined by the accepted PEG syntax below. The grammar may be provided by a file or read from stdin. The generated parser is written to stdout by default. The following options can be specified: The tool makes no attempt to format the code, nor to detect the required imports. It is recommended to use goimports to properly generate the output code: The goimports tool can be installed with: If the code blocks in the grammar (see below, section "Code block") are golint- and go vet-compliant, then the resulting generated code will also be golint- and go vet-compliant. The generated code doesn't use any third-party dependency unless code blocks in the grammar require such a dependency. The accepted syntax for the grammar is formally defined in the grammar/pigeon.peg file, using the PEG syntax. What follows is an informal description of this syntax. Identifiers, whitespace, comments and literals follow the same notation as the Go language, as defined in the language specification (http://golang.org/ref/spec#Source_code_representation): The grammar must be Unicode text encoded in UTF-8. New lines are identified by the \n character (U+000A). Space (U+0020), horizontal tabs (U+0009) and carriage returns (U+000D) are considered whitespace and are ignored except to separate tokens. A PEG grammar consists of a set of rules. A rule is an identifier followed by a rule definition operator and an expression. An optional display name - a string literal used in error messages instead of the rule identifier - can be specified after the rule identifier. E.g.: The rule definition operator can be any one of those: A rule is defined by an expression. The following sections describe the various expression types. Expressions can be grouped by using parentheses, and a rule can be referenced by its identifier in place of an expression. The choice expression is a list of expressions that will be tested in the order they are defined. The first one that matches will be used. Expressions are separated by the forward slash character "/". E.g.: Because the first match is used, it is important to think about the order of expressions. For example, in this rule, "<=" would never be used because the "<" expression comes first: The sequence expression is a list of expressions that must all match in that same order for the sequence expression to be considered a match. Expressions are separated by whitespace. E.g.: A labeled expression consists of an identifier followed by a colon ":" and an expression. A labeled expression introduces a variable named with the label that can be referenced in the code blocks in the same scope. The variable will have the value of the expression that follows the colon. E.g.: The variable is typed as an empty interface, and the underlying type depends on the following: For terminals (character and string literals, character classes and the any matcher), the value is []byte. E.g.: For predicates (& and !), the value is always nil. E.g.: For a sequence, the value is a slice of empty interfaces, one for each expression value in the sequence. The underlying types of each value in the slice follow the same rules described here, recursively. E.g.: For a repetition (+ and *), the value is a slice of empty interfaces, one for each repetition. The underlying types of each value in the slice follow the same rules described here, recursively. E.g.: For a choice expression, the value is that of the matching choice. E.g.: For the optional expression (?), the value is nil or the value of the expression. E.g.: Of course, the type of the value can be anything once an action code block is used. E.g.: An expression prefixed with the ampersand "&" is the "and" predicate expression: it is considered a match if the following expression is a match, but it does not consume any input. An expression prefixed with the exclamation point "!" is the "not" predicate expression: it is considered a match if the following expression is not a match, but it does not consume any input. E.g.: The expression following the & and ! operators can be a code block. In that case, the code block must return a bool and an error. The operator's semantic is the same, & is a match if the code block returns true, ! is a match if the code block returns false. The code block has access to any labeled value defined in its scope. E.g.: An expression followed by "*", "?" or "+" is a match if the expression occurs zero or more times ("*"), zero or one time "?" or one or more times ("+") respectively. The match is greedy, it will match as many times as possible. E.g. A literal matcher tries to match the input against a single character or a string literal. The literal may be a single-quoted single character, a double-quoted string or a backtick-quoted raw string. The same rules as in Go apply regarding the allowed characters and escapes. The literal may be followed by a lowercase "i" (outside the ending quote) to indicate that the match is case-insensitive. E.g.: A character class matcher tries to match the input against a class of characters inside square brackets "[...]". Inside the brackets, characters represent themselves and the same escapes as in string literals are available, except that the single- and double-quote escape is not valid, instead the closing square bracket "]" must be escaped to be used. Character ranges can be specified using the "[a-z]" notation. Unicode classes can be specified using the "[\pL]" notation, where L is a single-letter Unicode class of characters, or using the "[\p{Class}]" notation where Class is a valid Unicode class (e.g. "Latin"). As for string literals, a lowercase "i" may follow the matcher (outside the ending square bracket) to indicate that the match is case-insensitive. A "^" as first character inside the square brackets indicates that the match is inverted (it is a match if the input does not match the character class matcher). E.g.: The any matcher is represented by the dot ".". It matches any character except the end of file, thus the "!." expression is used to indicate "match the end of file". E.g.: Code blocks can be added to generate custom Go code. There are three kinds of code blocks: the initializer, the action and the predicate. All code blocks appear inside curly braces "{...}". The initializer must appear first in the grammar, before any rule. It is copied as-is (minus the wrapping curly braces) at the top of the generated parser. It may contain function declarations, types, variables, etc. just like any Go file. Every symbol declared here will be available to all other code blocks. Although the initializer is optional in a valid grammar, it is usually required to generate a valid Go source code file (for the package clause). E.g.: Action code blocks are code blocks declared after an expression in a rule. Those code blocks are turned into a method on the "*current" type in the generated source code. The method receives any labeled expression's value as argument (as interface{}) and must return two values, the first being the value of the expression (an interface{}), and the second an error. If a non-nil error is returned, it is added to the list of errors that the parser will return. E.g.: Predicate code blocks are code blocks declared immediately after the and "&" or the not "!" operators. Like action code blocks, predicate code blocks are turned into a method on the "*current" type in the generated source code. The method receives any labeled expression's value as argument (as interface{}) and must return two values, the first being a bool and the second an error. If a non-nil error is returned, it is added to the list of errors that the parser will return. E.g.: The current type is a struct that provides two useful fields that can be accessed in action and predicate code blocks: "pos" and "text". The "pos" field indicates the current position of the parser in the source input. It is itself a struct with three fields: "line", "col" and "offset". Line is a 1-based line number, col is a 1-based column number that counts runes from the start of the line, and offset is a 0-based byte offset. The "text" field is the slice of bytes of the current match. It is empty in a predicate code block. The parser generated by pigeon exports a few symbols so that it can be used as a package with public functions to parse input text. The exported API is: See the godoc page of the generated parser for the test/predicates grammar for an example documentation page of the exported API: http://godoc.org/github.com/mna/pigeon/test/predicates. Like the grammar used to generate the parser, the input text must be UTF-8-encoded Unicode. The start rule of the parser is the first rule in the PEG grammar used to generate the parser. A call to any of the Parse* functions returns the value generated by executing the grammar on the provided input text, and an optional error. Typically, the grammar should generate some kind of abstract syntax tree (AST), but for simple grammars it may evaluate the result immediately, such as in the examples/calculator example. There are no constraints imposed on the author of the grammar, it can return whatever is needed. When the parser returns a non-nil error, the error is always of type errList, which is defined as a slice of errors ([]error). Each error in the list is of type *parserError. This is a struct that has an "Inner" field that can be used to access the original error. So if a code block returns some well-known error like: The original error can be accessed this way: By defaut the parser will continue after an error is returned and will cumulate all errors found during parsing. If the grammar reaches a point where it shouldn't continue, a panic statement can be used to terminate parsing. The panic will be caught at the top-level of the Parse* call and will be converted into a *parserError like any error, and an errList will still be returned to the caller. The divide by zero error in the examples/calculator grammar leverages this feature (no special code is needed to handle division by zero, if it happens, the runtime panics and it is recovered and returned as a parsing error). Providing good error reporting in a parser is not a trivial task. Part of it is provided by the pigeon tool, by offering features such as filename, position and rule name in the error message, but an important part of good error reporting needs to be done by the grammar author. For example, many programming languages use double-quotes for string literals. Usually, if the opening quote is found, the closing quote is expected, and if none is found, there won't be any other rule that will match, there's no need to backtrack and try other choices, an error should be added to the list and the match should be consumed. In order to do this, the grammar can look something like this: This is just one example, but it illustrates the idea that error reporting needs to be thought out when designing the grammar. Generated parsers have user-provided code mixed with pigeon code in the same package, so there is no package boundary in the resulting code to prevent access to unexported symbols. What is meant to be implementation details in pigeon is also available to user code - which doesn't mean it should be used. For this reason, it is important to precisely define what is intended to be the supported API of pigeon, the parts that will be stable in future versions. The "stability" of the API attempts to make a similar guarantee as the Go 1 compatibility [5]. The following lists what part of the current pigeon code falls under that guarantee (features may be added in the future): The pigeon command-line flags and arguments: those will not be removed and will maintain the same semantics. The explicitly exported API generated by pigeon. See [6] for the documentation of this API on a generated parser. The PEG syntax, as documented above. The code blocks (except the initializer) will always be generated as methods on the *current type, and this type is guaranteed to have the fields pos (type position) and text (type []byte). There are no guarantees on other fields and methods of this type. The position type will always have the fields line, col and offset, all defined as int. There are no guarantees on other fields and methods of this type. The type of the error value returned by the Parse* functions, when not nil, will always be errList defined as a []error. There are no guarantees on methods of this type, other than the fact it implements the error interface. Individual errors in the errList will always be of type *parserError, and this type is guaranteed to have an Inner field that contains the original error value. There are no guarantees on other fields and methods of this type. References:

v1.2.1 • 2 years ago

github.com/cadence-workflow/cadence-go-client

Package cadence and its subdirectories contain the Cadence client side framework. The Cadence service is a task orchestrator for your application’s tasks. Applications using Cadence can execute a logical flow of tasks, especially long-running business logic, asynchronously or synchronously. They can also scale at runtime on distributed systems. A quick example illustrates its use case. Consider Uber Eats where Cadence manages the entire business flow from placing an order, accepting it, handling shopping cart processes (adding, updating, and calculating cart items), entering the order in a pipeline (for preparing food and coordinating delivery), to scheduling delivery as well as handling payments. Cadence consists of a programming framework (or client library) and a managed service (or backend). The framework enables developers to author and coordinate tasks in Go code. The root cadence package contains common data structures. The subpackages are: The Cadence hosted service brokers and persists events generated during workflow execution. Worker nodes owned and operated by customers execute the coordination and task logic. To facilitate the implementation of worker nodes Cadence provides a client-side library for the Go language. In Cadence, you can code the logical flow of events separately as a workflow and code business logic as activities. The workflow identifies the activities and sequences them, while an activity executes the logic. Dynamic workflow execution graphs - Determine the workflow execution graphs at runtime based on the data you are processing. Cadence does not pre-compute the execution graphs at compile time or at workflow start time. Therefore, you have the ability to write workflows that can dynamically adjust to the amount of data they are processing. If you need to trigger 10 instances of an activity to efficiently process all the data in one run, but only 3 for a subsequent run, you can do that. Child Workflows - Orchestrate the execution of a workflow from within another workflow. Cadence will return the results of the child workflow execution to the parent workflow upon completion of the child workflow. No polling is required in the parent workflow to monitor status of the child workflow, making the process efficient and fault tolerant. Durable Timers - Implement delayed execution of tasks in your workflows that are robust to worker failures. Cadence provides two easy to use APIs, **workflow.Sleep** and **workflow.Timer**, for implementing time based events in your workflows. Cadence ensures that the timer settings are persisted and the events are generated even if workers executing the workflow crash. Signals - Modify/influence the execution path of a running workflow by pushing additional data directly to the workflow using a signal. Via the Signal facility, Cadence provides a mechanism to consume external events directly in workflow code. Task routing - Efficiently process large amounts of data using a Cadence workflow, by caching the data locally on a worker and executing all activities meant to process that data on that same worker. Cadence enables you to choose the worker you want to execute a certain activity by scheduling that activity execution in the worker's specific task-list. Unique workflow ID enforcement - Use business entity IDs for your workflows and let Cadence ensure that only one workflow is running for a particular entity at a time. Cadence implements an atomic "uniqueness check" and ensures that no race conditions are possible that would result in multiple workflow executions for the same workflow ID. Therefore, you can implement your code to attempt to start a workflow without checking if the ID is already in use, even in the cases where only one active execution per workflow ID is desired. Perpetual/ContinueAsNew workflows - Run periodic tasks as a single perpetually running workflow. With the "ContinueAsNew" facility, Cadence allows you to leverage the "unique workflow ID enforcement" feature for periodic workflows. Cadence will complete the current execution and start the new execution atomically, ensuring you get to keep your workflow ID. By starting a new execution Cadence also ensures that workflow execution history does not grow indefinitely for perpetual workflows. At-most once activity execution - Execute non-idempotent activities as part of your workflows. Cadence will not automatically retry activities on failure. For every activity execution Cadence will return a success result, a failure result, or a timeout to the workflow code and let the workflow code determine how each one of those result types should be handled. Asynch Activity Completion - Incorporate human input or thrid-party service asynchronous callbacks into your workflows. Cadence allows a workflow to pause execution on an activity and wait for an external actor to resume it with a callback. During this pause the activity does not have any actively executing code, such as a polling loop, and is merely an entry in the Cadence datastore. Therefore, the workflow is unaffected by any worker failures happening over the duration of the pause. Activity Heartbeating - Detect unexpected failures/crashes and track progress in long running activities early. By configuring your activity to report progress periodically to the Cadence server, you can detect a crash that occurs 10 minutes into an hour-long activity execution much sooner, instead of waiting for the 60-minute execution timeout. The recorded progress before the crash gives you sufficient information to determine whether to restart the activity from the beginning or resume it from the point of failure. Timeouts for activities and workflow executions - Protect against stuck and unresponsive activities and workflows with appropriate timeout values. Cadence requires that timeout values are provided for every activity or workflow invocation. There is no upper bound on the timeout values, so you can set timeouts that span days, weeks, or even months. Visibility - Get a list of all your active and/or completed workflow. Explore the execution history of a particular workflow execution. Cadence provides a set of visibility APIs that allow you, the workflow owner, to monitor past and current workflow executions. Debuggability - Replay any workflow execution history locally under a debugger. The Cadence client library provides an API to allow you to capture a stack trace from any failed workflow execution history.

v1.3.0 • 2 weeks ago

go.deanishe.net/awgo

Package aw is a utility library/framework for Alfred 3 workflows https://www.alfredapp.com/ It provides APIs for interacting with Alfred (e.g. Script Filter feedback) and the workflow environment (variables, caches, settings). NOTE: AwGo is currently in development. The API *will* change and should not be considered stable until v1.0. Until then, vendoring AwGo (e.g. with dep or vgo) is strongly recommended. As of AwGo 0.14, all applicable features of Alfred 3.6 are supported. The main features are: Typically, you'd call your program's main entry point via Run(). This way, the library will rescue any panic, log the stack trace and show an error message to the user in Alfred. In the Script box (Language = "/bin/bash"): To generate results for Alfred to show in a Script Filter, use the feedback API of Workflow: You can set workflow variables (via feedback) with Workflow.Var, Item.Var and Modifier.Var. See Workflow.SendFeedback for more documentation. Alfred requires a different JSON format if you wish to set workflow variables. Use the ArgVars (named for its equivalent element in Alfred) struct to generate output from Run Script actions. Be sure to set TextErrors to true to prevent Workflow from generating Alfred JSON if it catches a panic: See ArgVars for more information. New() creates a *Workflow using the default values and workflow settings read from environment variables set by Alfred. You can change defaults by passing one or more Options to New(). If you do not want to use Alfred's environment variables, or they aren't set (i.e. you're not running the code in Alfred), you must pass an Env as the first Option to New() using CustomEnv(). A Workflow can be re-configured later using its Configure() method. Check out the _examples/ subdirectory for some simple, but complete, workflows which you can copy to get started. See the documentation for Option for more information on configuring a Workflow. AwGo can filter Script Filter feedback using a Sublime Text-like fuzzy matching algorithm. Workflow.Filter() sorts feedback Items against the provided query, removing those that do not match. Sorting is performed by subpackage fuzzy via the fuzzy.Sortable interface. See _examples/fuzzy for a basic demonstration. See _examples/bookmarks for a demonstration of implementing fuzzy.Sortable on your own structs and customising the fuzzy sort settings. AwGo automatically configures the default log package to write to STDERR (Alfred's debugger) and a log file in the workflow's cache directory. The log file is necessary because background processes aren't connected to Alfred, so their output is only visible in the log. It is rotated when it exceeds 1 MiB in size. One previous log is kept. AwGo detects when Alfred's debugger is open (Workflow.Debug() returns true) and in this case prepends filename:linenumber: to log messages. The Config struct (which is included in Workflow as Workflow.Config) provides an interface to the workflow's settings from the Workflow Environment Variables panel. https://www.alfredapp.com/help/workflows/advanced/variables/#environment Alfred exports these settings as environment variables, and you can read them ad-hoc with the Config.Get*() methods, and save values back to Alfred with Config.Set(). Using Config.To() and Config.From(), you can "bind" your own structs to the settings in Alfred: And to save a struct's fields to the workflow's settings in Alfred: See the documentation for Config.To and Config.From for more information, and _examples/settings for a demo workflow based on the API. The Alfred struct provides methods for the rest of Alfred's AppleScript API. Amongst other things, you can use it to tell Alfred to open, to search for a query, or to browse/action files & directories. See documentation of the Alfred struct for more information. AwGo provides a basic, but useful, API for loading and saving data. In addition to reading/writing bytes and marshalling/unmarshalling to/from JSON, the API can auto-refresh expired cache data. See Cache and Session for the API documentation. Workflow has three caches tied to different directories: These all share the same API. The difference is in when the data go away. Data saved with Session are deleted after the user closes Alfred or starts using a different workflow. The Cache directory is in a system cache directory, so may be deleted by the system or "System Maintenance" tools. The Data directory lives with Alfred's application data and would not normally be deleted. Subpackage util provides several functions for running script files and snippets of AppleScript/JavaScript code. See util for documentation and examples. AwGo offers a simple API to start/stop background processes via Workflow's RunInBackground(), IsRunning() and Kill() methods. This is useful for running checks for updates and other jobs that hit the network or take a significant amount of time to complete, allowing you to keep your Script Filters extremely responsive. See _examples/update and _examples/workflows for demonstrations of this API.

v0.29.1 • 4 years ago

github.com/scylladb-solutions/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. If the query is LWT and the configured RetryPolicy additionally implements LWTRetryPolicy interface, then the policy will be cast to LWTRetryPolicy and used this way. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v1.15.0 • 12 months ago

github.com/karolorg/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. If the query is LWT and the configured RetryPolicy additionally implements LWTRetryPolicy interface, then the policy will be cast to LWTRetryPolicy and used this way. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v1.15.0 • 12 months ago

github.com/karol-kokoszka/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. If the query is LWT and the configured RetryPolicy additionally implements LWTRetryPolicy interface, then the policy will be cast to LWTRetryPolicy and used this way. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v1.17.0 • 12 months ago

github.com/dkropachev/gocql/v2

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. If the query is LWT and the configured RetryPolicy additionally implements LWTRetryPolicy interface, then the policy will be cast to LWTRetryPolicy and used this way. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v2.0.1 • 12 months ago

github.com/dkropachev/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). The driver advertises the module name and version in the STARTUP message, so servers are able to detect the version. If you use replace directive in go.mod, the driver will send information about the replacement module instead. When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. The driver can only use token-aware routing for queries where all partition key columns are query parameters. For example, instead of use The DCAwareRoundRobinPolicy can be replaced with RackAwareRoundRobinPolicy, which takes two parameters, datacenter and rack. Instead of dividing hosts with two tiers (local datacenter and remote datacenters) it divides hosts into three (the local rack, the rest of the local datacenter, and everything else). RackAwareRoundRobinPolicy can be combined with TokenAwareHostPolicy in the same way as DCAwareRoundRobinPolicy. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. If the query is LWT and the configured RetryPolicy additionally implements LWTRetryPolicy interface, then the policy will be cast to LWTRetryPolicy and used this way. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v1.17.0 • 12 months ago

github.com/burakson/lingua-go

Package lingua accurately detects the natural language of written text, be it long or short. Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing applications. In cases where you don't need the full-fledged functionality of those systems or don't want to learn the ropes of those, a small flexible library comes in handy. So far, the only other comprehensive open source library in the Go ecosystem for this task is Whatlanggo (https://github.com/abadojack/whatlanggo). Unfortunately, it has two major drawbacks: 1. Detection only works with quite lengthy text fragments. For very short text snippets such as Twitter messages, it does not provide adequate results. 2. The more languages take part in the decision process, the less accurate are the detection results. Lingua aims at eliminating these problems. It nearly does not need any configuration and yields pretty accurate results on both long and short text, even on single words and phrases. It draws on both rule-based and statistical methods but does not use any dictionaries of words. It does not need a connection to any external API or service either. Once the library has been downloaded, it can be used completely offline. Compared to other language detection libraries, Lingua's focus is on quality over quantity, that is, getting detection right for a small set of languages first before adding new ones. Currently, 75 languages are supported. They are listed as variants of type Language. Lingua is able to report accuracy statistics for some bundled test data available for each supported language. The test data for each language is split into three parts: 1. a list of single words with a minimum length of 5 characters 2. a list of word pairs with a minimum length of 10 characters 3. a list of complete grammatical sentences of various lengths Both the language models and the test data have been created from separate documents of the Wortschatz corpora (https://wortschatz.uni-leipzig.de) offered by Leipzig University, Germany. Data crawled from various news websites have been used for training, each corpus comprising one million sentences. For testing, corpora made of arbitrarily chosen websites have been used, each comprising ten thousand sentences. From each test corpus, a random unsorted subset of 1000 single words, 1000 word pairs and 1000 sentences has been extracted, respectively. Given the generated test data, I have compared the detection results of Lingua, and Whatlanggo running over the data of Lingua's supported 75 languages. Additionally, I have added Google's CLD3 (https://github.com/google/cld3/) to the comparison with the help of the gocld3 bindings (https://github.com/jmhodges/gocld3). Languages that are not supported by CLD3 or Whatlanggo are simply ignored during the detection process. Lingua clearly outperforms its contenders. Every language detector uses a probabilistic n-gram (https://en.wikipedia.org/wiki/N-gram) model trained on the character distribution in some training corpus. Most libraries only use n-grams of size 3 (trigrams) which is satisfactory for detecting the language of longer text fragments consisting of multiple sentences. For short phrases or single words, however, trigrams are not enough. The shorter the input text is, the less n-grams are available. The probabilities estimated from such few n-grams are not reliable. This is why Lingua makes use of n-grams of sizes 1 up to 5 which results in much more accurate prediction of the correct language. A second important difference is that Lingua does not only use such a statistical model, but also a rule-based engine. This engine first determines the alphabet of the input text and searches for characters which are unique in one or more languages. If exactly one language can be reliably chosen this way, the statistical model is not necessary anymore. In any case, the rule-based engine filters out languages that do not satisfy the conditions of the input text. Only then, in a second step, the probabilistic n-gram model is taken into consideration. This makes sense because loading less language models means less memory consumption and better runtime performance. In general, it is always a good idea to restrict the set of languages to be considered in the classification process using the respective api methods. If you know beforehand that certain languages are never to occur in an input text, do not let those take part in the classifcation process. The filtering mechanism of the rule-based engine is quite good, however, filtering based on your own knowledge of the input text is always preferable. There might be classification tasks where you know beforehand that your language data is definitely not written in Latin, for instance. The detection accuracy can become better in such cases if you exclude certain languages from the decision process or just explicitly include relevant languages. Knowing about the most likely language is nice but how reliable is the computed likelihood? And how less likely are the other examined languages in comparison to the most likely one? In the example below, a slice of ConfidenceValue is returned containing those languages which the calling instance of LanguageDetector has been built from. The entries are sorted by their confidence value in descending order. Each value is a probability between 0.0 and 1.0. The probabilities of all languages will sum to 1.0. If the language is unambiguously identified by the rule engine, the value 1.0 will always be returned for this language. The other languages will receive a value of 0.0. By default, Lingua uses lazy-loading to load only those language models on demand which are considered relevant by the rule-based filter engine. For web services, for instance, it is rather beneficial to preload all language models into memory to avoid unexpected latency while waiting for the service response. If you want to enable the eager-loading mode, you can do it as seen below. Multiple instances of LanguageDetector share the same language models in memory which are accessed asynchronously by the instances. By default, Lingua returns the most likely language for a given input text. However, there are certain words that are spelled the same in more than one language. The word `prologue`, for instance, is both a valid English and French word. Lingua would output either English or French which might be wrong in the given context. For cases like that, it is possible to specify a minimum relative distance that the logarithmized and summed up probabilities for each possible language have to satisfy. It can be stated as seen below. Be aware that the distance between the language probabilities is dependent on the length of the input text. The longer the input text, the larger the distance between the languages. So if you want to classify very short text phrases, do not set the minimum relative distance too high. Otherwise Unknown will be returned most of the time as in the example below. This is the return value for cases where language detection is not reliably possible.

v1.4.1 • last year

13 4 5

Language Detection

github.com/BurNTSusHI/toml

github.com/apache/cassandra-gocql-driver/v2

github.com/bramburn/gnssgo/pkg/gnssgo

github.com/BurntSushi/TOML

d.zyszy.best/rylans/getlang

github.com/olric-data/olric

github.com/rainycape/cld2

github.com/boomhut/getlang

github.com/localrivet/gomcp

github.com/culturedsen/phony

github.com/simplepupil/phony

github.com/stunningome/phony

github.com/smithy-security/pkg/detect-repo-languages

github.com/akfaew/whatlanggo

cmd/go

cool-lake.m-2023.workers.dev/neurosnap/sentences

github.com/hungtorus/cassandra-gocql-driver

github.com/arceliar/phony

github.com/BurntsUSHI/toml

github.com/AdityaCyberSafe/phony

github.com/tochemey/olric

github.xufengfeng.xyz/neurosnap/sentences

github.com/hasty/pigeon

github.com/cadence-workflow/cadence-go-client

go.deanishe.net/awgo

github.com/inigosola/whatlanggo

github.com/alecthomas/pigeon

go.temporal.io/sdk/temporal

github.hscsec.cn/rylans/getlang

github.com/eltaline/toml

github.com/RadhiFadlillah/whatlanggo

github.com/scylladb-solutions/gocql

github.com/karolorg/gocql

github.com/karol-kokoszka/gocql

github.com/dkropachev/gocql/v2

github.com/openshift/configuration-anomaly-detection/cadctl

github.com/dkropachev/gocql

e.coding.net/g-nnjn4981/aito/aws-sdk-go-v2/service/comprehendmedical

e.coding.net/g-nnjn4981/aito/aws-sdk-go-v2/service/connectcontactlens

github.com/npillmayer/uax

e.coding.net/g-nnjn4981/aito/aws-sdk-go-v2/service/qconnect

e.coding.net/g-nnjn4981/aito/aws-sdk-go-v2/service/neptune

54hub.com/arkdep/toml

github.com/endeveit/guesslanguage

github.com/generaltso/linguist

github.com/limetext/lime-backend

github.com/zhangbo2008/nlp_analyse_language_bygo

github.com/jekans/go_language_detect

github.com/burakson/lingua-go

github.com/chef/toml