Language Detection

github.com/boyvanduuren/ql

Package ql implements a pure Go embedded SQL database engine. QL is a member of the SQL family of languages. It is less complex and less powerful than SQL (whichever specification SQL is considered to be). 2017-01-10: Release v1.1.0 fixes some bugs and adds a configurable WAL headroom. 2016-07-29: Release v1.0.6 enables alternatively using = instead of == for equality operation. 2016-07-11: Release v1.0.5 undoes vendoring of lldb. QL now uses stable lldb (github.com/cznic/lldb). 2016-07-06: Release v1.0.4 fixes a panic when closing the WAL file. 2016-04-03: Release v1.0.3 fixes a data race. 2016-03-23: Release v1.0.2 vendors github.com/cznic/exp/lldb and github.com/camlistore/go4/lock. 2016-03-17: Release v1.0.1 adjusts for latest goyacc. Parser error messages are improved and changed, but their exact form is not considered a API change. 2016-03-05: The current version has been tagged v1.0.0. 2015-06-15: To improve compatibility with other SQL implementations, the count built-in aggregate function now accepts * as its argument. 2015-05-29: The execution planner was rewritten from scratch. It should use indices in all places where they were used before plus in some additional situations. It is possible to investigate the plan using the newly added EXPLAIN statement. The QL tool is handy for such analysis. If the planner would have used an index, but no such exists, the plan includes hints in form of copy/paste ready CREATE INDEX statements. The planner is still quite simple and a lot of work on it is yet ahead. You can help this process by filling an issue with a schema and query which fails to use an index or indices when it should, in your opinion. Bonus points for including output of `ql 'explain <query>'`. 2015-05-09: The grammar of the CREATE INDEX statement now accepts an expression list instead of a single expression, which was further limited to just a column name or the built-in id(). As a side effect, composite indices are now functional. However, the values in the expression-list style index are not yet used by other statements or the statement/query planner. The composite index is useful while having UNIQUE clause to check for semantically duplicate rows before they get added to the table or when such a row is mutated using the UPDATE statement and the expression-list style index tuple of the row is thus recomputed. 2015-05-02: The Schema field of table __Table now correctly reflects any column constraints and/or defaults. Also, the (*DB).Info method now has that information provided in new ColumInfo fields NotNull, Constraint and Default. 2015-04-20: Added support for {LEFT,RIGHT,FULL} [OUTER] JOIN. 2015-04-18: Column definitions can now have constraints and defaults. Details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. 2015-03-06: New built-in functions formatFloat and formatInt. Thanks urandom! (https://github.com/urandom) 2015-02-16: IN predicate now accepts a SELECT statement. See the updated "Predicates" section. 2015-01-17: Logical operators || and && have now alternative spellings: OR and AND (case insensitive). AND was a keyword before, but OR is a new one. This can possibly break existing queries. For the record, it's a good idea to not use any name appearing in, for example, [7] in your queries as the list of QL's keywords may expand for gaining better compatibility with existing SQL "standards". 2015-01-12: ACID guarantees were tightened at the cost of performance in some cases. The write collecting window mechanism, a formerly used implementation detail, was removed. Inserting rows one by one in a transaction is now slow. I mean very slow. Try to avoid inserting single rows in a transaction. Instead, whenever possible, perform batch updates of tens to, say thousands of rows in a single transaction. See also: http://www.sqlite.org/faq.html#q19, the discussed synchronization principles involved are the same as for QL, modulo minor details. Note: A side effect is that closing a DB before exiting an application, both for the Go API and through database/sql driver, is no more required, strictly speaking. Beware that exiting an application while there is an open (uncommitted) transaction in progress means losing the transaction data. However, the DB will not become corrupted because of not closing it. Nor that was the case before, but formerly failing to close a DB could have resulted in losing the data of the last transaction. 2014-09-21: id() now optionally accepts a single argument - a table name. 2014-09-01: Added the DB.Flush() method and the LIKE pattern matching predicate. 2014-08-08: The built in functions max and min now accept also time values. Thanks opennota! (https://github.com/opennota) 2014-06-05: RecordSet interface extended by new methods FirstRow and Rows. 2014-06-02: Indices on id() are now used by SELECT statements. 2014-05-07: Introduction of Marshal, Schema, Unmarshal. 2014-04-15: Added optional IF NOT EXISTS clause to CREATE INDEX and optional IF EXISTS clause to DROP INDEX. 2014-04-12: The column Unique in the virtual table __Index was renamed to IsUnique because the old name is a keyword. Unfortunately, this is a breaking change, sorry. 2014-04-11: Introduction of LIMIT, OFFSET. 2014-04-10: Introduction of query rewriting. 2014-04-07: Introduction of indices. QL imports zappy[8], a block-based compressor, which speeds up its performance by using a C version of the compression/decompression algorithms. If a CGO-free (pure Go) version of QL, or an app using QL, is required, please include 'purego' in the -tags option of go {build,get,install}. For example: If zappy was installed before installing QL, it might be necessary to rebuild zappy first (or rebuild QL with all its dependencies using the -a option): The syntax is specified using Extended Backus-Naur Form (EBNF) Lower-case production names are used to identify lexical tokens. Non-terminals are in CamelCase. Lexical tokens are enclosed in double quotes "" or back quotes “. The form a … b represents the set of characters from a through b as alternatives. The horizontal ellipsis … is also used elsewhere in the spec to informally denote various enumerations or code snippets that are not further specified. QL source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points. For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text. Each code point is distinct; for instance, upper and lower case letters are different characters. Implementation restriction: For compatibility with other tools, the parser may disallow the NUL character (U+0000) in the statement. Implementation restriction: A byte order mark is disallowed anywhere in QL statements. The following terms are used to denote specific character classes The underscore character _ (U+005F) is considered a letter. Lexical elements are comments, tokens, identifiers, keywords, operators and delimiters, integer, floating-point, imaginary, rune and string literals and QL parameters. Line comments start with the character sequence // or -- and stop at the end of the line. A line comment acts like a space. General comments start with the character sequence /* and continue through the character sequence */. A general comment acts like a space. Comments do not nest. Tokens form the vocabulary of QL. There are four classes: identifiers, keywords, operators and delimiters, and literals. White space, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. The formal grammar uses semicolons ";" as separators of QL statements. A single QL statement or the last QL statement in a list of statements can have an optional semicolon terminator. (Actually a separator from the following empty statement.) Identifiers name entities such as tables or record set columns. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter. For example No identifiers are predeclared, however note that no keyword can be used as an identifier. Identifiers starting with two underscores are used for meta data virtual tables names. For forward compatibility, users should generally avoid using any identifiers starting with two underscores. For example The following keywords are reserved and may not be used as identifiers. Keywords are not case sensitive. The following character sequences represent operators, delimiters, and other special tokens Operators consisting of more than one character are referred to by names in the rest of the documentation An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0 for octal, 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15. For example A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided. For example An imaginary literal is a decimal representation of the imaginary part of a complex constant. It consists of a floating-point literal or decimal integer followed by the lower-case letter i. For example A rune literal represents a rune constant, an integer value identifying a Unicode code point. A rune literal is expressed as one or more characters enclosed in single quotes. Within the quotes, any character may appear except single quote and newline. A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats. The simplest form represents the single character within the quotes; since QL statements are Unicode characters encoded in UTF-8, multiple UTF-8-encoded bytes may represent a single integer value. For instance, the literal 'a' holds a single byte representing a literal a, Unicode U+0061, value 0x61, while 'ä' holds two bytes (0xc3 0xa4) representing a literal a-dieresis, U+00E4, value 0xe4. Several backslash escapes allow arbitrary values to be encoded as ASCII text. There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base. Although these representations all result in an integer, they have different valid ranges. Octal escapes must represent a value between 0 and 255 inclusive. Hexadecimal escapes satisfy this condition by construction. The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves. After a backslash, certain single-character escapes represent special values All other sequences starting with a backslash are illegal inside rune literals. For example A string literal represents a string constant obtained from concatenating a sequence of characters. There are two forms: raw string literals and interpreted string literals. Raw string literals are character sequences between back quotes “. Within the quotes, any character is legal except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines. Carriage returns inside raw string literals are discarded from the raw string value. Interpreted string literals are character sequences between double quotes "". The text between the quotes, which may not contain newlines, forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF. For example These examples all represent the same string If the statement source represents a character as two code points, such as a combining form involving an accent and a letter, the result will be an error if placed in a rune literal (it is not a single code point), and will appear as two code points if placed in a string literal. Literals are assigned their values from the respective text representation at "compile" (parse) time. QL parameters provide the same functionality as literals, but their value is assigned at execution time from an expression list passed to DB.Run or DB.Execute. Using '?' or '$' is completely equivalent. For example Keywords 'false' and 'true' (not case sensitive) represent the two possible constant values of type bool (also not case sensitive). Keyword 'NULL' (not case sensitive) represents an untyped constant which is assignable to any type. NULL is distinct from any other value of any type. A type determines the set of values and operations specific to values of that type. A type is specified by a type name. Named instances of the boolean, numeric, and string types are keywords. The names are not case sensitive. Note: The blob type is exchanged between the back end and the API as []byte. On 32 bit platforms this limits the size which the implementation can handle to 2G. A boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false. The predeclared boolean type is bool. A duration type represents the elapsed time between two instants as an int64 nanosecond count. The representation limits the largest representable duration to approximately 290 years. A numeric type represents sets of integer or floating-point values. The predeclared architecture-independent numeric types are The value of an n-bit integer is n bits wide and represented using two's complement arithmetic. Conversions are required when different numeric types are mixed in an expression or assignment. A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. The case insensitive keyword for the string type is 'string'. The length of a string (its size in bytes) can be discovered using the built-in function len. A time type represents an instant in time with nanosecond precision. Each time has associated with it a location, consulted when computing the presentation form of the time. The following functions are implicitly declared An expression specifies the computation of a value by applying operators and functions to operands. Operands denote the elementary values in an expression. An operand may be a literal, a (possibly qualified) identifier denoting a constant or a function or a table/record set column, or a parenthesized expression. A qualified identifier is an identifier qualified with a table/record set name prefix. For example Primary expression are the operands for unary and binary expressions. For example A primary expression of the form denotes the element of a string indexed by x. Its type is byte. The value x is called the index. The following rules apply - The index x must be of integer type except bigint or duration; it is in range if 0 <= x < len(s), otherwise it is out of range. - A constant index must be non-negative and representable by a value of type int. - A constant index must be in range if the string a is a literal. - If x is out of range at run time, a run-time error occurs. - s[x] is the byte at index x and the type of s[x] is byte. If s is NULL or x is NULL then the result is NULL. Otherwise s[x] is illegal. For a string, the primary expression constructs a substring. The indices low and high select which elements appear in the result. The result has indices starting at 0 and length equal to high - low. For convenience, any of the indices may be omitted. A missing low index defaults to zero; a missing high index defaults to the length of the sliced operand The indices low and high are in range if 0 <= low <= high <= len(a), otherwise they are out of range. A constant index must be non-negative and representable by a value of type int. If both indices are constant, they must satisfy low <= high. If the indices are out of range at run time, a run-time error occurs. Integer values of type bigint or duration cannot be used as indices. If s is NULL the result is NULL. If low or high is not omitted and is NULL then the result is NULL. Given an identifier f denoting a predeclared function, calls f with arguments a1, a2, … an. Arguments are evaluated before the function is called. The type of the expression is the result type of f. In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return value of the function is passed by value when the function returns. Calling an undefined function causes a compile-time error. Operators combine operands into expressions. Comparisons are discussed elsewhere. For other binary operators, the operand types must be identical unless the operation involves shifts or untyped constants. For operations involving constants only, see the section on constant expressions. Except for shift operations, if one operand is an untyped constant and the other operand is not, the constant is converted to the type of the other operand. The right operand in a shift expression must have unsigned integer type or be an untyped constant that can be converted to unsigned integer type. If the left operand of a non-constant shift expression is an untyped constant, the type of the constant is what it would be if the shift expression were replaced by its left operand alone. Expressions of the form yield a boolean value true if expr2, a regular expression, matches expr1 (see also [6]). Both expression must be of type string. If any one of the expressions is NULL the result is NULL. Predicates are special form expressions having a boolean result type. Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be comparable as defined in "Comparison operators". Another form of the IN predicate creates the expression list from a result of a SelectStmt. The SelectStmt must select only one column. The produced expression list is resource limited by the memory available to the process. NULL values produced by the SelectStmt are ignored, but if all records of the SelectStmt are NULL the predicate yields NULL. The select statement is evaluated only once. If the type of expr is not the same as the type of the field returned by the SelectStmt then the set operation yields false. The type of the column returned by the SelectStmt must be one of the simple (non blob-like) types: Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be ordered as defined in "Comparison operators". Expressions of the form yield a boolean value true if expr does not have a specific type (case A) or if expr has a specific type (case B). In other cases the result is a boolean value false. Unary operators have the highest precedence. There are five precedence levels for binary operators. Multiplication operators bind strongest, followed by addition operators, comparison operators, && (logical AND), and finally || (logical OR) Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z. Note that the operator precedence is reflected explicitly by the grammar. Arithmetic operators apply to numeric values and yield a result of the same type as the first operand. The four standard arithmetic operators (+, -, *, /) apply to integer, rational, floating-point, and complex types; + also applies to strings; +,- also applies to times. All other arithmetic operators apply to integers only. sum integers, rationals, floats, complex values, strings difference integers, rationals, floats, complex values, times product integers, rationals, floats, complex values / quotient integers, rationals, floats, complex values % remainder integers & bitwise AND integers | bitwise OR integers ^ bitwise XOR integers &^ bit clear (AND NOT) integers << left shift integer << unsigned integer >> right shift integer >> unsigned integer Strings can be concatenated using the + operator String addition creates a new string by concatenating the operands. A value of type duration can be added to or subtracted from a value of type time. Times can subtracted from each other producing a value of type duration. For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships with x / y truncated towards zero ("truncated division"). As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0). If the divisor is a constant expression, it must not be zero. If the divisor is zero at run time, a run-time error occurs. If the dividend is non-negative and the divisor is a constant power of 2, the division may be replaced by a right shift, and computing the remainder may be replaced by a bitwise AND operation The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. There is no upper limit on the shift count. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity. For integer operands, the unary operators +, -, and ^ are defined as follows For floating-point and complex numbers, +x is the same as x, while -x is the negation of x. The result of a floating-point or complex division by zero is not specified beyond the IEEE-754 standard; whether a run-time error occurs is implementation-specific. Whenever any operand of any arithmetic operation, unary or binary, is NULL, as well as in the case of the string concatenating operation, the result is NULL. For unsigned integer values, the operations +, -, *, and << are computed modulo 2n, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and expressions may rely on “wrap around”. For signed integers with a finite bit width, the operations +, -, *, and << may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow. An evaluator may not optimize an expression under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true. Integers of type bigint and rationals do not overflow but their handling is limited by the memory resources available to the program. Comparison operators compare two operands and yield a boolean value. In any comparison, the first operand must be of same type as is the second operand, or vice versa. The equality operators == and != apply to operands that are comparable. The ordering operators <, <=, >, and >= apply to operands that are ordered. These terms and the result of the comparisons are defined as follows - Boolean values are comparable. Two boolean values are equal if they are either both true or both false. - Complex values are comparable. Two complex values u and v are equal if both real(u) == real(v) and imag(u) == imag(v). - Integer values are comparable and ordered, in the usual way. Note that durations are integers. - Floating point values are comparable and ordered, as defined by the IEEE-754 standard. - Rational values are comparable and ordered, in the usual way. - String values are comparable and ordered, lexically byte-wise. - Time values are comparable and ordered. Whenever any operand of any comparison operation is NULL, the result is NULL. Note that slices are always of type string. Logical operators apply to boolean values and yield a boolean result. The right operand is evaluated conditionally. The truth tables for logical operations with NULL values Conversions are expressions of the form T(x) where T is a type and x is an expression that can be converted to type T. A constant value x can be converted to type T in any of these cases: - x is representable by a value of type T. - x is a floating-point constant, T is a floating-point type, and x is representable by a value of type T after rounding using IEEE 754 round-to-even rules. The constant T(x) is the rounded value. - x is an integer constant and T is a string type. The same rule as for non-constant x applies in this case. Converting a constant yields a typed constant as result. A non-constant value x can be converted to type T in any of these cases: - x has type T. - x's type and T are both integer or floating point types. - x's type and T are both complex types. - x is an integer, except bigint or duration, and T is a string type. Specific rules apply to (non-constant) conversions between numeric types or to and from a string type. These conversions may change the representation of x and incur a run-time cost. All other conversions only change the type but not the representation of x. A conversion of NULL to any type yields NULL. For the conversion of non-constant numeric values, the following rules apply 1. When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. It is then truncated to fit in the result type's size. For example, if v == uint16(0x10F0), then uint32(int8(v)) == 0xFFFFFFF0. The conversion always yields a valid value; there is no indication of overflow. 2. When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero). 3. When converting an integer or floating-point number to a floating-point type, or a complex number to another complex type, the result value is rounded to the precision specified by the destination type. For instance, the value of a variable x of type float32 may be stored using additional precision beyond that of an IEEE-754 32-bit number, but float32(x) represents the result of rounding x's value to 32-bit precision. Similarly, x + 0.1 may use more than 32 bits of precision, but float32(x + 0.1) does not. In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent. 1. Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD". 2. Converting a blob to a string type yields a string whose successive bytes are the elements of the blob. 3. Converting a value of a string type to a blob yields a blob whose successive elements are the bytes of the string. 4. Converting a value of a bigint type to a string yields a string containing the decimal decimal representation of the integer. 5. Converting a value of a string type to a bigint yields a bigint value containing the integer represented by the string value. A prefix of “0x” or “0X” selects base 16; the “0” prefix selects base 8, and a “0b” or “0B” prefix selects base 2. Otherwise the value is interpreted in base 10. An error occurs if the string value is not in any valid format. 6. Converting a value of a rational type to a string yields a string containing the decimal decimal representation of the rational in the form "a/b" (even if b == 1). 7. Converting a value of a string type to a bigrat yields a bigrat value containing the rational represented by the string value. The string can be given as a fraction "a/b" or as a floating-point number optionally followed by an exponent. An error occurs if the string value is not in any valid format. 8. Converting a value of a duration type to a string returns a string representing the duration in the form "72h3m0.5s". Leading zero units are omitted. As a special case, durations less than one second format using a smaller unit (milli-, micro-, or nanoseconds) to ensure that the leading digit is non-zero. The zero duration formats as 0, with no unit. 9. Converting a string value to a duration yields a duration represented by the string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h". 10. Converting a time value to a string returns the time formatted using the format string When evaluating the operands of an expression or of function calls, operations are evaluated in lexical left-to-right order. For example, in the evaluation of the function calls and evaluation of c happen in the order h(), i(), j(), c. Floating-point operations within a single expression are evaluated according to the associativity of the operators. Explicit parentheses affect the evaluation by overriding the default associativity. In the expression x + (y + z) the addition y + z is performed before adding x. Statements control execution. The empty statement does nothing. Alter table statements modify existing tables. With the ADD clause it adds a new column to the table. The column must not exist. With the DROP clause it removes an existing column from a table. The column must exist and it must be not the only (last) column of the table. IOW, there cannot be a table with no columns. For example When adding a column to a table with existing data, the constraint clause of the ColumnDef cannot be used. Adding a constrained column to an empty table is fine. Begin transactions statements introduce a new transaction level. Every transaction level must be eventually balanced by exactly one of COMMIT or ROLLBACK statements. Note that when a transaction is roll-backed because of a statement failure then no explicit balancing of the respective BEGIN TRANSACTION is statement is required nor permitted. Failure to properly balance any opened transaction level may cause dead locks and/or lose of data updated in the uppermost opened but never properly closed transaction level. For example A database cannot be updated (mutated) outside of a transaction. Statements requiring a transaction A database is effectively read only outside of a transaction. Statements not requiring a transaction The commit statement closes the innermost transaction nesting level. If that's the outermost level then the updates to the DB made by the transaction are atomically made persistent. For example Create index statements create new indices. Index is a named projection of ordered values of a table column to the respective records. As a special case the id() of the record can be indexed. Index name must not be the same as any of the existing tables and it also cannot be the same as of any column name of the table the index is on. For example Now certain SELECT statements may use the indices to speed up joins and/or to speed up record set filtering when the WHERE clause is used; or the indices might be used to improve the performance when the ORDER BY clause is present. The UNIQUE modifier requires the indexed values tuple to be index-wise unique or have all values NULL. The optional IF NOT EXISTS clause makes the statement a no operation if the index already exists. A simple index consists of only one expression which must be either a column name or the built-in id(). A more complex and more general index is one that consists of more than one expression or its single expression does not qualify as a simple index. In this case the type of all expressions in the list must be one of the non blob-like types. Note: Blob-like types are blob, bigint, bigrat, time and duration. Create table statements create new tables. A column definition declares the column name and type. Table names and column names are case sensitive. Neither a table or an index of the same name may exist in the DB. For example The optional IF NOT EXISTS clause makes the statement a no operation if the table already exists. The optional constraint clause has two forms. The first one is found in many SQL dialects. This form prevents the data in column DepartmentName to be NULL. The second form allows an arbitrary boolean expression to be used to validate the column. If the value of the expression is true then the validation succeeded. If the value of the expression is false or NULL then the validation fails. If the value of the expression is not of type bool an error occurs. The optional DEFAULT clause is an expression which, if present, is substituted instead of a NULL value when the colum is assigned a value. Note that the constraint and/or default expressions may refer to other columns by name: When a table row is inserted by the INSERT INTO statement or when a table row is updated by the UPDATE statement, the order of operations is as follows: 1. The new values of the affected columns are set and the values of all the row columns become the named values which can be referred to in default expressions evaluated in step 2. 2. If any row column value is NULL and the DEFAULT clause is present in the column's definition, the default expression is evaluated and its value is set as the respective column value. 3. The values, potentially updated, of row columns become the named values which can be referred to in constraint expressions evaluated during step 4. 4. All row columns which definition has the constraint clause present will have that constraint checked. If any constraint violation is detected, the overall operation fails and no changes to the table are made. Delete from statements remove rows from a table, which must exist. For example If the WHERE clause is not present then all rows are removed and the statement is equivalent to the TRUNCATE TABLE statement. Drop index statements remove indices from the DB. The index must exist. For example The optional IF EXISTS clause makes the statement a no operation if the index does not exist. Drop table statements remove tables from the DB. The table must exist. For example The optional IF EXISTS clause makes the statement a no operation if the table does not exist. Insert into statements insert new rows into tables. New rows come from literal data, if using the VALUES clause, or are a result of select statement. In the later case the select statement is fully evaluated before the insertion of any rows is performed, allowing to insert values calculated from the same table rows are to be inserted into. If the ColumnNameList part is omitted then the number of values inserted in the row must be the same as are columns in the table. If the ColumnNameList part is present then the number of values per row must be same as the same number of column names. All other columns of the record are set to NULL. The type of the value assigned to a column must be the same as is the column's type or the value must be NULL. For example If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. Explain statement produces a recordset consisting of lines of text which describe the execution plan of a statement, if any. For example, the QL tool treats the explain statement specially and outputs the joined lines: The explanation may aid in uderstanding how a statement/query would be executed and if indices are used as expected - or which indices may possibly improve the statement performance. The create index statements above were directly copy/pasted in the terminal from the suggestions provided by the filter recordset pipeline part returned by the explain statement. If the statement has nothing special in its plan, the result is the original statement. To get an explanation of the select statement of the IN predicate, use the EXPLAIN statement with that particular select statement. The rollback statement closes the innermost transaction nesting level discarding any updates to the DB made by it. If that's the outermost level then the effects on the DB are as if the transaction never happened. For example The (temporary) record set from the last statement is returned and can be processed by the client. In this case the rollback is the same as 'DROP TABLE tmp;' but it can be a more complex operation. Select from statements produce recordsets. The optional DISTINCT modifier ensures all rows in the result recordset are unique. Either all of the resulting fields are returned ('*') or only those named in FieldList. RecordSetList is a list of table names or parenthesized select statements, optionally (re)named using the AS clause. The result can be filtered using a WhereClause and orderd by the OrderBy clause. For example If Recordset is a nested, parenthesized SelectStmt then it must be given a name using the AS clause if its field are to be accessible in expressions. A field is an named expression. Identifiers, not used as a type in conversion or a function name in the Call clause, denote names of (other) fields, values of which should be used in the expression. The expression can be named using the AS clause. If the AS clause is not present and the expression consists solely of a field name, then that field name is used as the name of the resulting field. Otherwise the field is unnamed. For example The SELECT statement can optionally enumerate the desired/resulting fields in a list. No two identical field names can appear in the list. When more than one record set is used in the FROM clause record set list, the result record set field names are rewritten to be qualified using the record set names. If a particular record set doesn't have a name, its respective fields became unnamed. The optional JOIN clause, for example is mostly equal to except that the rows from a which, when they appear in the cross join, never made expr to evaluate to true, are combined with a virtual row from b, containing all nulls, and added to the result set. For the RIGHT JOIN variant the discussed rules are used for rows from b not satisfying expr == true and the virtual, all-null row "comes" from a. The FULL JOIN adds the respective rows which would be otherwise provided by the separate executions of the LEFT JOIN and RIGHT JOIN variants. For more thorough OUTER JOIN discussion please see the Wikipedia article at [10]. Resultins rows of a SELECT statement can be optionally ordered by the ORDER BY clause. Collating proceeds by considering the expressions in the expression list left to right until a collating order is determined. Any possibly remaining expressions are not evaluated. All of the expression values must yield an ordered type or NULL. Ordered types are defined in "Comparison operators". Collating of elements having a NULL value is different compared to what the comparison operators yield in expression evaluation (NULL result instead of a boolean value). Below, T denotes a non NULL value of any QL type. NULL collates before any non NULL value (is considered smaller than T). Two NULLs have no collating order (are considered equal). The WHERE clause restricts records considered by some statements, like SELECT FROM, DELETE FROM, or UPDATE. It is an error if the expression evaluates to a non null value of non bool type. The GROUP BY clause is used to project rows having common values into a smaller set of rows. For example Using the GROUP BY without any aggregate functions in the selected fields is in certain cases equal to using the DISTINCT modifier. The last two examples above produce the same resultsets. The optional OFFSET clause allows to ignore first N records. For example The above will produce only rows 11, 12, ... of the record set, if they exist. The value of the expression must a non negative integer, but not bigint or duration. The optional LIMIT clause allows to ignore all but first N records. For example The above will return at most the first 10 records of the record set. The value of the expression must a non negative integer, but not bigint or duration. The LIMIT and OFFSET clauses can be combined. For example Considering table t has, say 10 records, the above will produce only records 4 - 8. After returning record #8, no more result rows/records are computed. 1. The FROM clause is evaluated, producing a Cartesian product of its source record sets (tables or nested SELECT statements). 2. If present, the JOIN cluase is evaluated on the result set of the previous evaluation and the recordset specified by the JOIN clause. (... JOIN Recordset ON ...) 3. If present, the WHERE clause is evaluated on the result set of the previous evaluation. 4. If present, the GROUP BY clause is evaluated on the result set of the previous evaluation(s). 5. The SELECT field expressions are evaluated on the result set of the previous evaluation(s). 6. If present, the DISTINCT modifier is evaluated on the result set of the previous evaluation(s). 7. If present, the ORDER BY clause is evaluated on the result set of the previous evaluation(s). 8. If present, the OFFSET clause is evaluated on the result set of the previous evaluation(s). The offset expression is evaluated once for the first record produced by the previous evaluations. 9. If present, the LIMIT clause is evaluated on the result set of the previous evaluation(s). The limit expression is evaluated once for the first record produced by the previous evaluations. Truncate table statements remove all records from a table. The table must exist. For example Update statements change values of fields in rows of a table. For example Note: The SET clause is optional. If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. To allow to query for DB meta data, there exist specially named tables, some of them being virtual. Note: Virtual system tables may have fake table-wise unique but meaningless and unstable record IDs. Do not apply the built-in id() to any system table. The table __Table lists all tables in the DB. The schema is The Schema column returns the statement to (re)create table Name. This table is virtual. The table __Colum lists all columns of all tables in the DB. The schema is The Ordinal column defines the 1-based index of the column in the record. This table is virtual. The table __Colum2 lists all columns of all tables in the DB which have the constraint NOT NULL or which have a constraint expression defined or which have a default expression defined. The schema is It's possible to obtain a consolidated recordset for all properties of all DB columns using The Name column is the column name in TableName. The table __Index lists all indices in the DB. The schema is The IsUnique columns reflects if the index was created using the optional UNIQUE clause. This table is virtual. Built-in functions are predeclared. The built-in aggregate function avg returns the average of values of an expression. Avg ignores NULL values, but returns NULL if all values of a column are NULL or if avg is applied to an empty record set. The column values must be of a numeric type. The built-in function contains returns true if substr is within s. If any argument to contains is NULL the result is NULL. The built-in aggregate function count returns how many times an expression has a non NULL values or the number of rows in a record set. Note: count() returns 0 for an empty record set. For example Date returns the time corresponding to in the appropriate zone for that time in the given location. The month, day, hour, min, sec, and nsec values may be outside their usual ranges and will be normalized during the conversion. For example, October 32 converts to November 1. A daylight savings time transition skips or repeats times. For example, in the United States, March 13, 2011 2:15am never occurred, while November 6, 2011 1:15am occurred twice. In such cases, the choice of time zone, and therefore the time, is not well-defined. Date returns a time that is correct in one of the two zones involved in the transition, but it does not guarantee which. A location maps time instants to the zone in use at that time. Typically, the location represents the collection of time offsets in use in a geographical area, such as "CEST" and "CET" for central Europe. "local" represents the system's local time zone. "UTC" represents Universal Coordinated Time (UTC). The month specifies a month of the year (January = 1, ...). If any argument to date is NULL the result is NULL. The built-in function day returns the day of the month specified by t. If the argument to day is NULL the result is NULL. The built-in function formatTime returns a textual representation of the time value formatted according to layout, which defines the format by showing how the reference time, would be displayed if it were the value; it serves as an example of the desired output. The same display rules will then be applied to the time value. If any argument to formatTime is NULL the result is NULL. NOTE: The string value of the time zone, like "CET" or "ACDT", is dependent on the time zone of the machine the function is run on. For example, if the t value is in "CET", but the machine is in "ACDT", instead of "CET" the result is "+0100". This is the same what Go (time.Time).String() returns and in fact formatTime directly calls t.String(). returns on a machine in the CET time zone, but may return on a machine in the ACDT zone. The time value is in both cases the same so its ordering and comparing is correct. Only the display value can differ. The built-in functions formatFloat and formatInt format numbers to strings using go's number format functions in the `strconv` package. For all three functions, only the first argument is mandatory. The default values of the rest are shown in the examples. If the first argument is NULL, the result is NULL. returns returns returns Unlike the `strconv` equivalent, the formatInt function handles all integer types, both signed and unsigned. The built-in function hasPrefix tests whether the string s begins with prefix. If any argument to hasPrefix is NULL the result is NULL. The built-in function hasSuffix tests whether the string s ends with suffix. If any argument to hasSuffix is NULL the result is NULL. The built-in function hour returns the hour within the day specified by t, in the range [0, 23]. If the argument to hour is NULL the result is NULL. The built-in function hours returns the duration as a floating point number of hours. If the argument to hours is NULL the result is NULL. The built-in function id takes zero or one arguments. If no argument is provided, id() returns a table-unique automatically assigned numeric identifier of type int. Ids of deleted records are not reused unless the DB becomes completely empty (has no tables). For example If id() without arguments is called for a row which is not a table record then the result value is NULL. For example If id() has one argument it must be a table name of a table in a cross join. For example The built-in function len takes a string argument and returns the lentgh of the string in bytes. The expression len(s) is constant if s is a string constant. If the argument to len is NULL the result is NULL. The built-in aggregate function max returns the largest value of an expression in a record set. Max ignores NULL values, but returns NULL if all values of a column are NULL or if max is applied to an empty record set. The expression values must be of an ordered type. For example The built-in aggregate function min returns the smallest value of an expression in a record set. Min ignores NULL values, but returns NULL if all values of a column are NULL or if min is applied to an empty record set. For example The column values must be of an ordered type. The built-in function minute returns the minute offset within the hour specified by t, in the range [0, 59]. If the argument to minute is NULL the result is NULL. The built-in function minutes returns the duration as a floating point number of minutes. If the argument to minutes is NULL the result is NULL. The built-in function month returns the month of the year specified by t (January = 1, ...). If the argument to month is NULL the result is NULL. The built-in function nanosecond returns the nanosecond offset within the second specified by t, in the range [0, 999999999]. If the argument to nanosecond is NULL the result is NULL. The built-in function nanoseconds returns the duration as an integer nanosecond count. If the argument to nanoseconds is NULL the result is NULL. The built-in function now returns the current local time. The built-in function parseTime parses a formatted string and returns the time value it represents. The layout defines the format by showing how the reference time, would be interpreted if it were the value; it serves as an example of the input format. The same interpretation will then be made to the input string. Elements omitted from the value are assumed to be zero or, when zero is impossible, one, so parsing "3:04pm" returns the time corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is 0, this time is before the zero Time). Years must be in the range 0000..9999. The day of the week is checked for syntax but it is otherwise ignored. In the absence of a time zone indicator, parseTime returns a time in UTC. When parsing a time with a zone offset like -0700, if the offset corresponds to a time zone used by the current location, then parseTime uses that location and zone in the returned time. Otherwise it records the time as being in a fabricated location with time fixed at the given zone offset. When parsing a time with a zone abbreviation like MST, if the zone abbreviation has a defined offset in the current location, then that offset is used. The zone abbreviation "UTC" is recognized as UTC regardless of location. If the zone abbreviation is unknown, Parse records the time as being in a fabricated location with the given zone abbreviation and a zero offset. This choice means that such a time can be parses and reformatted with the same layout losslessly, but the exact instant used in the representation will differ by the actual zone offset. To avoid such problems, prefer time layouts that use a numeric zone offset. If any argument to parseTime is NULL the result is NULL. The built-in function second returns the second offset within the minute specified by t, in the range [0, 59]. If the argument to second is NULL the result is NULL. The built-in function seconds returns the duration as a floating point number of seconds. If the argument to seconds is NULL the result is NULL. The built-in function since returns the time elapsed since t. It is shorthand for now()-t. If the argument to since is NULL the result is NULL. The built-in aggregate function sum returns the sum of values of an expression for all rows of a record set. Sum ignores NULL values, but returns NULL if all values of a column are NULL or if sum is applied to an empty record set. The column values must be of a numeric type. The built-in function timeIn returns t with the location information set to loc. For discussion of the loc argument please see date(). If any argument to timeIn is NULL the result is NULL. The built-in function weekday returns the day of the week specified by t. Sunday == 0, Monday == 1, ... If the argument to weekday is NULL the result is NULL. The built-in function year returns the year in which t occurs. If the argument to year is NULL the result is NULL. The built-in function yearDay returns the day of the year specified by t, in the range [1,365] for non-leap years, and [1,366] in leap years. If the argument to yearDay is NULL the result is NULL. Three functions assemble and disassemble complex numbers. The built-in function complex constructs a complex value from a floating-point real and imaginary part, while real and imag extract the real and imaginary parts of a complex value. The type of the arguments and return value correspond. For complex, the two arguments must be of the same floating-point type and the return type is the complex type with the corresponding floating-point constituents: complex64 for float32, complex128 for float64. The real and imag functions together form the inverse, so for a complex value z, z == complex(real(z), imag(z)). If the operands of these functions are all constants, the return value is a constant. If any argument to any of complex, real, imag functions is NULL the result is NULL. For the numeric types, the following sizes are guaranteed Portions of this specification page are modifications based on work[2] created and shared by Google[3] and used according to terms described in the Creative Commons 3.0 Attribution License[4]. This specification is licensed under the Creative Commons Attribution 3.0 License, and code is licensed under a BSD license[5]. Links from the above documentation This section is not part of the specification. WARNING: The implementation of indices is new and it surely needs more time to become mature. Indices are used currently used only by the WHERE clause. The following expression patterns of 'WHERE expression' are recognized and trigger index use. The relOp is one of the relation operators <, <=, ==, >=, >. For the equality operator both operands must be of comparable types. For all other operators both operands must be of ordered types. The constant expression is a compile time constant expression. Some constant folding is still a TODO. Parameter is a QL parameter ($1 etc.). Consider tables t and u, both with an indexed field f. The WHERE expression doesn't comply with the above simple detected cases. However, such query is now automatically rewritten to which will use both of the indices. The impact of using the indices can be substantial (cf. BenchmarkCrossJoin*) if the resulting rows have low "selectivity", ie. only few rows from both tables are selected by the respective WHERE filtering. Note: Existing QL DBs can be used and indices can be added to them. However, once any indices are present in the DB, the old QL versions cannot work with such DB anymore. Running a benchmark with -v (-test.v) outputs information about the scale used to report records/s and a brief description of the benchmark. For example Running the full suite of benchmarks takes a lot of time. Use the -timeout flag to avoid them being killed after the default time limit (10 minutes).

v1.1.0 • 9 years ago

github.com/amoskong/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v1.5.0 • 5 years ago

github.com/bobg/cadence-client

Package cadence and its subdirectories contain the Cadence client side framework. The Cadence service is a task orchestrator for your application’s tasks. Applications using Cadence can execute a logical flow of tasks, especially long-running business logic, asynchronously or synchronously. They can also scale at runtime on distributed systems. A quick example illustrates its use case. Consider Uber Eats where Cadence manages the entire business flow from placing an order, accepting it, handling shopping cart processes (adding, updating, and calculating cart items), entering the order in a pipeline (for preparing food and coordinating delivery), to scheduling delivery as well as handling payments. Cadence consists of a programming framework (or client library) and a managed service (or backend). The framework enables developers to author and coordinate tasks in Go code. The root cadence package contains common data structures. The subpackages are: The Cadence hosted service brokers and persists events generated during workflow execution. Worker nodes owned and operated by customers execute the coordination and task logic. To facilitate the implementation of worker nodes Cadence provides a client-side library for the Go language. In Cadence, you can code the logical flow of events separately as a workflow and code business logic as activities. The workflow identifies the activities and sequences them, while an activity executes the logic. Dynamic workflow execution graphs - Determine the workflow execution graphs at runtime based on the data you are processing. Cadence does not pre-compute the execution graphs at compile time or at workflow start time. Therefore, you have the ability to write workflows that can dynamically adjust to the amount of data they are processing. If you need to trigger 10 instances of an activity to efficiently process all the data in one run, but only 3 for a subsequent run, you can do that. Child Workflows - Orchestrate the execution of a workflow from within another workflow. Cadence will return the results of the child workflow execution to the parent workflow upon completion of the child workflow. No polling is required in the parent workflow to monitor status of the child workflow, making the process efficient and fault tolerant. Durable Timers - Implement delayed execution of tasks in your workflows that are robust to worker failures. Cadence provides two easy to use APIs, **workflow.Sleep** and **workflow.Timer**, for implementing time based events in your workflows. Cadence ensures that the timer settings are persisted and the events are generated even if workers executing the workflow crash. Signals - Modify/influence the execution path of a running workflow by pushing additional data directly to the workflow using a signal. Via the Signal facility, Cadence provides a mechanism to consume external events directly in workflow code. Task routing - Efficiently process large amounts of data using a Cadence workflow, by caching the data locally on a worker and executing all activities meant to process that data on that same worker. Cadence enables you to choose the worker you want to execute a certain activity by scheduling that activity execution in the worker's specific task-list. Unique workflow ID enforcement - Use business entity IDs for your workflows and let Cadence ensure that only one workflow is running for a particular entity at a time. Cadence implements an atomic "uniqueness check" and ensures that no race conditions are possible that would result in multiple workflow executions for the same workflow ID. Therefore, you can implement your code to attempt to start a workflow without checking if the ID is already in use, even in the cases where only one active execution per workflow ID is desired. Perpetual/ContinueAsNew workflows - Run periodic tasks as a single perpetually running workflow. With the "ContinueAsNew" facility, Cadence allows you to leverage the "unique workflow ID enforcement" feature for periodic workflows. Cadence will complete the current execution and start the new execution atomically, ensuring you get to keep your workflow ID. By starting a new execution Cadence also ensures that workflow execution history does not grow indefinitely for perpetual workflows. At-most once activity execution - Execute non-idempotent activities as part of your workflows. Cadence will not automatically retry activities on failure. For every activity execution Cadence will return a success result, a failure result, or a timeout to the workflow code and let the workflow code determine how each one of those result types should be handled. Asynch Activity Completion - Incorporate human input or thrid-party service asynchronous callbacks into your workflows. Cadence allows a workflow to pause execution on an activity and wait for an external actor to resume it with a callback. During this pause the activity does not have any actively executing code, such as a polling loop, and is merely an entry in the Cadence datastore. Therefore, the workflow is unaffected by any worker failures happening over the duration of the pause. Activity Heartbeating - Detect unexpected failures/crashes and track progress in long running activities early. By configuring your activity to report progress periodically to the Cadence server, you can detect a crash that occurs 10 minutes into an hour-long activity execution much sooner, instead of waiting for the 60-minute execution timeout. The recorded progress before the crash gives you sufficient information to determine whether to restart the activity from the beginning or resume it from the point of failure. Timeouts for activities and workflow executions - Protect against stuck and unresponsive activities and workflows with appropriate timeout values. Cadence requires that timeout values are provided for every activity or workflow invocation. There is no upper bound on the timeout values, so you can set timeouts that span days, weeks, or even months. Visibility - Get a list of all your active and/or completed workflow. Explore the execution history of a particular workflow execution. Cadence provides a set of visibility APIs that allow you, the workflow owner, to monitor past and current workflow executions. Debuggability - Replay any workflow execution history locally under a debugger. The Cadence client library provides an API to allow you to capture a stack trace from any failed workflow execution history.

v0.17.0 • 4 years ago

github.com/EmiPhil/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v0.0.1 • 4 years ago

github.com/dbrower/ql

Package ql implements a pure Go embedded SQL database engine. QL is a member of the SQL family of languages. It is less complex and less powerful than SQL (whichever specification SQL is considered to be). 2017-01-10: Release v1.1.0 fixes some bugs and adds a configurable WAL headroom. 2016-07-29: Release v1.0.6 enables alternatively using = instead of == for equality operation. 2016-07-11: Release v1.0.5 undoes vendoring of lldb. QL now uses stable lldb (github.com/cznic/lldb). 2016-07-06: Release v1.0.4 fixes a panic when closing the WAL file. 2016-04-03: Release v1.0.3 fixes a data race. 2016-03-23: Release v1.0.2 vendors github.com/cznic/exp/lldb and github.com/camlistore/go4/lock. 2016-03-17: Release v1.0.1 adjusts for latest goyacc. Parser error messages are improved and changed, but their exact form is not considered a API change. 2016-03-05: The current version has been tagged v1.0.0. 2015-06-15: To improve compatibility with other SQL implementations, the count built-in aggregate function now accepts * as its argument. 2015-05-29: The execution planner was rewritten from scratch. It should use indices in all places where they were used before plus in some additional situations. It is possible to investigate the plan using the newly added EXPLAIN statement. The QL tool is handy for such analysis. If the planner would have used an index, but no such exists, the plan includes hints in form of copy/paste ready CREATE INDEX statements. The planner is still quite simple and a lot of work on it is yet ahead. You can help this process by filling an issue with a schema and query which fails to use an index or indices when it should, in your opinion. Bonus points for including output of `ql 'explain <query>'`. 2015-05-09: The grammar of the CREATE INDEX statement now accepts an expression list instead of a single expression, which was further limited to just a column name or the built-in id(). As a side effect, composite indices are now functional. However, the values in the expression-list style index are not yet used by other statements or the statement/query planner. The composite index is useful while having UNIQUE clause to check for semantically duplicate rows before they get added to the table or when such a row is mutated using the UPDATE statement and the expression-list style index tuple of the row is thus recomputed. 2015-05-02: The Schema field of table __Table now correctly reflects any column constraints and/or defaults. Also, the (*DB).Info method now has that information provided in new ColumInfo fields NotNull, Constraint and Default. 2015-04-20: Added support for {LEFT,RIGHT,FULL} [OUTER] JOIN. 2015-04-18: Column definitions can now have constraints and defaults. Details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. 2015-03-06: New built-in functions formatFloat and formatInt. Thanks urandom! (https://github.com/urandom) 2015-02-16: IN predicate now accepts a SELECT statement. See the updated "Predicates" section. 2015-01-17: Logical operators || and && have now alternative spellings: OR and AND (case insensitive). AND was a keyword before, but OR is a new one. This can possibly break existing queries. For the record, it's a good idea to not use any name appearing in, for example, [7] in your queries as the list of QL's keywords may expand for gaining better compatibility with existing SQL "standards". 2015-01-12: ACID guarantees were tightened at the cost of performance in some cases. The write collecting window mechanism, a formerly used implementation detail, was removed. Inserting rows one by one in a transaction is now slow. I mean very slow. Try to avoid inserting single rows in a transaction. Instead, whenever possible, perform batch updates of tens to, say thousands of rows in a single transaction. See also: http://www.sqlite.org/faq.html#q19, the discussed synchronization principles involved are the same as for QL, modulo minor details. Note: A side effect is that closing a DB before exiting an application, both for the Go API and through database/sql driver, is no more required, strictly speaking. Beware that exiting an application while there is an open (uncommitted) transaction in progress means losing the transaction data. However, the DB will not become corrupted because of not closing it. Nor that was the case before, but formerly failing to close a DB could have resulted in losing the data of the last transaction. 2014-09-21: id() now optionally accepts a single argument - a table name. 2014-09-01: Added the DB.Flush() method and the LIKE pattern matching predicate. 2014-08-08: The built in functions max and min now accept also time values. Thanks opennota! (https://github.com/opennota) 2014-06-05: RecordSet interface extended by new methods FirstRow and Rows. 2014-06-02: Indices on id() are now used by SELECT statements. 2014-05-07: Introduction of Marshal, Schema, Unmarshal. 2014-04-15: Added optional IF NOT EXISTS clause to CREATE INDEX and optional IF EXISTS clause to DROP INDEX. 2014-04-12: The column Unique in the virtual table __Index was renamed to IsUnique because the old name is a keyword. Unfortunately, this is a breaking change, sorry. 2014-04-11: Introduction of LIMIT, OFFSET. 2014-04-10: Introduction of query rewriting. 2014-04-07: Introduction of indices. QL imports zappy[8], a block-based compressor, which speeds up its performance by using a C version of the compression/decompression algorithms. If a CGO-free (pure Go) version of QL, or an app using QL, is required, please include 'purego' in the -tags option of go {build,get,install}. For example: If zappy was installed before installing QL, it might be necessary to rebuild zappy first (or rebuild QL with all its dependencies using the -a option): The syntax is specified using Extended Backus-Naur Form (EBNF) Lower-case production names are used to identify lexical tokens. Non-terminals are in CamelCase. Lexical tokens are enclosed in double quotes "" or back quotes “. The form a … b represents the set of characters from a through b as alternatives. The horizontal ellipsis … is also used elsewhere in the spec to informally denote various enumerations or code snippets that are not further specified. QL source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points. For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text. Each code point is distinct; for instance, upper and lower case letters are different characters. Implementation restriction: For compatibility with other tools, the parser may disallow the NUL character (U+0000) in the statement. Implementation restriction: A byte order mark is disallowed anywhere in QL statements. The following terms are used to denote specific character classes The underscore character _ (U+005F) is considered a letter. Lexical elements are comments, tokens, identifiers, keywords, operators and delimiters, integer, floating-point, imaginary, rune and string literals and QL parameters. Line comments start with the character sequence // or -- and stop at the end of the line. A line comment acts like a space. General comments start with the character sequence /* and continue through the character sequence */. A general comment acts like a space. Comments do not nest. Tokens form the vocabulary of QL. There are four classes: identifiers, keywords, operators and delimiters, and literals. White space, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. The formal grammar uses semicolons ";" as separators of QL statements. A single QL statement or the last QL statement in a list of statements can have an optional semicolon terminator. (Actually a separator from the following empty statement.) Identifiers name entities such as tables or record set columns. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter. For example No identifiers are predeclared, however note that no keyword can be used as an identifier. Identifiers starting with two underscores are used for meta data virtual tables names. For forward compatibility, users should generally avoid using any identifiers starting with two underscores. For example The following keywords are reserved and may not be used as identifiers. Keywords are not case sensitive. The following character sequences represent operators, delimiters, and other special tokens Operators consisting of more than one character are referred to by names in the rest of the documentation An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0 for octal, 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15. For example A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided. For example An imaginary literal is a decimal representation of the imaginary part of a complex constant. It consists of a floating-point literal or decimal integer followed by the lower-case letter i. For example A rune literal represents a rune constant, an integer value identifying a Unicode code point. A rune literal is expressed as one or more characters enclosed in single quotes. Within the quotes, any character may appear except single quote and newline. A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats. The simplest form represents the single character within the quotes; since QL statements are Unicode characters encoded in UTF-8, multiple UTF-8-encoded bytes may represent a single integer value. For instance, the literal 'a' holds a single byte representing a literal a, Unicode U+0061, value 0x61, while 'ä' holds two bytes (0xc3 0xa4) representing a literal a-dieresis, U+00E4, value 0xe4. Several backslash escapes allow arbitrary values to be encoded as ASCII text. There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base. Although these representations all result in an integer, they have different valid ranges. Octal escapes must represent a value between 0 and 255 inclusive. Hexadecimal escapes satisfy this condition by construction. The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves. After a backslash, certain single-character escapes represent special values All other sequences starting with a backslash are illegal inside rune literals. For example A string literal represents a string constant obtained from concatenating a sequence of characters. There are two forms: raw string literals and interpreted string literals. Raw string literals are character sequences between back quotes “. Within the quotes, any character is legal except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines. Carriage returns inside raw string literals are discarded from the raw string value. Interpreted string literals are character sequences between double quotes "". The text between the quotes, which may not contain newlines, forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF. For example These examples all represent the same string If the statement source represents a character as two code points, such as a combining form involving an accent and a letter, the result will be an error if placed in a rune literal (it is not a single code point), and will appear as two code points if placed in a string literal. Literals are assigned their values from the respective text representation at "compile" (parse) time. QL parameters provide the same functionality as literals, but their value is assigned at execution time from an expression list passed to DB.Run or DB.Execute. Using '?' or '$' is completely equivalent. For example Keywords 'false' and 'true' (not case sensitive) represent the two possible constant values of type bool (also not case sensitive). Keyword 'NULL' (not case sensitive) represents an untyped constant which is assignable to any type. NULL is distinct from any other value of any type. A type determines the set of values and operations specific to values of that type. A type is specified by a type name. Named instances of the boolean, numeric, and string types are keywords. The names are not case sensitive. Note: The blob type is exchanged between the back end and the API as []byte. On 32 bit platforms this limits the size which the implementation can handle to 2G. A boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false. The predeclared boolean type is bool. A duration type represents the elapsed time between two instants as an int64 nanosecond count. The representation limits the largest representable duration to approximately 290 years. A numeric type represents sets of integer or floating-point values. The predeclared architecture-independent numeric types are The value of an n-bit integer is n bits wide and represented using two's complement arithmetic. Conversions are required when different numeric types are mixed in an expression or assignment. A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. The case insensitive keyword for the string type is 'string'. The length of a string (its size in bytes) can be discovered using the built-in function len. A time type represents an instant in time with nanosecond precision. Each time has associated with it a location, consulted when computing the presentation form of the time. The following functions are implicitly declared An expression specifies the computation of a value by applying operators and functions to operands. Operands denote the elementary values in an expression. An operand may be a literal, a (possibly qualified) identifier denoting a constant or a function or a table/record set column, or a parenthesized expression. A qualified identifier is an identifier qualified with a table/record set name prefix. For example Primary expression are the operands for unary and binary expressions. For example A primary expression of the form denotes the element of a string indexed by x. Its type is byte. The value x is called the index. The following rules apply - The index x must be of integer type except bigint or duration; it is in range if 0 <= x < len(s), otherwise it is out of range. - A constant index must be non-negative and representable by a value of type int. - A constant index must be in range if the string a is a literal. - If x is out of range at run time, a run-time error occurs. - s[x] is the byte at index x and the type of s[x] is byte. If s is NULL or x is NULL then the result is NULL. Otherwise s[x] is illegal. For a string, the primary expression constructs a substring. The indices low and high select which elements appear in the result. The result has indices starting at 0 and length equal to high - low. For convenience, any of the indices may be omitted. A missing low index defaults to zero; a missing high index defaults to the length of the sliced operand The indices low and high are in range if 0 <= low <= high <= len(a), otherwise they are out of range. A constant index must be non-negative and representable by a value of type int. If both indices are constant, they must satisfy low <= high. If the indices are out of range at run time, a run-time error occurs. Integer values of type bigint or duration cannot be used as indices. If s is NULL the result is NULL. If low or high is not omitted and is NULL then the result is NULL. Given an identifier f denoting a predeclared function, calls f with arguments a1, a2, … an. Arguments are evaluated before the function is called. The type of the expression is the result type of f. In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return value of the function is passed by value when the function returns. Calling an undefined function causes a compile-time error. Operators combine operands into expressions. Comparisons are discussed elsewhere. For other binary operators, the operand types must be identical unless the operation involves shifts or untyped constants. For operations involving constants only, see the section on constant expressions. Except for shift operations, if one operand is an untyped constant and the other operand is not, the constant is converted to the type of the other operand. The right operand in a shift expression must have unsigned integer type or be an untyped constant that can be converted to unsigned integer type. If the left operand of a non-constant shift expression is an untyped constant, the type of the constant is what it would be if the shift expression were replaced by its left operand alone. Expressions of the form yield a boolean value true if expr2, a regular expression, matches expr1 (see also [6]). Both expression must be of type string. If any one of the expressions is NULL the result is NULL. Predicates are special form expressions having a boolean result type. Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be comparable as defined in "Comparison operators". Another form of the IN predicate creates the expression list from a result of a SelectStmt. The SelectStmt must select only one column. The produced expression list is resource limited by the memory available to the process. NULL values produced by the SelectStmt are ignored, but if all records of the SelectStmt are NULL the predicate yields NULL. The select statement is evaluated only once. If the type of expr is not the same as the type of the field returned by the SelectStmt then the set operation yields false. The type of the column returned by the SelectStmt must be one of the simple (non blob-like) types: Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be ordered as defined in "Comparison operators". Expressions of the form yield a boolean value true if expr does not have a specific type (case A) or if expr has a specific type (case B). In other cases the result is a boolean value false. Unary operators have the highest precedence. There are five precedence levels for binary operators. Multiplication operators bind strongest, followed by addition operators, comparison operators, && (logical AND), and finally || (logical OR) Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z. Note that the operator precedence is reflected explicitly by the grammar. Arithmetic operators apply to numeric values and yield a result of the same type as the first operand. The four standard arithmetic operators (+, -, *, /) apply to integer, rational, floating-point, and complex types; + also applies to strings; +,- also applies to times. All other arithmetic operators apply to integers only. sum integers, rationals, floats, complex values, strings difference integers, rationals, floats, complex values, times product integers, rationals, floats, complex values / quotient integers, rationals, floats, complex values % remainder integers & bitwise AND integers | bitwise OR integers ^ bitwise XOR integers &^ bit clear (AND NOT) integers << left shift integer << unsigned integer >> right shift integer >> unsigned integer Strings can be concatenated using the + operator String addition creates a new string by concatenating the operands. A value of type duration can be added to or subtracted from a value of type time. Times can subtracted from each other producing a value of type duration. For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships with x / y truncated towards zero ("truncated division"). As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0). If the divisor is a constant expression, it must not be zero. If the divisor is zero at run time, a run-time error occurs. If the dividend is non-negative and the divisor is a constant power of 2, the division may be replaced by a right shift, and computing the remainder may be replaced by a bitwise AND operation The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. There is no upper limit on the shift count. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity. For integer operands, the unary operators +, -, and ^ are defined as follows For floating-point and complex numbers, +x is the same as x, while -x is the negation of x. The result of a floating-point or complex division by zero is not specified beyond the IEEE-754 standard; whether a run-time error occurs is implementation-specific. Whenever any operand of any arithmetic operation, unary or binary, is NULL, as well as in the case of the string concatenating operation, the result is NULL. For unsigned integer values, the operations +, -, *, and << are computed modulo 2n, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and expressions may rely on “wrap around”. For signed integers with a finite bit width, the operations +, -, *, and << may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow. An evaluator may not optimize an expression under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true. Integers of type bigint and rationals do not overflow but their handling is limited by the memory resources available to the program. Comparison operators compare two operands and yield a boolean value. In any comparison, the first operand must be of same type as is the second operand, or vice versa. The equality operators == and != apply to operands that are comparable. The ordering operators <, <=, >, and >= apply to operands that are ordered. These terms and the result of the comparisons are defined as follows - Boolean values are comparable. Two boolean values are equal if they are either both true or both false. - Complex values are comparable. Two complex values u and v are equal if both real(u) == real(v) and imag(u) == imag(v). - Integer values are comparable and ordered, in the usual way. Note that durations are integers. - Floating point values are comparable and ordered, as defined by the IEEE-754 standard. - Rational values are comparable and ordered, in the usual way. - String values are comparable and ordered, lexically byte-wise. - Time values are comparable and ordered. Whenever any operand of any comparison operation is NULL, the result is NULL. Note that slices are always of type string. Logical operators apply to boolean values and yield a boolean result. The right operand is evaluated conditionally. The truth tables for logical operations with NULL values Conversions are expressions of the form T(x) where T is a type and x is an expression that can be converted to type T. A constant value x can be converted to type T in any of these cases: - x is representable by a value of type T. - x is a floating-point constant, T is a floating-point type, and x is representable by a value of type T after rounding using IEEE 754 round-to-even rules. The constant T(x) is the rounded value. - x is an integer constant and T is a string type. The same rule as for non-constant x applies in this case. Converting a constant yields a typed constant as result. A non-constant value x can be converted to type T in any of these cases: - x has type T. - x's type and T are both integer or floating point types. - x's type and T are both complex types. - x is an integer, except bigint or duration, and T is a string type. Specific rules apply to (non-constant) conversions between numeric types or to and from a string type. These conversions may change the representation of x and incur a run-time cost. All other conversions only change the type but not the representation of x. A conversion of NULL to any type yields NULL. For the conversion of non-constant numeric values, the following rules apply 1. When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. It is then truncated to fit in the result type's size. For example, if v == uint16(0x10F0), then uint32(int8(v)) == 0xFFFFFFF0. The conversion always yields a valid value; there is no indication of overflow. 2. When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero). 3. When converting an integer or floating-point number to a floating-point type, or a complex number to another complex type, the result value is rounded to the precision specified by the destination type. For instance, the value of a variable x of type float32 may be stored using additional precision beyond that of an IEEE-754 32-bit number, but float32(x) represents the result of rounding x's value to 32-bit precision. Similarly, x + 0.1 may use more than 32 bits of precision, but float32(x + 0.1) does not. In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent. 1. Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD". 2. Converting a blob to a string type yields a string whose successive bytes are the elements of the blob. 3. Converting a value of a string type to a blob yields a blob whose successive elements are the bytes of the string. 4. Converting a value of a bigint type to a string yields a string containing the decimal decimal representation of the integer. 5. Converting a value of a string type to a bigint yields a bigint value containing the integer represented by the string value. A prefix of “0x” or “0X” selects base 16; the “0” prefix selects base 8, and a “0b” or “0B” prefix selects base 2. Otherwise the value is interpreted in base 10. An error occurs if the string value is not in any valid format. 6. Converting a value of a rational type to a string yields a string containing the decimal decimal representation of the rational in the form "a/b" (even if b == 1). 7. Converting a value of a string type to a bigrat yields a bigrat value containing the rational represented by the string value. The string can be given as a fraction "a/b" or as a floating-point number optionally followed by an exponent. An error occurs if the string value is not in any valid format. 8. Converting a value of a duration type to a string returns a string representing the duration in the form "72h3m0.5s". Leading zero units are omitted. As a special case, durations less than one second format using a smaller unit (milli-, micro-, or nanoseconds) to ensure that the leading digit is non-zero. The zero duration formats as 0, with no unit. 9. Converting a string value to a duration yields a duration represented by the string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h". 10. Converting a time value to a string returns the time formatted using the format string When evaluating the operands of an expression or of function calls, operations are evaluated in lexical left-to-right order. For example, in the evaluation of the function calls and evaluation of c happen in the order h(), i(), j(), c. Floating-point operations within a single expression are evaluated according to the associativity of the operators. Explicit parentheses affect the evaluation by overriding the default associativity. In the expression x + (y + z) the addition y + z is performed before adding x. Statements control execution. The empty statement does nothing. Alter table statements modify existing tables. With the ADD clause it adds a new column to the table. The column must not exist. With the DROP clause it removes an existing column from a table. The column must exist and it must be not the only (last) column of the table. IOW, there cannot be a table with no columns. For example When adding a column to a table with existing data, the constraint clause of the ColumnDef cannot be used. Adding a constrained column to an empty table is fine. Begin transactions statements introduce a new transaction level. Every transaction level must be eventually balanced by exactly one of COMMIT or ROLLBACK statements. Note that when a transaction is roll-backed because of a statement failure then no explicit balancing of the respective BEGIN TRANSACTION is statement is required nor permitted. Failure to properly balance any opened transaction level may cause dead locks and/or lose of data updated in the uppermost opened but never properly closed transaction level. For example A database cannot be updated (mutated) outside of a transaction. Statements requiring a transaction A database is effectively read only outside of a transaction. Statements not requiring a transaction The commit statement closes the innermost transaction nesting level. If that's the outermost level then the updates to the DB made by the transaction are atomically made persistent. For example Create index statements create new indices. Index is a named projection of ordered values of a table column to the respective records. As a special case the id() of the record can be indexed. Index name must not be the same as any of the existing tables and it also cannot be the same as of any column name of the table the index is on. For example Now certain SELECT statements may use the indices to speed up joins and/or to speed up record set filtering when the WHERE clause is used; or the indices might be used to improve the performance when the ORDER BY clause is present. The UNIQUE modifier requires the indexed values tuple to be index-wise unique or have all values NULL. The optional IF NOT EXISTS clause makes the statement a no operation if the index already exists. A simple index consists of only one expression which must be either a column name or the built-in id(). A more complex and more general index is one that consists of more than one expression or its single expression does not qualify as a simple index. In this case the type of all expressions in the list must be one of the non blob-like types. Note: Blob-like types are blob, bigint, bigrat, time and duration. Create table statements create new tables. A column definition declares the column name and type. Table names and column names are case sensitive. Neither a table or an index of the same name may exist in the DB. For example The optional IF NOT EXISTS clause makes the statement a no operation if the table already exists. The optional constraint clause has two forms. The first one is found in many SQL dialects. This form prevents the data in column DepartmentName to be NULL. The second form allows an arbitrary boolean expression to be used to validate the column. If the value of the expression is true then the validation succeeded. If the value of the expression is false or NULL then the validation fails. If the value of the expression is not of type bool an error occurs. The optional DEFAULT clause is an expression which, if present, is substituted instead of a NULL value when the colum is assigned a value. Note that the constraint and/or default expressions may refer to other columns by name: When a table row is inserted by the INSERT INTO statement or when a table row is updated by the UPDATE statement, the order of operations is as follows: 1. The new values of the affected columns are set and the values of all the row columns become the named values which can be referred to in default expressions evaluated in step 2. 2. If any row column value is NULL and the DEFAULT clause is present in the column's definition, the default expression is evaluated and its value is set as the respective column value. 3. The values, potentially updated, of row columns become the named values which can be referred to in constraint expressions evaluated during step 4. 4. All row columns which definition has the constraint clause present will have that constraint checked. If any constraint violation is detected, the overall operation fails and no changes to the table are made. Delete from statements remove rows from a table, which must exist. For example If the WHERE clause is not present then all rows are removed and the statement is equivalent to the TRUNCATE TABLE statement. Drop index statements remove indices from the DB. The index must exist. For example The optional IF EXISTS clause makes the statement a no operation if the index does not exist. Drop table statements remove tables from the DB. The table must exist. For example The optional IF EXISTS clause makes the statement a no operation if the table does not exist. Insert into statements insert new rows into tables. New rows come from literal data, if using the VALUES clause, or are a result of select statement. In the later case the select statement is fully evaluated before the insertion of any rows is performed, allowing to insert values calculated from the same table rows are to be inserted into. If the ColumnNameList part is omitted then the number of values inserted in the row must be the same as are columns in the table. If the ColumnNameList part is present then the number of values per row must be same as the same number of column names. All other columns of the record are set to NULL. The type of the value assigned to a column must be the same as is the column's type or the value must be NULL. For example If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. Explain statement produces a recordset consisting of lines of text which describe the execution plan of a statement, if any. For example, the QL tool treats the explain statement specially and outputs the joined lines: The explanation may aid in uderstanding how a statement/query would be executed and if indices are used as expected - or which indices may possibly improve the statement performance. The create index statements above were directly copy/pasted in the terminal from the suggestions provided by the filter recordset pipeline part returned by the explain statement. If the statement has nothing special in its plan, the result is the original statement. To get an explanation of the select statement of the IN predicate, use the EXPLAIN statement with that particular select statement. The rollback statement closes the innermost transaction nesting level discarding any updates to the DB made by it. If that's the outermost level then the effects on the DB are as if the transaction never happened. For example The (temporary) record set from the last statement is returned and can be processed by the client. In this case the rollback is the same as 'DROP TABLE tmp;' but it can be a more complex operation. Select from statements produce recordsets. The optional DISTINCT modifier ensures all rows in the result recordset are unique. Either all of the resulting fields are returned ('*') or only those named in FieldList. RecordSetList is a list of table names or parenthesized select statements, optionally (re)named using the AS clause. The result can be filtered using a WhereClause and orderd by the OrderBy clause. For example If Recordset is a nested, parenthesized SelectStmt then it must be given a name using the AS clause if its field are to be accessible in expressions. A field is an named expression. Identifiers, not used as a type in conversion or a function name in the Call clause, denote names of (other) fields, values of which should be used in the expression. The expression can be named using the AS clause. If the AS clause is not present and the expression consists solely of a field name, then that field name is used as the name of the resulting field. Otherwise the field is unnamed. For example The SELECT statement can optionally enumerate the desired/resulting fields in a list. No two identical field names can appear in the list. When more than one record set is used in the FROM clause record set list, the result record set field names are rewritten to be qualified using the record set names. If a particular record set doesn't have a name, its respective fields became unnamed. The optional JOIN clause, for example is mostly equal to except that the rows from a which, when they appear in the cross join, never made expr to evaluate to true, are combined with a virtual row from b, containing all nulls, and added to the result set. For the RIGHT JOIN variant the discussed rules are used for rows from b not satisfying expr == true and the virtual, all-null row "comes" from a. The FULL JOIN adds the respective rows which would be otherwise provided by the separate executions of the LEFT JOIN and RIGHT JOIN variants. For more thorough OUTER JOIN discussion please see the Wikipedia article at [10]. Resultins rows of a SELECT statement can be optionally ordered by the ORDER BY clause. Collating proceeds by considering the expressions in the expression list left to right until a collating order is determined. Any possibly remaining expressions are not evaluated. All of the expression values must yield an ordered type or NULL. Ordered types are defined in "Comparison operators". Collating of elements having a NULL value is different compared to what the comparison operators yield in expression evaluation (NULL result instead of a boolean value). Below, T denotes a non NULL value of any QL type. NULL collates before any non NULL value (is considered smaller than T). Two NULLs have no collating order (are considered equal). The WHERE clause restricts records considered by some statements, like SELECT FROM, DELETE FROM, or UPDATE. It is an error if the expression evaluates to a non null value of non bool type. The GROUP BY clause is used to project rows having common values into a smaller set of rows. For example Using the GROUP BY without any aggregate functions in the selected fields is in certain cases equal to using the DISTINCT modifier. The last two examples above produce the same resultsets. The optional OFFSET clause allows to ignore first N records. For example The above will produce only rows 11, 12, ... of the record set, if they exist. The value of the expression must a non negative integer, but not bigint or duration. The optional LIMIT clause allows to ignore all but first N records. For example The above will return at most the first 10 records of the record set. The value of the expression must a non negative integer, but not bigint or duration. The LIMIT and OFFSET clauses can be combined. For example Considering table t has, say 10 records, the above will produce only records 4 - 8. After returning record #8, no more result rows/records are computed. 1. The FROM clause is evaluated, producing a Cartesian product of its source record sets (tables or nested SELECT statements). 2. If present, the JOIN cluase is evaluated on the result set of the previous evaluation and the recordset specified by the JOIN clause. (... JOIN Recordset ON ...) 3. If present, the WHERE clause is evaluated on the result set of the previous evaluation. 4. If present, the GROUP BY clause is evaluated on the result set of the previous evaluation(s). 5. The SELECT field expressions are evaluated on the result set of the previous evaluation(s). 6. If present, the DISTINCT modifier is evaluated on the result set of the previous evaluation(s). 7. If present, the ORDER BY clause is evaluated on the result set of the previous evaluation(s). 8. If present, the OFFSET clause is evaluated on the result set of the previous evaluation(s). The offset expression is evaluated once for the first record produced by the previous evaluations. 9. If present, the LIMIT clause is evaluated on the result set of the previous evaluation(s). The limit expression is evaluated once for the first record produced by the previous evaluations. Truncate table statements remove all records from a table. The table must exist. For example Update statements change values of fields in rows of a table. For example Note: The SET clause is optional. If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. To allow to query for DB meta data, there exist specially named tables, some of them being virtual. Note: Virtual system tables may have fake table-wise unique but meaningless and unstable record IDs. Do not apply the built-in id() to any system table. The table __Table lists all tables in the DB. The schema is The Schema column returns the statement to (re)create table Name. This table is virtual. The table __Colum lists all columns of all tables in the DB. The schema is The Ordinal column defines the 1-based index of the column in the record. This table is virtual. The table __Colum2 lists all columns of all tables in the DB which have the constraint NOT NULL or which have a constraint expression defined or which have a default expression defined. The schema is It's possible to obtain a consolidated recordset for all properties of all DB columns using The Name column is the column name in TableName. The table __Index lists all indices in the DB. The schema is The IsUnique columns reflects if the index was created using the optional UNIQUE clause. This table is virtual. Built-in functions are predeclared. The built-in aggregate function avg returns the average of values of an expression. Avg ignores NULL values, but returns NULL if all values of a column are NULL or if avg is applied to an empty record set. The column values must be of a numeric type. The built-in function contains returns true if substr is within s. If any argument to contains is NULL the result is NULL. The built-in aggregate function count returns how many times an expression has a non NULL values or the number of rows in a record set. Note: count() returns 0 for an empty record set. For example Date returns the time corresponding to in the appropriate zone for that time in the given location. The month, day, hour, min, sec, and nsec values may be outside their usual ranges and will be normalized during the conversion. For example, October 32 converts to November 1. A daylight savings time transition skips or repeats times. For example, in the United States, March 13, 2011 2:15am never occurred, while November 6, 2011 1:15am occurred twice. In such cases, the choice of time zone, and therefore the time, is not well-defined. Date returns a time that is correct in one of the two zones involved in the transition, but it does not guarantee which. A location maps time instants to the zone in use at that time. Typically, the location represents the collection of time offsets in use in a geographical area, such as "CEST" and "CET" for central Europe. "local" represents the system's local time zone. "UTC" represents Universal Coordinated Time (UTC). The month specifies a month of the year (January = 1, ...). If any argument to date is NULL the result is NULL. The built-in function day returns the day of the month specified by t. If the argument to day is NULL the result is NULL. The built-in function formatTime returns a textual representation of the time value formatted according to layout, which defines the format by showing how the reference time, would be displayed if it were the value; it serves as an example of the desired output. The same display rules will then be applied to the time value. If any argument to formatTime is NULL the result is NULL. NOTE: The string value of the time zone, like "CET" or "ACDT", is dependent on the time zone of the machine the function is run on. For example, if the t value is in "CET", but the machine is in "ACDT", instead of "CET" the result is "+0100". This is the same what Go (time.Time).String() returns and in fact formatTime directly calls t.String(). returns on a machine in the CET time zone, but may return on a machine in the ACDT zone. The time value is in both cases the same so its ordering and comparing is correct. Only the display value can differ. The built-in functions formatFloat and formatInt format numbers to strings using go's number format functions in the `strconv` package. For all three functions, only the first argument is mandatory. The default values of the rest are shown in the examples. If the first argument is NULL, the result is NULL. returns returns returns Unlike the `strconv` equivalent, the formatInt function handles all integer types, both signed and unsigned. The built-in function hasPrefix tests whether the string s begins with prefix. If any argument to hasPrefix is NULL the result is NULL. The built-in function hasSuffix tests whether the string s ends with suffix. If any argument to hasSuffix is NULL the result is NULL. The built-in function hour returns the hour within the day specified by t, in the range [0, 23]. If the argument to hour is NULL the result is NULL. The built-in function hours returns the duration as a floating point number of hours. If the argument to hours is NULL the result is NULL. The built-in function id takes zero or one arguments. If no argument is provided, id() returns a table-unique automatically assigned numeric identifier of type int. Ids of deleted records are not reused unless the DB becomes completely empty (has no tables). For example If id() without arguments is called for a row which is not a table record then the result value is NULL. For example If id() has one argument it must be a table name of a table in a cross join. For example The built-in function len takes a string argument and returns the lentgh of the string in bytes. The expression len(s) is constant if s is a string constant. If the argument to len is NULL the result is NULL. The built-in aggregate function max returns the largest value of an expression in a record set. Max ignores NULL values, but returns NULL if all values of a column are NULL or if max is applied to an empty record set. The expression values must be of an ordered type. For example The built-in aggregate function min returns the smallest value of an expression in a record set. Min ignores NULL values, but returns NULL if all values of a column are NULL or if min is applied to an empty record set. For example The column values must be of an ordered type. The built-in function minute returns the minute offset within the hour specified by t, in the range [0, 59]. If the argument to minute is NULL the result is NULL. The built-in function minutes returns the duration as a floating point number of minutes. If the argument to minutes is NULL the result is NULL. The built-in function month returns the month of the year specified by t (January = 1, ...). If the argument to month is NULL the result is NULL. The built-in function nanosecond returns the nanosecond offset within the second specified by t, in the range [0, 999999999]. If the argument to nanosecond is NULL the result is NULL. The built-in function nanoseconds returns the duration as an integer nanosecond count. If the argument to nanoseconds is NULL the result is NULL. The built-in function now returns the current local time. The built-in function parseTime parses a formatted string and returns the time value it represents. The layout defines the format by showing how the reference time, would be interpreted if it were the value; it serves as an example of the input format. The same interpretation will then be made to the input string. Elements omitted from the value are assumed to be zero or, when zero is impossible, one, so parsing "3:04pm" returns the time corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is 0, this time is before the zero Time). Years must be in the range 0000..9999. The day of the week is checked for syntax but it is otherwise ignored. In the absence of a time zone indicator, parseTime returns a time in UTC. When parsing a time with a zone offset like -0700, if the offset corresponds to a time zone used by the current location, then parseTime uses that location and zone in the returned time. Otherwise it records the time as being in a fabricated location with time fixed at the given zone offset. When parsing a time with a zone abbreviation like MST, if the zone abbreviation has a defined offset in the current location, then that offset is used. The zone abbreviation "UTC" is recognized as UTC regardless of location. If the zone abbreviation is unknown, Parse records the time as being in a fabricated location with the given zone abbreviation and a zero offset. This choice means that such a time can be parses and reformatted with the same layout losslessly, but the exact instant used in the representation will differ by the actual zone offset. To avoid such problems, prefer time layouts that use a numeric zone offset. If any argument to parseTime is NULL the result is NULL. The built-in function second returns the second offset within the minute specified by t, in the range [0, 59]. If the argument to second is NULL the result is NULL. The built-in function seconds returns the duration as a floating point number of seconds. If the argument to seconds is NULL the result is NULL. The built-in function since returns the time elapsed since t. It is shorthand for now()-t. If the argument to since is NULL the result is NULL. The built-in aggregate function sum returns the sum of values of an expression for all rows of a record set. Sum ignores NULL values, but returns NULL if all values of a column are NULL or if sum is applied to an empty record set. The column values must be of a numeric type. The built-in function timeIn returns t with the location information set to loc. For discussion of the loc argument please see date(). If any argument to timeIn is NULL the result is NULL. The built-in function weekday returns the day of the week specified by t. Sunday == 0, Monday == 1, ... If the argument to weekday is NULL the result is NULL. The built-in function year returns the year in which t occurs. If the argument to year is NULL the result is NULL. The built-in function yearDay returns the day of the year specified by t, in the range [1,365] for non-leap years, and [1,366] in leap years. If the argument to yearDay is NULL the result is NULL. Three functions assemble and disassemble complex numbers. The built-in function complex constructs a complex value from a floating-point real and imaginary part, while real and imag extract the real and imaginary parts of a complex value. The type of the arguments and return value correspond. For complex, the two arguments must be of the same floating-point type and the return type is the complex type with the corresponding floating-point constituents: complex64 for float32, complex128 for float64. The real and imag functions together form the inverse, so for a complex value z, z == complex(real(z), imag(z)). If the operands of these functions are all constants, the return value is a constant. If any argument to any of complex, real, imag functions is NULL the result is NULL. For the numeric types, the following sizes are guaranteed Portions of this specification page are modifications based on work[2] created and shared by Google[3] and used according to terms described in the Creative Commons 3.0 Attribution License[4]. This specification is licensed under the Creative Commons Attribution 3.0 License, and code is licensed under a BSD license[5]. Links from the above documentation This section is not part of the specification. WARNING: The implementation of indices is new and it surely needs more time to become mature. Indices are used currently used only by the WHERE clause. The following expression patterns of 'WHERE expression' are recognized and trigger index use. The relOp is one of the relation operators <, <=, ==, >=, >. For the equality operator both operands must be of comparable types. For all other operators both operands must be of ordered types. The constant expression is a compile time constant expression. Some constant folding is still a TODO. Parameter is a QL parameter ($1 etc.). Consider tables t and u, both with an indexed field f. The WHERE expression doesn't comply with the above simple detected cases. However, such query is now automatically rewritten to which will use both of the indices. The impact of using the indices can be substantial (cf. BenchmarkCrossJoin*) if the resulting rows have low "selectivity", ie. only few rows from both tables are selected by the respective WHERE filtering. Note: Existing QL DBs can be used and indices can be added to them. However, once any indices are present in the DB, the old QL versions cannot work with such DB anymore. Running a benchmark with -v (-test.v) outputs information about the scale used to report records/s and a brief description of the benchmark. For example Running the full suite of benchmarks takes a lot of time. Use the -timeout flag to avoid them being killed after the default time limit (10 minutes).

v1.1.0 • 9 years ago

github.com/eliasnaur/ql

Package ql implements a pure Go embedded SQL database engine. QL is a member of the SQL family of languages. It is less complex and less powerful than SQL (whichever specification SQL is considered to be). 2016-03-17: Release v1.0.1 adjusts for latest goyacc. Parser error messages are improved and changed, but their exact form is not considered a API change. 2016-03-05: The current version has been tagged v1.0.0. 2015-06-15: To improve compatibility with other SQL implementations, the count built-in aggregate function now accepts * as its argument. 2015-05-29: The execution planner was rewritten from scratch. It should use indices in all places where they were used before plus in some additional situations. It is possible to investigate the plan using the newly added EXPLAIN statement. The QL tool is handy for such analysis. If the planner would have used an index, but no such exists, the plan includes hints in form of copy/paste ready CREATE INDEX statements. The planner is still quite simple and a lot of work on it is yet ahead. You can help this process by filling an issue with a schema and query which fails to use an index or indices when it should, in your opinion. Bonus points for including output of `ql 'explain <query>'`. 2015-05-09: The grammar of the CREATE INDEX statement now accepts an expression list instead of a single expression, which was further limited to just a column name or the built-in id(). As a side effect, composite indices are now functional. However, the values in the expression-list style index are not yet used by other statements or the statement/query planner. The composite index is useful while having UNIQUE clause to check for semantically duplicate rows before they get added to the table or when such a row is mutated using the UPDATE statement and the expression-list style index tuple of the row is thus recomputed. 2015-05-02: The Schema field of table __Table now correctly reflects any column constraints and/or defaults. Also, the (*DB).Info method now has that information provided in new ColumInfo fields NotNull, Constraint and Default. 2015-04-20: Added support for {LEFT,RIGHT,FULL} [OUTER] JOIN. 2015-04-18: Column definitions can now have constraints and defaults. Details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. 2015-03-06: New built-in functions formatFloat and formatInt. Thanks urandom! (https://github.com/urandom) 2015-02-16: IN predicate now accepts a SELECT statement. See the updated "Predicates" section. 2015-01-17: Logical operators || and && have now alternative spellings: OR and AND (case insensitive). AND was a keyword before, but OR is a new one. This can possibly break existing queries. For the record, it's a good idea to not use any name appearing in, for example, [7] in your queries as the list of QL's keywords may expand for gaining better compatibility with existing SQL "standards". 2015-01-12: ACID guarantees were tightened at the cost of performance in some cases. The write collecting window mechanism, a formerly used implementation detail, was removed. Inserting rows one by one in a transaction is now slow. I mean very slow. Try to avoid inserting single rows in a transaction. Instead, whenever possible, perform batch updates of tens to, say thousands of rows in a single transaction. See also: http://www.sqlite.org/faq.html#q19, the discussed synchronization principles involved are the same as for QL, modulo minor details. Note: A side effect is that closing a DB before exiting an application, both for the Go API and through database/sql driver, is no more required, strictly speaking. Beware that exiting an application while there is an open (uncommitted) transaction in progress means losing the transaction data. However, the DB will not become corrupted because of not closing it. Nor that was the case before, but formerly failing to close a DB could have resulted in losing the data of the last transaction. 2014-09-21: id() now optionally accepts a single argument - a table name. 2014-09-01: Added the DB.Flush() method and the LIKE pattern matching predicate. 2014-08-08: The built in functions max and min now accept also time values. Thanks opennota! (https://github.com/opennota) 2014-06-05: RecordSet interface extended by new methods FirstRow and Rows. 2014-06-02: Indices on id() are now used by SELECT statements. 2014-05-07: Introduction of Marshal, Schema, Unmarshal. 2014-04-15: Added optional IF NOT EXISTS clause to CREATE INDEX and optional IF EXISTS clause to DROP INDEX. 2014-04-12: The column Unique in the virtual table __Index was renamed to IsUnique because the old name is a keyword. Unfortunately, this is a breaking change, sorry. 2014-04-11: Introduction of LIMIT, OFFSET. 2014-04-10: Introduction of query rewriting. 2014-04-07: Introduction of indices. QL imports zappy[8], a block-based compressor, which speeds up its performance by using a C version of the compression/decompression algorithms. If a CGO-free (pure Go) version of QL, or an app using QL, is required, please include 'purego' in the -tags option of go {build,get,install}. For example: If zappy was installed before installing QL, it might be necessary to rebuild zappy first (or rebuild QL with all its dependencies using the -a option): The syntax is specified using Extended Backus-Naur Form (EBNF) Lower-case production names are used to identify lexical tokens. Non-terminals are in CamelCase. Lexical tokens are enclosed in double quotes "" or back quotes “. The form a … b represents the set of characters from a through b as alternatives. The horizontal ellipsis … is also used elsewhere in the spec to informally denote various enumerations or code snippets that are not further specified. QL source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points. For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text. Each code point is distinct; for instance, upper and lower case letters are different characters. Implementation restriction: For compatibility with other tools, the parser may disallow the NUL character (U+0000) in the statement. Implementation restriction: A byte order mark is disallowed anywhere in QL statements. The following terms are used to denote specific character classes The underscore character _ (U+005F) is considered a letter. Lexical elements are comments, tokens, identifiers, keywords, operators and delimiters, integer, floating-point, imaginary, rune and string literals and QL parameters. Line comments start with the character sequence // or -- and stop at the end of the line. A line comment acts like a space. General comments start with the character sequence /* and continue through the character sequence */. A general comment acts like a space. Comments do not nest. Tokens form the vocabulary of QL. There are four classes: identifiers, keywords, operators and delimiters, and literals. White space, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. The formal grammar uses semicolons ";" as separators of QL statements. A single QL statement or the last QL statement in a list of statements can have an optional semicolon terminator. (Actually a separator from the following empty statement.) Identifiers name entities such as tables or record set columns. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter. For example No identifiers are predeclared, however note that no keyword can be used as an identifier. Identifiers starting with two underscores are used for meta data virtual tables names. For forward compatibility, users should generally avoid using any identifiers starting with two underscores. For example The following keywords are reserved and may not be used as identifiers. Keywords are not case sensitive. The following character sequences represent operators, delimiters, and other special tokens Operators consisting of more than one character are referred to by names in the rest of the documentation An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0 for octal, 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15. For example A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided. For example An imaginary literal is a decimal representation of the imaginary part of a complex constant. It consists of a floating-point literal or decimal integer followed by the lower-case letter i. For example A rune literal represents a rune constant, an integer value identifying a Unicode code point. A rune literal is expressed as one or more characters enclosed in single quotes. Within the quotes, any character may appear except single quote and newline. A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats. The simplest form represents the single character within the quotes; since QL statements are Unicode characters encoded in UTF-8, multiple UTF-8-encoded bytes may represent a single integer value. For instance, the literal 'a' holds a single byte representing a literal a, Unicode U+0061, value 0x61, while 'ä' holds two bytes (0xc3 0xa4) representing a literal a-dieresis, U+00E4, value 0xe4. Several backslash escapes allow arbitrary values to be encoded as ASCII text. There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base. Although these representations all result in an integer, they have different valid ranges. Octal escapes must represent a value between 0 and 255 inclusive. Hexadecimal escapes satisfy this condition by construction. The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves. After a backslash, certain single-character escapes represent special values All other sequences starting with a backslash are illegal inside rune literals. For example A string literal represents a string constant obtained from concatenating a sequence of characters. There are two forms: raw string literals and interpreted string literals. Raw string literals are character sequences between back quotes “. Within the quotes, any character is legal except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines. Carriage returns inside raw string literals are discarded from the raw string value. Interpreted string literals are character sequences between double quotes "". The text between the quotes, which may not contain newlines, forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF. For example These examples all represent the same string If the statement source represents a character as two code points, such as a combining form involving an accent and a letter, the result will be an error if placed in a rune literal (it is not a single code point), and will appear as two code points if placed in a string literal. Literals are assigned their values from the respective text representation at "compile" (parse) time. QL parameters provide the same functionality as literals, but their value is assigned at execution time from an expression list passed to DB.Run or DB.Execute. Using '?' or '$' is completely equivalent. For example Keywords 'false' and 'true' (not case sensitive) represent the two possible constant values of type bool (also not case sensitive). Keyword 'NULL' (not case sensitive) represents an untyped constant which is assignable to any type. NULL is distinct from any other value of any type. A type determines the set of values and operations specific to values of that type. A type is specified by a type name. Named instances of the boolean, numeric, and string types are keywords. The names are not case sensitive. Note: The blob type is exchanged between the back end and the API as []byte. On 32 bit platforms this limits the size which the implementation can handle to 2G. A boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false. The predeclared boolean type is bool. A duration type represents the elapsed time between two instants as an int64 nanosecond count. The representation limits the largest representable duration to approximately 290 years. A numeric type represents sets of integer or floating-point values. The predeclared architecture-independent numeric types are The value of an n-bit integer is n bits wide and represented using two's complement arithmetic. Conversions are required when different numeric types are mixed in an expression or assignment. A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. The case insensitive keyword for the string type is 'string'. The length of a string (its size in bytes) can be discovered using the built-in function len. A time type represents an instant in time with nanosecond precision. Each time has associated with it a location, consulted when computing the presentation form of the time. The following functions are implicitly declared An expression specifies the computation of a value by applying operators and functions to operands. Operands denote the elementary values in an expression. An operand may be a literal, a (possibly qualified) identifier denoting a constant or a function or a table/record set column, or a parenthesized expression. A qualified identifier is an identifier qualified with a table/record set name prefix. For example Primary expression are the operands for unary and binary expressions. For example A primary expression of the form denotes the element of a string indexed by x. Its type is byte. The value x is called the index. The following rules apply - The index x must be of integer type except bigint or duration; it is in range if 0 <= x < len(s), otherwise it is out of range. - A constant index must be non-negative and representable by a value of type int. - A constant index must be in range if the string a is a literal. - If x is out of range at run time, a run-time error occurs. - s[x] is the byte at index x and the type of s[x] is byte. If s is NULL or x is NULL then the result is NULL. Otherwise s[x] is illegal. For a string, the primary expression constructs a substring. The indices low and high select which elements appear in the result. The result has indices starting at 0 and length equal to high - low. For convenience, any of the indices may be omitted. A missing low index defaults to zero; a missing high index defaults to the length of the sliced operand The indices low and high are in range if 0 <= low <= high <= len(a), otherwise they are out of range. A constant index must be non-negative and representable by a value of type int. If both indices are constant, they must satisfy low <= high. If the indices are out of range at run time, a run-time error occurs. Integer values of type bigint or duration cannot be used as indices. If s is NULL the result is NULL. If low or high is not omitted and is NULL then the result is NULL. Given an identifier f denoting a predeclared function, calls f with arguments a1, a2, … an. Arguments are evaluated before the function is called. The type of the expression is the result type of f. In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return value of the function is passed by value when the function returns. Calling an undefined function causes a compile-time error. Operators combine operands into expressions. Comparisons are discussed elsewhere. For other binary operators, the operand types must be identical unless the operation involves shifts or untyped constants. For operations involving constants only, see the section on constant expressions. Except for shift operations, if one operand is an untyped constant and the other operand is not, the constant is converted to the type of the other operand. The right operand in a shift expression must have unsigned integer type or be an untyped constant that can be converted to unsigned integer type. If the left operand of a non-constant shift expression is an untyped constant, the type of the constant is what it would be if the shift expression were replaced by its left operand alone. Expressions of the form yeild a boolean value true if expr2, a regular expression, matches expr1 (see also [6]). Both expression must be of type string. If any one of the expressions is NULL the result is NULL. Predicates are special form expressions having a boolean result type. Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be comparable as defined in "Comparison operators". Another form of the IN predicate creates the expression list from a result of a SelectStmt. The SelectStmt must select only one column. The produced expression list is resource limited by the memory available to the process. NULL values produced by the SelectStmt are ignored, but if all records of the SelectStmt are NULL the predicate yields NULL. The select statement is evaluated only once. If the type of expr is not the same as the type of the field returned by the SelectStmt then the set operation yields false. The type of the column returned by the SelectStmt must be one of the simple (non blob-like) types: Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be ordered as defined in "Comparison operators". Expressions of the form yeild a boolean value true if expr does not have a specific type (case A) or if expr has a specific type (case B). In other cases the result is a boolean value false. Unary operators have the highest precedence. There are five precedence levels for binary operators. Multiplication operators bind strongest, followed by addition operators, comparison operators, && (logical AND), and finally || (logical OR) Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z. Note that the operator precedence is reflected explicitly by the grammar. Arithmetic operators apply to numeric values and yield a result of the same type as the first operand. The four standard arithmetic operators (+, -, *, /) apply to integer, rational, floating-point, and complex types; + also applies to strings; +,- also applies to times. All other arithmetic operators apply to integers only. sum integers, rationals, floats, complex values, strings difference integers, rationals, floats, complex values, times product integers, rationals, floats, complex values / quotient integers, rationals, floats, complex values % remainder integers & bitwise AND integers | bitwise OR integers ^ bitwise XOR integers &^ bit clear (AND NOT) integers << left shift integer << unsigned integer >> right shift integer >> unsigned integer Strings can be concatenated using the + operator String addition creates a new string by concatenating the operands. A value of type duration can be added to or subtracted from a value of type time. Times can subtracted from each other producing a value of type duration. For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships with x / y truncated towards zero ("truncated division"). As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0). If the divisor is a constant expression, it must not be zero. If the divisor is zero at run time, a run-time error occurs. If the dividend is non-negative and the divisor is a constant power of 2, the division may be replaced by a right shift, and computing the remainder may be replaced by a bitwise AND operation The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. There is no upper limit on the shift count. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity. For integer operands, the unary operators +, -, and ^ are defined as follows For floating-point and complex numbers, +x is the same as x, while -x is the negation of x. The result of a floating-point or complex division by zero is not specified beyond the IEEE-754 standard; whether a run-time error occurs is implementation-specific. Whenever any operand of any arithmetic operation, unary or binary, is NULL, as well as in the case of the string concatenating operation, the result is NULL. For unsigned integer values, the operations +, -, *, and << are computed modulo 2n, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and expressions may rely on “wrap around”. For signed integers with a finite bit width, the operations +, -, *, and << may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow. An evaluator may not optimize an expression under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true. Integers of type bigint and rationals do not overflow but their handling is limited by the memory resources available to the program. Comparison operators compare two operands and yield a boolean value. In any comparison, the first operand must be of same type as is the second operand, or vice versa. The equality operators == and != apply to operands that are comparable. The ordering operators <, <=, >, and >= apply to operands that are ordered. These terms and the result of the comparisons are defined as follows - Boolean values are comparable. Two boolean values are equal if they are either both true or both false. - Complex values are comparable. Two complex values u and v are equal if both real(u) == real(v) and imag(u) == imag(v). - Integer values are comparable and ordered, in the usual way. Note that durations are integers. - Floating point values are comparable and ordered, as defined by the IEEE-754 standard. - Rational values are comparable and ordered, in the usual way. - String values are comparable and ordered, lexically byte-wise. - Time values are comparable and ordered. Whenever any operand of any comparison operation is NULL, the result is NULL. Note that slices are always of type string. Logical operators apply to boolean values and yield a boolean result. The right operand is evaluated conditionally. The truth tables for logical operations with NULL values Conversions are expressions of the form T(x) where T is a type and x is an expression that can be converted to type T. A constant value x can be converted to type T in any of these cases: - x is representable by a value of type T. - x is a floating-point constant, T is a floating-point type, and x is representable by a value of type T after rounding using IEEE 754 round-to-even rules. The constant T(x) is the rounded value. - x is an integer constant and T is a string type. The same rule as for non-constant x applies in this case. Converting a constant yields a typed constant as result. A non-constant value x can be converted to type T in any of these cases: - x has type T. - x's type and T are both integer or floating point types. - x's type and T are both complex types. - x is an integer, except bigint or duration, and T is a string type. Specific rules apply to (non-constant) conversions between numeric types or to and from a string type. These conversions may change the representation of x and incur a run-time cost. All other conversions only change the type but not the representation of x. A conversion of NULL to any type yields NULL. For the conversion of non-constant numeric values, the following rules apply 1. When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. It is then truncated to fit in the result type's size. For example, if v == uint16(0x10F0), then uint32(int8(v)) == 0xFFFFFFF0. The conversion always yields a valid value; there is no indication of overflow. 2. When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero). 3. When converting an integer or floating-point number to a floating-point type, or a complex number to another complex type, the result value is rounded to the precision specified by the destination type. For instance, the value of a variable x of type float32 may be stored using additional precision beyond that of an IEEE-754 32-bit number, but float32(x) represents the result of rounding x's value to 32-bit precision. Similarly, x + 0.1 may use more than 32 bits of precision, but float32(x + 0.1) does not. In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent. 1. Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD". 2. Converting a blob to a string type yields a string whose successive bytes are the elements of the blob. 3. Converting a value of a string type to a blob yields a blob whose successive elements are the bytes of the string. 4. Converting a value of a bigint type to a string yields a string containing the decimal decimal representation of the integer. 5. Converting a value of a string type to a bigint yields a bigint value containing the integer represented by the string value. A prefix of “0x” or “0X” selects base 16; the “0” prefix selects base 8, and a “0b” or “0B” prefix selects base 2. Otherwise the value is interpreted in base 10. An error occurs if the string value is not in any valid format. 6. Converting a value of a rational type to a string yields a string containing the decimal decimal representation of the rational in the form "a/b" (even if b == 1). 7. Converting a value of a string type to a bigrat yields a bigrat value containing the rational represented by the string value. The string can be given as a fraction "a/b" or as a floating-point number optionally followed by an exponent. An error occurs if the string value is not in any valid format. 8. Converting a value of a duration type to a string returns a string representing the duration in the form "72h3m0.5s". Leading zero units are omitted. As a special case, durations less than one second format using a smaller unit (milli-, micro-, or nanoseconds) to ensure that the leading digit is non-zero. The zero duration formats as 0, with no unit. 9. Converting a string value to a duration yields a duration represented by the string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h". 10. Converting a time value to a string returns the time formatted using the format string When evaluating the operands of an expression or of function calls, operations are evaluated in lexical left-to-right order. For example, in the evaluation of the function calls and evaluation of c happen in the order h(), i(), j(), c. Floating-point operations within a single expression are evaluated according to the associativity of the operators. Explicit parentheses affect the evaluation by overriding the default associativity. In the expression x + (y + z) the addition y + z is performed before adding x. Statements control execution. The empty statement does nothing. Alter table statements modify existing tables. With the ADD clause it adds a new column to the table. The column must not exist. With the DROP clause it removes an existing column from a table. The column must exist and it must be not the only (last) column of the table. IOW, there cannot be a table with no columns. For example When adding a column to a table with existing data, the constraint clause of the ColumnDef cannot be used. Adding a constrained column to an empty table is fine. Begin transactions statements introduce a new transaction level. Every transaction level must be eventually balanced by exactly one of COMMIT or ROLLBACK statements. Note that when a transaction is roll-backed because of a statement failure then no explicit balancing of the respective BEGIN TRANSACTION is statement is required nor permitted. Failure to properly balance any opened transaction level may cause dead locks and/or lose of data updated in the uppermost opened but never properly closed transaction level. For example A database cannot be updated (mutated) outside of a transaction. Statements requiring a transaction A database is effectively read only outside of a transaction. Statements not requiring a transaction The commit statement closes the innermost transaction nesting level. If that's the outermost level then the updates to the DB made by the transaction are atomically made persistent. For example Create index statements create new indices. Index is a named projection of ordered values of a table column to the respective records. As a special case the id() of the record can be indexed. Index name must not be the same as any of the existing tables and it also cannot be the same as of any column name of the table the index is on. For example Now certain SELECT statements may use the indices to speed up joins and/or to speed up record set filtering when the WHERE clause is used; or the indices might be used to improve the performance when the ORDER BY clause is present. The UNIQUE modifier requires the indexed values tuple to be index-wise unique or have all values NULL. The optional IF NOT EXISTS clause makes the statement a no operation if the index already exists. A simple index consists of only one expression which must be either a column name or the built-in id(). A more complex and more general index is one that consists of more than one expression or its single expression does not qualify as a simple index. In this case the type of all expressions in the list must be one of the non blob-like types. Note: Blob-like types are blob, bigint, bigrat, time and duration. Create table statements create new tables. A column definition declares the column name and type. Table names and column names are case sensitive. Neither a table or an index of the same name may exist in the DB. For example The optional IF NOT EXISTS clause makes the statement a no operation if the table already exists. The optional constraint clause has two forms. The first one is found in many SQL dialects. This form prevents the data in column DepartmentName to be NULL. The second form allows an arbitrary boolean expression to be used to validate the column. If the value of the expression is true then the validation succeeded. If the value of the expression is false or NULL then the validation fails. If the value of the expression is not of type bool an error occurs. The optional DEFAULT clause is an expression which, if present, is substituted instead of a NULL value when the colum is assigned a value. Note that the constraint and/or default expressions may refer to other columns by name: When a table row is inserted by the INSERT INTO statement or when a table row is updated by the UPDATE statement, the order of operations is as follows: 1. The new values of the affected columns are set and the values of all the row columns become the named values which can be referred to in default expressions evaluated in step 2. 2. If any row column value is NULL and the DEFAULT clause is present in the column's definition, the default expression is evaluated and its value is set as the respective column value. 3. The values, potentially updated, of row columns become the named values which can be referred to in constraint expressions evaluated during step 4. 4. All row columns which definition has the constraint clause present will have that constraint checked. If any constraint violation is detected, the overall operation fails and no changes to the table are made. Delete from statements remove rows from a table, which must exist. For example If the WHERE clause is not present then all rows are removed and the statement is equivalent to the TRUNCATE TABLE statement. Drop index statements remove indices from the DB. The index must exist. For example The optional IF EXISTS clause makes the statement a no operation if the index does not exist. Drop table statements remove tables from the DB. The table must exist. For example The optional IF EXISTS clause makes the statement a no operation if the table does not exist. Insert into statements insert new rows into tables. New rows come from literal data, if using the VALUES clause, or are a result of select statement. In the later case the select statement is fully evaluated before the insertion of any rows is performed, allowing to insert values calculated from the same table rows are to be inserted into. If the ColumnNameList part is omitted then the number of values inserted in the row must be the same as are columns in the table. If the ColumnNameList part is present then the number of values per row must be same as the same number of column names. All other columns of the record are set to NULL. The type of the value assigned to a column must be the same as is the column's type or the value must be NULL. For example If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. Explain statement produces a recordset consisting of lines of text which describe the execution plan of a statement, if any. For example, the QL tool treats the explain statement specially and outputs the joined lines: The explanation may aid in uderstanding how a statement/query would be executed and if indices are used as expected - or which indices may possibly improve the statement performance. The create index statements above were directly copy/pasted in the terminal from the suggestions provided by the filter recordset pipeline part returned by the explain statement. If the statement has nothing special in its plan, the result is the original statement. To get an explanation of the select statement of the IN predicate, use the EXPLAIN statement with that particular select statement. The rollback statement closes the innermost transaction nesting level discarding any updates to the DB made by it. If that's the outermost level then the effects on the DB are as if the transaction never happened. For example The (temporary) record set from the last statement is returned and can be processed by the client. In this case the rollback is the same as 'DROP TABLE tmp;' but it can be a more complex operation. Select from statements produce recordsets. The optional DISTINCT modifier ensures all rows in the result recordset are unique. Either all of the resulting fields are returned ('*') or only those named in FieldList. RecordSetList is a list of table names or parenthesized select statements, optionally (re)named using the AS clause. The result can be filtered using a WhereClause and orderd by the OrderBy clause. For example If Recordset is a nested, parenthesized SelectStmt then it must be given a name using the AS clause if its field are to be accessible in expressions. A field is an named expression. Identifiers, not used as a type in conversion or a function name in the Call clause, denote names of (other) fields, values of which should be used in the expression. The expression can be named using the AS clause. If the AS clause is not present and the expression consists solely of a field name, then that field name is used as the name of the resulting field. Otherwise the field is unnamed. For example The SELECT statement can optionally enumerate the desired/resulting fields in a list. No two identical field names can appear in the list. When more than one record set is used in the FROM clause record set list, the result record set field names are rewritten to be qualified using the record set names. If a particular record set doesn't have a name, its respective fields became unnamed. The optional JOIN clause, for example is mostly equal to except that the rows from a which, when they appear in the cross join, never made expr to evaluate to true, are combined with a virtual row from b, containing all nulls, and added to the result set. For the RIGHT JOIN variant the discussed rules are used for rows from b not satisfying expr == true and the virtual, all-null row "comes" from a. The FULL JOIN adds the respective rows which would be otherwise provided by the separate executions of the LEFT JOIN and RIGHT JOIN variants. For more thorough OUTER JOIN discussion please see the Wikipedia article at [10]. Resultins rows of a SELECT statement can be optionally ordered by the ORDER BY clause. Collating proceeds by considering the expressions in the expression list left to right until a collating order is determined. Any possibly remaining expressions are not evaluated. All of the expression values must yield an ordered type or NULL. Ordered types are defined in "Comparison operators". Collating of elements having a NULL value is different compared to what the comparison operators yield in expression evaluation (NULL result instead of a boolean value). Below, T denotes a non NULL value of any QL type. NULL collates before any non NULL value (is considered smaller than T). Two NULLs have no collating order (are considered equal). The WHERE clause restricts records considered by some statements, like SELECT FROM, DELETE FROM, or UPDATE. It is an error if the expression evaluates to a non null value of non bool type. The GROUP BY clause is used to project rows having common values into a smaller set of rows. For example Using the GROUP BY without any aggregate functions in the selected fields is in certain cases equal to using the DISTINCT modifier. The last two examples above produce the same resultsets. The optional OFFSET clause allows to ignore first N records. For example The above will produce only rows 11, 12, ... of the record set, if they exist. The value of the expression must a non negative integer, but not bigint or duration. The optional LIMIT clause allows to ignore all but first N records. For example The above will return at most the first 10 records of the record set. The value of the expression must a non negative integer, but not bigint or duration. The LIMIT and OFFSET clauses can be combined. For example Considering table t has, say 10 records, the above will produce only records 4 - 8. After returning record #8, no more result rows/records are computed. 1. The FROM clause is evaluated, producing a Cartesian product of its source record sets (tables or nested SELECT statements). 2. If present, the JOIN cluase is evaluated on the result set of the previous evaluation and the recordset specified by the JOIN clause. (... JOIN Recordset ON ...) 3. If present, the WHERE clause is evaluated on the result set of the previous evaluation. 4. If present, the GROUP BY clause is evaluated on the result set of the previous evaluation(s). 5. The SELECT field expressions are evaluated on the result set of the previous evaluation(s). 6. If present, the DISTINCT modifier is evaluated on the result set of the previous evaluation(s). 7. If present, the ORDER BY clause is evaluated on the result set of the previous evaluation(s). 8. If present, the OFFSET clause is evaluated on the result set of the previous evaluation(s). The offset expression is evaluated once for the first record produced by the previous evaluations. 9. If present, the LIMIT clause is evaluated on the result set of the previous evaluation(s). The limit expression is evaluated once for the first record produced by the previous evaluations. Truncate table statements remove all records from a table. The table must exist. For example Update statements change values of fields in rows of a table. For example Note: The SET clause is optional. If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. To allow to query for DB meta data, there exist specially named tables, some of them being virtual. Note: Virtual system tables may have fake table-wise unique but meaningless and unstable record IDs. Do not apply the built-in id() to any system table. The table __Table lists all tables in the DB. The schema is The Schema column returns the statement to (re)create table Name. This table is virtual. The table __Colum lists all columns of all tables in the DB. The schema is The Ordinal column defines the 1-based index of the column in the record. This table is virtual. The table __Colum2 lists all columns of all tables in the DB which have the constraint NOT NULL or which have a constraint expression defined or which have a default expression defined. The schema is It's possible to obtain a consolidated recordset for all properties of all DB columns using The Name column is the column name in TableName. The table __Index lists all indices in the DB. The schema is The IsUnique columns reflects if the index was created using the optional UNIQUE clause. This table is virtual. Built-in functions are predeclared. The built-in aggregate function avg returns the average of values of an expression. Avg ignores NULL values, but returns NULL if all values of a column are NULL or if avg is applied to an empty record set. The column values must be of a numeric type. The built-in function contains returns true if substr is within s. If any argument to contains is NULL the result is NULL. The built-in aggregate function count returns how many times an expression has a non NULL values or the number of rows in a record set. Note: count() returns 0 for an empty record set. For example Date returns the time corresponding to in the appropriate zone for that time in the given location. The month, day, hour, min, sec, and nsec values may be outside their usual ranges and will be normalized during the conversion. For example, October 32 converts to November 1. A daylight savings time transition skips or repeats times. For example, in the United States, March 13, 2011 2:15am never occurred, while November 6, 2011 1:15am occurred twice. In such cases, the choice of time zone, and therefore the time, is not well-defined. Date returns a time that is correct in one of the two zones involved in the transition, but it does not guarantee which. A location maps time instants to the zone in use at that time. Typically, the location represents the collection of time offsets in use in a geographical area, such as "CEST" and "CET" for central Europe. "local" represents the system's local time zone. "UTC" represents Universal Coordinated Time (UTC). The month specifies a month of the year (January = 1, ...). If any argument to date is NULL the result is NULL. The built-in function day returns the day of the month specified by t. If the argument to day is NULL the result is NULL. The built-in function formatTime returns a textual representation of the time value formatted according to layout, which defines the format by showing how the reference time, would be displayed if it were the value; it serves as an example of the desired output. The same display rules will then be applied to the time value. If any argument to formatTime is NULL the result is NULL. NOTE: The string value of the time zone, like "CET" or "ACDT", is dependent on the time zone of the machine the function is run on. For example, if the t value is in "CET", but the machine is in "ACDT", instead of "CET" the result is "+0100". This is the same what Go (time.Time).String() returns and in fact formatTime directly calls t.String(). returns on a machine in the CET time zone, but may return on a machine in the ACDT zone. The time value is in both cases the same so its ordering and comparing is correct. Only the display value can differ. The built-in functions formatFloat and formatInt format numbers to strings using go's number format functions in the `strconv` package. For all three functions, only the first argument is mandatory. The default values of the rest are shown in the examples. If the first argument is NULL, the result is NULL. returns returns returns Unlike the `strconv` equivalent, the formatInt function handles all integer types, both signed and unsigned. The built-in function hasPrefix tests whether the string s begins with prefix. If any argument to hasPrefix is NULL the result is NULL. The built-in function hasSuffix tests whether the string s ends with suffix. If any argument to hasSuffix is NULL the result is NULL. The built-in function hour returns the hour within the day specified by t, in the range [0, 23]. If the argument to hour is NULL the result is NULL. The built-in function hours returns the duration as a floating point number of hours. If the argument to hours is NULL the result is NULL. The built-in function id takes zero or one arguments. If no argument is provided, id() returns a table-unique automatically assigned numeric identifier of type int. Ids of deleted records are not reused unless the DB becomes completely empty (has no tables). For example If id() without arguments is called for a row which is not a table record then the result value is NULL. For example If id() has one argument it must be a table name of a table in a cross join. For example The built-in function len takes a string argument and returns the lentgh of the string in bytes. The expression len(s) is constant if s is a string constant. If the argument to len is NULL the result is NULL. The built-in aggregate function max returns the largest value of an expression in a record set. Max ignores NULL values, but returns NULL if all values of a column are NULL or if max is applied to an empty record set. The expression values must be of an ordered type. For example The built-in aggregate function min returns the smallest value of an expression in a record set. Min ignores NULL values, but returns NULL if all values of a column are NULL or if min is applied to an empty record set. For example The column values must be of an ordered type. The built-in function minute returns the minute offset within the hour specified by t, in the range [0, 59]. If the argument to minute is NULL the result is NULL. The built-in function minutes returns the duration as a floating point number of minutes. If the argument to minutes is NULL the result is NULL. The built-in function month returns the month of the year specified by t (January = 1, ...). If the argument to month is NULL the result is NULL. The built-in function nanosecond returns the nanosecond offset within the second specified by t, in the range [0, 999999999]. If the argument to nanosecond is NULL the result is NULL. The built-in function nanoseconds returns the duration as an integer nanosecond count. If the argument to nanoseconds is NULL the result is NULL. The built-in function now returns the current local time. The built-in function parseTime parses a formatted string and returns the time value it represents. The layout defines the format by showing how the reference time, would be interpreted if it were the value; it serves as an example of the input format. The same interpretation will then be made to the input string. Elements omitted from the value are assumed to be zero or, when zero is impossible, one, so parsing "3:04pm" returns the time corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is 0, this time is before the zero Time). Years must be in the range 0000..9999. The day of the week is checked for syntax but it is otherwise ignored. In the absence of a time zone indicator, parseTime returns a time in UTC. When parsing a time with a zone offset like -0700, if the offset corresponds to a time zone used by the current location, then parseTime uses that location and zone in the returned time. Otherwise it records the time as being in a fabricated location with time fixed at the given zone offset. When parsing a time with a zone abbreviation like MST, if the zone abbreviation has a defined offset in the current location, then that offset is used. The zone abbreviation "UTC" is recognized as UTC regardless of location. If the zone abbreviation is unknown, Parse records the time as being in a fabricated location with the given zone abbreviation and a zero offset. This choice means that such a time can be parses and reformatted with the same layout losslessly, but the exact instant used in the representation will differ by the actual zone offset. To avoid such problems, prefer time layouts that use a numeric zone offset. If any argument to parseTime is NULL the result is NULL. The built-in function second returns the second offset within the minute specified by t, in the range [0, 59]. If the argument to second is NULL the result is NULL. The built-in function seconds returns the duration as a floating point number of seconds. If the argument to seconds is NULL the result is NULL. The built-in function since returns the time elapsed since t. It is shorthand for now()-t. If the argument to since is NULL the result is NULL. The built-in aggregate function sum returns the sum of values of an expression for all rows of a record set. Sum ignores NULL values, but returns NULL if all values of a column are NULL or if sum is applied to an empty record set. The column values must be of a numeric type. The built-in function timeIn returns t with the location information set to loc. For discussion of the loc argument please see date(). If any argument to timeIn is NULL the result is NULL. The built-in function weekday returns the day of the week specified by t. Sunday == 0, Monday == 1, ... If the argument to weekday is NULL the result is NULL. The built-in function year returns the year in which t occurs. If the argument to year is NULL the result is NULL. The built-in function yearDay returns the day of the year specified by t, in the range [1,365] for non-leap years, and [1,366] in leap years. If the argument to yearDay is NULL the result is NULL. Three functions assemble and disassemble complex numbers. The built-in function complex constructs a complex value from a floating-point real and imaginary part, while real and imag extract the real and imaginary parts of a complex value. The type of the arguments and return value correspond. For complex, the two arguments must be of the same floating-point type and the return type is the complex type with the corresponding floating-point constituents: complex64 for float32, complex128 for float64. The real and imag functions together form the inverse, so for a complex value z, z == complex(real(z), imag(z)). If the operands of these functions are all constants, the return value is a constant. If any argument to any of complex, real, imag functions is NULL the result is NULL. For the numeric types, the following sizes are guaranteed Portions of this specification page are modifications based on work[2] created and shared by Google[3] and used according to terms described in the Creative Commons 3.0 Attribution License[4]. This specification is licensed under the Creative Commons Attribution 3.0 License, and code is licensed under a BSD license[5]. Links from the above documentation This section is not part of the specification. WARNING: The implementation of indices is new and it surely needs more time to become mature. Indices are used currently used only by the WHERE clause. The following expression patterns of 'WHERE expression' are recognized and trigger index use. The relOp is one of the relation operators <, <=, ==, >=, >. For the equality operator both operands must be of comparable types. For all other operators both operands must be of ordered types. The constant expression is a compile time constant expression. Some constant folding is still a TODO. Parameter is a QL parameter ($1 etc.). Consider tables t and u, both with an indexed field f. The WHERE expression doesn't comply with the above simple detected cases. However, such query is now automatically rewritten to which will use both of the indices. The impact of using the indices can be substantial (cf. BenchmarkCrossJoin*) if the resulting rows have low "selectivity", ie. only few rows from both tables are selected by the respective WHERE filtering. Note: Existing QL DBs can be used and indices can be added to them. However, once any indices are present in the DB, the old QL versions cannot work with such DB anymore. Running a benchmark with -v (-test.v) outputs information about the scale used to report records/s and a brief description of the benchmark. For example Running the full suite of benchmarks takes a lot of time. Use the -timeout flag to avoid them being killed after the default time limit (10 minutes).

v1.0.1 • 9 years ago

github.com/fanyang01/ql

Package ql implements a pure Go embedded SQL database engine. QL is a member of the SQL family of languages. It is less complex and less powerful than SQL (whichever specification SQL is considered to be). 2017-01-10: Release v1.1.0 fixes some bugs and adds a configurable WAL headroom. 2016-07-29: Release v1.0.6 enables alternatively using = instead of == for equality operation. 2016-07-11: Release v1.0.5 undoes vendoring of lldb. QL now uses stable lldb (github.com/cznic/lldb). 2016-07-06: Release v1.0.4 fixes a panic when closing the WAL file. 2016-04-03: Release v1.0.3 fixes a data race. 2016-03-23: Release v1.0.2 vendors github.com/cznic/exp/lldb and github.com/camlistore/go4/lock. 2016-03-17: Release v1.0.1 adjusts for latest goyacc. Parser error messages are improved and changed, but their exact form is not considered a API change. 2016-03-05: The current version has been tagged v1.0.0. 2015-06-15: To improve compatibility with other SQL implementations, the count built-in aggregate function now accepts * as its argument. 2015-05-29: The execution planner was rewritten from scratch. It should use indices in all places where they were used before plus in some additional situations. It is possible to investigate the plan using the newly added EXPLAIN statement. The QL tool is handy for such analysis. If the planner would have used an index, but no such exists, the plan includes hints in form of copy/paste ready CREATE INDEX statements. The planner is still quite simple and a lot of work on it is yet ahead. You can help this process by filling an issue with a schema and query which fails to use an index or indices when it should, in your opinion. Bonus points for including output of `ql 'explain <query>'`. 2015-05-09: The grammar of the CREATE INDEX statement now accepts an expression list instead of a single expression, which was further limited to just a column name or the built-in id(). As a side effect, composite indices are now functional. However, the values in the expression-list style index are not yet used by other statements or the statement/query planner. The composite index is useful while having UNIQUE clause to check for semantically duplicate rows before they get added to the table or when such a row is mutated using the UPDATE statement and the expression-list style index tuple of the row is thus recomputed. 2015-05-02: The Schema field of table __Table now correctly reflects any column constraints and/or defaults. Also, the (*DB).Info method now has that information provided in new ColumInfo fields NotNull, Constraint and Default. 2015-04-20: Added support for {LEFT,RIGHT,FULL} [OUTER] JOIN. 2015-04-18: Column definitions can now have constraints and defaults. Details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. 2015-03-06: New built-in functions formatFloat and formatInt. Thanks urandom! (https://github.com/urandom) 2015-02-16: IN predicate now accepts a SELECT statement. See the updated "Predicates" section. 2015-01-17: Logical operators || and && have now alternative spellings: OR and AND (case insensitive). AND was a keyword before, but OR is a new one. This can possibly break existing queries. For the record, it's a good idea to not use any name appearing in, for example, [7] in your queries as the list of QL's keywords may expand for gaining better compatibility with existing SQL "standards". 2015-01-12: ACID guarantees were tightened at the cost of performance in some cases. The write collecting window mechanism, a formerly used implementation detail, was removed. Inserting rows one by one in a transaction is now slow. I mean very slow. Try to avoid inserting single rows in a transaction. Instead, whenever possible, perform batch updates of tens to, say thousands of rows in a single transaction. See also: http://www.sqlite.org/faq.html#q19, the discussed synchronization principles involved are the same as for QL, modulo minor details. Note: A side effect is that closing a DB before exiting an application, both for the Go API and through database/sql driver, is no more required, strictly speaking. Beware that exiting an application while there is an open (uncommitted) transaction in progress means losing the transaction data. However, the DB will not become corrupted because of not closing it. Nor that was the case before, but formerly failing to close a DB could have resulted in losing the data of the last transaction. 2014-09-21: id() now optionally accepts a single argument - a table name. 2014-09-01: Added the DB.Flush() method and the LIKE pattern matching predicate. 2014-08-08: The built in functions max and min now accept also time values. Thanks opennota! (https://github.com/opennota) 2014-06-05: RecordSet interface extended by new methods FirstRow and Rows. 2014-06-02: Indices on id() are now used by SELECT statements. 2014-05-07: Introduction of Marshal, Schema, Unmarshal. 2014-04-15: Added optional IF NOT EXISTS clause to CREATE INDEX and optional IF EXISTS clause to DROP INDEX. 2014-04-12: The column Unique in the virtual table __Index was renamed to IsUnique because the old name is a keyword. Unfortunately, this is a breaking change, sorry. 2014-04-11: Introduction of LIMIT, OFFSET. 2014-04-10: Introduction of query rewriting. 2014-04-07: Introduction of indices. QL imports zappy[8], a block-based compressor, which speeds up its performance by using a C version of the compression/decompression algorithms. If a CGO-free (pure Go) version of QL, or an app using QL, is required, please include 'purego' in the -tags option of go {build,get,install}. For example: If zappy was installed before installing QL, it might be necessary to rebuild zappy first (or rebuild QL with all its dependencies using the -a option): The syntax is specified using Extended Backus-Naur Form (EBNF) Lower-case production names are used to identify lexical tokens. Non-terminals are in CamelCase. Lexical tokens are enclosed in double quotes "" or back quotes “. The form a … b represents the set of characters from a through b as alternatives. The horizontal ellipsis … is also used elsewhere in the spec to informally denote various enumerations or code snippets that are not further specified. QL source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points. For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text. Each code point is distinct; for instance, upper and lower case letters are different characters. Implementation restriction: For compatibility with other tools, the parser may disallow the NUL character (U+0000) in the statement. Implementation restriction: A byte order mark is disallowed anywhere in QL statements. The following terms are used to denote specific character classes The underscore character _ (U+005F) is considered a letter. Lexical elements are comments, tokens, identifiers, keywords, operators and delimiters, integer, floating-point, imaginary, rune and string literals and QL parameters. Line comments start with the character sequence // or -- and stop at the end of the line. A line comment acts like a space. General comments start with the character sequence /* and continue through the character sequence */. A general comment acts like a space. Comments do not nest. Tokens form the vocabulary of QL. There are four classes: identifiers, keywords, operators and delimiters, and literals. White space, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. The formal grammar uses semicolons ";" as separators of QL statements. A single QL statement or the last QL statement in a list of statements can have an optional semicolon terminator. (Actually a separator from the following empty statement.) Identifiers name entities such as tables or record set columns. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter. For example No identifiers are predeclared, however note that no keyword can be used as an identifier. Identifiers starting with two underscores are used for meta data virtual tables names. For forward compatibility, users should generally avoid using any identifiers starting with two underscores. For example The following keywords are reserved and may not be used as identifiers. Keywords are not case sensitive. The following character sequences represent operators, delimiters, and other special tokens Operators consisting of more than one character are referred to by names in the rest of the documentation An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0 for octal, 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15. For example A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided. For example An imaginary literal is a decimal representation of the imaginary part of a complex constant. It consists of a floating-point literal or decimal integer followed by the lower-case letter i. For example A rune literal represents a rune constant, an integer value identifying a Unicode code point. A rune literal is expressed as one or more characters enclosed in single quotes. Within the quotes, any character may appear except single quote and newline. A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats. The simplest form represents the single character within the quotes; since QL statements are Unicode characters encoded in UTF-8, multiple UTF-8-encoded bytes may represent a single integer value. For instance, the literal 'a' holds a single byte representing a literal a, Unicode U+0061, value 0x61, while 'ä' holds two bytes (0xc3 0xa4) representing a literal a-dieresis, U+00E4, value 0xe4. Several backslash escapes allow arbitrary values to be encoded as ASCII text. There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base. Although these representations all result in an integer, they have different valid ranges. Octal escapes must represent a value between 0 and 255 inclusive. Hexadecimal escapes satisfy this condition by construction. The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves. After a backslash, certain single-character escapes represent special values All other sequences starting with a backslash are illegal inside rune literals. For example A string literal represents a string constant obtained from concatenating a sequence of characters. There are two forms: raw string literals and interpreted string literals. Raw string literals are character sequences between back quotes “. Within the quotes, any character is legal except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines. Carriage returns inside raw string literals are discarded from the raw string value. Interpreted string literals are character sequences between double quotes "". The text between the quotes, which may not contain newlines, forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF. For example These examples all represent the same string If the statement source represents a character as two code points, such as a combining form involving an accent and a letter, the result will be an error if placed in a rune literal (it is not a single code point), and will appear as two code points if placed in a string literal. Literals are assigned their values from the respective text representation at "compile" (parse) time. QL parameters provide the same functionality as literals, but their value is assigned at execution time from an expression list passed to DB.Run or DB.Execute. Using '?' or '$' is completely equivalent. For example Keywords 'false' and 'true' (not case sensitive) represent the two possible constant values of type bool (also not case sensitive). Keyword 'NULL' (not case sensitive) represents an untyped constant which is assignable to any type. NULL is distinct from any other value of any type. A type determines the set of values and operations specific to values of that type. A type is specified by a type name. Named instances of the boolean, numeric, and string types are keywords. The names are not case sensitive. Note: The blob type is exchanged between the back end and the API as []byte. On 32 bit platforms this limits the size which the implementation can handle to 2G. A boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false. The predeclared boolean type is bool. A duration type represents the elapsed time between two instants as an int64 nanosecond count. The representation limits the largest representable duration to approximately 290 years. A numeric type represents sets of integer or floating-point values. The predeclared architecture-independent numeric types are The value of an n-bit integer is n bits wide and represented using two's complement arithmetic. Conversions are required when different numeric types are mixed in an expression or assignment. A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. The case insensitive keyword for the string type is 'string'. The length of a string (its size in bytes) can be discovered using the built-in function len. A time type represents an instant in time with nanosecond precision. Each time has associated with it a location, consulted when computing the presentation form of the time. The following functions are implicitly declared An expression specifies the computation of a value by applying operators and functions to operands. Operands denote the elementary values in an expression. An operand may be a literal, a (possibly qualified) identifier denoting a constant or a function or a table/record set column, or a parenthesized expression. A qualified identifier is an identifier qualified with a table/record set name prefix. For example Primary expression are the operands for unary and binary expressions. For example A primary expression of the form denotes the element of a string indexed by x. Its type is byte. The value x is called the index. The following rules apply - The index x must be of integer type except bigint or duration; it is in range if 0 <= x < len(s), otherwise it is out of range. - A constant index must be non-negative and representable by a value of type int. - A constant index must be in range if the string a is a literal. - If x is out of range at run time, a run-time error occurs. - s[x] is the byte at index x and the type of s[x] is byte. If s is NULL or x is NULL then the result is NULL. Otherwise s[x] is illegal. For a string, the primary expression constructs a substring. The indices low and high select which elements appear in the result. The result has indices starting at 0 and length equal to high - low. For convenience, any of the indices may be omitted. A missing low index defaults to zero; a missing high index defaults to the length of the sliced operand The indices low and high are in range if 0 <= low <= high <= len(a), otherwise they are out of range. A constant index must be non-negative and representable by a value of type int. If both indices are constant, they must satisfy low <= high. If the indices are out of range at run time, a run-time error occurs. Integer values of type bigint or duration cannot be used as indices. If s is NULL the result is NULL. If low or high is not omitted and is NULL then the result is NULL. Given an identifier f denoting a predeclared function, calls f with arguments a1, a2, … an. Arguments are evaluated before the function is called. The type of the expression is the result type of f. In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return value of the function is passed by value when the function returns. Calling an undefined function causes a compile-time error. Operators combine operands into expressions. Comparisons are discussed elsewhere. For other binary operators, the operand types must be identical unless the operation involves shifts or untyped constants. For operations involving constants only, see the section on constant expressions. Except for shift operations, if one operand is an untyped constant and the other operand is not, the constant is converted to the type of the other operand. The right operand in a shift expression must have unsigned integer type or be an untyped constant that can be converted to unsigned integer type. If the left operand of a non-constant shift expression is an untyped constant, the type of the constant is what it would be if the shift expression were replaced by its left operand alone. Expressions of the form yield a boolean value true if expr2, a regular expression, matches expr1 (see also [6]). Both expression must be of type string. If any one of the expressions is NULL the result is NULL. Predicates are special form expressions having a boolean result type. Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be comparable as defined in "Comparison operators". Another form of the IN predicate creates the expression list from a result of a SelectStmt. The SelectStmt must select only one column. The produced expression list is resource limited by the memory available to the process. NULL values produced by the SelectStmt are ignored, but if all records of the SelectStmt are NULL the predicate yields NULL. The select statement is evaluated only once. If the type of expr is not the same as the type of the field returned by the SelectStmt then the set operation yields false. The type of the column returned by the SelectStmt must be one of the simple (non blob-like) types: Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be ordered as defined in "Comparison operators". Expressions of the form yield a boolean value true if expr does not have a specific type (case A) or if expr has a specific type (case B). In other cases the result is a boolean value false. Unary operators have the highest precedence. There are five precedence levels for binary operators. Multiplication operators bind strongest, followed by addition operators, comparison operators, && (logical AND), and finally || (logical OR) Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z. Note that the operator precedence is reflected explicitly by the grammar. Arithmetic operators apply to numeric values and yield a result of the same type as the first operand. The four standard arithmetic operators (+, -, *, /) apply to integer, rational, floating-point, and complex types; + also applies to strings; +,- also applies to times. All other arithmetic operators apply to integers only. sum integers, rationals, floats, complex values, strings difference integers, rationals, floats, complex values, times product integers, rationals, floats, complex values / quotient integers, rationals, floats, complex values % remainder integers & bitwise AND integers | bitwise OR integers ^ bitwise XOR integers &^ bit clear (AND NOT) integers << left shift integer << unsigned integer >> right shift integer >> unsigned integer Strings can be concatenated using the + operator String addition creates a new string by concatenating the operands. A value of type duration can be added to or subtracted from a value of type time. Times can subtracted from each other producing a value of type duration. For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships with x / y truncated towards zero ("truncated division"). As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0). If the divisor is a constant expression, it must not be zero. If the divisor is zero at run time, a run-time error occurs. If the dividend is non-negative and the divisor is a constant power of 2, the division may be replaced by a right shift, and computing the remainder may be replaced by a bitwise AND operation The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. There is no upper limit on the shift count. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity. For integer operands, the unary operators +, -, and ^ are defined as follows For floating-point and complex numbers, +x is the same as x, while -x is the negation of x. The result of a floating-point or complex division by zero is not specified beyond the IEEE-754 standard; whether a run-time error occurs is implementation-specific. Whenever any operand of any arithmetic operation, unary or binary, is NULL, as well as in the case of the string concatenating operation, the result is NULL. For unsigned integer values, the operations +, -, *, and << are computed modulo 2n, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and expressions may rely on “wrap around”. For signed integers with a finite bit width, the operations +, -, *, and << may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow. An evaluator may not optimize an expression under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true. Integers of type bigint and rationals do not overflow but their handling is limited by the memory resources available to the program. Comparison operators compare two operands and yield a boolean value. In any comparison, the first operand must be of same type as is the second operand, or vice versa. The equality operators == and != apply to operands that are comparable. The ordering operators <, <=, >, and >= apply to operands that are ordered. These terms and the result of the comparisons are defined as follows - Boolean values are comparable. Two boolean values are equal if they are either both true or both false. - Complex values are comparable. Two complex values u and v are equal if both real(u) == real(v) and imag(u) == imag(v). - Integer values are comparable and ordered, in the usual way. Note that durations are integers. - Floating point values are comparable and ordered, as defined by the IEEE-754 standard. - Rational values are comparable and ordered, in the usual way. - String values are comparable and ordered, lexically byte-wise. - Time values are comparable and ordered. Whenever any operand of any comparison operation is NULL, the result is NULL. Note that slices are always of type string. Logical operators apply to boolean values and yield a boolean result. The right operand is evaluated conditionally. The truth tables for logical operations with NULL values Conversions are expressions of the form T(x) where T is a type and x is an expression that can be converted to type T. A constant value x can be converted to type T in any of these cases: - x is representable by a value of type T. - x is a floating-point constant, T is a floating-point type, and x is representable by a value of type T after rounding using IEEE 754 round-to-even rules. The constant T(x) is the rounded value. - x is an integer constant and T is a string type. The same rule as for non-constant x applies in this case. Converting a constant yields a typed constant as result. A non-constant value x can be converted to type T in any of these cases: - x has type T. - x's type and T are both integer or floating point types. - x's type and T are both complex types. - x is an integer, except bigint or duration, and T is a string type. Specific rules apply to (non-constant) conversions between numeric types or to and from a string type. These conversions may change the representation of x and incur a run-time cost. All other conversions only change the type but not the representation of x. A conversion of NULL to any type yields NULL. For the conversion of non-constant numeric values, the following rules apply 1. When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. It is then truncated to fit in the result type's size. For example, if v == uint16(0x10F0), then uint32(int8(v)) == 0xFFFFFFF0. The conversion always yields a valid value; there is no indication of overflow. 2. When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero). 3. When converting an integer or floating-point number to a floating-point type, or a complex number to another complex type, the result value is rounded to the precision specified by the destination type. For instance, the value of a variable x of type float32 may be stored using additional precision beyond that of an IEEE-754 32-bit number, but float32(x) represents the result of rounding x's value to 32-bit precision. Similarly, x + 0.1 may use more than 32 bits of precision, but float32(x + 0.1) does not. In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent. 1. Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD". 2. Converting a blob to a string type yields a string whose successive bytes are the elements of the blob. 3. Converting a value of a string type to a blob yields a blob whose successive elements are the bytes of the string. 4. Converting a value of a bigint type to a string yields a string containing the decimal decimal representation of the integer. 5. Converting a value of a string type to a bigint yields a bigint value containing the integer represented by the string value. A prefix of “0x” or “0X” selects base 16; the “0” prefix selects base 8, and a “0b” or “0B” prefix selects base 2. Otherwise the value is interpreted in base 10. An error occurs if the string value is not in any valid format. 6. Converting a value of a rational type to a string yields a string containing the decimal decimal representation of the rational in the form "a/b" (even if b == 1). 7. Converting a value of a string type to a bigrat yields a bigrat value containing the rational represented by the string value. The string can be given as a fraction "a/b" or as a floating-point number optionally followed by an exponent. An error occurs if the string value is not in any valid format. 8. Converting a value of a duration type to a string returns a string representing the duration in the form "72h3m0.5s". Leading zero units are omitted. As a special case, durations less than one second format using a smaller unit (milli-, micro-, or nanoseconds) to ensure that the leading digit is non-zero. The zero duration formats as 0, with no unit. 9. Converting a string value to a duration yields a duration represented by the string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h". 10. Converting a time value to a string returns the time formatted using the format string When evaluating the operands of an expression or of function calls, operations are evaluated in lexical left-to-right order. For example, in the evaluation of the function calls and evaluation of c happen in the order h(), i(), j(), c. Floating-point operations within a single expression are evaluated according to the associativity of the operators. Explicit parentheses affect the evaluation by overriding the default associativity. In the expression x + (y + z) the addition y + z is performed before adding x. Statements control execution. The empty statement does nothing. Alter table statements modify existing tables. With the ADD clause it adds a new column to the table. The column must not exist. With the DROP clause it removes an existing column from a table. The column must exist and it must be not the only (last) column of the table. IOW, there cannot be a table with no columns. For example When adding a column to a table with existing data, the constraint clause of the ColumnDef cannot be used. Adding a constrained column to an empty table is fine. Begin transactions statements introduce a new transaction level. Every transaction level must be eventually balanced by exactly one of COMMIT or ROLLBACK statements. Note that when a transaction is roll-backed because of a statement failure then no explicit balancing of the respective BEGIN TRANSACTION is statement is required nor permitted. Failure to properly balance any opened transaction level may cause dead locks and/or lose of data updated in the uppermost opened but never properly closed transaction level. For example A database cannot be updated (mutated) outside of a transaction. Statements requiring a transaction A database is effectively read only outside of a transaction. Statements not requiring a transaction The commit statement closes the innermost transaction nesting level. If that's the outermost level then the updates to the DB made by the transaction are atomically made persistent. For example Create index statements create new indices. Index is a named projection of ordered values of a table column to the respective records. As a special case the id() of the record can be indexed. Index name must not be the same as any of the existing tables and it also cannot be the same as of any column name of the table the index is on. For example Now certain SELECT statements may use the indices to speed up joins and/or to speed up record set filtering when the WHERE clause is used; or the indices might be used to improve the performance when the ORDER BY clause is present. The UNIQUE modifier requires the indexed values tuple to be index-wise unique or have all values NULL. The optional IF NOT EXISTS clause makes the statement a no operation if the index already exists. A simple index consists of only one expression which must be either a column name or the built-in id(). A more complex and more general index is one that consists of more than one expression or its single expression does not qualify as a simple index. In this case the type of all expressions in the list must be one of the non blob-like types. Note: Blob-like types are blob, bigint, bigrat, time and duration. Create table statements create new tables. A column definition declares the column name and type. Table names and column names are case sensitive. Neither a table or an index of the same name may exist in the DB. For example The optional IF NOT EXISTS clause makes the statement a no operation if the table already exists. The optional constraint clause has two forms. The first one is found in many SQL dialects. This form prevents the data in column DepartmentName to be NULL. The second form allows an arbitrary boolean expression to be used to validate the column. If the value of the expression is true then the validation succeeded. If the value of the expression is false or NULL then the validation fails. If the value of the expression is not of type bool an error occurs. The optional DEFAULT clause is an expression which, if present, is substituted instead of a NULL value when the colum is assigned a value. Note that the constraint and/or default expressions may refer to other columns by name: When a table row is inserted by the INSERT INTO statement or when a table row is updated by the UPDATE statement, the order of operations is as follows: 1. The new values of the affected columns are set and the values of all the row columns become the named values which can be referred to in default expressions evaluated in step 2. 2. If any row column value is NULL and the DEFAULT clause is present in the column's definition, the default expression is evaluated and its value is set as the respective column value. 3. The values, potentially updated, of row columns become the named values which can be referred to in constraint expressions evaluated during step 4. 4. All row columns which definition has the constraint clause present will have that constraint checked. If any constraint violation is detected, the overall operation fails and no changes to the table are made. Delete from statements remove rows from a table, which must exist. For example If the WHERE clause is not present then all rows are removed and the statement is equivalent to the TRUNCATE TABLE statement. Drop index statements remove indices from the DB. The index must exist. For example The optional IF EXISTS clause makes the statement a no operation if the index does not exist. Drop table statements remove tables from the DB. The table must exist. For example The optional IF EXISTS clause makes the statement a no operation if the table does not exist. Insert into statements insert new rows into tables. New rows come from literal data, if using the VALUES clause, or are a result of select statement. In the later case the select statement is fully evaluated before the insertion of any rows is performed, allowing to insert values calculated from the same table rows are to be inserted into. If the ColumnNameList part is omitted then the number of values inserted in the row must be the same as are columns in the table. If the ColumnNameList part is present then the number of values per row must be same as the same number of column names. All other columns of the record are set to NULL. The type of the value assigned to a column must be the same as is the column's type or the value must be NULL. For example If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. Explain statement produces a recordset consisting of lines of text which describe the execution plan of a statement, if any. For example, the QL tool treats the explain statement specially and outputs the joined lines: The explanation may aid in uderstanding how a statement/query would be executed and if indices are used as expected - or which indices may possibly improve the statement performance. The create index statements above were directly copy/pasted in the terminal from the suggestions provided by the filter recordset pipeline part returned by the explain statement. If the statement has nothing special in its plan, the result is the original statement. To get an explanation of the select statement of the IN predicate, use the EXPLAIN statement with that particular select statement. The rollback statement closes the innermost transaction nesting level discarding any updates to the DB made by it. If that's the outermost level then the effects on the DB are as if the transaction never happened. For example The (temporary) record set from the last statement is returned and can be processed by the client. In this case the rollback is the same as 'DROP TABLE tmp;' but it can be a more complex operation. Select from statements produce recordsets. The optional DISTINCT modifier ensures all rows in the result recordset are unique. Either all of the resulting fields are returned ('*') or only those named in FieldList. RecordSetList is a list of table names or parenthesized select statements, optionally (re)named using the AS clause. The result can be filtered using a WhereClause and orderd by the OrderBy clause. For example If Recordset is a nested, parenthesized SelectStmt then it must be given a name using the AS clause if its field are to be accessible in expressions. A field is an named expression. Identifiers, not used as a type in conversion or a function name in the Call clause, denote names of (other) fields, values of which should be used in the expression. The expression can be named using the AS clause. If the AS clause is not present and the expression consists solely of a field name, then that field name is used as the name of the resulting field. Otherwise the field is unnamed. For example The SELECT statement can optionally enumerate the desired/resulting fields in a list. No two identical field names can appear in the list. When more than one record set is used in the FROM clause record set list, the result record set field names are rewritten to be qualified using the record set names. If a particular record set doesn't have a name, its respective fields became unnamed. The optional JOIN clause, for example is mostly equal to except that the rows from a which, when they appear in the cross join, never made expr to evaluate to true, are combined with a virtual row from b, containing all nulls, and added to the result set. For the RIGHT JOIN variant the discussed rules are used for rows from b not satisfying expr == true and the virtual, all-null row "comes" from a. The FULL JOIN adds the respective rows which would be otherwise provided by the separate executions of the LEFT JOIN and RIGHT JOIN variants. For more thorough OUTER JOIN discussion please see the Wikipedia article at [10]. Resultins rows of a SELECT statement can be optionally ordered by the ORDER BY clause. Collating proceeds by considering the expressions in the expression list left to right until a collating order is determined. Any possibly remaining expressions are not evaluated. All of the expression values must yield an ordered type or NULL. Ordered types are defined in "Comparison operators". Collating of elements having a NULL value is different compared to what the comparison operators yield in expression evaluation (NULL result instead of a boolean value). Below, T denotes a non NULL value of any QL type. NULL collates before any non NULL value (is considered smaller than T). Two NULLs have no collating order (are considered equal). The WHERE clause restricts records considered by some statements, like SELECT FROM, DELETE FROM, or UPDATE. It is an error if the expression evaluates to a non null value of non bool type. The GROUP BY clause is used to project rows having common values into a smaller set of rows. For example Using the GROUP BY without any aggregate functions in the selected fields is in certain cases equal to using the DISTINCT modifier. The last two examples above produce the same resultsets. The optional OFFSET clause allows to ignore first N records. For example The above will produce only rows 11, 12, ... of the record set, if they exist. The value of the expression must a non negative integer, but not bigint or duration. The optional LIMIT clause allows to ignore all but first N records. For example The above will return at most the first 10 records of the record set. The value of the expression must a non negative integer, but not bigint or duration. The LIMIT and OFFSET clauses can be combined. For example Considering table t has, say 10 records, the above will produce only records 4 - 8. After returning record #8, no more result rows/records are computed. 1. The FROM clause is evaluated, producing a Cartesian product of its source record sets (tables or nested SELECT statements). 2. If present, the JOIN cluase is evaluated on the result set of the previous evaluation and the recordset specified by the JOIN clause. (... JOIN Recordset ON ...) 3. If present, the WHERE clause is evaluated on the result set of the previous evaluation. 4. If present, the GROUP BY clause is evaluated on the result set of the previous evaluation(s). 5. The SELECT field expressions are evaluated on the result set of the previous evaluation(s). 6. If present, the DISTINCT modifier is evaluated on the result set of the previous evaluation(s). 7. If present, the ORDER BY clause is evaluated on the result set of the previous evaluation(s). 8. If present, the OFFSET clause is evaluated on the result set of the previous evaluation(s). The offset expression is evaluated once for the first record produced by the previous evaluations. 9. If present, the LIMIT clause is evaluated on the result set of the previous evaluation(s). The limit expression is evaluated once for the first record produced by the previous evaluations. Truncate table statements remove all records from a table. The table must exist. For example Update statements change values of fields in rows of a table. For example Note: The SET clause is optional. If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. To allow to query for DB meta data, there exist specially named tables, some of them being virtual. Note: Virtual system tables may have fake table-wise unique but meaningless and unstable record IDs. Do not apply the built-in id() to any system table. The table __Table lists all tables in the DB. The schema is The Schema column returns the statement to (re)create table Name. This table is virtual. The table __Colum lists all columns of all tables in the DB. The schema is The Ordinal column defines the 1-based index of the column in the record. This table is virtual. The table __Colum2 lists all columns of all tables in the DB which have the constraint NOT NULL or which have a constraint expression defined or which have a default expression defined. The schema is It's possible to obtain a consolidated recordset for all properties of all DB columns using The Name column is the column name in TableName. The table __Index lists all indices in the DB. The schema is The IsUnique columns reflects if the index was created using the optional UNIQUE clause. This table is virtual. Built-in functions are predeclared. The built-in aggregate function avg returns the average of values of an expression. Avg ignores NULL values, but returns NULL if all values of a column are NULL or if avg is applied to an empty record set. The column values must be of a numeric type. The built-in function contains returns true if substr is within s. If any argument to contains is NULL the result is NULL. The built-in aggregate function count returns how many times an expression has a non NULL values or the number of rows in a record set. Note: count() returns 0 for an empty record set. For example Date returns the time corresponding to in the appropriate zone for that time in the given location. The month, day, hour, min, sec, and nsec values may be outside their usual ranges and will be normalized during the conversion. For example, October 32 converts to November 1. A daylight savings time transition skips or repeats times. For example, in the United States, March 13, 2011 2:15am never occurred, while November 6, 2011 1:15am occurred twice. In such cases, the choice of time zone, and therefore the time, is not well-defined. Date returns a time that is correct in one of the two zones involved in the transition, but it does not guarantee which. A location maps time instants to the zone in use at that time. Typically, the location represents the collection of time offsets in use in a geographical area, such as "CEST" and "CET" for central Europe. "local" represents the system's local time zone. "UTC" represents Universal Coordinated Time (UTC). The month specifies a month of the year (January = 1, ...). If any argument to date is NULL the result is NULL. The built-in function day returns the day of the month specified by t. If the argument to day is NULL the result is NULL. The built-in function formatTime returns a textual representation of the time value formatted according to layout, which defines the format by showing how the reference time, would be displayed if it were the value; it serves as an example of the desired output. The same display rules will then be applied to the time value. If any argument to formatTime is NULL the result is NULL. NOTE: The string value of the time zone, like "CET" or "ACDT", is dependent on the time zone of the machine the function is run on. For example, if the t value is in "CET", but the machine is in "ACDT", instead of "CET" the result is "+0100". This is the same what Go (time.Time).String() returns and in fact formatTime directly calls t.String(). returns on a machine in the CET time zone, but may return on a machine in the ACDT zone. The time value is in both cases the same so its ordering and comparing is correct. Only the display value can differ. The built-in functions formatFloat and formatInt format numbers to strings using go's number format functions in the `strconv` package. For all three functions, only the first argument is mandatory. The default values of the rest are shown in the examples. If the first argument is NULL, the result is NULL. returns returns returns Unlike the `strconv` equivalent, the formatInt function handles all integer types, both signed and unsigned. The built-in function hasPrefix tests whether the string s begins with prefix. If any argument to hasPrefix is NULL the result is NULL. The built-in function hasSuffix tests whether the string s ends with suffix. If any argument to hasSuffix is NULL the result is NULL. The built-in function hour returns the hour within the day specified by t, in the range [0, 23]. If the argument to hour is NULL the result is NULL. The built-in function hours returns the duration as a floating point number of hours. If the argument to hours is NULL the result is NULL. The built-in function id takes zero or one arguments. If no argument is provided, id() returns a table-unique automatically assigned numeric identifier of type int. Ids of deleted records are not reused unless the DB becomes completely empty (has no tables). For example If id() without arguments is called for a row which is not a table record then the result value is NULL. For example If id() has one argument it must be a table name of a table in a cross join. For example The built-in function len takes a string argument and returns the lentgh of the string in bytes. The expression len(s) is constant if s is a string constant. If the argument to len is NULL the result is NULL. The built-in aggregate function max returns the largest value of an expression in a record set. Max ignores NULL values, but returns NULL if all values of a column are NULL or if max is applied to an empty record set. The expression values must be of an ordered type. For example The built-in aggregate function min returns the smallest value of an expression in a record set. Min ignores NULL values, but returns NULL if all values of a column are NULL or if min is applied to an empty record set. For example The column values must be of an ordered type. The built-in function minute returns the minute offset within the hour specified by t, in the range [0, 59]. If the argument to minute is NULL the result is NULL. The built-in function minutes returns the duration as a floating point number of minutes. If the argument to minutes is NULL the result is NULL. The built-in function month returns the month of the year specified by t (January = 1, ...). If the argument to month is NULL the result is NULL. The built-in function nanosecond returns the nanosecond offset within the second specified by t, in the range [0, 999999999]. If the argument to nanosecond is NULL the result is NULL. The built-in function nanoseconds returns the duration as an integer nanosecond count. If the argument to nanoseconds is NULL the result is NULL. The built-in function now returns the current local time. The built-in function parseTime parses a formatted string and returns the time value it represents. The layout defines the format by showing how the reference time, would be interpreted if it were the value; it serves as an example of the input format. The same interpretation will then be made to the input string. Elements omitted from the value are assumed to be zero or, when zero is impossible, one, so parsing "3:04pm" returns the time corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is 0, this time is before the zero Time). Years must be in the range 0000..9999. The day of the week is checked for syntax but it is otherwise ignored. In the absence of a time zone indicator, parseTime returns a time in UTC. When parsing a time with a zone offset like -0700, if the offset corresponds to a time zone used by the current location, then parseTime uses that location and zone in the returned time. Otherwise it records the time as being in a fabricated location with time fixed at the given zone offset. When parsing a time with a zone abbreviation like MST, if the zone abbreviation has a defined offset in the current location, then that offset is used. The zone abbreviation "UTC" is recognized as UTC regardless of location. If the zone abbreviation is unknown, Parse records the time as being in a fabricated location with the given zone abbreviation and a zero offset. This choice means that such a time can be parses and reformatted with the same layout losslessly, but the exact instant used in the representation will differ by the actual zone offset. To avoid such problems, prefer time layouts that use a numeric zone offset. If any argument to parseTime is NULL the result is NULL. The built-in function second returns the second offset within the minute specified by t, in the range [0, 59]. If the argument to second is NULL the result is NULL. The built-in function seconds returns the duration as a floating point number of seconds. If the argument to seconds is NULL the result is NULL. The built-in function since returns the time elapsed since t. It is shorthand for now()-t. If the argument to since is NULL the result is NULL. The built-in aggregate function sum returns the sum of values of an expression for all rows of a record set. Sum ignores NULL values, but returns NULL if all values of a column are NULL or if sum is applied to an empty record set. The column values must be of a numeric type. The built-in function timeIn returns t with the location information set to loc. For discussion of the loc argument please see date(). If any argument to timeIn is NULL the result is NULL. The built-in function weekday returns the day of the week specified by t. Sunday == 0, Monday == 1, ... If the argument to weekday is NULL the result is NULL. The built-in function year returns the year in which t occurs. If the argument to year is NULL the result is NULL. The built-in function yearDay returns the day of the year specified by t, in the range [1,365] for non-leap years, and [1,366] in leap years. If the argument to yearDay is NULL the result is NULL. Three functions assemble and disassemble complex numbers. The built-in function complex constructs a complex value from a floating-point real and imaginary part, while real and imag extract the real and imaginary parts of a complex value. The type of the arguments and return value correspond. For complex, the two arguments must be of the same floating-point type and the return type is the complex type with the corresponding floating-point constituents: complex64 for float32, complex128 for float64. The real and imag functions together form the inverse, so for a complex value z, z == complex(real(z), imag(z)). If the operands of these functions are all constants, the return value is a constant. If any argument to any of complex, real, imag functions is NULL the result is NULL. For the numeric types, the following sizes are guaranteed Portions of this specification page are modifications based on work[2] created and shared by Google[3] and used according to terms described in the Creative Commons 3.0 Attribution License[4]. This specification is licensed under the Creative Commons Attribution 3.0 License, and code is licensed under a BSD license[5]. Links from the above documentation This section is not part of the specification. WARNING: The implementation of indices is new and it surely needs more time to become mature. Indices are used currently used only by the WHERE clause. The following expression patterns of 'WHERE expression' are recognized and trigger index use. The relOp is one of the relation operators <, <=, ==, >=, >. For the equality operator both operands must be of comparable types. For all other operators both operands must be of ordered types. The constant expression is a compile time constant expression. Some constant folding is still a TODO. Parameter is a QL parameter ($1 etc.). Consider tables t and u, both with an indexed field f. The WHERE expression doesn't comply with the above simple detected cases. However, such query is now automatically rewritten to which will use both of the indices. The impact of using the indices can be substantial (cf. BenchmarkCrossJoin*) if the resulting rows have low "selectivity", ie. only few rows from both tables are selected by the respective WHERE filtering. Note: Existing QL DBs can be used and indices can be added to them. However, once any indices are present in the DB, the old QL versions cannot work with such DB anymore. Running a benchmark with -v (-test.v) outputs information about the scale used to report records/s and a brief description of the benchmark. For example Running the full suite of benchmarks takes a lot of time. Use the -timeout flag to avoid them being killed after the default time limit (10 minutes).

v1.1.0 • 9 years ago

github.com/jpvetterli/args

Package args is used to configure programs using command line arguments and other sources. A program usually gets its parameters from a string array supplied by the system. This array is named "args" in many programming languages, or something close. By an incredible coincidence, this is precisely the name of this package. The package keeps simple things simple and makes complicated things possible. A first example shows how to provide help when the user says or maybe "-help", or "--help", or "-h": Parameters can be defined to take values for simple variables of builtin types and for arrays and slices. If needed, custom scanners can be used, so non-standard types are supported too. From the programmer's point of view, configuring parameters and taking values looks like this: From the user's point of view, here are 3 different ways of specifying 3 different things: Now, writing all this repeatedly on the command line is tedious. Fortunately the parser can take parameters from a file, like this: Assuming the 3 lines above have been written verbatim to the file, there will 9 things, not 3. But the parser is capable of ignoring parameters, so let's ignore the first two lines. The file could look like this (white space can be inserted freely): Symbols make it easy to override parameters in a file. Modify the file to use a symbol for the 3d value: The value can be overriden by setting the symbol DEFAULT3 before including the file: This works because the first value of a symbol wins. (For a scalar parameter it is the last one.) The syntax uses special characters, like $, [, and ] in these examples. In some environments, the predefined special characters can be problematic, but they can be reconfigured easily by the program, and the above input could look like this: In the args package, everything is a name-value pair: parameters, operators like include and -- (yes, it's a name). To keep simple things simple, it is possible to omit the name when defined as empty, the parameter is of type bool, and the value is true. This is the case of the "help" parameter in first example. So, to get help, the user does not need to write "help=true" but simply When no help is wanted, it is not necessary to say "help = false". The reason is that the initial value of the parameter is false. The programming interface is explained in the type and method documentation. The following sections describe the package from the point of view of the user. Parameters are formulated using a mini-language where words belong either to a name or to a value. Names and values must agree with parameter definitions made in the program. Names are composed of letters, digits (as tested by unicode.IsLetter and IsDigit), hyphens, or underscores. Syntactically, names and values are recognizable by the presence of a separator between them. The separator is one of five specially designated characters supporting the language syntax, in addition to white space: the symbol prefix, the name-value separator, the opening quote, the closing quote, and the escape character. Unless configured otherwise, these characters are: White space is a sequence of one or more characters for which unicode.IsSpace returns true. The escape character suppresses the effect of white space and special characters. When it has no effect, the escape is a normal character. An empty name is usually omitted, and in this case the separator must also be omitted. Parameters with an empty name are known as anonymous and their values as standalone values. When a standalone value is the name of a parameter, it is interpreted as that name with the value "true". This effect is usually what is wanted, but if necessary, can be avoided by using a quoted empty string: []. White space around separators is ignored but is significant between words, as it separates distinct values. It can be included in values by quoting. When a quoted string contains a nested quoted string, quotes must be balanced. Outermost quotes are removed but nested quotes are kept. A symbol is a name preceded by the symbol prefix. A symbol reference is a symbol prefix followed by a name between quotes. The following examples use the default set of special characters: The order in which parameters with different names are specified is not significant. The program will not know if parameter foo was specified before parameter bar. Multiple values of parameters can be mixed in any order. On the other hand, multiple parameter values are kept in specification sequence. If foo and bar are array or slice parameters, with an input like "foo=1 bar=a foo=2 bar=b bar=c foo=3", the program will see foo=[1 2 3] and bar=[a b c] but will not know if foo came before bar. if foo and bar are scalar parameters, the last value wins and the program will see foo=3 and bar=c. Parameter names are given by the program but symbols can be freely chosen, as long as they follow the same syntax rules as parameter names. A symbol is defined by prefixing it once with the symbol prefix. A reference consists of a symbol followed by name between quotes (as in the string "my $[foo] is rich"). Parameters and symbols are in distinct name spaces, and it is possible to use the name of a parameter for a symbol, as in "$foo=bar foo=$[foo]" which is equivalent to "foo=bar". Symbols must be defined before use: in "foo=$[x] $x=bar", the value of "foo" cannot be resolved. Redefining a symbol has no effect: the first wins (unlike parameters, where the last wins). The specification "$x=bar $x=quux foo=$[x]" is equivalent to "foo=bar". When a parameter is defined, it gets a target, which is the program variable that will take values from the parameter. The target's type determines the number and type of values the parameter can take. A target with a simple type can take one value. If multiple values are specified, the last value wins (again, contrast this with symbols, where the first wins). If no value is specified, an error occurs, unless the parameter has been defined as optional. The initial value of the target is the default value of an optional parameter. If the target is an array type, the number of values specified must be exactly equal to the array length. If the target is a slice type, the number of values can be at most equal to the slice capacity. If the slice capacity is zero, the parameter can take any number of values. A parameter with a slice target can be omitted or it can take a number of values less than its current length. In such cases the initial values in the slice provide the default values of the parameter. Instead of requiring repeating values, a parameter can be defined with a splitter, which is a regular expression. For example, if a splitter is defined to split a string around a colon (with optional white space) the specification "foo=[1:2:3] foo=[ 4 : 5]" sets 5 values. There are 7 operators built into args. Operators are built-in commands which have an effect on the state of the parser. From a user perspective, they look like any other parameter, with a name, a name-value separator, and a value containing subparameters. In increasing order of sophistication, they are -- (pronounced "comment"), dump, reset, import, macro, cond, and include. Operator subparameters are all defined as verbatim (see Param.Verbatim) except for two subparameters of include. The -- ("comment") operator ignores its value. The value can be anything, as long as it can be scanned by the parser, which means that quotes must be balanced. The operator is used to insert comments into a string of parameters. The operator is especially useful for commenting out blocks of parameters, which can span multiple lines and can contain nested comments. The dump operator is useful for debugging a complex args specification. It takes an optional "comment" parameter, and zero or more standalone values. dump interprets the values as parameter names and symbols and prints them line by line on standard error with their current values. The value of a symbol is followed by R if resolved, else by U. A name or symbol is preceded by ? if undefined. The empty parameter name is printed between quotes. If a comment is specified, it is printed first. A typical use may look like (import is explained ahead): and produce the output: The reset operator is used to remove symbols. It takes a series of values, which it interprets as symbols and removes them from the symbol table, if present. An error occurs if values are not symbols. It is useful because of the "first wins" principle. Contrary to parameter values, where the "last wins", once a symbol has been set, its value cannot be changed. When for some reason this is annoying, the symbol can be reset before taking a new value. For example, the specification: produces: The import operator is used to import environment variables. It takes a series of values, which it interprets as symbols. For each symbol a value is taken from the environment variable with the corresponding name (after removing the symbol prefix). The value is inserted into the symbol table unless there is already an entry for the symbol ("first wins" principle). If the environment variable does not exist, nothing is done. For example, the specification: produces the output: The macro operator is used to expand standalone symbol references. Without special care, standalone symbol references are either invalid (when no anonymous parameter was defined) or are taken up as values of anonymous parameters. They can be used when contained in verbatim parameters, typically intended to be parsed by sub-commands, as shown in the example with Param.Verbatim. The macro operator makes this pattern available at the top level without needing any auxiliary parameters. The operator takes a series of values, which it interprets as symbols, gets their values from the symbol table without resolving them, and passes them recursively to Parser.Parse. An error occurs if values are not symbols, if any symbol is not found, or if parsing fails. As an example, after parsing the input the string slice foo has the values "number 1" and "number 2". Notice the reset operator, necessary for the new count to take effect (because of the "first wins" principle). The cond operator allows conditional parsing. It has two mandatory parameters, "if" and "then" and an optional one, "else", all taking one value. The value of "if" is interpreted as a parameter name or symbol. It evaluates to true if the symbol exists or if the parameter has been set at least once. If the value is an undefined parameter, an error occurs. When the value of "if" is true the value of "then" is parsed, else it is the value of "else" which is parsed, if specified. For example, after parsing the input the string foo has the value "bar". The include operator has two different modes: a basic mode for recursively parsing a file containing parameters and a key-selection mode for extracting name-value pairs from a file. In basic mode, include takes a file name as anonymous parameter. It reads the file and parses its content recursively. Files can be included recursively and any cyclical dependency is detected. The anonymous parameter taking the file name is one the two operator parameters not defined as verbatim. In key-selection mode, include takes a file name, a "keys" parameter, and an optional "extractor" parameter. (The extractor parameter is the second operator parameter which is not verbatim.) The value of "keys" is interpreted as a series of standalone keys or key-translation pairs (using the current separator character of the parser). If there is no translation, the key translates to itself. include extracts name-value pairs from each line of the file, and if it finds a name matching one of the keys, it uses the value to set a parameter or a symbol, depending on the translated key. As always a symbol is set only if it does not already exist ("first wins" principle). The "extractor" parameter specifies a custom regular expression for extracting key-value pairs. The default extractor is \s*(\S+)\s*=\s*(\S+)\s*. It is an error to specify an extractor in basic mode (when no keys are specified). As an example, suppose there is file /home/u649/.db.conf with data we can use. Only the user and password information is needed. Parsing the input produces this dump output: This example will parse successfully only if the user running the program has read access to the file.

v1.0.0 • 4 years ago

github.com/kiteco/pigeon

Command pigeon generates parsers in Go from a PEG grammar. From Wikipedia [0]: Its features and syntax are inspired by the PEG.js project [1], while the implementation is loosely based on [2]. Formal presentation of the PEG theory by Bryan Ford is also an important reference [3]. An introductory blog post can be found at [4]. The pigeon tool must be called with PEG input as defined by the accepted PEG syntax below. The grammar may be provided by a file or read from stdin. The generated parser is written to stdout by default. The following options can be specified: The tool makes no attempt to format the code, nor to detect the required imports. It is recommended to use goimports to properly generate the output code: The goimports tool can be installed with: If the code blocks in the grammar (see below, section "Code block") are golint- and go vet-compliant, then the resulting generated code will also be golint- and go vet-compliant. The generated code doesn't use any third-party dependency unless code blocks in the grammar require such a dependency. The accepted syntax for the grammar is formally defined in the grammar/pigeon.peg file, using the PEG syntax. What follows is an informal description of this syntax. Identifiers, whitespace, comments and literals follow the same notation as the Go language, as defined in the language specification (http://golang.org/ref/spec#Source_code_representation): The grammar must be Unicode text encoded in UTF-8. New lines are identified by the \n character (U+000A). Space (U+0020), horizontal tabs (U+0009) and carriage returns (U+000D) are considered whitespace and are ignored except to separate tokens. A PEG grammar consists of a set of rules. A rule is an identifier followed by a rule definition operator and an expression. An optional display name - a string literal used in error messages instead of the rule identifier - can be specified after the rule identifier. E.g.: The rule definition operator can be any one of those: A rule is defined by an expression. The following sections describe the various expression types. Expressions can be grouped by using parentheses, and a rule can be referenced by its identifier in place of an expression. The choice expression is a list of expressions that will be tested in the order they are defined. The first one that matches will be used. Expressions are separated by the forward slash character "/". E.g.: Because the first match is used, it is important to think about the order of expressions. For example, in this rule, "<=" would never be used because the "<" expression comes first: The sequence expression is a list of expressions that must all match in that same order for the sequence expression to be considered a match. Expressions are separated by whitespace. E.g.: A labeled expression consists of an identifier followed by a colon ":" and an expression. A labeled expression introduces a variable named with the label that can be referenced in the code blocks in the same scope. The variable will have the value of the expression that follows the colon. E.g.: The variable is typed as an empty interface, and the underlying type depends on the following: For terminals (character and string literals, character classes and the any matcher), the value is []byte. E.g.: For predicates (& and !), the value is always nil. E.g.: For a sequence, the value is a slice of empty interfaces, one for each expression value in the sequence. The underlying types of each value in the slice follow the same rules described here, recursively. E.g.: For a repetition (+ and *), the value is a slice of empty interfaces, one for each repetition. The underlying types of each value in the slice follow the same rules described here, recursively. E.g.: For a choice expression, the value is that of the matching choice. E.g.: For the optional expression (?), the value is nil or the value of the expression. E.g.: Of course, the type of the value can be anything once an action code block is used. E.g.: An expression prefixed with the ampersand "&" is the "and" predicate expression: it is considered a match if the following expression is a match, but it does not consume any input. An expression prefixed with the exclamation point "!" is the "not" predicate expression: it is considered a match if the following expression is not a match, but it does not consume any input. E.g.: The expression following the & and ! operators can be a code block. In that case, the code block must return a bool and an error. The operator's semantic is the same, & is a match if the code block returns true, ! is a match if the code block returns false. The code block has access to any labeled value defined in its scope. E.g.: An expression followed by "*", "?" or "+" is a match if the expression occurs zero or more times ("*"), zero or one time "?" or one or more times ("+") respectively. The match is greedy, it will match as many times as possible. E.g. A literal matcher tries to match the input against a single character or a string literal. The literal may be a single-quoted single character, a double-quoted string or a backtick-quoted raw string. The same rules as in Go apply regarding the allowed characters and escapes. The literal may be followed by a lowercase "i" (outside the ending quote) to indicate that the match is case-insensitive. E.g.: A character class matcher tries to match the input against a class of characters inside square brackets "[...]". Inside the brackets, characters represent themselves and the same escapes as in string literals are available, except that the single- and double-quote escape is not valid, instead the closing square bracket "]" must be escaped to be used. Character ranges can be specified using the "[a-z]" notation. Unicode classes can be specified using the "[\pL]" notation, where L is a single-letter Unicode class of characters, or using the "[\p{Class}]" notation where Class is a valid Unicode class (e.g. "Latin"). As for string literals, a lowercase "i" may follow the matcher (outside the ending square bracket) to indicate that the match is case-insensitive. A "^" as first character inside the square brackets indicates that the match is inverted (it is a match if the input does not match the character class matcher). E.g.: The any matcher is represented by the dot ".". It matches any character except the end of file, thus the "!." expression is used to indicate "match the end of file". E.g.: Code blocks can be added to generate custom Go code. There are three kinds of code blocks: the initializer, the action and the predicate. All code blocks appear inside curly braces "{...}". The initializer must appear first in the grammar, before any rule. It is copied as-is (minus the wrapping curly braces) at the top of the generated parser. It may contain function declarations, types, variables, etc. just like any Go file. Every symbol declared here will be available to all other code blocks. Although the initializer is optional in a valid grammar, it is usually required to generate a valid Go source code file (for the package clause). E.g.: Action code blocks are code blocks declared after an expression in a rule. Those code blocks are turned into a method on the "*current" type in the generated source code. The method receives any labeled expression's value as argument (as interface{}) and must return two values, the first being the value of the expression (an interface{}), and the second an error. If a non-nil error is returned, it is added to the list of errors that the parser will return. E.g.: Predicate code blocks are code blocks declared immediately after the and "&" or the not "!" operators. Like action code blocks, predicate code blocks are turned into a method on the "*current" type in the generated source code. The method receives any labeled expression's value as argument (as interface{}) and must return two values, the first being a bool and the second an error. If a non-nil error is returned, it is added to the list of errors that the parser will return. E.g.: The current type is a struct that provides two useful fields that can be accessed in action and predicate code blocks: "pos" and "text". The "pos" field indicates the current position of the parser in the source input. It is itself a struct with three fields: "line", "col" and "offset". Line is a 1-based line number, col is a 1-based column number that counts runes from the start of the line, and offset is a 0-based byte offset. The "text" field is the slice of bytes of the current match. It is empty in a predicate code block. The parser generated by pigeon exports a few symbols so that it can be used as a package with public functions to parse input text. The exported API is: See the godoc page of the generated parser for the test/predicates grammar for an example documentation page of the exported API: http://godoc.org/github.com/mna/pigeon/test/predicates. Like the grammar used to generate the parser, the input text must be UTF-8-encoded Unicode. The start rule of the parser is the first rule in the PEG grammar used to generate the parser. A call to any of the Parse* functions returns the value generated by executing the grammar on the provided input text, and an optional error. Typically, the grammar should generate some kind of abstract syntax tree (AST), but for simple grammars it may evaluate the result immediately, such as in the examples/calculator example. There are no constraints imposed on the author of the grammar, it can return whatever is needed. When the parser returns a non-nil error, the error is always of type errList, which is defined as a slice of errors ([]error). Each error in the list is of type *parserError. This is a struct that has an "Inner" field that can be used to access the original error. So if a code block returns some well-known error like: The original error can be accessed this way: By defaut the parser will continue after an error is returned and will cumulate all errors found during parsing. If the grammar reaches a point where it shouldn't continue, a panic statement can be used to terminate parsing. The panic will be caught at the top-level of the Parse* call and will be converted into a *parserError like any error, and an errList will still be returned to the caller. The divide by zero error in the examples/calculator grammar leverages this feature (no special code is needed to handle division by zero, if it happens, the runtime panics and it is recovered and returned as a parsing error). Providing good error reporting in a parser is not a trivial task. Part of it is provided by the pigeon tool, by offering features such as filename, position and rule name in the error message, but an important part of good error reporting needs to be done by the grammar author. For example, many programming languages use double-quotes for string literals. Usually, if the opening quote is found, the closing quote is expected, and if none is found, there won't be any other rule that will match, there's no need to backtrack and try other choices, an error should be added to the list and the match should be consumed. In order to do this, the grammar can look something like this: This is just one example, but it illustrates the idea that error reporting needs to be thought out when designing the grammar. Generated parsers have user-provided code mixed with pigeon code in the same package, so there is no package boundary in the resulting code to prevent access to unexported symbols. What is meant to be implementation details in pigeon is also available to user code - which doesn't mean it should be used. For this reason, it is important to precisely define what is intended to be the supported API of pigeon, the parts that will be stable in future versions. The "stability" of the API attempts to make a similar guarantee as the Go 1 compatibility [5]. The following lists what part of the current pigeon code falls under that guarantee (features may be added in the future): The pigeon command-line flags and arguments: those will not be removed and will maintain the same semantics. The explicitly exported API generated by pigeon. See [6] for the documentation of this API on a generated parser. The PEG syntax, as documented above. The code blocks (except the initializer) will always be generated as methods on the *current type, and this type is guaranteed to have the fields pos (type position) and text (type []byte). There are no guarantees on other fields and methods of this type. The position type will always have the fields line, col and offset, all defined as int. There are no guarantees on other fields and methods of this type. The type of the error value returned by the Parse* functions, when not nil, will always be errList defined as a []error. There are no guarantees on methods of this type, other than the fact it implements the error interface. Individual errors in the errList will always be of type *parserError, and this type is guaranteed to have an Inner field that contains the original error value. There are no guarantees on other fields and methods of this type. References:

v1.0.0 • 8 years ago

github.com/mikolajb/cadence-client

Package cadence and its subdirectories contain the Cadence client side framework. The Cadence service is a task orchestrator for your application’s tasks. Applications using Cadence can execute a logical flow of tasks, especially long-running business logic, asynchronously or synchronously. They can also scale at runtime on distributed systems. A quick example illustrates its use case. Consider Uber Eats where Cadence manages the entire business flow from placing an order, accepting it, handling shopping cart processes (adding, updating, and calculating cart items), entering the order in a pipeline (for preparing food and coordinating delivery), to scheduling delivery as well as handling payments. Cadence consists of a programming framework (or client library) and a managed service (or backend). The framework enables developers to author and coordinate tasks in Go code. The root cadence package contains common data structures. The subpackages are: The Cadence hosted service brokers and persists events generated during workflow execution. Worker nodes owned and operated by customers execute the coordination and task logic. To facilitate the implementation of worker nodes Cadence provides a client-side library for the Go language. In Cadence, you can code the logical flow of events separately as a workflow and code business logic as activities. The workflow identifies the activities and sequences them, while an activity executes the logic. Dynamic workflow execution graphs - Determine the workflow execution graphs at runtime based on the data you are processing. Cadence does not pre-compute the execution graphs at compile time or at workflow start time. Therefore, you have the ability to write workflows that can dynamically adjust to the amount of data they are processing. If you need to trigger 10 instances of an activity to efficiently process all the data in one run, but only 3 for a subsequent run, you can do that. Child Workflows - Orchestrate the execution of a workflow from within another workflow. Cadence will return the results of the child workflow execution to the parent workflow upon completion of the child workflow. No polling is required in the parent workflow to monitor status of the child workflow, making the process efficient and fault tolerant. Durable Timers - Implement delayed execution of tasks in your workflows that are robust to worker failures. Cadence provides two easy to use APIs, **workflow.Sleep** and **workflow.Timer**, for implementing time based events in your workflows. Cadence ensures that the timer settings are persisted and the events are generated even if workers executing the workflow crash. Signals - Modify/influence the execution path of a running workflow by pushing additional data directly to the workflow using a signal. Via the Signal facility, Cadence provides a mechanism to consume external events directly in workflow code. Task routing - Efficiently process large amounts of data using a Cadence workflow, by caching the data locally on a worker and executing all activities meant to process that data on that same worker. Cadence enables you to choose the worker you want to execute a certain activity by scheduling that activity execution in the worker's specific task-list. Unique workflow ID enforcement - Use business entity IDs for your workflows and let Cadence ensure that only one workflow is running for a particular entity at a time. Cadence implements an atomic "uniqueness check" and ensures that no race conditions are possible that would result in multiple workflow executions for the same workflow ID. Therefore, you can implement your code to attempt to start a workflow without checking if the ID is already in use, even in the cases where only one active execution per workflow ID is desired. Perpetual/ContinueAsNew workflows - Run periodic tasks as a single perpetually running workflow. With the "ContinueAsNew" facility, Cadence allows you to leverage the "unique workflow ID enforcement" feature for periodic workflows. Cadence will complete the current execution and start the new execution atomically, ensuring you get to keep your workflow ID. By starting a new execution Cadence also ensures that workflow execution history does not grow indefinitely for perpetual workflows. At-most once activity execution - Execute non-idempotent activities as part of your workflows. Cadence will not automatically retry activities on failure. For every activity execution Cadence will return a success result, a failure result, or a timeout to the workflow code and let the workflow code determine how each one of those result types should be handled. Asynch Activity Completion - Incorporate human input or thrid-party service asynchronous callbacks into your workflows. Cadence allows a workflow to pause execution on an activity and wait for an external actor to resume it with a callback. During this pause the activity does not have any actively executing code, such as a polling loop, and is merely an entry in the Cadence datastore. Therefore, the workflow is unaffected by any worker failures happening over the duration of the pause. Activity Heartbeating - Detect unexpected failures/crashes and track progress in long running activities early. By configuring your activity to report progress periodically to the Cadence server, you can detect a crash that occurs 10 minutes into an hour-long activity execution much sooner, instead of waiting for the 60-minute execution timeout. The recorded progress before the crash gives you sufficient information to determine whether to restart the activity from the beginning or resume it from the point of failure. Timeouts for activities and workflow executions - Protect against stuck and unresponsive activities and workflows with appropriate timeout values. Cadence requires that timeout values are provided for every activity or workflow invocation. There is no upper bound on the timeout values, so you can set timeouts that span days, weeks, or even months. Visibility - Get a list of all your active and/or completed workflow. Explore the execution history of a particular workflow execution. Cadence provides a set of visibility APIs that allow you, the workflow owner, to monitor past and current workflow executions. Debuggability - Replay any workflow execution history locally under a debugger. The Cadence client library provides an API to allow you to capture a stack trace from any failed workflow execution history.

v0.7.5 • 7 years ago

github.com/abligh/cdl

Package cdl provides a configuration definition language for Go There are several ways to import configuration files into Go, from unmarshalling your own JSON using `encoding/json` to projects such as Viper (http://spf13.com/project/viper) which will read the configuration in JSON, YAML, TOML or from etcd, the command line, or possibly other sources. All of these have in common that in the end they produce something that looks in Go like However, these have the issue that they don't validate the configuration. If the supplied configuration omits mandatory keys, or puts in extra ones, or does both by misspelling one key, or puts the right bit of configuration at the wrong level, or any other error is made that doesn't actually prevent the JSON, YAML etc. parsing, then it's left for you to detect that manually in your program. cdl makes this all much easier. Simply supply a cdl template, compile it (once) using then validate using If the validation fails, you will get an `error` return with a context that will allow a user to discover the error in his file. So what was that `nil` parameter to `cdt.Validate` about? cdl also permits you to pass a configurator in, so that you can store the values retrieved in appropriate places. cdl templates are themselves a but are flat (i.e. only one level deep). The key represents a point in in the hierarchy to parse, and the value specifies what may appear at that point. The value is normally a `string`, but may be a pointer to a validation function you supply. For example: Here: The root level is specified to be a map ('`{}`'), which may consist of the elements `apple`, `peach` and `lemon`. There must be an `apple` element and `lemon` element, but the `peach` element is optional. The `apple` element must be a `float64` In order to validate the `peach` element, your own validator function (`isOneOrTwo`) is called. If this returns a `cdl.CdlError`, that error will be passed to the user (as an `error`). If it returns `nil`, then validation will continue. There is no validation at all on `peach` Let's take a more complicated example: Here we have allowed in the root level: `strawberry`: The `!` indicates it is mandatory; this is the default, so the `!` is unnecessary. Each `strawberry` must be an array of `nectarine` with between 1 and 3 components, and each `nectarine` must be a `string`. `rasbperry`: This is a shorthand for writing the same thing as above, i.e. an array of between 1 and 3 `raspberry`, each of which must be a string. `pear`: An array of zero or more items. Note the empty array must be there (if the array itself is optional, write `pear?*`). `plum`: An array of one or more entries. `kiwi`: An optionally present array of between 1 and 4 entries `guava`: A mandatory array of between 1 and 2 entries 1. Each key must either be `/` (for the root key) or consist of word characters (i.e. matching `\w+` in regexp terms) 2. Each key must have a value, which may be either: 4. Each validation instruction is a quoted string and may be either 5. Each pseudotype may be either 6. An array specifier has the form `[]key` optionally followed by a range specifier 7. A range specifier takes the form 9. A map element consists of a key (`key`) followed by zero or more modifiers 10. Permitted modifiers are: Where the validator is passed, it is a function with signature: Here's an example showing how it can return an error and send supplementary data back to the user. Note that cdl itself will add the appropriate context. A cdl configurator may optionally be passed to the `Validate` function. The configurator allows you to consume the configuration in your program now you know that is validated. The configurator consists of a map of keys to items. Each item should be either If a pointer to a variable is used, the variable must be of the same type as the item in the configuration, or an error will be issued; therefore as a type check is performed here, it is unnecessary in this case to require a specific type in the template. If a specific type is require, a type check is done twice. Certain pseudo-types being required will cause a type conversion: 1. If you required the pseudo-type `number`, you will be always be given a `float64` 2. If you required the pseudo-type `integer`, you will always be given an `int` If a pointer to an `Enum` is given, a `string` value is expected in the data, and it will be validated against that `Enum`. If a pointer configuration function is used, it has a `ConfiguratorFunc` type (or a function with a similar signature), which looks like this: This function is guaranteed to be called for each item in the tree (if it's key is present in the configurator) after it and all of its children have been validated. It may return an error (just like a validator function). The object passed will be the validated object from the configuration tree. It is guaranteed to be of the correct type, which means the type you asked for save for the following exceptions: 1. If you asked for the pseudo-type `number`, you will always be given a `float64`. 2. If you asked for the pseudo-type `integer`, you will always be given an `int`. As a trivial example: Here the parameter named `"i"` in the template will be stored in variable `i`.

v0.0.0-20150811161146-ed64eb8fed11 • 10 years ago

github.com/ashetkar/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v0.0.0-20211019035648-a5a9d81f6e56 • 4 years ago

github.com/mmcshane/gocql

Package gocql implements a fast and robust Cassandra driver for the Go programming language. Pass a list of initial node IP addresses to NewCluster to create a new cluster configuration: Port can be specified as part of the address, the above is equivalent to: It is recommended to use the value set in the Cassandra config for broadcast_address or listen_address, an IP address not a domain name. This is because events from Cassandra will use the configured IP address, which is used to index connected hosts. If the domain name specified resolves to more than 1 IP address then the driver may connect multiple times to the same host, and will not mark the node being down or up from events. Then you can customize more options (see ClusterConfig): The driver tries to automatically detect the protocol version to use if not set, but you might want to set the protocol version explicitly, as it's not defined which version will be used in certain situations (for example during upgrade of the cluster when some of the nodes support different set of protocol versions than other nodes). When ready, create a session from the configuration. Don't forget to Close the session once you are done with it: CQL protocol uses a SASL-based authentication mechanism and so consists of an exchange of server challenges and client response pairs. The details of the exchanged messages depend on the authenticator used. To use authentication, set ClusterConfig.Authenticator or ClusterConfig.AuthProvider. PasswordAuthenticator is provided to use for username/password authentication: It is possible to secure traffic between the client and server with TLS. To use TLS, set the ClusterConfig.SslOpts field. SslOptions embeds *tls.Config so you can set that directly. There are also helpers to load keys/certificates from files. Warning: Due to historical reasons, the SslOptions is insecure by default, so you need to set EnableHostVerification to true if no Config is set. Most users should set SslOptions.Config to a *tls.Config. SslOptions and Config.InsecureSkipVerify interact as follows: For example: To route queries to local DC first, use DCAwareRoundRobinPolicy. For example, if the datacenter you want to primarily connect is called dc1 (as configured in the database): The driver can route queries to nodes that hold data replicas based on partition key (preferring local DC). Note that TokenAwareHostPolicy can take options such as gocql.ShuffleReplicas and gocql.NonLocalReplicasFallback. We recommend running with a token aware host policy in production for maximum performance. Create queries with Session.Query. Query values must not be reused between different executions and must not be modified after starting execution of the query. To execute a query without reading results, use Query.Exec: Single row can be read by calling Query.Scan: Multiple rows can be read using Iter.Scanner: See Example for complete example. The driver automatically prepares DML queries (SELECT/INSERT/UPDATE/DELETE/BATCH statements) and maintains a cache of prepared statements. CQL protocol does not support preparing other query types. When using CQL protocol >= 4, it is possible to use gocql.UnsetValue as the bound value of a column. This will cause the database to ignore writing the column. The main advantage is the ability to keep the same prepared statement even when you don't want to update some fields, where before you needed to make another prepared statement. Session is safe to use from multiple goroutines, so to execute multiple concurrent queries, just execute them from several worker goroutines. Gocql provides synchronously-looking API (as recommended for Go APIs) and the queries are executed asynchronously at the protocol level. Null values are are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string variable instead of string. See Example_nulls for full example. The driver reuses backing memory of slices when unmarshalling. This is an optimization so that a buffer does not need to be allocated for every processed row. However, you need to be careful when storing the slices to other memory structures. When you want to save the data for later use, pass a new slice every time. A common pattern is to declare the slice variable within the scanner loop: The driver supports paging of results with automatic prefetch, see ClusterConfig.PageSize, Session.SetPrefetch, Query.PageSize, and Query.Prefetch. It is also possible to control the paging manually with Query.PageState (this disables automatic prefetch). Manual paging is useful if you want to store the page state externally, for example in a URL to allow users browse pages in a result. You might want to sign/encrypt the paging state when exposing it externally since it contains data from primary keys. Paging state is specific to the CQL protocol version and the exact query used. It is meant as opaque state that should not be modified. If you send paging state from different query or protocol version, then the behaviour is not defined (you might get unexpected results or an error from the server). For example, do not send paging state returned by node using protocol version 3 to a node using protocol version 4. Also, when using protocol version 4, paging state between Cassandra 2.2 and 3.0 is incompatible (https://issues.apache.org/jira/browse/CASSANDRA-10880). The driver does not check whether the paging state is from the same protocol version/statement. You might want to validate yourself as this could be a problem if you store paging state externally. For example, if you store paging state in a URL, the URLs might become broken when you upgrade your cluster. Call Query.PageState(nil) to fetch just the first page of the query results. Pass the page state returned by Iter.PageState to Query.PageState of a subsequent query to get the next page. If the length of slice returned by Iter.PageState is zero, there are no more pages available (or an error occurred). Using too low values of PageSize will negatively affect performance, a value below 100 is probably too low. While Cassandra returns exactly PageSize items (except for last page) in a page currently, the protocol authors explicitly reserved the right to return smaller or larger amount of items in a page for performance reasons, so don't rely on the page having the exact count of items. See Example_paging for an example of manual paging. There are certain situations when you don't know the list of columns in advance, mainly when the query is supplied by the user. Iter.Columns, Iter.RowData, Iter.MapScan and Iter.SliceMap can be used to handle this case. See Example_dynamicColumns. The CQL protocol supports sending batches of DML statements (INSERT/UPDATE/DELETE) and so does gocql. Use Session.NewBatch to create a new batch and then fill-in details of individual queries. Then execute the batch with Session.ExecuteBatch. Logged batches ensure atomicity, either all or none of the operations in the batch will succeed, but they have overhead to ensure this property. Unlogged batches don't have the overhead of logged batches, but don't guarantee atomicity. Updates of counters are handled specially by Cassandra so batches of counter updates have to use CounterBatch type. A counter batch can only contain statements to update counters. For unlogged batches it is recommended to send only single-partition batches (i.e. all statements in the batch should involve only a single partition). Multi-partition batch needs to be split by the coordinator node and re-sent to correct nodes. With single-partition batches you can send the batch directly to the node for the partition without incurring the additional network hop. It is also possible to pass entire BEGIN BATCH .. APPLY BATCH statement to Query.Exec. There are differences how those are executed. BEGIN BATCH statement passed to Query.Exec is prepared as a whole in a single statement. Session.ExecuteBatch prepares individual statements in the batch. If you have variable-length batches using the same statement, using Session.ExecuteBatch is more efficient. See Example_batch for an example. Query.ScanCAS or Query.MapScanCAS can be used to execute a single-statement lightweight transaction (an INSERT/UPDATE .. IF statement) and reading its result. See example for Query.MapScanCAS. Multiple-statement lightweight transactions can be executed as a logged batch that contains at least one conditional statement. All the conditions must return true for the batch to be applied. You can use Session.ExecuteBatchCAS and Session.MapExecuteBatchCAS when executing the batch to learn about the result of the LWT. See example for Session.MapExecuteBatchCAS. Queries can be marked as idempotent. Marking the query as idempotent tells the driver that the query can be executed multiple times without affecting its result. Non-idempotent queries are not eligible for retrying nor speculative execution. Idempotent queries are retried in case of errors based on the configured RetryPolicy. Queries can be retried even before they fail by setting a SpeculativeExecutionPolicy. The policy can cause the driver to retry on a different node if the query is taking longer than a specified delay even before the driver receives an error or timeout from the server. When a query is speculatively executed, the original execution is still executing. The two parallel executions of the query race to return a result, the first received result will be returned. UDTs can be mapped (un)marshaled from/to map[string]interface{} a Go struct (or a type implementing UDTUnmarshaler, UDTMarshaler, Unmarshaler or Marshaler interfaces). For structs, cql tag can be used to specify the CQL field name to be mapped to a struct field: See Example_userDefinedTypesMap, Example_userDefinedTypesStruct, ExampleUDTMarshaler, ExampleUDTUnmarshaler. It is possible to provide observer implementations that could be used to gather metrics: CQL protocol also supports tracing of queries. When enabled, the database will write information about internal events that happened during execution of the query. You can use Query.Trace to request tracing and receive the session ID that the database used to store the trace information in system_traces.sessions and system_traces.events tables. NewTraceWriter returns an implementation of Tracer that writes the events to a writer. Gathering trace information might be essential for debugging and optimizing queries, but writing traces has overhead, so this feature should not be used on production systems with very high load unless you know what you are doing. Example_batch demonstrates how to execute a batch of statements. Example_dynamicColumns demonstrates how to handle dynamic column list. Example_marshalerUnmarshaler demonstrates how to implement a Marshaler and Unmarshaler. Example_nulls demonstrates how to distinguish between null and zero value when needed. Null values are unmarshalled as zero value of the type. If you need to distinguish for example between text column being null and empty string, you can unmarshal into *string field. Example_paging demonstrates how to manually fetch pages and use page state. See also package documentation about paging. Example_set demonstrates how to use sets. Example_userDefinedTypesMap demonstrates how to work with user-defined types as maps. See also Example_userDefinedTypesStruct and examples for UDTMarshaler and UDTUnmarshaler if you want to map to structs. Example_userDefinedTypesStruct demonstrates how to work with user-defined types as structs. See also examples for UDTMarshaler and UDTUnmarshaler if you need more control/better performance.

v0.0.3 • 4 years ago

github.com/ic-projects/pigeon

Command pigeon generates parsers in Go from a PEG grammar. From Wikipedia [0]: Its features and syntax are inspired by the PEG.js project [1], while the implementation is loosely based on [2]. Formal presentation of the PEG theory by Bryan Ford is also an important reference [3]. An introductory blog post can be found at [4]. The pigeon tool must be called with PEG input as defined by the accepted PEG syntax below. The grammar may be provided by a file or read from stdin. The generated parser is written to stdout by default. The following options can be specified: The tool makes no attempt to format the code, nor to detect the required imports. It is recommended to use goimports to properly generate the output code: The goimports tool can be installed with: If the code blocks in the grammar (see below, section "Code block") are golint- and go vet-compliant, then the resulting generated code will also be golint- and go vet-compliant. The generated code doesn't use any third-party dependency unless code blocks in the grammar require such a dependency. The accepted syntax for the grammar is formally defined in the grammar/pigeon.peg file, using the PEG syntax. What follows is an informal description of this syntax. Identifiers, whitespace, comments and literals follow the same notation as the Go language, as defined in the language specification (http://golang.org/ref/spec#Source_code_representation): The grammar must be Unicode text encoded in UTF-8. New lines are identified by the \n character (U+000A). Space (U+0020), horizontal tabs (U+0009) and carriage returns (U+000D) are considered whitespace and are ignored except to separate tokens. A PEG grammar consists of a set of rules. A rule is an identifier followed by a rule definition operator and an expression. An optional display name - a string literal used in error messages instead of the rule identifier - can be specified after the rule identifier. E.g.: The rule definition operator can be any one of those: A rule is defined by an expression. The following sections describe the various expression types. Expressions can be grouped by using parentheses, and a rule can be referenced by its identifier in place of an expression. The choice expression is a list of expressions that will be tested in the order they are defined. The first one that matches will be used. Expressions are separated by the forward slash character "/". E.g.: Because the first match is used, it is important to think about the order of expressions. For example, in this rule, "<=" would never be used because the "<" expression comes first: The sequence expression is a list of expressions that must all match in that same order for the sequence expression to be considered a match. Expressions are separated by whitespace. E.g.: A labeled expression consists of an identifier followed by a colon ":" and an expression. A labeled expression introduces a variable named with the label that can be referenced in the code blocks in the same scope. The variable will have the value of the expression that follows the colon. E.g.: The variable is typed as an empty interface, and the underlying type depends on the following: For terminals (character and string literals, character classes and the any matcher), the value is []byte. E.g.: For predicates (& and !), the value is always nil. E.g.: For a sequence, the value is a slice of empty interfaces, one for each expression value in the sequence. The underlying types of each value in the slice follow the same rules described here, recursively. E.g.: For a repetition (+ and *), the value is a slice of empty interfaces, one for each repetition. The underlying types of each value in the slice follow the same rules described here, recursively. E.g.: For a choice expression, the value is that of the matching choice. E.g.: For the optional expression (?), the value is nil or the value of the expression. E.g.: Of course, the type of the value can be anything once an action code block is used. E.g.: An expression prefixed with the ampersand "&" is the "and" predicate expression: it is considered a match if the following expression is a match, but it does not consume any input. An expression prefixed with the exclamation point "!" is the "not" predicate expression: it is considered a match if the following expression is not a match, but it does not consume any input. E.g.: The expression following the & and ! operators can be a code block. In that case, the code block must return a bool and an error. The operator's semantic is the same, & is a match if the code block returns true, ! is a match if the code block returns false. The code block has access to any labeled value defined in its scope. E.g.: An expression followed by "*", "?" or "+" is a match if the expression occurs zero or more times ("*"), zero or one time "?" or one or more times ("+") respectively. The match is greedy, it will match as many times as possible. E.g. A literal matcher tries to match the input against a single character or a string literal. The literal may be a single-quoted single character, a double-quoted string or a backtick-quoted raw string. The same rules as in Go apply regarding the allowed characters and escapes. The literal may be followed by a lowercase "i" (outside the ending quote) to indicate that the match is case-insensitive. E.g.: A character class matcher tries to match the input against a class of characters inside square brackets "[...]". Inside the brackets, characters represent themselves and the same escapes as in string literals are available, except that the single- and double-quote escape is not valid, instead the closing square bracket "]" must be escaped to be used. Character ranges can be specified using the "[a-z]" notation. Unicode classes can be specified using the "[\pL]" notation, where L is a single-letter Unicode class of characters, or using the "[\p{Class}]" notation where Class is a valid Unicode class (e.g. "Latin"). As for string literals, a lowercase "i" may follow the matcher (outside the ending square bracket) to indicate that the match is case-insensitive. A "^" as first character inside the square brackets indicates that the match is inverted (it is a match if the input does not match the character class matcher). E.g.: The any matcher is represented by the dot ".". It matches any character except the end of file, thus the "!." expression is used to indicate "match the end of file". E.g.: Code blocks can be added to generate custom Go code. There are three kinds of code blocks: the initializer, the action and the predicate. All code blocks appear inside curly braces "{...}". The initializer must appear first in the grammar, before any rule. It is copied as-is (minus the wrapping curly braces) at the top of the generated parser. It may contain function declarations, types, variables, etc. just like any Go file. Every symbol declared here will be available to all other code blocks. Although the initializer is optional in a valid grammar, it is usually required to generate a valid Go source code file (for the package clause). E.g.: Action code blocks are code blocks declared after an expression in a rule. Those code blocks are turned into a method on the "*current" type in the generated source code. The method receives any labeled expression's value as argument (as interface{}) and must return two values, the first being the value of the expression (an interface{}), and the second an error. If a non-nil error is returned, it is added to the list of errors that the parser will return. E.g.: Predicate code blocks are code blocks declared immediately after the and "&" or the not "!" operators. Like action code blocks, predicate code blocks are turned into a method on the "*current" type in the generated source code. The method receives any labeled expression's value as argument (as interface{}) and must return two values, the first being a bool and the second an error. If a non-nil error is returned, it is added to the list of errors that the parser will return. E.g.: The current type is a struct that provides two useful fields that can be accessed in action and predicate code blocks: "pos" and "text". The "pos" field indicates the current position of the parser in the source input. It is itself a struct with three fields: "line", "col" and "offset". Line is a 1-based line number, col is a 1-based column number that counts runes from the start of the line, and offset is a 0-based byte offset. The "text" field is the slice of bytes of the current match. It is empty in a predicate code block. The parser generated by pigeon exports a few symbols so that it can be used as a package with public functions to parse input text. The exported API is: See the godoc page of the generated parser for the test/predicates grammar for an example documentation page of the exported API: http://godoc.org/github.com/mna/pigeon/test/predicates. Like the grammar used to generate the parser, the input text must be UTF-8-encoded Unicode. The start rule of the parser is the first rule in the PEG grammar used to generate the parser. A call to any of the Parse* functions returns the value generated by executing the grammar on the provided input text, and an optional error. Typically, the grammar should generate some kind of abstract syntax tree (AST), but for simple grammars it may evaluate the result immediately, such as in the examples/calculator example. There are no constraints imposed on the author of the grammar, it can return whatever is needed. When the parser returns a non-nil error, the error is always of type errList, which is defined as a slice of errors ([]error). Each error in the list is of type *parserError. This is a struct that has an "Inner" field that can be used to access the original error. So if a code block returns some well-known error like: The original error can be accessed this way: By defaut the parser will continue after an error is returned and will cumulate all errors found during parsing. If the grammar reaches a point where it shouldn't continue, a panic statement can be used to terminate parsing. The panic will be caught at the top-level of the Parse* call and will be converted into a *parserError like any error, and an errList will still be returned to the caller. The divide by zero error in the examples/calculator grammar leverages this feature (no special code is needed to handle division by zero, if it happens, the runtime panics and it is recovered and returned as a parsing error). Providing good error reporting in a parser is not a trivial task. Part of it is provided by the pigeon tool, by offering features such as filename, position and rule name in the error message, but an important part of good error reporting needs to be done by the grammar author. For example, many programming languages use double-quotes for string literals. Usually, if the opening quote is found, the closing quote is expected, and if none is found, there won't be any other rule that will match, there's no need to backtrack and try other choices, an error should be added to the list and the match should be consumed. In order to do this, the grammar can look something like this: This is just one example, but it illustrates the idea that error reporting needs to be thought out when designing the grammar. Generated parsers have user-provided code mixed with pigeon code in the same package, so there is no package boundary in the resulting code to prevent access to unexported symbols. What is meant to be implementation details in pigeon is also available to user code - which doesn't mean it should be used. For this reason, it is important to precisely define what is intended to be the supported API of pigeon, the parts that will be stable in future versions. The "stability" of the API attempts to make a similar guarantee as the Go 1 compatibility [5]. The following lists what part of the current pigeon code falls under that guarantee (features may be added in the future): The pigeon command-line flags and arguments: those will not be removed and will maintain the same semantics. The explicitly exported API generated by pigeon. See [6] for the documentation of this API on a generated parser. The PEG syntax, as documented above. The code blocks (except the initializer) will always be generated as methods on the *current type, and this type is guaranteed to have the fields pos (type position) and text (type []byte). There are no guarantees on other fields and methods of this type. The position type will always have the fields line, col and offset, all defined as int. There are no guarantees on other fields and methods of this type. The type of the error value returned by the Parse* functions, when not nil, will always be errList defined as a []error. There are no guarantees on methods of this type, other than the fact it implements the error interface. Individual errors in the errList will always be of type *parserError, and this type is guaranteed to have an Inner field that contains the original error value. There are no guarantees on other fields and methods of this type. References:

v1.0.0 • 8 years ago

github.com/gandhikrishna/cadence-client

Package cadence and its subdirectories contain the Cadence client side framework. The Cadence service is a task orchestrator for your application’s tasks. Applications using Cadence can execute a logical flow of tasks, especially long-running business logic, asynchronously or synchronously. They can also scale at runtime on distributed systems. A quick example illustrates its use case. Consider Uber Eats where Cadence manages the entire business flow from placing an order, accepting it, handling shopping cart processes (adding, updating, and calculating cart items), entering the order in a pipeline (for preparing food and coordinating delivery), to scheduling delivery as well as handling payments. Cadence consists of a programming framework (or client library) and a managed service (or backend). The framework enables developers to author and coordinate tasks in Go code. The root cadence package contains common data structures. The subpackages are: The Cadence hosted service brokers and persists events generated during workflow execution. Worker nodes owned and operated by customers execute the coordination and task logic. To facilitate the implementation of worker nodes Cadence provides a client-side library for the Go language. In Cadence, you can code the logical flow of events separately as a workflow and code business logic as activities. The workflow identifies the activities and sequences them, while an activity executes the logic. Dynamic workflow execution graphs - Determine the workflow execution graphs at runtime based on the data you are processing. Cadence does not pre-compute the execution graphs at compile time or at workflow start time. Therefore, you have the ability to write workflows that can dynamically adjust to the amount of data they are processing. If you need to trigger 10 instances of an activity to efficiently process all the data in one run, but only 3 for a subsequent run, you can do that. Child Workflows - Orchestrate the execution of a workflow from within another workflow. Cadence will return the results of the child workflow execution to the parent workflow upon completion of the child workflow. No polling is required in the parent workflow to monitor status of the child workflow, making the process efficient and fault tolerant. Durable Timers - Implement delayed execution of tasks in your workflows that are robust to worker failures. Cadence provides two easy to use APIs, **workflow.Sleep** and **workflow.Timer**, for implementing time based events in your workflows. Cadence ensures that the timer settings are persisted and the events are generated even if workers executing the workflow crash. Signals - Modify/influence the execution path of a running workflow by pushing additional data directly to the workflow using a signal. Via the Signal facility, Cadence provides a mechanism to consume external events directly in workflow code. Task routing - Efficiently process large amounts of data using a Cadence workflow, by caching the data locally on a worker and executing all activities meant to process that data on that same worker. Cadence enables you to choose the worker you want to execute a certain activity by scheduling that activity execution in the worker's specific task-list. Unique workflow ID enforcement - Use business entity IDs for your workflows and let Cadence ensure that only one workflow is running for a particular entity at a time. Cadence implements an atomic "uniqueness check" and ensures that no race conditions are possible that would result in multiple workflow executions for the same workflow ID. Therefore, you can implement your code to attempt to start a workflow without checking if the ID is already in use, even in the cases where only one active execution per workflow ID is desired. Perpetual/ContinueAsNew workflows - Run periodic tasks as a single perpetually running workflow. With the "ContinueAsNew" facility, Cadence allows you to leverage the "unique workflow ID enforcement" feature for periodic workflows. Cadence will complete the current execution and start the new execution atomically, ensuring you get to keep your workflow ID. By starting a new execution Cadence also ensures that workflow execution history does not grow indefinitely for perpetual workflows. At-most once activity execution - Execute non-idempotent activities as part of your workflows. Cadence will not automatically retry activities on failure. For every activity execution Cadence will return a success result, a failure result, or a timeout to the workflow code and let the workflow code determine how each one of those result types should be handled. Asynch Activity Completion - Incorporate human input or thrid-party service asynchronous callbacks into your workflows. Cadence allows a workflow to pause execution on an activity and wait for an external actor to resume it with a callback. During this pause the activity does not have any actively executing code, such as a polling loop, and is merely an entry in the Cadence datastore. Therefore, the workflow is unaffected by any worker failures happening over the duration of the pause. Activity Heartbeating - Detect unexpected failures/crashes and track progress in long running activities early. By configuring your activity to report progress periodically to the Cadence server, you can detect a crash that occurs 10 minutes into an hour-long activity execution much sooner, instead of waiting for the 60-minute execution timeout. The recorded progress before the crash gives you sufficient information to determine whether to restart the activity from the beginning or resume it from the point of failure. Timeouts for activities and workflow executions - Protect against stuck and unresponsive activities and workflows with appropriate timeout values. Cadence requires that timeout values are provided for every activity or workflow invocation. There is no upper bound on the timeout values, so you can set timeouts that span days, weeks, or even months. Visibility - Get a list of all your active and/or completed workflow. Explore the execution history of a particular workflow execution. Cadence provides a set of visibility APIs that allow you, the workflow owner, to monitor past and current workflow executions. Debuggability - Replay any workflow execution history locally under a debugger. The Cadence client library provides an API to allow you to capture a stack trace from any failed workflow execution history.

v0.19.0 • 4 years ago

gitlab.com/cznic/ql

Package ql implements a pure Go embedded SQL database engine. QL is a member of the SQL family of languages. It is less complex and less powerful than SQL (whichever specification SQL is considered to be). 2018-11-04: Back end file format V2 is now released. To use the new format for newly created databases set the FileFormat field in *Options passed to OpenFile to value 2 or use the driver named "ql2" instead of "ql". - Both the old and new driver will properly open and use, read and write the old (V1) or new file (V2) format of an existing database. - V1 format has a record size limit of ~64 kB. V2 format record size limit is math.MaxInt32. - V1 format uncommitted transaction size is limited by memory resources. V2 format uncommitted transaction is limited by free disk space. - A direct consequence of the previous is that small transactions perform better using V1 format and big transactions perform better using V2 format. - V2 format uses substantially less memory. 2018-08-02: Release v1.2.0 adds initial support for Go modules. 2017-01-10: Release v1.1.0 fixes some bugs and adds a configurable WAL headroom. 2016-07-29: Release v1.0.6 enables alternatively using = instead of == for equality operation. 2016-07-11: Release v1.0.5 undoes vendoring of lldb. QL now uses stable lldb (modernc.org/lldb). 2016-07-06: Release v1.0.4 fixes a panic when closing the WAL file. 2016-04-03: Release v1.0.3 fixes a data race. 2016-03-23: Release v1.0.2 vendors gitlab.com/cznic/exp/lldb and github.com/camlistore/go4/lock. 2016-03-17: Release v1.0.1 adjusts for latest goyacc. Parser error messages are improved and changed, but their exact form is not considered a API change. 2016-03-05: The current version has been tagged v1.0.0. 2015-06-15: To improve compatibility with other SQL implementations, the count built-in aggregate function now accepts * as its argument. 2015-05-29: The execution planner was rewritten from scratch. It should use indices in all places where they were used before plus in some additional situations. It is possible to investigate the plan using the newly added EXPLAIN statement. The QL tool is handy for such analysis. If the planner would have used an index, but no such exists, the plan includes hints in form of copy/paste ready CREATE INDEX statements. The planner is still quite simple and a lot of work on it is yet ahead. You can help this process by filling an issue with a schema and query which fails to use an index or indices when it should, in your opinion. Bonus points for including output of `ql 'explain <query>'`. 2015-05-09: The grammar of the CREATE INDEX statement now accepts an expression list instead of a single expression, which was further limited to just a column name or the built-in id(). As a side effect, composite indices are now functional. However, the values in the expression-list style index are not yet used by other statements or the statement/query planner. The composite index is useful while having UNIQUE clause to check for semantically duplicate rows before they get added to the table or when such a row is mutated using the UPDATE statement and the expression-list style index tuple of the row is thus recomputed. 2015-05-02: The Schema field of table __Table now correctly reflects any column constraints and/or defaults. Also, the (*DB).Info method now has that information provided in new ColumInfo fields NotNull, Constraint and Default. 2015-04-20: Added support for {LEFT,RIGHT,FULL} [OUTER] JOIN. 2015-04-18: Column definitions can now have constraints and defaults. Details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. 2015-03-06: New built-in functions formatFloat and formatInt. Thanks urandom! (https://github.com/urandom) 2015-02-16: IN predicate now accepts a SELECT statement. See the updated "Predicates" section. 2015-01-17: Logical operators || and && have now alternative spellings: OR and AND (case insensitive). AND was a keyword before, but OR is a new one. This can possibly break existing queries. For the record, it's a good idea to not use any name appearing in, for example, [7] in your queries as the list of QL's keywords may expand for gaining better compatibility with existing SQL "standards". 2015-01-12: ACID guarantees were tightened at the cost of performance in some cases. The write collecting window mechanism, a formerly used implementation detail, was removed. Inserting rows one by one in a transaction is now slow. I mean very slow. Try to avoid inserting single rows in a transaction. Instead, whenever possible, perform batch updates of tens to, say thousands of rows in a single transaction. See also: http://www.sqlite.org/faq.html#q19, the discussed synchronization principles involved are the same as for QL, modulo minor details. Note: A side effect is that closing a DB before exiting an application, both for the Go API and through database/sql driver, is no more required, strictly speaking. Beware that exiting an application while there is an open (uncommitted) transaction in progress means losing the transaction data. However, the DB will not become corrupted because of not closing it. Nor that was the case before, but formerly failing to close a DB could have resulted in losing the data of the last transaction. 2014-09-21: id() now optionally accepts a single argument - a table name. 2014-09-01: Added the DB.Flush() method and the LIKE pattern matching predicate. 2014-08-08: The built in functions max and min now accept also time values. Thanks opennota! (https://github.com/opennota) 2014-06-05: RecordSet interface extended by new methods FirstRow and Rows. 2014-06-02: Indices on id() are now used by SELECT statements. 2014-05-07: Introduction of Marshal, Schema, Unmarshal. 2014-04-15: Added optional IF NOT EXISTS clause to CREATE INDEX and optional IF EXISTS clause to DROP INDEX. 2014-04-12: The column Unique in the virtual table __Index was renamed to IsUnique because the old name is a keyword. Unfortunately, this is a breaking change, sorry. 2014-04-11: Introduction of LIMIT, OFFSET. 2014-04-10: Introduction of query rewriting. 2014-04-07: Introduction of indices. QL imports zappy[8], a block-based compressor, which speeds up its performance by using a C version of the compression/decompression algorithms. If a CGO-free (pure Go) version of QL, or an app using QL, is required, please include 'purego' in the -tags option of go {build,get,install}. For example: If zappy was installed before installing QL, it might be necessary to rebuild zappy first (or rebuild QL with all its dependencies using the -a option): The syntax is specified using Extended Backus-Naur Form (EBNF) Lower-case production names are used to identify lexical tokens. Non-terminals are in CamelCase. Lexical tokens are enclosed in double quotes "" or back quotes “. The form a … b represents the set of characters from a through b as alternatives. The horizontal ellipsis … is also used elsewhere in the spec to informally denote various enumerations or code snippets that are not further specified. QL source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points. For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text. Each code point is distinct; for instance, upper and lower case letters are different characters. Implementation restriction: For compatibility with other tools, the parser may disallow the NUL character (U+0000) in the statement. Implementation restriction: A byte order mark is disallowed anywhere in QL statements. The following terms are used to denote specific character classes The underscore character _ (U+005F) is considered a letter. Lexical elements are comments, tokens, identifiers, keywords, operators and delimiters, integer, floating-point, imaginary, rune and string literals and QL parameters. Line comments start with the character sequence // or -- and stop at the end of the line. A line comment acts like a space. General comments start with the character sequence /* and continue through the character sequence */. A general comment acts like a space. Comments do not nest. Tokens form the vocabulary of QL. There are four classes: identifiers, keywords, operators and delimiters, and literals. White space, formed from spaces (U+0020), horizontal tabs (U+0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. The formal grammar uses semicolons ";" as separators of QL statements. A single QL statement or the last QL statement in a list of statements can have an optional semicolon terminator. (Actually a separator from the following empty statement.) Identifiers name entities such as tables or record set columns. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter. For example No identifiers are predeclared, however note that no keyword can be used as an identifier. Identifiers starting with two underscores are used for meta data virtual tables names. For forward compatibility, users should generally avoid using any identifiers starting with two underscores. For example The following keywords are reserved and may not be used as identifiers. Keywords are not case sensitive. The following character sequences represent operators, delimiters, and other special tokens Operators consisting of more than one character are referred to by names in the rest of the documentation An integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0 for octal, 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15. For example A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided. For example An imaginary literal is a decimal representation of the imaginary part of a complex constant. It consists of a floating-point literal or decimal integer followed by the lower-case letter i. For example A rune literal represents a rune constant, an integer value identifying a Unicode code point. A rune literal is expressed as one or more characters enclosed in single quotes. Within the quotes, any character may appear except single quote and newline. A single quoted character represents the Unicode value of the character itself, while multi-character sequences beginning with a backslash encode values in various formats. The simplest form represents the single character within the quotes; since QL statements are Unicode characters encoded in UTF-8, multiple UTF-8-encoded bytes may represent a single integer value. For instance, the literal 'a' holds a single byte representing a literal a, Unicode U+0061, value 0x61, while 'ä' holds two bytes (0xc3 0xa4) representing a literal a-dieresis, U+00E4, value 0xe4. Several backslash escapes allow arbitrary values to be encoded as ASCII text. There are four ways to represent the integer value as a numeric constant: \x followed by exactly two hexadecimal digits; \u followed by exactly four hexadecimal digits; \U followed by exactly eight hexadecimal digits, and a plain backslash \ followed by exactly three octal digits. In each case the value of the literal is the value represented by the digits in the corresponding base. Although these representations all result in an integer, they have different valid ranges. Octal escapes must represent a value between 0 and 255 inclusive. Hexadecimal escapes satisfy this condition by construction. The escapes \u and \U represent Unicode code points so within them some values are illegal, in particular those above 0x10FFFF and surrogate halves. After a backslash, certain single-character escapes represent special values All other sequences starting with a backslash are illegal inside rune literals. For example A string literal represents a string constant obtained from concatenating a sequence of characters. There are two forms: raw string literals and interpreted string literals. Raw string literals are character sequences between back quotes “. Within the quotes, any character is legal except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines. Carriage returns inside raw string literals are discarded from the raw string value. Interpreted string literals are character sequences between double quotes "". The text between the quotes, which may not contain newlines, forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions. The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF. For example These examples all represent the same string If the statement source represents a character as two code points, such as a combining form involving an accent and a letter, the result will be an error if placed in a rune literal (it is not a single code point), and will appear as two code points if placed in a string literal. Literals are assigned their values from the respective text representation at "compile" (parse) time. QL parameters provide the same functionality as literals, but their value is assigned at execution time from an expression list passed to DB.Run or DB.Execute. Using '?' or '$' is completely equivalent. For example Keywords 'false' and 'true' (not case sensitive) represent the two possible constant values of type bool (also not case sensitive). Keyword 'NULL' (not case sensitive) represents an untyped constant which is assignable to any type. NULL is distinct from any other value of any type. A type determines the set of values and operations specific to values of that type. A type is specified by a type name. Named instances of the boolean, numeric, and string types are keywords. The names are not case sensitive. Note: The blob type is exchanged between the back end and the API as []byte. On 32 bit platforms this limits the size which the implementation can handle to 2G. A boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false. The predeclared boolean type is bool. A duration type represents the elapsed time between two instants as an int64 nanosecond count. The representation limits the largest representable duration to approximately 290 years. A numeric type represents sets of integer or floating-point values. The predeclared architecture-independent numeric types are The value of an n-bit integer is n bits wide and represented using two's complement arithmetic. Conversions are required when different numeric types are mixed in an expression or assignment. A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. The case insensitive keyword for the string type is 'string'. The length of a string (its size in bytes) can be discovered using the built-in function len. A time type represents an instant in time with nanosecond precision. Each time has associated with it a location, consulted when computing the presentation form of the time. The following functions are implicitly declared An expression specifies the computation of a value by applying operators and functions to operands. Operands denote the elementary values in an expression. An operand may be a literal, a (possibly qualified) identifier denoting a constant or a function or a table/record set column, or a parenthesized expression. A qualified identifier is an identifier qualified with a table/record set name prefix. For example Primary expression are the operands for unary and binary expressions. For example A primary expression of the form denotes the element of a string indexed by x. Its type is byte. The value x is called the index. The following rules apply - The index x must be of integer type except bigint or duration; it is in range if 0 <= x < len(s), otherwise it is out of range. - A constant index must be non-negative and representable by a value of type int. - A constant index must be in range if the string a is a literal. - If x is out of range at run time, a run-time error occurs. - s[x] is the byte at index x and the type of s[x] is byte. If s is NULL or x is NULL then the result is NULL. Otherwise s[x] is illegal. For a string, the primary expression constructs a substring. The indices low and high select which elements appear in the result. The result has indices starting at 0 and length equal to high - low. For convenience, any of the indices may be omitted. A missing low index defaults to zero; a missing high index defaults to the length of the sliced operand The indices low and high are in range if 0 <= low <= high <= len(a), otherwise they are out of range. A constant index must be non-negative and representable by a value of type int. If both indices are constant, they must satisfy low <= high. If the indices are out of range at run time, a run-time error occurs. Integer values of type bigint or duration cannot be used as indices. If s is NULL the result is NULL. If low or high is not omitted and is NULL then the result is NULL. Given an identifier f denoting a predeclared function, calls f with arguments a1, a2, … an. Arguments are evaluated before the function is called. The type of the expression is the result type of f. In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return value of the function is passed by value when the function returns. Calling an undefined function causes a compile-time error. Operators combine operands into expressions. Comparisons are discussed elsewhere. For other binary operators, the operand types must be identical unless the operation involves shifts or untyped constants. For operations involving constants only, see the section on constant expressions. Except for shift operations, if one operand is an untyped constant and the other operand is not, the constant is converted to the type of the other operand. The right operand in a shift expression must have unsigned integer type or be an untyped constant that can be converted to unsigned integer type. If the left operand of a non-constant shift expression is an untyped constant, the type of the constant is what it would be if the shift expression were replaced by its left operand alone. Expressions of the form yield a boolean value true if expr2, a regular expression, matches expr1 (see also [6]). Both expression must be of type string. If any one of the expressions is NULL the result is NULL. Predicates are special form expressions having a boolean result type. Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be comparable as defined in "Comparison operators". Another form of the IN predicate creates the expression list from a result of a SelectStmt. The SelectStmt must select only one column. The produced expression list is resource limited by the memory available to the process. NULL values produced by the SelectStmt are ignored, but if all records of the SelectStmt are NULL the predicate yields NULL. The select statement is evaluated only once. If the type of expr is not the same as the type of the field returned by the SelectStmt then the set operation yields false. The type of the column returned by the SelectStmt must be one of the simple (non blob-like) types: Expressions of the form are equivalent, including NULL handling, to The types of involved expressions must be ordered as defined in "Comparison operators". Expressions of the form yield a boolean value true if expr does not have a specific type (case A) or if expr has a specific type (case B). In other cases the result is a boolean value false. Unary operators have the highest precedence. There are five precedence levels for binary operators. Multiplication operators bind strongest, followed by addition operators, comparison operators, && (logical AND), and finally || (logical OR) Binary operators of the same precedence associate from left to right. For instance, x / y * z is the same as (x / y) * z. Note that the operator precedence is reflected explicitly by the grammar. Arithmetic operators apply to numeric values and yield a result of the same type as the first operand. The four standard arithmetic operators (+, -, *, /) apply to integer, rational, floating-point, and complex types; + also applies to strings; +,- also applies to times. All other arithmetic operators apply to integers only. sum integers, rationals, floats, complex values, strings difference integers, rationals, floats, complex values, times product integers, rationals, floats, complex values / quotient integers, rationals, floats, complex values % remainder integers & bitwise AND integers | bitwise OR integers ^ bitwise XOR integers &^ bit clear (AND NOT) integers << left shift integer << unsigned integer >> right shift integer >> unsigned integer Strings can be concatenated using the + operator String addition creates a new string by concatenating the operands. A value of type duration can be added to or subtracted from a value of type time. Times can subtracted from each other producing a value of type duration. For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships with x / y truncated towards zero ("truncated division"). As an exception to this rule, if the dividend x is the most negative value for the int type of x, the quotient q = x / -1 is equal to x (and r = 0). If the divisor is a constant expression, it must not be zero. If the divisor is zero at run time, a run-time error occurs. If the dividend is non-negative and the divisor is a constant power of 2, the division may be replaced by a right shift, and computing the remainder may be replaced by a bitwise AND operation The shift operators shift the left operand by the shift count specified by the right operand. They implement arithmetic shifts if the left operand is a signed integer and logical shifts if it is an unsigned integer. There is no upper limit on the shift count. Shifts behave as if the left operand is shifted n times by 1 for a shift count of n. As a result, x << 1 is the same as x*2 and x >> 1 is the same as x/2 but truncated towards negative infinity. For integer operands, the unary operators +, -, and ^ are defined as follows For floating-point and complex numbers, +x is the same as x, while -x is the negation of x. The result of a floating-point or complex division by zero is not specified beyond the IEEE-754 standard; whether a run-time error occurs is implementation-specific. Whenever any operand of any arithmetic operation, unary or binary, is NULL, as well as in the case of the string concatenating operation, the result is NULL. For unsigned integer values, the operations +, -, *, and << are computed modulo 2n, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and expressions may rely on “wrap around”. For signed integers with a finite bit width, the operations +, -, *, and << may legally overflow and the resulting value exists and is deterministically defined by the signed integer representation, the operation, and its operands. No exception is raised as a result of overflow. An evaluator may not optimize an expression under the assumption that overflow does not occur. For instance, it may not assume that x < x + 1 is always true. Integers of type bigint and rationals do not overflow but their handling is limited by the memory resources available to the program. Comparison operators compare two operands and yield a boolean value. In any comparison, the first operand must be of same type as is the second operand, or vice versa. The equality operators == and != apply to operands that are comparable. The ordering operators <, <=, >, and >= apply to operands that are ordered. These terms and the result of the comparisons are defined as follows - Boolean values are comparable. Two boolean values are equal if they are either both true or both false. - Complex values are comparable. Two complex values u and v are equal if both real(u) == real(v) and imag(u) == imag(v). - Integer values are comparable and ordered, in the usual way. Note that durations are integers. - Floating point values are comparable and ordered, as defined by the IEEE-754 standard. - Rational values are comparable and ordered, in the usual way. - String and Blob values are comparable and ordered, lexically byte-wise. - Time values are comparable and ordered. Whenever any operand of any comparison operation is NULL, the result is NULL. Note that slices are always of type string. Logical operators apply to boolean values and yield a boolean result. The right operand is evaluated conditionally. The truth tables for logical operations with NULL values Conversions are expressions of the form T(x) where T is a type and x is an expression that can be converted to type T. A constant value x can be converted to type T in any of these cases: - x is representable by a value of type T. - x is a floating-point constant, T is a floating-point type, and x is representable by a value of type T after rounding using IEEE 754 round-to-even rules. The constant T(x) is the rounded value. - x is an integer constant and T is a string type. The same rule as for non-constant x applies in this case. Converting a constant yields a typed constant as result. A non-constant value x can be converted to type T in any of these cases: - x has type T. - x's type and T are both integer or floating point types. - x's type and T are both complex types. - x is an integer, except bigint or duration, and T is a string type. Specific rules apply to (non-constant) conversions between numeric types or to and from a string type. These conversions may change the representation of x and incur a run-time cost. All other conversions only change the type but not the representation of x. A conversion of NULL to any type yields NULL. For the conversion of non-constant numeric values, the following rules apply 1. When converting between integer types, if the value is a signed integer, it is sign extended to implicit infinite precision; otherwise it is zero extended. It is then truncated to fit in the result type's size. For example, if v == uint16(0x10F0), then uint32(int8(v)) == 0xFFFFFFF0. The conversion always yields a valid value; there is no indication of overflow. 2. When converting a floating-point number to an integer, the fraction is discarded (truncation towards zero). 3. When converting an integer or floating-point number to a floating-point type, or a complex number to another complex type, the result value is rounded to the precision specified by the destination type. For instance, the value of a variable x of type float32 may be stored using additional precision beyond that of an IEEE-754 32-bit number, but float32(x) represents the result of rounding x's value to 32-bit precision. Similarly, x + 0.1 may use more than 32 bits of precision, but float32(x + 0.1) does not. In all non-constant conversions involving floating-point or complex values, if the result type cannot represent the value the conversion succeeds but the result value is implementation-dependent. 1. Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD". 2. Converting a blob to a string type yields a string whose successive bytes are the elements of the blob. 3. Converting a value of a string type to a blob yields a blob whose successive elements are the bytes of the string. 4. Converting a value of a bigint type to a string yields a string containing the decimal decimal representation of the integer. 5. Converting a value of a string type to a bigint yields a bigint value containing the integer represented by the string value. A prefix of “0x” or “0X” selects base 16; the “0” prefix selects base 8, and a “0b” or “0B” prefix selects base 2. Otherwise the value is interpreted in base 10. An error occurs if the string value is not in any valid format. 6. Converting a value of a rational type to a string yields a string containing the decimal decimal representation of the rational in the form "a/b" (even if b == 1). 7. Converting a value of a string type to a bigrat yields a bigrat value containing the rational represented by the string value. The string can be given as a fraction "a/b" or as a floating-point number optionally followed by an exponent. An error occurs if the string value is not in any valid format. 8. Converting a value of a duration type to a string returns a string representing the duration in the form "72h3m0.5s". Leading zero units are omitted. As a special case, durations less than one second format using a smaller unit (milli-, micro-, or nanoseconds) to ensure that the leading digit is non-zero. The zero duration formats as 0, with no unit. 9. Converting a string value to a duration yields a duration represented by the string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h". 10. Converting a time value to a string returns the time formatted using the format string When evaluating the operands of an expression or of function calls, operations are evaluated in lexical left-to-right order. For example, in the evaluation of the function calls and evaluation of c happen in the order h(), i(), j(), c. Floating-point operations within a single expression are evaluated according to the associativity of the operators. Explicit parentheses affect the evaluation by overriding the default associativity. In the expression x + (y + z) the addition y + z is performed before adding x. Statements control execution. The empty statement does nothing. Alter table statements modify existing tables. With the ADD clause it adds a new column to the table. The column must not exist. With the DROP clause it removes an existing column from a table. The column must exist and it must be not the only (last) column of the table. IOW, there cannot be a table with no columns. For example When adding a column to a table with existing data, the constraint clause of the ColumnDef cannot be used. Adding a constrained column to an empty table is fine. Begin transactions statements introduce a new transaction level. Every transaction level must be eventually balanced by exactly one of COMMIT or ROLLBACK statements. Note that when a transaction is roll-backed because of a statement failure then no explicit balancing of the respective BEGIN TRANSACTION is statement is required nor permitted. Failure to properly balance any opened transaction level may cause dead locks and/or lose of data updated in the uppermost opened but never properly closed transaction level. For example A database cannot be updated (mutated) outside of a transaction. Statements requiring a transaction A database is effectively read only outside of a transaction. Statements not requiring a transaction The commit statement closes the innermost transaction nesting level. If that's the outermost level then the updates to the DB made by the transaction are atomically made persistent. For example Create index statements create new indices. Index is a named projection of ordered values of a table column to the respective records. As a special case the id() of the record can be indexed. Index name must not be the same as any of the existing tables and it also cannot be the same as of any column name of the table the index is on. For example Now certain SELECT statements may use the indices to speed up joins and/or to speed up record set filtering when the WHERE clause is used; or the indices might be used to improve the performance when the ORDER BY clause is present. The UNIQUE modifier requires the indexed values tuple to be index-wise unique or have all values NULL. The optional IF NOT EXISTS clause makes the statement a no operation if the index already exists. A simple index consists of only one expression which must be either a column name or the built-in id(). A more complex and more general index is one that consists of more than one expression or its single expression does not qualify as a simple index. In this case the type of all expressions in the list must be one of the non blob-like types. Note: Blob-like types are blob, bigint, bigrat, time and duration. Create table statements create new tables. A column definition declares the column name and type. Table names and column names are case sensitive. Neither a table or an index of the same name may exist in the DB. For example The optional IF NOT EXISTS clause makes the statement a no operation if the table already exists. The optional constraint clause has two forms. The first one is found in many SQL dialects. This form prevents the data in column DepartmentName to be NULL. The second form allows an arbitrary boolean expression to be used to validate the column. If the value of the expression is true then the validation succeeded. If the value of the expression is false or NULL then the validation fails. If the value of the expression is not of type bool an error occurs. The optional DEFAULT clause is an expression which, if present, is substituted instead of a NULL value when the colum is assigned a value. Note that the constraint and/or default expressions may refer to other columns by name: When a table row is inserted by the INSERT INTO statement or when a table row is updated by the UPDATE statement, the order of operations is as follows: 1. The new values of the affected columns are set and the values of all the row columns become the named values which can be referred to in default expressions evaluated in step 2. 2. If any row column value is NULL and the DEFAULT clause is present in the column's definition, the default expression is evaluated and its value is set as the respective column value. 3. The values, potentially updated, of row columns become the named values which can be referred to in constraint expressions evaluated during step 4. 4. All row columns which definition has the constraint clause present will have that constraint checked. If any constraint violation is detected, the overall operation fails and no changes to the table are made. Delete from statements remove rows from a table, which must exist. For example If the WHERE clause is not present then all rows are removed and the statement is equivalent to the TRUNCATE TABLE statement. Drop index statements remove indices from the DB. The index must exist. For example The optional IF EXISTS clause makes the statement a no operation if the index does not exist. Drop table statements remove tables from the DB. The table must exist. For example The optional IF EXISTS clause makes the statement a no operation if the table does not exist. Insert into statements insert new rows into tables. New rows come from literal data, if using the VALUES clause, or are a result of select statement. In the later case the select statement is fully evaluated before the insertion of any rows is performed, allowing to insert values calculated from the same table rows are to be inserted into. If the ColumnNameList part is omitted then the number of values inserted in the row must be the same as are columns in the table. If the ColumnNameList part is present then the number of values per row must be same as the same number of column names. All other columns of the record are set to NULL. The type of the value assigned to a column must be the same as is the column's type or the value must be NULL. For example If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. Explain statement produces a recordset consisting of lines of text which describe the execution plan of a statement, if any. For example, the QL tool treats the explain statement specially and outputs the joined lines: The explanation may aid in uderstanding how a statement/query would be executed and if indices are used as expected - or which indices may possibly improve the statement performance. The create index statements above were directly copy/pasted in the terminal from the suggestions provided by the filter recordset pipeline part returned by the explain statement. If the statement has nothing special in its plan, the result is the original statement. To get an explanation of the select statement of the IN predicate, use the EXPLAIN statement with that particular select statement. The rollback statement closes the innermost transaction nesting level discarding any updates to the DB made by it. If that's the outermost level then the effects on the DB are as if the transaction never happened. For example The (temporary) record set from the last statement is returned and can be processed by the client. In this case the rollback is the same as 'DROP TABLE tmp;' but it can be a more complex operation. Select from statements produce recordsets. The optional DISTINCT modifier ensures all rows in the result recordset are unique. Either all of the resulting fields are returned ('*') or only those named in FieldList. RecordSetList is a list of table names or parenthesized select statements, optionally (re)named using the AS clause. The result can be filtered using a WhereClause and orderd by the OrderBy clause. For example If Recordset is a nested, parenthesized SelectStmt then it must be given a name using the AS clause if its field are to be accessible in expressions. A field is an named expression. Identifiers, not used as a type in conversion or a function name in the Call clause, denote names of (other) fields, values of which should be used in the expression. The expression can be named using the AS clause. If the AS clause is not present and the expression consists solely of a field name, then that field name is used as the name of the resulting field. Otherwise the field is unnamed. For example The SELECT statement can optionally enumerate the desired/resulting fields in a list. No two identical field names can appear in the list. When more than one record set is used in the FROM clause record set list, the result record set field names are rewritten to be qualified using the record set names. If a particular record set doesn't have a name, its respective fields became unnamed. The optional JOIN clause, for example is mostly equal to except that the rows from a which, when they appear in the cross join, never made expr to evaluate to true, are combined with a virtual row from b, containing all nulls, and added to the result set. For the RIGHT JOIN variant the discussed rules are used for rows from b not satisfying expr == true and the virtual, all-null row "comes" from a. The FULL JOIN adds the respective rows which would be otherwise provided by the separate executions of the LEFT JOIN and RIGHT JOIN variants. For more thorough OUTER JOIN discussion please see the Wikipedia article at [10]. Resultins rows of a SELECT statement can be optionally ordered by the ORDER BY clause. Collating proceeds by considering the expressions in the expression list left to right until a collating order is determined. Any possibly remaining expressions are not evaluated. All of the expression values must yield an ordered type or NULL. Ordered types are defined in "Comparison operators". Collating of elements having a NULL value is different compared to what the comparison operators yield in expression evaluation (NULL result instead of a boolean value). Below, T denotes a non NULL value of any QL type. NULL collates before any non NULL value (is considered smaller than T). Two NULLs have no collating order (are considered equal). The WHERE clause restricts records considered by some statements, like SELECT FROM, DELETE FROM, or UPDATE. It is an error if the expression evaluates to a non null value of non bool type. Another form of the WHERE clause is an existence predicate of a parenthesized select statement. The EXISTS form evaluates to true if the parenthesized SELECT statement produces a non empty record set. The NOT EXISTS form evaluates to true if the parenthesized SELECT statement produces an empty record set. The parenthesized SELECT statement is evaluated only once (TODO issue #159). The GROUP BY clause is used to project rows having common values into a smaller set of rows. For example Using the GROUP BY without any aggregate functions in the selected fields is in certain cases equal to using the DISTINCT modifier. The last two examples above produce the same resultsets. The optional OFFSET clause allows to ignore first N records. For example The above will produce only rows 11, 12, ... of the record set, if they exist. The value of the expression must a non negative integer, but not bigint or duration. The optional LIMIT clause allows to ignore all but first N records. For example The above will return at most the first 10 records of the record set. The value of the expression must a non negative integer, but not bigint or duration. The LIMIT and OFFSET clauses can be combined. For example Considering table t has, say 10 records, the above will produce only records 4 - 8. After returning record #8, no more result rows/records are computed. 1. The FROM clause is evaluated, producing a Cartesian product of its source record sets (tables or nested SELECT statements). 2. If present, the JOIN cluase is evaluated on the result set of the previous evaluation and the recordset specified by the JOIN clause. (... JOIN Recordset ON ...) 3. If present, the WHERE clause is evaluated on the result set of the previous evaluation. 4. If present, the GROUP BY clause is evaluated on the result set of the previous evaluation(s). 5. The SELECT field expressions are evaluated on the result set of the previous evaluation(s). 6. If present, the DISTINCT modifier is evaluated on the result set of the previous evaluation(s). 7. If present, the ORDER BY clause is evaluated on the result set of the previous evaluation(s). 8. If present, the OFFSET clause is evaluated on the result set of the previous evaluation(s). The offset expression is evaluated once for the first record produced by the previous evaluations. 9. If present, the LIMIT clause is evaluated on the result set of the previous evaluation(s). The limit expression is evaluated once for the first record produced by the previous evaluations. Truncate table statements remove all records from a table. The table must exist. For example Update statements change values of fields in rows of a table. For example Note: The SET clause is optional. If any of the columns of the table were defined using the optional constraints clause or the optional defaults clause then those are processed on a per row basis. The details are discussed in the "Constraints and defaults" chapter below the CREATE TABLE statement documentation. To allow to query for DB meta data, there exist specially named tables, some of them being virtual. Note: Virtual system tables may have fake table-wise unique but meaningless and unstable record IDs. Do not apply the built-in id() to any system table. The table __Table lists all tables in the DB. The schema is The Schema column returns the statement to (re)create table Name. This table is virtual. The table __Colum lists all columns of all tables in the DB. The schema is The Ordinal column defines the 1-based index of the column in the record. This table is virtual. The table __Colum2 lists all columns of all tables in the DB which have the constraint NOT NULL or which have a constraint expression defined or which have a default expression defined. The schema is It's possible to obtain a consolidated recordset for all properties of all DB columns using The Name column is the column name in TableName. The table __Index lists all indices in the DB. The schema is The IsUnique columns reflects if the index was created using the optional UNIQUE clause. This table is virtual. Built-in functions are predeclared. The built-in aggregate function avg returns the average of values of an expression. Avg ignores NULL values, but returns NULL if all values of a column are NULL or if avg is applied to an empty record set. The column values must be of a numeric type. The built-in function contains returns true if substr is within s. If any argument to contains is NULL the result is NULL. The built-in aggregate function count returns how many times an expression has a non NULL values or the number of rows in a record set. Note: count() returns 0 for an empty record set. For example Date returns the time corresponding to in the appropriate zone for that time in the given location. The month, day, hour, min, sec, and nsec values may be outside their usual ranges and will be normalized during the conversion. For example, October 32 converts to November 1. A daylight savings time transition skips or repeats times. For example, in the United States, March 13, 2011 2:15am never occurred, while November 6, 2011 1:15am occurred twice. In such cases, the choice of time zone, and therefore the time, is not well-defined. Date returns a time that is correct in one of the two zones involved in the transition, but it does not guarantee which. A location maps time instants to the zone in use at that time. Typically, the location represents the collection of time offsets in use in a geographical area, such as "CEST" and "CET" for central Europe. "local" represents the system's local time zone. "UTC" represents Universal Coordinated Time (UTC). The month specifies a month of the year (January = 1, ...). If any argument to date is NULL the result is NULL. The built-in function day returns the day of the month specified by t. If the argument to day is NULL the result is NULL. The built-in function formatTime returns a textual representation of the time value formatted according to layout, which defines the format by showing how the reference time, would be displayed if it were the value; it serves as an example of the desired output. The same display rules will then be applied to the time value. If any argument to formatTime is NULL the result is NULL. NOTE: The string value of the time zone, like "CET" or "ACDT", is dependent on the time zone of the machine the function is run on. For example, if the t value is in "CET", but the machine is in "ACDT", instead of "CET" the result is "+0100". This is the same what Go (time.Time).String() returns and in fact formatTime directly calls t.String(). returns on a machine in the CET time zone, but may return on a machine in the ACDT zone. The time value is in both cases the same so its ordering and comparing is correct. Only the display value can differ. The built-in functions formatFloat and formatInt format numbers to strings using go's number format functions in the `strconv` package. For all three functions, only the first argument is mandatory. The default values of the rest are shown in the examples. If the first argument is NULL, the result is NULL. returns returns returns Unlike the `strconv` equivalent, the formatInt function handles all integer types, both signed and unsigned. The built-in function hasPrefix tests whether the string s begins with prefix. If any argument to hasPrefix is NULL the result is NULL. The built-in function hasSuffix tests whether the string s ends with suffix. If any argument to hasSuffix is NULL the result is NULL. The built-in function hour returns the hour within the day specified by t, in the range [0, 23]. If the argument to hour is NULL the result is NULL. The built-in function hours returns the duration as a floating point number of hours. If the argument to hours is NULL the result is NULL. The built-in function id takes zero or one arguments. If no argument is provided, id() returns a table-unique automatically assigned numeric identifier of type int. Ids of deleted records are not reused unless the DB becomes completely empty (has no tables). For example If id() without arguments is called for a row which is not a table record then the result value is NULL. For example If id() has one argument it must be a table name of a table in a cross join. For example The built-in function len takes a string argument and returns the lentgh of the string in bytes. The expression len(s) is constant if s is a string constant. If the argument to len is NULL the result is NULL. The built-in aggregate function max returns the largest value of an expression in a record set. Max ignores NULL values, but returns NULL if all values of a column are NULL or if max is applied to an empty record set. The expression values must be of an ordered type. For example The built-in aggregate function min returns the smallest value of an expression in a record set. Min ignores NULL values, but returns NULL if all values of a column are NULL or if min is applied to an empty record set. For example The column values must be of an ordered type. The built-in function minute returns the minute offset within the hour specified by t, in the range [0, 59]. If the argument to minute is NULL the result is NULL. The built-in function minutes returns the duration as a floating point number of minutes. If the argument to minutes is NULL the result is NULL. The built-in function month returns the month of the year specified by t (January = 1, ...). If the argument to month is NULL the result is NULL. The built-in function nanosecond returns the nanosecond offset within the second specified by t, in the range [0, 999999999]. If the argument to nanosecond is NULL the result is NULL. The built-in function nanoseconds returns the duration as an integer nanosecond count. If the argument to nanoseconds is NULL the result is NULL. The built-in function now returns the current local time. The built-in function parseTime parses a formatted string and returns the time value it represents. The layout defines the format by showing how the reference time, would be interpreted if it were the value; it serves as an example of the input format. The same interpretation will then be made to the input string. Elements omitted from the value are assumed to be zero or, when zero is impossible, one, so parsing "3:04pm" returns the time corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is 0, this time is before the zero Time). Years must be in the range 0000..9999. The day of the week is checked for syntax but it is otherwise ignored. In the absence of a time zone indicator, parseTime returns a time in UTC. When parsing a time with a zone offset like -0700, if the offset corresponds to a time zone used by the current location, then parseTime uses that location and zone in the returned time. Otherwise it records the time as being in a fabricated location with time fixed at the given zone offset. When parsing a time with a zone abbreviation like MST, if the zone abbreviation has a defined offset in the current location, then that offset is used. The zone abbreviation "UTC" is recognized as UTC regardless of location. If the zone abbreviation is unknown, Parse records the time as being in a fabricated location with the given zone abbreviation and a zero offset. This choice means that such a time can be parses and reformatted with the same layout losslessly, but the exact instant used in the representation will differ by the actual zone offset. To avoid such problems, prefer time layouts that use a numeric zone offset. If any argument to parseTime is NULL the result is NULL. The built-in function second returns the second offset within the minute specified by t, in the range [0, 59]. If the argument to second is NULL the result is NULL. The built-in function seconds returns the duration as a floating point number of seconds. If the argument to seconds is NULL the result is NULL. The built-in function since returns the time elapsed since t. It is shorthand for now()-t. If the argument to since is NULL the result is NULL. The built-in aggregate function sum returns the sum of values of an expression for all rows of a record set. Sum ignores NULL values, but returns NULL if all values of a column are NULL or if sum is applied to an empty record set. The column values must be of a numeric type. The built-in function timeIn returns t with the location information set to loc. For discussion of the loc argument please see date(). If any argument to timeIn is NULL the result is NULL. The built-in function weekday returns the day of the week specified by t. Sunday == 0, Monday == 1, ... If the argument to weekday is NULL the result is NULL. The built-in function year returns the year in which t occurs. If the argument to year is NULL the result is NULL. The built-in function yearDay returns the day of the year specified by t, in the range [1,365] for non-leap years, and [1,366] in leap years. If the argument to yearDay is NULL the result is NULL. Three functions assemble and disassemble complex numbers. The built-in function complex constructs a complex value from a floating-point real and imaginary part, while real and imag extract the real and imaginary parts of a complex value. The type of the arguments and return value correspond. For complex, the two arguments must be of the same floating-point type and the return type is the complex type with the corresponding floating-point constituents: complex64 for float32, complex128 for float64. The real and imag functions together form the inverse, so for a complex value z, z == complex(real(z), imag(z)). If the operands of these functions are all constants, the return value is a constant. If any argument to any of complex, real, imag functions is NULL the result is NULL. For the numeric types, the following sizes are guaranteed Portions of this specification page are modifications based on work[2] created and shared by Google[3] and used according to terms described in the Creative Commons 3.0 Attribution License[4]. This specification is licensed under the Creative Commons Attribution 3.0 License, and code is licensed under a BSD license[5]. Links from the above documentation This section is not part of the specification. WARNING: The implementation of indices is new and it surely needs more time to become mature. Indices are used currently used only by the WHERE clause. The following expression patterns of 'WHERE expression' are recognized and trigger index use. The relOp is one of the relation operators <, <=, ==, >=, >. For the equality operator both operands must be of comparable types. For all other operators both operands must be of ordered types. The constant expression is a compile time constant expression. Some constant folding is still a TODO. Parameter is a QL parameter ($1 etc.). Consider tables t and u, both with an indexed field f. The WHERE expression doesn't comply with the above simple detected cases. However, such query is now automatically rewritten to which will use both of the indices. The impact of using the indices can be substantial (cf. BenchmarkCrossJoin*) if the resulting rows have low "selectivity", ie. only few rows from both tables are selected by the respective WHERE filtering. Note: Existing QL DBs can be used and indices can be added to them. However, once any indices are present in the DB, the old QL versions cannot work with such DB anymore. Running a benchmark with -v (-test.v) outputs information about the scale used to report records/s and a brief description of the benchmark. For example Running the full suite of benchmarks takes a lot of time. Use the -timeout flag to avoid them being killed after the default time limit (10 minutes).

v1.4.16 • 2 months ago

github.com/yycptt/cadence-client

Package cadence and its subdirectories contain the Cadence client side framework. The Cadence service is a task orchestrator for your application’s tasks. Applications using Cadence can execute a logical flow of tasks, especially long-running business logic, asynchronously or synchronously. They can also scale at runtime on distributed systems. A quick example illustrates its use case. Consider Uber Eats where Cadence manages the entire business flow from placing an order, accepting it, handling shopping cart processes (adding, updating, and calculating cart items), entering the order in a pipeline (for preparing food and coordinating delivery), to scheduling delivery as well as handling payments. Cadence consists of a programming framework (or client library) and a managed service (or backend). The framework enables developers to author and coordinate tasks in Go code. The root cadence package contains common data structures. The subpackages are: The Cadence hosted service brokers and persists events generated during workflow execution. Worker nodes owned and operated by customers execute the coordination and task logic. To facilitate the implementation of worker nodes Cadence provides a client-side library for the Go language. In Cadence, you can code the logical flow of events separately as a workflow and code business logic as activities. The workflow identifies the activities and sequences them, while an activity executes the logic. Dynamic workflow execution graphs - Determine the workflow execution graphs at runtime based on the data you are processing. Cadence does not pre-compute the execution graphs at compile time or at workflow start time. Therefore, you have the ability to write workflows that can dynamically adjust to the amount of data they are processing. If you need to trigger 10 instances of an activity to efficiently process all the data in one run, but only 3 for a subsequent run, you can do that. Child Workflows - Orchestrate the execution of a workflow from within another workflow. Cadence will return the results of the child workflow execution to the parent workflow upon completion of the child workflow. No polling is required in the parent workflow to monitor status of the child workflow, making the process efficient and fault tolerant. Durable Timers - Implement delayed execution of tasks in your workflows that are robust to worker failures. Cadence provides two easy to use APIs, **workflow.Sleep** and **workflow.Timer**, for implementing time based events in your workflows. Cadence ensures that the timer settings are persisted and the events are generated even if workers executing the workflow crash. Signals - Modify/influence the execution path of a running workflow by pushing additional data directly to the workflow using a signal. Via the Signal facility, Cadence provides a mechanism to consume external events directly in workflow code. Task routing - Efficiently process large amounts of data using a Cadence workflow, by caching the data locally on a worker and executing all activities meant to process that data on that same worker. Cadence enables you to choose the worker you want to execute a certain activity by scheduling that activity execution in the worker's specific task-list. Unique workflow ID enforcement - Use business entity IDs for your workflows and let Cadence ensure that only one workflow is running for a particular entity at a time. Cadence implements an atomic "uniqueness check" and ensures that no race conditions are possible that would result in multiple workflow executions for the same workflow ID. Therefore, you can implement your code to attempt to start a workflow without checking if the ID is already in use, even in the cases where only one active execution per workflow ID is desired. Perpetual/ContinueAsNew workflows - Run periodic tasks as a single perpetually running workflow. With the "ContinueAsNew" facility, Cadence allows you to leverage the "unique workflow ID enforcement" feature for periodic workflows. Cadence will complete the current execution and start the new execution atomically, ensuring you get to keep your workflow ID. By starting a new execution Cadence also ensures that workflow execution history does not grow indefinitely for perpetual workflows. At-most once activity execution - Execute non-idempotent activities as part of your workflows. Cadence will not automatically retry activities on failure. For every activity execution Cadence will return a success result, a failure result, or a timeout to the workflow code and let the workflow code determine how each one of those result types should be handled. Asynch Activity Completion - Incorporate human input or thrid-party service asynchronous callbacks into your workflows. Cadence allows a workflow to pause execution on an activity and wait for an external actor to resume it with a callback. During this pause the activity does not have any actively executing code, such as a polling loop, and is merely an entry in the Cadence datastore. Therefore, the workflow is unaffected by any worker failures happening over the duration of the pause. Activity Heartbeating - Detect unexpected failures/crashes and track progress in long running activities early. By configuring your activity to report progress periodically to the Cadence server, you can detect a crash that occurs 10 minutes into an hour-long activity execution much sooner, instead of waiting for the 60-minute execution timeout. The recorded progress before the crash gives you sufficient information to determine whether to restart the activity from the beginning or resume it from the point of failure. Timeouts for activities and workflow executions - Protect against stuck and unresponsive activities and workflows with appropriate timeout values. Cadence requires that timeout values are provided for every activity or workflow invocation. There is no upper bound on the timeout values, so you can set timeouts that span days, weeks, or even months. Visibility - Get a list of all your active and/or completed workflow. Explore the execution history of a particular workflow execution. Cadence provides a set of visibility APIs that allow you, the workflow owner, to monitor past and current workflow executions. Debuggability - Replay any workflow execution history locally under a debugger. The Cadence client library provides an API to allow you to capture a stack trace from any failed workflow execution history.

v0.8.0 • 7 years ago

github.com/sevein/cadence-client

Package cadence and its subdirectories contain the Cadence client side framework. The Cadence service is a task orchestrator for your application’s tasks. Applications using Cadence can execute a logical flow of tasks, especially long-running business logic, asynchronously or synchronously. They can also scale at runtime on distributed systems. A quick example illustrates its use case. Consider Uber Eats where Cadence manages the entire business flow from placing an order, accepting it, handling shopping cart processes (adding, updating, and calculating cart items), entering the order in a pipeline (for preparing food and coordinating delivery), to scheduling delivery as well as handling payments. Cadence consists of a programming framework (or client library) and a managed service (or backend). The framework enables developers to author and coordinate tasks in Go code. The root cadence package contains common data structures. The subpackages are: The Cadence hosted service brokers and persists events generated during workflow execution. Worker nodes owned and operated by customers execute the coordination and task logic. To facilitate the implementation of worker nodes Cadence provides a client-side library for the Go language. In Cadence, you can code the logical flow of events separately as a workflow and code business logic as activities. The workflow identifies the activities and sequences them, while an activity executes the logic. Dynamic workflow execution graphs - Determine the workflow execution graphs at runtime based on the data you are processing. Cadence does not pre-compute the execution graphs at compile time or at workflow start time. Therefore, you have the ability to write workflows that can dynamically adjust to the amount of data they are processing. If you need to trigger 10 instances of an activity to efficiently process all the data in one run, but only 3 for a subsequent run, you can do that. Child Workflows - Orchestrate the execution of a workflow from within another workflow. Cadence will return the results of the child workflow execution to the parent workflow upon completion of the child workflow. No polling is required in the parent workflow to monitor status of the child workflow, making the process efficient and fault tolerant. Durable Timers - Implement delayed execution of tasks in your workflows that are robust to worker failures. Cadence provides two easy to use APIs, **workflow.Sleep** and **workflow.Timer**, for implementing time based events in your workflows. Cadence ensures that the timer settings are persisted and the events are generated even if workers executing the workflow crash. Signals - Modify/influence the execution path of a running workflow by pushing additional data directly to the workflow using a signal. Via the Signal facility, Cadence provides a mechanism to consume external events directly in workflow code. Task routing - Efficiently process large amounts of data using a Cadence workflow, by caching the data locally on a worker and executing all activities meant to process that data on that same worker. Cadence enables you to choose the worker you want to execute a certain activity by scheduling that activity execution in the worker's specific task-list. Unique workflow ID enforcement - Use business entity IDs for your workflows and let Cadence ensure that only one workflow is running for a particular entity at a time. Cadence implements an atomic "uniqueness check" and ensures that no race conditions are possible that would result in multiple workflow executions for the same workflow ID. Therefore, you can implement your code to attempt to start a workflow without checking if the ID is already in use, even in the cases where only one active execution per workflow ID is desired. Perpetual/ContinueAsNew workflows - Run periodic tasks as a single perpetually running workflow. With the "ContinueAsNew" facility, Cadence allows you to leverage the "unique workflow ID enforcement" feature for periodic workflows. Cadence will complete the current execution and start the new execution atomically, ensuring you get to keep your workflow ID. By starting a new execution Cadence also ensures that workflow execution history does not grow indefinitely for perpetual workflows. At-most once activity execution - Execute non-idempotent activities as part of your workflows. Cadence will not automatically retry activities on failure. For every activity execution Cadence will return a success result, a failure result, or a timeout to the workflow code and let the workflow code determine how each one of those result types should be handled. Asynch Activity Completion - Incorporate human input or thrid-party service asynchronous callbacks into your workflows. Cadence allows a workflow to pause execution on an activity and wait for an external actor to resume it with a callback. During this pause the activity does not have any actively executing code, such as a polling loop, and is merely an entry in the Cadence datastore. Therefore, the workflow is unaffected by any worker failures happening over the duration of the pause. Activity Heartbeating - Detect unexpected failures/crashes and track progress in long running activities early. By configuring your activity to report progress periodically to the Cadence server, you can detect a crash that occurs 10 minutes into an hour-long activity execution much sooner, instead of waiting for the 60-minute execution timeout. The recorded progress before the crash gives you sufficient information to determine whether to restart the activity from the beginning or resume it from the point of failure. Timeouts for activities and workflow executions - Protect against stuck and unresponsive activities and workflows with appropriate timeout values. Cadence requires that timeout values are provided for every activity or workflow invocation. There is no upper bound on the timeout values, so you can set timeouts that span days, weeks, or even months. Visibility - Get a list of all your active and/or completed workflow. Explore the execution history of a particular workflow execution. Cadence provides a set of visibility APIs that allow you, the workflow owner, to monitor past and current workflow executions. Debuggability - Replay any workflow execution history locally under a debugger. The Cadence client library provides an API to allow you to capture a stack trace from any failed workflow execution history.

v0.10.0 • 6 years ago

github.com/flungloaf/cld2

github.com/tpbadger/stl

github.com/SamWhited/toml

github.com/KazkiMatz/cld2

github.com/sophatvathana/cld2

github.com/boyvanduuren/ql

github.com/amoskong/gocql

github.com/bobg/cadence-client

github.com/EmiPhil/gocql

github.com/dbrower/ql

github.com/eliasnaur/ql

github.com/elliott5/sentences

github.com/fanyang01/ql

github.com/jpvetterli/args

github.com/kiteco/pigeon

github.com/mikolajb/cadence-client

github.com/motomux/toml-1

github.com/rylans/whatlanggo

github.com/learning/backend

github.com/martingallagher/cld2

github.com/sudarshan-reddy/backend

github.com/abligh/cdl

github.com/clarenceb/backend

github.com/dugei/toml

github.com/ashetkar/gocql

github.com/mmcshane/gocql

github.com/bacongobbler/linguist

github.com/nixomose/toml

github.com/ic-projects/pigeon

github.com/lucasbutn/toml

github.com/vdmytriv/gocld2

github.com/gandhikrishna/cadence-client

github.com/hsnks100/cld2

github.com/ianlewis/linguist

gitlab.com/cznic/ql

github.com/jmhodges/gocld2

github.com/tsingson/toml

github.com/dayvonjersen/linguist

github.com/ssemilla/bs-toml

github.com/triamazikamno/whatlanggo

github.com/shriramsharma/toml

github.com/gqgs/whatlanggo

github.com/gmatty/toml

github.com/yokesnowwolf/toml

github.com/t3chguy/cld2

github.com/yycptt/cadence-client

github.com/sevein/cadence-client

github.com/jayce/toml

github.com/jonsen/toml

github.com/tmsong/toml