sqlite-parser
Changelog
[v1.0.1] - 2017-06-15
AND / OR operator composition order now matches the official SQLite 3 implementation, so that AND has a higher precedence than OR.
[v1.0.0] - 2017-01-21
The root node of the AST now has type and variant properties:
{
  "type": "statement",
  "variant": "list",
  "statement": [{
    "type": "statement",
    "variant": "select",
    "statement": {}
  }]
}
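For code consuming the new root node, the type and variant properties can be read directly. A minimal sketch in plain JavaScript, using an ast literal hand-copied from the example above rather than actual parser output:

```javascript
// Root node shape introduced in v1.0.0 (copied from the example above,
// not produced by running the parser here).
var ast = {
  type: 'statement',
  variant: 'list',
  statement: [{ type: 'statement', variant: 'select', statement: {} }]
};

// The root is always a statement list; the individual parsed
// statements live on the `statement` array.
var variants = ast.statement.map(function (node) {
  return node.type + ':' + node.variant;
});
```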
There is now a command line version of the parser when it is installed as a global module (e.g., npm i -g sqlite-parser). The sqlite-parser command is then available to parse input SQL files and write the results to stdout or a JSON file. Additional usage instructions and options are available through sqlite-parser --help.
sqlite-parser input.sql --output foo.json
To allow users to parse arbitrarily long SQL files or other readable stream sources, there is now a stream transform that can accept a readable stream and then push (write) out JSON strings of the ASTs for individual statements.
The AST for each statement is pushed down the stream as soon as it is read and parsed instead of reading the entire file into memory before parsing begins.
var parserTransform = require('sqlite-parser').createParser();
var readStream = require('fs').createReadStream('./large-input-file.sql');

readStream.pipe(parserTransform);
parserTransform.pipe(process.stdout);

parserTransform.on('error', function (err) {
  console.error(err);
  process.exit(1);
});
parserTransform.on('finish', function () {
  process.exit(0);
});
To pipe the output into a file that contains a single valid JSON structure, the output of the parser stream transform needs to be wrapped in a statement list node where every statement is separated by a comma.
var fs = require('fs');
var sqliteParser = require('sqlite-parser');

var parserTransform = sqliteParser.createParser();
var singleNodeTransform = sqliteParser.createStitcher();
var readStream = fs.createReadStream('./large-input-file.sql');
var writeStream = fs.createWriteStream('./large-output-file.json');

readStream.pipe(parserTransform);
parserTransform.pipe(singleNodeTransform);
singleNodeTransform.pipe(writeStream);

parserTransform.on('error', function (err) {
  console.error(err);
  process.exit(1);
});
writeStream.on('finish', function () {
  process.exit(0);
});
Added the missing ATTACH DATABASE statement. It will pair nicely with the existing DETACH DATABASE statement.
ATTACH DATABASE 'bees2.db' AS more_bees
SQLite allows you to enter basically anything you want for a datatype, such as the datatype for a column in a CREATE TABLE statement, because it does not enforce the types you provide. So, the parser will accept almost any unquoted string in place of a datatype.
CREATE TABLE t1(x DINOSAUR, y BIRD_PERSON);
Run the parser against the entire SQLite test corpus using the grunt testall command.
Allow multi-byte UTF-8 characters (e.g., \u1234) in identifier names.
Add support for table functions in the FROM clause of a SELECT statement.
SELECT
j2.rowid, jx.rowid
FROM
j2, json_tree(j2.json) AS jx
BREAKING CHANGE Instead of publishing this module on npm as a browserified and minified bundle, the transpiled ES2015 code in lib/ is now published, and I have left it up to the end user to decide whether to browserify or minify the library. The combined unminified file size for the published version of the parser is now ~127kB.
There is a dist/ folder containing the minified browserified bundle, which comes in at ~81kB (a 7% reduction from v0.14.5). This is defined in the package.json as the browser version of the module, which is recognized by tools such as jspm and browserify.
BREAKING CHANGE The on property of a CREATE INDEX statement is now treated as a table expression identifier, and has the corresponding type and variant:
{
  "type": "statement",
  "variant": "create",
  "format": "index",
  "target": {
    "type": "identifier",
    "variant": "index",
    "name": "bees.hive_state"
  },
  "on": {
    "type": "identifier",
    "variant": "expression",
    "format": "table",
    "name": {
      "type": "identifier",
      "variant": "table",
      "name": "hive"
    },
    "columns": []
  }
}
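Consumers reading the table name out of a CREATE INDEX node now need to go through the nested identifier; a sketch using the node shape above (the node literal is hand-copied from the example, not parser output):

```javascript
// CREATE INDEX AST fragment (copied from the example above).
var node = {
  type: 'statement',
  variant: 'create',
  format: 'index',
  target: { type: 'identifier', variant: 'index', name: 'bees.hive_state' },
  on: {
    type: 'identifier',
    variant: 'expression',
    format: 'table',
    name: { type: 'identifier', variant: 'table', name: 'hive' },
    columns: []
  }
};

// After this change, `on` is a table expression identifier whose
// `name` is itself an identifier node, so the table name sits one
// level deeper than before.
var tableName = node.on.name.name;
```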
BREAKING CHANGE Indexed columns (e.g., the column list in the ON part of a CREATE INDEX statement) and ordering expressions (e.g., the ORDER BY part of a SELECT statement) now have the following format:
When the expression includes an ordering direction (ASC, DESC) and/or a COLLATE, such as ORDER BY nick ASC:
{
  "order": [{
    "type": "expression",
    "variant": "order",
    "expression": {
      "type": "identifier",
      "variant": "column",
      "name": "nick"
    },
    "direction": "asc"
  }]
}
When there is no direction or COLLATE, it will only be the expression itself, and the implicit "direction": "asc" will not be added to the AST, such as ORDER BY nick:
{
  "order": [{
    "type": "identifier",
    "variant": "column",
    "name": "nick"
  }]
}
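Code consuming the order array therefore has to handle both shapes: a wrapping variant order node when a direction or COLLATE is present, or the bare expression otherwise. A sketch of a normalizer, assuming only these two shapes occur:

```javascript
// Return { expression, direction } for either shape of an ORDER BY entry.
function normalizeOrder(entry) {
  if (entry.type === 'expression' && entry.variant === 'order') {
    // Wrapped form, e.g. ORDER BY nick ASC
    return { expression: entry.expression, direction: entry.direction || 'asc' };
  }
  // Bare form, e.g. ORDER BY nick (implicitly ascending)
  return { expression: entry, direction: 'asc' };
}

// Both shapes copied from the examples above.
var wrapped = {
  type: 'expression', variant: 'order',
  expression: { type: 'identifier', variant: 'column', name: 'nick' },
  direction: 'asc'
};
var bare = { type: 'identifier', variant: 'column', name: 'nick' };
```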
BREAKING CHANGE Because of changes to how binary expressions are parsed, the order in which expressions are composed may be different than in the previous release. For example, ASTs may change for queries that contain multiple binary expressions:
SELECT *
FROM hats
WHERE x != 2 OR x == 3 AND y > 5
BREAKING CHANGE Expressions such as x NOT NULL were previously treated as unary expressions but are now considered binary expressions.
{
  "type": "expression",
  "format": "binary",
  "variant": "operation",
  "operation": "not",
  "left": {
    "type": "identifier",
    "variant": "column",
    "name": "x"
  },
  "right": {
    "type": "literal",
    "variant": "null",
    "value": "null"
  }
}
BREAKING CHANGE Now, instead of transaction statements being parsed as a single statement of type transaction to be considered valid, each statement that makes up the transaction (e.g., BEGIN, END) is considered its own distinct statement that can exist independently of the others. Because a single transaction can be spread across multiple input strings given to the parser, it is no longer treated as a single, large, transaction statement.
BEGIN;
DROP TABLE t1;
END;
{
  "type": "statement",
  "variant": "list",
  "statement": [
    {
      "type": "statement",
      "variant": "transaction",
      "action": "begin"
    },
    {
      "type": "statement",
      "target": {
        "type": "identifier",
        "variant": "table",
        "name": "t1"
      },
      "variant": "drop",
      "format": "table",
      "condition": []
    },
    {
      "type": "statement",
      "variant": "transaction",
      "action": "commit"
    }
  ]
}
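Consumers that still need transaction grouping must now scan the statement list for the boundary statements themselves. A sketch, assuming only variant transaction nodes mark the boundaries:

```javascript
// Group a flat statement list into transactions by scanning for
// `variant: "transaction"` boundary statements.
function groupTransactions(statements) {
  var groups = [];
  var current = null;
  statements.forEach(function (node) {
    if (node.variant === 'transaction' && node.action === 'begin') {
      current = [];
    } else if (node.variant === 'transaction' && current) {
      // END / COMMIT / ROLLBACK closes the open transaction.
      groups.push(current);
      current = null;
    } else if (current) {
      current.push(node);
    } else {
      // A statement outside any transaction forms its own group.
      groups.push([node]);
    }
  });
  return groups;
}

// Statement list copied from the example above.
var statements = [
  { type: 'statement', variant: 'transaction', action: 'begin' },
  { type: 'statement', variant: 'drop', format: 'table' },
  { type: 'statement', variant: 'transaction', action: 'commit' }
];
```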
BREAKING CHANGE COLLATE can now appear multiple times in a row wherever it was previously allowed to appear, and as a result, the collate property of the AST will contain an array.
SELECT 'cats'
ORDER BY 1 COLLATE nocase COLLATE nocase
BREAKING CHANGE CONSTRAINT names can now appear multiple times before or after a column or table constraint in a CREATE TABLE statement. Having a CONSTRAINT name after the constraint is an undocumented SQLite feature. While it will not give an error, any constraint name appearing after the constraint is ignored.
CREATE TABLE t2c(
  -- Two leading and two trailing CONSTRAINT clauses
  -- Name used: x_two
  x INTEGER CONSTRAINT x_one CONSTRAINT x_two
    CHECK( typeof( coalesce(x,0) ) == 'integer' )
    CONSTRAINT x_two CONSTRAINT x_three,
  y INTEGER,
  z INTEGER,
  -- Two trailing CONSTRAINT clauses
  -- Name used: (none)
  UNIQUE(x, y, z) CONSTRAINT u_one CONSTRAINT u_two
)
BREAKING CHANGE JOIN clauses and table lists can now occur in the same FROM clause of a single SELECT statement. Tables separated by a comma will be included in the JOIN map as a cross join.
SELECT *
FROM aa LEFT JOIN bb, cc
WHERE cc.c = aa.a;
BREAKING CHANGE A comma-separated list of table or subquery names in the FROM clause of a SELECT statement is now treated as a join map of cross joins. Also, each pair of comma-separated tables or subqueries can include a join constraint expression (e.g., ON t.col1 = b.col2).
SELECT t1.rowid, t2.rowid
FROM t1, t2 ON t1.a = t2.b;
{
  "type": "statement",
  "variant": "list",
  "statement": [
    {
      "type": "statement",
      "variant": "select",
      "result": [
        {
          "type": "identifier",
          "variant": "column",
          "name": "t1.rowid"
        },
        {
          "type": "identifier",
          "variant": "column",
          "name": "t2.rowid"
        }
      ],
      "from": {
        "type": "map",
        "variant": "join",
        "source": {
          "type": "identifier",
          "variant": "table",
          "name": "t1"
        },
        "map": [
          {
            "type": "join",
            "variant": "cross join",
            "source": {
              "type": "identifier",
              "variant": "table",
              "name": "t2"
            },
            "constraint": {
              "type": "constraint",
              "variant": "join",
              "format": "on",
              "on": {
                "type": "expression",
                "format": "binary",
                "variant": "operation",
                "operation": "=",
                "left": {
                  "type": "identifier",
                  "variant": "column",
                  "name": "t1.a"
                },
                "right": {
                  "type": "identifier",
                  "variant": "column",
                  "name": "t2.b"
                }
              }
            }
          }
        ]
      }
    }
  ]
}
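Walking the new join map to collect the tables referenced in a FROM clause might look like the following sketch, which assumes every join source is a plain table identifier (subqueries would need an extra case):

```javascript
// Collect table names from a `from` node of type "map", variant "join".
function tablesInFrom(from) {
  var names = [from.source.name];
  (from.map || []).forEach(function (join) {
    names.push(join.source.name);
  });
  return names;
}

// Trimmed-down `from` node based on the example above.
var from = {
  type: 'map', variant: 'join',
  source: { type: 'identifier', variant: 'table', name: 't1' },
  map: [{
    type: 'join', variant: 'cross join',
    source: { type: 'identifier', variant: 'table', name: 't2' },
    constraint: null
  }]
};
```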
BREAKING CHANGE Instead of an array, the args property of an AST node will now contain an expression list node with the arguments on its expression property.
{
  "type": "expression",
  "variant": "list",
  "expression": []
}
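Code that previously iterated node.args as a plain array now needs to read the nested expression array. A sketch of a compatibility accessor; the fnCall literal here is a hypothetical function-call node for illustration, not verbatim parser output:

```javascript
// Works with both the old (plain array) and new (expression list
// node) shapes of the `args` property.
function getArgs(node) {
  if (Array.isArray(node.args)) {
    return node.args;          // pre-1.0 shape: plain array
  }
  return node.args.expression; // v1.0.0 shape: expression list node
}

// Hypothetical function-call node carrying the new args shape.
var fnCall = {
  args: { type: 'expression', variant: 'list', expression: [] }
};
```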
BREAKING CHANGE All named values for properties such as variant, format, and type should always be lowercase, even when uppercase in the input SQL (e.g., variant is now natural join instead of NATURAL JOIN in the AST).
BREAKING CHANGE New format for CASE expression AST nodes:
- a variant when node has a condition and a consequent
- a variant else node has just a consequent
- the expression is now variant case instead of binary
- instead of joining the expression between CASE and WHEN (e.g., CASE foo WHEN ...) and calling that the expression, it is now added as the discriminant
select
  case acc
    when a = 0 then a1
    when a = 1 then b1
    else c1
  end
{
  "type": "expression",
  "variant": "case",
  "expression": [
    {
      "type": "condition",
      "variant": "when",
      "condition": {
        "type": "expression",
        "format": "binary",
        "variant": "operation",
        "operation": "=",
        "left": {
          "type": "identifier",
          "variant": "column",
          "name": "a"
        },
        "right": {
          "type": "literal",
          "variant": "decimal",
          "value": "0"
        }
      },
      "consequent": {
        "type": "identifier",
        "variant": "column",
        "name": "a1"
      }
    },
    {
      "type": "condition",
      "variant": "when",
      "condition": {
        "type": "expression",
        "format": "binary",
        "variant": "operation",
        "operation": "=",
        "left": {
          "type": "identifier",
          "variant": "column",
          "name": "a"
        },
        "right": {
          "type": "literal",
          "variant": "decimal",
          "value": "1"
        }
      },
      "consequent": {
        "type": "identifier",
        "variant": "column",
        "name": "b1"
      }
    },
    {
      "type": "condition",
      "variant": "else",
      "consequent": {
        "type": "identifier",
        "variant": "column",
        "name": "c1"
      }
    }
  ],
  "discriminant": {
    "type": "identifier",
    "variant": "column",
    "name": "acc"
  }
}
BREAKING CHANGE New format for EXISTS expression nodes. An expression node is used in condition for the IF NOT EXISTS, NOT EXISTS, and EXISTS modifiers instead of a string value.
CREATE TABLE statement
create table if not exists foo(id int)
{
  "type": "statement",
  "name": {
    "type": "identifier",
    "variant": "table",
    "name": "foo"
  },
  "variant": "create",
  "format": "table",
  "definition": [
    {
      "type": "definition",
      "variant": "column",
      "name": "id",
      "definition": [],
      "datatype": {
        "type": "datatype",
        "variant": "int",
        "affinity": "integer"
      }
    }
  ],
  "condition": [
    {
      "type": "condition",
      "variant": "if",
      "condition": {
        "type": "expression",
        "variant": "exists",
        "operator": "not exists"
      }
    }
  ]
}
DROP TABLE statement
drop table if exists foo
{
  "type": "statement",
  "target": {
    "type": "identifier",
    "variant": "table",
    "name": "foo"
  },
  "variant": "drop",
  "format": "table",
  "condition": [
    {
      "type": "condition",
      "variant": "if",
      "condition": {
        "type": "expression",
        "variant": "exists",
        "operator": "exists"
      }
    }
  ]
}
NOT EXISTS expression
select a
where not exists (select b)
{
  "type": "statement",
  "variant": "select",
  "result": [
    {
      "type": "identifier",
      "variant": "column",
      "name": "a"
    }
  ],
  "where": [
    {
      "type": "expression",
      "format": "unary",
      "variant": "exists",
      "expression": {
        "type": "statement",
        "variant": "select",
        "result": [
          {
            "type": "identifier",
            "variant": "column",
            "name": "b"
          }
        ]
      },
      "operator": "not exists"
    }
  ]
}
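With this format, checking a statement for the IF NOT EXISTS modifier means inspecting its condition list. A sketch using the node shapes above:

```javascript
// True when a CREATE/DROP statement carries an IF NOT EXISTS modifier.
function hasIfNotExists(statement) {
  return (statement.condition || []).some(function (cond) {
    return cond.variant === 'if' &&
      cond.condition &&
      cond.condition.variant === 'exists' &&
      cond.condition.operator === 'not exists';
  });
}

// Trimmed-down CREATE TABLE node based on the example above.
var createTable = {
  type: 'statement', variant: 'create', format: 'table',
  condition: [{
    type: 'condition', variant: 'if',
    condition: { type: 'expression', variant: 'exists', operator: 'not exists' }
  }]
};
```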
A RangeError: Maximum call stack size exceeded was generated when running the uglified bundle (dist/sqlite-parser.js) in the browser, so I am skipping the minification step and only publishing the browserified bundle.
Fixed binary expression parsing logic so that it can handle expressions such as:
SELECT * FROM t where -1 * (2 + 3);
SELECT * FROM t where 3 + 4 * 5 > 20;
SELECT * FROM t where v1 = ((v2 * 5) - v3);
Allow a qualified table name in the ON clause of a CREATE TRIGGER statement (e.g., ON dbName.tableName).
Allow the literal boolean values on and off in PRAGMA statements:
PRAGMA legacy_file_format = ON;
Do not treat TEMP or ROWID as reserved words, since the official parser allows temp or rowid when used as an identifier (e.g., a table named temp or the rowid column of a table).
CREATE TABLE temp.t1(a, b);
SELECT rowid AS "Internal Row ID" FROM bees;
Require semicolons to delineate BEGIN and END statements for transactions, while also allowing unnecessary semicolons to be omitted:
BEGIN;CREATE TABLE nick(a, b);END
Only allow CRUD operations inside the body of a CREATE TRIGGER statement.
Allow empty strings or NULL to be used as aliases, to match the behavior of the native SQLite parser, such as in an ATTACH DATABASE statement:
ATTACH DATABASE '' AS ''
Allow datatype names to be provided to COLLATE to match the behavior of the official SQLite parser:
SELECT c1
FROM t
ORDER BY 1 COLLATE numeric
Some CREATE TRIGGER statements were previously parsed as binary expressions instead of create trigger statements.
Allow indexed columns to be parsed when they include a COLLATE and/or an ordering direction (e.g., ASC, DESC) as part of a table constraint in a CREATE TABLE statement or the ON part of a CREATE INDEX statement:
CREATE TABLE t1(id int, PRIMARY KEY(x COLLATE binary ASC, y COLLATE hex, z DESC))
Allow the UNIQUE column constraint type to be correctly parsed.
CREATE TABLE bees(
  a INTEGER UNIQUE ON CONFLICT IGNORE
)
Allow nested unary expressions while also preserving the correct order of precedence.
SELECT not not foo
FROM bees
The action (e.g., ADD COLUMN) and target (e.g., the table name) of an ALTER TABLE statement were not being added to the AST.
Allow AUTOINCREMENT in the column list of a PRIMARY KEY table constraint.
CREATE TABLE t7(
  x INTEGER,
  y REAL,
  PRIMARY KEY(x AUTOINCREMENT)
);
Now supporting custom datatypes with affinity inference where possible. See the SQLite documentation on datatypes for an explanation of how type affinity is chosen for custom types.
CREATE TABLE t3(
  -- Affinity: NUMERIC
  d STRING,
  -- Affinity: INTEGER
  e FLOATING POINT(5,10),
  -- Affinity: TEXT
  f NATIONAL CHARACTER(15) COLLATE RTRIM,
  -- Affinity: INTEGER
  g LONG INTEGER DEFAULT( 3600*12 )
);
Allow a trailing . in a decimal value (e.g., SELECT 1. + 1).
Allow functions to have datatype names such as date(arg) or time(now).
Allow reserved words (e.g., from, to) in the USING clause of a CREATE VIRTUAL TABLE statement and in CTE columns.
Better nested expression parsing when unnecessary parentheses are used within a complex expression.
SELECT
  SUM(
    CASE
      WHEN ( t.color != 'yellow' ) THEN 1
      ELSE 0
    END
  ) AS imposter_bee_count
FROM
  bees t;
Allow a reserved word to be used as a column name in a CREATE TABLE statement as long as it can be safely implied that it is meant to be a column name.
CREATE TABLE changelog(
  desc TEXT
);
Allow a WITH clause before a SELECT statement wherever a SELECT statement can be found in a complex query, such as in an INSERT INTO ... SELECT query.
INSERT INTO t6
WITH s(x) AS ( VALUES (2) UNION ALL SELECT x + 2 FROM s WHERE x < 49 )
SELECT * FROM s;
A view expression can now be used in a CREATE VIEW statement.
CREATE VIEW v1(a, b) AS VALUES(1, 2), (3, 4);
[v0.14.5] - 2016-07-11
Fix the alternate not-equal operator <>:
SELECT *
FROM hats
WHERE quantity <> 1