
Let's make a SQL parser so we can provide a familiar interface to non-SQL datastores!
SQL is a familiar language used to access databases. Although each database vendor has its own quirky implementation, there is enough standardization that the average developer does not need to know those quirks. This familiar core SQL (the lowest common denominator, if you will) is useful enough to explore data in primitive ways. The hope is that once programmers have reviewed a datastore with basic SQL queries and seen the value of its data, they will be motivated to use the datastore's native query format.
The primary objective of this library is to convert SQL queries to JSON-izable parse trees. It originally targeted MySQL, but has grown to include other database vendors. Please paste some SQL into a new issue if it does not work for you.
update
or insert
It is my sincere hope you can convert the JSON into queries for your particular backend datastore.
Jan 2021 - There are almost 500 tests. This parser is good enough for basic usage, including inner queries, WITH clauses, and window functions. There is still a lot missing to support BigQuery and Redshift queries.
pip install moz-sql-parser
>>> from moz_sql_parser import parse
>>> import json
>>> json.dumps(parse("select count(1) from jobs"))
'{"select": {"value": {"count": 1}}, "from": "jobs"}'
Each SQL query is parsed to an object: Each clause is assigned to an object property of the same name.
>>> json.dumps(parse("select a as hello, b as world from jobs"))
'{"select": [{"value": "a", "name": "hello"}, {"value": "b", "name": "world"}], "from": "jobs"}'
The SELECT clause is an array of objects containing name and value properties.
Python's default recursion limit (1000) is not hit when parsing the test suite, but large SQL may exceed it. You can increase the recursion limit before you parse:
>>> import sys
>>> from moz_sql_parser import parse
>>> sys.setrecursionlimit(3000)
>>> parse(complicated_sql)
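If changing the limit for the whole process is undesirable, one option is to raise it only while parsing. The following is a minimal sketch and not part of moz-sql-parser: `recursion_limit` is a hypothetical helper name, and 3000 is simply the figure used above.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit; restore the old value on exit."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


# Intended usage (parse and complicated_sql as in the example above):
#
#     with recursion_limit(3000):
#         result = parse(complicated_sql)
```

The `finally` clause restores the previous limit even if the parse raises an exception.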
You may also generate SQL from a given JSON document. This is done by the formatter, which is still incomplete (Jan 2020).
>>> from moz_sql_parser import format
>>> format({"from":"test", "select":["a.b", "c"]})
'SELECT a.b, c FROM test'
If the parser is not working for you, you can help make it better by simply pasting your SQL (or JSON) into a new issue. Extra points if you describe the problem. Even more points if you submit a PR with a test. If you also submit a fix, then you also have my gratitude.
See the tests directory for instructions on running tests or writing new ones.
SQL queries are translated to JSON objects: Each clause is assigned to an object property of the same name.
# SELECT * FROM dual WHERE a>b ORDER BY a+b
{
    "select": "*",
    "from": "dual",
    "where": {"gt": ["a", "b"]},
    "orderby": {"value": {"add": ["a", "b"]}}
}
Expressions are also objects, but with only one property: the property name is the name of the operation, and its value holds the parameter (or an array of parameters) for that operation.
{op: parameters}
You can see this pattern in the previous example:
{"gt": ["a","b"]}
The moz-sql-parser.scrub() method is used liberally throughout the code, and it "simplifies" the JSON. You may find the simplified form a bit tedious to work with, because a JSON property value can be a single value, a list of values, or missing. Please consider converting everything to arrays:
def listwrap(value):
    # a missing value becomes an empty list
    if value is None:
        return []
    elif isinstance(value, list):
        return value
    else:
        # a single value becomes a one-element list
        return [value]
Then you can avoid all the is-it-a-list checks:
for select in listwrap(parsed_result.get('select')):
    do_something(select)
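As a concrete case, the sketch below lists the output column names of a query, whether the select clause is a single object, a list, a bare string, or absent. The `output_names` helper is hypothetical and written for this example; the parse results are copied from the examples earlier in this document, so the block runs without the library installed.

```python
def listwrap(value):
    """Return value as a list: None -> [], list -> unchanged, other -> [value]."""
    if value is None:
        return []
    elif isinstance(value, list):
        return value
    else:
        return [value]


def output_names(parsed_result):
    """Collect the name (or, failing that, the value) of each selected column."""
    names = []
    for select in listwrap(parsed_result.get("select")):
        if isinstance(select, dict):
            names.append(select.get("name", select.get("value")))
        else:
            # e.g. the bare "*" of SELECT *
            names.append(select)
    return names


single = {"select": {"value": {"count": 1}}, "from": "jobs"}
many = {
    "select": [{"value": "a", "name": "hello"}, {"value": "b", "name": "world"}],
    "from": "jobs",
}
star = {"select": "*", "from": "dual"}
```

With these inputs, `output_names(many)` gives `["hello", "world"]` and `output_names(star)` gives `["*"]`, with no type checks at the call site.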
You may also find it easier if every JSON expression has a list of operands:
def normalize(expression):
    # column names and constants are not expressions; leave them alone
    if not isinstance(expression, dict):
        return expression
    # ensure parameters are in a list
    return {
        op: [normalize(p) for p in listwrap(param)]
        for op, param in expression.items()
    }
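To see the effect, the sketch below applies this kind of normalization to expressions taken from the earlier examples. It redefines both helpers so that it runs on its own. The early return for non-dict operands is an assumption made here so that column names and constants pass through unchanged.

```python
def listwrap(value):
    """Return value as a list: None -> [], list -> unchanged, other -> [value]."""
    if value is None:
        return []
    elif isinstance(value, list):
        return value
    else:
        return [value]


def normalize(expression):
    """Recursively wrap the parameters of every {op: parameters} object in a list."""
    if not isinstance(expression, dict):
        # Column names and constants are left untouched.
        return expression
    return {
        op: [normalize(p) for p in listwrap(param)]
        for op, param in expression.items()
    }


count = normalize({"count": 1})
nested = normalize({"count": {"add": ["a", "b"]}})
unchanged = normalize({"gt": ["a", "b"]})
```

After this, `count` is `{"count": [1]}`, `nested` is `{"count": [{"add": ["a", "b"]}]}`, and `unchanged` equals its input, so downstream code can always iterate over the operands.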
FAQs
Extract Parse Tree from SQL
We found that moz-sql-parser demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.