sql-metadata

Uses tokenized query returned by python-sqlparse and generates query metadata.
Extracts column names and tables used by the query.
Automatically conduct column alias resolution, sub queries aliases resolution as well as tables aliases resolving.
Provides also a helper for normalization of SQL queries.
Supported queries syntax:
(note that listed backends can differ quite substantially but should work in regard of query types supported by sql-metadata)
You can test the capabilities of sql-metadata with an interactive demo: https://sql-app.infocruncher.com/
Usage
pip install sql-metadata
from sql_metadata import Parser
Parser("SELECT * FROM foo").tokens
Extracting columns from query
from sql_metadata import Parser
Parser("SELECT test, id FROM foo, bar").columns
Parser("INSERT /* VoteHelper::addVote xxx */ INTO `page_vote` (article_id,user_id,`time`) VALUES ('442001','27574631','20180228130846')").columns
parser = Parser("SELECT a.* FROM product_a.users AS a JOIN product_b.users AS b ON a.ip_address = b.ip_address")
parser.columns
parser.columns_dict
Extracting columns aliases from query
from sql_metadata import Parser
parser = Parser("SELECT a, (b + c - u) as alias1, custome_func(d) alias2 from aa, bb order by alias1")
parser.columns
parser.columns_aliases_names
parser.columns_aliases
parser.columns_aliases_dict
assert parser.columns_dict == {'order_by': ['b', 'c', 'u'],
'select': ['a', 'b', 'c', 'u', 'd']}
from sql_metadata import Parser
Parser("SELECT a.* FROM product_a.users AS a JOIN product_b.users AS b ON a.ip_address = b.ip_address").tables
Parser("SELECT test, id FROM foo, bar").tables
parser = Parser("SELECT f.test FROM foo AS f")
parser.tables_aliases
parser.columns
from sql_metadata import Parser
parser = Parser(
"INSERT /* VoteHelper::addVote xxx */ INTO `page_vote` (article_id,user_id,`time`) "
"VALUES ('442001','27574631','20180228130846')"
)
parser.values
parser.values_dict
parser = Parser(
"INSERT IGNORE INTO `table` VALUES (9, 2.15, '123', '2017-01-01');"
)
parser.values
parser.values_dict
Extracting limit and offset
from sql_metadata import Parser
Parser('SELECT foo_limit FROM bar_offset LIMIT 50 OFFSET 1000').limit_and_offset
Parser('SELECT foo_limit FROM bar_offset limit 2000,50').limit_and_offset
from sql_metadata import Parser
parser = Parser(
"""
WITH
database1.tableFromWith AS (SELECT aa.* FROM table3 as aa
left join table4 on aa.col1=table4.col2),
test as (SELECT * from table3)
SELECT
"xxxxx"
FROM
database1.tableFromWith alias
LEFT JOIN database2.table2 ON ("tt"."ttt"."fff" = "xx"."xxx")
"""
)
parser.with_names
parser.with_queries
parser.tables
from sql_metadata import Parser
parser = Parser(
"""
SELECT COUNT(1) FROM
(SELECT std.task_id FROM some_task_detail std WHERE std.STATUS = 1) a
JOIN (SELECT st.task_id FROM some_task st WHERE task_type_id = 80) b
ON a.task_id = b.task_id;
"""
)
parser.subqueries
parser.subqueries_names
parser.columns
parser.columns_dict
See tests file for more examples of a bit more complex queries.
Queries normalization and comments extraction
from sql_metadata import Parser
parser = Parser('SELECT /* Test */ foo FROM bar WHERE id in (1, 2, 56)')
parser.generalize
parser.without_comments
parser.comments
See test/test_normalization.py file for more examples of a bit more complex queries.
Migrating from sql_metadata 1.x
sql_metadata.compat module has been implemented to make the introduction of sql-metadata v2.0 smoother.
You can use it by simply changing the imports in your code from:
from sql_metadata import get_query_columns, get_query_tables
into:
from sql_metadata.compat import get_query_columns, get_query_tables
The following functions from the old API are available in the sql_metadata.compat module:
generalize_sql
get_query_columns (since #131 columns aliases ARE NOT returned by this function)
get_query_limit_and_offset
get_query_tables
get_query_tokens
preprocess_query
Authors and contributors
Created and maintained by @macbre with a great contributions from @collerek and the others.
Stargazers over time
