ipython-sql
===========
:Author: Catherine Devlin, http://catherinedevlin.blogspot.com
Introduces a %sql (or %%sql) magic.
Connect to a database, using `SQLAlchemy URL`_ connect strings, then issue SQL commands within IPython or IPython Notebook.
.. image:: https://raw.github.com/catherinedevlin/ipython-sql/master/examples/writers.png
   :width: 600px
   :alt: screenshot of ipython-sql in the Notebook
.. code-block:: python
In [1]: %load_ext sql
In [2]: %%sql postgresql://will:longliveliz@localhost/shakes
...: select * from character
...: where abbrev = 'ALICE'
...:
Out[2]: [(u'Alice', u'Alice', u'ALICE', u'a lady attending on Princess Katherine', 22)]
In [3]: result = _
In [4]: print(result)
charid   charname   abbrev                description                speechcount
=================================================================================
Alice    Alice      ALICE     a lady attending on Princess Katherine          22
In [5]: result.keys
Out[5]: [u'charid', u'charname', u'abbrev', u'description', u'speechcount']
In [6]: result[0][0]
Out[6]: u'Alice'
In [7]: result[0].description
Out[7]: u'a lady attending on Princess Katherine'
After the first connection, connect info can be omitted::
In [8]: %sql select count(*) from work
Out[8]: [(43L,)]
Connections to multiple databases can be maintained. You can refer to an existing connection by ``username@database``.
.. code-block:: python
In [9]: %%sql will@shakes
...: select charname, speechcount from character
...: where speechcount = (select max(speechcount)
...: from character);
...:
Out[9]: [(u'Poet', 733)]
In [10]: print(_)
charname   speechcount
======================
Poet               733
If no connect string is supplied, ``%sql`` will provide a list of existing connections; however, if no connections have yet been made and the environment variable ``DATABASE_URL`` is available, that will be used.
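For example (a minimal sketch in a fresh session; the SQLite URL is a hypothetical placeholder), you could set ``DATABASE_URL`` before loading the extension and then query without supplying any connect string:
.. code-block:: python
In [1]: import os
   ...: os.environ['DATABASE_URL'] = 'sqlite:///shakes.db'
In [2]: %load_ext sql
In [3]: %sql select count(*) from work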
For secure access, you may dynamically access your credentials (e.g. from your system environment or ``getpass.getpass``) to avoid storing your password in the notebook itself. Use the ``$`` before any variable to access it in your ``%sql`` command.
.. code-block:: python
In [11]: import os
   ....: user = os.getenv('SOME_USER')
....: password = os.getenv('SOME_PASSWORD')
....: connection_string = "postgresql://{user}:{password}@localhost/some_database".format(user=user, password=password)
....: %sql $connection_string
Out[11]: u'Connected: some_user@some_database'
You may use multiple SQL statements inside a single cell, but you will only see the query results from the last of them, so this really only makes sense for statements with no output.
.. code-block:: python
In [11]: %%sql sqlite://
....: CREATE TABLE writer (first_name, last_name, year_of_death);
....: INSERT INTO writer VALUES ('William', 'Shakespeare', 1616);
....: INSERT INTO writer VALUES ('Bertold', 'Brecht', 1956);
....:
Out[11]: []
As a convenience, dict-style access to result sets is supported, with the leftmost column serving as the key; its values should therefore be unique.
.. code-block:: python
In [12]: result = %sql select * from work
43 rows affected.
In [13]: result['richard2']
Out[13]: (u'richard2', u'Richard II', u'History of Richard II', 1595, u'h', None, u'Moby', 22411, 628)
Results can also be retrieved as an iterator of dictionaries (``result.dicts()``) or as a single dictionary with a tuple of scalar values per key (``result.dict()``).
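For instance, a brief sketch reusing the ``result`` set from the ``work`` query above:
.. code-block:: python
In [14]: rows = result.dicts()       # iterator yielding one dict per row
    ...: first = next(rows)
    ...: first['title']
    ...: columns = result.dict()     # a single dict: column name -> tuple of values
    ...: columns['title'][:3]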
Bind variables (bind parameters) can be used in the "named" (:x) style. The variable names used should be defined in the local namespace.
.. code-block:: python
In [15]: name = 'Countess'
In [16]: %sql select description from character where charname = :name
Out[16]: [(u'mother to Bertram',)]
Alternately, ``$variable_name`` or ``{variable_name}`` can be used to inject variables from the local namespace into the SQL statement before it is formed and passed to the SQL engine. (Using ``$`` and ``{}`` together, as in ``${variable_name}``, is not supported.)
.. code-block:: python
In [17]: %sql select description from character where charname = '{name}'
Out[17]: [(u'mother to Bertram',)]
Bind variables are passed through to the SQL engine and can only be used to replace strings passed to SQL. ``$`` and ``{}`` are substituted before passing to SQL and can be used to form SQL statements dynamically.
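A sketch of the distinction (``table`` here is a hypothetical local variable): a bind variable can only supply a value, while ``$``/``{}`` substitution can splice in a SQL fragment such as a table name.
.. code-block:: python
In [18]: table = 'character'   # a table name cannot be supplied as a bind parameter
    ...: %sql select count(*) from {table}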
Ordinary IPython assignment works for single-line ``%sql`` queries:
.. code-block:: python
In [18]: works = %sql SELECT title, year FROM work
43 rows affected.
The ``<<`` operator captures query results in a local variable, and can be used in multi-line ``%%sql``:
.. code-block:: python
In [19]: %%sql works << SELECT title, year
...: FROM work
...:
43 rows affected.
Returning data to local variable works
Connection strings are `SQLAlchemy URL`_ standard.
Some example connection strings::
mysql+pymysql://scott:tiger@localhost/foo
oracle://scott:tiger@127.0.0.1:1521/sidname
sqlite://
sqlite:///foo.db
mssql+pyodbc://username:password@host/database?driver=SQL+Server+Native+Client+11.0
.. _SQLAlchemy URL: http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
Note that ``mysql`` and ``mysql+pymysql`` connections (and perhaps others) don't read your client character set information from ``.my.cnf``. You need to specify it in the connection string::
mysql+pymysql://scott:tiger@localhost/foo?charset=utf8
Note that an ``impala`` connection with `impyla`_ for HiveServer2 requires disabling autocommit::
%config SqlMagic.autocommit=False
%sql impala://hserverhost:port/default?kerberos_service_name=hive&auth_mechanism=GSSAPI
.. _impyla: https://github.com/cloudera/impyla
Connection arguments not whitelisted by SQLAlchemy can be provided as a JSON string with the ``-a`` / ``--connection_arguments`` flag, placed before the connection string. See `SQLAlchemy Args`_.
| %sql --connection_arguments {"timeout":10,"mode":"ro"} sqlite:// SELECT * FROM work;
| %sql -a '{"timeout":10, "mode":"ro"}' sqlite:// SELECT * from work;
.. _SQLAlchemy Args: https://docs.sqlalchemy.org/en/13/core/engines.html#custom-dbapi-args
DSN connections
---------------
Alternately, you can store connection info in a
configuration file, under a section name chosen to
refer to your database.
For example, if dsn.ini contains
| [DB_CONFIG_1]
| drivername=postgres
| host=my.remote.host
| port=5433
| database=mydatabase
| username=myuser
| password=1234
then you can
| %config SqlMagic.dsn_filename='./dsn.ini'
| %sql --section DB_CONFIG_1
Configuration
-------------
Query results are loaded as lists, so very large result sets may use up
your system's memory and/or hang your browser. There is no autolimit
by default. However, `autolimit` (if set) limits the size of the result
set (usually with a `LIMIT` clause in the SQL). `displaylimit` is similar,
but the entire result set is still pulled into memory (for later analysis);
only the screen display is truncated.
.. code-block:: python
In [2]: %config SqlMagic
SqlMagic options
----------------
SqlMagic.autocommit=<Bool>
Current: True
Set autocommit mode
SqlMagic.autolimit=<Int>
Current: 0
Automatically limit the size of the returned result sets
SqlMagic.autopandas=<Bool>
Current: False
Return Pandas DataFrames instead of regular result sets
SqlMagic.column_local_vars=<Bool>
Current: False
Return data into local variables from column names
SqlMagic.displaycon=<Bool>
Current: False
Show connection string after execute
SqlMagic.displaylimit=<Int>
Current: None
Automatically limit the number of rows displayed (full result set is still
stored)
SqlMagic.dsn_filename=<Unicode>
Current: 'odbc.ini'
Path to DSN file. When the first argument is of the form [section], a
sqlalchemy connection string is formed from the matching section in the DSN
file.
SqlMagic.feedback=<Bool>
Current: False
Print number of rows affected by DML
SqlMagic.short_errors=<Bool>
Current: True
Don't display the full traceback on SQL Programming Error
SqlMagic.style=<Unicode>
Current: 'DEFAULT'
Set the table printing style to any of prettytable's defined styles
(currently DEFAULT, MSWORD_FRIENDLY, PLAIN_COLUMNS, RANDOM)
In[3]: %config SqlMagic.feedback = False
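For example, to keep very large result sets from exhausting memory or flooding the browser, you might set both limits described above (the values here are arbitrary):
.. code-block:: python
In[4]: %config SqlMagic.autolimit = 1000
In[5]: %config SqlMagic.displaylimit = 10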
Please note: if you have ``autopandas`` set to true, the ``displaylimit`` option will not apply. You can set the pandas display limit by using the pandas ``max_rows`` option as described in the `pandas documentation <http://pandas.pydata.org/pandas-docs/version/0.18.1/options.html#frequently-used-options>`_.
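A sketch of that combination (assuming pandas is installed): with ``autopandas`` enabled, display truncation is controlled by the pandas ``max_rows`` option rather than by ``displaylimit``:
.. code-block:: python
In[6]: %config SqlMagic.autopandas = True
In[7]: import pandas as pd
  ...: pd.set_option('display.max_rows', 20)   # pandas option that truncates displayed rows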
Pandas
------
If you have installed ``pandas``, you can use a result set's ``.DataFrame()`` method.
.. code-block:: python
In [3]: result = %sql SELECT * FROM character WHERE speechcount > 25
In [4]: dataframe = result.DataFrame()
The ``--persist`` argument, with the name of a DataFrame object in memory, will create a table in the database from the named DataFrame. Or use ``--append`` to add rows to an existing table by that name.
.. code-block:: python
In [5]: %sql --persist dataframe
In [6]: %sql SELECT * FROM dataframe;
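Under the same assumptions, ``--append`` adds the DataFrame's rows to that existing table instead of recreating it:
.. code-block:: python
In [7]: %sql --append dataframe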
.. _Pandas: http://pandas.pydata.org/
Graphing
--------
If you have installed ``matplotlib``, you can use a result set's ``.plot()``, ``.pie()``, and ``.bar()`` methods for quick plotting.
.. code-block:: python
In[5]: result = %sql SELECT title, totalwords FROM work WHERE genretype = 'c'
In[6]: %matplotlib inline
In[7]: result.pie()
.. image:: https://raw.github.com/catherinedevlin/ipython-sql/master/examples/wordcount.png
:alt: pie chart of word count of Shakespeare's comedies
Dumping
-------
Result sets come with a ``.csv(filename=None)`` method. This generates
comma-separated text either as a return value (if ``filename`` is not
specified) or in a file of the given name.
.. code-block:: python
In[8]: result = %sql SELECT title, totalwords FROM work WHERE genretype = 'c'
In[9]: result.csv(filename='work.csv')
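If ``filename`` is omitted, the same call returns the comma-separated text directly, so you can post-process it yourself; a small sketch:
.. code-block:: python
In[10]: csv_text = result.csv()
   ...: print(csv_text.splitlines()[0])   # the header row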
PostgreSQL features
-------------------
``psql``-style "backslash" `meta-commands`_ (``\d``, ``\dt``, etc.)
are provided by `PGSpecial`_. Example:
.. code-block:: python
In[9]: %sql \d
.. _PGSpecial: https://pypi.python.org/pypi/pgspecial
.. _meta-commands: https://www.postgresql.org/docs/9.6/static/app-psql.html#APP-PSQL-META-COMMANDS
Options
-------
``-l`` / ``--connections``
List all active connections
``-x`` / ``--close <session-name>``
Close named connection
``-c`` / ``--creator <creator-function>``
Specify creator function for new connection
``-s`` / ``--section <section-name>``
Section of dsn_file to be used for generating a connection string
``-p`` / ``--persist``
Create a table in the database from the named DataFrame
``--append``
Like ``--persist``, but appends to the table if it already exists
``-a`` / ``--connection_arguments <"{connection arguments}">``
Specify dictionary of connection arguments to pass to SQL driver
``-f`` / ``--file <path>``
Run SQL from file at this path
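For example (a sketch; ``will@shakes`` is the connection opened earlier, and ``report.sql`` is a hypothetical file of SQL statements):
.. code-block:: python
In [1]: %sql --connections
In [2]: %sql --close will@shakes
In [3]: %sql --file ./report.sql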
Caution
-------
Comments
~~~~~~~~
Because ipython-sql accepts ``--``-delimited options like ``--persist``, but ``--``
is also the syntax to denote a SQL comment, the parser needs to make some assumptions.
- If you try to pass an unsupported argument, like ``--lutefisk``, it will
be interpreted as a SQL comment and will not throw an unsupported argument
exception.
- If the SQL statement begins with a first-line comment that looks like one
of the accepted arguments - like ``%sql --persist is great!`` - it will be
parsed like an argument, not a comment. Moving the comment to the second
line or later will avoid this.
Installing
----------
Install the latest release with::
pip install ipython-sql
or download from https://github.com/catherinedevlin/ipython-sql and::
cd ipython-sql
sudo python setup.py install
Development
-----------
https://github.com/catherinedevlin/ipython-sql
Credits
-------
- Matthias Bussonnier for help with configuration
- Olivier Le Thanh Duong for ``%config`` fixes and improvements
- Distribute_
- Buildout_
- modern-package-template_
- Mike Wilson for bind variable code
- Thomas Kluyver and Steve Holden for debugging help
- Berton Earnshaw for DSN connection syntax
- Bruno Harbulot for DSN example
- Andrés Celis for SQL Server bugfix
- Michael Erasmus for DataFrame truth bugfix
- Noam Finkelstein for README clarification
- Xiaochuan Yu for `<<` operator, syntax colorization
- Amjith Ramanujam for PGSpecial and incorporating it here
- Alexander Maznev for better arg parsing, connections accepting specified creator
- Jonathan Larkin for configurable displaycon
- Jared Moore for ``connection-arguments`` support
- Gilbert Brault for ``--append``
- Lucas Zeer for multi-line bugfixes for var substitution, ``<<``
- vkk800 for ``--file``
- Jens Albrecht for MySQL DatabaseError bugfix
- meihkv for connection-closing bugfix
- Abhinav C for SQLAlchemy 2.0 compatibility
.. _Distribute: http://pypi.python.org/pypi/distribute
.. _Buildout: http://www.buildout.org/
.. _modern-package-template: http://pypi.python.org/pypi/modern-package-template
News
----
0.1
~~~
*Release date: 21-Mar-2013*
* Initial release
0.1.1
~~~~~
*Release date: 29-Mar-2013*
* Release to PyPI
* Results returned as lists
* print(_) to get table form in text console
* set autolimit and text wrap in configuration
0.1.2
~~~~~
*Release date: 29-Mar-2013*
* Python 3 compatibility
* use prettyprint package
* allow multiple SQL per cell
0.2.0
~~~~~
*Release date: 30-May-2013*
* Accept bind variables (Thanks Mike Wilson!)
0.2.1
~~~~~
*Release date: 15-June-2013*
* Recognize socket connection strings
* Bugfix - issue 4 (remember existing connections by case)
0.2.2
~~~~~
*Release date: 30-July-2013*
Converted from an IPython Plugin to an Extension for 1.0 compatibility
0.2.2.1
~~~~~~~
*Release date: 01-Aug-2013*
Deleted Plugin import left behind in 0.2.2
0.2.3
~~~~~
*Release date: 20-Sep-2013*
* Contributions from Olivier Le Thanh Duong:
- SQL errors reported without internal IPython error stack
- Proper handling of configuration
* Added .DataFrame(), .pie(), .plot(), and .bar() methods to
result sets
0.3.0
~~~~~
*Release date: 13-Oct-2013*
* displaylimit config parameter
* reports number of rows affected by each query
* test suite working again
* dict-style access for result sets by primary key
0.3.1
~~~~~
* Reporting of number of rows affected configurable with ``feedback``
* Local variables usable as SQL bind variables
0.3.2
~~~~~
* ``.csv(filename=None)`` method added to result sets
0.3.3
~~~~~
* Python 3 compatibility restored
* DSN access supported (thanks Berton Earnshaw)
0.3.4
~~~~~
* PERSIST pseudo-SQL command added
0.3.5
~~~~~
* Indentations visible in HTML cells
* COMMIT each SQL statement immediately - prevent locks
0.3.6
~~~~~
* Fixed issue #30, commit failures for sqlite (thanks stonebig, jandot)
0.3.7
~~~~~
* New `column_local_vars` config option submitted by darikg
* Avoid contaminating user namespace from locals (thanks alope107)
0.3.7.1
~~~~~~~
* Avoid "connection busy" error for SQL Server (thanks Andrés Celis)
0.3.8
~~~~~
* Stop warnings for deprecated use of IPython 3 traitlets in IPython 4 (thanks graphaelli; also stonebig, aebrahim, mccahill)
* README update for keeping connection info private, from eshilts
0.3.9
~~~~~
* Fix truth value of DataFrame error (thanks michael-erasmus)
* `<<` operator (thanks xiaochuanyu)
* added README example (thanks tanhuil)
* bugfix in executing column_local_vars (thanks tebeka)
* pgspecial installation optional (thanks jstoebel and arjoe)
* conceal passwords in connection strings (thanks jstoebel)
0.3.9
~~~~~
* Restored Python 2 compatibility (thanks tokenmathguy)
0.4.0
~~~~~
* Changed most non-SQL commands to argparse arguments (thanks pik)
* User can specify a creator for connections (thanks pik)
* Bogus pseudo-SQL command `PERSIST` removed, replaced with `--persist` arg
* Turn off echo of connection information with `displaycon` in config
* Consistent support for {} variables (thanks Lucas)
0.4.1
~~~~~
* Fixed .rst file location in MANIFEST.in
* Parse SQL comments in first line
* Bugfixes for DSN, `--close`, others
0.5.0
~~~~~
* Use SQLAlchemy 2.0
* Drop undocumented support for dict-style access to raw row instances