aiochclient
An async HTTP(S) ClickHouse client for Python 3.6+ supporting type
conversion in both directions, streaming, lazy decoding on SELECT queries, and a
fully typed interface.
Installation
You can use it with either the aiohttp or the httpx HTTP connector.
To use it with aiohttp, install it with the command:
> pip install aiochclient[aiohttp]
Or install aiochclient[aiohttp-speedups] for extra speedups.
To use it with httpx, install it with the command:
> pip install aiochclient[httpx]
Or install aiochclient[httpx-speedups] for extra speedups.
Installing with [*-speedups] adds the following:
- cChardet for aiohttp speedup
- aiodns for aiohttp speedup
- ciso8601 for ultra-fast datetime parsing while decoding data from ClickHouse, for both aiohttp and httpx
Additionally the installation process attempts to use Cython for a speed boost
(roughly 30% faster).
Quick Start
Connecting to ClickHouse
aiochclient needs an aiohttp.ClientSession or an httpx.AsyncClient to connect to ClickHouse:
from aiochclient import ChClient
from aiohttp import ClientSession

async def main():
    async with ClientSession() as s:
        client = ChClient(s)
        assert await client.is_alive()
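The same pattern works with httpx; a minimal sketch, assuming ChClient takes the httpx.AsyncClient in place of the aiohttp session:

from aiochclient import ChClient
from httpx import AsyncClient

async def main():
    async with AsyncClient() as http_client:
        client = ChClient(http_client)
        assert await client.is_alive()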
Querying the database
await client.execute(
    "CREATE TABLE t (a UInt8, b Tuple(Date, Nullable(Float32))) ENGINE = Memory"
)
For INSERT queries you can pass values as *args. Each value should be an
iterable representing one row:
import datetime as dt

await client.execute(
    "INSERT INTO t VALUES",
    (1, (dt.date(2018, 9, 7), None)),
    (2, (dt.date(2018, 9, 8), 3.14)),
)
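Because each positional argument is one row, rows can also be built programmatically and unpacked into execute; a small sketch (the rows list below is made up for illustration):

rows = [(i, (dt.date(2018, 9, 7), float(i))) for i in range(100)]
await client.execute("INSERT INTO t VALUES", *rows)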
To fetch all rows at once, use the fetch method:
all_rows = await client.fetch("SELECT * FROM t")
To fetch the first row of the result, use the fetchrow method:
row = await client.fetchrow("SELECT * FROM t WHERE a=1")
assert row[0] == 1
assert row["b"] == (dt.date(2018, 9, 7), None)
You can also use the fetchval method, which returns the first value of the first row of the query result:
val = await client.fetchval("SELECT b FROM t WHERE a=2")
assert val == (dt.date(2018, 9, 8), 3.14)
With async iteration on the query results stream you can fetch multiple
rows without loading them all into memory at once:
async for row in client.iterate(
    "SELECT number, number*2 FROM system.numbers LIMIT 10000"
):
    assert row[0] * 2 == row[1]
Use fetch / fetchrow / fetchval / iterate for SELECT queries, and execute
(or any of the above) for INSERTs and all other queries.
Working with query results
All fetch queries return rows as lightweight, memory-efficient objects. Before
v1.0.0, rows were only returned as tuples. All rows have a full mapping interface,
so you can get fields by name or by index:
row = await client.fetchrow("SELECT a, b FROM t WHERE a=1")
assert row["a"] == 1
assert row[0] == 1
assert row[:] == (1, (dt.date(2018, 9, 7), None))
assert list(row.keys()) == ["a", "b"]
assert list(row.values()) == [1, (dt.date(2018, 9, 7), None)]
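If you need plain dictionaries (for JSON serialization, for example), the mapping interface shown above is enough to build them; a small sketch:

rows = await client.fetch("SELECT a, b FROM t")
dicts = [dict(zip(row.keys(), row.values())) for row in rows]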
Documentation
To check out the API docs, visit the readthedocs site.
Type Conversion
aiochclient
automatically converts types from ClickHouse to python types and
vice-versa.
ClickHouse type | Python type |
---|---|
Bool | bool |
UInt8 | int |
UInt16 | int |
UInt32 | int |
UInt64 | int |
UInt128 | int |
UInt256 | int |
Int8 | int |
Int16 | int |
Int32 | int |
Int64 | int |
Int128 | int |
Int256 | int |
Float32 | float |
Float64 | float |
String | str |
FixedString | str |
Enum8 | str |
Enum16 | str |
Date | datetime.date |
DateTime | datetime.datetime |
DateTime64 | datetime.datetime |
Decimal | decimal.Decimal |
Decimal32 | decimal.Decimal |
Decimal64 | decimal.Decimal |
Decimal128 | decimal.Decimal |
IPv4 | ipaddress.IPv4Address |
IPv6 | ipaddress.IPv6Address |
UUID | uuid.UUID |
Nothing | None |
Tuple(T1, T2, ...) | Tuple[T1, T2, ...] |
Array(T) | List[T] |
Nullable(T) | None or T |
LowCardinality(T) | T |
Map(T1, T2) | Dict[T1, T2] |
Nested(T1, T2, ...) | List[Tuple[T1, T2, ...], Tuple[T1, T2, ...]] |
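To illustrate conversion in both directions, here is a small round-trip sketch (assuming a connected client as in the Quick Start; the types_demo table name is made up for this example):

import uuid
from decimal import Decimal

await client.execute(
    "CREATE TABLE types_demo (id UUID, price Decimal32(2), tags Array(String)) "
    "ENGINE = Memory"
)
await client.execute(
    "INSERT INTO types_demo VALUES",
    (uuid.uuid4(), Decimal("9.99"), ["a", "b"]),
)
row = await client.fetchrow("SELECT * FROM types_demo")
assert isinstance(row["id"], uuid.UUID)
assert row["price"] == Decimal("9.99")
assert row["tags"] == ["a", "b"]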
Connection Pool Settings
aiochclient uses the aiohttp.TCPConnector to determine the pool size. By default, the pool limit is 100 open connections.
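To change the limit, pass your own connector to the session; a minimal sketch using aiohttp's TCPConnector:

from aiohttp import ClientSession, TCPConnector
from aiochclient import ChClient

async def main():
    # Allow at most 32 simultaneous connections to ClickHouse
    async with ClientSession(connector=TCPConnector(limit=32)) as s:
        client = ChClient(s)
        assert await client.is_alive()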
Notes on Speed
It is highly recommended to use uvloop and to install aiochclient with
speedups for the sake of speed (see the sketch below). Some recent benchmarks on our
machines, without parallelization:
- 180k-220k rows/sec on SELECT
- 50k-80k rows/sec on INSERT
Note: these benchmarks are system-dependent.
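A minimal sketch of enabling uvloop before running the Quick Start example (assumes uvloop is installed and uses its install() helper):

import asyncio
import uvloop

uvloop.install()  # switch the default event loop to uvloop
asyncio.run(main())  # main() from the Quick Start above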