iceaxe
A modern, fast ORM for Python. We have the following goals:
- 🏎️ Performance: We want to exceed or match the fastest ORMs in Python. We want our ORM
to be as close as possible to raw-asyncpg speeds. See the "Benchmarks" section for more.
- 📝 Typehinting: Everything should be typehinted with expected types. Declare your data as you
expect in Python and it should bidirectionally sync to the database.
- 🐘 Postgres only: Leverage native Postgres features and simplify the implementation.
- ⚡ Common things are easy, rare things are possible: 99% of the SQL queries we write are
vanilla SELECT/INSERT/UPDATEs. These should be natively supported by your ORM. If you're writing really
complex queries, these are better done by hand so you can see exactly what SQL will be run.
Iceaxe is in early alpha. It's also an independent project. It's compatible with the Mountaineer ecosystem, but you can use it in whatever
project and web framework you're using.
Installation
If you're using poetry to manage your dependencies:
poetry add iceaxe
Otherwise install with pip:
pip install iceaxe
Usage
Define your models as a TableBase
subclass:
from iceaxe import TableBase
class Person(TableBase):
id: int
name: str
age: int
TableBase is a subclass of Pydantic's BaseModel
, so you get all of the validation and Field customization
out of the box. We provide our own Field
constructor that adds database-specific configuration. For instance, to make the
id
field a primary key / auto-incrementing you can do:
from iceaxe import Field
class Person(TableBase):
id: int = Field(primary_key=True)
name: str
age: int
Okay now you have a model. How do you interact with it?
Databases are based on a few core primitives to insert data, update it, and fetch it out again.
To do so you'll need a database connection, which is a connection over the network from your code
to your Postgres database. The DBConnection
is the core class for all ORM actions against the database.
from iceaxe import DBConnection
import asyncpg
conn = DBConnection(
await asyncpg.connect(
host="localhost",
port=5432,
user="db_user",
password="yoursecretpassword",
database="your_db",
)
)
The Person class currently just lives in memory. To back it with a full
database table, we can run raw SQL or run a migration to add it:
await conn.conn.execute(
"""
CREATE TABLE IF NOT EXISTS person (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
age INT NOT NULL
)
"""
)
Inserting Data
Instantiate object classes as you normally do:
people = [
Person(name="Alice", age=30),
Person(name="Bob", age=40),
Person(name="Charlie", age=50),
]
await conn.insert(people)
print(people[0].id)
print(people[1].id)
Because we're using an auto-incrementing primary key, the id
field will be populated after the insert.
Iceaxe will automatically update the object in place with the newly assigned value.
Updating data
Now that we have these lovely people, let's modify them.
person = people[0]
person.name = "Blice"
Right now, we have a Python object that's out of state with the database. But that's often okay. We can inspect it
and further write logic - it's fully decoupled from the database.
def ensure_b_letter(person: Person):
if person.name[0].lower() != "b":
raise ValueError("Name must start with 'B'")
ensure_b_letter(person)
To sync the values back to the database, we can call update
:
await conn.update([person])
If we were to query the database directly, we see that the name has been updated:
id | name | age
----+-------+-----
1 | Blice | 31
2 | Bob | 40
3 | Charlie | 50
But no other fields have been touched. This lets a potentially concurrent process
modify Alice
's record - say, updating the age to 31. By the time we update the data, we'll
change the name but nothing else. Under the hood we do this by tracking the fields that
have been modified in-memory and creating a targeted UPDATE to modify only those values.
Selecting data
To select data, we can use a QueryBuilder
. For a shortcut to select
query functions,
you can also just import select directly. This method takes the desired value parameters
and returns a list of the desired objects.
from iceaxe import select
query = select(Person).where(Person.name == "Blice", Person.age > 25)
results = await conn.exec(query)
If we inspect the typing of results
, we see that it's a list[Person]
objects. This matches
the typehint of the select
function. You can also target columns directly:
query = select((Person.id, Person.name)).where(Person.age > 25)
results = await conn.exec(query)
This will return a list of tuples, where each tuple is the id and name of the person: list[tuple[int, str]]
.
We support most of the common SQL operations. Just like the results, these are typehinted
to their proper types as well. Static typecheckers and your IDE will throw an error if you try to compare
a string column to an integer, for instance. A more complex example of a query:
query = select((
Person.id,
FavoriteColor,
)).join(
FavoriteColor,
Person.id == FavoriteColor.person_id,
).where(
Person.age > 25,
Person.name == "Blice",
).order_by(
Person.age.desc(),
).limit(10)
results = await conn.exec(query)
As expected this will deliver results - and typehint - as a list[tuple[int, FavoriteColor]]
Production
[!IMPORTANT]
Iceaxe is in early alpha. We're using it internally and showly rolling out to our production
applications, but we're not yet ready to recommend it for general use. The API and larger
stability is subject to change.
Note that underlying Postgres connection wrapped by conn
will be alive for as long as your object is in memory. This uses up one
of the allowable connections to your database. Your overall limit depends on your Postgres configuration
or hosting provider, but most managed solutions top out around 150-300. If you need more concurrent clients
connected (and even if you don't - connection creation at the Postgres level is expensive), you can adopt
a load balancer like pgbouncer
to better scale to traffic. More deployment notes to come.
It's also worth noting the absence of request pooling in this initialization. This is a feature of many ORMs that lets you limit
the overall connections you make to Postgres, and re-use these over time. We specifically don't offer request
pooling as part of Iceaxe, despite being supported by our underlying engine asyncpg
. This is a bit more
aligned to how things should be structured in production. Python apps are always bound to one process thanks to
the GIL. So no matter what your connection pool will always be tied to the current Python process / runtime. When you're deploying onto a server with multiple cores, the pool will be duplicated across CPUs and largely defeats the purpose of capping
network connections in the first place.
Benchmarking
We have basic benchmarking tests in the __tests__/benchmarks
directory. To run them, you'll need to execute the pytest suite:
poetry run pytest -m integration_tests
Current benchmarking as of October 11 2024 is:
| raw asyncpg | iceaxe | external overhead | |
---|
TableBase columns | 0.098s | 0.093s | | |
TableBase full | 0.164s | 1.345s | 10%: dict construction | 90%: pydantic overhead |
Development
If you update your Cython implementation during development, you'll need to re-compile the Cython code. This can be done with
a simple poetry install. Poetry is set up to create a dynamic setup.py
based on our build.py
definition.
poetry install
TODOs