
Ibis is the portable Python dataframe library:
See the documentation on "Why Ibis?" to learn more.
You can pip install Ibis with a backend and example data:
pip install 'ibis-framework[duckdb,examples]'
💡 Tip
See the installation guide for more installation options.
Then use Ibis:
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> g = t.group_by("species", "island").agg(count=t.count()).order_by("count")
>>> g
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘
💡 Tip
See the getting started tutorial for a full introduction to Ibis.
For most backends, Ibis works by compiling its dataframe expressions into SQL:
>>> ibis.to_sql(g)
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" ASC
You can mix SQL and Python code:
>>> a = t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2")
>>> a
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Dream     │    56 │
│ Gentoo    │ Biscoe    │   124 │
│ Chinstrap │ Dream     │    68 │
└───────────┴───────────┴───────┘
>>> b = a.order_by("count")
>>> b
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘
This allows you to combine the flexibility of Python with the scale and performance of modern SQL.
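To make the idea of compiling a dataframe expression into SQL more concrete, here is a minimal sketch that uses only the Python standard library. It is not how Ibis is implemented: `GroupCount` and its `to_sql` method are hypothetical names invented for this illustration, and an in-memory SQLite database stands in for a real backend. The sketch shows only the general pattern: describe the query lazily in Python, turn that description into a SQL string, and let the database do the work.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupCount:
    """Toy lazy expression (hypothetical, not an Ibis class).

    Describes "count rows per group, then sort by that count"
    without touching any data.
    """

    table: str
    keys: tuple
    ascending: bool = True

    def to_sql(self) -> str:
        # Identifiers are interpolated directly, so this toy compiler
        # must only be used with trusted table and column names.
        cols = ", ".join(f'"{k}"' for k in self.keys)
        direction = "ASC" if self.ascending else "DESC"
        return (
            f'SELECT {cols}, COUNT(*) AS "count" '
            f'FROM "{self.table}" '
            f"GROUP BY {cols} "
            f'ORDER BY "count" {direction}'
        )


# A tiny made-up table standing in for the penguins example data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE penguins (species TEXT, island TEXT)")
rows = (
    [("Adelie", "Biscoe")] * 2
    + [("Gentoo", "Biscoe")] * 3
    + [("Adelie", "Dream")]
)
con.executemany("INSERT INTO penguins VALUES (?, ?)", rows)

# Building the expression runs no query; it is only a description.
expr = GroupCount("penguins", ("species", "island"))

# Compilation produces plain SQL that the database then executes.
sql = expr.to_sql()
result = con.execute(sql).fetchall()

print(sql)
print(result)
```

Constructing `expr` touches no data. The database is queried only when the generated SQL string is executed, which is the property that lets the heavy lifting happen inside the engine rather than in Python.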
Ibis supports nearly 20 backends:
Most Python dataframes are tightly coupled to their execution engine. And many databases only support SQL, with no Python API. Ibis solves this problem by providing a common API for data manipulation in Python, and compiling that API into the backend's native language. This means you can learn a single API and use it across any supported backend (execution engine).
Ibis broadly supports two types of backend:
To use different backends, you can set the backend Ibis uses:
>>> ibis.set_backend("duckdb")
>>> ibis.set_backend("polars")
>>> ibis.set_backend("datafusion")
Typically, you'll create a connection object:
>>> con = ibis.duckdb.connect()
>>> con = ibis.polars.connect()
>>> con = ibis.datafusion.connect()
And work with tables in that backend:
>>> con.list_tables()
['penguins']
>>> t = con.table("penguins")
You can also read from common file formats like CSV or Apache Parquet:
>>> t = con.read_csv("penguins.csv")
>>> t = con.read_parquet("penguins.parquet")
This allows you to iterate locally and deploy remotely by changing a single line of code.
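The single-line backend swap can be illustrated with a toy example that does not use Ibis at all. It assumes two hypothetical backend classes, `InMemoryBackend` and `SQLiteBackend`, that expose the same `count_by` method. Neither class is part of Ibis, and the real library covers far more operations. The point is only that analysis code written against a shared interface does not change when the engine behind it does.

```python
import sqlite3
from collections import Counter


class InMemoryBackend:
    """Hypothetical backend that computes results in pure Python."""

    def __init__(self, tables):
        self.tables = tables

    def count_by(self, table, key):
        return dict(Counter(row[key] for row in self.tables[table]))


class SQLiteBackend:
    """Hypothetical backend that pushes the same operation down to SQL."""

    def __init__(self, tables):
        self.con = sqlite3.connect(":memory:")
        for name, rows in tables.items():
            columns = list(rows[0])
            col_sql = ", ".join(f'"{c}"' for c in columns)
            marks = ", ".join("?" for _ in columns)
            self.con.execute(f'CREATE TABLE "{name}" ({col_sql})')
            self.con.executemany(
                f'INSERT INTO "{name}" VALUES ({marks})',
                [tuple(row[c] for c in columns) for row in rows],
            )

    def count_by(self, table, key):
        # Identifiers are interpolated directly: trusted names only.
        query = f'SELECT "{key}", COUNT(*) FROM "{table}" GROUP BY "{key}"'
        return dict(self.con.execute(query).fetchall())


def species_counts(backend):
    """Analysis code: written once, unaware of which engine runs it."""
    return backend.count_by("penguins", "species")


# Made-up sample rows, not the real penguins dataset.
data = {
    "penguins": [
        {"species": "Adelie", "island": "Dream"},
        {"species": "Adelie", "island": "Biscoe"},
        {"species": "Gentoo", "island": "Biscoe"},
    ]
}

local = species_counts(InMemoryBackend(data))   # iterate locally
remote = species_counts(SQLiteBackend(data))    # only this line changes

print(local)
print(remote)
```

Both calls return the same counts. In Ibis the equivalent switch is the choice of connection object, as in the `ibis.duckdb.connect()` and `ibis.polars.connect()` lines above.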
💡 Tip
Check out the blog on backend agnostic arrays for one example using the same code across DuckDB and BigQuery.
Ibis is an open source project and welcomes contributions from anyone in the community.
Join our community by interacting on GitHub or chatting with us on Zulip.
For more information visit https://ibis-project.org/.
The Ibis project is an independently governed open source community project to build and maintain the portable Python dataframe library. Ibis has contributors across a range of data companies and institutions.