The portable Python dataframe library

Project description

Ibis

What is Ibis?

Ibis is the portable Python dataframe library.

See the documentation on "Why Ibis?" to learn more.

Getting started

You can pip install Ibis with a backend and example data:

pip install 'ibis-framework[duckdb,examples]'

💡 Tip

See the installation guide for more installation options.
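
To confirm that the install succeeded, one option is to ask Python's packaging metadata for the installed version. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration, while `ibis-framework` is the distribution name from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the pip command succeeded, otherwise None.
print(installed_version("ibis-framework"))
```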

Then use Ibis:

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> g = t.group_by("species", "island").agg(count=t.count()).order_by("count")
>>> g
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘

💡 Tip

See the getting started tutorial for a full introduction to Ibis.

Python + SQL: better together

For most backends, Ibis works by compiling its dataframe expressions into SQL:

>>> ibis.to_sql(g)
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" ASC

You can mix SQL and Python code:

>>> a = t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2")
>>> a
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Dream     │    56 │
│ Gentoo    │ Biscoe    │   124 │
│ Chinstrap │ Dream     │    68 │
└───────────┴───────────┴───────┘
>>> b = a.order_by("count")
>>> b
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘

This allows you to combine the flexibility of Python with the scale and performance of modern SQL.
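
The query printed by `ibis.to_sql(g)` earlier in this section is ordinary SQL, so an engine that understands the dialect can run it directly. As a rough illustration, the sketch below runs the same query shape with Python's built-in `sqlite3` module against a tiny hand-made `penguins` table. The three rows are invented for the example and are not the real dataset, and SQLite stands in for a backend here only because it ships with Python.

```python
import sqlite3

# Same query shape as the ibis.to_sql(g) output shown earlier.
QUERY = """
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" ASC
"""

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE "penguins" (species TEXT, island TEXT)')
# Invented rows, just enough to exercise GROUP BY and ORDER BY.
con.executemany(
    'INSERT INTO "penguins" VALUES (?, ?)',
    [("Adelie", "Dream"), ("Adelie", "Dream"), ("Gentoo", "Biscoe")],
)
rows = con.execute(QUERY).fetchall()
con.close()

print(rows)  # [('Gentoo', 'Biscoe', 1), ('Adelie', 'Dream', 2)]
```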

Backends

Ibis supports nearly 20 backends, including DuckDB, Polars, and DataFusion.

How it works

Most Python dataframes are tightly coupled to their execution engine, and many databases support only SQL, with no Python API. Ibis addresses both problems by providing a common Python API for data manipulation and compiling it into the backend's native language. You can therefore learn a single API and use it with any supported backend (execution engine).

Ibis broadly supports two types of backend:

  1. SQL-generating backends
  2. DataFrame-generating backends
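
To make the two backend types concrete, here is a toy sketch of one expression evaluated in both styles. It is not how Ibis is implemented: `CountBy`, `run_sql`, and `run_in_memory` are hypothetical names, SQLite stands in for a SQL engine, plain Python lists stand in for a dataframe engine, and the rows are invented.

```python
import sqlite3
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CountBy:
    """A tiny 'expression': count the rows of `table` per value of `key`."""

    table: str
    key: str


def run_sql(expr: CountBy, con: sqlite3.Connection) -> list[tuple[str, int]]:
    """SQL-generating style: compile to a SQL string and let the engine run it."""
    sql = (
        f'SELECT "{expr.key}", COUNT(*) FROM "{expr.table}" '
        f'GROUP BY "{expr.key}" ORDER BY "{expr.key}"'
    )
    return con.execute(sql).fetchall()


def run_in_memory(expr: CountBy, tables: dict[str, list[dict]]) -> list[tuple[str, int]]:
    """DataFrame-generating style: translate into calls on in-memory data."""
    counts = Counter(row[expr.key] for row in tables[expr.table])
    return sorted(counts.items())


# Invented data, loaded into both "engines".
data = [{"species": "Adelie"}, {"species": "Gentoo"}, {"species": "Adelie"}]

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE "penguins" (species TEXT)')
con.executemany('INSERT INTO "penguins" VALUES (?)', [(r["species"],) for r in data])

expr = CountBy(table="penguins", key="species")
sql_result = run_sql(expr, con)
memory_result = run_in_memory(expr, {"penguins": data})
con.close()

print(sql_result == memory_result)  # True
```

The expression object carries no engine-specific detail, which is what lets the same description of a query be handed to either executor.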

Figure: Ibis backend types

Portability

To use different backends, you can set the backend Ibis uses:

>>> ibis.set_backend("duckdb")
>>> ibis.set_backend("polars")
>>> ibis.set_backend("datafusion")

Typically, you'll create a connection object:

>>> con = ibis.duckdb.connect()
>>> con = ibis.polars.connect()
>>> con = ibis.datafusion.connect()

And work with tables in that backend:

>>> con.list_tables()
['penguins']
>>> t = con.table("penguins")

You can also read from common file formats like CSV or Apache Parquet:

>>> t = con.read_csv("penguins.csv")
>>> t = con.read_parquet("penguins.parquet")

This allows you to iterate locally and deploy remotely by changing a single line of code.
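
One way to keep that single line out of the code entirely is to choose the backend from configuration. The sketch below is an assumption-laden illustration, not an Ibis feature: the `IBIS_DEMO_BACKEND` environment variable and the helpers `pick_backend` and `connect` are hypothetical, and only the three backends named above are accepted. The `connect` helper relies on the `ibis.<backend>.connect()` pattern shown earlier.

```python
import os
from typing import Mapping

# Backends named in the examples above; Ibis supports others as well.
KNOWN_BACKENDS = ("duckdb", "polars", "datafusion")


def pick_backend(env: Mapping[str, str], default: str = "duckdb") -> str:
    """Read the backend name from the hypothetical IBIS_DEMO_BACKEND variable."""
    name = env.get("IBIS_DEMO_BACKEND", default).lower()
    if name not in KNOWN_BACKENDS:
        raise ValueError(f"unsupported backend {name!r}; expected one of {KNOWN_BACKENDS}")
    return name


def connect(name: str):
    """Open a default connection, e.g. ibis.duckdb.connect() for "duckdb"."""
    import ibis  # imported lazily so pick_backend works without Ibis installed

    return getattr(ibis, name).connect()


backend_name = pick_backend(os.environ)
print(backend_name)
```

With Ibis and the chosen backend installed, `con = connect(backend_name)` then yields a connection object that can be used as in the examples above.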

💡 Tip

Check out the blog post on backend-agnostic arrays for an example that uses the same code across DuckDB and BigQuery.

Community and contributing

Ibis is an open source project and welcomes contributions from anyone in the community.

Join our community by interacting on GitHub or chatting with us on Zulip.

For more information visit https://ibis-project.org/.

Governance

The Ibis project is an independently governed open source community project to build and maintain the portable Python dataframe library. Ibis has contributors across a range of data companies and institutions.

Download files

Source Distribution

ibis_framework-10.0.0.dev148.tar.gz (1.2 MB)

Uploaded Source

Built Distribution

ibis_framework-10.0.0.dev148-py3-none-any.whl (1.9 MB)

Uploaded Python 3

File details

Details for the file ibis_framework-10.0.0.dev148.tar.gz.

File metadata

  • Download URL: ibis_framework-10.0.0.dev148.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.10.15 Linux/6.5.0-1025-azure

File hashes

Hashes for ibis_framework-10.0.0.dev148.tar.gz
Algorithm    Hash digest
SHA256       6b6e58e226c09472989085f7277602cf3afc63dcdfc64d2c456e3680126bd74c
MD5          86b6ae6ff3798e643af9d7ba3bac139d
BLAKE2b-256  e366715286875131dd437cb6574457f3fb67105896f3bf3dabb5b88e0fdacabc

File details

Details for the file ibis_framework-10.0.0.dev148-py3-none-any.whl.

File hashes

Hashes for ibis_framework-10.0.0.dev148-py3-none-any.whl
Algorithm    Hash digest
SHA256       2e8f99b678a9555c927dfc220fdb302da54cc96782ec9f9681b1cbfd94298500
MD5          32f6cfe11734eb8ada6ac18a07819ceb
BLAKE2b-256  aedf7544f2448adb35f75a3cf58dd95bf22927e971f3a4ac256488bd1da31758
