The portable Python dataframe library

Ibis

What is Ibis?

Ibis is the portable Python dataframe library.

See the documentation on "Why Ibis?" to learn more.

Getting started

You can pip install Ibis with a backend and example data:

pip install 'ibis-framework[duckdb,examples]'

[!TIP] See the installation guide for more installation options.

Then use Ibis:

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> g = t.group_by(["species", "island"]).agg(count=t.count()).order_by("count")
>>> g
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘

[!TIP] See the getting started tutorial for a full introduction to Ibis.

Python + SQL: better together

For most backends, Ibis works by compiling its dataframe expressions into SQL:

>>> ibis.to_sql(g)
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" ASC

You can mix SQL and Python code:

>>> a = t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2")
>>> a
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Dream     │    56 │
│ Gentoo    │ Biscoe    │   124 │
│ Chinstrap │ Dream     │    68 │
└───────────┴───────────┴───────┘
>>> b = a.order_by("count")
>>> b
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘

This allows you to combine the flexibility of Python with the scale and performance of modern SQL.

Backends

Ibis supports 20+ backends.

How it works

Most Python dataframe libraries are tightly coupled to their execution engine, and many databases support only SQL, with no Python API. Ibis addresses both problems by providing a common Python API for data manipulation and compiling it into each backend’s native language. You can therefore learn a single API and use it across any supported backend (execution engine).

Ibis supports three types of backend:

  1. SQL-generating backends
  2. Expression-generating backends
  3. Naïve execution backends
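
To make the first category concrete, here is a deliberately tiny toy sketch of the SQL-generating idea, using only the Python standard library. It is not how Ibis is implemented: the `Table` and `GroupCount` classes and the `execute` helper are hypothetical names invented for this illustration, and SQLite stands in for the backend. The point is only that the Python object describes a query, and the database does the work once that description is compiled to SQL:

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """A reference to a named table; holds no data itself."""

    name: str

    def group_count(self, *keys: str) -> "GroupCount":
        return GroupCount(self, keys)


@dataclass(frozen=True)
class GroupCount:
    """A deferred 'count rows per group' expression."""

    source: Table
    keys: tuple

    def to_sql(self) -> str:
        cols = ", ".join(f'"{k}"' for k in self.keys)
        return (
            f'SELECT {cols}, COUNT(*) AS "count" '
            f'FROM "{self.source.name}" '
            f'GROUP BY {cols} ORDER BY "count", {cols}'
        )


def execute(expr: GroupCount, con: sqlite3.Connection) -> list:
    """Compile the expression to SQL and let the database run it."""
    return con.execute(expr.to_sql()).fetchall()


# Made-up sample rows in an in-memory SQLite database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE penguins (species TEXT, island TEXT)")
con.executemany(
    "INSERT INTO penguins VALUES (?, ?)",
    [
        ("Adelie", "Biscoe"),
        ("Adelie", "Dream"),
        ("Adelie", "Dream"),
        ("Gentoo", "Biscoe"),
    ],
)

expr = Table("penguins").group_count("species", "island")  # nothing runs yet
sql = expr.to_sql()
rows = execute(expr, con)
print(sql)
print(rows)
```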

Ibis backend types

Portability

To use different backends, you can set the backend Ibis uses:

>>> ibis.set_backend("duckdb")
>>> ibis.set_backend("polars")
>>> ibis.set_backend("datafusion")

Typically, you'll create a connection object:

>>> con = ibis.duckdb.connect()
>>> con = ibis.polars.connect()
>>> con = ibis.datafusion.connect()

And work with tables in that backend:

>>> con.list_tables()
['penguins']
>>> t = con.table("penguins")

You can also read from common file formats like CSV or Apache Parquet:

>>> t = con.read_csv("penguins.csv")
>>> t = con.read_parquet("penguins.parquet")

This allows you to iterate locally and deploy remotely by changing a single line of code.

[!TIP] Check out the blog on backend agnostic arrays for one example using the same code across DuckDB and BigQuery.

Community and contributing

Ibis is an open source project and welcomes contributions from anyone in the community.

Join our community by interacting on GitHub or chatting with us on Zulip.

For more information visit https://ibis-project.org/.

