dbt-ibis

With dbt-ibis you can write your dbt models using Ibis.

This package is in very early development. Things might go wrong. Feedback and contributions are welcome!

Supported adapters:

  • DuckDB
  • Soon to come:
    • Snowflake
    • ... (hopefully all which are supported by both dbt and Ibis)

Basic example

pip install dbt-ibis

You can write your Ibis models in files with the extension .ibis. Each .ibis file must correspond to exactly one model, defined as a function named model that returns an Ibis table expression:

stg_stores.ibis:

from dbt_ibis import depends_on, source


@depends_on(source("sources_db", "stores"))
def model(stores):
    return stores.mutate(store_id=stores["store_id"].cast("int"))

You can now reference the stg_stores model in either a normal SQL model using {{ ref('stg_stores') }} or in another Ibis model:

usa_stores.ibis:

from dbt_ibis import depends_on, ref


@depends_on(ref("stg_stores"))
def model(stores):
    return stores.filter(stores["country"] == "USA")

Whenever your Ibis model references a source, a seed, or a SQL model, you need to define the column data types. For sources and SQL models, follow Model Contracts - getdbt.com, where data_type refers to the data types as they are called by your database system. For seeds, follow Seed configurations - getdbt.com. If you reference another Ibis model, this is not necessary. In the examples above, you would need to provide the data types for the stores source table:

sources:
  - name: sources_db
    ...
    tables:
      - name: stores
        columns:
          - name: store_id
            data_type: varchar
          - name: store_name
            data_type: varchar
          - name: country
            data_type: varchar
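Since a missing data_type is easy to overlook, a small check can help before running anything. The following is a minimal sketch, not part of dbt-ibis: columns_missing_data_type is a hypothetical helper, and it assumes the YAML above has already been loaded into a plain Python dictionary (for example with a YAML parser of your choice).

```python
def columns_missing_data_type(sources_config):
    """Return "source.table.column" names that have no data_type entry.

    Hypothetical helper for illustration; dbt-ibis does not provide it.
    """
    missing = []
    for source in sources_config.get("sources", []):
        for table in source.get("tables", []):
            for column in table.get("columns", []):
                if not column.get("data_type"):
                    missing.append(
                        f'{source["name"]}.{table["name"]}.{column["name"]}'
                    )
    return missing


# The stores source from above, with data_type deliberately left out for
# the country column to show what the check reports.
config = {
    "sources": [
        {
            "name": "sources_db",
            "tables": [
                {
                    "name": "stores",
                    "columns": [
                        {"name": "store_id", "data_type": "varchar"},
                        {"name": "store_name", "data_type": "varchar"},
                        {"name": "country"},
                    ],
                }
            ],
        }
    ]
}

print(columns_missing_data_type(config))  # ['sources_db.stores.country']
```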

For more examples, including column data type definitions, see the demo project.

You can use all the dbt commands you're used to; simply replace dbt with dbt-ibis. For example:

dbt-ibis run --select stg_stores+

You'll notice that for every .ibis file, dbt-ibis generates a corresponding .sql file in a __ibis_sql subfolder. This is because dbt-ibis simply compiles all Ibis expressions to SQL and then lets dbt do its thing. You should not edit those files, as they are overwritten every time you execute a dbt-ibis command. However, they can be useful to look at when you want to debug an expression.
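To find the generated SQL for a given model while debugging, something like the following sketch may help. It assumes that the __ibis_sql subfolder sits next to the .ibis file and that the .sql file shares the model's file name; both are assumptions about the layout, and generated_sql_path and find_generated_sql are hypothetical helpers, not part of dbt-ibis.

```python
from pathlib import Path


def generated_sql_path(ibis_file):
    """Return the assumed location of the .sql file compiled from ibis_file."""
    ibis_file = Path(ibis_file)
    # Assumption: models/foo.ibis -> models/__ibis_sql/foo.sql
    return ibis_file.parent / "__ibis_sql" / f"{ibis_file.stem}.sql"


def find_generated_sql(models_dir):
    """Map every .ibis file under models_dir to its .sql file.

    The value is None when the expected .sql file does not exist yet,
    for example because no dbt-ibis command has been run so far.
    """
    result = {}
    for ibis_file in sorted(Path(models_dir).rglob("*.ibis")):
        sql_file = generated_sql_path(ibis_file)
        result[ibis_file] = sql_file if sql_file.exists() else None
    return result


print(generated_sql_path("models/stg_stores.ibis"))
```

Calling find_generated_sql("models") after a dbt-ibis command lists which models have compiled SQL available to inspect.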

You can also execute dbt-ibis precompile if you only want to compile the .ibis files to .sql files:

# This
dbt-ibis run

# Is the same as
dbt-ibis precompile
dbt run
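If you script these two steps, for example in a CI job, the sequence can be sketched in Python as below. This is only an illustration under stated assumptions: precompile_then_run is a hypothetical wrapper, it assumes both dbt-ibis and dbt are on the PATH, and the runner parameter exists only so the function can be exercised without actually invoking dbt.

```python
import subprocess


def precompile_then_run(dbt_args=(), runner=subprocess.run):
    """Run `dbt-ibis precompile`, then `dbt run` with any extra arguments.

    Hypothetical wrapper for illustration; returns the commands it executed.
    """
    commands = [
        ["dbt-ibis", "precompile"],
        ["dbt", "run", *dbt_args],
    ]
    for command in commands:
        # With subprocess.run, check=True raises CalledProcessError when a
        # step exits with a non-zero status, so `dbt run` is not attempted
        # if precompilation fails.
        runner(command, check=True)
    return commands


# Example call (not executed here, since it would invoke dbt):
# precompile_then_run(["--select", "stg_stores+"])
```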

Editor configuration

You might want to configure your editor to treat .ibis files as normal Python files. In VS Code, you can do this by putting the following into your settings.json file:

    "files.associations": {
        "*.ibis": "python"
    },

Limitations

  • There is no database connection available in the Ibis model functions. Hence, you cannot use Ibis functions which would require this.
  • For sources, seeds, and SQL models, you need to specify the data types of the columns. See "Basic example" above.

Integration with dbt

There are discussions on adding a plugin system to dbt which could be used to provide first-class support for other modeling languages such as Ibis (see this PoC by dbt and the discussion on Ibis as a dataframe API) or PRQL (see dbt-prql).

As this feature didn't make it onto the roadmap of dbt for 2023, I've decided to create dbt-ibis to bridge the time until then. Apart from the limitations mentioned above, I think this approach can scale reasonably well. However, the goal is to migrate to the official plugin system as soon as it's available.

Development

pip install -e '.[dev]'

Tests, linting, etc. will follow.
