A Python package for Substrait.

Substrait

A Python package for Substrait, the cross-language specification for data compute operations.

Installation

You can install the Python Substrait bindings from PyPI or conda-forge:

pip install substrait
conda install -c conda-forge python-substrait  # or use mamba
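
To confirm that the install worked, a small check along these lines may help. It is a sketch that assumes the installed distribution is registered under the name substrait (the PyPI name); the helper name installed_version is made up for illustration, and the conda-forge package may or may not report the same distribution name.

```python
from importlib import metadata


def installed_version(dist_name: str = "substrait"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"substrait {version}" if version else "substrait is not installed")
```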

Goals

This project aims to provide a Python interface for the Substrait specification. It will allow users to construct and manipulate a Substrait Plan from Python for evaluation by a Substrait consumer, such as DataFusion or DuckDB.
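
As a rough illustration of constructing a plan by hand, the sketch below fills in a few fields of the generated Plan protobuf class. The field names (relations, root, names, input, read, named_table, version, producer) are taken from the printed plan in the Example section of this README; the helper build_minimal_plan and the producer label are hypothetical. The resulting plan carries no schema, so a real consumer would most likely reject it.

```python
def build_minimal_plan(table_name, column_names, producer="example-producer"):
    """Build a bare-bones Plan that names a table and its output columns.

    Illustrative only: no schema or expressions are set.
    """
    # Imported lazily so this sketch can be loaded even where substrait is absent.
    from substrait.gen.proto.plan_pb2 import Plan

    plan = Plan()
    plan.version.producer = producer  # hypothetical producer label

    # Each entry in `relations` can hold a root relation with output names.
    plan_rel = plan.relations.add()
    plan_rel.root.names.extend(column_names)

    # Point the root's input at a named table through a read relation.
    plan_rel.root.input.read.named_table.names.append(table_name)
    return plan
```

Calling build_minimal_plan("ratings", ["tconst"]).SerializeToString() would then produce bytes in the same wire format that a Substrait producer emits.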

Non-goals

This project is not an execution engine for Substrait Plans.

Status

This is an experimental package that is still under development.

Example

At the moment, this project contains only generated Python classes for the Substrait protobuf messages. Let's use an existing Substrait producer, Ibis, to provide an example using Python Substrait as the consumer. The example assumes that the ibis-substrait package is installed alongside Ibis.

Produce a Substrait Plan with Ibis

In [1]: import ibis

In [2]: movie_ratings = ibis.table(
   ...:     [
   ...:         ("tconst", "str"),
   ...:         ("averageRating", "str"),
   ...:         ("numVotes", "str"),
   ...:     ],
   ...:     name="ratings",
   ...: )
   ...:

In [3]: query = movie_ratings.select(
   ...:     movie_ratings.tconst,
   ...:     avg_rating=movie_ratings.averageRating.cast("float"),
   ...:     num_votes=movie_ratings.numVotes.cast("int"),
   ...: )

In [4]: from ibis_substrait.compiler.core import SubstraitCompiler

In [5]: compiler = SubstraitCompiler()

In [6]: protobuf_msg = compiler.compile(query).SerializeToString()

In [7]: type(protobuf_msg)
Out[7]: bytes
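
Because the serialized plan is plain bytes, it can be handed to a consumer through any byte transport. The sketch below uses a file as one possible hand-off; the helper names and any file name you pass in are illustrative and not part of this package.

```python
from pathlib import Path


def save_plan_bytes(data: bytes, path) -> int:
    """Write serialized plan bytes to `path` and return the number of bytes written."""
    return Path(path).write_bytes(data)


def load_plan_bytes(path) -> bytes:
    """Read serialized plan bytes back, ready to pass to Plan.ParseFromString."""
    return Path(path).read_bytes()
```

For example, save_plan_bytes(protobuf_msg, "ratings_plan.bin") in the producer process, followed by load_plan_bytes("ratings_plan.bin") in the consumer process.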

Consume the Substrait Plan using Python Substrait

In [8]: import substrait

In [9]: from substrait.gen.proto.plan_pb2 import Plan

In [10]: my_plan = Plan()

In [11]: my_plan.ParseFromString(protobuf_msg)
Out[11]: 186

In [12]: print(my_plan)
relations {
  root {
    input {
      project {
        common {
          emit {
            output_mapping: 3
            output_mapping: 4
            output_mapping: 5
          }
        }
        input {
          read {
            common {
              direct {
              }
            }
            base_schema {
              names: "tconst"
              names: "averageRating"
              names: "numVotes"
              struct {
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                nullability: NULLABILITY_REQUIRED
              }
            }
            named_table {
              names: "ratings"
            }
          }
        }
        expressions {
          selection {
            direct_reference {
              struct_field {
              }
            }
            root_reference {
            }
          }
        }
        expressions {
          cast {
            type {
              fp64 {
                nullability: NULLABILITY_NULLABLE
              }
            }
            input {
              selection {
                direct_reference {
                  struct_field {
                    field: 1
                  }
                }
                root_reference {
                }
              }
            }
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION
          }
        }
        expressions {
          cast {
            type {
              i64 {
                nullability: NULLABILITY_NULLABLE
              }
            }
            input {
              selection {
                direct_reference {
                  struct_field {
                    field: 2
                  }
                }
                root_reference {
                }
              }
            }
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION
          }
        }
      }
    }
    names: "tconst"
    names: "avg_rating"
    names: "num_votes"
  }
}
version {
  minor_number: 24
  producer: "ibis-substrait"
}
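
Once parsed, the plan can be navigated with ordinary attribute access instead of being printed. The helpers below are a sketch: their names are made up for illustration, and they rely only on the field names visible in the printed plan above.

```python
def root_output_names(plan):
    """Return the output column names of each relation's root, one list per relation."""
    return [list(plan_rel.root.names) for plan_rel in plan.relations]


def read_table_info(read_rel):
    """Return (table name parts, column names) for a read relation."""
    return list(read_rel.named_table.names), list(read_rel.base_schema.names)
```

Going by the printed plan, root_output_names(my_plan) should give the names tconst, avg_rating and num_votes in a single inner list, and read_table_info(my_plan.relations[0].root.input.project.input.read) should give the table name ratings together with the original column names tconst, averageRating and numVotes.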
