Substrait

A Python package for Substrait, the cross-language specification for data compute operations.

Installation

You can install the Python Substrait bindings from PyPI or conda-forge:

pip install substrait
conda install -c conda-forge python-substrait  # or use mamba
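
To confirm that the install succeeded, a quick check of the installed distribution metadata can help. The following is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `substrait`, as the pip command above suggests.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "substrait"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"substrait: {found or 'not installed'}")
```

If the helper returns `None` after a conda install, the distribution may be registered under a different name in that environment; importing `substrait` directly is an alternative check.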

Goals

This project aims to provide a Python interface for the Substrait specification. It will allow users to construct and manipulate a Substrait Plan from Python for evaluation by a Substrait consumer, such as DataFusion or DuckDB.

Non-goals

This project is not an execution engine for Substrait Plans.

Status

This is an experimental package that is still under development.

Example

At the moment, this project contains only generated Python classes for the Substrait protobuf messages. Let's use an existing Substrait producer, Ibis, to provide an example using Python Substrait as the consumer.

Produce a Substrait Plan with Ibis

In [1]: import ibis

In [2]: movie_ratings = ibis.table(
   ...:     [
   ...:         ("tconst", "str"),
   ...:         ("averageRating", "str"),
   ...:         ("numVotes", "str"),
   ...:     ],
   ...:     name="ratings",
   ...: )
   ...:

In [3]: query = movie_ratings.select(
   ...:     movie_ratings.tconst,
   ...:     avg_rating=movie_ratings.averageRating.cast("float"),
   ...:     num_votes=movie_ratings.numVotes.cast("int"),
   ...: )

In [4]: from ibis_substrait.compiler.core import SubstraitCompiler

In [5]: compiler = SubstraitCompiler()

In [6]: protobuf_msg = compiler.compile(query).SerializeToString()

In [7]: type(protobuf_msg)
Out[7]: bytes

Consume the Substrait Plan using Python Substrait

In [8]: import substrait

In [9]: from substrait.gen.proto.plan_pb2 import Plan

In [10]: my_plan = Plan()

In [11]: my_plan.ParseFromString(protobuf_msg)
Out[11]: 186

In [12]: print(my_plan)
relations {
  root {
    input {
      project {
        common {
          emit {
            output_mapping: 3
            output_mapping: 4
            output_mapping: 5
          }
        }
        input {
          read {
            common {
              direct {
              }
            }
            base_schema {
              names: "tconst"
              names: "averageRating"
              names: "numVotes"
              struct {
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                nullability: NULLABILITY_REQUIRED
              }
            }
            named_table {
              names: "ratings"
            }
          }
        }
        expressions {
          selection {
            direct_reference {
              struct_field {
              }
            }
            root_reference {
            }
          }
        }
        expressions {
          cast {
            type {
              fp64 {
                nullability: NULLABILITY_NULLABLE
              }
            }
            input {
              selection {
                direct_reference {
                  struct_field {
                    field: 1
                  }
                }
                root_reference {
                }
              }
            }
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION
          }
        }
        expressions {
          cast {
            type {
              i64 {
                nullability: NULLABILITY_NULLABLE
              }
            }
            input {
              selection {
                direct_reference {
                  struct_field {
                    field: 2
                  }
                }
                root_reference {
                }
              }
            }
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION
          }
        }
      }
    }
    names: "tconst"
    names: "avg_rating"
    names: "num_votes"
  }
}
version {
  minor_number: 24
  producer: "ibis-substrait"
}
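
The `output_mapping` values 3, 4 and 5 in the printed plan follow from how a Substrait project relation numbers its output: the input fields come first (indices 0 to 2 here), the projected expressions are appended after them (indices 3 to 5), and `emit` selects which indices are kept. The following pure-Python sketch illustrates that indexing. It does not use the substrait package, `apply_emit` is a hypothetical helper, and the expression labels are informal descriptions rather than real Substrait output.

```python
def apply_emit(input_fields, expression_fields, output_mapping):
    """Select project-relation outputs by index.

    The candidate columns are the input fields followed by the
    expression results; output_mapping picks from that combined list.
    """
    candidates = list(input_fields) + list(expression_fields)
    return [candidates[i] for i in output_mapping]


# Fields of the "ratings" read relation in the plan above.
input_fields = ["tconst", "averageRating", "numVotes"]
# Informal labels for the three expressions of the project relation.
expression_fields = [
    "tconst",
    "cast(averageRating as fp64)",
    "cast(numVotes as i64)",
]

emitted = apply_emit(input_fields, expression_fields, [3, 4, 5])
print(emitted)
```

Only the three expression results are emitted, and the `names` entries on the root relation then label them `tconst`, `avg_rating` and `num_votes`.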
