Skip to main content

A python package for Substrait.

Project description

Substrait

A Python package for Substrait, the cross-language specification for data compute operations.

Goals

This project aims to provide a Python interface for the Substrait specification. It will allow users to construct and manipulate a Substrait Plan from Python for evaluation by a Substrait consumer, such as DataFusion or DuckDB.

Non-goals

This project is not an execution engine for Substrait Plans.

Status

This is an experimental package that is still under development.

Example

At the moment, this project contains only generated Python classes for the Substrait protobuf messages. Let's use an existing Substrait producer, Ibis, to provide an example using Python Substrait as the consumer.

Produce a Substrait Plan with Ibis

In [1]: import ibis

In [2]: movie_ratings = ibis.table(
   ...:     [
   ...:         ("tconst", "str"),
   ...:         ("averageRating", "str"),
   ...:         ("numVotes", "str"),
   ...:     ],
   ...:     name="ratings",
   ...: )
   ...:

In [3]: query = movie_ratings.select(
   ...:     movie_ratings.tconst,
   ...:     avg_rating=movie_ratings.averageRating.cast("float"),
   ...:     num_votes=movie_ratings.numVotes.cast("int"),
   ...: )

In [4]: from ibis_substrait.compiler.core import SubstraitCompiler

In [5]: compiler = SubstraitCompiler()

In [6]: protobuf_msg = compiler.compile(query).SerializeToString()

In [7]: type(protobuf_msg)
Out[7]: bytes

Consume the Substrait Plan using Python Substrait

In [8]: import substrait

In [9]: from substrait.gen.proto.plan_pb2 import Plan

In [10]: my_plan = Plan()

In [11]: my_plan.ParseFromString(protobuf_msg)
Out[11]: 186

In [12]: print(my_plan)
relations {
  root {
    input {
      project {
        common {
          emit {
            output_mapping: 3
            output_mapping: 4
            output_mapping: 5
          }
        }
        input {
          read {
            common {
              direct {
              }
            }
            base_schema {
              names: "tconst"
              names: "averageRating"
              names: "numVotes"
              struct {
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                types {
                  string {
                    nullability: NULLABILITY_NULLABLE
                  }
                }
                nullability: NULLABILITY_REQUIRED
              }
            }
            named_table {
              names: "ratings"
            }
          }
        }
        expressions {
          selection {
            direct_reference {
              struct_field {
              }
            }
            root_reference {
            }
          }
        }
        expressions {
          cast {
            type {
              fp64 {
                nullability: NULLABILITY_NULLABLE
              }
            }
            input {
              selection {
                direct_reference {
                  struct_field {
                    field: 1
                  }
                }
                root_reference {
                }
              }
            }
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION
          }
        }
        expressions {
          cast {
            type {
              i64 {
                nullability: NULLABILITY_NULLABLE
              }
            }
            input {
              selection {
                direct_reference {
                  struct_field {
                    field: 2
                  }
                }
                root_reference {
                }
              }
            }
            failure_behavior: FAILURE_BEHAVIOR_THROW_EXCEPTION
          }
        }
      }
    }
    names: "tconst"
    names: "avg_rating"
    names: "num_votes"
  }
}
version {
  minor_number: 24
  producer: "ibis-substrait"
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

substrait-0.2.1.tar.gz (42.8 kB view details)

Uploaded Source

Built Distribution

substrait-0.2.1-py3-none-any.whl (48.2 kB view details)

Uploaded Python 3

File details

Details for the file substrait-0.2.1.tar.gz.

File metadata

  • Download URL: substrait-0.2.1.tar.gz
  • Upload date:
  • Size: 42.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.4

File hashes

Hashes for substrait-0.2.1.tar.gz
Algorithm Hash digest
SHA256 ff2c1ecfdd2af897170a8405ae1108eb0e98db000cda779568400fba4bbd6ece
MD5 386e8411310a387c9278fb5cf74f6730
BLAKE2b-256 46a5eb7440aee6f3919305eb63b09492ed27ea6130ed1722e8a5ab38542da0ca

See more details on using hashes here.

File details

Details for the file substrait-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: substrait-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 48.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.4

File hashes

Hashes for substrait-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 fe721349caf3dd2f6ab281f55d1c6f9fbb809eb227c8d08fbd6e4c967982eb2e
MD5 998f1dec9725ecdd1ea638bd06fd990d
BLAKE2b-256 3d78284c0fc92afbf31506a86c5be79b14a00f3d69f0045f971e79bc220ff363

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page