Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.

BQuest

We would like to thank Mike Czech, the original inventor of bquest!

Warning

This library is a work in progress!

Breaking changes should be expected until a 1.0 release, so version pinning is recommended.

Overview

  • Use BQuest in combination with your favorite testing framework (e.g. pytest).

  • Create temporary test tables from JSON or pandas DataFrame.

  • Run BQ configurations and plain SQL queries on your test tables and check the result.

Installation

Via PyPI (standard):

pip install bquest

Via GitHub (most recent):

pip install git+https://github.com/ottogroup/bquest

BQuest also requires a dedicated BigQuery dataset for storing test tables, e.g. defined with Terraform:

resource "google_bigquery_dataset" "bquest" {
  dataset_id                  = "bquest"
  friendly_name               = "bquest"
  description                 = "Source tables for bquest tests"
  location                    = "EU"
  default_table_expiration_ms = 3600000 # one hour
}

We recommend setting an expiration time for tables in the bquest dataset to ensure that test tables are removed after test execution.
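The expiration is given in milliseconds, which is easy to get wrong by a factor of 1000. As a minimal sketch, the value used in the dataset definition can be derived from a readable duration; the helper name `expiration_ms` is hypothetical and not part of bquest.

```python
from datetime import timedelta


def expiration_ms(ttl: timedelta) -> int:
    """Convert a time-to-live into the millisecond value the dataset setting expects."""
    return int(ttl.total_seconds() * 1000)


# One hour, matching the value in the dataset definition
one_hour_ms = expiration_ms(timedelta(hours=1))
print(one_hour_ms)
```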

Example

Given a pandas DataFrame

foo  weight  prediction_date
bar  23      20190301
my   42      20190301

and its table definition

from bquest.tables import BQTableDefinitionBuilder

table_def_builder = BQTableDefinitionBuilder(GOOGLE_PROJECT_ID, dataset="bquest", location="EU")
table_definition = table_def_builder.from_df("abc.feed_latest", df)

you can use the config file ./abc/config.py

{
    "query": """
        SELECT
            foo,
            PARSE_DATE('%Y%m%d', prediction_date)
        FROM
            `{source_table}`
        WHERE
            weight > {THRESHOLD}
    """,
    "start_date": "prediction_date",
    "end_date": "prediction_date",
    "source_tables": {"source_table": "abc.feed_latest"},
    "feature_table_name": "abc.myid",
}

and the runner

from bquest.runner import BQConfigFileRunner, BQConfigRunner

runner = BQConfigFileRunner(
    BQConfigRunner(bq_client, bq_executor_func),
    "config/bq_config",
)

result_df = runner.run_config(
    "20190301",
    "20190308",
    [table_definition],
    "abc/config.py",
    templating_vars={"THRESHOLD": "30"},
)

to assert the result table

assert result_df.shape == (1, 2)
assert result_df.iloc[0]["foo"] == "my"
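To see why these assertions hold, the query can be traced by hand without touching BigQuery. The sketch below is an illustration under assumptions, not bquest's implementation: it assumes the placeholders are filled in `str.format` style, the table id `my-project.bquest.abc_feed_latest` is hypothetical, and `emulate_query` is a made-up helper that mimics the `WHERE` filter and `PARSE_DATE` in plain Python.

```python
from datetime import date, datetime

# Rows of the example table; pd.DataFrame(rows) would build the same DataFrame.
rows = [
    {"foo": "bar", "weight": 23, "prediction_date": "20190301"},
    {"foo": "my", "weight": 42, "prediction_date": "20190301"},
]

QUERY_TEMPLATE = """
    SELECT
        foo,
        PARSE_DATE('%Y%m%d', prediction_date)
    FROM
        `{source_table}`
    WHERE
        weight > {THRESHOLD}
"""

# Assumption: str.format-style substitution; the table id is hypothetical.
rendered = QUERY_TEMPLATE.format(
    source_table="my-project.bquest.abc_feed_latest",
    THRESHOLD="30",
)


def emulate_query(table: list[dict], threshold: int) -> list[tuple[str, date]]:
    """Mimic the query: keep rows above the threshold, return foo and the parsed date."""
    return [
        (row["foo"], datetime.strptime(row["prediction_date"], "%Y%m%d").date())
        for row in table
        if row["weight"] > threshold
    ]


result = emulate_query(rows, threshold=int("30"))
print(rendered)
print(result)
```

Only the row with weight 42 passes the threshold of 30, and the query selects two columns, which is why the result has shape (1, 2) and its first `foo` value is "my".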

Testing

For the actual testing, bquest relies on an accessible BigQuery project, which can be configured with the gcloud client. The corresponding GOOGLE_PROJECT_ID is extracted from this project and used with pandas-gbq to write temporary tables to the bquest dataset, which has to be preconfigured on that project before testing.
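One possible way to obtain the project ID in a test setup is to ask the gcloud CLI for the project of the active configuration. This is a sketch under assumptions and not necessarily how bquest resolves it: it assumes `gcloud` is on the PATH, and the helper name `gcloud_project_id` is hypothetical. The `run` parameter exists only so the call can be stubbed, as in the demonstration at the end.

```python
import subprocess
from types import SimpleNamespace


def gcloud_project_id(run=subprocess.run) -> str:
    """Return the project set in the active gcloud configuration."""
    completed = run(
        ["gcloud", "config", "get-value", "project"],
        capture_output=True,
        text=True,
        check=True,
    )
    project_id = completed.stdout.strip()
    if not project_id:
        # gcloud prints nothing on stdout when no project is set
        raise RuntimeError("No gcloud project configured")
    return project_id


def fake_run(*args, **kwargs):
    """Stub standing in for subprocess.run so the sketch runs without gcloud."""
    return SimpleNamespace(stdout="my-test-project\n")


project_id = gcloud_project_id(run=fake_run)
print(project_id)
```

In a real test session, calling `gcloud_project_id()` without arguments would invoke the CLI, and the returned value could then be passed as GOOGLE_PROJECT_ID to `BQTableDefinitionBuilder`.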

For GitHub CI, we have configured an identity provider in our testing project that allows only core members of this repository to access the testing project's resources.
