Wrapper for Great Expectations to fit the requirements of the Gemeente Amsterdam.

Introduction

This repository contains functions that simplify working with Great Expectations. Users supply data and data quality rules, and get validation results in return.

DISCLAIMER: This repo is in the proof-of-concept (PoC) phase.

Getting Started

Install the packages and import the module in your workspace:

pip install great_expectations
pip install dq-suite-amsterdam
import dq_suite
  • Define 'dfs' as a list of the dataframes that need a data quality (DQ) check.
  • Define 'dq_rules' as JSON in the format shown in dq_rules_example.json in this repo.
results, brontabel_df, bronattribute_df, dqRegel_df = dq_suite.df_check(dfs, dq_rules, "showcase")
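
The steps above could be wrapped up roughly as follows. This is a minimal sketch, not part of the package: load_dq_rules and run_checks are hypothetical helper names, and it assumes that df_check accepts the rules as JSON text. Check dq_rules_example.json and the package source to confirm the expected form. The meaning of the third argument ("showcase") is not documented here, so it is passed through unchanged.

```python
import json
from pathlib import Path


def load_dq_rules(path):
    """Read a rules file and fail early if it is not valid JSON.

    Hypothetical helper: returns the raw JSON text, on the assumption
    that df_check takes the rules as a JSON string.
    """
    text = Path(path).read_text(encoding="utf-8")
    json.loads(text)  # raises json.JSONDecodeError on malformed input
    return text


def run_checks(dfs, rules_path, check_name="showcase"):
    """Hypothetical wrapper around the call shown in Getting Started."""
    # Imported lazily so this module can be loaded where dq_suite
    # is not installed.
    import dq_suite

    dq_rules = load_dq_rules(rules_path)
    return dq_suite.df_check(dfs, dq_rules, check_name)


# Usage, with the same four return values as in the call above:
# results, brontabel_df, bronattribute_df, dqRegel_df = run_checks(
#     dfs, "dq_rules_example.json"
# )
```

Validating the JSON before calling the package means a malformed rules file is reported as a parse error rather than as a failure inside Great Expectations.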

Known exceptions

The functions can run on Databricks using a Personal Compute Cluster or a Job Cluster. Using a Shared Compute Cluster results in an error, because it does not have the permissions that Great Expectations requires.

Updates

version = "0.1.0" : dq_rules_example.json is updated. Added: "dataframe_parameters": { "unique_identifier": "id" }

version = "0.2.0" : dq_rules_example.json is updated. Added for each table:

{
  "dataframe_parameters": [
    {
      "unique_identifier": "id",
      "table_name": "well",
      "rules": [
        {
          "rule_name": "expect_column_values_to_be_between",
          "parameters": [
            { "column": "latitude", "min_value": 6, "max_value": 10000 }
          ]
        }
      ]
    },
    ....
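
The 0.2.0 layout can also be built in Python and serialized to JSON. The sketch below covers only the keys visible in the update note above. The real dq_rules_example.json may contain more, and make_rule, make_table_rules and check_structure are hypothetical helpers rather than functions from dq_suite.

```python
import json


def make_rule(rule_name, **parameters):
    """Build one rule entry; parameters are wrapped in a single-item list."""
    return {"rule_name": rule_name, "parameters": [parameters]}


def make_table_rules(table_name, unique_identifier, rules):
    """Build the per-table entry of "dataframe_parameters"."""
    return {
        "unique_identifier": unique_identifier,
        "table_name": table_name,
        "rules": rules,
    }


def check_structure(dq_rules):
    """Return a list of problems found; an empty list means none.

    Only checks the keys shown in the 0.2.0 update note.
    """
    tables = dq_rules.get("dataframe_parameters")
    if not isinstance(tables, list):
        return ['"dataframe_parameters" must be a list']
    problems = []
    for i, table in enumerate(tables):
        for key in ("unique_identifier", "table_name", "rules"):
            if key not in table:
                problems.append(f"table {i}: missing {key!r}")
        for j, rule in enumerate(table.get("rules", [])):
            for key in ("rule_name", "parameters"):
                if key not in rule:
                    problems.append(f"table {i}, rule {j}: missing {key!r}")
    return problems


# Same values as the example in the update note.
dq_rules = {
    "dataframe_parameters": [
        make_table_rules(
            table_name="well",
            unique_identifier="id",
            rules=[
                make_rule(
                    "expect_column_values_to_be_between",
                    column="latitude",
                    min_value=6,
                    max_value=10000,
                )
            ],
        )
    ]
}

dq_rules_json = json.dumps(dq_rules, indent=2)
```

Building the rules from small helpers keeps the nesting consistent across tables, and check_structure catches a missing key before the rules reach Great Expectations.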
