Wrapper for Great Expectations to fit the requirements of the Gemeente Amsterdam.
Project description
Introduction
This repository contains functions that simplify working with Great Expectations. Users supply data and data quality rules and receive validation results in return.
DISCLAIMER: This repository is in the proof-of-concept (PoC) phase.
Getting Started
Run the following code in your workspace:
pip install great_expectations
pip install dq-suite-amsterdam
import dq_suite
- Define 'dfs' as a list of dataframes that require a data quality (DQ) check
- Define 'dq_rules' as JSON, following dq_rules_example.json in this repo
results, brontabel_df, bronattribute_df, dqRegel_df = dq_suite.df_check(dfs, dq_rules, "showcase")
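The steps above leave open how 'dq_rules' is produced. A minimal sketch, assuming the rules live in a JSON file laid out like dq_rules_example.json: the helper `load_dq_rules` is hypothetical and not part of dq_suite, and whether `df_check` expects the raw JSON text (as returned here) or a parsed object is an assumption to verify against the package itself.

```python
import json
from pathlib import Path


def load_dq_rules(path):
    """Read a rules file and return its text, failing early on invalid JSON.

    Hypothetical helper for illustration; not part of dq_suite.
    """
    text = Path(path).read_text(encoding="utf-8")
    json.loads(text)  # raises json.JSONDecodeError if the file is not valid JSON
    return text


# Usage (commented out: requires dq_suite and real dataframes in 'dfs'):
# dq_rules = load_dq_rules("dq_rules_example.json")
# results, brontabel_df, bronattribute_df, dqRegel_df = dq_suite.df_check(dfs, dq_rules, "showcase")
```

Parsing the file before handing it over means a malformed rules file fails at load time with a clear error, rather than somewhere inside the check.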
Known exceptions
The functions run on Databricks with a Personal Compute Cluster or a Job Cluster. Using a Shared Compute Cluster results in an error, because it lacks the permissions that Great Expectations requires.
Updates
version = "0.1.0" : dq_rules_example.json is updated. Added: "dataframe_parameters": { "unique_identifier": "id" }
version = "0.2.0" : dq_rules_example.json is updated. Added for each table: { "dataframe_parameters": [ { "unique_identifier": "id", "table_name": "well", "rules": [ { "rule_name": "expect_column_values_to_be_between", "parameters": [ { "column": "latitude", "min_value": 6, "max_value": 10000 } ] } ] }, ....
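The 0.2.0 layout can be sketched as a Python dictionary serialized with `json.dumps`. Only the first table entry is taken from the changelog entry above; the entries elided there are not reproduced. The function `check_rules_layout` is a hypothetical helper, not part of dq_suite, and it checks only the keys visible in that excerpt.

```python
import json

# First table entry as shown in the 0.2.0 changelog entry; further tables are elided there.
dq_rules = {
    "dataframe_parameters": [
        {
            "unique_identifier": "id",
            "table_name": "well",
            "rules": [
                {
                    "rule_name": "expect_column_values_to_be_between",
                    "parameters": [
                        {"column": "latitude", "min_value": 6, "max_value": 10000}
                    ],
                }
            ],
        }
    ]
}


def check_rules_layout(rules):
    """Return a list of layout problems; an empty list means none were found.

    Hypothetical helper: it only checks the keys shown in the changelog excerpt.
    """
    problems = []
    tables = rules.get("dataframe_parameters")
    if not isinstance(tables, list):
        return ["'dataframe_parameters' must be a list of table entries"]
    for i, table in enumerate(tables):
        for key in ("unique_identifier", "table_name", "rules"):
            if key not in table:
                problems.append(f"table {i}: missing '{key}'")
        for j, rule in enumerate(table.get("rules", [])):
            for key in ("rule_name", "parameters"):
                if key not in rule:
                    problems.append(f"table {i}, rule {j}: missing '{key}'")
    return problems


# Serialize to JSON text, then confirm the round-tripped structure still passes.
dq_rules_json = json.dumps(dq_rules, indent=2)
print(check_rules_layout(json.loads(dq_rules_json)))
```

A rules file still in the 0.1.0 layout, where "dataframe_parameters" is a single object, is reported by this check because the 0.2.0 layout expects a list with one entry per table.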
File details
Details for the file dq-suite-amsterdam-0.2.1.tar.gz.
File metadata
- Download URL: dq-suite-amsterdam-0.2.1.tar.gz
- Upload date:
- Size: 5.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | f4bb73016dbad673f63343cfb0973dd50c971cdc21a8d3f93a13cc1a11ab2ebd
MD5 | 4e639f40f4d8048bec7bbe284dca97d4
BLAKE2b-256 | 5e803d9a1e499a35d7eea21cd701ba92b1effba22727c590c28363886989af30
File details
Details for the file dq_suite_amsterdam-0.2.1-py3-none-any.whl.
File metadata
- Download URL: dq_suite_amsterdam-0.2.1-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6d70c93a522099be8dfd5a5b97fe2b7c9c52f20a02977cb99b0b36220ace9392
MD5 | a78799b499c82fe9b130b6b305229175
BLAKE2b-256 | 8f524c9f951e9ec448c5895dcc7e4e38fe7069ced6df9f046037b65fb9efe855