Wrapper for Great Expectations to fit the requirements of the Gemeente Amsterdam.

Introduction

DISCLAIMER: This repo is in the PoC phase.

DISCLAIMER: The functions can run on Databricks using a Personal Compute Cluster.

This repository contains functions that ease the use of Great Expectations. Users supply data and data quality rules and get results in return.

Getting Started

Prerequisites:

Run the following code in your workspace:

pip install great_expectations

When working in Databricks you can clone this repo to Databricks Repos. Then you can access it in your workspace using:

import sys
sys.path.append("/Workspace/Repos/{user}/{repo_name}")
from {file} import {function}

Parameter examples:

  • user: j.cruijff@amsterdam.nl
  • repo_name: dq_repo
  • file: df_checker
  • function: df_check
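With the example parameters filled in, the import pattern can be sketched as follows. The helper name `add_repo_to_path` is hypothetical and not part of this package. The final import only works on a Databricks workspace where the repo has been cloned, so it is shown as a comment.

```python
import sys


def add_repo_to_path(user: str, repo_name: str) -> str:
    """Build the Databricks Repos path and put it on sys.path (hypothetical helper)."""
    repo_path = f"/Workspace/Repos/{user}/{repo_name}"
    # Avoid adding the same path twice when a notebook cell is re-run.
    if repo_path not in sys.path:
        sys.path.append(repo_path)
    return repo_path


path = add_repo_to_path("j.cruijff@amsterdam.nl", "dq_repo")
print(path)

# On Databricks, with the repo cloned, the import would then be:
# from df_checker import df_check
```

Guarding against duplicates keeps `sys.path` clean when the same notebook cell runs more than once.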

Updates

version = "0.1.0": dq_rules_example.json is updated. Added:

    "dataframe_parameters": {
        "unique_identifier": "id"
    }

version = "0.2.0": dq_rules_example.json is updated. Added for each table:

    {
        "dataframe_parameters": [
            {
                "unique_identifier": "id",
                "table_name": "well",
                "rules": [
                    {
                        "rule_name": "expect_column_values_to_be_between",
                        "parameters": [
                            {
                                "column": "latitude",
                                "min_value": 6,
                                "max_value": 10000
                            }
                        ]
                    }
                ]
            },
            ....
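To illustrate the 0.2.0 layout, the sketch below parses a rules document of that shape and lists the rules per table. The snippet in this README is truncated, so the closing brackets used here are an assumption, as are the helper names `list_rules` and `is_between`. The range check is a plain-Python stand-in for what `expect_column_values_to_be_between` expresses (inclusive bounds by default), not a call into Great Expectations.

```python
import json

# Assumed complete form of the 0.2.0 example (the README snippet is truncated).
DQ_RULES_JSON = """
{
  "dataframe_parameters": [
    {
      "unique_identifier": "id",
      "table_name": "well",
      "rules": [
        {
          "rule_name": "expect_column_values_to_be_between",
          "parameters": [
            {"column": "latitude", "min_value": 6, "max_value": 10000}
          ]
        }
      ]
    }
  ]
}
"""


def list_rules(dq_rules: dict) -> list[tuple[str, str, dict]]:
    """Flatten the rules into (table_name, rule_name, parameters) tuples."""
    flat = []
    for table in dq_rules["dataframe_parameters"]:
        for rule in table["rules"]:
            for params in rule["parameters"]:
                flat.append((table["table_name"], rule["rule_name"], params))
    return flat


def is_between(value: float, min_value: float, max_value: float) -> bool:
    """Plain-Python stand-in for the inclusive range check the rule describes."""
    return min_value <= value <= max_value


rules = list_rules(json.loads(DQ_RULES_JSON))
for table_name, rule_name, params in rules:
    print(table_name, rule_name, params)
```

Flattening the nested structure first makes it easy to see which rule applies to which table and column before any validation is run.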

Download files

Source Distribution

dq-suite-amsterdam-0.2.0.tar.gz (5.0 kB)

Uploaded Source

Built Distribution

dq_suite_amsterdam-0.2.0-py3-none-any.whl (6.0 kB)

Uploaded Python 3

File details

Details for the file dq-suite-amsterdam-0.2.0.tar.gz.

File metadata

  • Download URL: dq-suite-amsterdam-0.2.0.tar.gz
  • Size: 5.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for dq-suite-amsterdam-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       8a279dc84e4f08f04c7a2a0ea02a29b9b557292313f8aa1abb8863755529d623
MD5          3eb31bae2b8a5ab378b0212f02a02971
BLAKE2b-256  a1b427c352f8246cd6f066b31da671c74e6c695e2896cb82269f886d90294eff

File details

Details for the file dq_suite_amsterdam-0.2.0-py3-none-any.whl.

File hashes

Hashes for dq_suite_amsterdam-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       27dba595d02561d3198f16822c960e5ed9454468440b9b58bf022902937b1ba0
MD5          f666c7bbe5ca7db3d9d6a1b72aacd9a7
BLAKE2b-256  1adc5e7c318b4c2ea480b60842e87235aab54b3500eef4446bec19d50bd7f94f
