Skip to main content

Building Entity Relationship Diagrams for Google BigQuery.

Project description

bigquery-erd

Entity Relationship Diagram (ERD) Generator for Google BigQuery, based upon eralchemy.

Examples

ERD for a NewsMeme database schema (taken from the original project).

NewsMeme Example

Installation

pip install bigquery-erd

eralchemy requires GraphViz to generate the graphs and Python. Both are available for Windows, Mac and Linux.

Usage

Usage from Python

Find an example notebook here.

But Wait, BigQuey is not a Relation Database?

That's right. You cannot enforce primary or foreign key constraints in BigQuery. However, that doesn't mean that you should not be able to have logical dependencies between tables.

Defining Relations through Column Descriptions

We use the column description field in BigQuery to define relations between columns in a format that we can later parse programmatically.

Let's assume we have a table a with a column id and another table a with a column a_id that serves as a foreign key relation to a.id. We then add the following description to b.a_id:

-> b.id

Defining Relations to Datasets Explicitly

Per default, we assume that the related tables are located inside the same dataset. However, you can also define the datasets explicitly. This is especially useful if the two related tables are not located within the same dataset.

Let's assume that table a is located in dataset d1 and table b is located in d2. The description in b.a_id would then be:

-> d1.a.id

Defining Cardinality Explicitly

Cardinality defines the relationship between two tables. This package understands four different cardinalities:

  • *, meaning "0..N"
  • ?, meaning "{0,1}"
  • +, meaning "1..N"
  • 1, meaning "1"

Per default, we assume a cardinality of *:1. You can also define the relation's cardinality explicitly.

Let's assume that every record in a has at least 1 related record in b. the description in b.a_id would be:

-> +:1 a.id

Example

You can find a example Google BigQuery project for the NewsMeme schema with annotated descriptions here.

Defining Custom Description RegEx

The default RegEx for relations in column descriptions is ->\s([?*+1 ]:[?*+1 ]\s)?(.*\.)?(.*)\.(.*)$. You can define a custom RegEx by setting the GBQ_RELATION_PATTERN environment variable. The RegEx should match four capture groups, where:

  • The first group is the cardinality (which is optional)
  • The second group is the dataset id (which is optional)
  • The third group is the table id
  • The fourth group is the column

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bigquery-erd-0.1.0.tar.gz (7.9 kB view details)

Uploaded Source

Built Distribution

bigquery_erd-0.1.0-py3-none-any.whl (8.4 kB view details)

Uploaded Python 3

File details

Details for the file bigquery-erd-0.1.0.tar.gz.

File metadata

  • Download URL: bigquery-erd-0.1.0.tar.gz
  • Upload date:
  • Size: 7.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.4 CPython/3.8.0 Linux/4.15.0-1077-gcp

File hashes

Hashes for bigquery-erd-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f707a0e3069e2e9181830ed1ffecdc1f04726de83eb69098765c27a92e142465
MD5 7727b416451a5c78cc3062fc4aba4243
BLAKE2b-256 6c3588d6c1983f06b4104334a7f58cfcd288de3e9182951d5dfb184e38a8c13c

See more details on using hashes here.

File details

Details for the file bigquery_erd-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: bigquery_erd-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 8.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.4 CPython/3.8.0 Linux/4.15.0-1077-gcp

File hashes

Hashes for bigquery_erd-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0e8df8829c173c98ef8db99419fd47fc3aa98a297d7918eb8613bb28fa6723a8
MD5 5a97da749cf440050cb8bca701924800
BLAKE2b-256 5fb0f792b84138fb1416b339a425af1244a8b7e757c0870cf2aa79736880de0e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page