Skip to main content

Building Entity Relationship Diagrams for Google BigQuery.

Project description

bigquery-erd

Entity Relationship Diagram (ERD) Generator for Google BigQuery, based upon eralchemy.

Examples

ERD for a NewsMeme database schema (taken from the original project).

NewsMeme Example

Installation

pip install bigquery-erd

eralchemy requires GraphViz to generate the graphs and Python. Both are available for Windows, Mac and Linux.

Usage

Usage from Python

Find an example notebook here.

But Wait, BigQuey is not a Relation Database?

That's right. You cannot enforce primary or foreign key constraints in BigQuery. However, that doesn't mean that you should not be able to have logical dependencies between tables.

Defining Relations through Column Descriptions

We use the column description field in BigQuery to define relations between columns in a format that we can later parse programmatically.

Let's assume we have a table a with a column id and another table a with a column a_id that serves as a foreign key relation to a.id. We then add the following description to b.a_id:

-> b.id

Defining Relations to Datasets Explicitly

Per default, we assume that the related tables are located inside the same dataset. However, you can also define the datasets explicitly. This is especially useful if the two related tables are not located within the same dataset.

Let's assume that table a is located in dataset d1 and table b is located in d2. The description in b.a_id would then be:

-> d1.a.id

Defining Cardinality Explicitly

Cardinality defines the relationship between two tables. This package understands four different cardinalities:

  • *, meaning "0..N"
  • ?, meaning "{0,1}"
  • +, meaning "1..N"
  • 1, meaning "1"

Per default, we assume a cardinality of *:1. You can also define the relation's cardinality explicitly.

Let's assume that every record in a has at least 1 related record in b. the description in b.a_id would be:

-> +:1 a.id

Example

You can find a example Google BigQuery project for the NewsMeme schema with annotated descriptions here.

Defining Custom Description RegEx

The default RegEx for relations in column descriptions is ->\s([?*+1 ]:[?*+1 ]\s)?(.*\.)?(.*)\.(.*)$. You can define a custom RegEx by setting the GBQ_RELATION_PATTERN environment variable. The RegEx should match four capture groups, where:

  • The first group is the cardinality (which is optional)
  • The second group is the dataset id (which is optional)
  • The third group is the table id
  • The fourth group is the column

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bigquery-erd-0.1.1.tar.gz (7.9 kB view details)

Uploaded Source

Built Distribution

bigquery_erd-0.1.1-py3-none-any.whl (8.4 kB view details)

Uploaded Python 3

File details

Details for the file bigquery-erd-0.1.1.tar.gz.

File metadata

  • Download URL: bigquery-erd-0.1.1.tar.gz
  • Upload date:
  • Size: 7.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.4 CPython/3.8.0 Linux/4.15.0-1077-gcp

File hashes

Hashes for bigquery-erd-0.1.1.tar.gz
Algorithm Hash digest
SHA256 1723bc447c2922439cbf00b4601b1ccaf4b59bdc13ccf1a288bd0002b28fbad8
MD5 0aeebb297e3a1ec30b12c657c2bfb0dc
BLAKE2b-256 753eb1e6ccfec0ce60656f3b896f76c3cd5724c7090e0dc9072a749c077c628d

See more details on using hashes here.

File details

Details for the file bigquery_erd-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: bigquery_erd-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 8.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.4 CPython/3.8.0 Linux/4.15.0-1077-gcp

File hashes

Hashes for bigquery_erd-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 30c66c505e45247cdea5c3a2446c369c8b83757bf53cc1148dc6b318baddbd8b
MD5 7d9f4af9824ff7d5394921f3293ef232
BLAKE2b-256 7566fc13dc6db80792c9a0066e51b1e94d439d9199b51560f07a4fdd1d0f225c

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page