Tools to work with Amsterdam schema.

amsterdam-schema-tools

Set of libraries and tools to work with Amsterdam schema.

Install the package with: pip install amsterdam-schema-tools

Currently, the following CLI commands are available:

  • schema import events
  • schema import ndjson
  • schema show schema
  • schema show tablenames
  • schema introspect db
  • schema introspect geojson *.geojson
  • schema validate
  • schema permissions apply

The tools expect either a DATABASE_URL environment variable or a command-line option --db-url with a DSN.
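As an illustration of that lookup, here is a minimal sketch of how a DSN could be resolved from either source. It is not the package's own code: the function name resolve_db_url is hypothetical, and the precedence (command-line option first, environment variable second) is an assumption.

```python
import argparse
import os


def resolve_db_url(argv, environ=None):
    """Return the DSN from --db-url if given, else from DATABASE_URL."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--db-url", default=None)
    # Ignore any other arguments; this sketch only cares about --db-url.
    args, _unknown = parser.parse_known_args(argv)
    # Assumption: an explicit command-line option wins over the environment.
    db_url = args.db_url or environ.get("DATABASE_URL")
    if not db_url:
        raise SystemExit("Set DATABASE_URL or pass --db-url with a DSN")
    return db_url
```

Calling resolve_db_url(sys.argv[1:]) would then return whichever DSN is configured, or exit with a message when neither is set.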

The output is JSON Schema that follows the Amsterdam schema definition and describes the tables being processed.

Generate amsterdam schema from existing database tables

The --prefix argument controls whether table prefixes are removed in the schema. Removing them is required for Django models.

As an example, we can generate a BAG schema. Point DATABASE_URL to the bag_v11 database and then run:

schema show tablenames | sort | awk '/^bag_/{print}' | xargs schema introspect db bag --prefix bag_ | jq

Here jq pretty-prints the output, which can be redirected directly to the correct directory in the schemas repository.
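To make the effect of --prefix concrete, the sketch below mimics the table selection and prefix removal in plain Python. It only illustrates the idea and is not how the tool is implemented; the helper strip_table_prefix and the table names are hypothetical.

```python
def strip_table_prefix(table_name, prefix):
    """Drop `prefix` from the start of `table_name` if it is present."""
    if prefix and table_name.startswith(prefix):
        return table_name[len(prefix):]
    return table_name


# Hypothetical table names, as a table listing of the database might return them.
tablenames = ["bag_panden", "bag_woonplaatsen", "spatial_ref_sys"]

# Keep only the bag_ tables (the awk filter above), then strip the prefix.
selected = sorted(name for name in tablenames if name.startswith("bag_"))
stripped = [strip_table_prefix(name, "bag_") for name in selected]
print(stripped)  # ['panden', 'woonplaatsen']
```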

Express amsterdam schema information in relational tables

Amsterdam schema is expressed as jsonschema. However, to make it easier for people with a more relational mindset or toolset, it is possible to express Amsterdam schema as a set of relational tables. These tables are meta_dataset, meta_table and meta_field.

It is possible to convert a jsonschema into the relational table structure and vice versa.

This command converts an existing dataset in jsonschema format into the relational tables:

schema import schema <id of dataset>

To convert from relational tables back to jsonschema:

schema show schema <id of dataset>
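The general idea of the relational form can be sketched as flattening one nested dataset document into rows for three tables. The column names below (id, title, dataset_id, table_id, name, type) are assumptions chosen for illustration and do not reflect the real layout of meta_dataset, meta_table and meta_field; the example dataset is also hypothetical.

```python
def dataset_to_rows(dataset):
    """Flatten a dataset dict into rows for three tables.

    The column names are illustrative assumptions only.
    """
    dataset_row = {"id": dataset["id"], "title": dataset.get("title")}
    table_rows = []
    field_rows = []
    for table in dataset.get("tables", []):
        table_rows.append({"id": table["id"], "dataset_id": dataset["id"]})
        properties = table.get("schema", {}).get("properties", {})
        for name, spec in properties.items():
            field_rows.append(
                {"table_id": table["id"], "name": name, "type": spec.get("type")}
            )
    return dataset_row, table_rows, field_rows


# Hypothetical dataset with a single table and two fields.
example = {
    "id": "bag",
    "title": "BAG",
    "tables": [
        {
            "id": "panden",
            "schema": {
                "properties": {
                    "id": {"type": "string"},
                    "bouwjaar": {"type": "integer"},
                }
            },
        }
    ],
}
dataset_row, table_rows, field_rows = dataset_to_rows(example)
```

Each nesting level of the JSON document becomes one table, and the parent identifier is carried along as a foreign key, which is what makes the conversion reversible.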

Generating amsterdam schema from existing GeoJSON files

The following commands can be used to inspect and import the GeoJSON files:

schema introspect geojson <dataset-id> *.geojson > schema.json
edit schema.json  # fine-tune the table names
schema import geojson schema.json <table1> file1.geojson
schema import geojson schema.json <table2> file2.geojson
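What introspecting a GeoJSON file involves can be sketched as follows: walk the features of a FeatureCollection and derive a field type from each property value. This is a simplified illustration, not the tool's algorithm; the function introspect_features, the type mapping and the rule that the first non-null value decides the type are all assumptions.

```python
import json

# Map Python types of decoded JSON values to JSON Schema type names.
JSON_TYPES = {bool: "boolean", int: "integer", float: "number", str: "string"}


def introspect_features(geojson_text):
    """Return a {property name: {"type": ...}} mapping for a FeatureCollection."""
    collection = json.loads(geojson_text)
    properties = {}
    for feature in collection.get("features", []):
        for name, value in (feature.get("properties") or {}).items():
            # Assumption: the first non-null value decides the type.
            if value is not None and name not in properties:
                properties[name] = {"type": JSON_TYPES.get(type(value), "string")}
    return properties


# Hypothetical input with one point feature.
sample = json.dumps(
    {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [4.9, 52.37]},
                "properties": {"naam": "Dam", "nummer": 1, "actief": True, "opmerking": None},
            }
        ],
    }
)
fields = introspect_features(sample)
```

Properties that are null in every feature get no type in this sketch, which is one reason a generated schema may need manual fine-tuning before import.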

Importing GOB events

The schematools library has a module that reads GOB events into database tables that are defined by an Amsterdam schema. This module can be used to read GOB events from a Kafka stream. It is also possible to read GOB events from a batch file with line-separated events using:

schema import events <path-to-dataset> <path-to-file-with-events>

Schema Tools as a pre-commit hook

Included in the project is a pre-commit hook that can validate schema files in a project such as amsterdam-schema.

To configure it, extend the .pre-commit-config.yaml in the project with the schema file definitions as follows:

  - repo: https://github.com/Amsterdam/schema-tools
    rev: v0.18.1
    hooks:
      - id: validate-schema
        args: ['https://schemas.data.amsterdam.nl/schema@v1.1.1#']
        exclude: 'datasets/index.json$'

args is a one-element list containing the URL of the Amsterdam Meta Schema.

validate-schema will only process JSON files. However, not all JSON files are Amsterdam schema files. To exclude files or directories, use exclude with a pattern.
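pre-commit treats the exclude pattern as a Python regular expression and searches for it within each file path. The sketch below shows the effect of the pattern from the configuration above; the helper files_to_validate, the ".json" suffix check and the example paths are assumptions for illustration.

```python
import re

# The exclude pattern from the configuration above.
EXCLUDE = re.compile(r"datasets/index.json$")


def files_to_validate(paths):
    """Keep JSON files whose path does not match the exclude pattern."""
    return [
        path
        for path in paths
        if path.endswith(".json") and not EXCLUDE.search(path)
    ]


# Hypothetical repository contents.
paths = ["datasets/index.json", "datasets/bag/dataset.json", "README.md"]
kept = files_to_validate(paths)
print(kept)  # ['datasets/bag/dataset.json']
```

Since the pattern is searched for rather than matched against the whole path, a directory can be excluded with a prefix such as '^some/dir/'.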

pre-commit depends on properly tagged revisions of its hooks. Hence we should take care not only to bump version numbers on updates to this package, but also to commit a tag with the version number. This is automated by means of the tbump tool. Bumping a version from 0.18.1 to 0.18.2 and generating the appropriate git commits/tags is as easy as running:

$ tbump 0.18.2
