
Schema Annotations for Linked Avro Data (SALAD)


Schema Salad

Salad is a schema language for describing structured linked data documents in JSON or YAML. A Salad schema defines the rules for preprocessing, structural validation, and hyperlink checking of the documents it describes. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. It provides a bridge between document- and record-oriented data modeling and the Semantic Web.

The Schema Salad library is Python 3.6+ only.

Installation

pip3 install schema_salad

If you intend to use the schema-salad-tool --codegen=python feature, please include the [pycodegen] extra:

pip3 install schema_salad[pycodegen]

To install from source:

git clone https://github.com/common-workflow-language/schema_salad
cd schema_salad
pip3 install .
# or pip3 install .[pycodegen] if needed

Commands

Schema Salad can be used as a command-line tool or imported as a Python module:

$ schema-salad-tool
usage: schema-salad-tool [-h] [--rdf-serializer RDF_SERIALIZER] [--skip-schemas]
                      [--strict-foreign-properties] [--print-jsonld-context]
                      [--print-rdfs] [--print-avro] [--print-rdf] [--print-pre]
                      [--print-index] [--print-metadata] [--print-inheritance-dot]
                      [--print-fieldrefs-dot] [--codegen language] [--codegen-target CODEGEN_TARGET]
                      [--codegen-examples directory] [--codegen-package dotted.package]
                      [--codegen-copyright copyright_string] [--print-oneline]
                      [--print-doc] [--strict | --non-strict]
                      [--verbose | --quiet | --debug] [--only ONLY] [--redirect REDIRECT]
                      [--brand BRAND] [--brandlink BRANDLINK] [--brandstyle BRANDSTYLE]
                      [--brandinverse] [--primtype PRIMTYPE] [--version]
                      [schema] [document]

$ python
>>> import schema_salad

Validate a schema:

$ schema-salad-tool myschema.yml

Validate a document using a schema:

$ schema-salad-tool myschema.yml mydocument.yml
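
When validation has to run from a script, the same command can be driven through Python's standard subprocess module. The following is a minimal sketch rather than an official API: the helper names build_command and is_valid are made up for this example, and it assumes that schema-salad-tool is on the PATH and exits with status 0 when validation succeeds.

```python
import subprocess
from typing import List, Optional


def build_command(schema: str, document: Optional[str] = None) -> List[str]:
    """Build the argument list: the schema first, then an optional document."""
    command = ["schema-salad-tool", schema]
    if document is not None:
        command.append(document)
    return command


def is_valid(schema: str, document: Optional[str] = None) -> bool:
    """Run the tool and report success.

    Assumption: a zero exit status means the schema (and document) are valid.
    """
    result = subprocess.run(build_command(schema, document))
    return result.returncode == 0
```

Calling is_valid("myschema.yml") checks only the schema; passing a second argument also checks a document against it.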

Generate HTML documentation:

$ schema-salad-tool --print-doc myschema.yml > myschema.html
$ # or
$ schema-salad-doc myschema.yml > myschema.html

Get JSON-LD context:

$ schema-salad-tool --print-jsonld-context myschema.yml mydocument.yml

Convert a document to JSON-LD:

$ schema-salad-tool --print-pre myschema.yml mydocument.yml > mydocument.jsonld

Generate Python classes for loading/generating documents described by the schema (requires the [pycodegen] extra):

$ schema-salad-tool --codegen=python myschema.yml > myschema.py

Display the inheritance relationships between classes as a Graphviz ‘dot’ file and render it as SVG:

$ schema-salad-tool --print-inheritance-dot myschema.yml | dot -Tsvg > myschema.svg

Quick Start

Let’s say you have a ‘basket’ record that can contain items measured either by weight or by count. Here’s an example:

basket:
  - product: bananas
    price: 0.39
    per: pound
    weight: 1
  - product: cucumbers
    price: 0.79
    per: item
    count: 3

We want to validate that all the expected fields are present, the measurement is known, and that “count” cannot be a fractional value. Here is an example schema to do that:

- name: Product
  doc: |
    The base type for a product.  This is an abstract type, so it
    can't be used directly, but can be used to define other types.
  type: record
  abstract: true
  fields:
    product: string
    price: float

- name: ByWeight
  doc: |
    A product, sold by weight.  Products may be sold by pound or by
    kilogram.  Weights may be fractional.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - pound
          - kilogram
      jsonldPredicate: '#per'
    weight: float

- name: ByCount
  doc: |
    A product, sold by count.  The count must be an integer value.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - item
      jsonldPredicate: '#per'
    count: int

- name: Basket
  doc: |
    A basket of products.  The 'documentRoot' field indicates it is a
    valid starting point for a document.  The 'basket' field will
    validate subtypes of 'Product' (ByWeight and ByCount).
  type: record
  documentRoot: true
  fields:
    basket:
      type:
        type: array
        items: Product
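
In this schema, 'extends: Product' means that ByWeight and ByCount carry the fields of Product in addition to their own. The plain-Python sketch below only illustrates that idea; it is not how Schema Salad implements inheritance. The SCHEMA table is a hand-written simplification (enum-valued fields are reduced to the placeholder string "enum"), and resolve_fields is a hypothetical helper.

```python
from typing import Dict, Optional

# Simplified, hand-written view of the record types defined above.
SCHEMA: Dict[str, dict] = {
    "Product": {"extends": None, "fields": {"product": "string", "price": "float"}},
    "ByWeight": {"extends": "Product", "fields": {"per": "enum", "weight": "float"}},
    "ByCount": {"extends": "Product", "fields": {"per": "enum", "count": "int"}},
}


def resolve_fields(name: str) -> Dict[str, str]:
    """Return a record's fields, inherited ones first, then its own."""
    record = SCHEMA[name]
    parent: Optional[str] = record["extends"]
    fields = resolve_fields(parent) if parent else {}
    fields.update(record["fields"])
    return fields


print(resolve_fields("ByCount"))
# {'product': 'string', 'price': 'float', 'per': 'enum', 'count': 'int'}
```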

The schema and the document are available in the source tree as schema_salad/tests/basket_schema.yml and schema_salad/tests/basket.yml. To validate the document against the schema:

$ schema-salad-tool basket_schema.yml basket.yml
Document `basket.yml` is valid
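
To make the three rules concrete (expected fields present, known measurement, whole-number count), here is roughly what they amount to when written by hand in plain Python. This is an illustrative sketch only and says nothing about Schema Salad's internals; the MEASURES table and the check_item helper are invented for this example.

```python
from typing import Any, Dict, List

# Each known 'per' symbol maps to the quantity field it requires and the
# Python types accepted for that field (hypothetical table for this sketch).
MEASURES = {
    "pound": ("weight", (int, float)),
    "kilogram": ("weight", (int, float)),
    "item": ("count", (int,)),
}


def check_item(item: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one basket entry (empty if none)."""
    errors = [
        f"missing field '{name}'"
        for name in ("product", "price", "per")
        if name not in item
    ]
    per = item.get("per")
    if per is None:
        return errors
    if per not in MEASURES:
        return errors + [f"unknown measurement '{per}'"]
    quantity, accepted = MEASURES[per]
    value = item.get(quantity)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, accepted):
        errors.append(f"'{quantity}' is missing or has the wrong type")
    return errors


basket = [
    {"product": "bananas", "price": 0.39, "per": "pound", "weight": 1},
    {"product": "cucumbers", "price": 0.79, "per": "item", "count": 3},
]
for entry in basket:
    print(entry["product"], check_item(entry))
```

Both entries from the example basket produce an empty list, while an entry such as 'count: 2.5' sold per item would be reported. The schema expresses the same constraints declaratively, so none of this code has to be written or maintained.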

Documentation

See the specification and the metaschema (salad schema for itself). For an example application of Schema Salad see the Common Workflow Language.

Rationale

The JSON data model is a popular way to represent structured data. It is attractive because of its relative simplicity and is a natural fit with the standard types of many programming languages. However, this simplicity comes at a cost: basic JSON lacks expressive features useful for working with complex data structures and document formats, such as schemas, object references, and namespaces.

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a “context”. JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well-defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON, because there are many ways to express the same data that are logically equivalent but structurally distinct.
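
As an illustration of logically equivalent but structurally distinct data, a property that links to another resource may appear in JSON-LD as a bare string, as an object with an "@id" key, or as a list of such objects. The sketch below uses made-up data and a hypothetical helper, as_id_list. It assumes the context declares the property as link-valued (so that a bare string is an identifier) and covers only these three shapes.

```python
from typing import Any, List


def as_id_list(value: Any) -> List[str]:
    """Normalize a link-valued property to a list of identifier strings."""
    items = value if isinstance(value, list) else [value]
    ids = []
    for item in items:
        # An object form carries the identifier under "@id";
        # a bare string is taken to be the identifier itself.
        ids.append(item["@id"] if isinstance(item, dict) else item)
    return ids


# Three spellings of the same link (example.com data is invented).
variants = [
    "http://example.com/bob",
    {"@id": "http://example.com/bob"},
    [{"@id": "http://example.com/bob"}],
]
for variant in variants:
    print(as_id_list(variant))
```

Code that consumes arbitrary JSON-LD needs this kind of normalization for every property, which is the burden a schema with a fixed structure removes.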

Several schema languages exist for describing and validating JSON data, such as JSON Schema and the Apache Avro data serialization system; however, none of them understands linked data. As a result, to take full advantage of JSON-LD to build the next generation of linked data applications, one must maintain a separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite the significant overlap in content and the obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content, permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content; these annotations enable generation of a JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides robust support for inline documentation.



