Schema Annotations for Linked Avro Data (SALAD)

Schema Salad

Salad is a schema language for describing JSON or YAML structured linked data documents. A Salad schema specifies the rules for preprocessing, structural validation, and hyperlink checking of the documents it describes. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. Salad provides a bridge between document and record oriented data modeling and the Semantic Web.

The Schema Salad library is Python 3.7+ only.

Installation

pip3 install schema_salad

If you intend to use the schema-salad-tool --codegen=python feature, please include the [pycodegen] extra:

pip3 install schema_salad[pycodegen]

To install from source:

git clone https://github.com/common-workflow-language/schema_salad
cd schema_salad
pip3 install .
# or pip3 install .[pycodegen] if needed

Commands

Schema Salad can be used as a command line tool or imported as a Python module:

$ schema-salad-tool
usage: schema-salad-tool [-h] [--rdf-serializer RDF_SERIALIZER] [--skip-schemas]
                      [--strict-foreign-properties] [--print-jsonld-context]
                      [--print-rdfs] [--print-avro] [--print-rdf] [--print-pre]
                      [--print-index] [--print-metadata] [--print-inheritance-dot]
                      [--print-fieldrefs-dot] [--codegen language] [--codegen-target CODEGEN_TARGET]
                      [--codegen-examples directory] [--codegen-package dotted.package]
                      [--codegen-copyright copyright_string] [--print-oneline]
                      [--print-doc] [--strict | --non-strict]
                      [--verbose | --quiet | --debug] [--only ONLY] [--redirect REDIRECT]
                      [--brand BRAND] [--brandlink BRANDLINK] [--brandstyle BRANDSTYLE]
                      [--brandinverse] [--primtype PRIMTYPE] [--version]
                      [schema] [document]

$ python
>>> import schema_salad

Validate a schema:

$ schema-salad-tool myschema.yml

Validate a document using a schema:

$ schema-salad-tool myschema.yml mydocument.yml
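
The two validation commands above can also be driven from a Python script. The following is a minimal sketch, not part of the Schema Salad API: the helper names build_command and validate are hypothetical, and treating a zero exit status as "valid" is an assumption about the tool's behavior.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(schema: str, document: Optional[str] = None) -> List[str]:
    """Build the argument list for `schema-salad-tool [schema] [document]`."""
    command = ["schema-salad-tool", schema]
    if document is not None:
        command.append(document)
    return command


def validate(schema: str, document: Optional[str] = None) -> Optional[bool]:
    """Run the tool if it is on PATH.

    Returns None when schema-salad-tool is not installed; otherwise
    returns True when the process exits with status 0 (assumed here
    to mean that validation succeeded).
    """
    if shutil.which("schema-salad-tool") is None:
        return None
    completed = subprocess.run(
        build_command(schema, document), capture_output=True, text=True
    )
    return completed.returncode == 0
```

Calling validate("myschema.yml") checks the schema alone, while validate("myschema.yml", "mydocument.yml") checks a document against it.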

Generate HTML documentation:

$ schema-salad-tool --print-doc myschema.yml > myschema.html
$ # or
$ schema-salad-doc myschema.yml > myschema.html

Get JSON-LD context:

$ schema-salad-tool --print-jsonld-context myschema.yml mydocument.yml

Convert a document to JSON-LD:

$ schema-salad-tool --print-pre myschema.yml mydocument.yml > mydocument.jsonld

Generate Python classes for loading/generating documents described by the schema (requires the [pycodegen] extra):

$ schema-salad-tool --codegen=python myschema.yml > myschema.py

Display inheritance relationship between classes as a graphviz ‘dot’ file and render as SVG:

$ schema-salad-tool --print-inheritance-dot myschema.yml | dot -Tsvg > myschema.svg

Quick Start

Let’s say you have a ‘basket’ record that can contain items measured either by weight or by count. Here’s an example:

basket:
  - product: bananas
    price: 0.39
    per: pound
    weight: 1
  - product: cucumbers
    price: 0.79
    per: item
    count: 3

We want to validate that all the expected fields are present, the measurement is known, and that “count” cannot be a fractional value. Here is an example schema to do that:

- name: Product
  doc: |
    The base type for a product.  This is an abstract type, so it
    can't be used directly, but can be used to define other types.
  type: record
  abstract: true
  fields:
    product: string
    price: float

- name: ByWeight
  doc: |
    A product, sold by weight.  Products may be sold by pound or by
    kilogram.  Weights may be fractional.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - pound
          - kilogram
      jsonldPredicate: '#per'
    weight: float

- name: ByCount
  doc: |
    A product, sold by count.  The count must be an integer value.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - item
      jsonldPredicate: '#per'
    count: int

- name: Basket
  doc: |
    A basket of products.  The 'documentRoot' field indicates it is a
    valid starting point for a document.  The 'basket' field will
    validate subtypes of 'Product' (ByWeight and ByCount).
  type: record
  documentRoot: true
  fields:
    basket:
      type:
        type: array
        items: Product

You can check the schema and document in schema_salad/tests/basket_schema.yml and schema_salad/tests/basket.yml:

$ schema-salad-tool basket_schema.yml basket.yml
Document `basket.yml` is valid
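
For readers who want to see the three checks (expected fields present, known measurement, whole-number count) spelled out, the following Python sketch applies them to the basket by hand. It only mirrors the intent of the schema and is not Schema Salad's validation algorithm; the names RULES, matches, and invalid_items are hypothetical, and rejecting items with extra fields is a simplifying choice of this sketch.

```python
from typing import Any, Callable, Dict, List

BASKET = {
    "basket": [
        {"product": "bananas", "price": 0.39, "per": "pound", "weight": 1},
        {"product": "cucumbers", "price": 0.79, "per": "item", "count": 3},
    ]
}


def is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_float(value: Any) -> bool:
    # Whole numbers such as `weight: 1` are accepted where a float is expected.
    return isinstance(value, float) or is_int(value)


def is_str(value: Any) -> bool:
    return isinstance(value, str)


# One rule set per concrete subtype of Product (inherited fields included).
RULES: Dict[str, Dict[str, Callable[[Any], bool]]] = {
    "ByWeight": {
        "product": is_str,
        "price": is_float,
        "per": lambda v: v in ("pound", "kilogram"),
        "weight": is_float,
    },
    "ByCount": {
        "product": is_str,
        "price": is_float,
        "per": lambda v: v == "item",
        "count": is_int,
    },
}


def matches(item: Dict[str, Any], rules: Dict[str, Callable[[Any], bool]]) -> bool:
    """True if the item has exactly the expected fields and each one passes."""
    return set(item) == set(rules) and all(
        check(item[key]) for key, check in rules.items()
    )


def invalid_items(document: Dict[str, Any]) -> List[int]:
    """Return the indexes of basket items that match no concrete subtype."""
    return [
        index
        for index, item in enumerate(document.get("basket", []))
        if not any(matches(item, rules) for rules in RULES.values())
    ]
```

In this sketch invalid_items(BASKET) is empty, while an item sold per 'item' with a count of 2.5 would be reported by its index.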

Documentation

See the specification and the metaschema (salad schema for itself). For an example application of Schema Salad see the Common Workflow Language.

Rationale

The JSON data model is a popular way to represent structured data. It is attractive because of its relative simplicity and is a natural fit with the standard types of many programming languages. However, this simplicity comes at the cost that basic JSON lacks expressive features useful for working with complex data structures and document formats, such as schemas, object references, and namespaces.

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a “context”. JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.
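
As a small illustration of "logically equivalent but structurally distinct", the Python sketch below shows three documents that name the same property in different ways. The expand_keys function is a deliberately simplified stand-in, handling only top-level keys with plain term and prefix mappings, and is not a conforming JSON-LD processor; the http://example.com/ vocabulary is made up for the example.

```python
from typing import Any, Dict


def expand_keys(document: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite top-level keys to full IRIs using the document's @context.

    Simplified on purpose: supports only term -> IRI and prefix -> IRI
    string mappings, and does not recurse into nested values.
    """
    context = document.get("@context", {})
    expanded: Dict[str, Any] = {}
    for key, value in document.items():
        if key == "@context":
            continue
        prefix, _, suffix = key.partition(":")
        if key in context:
            iri = context[key]  # plain term, e.g. "product"
        elif suffix and prefix in context:
            iri = context[prefix] + suffix  # compact IRI, e.g. "ex:product"
        else:
            iri = key  # already a full IRI, or unmapped
        expanded[iri] = value
    return expanded


WITH_TERM = {
    "@context": {"product": "http://example.com/product"},
    "product": "bananas",
}
WITH_PREFIX = {
    "@context": {"ex": "http://example.com/"},
    "ex:product": "bananas",
}
WITH_FULL_IRI = {"http://example.com/product": "bananas"}
```

All three documents expand to the same key and value, yet code that reads document["product"] directly would work on only the first of them.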

Several schema languages exist for describing and validating JSON data, such as JSON Schema and the Apache Avro data serialization system; however, none of them understand linked data. As a result, to take full advantage of JSON-LD to build the next generation of linked data applications, one must maintain separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite significant overlap of content and the obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content, enables generation of JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides for robust support of inline documentation.
