Schema Annotations for Linked Avro Data (SALAD)

Schema Salad

Salad is a schema language for describing JSON or YAML structured linked data documents. A Salad schema describes rules for preprocessing, structural validation, and hyperlink checking of the documents it describes. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. Salad provides a bridge between document- and record-oriented data modeling and the Semantic Web.

The Schema Salad library is Python 3.6+ only.

Installation

pip3 install schema_salad

If you intend to use the schema-salad-tool --codegen=python feature, please include the [pycodegen] extra:

pip3 install schema_salad[pycodegen]

To install from source:

git clone https://github.com/common-workflow-language/schema_salad
cd schema_salad
pip3 install .
# or pip3 install .[pycodegen] if needed

Commands

Schema Salad can be used as a command-line tool or imported as a Python module:

$ schema-salad-tool
usage: schema-salad-tool [-h] [--rdf-serializer RDF_SERIALIZER] [--skip-schemas]
                      [--strict-foreign-properties] [--print-jsonld-context]
                      [--print-rdfs] [--print-avro] [--print-rdf] [--print-pre]
                      [--print-index] [--print-metadata] [--print-inheritance-dot]
                      [--print-fieldrefs-dot] [--codegen language] [--codegen-target CODEGEN_TARGET]
                      [--codegen-examples directory] [--codegen-package dotted.package]
                      [--codegen-copyright copyright_string] [--print-oneline]
                      [--print-doc] [--strict | --non-strict]
                      [--verbose | --quiet | --debug] [--only ONLY] [--redirect REDIRECT]
                      [--brand BRAND] [--brandlink BRANDLINK] [--brandstyle BRANDSTYLE]
                      [--brandinverse] [--primtype PRIMTYPE] [--version]
                      [schema] [document]

$ python
>>> import schema_salad

Validate a schema:

$ schema-salad-tool myschema.yml

Validate a document using a schema:

$ schema-salad-tool myschema.yml mydocument.yml
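
The same check can be scripted. The following is a minimal sketch that shells out to the command-line tool from Python rather than using the library API. The helper names `build_command` and `validate` are hypothetical, and the sketch assumes that the tool exits with status 0 when validation succeeds and with a non-zero status otherwise.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(schema: str, document: Optional[str] = None) -> List[str]:
    """Build the argument list: the schema first, then an optional document."""
    cmd = ["schema-salad-tool", schema]
    if document is not None:
        cmd.append(document)
    return cmd


def validate(schema: str, document: Optional[str] = None) -> bool:
    """Run schema-salad-tool and report success.

    Assumption: an exit status of 0 means the schema (and document) are valid.
    """
    if shutil.which("schema-salad-tool") is None:
        raise RuntimeError("schema-salad-tool is not on PATH")
    return subprocess.run(build_command(schema, document)).returncode == 0
```

Calling `validate("myschema.yml", "mydocument.yml")` would then mirror the command shown above, with the tool's own messages still printed to the terminal.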

Generate HTML documentation:

$ schema-salad-tool --print-doc myschema.yml > myschema.html
$ # or
$ schema-salad-doc myschema.yml > myschema.html

Get JSON-LD context:

$ schema-salad-tool --print-jsonld-context myschema.yml mydocument.yml

Convert a document to JSON-LD:

$ schema-salad-tool --print-pre myschema.yml mydocument.yml > mydocument.jsonld

Generate Python classes for loading/generating documents described by the schema (requires the [pycodegen] extra):

$ schema-salad-tool --codegen=python myschema.yml > myschema.py

Display inheritance relationship between classes as a graphviz ‘dot’ file and render as SVG:

$ schema-salad-tool --print-inheritance-dot myschema.yml | dot -Tsvg > myschema.svg

Quick Start

Let’s say you have a ‘basket’ record that can contain items measured either by weight or by count. Here’s an example:

basket:
  - product: bananas
    price: 0.39
    per: pound
    weight: 1
  - product: cucumbers
    price: 0.79
    per: item
    count: 3

We want to validate that all the expected fields are present, that the unit of measurement is known, and that “count” is never a fractional value. Here is an example schema to do that:

- name: Product
  doc: |
    The base type for a product.  This is an abstract type, so it
    can't be used directly, but can be used to define other types.
  type: record
  abstract: true
  fields:
    product: string
    price: float

- name: ByWeight
  doc: |
    A product, sold by weight.  Products may be sold by pound or by
    kilogram.  Weights may be fractional.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - pound
          - kilogram
      jsonldPredicate: '#per'
    weight: float

- name: ByCount
  doc: |
    A product, sold by count.  The count must be an integer value.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - item
      jsonldPredicate: '#per'
    count: int

- name: Basket
  doc: |
    A basket of products.  The 'documentRoot' field indicates it is a
    valid starting point for a document.  The 'basket' field will
    validate subtypes of 'Product' (ByWeight and ByCount).
  type: record
  documentRoot: true
  fields:
    basket:
      type:
        type: array
        items: Product

You can check the schema and document in schema_salad/tests/basket_schema.yml and schema_salad/tests/basket.yml:

$ schema-salad-tool basket_schema.yml basket.yml
Document `basket.yml` is valid
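
To make the rules encoded by the example schema concrete, here is a hand-written sketch of the same checks in plain Python. It is only an illustration of what the schema expresses, not how Schema Salad implements validation; the function names `is_number` and `check_product` are hypothetical, and the sketch assumes that an integer such as `weight: 1` is acceptable where the schema asks for a float.

```python
from typing import Any, Dict, List

# Allowed "per" values for each concrete record type in the example schema.
BY_WEIGHT_UNITS = {"pound", "kilogram"}
BY_COUNT_UNITS = {"item"}


def is_number(value: Any) -> bool:
    """True for int or float, excluding bool (a subclass of int in Python)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_product(item: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one basket entry (empty if valid)."""
    errors: List[str] = []
    # Fields inherited from the abstract Product record.
    if not isinstance(item.get("product"), str):
        errors.append("'product' must be a string")
    if not is_number(item.get("price")):
        errors.append("'price' must be a number")

    per = item.get("per")
    if per in BY_WEIGHT_UNITS:
        # ByWeight: the weight may be fractional.
        if not is_number(item.get("weight")):
            errors.append("'weight' must be a number")
    elif per in BY_COUNT_UNITS:
        # ByCount: the count must be a whole number.
        count = item.get("count")
        if not isinstance(count, int) or isinstance(count, bool):
            errors.append("'count' must be an integer")
    else:
        errors.append(f"unknown measurement {per!r}")
    return errors


basket = [
    {"product": "bananas", "price": 0.39, "per": "pound", "weight": 1},
    {"product": "cucumbers", "price": 0.79, "per": "item", "count": 3},
]
for entry in basket:
    print(entry["product"], check_product(entry) or "ok")
```

Writing these checks by hand for every record type is the repetitive work that a declarative schema replaces: the `extends`, `enum`, and `int` declarations above carry the same information.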

Documentation

See the specification and the metaschema (salad schema for itself). For an example application of Schema Salad see the Common Workflow Language.

Rationale

The JSON data model is a popular way to represent structured data. It is attractive because of its relative simplicity and because it is a natural fit with the standard types of many programming languages. However, this simplicity comes at a cost: basic JSON lacks expressive features useful for working with complex data structures and document formats, such as schemas, object references, and namespaces.

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a “context”. JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well-defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.
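
The point about structurally distinct but logically equivalent documents can be sketched in a few lines of Python. This is a deliberately simplified stand-in for JSON-LD processing, not the JSON-LD expansion algorithm: the `expand_keys` helper is hypothetical, and the `example.com` URIs are placeholders chosen for illustration.

```python
from typing import Any, Dict

# Hypothetical context: maps short keys to full URIs (example.com is a placeholder).
CONTEXT = {
    "name": "http://example.com/vocab#name",
    "homepage": "http://example.com/vocab#homepage",
}


def expand_keys(doc: Dict[str, Any], context: Dict[str, str]) -> Dict[str, Any]:
    """Replace each short key with its URI from the context; keep other keys as-is."""
    return {context.get(key, key): value for key, value in doc.items()}


# Two ways of writing the same data: short keys versus full URIs.
compact = {"name": "Salad", "homepage": "http://example.com/salad"}
expanded = {
    "http://example.com/vocab#homepage": "http://example.com/salad",
    "http://example.com/vocab#name": "Salad",
}

print(compact == expanded)  # False: the documents differ structurally
print(expand_keys(compact, CONTEXT) == expand_keys(expanded, CONTEXT))  # True
```

Code that reads `doc["name"]` directly works on only one of the two forms, which is why a schema that pins down a single expected structure is useful.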

Several schema languages exist for describing and validating JSON data, such as JSON Schema and the Apache Avro data serialization system; however, none of them understands linked data. As a result, to take full advantage of JSON-LD to build the next generation of linked data applications, one must maintain a separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite the significant overlap in content and the obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content, enables generation of JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides for robust support of inline documentation.
