Schema Annotations for Linked Avro Data (SALAD)

Schema Salad

Salad is a schema language for describing JSON or YAML structured linked data documents. A Salad schema describes rules for preprocessing, structural validation, and hyperlink checking of the documents it covers. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. Salad provides a bridge between document- and record-oriented data modeling and the Semantic Web.

The Schema Salad library is Python 3.6+ only.

Installation

pip3 install schema_salad

If you intend to use the schema-salad-tool --codegen=python feature, please include the [pycodegen] extra:

pip3 install schema_salad[pycodegen]

To install from source:

git clone https://github.com/common-workflow-language/schema_salad
cd schema_salad
pip3 install .
# or pip3 install .[pycodegen] if needed

Commands

Schema Salad can be used as a command line tool or imported as a Python module:

$ schema-salad-tool
usage: schema-salad-tool [-h] [--rdf-serializer RDF_SERIALIZER] [--skip-schemas]
                      [--strict-foreign-properties] [--print-jsonld-context]
                      [--print-rdfs] [--print-avro] [--print-rdf] [--print-pre]
                      [--print-index] [--print-metadata] [--print-inheritance-dot]
                      [--print-fieldrefs-dot] [--codegen language] [--codegen-target CODEGEN_TARGET]
                      [--codegen-examples directory] [--codegen-package dotted.package]
                      [--codegen-copyright copyright_string] [--print-oneline]
                      [--print-doc] [--strict | --non-strict]
                      [--verbose | --quiet | --debug] [--only ONLY] [--redirect REDIRECT]
                      [--brand BRAND] [--brandlink BRANDLINK] [--brandstyle BRANDSTYLE]
                      [--brandinverse] [--primtype PRIMTYPE] [--version]
                      [schema] [document]

$ python
>>> import schema_salad

Validate a schema:

$ schema-salad-tool myschema.yml

Validate a document using a schema:

$ schema-salad-tool myschema.yml mydocument.yml
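
The two validation commands above can also be driven from a Python script. The following is a minimal sketch that shells out to the command line tool rather than using the library's Python API. The helper names `build_command` and `run_tool` are hypothetical and not part of Schema Salad; only the executable name and the argument order (flags, then schema, then optional document) are taken from the usage text above.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(schema: str, document: Optional[str] = None,
                  flags: Optional[List[str]] = None) -> List[str]:
    """Assemble a schema-salad-tool command line (hypothetical helper).

    Flags come first, then the schema, then the optional document,
    mirroring the usage text printed by the tool.
    """
    cmd = ["schema-salad-tool"]
    cmd.extend(flags or [])
    cmd.append(schema)
    if document is not None:
        cmd.append(document)
    return cmd


def run_tool(schema: str, document: Optional[str] = None,
             flags: Optional[List[str]] = None):
    """Run the tool if it is on PATH; return None when it is not installed."""
    if shutil.which("schema-salad-tool") is None:
        return None
    # stdout/stderr are captured as text so the caller can inspect them.
    return subprocess.run(
        build_command(schema, document, flags),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
```

Calling `run_tool("myschema.yml", "mydocument.yml")` returns a `subprocess.CompletedProcess` whose `returncode`, `stdout`, and `stderr` hold whatever the tool reported, or `None` if the tool is not installed.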

Generate HTML documentation:

$ schema-salad-tool --print-doc myschema.yml > myschema.html
$ # or
$ schema-salad-doc myschema.yml > myschema.html

Get JSON-LD context:

$ schema-salad-tool --print-jsonld-context myschema.yml mydocument.yml

Convert a document to JSON-LD:

$ schema-salad-tool --print-pre myschema.yml mydocument.yml > mydocument.jsonld

Generate Python classes for loading/generating documents described by the schema (requires the [pycodegen] extra):

$ schema-salad-tool --codegen=python myschema.yml > myschema.py

Display inheritance relationship between classes as a graphviz ‘dot’ file and render as SVG:

$ schema-salad-tool --print-inheritance-dot myschema.yml | dot -Tsvg > myschema.svg

Quick Start

Let’s say you have a ‘basket’ record that can contain items measured either by weight or by count. Here’s an example:

basket:
  - product: bananas
    price: 0.39
    per: pound
    weight: 1
  - product: cucumbers
    price: 0.79
    per: item
    count: 3

We want to validate that all the expected fields are present, the measurement is known, and that “count” cannot be a fractional value. Here is an example schema to do that:

- name: Product
  doc: |
    The base type for a product.  This is an abstract type, so it
    can't be used directly, but can be used to define other types.
  type: record
  abstract: true
  fields:
    product: string
    price: float

- name: ByWeight
  doc: |
    A product, sold by weight.  Products may be sold by pound or by
    kilogram.  Weights may be fractional.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - pound
          - kilogram
      jsonldPredicate: '#per'
    weight: float

- name: ByCount
  doc: |
    A product, sold by count.  The count must be an integer value.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - item
      jsonldPredicate: '#per'
    count: int

- name: Basket
  doc: |
    A basket of products.  The 'documentRoot' field indicates it is a
    valid starting point for a document.  The 'basket' field will
    validate subtypes of 'Product' (ByWeight and ByCount).
  type: record
  documentRoot: true
  fields:
    basket:
      type:
        type: array
        items: Product
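
To make the role of `extends` and `abstract` in this schema concrete, here is a small sketch in plain Python. It is an illustration of the idea only, not how Schema Salad resolves inheritance internally: it assumes a single parent per record, flattens the enum types to the string "enum", and uses the hypothetical helper names `all_fields` and `concrete_subtypes`.

```python
from typing import Dict, List

# A simplified copy of the records above; field types are plain strings
# and the enum definitions are collapsed to "enum" for brevity.
SCHEMA = [
    {"name": "Product", "abstract": True,
     "fields": {"product": "string", "price": "float"}},
    {"name": "ByWeight", "extends": "Product",
     "fields": {"per": "enum", "weight": "float"}},
    {"name": "ByCount", "extends": "Product",
     "fields": {"per": "enum", "count": "int"}},
]


def all_fields(name: str, schema: List[dict]) -> Dict[str, str]:
    """Return the record's own fields merged with those of its ancestors."""
    by_name = {record["name"]: record for record in schema}
    record = by_name[name]
    fields: Dict[str, str] = {}
    parent = record.get("extends")
    if parent is not None:
        fields.update(all_fields(parent, schema))
    fields.update(record.get("fields", {}))
    return fields


def concrete_subtypes(base: str, schema: List[dict]) -> List[str]:
    """List non-abstract records that have `base` somewhere in their ancestry."""
    by_name = {record["name"]: record for record in schema}
    result = []
    for record in schema:
        current = record
        while current.get("extends") is not None:
            current = by_name[current["extends"]]
            if current["name"] == base:
                if not record.get("abstract"):
                    result.append(record["name"])
                break
    return result


print(all_fields("ByWeight", SCHEMA))
print(concrete_subtypes("Product", SCHEMA))
```

Under these assumptions, an array declared with `items: Product` accepts the concrete records returned by `concrete_subtypes("Product", SCHEMA)`, each carrying the inherited `product` and `price` fields.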

You can check the schema and document in schema_salad/tests/basket_schema.yml and schema_salad/tests/basket.yml:

$ schema-salad-tool basket_schema.yml basket.yml
Document `basket.yml` is valid
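
For readers who want to see the rules spelled out, the sketch below hand-codes the three checks the Quick Start asks for (expected fields present, known measurement, integer count) in plain Python. It is only an illustration of what the schema expresses. It does not use Schema Salad, it ignores extra fields and link checking, and the function names `check_item` and `check_basket` are hypothetical. Whole numbers are accepted for `float` fields because the example document uses `weight: 1`.

```python
from typing import Any, Dict, List

WEIGHT_UNITS = {"pound", "kilogram"}  # symbols of the ByWeight enum
COUNT_UNITS = {"item"}                # symbols of the ByCount enum


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_int(value: Any) -> bool:
    return isinstance(value, int) and not isinstance(value, bool)


def check_item(item: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one basket entry."""
    errors = []
    if not isinstance(item.get("product"), str):
        errors.append("product must be a string")
    if not _is_number(item.get("price")):
        errors.append("price must be a number")
    per = item.get("per")
    if isinstance(per, str) and per in WEIGHT_UNITS:
        if not _is_number(item.get("weight")):
            errors.append("weight must be a number")
    elif isinstance(per, str) and per in COUNT_UNITS:
        if not _is_int(item.get("count")):
            errors.append("count must be an integer")
    else:
        errors.append("per must be one of pound, kilogram, item")
    return errors


def check_basket(doc: Dict[str, Any]) -> List[str]:
    """Check every entry under the 'basket' key and prefix errors with its index."""
    errors = []
    for index, item in enumerate(doc.get("basket", [])):
        errors.extend(f"basket[{index}]: {error}" for error in check_item(item))
    return errors


basket = {"basket": [
    {"product": "bananas", "price": 0.39, "per": "pound", "weight": 1},
    {"product": "cucumbers", "price": 0.79, "per": "item", "count": 3},
]}
print(check_basket(basket))
```

Writing these checks by hand for every document type is the kind of work a schema is meant to replace: the schema states the same constraints declaratively and also carries the documentation.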

Documentation

See the specification and the metaschema (salad schema for itself). For an example application of Schema Salad see the Common Workflow Language.

Rationale

The JSON data model is a popular way to represent structured data. It is attractive because of its relative simplicity and is a natural fit with the standard types of many programming languages. However, this simplicity comes at the cost that basic JSON lacks expressive features useful for working with complex data structures and document formats, such as schemas, object references, and namespaces.

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a “context”. JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well-defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.
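
The point about "logically equivalent but structurally distinct" documents can be shown with a toy example. The sketch below is not a JSON-LD processor: the hypothetical function `expand_keys` handles only a flat mapping of terms to URIs, whereas real JSON-LD expansion does much more. The property URI is used purely for illustration.

```python
from typing import Any, Dict


def expand_keys(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Replace short keys with the full URIs listed in the document's @context.

    Toy stand-in for JSON-LD expansion: flat term-to-URI mappings only.
    """
    context = doc.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in doc.items()
        if key != "@context"
    }


# Three documents with different keys that make the same statement.
short_term = {"@context": {"name": "http://schema.org/name"}, "name": "Alice"}
other_term = {"@context": {"fullName": "http://schema.org/name"}, "fullName": "Alice"}
full_uri = {"http://schema.org/name": "Alice"}

print(expand_keys(short_term) == expand_keys(other_term) == expand_keys(full_uri))
```

Code that reads `doc["name"]` directly works for only one of the three documents, which is why a schema that fixes the structure makes such data easier to handle as ordinary JSON.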

Several schema languages exist for describing and validating JSON data, such as JSON Schema and the Apache Avro data serialization system; however, none of them understands linked data. As a result, to fully take advantage of JSON-LD to build the next generation of linked data applications, one must maintain separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite significant overlap of content and the obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content, permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content. These annotations enable generation of a JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides robust support for inline documentation.
