
Schema Annotations for Linked Avro Data (SALAD)


Schema Salad

Salad is a schema language for describing JSON or YAML structured linked data documents. A Salad schema defines the rules for preprocessing, structural validation, and hyperlink checking of the documents it describes. Salad supports rich data modeling with inheritance, template specialization, object identifiers, object references, documentation generation, code generation, and transformation to RDF. Salad provides a bridge between document- and record-oriented data modeling and the Semantic Web.

The Schema Salad library is Python 3.6+ only.

Installation

pip3 install schema_salad

If you intend to use the schema-salad-tool --codegen=python feature, please include the [pycodegen] extra:

pip3 install schema_salad[pycodegen]

To install from source:

git clone https://github.com/common-workflow-language/schema_salad
cd schema_salad
pip3 install .
# or pip3 install .[pycodegen] if needed

Commands

Schema Salad can be used as a command line tool or imported as a Python module:

$ schema-salad-tool
usage: schema-salad-tool [-h] [--rdf-serializer RDF_SERIALIZER] [--skip-schemas]
                      [--strict-foreign-properties] [--print-jsonld-context]
                      [--print-rdfs] [--print-avro] [--print-rdf] [--print-pre]
                      [--print-index] [--print-metadata] [--print-inheritance-dot]
                      [--print-fieldrefs-dot] [--codegen language] [--codegen-target CODEGEN_TARGET]
                      [--codegen-examples directory] [--codegen-package dotted.package]
                      [--codegen-copyright copyright_string] [--print-oneline]
                      [--print-doc] [--strict | --non-strict]
                      [--verbose | --quiet | --debug] [--only ONLY] [--redirect REDIRECT]
                      [--brand BRAND] [--brandlink BRANDLINK] [--brandstyle BRANDSTYLE]
                      [--brandinverse] [--primtype PRIMTYPE] [--version]
                      [schema] [document]

$ python
>>> import schema_salad

Validate a schema:

$ schema-salad-tool myschema.yml

Validate a document using a schema:

$ schema-salad-tool myschema.yml mydocument.yml

Generate HTML documentation:

$ schema-salad-tool --print-doc myschema.yml > myschema.html
$ # or
$ schema-salad-doc myschema.yml > myschema.html

Get JSON-LD context:

$ schema-salad-tool --print-jsonld-context myschema.yml mydocument.yml

Convert a document to JSON-LD:

$ schema-salad-tool --print-pre myschema.yml mydocument.yml > mydocument.jsonld

Generate Python classes for loading/generating documents described by the schema (requires the [pycodegen] extra):

$ schema-salad-tool --codegen=python myschema.yml > myschema.py

Display inheritance relationship between classes as a graphviz ‘dot’ file and render as SVG:

$ schema-salad-tool --print-inheritance-dot myschema.yml | dot -Tsvg > myschema.svg
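
The commands above can also be driven from a Python script. The following is a minimal sketch, not part of the Schema Salad API: salad_command and run_salad are hypothetical helper names, and the sketch only assembles the same command lines shown above and runs them when schema-salad-tool is found on PATH.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def salad_command(schema: str, document: Optional[str] = None,
                  flags: Sequence[str] = ()) -> List[str]:
    """Build an argv list for schema-salad-tool (hypothetical helper)."""
    argv = ["schema-salad-tool", *flags, schema]
    if document is not None:
        argv.append(document)
    return argv


def run_salad(argv: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run the command if the tool is installed; return None otherwise."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True, check=False)


# Same invocations as "Validate a document" and "Generate HTML documentation".
validate_cmd = salad_command("myschema.yml", "mydocument.yml")
doc_cmd = salad_command("myschema.yml", flags=["--print-doc"])
print(validate_cmd)
print(doc_cmd)
```

When the tool is installed, run_salad(validate_cmd) returns a CompletedProcess whose returncode and captured output can be inspected by the caller.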

Quick Start

Let’s say you have a ‘basket’ record that can contain items measured either by weight or by count. Here’s an example:

basket:
  - product: bananas
    price: 0.39
    per: pound
    weight: 1
  - product: cucumbers
    price: 0.79
    per: item
    count: 3

We want to validate that all the expected fields are present, the measurement is known, and that “count” cannot be a fractional value. Here is an example schema to do that:

- name: Product
  doc: |
    The base type for a product.  This is an abstract type, so it
    can't be used directly, but can be used to define other types.
  type: record
  abstract: true
  fields:
    product: string
    price: float

- name: ByWeight
  doc: |
    A product, sold by weight.  Products may be sold by pound or by
    kilogram.  Weights may be fractional.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - pound
          - kilogram
      jsonldPredicate: '#per'
    weight: float

- name: ByCount
  doc: |
    A product, sold by count.  The count must be an integer value.
  type: record
  extends: Product
  fields:
    per:
      type:
        type: enum
        symbols:
          - item
      jsonldPredicate: '#per'
    count: int

- name: Basket
  doc: |
    A basket of products.  The 'documentRoot' field indicates it is a
    valid starting point for a document.  The 'basket' field will
    validate subtypes of 'Product' (ByWeight and ByCount).
  type: record
  documentRoot: true
  fields:
    basket:
      type:
        type: array
        items: Product

You can check the schema and document in schema_salad/tests/basket_schema.yml and schema_salad/tests/basket.yml:

$ schema-salad-tool basket_schema.yml basket.yml
Document `basket.yml` is valid
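
To make the rules in this schema concrete, here is a hand-written Python sketch of the same checks. It is an illustration only and is not how Schema Salad implements validation: the names BASE_FIELDS, SUBTYPES, check_field, check_item, and check_basket are hypothetical, and the sketch assumes that a whole number is acceptable where a float is expected (as with "weight: 1" in the example document).

```python
from typing import Any, Dict, List

NUMBER = (int, float)  # assumption: whole numbers are accepted as floats

# Fields inherited from the abstract Product record.
BASE_FIELDS = {"product": (str,), "price": NUMBER}

# Each subtype: allowed 'per' symbols, its measurement field, accepted types.
SUBTYPES = {
    "ByWeight": ({"pound", "kilogram"}, "weight", NUMBER),
    "ByCount": ({"item"}, "count", (int,)),
}


def check_field(item: Dict[str, Any], name: str, types: tuple) -> List[str]:
    """Report a missing field or a value of the wrong type."""
    if name not in item:
        return [f"missing field '{name}'"]
    value = item[name]
    if isinstance(value, bool) or not isinstance(value, types):
        return [f"field '{name}' has wrong type {type(value).__name__}"]
    return []


def check_item(item: Dict[str, Any]) -> List[str]:
    """Check the Product fields, then the subtype selected by 'per'."""
    errors: List[str] = []
    for name, types in BASE_FIELDS.items():
        errors.extend(check_field(item, name, types))
    per = item.get("per")
    for symbols, measure, types in SUBTYPES.values():
        if isinstance(per, str) and per in symbols:
            errors.extend(check_field(item, measure, types))
            return errors
    errors.append(f"unknown measurement {per!r}")
    return errors


def check_basket(document: Dict[str, Any]) -> List[str]:
    """Check every entry of the 'basket' array."""
    errors: List[str] = []
    for index, item in enumerate(document.get("basket", [])):
        errors.extend(f"basket[{index}]: {msg}" for msg in check_item(item))
    return errors


valid_basket = {"basket": [
    {"product": "bananas", "price": 0.39, "per": "pound", "weight": 1},
    {"product": "cucumbers", "price": 0.79, "per": "item", "count": 3},
]}
fractional_count = {"basket": [
    {"product": "cucumbers", "price": 0.79, "per": "item", "count": 2.5},
]}
print(check_basket(valid_basket))
print(check_basket(fractional_count))
```

The schema expresses the same constraints declaratively, so there is no need to write or maintain checks like these by hand.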

Documentation

See the specification and the metaschema (salad schema for itself). For an example application of Schema Salad see the Common Workflow Language.

Rationale

The JSON data model is a popular way to represent structured data. It is attractive because of its relative simplicity and is a natural fit with the standard types of many programming languages. However, this simplicity comes at the cost that basic JSON lacks expressive features useful for working with complex data structures and document formats, such as schemas, object references, and namespaces.

JSON-LD is a W3C standard providing a way to describe how to interpret a JSON document as Linked Data by means of a “context”. JSON-LD provides a powerful solution for representing object references and namespaces in JSON based on standard web URIs, but is not itself a schema language. Without a schema providing a well defined structure, it is difficult to process an arbitrary JSON-LD document as idiomatic JSON because there are many ways to express the same data that are logically equivalent but structurally distinct.
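
As a toy illustration of that last point, the sketch below expands the keys of three small documents using a context that maps short names to URIs. This is a deliberate simplification and not the JSON-LD expansion algorithm; expand_keys and the sample documents are assumptions made up for this example.

```python
from typing import Any, Dict


def expand_keys(doc: Dict[str, Any], context: Dict[str, str]) -> Dict[str, Any]:
    """Replace short keys with the full URIs given in the context."""
    return {context.get(key, key): value for key, value in doc.items()}


# Three structurally distinct documents...
doc_a, ctx_a = {"name": "bananas"}, {"name": "http://schema.org/name"}
doc_b, ctx_b = {"title": "bananas"}, {"title": "http://schema.org/name"}
doc_c, ctx_c = {"http://schema.org/name": "bananas"}, {}

# ...that carry the same linked data once their contexts are applied.
expanded = [expand_keys(doc_a, ctx_a),
            expand_keys(doc_b, ctx_b),
            expand_keys(doc_c, ctx_c)]
print(expanded[0] == expanded[1] == expanded[2])
```

Code that reads doc_a as plain JSON would look for a "name" key and fail on doc_b and doc_c, even though all three say the same thing.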

Several schema languages exist for describing and validating JSON data, such as JSON Schema and the Apache Avro data serialization system; however, none of them understand linked data. As a result, to take full advantage of JSON-LD to build the next generation of linked data applications, one must maintain a separate JSON schema, JSON-LD context, RDF schema, and human documentation, despite the significant overlap in content and the obvious need for these documents to stay synchronized.

Schema Salad is designed to address this gap. It provides a schema language and processing rules for describing structured JSON content permitting URI resolution and strict document validation. The schema language supports linked data through annotations that describe the linked data interpretation of the content, enables generation of JSON-LD context and RDF schema, and production of RDF triples by applying the JSON-LD context. The schema language also provides for robust support of inline documentation.


