Koza, an ETL framework for the Biolink model

Koza

Data transformation framework

Disclaimer: Koza is in beta; we are looking for beta testers

Transform csv, json, yaml, jsonl, and xml files and convert them to a target csv, json, or jsonl format based on your dataclass model. Koza can also output data in the KGX format.
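To make the idea of a dataclass-driven transform concrete, here is a minimal sketch using only the Python standard library. It is not Koza's API: the `ProteinLink` model, its field names, the column names, and the `rows_to_jsonl` helper are all assumptions made for illustration.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass


@dataclass
class ProteinLink:
    """Hypothetical target model; the field names are illustrative only."""
    subject: str
    object: str
    score: int


def rows_to_jsonl(source: str, delimiter: str = " ") -> str:
    """Read delimited text and emit one JSON object per data row."""
    reader = csv.DictReader(io.StringIO(source), delimiter=delimiter)
    lines = []
    for row in reader:
        # Map source columns (assumed names) onto the target dataclass.
        record = ProteinLink(
            subject=row["protein1"],
            object=row["protein2"],
            score=int(row["combined_score"]),
        )
        lines.append(json.dumps(asdict(record)))
    return "\n".join(lines)


sample = "protein1 protein2 combined_score\nP1 P2 900\nP1 P3 150\n"
print(rows_to_jsonl(sample))
```

The dataclass acts as the contract for the output: a row that cannot be mapped onto it (a missing column, a non-integer score) fails loudly instead of producing a malformed record.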

Documentation: https://koza.monarchinitiative.org/

Highlights
  • Author data transforms in semi-declarative Python
  • Configure source files, expected columns/json properties and path filters, field filters, and metadata in yaml
  • Create or import mapping files to be used in ingests (e.g. id mappings, type mappings)
  • Create and use translation tables to map between source and target vocabularies
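As a rough illustration of the translation-table idea in the last bullet, the sketch below maps source terms to target vocabulary terms with a plain dictionary. It does not reflect Koza's table format or lookup functions: the `translate` helper, the `strict` flag, and the `EX:` identifiers are placeholders assumed for this example.

```python
# Hypothetical translation table: source term -> target vocabulary term.
# The EX: identifiers are placeholders, not real ontology terms.
TRANSLATION_TABLE = {
    "interacts with": "EX:0000001",
    "has phenotype": "EX:0000002",
}


def translate(term: str, table: dict, strict: bool = True) -> str:
    """Return the target term for a source term.

    With strict=True an unmapped term raises ValueError;
    with strict=False the source term is passed through unchanged.
    """
    try:
        return table[term]
    except KeyError:
        if strict:
            raise ValueError(f"No translation for source term: {term!r}")
        return term


print(translate("interacts with", TRANSLATION_TABLE))
print(translate("unmapped term", TRANSLATION_TABLE, strict=False))
```

Failing on unmapped terms by default is one possible design choice: it surfaces vocabulary gaps during the ingest rather than letting untranslated strings reach the output.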

Installation

pip install koza

Getting Started

Send a local or remote csv file through Koza to get some basic information (headers, number of rows)

koza validate \
  --file https://raw.githubusercontent.com/monarch-initiative/koza/main/examples/data/string.tsv \
  --delimiter ' '
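For readers who want to see what "basic information" amounts to, the following standard-library sketch reports the header row and the number of data rows of a delimited file. It is an independent illustration, not the code behind `koza validate`; the `summarize_delimited` name is an assumption.

```python
import csv
import io


def summarize_delimited(text: str, delimiter: str = " "):
    """Return (headers, row_count) for delimited text with a header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    headers = next(reader, [])          # first row is treated as the header
    row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return headers, row_count


headers, row_count = summarize_delimited("protein1 protein2 score\nP1 P2 900\nP1 P3 150\n")
print(headers, row_count)
```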

Sending a json or jsonl formatted file will confirm if the file is valid json or jsonl

koza validate \
  --file ./examples/data/ZFIN_PHENOTYPE_0.jsonl.gz \
  --format jsonl

koza validate \
  --file ./examples/data/ddpheno.json.gz \
  --format json \
  --compression gzip
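The check described above can be approximated in a few lines of standard-library Python. This is a sketch of the general technique (parse the whole payload for json, parse line by line for jsonl, decompress first when gzip is used), not Koza's implementation; the `is_valid` function and its parameters are assumptions.

```python
import gzip
import json
from typing import Optional


def is_valid(data: bytes, fmt: str = "jsonl", compression: Optional[str] = None) -> bool:
    """Return True if data parses as json, or as jsonl (one document per line)."""
    if compression == "gzip":
        data = gzip.decompress(data)
    text = data.decode("utf-8")
    try:
        if fmt == "json":
            json.loads(text)
        else:
            for line in text.splitlines():
                if line.strip():  # skip blank lines
                    json.loads(line)
    except json.JSONDecodeError:
        return False
    return True


payload = gzip.compress(b'{"a": 1}\n{"b": 2}\n')
print(is_valid(payload, fmt="jsonl", compression="gzip"))
```

Note that the same payload is valid jsonl but not valid json, since a json file must contain a single document.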

Example: transforming StringDB

koza transform --source examples/string/protein-links-detailed.yaml --global-table examples/translation_table.yaml

koza transform --source examples/string-declarative/protein-links-detailed.yaml --global-table examples/translation_table.yaml

Download files

Source Distribution

koza-0.1.1.tar.gz (219.9 kB)

Uploaded Source

Built Distribution

koza-0.1.1-py3-none-any.whl (24.2 kB)

Uploaded Python 3

File details

Details for the file koza-0.1.1.tar.gz.

File metadata

  • Download URL: koza-0.1.1.tar.gz
  • Upload date:
  • Size: 219.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.25.1

File hashes

Hashes for koza-0.1.1.tar.gz
Algorithm Hash digest
SHA256 be3575955493483ac0466123402d3741fddf462d8c3a1f48eb18c4024f12b36b
MD5 f3522f683b1019912622b8f71b8c67ca
BLAKE2b-256 829ebe9a9296b27e7df221297cb6f04d86970a5f4e8448f7cb3175570fd90f5b

File details

Details for the file koza-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: koza-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 24.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-requests/2.25.1

File hashes

Hashes for koza-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a82c2352cccc96cf878afc3a650f5ad18d17029a66674ed08d2eb8b1e1b454c7
MD5 b52aa3c43f2c8e32cafc54d9cd38de2a
BLAKE2b-256 8f61090c098366b7d7ee1417c687026f952f39c0dd31e69b9509a82fb084264e
