Data transformation framework for LinkML data models
Koza - a data transformation framework
Disclaimer: Koza is in beta - we are looking for testers!
Overview
- Transform csv, json, yaml, jsonl, and xml files and convert them to a target csv, json, or jsonl format based on your dataclass model.
- Koza can also output data in the KGX format
- Write data transforms in semi-declarative Python
- Configure source files, expected columns/json properties and path filters, field filters, and metadata in yaml
- Create or import mapping files to be used in ingests (e.g. id mappings, type mappings)
- Create and use translation tables to map between source and target vocabularies
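As a rough illustration of the last bullet, a translation table can be thought of as a plain mapping from source vocabulary terms to target vocabulary terms. The sketch below is not Koza's API: the `translate` helper, the `strict` flag, and the sample terms and `EX:` identifiers are all hypothetical and only show the general idea.

```python
# Hypothetical illustration only; this is NOT Koza's API.
# Sample terms and EX: identifiers are made up for the example.
translation_table = {
    "activation": "EX:0001",
    "inhibition": "EX:0002",
}


def translate(term: str, table: dict[str, str], strict: bool = True) -> str:
    """Map a source vocabulary term to its target vocabulary term."""
    if term in table:
        return table[term]
    if strict:
        # Fail loudly so unmapped terms are noticed during an ingest.
        raise KeyError(f"No translation for {term!r}")
    # In non-strict mode, pass the source term through unchanged.
    return term


print(translate("activation", translation_table))
print(translate("binding", translation_table, strict=False))
```

Raising on unmapped terms by default is one possible design choice; it makes gaps in the table visible rather than silently passing source terms into the output.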
Installation
Koza is available on PyPI and can be installed via pip/pipx:
[pip|pipx] install koza
Usage
NOTE: As of version 0.2.0, there is a new method for getting your ingest's KozaApp
instance. Please see the updated documentation for details.
See the Koza documentation for usage information
Try the Examples
Validate
Give Koza a local or remote csv file to get some basic information about it (headers, number of rows):
koza validate \
--file https://raw.githubusercontent.com/monarch-initiative/koza/main/examples/data/string.tsv \
--delimiter ' '
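To make "headers, number of rows" concrete, here is a minimal sketch of that kind of summary using only Python's standard `csv` module. It is not how Koza implements `validate`; the `summarize_delimited` function and the sample column names are assumptions made for illustration.

```python
import csv
import io


def summarize_delimited(text: str, delimiter: str = " ") -> tuple[list[str], int]:
    """Return the header fields and the number of data rows in delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    headers = next(reader, [])          # first line is treated as the header
    row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return headers, row_count


# Hypothetical space-delimited sample, not taken from the example file.
sample = "protein1 protein2 combined_score\nA B 900\nC D 700\n"
headers, rows = summarize_delimited(sample)
print(headers, rows)
```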
Given a json or jsonl file, Koza confirms whether the file is valid json or jsonl:
koza validate \
--file ./examples/data/ZFIN_PHENOTYPE_0.jsonl.gz \
--format jsonl
koza validate \
--file ./examples/data/ddpheno.json.gz \
--format json
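The difference between the two formats can be sketched with the standard `json` module: a json file is a single document, while a jsonl file holds one document per line. This is only an assumption about what such a check involves, not Koza's implementation, and the helper names `is_valid_json` and `is_valid_jsonl` are hypothetical.

```python
import json


def is_valid_json(text: str) -> bool:
    """True if the whole text parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def is_valid_jsonl(text: str) -> bool:
    """True if every non-blank line is its own JSON document."""
    return all(is_valid_json(line) for line in text.splitlines() if line.strip())


print(is_valid_jsonl('{"a": 1}\n{"b": 2}\n'))
print(is_valid_json('{"a": 1}\n{"b": 2}\n'))
```

Compressed inputs such as the `.gz` examples above would need to be decompressed first, for instance with the standard `gzip` module, before a check like this could be applied.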
Transform
Run the example ingest, "string/protein-links-detailed"
koza transform \
--source examples/string/protein-links-detailed.yaml \
--global-table examples/translation_table.yaml
koza transform \
--source examples/string-declarative/protein-links-detailed.yaml \
--global-table examples/translation_table.yaml
Note:
Koza expects the directory structure used in the example above,
with the source config file and the transform code in the same directory:
.
├── ...
│ ├── your_source
│ │ ├── your_ingest.yaml
│ │ └── your_ingest.py
│ └── some_translation_table.yaml
└── ...