packtivity - general purpose schema + bindings for PROV activities

Reason this release was yanked: an upper bound was placed on the jq version; this is fixed in v0.16.2.

packtivity

This package aims to collect implementations of both synchronous and asynchronous execution of preserved, but parametrized scientific computational tasks that come with batteries included, i.e. with a full specification of their software dependencies. In that sense they are packaged activities -- packtivities.

This package provides tools to validate and execute data processing tasks that are written according to the "packtivity" JSON schemas defined in https://github.com/diana-hep/yadage-schemas.

Packtivities define

  • the software environment
  • parametrized process descriptions (what programs to run within this environment), and
  • how to produce human- and machine-readable outputs (as JSON) describing the resulting data fragments.

At run-time they are paired with a concrete set of parameters supplied as JSON documents and an external storage/state to actually execute these tasks.

Packtivity in Yadage

This package is used by https://github.com/lukasheinrich/yadage to execute the individual steps of yadage workflows.

Example Packtivity spec

This packtivity spec is part of a number of yadage workflows. It runs the Delphes detector simulation on a HepMC file and outputs events in the LHCO and ROOT file formats. The spec is stored in a public location (https://github.com/lukasheinrich/yadage-workflows/blob/master/phenochain/delphes.yml), from which it can later be retrieved:

process:
  process_type: 'string-interpolated-cmd'
  cmd: 'DelphesHepMC  {delphes_card} {outputroot} {inputhepmc} && root2lhco {outputroot} {outputlhco}'
publisher:
  publisher_type: 'frompar-pub'
  outputmap:
    lhcofile: outputlhco
    rootfile: outputroot
environment:
  environment_type: 'docker-encapsulated'
  image: lukasheinrich/root-delphes
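
To make the roles of the process and publisher sections concrete, here is a minimal Python sketch of the idea behind them. It assumes that 'string-interpolated-cmd' fills the {placeholders} in cmd from the parameters and that 'frompar-pub' builds the outputs by looking up the parameters named in outputmap. The helper names (interpolate_cmd, publish_from_parameters) and the parameter values are hypothetical and chosen for illustration; this is not packtivity's own implementation.

```python
import json

# The spec from above as a plain Python dict (environment section omitted,
# since this sketch does not start any container).
spec = {
    "process": {
        "process_type": "string-interpolated-cmd",
        "cmd": "DelphesHepMC {delphes_card} {outputroot} {inputhepmc} "
               "&& root2lhco {outputroot} {outputlhco}",
    },
    "publisher": {
        "publisher_type": "frompar-pub",
        "outputmap": {"lhcofile": "outputlhco", "rootfile": "outputroot"},
    },
}

# Hypothetical parameter values; in practice these come from JSON documents
# or from -p flags on the command line.
parameters = {
    "delphes_card": "delphes/cards/delphes_card_ATLAS.tcl",
    "inputhepmc": "/data/pythia/output.hepmc",
    "outputroot": "/data/outdir/output.root",
    "outputlhco": "/data/outdir/output.lhco",
}


def interpolate_cmd(spec, parameters):
    """Fill the {placeholders} of the command template with parameter values."""
    return spec["process"]["cmd"].format(**parameters)


def publish_from_parameters(spec, parameters):
    """Build the output document by looking up the parameter named for each output."""
    outputmap = spec["publisher"]["outputmap"]
    return {output: parameters[par] for output, par in outputmap.items()}


command = interpolate_cmd(spec, parameters)
outputs = publish_from_parameters(spec, parameters)

print(command)
print(json.dumps(outputs, indent=2))
```

Under these assumptions the output paths are never discovered by scanning the file system: they are already known from the parameters, which is why the publisher only needs a mapping from output names to parameter names.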

Usage

You can run the packtivity synchronously by specifying the spec (which can point to GitHub) and all necessary parameters, and by attaching an external state (via the --read and --write flags).

packtivity-run -t from-github/phenochain delphes.yml \
  -p inputhepmc="$PWD/pythia/output.hepmc" \
  -p outputroot="'{workdir}/output.root'" \
  -p outputlhco="'{workdir}/output.lhco'" \
  -p delphes_card=delphes/cards/delphes_card_ATLAS.tcl \
  --read pythia --write outdir

Asynchronous Backends

To facilitate the use of distributed resources, a number of asynchronous backends can be specified. Here is an example for IPython Parallel clusters:

packtivity-run -b ipcluster --asyncwait \
  -t from-github/phenochain delphes.yml \
  -p inputhepmc="$PWD/pythia/output.hepmc" \
  -p outputroot="'{workdir}/output.root'" \
  -p outputlhco="'{workdir}/output.lhco'" \
  -p delphes_card=delphes/cards/delphes_card_ATLAS.tcl \
  --read pythia --write outdir

You can replace the --asyncwait flag with --async to get a JSON-serializable proxy representation with which to check on the job status later. By default the proxy information is written to proxy.json (customizable via the -x flag):

packtivity-run -b celery --async \
  -t from-github/phenochain delphes.yml \
  -p inputhepmc="$PWD/pythia/output.hepmc" \
  -p outputroot="'{workdir}/output.root'" \
  -p outputlhco="'{workdir}/output.lhco'" \
  -p delphes_card=delphes/cards/delphes_card_ATLAS.tcl \
  --read pythia --write outdir

At a later point in time you can check the job status via:

packtivity-checkproxy proxy.json

External Backends

Users can implement their own backends to handle the JSON documents describing the packtivities. A custom backend is enabled by using the fromenv backend and setting an environment variable that specifies the module holding the backend and proxy classes. The format of the environment variable is module:backendclass:proxyclass, e.g.:

export PACKTIVITY_ASYNCBACKEND="externalbackend:ExternalBackend:ExternalProxy"
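
The module:backendclass:proxyclass format can be resolved with the standard library alone. The following sketch shows one way such a string might be turned into two classes. The function load_backend_classes and the DEMO_ASYNCBACKEND variable are hypothetical names used only for this illustration, the demo uses stand-in classes from collections so that it runs without a real backend module, and packtivity's own loading code may differ.

```python
import importlib
import os


def load_backend_classes(envvar="PACKTIVITY_ASYNCBACKEND"):
    """Resolve a 'module:backendclass:proxyclass' string into two classes.

    Hypothetical helper for illustration; not part of packtivity's API.
    """
    value = os.environ[envvar]
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError(
            f"expected 'module:backendclass:proxyclass', got {value!r}"
        )
    module_name, backend_name, proxy_name = parts
    # Import the module by name, then look up both classes on it.
    module = importlib.import_module(module_name)
    return getattr(module, backend_name), getattr(module, proxy_name)


# Demo with standard-library stand-ins instead of a real backend module.
os.environ["DEMO_ASYNCBACKEND"] = "collections:OrderedDict:Counter"
backend_cls, proxy_cls = load_backend_classes("DEMO_ASYNCBACKEND")
print(backend_cls.__name__, proxy_cls.__name__)
```

For the export line above, the module externalbackend would need to be importable (for example, located on PYTHONPATH) and define both ExternalBackend and ExternalProxy.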


Download files

Source Distribution

packtivity-0.16.1.tar.gz (34.8 kB view details)

Uploaded Source

Built Distribution

packtivity-0.16.1-py3-none-any.whl (36.3 kB view details)

Uploaded Python 3

File details

Details for the file packtivity-0.16.1.tar.gz.

File metadata

  • Download URL: packtivity-0.16.1.tar.gz
  • Upload date:
  • Size: 34.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for packtivity-0.16.1.tar.gz
Algorithm Hash digest
SHA256 c0cc5f14ca3dcc474a91899625d6279f12203be200f4d46cafe3041bfb2a8910
MD5 0f4483a47312323f0d574957d6ef117e
BLAKE2b-256 5b15ca8988dd68656fd6167bc3ce12725b8f7ae5fde13d51adbae87df416f015

File details

Details for the file packtivity-0.16.1-py3-none-any.whl.

File metadata

  • Download URL: packtivity-0.16.1-py3-none-any.whl
  • Upload date:
  • Size: 36.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for packtivity-0.16.1-py3-none-any.whl
Algorithm Hash digest
SHA256 01b904d4b60292578ef578158880cd4e06cb55d495610b1a9b33c0f8bbf89a58
MD5 1a76f26e6b45932efc1787ef282400ea
BLAKE2b-256 6340853fd03eedf929f2aba8df7b6cf3dd56d239c4a3c4c193ac632b66a8392a
