StreamFlow

The StreamFlow framework is a container-native Workflow Management System (WMS) written in Python 3. It has been designed around two main principles:

  • Allow the execution of tasks in multi-container environments, in order to support concurrent execution of multiple communicating tasks in a multi-agent ecosystem.
  • Relax the requirement of a single shared data space, in order to allow for hybrid workflow executions on top of multi-cloud or hybrid cloud/HPC infrastructures.

Use StreamFlow

PyPI

The StreamFlow module is available on PyPI, so you can install it using pip.

pip install streamflow

Please note that StreamFlow requires Python >= 3.7. Once installed, you can run it directly from the command line

streamflow /path/to/streamflow.yml
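
When StreamFlow is launched from another Python script, the two prerequisites above (a recent enough interpreter and an installed `streamflow` executable) can be checked before the call. The following is a minimal sketch, not part of StreamFlow itself: the helper names are hypothetical, and it assumes the script runs under the same interpreter where StreamFlow was installed.

```python
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this README


def python_is_supported(version_info=sys.version_info):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def streamflow_command(config_path):
    """Build the argument list for `streamflow /path/to/streamflow.yml`."""
    return ["streamflow", str(config_path)]


def run_streamflow(config_path):
    """Run the CLI, failing early if a prerequisite is missing."""
    if not python_is_supported():
        raise RuntimeError("StreamFlow requires Python >= 3.7")
    if shutil.which("streamflow") is None:
        raise FileNotFoundError("streamflow executable not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status
    return subprocess.run(streamflow_command(config_path), check=True)
```

Calling `run_streamflow("/path/to/streamflow.yml")` then behaves like the command shown above, with clearer errors when the environment is not ready.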

Docker

StreamFlow Docker images are available on Docker Hub. In order to run a workflow inside the StreamFlow image

  • A StreamFlow project, containing a streamflow.yml file and all the other relevant dependencies (e.g. a CWL description of the workflow steps and a Helm description of the execution environment), needs to be mounted as a volume inside the container, for example in the /streamflow/project folder
  • Workflow outputs, if any, will be stored in the /streamflow/results folder. Therefore, it is necessary to mount this location as a volume in order to persist the results
  • StreamFlow will save all its temporary files inside the /tmp/streamflow location. For debugging purposes, or to improve I/O performance with very large files, it can be useful to mount this location as a volume too
  • The path of the streamflow.yml file inside the container (e.g. /streamflow/project/streamflow.yml) must be passed as an argument to the Docker container

The script below gives an example of StreamFlow execution in a Docker container

docker run -d \
    --mount type=bind,source="$(pwd)"/my-project,target=/streamflow/project \
    --mount type=bind,source="$(pwd)"/results,target=/streamflow/results \
    --mount type=bind,source="$(pwd)"/tmp,target=/tmp/streamflow \
    alphaunito/streamflow \
    /streamflow/project/streamflow.yml
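
The same invocation can be assembled programmatically, which helps when the host directories vary between runs. The sketch below only builds the argument list; the function names and the `base_dir` parameter are hypothetical, while the image name, container paths, and flags are taken from the script above.

```python
from pathlib import PurePosixPath


def bind_mount(source, target):
    """Return the two arguments for one `--mount type=bind,...` option."""
    return ["--mount", f"type=bind,source={source},target={target}"]


def docker_run_args(base_dir, image="alphaunito/streamflow"):
    """Build the `docker run` argument list used in the script above.

    `base_dir` plays the role of "$(pwd)": it must contain the
    `my-project`, `results`, and `tmp` directories.
    """
    base = PurePosixPath(base_dir)
    args = ["docker", "run", "-d"]
    args += bind_mount(base / "my-project", "/streamflow/project")
    args += bind_mount(base / "results", "/streamflow/results")
    args += bind_mount(base / "tmp", "/tmp/streamflow")
    # The last argument is the path of streamflow.yml inside the container
    args += [image, "/streamflow/project/streamflow.yml"]
    return args
```

The resulting list can be passed to `subprocess.run(docker_run_args("/home/user"), check=True)`. Note that with `--mount type=bind` Docker reports an error if a source directory does not exist on the host, so `results` and `tmp` should be created beforehand.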

Kubernetes

It is also possible to execute the StreamFlow container as a Job in Kubernetes. In this case, StreamFlow is able to deploy Helm models directly on the parent cluster through the ServiceAccount credentials. In order to do that, the inCluster option must be set to true for each involved model in the streamflow.yml file

models:
  helm-model:
    type: helm
    config:
      inCluster: true
      ...
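
Before submitting the Job, it can be worth verifying that no Helm model was left without the inCluster option. The following sketch assumes the streamflow.yml file has already been parsed into a dictionary (for instance with a YAML library) and that it follows the `models` / `type` / `config` layout shown above; the function name is hypothetical.

```python
def helm_models_missing_in_cluster(config):
    """Return the sorted names of Helm models without `inCluster: true`."""
    models = config.get("models") or {}
    return sorted(
        name
        for name, model in models.items()
        if model.get("type") == "helm"
        and (model.get("config") or {}).get("inCluster") is not True
    )


# Example with the model from the snippet above plus a misconfigured one
parsed = {
    "models": {
        "helm-model": {"type": "helm", "config": {"inCluster": True}},
        "other-model": {"type": "helm", "config": {}},
    }
}
missing = helm_models_missing_in_cluster(parsed)
```

In this example `missing` contains only `other-model`, the entry that still needs `inCluster: true`.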

A Helm template of a StreamFlow Job can be found in the helm/chart folder.

Please note that, if RBAC is enabled on the Kubernetes cluster, a proper RoleBinding must be attached to the ServiceAccount object, in order to give StreamFlow permission to manage pod deployments and task executions.
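
As an illustration of such a RoleBinding, the sketch below builds the manifest as a plain Python dictionary and serializes it to JSON, a format that kubectl accepts alongside YAML. The binding name, namespace, ServiceAccount name, and Role name are all hypothetical placeholders. The Role itself, and the exact rules StreamFlow needs, are not specified in this README and must be defined separately.

```python
import json


def role_binding(namespace, service_account, role_name,
                 name="streamflow-binding"):
    """Build a RoleBinding that grants `role_name` to a ServiceAccount.

    All argument values are placeholders chosen by the cluster admin.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }


# Hypothetical names, for illustration only
manifest = role_binding("default", "streamflow-sa", "streamflow-role")
manifest_json = json.dumps(manifest, indent=2)
```

The content of `manifest_json` can be written to a file and applied with `kubectl apply -f`, provided that the referenced Role already exists in the same namespace.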

Contribute to StreamFlow

StreamFlow uses pipenv to guarantee deterministic builds. Therefore, the recommended way to manage dependencies is by means of the pipenv command.

As a first step, get StreamFlow from GitHub

git clone git@github.com:alpha-unito/streamflow.git

Then you can install all the required packages using the pipenv command

pip install --user pipenv
cd streamflow
pipenv install

Finally, you can run StreamFlow in the generated virtual environment. For this to work, the streamflow project folder (the one created by the git clone command) must be added to your PYTHONPATH

pipenv run python -m streamflow
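
One way to satisfy the PYTHONPATH requirement without changing the shell configuration is to prepare the environment in a small launcher script. The sketch below is only an illustration: the function name is hypothetical, and the project path is whatever folder the git clone command created on your machine.

```python
import os


def env_with_project_on_pythonpath(project_dir, base_env=None):
    """Return a copy of the environment with `project_dir` on PYTHONPATH.

    The project folder is placed first; existing entries are preserved
    and an already present copy of `project_dir` is not duplicated.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    entries = [project_dir] + [
        entry
        for entry in existing.split(os.pathsep)
        if entry and entry != project_dir
    ]
    env["PYTHONPATH"] = os.pathsep.join(entries)
    return env
```

The returned mapping can then be passed to `subprocess.run(["pipenv", "run", "python", "-m", "streamflow"], env=env, cwd=project_dir)`, which mirrors the command above.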

StreamFlow relies on Travis CI for PyPI and Docker Hub distributions. Therefore, in order to publish a new version of the software, you only have to increment the version number in the version.py file.
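
For a dotted numeric version string, the increment amounts to raising the last component by one. The helper below is a hypothetical sketch of that arithmetic only; it makes no assumption about how version.py stores the value, so the file still has to be edited by hand or by a separate script.

```python
def bump_last_component(version):
    """Increment the last numeric component of a dotted version string."""
    parts = version.split(".")
    # int() raises ValueError if the last component is not numeric
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)
```

For example, `bump_last_component("0.0.25")` returns `"0.0.26"`.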

StreamFlow Team

Iacopo Colonnelli <iacopo.colonnelli@unito.it> (creator and maintainer)
Barbara Cantalupo <barbara.cantalupo@unito.it> (maintainer)
Marco Aldinucci <aldinuc@di.unito.it> (maintainer)

Gaetano Saitta <gaetano.saitta@edu.unito.it> (contributor)
Alberto Mulone <alberto.mulone@edu.unito.it> (contributor)
