Pipeline management software for clusters.

Toil is a massively scalable pipeline management system, written entirely in Python, and designed around the principles of functional programming.

Toil runs as easily on a laptop as it does on a bare-metal cluster or in the cloud, thanks to support for many batch systems, including GridEngine, Parasol, and a custom Mesos framework.

Toil is robust, and designed to run in unreliable computing environments like Amazon’s spot market. Towards this goal, Toil does not rely on a shared file system. Instead, Toil abstracts a pipeline’s global storage as a job store that can reside on a locally attached file system or within an object store like Amazon S3. The result of this abstraction is a robust system that can be resumed even after an unexpected shutdown of every node in the cluster, even if that event resulted in the loss of all locally stored data.
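The idea of keeping all workflow state in a job store, so that a run can resume after every worker is lost, can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Toil's implementation: `FileJobStore`, its methods, and the one-JSON-file-per-job layout are hypothetical names and choices made up for illustration, and the rename-over-existing-file step assumes a POSIX file system.

```python
import json
import os
import tempfile


class FileJobStore(object):
    """Toy job store: one JSON record per job in a directory.

    Hypothetical illustration only; this is not Toil's job store class.
    """

    def __init__(self, path):
        self.path = path
        if not os.path.isdir(path):
            os.makedirs(path)

    def _record(self, job_id):
        return os.path.join(self.path, job_id + ".json")

    def save(self, job_id, state):
        # Write to a temporary file first, then rename it into place, so a
        # crash never leaves a half-written record behind (POSIX semantics).
        tmp = self._record(job_id) + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.rename(tmp, self._record(job_id))

    def load(self, job_id):
        with open(self._record(job_id)) as f:
            return json.load(f)

    def pending(self):
        # Any job whose record is not marked done still needs to run.
        names = sorted(os.listdir(self.path))
        ids = [n[:-5] for n in names if n.endswith(".json")]
        return [i for i in ids if not self.load(i)["done"]]


root = tempfile.mkdtemp()
store = FileJobStore(root)
store.save("a", {"done": True})
store.save("b", {"done": False})

# Simulate losing every node: drop the object and reopen the same location.
resumed = FileJobStore(root)
remaining = resumed.pending()
```

Because nothing but the store's location is needed to find the unfinished work, the same pattern carries over when the directory is swapped for an object store.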

Writing a Toil script requires only basic knowledge of Python. The unit of work in a Toil workflow is the job, and a job can dynamically spawn other jobs as needed, which gives intuitive and powerful control over the pipeline. Files are managed through an immutable interface, which makes it easy to reason about the state of the workflow.
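To make "a job can dynamically spawn other jobs" concrete, here is a toy scheduler written with the standard library only. It is a sketch of the concept and deliberately does not use Toil's API: the `Job` class, `add_child`, and `run_workflow` below are hypothetical stand-ins invented for this example.

```python
from collections import deque


class Job(object):
    """Toy job: wraps a function that may add child jobs while it runs.

    Hypothetical stand-in, not Toil's Job class.
    """

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args
        self.children = []

    def add_child(self, fn, *args):
        child = Job(fn, *args)
        self.children.append(child)
        return child


def run_workflow(root):
    """Run jobs breadth-first; children are only discovered at run time."""
    results = []
    queue = deque([root])
    while queue:
        job = queue.popleft()
        results.append(job.fn(job, *job.args))
        queue.extend(job.children)
    return results


def split(job, items):
    # The number of children depends on the input, decided at run time.
    for item in items:
        job.add_child(square, item)
    return "split %d items" % len(items)


def square(job, x):
    return x * x


results = run_workflow(Job(split, [1, 2, 3]))
```

The shape of the workflow graph is not declared up front; it emerges as each job decides what follow-on work is needed.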

Prerequisites

  • Python 2.7.x

  • pip > 7.x

Installation

Toil uses setuptools’ extras mechanism for the dependencies of optional features, such as support for Mesos or AWS. To install Toil with every optional feature, use:

pip install toil[aws,mesos,azure,encryption]

Here’s what each extra provides:

  • The aws extra provides support for storing workflow state in Amazon Web Services (AWS).

  • The azure extra stores workflow state in Microsoft Azure Storage.

  • The mesos extra provides support for running Toil on an Apache Mesos cluster. Note that running Toil on SGE (GridEngine), Parasol or a single machine is enabled by default and does not require an extra.

  • The encryption extra provides client-side encryption for files stored in the Azure and AWS job stores. Note that if you install Toil without the encryption extra, files in these job stores will not be encrypted, even if you provide encryption keys (see issue #407).
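When only some extras are needed, the requirement string passed to pip lists just those names inside the brackets. The helper below is a small sketch for building that string in a script; `toil_requirement` and `KNOWN_EXTRAS` are hypothetical names, and the set of extras is taken from the list above.

```python
# Extras named in this README; the tuple itself is an illustrative constant.
KNOWN_EXTRAS = ("aws", "azure", "encryption", "mesos")


def toil_requirement(extras=()):
    """Return a pip requirement string such as 'toil[aws,mesos]'."""
    unknown = sorted(set(extras) - set(KNOWN_EXTRAS))
    if unknown:
        raise ValueError("unknown extras: " + ", ".join(unknown))
    if not extras:
        # No extras: SGE, Parasol and single-machine support only.
        return "toil"
    # Sort and de-duplicate so the result is stable.
    return "toil[%s]" % ",".join(sorted(set(extras)))


requirement = toil_requirement(["mesos", "aws"])
```

Some shells treat square brackets as glob characters, so it can be safer to quote the result when passing it to pip on the command line.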

Building & Testing

After cloning the source and cd-ing into the project root, create a virtualenv and activate it:

virtualenv venv
. venv/bin/activate

Simply running

make

from the project root will print a description of the available Makefile targets.

If cloning from GitHub, running

make develop

will install Toil in editable mode, also known as development mode. As with a regular install, you may specify which extras to use in development mode:

make develop extras=[aws,mesos,azure,encryption]

To invoke the tests (unit and integration) use

make test

Run an individual test with

make test tests=src/toil/test/sort/sortTest.py::SortTest::testSort

The default value for tests is "src", which includes all tests in the src subdirectory of the project root. Tests that require a feature that is not installed are skipped implicitly. If you want to explicitly skip tests that depend on a currently installed feature, use

make test tests="-m 'not azure' src"

This will run only the tests that don’t depend on the azure extra, even if that extra is currently installed. Note the distinction between the terms feature and extra: every extra is a feature, but some features are not extras. The gridengine and parasol features fall into that category. So, to skip tests involving both the Parasol feature and the Azure extra, use:

make test tests="-m 'not azure and not parasol' src"
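The marker expressions above follow a regular pattern: each feature to skip becomes a `not <feature>` term, and the terms are joined with `and`. The following sketch builds the value for the `tests` variable from a list of features; `tests_argument` is a hypothetical helper, not part of Toil's Makefile or test suite.

```python
def tests_argument(skip=(), path="src"):
    """Build the value for `make test tests=...` that skips given features."""
    if not skip:
        # Nothing to skip: fall back to the plain test path.
        return path
    expr = " and ".join("not " + feature for feature in skip)
    return "-m '%s' %s" % (expr, path)


argument = tests_argument(["azure", "parasol"])
```

The returned string is what goes inside the double quotes after `tests=` on the make command line.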
