Skip to main content

Dask on DRMAA

Project description

Build Status

Deploy a Dask.distributed cluster on top of a cluster running a DRMAA-compliant job scheduler.

Example

Launch from Python

from dask_drmaa import DRMAACluster
cluster = DRMAACluster()

from dask.distributed import Client
client = Client(cluster)
cluster.start_workers(2)

>>> future = client.submit(lambda x: x + 1, 10)
>>> future.result()
11

Or launch from the command line:

$ dask-drmaa 10  # starts local scheduler and ten remote workers

Install

Currently this is only available through GitHub and source installation:

pip install git+https://github.com/dask/dask-drmaa.git --upgrade

or:

git clone git@github.com:dask/dask-drmaa.git
cd dask-drmaa
python setup.py install

You must have the DRMAA system library installed and be able to submit jobs from your local machine.

Testing

This repository contains a Docker-compose testing harness for a Son of Grid Engine cluster with a master and two slaves. You can initialize this system as follows

docker-compose build
./start-sge.sh

And run tests with py.test in the master docker container

docker exec -it sge_master /bin/bash -c "cd /dask-drmaa; python setup.py develop"
docker exec -it sge_master py.test dask-drmaa/dask_drmaa --verbose

Adaptive Load

Dask-drmaa can adapt to scheduler load, deploying more workers on the grid when it has more work, and cleaning up these workers when they are no longer necessary. This can simplify setup (you can just leave a cluster running) and it can reduce load on the cluster, making IT happy.

To enable this, call the Adaptive class on a DRMAACluster. You can submit computations to the cluster without ever explicitly creating workers.

from dask_drmaa import DRMAACluster, Adaptive
from dask.distributed import Client

cluster = DRMAACluster()
adapt = Adaptive(cluster)
client = Client(cluster)

futures = client.map(func, seq)  # workers will be created as necessary

Extensible

The DRMAA interface is the lowest common denominator among many different job schedulers like SGE, SLURM, LSF, Torque, and others. However, sometimes users need to specify parameters particular to their cluster, such as resource queues, wall times, memory constraints, etc..

DRMAA allows users to pass native specifications either when constructing the cluster or when starting new workers:

cluster = DRMAACluster(template={'nativeSpecification': '-l h_rt=01:00:00'})
# or
cluster.start_workers(10, nativeSpecification='-l h_rt=01:00:00')

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dask-drmaa-0.1.0.tar.gz (24.6 kB view details)

Uploaded Source

File details

Details for the file dask-drmaa-0.1.0.tar.gz.

File metadata

  • Download URL: dask-drmaa-0.1.0.tar.gz
  • Upload date:
  • Size: 24.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for dask-drmaa-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c98041835727539fd541a2a08963831c6b1f3371784df145ab48298d5de19aff
MD5 1cda20b30bc9139591e15d9119b76ade
BLAKE2b-256 b5886ecb9e020d1040ee16cc66475a75585bdf16c5b3456b877b6b3a02e35e8b

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page