# ncar-jobqueue

`ncar-jobqueue` provides utilities for configuring `dask-jobqueue` with appropriate default settings for NCAR's clusters.
The following compute servers are supported:
- Cheyenne (cheyenne.ucar.edu)
- Casper (DAV) (casper.ucar.edu)
- Hobart (hobart.cgd.ucar.edu)
- Izumi (izumi.unified.ucar.edu)
## Installation

NCAR-jobqueue can be installed from PyPI with pip:

```bash
python -m pip install ncar-jobqueue
```

NCAR-jobqueue is also available from conda-forge for conda installations:

```bash
conda install -c conda-forge ncar-jobqueue
```
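To verify the installation, import the package once; `ncar-jobqueue` should place its default configuration file under `~/.config/dask/` if one is not already there. A quick check, assuming a Unix-like shell:

```bash
# Importing the package is enough to trigger configuration setup.
python -c "import ncar_jobqueue"

# The default settings should now be visible here.
cat ~/.config/dask/ncar-jobqueue.yaml
```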
## Configuration

`ncar-jobqueue` provides a custom configuration file with appropriate default settings for the different clusters. This configuration file resides in `~/.config/dask/ncar-jobqueue.yaml`:
**ncar-jobqueue.yaml**

```yaml
cheyenne:
  pbs:
    # project: XXXXXXXX
    name: dask-worker-cheyenne
    cores: 18 # Total number of cores per job
    memory: '109GB' # Total amount of memory per job
    processes: 18 # Number of Python processes per job
    interface: ib0 # Network interface to use like eth0 or ib0
    queue: regular
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=36:mem=109GB
    log-directory: '/glade/scratch/${USER}/dask/cheyenne/logs'
    local-directory: '/glade/scratch/${USER}/dask/cheyenne/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60

casper-dav:
  pbs:
    # project: XXXXXXXX
    name: dask-worker-casper-dav
    cores: 2 # Total number of cores per job
    memory: '25GB' # Total amount of memory per job
    processes: 1 # Number of Python processes per job
    interface: ib0
    walltime: '01:00:00'
    resource-spec: select=1:ncpus=1:mem=25GB
    queue: casper
    log-directory: '/glade/scratch/${USER}/dask/casper-dav/logs'
    local-directory: '/glade/scratch/${USER}/dask/casper-dav/local-dir'
    job-extra: []
    env-extra: []
    death-timeout: 60

hobart:
  pbs:
    name: dask-worker-hobart
    cores: 10 # Total number of cores per job
    memory: '96GB' # Total amount of memory per job
    processes: 10 # Number of Python processes per job
    # interface: null # ib0 doesn't seem to be working on Hobart
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/hobart/logs'
    local-directory: '/scratch/cluster/${USER}/dask/hobart/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60

izumi:
  pbs:
    name: dask-worker-izumi
    cores: 10 # Total number of cores per job
    memory: '96GB' # Total amount of memory per job
    processes: 10 # Number of Python processes per job
    # interface: null # ib0 doesn't seem to be working on Izumi
    queue: medium
    walltime: '08:00:00'
    resource-spec: nodes=1:ppn=48
    log-directory: '/scratch/cluster/${USER}/dask/izumi/logs'
    local-directory: '/scratch/cluster/${USER}/dask/izumi/local-dir'
    job-extra: ['-r n']
    env-extra: []
    death-timeout: 60
```
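Since this file lives in dask's configuration directory, dask merges it into its global configuration, so the loaded defaults can be inspected programmatically. A minimal sketch, assuming the key layout shown above has been loaded:

```python
>>> import ncar_jobqueue  # importing ensures the ncar-jobqueue defaults are set up
>>> import dask
>>> # Keys mirror the YAML layout: <cluster>.<scheduler>.<setting>
>>> dask.config.get('cheyenne.pbs.cores')
18
>>> dask.config.get('casper-dav.pbs.queue')
'casper'
```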
**Note:**

- To configure a default project account that is used by `dask-jobqueue` when submitting batch jobs, uncomment the `project` key/line in `~/.config/dask/ncar-jobqueue.yaml` and set it to an appropriate value (see the sketch after this note).
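For example, with a hypothetical project code `UABC0001`, the relevant part of the file would look like this:

```yaml
cheyenne:
  pbs:
    project: UABC0001 # hypothetical project code; substitute your own allocation
    name: dask-worker-cheyenne
    # ... remaining keys unchanged ...
```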
## Usage

**Note:**

⚠️ Online documentation for `dask-jobqueue` is available [here](https://jobqueue.dask.org/). ⚠️
### Casper

```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
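Since `NCARCluster` resolves to a `dask-jobqueue` `PBSCluster` here, the batch script submitted for each worker job can be inspected with `job_script()`. This is a quick sanity check that the queue, resources, and project were picked up from the configuration; the output below is an abbreviated, illustrative sketch:

```python
>>> print(cluster.job_script())
#!/usr/bin/env bash
#PBS -N dask-worker-casper-dav
#PBS -q casper
#PBS -A XXXXXXXX
#PBS -l select=1:ncpus=1:mem=25GB
#PBS -l walltime=01:00:00
...
```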
### Cheyenne

```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
### Hobart

```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
### Izumi

```python
>>> from ncar_jobqueue import NCARCluster
>>> from dask.distributed import Client
>>> cluster = NCARCluster(project='XXXXXXXX')
>>> cluster
PBSCluster(0f23b4bf, 'tcp://xx.xxx.x.x:xxxx', workers=0, threads=0, memory=0 B)
>>> cluster.scale(jobs=2)
>>> client = Client(cluster)
```
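On any of the machines above, a fixed `cluster.scale(jobs=n)` can be replaced with adaptive scaling, which lets the scheduler grow and shrink the pool of batch jobs with the workload. A minimal sketch; note that the `minimum`/`maximum` bounds count workers:

```python
>>> # Scale between 0 and 8 workers depending on load, instead of a fixed size.
>>> cluster.adapt(minimum=0, maximum=8)
>>> # ... submit work through `client` ...
>>> # Release the batch jobs when done.
>>> client.close()
>>> cluster.close()
```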
### Non-NCAR machines

On non-NCAR machines, `ncar-jobqueue` will warn the user, and it will use `distributed.LocalCluster`:

```python
>>> from ncar_jobqueue import NCARCluster
.../ncar_jobqueue/cluster.py:17: UserWarning: Unable to determine which NCAR cluster you are running on... Returning a `distributed.LocalCluster` class.
  warn(message)
>>> from dask.distributed import Client
>>> cluster = NCARCluster()
>>> cluster
LocalCluster(3a7dd0f6, 'tcp://127.0.0.1:64184', workers=4, threads=8, memory=17.18 GB)
```
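Whichever cluster class is returned, a small throwaway computation is a handy smoke test that workers are up and reachable. A sketch using `dask.array`; the printed value is illustrative and should be close to 0.5:

```python
>>> import dask.array as da
>>> client = Client(cluster)
>>> # Mean of uniform random samples; the computation runs on the cluster's workers.
>>> x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
>>> x.mean().compute()
0.4999908...
```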
## Download files
### ncar-jobqueue-2021.4.14.tar.gz (source distribution)

File metadata:

- Size: 16.8 kB
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | c5e6e61f7acb013a9714257ee32f68073e4d803ee484312cabe4c5d38599caf9 |
| MD5 | 022539dbd7ad7322189beb79a406d97b |
| BLAKE2b-256 | 618d5cdc8f5757071e77d081d605c0129022c51f625315f1ee98d654c69210e0 |
### ncar_jobqueue-2021.4.14-py3-none-any.whl (built distribution, Python 3)

File metadata:

- Size: 12.0 kB
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5ffba69c025fb9062398bae75dde1a0ce87f166c428baf4503f0d85c485e7bbf |
| MD5 | 53de845d5e53a0b94b6b0aebf6aed41a |
| BLAKE2b-256 | 240a02f0c21a1476046196d3aa05afcf76d641f20add1a6bb144326f664aa0fa |