
hpc05


🖥 ipyparallel.Client package for a PBS or SLURM cluster with a headnode.

A script that connects over ssh to a PBS or SLURM cluster with a headnode. Since ipyparallel doesn't cull engines when they are inactive, and people tend to forget to qdel their jobs, this package automatically kills the ipengines after a set timeout (default: 15 min). Note that it works not only for the hpc05 cluster at TU Delft but for other clusters as well.

Installation

First install this package on both your machine and the cluster.

conda config --add channels conda-forge
conda install hpc05

or using pip

pip install hpc05

Make sure you can connect over ssh without a password by copying your ssh key to the cluster:

ssh-copy-id hpc05
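
If the cluster is not already reachable as hpc05, you can add a host alias to your ~/.ssh/config; the hostname and username below are placeholders for your own setup:

# Placeholder values; substitute your cluster's address and account.
Host hpc05
    HostName hpc05.example.org
    User your_username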

Set up a profile

You need an ipyparallel profile on your cluster, which can be created with one of the following commands:

import hpc05
# for PBS use
hpc05.create_remote_pbs_profile(profile='pbs', hostname='hpc05')  # run from your local machine
# or
hpc05.create_local_pbs_profile(profile='pbs')  # run on the cluster itself

# for SLURM use
hpc05.create_remote_slurm_profile(profile='slurm', hostname='hpc05')  # run from your local machine
# or
hpc05.create_local_slurm_profile(profile='slurm')  # run on the cluster itself
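
To verify that the profile was created, you can list all IPython profiles on the cluster:

ipython profile list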

Start ipcluster and connect (via ssh)

To start and connect to an ipcluster, run the following (if anything goes wrong, the error messages contain instructions):

client, dview, lview = hpc05.start_remote_and_connect(
	n=100, profile='pbs', hostname='hpc05', folder='~/your_folder_on_the_cluster/')

This is equivalent to the following three commands:

# 0. Killing and removing files of an old ipcluster (this is optional with
#    the `start_remote_and_connect` function, use the `kill_old_ipcluster` argument)
hpc05.kill_remote_ipcluster(hostname='hpc05')

# 1. starting an `ipcluster`, similar to running
#    `ipcluster start --n=100 --profile=pbs` on the cluster headnode.
hpc05.start_remote_ipcluster(n=100, profile='pbs', hostname='hpc05')

# 2. Connecting to the started ipcluster and adding a folder to the cluster's `PATH`
client, dview, lview = hpc05.connect_ipcluster(
	n=100, profile='pbs', hostname='hpc05', folder='~/your_folder_on_the_cluster/')
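
The returned objects are standard ipyparallel handles: client is a Client, dview a DirectView over all engines, and lview a LoadBalancedView. A minimal sketch of submitting work through the load-balanced view (the function and inputs are purely illustrative):

# Map a function over the engines; `map_async` returns an AsyncMapResult.
result = lview.map_async(lambda x: x ** 2, range(10))
print(result.result())  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]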

Start ipcluster and connect (on cluster headnode)

To start and connect to an ipcluster on the headnode itself, run (again, read the error messages if anything fails):

client, dview, lview = hpc05.start_and_connect(
	n=100, profile='pbs', folder='~/your_folder_on_the_cluster/')

This is equivalent to the following three commands:

# 0. Killing and removing files of an old ipcluster (this is optional with
#    the `start_and_connect` function, use the `kill_old_ipcluster` argument)
hpc05.kill_ipcluster()

# 1. starting an `ipcluster`, similar to `ipcluster start --n=100 --profile=pbs`
hpc05.start_ipcluster(n=100, profile='pbs')

# 2. Connecting to the started ipcluster and adding a folder to the cluster's `PATH`
client, dview, lview = hpc05.connect_ipcluster(
	n=100, profile='pbs', folder='~/your_folder_on_the_cluster/')
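
The direct view is handy for pushing shared state to every engine before a computation; a small sketch (the variable name is arbitrary):

# Push a variable to all engines, then read it back from each one.
dview.push({'a': 1}, block=True)
print(dview.pull('a', block=True))  # one entry per engine, e.g. [1, 1, 1, ...]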

Monitor resources

This package can monitor your resources if you start it with hpc05_monitor.start(client); see the following example:

import time
import hpc05_monitor
hpc05_monitor.start(client, interval=5)  # update hpc05_monitor.MAX_USAGE every 'interval' seconds.

while not hpc05_monitor.LATEST_DATA:
    time.sleep(1)

hpc05_monitor.print_usage()  # uses hpc05_monitor.LATEST_DATA by default

hpc05_monitor.print_max_usage()  # uses hpc05_monitor.MAX_USAGE

With output:

 id hostname             date                             CPU% MEM%
 15 node29.q1cluster     2018-09-10T14:25:05.350499       190%   3%
 19 node29.q1cluster     2018-09-10T14:25:04.860693       200%   3%
 26 node29.q1cluster     2018-09-10T14:25:05.324466       200%   3%
 28 node29.q1cluster     2018-09-10T14:25:05.148623       190%   2%
 29 node29.q1cluster     2018-09-10T14:25:04.737664       190%   3%
 ...

Development

We use pre-commit for linting the code, so pip install pre-commit and run

pre-commit install

in the repository.
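
pre-commit is configured through a .pre-commit-config.yaml in the repository root; a minimal sketch of such a file (the hooks below are illustrative, not necessarily the ones this project pins):

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer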
