
A Python package that enables users to build a custom Singularity image on an HPC cluster


Building a Singularity container for HPC using globus-compute

Context

  • One of the execution configurations of globus-compute requires a registered container, which is spun up to execute the user's function on the HPC.

  • HPC clusters do not run Docker containers (for security reasons, as discussed here) and support only an Apptainer/Singularity image.

  • Installing Apptainer to build the Singularity image locally is not a straightforward process, especially on Windows and macOS systems, as discussed in the documentation.

Using this Python library, the user can specify a custom image specification to build an Apptainer/Singularity image, which is in turn used to run their functions on globus-compute. The library registers the container and returns the container id, which the globus-compute executor uses to execute the user's function.

Prerequisite

A globus-compute-endpoint set up on an HPC cluster.

The following steps can be used to create an endpoint on the NCSA Delta cluster; you can modify the configuration based on your use case:

Note.

For the following to work, we must use globus-compute-sdk version 2.2.0 when setting up the endpoint. It is recommended to use Python 3.9 both for setting up the endpoint and as the client.

  1. Create a conda virtual env. We created a custom-image-builder conda env on the Delta cluster as follows:
conda create --name custom-image-builder-py-3.9 python=3.9

conda activate custom-image-builder-py-3.9

pip install globus-compute-endpoint==2.2.0
  2. Create a globus-compute endpoint:
globus-compute-endpoint configure custom-image-builder

Update the endpoint config at ~/.globus_compute/custom-image-builder/config.py to:

from parsl.addresses import address_by_interface
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

from globus_compute_endpoint.endpoint.utils.config import Config
from globus_compute_endpoint.executors import HighThroughputExecutor


user_opts = {
    'delta': {
        'worker_init': 'conda activate custom-image-builder-py-3.9',
        'scheduler_options': '#SBATCH --account=bbmi-delta-cpu',
    }
}

config = Config(
    executors=[
        HighThroughputExecutor(
            max_workers_per_node=10,
            address=address_by_interface('hsn0'),
            scheduler_mode='soft',
            worker_mode='singularity_reuse',
            container_type='singularity',
            container_cmd_options="",
            provider=SlurmProvider(
                partition='cpu',
                launcher=SrunLauncher(),

                # string to prepend to #SBATCH blocks in the submit
                # script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
                scheduler_options=user_opts['delta']['scheduler_options'],

                # Command to be run before starting a worker, such as:
                # 'module load Anaconda; source activate parsl_env'.
                worker_init=user_opts['delta']['worker_init'],

                # Scale between 0 and 1 blocks with 1 node per block
                nodes_per_block=1,
                init_blocks=0,
                min_blocks=0,
                max_blocks=1,

                # Hold blocks for 30 minutes
                walltime='00:30:00'
            ),
        )
    ],
)
  3. Start the endpoint and store the endpoint id, which is used in the following example:
globus-compute-endpoint start custom-image-builder

Example

Consider the following use case: the user wants to execute a pandas operation on the HPC using globus-compute. They need a Singularity image which will be used by the globus-compute executor. The library can be leveraged as follows:

Locally, you need to install the following packages; you can create a virtual env as follows:

cd example/

python3.9 -m venv venv

source venv/bin/activate

pip install globus-compute-sdk==2.2.0

pip install custom-image-builder

Then run the following script:
from custom_image_builder import build_and_register_container
from globus_compute_sdk import Client, Executor


def transform():
    import pandas as pd
    data = {'Column1': [1, 2, 3],
            'Column2': [4, 5, 6]}

    df = pd.DataFrame(data)

    return "Successfully created df"


def main():
    image_builder_endpoint = "bc106b18-c8b2-45a3-aaf0-75eebc2bef80"
    gcc_client = Client()

    container_id = build_and_register_container(gcc_client=gcc_client,
                                                endpoint_id=image_builder_endpoint,
                                                image_file_name="my-pandas-image",
                                                base_image_type="docker",
                                                base_image="python:3.8",
                                                pip_packages=["pandas"])

    print("The container id is", container_id)

    with Executor(endpoint_id=image_builder_endpoint,
                  container_id=container_id) as ex:
        fut = ex.submit(transform)

    print(fut.result())


if __name__ == "__main__":
    main()
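Note that transform imports pandas inside the function body: globus-compute serializes the function and executes it on a remote worker, so any module it uses must be imported there (and be installed in the container). A minimal standard-library illustration of the same pattern (remote_task is a hypothetical name, not part of this library):

```python
def remote_task():
    # Imports live inside the function body so the function is
    # self-contained when it is serialized and run on a remote worker.
    import json
    return json.dumps({"status": "ok"})

print(remote_task())  # prints {"status": "ok"}
```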

Note.

The Singularity image requires globus-compute-endpoint as one of its packages in order to run the workers in our custom Singularity container; hence, by default, we require Python as part of the image in order to install globus-compute-endpoint.
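As an illustration, the definition file built for the example above would resemble the following sketch (hypothetical; the exact spec the library generates may differ):

```
Bootstrap: docker
From: python:3.8

%post
    pip install globus-compute-endpoint pandas
```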
