Backend implementation for running MLFlow projects on Slurm

MLFlow-Slurm

Backend for executing MLFlow projects on the Slurm batch system.

Usage

Install this package in the environment from which you submit jobs. If your jobs themselves submit further jobs, make sure this package is also listed in your conda or pip environment.

Pass slurm as the --backend when you run the project, and include a JSON config file to control how the batch script is constructed:

mlflow run --backend slurm \
          --backend-config slurm_config.json \
          examples/sklearn_elasticnet_wine

The backend generates a batch script named after the run ID and submits it with the Slurm sbatch command. It then tags the run with the Slurm JobID.
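
The generate-and-submit flow described above can be pictured with a small sketch. This is not the package's actual implementation: the function names, the mapping from config properties to sbatch options, and the script layout are assumptions made for illustration. The sbatch options used (--partition, --account, --gpus-per-node, --mem, --time) and the "Submitted batch job <id>" message are standard Slurm.

```python
import re


def render_sbatch_script(config, command):
    """Return batch-script text that runs `command` with settings from `config`.

    Hypothetical helper: the real backend may lay out its script differently.
    """
    # Assumed mapping from config properties to sbatch long options.
    option_for = {
        "partition": "--partition",
        "account": "--account",
        "gpus_per_node": "--gpus-per-node",
        "mem": "--mem",
        "time": "--time",
    }
    lines = ["#!/bin/bash"]
    for key, option in option_for.items():
        if key in config:
            lines.append(f"#SBATCH {option}={config[key]}")
    # Load each requested environment module before the job command runs.
    for module in config.get("modules", []):
        lines.append(f"module load {module}")
    lines.append(command)
    return "\n".join(lines) + "\n"


def parse_job_id(sbatch_output):
    """Extract the job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return match.group(1) if match else None
```

With a script written to disk, something like subprocess.run(["sbatch", script_path], capture_output=True, text=True) would submit it, and the ID returned by parse_job_id on its standard output is the kind of value the backend records as a tag on the run.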

Configure Jobs

You can set values in a JSON file to control job submission. The supported properties in this file are:

Config File Setting    Use
partition              Which Slurm partition should the job run in?
account                What account name to run under
gpus_per_node          On GPU partitions, how many GPUs to allocate per node
mem                    Amount of memory to allocate to CPU jobs
modules                List of modules to load before starting job
time                   Max CPU time job may run
sbatch-script-file     Name of batch file to be produced. Leave blank to have service generate a script file name based on the run ID
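
As a minimal sketch, a config file using these properties could be produced from Python as follows. The property names come from the table above. The helper name write_slurm_config and every value shown (partition and account names, the memory and time formats, the module name) are placeholders assumed for illustration, so substitute whatever your cluster expects.

```python
import json
from pathlib import Path

# Property names taken from the table of supported settings.
SUPPORTED_KEYS = {
    "partition",
    "account",
    "gpus_per_node",
    "mem",
    "modules",
    "time",
    "sbatch-script-file",
}


def write_slurm_config(path, settings):
    """Write `settings` to `path` as JSON, rejecting unlisted property names.

    Hypothetical helper, not part of the package.
    """
    unknown = set(settings) - SUPPORTED_KEYS
    if unknown:
        raise ValueError(f"Unsupported config keys: {sorted(unknown)}")
    Path(path).write_text(json.dumps(settings, indent=2) + "\n")
    return path


# Placeholder values: adjust all of them for your own cluster.
example_config = {
    "partition": "gpu",        # hypothetical partition name
    "account": "my-account",   # hypothetical account name
    "gpus_per_node": 1,
    "mem": "16g",              # assumed Slurm-style memory string
    "modules": ["anaconda3"],  # hypothetical module name
    "time": "01:00:00",        # assumed HH:MM:SS format
}
```

Calling write_slurm_config("slurm_config.json", example_config) would produce a file that can be passed to --backend-config. Leaving out sbatch-script-file lets the backend choose the script name from the run ID.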

Development

The Slurm Docker deployment is handy for testing and development. You can start up a Slurm environment with the included docker-compose file.
