
A JupyterLab extension for Dask.

Dask JupyterLab Extension

This package provides a JupyterLab extension to manage Dask clusters, as well as embed Dask's dashboard plots directly into JupyterLab panes.

Explanatory Video (5 minutes)

Dask + JupyterLab Screencast

Requirements

  • JupyterLab >= 1.0
  • distributed >= 1.24.1

Installation

To install the Dask JupyterLab extension you will need to have JupyterLab installed. For JupyterLab < 3.0, you will also need Node.js version >= 12. These are available through a variety of sources. One source common to Python users is the conda package manager.

conda install jupyterlab
conda install -c conda-forge nodejs

JupyterLab 3.0 or greater

You should be able to install this extension with pip or conda, and start using it immediately, e.g.

pip install dask-labextension

JupyterLab 2.x

This extension includes both client-side and server-side components. Prior to JupyterLab 3.0 these needed to be installed separately, with node available on the machine.

The server-side component can be installed via pip or conda-forge:

pip install dask_labextension
conda install -c conda-forge dask-labextension

You then build the client-side extension into JupyterLab with:

jupyter labextension install dask-labextension

If you are running Notebook 5.2 or earlier, enable the server extension by running

jupyter serverextension enable --py --sys-prefix dask_labextension

Configuration of Dask cluster management

This extension can launch and manage several kinds of Dask clusters, including local clusters and Kubernetes clusters. Options for how to launch these clusters are set via the Dask configuration system, typically a .yml file on disk.

By default the extension launches a LocalCluster, for which the configuration is:

labextension:
  factory:
    module: 'dask.distributed'
    class: 'LocalCluster'
    args: []
    kwargs: {}
  default:
    workers: null
    adapt:
      null
      # minimum: 0
      # maximum: 10
  initial:
    []
    # - name: "My Big Cluster"
    #   workers: 100
    # - name: "Adaptive Cluster"
    #   adapt:
    #     minimum: 0
    #     maximum: 50

In this configuration, factory gives the module, class name, and arguments needed to create the cluster. The default key describes the initial number of workers for the cluster, as well as whether it is adaptive. The initial key gives a list of initial clusters to start upon launch of the notebook server.
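
As a rough illustration of how these keys could be consumed, the following Python sketch builds an object from a factory mapping and then applies default-style scaling settings to it. It is not the extension's actual implementation: build_from_factory and apply_scaling are hypothetical helper names, and the rule that a non-null adapt entry takes precedence over workers is an assumption. The scale and adapt calls follow the interface that Dask cluster objects expose.

```python
import importlib


def build_from_factory(factory):
    """Import factory['module'] and instantiate factory['class'].

    Hypothetical helper; 'args' and 'kwargs' are passed to the constructor.
    """
    module = importlib.import_module(factory["module"])
    cls = getattr(module, factory["class"])
    return cls(*factory.get("args", []), **factory.get("kwargs", {}))


def apply_scaling(cluster, spec):
    """Apply a 'default'-style mapping to a cluster-like object.

    Assumption: a non-null 'adapt' entry wins over 'workers'.
    Returns a short label describing what was done.
    """
    adapt = spec.get("adapt")
    workers = spec.get("workers")
    if adapt:
        cluster.adapt(**adapt)  # e.g. minimum=0, maximum=10
        return "adaptive"
    if workers is not None:
        cluster.scale(workers)
        return "fixed"
    return "unchanged"
```

With the default configuration shown above, build_from_factory would import dask.distributed and return a LocalCluster (provided distributed is installed), and apply_scaling would leave it unchanged because both workers and adapt are null.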

In addition to LocalCluster, this extension has been used to launch several other Dask cluster objects, a few examples of which are:

  • A SLURM cluster, using
labextension:
  factory:
    module: 'dask_jobqueue'
    class: 'SLURMCluster'
    args: []
    kwargs: {}
  • A PBS cluster, using
labextension:
  factory:
    module: 'dask_jobqueue'
    class: 'PBSCluster'
    args: []
    kwargs: {}
  • A Kubernetes cluster, using
labextension:
  factory:
    module: 'dask_kubernetes'
    class: 'KubeCluster'
    args: []
    kwargs: {}
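
Each of these factories only works if the named module is installed in the same environment as the Jupyter server. A minimal Python sketch for checking this ahead of time follows; factory_module_available is a hypothetical helper name, not part of the extension.

```python
import importlib.util


def factory_module_available(module_name):
    """Return True if module_name can be located on the import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (for example 'dask' when looking up
        # 'dask.distributed') raises ModuleNotFoundError, which is a
        # subclass of ImportError.
        return False


# Report on the modules named in the example configurations.
for name in ("dask_jobqueue", "dask_kubernetes"):
    status = "installed" if factory_module_available(name) else "missing"
    print(name, status)
```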

Configuring a default layout

This extension can store a default layout for the Dask dashboard panes, which is useful if you find yourself reaching for the same dashboard charts over and over. You can launch the default layout via the command palette, or by going to the File menu and choosing "Launch Dask Dashboard Layout".

Default layouts can be configured via the JupyterLab config system (either using the JSON editor or the user interface). Specify a layout by writing a JSON object keyed by the individual charts you would like to open. Each chart is opened with a mode, and a ref. mode refers to how the chart is to be added to the workspace. For example, if you want to split a panel and add the new one to the right, choose split-right. Other options are split-top, split-bottom, split-left, tab-after, and tab-before. ref refers to the panel to which mode is applied, and might be the names of other dashboard panels. If ref is null, the panel in question is added at the top of the layout hierarchy.

A concrete example of a default layout is

{
  "individual-task-stream": {
    "mode": "split-right",
    "ref": null
  },
  "individual-workers-memory": {
    "mode": "split-bottom",
    "ref": "individual-task-stream"
  },
  "individual-progress": {
    "mode": "split-right",
    "ref": "individual-workers-memory"
  }
}

which adds the task stream to the right of the workspace, then adds the worker memory chart below the task stream, then adds the progress chart to the right of the worker memory chart.
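
Before pasting a layout into the settings editor, it can help to sanity-check it. The Python sketch below checks each entry against the modes listed above and flags any ref that is neither null nor another chart in the same layout. check_layout is a hypothetical helper, and requiring ref to name a chart in the same layout is a stricter assumption than the extension itself may enforce.

```python
import json

# The modes listed in the description above.
VALID_MODES = {
    "split-right", "split-top", "split-bottom",
    "split-left", "tab-after", "tab-before",
}


def check_layout(layout):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    for chart, spec in layout.items():
        mode = spec.get("mode")
        ref = spec.get("ref")
        if mode not in VALID_MODES:
            problems.append(f"{chart}: unknown mode {mode!r}")
        # Assumption: a non-null ref must name another chart in this layout.
        if ref is not None and (ref == chart or ref not in layout):
            problems.append(f"{chart}: ref {ref!r} is not another chart in this layout")
    return problems


layout = json.loads("""
{
  "individual-task-stream": {"mode": "split-right", "ref": null},
  "individual-workers-memory": {"mode": "split-bottom", "ref": "individual-task-stream"},
  "individual-progress": {"mode": "split-right", "ref": "individual-workers-memory"}
}
""")
print(check_layout(layout))  # prints [] for the example layout
```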

Development install

As described in the JupyterLab documentation, for a development install of the labextension you can run the following in this directory:

jlpm  # Install npm package dependencies
jlpm build  # Compile the TypeScript sources to Javascript
jupyter labextension develop . --overwrite  # Install the current directory as an extension

To rebuild the extension:

jlpm build

You should then be able to refresh the JupyterLab page and it will pick up the changes to the extension.

To run an editable install of the server extension, run

pip install -e .
jupyter serverextension enable --sys-prefix dask_labextension

Publishing

This extension contains a front-end component written in TypeScript and a back-end component written in Python. The front-end is compiled to Javascript during the build process and is distributed as static assets along with the Python package.

Note: Package versions are not prefixed with the letter v. jlpm adds this prefix to version tags by default, so you will need to disable it:

jlpm config set version-tag-prefix ""

Release process

This requires node, build, and twine to be installed.

jlpm version [--major|--minor|--patch]  # updates package.json and creates git commit and tag
git push upstream main && git push upstream main --tags  # pushes to GitHub
python -m build .  # Build the package
twine upload dist/*  # Upload the package to PyPI

Handling Javascript package version conflicts

Unlike Python packages, Javascript packages can include more than one version of the same dependency. The yarn package manager usually handles this well, but occasionally you might end up with conflicting versions or with unexpected package bloat. You can try to fix this by deduplicating dependencies:

jlpm yarn-deduplicate -s fewer
