Coiled Runtime
A simple and fast way to get started with Dask
The Coiled Runtime is a conda metapackage which makes it easy to get started with Dask.
Install
coiled-runtime can be installed with:
conda install -c conda-forge coiled-runtime
Nightly builds
coiled-runtime has nightly conda packages for testing purposes. You can install a nightly version of coiled-runtime with:
conda install -c coiled/label/dev -c dask/label/dev coiled-runtime
Build
To build and install coiled-runtime locally, use the following steps:
# Have a local copy of the `coiled-runtime` repository
git clone https://github.com/coiled/coiled-runtime
cd coiled-runtime
# Make sure conda-build is installed
conda install -c conda-forge conda-build
# Build the metapackage
conda build recipe -c conda-forge --output-folder dist/conda --no-anaconda-upload
# Install the built `coiled-runtime` metapackage
conda install -c ./dist/conda/ -c conda-forge coiled-runtime
Test
The coiled-runtime test suite can be run locally with the following steps:
- Ensure your local machine is authenticated to use the dask-engineering Coiled account and the Coiled Dask Engineering AWS S3 account.
- Create a Python environment and install development dependencies as specified in ci/environment.yml.
- (Optional) If testing against an unreleased version of coiled-runtime, create a Coiled software environment with the unreleased coiled-runtime installed and set a local COILED_SOFTWARE_NAME environment variable to the name of the software environment (e.g. export COILED_SOFTWARE_NAME="account/software-name").
- Run tests with python -m pytest tests.
Additionally, tests are automatically run on pull requests to this repository. See the section below on creating pull requests.
Benchmarking
The coiled-runtime test suite contains a series of pytest fixtures which enable benchmarking metrics to be collected and stored for historical and regression analysis. By default, these metrics are not collected and stored, but collection can be enabled by including the --benchmark flag in your pytest invocation.
From a high level, here is how the benchmarking works:
- Data from individual test runs are collected and stored in a local sqlite database. The schema for this database is defined in benchmark_schema.py.
- The local sqlite databases are appended to a global historical record, stored in S3.
- The historical data can be analyzed using any number of tools. dashboard.py creates a set of static HTML documents showing historical data for the tests.
Running the benchmarks locally
You can collect benchmarking data by running pytest with the --benchmark flag.
This will create a local benchmark.db sqlite file in the root of the repository.
If you run a test suite multiple times with benchmarking,
the data will be appended to the database.
You can compare with historical data by downloading the global database from S3 first:
aws s3 cp s3://coiled-runtime-ci/benchmarks/benchmark.db ./benchmark.db
pytest --benchmark
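Once you have a local benchmark.db, you can inspect it with Python's built-in sqlite3 module. This is a minimal sketch: the actual table and column names are defined in benchmark_schema.py, so it lists whatever tables are present rather than assuming any names:

```python
import sqlite3

def list_tables(db_path="benchmark.db"):
    # Inspect a benchmark sqlite database produced by `pytest --benchmark`.
    # Table and column names come from benchmark_schema.py, so we query
    # sqlite_master instead of hard-coding them.
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'"
        )
        return [name for (name,) in rows]
```

From there you can open a cursor on any table of interest, or point pandas at the same file for analysis.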
Changing the benchmark schema
You can add, remove, or modify columns by editing the SQLAlchemy schema in benchmark_schema.py.
However, if you have a database of historical data, then the schemas of the new and old data will not match.
In order to account for this, you must provide a migration for the data and commit it to the repository.
We use alembic to manage SQLAlchemy migrations.
In the simple case of adding or removing a column, you can do the following:
# First, edit the `benchmark_schema.py`
alembic revision --autogenerate -m "Description of migration"
git add alembic/versions/name_of_new_migration.py
git commit -m "Added a new migration"
Migrations are automatically applied in the pytest runs, so you needn't run them yourself.
Using the benchmark fixtures
We have a number of pytest fixtures defined which can be used to automatically track certain metrics in the benchmark database. They are summarized here:
- benchmark_db_engine: The SQLAlchemy engine for the benchmark sqlite database. You can control the database name with the DB_NAME environment variable, which defaults to benchmark.db. Most tests shouldn't need to include this fixture directly.
- benchmark_db_session: The SQLAlchemy session for a given test. Most tests shouldn't need to include this fixture directly.
- test_run_benchmark: The SQLAlchemy ORM object for a given test. By including this fixture in a test (or another fixture that includes it) you trigger the test being written to the benchmark database. This fixture includes data common to all tests, such as the Python version, test name, and test outcome.
- benchmark_time: Include this fixture to measure the wall clock time of a test.
- sample_memory: This fixture yields a context manager which takes a distributed Client object and records peak and average memory usage for the cluster within the context:
from distributed import Client

def test_something(sample_memory):
    with Client() as client:
        with sample_memory(client):
            client.submit(expensive_function)
Writing a new benchmark fixture generally involves:
- Requesting the test_run_benchmark fixture, which yields an ORM object.
- Doing whatever setup work you need.
- yielding to the test.
- Collecting whatever information you need after the test is done.
- Setting the appropriate attributes on the ORM object.
The benchmark_time fixture provides a fairly simple example.
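The steps above can be sketched as a generator-based fixture. This is a hypothetical example: the duration attribute is an illustrative name (the real columns live in benchmark_schema.py), and in practice you would register the function with the @pytest.fixture decorator, which is omitted here so the generator can be exercised directly:

```python
import time

# In conftest.py this would carry the `@pytest.fixture` decorator and
# request the real `test_run_benchmark` fixture.
def wall_time(test_run_benchmark):
    start = time.time()            # setup before the test runs
    yield                          # hand control to the test body
    elapsed = time.time() - start  # collect info after the test is done
    # Record it on the ORM object; `duration` is a hypothetical attribute.
    test_run_benchmark.duration = elapsed
```

A test that requests this fixture would then have its wall clock time written to the benchmark database alongside the common per-test metadata.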
Contribute
This repository uses GitHub Actions secrets for managing authentication tokens used
to access resources like Coiled clusters, S3 buckets, etc. However, because GitHub Actions doesn't
grant access to secrets for forked repositories,
please submit pull requests directly from the coiled/coiled-runtime
repository,
not a personal fork.
Release
To issue a new coiled-runtime release:
- Locally update the coiled-runtime version and package pinnings specified in recipe/meta.yaml.
  - When updating package version pinnings (in particular dask and distributed), confirm there are no reported large-scale stability issues (e.g. deadlocks) or performance regressions on the dask/distributed issue trackers or in offline reports.
- Open a pull request to the coiled-runtime repository titled "Release X.Y.Z" with these changes (where X.Y.Z is replaced with the actual version for the release).
- After all CI builds have passed, the release pull request can be merged.
- Add a new git tag for the release by following the steps below on your local machine:
# Pull in changes from the Release X.Y.Z PR
git checkout main
git pull origin main
# Set release version number
export RELEASE=X.Y.Z
# Create and push release tag
git tag -a $RELEASE -m "Version $RELEASE"
git push origin main --tags
- Update the coiled-runtime package on conda-forge by opening a pull request to the coiled-runtime conda-forge feedstock which updates the coiled-runtime version and package version pinnings.
  - Note that pull requests to conda-forge feedstocks must come from a fork.
  - Reset the build number back to 0 if it isn't already.
  - For more information on updating conda-forge packages, see the conda-forge docs.