A simple wrapper process around cloud service providers to run tools for the RAPIDS Accelerator for Apache Spark.
spark-rapids-user-tools
User tools to help with the adoption, installation, execution, and tuning of RAPIDS Accelerator for Apache Spark.
The wrapper improves end-user experience within the following dimensions:
- Qualification: Educate the CPU customer on the cost savings and acceleration potential of RAPIDS Accelerator for Apache Spark. The output shows a list of apps recommended for RAPIDS Accelerator for Apache Spark with estimated savings and speed-up.
- Bootstrap: Provide optimized RAPIDS Accelerator for Apache Spark configs based on the GPU cluster shape. The output shows the updated Spark config settings on the driver node.
- Tuning: Tune RAPIDS Accelerator for Apache Spark configs based on an initial job run, using its Spark event logs. The output shows recommended per-app RAPIDS Accelerator for Apache Spark config settings.
- Diagnostics: Run diagnostic functions to validate a Dataproc environment running RAPIDS Accelerator for Apache Spark and make sure the cluster is healthy and ready for Spark jobs.
Getting started
Set up a Python environment with a version between 3.8 and 3.10.
- Run the project in a virtual environment.
  $ python -m venv .venv
  $ source .venv/bin/activate
- Install spark-rapids-user-tools in one of the following ways:
  - Using the released package.
    $ pip install spark-rapids-user-tools
  - Installing from source.
    $ pip install -e .
  - Using a wheel package built from the repo (see the build steps below).
    $ pip install <wheel-file>
- Make sure to install the cloud service provider (CSP) SDK if you plan to run the tool wrapper.
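Before creating the virtual environment, it can help to confirm that the interpreter falls in the supported range. The following is a minimal sketch, not part of the package: the helper name `is_supported_python` is hypothetical, and it assumes the stated range of 3.8 to 3.10 is inclusive at both ends.

```python
import sys

# Range taken from the text above; treating both ends as inclusive is an assumption.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 10)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies within the supported range.

    Hypothetical helper for illustration; it is not provided by spark-rapids-user-tools.
    """
    if version is None:
        version = sys.version_info
    # Only major and minor matter, so any patch component is dropped.
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if is_supported_python():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside the supported range 3.8 to 3.10.")
```

Running the script with the interpreter you intend to pass to `python -m venv` reports whether that interpreter is suitable.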
Building from source
Set up a Python environment similar to the steps above.
- Run the provided build script to compile the project.
$ ./build.sh
Usage and supported platforms
Please refer to the spark-rapids-user-tools guide for details on how to use the tools and on the supported platforms.
What's new
Please refer to CHANGELOG.md for our latest changes.
Hashes for spark_rapids_user_tools-23.8.0-173_a051128-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | eeb5a7bf4cc72525fbf2f3bf02b6d619f91a301c866aa5bcb925d8c14fb1db3d
MD5 | 133782d224d53da7cf47c674f7184daf
BLAKE2b-256 | ce0a4d7e4d1e89f480872a300c1a3b3ba7ccfdc8e9cfde5349a8ff4e5659a8b9
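A downloaded wheel can be checked against the SHA256 digest listed above before it is installed. The sketch below uses only the Python standard library; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the path in the usage note assumes the wheel sits in the current directory.

```python
import hashlib
import hmac

# SHA256 digest published above for spark_rapids_user_tools-23.8.0-173_a051128-py3-none-any.whl.
EXPECTED_SHA256 = "eeb5a7bf4cc72525fbf2f3bf02b6d619f91a301c866aa5bcb925d8c14fb1db3d"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    # Lowercase the expected value so an uppercase digest still compares equal.
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```

For example, `matches_expected("spark_rapids_user_tools-23.8.0-173_a051128-py3-none-any.whl")` should return True only when the local file is byte-for-byte identical to the published wheel.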