A simple wrapper process around cloud service providers to run tools for the RAPIDS Accelerator for Apache Spark.
spark-rapids-user-tools
User tools to help with the adoption, installation, execution, and tuning of RAPIDS Accelerator for Apache Spark.
The wrapper improves the end-user experience along the following dimensions:
- Qualification: Educate the CPU customer on the cost savings and acceleration potential of RAPIDS Accelerator for Apache Spark. The output shows a list of apps recommended for RAPIDS Accelerator for Apache Spark with estimated savings and speed-up.
- Bootstrap: Provide optimized RAPIDS Accelerator for Apache Spark configs based on the GPU cluster shape. The output shows the updated Spark config settings on the driver node.
- Tuning: Tune RAPIDS Accelerator for Apache Spark configs based on an initial job run, using the Spark event logs. The output shows the recommended per-app RAPIDS Accelerator for Apache Spark config settings.
- Diagnostics: Run diagnostic functions to validate a Dataproc environment running RAPIDS Accelerator for Apache Spark and confirm that the cluster is healthy and ready for Spark jobs.
Getting started
Set up a Python environment with a version between 3.8 and 3.10.
- Run the project in a virtual environment.
  $ python -m venv .venv
  $ source .venv/bin/activate
Install spark-rapids-user-tools using one of the following options.
- Using the released package.
  $ pip install spark-rapids-user-tools
- Installing from source.
  $ pip install -e .
  Note that you can also use the optional test extra to install the dependencies required to run the unit tests.
  $ pip install -e '.[test]'
- Using a wheel package built from the repo (see the build steps below).
  $ pip install <wheel-file>
Make sure to install the CSP SDK if you plan to run the tool wrapper.
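As a quick sanity check after the setup steps above, the following is a minimal sketch (not part of the project) that reports whether the active interpreter falls in the stated 3.8 to 3.10 range and whether the package is installed. The helper names `is_supported_python` and `installed_version` are hypothetical, and treating both version bounds as inclusive is an assumption.

```python
import sys
from importlib import metadata

# Bounds taken from the setup step above; treated as inclusive (an assumption).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 10)


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies within the stated range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def installed_version(dist_name="spark-rapids-user-tools"):
    """Return the installed version string of the distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python version supported:", is_supported_python())
    print("Installed spark-rapids-user-tools:", installed_version())
```

Running the script inside the activated virtual environment should report the version that pip installed, or None if the install step has not been run yet.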
Building from source
Set up a Python environment similar to the steps above.
- Run the provided build script to compile the project.
  $ ./build.sh
- Fat mode: Similar to a fat jar in Java, this mode covers the case where web access is not available to download resources referenced by URL paths (http/https). The command builds the tools jar file, downloads the necessary dependencies, and packages them with the source code into a single wheel file.
  $ ./build.sh fat
Usage and supported platforms
Please refer to the spark-rapids-user-tools guide for details on how to use the tools and on the supported platforms.
What's new
Please refer to CHANGELOG.md for our latest changes.
Hashes for spark_rapids_user_tools-23.10.1-194_cb0affc-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | dd28d5d177e5cec9460d6db6c320f9368f5f93f19c5f1b6eeedf9f99806054ec
MD5 | 1584b8a5161f39998955c5f54d291298
BLAKE2b-256 | 0f38066a1c7b6a6a83dc29ad5fd7727fd43865d665ab4196674652f20e247436
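The digests listed above can be compared against a downloaded wheel before installing it. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `file_digest` and `matches_published_digest` are hypothetical, and it assumes that BLAKE2b-256 means BLAKE2b with a 32-byte digest and that the wheel sits in the current directory.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    if algorithm == "blake2b-256":
        # Assumption: BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_digest(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest equals the published hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    # File name and SHA256 value copied from the listing above.
    wheel = Path("spark_rapids_user_tools-23.10.1-194_cb0affc-py3-none-any.whl")
    published_sha256 = "dd28d5d177e5cec9460d6db6c320f9368f5f93f19c5f1b6eeedf9f99806054ec"
    if wheel.exists():
        print("SHA256 matches:", matches_published_digest(wheel, published_sha256))
    else:
        print(f"{wheel} not found; download it first")
```

A mismatch suggests a corrupted or different file, in which case the wheel should be downloaded again rather than installed.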