A simple wrapper process around cloud service providers to run tools for the RAPIDS Accelerator for Apache Spark.
spark-rapids-user-tools
User tools to help with the adoption, installation, execution, and tuning of RAPIDS Accelerator for Apache Spark.
The wrapper improves the end-user experience in the following areas:
- Qualification: Educate the CPU customer on the cost savings and acceleration potential of the RAPIDS Accelerator for Apache Spark. The output lists the apps recommended for the RAPIDS Accelerator for Apache Spark, with estimated savings and speed-up.
- Bootstrap: Provide optimized RAPIDS Accelerator for Apache Spark configs based on the GPU cluster shape. The output shows the updated Spark config settings on the driver node.
- Tuning: Tune RAPIDS Accelerator for Apache Spark configs based on an initial job run, using the Spark event logs. The output shows the recommended per-app RAPIDS Accelerator for Apache Spark config settings.
- Diagnostics: Run diagnostic functions to validate a Dataproc environment with the RAPIDS Accelerator for Apache Spark, to make sure the cluster is healthy and ready for Spark jobs.
Getting started
Set up a Python environment with a version between 3.8 and 3.10
- Run the project in a virtual environment.
  $ python -m venv .venv
  $ source .venv/bin/activate
- Install spark-rapids-user-tools
  - Using released package.
    $ pip install spark-rapids-user-tools
  - Install from source.
    $ pip install -e .
    Note that you can also use the optional test extra to install the dependencies required to run the unit tests:
    $ pip install -e '.[test]'
  - Using wheel package built from the repo (see the build steps below).
    $ pip install <wheel-file>
- Make sure to install CSP SDK if you plan to run the tool wrapper.
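The Python version requirement from the first step can be checked before creating the virtual environment. The snippet below is a minimal sketch and is not part of spark-rapids-user-tools: the helper name is_supported_python is hypothetical, and it assumes the stated range includes both 3.8 and 3.10.

```python
import sys

# Supported range as stated in the setup steps. Treating both ends as
# inclusive is an assumption of this sketch.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 10)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair lies within the supported range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside the supported 3.8-3.10 range.")
```

Running the script with the interpreter you intend to use for the virtual environment reports whether that interpreter falls within the range.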
Building from source
Set up a Python environment similar to the steps above.
- Run the provided build script to compile the project.
  $ ./build.sh
- Fat Mode: Similar to a fat jar in Java, this mode covers the case where web access is not available to download resources that have URL paths (http/https). The command builds the tools jar file, downloads the necessary dependencies, and packages them with the source code into a single 'wheel' file.
  $ ./build.sh fat
Usage and supported platforms
Please refer to the spark-rapids-user-tools guide for details on how to use the tools and on the supported platforms.
What's new
Please refer to CHANGELOG.md for our latest changes.
File details
Details for the file spark_rapids_user_tools-24.2.1-227_518d17a-py3-none-any.whl.
File metadata
- Download URL: spark_rapids_user_tools-24.2.1-227_518d17a-py3-none-any.whl
- Upload date:
- Size: 232.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | e00032b434b84c7647459a0a60e2b84592b230126a158b205dbb65ad70287245
MD5 | b95c64d6f108c7490103aa44ff94633c
BLAKE2b-256 | 0b1971d8970579ab494b747d49c3b3ac8d73190ea8508ebf156cc70cf9947dcf
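A downloaded wheel can be compared against the SHA256 digest listed above before installing it. The following is a minimal sketch using only the Python standard library. The function names sha256_of_file and matches_expected are hypothetical and are not provided by spark-rapids-user-tools.

```python
import hashlib

# SHA256 digest copied from the file hashes table above.
EXPECTED_SHA256 = "e00032b434b84c7647459a0a60e2b84592b230126a158b205dbb65ad70287245"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, calling matches_expected("spark_rapids_user_tools-24.2.1-227_518d17a-py3-none-any.whl") from the download directory returns True only if the local file has the listed digest. Reading in chunks keeps memory use flat regardless of file size.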