NVIDIA cuSPARSELt

Project description

NVIDIA cuSPARSELt is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a sparse matrix:

\begin{equation*} D = Activation(\alpha op(A) \cdot op(B) + \beta op(C) + bias) \cdot scale \end{equation*}

where \(op(A)\) and \(op(B)\) refer to in-place operations such as transpose/non-transpose, and \(\alpha\), \(\beta\), and \(scale\) are scalars.

The cuSPARSELt APIs allow flexibility in the algorithm/operation selection, epilogue, and matrix characteristics, including memory layout, alignment, and data types.

Download: developer.nvidia.com/cusparselt/downloads

Provide Feedback: Math-Libs-Feedback@nvidia.com

Examples: cuSPARSELt Example 1, cuSPARSELt Example 2

Key Features

  • NVIDIA Sparse MMA tensor core support

  • Mixed-precision computation support:

    Input A/B   Input C   Output D   Compute
    FP32        FP32      FP32       FP32
    FP16        FP16      FP16       FP32
    FP16        FP16      FP16       FP16
    BF16        BF16      BF16       FP32
    INT8        INT8      INT8       INT32
    INT8        INT32     INT32      INT32
    INT8        FP16      FP16       INT32
    INT8        BF16      BF16       INT32
    E4M3        FP16      E4M3       FP32
    E4M3        BF16      E4M3       FP32
    E4M3        FP16      FP16       FP32
    E4M3        BF16      BF16       FP32
    E4M3        FP32      FP32       FP32
    E5M2        FP16      E5M2       FP32
    E5M2        BF16      E5M2       FP32
    E5M2        FP16      FP16       FP32
    E5M2        BF16      BF16       FP32
    E5M2        FP32      FP32       FP32

  • Matrix pruning and compression functionalities

  • Activation functions, bias vector, and output scaling

  • Batched computation (multiple matrices in a single run)

  • GEMM Split-K mode

  • Auto-tuning functionality (see cusparseLtMatmulSearch())

  • NVTX ranging and logging functionalities
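The pruning feature listed above can be illustrated in plain Python. The sketch below assumes the 2:4 structured-sparsity pattern associated with sparse tensor cores (at most two nonzero values in every group of four consecutive elements) and keeps the two largest-magnitude values per group. The helper name `prune_2_4` is hypothetical, and this is not the library's own pruning routine or its compressed storage format.

```python
def prune_2_4(row):
    """Zero out the two smallest-magnitude values in each group of four.

    Illustrative only: assumes a 2:4 pattern; ties keep the earlier element.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0 for i, v in enumerate(group))
    return pruned


print(prune_2_4([1, -5, 3, 0.5, 2, 2, 2, 2]))
# [0, -5, 3, 0, 2, 2, 0, 0]
```

After pruning, every group of four holds exactly two kept values, so a compressed representation only needs to store half of the elements plus a small amount of position metadata.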

Support

  • Supported SM Architectures: SM 8.0, SM 8.6, SM 8.9, SM 9.0

  • Supported CPU architectures and operating systems:

    OS        CPU archs
    Windows   x86_64
    Linux     x86_64, Arm64
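As a convenience, the support matrix above can be encoded as a small lookup that a setup script could consult before attempting to use the library. This is a sketch using only the Python standard library; the names `is_supported_host` and `is_supported_sm` are hypothetical, and the alias table (for example `AMD64` and `aarch64`) reflects common values returned by `platform.machine()` rather than anything stated by the project.

```python
import platform

# Values taken from the Support section above.
SUPPORTED_SM = {(8, 0), (8, 6), (8, 9), (9, 0)}
SUPPORTED_HOSTS = {"Windows": {"x86_64"}, "Linux": {"x86_64", "arm64"}}

# Assumption: common spellings reported by platform.machine().
_MACHINE_ALIASES = {"amd64": "x86_64", "aarch64": "arm64"}


def is_supported_host(system=None, machine=None):
    """Return True if the OS / CPU pair appears in the support table."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    machine = _MACHINE_ALIASES.get(machine.lower(), machine.lower())
    return machine in SUPPORTED_HOSTS.get(system, set())


def is_supported_sm(major, minor):
    """Return True if the GPU compute capability is one of the listed SMs."""
    return (major, minor) in SUPPORTED_SM


print(is_supported_host("Linux", "aarch64"))  # True
print(is_supported_sm(7, 5))                  # False
```

The GPU compute capability itself has to come from elsewhere (for example a CUDA runtime query); the helper only checks a value that has already been obtained.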

Documentation

Please refer to https://docs.nvidia.com/cuda/cusparselt/index.html for the cuSPARSELt documentation.

Installation

The cuSPARSELt wheel can be installed as follows:

pip install nvidia-cusparselt-cuXX

where XX is the CUDA major version (currently, only CUDA 12 is supported).
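After installation, one way to confirm that the wheel is present is to query the package metadata from Python. The sketch below relies only on the standard library; the helper name `installed_cusparselt_version` is hypothetical, and it assumes the distribution name follows the `nvidia-cusparselt-cuXX` pattern shown above.

```python
from importlib import metadata


def installed_cusparselt_version(cuda_major=12):
    """Return the installed wheel version as a string, or None if absent."""
    name = f"nvidia-cusparselt-cu{cuda_major}"
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


version = installed_cusparselt_version()
if version is None:
    print("nvidia-cusparselt-cu12 is not installed in this environment")
else:
    print(f"nvidia-cusparselt-cu12 {version} is installed")
```

This only checks that pip registered the distribution; it does not load the shared library or verify that a compatible GPU and driver are available.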

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

nvidia_cusparselt_cu12-0.6.3-py3-none-win_amd64.whl (155.6 MB)

Python 3, Windows x86-64

File details

Details for the file nvidia_cusparselt_cu12-0.6.3-py3-none-win_amd64.whl.

File hashes

Hashes for nvidia_cusparselt_cu12-0.6.3-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 3b325bcbd9b754ba43df5a311488fca11a6b5dc3d11df4d190c000cf1a0765c7
MD5 9aee2464322ac34bc24cf9cdd49e27e9
BLAKE2b-256 463e9e1e394a02a06f694be2c97bbe47288bb7c90ea84c7e9cf88f7b28afe165

File details

Details for the file nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl.

File hashes

Hashes for nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e5c8a26c36445dd2e6812f1177978a24e2d37cacce7e090f297a688d1ec44f46
MD5 7f9f32cf1080300ace5f4fe061d2e3dd
BLAKE2b-256 3b9a72ef35b399b0e183bc2e8f6f558036922d453c4d8237dab26c666a04244b

File details

Details for the file nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_aarch64.whl.

File hashes

Hashes for nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 8371549623ba601a06322af2133c4a44350575f5a3108fb75f3ef20b822ad5f1
MD5 414d6b93245bd57f24646f4a19f59669
BLAKE2b-256 62da4de092c61c6dea1fc9c936e69308a02531d122e12f1f649825934ad651b5
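
The SHA256 digests listed above can be used to check a downloaded wheel before installing it. The following is a minimal sketch with Python's standard `hashlib` module; the function names `sha256_of` and `verify_wheel` are hypothetical, and the file path in the usage comment is a placeholder for wherever the wheel was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected_sha256):
    """Return True if the file's SHA256 matches the published digest."""
    return sha256_of(path) == expected_sha256.strip().lower()


# Usage (placeholder path; digest copied from the Windows wheel entry above):
# ok = verify_wheel(
#     "nvidia_cusparselt_cu12-0.6.3-py3-none-win_amd64.whl",
#     "3b325bcbd9b754ba43df5a311488fca11a6b5dc3d11df4d190c000cf1a0765c7",
# )
```

Reading in chunks keeps memory use flat, which matters for wheels of this size (the Windows build is listed at 155.6 MB).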
