###################################################################################
cuSPARSELt: A High-Performance CUDA Library for Sparse Matrix-Matrix Multiplication
###################################################################################
NVIDIA cuSPARSELt is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a sparse matrix:

.. math::

    D = \text{Activation}(\alpha \, op(A) \cdot op(B) + \beta \, op(C) + bias) \cdot scale

where :math:`op(A)`/:math:`op(B)` refers to in-place operations such as transpose/non-transpose, and :math:`\alpha`, :math:`\beta`, :math:`scale` are scalars.
The cuSPARSELt APIs allow flexibility in the algorithm/operation selection, epilogue, and matrix characteristics, including memory layout, alignment, and data types.
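
In practice, a matmul boils down to: initialize descriptors, build a plan, prune and compress the sparse operand, then execute. The following is a minimal FP16 sketch, not a definitive implementation: it mirrors the matmul sample linked below, assumes the cuSPARSELt 0.5+ signatures, and omits all error checking; ``spmma_example`` and its variable names are illustrative.

.. code-block:: cpp

    #include <cuda_runtime.h>
    #include <cuda_fp16.h>
    #include <cusparseLt.h>

    // Illustrative sketch: D = alpha * A * B + beta * C with a 2:4
    // structured-sparse A (row-major FP16, FP32 accumulation).
    void spmma_example(int64_t m, int64_t n, int64_t k,
                       __half* dA, __half* dB, __half* dC, __half* dD) {
        cusparseLtHandle_t handle;
        cusparseLtInit(&handle);

        // Matrix descriptors: A is structured (50% sparsity), B/C are dense.
        cusparseLtMatDescriptor_t matA, matB, matC;
        cusparseLtStructuredDescriptorInit(&handle, &matA, m, k, /*ld=*/k,
                                           /*alignment=*/16, CUDA_R_16F,
                                           CUSPARSE_ORDER_ROW,
                                           CUSPARSELT_SPARSITY_50_PERCENT);
        cusparseLtDenseDescriptorInit(&handle, &matB, k, n, n, 16,
                                      CUDA_R_16F, CUSPARSE_ORDER_ROW);
        cusparseLtDenseDescriptorInit(&handle, &matC, m, n, n, 16,
                                      CUDA_R_16F, CUSPARSE_ORDER_ROW);

        // Operation descriptor, algorithm selection, and execution plan.
        cusparseLtMatmulDescriptor_t matmul;
        cusparseLtMatmulDescriptorInit(&handle, &matmul,
                                       CUSPARSE_OPERATION_NON_TRANSPOSE,
                                       CUSPARSE_OPERATION_NON_TRANSPOSE,
                                       &matA, &matB, &matC, /*matD=*/&matC,
                                       CUSPARSE_COMPUTE_32F);
        cusparseLtMatmulAlgSelection_t algSel;
        cusparseLtMatmulAlgSelectionInit(&handle, &algSel, &matmul,
                                         CUSPARSELT_MATMUL_ALG_DEFAULT);
        cusparseLtMatmulPlan_t plan;
        cusparseLtMatmulPlanInit(&handle, &plan, &matmul, &algSel);

        // Prune A in place to the 2:4 pattern, then compress it.
        cusparseLtSpMMAPrune(&handle, &matmul, dA, dA,
                             CUSPARSELT_PRUNE_SPMMA_TILE, /*stream=*/nullptr);
        size_t compressedSize, compressBufferSize;
        cusparseLtSpMMACompressedSize(&handle, &plan, &compressedSize,
                                      &compressBufferSize);
        void *dAcomp, *dCompBuf;
        cudaMalloc(&dAcomp, compressedSize);
        cudaMalloc(&dCompBuf, compressBufferSize);
        cusparseLtSpMMACompress(&handle, &plan, dA, dAcomp, dCompBuf, nullptr);

        // Execute the sparse x dense matmul.
        float alpha = 1.0f, beta = 0.0f;
        size_t workspaceSize;
        cusparseLtMatmulGetWorkspace(&handle, &plan, &workspaceSize);
        void* dWorkspace;
        cudaMalloc(&dWorkspace, workspaceSize);
        cusparseLtMatmul(&handle, &plan, &alpha, dAcomp, dB, &beta, dC, dD,
                         dWorkspace, /*streams=*/nullptr, /*numStreams=*/0);

        // Cleanup.
        cusparseLtMatDescriptorDestroy(&matA);
        cusparseLtMatDescriptorDestroy(&matB);
        cusparseLtMatDescriptorDestroy(&matC);
        cusparseLtMatmulPlanDestroy(&plan);
        cusparseLtDestroy(&handle);
        cudaFree(dAcomp); cudaFree(dCompBuf); cudaFree(dWorkspace);
    }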

Download: `developer.nvidia.com/cusparselt/downloads <https://developer.nvidia.com/cusparselt/downloads>`_

Provide Feedback: `Math-Libs-Feedback@nvidia.com <mailto:Math-Libs-Feedback@nvidia.com?subject=cuSPARSELt-Feedback>`_

Examples: `cuSPARSELt Example 1 <https://github.com/NVIDIA/CUDALibrarySamples/tree/master/cuSPARSELt/matmul>`_, `cuSPARSELt Example 2 <https://github.com/NVIDIA/CUDALibrarySamples/tree/master/cuSPARSELt/matmul_advanced>`_

Blog posts:

- `Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt <https://developer.nvidia.com/blog/exploiting-ampere-structured-sparsity-with-cusparselt/>`_
- `Structured Sparsity in the NVIDIA Ampere Architecture and Applications in Search Engines <https://developer.nvidia.com/blog/structured-sparsity-in-the-nvidia-ampere-architecture-and-applications-in-search-engines/>`_
- `Making the Most of Structured Sparsity in the NVIDIA Ampere Architecture <https://www.nvidia.com/en-us/on-demand/session/gtcspring21-s31552/>`_
Key Features
================================================================================

- NVIDIA Sparse MMA tensor core support
- Mixed-precision computation support:

+--------------+----------------+-----------------+-------------+
| Input A/B    | Input C        | Output D        | Compute     |
+==============+================+=================+=============+
| FP32         | FP32           | FP32            | FP32        |
+--------------+----------------+-----------------+-------------+
| FP16         | FP16           | FP16            | FP32        |
|              |                |                 |             |
|              |                |                 | FP16        |
+--------------+----------------+-----------------+-------------+
| BF16         | BF16           | BF16            | FP32        |
+--------------+----------------+-----------------+-------------+
| INT8         | INT8           | INT8            | INT32       |
|              |                |                 |             |
|              | INT32          | INT32           |             |
|              |                |                 |             |
|              | FP16           | FP16            |             |
|              |                |                 |             |
|              | BF16           | BF16            |             |
+--------------+----------------+-----------------+-------------+
| E4M3         | FP16           | E4M3            | FP32        |
|              |                |                 |             |
|              | BF16           | E4M3            |             |
|              |                |                 |             |
|              | FP16           | FP16            |             |
|              |                |                 |             |
|              | BF16           | BF16            |             |
|              |                |                 |             |
|              | FP32           | FP32            |             |
+--------------+----------------+-----------------+-------------+
| E5M2         | FP16           | E5M2            | FP32        |
|              |                |                 |             |
|              | BF16           | E5M2            |             |
|              |                |                 |             |
|              | FP16           | FP16            |             |
|              |                |                 |             |
|              | BF16           | BF16            |             |
|              |                |                 |             |
|              | FP32           | FP32            |             |
+--------------+----------------+-----------------+-------------+
- Matrix pruning and compression functionalities
- Activation functions, bias vector, and output scaling
- Batched computation (multiple matrices in a single run)
- GEMM Split-K mode
- Auto-tuning functionality (see ``cusparseLtMatmulSearch()`` and the sketch after this list)
- NVTX ranging and logging functionalities
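
The epilogue and auto-tuning features compose with the plan shown earlier. Below is a hedged fragment, assuming the handle, descriptors, and device buffers from the previous sketch; ``dBias`` is a hypothetical device pointer to an m-element bias vector, and attribute values follow the public headers.

.. code-block:: cpp

    // Enable a ReLU + bias epilogue on the matmul descriptor.
    // Descriptor attributes must be set before the plan is created.
    int relu = 1;
    cusparseLtMatmulDescSetAttribute(&handle, &matmul,
                                     CUSPARSELT_MATMUL_ACTIVATION_RELU,
                                     &relu, sizeof(relu));
    cusparseLtMatmulDescSetAttribute(&handle, &matmul,
                                     CUSPARSELT_MATMUL_BIAS_POINTER,
                                     &dBias, sizeof(dBias));

    cusparseLtMatmulPlan_t plan;
    cusparseLtMatmulPlanInit(&handle, &plan, &matmul, &algSel);

    // Auto-tuning: same argument list as cusparseLtMatmul(); benchmarks
    // the candidate kernels and stores the fastest configuration in `plan`.
    cusparseLtMatmulSearch(&handle, &plan, &alpha, dAcomp, dB,
                           &beta, dC, dD, dWorkspace,
                           /*streams=*/nullptr, /*numStreams=*/0);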
Support
================================================================================

- Supported SM architectures: SM 8.0, SM 8.6, SM 8.9, SM 9.0, SM 10.0, SM 12.0
- Supported CPU architectures and operating systems:

  +------------+--------------------+
  | OS         | CPU archs          |
  +============+====================+
  | Windows    | x86_64             |
  +------------+--------------------+
  | Linux      | x86_64, Arm64      |
  +------------+--------------------+

Documentation
================================================================================

Please refer to https://docs.nvidia.com/cuda/cusparselt/index.html for the cuSPARSELt documentation.

Installation
================================================================================

The cuSPARSELt wheel can be installed as follows:

.. code-block:: bash

    pip install nvidia-cusparselt-cuXX

where ``XX`` is the CUDA major version (currently only CUDA 12 is supported).
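
For example, on a CUDA 12 system:

.. code-block:: bash

    pip install nvidia-cusparselt-cu12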