###############################################################
cuTENSOR: A High-Performance CUDA Library For Tensor Primitives
###############################################################
`cuTENSOR <https://developer.nvidia.com/cutensor>`_ is a high-performance CUDA library for tensor primitives.
Key Features
============

* Extensive mixed-precision support:

  * FP64 inputs with FP32 compute.
  * FP32 inputs with FP16, BF16, or TF32 compute.
  * Complex-times-real operations.
  * Conjugate (without transpose) support.

* Support for up to 64-dimensional tensors.
* Arbitrary data layouts.
* Trivially serializable data structures.
* Main computational routines:

  * Direct (i.e., transpose-free) tensor contractions:

    * Support for just-in-time compilation of dedicated kernels.

  * Tensor reductions (including partial reductions).
  * Element-wise tensor operations:

    * Support for various activation functions.
    * Support for padding of the output tensor.
    * Arbitrary tensor permutations.
    * Conversion between different data types.
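To make the headline routine concrete, the sketch below illustrates what a direct tensor contraction computes, using NumPy's ``einsum`` purely as a reference implementation of the math; it does not call cuTENSOR, and the mode labels and shapes are arbitrary examples.

.. code-block:: python

    import numpy as np

    # A tensor contraction sums over the modes shared by two operands.
    # Here: C[m, n] = sum_{k, a} A[m, k, a] * B[a, k, n]
    # cuTENSOR performs such contractions directly, i.e. without first
    # transposing the operands into a matrix-multiply-friendly layout.
    A = np.random.rand(4, 5, 6)  # modes m, k, a
    B = np.random.rand(6, 5, 3)  # modes a, k, n

    C = np.einsum("mka,akn->mn", A, B)
    print(C.shape)  # (4, 3)

The same mode-label notation (one letter per tensor dimension) is how contractions are typically specified; the library then chooses or JIT-compiles a kernel for that particular index pattern.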
Documentation
=============

Please refer to https://docs.nvidia.com/cuda/cutensor/index.html for the cuTENSOR documentation.
Installation
============

The cuTENSOR wheel can be installed as follows:

.. code-block:: bash

    pip install cutensor-cuXX

where ``XX`` is the CUDA major version (currently CUDA 11 and 12 are supported).
The package ``cutensor`` (without the ``-cuXX`` suffix) is deprecated. If you have ``cutensor`` installed, please remove it prior to installing ``cutensor-cuXX``.