🚀 Socket Launch Week Day 4:Socket MCP Adds Org Alerts, Threat Feed Review, and Package Inspection.Learn more
Sign In

flash-attn-4

Package Overview
Dependencies
Maintainers
2
Versions
16
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

flash-attn-4

Flash Attention CUTE (CUDA Template Engine) implementation

pipPyPI
Version
4.0.0b18
Weekly downloads
220K
-5.28%
Maintainers
2
Weekly downloads
 

FlashAttention-4 (CuTeDSL)

FlashAttention-4 is a CuTeDSL-based implementation of FlashAttention for Hopper and Blackwell GPUs.

Installation

pip install flash-attn-4

If you're on CUDA 13, install with the cu13 extra for best performance:

pip install "flash-attn-4[cu13]"

Usage

from flash_attn.cute import flash_attn_func, flash_attn_varlen_func

out = flash_attn_func(q, k, v, causal=True)

Development

git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
pip install -e "flash_attn/cute[dev]"       # CUDA 12.x
pip install -e "flash_attn/cute[dev,cu13]"  # CUDA 13.x (e.g. B200)
pytest tests/cute/

FAQs

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts