Neighborhood Attention Extension
Bringing attention to a neighborhood near you!
Website / Releases | Documentation
NATTEN is an open-source project dedicated to providing fast implementations for Neighborhood Attention, a sliding window self-attention mechanism.
If you're not familiar with neighborhood attention, please refer to our papers, or watch our YouTube video from CVPR 2023.
To read more about our GEMM-based and fused neighborhood attention kernels, please refer to our new preprint, Faster Neighborhood Attention.
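If you just want to see what using the library looks like, the sketch below drops a 2D neighborhood attention layer into a PyTorch model. The `NeighborhoodAttention2D` module name, the keyword arguments shown (`dim`, `num_heads`, `kernel_size`, `dilation`), and the channels-last input layout are assumed to match your installed NATTEN version; treat this as a minimal illustration and check the documentation for the exact interface.

```python
# Minimal sketch of a 2D neighborhood attention layer. Assumes the module name,
# keyword arguments, and channels-last layout below match your NATTEN version.
import torch
from natten import NeighborhoodAttention2D

# 7x7 sliding-window self-attention with 4 heads over a 56x56 feature map.
na_layer = NeighborhoodAttention2D(dim=128, num_heads=4, kernel_size=7, dilation=1)

x = torch.randn(1, 56, 56, 128)  # (batch, height, width, channels)
out = na_layer(x)                # output has the same shape as the input
```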
We've released the Fused Neighborhood Attention (FNA) backward kernel and interface, which means you can now train models based on neighborhood attention faster and more efficiently.
FNA can be seen as a generalization of methods such as Flash Attention and FMHA from back-to-back matrix multiplication to back-to-back tensor-tensor contraction, and comes with neighborhood attention masking built in. This accelerates neighborhood attention, a multi-dimensional sliding window attention pattern, by never storing the attention tensor to global memory, which reduces both the global memory footprint and the memory bandwidth bottleneck.
We highly recommend reading the FNA quick start or the Fused vs unfused NA guide before starting to use FNA, since its interface, memory layout, and feature set can differ from those of the unfused ops in NATTEN.
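For orientation only, here is a rough sketch of what a fused neighborhood attention call might look like. The functional name `na2d`, its keyword arguments, and the heads-last `(batch, height, width, heads, head_dim)` layout shown here are assumptions and should be verified against the FNA quick start for your NATTEN version.

```python
# Rough sketch of a fused neighborhood attention (FNA) call. The op name,
# keyword arguments, and heads-last layout are assumptions; check the FNA
# quick start for the exact interface in your NATTEN version.
import torch
from natten.functional import na2d  # assumed fused 2D op

# FNA runs on the CUDA backend, so this sketch requires a CUDA device.
B, H, W, heads, head_dim = 2, 32, 32, 4, 64
q = torch.randn(B, H, W, heads, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = na2d(q, k, v, kernel_size=7, dilation=1)  # same shape as q
```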
NATTEN supports PyTorch version 2.0 and later, and Python versions 3.8 and above. Python 3.12 is only supported with torch >= 2.2.0.
Older NATTEN releases supported Python >= 3.7 and torch >= 1.8.
Please refer to the install instructions to find out whether your operating system and hardware accelerator are compatible with NATTEN.
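As a quick sanity check before consulting the install matrix, you can compare your local Python, torch, and CUDA versions against the requirements above; the short snippet below is one way to do that.

```python
# Check the local environment against NATTEN's stated requirements:
# Python >= 3.8 (3.12 requires torch >= 2.2.0) and torch >= 2.0.
import sys
import torch

print("Python:", sys.version.split()[0])
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available(), "| CUDA version:", torch.version.cuda)
```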
| Problem space | CPU backend | CUDA backend |
|---|---|---|
| 1D | naive | naive, gemm, fna |
| 2D | naive | naive, gemm, fna |
| 3D | naive | naive, fna |
| Problem space | CPU backend | Causal masking | Varying parameters | Relative positional bias | Autograd support |
|---|---|---|---|---|---|
| 1D | naive | ✓ | ✓ | ✓ | Forward and reverse mode |
| 2D | naive | ✓ | ✓ | ✓ | Forward and reverse mode |
| 3D | naive | ✓ | ✓ | ✓ | Forward and reverse mode |
Notes:
| Problem space | CUDA backend | Causal masking | Varying parameters | Relative positional bias | Autograd support | Min. Arch |
|---|---|---|---|---|---|---|
| 1D | naive | ✓ | ✓ | ✓ | Forward and reverse mode | SM35 |
| 2D | naive | ✓ | ✓ | ✓ | Forward and reverse mode | SM35 |
| 3D | naive | ✓ | ✓ | ✓ | Forward and reverse mode | SM35 |
| 1D | gemm | - | - | ✓ | Forward and reverse mode | SM70 |
| 2D | gemm | - | - | ✓ | Forward and reverse mode | SM70 |
| 1D | fna | ✓ | ✓ | ✓ | Reverse mode | SM50 |
| 2D | fna | ✓ | ✓ | ✓ | Reverse mode | SM50 |
| 3D | fna | ✓ | ✓ | ✓ | Reverse mode | SM50 |
Notes:
Features that will likely no longer be worked on or improved:
NATTEN is released under the MIT License.
@inproceedings{hassani2024faster,
title = {Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level},
author = {Ali Hassani and Wen-Mei Hwu and Humphrey Shi},
year = 2024,
booktitle = {Advances in Neural Information Processing Systems},
}
@inproceedings{hassani2023neighborhood,
title = {Neighborhood Attention Transformer},
author = {Ali Hassani and Steven Walton and Jiachen Li and Shen Li and Humphrey Shi},
year = 2023,
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}
}
@misc{hassani2022dilated,
title = {Dilated Neighborhood Attention Transformer},
author = {Ali Hassani and Humphrey Shi},
year = 2022,
url = {https://arxiv.org/abs/2209.15001},
eprint = {2209.15001},
archiveprefix = {arXiv},
primaryclass = {cs.CV}
}
We thank NVIDIA and the CUTLASS project and team for their efforts in creating and open-sourcing CUTLASS. We would also like to thank Haicheng Wu for his valuable feedback and comments, which led to the creation of GEMM-based NA. We also thank Meta and the xFormers team for their FMHA kernel, on which our Fused Neighborhood Attention kernel is based. We thank the PyTorch project and team.