ScaNN (Scalable Nearest Neighbors) is a method for efficient vector similarity search at scale. This code release implements [1], which includes search space pruning and quantization for Maximum Inner Product Search and also supports other distance functions such as Euclidean distance. The implementation is designed for x86 processors with AVX2 support. ScaNN achieves state-of-the-art performance on ann-benchmarks.com, for example on the glove-100-angular dataset.
ScaNN can be configured to fit datasets with different sizes and distributions. It has both TensorFlow and Python APIs. The library shows strong performance with large datasets [1]. The code is released for research purposes. For more details on the academic description of algorithms, please see [1].
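To make the search problem concrete, the following is a minimal sketch of exact (brute-force) top-k search in pure Python. It is not ScaNN's API or algorithm: ScaNN approximates this result much faster using pruning and quantization. The function names (`dot`, `neg_squared_l2`, `brute_force_search`) are hypothetical and exist only for this illustration.

```python
import heapq


def dot(a, b):
    """Inner product; higher means more similar (Maximum Inner Product Search)."""
    return sum(x * y for x, y in zip(a, b))


def neg_squared_l2(a, b):
    """Negated squared Euclidean distance, so that higher still means closer."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(dataset, query, k, score=dot):
    """Return the indices of the k highest-scoring dataset vectors, best first.

    This scans every vector, so it costs O(n * d) per query. Approximate
    methods trade a little recall for avoiding most of this work.
    """
    scored = ((score(vec, query), idx) for idx, vec in enumerate(dataset))
    return [idx for _, idx in heapq.nlargest(k, scored)]


# Hypothetical toy data, for illustration only.
dataset = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0], [0.5, 0.5]]
query = [1.0, 0.2]

mips_top2 = brute_force_search(dataset, query, k=2, score=dot)
l2_top2 = brute_force_search(dataset, query, k=2, score=neg_squared_l2)
```

On this toy data the two scoring functions return different neighbors: the inner product favors the large-norm vector at index 2, while Euclidean distance favors the vectors nearest the query. This is why the choice of distance function matters when configuring a searcher.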
Reference [1]:
@inproceedings{avq_2020,
title={Accelerating Large-Scale Inference with Anisotropic Vector Quantization},
author={Guo, Ruiqi and Sun, Philip and Lindgren, Erik and Geng, Quan and Simcha, David and Chern, Felix and Kumar, Sanjiv},
booktitle={International Conference on Machine Learning},
year={2020},
URL={https://arxiv.org/abs/1908.10396}
}
manylinux_2_27-compatible wheels are available on PyPI:
pip install scann
ScaNN supports Linux environments running Python versions 3.9-3.12. See docs/releases.md for release notes; the page also contains download links for ScaNN wheels prior to version 1.1.0, which were not released on PyPI.
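The platform constraints stated above can be checked before attempting an install. The sketch below uses only the Python standard library; the helper name `wheel_supported` and the version bounds encoded in it are taken from the sentence above and are not part of ScaNN itself.

```python
import platform
import sys

# Bounds copied from the supported range stated above (3.9 through 3.12).
SUPPORTED_MIN = (3, 9)
SUPPORTED_MAX = (3, 12)


def wheel_supported(system, python_version):
    """Return True if the OS name and Python version fall in the stated range.

    `system` is a string such as platform.system() returns ("Linux", "Darwin").
    `python_version` is any sequence starting with (major, minor).
    """
    major_minor = tuple(python_version[:2])
    return system == "Linux" and SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    ok = wheel_supported(platform.system(), sys.version_info)
    print("supported environment" if ok else "unsupported environment")
```

This checks only the operating system name and interpreter version; it does not verify the libstdc++ requirement or CPU features.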
In accordance with the manylinux_2_27 specification, ScaNN requires libstdc++ version 3.4.23 or above from the operating system. See here for an example of how to find your system's libstdc++ version; it can generally be upgraded by installing a newer version of g++.
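One possible way to check this requirement programmatically is sketched below. It assumes that the version number above corresponds to the `GLIBCXX_x.y.z` symbol-version tags embedded in the libstdc++ shared library, and it scans the library's bytes for those tags. The function names are hypothetical, and the library path in the usage comment is an assumption that varies by distribution.

```python
import re

# Minimum version stated above, as a tuple for numeric comparison.
REQUIRED = (3, 4, 23)

# Matches tags such as GLIBCXX_3.4 or GLIBCXX_3.4.23 inside binary data.
_PATTERN = re.compile(rb"GLIBCXX_(\d+)\.(\d+)(?:\.(\d+))?")


def max_glibcxx_version(data):
    """Return the highest GLIBCXX tag found in `data` as a tuple, or None."""
    versions = [
        tuple(int(part) for part in match.groups(default=b"0"))
        for match in _PATTERN.finditer(data)
    ]
    return max(versions, default=None)


def meets_requirement(data, required=REQUIRED):
    """Return True if the highest tag in `data` is at least `required`."""
    found = max_glibcxx_version(data)
    return found is not None and found >= required


def check_library(path):
    """Read a shared library from `path` and test it against REQUIRED."""
    with open(path, "rb") as handle:
        return meets_requirement(handle.read())


# Example call (the path is an assumption, typical of Debian/Ubuntu on x86-64):
# check_library("/usr/lib/x86_64-linux-gnu/libstdc++.so.6")
```

Versions are compared as integer tuples rather than strings, so that 3.4.9 correctly sorts below 3.4.21.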
We provide custom Docker images of TF Serving that are linked to the ScaNN TF ops. See the tf_serving directory for further information.
To build ScaNN from source, first install the build tool Bazel, Clang 16, and libstdc++ headers for C++17 (which are provided with GCC 9). Additionally, ScaNN requires a modern version of Python (3.9.x or later) with TensorFlow 2.16 installed for that version of Python. Once these prerequisites are satisfied, run the following commands in the root directory of the repository:
python configure.py
CC=clang-16 bazel build -c opt --features=thin_lto --copt=-mavx --copt=-mfma --cxxopt="-std=c++17" --copt=-fsized-deallocation --copt=-w :build_pip_pkg
./bazel-bin/build_pip_pkg
A .whl file should appear in the root of the repository upon successful completion of these commands. This .whl can be installed via pip.
See the example in docs/example.ipynb. For a more in-depth explanation of ScaNN techniques, see docs/algorithms.md.