abd-distances

Distance functions: A drop-in replacement for, and a super-set of the scipy.spatial.distance module.

Algorithms for Big Data: Distances (v1.0.4)

This package contains algorithms for computing distances between data points. It is a thin Python wrapper around the distances crate, in Rust. It provides drop-in replacements for the distance functions in scipy.spatial.distance.

Supported Distance Functions

Installation

pip install abd-distances

Usage

import math

import numpy
import abd_distances.simd as distance

# Two float32 vectors that differ by exactly 1.0 in every coordinate.
a = numpy.arange(10_000, dtype=numpy.float32)
b = a + 1.0

# SIMD-accelerated Euclidean distance between the two vectors.
dist = distance.euclidean(a, b)

assert math.fabs(dist - 100.0) < 1e-6

print(dist)
# 100.0
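
The expected value of 100.0 follows from the definition of Euclidean distance: each of the 10,000 coordinates differs by exactly 1, so the sum of squared differences is 10,000 and its square root is 100. The sketch below checks that arithmetic in pure Python, independent of this package. The helper name `euclidean_reference` is illustrative only and is not part of abd_distances.

```python
import math


def euclidean_reference(a, b):
    """Textbook Euclidean distance; an illustrative stand-in, not the package's code."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Same data as the usage example, as plain Python floats.
a = [float(i) for i in range(10_000)]
b = [x + 1.0 for x in a]

dist = euclidean_reference(a, b)
assert math.isclose(dist, 100.0)
```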

Vector Distances

  • Bray-Curtis: abd_distances.vector.braycurtis
  • Canberra: abd_distances.vector.canberra
  • Chebyshev: abd_distances.vector.chebyshev
  • Correlation
  • Cosine: abd_distances.vector.cosine
  • Euclidean: abd_distances.vector.euclidean
  • Jensen-Shannon
  • Mahalanobis
  • Manhattan: abd_distances.vector.manhattan and abd_distances.vector.cityblock
  • Minkowski: abd_distances.vector.minkowski
  • Standardized Euclidean
  • Squared Euclidean: abd_distances.vector.sqeuclidean
  • Pairwise Distances: abd_distances.vector.cdist and abd_distances.vector.pdist
  • ...
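
For readers unfamiliar with some of the metrics listed above, the following pure-Python sketch spells out a few of them using the formulas documented for scipy.spatial.distance. These are reference definitions for illustration, not this package's implementation; the function names here are local to the sketch, and the handling of 0/0 terms in Canberra is an assumption borrowed from scipy's documented behaviour.

```python
def braycurtis(u, v):
    # sum(|u_i - v_i|) / sum(|u_i + v_i|)
    num = sum(abs(x - y) for x, y in zip(u, v))
    den = sum(abs(x + y) for x, y in zip(u, v))
    return num / den


def canberra(u, v):
    # sum(|u_i - v_i| / (|u_i| + |v_i|)), treating 0/0 terms as 0
    total = 0.0
    for x, y in zip(u, v):
        den = abs(x) + abs(y)
        if den != 0:
            total += abs(x - y) / den
    return total


def chebyshev(u, v):
    # Largest coordinate-wise absolute difference.
    return max(abs(x - y) for x, y in zip(u, v))


def manhattan(u, v):
    # Sum of coordinate-wise absolute differences (also called cityblock).
    return sum(abs(x - y) for x, y in zip(u, v))


def minkowski(u, v, p=2.0):
    # p-th root of the sum of p-th powers of absolute differences.
    return sum(abs(x - y) ** p for x, y in zip(u, v)) ** (1.0 / p)


u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]

print(manhattan(u, v))   # 7.0
print(chebyshev(u, v))   # 4.0
print(minkowski(u, v))   # Euclidean distance when p == 2
print(braycurtis(u, v))  # 7 / 19
print(canberra(u, v))    # 3/5 + 4/8 + 0/6
```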

Boolean Distances

  • Dice
  • Hamming
  • Jaccard
  • Kulczynski 1D
  • Rogers-Tanimoto
  • Russell-Rao
  • Sokal-Michener
  • Sokal-Sneath
  • Yule
  • ...
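
The list above gives no module paths for the boolean distances, so the sketch below does not call this package. It illustrates what four of these dissimilarities measure, using the definitions documented for scipy.spatial.distance in terms of the counts of (True, True), (True, False), and (False, True) positions. Function names are local to the sketch, and returning 0.0 for an empty denominator is an assumption.

```python
def _counts(u, v):
    """Count positions by truth-value pair: returns (c_tt, c_tf, c_ft, n)."""
    c_tt = sum(1 for x, y in zip(u, v) if x and y)
    c_tf = sum(1 for x, y in zip(u, v) if x and not y)
    c_ft = sum(1 for x, y in zip(u, v) if not x and y)
    return c_tt, c_tf, c_ft, len(u)


def hamming(u, v):
    # Fraction of positions where the two vectors disagree.
    _, c_tf, c_ft, n = _counts(u, v)
    return (c_tf + c_ft) / n


def jaccard(u, v):
    # Disagreements divided by positions where at least one vector is True.
    c_tt, c_tf, c_ft, _ = _counts(u, v)
    den = c_tt + c_tf + c_ft
    return (c_tf + c_ft) / den if den else 0.0


def dice(u, v):
    # Like Jaccard, but agreements on True are counted twice in the denominator.
    c_tt, c_tf, c_ft, _ = _counts(u, v)
    den = 2 * c_tt + c_tf + c_ft
    return (c_tf + c_ft) / den if den else 0.0


def russellrao(u, v):
    # Fraction of positions that are not True in both vectors.
    c_tt, _, _, n = _counts(u, v)
    return (n - c_tt) / n


u = [True, True, False, False, True]
v = [True, False, True, False, True]

print(hamming(u, v), jaccard(u, v), dice(u, v), russellrao(u, v))
```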

SIMD-Accelerated Vector Distances

  • Euclidean: abd_distances.simd.euclidean
  • Squared Euclidean: abd_distances.simd.sqeuclidean
  • Cosine: abd_distances.simd.cosine
  • Pairwise Distances: abd_distances.simd.cdist and abd_distances.simd.pdist
  • ...
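
The pairwise functions differ in what they return. In scipy.spatial.distance, cdist computes a full matrix of distances between every row of one collection and every row of another, while pdist computes the distances among the rows of a single collection and returns them as a condensed vector of length n(n-1)/2. The sketch below illustrates those two shapes in pure Python. It assumes this package follows the same conventions, as "drop-in replacement" suggests; the `*_reference` names are illustrative only.

```python
import math
from itertools import combinations


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cdist_reference(xa, xb, metric):
    """Full len(xa) x len(xb) matrix of distances between two collections."""
    return [[metric(u, v) for v in xb] for u in xa]


def pdist_reference(x, metric):
    """Condensed distances among rows of x, ordered (0,1), (0,2), ..., (1,2), ..."""
    return [metric(u, v) for u, v in combinations(x, 2)]


points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]

full = cdist_reference(points, points, euclidean)   # 3 x 3, zeros on the diagonal
condensed = pdist_reference(points, euclidean)      # 3 * 2 / 2 = 3 entries

print(full)
print(condensed)  # [5.0, 10.0, 5.0]
```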

String Distances

  • Hamming: abd_distances.strings.hamming
  • Levenshtein: abd_distances.strings.levenshtein
  • Needleman-Wunsch: abd_distances.strings.needleman_wunsch
  • Smith-Waterman
  • Pairwise Distances
  • ...
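
As a reference for two of the string distances above, here is a minimal pure-Python sketch of Hamming distance (number of mismatched positions between equal-length strings) and Levenshtein distance (minimum number of single-character insertions, deletions, and substitutions). These are textbook definitions, not this package's implementation, and the function names are local to the sketch.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def levenshtein(a, b):
    """Edit distance via the classic dynamic program, keeping one row at a time."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(hamming("karolin", "kathrin"))     # 3
print(levenshtein("kitten", "sitting"))  # 3
```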

Benchmarks

SIMD-Accelerated Vector Distance Benchmarks

These benchmarks were run on an Intel Core i7-11700KF CPU @ 4.900GHz, using a single thread. The OS was Arch Linux, with kernel version 6.7.4-arch1-1.

The "Min", "Max", and "Mean" columns show the minimum, maximum, and mean times (in seconds), respectively, taken to compute the distances using the functions from scipy.spatial.distance. The "Min (+)", "Max (+)", and "Mean (+)" columns show the corresponding times for this package's functions, with the speedup over the scipy functions in parentheses. All pairwise distances (cdist and pdist) were computed for 200x200 vectors of 500 dimensions, and the average time was taken over 100 runs. All individual distances were computed for 20x20 vectors of 500 dimensions, and the average time was taken over 100 runs.
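
To make the speedup figures concrete: a speedup is the scipy time divided by this package's time. The sketch below applies that to the "Min" values of the "cdist, euclidean, f32" row. It uses the rounded times printed in the table, so the result can differ slightly from the reported 13.9x, which was presumably computed from unrounded timings. The `speedup` helper is illustrative only.

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# "cdist, euclidean, f32": scipy Min = 2.560 s, this package Min (+) = 0.185 s
ratio = speedup(2.560, 0.185)
print(f"{ratio:.1f}x")  # 13.8x
```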

| Benchmark | Min | Max | Mean | Min (+) | Max (+) | Mean (+) |
|---|---|---|---|---|---|---|
| cdist, euclidean, f32 | 2.560 | 2.576 | 2.566 | 0.185 (13.9x) | 0.196 (13.2x) | 0.188 (13.7x) |
| cdist, euclidean, f64 | 2.398 | 2.406 | 2.401 | 0.292 (8.2x) | 0.307 (7.8x) | 0.298 (8.0x) |
| cdist, sqeuclidean, f32 | 2.519 | 2.527 | 2.523 | 0.182 (13.9x) | 0.197 (12.8x) | 0.187 (13.5x) |
| cdist, sqeuclidean, f64 | 2.381 | 2.393 | 2.389 | 0.293 (8.1x) | 0.318 (7.5x) | 0.301 (7.9x) |
| cdist, cosine, f32 | 4.011 | 4.021 | 4.016 | 0.625 (6.4x) | 0.637 (6.3x) | 0.632 (6.4x) |
| cdist, cosine, f64 | 3.978 | 4.009 | 3.992 | 0.626 (6.4x) | 0.666 (6.0x) | 0.638 (6.3x) |
| pdist, euclidean, f32 | 1.235 | 1.249 | 1.241 | 0.252 (4.9x) | 0.263 (4.7x) | 0.257 (4.8x) |
| pdist, euclidean, f64 | 1.216 | 1.262 | 1.234 | 0.302 (4.0x) | 0.312 (4.0x) | 0.308 (4.0x) |
| pdist, sqeuclidean, f32 | 1.229 | 1.250 | 1.237 | 0.251 (4.9x) | 0.303 (4.1x) | 0.265 (4.7x) |
| pdist, sqeuclidean, f64 | 1.209 | 1.213 | 1.211 | 0.306 (3.9x) | 0.313 (3.9x) | 0.310 (3.9x) |
| pdist, cosine, f32 | 2.001 | 2.017 | 2.006 | 0.468 (4.3x) | 0.484 (4.2x) | 0.478 (4.2x) |
| pdist, cosine, f64 | 1.991 | 2.004 | 1.996 | 0.461 (4.3x) | 0.476 (4.2x) | 0.471 (4.2x) |
| euclidean, f32 | 0.644 | 0.670 | 0.654 | 0.076 (8.5x) | 0.080 (8.4x) | 0.078 (8.3x) |
| euclidean, f64 | 0.672 | 0.701 | 0.682 | 0.097 (6.9x) | 0.102 (6.9x) | 0.100 (6.8x) |
| sqeuclidean, f32 | 0.506 | 0.512 | 0.508 | 0.076 (6.6x) | 0.079 (6.5x) | 0.078 (6.5x) |
| sqeuclidean, f64 | 0.515 | 0.519 | 0.518 | 0.100 (5.1x) | 0.104 (5.0x) | 0.103 (5.0x) |
| cosine, f32 | 0.668 | 0.687 | 0.677 | 0.110 (6.1x) | 0.113 (6.1x) | 0.111 (6.1x) |
| cosine, f64 | 0.465 | 0.472 | 0.469 | 0.127 (3.7x) | 0.130 (3.6x) | 0.129 (3.6x) |

[Figures: f32 and f64 benchmark plots for Euclidean, Squared Euclidean, and Cosine]

Vector Distance Benchmarks (No SIMD)

These benchmarks were run on an Intel Core i7-11700KF CPU @ 4.900GHz, using a single thread. The OS was Arch Linux, with kernel version 6.7.4-arch1-1.

The "Min", "Max", and "Mean" columns show the minimum, maximum, and mean times (in seconds), respectively, taken to compute the distances using the functions from scipy.spatial.distance. The "Min (+)", "Max (+)", and "Mean (+)" columns show the corresponding times for this package's functions, with the speedup over the scipy functions in parentheses. All pairwise distances (cdist and pdist) were computed for 200x200 vectors of 500 dimensions, and the average time was taken over 100 runs. All individual distances were computed for 20x20 vectors of 500 dimensions, and the average time was taken over 100 runs.

These benchmarks were run using the richbench package.

| Benchmark | Min | Max | Mean | Min (+) | Max (+) | Mean (+) |
|---|---|---|---|---|---|---|
| braycurtis, f32 | 1.103 | 1.134 | 1.114 | 0.323 (3.4x) | 0.324 (3.5x) | 0.323 (3.4x) |
| braycurtis, f64 | 0.834 | 0.843 | 0.838 | 0.170 (4.9x) | 0.173 (4.9x) | 0.171 (4.9x) |
| canberra, f32 | 2.524 | 2.529 | 2.526 | 0.153 (16.5x) | 0.155 (16.3x) | 0.154 (16.4x) |
| canberra, f64 | 2.216 | 2.260 | 2.235 | 0.168 (13.2x) | 0.170 (13.3x) | 0.169 (13.2x) |
| chebyshev, f32 | 2.738 | 2.774 | 2.753 | 0.149 (18.3x) | 0.151 (18.4x) | 0.150 (18.4x) |
| chebyshev, f64 | 2.777 | 2.784 | 2.781 | 0.165 (16.8x) | 0.166 (16.8x) | 0.165 (16.8x) |
| euclidean, f32 | 0.641 | 0.641 | 0.641 | 0.150 (4.3x) | 0.150 (4.3x) | 0.150 (4.3x) |
| euclidean, f64 | 0.657 | 0.662 | 0.660 | 0.167 (3.9x) | 0.168 (3.9x) | 0.168 (3.9x) |
| sqeuclidean, f32 | 0.506 | 0.509 | 0.507 | 0.149 (3.4x) | 0.149 (3.4x) | 0.149 (3.4x) |
| sqeuclidean, f64 | 0.514 | 0.518 | 0.516 | 0.165 (3.1x) | 0.173 (3.0x) | 0.170 (3.0x) |
| cityblock, f32 | 0.437 | 0.443 | 0.440 | 0.150 (2.9x) | 0.150 (2.9x) | 0.150 (2.9x) |
| cityblock, f64 | 0.444 | 0.451 | 0.447 | 0.167 (2.7x) | 0.168 (2.7x) | 0.168 (2.7x) |
| cosine, f32 | 0.659 | 0.668 | 0.664 | 0.314 (2.1x) | 0.315 (2.1x) | 0.314 (2.1x) |
| cosine, f64 | 0.459 | 0.471 | 0.465 | 0.321 (1.4x) | 0.325 (1.4x) | 0.324 (1.4x) |
| cdist, braycurtis, f32 | 4.902 | 4.906 | 4.904 | 1.802 (2.7x) | 1.875 (2.6x) | 1.833 (2.7x) |
| cdist, braycurtis, f64 | 4.765 | 4.775 | 4.768 | 0.710 (6.7x) | 0.735 (6.5x) | 0.725 (6.6x) |
| cdist, canberra, f32 | 6.914 | 6.943 | 6.930 | 1.356 (5.1x) | 1.385 (5.0x) | 1.367 (5.1x) |
| cdist, canberra, f64 | 6.782 | 6.813 | 6.797 | 0.684 (9.9x) | 0.701 (9.7x) | 0.692 (9.8x) |
| cdist, chebyshev, f32 | 2.763 | 2.768 | 2.765 | 0.640 (4.3x) | 0.663 (4.2x) | 0.649 (4.3x) |
| cdist, chebyshev, f64 | 2.659 | 2.677 | 2.664 | 0.647 (4.1x) | 0.662 (4.0x) | 0.655 (4.1x) |
| cdist, euclidean, f32 | 2.563 | 2.570 | 2.564 | 0.644 (4.0x) | 0.658 (3.9x) | 0.653 (3.9x) |
| cdist, euclidean, f64 | 2.378 | 2.400 | 2.388 | 0.630 (3.8x) | 0.649 (3.7x) | 0.640 (3.7x) |
| cdist, sqeuclidean, f32 | 2.516 | 2.523 | 2.519 | 0.648 (3.9x) | 0.660 (3.8x) | 0.652 (3.9x) |
| cdist, sqeuclidean, f64 | 2.412 | 2.423 | 2.417 | 0.631 (3.8x) | 0.645 (3.8x) | 0.638 (3.8x) |
| cdist, cityblock, f32 | 4.545 | 4.552 | 4.548 | 0.647 (7.0x) | 0.671 (6.8x) | 0.658 (6.9x) |
| cdist, cityblock, f64 | 4.406 | 4.407 | 4.407 | 0.633 (7.0x) | 0.657 (6.7x) | 0.647 (6.8x) |
| cdist, cosine, f32 | 4.010 | 4.020 | 4.013 | 2.254 (1.8x) | 2.292 (1.8x) | 2.270 (1.8x) |
| cdist, cosine, f64 | 3.987 | 3.992 | 3.990 | 2.241 (1.8x) | 2.288 (1.7x) | 2.258 (1.8x) |
| pdist, braycurtis, f32 | 2.382 | 2.387 | 2.385 | 1.062 (2.2x) | 1.074 (2.2x) | 1.069 (2.2x) |
| pdist, braycurtis, f64 | 2.368 | 2.378 | 2.371 | 0.510 (4.6x) | 0.523 (4.5x) | 0.516 (4.6x) |
| pdist, canberra, f32 | 3.374 | 3.411 | 3.389 | 0.831 (4.1x) | 0.841 (4.1x) | 0.835 (4.1x) |
| pdist, canberra, f64 | 3.369 | 3.411 | 3.396 | 0.504 (6.7x) | 0.515 (6.6x) | 0.509 (6.7x) |
| pdist, chebyshev, f32 | 1.362 | 1.364 | 1.363 | 0.478 (2.9x) | 0.488 (2.8x) | 0.484 (2.8x) |
| pdist, chebyshev, f64 | 1.338 | 1.343 | 1.341 | 0.476 (2.8x) | 0.485 (2.8x) | 0.481 (2.8x) |
| pdist, euclidean, f32 | 1.241 | 1.250 | 1.246 | 0.482 (2.6x) | 0.487 (2.6x) | 0.484 (2.6x) |
| pdist, euclidean, f64 | 1.222 | 1.228 | 1.225 | 0.474 (2.6x) | 0.503 (2.4x) | 0.482 (2.5x) |
| pdist, sqeuclidean, f32 | 1.224 | 1.247 | 1.233 | 0.477 (2.6x) | 0.490 (2.5x) | 0.481 (2.6x) |
| pdist, sqeuclidean, f64 | 1.211 | 1.214 | 1.213 | 0.470 (2.6x) | 0.478 (2.5x) | 0.475 (2.6x) |
| pdist, cityblock, f32 | 2.204 | 2.207 | 2.205 | 0.483 (4.6x) | 0.491 (4.5x) | 0.486 (4.5x) |
| pdist, cityblock, f64 | 2.189 | 2.198 | 2.192 | 0.476 (4.6x) | 0.483 (4.5x) | 0.481 (4.6x) |
| pdist, cosine, f32 | 2.000 | 2.004 | 2.002 | 1.292 (1.5x) | 1.302 (1.5x) | 1.296 (1.5x) |
| pdist, cosine, f64 | 1.988 | 1.992 | 1.990 | 1.288 (1.5x) | 1.296 (1.5x) | 1.292 (1.5x) |

[Figures: f32 and f64 benchmark plots for Chebyshev, Euclidean, Squared Euclidean, Manhattan, Canberra, and Cosine; u32 and u64 benchmark plots for Bray-Curtis]

String Distance Benchmarks

These benchmarks were run on an Intel Core i7-11700KF CPU @ 4.900GHz, using a single thread. The OS was Arch Linux, with kernel version 6.7.4-arch1-1.

All string distances were computed 100 times each, among different pairs of strings, and the average time was taken.

[Figures: benchmark plots for Hamming, Levenshtein, and Needleman-Wunsch]

License

This package is licensed under the MIT license.
