annb

A simple ANN benchmark tool.
Note: This is a work in progress. The API/CLI is not stable yet.
pip install annb
# install the vector search index/client you need for your benchmark,
# e.g. install faiss to run the faiss index benchmark
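For example, to run the faiss benchmarks you could install a faiss wheel from PyPI (this assumes the community faiss-cpu build suits your platform; pick a GPU-enabled build if you want the gpu: yes runs shown later):

pip install faiss-cpu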
Just run annb-test to start your first benchmark with a random dataset:

annb-test
It will produce a result like this:
❯ annb-test
... some logs ...
BenchmarkResult:
  attributes:
    query_args: [{'nprobe': 1}]
    topk: 10
    jobs: 1
    loop: 5
    step: 10
    name: Test
    dataset: .annb_random_d256_l2_1000.hdf5
    index: Test
    dim: 256
    metric_type: MetricType.L2
    index_args: {'index': 'ivfflat', 'nlist': 128}
    started: 2023-08-14 13:03:40
  durations:
    training: 1 items, 1000 total, 1490.03266ms
    insert: 1 items, 1000 total, 132.439627ms
    query:
      nprobe=1,recall=0.2173 -> 1000 items, 18.615083ms, 53719.878659686874qps, latency=0.18615083ms, p95=0.31939ms, p99=0.41488ms
This is a simple benchmark with the default index (faiss) on a random L2 dataset. If you want to generate more data or use different specifications for the dataset, the relevant options are described below.
You can also use ann-benchmarks datasets: download them locally and run the benchmark with the --dataset option.
annb-test --dataset sift-128-euclidean.hdf5
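ann-benchmarks HDF5 files bundle the base vectors, the query vectors, and the ground-truth neighbors. Here is a minimal sketch for inspecting one with h5py, assuming the usual train/test/neighbors/distances layout (this is just an illustration, not part of annb itself):

import h5py

# Open an ann-benchmarks style dataset and list its contents.
with h5py.File("sift-128-euclidean.hdf5", "r") as f:
    for key in f.keys():  # typically: train, test, neighbors, distances
        print(key, f[key].shape, f[key].dtype)
    print("distance metric:", f.attrs.get("distance"))  # e.g. 'euclidean'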
You may benchmark with different query args, e.g. different nprobe values for the faiss ivfflat index, via the --query-args option.
annb-test --query-args nprobe=10 --query-args nprobe=20
This will output a result like the following:
durations:
  training: 1 items, 1000 total, 1548.84968ms
  insert: 1 items, 1000 total, 143.402532ms
  query:
    nprobe=1,recall=0.2173 -> 1000 items, 20.074236ms, 49815.09632545916qps, latency=0.20074235999999998ms, p95=0.332276ms, p99=0.455525ms
    nprobe=10,recall=0.5221 -> 1000 items, 49.141931ms, 20349.2207092961qps, latency=0.49141931ms, p95=0.722628ms, p99=0.818012ms
    nprobe=20,recall=0.6861 -> 1000 items, 69.284072ms, 14433.331805324606qps, latency=0.69284072ms, p95=1.126946ms, p99=1.350359ms
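The recall values above are recall@k for k=topk: the fraction of the true nearest neighbors that the index actually returned. A minimal sketch of how such a number is typically computed against ground-truth neighbors (an illustration only, not annb's actual code; results_from_index and ground_truth_neighbors are placeholder names):

def recall_at_k(returned_ids, true_ids, k=10):
    """Average fraction of the true top-k neighbors that the index returned."""
    hits = 0
    for ret, truth in zip(returned_ids, true_ids):
        hits += len(set(ret[:k]) & set(truth[:k]))
    return hits / (k * len(true_ids))

# e.g. recall_at_k(results_from_index, ground_truth_neighbors, k=10)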
You may run multiple benchmarks with different indexes and datasets by using the --run-file option to run benchmarks from a config file.
Below is an example config file:
config.yaml
default:
  index_factory: annb.anns.faiss.indexes.index_under_test_factory
  index_factory_args: {}
  index_name: Test
  dataset: gist-960-euclidean.hdf5
  topk: 10
  step: 10
  jobs: 1
  loop: 2
  result: output.pth
runs:
  - name: faiss-gist960-gpu-ivfflat
    index_args:
      gpu: yes
      index: ivfflat
      nlist: 1024
    query_args:
      - nprobe: 1
      - nprobe: 16
      - nprobe: 256
  - name: faiss-gist960-gpu-ivfpq8
    index_args:
      gpu: yes
      index: ivfpq
      nlist: 1024
    query_args:
      - nprobe: 1
      - nprobe: 16
      - nprobe: 256
Explanation of the above config file: the default section contains the default values for the annb-test command options, e.g. index_factory is the same as --index-factory. The individual benchmarks are defined in the runs section, and each run config overrides the default config. In this example, gist-960-euclidean.hdf5 is set as the dataset in default, so it is used for all benchmarks, while each run uses different index and query args. For index_args we use ivfflat (nlist=1024) and ivfpq (nlist=1024) as two benchmark series, and for query_args we use nprobe=1, 16, 256 in each series. That means 6 benchmarks will run in total, 3 per series with different nprobe values. The results will be written to output-1.pth and output-2.pth, and you can use annb-report to view them.
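To make the 2 x 3 = 6 combinations concrete, here is a small sketch of how the defaults and per-run overrides expand into individual benchmarks (an illustration of the merging logic, not annb's actual run-file loader):

import yaml

# Expand the run file into the individual benchmarks it describes.
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

for run in cfg["runs"]:
    merged = {**cfg["default"], **run}   # run-level keys override the defaults
    for qa in merged["query_args"]:      # one benchmark per query_args entry
        print(merged["name"], merged["index_args"], qa)
# -> 2 runs x 3 query_args entries = 6 benchmarks in total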
You can use annb-test --help to see more options:
❯ annb-test --help
The annb-report command is used to view benchmark results as plain or CSV text, or to export them as a chart image.
annb-report --help
View benchmark results as plain text:
annb-report output.pth
View benchmark results as CSV text:
annb-report output.pth --format csv
Export benchmark results to a chart image (multiple series):
annb-report output.pth --format png --output output.png output-1.pth output-2.pth
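If you prefer to build your own charts, the same QPS/recall trade-off can also be plotted directly from the reported numbers. A small sketch using matplotlib with the nprobe sweep shown earlier (a manual alternative to annb-report's chart export; the numbers are copied from the output above):

import matplotlib.pyplot as plt

# Numbers from the nprobe=1/10/20 run above (faiss ivfflat, random L2 dataset).
recall = [0.2173, 0.5221, 0.6861]
qps = [49815.1, 20349.2, 14433.3]

plt.plot(recall, qps, marker="o")
plt.xlabel("recall@10")
plt.ylabel("queries per second")
plt.title("QPS vs. recall (ivfflat, nprobe=1/10/20)")
plt.savefig("tradeoff.png")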