An Open Robustness Benchmark for Jailbreaking Language Models
Tool for benchmarking RPC endpoints
A collection of benchmark functions written in Python 3, suited for assessing the performance of optimisation problems on deterministic functions.
A benchmark for high-dimensional robot control
TextWorldExpress: a highly optimized reimplementation of three text game benchmarks focusing on instruction following, commonsense reasoning, and object identification.
Python package to benchmark speech2text models.
Decorators for runtime statistics and benchmarking
Safety benchmarks for reinforcement learning
API for NAS-Bench-201 (a benchmark for neural architecture search).
NAS benchmark for graph data
Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization
Generative materials benchmarking metrics, inspired by CDVAE.
MLAgility Benchmark and Tools
Pyinfer is a model-agnostic Python utility tool for ML developers and researchers to benchmark model inference statistics.
Image Retrieval Performance Benchmark on Large-scale Dataset
Benchmarking for Python TOML libraries
A Python package for benchmarking interpretability approaches.
A library for benchmarking AI/ML applications.
The Continual Transfer Learning Benchmark
Developed in PACS Lab to ease the process of deployment and testing of our benchmarking workload to AWS Lambda.
Indexer for GZIP specially built for DLIO Profiler.
RayLEAF: a flexible, highly-scalable benchmark for federated learning
A benchmarking method
A collection of analytical benchmark functions in multiple fidelities
Significance Analysis for HPO-algorithms performing on multiple benchmarks
Benchmarked priority queue implementation in Rust
Simulation-based inference benchmark
Benchmark of Graph Clustering.
OpenMMLab Model Compression Toolbox and Benchmark
Python package for computing diefficiency metrics dief@t and dief@k.
Detectors: a Python package to benchmark generalized out-of-distribution detection methods.
A utility to benchmark your cloud
A pure Python tool to perform time and accuracy benchmarks
Handy tool for benchmarking object storage performance
Tool which simplifies creating and testing inputs for programming contests.
DAPPER benchmarks the performance of data assimilation (DA) methods.
Package for STANdard drug Screening by COllaborative FIltering. Performs benchmarks against datasets and SotA algorithms, and implements training, validation and testing procedures.
Combines the most popular Python parsers (json, jprops, pickle...) with user-defined parsers and type converters to read objects from files. Supports multifile and multiparser objects, typically useful for organizing test data. Leverages PEP 484 type hints to intelligently select the best parser/converter chain, and to try several combinations if relevant.
Benchmark for PHYsical REasoning
Benchmark utility that plugs into pytest.
Parsec Benchmark interface tool
Utilities and PyTorch datasets for the KITTI Vision Benchmark Suite
Assess Juju charms and benchmarks on the clouds.
Benchmark of Generative Large Language Models in Danish
PyTorch Benchmark Suite
Benchmark engine for blockchains
Memory Maze is an environment to benchmark memory abilities of RL agents