Simple benchmark framework (in active development)
A ``pytest`` fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer.
Reversible Data Transforms
Pytest plugin to create CodSpeed benchmarks
Benchmarking QRC measures the ability to store information of
QCVV and Benchmarking
Metrics for multiple object tracker benchmarking.
OpenMMLab Detection Toolbox and Benchmark
Airspeed Velocity: A simple Python history benchmarking tool
Massive Text Embedding Benchmark
Python module to run and analyze benchmarks
Provides a common interface to many IR ad-hoc ranking benchmarks, training datasets, etc.
Scikit-learn-compatible datasets
Benchmark Runner Tool
resp-benchmark is a benchmark tool for testing databases that support the RESP protocol, such as Redis, Valkey, and Tair.
Store data created during your pytest test execution and retrieve it at the end of the session, e.g. for applicative benchmarking purposes.
A Python Toolbox for Benchmarking Machine Learning on Partially-Observed Time Series
Merlion: A Machine Learning Framework for Time Series Intelligence
OpenMMLab Semantic Segmentation Toolbox and Benchmark
A high-performance C++ implementation of benchmark functions for mathematical optimization algorithms.
WebArena benchmark for BrowserGym
MiniWoB++ benchmark for BrowserGym
A package for submitting benchmarking scripts on OSCAR.
This is an unofficial, use-at-your-own-risk port of the webarena benchmark, for use as a standalone library package.
WorkArena benchmark for BrowserGym
OpenMMLab Image Classification Toolbox and Benchmark
VisualWebArena benchmark for BrowserGym
Benchmark your code
OpenMMLab Model Pretraining Toolbox and Benchmark
OpenMMLab Pose Estimation Toolbox and Benchmark.
This is an unofficial, use-at-your-own-risk port of the visualwebarena benchmark, for use as a standalone library package.
AssistantBench benchmark for BrowserGym
A Heterogeneous Benchmark for Information Retrieval
peek - debugging and benchmarking made easy
CLIP-like models benchmarks on various datasets
Benchmarking framework for all types of black-box optimization algorithms.
Evaluating single-cell data integration methods
A tool for Behavior benchmARKing
Tools to benchmark, deploy and monitor prediction market agents.
A Python wrapper for the Penn Machine Learning Benchmark data repository.
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis.
Quick and easy Python benchmarking.
A library to benchmark code snippets.
Opfunu: An Open-source Python Library for Optimization Benchmark Functions
Macrobenchmarking framework for OpenSearch
MEALPY: An Open-source Library for Latest Meta-heuristic Algorithms in Python