Simple benchmark framework (in active development)
A ``pytest`` fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer.
Reversible Data Transforms
Pytest plugin to create CodSpeed benchmarks
OpenMMLab Detection Toolbox and Benchmark
Benchmarking QRC measures the ability to store information of
Metrics for multiple object tracker benchmarking.
Airspeed Velocity: A simple Python history benchmarking tool
Store data created during your pytest tests execution, and retrieve it at the end of the session, e.g. for applicative benchmarking purposes.
Provides a common interface to many IR ad-hoc ranking benchmarks, training datasets, etc.
Massive Text Embedding Benchmark
Python module to run and analyze benchmarks
This is an unofficial, use-at-your-own-risk port of the visualwebarena benchmark, for use as a standalone library package.
VisualWebArena benchmark for BrowserGym
A Python Toolbox for Benchmarking Machine Learning on Partially-Observed Time Series
WorkArena benchmark for BrowserGym
This is an unofficial, use-at-your-own-risk port of the webarena benchmark, for use as a standalone library package.
MiniWoB++ benchmark for BrowserGym
WebArena benchmark for BrowserGym
AssistantBench benchmark for BrowserGym
OpenMMLab Semantic Segmentation Toolbox and Benchmark
Benchmark Runner Tool
BrowserGym integration for the WebLINX benchmark
Modern benchmarking library for Python with pytest integration.
Merlion: A Machine Learning Framework for Time Series Intelligence
Tools to benchmark, deploy and monitor prediction market agents.
Scikit-learn-compatible datasets
OpenMMLab Model Pretraining Toolbox and Benchmark
Fuzzy Data Benchmark
OpenMMLab Image Classification Toolbox and Benchmark
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis.
QCVV and Benchmarking
Macrobenchmarking framework for OpenSearch
Redis benchmark run helper. A wrapper around Redis and Redis Modules benchmark tools (ftsb_redisearch, memtier_benchmark, redis-benchmark, aibench, etc.).
Benchmark your code
Benchmarking framework for all types of black-box optimization algorithms.
A high-performance C++ implementation of benchmark functions for mathematical optimization algorithms.
The Feel++ Benchmarking Project
OpenMMLab Pose Estimation Toolbox and Benchmark.
CLIP-like models benchmarks on various datasets
A Heterogeneous Benchmark for Information Retrieval
A public and reproducible collection of reference implementations and a benchmark suite for distributed machine learning systems.
A library to benchmark code snippets.
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
resp-benchmark is a benchmark tool for testing databases that support the RESP protocol, such as Redis, Valkey, and Tair.
GBD Tools: Maintenance and Distribution of Benchmark Instances and their Attributes
Silero Models: pre-trained enterprise-grade STT / TTS models and benchmarks.