Simple benchmark framework (in active development)
A ``pytest`` fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer.
Pytest plugin to create CodSpeed benchmarks
OpenMMLab Detection Toolbox and Benchmark
Massive Text Embedding Benchmark
Provides a common interface to many IR ad-hoc ranking benchmarks, training datasets, etc.
Store data created during your pytest test runs and retrieve it at the end of the session, e.g. for application benchmarking purposes.
Python module to run and analyze benchmarks
Metrics for multiple object tracker benchmarking.
Airspeed Velocity: A simple Python history benchmarking tool
Reversible Data Transforms
Benchmarking QRC measures the ability to store information of
Benchmark suite for Autoregressive Neural Emulators of PDEs in JAX.
CLIP-like models benchmarks on various datasets
A Python Toolbox for Benchmarking Machine Learning on Partially-Observed Time Series
Benchmark Runner Tool
OpenMMLab Semantic Segmentation Toolbox and Benchmark
Merlion: A Machine Learning Framework for Time Series Intelligence
OpenMMLab Image Classification Toolbox and Benchmark
OpenMMLab Pose Estimation Toolbox and Benchmark.
A Python wrapper for the Penn Machine Learning Benchmark data repository.
The seismological machine learning benchmark collection
Silero Models: pre-trained enterprise-grade STT / TTS models and benchmarks.
WebArena benchmark for BrowserGym
MiniWoB++ benchmark for BrowserGym
VisualWebArena benchmark for BrowserGym
Benchmark your code
Collection of ML models and benchmarking tools
A Heterogeneous Benchmark for Information Retrieval
AssistantBench benchmark for BrowserGym
Pytest benchmarking plugin for codeflash.ai - automatic code performance optimization
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
ML models + benchmark for tabular data classification and regression
WorkArena benchmark for BrowserGym
This is an unofficial, use-at-your-own-risk port of the WebArena benchmark, for use as a standalone library package.
OpenMMLab Model Pretraining Toolbox and Benchmark
This is an unofficial, use-at-your-own-risk port of the VisualWebArena benchmark, for use as a standalone library package.
Tools to benchmark, deploy and monitor prediction market agents.
A library to benchmark code snippets.
A public and reproducible collection of reference implementations and benchmark suite for distributed machine learning systems.
BrowserGym integration for the WebLINX benchmark
A high-performance C++ implementation of benchmark functions for mathematical optimization algorithms.
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis.
Scikit-learn-compatible datasets
MEALPY: An Open-source Library for Latest Meta-heuristic Algorithms in Python
resp-benchmark is a benchmark tool for testing databases that support the RESP protocol, such as Redis, Valkey, and Tair.