Simple benchmark framework (in active development)
A ``pytest`` fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer.
Pytest plugin to create CodSpeed benchmarks
Provides a common interface to many IR ad-hoc ranking benchmarks, training datasets, etc.
OpenMMLab Detection Toolbox and Benchmark
Reversible Data Transforms
Store data created during your pytest test execution and retrieve it at the end of the session, e.g. for application benchmarking purposes.
Airspeed Velocity: A simple Python history benchmarking tool
Python module to run and analyze benchmarks
Metrics for multiple object tracker benchmarking.
Massive Text Embedding Benchmark
A Python Toolbox for Benchmarking Machine Learning on Partially-Observed Time Series
Benchmarking QRC measures the ability to store information of
Official Implementation of "COLLIE: Systematic Construction of Constrained Text Generation Tasks"
Quick and easy Python benchmarking.
Benchmark Runner Tool
OpenMMLab Semantic Segmentation Toolbox and Benchmark
resp-benchmark is a benchmark tool for testing databases that support the RESP protocol, such as Redis, Valkey, and Tair.
Tools to benchmark, deploy and monitor prediction market agents.
A Python wrapper for the Penn Machine Learning Benchmark data repository.
Merlion: A Machine Learning Framework for Time Series Intelligence
OpenMMLab Pose Estimation Toolbox and Benchmark.
Scikit-learn-compatible datasets
WebArena benchmark for BrowserGym
A Heterogeneous Benchmark for Information Retrieval
Silero Models: pre-trained enterprise-grade STT / TTS models and benchmarks.
Modern benchmarking library for Python with pytest integration.
MiniWoB++ benchmark for BrowserGym
CLIP-like models benchmarks on various datasets
OpenMMLab Image Classification Toolbox and Benchmark
OpenMMLab Model Pretraining Toolbox and Benchmark
ML models + benchmark for tabular data classification and regression
A high-performance C++ implementation of benchmark functions for mathematical optimization algorithms.
WorkArena benchmark for BrowserGym
Benchmark suite for Autoregressive Neural Emulators of PDEs in JAX.
AssistantBench benchmark for BrowserGym
This is an unofficial, use-at-your-own-risk port of the WebArena benchmark, for use as a standalone library package.
VisualWebArena benchmark for BrowserGym
Benchmark your code
BrowserGym integration for the WebLINX benchmark
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
This is an unofficial, use-at-your-own-risk port of the VisualWebArena benchmark, for use as a standalone library package.
Fuzzy Data Benchmark
Collection of ML models and benchmarking tools
A public, reproducible collection of reference implementations and a benchmark suite for distributed machine learning systems.
MEALPY: An Open-source Library for Latest Meta-heuristic Algorithms in Python
rliable: Reliable evaluation on reinforcement learning and machine learning benchmarks.