Simple benchmark framework (in active development)
A ``pytest`` fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer.
Pytest plugin to create CodSpeed benchmarks
Airspeed Velocity: A simple Python history benchmarking tool
Provides a common interface to many IR ad-hoc ranking benchmarks, training datasets, etc.
OpenMMLab Detection Toolbox and Benchmark
Reversible Data Transforms
Metrics for multiple object tracker benchmarking.
Massive Text Embedding Benchmark
Store data created during your pytest test execution and retrieve it at the end of the session, e.g. for applicative benchmarking purposes.
Python module to run and analyze benchmarks
Benchmark suite for Autoregressive Neural Emulators of PDEs in JAX.
A Python Toolbox for Benchmarking Machine Learning on Partially-Observed Time Series
Official Implementation of "COLLIE: Systematic Construction of Constrained Text Generation Tasks"
Modern benchmarking library for Python with pytest integration.
OpenMMLab Semantic Segmentation Toolbox and Benchmark
Merlion: A Machine Learning Framework for Time Series Intelligence
OpenMMLab Pose Estimation Toolbox and Benchmark.
Benchmarking QRC measures the ability to store information of
A library to benchmark code snippets.
A Python wrapper for the Penn Machine Learning Benchmark data repository.
ML models + benchmark for tabular data classification and regression
WebArena benchmark for BrowserGym
MiniWoB++ benchmark for BrowserGym
Benchmark Runner Tool
VisualWebArena benchmark for BrowserGym
AssistantBench benchmark for BrowserGym
A Heterogeneous Benchmark for Information Retrieval
WorkArena benchmark for BrowserGym
OpenMMLab Image Classification Toolbox and Benchmark
Tools to benchmark, deploy and monitor prediction market agents.
Collection of ML models and benchmarking tools
This is an unofficial, use-at-your-own-risk port of the WebArena benchmark, for use as a standalone library package.
Silero Models: pre-trained enterprise-grade STT / TTS models and benchmarks.
Benchmark your code
This is an unofficial, use-at-your-own-risk port of the VisualWebArena benchmark, for use as a standalone library package.
BrowserGym integration for the WebLINX benchmark
The Redis benchmarks specification describes the cross-language/tools requirements and expectations to foster performance and observability standards around Redis-related technologies. Members from both industry and academia, including organizations and individuals, are encouraged to contribute.
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
GPU benchmarking tool - a comprehensive toolkit for evaluating NVIDIA GPU performance
Scikit-learn-compatible datasets
OpenMMLab Model Pretraining Toolbox and Benchmark
Redis benchmark run helper. A wrapper around Redis and Redis Modules benchmark tools (ftsb_redisearch, memtier_benchmark, redis-benchmark, aibench, etc.).
Library and client for managing, benchmarking, and interacting with JupyterHub
Production-ready Django package for safe and configurable concurrent testing with isolated databases, timing analytics, and concurrency simulation middleware
QCVV and Benchmarking
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis.