Show value of an importable object
Like `typing._eval_type`, but lets older Python versions use newer typing features.
HuggingFace community-driven open-source library for evaluation
Safely evaluate AST nodes without side effects
A simple, safe single expression evaluator library.
Evalica, your favourite evaluation toolkit.
Validation and secure evaluation of untrusted Python expressions
Safe, minimalistic evaluator of Python expressions using the ast module
The Open-Source LLM Evaluation Framework.
Testing framework for sequence labeling
An AutoML library that builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions
Universal library for evaluating AI models
A getattr and setattr that works on nested objects, lists, dicts, and any combination thereof without resorting to eval
A framework for evaluating language models
LLM Evaluations
Use EvalAI through command line interface
MS-COCO Caption Evaluation for Python 3
EvalScope: Lightweight LLMs Evaluation Framework
A library for providing a simple interface to create new metrics and an easy-to-use toolkit for metric computations and checkpointing.
Python Mathematical Expression Evaluator
evaluation_lumo is a package for evaluating the LUMO damage detection system.
A poker hand evaluation and equity calculation library
Evaluation tools for the SIGSEP MUS database
Provides Python bindings for popular Information Retrieval measures implemented within trec_eval.
EvalPlus for rigorous evaluation of LLM-synthesized code
Limited evaluator
evalutils helps users create extensions for grand-challenge.org
Evaluating and scoring financial data
Faster interpretation of the original COCOEval
Data Evaluation Software of the (Magnetism) research group of Prof. Ehresmann at University of Kassel
Contains the integration code of AzureML Evaluate with Mlflow.
eval-mm is a tool for evaluating Multi-Modal Large Language Models.
A Difference Evaluator for Alternating Images
Backwards-compatibility package for the API of trulens_eval<1.0.0 using the API of trulens-*>=1.0.0.
A custom Streamlit component to evaluate arbitrary JavaScript expressions.
User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
Microsoft Azure Evaluation Library for Python
Python Evaluation, Level II
Evaluation
Package for fast computation of BSS Eval metrics for source separation
An information retrieval evaluation script based on the C/W/L framework that is TREC Compatible and provides a replacement for INST_EVAL, RBP_EVAL, TBG_EVAL, UMeasure and TREC_EVAL scripts. All measurements are reported in the same units making all metrics directly comparable.
Generate eval datasets from arbitrary sources
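Several entries above describe safe evaluators for untrusted Python expressions. As a minimal sketch of the underlying technique, using only the standard library's `ast.literal_eval` (not any specific package listed above): literals are parsed and reconstructed without ever executing code, so calls, attribute access, and imports are rejected outright.

```python
import ast

# literal_eval accepts only Python literal syntax: numbers, strings,
# tuples, lists, dicts, sets, booleans, and None. Unlike the builtin
# eval(), it never runs arbitrary code.
print(ast.literal_eval("[1, 2, 3]"))      # [1, 2, 3]
print(ast.literal_eval("{'a': 1}"))       # {'a': 1}

# Anything beyond a literal (here, a function call) raises ValueError
# instead of being executed.
try:
    ast.literal_eval("__import__('os').system('rm -rf /')")
except ValueError:
    print("rejected: not a literal")
```

The packages above go further than this sketch, typically by whitelisting a restricted set of AST node types (arithmetic, comparisons, names) and walking the tree themselves.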