Security News
Fluent Assertions Faces Backlash After Abandoning Open Source Licensing
Fluent Assertions is facing backlash after dropping the Apache license for a commercial model, leaving users blindsided and questioning contributor rights.
GuardBench
is a Python library for guardrail models evaluation.
It provides a common interface to 40 evaluation datasets, which are downloaded and converted into a standardized format for improved usability.
It also allows to quickly compare results and export LaTeX
tables for scientific publications.
GuardBench
's benchmarking pipeline can also be leveraged on custom datasets.
You can find the list of supported datasets here. A few of them requires authorization. Please, see here.
If you use GuardBench
to evaluate guardrail models for your scientific publications, please consider citing our work.
python>=3.10
pip install guardbench
from guardbench import benchmark
def moderate(
conversations: list[list[dict[str, str]]], # MANDATORY!
# additional `kwargs` as needed
) -> list[float]:
# do moderation
# return list of floats (unsafe probabilities)
benchmark(
moderate=moderate, # User-defined moderation function
model_name="My Guardrail Model",
batch_size=32,
datasets="all",
# Note: you can pass additional `kwargs` for `moderate`
)
Llama Guard
with GuardBench
.scripts
folder.Browse the documentation for more details about:
GuardBench
.Report
class to compare models and export results as LaTeX
tables.GuardBench
's benchmarking pipeline on custom datasets.You can find GuardBench
's leaderboard here.
All results can be reproduced using the provided scripts
.
If you want to submit your results, please contact us.
@inproceedings{guardbench,
title = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
author = "Bassani, Elias and
Sanchez, Ignacio",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.emnlp-main.1022.pdf",
pages = "18393--18409",
}
Would you like to see other features implemented? Please, open a feature request.
GuardBench is provided as open-source software licensed under EUPL v1.2.
FAQs
GuardBench: A Large-Scale Benchmark for Guardrail Models
We found that guardbench demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Fluent Assertions is facing backlash after dropping the Apache license for a commercial model, leaving users blindsided and questioning contributor rights.
Research
Security News
Socket researchers uncover the risks of a malicious Python package targeting Discord developers.
Security News
The UK is proposing a bold ban on ransomware payments by public entities to disrupt cybercrime, protect critical services, and lead global cybersecurity efforts.