
GuardBench: A Large-Scale Benchmark for Guardrail Models

  • 1.0.0
  • PyPI

GuardBench

⚡️ Introduction

GuardBench is a Python library for evaluating guardrail models. It provides a common interface to 40 evaluation datasets, which are automatically downloaded and converted into a standardized format for improved usability. It also lets you quickly compare results and export LaTeX tables for scientific publications. GuardBench's benchmarking pipeline can also be applied to custom datasets.

You can find the list of supported datasets here. A few of them require authorization; please see here.

If you use GuardBench to evaluate guardrail models for your scientific publications, please consider citing our work.

✨ Features

  • 40 datasets for guardrail model evaluation.
  • Automated evaluation pipeline.
  • User-friendly.
  • Extendable.
  • Reproducible and shareable evaluations.
  • Exportable evaluation reports.

🔌 Requirements

python>=3.10

💾 Installation

pip install guardbench

💡 Usage

from guardbench import benchmark

def moderate(
    conversations: list[list[dict[str, str]]],  # MANDATORY!
    # additional `kwargs` as needed
) -> list[float]:
    # do moderation
    # return list of floats (unsafe probabilities)

benchmark(
    moderate=moderate,  # User-defined moderation function
    model_name="My Guardrail Model",
    batch_size=32,
    datasets="all", 
    # Note: you can pass additional `kwargs` for `moderate`
)
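As a concrete illustration, here is a minimal toy implementation of the `moderate` callback: a keyword blocklist that flags a conversation as unsafe if any message contains a blocklisted term. The `moderate` signature and the message keys (`role`, `content`) follow the snippet above; the blocklist and scoring heuristic are purely illustrative, not a real guardrail model.

```python
BLOCKLIST = {"bomb", "exploit", "malware"}  # illustrative only


def moderate(
    conversations: list[list[dict[str, str]]],  # mandatory first argument
) -> list[float]:
    """Return one unsafe probability per conversation.

    Toy heuristic: 1.0 if any message contains a blocklisted word,
    0.0 otherwise. A real guardrail model would run a classifier here.
    """
    scores: list[float] = []
    for conversation in conversations:
        text = " ".join(turn["content"].lower() for turn in conversation)
        scores.append(1.0 if any(word in text for word in BLOCKLIST) else 0.0)
    return scores


# Passing it to GuardBench, as in the snippet above:
# from guardbench import benchmark
# benchmark(moderate=moderate, model_name="Keyword Baseline", batch_size=32, datasets="all")
```

In practice, `moderate` would batch the conversations through your model and return its unsafe-class probabilities; any extra keyword arguments passed to `benchmark` are forwarded to it.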

📖 Examples

📚 Documentation

Browse the documentation for more details.

🏆 Leaderboard

You can find GuardBench's leaderboard here. All results can be reproduced using the provided scripts.
If you want to submit your results, please contact us.

👨‍💻 Authors

  • Elias Bassani (European Commission - Joint Research Centre)

🎓 Citation

@inproceedings{guardbench,
    title = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
    author = "Bassani, Elias  and
      Sanchez, Ignacio",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.1022.pdf",
    pages = "18393--18409",
}

🎁 Feature Requests

Would you like to see other features implemented? Please open a feature request.

📄 License

GuardBench is provided as open-source software licensed under EUPL v1.2.
