Reproducible Testbed for Evaluating and Improving Language Model Alignment with Diverse User Values
SynthLabs.ai/research/persona | Try PERSONA Bench Online
📄 Paper | 🗃️ Research Visualizations | 🤗 Hugging Face | Online Personalization Toolkit
🌐 SynthLabs Research | 👥 Join the Team | 🤝 Let's Collaborate
PERSONA Bench is an extension of the PERSONA framework introduced in Castricato et al. 2024. It provides a reproducible testbed for evaluating and improving the alignment of language models with diverse user values.
PERSONA established a strong correlation between human judges and language models in persona-based personalization tasks. Building on this foundation, we've developed a suite of robust evaluations to test a model's ability to perform personalization-related tasks. This repository provides practitioners with tools to assess and improve the pluralistic alignment of their language models.
There are two intended ways to use PERSONA Bench:
Via the API: This method provides easy integration and evaluation of your models, including a novel "comparison" evaluation type. For detailed instructions, see the PERSONA API section below. To get started with the API you can create an account and try the testbed here.
Via InspectAI: This method allows you to run evaluations using the InspectAI framework, which provides additional visualization tools. For instructions on running with InspectAI, refer to the Running with InspectAI section.
Both methods offer comprehensive evaluation capabilities, but the API method is generally more straightforward for most users and includes the exclusive "comparison" evaluation type.
Install Poetry if you haven't already:
curl -sSL https://install.python-poetry.org | python3 -
Install the package:
poetry add persona-bench
Use in your Python script:
from dotenv import load_dotenv
from persona_bench import evaluate_model
# optional, you can also pass the environment variables directly to evaluate_model
load_dotenv()
eval = evaluate_model("gpt-3.5-turbo", evaluation_type="main")
print(eval.results.model_dump())
PERSONA Bench now offers an API for easy integration and evaluation of your models. The API provides access to all evaluation types available in PERSONA Bench, including a novel evaluation type called "comparison" for grounded personalization evaluation.
Install the package:
pip install persona-bench
Set up your API key:
export SYNTH_API_KEY=your_api_key_here
Use in your Python script:
from persona_bench.api import PERSONAClient
from persona_bench.api.prompt_constructor import ChainOfThoughtPromptConstructor
# Create a PERSONAClient object
client = PERSONAClient(
    model_str="your_model_name",
    evaluation_type="comparison",  # Run a grounded evaluation, API exclusive!
    N=50,
    prompt_constructor=ChainOfThoughtPromptConstructor(),
    # If not set as an environment variable, pass the API key here:
    # api_key="your_api_key_here"
)
# Iterate through questions and log answers
for idx, q in enumerate(client):
    answer = your_model_function(q["system"], q["user"])
    client.log_answer(idx, answer)
# Evaluate the results
results = client.evaluate(drop_answer_none=True)
print(results)
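In the snippet above, your_model_function is a placeholder for however you call your own model. A minimal sketch, assuming an OpenAI chat model queried through the official openai Python client (the function name, model choice, and client setup are illustrative and not part of PERSONA Bench):

from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def your_model_function(system_prompt: str, user_prompt: str) -> str:
    # Send the persona-conditioned system prompt and the question to a chat
    # model and return the text of its reply.
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content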
Create a PERSONAClient object with the following parameters:

model_str: The identifier for this evaluation task
evaluation_type: Type of evaluation ("main", "loo", "intersectionality", "pass_at_k", "comparison")
N: Number of samples for evaluation
prompt_constructor: Custom prompt constructor (optional)
intersection: List of intersection attributes (required for intersectionality evaluation)
loo_attributes: Leave-one-out attributes (required for LOO evaluation)
seed: Random seed for reproducibility (optional)
url: API endpoint URL (optional, default is "https://synth-api-development.eastus.azurecontainer.io/api/v1/personas/v1/")
api_key: Your SYNTH API key (optional if set as an environment variable)

Use the client as an iterable to access questions:
for idx, question in enumerate(client):
    system_prompt = question["system"]
    user_prompt = question["user"]
    answer = your_model_function(system_prompt, user_prompt)
    client.log_answer(idx, answer)
Evaluate the logged answers:
results = client.evaluate(drop_answer_none=True, save_scores=False)
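For example, a client configured for a leave-one-out evaluation with a fixed seed might look like the sketch below; the loo_attributes value shown is an assumption based on the attribute names listed in the Leave One Out Analysis section further down:

loo_client = PERSONAClient(
    model_str="my_loo_run",
    evaluation_type="loo",
    N=50,
    # Assumed format: a list of attribute names to leave out, drawn from the
    # attributes documented in the Leave One Out Analysis section below.
    loo_attributes=["age", "education"],
    seed=42,
)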
Create a custom prompt constructor by inheriting from BasePromptConstructor:
from persona_bench.api.prompt_constructor import BasePromptConstructor
class MyCustomPromptConstructor(BasePromptConstructor):
    def construct_prompt(self, persona, question):
        # Implement your custom prompt construction logic
        pass
client = PERSONAClient(
    # ... other parameters ...
    prompt_constructor=MyCustomPromptConstructor(),
)
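A minimal sketch of a concrete constructor is shown below. It assumes construct_prompt receives the persona description and the question and returns the same {"system": ..., "user": ...} shape the client yields when iterated; check BasePromptConstructor for the authoritative contract before relying on this.

class PersonaFirstPromptConstructor(BasePromptConstructor):
    def construct_prompt(self, persona, question):
        # Assumption: return the same {"system": ..., "user": ...} shape that
        # the client yields when iterated.
        system = (
            "You are role-playing the following person. Answer every question "
            "as they would.\n" + str(persona)
        )
        return {"system": system, "user": question}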
Access the underlying data using indexing:
question = client[0] # Get the first question
answers = [generate_answer(q) for q in client]
client.set_answers(answers)
The comparison evaluation is our most advanced and grounded assessment, exclusively available through the PERSONA API. It provides a robust measure of a model's personalization capabilities using known gold truth answers.
Example usage:
from persona_bench.api import PERSONAClient
client = PERSONAClient(model_str="your_identifier_name", evaluation_type="comparison", N=50)
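As with the other evaluation types, iterate over the client, log an answer for each question with log_answer, and then call evaluate() to score the run.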
Clone the repository:
git clone https://github.com/SynthLabsAI/PERSONA-bench.git
cd PERSONA-bench
Install dependencies:
poetry install
Install pre-commit hooks:
poetry run pre-commit install
Set up HuggingFace authentication:
huggingface-cli login
Set up environment variables:
cp .env.example .env
vim .env
The main evaluation script assesses a model's ability to generate personalized responses based on given personas from our custom filtered PRISM dataset.
This evaluation measures the impact of individual attributes on personalization performance.
Available attributes include age, sex, race, education, employment status, and many more. See the leave one out example json for formatting.
The available attributes are:
[
"age",
"sex",
"race",
"ancestry",
"household language",
"education",
"employment status",
"class of worker",
"industry category",
"occupation category",
"detailed job description",
"income",
"marital status",
"household type",
"family presence and age",
"place of birth",
"citizenship",
"veteran status",
"disability",
"health insurance",
"big five scores",
"defining quirks",
"mannerisms",
"personal time",
"lifestyle",
"ideology",
"political views",
"religion",
"cognitive difficulty",
"ability to speak english",
"vision difficulty",
"fertility",
"hearing difficulty"
]
Example usage:
from dotenv import load_dotenv
from persona_bench import evaluate_model
# optional, you can also pass the environment variables directly to evaluate_model
# make sure that your .env file specifies where the loo_json is!
load_dotenv()
eval = evaluate_model("gpt-3.5-turbo", evaluation_type="loo")
print(eval.results.model_dump())
Evaluate model performance across different demographic intersections.
See the intersectionality example json.
This configuration defines two intersections:
Males aged 18-34
Females aged 18-34
You can use any of the attributes available in the LOO evaluation to create intersections. For attributes with non-enumerable values (e.g., textual background information), you may need to modify the intersection script to use language model embeddings for computing subpopulations.
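As with the leave-one-out evaluation, this can be run through evaluate_model. A sketch, assuming your .env file points at your intersectionality JSON (mirroring the LOO example above):

from dotenv import load_dotenv
from persona_bench import evaluate_model

# optional, you can also pass the environment variables directly to evaluate_model
# assumption: the .env file specifies where your intersectionality json is,
# mirroring the loo_json setup above
load_dotenv()

eval = evaluate_model("gpt-3.5-turbo", evaluation_type="intersectionality")
print(eval.results.model_dump())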
Determines how many attempts are required to successfully personalize for a given persona.
WARNING! Pass@K is very credit intensive and may require multiple hours to complete a large run.
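A sketch of running Pass@K through evaluate_model, following the same pattern as the other evaluation types (expect long runtimes, per the warning above):

from dotenv import load_dotenv
from persona_bench import evaluate_model

load_dotenv()

# Pass@K is credit intensive and a large run may take multiple hours.
eval = evaluate_model("gpt-3.5-turbo", evaluation_type="pass_at_k")
print(eval.results.model_dump())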
Configure your .env file before running the scripts. You can set the generate mode to one of the following:
baseline: Generate an answer directly, not given the persona
output_only: Generate an answer given the persona, without chain of thought
chain_of_thought: Generate chain of thought before answering, given the persona
demographic_summary: Generate a summary of the persona before answering

# Activate the poetry environment
poetry shell
# Main Evaluation
inspect eval src/persona_bench/main_evaluation.py --model {model}
# Leave One Out Analysis
inspect eval src/persona_bench/main_loo.py --model {model}
# Intersectionality Evaluation
inspect eval src/persona_bench/main_intersectionality.py --model {model}
# Pass@K Evaluation
inspect eval src/persona_bench/main_pass_at_k.py --model {model}
Using Inspect AI allows you to use its visualization tooling, which is documented here.
We provide scripts for visualizing evaluation results:
visualization_loo.py: Leave One Out analysis
visualization_intersection.py: Intersectionality evaluation
visualization_pass_at_k.py: Pass@K evaluation

These scripts use the most recent log file by default. Use the --log parameter to specify a different log file. Local visualization is only supported by the inspect-ai backend. Visualization is also available on our web portal for API users here.
See pyproject.toml for a complete list of runtime and development dependencies.
If you use PERSONA in your research, please cite our paper:
@misc{castricato2024personareproducibletestbedpluralistic,
title={PERSONA: A Reproducible Testbed for Pluralistic Alignment},
author={Louis Castricato and Nathan Lile and Rafael Rafailov and Jan-Philipp Fränken and Chelsea Finn},
year={2024},
eprint={2407.17387},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2407.17387},
}
Join our Discord community for discussions, support, and updates or reach out to us at https://www.synthlabs.ai/contact.
This research is supported by SynthLabs. We thank our collaborators and the open-source community for their valuable contributions.
Copyright © 2024, SynthLabs. Released under the Apache License.