🚀 Big News: Socket Acquires Coana to Bring Reachability Analysis to Every Appsec Team.Learn more →

Book a Demo Install Sign in

assert-llm-tools

Package Overview

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

assert-llm-tools

Automated Summary Scoring & Evaluation of Retained Text

0.3.0

PyPI

Maintainers: 1

ASSERT LLM Tools

Automated Summary Scoring & Evaluation of Retained Text

ASSERT LLM Tools is a Python library for evaluating summaries and RAG (Retrieval-Augmented Generation) outputs using various metrics, both traditional (ROUGE, BLEU, BERTScore) and LLM-based.

Features

Summary Evaluation: Measure summary quality with metrics like faithfulness, topic preservation, coherence, and more
RAG Evaluation: Evaluate RAG systems with metrics for answer relevance, context relevance, and faithfulness
Multiple LLM Providers: Support for OpenAI and AWS Bedrock APIs
PII Detection & Masking: Automatically detect and mask personally identifiable information before evaluation
Proxy Support: Comprehensive proxy configuration for corporate environments
Extensible Architecture: Easy to add new metrics or LLM providers

Installation

pip install assert_llm_tools

For additional features, install optional dependencies:

# For AWS Bedrock support
pip install assert_llm_tools[bedrock]

# For OpenAI support
pip install assert_llm_tools[openai]

# For all optional dependencies
pip install assert_llm_tools[all]

Quick Start

Summary Evaluation

Evaluate a summary against original text:

from assert_llm_tools import evaluate_summary, LLMConfig

# Configure LLM for evaluation
config = LLMConfig(
    provider="openai",
    model_id="gpt-4",
    api_key="your-openai-api-key"
)

# Evaluate the summary
results = evaluate_summary(
    full_text="Original long text goes here...",
    summary="Summary to evaluate goes here...",
    metrics=["rouge", "faithfulness", "hallucination", "coherence"],
    llm_config=config
)

print(results)

RAG Evaluation

Evaluate a RAG system output:

from assert_llm_tools import evaluate_rag, LLMConfig

# Configure LLM for evaluation
config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1"
)

# Evaluate the RAG output
results = evaluate_rag(
    question="What are the main effects of climate change?",
    answer="Climate change leads to rising sea levels, increased temperatures...",
    context="Climate change refers to long-term shifts in temperatures...",
    llm_config=config,
    metrics=["answer_relevance", "faithfulness"]
)

print(results)

Proxy Configuration

ASSERT LLM Tools supports various proxy configurations for environments that require proxies to access external APIs.

Using a General Proxy

from assert_llm_tools import LLMConfig

# Configure with a single proxy for both HTTP and HTTPS
config = LLMConfig(
    provider="openai",
    model_id="gpt-4",
    api_key="your-openai-api-key",
    proxy_url="http://proxy.example.com:8080"
)

Using Protocol-Specific Proxies

from assert_llm_tools import LLMConfig

# Configure with separate proxies for HTTP and HTTPS
config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1",
    http_proxy="http://http-proxy.example.com:8080",
    https_proxy="http://https-proxy.example.com:8443"
)

Using Environment Variables

The library also respects standard environment variables for proxy configuration:

# Set environment variables
export HTTP_PROXY="http://proxy.example.com:8080"
export HTTPS_PROXY="http://proxy.example.com:8443"

Then create configuration without explicit proxy settings:

# No proxy settings in code - will use environment variables
config = LLMConfig(
    provider="openai",
    model_id="gpt-4",
    api_key="your-openai-api-key"
)

Proxy Authentication

For proxies that require authentication, include the username and password in the URL:

config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1",
    proxy_url="http://username:password@proxy.example.com:8080"
)

Available Metrics

Summary Metrics

rouge: ROUGE-1, ROUGE-2, and ROUGE-L scores
bleu: BLEU score
bert_score: BERTScore precision, recall, and F1
bart_score: BARTScore
faithfulness: Measures factual consistency with the source text
hallucination: Detects claims in the summary not supported by the source text (returns hallucination_score)
topic_preservation: How well the summary preserves main topics
redundancy: Measures repetitive content
conciseness: Evaluates information density and brevity
coherence: Measures logical flow and readability

RAG Metrics

answer_relevance: How well the answer addresses the question
context_relevance: How relevant the retrieved context is to the question
faithfulness: Factual consistency between answer and context
answer_attribution: How much of the answer is derived from the context
completeness: Whether the answer addresses all aspects of the question

Advanced Configuration

PII Detection and Masking

For privacy-sensitive applications, you can automatically detect and mask personally identifiable information (PII) before evaluation:

# Basic PII masking
results = evaluate_summary(
    full_text="John Smith (john.smith@example.com) lives in New York.",
    summary="John's contact is john.smith@example.com.",
    metrics=["rouge", "faithfulness"],
    llm_config=config,
    mask_pii=True  # Enable PII masking
)

# Advanced PII masking with more options
results, pii_info = evaluate_summary(
    full_text=text_with_pii,
    summary=summary_with_pii,
    metrics=["rouge", "faithfulness"],
    llm_config=config,
    mask_pii=True,
    mask_pii_char="#",  # Custom masking character
    mask_pii_preserve_partial=True,  # Preserve parts of emails, phone numbers, etc.
    mask_pii_entity_types=["PERSON", "EMAIL_ADDRESS", "LOCATION"],  # Only mask specific entities
    return_pii_info=True  # Return information about detected PII
)

# Access PII detection results
print(f"PII in original text: {pii_info['full_text_pii']}")
print(f"PII in summary: {pii_info['summary_pii']}")

The same PII masking options are available for RAG evaluation:

results = evaluate_rag(
    question="Who is John Smith and what is his email?",
    answer="John Smith's email is john.smith@example.com.",
    context="John Smith (john.smith@example.com) is our company's CEO.",
    llm_config=config,
    metrics=["answer_relevance", "faithfulness"],
    mask_pii=True
)

Custom Model Selection

For BERTScore calculation, you can specify the model to use:

results = evaluate_summary(
    full_text=source,
    summary=summary,
    metrics=["bert_score"],
    bert_model="microsoft/deberta-xlarge-mnli"  # More accurate but slower
)

AWS Credentials for Bedrock

config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1",
    api_key="YOUR_AWS_ACCESS_KEY",
    api_secret="YOUR_AWS_SECRET_KEY",
    aws_session_token="YOUR_SESSION_TOKEN"  # Optional
)

Additional Model Parameters

config = LLMConfig(
    provider="openai",
    model_id="gpt-4",
    api_key="your-openai-api-key",
    additional_params={
        "response_format": {"type": "json_object"},
        "seed": 42
    }
)

License

MIT

FAQs

What is assert-llm-tools?

Is assert-llm-tools well maintained?

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

assert-llm-tools

ASSERT LLM Tools

Features

Installation

Quick Start

Summary Evaluation

RAG Evaluation

Proxy Configuration

Using a General Proxy

Using Protocol-Specific Proxies

Using Environment Variables

Proxy Authentication

Available Metrics

Summary Metrics

RAG Metrics

Advanced Configuration

PII Detection and Masking

Custom Model Selection

AWS Credentials for Bedrock

Additional Model Parameters

License

Related posts

8 More Malicious Firefox Extensions: Exploiting Popular Game Recognition, Hijacking User Sessions, and Stealing OAuth Credentials

Official Go SDK for MCP in Development, Stable Release Expected in August