
LLM Evaluation for .NET Developers — No Python Required
EvalSharp brings the power of reliable LLM evaluation directly to your C# projects. Inspired by DeepEval, but designed for the .NET ecosystem, EvalSharp lets you measure LLM outputs with confidence using familiar C# tools and patterns.
dotnet add package EvalSharp
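// TType is a placeholder for your own test-case type; here it needs
// UserInput and LLMOutput properties.
var cases = new[]
{
    new TType
    {
        UserInput = "Please summarize the article on climate change impacts.",
        LLMOutput = "The article talks about how technology is advancing rapidly.",
    }
};

// Map each case onto the context that the metrics evaluate.
var evaluator = Evaluator.FromData(
    ChatClient.GetInstance(),
    cases,
    c => new MetricEvaluationContext
    {
        InitialInput = c.UserInput,
        ActualOutput = c.LLMOutput
    }
);

// Add the answer relevancy metric, including the reason with each result.
evaluator.AddAnswerRelevancy(includeReason: true);
var result = await evaluator.RunAsync();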
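The dataset example uses TType without defining it. As a minimal sketch, assuming TType is only a placeholder for a user-defined case type, a plain class with the two properties the example reads should be enough. The name SummaryCase is hypothetical and is not part of EvalSharp.

```csharp
// Hypothetical case type standing in for TType in the dataset example.
// Only the two properties that example reads are declared; this class
// is an illustration and is not shipped with the EvalSharp package.
public sealed class SummaryCase
{
    // The prompt that was sent to the LLM.
    public string UserInput { get; init; } = string.Empty;

    // The response the LLM produced for that prompt.
    public string LLMOutput { get; init; } = string.Empty;
}
```

With a type like this in place, new TType in the example would become new SummaryCase, and the mapping lambda can keep reading c.UserInput and c.LLMOutput unchanged.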
In addition to evaluating datasets with the Evaluator, EvalSharp makes it easy to include LLM evaluation in your unit tests. The EvalTest.AssertAsync method allows you to assert results for a single test with one or more metrics.
using System.Collections.Generic;
using System.Threading.Tasks;
using EvalSharp.Models;
using EvalSharp.Scoring;
using Xunit;
using Xunit.Abstractions;

public class MyEvalTests
{
    // xUnit injects the output helper; results are written through it.
    private readonly ITestOutputHelper _testOutputHelper;

    public MyEvalTests(ITestOutputHelper testOutputHelper)
    {
        _testOutputHelper = testOutputHelper;
    }

    [Fact]
    public async Task SingleTest_MultipleMetrics()
    {
        var testData = new EvaluatorTestData
        {
            InitialInput = "Summarize the meeting.",
            ActualOutput = "The meeting summary is provided below...",
        };

        var rel_config = new AnswerRelevancyMetricConfiguration
        {
            IncludeReason = true,
            Threshold = 0.9
        };

        var geval_config = new GEvalMetricConfiguration
        {
            Threshold = 0.5,
            Criteria = "Does the output correctly explain concepts, events, or processes based on the input prompt?"
        };

        // Both metrics are evaluated against the same test data.
        var metrics = new List<Metric>
        {
            new AnswerRelevancyMetric(ChatClient.GetInstance(), rel_config),
            new GEvalMetric(ChatClient.GetInstance(), geval_config)
        };

        await EvalTest.AssertAsync(testData, metrics, _testOutputHelper.WriteLine);
    }
}
✅ Supports multiple metrics in a single call
✅ Outputs results to your preferred sink (e.g., Console, xUnit test output)
✅ Ideal for lightweight, targeted LLM evaluation in CI/CD pipelines
Supported metrics:
✅ Answer Relevancy — Is the LLM's response relevant to the input?
✅ Bias — Checks for content biases.
✅ Contextual Precision — Measures if output precisely reflects provided context.
✅ Contextual Recall — Assesses how much of the relevant context was included in the output.
✅ Faithfulness — Evaluates factual correctness and grounding of the output.
✅ GEval — Enforces structure, logical flow, and coverage expectations.
✅ Hallucination — Detects whether the LLM generated unsupported or fabricated content.
✅ Match — Compares expected and actual output for equality or similarity.
✅ Prompt Alignment — Ensures output follows the intent and structure of the prompt.
✅ Summarization — Scores the quality and accuracy of generated summaries.
✅ Task Completion — Measures whether the LLM's output fulfills the requested task.
✅ Tool Correctness — Evaluates whether tool-augmented LLM responses are correct.
We're just getting started. Here's what's coming soon to EvalSharp:
This project is licensed under the MIT License. See the LICENSE file for details.
Portions of this project include content adapted from deepeval, which is licensed under the Apache License 2.0. See the NOTICE file for attribution.
Aviron Software would like to give a special thanks to the team at DeepEval. Their original metrics and prompts are the catalysts for this project.