trpc.group/trpc-go/trpc-agent-go/examples/evaluation

Go Modules
Version: v1.1.0

Local Evaluation Example

This example runs the evaluation pipeline with a local file-backed manager. Evaluation sets, metric definitions, and run results all live on disk so you can inspect or version them alongside source code.

Environment Variables

The example supports the following environment variables:

| Variable | Description | Default Value |
| --- | --- | --- |
| OPENAI_API_KEY | API key for the model service (required) | (none) |
| OPENAI_BASE_URL | Base URL for the model API endpoint | https://api.openai.com/v1 |

Note: The OPENAI_API_KEY is required for the example to work.
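As a rough illustration of the two variables above, the following sketch reads them with the documented default for the base URL. The `modelEnv` type and `loadModelEnv` function are hypothetical names introduced here, not part of trpc-agent-go, and the example's real code may resolve these settings differently.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// defaultBaseURL mirrors the default listed for OPENAI_BASE_URL.
const defaultBaseURL = "https://api.openai.com/v1"

// modelEnv is a hypothetical holder for the two settings (not a trpc-agent-go type).
type modelEnv struct {
	APIKey  string
	BaseURL string
}

// loadModelEnv reads the settings through a lookup function
// (os.Getenv in normal use) so the logic can be exercised without
// touching the real environment.
func loadModelEnv(getenv func(string) string) (modelEnv, error) {
	key := getenv("OPENAI_API_KEY")
	if key == "" {
		return modelEnv{}, errors.New("OPENAI_API_KEY is required")
	}
	base := getenv("OPENAI_BASE_URL")
	if base == "" {
		base = defaultBaseURL // fall back to the documented default
	}
	return modelEnv{APIKey: key, BaseURL: base}, nil
}

func main() {
	env, err := loadModelEnv(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("using base URL:", env.BaseURL)
}
```

Failing early on a missing key gives a clearer error than letting the first model request fail.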

Configuration Flags

| Flag | Description | Default |
| --- | --- | --- |
| -model | Model identifier used by the calculator agent | deepseek-chat |
| -streaming | Enable streaming responses from the LLM | false |
| -data-dir | Directory containing .evalset.json and .metrics.json files | ./data |
| -output-dir | Directory where evaluation results are written | ./output |
| -eval-set | Evaluation set ID to execute | math-basic |
| -runs | Number of repetitions per evaluation case | 1 |
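The flags above could be declared with Go's standard `flag` package roughly as follows. This is a minimal sketch that only reproduces the documented names and defaults; the `options` struct and `parseOptions` function are hypothetical, and the example's actual `main.go` may be organized differently.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options is a hypothetical struct collecting the documented flags.
type options struct {
	Model     string
	Streaming bool
	DataDir   string
	OutputDir string
	EvalSet   string
	Runs      int
}

// parseOptions parses args (typically os.Args[1:]) using the
// flag names and defaults from the table above.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("evaluation", flag.ContinueOnError)
	fs.StringVar(&o.Model, "model", "deepseek-chat", "model identifier used by the calculator agent")
	fs.BoolVar(&o.Streaming, "streaming", false, "enable streaming responses from the LLM")
	fs.StringVar(&o.DataDir, "data-dir", "./data", "directory containing .evalset.json and .metrics.json files")
	fs.StringVar(&o.OutputDir, "output-dir", "./output", "directory where evaluation results are written")
	fs.StringVar(&o.EvalSet, "eval-set", "math-basic", "evaluation set ID to execute")
	fs.IntVar(&o.Runs, "runs", 1, "number of repetitions per evaluation case")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```

Using a dedicated `flag.FlagSet` instead of the package-level flag functions keeps the parsing testable with arbitrary argument slices.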

Run

cd trpc-agent-go/examples/evaluation/local
go run . \
  -model "deepseek-chat" \
  -data-dir "./data" \
  -output-dir "./output" \
  -eval-set "math-basic" \
  -runs 1

The run prints a case-by-case summary and writes detailed JSON result files to ./output/math-eval-app.

Data Layout

data/
└── math-eval-app/
    ├── math-basic.evalset.json    # EvalSet file for math-basic.
    └── math-basic.metrics.json    # Metric file for math-basic EvalSet.

You can add new cases or metrics by editing these JSON files or by creating additional evaluation set IDs under the same directory.
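To make the layout concrete, this small sketch derives the two file paths for a given evaluation set ID. It assumes the naming shown in the tree above (`<eval-set>.evalset.json` and `<eval-set>.metrics.json` inside a directory named after the app); `evalSetPaths` is a hypothetical helper, not a function from trpc-agent-go.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// evalSetPaths returns the expected EvalSet and metric file paths
// for one evaluation set, following the directory tree shown above.
func evalSetPaths(dataDir, appName, evalSetID string) (evalSet, metrics string) {
	base := filepath.Join(dataDir, appName)
	evalSet = filepath.Join(base, evalSetID+".evalset.json")
	metrics = filepath.Join(base, evalSetID+".metrics.json")
	return evalSet, metrics
}

func main() {
	evalSet, metrics := evalSetPaths("./data", "math-eval-app", "math-basic")
	fmt.Println(evalSet)
	fmt.Println(metrics)
}
```

Under this assumption, adding a new evaluation set means creating a matching pair of files with a new ID and passing that ID via -eval-set.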

Output

EvalResult file

output/
└── math-eval-app/
    └── math-eval-app_math-basic_538cdf6e-925d-41cf-943b-2849982b195e.evalset_result.json    # EvalResult file for math-basic EvalSet.
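The result file name above appears to follow the pattern `<app>_<eval-set>_<id>.evalset_result.json`. That pattern is inferred from this single example and is not a documented guarantee. Under that assumption, a script that post-processes results could split a name like this (`parseResultName` is a hypothetical helper):

```go
package main

import (
	"fmt"
	"strings"
)

// parseResultName splits a result file name into the evaluation set ID
// and the trailing run identifier. It assumes the pattern
// <app>_<eval-set>_<id>.evalset_result.json, inferred from one example.
func parseResultName(name, appName string) (evalSetID, runID string, ok bool) {
	const suffix = ".evalset_result.json"
	prefix := appName + "_"
	if !strings.HasSuffix(name, suffix) || !strings.HasPrefix(name, prefix) {
		return "", "", false
	}
	rest := strings.TrimSuffix(strings.TrimPrefix(name, prefix), suffix)
	// The identifier is taken to be everything after the last underscore.
	i := strings.LastIndex(rest, "_")
	if i <= 0 || i == len(rest)-1 {
		return "", "", false
	}
	return rest[:i], rest[i+1:], true
}

func main() {
	name := "math-eval-app_math-basic_538cdf6e-925d-41cf-943b-2849982b195e.evalset_result.json"
	evalSetID, runID, ok := parseResultName(name, "math-eval-app")
	fmt.Println(evalSetID, runID, ok)
}
```

Passing the app name explicitly avoids guessing where the app name ends when it or the evaluation set ID contains underscores.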

Log

✅ Evaluation completed
App: math-eval-app
Eval Set: math-basic
Overall Status: passed
Runs: 1
Case calc_add -> passed
  Metric tool_trajectory_avg_score: score 1.00 (threshold 1.00) => passed

Case calc_multiply -> passed
  Metric tool_trajectory_avg_score: score 1.00 (threshold 1.00) => passed
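Each metric line in the log pairs a score with a threshold. The sketch below reproduces that line format under the assumption that a metric passes when its score is at least the threshold. This is consistent with the log above, but the comparison rule is not stated in this README. The `metricResult` type and the "failed" label are hypothetical.

```go
package main

import "fmt"

// metricResult is a hypothetical record for one metric of one case.
type metricResult struct {
	Name      string
	Score     float64
	Threshold float64
}

// passed assumes a metric passes when score >= threshold.
func (m metricResult) passed() bool {
	return m.Score >= m.Threshold
}

// formatMetric renders a line in the same shape as the log above.
func formatMetric(m metricResult) string {
	status := "failed" // assumed label for the non-passing case
	if m.passed() {
		status = "passed"
	}
	return fmt.Sprintf("  Metric %s: score %.2f (threshold %.2f) => %s",
		m.Name, m.Score, m.Threshold, status)
}

func main() {
	fmt.Println(formatMetric(metricResult{
		Name:      "tool_trajectory_avg_score",
		Score:     1.0,
		Threshold: 1.0,
	}))
}
```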

Package last updated on 29 Dec 2025