
evals.do


Evaluation tools for AI components, functions, workflows, and agents. Based on evalite with cloud storage integration.

Installation

```sh
npm install evals.do
# or
yarn add evals.do
# or
pnpm add evals.do
```

Usage

```ts
import { evals, EvalsClient } from 'evals.do'

// Use the default client
const test = await evals.createTest({
  name: 'My Test',
  input: { prompt: 'Hello, world!' },
  expected: { response: 'Hi there!' },
})

// Or create a custom client
const customClient = new EvalsClient({
  baseUrl: 'https://custom-evals.do',
  apiKey: 'your-api-key',
  storeLocally: true,
  storeRemotely: true,
  dbPath: './my-evals.db',
})

// Create and run tests
const tests = await Promise.all([
  customClient.createTest({
    name: 'Test 1',
    input: { prompt: 'Tell me a joke' },
    expected: { type: 'joke' },
  }),
  customClient.createTest({
    name: 'Test 2',
    input: { prompt: 'What is the capital of France?' },
    expected: { answer: 'Paris' },
  }),
])

// Define a task executor
const executor = {
  execute: async (input: any) => {
    // Call your AI function, workflow, or agent here
    return { response: `Processed: ${input.prompt}` }
  },
}

// Define metrics
const metrics = {
  accuracy: {
    calculate: (result: any, expected: any) => {
      // Implement your accuracy metric here
      return result.response === expected.response ? 1 : 0
    },
  },
}

// Run evaluation
const results = await customClient.evaluate(executor, tests, {
  metrics,
  concurrency: 1,
  timeout: 30000,
})

console.log(`Evaluation complete: ${results.id}`)
console.log(`Results: ${JSON.stringify(results.results, null, 2)}`)
```
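The accuracy metric above is an exact string match; in practice you often want softer scoring. Here is a minimal sketch of a token-overlap metric that could be plugged into the same `metrics` shape — the `tokenOverlap` function is illustrative, not part of the evals.do API:

```typescript
// Token-overlap score in [0, 1]: the fraction of expected tokens
// that also appear in the actual response. Illustrative only —
// not part of the evals.do API.
function tokenOverlap(actual: string, expected: string): number {
  const tokenize = (s: string) =>
    s.toLowerCase().split(/\W+/).filter(Boolean)
  const expectedTokens = new Set(tokenize(expected))
  if (expectedTokens.size === 0) return 0
  const actualTokens = new Set(tokenize(actual))
  let hits = 0
  for (const t of expectedTokens) {
    if (actualTokens.has(t)) hits++
  }
  return hits / expectedTokens.size
}

// Same { calculate } shape as the metrics object above
const fuzzyMetrics = {
  overlap: {
    calculate: (result: any, expected: any) =>
      tokenOverlap(result.response ?? '', expected.response ?? ''),
  },
}
```

A graded score like this is usually more informative than 0/1 accuracy when model output is free-form text.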

API Reference

EvalsClient

The main client for creating and running evaluations.

Constructor

```ts
new EvalsClient(options?: EvalsOptions)
```

Options:

  • baseUrl: The URL of the evals.do API (default: 'https://evals.do')
  • apiKey: Your API key for authentication
  • storeLocally: Whether to store data locally (default: true)
  • storeRemotely: Whether to store data remotely (default: true)
  • dbPath: Path to the local SQLite database (default: './node_modules/evalite/.evalite.db')

Methods

  • createTest(test: Partial<Test>): Promise<Test> - Create a new test
  • getTest(id: string): Promise<Test | null> - Get a test by ID
  • createResult(result: Partial<Result>): Promise<Result> - Create a new result
  • getResult(id: string): Promise<Result | null> - Get a result by ID
  • createRun(run: Partial<TestRun>): Promise<TestRun> - Create a new test run
  • getRun(id: string): Promise<TestRun | null> - Get a run by ID
  • evaluate<T, R>(executor: TaskExecutor<T, R>, tests: Test[], options?: EvaluationOptions): Promise<TestRun> - Run an evaluation
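To make the create/get contract concrete, here is a minimal in-memory stand-in (a sketch only — the real EvalsClient persists to the local SQLite database and/or the remote API, and the field defaults below are assumptions for illustration):

```typescript
// In-memory sketch of the createTest/getTest round trip.
// Not the real EvalsClient; shapes and defaults are illustrative.
interface Test {
  id: string
  name: string
  input: unknown
  expected: unknown
}

class InMemoryEvals {
  private tests = new Map<string, Test>()

  async createTest(test: Partial<Test>): Promise<Test> {
    const full: Test = {
      id: test.id ?? `test-${this.tests.size + 1}`,
      name: test.name ?? 'unnamed',
      input: test.input ?? null,
      expected: test.expected ?? null,
    }
    this.tests.set(full.id, full)
    return full
  }

  async getTest(id: string): Promise<Test | null> {
    return this.tests.get(id) ?? null
  }
}
```

The pattern to note is that each `create*` method fills in missing fields and returns the stored record, while each `get*` method resolves to `null` rather than throwing when the ID is unknown.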

Contributing

We welcome contributions! Please see our Contributing Guide for more details.

License

MIT

Dependencies

  • apis.do - Unified API Gateway for all domains and services in the .do ecosystem

Keywords

evaluation

Package last updated on 14 Apr 2025
