Empirical CLI
Empirical is the fastest way to test different LLMs, prompts, and other model configurations across all the scenarios that matter for your application.
With Empirical, you can:
- Run your test datasets locally against off-the-shelf models
- Test your own custom models and RAG applications (see how-to)
- View, compare, and analyze outputs in a web UI
- Score your outputs with scoring functions
- Run tests on CI/CD
Watch demo video | See all docs
Usage
See quick start on docs →
Empirical bundles together a CLI and a web app. The CLI handles running tests and
the web app visualizes results.
Everything runs locally, with a JSON configuration file, empiricalrc.json.
Required: Node.js 20+ must be installed on your system.
Start with a basic example
In this example, we will ask an LLM to parse user messages to extract entities and
give us a structured JSON output. For example, "I'm Alice from Maryland" will
become "{name: 'Alice', location: 'Maryland'}".
Our test will succeed if the model outputs valid JSON.
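The pass/fail criterion here — "did the model emit valid JSON?" — boils down to a single parse check. A minimal sketch of such a check in TypeScript (a hypothetical helper for illustration, not Empirical's built-in scorer API):

```typescript
// Hypothetical JSON-validity scorer: returns score 1 when the model output
// parses as JSON, 0 otherwise. Empirical provides its own scoring functions;
// this only illustrates the check the example relies on.
function scoreJsonValidity(output: string): { score: number; message: string } {
  try {
    JSON.parse(output);
    return { score: 1, message: "output is valid JSON" };
  } catch (err) {
    return { score: 0, message: `invalid JSON: ${(err as Error).message}` };
  }
}

console.log(scoreJsonValidity('{"name": "Alice", "location": "Maryland"}').score); // 1
console.log(scoreJsonValidity("I'm Alice from Maryland").score); // 0
```

Note that the model must emit strict JSON (double-quoted keys and strings) for the parse to succeed; the shorthand `{name: 'Alice', ...}` shown above is illustrative, not parseable JSON.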
- Use the CLI to create a sample configuration file called empiricalrc.json.
npx empiricalrun init
cat empiricalrc.json
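The generated empiricalrc.json is the canonical reference for the schema. Purely as an illustrative sketch, the file ties together the pieces described above — model runs to compare, a dataset of inputs, and scorers (the field names below are assumptions; defer to what init actually produces and to the docs):

```json
{
  "runs": [
    {
      "type": "model",
      "provider": "openai",
      "model": "gpt-3.5-turbo",
      "prompt": "Extract the name and location from: {{user_message}}"
    }
  ],
  "dataset": {
    "samples": [
      { "inputs": { "user_message": "I'm Alice from Maryland" } }
    ]
  }
}
```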
- Run the test samples against the models with the run command. This step requires the OPENAI_API_KEY environment variable to authenticate with OpenAI. This execution will cost $0.0026, based on the selected models.
npx empiricalrun
- Use the ui command to open the reporter web app and see side-by-side results.
npx empiricalrun ui
Make it yours
Edit the empiricalrc.json file to make Empirical work for your use-case.