# labrat

Autonomous research & experiment loops for coding agents.

Point your agent at a problem, go to sleep, wake up to results. Labrat runs an autonomous experiment loop — it modifies code, evaluates the result, keeps what works, throws away what doesn't, and repeats. Works with any measurable optimization target or open-ended research question.
Inspired by karpathy/autoresearch and pi-autoresearch, and generalized beyond ML: it works on any codebase with any agent that supports the Agent Skills standard (Claude Code, Codex CLI, etc.).
## Quick Start

```sh
claude plugin add github:pawanpaudel93/labrat
npx labrat-agent init
```

Then tell your agent what to work on:
```
Optimize my API response time. Target: src/api/handler.ts. Eval: npm run bench. Metric: response_time_ms (minimize). Constraint: memory under 512MB.
```
The agent creates a `labrat/api-perf` branch, runs a baseline, then starts iterating autonomously.
## How It Works
Labrat adapts to what you give it. Give it a metric and an eval command, and it runs a tight experiment loop. Give it a research question, and it explores. If a metric emerges during exploration, it transitions to the experiment loop automatically.
### The experiment loop
When a metric + eval command are available, the agent runs a keep/discard loop:
1. Pick an experiment idea (informed by what worked/failed before)
2. Modify target files
3. Quick checks (lint, typecheck) — fail fast if broken
4. Run benchmark, extract metric
5. If improved: run correctness checks, check constraints
6. Keep (commit) or discard (revert target files)
7. Log everything to `.labrat/labrat-results.jsonl`
8. Repeat until interrupted or context limit
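The keep/discard core of the steps above can be sketched as a tiny shell loop. The metric values are stubbed with fixed numbers so the control flow is visible; this is an illustration, not labrat's actual implementation:

```bash
#!/usr/bin/env bash
# Sketch of the keep/discard decision (lower metric = better).
# Stubbed metrics stand in for `npm run bench` output.

best=1000                        # baseline metric from the initial run
log=labrat-results.jsonl
: > "$log"                       # start a fresh results log

for metric in 950 1100 800; do   # stand-ins for successive benchmark runs
  if [ "$metric" -lt "$best" ]; then
    decision=kept                # keep: a commit would happen here
    best=$metric
  else
    decision=discarded           # discard: revert the target files
  fi
  printf '{"metric": %s, "decision": "%s"}\n' "$metric" "$decision" >> "$log"
done

echo "best=$best"
```

A real run would replace the stubbed values with the eval command's output and commit or revert the target files between iterations.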
### The exploration loop
When no metric exists, the agent researches via web search, reads docs, tries different approaches, documents findings, and writes up an analysis. If a concrete metric and eval command are identified during exploration, the agent announces the transition and switches to the experiment loop.
The agent also tracks diminishing returns, combines near-misses, avoids repeating dead ends, and creates checkpoints at notable improvements.
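One of those heuristics, the diminishing-returns check, might be sketched as follows. The 1% threshold and the integer arithmetic are assumptions for illustration, not labrat's documented rule:

```bash
#!/usr/bin/env bash
# Hypothetical diminishing-returns check: given the previous and current
# kept metrics (lower = better), succeed when the relative improvement has
# fallen below a threshold. Illustrative logic only.
diminishing() {
  local prev=$1 curr=$2
  # improvement in parts-per-thousand, to stay in integer arithmetic
  local gain=$(( (prev - curr) * 1000 / prev ))
  [ "$gain" -lt 10 ]   # under 1% improvement counts as diminishing
}
```

An agent could call this on each kept result and switch strategies, or wrap up the session, once it starts returning true.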
## Features
- Adapts automatically — give it a metric and it experiments, give it a question and it explores
- Domain-agnostic — works on any codebase, not just ML
- Cross-agent — Claude Code, Codex CLI, or any Agent Skills tool
- Survives context resets — session state persisted in `.labrat/` files; the agent compacts and continues
- Clean git history — every commit on the research branch is a successful experiment
## Usage Examples
Metric-driven optimization:

```
Optimize my API response time. Target: src/api/handler.ts. Eval: npm run bench. Metric: response_time_ms (minimize).
```

Open-ended research:

```
Research the best approach for real-time sync in our app. Look at CRDTs, OT, and simple polling. Target: src/sync/.
```

Starting open, then narrowing:

```
Investigate why our bundle size grew 40%. Target: webpack.config.js, src/index.ts. I think there's a metric here but I'm not sure what to measure yet.
```
## Session Files

Everything lives in `.labrat/` on the research branch (gitignored on main):

| File | Purpose |
|------|---------|
| `labrat-config.json` | Session config — targets, metric, constraints |
| `labrat-results.jsonl` | Every experiment logged (kept, discarded, or crashed) |
| `labrat-journal.md` | Agent's running notes — strategy, dead ends, next ideas |
| `labrat-run.sh` | Wraps your eval command with timeout and structured output |
| `labrat-checks.sh` | Correctness checks (tests, lint) — optional |
| `labrat-report.md` | Findings report generated at session end |
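As a rough illustration of what a wrapper like `labrat-run.sh` might do (the `LABRAT_METRIC` marker, the 300-second limit, and the function name are assumptions, not labrat's actual interface):

```bash
#!/usr/bin/env bash
# Hypothetical eval wrapper: run the eval command under a timeout and emit
# the metric on one structured, easy-to-parse line. Illustrative only.
labrat_run() {
  local cmd=$1 out metric
  if ! out=$(timeout 300 bash -c "$cmd" 2>&1); then
    echo "LABRAT_METRIC=error"   # eval failed or timed out
    return 1
  fi
  # pull a line like "response_time_ms: 123" out of the eval output
  metric=$(printf '%s\n' "$out" | awk -F': ' '/response_time_ms/ {print $2; exit}')
  echo "LABRAT_METRIC=${metric:-missing}"
}
```

A structured marker like this lets the agent extract the metric reliably even when the eval command is noisy.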
These files are plain text and agent-agnostic — you can start a session in Claude Code and pick it up in Codex CLI.
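For illustration, a `labrat-config.json` for the Quick Start prompt might look like this; the exact field names are assumptions based on the fields listed above, not labrat's documented schema:

```json
{
  "targets": ["src/api/handler.ts"],
  "eval": "npm run bench",
  "metric": { "name": "response_time_ms", "direction": "minimize" },
  "constraints": ["memory under 512MB"]
}
```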
## Commands

| Command | Description |
|---------|-------------|
| `/labrat:start` | Set up and run a research session |
| `/labrat:status` | Check progress (experiments, best result, improvement %) |
| `/labrat:report` | Generate a findings report |
## Installation Options

```sh
npx labrat-agent init                 # default install
npx labrat-agent init claude          # Claude Code
npx labrat-agent init codex           # Codex CLI
npx labrat-agent init codex-project
npx labrat-agent uninstall
```
Or manually: clone this repo, copy each skill folder (`skills/labrat/start/`, `report/`, `status/`) to your skills directory as `labrat-start/`, `labrat-report/`, `labrat-status/`, and copy `CLAUDE.md` (or `AGENTS.md` for Codex) to your project root.
## Inspired By

- karpathy/autoresearch
- pi-autoresearch
## Contributing
Issues and PRs welcome. Please open an issue first for larger changes.
## License

MIT