labrat-agent

Autonomous research & experiment loops for coding agents. Run experiments overnight, wake up to results.

npm · latest version 1.0.6 · 1 maintainer · 2 weekly downloads

labrat

Autonomous research & experiment loops for coding agents.

Point your agent at a problem, go to sleep, wake up to results. Labrat runs an autonomous experiment loop — it modifies code, evaluates the result, keeps what works, throws away what doesn't, and repeats. Works with any measurable optimization target or open-ended research question.

Inspired by karpathy/autoresearch and pi-autoresearch, generalized beyond ML to work on any codebase with any agent that supports the Agent Skills standard (Claude Code, Codex CLI, etc.).

Quick Start

# Claude Code
claude plugin add github:pawanpaudel93/labrat

# or via npm (auto-detects your agent CLI)
npx labrat-agent init

Then tell your agent what to work on:

Optimize my API response time. Target: src/api/handler.ts. Eval: npm run bench. Metric: response_time_ms (minimize). Constraint: memory under 512MB.

The agent creates a labrat/api-perf branch, runs a baseline, then starts iterating autonomously.
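While a session runs, progress can be watched from another terminal. A minimal sketch using standard tools; the log path comes from this README, but the sample log line is fabricated here so the snippet runs standalone:

```shell
# Sketch: watching a session from another terminal.
# The JSON line below is made up for illustration so this runs on its own;
# in a real session the agent writes the log.
mkdir -p .labrat
echo '{"id":1,"status":"kept","metric":112.4}' >> .labrat/labrat-results.jsonl

# Show the most recent experiments as they are logged
tail -n 5 .labrat/labrat-results.jsonl
```

Since every kept experiment is a commit, `git log --oneline` on the research branch gives the same picture from the git side.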

How It Works

Labrat adapts to what you give it. Give it a metric and an eval command, and it runs a tight experiment loop. Give it a research question, and it explores. If a metric emerges during exploration, it transitions to the experiment loop automatically.

The experiment loop

When a metric + eval command are available, the agent runs a keep/discard loop:

1. Pick an experiment idea (informed by what worked/failed before)
2. Modify target files
3. Quick checks (lint, typecheck) — fail fast if broken
4. Run benchmark, extract metric
5. If improved: run correctness checks, check constraints
6. Keep (commit) or discard (revert target files)
7. Log everything to .labrat/labrat-results.jsonl
8. Repeat until interrupted or context limit
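The keep/discard decision in steps 4-6 can be sketched in a few lines of shell. This is illustrative only: the metric values and the 512MB constraint come from the earlier example prompt, and labrat's real loop lives in the agent skill, not in this snippet.

```shell
# Illustrative keep/discard check (values are made up).
baseline_ms=120   # metric from the baseline run
new_ms=95         # metric extracted from this experiment's benchmark
peak_mb=400       # resource usage, checked against the constraint

# Keep only if the metric improved (minimize) AND the constraint holds.
if [ "$new_ms" -lt "$baseline_ms" ] && [ "$peak_mb" -le 512 ]; then
  decision=keep      # agent would then run correctness checks and commit
else
  decision=discard   # agent would revert the target files
fi
echo "decision=$decision"
```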

The exploration loop

When no metric exists, the agent researches via web search, reads docs, tries different approaches, documents findings, and writes up an analysis. If a concrete metric and eval command are identified during exploration, the agent announces the transition and switches to the experiment loop.

The agent also tracks diminishing returns, combines near-misses, avoids repeating dead ends, and creates checkpoints at notable improvements.

Features

  • Adapts automatically — give it a metric and it experiments, give it a question and it explores
  • Domain-agnostic — works on any codebase, not just ML
  • Cross-agent — Claude Code, Codex CLI, or any Agent Skills tool
  • Survives context resets — session state persisted in .labrat/ files, agent compacts and continues
  • Clean git history — every commit on the research branch is a successful experiment

Usage Examples

Metric-driven optimization:

Optimize my API response time. Target: src/api/handler.ts. Eval: npm run bench. Metric: response_time_ms (minimize).

Open-ended research:

Research the best approach for real-time sync in our app. Look at CRDTs, OT, and simple polling. Target: src/sync/.

Starting open, then narrowing:

Investigate why our bundle size grew 40%. Target: webpack.config.js, src/index.ts. I think there's a metric here but I'm not sure what to measure yet.

Session Files

Everything lives in .labrat/ on the research branch (gitignored on main):

File                      What it does
labrat-config.json        Session config — targets, metric, constraints
labrat-results.jsonl      Every experiment logged (kept, discarded, or crashed)
labrat-journal.md         Agent's running notes — strategy, dead ends, next ideas
labrat-run.sh             Wraps your eval command with timeout and structured output
labrat-checks.sh          Correctness checks (tests, lint) — optional
labrat-report.md          Findings report generated at session end

These files are plain text and agent-agnostic — you can start a session in Claude Code and pick it up in Codex CLI.
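Because the results log is plain JSONL, ordinary shell tools can summarize a session. The field names below ("status" with "kept"/"discarded" values) are assumed for illustration and may not match the actual schema; the sample file is written inline so the snippet runs standalone:

```shell
# Count kept experiments in the session log.
# The schema here is assumed, not documented; inspect your own
# .labrat/labrat-results.jsonl for the real field names.
mkdir -p .labrat
cat > .labrat/labrat-results.jsonl <<'EOF'
{"id":1,"status":"kept","metric":118.2}
{"id":2,"status":"discarded","metric":131.0}
{"id":3,"status":"kept","metric":112.4}
EOF

grep -c '"status":"kept"' .labrat/labrat-results.jsonl
```

With jq installed, richer queries (best metric so far, crash count) follow the same pattern.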

Commands

Command           What it does
/labrat:start     Set up and run a research session
/labrat:status    Check progress (experiments, best result, improvement %)
/labrat:report    Generate a findings report

Installation Options

npx labrat-agent init                 # auto-detect agent CLI
npx labrat-agent init claude          # Claude Code
npx labrat-agent init codex           # Codex CLI (user-level)
npx labrat-agent init codex-project   # Codex CLI (project-level)
npx labrat-agent uninstall            # remove

Or manually: clone this repo, copy each skill folder (skills/labrat/start/, report/, status/) to your skills directory as labrat-start/, labrat-report/, labrat-status/, and copy CLAUDE.md (or AGENTS.md for Codex) to your project root.

Inspired By

karpathy/autoresearch and pi-autoresearch.

Contributing

Issues and PRs welcome. Please open an issue first for larger changes.

License

MIT

Keywords

labrat

Package last updated on 18 Mar 2026
