Potential malware was recently detected in this package.

Affected versions:

1.5.5, 1.5.6, 1.5.7, 1.5.8

spec-first

AI coding workflow CLI for spec-driven engineering, spec-driven development, and harness engineering on Claude Code and Codex

Source: npm
Version: 1.5.8
Weekly downloads: 160 (-63.8%)
Maintainers: 1
Spec-First

A workflow CLI that feeds the LLM structured, provenance-backed context at every stage of the AI coding delivery loop, and governs the full path from ideation to compound learning.

Open-source for Claude Code and Codex. Install once, govern the full delivery loop.


Why Spec-First

Most AI coding failures come from degraded LLM decision inputs, not weak models:

| Problem | How spec-first addresses it | Enforcement |
| --- | --- | --- |
| LLM starts from a blank-slate codebase context | graph-bootstrap extracts AST facts and compiles minimal-context with provenance and confidence signals | Code-hard gate at bootstrap / stage0-context runtime |
| Requirements are never made explicit | Brainstorm stage produces a requirements artifact consumed by Plan | SKILL.md contract |
| Plans drift from implementation | Plan artifact is a first-class Work input, and Review Stage 2b cross-checks the Requirements Trace against the diff | SKILL.md contract |
| Reviews are unstructured | 17 reviewer personas (always-on + cross-cutting + stack-specific) plus 2 CE-specific agents, routed by safe_auto / gated_auto / manual / advisory | SKILL.md contract |
| Solved problems are not reused | Compound writes structured learnings to docs/solutions/ with YAML frontmatter for future retrieval | SKILL.md contract |

Suited for:

  • Teams moving from prompt-driven coding to governed AI engineering workflows
  • Claude Code and Codex users who want one repeatable delivery system across both hosts
  • Projects that need explicit specs, structured reviews, and reusable post-task learnings

Not suited for:

  • Teams without Claude Code or Codex
  • Contexts expecting zero-configuration, fully automatic code generation
  • Delivery loops too short to justify multi-stage workflow overhead

Design Philosophy

Light contract · Explicit boundaries · Let the LLM decide.

Spec-First rests on a single conviction: AI coding quality is bounded by the quality of decision inputs the LLM receives, not by the weight of orchestration. Repository governance (see CLAUDE.md / AGENTS.md) explicitly forbids:

  • Hard-coded state machines that replace LLM judgment with multi-state transitions
  • Over-engineered gates that fuse unrelated signals into a single orchestration object
  • Expanding contracts by coupling instead of by clarity

And explicitly prefers:

  • Surfacing provenance, freshness, confidence, fallback_reason, and verification_gaps as independent, composable input facts
  • Raising input quality first; reaching for more automation only when evidence demands it
  • Keeping control-plane boundaries clean — repo profile, diff recommendation, verifier dispatch, gate state, workflow prose, telemetry each answer one question and do not encroach on others

Every other choice in this README is a consequence of that stance.

How It Works

Spec-First upgrades what the LLM receives as decision input — it does not replace LLM judgment with a state machine.

Two Complementary Parts

graph-bootstrap — the foundation

Codebase → AST graph → fact extraction → minimal-context (provenance + confidence)
        → injection-index (stage-aware routing) → workflow input

graph-bootstrap turns a codebase into structured context before AI starts coding. It typically runs once per project, or incrementally when code changes, and its output is consumed by every downstream workflow stage.
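
To make the fact-extraction idea concrete, here is a minimal sketch of turning source text into provenance-tagged facts. It swaps in Python's standard-library ast module for the tree-sitter parsers spec-first actually uses, and the record fields (symbol, kind, provenance, confidence) are hypothetical names chosen for illustration, not the format spec-first emits.

```python
import ast


def extract_facts(source: str, path: str) -> list[dict]:
    """Collect class and function definitions, each tagged with its origin."""
    facts = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            facts.append({
                "symbol": node.name,
                "kind": "class" if isinstance(node, ast.ClassDef) else "function",
                # Hypothetical provenance shape: where the fact was read from.
                "provenance": {"file": path, "line": node.lineno},
                # Parsed directly from source rather than inferred.
                "confidence": "high",
            })
    # ast.walk is breadth-first, so sort by line for a stable, readable order.
    facts.sort(key=lambda fact: fact["provenance"]["line"])
    return facts


SAMPLE = "class Cart:\n    def total(self):\n        return 0\n\ndef main():\n    pass\n"
facts = extract_facts(SAMPLE, "cart.py")
```

Each fact stays traceable to a file and line, which is the property the pipeline above relies on when it later compiles minimal-context.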

Main workflow — the delivery loop

Ideate → Brainstorm → Plan → Work → Review → Compound

This solves "how does a requirement get AI-engineered end-to-end?" Every stage has explicit input artifacts, output artifacts, and a stage-gate contract.

Which bootstrap should I run?

| Entrypoint | When to use | Produces | Stability |
| --- | --- | --- | --- |
| /spec:graph-bootstrap · $spec-graph-bootstrap | You want fact-extracted, graph-informed context (Phase 0–4) | Phase 0–4 facts + injection-index.yaml + minimal-context/*.json | Primary Stage-0 entry |
| /spec:compound · $spec-compound | You want broader knowledge capture and reusable context synthesis | Context synthesis docs and reusable knowledge artifacts | Complementary Stage-0 path |

Stage-0 entrypoints run a Host Readiness Gate at startup. If MCP setup was skipped or the host was not restarted, they stop with explicit guidance rather than degrade silently. If you still see an older bootstrap entrypoint in local runtime assets or stale documentation, migrate to /spec:graph-bootstrap or /spec:compound.
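
A fail-fast readiness gate of this kind can be sketched as follows. The marker path mirrors the host-setup.json location this README lists for each host; everything else is an assumption, in particular that the gate only checks for the file's presence (the real gate may inspect its contents or detect a missing restart), and the function name and messages are hypothetical.

```python
from pathlib import Path


def check_host_ready(home: Path, host: str) -> tuple[bool, str]:
    """Return (ready, message) for a host such as "claude" or "codex"."""
    marker = home / f".{host}" / "spec-first" / "host-setup.json"
    if marker.is_file():
        return True, f"{host}: host setup marker found"
    # Stop with explicit guidance instead of continuing in a degraded state.
    return False, f"{host}: run MCP setup, restart the host, then retry"
```

A caller would typically print the message and exit non-zero when the first element is False, for example with `check_host_ready(Path.home(), "claude")`.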

Stage-0 Context Quality Signals

Every context artifact carries machine-readable quality metadata. Downstream SKILL.md contracts read these signals and adapt:

| Field | Values | Meaning |
| --- | --- | --- |
| data_quality | fact-backed · partial · empty · (absent = legacy manifest, treated as backward-compatible) | How much of the context comes from real code analysis |
| provenance | fact-inventory · empty-fallback | Whether content was compiled from extracted facts or from a skeletal default |
| confidence | high · medium · low | LLM-consumable trust signal |
| fallback_reason | empty_fact_inventory (root cause) · minimal_context_missing (secondary) · workspace_child_partial_degraded · (other runtime-specific values) | Explicit degradation cause when the context is not fact-backed |

When data_quality: empty, the evaluator downgrades to L1 and sets fallback_reason. The LLM gets a clear signal: this context is skeletal, not a real analysis.

Evaluator levels

L0 — fact-backed context with real AST-derived signals; full-strength Stage-0 input.
L1 — skeletal or degraded context; the evaluator has set fallback_reason, and downstream SKILLs should treat these signals as advisory.
L2 — fixed minimal fallback; used only when injection-index.yaml cannot be resolved, falling back to ambient defaults (e.g. 00-summary.md, pitfalls/index.md).

Downstream skills are allowed to proceed at any level. The evaluator exposes the level so the LLM can adjust its own confidence, not to block execution.
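
As a rough sketch of how the three levels relate to the quality signals, the function below maps a metadata dictionary to L0, L1, or L2. This is not the evaluator's code: the function name and the index_resolved parameter are hypothetical, and the sketch assumes that anything not explicitly fact-backed (including partial or legacy metadata) lands in L1, which the real evaluator may handle differently.

```python
def evaluator_level(meta: dict, index_resolved: bool = True) -> str:
    """Map context quality metadata to an evaluator level (illustrative only)."""
    if not index_resolved:
        # injection-index.yaml could not be resolved: fixed minimal fallback.
        return "L2"
    if meta.get("data_quality") == "fact-backed" and not meta.get("fallback_reason"):
        return "L0"
    # Assumption: every other combination counts as skeletal or degraded.
    return "L1"
```

The returned level is informational. Consistent with the paragraph above, a consumer would use it to temper its own confidence rather than to abort.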

Enforcement Model

| Layer | Scope | Type |
| --- | --- | --- |
| CLI (doctor / init / clean / stage0-context) | Asset sync, state tracking, manifest validation, Stage-0 context emission | Code-hard, enforced through shell exit code |
| Host Readiness Gate + Stage-0 evaluator L0/L1/L2 | Enforced when graph-bootstrap / stage0-context runs, emitting fallback_reason and degraded level | Runtime signal, emitted by code, consumed by LLM |
| Workflow stages (SKILL.md) | Stage contracts, artifact naming, review classes, requirements trace | SKILL contract, followed by LLM |
| Context signals (provenance / confidence / fallback_reason) | In-artifact metadata | SKILL contract, consumed by LLM |

Supported Languages

Powered by 15 vendored / pinned tree-sitter parsers. All 15 are installed by default — no opt-in required.

| Language | Parser | Notes |
| --- | --- | --- |
| C | tree-sitter-c | |
| C++ | tree-sitter-cpp | |
| C# | tree-sitter-c-sharp | |
| Go | tree-sitter-go | |
| Java | tree-sitter-java | |
| JavaScript | tree-sitter-javascript | CommonJS require() is resolved into imports_from edges |
| Kotlin | tree-sitter-kotlin | |
| Objective-C | tree-sitter-objc (vendored fork) | .m / .mm / heuristic .h routing; extracts @interface/@implementation/@protocol |
| PHP | tree-sitter-php | |
| Python | tree-sitter-python | |
| Ruby | tree-sitter-ruby | |
| Rust | tree-sitter-rust | |
| Scala | tree-sitter-scala | |
| Swift | tree-sitter-swift (vendored fork) | Removes the upstream tree-sitter-cli install-time dependency |
| TypeScript | tree-sitter-typescript | Covers .ts / .tsx / .d.ts |

iOS repositories are auto-detected (Podfile.lock / .xcodeproj) and Pod exclude paths are applied automatically.
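
The JavaScript note in the table mentions turning CommonJS require() calls into imports_from edges. The sketch below shows the general shape of such an edge list using a regular expression, which is a deliberate simplification: spec-first works from a tree-sitter AST, and this version records the raw module specifier without resolving it to a file. The function name and the tuple layout are hypothetical.

```python
import re

# Matches require("x") or require('x') with a literal string argument only.
REQUIRE_RE = re.compile(r"""require\(\s*['"]([^'"]+)['"]\s*\)""")


def imports_from_edges(path: str, source: str) -> list[tuple[str, str, str]]:
    """Return (importer, "imports_from", specifier) triples in source order."""
    return [(path, "imports_from", target) for target in REQUIRE_RE.findall(source)]


JS_SAMPLE = 'const fs = require("fs");\nconst util = require(\'./util\');\n'
edges = imports_from_edges("app.js", JS_SAMPLE)
```

A regex like this misses dynamic requires and would also match text inside comments or strings, which is one reason an AST-based extractor is the more reliable choice.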

What You Get

| Capability | What it solves |
| --- | --- |
| CLI control plane (doctor / init / clean / stage0-context) | Repeatable install, health checks, cleanup, and Stage-0 context emission; managed assets always stay traceable |
| CRG graph engine (spec-first crg *) | Code Review Graph: an embedded Node.js runtime over SQLite + FTS5, covering AST → symbols → resolved edges → PageRank flows → community detection → surprising-connections → god-nodes → review-context |
| graph-bootstrap context engine | LLM gets fact-extracted, confidence-annotated project context instead of a raw codebase |
| Full workflow layer | Ideate → Brainstorm → Plan → Work → Review → Compound, every stage with an explicit artifact contract |
| 17-persona Review stage (+ 2 CE agents) | Produces structured findings routed by safe_auto / gated_auto / manual / advisory, not a single-pass scan |
| Compound / knowledge capture | Solved problems are written to docs/solutions/ for future workflow retrieval |
| Dual platform support | One methodology across Claude Code (/spec:*) and Codex ($spec-*). Claude uses a SessionStart hook + bare-agent rewrite; Codex uses .agents/skills/ discovery + explicit .codex/agents/... path rewrite |
| Capability layer | Bundled source assets ship with 47 skills, 57 agents, and 4 agent support files. Runtime delivery is host-filtered by governance: the current bundle installs 12 commands + 35 skills on Claude, and 34 skills on Codex, with 57 agents + 4 support files on both hosts |
| Runtime governance | Managed assets are tracked in state.json so they can be synced, refreshed, recovered, and cleaned safely |
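
For readers unfamiliar with the graph terms in the CRG row, the sketch below illustrates one of them, god-nodes, which the CLI reference describes as high-fan-in hub detection. It counts incoming edges in a plain edge list. The threshold parameter and function names are hypothetical, and the real engine works over SQLite rather than in-memory tuples.

```python
from collections import Counter


def fan_in(edges: list[tuple[str, str]]) -> Counter:
    """Count how many edges point at each node; edges are (source, target)."""
    return Counter(target for _source, target in edges)


def god_nodes(edges: list[tuple[str, str]], min_fan_in: int) -> list[str]:
    """Return nodes whose incoming edge count reaches the given threshold."""
    return sorted(node for node, count in fan_in(edges).items() if count >= min_fan_in)


EDGES = [("a", "db"), ("b", "db"), ("c", "db"), ("a", "util"), ("b", "util"), ("c", "a")]
hubs = god_nodes(EDGES, min_fan_in=3)
```

Here only "db" is referenced from three places, so it is the single hub at that threshold; lowering the threshold to 2 would also include "util".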

Core Workflow


Primary stages

| Stage | Claude Code | Codex | Output Artifact | Enforcement |
| --- | --- | --- | --- | --- |
| Host Setup | /spec:mcp-setup → restart | $spec-mcp-setup → restart | Host-specific marker: ~/.claude/spec-first/host-setup.json or ~/.codex/spec-first/host-setup.json | Code-hard (bootstrap gate checks this) |
| Stage-0 graph bootstrap | /spec:graph-bootstrap | $spec-graph-bootstrap | Phase 0–4 facts + injection-index.yaml + minimal-context/*.json | Code-hard gate + SKILL.md content |
| Ideate | /spec:ideate | $spec-ideate | docs/ideation/*.md | SKILL.md contract |
| Brainstorm | /spec:brainstorm | $spec-brainstorm | docs/brainstorms/*.md | SKILL.md contract |
| Plan | /spec:plan | $spec-plan | docs/plans/*.md | SKILL.md contract |
| Work | /spec:work | $spec-work | code + tests | SKILL.md contract |
| Review | /spec:review | $spec-review | structured review report | SKILL.md contract (17 reviewer personas + 2 CE agents) |
| Compound | /spec:compound | $spec-compound | docs/solutions/**/*.md | SKILL.md contract |

Auxiliary stages

| Stage | Claude Code | Codex | Purpose |
| --- | --- | --- | --- |
| Debug | /spec:debug | $spec-debug | Reproduce and diagnose an existing bug or failure |
| Update | /spec:update | $spec-update | Refresh runtime assets after spec-first upgrades |
| Sessions | /spec:sessions | $spec-sessions | Search and summarize prior coding agent sessions |
| Setup | /spec:setup | $spec-setup | Unified host / environment setup entrypoint |

These /spec:* and $spec-* surfaces are generated runtime workflow entrypoints, not root spec-first subcommands. The root CLI surface is documented below under CLI Commands.

Quick Start

Prerequisites

  • Node.js >=20
  • Git repository: spec-first init reads git config user.name and graph-bootstrap depends on git ls-files, so non-Git directories are not supported
  • At least one of Claude Code or Codex
  • Disk: roughly 60–120 MB of node_modules (15 tree-sitter parsers plus the better-sqlite3 native build)

1. Install

npm install -g spec-first
spec-first -v

postinstall note: The installer runs bin/postinstall.js, which prints an install confirmation card and then trims native tree-sitter prebuilds for platforms other than yours. This step only deletes files inside the installed node_modules/ tree; it never touches your project files.

2. Check the environment

spec-first doctor
spec-first doctor --claude   # Claude-only scope
spec-first doctor --codex    # Codex-only scope

If doctor reports legacy managed state, run init again. This is the only supported upgrade path — it performs a managed hard reset before rebuilding the runtime. doctor --json also exposes workflow verification evidence as structured facts: schema validity, freshness, fallback_reason, and evidence_age_summary (oldest_* / newest_* + max_age_ms) so downstream workflows do not need to infer evidence staleness heuristically.

3. Initialize a project

spec-first init --claude
# or
spec-first init --codex

To set developer identity explicitly:

spec-first init --claude -u <name> --lang <zh|en>
spec-first init --codex -u <name> --lang <zh|en>

Identity resolution order:

  • -u flag value (when provided)
  • ~/.spec-first/.developer (global identity)
  • git config user.name fallback

Language resolution order:

  • --lang flag value (when provided)
  • Existing project .developer profile
  • Default zh
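
Both resolution orders follow the same "first value that is set wins" pattern, sketched below. The function and parameter names are hypothetical, and the sketch takes already-loaded values as arguments rather than reading ~/.spec-first/.developer or calling git itself.

```python
def first_set(*candidates):
    """Return the first candidate that is neither None nor empty."""
    for value in candidates:
        if value:
            return value
    return None


def resolve_identity(flag_user, global_developer, git_user_name):
    # Order from the list above: -u flag, global identity, git config user.name.
    return first_set(flag_user, global_developer, git_user_name)


def resolve_language(flag_lang, project_profile_lang):
    # Order from the list above: --lang flag, existing project profile, default zh.
    return first_set(flag_lang, project_profile_lang, "zh")
```

For example, `resolve_language(None, "en")` yields "en" because an existing project profile outranks the default, while `resolve_language(None, None)` falls through to "zh".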

What init writes

init is not a read-only operation. It mounts spec-first into your project by writing the following:

| Target | What gets written | Removable by clean? |
| --- | --- | --- |
| CLAUDE.md / AGENTS.md | <!-- spec-first:lang:* --> language policy block (idempotent marker block) | ❌ Manual removal (clean does not strip the language policy block) |
| CLAUDE.md / AGENTS.md | using-spec-first instruction bootstrap block | ✅ Removed by clean |
| CLAUDE.md / AGENTS.md | <!-- spec-first:coding-guidelines:* --> coding execution guidelines block | ✅ Removed by clean |
| .claude/settings.json | Managed SessionStart matcher entry (Claude only) | ✅ Removed by clean |
| .claude/hooks/session-start | Managed SessionStart hook script (Claude only) | ✅ Removed by clean |
| .claude/commands/spec/** · .claude/skills/** · .claude/agents/** (or Codex equivalents) | Managed runtime assets | ✅ Removed by clean |
| .claude/spec-first/.developer / .codex/spec-first/.developer | Host-specific project developer profile | ✅ Removed by clean |
| .claude/spec-first/state.json / .codex/spec-first/state.json | Host-specific managed asset tracking state | ✅ Removed by clean |
| CHANGELOG.md | Bootstrapped only when missing, with the managed format header and an initial init entry | ❌ User-owned after creation |

How to roll back

spec-first clean --claude   # or --codex

init does not overwrite an existing CLAUDE.md / AGENTS.md. On first install, spec-first appends its managed instruction blocks as a footer after any existing user content; on re-init, it only replaces the marker-delimited managed blocks it owns.
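
The append-then-replace behaviour described above is a common idempotent marker-block pattern. Here is a minimal sketch of it, assuming a single begin/end marker pair; the marker strings are made up for illustration and are not the markers spec-first writes.

```python
import re

# Hypothetical markers for illustration; not the markers spec-first writes.
BEGIN = "<!-- demo:managed:begin -->"
END = "<!-- demo:managed:end -->"
BLOCK_RE = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


def upsert_managed_block(text: str, body: str) -> str:
    """Append the managed block on first use, replace it in place afterwards."""
    block = f"{BEGIN}\n{body}\n{END}"
    if BLOCK_RE.search(text):
        # Re-init: swap only the marker-delimited block, leave user text alone.
        return BLOCK_RE.sub(lambda _match: block, text, count=1)
    if not text.strip():
        return block + "\n"
    # First install: add the block as a footer after existing user content.
    return text.rstrip("\n") + "\n\n" + block + "\n"
```

Because the replacement is keyed on the markers, running the function twice with the same body leaves the file unchanged, and user content outside the markers is never rewritten.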

clean removes everything marked removable in the table above, then prints which platform's managed assets were removed. Custom assets outside the managed set are left untouched. The language policy block must still be removed manually — search for <!-- spec-first:lang: in CLAUDE.md / AGENTS.md. Both init --dry-run and clean --dry-run preview file-level operations derived from the same managed operation plans used by real apply paths, which keeps preview/apply drift narrow and testable. Current runtime delivery is host-specific by governance: Claude writes 12 command files, 35 skill directories, 57 agent files, and 4 agent support files; Codex writes 34 skill directories plus the same 57 agent files and 4 support files, with no command directory.
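
Deriving the preview and the real run from one plan, as the dry-run description above states, can be sketched like this. The plan shape (a list of action, path, content tuples) and the function names are assumptions for illustration, not spec-first's internal operation-plan format.

```python
from pathlib import Path


def build_plan(root: Path, assets: dict[str, str]) -> list[tuple[str, Path, str]]:
    """Turn a mapping of relative path -> content into ordered write operations."""
    return [("write", root / rel, content) for rel, content in sorted(assets.items())]


def run_plan(plan: list[tuple[str, Path, str]], dry_run: bool) -> list[str]:
    """Log every operation; only touch the filesystem when dry_run is False."""
    log = []
    for action, target, content in plan:
        log.append(f"{action} {target}")
        if dry_run:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
    return log
```

Since both modes walk the same list, the preview log and the apply log are identical by construction, which is what keeps preview/apply drift small and easy to test.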

Example output

$ spec-first init --claude

🪝 Installed Claude SessionStart matcher in .claude/settings.json
📦 Generated 12 command file(s) in .claude/commands/spec
🧩 Generated 35 skill directory(ies) in .claude/skills
🤖 Generated 57 agent file(s) in .claude/agents
🧰 Generated 4 agent support file(s) in .claude/agents
🪪 Wrote project developer profile:
  📍 path: .claude/spec-first/.developer
  👤 name: yourname
  🈯 lang: zh
  ⏱ initialized_at: <ISO-8601 timestamp>
  🔖 version: <installed spec-first version>

🔁 Restart Claude Code after generation so it can pick up the new /spec:* commands.

Counts and version reflect the version actually installed at run time. If CHANGELOG.md did not exist yet, init also prints 📝 Bootstrapped CHANGELOG.md. The install log is emitted in English regardless of --lang; the --lang setting governs future Claude / Codex response language, not the installer's own output. Codex output differs by design: it does not generate .claude/commands/spec, and it restarts into $spec-* skill entrypoints instead.

4. First run

| Step | Claude Code | Codex |
| --- | --- | --- |
| Install MCP tools | /spec:mcp-setup | $spec-mcp-setup |
| Restart host | restart Claude Code | restart Codex |
| Build context | /spec:graph-bootstrap or /spec:compound | $spec-graph-bootstrap or $spec-compound |
| Start the workflow | /spec:ideate → /spec:brainstorm → /spec:plan → /spec:work → /spec:review → /spec:compound | $spec-ideate → … → $spec-compound |

graph-bootstrap runs a Host Readiness Gate at startup. If MCP setup was skipped or the host was not restarted, it stops with explicit guidance rather than degrade silently.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Entry Layer — spec-first CLI                                │
│  doctor / init / clean / stage0-context / crg <subcommand>   │
│  Enforcement: code-hard (asset sync, state, manifest,        │
│               Stage-0 emission, CRG SQLite pipeline)         │
├──────────────────────────────────────────────────────────────┤
│  Context Layer — graph-bootstrap / CRG module                │
│  AST fact extraction → artifact-manifest (data_quality)      │
│  → minimal-context (provenance + confidence + fallback)      │
│  → injection-index (stage-aware routing)                     │
│  Enforcement: code-hard gate (L0/L1/L2) + SKILL.md content   │
├──────────────────────────────────────────────────────────────┤
│  Workflow Layer — skills                                     │
│  Ideate / Brainstorm / Plan / Work / Review / Compound       │
│  + Debug / Update / Sessions / Setup auxiliaries             │
│  Stage contracts, artifact conventions, review classes       │
│  Enforcement: SKILL.md contracts (LLM-followed)              │
├──────────────────────────────────────────────────────────────┤
│  Capability Layer — agents (6 categories)                    │
│  review/ (17 reviewer personas + CE agents)                  │
│  document-review/ (requirements / plan persona review)       │
│  research/ (session / doc / Feishu / web context readers)    │
│  design/ (UI / design-lens agents)                           │
│  workflow/ (bug-reproduction / lint / pr-comment-resolver)   │
│  docs/ (documentation / onboarding support)                  │
│  Enforcement: convention (LLM-dispatched)                    │
└──────────────────────────────────────────────────────────────┘

Runtime assets under .claude/, .codex/, or .agents/ are generated outputs, not editable source. skills/, agents/, templates/, and docs/ are the source of truth.

CLI Commands

Managed-asset commands

| Command | Purpose | Notes |
| --- | --- | --- |
| spec-first doctor | Environment check | Verifies platform state, plugin manifest, and managed assets. --claude / --codex scopes to one platform. Reports legacy managed state when init is needed, and --json includes evidence schema/freshness plus evidence_age_summary. |
| spec-first init | Initialize the runtime | Syncs commands, skills, agents, runtime hooks, and developer metadata through managed operation plans. Also the only supported legacy upgrade entrypoint; performs a managed hard reset. See What init writes above. |
| spec-first clean | Remove managed assets | Removes the given platform's spec-first managed assets through the same operation-plan boundary used by --dry-run; does not migrate legacy state and does not strip the language policy marker block. |
| spec-first stage0-context | Emit Stage-0 runtime context | Called by SKILLs such as spec-plan / spec-work / spec-review at stage start. Accepts --stage (plan, work, or review), --workflow <skill-name>, and --format json. |
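
If you want to call stage0-context from a script, a small helper that assembles the argument list might look like the sketch below. It uses only the flags listed in the table; whether all three must be passed together is an assumption, the helper name is hypothetical, and the JSON output is not parsed because its shape is not documented here.

```python
VALID_STAGES = ("plan", "work", "review")


def stage0_context_argv(stage: str, workflow: str) -> list[str]:
    """Build the argument list for a stage0-context call with JSON output."""
    if stage not in VALID_STAGES:
        raise ValueError(f"unsupported stage: {stage!r}")
    return [
        "spec-first", "stage0-context",
        "--stage", stage,
        "--workflow", workflow,
        "--format", "json",
    ]
```

The resulting list can be handed to `subprocess.run(argv, capture_output=True, text=True)` on a machine where spec-first is installed; passing a list rather than a shell string avoids quoting problems.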

CRG graph commands (spec-first crg <subcommand>)

An embedded Code Review Graph runtime over SQLite + FTS5.

spec-first crg --help
spec-first crg build --repo .
spec-first crg review-context --repo . --changed <ref>

| Subcommand | Purpose |
| --- | --- |
| build | Build or incrementally refresh the graph DB from a repo |
| stats | Report node / edge / community counts and unresolved edges |
| context | Export a context bundle for a symbol or file |
| query | Eight structured lookups: callers_of / callees_of / importers_of / importees_of / dependents_of / dependencies_of / tests_for / similar_to |
| impact | Impact-of-change analysis for a file or symbol |
| large-functions | Find functions above a size threshold |
| search | FTS5 full-text search across symbols / files |
| flows | PageRank + BFS flow detection |
| flow / affected-flows | Inspect a single flow, or flows affected by a diff |
| communities / community | 3-pass community detection, plus single-community inspection |
| architecture | High-level architecture summary |
| surprising-connections | Cross-community / peripheral-to-hub surprise detector |
| god-nodes | High-fan-in hub detection |
| detect-changes | SHA-256 incremental change detection |
| review-context | Compose a review context bundle from a diff |
| postprocess | Recompute communities, flows, graph analysis, and FTS after a build or incremental refresh |

All subcommands accept --repo=<path>. The full list is whatever spec-first crg --help prints for the installed version.
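
The detect-changes row refers to SHA-256 incremental change detection. The general technique is to store a content hash per file and compare it on the next run, as in the sketch below. This illustrates the idea only and is not CRG's implementation; the function names and the added/modified/removed result shape are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def detect_changes(previous: dict[str, str], files: dict[str, bytes]) -> dict[str, list[str]]:
    """Compare stored hashes (path -> digest) against current file contents."""
    current = {path: sha256_of(data) for path, data in files.items()}
    return {
        "added": sorted(p for p in current if p not in previous),
        "modified": sorted(p for p in current if p in previous and current[p] != previous[p]),
        "removed": sorted(p for p in previous if p not in current),
    }


before = {"a.py": sha256_of(b"print(1)\n"), "b.py": sha256_of(b"x = 1\n")}
after = {"a.py": b"print(2)\n", "c.py": b"y = 2\n"}
changes = detect_changes(before, after)
```

Only files listed under added or modified would need re-parsing on an incremental refresh, which is what makes hashing cheaper than rebuilding the whole graph.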

Documentation

Detailed manuals and implementation docs are currently Chinese-first. Until English translations catch up, English readers can use DeepWiki or Ask ChatGPT as a supplementary entrypoint.

| Document | Language | Description |
| --- | --- | --- |
| Chinese README | zh | Full Chinese README |
| User Manual | zh | Complete user manual index |
| Quick Start | zh | First-time setup walkthrough |
| Core Concepts | zh | Architecture and terminology |
| Full Example | zh | End-to-end delivery walkthrough |
| FAQ | zh | Troubleshooting and common issues |
| Best Practices | zh | Team usage patterns |
| Chinese Architecture Overview | zh | System design for contributors |
| Chinese Development Guide | zh | Contributor standards |
| Chinese Testing Plan | zh | Verification and test strategy |
| CHANGELOG | en / zh mixed | Canonical version history (machine-readable) |
| Chinese Release Notes | zh | Narrative release notes |

Local Development

git clone https://github.com/sunrain520/spec-first.git
cd spec-first
npm install --legacy-peer-deps
npm test

--legacy-peer-deps is required because the vendored tree-sitter forks and jest's peer-dependency resolution conflict under stricter resolvers. Omitting it typically fails the first jest run.

Verification scripts

npm run test:unit           # shell unit tests + jest unit suite (tests/unit/*)
npm run test:smoke          # install-local + CLI smoke
npm run test:integration    # verification-gate jest + e2e shell
npm run test:e2e:crg        # CRG full-command + SQLite audit
npm run test:jest           # jest only
npm run test:crg:gate       # CRG regression gate (benchmarks/regression/*)
npm run test:ai-dev:gate    # AI Dev Quality Gate (light contract check)
npm pack                    # release tarball dry run

npm test itself runs test:unit → test:smoke → test:integration → test:e2e:crg in that order.

Contributing

Issues and pull requests are welcome.

To report a bug, open an Issue with reproduction steps, environment details, and expected behavior.

To contribute code:

  • Fork the repository and create a feature branch from master.
  • Treat master as the only branch that accepts direct updates; main is an automatically synced mirror branch and should not receive direct development or commits.
  • Read AGENTS.md for repository workflow conventions.
  • Run npm install --legacy-peer-deps, then npm test.
  • Open a PR with the change goal and verification details.
  • Every code / doc change must add a line to CHANGELOG.md following the format defined at the top of that file.

Recommended reading before contributing: AGENTS.md · User Manual · CHANGELOG

License

MIT © sunrain520

Keywords

spec-first


Package last updated on 21 Apr 2026
