ai-engineering-workflow

Agent-neutral AI engineering team runtime with global memory, workflow gates, and traceable change history.

Source: npm · Version: 0.1.0 (latest) · Weekly downloads: 18 · Maintainers: 1

AI Engineering Workflow

Agent-neutral workflow runtime for turning AI coding agents into a traceable virtual engineering team.


Installation · Quickstart · MCP Tools · Architecture · 中文

Works alongside Codex, Claude Code, Cursor, Gemini CLI, and other coding agents.

AI Engineering Workflow is not another coding model. It is the engineering system around coding models: roles, gates, memory, questions, trace logs, evidence, ChangeSets, and audit bundles.

Why This Exists

Modern coding agents can implement quickly, but they often miss the habits that make software engineering reliable:

  • clarify product intent only when the ambiguity matters
  • separate product, architecture, delivery, implementation, QA, security, review, and release responsibilities
  • keep changes small, reversible, and tied to requirements
  • record what was changed, by which agent, from which task packet, with which tests
  • learn reusable lessons across projects instead of only inside one repository

This project gives agents a workflow kernel so they can act less like a single autocomplete session and more like a disciplined engineering team.

What It Provides

  • MCP server as the primary integration surface
  • automatic workflow advancement from a user-provided product goal
  • virtual team roles with role-specific prompts and artifact contracts
  • global engineering memory across projects
  • project-local trace ledger, decision log, evidence log, ChangeSets, and audit bundles
  • question protocol that explores first and asks only for high-impact unknowns
  • task packets for external execution agents
  • progress feedback fields so Codex, Claude Code, or another harness can tell the user who is active and what phase is running
  • prompt packs for agents that do not support MCP yet
  • lightweight CLI for debugging, smoke tests, and audit export

What It Is Not

  • It is not a replacement for Codex, Claude Code, Cursor, or Gemini CLI.
  • It does not own your source control or silently publish changes.
  • It does not invent the product goal. The user still gives the product objective.
  • It does not ask a fixed upfront questionnaire. It discovers ambiguity while working and pauses only for important decisions.
  • It does not treat generated logs as magic truth. Gates depend on recorded evidence.

Status

The project is early alpha. The runtime is usable as a local MCP server and has executable tests for the current workflow kernel, but the external agent adapters are still task-packet based rather than fully autonomous process supervisors.

Use it first on non-critical repositories or small product slices.

Documentation

| Link | Purpose |
| --- | --- |
| Repository Structure | Understand the source tree and where to add new work. |
| Architecture | Understand the MCP server, runtime, memory store, trace ledger, and adapter boundary. |
| Workflow | Understand phases, gates, stopping points, and progress feedback. |
| MCP Tools | Understand public tools and recommended call patterns. |
| Roles | Understand the virtual engineering team and role handoffs. |
| Data And Traceability | Understand where logs are saved and how audit trails are built. |
| Publishing | Prepare GitHub and npm releases. |
| Testing Matrix | Map public capabilities to executable tests. |

Repository Layout

.
├── bin/                  # CLI entrypoint and MCP server launcher
├── src/                  # Runtime source code
│   ├── server.mjs        # Stdio JSON-RPC MCP server
│   └── core/             # Workflow, memory, trace, context, and project modules
├── schemas/              # JSON schemas for portable records
├── prompt-pack/          # Prompt-only adapter guidance for non-MCP agents
├── tests/                # Node test suite
├── docs/                 # Long-form documentation
├── examples/             # Copyable MCP configs and request examples
└── .github/              # CI, issue templates, and pull request template

Installation

From npm

After the package is published:

npm install -g ai-engineering-workflow
ai-engineering server

Or run it without a global install:

npx -y ai-engineering-workflow server

From source

git clone https://github.com/PercivalLin/ai-engineering-workflow.git
cd ai-engineering-workflow
npm run verify
node ./bin/ai-engineering.mjs server

The package requires Node.js 20 or newer.

MCP Configuration

Use the MCP server from npm:

{
  "mcpServers": {
    "ai-engineering-workflow": {
      "command": "npx",
      "args": ["-y", "ai-engineering-workflow", "server"],
      "env": {
        "AI_ENGINEERING_HOME": "/Users/you/.ai-engineering"
      }
    }
  }
}

Use the MCP server from a local checkout:

{
  "mcpServers": {
    "ai-engineering-workflow": {
      "command": "node",
      "args": ["/absolute/path/to/Vibe-Engineering/bin/ai-engineering.mjs", "server"],
      "env": {
        "AI_ENGINEERING_HOME": "/Users/you/.ai-engineering"
      }
    }
  }
}

AI_ENGINEERING_HOME is optional. If omitted, global memory is stored at ~/.ai-engineering.
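
The fallback rule above can be sketched as a small function. This is only an illustration: `resolveMemoryHome` is a hypothetical helper, not part of the package, and treating an empty variable the same as an unset one is an assumption.

```javascript
// Hypothetical helper (not from the package): choose the global memory directory.
// `env` is an environment-like object; `homeDir` is the user's home directory.
function resolveMemoryHome(env, homeDir) {
  const configured = env.AI_ENGINEERING_HOME;
  // Assumption: an empty value is treated like an unset variable.
  if (typeof configured === "string" && configured.trim() !== "") {
    return configured;
  }
  // Documented default: ~/.ai-engineering
  return `${homeDir}/.ai-engineering`;
}

console.log(resolveMemoryHome({ AI_ENGINEERING_HOME: "/srv/ai-memory" }, "/Users/you"));
console.log(resolveMemoryHome({}, "/Users/you"));
```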

First Run

The normal entrypoint is advance_workflow.

Give the agent a product goal, then let it call the MCP tool against the target repository:

{
  "project_root": "/absolute/path/to/target-product",
  "product_goal": "Build a traceable task manager where users can create tasks, assign owners, record status changes, and export an audit trail.",
  "adapter": "codex",
  "risk_level": "medium"
}

The runtime will:

  • register the user-provided product goal
  • scan the target repository
  • retrieve relevant global engineering memory
  • ask only if a discovered ambiguity affects product, architecture, cost, compliance, data, release, or acceptance criteria
  • generate requirements, architecture notes, and backlog artifacts
  • dispatch the next role task packet
  • require evidence before gates pass
  • record trace events and export an audit bundle
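
For harness authors, the call above would travel as an MCP `tools/call` request over JSON-RPC 2.0. The sketch below only builds the message. `buildToolCall` is a hypothetical helper, the argument values are the placeholders from this README, and a real client would first complete the MCP `initialize` handshake.

```javascript
// Hypothetical helper: wrap tool arguments in an MCP `tools/call` request.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "advance_workflow", {
  project_root: "/absolute/path/to/target-product",
  product_goal: "Build a traceable task manager...",
  adapter: "codex",
  risk_level: "medium",
});

// The stdio transport carries one JSON message per line.
const wireMessage = JSON.stringify(request) + "\n";
console.log(wireMessage);
```

In practice the agent harness builds this message for you; the sketch is mainly useful when debugging the server by hand.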

For CLI debugging:

ai-engineering advance \
  --project /absolute/path/to/target-product \
  --goal "Build a traceable task manager..." \
  --adapter codex

Agent Progress Feedback

MCP responses include progress fields that an agent harness can surface directly to the user:

  • current_role: active virtual team role, such as developer or architect
  • current_phase: workflow phase, such as requirements, architecture, or build_loop
  • progress_message: concise human-facing status
  • agent_feedback_prompt: instruction telling the execution agent how to explain the current status before continuing

Example:

{
  "current_role": "developer",
  "current_phase": "build_loop",
  "status": "external_agent_required",
  "progress_message": "[AI Engineering Workflow] Developer is active. Phase: build_loop. Status: external_agent_required.",
  "agent_feedback_prompt": "You are currently acting as: Developer.\nWorkflow phase: build_loop.\nWorkflow status: external_agent_required.\nTell the user this status briefly before continuing."
}

These fields are returned by advance_workflow, get_role_action, dispatch_agent_task, ask_user_decision, record_user_decision, and run_gate.
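
A harness might surface these fields along the following lines. This is a minimal sketch: `renderProgress` is a hypothetical function, and the fallback wording used when `progress_message` is absent is an assumption.

```javascript
// Hypothetical harness helper: choose the status line to show the user.
function renderProgress(response) {
  if (typeof response.progress_message === "string" && response.progress_message !== "") {
    return response.progress_message;
  }
  // Assumed fallback, built from the individual fields.
  const role = response.current_role ?? "unknown role";
  const phase = response.current_phase ?? "unknown phase";
  return `${role} is active. Phase: ${phase}.`;
}

console.log(renderProgress({
  current_role: "developer",
  current_phase: "build_loop",
  progress_message: "[AI Engineering Workflow] Developer is active. Phase: build_loop. Status: external_agent_required.",
}));
console.log(renderProgress({ current_role: "architect", current_phase: "architecture" }));
```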

Logs And Data

Project logs are stored inside the target product repository. They are written to this tool's repository only when it is itself the target.

Project-local runtime state:

<target-project>/.ai-engineering/project.yaml
<target-project>/.ai-engineering/workflow-state.json
<target-project>/.ai-engineering/trace-ledger.jsonl
<target-project>/.ai-engineering/decision-log.jsonl
<target-project>/.ai-engineering/evidence/
<target-project>/.ai-engineering/changesets/
<target-project>/.ai-engineering/context/
<target-project>/.ai-engineering/context/task-packets/
<target-project>/.ai-engineering/audit-bundles/
<target-project>/docs/ai-artifacts/

Global memory:

~/.ai-engineering/memory/principles/
~/.ai-engineering/memory/playbooks/
~/.ai-engineering/memory/anti-patterns/
~/.ai-engineering/memory/cases/
~/.ai-engineering/memory/rules/
~/.ai-engineering/memory/role-checklists/
~/.ai-engineering/memory/stack-knowledge/
~/.ai-engineering/memory/organization-preferences/
~/.ai-engineering/agents/
~/.ai-engineering/sandbox-rules/

Useful inspection commands:

tail -n 20 <target-project>/.ai-engineering/trace-ledger.jsonl
tail -n 20 <target-project>/.ai-engineering/decision-log.jsonl
find <target-project>/.ai-engineering -maxdepth 2 -type f | sort
find ~/.ai-engineering -maxdepth 3 -type f | sort
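
The `.jsonl` ledgers hold one JSON record per line, so they can also be read programmatically. The sketch below parses such text and reports malformed lines instead of failing. `parseJsonl` is a hypothetical helper, and the `event` and `phase` fields in the sample are invented for illustration; they are not the package's record schema.

```javascript
// Hypothetical helper: parse JSONL text, one JSON value per non-empty line.
function parseJsonl(text) {
  const records = [];
  const errors = [];
  text.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // skip blank lines
    try {
      records.push(JSON.parse(line));
    } catch (err) {
      // Keep going, but remember which line was malformed (1-based).
      errors.push({ line: index + 1, message: err.message });
    }
  });
  return { records, errors };
}

// Sample input with made-up fields and one deliberately broken line.
const sample = [
  '{"event":"phase_started","phase":"requirements"}',
  '{"event":"gate_passed","phase":"requirements"}',
  "not json",
].join("\n");

const parsed = parseJsonl(sample);
console.log(parsed.records.length, parsed.errors.length);
```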

Virtual Team Roles

| Role | Responsibility | Primary outputs |
| --- | --- | --- |
| PM | clarify product goal, users, scope, priorities, acceptance criteria | PRD or lightweight spec, requirements, success criteria |
| Architect / Tech Lead | design technical approach, interfaces, risks, migration, rollback | ADR, interface contract, risk register |
| Delivery Manager | break work into small batches and keep gates moving | backlog, task queue, Definition of Ready, Definition of Done |
| Developer | implement focused, traceable code changes | code, tests, ChangeSet, implementation notes |
| QA | verify acceptance, regression, edge, and failure paths | test matrix, test evidence, failure reproduction |
| Security | check threats, dependencies, secrets, permissions, data risk | security review, risk findings, evidence |
| SRE / DevOps | check CI, deployment, observability, SLO, rollback | release readiness, operational evidence |
| Reviewer | independently review diff, architecture fit, tests, maintainability | review findings, approval or blocker |
| Writer | produce user/developer-facing delivery docs | README, API docs, changelog, release notes |
| Learning Coach | convert evidence-backed outcomes into reusable memory | candidate playbooks, anti-patterns, policy rules |
| Trace Auditor | verify traceability from goal to code to evidence to release | audit bundle, traceability matrix, missing-link findings |

Each role's mission, principles, decision frameworks, artifact contract, quality bar, inputs, outputs, and gates are defined in src/core/defaults.mjs.

Workflow

The automated state machine follows this shape:

  • Intake
  • Context Scan
  • Experience Retrieval
  • Clarification Gate
  • Requirements
  • Architecture
  • Planning
  • Build Loop
  • Verification Loop
  • Review Gate
  • Release Readiness
  • Retro / Learn
  • Archive

advance_workflow moves through as many safe steps as it can, then stops at one of these boundaries:

  • user decision required
  • external agent task required
  • gate blocked
  • task complete
  • workflow cannot safely continue
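
The advance-until-boundary behavior can be pictured with a short loop. This is a sketch under assumptions, not the runtime's code. The identifiers `requirements`, `architecture`, `build_loop`, and `external_agent_required` appear in this README; the other phase identifiers, the `ok` and `task_complete` statuses, and the `advance` function are hypothetical.

```javascript
// Phase order from the list above; most identifiers are assumed spellings.
const PHASES = [
  "intake", "context_scan", "experience_retrieval", "clarification_gate",
  "requirements", "architecture", "planning", "build_loop",
  "verification_loop", "review_gate", "release_readiness", "retro_learn", "archive",
];

// Run phases in order and stop at the first one that reports a boundary.
// `runPhase` is a caller-supplied function returning { status }.
function advance(startIndex, runPhase) {
  for (let i = startIndex; i < PHASES.length; i++) {
    const result = runPhase(PHASES[i]);
    if (result.status !== "ok") {
      return { phase: PHASES[i], status: result.status };
    }
  }
  return { phase: PHASES[PHASES.length - 1], status: "task_complete" };
}

// Simulate a run that needs an external agent once it reaches the build loop.
const outcome = advance(0, (phase) =>
  phase === "build_loop" ? { status: "external_agent_required" } : { status: "ok" }
);
console.log(outcome);
```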

Question Protocol

The runtime should explore before asking.

It should ask the user when ambiguity affects:

  • product goal, target user, or success criteria
  • architecture, cost, compliance, long-term maintenance, or user experience
  • data deletion, migration, compatibility, public API behavior, paid services, or release window
  • acceptance criteria that cannot be inferred from the repository or prior decisions

It should not ask for:

  • facts discoverable from repository files, tests, README, CI, or package metadata
  • minor implementation details
  • code style choices already implied by the project
  • low-risk defaults

Questions are recorded in the decision log with default assumptions and impact.
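
One way to read the protocol is as a predicate over a discovered ambiguity. The sketch below is illustrative only: `shouldAskUser`, the area names, and the `discoverableFromRepo` flag are hypothetical, and the real runtime may weigh these factors differently.

```javascript
// Areas the protocol lists as worth a question (names are assumed spellings).
const HIGH_IMPACT_AREAS = new Set([
  "product_goal", "target_user", "success_criteria",
  "architecture", "cost", "compliance", "long_term_maintenance", "user_experience",
  "data_deletion", "migration", "compatibility", "public_api", "paid_services",
  "release_window", "acceptance_criteria",
]);

// Hypothetical predicate: ask only when exploration cannot settle a high-impact point.
function shouldAskUser(ambiguity) {
  // Facts the repository can answer never become questions.
  if (ambiguity.discoverableFromRepo) return false;
  return ambiguity.areas.some((area) => HIGH_IMPACT_AREAS.has(area));
}

console.log(shouldAskUser({ areas: ["data_deletion"], discoverableFromRepo: false }));
console.log(shouldAskUser({ areas: ["code_style"], discoverableFromRepo: false }));
```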

Traceability Model

Every code modification should be recorded as a ChangeSet.

A ChangeSet records:

  • change_id
  • task_id, requirement_id, or decision_id
  • initiating role
  • execution agent
  • task packet or prompt hash
  • context hash
  • changed files
  • diff hash
  • commands run
  • tests run
  • evidence refs
  • review refs
  • risk level
  • rollback plan
  • timestamp

This enables two directions of traceability:

  • requirement -> design -> task -> code -> tests -> review -> release
  • file change -> requirement -> decision -> agent -> evidence -> rollback
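
Both directions reduce to lookups over recorded ChangeSets. The sketch below uses simplified, made-up records: `change_id` and `requirement_id` are names from the list above, while `changed_files` and both helper functions are hypothetical.

```javascript
// Simplified sample records; real ChangeSets carry many more fields.
const changeSets = [
  { change_id: "CS-1", requirement_id: "REQ-1", changed_files: ["src/tasks.mjs", "tests/tasks.test.mjs"] },
  { change_id: "CS-2", requirement_id: "REQ-2", changed_files: ["src/tasks.mjs"] },
];

// Forward direction: requirement -> files changed on its behalf.
function filesForRequirement(sets, requirementId) {
  const files = new Set();
  for (const cs of sets) {
    if (cs.requirement_id === requirementId) {
      cs.changed_files.forEach((file) => files.add(file));
    }
  }
  return [...files].sort();
}

// Reverse direction: file -> requirements whose ChangeSets touched it.
function requirementsForFile(sets, file) {
  return sets
    .filter((cs) => cs.changed_files.includes(file))
    .map((cs) => cs.requirement_id);
}

console.log(filesForRequirement(changeSets, "REQ-1"));
console.log(requirementsForFile(changeSets, "src/tasks.mjs"));
```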

MCP Tools

| Tool | Purpose |
| --- | --- |
| advance_workflow | high-level automatic workflow advancement |
| create_goal | create a product goal or engineering task |
| scan_project_context | scan repository facts and inferred test commands |
| retrieve_global_experience | search global engineering memory |
| get_role_action | get the next task packet for a role |
| ask_user_decision | create a structured high-impact question |
| record_user_decision | record a user answer |
| dispatch_agent_task | create a task packet for Codex, Claude Code, Cursor, Gemini CLI, or another adapter |
| record_changeset | record traceable modification metadata |
| record_artifact | record requirements, ADRs, release notes, retros, or other artifacts |
| record_backlog | record Delivery Manager backlog items |
| record_evidence | record tests, scans, reviews, security checks, deployments, or manual evidence |
| run_gate | run the current or selected phase gate |
| propose_learning | create evidence-backed global memory candidates |
| promote_or_rollback_rule | move rules through candidate, sandbox, default, or deprecated states |
| export_audit_bundle | export timeline, decisions, evidence, ChangeSets, matrix, and summary |

Prompt Packs

For agents or environments that cannot use MCP directly, see:

prompt-pack/codex-skill.md
prompt-pack/claude-code.md
prompt-pack/generic-agent.md

These prompt packs tell the agent how to behave as one role in the virtual team and what evidence it must return.

Development

npm run check
npm test
npm run verify
npm run ci
npm run smoke

The implementation is dependency-light on purpose. The MCP server uses stdio JSON-RPC directly so the workflow kernel remains portable across agent harnesses.

See TESTING.md for the experiment matrix that maps public capabilities to executable tests. See docs/ for architecture, workflow, roles, data, and publishing guides.

Release Checklist

Before publishing:

  • Create the public GitHub repository and update package.json with repository, homepage, and bugs URLs.
  • Review docs/ and examples/ for stale placeholders.
  • Confirm the npm package name is available:
npm view ai-engineering-workflow version

An npm 404 usually means the package name is not currently published.

  • Log in to npm (use npm adduser instead if you still need to create an account):
npm login
npm whoami

This package sets publishConfig.registry to the official npm registry so publishing does not accidentally target a local mirror.

  • Run verification:
npm run ci
  • Review what will be published:
npm pack --dry-run
  • Publish:
npm publish

For a scoped public package such as @your-scope/ai-engineering-workflow, use:

npm publish --access public

The npm docs recommend reviewing package contents for sensitive or unnecessary information before publishing, testing the package, and using --access public for scoped public packages. For details, see the official npm docs for package.json, npm publish, and scoped public packages.

Security And Privacy

This tool writes project workflow data to the target repository and global memory to AI_ENGINEERING_HOME or ~/.ai-engineering.

Before sharing logs, audit bundles, or global memory, check for:

  • product strategy
  • proprietary code paths
  • user data
  • secrets
  • credentials
  • private URLs
  • vendor or customer names

Do not publish generated .ai-engineering/ runtime directories unless the target project intentionally treats audit logs as public artifacts.

Roadmap

  • richer Codex and Claude Code execution adapters
  • adapter health checks and retry policies
  • stronger schema validation for all ledger records
  • configurable gate policies
  • global memory conflict detection
  • import/export for organization preferences
  • CI examples for package provenance and release automation

Role Prompt References

The role prompts in src/core/defaults.mjs are original syntheses for this project. They do not vendor or copy complete third-party skill prompts. The design was informed by these public references:

  • BMAD Method agents: informed the multi-role agent model, including PM, Architect, Developer, Scrum/Delivery-style planning, QA, and implementation readiness workflows.
  • BMAD named agents: informed phase-anchored role identity, role customization, and reducing user cognitive load through named agents.
  • aj-geddes product-manager skill: informed PM requirements discipline around PRD vs tech spec choice, functional/non-functional requirements, prioritization, acceptance criteria, and traceability.
  • Product-Manager-Skills: informed agent-agnostic skill packaging and PM workflow structure across multiple AI coding harnesses.
  • Google Engineering Practices: informed Developer and Reviewer guidance around small changes, tests, code health, and review standards.
  • The Kanban Guide: informed Delivery Manager guidance around visualizing work, WIP, explicit policies, flow, and continuous improvement.
  • Scrum Guide: informed delivery gates, Definition of Done, transparency, inspection, and adaptation language.
  • Google SRE Book: informed SRE/DevOps guidance around SLOs, release engineering, monitoring, rollback, incident readiness, and reliable release processes.
  • OWASP ASVS: informed Security role verification thinking for application security controls.
  • NIST SSDF SP 800-218: informed Security, Learning Coach, and Trace Auditor guidance around secure development practices and evidence.
  • SLSA: informed Trace Auditor and SRE supply-chain/provenance language.
  • Google developer documentation style guide: informed Writer guidance around clear, consistent, accessible developer documentation.
  • Anthropic guide to building skills: influenced the progressive-disclosure shape: concise role metadata first, detailed role guidance only inside the role task packet.

Adaptation notes:

  • We use MoSCoW as the default prioritization language because it is compact for autonomous planning.
  • RICE is included only when comparable reach, impact, confidence, and effort inputs are available.
  • The PM role is forbidden from prescribing low-level implementation details unless they are already project constraints.
  • All roles must preserve traceability from product goal to requirements, stories, acceptance criteria, backlog, implementation, tests, evidence, review, release, learning, and ChangeSets.
  • Role prompts are not meant to replace the workflow gates; they tell each role how to produce evidence that the gates can verify.
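
The RICE note above can be illustrated with the standard formula, reach × impact × confidence ÷ effort. This is a sketch, not the package's code: `riceScore` is a hypothetical function, and returning `null` to signal a fallback to MoSCoW is an assumption about how a planner might use it.

```javascript
// Standard RICE score: (reach * impact * confidence) / effort.
// Returns null when any input is missing or unusable, so the caller
// can fall back to MoSCoW labels (assumed behavior).
function riceScore({ reach, impact, confidence, effort }) {
  const inputs = [reach, impact, confidence, effort];
  const allNumbers = inputs.every((v) => typeof v === "number" && Number.isFinite(v));
  if (!allNumbers || effort <= 0) return null;
  return (reach * impact * confidence) / effort;
}

console.log(riceScore({ reach: 200, impact: 2, confidence: 0.5, effort: 4 }));
console.log(riceScore({ reach: 200, impact: 2 }));
```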

License

MIT

Keywords

ai

FAQs

Package last updated on 05 Jun 2026
