+119
| --- | ||
| name: agestra-plan | ||
| description: > | ||
| Agestra command workflow for explicit `/agestra plan` or explicit multi-AI/provider | ||
| planning-readiness requests involving multiple AIs, all AIs, other AI, multi-AI, | ||
| Codex and Gemini, provider comparison, or 프로바이더 비교. Plain planning/scoping | ||
| requests without `/agestra` or explicit multi-AI/provider wording stay with the | ||
| current host; they are not Agestra natural-language auto-triggers. Triggers: "계획 | ||
| 점검", "기획 검토", "스코프 확정", "설계 단계로 넘어가도 되는지", "planning | ||
| readiness". | ||
| --- | ||
| ## Purpose | ||
| This skill produces a planning-readiness verdict. Planning checks whether a proposed problem statement, target user, included/excluded scope, and completion criteria are crisp enough to commit to a design phase. Planning does NOT decide architecture, implementation steps, code, or tests; it only judges whether the plan is ready to hand off to `/agestra design`. | ||
| Planning is invoked with `workflow: "planning"`. The MCP debate engine binds this workflow to the `planning.scope-and-success` question set defined in the planning workflow profile, with the following required questions answered for every aggregation item: | ||
| | Question id | Prompt | Verdict field | Allowed verdicts | | ||
| |-------------|--------|---------------|------------------| | ||
| | `purpose` | Is the problem or goal clear? | `purposeVerdict` | yes / no / unclear | | ||
| | `user` | Is the target user or actor specific enough? | `userVerdict` | yes / no / unclear | | ||
| | `scope` | Are included and excluded behaviors separated? | `scopeVerdict` | yes / no / partial / unclear | | ||
| | `success` | Is there a completion or success criterion? | `successVerdict` | yes / no / partial / unclear | | ||
| | `status` | Should the plan be accepted, scoped, clarified, or rejected? | `finalStatus` | accepted / scoped / clarified / rejected | | ||
| Planning verdicts are design-readiness verdicts, not implementation decisions. A planning verdict says "this plan is or is not ready for the design phase" — it never says "build it this way" or "use this stack". | ||
| Planning writes a durable Markdown report under `docs/reports/planning/` unless the user explicitly asks for chat-only output. The threaded aggregation document and the final result document follow the same `EvidencePolicy` as every other Agestra workflow: item-level and stance-level evidence types are preserved, and the allowed evidence types are `empirical`, `inferential`, and `mixed`. | ||
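The question table above can be read as a small lookup from verdict field to allowed values. The following is a minimal sketch under that reading, not the engine's actual contract: the flat `answer` dict and the `validate_verdicts` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical sketch: the field names and allowed values mirror the question
# table; the flat dict layout is an assumption, not the debate engine's schema.
PLANNING_QUESTION_SET = {
    "purposeVerdict": {"yes", "no", "unclear"},
    "userVerdict": {"yes", "no", "unclear"},
    "scopeVerdict": {"yes", "no", "partial", "unclear"},
    "successVerdict": {"yes", "no", "partial", "unclear"},
    "finalStatus": {"accepted", "scoped", "clarified", "rejected"},
}


def validate_verdicts(answer: dict) -> list:
    """Return a list of problems; an empty list means every field fits the table."""
    problems = []
    for field, allowed in PLANNING_QUESTION_SET.items():
        value = answer.get(field)
        if value is None:
            problems.append(f"missing {field}")
        elif value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    return problems
```

Note that `partial` is only valid for `scopeVerdict` and `successVerdict`, so a check of this kind would flag `purposeVerdict: "partial"`.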
| ## Workflow | ||
| ### Phase 0: Setup preflight (MANDATORY) | ||
| Call `setup_status` before anything else. If the response contains `Setup required: yes` or `Current config: not found`: | ||
| 1. Stop this skill and invoke `agestra:setup` to let the user pick providers and locale, then call `setup_apply`. | ||
| 2. After setup reports a written config, re-enter this skill at Phase 1 with the original user request preserved. | ||
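The preflight condition above amounts to a substring check on the `setup_status` response. A minimal sketch, assuming the response is available as plain text (the source does not state the response format, and `needs_setup` is a hypothetical helper):

```python
# Hypothetical helper: assumes the setup_status response has been captured as
# a plain string. The two marker strings are taken verbatim from Phase 0.
SETUP_MARKERS = ("Setup required: yes", "Current config: not found")


def needs_setup(setup_status_text: str) -> bool:
    """True when either marker appears, meaning the skill should stop and run setup."""
    return any(marker in setup_status_text for marker in SETUP_MARKERS)
```

When `needs_setup` returns true, the caller would stop, run `agestra:setup`, and re-enter at Phase 1 with the original request preserved.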
| ### Phase 1: Determine the plan under review | ||
| Identify the plan to evaluate. The plan source can be: | ||
| - a draft document under `docs/plans/`, | ||
| - a saved idea record under `docs/ideas/`, | ||
| - a short paragraph the user pasted into the chat, | ||
| - the active GitHub issue or PR description supplied by the host. | ||
| If the source is ambiguous, ask once which artifact is the plan under review. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not proceed to investigation routing until the plan source is explicit. | ||
| ### Phase 2: Choose planning topology (investigation mode, 조사 방식) | ||
| Available investigation modes for planning: | ||
| - **Council Planning** — host and external providers independently inspect the plan with distinct planning lenses (goal clarity, scope boundaries, actor specificity, success criteria, decision readiness), then cross-review and debate the readiness verdict. | ||
| - **Host-native first Planning** — the host's native `agestra-research` agent collects evidence from the plan source first, persists the planning evidence artifact, and external providers challenge it through a short consensus round. | ||
| - **Provider-seeded Planning** — the selected `seed_provider` produces a readiness analysis seed; the host injects evidence as a challenge stance and other reviewers weigh in. | ||
| This is a mandatory topology selection gate. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Planning is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask the user which available provider should seed. Do not infer it. | ||
| If no external providers are configured or available, stop Agestra orchestration and direct the user to `/agestra setup`. This skill has no host-only fallback mode for planning. | ||
| ### Phase 3: Route execution | ||
| Hand off to `agestra:agestra-team-lead`. Build a self-contained handoff packet with: | ||
| - **Workflow:** `planning` | ||
| - **Profile id:** `planning.scope-and-success.v1` | ||
| - **Topology (investigation mode):** Council Planning / Host-native first Planning / Provider-seeded Planning (selected by the user in Phase 2) | ||
| - **Seed provider:** when topology is Provider-seeded Planning | ||
| - **Plan source:** absolute path or pasted text from Phase 1 | ||
| - **Report artifact path expectation:** `docs/reports/planning/YYYY-MM-DD-planning-[topic].md` | ||
| - **Lens card:** `skills/references/lenses/research-domains/planning.md` (purpose / user / scope / success evidence collection for design-readiness) | ||
| - **Question set:** `planning.scope-and-success` — the five required questions and their allowed verdicts above must appear unchanged in the JSON contract the debate engine forwards. | ||
| - **Evidence policy:** shared `EvidencePolicy` — `preserveItemEvidenceType: true`, `preserveStanceEvidenceType: true`, `allowedEvidenceTypes: ["empirical", "inferential", "mixed"]` | ||
| - **Default lenses:** goal clarity / scope boundaries / actor specificity / success criteria / decision readiness | ||
| - **Available providers:** from `environment_check` | ||
| - **Requested providers:** explicit names captured from user wording; otherwise "all configured and available planning-capable providers" | ||
| - **Host-native route:** route any host debate participant to `agestra-debate` with `participant_routes` | ||
| - **Target workspace root:** absolute project folder if supplied or implied; pass as `workspace_base_dir` | ||
| - **Locale:** from `setup_status` | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available | ||
| - **Original user request:** preserved verbatim | ||
| Team-lead calls `agent_research_start` first for Council Planning, then `agent_consensus_start` with `workflow: "planning"`, the prepared `aggregation`, the planning `questionSet`, and the shared `evidencePolicy`. The debate engine does not branch on `workflow`; planning behavior comes from the supplied profile and question set. | ||
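A handoff packet along the lines listed above could be assembled as follows. This is a sketch under stated assumptions: the dict key names (`seedProvider`, `planSource`, `reportPath`, `originalUserRequest`) and the slug rule for `[topic]` are hypothetical, and only a subset of the listed fields is shown. The workflow name, profile id, topology names, report path pattern, and evidence policy values come from the list itself.

```python
import re
from datetime import date

TOPOLOGIES = (
    "Council Planning",
    "Host-native first Planning",
    "Provider-seeded Planning",
)


def report_path(topic: str, today: date) -> str:
    # Assumed slug rule: lowercase, non-alphanumeric runs collapsed to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"docs/reports/planning/{today.isoformat()}-planning-{slug}.md"


def build_handoff_packet(topology, plan_source, original_request, topic, today,
                         seed_provider=None) -> dict:
    if topology not in TOPOLOGIES:
        raise ValueError(f"unknown topology: {topology!r}")
    # The seed provider must never be inferred; it has to be a user choice.
    if topology == "Provider-seeded Planning" and not seed_provider:
        raise ValueError("seed provider must be an explicit user choice")
    return {
        "workflow": "planning",
        "profileId": "planning.scope-and-success.v1",
        "topology": topology,
        "seedProvider": seed_provider,          # hypothetical key name
        "planSource": plan_source,              # hypothetical key name
        "reportPath": report_path(topic, today),
        "evidencePolicy": {
            "preserveItemEvidenceType": True,
            "preserveStanceEvidenceType": True,
            "allowedEvidenceTypes": ["empirical", "inferential", "mixed"],
        },
        "originalUserRequest": original_request,  # preserved verbatim
    }
```

Raising on a missing seed provider, rather than picking a default, reflects the rule that the seed provider stays pending until the user names one.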
| ### Phase 4: Present | ||
| Report: | ||
| - Planning topology (Council / Host-native first / Provider-seeded) | ||
| - Plan source path | ||
| - Per-question verdicts (purpose / user / scope / success) with supporting participants | ||
| - Final readiness status (accepted / scoped / clarified / rejected) | ||
| - Threaded aggregation document path | ||
| - Final result document path under `docs/reports/planning/` | ||
| - Whether `/agestra design` should be invoked next, or whether the plan needs more clarification from the user first | ||
| ## What planning is for | ||
| - Deciding whether a problem statement and goal are crisp enough to commit to a design phase. | ||
| - Surfacing missing target-user specificity, ambiguous scope boundaries, and missing completion criteria. | ||
| - Producing a design-readiness verdict (accepted / scoped / clarified / rejected) backed by participant question answers. | ||
| ## What planning is NOT for | ||
| - Implementation. Planning never writes product code, configuration, or tests. | ||
| - Architecture decisions. Picking stacks, frameworks, file layouts, or interfaces belongs in `/agestra design`. | ||
| - Code review. Reviewing existing code for defects belongs in `/agestra review`. | ||
| - Security audit. Threat modeling and abuse-path analysis belong in `/agestra security`. | ||
| - QA. Verifying that an implementation matches a design document belongs in `/agestra qa`. | ||
| ## Constraints | ||
| - Planning is read-only for source code, tests, and persistent documents under `docs/plans/`. | ||
| - Planning may write report artifacts only under `docs/reports/planning/`. | ||
| - Planning must not author or modify a design document; if the plan is `accepted` or `scoped`, recommend `/agestra design` as the next step instead of producing the design itself. | ||
| - Planning must not infer a seed provider, a participant set, or a question-set override; explicit user-provided values only. | ||
| - Planning verdicts must cite stance evidence (`evidenceType` + `evidenceRefs`) per the shared `EvidencePolicy`; stances without an `evidenceType` are not allowed. | ||
| - Communicate in the user's language. |
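The write constraint above (reports only under `docs/reports/planning/`) can be expressed as a path guard. A minimal sketch, assuming workspace-relative POSIX-style paths; `may_write` is a hypothetical helper and not part of Agestra:

```python
from pathlib import PurePosixPath

ALLOWED_PREFIX = ("docs", "reports", "planning")


def may_write(relative_path: str) -> bool:
    """True only for files strictly inside docs/reports/planning/."""
    path = PurePosixPath(relative_path)
    # Reject absolute paths and any ".." segment instead of trying to resolve them.
    if path.is_absolute() or ".." in path.parts:
        return False
    return len(path.parts) > 3 and path.parts[:3] == ALLOWED_PREFIX
```

Rejecting `..` outright is a deliberately conservative choice; it avoids deciding how a traversal such as `docs/reports/planning/../../plans/x.md` should be normalized.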
| # Planning Research Domain Pack | ||
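| Planning research is an investigation that gathers evidence, before the full design phase begins, to judge whether the problem and goal, target user, included/excluded scope, and completion criteria are sufficiently clear. | ||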
| ## Focus | ||
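| - Whether the problem or goal can be stated in one line and is not blurred by vague abstract nouns | ||
| - Whether the target user (person, role, environment) is named specifically rather than lumped together as "anyone" | ||
| - Whether included and excluded items are separated, with deferred items marked separately | ||
| - Whether there is an observable criterion that can be called "done" (a checkable result, deliverable, or behavior) | ||
| - External constraints, policies, and operating environments that still need to be confirmed before moving to the design phase | ||
| - Whether work, documents, or ideas with a similar intent have already been started | ||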
| ## Useful Lens Bundles | ||
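| - Goal Clarity + User Pain: whether the problem statement connects to friction that real users experience | ||
| - Scope Boundaries + Comparison: where this plan's scope ends when compared with similar attempts | ||
| - Decision Readiness + Validation: whether enough information exists at this point to hand the plan to design | ||
| - Risk + Evidence: what is lost if design starts with the ambiguous parts left as they are | ||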
| ## Research Card | ||
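| - Quote the goal statement, target user, included/excluded scope, and completion criteria verbatim and organize them in a table. | ||
| - Mark each item as one of "clear / partial / ambiguous" and record the evidence (file, conversation record, existing document) alongside it. | ||
| - For each item marked ambiguous, leave a "question to confirm before moving to the design phase". | ||
| - Collect separately any existing decisions or operating constraints that may conflict with the user's intent. | ||
| - Do not write code or make design decisions. Organize only the information needed to hand off to the next step. | ||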
| ## Output | ||
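| Produce a table of evidence for judging whether the plan can enter the design phase. Leave enough evidence that the verdict can be one of "accepted / scoped / clarified / rejected", and do not propose a design directly. | |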
@@ -15,3 +15,3 @@ { | ||
| "description": "Multi-host MCP orchestration across Claude, Ollama, Gemini, and Codex for review, QA, and cross-validation", | ||
| "version": "4.14.5", | ||
| "version": "4.15.0", | ||
| "author": { | ||
@@ -18,0 +18,0 @@ "name": "mua-vtuber" |
| { | ||
| "name": "agestra", | ||
| "version": "4.14.5", | ||
| "version": "4.15.0", | ||
| "description": "Claude Code plugin — multi-host MCP orchestration across Claude, Ollama, Gemini, and Codex for review, QA, and cross-validation", | ||
@@ -5,0 +5,0 @@ "mcpServers": { |
@@ -7,4 +7,4 @@ # Generated by Agestra. Managed file. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| - For investigation-including workflows, route through `agent_research_start`, then start debate separately with `agent_consensus_start`. | ||
| - Host research/debate contract uses workflow profiles, `aggregation`, `questionSet`, and `evidencePolicy`: | ||
| 호스트가 조사한다. | ||
@@ -11,0 +11,0 @@ 호스트가 정리한다. |
@@ -7,4 +7,4 @@ # Generated by Agestra. Managed file. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| - For investigation-including workflows, route through `agent_research_start`, then start debate separately with `agent_consensus_start`. | ||
| - Host research/debate contract uses workflow profiles, `aggregation`, `questionSet`, and `evidencePolicy`: | ||
| 호스트가 조사한다. | ||
@@ -11,0 +11,0 @@ 호스트가 정리한다. |
@@ -7,4 +7,4 @@ # Generated by Agestra. Managed file. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| - For investigation-including workflows, route through `agent_research_start`, then start debate separately with `agent_consensus_start`. | ||
| - Host research/debate contract uses workflow profiles, `aggregation`, `questionSet`, and `evidencePolicy`: | ||
| 호스트가 조사한다. | ||
@@ -11,0 +11,0 @@ 호스트가 정리한다. |
@@ -7,4 +7,4 @@ # Generated by Agestra. Managed file. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows that continue into domain consensus, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| - For investigation-including workflows that continue into workflow consensus, route through `agent_research_start`, then start debate separately with `agent_consensus_start`. | ||
| - Host research/debate contract uses workflow profiles, `aggregation`, `questionSet`, and `evidencePolicy`: | ||
| 호스트가 조사한다. | ||
@@ -11,0 +11,0 @@ 호스트가 정리한다. |
@@ -7,4 +7,4 @@ # Generated by Agestra. Managed file. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| - For investigation-including workflows, route through `agent_research_start`, then start debate separately with `agent_consensus_start`. | ||
| - Host research/debate contract uses workflow profiles, `aggregation`, `questionSet`, and `evidencePolicy`: | ||
| 호스트가 조사한다. | ||
@@ -11,0 +11,0 @@ 호스트가 정리한다. |
@@ -7,4 +7,4 @@ # Generated by Agestra. Managed file. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| - For investigation-including workflows, route through `agent_research_start`, then start debate separately with `agent_consensus_start`. | ||
| - Host research/debate contract uses workflow profiles, `aggregation`, `questionSet`, and `evidencePolicy`: | ||
| 호스트가 조사한다. | ||
@@ -11,0 +11,0 @@ 호스트가 정리한다. |
+6
-7
@@ -19,3 +19,3 @@ # Agestra for Codex | ||
| - Default to direct Codex work using the workspace `AGENTS.md` contract, oh-my-codex workflows, and Superpowers-style skills when they apply. | ||
| - Use Agestra primarily for explicit multi-AI or provider orchestration requests, such as when the user names Agestra, Codex/Gemini/Ollama providers, "multi-AI", "multiple AI", "provider", `agent_debate_*`, `cli_worker_*`, or asks to gather/compare several AI opinions. | ||
| - Use Agestra primarily for explicit multi-AI or provider-backed review, QA, security, design, idea, and evidence/consensus work, such as when the user names Agestra, Codex/Gemini/Ollama providers, "multi-AI", "multiple AI", "provider", or asks to gather/compare several AI opinions. | ||
| - Plain review/QA/check requests without `/agestra` or explicit multi-AI/provider wording stay with the current host; they are not Agestra natural-language auto-triggers. | ||
@@ -27,2 +27,3 @@ - Agestra natural-language routing requires explicit multi-AI/provider wording such as "multiple AIs", "all AIs", "other AI", "multi-AI", "Codex and Gemini", "provider comparison", or "프로바이더 비교". Explicit `/agestra ...` commands remain supported. | ||
| - Do not treat ordinary review, QA, security, design, idea, implementation, cleanup, build-fix, or planning requests as Agestra workflows just because setup/status/provider checks exist. | ||
| - Agestra does not implement product code or author persistent E2E test files. Code and test authoring should happen in the current host first, then Agestra can review, QA, security-check, design-check, or discuss the result. | ||
| - When an Agestra workflow is active, treat `commands/*.md` as the source of truth for that workflow. | ||
@@ -38,6 +39,4 @@ - Prefer Agestra MCP tools over ad-hoc multi-provider prompting only when the task is actually in Agestra/multi-provider mode. | ||
| - Review, QA, and security workflows write durable reports under `docs/reports/review/`, `docs/reports/qa/`, and `docs/reports/security/` unless the user asks for chat-only output. | ||
| - Persistent E2E test creation/maintenance is internal: QA produces `E2E_TEST_WORK_REQUEST`, the leader asks the user, and approved work goes to `agestra-implementer` with `mode: e2e-test-authoring`. | ||
| - When Agestra is active, design and architecture requests follow `commands/design.md` | ||
| - When Agestra is active, idea discovery requests follow `commands/idea.md` | ||
| - When Agestra is active, implementation requests follow `commands/implement.md` | ||
@@ -47,11 +46,11 @@ ## Core MCP Tools | ||
| - `setup_status`, `environment_check`, and `provider_list`: inspect installation, host, and provider state for Agestra health checks and active Agestra workflows | ||
| - `agent_consensus_start` (with `agent_debate_approve`/`_continue`/`_reject`) and `agent_debate_review`: run approval-gated consensus flows from prepared `initial_aggregation` | ||
| - `cli_worker_spawn`, `agent_changes_review`, `agent_changes_accept`, `agent_changes_reject`: use for explicit autonomous Codex/Gemini worker tasks | ||
| - `agent_research_start`: research-only preprocessing with workflow profile, prompt pack, questionSet, evidencePolicy, research lenses, and investigator assignments; writes `research_submissions.json`, `research_transcript.json`, and `aggregation.json`; does not start debate | ||
| - `agent_consensus_start` (with `agent_debate_approve`/`_continue`/`_reject`) and `agent_debate_review`: debate-only approval-gated consensus flows from prepared `aggregation`, supplied `questionSet`, and `evidencePolicy`; `workflow` is a report/artifact label only, not a debate routing branch | ||
| - `host_assets_status`, `host_assets_install`, `host_assets_uninstall`: inspect and explicitly manage generated Codex host-native assets such as custom agents and skills | ||
| - `qa_run`: run workspace build/test verification before reporting implementation completion | ||
| - `qa_run`: run workspace build/test verification for QA evidence | ||
| ## Project Assets | ||
| - `agents/`: canonical role prompts (`agestra-team-lead`, `agestra-research`, `agestra-debate`, `agestra-implementer`) | ||
| - `agents/`: canonical role prompts (`agestra-team-lead`, `agestra-research`, `agestra-debate`) | ||
| - `skills/`: reusable workflow references | ||
| - `GEMINI.md` and `.gemini/commands/`: Gemini-specific host assets; keep behavior aligned with them when updating shared workflows |
+23
-11
@@ -5,5 +5,6 @@ --- | ||
| Host-native debate participant for Agestra consensus rounds. Reads the assigned | ||
| domain/lens context, answers a pending host turn, and returns the required | ||
| consensus JSON. It is not the moderator, not the team lead, not a reviewer/QA/ | ||
| security specialist identity, and does not choose participants or run rounds. | ||
| workflow profile/lens context, answers a pending host turn by the supplied | ||
| question set, and returns the required consensus JSON. It is not the | ||
| moderator, not the team lead, not a reviewer/QA/security specialist identity, | ||
| and does not choose participants or run rounds. | ||
@@ -24,3 +25,3 @@ Use this agent only when the team lead or consensus engine has an explicit | ||
| You are not the consensus engine, moderator, team lead, reviewer, QA judge, | ||
| security auditor, or implementation worker. | ||
| security auditor, or code-change executor. | ||
@@ -41,3 +42,4 @@ Use only inside an active Agestra workflow. Plain review/QA/check requests | ||
| - allowed files or evidence references | ||
| - assigned domain/lens context, if any | ||
| - assigned workflow profile and lens context, if any | ||
| - supplied `questionSet` with required question IDs, verdict fields, and allowed verdicts | ||
| - output contract | ||
@@ -72,4 +74,12 @@ | ||
| "id": "<assigned item id>", | ||
| "stance": "agree", | ||
| "comment": "short evidence-based comment when needed" | ||
| "questionResults": { | ||
| "<verdictField from questionSet>": { | ||
| "verdict": "<allowed verdict from questionSet>", | ||
| "reason": "short evidence-based reason", | ||
| "stanceEvidenceType": "empirical", | ||
| "evidenceRefs": ["file:line, artifact path, or item evidence ref"] | ||
| } | ||
| }, | ||
| "finalStatus": "<allowed final status from questionSet>", | ||
| "adjustedRemedy": "optional remedy adjustment when allowed by the packet" | ||
| } | ||
@@ -84,5 +94,7 @@ ] | ||
| - Answer every assigned item exactly once. | ||
| - `stance` must be one of `agree`, `disagree`, `opinion`, or `revise`. | ||
| - `disagree`, `opinion`, and `revise` require a non-empty `comment`. | ||
| - `revise` requires a `proposedItem` in the shape requested by the engine. | ||
| - Answer every required question in the supplied `questionSet`. | ||
| - Use only verdict values allowed by the supplied `questionSet`. | ||
| - Include stance evidence type and evidence refs for each question answer. | ||
| - Treat `workflow` as artifact context only; do not infer hidden QA, review, | ||
| security, design, idea, or planning rules. | ||
| - Do not create new top-level fields unless the engine contract explicitly allows them. | ||
@@ -97,3 +109,3 @@ </Output_Contract> | ||
| - Do not convert this task into a general review, QA, security audit, or design pass. | ||
| - If evidence is missing, use `opinion` or `disagree` with a clear comment instead of inventing facts. | ||
| - If evidence is missing, answer the supplied question set with `unclear` or the closest allowed verdict and explain the evidence gap instead of inventing facts. | ||
| </Boundaries> |
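The output rules above (answer every required question, use only allowed verdicts, attach stance evidence) can be checked mechanically. The sketch below follows the `questionResults` shape shown in the JSON example; the `question_set` and `final_statuses` parameter shapes and the `check_item_answer` helper are assumptions made for illustration, not the engine's API.

```python
ALLOWED_EVIDENCE_TYPES = {"empirical", "inferential", "mixed"}


def check_item_answer(item: dict, question_set: dict, final_statuses: set) -> list:
    """Return problems for one answered item.

    question_set maps each verdict field to its allowed verdicts (assumed shape).
    """
    problems = []
    results = item.get("questionResults", {})
    for field, allowed in question_set.items():
        result = results.get(field)
        if result is None:
            problems.append(f"{field}: unanswered")
            continue
        if result.get("verdict") not in allowed:
            problems.append(f"{field}: verdict not allowed")
        if result.get("stanceEvidenceType") not in ALLOWED_EVIDENCE_TYPES:
            problems.append(f"{field}: stanceEvidenceType missing or not allowed")
        # Whether an empty list is acceptable for an "unclear" verdict is not
        # stated in the source, so only the presence of a list is checked.
        if not isinstance(result.get("evidenceRefs"), list):
            problems.append(f"{field}: evidenceRefs must be a list")
    if item.get("finalStatus") not in final_statuses:
        problems.append("finalStatus: missing or not allowed")
    return problems
```

Because the allowed values come in as parameters, the same check works for any workflow's question set, which matches the rule that `workflow` is artifact context only.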
@@ -19,3 +19,3 @@ --- | ||
| You are not the team lead, final synthesizer, consensus engine, reviewer, QA | ||
| judge, security auditor, or implementation worker. | ||
| judge, security auditor, or code-change executor. | ||
@@ -31,5 +31,8 @@ Use only inside an active Agestra workflow. Plain review/QA/check requests | ||
| Expected assignment fields: | ||
| - `domain`: idea, design, review, qa, security, implement, or research | ||
| - `workflow` and `profileId`: idea, design, review, qa, security, planning, or | ||
| research workflow profile selected by team-lead | ||
| - `promptPack`: self-contained workflow prompt, research skill guidance, | ||
| question set, finding contract, and evidence policy | ||
| - `question`: the narrow question this run answers | ||
| - `lens`: the lens bundle to apply | ||
| - `lens`: the single lens to apply | ||
| - `scope`: files, docs, URLs, or boundaries to inspect | ||
@@ -45,4 +48,4 @@ - `deliverable`: expected result shape | ||
| Start from `skills/references/lenses/research.md` when lens rules are needed. | ||
| If the assignment has a concrete domain, read only the matching domain pack under | ||
| `skills/references/lenses/research-domains/`. | ||
| If the assignment has a concrete workflow profile, read only the matching lens | ||
| reference under `skills/references/lenses/research-domains/`. | ||
@@ -64,4 +67,4 @@ One research run should keep a narrow lens bundle. If the assignment includes too | ||
| <Output_Contract> | ||
| Return JSON only. The result feeds team-lead/research aggregation, which may | ||
| later create `initial_aggregation` for the consensus engine. | ||
| Return JSON only. The result feeds team-lead/research aggregation. Do not start | ||
| debate or create the final report. | ||
@@ -73,3 +76,5 @@ Recommended shape: | ||
| "researcher": "agestra-research", | ||
| "domain": "idea", | ||
| "workflow": "idea", | ||
| "profileId": "idea.value-and-next-step.v1", | ||
| "promptPackId": "idea.value-and-next-step.v1", | ||
| "question": "The assigned question", | ||
@@ -83,3 +88,8 @@ "lens": "User Pain + Evidence", | ||
| "claim": "What the evidence suggests", | ||
| "whyItMatters": "Why a participant should care about this finding", | ||
| "evidenceType": "empirical", | ||
| "evidence": ["file:line, command, artifact path, or URL"], | ||
| "proposedRemedy": "Action or next step when the workflow requires one", | ||
| "remedyRisk": "Risk introduced by the proposed remedy, or null", | ||
| "debateEligibility": "eligible", | ||
| "confidence": "high", | ||
@@ -89,9 +99,12 @@ "limits": "What was not checked" | ||
| ], | ||
| "rawTranscript": "The concise raw response or notes from this run", | ||
| "openQuestions": [], | ||
| "suggestedConsensusItems": [] | ||
| "suggestedAggregationItems": [] | ||
| } | ||
| ``` | ||
| Use `suggestedConsensusItems` only for claims that may need multi-AI consensus. | ||
| Do not call the consensus engine yourself. | ||
| Use one prompt pack and one lens per run. Every finding must classify evidence as | ||
| `empirical`, `inferential`, or `mixed`; include `proposedRemedy`, `remedyRisk`, | ||
| and `debateEligibility` when the profile contract asks for them. Do not call the | ||
| consensus engine yourself. | ||
| </Output_Contract> | ||
@@ -98,0 +111,0 @@ |
+156
-93
@@ -6,11 +6,12 @@ --- | ||
| packets. It composes teams, writes assignments and prompts, routes work to | ||
| providers or the reduced host-native agents (research/debate/implementer), | ||
| supervises execution, inspects evidence, runs consensus flows, and writes the | ||
| final user-facing report. It does not edit product files directly. | ||
| providers or the reduced host-native agents (research/debate), supervises | ||
| execution, inspects evidence, runs consensus flows, and writes the final | ||
| user-facing report. It does not edit product files directly. | ||
| Do not invoke this agent directly for raw user messages, explicit `/agestra` | ||
| commands, or natural-language Agestra / multi-AI / provider requests. Those | ||
| requests must enter through `agestra-leader` or the matching domain | ||
| skill/command first so domain question sheets, mode gates, trust gates, QA | ||
| depth gates, and research-topology gates can run before team-lead execution. | ||
| requests must enter through `agestra-leader` or the selected workflow | ||
| skill/command first so workflow profiles, questionSets, mode gates, trust | ||
| gates, QA depth gates, and research-topology gates can run before team-lead | ||
| execution. | ||
@@ -23,3 +24,3 @@ Plain review/QA/check requests without `/agestra` or explicit multi-AI/provider | ||
| codexSandboxMode: read-only | ||
| tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, mcp__plugin_agestra_agestra__environment_check, mcp__plugin_agestra_agestra__provider_list, mcp__plugin_agestra_agestra__provider_health, mcp__plugin_agestra_agestra__provider_readiness, mcp__plugin_agestra_agestra__provider_trust_apply, mcp__plugin_agestra_agestra__run_observable_events, mcp__plugin_agestra_agestra__trace_query, mcp__plugin_agestra_agestra__trace_summary, mcp__plugin_agestra_agestra__trace_visualize, mcp__plugin_agestra_agestra__ai_chat, mcp__plugin_agestra_agestra__ai_analyze_files, mcp__plugin_agestra_agestra__ai_compare, mcp__plugin_agestra_agestra__agent_research_consensus_start, mcp__plugin_agestra_agestra__agent_consensus_start, mcp__plugin_agestra_agestra__agent_debate_status, mcp__plugin_agestra_agestra__agent_consensus_submit_turn, mcp__plugin_agestra_agestra__agent_debate_approve, mcp__plugin_agestra_agestra__agent_debate_continue, mcp__plugin_agestra_agestra__agent_debate_reject, mcp__plugin_agestra_agestra__agent_cross_validate, mcp__plugin_agestra_agestra__cli_worker_spawn, mcp__plugin_agestra_agestra__cli_worker_status, mcp__plugin_agestra_agestra__cli_worker_collect, mcp__plugin_agestra_agestra__cli_worker_stop, mcp__plugin_agestra_agestra__agent_changes_review, mcp__plugin_agestra_agestra__agent_changes_accept, mcp__plugin_agestra_agestra__agent_changes_reject, mcp__plugin_agestra_agestra__workspace_create_document, mcp__plugin_agestra_agestra__workspace_read, mcp__plugin_agestra_agestra__workspace_list | ||
| tools: Read, Glob, Grep, Bash, WebFetch, WebSearch, TodoWrite, AskUserQuestion, Skill, ToolSearch, CronCreate, CronList, CronDelete, Agent, mcp__plugin_agestra_agestra__environment_check, mcp__plugin_agestra_agestra__provider_list, mcp__plugin_agestra_agestra__provider_health, mcp__plugin_agestra_agestra__provider_readiness, mcp__plugin_agestra_agestra__provider_trust_apply, mcp__plugin_agestra_agestra__run_observable_events, mcp__plugin_agestra_agestra__trace_query, mcp__plugin_agestra_agestra__trace_summary, mcp__plugin_agestra_agestra__trace_visualize, mcp__plugin_agestra_agestra__ai_chat, mcp__plugin_agestra_agestra__ai_analyze_files, mcp__plugin_agestra_agestra__ai_compare, mcp__plugin_agestra_agestra__agent_research_start, mcp__plugin_agestra_agestra__agent_consensus_start, mcp__plugin_agestra_agestra__agent_debate_status, mcp__plugin_agestra_agestra__agent_consensus_submit_turn, mcp__plugin_agestra_agestra__agent_debate_approve, mcp__plugin_agestra_agestra__agent_debate_continue, mcp__plugin_agestra_agestra__agent_debate_reject, mcp__plugin_agestra_agestra__agent_cross_validate, mcp__plugin_agestra_agestra__workspace_create_document, mcp__plugin_agestra_agestra__workspace_read, mcp__plugin_agestra_agestra__workspace_list | ||
| --- | ||
@@ -33,3 +34,3 @@ | ||
| Use only inside an active Agestra workflow after a domain skill or command has | ||
| Use only inside an active Agestra workflow after a workflow skill or command has | ||
| created a self-contained handoff packet. Plain review/QA/check requests without | ||
@@ -39,8 +40,8 @@ `/agestra` or explicit multi-AI/provider wording stay with the current host. | ||
| Hard entry gate: if you are invoked directly from a raw user request and the | ||
| message does not include a handoff packet with domain, mode, target/scope, | ||
| provider context, and the relevant domain gates, do not run setup checks, | ||
| message does not include a handoff packet with workflow, mode, target/scope, | ||
| provider context, and the relevant workflow gates, do not run setup checks, | ||
| provider checks, consensus, or fan-out. Route back through `agestra-leader` or | ||
| the matching domain skill/command. When the domain is clear, use the domain | ||
| skill directly; for example, memory leak/performance inspection belongs to the | ||
| review workflow. If the host exposes the Skill tool, invoke that skill; otherwise | ||
| the selected workflow skill/command. When the workflow classification is clear, | ||
| use the workflow skill directly; for example, memory leak/performance inspection | ||
| belongs to the review workflow. If the host exposes the Skill tool, invoke that skill; otherwise | ||
| tell the caller to restart through the router. Do not silently fill the missing | ||
@@ -64,6 +65,4 @@ mode or research-topology choice yourself. | ||
| host-turn gate. | ||
| - `agestra-implementer`: scoped code/test changes, including approved | ||
| `mode: e2e-test-authoring` work. | ||
| Review, QA, security, design, idea, and E2E are lenses or modes under | ||
| Review, QA, security, design, and idea are lenses or modes under | ||
| `skills/references/lenses/`; they are not default standalone agents. | ||
@@ -76,3 +75,3 @@ </Canonical_Agent_Topology> | ||
| - Do not use Agestra just because the task says review, QA, security, design, | ||
| idea, implementation, or cleanup. Agestra needs `/agestra` or explicit | ||
| idea, code change, or cleanup. Agestra needs `/agestra` or explicit | ||
| multi-AI/provider wording. | ||
@@ -91,4 +90,5 @@ - External MCP, CLI, and chat providers are participants only. Native helper | ||
| dispatch. | ||
| - No direct product edits. Delegate implementation to `agestra-implementer` or | ||
| external write-capable workers and inspect their results before accepting. | ||
| - No product or persistent test implementation orchestration. Code-changing and | ||
| test-authoring requests should stay with the current host first, then return | ||
| to Agestra for QA, review, security review, design, or idea work. | ||
| - Do not accept MVP-only, stubbed, hardcoded, or fallback behavior unless the | ||
@@ -109,3 +109,5 @@ user or design explicitly approved that reduced scope. | ||
| collection and lens-specific investigation. | ||
| 2. Consolidate the host-native evidence into `initial_aggregation.items`. | ||
| 2. Consolidate the host-native evidence into `aggregation.items`, preserving | ||
| raw responses, original IDs, evidence type, proposed remedy, remedy risk, | ||
| and debate eligibility. | ||
| 3. When a host debate participant is useful, add an explicit host-turn | ||
@@ -125,2 +127,17 @@ participant such as `host-debate` with `participant_routes` pointing to | ||
| <Tool_Surface_Guard> | ||
| The team-lead tool surface is intentionally broad, so use it as a staged | ||
| control plane rather than one large action button. | ||
| - Prefer read/status tools first: `environment_check`, `provider_list`, | ||
| `provider_readiness`, `agent_debate_status`, | ||
| `run_observable_events`, `workspace_read`, and `workspace_list`. | ||
| - Treat write-capable or irreversible tools as gated legacy/internal actions: | ||
| `provider_trust_apply`, `agent_debate_approve`, | ||
| `agent_debate_continue`, `agent_debate_reject`, and | ||
| `workspace_create_document`. | ||
| - Keep the final report explicit about every gated action taken and the evidence | ||
| that justified it. | ||
| </Tool_Surface_Guard> | ||
| <Progress_Visibility> | ||
@@ -132,8 +149,7 @@ Agestra provider-backed work is never fire-and-forget. Completion notifications | ||
| evidence collection, research planning, provider fan-out, consensus/debate, | ||
| worker execution, QA/review inspection, and report-writing phases. | ||
| - While provider, debate, or worker work is running, poll the narrowest available | ||
| QA/review inspection, and report-writing phases. | ||
| - While provider or debate work is running, poll the narrowest available | ||
| progress surface every 30-60 seconds and relay a short status update: | ||
| `agent_debate_status` for consensus sessions, `run_observable_events` with a | ||
| cursor when a run/session/worker locator exists, and `cli_worker_status` for | ||
| CLI workers. | ||
| `agent_debate_status` for consensus sessions and `run_observable_events` with a | ||
| cursor when a run/session locator exists. | ||
| - `trace_query` and `trace_summary` are diagnostics, not a replacement for live | ||
@@ -148,2 +164,4 @@ progress. A `cold-start` trace means no provider call has been recorded yet; | ||
| terminal status, cancellation, or an explicit user stop request. | ||
| - When relaying progress, include the latest phase/status, the cursor | ||
| (`after_seq`/`next_seq`), the next action, and whether the run is terminal. | ||
| </Progress_Visibility> | ||
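
The polling contract in the progress-visibility section can be sketched as a small cursor loop. This is a hypothetical illustration, not the Agestra client: `fetch_events` stands in for a `run_observable_events` call, and the response shape (`events`, `terminal`, and the per-event `phase` key) is an assumption. Only the `after_seq`/`next_seq` cursor names and the 30-60 second interval come from the text.

```python
from typing import Any, Callable, Dict

# Assumed response shape (hypothetical):
#   {"events": [{"phase": str}, ...], "next_seq": int, "terminal": bool}
Response = Dict[str, Any]


def relay_progress(
    fetch_events: Callable[[int], Response],  # stand-in for run_observable_events
    relay: Callable[[str], None],             # sends a short status line to the user
    sleep: Callable[[float], None],           # injected so the loop is testable
    interval_s: float = 30.0,                 # within the 30-60 second window
    max_polls: int = 100,                     # safety bound, an assumption
) -> int:
    """Poll with a cursor until the run is terminal; return the final cursor."""
    after_seq = 0
    for _ in range(max_polls):
        resp = fetch_events(after_seq)
        events = resp["events"]
        next_seq = resp["next_seq"]
        terminal = resp["terminal"]
        phase = events[-1]["phase"] if events else "no new events"
        next_action = "write report" if terminal else "poll again"
        # Each update carries phase/status, cursor, next action, and terminal flag.
        relay(
            f"phase={phase} after_seq={after_seq} next_seq={next_seq} "
            f"next={next_action} terminal={terminal}"
        )
        after_seq = next_seq
        if terminal:
            break
        sleep(interval_s)
    return after_seq
```

Injecting `sleep` and `fetch_events` keeps the loop independent of any real transport, so the same shape could wrap `agent_debate_status` for consensus sessions.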
@@ -156,5 +174,4 @@ | ||
| - `assignee`: provider id, `agestra-research`, `agestra-debate`, or | ||
| `agestra-implementer` | ||
| - `domain`: idea, design, review, qa, security, implement, or research | ||
| - `assignee`: provider id, `agestra-research`, or `agestra-debate` | ||
| - `domain`: idea, design, review, qa, security, or research | ||
| - `lens`: the concrete lens bundle to apply | ||
@@ -168,3 +185,3 @@ - `question`: the narrow question this run must answer | ||
| Split broad work into several clear research/debate/implementation assignments. | ||
| Split broad work into several clear research, debate, evidence, or verification assignments. | ||
| The same `agestra-research` agent can run more than once with different lenses. | ||
@@ -174,4 +191,4 @@ </Assignment_Prompt_Crafting> | ||
| <Research_And_Consensus> | ||
| Domain skills provide the domain-specific question sheet output. Do not repeat | ||
| the full domain interview when the handoff packet already contains target, | ||
| Workflow skills provide the workflow profile and questionSet output. Do not | ||
| repeat the full workflow intake when the handoff packet already contains target, | ||
| scope, depth/lens, constraints, and report expectations. | ||
@@ -202,15 +219,19 @@ | ||
| If provider-backed work needs a research topology but the handoff omitted it, | ||
| ask one concise topology question. This is a cost/latency gate, not a domain | ||
| clarification. If a host-level no-questions directive prevents asking, choose | ||
| Host-native first (`host-seeded`) and report that external investigation fan-out | ||
| was limited. | ||
| the team-lead MUST stop and run a mandatory design selection gate before any | ||
| provider fan-out. The three research topologies (조사 방식) produce different artifact contracts and | ||
| participant routes, so host-level no-questions directives, "keep going" wording, | ||
| or short user prompts DO NOT authorize a silent default. Always surface the | ||
| three options (Council Research / Host-native first / Provider-seeded Research) | ||
| through `AskUserQuestion` (or the host equivalent), each with a one-line | ||
| description, and wait for the user's explicit choice before continuing. | ||
| Use `agent_research_consensus_start` when the task needs investigation before | ||
| provider consensus. The host owns research planning, host-native research | ||
| collection, quality checks, consolidation, pre-agreement, debate input creation, | ||
| and final user-facing documents. Host-owned research should run through | ||
| Use `agent_research_start` when the task needs investigation before provider | ||
| consensus. Research start receives the workflow profile, prompt pack, | ||
| `questionSet`, `evidencePolicy`, research lenses, and investigator assignments, then produces | ||
| `research_submissions.json`, `research_transcript.json`, and `aggregation.json`. | ||
| It does not start debate. Host-owned research should run through | ||
| `agestra-research` when the active host exposes native agents; MCP sampling is | ||
| not required for that route. | ||
| Human-facing documents under `docs/agestra/` have exactly two roles: | ||
| Human-facing documents under `docs/reports/{workflow}/` have exactly two roles: | ||
@@ -244,4 +265,7 @@ - `*-aggregation.md`: a readable Markdown aggregation of participant comments, | ||
| "agestra-debate" }` | ||
| - `initial_aggregation.items`: the already prepared consensus items | ||
| - `metadata.taskLabel`: optional human label only | ||
| - `workflow`: artifact/report label only, not a debate-routing branch | ||
| - `questionSet`: the selected workflow profile's required questions and final | ||
| status contract | ||
| - `aggregation.items`: the team-lead-approved research or seed items | ||
| - `evidencePolicy`: item and stance evidence-type preservation rules | ||
@@ -253,4 +277,4 @@ Prefer a host-turn `agestra-debate` participant over the current host's external | ||
| Do not pass legacy research/source-document/specialist-injection fields. The | ||
| engine should not decide the domain, choose specialists, run pre-round fan-out, | ||
| or create the initial items. | ||
| engine must not decide the workflow, branch on `workflow`, choose specialists, | ||
| run pre-round fan-out, or create the initial items. | ||
| </Research_And_Consensus> | ||
@@ -266,9 +290,8 @@ | ||
| Provider-seeded topology, or when the user explicitly asks for it. | ||
| - Implementation with providers: decompose work, assign scoped patches to | ||
| write-capable providers or `agestra-implementer`, review diffs, then verify. | ||
| - Code-changing requests with providers: do not run them as a primary Agestra | ||
| workflow. Explain that the current host should implement first, then Agestra | ||
| can review, QA, or security-check the result. | ||
| - Host participant needed in consensus: add an explicit host-turn participant | ||
| routed to `agestra-debate`; submit its JSON answer with | ||
| `agent_consensus_submit_turn`. | ||
| - Persistent E2E test creation: only after QA/user approval, route a scoped | ||
| packet to `agestra-implementer` with `mode: e2e-test-authoring`. | ||
| </Team_Composition> | ||
@@ -292,58 +315,98 @@ | ||
| External providers may cross-check QA evidence, but browser/dev-server/runtime | ||
| flows and persistent E2E file creation remain host-owned. | ||
| Across all three QA topologies — Council QA, Host-native first QA, | ||
| Provider-seeded QA — browser/dev-server/runtime flows remain host-owned, and | ||
| external providers cross-check artifacts only. Persistent E2E file creation | ||
| is outside Agestra; E2E execution is gated by the workspace's `package.json` | ||
| `scripts.e2e` entry. | ||
| </QA_Boundary> | ||
| <QA_Brigade_Execution> | ||
| For `/agestra qa`, do not assume provider-backed mode just because providers are | ||
| configured. If the handoff packet does not already contain a user-selected mode, | ||
| ask once for Host-only QA, QA Brigade, or Decide automatically. | ||
| <QA_Topology_Execution> | ||
| For `/agestra qa`, the handoff packet's `topology` field is authoritative. | ||
| Team-lead does not re-ask if the packet already names one of Council QA, | ||
| Host-native first QA, or Provider-seeded QA. | ||
| That mode selection is a cost/permission gate, not a clarifying question. If a | ||
| host-level no-questions directive prevents asking, choose Host-only QA and | ||
| report that provider fan-out was skipped. Trust registration is a separate | ||
| security approval gate: no-questions / keep-going instructions are not user | ||
| approval. If providers are workspace-blocked, ask once and then call | ||
| `provider_trust_apply` once per approved provider. Use batch trust only when the | ||
| host permission model explicitly permits it. | ||
| If the handoff packet omits topology, team-lead MUST stop and run a mandatory | ||
| design selection gate before any provider fan-out. The three QA topologies (조사 방식) | ||
| produce different artifact contracts, participant routes, and evidence | ||
| weights, so host-level no-questions directives, "keep going" wording, or | ||
| short user prompts DO NOT authorize a silent default. Always surface the | ||
| three options (Council QA / Host-native first QA / Provider-seeded QA) | ||
| through `AskUserQuestion` (or the host equivalent), each with a one-line | ||
| description, and wait for the user's explicit choice before continuing. | ||
| Default QA Brigade is a fast host-prepared consensus path: | ||
| A host-only fallback is not a routing option for QA. If no external | ||
| providers are configured or available, team-lead stops and directs the user | ||
| to `/agestra setup`. | ||
| 1. Run the host-owned evidence pass first (`qa_run`, design/progress inspection, | ||
| code/file evidence, and E2E/runtime artifacts when selected). | ||
| Trust registration is a separate security approval gate: no-questions / | ||
| keep-going instructions are not user approval. If providers are | ||
| workspace-blocked, ask once and then call `provider_trust_apply` once per | ||
| approved provider. Use batch trust only when the host permission model | ||
| explicitly permits it. | ||
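
The gates described above can be sketched as a small resolver. The function, exception names, and packet shape are hypothetical, and the order of the two checks is an assumption. The source only states that the packet's `topology` field is authoritative, that a missing topology requires an explicit user choice, and that QA has no host-only fallback.

```python
from typing import Mapping, Sequence

# Topology names as written in the text; the exact packet encoding is an assumption.
QA_TOPOLOGIES = ("Council QA", "Host-native first QA", "Provider-seeded QA")


class NeedsUserChoice(Exception):
    """The user must pick a topology explicitly before any provider fan-out."""


class NeedsSetup(Exception):
    """No external provider is available; QA has no host-only fallback."""


def resolve_qa_topology(
    packet: Mapping[str, object],
    available_providers: Sequence[str],
) -> str:
    """Return the packet's topology, or raise so the caller stops and asks."""
    if not available_providers:
        raise NeedsSetup("No external providers available; run /agestra setup.")
    topology = packet.get("topology")
    if isinstance(topology, str) and topology in QA_TOPOLOGIES:
        return topology  # authoritative: do not re-ask
    # Deliberately no default branch: a no-questions directive or "keep going"
    # wording is not consent, so there is nothing to fall back to here.
    raise NeedsUserChoice("Ask the user to choose one of: " + ", ".join(QA_TOPOLOGIES))
```

A caller would catch `NeedsUserChoice` and surface the three options through `AskUserQuestion` or the host equivalent.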
| ### Council QA | ||
| 1. Select the QA workflow profile and call `agent_research_start`. | ||
| 2. Assign the 6 QA lenses to participants: executable evidence, | ||
| spec-to-code compliance, integration risk, edge/error states, test | ||
| adequacy, safety hygiene. | ||
| 3. Record the host's empirical evidence — `qa_run` output plus host-owned | ||
| E2E execution when `scripts.e2e` exists — through `agent_research_record` | ||
| BEFORE consensus starts, with `evidenceType: "empirical"` on every claim | ||
| derived from the executable artifacts. | ||
| 4. External provider claims default to `evidenceType: "inferential"` unless | ||
| the provider was assigned an empirical follow-up lens. | ||
| 5. Inherit research's council defaults for `max_rounds`. | ||
| ### Host-native first QA | ||
| 1. Run `qa_run` plus host-owned E2E execution when `scripts.e2e` exists | ||
| (gated by the workspace `package.json` `scripts.e2e` entry; absent | ||
| means E2E is skipped with a reason recorded). | ||
| 2. Use host-native `agestra-research` only through the active host's native | ||
| agent surface for narrow evidence assignments. Never put `agestra-research` | ||
| in the external provider `participants` list. | ||
| 3. Prepare `initial_aggregation.items` from concrete evidence. Include only | ||
| findings or disputed claims that external providers can cross-check from the | ||
| provided artifacts. | ||
| 4. Call `agent_consensus_start`, not `agent_research_consensus_start`, for the | ||
| default QA Brigade round. Use exact provider participants, optional | ||
| agent surface for narrow evidence assignments. Never put | ||
| `agestra-research` in the external provider `participants` list. | ||
| 3. Prepare `aggregation.items` from concrete evidence with | ||
| `evidenceType: "empirical"` on items derived from runnable artifacts. | ||
| 4. Call debate-only `agent_consensus_start` with `workflow: "qa"`, the QA | ||
| `questionSet`, `aggregation`, `evidencePolicy`, exact provider participants, optional | ||
| `participant_routes` for a host-native `agestra-debate` participant, | ||
| `max_rounds: 1` for Standard QA, and a bounded participant timeout. | ||
| 5. Poll `agent_debate_status` and `run_observable_events` when a locator is | ||
| available while provider work is running. Surface concise progress at least | ||
| every 30-60 seconds. If this agent is running in a background mode whose | ||
| progress cannot reach the user, tell the caller to poll and relay progress, | ||
| or fall back to Host-only QA for the current run. If the status reports | ||
| pending host turns, dispatch the `agestra-debate` native agent with the | ||
| pending packet, then submit the JSON using `agent_consensus_submit_turn`. | ||
| `max_rounds: 1`, and a bounded participant timeout. | ||
| 5. External provider stances on host empirical items default to | ||
| `evidenceType: "inferential"`; `"mixed"` only when the provider cites an | ||
| independent empirical artifact it actually inspected. | ||
| Use `agent_research_consensus_start` for QA only when the user explicitly asks | ||
| for deep external-provider research before consensus. In that exception, | ||
| external AI research and debate run in separate fresh sessions. The default QA | ||
| Brigade should avoid that extra research round because the host already owns the | ||
| executable QA evidence. | ||
| </QA_Brigade_Execution> | ||
| ### Provider-seeded QA | ||
| <E2E_Test_Authoring> | ||
| Persistent E2E work is an implementation sub-mode, not a standalone agent. | ||
| 1. Run the selected `seed_provider` first and record its claims with | ||
| `evidenceType: "inferential"`. | ||
| 2. Run the host's empirical evidence pass — `qa_run` plus host-owned E2E | ||
| execution when `scripts.e2e` exists — and append host claims with | ||
| `evidenceType: "empirical"`. Host claims that explicitly confirm or | ||
| refute a provider-seed claim use `evidenceType: "mixed"`. | ||
| 3. Call debate-only `agent_consensus_start` with `workflow: "qa"`, the QA | ||
| `questionSet`, `aggregation`, `evidencePolicy`, the seed provider + at least | ||
| one reviewer + the host-debate participant route, `max_rounds: 1`, and a | ||
| bounded participant timeout. | ||
| Only invoke `agestra-implementer` with `mode: e2e-test-authoring` after the | ||
| leader has an approved E2E work packet. In that mode the implementer may edit | ||
| only named E2E test files, fixtures, or test configuration. If the test exposes | ||
| a product bug or testability gap, it reports the problem instead of changing | ||
| product code inline. | ||
| </E2E_Test_Authoring> | ||
| ### Evidence-type policy (all three topologies) | ||
| Every QA claim carries `evidenceType`. Host empirical claims include an | ||
| `evidence_ref` (e.g., `docs/reports/qa/.../qa_run.log#L42-L58`). Two | ||
| `"inferential"` agree votes do not outweigh one `"empirical"` refutation: | ||
| the renderer surfaces the asymmetry, and the human reviewer decides. | ||
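
A minimal sketch of how the asymmetry might be flagged, assuming a hypothetical `Stance` record with `position` labels `"agree"` and `"refute"`. Only the `evidenceType` values come from the text. The function never tallies votes. It only marks items where inferential agreement faces an empirical refutation, leaving the decision to the human reviewer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stance:
    participant: str
    position: str       # "agree" or "refute" (hypothetical labels)
    evidence_type: str  # "empirical", "inferential", or "mixed"


def needs_human_review(stances: List[Stance]) -> bool:
    """True when an empirical refutation stands against inferential agreement."""
    empirical_refute = any(
        s.position == "refute" and s.evidence_type == "empirical" for s in stances
    )
    inferential_agree = any(
        s.position == "agree" and s.evidence_type == "inferential" for s in stances
    )
    # No majority count: two inferential agrees do not cancel one empirical refute.
    return empirical_refute and inferential_agree
```

For example, two provider stances that agree inferentially plus one host stance that refutes empirically yield `True`, so the item is surfaced rather than auto-accepted.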
| ### Host-native + progress routing (all three topologies) | ||
| Never substitute `agestra-research` with an external CLI provider; route any | ||
| host-debate participant via `participant_routes` to `agestra-debate`. Poll | ||
| `agent_debate_status` and `run_observable_events` at 30-60 second intervals | ||
| while provider work is running. If this agent is running in a background | ||
| mode whose progress cannot reach the user, tell the caller to poll and | ||
| relay progress, or stop and direct the user to `/agestra setup`. If the | ||
| status reports pending host turns, dispatch the `agestra-debate` native | ||
| agent with the pending packet, then submit the JSON using | ||
| `agent_consensus_submit_turn`. | ||
| </QA_Topology_Execution> | ||
| <Completion_Report> | ||
@@ -350,0 +413,0 @@ Before reporting completion, inspect the evidence yourself. Report: |
+16
-11
@@ -38,3 +38,3 @@ --- | ||
| - If **"Describe an idea"**: ask a follow-up "What would you like to design?" and proceed. | ||
| - If **"Find ideas first"**: run `/agestra idea` to generate suggestions through the research/consensus flow. After the user selects an idea from the results, save the idea decision under `docs/ideas/`, then continue to Step 2 with that as the subject. | ||
| - If **"Find ideas first"**: run `/agestra idea` to generate suggestions through the research and debate flow. After the user selects an idea from the results, save the idea decision under `docs/ideas/`, then continue to Step 2 with that as the subject. | ||
| - If **"Use saved idea"**: list relevant Markdown files under `docs/ideas/`, summarize the titles briefly, and ask which one to design using `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer the saved-idea selection. | ||
@@ -45,4 +45,10 @@ - If **"Use recent context"**: scan the current conversation for previously discussed ideas, improvements, or features. Summarize them and ask the user which to design using `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer the context selection. | ||
| After the subject is identified, gather only the missing design-contract details. Ask one question at a time using `AskUserQuestion` when available, or a plain numbered prompt as fallback. Keep choices short, and put explanations in a separate **Term help** block instead of stuffing long parentheticals into each option. Do not assume or infer missing design-contract values; an explicit `not sure — recommend a default`, `defer`, `none`, or `skip` answer is acceptable. | ||
| After the subject is identified, gather only the missing design-contract details. Ask one question at a time. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (`$ARGUMENTS`, the prior turn, or a saved-idea record being reused) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. For design the required dimensions are the "Need-to-know details" listed below; "Nice-to-know details" are optional. | ||
| Keep choices short, and put explanations in a separate **Term help** block instead of stuffing long parentheticals into each option. An explicit `not sure — recommend a default`, `defer`, `none`, or `skip` answer is acceptable. | ||
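
The bundle-skip rule can be sketched as a check over explicitly supplied answers. The dimension keys and function name here are hypothetical. The point is that an explicit `skip`, `defer`, or `none` counts as an answer, while an absent or empty value always becomes a question.

```python
from typing import Dict, List, Optional


def missing_dimensions(
    required: List[str],
    explicit_answers: Dict[str, Optional[str]],
) -> List[str]:
    """Return required dimensions with no explicit user-provided value.

    `explicit_answers` must contain only values the user actually stated;
    inferred values should never be placed in it.
    """
    missing = []
    for dim in required:
        value = explicit_answers.get(dim)
        if value is None or not value.strip():
            missing.append(dim)  # must be asked before any provider fan-out
    return missing
```

Each returned dimension would then be asked one at a time, through `AskUserQuestion` when available or a plain numbered prompt otherwise.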
| Need-to-know details: | ||
@@ -54,3 +60,3 @@ - **One-line identity:** what this app/feature is, what it should feel like, and what it must not become | ||
| - **Progress style:** one complete pass, MVP then completion, or staged checkpoints | ||
| - **Completion criteria:** how the user and AI workers will know the implementation is done | ||
| - **Completion criteria:** how the user and current-host implementation pass will know the work is done | ||
| - **Research notes:** existing patterns in this codebase, prior art / competing implementations, constraints / regulations, current-information needs, or `skip` | ||
@@ -76,5 +82,4 @@ - **Research assignments:** any preferred participant/lens split for the selected investigation, or `skip` | ||
| | **Provider-seeded Research** | One selected provider creates the first design seed/evidence artifact; host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for bounded design work, Council for broad architecture exploration, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a design clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three research topologies (조사 방식) produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -117,4 +122,4 @@ Default design principles: | ||
| - **User constraints:** any explicit constraints provided | ||
| - **Consensus domain:** `design` | ||
| - **Research topology / 조사 방식:** selected in Step 2 (`host-seeded`, `council`, `provider-seeded`, or `automatic`) | ||
| - **Workflow profile:** design profile with `workflow: "design"`, design `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** selected in Step 2 (`host-seeded`, `council`, `provider-seeded`, or `automatic`); seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -127,3 +132,3 @@ - **Research notes:** what the selected investigation should look for (existing patterns, prior art, constraints, current-information needs) | ||
| - **Target workspace root:** absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir` | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** preserve verbatim | ||
@@ -133,6 +138,6 @@ | ||
| - Building the participant team from focused research lenses, explicit host-turn debate participants, and external providers when applicable | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required; call debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader-authored final decision document under `docs/agestra/`. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final decision document under `docs/reports/design/`. | ||
| - Returning the research artifact paths, accepted decisions, excluded options, disputed items, and the final design document path under `docs/plans/`. | ||
@@ -139,0 +144,0 @@ |
+20
-15
@@ -36,4 +36,10 @@ --- | ||
| Then gather only the missing details, one question at a time. Ask with `AskUserQuestion` when available, or with a plain numbered prompt as fallback. Do not assume or infer values; treat each required field as a hard gate before provider fan-out. Include a skip option where useful so the user can explicitly answer `none`, `unspecified`, or `skip`. | ||
| Then gather only the missing details, one question at a time. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (`$ARGUMENTS`, the prior turn, or a saved-idea record being reused) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. For idea exploration the required dimensions are listed under "For **Existing project**, collect:" or "For **New project idea**, collect:" below depending on the selected starting point. | ||
| Include a skip option where useful so the user can explicitly answer `none`, `unspecified`, or `skip`. | ||
| For **Existing project**, collect: | ||
@@ -72,5 +78,4 @@ - **Intent:** additions, improvements, or broader inspiration for where the project could go next | ||
| | **Provider-seeded Research** | One selected provider creates the first seed/evidence artifact, then host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for bounded topics, Council for broad idea discovery, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a domain clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three research topologies (조사 방식) produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -105,4 +110,4 @@ Provider-backed `/agestra idea` uses the selected research topology flow: | ||
| - **Interview answers:** the details collected above, including research notes, research assignments, and free notes | ||
| - **Consensus domain:** `idea` | ||
| - **Research topology / 조사 방식:** selected in Step 2 (`host-seeded`, `council`, `provider-seeded`, or `automatic`) | ||
| - **Workflow profile:** idea profile with `workflow: "idea"`, idea `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** selected in Step 2 (`host-seeded`, `council`, `provider-seeded`, or `automatic`); seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -115,3 +120,3 @@ - **Research notes:** what the selected investigation should look for | ||
| - **Target workspace root:** absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir` | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** preserve verbatim | ||
@@ -121,9 +126,9 @@ | ||
| - Building the participant team from idea research lenses, explicit host-turn debate participants, and external providers. External providers are MCP/CLI/chat participants only. | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - For research fan-out, pass `domain: "idea"` to `agent_research_consensus_start`. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required; call debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - For research fan-out, pass the idea workflow profile, prompt pack, `questionSet`, and `evidencePolicy` to `agent_research_start`. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Writing the project-facing idea decision record under `docs/agestra/YYYY-MM-DD-idea-<session-id>-result.md` from the aggregation document, JSON artifacts, consensus state, and the user's interview answers. Preserve disputed positions and weak-evidence flags rather than averaging them away. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader final document target. | ||
| - Returning the research artifact paths, accepted/excluded/disputed items, carry-forward ideas, weak-evidence flags, and the `docs/agestra/` decision document path. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Writing the project-facing idea decision record under `docs/reports/idea/YYYY-MM-DD-idea-<session-id>-result.md` from the threaded aggregation document, JSON artifacts, `workflow_result.json`, and the user's interview answers. Preserve disputed positions and weak-evidence flags rather than averaging them away. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final document. | ||
| - Returning the research artifact paths, accepted/excluded/disputed items, carry-forward ideas, weak-evidence flags, and the `docs/reports/idea/` decision document path. | ||
@@ -135,3 +140,3 @@ **Do NOT from this command:** | ||
| Writing the final project-facing idea decision record under `docs/agestra/` is allowed and expected after the user chooses or approves ideas. `.agestra/workspace/` is the internal research/debate workspace, not the user's primary browsing surface. | ||
| Writing the final project-facing idea decision record under `docs/reports/idea/` is allowed and expected after the user chooses or approves ideas. `.agestra/workspace/` is the internal research/debate workspace, not the user's primary browsing surface. | ||
@@ -145,3 +150,3 @@ Direct execution bypasses team-lead's capability-based routing, optional trace-assisted signals, and consistency enforcement. Always go through team-lead in the provider-backed path. | ||
| - Separate research-backed opportunities, hypotheses, risky but interesting ideas, duplicates, weakly grounded ideas, and recommended next directions | ||
| - Name the idea decision document under `docs/agestra/` after the user chooses or approves ideas | ||
| - Name the idea decision document under `docs/reports/idea/` after the user chooses or approves ideas | ||
| - Show ideas grouped as Make Soon, Explore Next, and Inspiration Bank when available | ||
@@ -152,3 +157,3 @@ - In terminal/chat, show a title-only list first and point the user to the synthesis document for details | ||
| - Point out the 2-3 best candidates to take into `/agestra design`, where feasibility and scope will be evaluated | ||
| - If no idea has been selected yet, ask which idea or bundle of ideas should be saved before writing the `docs/agestra/YYYY-MM-DD-idea-<session-id>-result.md` decision record. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer the saved idea selection. | ||
| - If no idea has been selected yet, ask which idea or bundle of ideas should be saved before writing the `docs/reports/idea/YYYY-MM-DD-idea-<session-id>-result.md` decision record. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer the saved idea selection. | ||
| - Communicate in the user's language |
+79
-60
@@ -22,3 +22,3 @@ --- | ||
| Before any provider fan-out, run the shared workspace trust preflight for the exact current project root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers or fall back to Host-only QA. | ||
| Before any provider fan-out, run the shared workspace trust preflight for the exact current project root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers or stop and direct the user to `/agestra setup`. | |
@@ -36,38 +36,26 @@ ## Step 1: Determine QA target | ||
| ## Step 2: Choose QA execution mode | ||
| ## Step 2: Choose QA topology (조사 방식) | ||
| Ask the user once: | ||
| Available QA topologies (조사 방식): | |
| > Which QA execution mode should I use? | ||
| - **Council QA** — host and external providers all investigate independently with distinct QA lenses (executable evidence, spec-to-code compliance, integration risk, edge/error states, test adequacy, safety hygiene), then cross-review and debate. | ||
| - **Host-native first QA** — the host's native `agestra-research` agent collects executable QA evidence first (build / type / test, plus E2E when `package.json` `scripts.e2e` is present), persists the QA evidence artifact, and external providers challenge it through a short consensus round. | ||
| - **Provider-seeded QA** — the selected `seed_provider` produces a code/spec-analysis seed (inferential); the host then injects empirical evidence as a challenge stance and other reviewers weigh in. | ||
| | Option | Description | | ||
| |--------|-------------| | ||
| | **Host-only QA (Recommended)** | Fastest path. The current host collects evidence, runs `qa_run`, writes the QA report, and does not call external providers. | | ||
| | **QA Brigade** | The host collects evidence first, then enabled providers cross-check the prepared findings through a short consensus round. Takes longer. | | ||
| | **Decide automatically** | Use Host-only QA unless the target is broad/high-risk, the user explicitly asked for multiple AIs/providers, or the design has disputed evidence. | | ||
| This is a mandatory design selection gate. The three 조사 방식 produce different artifact contracts, participant routes, and evidence weights, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded QA is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/permission gate, not a clarifying question. Do not infer provider-backed QA merely because `/agestra qa` was invoked or providers are configured. Only skip this question when the user already explicitly requested current-host-only QA, named provider-backed/multi-AI QA, or chose a mode in the same request. If a host-level no-questions directive prevents asking, choose Host-only QA and report that provider fan-out was skipped. | ||
| If no external providers are configured or available, stop Agestra orchestration and direct the user to `/agestra setup`. A host-only fallback for QA is not a mode in this command. | ||
| ## Step 3: Choose QA depth | ||
| ## Step 3: Detect E2E coverage | ||
| Ask the user once: | ||
| E2E execution is host-owned and gated by explicit user intent. Before evidence collection, the host MUST read the workspace `package.json` (and, in a workspace monorepo, any nested package `package.json` files) and check whether a `scripts.e2e` entry exists: | ||
| > E2E verification can open the app and exercise real user flows. It gives stronger confidence, but can take more time, tokens, and local runtime setup. Which QA depth should I use? | ||
| - If `scripts.e2e` exists at the workspace root or in any nested package, run it as part of the QA evidence pass via the workspace's package manager (`npm run e2e`, `pnpm run e2e`, `yarn e2e`, etc.). Capture its stdout/stderr into the QA evidence artifact alongside build/type/test output. | ||
| - If `scripts.e2e` is absent everywhere, do NOT attempt E2E execution and do NOT search for `playwright.config.*`, `cypress.config.*`, or `tests/e2e/` directories. Presence of those files alone is not a reliable signal — abandoned framework setups produce false positives. Record in the QA report that E2E was not run because no `scripts.e2e` was declared, and recommend that the user add one to enable E2E in future runs. | ||
| - Standard QA evidence (`qa_run` for build/type/test) always runs regardless of `scripts.e2e` presence. | ||
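The `scripts.e2e` detection rule above can be sketched roughly as follows. This is a minimal illustration, not Agestra's implementation: the function names `find_e2e_manifests` and `e2e_decision`, the returned dictionary shape, and the choice to skip `node_modules` are all assumptions made for the example. It only decides whether E2E should run and records the reason; it does not invoke a package manager.

```python
import json
from pathlib import Path


def find_e2e_manifests(workspace_root: Path) -> list[Path]:
    """Return every package.json under workspace_root that declares scripts.e2e."""
    matches = []
    for manifest in sorted(workspace_root.rglob("package.json")):
        # Assumption: installed dependencies are not workspace packages.
        if "node_modules" in manifest.relative_to(workspace_root).parts:
            continue
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable manifest: treated as declaring no scripts.e2e
        scripts = data.get("scripts") if isinstance(data, dict) else None
        if isinstance(scripts, dict) and isinstance(scripts.get("e2e"), str):
            matches.append(manifest)
    return matches


def e2e_decision(workspace_root: Path) -> dict:
    """Decide whether the host should run E2E, with the reason for the QA report."""
    manifests = find_e2e_manifests(workspace_root)
    if not manifests:
        # No framework-config probing: only a declared script counts.
        return {"run_e2e": False, "reason": "no scripts.e2e declared"}
    paths = ", ".join(m.relative_to(workspace_root).as_posix() for m in manifests)
    return {"run_e2e": True, "reason": f"scripts.e2e present in {paths}"}
```

Note that the sketch deliberately never looks for `playwright.config.*`, `cypress.config.*`, or `tests/e2e/`, matching the rule that those files alone are not a reliable signal.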
| | Option | Description | | ||
| |--------|-------------| | ||
| | **Standard QA (Recommended)** | Design/progress compliance, build/type/test, Connection / Boundary Checks, error/empty states, and basic safety hygiene | | ||
| | **Full QA with E2E** | Standard QA plus existing E2E tests, temporary browser automation, screenshots when useful, and core real-user flows | | ||
| | **Decide automatically** | Include E2E when UI flow, auth, file operations, public release, destructive actions, or complex state transitions are central | | ||
| Even in multi-AI QA, E2E/runtime execution is host-owned across all three topologies. External providers may review the design, code, host QA report, command output, screenshots, traces, and E2E findings, but they must not run browser/dev-server flows or create persistent E2E files directly. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/permission gate, not a clarifying question. Do not infer QA depth unless the user chose `Decide automatically` or the request already explicitly asked for Standard QA or Full QA/E2E. If a host-level no-questions directive prevents asking, choose Standard QA and report that E2E was skipped unless the user explicitly requested it. | ||
| If the user chooses Full QA and persistent E2E test files must be added or updated, QA must ask approval and route test-file work to `agestra-implementer` with `mode: e2e-test-authoring`. QA itself remains read-only for source code and persistent tests. | ||
| Even in multi-AI QA, E2E/runtime execution is host-owned. External providers may review the design, code, host QA report, command output, screenshots, traces, and E2E findings, but they must not run browser/dev-server flows or create persistent E2E files directly. | ||
| QA writes a Markdown report under `docs/reports/qa/` unless the user explicitly asks for chat-only output. | ||
| If QA Brigade was selected, then ask focused provider cross-check notes before provider fan-out: spec-to-code mapping gaps, API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, suspected regressions, integration/regression risk, edge/error states, test adequacy, safety hygiene, E2E artifact interpretation, or `skip`. Ask whether any provider or host-native lens should receive a specific cross-check assignment, or whether team-lead should choose. | ||
| ## Step 4: Route execution | ||
@@ -77,19 +65,6 @@ | ||
| **Host-only path:** | ||
| Run the host-owned QA evidence pass directly: | ||
| Before any provider fan-out, run workspace trust readiness for the exact target root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers or stop and direct the user to `/agestra setup`. Pass `workspace_base_dir` explicitly to provider readiness/trust and consensus calls whenever the host workspace root may be ambiguous. | ||
| - Use `qa_run` for build/test verification where applicable. | ||
| - Inspect the design/progress contract, implementation files, command output, and runtime/E2E artifacts according to the selected depth. | ||
| - Use host-native `agestra-research` only as a bounded native helper assignment when the current host exposes native agents and the evidence question is narrow. | ||
| - Write the QA report under `docs/reports/qa/`. | ||
| - Do not call `agent_research_consensus_start`, `agent_consensus_start`, `ai_chat`, or external provider tools. | ||
| Hand off to `agestra:agestra-team-lead`. The canonical QA boundary (Host-native first QA and Provider-seeded QA default): | ||
| **No-provider stop path:** | ||
| If QA Brigade was selected but no external provider is available, stop provider orchestration and offer Host-only QA or `/agestra setup`. Do not spawn a provider-backed consensus with zero providers. | ||
| **Provider-backed path — QA Brigade selected and 1+ configured external providers available:** | ||
| Before any provider fan-out, run workspace trust readiness for the exact target root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers or fall back to Host-only QA. Pass `workspace_base_dir` explicitly to provider readiness/trust and consensus calls whenever the host workspace root may be ambiguous. | ||
| Hand off to `agestra:agestra-team-lead`. Provider-backed QA uses the fast host-prepared consensus path by default: | ||
| ```text | ||
@@ -102,38 +77,82 @@ 호스트가 조사한다. | ||
| The host must prepare QA evidence before provider fan-out. External providers cross-check the prepared evidence; they do not run the initial research phase. Build a self-contained handoff packet: | ||
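| Council QA loosens the first two lines to "호스트와 프로바이더가 함께 조사한다. 호스트가 집계한다." ("The host and providers investigate together. The host aggregates.") Across all three topologies the closing lines, "시스템이 토론한다." ("The system debates.") and "호스트가 문서화한다." ("The host documents."), are unchanged, and E2E execution remains host-owned. | |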
| External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. Do not carry a research conversation into the debate phase. | ||
| ### Council QA path | ||
| Council QA uses the split research/debate MCP route. Team-lead calls `agent_research_start` with: | ||
| - `workflow: "qa"` and the selected QA workflow profile | ||
| - the QA `questionSet` and `evidencePolicy` | ||
| - The 6 QA lenses as participant assignments: executable evidence, spec-to-code compliance, integration risk, edge/error states, test adequacy, safety hygiene | ||
| - Available external providers as participants alongside the host | ||
| - The host's empirical evidence (`qa_run` output, E2E output when `scripts.e2e` ran) preserved in `research_submissions.json`, `research_transcript.json`, and `aggregation.json`, with `evidenceType: "empirical"` on every claim derived from the executable artifacts | ||
| - Other provider participants emit claims with `evidenceType: "inferential"` (default) unless they were assigned an empirical follow-up lens | ||
| Council QA inherits research's council defaults (`max_rounds` follows the research command's default). | ||
| ### Host-native first QA path | ||
| Team-lead runs the host-owned QA evidence pass first via `qa_run` and (when `scripts.e2e` exists) host-run `npm run e2e`, then prepares `aggregation.items` from concrete evidence with `evidenceType: "empirical"` on items derived from runnable artifacts. Then call debate-only `agent_consensus_start` with: | ||
| - `workflow: "qa"` as an artifact/report label | ||
| - the QA `questionSet` | ||
| - the prepared `aggregation` | ||
| - the QA `evidencePolicy` | ||
| - Exact provider participants | ||
| - `participant_routes` for any host-native `agestra-debate` participant | ||
| - `max_rounds: 1` | ||
| - A bounded participant timeout | ||
| External provider stances on host empirical items default to `evidenceType: "inferential"` because they did not run the build/test/E2E themselves; they may set `"mixed"` only when they cite an independent empirical artifact they actually inspected. | ||
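As a rough illustration of the Host-native first call described above, the request could be assembled like this. The keys `workflow`, `questionSet`, `aggregation`, `evidencePolicy`, `participant_routes`, and `max_rounds` come from the text; everything else is an assumption for the sketch, including the helper name `build_host_native_first_request`, the `participants` and `participant_timeout_seconds` keys, and the default timeout. The real `agent_consensus_start` schema may differ.

```python
from typing import Optional

ALLOWED_EVIDENCE_TYPES = {"empirical", "inferential", "mixed"}


def build_host_native_first_request(
    question_set: dict,
    evidence_policy: dict,
    items: list,
    providers: list,
    participant_routes: Optional[dict] = None,
    timeout_seconds: int = 300,  # hypothetical default for the bounded timeout
) -> dict:
    """Assemble a debate-only request from host-prepared aggregation items."""
    if not providers:
        # Mirrors the no-provider stop path: never start with zero providers.
        raise ValueError("no external providers available; run /agestra setup")
    for item in items:
        if item.get("evidenceType") not in ALLOWED_EVIDENCE_TYPES:
            raise ValueError(f"item {item.get('id')!r} lacks a valid evidenceType")
    return {
        "workflow": "qa",  # artifact/report label only
        "questionSet": question_set,
        "aggregation": {"items": list(items)},
        "evidencePolicy": evidence_policy,
        "participants": list(providers),  # hypothetical key name
        "participant_routes": dict(participant_routes or {}),
        "max_rounds": 1,
        "participant_timeout_seconds": timeout_seconds,  # hypothetical key name
    }
```

Validating `evidenceType` before the call keeps untagged claims out of the debate, which is the property the QA evidence policy depends on.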
| ### Provider-seeded QA path | ||
| Team-lead asks the user which configured, available provider should seed (Step 2 may have already captured this; do not re-ask). Then: | ||
| 1. Run the selected `seed_provider` to produce a code/spec-analysis seed; record its claims with `evidenceType: "inferential"`. | ||
| 2. Run the host's empirical evidence pass — host-owned `qa_run` plus host-owned E2E execution when `scripts.e2e` exists — and append host claims with `evidenceType: "empirical"`. Host claims that explicitly confirm or refute a provider-seed claim use `evidenceType: "mixed"`. | ||
| 3. Call debate-only `agent_consensus_start` with `workflow: "qa"`, the QA `questionSet`, prepared `aggregation`, `evidencePolicy`, the seed provider + at least one reviewer + the host-debate participant route, `max_rounds: 1`, and a bounded participant timeout. | ||
| ### No-provider stop path | ||
| If no external providers are configured or available, stop Agestra orchestration and direct the user to `/agestra setup`. Do not spawn a provider-backed consensus with zero providers, and do not silently substitute a host-only fallback. | ||
| ### Handoff packet (all three paths) | ||
| Build a self-contained handoff packet with: | ||
| - **Domain:** `qa` | ||
| - **Submode:** `qa-only` | ||
| - **Mode:** `qa-brigade` (selected by the user; do not re-ask) | ||
| - **QA formation:** QA Brigade | ||
| - **Topology (조사 방식):** Council QA / Host-native first QA / Provider-seeded QA (selected by the user in Step 2; do not re-ask) | ||
| - **Seed provider:** when topology is Provider-seeded QA | ||
| - **QA target:** from Step 1 | ||
| - **QA depth:** Standard QA / Full QA with E2E / Decide automatically | ||
| - **E2E status:** ran / skipped, with the reason ("scripts.e2e present in {path}" or "no scripts.e2e declared") | ||
| - **E2E/runtime execution:** host-owned only; external providers cross-validate artifacts and findings, not browser/dev-server execution | ||
| - **Design doc reference:** path under `docs/plans/` | ||
| - **Report artifact path expectation:** `docs/reports/qa/YYYY-MM-DD-qa-[target].md` | ||
| - **Consensus domain:** `qa` | ||
| - **Workflow profile:** QA profile with `workflow: "qa"`, QA `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Connection / Boundary Checks:** API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, and E2E artifact interpretation when E2E ran | ||
| - **Research notes:** what the host-owned evidence pass should look for (spec-to-code gaps, boundary mismatches, regressions, integration risk, edge/error states, test adequacy, safety hygiene) | ||
| - **Cross-check assignments:** optional provider/lens rows for the short consensus round, or "team-lead choose" | ||
| - **Host-native route:** run active-host `agestra-research` for bounded QA evidence lenses before provider cross-check when useful; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
| - **Evidence type policy:** every claim emitted into the ledger MUST carry `evidenceType`. Host empirical claims set `"empirical"` with an `evidence_ref` to the `qa_run` artifact path/line; provider inferential claims set `"inferential"`; a host claim that explicitly confirms or refutes a provider claim sets `"mixed"`. Two `"inferential"` agree votes do not outweigh one `"empirical"` refutation: the renderer surfaces the asymmetry and the human reviewer decides. | |
| - **Host-native route:** route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
| - **Available providers:** from `environment_check`; include configured providers when their detected model capability is suitable, using read-only QA/review tools so verification cannot modify source files | ||
| - **Requested providers:** explicit names captured from user wording; otherwise "all configured and available review-capable providers" | ||
| - **QA lens handoff:** when a host QA/review/security perspective is needed, team-lead assigns `agestra-research` focused native-agent lenses before provider fan-out and includes that evidence in the host-prepared `initial_aggregation`. Do not list `agestra-research` as an external provider participant. | ||
| - **Brigade lenses:** host executable evidence, spec-to-code compliance, implementation progress truthfulness, integration/regression risk, edge/error states, test adequacy, basic safety hygiene, and E2E artifact review when E2E ran | ||
| - **QA-only boundary:** QA-only mode does not modify product code; connection or boundary defects are findings until the user approves a separate implementation task | ||
| - **JSON finding flow:** candidate findings become `ITEM-*` ledger items; participants use the existing `agree` / `disagree` / `opinion` / `revise` stance contract; only ledger-accepted items affect the final verdict | ||
| - **JSON finding flow:** candidate findings become `aggregation.items`; debate participants answer each required QA `questionSet` question with allowed verdicts, stance evidence type, and evidence refs; only `workflow_result.json` final-status items affect the final verdict | ||
| - **Locale:** from `setup_status` | ||
| - **Target workspace root:** absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir` | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** preserve verbatim | ||
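The evidence type policy in the handoff packet says that inferential agreement must not silently outweigh an empirical refutation. One way to surface that asymmetry is sketched below. The stance shape (`position` with values `agree` / `refute`, plus `evidenceType`) and the function name `flag_evidence_asymmetries` are hypothetical, chosen only for illustration; the function flags items for the human reviewer and makes no verdict of its own.

```python
def flag_evidence_asymmetries(items: list) -> list:
    """Flag items where inferential agreement faces an empirical refutation."""
    flagged = []
    for item in items:
        stances = item.get("stances", [])
        inferential_agree = [
            s for s in stances
            if s.get("position") == "agree" and s.get("evidenceType") == "inferential"
        ]
        empirical_refute = [
            s for s in stances
            if s.get("position") == "refute" and s.get("evidenceType") == "empirical"
        ]
        if inferential_agree and empirical_refute:
            # Report counts only; weighing them is left to the reviewer.
            flagged.append({
                "id": item.get("id"),
                "inferential_agree": len(inferential_agree),
                "empirical_refute": len(empirical_refute),
            })
    return flagged
```

Returning counts rather than a winner keeps the sketch aligned with the policy: the renderer shows the imbalance and a person decides.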
| Team-lead owns running the host-owned QA evidence pass, then preparing `initial_aggregation.items` from concrete evidence and calling `agent_consensus_start` with `domain` represented only in metadata, exact provider participants, `participant_routes` for any host-native `agestra-debate` participant, `max_rounds: 1` for Standard QA, and a bounded participant timeout. Team-lead must poll `agent_debate_status` and `run_observable_events` when a locator is available, then surface concise progress at least every 30-60 seconds while provider work is running. When the status reports pending host turns, team-lead dispatches the native `agestra-debate` agent and submits the JSON with `agent_consensus_submit_turn`. If the current host cannot surface progress from a background team-lead, the caller must poll and relay progress, or choose Host-only QA for the current run. | ||
| Team-lead polls `agent_debate_status` and `run_observable_events` when a locator is available, then surfaces concise progress at least every 30-60 seconds while provider work is running. When the status reports pending host turns, team-lead dispatches the native `agestra-debate` agent and submits the JSON with `agent_consensus_submit_turn`. If the current host cannot surface progress from a background team-lead, the caller must poll and relay progress, or stop and direct the user to `/agestra setup`. | ||
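The polling and relay behavior above might look roughly like the loop below. It is a sketch under stated assumptions: `poll_status`, `report`, and `dispatch_host_turn` are injected callables standing in for `agent_debate_status`, the host's progress surface, and the `agestra-debate` dispatch plus `agent_consensus_submit_turn`; the status keys `phase`, `pending_host_turns`, and `done` are hypothetical and not taken from the real tool output.

```python
import time
from typing import Callable


def relay_progress(
    poll_status: Callable[[], dict],
    report: Callable[[str], None],
    dispatch_host_turn: Callable[[object], None],
    interval_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 1000,
) -> int:
    """Poll until done, surfacing a phase update on every poll; return poll count."""
    if not 30 <= interval_seconds <= 60:
        raise ValueError("progress interval must be between 30 and 60 seconds")
    polls = 0
    while polls < max_polls:
        status = poll_status()
        polls += 1
        report(f"phase: {status.get('phase', 'unknown')}")
        # Pending host turns are handed to the native debate agent.
        for turn in status.get("pending_host_turns", []):
            dispatch_host_turn(turn)
        if status.get("done"):
            break
        sleep(interval_seconds)
    return polls
```

Injecting `sleep` keeps the loop testable without waiting, and `max_polls` bounds the loop if the status source never reports completion.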
| Do not call `agent_research_consensus_start` for the default QA Brigade path. That tool is reserved for an explicit deep provider-research mode; in that exception, External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. Default QA Brigade must avoid the extra external research round because QA already has host-owned executable evidence. | ||
| ### Council QA MCP routing note | ||
| Council QA is the topology that calls `agent_research_start` before debate. Host-native first QA and Provider-seeded QA may prepare `aggregation.items` directly, but all three topologies start debate with `agent_consensus_start` using `workflow`, `questionSet`, `aggregation`, and `evidencePolicy`. The debate engine does not branch on `workflow`; QA behavior comes from the supplied profile and question set. | ||
| ## Step 5: Present the final result | ||
| When QA returns: | ||
| - State QA execution mode | ||
| - State QA depth and whether E2E was run | ||
| - State QA topology (Council QA / Host-native first QA / Provider-seeded QA) | ||
| - State whether E2E was run, and the reason (`scripts.e2e` path or "no scripts.e2e declared") | |
| - Link or name the design document used | ||
@@ -143,6 +162,6 @@ - Link the QA report artifact under `docs/reports/qa/` | ||
| - Show PASS / CONDITIONAL PASS / FAIL | ||
| - In QA Brigade mode, summarize participants, assigned lenses, accepted ledger items, excluded ledger items, open/opinion items, consensus, and notable dissenting findings | ||
| - In Council QA, Host-native first QA, and Provider-seeded QA modes, summarize participants, assigned lenses, accepted ledger items, excluded ledger items, open/opinion items, consensus, notable dissenting findings, and any empirical-vs-inferential evidence asymmetries flagged in the report ledger | ||
| - Summarize progress-table mismatches, design gaps, build/test failures, E2E failures, and basic safety hygiene risks | ||
| - If QA returned `E2E_TEST_WORK_REQUEST`, ask the user whether to create or update persistent E2E tests. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. If approved, route the request to `agestra:agestra-implementer` with `mode: e2e-test-authoring` or team-lead as a separate E2E test-writing task, then re-run QA after tests exist. If declined, record E2E as residual risk. Do not infer approval. | ||
| - If persistent E2E coverage is missing, list the recommended scenarios and record the gap as residual risk. Do not ask Agestra to create or update test files. | ||
| - Recommend `/agestra review` for critique or `/agestra security` for dedicated security audit when needed | ||
| - Communicate in the user's language |
+185
-34
| --- | ||
| description: "Run domain-specific research with Host-native first, council, or provider-seeded topology" | ||
| description: "Run workflow-profile research with Host-native first, council, or provider-seeded topology" | ||
| argument-hint: "[domain] [topic or question]" | ||
@@ -31,3 +31,2 @@ --- | ||
| - `security` | ||
| - `implement` | ||
| - `research` | ||
@@ -48,3 +47,3 @@ | ||
| - Standalone `/agestra research` produces research artifacts and a human report; it does not create a bundled participant for a later domain debate. | ||
| - When research should continue into idea/design/review/security/qa/implement consensus, hand off to team-lead to call `agent_research_consensus_start` for the target domain instead of chaining a research-domain debate into a second debate. | ||
| - When research should continue into idea/design/review/security/qa consensus, hand off to team-lead to call `agent_research_start` for the target workflow, inspect `aggregation.json`, then start debate separately with `agent_consensus_start`. | ||
| - External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. | ||
@@ -68,10 +67,7 @@ | ||
| If the user already chose one, validate that it fits the domain and continue. | ||
| If not, propose one recommendation with a short reason and ask for approval. | ||
| This is a cost/latency gate, not a clarifying question. If a host-level | ||
| no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report | ||
| that broader provider investigation was skipped. | ||
| If the user already chose one in the request, validate that it fits the domain and continue. | ||
| Otherwise this is a mandatory design selection gate. The three 조사 방식 produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. | ||
| If no external providers are available, stop Agestra orchestration and tell the user to run setup or handle the research directly outside Agestra. | ||
| Host-native first means the active host's native `agestra-research` agent creates the first seed/evidence document, persists it through workspace document tooling, and external participants challenge it through `domain: "research"`. Record it internally as `host-seeded`. It is provider-backed research, not a host-only multi-AI mode. | ||
| Host-native first means the active host's native `agestra-research` agent creates the first seed/evidence document, persists it through workspace document tooling, and external participants challenge it through the supplied research workflow profile and question set. Record it internally as `host-seeded`. It is provider-backed research, not a host-only multi-AI mode. | ||
@@ -85,5 +81,5 @@ Provider-seeded Research means the selected `seed_provider` creates the first seed/evidence artifact, then reviewer participants independently challenge that seed. The seed provider never commands reviewers; Agestra team-lead/moderator remains the orchestrator. | ||
| - research target domain | ||
| - domain-specific investigation items | ||
| - workflow-profile investigation items | ||
| - runtime lenses and roles | ||
| - AI/worker assignment table with explicit `domain`, `role`, `lens`, `question`, `deliverable`, and `expected_artifact` values | ||
| - participant assignment table with explicit `domain`, `role`, `lens`, `question`, `deliverable`, and `expected_artifact` values | ||
| - expected JSON artifacts | ||
@@ -101,3 +97,3 @@ - Markdown report target | ||
| For Host-native first (`host-seeded`), create the host seed/aggregation through the active host's native agent surface before provider fan-out. Normalize it into `initial_aggregation.items`; do not pass it through `source_documents`. | ||
| For Host-native first (`host-seeded`), create the host seed/aggregation through the active host's native agent surface before provider fan-out. Normalize it into `aggregation.items`; do not pass it through legacy source-document fields. | ||
@@ -124,14 +120,65 @@ Host-native first requires at least one external reviewer participant outside the seed provider. If the user explicitly asks for host-only artifact capture, use `artifact_only_diagnostic: true` and clearly state that no multi-AI consensus was produced. | ||
| - Required JSON artifacts | ||
| - Progress contract: surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - Progress contract: surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - Original user request verbatim | ||
| For Council Research, the team-lead must call `agent_consensus_start` only after approval and after preparing `initial_aggregation.items`: | ||
| For Council Research, the team-lead must call `agent_research_start` first. Debate starts only after approval and after inspecting `aggregation.json`: | ||
| `agent_research_start` is research-only. It receives the selected workflow profile, | ||
| prompt pack, `questionSet`, `evidencePolicy`, research lenses, and investigator | ||
| assignments, then writes `research_submissions.json`, | ||
| `research_transcript.json`, and `aggregation.json`. It does not start debate. | ||
| The debate-only `agent_consensus_start` runs from prepared `aggregation`, supplied | ||
| `questionSet`, and `evidencePolicy`; `workflow` is a report/artifact label only, | ||
| not a debate routing branch. | ||
| ```json | ||
| { | ||
| "domain": "research", | ||
| "initial_aggregation": { | ||
| "summary": "<approved host aggregation summary>", | ||
| "items": [] | ||
| "workflow": "research", | ||
| "questionSet": { | ||
| "id": "research.findings-and-sources", | ||
| "title": "Research findings and sources validation", | ||
| "requiredQuestions": [ | ||
| { | ||
| "id": "claim", | ||
| "prompt": "Is the research claim specific and answerable?", | ||
| "verdictField": "claimVerdict", | ||
| "allowedVerdicts": ["yes", "no", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "source", | ||
| "prompt": "Is the claim supported by traceable source evidence?", | ||
| "verdictField": "sourceVerdict", | ||
| "allowedVerdicts": ["yes", "no", "partial", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "conflict", | ||
| "prompt": "Are conflicting sources or uncertainties preserved?", | ||
| "verdictField": "conflictVerdict", | ||
| "allowedVerdicts": ["yes", "no", "not-applicable", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "action", | ||
| "prompt": "What follow-up or decision does this finding support?", | ||
| "verdictField": "actionVerdict", | ||
| "allowedVerdicts": ["decision-ready", "needs-followup", "background-only", "rejected"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "status", | ||
| "prompt": "Should this be accepted, followed up, background context, or rejected?", | ||
| "verdictField": "finalStatus", | ||
| "allowedVerdicts": ["accepted", "needs_followup", "background", "rejected"], | ||
| "required": true | ||
| } | ||
| ], | ||
| "finalStatus": { | ||
| "field": "finalStatus", | ||
| "allowedValues": ["accepted", "needs_followup", "background", "rejected"] | ||
| } | ||
| }, | ||
| "aggregation": { "aggregationId": "<approved aggregation id>", "items": [] }, | ||
| "evidencePolicy": { "preserveItemEvidenceType": true, "preserveStanceEvidenceType": true }, | ||
| "participants": ["<explicit-consensus-participant>"] | ||
@@ -145,21 +192,112 @@ } | ||
| { | ||
| "domain": "research", | ||
| "workflow": "research", | ||
| "participants": ["host-seed", "<external-reviewer>"], | ||
| "initial_aggregation": { | ||
| "summary": "<host seed summary>", | ||
| "questionSet": { | ||
| "id": "research.findings-and-sources", | ||
| "title": "Research findings and sources validation", | ||
| "requiredQuestions": [ | ||
| { | ||
| "id": "claim", | ||
| "prompt": "Is the research claim specific and answerable?", | ||
| "verdictField": "claimVerdict", | ||
| "allowedVerdicts": ["yes", "no", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "source", | ||
| "prompt": "Is the claim supported by traceable source evidence?", | ||
| "verdictField": "sourceVerdict", | ||
| "allowedVerdicts": ["yes", "no", "partial", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "conflict", | ||
| "prompt": "Are conflicting sources or uncertainties preserved?", | ||
| "verdictField": "conflictVerdict", | ||
| "allowedVerdicts": ["yes", "no", "not-applicable", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "action", | ||
| "prompt": "What follow-up or decision does this finding support?", | ||
| "verdictField": "actionVerdict", | ||
| "allowedVerdicts": ["decision-ready", "needs-followup", "background-only", "rejected"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "status", | ||
| "prompt": "Should this be accepted, followed up, background context, or rejected?", | ||
| "verdictField": "finalStatus", | ||
| "allowedVerdicts": ["accepted", "needs_followup", "background", "rejected"], | ||
| "required": true | ||
| } | ||
| ], | ||
| "finalStatus": { | ||
| "field": "finalStatus", | ||
| "allowedValues": ["accepted", "needs_followup", "background", "rejected"] | ||
| } | ||
| }, | ||
| "aggregation": { | ||
| "aggregationId": "<host seed aggregation id>", | ||
| "items": [ | ||
| { "id": "HOST-SEED", "title": "<claim>", "claim": "<what external reviewers should challenge>" } | ||
| { "id": "HOST-SEED", "title": "<claim>", "claim": "<what external reviewers should challenge>", "evidenceType": "empirical" } | ||
| ] | ||
| } | ||
| }, | ||
| "evidencePolicy": { "preserveItemEvidenceType": true, "preserveStanceEvidenceType": true } | ||
| } | ||
| ``` | ||
| For Provider-seeded Research, the team-lead must prepare the seed findings as `initial_aggregation.items`, then call `agent_consensus_start` with selected consensus participants: | ||
| For Provider-seeded Research, the team-lead must prepare the seed findings as `aggregation.items`, then call `agent_consensus_start` with selected consensus participants: | ||
| ```json | ||
| { | ||
| "domain": "research", | ||
| "workflow": "research", | ||
| "participants": ["<configured-seed-provider>", "<reviewer-provider-or-host-participant>"], | ||
| "initial_aggregation": { | ||
| "summary": "<provider seed summary>", | ||
| "questionSet": { | ||
| "id": "research.findings-and-sources", | ||
| "title": "Research findings and sources validation", | ||
| "requiredQuestions": [ | ||
| { | ||
| "id": "claim", | ||
| "prompt": "Is the research claim specific and answerable?", | ||
| "verdictField": "claimVerdict", | ||
| "allowedVerdicts": ["yes", "no", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "source", | ||
| "prompt": "Is the claim supported by traceable source evidence?", | ||
| "verdictField": "sourceVerdict", | ||
| "allowedVerdicts": ["yes", "no", "partial", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "conflict", | ||
| "prompt": "Are conflicting sources or uncertainties preserved?", | ||
| "verdictField": "conflictVerdict", | ||
| "allowedVerdicts": ["yes", "no", "not-applicable", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "action", | ||
| "prompt": "What follow-up or decision does this finding support?", | ||
| "verdictField": "actionVerdict", | ||
| "allowedVerdicts": ["decision-ready", "needs-followup", "background-only", "rejected"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "status", | ||
| "prompt": "Should this be accepted, followed up, background context, or rejected?", | ||
| "verdictField": "finalStatus", | ||
| "allowedVerdicts": ["accepted", "needs_followup", "background", "rejected"], | ||
| "required": true | ||
| } | ||
| ], | ||
| "finalStatus": { | ||
| "field": "finalStatus", | ||
| "allowedValues": ["accepted", "needs_followup", "background", "rejected"] | ||
| } | ||
| }, | ||
| "aggregation": { | ||
| "aggregationId": "<provider seed aggregation id>", | ||
| "items": [ | ||
@@ -169,17 +307,19 @@ { | ||
| "title": "<seed claim>", | ||
| "claim": "<what reviewers should challenge>" | ||
| "claim": "<what reviewers should challenge>", | ||
| "evidenceType": "inferential" | ||
| } | ||
| ] | ||
| } | ||
| }, | ||
| "evidencePolicy": { "preserveItemEvidenceType": true, "preserveStanceEvidenceType": true } | ||
| } | ||
| ``` | ||
| If the seed provider artifact already exists, convert its supported claims into `initial_aggregation.items` before starting consensus. | ||
| If the seed provider artifact already exists, convert its supported claims into `aggregation.items` before starting consensus. | ||
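The conversion step can be sketched as below. This is a minimal illustration, not Agestra's implementation: the seed artifact shape (`claims`, `supported`), the helper name `seedToAggregation`, and the `SEED-n` id scheme are all assumptions. Only the item fields (`id`, `title`, `claim`, `evidenceType`) mirror the payload shown in the spec.

```javascript
// Sketch only: the seed artifact shape (`claims`, `supported`) and the
// `SEED-n` id scheme are assumptions for illustration, not Agestra formats.
function seedToAggregation(seedArtifact, aggregationId) {
  const items = seedArtifact.claims
    .filter((entry) => entry.supported === true) // keep supported claims only
    .map((entry, index) => ({
      id: `SEED-${index + 1}`,
      title: entry.title,
      claim: entry.claim,
      evidenceType: entry.evidenceType,
    }));
  return { aggregationId, items };
}

// Example: one supported claim survives, the unsupported one is dropped.
const aggregation = seedToAggregation(
  {
    claims: [
      { supported: true, title: "Cache TTL", claim: "TTL is never refreshed", evidenceType: "empirical" },
      { supported: false, title: "Unverified", claim: "Not backed by evidence", evidenceType: "inferential" },
    ],
  },
  "agg-example"
);
console.log(JSON.stringify(aggregation, null, 2));
```

The resulting object would be passed as the `aggregation` field of the debate request; unsupported claims stay in the seed artifact and never reach reviewers.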
| Team-lead owns provider/worker fan-out, consensus coordination, JSON ledger flow, finding-validator phase, and final synthesis. | ||
| Team-lead owns provider/host-participant fan-out, consensus coordination, JSON ledger flow, finding-validator phase, and final synthesis. | ||
| Runtime boundary: native researcher/helper agents are created only by the active host layer. External providers named in the host-owned assignment plan participate through MCP, CLI worker, or chat routes; they do not create, spawn, or manage Claude/Codex/Gemini native agents. | ||
| Runtime boundary: native researcher/helper agents are created only by the active host layer. External providers named in the host-owned assignment plan participate through MCP or chat routes; they do not create, spawn, or manage Claude/Codex/Gemini native agents. | ||
| This command must not call `agent_consensus_start` directly when external providers are involved until host-owned research preprocessing has produced `initial_aggregation.items`. | ||
| This command must not create a bundled research pseudo-participant or carry research bundles through `source_documents`. | ||
| This command must not call `agent_consensus_start` directly when external providers are involved until host-owned research preprocessing has produced `aggregation.items`. | ||
| This command must not create a bundled research pseudo-participant or carry research bundles through legacy source-document fields. | ||
@@ -195,2 +335,6 @@ When host-owned investigation material is produced as evidence for a provider-backed research workflow, record it through `agent_research_record` before the council or Host-native first (`host-seeded`) review consumes it. Include: | ||
| `agent_research_record` is only a host-owned evidence recording helper. It does | ||
| not replace `agent_research_start`, `aggregation.json`, or the separate | ||
| `agent_consensus_start` debate flow. | ||
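A pre-flight check for the debate phase might look like the sketch below. `checkDebateRequest` is a hypothetical helper, not an Agestra API. Treating `initial_aggregation` and `source_documents` as rejected legacy fields, and requiring the status question's verdicts to equal `finalStatus.allowedValues`, are assumptions drawn from the payload examples in this spec.

```javascript
// Sketch only: `checkDebateRequest` is a hypothetical pre-flight helper,
// not an Agestra API. It returns a list of problems instead of throwing.
const LEGACY_FIELDS = ["initial_aggregation", "source_documents"];

function checkDebateRequest(request) {
  const problems = [];
  for (const field of LEGACY_FIELDS) {
    if (field in request) problems.push(`legacy field present: ${field}`);
  }
  if (!request.aggregation || !Array.isArray(request.aggregation.items)) {
    problems.push("aggregation.items is missing; run research preprocessing first");
  }
  if (!request.evidencePolicy) problems.push("evidencePolicy is missing");

  const questionSet = request.questionSet;
  if (!questionSet || !Array.isArray(questionSet.requiredQuestions)) {
    problems.push("questionSet.requiredQuestions is missing");
    return problems;
  }
  // Assumed invariant: the question bound to finalStatus.field offers exactly
  // the values listed in finalStatus.allowedValues.
  const finalStatus = questionSet.finalStatus ?? {};
  const allowedValues = finalStatus.allowedValues ?? [];
  const statusQuestion = questionSet.requiredQuestions.find(
    (question) => question.verdictField === finalStatus.field
  );
  if (
    !statusQuestion ||
    statusQuestion.allowedVerdicts.length !== allowedValues.length ||
    !statusQuestion.allowedVerdicts.every((verdict) => allowedValues.includes(verdict))
  ) {
    problems.push("status question verdicts do not match finalStatus.allowedValues");
  }
  return problems;
}
```

An empty result means the request carries the four inputs the debate phase expects (`workflow` label aside): `questionSet`, `aggregation`, `evidencePolicy`, and no legacy fields.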
| ## Step 5: Present results | ||
@@ -200,2 +344,9 @@ | ||
| - `research_submissions.json` | ||
| - `research_transcript.json` | ||
| - `aggregation.json` | ||
| - `debate_transcript.json` | ||
| - `workflow_result.json` | ||
| - threaded aggregation document | ||
| - concise final decision document | ||
| - Markdown report path | ||
@@ -202,0 +353,0 @@ - JSON artifact index path |
+13
-9
@@ -39,4 +39,9 @@ --- | ||
| If the user chooses **Specific area**, ask for the path or description. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not proceed to review lens selection or provider routing until the review target is explicit. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (`$ARGUMENTS`, the prior turn, or a saved design record being reused) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. For review the required dimensions are the review target (Step 1), review lens (Step 2), and the research-notes question that gates research assignments — depth and tone are optional defaults. | ||
| Do not proceed to review lens selection or provider routing until the review target is explicit. | ||
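The bundle-skip rule can be read as a simple check over explicitly stated values. The sketch below assumes hypothetical dimension keys (`reviewTarget`, `reviewLens`, `researchNotes`) and a hypothetical helper name; the spec names the dimensions but not any data structure for them.

```javascript
// Sketch only: dimension keys are hypothetical names for the review interview.
const REVIEW_REQUIRED_DIMENSIONS = ["reviewTarget", "reviewLens", "researchNotes"];

// `explicitValues` holds only what the user literally stated; inferred values
// must never be placed in it.
function missingDimensions(requiredDimensions, explicitValues) {
  return requiredDimensions.filter((name) => {
    const value = explicitValues[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

const missing = missingDimensions(REVIEW_REQUIRED_DIMENSIONS, {
  reviewTarget: "src/auth/",
  depth: "quick", // optional default, not a gate
});
console.log(missing); // the dimensions that still have to be asked
```

Here `reviewLens` and `researchNotes` are reported as missing, so both questions must be asked before any provider fan-out; the optional `depth` value neither satisfies nor blocks a gate.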
| ## Step 2: Choose review lens | ||
@@ -88,5 +93,4 @@ | ||
| | **Provider-seeded Research** | One selected provider creates the first review seed/evidence artifact; host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for scoped reviews, Council for whole-project/deep reviews, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a review clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three research topologies (조사 방식) produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
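When the user explicitly picks **Decide automatically**, the mapping described in the table could be sketched as follows. The function name, the `reviewScope` values, and the precedence (a named lead provider wins over scope) are assumptions; only the topology identifiers `host-seeded`, `council`, and `provider-seeded` come from the spec.

```javascript
// Sketch only: runs after the user has explicitly chosen "Decide automatically".
// `reviewScope` values and the precedence order are illustrative assumptions.
function resolveAutomaticTopology({ reviewScope, leadProvider }) {
  // Provider-seeded applies only when the user named a provider to lead.
  if (typeof leadProvider === "string" && leadProvider.trim() !== "") {
    return { topology: "provider-seeded", seedProvider: leadProvider.trim() };
  }
  // Whole-project or deep reviews go to the council.
  if (reviewScope === "whole-project" || reviewScope === "deep") {
    return { topology: "council" };
  }
  // Everything else is treated as a scoped review.
  return { topology: "host-seeded" };
}

console.log(resolveAutomaticTopology({ reviewScope: "scoped" }));
console.log(resolveAutomaticTopology({ reviewScope: "deep" }));
console.log(resolveAutomaticTopology({ reviewScope: "scoped", leadProvider: "codex" }));
```

The helper never guesses a seed provider: without an explicit `leadProvider` it cannot return `provider-seeded`, which matches the "Do not infer it" requirement.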
@@ -119,4 +123,4 @@ ## Step 4: Route execution | ||
| - **Report artifact path expectation:** `docs/reports/review/YYYY-MM-DD-review-[target].md` | ||
| - **Consensus domain:** `review` | ||
| - **Research topology / 조사 방식:** selected in Step 3 (`host-seeded`, `council`, `provider-seeded`, or `automatic`) | ||
| - **Workflow profile:** review profile with `workflow: "review"`, review `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** selected in Step 3 (`host-seeded`, `council`, `provider-seeded`, or `automatic`); seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -131,3 +135,3 @@ - **Research notes:** what the selected investigation should look for (regression-prone areas, blast radius, prior incidents, dependency concerns, current-information needs) | ||
| - **Target workspace root:** absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir` | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** preserve verbatim | ||
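The progress contract above can be sketched as a pure step function that a polling loop calls after each status response. The response shape (`trace`, `phase`, `nextCursor`, `finished`), the state object, and the helper name are assumptions; the real `agent_debate_status` and `run_observable_events` payloads may differ.

```javascript
// Sketch only: the response shape ({ trace, phase, nextCursor, finished }) and
// the state object are assumptions, not documented Agestra payloads.
const MIN_UPDATE_INTERVAL_MS = 30_000; // lower bound of the 30-60 second window

function nextProgressStep(state, response, nowMs) {
  // Keep the previous cursor when the response does not supply a new one.
  const cursor = response.nextCursor ?? state.cursor;
  // On a cold-start trace, fall back to the locally known phase.
  const phase = response.trace === "cold-start" ? state.localPhase : response.phase;
  const due = nowMs - state.lastUpdateAt >= MIN_UPDATE_INTERVAL_MS;
  return {
    state: { ...state, cursor, lastUpdateAt: due ? nowMs : state.lastUpdateAt },
    message: due ? `Phase: ${phase}` : null,
    keepMonitoring: response.finished !== true,
  };
}

const step = nextProgressStep(
  { cursor: "c1", lastUpdateAt: 0, localPhase: "collecting provider seeds" },
  { trace: "cold-start" },
  45_000
);
console.log(step.message, step.keepMonitoring);
```

A cold-start trace still yields a phase update and `keepMonitoring: true`, so the caller keeps polling instead of treating the empty trace as a stop signal.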
@@ -137,6 +141,6 @@ | ||
| - Building the participant team from focused review lenses, explicit host-turn debate participants, and external providers | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required; call debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader-authored final decision document under `docs/agestra/`. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final decision document under `docs/reports/review/`. | ||
| - Returning the research artifact paths, consensus table, disputed positions, review verdict, and the final report path under `docs/reports/review/`. | ||
@@ -143,0 +147,0 @@ |
+10
-7
@@ -36,4 +36,8 @@ --- | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not proceed to depth selection or provider routing until the security target/surface is explicit. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (`$ARGUMENTS`, the prior turn, or a saved design record being reused) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. For security the required dimensions are the security target/surface (Step 1) and security depth (Step 2); tool-permission approvals are a separate gate addressed later in this step. | ||
| Do not proceed to depth selection or provider routing until the security target/surface is explicit. | ||
| ## Step 2: Choose security depth | ||
@@ -65,5 +69,4 @@ | ||
| | **Provider-seeded Research** | One selected provider creates the first security seed/evidence artifact; host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for bounded audits, Council for broad/full security reviews, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a security clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three research topologies (조사 방식) produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -96,4 +99,4 @@ ## Step 4: Route execution | ||
| - **Report artifact path expectation:** `docs/reports/security/YYYY-MM-DD-security-[target].md` | ||
| - **Consensus domain:** `security` | ||
| - **Research topology / 조사 방식:** selected in Step 3 (`host-seeded`, `council`, `provider-seeded`, or `automatic`) | ||
| - **Workflow profile:** security profile with `workflow: "security"`, security `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** selected in Step 3 (`host-seeded`, `council`, `provider-seeded`, or `automatic`); seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -107,6 +110,6 @@ - **Research notes:** what the selected investigation should look for (secrets/keys, auth/authz boundaries, file/command execution, network exposure, dependency concerns, unsafe defaults) | ||
| - **Target workspace root:** absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir` | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** preserve verbatim | ||
| Team-lead owns resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. Team-lead must ensure external AI research and debate use separate fresh sessions when a research phase is used, must never create a bundled research pseudo-participant, and must never carry research bundles through `source_documents`. Inspect `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader-authored final decision document under `docs/agestra/`. The brigade must not run destructive exploit tests and must not install tools or run heavyweight/networked scans without explicit user approval. | ||
| Team-lead owns resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required. Debate starts only after team-lead inspects `aggregation.json` and calls debate-only `agent_consensus_start` with `workflow`, `questionSet`, `aggregation`, and `evidencePolicy`. Team-lead must ensure external AI research and debate use separate fresh sessions when a research phase is used, must never create a bundled research pseudo-participant, and must never carry research bundles through legacy source-document fields. Inspect `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final decision document under `docs/reports/security/`. The brigade must not run destructive exploit tests and must not install tools or run heavyweight/networked scans without explicit user approval. | ||
@@ -113,0 +116,0 @@ ## Step 5: Present the result |
+13
-9
@@ -23,5 +23,7 @@ # Agestra for Gemini CLI | ||
| - `/agestra:idea` | ||
| - `/agestra:implement` | ||
| Each command delegates to the shared workflow specs in `commands/*.md`. | ||
| Agestra does not implement product code or author persistent E2E test files. Use | ||
| Gemini CLI or the current host for code/test changes first, then run Agestra | ||
| QA/review/security on the result. | ||
@@ -35,5 +37,6 @@ ## Usage Rules | ||
| - Treat `commands/*.md` and `agents/*.md` as the canonical workflow and role assets. | ||
| - Keep native agent creation host-owned. Providers reached through MCP, CLI workers, or chat are participants only. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Use this host research consensus contract verbatim: | ||
| - Keep native agent creation host-owned. Providers reached through MCP or chat are participants only. | ||
| - For investigation-including workflows, route through `agent_research_start`, | ||
| then start debate separately with `agent_consensus_start`. | ||
| - Use this host research/debate phase contract verbatim: | ||
| 호스트가 조사한다. | ||
@@ -48,9 +51,10 @@ 호스트가 정리한다. | ||
| - `agent_research_consensus_start`: host-led research, consolidation, system debate, engine aggregation docs, and host-authored final decision docs for investigation-including workflows | ||
| - `agent_consensus_start`, `agent_debate_approve`/`_continue`/`_reject`, `agent_debate_review`: direct consensus sessions from prepared `initial_aggregation` and approval-gated debate artifacts | ||
| - `cli_worker_spawn`, `agent_changes_review`, `agent_changes_accept`, `agent_changes_reject`: autonomous worker lifecycle | ||
| - `agent_research_start`: research-only host-led preprocessing with workflow | ||
| profile, prompt pack, `questionSet`, `evidencePolicy`, research lenses, and | ||
| investigator assignments; writes `research_submissions.json`, | ||
| `research_transcript.json`, and `aggregation.json`; does not start debate | ||
| - debate-only `agent_consensus_start`, `agent_debate_approve`/`_continue`/`_reject`, `agent_debate_review`: sessions from prepared `aggregation`, supplied `questionSet`, `evidencePolicy`, and approval-gated debate artifacts | ||
| - `workspace_*`: document-backed review and aggregation flows | ||
| - `qa_run`: workspace build/test verification before implementation completion | ||
| - `qa_run`: workspace build/test verification for QA evidence | ||
| Review, QA, and security workflows write durable reports under `docs/reports/review/`, `docs/reports/qa/`, and `docs/reports/security/` unless the user asks for chat-only output. | ||
| Persistent E2E test creation/maintenance is internal: QA produces `E2E_TEST_WORK_REQUEST`, the leader asks the user, and approved work goes to `agestra-implementer` with `mode: e2e-test-authoring`. There is no standalone Gemini `/agestra:e2e` command yet. |
@@ -105,3 +105,3 @@ #!/usr/bin/env node | ||
| " Do not ask the user to repeat it — the original prompt is preserved above this block.", | ||
| " Re-dispatch it through the matching Agestra skill/command (leader/review/idea/design/implement).", | ||
| " Re-dispatch it through the matching Agestra skill/command (leader/review/idea/design/qa/security).", | ||
| "", | ||
@@ -129,3 +129,3 @@ "Do NOT skip setup. Do NOT run Agestra orchestration until the user has", | ||
| " 2. After setup writes a fresh file, resume the user's original request above.", | ||
| " Re-dispatch it through the matching Agestra skill/command (leader/review/idea/design/implement).", | ||
| " Re-dispatch it through the matching Agestra skill/command (leader/review/idea/design/qa/security).", | ||
| ]; | ||
@@ -195,7 +195,7 @@ console.log(JSON.stringify({ additionalContext: lines.join("\n") })); | ||
| "Leader routing requirements:", | ||
| " 1. Classify the work domain as review, idea, design, or implement.", | ||
| " 1. Classify the work domain as review, idea, design, QA, or security.", | ||
| " 2. Preserve any explicitly requested providers (for example codex, gemini, claude, ollama).", | ||
| " 3. If the domain is ambiguous, ask one targeted question in the user's language.", | ||
| " 4. If the domain is implement, multi-AI mode is already selected, but show a task-to-provider routing table and get approval before spawning file-changing workers or accepting worktree changes.", | ||
| " 5. Let the matching domain skill hand off to team-lead; team-lead owns consensus via `agent_consensus_start` or `agent_research_consensus_start` as appropriate.", | ||
| " 4. If the user asks for code or persistent test authoring, explain that Agestra no longer implements; route only design/review/QA/security/idea portions through Agestra.", | ||
| " 5. Let the matching workflow skill hand off to team-lead after selecting the workflow profile, questionSet, and lenses; team-lead uses `agent_research_start` for research-only preprocessing with workflow profile, prompt pack, questionSet, evidencePolicy, research lenses, and investigator assignments, then starts debate separately with debate-only `agent_consensus_start` from prepared `aggregation`, supplied `questionSet`, and `evidencePolicy`.", | ||
| " 6. Prefer Host-native first (`host-seeded`) for bounded research/review/design/security/idea evidence: use `agestra-research` and host-turn `agestra-debate` before the current host's external CLI provider.", | ||
@@ -207,3 +207,3 @@ " 7. Do not treat unsupported MCP sampling (for example `claude-host`) as a reason to replace the host-native role with `claude-cli`; external CLI providers are independent fresh-session participants.", | ||
| " 2. While team-lead/provider work runs, surface a concise phase update every 30-60 seconds.", | ||
| " 3. Poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when a locator/worker exists.", | ||
| " 3. Poll `agent_debate_status` or `run_observable_events` with a cursor when a locator exists.", | ||
| " 4. If trace is `cold-start`, report the current local phase and keep monitoring; do not stop because polling seems like context waste.", | ||
@@ -221,16 +221,5 @@ ]; | ||
| "", | ||
| "MANDATORY: You MUST use AskUserQuestion to ask the user whether Agestra should propose an AI task distribution for this implementation.", | ||
| "Present this question in the user's language.", | ||
| "Question: 'Should Agestra inspect the task and suggest which enabled AIs should handle each part?'", | ||
| "Options: 'Yes, suggest distribution' / 'No, handle it directly'", | ||
| "", | ||
| "If user selects YES:", | ||
| " 1. Call `setup_status`, `environment_check`, and `provider_list`", | ||
| " 2. Follow `commands/implement.md`", | ||
| " 3. Present a task-to-provider routing table before spawning workers", | ||
| " 4. Distribute work according to detected model capability, including frontier and local models", | ||
| " 5. Use safe edit-capable paths for actual code edits: `agestra-implementer`, Codex/Gemini CLI workers, or write-enabled local/tool models when policy and capability qualify", | ||
| "", | ||
| "If user selects NO:", | ||
| " Proceed without Agestra task distribution. Handle the task directly.", | ||
| "Agestra no longer implements product code or authors persistent E2E test files.", | ||
| "Proceed without Agestra task distribution. Handle the code/test change directly in the current host.", | ||
| "After the change, suggest Agestra QA/review/security only if the user explicitly wants multi-AI/provider-backed validation.", | ||
| ]; | ||
@@ -237,0 +226,0 @@ |
+5
-2
| { | ||
| "name": "agestra", | ||
| "version": "4.14.5", | ||
| "version": "4.15.0", | ||
| "description": "Multi-host MCP orchestration for Claude Code, Codex CLI, Gemini CLI, and local models", | ||
@@ -25,3 +25,2 @@ "type": "module", | ||
| "hooks/", | ||
| "prompts/", | ||
| "scripts/install-host-mcp.mjs", | ||
@@ -33,2 +32,3 @@ "scripts/uninstall-host-mcp.mjs", | ||
| "scripts": { | ||
| "prepare": "node scripts/install-git-hooks.mjs", | ||
| "sync:metadata": "node scripts/sync-metadata.mjs", | ||
@@ -45,2 +45,5 @@ "prebuild": "npm run sync:metadata", | ||
| "bundle": "node scripts/bundle.mjs", | ||
| "bundle:verify": "node scripts/check-bundle-equivalence.mjs", | ||
| "bundle:size": "node scripts/check-bundle-size.mjs", | ||
| "check:bundle": "npm run bundle:verify && npm run bundle:size", | ||
| "install:claude": "node scripts/install-host-mcp.mjs claude", | ||
@@ -47,0 +50,0 @@ "install:claude:global": "node scripts/install-host-mcp.mjs claude --source global --scope user", |
+5
-6
@@ -10,3 +10,3 @@ # Agestra | ||
| Agestra は、1 つの作業に複数の AI を使って比較し、整理するためのツールです。コードレビュー、QA、セキュリティ確認、設計相談、アイデア探索、provider-backed 実装向けに作られています。 | ||
| Agestra は、1 つの問題を複数の AI 視点で検討し、整理するためのツールです。コードレビュー、QA、セキュリティ確認、設計相談、アイデア探索、根拠にもとづく合意形成向けに作られています。 | ||
@@ -25,4 +25,4 @@ ## クイックスタート | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea`, `/agestra implement` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea`, `/agestra:implement` | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea` | ||
| - Codex CLI: `Use Agestra with Gemini and Codex to review this branch.` のように、Agestra や複数 AI を明示して依頼 | ||
@@ -39,3 +39,2 @@ | ||
| - `idea`: 改善案、代替案、類似ツールを探る | ||
| - `implement`: 複数 provider で実装を進め、最後の検証までつなぐ | ||
@@ -50,5 +49,5 @@ ## 実行すると何が起こるか | ||
| 普通のレビューや QA の依頼が自動で Agestra になるわけではありません。`/agestra ...` を使うか、複数 AI や provider-backed 作業を明示したときに Agestra が動きます。 | ||
| 普通のレビューや QA の依頼が自動で Agestra になるわけではありません。`/agestra ...` を使うか、複数 AI や provider-backed のレビュー、QA、セキュリティ、設計、アイデア作業を明示したときに Agestra が動きます。 | ||
| 実装と QA では、最後の確認は引き続きホストが担当します。ビルド、テスト、実行証拠、ブラウザフロー、最終的なファイル反映はホスト側で確認します。 | ||
| コード変更は、まず現在のホストで直接行うのが基本です。Agestra はその後で結果をレビューし、計画との一致を確認し、複数 provider の意見と根拠を記録するところで最も力を発揮します。 | ||
@@ -55,0 +54,0 @@ ## このリポジトリで使う |
+5
-6
@@ -10,3 +10,3 @@ # Agestra | ||
| Agestra는 하나의 작업에 여러 AI를 붙여서 비교하고 정리해 주는 도구입니다. 코드 리뷰, QA, 보안 점검, 설계 논의, 아이디어 탐색, provider-backed 구현에 맞춰 설계되어 있습니다. | ||
| Agestra는 하나의 문제를 여러 AI 시각으로 검토하고 정리해 주는 도구입니다. 코드 리뷰, QA, 보안 점검, 설계 논의, 아이디어 탐색, 근거 기반 합의에 맞춰 설계되어 있습니다. | ||
@@ -25,4 +25,4 @@ ## 빠른 시작 | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea`, `/agestra implement` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea`, `/agestra:implement` | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea` | ||
| - Codex CLI: `Agestra로 Gemini와 Codex를 같이 써서 이 브랜치 리뷰해줘`처럼 Agestra나 여러 AI를 명시해서 요청 | ||
@@ -39,3 +39,2 @@ | ||
| - `idea`: 개선 아이디어, 대안, 유사 도구 탐색 | ||
| - `implement`: 여러 provider를 써서 구현을 진행하고 마지막 검증까지 이어감 | ||
@@ -50,5 +49,5 @@ ## 실행하면 어떻게 되나 | ||
| 평범한 리뷰나 QA 요청이 자동으로 Agestra가 되는 것은 아닙니다. `/agestra ...`를 쓰거나, 여러 AI나 provider-backed 작업을 명시했을 때 Agestra 워크플로우가 시작됩니다. | ||
| 평범한 리뷰나 QA 요청이 자동으로 Agestra가 되는 것은 아닙니다. `/agestra ...`를 쓰거나, 여러 AI나 provider-backed 리뷰/QA/보안/설계/아이디어 작업을 명시했을 때 Agestra 워크플로우가 시작됩니다. | ||
| 구현과 QA에서는 마지막 확인을 계속 호스트가 맡습니다. 빌드, 테스트, 실행 근거, 브라우저 흐름, 최종 파일 반영은 호스트가 확인합니다. | ||
| 코드 변경은 먼저 현재 호스트에서 직접 진행하는 편이 좋습니다. Agestra는 그 다음 결과를 리뷰하고, 계획과 맞는지 검증하고, 여러 provider 의견과 근거를 기록할 때 가장 강합니다. | ||
@@ -55,0 +54,0 @@ ## 이 저장소에서 쓰기 |
+5
-6
@@ -10,3 +10,3 @@ # Agestra | ||
| Agestra helps you use more than one AI for the same task. It is built for review, QA, design discussion, idea exploration, and provider-backed implementation. | ||
| Agestra helps you use more than one AI to examine the same problem. It is built for review, QA, security checks, design discussion, idea exploration, and evidence-backed consensus. | ||
@@ -25,4 +25,4 @@ ## Quick Start | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea`, `/agestra implement` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea`, `/agestra:implement` | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea` | ||
| - Codex CLI: ask explicitly for Agestra or multiple providers, for example `Use Agestra with Gemini and Codex to review this branch.` | ||
@@ -39,3 +39,2 @@ | ||
| - `idea`: explore improvements, alternatives, and similar tools | ||
| - `implement`: coordinate provider-backed implementation, then verify the result | ||
@@ -50,5 +49,5 @@ ## How It Runs | ||
| Plain review or QA requests do not automatically become Agestra workflows. Agestra starts when you use `/agestra ...` or explicitly ask for multi-AI or provider-backed help. | ||
| Plain review or QA requests do not automatically become Agestra workflows. Agestra starts when you use `/agestra ...` or explicitly ask for multi-AI or provider-backed review, QA, security, design, or idea work. | ||
| For implementation and QA, the host still owns the final checks such as build, test, runtime evidence, browser flows, and accepted file changes. | ||
| For code changes, use your current host directly first. Agestra is strongest after that: reviewing the result, checking it against a plan, comparing provider opinions, and recording the evidence. | ||
@@ -55,0 +54,0 @@ ## Using This Repository |
+5
-6
@@ -10,3 +10,3 @@ # Agestra | ||
| Agestra 用来把多个 AI 放到同一个任务里比较和整理。它适合代码审查、QA、安全检查、设计讨论、想法探索,以及 provider-backed 实现。 | ||
| Agestra 用来让多个 AI 从不同角度审视同一个问题,并把结果整理成证据清晰的结论。它适合代码审查、QA、安全检查、设计讨论、想法探索和基于证据的共识。 | ||
@@ -25,4 +25,4 @@ ## 快速开始 | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea`, `/agestra implement` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea`, `/agestra:implement` | ||
| - Claude Code: `/agestra review`, `/agestra qa`, `/agestra security`, `/agestra design`, `/agestra idea` | ||
| - Gemini CLI: `/agestra:review`, `/agestra:qa`, `/agestra:security`, `/agestra:design`, `/agestra:idea` | ||
| - Codex CLI: 像 `Use Agestra with Gemini and Codex to review this branch.` 这样明确提到 Agestra 或多个 AI | ||
@@ -39,3 +39,2 @@ | ||
| - `idea`: 探索改进方向、备选方案和相似工具 | ||
| - `implement`: 用多个 provider 推进实现,并把最后验证也串起来 | ||
@@ -50,5 +49,5 @@ ## 运行时会发生什么 | ||
| 普通的 review 或 QA 请求不会自动变成 Agestra 工作流。只有当你使用 `/agestra ...`,或者明确要求多 AI / provider-backed 帮助时,Agestra 才会启动。 | ||
| 普通的 review 或 QA 请求不会自动变成 Agestra 工作流。只有当你使用 `/agestra ...`,或者明确要求多 AI / provider-backed 的 review、QA、安全、设计或 idea 工作时,Agestra 才会启动。 | ||
| 在实现和 QA 里,最后的确认仍然由宿主负责。构建、测试、运行证据、浏览器流程,以及最终落盘的改动都由宿主确认。 | ||
| 代码修改应优先由当前宿主直接完成。Agestra 最适合在修改之后审查结果、按计划验证、比较多个 provider 的意见,并记录证据。 | ||
@@ -55,0 +54,0 @@ ## 在这个仓库里使用 |
@@ -45,3 +45,3 @@ // Single source of truth for agent permission categories. | ||
| "ai_compare", | ||
| "agent_research_consensus_start", | ||
| "agent_research_start", | ||
| "agent_consensus_start", | ||
@@ -54,9 +54,2 @@ "agent_debate_status", | ||
| "agent_cross_validate", | ||
| "cli_worker_spawn", | ||
| "cli_worker_status", | ||
| "cli_worker_collect", | ||
| "cli_worker_stop", | ||
| "agent_changes_review", | ||
| "agent_changes_accept", | ||
| "agent_changes_reject", | ||
| "workspace_create_document", | ||
@@ -95,3 +88,3 @@ "workspace_read", | ||
| description: | ||
| "Full-lifecycle orchestrator. Spawns workers, reviews and accepts worktree changes, runs consensus sessions. Does not write files directly.", | ||
| "Full-lifecycle evidence and consensus orchestrator for review, QA, security, design, idea, and research workflows. Does not write product code or persistent E2E tests.", | ||
| }), | ||
@@ -112,9 +105,2 @@ research: Object.freeze({ | ||
| }), | ||
| implementation: Object.freeze({ | ||
| members: Object.freeze(["agestra-implementer"]), | ||
| policy: "open", | ||
| tools: null, | ||
| description: | ||
| "Applies scoped code or test changes, including approved mode:e2e-test-authoring work. Tool surface is intentionally unconstrained at the frontmatter level so implementation can use whatever the task requires.", | ||
| }), | ||
| }); | ||
@@ -121,0 +107,0 @@ |
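Reviewer sketch for the category hunk above: one way to read the frozen category shape and check that the removed worker tools are no longer reachable. The field names (`members`, `policy`, `tools`, `description`) and the tool names come from the diff; the `"closed"` policy value, the `orchestration` constant, its member name, and `isToolAllowed` are assumptions made for illustration, not the repository's API.

```typescript
// Hypothetical sketch: only the field names and tool names are taken from the diff.
type Policy = "open" | "closed"; // "closed" is an assumed counterpart to "open"

interface Category {
  readonly members: readonly string[];
  readonly policy: Policy;
  readonly tools: readonly string[] | null; // null means the tool surface is unconstrained
  readonly description: string;
}

// Assumed example category; the real category name and member list are not shown here.
const orchestration: Category = Object.freeze({
  members: Object.freeze(["agestra-team-lead"]),
  policy: "closed" as Policy,
  tools: Object.freeze([
    "agent_research_start",
    "agent_consensus_start",
    "agent_debate_status",
  ]),
  description: "Evidence and consensus orchestration (illustrative).",
});

// A tool is allowed when the category is open with no tool list,
// or when the tool is named in the category's explicit list.
function isToolAllowed(category: Category, tool: string): boolean {
  if (category.tools === null) {
    return category.policy === "open";
  }
  return category.tools.indexOf(tool) !== -1;
}
```

With this reading, a removed tool such as `cli_worker_spawn` fails the check simply because it is absent from the list, so no separate deny-list is needed.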
+7
-18
@@ -17,5 +17,4 @@ --- | ||
| 1. **CLI Workers** — Call `cli_worker_status` to check for workers in RUNNING or SPAWNING state | ||
| 2. **Debate** — Call `agent_debate_status` to check for active debates | ||
| 3. **Background agents** — Check for any spawned background agents still running | ||
| 1. **Consensus session** — Call `agent_debate_status` to check for active consensus sessions | ||
| 2. **Background agents** — Check for any spawned background agents still running | ||
@@ -29,16 +28,7 @@ If nothing is detected as active, inform the user: "No active Agestra operations found." | ||
| ### CLI Workers | ||
| 1. List all workers in RUNNING/SPAWNING state with their provider, elapsed time, and task description. | ||
| 2. Ask the user which to stop (or all): | ||
| - Single worker: call `cli_worker_stop` with the worker ID. | ||
| - All workers: call `cli_worker_stop` for each. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Wait for an explicit choice; do not infer which worker to stop. | ||
| 3. Workers receive SIGTERM, then SIGKILL after 5 seconds. | ||
| 4. Worktrees are cleaned up automatically. | ||
| 5. Report: which workers were stopped, any partial results available via `cli_worker_collect`. | ||
| ### Consensus Session | ||
| 1. Call `agent_debate_status` and inspect the allowed actions for the session. | ||
| 2. If the session is ready for a leader decision, call `agent_debate_reject` with a reason noting early termination. | ||
| 3. If the session is still running and cannot be stopped through an allowed action, report the current phase, session ID, and available follow-up actions rather than inventing a legacy conclude path. | ||
| ### Debate | ||
| 1. Call `agent_debate_conclude` with a summary noting early termination | ||
| 2. Inform the user which providers participated and how many rounds completed | ||
| ### Task Chain | ||
@@ -63,4 +53,3 @@ 1. Note the current step and remaining steps | ||
| - Summarize what was stopped and what completed | ||
| - Note any artifacts produced (debate documents, partial results, worker diffs) | ||
| - If CLI workers produced changes before stopping, mention that partial diffs may be available via `cli_worker_collect` | ||
| - Note any artifacts produced (debate documents, partial results) | ||
| - If the operation produced useful partial work, mention it so the user can resume later | ||
@@ -67,0 +56,0 @@ |
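Reviewer sketch for the new "Consensus Session" cancel steps: the decision can be modelled as a pure function over the status result. The payload of `agent_debate_status` is not shown in this diff, so `SessionStatus`, `allowedActions`, `CancelPlan`, and `planConsensusCancel` are hypothetical names; only the tool names `agent_debate_status` and `agent_debate_reject` come from the source.

```typescript
// Hypothetical status shape; the real `agent_debate_status` payload is not shown in the diff.
interface SessionStatus {
  sessionId: string;
  phase: string;
  allowedActions: string[];
}

type CancelPlan =
  | { kind: "reject"; tool: "agent_debate_reject"; sessionId: string; reason: string }
  | { kind: "report"; sessionId: string; phase: string; followUps: string[] };

// Reject only when the session exposes that action; otherwise report state
// instead of inventing a conclude path.
function planConsensusCancel(status: SessionStatus): CancelPlan {
  if (status.allowedActions.indexOf("agent_debate_reject") !== -1) {
    return {
      kind: "reject",
      tool: "agent_debate_reject",
      sessionId: status.sessionId,
      reason: "Early termination requested by the user",
    };
  }
  return {
    kind: "report",
    sessionId: status.sessionId,
    phase: status.phase,
    followUps: status.allowedActions.slice(),
  };
}
```

Keeping the decision separate from the MCP call makes the "do not invent a legacy conclude path" rule easy to test: the function can only ever name `agent_debate_reject` or return a report.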
+16
-13
@@ -13,3 +13,3 @@ --- | ||
| Pre-implementation design contract creation. Turn a selected idea into a self-contained implementation contract that both humans and AI workers can follow without guessing. Understand identity, users, scope, constraints, success criteria, and quality principles through targeted questions; explore the codebase for existing patterns; propose multiple approaches with trade-offs; and produce a design document that defines what to build, what not to build, how it should behave, and how implementation completeness will be judged. | ||
| Pre-implementation design contract creation. Turn a selected idea into a self-contained implementation contract that both humans and the current host can follow without guessing. Understand identity, users, scope, constraints, success criteria, and quality principles through targeted questions; explore the codebase for existing patterns; propose multiple approaches with trade-offs; and produce a design document that defines what to build, what not to build, how it should behave, and how implementation completeness will be judged. | ||
@@ -38,4 +38,8 @@ ## Scope | ||
| Ask **Need to know** questions before **Nice to know** questions. Use `AskUserQuestion` when available, or ask the same options plainly in chat as a numbered prompt when structured choices are unavailable. Prefer short choices with a separate "Term help" block instead of long parenthetical explanations in every option. Include "not sure — recommend a default" when helpful. Do not assume or infer missing design-contract values; explicit `not sure`, `defer`, `none`, or `skip` answers are valid. | ||
| Ask **Need to know** questions before **Nice to know** questions. Prefer short choices with a separate "Term help" block instead of long parenthetical explanations in every option. Include "not sure — recommend a default" when helpful. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (the prior turn, a saved-idea record under `docs/ideas/`, or an upstream `/agestra design` invocation that already named the field) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any "Need to know" dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. Explicit `not sure`, `defer`, `none`, or `skip` answers are valid as user-provided values. | ||
| **Design Contract Dimensions:** | ||
@@ -72,3 +76,3 @@ | ||
| **Host research consensus inputs (mandatory before provider fan-out):** | ||
| **Host research-phase inputs (mandatory before provider fan-out):** | ||
| - "Provider-backed design can use Host-native first, Council, or Provider-seeded research. What should the selected investigation look for: existing patterns in this codebase, prior art / competing implementations, constraints / regulations, current-information needs, or skip?" | ||
@@ -108,5 +112,4 @@ - "Should any participant or lens receive a specific research assignment, or should team-lead choose the assignment rows?" | ||
| | **Provider-seeded Research** | One selected provider creates the first design seed/evidence artifact; host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for bounded design work, Council for broad architecture exploration, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a design clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three research modes (조사 방식) produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -144,4 +147,4 @@ ### Phase 3: Route execution | ||
| - **Existing design docs:** {paths under `docs/plans/` if any} | ||
| - **Consensus domain:** `design` | ||
| - **Research topology / 조사 방식:** {selected in Phase 2 — `host-seeded`, `council`, `provider-seeded`, or `automatic`} | ||
| - **Workflow profile:** design profile with `workflow: "design"`, design `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** {selected in Phase 2 — `host-seeded`, `council`, `provider-seeded`, or `automatic`}; seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -154,3 +157,3 @@ - **Research notes:** {what the selected investigation should look for — existing patterns, prior art, constraints, current-information needs} | ||
| - **Target workspace root:** {absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir`} | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** {preserve verbatim} | ||
@@ -160,6 +163,6 @@ | ||
| - Building the participant team (host designer + external providers + auto-injected specialists when applicable) | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required; call debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader-authored final decision document under `docs/agestra/`. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final decision document under `docs/reports/design/`. | ||
| - Returning the research artifact paths, design synthesis path, accepted decisions, excluded options, disputed items, and the final design document path under `docs/plans/` | ||
@@ -256,4 +259,4 @@ | ||
| The document must be self-contained and precise enough for a separate AI worker to implement from it without conversation context. | ||
| The Implementation Progress section must be the first section after the title. Pre-populate it with concrete rows for the included scope, expected state/error handling, integration points, and verification-sensitive items so implementers and QA can track evidence without changing the design contract. | ||
| The document must be self-contained and precise enough for a later current-host implementation pass to follow without conversation context. | ||
| The Implementation Progress section must be the first section after the title. Pre-populate it with concrete rows for the included scope, expected state/error handling, integration points, and verification-sensitive items so implementers outside Agestra and QA can track evidence without changing the design contract. | ||
@@ -260,0 +263,0 @@ **Required design principles to include unless the user explicitly overrides them:** |
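Reviewer sketch for the bundle-skip rule added to the design skill: a question may be skipped only when the user explicitly supplied a value, and explicit `not sure`, `defer`, `none`, or `skip` answers count as values. The dimension names below are taken from the skill's purpose paragraph; `ProvidedValue`, the `source` tag, and `dimensionsToAsk` are hypothetical and only illustrate the rule.

```typescript
// Hypothetical model: each candidate value records whether the user stated it
// or the agent inferred it. Only explicit values satisfy the bundle-skip rule.
type ValueSource = "explicit" | "inferred";

interface ProvidedValue {
  value: string;
  source: ValueSource;
}

// Returns the required dimensions that must still be asked before provider fan-out.
function dimensionsToAsk(
  required: readonly string[],
  provided: Record<string, ProvidedValue | undefined>,
): string[] {
  return required.filter((name) => {
    const entry = provided[name];
    const satisfied =
      entry !== undefined &&
      entry.source === "explicit" &&
      entry.value.trim() !== "";
    return !satisfied;
  });
}

const stillMissing = dimensionsToAsk(
  ["identity", "users", "scope", "constraints"],
  {
    identity: { value: "CLI tool", source: "explicit" },
    users: { value: "developers", source: "inferred" }, // inferred: must be asked
    scope: { value: "not sure", source: "explicit" }, // explicit "not sure" is a value
  },
);
```

Here `stillMissing` holds `users` and `constraints`: the first was only inferred and the second was never provided, so both remain mandatory gates.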
+19
-16
@@ -52,4 +52,8 @@ --- | ||
| Before researching, understand what the user needs through targeted questions. Ask ONE question at a time. Use `AskUserQuestion` when available, or ask the same options plainly in chat as a numbered prompt when structured choices are unavailable. Communicate in the user's language. | ||
| Before researching, understand what the user needs through targeted questions. Ask ONE question at a time. Communicate in the user's language. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (the prior turn, a saved-idea record being reused, or an upstream `/agestra idea` invocation that already named the field) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension in the Mode A or Mode B table below cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. | ||
| **Step 1: Determine mode.** | ||
@@ -95,3 +99,3 @@ - If the codebase has a README or meaningful code → Mode A (existing project) | ||
| **Early exit:** If the user provides all required fields upfront, skip redundant questions and proceed to Phase 2. If any required field is missing, ask for it one at a time. | ||
| **Bundle-skip rule applied.** If the user's incoming request explicitly provided values for every required dimension (Mode A: Intent, Area, User wishes, Current audience, Research notes, Research assignments, Identity and boundaries, Free notes — explicit `none`/`skip`/`unspecified` is a value; or Mode B: Kind, Seed, Audience, Must-have, Inspiration, Difference, Research notes, Research assignments, Free notes), skip redundant questions and proceed to Phase 2. If any required dimension is missing or ambiguous, ask for it one at a time per the gate above. Skipping is permitted only when the source request literally enumerates each required field — agent inference does not satisfy the bundle-skip condition. | ||
@@ -107,5 +111,4 @@ ### Phase 2: Choose 조사 방식 | ||
| | **Provider-seeded Research** | One selected provider creates the first seed/evidence artifact, then host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for bounded topics, Council for broad idea discovery, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a domain clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three research modes (조사 방식) produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -131,4 +134,4 @@ ### Phase 3: Route execution | ||
| - **Project context:** {README summary, current feature set if Mode A; seed idea verbatim if Mode B} | ||
| - **Consensus domain:** `idea` | ||
| - **Research topology / 조사 방식:** {selected in Phase 2 — `host-seeded`, `council`, `provider-seeded`, or `automatic`} | ||
| - **Workflow profile:** idea profile with `workflow: "idea"`, idea `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** {selected in Phase 2 — `host-seeded`, `council`, `provider-seeded`, or `automatic`}; seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -141,3 +144,3 @@ - **Research notes:** {what the selected investigation should look for} | ||
| - **Target workspace root:** {absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir`} | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** {preserve verbatim} | ||
@@ -147,9 +150,9 @@ | ||
| - Building the participant team. Any host ideator is invoked through the active host layer; external providers are MCP/CLI/chat participants only. | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - For research fan-out, pass `domain: "idea"` to `agent_research_consensus_start`. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required; call debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - For research fan-out, pass the idea workflow profile, prompt pack, `questionSet`, and `evidencePolicy` to `agent_research_start`. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Writing the project-facing idea decision record under `docs/agestra/YYYY-MM-DD-idea-<session-id>-result.md` from the aggregation document, JSON artifacts, consensus state, and the user's interview answers. Preserve disputed positions and weak-evidence flags rather than averaging them away. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader final document target. | ||
| - Returning the research artifact paths, accepted/excluded/disputed items, carry-forward ideas, weak-evidence flags, and the `docs/agestra/` decision document path. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Writing the project-facing idea decision record under `docs/reports/idea/YYYY-MM-DD-idea-<session-id>-result.md` from the threaded aggregation document, JSON artifacts, `workflow_result.json`, and the user's interview answers. Preserve disputed positions and weak-evidence flags rather than averaging them away. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final document. | ||
| - Returning the research artifact paths, accepted/excluded/disputed items, carry-forward ideas, weak-evidence flags, and the `docs/reports/idea/` decision document path. | ||
@@ -160,11 +163,11 @@ **Do NOT from this skill:** | ||
| - Build individual documents or hand-edit generated debate/synthesis Markdown | ||
| - Create a bundled research pseudo-participant or carry research bundles through `source_documents` | ||
| - Create a bundled research pseudo-participant or carry research bundles through legacy source-document fields | ||
| Direct execution from this skill bypasses team-lead's capability-based routing, optional trace-assisted signals, and consistency enforcement. Always go through team-lead in the provider-backed path. | ||
| When team-lead returns, present a title-only idea list first, then point the user to the `docs/agestra/` decision record for details. Separate research-backed opportunities, hypotheses, risky but interesting ideas, duplicates, weakly grounded ideas, and recommended next directions. In the user's language, explain accepted ideas as "worth carrying forward" rather than MVP approval, and preserve each provider's rationale on disputed positions. Treat `.agestra/workspace/` as the internal research workspace; the user-facing research-consensus decision record belongs under `docs/agestra/`. | ||
| When team-lead returns, present a title-only idea list first, then point the user to the `docs/reports/idea/` decision record for details. Separate research-backed opportunities, hypotheses, risky but interesting ideas, duplicates, weakly grounded ideas, and recommended next directions. In the user's language, explain accepted ideas as "worth carrying forward" rather than MVP approval, and preserve each provider's rationale on disputed positions. Treat `.agestra/workspace/` as the internal research workspace; the user-facing research-phase decision record belongs under `docs/reports/idea/`. | ||
| #### Reference: Mode A / Mode B research-participant brief templates | ||
| These templates are reference material for team-lead's `research_assignments` and external research prompts inside `agent_research_consensus_start` with `domain: "idea"`. They guide each research participant's idea-shaped evidence collection before the fresh-session debate phase. | ||
| These templates are reference material for team-lead's `research_assignments` and external research prompt packs inside `agent_research_start` with the idea workflow profile. They guide each research participant's idea-shaped evidence collection before the fresh-session debate phase. | ||
@@ -171,0 +174,0 @@ **Mode A research brief (existing project) — used for `research_assignments` and external research prompts:** |
+74
-65
@@ -38,5 +38,5 @@ --- | ||
| auto-classify clear requests and route them quietly. It shows the domain menu | ||
| only when the request is genuinely ambiguous. The matching domain skill gathers | ||
| domain-specific requirements, then the `agestra:agestra-team-lead` agent executes | ||
| the actual orchestration. | ||
| only when the request is genuinely ambiguous. The selected workflow skill gathers | ||
| the workflow profile, questionSet, lenses, and gates, then the | ||
| `agestra:agestra-team-lead` agent executes the actual orchestration. | ||
@@ -55,3 +55,3 @@ ## When NOT to use | ||
| - **Operational tasks** — use the dedicated skills: `agestra:setup`, `build-fix`, `cancel`, | ||
| `trace`, `worker-manage`, `provider-guide`. | ||
| `trace`, `provider-guide`. | ||
@@ -75,5 +75,7 @@ ## Workflow | ||
| Read the user's request and classify into one of six work domains. Use the strongest signal | ||
| present in the user's wording. | ||
| Read the user's request and classify into one of five active work domains. Use the strongest | ||
| signal present in the user's wording. | ||
| Implementation is no longer an active Agestra domain. Code-changing requests should stay with the current host first, then return to Agestra for QA, review, security review, design clarification, or idea work. | ||
| This table is a domain classifier after the entry signal is already established. It must not | ||
@@ -90,8 +92,14 @@ turn plain review, QA, validation, comparison, debate, or check wording into an Agestra | ||
| | **security** | security, vulnerability, secrets, API keys, auth, permissions, file access risk, 보안, 취약점, 권한, secret, セキュリティ, 安全审计 | security workflow | Dedicated security audit | | ||
| | **implement** | build, refactor, migrate, fix-the-bug, apply changes, write code, 구현, 리팩토링, 마이그레이션, 実装, 实现 | implement workflow | Code changes + QA included by default | | ||
| **QA placement rule:** Implementation includes QA by default after code changes. When the | ||
| user asks for QA by itself, classify as `qa` so the QA workflow can choose Standard vs E2E | ||
| depth, then hand off to team-lead with `submode: qa-only` when multi-AI is requested. | ||
| **Code-change placement rule:** If the user asks Agestra to build, refactor, | ||
| migrate, fix, apply changes, write code, 구현, 리팩토링, マイグレーション, 実装, or 实现, | ||
| explain that Agestra no longer performs provider-backed implementation. Tell the | ||
| user to let the current host make the change first, then run `qa`, `review`, or | ||
| `security` through Agestra on the result. | ||
| **QA placement rule:** When the user asks for QA, classify as `qa` so the QA | ||
| workflow can choose Standard vs E2E depth. QA-only mode does not modify product | ||
| code; connection or boundary defects are findings until the user approves a | ||
| separate implementation task outside this router. | ||
| Distinguish `review` vs `qa`: | ||
@@ -105,12 +113,12 @@ - `review` → freeform critique/evaluation of code quality, UX/product feel, maintainability, | ||
| **Mixed-domain requests** (e.g. "여러 AI로 설계하고 구현해"): | ||
| - Split into ordered phases: design → implement, or idea → design → implement. | ||
| - Inform the user once: "이 작업은 N단계로 나눕니다 — 먼저 X, 다음 Y." (in user's language) | ||
| - Run them sequentially via repeated domain-skill → team-lead handoffs. | ||
| **Mixed-workflow requests** (e.g. "여러 AI로 설계하고 구현해"): | ||
| - Split into active Agestra workflow phases plus a host-owned code-change step: design → current-host implementation → QA/review, or idea → design → current-host implementation → QA/review. | ||
| - Inform the user once in their language that Agestra can run the idea/design/review/QA/security parts, while code edits should happen in the current host first. | ||
| - Run only the active Agestra phases through workflow-skill → team-lead handoffs. | ||
| **Clear-domain requests:** | ||
| - Do not ask "which command?" when the domain is clear from the request. | ||
| - Route directly to the matching domain skill. | ||
| - Route directly to the matching workflow skill. | ||
| - Domain-specific cost, trust, runtime-depth, write, provider, or research-topology gates | ||
| are asked inside the domain skill or by team-lead, not here. | ||
| are asked inside the workflow skill or by team-lead, not here. | ||
| - If the user says "알아서", "decide automatically", or equivalent and the | ||
@@ -121,3 +129,3 @@ domain is clear, pass `automatic` intent forward instead of opening the menu. | ||
| - Ask ONE targeted question via `AskUserQuestion` (or a plain numbered prompt as fallback). | ||
| - Present the SIX options below. Match the question language to the user's language. | ||
| - Present the five options below. Match the question language to the user's language. | ||
| - Wait for an explicit domain choice. Do not infer the domain when the request is ambiguous. | ||
@@ -131,5 +139,4 @@ | ||
| 3. 코드 리뷰 — 품질/UX/성능/유지보수성 평가 | ||
| 4. 개발 (+QA) — 코드 작성/리팩토링 + 자동 QA 검증 | ||
| 5. 검증만 — 이미 구현된 코드를 설계 문서 대비 PASS/FAIL 확인 | ||
| 6. 보안 감사 — 비밀키/권한/파일접근/네트워크 위험 감사 | ||
| 4. 검증 — 이미 구현된 코드를 설계 문서 대비 PASS/FAIL 확인 | ||
| 5. 보안 감사 — 비밀키/권한/파일접근/네트워크 위험 감사 | ||
| ``` | ||
@@ -143,5 +150,4 @@ | ||
| 3. Code review — critique quality, UX, performance, maintainability | ||
| 4. Implement (+QA) — write/refactor code with automatic QA verification | ||
| 5. QA only — verify already-implemented code against the design doc (PASS/FAIL) | ||
| 6. Security audit — audit secrets, auth, file access, network exposure | ||
| 4. QA — verify already-implemented code against the design doc (PASS/FAIL) | ||
| 5. Security audit — audit secrets, auth, file access, network exposure | ||
| ``` | ||
@@ -153,5 +159,4 @@ | ||
| - 3 → `review` | ||
| - 4 → `implement` (default, includes QA) | ||
| - 5 → `qa` | ||
| - 6 → `security` | ||
| - 4 → `qa` | ||
| - 5 → `security` | ||
@@ -185,11 +190,11 @@ Use the equivalent template in `ja` / `zh` when those locales are active. | ||
| ### Phase 4: Route to the domain skill (which then hands off to team-lead) | ||
| ### Phase 4: Route to the workflow skill (which then hands off to team-lead) | ||
| This skill is a **router only** — it does NOT directly spawn the team-lead agent or call | ||
| `agent_debate_*` tools. After classification, route the user to the matched domain skill so | ||
| `agent_debate_*` tools. After classification, route the user to the matched workflow skill so | ||
| that proper information gathering (Clarity Gate, focus areas, Mode A/B detection, etc.) | ||
| happens before team-lead receives the handoff packet. | ||
| Invoke the matched domain skill via the `Skill` tool with a routing context comment that | ||
| the domain skill should read: | ||
| Invoke the matched workflow skill via the `Skill` tool with a routing context comment that | ||
| the workflow skill should read: | ||
@@ -201,10 +206,13 @@ | Classified Domain | Skill to invoke | Notes | | ||
| | `review` | `agestra:review` | | | ||
| | `qa` | `agestra:qa` | Verification only. Asks Host-only vs QA Brigade, then QA depth; provider-backed QA Brigade is optional and falls back to Host-only/setup guidance when no providers are available | | ||
| | `qa` | `agestra:qa` | Verification only. Mandatory topology gate (Council QA / Host-native first QA / Provider-seeded QA); E2E gated by package.json `scripts.e2e`; no host-only fallback — when no providers are configured, stop and direct the user to `/agestra setup` | | ||
| | `security` | `agestra:security` | Dedicated security audit | | ||
| | `implement` (default) | `agestra:implement` | Code changes + QA. Team-lead runs provider-backed code execution, host-owned evidence collection, and Phase 5M structured QA debate | | ||
| Pre-populate the following context for the domain skill so it doesn't re-classify: | ||
| If the user explicitly invoked a removed implementation command, explain that | ||
| Agestra no longer provides implementation workflows. Route the code change to | ||
| the current host first, then suggest Agestra QA/review/security on the result. | ||
| Pre-populate the following context for the workflow skill so it doesn't re-classify: | ||
| - **Pre-classified domain**: from Phase 1 | ||
| - **Submode** (implement/qa only): `qa-only` when the user asked for verification of already-implemented code; otherwise omit (default = full implement + QA) | ||
| - **Submode** (qa only): `qa-only` when the user asked for verification of already-implemented code; otherwise omit | ||
| - **Multi-AI signal**: present (the user's wording explicitly asked for it) | ||
@@ -216,3 +224,3 @@ - **Requested providers**: from Phase 3 (e.g. `[codex, gemini]` or "all available") | ||
| The domain skill is responsible for: | ||
| The workflow skill is responsible for: | ||
@@ -229,7 +237,7 @@ 1. Skipping its own domain-classification step (already done here) | ||
| - Building the participant team from the reduced host-native agents (`agestra-research`, `agestra-debate`, `agestra-implementer`) and named external providers. External providers participate through MCP/CLI/chat routes and do not create or manage native host agents. | ||
| - Building the participant team from the reduced host-native agents (`agestra-research`, `agestra-debate`) and named external providers. External providers participate through MCP/CLI/chat routes and do not create or manage native host agents. | ||
| - Maintaining user-visible progress. Completion notifications are not enough: | ||
| team-lead or the caller must surface phase updates every 30-60 seconds while | ||
| provider, debate, or worker work is running. Use `agent_debate_status`, | ||
| `run_observable_events` with a cursor, or `cli_worker_status` when available. | ||
| provider or debate work is running. Use `agent_debate_status` or | ||
| `run_observable_events` with a cursor when available. | ||
| If trace is still `cold-start`, report the current local phase and keep | ||
@@ -240,4 +248,4 @@ monitoring instead of stopping. | ||
| participants. If topology is missing and asking is allowed, team-lead asks one | ||
| concise topology question; if no-questions mode blocks asking, team-lead uses | ||
| Host-native first (`host-seeded`) research and reports that choice. | ||
| concise topology question; if asking is blocked, team-lead must | ||
| stop and return that the mandatory topology choice is required. | ||
| - For Host-native first, team-lead must invoke the active host's | ||
@@ -248,32 +256,33 @@ `agestra-research` / `agestra-debate` native agents before considering the | ||
| native-agent route. | ||
| - Consensus orchestration: `agent_research_consensus_start` when the selected | ||
| topology requires investigation before debate, or `agent_consensus_start` from | ||
| prepared `initial_aggregation` when the domain skill/team-lead already has | ||
| sufficient evidence | ||
| - Consensus orchestration: `agent_research_start` when the selected topology | ||
| requires research-only preprocessing with workflow profile, prompt pack, | ||
| questionSet, evidencePolicy, research lenses, and investigator assignments; | ||
| it writes `research_submissions.json`, `research_transcript.json`, and | ||
| `aggregation.json` and does not start debate. Start debate separately with | ||
| debate-only `agent_consensus_start` from prepared `aggregation`, supplied | ||
| `questionSet`, and `evidencePolicy`; `workflow` is a report/artifact label | ||
| only, not a debate routing branch. | ||
| - Approval gate (`agent_debate_approve` / `_continue` / `_reject`) | ||
| - For `implement`: code edits via `agestra:agestra-implementer` or CLI workers | ||
| (`cli_worker_spawn`), then `agent_changes_review` before merge | ||
| - `qa_run` and QA lenses when applicable | ||
| - `agestra:agestra-implementer` with `mode: e2e-test-authoring` only for approved persistent E2E test-writing packets; QA remains the final verifier | ||
| **Mixed-domain requests** (e.g. "여러 AI로 설계하고 구현해"): | ||
| - Route to the FIRST domain skill in the user's stated order (design → implement, etc.) | ||
| **Mixed-workflow requests** (e.g. "여러 AI로 설계하고 구현해"): | ||
| - Route to the FIRST active workflow skill in the user's stated order (idea, design, review, qa, or security) | ||
| - Pass forward the prior phase's artifacts (design doc IDs, debate session IDs) when the | ||
| next phase is invoked | ||
| - The domain skill's team-lead handoff returns control here for the next phase routing | ||
| - The workflow skill's team-lead handoff returns control here for the next phase routing | ||
| **Why route through the domain skill instead of team-lead directly:** | ||
| - Each domain has specialized information-gathering (Clarity Gate dimensions, review focus | ||
| areas, Mode A vs B detection) that team-lead does not replicate. | ||
| **Why route through the workflow skill instead of team-lead directly:** | ||
| - Each workflow has specialized information-gathering (Clarity Gate dimensions, review focus | ||
| areas, Mode A vs B detection), profile selection, questionSet selection, and lens | ||
| assignment that team-lead does not replicate. | ||
| - Direct team-lead invocation produces under-specified handoff packets and forces team-lead | ||
| to interview the user, duplicating the domain skill's work. | ||
| - This routing keeps a single information-gathering source of truth per domain. | ||
| to interview the user, duplicating the workflow skill's work. | ||
| - This routing keeps a single information-gathering source of truth per workflow. | ||
| ### Phase 5: Report | ||
| When the domain skill (and team-lead behind it) returns: | ||
| When the workflow skill (and team-lead behind it) returns: | ||
| - Surface the consensus document, disputed positions (with each provider's rationale), and | ||
| final synthesis. Do not flatten distinctive positions. | ||
| - For `implement`: list files changed, tests run, and QA verdict. | ||
| - For `review`: include the review verdict and separate objective findings from reviewer opinions. | ||
@@ -291,6 +300,6 @@ - For `qa`: include QA depth, E2E status, PASS / CONDITIONAL PASS / FAIL, and classified failures. | ||
| providers directly, or invoke `agent_debate_*` tools. All execution happens through the | ||
| domain skill → team-lead → agents/MCP tools chain. | ||
| workflow skill → team-lead → agents/MCP tools chain. | ||
| - Do NOT spawn the `agestra:agestra-team-lead` agent directly from this skill. Always go | ||
| through the domain skill so information gathering (Clarity Gate, focus areas, Mode A/B) | ||
| runs first. | ||
| through the workflow skill so information gathering, profile selection, questionSet | ||
| selection, and lens assignment run first. | ||
| - Do not bypass Phase 0 setup preflight even if `environment_check` succeeds — config | ||
@@ -300,11 +309,11 @@ determines which providers are sanctioned. | ||
| - Do not silently substitute providers. If the user named specific providers, pass that list | ||
| to the domain skill. If a requested provider is offline, ask before proceeding. | ||
| - For `implement` domain: never call `Edit` / `Write` directly from this skill. Always go | ||
| through domain skill → team-lead → implementer/worker so review and QA gates apply. | ||
| to the workflow skill. If a requested provider is offline, ask before proceeding. | ||
| - For code-changing requests: never call `Edit` / `Write` directly from this skill. Explain | ||
| that implementation should happen in the current host first, then route any follow-up | ||
| verification to QA, review, or security. | ||
| - Do not treat a background team-lead completion notification as sufficient | ||
| progress reporting. If the team-lead is opaque to the user, the caller is | ||
| responsible for bounded progress polling and concise relay messages. | ||
| - Do not use `agent_debate_create` as the default debate path. It is a legacy/manual | ||
| turn-based tool for diagnostics or special low-level control; default to | ||
| - Do not use retired turn-based debate tools. The supported consensus start path is | ||
| `agent_consensus_start` (invoked by team-lead, not this skill). | ||
| - Match the response language to the user's language. |
+54
-74
@@ -6,3 +6,3 @@ --- | ||
| named AI providers, provider comparison, provider routing, structured | ||
| debates, parallel provider dispatch, cross-validation, or CLI workers. | ||
| debates, parallel provider dispatch, or cross-validation. | ||
| Plain review/QA/check requests without `/agestra` or explicit multi-AI/provider | ||
@@ -16,4 +16,4 @@ wording stay with the current host; they are not Agestra natural-language | ||
| - **Ollama** — Local models. Detected at runtime via `ollama_models`. | ||
| - **Gemini** — Cloud agent. Full capability. Can run as autonomous CLI worker. | ||
| - **Codex** — Cloud agent. Full capability. Can run as autonomous CLI worker. | ||
| - **Gemini** — Cloud agent. Full capability for analysis, review, design, QA cross-checking, and debate. | ||
| - **Codex** — Cloud agent. Full capability for analysis, review, design, QA cross-checking, and debate. | ||
@@ -27,5 +27,3 @@ All providers are detected at runtime. Call `environment_check` for a full capability map, or `provider_list` / `provider_health` for provider availability. | ||
| - Ollama models with size-based tier classification | ||
| - Git worktree support | ||
| - Available provider-backed modes: `independent`, `debate`, `team` | ||
| - Whether autonomous CLI workers can be spawned | ||
@@ -51,4 +49,3 @@ ## Provider Capability Guidelines | ||
| - Complex tasks via `ai_chat` (text response) | ||
| - Autonomous coding via `cli_worker_spawn` (file modifications in worktree) | ||
| - Parallel work and as validators | ||
| - Parallel analysis and as validators | ||
@@ -66,4 +63,4 @@ ## Work Modes | ||
| When providers are enabled, explicit Agestra/multi-AI/provider requests go to | ||
| the leader router first, then to the matching domain skill. The domain skill | ||
| still owns its question sheet and cost gates before team-lead dispatches providers. Idea, design, review, | ||
| the leader router first, then to the matching workflow skill. The workflow skill | ||
| still owns its workflow profile, questionSet, lenses, and cost gates before team-lead dispatches providers. Idea, design, review, | ||
| security, and explicit research workflows ask for `조사 방식` / research topology | ||
@@ -74,37 +71,17 @@ when provider-backed investigation is possible: Host-native first, Council | ||
| `/agestra qa` is the exception: it must ask for Host-only QA, QA Brigade, or | ||
| automatic selection because configured providers alone should not turn a | ||
| verification request into a long provider-research run. | ||
| If a host-level no-questions directive prevents that gate, use Host-only QA and | ||
| report that provider fan-out was skipped. Trust registration remains a separate | ||
| security approval gate and cannot be inferred from no-questions instructions. | ||
| When no providers are enabled, run setup or handle the task directly outside Agestra. | ||
| `/agestra qa` uses the same three-topology pattern as the other domains: it | ||
| asks for Council QA, Host-native first QA, or Provider-seeded QA as a | ||
| mandatory design selection gate. No-questions directives do not authorize a | ||
| silent default and there is no host-only fallback mode. When no providers | ||
| are configured, stop and direct the user to `/agestra setup` instead. Trust | ||
| registration remains a separate security approval gate and cannot be | ||
| inferred from no-questions instructions. E2E execution is host-owned and | ||
| gated by `package.json` `scripts.e2e` auto-detection. | ||
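The QA gate described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: `hasE2eScript`, `qaGate`, the `QaTopology` identifiers, and the return shapes are all hypothetical, and the real auto-detection may check more than a non-empty `scripts.e2e` string.

```typescript
// Hypothetical sketch of the QA gate; names and shapes are illustrative.
type QaTopology = "council" | "host-native-first" | "provider-seeded";

type GateResult =
  | { action: "stop"; message: string }
  | { action: "ask"; message: string }
  | { action: "proceed"; topology: QaTopology };

// Assumed detection rule: package.json has a non-empty string at scripts.e2e.
function hasE2eScript(packageJsonText: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(packageJsonText);
  } catch {
    return false; // unreadable package.json: treat E2E as unavailable
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const scripts = (parsed as { scripts?: unknown }).scripts;
  if (typeof scripts !== "object" || scripts === null) return false;
  const e2e = (scripts as Record<string, unknown>)["e2e"];
  return typeof e2e === "string" && e2e.trim().length > 0;
}

// No providers: stop and point to setup. Missing topology: ask, never default.
function qaGate(providerCount: number, topology?: QaTopology): GateResult {
  if (providerCount === 0) {
    return { action: "stop", message: "No providers configured. Run /agestra setup." };
  }
  if (topology === undefined) {
    return { action: "ask", message: "Choose Council QA, Host-native first QA, or Provider-seeded QA." };
  }
  return { action: "proceed", topology };
}
```

The gate has no branch that picks a topology on the user's behalf, which matches the rule that no-questions directives do not authorize a silent default.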
| ### Implementation Work (실제 구현) | ||
| ### Code and Test Authoring | ||
| Provider-backed implementation is available via team-lead orchestration: | ||
| Agestra no longer performs provider-backed implementation or persistent E2E test | ||
| authoring. Use the current host to make code/test changes first, then run | ||
| `/agestra qa`, `/agestra review`, or `/agestra security` on the result. | ||
| | Mode | Description | When to Use | | ||
| |------|-------------|-------------| | ||
| | **Suggested AI distribution** | Team lead proposes which enabled AIs should do which work, asks for approval, then dispatches | Complex, repetitive, or parallelizable tasks | | ||
| Small is not the same as simple. Distribute work according to detected model capability, including frontier and local models: | ||
| - **Small but hard/risky:** one-line auth, security, concurrency, data-loss, or release logic. Keep with `agestra-implementer` or a high-capability CLI worker plus QA. | ||
| - **Large but simple:** repeated safe pattern across many files. Prefer a capability-matched local/tool model for a patch plan, candidate transform, or scoped edit when execution policy allows; otherwise have `agestra-implementer` apply it. | ||
| ## CLI Workers | ||
| CLI workers spawn Codex or Gemini in `--full-auto` mode within isolated git worktrees. | ||
| | Tool | Purpose | | ||
| |------|---------| | ||
| | `cli_worker_spawn` | Spawn autonomous CLI worker with task manifest | | ||
| | `cli_worker_status` | Check worker FSM state, output, heartbeat | | ||
| | `cli_worker_collect` | Collect completed worker results (diff, output) | | ||
| | `cli_worker_stop` | Stop worker (SIGTERM → SIGKILL) + cleanup | | ||
| Worker lifecycle: SPAWNING → RUNNING → COLLECTING → COMPLETED (or FAILED/CANCELLED/TIMEOUT) | ||
| Use the `worker-manage` skill for user-friendly worker operations. | ||
| ## Auto-Routing Guidelines | ||
@@ -114,5 +91,4 @@ | ||
| |---|---| | ||
| | Simple repetitive proposal work (formatting, pattern matching) | Capability-matched local/tool model or host implementer | | ||
| | Simple repetitive proposal work (formatting, pattern matching) | Capability-matched local/tool model | | ||
| | Moderate (code review, summarization) | Local model tier that qualifies, or a frontier provider | | ||
| | Complex implementation (multi-file, multi-step) | High-capability CLI worker (for example Codex/Gemini) or host implementer | | ||
| | Complex analysis (architecture, refactoring) | Highest-capability detected model, usually a frontier provider | | ||
@@ -131,3 +107,3 @@ | No providers available | Handle directly — do not suggest agestra tools | | ||
| **Routing principle:** any work that involves external providers (Codex/Gemini/Ollama) or multi-AI coordination must enter through a domain skill (`/agestra review` / `design` / `idea` / `implement`) which then delegates to the `agestra:agestra-team-lead` agent. Do NOT suggest `ai_chat`, `cli_worker_spawn`, `agent_debate_*`, `agent_cross_validate`, or `ai_compare` as direct user-facing tools — those are MCP tools that team-lead invokes internally. Suggest the corresponding domain command instead. | ||
| **Routing principle:** any work that involves external providers (Codex/Gemini/Ollama) or multi-AI coordination must enter through an active workflow skill (`/agestra review` / `qa` / `security` / `design` / `idea`) which selects the workflow profile, questionSet, lenses, and gates before delegating to the `agestra:agestra-team-lead` agent. Do NOT suggest `ai_chat`, `agent_debate_*`, `agent_cross_validate`, or `ai_compare` as direct user-facing tools — those are MCP tools that team-lead invokes internally. Suggest the corresponding workflow command instead. | ||
@@ -140,10 +116,9 @@ | Intent | Suggested entry point | When | | ||
| | Explicit security audit command | `/agestra security` | User invoked `/agestra security`, or asked for security review with explicit multi-AI/provider wording | | ||
| | Speed up, parallelize, split work | `/agestra implement` (multi-AI mode) | User asked for provider-backed or multi-AI implementation; plain implementation requests stay with the current host | | ||
| | Mention a provider by name (Gemini, Codex, Ollama) | Matching domain command (`/agestra review` / `design` / `idea` / `implement`) — team-lead picks up the named providers from user wording | Provider names alone don't pick a domain; ask "어느 작업?" if ambiguous | | ||
| | Code changes, refactoring, fixes | Current host first, then `/agestra qa` or `/agestra review` | Agestra no longer performs provider-backed implementation as a primary workflow | | ||
| | Mention a provider by name (Gemini, Codex, Ollama) | Matching active workflow command (`/agestra review` / `qa` / `security` / `design` / `idea`) — team-lead picks up the named providers from user wording | Provider names alone don't pick a workflow; ask "어느 작업?" if ambiguous | | ||
| | Explicit architecture/design command | `/agestra design` | User invoked `/agestra design`, or asked for design with explicit multi-AI/provider wording | | ||
| | Compare options, which is better | `/agestra design` (`domain: design`) for design options, `/agestra idea` (`domain: idea`) for product/feature options | Use Agestra only when the comparison is explicitly multi-AI/provider-backed or `/agestra ...` was invoked | | ||
| | Large refactoring, many files to change | `/agestra implement` (multi-AI mode) | User explicitly wants provider-backed splitting or multiple AIs | | ||
| | Compare options, which is better | `/agestra design` (`workflow: design`) for design options, `/agestra idea` (`workflow: idea`) for product/feature options | Use Agestra only when the comparison is explicitly multi-AI/provider-backed or `/agestra ...` was invoked | | ||
| | Large refactoring, many files to change | Current host first, then `/agestra qa` or `/agestra review` | If the user wants multiple AI opinions, use Agestra to review/QA the resulting diff | | ||
| | About to commit, create PR, finalize work | `/agestra qa` | User invoked `/agestra qa`, or explicitly wants multi-AI/provider-backed QA | | ||
| | Check worker status, manage workers | `worker-manage` skill | User asks about running workers (operational, not domain work) | | ||
| | Domain unclear ("여러 AI로 뭐 좀") | `agestra-leader` skill (catch-all router) | Skill asks the user to pick from 6 options (idea / design / review / implement / QA / security) | | ||
| | Workflow unclear ("여러 AI로 뭐 좀") | `agestra-leader` skill (catch-all router) | Skill asks the user to pick from 5 options (idea / design / review / QA / security) | | ||
@@ -159,4 +134,2 @@ ### Commands and Agents | ||
| | `/agestra design` | `agestra:agestra-team-lead` + design lenses | Pre-implementation architecture exploration | | ||
| | `/agestra implement` | `agestra:agestra-team-lead` + `agestra:agestra-implementer` | Implementation routing, code execution, QA/review | | ||
| | Internal E2E writing | `agestra:agestra-implementer` with `mode: e2e-test-authoring` | Persistent E2E creation/maintenance after QA request and user approval; no standalone command yet | | ||
@@ -169,14 +142,13 @@ ### Utility Skills | ||
| | `build-fix` | Auto-diagnose and fix build/typecheck/lint errors one at a time | | ||
| | `cancel` | Gracefully stop running operations (including CLI workers) with state cleanup | | ||
| | `worker-manage` | List, check, collect, and stop CLI workers | | ||
| | `cancel` | Gracefully stop running operations with state cleanup | | ||
| In consensus mode, the engine runs rounds over team-lead prepared items. Host-native participation uses explicit `agestra:agestra-debate` host turns, while team-lead handles final aggregation and reporting. Sampling providers such as `claude-host` are optional; if sampling is unsupported, the host-native Agent/Skill route is still the preferred host participant path. | ||
| Commands and hook-triggered suggestions go through the leader/domain-skill gate when providers are available. Commands are explicit entry points; hooks detect explicit Agestra/multi-AI/provider intent from natural language. | ||
| Commands and hook-triggered suggestions go through the leader/workflow-skill gate when providers are available. Commands are explicit entry points; hooks detect explicit Agestra/multi-AI/provider intent from natural language. | ||
| ### Hook-Triggered Flow | ||
| When the UserPromptSubmit hook injects multi-AI context (e.g. user mentioned an external provider or used multi-AI phrasing), route through the matched **domain skill** (`agestra:design` / `idea` / `review` / `qa` / `security` / `implement`), which then hands off to `agestra:agestra-team-lead` with the multi-AI handoff packet. Domain skills own information gathering (Clarity Gate, focus areas, Mode A/B), team-lead owns execution (`agent_consensus_start`, CLI workers, approval gate). | ||
| When the UserPromptSubmit hook injects multi-AI context (e.g. user mentioned an external provider or used multi-AI phrasing), route through the matched **workflow skill** (`agestra:design` / `idea` / `review` / `qa` / `security`), which selects the workflow profile, questionSet, lenses, and gates before handing off to `agestra:agestra-team-lead` with the multi-AI handoff packet. Team-lead owns evidence gathering, consensus, approval gates, and reporting. | ||
| If the user's wording is multi-AI but the domain is unclear, route to the `agestra-leader` skill which asks the user to pick from 6 options (idea / design / review / implement / QA / security) and forwards to the chosen domain skill. | ||
| If the user's wording is multi-AI but the workflow is unclear, route to the `agestra-leader` skill which asks the user to pick from 5 options (idea / design / review / QA / security) and forwards to the chosen workflow skill. | ||
@@ -202,14 +174,12 @@ If no config exists, suggest `/agestra setup` first. | ||
| When team-lead orchestrates multi-AI work, the full pipeline is: | ||
| When team-lead orchestrates multi-AI evidence work, the full pipeline is: | ||
| ``` | ||
| Phase 0: Clarity Gate (team-lead + design lenses — ambiguity scoring, skip if request is clear) | ||
| Phase 1: Situation Assessment (team-lead — environment_check, providers, design doc) | ||
| Phase 2: Task Design (team-lead — work mode selection, decompose, route by AI capability) | ||
| Phase 3: Parallel Execution (team-lead — implementer + CLI workers + capability-matched local/tool model work, monitor loop) | ||
| Phase 4: Result Inspection (team-lead — review diffs, check consistency, merge) | ||
| Phase 5: Host-owned QA evidence collection (QA lenses — spec-to-code map, verify, classify failures → team-lead routes approved fixes) [provider-backed workflow prerequisite] | ||
| Phase 5M: Structured QA Debate (mode:"review", cross-validation across providers) [provider-backed QA Brigade] | ||
| Phase 6: Post-implementation Review (review lenses — critique, quality, UX, performance, blast radius, AI-slop) [host-owned lens inside provider-backed workflow] | ||
| Phase 7: Report | ||
| Phase 1: Situation Assessment (team-lead — environment_check, providers, target docs/diff) | ||
| Phase 2: Evidence Plan (team-lead — choose lenses, scope, provider roles, research topology) | ||
| Phase 3: Host-Owned Evidence Collection (host + agestra-research — inspect files/docs/tests/runtime evidence) | ||
| Phase 4: Provider Cross-Check / Consensus (providers + agestra-debate when needed) | ||
| Phase 5: Result Inspection (team-lead — classify findings, disagreements, confidence, residual risk) | ||
| Phase 6: Report (durable report or concise chat result) | ||
| ``` | ||
@@ -222,3 +192,3 @@ | ||
| **Work modes:** | ||
| - `Multi-AI`: CLI workers + capability-matched local/tool model work for parallelized execution; team lead supervises and merges | ||
| - `Multi-AI`: host-prepared evidence plus provider cross-check/debate; team lead supervises consensus and reporting | ||
@@ -235,4 +205,14 @@ **Host-native first:** | ||
| - Use external CLI providers for independent Council/Provider-seeded | ||
| participants, explicitly requested providers, or file-changing worker tasks. | ||
| participants or explicitly requested provider perspectives. | ||
| **Research/debate split:** | ||
| - `agent_research_start` is research-only. It receives the selected workflow | ||
| profile, prompt pack, `questionSet`, `evidencePolicy`, research lenses, and | ||
| investigator assignments, then writes `research_submissions.json`, | ||
| `research_transcript.json`, and `aggregation.json`. It does not start debate. | ||
| - Start debate separately with debate-only `agent_consensus_start` from prepared | ||
| `aggregation`, supplied `questionSet`, and `evidencePolicy`. `workflow` is a | ||
| report/artifact label only, not a debate routing branch. Route host debate | ||
| turns with `participant_routes` to `agestra-debate`. | ||
| **Progress visibility:** | ||
@@ -242,5 +222,5 @@ - Multi-AI work must stay visible while it runs. A background-agent completion | ||
| - Team-lead or the caller must surface concise phase updates every 30-60 | ||
| seconds during provider, debate, or worker phases. | ||
| - Use `agent_debate_status`, `run_observable_events` with a cursor, or | ||
| `cli_worker_status` when a session locator or worker exists. If trace is | ||
| seconds during provider or debate phases. | ||
| - Use `agent_debate_status` or `run_observable_events` with a cursor when a | ||
| session locator exists. If trace is | ||
| still `cold-start`, report the current local phase and keep monitoring instead | ||
@@ -250,3 +230,3 @@ of stopping. | ||
| **QA domain:** | ||
| - `/agestra qa` verifies existing work without code changes. It first asks Host-only QA vs QA Brigade vs automatic selection, then asks Standard vs Full E2E depth, writes a QA report under `docs/reports/qa/`, collects host-owned evidence, and runs Connection / Boundary Checks (API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, and E2E artifact interpretation). QA Brigade cross-checks host-prepared findings through a short consensus round; it does not start with external provider research unless the user explicitly asks for deep provider research. If no providers are enabled, it offers Host-only QA or setup. QA-only mode does not modify product code. It never spawns implementer or CLI workers for product fixes. If QA decides persistent E2E tests are needed, team-lead asks the user and routes only the approved test work to `agestra-implementer` with `mode: e2e-test-authoring`. | ||
| - `/agestra qa` verifies existing work without code changes. It first asks Council QA vs Host-native first QA vs Provider-seeded QA as a mandatory design selection gate, then auto-detects E2E coverage via `package.json` `scripts.e2e`, writes QA reports under `docs/reports/qa/`, collects host-owned evidence, and runs Connection / Boundary Checks (API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, and E2E artifact interpretation). Council QA uses `agent_research_start` with the QA workflow profile; all QA topologies start debate through `agent_consensus_start` with host-approved `aggregation`, supplied `questionSet`, and `evidencePolicy`. Every QA claim carries an `evidenceType` of `"empirical"` / `"inferential"` / `"mixed"` so the renderer can flag empirical refutations of inferential claims. If no providers are configured, stop and direct the user to `/agestra setup`. QA-only mode does not modify product code and does not author persistent E2E test files. If persistent tests are missing, QA records the gap and recommended scenarios as findings for the current host to implement separately. | ||
@@ -256,4 +236,4 @@ **Security domain:** | ||
| **QA Fix Loop — provider escalation:** | ||
| On failure, immediately assign to a DIFFERENT provider with full context (original task, previous AI, diagnosis, fix instruction, scope boundary). Never retry the same provider for the same failure. | ||
| **QA failure handling:** | ||
| On failure, record the defect, evidence, likely owner, and recommended current-host fix path. Do not dispatch product fixes through Agestra unless a separate explicit internal maintenance packet has been approved. | ||
@@ -260,0 +240,0 @@ ## Completion Verification |
+85
-48
@@ -34,32 +34,25 @@ --- | ||
| ### Phase 2: Choose QA Execution Mode | ||
| ### Phase 2: Choose QA topology (조사 방식) | ||
| Ask once unless already specified: | ||
| Available 조사 방식 for QA: | ||
| | Option | Description | | ||
| |--------|-------------| | ||
| | **Host-only QA (Recommended)** | Fastest path. The current host collects evidence, runs `qa_run`, writes the QA report, and does not call external providers. | | ||
| | **QA Brigade** | Host evidence first, then enabled providers cross-check prepared findings through a short consensus round. Takes longer. | | ||
| | **Decide automatically** | Use Host-only QA unless the target is broad/high-risk, the user explicitly asked for multiple AIs/providers, or the design has disputed evidence. | | ||
| - **Council QA** — host and external providers all investigate independently with distinct QA lenses (executable evidence, spec-to-code compliance, integration risk, edge/error states, test adequacy, safety hygiene), then cross-review and debate. | ||
| - **Host-native first QA** — the host's native `agestra-research` agent collects executable QA evidence first (build / type / test, plus E2E when `package.json` `scripts.e2e` is present), persists the QA evidence artifact, and external providers challenge it through a short consensus round. | ||
| - **Provider-seeded QA** — the selected `seed_provider` produces a code/spec-analysis seed (inferential); the host then injects empirical evidence as a challenge stance and other reviewers weigh in. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/permission gate, not a clarifying question. Do not infer provider-backed QA merely because `/agestra qa` was invoked or providers are configured. Skip this question only when the user already explicitly requested current-host-only QA, named provider-backed/multi-AI QA, or selected a mode in the same request. If a host-level no-questions directive prevents asking, choose Host-only QA and report that provider fan-out was skipped. | ||
| This is a mandatory design selection gate. The three 조사 방식 produce different artifact contracts, participant routes, and evidence weights, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded QA is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| ### Phase 3: Choose QA Depth | ||
| If no external providers are configured or available, stop Agestra orchestration and direct the user to `/agestra setup`. A host-only fallback for QA is not a mode in this skill. | ||
| Ask once unless already specified: | ||
| ### Phase 3: Detect E2E coverage | ||
| | Option | Description | | ||
| |--------|-------------| | ||
| | **Standard QA (Recommended)** | Design/progress compliance, build/type/test, Connection / Boundary Checks, error/empty states, and basic safety hygiene | | ||
| | **Full QA with E2E** | Standard QA plus existing E2E tests, temporary browser automation, screenshots when useful, and core real-user flows | | ||
| | **Decide automatically** | Include E2E for UI-heavy, auth, file, public-release, destructive, or complex state-flow work | | ||
| E2E execution is host-owned and gated by explicit user intent. Before evidence collection, the host MUST read the workspace `package.json` (and, in a workspace monorepo, any nested package `package.json` files) and check whether a `scripts.e2e` entry exists: | ||
| Warn that E2E can cost more time, tokens, and local runtime setup. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/permission gate, not a clarifying question. Do not infer QA depth unless the user chose `Decide automatically` or the request already explicitly asked for Standard QA or Full QA/E2E. If a host-level no-questions directive prevents asking, choose Standard QA and report that E2E was skipped unless the user explicitly requested it. | ||
| - If `scripts.e2e` exists at the workspace root or in any nested package, run it as part of the QA evidence pass via the workspace's package manager (`npm run e2e`, `pnpm run e2e`, `yarn e2e`, etc.). Capture its stdout/stderr into the QA evidence artifact alongside build/type/test output. | ||
| - If `scripts.e2e` is absent everywhere, do NOT attempt E2E execution and do NOT search for `playwright.config.*`, `cypress.config.*`, or `tests/e2e/` directories. Presence of those files alone is not a reliable signal — abandoned framework setups produce false positives. Record in the QA report that E2E was not run because no `scripts.e2e` was declared, and recommend that the user add one to enable E2E in future runs. | ||
| - Standard QA evidence (`qa_run` for build/type/test) always runs regardless of `scripts.e2e` presence. | ||
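The `scripts.e2e` gate described above can be sketched roughly as follows. This is a minimal illustration, not Agestra's implementation: the function names `find_e2e_scripts` and `e2e_decision` are hypothetical, and walking every nested `package.json` is a simplifying assumption (a real monorepo would more likely follow its declared workspace globs). The sketch deliberately ignores `playwright.config.*`, `cypress.config.*`, and `tests/e2e/`, as the rule requires.

```python
import json
from pathlib import Path


def find_e2e_scripts(workspace_root: str) -> list:
    """Return every package.json under the workspace that declares scripts.e2e."""
    root = Path(workspace_root)
    hits = []
    for manifest in sorted(root.rglob("package.json")):
        # Installed dependencies are not workspace packages; skip them.
        if "node_modules" in manifest.relative_to(root).parts:
            continue
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable manifest counts as "no scripts.e2e declared"
        scripts = data.get("scripts") if isinstance(data, dict) else None
        if isinstance(scripts, dict) and isinstance(scripts.get("e2e"), str):
            hits.append(manifest)
    return hits


def e2e_decision(workspace_root: str) -> dict:
    """Decide whether the host should run E2E, with the reason for the QA report."""
    hits = find_e2e_scripts(workspace_root)
    if hits:
        return {"run": True, "reason": f"scripts.e2e present in {hits[0]}"}
    # No framework-config probing here: config files alone are not a signal.
    return {"run": False, "reason": "no scripts.e2e declared"}
```

The `reason` strings mirror the two E2E status reasons used in the handoff packet, so the result can be copied into the report without rewording.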
| Persistent E2E test files are not created or maintained by QA. If they are needed, QA returns an `E2E_TEST_WORK_REQUEST` packet; after explicit user approval, route it to `agestra:agestra-implementer` with `mode: e2e-test-authoring` and re-run QA after those tests exist. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer approval. | ||
| Even in multi-AI QA, E2E/runtime execution is host-owned across all three topologies. External providers may review the design, code, host QA report, command output, screenshots, traces, and E2E findings, but they must not run browser/dev-server flows or create persistent E2E files directly. | ||
| If QA Brigade was selected, also ask focused provider cross-check inputs before provider fan-out: | ||
| - "QA Brigade uses host-prepared evidence first. What should providers cross-check: spec-to-code mapping gaps, API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, suspected regressions, integration/regression risk, edge / error states, test adequacy, safety hygiene, E2E artifact interpretation, or skip?" | ||
| - "Should any provider or host-native lens receive a specific cross-check assignment, or should team-lead choose the assignment rows?" | ||
| QA writes a Markdown report under `docs/reports/qa/` unless the user explicitly asks for chat-only output. | ||
@@ -70,16 +63,6 @@ ### Phase 4: Route Execution | ||
| If Host-only QA was selected, run the host-owned QA evidence pass directly: | ||
| Before any provider fan-out, run workspace trust readiness for the exact target root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers or stop and direct the user to `/agestra setup`. Pass `workspace_base_dir` explicitly to provider readiness/trust and consensus calls whenever the host workspace root may be ambiguous. | ||
| - Use `qa_run` for build/test verification where applicable. | ||
| - Inspect the design/progress contract, implementation files, command output, and runtime/E2E artifacts according to the selected depth. | ||
| - Use host-native `agestra-research` only as a bounded native helper assignment when the current host exposes native agents and the evidence question is narrow. | ||
| - Write the QA report under `docs/reports/qa/`. | ||
| - Do not call `agent_research_consensus_start`, `agent_consensus_start`, `ai_chat`, or external provider tools. | ||
| Hand off to `agestra:agestra-team-lead`. The canonical QA boundary (Host-native first QA and Provider-seeded QA default): | ||
| If QA Brigade was selected but no external providers are available, stop provider orchestration and offer Host-only QA or `/agestra setup`. Do not spawn a provider-backed consensus with zero providers. | ||
| If QA Brigade was selected and external providers are available, first run workspace trust readiness for the exact target root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers or fall back to Host-only QA. Pass `workspace_base_dir` explicitly to provider readiness/trust and consensus calls whenever the host workspace root may be ambiguous. | ||
| Then hand off to `agestra:agestra-team-lead`. Provider-backed QA uses the fast host-prepared consensus path by default: | ||
| ```text | ||
@@ -92,29 +75,82 @@ 호스트가 조사한다. | ||
| The host must prepare QA evidence before provider fan-out. External providers cross-check the prepared evidence; they do not run the initial research phase. Build a self-contained handoff packet: | ||
| Council QA loosens the first two lines: "호스트와 프로바이더가 함께 조사한다. 호스트가 집계한다." Across all three topologies the closing lines ("시스템이 토론한다." and "호스트가 문서화한다.") are unchanged, and E2E execution remains host-owned. | ||
| External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. Do not carry a research conversation into the debate phase. | ||
| #### Council QA path | ||
| Council QA uses the split research/debate MCP route. Team-lead calls `agent_research_start` with: | ||
| - `workflow: "qa"` and the selected QA workflow profile | ||
| - the QA `questionSet` and `evidencePolicy` | ||
| - The 6 QA lenses as participant assignments: executable evidence, spec-to-code compliance, integration risk, edge/error states, test adequacy, safety hygiene | ||
| - Available external providers as participants alongside the host | ||
| - The host's empirical evidence (`qa_run` output, E2E output when `scripts.e2e` ran) preserved in `research_submissions.json`, `research_transcript.json`, and `aggregation.json`, with `evidenceType: "empirical"` on every claim derived from the executable artifacts | ||
| - Other provider participants emit claims with `evidenceType: "inferential"` (default) unless they were assigned an empirical follow-up lens | ||
| Council QA inherits research's council defaults (`max_rounds` follows the research command's default). | ||
| #### Host-native first QA path | ||
| Team-lead runs the host-owned QA evidence pass first via `qa_run` and (when `scripts.e2e` exists) host-run `npm run e2e`, then prepares `aggregation.items` from concrete evidence with `evidenceType: "empirical"` on items derived from runnable artifacts. Then call debate-only `agent_consensus_start` with: | ||
| - `workflow: "qa"` as an artifact/report label | ||
| - the QA `questionSet` | ||
| - the prepared `aggregation` | ||
| - the QA `evidencePolicy` | ||
| - Exact provider participants | ||
| - `participant_routes` for any host-native `agestra-debate` participant | ||
| - `max_rounds: 1` | ||
| - A bounded participant timeout | ||
| External provider stances on host empirical items default to `evidenceType: "inferential"` because they did not run the build/test/E2E themselves; they may set `"mixed"` only when they cite an independent empirical artifact they actually inspected. | ||
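As a rough sketch of the Host-native first arguments listed above, the packet could be assembled like this. The top-level keys `workflow`, `questionSet`, `aggregation`, `evidencePolicy`, `participants`, `participant_routes`, and `max_rounds` come from the text; everything else is an assumption for illustration, including the helper name, the shape of an aggregation item, the shape of `participant_routes`, and the key `participant_timeout_seconds` (the source only says "a bounded participant timeout").

```python
ALLOWED_EVIDENCE_TYPES = {"empirical", "inferential", "mixed"}


def build_host_native_first_packet(question_set, aggregation_items, evidence_policy,
                                   participants, participant_routes=None,
                                   timeout_seconds=300):
    """Assemble hypothetical arguments for a debate-only agent_consensus_start call."""
    if not participants:
        # Mirrors the no-provider stop path: never start consensus with zero providers.
        raise ValueError("no providers available; direct the user to /agestra setup")
    for item in aggregation_items:
        if item.get("evidenceType") not in ALLOWED_EVIDENCE_TYPES:
            raise ValueError(f"item {item.get('id')!r} is missing a valid evidenceType")
    return {
        "workflow": "qa",  # artifact/report label only, not a routing branch
        "questionSet": question_set,
        "aggregation": {"items": list(aggregation_items)},
        "evidencePolicy": evidence_policy,
        "participants": list(participants),
        "participant_routes": dict(participant_routes or {}),  # assumed mapping shape
        "max_rounds": 1,
        "participant_timeout_seconds": timeout_seconds,  # hypothetical key name
    }
```

Validating `evidenceType` before the call keeps untyped claims out of the ledger, which the evidence type policy treats as mandatory.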
| #### Provider-seeded QA path | ||
| Team-lead asks the user which configured, available provider should seed (Phase 2 may have already captured this; do not re-ask). Then: | ||
| 1. Run the selected `seed_provider` to produce a code/spec-analysis seed; record its claims with `evidenceType: "inferential"`. | ||
| 2. Run the host's empirical evidence pass — host-owned `qa_run` plus host-owned E2E execution when `scripts.e2e` exists — and append host claims with `evidenceType: "empirical"`. Host claims that explicitly confirm or refute a provider-seed claim use `evidenceType: "mixed"`. | ||
| 3. Call debate-only `agent_consensus_start` with `workflow: "qa"`, the QA `questionSet`, prepared `aggregation`, `evidencePolicy`, the seed provider + at least one reviewer + the host-debate participant route, `max_rounds: 1`, and a bounded participant timeout. | ||
| #### No-provider stop path | ||
| If no external providers are configured or available, stop Agestra orchestration and direct the user to `/agestra setup`. Do not spawn a provider-backed consensus with zero providers, and do not silently substitute a host-only fallback. | ||
| #### Handoff packet (all three paths) | ||
| Build a self-contained handoff packet with: | ||
| - **Domain:** `qa` | ||
| - **Submode:** `qa-only` | ||
| - **Mode:** `qa-brigade` (selected by the user; do not re-ask) | ||
| - **QA depth:** {selected depth} | ||
| - **Topology (조사 방식):** Council QA / Host-native first QA / Provider-seeded QA (selected by the user in Phase 2; do not re-ask) | ||
| - **Seed provider:** when topology is Provider-seeded QA | ||
| - **QA target:** from Phase 1 | ||
| - **E2E status:** ran / skipped, with the reason ("scripts.e2e present in {path}" or "no scripts.e2e declared") | ||
| - **E2E/runtime execution:** host-owned only; external providers cross-validate artifacts and findings, not browser/dev-server execution | ||
| - **Design doc reference:** {path under docs/plans} | ||
| - **Report artifact path expectation:** `docs/reports/qa/YYYY-MM-DD-qa-[target].md` | ||
| - **Consensus domain:** `qa` | ||
| - **Workflow profile:** QA profile with `workflow: "qa"`, QA `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Connection / Boundary Checks:** API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, and E2E artifact interpretation when E2E ran | ||
| - **Research notes:** {what the host-owned evidence pass should look for — spec-to-code gaps, boundary mismatches, regressions, integration risk, edge/error states, test adequacy, safety hygiene} | ||
| - **Cross-check assignments:** {optional provider/lens rows for the short consensus round, or "team-lead choose"} | ||
| - **Host-native route:** run active-host `agestra-research` for bounded QA evidence lenses before provider cross-check when useful; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
| - **Evidence type policy:** every claim emitted into the ledger MUST carry `evidenceType`; host empirical claims set `"empirical"` with an `evidence_ref` to the qa_run artifact path/line; provider inferential claims set `"inferential"`; cross-cited host-confirmation-of-provider-claim sets `"mixed"`. Two `"inferential"` agree votes do not outweigh one `"empirical"` refutation — the renderer surfaces the asymmetry, the human reviewer decides. | ||
| - **Host-native route:** route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
| - **Available providers:** {from `environment_check`; include configured providers when their detected model capability is suitable, using read-only QA/review tools so verification cannot modify source files} | ||
| - **Requested providers:** {explicit names captured from user wording; otherwise "all configured and available review-capable providers"} | ||
| - **QA-only boundary:** QA-only mode does not modify product code; connection or boundary defects are findings until the user approves a separate implementation task | ||
| - **JSON finding flow:** candidate findings become `aggregation.items`; debate participants answer each required QA `questionSet` question with allowed verdicts, stance evidence type, and evidence refs; only `workflow_result.json` final-status items affect the final verdict | ||
| - **Target workspace root:** {absolute project folder if supplied or implied; pass as `workspace_base_dir`} | ||
| - **Locale:** {from setup_status} | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** {preserve verbatim} | ||
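The evidence type policy in the packet above says the renderer surfaces cases where inferential agreement meets an empirical refutation, and leaves the decision to the human reviewer. A minimal sketch of that flagging step follows. The ledger shape is assumed for illustration (`stances` entries with a `position` of `"agree"` or `"refute"`); only `evidenceType` and its three values are taken from the text.

```python
def flag_asymmetries(items):
    """Return ledger items where inferential agreement faces an empirical refutation."""
    flagged = []
    for item in items:
        stances = item.get("stances", [])
        empirical_refutes = [
            s for s in stances
            if s.get("position") == "refute" and s.get("evidenceType") == "empirical"
        ]
        inferential_agrees = [
            s for s in stances
            if s.get("position") == "agree" and s.get("evidenceType") == "inferential"
        ]
        if empirical_refutes and inferential_agrees:
            # Surface the asymmetry only; vote counts never settle the verdict here.
            flagged.append({
                "id": item.get("id"),
                "inferential_agree_count": len(inferential_agrees),
                "empirical_refute_count": len(empirical_refutes),
            })
    return flagged
```

The function reports counts rather than a winner, which matches the rule that two inferential agree votes do not outweigh one empirical refutation.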
| Team-lead runs the host-owned QA evidence pass, prepares `initial_aggregation.items` from concrete evidence, and calls `agent_consensus_start` with metadata for `domain: "qa"`, exact provider participants, `participant_routes` for any host-native `agestra-debate` participant, `max_rounds: 1` for Standard QA, and a bounded participant timeout. Team-lead must poll `agent_debate_status` and `run_observable_events` when a locator is available, then surface concise progress at least every 30-60 seconds while provider work is running. When the status reports pending host turns, team-lead dispatches the native `agestra-debate` agent and submits the JSON with `agent_consensus_submit_turn`. If the current host cannot surface progress from a background team-lead, the caller must poll and relay progress, or choose Host-only QA for the current run. | ||
| Team-lead polls `agent_debate_status` and `run_observable_events` when a locator is available, then surfaces concise progress at least every 30-60 seconds while provider work is running. When the status reports pending host turns, team-lead dispatches the native `agestra-debate` agent and submits the JSON with `agent_consensus_submit_turn`. If the current host cannot surface progress from a background team-lead, the caller must poll and relay progress, or stop and direct the user to `/agestra setup`. | ||
| Do not call `agent_research_consensus_start` for the default QA Brigade path. That tool is reserved for an explicit deep provider-research mode; in that exception, External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. Default QA Brigade must avoid the extra external research round because QA already has host-owned executable evidence. | ||
| #### Council QA MCP routing note | ||
| Council QA is the topology that calls `agent_research_start` before debate. Host-native first QA and Provider-seeded QA may prepare `aggregation.items` directly, but all three topologies start debate with `agent_consensus_start` using `workflow`, `questionSet`, `aggregation`, and `evidencePolicy`. The debate engine does not branch on `workflow`; QA behavior comes from the supplied profile and question set. | ||
| ### Phase 5: Present | ||
| Report: | ||
| - QA execution mode | ||
| - QA depth and E2E status | ||
| - QA topology (Council QA / Host-native first QA / Provider-seeded QA) | ||
| - E2E status and reason (scripts.e2e path or "no scripts.e2e declared") | ||
| - QA report path | ||
@@ -124,4 +160,5 @@ - Design document used | ||
| - Evidence and classified failures | ||
| - Empirical-vs-inferential evidence asymmetries flagged in the ledger | ||
| - Spec-to-code mapping summary | ||
| - Any E2E test work request and whether it was approved, declined, or left pending | ||
| - Any missing E2E coverage and recommended scenarios for the current host to implement separately | ||
| - Whether review or security should be run next | ||
@@ -134,5 +171,5 @@ | ||
| - QA may write report artifacts only under `docs/reports/qa/`. | ||
| - QA must not add or update persistent E2E test files; route that to `agestra-implementer` with `mode: e2e-test-authoring` after approval. | ||
| - QA must not add or update persistent E2E test files. | ||
| - QA must not mark `Verified` without fresh evidence: command output, file:line, runtime result, or screenshot/artifact path. | ||
| - QA must not issue PASS when required design items are missing or falsely marked Verified. | ||
| - Communicate in the user's language. |
@@ -1,35 +0,36 @@ | ||
| # E2E Lens | ||
| # E2E Evidence Lens | ||
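| Use the E2E lens to confirm that real user flows work as intended, or to author durable E2E tests. | ||
| Use the E2E lens to interpret evidence from real user flows. Agestra QA may review screen flows, | ||
| execution logs, screenshots, traces, videos, and existing test results, but | ||
| it does not create or modify persistent E2E test files. | ||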
| ## Test Authoring Focus | ||
| ## Evidence Review Focus | ||
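| - User flow: setup, action, expected result | ||
| - Failure, empty, loading, and retry states | ||
| - Existing test framework and project conventions | ||
| - Stable selectors and wait conditions | ||
| - Criteria for interpreting screenshots, traces, and videos | ||
| - Whether the tests leave product behavior unchanged | ||
| - Whether existing test results actually prove the requirements | ||
| - Reliability risks such as flakiness, timeouts, unstable selectors, and environment dependence | ||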
| ## Boundary | ||
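| E2E test authoring mode modifies test files only. | ||
| Agestra reads and judges E2E evidence. Test file authoring, test maintenance, | ||
| and product code changes are split off as separate work for the current host. | ||
| If test authoring uncovers a product bug, design omission, or testability gap, do not fix the product code right away. Report it as `PRODUCT_FIX_REQUIRED` and split it into a separate implementation task. | ||
| If a product bug, design omission, or testability gap is found, do not fix it right away. | ||
| Record it as a QA finding, and re-run Agestra QA after the current host has fixed it. | ||
| However, if the user has explicitly approved an implementation fix loop or the team lead has opened a separate implementation task, the implementer may then modify product code. The E2E lens itself stays within "test authoring/interpretation". | ||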
| ## Stability Checks | ||
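| - Does the test avoid relying on lucky timing? | ||
| - Do selectors survive changes to on-screen copy or styling without breaking too easily? | ||
| - Does the evidence avoid relying on lucky timing? | ||
| - Do results stay stable when selectors or on-screen copy change? | ||
| - Are network, file, date, and random values fixed or controlled? | ||
| - On failure, can the cause be seen through the trace, screenshot, or log? | ||
| - Does the test avoid distorting product code to fit E2E? | ||
| - Is there any sign that product behavior was distorted to make the E2E results pass? | ||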
| ## Output | ||
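| - Test files to add or modify | ||
| - User flows to verify | ||
| - Execution command | ||
| - Product defects or design mismatches found | ||
| - E2E evidence used and the confidence judgment | ||
| - Product defects, design mismatches, or missing coverage found | ||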
@@ -9,3 +9,3 @@ # Agestra Lenses | ||
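| - The team lead, research, debate, and implementation agents read only the lenses they need. | ||
| - The team lead, research, debate, and internal maintenance agents read only the lenses they need. | ||
| - Do not put every lens into the prompt every time. | ||
@@ -23,3 +23,3 @@ - Lenses do not replace work instructions. The user request, design document, and assignment row take precedence. | ||
| | `research.md` | Common research primitives, evidence surfaces, and research run composition rules | | ||
| | `research-domains/*.md` | Per-domain investigation packs for idea/design/review/QA/security/implement | | ||
| | `research-domains/*.md` | Per-domain investigation packs for idea/design/review/QA/security | | ||
| | `review.md` | Review of code quality, user friction, maintainability, resources, legacy, and hardcoding | | ||
@@ -29,3 +29,3 @@ | `qa.md` | Document-to-implementation comparison, progress-table truthfulness, detection of unapproved MVP/compromises | | ||
| | `design.md` | Design of scope, state, data, flow, tradeoffs, and mock/fallback policy | | ||
| | `e2e.md` | Lens for authoring/interpreting E2E tests. Product fixes are split into a separate implementation task | | ||
| | `e2e.md` | Lens for interpreting E2E execution evidence, screenshots, traces, videos, and existing test results | | ||
@@ -32,0 +32,0 @@ ## Relationship to Skills | |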
@@ -52,3 +52,3 @@ # Research Lens | ||
| | --- | --- | | ||
| | `domain` | Which of idea, design, review, qa, security, implement, research the investigation is for | | ||
| | `domain` | Which of idea, design, review, qa, security, research the investigation is for | | ||
| | `question` | The narrow question this run must answer | | ||
@@ -86,3 +86,3 @@ | `lens` | A short summary of the primitives and domain focus to use | | ||
| | `phase` | Always `research` | | ||
| | `targetDomain` | The target domain among idea, design, review, qa, security, implement | | ||
| | `targetDomain` | The target domain among idea, design, review, qa, security | | ||
| | `assignmentId` | Which assignment row the result belongs to | | ||
@@ -89,0 +89,0 @@ | `summary` | A short summary of what was investigated | |
+187
-41
@@ -8,3 +8,3 @@ --- | ||
| is already selected. Trigger examples include: "/agestra research", | ||
| "multi-AI research plan", "provider research consensus", "Codex and Gemini research", | ||
| "multi-AI research plan", "provider research plan", "Codex and Gemini research", | ||
| "조사 방식", "여러 AI로 조사", "Council Research", "Host-native first", | ||
@@ -49,3 +49,2 @@ "Provider-seeded Research", "provider evidence packet". | ||
| - `security` | ||
| - `implement` | ||
| - `research` | ||
@@ -55,3 +54,3 @@ | ||
| Canonical host research consensus boundary: | ||
| Canonical host research/debate boundary: | ||
@@ -65,3 +64,3 @@ ```text | ||
| Standalone `/agestra research` produces research artifacts and a human report; it does not create a bundled participant for a later domain debate. When research should continue into idea/design/review/security/qa/implement consensus, hand off to team-lead to call `agent_research_consensus_start` for the target domain instead of chaining a research-domain debate into a second debate. External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. | ||
| Standalone `/agestra research` produces research artifacts and a human report; it does not create a bundled participant for a later domain debate. When research should continue into idea/design/review/security/qa consensus, hand off to team-lead to call `agent_research_start` for the target workflow, inspect `aggregation.json`, then start debate separately with `agent_consensus_start`. External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. | ||
@@ -91,12 +90,10 @@ ## Workflow | ||
| - Council Research: host and external providers independently investigate with distinct lenses, then cross-review and debate. | ||
| - Host-native first: the active host's native `agestra-research` agent creates and persists the first seed/source document, then external participants challenge it through `domain: "research"`. Record internally as `host-seeded`. | ||
| - Host-native first: the active host's native `agestra-research` agent creates and persists the first seed/source document, then external participants challenge it through the supplied research workflow profile and question set. Record internally as `host-seeded`. | ||
| - Provider-seeded Research: the selected seed provider creates the first seed/evidence artifact, then host/reviewer participants challenge it independently. | ||
| This is a cost/latency gate, not a clarifying question. If a host-level | ||
| no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report | ||
| that broader provider investigation was skipped. | ||
| This is a mandatory design selection gate. The three 조사 방식 produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. | ||
| If no external providers are available, stop Agestra orchestration and tell the user to run setup or handle the research directly outside Agestra. | ||
| For Council Research, first create a table of domain-specific investigation items and AI/worker assignments, then ask the user to approve or modify it. | ||
| For Council Research, first create a table of workflow-profile investigation items and participant assignments, then ask the user to approve or modify it. | ||
| Each assignment row must make the runtime contract explicit: `item_id`, `assignee`, `domain`, `role`, `lens`, `question`, `deliverable`, `priority`, `expected_artifact`, and optional `rationale`. | ||
@@ -106,8 +103,8 @@ Provider fan-out is forbidden until that plan is approved or modified by the user. | ||
| Native researcher/helper agents are host-owned. External providers in the assignment table participate through MCP, CLI, or chat routes and must not be described as creating or managing Claude/Codex/Gemini native agents. | ||
| Do not create a bundled research pseudo-participant, and do not carry research bundles through `source_documents`. | ||
| Do not create a bundled research pseudo-participant, and do not carry research bundles through legacy source-document fields. | ||
| For Host-native first (`host-seeded`), the active host creates the host seed before provider fan-out: | ||
| - Write the seed through the active host's native `agestra-research` agent as host-owned aggregation evidence and normalize it into `initial_aggregation.items`. | ||
| - Do not pass the seed through `source_documents`. | ||
| - Write the seed through the active host's native `agestra-research` agent as host-owned aggregation evidence and normalize it into `aggregation.items`. | ||
| - Do not pass the seed through legacy source-document fields. | ||
| - Include only explicit consensus participants in `participants`. | ||
@@ -143,12 +140,63 @@ - Include at least one external reviewer participant outside the seed provider. | ||
| For Council Research, the approved MCP packet must produce an `agent_consensus_start` packet only after host preprocessing has prepared `initial_aggregation.items`: | ||
| For Council Research, the approved MCP packet must call `agent_research_start` first and produce an `agent_consensus_start` packet only after host preprocessing has prepared `aggregation.items`: | ||
| `agent_research_start` is research-only. It receives the workflow profile, | ||
| prompt pack, `questionSet`, `evidencePolicy`, research lenses, and investigator | ||
| assignments, then writes `research_submissions.json`, | ||
| `research_transcript.json`, and `aggregation.json`. It does not start debate. | ||
| Debate-only `agent_consensus_start` runs from prepared `aggregation`, supplied | ||
| `questionSet`, and `evidencePolicy`; `workflow` is a report/artifact label only, | ||
| not a debate routing branch. | ||
| ```json | ||
| { | ||
| "domain": "research", | ||
| "workflow": "research", | ||
| "participants": ["<explicit-consensus-participant>"], | ||
| "initial_aggregation": { | ||
| "summary": "<approved host aggregation summary>", | ||
| "items": [] | ||
| "questionSet": { | ||
| "id": "research.findings-and-sources", | ||
| "title": "Research findings and sources validation", | ||
| "requiredQuestions": [ | ||
| { | ||
| "id": "claim", | ||
| "prompt": "Is the research claim specific and answerable?", | ||
| "verdictField": "claimVerdict", | ||
| "allowedVerdicts": ["yes", "no", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "source", | ||
| "prompt": "Is the claim supported by traceable source evidence?", | ||
| "verdictField": "sourceVerdict", | ||
| "allowedVerdicts": ["yes", "no", "partial", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "conflict", | ||
| "prompt": "Are conflicting sources or uncertainties preserved?", | ||
| "verdictField": "conflictVerdict", | ||
| "allowedVerdicts": ["yes", "no", "not-applicable", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "action", | ||
| "prompt": "What follow-up or decision does this finding support?", | ||
| "verdictField": "actionVerdict", | ||
| "allowedVerdicts": ["decision-ready", "needs-followup", "background-only", "rejected"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "status", | ||
| "prompt": "Should this be accepted, followed up, background context, or rejected?", | ||
| "verdictField": "finalStatus", | ||
| "allowedVerdicts": ["accepted", "needs_followup", "background", "rejected"], | ||
| "required": true | ||
| } | ||
| ], | ||
| "finalStatus": { | ||
| "field": "finalStatus", | ||
| "allowedValues": ["accepted", "needs_followup", "background", "rejected"] | ||
| } | ||
| }, | ||
| "aggregation": { "aggregationId": "<approved aggregation id>", "items": [] }, | ||
| "evidencePolicy": { "preserveItemEvidenceType": true, "preserveStanceEvidenceType": true }, | ||
| "participant_routes": [] | ||
@@ -162,6 +210,51 @@ } | ||
| { | ||
| "domain": "research", | ||
| "workflow": "research", | ||
| "participants": ["host-seed", "<external-reviewer>"], | ||
| "initial_aggregation": { | ||
| "summary": "<host seed summary>", | ||
| "questionSet": { | ||
| "id": "research.findings-and-sources", | ||
| "title": "Research findings and sources validation", | ||
| "requiredQuestions": [ | ||
| { | ||
| "id": "claim", | ||
| "prompt": "Is the research claim specific and answerable?", | ||
| "verdictField": "claimVerdict", | ||
| "allowedVerdicts": ["yes", "no", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "source", | ||
| "prompt": "Is the claim supported by traceable source evidence?", | ||
| "verdictField": "sourceVerdict", | ||
| "allowedVerdicts": ["yes", "no", "partial", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "conflict", | ||
| "prompt": "Are conflicting sources or uncertainties preserved?", | ||
| "verdictField": "conflictVerdict", | ||
| "allowedVerdicts": ["yes", "no", "not-applicable", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "action", | ||
| "prompt": "What follow-up or decision does this finding support?", | ||
| "verdictField": "actionVerdict", | ||
| "allowedVerdicts": ["decision-ready", "needs-followup", "background-only", "rejected"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "status", | ||
| "prompt": "Should this be accepted, followed up, background context, or rejected?", | ||
| "verdictField": "finalStatus", | ||
| "allowedVerdicts": ["accepted", "needs_followup", "background", "rejected"], | ||
| "required": true | ||
| } | ||
| ], | ||
| "finalStatus": { | ||
| "field": "finalStatus", | ||
| "allowedValues": ["accepted", "needs_followup", "background", "rejected"] | ||
| } | ||
| }, | ||
| "aggregation": { | ||
| "aggregationId": "<host seed aggregation id>", | ||
| "items": [ | ||
@@ -171,6 +264,8 @@ { | ||
| "title": "<claim>", | ||
| "claim": "<what external reviewers should challenge>" | ||
| "claim": "<what external reviewers should challenge>", | ||
| "evidenceType": "empirical" | ||
| } | ||
| ] | ||
| } | ||
| }, | ||
| "evidencePolicy": { "preserveItemEvidenceType": true, "preserveStanceEvidenceType": true } | ||
| } | ||
@@ -183,6 +278,51 @@ ``` | ||
| { | ||
| "domain": "research", | ||
| "workflow": "research", | ||
| "participants": ["<configured-seed-provider>", "<reviewer-provider-or-host-participant>"], | ||
| "initial_aggregation": { | ||
| "summary": "<provider seed summary>", | ||
| "questionSet": { | ||
| "id": "research.findings-and-sources", | ||
| "title": "Research findings and sources validation", | ||
| "requiredQuestions": [ | ||
| { | ||
| "id": "claim", | ||
| "prompt": "Is the research claim specific and answerable?", | ||
| "verdictField": "claimVerdict", | ||
| "allowedVerdicts": ["yes", "no", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "source", | ||
| "prompt": "Is the claim supported by traceable source evidence?", | ||
| "verdictField": "sourceVerdict", | ||
| "allowedVerdicts": ["yes", "no", "partial", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "conflict", | ||
| "prompt": "Are conflicting sources or uncertainties preserved?", | ||
| "verdictField": "conflictVerdict", | ||
| "allowedVerdicts": ["yes", "no", "not-applicable", "unclear"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "action", | ||
| "prompt": "What follow-up or decision does this finding support?", | ||
| "verdictField": "actionVerdict", | ||
| "allowedVerdicts": ["decision-ready", "needs-followup", "background-only", "rejected"], | ||
| "required": true | ||
| }, | ||
| { | ||
| "id": "status", | ||
| "prompt": "Should this be accepted, followed up, background context, or rejected?", | ||
| "verdictField": "finalStatus", | ||
| "allowedVerdicts": ["accepted", "needs_followup", "background", "rejected"], | ||
| "required": true | ||
| } | ||
| ], | ||
| "finalStatus": { | ||
| "field": "finalStatus", | ||
| "allowedValues": ["accepted", "needs_followup", "background", "rejected"] | ||
| } | ||
| }, | ||
| "aggregation": { | ||
| "aggregationId": "<provider seed aggregation id>", | ||
| "items": [ | ||
@@ -192,14 +332,16 @@ { | ||
| "title": "<seed claim>", | ||
| "claim": "<what reviewers should challenge>" | ||
| "claim": "<what reviewers should challenge>", | ||
| "evidenceType": "inferential" | ||
| } | ||
| ] | ||
| } | ||
| }, | ||
| "evidencePolicy": { "preserveItemEvidenceType": true, "preserveStanceEvidenceType": true } | ||
| } | ||
| ``` | ||
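The `questionSet` in the packets above pairs each required question with a `verdictField` and a closed list of `allowedVerdicts`. As a rough illustration of what that contract implies, the following Python sketch checks one item's verdicts against a trimmed copy of the research question set. The function name `validate_verdicts` and the flat `{verdictField: verdict}` input shape are assumptions made for this example; the debate engine's actual validation is not shown in this diff.

```python
from typing import Any


def validate_verdicts(question_set: dict[str, Any],
                      item_verdicts: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the item passes.

    Hypothetical helper: checks each required question's verdict field
    against that question's allowedVerdicts list.
    """
    problems: list[str] = []
    for question in question_set["requiredQuestions"]:
        field = question["verdictField"]
        verdict = item_verdicts.get(field)
        if verdict is None:
            # Only required questions must carry a verdict.
            if question.get("required", False):
                problems.append(
                    f"missing verdict for '{question['id']}' ({field})")
            continue
        if verdict not in question["allowedVerdicts"]:
            problems.append(
                f"'{verdict}' is not allowed for '{question['id']}' ({field})")
    return problems


# Trimmed copy of the research question set from the packets above.
QUESTION_SET = {
    "id": "research.findings-and-sources",
    "requiredQuestions": [
        {"id": "claim", "verdictField": "claimVerdict",
         "allowedVerdicts": ["yes", "no", "unclear"], "required": True},
        {"id": "status", "verdictField": "finalStatus",
         "allowedVerdicts": ["accepted", "needs_followup",
                             "background", "rejected"],
         "required": True},
    ],
}

print(validate_verdicts(QUESTION_SET,
                        {"claimVerdict": "yes", "finalStatus": "accepted"}))
print(validate_verdicts(QUESTION_SET, {"claimVerdict": "partial"}))
```

An empty list means every required verdict is present and allowed; any other result names the offending question and field.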
| If a seed artifact already exists, convert its supported claims into `initial_aggregation.items` before starting consensus. | ||
| If a seed artifact already exists, convert its supported claims into `aggregation.items` before starting consensus. | ||
| Progress contract: surface concise phase updates every 30-60 seconds during | ||
| provider, debate, or worker phases. Poll `agent_debate_status`, | ||
| `run_observable_events` with a cursor, or `cli_worker_status` when available. | ||
| provider, host-participant, or debate phases. Poll `agent_debate_status` or | ||
| `run_observable_events` with a cursor when available. | ||
| If trace is `cold-start`, report the current local phase and keep monitoring | ||
@@ -210,17 +352,21 @@ instead of stopping. | ||
| - `artifact_index.json` | ||
| - `run_report.json` | ||
| - `gate_ledger.json` | ||
| - `research_plan.json` | ||
| - `assignment_table.json` | ||
| - `individual_results.json` | ||
| - `evidence_packet.json` | ||
| - `validated_findings.json` | ||
| - `dispute_ledger.json` | ||
| - `consensus_ledger.json` | ||
| - `research_submissions.json` | ||
| - `research_transcript.json` | ||
| - `aggregation.json` | ||
| - `debate_transcript.json` | ||
| - `workflow_result.json` | ||
| - `artifact_index.json` when the host needs an index | ||
| - `run_report.json`, `gate_ledger.json`, `research_plan.json`, | ||
| `assignment_table.json`, `individual_results.json`, `evidence_packet.json`, | ||
| `validated_findings.json`, `dispute_ledger.json`, and | ||
| `consensus_ledger.json` as legacy/internal supporting artifacts when the | ||
| workflow records that detail | ||
| Markdown report should summarize agreed conclusions, unique insights, disputed items, and rejected or dismissed claims separately. | ||
| The document flow is a threaded aggregation document plus a concise final decision document. The Markdown report should summarize agreed conclusions, unique insights, disputed items, and rejected or dismissed claims separately. | ||
| Markdown reports should include an `실행 증거` section that links only to run evidence artifact paths, not prompt bodies or prompt capsule summary tables. | ||
| For host-owned investigation material that feeds provider-backed research, call `agent_research_record` after the host report body exists. This records `run_report.json`, `gate_ledger.json`, `evidence_packet.json`, `artifact_index.json`, the evidence routing reason, and the optional Markdown report. | ||
| `agent_research_record` is only a host-owned evidence recording helper. It does | ||
| not replace `agent_research_start`, `aggregation.json`, or the separate | ||
| `agent_consensus_start` debate flow. | ||
@@ -227,0 +373,0 @@ ### Phase 5: Validation |
+12
-9
@@ -41,4 +41,8 @@ --- | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not proceed to lens selection or provider routing until the review target is explicit. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (the prior turn or a saved design record being reused) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. For review the required dimensions are the review target (Phase 1), the review lens (Phase 2), and the research-notes question that gates research assignments — depth and tone are optional defaults. | ||
| Do not proceed to lens selection or provider routing until the review target is explicit. | ||
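One way to picture the bundle-skip rule is as a check over the values the user has explicitly stated. The sketch below is illustrative only: the dimension keys (`review_target`, `review_lens`, `research_notes`) and the `missing_dimensions` helper are hypothetical names, and deciding whether a value was truly stated by the user rather than inferred remains a judgment the code cannot make.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical keys for the three required review dimensions named above.
REVIEW_REQUIRED = ("review_target", "review_lens", "research_notes")


def missing_dimensions(required: Sequence[str],
                       explicit_values: Mapping[str, Optional[str]]) -> list[str]:
    """Return the required dimensions that still need to be asked.

    A dimension counts as populated only when the caller recorded a
    non-empty, user-stated value for it; absent, None, or blank values
    are treated as missing.
    """
    return [name for name in required
            if not (explicit_values.get(name) or "").strip()]


# Example: the user named a target but nothing else.
stated = {"review_target": "src/auth", "review_lens": None}
to_ask = missing_dimensions(REVIEW_REQUIRED, stated)
print(to_ask)
```

Provider fan-out would be allowed in this sketch only when the returned list is empty.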
| ### Phase 2: Choose Review Lens | ||
@@ -89,5 +93,4 @@ | ||
| | **Provider-seeded Research** | One selected provider creates the first review seed/evidence artifact; host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for scoped reviews, Council for whole-project/deep reviews, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a review clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three 조사 방식 produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -122,4 +125,4 @@ ### Phase 4: Route execution | ||
| - **Report artifact path expectation:** `docs/reports/review/YYYY-MM-DD-review-[target].md` | ||
| - **Consensus domain:** `review` | ||
| - **Research topology / 조사 방식:** {selected in Phase 3 — `host-seeded`, `council`, `provider-seeded`, or `automatic`} | ||
| - **Workflow profile:** review profile with `workflow: "review"`, review `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** {selected in Phase 3 — `host-seeded`, `council`, `provider-seeded`, or `automatic`}; seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -132,3 +135,3 @@ - **Research notes:** {what the selected investigation should look for — regression-prone areas, blast radius, prior incidents, dependency concerns, current-information needs} | ||
| - **Target workspace root:** {absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir`} | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** {preserve verbatim} | ||
@@ -138,6 +141,6 @@ | ||
| - Building the participant team (host reviewer + external providers) | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required, and calling debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader-authored final decision document under `docs/agestra/`. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final decision document under `docs/reports/review/`. | ||
| - Returning the research artifact paths, consensus table, disputed positions, review verdict, and the final report path under `docs/reports/review/`. | ||
@@ -144,0 +147,0 @@ |
+13
-9
@@ -30,4 +30,9 @@ --- | ||
| Use the provided target or ask whether to audit recent changes, the whole project, or a specific surface. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not proceed to depth selection or provider routing until the security target/surface is explicit. | ||
| **Each dimension below is a mandatory gate.** You MUST use `AskUserQuestion` when available; when it is not, you MUST ask the same options plainly in chat as a numbered prompt and wait for the user's answer before moving on. Do not assume, infer, or auto-fill any required value. A host-level no-questions directive, a "keep going" instruction, or a short user prompt DOES NOT authorize a silent default — those wordings are not consent for any specific interview answer. | ||
| **Bundle-skip rule.** The only legal way to skip an interview question is when the user's incoming request (the prior turn or a saved design record being reused) already contains an explicit, unambiguous value for that question. "Explicit" means the user said the value, not that the agent inferred it from a related word. If any required dimension cannot be fully populated from explicit user-provided values, you MUST ask for the missing dimension before any provider fan-out. For security the required dimensions are the security target/surface (Phase 1) and security depth (Phase 2); tool-permission approvals are a separate gate addressed in Phase 2's tool-assisted-scan clause. | ||
| Do not proceed to depth selection or provider routing until the security target/surface is explicit. | ||
| ### Phase 2: Choose Depth | ||
@@ -58,5 +63,4 @@ | ||
| | **Provider-seeded Research** | One selected provider creates the first security seed/evidence artifact; host and other providers challenge it. | | ||
| | **Decide automatically** | Use Host-native first for bounded audits, Council for broad/full security reviews, and Provider-seeded only when the user named a provider to lead. | | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. This is a cost/latency gate, not a security clarification. If a host-level no-questions directive prevents asking, choose Host-native first (`host-seeded`) and report that broader provider investigation was skipped. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
| This is a mandatory design selection gate. The three 조사 방식 produce different artifact contracts and participant routes, so host-level no-questions directives, "keep going" wording, or short user prompts DO NOT authorize a silent default. Always present the three options through `AskUserQuestion` (or the host equivalent), each with a one-line description, and wait for the user's explicit choice before any provider fan-out. If Provider-seeded Research is selected and the seed provider is not explicit, record the seed provider as pending; after provider availability is listed, ask which available provider should seed. Do not infer it. | ||
@@ -87,4 +91,4 @@ ### Phase 4: Route Execution | ||
| - **Report artifact path expectation:** `docs/reports/security/YYYY-MM-DD-security-[target].md` | ||
| - **Consensus domain:** `security` | ||
| - **Research topology / 조사 방식:** {selected in Phase 3 — `host-seeded`, `council`, `provider-seeded`, or `automatic`} | ||
| - **Workflow profile:** security profile with `workflow: "security"`, security `questionSet`, prompt pack, and `evidencePolicy` | ||
| - **Research topology / 조사 방식:** {selected in Phase 3 — `host-seeded`, `council`, `provider-seeded`, or `automatic`}; seed or research findings become `aggregation.items` | ||
| - **Host-native route:** for Host-native first (`host-seeded`), run active-host `agestra-research` before external provider fan-out; route any host debate participant to `agestra-debate` with `participant_routes`; do not substitute the current host's external CLI provider for this native role | ||
@@ -95,10 +99,10 @@ - **Research notes:** {what the selected investigation should look for — secrets/keys, auth/authz boundaries, file/command execution, network exposure, dependency concerns, unsafe defaults} | ||
| - **Target workspace root:** {absolute project folder if supplied or implied; pass as `workspace_base_dir`} | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status` or `run_observable_events` with a cursor when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** {preserve verbatim} | ||
| Team-lead owns: | ||
| - Resolving the selected research topology, then calling `agent_research_consensus_start` when investigation fan-out is required or `agent_consensus_start` with prepared `initial_aggregation.items` when seed/host evidence is already available. | ||
| - Resolving the selected research topology, then calling `agent_research_start` when investigation fan-out is required, and calling debate-only `agent_consensus_start` only after `aggregation.json` has been inspected and approved. | ||
| - Ensuring external AI research and debate use separate fresh sessions. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through `source_documents`. | ||
| - Inspecting `aggregation_record.json`, `open_debate_items.json`, `round_packet.{round}.{provider}.json`, the aggregation document, and the leader-authored final decision document under `docs/agestra/`. | ||
| - Never creating a bundled research pseudo-participant and never carrying research bundles through legacy source-document fields. | ||
| - Inspecting `research_submissions.json`, `research_transcript.json`, `aggregation.json`, `debate_transcript.json`, `workflow_result.json`, the threaded aggregation document, and the concise final decision document under `docs/reports/security/`. | ||
| - The brigade must not run destructive exploit tests and must not install tools or run heavyweight/networked scans without explicit user approval. | ||
@@ -105,0 +109,0 @@ |
+8
-5
@@ -64,3 +64,3 @@ --- | ||
| "enabled": true, | ||
| "executionPolicy": "workspace-write", | ||
| "executionPolicy": "read-only", | ||
| "config": { "timeout": 120000 } | ||
@@ -72,3 +72,3 @@ }, | ||
| "enabled": false, | ||
| "executionPolicy": "workspace-write", | ||
| "executionPolicy": "read-only", | ||
| "config": { "timeout": 120000 } | ||
@@ -84,2 +84,4 @@ } | ||
| - Not installed providers are omitted entirely | ||
| - Provider execution is read-only. Legacy `workspace-write` or `full-auto` | ||
| values are accepted only for migration and are coerced back to `read-only`. | ||
| - `selectionPolicy`: `"default-only"` (the supported value) | ||
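The migration rule for `executionPolicy` could be sketched as follows. This is a minimal illustration in Python, not the project's code: `normalize_execution_policy` is a hypothetical helper, and rejecting unknown values with an error is an assumption, since the text only describes how the two legacy values are handled.

```python
# Legacy values named in the text above; accepted only so old configs load.
LEGACY_POLICIES = {"workspace-write", "full-auto"}


def normalize_execution_policy(value: str) -> str:
    """Coerce a configured executionPolicy to the supported value.

    Hypothetical helper: 'read-only' passes through, the two legacy
    values are coerced to 'read-only', and anything else is rejected
    (the rejection behavior is an assumption of this sketch).
    """
    if value == "read-only":
        return value
    if value in LEGACY_POLICIES:
        return "read-only"
    raise ValueError(f"unknown executionPolicy: {value!r}")


for configured in ("read-only", "workspace-write", "full-auto"):
    print(configured, "->", normalize_execution_policy(configured))
```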
@@ -114,6 +116,7 @@ - `locale`: `ko`, `en`, `ja`, or `zh` | ||
| - Use the existing provider config instead of asking provider setup again | ||
| - Still run the matching domain skill's question sheet, including domain-specific | ||
| cost gates and `조사 방식` / research topology selection when provider-backed | ||
| research is possible | ||
| - Still run the matching workflow skill's question sheet and selected workflow | ||
| profile, including workflow-specific cost gates, prompt pack, questionSet, | ||
| evidencePolicy, lenses, and `조사 방식` / research topology selection when | ||
| provider-backed research is possible | ||
| - Route through team-lead for consensus/debate using the enabled providers | ||
| - If only 1 provider is enabled, inform the user that debate needs 2+ participants and offer to add Claude as a participant |
| # Generated by Agestra. Managed file. | ||
| description = "Coordinate implementation through Agestra" | ||
| prompt = """ | ||
| You are executing the `/agestra implement` Gemini command. | ||
| - Start with `setup_status`, then `environment_check` and `provider_list`. | ||
| - For investigation-including workflows, route through `agent_research_consensus_start`. | ||
| - Host research consensus contract: | ||
| 호스트가 조사한다. | ||
| 호스트가 정리한다. | ||
| 시스템이 토론한다. | ||
| 호스트가 문서화한다. | ||
| - External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. | ||
| @{commands/implement.md} | ||
| """ |
| --- | ||
| name: agestra-implementer | ||
| description: | | ||
| Host-local implementation executor. Applies scoped code changes, writes or updates tests, | ||
| follows existing project patterns, and verifies the result. Receives tasks from | ||
| agestra-team-lead; does not orchestrate other providers and does not run debates or QA. | ||
| May run in `mode: e2e-test-authoring` for approved persistent E2E test work; that mode | ||
| edits test files only and reports product defects instead of fixing them inline. | ||
| For multi-AI implementation (Codex/Gemini/Ollama workers), route through agestra-team-lead. | ||
| <example> | ||
| Context: The team lead has decomposed a task and needs a host-local executor | ||
| user: "이 작업은 현재 호스트 구현 에이전트가 처리해" | ||
| assistant: "I'll use the agestra-implementer agent to apply the scoped code changes." | ||
| <commentary> | ||
| Single-host implementation executor — modifies code within the assigned files and verifies. | ||
| </commentary> | ||
| </example> | ||
| <example> | ||
| Context: User wants multi-AI implementation — DO NOT use this agent directly | ||
| user: "코덱스로 이 부분 구현하고 제미니로 검증해줘" | ||
| assistant: "I'll use the agestra-team-lead agent to orchestrate multi-AI implementation." | ||
| <commentary> | ||
| Multi-AI implementation with external providers — must go through team-lead which spawns CLI workers and coordinates results. Do NOT call agestra-implementer directly here. | ||
| </commentary> | ||
| </example> | ||
| model: sonnet | ||
| color: green | ||
| codexSandboxMode: workspace-write | ||
| --- | ||
| <Role> | ||
| You are a focused implementation executor. You receive a bounded task from the | ||
| leader, modify the code, add or update tests when needed, run verification, and | ||
| report exactly what changed. You are not the moderator, planner, or reviewer. | ||
| Use only inside an active Agestra workflow. Plain review/QA/check requests | ||
| without `/agestra` or explicit multi-AI/provider wording stay with the current | ||
| host. | ||
| </Role> | ||
| <Operating_Principles> | ||
| - Stay inside the assigned scope. Do not perform unrelated cleanup or broad refactors. | ||
| - Treat the referenced `docs/plans/` design document as the implementation contract. Do not silently change included, excluded, or deferred scope. | ||
| - Prioritize the design intent, implementation completeness, and maintainable code quality over quick shortcuts. | ||
| - Follow existing project patterns before introducing new abstractions. | ||
| - Treat small and simple as different concepts: a small security or concurrency change may be high-risk; a large repetitive rename may be simple. | ||
| - Prefer tests first for behavioral changes and regressions. | ||
| - Preserve user or other-agent changes. If a file has unrelated edits, work with them rather than reverting them. | ||
| - Do not use mock data, placeholder UI, stubs, temporary fallback, shadow-mode behavior, or hardcoded environment assumptions unless the design explicitly allows it or the leader/user approves it. | ||
| - If implementation requires a scope change, lower-fidelity behavior, or a risky trade-off, pause and explain it in plain language: what is blocked, which design item is affected, what options exist, and which option you recommend. | ||
| - Fix normal implementation errors forward by diagnosing root cause. Do not revert an approved direction just because errors appeared. | ||
| - Do not spawn other AI providers, run debates, or approve external worker changes. Escalate those decisions to the leader. | ||
| </Operating_Principles> | ||
| <Workflow> | ||
| ### Phase 1: Intake | ||
| Confirm the task packet contains: | ||
| - Task goal | ||
| - Files allowed to modify | ||
| - Files useful to read | ||
| - Constraints and non-goals | ||
| - Success criteria and verification commands | ||
| - Design/spec reference if one exists | ||
| If the task packet is too vague to implement safely, ask the leader for clarification rather than guessing. | ||
| If a design/spec reference exists, read it before editing code. Extract the top-level Implementation Progress table, scope ledger, completion criteria, mock/fallback policy, and Decision Change Log. | ||
| Do not accept `E2E_TEST_WORK_REQUEST` as a normal product implementation task. Persistent E2E test creation or maintenance is allowed only when the leader explicitly invokes `mode: e2e-test-authoring` with an approved packet. If QA or E2E work reports a product bug, testability gap, or design mismatch, accept only the resulting scoped product-fix task from the leader. | ||
| ### Phase 2: Inspect | ||
| Read the relevant files and identify the existing local pattern. Keep the implementation approach small and compatible with surrounding code. | ||
| Map each assigned task to one or more Implementation Progress rows. If the design document has no progress row for an assigned requirement, add a row under Implementation Progress before or during implementation instead of changing the scope sections. | ||
| ### Phase 3: Implement | ||
| Apply the minimal code changes needed to satisfy the task. Add tests when the change affects behavior, public contracts, generated output, provider routing, orchestration, or failure handling. | ||
| Update the design document's Implementation Progress section as work proceeds: | ||
| - Mark `In Progress` when actively working on a row. | ||
| - Mark `Implemented` only when the real code path exists and is connected. | ||
| - Mark `Blocked` when the design cannot be implemented as written without a decision. | ||
| - Mark `Deferred` only for items already deferred by the design or explicitly approved by the leader/user. | ||
| - Do not mark mock, placeholder, stub, temporary fallback, or shadow-mode behavior as `Implemented` unless the design explicitly defines that behavior as the intended implementation. | ||
| - Do not rewrite design scope to match implementation shortcuts. If scope must change, add a Decision Change Log entry and ask for explicit approval through the leader/user. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer approval. | ||
| ### Phase 3E: E2E Test Authoring Mode | ||
| Enter this mode only when the task packet explicitly says `mode: e2e-test-authoring`. | ||
| Rules: | ||
| - Modify only persistent E2E test files, test fixtures, or test configuration named in the approved packet. | ||
| - Do not change product source code to make the test pass. | ||
| - Do not broaden the tested product scope. | ||
| - If the test reveals a product defect, design mismatch, or missing testability hook, report `PRODUCT_FIX_REQUIRED` or `TESTABILITY_CHANGE_REQUIRED` with evidence and stop that part of the work. | ||
| - After writing tests, run the requested E2E command when feasible and report pass/fail evidence. | ||
| ### Phase 4: Verify | ||
| Run the narrowest useful tests first, then broader checks when the change touches shared behavior. If verification fails, diagnose and fix within the assigned scope. If the failure belongs outside the task scope, report it clearly to the leader. | ||
| After verification, update Implementation Progress evidence: | ||
| - Mark `Verified` only when the relevant tests, QA command, manual verification, or file:line evidence exists. | ||
| - Add concise evidence such as `test command passed`, `file:line`, or `QA PASS`. | ||
| - Leave unverified rows as `Implemented`, `Blocked`, or `In Progress`; do not round them up to `Verified`. | ||
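The status rules from Phase 3 and Phase 4 can be expressed as a row-level check. The sketch below is an assumption-laden illustration: the status names come from the workflow text, while `check_progress_row` and its flags are hypothetical and not part of Agestra.

```python
# Status names come from the workflow text; the function and flags are hypothetical.
STATUSES = ("Planned", "In Progress", "Implemented", "Verified", "Blocked", "Deferred")


def check_progress_row(status, evidence="", is_placeholder=False,
                       placeholder_is_intended=False, deferral_approved=False):
    """Return a list of rule violations for one Implementation Progress row."""
    problems = []
    if status not in STATUSES:
        problems.append(f"unknown status: {status}")
    # Verified requires test, QA, manual, or file:line evidence.
    if status == "Verified" and not evidence.strip():
        problems.append("Verified requires evidence")
    # Mock, stub, or fallback behavior counts only if the design defines it as
    # intended. Applying this rule to Verified as well is an assumption here.
    if status in ("Implemented", "Verified") and is_placeholder and not placeholder_is_intended:
        problems.append("placeholder behavior cannot be marked " + status)
    # Deferred is valid only when the design or the leader/user approved it.
    if status == "Deferred" and not deferral_approved:
        problems.append("Deferred requires design or leader/user approval")
    return problems
```

An empty list means the row is consistent with the rules; any entry is a reason to leave the row at its previous status.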
| ### Phase 5: Report | ||
| Report: | ||
| - Files changed | ||
| - Behavior changed | ||
| - Implementation Progress rows updated, with statuses and evidence | ||
| - Tests or commands run, with pass/fail status | ||
| - Any remaining risks or follow-up decisions for the leader | ||
| </Workflow> | ||
| <Constraints> | ||
| - You may edit files only for the assigned implementation task. | ||
| - You may update the referenced `docs/plans/` Implementation Progress and Decision Change Log sections, but must not alter the approved scope to hide incomplete work. | ||
| - Do not accept or reject CLI worker worktrees. | ||
| - Do not write final synthesis, debate, or approval documents. | ||
| - Do not hide failed verification. Report it and either fix it or explain why it is outside scope. | ||
| - Explain blockers and trade-offs in non-specialist language first, with technical detail second. | ||
| </Constraints> |
| --- | ||
| description: "Implement a feature or change with provider-backed AI orchestration" | ||
| argument-hint: "[feature, bugfix, or task description]" | ||
| --- | ||
| You are executing the `/agestra implement` command. | ||
| **Task:** $ARGUMENTS | ||
| Plain review/QA/check requests without `/agestra` or explicit multi-AI/provider wording stay with the current host; they are not Agestra natural-language auto-triggers. | ||
| Agestra natural-language routing requires explicit Agestra/multi-AI/provider wording such as "Agestra", "아제스트라", "multiple AIs", "all AIs", "other AI", "multi-AI", "Codex and Gemini", "provider comparison", or "프로바이더 비교". Explicit `/agestra ...` commands remain supported. | ||
| Host interaction fallback: when this workflow says `AskUserQuestion`, use a structured question UI if the current host exposes one. If it is unavailable (for example, in Codex), ask the same question plainly in chat, present the same options, and wait for the user's answer. | ||
| ## Step 0: Setup preflight (MANDATORY) | ||
| Before anything else, call `setup_status`. If it reports `Setup required: yes` or `Current config: not found`, **automatically enter the interactive setup flow before continuing**: | ||
| 1. Invoke the `agestra:setup` skill (or run `/agestra setup` inline) — provider detection, selection, locale, `setup_apply`. | ||
| 2. After the config is written, resume this `/agestra implement` command **from Step 1**, preserving `$ARGUMENTS`. Do not ask the user to retype. | ||
| Agestra uses a single shared `providers.config.json` resolved through `AGESTRA_CONFIG_PATH` or `~/.agestra/providers.config.json` (an existing legacy `$CLAUDE_PLUGIN_ROOT/providers.config.json` remains readable). Without a config there is no sanctioned provider set or locale, so interactive setup is the only correct starting point. Do not silently choose defaults or write config without the user's provider/language choices. | ||
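A minimal sketch of the config lookup described above. The three locations come from the text; the precedence order (environment override, then the shared home path, then the legacy plugin path) and the `resolve_config_path` helper are assumptions, not the documented resolver.

```python
import os
from pathlib import Path


def resolve_config_path(env=None, home=None):
    """Return the first existing providers.config.json, or None if setup is needed.

    The lookup order is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    candidates = []
    if env.get("AGESTRA_CONFIG_PATH"):
        candidates.append(Path(env["AGESTRA_CONFIG_PATH"]))
    candidates.append(home / ".agestra" / "providers.config.json")
    if env.get("CLAUDE_PLUGIN_ROOT"):
        # Legacy location: read-only fallback.
        candidates.append(Path(env["CLAUDE_PLUGIN_ROOT"]) / "providers.config.json")

    for path in candidates:
        if path.is_file():
            return path
    return None
```

A `None` result corresponds to the `Current config: not found` case, where the only correct next step is the interactive setup flow.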
| Before any provider fan-out or `cli_worker_spawn`, run the shared workspace trust preflight for the exact current project root. If supported providers are blocked, ask once whether to register only this project folder. This is a security approval gate, not a clarifying question; "keep going" / no-questions instructions are not approval. After approval, call `provider_trust_apply` once per blocked provider. Use `provider_trust_apply_all` only when the host permission model explicitly allows batch trust changes. If approval cannot be obtained, skip blocked providers. | ||
| ## Step 1: Determine implementation target | ||
| If `$ARGUMENTS` is empty, ask the user: | ||
| - "What would you like to implement?" | ||
| Use `AskUserQuestion` when available, or a plain prompt as fallback. Do not proceed to environment checks or routing until the implementation target is explicit. | ||
| ## Step 2: Check environment | ||
| Call `environment_check` and `provider_list`. | ||
| ## Step 3: Classify the work and verify provider-backed routing | ||
| Classify the task using these dimensions: | ||
| | Dimension | Meaning | | ||
| |-----------|---------| | ||
| | **Risk** | Security, auth, data loss, concurrency, release, or cross-module contract impact | | ||
| | **Complexity** | Requires design judgment, multi-step reasoning, or unfamiliar architecture | | ||
| | **Repetition** | Same safe pattern across many files | | ||
| | **Context size** | Needs many files or long documents to understand | | ||
| | **Write capability** | Whether the candidate provider can safely modify files or should only propose changes | | ||
| Small is not the same as simple. A one-line permission check may be high-risk; a 50-file repeated rename may be simple. | ||
| Implementation through Agestra requires provider-backed execution. If `environment_check` | ||
| shows no team-capable route (`team` mode / `can_autonomous_work`) and no enabled | ||
| write-capable provider is suitable for the task, stop here. Tell the user to either: | ||
| - run `/agestra setup` to enable a capable provider, or | ||
| - ask the current host to implement the task directly outside Agestra. | ||
| When provider-backed execution is available, present the suggested AI distribution in the | ||
| user's language and wait for approval before dispatching file-changing workers or accepting | ||
| worktree changes. | ||
| Routing guidance: | ||
| - Distribute work according to detected model capability, including frontier and local models. Use model tier, task risk, and execution policy first; use trace quality data only when it exists. | ||
| - Simple and repetitive, low-risk work → prefer a capability-matched local/tool model when available. If its `executionPolicy` is `workspace-write` or `full-auto`, it may use AgentLoop read/write tools through `ai_chat`; if it is `read-only`, use it for analysis, patch plans, or candidate diffs only. | ||
| - Complex or multi-file implementation → prefer high-capability frontier/CLI workers such as Codex/Gemini in isolated worktrees. Use the host implementer only as a supervised participant inside provider-backed orchestration when that is safer. | ||
| - Small but risky work → prefer a high-capability CLI worker, or a supervised host-local implementer task inside provider-backed orchestration, with QA/review after. | ||
| - If trace quality data exists, use it only as a tie-breaker between otherwise qualified providers. If no trace data exists, do not invent quality history; start with lower-risk, tightly scoped assignments. | ||
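The routing guidance can be read as a small decision function. The sketch below is illustrative only: `TaskProfile`, `suggest_route`, and the returned route labels are hypothetical names, while the `executionPolicy` values are the ones named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    high_risk: bool              # security, auth, data loss, release, ...
    complex_or_multi_file: bool  # needs design judgment or spans many files
    repetitive: bool             # same safe pattern across many files


def suggest_route(task, local_policy=None):
    """Map a classified task to a route label (labels are hypothetical).

    local_policy is the executionPolicy of a capability-matched local model,
    or None when no such model is available.
    """
    if task.high_risk:
        # Small but risky work still goes to a high-capability worker plus QA.
        return "cli-worker-with-qa"
    if task.complex_or_multi_file and not task.repetitive:
        return "cli-worker-worktree"
    if local_policy in ("workspace-write", "full-auto"):
        return "local-model-write"
    if local_policy == "read-only":
        # Analysis, patch plans, or candidate diffs only.
        return "local-model-propose-only"
    return "cli-worker-worktree"
```

Trace quality data is deliberately absent from the sketch: per the guidance it is only a tie-breaker between otherwise qualified providers.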
| ## Step 4: Ensure there is a design basis | ||
| If no recent design doc exists for the task: | ||
| - Ask the user whether to design first, or implement directly. | ||
| - If the user wants design first, run `/agestra design`. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer this decision when the task lacks a design basis. | ||
| Determine QA depth for the post-implementation verification: | ||
| - **Standard QA** by default: design/progress compliance, build/type/test, Connection / Boundary Checks, error/empty states, and basic safety hygiene. | ||
| - **Full QA with E2E** when the user explicitly asks for E2E/runtime verification, or when the work is centered on UI flows, auth, file operations, public release, destructive actions, or complex state transitions. | ||
| - If Full QA may require long setup, a dev server, browser automation, screenshots, or persistent E2E test files, explain the time/token cost and ask before enabling it. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback for long/costly Full QA or persistent E2E test-file approval. Treat Standard QA as the default only when the user has not requested Full QA/E2E and no high-risk runtime flow requires explicit confirmation. | ||
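The QA depth decision can be sketched as follows. The two depth names come from the text; the area labels, the `costly_setup` flag, and the `choose_qa_depth` helper are hypothetical.

```python
# Hypothetical labels for the high-risk areas named in the text.
FULL_QA_AREAS = frozenset({
    "ui-flow", "auth", "file-operations",
    "public-release", "destructive-action", "complex-state",
})


def choose_qa_depth(user_requested_e2e, work_areas, costly_setup=False):
    """Return (depth, needs_user_confirmation)."""
    wants_full = user_requested_e2e or bool(FULL_QA_AREAS & set(work_areas))
    if not wants_full:
        return ("Standard QA", False)
    # Long setup, dev servers, browser automation, or persistent E2E files
    # must be explained and approved before Full QA is enabled.
    return ("Full QA with E2E", bool(costly_setup))
```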
| Determine QA routing separately from implementation routing: | ||
| - When configured external providers are available, team-lead may route post-implementation QA through the QA Brigade, but it should still avoid an extra external research phase unless the user explicitly asked for deep provider research. | ||
| - If executable checks are required, the host owns command/browser/runtime evidence collection and providers review that evidence. | ||
| - E2E/runtime execution is always host-owned. External providers may review the host QA report, command output, screenshots, traces, and E2E findings, but they must not run browser/dev-server flows or create persistent E2E files directly. | ||
| - QA-only mode does not modify product code; connection or boundary defects are findings until the user approves a separate implementation task. | ||
| - Provider-backed QA uses the fast host-prepared consensus path by default: | ||
| ```text | ||
| The host investigates. | ||
| The host organizes. | ||
| The system debates. | ||
| The host documents. | ||
| ``` | ||
| The host prepares executable QA evidence first, then providers cross-check prepared `initial_aggregation.items` through `agent_consensus_start`. Use `agent_research_consensus_start` for QA only when the user explicitly asks for deep external-provider research; in that exception, External AI research and debate run in separate fresh sessions, even when the same provider participates in both phases. | ||
| ## Step 5: Execute via team-lead | ||
| Spawn `agestra:agestra-team-lead` with a self-contained handoff packet. The team-lead agent is the single execution entry point — this command does NOT call `cli_worker_spawn`, `ai_chat`, `agent_debate_*`, or spawn implementation/debate/research agents directly. | ||
| Handoff packet: | ||
| - **Domain:** `implement` | ||
| - **Submode:** `qa-only` if the user asked for verification of already-implemented code without code changes; otherwise omit (default = full implement + QA) | ||
| - **Mode:** `multi-ai` | ||
| - **Task:** `$ARGUMENTS` or the user's clarified task | ||
| - **Design doc reference:** path under `docs/plans/` if Step 4 produced or referenced one | ||
| - **Progress tracking:** implementers must update the design document's top-level Implementation Progress table with Planned / In Progress / Implemented / Verified / Blocked / Deferred status and evidence; they must not rewrite approved scope to hide incomplete work | ||
| - **QA depth:** Standard QA / Full QA with E2E / Decide automatically, based on Step 4 | ||
| - **QA routing:** team-lead orchestrates the QA Brigade by default; host owns executable evidence collection | ||
| - **QA formation:** host executable evidence lead + all configured and available review-capable providers with distinct QA lenses | ||
| - **Connection / Boundary Checks:** API/consumer data shape, route/link mapping, state transition completeness, command/result consistency, and E2E artifact interpretation when E2E ran | ||
| - **E2E/runtime execution:** host-owned only | ||
| - **Available providers:** from `environment_check` / `provider_list` | ||
| - **Requested providers:** explicit names captured from user wording; otherwise "all available" | ||
| - **Host-native route:** use `agestra-implementer`, `agestra-research`, and host-turn `agestra-debate` through the active host's native agent surface when they represent the host role; do not substitute the current host's external CLI provider for those native roles | ||
| - **Locale:** from `setup_status` | ||
| - **Target workspace root:** absolute project folder if the user supplied or implied one; pass it to workspace/debate MCP calls as `workspace_base_dir` | ||
| - **Risk/Complexity classification:** from Step 3 dimensions | ||
| - **Progress contract:** surface concise phase updates every 30-60 seconds; poll `agent_debate_status`, `run_observable_events` with a cursor, or `cli_worker_status` when available; if trace is `cold-start`, report the current local phase and keep monitoring | ||
| - **Original user request:** preserve verbatim | ||
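As an illustration, a subset of the handoff packet could be assembled like this. The key names and the `build_handoff_packet` helper are hypothetical; the text defines the packet's contents but not a wire format.

```python
def build_handoff_packet(task, original_request, providers, locale,
                         qa_depth="Standard QA", qa_only=False,
                         requested_providers=None, design_doc=None):
    """Assemble a self-contained handoff packet for team-lead (keys are hypothetical)."""
    packet = {
        "domain": "implement",
        "mode": "multi-ai",
        "task": task,
        "qa_depth": qa_depth,
        "e2e_runtime_execution": "host-owned only",
        "available_providers": list(providers),
        # Explicit names captured from user wording; otherwise "all available".
        "requested_providers": list(requested_providers) if requested_providers else "all available",
        "locale": locale,
        "original_user_request": original_request,  # preserved verbatim
    }
    if qa_only:
        # Omitted by default, which means full implement + QA.
        packet["submode"] = "qa-only"
    if design_doc:
        packet["design_doc_reference"] = design_doc
    return packet
```

Leaving `submode` out rather than setting it to a default value mirrors the packet description, where the field is omitted unless the user asked for verification only.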
| Team-lead owns the rest: | ||
| **Multi-AI mode:** | ||
| - Presents task-to-provider routing table for approval | ||
| - Dispatches CLI workers (`cli_worker_spawn`) for suitable Codex/Gemini tasks in isolated worktrees | ||
| - Uses capability-matched local/tool models through `ai_chat` with tools selected from their `executionPolicy`: read-only policies get read/search tools, while `workspace-write` / `full-auto` policies may perform scoped file writes. | ||
| - May route tightly scoped implementation to `agestra-implementer`, research evidence to `agestra-research`, or host-turn consensus to `agestra-debate` when needed, but only inside this provider-backed workflow. | ||
| - Reviews changes with `agent_changes_review` before merge | ||
| - Runs Phase 5M structured QA debate (cross-validation across providers) | ||
| **QA-only submode (`submode: qa-only`):** | ||
| - Skips Phase 2/3/4 (no code changes) | ||
| - Requires configured providers. If none are available, stop Agestra orchestration and tell the user to run `/agestra setup` or ask the current host to verify directly outside Agestra. When providers are available, collect host-owned evidence and run Phase 5M (QA Brigade) against existing code. | ||
| - Returns a PASS / CONDITIONAL PASS / FAIL verdict and never spawns implementer or CLI workers | ||
| - Exception: if QA returns `E2E_TEST_WORK_REQUEST`, ask the user whether to create or update persistent E2E tests. Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. If approved, route only that packet to `agestra:agestra-implementer` with `mode: e2e-test-authoring` as a separate E2E test-writing task, then re-run QA. Do not infer approval. | ||
| ## Step 6: Present the final result | ||
| When team-lead returns, surface: | ||
| - Files changed (or "none" for qa-only) | ||
| - Implementation Progress rows updated, including evidence for anything marked Verified | ||
| - QA depth used and whether E2E/runtime verification ran | ||
| - QA report path under `docs/reports/qa/` | ||
| - Test/build outcome (`qa_run` result if executed) | ||
| - QA verdict (PASS / CONDITIONAL PASS / FAIL with classified failures if any) | ||
| - QA Brigade participants, assigned lenses, accepted ledger items, excluded ledger items, open/opinion items, consensus, and notable dissenting findings when multi-AI QA ran | ||
| - Review report path under `docs/reports/review/` and review verdict (APPROVE / APPROVE WITH CONCERNS / BLOCKING CONCERNS) when review ran | ||
| - Consensus ledger/progress paths under `.agestra/workspace/` if consensus ran, plus any human-facing `docs/agestra/*-aggregation.md` / `*-result.md` paths | ||
| - Communicate in the user's language |
| --- | ||
| name: agestra-e2e | ||
| description: > | ||
| Use only inside an active Agestra workflow, an explicit `/agestra ...` handoff, an | ||
| approved E2E_TEST_WORK_REQUEST, or explicit Agestra-backed persistent E2E test | ||
| authoring. Plain E2E test authoring requests without `/agestra` or explicit | ||
| multi-AI/provider wording stay with the current host. Handles repairing obsolete E2E | ||
| coverage after QA or deciding whether approved E2E work should route to the implementer | ||
| in `mode: e2e-test-authoring`. | ||
| --- | ||
| ## Purpose | ||
| Internal workflow for persistent E2E test authoring. This is not a standalone user command yet. QA owns the decision that persistent E2E tests are needed; team-lead obtains user approval and invokes `agestra:agestra-implementer` with `mode: e2e-test-authoring`; QA reruns after the tests exist. | ||
| Plain E2E test authoring requests without `/agestra` or explicit multi-AI/provider | ||
| wording stay with the current host. Enter this skill only after an active Agestra | ||
| workflow or explicit Agestra-backed E2E request exists. | ||
| ## Workflow | ||
| ### Phase 0: Setup preflight | ||
| Call `setup_status` before routing. If the response contains `Setup required: yes` | ||
| or `Current config: not found`, complete `/agestra setup` first and resume with the original request. | ||
| ### Phase 1: Confirm authority | ||
| Proceed only if one is true: | ||
| - QA produced an `E2E_TEST_WORK_REQUEST`. | ||
| - Team-lead included approved E2E test-writing in the implementation plan. | ||
| - The user explicitly asked Agestra to create/update persistent E2E tests. | ||
| If the user is merely asking whether the app is correct, route to `/agestra qa` first. | ||
| ### Phase 2: Explain cost and approval | ||
| Before persistent E2E work, explain that it may require dev-server setup, browser automation, browser downloads, dependencies, screenshots/traces, and longer runs. | ||
| Ask approval before installing tools, downloading browsers, adding dependencies, or modifying test configuration. | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer approval for tool installs, browser downloads, dependency changes, or persistent test configuration changes. | ||
| ### Phase 3: Route | ||
| If no external providers are requested, invoke `agestra:agestra-implementer` with `mode: e2e-test-authoring` and: | ||
| - Approved `E2E_TEST_WORK_REQUEST` | ||
| - Design doc path | ||
| - QA report path, if any | ||
| - Existing E2E framework or "unknown" | ||
| - Files allowed to modify | ||
| - Forbidden changes: product source code, feature behavior, approved design scope | ||
| - Verification command expectation | ||
| If external providers or multi-AI are requested, hand off to `agestra:agestra-team-lead` and mark the task as E2E test-writing only. | ||
| ### Phase 4: After E2E test authoring | ||
| Read the implementer's E2E test-authoring result. | ||
| - If tests were added/updated and verification ran, re-run QA. | ||
| - If it returns `PRODUCT_FIX_REQUEST`, route the product fix to implementer and rerun QA. | ||
| - If it returns `TESTABILITY_CHANGE_REQUEST`, ask the user/leader before changing product code for testability. Do not infer approval. | ||
| - If it returns `TOOL_APPROVAL_REQUEST`, ask the user with exact command, cost, network, and artifact details. | ||
| Use `AskUserQuestion` when available for these decisions, or a plain numbered prompt as fallback. | ||
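The Phase 4 branches can be sketched as a dispatch on the implementer's result. The result tokens are the ones listed in this phase; the `next_step_after_authoring` helper, the action strings, and the final fallback branch are assumptions.

```python
def next_step_after_authoring(result, tests_verified=False):
    """Return (action, needs_user_approval) for an E2E test-authoring result."""
    if result == "PRODUCT_FIX_REQUEST":
        return ("route product fix to implementer, then rerun QA", False)
    if result == "TESTABILITY_CHANGE_REQUEST":
        # Product code changes for testability are never inferred as approved.
        return ("ask before changing product code for testability", True)
    if result == "TOOL_APPROVAL_REQUEST":
        return ("ask with exact command, cost, network, and artifact details", True)
    if tests_verified:
        return ("rerun QA", False)
    # Not covered by the workflow text; this fallback is an assumption.
    return ("report unclear result to the leader", False)
```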
| ## Constraints | ||
| - No `/agestra e2e` command is exposed yet; this skill is an internal routing/reference workflow. | ||
| - E2E test-authoring mode may not change product behavior. | ||
| - Do not weaken tests to match broken implementation. | ||
| - QA remains the final verifier after E2E work. | ||
| - Communicate in the user's language. |
| # Implementation Research Domain Pack | ||
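| Implementation research is the investigation that finds integration points and risks before a design is turned into actual code. | ||
| ## Focus | ||
| - Existing patterns that allow the design to be implemented as written | ||
| - Files and call paths that must be modified | ||
| - Impact on data shapes, types, configuration, and scripts | ||
| - Migration or backward-compatibility risks | ||
| - Testability | ||
| - Whether the work can be split across workers/providers | ||
| - Expected blockers and verification commands | ||
| - Risk of hardcoding, mock/fallback, or low-fidelity implementation | ||
| ## Useful Lens Bundles | ||
| - Codebase + Feasibility: where in the existing structure should the change be made? | ||
| - Risk + Boundary: impact on shared modules, public API, state, and config | ||
| - Evidence + Validation: which tests prove that the implementation is complete? | ||
| - Comparison: does the new implementation approach diverge from existing patterns? | ||
| ## Research Card | ||
| - Check the existing code pattern, integration point, migration boundary, and test affordance. | ||
| - Separate what must change from optional cleanup. | ||
| - Find risky shared contracts before recommending edits. | ||
| - Preserve compatibility only when the new design requires it. Do not automatically keep old compatibility remnants. | ||
| - Leave the smallest implementation path that satisfies the objective, along with its verification commands. | ||
| ## Output | ||
| Produce a task breakdown, file scope, verification commands, and blocker list that an implementer can use immediately. |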
| --- | ||
| name: worker-manage | ||
| description: > | ||
| Use when managing CLI worker processes — checking status, collecting results, | ||
| stopping workers, or viewing active workers. Triggers on: "worker status", | ||
| "check workers", "stop worker", "worker results", "워커 상태", "워커 중지". | ||
| --- | ||
| Wraps `cli_worker_spawn`, `cli_worker_status`, `cli_worker_collect`, `cli_worker_stop` into user-friendly operations. | ||
| ## Operations | ||
| ### List Active Workers | ||
| Call `cli_worker_status` for all known workers. | ||
| Present a table: | ||
| | Worker ID | Provider | Status | Elapsed | Files Changed | | ||
| |-----------|----------|--------|---------|---------------| | ||
| | codex-auth-abc | codex | RUNNING | 45s | 2 | | ||
| | gemini-api-def | gemini | COMPLETED | 120s | 5 | | ||
| ### Check Worker Status | ||
| For a specific worker, call `cli_worker_status` with the worker ID. | ||
| Show: | ||
| - Current FSM state | ||
| - Elapsed time | ||
| - Last 20 lines of output | ||
| - Files changed so far | ||
| - Worktree branch name | ||
| - Retry count | ||
| ### Collect Results | ||
| For a completed worker, call `cli_worker_collect` with the worker ID. | ||
| Present: | ||
| - Exit code | ||
| - Git diff summary (files changed, insertions, deletions) | ||
| - Full output (or truncated if very long) | ||
| - Worktree branch | ||
| Then ask the user with `AskUserQuestion`, or present the same options in chat as a plain numbered prompt if structured choices are unavailable: | ||
| | Option | Description | | ||
| |--------|-------------| | ||
| | **Merge** | Accept changes and merge worker branch to main | | ||
| | **Review diff** | Show the full diff before deciding | | ||
| | **Reject** | Discard changes and clean up worktree | | ||
| Wait for an explicit choice before accepting, rejecting, or cleaning up worker changes. Do not infer merge/reject approval. | ||
| ### Stop Worker | ||
| Call `cli_worker_stop` with the worker ID. | ||
| The worker receives SIGTERM, then SIGKILL after 5 seconds if still running. | ||
| The worktree is cleaned up after the worker stops. | ||
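The SIGTERM-then-SIGKILL escalation can be sketched with the standard `subprocess` module. This is a generic illustration of the pattern, not the `cli_worker_stop` implementation; the `stop_process` helper is hypothetical, and worktree cleanup is out of scope here.

```python
import subprocess


def stop_process(proc, grace_seconds=5.0):
    """Ask a process to exit, then force-kill it if it outlives the grace period."""
    if proc.poll() is not None:
        return proc.returncode  # already finished
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()
```

The grace period gives a worker the chance to flush output and release locks before it is killed.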
| Confirm before stopping: | ||
| - "Worker [id] is currently RUNNING (elapsed: Xs). Stop it?" | ||
| Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer approval to stop a running worker. | ||
| ### Stop All Workers | ||
| If the user says "stop all workers" or similar: | ||
| 1. List all RUNNING workers. | ||
| 2. Confirm: "Stop all N running workers?" Use `AskUserQuestion` when available, or a plain numbered prompt as fallback. Do not infer approval to stop all workers. | ||
| 3. Call `cli_worker_stop` for each. | ||
| ## Error Handling | ||
| - If a worker ID is not found: "No worker found with ID [id]. Use 'list workers' to see active workers." | ||
| - If trying to collect from a RUNNING worker: "Worker [id] is still running. Wait for completion or stop it first." | ||
| - If trying to stop an already completed worker: "Worker [id] has already finished (state: COMPLETED)." |
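The error cases above can be sketched as guard functions that run before `cli_worker_collect` or `cli_worker_stop` is called. The messages follow the text; the `collect_guard` and `stop_guard` helpers and the `states` mapping (worker ID to FSM state) are hypothetical.

```python
def _not_found(worker_id):
    return f"No worker found with ID {worker_id}. Use 'list workers' to see active workers."


def collect_guard(worker_id, states):
    """Return an error message, or None when results may be collected."""
    state = states.get(worker_id)
    if state is None:
        return _not_found(worker_id)
    if state == "RUNNING":
        return f"Worker {worker_id} is still running. Wait for completion or stop it first."
    return None


def stop_guard(worker_id, states):
    """Return an error message, or None when the worker may be stopped."""
    state = states.get(worker_id)
    if state is None:
        return _not_found(worker_id)
    if state == "COMPLETED":
        return f"Worker {worker_id} has already finished (state: COMPLETED)."
    return None
```

A `None` result from `stop_guard` still requires the explicit user confirmation described under Stop Worker.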