+6
-0
@@ -9,2 +9,8 @@ # Changelog | ||
| ## [0.5.1] - 2026-05-23 | ||
| ### Fixed | ||
| - **Verifier false-positive on zero-tests-executed suites.** Closed via new validator **V-05** in `template/.claude/agents/verifier.md` and `template/.claude/skills/hstack-verify/SKILL.md`: a `unit`, `integration`, or `e2e` suite that executed zero tests cannot be recorded as `pass`. The verifier now confirms the runner's observed-test-count is greater than zero before mapping a suite to `pass`; suites gated by an unset env var, all `.skip` / `.todo`, empty collection, or filter-collapsed are recorded as `not-run` (per the existing `test-results` enum) with a high-severity Discrepancy naming the runner-reported counts and the suspected reason. A `not-run` value blocks `status: passed` and halts the Skill at `status: ran`. Lint and typecheck are exempt — both produce diagnostic counts whose floor is naturally zero. Reported by a consumer repo where the verifier marked `integration: pass` while the integration suite was env-gated to zero tests. | ||
| ## [0.5.0] - 2026-05-23 | ||
@@ -11,0 +17,0 @@ |
+1
-1
| { | ||
| "name": "hstack", | ||
| "version": "0.5.0", | ||
| "version": "0.5.1", | ||
| "description": "A spec-driven engineering workflow that ships as Claude Code Skills and subagents.", | ||
@@ -5,0 +5,0 @@ "keywords": [ |
@@ -33,4 +33,4 @@ --- | ||
| - "{{TODO-SKILL: /hstack:verify — invokes verifier after implementation completion}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output, including an observed-test-count per suite for V-05}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02/V-05}}" | ||
| --- | ||
@@ -74,2 +74,3 @@ | ||
| - V-04: any test-plan performance-budget assertion that did not execute or that observed values outside the declared budget blocks `status: passed`. | ||
| - V-05: a suite that executed zero tests cannot be recorded as `pass`. Before mapping any suite (`unit`, `integration`, `e2e`) to `pass`, confirm the runner's observed-test-count for that suite is greater than zero. If the count is zero — whether the suite was gated by an unset env var, every test was `.skip`/`.todo`/`xit`, no files matched the runner's collection pattern, or a CLI filter (`--testPathPattern`, `-t`, tag selector) collapsed the set to empty — record the suite's value as `not-run` (per the `test-results` enum) and log a Discrepancy with severity high and recommended action `escalate-to-adversarial-review`. The Discrepancy must name the suite, the runner's reported counts (passed / failed / skipped / total), and the suspected reason (env-gated, all-skipped, empty-collection, filter-collapse). A `not-run` value blocks `status: passed`. Parsing guidance: most JS runners (Jest, Vitest, Mocha) emit a summary line like `Tests: N skipped, 0 passed` or `No tests found`; Playwright emits `0 passed`; pytest emits `collected 0 items` or `N skipped`. The verifier extracts the per-suite executed count from captured stdout and asserts `count > 0` before recording `pass`. Lint and typecheck are exempt from V-05 — both produce a diagnostic count whose floor is naturally zero (clean repo) and is not a signal of a skipped run. | ||
| - Discrepancies section captures anything the verifier observed that the plan or test-plan did not predict: a test that ran but no artifact promised; a test the plan or test-plan promised that did not exist; flakiness; environment-dependent behavior. Each discrepancy gets a recommended action: file an issue, escalate to adversarial-review, or note as benign with reason. | ||
@@ -89,2 +90,3 @@ - Mechanical role only. Do not score security or data. Do not produce findings. Do not advise on remediation beyond the discrepancy action. | ||
| - A `failed` result would block `status: passed`. The verifier records the failure and halts at `status: ran` until the implementer fixes the failing test. | ||
| - A suite executed zero tests (V-05). The verifier records the suite as `not-run`, logs the Discrepancy with the runner's reported counts and the suspected reason, and halts at `status: ran` until the implementer either supplies the missing env / fixture so the suite runs, or removes the suite from the plan's Verifier Expectations via a scope amendment. | ||
@@ -99,2 +101,3 @@ ## Output expectations | ||
| - No `failed` value in `test-results` (V-02). | ||
| - No `not-run` value in `test-results` for `unit`, `integration`, or `e2e` (V-05). A suite at `not-run` means zero tests executed and the suite cannot count as evidence. | ||
@@ -104,2 +107,3 @@ ## Anti-patterns | ||
| - Never invent a PASS. If tests are not green, status is `ran` or `failed`, not `passed`. | ||
| - Never record a suite as `pass` without confirming the runner's observed-test-count for that suite is greater than zero (V-05). A suite gated by an unset env var, a suite where every test is `.skip` / `.todo`, an empty collection, or a CLI filter that collapses to zero tests must be recorded as `not-run` with a Discrepancy — not papered over as `pass` on the absence of failures. "Zero failures" is not "evidence of correctness" when there were zero assertions to fail. | ||
| - Never skip a canonical command. The consuming repo's test/lint/typecheck commands in `ci-cd.md` are mandatory. | ||
@@ -106,0 +110,0 @@ - Never silently drop a discrepancy. Even benign discrepancies get a one-line note. |
@@ -23,2 +23,11 @@ --- | ||
| </example> | ||
| <example> | ||
| Context: The integration suite is gated by `RUN_INTEGRATION=1` and the engineer ran `npm test` without setting it; the runner reported `Tests: 0 passed, 0 failed`. | ||
| user: "/hstack:verify 2026-06-knowledge-citations" | ||
| assistant: "I'll invoke verifier. Per V-05, an integration suite that executed zero tests is recorded as `not-run`, not `pass` — zero failures is not evidence of correctness when there were zero assertions to fail. The Skill halts at `status: ran` with a high-severity Discrepancy naming the suspected reason (env-gated, all-skipped, empty-collection, or filter-collapse)." | ||
| <commentary> | ||
| V-05 closes the verifier false-positive where a suite gated by an unset env var would silently pass on the absence of failures. The remediation is either supplying the missing env / fixture and re-running, or amending the plan's Verifier Expectations via scope amendment so the zero-test state is intentional and recorded. | ||
| </commentary> | ||
| </example> | ||
| tools: | ||
@@ -32,4 +41,4 @@ - Read | ||
| - Task | ||
| - "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output, including an observed-test-count per suite for V-05}}" | ||
| - "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02/V-05}}" | ||
| --- | ||
@@ -67,3 +76,3 @@ | ||
| 4. **Test-results map.** The subagent writes the top-level `test-results` map covering `unit`, `integration`, `e2e`, `lint`, `typecheck`. Per V-02, any `failed` value blocks `status: passed`. | ||
| 4. **Test-results map.** The subagent writes the top-level `test-results` map covering `unit`, `integration`, `e2e`, `lint`, `typecheck`. Per V-02, any `failed` value blocks `status: passed`. Per V-05, before mapping `unit`, `integration`, or `e2e` to `pass`, the subagent confirms the runner's observed-test-count for that suite is greater than zero — a suite gated by an unset env var, all-skipped, empty-collection, or filter-collapsed to zero tests is recorded as `not-run` with a high-severity Discrepancy, not as `pass` on the absence of failures. | ||
@@ -78,3 +87,3 @@ 5. **Test-plan coverage check.** The subagent walks the test-plan's Edge Cases bullets, Tenant Isolation Tests array, and Performance Budgets table, and confirms each observed in the test run. `test-plan-coverage` frontmatter map captures the three subsections. Per V-03, any tenant-isolation test absent or skipped blocks `status: passed` and is escalated to adversarial-review via Discrepancies. Per V-04, any performance-budget assertion that did not execute or that observed values outside the declared budget blocks `status: passed`. | ||
| 9. **Validate.** Run `{{TODO-SCRIPT: hstack/scripts/validate-spec.ts}}` — V-01, V-02, V-03, V-04. | ||
| 9. **Validate.** Run `{{TODO-SCRIPT: hstack/scripts/validate-spec.ts}}` — V-01, V-02, V-03, V-04, V-05. | ||
@@ -128,2 +137,3 @@ ## Outputs | ||
| - A test failure blocks `status: passed`. The Skill halts at `status: ran` (or `failed`) until the implementer fixes the failing test via a new `hstack-implement` invocation. | ||
| - A `unit`, `integration`, or `e2e` suite executed zero tests (V-05). The Skill halts at `status: ran`; the subagent records the suite as `not-run` and logs the Discrepancy. Remediation is either (a) the implementer supplies the missing env / fixture so the suite collects and runs, or (b) a scope amendment removes the suite from the plan's Verifier Expectations so the zero-test state is intentional and recorded. | ||
@@ -139,2 +149,3 @@ ## Failure modes | ||
| - Never invent a PASS. If tests are not green, status is `ran` or `failed`, not `passed`. | ||
| - Never record a suite as `pass` on the absence of failures alone (V-05). A suite that ran zero tests — gated by an unset env var, all `.skip` / `.todo`, empty collection, or filter-collapsed — is `not-run`, not `pass`. The Skill propagates the zero-tests-ran signal from the runner output into the subagent context so the rule is enforceable rather than inferred. | ||
| - Never skip a canonical command. The consuming repo's commands in `ci-cd.md` are mandatory. | ||
@@ -141,0 +152,0 @@ - Never silently drop a discrepancy. Even benign discrepancies get a one-line note. |
+1
-1
@@ -1,1 +0,1 @@ | ||
| 0.5.0 | ||
| 0.5.1 |
URL strings
Supply chain riskPackage contains fragments of external URLs or IP addresses, which the package may be accessing at runtime.
Found 1 instance in 1 package
URL strings
Supply chain riskPackage contains fragments of external URLs or IP addresses, which the package may be accessing at runtime.
Found 1 instance in 1 package
985002
0.57%