hstack - npm Package Compare versions

Comparing version

0.5.0

0.5.1

+6

-0

CHANGELOG.md

		@@ -9,2 +9,8 @@ # Changelog

		## [0.5.1] - 2026-05-23

		### Fixed

		- Verifier false-positive on zero-tests-executed suites. Closed via new validator V-05 in `template/.claude/agents/verifier.md` and `template/.claude/skills/hstack-verify/SKILL.md`: a `unit`, `integration`, or `e2e` suite that executed zero tests cannot be recorded as `pass`. The verifier now confirms the runner's observed-test-count is greater than zero before mapping a suite to `pass`; suites gated by an unset env var, all `.skip` / `.todo`, empty collection, or filter-collapsed are recorded as `not-run` (per the existing `test-results` enum) with a high-severity Discrepancy naming the runner-reported counts and the suspected reason. A `not-run` value blocks `status: passed` and halts the Skill at `status: ran`. Lint and typecheck are exempt — both produce diagnostic counts whose floor is naturally zero. Reported by a consumer repo where the verifier marked `integration: pass` while the integration suite was env-gated to zero tests.

		## [0.5.0] - 2026-05-23
		@@ -11,0 +17,0 @@

+1

-1

package.json

		{
		"name": "hstack",
		"version": "0.5.0",
		"version": "0.5.1",
		"description": "A spec-driven engineering workflow that ships as Claude Code Skills and subagents.",
		@@ -5,0 +5,0 @@ "keywords": [

+6

-2

template/.claude/agents/verifier.md

		@@ -33,4 +33,4 @@ ---
		- "{{TODO-SKILL: /hstack:verify — invokes verifier after implementation completion}}"
		- "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output}}"
		- "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02}}"
		- "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output, including an observed-test-count per suite for V-05}}"
		- "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02/V-05}}"
		---
		@@ -74,2 +74,3 @@
		- V-04: any test-plan performance-budget assertion that did not execute or that observed values outside the declared budget blocks `status: passed`.
		- V-05: a suite that executed zero tests cannot be recorded as `pass`. Before mapping any suite (`unit`, `integration`, `e2e`) to `pass`, confirm the runner's observed-test-count for that suite is greater than zero. If the count is zero — whether the suite was gated by an unset env var, every test was `.skip`/`.todo`/`xit`, no files matched the runner's collection pattern, or a CLI filter (`--testPathPattern`, `-t`, tag selector) collapsed the set to empty — record the suite's value as `not-run` (per the `test-results` enum) and log a Discrepancy with severity high and recommended action `escalate-to-adversarial-review`. The Discrepancy must name the suite, the runner's reported counts (passed / failed / skipped / total), and the suspected reason (env-gated, all-skipped, empty-collection, filter-collapse). A `not-run` value blocks `status: passed`. Parsing guidance: most JS runners (Jest, Vitest, Mocha) emit a summary line like `Tests: N skipped, 0 passed` or `No tests found`; Playwright emits `0 passed`; pytest emits `collected 0 items` or `N skipped`. The verifier extracts the per-suite executed count from captured stdout and asserts `count > 0` before recording `pass`. Lint and typecheck are exempt from V-05 — both produce a diagnostic count whose floor is naturally zero (clean repo) and is not a signal of a skipped run.
		- Discrepancies section captures anything the verifier observed that the plan or test-plan did not predict: a test that ran but no artifact promised; a test the plan or test-plan promised that did not exist; flakiness; environment-dependent behavior. Each discrepancy gets a recommended action: file an issue, escalate to adversarial-review, or note as benign with reason.
		@@ -89,2 +90,3 @@ - Mechanical role only. Do not score security or data. Do not produce findings. Do not advise on remediation beyond the discrepancy action.
		- A `failed` result would block `status: passed`. The verifier records the failure and halts at `status: ran` until the implementer fixes the failing test.
		- A suite executed zero tests (V-05). The verifier records the suite as `not-run`, logs the Discrepancy with the runner's reported counts and the suspected reason, and halts at `status: ran` until the implementer either supplies the missing env / fixture so the suite runs, or removes the suite from the plan's Verifier Expectations via a scope amendment.

		@@ -99,2 +101,3 @@ ## Output expectations
		- No `failed` value in `test-results` (V-02).
		- No `not-run` value in `test-results` for `unit`, `integration`, or `e2e` (V-05). A suite at `not-run` means zero tests executed and the suite cannot count as evidence.

		@@ -104,2 +107,3 @@ ## Anti-patterns
		- Never invent a PASS. If tests are not green, status is `ran` or `failed`, not `passed`.
		- Never record a suite as `pass` without confirming the runner's observed-test-count for that suite is greater than zero (V-05). A suite gated by an unset env var, a suite where every test is `.skip` / `.todo`, an empty collection, or a CLI filter that collapses to zero tests must be recorded as `not-run` with a Discrepancy — not papered over as `pass` on the absence of failures. "Zero failures" is not "evidence of correctness" when there were zero assertions to fail.
		- Never skip a canonical command. The consuming repo's test/lint/typecheck commands in `ci-cd.md` are mandatory.
		@@ -106,0 +110,0 @@ - Never silently drop a discrepancy. Even benign discrepancies get a one-line note.

+15

-4

template/.claude/skills/hstack-verify/SKILL.md

		@@ -23,2 +23,11 @@ ---
		</example>

		<example>
		Context: The integration suite is gated by `RUN_INTEGRATION=1` and the engineer ran `npm test` without setting it; the runner reported `Tests: 0 passed, 0 failed`.
		user: "/hstack:verify 2026-06-knowledge-citations"
		assistant: "I'll invoke verifier. Per V-05, an integration suite that executed zero tests is recorded as `not-run`, not `pass` — zero failures is not evidence of correctness when there were zero assertions to fail. The Skill halts at `status: ran` with a high-severity Discrepancy naming the suspected reason (env-gated, all-skipped, empty-collection, or filter-collapse)."
		<commentary>
		V-05 closes the verifier false-positive where a suite gated by an unset env var would silently pass on the absence of failures. The remediation is either supplying the missing env / fixture and re-running, or amending the plan's Verifier Expectations via scope amendment so the zero-test state is intentional and recorded.
		</commentary>
		</example>
		tools:
		@@ -32,4 +41,4 @@ - Read
		- Task
		- "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output}}"
		- "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02}}"
		- "{{TODO-SCRIPT: hstack/scripts/run-gates.sh — runs the consuming repo's test/lint/typecheck suite and captures output, including an observed-test-count per suite for V-05}}"
		- "{{TODO-SCRIPT: hstack/scripts/validate-spec.ts — validates verification.md frontmatter and V-01/V-02/V-05}}"
		---
		@@ -67,3 +76,3 @@

		4. Test-results map. The subagent writes the top-level `test-results` map covering `unit`, `integration`, `e2e`, `lint`, `typecheck`. Per V-02, any `failed` value blocks `status: passed`.
		4. Test-results map. The subagent writes the top-level `test-results` map covering `unit`, `integration`, `e2e`, `lint`, `typecheck`. Per V-02, any `failed` value blocks `status: passed`. Per V-05, before mapping `unit`, `integration`, or `e2e` to `pass`, the subagent confirms the runner's observed-test-count for that suite is greater than zero — a suite gated by an unset env var, all-skipped, empty-collection, or filter-collapsed to zero tests is recorded as `not-run` with a high-severity Discrepancy, not as `pass` on the absence of failures.

		@@ -78,3 +87,3 @@ 5. Test-plan coverage check. The subagent walks the test-plan's Edge Cases bullets, Tenant Isolation Tests array, and Performance Budgets table, and confirms each observed in the test run. `test-plan-coverage` frontmatter map captures the three subsections. Per V-03, any tenant-isolation test absent or skipped blocks `status: passed` and is escalated to adversarial-review via Discrepancies. Per V-04, any performance-budget assertion that did not execute or that observed values outside the declared budget blocks `status: passed`.

		9. Validate. Run `{{TODO-SCRIPT: hstack/scripts/validate-spec.ts}}` — V-01, V-02, V-03, V-04.
		9. Validate. Run `{{TODO-SCRIPT: hstack/scripts/validate-spec.ts}}` — V-01, V-02, V-03, V-04, V-05.

		@@ -128,2 +137,3 @@ ## Outputs
		- A test failure blocks `status: passed`. The Skill halts at `status: ran` (or `failed`) until the implementer fixes the failing test via a new `hstack-implement` invocation.
		- A `unit`, `integration`, or `e2e` suite executed zero tests (V-05). The Skill halts at `status: ran`; the subagent records the suite as `not-run` and logs the Discrepancy. Remediation is either (a) the implementer supplies the missing env / fixture so the suite collects and runs, or (b) a scope amendment removes the suite from the plan's Verifier Expectations so the zero-test state is intentional and recorded.

		@@ -139,2 +149,3 @@ ## Failure modes
		- Never invent a PASS. If tests are not green, status is `ran` or `failed`, not `passed`.
		- Never record a suite as `pass` on the absence of failures alone (V-05). A suite that ran zero tests — gated by an unset env var, all `.skip` / `.todo`, empty collection, or filter-collapsed — is `not-run`, not `pass`. The Skill propagates the zero-tests-ran signal from the runner output into the subagent context so the rule is enforceable rather than inferred.
		- Never skip a canonical command. The consuming repo's commands in `ci-cd.md` are mandatory.
		@@ -141,0 +152,0 @@ - Never silently drop a discrepancy. Even benign discrepancies get a one-line note.

+1

-1

VERSION

		@@ -1,1 +0,1 @@
		0.5.0
		0.5.1

hstack - npm Package Compare versions

New alerts

Fixed alerts

Improved metrics