@codexstar/bug-hunter - npm Package Compare versions

Comparing version 3.0.7 to 3.0.8
+39
CODE_OF_CONDUCT.md
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to a positive environment:
- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
- Showing empathy towards other community members
Examples of unacceptable behavior:
- The use of sexualized language or imagery and unwelcome sexual attention or advances
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information without explicit permission
- Other conduct which could reasonably be considered inappropriate in a professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
## Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the project maintainers at **conduct@codexstar.dev**. All complaints will be reviewed and investigated promptly and fairly.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1.
# Contributing to Bug Hunter
Thanks for your interest in contributing. Bug Hunter is an open-source adversarial code auditing skill for AI coding agents.
## Ways to Contribute
- **Report bugs** — open an issue with reproduction steps
- **Improve prompts** — the agent prompts in `prompts/` are the core of Bug Hunter's accuracy; PRs that reduce false positives or catch more real bugs are highly valued
- **Add calibration examples** — `prompts/examples/` contains few-shot examples that tune agent behavior; more real-world examples improve precision
- **Improve scripts** — the Node.js helpers in `scripts/` handle triage, state, and orchestration; performance and reliability improvements welcome
- **Documentation** — fix typos, clarify instructions, add usage examples
## Development Setup
```bash
git clone https://github.com/codexstar69/bug-hunter.git
cd bug-hunter
# Run the test suite (61 tests)
node --test scripts/tests/*.test.cjs
# Run the self-test against the test fixture
node scripts/run-bug-hunter.cjs preflight --skill-dir .
# Optional: install Context Hub CLI for doc verification testing
npm install -g @aisuite/chub
```
## Pull Request Guidelines
1. Keep PRs focused — one concern per PR
2. Test your changes against the `test-fixture/` directory
3. If modifying agent prompts, explain the reasoning and expected impact on false positive / true positive rates
4. Run `node --test scripts/tests/*.test.cjs` to verify all tests pass
5. Run `node scripts/run-bug-hunter.cjs preflight --skill-dir .` to verify preflight checks
6. Update `CHANGELOG.md` with your changes
## Code Style
- Scripts use CommonJS (`.cjs`) for maximum compatibility across agent runtimes
- No external dependencies in scripts — Node.js built-ins only
- Prompts are markdown — keep them concise and structured
## Prompt Changes
Changes to agent prompts (`prompts/*.md`) have outsized impact. When submitting prompt changes:
- Describe the false positive or missed bug that motivated the change
- Show before/after behavior if possible
- Consider impact on all three agents (Hunter, Skeptic, Referee) — they form an adversarial system
## License
By contributing, you agree that your contributions will be licensed under the MIT License.
# Bug Hunter — Full Reference
> AI-powered adversarial bug finding that argues with itself to surface real vulnerabilities — and auto-fixes them safely.
## Overview
Bug Hunter is an automated adversarial code auditing skill for AI coding agents. Instead of a single AI scanning code and flooding you with false alarms, it runs an adversarial multi-agent pipeline: one agent hunts for bugs, a second agent tries to disprove them, and a third agent delivers an independent final verdict.
## Skills-First Architecture
All pipeline agents are bundled as first-class skills with frontmatter under `skills/`. The orchestrator (`SKILL.md`) reads agent skills at each phase:
```
Recon (skills/recon/)
→ Hunter (skills/hunter/) + doc-lookup (skills/doc-lookup/)
→ Skeptic (skills/skeptic/) + doc-lookup
→ Referee (skills/referee/)
→ Fix Strategy + Fix Plan
→ Fixer (skills/fixer/) + doc-lookup
```
### Core Agent Skills
| Skill | Lines | Purpose |
|-------|-------|---------|
| `skills/hunter/SKILL.md` | 172 | Deep behavioral code analysis — logic errors, security vulns, race conditions |
| `skills/skeptic/SKILL.md` | 153 | Adversarial reviewer — challenges every finding, kills false positives |
| `skills/referee/SKILL.md` | 143 | Independent arbiter — delivers verdicts with CVSS scoring and PoC |
| `skills/fixer/SKILL.md` | 124 | Surgical code repair — minimal fixes respecting strategy classifications |
| `skills/recon/SKILL.md` | 166 | Codebase reconnaissance — maps architecture, trust boundaries, risk priorities |
| `skills/doc-lookup/SKILL.md` | 51 | Documentation access — Context Hub (chub) primary, Context7 fallback |
### Security Skills
| Skill | Purpose | Trigger |
|-------|---------|---------|
| `skills/commit-security-scan/SKILL.md` | Diff-scoped PR/commit security review | `--pr-security` |
| `skills/security-review/SKILL.md` | Full security workflow (threat model + code + deps + validation) | `--security-review` |
| `skills/threat-model-generation/SKILL.md` | STRIDE threat model bootstrap/refresh | `--threat-model` |
| `skills/vulnerability-validation/SKILL.md` | Exploitability/reachability/CVSS/PoC validation | `--validate-security` |
## Pipeline Architecture
### Phase 1: Triage (0 tokens, <2s)
`scripts/triage.cjs` classifies every file by risk tier (CRITICAL/HIGH/MEDIUM/LOW/SKIP) using filename patterns, directory structure, and file size heuristics. Outputs `.bug-hunter/triage.json`.
### Phase 2: Recon
`skills/recon/SKILL.md` — maps the tech stack, auth mechanisms, database layer, and key dependencies. Produces a risk map.
### Phase 3: Hunt
`skills/hunter/SKILL.md` — scans files in risk-map order (CRITICAL → HIGH → MEDIUM). Reports behavioral bugs with exact code evidence, runtime triggers, and cross-file references. Verifies claims against official documentation via `scripts/doc-lookup.cjs`.
### Phase 4: Skeptic Challenge
`skills/skeptic/SKILL.md` — re-reads actual source code for every finding and attempts to disprove it. Applies 15 hard exclusion rules. Uses risk calculations (EV = confidence × points - (1-confidence) × 2×points) to decide ACCEPT vs DISPROVE.
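As a worked illustration of the expected-value rule above, the sketch below evaluates the formula for a single finding. The function names, the reading of `points` as the score at stake, and the rule that a positive EV favours DISPROVE are assumptions made for this example; the actual Skeptic skill may apply the number differently.

```javascript
// Hypothetical helper (not part of the package): expected value of
// challenging a finding, using the formula quoted above.
// EV = confidence * points - (1 - confidence) * 2 * points
function expectedValue(confidence, points) {
  if (confidence < 0 || confidence > 1) {
    throw new RangeError('confidence must be between 0 and 1');
  }
  return confidence * points - (1 - confidence) * 2 * points;
}

// Assumption: a positive EV means the challenge is worth making.
function skepticDecision(confidence, points) {
  return expectedValue(confidence, points) > 0 ? 'DISPROVE' : 'ACCEPT';
}

console.log(expectedValue(0.75, 10), skepticDecision(0.75, 10)); // 2.5 DISPROVE
console.log(expectedValue(0.5, 10), skepticDecision(0.5, 10));   // -5 ACCEPT
```

Algebraically the formula reduces to `points × (3 × confidence − 2)`, so under these assumptions the break-even point sits at a confidence of two thirds regardless of the point value.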
### Phase 5: Referee Verdict
`skills/referee/SKILL.md` — reads original code, Hunter findings, and Skeptic challenges. Delivers final verdicts. Enriches security findings with STRIDE, CWE, CVSS 3.1, reachability analysis, and proof-of-concept blocks.
### Phase 6: Fix Strategy + Plan
`buildFixStrategy()` classifies confirmed bugs into: `safe-autofix`, `manual-review`, `larger-refactor`, or `architectural-remediation`. `buildFixPlan()` gates execution by `autofixEligible` flag, splits into canary/rollout batches.
### Phase 7: Fix (optional)
`skills/fixer/SKILL.md` — surgical auto-fix with git branching, worktree isolation, test baseline capture, canary rollout, per-bug commits, automatic rollback on test regression, and post-fix re-scan.
## Documentation Verification
Bug Hunter verifies claims against official library documentation before any agent asserts framework behavior. Uses a hybrid approach:
1. **Context Hub (chub)** — curated, versioned, annotatable docs (primary)
2. **Context7 API** — broad coverage fallback when chub doesn't have the library
```bash
# Primary: doc-lookup.cjs (tries chub first, falls back to Context7)
node scripts/doc-lookup.cjs search "express" "middleware error handling"
node scripts/doc-lookup.cjs get "prisma/orm" "parameterized queries" --lang js
# Fallback: context7-api.cjs (Context7 only)
node scripts/context7-api.cjs search "express" "middleware"
node scripts/context7-api.cjs context "/expressjs/express" "error handling"
```
## Security Classification
### STRIDE Threat Categories
Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege
### CWE Mapping
SQL Injection (CWE-89), Command Injection (CWE-78), XSS (CWE-79), Path Traversal (CWE-22), IDOR (CWE-639), Missing Auth (CWE-306/862), Hardcoded Credentials (CWE-798), SSRF (CWE-918), and more.
### CVSS 3.1 Scoring
Critical and High security findings receive CVSS 3.1 base scores with attack vector, complexity, privileges required, and impact metrics.
## CLI Flags
| Flag | Description |
|------|-------------|
| (default) | Scan + auto-fix confirmed bugs |
| `--scan-only` / `--review` | Report only, no code changes |
| `--fix` | Explicitly enable auto-fix |
| `--fix --approve` / `--safe` | Fix with human approval per edit |
| `--plan-only` / `--plan` | Build strategy + plan, stop before edits |
| `--dry-run` / `--preview` | Preview fixes as unified diffs |
| `-b <branch>` | Scan files changed vs branch |
| `--staged` | Scan git-staged files |
| `--pr [current\|recent\|N]` | PR review workflow |
| `--pr-security` | PR security review (threat model + CVEs) |
| `--security-review` | Enterprise security audit |
| `--threat-model` | Generate STRIDE threat model |
| `--validate-security` | Force vulnerability validation |
| `--deps` | Include dependency CVE scan |
| `--autonomous` | No-intervention auto-fix run |
| `--no-loop` | Single pass (disable iterative coverage) |
## Output Files
| File | Description |
|------|-------------|
| `.bug-hunter/triage.json` | File classification and scan strategy |
| `.bug-hunter/recon.md` | Tech stack and risk map |
| `.bug-hunter/findings.json` | Canonical Hunter findings |
| `.bug-hunter/skeptic.json` | Canonical Skeptic challenges |
| `.bug-hunter/referee.json` | Canonical Referee verdicts |
| `.bug-hunter/report.md` | Final human-readable report |
| `.bug-hunter/fix-strategy.json` | Fix classification (safe-autofix/manual-review/etc) |
| `.bug-hunter/fix-strategy.md` | Human-readable strategy |
| `.bug-hunter/fix-plan.json` | Executable fix plan (canary/rollout) |
| `.bug-hunter/fix-report.json` | Fix results |
| `.bug-hunter/coverage.json` | Loop coverage state |
| `.bug-hunter/coverage.md` | Coverage summary |
| `.bug-hunter/threat-model.md` | STRIDE threat model |
| `.bug-hunter/dep-findings.json` | Dependency CVE results |
## Supported Languages
JavaScript, TypeScript, Python, Go, Rust, Java, C#, Ruby, PHP, Swift, Kotlin, C/C++
## Supported Frameworks
Express, Fastify, Next.js, Django, Flask, FastAPI, Gin, Echo, Actix, Spring Boot, Rails, Laravel — and any framework with docs in Context Hub or Context7.
## Project Structure
```
bug-hunter/
├── SKILL.md # Orchestrator — routes to skills/
├── README.md # Full documentation
├── CHANGELOG.md # Version history
├── package.json # npm package (@codexstar/bug-hunter)
├── bin/bug-hunter # CLI entry point
├── skills/ # All agent skills (core + security)
│ ├── hunter/ # Bug finding
│ ├── skeptic/ # False positive elimination
│ ├── referee/ # Final verdicts
│ ├── fixer/ # Surgical fixes
│ ├── recon/ # Codebase mapping
│ ├── doc-lookup/ # Documentation verification
│ ├── commit-security-scan/
│ ├── security-review/
│ ├── threat-model-generation/
│ └── vulnerability-validation/
├── modes/ # Execution strategies by codebase size
├── prompts/examples/ # Calibration examples for Hunter/Skeptic
├── schemas/ # JSON Schema contracts for all artifacts
├── scripts/ # Node.js helpers (zero AI tokens)
│ ├── run-bug-hunter.cjs # Main orchestrator script
│ ├── doc-lookup.cjs # Context Hub + Context7 doc lookup
│ ├── context7-api.cjs # Context7 standalone fallback
│ ├── prepublish-guard.cjs # Publish safety net
│ └── tests/ # Test suite (61 tests)
├── templates/ # Subagent launch template
└── test-fixture/ # 6 planted bugs for validation
```
## Install
```bash
npm install -g @codexstar/bug-hunter && bug-hunter install
```
Requirements: Node.js 18+. Works with Pi, Claude Code, Codex, Cursor, Windsurf, Kiro, Copilot.
## License
MIT
# Bug Hunter
> AI-powered adversarial bug finding that argues with itself to surface real vulnerabilities — and auto-fixes them safely.
## What It Does
Bug Hunter is an automated adversarial code auditing skill for AI coding agents. It runs a multi-agent pipeline where one agent hunts for bugs, a second tries to disprove them, and a third delivers an independent verdict. Only bugs that survive all three stages appear in your report.
## Key Capabilities
- Zero-token triage: classifies files by risk in <2s before any AI runs
- Adversarial pipeline: Hunter → Skeptic → Referee eliminates false positives
- Security classification: STRIDE threat categories, CWE weakness IDs, CVSS 3.1 scoring
- Documentation verification: checks claims against official library docs via Context Hub + Context7
- Safe auto-fix: git-branched fixes with worktree isolation, canary rollout, test verification, and automatic rollback
- Dependency CVE scanning: lockfile-aware audits for npm, pnpm, yarn, bun (including Bun 1.2+ text-format lockfiles)
- PR review: first-class `--pr` workflow for reviewing current, recent, or numbered PRs
- Enterprise security pack: bundled STRIDE threat modeling, vulnerability validation, and security review skills
## Architecture — Skills-First Design
All pipeline agents are bundled as first-class skills under `skills/`:
| Skill | Purpose |
|-------|---------|
| `skills/hunter/` | Deep behavioral code analysis — finds bugs |
| `skills/skeptic/` | Adversarial reviewer — kills false positives |
| `skills/referee/` | Independent arbiter — delivers final verdicts |
| `skills/fixer/` | Surgical code repair — implements fixes |
| `skills/recon/` | Codebase reconnaissance — maps architecture |
| `skills/doc-lookup/` | Documentation access — Context Hub + Context7 |
| `skills/commit-security-scan/` | PR/commit security review |
| `skills/security-review/` | Full enterprise security audit |
| `skills/threat-model-generation/` | STRIDE threat model generation |
| `skills/vulnerability-validation/` | Exploitability/CVSS validation |
## Usage
```
/bug-hunter src/ # scan a directory, auto-fix confirmed bugs
/bug-hunter --scan-only src/ # report only, no code changes
/bug-hunter --pr current # review the current PR
/bug-hunter --pr-security # PR security review with threat model + CVEs
/bug-hunter --security-review src/ # enterprise security audit
/bug-hunter --threat-model src/ # generate STRIDE threat model
/bug-hunter --deps src/ # include dependency CVE scan
/bug-hunter -b feature-xyz # scan branch diff vs main
```
## Install
```bash
# npm global install (recommended)
npm install -g @codexstar/bug-hunter && bug-hunter install
# Cross-IDE install via skills.sh
npx skills add codexstar69/bug-hunter
# Git clone
git clone https://github.com/codexstar69/bug-hunter.git ~/.agents/skills/bug-hunter
# Optional: curated doc verification
npm install -g @aisuite/chub
```
Requirements: Node.js 18+. Works with Pi, Claude Code, Codex, Cursor, Windsurf, Kiro, Copilot.
## Documentation
- [README.md](https://github.com/codexstar69/bug-hunter/blob/main/README.md) — full documentation
- [SKILL.md](https://github.com/codexstar69/bug-hunter/blob/main/SKILL.md) — agent integration instructions
- [CHANGELOG.md](https://github.com/codexstar69/bug-hunter/blob/main/CHANGELOG.md) — version history
- [CONTRIBUTING.md](https://github.com/codexstar69/bug-hunter/blob/main/CONTRIBUTING.md) — contribution guide
## License
MIT
# Security Policy
## Reporting a Vulnerability
If you discover a security vulnerability in Bug Hunter, please report it responsibly.
**Do NOT open a public GitHub issue for security vulnerabilities.**
Instead, email the maintainer directly at **security@codexstar.dev** or use [GitHub's private vulnerability reporting](https://github.com/codexstar69/bug-hunter/security/advisories/new).
## What Qualifies
- Vulnerabilities in Bug Hunter's scripts (`scripts/*.cjs`) that could lead to arbitrary code execution, path traversal, or data exfiltration
- Issues in the subagent dispatch pipeline that could allow prompt injection or scope escape
- Flaws in `fix-lock.cjs` or `payload-guard.cjs` that bypass safety mechanisms
## What Does NOT Qualify
- Bugs found *by* Bug Hunter in your codebase — those are features, not vulnerabilities
- Issues in upstream dependencies (Context7 API, Context Hub) — report those to their maintainers
- Theoretical attacks requiring local filesystem access (Bug Hunter already runs with full local access)
## Response Timeline
- Acknowledgment within 48 hours
- Assessment and fix plan within 7 days
- Patch release within 14 days for confirmed vulnerabilities
const express = require('express');
const jwt = require('jsonwebtoken');
const bcrypt = require('bcrypt');
const db = require('./db');
const router = express.Router();
// BUG 1 (Critical/Security): SQL injection — user input concatenated into query
router.post('/login', async (req, res) => {
const { email, password } = req.body;
const user = await db.query(`SELECT * FROM users WHERE email = '${email}'`);
if (!user) return res.status(401).json({ error: 'Invalid credentials' });
// BUG 2 (Critical/Security): JWT signed with hardcoded secret
const valid = await bcrypt.compare(password, user.password_hash);
if (!valid) return res.status(401).json({ error: 'Invalid credentials' });
const token = jwt.sign({ id: user.id, role: user.role }, 'super-secret-key-123');
res.json({ token });
});
// Auth middleware
function requireAuth(req, res, next) {
const token = req.headers.authorization?.replace('Bearer ', '');
if (!token) return res.status(401).json({ error: 'No token' });
try {
req.user = jwt.verify(token, 'super-secret-key-123');
next();
} catch {
res.status(401).json({ error: 'Invalid token' });
}
}
// BUG 3 (Medium/Logic): Admin check uses == instead of === and doesn't check role value properly
router.get('/admin/users', requireAuth, async (req, res) => {
if (req.user.role == true) {
const users = await db.query('SELECT id, email, role FROM users');
res.json(users);
} else {
res.status(403).json({ error: 'Forbidden' });
}
});
router.get('/profile', requireAuth, async (req, res) => {
const user = await db.query('SELECT id, email, name FROM users WHERE id = $1', [req.user.id]);
res.json(user);
});
module.exports = { router, requireAuth };
// Minimal database wrapper — simulates a pg pool
const { Pool } = require('pg');
const pool = new Pool({
connectionString: process.env.DATABASE_URL || 'postgresql://localhost:5432/app',
});
module.exports = {
query: (text, params) => pool.query(text, params).then(r => r.rows[0]),
queryAll: (text, params) => pool.query(text, params).then(r => r.rows),
};
# Expected Bugs in Test Fixture
This file documents the intentionally planted bugs for pipeline validation.
Do NOT include this file in the scan (it's .md, auto-filtered).
## BUG 1: SQL Injection (Critical)
- **File:** auth.js:12
- **Issue:** User email concatenated directly into SQL query string
- **Trigger:** POST /auth/login with email: `' OR 1=1 --`
## BUG 2: Hardcoded JWT Secret (Critical)
- **File:** auth.js:18, auth.js:30
- **Issue:** JWT signed/verified with hardcoded string 'super-secret-key-123'
- **Trigger:** Anyone who reads the source code can forge tokens
## BUG 3: Broken Admin Authorization (Medium)
- **File:** auth.js:38
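- **Issue:** `req.user.role == true` uses loose equality instead of comparing the role against an explicit admin value. Loose comparison with `true` coerces both sides to numbers, so only values such as `true`, `1`, or `"1"` pass, and a string role such as `"admin"` is rejected
- **Trigger:** A token whose `role` claim is `1`, `"1"`, or `true` gets admin access, while a user whose role is the string `"admin"` receives 403 Forbidden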
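To see which role values actually satisfy the check in BUG 3, the sketch below evaluates `role == true` for a handful of values. JavaScript converts the boolean to the number 1 and then converts a string operand to a number as well, so the result depends on whether the role coerces to 1. The strict comparison against the string `'admin'` is an assumption about how roles are stored; the fixture does not say.

```javascript
// The comparison used by the fixture's admin route.
function passesLooseCheck(role) {
  return role == true; // intentionally loose, mirrors auth.js
}

// Assumed replacement: compare against an explicit role name.
function isAdmin(role) {
  return role === 'admin';
}

for (const role of ['user', 'admin', '1', 1, true, 0, '']) {
  console.log(JSON.stringify(role), passesLooseCheck(role), isAdmin(role));
}
// "user"  false false
// "admin" false true
// "1"     true  false
// 1       true  false
// true    true  false
// 0       false false
// ""      false false
```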
## BUG 4: Off-by-One Pagination (Medium)
- **File:** users.js:11
- **Issue:** `offset = page * limit` skips first page of results; should be `(page - 1) * limit`
- **Trigger:** GET /users?page=1 returns results starting from offset 20 instead of 0
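The arithmetic behind BUG 4 can be checked in isolation. The helper below is a hypothetical sketch of the corrected offset for 1-based page numbers; its name and the fallback to page 1 for invalid input are assumptions, not code from the fixture.

```javascript
// Hypothetical helper: offset for 1-based pagination.
function pageOffset(page, limit) {
  // Treat anything that is not a positive integer as page 1.
  const safePage = Number.isInteger(page) && page >= 1 ? page : 1;
  return (safePage - 1) * limit;
}

console.log(pageOffset(1, 20)); // 0  (first page starts at the first row)
console.log(pageOffset(2, 20)); // 20
console.log(1 * 20);            // 20 (the fixture's formula for page 1)
```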
## BUG 5: Silent Error Swallowing (Low)
- **File:** users.js:22-24
- **Issue:** Delete endpoint catches and ignores all errors, always returns success
- **Trigger:** DELETE /users/nonexistent returns 200 {success: true} even if query fails
## BUG 6: bcrypt.compare with non-string input (Medium)
- **File:** users.js:40-46
- **Issue:** req.body.password passed directly to bcrypt.compare without type check. JSON body can contain numbers, booleans, objects. bcrypt.compare throws on non-string/Buffer input.
- **Trigger:** POST /users/check-password with body {"password": 12345} — bcrypt.compare throws TypeError
- **Context7 test:** Hunter/Skeptic should verify bcrypt.compare behavior against actual bcrypt docs
const express = require('express');
const { router: authRouter } = require('./auth');
const usersRouter = require('./users');
const app = express();
app.use(express.json());
app.get('/health', (req, res) => res.json({ status: 'ok' }));
app.use('/auth', authRouter);
app.use('/users', usersRouter);
app.use((err, req, res, next) => {
console.error(err.stack);
res.status(500).json({ error: 'Internal server error' });
});
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Server running on port ${PORT}`));
const express = require('express');
const bcrypt = require('bcrypt');
const db = require('./db');
const { requireAuth } = require('./auth');
const router = express.Router();
// BUG 4 (Medium/Logic): Off-by-one in pagination that skips the first page of results
router.get('/', requireAuth, async (req, res) => {
const page = parseInt(req.query.page) || 1;
const limit = parseInt(req.query.limit) || 20;
const offset = page * limit; // Should be (page - 1) * limit
const users = await db.query('SELECT id, email, name FROM users LIMIT $1 OFFSET $2', [limit, offset]);
res.json({ users, page, limit });
});
// BUG 5 (Low/Error-handling): deleteUser doesn't check if user exists before deleting,
// and swallows the error silently — caller gets 200 even if delete failed
router.delete('/:id', requireAuth, async (req, res) => {
try {
await db.query('DELETE FROM users WHERE id = $1', [req.params.id]);
} catch (err) {
// silently swallowed
}
res.json({ success: true });
});
router.put('/:id', requireAuth, async (req, res) => {
const { name, email } = req.body;
if (!name || !email) return res.status(400).json({ error: 'Name and email required' });
const user = await db.query(
'UPDATE users SET name = $1, email = $2 WHERE id = $3 RETURNING id, name, email',
[name, email, req.params.id]
);
if (!user) return res.status(404).json({ error: 'User not found' });
res.json(user);
});
// BUG 6 (Medium/Security): bcrypt.compare with non-string input
// If password is a number or object from JSON body, bcrypt.compare may behave unexpectedly
router.post('/check-password', requireAuth, async (req, res) => {
const user = await db.query('SELECT password_hash FROM users WHERE id = $1', [req.user.id]);
if (!user) return res.status(404).json({ error: 'User not found' });
// req.body.password could be a number, boolean, or object from JSON parsing
// Does bcrypt.compare handle non-string inputs safely or throw?
const valid = await bcrypt.compare(req.body.password, user.password_hash);
res.json({ valid });
});
module.exports = router;
+30
-1

@@ -8,2 +8,29 @@ # Changelog

## [3.0.8] - 2026-03-13
### Highlights
- **All 61 tests pass.** Systematic reliability audit fixed 11 bugs across schemas, scripts, and the orchestrator — 10 previously-failing tests now pass, plus one new test added.
- **`High` severity now works end-to-end.** All JSON schemas, severity ranking functions, and payload-guard templates recognize `High` as a valid severity level.
- **Confidence threshold is fully configurable.** The `--confidence-threshold` flag now propagates through the entire pipeline — from the orchestrator through `processPendingChunks` to `record-findings`.
- **Shell injection fixed in doc-lookup.** Library names and IDs passed to `chub` CLI are now properly shell-quoted.
- **Modern Bun support.** `dep-scan.cjs` detects `bun.lock` (Bun 1.2+ text format) alongside the legacy `bun.lockb` binary format.
### Fixed
- `schemas/findings.schema.json`, `schemas/skeptic.schema.json`, `schemas/referee.schema.json`, `schemas/fix-report.schema.json`: added missing `High` to severity enums — previously only `Critical`, `Medium`, and `Low` were accepted, causing valid findings to fail schema validation
- `scripts/bug-hunter-state.cjs`: `severityRank()` now returns rank 2 for `High` severity — previously returned -1 (unknown), breaking severity ordering and dedup logic
- `scripts/run-bug-hunter.cjs`: `classifyStrategy()` added explicit parentheses around compound conditions to prevent operator-precedence misclassification
- `scripts/run-bug-hunter.cjs`: `runCommandOnce()` now clears the SIGKILL failsafe timer on normal exit — previously leaked a timer handle that could fire after the process had already exited
- `scripts/run-bug-hunter.cjs`: `processPendingChunks()` now receives and forwards `confidenceThreshold` to `record-findings` — previously the configurable threshold was silently ignored, always defaulting to 75
- `scripts/worktree-harvest.cjs`: commit log parsing no longer truncates the hash or drops the message when a `git log` line contains no space separator
- `scripts/dep-scan.cjs`: lockfile detection now checks for `bun.lock` (text format, Bun ≥1.2) in addition to `bun.lockb`
- `scripts/payload-guard.cjs`: hunter and fixer severity template strings now include `High` alongside `Critical`, `Medium`, and `Low`
- `scripts/doc-lookup.cjs`: `chubSearch()` and `chubGet()` now shell-quote all interpolated arguments via single-quote wrapping — previously, library names containing shell metacharacters could cause command injection
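For readers unfamiliar with the single-quote wrapping mentioned in the `doc-lookup.cjs` entry, here is a minimal sketch of the general technique for POSIX shells. The `shellQuote` name and its exact form are assumptions for illustration; the package's own helper may differ.

```javascript
// Hypothetical helper: wrap a value in single quotes for a POSIX shell.
// Inside single quotes nothing is special except the single quote itself,
// so each embedded quote is written as: close quote, escaped quote, reopen.
function shellQuote(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

console.log(shellQuote('express; rm -rf ~')); // 'express; rm -rf ~'
console.log(shellQuote('$(id)'));             // '$(id)'
console.log(shellQuote("it's"));              // 'it'\''s'
```

Where the command does not need shell features, passing arguments as an array to `child_process.execFile` avoids shell parsing altogether and is a common alternative to quoting.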
### Changed
- `scripts/bug-hunter-state.cjs`: `record-findings` command now accepts an optional 4th positional argument for confidence threshold (defaults to 75 for backwards compatibility)
- Test suite expanded from 50 passing / 10 failing to **61 passing / 0 failing**
### Added
- `scripts/tests/bug-hunter-state.test.cjs`: new test verifying that `High` severity findings are ranked above `Medium` and `Low`, and that re-recording with higher severity upgrades the existing ledger entry
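The ranking behavior described in this release can be sketched as follows. Only two values come from the changelog: `High` ranks 2 and unknown severities rank -1. The ranks for `Critical`, `Medium`, and `Low`, and the `upgradeSeverity` helper, are assumptions chosen to be consistent with those two facts.

```javascript
// Assumed rank table; only High = 2 and unknown = -1 are stated in the changelog.
const SEVERITY_RANK = { Critical: 3, High: 2, Medium: 1, Low: 0 };

function severityRank(severity) {
  return Object.prototype.hasOwnProperty.call(SEVERITY_RANK, severity)
    ? SEVERITY_RANK[severity]
    : -1;
}

// Hypothetical ledger rule: keep whichever severity ranks higher.
function upgradeSeverity(existing, incoming) {
  return severityRank(incoming) > severityRank(existing) ? incoming : existing;
}

console.log(severityRank('High'));              // 2
console.log(upgradeSeverity('Medium', 'High')); // High
console.log(upgradeSeverity('High', 'Low'));    // High
```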
## [3.0.7] - 2026-03-12

@@ -216,3 +243,5 @@

[Unreleased]: https://github.com/codexstar69/bug-hunter/compare/v3.0.5...HEAD
[Unreleased]: https://github.com/codexstar69/bug-hunter/compare/v3.0.8...HEAD
[3.0.8]: https://github.com/codexstar69/bug-hunter/compare/v3.0.7...v3.0.8
[3.0.7]: https://github.com/codexstar69/bug-hunter/compare/v3.0.5...v3.0.7
[3.0.5]: https://github.com/codexstar69/bug-hunter/compare/v3.0.4...v3.0.5

@@ -219,0 +248,0 @@ [3.0.4]: https://github.com/codexstar69/bug-hunter/compare/v3.0.3...v3.0.4

{
"name": "@codexstar/bug-hunter",
"version": "3.0.7",
"version": "3.0.8",
"description": "Adversarial AI bug hunter — multi-agent pipeline finds security vulnerabilities, logic errors, and runtime bugs, then fixes them autonomously. Works with Claude Code, Cursor, Codex CLI, Copilot, Kiro, and more.",

@@ -46,2 +46,8 @@ "license": "MIT",

"CHANGELOG.md",
"CONTRIBUTING.md",
"SECURITY.md",
"CODE_OF_CONDUCT.md",
"llms.txt",
"llms-full.txt",
"test-fixture/",
"LICENSE"

@@ -48,0 +54,0 @@ ],

+20
-10
<p align="center">
<img src="docs/images/2026-03-12-hero-bug-hunter-overview.png" alt="Bug Hunter product overview banner — code and pull requests flow through adversarial review, strategic fix planning, and verified patch delivery" width="720">
<img src="docs/images/hero.png" alt="Bug Hunter — Adversarial Bug Finding for Coding Agents. Pipeline: Triage → Recon → Hunters (Security, Logic) → Skeptics → Referee → Fixers → Verify. Compatible with all coding agent terminals and IDEs." width="720">
</p>

@@ -54,9 +54,11 @@

This release makes Bug Hunter much better at PR-first auditing and safer at automated remediation.
This release is a reliability hardening pass — 11 bugs fixed, 10 previously-failing tests now pass, and the full pipeline is more robust end-to-end.
- **PR review is now a first-class workflow.** Review the current PR, the most recent PR, or a specific PR number with `--pr`, `--pr current`, `--pr recent`, or `--pr 123`.
- **PR security review is now built in.** `--pr-security` runs a PR-scoped security audit with threat-model and dependency context, without editing code.
- **Strategic remediation is now explicit.** Bug Hunter writes `fix-strategy.json` and `fix-plan.json` before fixes run, so auto-fix decisions stay explainable and reviewable.
- **The security pack is now bundled locally.** `commit-security-scan`, `security-review`, `threat-model-generation`, and `vulnerability-validation` now ship inside the repo under `skills/`.
- **Fix execution is harder to break.** This update adds schema-validated fix plans, atomic lock handling, safer worktree cleanup, stash preservation, and shell-safe worker command templating.
- **`High` severity works everywhere.** All JSON schemas, severity ranking, and payload-guard templates now recognize `High` — previously only `Critical`, `Medium`, and `Low` were accepted, silently dropping valid findings.
- **Confidence threshold is fully wired.** `--confidence-threshold` now propagates from the CLI through the orchestrator to `record-findings`. Previously the flag was parsed but never forwarded, always defaulting to 75.
- **Shell injection fixed in doc-lookup.** Library names passed to `chub` CLI are now properly shell-quoted — prevents command injection via crafted library names.
- **SIGKILL timer leak fixed.** The failsafe kill timer in `runCommandOnce` is now cleared on normal exit — previously it could fire after the child had already exited.
- **Modern Bun lockfile support.** `dep-scan.cjs` now detects `bun.lock` (text format, Bun 1.2+) alongside the legacy `bun.lockb` binary format.
- **Worktree commit parsing hardened.** Edge case where `git log` lines with no space separator caused truncated hashes and wrong messages is now handled.
- **61 tests, 0 failures.** Up from 50 passing / 10 failing — the test suite now covers severity ranking, schema validation, confidence threshold propagation, and shell-safe worker templating.
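The worktree parsing edge case can be illustrated with a small sketch. It assumes log lines of the form `<hash> <subject>`, and it assumes the failure came from slicing at the result of `indexOf(' ')` without checking for -1; neither detail is confirmed by the changelog, and the function names are hypothetical.

```javascript
// Assumed failure mode: indexOf returns -1 when there is no space,
// so slice(0, -1) drops the last hash character.
function parseLogLineNaive(line) {
  const space = line.indexOf(' ');
  return { hash: line.slice(0, space), message: line.slice(space + 1) };
}

// Guarded version: a line without a space is a bare hash with an empty message.
function parseLogLine(line) {
  const trimmed = line.trim();
  const space = trimmed.indexOf(' ');
  if (space === -1) {
    return { hash: trimmed, message: '' };
  }
  return { hash: trimmed.slice(0, space), message: trimmed.slice(space + 1) };
}

console.log(parseLogLineNaive('abc1234'));       // { hash: 'abc123', message: 'abc1234' }
console.log(parseLogLine('abc1234'));            // { hash: 'abc1234', message: '' }
console.log(parseLogLine('abc1234 fix: typo'));  // { hash: 'abc1234', message: 'fix: typo' }
```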

@@ -692,3 +694,3 @@ <p align="center">

The repository also ships with **60 Node.js regression tests** covering orchestration, schemas, PR scope resolution, fix-plan validation, lock behavior, worktree lifecycle, and the bundled local security-skill routing.
The repository also ships with **61 Node.js regression tests** covering orchestration, schemas, PR scope resolution, fix-plan validation, lock behavior, worktree lifecycle, severity ranking, and the bundled local security-skill routing.

@@ -724,3 +726,3 @@ ```bash

│ └── images/ # Documentation visuals
│ ├── 2026-03-12-hero-bug-hunter-overview.png # Product overview hero
│ ├── hero.png # Product overview hero
│ ├── 2026-03-12-pr-review-flow.png # PR review + security workflow

@@ -764,4 +766,8 @@ │ ├── 2026-03-12-security-pack.png # Bundled local security pack

│ ├── referee.schema.json # Referee artifact schema
│ ├── fix-report.schema.json # Fix report artifact schema
│ ├── fix-strategy.schema.json # Strategic remediation schema
│ └── fix-plan.schema.json # Fix execution schema
│ ├── fix-plan.schema.json # Fix execution schema
│ ├── coverage.schema.json # Coverage tracking schema
│ ├── recon.schema.json # Recon artifact schema
│ └── shared.schema.json # Shared definitions

@@ -788,2 +794,6 @@ ├── skills/ # Bundled local security pack

│ ├── run-bug-hunter.test.cjs # Orchestrator tests
│ ├── bug-hunter-state.test.cjs # State management tests
│ ├── code-index.test.cjs # Code index tests
│ ├── delta-mode.test.cjs # Delta mode tests
│ ├── pr-scope.test.cjs # PR scope resolution tests
│ └── worktree-harvest.test.cjs # Worktree lifecycle tests

@@ -790,0 +800,0 @@

@@ -45,3 +45,3 @@ {

"type": "string",
"enum": ["Critical", "Medium", "Low"]
"enum": ["Critical", "High", "Medium", "Low"]
},

@@ -48,0 +48,0 @@ "file": { "type": "string", "minLength": 1 },

@@ -25,3 +25,3 @@ {

"type": "string",
"enum": ["Critical", "Medium", "Low"]
"enum": ["Critical", "High", "Medium", "Low"]
},

@@ -28,0 +28,0 @@ "category": {

@@ -59,3 +59,3 @@ {

"bugId": { "type": "string", "minLength": 1 },
"severity": { "type": "string", "enum": ["Critical", "Medium", "Low"] },
"severity": { "type": "string", "enum": ["Critical", "High", "Medium", "Low"] },
"category": { "type": "string", "minLength": 1 },

@@ -62,0 +62,0 @@ "file": { "type": "string", "minLength": 1 },

@@ -34,3 +34,3 @@ {

"type": "string",
"enum": ["CRITICAL", "HIGH", "MEDIUM", "LOW", "Critical", "Medium", "Low"]
"enum": ["CRITICAL", "HIGH", "MEDIUM", "LOW", "Critical", "High", "Medium", "Low"]
},

@@ -37,0 +37,0 @@ "status": { "type": "string", "minLength": 1 },

@@ -26,3 +26,3 @@ {

"type": "string",
"enum": ["Critical", "Medium", "Low"]
"enum": ["Critical", "High", "Medium", "Low"]
},

@@ -29,0 +29,0 @@ "confidenceScore": {

@@ -9,3 +9,3 @@ {

"type": "string",
"enum": ["Critical", "Medium", "Low"]
"enum": ["Critical", "High", "Medium", "Low"]
},

@@ -12,0 +12,0 @@ "category": {

@@ -136,9 +136,12 @@ #!/usr/bin/env node

}
if (normalized === 'medium') {
if (normalized === 'high') {
return 2;
}
if (normalized === 'low') {
if (normalized === 'medium') {
return 1;
}
return 0;
if (normalized === 'low') {
return 0;
}
return -1;
}
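Read together, the old and new lines in this hunk appear to leave a ranking in which `High` sits between `Critical` and `Medium`, and unrecognized values drop to `-1` instead of tying with the lowest known severity. The sketch below illustrates that result. It assumes the branch above the hunk maps `critical` to `3` and that `normalized` is a trimmed, lower-cased copy of the input; neither is shown in the diff, and the function names are hypothetical.

```javascript
// Sketch of the post-change ranking. The 'critical' value and the
// normalization step are assumptions: the hunk does not show them.
function severityRank(severity) {
  const normalized = String(severity || '').trim().toLowerCase();
  if (normalized === 'critical') return 3; // assumed, outside the hunk
  if (normalized === 'high') return 2;
  if (normalized === 'medium') return 1;
  if (normalized === 'low') return 0;
  return -1; // unknown severities now rank below Low
}

// Hypothetical helper: keep whichever severity ranks higher.
function pickHigherSeverity(current, incoming) {
  return severityRank(incoming) > severityRank(current) ? incoming : current;
}
```

Under these assumptions, `pickHigherSeverity('Low', 'High')` yields `'High'`, which is the upgrade behavior the new regression test later in this diff exercises.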

@@ -152,3 +155,3 @@

console.error(' bug-hunter-state.cjs mark-chunk <statePath> <chunkId> <pending|in_progress|done|failed> [error]');
console.error(' bug-hunter-state.cjs record-findings <statePath> <findingsJsonPath> [source]');
console.error(' bug-hunter-state.cjs record-findings <statePath> <findingsJsonPath> [source] [confidenceThreshold]');
console.error(' bug-hunter-state.cjs hash-filter <statePath> <filesJsonPath>');

@@ -269,3 +272,3 @@ console.error(' bug-hunter-state.cjs hash-update <statePath> <filesJsonPath> [status]');

if (command === 'record-findings') {
const [statePath, findingsJsonPath, source = 'unknown'] = args;
const [statePath, findingsJsonPath, source = 'unknown', confidenceThresholdRaw] = args;
if (!statePath || !findingsJsonPath) {

@@ -275,2 +278,5 @@ usage();

}
const confidenceThreshold = Number.isInteger(Number.parseInt(String(confidenceThresholdRaw || ''), 10))
? Number.parseInt(confidenceThresholdRaw, 10)
: 75;
const state = readState(statePath);

@@ -346,3 +352,3 @@ const findings = readJson(findingsJsonPath);

state.metrics.lowConfidenceFindings = state.bugLedger.filter((entry) => {
return entry.confidenceScore === null || entry.confidenceScore < 75;
return entry.confidenceScore === null || entry.confidenceScore < confidenceThreshold;
}).length;
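The two hunks above parse an optional fourth positional argument and use it in place of the hard-coded `75`. A minimal sketch of that logic as two standalone functions follows; the function names are hypothetical and only the fallback value and the comparison come from the diff.

```javascript
// Hypothetical helper mirroring the hunk: fall back to 75 when the raw
// CLI argument is missing or does not parse as an integer.
function parseConfidenceThreshold(raw, fallback = 75) {
  const parsed = Number.parseInt(String(raw ?? ''), 10);
  return Number.isInteger(parsed) ? parsed : fallback;
}

// Count ledger entries whose score is missing (null) or below the threshold.
function countLowConfidence(bugLedger, threshold) {
  return bugLedger.filter((entry) => {
    return entry.confidenceScore === null || entry.confidenceScore < threshold;
  }).length;
}
```

One consequence worth noting: because the orchestrator forwards `String(confidenceThreshold)`, an unset value would arrive as the literal string `"undefined"`, which fails to parse and therefore still resolves to the default of 75.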

@@ -349,0 +355,0 @@ saveState(statePath, state);

@@ -62,2 +62,3 @@ #!/usr/bin/env node

{ lockfile: 'bun.lockb', ecosystem: 'node', manager: 'bun', command: 'bun audit --json' },
{ lockfile: 'bun.lock', ecosystem: 'node', manager: 'bun', command: 'bun audit --json' },
{ lockfile: 'requirements.txt', ecosystem: 'pip', manager: 'pip', command: 'pip-audit --format json' },
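The table entry added here lets both Bun lockfile formats map to the same audit command. A minimal sketch of table-driven detection follows, assuming the scanner checks for each file's existence in the project root. The `detectLockfiles` function and the trimmed two-row table are illustrative only, not the actual `dep-scan.cjs` implementation.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Trimmed, illustrative copy of the lockfile table from the hunk.
const LOCKFILES = [
  { lockfile: 'bun.lockb', ecosystem: 'node', manager: 'bun', command: 'bun audit --json' },
  { lockfile: 'bun.lock', ecosystem: 'node', manager: 'bun', command: 'bun audit --json' }
];

// Hypothetical helper: return every table entry whose lockfile exists in `dir`.
function detectLockfiles(dir) {
  return LOCKFILES.filter((entry) => fs.existsSync(path.join(dir, entry.lockfile)));
}
```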

@@ -245,3 +246,3 @@ { lockfile: 'Pipfile.lock', ecosystem: 'pip', manager: 'pipenv', command: 'pip-audit --format json' },

lockfile: 'none',
reason: 'No supported lockfile found (package-lock.json, pnpm-lock.yaml, yarn.lock, bun.lockb, requirements.txt, go.sum, Cargo.lock)',
reason: 'No supported lockfile found (package-lock.json, pnpm-lock.yaml, yarn.lock, bun.lockb, bun.lock, requirements.txt, go.sum, Cargo.lock)',
},

@@ -248,0 +249,0 @@ ],

@@ -74,5 +74,9 @@ #!/usr/bin/env node

function shellQuote(s) {
return "'" + String(s).replace(/'/g, "'\\''") + "'";
}
function chubSearch(library) {
try {
const raw = execSync(`chub search "${library}" --json`, {
const raw = execSync(`chub search ${shellQuote(library)} --json`, {
encoding: 'utf8',

@@ -91,4 +95,4 @@ timeout: CHUB_TIMEOUT_MS,

try {
const langFlag = lang ? ` --lang ${lang}` : '';
const raw = execSync(`chub get ${id}${langFlag}`, {
const langFlag = lang ? ` --lang ${shellQuote(lang)}` : '';
const raw = execSync(`chub get ${shellQuote(id)}${langFlag}`, {
encoding: 'utf8',

@@ -95,0 +99,0 @@ timeout: CHUB_TIMEOUT_MS,

@@ -43,3 +43,3 @@ #!/usr/bin/env node

bugId: 'BUG-1',
severity: 'Critical|Medium|Low',
severity: 'Critical|High|Medium|Low',
file: 'src/example.ts',

@@ -78,3 +78,3 @@ lines: '10-15',

bugId: 'BUG-1',
severity: 'Critical|Medium|Low',
severity: 'Critical|High|Medium|Low',
file: 'src/example.ts',

@@ -81,0 +81,0 @@ lines: '10-15',

@@ -227,2 +227,3 @@ #!/usr/bin/env node

let timeoutHit = false;
let killTimer = null;

@@ -232,3 +233,3 @@ const timer = setTimeout(() => {

child.kill('SIGTERM');
setTimeout(() => {
killTimer = setTimeout(() => {
if (!child.killed) {

@@ -248,2 +249,5 @@ child.kill('SIGKILL');

clearTimeout(timer);
if (killTimer) {
clearTimeout(killTimer);
}
resolve({
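The fix above keeps a handle to the inner SIGKILL timer so it can be cleared when the child exits normally. A self-contained sketch of the same pattern follows. The wrapper name, its parameters, and the `settled` guard are hypothetical; only `timer`, `killTimer`, `timeoutHit`, and the SIGTERM-then-SIGKILL escalation come from the diff.

```javascript
const { spawn } = require('node:child_process');

// Sketch of a timeout wrapper that clears both timers once the child is done.
function runWithTimeout(command, args, timeoutMs, graceMs = 1000) {
  return new Promise((resolve) => {
    const child = spawn(command, args, { stdio: 'ignore' });
    let timeoutHit = false;
    let timer = null;
    let killTimer = null;
    let settled = false;

    const finish = (code) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      if (killTimer) clearTimeout(killTimer); // the leak fixed in this hunk
      resolve({ ok: code === 0 && !timeoutHit, code, timeoutHit });
    };

    timer = setTimeout(() => {
      timeoutHit = true;
      child.kill('SIGTERM');
      killTimer = setTimeout(() => {
        // Escalate only if the process has not actually exited yet.
        if (child.exitCode === null && child.signalCode === null) {
          child.kill('SIGKILL');
        }
      }, graceMs);
    }, timeoutMs);

    child.on('error', () => finish(null));
    child.on('close', (code) => finish(code));
  });
}
```

The sketch checks `exitCode` and `signalCode` rather than `child.killed`, because in Node.js `killed` reports that a signal was sent successfully, not that the process has terminated. Whether the original code needs the same adjustment is not something this diff shows.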

@@ -580,3 +584,3 @@ ok: code === 0 && !timeoutHit,

if (refactorSignals.some((signal) => claim.includes(signal)) || severityRank(entry.severity) >= 2 && crossReferences.length >= 2) {
if (refactorSignals.some((signal) => claim.includes(signal)) || (severityRank(entry.severity) >= 2 && crossReferences.length >= 2)) {
return {

@@ -1005,3 +1009,4 @@ strategy: 'larger-refactor',

skillDir,
index
index,
confidenceThreshold
}) {

@@ -1092,3 +1097,3 @@ while (true) {

let findings = [];
runJsonScript(stateScript, ['record-findings', statePath, findingsJsonPath, 'orchestrator']);
runJsonScript(stateScript, ['record-findings', statePath, findingsJsonPath, 'orchestrator', String(confidenceThreshold)]);
findings = readJson(findingsJsonPath);

@@ -1266,3 +1271,4 @@

skillDir,
index
index,
confidenceThreshold
});

@@ -1322,3 +1328,4 @@

skillDir,
index
index,
confidenceThreshold
});

@@ -1325,0 +1332,0 @@ }

@@ -153,1 +153,53 @@ const assert = require('node:assert/strict');

});
test('bug-hunter-state severity ranking orders High above Medium and Low', () => {
const sandbox = makeSandbox('bug-hunter-state-severity-');
const stateScript = resolveSkillScript('bug-hunter-state.cjs');
const filePath = path.join(sandbox, 'a.ts');
fs.writeFileSync(filePath, 'const a = 1;\n', 'utf8');
const filesJson = path.join(sandbox, 'files.json');
writeJson(filesJson, [filePath]);
const statePath = path.join(sandbox, 'state.json');
runJson('node', [stateScript, 'init', statePath, 'extended', filesJson, '1']);
// Record a Low finding first
const findingsLow = path.join(sandbox, 'findings-low.json');
writeJson(findingsLow, [
{
bugId: 'BUG-SEV',
severity: 'Low',
category: 'logic',
file: 'src/a.ts',
lines: '1',
claim: 'severity test',
evidence: 'src/a.ts:1 evidence',
runtimeTrigger: 'Call a()',
crossReferences: ['Single file'],
confidenceScore: 80
}
]);
runJson('node', [stateScript, 'record-findings', statePath, findingsLow, 'test']);
// Record a High finding for the same location — should upgrade
const findingsHigh = path.join(sandbox, 'findings-high.json');
writeJson(findingsHigh, [
{
bugId: 'BUG-SEV',
severity: 'High',
category: 'logic',
file: 'src/a.ts',
lines: '1',
claim: 'severity test',
evidence: 'src/a.ts:1 evidence',
runtimeTrigger: 'Call a()',
crossReferences: ['Single file'],
confidenceScore: 85
}
]);
runJson('node', [stateScript, 'record-findings', statePath, findingsHigh, 'test']);
const state = readJson(statePath);
assert.equal(state.bugLedger[0].severity, 'High');
assert.equal(state.bugLedger[0].confidenceScore, 85);
});

@@ -12,2 +12,3 @@ const assert = require('node:assert/strict');

runRaw,
shellQuote,
writeJson

@@ -130,3 +131,3 @@ } = require('./test-utils.cjs');

'node',
flakyWorker,
shellQuote(flakyWorker),
'--chunk-id',

@@ -139,3 +140,3 @@ '{chunkId}',

'--attempts-file',
attemptsFile
shellQuote(attemptsFile)
].join(' ');

@@ -216,3 +217,3 @@

'node',
worker,
shellQuote(worker),
'--chunk-id',

@@ -227,3 +228,3 @@ '{chunkId}',

'--seen-files',
seenFilesPath,
shellQuote(seenFilesPath),
'--confidence',

@@ -268,2 +269,6 @@ '60'

factsPath,
'--coverage-path',
coveragePath,
'--coverage-markdown-path',
coverageMarkdownPath,
'--use-index',

@@ -341,3 +346,3 @@ 'true',

'node',
worker,
shellQuote(worker),
'--chunk-id',

@@ -424,3 +429,3 @@ '{chunkId}',

'--worker-cmd',
`node ${workerPath} --chunk-id {chunkId} --scan-files-json {scanFilesJson} --findings-json {findingsJson}`,
`node ${shellQuote(workerPath)} --chunk-id {chunkId} --scan-files-json {scanFilesJson} --findings-json {findingsJson}`,
'--timeout-ms',

@@ -485,3 +490,3 @@ '5000',

'--worker-cmd',
`node ${workerPath} --chunk-id {chunkId} --scan-files-json {scanFilesJson} --findings-json {findingsJson}`,
`node ${shellQuote(workerPath)} --chunk-id {chunkId} --scan-files-json {scanFilesJson} --findings-json {findingsJson}`,
'--timeout-ms',

@@ -548,3 +553,3 @@ '5000',

'node',
workerPath,
shellQuote(workerPath),
'--chunk-id',

@@ -557,5 +562,5 @@ '{chunkId}',

'--seen-files',
seenFilesPath,
shellQuote(seenFilesPath),
'--changed-file',
changedFile
shellQuote(changedFile)
].join(' ');

@@ -635,3 +640,3 @@

'node',
workerPath,
shellQuote(workerPath),
'--chunk-id',

@@ -642,3 +647,3 @@ '{chunkId}',

'--attempts-file',
attemptsFile
shellQuote(attemptsFile)
].join(' ');

@@ -716,3 +721,3 @@

'node',
workerPath,
shellQuote(workerPath),
'--chunk-id',

@@ -723,3 +728,3 @@ '{chunkId}',

'--attempts-file',
attemptsFile
shellQuote(attemptsFile)
].join(' ');

@@ -925,7 +930,7 @@

'node',
workerPath,
shellQuote(workerPath),
'--output-path',
'{outputPath}',
'--attempts-file',
attemptsFile
shellQuote(attemptsFile)
].join(' ');

@@ -935,3 +940,3 @@

'node',
path.join(skillDir, 'scripts', 'render-report.cjs'),
shellQuote(path.join(skillDir, 'scripts', 'render-report.cjs')),
'skeptic',

@@ -1079,7 +1084,7 @@ '{outputPath}',

'node',
workerPath,
shellQuote(workerPath),
'--output-path',
'{outputPath}',
'--attempts-file',
attemptsFile
shellQuote(attemptsFile)
].join(' ');

@@ -1086,0 +1091,0 @@

@@ -52,2 +52,8 @@ const childProcess = require('child_process');

function shellQuote(value) {
const s = String(value);
if (s.length === 0) return "''";
return `'${s.replace(/'/g, "'\\''")}'`;
}
module.exports = {

@@ -59,3 +65,4 @@ readJson,

makeSandbox,
shellQuote,
writeJson
};

@@ -220,4 +220,4 @@ #!/usr/bin/env node

const spaceIdx = line.indexOf(' ');
const hash = line.slice(0, spaceIdx);
const message = line.slice(spaceIdx + 1);
const hash = spaceIdx >= 0 ? line.slice(0, spaceIdx) : line;
const message = spaceIdx >= 0 ? line.slice(spaceIdx + 1) : '';
const bugMatch = message.match(/BUG-(\d+)/);
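Before this change, a line without a space made `indexOf` return `-1`, so `slice(0, -1)` dropped the last character of the hash and `slice(0)` returned the whole line as the message. A minimal sketch of the hardened parsing follows; `parseCommitLine` and the `bugId` field are hypothetical, since the hunk ends before the returned object is shown.

```javascript
// Hypothetical helper mirroring the hunk: split "<hash> <message>" and
// tolerate lines that contain only a hash.
function parseCommitLine(line) {
  const spaceIdx = line.indexOf(' ');
  const hash = spaceIdx >= 0 ? line.slice(0, spaceIdx) : line;
  const message = spaceIdx >= 0 ? line.slice(spaceIdx + 1) : '';
  const bugMatch = message.match(/BUG-(\d+)/);
  return { hash, message, bugId: bugMatch ? `BUG-${bugMatch[1]}` : null };
}

console.log(parseCommitLine('abc1234 fix BUG-12 off-by-one'));
// { hash: 'abc1234', message: 'fix BUG-12 off-by-one', bugId: 'BUG-12' }
console.log(parseCommitLine('abc1234'));
// { hash: 'abc1234', message: '', bugId: null }
```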

@@ -224,0 +224,0 @@ return {