@avcodes/mi
+1
-1
@@ -56,3 +56,3 @@ #!/usr/bin/env node | ||
| /* System prompt: built-in instructions plus current directory and date. */ | ||
| const SYSTEM = (process.env.SYSTEM_PROMPT || 'You are an autonomous agent. Prefer action over speculation—use tools to answer questions and complete tasks.\nA skill is a SKILL.md file containing a procedure for a particular kind of task, paired with a one-line description of when it applies. The skill tool loads a skill\'s body by name.\nWhen handling a request, compare it against each skill description listed below. If a description covers what the user is asking for, call skill(name) to load that skill and follow its body as your plan.\nbash runs any shell command: curl/wget for HTTP, git, package managers, compilers, anything available on the system.\nFile I/O goes through bash. Read with cat, sed -n \'10,40p\' file, head, tail, grep -n. Edit surgically with sed -i \'s/old/new/\' file, or rewrite a whole file via a quoted heredoc: cat > file <<\'EOF\' ... EOF. Read before editing; quote the heredoc delimiter to prevent expansion.\nApproach: explore, plan, act one step at a time, verify. Be concise.') + `\nCWD: ${process.cwd()}\nDate: ${new Date().toISOString()}`; | ||
| const SYSTEM = (process.env.SYSTEM_PROMPT || 'You are an autonomous agent. Prefer action over speculation—use tools to answer questions and complete tasks.\nA skill is a SKILL.md file containing a procedure for a particular kind of task, paired with a one-line description of when it applies. The skill tool loads a skill\'s body by name.\nWhen handling a request, compare it against each skill description listed below. If a description covers what the user is asking for, call skill(name) to load that skill and follow its body as your plan.\nbash runs any shell command: curl/wget for HTTP, git, package managers, compilers, anything available on the system.\nFile I/O goes through bash. Read with cat, sed -n \'10,40p\' file, head, tail, grep -n. Edit surgically with sed -i \'s/old/new/\' file, or rewrite a whole file via a quoted heredoc: cat > file <<\'EOF\' ... EOF. Read before editing; quote the heredoc delimiter to prevent expansion.\nApproach: explore, plan, act one step at a time, verify. Be concise.\nWhen you declare a task done, your final message must include the actual command output that proves it — not a summary of what you did. Unreproduced work is unfinished work.') + `\nCWD: ${process.cwd()}\nDate: ${new Date().toISOString()}`; | ||
@@ -59,0 +59,0 @@ /* History seeded with the system prompt; getArg reads a named CLI flag. */ |
+1
-1
| { | ||
| "name": "@avcodes/mi", | ||
| "version": "1.1.0", | ||
| "version": "1.2.0", | ||
| "description": "agentic coding in 27 loc. a loop, two tools, and an llm.", | ||
@@ -5,0 +5,0 @@ "type": "module", |
@@ -19,6 +19,6 @@ --- | ||
| If the fix requires new code rather than a surgical change, hand off to the `tdd` skill for red/green/refactor before proceeding to verification. | ||
| 5. **Verify the fix.** The repro now passes AND nothing else broke. Hand off to the `verify` skill for the broader check (lint, typecheck, full test suite). Keep the repro script or test committed where useful — it's a regression guard. | ||
| If you cannot reproduce after a reasonable effort, stop and say so. Request more information (exact command, environment, inputs, version). Do not guess-patch an unreproduced bug. | ||
| If the fix is new code rather than a surgical change, hand off to `tdd` for red/green/refactor. |
@@ -8,3 +8,3 @@ --- | ||
| Do not use for iterative work needing back-and-forth, or for tasks already mid-flight in the main context. Do not nest — a delegated subprocess must not itself delegate. | ||
| Do not use for iterative work needing back-and-forth, or for tasks already mid-flight in the main context. | ||
@@ -33,4 +33,4 @@ The subprocess inherits `OPENAI_API_KEY`, `MODEL`, `OPENAI_BASE_URL` and has no prior history. The prompt must be fully self-contained: | ||
| Collect each `pid` and `log`. Wait with `wait <pid>` or poll `kill -0 <pid> 2>/dev/null`. Once done, `cat` each log to read the full transcript. Prefer telling each subprocess to also write a compact result file under `/tmp/mi-*` so you don't have to parse transcript noise. | ||
| Collect each `pid` and `log`. Background children are detached (the harness calls `unref`) so `wait` will not find them — poll with `kill -0 <pid> 2>/dev/null` instead (exit 0 = still running, exit 1 = finished). Once done, `cat` each log to read the full transcript. Prefer telling each subprocess to also write a compact result file under `/tmp/mi-*` so you don't have to parse transcript noise. | ||
| Keep prompts short and specific. A vague delegation wastes a whole subprocess. |
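The `kill -0` polling described above can be sketched as a small shell helper. This is a minimal sketch, not part of the package: the function name `wait_for_pid` and the poll cap in its second argument are assumptions added for illustration.

```shell
# Hypothetical helper: poll until PID $1 is gone.
# $2 is an assumed cap on the number of polls (default 60) so a stuck child cannot block forever.
wait_for_pid() {
  pid=$1
  tries=${2:-60}
  # kill -0 sends no signal; it only reports whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    tries=$((tries - 1))
    if [ "$tries" -le 0 ]; then
      return 1   # still running after the cap
    fi
    sleep 1
  done
  return 0       # process has finished
}
```

Typical use would be `wait_for_pid "$pid" 120 && cat "$log"`, reading the transcript only once the subprocess has exited.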
| --- | ||
| name: plan | ||
| description: Record a short strategy doc at /tmp/mi-plan.md before non-trivial work. Load when a task needs more than one step, spans multiple files, or has unclear direction. | ||
| description: Record a short strategy doc at /tmp/mi-<slug>/plan.md before non-trivial work. Load when a task needs more than one step, spans multiple files, or has unclear direction. | ||
| --- | ||
@@ -8,4 +8,6 @@ | ||
| Write `/tmp/mi-plan.md` with three sections, nothing else: | ||
| Pick a short kebab-case `<slug>` for the task (e.g. `auth-refactor`, `fix-retry-bug`) and reuse it for both `plan` and `tasks` — they share `/tmp/mi-<slug>/` so they move together. Create the dir once with `mkdir -p /tmp/mi-<slug>`. If a plan already exists for the task, reuse the same slug rather than starting a new one (`ls -d /tmp/mi-*/ 2>/dev/null` to check). | ||
| Write `/tmp/mi-<slug>/plan.md` with three sections, nothing else: | ||
| ``` | ||
@@ -26,4 +28,4 @@ # Goal | ||
| Re-read `/tmp/mi-plan.md` before each major step. If reality diverges from the plan, revise the file before continuing — do not let it rot. | ||
| Re-read `/tmp/mi-<slug>/plan.md` before each major step. If reality diverges from the plan, revise the file before continuing — do not let it rot. | ||
| Revise (don't append) when direction changes. The doc should always reflect current intent, not history. |
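The slug convention above could be wrapped in a helper that validates the slug and creates the shared directory once. This is a sketch under stated assumptions: `mi_workspace` and the `MI_TMP_ROOT` override are hypothetical names introduced here (the skill text itself always uses `/tmp`).

```shell
# Hypothetical helper: create and print the shared directory for a task slug.
# MI_TMP_ROOT is an assumed override for illustration; the skill uses /tmp directly.
mi_workspace() {
  slug=$1
  case "$slug" in
    ''|*[!a-z0-9-]*)
      echo "slug must be kebab-case: $slug" >&2
      return 1
      ;;
  esac
  dir="${MI_TMP_ROOT:-/tmp}/mi-$slug"
  # mkdir -p is idempotent, so reusing an existing slug is safe.
  mkdir -p "$dir" && printf '%s\n' "$dir"
}
```

Both `plan.md` and `tasks.md` would then be written under the printed directory, for example `"$(mi_workspace fix-retry-bug)/plan.md"`.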
| --- | ||
| name: tasks | ||
| description: Track execution state for multi-step work as a checkbox list at /tmp/mi-tasks.md. Load when work has more than one step; skip for single-step jobs. State tracking only — use `plan` for strategy. | ||
| description: Track execution state for multi-step work as a checkbox list at /tmp/mi-<slug>/tasks.md. Load when work has more than one step; skip for single-step jobs. State tracking only — use `plan` for strategy. | ||
| --- | ||
@@ -8,4 +8,6 @@ | ||
| Maintain `/tmp/mi-tasks.md` as a flat checkbox list: | ||
| Use the same `<slug>` as the `plan` skill — both live under `/tmp/mi-<slug>/`. If no plan exists, pick a kebab-case slug and `mkdir -p /tmp/mi-<slug>` first. | ||
| Maintain `/tmp/mi-<slug>/tasks.md` as a flat checkbox list: | ||
| ``` | ||
@@ -24,4 +26,14 @@ - [ ] pending task | ||
| Re-read the file before each state transition so updates are exact. Edit with targeted `sed -i` on the full line, not line numbers (they shift). | ||
| **Always rewrite the entire file with a quoted heredoc.** Do NOT use `sed`, `awk`, or any stream edit — task titles can contain `/`, `&`, quotes, backticks, and regex metacharacters that silently corrupt in-place edits. The quoted `'EOF'` also blocks variable and command expansion in the body: | ||
| ``` | ||
| cat > /tmp/mi-<slug>/tasks.md <<'EOF' | ||
| - [x] read config | ||
| - [~] add retry to fetch | ||
| - [ ] update tests | ||
| EOF | ||
| ``` | ||
| Before every rewrite, `cat /tmp/mi-<slug>/tasks.md` first and reproduce every existing line verbatim so nothing is dropped. Only change the state marker of the single task whose status flipped; leave every other line byte-identical. | ||
| Keep titles short and action-oriented ("add retry to fetch", not "I should probably add retry logic to the fetch function"). No priorities, no timestamps, no nesting — if you need structure beyond a flat list, the work belongs in `plan`, not here. |
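The corruption risk named above is easy to reproduce. The sketch below uses a made-up task title containing `&`, which `sed` expands to the matched text inside a replacement, and compares the result with a quoted-heredoc rewrite. The title and the temp file are illustrative only.

```shell
# A made-up task title containing '&', which sed treats as "the matched text" in a replacement.
line='- [ ] fix a&b'

# Naive stream edit: flip the marker by substituting the whole line.
corrupted=$(printf '%s\n' "$line" | sed 's/- \[ \] fix a&b/- [x] fix a&b/')
printf 'sed:     %s\n' "$corrupted"   # prints: sed:     - [x] fix a- [ ] fix a&bb

# Quoted-heredoc rewrite: the body is written literally, so the title survives intact.
tasks_file=$(mktemp)
cat > "$tasks_file" <<'EOF'
- [x] fix a&b
EOF
rewritten=$(cat "$tasks_file")
printf 'heredoc: %s\n' "$rewritten"   # prints: heredoc: - [x] fix a&b
rm -f "$tasks_file"
```

The `sed` command exits 0 in this case, so nothing signals that the line was mangled. That is why the skill insists on a full rewrite.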
@@ -6,32 +6,18 @@ --- | ||
| Detect the stack, then run its checks cheapest-first so failures surface fast. Run after every non-trivial edit, not just at the end. | ||
| Run after every non-trivial edit, not just at the end. Never invent commands — use only what the project itself declares. | ||
| Detection (one pass, quick): | ||
| - `ls package.json pyproject.toml Cargo.toml go.mod Makefile 2>/dev/null` | ||
| - Node: `sed -n '/"scripts"/,/}/p' package.json` — look for `lint`, `typecheck`, `test`, `build`. | ||
| - Make: `grep -E '^[a-zA-Z_-]+:' Makefile` — look for `lint`, `test`, `check`, `ci`. | ||
| - Python: check `pyproject.toml` for `[tool.ruff]`, `[tool.mypy]`, `[tool.pytest]`; also `tox.ini`, `noxfile.py`. | ||
| - Rust: `Cargo.toml` implies `cargo` toolchain. | ||
| - Go: `go.mod` implies `go` toolchain. | ||
| Find the commands the project actually uses, in this order: | ||
| Execution order (stop and fix on first red): | ||
| 1. Typecheck / lint (seconds) | ||
| 2. Unit tests (seconds to a minute) | ||
| 3. Full test suite / build (slow) | ||
| 1. **Repo instructions first.** `AGENTS.md`, `CLAUDE.md`, `CONTRIBUTING.md`, `README.md`. These usually name the lint/test/build commands verbatim and are authoritative when present. | ||
| 2. **Declared build metadata.** Inspect whatever the project's toolchain uses: `package.json` `"scripts"`, `Makefile` targets, `pyproject.toml` / `tox.ini` / `noxfile.py`, `Cargo.toml`, `go.mod`, `build.gradle` / `pom.xml`, `mix.exs`, `stack.yaml` / `*.cabal`, `deno.json`, etc. Only run scripts/targets that actually exist. | ||
| 3. **CI config as fallback.** `.github/workflows/*.yml`, `.gitlab-ci.yml`, `.circleci/config.yml`, `azure-pipelines.yml` — these run the real check commands and are a reliable source when docs are thin. | ||
| One-liners by stack — pick what the project actually has: | ||
| - npm: `npm run -s lint && npm run -s typecheck && npm test --silent && npm run -s build` | ||
| - yarn: `yarn -s lint && yarn -s typecheck && yarn -s test && yarn -s build` | ||
| - pnpm: `pnpm -s lint && pnpm -s typecheck && pnpm -s test && pnpm -s build` | ||
| - python: `ruff check . && mypy . && pytest -q` | ||
| - rust: `cargo fmt --check && cargo clippy -- -D warnings && cargo test` | ||
| - go: `go vet ./... && go test ./... && go build ./...` | ||
| - make: `make lint && make test` (or `make check` / `make ci` if defined) | ||
| Run checks cheapest-first so failures surface fast: format/lint → typecheck → unit tests → integration/build. Stop at the first red and fix the cause, not the symptom, before continuing. | ||
| For long suites, wrap with `timeout=ms` on the bash call; if a command hangs, kill it and investigate rather than retrying blindly. | ||
| For long suites, wrap with `timeout=<ms>` on the bash call; if a command hangs, kill it and investigate rather than retrying blindly. | ||
| On red: | ||
| - Do NOT report the task as done. Read the failing output, fix the cause (not the symptom), re-run the same command, then re-run earlier stages to confirm no regression. | ||
| - Do NOT report the task as done. Read the failing output, fix the underlying cause, re-run the same command, then re-run earlier stages to confirm no regression. | ||
| - If a check was pre-existing red on untouched code, say so explicitly; do not silently skip it. | ||
| If detection finds no lint/typecheck/test/build commands: tell the user plainly that the project has no verification commands configured. Do not invent one. | ||
| If after checking the three sources above you find no verification commands, tell the user plainly that the project has none configured. Do not scaffold one and do not fall back to guessed defaults. |
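One way to honor "only run scripts/targets that actually exist" for a Makefile-based project is sketched below. It is an illustration, not the skill's implementation: the helper names and the target list `lint typecheck test build` are assumptions, and a real project may declare different names or use another toolchain entirely.

```shell
# Hypothetical helper: succeed only if the Makefile in the current directory declares target $1.
has_make_target() {
  [ -f Makefile ] && grep -qE "^$1:" Makefile
}

# Run declared targets cheapest-first and stop at the first failure.
# The target names are assumed for illustration; undeclared ones are skipped, never guessed.
run_declared_checks() {
  for target in lint typecheck test build; do
    if has_make_target "$target"; then
      echo "running: make $target"
      make "$target" || { echo "red: make $target" >&2; return 1; }
    else
      echo "skipped (not declared): $target"
    fi
  done
}
```

If every target is skipped, the project has no Makefile checks configured, which matches the instruction to report that plainly instead of inventing a command.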
Network access
Supply chain risk: This module accesses the network.
Found 1 instance in 1 package
Shell access
Supply chain risk: This module accesses the system shell. Accessing the system shell increases the risk of executing arbitrary code.
Found 1 instance in 1 package
Environment variable access
Supply chain risk: Package accesses environment variables, which may be a sign of credential stuffing or data theft.
Found 5 instances in 1 package
Filesystem access
Supply chain risk: Accesses the file system, and could potentially read sensitive data.
Found 1 instance in 1 package
Long strings
Supply chain risk: Contains long string literals, which may be a sign of obfuscated or packed code.
Found 1 instance in 1 package