
cliagent-council
Convene a panel of CLI-based AI agents to deliberate on engineering problems
Convene a panel of CLI-based AI agents to deliberate on your questions. Three models answer independently and can optionally review each other's work; the invoking agent then synthesizes the verdict as chairman.
Works with Claude Code, Codex CLI, and Gemini CLI. Whichever tool you invoke from becomes the chairman. The others are council members.
Inspired by Karpathy's LLM Council, adapted for the CLI agent ecosystem.
/council "Should we use Postgres or DynamoDB for our event sourcing system?"
Dispatching Stage 1 to 3 agents in parallel...
- claude (timeout: 120s)
- codex (timeout: 120s)
- gemini (timeout: 180s)
claude responded (38.2s)
codex responded (52.1s)
Quorum reached (2/3). Giving stragglers 30s grace...
gemini responded (64.7s)
All 3 agents responded.
Stage 1 complete: 3/3 successful opinions
--- CHAIRMAN SYNTHESIS (claude) ---
### Consensus
All agents agree: Postgres is the right choice given strong consistency
requirements and team SQL experience.
### Divergence
Claude emphasizes ACID guarantees as non-negotiable for account balances.
Codex flags a scaling ceiling at ~10TB without sharding.
Gemini suggests read replicas as a scaling bridge.
### Confidence
HIGH — Strong consensus across models.
Every existing LLM council is API-call-based. Karpathy's LLM Council, Perplexity Model Council, Council AI... they all pass text through API endpoints. Agent Council is different:
Grounded deliberation. Council members are CLI agents with tool access. They can grep your codebase, read migration files, run git log. Opinions are grounded in your actual project, not abstract text generation.
Zero marginal cost. You're tapping into subscriptions you already have (Claude Code, Codex, Gemini CLI). No new API tokens to buy.
Living decisions. Every deliberation is a hypothesis that can be re-evaluated. "We chose Postgres 3 months ago... re-run with what we know now." Use /council-revisit to compare then vs now.
npx cliagent-council
This clones the repo, installs skills for all detected CLI agents, and you're ready to go.
git clone https://github.com/yogirk/agent-council.git
cd agent-council
./setup
Platform: macOS and Linux. Windows users: use WSL.
Requirements: Bun + at least 2 of these CLI agents:
- Claude Code (claude): skills install to ~/.claude/skills/
- Codex CLI (codex): skills install to ~/.agents/skills/
- Gemini CLI (gemini): skills install to ~/.gemini/skills/

The same slash commands work in all three CLIs. The invoking agent automatically becomes the chairman.
/council "Should we use WebSockets or SSE for real-time updates?"
/council --with-review "Review auth middleware for security issues"
/council --quick "What's the best job queue for Node.js?"
/council-list # List all past sessions
/council-replay council-20260329-143000 # Replay a session in terminal
/council-revisit council-20260329-143000 # Re-run with current context (living decisions)
/council-outcome council-20260329-143000 "It worked great" # Record outcome
/council-nudge council-20260329-143000 --agent codex --correction "Our data will never exceed 100GB"
When invoked from Claude Code, Claude is chairman. From Codex, Codex is chairman. From Gemini, Gemini is chairman. The chairman gives its own independent opinion in Stage 1, then synthesizes all opinions in Stage 3.
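The chairman rule above can be sketched in a few lines of TypeScript. This is an illustrative sketch only: `AgentName`, `Roles`, and `assignRoles` are hypothetical names, not the project's API, and it assumes that an explicit chairman choice (as with `--chairman`) takes precedence over the invoking CLI.

```typescript
// Hypothetical types and function for illustration; not taken from the project source.
type AgentName = "claude" | "codex" | "gemini";

interface Roles {
  chairman: AgentName;
  members: AgentName[];      // every available agent except the chairman
  stage1Agents: AgentName[]; // chairman included: it also gives an independent opinion
}

function assignRoles(
  available: AgentName[],
  invokedFrom: AgentName,
  explicitChairman?: AgentName
): Roles {
  // Assumption: an explicit choice wins, otherwise the invoking CLI chairs.
  const chairman = explicitChairman !== undefined ? explicitChairman : invokedFrom;
  if (available.indexOf(chairman) === -1) {
    throw new Error("chairman " + chairman + " is not among the available agents");
  }
  const members = available.filter((a) => a !== chairman);
  return { chairman: chairman, members: members, stage1Agents: available.slice() };
}

// Invoked from Codex with all three CLIs installed.
const roles = assignRoles(["claude", "codex", "gemini"], "codex");
console.log(roles.chairman, roles.members.join(","));
```

The point of returning `stage1Agents` separately is that the chairman is not excluded from Stage 1; it only gains the extra synthesis duty in Stage 3.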
# Fast mode (default): opinions + synthesis
bin/council --question-file question.txt --project myapp
# Specify chairman explicitly (auto-detected if omitted)
bin/council --question-file question.txt --chairman codex --project myapp
# With peer review
bin/council --question-file question.txt --project myapp --with-review
# Browse past sessions
bin/council list --project myapp
bin/council replay council-20260329-143000 --project myapp
# Nudge: challenge an agent's assumptions after a session
bin/council nudge council-20260329-143000 --agent codex --correction "Budget is not a constraint" --project myapp
# Skip preflight health checks (faster startup)
bin/council --question-file question.txt --project myapp --skip-preflight
                  +------------------+
                  |  Your Question   |
                  +--------+---------+
                           |
             Stage 1: Independent Opinions
                           |
          +----------------+----------------+
          |                |                |
    +-----------+    +-----------+    +-----------+
    |  Claude   |    |   Codex   |    |  Gemini   |
    |   Code    |    |    CLI    |    |    CLI    |
    +-----------+    +-----------+    +-----------+
          |                |                |
          v                v                v
     [Opinion A]      [Opinion B]      [Opinion C]
          |                |                |
          +----------------+----------------+
                           |
            Stage 2: Anonymized Peer Review
               (optional: --with-review)
                           |
              Stage 3: Chairman Synthesis
                           |
                  +------------------+
                  |  Final Verdict   |
                  |  with consensus  |
                  |  and dissent     |
                  +------------------+
                           |
          Stage 4: Targeted Nudge (optional)
          "Your assumption about X is wrong"
                           |
                  +------------------+
                  | Updated Opinion  |
                  | with what changed|
                  +------------------+
Stage 1: ALL agents (including the chairman) answer independently, in parallel. Each gets your question + codebase context. No visibility into what others are producing. Once a quorum of opinions arrives, a grace window starts for slower agents.
Stage 2 (optional): Each agent reviews the others' anonymized opinions. Scores them on correctness, completeness, and feasibility. Produces a ranking.
Stage 3: The chairman (whichever CLI you invoked from) reads all opinions (including its own from Stage 1) and synthesizes: where they agree, where they diverge, and a final recommendation with confidence level. When agents fundamentally disagree, the synthesis flags it explicitly with per-agent confidence so you can decide.
Stage 4 (optional): After reading the verdict, you can nudge a specific agent: "Your assumption about X is wrong." The agent reconsiders with your correction and produces an updated recommendation explaining what changed and what stayed the same.
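The Stage 1 quorum and grace window can be modeled as a small pure function. This is a sketch under stated assumptions, not the project's implementation: `AgentRun`, `Stage1Result`, and `resolveStage1` are hypothetical names, the quorum size of 2 is taken from the "Quorum reached (2/3)" line in the sample transcript, and the closing rule (Stage 1 ends when every agent has answered or timed out, or when the grace window after quorum expires, whichever comes first) is an assumed reading of the description above.

```typescript
// Hypothetical model of Stage 1 timing; names and closing rule are assumptions.
interface AgentRun {
  name: string;
  respondedAtMs: number | null; // null means the agent never responded
  timeoutMs: number;            // per-agent timeout
}

interface Stage1Result {
  accepted: string[]; // agents whose opinions made it into Stage 1
  closedAtMs: number; // moment Stage 1 stops waiting
}

function resolveStage1(runs: AgentRun[], quorum: number, graceMs: number): Stage1Result {
  // Responses that beat their own timeout, earliest first.
  const onTime = runs
    .filter((r) => r.respondedAtMs !== null && r.respondedAtMs <= r.timeoutMs)
    .map((r) => ({ name: r.name, at: r.respondedAtMs as number }))
    .sort((a, b) => a.at - b.at);

  // Time at which each agent is finished: it either answered or hit its timeout.
  const allDoneMs = runs
    .map((r) =>
      r.respondedAtMs !== null && r.respondedAtMs <= r.timeoutMs ? r.respondedAtMs : r.timeoutMs
    )
    .reduce((a, b) => Math.max(a, b), 0);

  // Once the quorum-th opinion arrives, stragglers get graceMs more.
  let closedAtMs = allDoneMs;
  if (onTime.length >= quorum) {
    const graceDeadlineMs = onTime[quorum - 1].at + graceMs;
    closedAtMs = Math.min(allDoneMs, graceDeadlineMs);
  }

  const accepted = onTime.filter((r) => r.at <= closedAtMs).map((r) => r.name);
  return { accepted: accepted, closedAtMs: closedAtMs };
}

// Timings from the sample transcript: quorum at 52.1s, gemini arrives within the 30s grace.
const stage1 = resolveStage1(
  [
    { name: "claude", respondedAtMs: 38200, timeoutMs: 120000 },
    { name: "codex", respondedAtMs: 52100, timeoutMs: 120000 },
    { name: "gemini", respondedAtMs: 64700, timeoutMs: 180000 },
  ],
  2,
  30000
);
console.log(stage1.accepted.join(","), stage1.closedAtMs);
```

Under this model a third agent answering at 90s would be dropped, because the grace window closes at 52.1s + 30s = 82.1s, well before its own timeout.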
Create ~/.council/config.json to customize models, timeouts, and quorum behavior:
{
  "models": {
    "claude": "claude-opus-4-6",
    "codex": "gpt-5.4",
    "gemini": "gemini-3.1-pro"
  },
  "timeout_ms": {
    "claude": 120000,
    "codex": 120000,
    "gemini": 180000
  },
  "quorum_grace_ms": 30000
}
All fields are optional. Missing fields use the defaults shown above.
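One way to picture "missing fields use the defaults" is a merge of a partial user config over the default values. The sketch below is illustrative: `CouncilConfig`, `PartialConfig`, `DEFAULTS`, and `withDefaults` are hypothetical names, reading `~/.council/config.json` from disk is left out, and the per-key merge inside `models` and `timeout_ms` is an assumption (the README does not say whether nested objects are merged or replaced wholesale).

```typescript
// Hypothetical shapes mirroring the JSON example; not the project's actual types.
interface CouncilConfig {
  models: { [agent: string]: string };
  timeout_ms: { [agent: string]: number };
  quorum_grace_ms: number;
}

interface PartialConfig {
  models?: { [agent: string]: string };
  timeout_ms?: { [agent: string]: number };
  quorum_grace_ms?: number;
}

// Default values copied from the example config.
const DEFAULTS: CouncilConfig = {
  models: { claude: "claude-opus-4-6", codex: "gpt-5.4", gemini: "gemini-3.1-pro" },
  timeout_ms: { claude: 120000, codex: 120000, gemini: 180000 },
  quorum_grace_ms: 30000,
};

function withDefaults(user: PartialConfig): CouncilConfig {
  return {
    // Assumption: nested maps are merged key by key, so overriding one agent keeps the rest.
    models: { ...DEFAULTS.models, ...(user.models || {}) },
    timeout_ms: { ...DEFAULTS.timeout_ms, ...(user.timeout_ms || {}) },
    quorum_grace_ms:
      user.quorum_grace_ms !== undefined ? user.quorum_grace_ms : DEFAULTS.quorum_grace_ms,
  };
}

// A user file that only raises the Gemini timeout.
const userConfig: PartialConfig = JSON.parse('{"timeout_ms": {"gemini": 240000}}');
const effective = withDefaults(userConfig);
console.log(effective.timeout_ms.gemini, effective.timeout_ms.claude, effective.quorum_grace_ms);
```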
Council sessions are stored in ~/.council/{project}/. Each session contains:
- meta.json: question, agents, mode, timestamp
- stage1/opinion_*.json: individual agent opinions
- stage2/review_*.json: peer reviews (if --with-review)
- stage4/nudge_*.json: nudge results (if nudge was used)
- synthesis.json: chairman's final verdict
- viewer.html: interactive viewer (open in browser)

Every council session generates a self-contained HTML viewer. Open it in your browser to explore the session.
We ran 3 benchmark questions through the council and compared against a single agent (Claude Opus 4.6). The council consistently found more considerations:
| Benchmark | Single agent (expected considerations found) | Council (expected considerations found) | Delta (considerations) |
|---|---|---|---|
| Database choice (Postgres vs DynamoDB) | 1/5 (20%) | 3/5 (60%) | +2 |
| Error handling (exceptions vs Result types) | 0/5 (0%) | 1/5 (20%) | +1 |
| Deployment (Kubernetes vs Docker Compose vs PaaS) | 3/5 (60%) | 4/5 (80%) | +1 |
| Average | 27% | 53% | +1.3 |
Summed over the three questions, the council surfaced 8 of 15 expected considerations versus 4 of 15 for the single agent, twice as many. This measures consideration coverage (did the response mention scaling? cost? team experience?), not answer quality. To run your own eval, use bun run eval/run-eval.ts --dry-run to see all 10 benchmarks. The test suite has 59 tests with 133 assertions covering adapters, parsing, prompts, preflight, nudge, and viewer generation.
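The averages in the table can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers from the table; `BenchmarkRow`, `coveragePct`, and `mean` are hypothetical helper names and are not part of the project's eval harness.

```typescript
// Rows copied from the benchmark table; helper names are illustrative only.
interface BenchmarkRow {
  name: string;
  expected: number; // expected considerations per question
  single: number;   // considerations found by the single agent
  council: number;  // considerations found by the council
}

const rows: BenchmarkRow[] = [
  { name: "Database choice", expected: 5, single: 1, council: 3 },
  { name: "Error handling", expected: 5, single: 0, council: 1 },
  { name: "Deployment", expected: 5, single: 3, council: 4 },
];

function coveragePct(hits: number, expected: number): number {
  return (hits * 100) / expected;
}

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

const singleAvg = mean(rows.map((r) => coveragePct(r.single, r.expected)));
const councilAvg = mean(rows.map((r) => coveragePct(r.council, r.expected)));
const deltaAvg = mean(rows.map((r) => r.council - r.single));

// prints: 27% 53% +1.3
console.log(
  Math.round(singleAvg) + "%",
  Math.round(councilAvg) + "%",
  "+" + deltaAvg.toFixed(1)
);
```

The unrounded averages are 26.7% and 53.3%, which is why the ratio of the rounded figures (53/27) lands slightly under 2 even though the raw totals are 4 and 8 considerations.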
Agent Council can suggest /council when it detects you're making a decision with trade-offs. After setup, an ambient skill watches for such patterns and suggests /council, /council-revisit, or /council-outcome as appropriate. Suggestions are quiet (a single line after the response), max 2 per session, and never interrupt your flow. Disable in ~/.council/config.json:
{ "proactive": false }
License: MIT
FAQs
We found that cliagent-council demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.