@faviovazquez/deliberate

Multi-agent deliberation skill for AI coding assistants. Agreement is a bug.

Version 0.2.10 (latest), published on npm. Maintainers: 1.

deliberate

Agreement is a bug.

A multi-agent deliberation and brainstorming skill for AI coding assistants. Forces multiple agents to disagree before they agree, surfacing blind spots that single-perspective answers hide.

The Problem: AI Sycophancy

AI chatbots are sycophantic. They validate your claims, confirm your hypotheses, and produce polished answers that sound balanced but come from a single reasoning tradition.

This is not a minor UX inconvenience. It is a structural failure mode:

  • Confirmation bias amplification: LLMs agree with the user's framing by default. If you ask "should we use microservices?", the model builds a case for microservices. If you ask "should we stay monolithic?", the same model builds an equally confident case for monoliths. The answer follows the framing, not the evidence.
  • Delusional spiraling: Chandra et al. (2025) formalized how prolonged conversations with agreeable AI lead to "sycophancy-induced delusional spiraling" — users develop dangerously confident beliefs because the AI never pushes back. Their model shows that even initially rational users converge toward overconfidence when the AI consistently validates.
  • Simulated balance: A single LLM generates one coherent viewpoint per response. When asked for "both sides," it produces a paragraph for each — but both paragraphs come from the same reasoning tradition, the same training distribution, the same latent biases. It simulates balance without achieving genuine adversarial analysis.
  • Hidden trade-offs: Complex decisions involve real trade-offs where the correct answer depends on which values you weight. A single model flattens these into one recommendation, hiding the tensions that should be visible to the decision-maker.
  • Context collapse: In long conversations, the AI anchors on earlier positions. By session 5, you're in an echo chamber of your own assumptions, reinforced by an eager assistant.

The research is clear: a single AI perspective is structurally insufficient for complex decisions.

The Sycophancy Problem — Single AI vs Multi-Agent Deliberation

The Solution

deliberate externalizes the disagreement layer. Instead of asking one agent for a balanced answer, it spawns multiple agents with distinct analytical methods, explicit blind spots, and structural counterweights. They analyze independently, cross-examine each other, and produce a verdict that shows you where they agree, where they disagree, and why.

The disagreements are the point. A single model averages opposing views into one confident recommendation. deliberate keeps them separate so you can decide.

What deliberate brings to the table:

  • Structural disagreement, not simulated balance: Each agent has a declared analytical method and declared blind spots. The polarity pairs (e.g., pragmatic-builder vs reframer: "ship it" vs "does it even need to exist?") guarantee genuine tension.
  • Forced dissent: The protocol requires at least 30% of agents to disagree in Round 2. Unanimous agreement triggers an explicit groupthink warning. The system is designed to make agreement hard.
  • Minority report: Dissenting positions are preserved in full, not averaged away. Sometimes the minority is right — you should see their reasoning.
  • Multi-round cross-examination: Agents don't just state opinions in parallel. In Round 2, each agent must name which other agent they most disagree with, and why. This forces genuine engagement with opposing views.
  • Transparent verdict: The output shows you agreement, disagreement, the specific tensions, and unresolved questions. No confident recommendation hiding real trade-offs.

Quick Start

Install

npx @faviovazquez/deliberate

The interactive installer auto-detects your platform and lets you choose global or workspace installation. You can also pass flags directly:

# Claude Code — global (recommended)
npx @faviovazquez/deliberate --claude --global

# Windsurf — global
npx @faviovazquez/deliberate --windsurf --global

# Cursor — workspace only
npx @faviovazquez/deliberate --cursor

# All detected platforms — global
npx @faviovazquez/deliberate --all --global

# Preview without installing
npx @faviovazquez/deliberate --claude --global --dry-run

# Uninstall
npx @faviovazquez/deliberate --claude --global --uninstall

Your First Deliberation

Claude Code — invoke with /deliberate:

/deliberate "should we migrate from REST to GraphQL?"

Windsurf — invoke with @deliberate or just ask a complex question (Windsurf auto-invokes when the question matches the skill description):

@deliberate should we migrate from REST to GraphQL?

Cursor — invoke with @deliberate:

@deliberate should we migrate from REST to GraphQL?

Manual Installation (git clone)

git clone https://github.com/FavioVazquez/deliberate.git
cd deliberate
# Claude Code
./install.sh --platform claude-code --global

# Windsurf
./install.sh --platform windsurf --global

# Both
./install.sh --platform all --global

Modes

deliberate has 6 modes. Each mode works on every platform — only the invocation syntax differs.

Six Deliberation Modes

Full Deliberation (3 rounds)

All 14 core agents. Round 1: independent analysis. Round 2: cross-examination (agents must disagree). Round 3: crystallization. Produces a structured verdict with minority report.

Claude Code:

/deliberate --full "is this acquisition worth pursuing at 8x revenue?"
/deliberate --full "should we open-source our core library?"

Windsurf / Cursor:

@deliberate full deliberation: is this acquisition worth pursuing at 8x revenue?
@deliberate run all 14 agents on: should we open-source our core library?

Quick Deliberation (2 rounds)

Auto-selected triad. Rounds 1 + 3 only (skips cross-examination). Faster, cheaper, still multi-perspective.

Claude Code:

/deliberate --quick "monorepo or polyrepo?"
/deliberate --quick "should we add Redis caching?"

Windsurf / Cursor:

@deliberate quick: monorepo or polyrepo?
@deliberate quick deliberation on whether to add Redis caching

Triad (domain-optimized)

3 agents selected for a specific domain. 18 pre-defined triads available (see table below). Use when you know the domain of your question.

Claude Code:

/deliberate --triad architecture "should we split the monolith?"
/deliberate --triad decision "build vs buy for notifications"
/deliberate --triad risk "should we launch before the security audit?"
/deliberate --triad ai "should we fine-tune or use RAG?"
/deliberate --triad shipping "can we ship v2 by Friday?"

Windsurf / Cursor:

@deliberate architecture triad: should we split the monolith?
@deliberate use the decision triad for: build vs buy for notifications
@deliberate risk triad: should we launch before the security audit?
@deliberate ai triad: should we fine-tune or use RAG?
@deliberate shipping triad: can we ship v2 by Friday?

Duo / Dialectic

Two agents, two rounds of exchange, then synthesis. Best for binary decisions ("should we X or not?"). Pair agents from the polarity pairs table for maximum disagreement.

Claude Code:

/deliberate --duo assumption-breaker,pragmatic-builder "rewrite the auth layer?"
/deliberate --duo risk-analyst,pragmatic-builder "ship with known tech debt?"
/deliberate --duo classifier,emergence-reader "impose strict types or keep it flexible?"

Windsurf / Cursor:

@deliberate duo with assumption-breaker and pragmatic-builder: should we rewrite the auth layer?
@deliberate dialectic between risk-analyst and pragmatic-builder on shipping with known tech debt
@deliberate duo: classifier vs emergence-reader on imposing strict types vs keeping it flexible

Brainstorm

Creative exploration with multiple agents. Divergent ideas → cross-pollination → convergence into actionable designs. Optionally add --visual for an interactive browser companion.

Claude Code:

/deliberate --brainstorm "how should we redesign onboarding?"
/deliberate --brainstorm --visual "landing page redesign"
/deliberate --brainstorm "new pricing model for our API"

Windsurf / Cursor:

@deliberate brainstorm: how should we redesign onboarding?
@deliberate brainstorm with visual companion: landing page redesign
@deliberate brainstorm: new pricing model for our API

Auto-Detect (no flag)

Just ask your question. The coordinator parses domain keywords, selects the best-matching triad, and runs the 3-round protocol automatically.

Claude Code:

/deliberate "should we migrate from REST to GraphQL?"
/deliberate "is our microservices architecture causing more problems than it solves?"
/deliberate "should we hire senior engineers or train juniors?"

Windsurf / Cursor:

@deliberate should we migrate from REST to GraphQL?
@deliberate is our microservices architecture causing more problems than it solves?
@deliberate should we hire senior engineers or train juniors?
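
The keyword step can be pictured as a small scoring function. The sketch below is illustrative only: the keyword lists are hypothetical (loosely based on the "Which triad should I use?" table), and `pick_triad` and its fallback domain are assumptions, not the coordinator's actual logic.

```python
import re

# Hypothetical keyword lists per triad domain; the coordinator's real
# vocabulary is not documented in this README.
DOMAIN_KEYWORDS = {
    "architecture": {"api", "monolith", "microservices", "rest", "graphql"},
    "decision": {"build", "buy", "pricing", "hire"},
    "shipping": {"ship", "launch", "deadline"},
    "debugging": {"broken", "failing", "bug", "cause"},
}


def pick_triad(question: str, fallback: str = "decision") -> str:
    """Return the domain whose keywords overlap the question the most."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    scores = {domain: len(words & kws) for domain, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # With no keyword hit at all, fall back to an assumed default domain.
    return best if scores[best] > 0 else fallback
```

Under these assumed keywords, the REST-to-GraphQL question would land on the architecture triad and the hiring question on the decision triad.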

Custom Agent Selection

Pick specific agents by name for full control over who deliberates.

Claude Code:

/deliberate --members assumption-breaker,first-principles,bias-detector "why does our cache keep failing?"
/deliberate --members pragmatic-builder,risk-analyst,systems-thinker,inverter "refactor the payment system?"

Windsurf / Cursor:

@deliberate use agents assumption-breaker, first-principles, and bias-detector: why does our cache keep failing?
@deliberate members pragmatic-builder, risk-analyst, systems-thinker, inverter: should we refactor the payment system?
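
As a rough illustration of what validating a `--members` list could involve, the sketch below checks names against the 14 core agents and the 2-14 size range given in the flags reference. `parse_members` is a hypothetical helper, and restricting the list to core agents is an assumption; the README does not say whether optional specialists can be named here.

```python
# The 14 core agent names, as listed in this README.
CORE_AGENTS = (
    "assumption-breaker", "first-principles", "classifier", "formal-verifier",
    "bias-detector", "systems-thinker", "resilience-anchor",
    "adversarial-strategist", "emergence-reader", "incentive-mapper",
    "pragmatic-builder", "reframer", "risk-analyst", "inverter",
)


def parse_members(arg: str) -> list:
    """Split a comma-separated --members value and validate it."""
    names = [n.strip() for n in arg.split(",") if n.strip()]
    unknown = [n for n in names if n not in CORE_AGENTS]
    if unknown:
        raise ValueError(f"unknown agents: {', '.join(unknown)}")
    if len(set(names)) != len(names):
        raise ValueError("each agent may be listed only once")
    # The flags reference gives 2-14 agents for custom selection.
    if not 2 <= len(names) <= len(CORE_AGENTS):
        raise ValueError("choose between 2 and 14 agents")
    return names
```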

Profiles

Pre-defined agent groups for common scenarios. Use when you don't want to pick individual agents.

| Profile | Agents | When to Use |
|---|---|---|
| full | All 14 core agents | Complex decisions with real trade-offs. Claude Code default. |
| lean | assumption-breaker, first-principles, bias-detector, pragmatic-builder, inverter | Fast decisions, limited context. Windsurf/Cursor default. |
| exploration | assumption-breaker, classifier, emergence-reader, reframer, systems-thinker, inverter, risk-analyst | Discovery, open-ended investigation |
| execution | pragmatic-builder, first-principles, adversarial-strategist, bias-detector, formal-verifier | Shipping decisions, technical trade-offs |

Claude Code:

/deliberate --profile exploration "what's the right approach to AI safety for our product?"
/deliberate --profile execution "can we ship this feature by next sprint?"
/deliberate --profile lean "quick take: should we use Postgres or MongoDB?"

Windsurf / Cursor:

@deliberate exploration profile: what's the right approach to AI safety for our product?
@deliberate execution profile: can we ship this feature by next sprint?
@deliberate lean profile: should we use Postgres or MongoDB?

Flags Reference

| Flag | Effect | Example (Claude Code) |
|---|---|---|
| (no flag) | Auto-detect domain, select matching triad | /deliberate "your question" |
| --full | All 14 core agents, 3-round protocol | /deliberate --full "question" |
| --quick | Auto-detect triad, 2-round protocol (skip cross-examination) | /deliberate --quick "question" |
| --duo a,b | Dialectic mode: 2 agents, 2 exchange rounds, then synthesis | /deliberate --duo risk-analyst,pragmatic-builder "question" |
| --triad {domain} | Pre-defined triad for domain, 3-round protocol | /deliberate --triad architecture "question" |
| --members a,b,c | Custom agent selection (2-14 agents), 3-round protocol | /deliberate --members a,b,c "question" |
| --brainstorm | Creative exploration with divergent-convergent flow | /deliberate --brainstorm "question" |
| --profile {name} | Use named profile (full, lean, exploration, execution) | /deliberate --profile exploration "question" |
| --visual | Launch browser-based visual companion | /deliberate --visual --full "question" |
| --save {slug} | Override auto-generated filename slug for output | /deliberate --save my-decision "question" |
| --research | Grounding phase before Round 1: scan codebase + search web. Agents reason from retrieved evidence. Opt-in only. | /deliberate --research "question" |
| --research=web | Web search only (no codebase scan) | /deliberate --research=web "question" |
| --research=code | Codebase scan only (no web search) | /deliberate --research=code "question" |

Windsurf/Cursor note: Flags are expressed as natural language after @deliberate. The skill parses intent from your message. Examples: @deliberate full deliberation: ..., @deliberate quick: ..., @deliberate architecture triad: ..., @deliberate brainstorm: ..., @deliberate with research: ..., @deliberate search the web before answering: ..., @deliberate look at the codebase first: ....

Research Grounding (--research)

By default, agents reason from parametric knowledge — what the model internalized during training. This works well for timeless architectural reasoning. It breaks down for:

  • Your codebase: agents don't know what's in it unless they read it
  • Recent events: model cutoffs miss new library releases, regulatory changes, pricing shifts, CVEs
  • Current benchmarks: performance comparisons from 18 months ago may be reversed today
  • Competitor landscape: what Bloomberg, Palantir, or a new startup shipped last quarter

--research adds a grounding phase before Round 1. The coordinator scans your codebase and/or searches the web, packages the findings into a Codebase Context Summary and Web Research Summary, and injects this evidence into every agent's prompt. Agents then reason from real, retrieved facts — and are required to cite what they found and flag what was missing.

This is opt-in only. It is never the default. Not every question needs research — and research consumes tokens. Use it when parametric knowledge is insufficient.

Research Grounding — --research flag

Research modes

Claude Code:

/deliberate --research "should we adopt Apache Iceberg for our data lake?"
/deliberate --research=web "what are the current alternatives to dbt for data transformation?"
/deliberate --research=code "is our current ingestion pipeline a good candidate for refactoring?"
/deliberate --triad architecture --research "should we migrate our REST APIs to GraphQL?"
/deliberate --brainstorm --research "what data quality monitoring approaches should we consider?"

Windsurf / Cursor:

@deliberate with research: should we adopt Apache Iceberg for our data lake?
@deliberate search the web before answering: what are the current alternatives to dbt for data transformation?
@deliberate look at the codebase first: is our ingestion pipeline a good candidate for refactoring?
@deliberate architecture triad with research: should we migrate our REST APIs to GraphQL?
@deliberate brainstorm do research first: what data quality monitoring approaches should we consider?

What the grounding phase does

Codebase scan (activated by --research or --research=code):

  • Reads AGENTS.md, project README, top-level structure
  • Reads relevant entry points, dependency manifests, config files
  • Follows specific files surfaced by the question domain (auth layers for security questions, hot paths for performance, etc.)
  • Caps at 10 files maximum — precision over volume
  • Produces a Codebase Context Summary shared with all agents

Web search (activated by --research or --research=web):

  • Runs 5 targeted searches maximum — bad searches waste context
  • Prioritizes recency (last 12 months) and primary sources (official docs, engineering blogs, CVE databases, academic papers)
  • Produces a Web Research Summary with findings + source URLs + recency note
  • assumption-breaker is explicitly tasked with scrutinizing source quality
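
To make the three research modes and their caps concrete, here is a minimal sketch that maps a flag to a grounding plan. `GroundingPlan` and `plan_grounding` are hypothetical names; only the flag spellings and the limits (10 files, 5 searches) come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_FILES = 10     # codebase scan cap stated above
MAX_SEARCHES = 5   # web search cap stated above


@dataclass(frozen=True)
class GroundingPlan:
    scan_codebase: bool
    search_web: bool
    max_files: int = MAX_FILES
    max_searches: int = MAX_SEARCHES


def plan_grounding(flag: Optional[str]) -> GroundingPlan:
    """Translate a research flag into which grounding steps run."""
    if flag is None:
        # Research is opt-in: without a flag, agents use parametric knowledge.
        return GroundingPlan(scan_codebase=False, search_web=False)
    if flag == "--research":
        return GroundingPlan(scan_codebase=True, search_web=True)
    if flag == "--research=web":
        return GroundingPlan(scan_codebase=False, search_web=True)
    if flag == "--research=code":
        return GroundingPlan(scan_codebase=True, search_web=False)
    raise ValueError(f"unrecognized research flag: {flag}")
```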

Grounding evidence rules

Agents treat grounding evidence as facts to reason from, not conclusions to accept:

  • They must cite specific findings ("the codebase currently uses library X at version Y, which has Z implication")
  • They must flag gaps ("the web research didn't find benchmark data for >10M records/day — my analysis is therefore first-principles")
  • assumption-breaker scrutinizes source quality and flags cherry-picked or outdated evidence
  • Agents that over-rely on evidence get flagged in the coordinator's Round 2 dispatch

If the platform has no web search or file access tools, the coordinator announces the limitation and falls back gracefully to parametric knowledge.

The 14 Core Agents

Each agent is named by its analytical function — not by historical figures. Every agent declares its method, what it sees that others miss, and what it tends to miss. These declared blind spots are why the polarity pairs matter.

| # | Agent | Function | Tier |
|---|---|---|---|
| 1 | assumption-breaker | Destroys hidden premises, tests by contradiction, dialectical questioning | high |
| 2 | first-principles | Bottom-up derivation, refuses unexplained complexity | mid |
| 3 | classifier | Taxonomic structure, category errors, four-cause analysis | mid |
| 4 | formal-verifier | Computational skeleton, mechanization boundaries, abstraction | mid |
| 5 | bias-detector | Cognitive bias detection, pre-mortem, de-biasing interventions | high |
| 6 | systems-thinker | Feedback loops, leverage points, unintended consequences | mid |
| 7 | resilience-anchor | Control vs acceptance, moral clarity, anti-panic grounding | mid |
| 8 | adversarial-strategist | Terrain reading, competitive dynamics, strategic timing | mid |
| 9 | emergence-reader | Non-action, subtraction, intervention audit, minimum intervention | high |
| 10 | incentive-mapper | Power dynamics, actor incentives, principal-agent problems | mid |
| 11 | pragmatic-builder | Ship it, maintenance cost, over-engineering detection | mid |
| 12 | reframer | Dissolves false problems, frame audit, false dichotomies | high |
| 13 | risk-analyst | Antifragility, tail risk, fragility profile, barbell strategy | high |
| 14 | inverter | Multi-model reasoning, inversion ("what guarantees failure?"), opportunity cost | mid |

Optional Specialists

Activated only when their domain-specific triad is selected:

| Agent | Function | Triads |
|---|---|---|
| ml-intuition | Neural net intuition, training dynamics, jagged frontier | ai, ai-product |
| safety-frontier | Scaling dynamics, capability-safety frontier, phase transitions | ai |
| design-lens | User-centered design, honesty audit, "less but better" | design, ai-product |

Polarity Pairs

These agents are structural counterweights. When both are present, genuine disagreement is almost guaranteed. Use these pairings for --duo mode:

| Pair | Tension |
|---|---|
| assumption-breaker vs first-principles | Top-down destruction vs bottom-up construction |
| classifier vs emergence-reader | Impose structure vs let it emerge |
| adversarial-strategist vs resilience-anchor | Win externally vs govern internally |
| formal-verifier vs incentive-mapper | Abstract purity vs messy human reality |
| pragmatic-builder vs reframer | Ship it vs does it need to exist? |
| pragmatic-builder vs systems-thinker | Fix the bug vs redesign the system |
| risk-analyst vs ml-intuition | Tail paranoia vs empirical iteration |
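
When choosing agents for a duo, it can help to check whether the pair is one of the listed counterweights. The sketch below encodes the pairs as an order-independent lookup; `tension_for` is a hypothetical helper, not part of the package.

```python
from typing import Optional

# Polarity pairs and their tensions, copied from the list above.
# frozenset keys make the lookup independent of argument order.
POLARITY_PAIRS = {
    frozenset({"assumption-breaker", "first-principles"}):
        "Top-down destruction vs bottom-up construction",
    frozenset({"classifier", "emergence-reader"}):
        "Impose structure vs let it emerge",
    frozenset({"adversarial-strategist", "resilience-anchor"}):
        "Win externally vs govern internally",
    frozenset({"formal-verifier", "incentive-mapper"}):
        "Abstract purity vs messy human reality",
    frozenset({"pragmatic-builder", "reframer"}):
        "Ship it vs does it need to exist?",
    frozenset({"pragmatic-builder", "systems-thinker"}):
        "Fix the bug vs redesign the system",
    frozenset({"risk-analyst", "ml-intuition"}):
        "Tail paranoia vs empirical iteration",
}


def tension_for(agent_a: str, agent_b: str) -> Optional[str]:
    """Return the declared tension if the two agents form a polarity pair."""
    return POLARITY_PAIRS.get(frozenset({agent_a, agent_b}))
```

A `None` result does not mean a duo is invalid, only that the pair is not one of the declared counterweights.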

18 Pre-defined Triads

Each triad is a team of 3 agents optimized for a domain. The reasoning chain shows the deliberation flow.

| Domain | Agents | Reasoning Chain |
|---|---|---|
| architecture | classifier + formal-verifier + first-principles | categorize → formalize → simplicity-test |
| strategy | adversarial-strategist + incentive-mapper + resilience-anchor | terrain → incentives → moral grounding |
| ethics | resilience-anchor + assumption-breaker + emergence-reader | duty → questioning → natural order |
| debugging | first-principles + assumption-breaker + formal-verifier | bottom-up → assumptions → formal verify |
| innovation | formal-verifier + emergence-reader + classifier | abstraction → emergence → classification |
| conflict | assumption-breaker + incentive-mapper + resilience-anchor | expose → predict → ground |
| complexity | emergence-reader + classifier + formal-verifier | emergence → categories → formalism |
| risk | adversarial-strategist + resilience-anchor + first-principles | threats → resilience → empirical verify |
| shipping | pragmatic-builder + adversarial-strategist + first-principles | pragmatism → timing → first-principles |
| product | pragmatic-builder + incentive-mapper + reframer | ship it → incentives → reframing |
| decision | inverter + bias-detector + risk-analyst | inversion → biases → tail risk |
| systems | systems-thinker + emergence-reader + classifier | feedback → emergence → structure |
| economics | adversarial-strategist + inverter + incentive-mapper | terrain → models → power |
| uncertainty | risk-analyst + adversarial-strategist + assumption-breaker | tails → threats → premises |
| bias | bias-detector + reframer + assumption-breaker | biases → frame → premises |
| ai | formal-verifier + ml-intuition + safety-frontier | formalism → empirical ML → safety |
| ai-product | pragmatic-builder + ml-intuition + design-lens | ship → ML reality → user |
| design | design-lens + reframer + pragmatic-builder | user → frame → ship |

Which triad should I use?

| If your question is about... | Use triad |
|---|---|
| Code structure, API design, monolith vs micro | architecture |
| Build vs buy, go/no-go, pricing | decision |
| Should we launch/ship? | shipping |
| Competitive moves, market entry | strategy |
| Something feels risky | risk or uncertainty |
| Root cause analysis, why is X broken | debugging |
| Feature design, user experience | product or design |
| AI/ML architecture, model selection | ai |
| AI product decisions | ai-product |
| Team dynamics, organizational change | conflict |
| System reliability, scaling | systems or complexity |
| Ethical implications | ethics |
| Cognitive biases in your decision | bias |

Visual Companion

A browser-based interface that shows deliberation progress in real time. Available on all platforms.

Claude Code:

/deliberate --visual --full "major architecture decision"
/deliberate --brainstorm --visual "redesign the dashboard"

Windsurf / Cursor:

@deliberate full deliberation with visual companion: major architecture decision
@deliberate brainstorm with visual: redesign the dashboard

It provides:

  • Agent Position Map: Force-directed graph showing agents as colored nodes positioned by agreement/disagreement
  • Agreement Matrix: Heatmap of which agents agree/disagree on which points
  • Idea Evolution Timeline (brainstorm): How ideas appeared, forked, merged across phases
  • Verdict Formation (deliberation): Step-by-step visualization of how the verdict emerged
  • Minority Report Panel: Highlighted dissenting positions

Built with plain HTML + JS + Canvas 2D. No framework, no build step. Served locally via a lightweight Node.js file-watcher server.

Starting and stopping the server

The coordinator starts the server automatically when --visual is active. You do not need to start it manually. When your deliberation or brainstorm ends, the coordinator will ask you:

The visual companion server is still running at http://localhost:{port}.
Stop it now? (Y/n)

If you say yes (or press Enter), it runs scripts/stop-server.sh to shut it down cleanly. If you say no, the server keeps running, which is useful if you want to review the session in the browser after the deliberation ends. Either way, it shuts down automatically after 30 minutes of inactivity.

If you ever need to manage the server manually, run these commands from your project root.

  • start-server.sh --project-dir . stores the session under .deliberate/companion/ in your project. This is what allows stop-server.sh to find and kill it automatically.
  • stop-server.sh takes no arguments — it scans .deliberate/companion/ and /tmp/ to find running sessions and kills them.

Claude Code — global install:

bash ~/.claude/skills/deliberate/scripts/start-server.sh --project-dir .
bash ~/.claude/skills/deliberate/scripts/stop-server.sh

Claude Code — local install:

bash .claude/skills/deliberate/scripts/start-server.sh --project-dir .
bash .claude/skills/deliberate/scripts/stop-server.sh

Windsurf — global install:

bash ~/.codeium/windsurf/skills/deliberate/scripts/start-server.sh --project-dir .
bash ~/.codeium/windsurf/skills/deliberate/scripts/stop-server.sh

Windsurf — local install:

bash .windsurf/skills/deliberate/scripts/start-server.sh --project-dir .
bash .windsurf/skills/deliberate/scripts/stop-server.sh

Cursor — local install:

bash .cursor/skills/deliberate/scripts/start-server.sh --project-dir .
bash .cursor/skills/deliberate/scripts/stop-server.sh

Platforms

| Platform | Execution Model | Default Profile | Invocation | Install Paths |
|---|---|---|---|---|
| Claude Code | Parallel subagents (context: fork) | full (14 agents) | /deliberate + flags | ~/.claude/skills/ + ~/.claude/agents/ |
| Windsurf | Sequential role-prompting | lean (5 agents) | @deliberate or auto-invoke | ~/.codeium/windsurf/skills/deliberate/ |
| Cursor | Sequential role-prompting | lean (5 agents) | @deliberate | .cursor/skills/deliberate/ |

How it works on each platform

Claude Code: Each agent runs as a parallel subagent with its own isolated context window. The coordinator dispatches all agents simultaneously in Round 1 and Round 3, and sequentially in Round 2 (cross-examination requires seeing prior outputs). Agents are installed as separate .md files in ~/.claude/agents/ and referenced by the skill protocol in ~/.claude/skills/deliberate/.

Windsurf: No subagent support. The coordinator adopts each agent's persona sequentially within a single context window. Agent definitions are bundled inside the skill directory (~/.codeium/windsurf/skills/deliberate/agents/) and read on demand. The default "lean" profile (5 agents) keeps context usage manageable. Windsurf also auto-invokes the skill when your question matches the skill description — you don't always need to type @deliberate.

Cursor: Same sequential execution as Windsurf. Agents bundled inside .cursor/skills/deliberate/agents/. Workspace-local installation only.
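
The Claude Code dispatch pattern (parallel in Rounds 1 and 3, sequential in Round 2) can be sketched with asyncio. This is an illustration of the scheduling shape only: `run_agent` is a hypothetical stand-in that returns a canned string instead of calling a model, and the prompts are made up.

```python
import asyncio


async def run_agent(name: str, prompt: str) -> str:
    """Hypothetical stand-in for one subagent call; no model is invoked."""
    await asyncio.sleep(0)
    return f"{name}: response to {prompt!r}"


async def deliberate(agents: list, question: str) -> dict:
    # Round 1: independent analysis, all agents dispatched at once.
    round1 = await asyncio.gather(*(run_agent(a, question) for a in agents))

    # Round 2: cross-examination, one agent at a time so that each
    # agent sees Round 1 plus the Round 2 outputs produced so far.
    round2 = []
    for agent in agents:
        context = "\n".join(list(round1) + round2)
        round2.append(await run_agent(agent, f"cross-examine given:\n{context}"))

    # Round 3: crystallization, dispatched in parallel again.
    round3 = await asyncio.gather(*(run_agent(a, "final position") for a in agents))
    return {"round1": list(round1), "round2": round2, "round3": list(round3)}
```

Running `asyncio.run(deliberate(["reframer", "pragmatic-builder"], "ship it?"))` returns one entry per agent for each round. On Windsurf and Cursor the same three rounds would run entirely sequentially inside a single context.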

Configuration

Model Tiers

All agents use sonnet-equivalent models by default. The tier determines model quality:

| Tier | Model | Agents at this tier |
|---|---|---|
| high | claude-sonnet-4 / equivalent | assumption-breaker, bias-detector, emergence-reader, reframer, risk-analyst |
| mid | claude-sonnet-4 / equivalent | All other agents |

To override all agents to a higher tier:

# In your project's config.yaml:
model_tier: high
# WARNING: high-tier models consume significantly more tokens/credits

Custom Model Routing

For per-agent model control, copy the example config:

cp configs/provider-model-slots.example.yaml configs/provider-model-slots.yaml

Then edit to assign specific models per agent:

# configs/provider-model-slots.yaml
assumption-breaker:
  provider: anthropic
  model: claude-sonnet-4-20250514
first-principles:
  provider: anthropic
  model: claude-sonnet-4-20250514
# ... customize per agent

Output Location

All deliberation and brainstorm records are saved to deliberations/ in your project root:

deliberations/
  2025-04-03-14-30-triad-architecture-monolith.md
  2025-04-03-15-00-full-acquisition.md
  2025-04-03-16-00-brainstorm-onboarding.md
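
The example names above appear to follow a timestamp, mode, optional domain, slug pattern. The sketch below reproduces that pattern; `record_filename` is a hypothetical helper inferred from the three examples, not the package's own naming code.

```python
from datetime import datetime
from typing import Optional


def record_filename(when: datetime, mode: str, slug: str,
                    domain: Optional[str] = None) -> str:
    """Build a record name like 2025-04-03-14-30-triad-architecture-monolith.md."""
    parts = [when.strftime("%Y-%m-%d-%H-%M"), mode]
    if domain:
        # Only triad records in the examples carry a domain segment.
        parts.append(domain)
    parts.append(slug)
    return "-".join(parts) + ".md"
```

In this reading, the slug segment is the part that `--save {slug}` overrides.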

Platform Defaults

The configs/defaults.yaml file controls per-platform behavior:

platforms:
  claude-code:
    execution: parallel       # Agents run as parallel subagents
    default_profile: full     # All 14 agents
  windsurf:
    execution: sequential     # Single context, role-prompting
    default_profile: lean     # 5 agents (saves context)
  cursor:
    execution: sequential
    default_profile: lean

You can override the default profile per invocation with --profile.
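
The precedence between the platform default and a per-invocation `--profile` can be sketched as follows. `resolve_profile` is a hypothetical helper, and the dictionary mirrors the defaults shown above rather than reading configs/defaults.yaml.

```python
from typing import Optional

# Defaults mirrored from the configs/defaults.yaml excerpt above.
PLATFORM_DEFAULTS = {"claude-code": "full", "windsurf": "lean", "cursor": "lean"}
PROFILES = {"full", "lean", "exploration", "execution"}


def resolve_profile(platform: str, override: Optional[str] = None) -> str:
    """Pick the profile for one invocation: --profile wins over the default."""
    profile = override if override is not None else PLATFORM_DEFAULTS[platform]
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    return profile
```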

When to Use

Use deliberate for:

  • Complex decisions where trade-offs are real: architecture choices, strategic pivots, build-vs-buy
  • Pricing models, go/no-go decisions, risk assessment
  • Any situation where you suspect a single confident answer hides real trade-offs
  • Decisions where you already have an opinion but suspect you're missing something

Don't use deliberate for:

  • Questions with clear correct answers
  • Trivial preference debates such as tabs vs spaces, which do not need 14 agents
  • Cases where a triad covers the domain: skip --full, since 14 agents consume significant context and API cost

The sweet spot: Decisions where a single confident answer hides real trade-offs. deliberate surfaces what you're not seeing — structured, with the disagreements visible.

Enforcement

The protocol includes safeguards against common failure modes:

Protocol Enforcement

| Rule | What it prevents |
|---|---|
| Hemlock rule | Infinite questioning spirals; forces a 50-word position statement |
| 3-level depth limit | Endless depth; forces position commitment after 3 rounds of questioning |
| 2-message cutoff | Any pair dominating the discussion |
| Dissent quota (30%) | Groupthink; at least 30% of agents must disagree in Round 2 |
| Novelty gate | Stale deliberation; Round 2 must introduce new ideas not in Round 1 |
| Groupthink flag | Unanimous agreement triggers explicit warning to the user |
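
The dissent quota and the groupthink flag reduce to simple arithmetic over Round 2 positions. The sketch below assumes each agent's Round 2 output has already been reduced to a boolean "disagrees" value; how the coordinator actually judges disagreement is not specified here, and `check_round2` is a hypothetical name.

```python
def check_round2(disagrees: dict, quota: float = 0.30) -> list:
    """Return warnings for a Round 2 outcome, given {agent: disagrees?}."""
    total = len(disagrees)
    if total == 0:
        raise ValueError("at least one agent is required")
    dissenters = sum(1 for flag in disagrees.values() if flag)
    warnings = []
    if dissenters == 0:
        # Unanimous agreement triggers the explicit groupthink warning.
        warnings.append("groupthink: unanimous agreement")
    if dissenters / total < quota:
        warnings.append(
            f"dissent quota not met: {dissenters}/{total} agents disagree"
        )
    return warnings
```

With a triad, a single dissenter (1 of 3, about 33%) satisfies the 30% quota; with 10 agents, 2 dissenters would not.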

Verdict Output

Every deliberation produces a structured verdict saved to deliberations/:

Structured Verdict Output

## Deliberation Verdict

### Problem
{Original question}

### Agents Present
{List of agents with functions}

### Mode
{Full / Quick / Duo / Triad: {domain}}

### Consensus Position
{Position held by 2/3+ agents, if one exists}

### Key Insights by Agent
{2-3 sentence summary per agent}

### Points of Agreement
{Where agents converged}

### Points of Disagreement
{Where agents diverged, with the specific tension}

### Minority Report
{Dissenting positions with full reasoning — sometimes the minority is right}

### Verdict Type
{consensus | majority | split | dilemma}

### Recommended Next Steps
{1-3 concrete actions}

### Unresolved Questions
{Questions raised but not resolved}

Verdict types:

  • consensus: 2/3+ agree, minority report recorded
  • majority: Simple majority, significant dissent recorded
  • split: No majority, all positions presented equally
  • dilemma: Genuine dilemma with no clear resolution — the agents surfaced the tension, you decide
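
The first three verdict types follow from vote shares, so they can be sketched as a small classifier. This is a simplification under stated assumptions: each agent's final position is reduced to a label, and "dilemma" is passed in as an explicit flag because the README describes it qualitatively rather than as a threshold. `verdict_type` is a hypothetical name.

```python
from collections import Counter
from fractions import Fraction


def verdict_type(positions: list, dilemma: bool = False) -> str:
    """Classify a deliberation from one position label per agent."""
    if not positions:
        raise ValueError("at least one position is required")
    if dilemma:
        return "dilemma"
    top_count = Counter(positions).most_common(1)[0][1]
    # Fractions avoid floating-point error right at the 2/3 boundary.
    share = Fraction(top_count, len(positions))
    if share >= Fraction(2, 3):
        return "consensus"
    if share > Fraction(1, 2):
        return "majority"
    return "split"
```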

References

  • Chandra, Y., Mishra, C., & Flynn, B. (2025). Can AIeli-bots turn us all delusional? How AI sycophancy, AI psychosis, and human self-correction interact. arXiv:2602.19141. The formal model of sycophancy-induced delusional spiraling that motivated this project.
  • Council of High Intelligence (github.com/0xNyk/council-of-high-intelligence): The original multi-agent deliberation system for Claude Code. deliberate redesigns the agent roster around analytical functions rather than personas, adds enforcement rules, and extends to multiple platforms.
  • Superpowers Brainstorming Skill (github.com/obra/superpowers): The brainstorming skill and browser-based visual companion architecture.

License

MIT

Keywords

ai

Package last updated on 04 Apr 2026
