debate-mcp

MCP server that stress-tests your decisions with adversarial AI debate. GPT vs Gemini, Skeptic vs Steelman, grounded in web search.

Version: 1.0.0 (latest) · Source: npm · Maintainers: 1

Debate MCP

Stress-test your decisions before you commit. An MCP server that runs adversarial AI debates between frontier models, grounded in live web search.

Most AI tools optimize for consensus. Debate MCP optimizes for finding where your plan breaks.

How It Works

You describe your plan
        |
        v
  [Web Search] -- gathers current facts, laws, regulations
        |
        v
  +-----------+          +-----------+
  |  SKEPTIC  |          | STEELMAN  |
  |  (GPT)    |          | (Gemini)  |
  |           |          |           |
  | Attacks   |          | Finds the |
  | your plan |          | strongest |
  | ruthlessly|          | version,  |
  |           |          | then      |
  |           |          | stress-   |
  |           |          | tests it  |
  +-----------+          +-----------+
        |    Round 2: they     |
        |    read each other   |
        |    (anonymized) and  |
        +--- argue back -------+
                  |
                  v
        [Structured synthesis]
        Recommendation + Crux +
        What Would Falsify +
        Unresolved disagreements
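
The flow in the diagram can be sketched roughly as follows. This is an illustration only: `buildBrief`, `round2Prompt`, `runDebate`, and the prompt wording are hypothetical names and text, not taken from the debate-mcp source, and running the two roles in parallel is an assumption.

```typescript
// Hypothetical sketch: none of these names come from the debate-mcp source.
type Role = "skeptic" | "steelman";

// A model caller takes a prompt and resolves to the model's reply.
type ModelCall = (prompt: string) => Promise<string>;

// Shared brief: the plan plus the web-search evidence, restated in every round.
function buildBrief(plan: string, evidence: string): string {
  return `PLAN:\n${plan}\n\nVERIFIED EVIDENCE:\n${evidence}`;
}

// Round 2 prompt: the opposing answer appears under a neutral label,
// so neither model learns which model or role wrote it.
function round2Prompt(brief: string, otherAnswer: string): string {
  return (
    `${brief}\n\nANOTHER ANALYST WROTE:\n${otherAnswer}\n\n` +
    "Respond: concede what is right and rebut what is wrong."
  );
}

async function runDebate(
  plan: string,
  evidence: string,
  models: Record<Role, ModelCall>,
): Promise<{ round1: Record<Role, string>; round2: Record<Role, string> }> {
  const brief = buildBrief(plan, evidence);

  // Round 1: both roles answer independently (in parallel here, an assumption).
  const [skeptic1, steelman1] = await Promise.all([
    models.skeptic(`Attack this plan.\n\n${brief}`),
    models.steelman(
      `Build the strongest version of this plan, then stress-test it.\n\n${brief}`,
    ),
  ]);

  // Round 2: each role reads the other's round 1 answer and argues back.
  const [skeptic2, steelman2] = await Promise.all([
    models.skeptic(round2Prompt(brief, steelman1)),
    models.steelman(round2Prompt(brief, skeptic1)),
  ]);

  return {
    round1: { skeptic: skeptic1, steelman: steelman1 },
    round2: { skeptic: skeptic2, steelman: steelman2 },
  };
}
```

Passing the model callers in as plain functions keeps the sketch independent of any particular SDK; stub functions are enough to exercise it.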

Quick Start

1. Install

npx debate-mcp

2. Add to Claude Code

claude mcp add debate npx debate-mcp \
  -e OPENAI_API_KEY=sk-... \
  -e GEMINI_API_KEY=AI...

3. Use it

Just tell Claude: "debate this", "what am I missing", "stress-test this plan", or "is this the right call".

> [!TIP]
> You can also pass domain and current_leaning for a targeted debate, for example: "Debate this as a tax attorney. I'm leaning toward electing S-Corp."

What Makes This Different

| Feature | Why it matters |
| --- | --- |
| Asymmetric roles | One model attacks (Skeptic); the other defends, then stress-tests (Steelman). Per the research listed under The Research Behind It ("Peacemaker or Troublemaker", 2025), this outperforms giving both models the same prompt. |
| Anonymized cross-examination | In Round 2, each model sees the other's work labeled "another analyst" to prevent identity bias. Based on "When Identity Skews Debate" (NeurIPS 2025). |
| Web search grounding | Before the debate, the server searches for current facts, laws, and regulations. Both models receive the results as VERIFIED evidence and must flag ungrounded claims as UNVERIFIED. |
| Confirmation bias attack | Tell it what you're leaning toward. The Skeptic will specifically attack that leaning. |
| Domain expertise | Pass domain: "tax attorney" or "systems architect" to make both analysts domain-specific. |
| Constrained synthesis | The output is forced into a structured format: Recommendation, Crux of Disagreement, What Would Falsify, Risk of Acting vs Waiting. This prevents the AI from smoothing real disagreements into false consensus. |
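
One way to picture the constrained synthesis is as a typed record whose sections must all be filled in. The sketch below is an assumption about shape only: the `Synthesis` interface, its field names, and `missingSections` are hypothetical and do not come from the package.

```typescript
// Hypothetical shape for the structured synthesis; field names are assumptions.
interface Synthesis {
  recommendation: string;
  cruxOfDisagreement: string;
  whatWouldFalsify: string;
  riskOfActingVsWaiting: string;
  unresolvedDisagreements: string[];
}

// The four text sections named in the table above.
const REQUIRED_TEXT_FIELDS = [
  "recommendation",
  "cruxOfDisagreement",
  "whatWouldFalsify",
  "riskOfActingVsWaiting",
] as const;

// Returns the names of required sections that are missing or blank,
// in the order they are listed in REQUIRED_TEXT_FIELDS.
function missingSections(s: Partial<Synthesis>): string[] {
  return REQUIRED_TEXT_FIELDS.filter((key) => {
    const value = s[key];
    return value === undefined || value.trim() === "";
  });
}
```

A caller could reject or retry a synthesis whenever `missingSections` returns a non-empty list, which is one simple way to keep a model from skipping the uncomfortable sections.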

Example

Input: "Should we elect S-Corp status? Net profit $40K, based in NYC."
Domain: tax attorney
Current leaning: "I think S-Corp will save on self-employment tax"

What happens:

  • Web search pulls current NYC tax rates, QBI rules, IRS thresholds
  • Skeptic leads with: "At $40K net profit in NYC, S-Corp election is mathematically guaranteed to lose you money" and explains exactly why
  • Steelman finds the strongest case for S-Corp, then stress-tests it against NYC-specific tax penalties
  • Cross-examination: the Skeptic concedes the QBI interaction point; the Steelman concedes that compliance costs erase the savings
  • Synthesis: Don't elect. Here's the specific profit threshold where it flips.

Configuration

Environment Variables

Required (at minimum):

| Variable | Description |
| --- | --- |
| `OPENAI_API_KEY` | API key for the Skeptic model (OpenAI by default) |
| `GEMINI_API_KEY` | API key for the Steelman model (Gemini by default) |

Model configuration:

| Variable | Default | Description |
| --- | --- | --- |
| `SKEPTIC_MODEL` | `gpt-5.4` | Model for the Skeptic role |
| `SKEPTIC_BASE_URL` | OpenAI default | Base URL for the Skeptic API (change to use Grok, Groq, Mistral, etc.) |
| `STEELMAN_MODEL` | `gemini-3.1-pro-preview` | Model for the Steelman role |
| `STEELMAN_PROVIDER` | `gemini` | Set to `openai` to use any OpenAI-compatible API for Steelman |
| `STEELMAN_BASE_URL` | - | Base URL when using `STEELMAN_PROVIDER=openai` |
| `STEELMAN_API_KEY` | Falls back to `GEMINI_API_KEY` | API key when using `STEELMAN_PROVIDER=openai` |
| `CALL_TIMEOUT_MS` | `90000` | Timeout per API call (ms) |
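
The defaults and fallbacks in the table could be resolved along these lines. This is a minimal sketch, not the package's actual loader: `DebateConfig` and `resolveConfig` are hypothetical names, and falling back to 90000 when `CALL_TIMEOUT_MS` is not a positive number is an assumption.

```typescript
// Hypothetical config shape; the property names are assumptions.
interface DebateConfig {
  skepticModel: string;
  skepticBaseUrl: string | undefined; // undefined means "use the OpenAI default"
  skepticApiKey: string | undefined;
  steelmanModel: string;
  steelmanProvider: string;
  steelmanBaseUrl: string | undefined;
  steelmanApiKey: string | undefined;
  callTimeoutMs: number;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): DebateConfig {
  const timeout = Number(env.CALL_TIMEOUT_MS ?? "90000");
  return {
    skepticModel: env.SKEPTIC_MODEL ?? "gpt-5.4",
    skepticBaseUrl: env.SKEPTIC_BASE_URL,
    skepticApiKey: env.OPENAI_API_KEY,
    steelmanModel: env.STEELMAN_MODEL ?? "gemini-3.1-pro-preview",
    steelmanProvider: env.STEELMAN_PROVIDER ?? "gemini",
    steelmanBaseUrl: env.STEELMAN_BASE_URL,
    // STEELMAN_API_KEY falls back to GEMINI_API_KEY, as the table states.
    steelmanApiKey: env.STEELMAN_API_KEY ?? env.GEMINI_API_KEY,
    // Assumption: an unparsable or non-positive timeout reverts to the default.
    callTimeoutMs: Number.isFinite(timeout) && timeout > 0 ? timeout : 90000,
  };
}
```

In a Node.js process this would typically be called as `resolveConfig(process.env)`.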

Use Any Model Provider

The Skeptic role works with any OpenAI-compatible API out of the box. Just change the base URL:

# Grok (xAI)
SKEPTIC_BASE_URL=https://api.x.ai/v1 SKEPTIC_MODEL=grok-3 OPENAI_API_KEY=xai-...

# Groq
SKEPTIC_BASE_URL=https://api.groq.com/openai/v1 SKEPTIC_MODEL=llama-4-scout OPENAI_API_KEY=gsk_...

# Ollama (local, free)
SKEPTIC_BASE_URL=http://localhost:11434/v1 SKEPTIC_MODEL=llama3 OPENAI_API_KEY=ollama

# Mistral
SKEPTIC_BASE_URL=https://api.mistral.ai/v1 SKEPTIC_MODEL=mistral-large OPENAI_API_KEY=...

The Steelman role uses Gemini by default (for Google Search grounding). To use a different provider, set STEELMAN_PROVIDER=openai and configure the base URL.
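
Swapping the base URL works because OpenAI-compatible providers expose the same chat completions route under their own root. The sketch below shows how such a request could be assembled; `buildChatRequest` is a hypothetical helper, and whether debate-mcp builds requests this way or goes through an SDK is not stated here.

```typescript
// Hypothetical helper, not part of debate-mcp.
interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string,
): ChatRequest {
  // Strip trailing slashes so the path joins cleanly.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Same code path, different provider: only the base URL, key, and model change.
const ollamaRequest = buildChatRequest(
  "http://localhost:11434/v1",
  "ollama",
  "llama3",
  "Attack this plan.",
);
```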

MCP Configuration (.mcp.json)

{
  "mcpServers": {
    "debate": {
      "command": "npx",
      "args": ["-y", "debate-mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GEMINI_API_KEY": "AI..."
      }
    }
  }
}

> [!NOTE]
> Bring your own API keys. Debate MCP calls OpenAI and Google APIs directly. You are responsible for your own API usage and costs. A typical debate uses ~20,000-30,000 tokens across both providers.

Tool Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| `context` | Yes | The plan, decision, or situation to debate. Include all relevant details. |
| `question` | No | Specific question to focus the debate on. |
| `domain` | No | Domain expertise: "tax attorney", "systems architect", "financial advisor", etc. |
| `current_leaning` | No | What you're leaning toward. The Skeptic attacks this to counter confirmation bias. |
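
For clients that talk to the server directly, the parameters above map onto the `arguments` of an MCP `tools/call` request. The sketch below is illustrative: `DebateArgs` and `buildToolCall` are hypothetical, and the tool name `"debate"` is an assumption, since the tool's registered name is not given here.

```typescript
// Argument shape taken from the parameter table; only `context` is required.
interface DebateArgs {
  context: string;
  question?: string;
  domain?: string;
  current_leaning?: string;
}

// Builds a JSON-RPC 2.0 message in the form MCP uses for tool calls.
function buildToolCall(id: number, toolName: string, args: DebateArgs) {
  if (args.context.trim() === "") {
    throw new Error("context is required");
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: toolName, arguments: args },
  };
}

// "debate" is a placeholder tool name, not confirmed by the package.
const exampleCall = buildToolCall(1, "debate", {
  context: "Should we elect S-Corp status? Net profit $40K, based in NYC.",
  domain: "tax attorney",
  current_leaning: "I think S-Corp will save on self-employment tax",
});
```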

The Research Behind It

Debate MCP's design is based on peer-reviewed research on multi-agent debate:

  • Asymmetric roles outperform identical prompts ("Peacemaker or Troublemaker: How Sycophancy Shapes Multi-Agent Debate", 2025)
  • Anonymized cross-examination prevents identity bias ("When Identity Skews Debate", NeurIPS 2025)
  • Steelmanning before disagreeing forces genuine engagement (Kahneman's Adversarial Collaboration framework)
  • Re-stating the original question each round prevents context drift ("Talk Isn't Always Cheap", ICML 2025)
  • Caller-model synthesis avoids positional commitment bias from debaters ("Auditing Multi-Agent LLM Reasoning Trees", 2025)
  • Ray Dalio's triangulation method: get independent expert opinions, map convergence and divergence, then decide

When To Use It

Good for: Taxes, legal decisions, financial planning, business strategy, architecture choices, investment analysis, contract terms, hiring decisions, production deployments.

Not for: Simple coding tasks, quick lookups, routine bug fixes, or questions with obvious answers.

License

MIT

Keywords

mcp

Package last updated on 11 Apr 2026
