LLM Council MCP

LLM Council MCP

Enables Claude Code to consult external LLMs (GPT, Gemini) through multi-turn sessions for second opinions, parallel consultations, and web-grounded research.

Category
Visit Server

README

LLM Council MCP

An MCP server that lets Claude Code consult external LLMs (GPT, Gemini) through multi-turn sessions. Get a second opinion, run parallel consultations, or do web-grounded research — all without leaving your Claude Code workflow.

Why?

Claude Code is powerful, but sometimes you want to:

  • Get a second opinion on architecture decisions from GPT or Gemini
  • Cross-reference answers by asking multiple models the same question
  • Web-grounded research using Gemini's Google Search or OpenAI's web search
  • Multi-turn conversations with external models while Claude orchestrates

This MCP server makes all of that possible with a simple tool interface.

Tools

Tool Description
council Multi-turn chat with an external LLM. Auto-creates sessions.
council_research Web-grounded research via LLM + live search. Stateless.
council_inject Inject context (files, docs) into a session without an LLM call.
council_sessions List all active sessions with usage stats.
council_delete Delete a session and its history.
council_reset Clear conversation history, keep session config.

Supported Providers

Provider Default Model Features
OpenAI gpt-5.4 Reasoning (low/medium effort), web search
Gemini gemini-3.1-pro-preview Thinking levels, Google Search grounding

Quick Start

1. Get API Keys

You need at least one:

2. Add to Claude Code

Add this to your .mcp.json (in your home directory or project root):

{
  "mcpServers": {
    "llm-council": {
      "command": "uvx",
      "args": ["llm-council-mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GEMINI_API_KEY": "AI..."
      }
    }
  }
}

Alternative — install locally with uv:

{
  "mcpServers": {
    "llm-council": {
      "command": "uv",
      "args": ["--directory", "/path/to/llm-council-mcp", "run", "python", "-m", "llm_council_mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GEMINI_API_KEY": "AI..."
      }
    }
  }
}

3. Restart Claude Code

Claude Code will pick up the new MCP server on restart. You should see llm-council tools available.

Usage Patterns

Dispatching: Agent Teams or Background Subagents

Council tools are blocking calls (10-30s per response). Never call them from the main conversation. Two dispatch patterns:

Agent Teams (preferred in Claude Code): Use TeamCreate with one teammate per provider. Teammates can send real-time status updates via SendMessage -- errors surface immediately, results stream back as each provider responds.

Background Subagents (fallback): Use Agent with run_in_background=true, one per provider. Simpler but no intermediate status updates -- the caller only sees the final result.

User: "Ask GPT and Gemini what they think about this architecture"

Claude Code dispatches:
  -> Teammate/Agent 1: calls council(provider="openai", ...)
  -> Teammate/Agent 2: calls council(provider="gemini", ...)

Results arrive independently as each provider responds.

One Agent Per Provider

When consulting multiple providers, always use separate agents -- one per provider. This ensures the faster provider's results arrive immediately without waiting for the slower one.

Session Management

Sessions persist across calls within a conversation:

# First call creates the session
council(session="arch-review", message="Review this design...", provider="openai")

# Follow-up uses the same session (conversation continues)
council(session="arch-review", message="What about error handling?")

# Inject context without an LLM call
council_inject(session="arch-review", content="<file contents>", label="schema.sql")

# Clean up when done
council_delete(session="arch-review")

Web Research

# Stateless web-grounded research
council_research(query="What are the latest MCP server best practices?", provider="gemini")

Configuration

Environment Variables

Variable Required Description
OPENAI_API_KEY For OpenAI provider OpenAI API key
GEMINI_API_KEY For Gemini provider Google AI Studio API key
LLM_COUNCIL_DATA_DIR No Data directory (default: ~/.local/share/llm-council-mcp/)
LLM_COUNCIL_LOG_DIR No Log directory (default: $LLM_COUNCIL_DATA_DIR/logs/)

You only need the API key for the provider(s) you use. If you only use Gemini, you don't need an OpenAI key (and vice versa). The key is loaded lazily when the provider is first called.

Default System Prompt

Create $LLM_COUNCIL_DATA_DIR/config.json to set a default system prompt applied to all sessions:

{
  "default_system_prompt": "You are a senior software architect. Be concise."
}

Custom system prompts passed to council() are appended after the default.

Error Handling

Council tools return errors as MCP CallToolResult with isError: true instead of throwing exceptions. This ensures the calling agent always gets a parseable result it can relay to the user.

Error responses include structured fields:

{
  "error": true,
  "provider": "gemini",
  "model": "gemini-3.1-pro-preview",
  "error_type": "RuntimeError",
  "phase": "headers",
  "retryable": true,
  "http_status": 503,
  "message": "Gemini API error 503: This model is currently experiencing high demand."
}
Field Description
error_type Exception class name (RuntimeError, timeout, etc.)
phase Where the failure occurred: connect, headers, stream, timeout, unknown
retryable Whether the error is transient and safe to retry
http_status HTTP status code if applicable (429, 503, etc.)

Streaming

Both providers use streaming HTTP (SSE) internally. This means:

  • Instant error detection: HTTP errors (503, 429) surface immediately from response headers instead of hanging until a timeout fires.
  • No arbitrary timeouts: The connection stays alive as long as the provider is generating tokens. No risk of cutting off legitimate long responses.
  • Mid-stream resilience: If a connection drops after partial data, the error is reported with context about how much data was received.

Cost Tracking

Every council call returns usage stats including estimated cost:

{
  "provider": "openai",
  "model": "gpt-5.4",
  "session": "review-gpt",
  "response": "...",
  "usage": {
    "input_tokens": 1250,
    "output_tokens": 890,
    "reasoning_tokens": 2048,
    "cost_usd": 0.021
  }
}

Session-level cost tracking is available via council_sessions.

Development

# Clone and install
git clone https://github.com/Envious-Labs-LLC/llm-council-mcp.git
cd llm-council-mcp
uv sync

# Run directly
uv run python -m llm_council_mcp

# Test with MCP Inspector
npx @modelcontextprotocol/inspector uv run python -m llm_council_mcp

Adding a New Provider

  1. Create src/llm_council_mcp/providers/yourprovider.py implementing LLMProvider
  2. Add pricing to pricing.py
  3. Add model profiles to model_profiles.py
  4. Register in providers/__init__.py

License

MIT — see LICENSE.

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured