MCP Servers

LLM Council MCP

Enables Claude Code to consult external LLMs (GPT, Gemini) through multi-turn sessions for second opinions, parallel consultations, and web-grounded research.

README

LLM Council MCP

An MCP server that lets Claude Code consult external LLMs (GPT, Gemini) through multi-turn sessions. Get a second opinion, run parallel consultations, or do web-grounded research — all without leaving your Claude Code workflow.

Why?

Claude Code is powerful, but sometimes you want to:

Get a second opinion on architecture decisions from GPT or Gemini
Cross-reference answers by asking multiple models the same question
Web-grounded research using Gemini's Google Search or OpenAI's web search
Multi-turn conversations with external models while Claude orchestrates

This MCP server makes all of that possible with a simple tool interface.

Tools

Tool	Description
`council`	Multi-turn chat with an external LLM. Auto-creates sessions.
`council_research`	Web-grounded research via LLM + live search. Stateless.
`council_inject`	Inject context (files, docs) into a session without an LLM call.
`council_sessions`	List all active sessions with usage stats.
`council_delete`	Delete a session and its history.
`council_reset`	Clear conversation history, keep session config.

Supported Providers

Provider	Default Model	Features
OpenAI	`gpt-5.4`	Reasoning (low/medium effort), web search
Gemini	`gemini-3.1-pro-preview`	Thinking levels, Google Search grounding

Quick Start

1. Get API Keys

You need at least one:

OpenAI: platform.openai.com/api-keys
Gemini: aistudio.google.com/apikey

2. Add to Claude Code

Add this to your .mcp.json (in your home directory or project root):

{
  "mcpServers": {
    "llm-council": {
      "command": "uvx",
      "args": ["llm-council-mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GEMINI_API_KEY": "AI..."
      }
    }
  }
}

Alternative — install locally with uv:

{
  "mcpServers": {
    "llm-council": {
      "command": "uv",
      "args": ["--directory", "/path/to/llm-council-mcp", "run", "python", "-m", "llm_council_mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "GEMINI_API_KEY": "AI..."
      }
    }
  }
}

3. Restart Claude Code

Claude Code will pick up the new MCP server on restart. You should see llm-council tools available.

Usage Patterns

Dispatching: Agent Teams or Background Subagents

Council tools are blocking calls (10-30s per response). Never call them from the main conversation. Two dispatch patterns:

Agent Teams (preferred in Claude Code): Use TeamCreate with one teammate per provider. Teammates can send real-time status updates via SendMessage -- errors surface immediately, results stream back as each provider responds.

Background Subagents (fallback): Use Agent with run_in_background=true, one per provider. Simpler but no intermediate status updates -- the caller only sees the final result.

User: "Ask GPT and Gemini what they think about this architecture"

Claude Code dispatches:
  -> Teammate/Agent 1: calls council(provider="openai", ...)
  -> Teammate/Agent 2: calls council(provider="gemini", ...)

Results arrive independently as each provider responds.

One Agent Per Provider

When consulting multiple providers, always use separate agents -- one per provider. This ensures the faster provider's results arrive immediately without waiting for the slower one.

Session Management

Sessions persist across calls within a conversation:

# First call creates the session
council(session="arch-review", message="Review this design...", provider="openai")

# Follow-up uses the same session (conversation continues)
council(session="arch-review", message="What about error handling?")

# Inject context without an LLM call
council_inject(session="arch-review", content="<file contents>", label="schema.sql")

# Clean up when done
council_delete(session="arch-review")

Web Research

# Stateless web-grounded research
council_research(query="What are the latest MCP server best practices?", provider="gemini")

Configuration

Environment Variables

Variable	Required	Description
`OPENAI_API_KEY`	For OpenAI provider	OpenAI API key
`GEMINI_API_KEY`	For Gemini provider	Google AI Studio API key
`LLM_COUNCIL_DATA_DIR`	No	Data directory (default: `~/.local/share/llm-council-mcp/`)
`LLM_COUNCIL_LOG_DIR`	No	Log directory (default: `$LLM_COUNCIL_DATA_DIR/logs/`)

You only need the API key for the provider(s) you use. If you only use Gemini, you don't need an OpenAI key (and vice versa). The key is loaded lazily when the provider is first called.

Default System Prompt

Create $LLM_COUNCIL_DATA_DIR/config.json to set a default system prompt applied to all sessions:

{
  "default_system_prompt": "You are a senior software architect. Be concise."
}

Custom system prompts passed to council() are appended after the default.

Error Handling

Council tools return errors as MCP CallToolResult with isError: true instead of throwing exceptions. This ensures the calling agent always gets a parseable result it can relay to the user.

Error responses include structured fields:

{
  "error": true,
  "provider": "gemini",
  "model": "gemini-3.1-pro-preview",
  "error_type": "RuntimeError",
  "phase": "headers",
  "retryable": true,
  "http_status": 503,
  "message": "Gemini API error 503: This model is currently experiencing high demand."
}

Field	Description
`error_type`	Exception class name (`RuntimeError`, `timeout`, etc.)
`phase`	Where the failure occurred: `connect`, `headers`, `stream`, `timeout`, `unknown`
`retryable`	Whether the error is transient and safe to retry
`http_status`	HTTP status code if applicable (429, 503, etc.)

Streaming

Both providers use streaming HTTP (SSE) internally. This means:

Instant error detection: HTTP errors (503, 429) surface immediately from response headers instead of hanging until a timeout fires.
No arbitrary timeouts: The connection stays alive as long as the provider is generating tokens. No risk of cutting off legitimate long responses.
Mid-stream resilience: If a connection drops after partial data, the error is reported with context about how much data was received.

Cost Tracking

Every council call returns usage stats including estimated cost:

{
  "provider": "openai",
  "model": "gpt-5.4",
  "session": "review-gpt",
  "response": "...",
  "usage": {
    "input_tokens": 1250,
    "output_tokens": 890,
    "reasoning_tokens": 2048,
    "cost_usd": 0.021
  }
}

Session-level cost tracking is available via council_sessions.

Development

# Clone and install
git clone https://github.com/Envious-Labs-LLC/llm-council-mcp.git
cd llm-council-mcp
uv sync

# Run directly
uv run python -m llm_council_mcp

# Test with MCP Inspector
npx @modelcontextprotocol/inspector uv run python -m llm_council_mcp

Adding a New Provider

Create src/llm_council_mcp/providers/yourprovider.py implementing LLMProvider
Add pricing to pricing.py
Add model profiles to model_profiles.py
Register in providers/__init__.py

License

MIT — see LICENSE.

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured