LLM Council MCP
Enables Claude Code to consult external LLMs (GPT, Gemini) through multi-turn sessions for second opinions, parallel consultations, and web-grounded research.
README
LLM Council MCP
An MCP server that lets Claude Code consult external LLMs (GPT, Gemini) through multi-turn sessions. Get a second opinion, run parallel consultations, or do web-grounded research — all without leaving your Claude Code workflow.
Why?
Claude Code is powerful, but sometimes you want to:
- Get a second opinion on architecture decisions from GPT or Gemini
- Cross-reference answers by asking multiple models the same question
- Web-grounded research using Gemini's Google Search or OpenAI's web search
- Multi-turn conversations with external models while Claude orchestrates
This MCP server makes all of that possible with a simple tool interface.
Tools
| Tool | Description |
|---|---|
council |
Multi-turn chat with an external LLM. Auto-creates sessions. |
council_research |
Web-grounded research via LLM + live search. Stateless. |
council_inject |
Inject context (files, docs) into a session without an LLM call. |
council_sessions |
List all active sessions with usage stats. |
council_delete |
Delete a session and its history. |
council_reset |
Clear conversation history, keep session config. |
Supported Providers
| Provider | Default Model | Features |
|---|---|---|
| OpenAI | gpt-5.4 |
Reasoning (low/medium effort), web search |
| Gemini | gemini-3.1-pro-preview |
Thinking levels, Google Search grounding |
Quick Start
1. Get API Keys
You need at least one:
- OpenAI: platform.openai.com/api-keys
- Gemini: aistudio.google.com/apikey
2. Add to Claude Code
Add this to your .mcp.json (in your home directory or project root):
{
"mcpServers": {
"llm-council": {
"command": "uvx",
"args": ["llm-council-mcp"],
"env": {
"OPENAI_API_KEY": "sk-...",
"GEMINI_API_KEY": "AI..."
}
}
}
}
Alternative — install locally with uv:
{
"mcpServers": {
"llm-council": {
"command": "uv",
"args": ["--directory", "/path/to/llm-council-mcp", "run", "python", "-m", "llm_council_mcp"],
"env": {
"OPENAI_API_KEY": "sk-...",
"GEMINI_API_KEY": "AI..."
}
}
}
}
3. Restart Claude Code
Claude Code will pick up the new MCP server on restart. You should see llm-council tools available.
Usage Patterns
Dispatching: Agent Teams or Background Subagents
Council tools are blocking calls (10-30s per response). Never call them from the main conversation. Two dispatch patterns:
Agent Teams (preferred in Claude Code): Use TeamCreate with one teammate per provider. Teammates can send real-time status updates via SendMessage -- errors surface immediately, results stream back as each provider responds.
Background Subagents (fallback): Use Agent with run_in_background=true, one per provider. Simpler but no intermediate status updates -- the caller only sees the final result.
User: "Ask GPT and Gemini what they think about this architecture"
Claude Code dispatches:
-> Teammate/Agent 1: calls council(provider="openai", ...)
-> Teammate/Agent 2: calls council(provider="gemini", ...)
Results arrive independently as each provider responds.
One Agent Per Provider
When consulting multiple providers, always use separate agents -- one per provider. This ensures the faster provider's results arrive immediately without waiting for the slower one.
Session Management
Sessions persist across calls within a conversation:
# First call creates the session
council(session="arch-review", message="Review this design...", provider="openai")
# Follow-up uses the same session (conversation continues)
council(session="arch-review", message="What about error handling?")
# Inject context without an LLM call
council_inject(session="arch-review", content="<file contents>", label="schema.sql")
# Clean up when done
council_delete(session="arch-review")
Web Research
# Stateless web-grounded research
council_research(query="What are the latest MCP server best practices?", provider="gemini")
Configuration
Environment Variables
| Variable | Required | Description |
|---|---|---|
OPENAI_API_KEY |
For OpenAI provider | OpenAI API key |
GEMINI_API_KEY |
For Gemini provider | Google AI Studio API key |
LLM_COUNCIL_DATA_DIR |
No | Data directory (default: ~/.local/share/llm-council-mcp/) |
LLM_COUNCIL_LOG_DIR |
No | Log directory (default: $LLM_COUNCIL_DATA_DIR/logs/) |
You only need the API key for the provider(s) you use. If you only use Gemini, you don't need an OpenAI key (and vice versa). The key is loaded lazily when the provider is first called.
Default System Prompt
Create $LLM_COUNCIL_DATA_DIR/config.json to set a default system prompt applied to all sessions:
{
"default_system_prompt": "You are a senior software architect. Be concise."
}
Custom system prompts passed to council() are appended after the default.
Error Handling
Council tools return errors as MCP CallToolResult with isError: true instead of throwing exceptions. This ensures the calling agent always gets a parseable result it can relay to the user.
Error responses include structured fields:
{
"error": true,
"provider": "gemini",
"model": "gemini-3.1-pro-preview",
"error_type": "RuntimeError",
"phase": "headers",
"retryable": true,
"http_status": 503,
"message": "Gemini API error 503: This model is currently experiencing high demand."
}
| Field | Description |
|---|---|
error_type |
Exception class name (RuntimeError, timeout, etc.) |
phase |
Where the failure occurred: connect, headers, stream, timeout, unknown |
retryable |
Whether the error is transient and safe to retry |
http_status |
HTTP status code if applicable (429, 503, etc.) |
Streaming
Both providers use streaming HTTP (SSE) internally. This means:
- Instant error detection: HTTP errors (503, 429) surface immediately from response headers instead of hanging until a timeout fires.
- No arbitrary timeouts: The connection stays alive as long as the provider is generating tokens. No risk of cutting off legitimate long responses.
- Mid-stream resilience: If a connection drops after partial data, the error is reported with context about how much data was received.
Cost Tracking
Every council call returns usage stats including estimated cost:
{
"provider": "openai",
"model": "gpt-5.4",
"session": "review-gpt",
"response": "...",
"usage": {
"input_tokens": 1250,
"output_tokens": 890,
"reasoning_tokens": 2048,
"cost_usd": 0.021
}
}
Session-level cost tracking is available via council_sessions.
Development
# Clone and install
git clone https://github.com/Envious-Labs-LLC/llm-council-mcp.git
cd llm-council-mcp
uv sync
# Run directly
uv run python -m llm_council_mcp
# Test with MCP Inspector
npx @modelcontextprotocol/inspector uv run python -m llm_council_mcp
Adding a New Provider
- Create
src/llm_council_mcp/providers/yourprovider.pyimplementingLLMProvider - Add pricing to
pricing.py - Add model profiles to
model_profiles.py - Register in
providers/__init__.py
License
MIT — see LICENSE.
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.