MCP Gateway

Cut 83-89% of your Claude Code context window overhead from MCP tool schemas.

Every MCP server you register dumps its full JSON tool schema into your context window — every conversation, whether you use those tools or not. If you have 5 servers with 30+ tools each, that's thousands of tokens burned before you type a single character.

MCP Gateway replaces N tool schemas with 3-4 dispatch tools that proxy requests to your existing MCP servers. The underlying servers stay exactly the same. You just stop paying the token tax.

The Problem

Before (real numbers from a multi-account setup):
  Google Workspace × 3 accounts  = 142 tools × 3 = 426 tool schemas
  Telegram                       = 92 tool schemas
  Linear × 3 accounts            = 42 tools × 3  = 126 tool schemas
  ─────────────────────────────────────────────────
  Total                          = 644 tool schemas = ~57,000 tokens

After:
  Google Workspace gateway       = 3 tool schemas
  Services gateway (TG + Linear) = 4 tool schemas
  ─────────────────────────────────────────────────
  Total                          = 7 tool schemas   = ~6,200 tokens

Savings: 89% fewer tokens, every single conversation.

How It Works

Instead of registering each MCP server directly in your Claude Code config, you register a single gateway server that proxies tool calls to the underlying servers on demand.

Claude sees 3-4 generic tools (gw, gw_discover, and gw_batch for the CLI gateway; tg, linear, etc. for the persistent one) instead of hundreds of specialized ones. When Claude needs a specific tool, it calls a dispatch tool with the tool name as a parameter, and the gateway forwards the call to the right backend.
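The dispatch itself is plain routing. A minimal sketch in plain Python (no MCP SDK; the backend handler is a hypothetical stand-in for a real upstream tool):

```python
# Minimal sketch of the dispatch pattern: one generic entry point routes
# (account, tool, params) to a backend handler, instead of registering a
# separate schema for every backend tool.

def search_gmail_messages(params: dict) -> str:
    # stand-in for the real upstream tool
    return f"results for {params['query']}"

BACKENDS = {
    "work": {"search_gmail_messages": search_gmail_messages},
}

def gw(account: str, tool: str, params: dict) -> str:
    """The single dispatch tool Claude sees, instead of 142 schemas."""
    try:
        handler = BACKENDS[account][tool]
    except KeyError:
        raise ValueError(f"unknown tool {tool!r} for account {account!r}")
    return handler(params)
```

In the real gateway this function is registered as a FastMCP tool, and the handler forwards the call over stdio to the upstream server rather than running locally.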

Two patterns are included, depending on what your upstream MCP server supports:

Pattern 1: CLI Dispatch (cli_gateway.py)

For MCP servers that support a --cli mode (subprocess per call). Each tool invocation spawns a short-lived process. Good for servers like google-workspace-mcp that have built-in CLI modes.

Features:

  • Multi-account routing (one gateway, N credential sets)
  • Auto-injection of per-account parameters (e.g., user_google_email)
  • Tool-to-service mapping for faster cold starts (only loads the needed module)
  • Discovery with caching
  • Batch execution (parallel tool calls in one request)
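The core of this pattern can be sketched in a few lines: each call spawns a short-lived process and returns its stdout. The flag layout below is illustrative, not the actual google-workspace-mcp interface, and the "upstream" here is a stand-in one-liner:

```python
# Sketch of CLI dispatch: one subprocess per tool call.
import json
import subprocess
import sys

def cli_dispatch(cmd_prefix: list[str], tool: str, params: dict) -> str:
    """Run one tool call as a short-lived subprocess; return its stdout."""
    cmd = cmd_prefix + [tool, json.dumps(params)]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return out.stdout.strip()

# Stand-in "upstream": a Python one-liner that echoes its argv as JSON.
echo_upstream = [
    sys.executable, "-c",
    "import sys, json; print(json.dumps(sys.argv[1:]))",
]
result = cli_dispatch(echo_upstream, "search_gmail_messages",
                      {"query": "is:unread"})
```

The real cli_gateway.py builds cmd_prefix from config (runner, upstream_dir, account credentials) and runs the batch variant by dispatching several of these subprocesses in parallel.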

Pattern 2: Persistent MCP Client (persistent_gateway.py)

For MCP servers without CLI mode. Maintains persistent subprocess connections to upstream MCP servers, avoiding cold-start latency on every call. Uses the MCP SDK's ClientSession with AsyncExitStack for lifecycle management.

Features:

  • Lazy connection (connects on first use, not at startup)
  • Auto-reconnect on connection failure
  • Multi-service routing (multiple MCP servers behind one gateway)
  • Multi-account support per service
  • Tool discovery with caching
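The lazy-connect lifecycle looks roughly like this. connect() below is a stand-in for the MCP SDK's stdio_client/ClientSession pair; the real gateway enters both into the same AsyncExitStack so live sessions survive across calls:

```python
# Sketch of lazy connections kept alive in an AsyncExitStack.
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

opened = []  # records how many real connections were made

@asynccontextmanager
async def connect(command: str):
    opened.append(command)        # pretend to spawn the upstream server
    yield f"session:{command}"

class PersistentGateway:
    def __init__(self, services: dict):
        self.services = services  # service name -> launch command
        self.sessions = {}        # service name -> live session
        self.stack = AsyncExitStack()

    async def session(self, name: str):
        if name not in self.sessions:           # lazy: connect on first use
            self.sessions[name] = await self.stack.enter_async_context(
                connect(self.services[name]))
        return self.sessions[name]

async def demo():
    gw = PersistentGateway({"telegram": "uv run main.py"})
    first = await gw.session("telegram")
    second = await gw.session("telegram")       # reuses the live connection
    await gw.stack.aclose()                     # clean shutdown of all sessions
    return first, second, len(opened)

result = asyncio.run(demo())
```

Auto-reconnect then amounts to dropping a dead entry from sessions so the next call triggers a fresh connect.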

Quick Start

1. Install

# Clone
git clone https://github.com/block-town/mcp-gateway.git
cd mcp-gateway

# Install dependencies
pip install fastmcp pyyaml python-dotenv
# or
uv pip install fastmcp pyyaml python-dotenv

2. Configure

Copy the example configs and fill in your details:

cp config.example.yaml config.yaml
# Edit config.yaml with your accounts, paths, and credentials

3. Choose Your Pattern

CLI Dispatch (for servers with --cli mode):

cp cli_gateway.py server.py
# Edit server.py — update the tool descriptions with your account names and common tools

Persistent Client (for servers without CLI mode):

cp persistent_gateway.py server.py
# Edit server.py — update service names and tool descriptions

4. Register in Claude Code

Add to your ~/.claude/settings.json (global) or project .mcp.json:

{
  "mcpServers": {
    "gateway": {
      "command": "python3",
      "args": ["/path/to/mcp-gateway/server.py"]
    }
  }
}

Then remove the original MCP server entries that the gateway now proxies.

5. Use

# CLI gateway (replaces those 142 direct tool schemas):
gw_discover("gmail")     → see available Gmail tools + params
gw("work", "search_gmail_messages", {"query": "is:unread"})
gw_batch("work", [
  {"tool": "search_gmail_messages", "params": {"query": "is:unread"}},
  {"tool": "get_events", "params": {"time_min": "2025-01-01"}}
])

# Persistent gateway:
tg("send_message", {"chat_id": "123", "text": "hello"})
linear("work", "linear_getIssues", {"teamId": "TEAM-1"})

Configuration

CLI Gateway (config.example.yaml — accounts mode)

# Path to the upstream MCP server
upstream_dir: "/path/to/google-workspace-mcp"

# Command to run the upstream server in CLI mode
runner: "uv"

accounts:
  personal:
    client_id: "your-oauth-client-id"
    client_secret: "your-oauth-client-secret"
    credentials_dir: "/path/to/credentials/personal"
    email: "you@gmail.com"
  work:
    client_id: "your-work-oauth-client-id"
    client_secret: "your-work-oauth-client-secret"
    credentials_dir: "/path/to/credentials/work"
    email: "you@company.com"
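The auto-injection feature from Pattern 1 amounts to merging these account fields into the call before dispatch. A sketch, with account values mirroring the config above and user_google_email taken from the Features list:

```python
# Sketch of per-account parameter auto-injection: the gateway fills in
# account-specific fields so the model never handles credentials.
ACCOUNTS = {
    "personal": {"email": "you@gmail.com"},
    "work": {"email": "you@company.com"},
}

def inject_account_params(account: str, params: dict) -> dict:
    merged = dict(params)  # never mutate the caller's dict
    # the upstream expects user_google_email; fill it from the account
    merged.setdefault("user_google_email", ACCOUNTS[account]["email"])
    return merged
```

setdefault means an explicitly supplied value wins over the injected one, so a call can still override the account default.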

Persistent Gateway (config.example.yaml — services mode)

services:
  telegram:
    command: "uv"
    args: ["--directory", "/path/to/telegram-mcp", "run", "main.py"]
    env_file: "/path/to/telegram-mcp/.env"

  linear:
    command: "npx"
    args: ["-y", "@tacticlaunch/mcp-linear"]
    accounts:
      work:
        LINEAR_API_TOKEN: "lin_api_xxxxx"
      personal:
        LINEAR_API_TOKEN: "lin_api_yyyyy"

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Claude Code                                                │
│                                                             │
│  Context window sees: 3-4 tool schemas (~6K tokens)         │
│  Instead of:          644 tool schemas (~57K tokens)        │
│                                                             │
│  gw("work", "search_gmail_messages", {"query": "..."})      │
│  tg("send_message", {"chat_id": "...", "text": "..."})      │
└──────────────────┬──────────────────────────────────────────┘
                   │
         ┌─────────▼─────────┐
         │   MCP Gateway     │
         │   (FastMCP)       │
         │                   │
         │   3-4 tools that  │
         │   dispatch to     │
         │   upstream MCP    │
         │   servers         │
         └───┬─────┬─────┬──┘
             │     │     │
    ┌────────▼┐ ┌──▼───┐ ┌▼────────┐
    │ Google  │ │ Tele-│ │ Linear  │
    │ MCP     │ │ gram │ │ MCP     │
    │ (CLI)   │ │ MCP  │ │ (×N     │
    │         │ │      │ │ accts)  │
    └─────────┘ └──────┘ └─────────┘

When to Use This

Good fit:

  • You have 3+ MCP servers registered and context window pressure is real
  • Multi-account setups (same server, different credentials)
  • Servers with 30+ tools where you use maybe 5-10 regularly

Not worth it:

  • Single MCP server with <10 tools
  • You rarely hit context limits
  • The MCP server is already tiny

Token Math

Each MCP tool schema is roughly 200-800 tokens of JSON (tool name, description, parameter schema with types/descriptions/required fields). To measure your own overhead:

  1. Count your registered tools: look at your settings.json and .mcp.json files
  2. Estimate ~400 tokens per tool (the 200-800 range varies widely; the example above works out to roughly 90 tokens per schema)
  3. Multiply by every conversation you start

A gateway tool schema is ~800-900 tokens (larger description with embedded cheat-sheet), but you only have 3-4 of them instead of hundreds.
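As a sanity check, the example figures above reproduce the headline savings directly:

```python
# Reproduces the savings math from the before/after example above.
before_tokens = 57_000   # 644 direct tool schemas
after_tokens = 6_200     # 7 gateway tool schemas
savings = 1 - after_tokens / before_tokens
print(f"{savings:.0%}")  # roughly 89%
```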

Adapting to Your Stack

The two gateway files are templates. Fork and modify:

  1. Change the tool names — gw/tg/linear are just conventions. Name them whatever makes sense.
  2. Update the docstrings — The tool descriptions are the cheat-sheet Claude sees. List your most common tools and their params there.
  3. Add more services — The persistent gateway pattern works with any MCP server. Add a new service block in config, build a new client, expose new dispatch tools.
  4. Strip what you don't need — If you only have one account, remove multi-account routing. If you don't need batch, remove gw_batch.

License

MIT
