Vox MCP
Enables MCP clients like Claude Code and Cursor to use multiple AI models (Gemini, GPT, Grok, DeepSeek, Kimi, Ollama) via a unified chat tool with conversation memory.
Multi-model AI gateway for MCP clients.
Why
MCP clients like Claude Code, Claude Desktop, and Cursor are locked to their host model. Vox gives them access to every other model — Gemini, GPT, Grok, DeepSeek, Kimi, or your local Ollama — through a single chat tool.
The design is deliberately minimal: prompts go to providers unmodified, responses come back unmodified. No system prompt injection. No response formatting. No behavioral directives. The only value Vox adds is routing and conversation memory — everything else is pure passthrough.
What it does
Send a prompt, optionally attach files or images, pick a model (or let the agent pick), and get back the model's raw response. Conversation threads are kept in memory and keyed by continuation_id, so multi-turn exchanges work across any provider: start a thread with Gemini, continue it with GPT. Threads are also shadow-persisted to disk as JSONL for durability and can be exported as JSON or Markdown.
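To make the threading model concrete, here is a minimal sketch of an in-memory thread store keyed by continuation_id, with a JSONL shadow copy and a Markdown export. It is not Vox's actual implementation: the ThreadStore class, its method names, the turn fields, and the one-file-per-thread layout are all assumptions made for illustration.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import Optional


class ThreadStore:
    """Hypothetical thread store: in-memory dict plus a JSONL shadow copy."""

    def __init__(self, log_dir: Path) -> None:
        self.log_dir = log_dir
        self.threads = {}  # continuation_id -> list of turn dicts

    def add_turn(self, role: str, content: str, model: str,
                 continuation_id: Optional[str] = None) -> str:
        # A missing continuation_id starts a new thread.
        cid = continuation_id or str(uuid.uuid4())
        turn = {"role": role, "content": content, "model": model}
        self.threads.setdefault(cid, []).append(turn)
        # Shadow-persist: append one JSON object per line.
        with open(self.log_dir / f"{cid}.jsonl", "a", encoding="utf-8") as fh:
            fh.write(json.dumps(turn) + "\n")
        return cid

    def to_markdown(self, cid: str) -> str:
        # Render one thread as a simple Markdown transcript.
        lines = [f"# Thread {cid}", ""]
        for turn in self.threads[cid]:
            lines += [f"## {turn['role']} ({turn['model']})", "", turn["content"], ""]
        return "\n".join(lines)


store = ThreadStore(Path(tempfile.mkdtemp()))
cid = store.add_turn("user", "Summarize this design.", "gemini-2.5-pro")
store.add_turn("assistant", "It is a passthrough gateway.", "gemini-2.5-pro", cid)
store.add_turn("user", "Any risks?", "gpt-5", cid)  # same thread, different provider
print(store.to_markdown(cid))
```

Keying every turn by the same identifier is what lets a thread outlive a provider switch: the store never needs to know which model produced the previous turn.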
3 tools:
| Tool | Description |
|---|---|
| chat | Send prompts to any configured AI model with optional file/image context |
| listmodels | Show available models, aliases, and capabilities |
| dump_threads | Export conversation threads as JSON or Markdown |
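For readers curious what a client sends when it invokes the chat tool, the sketch below builds an MCP tools/call JSON-RPC request in Python. The tools/call method and envelope come from the MCP specification; the argument names prompt and model are assumptions (only continuation_id is named in this README), and build_chat_call is a hypothetical helper. Check the tool's published input schema before relying on any of these names.

```python
import json
from typing import Optional


def build_chat_call(prompt: str, model: str = "auto",
                    continuation_id: Optional[str] = None,
                    request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for the `chat` tool.

    `prompt` and `model` are assumed argument names, not confirmed ones.
    """
    arguments = {"prompt": prompt, "model": model}
    if continuation_id is not None:
        # Reuse an earlier thread instead of starting a new one.
        arguments["continuation_id"] = continuation_id
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "chat", "arguments": arguments},
    }


request = build_chat_call("Review this function for bugs.", model="gemini-2.5-pro")
print(json.dumps(request, indent=2))
```

In practice the MCP client builds this envelope for you; the sketch only shows the shape of the data.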
8 providers:
| Provider | Env Variable | Example Models |
|---|---|---|
| Google Gemini | GEMINI_API_KEY | gemini-2.5-pro |
| OpenAI | OPENAI_API_KEY | gpt-5.1, gpt-5, o3, o4-mini |
| Anthropic | ANTHROPIC_API_KEY | claude-4-opus, claude-4-sonnet |
| xAI | XAI_API_KEY | grok-3, grok-3-fast |
| DeepSeek | DEEPSEEK_API_KEY | deepseek-v4-pro |
| Moonshot (Kimi) | MOONSHOT_API_KEY | kimi-k2.6 |
| OpenRouter | OPENROUTER_API_KEY | Any OpenRouter model |
| Custom | CUSTOM_API_URL | Ollama, vLLM, LM Studio, etc. |
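As a quick way to see which providers a given environment would enable, the sketch below checks the variables from the table above. The variable names come from this README; the configured_providers helper and the rule that a blank value counts as unset are assumptions, and Vox's own detection logic may differ.

```python
import os

# Provider name -> environment variable, copied from the providers table.
PROVIDER_ENV_VARS = {
    "Google Gemini": "GEMINI_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "xAI": "XAI_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Moonshot (Kimi)": "MOONSHOT_API_KEY",
    "OpenRouter": "OPENROUTER_API_KEY",
    "Custom": "CUSTOM_API_URL",
}


def configured_providers(env=None):
    """Return the providers whose variable is set to a non-blank value."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items()
            if env.get(var, "").strip()]


# Blank values are treated as unset (an assumption of this sketch).
print(configured_providers({"GEMINI_API_KEY": "your-key-here", "OPENAI_API_KEY": ""}))
# → ['Google Gemini']
```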
Quick start
git clone https://github.com/linxule/vox-mcp.git
cd vox-mcp
cp .env.example .env
# Edit .env — add at least one API key
uv sync
uv run python server.py
MCP client configuration
Vox runs as a stdio MCP server. Each client needs to know how to launch it.
Replace /path/to/vox-mcp with the absolute path to your cloned repo.
Claude Code (CLI)
claude mcp add vox-mcp \
  -e GEMINI_API_KEY=your-key-here \
  -- uv run --directory /path/to/vox-mcp python server.py
Or add to .mcp.json in your project root:
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}
Claude Desktop
Add to claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}
Cursor
Add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}
Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}
Any MCP client
The canonical stdio configuration:
{
  "mcpServers": {
    "vox-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/vox-mcp", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}
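Since every client above uses the same stdio entry, it can be convenient to generate it rather than copy it by hand. The following is a small sketch, not part of Vox: stdio_config is a hypothetical helper that emits the canonical configuration and rejects relative repo paths, accepting either POSIX or Windows absolute forms.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def stdio_config(repo_path: str, env: dict) -> dict:
    """Build the stdio server entry shown above for a given repo path."""
    is_absolute = (PurePosixPath(repo_path).is_absolute()
                   or PureWindowsPath(repo_path).is_absolute())
    if not is_absolute:
        raise ValueError(f"repo path must be absolute, got {repo_path!r}")
    return {
        "mcpServers": {
            "vox-mcp": {
                "command": "uv",
                "args": ["run", "--directory", repo_path, "python", "server.py"],
                "env": dict(env),  # copy so the caller's dict is not shared
            }
        }
    }


config = stdio_config("/path/to/vox-mcp", {"GEMINI_API_KEY": "your-key-here"})
print(json.dumps(config, indent=2))
```

Paste the printed JSON into whichever client file applies, merging it with any mcpServers entries already present.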
Tips:
- Paths must be absolute
- You only need one API key to start — add more providers later via .env
- The .env file in the vox-mcp directory is loaded automatically, so API keys can go there instead of in the client config
- Use VOX_FORCE_ENV_OVERRIDE=true in .env if client-passed env vars conflict with your .env values
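The last tip implies a precedence rule between the two sources of environment values. The sketch below shows one plausible reading, in which client-passed variables win by default and .env values win when the flag is set. This is an inference from the wording above, not a description of Vox's loader, and merge_env is a hypothetical function.

```python
def merge_env(dotenv_values: dict, client_env: dict) -> dict:
    """Merge .env values with client-passed variables.

    Assumed precedence: the client wins unless .env sets
    VOX_FORCE_ENV_OVERRIDE=true, in which case .env wins.
    """
    flag = dotenv_values.get("VOX_FORCE_ENV_OVERRIDE", "")
    force = flag.strip().lower() == "true"
    if force:
        return {**client_env, **dotenv_values}
    return {**dotenv_values, **client_env}


dotenv = {"GEMINI_API_KEY": "from-dotenv"}
client = {"GEMINI_API_KEY": "from-client"}
print(merge_env(dotenv, client)["GEMINI_API_KEY"])
print(merge_env({**dotenv, "VOX_FORCE_ENV_OVERRIDE": "true"}, client)["GEMINI_API_KEY"])
```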
Configuration
Copy .env.example to .env and configure:
- API keys — at least one provider key is required
- DEFAULT_MODEL — auto (default, agent picks) or a specific model name
- Model restrictions — GOOGLE_ALLOWED_MODELS, OPENAI_ALLOWED_MODELS, etc.
- CONVERSATION_TIMEOUT_HOURS — thread TTL (default: 24h)
- MAX_CONVERSATION_TURNS — thread length limit (default: 100)
See .env.example for the full reference.
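The settings above can be pictured as a small typed structure with the documented defaults. This is an illustrative sketch only: the Settings dataclass and load_settings function are hypothetical, and the comma-separated format for the allowed-model lists is an assumption, so consult .env.example for the real syntax.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    default_model: str = "auto"             # documented default
    conversation_timeout_hours: int = 24    # documented default
    max_conversation_turns: int = 100       # documented default
    google_allowed_models: list = field(default_factory=list)


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to their defaults."""
    env = os.environ if env is None else env
    allowed = env.get("GOOGLE_ALLOWED_MODELS", "")
    return Settings(
        default_model=env.get("DEFAULT_MODEL", "auto"),
        conversation_timeout_hours=int(env.get("CONVERSATION_TIMEOUT_HOURS", "24")),
        max_conversation_turns=int(env.get("MAX_CONVERSATION_TURNS", "100")),
        # Comma-separated parsing is an assumption about the value format.
        google_allowed_models=[m.strip() for m in allowed.split(",") if m.strip()],
    )


print(load_settings({"DEFAULT_MODEL": "gemini-2.5-pro", "MAX_CONVERSATION_TURNS": "50"}))
```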
Development
uv sync
uv run python -c "import server" # smoke test
uv run pytest # run tests
See CONTRIBUTING.md for code style, project structure, and how to add providers.
License
Apache 2.0 — see LICENSE and NOTICE.
Derived from pal-mcp-server by Beehive Innovations.
<!-- mcp-name: io.github.linxule/vox-mcp -->