ai-mcp-server
A local MCP bridge that registers multiple AI API endpoints, enabling agents to automatically discover and route to models based on capabilities like chat, vision, reasoning, embedding, image generation, TTS, STT, and rerank.
Local MCP bridge: register multiple (api_key, base_url) pairs once, and let your Agent automatically discover and route to any model with the right capability (chat, vision, reasoning, embedding, image_gen, tts, stt, rerank).
Three entry points:
- ai-mcp: CLI (manage endpoints, query models, trigger probes, init wizard)
- ai-mcp-server: MCP stdio server, launched by Claude Desktop / Cursor / Cline / Trae
- ai-mcp ui: local Web management dashboard (FastAPI + Jinja2, bound to 127.0.0.1)
Install
Option 1: uv (recommended)
uv tool install ai-mcp-server
Option 2: Homebrew
brew install brianMacao/tap/ai-mcp-server
Option 3: npm / npx
npx ai-mcp-server # auto-installs uv + Python package
Option 4: pip
pip install ai-mcp-server
Quickstart
# Interactive first-run wizard
ai-mcp init
# Or step by step:
ai-mcp endpoint add --name openrouter --base-url https://openrouter.ai/api/v1 --key sk-...
ai-mcp endpoint probe openrouter
ai-mcp model list --capability vision
# Start the Web UI
ai-mcp ui
# → http://127.0.0.1:8765/
# Start the MCP server (for Claude Desktop, Cursor, etc.)
ai-mcp-server
MCP Tools
ai-mcp-server exposes 6 MCP tools:
- usage_guide: dynamic inventory, capability distribution, and routing guidance.
- list_models: filter models by capability, context length, endpoint, and probe state.
- invoke_model: pass through chat / embedding / image_gen / tts / stt / rerank calls; TTS audio is returned as audio_base64 inside the JSON body.
- model_performance: inspect recent per-model call counts, success rate, and latency.
- refresh_endpoint: refresh model lists and enqueue asynchronous capability probes.
- add_models: manually register models for endpoints without /v1/models, or let an Agent register user-confirmed model features.
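Because invoke_model returns TTS audio as an audio_base64 field inside the JSON body, a client has to decode it before writing or playing the audio. The following is a minimal sketch of that step. It assumes only the audio_base64 key named above; the helper name extract_tts_audio and the sample payload are hypothetical, and the rest of the response shape is not documented here.

```python
import base64
import json


def extract_tts_audio(response_json: str) -> bytes:
    """Decode the base64-encoded audio carried in an invoke_model JSON result."""
    body = json.loads(response_json)
    # "audio_base64" is the field named in the tool description;
    # no other keys of the body are assumed.
    encoded = body["audio_base64"]
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(encoded, validate=True)


# Illustrative payload only: these bytes are a stand-in, not real audio.
sample = json.dumps(
    {"audio_base64": base64.b64encode(b"RIFF-demo").decode("ascii")}
)
audio = extract_tts_audio(sample)
```

The decoded bytes can then be written to a file in binary mode; the container format depends on the upstream TTS model and is not specified here.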
Model Feature Registration
Capabilities use canonical names such as text_chat, vision, audio_tts,
audio_stt, embedding, and rerank. Common aliases including tts, stt,
and asr are accepted by manual registration flows and normalized internally.
Static recognition includes these known model ids:
- seed-tts-2.0 → audio_tts
- volc.seedasr.sauc.duration → audio_stt
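The alias normalization and static recognition described above can be pictured as two dictionary lookups. This is a sketch, not the project's actual code: the names ALIASES, KNOWN_MODELS, normalize_capability, and recognize are hypothetical, and mapping asr to audio_stt is an assumption inferred from the seedasr example.

```python
from typing import Optional

# Aliases named in this README. Mapping "asr" to "audio_stt" is an assumption.
ALIASES = {"tts": "audio_tts", "stt": "audio_stt", "asr": "audio_stt"}

# The two statically recognized model ids listed in this README.
KNOWN_MODELS = {
    "seed-tts-2.0": "audio_tts",
    "volc.seedasr.sauc.duration": "audio_stt",
}


def normalize_capability(name: str) -> str:
    """Map an alias to its canonical name; pass other names through unchanged."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def recognize(model_id: str) -> Optional[str]:
    """Return the statically known capability for a model id, if any."""
    return KNOWN_MODELS.get(model_id)
```

Unknown names are passed through rather than rejected because the canonical list above is given as examples, not as an exhaustive set.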
Register model features from the CLI:
ai-mcp model add --endpoint volc seed-tts-2.0 --capability audio_tts
ai-mcp model add --endpoint volc volc.seedasr.sauc.duration --features asr=true
ai-mcp model add --endpoint volc custom-model --features text_chat=true,context_length=32000
ai-mcp model override volc custom-model --capability vision=false
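The --features values in the commands above are comma-separated key=value pairs that mix booleans and integers. As an illustration of how such a string could be turned into typed values, here is a small sketch; parse_features is a hypothetical helper, and the real CLI may parse and validate differently.

```python
def parse_features(spec: str) -> dict:
    """Parse 'text_chat=true,context_length=32000' into typed values."""
    features = {}
    for item in spec.split(","):
        key, sep, raw = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        value = raw.strip().lower()
        if value in ("true", "false"):
            # Boolean capability flag.
            features[key.strip()] = value == "true"
        else:
            # Anything else is treated as an integer, e.g. context_length;
            # int() raises ValueError for non-numeric input.
            features[key.strip()] = int(value)
    return features


parsed = parse_features("text_chat=true,context_length=32000")
```

Alias handling (for example asr) is left out of this sketch so the parsing step stays separate from name normalization.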
Register features from the Web UI:
ai-mcp ui
# Open http://127.0.0.1:8765/
# Use Models -> manual add, or Overrides -> add/update feature override.
Register features from an MCP client / Agent:
- Call usage_guide.
- Use add_models with capabilities for true capability flags.
- Use feature_overrides for explicit boolean or context-length overrides.
Example MCP arguments:
{
"endpoint": "volc",
"model_ids": ["seed-tts-2.0"],
"feature_overrides": {
"audio_tts": true,
"context_length": 32000
}
}
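For readers wiring this up by hand, the arguments above travel inside a JSON-RPC 2.0 tools/call request, which is how MCP clients invoke tools. The sketch below only builds that message; build_tool_call is a hypothetical helper, and a real session also needs the MCP initialize handshake before any tool call, which most client libraries handle for you.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single-line JSON-RPC message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no newlines by default, which suits the
    # newline-delimited framing used on the stdio transport.
    return json.dumps(message)


request = build_tool_call(
    1,
    "add_models",
    {
        "endpoint": "volc",
        "model_ids": ["seed-tts-2.0"],
        "feature_overrides": {"audio_tts": True, "context_length": 32000},
    },
)
```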
Claude Desktop / Trae / Codex Configuration
ai-mcp init auto-detects installed MCP clients and configures them.
Manual configuration
Claude Desktop (claude_desktop_config.json):
{
"mcpServers": {
"ai-mcp": {
"command": "uv",
"args": ["run", "--from", "ai-mcp-server", "ai-mcp-server"]
}
}
}
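If the config file already lists other MCP servers, the ai-mcp entry should be merged in rather than pasted over the whole file. A minimal sketch of that merge follows; merge_server and SERVER_ENTRY are hypothetical names, and locating the config file on disk is left out because its path varies by platform.

```python
import json

# The server entry shown above.
SERVER_ENTRY = {
    "command": "uv",
    "args": ["run", "--from", "ai-mcp-server", "ai-mcp-server"],
}


def merge_server(config_text: str, name: str = "ai-mcp") -> str:
    """Add or update one server entry while keeping existing ones."""
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = dict(SERVER_ENTRY)
    return json.dumps(config, indent=2)


existing = '{"mcpServers": {"other": {"command": "example"}}}'
merged = merge_server(existing)
```

Back up the original file before writing the merged text, since a malformed config can prevent the client from loading any server.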
Trae / Trae CN (project root .mcp.json):
{
"mcpServers": {
"ai-mcp": {
"command": "uv",
"args": ["run", "--from", "ai-mcp-server", "ai-mcp-server"],
"transport": "stdio"
}
}
}
Codex Desktop (~/.codex/config.toml):
[mcp_servers.ai-mcp]
command = "uv"
args = ["run", "--from", "ai-mcp-server", "ai-mcp-server"]
Environment Variables
| Variable | Purpose | Default |
|---|---|---|
| AI_MCP_CONFIG_DIR | Override data/config directory | ~/.ai-mcp-server |
| AI_MCP_DB_PATH | SQLite database path | $AI_MCP_CONFIG_DIR/db.sqlite3 |
| AI_MCP_MASTER_KEY | Fernet master key for api_key encryption | auto-generated → system keyring |
| AI_MCP_UI_TOKEN | Access token for Web UI when exposed (--expose) | none |
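The defaults in the table chain together: the database path falls back to a file inside the config directory, which itself falls back to ~/.ai-mcp-server. The sketch below shows that resolution order under the assumption that an empty variable counts as unset; resolve_paths is a hypothetical helper, not the project's own function.

```python
import os
from pathlib import Path


def resolve_paths(env=None):
    """Return (config_dir, db_path) following the documented defaults."""
    env = os.environ if env is None else env
    # Empty or missing AI_MCP_CONFIG_DIR falls back to ~/.ai-mcp-server.
    config_dir = Path(env.get("AI_MCP_CONFIG_DIR") or Path.home() / ".ai-mcp-server")
    # Empty or missing AI_MCP_DB_PATH falls back to db.sqlite3 in config_dir.
    db_path = Path(env.get("AI_MCP_DB_PATH") or config_dir / "db.sqlite3")
    return config_dir, db_path


config_dir, db_path = resolve_paths({"AI_MCP_CONFIG_DIR": "/tmp/ai-mcp-demo"})
```

Passing a plain dict instead of os.environ makes it easy to check the fallback behavior without touching the real environment.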
Development
# Clone and set up
git clone https://github.com/brianMacao/ai-mcp-server
cd ai-mcp-server
uv sync
# Run tests
uv run pytest -q
# Verify against a real endpoint
cp .keys.example .keys # edit with your keys
source .keys
export AI_MCP_CONFIG_DIR="$(pwd)/.data"
export AI_MCP_MASTER_KEY="$(cat .data/.master_key)" # first run generates this
uv run ai-mcp endpoint add --name test --base-url "$EXAMPLE_URL" --key "$EXAMPLE_API_KEY"
uv run ai-mcp endpoint probe test --capability text_chat -y
License
MIT