# @computegauge/mcp
The cost intelligence + agent credibility layer for AI agents. Install once, and every session becomes cost-aware and credibility-building.
```bash
npm install -g @computegauge/mcp
```
## Why This Exists
AI agents are powerful but expensive. A single Claude Code session can cost $5-50+. A multi-agent workflow can burn through $100 in minutes. And nobody tells the agent — or the human — until the bill arrives.
ComputeGauge MCP makes agents cost-aware AND gives them a credibility score. Any MCP-compatible agent (Claude, Cursor, Windsurf, custom agents) gets:
- Cost Intelligence — Know which model is optimal for every task, track spend in real-time
- Agent Credibility — Build a reputation score (0-1000) by making smart decisions. Compete on a leaderboard.
- Local→Cloud Routing — Detect when local inference isn't good enough, route to cloud, earn credibility for smart routing
The result: agents that spend 40-70% less, build visible credibility, and know when to route to cloud.
## Setup — 30 Seconds
### Claude Desktop / Claude Code
Add to `~/.config/claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"]
    }
  }
}
```
That's it. Restart Claude. Every conversation now has cost intelligence + credibility tracking.
### With Provider API Keys (Enhanced)
```json
{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "OPENAI_API_KEY": "sk-...",
        "COMPUTEGAUGE_BUDGET_TOTAL": "50"
      }
    }
  }
}
```
### With Local Inference (Ollama, vLLM, etc.)
```json
{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "OLLAMA_MODELS": "llama3.3:70b,qwen2.5:7b,deepseek-r1:14b",
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "COMPUTEGAUGE_BUDGET_TOTAL": "50"
      }
    }
  }
}
```
### Cursor
Add to Cursor MCP settings:
```json
{
  "computegauge": {
    "command": "npx",
    "args": ["-y", "@computegauge/mcp"]
  }
}
```
## Tools Reference
### Agent-Native Tools (use automatically every session)
| Tool | When to Call | What It Does | Credibility |
|---|---|---|---|
| `pick_model` | Before any API request | Returns the optimal model for a task | +8 Routing Intelligence |
| `log_request` | After any API request | Logs the request cost | +3 Honest Reporting |
| `session_cost` | Every 5-10 requests | Shows cumulative cost and budget | — |
| `rate_recommendation` | After completing a task | Rate how well the model performed | +5 Quality Contribution |
| `model_ratings` | When curious about quality | View model quality leaderboard | — |
| `improvement_cycle` | At session end | Run continuous improvement engine | +15 Quality Contribution |
| `integrity_report` | For transparency | View rating acceptance/rejection stats | — |
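As a sketch of what these tools look like from the agent side, here is a minimal client using the official TypeScript SDK (`@modelcontextprotocol/sdk`). The argument names mirror the call signatures shown later in this README, but the exact tool schemas are an assumption here; discover them at runtime with `listTools`.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the server the same way the config above does.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@computegauge/mcp"],
});
const client = new Client({ name: "demo-agent", version: "1.0.0" });
await client.connect(transport);

// Ask for the optimal model before making an API request (+8 credibility).
const recommendation = await client.callTool({
  name: "pick_model",
  arguments: { task_type: "code_generation", priority: "balanced" }, // assumed schema
});
console.log(recommendation.content);

// Report what the request actually cost afterwards (+3 credibility).
await client.callTool({
  name: "log_request",
  arguments: { provider: "anthropic", model: "claude-sonnet-4", tokens: 1500 }, // assumed schema
});
```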
### Credibility Tools (the reputation protocol)
| Tool | When to Call | What It Does | Credibility |
|---|---|---|---|
| `credibility_profile` | Anytime | View your 0-1000 credibility score, tier, badges | — |
| `credibility_leaderboard` | To compete | See how you rank vs other agents | — |
| `route_to_cloud` | After local→cloud routing | Report smart routing decision | +70 Cloud Routing |
| `assess_routing` | Before choosing local vs cloud | Should this task stay local? | — |
| `cluster_status` | To check local capabilities | View local endpoints, models, hardware | — |
### Intelligence Tools (for user questions)
| Tool | Description |
|---|---|
| `get_spend_summary` | User's total AI spend across all providers |
| `get_budget_status` | Budget utilization and alerts |
| `get_model_pricing` | Current pricing for any model |
| `get_cost_comparison` | Compare costs for specific workloads |
| `suggest_savings` | Actionable cost optimization recommendations |
| `get_usage_trend` | Spend trends and anomaly detection |
## Resources
| Resource | URI | Description |
|---|---|---|
| Config | `computegauge://config` | Current server configuration |
| Session | `computegauge://session` | Real-time session cost data |
| Ratings | `computegauge://ratings` | Model quality leaderboard |
| Credibility | `computegauge://credibility` | Agent credibility profile + leaderboard |
| Cluster | `computegauge://cluster` | Local inference cluster status |
| Quickstart | `computegauge://quickstart` | Agent onboarding guide |
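Resources are read over the same connection. A minimal sketch, reusing the `client` from the tools example above and assuming the resource body is returned as text:

```typescript
// Read the live session cost data exposed at computegauge://session.
const session = await client.readResource({ uri: "computegauge://session" });
console.log(session.contents[0]?.text); // assumed to be a text payload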
## Prompts
| Prompt | Description |
|---|---|
| `cost_aware_system` | System prompt that makes any agent cost-aware + credibility-building |
| `daily_cost_report` | Generate a quick daily cost report |
| `optimize_workflow` | Analyze and optimize a described AI workflow |
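Prompts are fetched the same way. For example, again reusing `client`, with the returned shape following the standard MCP prompt response:

```typescript
// Fetch the daily cost report prompt and hand its messages to your LLM.
const report = await client.getPrompt({ name: "daily_cost_report" });
console.log(report.messages);
```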
## Agent Credibility System
Every smart decision earns credibility points on a 0-1000 scale:
| Category | How to Earn | Points |
|---|---|---|
| 🧠 Routing Intelligence | Using `pick_model` wisely, avoiding over-specced models | +8 to +15 per event |
| 💰 Cost Efficiency | Staying under budget, significant savings | +5 to +30 per event |
| ✅ Task Success | Completing tasks successfully | +10 to +25 per event |
| 📊 Honest Reporting | Logging requests, reporting failures honestly | +3 to +10 per event |
| ☁️ Cloud Routing | Smart local→cloud routing via ComputeGauge | +25 to +70 per event |
| ⭐ Quality Contribution | Rating models, running improvement cycles | +5 to +15 per event |
### Credibility Tiers
| Tier | Score | What It Means |
|---|---|---|
| ⚪ Unrated | 0-99 | Just getting started |
| 🥉 Bronze | 100-299 | Learning the ropes |
| 🥈 Silver | 300-499 | Competent and cost-aware |
| 🥇 Gold | 500-699 | Skilled optimizer |
| 💎 Platinum | 700-849 | Elite decision-maker |
| 👑 Diamond | 850-1000 | Best in class |
### Earnable Badges
| Badge | How to Earn |
|---|---|
| 🌱 First Steps | Complete first session |
| 💰 Cost Optimizer | Save >$10 through smart model selection |
| 📊 Transparency Champion | Log 50+ requests accurately |
| ☁️ Smart Router | Successfully route 10+ tasks to cloud |
| ⭐ Quality Pioneer | Submit 25+ model ratings |
| 🔥 Streak Master | 20+ consecutive successful tasks |
| 🥇 Gold Agent | Reach Gold tier (500+ score) |
| 💎 Platinum Agent | Reach Platinum tier (700+ score) |
| 👑 Diamond Agent | Reach Diamond tier (850+ score) |
| 🌐 Hybrid Intelligence | Use both local and cloud models in one session |
## Local Cluster Integration
ComputeGauge auto-detects local inference endpoints:
| Platform | Environment Variable | Default |
|---|---|---|
| Ollama | `OLLAMA_HOST` | `http://localhost:11434` |
| vLLM | `VLLM_HOST` | — |
| llama.cpp | `LLAMACPP_HOST` | — |
| TGI | `TGI_HOST` | — |
| LocalAI | `LOCALAI_HOST` | — |
| Custom | `LOCAL_LLM_ENDPOINT` | — |
Set `OLLAMA_MODELS="llama3.3:70b,qwen2.5:7b"` (comma-separated) to declare available models.
### The Local→Cloud Routing Flow
1. Agent calls `assess_routing("code_generation", quality="good")`
2. ComputeGauge checks: local `llama3.3:70b` quality for `code_generation` = 80/100
3. "Good" quality threshold = 78 → local model is sufficient!
4. Agent uses the local model → saves money → earns credibility for honest assessment

OR:

1. Agent calls `assess_routing("complex_reasoning", quality="excellent")`
2. ComputeGauge checks: local `llama3.3:70b` quality for `complex_reasoning` = 78/100
3. "Excellent" quality threshold = 88 → quality gap of 10 points → route to cloud!
4. Agent calls `pick_model` → gets Claude Sonnet 4 → executes → calls `route_to_cloud`
5. Agent earns +70 credibility points for the smart routing decision
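A minimal sketch of this decision rule, using the example scores and thresholds from the walkthrough above. The function and table names are hypothetical, not the server's internals:

```typescript
type QualityTarget = "good" | "excellent";

// Thresholds taken from the walkthrough above (assumed to be fixed values).
const QUALITY_THRESHOLDS: Record<QualityTarget, number> = {
  good: 78,
  excellent: 88,
};

// Per-task quality scores for the local model, e.g. llama3.3:70b.
const LOCAL_QUALITY: Record<string, number> = {
  code_generation: 80,
  complex_reasoning: 78,
};

// Stay local when the local score meets the quality bar; otherwise route
// to cloud (and report it via route_to_cloud to earn the +70 credibility).
function assessRouting(taskType: string, target: QualityTarget) {
  const localScore = LOCAL_QUALITY[taskType] ?? 0;
  const gap = QUALITY_THRESHOLDS[target] - localScore;
  return gap <= 0 ? { route: "local", gap: 0 } : { route: "cloud", gap };
}

assessRouting("code_generation", "good");        // { route: "local", gap: 0 }
assessRouting("complex_reasoning", "excellent"); // { route: "cloud", gap: 10 }
```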
## How `pick_model` Works
The decision engine scores every model across three dimensions:
- Quality — Per-task-type scores for 14 task types
- Cost — Real pricing from 8 providers, 20+ models, calculated per-call (log-scale normalization)
- Speed — Relative inference speed scores
| Priority | Quality | Cost | Speed |
|---|---|---|---|
| `cheapest` | 20% | 70% | 10% |
| `balanced` | 45% | 35% | 20% |
| `best_quality` | 70% | 10% | 20% |
| `fastest` | 25% | 15% | 60% |
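In other words, the final score is a weighted sum of the three normalized dimensions. A hedged sketch under those assumptions follows; the weights come from the table above, but the exact log-scale cost normalization is not documented, so the constants below are illustrative:

```typescript
interface ModelInfo {
  name: string;
  quality: number;        // 0-100, for the task type in question
  costPerCallUsd: number; // estimated cost of this call
  speed: number;          // 0-100 relative inference speed
}

// Weights from the priority table above.
const WEIGHTS = {
  cheapest:     { quality: 0.20, cost: 0.70, speed: 0.10 },
  balanced:     { quality: 0.45, cost: 0.35, speed: 0.20 },
  best_quality: { quality: 0.70, cost: 0.10, speed: 0.20 },
  fastest:      { quality: 0.25, cost: 0.15, speed: 0.60 },
} as const;

function scoreModel(m: ModelInfo, priority: keyof typeof WEIGHTS): number {
  const w = WEIGHTS[priority];
  // Illustrative log-scale normalization: ~$0.0001/call maps to ~100 and
  // ~$1/call to ~0, so cheap models aren't drowned out by expensive ones.
  const costScore = Math.max(0, Math.min(100, 100 - 25 * Math.log10(m.costPerCallUsd / 0.0001)));
  return w.quality * m.quality + w.cost * costScore + w.speed * m.speed;
}

// pick_model then amounts to ranking the candidates by this score:
const best = (models: ModelInfo[], p: keyof typeof WEIGHTS) =>
  [...models].sort((a, b) => scoreModel(b, p) - scoreModel(a, p))[0];
```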
### Model Coverage
| Provider | Models | Tier Range |
|---|---|---|
| Anthropic | Claude Opus 4, Sonnet 4, Sonnet 3.5, Haiku 3.5 | Frontier → Budget |
| OpenAI | o1, GPT-4o, o3-mini, GPT-4o-mini | Frontier → Budget |
| Google | Gemini 2.0 Pro, 1.5 Pro, 2.0 Flash | Premium → Budget |
| DeepSeek | Reasoner, Chat | Value → Budget |
| Groq | Llama 3.3 70B, Llama 3.1 8B | Value → Budget |
| Together | Llama 3.3 70B Turbo, Qwen 2.5 72B | Value |
| Mistral | Large, Small | Premium → Budget |
### Local Models Supported
| Model | Quality (general) | Best For |
|---|---|---|
| llama3.3:70b | 79/100 | General tasks, code |
| qwen2.5:72b | 81/100 | Code, math, translation |
| deepseek-r1:70b | 80/100 | Reasoning, math, code |
| deepseek-r1:14b | 68/100 | Budget reasoning |
| phi3:14b | 60/100 | Simple tasks |
| llama3.1:8b | 58/100 | Classification, simple QA |
| mistral:7b | 58/100 | Simple tasks |
## Environment Variables
| Variable | Required | Description |
|---|---|---|
| `COMPUTEGAUGE_DASHBOARD_URL` | No | URL of ComputeGauge dashboard |
| `COMPUTEGAUGE_API_KEY` | No | API key for dashboard access |
| `COMPUTEGAUGE_BUDGET_TOTAL` | No | Session budget limit in USD |
| `COMPUTEGAUGE_BUDGET_ANTHROPIC` | No | Per-provider monthly budget |
| `COMPUTEGAUGE_BUDGET_OPENAI` | No | Per-provider monthly budget |
| `ANTHROPIC_API_KEY` | No | Enables Anthropic provider detection |
| `OPENAI_API_KEY` | No | Enables OpenAI provider detection |
| `GOOGLE_API_KEY` | No | Enables Google provider detection |
| `OLLAMA_HOST` | No | Ollama inference endpoint |
| `OLLAMA_MODELS` | No | Comma-separated local model names |
| `VLLM_HOST` | No | vLLM inference endpoint |
| `COMPUTEGAUGE_GPU` | No | GPU name for hardware detection |
| `COMPUTEGAUGE_VRAM_GB` | No | VRAM in GB |
| `COMPUTEGAUGE_COST_PER_HOUR` | No | Amortized hardware cost per hour |
## For Agent Developers
If you're building AI agents (via Claude Agent SDK, LangChain, CrewAI, AutoGen, etc.), ComputeGauge MCP is the easiest way to add cost awareness AND agent credibility:
- Zero integration effort — Just add the MCP server to your agent's config
- No code changes — The agent discovers all 18 tools automatically via the MCP protocol
- Immediate value — `pick_model` returns recommendations on the first call, and credibility tracking starts automatically
- Session tracking built-in — Full cost visibility per agent run
- Credibility system — Your agent earns a visible reputation score that users can see
- Local cluster support — Auto-detect and leverage on-prem inference
- Budget guardrails — Warnings when approaching limits
### Pattern: Cost-Aware + Credibility-Building Agent Loop
1. Agent receives task
2. Agent calls `assess_routing(task_type)` → local or cloud?
3. Agent calls `pick_model(task_type, priority="balanced")`
4. Agent uses the recommended model for the task
5. Agent calls `log_request(provider, model, tokens)`
6. Agent calls `rate_recommendation(model, rating, success)`
7. If cloud-routed: agent calls `route_to_cloud(task_type, reason, model)`
8. Every 5 requests, agent calls `session_cost()`
9. If session cost > 80% of budget, switch to `priority="cheapest"`
10. At session end: check `credibility_profile()`
This pattern reduces costs by 40-70% while building a credibility score that makes users trust the agent more.
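A compact sketch of that loop as agent code, using the MCP TypeScript SDK. The argument shapes follow the names in the steps above but are assumptions about the actual schemas:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";

async function runCostAwareTask(client: Client, taskType: string) {
  const call = (name: string, args: Record<string, unknown> = {}) =>
    client.callTool({ name, arguments: args });

  // Steps 2-3: route the task, then pick a model.
  await call("assess_routing", { task_type: taskType });
  await call("pick_model", { task_type: taskType, priority: "balanced" });

  // Step 4: ...execute the task against the recommended model here...

  // Steps 5-6: log the real cost, then rate the recommendation.
  await call("log_request", { provider: "anthropic", model: "claude-sonnet-4", tokens: 1500 });
  await call("rate_recommendation", { model: "claude-sonnet-4", rating: 5, success: true });

  // Steps 8 and 10: periodic cost check, end-of-session credibility check.
  await call("session_cost");
  await call("credibility_profile");
}
```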
## License
Apache-2.0 — Free to use, modify, and distribute.