mcp-bigquery-evals

A BigQuery MCP server with mandatory cost guardrails that dry-run every query before execution, and a measurable accuracy badge from an eval harness.

The BigQuery MCP server with mandatory cost guardrails and a measurable accuracy number.

uvx mcp-bigquery-evals · works with any MCP-compatible client · v0.1.0

Why use this over the other BigQuery MCPs

	Most BQ MCPs	`mcp-bigquery-evals`
Cost guardrails	none	mandatory dry-run before every query, refuses if over cap
Quality signal	"trust me"	live accuracy badge, recomputed every release
Write operations	usually enabled	disabled by design (read-only)
Errors when things break	raw API exceptions	7 stable error codes an agent can switch on
Local dev without GCP	impossible	in-memory sqlite-backed fake ships in the box

What ships in the box

7 read-only MCP tools for warehouse discovery and querying
Mandatory dry-run cost cap on every run_query (default 100 MB scanned, about $0.0005 per query)
Result-set-equivalence eval harness (Spider/BIRD methodology) with a live accuracy badge in this README
Structured BigQuery errors with 7 stable codes (invalid_sql, table_not_found, permission_denied, unauthenticated, rate_limited, query_timeout, unknown)
Two BigQueryClient implementations: RealBigQueryClient (production, wraps google-cloud-bigquery) and FakeBigQueryClient (in-memory, sqlite-backed, for dev and CI without GCP credentials)
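The two-client split in the last item can be illustrated with a minimal sketch. The names `QueryClient`, `SqliteFakeClient`, and `count_rows` are hypothetical and are not the project's actual classes or method signatures; the sketch only shows how calling code can depend on a small interface so that an in-memory SQLite stand-in can replace the real client in dev and CI. SQLite's SQL dialect differs from BigQuery's, so a fake like this only covers queries both engines accept.

```python
import sqlite3
from typing import Protocol


class QueryClient(Protocol):
    """Hypothetical interface; the project's real BigQueryClient may differ."""

    def run_query(self, sql: str) -> list[tuple]: ...


class SqliteFakeClient:
    """In-memory stand-in that needs no GCP credentials."""

    def __init__(self, setup_sql: str = "") -> None:
        self._conn = sqlite3.connect(":memory:")
        # Create and fill fixture tables before any query runs.
        self._conn.executescript(setup_sql)

    def run_query(self, sql: str) -> list[tuple]:
        return self._conn.execute(sql).fetchall()


def count_rows(client: QueryClient, table: str) -> int:
    # Depends only on the interface, so either client can be passed in.
    return client.run_query(f"SELECT COUNT(*) FROM {table}")[0][0]


fake = SqliteFakeClient("CREATE TABLE t (id INTEGER); INSERT INTO t VALUES (1), (2);")
print(count_rows(fake, "t"))
```
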

Quickstart (5 minutes)

1. Install

uvx mcp-bigquery-evals --help

First run takes about 30s while uv fetches dependencies; subsequent runs are instant from the local cache. Plain pip install mcp-bigquery-evals also works.

2. Authenticate to GCP

gcloud auth application-default login

3. Wire into your MCP client

Open your MCP client's server config (developer settings) and add:

{
  "mcpServers": {
    "bigquery": {
      "command": "uvx",
      "args": ["mcp-bigquery-evals", "serve"],
      "env": {
        "BIGQUERY_PROJECT": "YOUR_GCP_PROJECT_ID_HERE"
      }
    }
  }
}

Restart your client. The MCP indicator should show "bigquery" with 7 tools.

4. Try it

Using the bigquery tool, find the top 5 most-viewed Stack Overflow questions tagged 'python'.

The agent chains list_datasets, list_tables, describe_table, and run_query to answer. Every run_query call is dry-run first and checked against the cost cap before it executes.

Detailed setup, troubleshooting, and the alternative pip install path live in docs/mcp_client_setup.md.

The 7 tools

Tool	Purpose
`list_datasets()`	List all datasets in your GCP project
`list_tables(dataset_id)`	List tables in a dataset
`describe_table(table_id)`	Schema, row count, size
`sample_table(table_id, n=5)`	Up to n sample rows
`search_schema(term)`	Fuzzy-match a term against all column names
`estimate_cost(sql)`	Free dry-run; returns bytes_scanned and estimated USD
`run_query(sql, max_bytes_scanned=100MB)`	Dry-run, refuse if over cap, then execute

All tools are read-only. There are no write operations in v1 by design. See docs/architecture.md for the design rationale.
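As an illustration of what fuzzy-matching a term against column names can look like, here is a small sketch built on the standard library's `difflib`. It is not the server's implementation; the function name `search_columns`, the flat list of column names, and the 0.6 cutoff are all assumptions made for the example.

```python
from difflib import get_close_matches


def search_columns(term: str, columns: list[str], limit: int = 5,
                   cutoff: float = 0.6) -> list[str]:
    """Return up to `limit` column names that resemble `term`, best match first."""
    # Match case-insensitively but return the original spelling.
    by_lower: dict[str, str] = {}
    for name in columns:
        by_lower.setdefault(name.lower(), name)
    hits = get_close_matches(term.lower(), list(by_lower), n=limit, cutoff=cutoff)
    return [by_lower[h] for h in hits]


print(search_columns("user_id", ["UserId", "created_at", "user_name"]))
```

A lower cutoff returns more, looser matches; a higher one keeps only near-identical names.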

Cost guardrails

Every run_query call performs a free dry run before execution. If the dry-run estimate exceeds max_bytes_scanned, the call returns a structured error instead of scanning any bytes:

{
  "error": "cost_cap_exceeded",
  "would_scan": "1.4 GB",
  "cap": "100.0 MB",
  "estimated_usd": 0.007,
  "hint": "narrow your WHERE clause or pass max_bytes_scanned=1500000000 to override"
}
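A check that produces an error of this shape might look like the sketch below. It is not the project's code: `check_cost_cap` and `human_bytes` are hypothetical names, the decimal units (1 MB = 10^6 bytes) are an assumption, and the $5 per TB rate is inferred from the figures in this README (100 MB at about $0.0005), not taken from a price list. In a real server the byte estimate would come from a dry-run job, for example the `total_bytes_processed` attribute of a google-cloud-bigquery query job created with `dry_run=True`.

```python
USD_PER_TB = 5.0  # assumption: inferred from this README's own numbers


def human_bytes(n: int) -> str:
    """Format a byte count with decimal units (assumed, not confirmed)."""
    for unit, size in (("GB", 10**9), ("MB", 10**6), ("KB", 10**3)):
        if n >= size:
            return f"{n / size:.1f} {unit}"
    return f"{n} B"


def check_cost_cap(estimated_bytes: int, cap_bytes: int = 100_000_000):
    """Return None when the estimate fits the cap, else a structured error."""
    if estimated_bytes <= cap_bytes:
        return None
    return {
        "error": "cost_cap_exceeded",
        "would_scan": human_bytes(estimated_bytes),
        "cap": human_bytes(cap_bytes),
        "estimated_usd": round(estimated_bytes / 10**12 * USD_PER_TB, 4),
        "hint": "narrow your WHERE clause or raise max_bytes_scanned to override",
    }


print(check_cost_cap(1_400_000_000))
```
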

The agent reads the structured error and self-corrects: it narrows the WHERE clause, raises the cap explicitly, or picks a different table.
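On the client side, switching on the error code could be sketched as follows. The code strings come from this README; the function name `next_step`, the returned action labels, and the grouping of codes into actions are illustrative assumptions, not behavior the server prescribes.

```python
def next_step(error: dict) -> str:
    """Pick a follow-up action from a structured error (hypothetical policy)."""
    code = error.get("error", "unknown")
    if code == "cost_cap_exceeded":
        return "narrow_query"   # tighten the WHERE clause or raise the cap
    if code in {"invalid_sql", "table_not_found"}:
        return "revise_sql"     # the agent can usually fix these itself
    if code in {"rate_limited", "query_timeout"}:
        return "retry_later"    # transient; back off and try again
    if code in {"permission_denied", "unauthenticated"}:
        return "ask_user"       # needs a credentials or IAM change
    return "report_error"       # covers "unknown" and anything unexpected


print(next_step({"error": "cost_cap_exceeded"}))
```
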

Eval harness

Every release runs a result-set-equivalence eval suite against bigquery-public-data and updates the accuracy badge in this README. The methodology follows the execution-based comparison used by the Spider and BIRD academic benchmarks: execute both the gold and the predicted SQL, then compare the two result sets as multisets of rows (order-independent, with float tolerance, Decimal handling, NULL equality, NaN equality, ARRAY/STRUCT recursion, and a bool/int distinction).
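The comparison described above can be sketched in a few lines. This is a simplified illustration, not the harness itself: `canonical` and `same_result_set` are hypothetical names, float tolerance is approximated by rounding to a fixed number of digits (values that straddle a rounding boundary can still differ), and treating an int and an equal float as the same number is an assumption.

```python
import math
from collections import Counter
from decimal import Decimal


def canonical(value, ndigits: int = 6):
    """Map a cell to a hashable key so equivalent cells compare equal."""
    if value is None:
        return ("null",)                  # NULL equals NULL
    if isinstance(value, bool):           # must precede int: bool subclasses int
        return ("bool", value)
    if isinstance(value, int):
        return ("num", value)
    if isinstance(value, Decimal):
        value = float(value)              # fall through to the float branch
    if isinstance(value, float):
        if math.isnan(value):
            return ("nan",)               # NaN equals NaN
        return ("num", round(value, ndigits))
    if isinstance(value, (list, tuple)):  # ARRAY: recurse, keep element order
        return ("array", tuple(canonical(v, ndigits) for v in value))
    if isinstance(value, dict):           # STRUCT: recurse, ignore field order
        return ("struct", tuple(sorted(
            (k, canonical(v, ndigits)) for k, v in value.items())))
    return ("other", value)


def same_result_set(gold_rows, predicted_rows, ndigits: int = 6) -> bool:
    """Compare two result sets as multisets of rows, ignoring row order."""
    def as_multiset(rows):
        return Counter(tuple(canonical(c, ndigits) for c in row) for row in rows)
    return as_multiset(gold_rows) == as_multiset(predicted_rows)


gold = [(1, 0.1 + 0.2, None), (2, float("nan"), True)]
pred = [(2, float("nan"), True), (1, 0.3, None)]
print(same_result_set(gold, pred))
```

Using a `Counter` rather than a `set` keeps duplicate rows significant, so a query that returns a row twice does not match one that returns it once.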

Run locally:

mcp-bigquery-evals evals run --model <your-model-id>

Full methodology, golden-pairs YAML format, and how to add your own pairs: docs/how_evals_work.md.

Development

git clone https://github.com/Umarfarook1/mcp-bigquery-evals
cd mcp-bigquery-evals
python -m venv .venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

pytest                    # unit tests (no GCP needed; ~160 tests)
pytest -m bq              # real-BQ integration tests (needs GCP creds)
pytest -m live            # end-to-end with real model + real BQ

Contributing

Issues and PRs welcome. Highest-leverage contributions:

More verified golden NL-to-SQL pairs against bigquery-public-data
Prompt improvements with before/after eval numbers showing the accuracy badge moved
Bug reports with minimum reproductions

License

MIT, see LICENSE.
