MCP DataBridge

MCP DataBridge

Enables AI agents to interact with the Titanic passenger database via MCP, providing tools for querying, analysis, and exploration.

Category
Visit Server

README

MCP DataBridge

Production-ready MCP server that enables AI agents to interact with the Titanic passenger database via the Model Context Protocol.

Architecture

┌──────────────┐         MCP Protocol         ┌──────────────────────────────┐
│   AI Agent   │◄────── stdio / HTTP ────────►│       MCP DataBridge         │
│ (Claude, etc)│                              │                              │
└──────────────┘                              │  Tools ─── query_passengers  │
                                              │          ├ get_passenger     │
                                              │          ├ aggregate_stats   │
                                              │          ├ survival_analysis │
                                              │          ├ describe_column   │
                                              │          ├ list_tables       │
                                              │          └ run_sql (sandbox) │
                                              │                              │
                                              │  Resources ─ info, sample,   │
                                              │              stats/{column}  │
                                              │                              │
                                              │  Prompts ── explore_dataset  │
                                              │           ├ survival_analysis│
                                              │           └ data_quality     │
                                              │                              │
                                              │  ┌────────────────────────┐  │
                                              │  │  SQLite (8 tables)     │  │
                                              │  │  891 passengers        │  │
                                              │  │  Normalized + JOINs    │  │
                                              │  └────────────────────────┘  │
                                              └──────────────────────────────┘

Database: 891 passengers across 8 normalized tables (Observation + 7 lookup tables). All tools resolve foreign keys and return human-readable labels — agents never see raw IDs.

Quick Start

# Install
pip install -e ".[dev]"

# Run (stdio transport — for MCP clients like Claude Desktop)
python -m mcp_databridge

# Run with HTTP transport (for remote/Docker access)
DATABRIDGE_TRANSPORT=streamable-http python -m mcp_databridge

# Open MCP Inspector (interactive web UI for testing tools/resources/prompts)
npx @modelcontextprotocol/inspector --config inspector-config.json --server databridge

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) and add the mcpServers block:

{
  "mcpServers": {
    "databridge": {
      "command": "/full/path/to/python",
      "args": ["-m", "mcp_databridge"],
      "cwd": "/path/to/mcp_databridge",
      "env": {
        "DATABRIDGE_DB_PATH": "/path/to/mcp_databridge/data/titanic.db"
      }
    }
  }
}

Important: Use the full Python path (run which python to find it). Claude Desktop does not inherit your shell's PATH, so bare python won't be found. The DATABRIDGE_DB_PATH env var ensures the database is found regardless of working directory.

Then quit Claude Desktop (Cmd+Q) and reopen it. The server should appear under Connectors.

VS Code

Add to your VS Code MCP settings (.vscode/mcp.json or user settings):

{
  "mcpServers": {
    "databridge": {
      "command": "python",
      "args": ["-m", "mcp_databridge"],
      "cwd": "/path/to/mcp_databridge"
    }
  }
}

Docker

# HTTP transport (accessible at http://localhost:8000/mcp)
docker compose up -d

# stdio transport (pipe directly to MCP client)
docker build -t mcp-databridge .
docker run -i mcp-databridge

To connect Claude Desktop to the Docker container, edit ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "databridge": {
      "url": "http://localhost:8000/mcp"
    }
  }
}

Then quit Claude Desktop (Cmd+Q) and reopen it. View server logs with docker logs -f databridge.

Tools

Tool Description Key Parameters
query_passengers Filter and retrieve passengers with resolved labels filters, columns, limit, offset
get_passenger Get a single passenger by row number (1–891) row_number
aggregate_stats Group-by aggregations (count/avg/sum/min/max) group_by, metric, column
get_survival_analysis Survival rates by class, sex, age group, deck, etc. dimension
describe_column Statistical summary for any column column
list_tables Show all tables and their schemas
run_sql Execute read-only SQL (SELECT only, sandboxed) query

Filter Syntax

Filters use human-readable labels. The server resolves them to foreign key JOINs internally:

{
  "sex": "female",
  "pclass": 1,
  "age_min": 20,
  "age_max": 40,
  "survived": true,
  "embarked": "S",
  "who": "woman",
  "deck": "B",
  "alone": false
}

Use "missing" to filter for unknown/NULL values in categorical columns:

{"deck": "missing"}
{"embarked": "missing"}
{"embark_town": "missing"}

Resources

URI Description
databridge://info Schema, row counts, missing values, table relationships
databridge://sample First 5 rows with resolved labels
databridge://stats/{column} Statistical summary for a column (numeric or categorical)

Prompts

Prompt Description
explore_dataset Guided exploration — schema overview, suggested starting queries
survival_analysis Step-by-step survival analysis across multiple dimensions
data_quality_report Missing values, distributions, data quality findings

Example Agent Interaction

User: "What was the survival rate for women vs men?"

Agent calls get_survival_analysis(dimension="sex"):

{
  "dimension": "sex",
  "results": [
    {"sex": "female", "survived_count": 233, "total_count": 314, "survival_rate_pct": 74.2},
    {"sex": "male", "survived_count": 109, "total_count": 577, "survival_rate_pct": 18.89}
  ]
}

User: "Average fare by passenger class?"

Agent calls aggregate_stats(group_by="class", metric="avg", column="fare"):

{
  "results": [
    {"class": "First", "avg_fare": 84.15},
    {"class": "Second", "avg_fare": 20.66},
    {"class": "Third", "avg_fare": 13.68}
  ],
  "count": 3
}

User: "Show me first-class female passengers"

Agent calls query_passengers(filters={"sex": "female", "pclass": 1}):

{
  "rows": [
    {
      "row_number": 2, "survived": 1, "pclass": 1, "age": 38.0,
      "sex": "female", "class": "First", "who": "woman",
      "deck": "C", "embark_town": "Cherbourg", "fare": 71.28, "alive": "yes"
    }
  ],
  "count": 94
}

All responses use human-readable labels (e.g., "female", "First", "Cherbourg") — the normalized schema is fully abstracted from the agent.

Configuration

All settings via environment variables (12-factor compliant):

Variable Default Description
DATABRIDGE_DB_PATH ./data/titanic.db Path to SQLite database
DATABRIDGE_LOG_LEVEL INFO Logging level
DATABRIDGE_MAX_RESULTS 200 Max rows per query
DATABRIDGE_TRANSPORT stdio Transport: stdio | streamable-http
DATABRIDGE_HOST 0.0.0.0 HTTP host (streamable-http only)
DATABRIDGE_PORT 8000 HTTP port (streamable-http only)
DATABRIDGE_QUERY_TIMEOUT 30 Query timeout in seconds

Security

  • Read-only SQL: run_sql only allows SELECT — DDL/DML keywords (DROP, INSERT, UPDATE, DELETE, ALTER, CREATE, ATTACH, DETACH, PRAGMA) are blocked
  • Parameterized queries: All built-in tools use parameterized queries to prevent SQL injection
  • Input validation: All tool parameters validated via Pydantic models with constrained types
  • Result size limits: Max 200 rows per query (configurable)
  • Non-root container: Docker runs as unprivileged appuser
  • Multi-statement blocking: Semicolons in run_sql queries are rejected

Testing

# Run all tests (119 tests, 95% coverage)
pytest --cov=mcp_databridge --cov-report=term-missing -v

# Lint + format check
ruff check src/ tests/
ruff format --check src/ tests/

# Type check
mypy src/

# Interactive MCP Inspector (pre-configured command, args, env vars)
npx @modelcontextprotocol/inspector --config inspector-config.json --server databridge

Test suite includes:

  • Unit tests for all 7 tools, 3 resources, 3 prompts
  • Database layer tests (connection management, query helpers, SQL sandbox)
  • Pydantic model validation tests
  • Full MCP protocol integration test (spawns server via stdio, performs JSON-RPC handshake, tests all endpoints)
  • Structured logging tests (correlation IDs, log configuration)

Project Structure

mcp_databridge/
├── pyproject.toml              # Dependencies, tool config (ruff, mypy, pytest)
├── Dockerfile                  # Production container (non-root, slim)
├── docker-compose.yml          # HTTP transport deployment
├── inspector-config.json       # Pre-configured MCP Inspector (command, args, env vars)
├── .github/workflows/ci.yml    # CI: lint → type-check → test (3.11-3.13) → docker build
├── .env.example                # Configuration template
├── data/
│   └── titanic.db              # Pre-built SQLite database (committed)
├── src/mcp_databridge/
│   ├── __main__.py             # Entry point: python -m mcp_databridge
│   ├── server.py               # FastMCP server — registers tools, resources, prompts
│   ├── database.py             # SQLite connection, resolved-view JOINs, SQL sandbox
│   ├── models.py               # Pydantic models for parameter validation
│   ├── config.py               # pydantic-settings (env vars with DATABRIDGE_ prefix)
│   ├── logging.py              # structlog JSON logging with correlation IDs
│   ├── resources.py            # MCP resources (info, sample, stats)
│   ├── prompts.py              # MCP prompts (explore, survival, quality)
│   └── tools/
│       ├── query.py            # query_passengers, get_passenger, list_tables
│       ├── analytics.py        # aggregate_stats, get_survival_analysis, describe_column
│       └── sql.py              # run_sql (sandboxed, SELECT-only)
└── tests/
    ├── conftest.py             # Shared fixtures (test DB copy)
    ├── test_tools/             # Unit tests for each tool module
    ├── test_database.py        # Database layer tests
    ├── test_resources.py       # Resource endpoint tests
    ├── test_prompts.py         # Prompt content tests
    ├── test_models.py          # Pydantic validation tests
    ├── test_logging.py         # Logging configuration tests
    ├── test_server.py          # Server wiring tests
    ├── test_integration.py     # Integration tests
    └── test_mcp_protocol.py    # Full MCP protocol round-trip via stdio

Tech Stack

Component Choice Why
Language Python 3.11+ Challenge requirement
MCP SDK FastMCP (mcp v1.26+) Official Anthropic SDK, decorator-based registration
Database SQLite (stdlib sqlite3) Zero-infra, pre-built database, WAL mode for concurrent reads
Validation Pydantic v2 Type safety, constrained types, serialization
Config pydantic-settings 12-factor env var management with type coercion
Logging structlog Structured JSON logging, correlation IDs
Testing pytest + pytest-asyncio 119 tests, 95% coverage, MCP protocol integration
Linting Ruff Fast, replaces flake8 + isort + pyupgrade
Type Check mypy (strict mode) Static analysis, catches bugs before runtime
Container Docker (slim) Non-root, minimal image, stdio + HTTP transport
CI/CD GitHub Actions lint → type-check → test (3.11/3.12/3.13) → docker build

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured