MCP Servers

MCP Athena Analytics Server

Enables secure querying of analytics data stored in AWS Athena/S3 through natural language, with multiple security layers including query validation, resource limits, and automatic PII redaction to prevent data breaches and destructive operations.

README

MCP Athena Analytics Server

Secure MCP server for querying analytics data through AWS Athena.

Overview

This MCP (Model Context Protocol) server provides Claude and other LLMs with controlled access to analytics data stored in S3/Athena. It implements multiple security layers to prevent data breaches, destructive operations, and excessive resource usage.

Use cases:

Exploratory data analysis via natural language
Ad-hoc queries without writing SQL manually
Template-based reporting with parameter validation
Data discovery (tables, schemas, available templates)

Architecture

HTTP-based deployment (recommended for production):

┌─────────────────────────────────────────────────────────────┐
│                     claude-network (Docker bridge)          │
│                                                             │
│  ┌────────────────────────┐    ┌─────────────────────────┐ │
│  │ Claude Code Container  │    │ MCP Server Container    │ │
│  │ (claude-sandbox)       │    │ (mcp-athena-server)     │ │
│  │                        │    │                         │ │
│  │ Claude Code CLI        │HTTP│ FastAPI/SSE Server      │ │
│  │   ↓                    │◄───┤   ↓                     │ │
│  │ MCP Client (HTTP)      │    │ MCP Protocol Handler    │ │
│  │                        │    │   ↓                     │ │
│  │ 🚫 NO credentials      │    │ Athena Tools            │ │
│  │                        │    │   ↓                     │ │
│  └────────────────────────┘    │ boto3 + .env            │ │
│                                │   ↓                     │ │
│                                │ ✅ AWS Athena           │ │
│                                └─────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Security benefits:

✅ Credentials isolated in MCP server container
✅ Claude container has NO access to .env or AWS credentials
✅ Bearer token authentication between containers
✅ HTTP API as security boundary
✅ Network-level isolation via Docker bridge

Alternative: stdio mode (local development only):

MCP server runs as subprocess of Claude Code
⚠️ Claude has access to all credentials (not recommended)

Features

4 MCP Tools

execute_template - Run template with validated parameters
execute_query - Execute ad-hoc SELECT query (validated)
list_tables - Show available Athena tables
get_table_schema - Get table DDL (CREATE TABLE statement)

Security Layers

1. Query Validation

Ad-hoc queries: Only SELECT and WITH (CTE) allowed
Forbidden keywords: DROP, DELETE, CREATE, ALTER, INSERT, TRUNCATE, REPLACE
Whitelist approach: Query must start with SELECT or WITH

2. Template Blacklist

DROP_TABLE - Permanently blocked (destructive)
DROP_TABLE_IF_EXISTS - Permanently blocked (destructive)
Extensible: Add more templates to blacklist.py as needed

3. Resource Limits

Max rows: 100,000 (results truncated if exceeded)
Max timeout: 600 seconds (10 minutes)
Scan warning: 100 GB (logged but not blocked)

4. Data Sanitization

Sensitive columns auto-detected: password, token, secret, key, credential, api_key, etc
Redaction: Values replaced with ***REDACTED***
Applied to all results before returning to LLM

5. AWS Authentication

IRSA (IAM Roles for Service Accounts): Temporary credentials via Kubernetes
No static keys: Credentials refreshed automatically
Default profile: Sandbox/replica buckets (read-only production data)

Installation

Prerequisites

Docker & Docker Compose
AWS credentials (~/.aws/credentials with ATHENA_ANALYTICS_INTERNAL profile)
Active Athena setup

Setup (HTTP-based, recommended)

# 1. Create .env file in home directory
cp mcp_athena_analytics/.env.example ~/.mcp_athena_analytics.env

# 2. Generate authentication token
openssl rand -hex 32

# 3. Edit .env file and set MCP_AUTH_TOKEN
nano ~/.mcp_athena_analytics.env

# 4. Start claude-sandbox first (creates network)
cd claude-sandbox
docker compose up -d

# 5. Start MCP server container (joins network)
cd ../mcp_athena_analytics
docker compose up -d --build

# 6. Verify server is running
docker ps | grep mcp-athena-server
curl http://localhost:8000/health
# Expected: {"status":"healthy","service":"mcp-athena-analytics"}

# 7. Configure Claude Code CLI
cd ..
cp .mcp.json.example .mcp.json

# 8. Edit .mcp.json and set MCP_AUTH_TOKEN (same as in ~/.mcp_athena_analytics.env)
nano .mcp.json

Note: .env file is stored in home directory (~/.mcp_athena_analytics.env) to keep credentials outside project directory.

Setup (stdio mode, local development only)

Note: stdio mode is deprecated. Use HTTP-based deployment instead.

For local testing without Docker:

# 1. Install dependencies
pip install -r mcp_athena_analytics/requirements.txt
pip install -r app/requirements.txt

# 2. Create .env file
cp mcp_athena_analytics/.env.example ~/.mcp_athena_analytics.env
nano ~/.mcp_athena_analytics.env

# 3. Test server locally (HTTP mode)
PYTHONPATH=. python mcp_athena_analytics/server_http.py

Usage

Claude Code CLI (HTTP mode)

MCP server configuration is in .mcp.json (created from .mcp.json.example):

{
  "mcpServers": {
    "athena-analytics": {
      "type": "sse",
      "url": "http://mcp-athena-server:8000/sse",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN_HERE"
      }
    }
  }
}

Note: URL uses container name mcp-athena-server for Docker DNS resolution (both containers in claude-network).

Important: Replace YOUR_TOKEN_HERE with the same token from mcp_athena_analytics/.env

Start Claude Code and enable MCP server:

/mcp  # In Claude Code CLI

Note: .mcp.json is in .gitignore (local config), .mcp.json.example is tracked.

Testing

Use curl to test HTTP endpoints:

# Health check
curl http://localhost:8000/health

# Test with MCP client (from claude-sandbox container)
docker exec -it claude-sandbox bash
curl http://mcp-athena-server:8000/health

Tool Examples

execute_query

Run ad-hoc SELECT query:

{
  "tool": "execute_query",
  "arguments": {
    "sql_query": "SELECT COUNT(*) FROM provider__actions_alpha WHERE year = 2024"
  }
}

execute_template

Run registered template with params:

{
  "tool": "execute_template",
  "arguments": {
    "template_name": "AGG_RTP",
    "params": {"year": 2024, "month": 1, "brand": "alpha"}
  }
}

Security Best Practices

What's Allowed

✅ SELECT queries (read-only) ✅ WITH (Common Table Expressions) ✅ Registered templates (except blacklisted) ✅ Table metadata queries

What's Forbidden

❌ DROP TABLE (data destruction) ❌ DELETE/INSERT/UPDATE (data modification) ❌ CREATE/ALTER (schema modification)

Data Access

Athena (S3): ✅ Read-only access via MCP server
PostgreSQL: ❌ NOT accessible (by design)
- Reason: Direct DB access bypasses audit trail
- Alternative: Use Athena (data replicated to S3)

Troubleshooting

Server won't start

Error: ImportError: No module named 'mcp' Fix: pip install mcp

Error: ImportError: No module named 'app.lib.aws' Fix: Set PYTHONPATH:

export PYTHONPATH=/path/to/analytics:$PYTHONPATH
python mcp_athena_analytics/server.py

Error: botocore.exceptions.NoCredentialsError Fix: Configure AWS credentials (~/.aws/credentials profile ATHENA_ANALYTICS_INTERNAL)

Queries timeout

Symptoms: Queries exceed 600s

Fixes:

Add partition filters (year, month, day)
Reduce date range
Use aggregated tables instead of raw data

Results truncated

Symptoms: "truncated: X rows → 100,000 rows"

Fixes:

Add LIMIT clause: SELECT ... LIMIT 10000
Add WHERE filters to reduce results
Use GROUP BY aggregation

Sensitive data not redacted

Fix: Add pattern to sanitizer.py:

SENSITIVE_PATTERNS = [
    r"\bpassword\b",
    r"\buser_api_key\b",  # Add your pattern
]

Architecture

File Structure

mcp_athena_analytics/
├── server.py              # MCP server entrypoint
├── config.py              # Path setup
├── logging_setup.py       # Logging configuration
├── athena_tools/
│   ├── registry.py        # Tool registry (ToolConfig, TOOL_REGISTRY)
│   ├── execute_query.py   # Ad-hoc query tool
│   ├── execute_template.py
│   ├── list_tables.py
│   └── get_table_schema.py
├── blacklist.py           # Template blacklist
├── validator.py           # Query validation
└── sanitizer.py           # Data redaction

Data Flow

Claude Code CLI
    ↓ (JSON-RPC over stdio)
server.py
    ↓ (setup_paths, load_env)
TOOL_REGISTRY
    ↓ (route to tool.execute())
athena_tools/*.py
    ↓ (validate, check blacklist)
AthenaHelper
    ↓ (boto3)
AWS Athena → S3
    ↓ (results)
sanitizer.py (redact PII)
    ↓ (JSON)
Claude Code CLI

Development

Running Tests

# Test individual modules
python -m mcp_athena_analytics.blacklist
python -m mcp_athena_analytics.validator
python -m mcp_athena_analytics.sanitizer

Adding New Tools

Create tool module in athena_tools/my_tool.py:

TOOL_NAME = 'my_tool'
TOOL_DESCRIPTION = 'What this tool does'
TOOL_INPUT_SCHEMA = {...}

def execute(param1: str) -> dict[str, Any]:
    """Implementation."""
    return {'result': '...'}

Add to athena_tools/registry.py:

from mcp_athena_analytics.athena_tools import my_tool

TOOL_REGISTRY: list[ToolConfig] = [
    # ...
    ToolConfig(
        name=my_tool.TOOL_NAME,
        description=my_tool.TOOL_DESCRIPTION,
        input_schema=my_tool.TOOL_INPUT_SCHEMA,
        execute=my_tool.execute,
    ),
]

Tool automatically appears in list_tools() and call_tool().

Logging

stdout: Reserved for MCP protocol (don't pollute!)
stderr: Error messages and debugging
File logs: tmp/mcp_server_YYYYMMDD_HHMMSS.log

License

Internal analytics project.

Support

Check troubleshooting section above
Review server logs in tmp/mcp_server_*.log
Test with MCP Inspector
Contact analytics team

References

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured