Apache Spark History Server MCP

Exposes Spark History Server data as tools for AI agents, enabling natural language querying of Spark applications, jobs, stages, and performance metrics.

Kubeflow Spark AI Toolkit

Connect AI agents and engineers to Apache Spark History Server for intelligent job analysis, performance monitoring, and investigation

✨ NEW — Spark History Server CLI is now available

A standalone Go binary that queries Spark History Server directly from your terminal — no MCP, no AI framework, no daemon process. Inspect jobs, compare runs, investigate failures, and script against the Spark REST API.

Get started with the SHS CLI →

This project provides two interfaces to your Spark History Server data:

| | 🛠️ SHS CLI (`shs`) | ⚡ MCP Server |
|---|---|---|
| For | Engineers, shell scripts, CI/CD, coding agents | AI agents and MCP-compatible clients |
| Mental model | "I know the command I want to run" | "Agent, investigate this Spark app" |
| Install | Single static binary, no dependencies | Python 3.12+, uv |
| Get started | CLI docs → | MCP docs → |

🏗️ Architecture

graph TB
    subgraph Clients
        A[🤖 AI Agent / LLM]
        B[👩‍💻 Engineer / Script / CI]
        C[🔧 Coding Agent - Claude Code / Kiro]
    end

    subgraph "Kubeflow Spark AI Toolkit"
        D[⚡ MCP Server]
        E[🛠️ CLI - shs]
    end

    subgraph "Spark History Servers"
        F[🔥 Production]
        G[🔥 Staging / Dev]
    end

    A -->|MCP Protocol| D
    B -->|Terminal commands| E
    C -->|shs skill file| E

    D -->|REST API| F
    D -->|REST API| G
    E -->|REST API| F
    E -->|REST API| G
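
Both interfaces end up issuing HTTP requests against the Spark History Server REST API, which Spark serves under `/api/v1`. As a rough sketch of what such a request looks like, the helper below builds endpoint URLs. The function name `shs_url`, the host, and the application ID are illustrative assumptions and are not part of this project.

```python
from urllib.parse import quote, urlencode


def shs_url(base_url: str, *path: str, **params: object) -> str:
    """Build a Spark History Server REST URL under /api/v1.

    `shs_url` is a hypothetical helper used only for illustration.
    """
    segments = "/".join(quote(str(p), safe="") for p in path)
    url = f"{base_url.rstrip('/')}/api/v1/{segments}"
    # Drop parameters that were not provided (None) before encoding.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{url}?{query}" if query else url


base = "http://localhost:18080"  # assumed local History Server
print(shs_url(base, "applications", status="completed", limit=5))
print(shs_url(base, "applications", "app-123", "jobs", status="failed"))
```

The first call targets the application list and the second targets the jobs of a single application, which is roughly the kind of data `shs apps` and `shs jobs` surface.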

🛠️ SHS CLI (`shs`) — For Engineers & Scripts

A standalone Go binary. Query your Spark History Server directly from the terminal, shell scripts, or CI/CD pipelines. Also works as a skill for coding agents like Claude Code and Kiro.

Install

# Auto-detect latest version, OS, and architecture
VERSION=$(curl -s https://api.github.com/repos/kubeflow/mcp-apache-spark-history-server/releases | grep -m1 '"tag_name": "cli/' | cut -d'"' -f4 | sed 's|cli/||')
OS=$(uname -s | tr '[:upper:]' '[:lower:]')
ARCH=$(uname -m)
[ "$ARCH" = "x86_64" ] && ARCH="amd64"
[ "$ARCH" = "aarch64" ] && ARCH="arm64"

curl -sSL "https://github.com/kubeflow/mcp-apache-spark-history-server/releases/download/cli%2F${VERSION}/shs-${VERSION}-${OS}-${ARCH}.tar.gz" | tar xz
sudo mv shs /usr/local/bin/

Quick Start

# Generate a config file
shs setup config > config.yaml   # then set your Spark History Server URL

# Explore applications
shs apps
shs jobs -a APP_ID --status failed
shs stages -a APP_ID --sort duration
shs compare apps --app-a APP1 --app-b APP2

# Use as a skill with Claude Code or Kiro
shs setup skill > ~/.claude/skills/spark-history.md

See the CLI documentation for full usage, or check out a real-world example of Claude Code comparing two TPC-DS 3TB benchmark runs.

⚡ MCP Server — For AI Agents

An MCP (Model Context Protocol) server that exposes Spark History Server data as tools for AI agents. Agents query your Spark infrastructure using natural language — the server handles tool selection, multi-server routing, and structured data retrieval.

Use the MCP server when you want an AI agent to conduct multi-step investigations, synthesize findings across tools, or answer natural-language questions about your Spark applications.

Install

# Run directly with uvx (no install needed)
uvx --from mcp-apache-spark-history-server spark-mcp

# Or install it as a persistent tool with uv
uv tool install mcp-apache-spark-history-server
spark-mcp

The package is published to PyPI.

Configure

Create a file named config.yaml with a basic configuration:

servers:
  local:
    default: true
    url: "http://your-spark-history-server:18080"
    auth:            # optional
      username: "user"
      password: "pass"
    include_plan_description: false   # set to true to include SQL plan descriptions by default (default: false)
mcp:
  transports:
    - streamable-http   # or: stdio
  port: "18888"
  debug: false

Configuration values can be overridden with environment variables:

SHS_MCP_PORT          Port for MCP server (default: 18888)
SHS_MCP_TRANSPORT     Transport mode: streamable-http or stdio
SHS_MCP_DEBUG         Enable debug mode (default: false)
SHS_MCP_ADDRESS       Bind address (default: localhost)
SHS_SERVERS_*_URL     URL for a specific server
SHS_SERVERS_*_AUTH_USERNAME
SHS_SERVERS_*_AUTH_PASSWORD
SHS_SERVERS_*_AUTH_TOKEN
SHS_SERVERS_*_VERIFY_SSL
SHS_SERVERS_*_TIMEOUT
SHS_SERVERS_*_EMR_CLUSTER_ARN
SHS_SERVERS_*_INCLUDE_PLAN_DESCRIPTION
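
The `*` in the `SHS_SERVERS_*` variables stands for a server name from config.yaml. Here is a minimal sketch of how such an override could be resolved, assuming the server name is upper-cased to form the variable name (for example `SHS_SERVERS_PRODUCTION_URL`). That naming rule and the helper names `override_name` and `effective_url` are assumptions for illustration; check the project documentation for the exact behavior.

```python
import os
from typing import Mapping


def override_name(server: str, field: str) -> str:
    # Assumed convention: SHS_SERVERS_<SERVER>_<FIELD>, upper-cased.
    return f"SHS_SERVERS_{server.upper()}_{field.upper()}"


def effective_url(server: str, file_url: str,
                  environ: Mapping[str, str] = os.environ) -> str:
    """Return the URL from the environment if set, else the config file value."""
    return environ.get(override_name(server, "url"), file_url)


# A plain dict stands in for the process environment in this example.
env = {"SHS_SERVERS_PRODUCTION_URL": "http://prod-spark-history:18080"}
print(effective_url("production", "http://localhost:18080", env))
print(effective_url("staging", "http://staging-spark-history:18080", env))
```

The environment value wins for `production`, while `staging` falls back to the value from the file because no matching variable is set.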

Multi-Server Setup

Configure multiple Spark History Servers and route queries to specific ones:

servers:
  production:
    default: true
    url: "http://prod-spark-history:18080"
    auth:
      username: "user"
      password: "pass"
  staging:
    url: "http://staging-spark-history:18080"

Agents can target a specific server per query:

"Get application <app_id> from the production server"
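
The routing rule described here can be sketched in a few lines: use the named server when the query specifies one, otherwise fall back to the entry marked `default: true`. The function `resolve_server` is a hypothetical illustration of that rule and not the server's actual implementation.

```python
from typing import Optional

# Mirrors the multi-server config.yaml above (auth omitted for brevity).
servers = {
    "production": {"default": True, "url": "http://prod-spark-history:18080"},
    "staging": {"url": "http://staging-spark-history:18080"},
}


def resolve_server(servers: dict, name: Optional[str] = None) -> dict:
    """Pick the named server, or the one flagged `default: true`."""
    if name is not None:
        if name not in servers:
            raise KeyError(f"unknown server: {name}")
        return servers[name]
    for config in servers.values():
        if config.get("default"):
            return config
    raise LookupError("no default server configured")


print(resolve_server(servers)["url"])             # default server
print(resolve_server(servers, "staging")["url"])  # explicit target
```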

Connect an AI Agent

| Agent | Transport | Guide |
|---|---|---|
| Claude Desktop | stdio | Setup → |
| Claude Code | stdio or streamable-http | Setup → |
| Kiro | streamable-http | Setup → |
| LangGraph | streamable-http | Setup → |
| Strands Agents | streamable-http | Setup → |
| Local / Inspector | streamable-http | Setup → |

Available Tools (21)

<details> <summary>Available Tools</summary>

Application Information

| Tool | Description |
|---|---|
| `list_applications` | List applications with optional status, date, and limit filters |
| `get_application` | Get application detail: status, resources, duration, attempts |

Job Analysis

| Tool | Description |
|---|---|
| `list_jobs` | List jobs with status filtering |
| `list_slowest_jobs` | Top N slowest jobs |
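
To make "top N slowest" concrete, here is a small sketch that ranks job records by wall-clock duration. It assumes records shaped like the Spark REST API's job objects, with `submissionTime` and `completionTime` strings in the `2024-01-01T10:00:00.000GMT` style; the sample data is made up, and the helper names are hypothetical rather than the tool's real implementation.

```python
from datetime import datetime

# Assumed timestamp format of the Spark REST API, e.g. 2024-01-01T10:00:00.000GMT
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fGMT"


def duration_seconds(job: dict) -> float:
    start = datetime.strptime(job["submissionTime"], TIME_FORMAT)
    end = datetime.strptime(job["completionTime"], TIME_FORMAT)
    return (end - start).total_seconds()


def slowest_jobs(jobs: list, n: int = 5) -> list:
    """Return the n longest-running completed jobs, slowest first."""
    finished = [j for j in jobs if "completionTime" in j]
    return sorted(finished, key=duration_seconds, reverse=True)[:n]


# Made-up sample records for illustration.
jobs = [
    {"jobId": 0, "submissionTime": "2024-01-01T10:00:00.000GMT",
     "completionTime": "2024-01-01T10:00:30.000GMT"},
    {"jobId": 1, "submissionTime": "2024-01-01T10:01:00.000GMT",
     "completionTime": "2024-01-01T10:05:00.000GMT"},
    {"jobId": 2, "submissionTime": "2024-01-01T10:06:00.000GMT"},  # still running
]
print([j["jobId"] for j in slowest_jobs(jobs, n=2)])
```

Jobs without a completion time are skipped in this sketch because their duration is not yet known.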

Stage Analysis

| Tool | Description |
|---|---|
| `list_stages` | List stages with status filtering |
| `list_slowest_stages` | Top N slowest stages |
| `get_stage` | Stage detail with attempt and summary metrics |
| `get_stage_task_summary` | Task metric distributions (execution time, memory, I/O, spill) |

Executor & Resource Analysis

| Tool | Description |
|---|---|
| `list_executors` | List executors (active and optionally inactive) |
| `get_executor` | Executor detail: resources, task stats, performance |
| `get_executor_summary` | Aggregate metrics across all executors |
| `get_resource_usage_timeline` | Chronological executor add/remove with resource totals |
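
A resource usage timeline of this kind can be derived from executor records alone. The sketch below assumes each record carries `id`, `addTime`, `totalCores`, and an optional `removeTime`, which is how executor data typically appears in the Spark REST API. The function `resource_timeline` and the sample data are hypothetical, and the real tool may track more than cores.

```python
def resource_timeline(executors: list) -> list:
    """Return (time, event, executor_id, total_cores_after) tuples in time order."""
    events = []
    for ex in executors:
        events.append((ex["addTime"], "add", ex["id"], ex["totalCores"]))
        if ex.get("removeTime"):
            events.append((ex["removeTime"], "remove", ex["id"], -ex["totalCores"]))

    timeline, cores = [], 0
    # Fixed-width timestamp strings in one format sort chronologically.
    for time, kind, executor_id, delta in sorted(events):
        cores += delta
        timeline.append((time, kind, executor_id, cores))
    return timeline


# Made-up sample records for illustration.
executors = [
    {"id": "1", "addTime": "2024-01-01T10:00:00.000GMT", "totalCores": 4},
    {"id": "2", "addTime": "2024-01-01T10:02:00.000GMT",
     "removeTime": "2024-01-01T10:10:00.000GMT", "totalCores": 4},
]
for entry in resource_timeline(executors):
    print(entry)
```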

Configuration & Environment

| Tool | Description |
|---|---|
| `get_environment` | Spark config, JVM info, system properties, classpath |

SQL & Query Analysis

| Tool | Description |
|---|---|
| `list_slowest_sql_queries` | Top N slowest SQL executions with metrics |
| `get_sql_execution` | SQL execution detail with optional plan and node metrics |
| `compare_sql_execution_plans` | Compare SQL plans and metrics between two jobs |

Performance & Bottleneck Analysis

| Tool | Description |
|---|---|
| `get_job_bottlenecks` | Identify bottlenecks across stages, tasks, and executors |

Comparative Analysis

| Tool | Description |
|---|---|
| `compare_job_environments` | Diff Spark configs between two applications |
| `compare_job_performance` | Diff performance metrics between two applications |
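
As an illustration of what a config diff between two applications involves, the sketch below compares two sets of Spark properties. It assumes the properties arrive as lists of `[key, value]` pairs, which is how the Spark REST API's environment endpoint reports `sparkProperties`. The function `diff_spark_properties` and the sample values are hypothetical, not the tool's actual output format.

```python
def diff_spark_properties(props_a: list, props_b: list) -> dict:
    """Compare two lists of [key, value] pairs and report what differs."""
    a, b = dict(props_a), dict(props_b)
    return {
        "only_in_a": {k: a[k] for k in sorted(a.keys() - b.keys())},
        "only_in_b": {k: b[k] for k in sorted(b.keys() - a.keys())},
        "changed": {k: (a[k], b[k])
                    for k in sorted(a.keys() & b.keys()) if a[k] != b[k]},
    }


# Made-up sample properties for illustration.
app_a = [["spark.executor.memory", "4g"], ["spark.sql.shuffle.partitions", "200"]]
app_b = [["spark.executor.memory", "8g"], ["spark.sql.adaptive.enabled", "true"]]

for section, entries in diff_spark_properties(app_a, app_b).items():
    print(section, entries)
```

Separating keys present in only one application from keys whose values changed tends to make regressions easier to spot than a flat list of differences.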

AWS Spark Troubleshooting (opt-in)

| Tool | Description |
|---|---|
| `aws_analyze_spark_workload` | One-shot root cause analysis of failed/slow Spark workloads |
| `aws_spark_code_recommendation` | Code fix recommendations for identified Spark issues |

These tools become available automatically when AWS credentials and a region are configured. See the IAM setup guide.

</details>

Example Agent Queries

"Why is my ETL job running slower than yesterday?" → get_job_bottlenecks + list_slowest_stages + compare_job_performance
"What caused job 42 to fail?" → list_jobs + get_stage + get_stage_task_summary
"Compare today's batch with yesterday's run" → compare_job_performance + compare_job_environments
"Find my slowest SQL queries and explain why" → list_slowest_sql_queries + get_sql_execution + compare_sql_execution_plans

📸 Screenshots

🔍 Get Spark Application

[Screenshot: Get Application]

⚡ Job Performance Comparison

[Screenshot: Job Comparison]

🚀 Kubernetes Deployment

Deploy the MCP server using Helm:

helm install spark-history-mcp ./deploy/kubernetes/helm/mcp-apache-spark-history-server/

# Production configuration
helm install spark-history-mcp ./deploy/kubernetes/helm/mcp-apache-spark-history-server/ \
  --set replicaCount=3 \
  --set autoscaling.enabled=true

See deploy/kubernetes/helm/ for full configuration options.

When deployed in Kubernetes, forward the service port to your machine, then connect Claude Desktop via mcp-remote:

kubectl port-forward svc/mcp-apache-spark-history-server 18888:18888

📔 AWS Integration

AWS Glue — Connect to Glue Spark History Server
Amazon EMR — Use EMR Persistent UI for Spark analysis
AWS Spark Troubleshooting — One-shot root cause analysis and code fix recommendations for failed Spark workloads (EMR EC2, EMR Serverless). Automatically available when AWS credentials and region are configured. See IAM setup guide for required permissions.

🔧 Development Setup

git clone https://github.com/kubeflow/mcp-apache-spark-history-server.git
cd mcp-apache-spark-history-server

# Install Task runner
brew install go-task   # macOS; see https://taskfile.dev/installation/ for others

# MCP Server
task install           # install Python dependencies
task start-spark-bg    # start Spark History Server with sample data
task start-mcp-bg      # start MCP server
task start-inspector-bg  # open MCP Inspector at http://localhost:6274
task stop-all

# CLI
cd skills/cli
task build             # build ./bin/shs
task test              # unit tests
task test-e2e          # e2e tests (starts/stops Docker SHS automatically)
task start-shs         # start SHS with CLI e2e sample data

🌍 Adopters

Using this project? Add your organization to ADOPTERS.md and help grow the community.

🤝 Contributing

See CONTRIBUTING.md for guidelines.

📄 License

Apache License 2.0 — see LICENSE.

📝 Trademark Notice

Built for use with Apache Spark™ History Server. Not affiliated with or endorsed by the Apache Software Foundation.

Connect your Spark infrastructure to AI agents and engineers

Built by the community, for the community 💙
