MCP Server for ETL Orchestration

Enables natural language-powered ETL workflows using Airflow, AWS Glue, Athena, and S3, allowing LLM agents to control and monitor data infrastructure.

🧠 MCP Server for ETL Orchestration

Natural Language-Powered ETL Workflows using Airflow, AWS Glue, Athena, and S3

This project implements a Model Context Protocol (MCP) server that exposes a set of ETL orchestration tools to LLM agents (such as Claude or GPT), letting them control, monitor, and interact with data infrastructure through natural-language requests.


🚀 Features

  • 🛰️ Airflow Integration
    Trigger DAGs, monitor their status, and list available workflows.

  • 🪣 S3 Tools
    Create buckets, upload files, and delete buckets, either programmatically or via LLM prompts.

  • 🧬 AWS Glue Integration
    Start jobs, track job runs, fetch logs, and view available ETL scripts.

  • 🔍 Athena Query Engine
    Execute SQL queries on S3 data, poll for status, fetch results, and list catalog metadata.

  • 🧠 LLM-Native Tool Interface
    An MCP-compliant interface that lets Claude, GPT, and other AI assistants operate the stack from natural-language requests.
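
To make the tool interface concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends when it invokes a tool. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the tool name `trigger_dag` and its `dag_id` argument are hypothetical and may not match the names this server actually registers.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument, used only for illustration.
example = tool_call_request("trigger_dag", {"dag_id": "daily_sales_etl"})
print(json.dumps(example, indent=2))
```

In practice the MCP client (for example Claude Desktop) builds and sends this message itself; the LLM only chooses the tool and fills in the arguments.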


🛠️ Available Tools

📌 Airflow

  • Trigger DAGs
  • Check DAG status
  • List available DAGs with status
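
As a rough illustration of what triggering a DAG involves, the sketch below builds a request against Airflow's stable REST API (`POST /dags/{dag_id}/dagRuns`). It assumes the default base URL and credentials shown in the Setup section and that Airflow's basic-auth API backend is enabled; the function names are hypothetical and are not taken from this repository.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed defaults, mirroring the Airflow settings in the Setup section.
AIRFLOW_API_BASE = "http://localhost:8080/api/v1"
AIRFLOW_USERNAME = "admin"
AIRFLOW_PASSWORD = "admin"


def build_trigger_request(dag_id, conf=None):
    """Build (but do not send) the request that creates a new DAG run."""
    credentials = f"{AIRFLOW_USERNAME}:{AIRFLOW_PASSWORD}".encode()
    token = base64.b64encode(credentials).decode()
    return urllib.request.Request(
        url=f"{AIRFLOW_API_BASE}/dags/{urllib.parse.quote(dag_id, safe='')}/dagRuns",
        data=json.dumps({"conf": conf or {}}).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger_dag(dag_id, conf=None):
    """Send the request and return Airflow's JSON description of the DAG run."""
    request = build_trigger_request(dag_id, conf)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating request construction from sending keeps the HTTP details easy to inspect without a running Airflow instance.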

📌 S3

  • Create an S3 bucket
  • Upload a file to a bucket
  • Delete an S3 bucket (with optional object cleanup)
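
S3 refuses to delete a bucket that still contains objects, which is why the delete tool offers optional cleanup. The sketch below shows one way that could work with a boto3 S3 client; the function name and the `force` flag are hypothetical, and versioned buckets (which also hold object versions and delete markers) are not handled.

```python
def delete_bucket(s3, bucket, force=False):
    """Delete `bucket`; with force=True, remove its objects first.

    `s3` is expected to behave like boto3.client("s3").
    """
    if force:
        while True:
            # list_objects_v2 returns at most 1000 keys per call, which is
            # also the maximum batch size delete_objects accepts.
            page = s3.list_objects_v2(Bucket=bucket)
            contents = page.get("Contents", [])
            if not contents:
                break
            s3.delete_objects(
                Bucket=bucket,
                Delete={"Objects": [{"Key": obj["Key"]} for obj in contents]},
            )
    s3.delete_bucket(Bucket=bucket)
```

With real credentials configured, a call might look like `delete_bucket(boto3.client("s3"), "my-bucket", force=True)`.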

📌 Glue

  • Run a Glue job with optional arguments
  • Check Glue job run status
  • Fetch Glue job logs
  • List all available Glue jobs
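
Running a Glue job and checking its status usually amounts to a start call followed by polling. Here is a minimal sketch using the boto3 Glue client methods `start_job_run` and `get_job_run`; the helper name, the polling interval, and the injected `sleep` parameter are assumptions made for illustration.

```python
import time

# Job run states after which Glue will not change the state again.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_glue_job(glue, job_name, arguments=None, poll_seconds=15, sleep=time.sleep):
    """Start a Glue job and block until it reaches a terminal state.

    `glue` is expected to behave like boto3.client("glue").
    Returns a (run_id, final_state) tuple.
    """
    started = glue.start_job_run(JobName=job_name, Arguments=arguments or {})
    run_id = started["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        if run["JobRunState"] in TERMINAL_STATES:
            return run_id, run["JobRunState"]
        sleep(poll_seconds)
```

Glue job arguments are passed as a string-to-string mapping, conventionally with `--`-prefixed keys such as `{"--input_path": "s3://..."}`.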

📌 Athena

  • Run SQL queries on Athena with configurable output location
  • Check query execution status
  • Fetch query results
  • List available databases
  • List tables in a specific database
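
The Athena tools follow a submit, poll, fetch pattern because Athena queries run asynchronously. The sketch below strings those three steps together with the boto3 Athena client; the helper name is hypothetical, only the first page of results is read, and the first row is treated as the header, which is how Athena returns `SELECT` results.

```python
import time


def run_athena_query(athena, sql, database, output_location,
                     poll_seconds=2, sleep=time.sleep):
    """Run `sql` and return the first page of results as a list of dicts.

    `athena` is expected to behave like boto3.client("athena").
    """
    query_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query leaves the QUEUED/RUNNING states.
    while True:
        status = athena.get_query_execution(
            QueryExecutionId=query_id)["QueryExecution"]["Status"]
        if status["State"] in {"SUCCEEDED", "FAILED", "CANCELLED"}:
            break
        sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {query_id} {status['State']}: {reason}")

    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    table = [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]
    if not table:
        return []
    header, data = table[0], table[1:]
    return [dict(zip(header, values)) for values in data]
```

The output location is an S3 URI such as `s3://my-bucket/athena-results/`; Athena writes the result files there regardless of whether they are fetched through the API.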

⚙️ Setup

1. Clone the Repository and Install Dependencies

git clone https://github.com/atharvpatwardhan/mcp-etl-orchestrator.git
cd mcp-etl-orchestrator
pip install -r requirements.txt

2. Configure Environment Variables

Create a .env file in the root directory and populate it with your AWS credentials:

# AWS Credentials
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_DEFAULT_REGION=your-aws-region
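
Before starting the server it can help to confirm that the .env file defines all three variables. The following is a standard-library-only sketch of such a check; the project itself may load the file differently (for example with python-dotenv), and the function names here are hypothetical.

```python
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]
```

A typical use would be `missing_vars(parse_env(open(".env").read()))`, which returns an empty list when the file is complete.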

3. Update Airflow Credentials in tools/airflow_config.py (optional)

# Airflow API Configuration

AIRFLOW_API_BASE=http://localhost:8080/api/v1
AIRFLOW_USERNAME=admin
AIRFLOW_PASSWORD=admin

4. Start the MCP Server

python main.py

Once the server is running, connect Claude Desktop or any other MCP-compatible client to it and invoke the tools with natural-language commands.
