AWS Deep Learning Containers MCP Server

A Model Context Protocol (MCP) server for AWS Deep Learning Containers (DLC) that provides tools for discovering, building, deploying, and troubleshooting DLC images.

Features

Dynamic DLC Image Discovery: Automatically fetches the latest image list from the AWS DLC GitHub repository, so the catalog stays up to date
Image Building: Create custom Dockerfiles and build images based on DLC base images
Multi-Platform Deployment: Deploy to SageMaker, EC2, ECS, and EKS
Instance Recommendations: Get GPU instance recommendations based on model size and budget
Upgrade Support: Analyze upgrade paths and generate migration Dockerfiles
Troubleshooting: Diagnose common DLC issues with actionable solutions
Best Practices: Security, cost optimization, and deployment guidance
No AWS Credentials Required: Discovery tools work without AWS credentials

Quick Start

Option 1: Run with uv (Recommended)

# Clone the repo
git clone https://github.com/aws-samples/sample-dlc-mcp-server.git
cd sample-dlc-mcp-server

# Run directly with uv
uv run dlc-mcp-server

Option 2: Run with Docker

# Build the image
docker build -t dlc-mcp-server .

# Run the container
docker run -it --rm \
  -v ~/.aws:/root/.aws:ro \
  dlc-mcp-server

Option 3: Install locally (run from the repository root)

pip install -e .
dlc-mcp-server

MCP Client Configuration

For Amazon Q CLI

Add to ~/.aws/amazonq/mcp.json:

{
  "mcpServers": {
    "dlc-mcp-server": {
      "command": "uv",
      "args": ["--directory", "/path/to/sample-dlc-mcp-server", "run", "dlc-mcp-server"],
      "timeout": 120000
    }
  }
}
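
If the client config file already lists other servers, pasting the snippet above over it would drop them. The following is a minimal sketch of merging the entry instead. The helper name `add_mcp_server` is hypothetical and not part of this project, and the demo writes to a temporary directory rather than the real config path.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into an MCP client config, keeping existing servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the top-level "mcpServers" object if the file did not have one.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Same entry as the JSON above; replace the placeholder with your clone location.
DLC_ENTRY = {
    "command": "uv",
    "args": ["--directory", "/path/to/sample-dlc-mcp-server", "run", "dlc-mcp-server"],
    "timeout": 120000,
}

if __name__ == "__main__":
    # Demonstrate against a throwaway directory, not ~/.aws/amazonq/mcp.json.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "amazonq" / "mcp.json"
        merged = add_mcp_server(path, "dlc-mcp-server", DLC_ENTRY)
        print(json.dumps(merged, indent=2))
```

To apply it for real, pass `Path.home() / ".aws" / "amazonq" / "mcp.json"` as `config_path`.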

For Kiro

Add to .kiro/settings/mcp.json:

{
  "mcpServers": {
    "dlc-mcp-server": {
      "command": "uv",
      "args": ["--directory", "/path/to/sample-dlc-mcp-server", "run", "dlc-mcp-server"],
      "timeout": 120000
    }
  }
}

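Using Docker

MCP clients generally start the command directly rather than through a shell, so `~` may not be expanded. Use an absolute path to your AWS config directory in the volume mount:

{
  "mcpServers": {
    "dlc-mcp-server": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "/absolute/path/to/.aws:/root/.aws:ro", "dlc-mcp-server"],
      "timeout": 120000
    }
  }
}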
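
Because a JSON config is not processed by a shell, a home-relative path in the `-v` argument has to be expanded before it is written. The sketch below builds the Docker entry with an absolute host path. The function name `docker_server_entry` is hypothetical. The image name and mount target are taken from the config above.

```python
import json
from pathlib import Path


def docker_server_entry(image: str = "dlc-mcp-server", aws_dir: str = "~/.aws") -> dict:
    """Build a Docker-based MCP server entry with an absolute host path for the AWS mount."""
    # expanduser() replaces a leading "~" with the current user's home directory.
    host_dir = Path(aws_dir).expanduser().resolve()
    return {
        "command": "docker",
        "args": ["run", "-i", "--rm", "-v", f"{host_dir}:/root/.aws:ro", image],
        "timeout": 120000,
    }


if __name__ == "__main__":
    config = {"mcpServers": {"dlc-mcp-server": docker_server_entry()}}
    print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the client config file. The mount stays read-only (`:ro`), as in the original entry.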

Available Tools

DLC Discovery

Tool	Description
`search_dlc_images`	Search DLC images by framework, version, accelerator, platform
`get_dlc_recommendation`	Get image recommendations based on model type and size
`list_dlc_frameworks`	List all available frameworks with versions
`get_llm_serving_options`	Compare vLLM, SGLang, DJL, NeuronX options
`compare_dlc_images`	Side-by-side image comparison
`refresh_dlc_catalog`	Force refresh image catalog from GitHub
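
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds one for `search_dlc_images` to show the message shape. The argument names (`framework`, `platform`) are assumptions inferred from the tool description, not confirmed parameter names. The tool's input schema, as returned by `tools/list`, is the authority.

```python
import json
from itertools import count

# Simple incrementing id source for JSON-RPC requests.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names; verify them against the tool's input schema.
request = tool_call_request(
    "search_dlc_images",
    {"framework": "vllm", "platform": "sagemaker"},
)
print(json.dumps(request, indent=2))
```

In normal use the MCP client constructs this message itself. The sketch is only meant to show what a tool invocation looks like on the wire.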

Image Building

Tool	Description
`create_custom_dockerfile`	Generate Dockerfile with custom packages
`build_custom_dlc_image`	Build and optionally push to ECR

Deployment

Tool	Description
`deploy_to_sagemaker`	Deploy to SageMaker endpoint
`deploy_to_ec2`	Launch EC2 instance with DLC
`deploy_to_ecs`	Deploy to ECS cluster
`deploy_to_eks`	Deploy to EKS cluster
`get_sagemaker_endpoint_status`	Check endpoint status

Instance Advisor

Tool	Description
`get_instance_recommendation`	GPU instance recommendations by model size
`list_gpu_instances`	List available GPU instances with pricing
`estimate_training_cost`	Estimate training job costs
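
As a rough illustration of sizing by model size, the sketch below estimates how many GPUs are needed to hold a model's weights in memory. This is not the server's actual recommendation logic. The function name and the 20% overhead factor are assumptions, and the estimate ignores KV cache size, batch size, and parallelism constraints.

```python
import math


def min_gpus_for_model(model_size_gb: float, gpu_memory_gb: float, overhead: float = 1.2) -> int:
    """Return the smallest GPU count whose combined memory fits the model plus overhead."""
    if model_size_gb <= 0 or gpu_memory_gb <= 0:
        raise ValueError("sizes must be positive")
    # Assumed headroom for activations and runtime buffers (20% by default).
    required_gb = model_size_gb * overhead
    return math.ceil(required_gb / gpu_memory_gb)


# A 35 GB model with 20% headroom needs about 42 GB of GPU memory.
print(min_gpus_for_model(35, 24))  # GPUs needed at 24 GB each
print(min_gpus_for_model(35, 80))  # GPUs needed at 80 GB each
```

With these assumptions, a 35 GB model needs two 24 GB GPUs or one 80 GB GPU. Treat the result as a lower bound and use `get_instance_recommendation` for an actual recommendation.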

Troubleshooting

Tool	Description
`analyze_dlc_error`	Analyze error logs with root cause analysis
`diagnose_common_issues`	Diagnose common DLC problems
`get_framework_compatibility_info`	Check framework version compatibility

Best Practices

Tool	Description
`get_security_best_practices`	Security guidelines
`get_cost_optimization_tips`	Cost reduction strategies
`get_deployment_best_practices`	Platform-specific guidance
`get_framework_specific_best_practices`	Framework optimization tips

Supported Frameworks

Framework	Latest Version	Use Cases
PyTorch	2.9.0	Training, Inference
TensorFlow	2.19.0	Training, Inference
vLLM	0.15.1	LLM Inference
SGLang	0.5.8	LLM Inference
HuggingFace PyTorch	2.6.0	NLP Training/Inference
AutoGluon	1.5.0	AutoML
DJL	0.36.0	Large Model Inference
PyTorch NeuronX	2.9.0	Trainium/Inferentia

Example Usage

Each example below is a natural-language prompt to give your MCP client.

Find vLLM images

Search for vLLM images for SageMaker inference

Deploy LLM to SageMaker

Deploy Qwen2.5-32B using vLLM on SageMaker with the right instance type

Get instance recommendations

What instance should I use for a 35GB model?

Troubleshoot errors

Help me fix this CUDA out of memory error: [paste error]

Configuration

Environment variables:

Variable	Description	Default
`ALLOW_WRITE`	Enable build/deploy operations	`false`
`ALLOW_SENSITIVE_DATA`	Enable detailed logs access	`false`
`FASTMCP_LOG_LEVEL`	Logging level	`ERROR`
`FASTMCP_LOG_FILE`	Log file path	None
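
The sketch below shows one way these variables and their defaults could be read. It is not the server's implementation. `ServerSettings` and `load_settings` are hypothetical names, and the set of strings treated as "enabled" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat 'true', '1', and 'yes' (any case) as enabled; everything else is off."""
    # Assumption: the real server may accept a different set of truthy values.
    return env.get(name, "false").strip().lower() in {"true", "1", "yes"}


@dataclass(frozen=True)
class ServerSettings:
    allow_write: bool
    allow_sensitive_data: bool
    log_level: str
    log_file: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read the documented variables, falling back to the defaults in the table."""
    return ServerSettings(
        allow_write=_flag(env, "ALLOW_WRITE"),
        allow_sensitive_data=_flag(env, "ALLOW_SENSITIVE_DATA"),
        log_level=env.get("FASTMCP_LOG_LEVEL", "ERROR"),
        log_file=env.get("FASTMCP_LOG_FILE"),
    )


# With no variables set, every value falls back to its documented default.
print(load_settings({}))
print(load_settings({"ALLOW_WRITE": "true", "FASTMCP_LOG_LEVEL": "DEBUG"}))
```

Since `ALLOW_WRITE` defaults to `false`, build and deploy operations stay disabled unless you explicitly set it, for example through an `env` entry in the client config if your client supports one.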

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Run linting
ruff check .

See DEVELOPMENT.md for more details.

License

This library is licensed under the MIT-0 License.
