ArgoCD MCP Server
Safety-first GitOps operations for ArgoCD via the Model Context Protocol. Enables listing, diagnosing, syncing, and managing ArgoCD applications with progressive disclosure and defense-in-depth security.
_____ _____ ____ __ __ _____ _____
/ _ \_______ _____ / ____| _ \ | \/ |/ ____| __ \
/ /_\ \_ __ \_ __ \| | | | | | | \ / | | | _) |
/ | \ | \/ | | | | | | |_| | | |\/| | | | __/
\____|____/__| |_| |_|\_____|____/ |_| |_|\_____|_|
╔═══════════════════════════════════════╗
║ S A F E T Y F I R S T G I T O P S ║
╚═══════════════════════════════════════╝
Why This Exists
"95% of MCP servers are garbage."
We built this because we were tired of:
The Empty Error Message Problem. You sync an app. It fails. The error? "Sync failed." Thanks, very helpful. Meanwhile, the actual cause is buried across three different Kubernetes events, two pod logs, and a misconfigured HPA that nobody told you about. Our diagnose_sync_failure tool aggregates all of that into one coherent analysis.
The "Oops, I Deleted Production" Problem. Most MCP servers treat every operation the same. Read a deployment? Same as deleting a namespace! We implement defense-in-depth: read-only by default, writes require explicit opt-in, destructive operations require typing the resource name twice. Because at 3 AM, you don't need an LLM with root access to your production cluster.
The YAML Fatigue Problem. When you have 47 applications across 12 clusters, the ArgoCD UI becomes a slideshow. We built this for SREs who live in terminals and need answers fast: "What's broken in prod?" should return an answer, not a loading spinner.
The Context Window Problem. MCP servers that dump entire manifests into the context are doing it wrong. We use progressive disclosure: simple status first, details on request, full manifests only when you explicitly ask. Your context window is precious real estate.
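The progressive-disclosure idea above can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the `AppInfo` type, the `render_app` function, and the three detail level names are hypothetical; only the summary line format is borrowed from the demo output in this README.

```python
from dataclasses import dataclass, field

# Hypothetical detail levels, ordered from cheapest to most verbose.
DETAIL_LEVELS = ("summary", "resources", "full")


@dataclass
class AppInfo:
    """Hypothetical container for what is known about one application."""
    name: str
    project: str
    health: str
    sync: str
    resources: list[str] = field(default_factory=list)
    manifest: str = ""


def render_app(app: AppInfo, detail: str = "summary") -> str:
    """Return more output only when the caller asks for a higher detail level."""
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    # The one-line status is always included and is the default response.
    lines = [f"- {app.name} [{app.project}] health={app.health} sync={app.sync}"]
    if detail in ("resources", "full"):
        lines.extend(f"    {resource}" for resource in app.resources)
    if detail == "full":
        # The manifest is the expensive part, so it is emitted only on explicit request.
        lines.append(app.manifest)
    return "\n".join(lines)
```

The default call costs one line of context per application; the manifest only enters the context window when `detail="full"` is requested.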
Quick Demo
Illustrative example — the conversation below is a hand-written demonstration of how an agent would use these tools, not a transcript of a real session. Tool names, parameters, and responses match what the server actually returns; the surrounding chat is for illustration only.
You: What applications are failing in production?
Claude: Let me check the production applications for you.
[calls list_applications(health_status="Degraded", project="prod")]
Found 2 application(s):
- payment-service [prod] health=Degraded sync=OutOfSync dest=payments@https://prod.k8s...
- notification-api [prod] health=Degraded sync=Synced dest=notifications@https://prod.k8s...
The payment-service is both unhealthy AND out of sync. Want me to diagnose what's wrong?
You: Why is payment-service broken?
Claude: I'll run a full diagnosis.
[calls diagnose_sync_failure(name="payment-service")]
Diagnosis for 'payment-service':
Found 4 issue(s):
- Application is out of sync (revision: abc123f)
- [SyncError] Failed to sync: resource payment-deployment: container crashed
- Image pull failed: ImagePullBackOff for payments:v2.3.1
- Found 2 unhealthy resources in resource tree
- Deployment/payment-deployment: Container crashloopbackoff
- Pod/payment-deployment-7f8b9-x2k4j: CrashLoopBackOff
Suggestions:
- Verify image exists and registry credentials are configured
- Check pod logs for application startup errors
- Run get_application_diff to see pending changes
The image tag v2.3.1 doesn't seem to exist. Can you verify it was pushed to the registry?
You: Show me what would change if I sync it
Claude: [calls get_application_diff(name="payment-service")]
Diff for application 'payment-service':
Resources to UPDATE (3):
~ Deployment/payment-deployment
~ ConfigMap/payment-config
~ Service/payment-service
Resources in sync: 12
The sync would update the deployment (probably the image tag fix), update the
config, and touch the service. Want me to run the sync?
You: Yes, sync it
Claude: [calls sync_application(name="payment-service", dry_run=false)]
Sync initiated for 'payment-service'
Revision: HEAD
Prune: false
Use get_application_status to monitor progress.
I'll check the status in a moment to confirm it's progressing.
Philosophy
Progressive Disclosure
Not everything needs to be visible all the time. We tier our tools:
| Tier | Access | Examples |
|---|---|---|
| Tier 1 | Always available | list_applications, get_application_status, diagnose_sync_failure |
| Tier 2 | Requires MCP_READ_ONLY=false | sync_application, refresh_application |
| Tier 3 | Requires confirmation + typing name | delete_application, sync_application_with_prune |
This isn't bureaucracy. This is respecting that production systems deserve more friction than rm -rf /.
Dry-Run by Default
Every write operation defaults to preview mode. You have to explicitly say "yes, really do this" before anything changes. We learned this lesson from too many "I thought that was staging" incidents.
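A dry-run default is easy to express as a function signature. The sketch below is an assumption about how such a handler could look, not the server's implementation: `SyncResult` and `sync_application_sketch` are hypothetical names, and the real ArgoCD API call is replaced by a comment.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    """Hypothetical result type: whether anything changed, plus a message."""
    applied: bool
    message: str


def sync_application_sketch(name: str, dry_run: bool = True) -> SyncResult:
    """Preview by default; only apply when the caller passes dry_run=False."""
    if dry_run:
        return SyncResult(
            applied=False,
            message=f"DRY RUN: would sync '{name}'. Pass dry_run=false to apply.",
        )
    # A real handler would call the ArgoCD API here (omitted in this sketch).
    return SyncResult(applied=True, message=f"Sync initiated for '{name}'")
```

Because the safe path is the default argument, forgetting the parameter produces a preview rather than a change.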
Agent-Friendly Error Messages
# Bad (what most tools return)
Error: exit status 1
# Good (what we return)
ArgoCD API error (403): Application payment-service not found in project 'default'
Suggestions:
- Check if application exists: list_applications(project="prod")
- Verify you have access to the target project
Errors should tell you what went wrong AND what to try next.
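One way to get errors of that shape is to carry the suggestions on the exception itself. This is a minimal sketch under that assumption; `ToolError` and its `render` method are hypothetical names and are not taken from the server's source.

```python
class ToolError(Exception):
    """Hypothetical error type that pairs a message with next steps."""

    def __init__(self, message: str, suggestions: tuple[str, ...] = ()) -> None:
        super().__init__(message)
        self.message = message
        self.suggestions = list(suggestions)

    def render(self) -> str:
        """Format the message followed by a bulleted list of suggestions."""
        lines = [self.message]
        if self.suggestions:
            lines.append("Suggestions:")
            lines.extend(f"- {suggestion}" for suggestion in self.suggestions)
        return "\n".join(lines)


# Usage: rebuild the "good" example shown above.
error = ToolError(
    "ArgoCD API error (403): Application payment-service not found in project 'default'",
    suggestions=(
        'Check if application exists: list_applications(project="prod")',
        "Verify you have access to the target project",
    ),
)
print(error.render())
```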
Quick Start
Installation
# Clone the repository
git clone https://github.com/peopleforrester/mcp-k8s-observability-argocd-server
cd mcp-k8s-observability-argocd-server
# Install with uv
uv sync
Claude Desktop / Claude Code Configuration
Add to your Claude configuration (~/.claude.json for Claude Code):
{
  "mcpServers": {
    "argocd": {
      "type": "stdio",
      "command": "/path/to/uv",
      "args": [
        "run",
        "--directory",
        "/path/to/mcp-k8s-observability-argocd-server",
        "argocd-mcp"
      ],
      "env": {
        "ARGOCD_URL": "https://argocd.example.com",
        "ARGOCD_TOKEN": "your-api-token",
        "ARGOCD_INSECURE": "false"
      }
    }
  }
}
Note: Replace /path/to/uv with the full path to your uv binary (run which uv to find it).
See examples/ for more configuration options including multi-cluster setups.
Docker
# Build the image
docker build -t argocd-mcp-server .
# Run with environment variables
docker run -e ARGOCD_URL=https://argocd.example.com \
-e ARGOCD_TOKEN=your-token \
argocd-mcp-server:latest
Running Directly
# Set environment variables
export ARGOCD_URL=https://argocd.example.com
export ARGOCD_TOKEN=your-token
# Run the server
uv run argocd-mcp
Security Model
We don't just check permissions. We make it hard to do the wrong thing.
| Layer | Environment Variable | Default | What It Does |
|---|---|---|---|
| Read-only Mode | MCP_READ_ONLY | true | Blocks ALL write operations. You can look, but you cannot touch. |
| Non-destructive Mode | MCP_DISABLE_DESTRUCTIVE | true | Blocks delete/prune even if writes are enabled. Deletes require MCP_DISABLE_DESTRUCTIVE=false AND MCP_READ_ONLY=false. |
| Single-cluster Mode | MCP_SINGLE_CLUSTER | false | Restricts operations to the default cluster. For when multi-cluster access is too scary. |
| Audit Logging | MCP_AUDIT_LOG | (disabled) | Logs every operation to a file. For when you need to know who did what. |
| Secret Masking | MCP_MASK_SECRETS | true | Redacts tokens, passwords, and API keys from output. Always on unless you're debugging. |
| Rate Limiting | MCP_RATE_LIMIT_CALLS | 100 | Max API calls per minute. Prevents runaway loops from eating your ArgoCD API. |
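To make the rate-limiting layer concrete, here is a minimal sliding-window limiter. It is a sketch of one common technique, not a description of how the server enforces its limit; the class name and the injectable `clock` parameter are assumptions made so the example can be tested without sleeping. The defaults mirror the documented 100 calls per 60 seconds.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Hypothetical limiter: allow at most max_calls within any window_seconds span."""

    def __init__(
        self,
        max_calls: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps: deque[float] = deque()

    def allow(self) -> bool:
        """Record the call and return True, or return False if the window is full."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_calls:
            return False
        self._timestamps.append(now)
        return True
```

A rejected call is not recorded, so a looping agent that keeps retrying does not extend its own lockout.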
Enabling Write Operations (Carefully)
# Enable writes (still blocks destructive operations)
export MCP_READ_ONLY=false
# Enable destructive operations (delete, prune) - DANGER ZONE
export MCP_DISABLE_DESTRUCTIVE=false
For the full security model deep-dive, see docs/SECURITY.md.
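The way the two flags combine with the tool tiers can be sketched as a single check. This is an illustration under stated assumptions: `env_flag`, `check_allowed`, and the set of strings accepted as true are hypothetical, while the variable names and their defaults come from the table above.

```python
import os
from typing import Mapping, Optional

# Assumption: which strings count as "true". The server may parse these differently.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean flag from the environment, falling back to its default."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def check_allowed(tier: int, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return None if a tool of this tier may run, else a reason it is blocked."""
    if tier not in (1, 2, 3):
        raise ValueError(f"unknown tier: {tier}")
    if tier == 1:
        return None  # Read operations are always available.
    if env_flag("MCP_READ_ONLY", True, env):
        return "Blocked: MCP_READ_ONLY is true. Set MCP_READ_ONLY=false to enable writes."
    if tier == 3 and env_flag("MCP_DISABLE_DESTRUCTIVE", True, env):
        return "Blocked: MCP_DISABLE_DESTRUCTIVE is true. Set it to false to allow delete/prune."
    return None
```

With an empty environment, only Tier 1 passes; Tier 3 passes only when both flags are explicitly set to false.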
Tool Reference
Tier 1: Essential Read Operations (Always Available)
| Tool | What It Does |
|---|---|
| list_applications | List apps with filtering by project, health, or sync status. The "show me what's on fire" tool. |
| get_application | Get detailed app info: source, destination, status. The deep dive. |
| get_application_status | Quick health/sync check. Fast and cheap. |
| get_application_diff | Preview what would change on sync. Look before you leap. |
| get_application_history | View deployment history with commits. "What changed and when?" |
| diagnose_sync_failure | AI-powered troubleshooting. Aggregates logs, events, status into actionable analysis. |
| get_application_logs | Get pod logs for debugging. Filter by pod, container, and time range. |
| list_clusters | List registered clusters with connection status. |
| list_projects | List ArgoCD projects. |
Tier 2: Write Operations (Require MCP_READ_ONLY=false)
| Tool | What It Does |
|---|---|
| sync_application | Sync with dry-run default. Set dry_run=false to actually apply. |
| refresh_application | Force manifest refresh from Git. "Did you push? Let me check again." |
| rollback_application | Rollback to a previous deployment. Dry-run by default. |
| terminate_sync | Stop a running sync operation. For when syncs get stuck. |
Tier 3: Destructive Operations (Require explicit confirmation)
| Tool | What It Does |
|---|---|
| delete_application | Delete application. Requires confirm=true AND confirm_name matching the app name. We make you type it twice for a reason. |
| sync_application_with_prune | Sync and DELETE cluster resources missing from Git. Dry-run by default. Live runs require confirm=true AND confirm_name matching the app name. |
For detailed parameter documentation, see docs/TOOLS.md.
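The Tier 3 confirmation rule (confirm=true plus a confirm_name that matches) can be sketched as a small guard. The function name and the wording of the refusal messages are hypothetical; only the two parameter names come from the table above.

```python
from typing import Optional


def check_delete_confirmation(
    name: str, confirm: bool = False, confirm_name: str = ""
) -> Optional[str]:
    """Return None if the destructive call is confirmed, else a refusal message."""
    if not confirm:
        return f"Refusing to delete '{name}': pass confirm=true to proceed."
    # Exact, case-sensitive match: a near miss is treated as "not confirmed".
    if confirm_name != name:
        return (
            f"Refusing to delete '{name}': confirm_name {confirm_name!r} "
            "does not match the application name."
        )
    return None
```

Requiring the name a second time means a wrong application name in `name` alone is not enough to delete anything.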
Example Conversations
"What applications are failing in production?"
list_applications(health_status="Degraded", project="prod")
"Why is my-app not syncing?"
diagnose_sync_failure(name="my-app")
"Deploy the latest changes to staging"
sync_application(name="my-app", dry_run=false)
"Show me what would change if I sync"
get_application_diff(name="my-app")
"What was deployed last week?"
get_application_history(name="my-app", limit=20)
Configuration Reference
Environment Variables
| Variable | Description | Default |
|---|---|---|
| ARGOCD_URL | ArgoCD server URL | (required) |
| ARGOCD_TOKEN | ArgoCD API token | (required) |
| ARGOCD_INSECURE | Skip TLS verification (dev only!) | false |
| MCP_READ_ONLY | Block write operations | true |
| MCP_DISABLE_DESTRUCTIVE | Block delete/prune | true |
| MCP_SINGLE_CLUSTER | Restrict to default cluster | false |
| MCP_AUDIT_LOG | Path to audit log file | (disabled) |
| MCP_RATE_LIMIT_CALLS | Max API calls per window | 100 |
| MCP_RATE_LIMIT_WINDOW | Rate limit window (seconds) | 60 |
| ARGOCD_MCP_LOG_LEVEL | Logging level | INFO |
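The table above maps onto a small settings object. The sketch below uses only the standard library so it runs anywhere; the project's config.py is described as using pydantic-settings, so treat this as an illustration of the defaults rather than the real loader. `Settings`, `load_settings`, and the accepted boolean strings are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; defaults mirror the table above."""
    argocd_url: str
    argocd_token: str
    argocd_insecure: bool = False
    read_only: bool = True
    disable_destructive: bool = True
    single_cluster: bool = False
    audit_log: Optional[str] = None
    rate_limit_calls: int = 100
    rate_limit_window: int = 60
    log_level: str = "INFO"


def _as_bool(raw: Optional[str], default: bool) -> bool:
    # Assumption: these strings count as true; anything else is false.
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing fast on missing required ones."""
    source = os.environ if env is None else env
    missing = [key for key in ("ARGOCD_URL", "ARGOCD_TOKEN") if not source.get(key)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        argocd_url=source["ARGOCD_URL"],
        argocd_token=source["ARGOCD_TOKEN"],
        argocd_insecure=_as_bool(source.get("ARGOCD_INSECURE"), False),
        read_only=_as_bool(source.get("MCP_READ_ONLY"), True),
        disable_destructive=_as_bool(source.get("MCP_DISABLE_DESTRUCTIVE"), True),
        single_cluster=_as_bool(source.get("MCP_SINGLE_CLUSTER"), False),
        audit_log=source.get("MCP_AUDIT_LOG") or None,
        rate_limit_calls=int(source.get("MCP_RATE_LIMIT_CALLS", "100")),
        rate_limit_window=int(source.get("MCP_RATE_LIMIT_WINDOW", "60")),
        log_level=source.get("ARGOCD_MCP_LOG_LEVEL", "INFO"),
    )
```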
Multi-Instance Configuration
For managing multiple ArgoCD instances (multi-cluster, multi-environment):
# Primary instance
export ARGOCD_URL=https://argocd-prod.example.com
export ARGOCD_TOKEN=prod-token
# Additional instances can be configured via the multi-env example.
# See examples/claude-desktop-multi-env.json
Development
Prerequisites
- Python 3.11, 3.12, 3.13, or 3.14
- uv (recommended) or pip
- Docker (for container builds)
- Kind 0.32+ (for local Kubernetes testing)
Setup
# Install dependencies
uv sync --dev
# Run tests
uv run pytest
# Run linting
uv run ruff check src tests
uv run mypy src
# Build Docker image
docker build -t argocd-mcp-server .
Testing with Kind
Important: Kubernetes 1.36 removed cgroup v1 support entirely — a node on a cgroup v1 host will not start. (cgroup v1 had been in maintenance mode since 1.31; 1.35 was the last release to support it.) Check your cgroup version:
docker info | grep "Cgroup Version"
- Cgroup Version: 2 - Use Kubernetes 1.36 (default in Kind 0.32+)
- Cgroup Version: 1 - Pin Kubernetes 1.35.x or earlier, or upgrade Docker/WSL2 to cgroup v2
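The choice between the two node images can be sketched as a function over the text that `docker info` prints. This is not the contents of scripts/setup-test-cluster.sh; `pick_node_image` is a hypothetical helper, and the two image tags are the ones used in the manual commands in this section.

```python
import re

# Image tags taken from the manual commands in this section.
NODE_IMAGES = {
    2: "kindest/node:v1.36.1",  # cgroup v2 hosts
    1: "kindest/node:v1.35.0",  # cgroup v1 hosts (last supported release)
}


def pick_node_image(docker_info_output: str) -> str:
    """Pick a Kind node image from the 'Cgroup Version' line of `docker info` output."""
    match = re.search(r"Cgroup Version:\s*(\d+)", docker_info_output)
    if match is None:
        raise ValueError("no 'Cgroup Version' line found in docker info output")
    version = int(match.group(1))
    if version not in NODE_IMAGES:
        raise ValueError(f"unsupported cgroup version: {version}")
    return NODE_IMAGES[version]
```

The returned tag can be passed to `kind create cluster --image`.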
# Auto-detect cgroup version and create cluster
./scripts/setup-test-cluster.sh
# Or manually with specific version:
# For cgroups v2 (recommended):
kind create cluster --name argocd-mcp-test --image kindest/node:v1.36.1
# For cgroups v1 hosts (last supported release):
kind create cluster --name argocd-mcp-test --image kindest/node:v1.35.0
# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Wait for ArgoCD to be ready
kubectl wait --for=condition=available --timeout=300s deployment/argocd-server -n argocd
# Get ArgoCD admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
# Port forward
kubectl port-forward svc/argocd-server -n argocd 8080:443
Architecture
argocd-mcp-server/
├── src/argocd_mcp/
│ ├── server.py # Entrypoint: FastMCP instance, lifespan, ServerContext, registration
│ ├── config.py # Configuration management (pydantic-settings)
│ ├── tools/
│ │ ├── read.py # Tier-1 read-only handlers
│ │ ├── write.py # Tier-2 write handlers (require MCP_READ_ONLY=false)
│ │ ├── destructive.py # Tier-3 destructive handlers (require confirmation)
│ │ ├── params.py # Pydantic parameter models for every tool
│ │ └── _safety.py # Shared destination-cluster guard
│ ├── resources/
│ │ └── applications.py # MCP resources: argocd://instances, argocd://security
│ └── utils/
│ ├── client.py # ArgoCD API client with retry logic and secret masking
│ ├── safety.py # Confirmation patterns, rate limiting
│ └── logging.py # Structured logging, audit trail
├── tests/
│ ├── unit/ # Unit tests
│ └── integration/ # Integration tests (Kind cluster)
├── docs/
│ ├── TOOLS.md # Detailed tool documentation
│ └── SECURITY.md # Security model deep-dive
├── examples/ # Example configurations
└── Dockerfile # Multi-stage container build
Contributing
See CONTRIBUTING.md for development setup and guidelines.
License
Apache 2.0 - See LICENSE for details.
Acknowledgments
Built on the shoulders of:
- MCP Specification - The protocol that makes this possible
- containers/kubernetes-mcp-server - Inspiration for safety patterns
- argoproj-labs/mcp-for-argocd - The official (but less opinionated) option
Built by SREs, for SREs. Because production deserves better than "LGTM, ship it."