ArgoCD MCP Server

Safety-first GitOps operations for ArgoCD via the Model Context Protocol. Enables listing, diagnosing, syncing, and managing ArgoCD applications with progressive disclosure and defense-in-depth security.


     _____                  _____ ____    __  __  _____ _____
    /  _  \_______ _____   / ____|  _ \  |  \/  |/ ____| __ \
   /  /_\  \_  __ \_  __ \| |    | | | | | \  / | |    |  _) |
  /    |    \  | \/ | | | | |    | |_| | | |\/| | |    |  __/
  \____|____/__|    |_| |_|\_____|____/  |_|  |_|\_____|_|

         ╔═══════════════════════════════════════╗
         ║  S A F E T Y   F I R S T   G I T O P S ║
         ╚═══════════════════════════════════════╝



Why This Exists

"95% of MCP servers are garbage."

We built this because we were tired of:

The Empty Error Message Problem. You sync an app. It fails. The error? "Sync failed." Thanks, very helpful. Meanwhile, the actual cause is buried across three different Kubernetes events, two pod logs, and a misconfigured HPA that nobody told you about. Our diagnose_sync_failure tool aggregates all of that into one coherent analysis.

The "Oops, I Deleted Production" Problem. Most MCP servers treat every operation the same. Read a deployment? Same as deleting a namespace! We implement defense-in-depth: read-only by default, writes require explicit opt-in, destructive operations require typing the resource name twice. Because at 3 AM, you don't need an LLM with root access to your production cluster.

The YAML Fatigue Problem. When you have 47 applications across 12 clusters, the ArgoCD UI becomes a slideshow. We built this for SREs who live in terminals and need answers fast: "What's broken in prod?" should return an answer, not a loading spinner.

The Context Window Problem. MCP servers that dump entire manifests into the context are doing it wrong. We use progressive disclosure: simple status first, details on request, full manifests only when you explicitly ask. Your context window is precious real estate.


Quick Demo

Illustrative example — the conversation below is a hand-written demonstration of how an agent would use these tools, not a transcript of a real session. Tool names, parameters, and responses match what the server actually returns; the surrounding chat is for illustration only.

You: What applications are failing in production?

Claude: Let me check the production applications for you.

[calls list_applications(health_status="Degraded", project="prod")]

Found 2 application(s):

- payment-service [prod] health=Degraded sync=OutOfSync dest=payments@https://prod.k8s...
- notification-api [prod] health=Degraded sync=Synced dest=notifications@https://prod.k8s...

The payment-service is both unhealthy AND out of sync. Want me to diagnose what's wrong?

You: Why is payment-service broken?

Claude: I'll run a full diagnosis.

[calls diagnose_sync_failure(name="payment-service")]

Diagnosis for 'payment-service':

Found 4 issue(s):
  - Application is out of sync (revision: abc123f)
  - [SyncError] Failed to sync: resource payment-deployment: container crashed
  - Image pull failed: ImagePullBackOff for payments:v2.3.1
  - Found 2 unhealthy resources in resource tree
    - Deployment/payment-deployment: Container crashloopbackoff
    - Pod/payment-deployment-7f8b9-x2k4j: CrashLoopBackOff

Suggestions:
  - Verify image exists and registry credentials are configured
  - Check pod logs for application startup errors
  - Run get_application_diff to see pending changes

The image tag v2.3.1 doesn't seem to exist. Can you verify it was pushed to the registry?

You: Show me what would change if I sync it

Claude: [calls get_application_diff(name="payment-service")]

Diff for application 'payment-service':

Resources to UPDATE (3):
  ~ Deployment/payment-deployment
  ~ ConfigMap/payment-config
  ~ Service/payment-service

Resources in sync: 12

The sync would update the deployment (probably the image tag fix), update the
config, and touch the service. Want me to run the sync?

You: Yes, sync it

Claude: [calls sync_application(name="payment-service", dry_run=false)]

Sync initiated for 'payment-service'
Revision: HEAD
Prune: false

Use get_application_status to monitor progress.

I'll check the status in a moment to confirm it's progressing.

Philosophy

Progressive Disclosure

Not everything needs to be visible all the time. We tier our tools:

| Tier | Access | Examples |
|------|--------|----------|
| Tier 1 | Always available | list_applications, get_application_status, diagnose_sync_failure |
| Tier 2 | Requires MCP_READ_ONLY=false | sync_application, refresh_application |
| Tier 3 | Requires confirmation + typing name | delete_application, sync_application_with_prune |

This isn't bureaucracy. This is respecting that production systems deserve more friction than rm -rf /.
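
As a rough illustration of the tiering, here is a minimal sketch of a gate that decides whether a tool is unlocked. The `Tier` enum, `TOOL_TIERS` mapping, and `is_allowed` function are hypothetical names made up for this example, not the server's actual code; the flags mirror MCP_READ_ONLY and MCP_DISABLE_DESTRUCTIVE, and the Tier 3 confirmation step is deliberately left out.

```python
from enum import IntEnum


class Tier(IntEnum):
    READ = 1         # always available
    WRITE = 2        # needs MCP_READ_ONLY=false
    DESTRUCTIVE = 3  # additionally needs MCP_DISABLE_DESTRUCTIVE=false


# Hypothetical mapping, using the example tools from the table above.
TOOL_TIERS = {
    "list_applications": Tier.READ,
    "get_application_status": Tier.READ,
    "diagnose_sync_failure": Tier.READ,
    "sync_application": Tier.WRITE,
    "refresh_application": Tier.WRITE,
    "delete_application": Tier.DESTRUCTIVE,
    "sync_application_with_prune": Tier.DESTRUCTIVE,
}


def is_allowed(tool: str, read_only: bool = True, disable_destructive: bool = True) -> bool:
    """Return True if the tool's tier is unlocked by the given flags."""
    tier = TOOL_TIERS[tool]
    if tier is Tier.READ:
        return True
    if read_only:
        # Read-only mode blocks every write, destructive or not.
        return False
    if tier is Tier.DESTRUCTIVE and disable_destructive:
        return False
    return True
```

With the defaults, only Tier 1 tools pass; `is_allowed("sync_application", read_only=False)` unlocks Tier 2 while deletes stay blocked.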

Dry-Run by Default

Every write operation defaults to preview mode. You have to explicitly say "yes, really do this" before anything changes. We learned this lesson from too many "I thought that was staging" incidents.
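
The pattern is simple enough to sketch. The handler below is a hypothetical stand-in (the `SyncResult` type and the message wording are assumptions, not the server's real output); the point is only that `dry_run` defaults to `True`, so forgetting the flag can never change the cluster.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    applied: bool
    message: str


def sync_application(name: str, dry_run: bool = True) -> SyncResult:
    """Hypothetical handler: preview unless the caller passes dry_run=False."""
    if dry_run:
        return SyncResult(
            applied=False,
            message=f"DRY RUN: would sync '{name}'. Pass dry_run=False to apply.",
        )
    # A real handler would call the ArgoCD sync API at this point.
    return SyncResult(applied=True, message=f"Sync initiated for '{name}'")
```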

Agent-Friendly Error Messages

# Bad (what most tools return)
Error: exit status 1

# Good (what we return)
ArgoCD API error (403): Application payment-service not found in project 'default'

Suggestions:
  - Check if application exists: list_applications(project="prod")
  - Verify you have access to the target project

Errors should tell you what went wrong AND what to try next.
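
One way to produce messages in that shape is sketched below. `format_error` is a hypothetical helper written for this example; it only reproduces the layout shown above and says nothing about how the server builds its suggestions.

```python
def format_error(status: int, detail: str, suggestions: list[str]) -> str:
    """Render an API error followed by concrete next steps for the agent."""
    lines = [f"ArgoCD API error ({status}): {detail}"]
    if suggestions:
        lines += ["", "Suggestions:"]
        lines += [f"  - {s}" for s in suggestions]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        format_error(
            403,
            "Application payment-service not found in project 'default'",
            [
                'Check if application exists: list_applications(project="prod")',
                "Verify you have access to the target project",
            ],
        )
    )
```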


Quick Start

Installation

# Clone the repository
git clone https://github.com/peopleforrester/mcp-k8s-observability-argocd-server
cd mcp-k8s-observability-argocd-server

# Install with uv
uv sync

Claude Desktop / Claude Code Configuration

Add to your Claude configuration (~/.claude.json for Claude Code, claude_desktop_config.json for Claude Desktop):

{
  "mcpServers": {
    "argocd": {
      "type": "stdio",
      "command": "/path/to/uv",
      "args": [
        "run",
        "--directory",
        "/path/to/mcp-k8s-observability-argocd-server",
        "argocd-mcp"
      ],
      "env": {
        "ARGOCD_URL": "https://argocd.example.com",
        "ARGOCD_TOKEN": "your-api-token",
        "ARGOCD_INSECURE": "false"
      }
    }
  }
}

Note: Replace /path/to/uv with the full path to your uv binary (run which uv to find it).

See examples/ for more configuration options including multi-cluster setups.
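
If you'd rather not hand-edit paths, a small script can emit the same entry. This is a convenience sketch, not something shipped in the repository: `build_config` is a hypothetical helper, and it falls back to the `/path/to/uv` placeholder when `uv` is not on your PATH.

```python
import json
import shutil


def build_config(repo_dir: str, argocd_url: str, token: str) -> dict:
    """Build the mcpServers entry shown above with the local uv path filled in."""
    # shutil.which returns None when uv is not on PATH; keep the placeholder then.
    uv_path = shutil.which("uv") or "/path/to/uv"
    return {
        "mcpServers": {
            "argocd": {
                "type": "stdio",
                "command": uv_path,
                "args": ["run", "--directory", repo_dir, "argocd-mcp"],
                "env": {
                    "ARGOCD_URL": argocd_url,
                    "ARGOCD_TOKEN": token,
                    "ARGOCD_INSECURE": "false",
                },
            }
        }
    }


if __name__ == "__main__":
    config = build_config(
        "/path/to/mcp-k8s-observability-argocd-server",
        "https://argocd.example.com",
        "your-api-token",
    )
    print(json.dumps(config, indent=2))
```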

Docker

# Build the image
docker build -t argocd-mcp-server .

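# Run with environment variables.
# -i keeps stdin open, which the server needs when it speaks MCP over stdio
# (as in the configuration above); --rm removes the container on exit.
docker run -i --rm \
           -e ARGOCD_URL=https://argocd.example.com \
           -e ARGOCD_TOKEN=your-token \
           argocd-mcp-server:latest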

Running Directly

# Set environment variables
export ARGOCD_URL=https://argocd.example.com
export ARGOCD_TOKEN=your-token

# Run the server
uv run argocd-mcp
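
Before starting the server, it can save time to confirm that the URL and token actually work. The sketch below is not part of this project: `build_list_request` is a hypothetical helper that checks the two required variables and prepares a request against ArgoCD's REST endpoint for listing applications (`/api/v1/applications`) with a bearer token.

```python
import urllib.request
from typing import Mapping


def build_list_request(env: Mapping[str, str]) -> urllib.request.Request:
    """Validate required variables and prepare an authenticated GET request."""
    missing = [key for key in ("ARGOCD_URL", "ARGOCD_TOKEN") if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required environment variables: {', '.join(missing)}")
    url = env["ARGOCD_URL"].rstrip("/") + "/api/v1/applications"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {env['ARGOCD_TOKEN']}"}
    )
```

To run the check for real, pass `os.environ` and hand the result to `urllib.request.urlopen(request, timeout=10)`. An HTTP error at this stage usually points at the token or the URL rather than at the MCP server.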

Security Model

We don't just check permissions. We make it hard to do the wrong thing.

| Layer | Environment Variable | Default | What It Does |
|-------|----------------------|---------|--------------|
| Read-only Mode | MCP_READ_ONLY | true | Blocks ALL write operations. You can look, but you cannot touch. |
| Non-destructive Mode | MCP_DISABLE_DESTRUCTIVE | true | Blocks delete/prune even if writes are enabled. Destructive operations require MCP_DISABLE_DESTRUCTIVE=false AND MCP_READ_ONLY=false. |
| Single-cluster Mode | MCP_SINGLE_CLUSTER | false | Restricts operations to the default cluster. For when multi-cluster access is too scary. |
| Audit Logging | MCP_AUDIT_LOG | (disabled) | Logs every operation to a file. For when you need to know who did what. |
| Secret Masking | MCP_MASK_SECRETS | true | Redacts tokens, passwords, and API keys from output. Leave it on unless you're debugging. |
| Rate Limiting | MCP_RATE_LIMIT_CALLS | 100 | Max API calls per rate-limit window (60 seconds by default, see MCP_RATE_LIMIT_WINDOW). Prevents runaway loops from eating your ArgoCD API. |
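
To make the rate-limiting layer concrete, here is a minimal sliding-window limiter using the documented defaults (100 calls per 60 seconds). It is a sketch under assumptions: the class name and the injectable `clock` parameter are made up for this example, and the server's actual limiter may use a different algorithm.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds-long window."""

    def __init__(self, max_calls: int = 100, window_seconds: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock  # injectable so tests do not have to sleep
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True
```

A caller would check `limiter.allow()` before each ArgoCD API request and return a "slow down" error to the agent when it comes back `False`.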

Enabling Write Operations (Carefully)

# Enable writes (still blocks destructive operations)
export MCP_READ_ONLY=false

# Enable destructive operations (delete, prune) - DANGER ZONE
export MCP_DISABLE_DESTRUCTIVE=false

For the full security model deep-dive, see docs/SECURITY.md.
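
For a feel of what the secret-masking layer has to do, the sketch below redacts values in `key=value` and `key: value` pairs whose key looks sensitive. The regular expression and the `mask_secrets` name are assumptions for illustration; the real masking rules live in the server and are likely broader (quoted JSON values, for instance, are not covered here).

```python
import re

# Keys containing any of these words are treated as sensitive (assumed list).
_SECRET_PATTERN = re.compile(
    r"(?i)([\w.-]*(?:token|password|passwd|secret|api[_-]?key)[\w.-]*)(\s*[:=]\s*)(\S+)"
)


def mask_secrets(text: str) -> str:
    """Replace the value of every sensitive-looking key with ***."""
    return _SECRET_PATTERN.sub(lambda m: f"{m.group(1)}{m.group(2)}***", text)
```

For example, `mask_secrets("ARGOCD_TOKEN=abc123")` keeps the key and hides the value.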


Tool Reference

Tier 1: Essential Read Operations (Always Available)

| Tool | What It Does |
|------|--------------|
| list_applications | List apps with filtering by project, health, or sync status. The "show me what's on fire" tool. |
| get_application | Get detailed app info: source, destination, status. The deep dive. |
| get_application_status | Quick health/sync check. Fast and cheap. |
| get_application_diff | Preview what would change on sync. Look before you leap. |
| get_application_history | View deployment history with commits. "What changed and when?" |
| diagnose_sync_failure | AI-powered troubleshooting. Aggregates logs, events, status into actionable analysis. |
| get_application_logs | Get pod logs for debugging. Filter by pod, container, and time range. |
| list_clusters | List registered clusters with connection status. |
| list_projects | List ArgoCD projects. |

Tier 2: Write Operations (Require MCP_READ_ONLY=false)

| Tool | What It Does |
|------|--------------|
| sync_application | Sync with dry-run default. Set dry_run=false to actually apply. |
| refresh_application | Force manifest refresh from Git. "Did you push? Let me check again." |
| rollback_application | Rollback to a previous deployment. Dry-run by default. |
| terminate_sync | Stop a running sync operation. For when syncs get stuck. |

Tier 3: Destructive Operations (Require explicit confirmation)

| Tool | What It Does |
|------|--------------|
| delete_application | Delete application. Requires confirm=true AND confirm_name matching the app name. We make you type it twice for a reason. |
| sync_application_with_prune | Sync and DELETE cluster resources missing from Git. Dry-run by default. Live runs require confirm=true AND confirm_name matching the app name. |

For detailed parameter documentation, see docs/TOOLS.md.
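
The Tier 3 confirmation rule (confirm=true AND confirm_name matching the app name) boils down to a small guard. The function below is a hypothetical sketch of that check; the name, the exception type, and the messages are assumptions, not the server's actual implementation.

```python
def check_destructive_confirmation(name: str, confirm: bool = False, confirm_name: str = "") -> None:
    """Raise PermissionError unless the caller confirmed and retyped the exact name."""
    if not confirm:
        raise PermissionError(
            f"Refusing to act on '{name}': set confirm=true to proceed."
        )
    if confirm_name != name:
        raise PermissionError(
            f"Refusing to act on '{name}': confirm_name '{confirm_name}' does not match."
        )
```

A destructive handler would call this first and only touch the ArgoCD API if it returns without raising.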


Example Conversations

"What applications are failing in production?"

list_applications(health_status="Degraded", project="prod")

"Why is my-app not syncing?"

diagnose_sync_failure(name="my-app")

"Deploy the latest changes to staging"

sync_application(name="my-app", dry_run=false)

"Show me what would change if I sync"

get_application_diff(name="my-app")

"What was deployed last week?"

get_application_history(name="my-app", limit=20)
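
On the wire, each of these calls is an MCP `tools/call` request carried over JSON-RPC 2.0. The sketch below builds such a message by hand. It assumes the server accepts the arguments as flat top-level keys exactly as written above; the real parameter nesting may differ, so check docs/TOOLS.md. `tools_call_request` is a hypothetical helper, and in practice an MCP client library does this for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requires a unique id per request


def tools_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


if __name__ == "__main__":
    print(tools_call_request("list_applications", {"health_status": "Degraded", "project": "prod"}))
```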

Configuration Reference

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| ARGOCD_URL | ArgoCD server URL | (required) |
| ARGOCD_TOKEN | ArgoCD API token | (required) |
| ARGOCD_INSECURE | Skip TLS verification (dev only!) | false |
| MCP_READ_ONLY | Block write operations | true |
| MCP_DISABLE_DESTRUCTIVE | Block delete/prune | true |
| MCP_SINGLE_CLUSTER | Restrict to default cluster | false |
| MCP_AUDIT_LOG | Path to audit log file | (disabled) |
| MCP_MASK_SECRETS | Redact tokens, passwords, and API keys from output | true |
| MCP_RATE_LIMIT_CALLS | Max API calls per window | 100 |
| MCP_RATE_LIMIT_WINDOW | Rate limit window (seconds) | 60 |
| ARGOCD_MCP_LOG_LEVEL | Logging level | INFO |
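
To show how these variables and defaults fit together, here is a standard-library sketch that loads them into one object. The project itself uses pydantic-settings in config.py; this `Settings` dataclass and its `_as_bool` parsing rules are stand-ins written for this example, so treat the field names and the accepted boolean spellings as assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Assumed parsing rule: common truthy spellings, case-insensitive.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    argocd_url: str
    argocd_token: str
    argocd_insecure: bool = False
    read_only: bool = True
    disable_destructive: bool = True
    single_cluster: bool = False
    audit_log: Optional[str] = None  # None means audit logging is disabled
    mask_secrets: bool = True        # MCP_MASK_SECRETS, from the Security Model table
    rate_limit_calls: int = 100
    rate_limit_window: int = 60
    log_level: str = "INFO"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        return cls(
            argocd_url=env["ARGOCD_URL"],      # KeyError if missing: required
            argocd_token=env["ARGOCD_TOKEN"],  # KeyError if missing: required
            argocd_insecure=_as_bool(env.get("ARGOCD_INSECURE", "false")),
            read_only=_as_bool(env.get("MCP_READ_ONLY", "true")),
            disable_destructive=_as_bool(env.get("MCP_DISABLE_DESTRUCTIVE", "true")),
            single_cluster=_as_bool(env.get("MCP_SINGLE_CLUSTER", "false")),
            audit_log=env.get("MCP_AUDIT_LOG") or None,
            mask_secrets=_as_bool(env.get("MCP_MASK_SECRETS", "true")),
            rate_limit_calls=int(env.get("MCP_RATE_LIMIT_CALLS", "100")),
            rate_limit_window=int(env.get("MCP_RATE_LIMIT_WINDOW", "60")),
            log_level=env.get("ARGOCD_MCP_LOG_LEVEL", "INFO"),
        )
```

Note that every safety flag defaults to its restrictive value, so an empty environment (beyond the two required variables) yields a read-only, non-destructive server.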

Multi-Instance Configuration

For managing multiple ArgoCD instances (multi-cluster, multi-environment):

# Primary instance
export ARGOCD_URL=https://argocd-prod.example.com
export ARGOCD_TOKEN=prod-token

# Additional instances can be configured via the multi-env example.
# See examples/claude-desktop-multi-env.json

Development

Prerequisites

  • Python 3.11, 3.12, 3.13, or 3.14
  • uv (recommended) or pip
  • Docker (for container builds)
  • Kind 0.32+ (for local Kubernetes testing)

Setup

# Install dependencies
uv sync --dev

# Run tests
uv run pytest

# Run linting
uv run ruff check src tests
uv run mypy src

# Build Docker image
docker build -t argocd-mcp-server .

Testing with Kind

Important: Kubernetes 1.36 removed cgroup v1 support entirely — a node on a cgroup v1 host will not start. (cgroup v1 had been in maintenance mode since 1.31; 1.35 was the last release to support it.) Check your cgroup version:

docker info | grep "Cgroup Version"

  • Cgroup Version: 2 - Use Kubernetes 1.36 (default in Kind 0.32+)
  • Cgroup Version: 1 - Pin Kubernetes 1.35.x or earlier, or upgrade Docker/WSL2 to cgroup v2

# Auto-detect cgroup version and create cluster
./scripts/setup-test-cluster.sh

# Or manually with specific version:
# For cgroups v2 (recommended):
kind create cluster --name argocd-mcp-test --image kindest/node:v1.36.1

# For cgroups v1 hosts (last supported release):
kind create cluster --name argocd-mcp-test --image kindest/node:v1.35.0

# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for ArgoCD to be ready
kubectl wait --for=condition=available --timeout=300s deployment/argocd-server -n argocd

# Get ArgoCD admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

# Port forward
kubectl port-forward svc/argocd-server -n argocd 8080:443
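
The cgroup check above is easy to script if you want to pick the node image automatically. The sketch below is not the contents of scripts/setup-test-cluster.sh; `pick_node_image` is a hypothetical helper that parses `docker info` output and returns one of the two image tags used in the commands above.

```python
import re


def pick_node_image(docker_info: str) -> str:
    """Choose a Kind node image from the 'Cgroup Version' line of docker info."""
    match = re.search(r"Cgroup Version:\s*(\d+)", docker_info)
    if match is None:
        raise ValueError("Could not find 'Cgroup Version' in docker info output")
    if match.group(1) == "2":
        return "kindest/node:v1.36.1"
    # cgroup v1 hosts: pin the last release this README lists as supporting them.
    return "kindest/node:v1.35.0"
```

To feed it real data, capture the output with `subprocess.run(["docker", "info"], capture_output=True, text=True, check=True).stdout` and pass the result to `kind create cluster --image`.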

Architecture

argocd-mcp-server/
├── src/argocd_mcp/
│   ├── server.py           # Entrypoint: FastMCP instance, lifespan, ServerContext, registration
│   ├── config.py           # Configuration management (pydantic-settings)
│   ├── tools/
│   │   ├── read.py         # Tier-1 read-only handlers
│   │   ├── write.py        # Tier-2 write handlers (require MCP_READ_ONLY=false)
│   │   ├── destructive.py  # Tier-3 destructive handlers (require confirmation)
│   │   ├── params.py       # Pydantic parameter models for every tool
│   │   └── _safety.py      # Shared destination-cluster guard
│   ├── resources/
│   │   └── applications.py # MCP resources: argocd://instances, argocd://security
│   └── utils/
│       ├── client.py       # ArgoCD API client with retry logic and secret masking
│       ├── safety.py       # Confirmation patterns, rate limiting
│       └── logging.py      # Structured logging, audit trail
├── tests/
│   ├── unit/               # Unit tests
│   └── integration/        # Integration tests (Kind cluster)
├── docs/
│   ├── TOOLS.md            # Detailed tool documentation
│   └── SECURITY.md         # Security model deep-dive
├── examples/               # Example configurations
└── Dockerfile              # Multi-stage container build

Contributing

See CONTRIBUTING.md for development setup and guidelines.


License

Apache 2.0 - See LICENSE for details.


Acknowledgments

Built on the shoulders of:


Built by SREs, for SREs. Because production deserves better than "LGTM, ship it."
