Prometheus MCP Server

Enables AI assistants to query Prometheus metrics, monitor alerts, and analyze system health through read-only access to your Prometheus server with built-in query safety and optional AI-powered metric analysis.

prometheus-mcp

A Model Context Protocol (MCP) server for Prometheus integration. Give your AI assistant eyes on your metrics and alerts.

Status: Planning
Author: Claude (claude@arktechnwa.com) + Meldrey
License: MIT
Organization: ArktechNWA


Why?

Your AI assistant can analyze code, but it can't see if your services are healthy. It can suggest optimizations, but can't see the actual latency metrics. It's blind to the alerts firing at 3am.

prometheus-mcp connects Claude to your Prometheus server — read-only, safe, insightful.


Philosophy

  1. Read-only by design — Prometheus queries don't mutate state
  2. Query safety — Timeout expensive queries, limit cardinality
  3. Never hang — PromQL can be expensive, always timeout
  4. Structured output — Metrics + human summaries
  5. Fallback AI — Haiku for anomaly detection and query help

Features

Perception (Read)

  • Instant queries (current values)
  • Range queries (over time)
  • Alert status and history
  • Target health
  • Recording rules and alerts
  • Label discovery
  • Metric metadata

Analysis (AI-Assisted)

  • "Is this metric normal?"
  • "What caused this spike?"
  • "Suggest a query for X"
  • Anomaly detection

Permission Model

Prometheus is inherently read-only for queries. Permissions focus on:

Level    Description                Default
query    Run PromQL queries         ON
alerts   View alert status          ON
admin    View config, reload rules  OFF

Query Safety

{
  "query_limits": {
    "max_duration": "30s",
    "max_resolution": "10000",
    "max_series": 1000,
    "blocked_metrics": [
      "__.*",
      "secret_.*"
    ]
  }
}

Safety features:

  • Query timeout enforcement
  • Cardinality limits
  • Metric blacklist patterns
  • Rate limiting
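As a sketch of how the blocked_metrics patterns could be enforced — the helper names and the identifier-extraction heuristic are illustrative, not the server's actual code:

```typescript
// Blocked patterns from the config above, anchored so "secret_.*"
// matches whole metric names only.
const blockedPatterns = ["__.*", "secret_.*"].map((p) => new RegExp(`^${p}$`));

// Rough heuristic: pull identifier-shaped tokens out of a PromQL
// expression and treat non-keywords as candidate metric names.
function metricNames(query: string): string[] {
  const ids = query.match(/[a-zA-Z_:][a-zA-Z0-9_:]*/g) ?? [];
  const keywords = new Set(["rate", "sum", "by", "on", "and", "or", "unless"]);
  return ids.filter((id) => !keywords.has(id));
}

// Reject a query if any referenced name matches a blocked pattern.
function isBlocked(query: string): boolean {
  return metricNames(query).some((name) =>
    blockedPatterns.some((re) => re.test(name)),
  );
}
```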

Authentication

{
  "prometheus": {
    "url": "http://localhost:9090",
    "auth": {
      "type": "none" | "basic" | "bearer",
      "username_env": "PROM_USER",
      "password_env": "PROM_PASS",
      "token_env": "PROM_TOKEN"
    }
  }
}
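One plausible way the auth block translates into HTTP headers. Field names mirror the config above; the env-var indirection keeps credentials out of the config file itself. This is a sketch under those assumptions, not the shipped implementation:

```typescript
type AuthConfig = {
  type: "none" | "basic" | "bearer";
  username_env?: string;
  password_env?: string;
  token_env?: string;
};

// Resolve credentials from the environment and build the right
// Authorization header. btoa is a global in Node 18+.
function authHeaders(
  auth: AuthConfig,
  env: Record<string, string | undefined>,
): Record<string, string> {
  switch (auth.type) {
    case "basic": {
      const user = env[auth.username_env ?? ""] ?? "";
      const pass = env[auth.password_env ?? ""] ?? "";
      return { Authorization: `Basic ${btoa(`${user}:${pass}`)}` };
    }
    case "bearer":
      return { Authorization: `Bearer ${env[auth.token_env ?? ""] ?? ""}` };
    default:
      return {}; // "none": no auth header at all
  }
}
```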

Tools

Queries

prom_query

Execute an instant query (current values).

prom_query({
  query: string,            // PromQL expression
  time?: string             // evaluation time (default: now)
})

Returns:

{
  "query": "up{job=\"api\"}",
  "result_type": "vector",
  "results": [
    {
      "metric": {"job": "api", "instance": "api-1:8080"},
      "value": 1,
      "timestamp": "2025-12-29T10:30:00Z"
    }
  ],
  "summary": "3 of 3 api instances are up"
}
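Under the hood, prom_query presumably wraps Prometheus's standard instant-query endpoint (/api/v1/query) and reshapes the response into the form above. A minimal sketch, with the reshaping split out as a pure function; the helper names are illustrative:

```typescript
// Reshape a raw /api/v1/query response body into the tool's output.
// Prometheus returns each sample as [unix_seconds, "string value"].
function reshapeInstant(query: string, body: any) {
  if (body.status !== "success") throw new Error(body.error ?? "query failed");
  return {
    query,
    result_type: body.data.resultType, // "vector" | "scalar" | ...
    results: body.data.result.map((r: any) => ({
      metric: r.metric,
      value: Number(r.value[1]),
      timestamp: new Date(r.value[0] * 1000).toISOString(),
    })),
  };
}

// Hit the instant-query API; AbortSignal.timeout (Node 18+) enforces
// the never-hang rule client-side.
async function instantQuery(baseUrl: string, query: string, time?: string) {
  const params = new URLSearchParams({ query });
  if (time) params.set("time", time);
  const res = await fetch(`${baseUrl}/api/v1/query?${params}`, {
    signal: AbortSignal.timeout(30_000),
  });
  return reshapeInstant(query, await res.json());
}
```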

prom_query_range

Execute a range query (over time).

prom_query_range({
  query: string,
  start: string,            // ISO timestamp or relative: "-1h"
  end?: string,             // default: now
  step?: string             // resolution: "15s", "1m", "5m"
})

Returns:

{
  "query": "rate(http_requests_total[5m])",
  "result_type": "matrix",
  "results": [
    {
      "metric": {"handler": "/api/users"},
      "values": [[1735470600, "123.45"], ...],
      "stats": {
        "min": 100.2,
        "max": 456.7,
        "avg": 234.5,
        "current": 345.6
      }
    }
  ],
  "summary": "Request rate ranged from 100-457 req/s over the last hour, currently 346 req/s"
}
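Two details of prom_query_range are worth sketching: parsing relative start times like "-1h", and computing the per-series stats block. Both are assumptions about the intended behavior, not the shipped code:

```typescript
// Resolve "-1h" / "-30m" / "-7d" relative to now; anything else is
// treated as an ISO timestamp.
function resolveTime(t: string, now: Date = new Date()): Date {
  const m = t.match(/^-(\d+)([smhd])$/);
  if (!m) return new Date(t);
  const unitMs = { s: 1e3, m: 60e3, h: 3600e3, d: 86400e3 }[m[2] as "s" | "m" | "h" | "d"];
  return new Date(now.getTime() - Number(m[1]) * unitMs);
}

// Stats over one matrix series; Prometheus returns each point as
// [unix_seconds, "string value"], newest last.
function seriesStats(values: [number, string][]) {
  const nums = values.map(([, v]) => Number(v));
  return {
    min: Math.min(...nums),
    max: Math.max(...nums),
    avg: nums.reduce((a, b) => a + b, 0) / nums.length,
    current: nums[nums.length - 1],
  };
}
```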

prom_series

Find series matching label selectors.

prom_series({
  match: string[],          // label matchers
  start?: string,
  end?: string,
  limit?: number
})

prom_labels

Get label names or values.

prom_labels({
  label?: string,           // get values for this label (omit for label names)
  match?: string[],         // filter by series
  limit?: number
})

Alerts

prom_alerts

Get current alert status.

prom_alerts({
  state?: "firing" | "pending" | "inactive",
  filter?: string           // alert name pattern
})

Returns:

{
  "alerts": [
    {
      "name": "HighErrorRate",
      "state": "firing",
      "severity": "critical",
      "summary": "Error rate > 5% for api service",
      "started_at": "2025-12-29T10:15:00Z",
      "duration": "15m",
      "labels": {"job": "api", "severity": "critical"},
      "annotations": {"summary": "..."}
    }
  ],
  "summary": "1 critical, 0 warning alerts firing"
}

prom_rules

Get alerting and recording rules.

prom_rules({
  type?: "alert" | "record",
  filter?: string
})

Targets

prom_targets

Get scrape target health.

prom_targets({
  state?: "active" | "dropped",
  job?: string
})

Returns:

{
  "targets": [
    {
      "job": "api",
      "instance": "api-1:8080",
      "health": "up",
      "last_scrape": "2025-12-29T10:29:45Z",
      "scrape_duration": "0.023s",
      "error": null
    }
  ],
  "summary": "12 of 12 targets healthy"
}
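The one-line summary can be derived directly from the target list; a trivial sketch (the function name is illustrative):

```typescript
// Count healthy targets and render the "N of M targets healthy" line.
function targetSummary(targets: { health: string }[]): string {
  const up = targets.filter((t) => t.health === "up").length;
  return `${up} of ${targets.length} targets healthy`;
}
```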

Discovery

prom_metadata

Get metric metadata (help, type, unit).

prom_metadata({
  metric?: string,          // specific metric (omit for all)
  limit?: number
})

Analysis

prom_analyze

AI-powered metric analysis.

prom_analyze({
  query: string,
  question?: string,        // "Is this normal?", "What caused the spike?"
  use_ai?: boolean
})

Returns:

{
  "query": "rate(http_errors_total[5m])",
  "data_summary": {
    "current": 12.3,
    "1h_ago": 2.1,
    "change": "+486%"
  },
  "synthesis": {
    "analysis": "Error rate spiked 5x in the last hour. The spike correlates with deployment at 10:15. Errors are concentrated on /api/checkout endpoint.",
    "suggested_queries": [
      "rate(http_errors_total{handler=\"/api/checkout\"}[5m])",
      "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
    ],
    "confidence": "high"
  }
}
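The "change" field in data_summary is a plain percent delta; a sketch of the arithmetic, using the example's own numbers (2.1 → 12.3 is +486%):

```typescript
// Percent change from `before` to `after`, signed and rounded,
// e.g. 2.1 → 12.3 yields "+486%".
function percentChange(before: number, after: number): string {
  const pct = Math.round(((after - before) / before) * 100);
  return `${pct >= 0 ? "+" : ""}${pct}%`;
}
```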

prom_suggest_query

Get PromQL query suggestions.

prom_suggest_query({
  intent: string            // "show me api latency p99"
})

NEVERHANG Architecture

PromQL queries can be expensive. High-cardinality queries can OOM Prometheus.

Query Timeouts

  • Default: 30s
  • Configurable per-query
  • Server-side timeout parameter

Cardinality Protection

  • Limit series returned
  • Block known expensive patterns
  • Warn on high-cardinality queries

Circuit Breaker

  • 3 timeouts in 60s → 5 minute cooldown
  • Tracks Prometheus health
  • Graceful degradation

{
  "neverhang": {
    "query_timeout": 30000,
    "max_series": 1000,
    "circuit_breaker": {
      "failures": 3,
      "window": 60000,
      "cooldown": 300000
    }
  }
}
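The breaker described above fits in a few lines; class and method names here are illustrative, and the defaults match the config:

```typescript
// Circuit breaker: 3 timeouts inside a 60 s window open the circuit
// for a 5 minute cooldown, during which queries are refused outright.
class CircuitBreaker {
  private failures: number[] = []; // timestamps of recent timeouts (ms)
  private openUntil = 0;

  constructor(
    private maxFailures = 3,
    private windowMs = 60_000,
    private cooldownMs = 300_000,
  ) {}

  // May we issue a query right now?
  allowed(now = Date.now()): boolean {
    return now >= this.openUntil;
  }

  // Record a timeout; trip the breaker if the window fills up.
  recordTimeout(now = Date.now()): void {
    this.failures = this.failures.filter((t) => now - t < this.windowMs);
    this.failures.push(now);
    if (this.failures.length >= this.maxFailures) {
      this.openUntil = now + this.cooldownMs;
      this.failures = [];
    }
  }
}
```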

Fallback AI

Optionally uses Claude Haiku for metric analysis.

{
  "fallback": {
    "enabled": true,
    "model": "claude-haiku-4-5",
    "api_key_env": "PROM_MCP_FALLBACK_KEY",
    "max_tokens": 500
  }
}

When used:

  • prom_analyze with questions
  • prom_suggest_query for natural language
  • Anomaly detection

Configuration

~/.config/prometheus-mcp/config.json:

{
  "prometheus": {
    "url": "http://localhost:9090",
    "auth": {
      "type": "none"
    }
  },
  "permissions": {
    "query": true,
    "alerts": true,
    "admin": false
  },
  "query_limits": {
    "max_duration": "30s",
    "max_series": 1000
  },
  "fallback": {
    "enabled": false
  }
}

Claude Code Integration

{
  "mcpServers": {
    "prometheus": {
      "command": "prometheus-mcp",
      "args": ["--config", "/path/to/config.json"]
    }
  }
}

Installation

npm install -g @arktechnwa/prometheus-mcp

Requirements

  • Node.js 18+
  • Prometheus server (2.x+)
  • Optional: Anthropic API key for fallback AI

Credits

Created by Claude (claude@arktechnwa.com) in collaboration with Meldrey. Part of the ArktechNWA MCP Toolshed.
