Overwatch MCP

An MCP server that enables querying logs and metrics from Graylog, Prometheus, and InfluxDB 2.x. It provides tools for executing Lucene log searches, PromQL queries, and Flux queries directly within MCP-compatible clients.

Requires Python 3.11+. License: MIT.

MCP server for querying Graylog, Prometheus, and InfluxDB 2.x from Claude Desktop.

Tools

| Tool | What it does |
| --- | --- |
| graylog_search | Search logs (Lucene syntax) |
| graylog_fields | List log fields |
| prometheus_query | Instant PromQL query |
| prometheus_query_range | Range PromQL query |
| prometheus_metrics | List metrics |
| influxdb_query | Flux query (bucket allowlisted) |

Quick Start

One-Line Setup (Docker)

curl -fsSL https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/setup.sh | bash
cd Overwatch_MCP
# Edit .env and config.yaml with your values
docker compose up -d

Manual Setup (Docker)

# Download compose files
mkdir -p Overwatch_MCP && cd Overwatch_MCP
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/docker-compose.yml
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/.env.example
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/config.example.yaml

# Create config from templates
cp .env.example .env
cp config.example.yaml config.yaml

# Edit .env with your credentials
# Edit config.yaml if needed (adjust allowed_buckets, limits, etc.)

# Run
docker compose up -d

Local Install

pip install -e .
cp .env.example .env
cp config/config.example.yaml config/config.yaml
# Edit both files with your values
python -m overwatch_mcp

Claude Desktop Config

Docker

~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "overwatch": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "/path/to/config:/app/config:ro",
        "--env-file", "/path/to/.env",
        "ghcr.io/malindarathnayake/overwatch-mcp:latest"
      ]
    }
  }
}

Local Python

{
  "mcpServers": {
    "overwatch": {
      "command": "python",
      "args": ["-m", "overwatch_mcp"],
      "env": {
        "GRAYLOG_URL": "https://graylog.internal:9000/api",
        "GRAYLOG_TOKEN": "your-token",
        "PROMETHEUS_URL": "http://prometheus.internal:9090",
        "INFLUXDB_URL": "https://influxdb.internal:8086",
        "INFLUXDB_TOKEN": "your-token",
        "INFLUXDB_ORG": "your-org"
      }
    }
  }
}

Windows PowerShell Setup

One-shot script to configure Claude Desktop on Windows:

# Stop Claude if running
Get-Process -Name "Claude*" -ErrorAction SilentlyContinue | Stop-Process -Force

$config = @'
{
  "mcpServers": {
    "overwatch": {
      "command": "C:/Users/<USERNAME>/AppData/Local/Microsoft/WindowsApps/python3.13.exe",
      "args": ["-m", "overwatch_mcp", "--config", "C:/path/to/Overwatch-mcp/compose/config.yaml"],
      "env": {
        "GRAYLOG_URL": "https://your-graylog-url",
        "GRAYLOG_TOKEN": "<YOUR_GRAYLOG_TOKEN>",
        "PROMETHEUS_URL": "http://your-prometheus-url:9090",
        "INFLUXDB_URL": "https://your-influxdb-url",
        "INFLUXDB_TOKEN": "<YOUR_INFLUXDB_TOKEN>",
        "INFLUXDB_ORG": "<YOUR_INFLUXDB_ORG>",
        "LOG_LEVEL": "debug",
        "LOG_FILE": "C:/path/to/Overwatch-mcp/overwatch.log"
      }
    }
  }
}
'@
[System.IO.File]::WriteAllText("$env:APPDATA\Claude\claude_desktop_config.json", $config)

# Install from source (run from repo root)
cd C:\path\to\Overwatch-mcp
pip install -e .

Note: Replace <USERNAME>, <YOUR_GRAYLOG_TOKEN>, <YOUR_INFLUXDB_TOKEN>, <YOUR_INFLUXDB_ORG>, and paths with your actual values.
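
The PowerShell script above overwrites claude_desktop_config.json, so any other MCP servers already listed in that file would be lost. If that matters, the entry can be merged in instead. The following is a minimal sketch, not part of this repository: merge_mcp_server is a hypothetical helper name, and the entry shown in the usage note is a placeholder. Stop Claude Desktop before running it, as the script above does.

```python
import json
from pathlib import Path


def merge_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other keys."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            # Reuse whatever is already configured
            config = json.loads(text)
    # Only the named server is touched; other servers stay as they were
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

On Windows this could be called as merge_mcp_server(Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json", "overwatch", entry), where entry is the "overwatch" object from the JSON above.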

Configuration

config.yaml

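The config uses ${ENV_VAR} substitution: values are read from the environment at runtime. The ${ENV_VAR:-default} form supplies a fallback when the variable is unset, as in known_applications_file below.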

server:
  log_level: "info"

datasources:
  graylog:
    enabled: true
    url: "${GRAYLOG_URL}"
    token: "${GRAYLOG_TOKEN}"
    timeout_seconds: 30
    max_time_range_hours: 24
    max_results: 1000
    # Production environments to filter on (auto-builds from known_applications.json)
    production_environments:
      - "prod"
      - "production"
    # Known apps file - auto-builds env filter from discovered data
    known_applications_file: "${GRAYLOG_KNOWN_APPS_FILE:-}"

  prometheus:
    enabled: true
    url: "${PROMETHEUS_URL}"
    timeout_seconds: 30
    max_range_hours: 168

  influxdb:
    enabled: true
    url: "${INFLUXDB_URL}"
    token: "${INFLUXDB_TOKEN}"
    org: "${INFLUXDB_ORG}"
    timeout_seconds: 60
    allowed_buckets:
      - "telegraf"
      - "app_metrics"

cache:
  enabled: true
  default_ttl_seconds: 60

Disable a datasource by setting enabled: false. If some datasources fail their health checks, the server keeps running in degraded mode with the ones that remain.
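
To make the substitution rule concrete, here is a minimal sketch of how ${ENV_VAR} and ${ENV_VAR:-default} placeholders could be expanded. It is an illustration under assumptions, not the loader in config.py: substitute_env is a hypothetical name, and the behaviour for a missing variable without a default (raising an error) is a guess.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute_env(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${NAME} and ${NAME:-default} placeholders in text."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is not None:
            # Shell-style ":-": fall back when unset or empty
            return value if value else default
        if value is None:
            # Assumption: a missing variable with no default is an error
            raise KeyError(f"environment variable {name} is not set")
        return value

    return _VAR.sub(replace, text)
```

With GRAYLOG_KNOWN_APPS_FILE unset, the placeholder "${GRAYLOG_KNOWN_APPS_FILE:-}" expands to an empty string under this sketch.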

Tool Parameters

graylog_search

{
  "query": "level:ERROR AND service:api",
  "from_time": "-2h",
  "to_time": "now",
  "limit": 100,
  "fields": ["timestamp", "message", "level"]
}

Time formats: ISO 8601 (2025-01-27T10:00:00Z), relative (-1h, -30m), or the literal now.
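
The three time formats can be resolved to absolute timestamps before a query is sent. The sketch below shows one way to do that and to enforce a limit such as max_time_range_hours. The function names are hypothetical, and the s and d units are an assumption: only h and m appear in the examples above.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

_RELATIVE = re.compile(r"^-(\d+)([smhd])$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_time(value: str, now: Optional[datetime] = None) -> datetime:
    """Resolve "now", a relative offset like "-2h", or an ISO 8601 string."""
    now = now or datetime.now(timezone.utc)
    if value == "now":
        return now
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # Rewrite a trailing "Z" so older fromisoformat() versions accept it
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_range(start: datetime, end: datetime, max_hours: int) -> None:
    """Raise if the window is wider than the configured maximum."""
    if end - start > timedelta(hours=max_hours):
        raise ValueError("TIME_RANGE_EXCEEDED")
```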

graylog_fields

{
  "pattern": "http_.*",
  "limit": 100
}

prometheus_query

{
  "query": "rate(http_requests_total[5m])",
  "time": "-1h"
}

prometheus_query_range

{
  "query": "up",
  "start": "-6h",
  "end": "now",
  "step": "1m"
}

The step is calculated automatically if omitted.
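
This README does not say how the step is derived. One common heuristic, shown here purely as an assumption, is to aim for a bounded number of samples per series so that long ranges stay well under the 11,000-point limit Prometheus enforces on range queries. The function name, the target of 250 points, and the 15-second floor are all illustrative choices.

```python
import math


def auto_step_seconds(start_ts: float, end_ts: float,
                      target_points: int = 250, min_step: int = 15) -> int:
    """Pick a step so the range yields roughly target_points samples."""
    span = max(end_ts - start_ts, 0)
    # Round up so the sample count never exceeds the target
    return max(min_step, math.ceil(span / target_points))
```

For the 6-hour window in the example above (21,600 seconds), this sketch yields a step of 87 seconds.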

prometheus_metrics

{
  "pattern": "http_.*",
  "limit": 100
}
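
Both graylog_fields and prometheus_metrics take a regex pattern and a limit. The sketch below shows how a name list could be filtered that way, with a bad regex surfacing as INVALID_PATTERN. It is illustrative only: filter_names is a hypothetical helper, and anchoring the match at the start of the name is an assumption about the matching semantics.

```python
import re
from typing import Iterable, List, Optional


def filter_names(names: Iterable[str], pattern: Optional[str] = None,
                 limit: int = 100) -> List[str]:
    """Return up to limit names matching pattern, in input order."""
    if pattern is None:
        return list(names)[:limit]
    try:
        regex = re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"INVALID_PATTERN: {exc}") from exc
    # Assumption: the pattern is anchored at the start of the name
    return [name for name in names if regex.match(name)][:limit]
```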

influxdb_query

{
  "query": "from(bucket: \"telegraf\") |> range(start: -1h) |> filter(fn: (r) => r._measurement == \"cpu\")",
  "bucket": "telegraf"
}

The bucket must be listed under allowed_buckets in config.yaml.
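
Because a Flux query names its own bucket inside from(bucket: ...), an allowlist check that looks only at the bucket parameter could be sidestepped by the query text. The sketch below checks both. It is a rough illustration, not the server's implementation: the helper names are hypothetical, and the regex is not a Flux parser, so it would miss a bucket passed through a variable.

```python
import re
from typing import Iterable, List, Set

# Finds literal bucket names in from(bucket: "name") calls
_FROM_BUCKET = re.compile(r'from\s*\(\s*bucket\s*:\s*"([^"]+)"')


def buckets_in_query(query: str) -> Set[str]:
    """Collect bucket names written literally in the Flux query."""
    return set(_FROM_BUCKET.findall(query))


def denied_buckets(query: str, bucket: str,
                   allowed_buckets: Iterable[str]) -> List[str]:
    """Return the requested buckets that are not on the allowlist."""
    requested = buckets_in_query(query) | {bucket}
    return sorted(requested - set(allowed_buckets))
```

A non-empty result would map to the BUCKET_NOT_ALLOWED error code.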

Error Codes

| Code | Meaning |
| --- | --- |
| DATASOURCE_DISABLED | Datasource disabled in config |
| DATASOURCE_UNAVAILABLE | Failed health check |
| INVALID_QUERY | Bad query syntax |
| INVALID_PATTERN | Bad regex |
| TIME_RANGE_EXCEEDED | Range exceeds max |
| BUCKET_NOT_ALLOWED | Bucket not in allowlist |
| UPSTREAM_TIMEOUT | Request timed out |
| UPSTREAM_CLIENT_ERROR | 4xx from datasource |
| UPSTREAM_SERVER_ERROR | 5xx from datasource |
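
The two upstream codes follow directly from the HTTP status class returned by the datasource. A minimal sketch of that mapping is shown below. The function names and the shape of the error payload are assumptions for illustration, since the response format is not described here.

```python
from typing import Optional


def classify_upstream_status(status: int) -> Optional[str]:
    """Map an upstream HTTP status to an error code, or None if not an error."""
    if 400 <= status < 500:
        return "UPSTREAM_CLIENT_ERROR"
    if 500 <= status < 600:
        return "UPSTREAM_SERVER_ERROR"
    return None


def error_payload(code: str, message: str) -> dict:
    # Hypothetical response shape; the real one may differ
    return {"error": {"code": code, "message": message}}
```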

Application Discovery

Generate a known applications file to speed up lookups:

# Using environment variables
python scripts/discover_applications.py --env

# Or with explicit credentials
python scripts/discover_applications.py \
  --url https://graylog.example.com \
  --token YOUR_TOKEN \
  --hours 24 \
  --environment "environment:prod" \
  --output known_applications.json

Output known_applications.json:

{
  "_metadata": {
    "generated_at": "2025-01-28T10:00:00",
    "identifier_fields_used": ["application", "service", "container_name"]
  },
  "environments": ["prod", "staging", "dev"],
  "applications": [
    {
      "name": "api-gateway",
      "identifier_fields": ["service", "application"],
      "aliases": [],
      "description": "",
      "team": "",
      "enabled": true
    }
  ]
}

Edit the file to:

  • Disable entries you don't need (set enabled: false)
  • Add descriptions and team ownership
  • Add aliases for alternative names

Then set GRAYLOG_KNOWN_APPS_FILE=/path/to/known_applications.json in your environment.
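
As a rough illustration of how the file might be consumed, the sketch below reads the structure shown above, keeps only enabled applications, and builds a Lucene environment filter from production_environments. Both helper names are hypothetical, and the filter format and the environment field name are guesses at what "auto-builds" means in config.yaml, not the server's actual logic.

```python
from typing import Iterable, List


def enabled_applications(data: dict) -> List[str]:
    """Names of applications that are not disabled in the file."""
    return [app["name"] for app in data.get("applications", [])
            if app.get("enabled", True)]


def production_filter(data: dict, production_environments: Iterable[str],
                      field: str = "environment") -> str:
    """Lucene clause covering the production environments seen in the data."""
    known = set(data.get("environments", []))
    envs = [env for env in production_environments if env in known]
    if not envs:
        return ""
    return f"{field}:(" + " OR ".join(envs) + ")"
```

The data argument is the parsed JSON, for example json.load(open(path)) on the file named by GRAYLOG_KNOWN_APPS_FILE.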

Development

# Install with dev deps
pip install -e ".[dev]"

# Tests
pytest tests/ -v

# Coverage
pytest tests/ -v --cov=overwatch_mcp

Project Structure

src/overwatch_mcp/
├── __main__.py        # Entry point
├── server.py          # MCP server
├── config.py          # Config loader
├── cache.py           # TTL cache
├── clients/           # HTTP clients (graylog, prometheus, influxdb)
├── tools/             # MCP tool implementations
└── models/            # Pydantic models

127 tests (89 unit, 38 integration).
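
For readers unfamiliar with the TTL cache listed under cache.py, the sketch below shows the general technique with a default lifetime like cache.default_ttl_seconds. It is a generic illustration, not the module's code: the class name, its methods, and the injectable clock are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed lifetime."""

    def __init__(self, default_ttl_seconds: float = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = default_ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any, ttl: Optional[float] = None) -> None:
        expires = self._clock() + (self._ttl if ttl is None else ttl)
        self._items[key] = (expires, value)

    def get(self, key: Hashable, default: Any = None) -> Any:
        item = self._items.get(key)
        if item is None:
            return default
        expires, value = item
        if self._clock() >= expires:
            # Drop the stale entry on read
            del self._items[key]
            return default
        return value
```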

Usage Guide

See Docs/usage-guide.md for examples of how to ask questions:

  • Finding errors and investigating issues
  • Searching logs with filters and time ranges
  • Querying metrics and trends
  • Investigation workflows and common patterns

Troubleshooting

Server won't start: Check that config/config.yaml exists and that the required environment variables are set.

Datasource unavailable: Verify the URL and check the token's permissions. The server continues with the datasources that are available.

Query errors: Check the syntax (Lucene/PromQL/Flux), verify that the time range is within the configured limits, and for InfluxDB make sure the bucket is in allowed_buckets.

License

MIT
