MCP Prometheus
An MCP server for Prometheus-based monitoring that enables users to query system metrics, resource usage, and PostgreSQL health across multiple environments. It supports both predefined diagnostic checks and custom PromQL execution for comprehensive server monitoring and troubleshooting.
An MCP server for Prometheus-based monitoring.
The entry point is main.py.
Quick Start
cd d:\MCPTools
uv sync
uv run python mcp_prometheus/main.py
Project Structure
mcp_prometheus/
  main.py
  core/
    config.py
    runtime.py
    server.py
    time_utils.py
  domain/
    checks.py
  infra/
    prom_client.py
  tools/
    catalog.py
    checks_runner.py
    promql.py
  utils/
    query_utils.py
    summarize.py
Tools Summary
| Tool | Purpose | Notes |
|---|---|---|
| list_checks | List the registered checks | Returns id, name, description |
| list_environments | List the Prometheus URL for each environment | prod/dev_test/dr |
| list_servers | List servers based on recent up samples | Deduplicated by (instance, job) |
| list_process_groups | List process groups | Based on process_monitoring |
| run_check | Run a single check | Recommended default |
| run_all_checks | Run all checks in parallel | step fixed at 5m |
| run_promql | Run user-supplied PromQL directly | approved=True required |
run_check Input Guide
Required
- check_id
Time range
- Relative: hours, minutes, days
- Absolute: start_time_utc_iso, end_time_utc_iso
- End offset: end_offset_minutes, end_offset_hours, end_offset_days
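The three groups of time parameters above could be combined into a single query window roughly as follows. This is a minimal sketch, not the server's actual logic: the function name resolve_window, the rule that absolute bounds take precedence over relative ones, and the one-hour default span are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def _parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 string and normalize it to UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def resolve_window(
    now: datetime,
    hours: float = 0,
    minutes: float = 0,
    days: float = 0,
    start_time_utc_iso: Optional[str] = None,
    end_time_utc_iso: Optional[str] = None,
    end_offset_minutes: float = 0,
    end_offset_hours: float = 0,
    end_offset_days: float = 0,
) -> Tuple[datetime, datetime]:
    """Return (start, end) in UTC for a check run (hypothetical helper)."""
    if start_time_utc_iso and end_time_utc_iso:
        # Assumption: absolute bounds win over relative ones.
        start = _parse_utc(start_time_utc_iso)
        end = _parse_utc(end_time_utc_iso)
    else:
        # The end offset shifts the end of the window back from "now".
        end = now - timedelta(
            minutes=end_offset_minutes,
            hours=end_offset_hours,
            days=end_offset_days,
        )
        span = timedelta(hours=hours, minutes=minutes, days=days)
        if span <= timedelta(0):
            span = timedelta(hours=1)  # assumed default, not from the README
        start = end - span
    if start >= end:
        raise ValueError("start must be earlier than end")
    return start, end
```

For example, hours=24 with end_offset_hours=1 yields a 24-hour window that ends one hour before now.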
Target filter
- server_name
- instance (e.g. host-or-ip:9100)
Filter rules:
- If both server_name and instance are given, they are combined with AND.
- If only one is given, only that label is applied.
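The filter rules map naturally onto a PromQL label selector, where comma-separated matchers are ANDed together. The sketch below shows one way to build such a selector; build_selector and _escape are hypothetical helpers, and the real implementation in utils/query_utils.py may differ.

```python
from typing import Optional


def _escape(value: str) -> str:
    """Escape characters that would break a double-quoted PromQL string."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def build_selector(
    server_name: Optional[str] = None,
    instance: Optional[str] = None,
) -> str:
    """Build a label selector; matchers inside {} are ANDed by PromQL."""
    matchers = []
    if server_name:
        matchers.append(f'server_name="{_escape(server_name)}"')
    if instance:
        matchers.append(f'instance="{_escape(instance)}"')
    # With no filter at all, return an empty string so the metric is unfiltered.
    return "{" + ",".join(matchers) + "}" if matchers else ""


# Usage: append the selector to a metric name.
query = "up" + build_selector(instance="10.23.12.11:9100")
```

Here query becomes up{instance="10.23.12.11:9100"}; passing both arguments yields a selector with both matchers.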
run_promql Guardrails
- approved=False: returns a confirmation message without executing the query
- approved=True: executes the query
Modes:
- instant=True -> /api/v1/query
- instant=False -> /api/v1/query_range
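The guardrail and mode selection can be pictured as a small planning step that runs before any HTTP call. The following is a sketch under assumptions: plan_promql, the returned dictionary shape, and the message wording are invented for illustration. Only the approved gate and the two Prometheus HTTP API paths come from the description above.

```python
from typing import Any, Dict


def plan_promql(promql: str, approved: bool, instant: bool) -> Dict[str, Any]:
    """Decide whether a query may run and which API path it would use."""
    if not approved:
        # Nothing is sent to Prometheus until the caller confirms.
        return {
            "status": "needs_approval",
            "message": f"Review the query {promql!r} and resend with approved=True.",
        }
    path = "/api/v1/query" if instant else "/api/v1/query_range"
    return {"status": "ready", "path": path, "params": {"query": promql}}
```

A range query would also need start, end, and step parameters, which this sketch leaves out.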
Usage Examples
1) Average CPU for a specific server (last 24 hours)
{
  "check_id": "cpu_avg_pct",
  "hours": 24,
  "instance": "10.23.12.11:9100",
  "environment": "prod"
}
2) Disk usage for a specific server (by mountpoint)
{
  "check_id": "disk_used_pct_by_mount",
  "hours": 24,
  "server_name": "CMS AP #1",
  "environment": "prod"
}
3) Run user-supplied PromQL (instant)
{
  "promql": "up",
  "approved": true,
  "instant": true,
  "environment": "prod"
}
CHECKS Catalog
Source: domain/checks.py (CHECKS)
System / Resource
- cpu_avg_pct: average CPU usage (%) by instance/server_name
- cpu_peak_pct: peak CPU usage (%) over the selected range
- mem_used_pct: memory used ratio (%)
- mem_swap_used_pct: swap used ratio (%)
- load15_avg: 15-minute load average
- cpu_iowait_pct: CPU iowait ratio (%)
Disk / Filesystem
- disk_used_pct_by_mount: filesystem used (%) by mountpoint/device (0-100 scale)
- disk_used_top5_pct: top 5 filesystem usage (%)
- disk_inodes_used_pct: inode usage (%)
- fs_readonly: read-only filesystem indicator (1=readonly)
- disk_io_busy_pct: disk I/O busy ratio (%)
Availability
- up: target liveness (1=up, 0=down)
Network / TCP
- net_in_bytes: inbound throughput (bytes/sec)
- net_out_bytes: outbound throughput (bytes/sec)
- net_errs_per_sec: RX+TX network errors per second
- tcp_retrans_per_sec: TCP retransmitted segments per second
- tcp_established: established TCP connections
- tcp_time_wait: TIME_WAIT TCP sockets
- tcp_inuse: in-use TCP sockets
- tcp_orphan: orphan TCP sockets
Process Monitoring
- proc_cpu_pct: process group CPU usage (%)
- proc_mem_bytes: process group memory usage (bytes)
- proc_count: process group process count
PostgreSQL
- pg_up: PostgreSQL exporter up state (1=up, 0=down)
- pg_qps: PostgreSQL transactions/sec (commit + rollback)
- pg_cache_hit_pct: PostgreSQL buffer cache hit ratio (%)
- pg_active_conn: active PostgreSQL connections
Environment Variables Summary
PROM_ENV_URLS={"prod":"http://...:9090","dev_test":"http://...:9090","dr":"http://...:9090"}
PROM_URL=http://...:9090
PROM_BEARER_TOKEN=
PROM_TIMEOUT_SEC=15
ALERT_WARN_PCT=85
ALERT_CRIT_PCT=95
ALERT_SUSTAIN_MINUTES=5
PROM_MAX_SAMPLES_PER_SERIES=5000
PROM_MAX_PARALLEL_CHECKS=6
Environment selection priority:
1. environment
2. env_hint
3. PROM_URL fallback
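The priority order can be sketched as a lookup against PROM_ENV_URLS followed by the PROM_URL fallback. This is an illustration only: resolve_prom_url is a hypothetical helper, the example URLs are placeholders, and treating an unknown environment name as "fall through to the next option" is an assumption, since the server might instead reject it.

```python
import json
from typing import Mapping, Optional


def resolve_prom_url(
    env: Mapping[str, str],
    environment: Optional[str] = None,
    env_hint: Optional[str] = None,
) -> str:
    """Pick a Prometheus base URL: environment, then env_hint, then PROM_URL."""
    env_urls = json.loads(env.get("PROM_ENV_URLS") or "{}")
    for candidate in (environment, env_hint):
        if candidate and candidate in env_urls:
            return env_urls[candidate]
    fallback = env.get("PROM_URL")
    if fallback:
        return fallback
    raise LookupError("no Prometheus URL configured")


# Placeholder settings; in practice, pass os.environ instead.
settings = {
    "PROM_ENV_URLS": '{"prod": "http://prom-prod.example:9090", "dr": "http://prom-dr.example:9090"}',
    "PROM_URL": "http://prom-default.example:9090",
}
url = resolve_prom_url(settings, environment=None, env_hint="dr")
```

With these placeholder settings, url resolves to the dr entry because no explicit environment was given.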
Operational Tips
- State the % unit explicitly when producing reports.
- For single-server checks, use the instance or server_name filter.
- disk_used_pct_by_mount values are on a 0-100 scale (0.8 = 0.8%).
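To make the unit and scale tips concrete, the sketch below formats a 0-100 value with an explicit % sign and compares it against the ALERT_WARN_PCT and ALERT_CRIT_PCT thresholds. How the server actually applies these thresholds (including ALERT_SUSTAIN_MINUTES) is not documented here, so format_pct, classify_pct, and the inclusive comparison are assumptions.

```python
import os


def format_pct(value: float) -> str:
    """Render a value that is already on a 0-100 scale, with its unit."""
    return f"{value:.1f}%"


def classify_pct(value: float, warn: float, crit: float) -> str:
    """Label a 0-100 value against warning and critical thresholds."""
    if value >= crit:
        return "crit"
    if value >= warn:
        return "warn"
    return "ok"


# Defaults mirror the values shown in the environment variable summary.
warn_pct = float(os.environ.get("ALERT_WARN_PCT", "85"))
crit_pct = float(os.environ.get("ALERT_CRIT_PCT", "95"))

# 0.8 means 0.8%, not 80%, so it must not be multiplied by 100.
line = f"disk_used_pct_by_mount: {format_pct(0.8)} ({classify_pct(0.8, warn_pct, crit_pct)})"
```

Keeping the formatting in one helper makes it harder to rescale a value by accident in one report and not another.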