thermal-mcp-server
A physics engine for liquid-cooled GPU systems, exposed as an AI-callable MCP server. Enables thermal analysis, coolant comparison, flow optimization, and rack-level sizing via natural language queries.
A physics engine for liquid-cooled GPU systems, exposed as an AI-callable MCP server. Ask Claude to size a cooling system for an H100 cluster, optimize cold plate flow rates, or compare water versus glycol — and get first-principles answers backed by hand-validated thermal models.
Quick Start
Try it now: open the interactive notebook in Colab to run NVL72 rack sizing, topology comparisons, and flow optimization.
Install and use as an MCP server:
pip install thermal-mcp-server
Add to your MCP client config (claude_desktop_config.json for Claude Desktop):
{
  "mcpServers": {
    "thermal": {
      "command": "python",
      "args": ["-m", "thermal_mcp_server"]
    }
  }
}
Note: Claude Desktop does not inherit your shell's PATH. If the above doesn't work, use the absolute path to your Python binary (e.g. /usr/local/bin/python or the path inside a virtualenv).
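One way to find that absolute path is to ask the interpreter itself. The sketch below assumes you run it with the same Python (or virtualenv) in which thermal-mcp-server is installed; it prints a config block that uses `sys.executable` in place of the bare `python` command.

```python
import json
import sys

# sys.executable is the absolute path of the interpreter running this script.
# Run it from the environment where thermal-mcp-server is installed.
config = {
    "mcpServers": {
        "thermal": {
            "command": sys.executable,
            "args": ["-m", "thermal_mcp_server"],
        }
    }
}

# Paste the printed JSON into claude_desktop_config.json.
print(json.dumps(config, indent=2))
```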
Once configured, ask Claude engineering questions directly:
"I have 8 H100 SXM GPUs at 700 W each, water cooling at 8 LPM per cold plate, 25°C supply. What's the junction temperature and thermal margin?"
"Compare water versus 50/50 glycol for a 700 W load at 8 LPM."
"Size a CDU for 8 H100 GPUs in a parallel manifold — total flow, system ΔP, and return water temperature."
Claude calls the relevant tool, interprets the physics, and answers in context.
<img width="1768" height="1750" alt="Claude Desktop answering a liquid cooling question by calling thermal-mcp-server tools" src="https://github.com/user-attachments/assets/7e3fb436-38d2-477b-a4dd-e5a2a740d463" />
Claude Desktop calling analyze_coldplate via the MCP server. The user asks a natural-language thermal question; Claude picks the right tool, runs the physics, and interprets the result.
Example: H100 SXM Baseline
This is the reference case validated by hand calculation: every intermediate value (Reynolds number, Nusselt number, convection coefficient, pressure drop) is independently verified in tests/test_physics_behavior.py.
from thermal_mcp_server.physics import analyze
from thermal_mcp_server.schemas import AnalyzeColdplateInput

THROTTLE_ONSET_C = 83.0  # H100 SXM Tj design ceiling (see Validation)

result = analyze(AnalyzeColdplateInput(
    heat_load_w=700, flow_rate_lpm=8.0, inlet_temp_c=25.0, coolant="water"
))

print(f"Junction temp: {result.junction_temp_c:.1f}°C")  # 70.9°C
print(f"Thermal margin: {THROTTLE_ONSET_C - result.junction_temp_c:.1f}°C below throttle onset")  # 12.1°C
print(f"Flow regime: {result.regime}")  # transitional (Re ≈ 3734)
print(f"Pressure drop: {result.pressure_drop_pa:.0f} Pa")  # 16800 Pa (0.17 bar)
For rack-scale analysis (NVL72 CDU sizing, series vs. parallel topology, B200 at 1,200 W), see the interactive notebook.
Tools
Four MCP tools, each also available as a Python function:
| Tool | What it does |
|---|---|
| analyze_coldplate | Single-point thermal + hydraulic analysis: Tj, resistance breakdown, ΔP, regime, pump power |
| compare_coolants | Side-by-side water vs. glycol at identical conditions |
| optimize_flow_rate | Binary search for minimum flow to meet a Tj target |
| analyze_rack | N identical GPUs in series or parallel: max Tj, per-GPU temps, total flow, system ΔP, CDU return temp |
See docs/mcp.md for full input/output schemas.
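To illustrate the idea behind optimize_flow_rate, here is a minimal bisection sketch. It is not the server's implementation: `min_flow_for_target`, its search bounds, and the `toy_junction_temp_c` model are all hypothetical, and the sketch assumes Tj falls monotonically as flow increases.

```python
from typing import Callable, Optional


def min_flow_for_target(
    tj_at_flow: Callable[[float], float],
    target_tj_c: float,
    lo_lpm: float = 0.5,
    hi_lpm: float = 20.0,
    tol_lpm: float = 0.01,
) -> Optional[float]:
    """Smallest flow in [lo_lpm, hi_lpm] with Tj <= target (hypothetical helper).

    Assumes Tj decreases monotonically as flow increases.
    """
    if tj_at_flow(hi_lpm) > target_tj_c:
        return None  # target not reachable even at the upper flow bound
    if tj_at_flow(lo_lpm) <= target_tj_c:
        return lo_lpm  # the lower bound already meets the target
    # Invariant: lo_lpm misses the target, hi_lpm meets it.
    while hi_lpm - lo_lpm > tol_lpm:
        mid_lpm = 0.5 * (lo_lpm + hi_lpm)
        if tj_at_flow(mid_lpm) <= target_tj_c:
            hi_lpm = mid_lpm
        else:
            lo_lpm = mid_lpm
    return hi_lpm


def toy_junction_temp_c(flow_lpm: float) -> float:
    """Toy stand-in for a real analysis call; coefficients are illustrative only."""
    return 25.0 + 700.0 * (0.04 + 0.1 / flow_lpm**0.8)


flow = min_flow_for_target(toy_junction_temp_c, target_tj_c=70.0)
print(f"Minimum flow for the toy model: {flow:.2f} LPM")
```

Returning the upper end of the final bracket keeps the result on the safe side: the reported flow always meets the target, within the chosen tolerance of the true minimum.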
How It Works
The physics engine models a cold plate as a 1D thermal resistance network:
T_junction = T_inlet + Q × (R_jc + R_tim + R_base + R_conv) + ΔT_coolant/2
- R_jc / R_tim: Package resistances (chip manufacturer spec or estimate)
- R_base: Copper base conduction (geometry + k = 385 W/m·K)
- R_conv: Forced convection — Dittus-Boelter (turbulent) or Nu = 4.36 (laminar), linearly blended through transition (Re 2,300–4,000)
- ΔP: Darcy-Weisbach with Blasius friction factor, same transition blend
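The correlations named in the list above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the package's code: the function names are hypothetical, the Dittus-Boelter exponent of 0.4 on Pr (fluid being heated), the laminar friction factor 64/Re, and the choice to blend Nu and f directly on Re are assumptions, and the resistance values in the usage lines are placeholders rather than the H100 defaults.

```python
RE_LAMINAR_MAX = 2300.0    # below this, fully laminar
RE_TURBULENT_MIN = 4000.0  # above this, fully turbulent


def turbulent_weight(re: float) -> float:
    """Linear blend weight: 0 at Re <= 2300, 1 at Re >= 4000."""
    w = (re - RE_LAMINAR_MAX) / (RE_TURBULENT_MIN - RE_LAMINAR_MAX)
    return min(1.0, max(0.0, w))


def nusselt(re: float, pr: float) -> float:
    """Nu = 4.36 (laminar) blended with Dittus-Boelter (turbulent)."""
    nu_laminar = 4.36
    nu_turbulent = 0.023 * re**0.8 * pr**0.4  # assumed heating exponent
    w = turbulent_weight(re)
    return (1.0 - w) * nu_laminar + w * nu_turbulent


def darcy_friction_factor(re: float) -> float:
    """64/Re (assumed laminar form) blended with Blasius 0.316 * Re^-0.25."""
    f_laminar = 64.0 / re
    f_turbulent = 0.316 * re**-0.25
    w = turbulent_weight(re)
    return (1.0 - w) * f_laminar + w * f_turbulent


def junction_temp_c(t_inlet_c, q_w, r_jc, r_tim, r_base, r_conv, delta_t_coolant_k):
    """T_junction = T_inlet + Q * (R_jc + R_tim + R_base + R_conv) + dT_coolant / 2."""
    return t_inlet_c + q_w * (r_jc + r_tim + r_base + r_conv) + delta_t_coolant_k / 2.0


# Re ≈ 3734 is the baseline case from the example; Pr ≈ 6.1 is an assumed
# value for water near 25°C.
print(f"Nu at Re=3734: {nusselt(3734.0, 6.1):.1f}")
print(f"f  at Re=3734: {darcy_friction_factor(3734.0):.4f}")
# Placeholder resistances in K/W, for illustration only.
print(f"Tj: {junction_temp_c(25.0, 700.0, 0.02, 0.01, 0.005, 0.03, 1.25):.1f}°C")
```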
The rack-level model stacks N single-GPU analyses in either a series topology (cumulative coolant temperature rise) or a parallel topology (uniform inlet temperature, flow split across cold plates).
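The difference between the two topologies shows up in the coolant temperature each cold plate sees. The sketch below uses only an energy balance, ΔT = Q / (ṁ · c_p); the helper names and the water properties (997 kg/m³, 4180 J/kg·K) are assumptions for illustration, not values taken from the package.

```python
WATER_DENSITY_KG_M3 = 997.0  # assumed, water near 25°C
WATER_CP_J_KGK = 4180.0      # assumed, water near 25°C


def coolant_rise_k(heat_w: float, flow_lpm: float) -> float:
    """Coolant temperature rise across a heat load: dT = Q / (m_dot * cp)."""
    mass_flow_kg_s = WATER_DENSITY_KG_M3 * flow_lpm / 60000.0  # LPM -> m^3/s -> kg/s
    return heat_w / (mass_flow_kg_s * WATER_CP_J_KGK)


def series_inlet_temps_c(supply_c, heat_w, n_gpus, loop_flow_lpm):
    """Series: the full loop flow passes each plate, so inlets climb plate by plate."""
    rise_k = coolant_rise_k(heat_w, loop_flow_lpm)
    return [supply_c + i * rise_k for i in range(n_gpus)]


def parallel_inlet_temps_c(supply_c, n_gpus):
    """Parallel: every plate sees the supply temperature."""
    return [supply_c] * n_gpus


def return_temp_c(supply_c, heat_w, n_gpus, total_flow_lpm):
    """Mixed return temperature for the whole rack at the given total flow."""
    return supply_c + coolant_rise_k(n_gpus * heat_w, total_flow_lpm)


# 8 GPUs at 700 W, 25°C supply, 8 LPM through each cold plate.
series = series_inlet_temps_c(25.0, 700.0, 8, loop_flow_lpm=8.0)
print(f"Series, last plate inlet: {series[-1]:.1f}°C")
print(f"Series return (8 LPM total):    {return_temp_c(25.0, 700.0, 8, 8.0):.1f}°C")
print(f"Parallel return (64 LPM total): {return_temp_c(25.0, 700.0, 8, 64.0):.1f}°C")
```

At the same per-plate flow, the series loop needs one eighth of the total flow but hands the last GPU the warmest coolant, which is why the maximum Tj in a series rack occurs at the end of the chain.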
flowchart LR
A["Input\nchip power, flow,\ncoolant, geometry"] --> B["Physics Engine\nDittus-Boelter · Darcy-Weisbach\nR_total network"]
B --> C["Output\nT_junction · ΔP\nthermal margin · pump power"]
See docs/physics.md for the full physics documentation including equations and assumptions.
Validation
Model outputs compared against published chip specifications. All runs use water coolant with a 25°C inlet.
| Chip | TDP | Tj Design Ceiling | Model Tj | Margin | Notes |
|---|---|---|---|---|---|
| H100 SXM | 700 W | 83°C | 70.9°C at 8 LPM | 12.1°C | Default geometry; hand-calc validated |
| MI300X | 750 W | ~85°C (proxy) | 74.2°C at 8 LPM | ~10°C | AMD does not publish Tj_max |
| B200 NVL72 | 1,200 W | ~75°C (est.) | 75.0°C at 9.3 LPM/GPU | 0°C at limit | R_jc=0.02 K/W est.; NVIDIA does not publish |
| Gaudi 3 OAM | 900 W (air) / 1,200 W (liquid) | ~85°C (proxy) | Requires B200-class geometry | — | Default H100 geometry undersized for 1,200 W |
On B200 and Gaudi 3 numbers: NVIDIA and Intel do not publish cold plate geometry or R_jc for these chips. The B200 analysis uses engineering estimates. Treat as indicative; real sizing requires vendor data.
Chip sources: NVIDIA H100 Datasheet · NVIDIA GB200 NVL72 · SemiAnalysis B200 thermal estimates · AMD MI300X Data Sheet · Intel Gaudi 3 Product Brief
Known Limitations
These are documented explicitly because they bound what the model can and cannot tell you:
- No manifold or header pressure losses — rack ΔP is cold-plate-only. Real system ΔP should add 20–50% for manifold losses.
- No heterogeneous racks — all GPUs assumed identical TDP, geometry, and thermal resistance.
- Steady-state only — no transient thermal capacitance.
- Single-point fluid properties — water and glycol50 properties fixed at 25°C nominal.
- No flow maldistribution — uniform flow assumed across all cold plates.
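As a rough way to act on the first limitation, the cold-plate-only ΔP can be bracketed with the 20–50% manifold allowance before estimating pump power. The helper names below are hypothetical, the power figure is ideal hydraulic power (ΔP × volumetric flow) with pump efficiency ignored, and the 16800 Pa at 8 LPM input is the single-plate baseline from the example, used here as a stand-in.

```python
def system_dp_range_pa(coldplate_dp_pa, low_allowance=0.20, high_allowance=0.50):
    """Bracket system dP by adding a 20-50% manifold allowance to the cold-plate dP."""
    return (
        coldplate_dp_pa * (1.0 + low_allowance),
        coldplate_dp_pa * (1.0 + high_allowance),
    )


def hydraulic_power_w(dp_pa, flow_lpm):
    """Ideal hydraulic power: dP * volumetric flow (pump efficiency not included)."""
    flow_m3_s = flow_lpm / 60000.0  # LPM -> m^3/s
    return dp_pa * flow_m3_s


dp_low, dp_high = system_dp_range_pa(16800.0)
print(f"System dP estimate: {dp_low:.0f} to {dp_high:.0f} Pa")
print(
    f"Hydraulic power at 8 LPM: {hydraulic_power_w(dp_low, 8.0):.2f} "
    f"to {hydraulic_power_w(dp_high, 8.0):.2f} W"
)
```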
Development
git clone https://github.com/riccardovietri/thermal-mcp-server.git
cd thermal-mcp-server
uv sync --group dev
uv run pytest -v # all tests should pass
Roadmap
- Interactive demo polish — expand the Colab notebook with sensitivity outputs and clearer walkthrough
- ROI calculator — annual cooling cost delta between air and liquid, CDU payback period, per-GPU cooling cost