prime-intellect-mcp

The server enables Claude Code to rent, drive, and terminate Prime Intellect GPU pods with hard spend caps, allowing AI agents to provision and manage cloud GPU resources autonomously.

Let Claude Code rent, drive, and terminate Prime Intellect GPU pods on its own — with hard spend caps you control.

What this is

An MCP server that connects Claude Code (or any MCP client) to your Prime Intellect account. With it, the agent can:

🔍 Find the cheapest GPU pod that matches your requirements
💸 Quote a price before committing money
🛒 Provision the pod (only after you say confirm=True)
🖥️ SSH into it (the connection string is handed to the agent's own Bash tool)
🛑 Terminate it when work is done — and warn loudly if you forget

Built for one workflow: telling Claude "rent the cheapest H100, run my training script, then kill it" and not waking up to a $400 bill.

Install in 60 seconds

You only need this much to start renting GPUs through Claude Code:

1. Get a Prime Intellect API key

Generate a key in your Prime Intellect account and set these permissions:

| Scope | Level |
| --- | --- |
| Instances | Read and write |
| Availability | Read only |
| Billing | Read only |
| SSH Keys | Read only |

Copy the key — it starts with pit_….

2. Add the server to Claude Code

Open your project's .mcp.json (Claude Code), or ~/Library/Application Support/Claude/claude_desktop_config.json if you use Claude Desktop on macOS, and paste:

{
  "mcpServers": {
    "prime-intellect": {
      "command": "uvx",
      "args": ["prime-intellect-mcp"],
      "env": {
        "PRIME_API_KEY": "pit_PASTE_YOURS_HERE",
        "PRIME_MAX_HOURLY_USD": "5",
        "PRIME_MAX_TOTAL_USD": "40"
      }
    }
  }
}

That's it. Restart Claude Code and ask: "What GPUs are available right now under $1/hr?"

Don't have uvx? It ships with the uv package manager. Install uv with curl -LsSf https://astral.sh/uv/install.sh | sh (or brew install uv). uvx then runs the server in an isolated environment that it creates and manages for you, so there is no virtualenv to maintain.
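If the server does not show up after a restart, a quick sanity check of the config entry can help. The following is a minimal sketch, not part of this package: `check_server_config` is a hypothetical helper that only verifies the shape shown above (a `prime-intellect` entry, a key starting with `pit_`, and numeric caps).

```python
import json


def check_server_config(config_text: str, server_name: str = "prime-intellect") -> list[str]:
    """Return a list of problems found; an empty list means the entry looks sane.

    Hypothetical helper: it checks only the fields shown in the README snippet.
    """
    config = json.loads(config_text)
    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"no mcpServers entry named {server_name!r}"]

    problems = []
    env = server.get("env", {})

    # The README states that Prime Intellect keys start with "pit_".
    if not str(env.get("PRIME_API_KEY", "")).startswith("pit_"):
        problems.append("PRIME_API_KEY is missing or does not start with 'pit_'")

    for cap in ("PRIME_MAX_HOURLY_USD", "PRIME_MAX_TOTAL_USD"):
        raw = env.get(cap)
        if raw is None:
            continue  # unset caps fall back to the documented defaults
        try:
            if float(raw) <= 0:
                problems.append(f"{cap} must be positive")
        except (TypeError, ValueError):
            problems.append(f"{cap} is not a number: {raw!r}")
    return problems
```

Usage: pass the file contents, for example `check_server_config(open(".mcp.json").read())`, and fix whatever it reports before restarting the client.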

✨ Add SSH (optional, +2 min) — needed for Claude to actually run code on the pod

The server above can already provision/inspect/terminate pods. But to have Claude Code SSH into a running pod and execute commands on it, Prime Intellect needs to know your machine's public SSH key.

3. Find or generate an SSH key on your machine

ls ~/.ssh/*.pub          # if you have id_ed25519.pub or similar, you're set
# otherwise:
ssh-keygen -t ed25519 -C "you@example.com"   # press Enter through the prompts

4. Register the public key with Prime Intellect

cat ~/.ssh/id_ed25519.pub    # or whichever .pub file you have

Copy the output (one line starting with ssh-ed25519 …), then paste it into the Add SSH key form at app.primeintellect.ai/dashboard/ssh-keys.

That's it. Future pods will have your public key in authorized_keys, and Claude Code's Bash tool can SSH straight in:

ssh ubuntu@<pod-ip-from-pod_status> "nvidia-smi"
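The agent normally runs that command through its own Bash tool. If you want to script the same call yourself, the sketch below shows one way to do it with the standard library. The helper names are hypothetical; `BatchMode=yes` and `ConnectTimeout` are standard OpenSSH options chosen here so an unattended run fails fast instead of hanging on a password prompt.

```python
import subprocess


def build_ssh_argv(host: str, remote_command: str, user: str = "ubuntu") -> list[str]:
    """Build the argv for a non-interactive SSH call to a pod."""
    return [
        "ssh",
        "-o", "BatchMode=yes",       # never prompt for a password or passphrase
        "-o", "ConnectTimeout=10",   # give up quickly if the pod is unreachable
        f"{user}@{host}",
        remote_command,
    ]


def run_on_pod(host: str, remote_command: str, user: str = "ubuntu") -> subprocess.CompletedProcess:
    """Run one command on the pod and capture its output as text."""
    return subprocess.run(
        build_ssh_argv(host, remote_command, user),
        capture_output=True,
        text=True,
        check=False,  # let the caller inspect returncode and stderr
    )
```

Usage: `run_on_pod("<pod-ip-from-pod_status>", "nvidia-smi").stdout` returns the GPU listing once your key is registered.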

Coming in v0.2: a register_ssh_key MCP tool that does step 4 from inside Claude (no browser visit). See the issue tracker to follow along.

What Claude can now do (the 9 tools)

| Tool | Use case |
| --- | --- |
| `list_gpu_types` | "What GPU types does Prime Intellect offer?" |
| `list_availability` | "Show me 1×H100 pods available under $3/hr." |
| `get_wallet_balance` | "How much credit do I have left?" |
| `pod_quote` | "Quote me a 1×A100 with 200GB disk." (no charge) |
| `pod_create` | "Provision the pod from that quote." (requires `confirm=True`) |
| `pod_list` | "Show me my running pods." |
| `pod_status` | "Is pod X ready? Wait until it has SSH info." |
| `pod_terminate` | "Kill pod X." (requires `confirm=True`) |
| `pod_check_runaway` | "Did I forget to terminate anything?" |
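To make the "cheapest matching pod" idea concrete, here is a small sketch of the filtering an agent might apply to availability rows. The row shape (`gpu_type`, `gpu_count`, `hourly_usd`) is an assumption made for illustration, not the actual output format of `list_availability`.

```python
def cheapest_offers(rows, gpu_type, gpu_count=1, max_hourly_usd=None, top_n=3):
    """Return up to top_n matching rows, cheapest first.

    Each row is assumed to be a dict with hypothetical keys
    'gpu_type', 'gpu_count', and 'hourly_usd'.
    """
    matches = [
        row for row in rows
        if row["gpu_type"] == gpu_type
        and row["gpu_count"] == gpu_count
        and (max_hourly_usd is None or row["hourly_usd"] <= max_hourly_usd)
    ]
    return sorted(matches, key=lambda row: row["hourly_usd"])[:top_n]
```

Usage: `cheapest_offers(rows, "H100", max_hourly_usd=3.0)` mirrors the prompt "Show me 1×H100 pods available under $3/hr."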

Safety: nothing provisions silently

Three layers, in order:

1. Quote first. pod_quote returns a price plus a token that is valid for 60 seconds. It has no side effects, and it puts the dollar amount in the agent's context before anything is bought.
2. Explicit confirm. pod_create and pod_terminate both require confirm=True. Without it, you get a dry-run preview.
3. Hard env-var caps. PRIME_MAX_HOURLY_USD blocks any pod whose hourly rate is above the cap. PRIME_MAX_TOTAL_USD blocks any pod whose projected cost (rate × max_lifetime_hours) is above the budget. Wallet balance is also enforced. None of these caps can be overridden by tool arguments; they're read at every call.

Defaults: PRIME_MAX_HOURLY_USD=5, PRIME_MAX_TOTAL_USD=40. Set them in your config's env block.
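The cap arithmetic can be sketched as follows. This is an illustration of the rules described above, not the server's actual code: `check_spend_caps` and `SpendCapError` are hypothetical names, and treating "wallet balance is enforced" as "projected cost must not exceed the balance" is an assumption.

```python
import os

DEFAULT_MAX_HOURLY_USD = 5.0   # documented default
DEFAULT_MAX_TOTAL_USD = 40.0   # documented default


class SpendCapError(Exception):
    """Raised when a requested pod would break one of the spend limits."""


def check_spend_caps(hourly_usd, max_lifetime_hours, wallet_balance_usd, env=None):
    """Return the projected cost, or raise SpendCapError if a cap is exceeded."""
    # Read the caps on every call so tool arguments cannot override them.
    env = os.environ if env is None else env
    max_hourly = float(env.get("PRIME_MAX_HOURLY_USD", DEFAULT_MAX_HOURLY_USD))
    max_total = float(env.get("PRIME_MAX_TOTAL_USD", DEFAULT_MAX_TOTAL_USD))

    projected_total = hourly_usd * max_lifetime_hours

    if hourly_usd > max_hourly:
        raise SpendCapError(
            f"Hourly rate ${hourly_usd:.2f}/hr exceeds PRIME_MAX_HOURLY_USD cap (${max_hourly:.2f})"
        )
    if projected_total > max_total:
        raise SpendCapError(
            f"Projected cost ${projected_total:.2f} exceeds PRIME_MAX_TOTAL_USD cap (${max_total:.2f})"
        )
    # Assumption: the wallet must cover the whole projected cost.
    if projected_total > wallet_balance_usd:
        raise SpendCapError(
            f"Projected cost ${projected_total:.2f} exceeds wallet balance (${wallet_balance_usd:.2f})"
        )
    return projected_total
```

With the defaults, a $3.00/hr pod for 12 hours projects to $36.00 and passes, while a $3.50/hr pod for the same 12 hours projects to $42.00 and is rejected by the total cap even though its hourly rate is under $5.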

Every pod_create / pod_terminate is appended as JSON to ~/.prime-intellect-mcp/audit.log, so you have a complete history of what the agent did with your money.
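To review that history programmatically, something like the sketch below may be enough. It assumes the log holds one JSON object per line; the README says only that entries are appended as JSON, so both the line-per-entry layout and any field names you filter on are assumptions to verify against your own file.

```python
import json
from pathlib import Path

# Path taken from the README; the file exists only after the first logged call.
AUDIT_LOG = Path.home() / ".prime-intellect-mcp" / "audit.log"


def read_audit_log(path=AUDIT_LOG):
    """Parse the audit log into a list of dicts, skipping blank lines.

    Assumes one JSON object per line (an assumption, not a documented format).
    """
    entries = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries
```

Usage: `for entry in read_audit_log(): print(entry)` prints every recorded create and terminate call in order.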

Example prompts (paste these into Claude Code)

List the cheapest 1×H100 pods available right now. Show me the top 3 by hourly price.

Quote a 1×A100 80GB with 100GB disk, 8 vCPU, 64GB RAM. Don't provision yet —
just show me what it would cost.

I need to fine-tune a 7B model overnight. Find the cheapest 1×H100 with 200GB
disk, max $40 total budget, max 12 hours. Provision it, give me the SSH command,
and remind me to terminate when I'm done.

Check if I have any running pods I forgot about and show me their hourly cost.

Terminate pod abc123. Confirm before doing it.

Troubleshooting

<details> <summary><code>PRIME_API_KEY is not set</code></summary>

Either your Claude Code config didn't pick up the env block, or the variable name is misspelled. Verify with:

$ env | grep PRIME

inside the same shell that launches Claude Code, or paste the key directly into the JSON env block (instead of using ${PRIME_API_KEY}). </details>

<details> <summary><code>Hourly rate $X/hr exceeds PRIME_MAX_HOURLY_USD cap</code></summary>

The agent picked a pod above your hard cap. Either:

Pick a cheaper GPU (list_availability with a region filter often surfaces cheaper community-priced rows), or
Raise PRIME_MAX_HOURLY_USD in your config and restart Claude Code. </details>

<details> <summary><code>Quote token expired</code></summary>

Quotes live 60 seconds; the agent waited too long between pod_quote and pod_create. Call pod_quote again; quoting is free and has no side effects. </details>
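For intuition about why the error appears, here is a minimal sketch of a quote store with a 60-second lifetime. It is not the server's implementation: `QuoteStore` is a hypothetical class, and making tokens single-use is an assumption.

```python
import secrets
import time

QUOTE_TTL_SECONDS = 60  # lifetime stated in the README


class QuoteStore:
    """In-memory store for short-lived quote tokens (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be tested without waiting
        self._quotes = {}

    def issue(self, spec, hourly_usd):
        """Record a quote and return an unguessable token for it."""
        token = secrets.token_urlsafe(16)
        self._quotes[token] = (self._clock() + QUOTE_TTL_SECONDS, spec, hourly_usd)
        return token

    def redeem(self, token):
        """Return (spec, hourly_usd) for a live token; each token works once."""
        entry = self._quotes.pop(token, None)
        if entry is None:
            raise KeyError("Unknown quote token")
        expires_at, spec, hourly_usd = entry
        if self._clock() > expires_at:
            raise TimeoutError("Quote token expired")
        return spec, hourly_usd
```

Binding the price to a short-lived token means the amount the agent confirmed is the amount that gets provisioned, even if market prices move in between.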

<details> <summary>Pod is "ACTIVE" but <code>ssh_connection</code> is null</summary>

Provisioning isn't fully done. The pod is alive but still running its install script. Call pod_status(pod_id, wait_for_ssh=True) and it will block (polling every 5s) until SSH comes up. </details>
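The blocking behaviour amounts to a polling loop. The sketch below shows the general pattern under stated assumptions: `get_status` is a hypothetical callable returning a dict with an `ssh_connection` field, and the 10-minute timeout is an illustrative value, not a documented one.

```python
import time


def wait_for_ssh(get_status, pod_id, poll_seconds=5.0, timeout_seconds=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the pod reports SSH details, then return that status dict.

    get_status is a hypothetical callable: get_status(pod_id) -> dict.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = get_status(pod_id)
        if status.get("ssh_connection"):
            return status  # SSH is up; the caller can connect now
        if clock() >= deadline:
            raise TimeoutError(f"pod {pod_id} had no SSH info after {timeout_seconds:.0f}s")
        sleep(poll_seconds)  # README states the server polls every 5 seconds
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.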

<details> <summary><code>ssh: Permission denied (publickey)</code></summary>

You haven't told Prime Intellect about your public key (or the pod was provisioned before you registered it). Fix:

Verify your pubkey is registered at app.primeintellect.ai/dashboard/ssh-keys.
Re-provision — the pod's authorized_keys is set at create time, so existing pods won't pick up keys you registered after.
If your private key has a passphrase, run ssh-add --apple-use-keychain ~/.ssh/your_key once on macOS so the agent unlocks it silently from now on. </details>

<details> <summary>Wallet balance is empty / <code>PaymentRequiredError</code></summary>

Top up at app.primeintellect.ai/wallet and try again. </details>

Why another one?

There's a prime-mcp-server 0.1.2 on PyPI. It's a thin proof-of-concept; this isn't a fork. Differences for unattended overnight use:

| | `prime-intellect-mcp` | `prime-mcp-server` 0.1.2 |
| --- | --- | --- |
| Two-step quote → confirm | ✅ | ❌ |
| Env-var hard spend caps | ✅ | ❌ |
| Wallet pre-check | ✅ | ❌ |
| Runaway-pod detection | ✅ | ❌ |
| SSH handoff to agent | ✅ | ❌ |
| Tests | 32 unit + opt-in live | None |

Local development

git clone https://github.com/kvrancic/prime-intellect-mcp
cd prime-intellect-mcp
uv sync
uv run pytest -m "not live"        # 32 fast tests, no network, no spend
uv run ruff check .
uv run mypy src

Live smoke test (provisions cheapest available GPU, runs nvidia-smi, terminates; ~$0.05 spend):

PRIME_API_KEY=pit_... PRIME_LIVE_TEST=1 PRIME_LIVE_MAX_HOURLY=0.60 \
PRIME_MAX_HOURLY_USD=0.60 PRIME_MAX_TOTAL_USD=2.00 \
uv run pytest tests/test_smoke_live.py -v -s

Roadmap

v0.2 — register_ssh_key MCP tool (kill the dashboard step), Sandboxes (prime-sandboxes SDK), Environments Hub
v0.3 — Optional auto-terminate daemon (server-side enforcement of max_lifetime_hours); cost telemetry
v1.0+ — Hosted/OAuth deployment when Prime Intellect ships OAuth; submission to Anthropic connector directory

Acknowledgements

Prime Intellect for the prime Python SDK that does 90% of the work
MIT 6.8610 (Advanced NLP) for the Prime Intellect credits that made testing this possible
Anthropic for MCP
FastMCP for the framework

License

MIT — see LICENSE.

Contributing

Issues and PRs welcome. Please run uv run pytest -m "not live" and uv run ruff check . before submitting.
