RISBridge MCP

Enables users to run Slurm jobs on the WashU RIS Compute2 cluster by describing their intent in plain English, without requiring SSH or Slurm knowledge.

Run jobs on the WashU RIS Compute2 cluster by describing what you want. An intent-to-compute server that turns plain English into safe, validated Slurm work — no SSH, sbatch, partitions, GRES, modules, or storage layout to learn.

RISBridge is a Model Context Protocol server. You talk to it in plain language through an MCP client (Claude Desktop or Claude Code); it plans the job, shows you exactly what it will submit, and runs it on the cluster only after you confirm. It is deliberately not a terminal — there is no arbitrary-shell tool. Every action is a specific, schema-validated operation, which is what makes it safe to let an agent drive real HPC work.

Why it exists

Getting research code onto an HPC cluster usually means learning SSH, Duo, sbatch, partitions, GRES, modules, and a storage layout — before running a single line. RISBridge removes that wall:

Researchers new to HPC get a guided path — a setup wizard, plain-English planning, dry-run previews, and clear status/log/diagnose tools. No Slurm knowledge required.
Experienced HPC users get speed and consistency — job arrays, afterok pipelines, single-node multi-GPU torchrun, vLLM serving, efficiency right-sizing, run comparison, and reproducible history.

Same server, both audiences. The tools scale from "hold my hand" to "give me the template and get out of the way."

How it works

```mermaid
flowchart LR
    A([Plain-English intent]) --> B[Plan & validate<br/>partition · resources · paths]
    B --> C{Dry-run preview<br/>“I will submit: …”}
    C -- confirm --> D[Build sbatch from<br/>fixed, validated templates]
    D --> E[[SSH · Duo / multiplexing]]
    E --> F[(Slurm on Compute2)]
    F --> G[Monitor · logs · explain<br/>efficiency · history]
    G --> A
```

Login nodes are used only for submitting — every workload runs through Slurm on compute nodes. Submissions are dry-run by default and require an explicit confirm before anything reaches the scheduler.
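
The dry-run-then-confirm gate can be pictured with a small sketch. This is an illustration under stated assumptions, not RISBridge's actual code: `SubmitRequest`, `SubmitResult`, `submitJob`, and the injected `send` callback are hypothetical names chosen for the example.

```typescript
// Hypothetical request shape. `confirm` is optional, so a call is a dry run
// unless the caller opts in explicitly.
interface SubmitRequest {
  script: string; // rendered sbatch script, assumed to be built elsewhere
  confirm?: boolean;
}

type SubmitResult =
  | { mode: "dry-run"; preview: string }
  | { mode: "submitted"; jobId: string };

// `send` stands in for whatever actually hands the script to the scheduler.
function submitJob(
  req: SubmitRequest,
  send: (script: string) => string,
): SubmitResult {
  if (req.confirm !== true) {
    // Nothing reaches the scheduler; the caller only sees what would be sent.
    return { mode: "dry-run", preview: `I will submit:\n${req.script}` };
  }
  return { mode: "submitted", jobId: send(req.script) };
}
```

Making the safe path the default means a forgotten or malformed `confirm` value results in a preview rather than a submission.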

Highlights

Intent → compute. "Run src/train.py on a GPU for 4 hours" becomes a validated sbatch submission. "Why is my job pending?" returns a plain-English answer with a fix.
Safe by construction. No arbitrary-shell tool; strict input validation; remote commands run via argument arrays (never shell strings); file uploads are base64-encoded; paths are workspace-confined; job control verifies ownership; a redacted audit log records every action.
Auth that respects policy. SSH-key setup, Duo-aware connection multiplexing (approve once, reuse for hours), and an optional per-user worker that orchestrates Slurm from inside the cluster. Duo is never automated; private keys are never shown or logged.
Broad workload coverage. Python (CPU/GPU), R, notebooks, GPU smoke tests, job arrays, multi-GPU training, vLLM serving, JupyterLab, conda environments, dependency pipelines, and long checkpoint/requeue runs.
53 high-level tools across auth, discovery, projects/files, planning, submission, job management, logs, environments, expert controls, and the per-user worker.
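
To illustrate the "argument arrays, never shell strings" point, here is a minimal sketch of turning validated fields into an `sbatch` argument vector. The `JobSpec` type, `buildSbatchArgs` function, and the specific validation patterns are assumptions made for this example; only the `sbatch` flags (`--partition`, `--time`, `--job-name`) are standard Slurm options.

```typescript
// Hypothetical job description; field names are illustrative only.
interface JobSpec {
  partition: string;
  timeLimit: string; // HH:MM:SS
  jobName: string;
  scriptPath: string; // absolute path to the batch script
}

// Conservative allow-list patterns (assumed, not taken from RISBridge).
const PARTITION_RE = /^[a-z][a-z0-9-]*$/;
const TIME_RE = /^\d{1,3}:[0-5]\d:[0-5]\d$/;
const NAME_RE = /^[A-Za-z0-9_.-]{1,64}$/;
const PATH_RE = /^\/[A-Za-z0-9_./-]+$/;

// Validate each field, then emit one array element per argument.
// No element is ever concatenated into a shell command line.
function buildSbatchArgs(spec: JobSpec): string[] {
  const checks: Array<[string, string, RegExp]> = [
    ["partition", spec.partition, PARTITION_RE],
    ["timeLimit", spec.timeLimit, TIME_RE],
    ["jobName", spec.jobName, NAME_RE],
    ["scriptPath", spec.scriptPath, PATH_RE],
  ];
  for (const [field, value, pattern] of checks) {
    if (!pattern.test(value)) {
      throw new Error(`invalid ${field}`);
    }
  }
  return [
    `--partition=${spec.partition}`,
    `--time=${spec.timeLimit}`,
    `--job-name=${spec.jobName}`,
    spec.scriptPath,
  ];
}
```

An array like this can be handed to Node's `spawn("sbatch", args)`, which does not start a shell by default, so characters such as `;` or `$(...)` in a value are never interpreted. The path pattern here only restricts characters; confinement to a workspace would be a separate check.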

Example

You type intent; RISBridge does the rest:

"What's my RISBridge setup status?"
"Set up my SSH key." → installs the key, one Duo approval
"Discover my profile." → finds your Slurm account and storage workspace
"Run src/train.py on one H100 for 4 hours." → preview → confirm → job ID
"Why is job 1234567 pending?" · "Tail its logs." · "Right-size my last 5 runs."

The toolset

<details> <summary><b>All 53 tools, by category</b></summary>

Setup & authentication — ris_setup_wizard, ris_auth_status, ris_setup_ssh_key, ris_show_public_key, ris_generate_ssh_config, ris_write_ssh_config, ris_test_key_only_auth, ris_open_ssh_master, ris_check_ssh_master, ris_repair_stale_socket

Discovery & configuration — ris_discover_profile, ris_set_profile, ris_show_config, ris_validate_config, ris_list_partitions, ris_gpu_status

Projects, files & environments — ris_create_project, ris_upload_file, ris_list_project_files, ris_ensure_env, ris_list_modules, ris_list_conda_envs, ris_inspect_conda_env

Planning — ris_plan_run, ris_researcher_wizard, ris_estimate_resources

Submitting jobs — ris_run, ris_submit_python_job, ris_submit_r_job, ris_submit_notebook_job, ris_submit_gpu_smoke_test, ris_submit_array_job, ris_submit_multigpu_torch_job, ris_submit_jupyter_job, ris_submit_vllm_job, ris_submit_conda_env_job, ris_create_pipeline, ris_generate_sbatch_template

Monitoring & lifecycle — ris_list_my_jobs, ris_job_history, ris_explain_job, ris_get_job_logs, ris_tail_job_logs, ris_cancel_job, ris_hold_job, ris_release_job

Analysis & reproducibility — ris_analyze_efficiency, ris_compare_runs, ris_get_result_manifest

Per-user worker — ris_bootstrap_worker, ris_worker_enqueue, ris_worker_status, ris_worker_cancel

</details>

Safety & trust model

| Guarantee | How |
| --- | --- |
| No arbitrary shell | Every remote command is a fixed template built from validated tokens; no raw command tool exists. |
| Injection-resistant | Zod validation on every input; `spawn` with argument arrays; base64-encoded uploads. |
| Confined | Workspace-only paths (no traversal, no data on `/home`). |
| Confirmed | Submissions are dry-run by default and need an explicit confirm. |
| Accountable | Ownership-checked job control; redacted append-only audit log. |
| Private | Duo never automated; passwords never captured; private keys never shown or logged. |
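
As a rough illustration of the "Confined" guarantee, the sketch below resolves a user-supplied relative path against a workspace root and rejects anything that could escape it. The function name `confineToWorkspace` and its exact rules are assumptions for this example, not RISBridge's implementation.

```typescript
// Join a relative POSIX path onto a workspace root, refusing absolute paths
// and any ".." segment. Hypothetical helper, for illustration only.
function confineToWorkspace(workspaceRoot: string, relativePath: string): string {
  if (relativePath.startsWith("/")) {
    throw new Error("absolute paths are not allowed");
  }
  const parts: string[] = [];
  for (const segment of relativePath.split("/")) {
    if (segment === "" || segment === ".") continue; // skip no-op segments
    if (segment === "..") {
      throw new Error("path traversal is not allowed");
    }
    parts.push(segment);
  }
  if (parts.length === 0) {
    throw new Error("path must name something inside the workspace");
  }
  const root = workspaceRoot.replace(/\/+$/, ""); // drop trailing slashes
  return `${root}/${parts.join("/")}`;
}
```

This is a purely lexical check. It does not follow symlinks on the remote filesystem, so a real implementation would likely need an additional server-side check.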

Requirements

A WashU RIS account with a Compute2 allocation.
The WashU network or VPN (login nodes are reachable only on-network) and Duo MFA.
An MCP client (Claude Desktop or Claude Code). For the source route: Node.js 18.18+ and OpenSSH.

Getting started

Use is governed by the license — the steps below are for authorized users.

One-click extension. Install the risbridge-mcp.mcpb Desktop Extension in Claude Desktop (Settings → Extensions → Install Extension…), enter your WashU username, and you're done — it bundles its own runtime.

From source.

```shell
npm install
npm test          # unit tests (no network)
npm run build     # → dist/
node dist/cli.js tools     # list the registered tools
```

Then register the server with your client (for example: claude mcp add risbridge -- node /abs/path/dist/server.js), connect to the WashU VPN, and ask RISBridge to set up your SSH key.

Cluster reference

<details> <summary><b>Partitions, storage & GPU modes</b></summary>

Partitions (there is no plain general partition):

| Partition | Use | GPU |
| --- | --- | --- |
| `general-gpu` | Full-throughput GPU work | Full H100 80 GB |
| `general-preempt-gpu` | Free, restartable GPU work | Untyped (preemptible) |
| `general-cpu` | Standard CPU work | — |
| `general-short` | Quick tests / dev | MIG slice (~10 GB) |
| `general-interactive` | Interactive sessions | MIG slice |
| `general-bigmem` | Large-memory CPU | — |

Storage — /home is small (code only); data, environments, outputs, and checkpoints live under your /storage2/fs1/<lab>/Active/... workspace.

GPU requests — typed --gres=gpu:H100:N by default; untyped --gres=gpu:N is opt-in and matches more nodes on general-gpu (and is required on preemptible GPU queues).
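
The typed versus untyped rule can be sketched as a small helper. The names `GpuRequest`, `gresFlag`, and `PREEMPTIBLE_GPU_PARTITIONS` are hypothetical, and the sketch covers only the two modes described here (it ignores MIG slices).

```typescript
// Hypothetical request: `typed` is optional and defaults to true.
type GpuRequest = { count: number; typed?: boolean };

// Assumed from the partition table: the preemptible GPU queue takes untyped requests.
const PREEMPTIBLE_GPU_PARTITIONS = new Set<string>(["general-preempt-gpu"]);

// Build the --gres flag for a partition and GPU request.
function gresFlag(partition: string, req: GpuRequest): string {
  if (!Number.isInteger(req.count) || req.count < 1) {
    throw new Error("GPU count must be a positive integer");
  }
  if (PREEMPTIBLE_GPU_PARTITIONS.has(partition)) {
    if (req.typed === true) {
      throw new Error("preemptible GPU queues require an untyped request");
    }
    return `--gres=gpu:${req.count}`;
  }
  // Typed H100 request unless the caller explicitly opts out.
  const typed = req.typed ?? true;
  return typed ? `--gres=gpu:H100:${req.count}` : `--gres=gpu:${req.count}`;
}
```

Throwing on a typed request to a preemptible queue, rather than silently rewriting it, is one possible design; it keeps the preview honest about what will actually be submitted.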

</details>

Project structure

```
src/
  tools/        53 MCP tools
  core/         sbatch builder (policy chokepoint), ssh/slurm clients, validation, planner
  templates/    12 job templates
  auth/         SSH key + Duo / multiplexing
  worker/       per-user worker daemon
  config.ts     environment-driven configuration
tests/          unit test suite
scripts/        installers + .mcpb packaging
```

License

Proprietary — All Rights Reserved. This software and its source code are proprietary and confidential. No license or permission is granted to use, copy, modify, or distribute it, in whole or in part, without the prior written permission of the copyright holder. Viewing this repository does not grant any such rights. See LICENSE.

To request permission, contact sourabh@wustl.edu.
