⚡ WarpGBM MCP Service

GPU-accelerated gradient boosting as a cloud MCP service
Train on A10G GPUs • Get artifact_id for <100ms cached predictions • Download portable artifacts

<div align="center">


🌐 Live Service • 📖 API Docs • 🤖 Agent Guide • 🐍 Python Package

</div>


🎯 What is This?

Outsource your GBDT workload to the world's fastest GPU implementation.

WarpGBM MCP is a stateless cloud service that gives AI agents instant access to GPU-accelerated gradient boosting. Built on WarpGBM (91+ ⭐), it trains models on NVIDIA A10G GPUs, returns portable model artifacts you keep, and caches each trained model for five minutes so follow-up predictions come back in under 100ms.

🏗️ How It Works (The Smart Cache Workflow)

graph LR
    A[Train on GPU] --> B[Get artifact_id + model]
    B --> C[5min Cache]
    C --> D[<100ms Predictions]
    B --> E[Download Artifact]
    E --> F[Use Anywhere]

1. Train: POST your data → Train on A10G GPU → Get artifact_id + portable artifact
2. Fast Path: Use artifact_id → Sub-100ms cached predictions (5min TTL)
3. Slow Path: Use model_artifact_joblib → Download and use anywhere (see the loading sketch below)

Architecture: 🔒 Stateless • 🚀 No model storage • 💾 You own your artifacts
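
For the slow path, the model_artifact_joblib string from /train can be decoded and used offline. A minimal sketch, assuming the artifact is a base64-encoded, gzip-compressed joblib payload (the H4sIA prefix in the sample responses is the base64 signature of a gzip stream; if your payload decodes to plain joblib bytes, skip the decompress step):

```python
import base64
import gzip
import io

import joblib


def load_artifact(model_artifact_joblib: str):
    """Turn the base64 string from a /train response into a model object."""
    raw = base64.b64decode(model_artifact_joblib)
    raw = gzip.decompress(raw)  # assumption: payload is gzip-compressed
    return joblib.load(io.BytesIO(raw))


# model = load_artifact(response["model_artifact_joblib"])
# model.predict([[5.0, 3.4, 1.5, 0.2]])  # assumption: sklearn-style predict()
```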


⚡ Quick Start

For AI Agents (MCP)

Add to your MCP settings (e.g., .cursor/mcp.json):

{
  "mcpServers": {
    "warpgbm": {
      "url": "https://warpgbm.ai/mcp/sse"
    }
  }
}

For Developers (REST API)

# 1. Train a model
curl -X POST https://warpgbm.ai/train \
  -H "Content-Type: application/json" \
  -d '{
    "X": [[5.1,3.5,1.4,0.2], [6.7,3.1,4.4,1.4], ...],
    "y": [0, 1, 2, ...],
    "model_type": "warpgbm",
    "objective": "multiclass"
  }'

# Response includes artifact_id for fast predictions
# {"artifact_id": "abc-123", "model_artifact_joblib": "H4sIA..."}

# 2. Make fast predictions (cached, <100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
  -H "Content-Type: application/json" \
  -d '{
    "artifact_id": "abc-123",
    "X": [[5.0,3.4,1.5,0.2]]
  }'
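
The same two-step flow from Python, using only the fields documented above (a minimal sketch with the requests library; the inline rows are illustrative, use 60+ samples in practice, see the Iris example below):

```python
import requests

BASE = "https://warpgbm.ai"

# 1. Train on GPU; the response carries artifact_id + a portable artifact
train = requests.post(f"{BASE}/train", json={
    "X": [[5.1, 3.5, 1.4, 0.2], [6.7, 3.1, 4.4, 1.4], [6.3, 3.3, 6.0, 2.5]],
    "y": [0, 1, 2],
    "model_type": "warpgbm",
    "objective": "multiclass",
}, timeout=120).json()

# 2. Fast path: cached predictions keyed by artifact_id (5min TTL)
preds = requests.post(f"{BASE}/predict_from_artifact", json={
    "artifact_id": train["artifact_id"],
    "X": [[5.0, 3.4, 1.5, 0.2]],
}, timeout=30).json()
print(preds["predictions"])
```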

🚀 Key Features

| Feature | Description |
| --- | --- |
| 🎯 Multi-Model | WarpGBM (GPU) + LightGBM (CPU) |
| Smart Caching | artifact_id → 5min cache → <100ms inference |
| 📦 Portable Artifacts | Download joblib models, use anywhere |
| 🤖 MCP Native | Direct tool integration for AI agents |
| 💰 X402 Payments | Optional micropayments (Base network) |
| 🔒 Stateless | No data storage, you own your models |
| 🌐 Production Ready | Deployed on Modal with custom domain |

🐍 Python Package vs MCP Service

This repo is the MCP service wrapper. For production ML workflows, consider using the WarpGBM Python package directly:

| Feature | MCP Service (This Repo) | Python Package |
| --- | --- | --- |
| Installation | None needed | pip install git+https://... |
| GPU | Cloud (pay-per-use) | Your GPU (free) |
| Control | REST API parameters | Full Python API |
| Features | Train, predict, upload | + Cross-validation, callbacks, feature importance |
| Best For | Quick experiments, demos | Production pipelines, research |
| Cost | $0.01 per training | Free (your hardware) |

Use this MCP service for: Quick tests, prototyping, agents without local GPU
Use Python package for: Production ML, research, cost savings, full control


📡 Available Endpoints

Core Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /models | List available model backends |
| POST | /train | Train model, get artifact_id + model |
| POST | /predict_from_artifact | Fast predictions (artifact_id or model) |
| POST | /predict_proba_from_artifact | Probability predictions |
| POST | /upload_data | Upload CSV/Parquet for training |
| POST | /feedback | Submit feedback to improve service |
| GET | /healthz | Health check with GPU status |

MCP Integration

| Method | Endpoint | Description |
| --- | --- | --- |
| SSE | /mcp/sse | MCP Server-Sent Events endpoint |
| GET | /.well-known/mcp.json | MCP capability manifest |
| GET | /.well-known/x402 | X402 pricing manifest |
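
To sanity-check the service before wiring up an agent, the manifests and health endpoint can be fetched directly (a minimal sketch using the GET endpoints listed above):

```python
import requests

BASE = "https://warpgbm.ai"

# Probe the health check, backend list, and both capability manifests
for path in ("/healthz", "/models", "/.well-known/mcp.json", "/.well-known/x402"):
    resp = requests.get(f"{BASE}{path}", timeout=10)
    print(path, resp.status_code)
    print(resp.json())  # assumption: each of these endpoints returns JSON
```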

💡 Complete Example: Iris Dataset

# 1. Train WarpGBM on Iris (60 samples recommended for proper binning)
curl -X POST https://warpgbm.ai/train \
  -H "Content-Type: application/json" \
  -d '{
  "X": [[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
        [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
        [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
        [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
        [5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
        [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
        [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
        [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
        [5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
        [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
        [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
        [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5]],
  "y": [0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
        0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
        0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2],
  "model_type": "warpgbm",
  "objective": "multiclass",
  "n_estimators": 100
}'

# Response:
{
  "artifact_id": "abc123-def456-ghi789",
  "model_artifact_joblib": "H4sIA...",
  "training_time_seconds": 0.0
}

# 2. Fast inference with cached artifact_id (<100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
  -H "Content-Type: application/json" \
  -d '{
  "artifact_id": "abc123-def456-ghi789",
  "X": [[5,3.4,1.5,0.2], [6.7,3.1,4.4,1.4], [7.7,3.8,6.7,2.2]]
}'

# Response: {"predictions": [0, 1, 2], "inference_time_seconds": 0.05}
# Perfect classification! ✨

⚠️ Important: WarpGBM uses quantile binning which requires 60+ samples for proper training. With fewer samples, the model can't learn proper decision boundaries.
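
Class probabilities are available from the companion /predict_proba_from_artifact endpoint with the same request shape (a minimal sketch; the response schema is not shown above, so this prints the full JSON):

```python
import requests

resp = requests.post("https://warpgbm.ai/predict_proba_from_artifact", json={
    "artifact_id": "abc123-def456-ghi789",  # from the /train response above
    "X": [[5.0, 3.4, 1.5, 0.2]],
}, timeout=30)
print(resp.json())  # per-class probabilities for each input row
```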


🏠 Self-Hosting

Local Development

# Clone repo
git clone https://github.com/jefferythewind/mcp-warpgbm.git
cd mcp-warpgbm

# Setup environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run locally (GPU optional for dev)
uvicorn local_dev:app --host 0.0.0.0 --port 8000 --reload

# Test
curl http://localhost:8000/healthz

Deploy to Modal (Production)

# Install Modal
pip install modal

# Authenticate
modal token new

# Deploy
modal deploy modal_app.py

# Service will be live at your Modal URL

Deploy to Other Platforms

# Docker (requires GPU)
docker build -t warpgbm-mcp .
docker run --gpus all -p 8000:8000 warpgbm-mcp

# Fly.io, Railway, Render, etc.
# See their respective GPU deployment docs

🧪 Testing

# Install dev dependencies
pip install -r requirements-dev.txt

# Run all tests
./run_tests.sh

# Or use pytest directly
pytest tests/ -v

# Test specific functionality
pytest tests/test_train.py -v
pytest tests/test_integration.py -v

📦 Project Structure

mcp-warpgbm/
├── app/
│   ├── main.py              # FastAPI app + routes
│   ├── mcp_sse.py           # MCP Server-Sent Events
│   ├── model_registry.py    # Model backend registry
│   ├── models.py            # Pydantic schemas
│   ├── utils.py             # Serialization, caching
│   ├── x402.py              # Payment verification
│   └── feedback_storage.py  # Feedback persistence
├── .well-known/
│   ├── mcp.json             # MCP capability manifest
│   └── x402                 # X402 pricing manifest
├── docs/
│   ├── AGENT_GUIDE.md       # Comprehensive agent docs
│   ├── MODEL_SUPPORT.md     # Model parameter reference
│   └── WARPGBM_PYTHON_GUIDE.md
├── tests/
│   ├── test_train.py
│   ├── test_predict.py
│   ├── test_integration.py
│   └── conftest.py
├── examples/
│   ├── simple_train.py
│   └── compare_models.py
├── modal_app.py             # Modal deployment config
├── local_dev.py             # Local dev server
├── requirements.txt
└── README.md

💰 Pricing (X402)

Optional micropayments on Base network:

| Endpoint | Price | Description |
| --- | --- | --- |
| /train | $0.01 | Train model on GPU, get artifacts |
| /predict_from_artifact | $0.001 | Batch predictions |
| /predict_proba_from_artifact | $0.001 | Probability predictions |
| /feedback | Free | Help us improve! |

Note: Payment is optional for demo/testing. See /.well-known/x402 for details.


🔐 Security & Privacy

  • Stateless: No training data or models persisted
  • Sandboxed: Runs in temporary isolated directories
  • Size Limited: Max 50 MB request payload
  • No Code Execution: Only structured JSON parameters
  • Rate Limited: Per-IP throttling to prevent abuse
  • Read-Only FS: Modal deployment uses immutable filesystem


🌍 Available Models

🚀 WarpGBM (GPU)

  • Acceleration: NVIDIA A10G GPUs
  • Speed: 13× faster than LightGBM
  • Best For: Time-series, financial modeling, temporal data
  • Special: Era-aware splitting, invariant learning
  • Min Samples: 60+ recommended

⚡ LightGBM (CPU)

  • Acceleration: Highly optimized CPU
  • Speed: 10-100× faster than sklearn
  • Best For: General tabular data, large datasets
  • Special: Categorical features, low memory
  • Min Samples: 20+
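
Switching backends is a one-field change in the /train payload. A minimal sketch, assuming the CPU backend is selected with model_type "lightgbm" (GET /models returns the exact backend identifiers):

```python
import requests

payload = {
    "X": [[5.1, 3.5, 1.4, 0.2], [6.7, 3.1, 4.4, 1.4], [6.3, 3.3, 6.0, 2.5]],
    "y": [0, 1, 2],
    "model_type": "lightgbm",  # assumption: backend id as reported by GET /models
    "objective": "multiclass",
}
resp = requests.post("https://warpgbm.ai/train", json=payload, timeout=120)
print(resp.json()["artifact_id"])
```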

🗺️ Roadmap

  • [x] Core training + inference endpoints
  • [x] Smart artifact caching (5min TTL)
  • [x] MCP Server-Sent Events integration
  • [x] X402 payment verification
  • [x] Modal deployment with GPU
  • [x] Custom domain (warpgbm.ai)
  • [x] Smithery marketplace listing
  • [ ] ONNX export support
  • [ ] Async job queue for large datasets
  • [ ] S3/IPFS dataset URL support
  • [ ] Python client library (warpgbm-client)
  • [ ] Additional model backends (XGBoost, CatBoost)

💬 Feedback & Support

Help us make this service better for AI agents!

Submit feedback about:

  • Missing features that would unlock new use cases
  • Confusing documentation or error messages
  • Performance issues or timeout problems
  • Additional model types you'd like to see
# Via API
curl -X POST https://warpgbm.ai/feedback \
  -H "Content-Type: application/json" \
  -d '{
    "feedback_type": "feature_request",
    "message": "Add support for XGBoost backend",
    "severity": "medium"
  }'

📚 Learn More

  • docs/AGENT_GUIDE.md (comprehensive agent docs)
  • docs/MODEL_SUPPORT.md (model parameter reference)
  • docs/WARPGBM_PYTHON_GUIDE.md (WarpGBM Python package guide)

📄 License

GPL-3.0 (same as WarpGBM core)

This ensures improvements to the MCP wrapper benefit the community, while allowing commercial use through the cloud service.


🙏 Credits

Built with:

  • WarpGBM - GPU-accelerated GBDT library
  • Modal - Serverless GPU infrastructure
  • FastAPI - Modern Python web framework
  • LightGBM - Microsoft's GBDT library

<div align="center">

Built with ❤️ for the open agent economy

⭐ Star on GitHub • 🚀 Try Live Service • 📖 Read the Docs

</div>
