ML Lab MCP

A comprehensive MCP (Model Context Protocol) server for ML model training, fine-tuning, and experimentation. Transform your AI assistant into a full ML engineering environment.

Features

Unified Credential Management

  • Encrypted vault for API keys (Lambda Labs, RunPod, Mistral, OpenAI, Together AI, etc.)
  • PBKDF2 key derivation with AES encryption
  • Never stores credentials in plaintext
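
The Security section below pins down the parameters: Fernet encryption with a key derived via PBKDF2-SHA256 at 480,000 iterations. A minimal sketch of that scheme using the `cryptography` package, illustrative only and not the actual vault code:

```python
# Sketch of the documented scheme (PBKDF2-SHA256, 480k iterations, Fernet);
# illustrative only, not ml-lab's vault implementation.
import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def derive_key(password: str, salt: bytes) -> bytes:
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                     salt=salt, iterations=480_000)
    return base64.urlsafe_b64encode(kdf.derive(password.encode()))

salt = os.urandom(16)                    # would be stored next to the vault file
vault = Fernet(derive_key("correct horse battery staple", salt))
token = vault.encrypt(b'{"mistral": "sk-..."}')   # only ciphertext hits disk
assert vault.decrypt(token) == b'{"mistral": "sk-..."}'
```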

Dataset Management

  • Register datasets from local files (JSONL, CSV, Parquet)
  • Automatic schema inference and statistics
  • Train/val/test splitting
  • Template-based transformations
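
For a feel of what splitting looks like, here is a minimal sketch of a seeded train/val/test split over a JSONL file; the 80/10/10 ratios, record fields, and seed are assumptions, not ml-lab's defaults:

```python
import json
import random

# Hypothetical records shaped like {"prompt": "...", "response": "..."}
with open("support_data.jsonl") as f:
    rows = [json.loads(line) for line in f]
random.Random(42).shuffle(rows)          # seeded so splits are reproducible

n = len(rows)
splits = {"train": rows[: int(n * 0.8)],
          "val":   rows[int(n * 0.8): int(n * 0.9)],
          "test":  rows[int(n * 0.9):]}
for name, part in splits.items():
    with open(f"{name}.jsonl", "w") as out:
        out.writelines(json.dumps(r) + "\n" for r in part)
```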

Experiment Tracking

  • SQLite-backed experiment storage
  • Version control and comparison
  • Fork experiments with config modifications
  • Full metrics history
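
To make "SQLite-backed" and "fork with config modifications" concrete, a hypothetical sketch of how forking could work over such a store; the schema and column names here are assumptions, not what storage/experiments.py actually uses:

```python
import json
import sqlite3

db = sqlite3.connect("experiments.db")
db.execute("""CREATE TABLE IF NOT EXISTS experiments (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER,              -- set when an experiment is forked
    config    TEXT,                 -- JSON-encoded hyperparameters
    metrics   TEXT                  -- JSON-encoded metrics history
)""")

# Forking: copy the parent's config, apply overrides, record the lineage.
parent_config = {"lr": 2e-4, "epochs": 3, "lora_r": 16}
fork_config = {**parent_config, "lr": 1e-4}
db.execute("INSERT INTO experiments (name, parent_id, config) VALUES (?, ?, ?)",
           ("lora-lowlr", 1, json.dumps(fork_config)))
db.commit()
```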

Multi-Backend Training

  • Local: transformers + peft + trl for local GPU training
  • Mistral API: Native fine-tuning for Mistral models
  • Together AI: Hosted fine-tuning service
  • OpenAI: GPT model fine-tuning
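
All four backends plug into one interface (backends/base.py in the architecture tree below). The real signatures are not shown in this README; a plausible minimal shape, with method names that are assumptions:

```python
from abc import ABC, abstractmethod

class TrainingBackend(ABC):
    """Hypothetical interface; names and signatures are guesses."""

    @abstractmethod
    def launch(self, experiment_id: str, dataset_path: str) -> str:
        """Start a training run and return its run ID."""

    @abstractmethod
    def status(self, run_id: str) -> dict:
        """Return run state plus the latest metrics."""

    @abstractmethod
    def stop(self, run_id: str) -> None:
        """Cancel a running job."""
```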

Cloud GPU Provisioning

  • Lambda Labs: H100, A100 instances
  • RunPod: Spot and on-demand GPUs
  • Automatic price comparison across providers
  • Smart routing based on cost and availability
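
The routing rule reduces to "cheapest available offer wins" once prices are normalized to $/hr; a sketch with placeholder prices (live numbers come from the provider APIs):

```python
# Placeholder offers; real prices are fetched from live provider APIs.
offers = [
    {"provider": "lambda_labs", "gpu": "H100", "usd_hr": 2.75, "available": True},
    {"provider": "runpod",      "gpu": "H100", "usd_hr": 2.99, "available": True},
    {"provider": "runpod",      "gpu": "A100", "usd_hr": 1.64, "available": False},
]
best = min((o for o in offers if o["available"]), key=lambda o: o["usd_hr"])
print(f"{best['provider']} {best['gpu']} at ${best['usd_hr']}/hr")
```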

Remote VPS Support

  • Use any SSH-accessible machine (Hetzner, Hostinger, OVH, home server, university cluster)
  • Automatic environment setup
  • Dataset sync via rsync
  • Training runs in tmux (persistent across disconnects)
  • Amortized hourly cost calculation from monthly fees
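
The amortization is simple arithmetic: a flat monthly fee divided by the hours in a month gives an effective hourly rate, which is how the example workflow below gets roughly $0.28/hr from a $200/mo box:

```python
monthly_fee = 200.00           # flat VPS price
hours_per_month = 30 * 24      # 720
print(f"${monthly_fee / hours_per_month:.2f}/hr")   # -> $0.28/hr
```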

Cost Estimation

  • Pre-training cost estimates across all providers
  • Real-time pricing queries
  • Time estimates based on model and dataset size
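
As a back-of-the-envelope model (the estimator's real heuristics are not documented here), time scales with tokens processed over throughput, and cost is time multiplied by the hourly rate; every number below is an assumption:

```python
# Rough illustrative estimate, not ml-lab's actual heuristic.
samples, avg_tokens, epochs = 15_000, 512, 3
throughput = 3_000             # tokens/sec on the target GPU (assumed)
hourly_rate = 2.75             # $/hr for that GPU (assumed)

hours = samples * avg_tokens * epochs / throughput / 3600
print(f"~{hours:.1f}h, ~${hours * hourly_rate:.2f}")   # ~2.1h, ~$5.87
```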

Ollama Integration

  • Deploy fine-tuned GGUF models to Ollama
  • Pull models from Ollama registry
  • Chat/inference testing directly from MCP
  • Model management (list, delete, copy)

Open WebUI Integration

  • Create model presets with system prompts
  • Knowledge base management (RAG)
  • Chat through Open WebUI (applies configs + knowledge)
  • Seamless Ollama ↔ Open WebUI workflow

Installation

```bash
pip install ml-lab-mcp

# With training dependencies
pip install ml-lab-mcp[training]

# With cloud provider support
pip install ml-lab-mcp[cloud]

# Everything
pip install ml-lab-mcp[training,cloud,dev]
```

Quick Start

1. Initialize and Create Vault

```bash
ml-lab init
ml-lab vault create
```

2. Add Provider Credentials

```bash
ml-lab vault unlock
ml-lab vault add --provider lambda_labs --api-key YOUR_KEY
ml-lab vault add --provider mistral --api-key YOUR_KEY
```

3. Configure with Claude Code / Claude Desktop

Add to your MCP configuration:

```json
{
  "mcpServers": {
    "ml-lab": {
      "command": "ml-lab",
      "args": ["serve"]
    }
  }
}
```

MCP Tools

Credentials

| Tool | Description |
|------|-------------|
| creds_create_vault | Create encrypted credential vault |
| creds_unlock | Unlock vault with password |
| creds_add | Add provider credentials |
| creds_list | List configured providers |
| creds_test | Verify credentials work |

Datasets

| Tool | Description |
|------|-------------|
| dataset_register | Register a local dataset file |
| dataset_list | List all datasets |
| dataset_inspect | View schema and statistics |
| dataset_preview | Preview samples |
| dataset_split | Create train/val/test splits |
| dataset_transform | Apply template transformations |

Experiments

| Tool | Description |
|------|-------------|
| experiment_create | Create new experiment |
| experiment_list | List experiments |
| experiment_get | Get experiment details |
| experiment_compare | Compare multiple experiments |
| experiment_fork | Fork with modifications |

Training

| Tool | Description |
|------|-------------|
| train_estimate | Estimate cost/time across providers |
| train_launch | Start training run |
| train_status | Check run status |
| train_stop | Stop training |

Infrastructure

| Tool | Description |
|------|-------------|
| infra_list_gpus | List available GPUs with pricing |
| infra_provision | Provision cloud instance |
| infra_terminate | Terminate instance |

Remote VPS

| Tool | Description |
|------|-------------|
| vps_register | Register a VPS (host, user, key, GPU info, monthly cost) |
| vps_list | List all registered VPS machines |
| vps_status | Check VPS status (online, GPU, running jobs) |
| vps_unregister | Remove a VPS from registry |
| vps_setup | Install training dependencies on VPS |
| vps_sync | Sync dataset to VPS |
| vps_run | Run command on VPS |
| vps_logs | Get training logs from VPS |
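
Since the README says datasets sync via rsync and training runs in tmux, the manual equivalent of vps_sync and vps_run looks roughly like this; host, paths, and session name are placeholders:

```python
import subprocess

host = "user@hetzner-01"   # placeholder; comes from vps_register in practice

# vps_sync: push the dataset over rsync
subprocess.run(["rsync", "-az", "train.jsonl", f"{host}:ml-lab/data/"],
               check=True)

# vps_run: start training in a detached tmux session so the job survives
# SSH disconnects; vps_logs would later read its output
subprocess.run(["ssh", host, "tmux", "new-session", "-d", "-s", "mllab",
                "python ml-lab/train.py"], check=True)
```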

Ollama

| Tool | Description |
|------|-------------|
| ollama_status | Check Ollama status (running, version, GPU) |
| ollama_list | List models in Ollama |
| ollama_pull | Pull model from registry |
| ollama_deploy | Deploy GGUF to Ollama |
| ollama_chat | Chat with a model |
| ollama_delete | Delete a model |
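
The manual equivalent of ollama_deploy with the stock Ollama CLI is a Modelfile pointing at the GGUF followed by `ollama create`; the model name and file below are placeholders:

```python
import pathlib
import subprocess

# Placeholder model name and GGUF path.
pathlib.Path("Modelfile").write_text("FROM ./llama3.1-support.gguf\n")
subprocess.run(["ollama", "create", "support-bot", "-f", "Modelfile"],
               check=True)
# Smoke-test the deployed model with a one-off prompt.
subprocess.run(["ollama", "run", "support-bot",
                "How do I reset my password?"], check=True)
```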

Open WebUI

| Tool | Description |
|------|-------------|
| owui_status | Check Open WebUI connection |
| owui_list_models | List model configurations |
| owui_create_model | Create model preset (system prompt, params) |
| owui_delete_model | Delete model configuration |
| owui_list_knowledge | List knowledge bases |
| owui_create_knowledge | Create knowledge base |
| owui_add_knowledge_file | Add file to knowledge base |
| owui_chat | Chat through Open WebUI |

Security

| Tool | Description |
|------|-------------|
| security_audit_log | View recent audit log entries |
| security_audit_summary | Get audit activity summary |
| security_tailscale_status | Check Tailscale VPN connection |
| security_ssh_key_rotate | Rotate SSH key for a VPS |
| creds_expiry_check | Check credential expiry status |
| creds_rotate | Rotate credentials for a provider |

Codex Integration (Executor LLM)

| Tool | Description |
|------|-------------|
| codex_status | Check if Codex CLI is available |
| codex_analyze_error | Have Codex analyze errors and suggest fixes |
| codex_generate_training_script | Generate training script from experiment config |
| codex_fix_code | Have Codex fix issues in training code |
| codex_optimize_config | Optimize training config for quality/speed/memory |
| codex_debug_training | Debug training issues from logs |
| codex_run | Run arbitrary task with Codex |

Architecture: the design separates intelligence from execution:

  • Planner LLM (Claude): Reasoning, architecture, tradeoffs
  • Executor LLM (Codex): Precise code edits + CLI work
  • World tools (ML Lab): Data, training, infra, deployment

Deep Thinking Analysis (Ollama Reasoning Models)

| Tool | Description |
|------|-------------|
| thinking_analyze | Run deep analysis with reasoning models (DeepSeek R1, QwQ) |
| thinking_schedule | Schedule automated analysis (after training, daily, weekly) |
| thinking_reports | List and retrieve analysis reports |

Analysis Types:

  • training - Analyze loss curves, convergence, hyperparameters
  • experiment - Compare experiments, suggest improvements
  • activity - Review audit logs for patterns/anomalies
  • cost - Cost efficiency analysis and recommendations
  • dataset - Data quality and training suitability

Schedule Frequencies:

  • after_training - Auto-analyze when training completes
  • hourly / daily / weekly - Periodic reviews

Reports are stored in ~/.cache/ml-lab/reports/ for historical reference.

Example Workflow

```text
User: Fine-tune Llama 3.1 8B on my customer support data

Claude: I'll help you fine-tune Llama 3.1 8B. Let me first check your data.

[Uses dataset_register to load ./support_data.jsonl]

Found 15,000 samples. Let me estimate the costs:

[Uses train_estimate]

| Provider      | Time   | Cost   | Availability |
|---------------|--------|--------|--------------|
| vps:hetzner-01| ~4h    | $0.28* | Idle         |
| Local (4090)  | ~12h   | $0     | Ready        |
| Lambda H100   | ~2h    | $5.50  | 3 available  |
| Together API  | ~4h    | $12    | Queued       |

*amortized from $200/mo

Which would you prefer?

User: Use Lambda Labs

Claude: [Uses experiment_create, then train_launch with backend=lambda]

Training started! Run ID: abc123
I'll monitor progress and let you know when it completes.
```

Architecture

```text
src/ml_lab/
├── server.py           # MCP server entry point (61 tools)
├── credentials.py      # Encrypted credential vault
├── cli.py              # Command-line interface
├── backends/
│   ├── base.py         # Training backend interface
│   ├── local.py        # Local GPU training
│   ├── mistral_api.py  # Mistral fine-tuning API
│   ├── together_api.py # Together AI API
│   ├── openai_api.py   # OpenAI fine-tuning API
│   └── vertex_api.py   # Google Vertex AI (Gemini)
├── cloud/
│   ├── base.py         # Cloud provider interface
│   ├── lambda_labs.py  # Lambda Labs integration
│   ├── runpod.py       # RunPod integration
│   ├── modal_provider.py # Modal integration
│   └── remote_vps.py   # Generic SSH VPS support (+ Tailscale)
├── storage/
│   ├── datasets.py     # Dataset management
│   └── experiments.py  # Experiment tracking
├── inference/
│   ├── ollama.py       # Ollama integration
│   ├── openwebui.py    # Open WebUI integration
│   └── thinking.py     # Deep thinking analysis (DeepSeek R1, QwQ)
├── integrations/
│   └── codex.py        # Codex CLI integration (executor LLM)
├── security/
│   └── audit.py        # Audit logging
└── evals/
    └── benchmarks.py   # Evaluation suite
```

Security

  • Credentials encrypted with Fernet (AES-128-CBC)
  • PBKDF2-SHA256 key derivation (480,000 iterations)
  • Vault file permissions set to 600 (owner read/write only)
  • API keys never logged or transmitted unencrypted
  • Audit logging: All sensitive operations logged to ~/.cache/ml-lab/audit.log
  • Credential expiry: Automatic tracking with rotation reminders
  • Tailscale support: Optional VPN requirement for VPS connections
  • SSH key rotation: Automated rotation with rollback on failure

Supported Providers

Compute Providers

  • Lambda Labs (H100, A100, A10)
  • RunPod (H100, A100, RTX 4090)
  • Modal (serverless GPU functions)

Fine-Tuning APIs

  • Mistral AI (Mistral, Mixtral, Codestral)
  • Together AI (Llama, Mistral, Qwen)
  • OpenAI (GPT-4o, GPT-3.5)
  • Google Vertex AI (Gemini 1.5 Pro, Gemini 1.5 Flash)

Model Hubs

  • Hugging Face Hub
  • Replicate
  • Ollama (local GGUF models)

Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.

License

PolyForm Noncommercial 1.0.0 - free for personal use, contact for commercial licensing.

See LICENSE for details.
