fine-tuning-os

The Zero-Data Model Context Protocol control plane for LLM fine-tuning — 64 tools across 10 dimensions to prepare, build, train in the client enclave, evaluate, secure, package, and deliver a fine-tuned model — without ever seeing the client's data.


Table of Contents

Overview
Zero-Data Contract
Architecture
Install
Run
Configuration
Tool Catalogue
Testing
Security Notes
Contributing
License

Overview

fine-tuning-os is an MCP server that exposes 64 domain tools (plus 1 health tool) covering the entire LLM fine-tuning delivery workflow. It integrates into any MCP-compatible host (Claude Desktop, Claude Code, or a custom orchestrator) and requires no secrets at boot.

Tools that require external services (SSH, HuggingFace, SFTP, SMTP, Slack, registries) advertise their requirements via a dry_run response rather than failing silently or faking execution. This means you get a fully operational server and actionable CLI commands from day one, and can progressively enable live execution by setting environment variables.

✨ Highlights

64 tools / 10 dimensions. prep · synthetic · pipeline · execution · evaluation · security · packaging · docs · client · maintenance — the full fine-tuning delivery lifecycle, callable from any MCP host.
Zero-Data by construction. C1/C3 tools cannot open a socket. C2 tools return a dry run (the exact command, with env-name placeholders) until you set the required env var, and never fake success. Enforced by tests/test_zero_data.py on every CI run.
Trains where the data lives. The server embeds no torch/unsloth; heavy GPU work runs in the client enclave (or a routed engine) — only sanitized metrics/logs come back.
Real artifacts you own. AES-256-GCM encrypted deliverables + SHA256, French-law contract / NDA / data-destruction-certificate templates, performance & security reports — generated, not black-boxed.
Companion skill. A fine-tuning-os Claude skill (SKILL.md + 16 references) maps every phase to the exact tool, with go/no-go gates and a Zero-Data playbook.
657 tests, ≥95% coverage, ruff + black + mypy clean, Hypothesis property tests + mutation config, CI on Python 3.10–3.13 across Linux / macOS / Windows.

Zero-Data Contract

Every tool belongs to one of three classes:

Class	Behaviour	Network	Secrets required
C1 — Pure/Offline	Generates text, configs, or analysis from local state only	Never	None
C2 — Emit/Dry-run	Builds and returns an actionable command or payload; if the required env var is absent returns `meta.executed=False, meta.dry_run=True` and never fakes execution	Only when env is configured	Optional (enables live mode)
C3 — Static Audit	Reads local files/config and returns a structured report	Never	None

Guarantees enforced by tests/test_zero_data.py on every CI run:

C1 and C3 tools cannot open sockets (socket patched to raise on any attempt).
C2 tools with no env configured return executed=False, dry_run=True and open no sockets.
65 tools registered at server boot with zero env vars set.
No file written outside the configured workspace root (FTOS_WORKSPACE).
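
The socket guarantees above can be checked by swapping `socket.socket` for a stub that raises. The sketch below is an assumption about how such a check could look, not the contents of tests/test_zero_data.py; `NetworkBlocked`, `no_network`, `offline_tool`, `leaky_tool`, and `opens_socket` are hypothetical names introduced for illustration.

```python
import socket
from contextlib import contextmanager


class NetworkBlocked(RuntimeError):
    """Raised when code tries to create a socket while the guard is active."""


@contextmanager
def no_network():
    """Temporarily replace socket.socket so any socket creation fails loudly."""
    original = socket.socket

    def _blocked(*args, **kwargs):
        raise NetworkBlocked("socket creation attempted inside a Zero-Data check")

    socket.socket = _blocked
    try:
        yield
    finally:
        socket.socket = original  # always restore the real implementation


def offline_tool(text: str) -> dict:
    """Hypothetical C1-style tool: pure computation on local input."""
    return {"data": {"length": len(text)}, "meta": {"executed": True, "dry_run": False}}


def leaky_tool() -> None:
    """Hypothetical tool that violates the contract by creating a socket."""
    socket.socket(socket.AF_INET, socket.SOCK_STREAM)


def opens_socket(func, *args) -> bool:
    """Return True if calling func attempts to create a socket."""
    with no_network():
        try:
            func(*args)
        except NetworkBlocked:
            return True
    return False
```

With this guard, `opens_socket(offline_tool, "abc")` reports no socket use, while `opens_socket(leaky_tool)` flags the violation without any packet leaving the machine.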

Architecture

```mermaid
flowchart TB
    subgraph Host["MCP Host (Claude Code / Claude Desktop)"]
        CC["Claude Code"]
    end

    subgraph Server["fine-tuning-os MCP Server (stdio)"]
        S["server.py<br/>FastMCP + 65 tools"]

        subgraph Socle["Socle / Infrastructure"]
            ST["store.py<br/>Filesystem abstraction"]
            TG["targets.py<br/>gate() — env-based C2 activation"]
            MD["models.py<br/>Response dataclasses"]
            CR["crypto.py<br/>AES-256-GCM encryption"]
            SN["sanitize.py<br/>Secret / PII stripping"]
            RE["render.py<br/>Markdown to PDF"]
        end

        subgraph Tools["10 Tool Modules"]
            T1["prep<br/>9 tools"]
            T2["synthetic<br/>1 tool"]
            T3["pipeline<br/>7 tools"]
            T4["execution<br/>8 tools"]
            T5["evaluation<br/>7 tools"]
            T6["security<br/>6 tools · C3"]
            T7["packaging<br/>8 tools"]
            T8["docs<br/>8 tools"]
            T9["client<br/>6 tools"]
            T10["maintenance<br/>4 tools"]
        end
    end

    subgraph Boundary["Zero-Data Boundary"]
        direction LR
        ZD["C1/C3: socket = BLOCKED<br/>C2: dry_run when no env<br/>All writes: FTOS_WORKSPACE only"]
    end

    subgraph Enclave["Client Enclave (optional)"]
        HF["HuggingFace API"]
        SSH["Remote GPU server<br/>SSH"]
        REG["Container Registry"]
        SFTP["SFTP / SMTP / Slack"]
    end

    CC <-->|"MCP stdio protocol"| S
    S --> Socle
    S --> Tools
    Tools --> Boundary
    Boundary -.->|"C2 live mode<br/>only when env set"| Enclave
```

The server registers all 65 tools at startup. C2 tools call gate() from targets.py to check whether the required environment variable is set; if not, they return the dry-run command without touching the network.
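
The gating pattern described above can be sketched as follows. This is a simplified illustration, not the code in targets.py: the `gate()` signature, the `runner` parameter, the `data`/`meta` response layout beyond `executed` and `dry_run`, and the exact command string are all assumptions.

```python
import os
from typing import Callable, Mapping, Optional


def gate(env_var: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when env_var is set to a non-empty value (hypothetical signature)."""
    env = os.environ if environ is None else environ
    return bool(env.get(env_var, "").strip())


def cache_base_model_sketch(
    model_id: str,
    environ: Optional[Mapping[str, str]] = None,
    runner: Optional[Callable[[str], None]] = None,
) -> dict:
    """Illustrative C2 tool: always builds the command, runs it only when gated open."""
    # The command names the variable; the secret value is never interpolated.
    command = f"huggingface-cli download {model_id} --token $HF_TOKEN"

    if not gate("HF_TOKEN", environ) or runner is None:
        # Dry run: report the command honestly instead of pretending it ran.
        return {"data": {"command": command}, "meta": {"executed": False, "dry_run": True}}

    runner(command)  # live mode: hand the command to an injected executor
    return {"data": {"command": command}, "meta": {"executed": True, "dry_run": False}}
```

Injecting `runner` keeps the sketch testable without touching the network; a real tool would decide for itself how the command is executed.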

Install

```bash
# Clone
git clone https://github.com/Casius999/fine-tuning-os.git
cd fine-tuning-os

# Create virtual environment (Python 3.10+)
python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # Linux / macOS

# Install (dev mode with test dependencies)
pip install -e ".[dev]"
```

Optional PDF export support (requires system libraries):

```bash
pip install -e ".[pdf]"
```

Run

stdio transport (Claude Desktop / Claude Code)

```bash
python -m fine_tuning_os
# or: fine-tuning-os
```

Claude Desktop config (`claude_desktop_config.json`)

```json
{
  "mcpServers": {
    "fine-tuning-os": {
      "command": "python",
      "args": ["-m", "fine_tuning_os"],
      "env": {
        "FTOS_WORKSPACE": "/path/to/your/workspace"
      }
    }
  }
}
```

Configuration

All configuration is through environment variables. Setting none of them is valid — the server starts and all tools respond (C2 tools return dry-run commands).

Variable	Class	Description	Default
`FTOS_WORKSPACE`	All	Root directory for all project files	`./ftos-workspace`
`FTOS_LOCAL_PYTHON`	C2	Path to Python interpreter for local training/merge/quantize	—
`HF_TOKEN`	C2	Hugging Face token for `cache_base_model`, checkpoint download	—
`FTOS_SSH_HOST`	C2	Remote training server hostname	—
`FTOS_SSH_KEY`	C2	Path to SSH private key for remote operations	—
`FTOS_REGISTRY`	C2	Container registry URL for `push_docker_to_registry`	—
`FTOS_REGISTRY_TOKEN`	C2	Registry authentication token	—
`FTOS_SFTP_HOST`	C2	SFTP host for `upload_deliverable`	—
`FTOS_SFTP_USER`	C2	SFTP username	—
`FTOS_SFTP_KEY`	C2	Path to SFTP private key	—
`FTOS_SMTP_HOST`	C2	SMTP host for `send_status_update`	—
`FTOS_SMTP_USER`	C2	SMTP username	—
`FTOS_SMTP_PASSWORD`	C2	SMTP password	—
`FTOS_SLACK_WEBHOOK`	C2	Slack incoming webhook URL for notifications	—
`FTOS_CALENDLY_TOKEN`	C2	Calendly API token for `schedule_meeting`	—
`FTOS_GIT_REMOTE`	C2	Git remote URL for `self_update`	—
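
Before launching the server, it can help to see which C2 capabilities would run live and which would stay in dry-run mode. The sketch below checks the variables from the table above. The grouping into capability names and the rule that every variable in a group must be set are assumptions for illustration; the server may apply different requirements.

```python
import os
from typing import Mapping, Optional

# Variable names come from the configuration table; the grouping is an assumption.
CAPABILITIES = {
    "local_python": ["FTOS_LOCAL_PYTHON"],
    "huggingface": ["HF_TOKEN"],
    "ssh": ["FTOS_SSH_HOST", "FTOS_SSH_KEY"],
    "registry": ["FTOS_REGISTRY", "FTOS_REGISTRY_TOKEN"],
    "sftp": ["FTOS_SFTP_HOST", "FTOS_SFTP_USER", "FTOS_SFTP_KEY"],
    "smtp": ["FTOS_SMTP_HOST", "FTOS_SMTP_USER", "FTOS_SMTP_PASSWORD"],
    "slack": ["FTOS_SLACK_WEBHOOK"],
    "calendly": ["FTOS_CALENDLY_TOKEN"],
    "git_remote": ["FTOS_GIT_REMOTE"],
}


def live_capabilities(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Map each capability to True when all of its variables are set and non-empty."""
    env = os.environ if environ is None else environ
    return {
        name: all(env.get(var, "").strip() for var in variables)
        for name, variables in CAPABILITIES.items()
    }


if __name__ == "__main__":
    for name, live in live_capabilities().items():
        print(f"{name:<14}{'live' if live else 'dry-run'}")
```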

Tool Catalogue

prep — Data Preparation (9 tools, C1/C2)

Tool	Class	Description
`create_training_config`	C1	Generate a full training configuration (LoRA, hyperparams, scheduler)
`cache_base_model`	C2	Emit `huggingface-cli download` command or execute if HF_TOKEN set
`generate_requirements`	C1	Produce `requirements.txt` for a given framework (unsloth, trl, etc.)
`create_project_structure`	C1	Scaffold a project directory tree under workspace
`load_project_template`	C1	Load and render a named project template
`describe_expected_data_format`	C1	Return schema documentation for a task type
`validate_data_schema`	C1	Validate a dataset sample against the expected schema
`anonymize_dataset_preview`	C1	Mask PII in a dataset sample for safe preview
`split_dataset_config`	C1	Generate train/eval/test split configuration

synthetic — Synthetic Data (1 tool, C1)

Tool	Class	Description
`generate_synthetic_dataset`	C1	Generate a synthetic instruction-tuning dataset from a schema

pipeline — Local Pipeline (7 tools, C1/C2)

Tool	Class	Description
`build_docker_image`	C2	Emit `docker build` command or execute if Docker configured
`test_docker_build`	C2	Emit `docker run` smoke-test command
`run_local_synthetic_train`	C2	Emit local training command via `FTOS_LOCAL_PYTHON`
`get_local_metrics`	C1	Parse and return metrics from a local training log file
`dry_run_remote_config`	C1	Validate remote training config without connecting
`optimize_hyperparams`	C1	Suggest hyperparameter adjustments based on metrics
`generate_unit_tests`	C1	Generate pytest unit tests for a training script

execution — Remote Execution (8 tools, C1/C2)

Tool	Class	Description
`push_docker_to_registry`	C2	Emit `docker push` command or execute if registry configured
`generate_deployment_command`	C1	Build deployment command string for a given engine and host
`trigger_remote_training`	C2	SSH-trigger training job or emit command if SSH not configured
`stream_remote_logs`	C2	SSH-tail training logs or emit SSH command
`monitor_training_metrics`	C2	SSH-poll metrics endpoint or emit monitoring command
`detect_anomalies`	C1	Analyse a metrics series and flag anomalies
`pause_resume_training`	C2	SSH-send pause/resume signal or emit command
`early_stopping_check`	C1	Evaluate early-stopping criteria from a metrics snapshot

evaluation — Model Evaluation (7 tools, C1/C2)

Tool	Class	Description
`download_checkpoint_metadata`	C2	Fetch checkpoint metadata from remote or emit command
`evaluate_on_synthetic`	C1	Run evaluation loop on synthetic dataset locally
`evaluate_on_validation_set`	C2	Run evaluation on remote validation set or emit command
`compute_metrics`	C1	Compute BLEU, ROUGE, and task-specific metrics
`generate_predictions_sample`	C1	Generate a sample of model predictions for review
`compare_to_baseline`	C1	Compare current metrics to a stored baseline
`bias_fairness_scan`	C1	Run bias and fairness checks on evaluation outputs

security — Security Auditing (6 tools, C3)

Tool	Class	Description
`audit_code_no_network`	C3	Static security scan of training code (no network)
`audit_dockerfile_security`	C3	Audit a Dockerfile for security misconfigurations
`scan_data_leakage_risk`	C3	Scan dataset for PII and data-leakage patterns
`verify_model_license`	C3	Verify model license compatibility for commercial use
`generate_security_report`	C3	Aggregate audit results into a structured security report
`sanitize_logs_for_claude`	C3	Strip secrets and PII from logs before sharing with Claude
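
To illustrate what log sanitization involves, here is a minimal regex-based sketch. The patterns and the `[REDACTED]` markers are assumptions chosen for the example; they are not the rules implemented in sanitize.py, and a production sanitizer would need a much broader pattern set.

```python
import re

# Each entry pairs a compiled pattern with its replacement (illustrative only).
_PATTERNS = [
    # key=value or key: value where the key name ends in a credential-like word
    (
        re.compile(
            r"\b([\w-]*(?:token|password|secret|api[_-]?key))(\s*[=:]\s*)\S+",
            re.IGNORECASE,
        ),
        r"\1\2[REDACTED]",
    ),
    # bare strings shaped like Hugging Face tokens: hf_ plus a long alphanumeric run
    (re.compile(r"\bhf_[A-Za-z0-9]{20,}\b"), "[REDACTED]"),
    # e-mail addresses
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[REDACTED_EMAIL]"),
]


def sanitize(text: str) -> str:
    """Apply every redaction pattern in order and return the cleaned text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Ordinary metric lines such as `loss=0.42 step=10` pass through unchanged, while a line like `HF_TOKEN=...` keeps its key name and loses its value.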

packaging — Model Packaging (8 tools, C1/C2)

Tool	Class	Description
`merge_lora_weights`	C2	Emit merge command or execute via `FTOS_LOCAL_PYTHON`
`quantize_model`	C2	Emit quantization command (GGUF/GPTQ/AWQ) or execute
`build_inference_container`	C2	Write Dockerfile to workspace and emit `docker build` command
`generate_inference_config`	C1	Generate vLLM/SGLang/TGI inference configuration
`test_inference_api`	C2	Emit curl test command or execute against live endpoint
`encrypt_deliverable`	C1	Encrypt a deliverable file with AES-256-GCM and return key hex
`upload_deliverable`	C2	Emit SFTP upload command or execute if SFTP configured
`generate_delivery_note`	C1	Generate a signed delivery note document

docs — Documentation (8 tools, C1)

Tool	Class	Description
`generate_contract`	C1	Generate a service contract from project metadata
`generate_nda`	C1	Generate a non-disclosure agreement
`generate_performance_report`	C1	Generate a full training performance report
`generate_user_guide`	C1	Generate end-user guide for a fine-tuned model
`generate_deployment_guide`	C1	Generate deployment and operations guide
`generate_destruction_certificate`	C1	Generate data destruction certificate (GDPR)
`export_document_pdf`	C1	Render a markdown document to PDF locally
`sign_document`	C1	Hash-sign a document and return verification metadata
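
A hash-based signature of the kind `sign_document` describes could look like the sketch below. It assumes SHA-256 as the digest and an illustrative metadata layout; the fields actually returned by the tool are not documented here, and both function names are hypothetical.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def sign_document_sketch(content: bytes) -> dict:
    """Hash the document and return metadata that a recipient can verify later."""
    return {
        "algorithm": "sha256",
        "digest": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "signed_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_document_sketch(content: bytes, metadata: dict) -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(expected, metadata["digest"])
```

A plain hash proves that the document has not changed since signing; it does not prove who produced it, which would require a keyed or asymmetric signature.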

client — Client Management (6 tools, C1/C2)

Tool	Class	Description
`onboard_client`	C1	Create client project record and onboarding checklist
`send_status_update`	C2	Send status email/Slack or emit message if not configured
`schedule_meeting`	C2	Create Calendly event or emit scheduling command
`log_project_event`	C1	Append a timestamped event to the project log
`request_client_approval`	C1	Generate an approval request document
`generate_invoice`	C1	Generate a project invoice from billing metadata

maintenance — Maintenance (4 tools, C1/C2)

Tool	Class	Description
`check_model_rot`	C1	Analyse metric drift to detect model rot
`suggest_retraining`	C1	Recommend retraining schedule based on drift analysis
`update_base_model`	C1	Generate update plan for a new base model version
`self_update`	C2	Emit `git pull` command or execute if `FTOS_GIT_REMOTE` set

health (1 tool)

Tool	Class	Description
`ftos_health`	C1	Return server version, tool count, and workspace status

Testing

```bash
# Full suite with coverage
pytest --cov=src/fine_tuning_os --cov-report=term-missing --cov-fail-under=95

# Zero-Data invariant tests only
pytest tests/test_zero_data.py -v

# Tool registration check (65 tools)
pytest tests/test_registration.py -v

# Run the synthetic demo bundle (no network, no secrets needed)
python scripts/demo_bundle.py
```

Coverage gate: ≥95% (CI enforced).

Test structure (tests/):

```text
tests/
├── conftest.py              # workspace / store / project_id fixtures
├── test_registration.py     # 65-tool registration check
├── test_zero_data.py        # Zero-Data invariants (C1/C2/C3 × network × filesystem)
├── test_prep.py
├── test_synthetic.py
├── test_pipeline.py
├── test_execution.py
├── test_evaluation.py
├── test_security.py
├── test_packaging.py        # TDD + confinement regression
├── test_docs.py
├── test_client.py
├── test_maintenance.py
├── test_error_paths.py      # error-path coverage (OSError, TemplateError, missing-project, bad-crypto)
└── test_property.py         # Hypothesis property-based tests (sanitize, crypto, metrics, Store)
```

Security Notes

No secret on disk. All credentials are read from environment variables at call time via targets.py:gate(). No secret is ever written to files or returned in tool output values.
Filesystem confinement. Every tool that writes files resolves the destination through Store.project_dir(project_id), anchored under FTOS_WORKSPACE. Writing outside is rejected with an explicit error.
Sanitize before returning. Use sanitize_logs_for_claude to strip secrets and PII from logs before passing output to any LLM.
C2 dry_run is safe. The returned command string contains only env var name references (e.g., $HF_TOKEN), never literal secret values.
No network for C1/C3. Verified by the test suite on every CI run.
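
The filesystem confinement rule above can be illustrated with a path check that resolves the destination and verifies it stays under the workspace root. This is a sketch under stated assumptions: `safe_path` and `ConfinementError` are hypothetical stand-ins, not the actual `Store` API.

```python
from pathlib import Path


class ConfinementError(ValueError):
    """Raised when a requested path would land outside the allowed directory."""


def safe_path(root: Path, project_id: str, relative: str) -> Path:
    """Resolve root/project_id/relative and reject anything that escapes."""
    root = root.resolve()

    # The project directory itself must sit under the workspace root.
    base = (root / project_id).resolve()
    if not base.is_relative_to(root):
        raise ConfinementError(f"project directory escapes workspace: {project_id!r}")

    # resolve() collapses '..' segments, so traversal attempts become visible here.
    target = (base / relative).resolve()
    if not target.is_relative_to(base):
        raise ConfinementError(f"path escapes project directory: {relative!r}")
    return target
```

Checking the resolved path rather than the raw string means inputs such as `../../etc/passwd` or an absolute path are rejected even though they never contain the workspace prefix.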

Found a vulnerability? See SECURITY.md — report privately, do not open a public issue.

Contributing

Contributions are welcome! Please read CONTRIBUTING.md and our Code of Conduct. Commits follow Conventional Commits.

Legal Notice

This software is provided as a technical assistance tool. It does not constitute legal, tax, or professional advice. The generated documents (contracts, NDAs, invoices) are templates that must be reviewed by a qualified professional before any use. The user remains solely responsible for how they use the tools and the outputs produced.

License
