CLAIR — Cascaded Lazy AI Routing

An MCP server that reduces token burn by lazily loading skills and tools only when needed, and routing repetitive subtasks to ML backends instead of the LLM.

Why CLAIR?

Loading all MCP tools and skill documents upfront can consume 5,000–15,000 tokens before the user's first message is processed. CLAIR introduces a 280-token always-on router that:

  1. Classifies the incoming task
  2. Returns only the skills and tools relevant to that task
  3. Identifies subtasks that can bypass the LLM entirely via small ML models

Proven Results — A/B Test

A controlled A/B test was run against a real Travel Planner web application across 50 tasks spanning 7 task categories (travel planning, UI changes, debugging, styling, state management, API development, button interactions).

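| Metric | Control (full load) | CLAIR (lazy) | Improvement |
| --- | --- | --- | --- |
| Avg tokens/request | 9,930 | 2,719 | −72.6% |
| Skills loaded | 15 (always) | 3.8 (avg) | −74.7% |
| Best category (tokens/request) | 9,930 | 559 | −94.4% |
| Worst category (tokens/request) | 9,930 | 4,361 | −56.1% |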

Real LLM API calls (OpenRouter) confirmed the estimates:

  • Control: 9,740 real prompt tokens per request
  • CLAIR travel query: 328 real prompt tokens (−96.6% vs control)
  • CLAIR dev task: 2,982 real prompt tokens (−69.4% vs control)
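
The percentages above can be reproduced from the raw token counts. The snippet below is only an arithmetic check; `percentChange` is a hypothetical helper name, not part of CLAIR.

```typescript
// Percentage change in tokens relative to the control, rounded to one decimal place.
function percentChange(control: number, treatment: number): number {
  return Math.round(((treatment - control) / control) * 1000) / 10;
}

// Token counts copied from the A/B table and the real API calls above.
const estimatedChange = percentChange(9930, 2719); // average request (estimated)
const travelChange = percentChange(9740, 328);     // real travel query
const devChange = percentChange(9740, 2982);       // real dev task

console.log(estimatedChange, travelChange, devChange);
```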

→ See the full report: demo/travel-planner/REPORT.md
→ Run the demo: demo/travel-planner/

Architecture

User Request → CLAIR Router (~280 tokens) → Domain Skill → Cascade Skill → LLM
                                           ↘ ML Backend (for repetitive tasks)

Installation

git clone https://github.com/concensure/clair-mcp-server.git
cd clair-mcp-server
npm install
npm run build
# Compiled output: dist/stdio.js (stdio transport) and dist/server.js (HTTP transport)

Compatibility — Works with Any MCP Client

CLAIR is client-agnostic. It works with any MCP-compatible AI coding assistant:

| Client | LLM | Status |
| --- | --- | --- |
| Claude Desktop | Claude | ✅ Full support |
| Claude Code | Claude | ✅ Full support |
| Kilo Code | OpenRouter / any | ✅ Full support |
| Cursor | OpenAI / Anthropic | ✅ Full support (via MCP) |
| Any MCP client | Any | ✅ Full support |

Kilo Code users: CLAIR works with Kilo Code + OpenRouter. Call clair_route before attaching skill/rule documents. CLAIR's output tells you exactly which documents to attach for the current task — compatible with Kilo Code's skill document attachment feature.

Rule documents: CLAIR's manifest can point to any markdown file — skill documents, Kilo Code .rules files, or custom instruction files. No modification required.

MCP tool savings: CLAIR also reduces MCP tool token costs. Loading 20 tools upfront costs roughly 4,000–8,000 tokens; CLAIR selects only the tools each request needs, cutting tool context to roughly 800–1,200 tokens (75–90% savings). See RFC-CLAIR.md §8.3 for details.


Usage — stdio Transport (Universal)

stdio is the standard transport for local MCP servers across all major AI coding clients. After building, use dist/stdio.js as the entry point.

Kilo Code (mcp_settings.json)

{
  "mcpServers": {
    "clair": {
      "command": "node",
      "args": ["/path/to/clair-mcp-server/dist/stdio.js"],
      "alwaysAllow": ["clair_route", "clair_list_skills", "clair_offload"]
    }
  }
}

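File location: %APPDATA%\Code\User\globalStorage\kilocode.kilo-code\settings\mcp_settings.json (Windows), ~/.config/Code/User/globalStorage/kilocode.kilo-code/settings/mcp_settings.json (Linux), or ~/Library/Application Support/Code/User/globalStorage/kilocode.kilo-code/settings/mcp_settings.json (macOS)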

Claude Desktop (claude_desktop_config.json)

{
  "mcpServers": {
    "clair": {
      "command": "node",
      "args": ["/path/to/clair-mcp-server/dist/stdio.js"]
    }
  }
}

Claude Code (CLI)

claude mcp add clair node /path/to/clair-mcp-server/dist/stdio.js

Cursor (.cursor/mcp.json)

{
  "mcpServers": {
    "clair": {
      "command": "node",
      "args": ["/path/to/clair-mcp-server/dist/stdio.js"]
    }
  }
}

OpenAI Codex / Any MCP-compatible client

{
  "mcpServers": {
    "clair": {
      "command": "node",
      "args": ["/path/to/clair-mcp-server/dist/stdio.js"]
    }
  }
}

Windows paths: Use double backslashes: "C:\\Users\\you\\clair-mcp-server\\dist\\stdio.js"
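
If you generate the config file from a script instead of editing it by hand, the JSON serializer handles the escaping. The sketch below assumes nothing about CLAIR itself; `mcpConfigFor` is a hypothetical helper that emits the same structure as the examples above.

```typescript
// Build an MCP client config entry. JSON.stringify escapes backslashes in the
// output, so the resulting file contains the doubled form shown above.
function mcpConfigFor(entryPoint: string): string {
  return JSON.stringify(
    { mcpServers: { clair: { command: "node", args: [entryPoint] } } },
    null,
    2
  );
}

// Backslashes are also escaped here because this is a TypeScript string literal;
// the in-memory string holds single backslashes.
const windowsConfig = mcpConfigFor("C:\\Users\\you\\clair-mcp-server\\dist\\stdio.js");
console.log(windowsConfig);
```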

HTTP Transport (Remote / Team Deployment)

For remote deployment (Railway, Fly.io, Render), use dist/server.js which exposes an HTTP endpoint at /mcp. See HOSTING.md.

Available Tools

clair_route

Classifies a task and returns the minimal set of skills and MCP tools to load.

{
  "task_description": "Write a Python script to analyze sales data from a CSV",
  "prefer_ml_offload": true
}

Returns:

{
  "domains": ["coding", "data_analysis"],
  "load_skills": [
    { "id": "coding", "path": "skills/domains/coding.md", "token_cost": 420 },
    { "id": "data", "path": "skills/domains/data.md", "token_cost": 340 },
    { "id": "python", "path": "skills/cascades/coding/python.md", "token_cost": 320 }
  ],
  "load_tools": [
    { "id": "filesystem", "reason": "Required by coding skill" },
    { "id": "python_exec", "reason": "Required by data skill" }
  ],
  "ml_candidates": [],
  "estimated_tokens_saved": 1840,
  "routing_confidence": 0.85
}
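
A client that consumes this response may want typed access to it. The interfaces below are inferred from the example response only, so the real schema may contain more fields or different types; `skillTokenCost` is a hypothetical helper.

```typescript
// Shapes inferred from the example response above (assumption, not an official schema).
interface SkillRef { id: string; path: string; token_cost: number; }
interface ToolRef { id: string; reason: string; }
interface RouteResult {
  domains: string[];
  load_skills: SkillRef[];
  load_tools: ToolRef[];
  ml_candidates: unknown[];
  estimated_tokens_saved: number;
  routing_confidence: number;
}

// Total context cost of the skill documents the router selected.
function skillTokenCost(result: RouteResult): number {
  return result.load_skills.reduce((sum, skill) => sum + skill.token_cost, 0);
}

const exampleResult: RouteResult = {
  domains: ["coding", "data_analysis"],
  load_skills: [
    { id: "coding", path: "skills/domains/coding.md", token_cost: 420 },
    { id: "data", path: "skills/domains/data.md", token_cost: 340 },
    { id: "python", path: "skills/cascades/coding/python.md", token_cost: 320 },
  ],
  load_tools: [
    { id: "filesystem", reason: "Required by coding skill" },
    { id: "python_exec", reason: "Required by data skill" },
  ],
  ml_candidates: [],
  estimated_tokens_saved: 1840,
  routing_confidence: 0.85,
};

// Paths to attach to the conversation, plus their combined cost.
console.log(exampleResult.load_skills.map((skill) => skill.path), skillTokenCost(exampleResult));
```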

clair_offload

Routes a repetitive subtask to an ML backend.

{
  "subtask_type": "sentiment_classification",
  "data": ["Great product!", "Terrible service", "It was okay"]
}
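
MCP clients normally build the wire message for you, but it can help to see what a call to this tool looks like as a JSON-RPC `tools/call` request. This is a sketch of the request envelope only; `buildToolCall` is a hypothetical helper, and the shape of the `clair_offload` response is not shown here because this README does not document it.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// The arguments are the example input shown above.
const offloadRequest = buildToolCall(1, "clair_offload", {
  subtask_type: "sentiment_classification",
  data: ["Great product!", "Terrible service", "It was okay"],
});

console.log(JSON.stringify(offloadRequest, null, 2));
```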

clair_list_skills

Lists all available skills, cascades, and ML backends with their token costs.

Skill Tree

skills/
├── router/ROUTER.md          ← always loaded (280 tokens)
├── domains/
│   ├── documents.md          (380 tokens)
│   ├── coding.md             (420 tokens)
│   ├── data.md               (340 tokens)
│   └── research.md           (290 tokens)
└── cascades/
    ├── documents/
    │   ├── docx.md           (560 tokens)
    │   ├── pdf.md            (480 tokens)
    │   └── pptx.md           (510 tokens)
    └── coding/
        ├── python.md         (320 tokens)
        ├── typescript.md     (310 tokens)
        └── testing.md        (280 tokens)

Total if loaded naively: ~4,170 tokens
Average CLAIR-routed load: ~1,040 tokens
Average savings: ~75%
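
The naive total can be checked by summing the per-file costs in the tree. The snippet below does only that arithmetic; the 1,040-token routed average is taken from the figures above rather than computed.

```typescript
// Token costs copied from the skill tree above.
const skillCosts: { path: string; tokens: number }[] = [
  { path: "router/ROUTER.md", tokens: 280 },
  { path: "domains/documents.md", tokens: 380 },
  { path: "domains/coding.md", tokens: 420 },
  { path: "domains/data.md", tokens: 340 },
  { path: "domains/research.md", tokens: 290 },
  { path: "cascades/documents/docx.md", tokens: 560 },
  { path: "cascades/documents/pdf.md", tokens: 480 },
  { path: "cascades/documents/pptx.md", tokens: 510 },
  { path: "cascades/coding/python.md", tokens: 320 },
  { path: "cascades/coding/typescript.md", tokens: 310 },
  { path: "cascades/coding/testing.md", tokens: 280 },
];

// Cost of loading every skill document upfront.
const naiveTotal = skillCosts.reduce((sum, entry) => sum + entry.tokens, 0);

// Average routed load as stated above (not derived here).
const averageRoutedLoad = 1040;
const savingsPercent = Math.round((1 - averageRoutedLoad / naiveTotal) * 100);

console.log(naiveTotal, savingsPercent);
```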

ML Offload Registry

| Task | Backend | Accuracy | Latency |
| --- | --- | --- | --- |
| Sentiment classification | distilbert-sst-2 | 93% | 15 ms |
| Language detection | langdetect | 99% | 2 ms |
| Spell check | pyspellchecker | 97% | 5 ms |
| Named entity extraction | spaCy en_core_web_sm | 91% | 10 ms |
| Text similarity | all-MiniLM-L6-v2 | 89% | 20 ms |
| Anomaly detection | Isolation Forest | 85% | 25 ms |
| Regex extraction | Rules | 99% | 1 ms |
| Tabular classification | sklearn | 88% | 8 ms |

How Routing Works (Zero LLM Tokens)

CLAIR's routing is pure keyword matching — no LLM is invoked at routing time. The developer writes a manifest.json once with trigger keywords for each skill. At request time, CLAIR scans the user's query for those keywords in microseconds.

Token cost of routing: ~280 tokens (router overhead) + 0 LLM tokens
Token cost of NOT routing: 5,000–15,000 tokens (all skills loaded upfront)
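
To illustrate the idea, here is a minimal sketch of keyword routing. It is not CLAIR's actual implementation: the `SkillEntry` shape, the `routeByKeywords` function, and the trigger lists are all assumptions made for the example, and the real manifest schema is the one described in MANIFEST_GUIDE.md. This version only matches single-word triggers.

```typescript
// Hypothetical manifest entry shape (assumption for illustration only).
interface SkillEntry {
  id: string;
  path: string;
  triggers: string[];
  token_cost: number;
}

// Return every skill that has at least one trigger word in the query.
// The query is lowercased and split on non-alphanumeric characters, so
// matching is case-insensitive and whole-word.
function routeByKeywords(query: string, manifest: SkillEntry[]): SkillEntry[] {
  const words = query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
  return manifest.filter((skill) =>
    skill.triggers.some((trigger) => words.indexOf(trigger.toLowerCase()) !== -1)
  );
}

// Example manifest with made-up trigger keywords.
const exampleManifest: SkillEntry[] = [
  { id: "coding", path: "skills/domains/coding.md", triggers: ["script", "code", "function"], token_cost: 420 },
  { id: "data", path: "skills/domains/data.md", triggers: ["csv", "data", "analyze"], token_cost: 340 },
  { id: "pdf", path: "skills/cascades/documents/pdf.md", triggers: ["pdf"], token_cost: 480 },
];

const selected = routeByKeywords(
  "Write a Python script to analyze sales data from a CSV",
  exampleManifest
);
console.log(selected.map((skill) => skill.id));
```

Because the match is a plain lookup over the query's words, no model call is involved; the trade-off is that synonyms and multi-word phrases are missed unless they are listed as triggers.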

→ See MANIFEST_GUIDE.md for how to write the manifest, organize the skill tree, and validate triggers — without consuming any LLM tokens.


Extending CLAIR

Adding a new skill

  1. Create your skill markdown file in skills/domains/ or skills/cascades/
  2. Add an entry to manifest.json
  3. Include trigger keywords and token cost estimate

Adding an ML backend

Add an entry to the ml_offload_registry in manifest.json:

{
  "id": "your_task_id",
  "triggers": ["keyword1", "keyword2"],
  "volume_threshold": 5,
  "backend": "your-model-name",
  "backend_type": "huggingface|sklearn|spacy|rules|python_library",
  "latency_ms": 10,
  "accuracy": 0.90
}
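
The entry above can be typed and checked before it goes into the manifest. The sketch below rests on an assumption this README does not confirm: that `volume_threshold` is the minimum number of items in a batch before offloading is worthwhile. `MlBackendEntry` and `shouldOffload` are hypothetical names, and the trigger list in the example is made up.

```typescript
type BackendType = "huggingface" | "sklearn" | "spacy" | "rules" | "python_library";

// Field names mirror the registry entry above.
interface MlBackendEntry {
  id: string;
  triggers: string[];
  volume_threshold: number;
  backend: string;
  backend_type: BackendType;
  latency_ms: number;
  accuracy: number;
}

// Assumed semantics (not confirmed by the README): offload only when a trigger
// appears in the task text and the batch has at least volume_threshold items.
function shouldOffload(entry: MlBackendEntry, taskText: string, itemCount: number): boolean {
  const text = taskText.toLowerCase();
  const triggered = entry.triggers.some(
    (trigger) => text.indexOf(trigger.toLowerCase()) !== -1
  );
  return triggered && itemCount >= entry.volume_threshold;
}

// Backend, latency and accuracy come from the ML Offload Registry table;
// the triggers and threshold are illustrative.
const sentimentEntry: MlBackendEntry = {
  id: "sentiment_classification",
  triggers: ["sentiment"],
  volume_threshold: 5,
  backend: "distilbert-sst-2",
  backend_type: "huggingface",
  latency_ms: 15,
  accuracy: 0.93,
};

console.log(shouldOffload(sentimentEntry, "Classify the sentiment of these reviews", 50));
```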

Demo

The demo/travel-planner/ directory contains a complete A/B test demo:

  • A Travel Planner web app with real LLM integration (OpenRouter)
  • 15 skills across travel and software development domains
  • Automated 50-task simulation script
  • Full A/B test report with per-category breakdown

Hosting

Recommended: Railway, Fly.io, or Render for remote HTTP transport.
Local dev: stdio transport (default, no server needed).

See HOSTING.md for deployment guides.

RFC

The formal proposal for the CLAIR protocol is in RFC-CLAIR.md.

License

Apache 2.0 — see LICENSE.

Free to use commercially with attribution. See the license comparison for details.
