clark-mcp

clark-mcp

A local, offline MCP server that provides a natural-language interface to Clark, a warehouse workforce RL agent, enabling plan queries, what-if analysis, and decision explanations without any cloud dependencies.

Category
Visit Server

README

clark-mcp

Local, offline natural-language interface to Clark (the warehouse workforce RL agent). Plain English in → real staffing decisions out. No cloud, no API cost, no data egress.

Ask "what's the opening plan for the east dock Tuesday, and what happens if two pickers call off?" — a local LLM turns that into real Clark tool calls and explains the result honestly. Nothing leaves the machine.

What it is

                       ┌── chat ─────────▶ hermes3:8b (Ollama, local)
                       │                        │  tool calls
 browser ──HTTP──▶ web UI                       ▼
                       │                  clark_mcp server  (MCP, stdio)
                       └── staffing sweep ──▶   │
                                                ▼  HTTP
                                          clark serve  (localhost inference API)
                                                │
                                                ▼  real Clark RL inference

Four thin layers, each independently testable, no shared process:

  • clark serve (in the clark repo) — a minimal localhost inference API: 7 stateless read routes (incl. /simulate for staffing-sufficiency outcomes and /capabilities for architectural facts), weights loaded once. Not part of this repo; this repo consumes it.
  • clark_mcp/server.py — a real MCP server (any MCP host can use it) exposing 6 tools: clark_list_facilities, clark_facility_info, clark_capabilities (architectural facts — the model looks them up rather than memorizing), clark_get_plan, clark_what_if, clark_explain_decision.
  • clark_mcp/agent.py — a fully-local client: a Hermes-3-8B model in Ollama drives those tools and explains the result. Zero external calls. Multi-turn aware (history persists across calls).
  • clark_mcp/web/ — a tiny stdlib local web UI (browser → http://127.0.0.1:8765). Chat panel + a staffing-sufficiency sweep dashboard panel that visualizes grade distribution across roster sizes. No JS frameworks, no chart libraries.

Nothing here re-implements inference — every tool delegates to Clark's localhost API over HTTP. clark_explain_decision returns Clark's plan plus the facility's rules as grounding; the explanation is the model's, not Clark policy introspection (Clark is an RL policy — it emits actions, not reasons).

See docs/ARCHITECTURE.md for the full design, tool contracts, the honesty model, and the fine-tune pipeline.

The honesty model (why this is built the way it is)

This system is designed to be a truthful staffing tool, not a confident one. Three rules are enforced structurally, not by prompt alone:

  1. Never invent a plan. Plans/what-ifs always come from a live tool call; a tool error or unknown facility is reported plainly, never papered over with a plausible-looking fabrication.
  2. Explain, don't introspect. Clark is an RL policy. The system describes what it assigned and interprets it against the facility's rules — it never claims to know why the network chose it.
  3. Opening assignment ≠ outcome. A plan is the start-of-day assignment, not a simulated end-of-day grade. What-ifs compare opening assignments across scenarios, not projected results.

A trained Clark genuinely fails some days (a roster can be too thin for its volume). The tool is meant to surface those failures honestly — not to be tuned until it always "wins."

Run (all local)

# 1. Clark inference API (from the clark repo, on a stable checkpoint)
clark serve --model <checkpoint.pt> --facilities-dir clark/data/configs --port 8000

# 2. Ollama with the model pulled
ollama pull hermes3:8b
# (or: ollama run clark-hermes3:ft  after Phase 3 deployment)

# 3. this:
pip install -e .

# Three surfaces — pick one:
python -m clark_mcp.web            # local web UI (http://127.0.0.1:8765)
python -m clark_mcp.agent          # interactive REPL (multi-turn; /reset to restart)
clark-mcp                          # MCP server for any MCP host (Claude Desktop, etc.)

Tests

pip install -e ".[dev]"
pytest                              # tool layer, against a fake clark client

The pure tool layer (tools.py) is unit-tested with an injected fake client. The agent.py LLM loop and MCP stdio transport are smoke-tested, not regression-covered — exercised once manually (hermes3:8b → MCP → clark-mcp → Clark), not in CI.

Status

Phase What State
0 Minimal localhost Clark inference API (clark serve) Built + hardened. Non-facility configs → clean 422 (not 500); seeded /plan reproducible; pytest-green against real Clark.
1 MCP server + fully-local Hermes-3 client Built. Tool layer regression-covered; LLM loop + stdio smoke-tested.
2 Fine-tune dataset Built + quality-gated. Real-API generator + 8-example gold bar; 196 curated examples (was 160 — added multi-turn + numeric-grounding + capabilities categories), every taught behavior covered, no category > 30%. See finetune/DATASET.md.
3 QLoRA domain fine-tune of the local model Gate passed, iterating. First fine-tune beat base on the held-out gate (envelope conformance 0.00→1.00, tool-selection 0.74→1.00, tool-args 0.56→0.80, no regressions). Honest caveat: a live multi-turn smoke found numeric-fact hallucination + context loss the single-turn gate can't see — second iteration adds multi-turn + numeric-grounding data and metrics, retrain in progress. No fine-tuned weights shipped yet. See finetune/PHASE3.md.
4 Integration + the staffing-sufficiency what-if Primitive + dashboard built; MCP-tool integration pending. Clark's new /simulate endpoint runs the policy end-to-end at a given roster size (with extra_workers for additive sweeps) — that's the honest outcome primitive the question requires. The web UI's "Staffing sweep" panel calls it and renders a stacked-bar chart of grade distribution per roster size. Still to do: a clark_staffing_sweep MCP tool so chat can also drive the sweep, plus dataset examples that teach it.
5 Portfolio write-up Planned.

Honest scope of the result: the pipeline works end to end; plan quality is only as good as the Clark checkpoint behind it, and answer fluency is base-Hermes until the Phase 3 fine-tune lands.

Project memory

Architectural decisions and constraints are recorded under .context/ (context-keeper): why the runtime is HTTP-decoupled from Clark, why every fine-tune payload must be live-captured, why the dataset is quality-gated against the gold set. Read these before changing the contract.

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured