harness-mcp

A TypeScript boilerplate for building MCP servers with harness engineering, providing tool definitions, structured errors, and evaluation harnesses.

An opinionated TypeScript boilerplate for building MCP servers with harness engineering in mind.

Harness engineering is the discipline of designing the scaffolding around an LLM agent — tools, descriptions, errors, context — so the agent actually does the right thing. Most MCP boilerplates teach you the protocol. This one teaches you the protocol and the practice.

What's in the box

  • defineTool() — one Zod schema feeds the MCP SDK, OpenAI's function-calling API, and the runtime handler. Validation and error wrapping are automatic.
  • Both transports — stdio (src/index.ts) for Claude Code-style local clients, Streamable HTTP (src/http.ts) for remote/web clients. Both share one createServer().
  • Structured AgentErrors — every error has a code, a message, and a hint written for the model: "call items_list first to find a valid id." Vague errors waste turns; this fixes that at the type level.
  • A real eval harness — Vitest-based. Unit tests run free in CI; tests/mcp/echo.test.ts drives a real OpenAI model through the MCP server via an in-memory transport pair and asserts on the resulting tool-call trace.
  • A simple CRUD example: items_create / list / read / update / delete plus an echo tool. Replace the in-memory store with your real backend; keep the shape.
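
As a rough illustration of the structured-error idea above, here is a minimal sketch of an error type that carries a code, a message, and a model-facing hint, plus a helper that turns it into an MCP-style tool result. The real AgentError lives in src/core/errors.ts and may look different; the ErrorCode values and the toToolResult helper are assumptions made up for this example.

```typescript
// Hypothetical sketch: not the actual contents of src/core/errors.ts.
type ErrorCode = "NOT_FOUND" | "INVALID_INPUT" | "INTERNAL"; // assumed set of codes

class AgentError extends Error {
  readonly code: ErrorCode;
  readonly hint: string; // written for the model: what to try next

  constructor(code: ErrorCode, message: string, hint: string) {
    super(message);
    this.name = "AgentError";
    this.code = code;
    this.hint = hint;
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// MCP tool results signal failure with isError plus text content.
interface ToolErrorResult {
  isError: true;
  content: { type: "text"; text: string }[];
}

// Assumed helper: serialize any thrown value into a result the model can act on.
function toToolResult(err: unknown): ToolErrorResult {
  const payload =
    err instanceof AgentError
      ? { code: err.code, message: err.message, hint: err.hint }
      : {
          code: "INTERNAL",
          message: "Unexpected error",
          hint: "Retry once; if it fails again, report the failure to the user.",
        };
  return { isError: true, content: [{ type: "text", text: JSON.stringify(payload) }] };
}
```

A handler would throw something like new AgentError("NOT_FOUND", "No item with that id", "call items_list first to find a valid id"), and the wrapper would pass the caught value through toToolResult so the hint reaches the model instead of a bare stack trace.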

Quick start

bun install
bun test              # unit tests, no API key needed
bun run start         # stdio server on stdin/stdout
bun run start:http    # HTTP server on http://localhost:3000/mcp

To run the model-in-the-loop evals:

cp .env.example .env
# add OPENAI_API_KEY
bun run test:mcp

Wire into Claude Code

{
  "mcpServers": {
    "harness-mcp": {
      "command": "bun",
      "args": ["run", "/absolute/path/to/harness/mcp/src/index.ts"]
    }
  }
}

Layout

src/
  index.ts              stdio entry
  http.ts               streamable-http entry
  core/
    server.ts           createServer() — shared by both transports
    tool.ts             defineTool() wrapper
    errors.ts           AgentError
    store.ts            replace with your backend
  tools/
    echo.ts             smoke-test tool
    items-*.ts          CRUD example tools
    index.ts            registry

tests/
  unit/                 fast, no API key
    tool.test.ts
    store.test.ts
  mcp/                  protocol + model-in-the-loop
    setup.ts            in-memory client + runWithModel() helper
    smoke.test.ts       no model
    echo.test.ts        gpt-4o-mini, skipped without OPENAI_API_KEY
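
Because smoke.test.ts runs without a model, it is a natural home for static checks on tool metadata, such as the USE WHEN / DO NOT USE WHEN description convention covered under "The opinions". The sketch below shows one way such a check could work. The ToolMeta shape and the descriptionProblems function are hypothetical, and requiring both phrases on every tool may be stricter than what the repository actually enforces.

```typescript
// Hypothetical sketch of a description-convention check; not the repo's smoke test.
interface ToolMeta {
  name: string;
  description: string;
}

// Returns a list of human-readable problems; an empty list means the tool passes.
function descriptionProblems(tool: ToolMeta): string[] {
  const problems: string[] = [];
  const text = tool.description.trim();
  if (!text.startsWith("USE WHEN")) {
    problems.push(`${tool.name}: description must lead with "USE WHEN"`);
  }
  if (!text.includes("DO NOT USE WHEN")) {
    problems.push(`${tool.name}: description must include "DO NOT USE WHEN"`);
  }
  return problems;
}

// Checks a whole registry at once so a test can assert on a single array.
function registryProblems(tools: readonly ToolMeta[]): string[] {
  return tools.flatMap(descriptionProblems);
}
```

A test can then assert that registryProblems over the registered tools returns an empty array, so a description that drifts from the convention fails in CI before it reaches a model.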

The opinions

  1. Tool descriptions are prompt engineering. Every description leads with USE WHEN ... and includes DO NOT USE WHEN ... for sibling tools the model could confuse this with. The smoke test enforces the convention.
  2. Errors teach. Every AgentError carries a hint field. Read your error messages as if you were the agent — would you know what to do next? If not, rewrite.
  3. List endpoints paginate. items_list returns { items, nextCursor }. Default limit 20, hard cap 100. Don't dump unbounded data into the context.
  4. Destructive ops accept dryRun. If you aren't sure what a delete will do, call items_delete with dryRun first and it reports what would happen without deleting anything.
  5. One Zod, three consumers. Don't maintain JSON Schema by hand alongside Zod; defineTool derives the MCP and OpenAI schemas from the same Zod schema.
  6. Evals are tests. Tool-call traces are assertable. When a description regression breaks the model's behavior, your test catches it.
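
Opinions 3 and 4 can be sketched against a plain in-memory collection. This is only an illustration under stated assumptions: the Item shape, the offset-as-cursor encoding, and the function names listItems and deleteItem are invented here, and the repository's store.ts and items-*.ts tools may implement both differently. Only the { items, nextCursor } shape, the default limit of 20, the cap of 100, and the dryRun flag come from the text above.

```typescript
// Hypothetical sketch; names and cursor encoding are assumptions.
interface Item {
  id: string;
  name: string;
}

const DEFAULT_LIMIT = 20; // default page size from opinion 3
const MAX_LIMIT = 100; // hard cap from opinion 3

interface ListPage {
  items: Item[];
  nextCursor?: string; // absent on the last page
}

// The cursor is simply the stringified offset of the next item (an assumption).
function listItems(
  all: readonly Item[],
  opts: { limit?: number; cursor?: string } = {},
): ListPage {
  const limit = Math.min(Math.max(opts.limit ?? DEFAULT_LIMIT, 1), MAX_LIMIT);
  const start = opts.cursor === undefined ? 0 : Number.parseInt(opts.cursor, 10);
  if (!Number.isInteger(start) || start < 0) {
    throw new Error("Invalid cursor; pass the nextCursor value from the previous page.");
  }
  const items = all.slice(start, start + limit);
  const next = start + items.length;
  return next < all.length ? { items, nextCursor: String(next) } : { items };
}

interface DeleteResult {
  deleted: boolean; // false when dryRun was set
  dryRun: boolean;
  item: Item; // the item that was, or would be, removed
}

// With dryRun set, reports what would be deleted and leaves the store untouched.
function deleteItem(store: Map<string, Item>, id: string, dryRun = false): DeleteResult {
  const item = store.get(id);
  if (item === undefined) {
    throw new Error(`No item with id "${id}"; call items_list first to find a valid id.`);
  }
  if (!dryRun) store.delete(id);
  return { deleted: !dryRun, dryRun, item };
}
```

Clamping the limit on the server side means a model that asks for 1000 items still gets at most 100, and returning the affected item from a dry run lets the agent confirm the target before committing to the delete.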

License

MIT.
