iris-eval/mcp-server

MCP-native agent evaluation and observability server. Log traces, evaluate output quality with 12 built-in rules (PII detection, prompt injection, cost thresholds), and track agent costs. Real-time dashboard, OTel-compatible spans. Self-hosted, MIT licensed.

Iris — MCP-Native Agent Eval & Observability

See what your AI agents are actually doing. Iris is an open-source MCP server that logs every trace, evaluates output quality, and tracks costs across all your agents. Any MCP-compatible agent discovers and uses it automatically — no SDK, no code changes.

The Problem

Your agents are running in production. Traditional monitoring sees 200 OK and moves on. It has no idea the agent just:

  • Leaked a social security number in its response
  • Hallucinated an answer with zero factual grounding
  • Burned $0.47 on a single query — 4.7x your budget threshold
  • Made 6 tool calls when 2 would have sufficed

Iris sees all of it.

What You Get

  • Trace Logging: Hierarchical span trees with per-tool-call latency, token usage, and cost in USD. Stored in SQLite, queryable instantly.
  • Output Evaluation: 12 built-in rules across 4 categories: completeness, relevance, safety, cost. PII detection, prompt injection patterns, hallucination markers. Add custom rules with Zod schemas.
  • Cost Visibility: Aggregate cost across all agents over any time window. Set budget thresholds. Get flagged when agents overspend.
  • Web Dashboard: Real-time dark-mode UI with trace visualization, eval results, and cost breakdowns.
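
To make the span-tree and budget-threshold ideas concrete, here is a minimal sketch. The `Span` shape, the function names, and the sample numbers are assumptions for illustration, not Iris's actual trace schema or API.

```typescript
// Hypothetical span shape; Iris's real trace schema may differ.
interface Span {
  name: string;
  latencyMs: number;
  costUsd: number; // cost attributed to this span alone
  children: Span[];
}

// Sum the cost of a span and all of its descendants.
function totalCostUsd(span: Span): number {
  return span.children.reduce(
    (sum, child) => sum + totalCostUsd(child),
    span.costUsd,
  );
}

interface BudgetReport {
  totalUsd: number;
  ratio: number; // total cost divided by the threshold
  flagged: boolean;
}

// Compare a trace's total cost against a budget threshold.
function checkBudget(root: Span, thresholdUsd: number): BudgetReport {
  const totalUsd = totalCostUsd(root);
  return {
    totalUsd,
    ratio: totalUsd / thresholdUsd,
    flagged: totalUsd > thresholdUsd,
  };
}

// Sample trace: one root span with two tool-call children.
const trace: Span = {
  name: "answer_query",
  latencyMs: 2100,
  costUsd: 0.07,
  children: [
    { name: "tool:search", latencyMs: 800, costUsd: 0.15, children: [] },
    { name: "tool:summarize", latencyMs: 1200, costUsd: 0.25, children: [] },
  ],
};

// The total is roughly $0.47 against a $0.10 threshold, so the trace is flagged.
const report = checkBudget(trace, 0.10);
console.log(report.flagged, report.ratio.toFixed(1));
```

Keeping cost on each span and summing recursively means the same tree answers both "what did this query cost?" and "which tool call was expensive?".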

Quickstart

Add Iris to your Claude Desktop (or Cursor, Claude Code, Windsurf) MCP config:

{
  "mcpServers": {
    "iris-eval": {
      "command": "npx",
      "args": ["@iris-eval/mcp-server"]
    }
  }
}

That's it. Your agent discovers Iris and starts logging traces automatically.

Want the dashboard?

npx @iris-eval/mcp-server --dashboard
# Open http://localhost:6920

Other Install Methods

# Global install
npm install -g @iris-eval/mcp-server
iris-mcp --dashboard

# Docker
docker run -p 3000:3000 -v iris-data:/data ghcr.io/iris-eval/mcp-server

MCP Tools

Iris registers three tools that any MCP-compatible agent can invoke:

  • log_trace — Log an agent execution with spans, tool calls, token usage, and cost
  • evaluate_output — Score output quality against completeness, relevance, safety, and cost rules
  • get_traces — Query stored traces with filtering, pagination, and time-range support
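
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, so a call to one of the tools above might look roughly like the sketch below. The envelope follows the MCP specification; the argument names (`agent_name`, `output`, `cost_usd`) are hypothetical placeholders, and the real field names are whatever the published tool schemas define.

```typescript
// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Build a request for the named tool with the given arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names here are assumptions, not the documented log_trace schema.
const request = buildToolCall(1, "log_trace", {
  agent_name: "support-bot",
  output: "Your order ships tomorrow.",
  cost_usd: 0.02,
});

console.log(JSON.stringify(request, null, 2));
```

In practice the agent's MCP client builds this request itself; the sketch only shows what travels over the transport.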

Full tool schemas and configuration: iris-eval.com

Cloud Tier (Coming Soon)

Self-hosted Iris runs on your machine with SQLite. As your team grows, the cloud tier adds PostgreSQL, team dashboards, alerting, and managed infrastructure.

Join the waitlist to get early access.

<details> <summary><strong>Configuration & Security</strong></summary>

CLI Arguments

| Flag | Default | Description |
|---|---|---|
| --transport | stdio | Transport type: stdio or http |
| --port | 3000 | HTTP transport port |
| --db-path | ~/.iris/iris.db | SQLite database path |
| --config | ~/.iris/config.json | Config file path |
| --api-key | | API key for HTTP authentication |
| --dashboard | false | Enable web dashboard |
| --dashboard-port | 6920 | Dashboard port |

Environment Variables

| Variable | Description |
|---|---|
| IRIS_TRANSPORT | Transport type |
| IRIS_PORT | HTTP port |
| IRIS_DB_PATH | Database path |
| IRIS_LOG_LEVEL | Log level: debug, info, warn, error |
| IRIS_DASHBOARD | Enable dashboard (true/false) |
| IRIS_API_KEY | API key for HTTP authentication |
| IRIS_ALLOWED_ORIGINS | Comma-separated allowed CORS origins |
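
As an illustration of the comma-separated format used by IRIS_ALLOWED_ORIGINS, the sketch below splits such a value into a list and checks an origin against it. The function names are hypothetical, and exact string matching is an assumption; Iris's own parsing and matching rules may differ.

```typescript
// Split a comma-separated origin list, trimming whitespace and dropping empty entries.
function parseAllowedOrigins(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
}

// Assumed policy: an origin is allowed only on an exact string match.
function isOriginAllowed(origin: string, allowed: string[]): boolean {
  return allowed.includes(origin);
}

const allowed = parseAllowedOrigins(
  "http://localhost:6920, https://dashboard.example.com",
);
console.log(isOriginAllowed("https://dashboard.example.com", allowed));
console.log(isOriginAllowed("https://evil.example.com", allowed));
```

In a Node process the raw value would typically come from `process.env.IRIS_ALLOWED_ORIGINS`.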

Security

When using HTTP transport, Iris includes:

  • API key authentication with timing-safe comparison
  • CORS restricted to localhost by default
  • Rate limiting (100 req/min API, 20 req/min MCP)
  • Helmet security headers
  • Zod input validation on all routes
  • ReDoS-safe regex for custom eval rules
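  • 1MB request body limits

# Production deployment
# Generate the key once and keep it: clients must present the same value.
export IRIS_API_KEY="$(openssl rand -hex 32)"
iris-mcp --transport http --port 3000 --api-key "$IRIS_API_KEY" --dashboard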
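
The timing-safe comparison in the list above matters because a plain `===` on strings can return as soon as it finds a differing character, which may leak how much of a guessed key was correct. A minimal sketch of the technique using Node's built-in `crypto.timingSafeEqual` follows; the function names are hypothetical and this is not taken from Iris's source.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash to a fixed-length digest first: timingSafeEqual throws if the two
// buffers differ in length, and hashing also avoids leaking the key length.
function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Compare a presented API key with the expected one in constant time.
function apiKeyMatches(presented: string, expected: string): boolean {
  return timingSafeEqual(sha256(presented), sha256(expected));
}

console.log(apiKeyMatches("correct-key", "correct-key"));
console.log(apiKeyMatches("wrong-key", "correct-key"));
```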

</details>


If Iris is useful to you, consider starring the repo — it helps others find it.

MIT Licensed.
