Warden


A governed MCP server, with the receipts: RBAC enforced outside the model, OpenTelemetry traces on every run, and an LLM-as-judge eval suite proving the agent stays inside policy.

Live demo: warden.alexlaguardia.dev. Browse recorded runs, replay traces, and fire a live agent run yourself (rate-limited).

[Figure: Same question, different role]

The problem this demonstrates

Give an AI agent tool access to company data and you inherit two hard questions:

  1. Who is the agent acting as? A support agent asking "what's our pipeline with Acme?" must not get an answer the human asking is not allowed to see.
  2. How do you know the agent behaved? "It seemed fine in testing" is not an answer you can take to a security review.

Warden answers both, end to end, in a system small enough to read in an afternoon:

  • Governance lives outside the model. The role comes from the session identity (like OAuth token scopes), every read passes through one policy choke point, and the model cannot widen its own access by prompting harder.
  • Behavior is measured, not vibed. A deterministic oracle computes ground truth through the same governance layer, and a stronger model judges each agent answer against that reference: accuracy, faithfulness, RBAC compliance, and whether the agent honestly said "I can't see that" instead of fabricating.
  • Every run is traceable. Each LLM completion and MCP tool call emits an OpenTelemetry span, and the spans are persisted and replayable on a timeline in the dashboard.
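
The second point, an oracle that computes ground truth through the governance layer, can be sketched roughly as below. This is an illustration only, not Warden's code: `DEALS`, `CAN_READ_DEALS`, `Denied`, `governed_read`, and `oracle_pipeline_total` are hypothetical names, the rows are invented sample data, and the policy is reduced to a single allow-list.

```python
from dataclasses import dataclass

# Invented sample rows for illustration; not Warden's seed dataset.
DEALS = [
    {"account": "Acme Corp", "amount": 50_000},
    {"account": "Acme Corp", "amount": 25_000},
    {"account": "Globex", "amount": 40_000},
]

# Hypothetical, simplified policy: roles allowed to read deals at all.
CAN_READ_DEALS = {"admin"}


@dataclass(frozen=True)
class Denied:
    """Structured marker for a read that policy refused."""
    resource: str
    role: str


def governed_read(role: str, resource: str):
    """Single read path, shared by the agent's tools and the oracle."""
    if resource != "deals" or role not in CAN_READ_DEALS:
        return Denied(resource=resource, role=role)
    return list(DEALS)


def oracle_pipeline_total(role: str, account: str) -> dict:
    """Reference answer for a pipeline question, as seen by `role`."""
    rows = governed_read(role, "deals")
    if isinstance(rows, Denied):
        # For this role the correct answer is an honest refusal.
        return {"expected": "denied", "resource": rows.resource}
    total = sum(r["amount"] for r in rows if r["account"] == account)
    return {"expected": "value", "total": total}


print(oracle_pipeline_total("admin", "Acme Corp"))
print(oracle_pipeline_total("support", "Acme Corp"))
```

Because the reference for `support` is itself a denial, a judge comparing the agent's answer against it can score "I can't see that" as correct and a guessed number as a failure.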

What the demo shows

Page        What it proves
Console     Live agent runs through the governed server, as any of 3 roles
Diff        The same question answered as admin, sales, and support. Admin gets $125,000 of pipeline; support gets a policy denial and an honest refusal
Run detail  Span timeline (LLM vs tool time), tool inputs/outputs with the enforcing role stamped on every result
Evals       12/12 cases passing across all governance primitives

[Figure: Run detail with trace timeline]

Architecture

server/             The governed MCP server
  data.py           Seed dataset: one fictional company across 3 sources
                    (CRM accounts + deals, billing invoices, support tickets)
  rbac.py           Policy engine: roles + 3 governance primitives
                    (resource-level access, region row-scoping, field redaction)
  store.py          GovernedStore: the single choke point every read passes through
  mcp_server.py     MCP server (official SDK, stdio): 4 registry/dispatch tools,
                    role fixed by WARDEN_ROLE env, never by the model
agent/
  runner.py         Claude tool-calling loop over the real MCP server, role-bound
eval/
  oracle.py         Deterministic ground truth, computed THROUGH the governance
                    layer, so the reference itself respects policy
  judge.py          LLM-as-judge (a stronger model judges the agent's model),
                    anchored to the oracle reference to mitigate judge bias
  golden.py         12 cases covering every RBAC primitive + honesty-on-denial
tracing/
  otel.py           Real OpenTelemetry spans (GenAI semantic conventions) via a
                    custom in-process SpanProcessor
  store.py          Run persistence (runs.db): answers, tool calls, spans
dashboard/
  api.py            FastAPI over runs.db + a rate-limited live-run endpoint
web/                Next.js console: run list, trace replay, role diff, evals

Roles: admin (everything), sales_west (West-region CRM + tickets, no billing), support (tickets everywhere + basic accounts, financial tier hidden).
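
To make the three governance primitives concrete, here is a minimal sketch of how a policy table for these roles might be applied. It is not the contents of rbac.py: `POLICY`, `apply_policy`, and `ACCOUNTS` are hypothetical, the redacted field name `tier` is an assumption based on "financial tier hidden", and scoping tickets to the West region for sales_west is a guess.

```python
from typing import Optional

# Hypothetical policy table. A missing resource entry means the role is
# denied that resource; "regions": None means no row scoping.
POLICY = {
    "admin": {
        "accounts": {"regions": None, "redact": set()},
        "invoices": {"regions": None, "redact": set()},
        "tickets": {"regions": None, "redact": set()},
    },
    "sales_west": {
        "accounts": {"regions": {"west"}, "redact": set()},
        "tickets": {"regions": {"west"}, "redact": set()},  # assumed scoping
    },
    "support": {
        "accounts": {"regions": None, "redact": {"tier"}},  # assumed field
        "tickets": {"regions": None, "redact": set()},
    },
}


def apply_policy(role: str, resource: str, rows: list) -> Optional[list]:
    """Return the rows `role` may see, or None if the resource is denied."""
    rule = POLICY.get(role, {}).get(resource)
    if rule is None:
        return None  # 1. resource-level access
    visible = []
    for row in rows:
        regions = rule["regions"]
        if regions is not None and row.get("region") not in regions:
            continue  # 2. region row-scoping
        # 3. field redaction: drop hidden fields from a copy of the row
        visible.append({k: v for k, v in row.items() if k not in rule["redact"]})
    return visible


# Invented sample rows for illustration.
ACCOUNTS = [
    {"name": "Acme Corp", "region": "west", "tier": "enterprise"},
    {"name": "Globex", "region": "east", "tier": "starter"},
]

for role in ("admin", "sales_west", "support"):
    print(role, apply_policy(role, "accounts", ACCOUNTS))
print("sales_west invoices:", apply_policy("sales_west", "invoices", []))
```

Keeping all three checks in one function mirrors the idea of a single choke point: a caller cannot get rows without passing through every primitive in order.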

Design choices worth stealing

  • Registry/dispatch tools, not one-tool-per-table. The server exposes list_resources, describe_resource, query_resource, get_record. Adding a data source changes the registry, not the tool surface, and the policy engine stays in one place.
  • The oracle goes through the governance layer. If the reference answer were computed against raw data, a correctly-denied answer would score as "wrong." Governance-aware ground truth is what makes "the agent honestly declined" a passing grade.
  • The judge is a stronger model than the agent (Opus judges Sonnet), and it is anchored to the oracle reference, to cut self-preference bias.
  • Denials are structured, not stringly. Tools return an access_denied object; the eval then checks the agent reported the limit honestly rather than guessing.
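
The registry/dispatch idea and the structured denial can be sketched together. The four function names match the tools listed above, but everything else is an assumption: `REGISTRY`, `current_role`, the helper functions, the exact shape of the `access_denied` object, and the sample rows are hypothetical. The sketch uses plain functions rather than the MCP SDK so that it runs on the standard library alone.

```python
import os

# Hypothetical registry: adding a data source means adding an entry here,
# while the four tool functions below stay unchanged.
REGISTRY = {
    "accounts": {
        "fields": ["id", "name", "region"],
        "rows": [{"id": 1, "name": "Acme Corp", "region": "west"}],
        "roles": {"admin", "sales_west", "support"},
    },
    "invoices": {
        "fields": ["id", "account_id", "amount"],
        "rows": [{"id": 10, "account_id": 1, "amount": 1200}],
        "roles": {"admin"},
    },
}


def current_role() -> str:
    # The role comes from the process environment, never from tool arguments.
    # An unset role matches no entry, so everything is denied (sketch assumption).
    return os.environ.get("WARDEN_ROLE", "")


def _allowed(resource: str) -> bool:
    entry = REGISTRY.get(resource)
    return entry is not None and current_role() in entry["roles"]


def _denied(resource: str) -> dict:
    # Assumed shape of the structured denial.
    return {"access_denied": True, "resource": resource, "role": current_role()}


def list_resources() -> list:
    return sorted(name for name in REGISTRY if _allowed(name))


def describe_resource(resource: str) -> dict:
    if not _allowed(resource):
        return _denied(resource)
    return {"resource": resource, "fields": REGISTRY[resource]["fields"]}


def query_resource(resource: str, **filters) -> dict:
    if not _allowed(resource):
        return _denied(resource)
    rows = [r for r in REGISTRY[resource]["rows"]
            if all(r.get(k) == v for k, v in filters.items())]
    return {"resource": resource, "role": current_role(), "rows": rows}


def get_record(resource: str, record_id: int) -> dict:
    if not _allowed(resource):
        return _denied(resource)
    for row in REGISTRY[resource]["rows"]:
        if row["id"] == record_id:
            return {"resource": resource, "role": current_role(), "record": row}
    return {"resource": resource, "role": current_role(), "record": None}


os.environ["WARDEN_ROLE"] = "support"
print(list_resources())
print(query_resource("invoices"))
```

An eval can then branch on the `access_denied` key instead of parsing an error string, and every successful result carries the role that was enforced.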

Run it locally

pip install -r requirements.txt
python seed.py                       # build the demo company (warden.db)
python -m tests.test_rbac            # 7/7 governance primitives hold
export ANTHROPIC_API_KEY=sk-ant-...  # needed for agent + evals
python -m agent.runner --role support --trace "What account tier is Acme Corp?"
python -m eval.run_evals             # 12/12, writes eval/results.json
python -m tracing.seed_runs          # seed demo runs for the dashboard

# dashboard
uvicorn dashboard.api:app --port 8710
cd web && npm install && npm run dev   # http://localhost:3006

Stack

Python (FastAPI, official mcp SDK, OpenTelemetry SDK), Claude (Sonnet agent, Opus judge), SQLite, Next.js + Tailwind.


Built solo by Alex LaGuardia as a working answer to "how do you let an agent touch real data without trusting it blindly?"
