production-grade-mcp-agentic-system

A production-grade MCP server designed for multi-tenant, authenticated, and observable AI agent systems, enabling secure tool execution across heterogeneous data sources.


<div align="center">

<img src="https://miro.medium.com/v2/resize:fit:4800/1*vPJ1Xag-f3cgOgSA4QTeXQ.png" alt="Production-Grade MCP Server + Agentic System" width="100%"/>

๐Ÿ›๏ธ Production-Grade MCP Server + Agentic System

A reference implementation of an MCP server designed to actually ship

Multi-tenant · Authenticated · Observable · Rate-limited · Cached · Circuit-broken · Governed

Python 3.11+ · MCP 2026 · License: MIT · Docker


📖 Full Step-by-Step Blog Walkthrough

This repository is the companion codebase for a long-form blog post that walks through every single component end to end, with every line of code explained in context. Start there if you want to understand the "why" behind the architecture before reading the code.

🔗 Building a Production-Grade MCP Server Architecture with Agentic System →


</div>

🎯 What This Is

Most MCP tutorials end with a @tool decorator that returns "hello world". That is fine for a demo. It is not what ships.

This repository is a reference implementation of an MCP server designed to run in production: multi-tenant, authenticated, observable, rate-limited, cached, circuit-broken, and governed. It exposes a company's heterogeneous data layer (Postgres, Elasticsearch, S3, vector DB) to AI agents as a single, secure tool surface, and ships with a four-agent support copilot (Planner → Retriever → Synthesizer → Critic) that uses it end to end.

The codebase is deliberately organised around twelve components that keep showing up on the 3 AM pager when teams skip them. Each one lives in its own module and can be read, replaced, or extended independently.


๐Ÿ—๏ธ Architecture Overview

<div align="center">

<img src="https://miro.medium.com/v2/resize:fit:4800/1*vPJ1Xag-f3cgOgSA4QTeXQ.png" alt="Full Architecture" width="90%"/>

The complete production-grade system: MCP server dispatch pipeline on the right, four-agent orchestrator on the left, data plane on top, observability on the bottom, identity and governance as crosscutting concerns.

</div>


🧩 The 12 Components

| # | Component | Lives in | What it gives you |
|---|---|---|---|
| 1 | 🚪 Transport & Session Layer | `server.py` | stdio for local, Streamable HTTP for remote, horizontal-scale-friendly sessions |
| 2 | 🔐 Authentication Server | `auth/oauth.py` | OAuth 2.1 + PKCE, short-lived JWTs, JWKS validation |
| 3 | ⚖️ Authorization & Policy Engine | `auth/policy.py` | Tool-level RBAC, tenant-scoped ABAC, deny-by-default |
| 4 | 📚 Tool Registry & Discovery | `tools/registry.py` | Dynamic toolsets, .well-known capability metadata |
| 5 | ✅ Input Validation Layer | `validation/schemas.py` | Pydantic schemas, enum constraints, agent-adversarial input as default threat model |
| 6 | 🔧 Tool Execution Engine | `tools/base.py` | Three-level hierarchy (atomic / composed / workflow) |
| 7 | 🔄 Circuit Breaker & Retry | `reliability/` | Closed → open → half-open, Adaptive Timeout Budget Allocation |
| 8 | 🚦 Rate Limiting & Quotas | `ratelimit/limiter.py` | Redis token-bucket (Lua-atomic), per-tenant and per-tool |
| 9 | ⚡ Caching Layer | `cache/manager.py` | Two-tier (L1 in-process, L2 Redis), stampede prevention |
| 10 | 🧱 Structured Error Framework | `errors/framework.py` | Machine-readable errors with retryable and hint fields |
| 11 | 🔭 Observability Stack | `observability/` | OpenTelemetry traces, Prometheus metrics, audit logs |
| 12 | 🛡️ Governance & Multi-Tenancy | `governance/` | Tenant isolation, approval gates, outbound HTTP allowlisting |
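
To make the closed → open → half-open cycle from component 7 concrete, here is a minimal sketch of a three-state breaker. It is not the code in `reliability/circuit_breaker.py`; the class name, thresholds, and injectable `clock` are assumptions chosen for illustration.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"        # calls flow normally
    OPEN = "open"            # calls are rejected without touching the backend
    HALF_OPEN = "half_open"  # probe calls are allowed to test recovery


class CircuitBreaker:
    """Hypothetical breaker; threshold and timeout values are illustrative."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a call may proceed right now."""
        if self.state is State.OPEN:
            if self.clock() - self.opened_at >= self.reset_timeout:
                # Cool-down elapsed: allow probes until one reports back.
                self.state = State.HALF_OPEN
                return True
            return False
        return True

    def record_success(self) -> None:
        # Any success closes the breaker and clears the failure count.
        self.failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self.failures += 1
        # A failed probe, or too many failures while closed, opens the breaker.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller would check `allow()` before invoking a tool backend and report the outcome with `record_success()` or `record_failure()`. A production breaker would typically also limit how many probes run concurrently in the half-open state.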

📖 Diving Deeper, Section by Section

Each diagram below links back to the corresponding section in the blog, where every line of code is walked through in detail.

<table> <tr> <td width="50%" align="center">

📦 Data Persistence Layer

<img src="https://miro.medium.com/v2/resize:fit:4800/1*kT_lhnF50R4aM2iXXahMoA.png" alt="Data Persistence Layer" width="100%"/>

Postgres + Row-Level Security · Tenant isolation at the DB layer

</td> <td width="50%" align="center">

🚪 Transport & Session Layer

<img src="https://miro.medium.com/v2/resize:fit:4800/1*7GEV6AlegLbxX-dqJXHUdA.png" alt="Transport Layer" width="100%"/>

Dual transport · Stateless session · Middleware chain

</td> </tr> <tr> <td width="50%" align="center">

๐Ÿ” Authentication, Policy & Governance

<img src="https://miro.medium.com/v2/resize:fit:4800/1*m45EPmIT1_5EmKNR4EEpLQ.png" alt="Auth & Policy" width="100%"/>

OAuth 2.1 · YAML policies · Human-in-the-loop approvals

</td> <td width="50%" align="center">

🔧 Tool Execution Engine

<img src="https://miro.medium.com/v2/resize:fit:4800/1*ak49o0j_5qLbvvM-zkkF_A.png" alt="Tool Execution" width="100%"/>

Three-level hierarchy · Atomic · Composed · Workflow

</td> </tr> <tr> <td width="50%" align="center">

🔄 Reliability Layer

<img src="https://miro.medium.com/v2/resize:fit:4800/1*rjIJxzUpMhJ9BGffTczvLA.png" alt="Reliability" width="100%"/>

Circuit breakers · Retry with jitter · ATBA budget allocator

</td> <td width="50%" align="center">

⚡ Rate Limiting & Caching

<img src="https://miro.medium.com/v2/resize:fit:4800/1*CvfLYyppMTLyU9UalfHmyA.png" alt="Rate Limit & Cache" width="100%"/>

Redis token bucket · Two-tier cache · Stampede lock

</td> </tr> <tr> <td width="50%" align="center">

🔭 Observability Stack

<img src="https://miro.medium.com/v2/resize:fit:4800/1*dMi7KXpUfoMMsFpVTS8Acg.png" alt="Observability" width="100%"/>

OpenTelemetry · Prometheus · Audit logs · One trace ID

</td> <td width="50%" align="center">

🤖 Multi-Agentic Architecture

<img src="https://miro.medium.com/v2/resize:fit:4800/1*rasNhRMj5Ei93-AEQrbBwQ.png" alt="Multi-Agent" width="100%"/>

Four-agent design · Planner · Retriever · Synthesizer · Critic

</td> </tr> </table>
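
The rate limiter is described as a Redis token bucket keyed per tenant and per tool. The sketch below shows the same refill-and-spend arithmetic in plain Python with an in-memory dict, purely to illustrate the algorithm. The repository runs this logic atomically in Redis via Lua, which this sketch does not attempt; the class and parameter names are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """In-memory illustration only; not safe across processes."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # (tenant, tool) -> (tokens remaining, time of last update)
        self._buckets: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def allow(self, tenant: str, tool: str, cost: float = 1.0) -> bool:
        now = self.clock()
        # A bucket that has never been seen starts full.
        tokens, last = self._buckets.get((tenant, tool), (self.capacity, now))
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self._buckets[(tenant, tool)] = (tokens, now)
        return allowed
```

Because each `(tenant, tool)` pair has its own bucket, one noisy tenant exhausting its budget does not affect another tenant calling the same tool.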

<div align="center">

🎼 The Orchestrator Flow

<img src="https://miro.medium.com/v2/resize:fit:4800/1*7wyopmnCF_mEdxnI8u02uA.png" alt="Orchestrator" width="80%"/>

End-to-end agent orchestration with one bounded revise loop

</div>
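
The flow above, with its single bounded revise loop, can be sketched as a plain function that takes the four agents as callables. This is an illustration under assumed signatures, not the code in `agents/orchestrator.py`; `Verdict`, `run_copilot`, and `max_revisions` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    approved: bool
    feedback: str = ""


def run_copilot(
    question: str,
    plan: Callable[[str], List[str]],
    retrieve: Callable[[List[str]], List[str]],
    synthesize: Callable[[str, List[str], Optional[str]], str],
    critique: Callable[[str, List[str]], Verdict],
    max_revisions: int = 1,
) -> str:
    steps = plan(question)                         # Planner: what to look up
    evidence = retrieve(steps)                     # Retriever: run the tool calls
    draft = synthesize(question, evidence, None)   # Synthesizer: first draft
    for _ in range(max_revisions):                 # Critic: at most N revise rounds
        verdict = critique(draft, evidence)
        if verdict.approved:
            break
        draft = synthesize(question, evidence, verdict.feedback)
    return draft
```

Capping the loop at one revision keeps latency and token spend predictable: the critic can send a draft back once, and whatever comes back is returned without a second review.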


🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Python 3.11+ (only for running the CLI locally)
  • An Anthropic API key (for the agent layer)

1. Clone and Configure

git clone https://github.com/FareedKhan-dev/production-grade-mcp-agentic-system.git
cd production-grade-mcp-agentic-system
cp .env.example .env

Edit .env and set at minimum:

  • ANTHROPIC_API_KEY: for the agent layer
  • ATLAS_AUTH_JWKS_URL: your OAuth 2.1 provider's JWKS endpoint (or leave default for dev)

2. Bring Up the Stack

docker compose up -d

That brings up the full local environment:

| Service | URL | What it is |
|---|---|---|
| 🏛️ MCP Server | http://localhost:8080/mcp | Streamable HTTP endpoint |
| 🔍 Discovery | http://localhost:8080/.well-known/mcp-server | Unauthenticated capability metadata |
| 📊 Metrics | http://localhost:8080/metrics | Prometheus scrape target |
| ❤️ Health | http://localhost:8080/healthz | Liveness probe |
| 🔭 Jaeger | http://localhost:16686 | Distributed tracing UI |
| 📈 Grafana | http://localhost:3000 | Metrics dashboards (admin / admin) |
| 🗄️ MinIO Console | http://localhost:9001 | S3-compatible storage UI |

3. Run the Support Copilot CLI

pip install -e .

export ATLAS_MCP_URL=http://localhost:8080
export ATLAS_MCP_TOKEN=dev-token
export ATLAS_TENANT=acme
export ANTHROPIC_API_KEY=sk-ant-...

atlas-copilot "Why was the refund on order o_9002 for CUST-1001 delayed?"

You will see the four agents run end-to-end, the final draft printed with [S1][S2] citations, and a full trace summary including token counts, tool calls, and the run_id that ties back to Jaeger.

4. Connect from Claude Desktop / Cursor

Add this to your MCP host config:

{
  "mcpServers": {
    "production-mcp": {
      "type": "http",
      "url": "http://localhost:8080/mcp",
      "headers": {
        "Authorization": "Bearer ${ATLAS_MCP_TOKEN}",
        "X-Tenant-Id": "acme"
      }
    }
  }
}
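
For scripted access outside an MCP host, the same endpoint and headers can be used from plain Python. The sketch below only builds a JSON-RPC `tools/list` request with the standard library and does not send it. The helper name is hypothetical, and it assumes the server accepts the bearer token and `X-Tenant-Id` header exactly as in the host config above. A real Streamable HTTP session would normally begin with an `initialize` exchange before listing tools.

```python
import json
import urllib.request


def build_tools_list_request(base_url: str, token: str, tenant: str) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/list call."""
    payload = {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/mcp",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",
            "X-Tenant-Id": tenant,
        },
    )


request = build_tools_list_request("http://localhost:8080", "dev-token", "acme")
```

Passing `request` to `urllib.request.urlopen` would send it to the local stack started in step 2.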

📂 Repository Structure

.
├── 📄 README.md
├── 🐳 docker-compose.yml          # Full local stack: app + data + observability
├── 🐳 Dockerfile                  # Two-stage build, non-root runtime
├── 📜 LICENSE
├── 📦 pyproject.toml              # Dependencies, dev tools, CLI entry points
├── ⚙️  .env.example                # Every setting documented by component
│
├── 🔧 config/                     # Runtime configuration (hot-reloadable)
│   ├── http_allowlist.yaml       # Per-tenant outbound HTTP allowlist
│   └── policy.yaml               # YAML-driven authorization policies
│
├── 🚢 deploy/                     # Deployment sidecar configs
│   ├── otel/config.yaml          # OpenTelemetry Collector pipeline
│   ├── prometheus/prometheus.yml # Prometheus scrape targets
│   └── sql/init.sql              # Schema + RLS policies + seed data
│
├── 📚 docs/                       # Deep-dive documentation
│   ├── AGENT_SYSTEM.md           # Multi-agent orchestrator internals
│   ├── ARCHITECTURE.md           # The 12 components in detail
│   └── DEPLOYMENT.md             # K8s, Cloudflare Workers, bare-metal
│
├── 🧠 src/atlas_mcp/              # Main application source
│   ├── config.py                 # Centralized typed settings
│   ├── server.py                 # ⚡ Component 1: Transport & dispatch
│   │
│   ├── 🤖 agents/                 # Four-agent support copilot
│   │   ├── planner.py            # Emits retrieval plan JSON
│   │   ├── retriever.py          # Bounded tool-calling loop
│   │   ├── synthesizer.py        # Drafts reply with citations
│   │   ├── critic.py             # Approves or sends one revise
│   │   ├── orchestrator.py       # Wires the four agents together
│   │   ├── mcp_client.py         # Thin JSON-RPC MCP client
│   │   ├── memory.py             # STM (Redis) + LTM (vector)
│   │   └── cli.py                # atlas-copilot CLI entry point
│   │
│   ├── 🔐 auth/                   # Components 2 + 3
│   │   ├── oauth.py              # JWT + JWKS validation
│   │   ├── middleware.py         # Bearer token extraction
│   │   └── policy.py             # YAML-driven policy engine
│   │
│   ├── 🛡️  governance/             # Component 12
│   │   ├── tenant.py             # Tenant pinning middleware
│   │   └── approval.py           # Human-in-the-loop gate
│   │
│   ├── 🔧 tools/                  # Components 4 + 6
│   │   ├── registry.py           # In-memory tool index + discovery
│   │   ├── base.py               # Tool abstract base + metadata
│   │   ├── atomic/               # Level 1: one backend each
│   │   ├── composed/             # Level 2: deterministic chains
│   │   └── workflow/             # Level 3: multi-step procedures
│   │
│   ├── 🔄 reliability/            # Component 7
│   │   ├── circuit_breaker.py    # 3-state machine per tool
│   │   ├── retry.py              # Exponential backoff + jitter
│   │   └── atba.py               # Adaptive Timeout Budget Allocation
│   │
│   ├── 🚦 ratelimit/              # Component 8
│   │   └── limiter.py            # Redis token bucket (Lua-atomic)
│   │
│   ├── ⚡ cache/                   # Component 9
│   │   └── manager.py            # L1 + L2 cache with stampede lock
│   │
│   ├── 🧱 errors/                 # Component 10
│   │   └── framework.py          # Structured Error Recovery (SERF)
│   │
│   ├── 🔭 observability/          # Component 11
│   │   ├── tracing.py            # OpenTelemetry spans
│   │   ├── metrics.py            # Prometheus instruments
│   │   └── audit.py              # Structured JSONL audit log
│   │
│   └── ✅ validation/             # Component 5
│       └── schemas.py            # Tool call envelope
│
└── 🧪 tests/                      # Narrow tests, load-bearing properties
    ├── test_circuit_breaker.py   # State machine transitions
    ├── test_errors.py            # SERF wire format + retry semantics
    └── test_policy.py            # Deny-beats-allow + default-deny
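
The `cache/manager.py` entry describes an L1 + L2 cache with a stampede lock. As a rough illustration of that idea, the sketch below uses a dict for the in-process tier, a second dict standing in for Redis, and one `asyncio.Lock` per key so that concurrent misses trigger a single backend load. All names are hypothetical, and TTLs and invalidation are deliberately left out.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class TwoTierCache:
    def __init__(self, l2: Dict[str, Any]):
        self._l1: Dict[str, Any] = {}   # in-process tier
        self._l2 = l2                   # stand-in for a shared Redis tier
        self._locks: Dict[str, asyncio.Lock] = {}

    async def get_or_load(self, key: str, loader: Callable[[], Awaitable[Any]]) -> Any:
        if key in self._l1:
            return self._l1[key]
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            # Another waiter may have filled L1 while we queued on the lock.
            if key in self._l1:
                return self._l1[key]
            if key in self._l2:
                value = self._l2[key]
            else:
                value = await loader()  # only one caller per key gets here
                self._l2[key] = value
            self._l1[key] = value
            return value


async def demo():
    calls = 0

    async def loader():
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate a slow backend query
        return "row"

    cache = TwoTierCache(l2={})
    results = await asyncio.gather(*(cache.get_or_load("k", loader) for _ in range(5)))
    return results, calls
```

Running `asyncio.run(demo())` issues five concurrent lookups for the same key; the per-key lock is what keeps the loader from being invoked five times.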

🎨 Tech Stack

| Layer | Technology |
|---|---|
| Language | Python 3.11+ |
| Web framework | Starlette + Uvicorn |
| MCP SDK | `mcp>=1.2.0` |
| Auth | PyJWT + Authlib (OAuth 2.1 resource server) |
| Validation | Pydantic v2 + Pydantic Settings |
| Database | asyncpg (PostgreSQL 16 with RLS) |
| Search | Elasticsearch 8 (async client) |
| Vector DB | Qdrant |
| Object storage | aioboto3 (MinIO / S3) |
| Cache + queues | Redis 7 (`redis[hiredis]`) |
| Reliability | tenacity (retries) + custom breaker + custom ATBA |
| Tracing | OpenTelemetry SDK + OTLP exporter |
| Metrics | prometheus_client |
| Logging | structlog (JSON) |
| LLM | Anthropic Messages API (Claude) |

🧪 Testing

The test suite is deliberately narrow, covering three load-bearing safety properties. Install the dev extras and run it:

pip install -e ".[dev]"
pytest -v

Each test file targets one property:

  • test_circuit_breaker.py: state machine transitions, retryable vs deterministic error classification
  • test_errors.py: SERF wire format, retry semantics, MCP-level error data
  • test_policy.py: default-deny, deny-beats-allow, glob matching, PII condition blocking
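
The policy properties listed for `test_policy.py` (default-deny, deny-beats-allow, glob matching) can be illustrated with a few lines of Python. This is a sketch of the evaluation rule only, not the YAML-driven engine in `auth/policy.py`; the `Rule` shape, role names, and tool names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, NamedTuple


class Rule(NamedTuple):
    effect: str        # "allow" or "deny"
    role: str
    tool_pattern: str  # glob pattern, e.g. "orders.*"


def is_allowed(rules: Iterable[Rule], role: str, tool: str) -> bool:
    # Collect the effects of every rule that matches this role and tool.
    matched = [r.effect for r in rules
               if r.role == role and fnmatchcase(tool, r.tool_pattern)]
    if "deny" in matched:       # deny beats allow
        return False
    return "allow" in matched   # no matching rule means default deny


rules = [
    Rule("allow", "support", "orders.*"),
    Rule("deny", "support", "orders.refund"),
]
```

With these rules, `is_allowed(rules, "support", "orders.get")` is permitted by the glob, `orders.refund` is blocked by the explicit deny, and any tool or role without a matching rule is refused.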

๐Ÿ›ฃ๏ธ Production Deployment

For running this in an actual production environment (managed Postgres, real OAuth provider, SIEM integration, Kubernetes), see docs/DEPLOYMENT.md.

Key swaps between local dev and production:

| Local (docker-compose) | Production |
|---|---|
| Dev JWT issuer | WorkOS AuthKit / Auth0 / Keycloak |
| MinIO | AWS S3 / GCS / Azure Blob |
| Local Postgres | AWS RDS / Cloud SQL / Supabase |
| Redis container | Upstash / ElastiCache / MemoryDB |
| Local OTel collector | Datadog / Honeycomb / Grafana Cloud |
| File-based audit log | Splunk / Chronicle / SIEM of choice |

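📚 Documentation

  • docs/ARCHITECTURE.md: the 12 components in detail
  • docs/AGENT_SYSTEM.md: multi-agent orchestrator internals
  • docs/DEPLOYMENT.md: K8s, Cloudflare Workers, bare-metal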

📜 License

MIT. See LICENSE.


<div align="center">

โญ If this helped you, please consider starring the repo

Built with โ˜• and a lot of 3 AM debugging

๐Ÿ“– Read the full blog walkthrough ยท ๐Ÿ› Report an issue ยท ๐Ÿ’ฌ Start a discussion

</div>
