custom-mcp-server
Production MCP server for data annotation workflows, exposing six tools backed by AWS S3, DynamoDB, and Slack with JWT auth, rate limiting, and exponential-backoff retries.
A production Model Context Protocol server for a data-annotation workflow. It exposes six tools over the MCP stdio transport, backed by AWS S3 + DynamoDB and Slack, with JWT auth, per-tool rate limiting, and exponential-backoff retries.
Prerequisites
- Node.js 20 LTS
- npm
- AWS account (S3 bucket + DynamoDB table) and a Slack bot token for runtime use (not required to run the test suite — all external calls are mocked)
Install
npm install
Environment setup
Copy .env.example to .env and fill in the values. Keys:
| Key | Required by | Notes |
|---|---|---|
| `AWS_REGION` | all AWS tools | e.g. `us-east-1` |
| `AWS_ACCESS_KEY_ID` | all AWS tools | secret; keep out of source control |
| `AWS_SECRET_ACCESS_KEY` | all AWS tools | secret |
| `S3_BUCKET_NAME` | `s3_upload`, `s3_download` | default bucket |
| `DYNAMO_TABLE_NAME` | `dynamo_read`, `dynamo_write`, `annotation_status` | table with partition key `id` |
| `SLACK_BOT_TOKEN` | `slack_notify`, `annotation_status` | secret, `xoxb-...` |
| `SLACK_DEFAULT_CHANNEL` | `slack_notify`, `annotation_status` | e.g. `#annotations` |
| `OAUTH_ISSUER` | auth (every call) | expected `iss` claim |
| `OAUTH_AUDIENCE` | auth (every call) | expected `aud` claim |
| `JWKS_URI` | auth (RS256) | JWKS endpoint for signature verification |
| `JWT_SECRET` | auth (HS256, dev only) | optional; ≥ 32 chars; refused when `NODE_ENV=production` |
| `NODE_ENV` | auth | set to `production` to force RS256/JWKS and forbid HS256 |
| `RATE_LIMIT_PER_MIN` | rate limiter | default 100 |
| `RETRY_MAX_ATTEMPTS` | retry | default 3 |
| `RETRY_BASE_DELAY_MS` | retry | default 200 |
| `LOG_LEVEL` | logger | `debug`/`info`/`warn`/`error`, default `info` |
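The defaults in the table can be illustrated with a small loader. This is a minimal sketch, not the project's `config.ts` (which the source layout says uses zod): the `TuningConfig` shape, `loadTuning`, and `intOrDefault` are hypothetical names, and rejecting non-positive numbers is an assumption.

```typescript
// Hypothetical shape covering only the tuning keys that have documented defaults.
const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface TuningConfig {
  rateLimitPerMin: number;
  retryMaxAttempts: number;
  retryBaseDelayMs: number;
  logLevel: LogLevel;
}

type EnvMap = Record<string, string | undefined>;

function isLogLevel(value: string): value is LogLevel {
  return (LOG_LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Parse a positive integer, falling back to the documented default when unset.
// Rejecting zero and negative values is an assumption of this sketch.
function intOrDefault(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(name + " must be a positive integer");
  }
  return value;
}

function loadTuning(env: EnvMap): TuningConfig {
  const rawLevel = env.LOG_LEVEL === undefined || env.LOG_LEVEL === "" ? "info" : env.LOG_LEVEL;
  if (!isLogLevel(rawLevel)) {
    throw new Error("LOG_LEVEL must be one of debug/info/warn/error");
  }
  return {
    rateLimitPerMin: intOrDefault(env.RATE_LIMIT_PER_MIN, 100, "RATE_LIMIT_PER_MIN"),
    retryMaxAttempts: intOrDefault(env.RETRY_MAX_ATTEMPTS, 3, "RETRY_MAX_ATTEMPTS"),
    retryBaseDelayMs: intOrDefault(env.RETRY_BASE_DELAY_MS, 200, "RETRY_BASE_DELAY_MS"),
    logLevel: rawLevel,
  };
}
```

Passing the environment as a plain object (for example `loadTuning(process.env)`) keeps the function easy to test without touching real environment variables.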
Build / test / run
npm run build # compile TypeScript to dist/
npm run typecheck # tsc --noEmit
npm test # Jest (ESM) — all external calls mocked
npm start # node dist/server.js (stdio transport)
Tools
| Tool | Input (**required**) | Required scope | Behavior |
|---|---|---|---|
| `s3_upload` | **key**, **contentBase64**, contentType? | `s3:write` | Upload base64 content to S3 under the caller's prefix; returns `{ bucket, key, etag }` |
| `s3_download` | **key** | `s3:read` | Download object from the caller's prefix; returns `{ bucket, key, contentBase64, contentType }` |
| `dynamo_read` | **id**, consistentRead? | `dynamo:read` | Read a record the caller owns; returns the item (without `owner`) or `{ found: false }` |
| `dynamo_write` | **id**, **attributes**, overwrite? | `dynamo:write` | Put a record stamped with the caller as owner; can only overwrite records the caller owns |
| `slack_notify` | **message**, channel?, threadTs? | `slack:write` | Post to Slack (text sanitized); returns `{ channel, ts }` |
| `annotation_status` | **taskId**, newStatus?, notify? | `annotation:read` (+ `annotation:write` to update) | Read/update a task the caller owns, optionally notify Slack |
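To make the input column concrete, here is a sketch of what an `s3_upload` call might look like on the wire. It assumes the standard MCP JSON-RPC `tools/call` envelope with the token carried in `params._meta.authorization`; `buildToolCall`, the key, and the `<jwt>` placeholder are illustrative, and an MCP client library would normally build this envelope for you.

```typescript
// Assumed request shape: JSON-RPC 2.0 "tools/call" with the JWT in params._meta.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
    _meta: { authorization: string };
  };
}

// Hypothetical helper: wraps tool arguments and a token into one request object.
function buildToolCall(
  id: number,
  tool: string,
  args: Record<string, unknown>,
  jwt: string
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: {
      name: tool,
      arguments: args,
      _meta: { authorization: "Bearer " + jwt },
    },
  };
}

// Required inputs are key and contentBase64; contentType is optional.
// The payload below is the base64 encoding of {"label":"cat"}.
const uploadRequest = buildToolCall(
  1,
  "s3_upload",
  {
    key: "batch-1/labels.json",
    contentBase64: "eyJsYWJlbCI6ImNhdCJ9",
    contentType: "application/json",
  },
  "<jwt>"
);
```

The token also needs the `s3:write` scope for this call to pass authorization.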
Auth model
Every tool call is authenticated and authorized:
- Authentication. The caller supplies a JWT via `_meta.authorization` (optionally `Bearer`-prefixed). The expected algorithm is pinned from server configuration, not the token header, to block algorithm-confusion attacks: RS256 (verified against `JWKS_URI`) by default, or HS256 only when a `JWT_SECRET` (≥ 32 chars) is set and `NODE_ENV` is not `production`. The server checks `iss`/`aud`/expiry (with a small clock skew) and derives an `AuthContext` (`subject`, `scopes`). Invalid tokens → `AUTH_INVALID`.
- Scope authorization. Each tool declares `requiredScopes`. A token missing a required scope is rejected with `FORBIDDEN` before the handler runs.
- Object-level authorization (ownership). DynamoDB records carry an `owner` attribute and S3 keys are confined to a per-subject prefix (`<subject>/…`). Callers can only read/update their own records and objects; foreign records are reported as not-found to avoid ID enumeration. This prevents IDOR.
- Error handling. Callers receive only a stable error `code` plus a `requestId`; full error detail is logged server-side (stderr) and never leaked to the client.
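The checks above can be sketched as a few pure functions. This is an illustration under stated assumptions, not the code in `auth/oauth.ts` or `security.ts`: all four function names are hypothetical, signature verification itself is left out, and rejecting keys that start with `/` or contain `..` segments is an assumption about how prefix confinement might be enforced.

```typescript
type JwtAlgorithm = "RS256" | "HS256";
type AuthEnv = Record<string, string | undefined>;

// Pin the algorithm from server configuration only; the token header is never consulted.
function pinAlgorithm(env: AuthEnv): JwtAlgorithm {
  const secret = env.JWT_SECRET;
  const hs256Allowed =
    secret !== undefined && secret.length >= 32 && env.NODE_ENV !== "production";
  return hs256Allowed ? "HS256" : "RS256";
}

// Accept the token with or without a "Bearer " prefix.
function stripBearer(authorization: string): string {
  return authorization.replace(/^Bearer\s+/i, "").trim();
}

// Return the required scopes the token lacks; a non-empty result maps to FORBIDDEN.
function missingScopes(required: string[], granted: string[]): string[] {
  return required.filter((scope) => granted.indexOf(scope) === -1);
}

// Confine an S3 key to the caller's prefix (<subject>/...).
// Rejecting absolute keys and ".." segments is an assumption of this sketch.
function scopedKey(subject: string, key: string): string {
  const segments = key.split("/");
  if (key.length === 0 || key.charAt(0) === "/" || segments.indexOf("..") !== -1) {
    throw new Error("invalid object key");
  }
  return subject + "/" + key;
}
```

For example, `missingScopes(["annotation:read", "annotation:write"], ["annotation:read"])` returns `["annotation:write"]`, which is the case where a status update would be refused while a plain read would pass.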
JWKS keys are fetched through a cached, rate-limited client to avoid a network round-trip (and IdP DoS) on every verification. S3 up/downloads are capped at 10 MiB to bound memory use.
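One way to enforce the 10 MiB cap before allocating a buffer is to compute the decoded size from the base64 length. The sketch below assumes the cap applies to decoded bytes and that input is standard padded base64; `decodedLength` and `withinUploadCap` are hypothetical names.

```typescript
// 10 MiB expressed in bytes.
const MAX_OBJECT_BYTES = 10 * 1024 * 1024;

// Decoded size of a padded base64 string: every 4 characters carry 3 bytes,
// minus one byte per trailing "=" padding character.
function decodedLength(contentBase64: string): number {
  if (contentBase64.length % 4 !== 0) {
    throw new Error("expected padded base64");
  }
  let padding = 0;
  if (contentBase64.slice(-2) === "==") padding = 2;
  else if (contentBase64.slice(-1) === "=") padding = 1;
  return (contentBase64.length / 4) * 3 - padding;
}

// True when the payload fits within the cap, checked without decoding it.
function withinUploadCap(contentBase64: string): boolean {
  return decodedLength(contentBase64) <= MAX_OBJECT_BYTES;
}
```

Checking the length first means an oversized payload can be rejected without ever decoding it into memory.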
Rate limiting & retries
- Rate limit: 100 requests/min per principal+tool (configurable), in-memory per process, keyed by `subject:tool` so one caller cannot starve others. Unknown tool names are rejected before consuming limiter budget. Exceeding the limit yields `RATE_LIMITED`.
- Retry: transient failures (`retryable: true`) are retried up to 3 times with exponential backoff (`baseDelay * 2^(n-1)`). Conflicts, validation, and auth errors are never retried.
- Idempotency note: retries wrap non-idempotent writes (`dynamo_write`, `slack_notify`). Only transient errors are retried, but adding idempotency keys is recommended future work.
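Both mechanisms can be sketched in a few lines. The README does not say which limiting algorithm the server uses, so the fixed window below is an assumption, as is treating the configured value of 3 as the total number of attempts. `FixedWindowLimiter`, `backoffDelayMs`, and `withRetry` are hypothetical names.

```typescript
// Fixed-window limiter keyed by "subject:tool" (the algorithm is an assumption).
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number = 100, windowMs: number = 60000, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns false when the caller should receive RATE_LIMITED.
  tryConsume(subject: string, tool: string): boolean {
    const key = subject + ":" + tool;
    const t = this.now();
    const current = this.windows.get(key);
    if (current === undefined || t - current.start >= this.windowMs) {
      this.windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Delay before the next try after failed attempt n (1-based): baseDelay * 2^(n-1).
function backoffDelayMs(attempt: number, baseDelayMs: number = 200): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Retry only errors flagged retryable: true; everything else is rethrown at once.
async function withRetry<T>(
  op: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const retryable =
        typeof err === "object" && err !== null && (err as { retryable?: boolean }).retryable === true;
      if (!retryable || attempt >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt, baseDelayMs));
    }
  }
}
```

With the default base delay of 200 ms, the waits after the first and second failed attempts are 200 ms and 400 ms. The injectable `now` and `sleep` parameters exist only to make the sketch testable without real clocks.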
Cursor setup
.cursor/mcp.json registers the server with Cursor:
{
"mcpServers": {
"custom-mcp-server": {
"command": "node",
"args": ["dist/server.js"],
"env": { "AWS_REGION": "us-east-1", "...": "..." }
}
}
}
Secrets (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, SLACK_BOT_TOKEN,
JWT_SECRET) are not placed in mcp.json; provide them via your shell
environment / .env. Run npm run build before launching so dist/server.js
exists.
Architecture
See PLAN.md for the full milestone plan, interface contracts, and
blocker analysis. Source layout:
src/
server.ts stdio transport + tool-call pipeline
config.ts env loading + validation (zod)
types.ts shared interface contracts
errors.ts AppError exception + guards
security.ts scopes, ownership, key-scoping & sanitization helpers
logger.ts stderr-only structured logger
auth/oauth.ts JWT validation (algorithm-pinned) + cached JWKS
middleware/ retry.ts, rate-limiter.ts
clients/ s3/dynamo/slack factories
tools/ one file per tool + index.ts