Arena MCP Server
Enables running position-driven adversarial debates and code reviews between AI agents via MCP tools, supporting custom positions, multiple rounds, and local CLI models like Claude, Codex, and Gemini.
Arena
█████╗ ██████╗ ███████╗███╗ ██╗ █████╗
██╔══██╗██╔══██╗██╔════╝████╗ ██║██╔══██╗
███████║██████╔╝█████╗ ██╔██╗ ██║███████║
██╔══██║██╔══██╗██╔══╝ ██║╚██╗██║██╔══██║
██║ ██║██║ ██║███████╗██║ ╚████║██║ ██║
╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚═╝ ╚═══╝╚═╝ ╚═╝
A position-driven adversarial arena for AI agents. Host provides context and 2+ opposing positions; arena dispatches local CLI models (Claude, Codex, Gemini, OpenAI, Kimi) to argue each position over multiple rounds and returns the transcript.
A standalone CLI — invoke it from your shell, scripts, or any agent that can run shell commands.
Mental model
- Host doesn't fight. The caller (Claude Code, Codex CLI, scripts) just supplies what should be argued and which positions to argue.
- Position is the unit, not the model. Adversarial value comes from clashing stances, not from "which model wins". Same model with two different system prompts is a valid pair if no other CLI is available.
- Arena owns model dispatch. It picks distinct models when multiple CLIs are healthy, falls back to reusing one when not.
Subcommands
| Subcommand | Purpose |
|---|---|
| `arena challenge` | Core. Run N positions over R rounds against the supplied context. |
| `arena review` | Code-review preset over `arena challenge`. Spawns attacker positions (default: bug-hunter + security-auditor) on the supplied code/diff. |
| `arena health` | List agent CLIs and their availability. |
| `arena mcp` | Start arena as a stdio MCP server; exposes each scenario as a tool callable from any MCP client. |
Install
# Required: at least one of these CLIs in $PATH
npm install -g @anthropic-ai/claude-cli # for "claude"
npm install -g @codex-ai/cli # for "codex" / "openai" / "gemini"
uv tool install kimi-cli # for "kimi" (or: pipx install kimi-cli)
Shell (no npm/node required)
Downloads a self-contained native binary from the latest GitHub release. Supports macOS (arm64/x64) and Linux (arm64/x64).
curl -fsSL https://raw.githubusercontent.com/tim101010101/arena/main/install.sh | bash
Installs to ~/.local/bin/arena. Override the directory with ARENA_INSTALL_DIR, or pin a version with ARENA_VERSION:
# Set the variables on bash (not curl) so the install script can see them
curl -fsSL https://raw.githubusercontent.com/tim101010101/arena/main/install.sh | \
  ARENA_INSTALL_DIR=/usr/local/bin ARENA_VERSION=v0.1.3 bash
npm
npm install -g arena-mcp # or: npx arena-mcp
CLI usage
# Adversarial debate — supply your own positions
arena challenge \
--context "Should we use microservices or a monolith for a 10k-user product with 5 devs?" \
--position "Pro-microservices: team boundaries justify the split" \
--position "Pro-monolith: a 5-person team should not carry the ops burden" \
--rounds 3
# Adversarial code review (positions auto-derived from --focus)
arena review --git-ref feature/auth --focus bugs,security
arena review --files src/login.ts,src/session.ts --focus security
# Override which models to use (must already be healthy)
arena challenge --context "..." --position a --position b --models claude,codex
# Diagnostics
arena health
arena --version
arena --help
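When arena is driven from a script rather than typed by hand, the argument list has to be assembled programmatically. The following is a minimal sketch, not part of arena itself: `ChallengeSpec` and `buildChallengeArgs` are hypothetical names, and only the flags shown in the examples above (`--context`, `--position`, `--rounds`, `--models`) are used.

```typescript
// Hypothetical helper (not part of arena): builds argv for `arena challenge`
// using only the flags that appear in the usage examples.
interface ChallengeSpec {
  context: string;
  positions: string[]; // at least two opposing stances
  rounds?: number;
  models?: string[]; // e.g. ["claude", "codex"]
}

function buildChallengeArgs(spec: ChallengeSpec): string[] {
  if (spec.positions.length < 2) {
    throw new Error("arena challenge needs at least two positions");
  }
  const argv: string[] = ["challenge", "--context", spec.context];
  for (const position of spec.positions) {
    argv.push("--position", position); // the flag is repeated once per position
  }
  if (spec.rounds !== undefined) {
    argv.push("--rounds", String(spec.rounds));
  }
  if (spec.models !== undefined && spec.models.length > 0) {
    argv.push("--models", spec.models.join(",")); // comma-separated, as in the example
  }
  return argv;
}

const demoArgv = buildChallengeArgs({
  context: "Should we use microservices or a monolith?",
  positions: ["Pro-microservices", "Pro-monolith"],
  rounds: 3,
});
console.log(JSON.stringify(["arena", ...demoArgv]));
```

Passing the array to a process-spawning call such as Node's `spawnSync("arena", argv)` avoids shell quoting problems with long context strings.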
MCP server
arena mcp starts a stdio MCP server. Each loaded scenario (challenge, review, and any user-defined ones) is exposed as an MCP tool; a health tool is also included.
Add it to your MCP client config (e.g. Claude Desktop or Claude Code .mcp.json):
{
  "mcpServers": {
    "arena": {
      "command": "arena",
      "args": ["mcp"]
    }
  }
}
Once connected, your AI client can call:
- `challenge`: supply `context` (string) and `positions` (array of ≥2 strings); optional `rounds` and `models`.
- `review`: supply `sources` (array of source objects: `raw`, `git_ref`, `git_range`, `file_list`, or `patch_file`); optional `focus`, `rounds`, and `models`.
- `health`: returns availability of all local agent CLIs.
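For illustration, this is roughly what an MCP client sends when it calls the `challenge` tool. The envelope follows the standard MCP `tools/call` JSON-RPC request; the argument names come from the list above, while the types of `rounds` (number) and `models` (string array) are assumptions, and `challengeRequest` is a hypothetical helper name.

```typescript
// Arguments for the `challenge` tool. Field names are from the README;
// the types of `rounds` and `models` are assumed, not confirmed.
interface ChallengeArguments {
  context: string;
  positions: string[];
  rounds?: number;
  models?: string[];
}

// Standard MCP JSON-RPC envelope for calling a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ChallengeArguments };
}

// Hypothetical helper: validates the one documented constraint
// (at least two positions) and wraps the arguments in the envelope.
function challengeRequest(id: number, args: ChallengeArguments): ToolCallRequest {
  if (args.positions.length < 2) {
    throw new Error("challenge requires at least two positions");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "challenge", arguments: args },
  };
}

const demoRequest = challengeRequest(1, {
  context: "Monolith or microservices for a 5-person team?",
  positions: ["Pro-microservices", "Pro-monolith"],
  rounds: 2,
});
console.log(JSON.stringify(demoRequest, null, 2));
```

In practice an MCP client library builds this envelope for you; the sketch only shows which fields carry the debate inputs.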
Configuration (env vars)
| Variable | Default | Notes |
|---|---|---|
| `ARENA_TIMEOUT_MS` | `120000` | Per-fighter execution timeout |
| `ARENA_DEFAULT_ROUNDS` | `3` | Default rounds when not specified |
| `ARENA_DEFAULT_MODE` | `parallel` | Reserved (challenge runs sequentially) |
| `ARENA_MAX_CONTEXT_SIZE` | `1000000` | Max bytes from sources |
| `ARENA_CLAUDE_MODEL` / `ARENA_CODEX_MODEL` / `ARENA_GEMINI_MODEL` / `ARENA_OPENAI_MODEL` / `ARENA_KIMI_MODEL` | CLI default | Per-adapter model override |
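The table can be read as a set of defaults that environment variables override. The sketch below shows one way such a lookup might work; it is not arena's actual code. `ArenaConfig`, `loadArenaConfig`, and `modelOverride` are hypothetical names, and rejecting non-positive or non-numeric values is an assumption about how invalid input should be handled.

```typescript
// Hypothetical config shape mirroring the table above.
interface ArenaConfig {
  timeoutMs: number;
  defaultRounds: number;
  defaultMode: string;
  maxContextSize: number;
}

type Env = Record<string, string | undefined>;

// Reads a positive integer from env, falling back to the documented default.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = parseInt(raw, 10);
  if (isNaN(parsed) || parsed <= 0) {
    // Assumption: invalid values are rejected rather than silently ignored.
    throw new Error(name + " must be a positive integer, got \"" + raw + "\"");
  }
  return parsed;
}

function loadArenaConfig(env: Env): ArenaConfig {
  return {
    timeoutMs: readPositiveInt(env, "ARENA_TIMEOUT_MS", 120000),
    defaultRounds: readPositiveInt(env, "ARENA_DEFAULT_ROUNDS", 3),
    defaultMode: env["ARENA_DEFAULT_MODE"] || "parallel",
    maxContextSize: readPositiveInt(env, "ARENA_MAX_CONTEXT_SIZE", 1000000),
  };
}

// Per-adapter override, e.g. adapter "claude" -> ARENA_CLAUDE_MODEL.
// Returns undefined when unset, meaning "use the CLI's own default".
function modelOverride(env: Env, adapter: string): string | undefined {
  return env["ARENA_" + adapter.toUpperCase() + "_MODEL"];
}

console.log(JSON.stringify(loadArenaConfig({ ARENA_TIMEOUT_MS: "300000" })));
```

In a real script the `env` argument would typically be `process.env`; taking it as a parameter keeps the function easy to test.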
Dispatch behavior
positions = ["A", "B"]
available = healthCheckAll().filter(ok)
override = caller-supplied --models / models[]
pool = override ?? available
fighter[i].model = pool[i % pool.length]
- Prefers distinct models when `len(positions) ≤ len(pool)`.
- Cycles when positions outnumber the pool: same model, different prompts.
- Each fighter gets a unique id (`<model>#<i>`) so transcripts stay disambiguated.
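The round-robin assignment described above can be sketched in a few lines. This illustrates the pseudocode and is not arena's actual source: `Fighter` and `assignFighters` are hypothetical names, the health check is replaced by a plain `available` array, and the zero-based index in the id is an assumption.

```typescript
// Hypothetical fighter record; the id format <model>#<i> is from the README.
interface Fighter {
  id: string;
  model: string;
  position: string;
}

function assignFighters(
  positions: string[],
  available: string[], // models that passed the health check
  override?: string[], // caller-supplied --models / models[]
): Fighter[] {
  const pool = override ?? available;
  if (pool.length === 0) {
    throw new Error("no healthy models available");
  }
  return positions.map((position, i) => {
    // Distinct models while i < pool.length, then cycle back to the start.
    const model = pool[i % pool.length];
    return { id: model + "#" + i, model, position };
  });
}

// Three positions, two healthy models: the third fighter reuses the first model.
const demoFighters = assignFighters(["A", "B", "C"], ["claude", "codex"]);
console.log(demoFighters.map((f) => f.id).join(", "));
// prints "claude#0, codex#1, claude#2"
```

Because the index is part of the id, two fighters that share a model (here `claude#0` and `claude#2`) remain distinguishable in the transcript.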
Development
bun install
bun test # full suite
bun run build # produces dist/index.js
License
MIT