claude-code-codex-agents

Enables Claude Code to delegate tasks to OpenAI's Codex CLI (GPT-5.4) with structured execution traces, parallel execution, session persistence, and adversarial code review.

The Japanese README is here

Give Claude Code structured Codex traces, not raw output.

For Claude Code users who want GPT-5.4 as a real tool: claude-code-codex-agents parses the entire JSONL event stream from Codex CLI and returns a structured execution report -- which tools it used, which files it touched, how long it took, and what went wrong. No other Codex MCP bridge does this.

Architecture Overview

graph LR
    A["Claude Code<br/>(Opus 4.6)"] -->|MCP Protocol| B["claude-code-codex-agents<br/>MCP Server"]
    B -->|"subprocess + stdin"| C[Codex CLI]
    C -->|JSONL stream| B
    C -->|API call| D["OpenAI API<br/>(GPT-5.4)"]
    B -->|Structured Report| A

Without vs With claude-code-codex-agents

Without -- You call Codex CLI and get a wall of text. You don't know what tools it used, what files it changed, or if it actually succeeded.

With claude-code-codex-agents -- Claude Code gets a structured execution trace:

[Codex gpt-5.4] Completed

⏱ Execution time: 8.3s
🧵 Thread: 019d436e-4c39-7093-b7ed-f8a26aca7938

📦 Tools used (3):
  ✅ read_file — src/auth.py
  ✅ edit_file — src/auth.py
  ✅ shell — python -m pytest tests/

📁 Files touched (1):
  • src/auth.py

━━━ Codex Response ━━━
Fixed the authentication logic. Token validation order was incorrect.
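As a rough illustration of how a JSONL event stream can be folded into a report like the one above, here is a minimal sketch. The event field names (`type`, `tool`, `target`, `path`, `text`, `message`) and the `TraceSketch` class are assumptions made for this example; they are not Codex CLI's actual event schema or this server's `CodexTrace`.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TraceSketch:
    """Hypothetical stand-in for the server's CodexTrace."""
    tools: list[tuple[str, str]] = field(default_factory=list)  # (tool, target)
    files: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)
    response: str = ""


def parse_jsonl_sketch(stream: str) -> TraceSketch:
    """Fold a JSONL event stream into a trace, skipping malformed lines."""
    trace = TraceSketch()
    for line in stream.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise mixed into the stream
        if not isinstance(event, dict):
            continue
        kind = event.get("type")  # assumed field name
        if kind == "tool_call":
            trace.tools.append((event.get("tool", "?"), event.get("target", "")))
            path = event.get("path")
            if path and path not in trace.files:
                trace.files.append(path)  # record each touched file once
        elif kind == "error":
            trace.errors.append(event.get("message", ""))
        elif kind == "message":
            trace.response = event.get("text", "")
    return trace


sample = "\n".join([
    '{"type": "tool_call", "tool": "read_file", "target": "src/auth.py", "path": "src/auth.py"}',
    '{"type": "tool_call", "tool": "edit_file", "target": "src/auth.py", "path": "src/auth.py"}',
    "not json",
    '{"type": "message", "text": "Fixed the authentication logic."}',
])
trace = parse_jsonl_sketch(sample)
print(len(trace.tools), trace.files)
```

Skipping lines that fail to parse, rather than raising, keeps one stray line of output from discarding the rest of the trace.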

Why claude-code-codex-agents?

There are 6+ Codex MCP bridges on GitHub. Here's what makes this one different:

Feature	Other bridges	claude-code-codex-agents
Output	Raw text dump	Structured trace (tools, files, timing, errors)
Parallel tasks	1 at a time	Up to 6 simultaneous
Session continuity	Stateless	threadId persistence across calls
Security	Pass-through	3-tier sandbox + terminal injection prevention
Tests	Few or none	59 tests (parsing, security, sessions, edge cases, agent lifecycle)
Review	Basic or none	Adversarial Review Loop (GPT-5.4 challenges Claude's code)

Key Features

Full JSONL Trace Parsing -- Every Codex event (tool calls, file ops, errors) parsed into a structured report
Parallel Execution -- Run up to 6 Codex tasks simultaneously via parallel_execute
Session Management -- Continue previous threads with session_continue (threadId persistence)
Agent Lifecycle -- Run Codex as a background Claude Code-style worker via spawn_codex_agent, send_codex_agent_input, and wait_codex_agent
Adversarial Review Loop -- GPT-5.4 reviews Claude's code from a different perspective
Sandbox Security -- 3-tier policy (read-only / workspace-write / danger-full-access) + terminal injection prevention
Cross-Model Discussion -- Get GPT-5.4's opinion on design decisions via discuss
Zero External Dependencies -- Just FastMCP + Codex CLI. No databases, no Docker, no config files
Native Japanese Support -- Prompts and reports work fully in Japanese
59 Tests -- Comprehensive coverage including security, parsing, session management, agent lifecycle, and edge cases

Quick Start

1. Install Codex CLI

npm install -g @openai/codex
codex login

2. Install claude-code-codex-agents

git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync

3. Add to your MCP client

Claude Code (~/.claude/settings.json):

{
  "mcpServers": {
    "claude-code-codex-agents": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
      "env": { "PYTHONUTF8": "1" }
    }
  }
}

<details> <summary><b>Cursor</b> (~/.cursor/mcp.json)</summary>

{
  "mcpServers": {
    "claude-code-codex-agents": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
      "env": { "PYTHONUTF8": "1" }
    }
  }
}

</details>

<details> <summary><b>VS Code / Windsurf</b></summary>

Add to your MCP settings:

{
  "claude-code-codex-agents": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
    "env": { "PYTHONUTF8": "1" }
  }
}

</details>

Tools

Tool	Description	Sandbox
`execute`	Delegate tasks to Codex with structured trace report	workspace-write
`trace_execute`	Same as execute, plus full event timeline	workspace-write
`parallel_execute`	Run up to 6 tasks simultaneously	read-only
`review`	Adversarial code review by GPT-5.4	read-only
`explain`	Code explanation (brief/medium/detailed)	read-only
`generate`	Code generation with optional file output	workspace-write
`discuss`	Get GPT-5.4's perspective on design decisions	read-only
`session_continue`	Continue a previous Codex thread	workspace-write
`session_list`	List session history with thread IDs	-
`spawn_codex_agent`	Launch a background Codex worker with `default` / `explorer` / `worker` roles	role-based
`send_codex_agent_input`	Continue a background Codex worker with follow-up instructions	same as agent
`wait_codex_agent`	Wait for an agent turn and fetch the last structured result	-
`list_codex_agents`	Inspect tracked background Codex agents	-
`close_codex_agent`	Close an idle Codex agent	-
`status`	Check Codex CLI status and auth	-

Claude Code-Style Agents

The new agent lifecycle tools let Claude Code treat Codex more like a persistent sub-agent than a one-shot CLI call.

Use spawn_codex_agent to start a background worker with a role preset: default for balanced execution, explorer for read-heavy investigation, worker for implementation.
Use send_codex_agent_input to continue the same worker after you read its last result.
Use wait_codex_agent to poll for completion without blocking other work.
Use list_codex_agents and close_codex_agent to manage idle workers.
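The spawn / send / wait / close cycle described above can be pictured with a small in-memory registry. This is a sketch under stated assumptions, not the server's implementation: `AgentRegistrySketch`, its method names, and the `fake_codex` runner are hypothetical, and a real worker would invoke Codex CLI rather than sleep.

```python
import asyncio
import uuid


class AgentRegistrySketch:
    """Hypothetical in-memory registry of background workers."""

    def __init__(self, runner):
        self._runner = runner  # async callable: (role, prompt) -> str
        self._agents: dict[str, dict] = {}

    def spawn(self, role: str, prompt: str) -> str:
        """Start the first turn in the background and return an agent id."""
        agent_id = uuid.uuid4().hex[:8]
        task = asyncio.create_task(self._runner(role, prompt))
        self._agents[agent_id] = {"role": role, "task": task}
        return agent_id

    def send_input(self, agent_id: str, prompt: str) -> None:
        """Start a follow-up turn; only allowed once the previous turn ended."""
        agent = self._agents[agent_id]
        if not agent["task"].done():
            raise RuntimeError("agent is still busy with its current turn")
        agent["task"] = asyncio.create_task(self._runner(agent["role"], prompt))

    async def wait(self, agent_id: str, timeout: float):
        """Return the turn's result, or None if it has not finished in time."""
        task = self._agents[agent_id]["task"]
        done, _pending = await asyncio.wait({task}, timeout=timeout)
        return task.result() if done else None

    def list_agents(self) -> dict[str, str]:
        return {aid: ("idle" if a["task"].done() else "running")
                for aid, a in self._agents.items()}

    def close(self, agent_id: str) -> None:
        if not self._agents[agent_id]["task"].done():
            raise RuntimeError("only idle agents can be closed")
        del self._agents[agent_id]


async def fake_codex(role: str, prompt: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a real Codex turn
    return f"[{role}] done: {prompt}"


async def demo():
    registry = AgentRegistrySketch(fake_codex)
    agent_id = registry.spawn("explorer", "map the auth module")
    first = await registry.wait(agent_id, timeout=1.0)
    registry.send_input(agent_id, "now list its callers")
    second = await registry.wait(agent_id, timeout=1.0)
    registry.close(agent_id)
    return [first, second], registry.list_agents()


results, remaining = asyncio.run(demo())
print(results)
```

Because `wait` takes a timeout and returns `None` instead of cancelling the task, the caller can check in periodically and do other work in between.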

Real-World Example: Adversarial Code Review

Claude Code writes code, then asks GPT-5.4 to review it:

[Codex Review] GPT-5.4 Review Result

⏱ Execution time: 15.7s

━━━ Codex Response ━━━
- [CRITICAL] `run(cmd)` calls `os.system(cmd)` directly -- command injection
  if `cmd` contains user input. Use `subprocess.run([...], shell=False)`.

- [WARNING] `divide(a, b)` raises ZeroDivisionError when b == 0.
  Add a pre-check or explicit error message.

- [INFO] No type hints on function signatures. Add `def divide(a: float,
  b: float) -> float:` for readability.
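If Claude Code needs to act on a review like the one above (for example, to block on critical findings), the severity tags can be extracted with a few lines of parsing. The sketch below assumes the `- [SEVERITY] text` layout shown in the sample output, with indented continuation lines; whether every review follows that layout is an assumption, and `parse_findings` is a hypothetical helper, not part of the server.

```python
import re
from collections import Counter

# Matches the "- [CRITICAL] ..." bullets seen in the sample review output.
FINDING = re.compile(r"^- \[(CRITICAL|WARNING|INFO)\]\s+(.*)$")


def parse_findings(report: str) -> list[tuple[str, str]]:
    """Return (severity, text) pairs, joining indented continuation lines."""
    findings: list[tuple[str, str]] = []
    for line in report.splitlines():
        match = FINDING.match(line)
        if match:
            findings.append((match.group(1), match.group(2)))
        elif findings and line.startswith("  ") and line.strip():
            severity, text = findings[-1]
            findings[-1] = (severity, f"{text} {line.strip()}")
    return findings


report = """\
- [CRITICAL] `run(cmd)` calls `os.system(cmd)` directly -- command injection
  if `cmd` contains user input.

- [WARNING] `divide(a, b)` raises ZeroDivisionError when b == 0.
"""

findings = parse_findings(report)
counts = Counter(severity for severity, _ in findings)
print(counts)
```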

Real-World Example: Parallel Execution

Run multiple analysis tasks simultaneously:

[Parallel Execution Complete] 3 tasks

━━━ Task 1 ✅ ━━━
Instruction: Analyze src/auth.py for security issues
⏱ 5.2s
...

━━━ Task 2 ✅ ━━━
Instruction: Review database query patterns in src/db.py
⏱ 7.8s
...

━━━ Task 3 ✅ ━━━
Instruction: Check error handling in src/api.py
⏱ 4.1s
...
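One common way to bound concurrency at a fixed ceiling is an `asyncio.Semaphore` around each task. The sketch below illustrates that pattern with the documented limit of 6; it is an assumption that the server works this way, and `run_capped` and `fake_task` are hypothetical names.

```python
import asyncio

MAX_PARALLEL = 6  # the documented ceiling for parallel_execute


async def run_capped(instructions, run_one, limit: int = MAX_PARALLEL):
    """Run every instruction, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(instruction: str) -> str:
        async with semaphore:
            return await run_one(instruction)

    # gather preserves input order, so results line up with instructions.
    return await asyncio.gather(*(guarded(i) for i in instructions))


active = 0
peak = 0


async def fake_task(instruction: str) -> str:
    """Stand-in for one Codex run; records how many run concurrently."""
    global active, peak
    active += 1
    peak = max(peak, active)
    await asyncio.sleep(0.01)
    active -= 1
    return f"ok: {instruction}"


results = asyncio.run(run_capped([f"task {n}" for n in range(10)], fake_task))
print(len(results), "tasks finished; peak concurrency:", peak)
```

Submitting ten tasks here queues the extra four until a slot frees up; a server could just as reasonably reject requests above the limit instead.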

Architecture

sequenceDiagram
    participant C as Claude Code
    participant H as claude-code-codex-agents
    participant X as Codex CLI
    participant O as OpenAI API

    C->>H: MCP tool call (execute)
    H->>H: _validate() + _enforce_sandbox()
    H->>X: subprocess (stdin prompt)
    X->>O: API request (GPT-5.4)
    O-->>X: Response
    X-->>H: JSONL event stream
    H->>H: parse_jsonl_events() → CodexTrace
    H->>H: _sanitize() → format_report()
    H-->>C: Structured report

Security Model

Sandbox Mode	File Write	Shell Exec	Use Case
`read-only`	Blocked	Blocked	Review, explain, discuss
`workspace-write`	CWD only	Allowed	Execute, generate
`danger-full-access`	Anywhere	Allowed	Full system access (use with caution)
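The File Write column of the table can be read as a simple predicate. The sketch below only illustrates the table's semantics; actual enforcement presumably happens in Codex CLI's sandbox and the server's `_enforce_sandbox()`, and `write_allowed` is a hypothetical helper.

```python
from pathlib import Path

SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")


def write_allowed(mode: str, target: str, cwd: str) -> bool:
    """Mirror the 'File Write' column: blocked, CWD only, or anywhere."""
    if mode not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {mode!r}")
    if mode == "read-only":
        return False
    if mode == "danger-full-access":
        return True
    # workspace-write: resolve ".." segments before comparing, so that
    # "../outside.py" cannot escape the working directory.
    workspace = Path(cwd).resolve()
    resolved = (workspace / target).resolve()
    return resolved.is_relative_to(workspace)


print(write_allowed("workspace-write", "src/auth.py", "/tmp/project"))
print(write_allowed("workspace-write", "../other/auth.py", "/tmp/project"))
```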

Additional protections:

ANSI/OSC escape sequence sanitization (terminal injection prevention)
Input validation on all parameters
Process kill on timeout
--ephemeral flag (no persistent Codex state)
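To make the first item concrete: model output may contain escape sequences that recolor text, retitle the window, or emit hyperlinks when printed to a terminal. A minimal sketch of stripping them follows. The regular expressions and the `sanitize` name are assumptions for illustration and may be narrower or broader than the server's `_sanitize()`.

```python
import re

ANSI_ESCAPES = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"              # CSI: ESC [ params, intermediates, final byte
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"   # OSC: ESC ] ... ended by BEL or ESC \
    r"|\x1b[@-Z\\-_]"                       # remaining two-character escapes
)
# Other control characters, keeping tab (0x09) and newline (0x0a).
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def sanitize(text: str) -> str:
    """Remove escape sequences first, then any leftover control characters."""
    return CONTROL_CHARS.sub("", ANSI_ESCAPES.sub("", text))


print(sanitize("\x1b[31mred\x1b[0m and \x1b]0;evil title\x07plain"))
```

Removing escape sequences before the lone control characters matters: stripping the ESC byte first would leave the rest of each sequence behind as visible garbage.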

Development

# Setup
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync --extra dev

# Run tests (59 tests)
uv run pytest tests/ -v

# Run server directly
uv run python server.py

Project structure: a single file (server.py, ~820 lines), which keeps it easy to read, modify, and contribute to.

Use Cases

Cross-Model Code Review -- Claude writes code, GPT-5.4 reviews it. A second model helps catch issues that a single model's blind spots would miss.
Parallel Codebase Analysis -- Analyze up to 6 files simultaneously, get structured reports for each.
Design Discussion -- Get GPT-5.4's alternative perspective on architectural decisions via discuss.
Session-Based Refactoring -- Large refactoring across multiple session_continue calls with context preservation.
AI Second Opinion -- When Claude's answer seems off, ask GPT-5.4 for a sanity check.

Requirements

Python 3.12+
Codex CLI (npm install -g @openai/codex)
OpenAI account (Codex CLI must be authenticated via codex login)
uv (recommended) or pip

Related Projects

Helix Ecosystem

helix-ai-studio -- All-in-one AI chat studio with 7 providers, RAG, MCP tools, and pipeline
helix-pilot -- GUI automation MCP server: AI controls the Windows desktop via a local Vision LLM
helix-agent -- Extend Claude Code with local Ollama models to cut token costs by 60-80%
helix-sandbox -- Secure sandbox MCP server (Docker + Windows Sandbox)

Alternative Codex Bridges

codex-plugin-cc -- Official OpenAI plugin for Claude Code
codex-mcp-server -- Alternative Codex MCP bridge (Node.js)

License

MIT
