claude-code-codex-agents
Enables Claude Code to delegate tasks to OpenAI's Codex CLI (GPT-5.4) with structured execution traces, parallel execution, session persistence, and adversarial code review.
Give Claude Code structured Codex traces, not raw output.
For Claude Code users who want GPT-5.4 as a real tool: claude-code-codex-agents parses the entire JSONL event stream from Codex CLI and returns a structured execution report -- which tools it used, which files it touched, how long it took, and what went wrong. No other Codex MCP bridge does this.

graph LR
A["Claude Code<br/>(Opus 4.6)"] -->|MCP Protocol| B["claude-code-codex-agents<br/>MCP Server"]
B -->|"subprocess + stdin"| C[Codex CLI]
C -->|JSONL stream| B
C -->|API call| D["OpenAI API<br/>(GPT-5.4)"]
B -->|Structured Report| A
Without vs With claude-code-codex-agents
Without -- You call Codex CLI and get a wall of text. You don't know what tools it used, what files it changed, or if it actually succeeded.
With claude-code-codex-agents -- Claude Code gets a structured execution trace:
[Codex gpt-5.4] Completed
⏱ Execution time: 8.3s
🧵 Thread: 019d436e-4c39-7093-b7ed-f8a26aca7938
📦 Tools used (3):
✅ read_file — src/auth.py
✅ edit_file — src/auth.py
✅ shell — python -m pytest tests/
📁 Files touched (1):
• src/auth.py
━━━ Codex Response ━━━
Fixed the authentication logic. Token validation order was incorrect.
Why claude-code-codex-agents?
There are 6+ Codex MCP bridges on GitHub. Here's what makes this one different:
| | Other bridges | claude-code-codex-agents |
|---|---|---|
| Output | Raw text dump | Structured trace (tools, files, timing, errors) |
| Parallel tasks | 1 at a time | Up to 6 simultaneous |
| Session continuity | Stateless | threadId persistence across calls |
| Security | Pass-through | 3-tier sandbox + terminal injection prevention |
| Tests | Few or none | 59 tests (parsing, security, sessions, edge cases, agent lifecycle) |
| Review | Basic or none | Adversarial Review Loop (GPT-5.4 challenges Claude's code) |
Key Features
- Full JSONL Trace Parsing -- Every Codex event (tool calls, file ops, errors) parsed into a structured report
- Parallel Execution -- Run up to 6 Codex tasks simultaneously via `parallel_execute`
- Session Management -- Continue previous threads with `session_continue` (threadId persistence)
- Agent Lifecycle -- Run Codex as a background Claude Code-style worker via `spawn_codex_agent`, `send_codex_agent_input`, and `wait_codex_agent`
- Adversarial Review Loop -- GPT-5.4 reviews Claude's code from a different perspective
- Sandbox Security -- 3-tier policy (read-only / workspace-write / danger-full-access) + terminal injection prevention
- Cross-Model Discussion -- Get GPT-5.4's opinion on design decisions via `discuss`
- Zero External Dependencies -- Just FastMCP + Codex CLI. No databases, no Docker, no config files
- Japanese Native -- Full Japanese prompt and report support
- 59 Tests -- Comprehensive coverage including security, parsing, session management, agent lifecycle, and edge cases
Quick Start
1. Install Codex CLI
npm install -g @openai/codex
codex login
2. Install claude-code-codex-agents
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync
3. Add to your MCP client
Claude Code (~/.claude/settings.json):
{
"mcpServers": {
"claude-code-codex-agents": {
"type": "stdio",
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}
}
<details> <summary><b>Cursor</b> (~/.cursor/mcp.json)</summary>
{
"mcpServers": {
"claude-code-codex-agents": {
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}
}
</details>
<details> <summary><b>VS Code / Windsurf</b></summary>
Add to your MCP settings:
{
"claude-code-codex-agents": {
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}
</details>
Tools
| Tool | Description | Sandbox |
|---|---|---|
| `execute` | Delegate tasks to Codex with structured trace report | workspace-write |
| `trace_execute` | Same as execute, plus full event timeline | workspace-write |
| `parallel_execute` | Run up to 6 tasks simultaneously | read-only |
| `review` | Adversarial code review by GPT-5.4 | read-only |
| `explain` | Code explanation (brief/medium/detailed) | read-only |
| `generate` | Code generation with optional file output | workspace-write |
| `discuss` | Get GPT-5.4's perspective on design decisions | read-only |
| `session_continue` | Continue a previous Codex thread | workspace-write |
| `session_list` | List session history with thread IDs | - |
| `spawn_codex_agent` | Launch a background Codex worker with default / explorer / worker roles | role-based |
| `send_codex_agent_input` | Continue a background Codex worker with follow-up instructions | same as agent |
| `wait_codex_agent` | Wait for an agent turn and fetch the last structured result | - |
| `list_codex_agents` | Inspect tracked background Codex agents | - |
| `close_codex_agent` | Close an idle Codex agent | - |
| `status` | Check Codex CLI status and auth | - |
Claude Code-Style Agents
The new agent lifecycle tools let Claude Code treat Codex more like a persistent sub-agent than a one-shot CLI call.
- Use `spawn_codex_agent` to start a background worker with a role preset: `default` for balanced execution, `explorer` for read-heavy investigation, `worker` for implementation.
- Use `send_codex_agent_input` to continue the same worker after you read its last result.
- Use `wait_codex_agent` to poll for completion without blocking other work.
- Use `list_codex_agents` and `close_codex_agent` to manage idle workers.
Real-World Example: Adversarial Code Review
Claude Code writes code, then asks GPT-5.4 to review it:
[Codex Review] GPT-5.4 Review Result
⏱ Execution time: 15.7s
━━━ Codex Response ━━━
- [CRITICAL] `run(cmd)` calls `os.system(cmd)` directly -- command injection
if `cmd` contains user input. Use `subprocess.run([...], shell=False)`.
- [WARNING] `divide(a, b)` raises ZeroDivisionError when b == 0.
Add a pre-check or explicit error message.
- [INFO] No type hints on function signatures. Add `def divide(a: float,
b: float) -> float:` for readability.
Real-World Example: Parallel Execution
Run multiple analysis tasks simultaneously:
[Parallel Execution Complete] 3 tasks
━━━ Task 1 ✅ ━━━
Instruction: Analyze src/auth.py for security issues
⏱ 5.2s
...
━━━ Task 2 ✅ ━━━
Instruction: Review database query patterns in src/db.py
⏱ 7.8s
...
━━━ Task 3 ✅ ━━━
Instruction: Check error handling in src/api.py
⏱ 4.1s
...
Architecture
sequenceDiagram
participant C as Claude Code
participant H as claude-code-codex-agents
participant X as Codex CLI
participant O as OpenAI API
C->>H: MCP tool call (execute)
H->>H: _validate() + _enforce_sandbox()
H->>X: subprocess (stdin prompt)
X->>O: API request (GPT-5.4)
O-->>X: Response
X-->>H: JSONL event stream
H->>H: parse_jsonl_events() → CodexTrace
H->>H: _sanitize() → format_report()
H-->>C: Structured report
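The parsing step in the diagram can be sketched as a line-by-line fold over the JSONL stream. This is an illustration only: the names `parse_jsonl_events` and `CodexTrace` are borrowed from the diagram, but the body is not the project's implementation, and the event shape used here (`type`, `thread_id`, `name`, `target`, `ok`, `text`) is an assumed schema, not the one Codex CLI actually emits.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToolCall:
    name: str
    target: str
    ok: bool


@dataclass
class CodexTrace:
    thread_id: Optional[str] = None
    tools: List[ToolCall] = field(default_factory=list)
    files: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)
    message: str = ""


FILE_TOOLS = ("read_file", "edit_file")  # assumed tool names


def parse_jsonl_events(stream: str) -> CodexTrace:
    """Fold a JSONL event stream (assumed schema) into one trace."""
    trace = CodexTrace()
    for raw in stream.splitlines():
        line = raw.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Keep going: one bad line should not lose the whole trace.
            trace.errors.append(f"unparseable line: {line[:60]}")
            continue
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "thread_started":
            trace.thread_id = event.get("thread_id")
        elif kind == "tool_call":
            call = ToolCall(
                name=str(event.get("name", "?")),
                target=str(event.get("target", "")),
                ok=bool(event.get("ok", True)),
            )
            trace.tools.append(call)
            if call.name in FILE_TOOLS and call.target not in trace.files:
                trace.files.append(call.target)
        elif kind == "error":
            trace.errors.append(str(event.get("message", "")))
        elif kind == "message":
            trace.message = str(event.get("text", ""))
    return trace


if __name__ == "__main__":
    sample = "\n".join([
        '{"type": "thread_started", "thread_id": "t-1"}',
        '{"type": "tool_call", "name": "edit_file", "target": "src/auth.py", "ok": true}',
        '{"type": "message", "text": "Fixed the authentication logic."}',
    ])
    print(parse_jsonl_events(sample))
```

Unparseable lines are recorded as errors instead of raising, so a partially corrupted stream still yields a usable report.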
Security Model
| Sandbox Mode | File Write | Shell Exec | Use Case |
|---|---|---|---|
| `read-only` | Blocked | Blocked | Review, explain, discuss |
| `workspace-write` | CWD only | Allowed | Execute, generate |
| `danger-full-access` | Anywhere | Allowed | Full system access (use with caution) |
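The table can be read as two small predicates over the sandbox mode. The sketch below only restates the policy for illustration; `can_write` and `can_run_shell` are hypothetical helpers, and actual enforcement may happen elsewhere (for example inside Codex CLI's own sandbox) rather than in path checks like these.

```python
from pathlib import Path

SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")


def _check_mode(mode: str) -> None:
    if mode not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {mode}")


def can_write(mode: str, target: str, cwd: str) -> bool:
    """File-write column: blocked, CWD only, or anywhere."""
    _check_mode(mode)
    if mode == "read-only":
        return False
    if mode == "danger-full-access":
        return True
    # workspace-write: the resolved target must stay inside the working dir.
    root = Path(cwd).resolve()
    path = (root / target).resolve()  # an absolute target replaces root
    return path == root or root in path.parents


def can_run_shell(mode: str) -> bool:
    """Shell-exec column: blocked only in read-only mode."""
    _check_mode(mode)
    return mode != "read-only"


if __name__ == "__main__":
    for mode in SANDBOX_MODES:
        print(mode, can_write(mode, "src/auth.py", "."), can_run_shell(mode))
```

Resolving both paths before comparing them is what stops a relative target such as `../other/file.py` from escaping the working directory.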
Additional protections:
- ANSI/OSC escape sequence sanitization (terminal injection prevention)
- Input validation on all parameters
- Process kill on timeout
- `--ephemeral` flag (no persistent Codex state)
Development
# Setup
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync --extra dev
# Run tests (59 tests)
uv run pytest tests/ -v
# Run server directly
uv run python server.py
Project structure: Single file (server.py, ~820 lines). Easy to read, modify, and contribute.
Use Cases
- Cross-Model Code Review -- Claude writes code, GPT-5.4 reviews it. Eliminates single-model bias.
- Parallel Codebase Analysis -- Analyze 6 files simultaneously, get structured reports for each.
- Design Discussion -- Get GPT-5.4's alternative perspective on architectural decisions via `discuss`.
- Session-Based Refactoring -- Large refactoring across multiple `session_continue` calls with context preservation.
- AI Second Opinion -- When Claude's answer seems off, ask GPT-5.4 for a sanity check.
Requirements
- Python 3.12+
- Codex CLI (`npm install -g @openai/codex`)
- OpenAI account (Codex CLI must be authenticated via `codex login`)
- uv (recommended) or pip
Related Projects
Helix Ecosystem
- helix-ai-studio — All-in-one AI chat studio with 7 providers, RAG, MCP tools, and pipeline
- helix-pilot — GUI automation MCP server — AI controls Windows desktop via local Vision LLM
- helix-agent — Extend Claude Code with local Ollama models — cut token costs by 60-80%
- helix-sandbox — Secure sandbox MCP server — Docker + Windows Sandbox
Alternative Codex Bridges
- codex-plugin-cc -- Official OpenAI plugin for Claude Code
- codex-mcp-server -- Alternative Codex MCP bridge (Node.js)
License