gpt-codex-handoff
Enables Codex to delegate decision-making to an OpenAI reviewer for structured next-step recommendations during long-running tasks.
GPT Codex Handoff
An MVP MCP server that gives Codex a tool named ask_gpt_next_step. Codex can call it during long-running work to ask an OpenAI-powered reviewer for a structured recommendation about what to do next.
Flow:
Codex -> MCP tool ask_gpt_next_step -> OpenAI API reviewer -> strict JSON recommendation
Windows Setup
From a fresh checkout on Windows PowerShell:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
Run the tests:
python -m pytest --basetemp=".venv\pytest-tmp"
PowerShell can treat brackets as wildcard syntax in some contexts, so keep ".[dev]" quoted in the install command above.
Environment
Fake reviewer mode is for local wiring tests. It returns valid recommendation JSON without importing the OpenAI client, without requiring OPENAI_API_KEY, and without contacting OpenAI:
$env:GPT_HANDOFF_REVIEWER_MODE = "fake"
Live reviewer calls require OPENAI_API_KEY, and GPT_HANDOFF_REVIEWER_MODE must not be set to "fake".
Copy .env.example to .env for your own notes, or set variables in the shell that launches Codex:
$env:OPENAI_API_KEY = "sk-..."
$env:OPENAI_REVIEWER_MODEL = "gpt-4.1-mini"
For Codex desktop on Windows, the most reliable real-mode setup stores OPENAI_API_KEY in .codex\.env, which is ignored by Git:
OPENAI_API_KEY=...
Do not paste secrets into tool inputs, logs, diffs, or test fixtures.
Codex MCP Registration
After installing the package in the same environment Codex will use, write the repo-local Codex MCP configuration with:
python scripts\setup_codex_mcp.py --mode fake
The script creates .codex\config.toml and points Codex at this checkout's .venv\Scripts\python.exe.
For fake mode, the generated config looks like:
[mcp_servers.gpt_codex_handoff]
command = "C:\\Users\\jiash\\OneDrive\\Documents\\GPT CodeX integration\\.venv\\Scripts\\python.exe"
args = ["-m", "gpt_codex_handoff.mcp_server"]
env = { GPT_HANDOFF_REVIEWER_MODE = "fake" }
Restart Codex after changing MCP configuration so it can discover ask_gpt_next_step.
Then type /mcp in Codex chat. You should see gpt_codex_handoff as enabled. Some Codex UI versions show only the server row rather than an expandable tool list.
After /mcp shows the server, you can safely ask Codex to call the tool:
Call ask_gpt_next_step with summary="Fake-mode wiring test", changed_files=[], test_results="not run", open_questions=[], recent_log="", diff="", constraints=["Do not call OpenAI."]
The response should include a handoff_note saying fake reviewer mode is enabled and no OpenAI API call was made.
Fake mode is only for wiring tests. It does not evaluate the session with a real model.
Real Mode Registration
When you are ready for real reviewer calls on Windows, run:
python scripts\setup_codex_mcp.py --mode real --write-local-env
The script reads OPENAI_API_KEY from the current shell if it is set. If it is missing, it prompts securely without echoing. It writes the key to ignored .codex\.env, never prints the key, and never writes the key value into config. In real mode, Codex still forwards the existing Windows environment variable by name when available.
The real-mode config sets:
env = { GPT_HANDOFF_REVIEWER_MODE = "real", GPT_HANDOFF_DOTENV_PATH = "C:\\absolute\\path\\to\\.codex\\.env" }
env_vars = ["OPENAI_API_KEY"]
Restart Codex after running the command so the MCP process starts with the updated configuration.
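The real-mode config gives the server two possible sources for the key: the forwarded OPENAI_API_KEY variable and the file named by GPT_HANDOFF_DOTENV_PATH. The sketch below shows one way such a lookup could work. It is not the package's actual code: the function names, the simple NAME=value parsing, and the precedence (environment variable first, dotenv file second) are all assumptions.

```python
import os
import tempfile
from typing import Mapping, Optional


def read_dotenv_value(path: str, name: str) -> Optional[str]:
    """Return the value of a NAME=value line in a simple dotenv file, or None."""
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue  # skip blanks, comments, and malformed lines
                key, _, value = line.partition("=")
                if key.strip() == name:
                    return value.strip().strip('"').strip("'")
    except OSError:
        return None  # a missing or unreadable file counts as "no key"
    return None


def resolve_api_key(env: Mapping[str, str]) -> Optional[str]:
    """Assumed precedence: forwarded environment variable, then dotenv file."""
    if env.get("OPENAI_API_KEY"):
        return env["OPENAI_API_KEY"]
    dotenv_path = env.get("GPT_HANDOFF_DOTENV_PATH")
    if dotenv_path:
        return read_dotenv_value(dotenv_path, "OPENAI_API_KEY")
    return None


# Demonstration with a throwaway file and a dummy value; nothing secret is printed.
with tempfile.TemporaryDirectory() as tmp:
    dotenv_file = os.path.join(tmp, ".env")
    with open(dotenv_file, "w", encoding="utf-8") as handle:
        handle.write("# local notes\nOPENAI_API_KEY=dummy-value\n")
    found_from_file = resolve_api_key({"GPT_HANDOFF_DOTENV_PATH": dotenv_file})
    found_from_env = resolve_api_key(
        {"OPENAI_API_KEY": "from-env", "GPT_HANDOFF_DOTENV_PATH": dotenv_file}
    )
    found_nothing = resolve_api_key({})

print("key found via dotenv:", found_from_file is not None)
```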
Diagnose Reviewer Setup
To check local reviewer wiring without printing secrets, run:
python scripts\diagnose_reviewer_setup.py
The diagnostic reports only safe status values, such as whether the package imports, whether real or fake mode is configured, whether OPENAI_API_KEY is present, whether GPT_HANDOFF_DOTENV_PATH is configured, whether the dotenv file exists, and whether the MCP server module is importable. It does not print the API key, dotenv path, or dotenv file contents.
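The presence-only reporting described above can be illustrated with a short sketch. This is not the contents of diagnose_reviewer_setup.py: the function name safe_status and the exact keys in the returned dictionary are assumptions. The point is that every reported value is a boolean or a mode name, never a key, path, or file body.

```python
import importlib.util
import os
from typing import Mapping, Optional


def safe_status(env: Optional[Mapping[str, str]] = None) -> dict:
    """Report presence-only facts; never return secret values or paths."""
    env = os.environ if env is None else env
    dotenv_path = env.get("GPT_HANDOFF_DOTENV_PATH", "")
    return {
        # find_spec returns None for a top-level package that is not installed.
        "package_importable": importlib.util.find_spec("gpt_codex_handoff") is not None,
        # The mode name is not secret, so it is reported as-is.
        "reviewer_mode": env.get("GPT_HANDOFF_REVIEWER_MODE", "unset"),
        "api_key_present": bool(env.get("OPENAI_API_KEY")),
        "dotenv_path_configured": bool(dotenv_path),
        "dotenv_file_exists": bool(dotenv_path) and os.path.isfile(dotenv_path),
    }


# Example with a dummy environment so the output is reproducible.
status = safe_status({"OPENAI_API_KEY": "sk-test-value", "GPT_HANDOFF_REVIEWER_MODE": "fake"})
for name, value in status.items():
    print(f"{name}: {value}")
```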
Run Tests
python -m pytest --basetemp=".venv\pytest-tmp"
Or, without installing the optional test runner:
$env:PYTHONPATH = "src"
python -m unittest discover -s tests -v
The tests validate schema handling and safety behavior without making real API calls.
Windows Test Troubleshooting
Use python -m pytest instead of bare pytest on Windows. It avoids PATH issues where the pytest launcher is installed in .venv\Scripts but the shell cannot find it.
If pytest reports PermissionError: Access is denied under a temp path such as AppData\Local\Temp\pytest-of-..., point pytest at a repo-local temp directory:
python -m pytest --basetemp=".venv\pytest-tmp"
If .venv\pytest-tmp itself becomes locked, close stale Python or Codex processes and rerun the command, or use a fresh repo-local temp directory such as .venv\pytest-tmp-trial.
Tool
ask_gpt_next_step accepts:
summary
changed_files
test_results
open_questions
recent_log
diff
constraints
It returns strict JSON:
{
  "next_step": "inspect failing tests",
  "priority": "high",
  "reason": "The current failure blocks verification.",
  "should_continue": true,
  "max_minutes": 15,
  "commands_to_run": ["pytest -q"],
  "files_to_inspect": ["tests/test_example.py"],
  "risk_level": "medium",
  "handoff_note": "Focus on the failing test before editing more code."
}
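A caller that consumes this response may want to confirm its shape before acting on it. The sketch below checks field names and basic types taken from the example response above. It is not the package's own schema validator: the name check_recommendation is hypothetical, and because this README does not list the allowed values for priority or risk_level, the sketch only checks that they are strings.

```python
import json

# Field names and types inferred from the example response in this README.
EXPECTED_FIELDS = {
    "next_step": str,
    "priority": str,
    "reason": str,
    "should_continue": bool,
    "max_minutes": int,
    "commands_to_run": list,
    "files_to_inspect": list,
    "risk_level": str,
    "handoff_note": str,
}


def check_recommendation(raw: str) -> list:
    """Return a list of problems; an empty list means the shape matches."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    for name, expected in EXPECTED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in sorted(set(data) - set(EXPECTED_FIELDS)):
        problems.append(f"unexpected field: {name}")
    return problems


sample = json.dumps({
    "next_step": "inspect failing tests",
    "priority": "high",
    "reason": "The current failure blocks verification.",
    "should_continue": True,
    "max_minutes": 15,
    "commands_to_run": ["pytest -q"],
    "files_to_inspect": ["tests/test_example.py"],
    "risk_level": "medium",
    "handoff_note": "Focus on the failing test before editing more code.",
})
print(check_recommendation(sample))
```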
Example Usage
Safe Fake Reviewer
This example uses GPT_HANDOFF_REVIEWER_MODE=fake and does not need OPENAI_API_KEY:
python examples\fake_reviewer.py
The same pattern is useful in tests:
import os
from gpt_codex_handoff.context import ReviewContext
from gpt_codex_handoff.reviewer import OpenAIReviewer
os.environ["GPT_HANDOFF_REVIEWER_MODE"] = "fake"
reviewer = OpenAIReviewer()
print(reviewer.review(ReviewContext(summary="Local dry run.")))
Live Reviewer
Unset GPT_HANDOFF_REVIEWER_MODE and set OPENAI_API_KEY first. Live calls send the provided context to the OpenAI API after the local safety preflight passes.
from gpt_codex_handoff.reviewer import OpenAIReviewer
from gpt_codex_handoff.context import ReviewContext
reviewer = OpenAIReviewer()
recommendation = reviewer.review(
    ReviewContext(
        summary="Implemented first MCP server skeleton.",
        changed_files=["src/gpt_codex_handoff/mcp_server.py"],
        test_results="pytest passes",
        open_questions=[],
        recent_log="No errors.",
        diff="",
        constraints=["Do not commit."],
    )
)
print(recommendation)
Safety
The local preflight check stops before sending context to the API when it sees likely credentials, ambiguous product decisions, high-risk changes, or repeated failures. In those cases the tool returns a conservative JSON recommendation with should_continue: false.
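As a rough illustration of the credential part of that check, the sketch below scans context fields for credential-like strings and returns a conservative recommendation instead of proceeding. It is not the package's preflight implementation: the regular expressions, the function name preflight, and the wording of the returned fields are assumptions, and the other triggers (ambiguous product decisions, high-risk changes, repeated failures) are not modeled here.

```python
import re
from typing import Optional

# Hypothetical patterns for "likely credentials"; a real check may differ.
CREDENTIAL_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),                # API-key-like token
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),   # PEM private key header
    re.compile(r"(?i)(api[_-]?key|secret|password|token)\s*[=:]\s*\S{8,}"),
]


def preflight(*fields: str) -> Optional[dict]:
    """Return a conservative recommendation if any field looks unsafe, else None."""
    for text in fields:
        if any(pattern.search(text) for pattern in CREDENTIAL_PATTERNS):
            return {
                "next_step": "remove the suspected credential from the context and retry",
                "priority": "high",
                "reason": "A likely credential was detected, so the context was not sent.",
                "should_continue": False,
                "max_minutes": 0,
                "commands_to_run": [],
                "files_to_inspect": [],
                "risk_level": "high",
                "handoff_note": "Preflight stopped before any API call.",
            }
    return None  # nothing suspicious found; the caller may proceed


clean = preflight("pytest passes", "No errors.")
blocked = preflight("export OPENAI_API_KEY=sk-" + "a" * 24)
print(clean is None, blocked["should_continue"])
```

Pattern matching of this kind produces both false positives and false negatives, so it complements rather than replaces the rule of keeping secrets out of tool inputs.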