acli-helper
A multi-agent MCP server that enables AI coding agents (Claude Code, Codex CLI, Gemini CLI) to communicate with each other.
Register it once in each CLI; any agent can then delegate tasks, request reviews, or start discussions with the others.
How It Works
You (in Claude) → "ask codex to review my auth changes"
You (in Codex) → "have claude implement issue #4"
You (in Gemini) → "ask claude and codex to discuss the migration"
acli-helper runs as an HTTP MCP server. Each agent connects to it and gets tools to communicate with the others.
┌──────────┐    ┌──────────┐    ┌──────────┐
│  Claude  │    │  Codex   │    │  Gemini  │
└─────┬────┘    └─────┬────┘    └─────┬────┘
      └───────────────┼───────────────┘
                      │  HTTP (Streamable HTTP)
       ┌──────────────┴──────────────┐
       │         acli-helper         │
       │         FastMCP v3          │
       │                             │
       │  agent_ask()                │──► Codex headless CLI
       │  agent_result()             │──► Claude headless CLI
       │  agent_status()             │──► Gemini headless CLI
       │  agent_reply_input()        │
       │  list_conversations()       │
       │  resume_latest()            │
       └─────────────────────────────┘
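As a rough illustration of what travels over that HTTP connection: an MCP tool invocation is a JSON-RPC 2.0 `tools/call` request. The sketch below only builds the request body for `agent_ask`. The argument names (`agent`, `intent`, `prompt`) are taken from the examples in this README; the exact schema the server accepts is an assumption, and `build_tool_call` is a hypothetical helper.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request ids only need to be unique per session


def build_tool_call(tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request body for an MCP `tools/call`."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = build_tool_call(
    "agent_ask",
    # Argument names mirror the README examples; the real schema may differ.
    {"agent": "codex", "intent": "review", "prompt": "Review my auth changes"},
)
print(json.dumps(request, indent=2))
```

In practice each CLI's built-in MCP client performs the initialize handshake and sends requests like this for you, so nothing here needs to be written by hand.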
Quick Start
1. Install
cd acli-helper
uv sync
Or without uv: pip install -e "."
2. Start the server
# Background daemon (recommended)
uv run acli-helper start --daemon
# Foreground (for debugging)
uv run acli-helper start
3. Set up a project
# Interactive wizard — detects installed agents, writes config + skills
uv run acli-helper setup
# Non-interactive (CI, scripting)
uv run acli-helper setup --project /path/to/project --mode skill --yes
# Refresh existing generated skills in a project
uv run acli-helper setup --project /path/to/project --mode skill --overwrite-skill --yes
The setup wizard:
- Detects which agent CLIs are installed (Claude, Codex, Gemini)
- Asks for the project folder
- Offers two install modes:
  - MCP only: registers the acli-helper server in each agent's config
  - MCP + Skill: also installs an orchestration skill (SKILL.md) that teaches agents best practices for multi-agent workflows
- Merges into existing configs without overwriting other MCP servers
- Updates outdated generated SKILL.md files during interactive setup
- Can refresh generated or custom SKILL.md files in non-interactive mode with --overwrite-skill
- Shows server status and a try-it example
<details> <summary>Manual setup (without the wizard)</summary>
Claude Code — add .mcp.json to your project root:
{
"mcpServers": {
"acli": { "type": "http", "url": "http://127.0.0.1:8787/mcp" }
}
}
Codex CLI — add .codex/config.toml to your project root:
[mcp_servers.acli]
url = "http://127.0.0.1:8787/mcp"
Gemini CLI — add .gemini/settings.json to your project root:
{
"mcpServers": {
"acli": { "httpUrl": "http://127.0.0.1:8787/mcp" }
}
}
</details>
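If you script the manual setup, the "merge without overwriting other MCP servers" behaviour can be approximated for the Claude Code file as below. This is a sketch under assumptions, not the wizard's actual code: `merge_mcp_server` is a hypothetical helper, and it only handles the JSON layout shown for `.mcp.json`.

```python
import json
from pathlib import Path

# Entry copied from the manual-setup snippet for Claude Code.
ACLI_ENTRY = {"type": "http", "url": "http://127.0.0.1:8787/mcp"}


def merge_mcp_server(config_path: Path, name: str = "acli", entry: dict = ACLI_ENTRY) -> dict:
    """Add or update one server entry in a .mcp.json file, keeping all other entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers[name] = dict(entry)  # only this key is touched
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `merge_mcp_server(Path(".mcp.json"))` from the project root creates the file if it is missing and leaves any other servers already listed in it untouched.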
4. Use it
From any agent, just ask:
- "ask codex to review the uncommitted changes" → agent_ask(agent="codex", intent="review")
- "have claude implement the login feature" → agent_ask(agent="claude", intent="implement")
- "ask gemini to research rate limiting" → agent_ask(agent="gemini", intent="discuss")
Calls run in the background by default — use agent_result to poll. For quick questions, pass wait=true to get the response inline.
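The background-then-poll pattern looks roughly like the loop below. It is a minimal sketch: `fetch` stands in for a call to `agent_result`, and the `status` field with its `"running"` and `"completed"` values is an assumption about the response shape, not something this README specifies.

```python
import time
from typing import Callable


def wait_for_result(
    fetch: Callable[[str], dict],
    conversation_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
) -> dict:
    """Poll `fetch` until the conversation leaves the (assumed) "running" state."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch(conversation_id)
        if result.get("status") != "running":  # "running" is an assumed status name
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{conversation_id} still running after {timeout_s}s")
        time.sleep(interval_s)


# Stand-in for agent_result: reports "running" twice, then a final answer.
_replies = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "response": "LGTM"},
])
final = wait_for_result(lambda cid: next(_replies), "abc", interval_s=0.01)
print(final["response"])
```

Returning on any non-running status, rather than only on success, lets the caller also see states such as input_required without a second loop.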
Recommended CLI Versions
- Claude Code >= 2.1.89
- Codex CLI >= 0.118.0
- Gemini CLI >= 0.36.0
acli-helper now logs a startup warning when a detected CLI is below these baselines.
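A baseline check of this kind has to compare versions numerically, because as strings "0.99.0" sorts after "0.118.0". The sketch below shows one way to do that; it is not the contents of version_check.py, and the sample `--version` output strings are assumptions.

```python
import re

# Baselines copied from the list above.
MIN_VERSIONS = {"claude": "2.1.89", "codex": "0.118.0", "gemini": "0.36.0"}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted numeric version from CLI output."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_below_baseline(agent: str, version_output: str) -> bool:
    """True when the detected version is older than the recommended baseline."""
    return parse_version(version_output) < parse_version(MIN_VERSIONS[agent])


# Hypothetical output formats, for illustration only.
print(is_below_baseline("codex", "codex-cli 0.99.0"))
print(is_below_baseline("claude", "2.1.89 (Claude Code)"))
```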
Cross-Agent Workflow Examples
Research → Implement: Have Gemini research a topic, then Claude implement it.
You (in Claude): "ask gemini to research Python rate limiting libraries and recommend one"
→ agent_ask(agent="gemini", intent="discuss") → conversation_id="abc"
→ agent_result("abc") → Gemini recommends "slowapi"
You (in Claude): "now implement rate limiting using slowapi in our FastAPI app"
→ Claude implements directly, using Gemini's research as context
Parallel Review: Fire off reviews to multiple agents and compare.
You (in Claude): "ask codex to review the auth module, and ask gemini to review it too"
→ agent_ask(agent="codex", intent="review") → cid_1
→ agent_ask(agent="gemini", intent="review") → cid_2
→ (continue working while both review)
→ agent_result(cid_1) → Codex findings
→ agent_result(cid_2) → Gemini findings
→ Claude synthesizes both reviews
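The fan-out in this example can be pictured with asyncio. The sketch is purely illustrative: `ask` is a stub standing in for an `agent_ask` call, and the shape of its return value is an assumption.

```python
import asyncio


async def ask(agent: str, intent: str, prompt: str) -> dict:
    """Stub for agent_ask; a real call would go through the MCP server."""
    await asyncio.sleep(0.01)  # simulate the agent working
    return {"agent": agent, "findings": f"{agent} reviewed: {prompt}"}


async def parallel_review(prompt: str, agents: list[str]) -> dict[str, str]:
    # Start every review at once and wait for all of them to finish.
    results = await asyncio.gather(*(ask(a, "review", prompt) for a in agents))
    return {r["agent"]: r["findings"] for r in results}


reviews = asyncio.run(parallel_review("auth module", ["codex", "gemini"]))
for agent, findings in reviews.items():
    print(agent, "->", findings)
```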
Session Continuity: Follow up on a conversation across turns.
You (in Claude): "ask codex what it thinks about our error handling"
→ agent_ask(agent="codex", intent="review") → conversation_id="xyz"
→ agent_result("xyz") → Codex gives feedback
You (in Claude): "ask codex to elaborate on the retry logic concern"
→ agent_ask(agent="codex", conversation_id="xyz")
→ Codex remembers the prior review — no need to rescan the codebase
Cross-Agent Handoff: One agent plans, another implements.
You (in Claude): "ask gemini to plan a migration from REST to GraphQL"
→ agent_ask(agent="gemini", intent="discuss") → conversation_id="plan-1"
→ agent_result("plan-1") → Gemini's migration plan
You (in Claude): "now ask codex to implement phase 1 of gemini's plan"
→ agent_ask(agent="codex", intent="implement")
→ You paste Gemini's plan summary in the prompt
→ Codex implements phase 1
Cross-Session Resume: Pick up where you left off, even after restarting your initiator session.
You (in Claude, new session): "resume the codex review we did earlier"
→ resume_latest(agent="codex", intent="review")
→ {conversation_id: "xyz", last_agent: "codex", turn_count: 3, ...}
→ agent_ask(agent="codex", conversation_id="xyz", prompt="Continue from your last findings")
→ Codex resumes with prior context (handoff summary auto-prepended if session is fresh)
Browsing Conversation History:
You (in Claude): "what conversations have we had with gemini?"
→ list_conversations(agent="gemini", limit=5)
→ [{conversation_id: "abc", turn_count: 2, last_intent: "discuss", ...}, ...]
Server Management
uv run acli-helper start --daemon # Start background server
uv run acli-helper status # Check if running (with PID)
uv run acli-helper ensure # Start only if not running (idempotent)
uv run acli-helper stop # Stop the background server
Daemon logs are written to ~/.config/acli-helper/server.log.
Configuration
Copy acli-helper.example.toml to acli-helper.toml in your project root (or ~/.config/acli-helper/config.toml for global config):
[server]
host = "127.0.0.1"
port = 8787
# advertise_url = "http://127.0.0.1:8787/mcp" # Optional full MCP URL written into client configs
log_level = "INFO"
# db_path = "/path/to/conversations.db" # Default: ~/.config/acli-helper/conversations.db
conversation_log = "metadata" # "full" | "metadata" | "none"
handoff_summary = true # Prepend context when fresh session starts for existing conversation
handoff_max_turns = 10 # Max prior turns in handoff summary
[defaults]
timeout_s = 600
[agents.claude]
# binary = "/path/to/claude" # Override auto-resolved binary
# timeout_s = 300 # Override default timeout
[agents.codex]
# binary = "/path/to/codex"
[agents.gemini]
# binary = "/path/to/gemini"
[experimental]
# gemini_discuss_plan_mode = false # Use plan mode for Gemini discuss (experimental)
Config search order: $ACLI_HELPER_CONFIG env var → ./acli-helper.toml → ~/.config/acli-helper/config.toml → built-in defaults.
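The search order can be expressed as a short lookup. This is a sketch, not the loader in config.py: `find_config` is a hypothetical name, and falling through to the next candidate when `$ACLI_HELPER_CONFIG` points at a missing file is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def find_config(cwd: Path, home: Path, env: Optional[dict] = None) -> Optional[Path]:
    """Return the first existing config file in the documented search order."""
    env = os.environ if env is None else env
    candidates = []
    if env.get("ACLI_HELPER_CONFIG"):
        candidates.append(Path(env["ACLI_HELPER_CONFIG"]))
    candidates.append(cwd / "acli-helper.toml")
    candidates.append(home / ".config" / "acli-helper" / "config.toml")
    for path in candidates:
        if path.is_file():
            return path
    return None  # caller falls back to built-in defaults
```

For example, `find_config(Path.cwd(), Path.home())` returns the project-local file when both it and the global one exist.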
advertise_url is used by acli-helper setup when generating client MCP configs. If omitted, setup derives the URL from host and port, except wildcard bind hosts (0.0.0.0, ::) are normalized to 127.0.0.1 for local use.
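A minimal sketch of that derivation rule follows. `client_url` is a hypothetical helper, and the `/mcp` path is taken from the default URL shown in this README.

```python
from typing import Optional

WILDCARD_HOSTS = {"0.0.0.0", "::"}


def client_url(host: str, port: int, advertise_url: Optional[str] = None) -> str:
    """URL to write into client configs, following the rule described above."""
    if advertise_url:
        return advertise_url  # an explicit setting always wins
    if host in WILDCARD_HOSTS:
        host = "127.0.0.1"  # a bind-all address is not a usable client target
    return f"http://{host}:{port}/mcp"


print(client_url("0.0.0.0", 8787))
```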
Conversation logging controls what is persisted in the SQLite database:
- full: store prompts + responses (useful for debugging, may contain sensitive data)
- metadata: store agent, intent, timestamp only (default, privacy-safe)
- none: don't persist turns (only conversation IDs and session state for resume)
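The effect of the three levels on a single turn can be sketched as a filter. The field names (`agent`, `intent`, `timestamp`, `prompt`, `response`) follow the descriptions above, but the record layout itself is an assumption and `record_for_log` is a hypothetical helper.

```python
from typing import Optional


def record_for_log(turn: dict, level: str) -> Optional[dict]:
    """Return what would be persisted for one turn at a conversation_log level."""
    if level == "none":
        return None  # no turn rows at all
    metadata = {key: turn[key] for key in ("agent", "intent", "timestamp")}
    if level == "metadata":
        return metadata
    if level == "full":
        return {**metadata, "prompt": turn["prompt"], "response": turn["response"]}
    raise ValueError(f"unknown conversation_log level: {level!r}")


turn = {
    "agent": "codex",
    "intent": "review",
    "timestamp": "2025-01-01T12:00:00Z",  # made-up sample value
    "prompt": "Review the auth module",
    "response": "Two issues found",
}
print(record_for_log(turn, "metadata"))
```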
Handoff summary: when a fresh provider session starts for an existing conversation (e.g. resume_policy=fresh or a scope mismatch), the broker (the acli-helper server) prepends a short summary of prior turns so the new session has context. This is controlled by the handoff_summary and handoff_max_turns config options.
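One way to picture the handoff is a function that prefixes the new prompt with a digest of the most recent turns. The summary wording and the turn fields (`n`, `agent`, `intent`) are assumptions made for this sketch; only the `max_turns` cap corresponds to a documented option (handoff_max_turns).

```python
def build_handoff(turns: list[dict], new_prompt: str, max_turns: int = 10) -> str:
    """Prefix `new_prompt` with a short digest of the most recent turns."""
    # Guard against max_turns == 0: turns[-0:] would return the whole list.
    recent = turns[-max_turns:] if max_turns > 0 else []
    if not recent:
        return new_prompt
    lines = [f"- turn {t['n']}: {t['agent']} ({t['intent']})" for t in recent]
    return "Context from earlier turns:\n" + "\n".join(lines) + "\n\n" + new_prompt


history = [{"n": i, "agent": "codex", "intent": "review"} for i in range(1, 4)]
print(build_handoff(history, "Continue from your last findings", max_turns=2))
```

Because this digest uses metadata only, the same approach would still work when prompts and responses are not persisted.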
Features
Bidirectional — any agent can initiate communication with any other.
Session continuity — conversations persist across turns. Follow-ups reuse conversation_id so agents don't rescan the codebase.
Intent-based permissions — each intent maps to the most restrictive permission set each CLI supports. Due to CLI limitations, granularity varies:
| Intent | Claude Code | Codex CLI | Gemini CLI |
|---|---|---|---|
| discuss | plan mode (read + explore, no edits) | read-only sandbox (file reads, no shell) | Policy: read/search allowed, writes/shell denied |
| review | dontAsk + read/git/test tools | workspace-write (read + write + shell)* | Policy: read/search allowed, writes/shell denied |
| implement | dontAsk + full tool access | workspace-write (read + write + shell) | auto_edit (auto-approve edits) |
*Codex workspace-write allows file writes because no read-only + shell mode exists, and Codex read-only blocks all shell commands. Codex network access requires a config.toml setting and cannot be enabled per request. See intents.py for details.
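The mapping in the table amounts to a two-level lookup from intent and agent to a permission profile. The sketch below transcribes the table into a dictionary; it is not the contents of intents.py, and both the layout and the shortened profile labels are illustrative.

```python
# Profiles transcribed from the table above; labels are shortened for readability.
PROFILES = {
    "discuss": {
        "claude": "plan mode",
        "codex": "read-only sandbox",
        "gemini": "policy: read/search only",
    },
    "review": {
        "claude": "dontAsk + read/git/test tools",
        "codex": "workspace-write",
        "gemini": "policy: read/search only",
    },
    "implement": {
        "claude": "dontAsk + full tool access",
        "codex": "workspace-write",
        "gemini": "auto_edit",
    },
}


def profile_for(agent: str, intent: str) -> str:
    """Look up the permission profile for an agent/intent pair."""
    try:
        return PROFILES[intent][agent]
    except KeyError:
        raise ValueError(f"unsupported agent/intent: {agent}/{intent}") from None


print(profile_for("codex", "discuss"))
```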
Async by default — agent_ask returns immediately while the agent works in the background. Poll with agent_result to get the response. Pass wait=true for quick synchronous queries. This is broker-managed async via asyncio, not protocol-native MCP background tasks.
Claude timeout hardening — Claude runs with MCP_CONNECTION_NONBLOCKING=1 so slow/unreachable MCP servers are less likely to consume the full timeout_s window before failing.
Persistent conversations — conversations, session IDs, and turn history are stored in SQLite (~/.config/acli-helper/conversations.db). Sessions survive daemon restarts. Privacy-safe by default — only metadata is stored unless conversation_log = "full" is set.
Conversation continuity tools — list_conversations and resume_latest let initiators find and resume conversations from prior sessions. Works safely with all conversation_log levels.
Handoff summaries — when the broker starts a fresh provider session for an existing conversation, it auto-prepends a concise summary of prior turns. Configurable via handoff_summary and handoff_max_turns.
Non-blocking elicitation — when an agent needs higher permissions, the broker transitions to input_required state instead of blocking. The initiator can supply input via agent_reply_input without losing the conversation context.
Stale session recovery — if a provider session ID is expired/missing (for example, "No conversation found with session ID"), the broker retries once with a fresh provider session automatically.
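The retry-once behaviour can be sketched as a small wrapper. `StaleSessionError`, `run_with_recovery`, and the convention that `None` means "start a new provider session" are all hypothetical names chosen for this example.

```python
class StaleSessionError(Exception):
    """Raised by a (hypothetical) adapter when the provider rejects a session ID."""


def run_with_recovery(run, prompt: str, session_id):
    """Call run(prompt, session_id); on a stale session, retry once with a fresh one."""
    try:
        return run(prompt, session_id)
    except StaleSessionError:
        if session_id is None:
            raise  # already a fresh session, nothing left to try
        return run(prompt, None)  # None = start a new provider session


calls = []


def fake_run(prompt, session_id):
    """Test double that rejects one specific session ID."""
    calls.append(session_id)
    if session_id == "expired":
        raise StaleSessionError("No conversation found with session ID")
    return "ok"


print(run_with_recovery(fake_run, "hi", "expired"), calls)
```

Retrying exactly once keeps a provider that rejects every session from looping forever.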
Early stderr abort — if a child agent's stderr shows a policy block, rate limit, or permission denial, the process is killed immediately instead of waiting for the full timeout.
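A simplified version of this idea reads the child's stderr line by line and kills the process on the first match. The patterns below are illustrative guesses, not the list used in adapters/process.py, and the sketch leaves out the overall timeout that a full runner would also enforce.

```python
import re
import subprocess
import sys

# Illustrative patterns only; the project's real list may differ.
ABORT_PATTERNS = re.compile(r"rate limit|permission denied|blocked by policy", re.I)


def run_with_stderr_abort(cmd: list[str]) -> tuple[bool, str]:
    """Run cmd and kill it as soon as a stderr line matches ABORT_PATTERNS.

    Returns (aborted, matching_line); matching_line is "" when nothing matched.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True
    )
    try:
        for line in proc.stderr:  # yields lines as the child writes them
            if ABORT_PATTERNS.search(line):
                proc.kill()
                return True, line.strip()
        return False, ""
    finally:
        if proc.poll() is None:
            proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        proc.stderr.close()


# Child that reports a rate limit and would otherwise sleep for 30 seconds.
child = [sys.executable, "-c",
         "import sys, time; sys.stderr.write('Error: rate limit exceeded\\n'); "
         "sys.stderr.flush(); time.sleep(30)"]
print(run_with_stderr_abort(child))
```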
MCP tools preserved — child agents keep access to their configured MCP servers (context7, etc.). Only the broker's own tools are blocked to prevent recursion.
Experimental: Gemini plan mode — optional gemini_discuss_plan_mode config flag to use plan mode for Gemini discuss intent. Disabled by default; enable in [experimental] config section.
Requirements
- Python 3.12+
- uv (recommended) or pip
- At least two of: Claude Code CLI, Codex CLI, Gemini CLI
Project Structure
src/acli_helper/
├── server.py # FastMCP server + tool definitions (7 tools)
├── config.py # TOML config loader + experimental flags
├── cli.py # CLI (start/stop/status/ensure/setup)
├── state.py # SQLite conversation store + query methods
├── intents.py # Intent → permission profile mapping
├── setup.py # Interactive setup wizard
├── version_check.py # CLI version compatibility checks
└── adapters/
├── base.py # AgentAdapter interface
├── process.py # Subprocess runner with early stderr abort
├── resolve.py # Binary resolver (avoids cmd.exe on Windows)
├── codex.py # Codex CLI headless adapter
├── claude.py # Claude CLI headless adapter
└── gemini.py # Gemini CLI headless adapter
tests/
├── conftest.py # Stub adapters + shared fixtures
└── test_smoke.py # Smoke test matrix (cross-agent, resume, elicitation)
.github/workflows/
└── ci.yml # GitHub Actions CI (lint + compile + smoke tests)
License
MIT