codex-cua-mcp

Enables AI agents to control Windows desktop applications by wrapping Codex's Computer Use capability.

Codex CUA MCP

Chinese version

💬 LINUX DO Discussion

MCP server that wraps Codex's Computer Use capability, enabling AI agents to control Windows desktop applications.

Features

  • List and control Windows desktop applications
  • Capture screenshots and accessibility trees
  • Click, type, press keys, scroll, drag
  • Launch apps and activate windows
  • Ready to use: the exe is bundled, so no extra setup is needed

Quick Start (Claude Code)

git clone <repo-url>
cd codex-cua-mcp
.\setup.ps1

Then restart Claude Code so it picks up the new server.

Other Agents

Works with any MCP-compatible agent (Cursor, Windsurf, Cline, etc.):

{
  "mcpServers": {
    "codex-cua": {
      "command": "node",
      "args": ["PATH/codex-cua-mcp/bin/codex-cua-mcp.js"]
    }
  }
}

Replace PATH with the directory that contains your clone, and check your agent's documentation for the config file location.
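If you would rather generate that entry than edit the path by hand, a small Node script can do it. This is only a sketch: `buildMcpConfig` and its `repoDir` argument are names made up for illustration, and it assumes the entry point sits at `bin/codex-cua-mcp.js` inside the clone, as in the config above.

```javascript
const path = require("node:path");

// Build the "mcpServers" entry for an MCP client config.
// `repoDir` is wherever codex-cua-mcp was cloned (an assumption of this sketch).
function buildMcpConfig(repoDir) {
  // Resolve to an absolute path so the agent can start the server from any working directory.
  const entry = path.resolve(repoDir, "bin", "codex-cua-mcp.js");
  return {
    mcpServers: {
      "codex-cua": { command: "node", args: [entry] },
    },
  };
}

// Print JSON ready to paste into the agent's config file.
console.log(JSON.stringify(buildMcpConfig("."), null, 2));
```

Run it from inside the clone. `JSON.stringify` also escapes the backslashes in Windows paths, which is easy to get wrong when typing the path manually.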

How It Works

AI Agent (Claude Code, Cursor, etc.)
  ↓ MCP protocol (stdio)
MCP Server (codex-cua-mcp)
  ↓ JSON-RPC (stdin/stdout)
codex-computer-use.exe
  ↓ Windows APIs
Desktop Applications

The MCP server communicates with codex-computer-use.exe via JSON-RPC over stdin/stdout. The exe uses Windows APIs (SendInput, UI Automation, Windows.Graphics.Capture) to interact with desktop applications.
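To make the stdin/stdout exchange concrete, here is a minimal sketch of how JSON-RPC messages could be written and read over a pipe. It assumes one JSON message per line, which is a common convention but is not confirmed for codex-computer-use.exe. The helper names and the `example/ping` method in the usage note are hypothetical.

```javascript
const { StringDecoder } = require("node:string_decoder");

// Encode a JSON-RPC 2.0 request as a single line of text.
// ASSUMPTION: newline-delimited framing; the exe's real framing may differ.
function encodeRequest(id, method, params) {
  return JSON.stringify({ jsonrpc: "2.0", id, method, params }) + "\n";
}

// Returns a function that accepts raw stdout chunks and calls
// `onMessage` once for every complete line of JSON received so far.
function createLineDecoder(onMessage) {
  const utf8 = new StringDecoder("utf8"); // handles multi-byte characters split across chunks
  let buffer = "";
  return (chunk) => {
    buffer += utf8.write(chunk);
    let newline;
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line) onMessage(JSON.parse(line));
    }
  };
}
```

In a real server the decoder would be attached to the child process's stdout (for example, one started with `child_process.spawn`), and `encodeRequest(1, "example/ping", {})` would be written to its stdin. Buffering by line matters because a pipe can deliver half a message, or several messages, in one chunk.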

The first time an action targets a given app, it goes through an approval flow. By default the server approves these requests automatically, so the agent is not interrupted.
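The per-app approval behavior can be pictured as a small cache in front of each action. The sketch below is illustrative only: `createApprovalGate`, `autoApprove`, and `ask` are hypothetical names, and the real server's approval logic may be structured differently.

```javascript
// Remember which apps have been approved so that each app is asked about at most once.
// ASSUMPTION: `autoApprove` mirrors the default behavior described above;
// `ask` stands in for whatever prompt would be shown when auto-approval is off.
function createApprovalGate({ autoApprove = true, ask = () => false } = {}) {
  const approved = new Set();
  return function ensureApproved(appId) {
    if (approved.has(appId)) return true; // already approved earlier
    const ok = autoApprove || Boolean(ask(appId));
    if (ok) approved.add(appId); // only approvals are cached
    return ok;
  };
}

// With the default settings every app is approved on first use.
const gate = createApprovalGate();
console.log(gate("notepad.exe"));
```

Caching only approvals means a denied app is asked about again on the next action, which is one possible design choice among several.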

📖 Want to understand the design in depth? Read the Architecture Deep Dive. (Chinese version)

Available Tools

Tool                      Description
list_windows              List all controllable windows
list_apps                 List installed apps
get_window                Rehydrate a window object
launch_app                Launch an application
activate_window           Bring window to foreground
get_window_state          Capture screenshot + accessibility tree
click                     Click at coordinates or element
type_text                 Type text
press_key                 Press keyboard key
scroll                    Scroll
drag                      Drag
set_value                 Set editable element value
perform_secondary_action  Secondary action (right-click menu, etc.)
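An MCP client invokes any of these tools with a `tools/call` request. The sketch below builds such a request and rejects names that are not in the table. The helper `buildToolCall` is hypothetical, and so are the `x`/`y` arguments in the example: the actual input schema of each tool is not documented here, so query the server's `tools/list` response for the real parameters.

```javascript
// Tool names taken from the table above.
const TOOLS = new Set([
  "list_windows", "list_apps", "get_window", "launch_app", "activate_window",
  "get_window_state", "click", "type_text", "press_key", "scroll", "drag",
  "set_value", "perform_secondary_action",
]);

// Build a JSON-RPC 2.0 "tools/call" request as defined by the MCP specification.
function buildToolCall(id, name, args = {}) {
  if (!TOOLS.has(name)) throw new Error(`Unknown tool: ${name}`);
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// ASSUMPTION: `x` and `y` are placeholder argument names, not the tool's confirmed schema.
console.log(JSON.stringify(buildToolCall(1, "click", { x: 100, y: 200 })));
```

Most agents build these requests for you; the sketch is mainly useful when testing the server by hand.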

Disclaimer

The core functionality comes from codex-computer-use.exe. Actual results depend on the AI agent and model being used, so usability is not guaranteed.

Requirements

  • Windows 10/11
  • Node.js 18+
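A quick way to check these requirements before running setup is sketched below. `meetsRequirements` is a hypothetical helper, not part of the project. Note that `process.platform` only reports `win32` for any Windows release, so this check cannot tell Windows 10/11 apart from older versions.

```javascript
// Return true when running on Windows with Node.js 18 or newer.
// The parameters default to the current process but can be overridden for testing.
function meetsRequirements(platform = process.platform, nodeVersion = process.versions.node) {
  const major = Number.parseInt(nodeVersion.split(".")[0], 10);
  return platform === "win32" && major >= 18;
}

if (!meetsRequirements()) {
  console.warn("codex-cua-mcp expects Windows 10/11 and Node.js 18+.");
}
```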

License

MIT
