macos-computer-use-mcp
Provides native macOS computer control tools including mouse and keyboard simulation, screenshot capture, and application management for MCP-compatible agents. It enables AI assistants to directly interact with the macOS operating system and installed apps through standard tool calls.
A standalone MCP server providing native macOS computer control — mouse, keyboard, screenshots, and app management — for any MCP-compatible agent.
Demo
https://github.com/user-attachments/assets/130bee07-1f87-4291-8483-fc47ec51e493
Screenshot & Click in action — The MCP server grants app access, captures a 2560×1664 native screenshot, clicks on-screen elements, and opens URLs — all orchestrated through standard MCP tool calls from a coding agent.
Compatibility
Works with any client that supports the Model Context Protocol, including:
- Claude Code (claude mcp add)
- OpenAI Codex (~/.codex/config.toml)
- Cursor (~/.cursor/mcp.json)
- Any other MCP-compatible agent or IDE
How It Works
The server exposes macOS system control as MCP tools. Under the hood it uses macOS native modules for low-level input simulation and system APIs:
- @ant/computer-use-input: low-level mouse and keyboard event injection
- @ant/computer-use-swift: macOS native APIs for display management, app control, and screenshots
The MCP server process communicates over stdio, so any agent can spawn it as a subprocess and call its tools via the standard JSON-RPC protocol.
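As a rough illustration of what travels over stdio, the sketch below builds two JSON-RPC 2.0 requests (tools/list and tools/call) and frames each as one newline-terminated line, which is how the MCP stdio transport delimits messages. The helper names are invented for this example, and calling screenshot with an empty argument object is an assumption; the real argument schemas come from the server's tools/list response. A real client also has to complete the initialize handshake first, which is omitted here.

```typescript
// Minimal sketch of MCP-over-stdio message framing (not this project's code).
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextId = 1;

// Build a JSON-RPC 2.0 request with a fresh numeric id.
function makeRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const req: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) req.params = params;
  return req;
}

// The stdio transport carries one JSON message per line, so the serialized
// form must not contain raw newlines. JSON.stringify without indentation
// satisfies that, because newlines inside strings are escaped.
function frame(req: JsonRpcRequest): string {
  return JSON.stringify(req) + "\n";
}

// Ask the server which tools it offers.
const listTools = makeRequest("tools/list");

// Invoke a tool by name. The empty `arguments` object is an assumption.
const takeScreenshot = makeRequest("tools/call", {
  name: "screenshot",
  arguments: {},
});

// What a client would write to the server's stdin, in order.
const wire = frame(listTools) + frame(takeScreenshot);
```

A client would write these lines to the spawned process's stdin and read responses line by line from its stdout, matching each response to its request by id.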
Available Tools (24)
| Tool | Description |
|---|---|
| request_access | Request Accessibility permission for an app |
| screenshot | Capture the full screen |
| zoom | Zoom into a screen region |
| left_click | Left-click at coordinates |
| right_click | Right-click at coordinates |
| middle_click | Middle-click at coordinates |
| double_click | Double-click at coordinates |
| triple_click | Triple-click at coordinates |
| type | Type a string of text |
| key | Press a key or key combination |
| cursor_position | Get current mouse position |
| mouse_move | Move the cursor to coordinates |
| scroll | Scroll at coordinates |
| drag | Drag from one point to another |
| left_click_drag | Left-click and drag |
| get_display_size | Get screen dimensions |
| list_displays | List all connected displays |
| get_frontmost_app | Get the currently active application |
| list_installed_apps | List all installed applications |
| open_app | Open an application by name or bundle ID |
| close_app | Close an application |
| focus_app | Bring an application to the foreground |
| get_screen_content | Get accessibility tree for screen content |
| wait | Wait for a specified duration |
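On the client side, every tool in the table answers through the same tools/call result shape. The sketch below shows one way to pull an image out of such a result. The text and image content shapes follow the MCP specification, but which content types each tool of this server actually returns is an assumption, and firstImage is a hypothetical helper.

```typescript
// Sketch of handling a tools/call result. Content shapes follow the MCP
// spec; what this server's tools return in practice is an assumption.
type TextItem = { type: "text"; text: string };
type ImageItem = { type: "image"; data: string; mimeType: string };
type ContentItem = TextItem | ImageItem;

interface CallToolResult {
  content: ContentItem[];
  isError?: boolean;
}

// Return the first image in a result. Throws with the server's text if the
// tool reported an error or returned no image at all.
function firstImage(result: CallToolResult): { data: string; mimeType: string } {
  const text = result.content
    .filter((c): c is TextItem => c.type === "text")
    .map((c) => c.text)
    .join("\n");
  if (result.isError) throw new Error(`tool failed: ${text}`);
  for (const item of result.content) {
    if (item.type === "image") {
      return { data: item.data, mimeType: item.mimeType };
    }
  }
  throw new Error(`no image in result: ${text}`);
}

// Hypothetical result, shaped as a screenshot-style tool might return it.
// `data` is base64; this sample decodes to the bytes of "hello".
const sample: CallToolResult = {
  content: [
    { type: "text", text: "captured" },
    { type: "image", data: "aGVsbG8=", mimeType: "image/png" },
  ],
};
const image = firstImage(sample);
```

Checking isError before looking for an image keeps a permission failure (for example, missing Screen Recording access) from being mistaken for an empty screenshot.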
Installation
curl -fsSL https://raw.githubusercontent.com/Zooeyii/macos-computer-use-mcp/main/install.sh | bash
Or manually:
git clone https://github.com/Zooeyii/macos-computer-use-mcp.git ~/.local/share/macos-computer-use-mcp
cd ~/.local/share/macos-computer-use-mcp
npm install
npm run build
Configuration
Claude Code
claude mcp add -s user computer-use-standalone node $HOME/.local/share/macos-computer-use-mcp/dist/cli.js
Or add to ~/.claude/mcp.json:
{
  "computer-use-standalone": {
    "type": "stdio",
    "command": "node",
    "args": ["$HOME/.local/share/macos-computer-use-mcp/dist/cli.js"]
  }
}
OpenAI Codex
Add to ~/.codex/config.toml:
[mcp_servers.computer-use-standalone]
command = "node"
args = ["$HOME/.local/share/macos-computer-use-mcp/dist/cli.js"]
Or via CLI:
codex mcp add computer-use-standalone -- node $HOME/.local/share/macos-computer-use-mcp/dist/cli.js
Cursor
Add to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "computer-use-standalone": {
      "command": "node",
      "args": ["$HOME/.local/share/macos-computer-use-mcp/dist/cli.js"]
    }
  }
}
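The shell expands $HOME in the CLI commands above, but JSON and TOML have no variable expansion of their own, so whether $HOME works inside a config file depends on the client. If a client fails to find dist/cli.js, writing the absolute path is a safe fallback. The sketch below generates the Cursor and Codex entries shown above with the path resolved; the function names and the /Users/alice home directory are placeholders for illustration.

```typescript
// Sketch: build client config entries with an absolute path, for clients
// that do not expand "$HOME" inside their config files.
const SERVER_NAME = "computer-use-standalone";

// Join a home directory with the install location used in this README.
// Trailing slashes on `home` are dropped to avoid a doubled separator.
function cliPath(home: string): string {
  const base = home.replace(/\/+$/, "");
  return `${base}/.local/share/macos-computer-use-mcp/dist/cli.js`;
}

// Entry in the shape shown for Cursor (~/.cursor/mcp.json).
function cursorConfig(home: string): string {
  const config = {
    mcpServers: {
      [SERVER_NAME]: { command: "node", args: [cliPath(home)] },
    },
  };
  return JSON.stringify(config, null, 2);
}

// Entry in the shape shown for Codex (~/.codex/config.toml).
// JSON.stringify is used to quote the path; for ordinary file paths the
// result is also a valid TOML basic string.
function codexConfig(home: string): string {
  return [
    `[mcp_servers.${SERVER_NAME}]`,
    `command = "node"`,
    `args = [${JSON.stringify(cliPath(home))}]`,
    "",
  ].join("\n");
}

// "/Users/alice" is a placeholder home directory.
const cursorJson = cursorConfig("/Users/alice");
const codexToml = codexConfig("/Users/alice");
```

In a Node script the home directory would normally come from os.homedir() rather than a literal string.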
Requirements
- macOS (Darwin): macOS-only due to native module dependencies
- Node.js 18+
- Accessibility Permission: required for mouse/keyboard control
  - System Settings → Privacy & Security → Accessibility
- Screen Recording Permission: required for screenshots
  - System Settings → Privacy & Security → Screen Recording
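The first two requirements can be checked before the server starts. The sketch below is a hypothetical preflight helper, not part of this project; it covers only the platform and Node.js version, since the two permissions are granted in System Settings and are not checked here.

```typescript
// Sketch of a preflight check for the platform and Node.js requirements.
// Returns a list of problems; an empty list means the runtime looks fine.
function checkRuntime(platform: string, nodeVersion: string): string[] {
  const problems: string[] = [];

  // Node reports macOS as "darwin" in process.platform.
  if (platform !== "darwin") {
    problems.push(`unsupported platform "${platform}": macOS (darwin) is required`);
  }

  // Accept both "18.17.0" (process.versions.node) and "v18.17.0"
  // (process.version) and compare only the major version.
  const match = /^v?(\d+)/.exec(nodeVersion);
  const major = match ? Number(match[1]) : Number.NaN;
  if (Number.isNaN(major) || major < 18) {
    problems.push(`Node.js 18+ is required, found "${nodeVersion}"`);
  }

  return problems;
}
```

A startup script could call checkRuntime(process.platform, process.versions.node) and print each returned problem before exiting.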
Architecture
MCP Client (Claude Code / Codex / Cursor / any agent)
        │
        │  stdio (JSON-RPC / MCP protocol)
        │
        ▼
macos-computer-use-mcp (this server)
        │
        ├── MCP Server
        │   └── Tool handler
        │
        ├── Tool Definitions
        │   ├── Input tools (click, drag, scroll, type, key)
        │   ├── Screen tools (screenshot, zoom, display info)
        │   └── App tools (open, close, focus, list)
        │
        └── Executor
            │
            ├── @ant/computer-use-input.node
            │   └── Mouse / keyboard event injection
            │
            └── @ant/computer-use-swift
                └── macOS native APIs
                    ├── App management
                    ├── Display control
                    └── Screenshot capture
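To make the layering in the diagram concrete, here is a hypothetical sketch of how a tool handler might validate arguments and forward to an executor. None of the interfaces, function names, or argument names (x, y, text) are taken from the project's source; they are assumptions chosen for illustration, and a recording fake stands in for the native modules.

```typescript
// Hypothetical sketch of the handler -> executor layering in the diagram.
// None of these names or signatures come from the project's source.
interface Executor {
  click(x: number, y: number, button: "left" | "right" | "middle"): Promise<void>;
  typeText(text: string): Promise<void>;
}

type ToolArgs = Record<string, unknown>;
type ToolHandler = (exec: Executor, args: ToolArgs) => Promise<string>;

// Read a numeric argument or fail with a clear message.
function num(args: ToolArgs, key: string): number {
  const v = args[key];
  if (typeof v !== "number" || !Number.isFinite(v)) {
    throw new Error(`argument "${key}" must be a finite number`);
  }
  return v;
}

// Tool definitions: each tool name maps to a function that validates its
// arguments and forwards to the executor. Argument names are assumptions.
const handlers: Record<string, ToolHandler> = {
  left_click: async (exec, args) => {
    await exec.click(num(args, "x"), num(args, "y"), "left");
    return "clicked";
  },
  type: async (exec, args) => {
    const text = args["text"];
    if (typeof text !== "string") throw new Error('argument "text" must be a string');
    await exec.typeText(text);
    return "typed";
  },
};

// Tool handler: look up the tool and run it against the given executor.
async function callTool(exec: Executor, name: string, args: ToolArgs): Promise<string> {
  const handler = handlers[name];
  if (!handler) throw new Error(`unknown tool: ${name}`);
  return handler(exec, args);
}

// A recording executor stands in for the native modules in this sketch.
const calls: string[] = [];
const fakeExecutor: Executor = {
  click: async (x, y, button) => {
    calls.push(`${button}@${x},${y}`);
  },
  typeText: async (text) => {
    calls.push(`type:${text}`);
  },
};
```

Keeping the executor behind an interface means the dispatch and validation logic can be exercised without touching the real mouse or keyboard.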
Project Structure
macos-computer-use-mcp/
├── src/
│ ├── cli.ts # MCP server entry point
│ ├── tools.ts # Tool definitions
│ └── executor.ts # Platform implementations
├── install.sh # One-line installer
├── package.json
├── tsconfig.json
├── tsup.config.ts
└── README.md
Development
# Install dependencies
npm install
# Build
npm run build
# Run directly
node dist/cli.js
# Type-check only
npm run typecheck
Disclaimer
This project is for educational and research purposes.
Native module interfaces are based on publicly observable runtime behavior.
Use at your own risk. Only run in trusted environments — computer use grants full control of your mouse, keyboard, and screen.
License
MIT