computer-use-mcp

An MCP server that gives AI assistants full macOS desktop control: screenshots, mouse, keyboard, scrolling, and app management. It should work with any MCP client, such as Claude Code, OpenCode, or your own agent.

Requires Claude.app — the server loads two native binaries bundled inside /Applications/Claude.app. You don't need to be running Claude Desktop to use this server, but the app must be installed.

What it does

The server exposes 24 tools covering everything needed to operate a macOS desktop:

| Category | Tools |
| --- | --- |
| Vision | screenshot, zoom |
| Mouse | left_click, right_click, middle_click, double_click, triple_click, mouse_move, left_click_drag, left_mouse_down, left_mouse_up, scroll |
| Keyboard | key, hold_key, type |
| Clipboard | read_clipboard, write_clipboard |
| Apps | request_access, open_application, list_granted_applications, switch_display |
| Utility | cursor_position, wait, computer_batch |

Screenshots are captured at full Retina resolution and scaled to fit model constraints (≤1568px, ≤1.15MP). Click coordinates from the model are automatically mapped back to logical screen coordinates for CGEvent dispatch.
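
The scaling and coordinate mapping described above can be sketched roughly as follows. This is an illustrative sketch, not the server's actual code: the function names, the reading of "≤1568px" as a longest-edge limit, and the rounding choices are all assumptions.

```typescript
// Hypothetical sketch: names, rounding, and the "longest edge" reading are assumptions.
interface Size { width: number; height: number; }
interface Point { x: number; y: number; }

const MAX_EDGE_PX = 1568;   // edge limit quoted in the README
const MAX_PIXELS = 1.15e6;  // 1.15 MP limit quoted in the README

// Largest scale factor (never above 1) that satisfies both limits.
function fitScale(physical: Size): number {
  const edgeScale = MAX_EDGE_PX / Math.max(physical.width, physical.height);
  const areaScale = Math.sqrt(MAX_PIXELS / (physical.width * physical.height));
  return Math.min(1, edgeScale, areaScale);
}

// Size of the image that would be sent to the model.
function scaledSize(physical: Size): Size {
  const s = fitScale(physical);
  return {
    width: Math.floor(physical.width * s),
    height: Math.floor(physical.height * s),
  };
}

// Map a click in scaled-image pixels back to logical screen coordinates.
function toLogical(pt: Point, image: Size, logical: Size): Point {
  return {
    x: (pt.x / image.width) * logical.width,
    y: (pt.y / image.height) * logical.height,
  };
}
```

For example, a 3024×1964 Retina capture is limited by the megapixel cap rather than the edge cap, and a click at the centre of the scaled image maps to the centre of the 1512×982 logical display.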

Permission tiers

App access is tiered by category, matching the behaviour of Claude Code's built-in computer use:

| Tier | Applies to | What the model can do |
| --- | --- | --- |
| View-only | Browsers (Safari, Chrome, Firefox, Edge, Arc, Brave…), trading platforms | Screenshot only |
| Click-only | Terminals (Terminal, iTerm, Ghostty, Warp), IDEs (VS Code, Cursor, JetBrains…) | Click and scroll, no typing |
| Full control | Everything else | All actions |
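
To make the tiers concrete, here is a minimal sketch of how a tier lookup and per-tool check could work. It is not taken from the server: matching apps by display name, the abbreviated app lists, and the assumption that click-only apps may also be screenshotted are all illustrative guesses.

```typescript
// Hypothetical sketch: app matching and tool sets are assumptions, not the server's rules.
type Tier = "view-only" | "click-only" | "full";

// Abbreviated lists using the app names mentioned in the table above.
const VIEW_ONLY_APPS = new Set(["Safari", "Chrome", "Firefox", "Edge", "Arc", "Brave"]);
const CLICK_ONLY_APPS = new Set(["Terminal", "iTerm", "Ghostty", "Warp", "VS Code", "Cursor"]);

const VIEW_TOOLS = new Set(["screenshot"]);
// Assumption: click-only apps can still be screenshotted.
const CLICK_TOOLS = new Set([
  "screenshot", "left_click", "right_click", "middle_click",
  "double_click", "triple_click", "scroll",
]);

// Anything not listed falls through to full control.
function tierFor(app: string): Tier {
  if (VIEW_ONLY_APPS.has(app)) return "view-only";
  if (CLICK_ONLY_APPS.has(app)) return "click-only";
  return "full";
}

// True when the app's tier permits the named tool.
function isAllowed(app: string, tool: string): boolean {
  const tier = tierFor(app);
  if (tier === "view-only") return VIEW_TOOLS.has(tool);
  if (tier === "click-only") return CLICK_TOOLS.has(tool);
  return true;
}
```

Under these assumptions, `isAllowed("Terminal", "type")` is false while `isAllowed("Terminal", "left_click")` is true, which mirrors the "click and scroll, no typing" row.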

Use cases

  • Test native apps — compile a macOS or iOS target, launch it, click through every screen, and screenshot error states, all in one conversation.
  • Reproduce visual bugs — resize windows to trigger layout regressions, capture the broken state, patch the CSS, verify the fix.
  • Drive GUI-only tools — interact with design tools, hardware panels, simulators, or any app without a CLI or API.
  • End-to-end UI flows — walk through onboarding, checkout, or admin flows and report what you find.

Requirements

  • macOS (Apple Silicon or Intel)
  • Node.js 18+
  • Claude.app installed at /Applications/Claude.app
  • Accessibility and Screen Recording permissions granted to the process that spawns the server (your terminal app or IDE)
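
If you want to verify the first three requirements before wiring the server into a client, a preflight check could look like the sketch below. The `HostInfo` shape and `checkRequirements` name are hypothetical, and the check does not cover the Accessibility and Screen Recording permissions.

```typescript
// Hypothetical preflight sketch; not part of the server.
interface HostInfo {
  platform: string;                       // e.g. process.platform
  nodeVersion: string;                    // e.g. process.versions.node, such as "20.11.1"
  pathExists: (path: string) => boolean;  // e.g. fs.existsSync
}

const CLAUDE_APP_PATH = "/Applications/Claude.app";

// Returns a list of unmet requirements; an empty list means all three checks passed.
function checkRequirements(host: HostInfo): string[] {
  const problems: string[] = [];
  if (host.platform !== "darwin") {
    problems.push("macOS is required");
  }
  // parseInt stops at the first ".", so "20.11.1" yields 20.
  const major = parseInt(host.nodeVersion, 10);
  if (!(major >= 18)) {
    problems.push("Node.js 18+ is required");
  }
  if (!host.pathExists(CLAUDE_APP_PATH)) {
    problems.push("Claude.app not found at " + CLAUDE_APP_PATH);
  }
  return problems;
}
```

In a Node.js script you would pass `process.platform`, `process.versions.node`, and `fs.existsSync` as the three fields.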

Installation

git clone https://github.com/NicolaivdSmagt/computer-use-mcp.git
cd computer-use-mcp
npm install

Enable in Claude Code

Add the server to ~/.claude.json:

{
  "mcpServers": {
    "computer-use": {
      "command": "node",
      "args": ["/absolute/path/to/computer-use-mcp/index.js"]
    }
  }
}

Restart Claude Code. The tools appear as mcp__computer-use__* in your session.

Enable in OpenCode

Add to your OpenCode config (~/.config/opencode/config.json or equivalent):

{
  "mcp": {
    "servers": {
      "computer-use": {
        "command": "node",
        "args": ["/absolute/path/to/computer-use-mcp/index.js"]
      }
    }
  }
}

Granting macOS permissions

The server checks for the Accessibility and Screen Recording permissions at startup and exits with a clear error if either is missing.

  1. Open System Settings → Privacy & Security → Accessibility — add and enable your terminal app (or the app that launches the server).
  2. Open System Settings → Privacy & Security → Screen Recording — do the same.

Permissions are inherited by child processes, so you only need to grant them to the parent process once.

Session model

Every session starts with an empty allowlist. Call request_access first with the apps you need:

request_access(apps: ["Safari"], reason: "Navigate to the app and verify the onboarding flow")

The response tells you the tier granted for each app. From that point, the model can call open_application, screenshot, and the interaction tools that the tier permits. The allowlist resets when the server process exits.
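
The session model can be pictured as an in-memory map that starts empty and is filled by request_access. The sketch below is illustrative only: the `SessionAllowlist` class, its method names, and the caller-supplied `classify` function are assumptions, not the server's implementation.

```typescript
// Hypothetical sketch of a per-process allowlist; not the server's code.
type Tier = "view-only" | "click-only" | "full";

class SessionAllowlist {
  // Lives only in memory, so it is lost when the process exits.
  private granted = new Map<string, Tier>();

  // Record each requested app with the tier chosen by `classify` and report the result.
  requestAccess(apps: string[], classify: (app: string) => Tier): Record<string, Tier> {
    const result: Record<string, Tier> = {};
    for (const app of apps) {
      const tier = classify(app);
      this.granted.set(app, tier);
      result[app] = tier;
    }
    return result;
  }

  // Rough analogue of list_granted_applications.
  listGranted(): string[] {
    return Array.from(this.granted.keys());
  }

  // Undefined means the app was never requested in this session.
  tierOf(app: string): Tier | undefined {
    return this.granted.get(app);
  }
}
```

A fresh `SessionAllowlist` grants nothing, so any tool call against an app that was never passed to `requestAccess` would be refused.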

How it works

The server wraps two native NAPI binaries bundled inside Claude.app:

  • computer_use.node — screenshot capture (captureExcluding, captureRegion), display enumeration, app management, TCC permission checks. Its async methods require the macOS run loop to be drained explicitly; the server handles this internally.
  • claude-native-binding.node — mouse events (moveMouse, mouseButton, mouseScroll), keyboard (keys, typeText), cursor position, frontmost app info.

No Electron, no AppleScript, no Accessibility API polling. Both binaries dispatch real CGEvents directly into the macOS event system.
