macos-screen-mcp

An MCP server that lets AI assistants see your macOS desktop, capture screenshots, read browser tabs, and preview files via local macOS tools.

Give AI eyes on your macOS desktop — an MCP server that lets AI assistants see your screen, read browser tabs, and capture screenshots.

Features

  • Desktop awareness — frontmost app, visible apps, window positions, screen resolution
  • Screenshot capture — full screen, specific region, or frontmost window (with configurable scale)
  • Browser tab inspection — Chrome, Safari, and Arc support (active tab or all tabs)
  • File preview — open files in default app, Chrome, or Quick Look

Requirements

  • macOS 12 (Monterey) or later
  • Node.js 18+
  • Screen Recording permission (for screenshot features only)

Installation

Quick start (recommended)

claude mcp add --transport stdio macos-screen -- npx -y macos-screen-mcp

That's it. No global install needed.

Global install

npm install -g macos-screen-mcp
claude mcp add --transport stdio macos-screen -- macos-screen-mcp

From source

git clone https://github.com/dla-kirito/macos-screen-mcp.git
cd macos-screen-mcp
npm install && npm run build
claude mcp add --transport stdio macos-screen -- node /path/to/macos-screen-mcp/dist/index.js

Cursor / Other MCP Clients

Add to your MCP config:

{
  "mcpServers": {
    "macos-screen": {
      "command": "npx",
      "args": ["-y", "macos-screen-mcp"]
    }
  }
}

Permissions Setup

Screen Recording (required for screenshots)

The first time you use capture_screen, macOS will prompt for Screen Recording permission.

  1. Open System Settings > Privacy & Security > Screen Recording (on macOS 12: System Preferences > Security & Privacy > Privacy > Screen Recording)
  2. Enable the toggle for your terminal app (e.g., Ghostty, iTerm2, Terminal)
  3. Restart your terminal if prompted

Automation (for browser tools)

The first time get_desktop_state or get_browser_content reads a browser's tabs, macOS shows a dialog like "<Terminal> wants to control 'Google Chrome'". This is the standard macOS Automation prompt; click OK. macOS asks only once per app pair, and you can review or revoke the permission later under System Settings > Privacy & Security > Automation.

Note: preview_file doesn't require any special permissions — it uses the standard open command.

Available Tools

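| Tool                | Description                                                          | Permissions                   |
|---------------------|----------------------------------------------------------------------|-------------------------------|
| get_desktop_state   | Quick overview: frontmost app, visible apps, Chrome tabs, screen size | Automation (one-time prompt)  |
| capture_screen      | Screenshot (full / region / frontmost window), returned as an image  | Screen Recording              |
| get_browser_content | Detailed browser tabs for Chrome, Safari, or Arc                     | Automation (one-time prompt)  |
| preview_file        | Open a file in default app, Chrome, or Quick Look                    | None                          |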

Tool Details

capture_screen supports a scale parameter (0–1, default 0.5) to reduce image size before sending to the LLM, saving tokens while preserving enough detail for most tasks.

get_browser_content can return just the active tab per window (default) or all tabs with include_all_tabs: true.
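
To make the two parameters above concrete, here is a minimal sketch of the JSON-RPC requests an MCP client might send. The `tools/call` method and the `name` / `arguments` fields come from the Model Context Protocol; `buildToolCall` is a hypothetical helper written for this illustration and is not part of macos-screen-mcp. Only the documented parameters (`scale`, `include_all_tabs`) are used; any other arguments the tools accept are not shown.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper (not part of macos-screen-mcp) that builds a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Screenshot at the documented default scale of 0.5.
const screenshot = buildToolCall(1, "capture_screen", { scale: 0.5 });

// Every tab in every window instead of only the active tab per window.
const tabs = buildToolCall(2, "get_browser_content", {
  include_all_tabs: true,
});

console.log(JSON.stringify(screenshot, null, 2));
console.log(JSON.stringify(tabs, null, 2));
```

In practice the client (Claude, Cursor, and so on) builds these requests for you; the sketch only shows where the parameters end up on the wire.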

How It Works

The server communicates with AI clients over stdio using the Model Context Protocol. Under the hood it uses:

  • AppleScript (osascript) to query desktop state, window bounds, and browser tabs
  • screencapture (macOS built-in) to take screenshots
  • sips to downscale images before returning them as base64 PNG

All operations are read-only — the server never modifies your files, settings, or browser state.

┌─────────────┐   stdio/MCP   ┌──────────────────┐   AppleScript   ┌─────────┐
│  AI Client   │◄────────────►│  macos-screen-mcp │◄──────────────►│  macOS   │
│ (Claude etc) │              └──────────────────┘   screencapture  │ Desktop  │
└─────────────┘                                                     └─────────┘
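
The screenshot path described above (screencapture, then sips, then base64) could look roughly like the following. This is a sketch under assumptions, not the server's actual source: the function names, the temporary-file approach, and the choice to reject a scale of 0 are all illustrative. The command-line flags used (`screencapture -x`, `-R`, `sips -g pixelWidth`, `sips --resampleWidth`) are standard macOS options.

```typescript
import { execFile } from "node:child_process";
import { readFile } from "node:fs/promises";
import { promisify } from "node:util";

const run = promisify(execFile);

interface Region { x: number; y: number; width: number; height: number }

// argv for the built-in `screencapture` tool (PNG is its default format).
// -x silences the shutter sound; -R<x,y,w,h> captures a rectangle.
function screencaptureArgs(outPath: string, region?: Region): string[] {
  const args = ["-x"];
  if (region) {
    args.push(`-R${region.x},${region.y},${region.width},${region.height}`);
  }
  args.push(outPath);
  return args;
}

// Target width after applying `scale`, never below 1 pixel.
// Rejecting 0 is an assumption of this sketch.
function scaledWidth(originalWidth: number, scale: number): number {
  if (!(scale > 0 && scale <= 1)) {
    throw new RangeError("scale must be greater than 0 and at most 1");
  }
  return Math.max(1, Math.round(originalWidth * scale));
}

// Capture, downscale in place with `sips`, and return base64 PNG data.
// Only meaningful on macOS; defined here but not called.
async function captureBase64(
  outPath: string,
  scale: number,
  region?: Region,
): Promise<string> {
  await run("screencapture", screencaptureArgs(outPath, region));
  // `sips -g pixelWidth` prints a line such as "  pixelWidth: 2880".
  const { stdout } = await run("sips", ["-g", "pixelWidth", outPath]);
  const width = Number(/pixelWidth: (\d+)/.exec(stdout)?.[1]);
  if (!Number.isFinite(width)) throw new Error("could not read image width");
  // --resampleWidth keeps the aspect ratio while setting the new width.
  await run("sips", ["--resampleWidth", String(scaledWidth(width, scale)), outPath]);
  return (await readFile(outPath)).toString("base64");
}
```

Because both dimensions shrink, a scale of 0.5 cuts the pixel count to about a quarter of the original, which is where most of the token savings come from.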

Security

  • All tool inputs are validated via Zod schemas
  • Application names are restricted to safe characters (no shell/AppleScript injection)
  • File operations use execFile with argument arrays (no shell interpolation)
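
As an illustration of the second and third points, the sketch below validates an application name against an allowlist before embedding it in an AppleScript snippet, and produces an argument array suitable for `execFile("osascript", args)`. The regular expression and function names are assumptions made for this example; the README only states that names are restricted to safe characters, and the server uses Zod schemas rather than the hand-written check shown here.

```typescript
// Allowlist: letters, digits, spaces, dots, hyphens (assumed pattern).
const SAFE_APP_NAME = /^[A-Za-z0-9 .\-]{1,64}$/;

// Throws on anything that could break out of an AppleScript string literal,
// such as double quotes, backslashes, or ampersands.
function assertSafeAppName(name: string): string {
  if (!SAFE_APP_NAME.test(name)) {
    throw new Error(`Unsafe application name: ${JSON.stringify(name)}`);
  }
  return name;
}

// Build argv for `osascript -e <script>`. Passing this array to execFile
// means no shell ever parses the script text.
function activeTabArgs(app: string): string[] {
  const safe = assertSafeAppName(app);
  // Chrome's scripting dictionary calls it "active tab"; Safari uses "current tab".
  const tab = safe === "Safari" ? "current tab" : "active tab";
  return ["-e", `tell application "${safe}" to get URL of ${tab} of front window`];
}

console.log(activeTabArgs("Google Chrome"));
```

With this shape, an input such as `Chrome" & do shell script "id` is rejected before any script is built, instead of relying on escaping.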

Privacy

This server runs locally and does not send data to any remote service of its own. However, by design it lets your AI assistant see parts of your desktop, and whatever the AI sees is sent to the LLM provider you've configured (Anthropic, your Cursor backend, etc.) as part of normal MCP tool responses.

What each tool exposes:

  • get_desktop_state — frontmost app, list of visible apps, Chrome window URLs and titles, window positions, screen resolution
  • get_browser_content — for the chosen browser: every window's active tab URL and title (and all tabs if include_all_tabs=true)
  • capture_screen — raw pixels of your screen / a region / the frontmost window, sent as a base64 PNG
  • preview_file — only opens the file locally; no file contents are read or transmitted by this server

Treat anything visible on screen or in a browser tab as something the AI may receive. Avoid calling these tools while password managers, private chats, banking sites, or other sensitive content are visible. Most MCP clients let you disable individual tools per session if you want a temporary lockdown.

Known Limitations

  • macOS only — relies on AppleScript and macOS-specific commands
  • Browser inspection requires the target browser to be running
  • capture_screen requires explicit Screen Recording permission
  • Arc browser only supports active tab queries (no include_all_tabs)

Contributing

git clone https://github.com/dla-kirito/macos-screen-mcp.git
cd macos-screen-mcp
npm install
npm run build    # TypeScript → dist/
npm run lint     # Type-check without emitting

License

MIT
