macos-sys-assist

A secure, constraint-based macOS OS-level automation MCP server for AI assistants.

A focused macOS automation MCP server for reliable input simulation, window management, and window-specific screenshots — the three things AppleScript and bash do poorly.

What Is This?

macos-sys-assist is a Python-based MCP server that fills the gaps where bash and Chrome DevTools fall short. It uses PyObjC to call native macOS APIs, including Core Graphics for low-level input simulation, which is more reliable than AppleScript's `keystroke`.

What It Does That bash/CDP Can't

Capability	Why Not bash/CDP
Core Graphics click/type/key	AppleScript `keystroke` misses keys or fails silently. This posts events with `CGEventPost`, the Quartz Event Services call that injects input into the system event stream.
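Window-specific screenshots	bash `screencapture` captures the full screen by default; targeting one window means looking up its window ID for `-l`, and cropping afterwards is tedious. This captures just the window you want, given its PID.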
Precise window geometry	`osascript` returns position inconsistently. This uses the Accessibility API for accurate pixel-level data.
Multi-app window layouts	Arrange 3+ apps at specific positions in one command. bash needs multiple chained `osascript` calls.

What It Does NOT Do (Use bash Instead)

Tool	Why Use bash
Finding files	`find` / `mdfind` are simpler
Reading files	`cat` / `python3 -c`
Opening files	`open` command
App queries	`osascript -e 'tell app "System Events"...'`
Clipboard	`pbpaste` / `pbcopy`
Screen resolution	`system_profiler SPDisplaysDataType`

Quick Start

Installation

git clone https://github.com/YOUR_USERNAME/macos-sys-assist.git
cd macos-sys-assist
./setup.sh

Grant Permissions

Accessibility — System Settings → Privacy & Security → Accessibility → Add Terminal/Python
Screen Recording (for screenshots) — System Settings → Privacy & Security → Screen Recording → Add Terminal/Python

Configure Apps

Edit allowed_apps.json to control which apps can be automated.

Usage

Standalone Mode

./run.sh

OpenCode Integration

Add to opencode.jsonc:

"mcp": {
  "macos-sys-assist": {
    "type": "local",
    "command": ["/path/to/macos-sys-assist/run.sh"],
    "enabled": true
  }
}

Direct Python Usage (via bash)

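.venv/bin/python3 -c "
import sys
sys.path.insert(0, '.')  # run from the repository root so the macos package resolves
from macos.input import InputSimulator
# Left-click at screen coordinates (100, 200); requires Accessibility permission
InputSimulator().click_at(100, 200, 'left')
"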

Tool Reference

Input Simulation (Core Graphics)

Tool	Description	Security
`click_at(x, y, button, double)`	Click at screen coordinates	⚠️ Confirmation
`type_string(text)`	Type text character by character	⚠️ Confirmation, max 500 chars
`press_key(combination)`	Press key combo (e.g., `cmd+tab`)	⚠️ Blocked combos enforced

More reliable than AppleScript — uses CGEventPost instead of keystroke.
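
The Core Graphics path described above can be sketched roughly as follows: a click is a mouse-down event followed by a mouse-up event, each posted with `CGEventPost` through PyObjC's `Quartz` module. The helper names `mouse_event_names` and `post_click` are hypothetical and are not taken from this project's `macos/input.py`; the sketch also assumes the calling process already has Accessibility permission.

```python
def mouse_event_names(button: str) -> tuple:
    """Map a button name to Quartz constant names: (down type, up type, button)."""
    table = {
        "left": ("kCGEventLeftMouseDown", "kCGEventLeftMouseUp", "kCGMouseButtonLeft"),
        "right": ("kCGEventRightMouseDown", "kCGEventRightMouseUp", "kCGMouseButtonRight"),
    }
    if button not in table:
        raise ValueError(f"unsupported button: {button!r}")
    return table[button]


def post_click(x: float, y: float, button: str = "left") -> None:
    """Post one click at global display coordinates (macOS only)."""
    # Imported lazily so the module can still be loaded on other platforms.
    import Quartz  # provided by the pyobjc-framework-Quartz package

    down_name, up_name, button_name = mouse_event_names(button)
    mouse_button = getattr(Quartz, button_name)
    for type_name in (down_name, up_name):
        # Create the event at (x, y), then inject it at the HID event tap.
        event = Quartz.CGEventCreateMouseEvent(
            None, getattr(Quartz, type_name), (x, y), mouse_button
        )
        Quartz.CGEventPost(Quartz.kCGHIDEventTap, event)
```

On a Mac, `post_click(100, 200)` would post a left click at that point; the coordinates are measured from the top-left corner of the main display.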

Window Management

Tool	Description	Security
`move_window(x, y)`	Move the active window to screen coordinates	⚠️ Confirmation
`resize_window(width, height)`	Resize the active window	⚠️ Confirmation
`get_window_geometry(pid)`	Get a window's position and size	Read-only

Uses Accessibility API for pixel-level accuracy. More reliable than osascript.
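
For multi-app layouts, the target frames can be computed up front and then applied one window at a time. The sketch below splits a display into equal-width columns. `Frame` and `column_layout` are hypothetical helpers, and feeding each frame to `move_window` and `resize_window` is an assumption about how the tools would be combined, not a documented workflow.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def column_layout(screen_w: int, screen_h: int, count: int, top: int = 0) -> List[Frame]:
    """Split the area below `top` (e.g. the menu bar) into `count` columns."""
    if count < 1:
        raise ValueError("count must be at least 1")
    frames = []
    for i in range(count):
        # Integer boundaries keep the columns contiguous, with no gaps or overlap.
        left = screen_w * i // count
        right = screen_w * (i + 1) // count
        frames.append(Frame(left, top, right - left, screen_h - top))
    return frames


frames = column_layout(1920, 1080, 3, top=25)
for frame in frames:
    print(frame)
```

Each frame's `x, y` would go to a move call and its `width, height` to a resize call for the corresponding app window.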

Screenshots (Requires Screen Recording Permission)

Tool	Description
`screenshot(filepath, display_id)`	Capture full screen
`screenshot_window(pid, filepath)`	Capture specific window only — no cropping needed
`screenshot_region(x, y, w, h, filepath)`	Capture a screen region
`get_displays()`	Get all connected displays and resolutions

When to Use This vs bash

✅ Use macos-sys-assist when:

AppleScript keystroke or click fails silently
You need a screenshot of just one window, without cropping a full-screen capture
You're arranging 3+ app windows at specific positions for a workspace
The task requires pixel-level coordinate accuracy

❌ Use bash when:

Finding files (find, mdfind, ls)
Reading files (cat, python3 -c)
Opening files (open)
Basic clipboard (pbpaste, pbcopy)
Checking what app is frontmost (osascript)
Launching apps (open -a)

🔄 Use Chrome DevTools when:

Interacting with web pages (clicking buttons, filling forms)
Uploading files to websites (base64 injection into <input type="file">)
Reading page content
Navigating multi-page web flows

Configuration

`allowed_apps.json`

Controls which apps can be automated:

{
  "allowed_apps": [
    {
      "bundle_id": "com.brave.Browser",
      "name": "Brave Browser",
      "allow_actions": true
    }
  ],
  "global_settings": {
    "require_confirmation_for_click": true,
    "require_confirmation_for_type": true,
    "max_string_length": 500,
    "blocked_key_combinations": [
      "cmd+q",
      "cmd+delete",
      "ctrl+alt+delete"
    ]
  }
}
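
A minimal sketch of how a caller might read this file and decide whether an app may be automated. The function names are hypothetical and the project's own `config.py` may work differently; only the JSON keys are taken from the example above.

```python
import json


def load_config(text: str) -> dict:
    """Parse the contents of allowed_apps.json."""
    return json.loads(text)


def is_app_allowed(config: dict, bundle_id: str) -> bool:
    """True only if the bundle ID is listed and its allow_actions flag is set."""
    for app in config.get("allowed_apps", []):
        if app.get("bundle_id") == bundle_id:
            return bool(app.get("allow_actions", False))
    return False  # default-deny: unlisted apps are never automated


sample = """
{
  "allowed_apps": [
    {"bundle_id": "com.brave.Browser", "name": "Brave Browser", "allow_actions": true}
  ],
  "global_settings": {"max_string_length": 500}
}
"""
config = load_config(sample)
print(is_app_allowed(config, "com.brave.Browser"))   # True
print(is_app_allowed(config, "com.apple.Terminal"))  # False
```

Treating a missing entry as "not allowed" keeps the allow-list explicit: an app is automated only after someone adds it to the file.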

Project Structure

macos-sys-assist/
├── server.py                 # Main MCP server entry point
├── config.py                 # Configuration management
├── security.py               # Security validation layer
├── allowed_apps.json         # Application allow-list
├── requirements.txt          # Python dependencies
├── setup.sh                  # Installation script
├── run.sh                    # Wrapper script
├── macos/                    # Native macOS API wrappers
│   ├── accessibility.py     # App queries, PID lookup
│   ├── window.py            # Window move/resize/geometry
│   ├── input.py             # Core Graphics click/type/key
│   ├── screenshot.py        # Screen capture (full/window/region)
│   └── task_engine.py       # Multi-step task execution
└── tools/                    # MCP tool definitions
    ├── information.py       # get_window_geometry
    ├── actions.py           # click_at, type_string, press_key, move/resize
    └── screenshot.py        # screenshot, screenshot_window, screenshot_region, get_displays

Roadmap

Completed ✅

[x] Core Graphics input simulation (click, type, key)
[x] Window management (move, resize, geometry)
[x] Window-specific screenshots (no cropping)
[x] Security layer (allow-list, blocked keys, confirmations)

Planned 📋

[ ] Folder Watcher — Detect new files in Downloads, auto-organize by project
[ ] System State — Battery, WiFi, disk space checks before long automations
[ ] Window Layout Presets — Save/restore multi-app workspaces
[ ] Calendar Integration — Meeting-aware automation scheduling

Security Model

Design Principles

No Shell Access — All operations use native macOS APIs
Explicit Allow-List — Only pre-approved apps can be controlled
Human-in-the-Loop — Invasive actions require user confirmation
Input Validation — Text length limits, key combo blocking
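
The human-in-the-loop principle can be sketched as a gate that runs an action only after a confirmation callback approves it. `run_with_confirmation` and the `confirm` callback are hypothetical; how the server actually prompts the user is not described in this README.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_confirmation(
    description: str,
    action: Callable[[], T],
    confirm: Callable[[str], bool],
    required: bool = True,
) -> Optional[T]:
    """Run `action` only if confirmation is not required or the user approves."""
    if required and not confirm(description):
        return None  # declined: the action is never invoked
    return action()


calls = []
declined = run_with_confirmation(
    "click at (100, 200)",
    action=lambda: calls.append("clicked") or "done",
    confirm=lambda message: False,  # simulate the user declining
)
approved = run_with_confirmation(
    "click at (100, 200)",
    action=lambda: calls.append("clicked") or "done",
    confirm=lambda message: True,  # simulate the user approving
)
```

The `required` flag corresponds loosely to settings such as `require_confirmation_for_click`; passing `required=False` skips the prompt.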

What's Blocked

Threat	Mitigation
Unauthorized app control	Application allow-list
Destructive key combos	Blocked combinations list
Excessive text input	Maximum string length (500)
Unconfirmed actions	Confirmation prompts
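
Two of these mitigations are simple input checks. The sketch below shows one way they could be written; the function names are hypothetical, and the modifier-sorting normalization is an assumption made here so that `Q+CMD` and `cmd+q` compare equal, not a documented behavior of `security.py`.

```python
MODIFIERS = {"cmd", "ctrl", "alt", "shift"}


def normalize_combo(combo: str) -> str:
    """Lower-case a combo and put sorted modifiers first, e.g. 'Q+CMD' -> 'cmd+q'."""
    parts = [p.strip().lower() for p in combo.split("+") if p.strip()]
    modifiers = sorted(p for p in parts if p in MODIFIERS)
    keys = [p for p in parts if p not in MODIFIERS]
    return "+".join(modifiers + keys)


def is_combo_blocked(combo: str, blocked: list) -> bool:
    """True if the combo matches any blocked entry after normalization."""
    return normalize_combo(combo) in {normalize_combo(b) for b in blocked}


def validate_text(text: str, max_length: int = 500) -> str:
    """Reject text longer than the configured max_string_length."""
    if len(text) > max_length:
        raise ValueError(f"text is {len(text)} chars; limit is {max_length}")
    return text


blocked = ["cmd+q", "cmd+delete", "ctrl+alt+delete"]
print(is_combo_blocked("Cmd+Q", blocked))            # True
print(is_combo_blocked("alt+ctrl+Delete", blocked))  # True
print(is_combo_blocked("cmd+tab", blocked))          # False
```

Normalizing both sides before comparing avoids a blocked combination slipping through because of letter case or modifier order.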

Troubleshooting

"Accessibility permission not granted"

System Settings → Privacy & Security → Accessibility
Add Terminal.app or .venv/bin/python3
Ensure toggle is ON
Restart the server

"Screen Recording permission required"

System Settings → Privacy & Security → Screen Recording
Add Terminal.app or .venv/bin/python3
Ensure toggle is ON
Restart the server

"App not in allow-list"

Find the app's bundle ID: osascript -e 'id of app "AppName"'
Add it to allowed_apps.json
Restart the server

License

MIT License — see LICENSE

Acknowledgments

Built for the OpenCode AI assistant framework. Uses the Model Context Protocol for tool integration. Powered by pyobjc for native macOS API access.
