macinput

An MCP server for macOS that enables AI agents to control the desktop GUI through keyboard input, mouse actions, and screen captures. It provides stable low-level primitives for UI automation and agent-driven desktop workflows.

macinput is a macOS keyboard, mouse, and screenshot control tool for AI agents. This repository is now structured as an installable Python project and an MCP server so desktop agents can control a macOS GUI through standard MCP tool calls.

The project has two goals:

  • Provide stable low-level macOS input and screenshot primitives.
  • Provide an MCP server with a practical tool surface, runtime safety limits, and deployment guidance.

Features

  • Mouse move, left click, right click, and double click
  • Current mouse position lookup
  • Key press, key down, key up, and modifier combinations
  • Unicode text input
  • Clipboard-backed paste input
  • Full-screen screenshots with automatic cleanup
  • MCP resources and prompt templates for agent guidance

Use cases

  • Desktop AI agents controlling macOS applications
  • UI automation prototypes
  • Human-in-the-loop desktop workflows
  • Agent loops that observe through screenshots and act through keyboard and mouse input

Requirements

  • macOS
  • Python 3.10+
  • The launching host app must have:
    • Accessibility permission
    • Screen Recording permission

Important: permissions apply to the program that launches the MCP server, not only to Python. If you launch through Claude Desktop, Terminal, iTerm2, Cursor, or VS Code, that host app must be granted permission.

Installation

uv

uv sync

pip

python -m pip install -e .

Run the MCP server

stdio is the recommended default for desktop AI clients:

macinput-mcp

If your MCP host requires HTTP transport:

macinput-mcp --transport streamable-http --host 127.0.0.1 --port 8000 --path /mcp

Example MCP client config

Generic stdio configuration:

{
  "mcpServers": {
    "macinput": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/macinput",
        "run",
        "macinput-mcp"
      ]
    }
  }
}

If the package is already installed into the current environment:

{
  "mcpServers": {
    "macinput": {
      "command": "macinput-mcp",
      "args": []
    }
  }
}
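
If you generate host configuration from a script, the two stdio variants above can be produced by a small helper. This is a minimal sketch: `stdio_config` is a hypothetical name and not part of macinput; only the JSON shape comes from the examples above.

```python
import json
from typing import Optional


def stdio_config(project_dir: Optional[str] = None) -> dict:
    """Build an MCP client config entry for macinput (hypothetical helper).

    With project_dir set, the server is launched through `uv run` from a
    checkout; without it, the installed `macinput-mcp` command is used.
    """
    if project_dir is None:
        entry = {"command": "macinput-mcp", "args": []}
    else:
        entry = {
            "command": "uv",
            "args": ["--directory", project_dir, "run", "macinput-mcp"],
        }
    return {"mcpServers": {"macinput": entry}}


# Print the checkout-based variant as JSON.
print(json.dumps(stdio_config("/path/to/macinput"), indent=2))
```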

Available tools

  • get_server_settings
  • get_mouse_position
  • move_mouse
  • click_mouse
  • press_keyboard_key
  • keyboard_key_down
  • keyboard_key_up
  • type_text_input
  • paste_text_input
  • capture_screenshot
  • cleanup_screenshot_file

Available resources and prompt

Resources:

  • macinput://overview
  • macinput://best-practices
  • macinput://permissions

Prompt:

  • ui_action_protocol(goal, current_context="")

These are part of the product surface, not decoration. They let hosts ship usage guidance together with the server instead of rewriting it in every system prompt.

Recommended usage

For users:

  1. Prefer stdio for desktop agent integrations.
  2. Verify macOS permissions before the first real run.
  3. Prefer a dedicated macOS account, test machine, or VM for automation.
  4. Keep screenshot TTLs short to reduce data residue.

For agents:

  1. Capture a screenshot before acting.
  2. Make one state-changing action at a time.
  3. Capture a fresh screenshot after clicks, shortcuts, or text submission.
  4. Do not reuse old coordinates after the UI changes.
  5. Keep typed text short and task-specific.
  6. Clean up screenshots when they are no longer needed.
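
The agent guidance above can be sketched as a loop. This is an illustration only: `run_agent_loop` and its four callables are hypothetical stand-ins for whatever your host uses to call capture_screenshot, pick an action, call an input tool, and call cleanup_screenshot_file.

```python
from typing import Callable, Optional


def run_agent_loop(
    capture: Callable[[], str],
    decide: Callable[[str], Optional[object]],
    act: Callable[[object], None],
    cleanup: Callable[[str], None],
    max_steps: int = 10,
) -> int:
    """Run an observe-act loop and return the number of actions performed."""
    steps = 0
    for _ in range(max_steps):
        shot = capture()           # fresh screenshot before every action
        try:
            action = decide(shot)  # coordinates come only from this screenshot
        finally:
            cleanup(shot)          # drop the screenshot once it has been used
        if action is None:         # the decision step signals completion
            break
        act(action)                # exactly one state-changing action per pass
        steps += 1
    return steps
```

Because each pass captures again before deciding, coordinates from an earlier screenshot are never reused after the UI has changed.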

Environment variables

  • MACINPUT_DEFAULT_SCREENSHOT_TTL
    • Default screenshot cleanup timeout in seconds. Default: 30
  • MACINPUT_MAX_SCREENSHOT_TTL
    • Maximum allowed screenshot retention in seconds. Default: 300
  • MACINPUT_MAX_TYPING_LENGTH
    • Maximum characters per typing action. Default: 2000
  • MACINPUT_MIN_ACTION_DELAY
    • Minimum delay after each tool action. Default: 0.05
  • MACINPUT_DEFAULT_TYPING_INTERVAL
    • Default per-character typing interval. Default: 0.02
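
To make the relationship between the default and maximum values concrete, here is a minimal sketch of how such limits could be loaded and applied. It is not the project's settings.py: the `Settings` class, `load_settings`, and `effective_ttl` are hypothetical, and clamping an over-long TTL (rather than rejecting it) is an assumption. Only the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    default_screenshot_ttl: float
    max_screenshot_ttl: float
    max_typing_length: int
    min_action_delay: float
    default_typing_interval: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented MACINPUT_* variables, using the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        default_screenshot_ttl=float(env.get("MACINPUT_DEFAULT_SCREENSHOT_TTL", 30)),
        max_screenshot_ttl=float(env.get("MACINPUT_MAX_SCREENSHOT_TTL", 300)),
        max_typing_length=int(env.get("MACINPUT_MAX_TYPING_LENGTH", 2000)),
        min_action_delay=float(env.get("MACINPUT_MIN_ACTION_DELAY", 0.05)),
        default_typing_interval=float(env.get("MACINPUT_DEFAULT_TYPING_INTERVAL", 0.02)),
    )


def effective_ttl(requested: Optional[float], settings: Settings) -> float:
    """Assumed policy: fall back to the default, then clamp to [0, max]."""
    ttl = settings.default_screenshot_ttl if requested is None else requested
    return max(0.0, min(ttl, settings.max_screenshot_ttl))
```

Under this assumed policy, a request without a TTL gets the default, and a request above MACINPUT_MAX_SCREENSHOT_TTL is reduced to the maximum.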

Use as a Python library

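from macinput import click, move_to, press_key, type_text, capture_screen

# Move the pointer, then click.
move_to(400, 300)
click()
# Type text, then press Command+A.
type_text("hello macOS")
press_key("a", modifiers=["command"])
# Take a screenshot; cleanup_after schedules its automatic cleanup.
path = capture_screen(cleanup_after=10)
print(path)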

Development

Project layout

src/macinput/
  __init__.py
  __main__.py
  cli.py
  keyboard.py
  mouse.py
  screenshot.py
  server.py
  settings.py
docs/
  mcp-engineering.md
tests/

Local workflow

uv sync --extra dev
uv run pytest
uv run ruff check .

GitHub Actions

The CI workflow runs lint and tests on push and pull_request. The release workflow builds distributions on workflow_dispatch and on version tags such as v0.1.0, then uploads artifacts and publishes to PyPI if your repository is configured for trusted publishing.

Project rules

  • Keep low-level automation separate from the MCP server layer.
  • Keep MCP tools small and stable.
  • Prefer stdio by default.
  • Keep state-changing actions explainable and composable.
  • Document both user integration and developer maintenance workflows.

Limitations

  • macOS only
  • Requires a real GUI session
  • CI can validate imports and configuration, but real machine validation is still necessary for UI injection
  • paste_text_input uses the system clipboard and currently preserves/restores plain-text clipboard content only
  • Does not include OCR, UI element detection, or semantic window understanding
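
The clipboard limitation above follows from the preserve/restore pattern that clipboard-backed paste relies on. The sketch below illustrates that pattern and is not macinput's implementation: `preserved_clipboard` and `paste_text` are hypothetical, and the `read`, `write`, and `send_paste` callables are placeholders (on macOS, `read` and `write` could be backed by the `pbpaste` and `pbcopy` commands, which handle text only).

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def preserved_clipboard(
    read: Callable[[], str], write: Callable[[str], None]
) -> Iterator[None]:
    """Save the plain-text clipboard on entry and restore it on exit."""
    saved = read()
    try:
        yield
    finally:
        write(saved)  # restore even if the paste step raises


def paste_text(
    text: str,
    read: Callable[[], str],
    write: Callable[[str], None],
    send_paste: Callable[[], None],
) -> None:
    """Place text on the clipboard, trigger a paste, then restore the old text."""
    with preserved_clipboard(read, write):
        write(text)
        send_paste()  # e.g. a Command+V key press in a real integration
```

Anything the `read` callable cannot represent as text, such as images or rich text, is lost across the save and restore, which is why only plain-text content survives.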

Recommended follow-up work

  1. Add LICENSE.
  2. Add release automation.
  3. Add smoke tests on a real macOS runner.
  4. Add an examples/ directory for common MCP hosts.
  5. Add optional region screenshots, output directories, and audit logging.
