device-controller-mcp

An MCP (Model Context Protocol) server that lets Claude Desktop and Claude Code control your computer — take screenshots, click, type, manage windows, and more.

⚠️ Security notice: This server grants an AI assistant direct control over your mouse, keyboard, clipboard, and shell (including run_command, which executes arbitrary commands). Only run it on machines you own, prefer --app scoping over --full, and review actions before trusting them. See Security.

Features

  • Two scope modes — lock every action to a single app window (--app), or grant full-desktop access (--full).
  • Cross-platform — macOS and Windows, with a clean platform abstraction layer.
  • Screenshot capture (full screen or single window) returned as base64 PNG.
  • Mouse & keyboard — clicks, typing, key combos, scrolling.
  • Window management — list, focus, resize, minimize, maximize.
  • Clipboard read / write.
  • Shell commands — launch apps, run commands, check running processes.

Requirements

  • Python 3.10+
  • macOS or Windows

Installation

With uv (recommended)

git clone https://github.com/geojakes/device-controller-mcp.git
cd device-controller-mcp
uv sync

This creates a virtual environment and installs the project together with its dependencies. On Windows the pywin32 backend is pulled in automatically. Run the server with uv run:

uv run device-controller-mcp --full

You can also run it without cloning, straight from the repo, via uvx:

uvx --from git+https://github.com/geojakes/device-controller-mcp device-controller-mcp --full

macOS (optional): the server works out of the box using AppleScript, but installing the Quartz extra gives faster, more accurate window bounds:

uv sync --extra macos

With pip

git clone https://github.com/geojakes/device-controller-mcp.git
cd device-controller-mcp
pip install -e .          # add ".[macos]" on macOS for the optional Quartz backend

pywin32 (required on Windows) is selected automatically. The macOS Quartz backend is optional — without it the server falls back to AppleScript.

Usage

Scoped to a single app

device-controller-mcp --app "Google Chrome"

All screenshots, clicks, and coordinates will be relative to Chrome's window. The server auto-focuses Chrome before every action.
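
As a rough illustration of what window-relative coordinates imply, the sketch below translates a point inside the scoped window into a screen-absolute point. The names WindowBounds and to_screen are hypothetical and are not taken from this project's source; the real logic lives in scope.py and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowBounds:
    """Hypothetical window rectangle, expressed in screen coordinates."""
    left: int
    top: int
    width: int
    height: int


def to_screen(x: int, y: int, bounds: WindowBounds) -> tuple[int, int]:
    """Translate a window-relative point into screen-absolute coordinates."""
    # Reject points outside the window so a scoped action cannot land elsewhere.
    if not (0 <= x < bounds.width and 0 <= y < bounds.height):
        raise ValueError(f"({x}, {y}) lies outside the scoped window")
    return bounds.left + x, bounds.top + y


# A click at (10, 20) inside a window whose top-left corner is at (100, 50).
chrome = WindowBounds(left=100, top=50, width=800, height=600)
print(to_screen(10, 20, chrome))
```

Rejecting out-of-range points is one possible design choice for keeping actions inside the scoped window; clamping to the window edge would be another.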

Full desktop

device-controller-mcp --full

Screenshots capture the whole screen and coordinates are screen-absolute.

Registering with Claude Desktop

Automatic (recommended)

The install command finds your OS's claude_desktop_config.json, merges in a server entry (leaving any existing servers untouched), and writes a .bak backup first:

# Full-desktop access
device-controller-mcp install --full

# Or scoped to one app
device-controller-mcp install --app "Google Chrome"

Useful flags: --name KEY to set the server key, --command PATH to override the executable, --config FILE to target a specific file, and --dry-run to preview the change without writing. Restart Claude Desktop afterward.

To remove it again:

device-controller-mcp uninstall                 # removes the "device-controller" key
device-controller-mcp uninstall --name device-controller-google-chrome
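
The merge-and-backup behaviour described for install can be pictured with a minimal sketch like the one below. It is not the project's install.py; the function name register_server and the exact backup file name are assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path


def register_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Merge one server entry into an MCP client config, backing up any existing file."""
    config: dict = {}
    if config_path.exists():
        # Copy the original next to itself before changing anything (assumed ".bak" suffix).
        shutil.copy2(config_path, config_path.with_name(config_path.name + ".bak"))
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")

    # Only the named key is added or replaced; other servers are left untouched.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling register_server(path, "device-controller", "device-controller-mcp", ["--full"]) on a config that already lists other servers should leave those entries in place and add the new key beside them.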

Manual

Add an entry to your claude_desktop_config.json by hand.

Config file location:

| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |

Full-desktop mode

{
  "mcpServers": {
    "device-controller": {
      "command": "device-controller-mcp",
      "args": ["--full"]
    }
  }
}

Scoped to one app

{
  "mcpServers": {
    "device-controller-chrome": {
      "command": "device-controller-mcp",
      "args": ["--app", "Google Chrome"]
    }
  }
}

Tip: If you installed inside a virtualenv, use the full path to the executable, e.g. "/path/to/venv/bin/device-controller-mcp". On Windows the executable sits in the virtualenv's Scripts directory rather than bin.

Registering with Claude Code

claude mcp add device-controller -- device-controller-mcp --full

Or scoped:

claude mcp add device-controller-chrome -- device-controller-mcp --app "Google Chrome"

Tools reference

| Tool | Description |
|---|---|
| screenshot | Capture the screen or the app window as a base64 PNG. Optional sub-region crop. |
| mouse_click | Click at (x, y) with a configurable button and click count. |
| mouse_move | Move the cursor to (x, y) without clicking. |
| mouse_scroll | Scroll up or down at a position. |
| type_text | Type a string. Supports clipboard-paste mode for Unicode. |
| key_press | Press keys or combos (ctrl+c, cmd+shift+s, ...). |
| list_windows | List all visible windows with title, position, and size. |
| focus_window | Bring a window to the foreground. |
| resize_window | Move and resize a window. |
| minimize_window | Minimize a window. |
| maximize_window | Maximize a window to fill the screen. |
| clipboard_read | Read the system clipboard. |
| clipboard_write | Write text to the clipboard. |
| launch_app | Open an application by name. |
| run_command | Run a shell command and return stdout, stderr, and the exit code. |
| is_process_running | Check whether a named process is running. |
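
Because screenshot returns its image as base64-encoded PNG data, a client can sanity-check a result without an imaging library. The sketch below decodes the string and reads the width and height from the PNG header. The helper name png_dimensions is hypothetical, and how the base64 string is wrapped in the tool response is not specified here.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(b64_png: str) -> tuple[int, int]:
    """Decode base64 PNG data and return (width, height) from its IHDR chunk."""
    data = base64.b64decode(b64_png)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
    # then big-endian 32-bit width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

To save the image instead, write base64.b64decode(b64_png) to a file opened in binary mode.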

macOS permissions

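On macOS you need to grant Accessibility access to your terminal app (or to Claude Desktop) in System Settings > Privacy & Security > Accessibility. This is required for pyautogui to control the mouse and keyboard. On macOS 10.15 and later, capturing other apps' windows generally also requires Screen Recording permission for the same app, granted in the same Privacy & Security pane.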
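
Once access is granted, combos of the kind key_press accepts (ctrl+c, cmd+shift+s) can be forwarded to pyautogui. The sketch below shows one way to split such a string into key names. The parse_combo helper and its alias table are assumptions for illustration, not the server's actual parsing code.

```python
# Aliases mapped to the key names pyautogui uses (assumed mapping for this sketch).
_ALIASES = {"cmd": "command", "control": "ctrl", "opt": "option"}


def parse_combo(combo: str) -> list[str]:
    """Split a combo such as "cmd+shift+s" into a list of key names."""
    keys = [part.strip().lower() for part in combo.split("+")]
    if not all(keys):
        # Note: this simple split cannot express the literal "+" key.
        raise ValueError(f"empty key in combo: {combo!r}")
    return [_ALIASES.get(key, key) for key in keys]


# With pyautogui installed and Accessibility access granted, the result
# could be passed on as: pyautogui.hotkey(*parse_combo("cmd+shift+s"))
print(parse_combo("cmd+shift+s"))
```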

Project structure

device-controller-mcp/
├── pyproject.toml
├── uv.lock                    # generated by `uv lock` / `uv sync`
├── .python-version            # Python pin used by uv
├── requirements.txt
├── README.md
└── src/
    └── device_controller_mcp/
        ├── __init__.py
        ├── __main__.py            # CLI entry point (run / install / uninstall)
        ├── install.py             # OS-aware Claude Desktop config registration
        ├── server.py              # FastMCP server factory
        ├── scope.py               # Scope manager (coord translation + auto-focus)
        ├── platform_layer/
        │   ├── __init__.py        # Platform detection factory
        │   ├── base.py            # Abstract base class + WindowInfo
        │   ├── macos.py           # macOS: Quartz + AppleScript
        │   └── windows.py         # Windows: pywin32
        └── tools/
            ├── __init__.py
            ├── screenshot.py      # Screen / window capture
            ├── input_control.py   # Mouse & keyboard
            ├── window_mgmt.py     # Window management
            ├── clipboard.py       # Clipboard read/write
            └── shell.py           # Shell commands & app launcher
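
To give a feel for how a platform abstraction layer with a detection factory is typically arranged, here is a minimal sketch. Apart from WindowInfo, which the tree above names, every class, method, and field name below is hypothetical and does not describe the actual contents of platform_layer/.

```python
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class WindowInfo:
    """Title, position, and size of a window (field names are assumed)."""
    title: str
    x: int
    y: int
    width: int
    height: int


class PlatformBackend(ABC):
    """Operations every platform backend would need to implement."""

    @abstractmethod
    def list_windows(self) -> list[WindowInfo]:
        ...


class MacOSBackend(PlatformBackend):
    def list_windows(self) -> list[WindowInfo]:
        return []  # placeholder; a real backend would query Quartz or AppleScript


class WindowsBackend(PlatformBackend):
    def list_windows(self) -> list[WindowInfo]:
        return []  # placeholder; a real backend would query pywin32


def get_backend(platform_name: str = sys.platform) -> PlatformBackend:
    """Pick a backend from sys.platform ("darwin" on macOS, "win32" on Windows)."""
    if platform_name == "darwin":
        return MacOSBackend()
    if platform_name == "win32":
        return WindowsBackend()
    raise RuntimeError(f"unsupported platform: {platform_name}")
```

Keeping the tools dependent only on the abstract base class means the tool modules need no platform checks of their own.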

Development

This project uses uv. Common tasks:

uv sync                 # create the venv and install deps (+ dev tools)
uv run device-controller-mcp --full   # run the server
uv lock                 # regenerate the lockfile after changing deps
uv run ruff check .     # lint

Note: uv.lock is committed so everyone resolves identical dependency versions. If it isn't present yet, run uv lock once and commit the result.

Security

This server gives an AI assistant real control over your machine. Treat it accordingly:

  • Arbitrary code execution. The run_command tool runs any shell command, and type_text / key_press can drive any application. There is no sandbox.
  • Prefer scoped mode. --app "<name>" keeps screenshots and coordinates bound to a single window. Use --full only when you genuinely need whole-desktop access.
  • Run it locally and trusted. Only register this server with clients you control, on machines you own. Never expose it to untrusted input or networks.
  • Review before trusting. Watch what the assistant does, especially the first time you use it with a new workflow.

Found a vulnerability? Please open a private security advisory on GitHub rather than a public issue.

License

MIT
