device-controller-mcp
An MCP (Model Context Protocol) server that lets Claude Desktop and Claude Code control your computer — take screenshots, click, type, manage windows, and more.
⚠️ Security notice: This server grants an AI assistant direct control over your mouse, keyboard, clipboard, and shell (including run_command, which executes arbitrary commands). Only run it on machines you own, prefer --app scoping over --full, and review actions before trusting them. See Security.
Features
- Two scope modes: lock every action to a single app window (--app), or grant full-desktop access (--full).
- Cross-platform: macOS and Windows, with a clean platform abstraction layer.
- Screenshot capture (full screen or single window) returned as base64 PNG.
- Mouse & keyboard: clicks, typing, key combos, scrolling.
- Window management: list, focus, resize, minimize, maximize.
- Clipboard read / write.
- Shell commands: launch apps, run commands, check running processes.
Requirements
- Python 3.10+
- macOS or Windows
Installation
With uv (recommended)
git clone https://github.com/geojakes/device-controller-mcp.git
cd device-controller-mcp
uv sync
This creates a virtual environment and installs everything you need. On Windows
the pywin32 backend is pulled in automatically. Run the server with uv run:
uv run device-controller-mcp --full
You can also run it without cloning, straight from the repo, via uvx:
uvx --from git+https://github.com/geojakes/device-controller-mcp device-controller-mcp --full
macOS (optional): the server works out of the box using AppleScript, but installing the Quartz extra gives faster, more accurate window bounds:
uv sync --extra macos
With pip
git clone https://github.com/geojakes/device-controller-mcp.git
cd device-controller-mcp
pip install -e . # add ".[macos]" on macOS for the optional Quartz backend
pywin32 (required on Windows) is selected automatically. The macOS Quartz
backend is optional — without it the server falls back to AppleScript.
Usage
Scoped to a single app
device-controller-mcp --app "Google Chrome"
All screenshots, clicks, and coordinates will be relative to Chrome's window. The server auto-focuses Chrome before every action.
Full desktop
device-controller-mcp --full
Screenshots capture the whole screen and coordinates are screen-absolute.
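The difference between the two modes comes down to a coordinate offset. The sketch below illustrates the idea only; it is not the project's scope.py. The WindowBounds type, the to_screen helper, and the choice to reject points outside the window are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowBounds:
    """Hypothetical window rectangle in screen pixels (top-left origin)."""
    x: int
    y: int
    width: int
    height: int


def to_screen(bounds: WindowBounds, rel_x: int, rel_y: int) -> tuple[int, int]:
    """Translate window-relative coordinates to screen-absolute ones."""
    # Assumption: points outside the scoped window are rejected, not clamped.
    if not (0 <= rel_x < bounds.width and 0 <= rel_y < bounds.height):
        raise ValueError(f"({rel_x}, {rel_y}) lies outside the scoped window")
    return bounds.x + rel_x, bounds.y + rel_y


chrome = WindowBounds(x=200, y=100, width=1280, height=800)
print(to_screen(chrome, 50, 40))  # (250, 140)
```

In --full mode no translation is needed, because the coordinates are already screen-absolute.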
Registering with Claude Desktop
Automatic (recommended)
The install command finds your OS's claude_desktop_config.json, merges in a
server entry (leaving any existing servers untouched), and writes a .bak
backup first:
# Full-desktop access
device-controller-mcp install --full
# Or scoped to one app
device-controller-mcp install --app "Google Chrome"
Useful flags: --name KEY to set the server key, --command PATH to override
the executable, --config FILE to target a specific file, and --dry-run to
preview the change without writing. Restart Claude Desktop afterward.
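For readers who want to see what a merge of this kind involves, here is a minimal sketch. It is not the project's install.py. The function name merge_server_entry, its parameters, and the backup file name (the original name with .bak appended) are assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path


def merge_server_entry(config_file: Path, name: str, command: str,
                       args: list[str], dry_run: bool = False) -> dict:
    """Return the merged config; write it (after a backup) unless dry_run."""
    if config_file.exists():
        config = json.loads(config_file.read_text(encoding="utf-8"))
    else:
        config = {}
    # Only the entry for `name` is set; other servers are left untouched.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    if not dry_run:
        if config_file.exists():
            # Assumed backup naming: "<file name>.bak" next to the original.
            backup = config_file.with_name(config_file.name + ".bak")
            shutil.copy2(config_file, backup)
        config_file.parent.mkdir(parents=True, exist_ok=True)
        config_file.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling it with dry_run=True returns the merged dictionary without touching the file, which mirrors the preview behaviour described for --dry-run.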
To remove it again:
device-controller-mcp uninstall # removes the "device-controller" key
device-controller-mcp uninstall --name device-controller-google-chrome
Manual
Add an entry to your claude_desktop_config.json by hand.
Config file location:
| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
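If you script your own setup, the two locations in the table can be resolved as sketched below. The helper name claude_config_path and the choice to raise an error on other platforms are assumptions for this example, not something the project is known to provide.

```python
import os
import platform
from pathlib import Path
from typing import Optional


def claude_config_path(system: Optional[str] = None) -> Path:
    """Resolve claude_desktop_config.json for the given (or current) OS."""
    system = system or platform.system()
    if system == "Darwin":  # platform.system() reports "Darwin" on macOS
        return (Path.home() / "Library" / "Application Support"
                / "Claude" / "claude_desktop_config.json")
    if system == "Windows":
        appdata = os.environ.get("APPDATA")
        if not appdata:
            raise RuntimeError("APPDATA is not set")
        return Path(appdata) / "Claude" / "claude_desktop_config.json"
    # Assumption: only the two platforms listed in the table are handled.
    raise RuntimeError(f"unsupported platform: {system}")
```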
Full-desktop mode
{
"mcpServers": {
"device-controller": {
"command": "device-controller-mcp",
"args": ["--full"]
}
}
}
Scoped to one app
{
"mcpServers": {
"device-controller-chrome": {
"command": "device-controller-mcp",
"args": ["--app", "Google Chrome"]
}
}
}
Tip: If you installed inside a virtualenv, use the full path to the executable, e.g.
"/path/to/venv/bin/device-controller-mcp".
Registering with Claude Code
claude mcp add device-controller -- device-controller-mcp --full
Or scoped:
claude mcp add device-controller-chrome -- device-controller-mcp --app "Google Chrome"
Tools reference
| Tool | Description |
|---|---|
| screenshot | Capture screen or app window as base64 PNG. Optional sub-region crop. |
| mouse_click | Click at (x, y) with a configurable button and click count. |
| mouse_move | Move cursor to (x, y) without clicking. |
| mouse_scroll | Scroll up/down at a position. |
| type_text | Type a string. Supports clipboard-paste mode for Unicode. |
| key_press | Press keys or combos (ctrl+c, cmd+shift+s, ...). |
| list_windows | List all visible windows with title, position, and size. |
| focus_window | Bring a window to the foreground. |
| resize_window | Move and resize a window. |
| minimize_window | Minimize a window. |
| maximize_window | Maximize a window to fill the screen. |
| clipboard_read | Read the system clipboard. |
| clipboard_write | Write text to the clipboard. |
| launch_app | Open an application by name. |
| run_command | Run a shell command and return stdout/stderr/exit code. |
| is_process_running | Check whether a named process is running. |
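Because screenshot returns a base64 PNG, a client can sanity-check the payload before using it. The sketch below assumes the base64 string has already been extracted from the tool response; the exact response shape is not documented here. The helper name png_size_from_base64 is hypothetical. It relies only on the standard PNG layout: an 8-byte signature followed by the IHDR chunk, which carries width and height as big-endian 32-bit integers.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_base64(payload: str) -> tuple[int, int]:
    """Decode a base64 PNG and return (width, height) from its IHDR chunk."""
    data = base64.b64decode(payload)
    # Bytes 0-7: signature; 8-11: chunk length; 12-15: chunk type "IHDR".
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("payload is not a PNG image")
    # Bytes 16-23: width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

Knowing the image size is useful when mapping a point in the screenshot back to the (x, y) arguments of mouse_click.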
macOS permissions
On macOS you need to grant Accessibility access to your terminal app
(or to Claude Desktop) in System Settings > Privacy & Security > Accessibility.
This is required for pyautogui to control the mouse and keyboard.
Project structure
device-controller-mcp/
├── pyproject.toml
├── uv.lock # generated by `uv lock` / `uv sync`
├── .python-version # Python pin used by uv
├── requirements.txt
├── README.md
└── src/
└── device_controller_mcp/
├── __init__.py
├── __main__.py # CLI entry point (run / install / uninstall)
├── install.py # OS-aware Claude Desktop config registration
├── server.py # FastMCP server factory
├── scope.py # Scope manager (coord translation + auto-focus)
├── platform_layer/
│ ├── __init__.py # Platform detection factory
│ ├── base.py # Abstract base class + WindowInfo
│ ├── macos.py # macOS: Quartz + AppleScript
│ └── windows.py # Windows: pywin32
└── tools/
├── __init__.py
├── screenshot.py # Screen / window capture
├── input_control.py # Mouse & keyboard
├── window_mgmt.py # Window management
├── clipboard.py # Clipboard read/write
└── shell.py # Shell commands & app launcher
Development
This project uses uv. Common tasks:
uv sync # create the venv and install deps (+ dev tools)
uv run device-controller-mcp --full # run the server
uv lock # regenerate the lockfile after changing deps
uv run ruff check . # lint
Note: uv.lock is committed so everyone resolves identical dependency versions. If it isn't present yet, run uv lock once and commit the result.
Security
This server gives an AI assistant real control over your machine. Treat it accordingly:
- Arbitrary code execution. The run_command tool runs any shell command, and type_text / key_press can drive any application. There is no sandbox.
- Prefer scoped mode. --app "<name>" keeps screenshots and coordinates bound to a single window. Use --full only when you genuinely need whole-desktop access.
- Run it locally and trusted. Only register this server with clients you control, on machines you own. Never expose it to untrusted input or networks.
- Review before trusting. Watch what the assistant does, especially the first time you use it with a new workflow.
Found a vulnerability? Please open a private security advisory on GitHub rather than a public issue.
License