lowlevel-computer-use-mcp
A low-level computer-use MCP server that exposes raw desktop control (mouse, keyboard, shell, windows, processes, screenshots, etc.) to MCP clients like Claude Code and Codex.
A low-level computer-use MCP server for Windows (most tools also work on macOS/Linux). It exposes raw desktop control to any MCP client (Claude Code, Codex, Claude Desktop, etc.) as a set of well-described tools:
- Mouse: move, click, double/right/middle click, drag, scroll, cursor position
- Keyboard: type text, press hotkey combinations (Ctrl+C, Alt+Tab, …)
- Shell commands: run arbitrary system commands and capture output
- Windows: list, move, resize, focus, minimize, maximize, restore, close, show/hide
- Processes: list running processes; kill by PID or name
- Screenshots: all monitors, one monitor, a region, or one window via PrintWindow
- Cropping: crop any saved image to a sub-region
- Screen recording: record a monitor or region to mp4 in the background
- Background / unfocused targeting: drive a specific window via Win32 messages without focusing it
- Headless GUI: run real GUI apps on an off-screen desktop; show them only when a human login is needed
- Run-as-admin: per-command UAC elevation or whole-server elevation
- Auto-start on boot: register a logon scheduled task (optionally as admin)
- AutoHotkey add-in: run AHK scripts; `ControlSend`/`ControlClick` for rock-solid background input
- Cross-platform: the same tools work natively on Linux (X11 via xdotool/wmctrl), with Xvfb virtual displays for headless-with-GUI
- Ephemeral WSL: on a Windows host, spin up a throwaway Linux distro on demand, run commands, tear it down
- GUI installer: one window that installs everything automatically

Warning: this server performs real, unsandboxed actions on the host machine (clicking, typing, killing processes and running shell/elevated commands) with your user's privileges. Only register it in environments where that is acceptable.
Tools
| Tool | Description |
|---|---|
| `get_screen_size` | Primary screen resolution |
| `get_cursor_position` | Current mouse position |
| `mouse_move` | Move cursor to (x, y) |
| `mouse_click` | Click; with `hwnd`/`window_title`, a background click (client coords) |
| `mouse_drag` | Press-drag-release between two points |
| `mouse_scroll` | Scroll the wheel up/down |
| `type_text` | Type text; with `hwnd`/`window_title`, background WM_CHAR |
| `press_keys` | Press a key / hotkey combo, e.g. `["ctrl","c"]` |
| `run_command` | Run a shell command, capture output/exit code |
| `list_windows` | List top-level windows (title, handle, geometry, state) |
| `get_active_window` | Info about the focused window |
| `move_window` / `resize_window` | Move / resize a window |
| `window_action` | focus / minimize / maximize / restore / close |
| `show_window` / `hide_window` | Bring a window forward (e.g. for login), then hide it again |
| `list_child_windows` | Enumerate a window's child controls (class, text, rect, handle) |
| `win_set_control_text` | Set a control's text via WM_SETTEXT (reliable background text) |
| `win_send_keys` | Post key presses to a window without focusing it |
| `list_processes` / `kill_process` | List processes; kill by PID or name |
| `screenshot` | Monitor/region/single-window (PrintWindow) capture to PNG |
| `crop_image` | Crop an existing image to a box |
| `start_screen_recording` / `stop_screen_recording` / `recording_status` | mp4 recording |
| `create_headless_desktop` | Create an off-screen desktop |
| `launch_on_headless_desktop` | Launch a GUI app onto it |
| `list_headless_windows` | List windows on the off-screen desktop |
| `show_headless_desktop` / `hide_headless_desktop` | Temporarily make it interactive (login), then hide |
| `close_headless_desktop` | Release the off-screen desktop handle |
| `ahk_status` | Whether AutoHotkey is installed |
| `run_ahk` | Run an inline AutoHotkey script |
| `ahk_control_send` | AHK ControlSend to a background window/control |
| `is_admin` | Whether the server is running elevated |
| `run_command_as_admin` | Run a shell command elevated (UAC prompt) |
| `install_startup` / `uninstall_startup` / `startup_status` | Boot auto-start |
| `linux_status` | Linux: X11 automation tooling availability |
| `create_virtual_display` | Linux: start an Xvfb headless display |
| `launch_on_virtual_display` | Linux: launch a GUI app on the Xvfb display |
| `list_virtual_display_windows` | Linux: windows on the Xvfb display |
| `screenshot_virtual_display` | Linux: capture the whole Xvfb display |
| `stop_virtual_display` | Linux: stop the Xvfb display |
| `wsl_status` / `wsl_list_distros` | WSL availability + installed distros |
| `wsl_create_temp` | Provision a throwaway WSL distro (Alpine by default) |
| `wsl_run` | Run a command inside a WSL distro |
| `wsl_list_temp` / `wsl_destroy` / `wsl_destroy_all_temp` | Manage throwaway distros |

Every tool returns a JSON string {"ok": true, ...} on success or
{"ok": false, "error": "..."} on failure.
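On the client side, that result convention can be handled in one place. The following is a minimal sketch, not part of the server: `ToolError` and `unwrap_tool_result` are hypothetical names, and the only thing taken from this README is the `ok` / `error` shape of the JSON string.

```python
import json


class ToolError(RuntimeError):
    """Raised when a tool result reports "ok": false (hypothetical helper)."""


def unwrap_tool_result(raw: str) -> dict:
    """Parse a tool's JSON string and return its payload without the "ok" flag."""
    result = json.loads(raw)
    if not result.get("ok"):
        # Failure results carry a human-readable "error" message.
        raise ToolError(result.get("error", "unknown error"))
    return {key: value for key, value in result.items() if key != "ok"}
```

Raising on `"ok": false` keeps multi-step automation from continuing after a failed step; the payload field names of each tool are not specified here, so the helper passes them through unchanged.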
Quick start (GUI installer, fully automatic)
The easiest path. It installs uv if missing, runs uv sync, and registers the
server with both Claude Code and Codex automatically on launch.
Clone the repo:
git clone https://github.com/codingmachineedge/lowlevel-computer-use-mcp.git
Enter it:
cd lowlevel-computer-use-mcp
Launch the installer (it auto-runs the full install):
uv run lowlevel-computer-use-mcp-installer
Then restart Claude Code / Codex so they spawn the server.
Manual install
Install dependencies and start the stdio server:
uv run lowlevel-computer-use-mcp
Or with pip, install in editable mode:
pip install -e .
Then run it:
lowlevel-computer-use-mcp
Captures (screenshots, recordings) are written to
~/lowlevel-computer-use-captures by default. Override with the
LOWLEVEL_CU_CAPTURE_DIR environment variable.
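If a script needs to find the captures afterwards, the lookup can be mirrored like this. This is a sketch under assumptions: the default path and the variable name come from this README, while `resolve_capture_dir`, the `~` expansion, and treating an empty value as unset are illustrative choices that may differ from what the server does.

```python
import os
from pathlib import Path

# Default location stated in this README.
DEFAULT_CAPTURE_DIR = Path.home() / "lowlevel-computer-use-captures"


def resolve_capture_dir(env=None) -> Path:
    """Return the capture directory, honouring LOWLEVEL_CU_CAPTURE_DIR if set."""
    env = os.environ if env is None else env
    override = env.get("LOWLEVEL_CU_CAPTURE_DIR")
    # Assumption: an empty value is treated the same as an unset variable.
    return Path(override).expanduser() if override else DEFAULT_CAPTURE_DIR
```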
Registering with clients
Replace the path below with wherever you cloned this repo.
Claude Code
Register at user scope (one line):
claude mcp add lowlevel-computer-use --scope user -- uv run --directory "C:\path\to\lowlevel-computer-use-mcp" lowlevel-computer-use-mcp
Or add this to ~/.claude.json under mcpServers:
{
  "mcpServers": {
    "lowlevel-computer-use": {
      "command": "uv",
      "args": ["run", "--directory", "C:\\path\\to\\lowlevel-computer-use-mcp", "lowlevel-computer-use-mcp"]
    }
  }
}
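Hand-editing Windows paths into JSON is error-prone because every backslash must be doubled. A small sketch that generates the same entry and lets `json.dumps` do the escaping is shown below; `claude_server_entry` and `add_server` are hypothetical helper names, and the sketch only builds the text rather than writing to `~/.claude.json`.

```python
import json


def claude_server_entry(repo_dir: str) -> dict:
    """Build the server entry shown above for a given clone location."""
    return {
        "command": "uv",
        "args": ["run", "--directory", repo_dir, "lowlevel-computer-use-mcp"],
    }


def add_server(config: dict, repo_dir: str) -> dict:
    """Return a copy of config with the server registered under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    servers["lowlevel-computer-use"] = claude_server_entry(repo_dir)
    return {**config, "mcpServers": servers}


# A raw string keeps the single backslashes; json.dumps doubles them on output.
snippet = json.dumps(add_server({}, r"C:\path\to\lowlevel-computer-use-mcp"), indent=2)
```

Existing keys in the config, including other registered servers, are preserved because `add_server` copies the dictionary instead of replacing it.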
Codex (OpenAI Codex CLI)
Add this block to ~/.codex/config.toml:
[mcp_servers.lowlevel-computer-use]
command = "uv"
args = ["run", "--directory", "C:\\path\\to\\lowlevel-computer-use-mcp", "lowlevel-computer-use-mcp"]
startup_timeout_sec = 60
Background / unfocused window targeting
mouse_click, type_text and screenshot accept hwnd or window_title. When
set, input is delivered to that exact window via Win32 messages without focusing
or foregrounding it, and screenshot uses PrintWindow so the window is captured
even if it's behind others, minimized, or on an off-screen desktop.
Typical flow:
- Find the window:
list_windows { "title_filter": "Notepad" }
- Find the control to target:
list_child_windows { "window_title": "Notepad" }
- Set its text in the background (most reliable for edit controls):
win_set_control_text { "hwnd": 23924320, "text": "typed without focus" }
- Or click it in the background (x/y are client coords of the window):
mouse_click { "window_title": "Notepad", "x": 200, "y": 120 }
- See the result without bringing it forward:
screenshot { "window_title": "Notepad" }
Caveat: message-based input is ignored by some apps (raw input / DirectInput / physical-key-state checks). For those, use the AutoHotkey `ahk_control_send` tool, or bring the window forward briefly.
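The background click in the flow above takes client coordinates, i.e. positions relative to the top-left corner of the window's client area rather than the screen. On Win32 the conversion is what `ScreenToClient` performs; the arithmetic can be sketched as below. `Point` and `screen_to_client` are hypothetical names, and this README does not say which tool reports the client-area origin (the geometry from `list_windows` may describe the outer frame instead), so treat `client_origin` as an input you have to obtain separately.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def screen_to_client(screen: Point, client_origin: Point) -> Point:
    """Convert a screen position to coordinates relative to a window's client area.

    client_origin is the screen position of the client area's top-left corner.
    """
    return Point(screen.x - client_origin.x, screen.y - client_origin.y)
```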
Headless-but-with-GUI mode
Run a real GUI app on an off-screen Win32 desktop so it never touches your visible desktop, then automate and screenshot it via the background tools.
Create the off-screen desktop:
create_headless_desktop { "name": "work" }
Launch an app onto it:
launch_on_headless_desktop { "name": "work", "command": "notepad.exe" }
List its windows (to get handles):
list_headless_windows { "name": "work" }
Capture a window on it (works even though it's off-screen):
screenshot { "hwnd": 2495156 }
Showing it for an interactive login, then hiding again
Some steps (sign-in) need a human. Temporarily switch the live screen to the off-screen desktop:
show_headless_desktop { "name": "work" }
…let the user log in, then switch back to the normal desktop:
hide_headless_desktop { "name": "work" }
For an ordinary hidden window on the normal desktop, use show_window /
hide_window instead:
show_window { "window_title": "My App" }
hide_window { "window_title": "My App" }
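When a client script drives the show/hide pair, it helps to guarantee that the hide call runs even if the login step fails, so the live screen is never left on the off-screen desktop. A minimal sketch follows; `call_tool` is a hypothetical callable `(tool_name, arguments) -> result` standing in for whatever MCP client you use, and only the two tool names and the `name` argument come from this README.

```python
from contextlib import contextmanager
from typing import Callable

# Hypothetical signature: (tool name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, dict], dict]


@contextmanager
def headless_desktop_shown(call_tool: ToolCaller, name: str):
    """Show the off-screen desktop for the duration of the with-block."""
    call_tool("show_headless_desktop", {"name": name})
    try:
        yield
    finally:
        # Runs on success and on error, so the normal desktop always comes back.
        call_tool("hide_headless_desktop", {"name": name})
```

Usage would look like `with headless_desktop_shown(call_tool, "work"): ...`, with the human login happening inside the block.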
AutoHotkey add-in
Optional but powerful. AHK's ControlSend/ControlClick drive background windows
very reliably, and run_ahk is a full scripting escape hatch.
Install AutoHotkey (one line):
winget install -e --id AutoHotkey.AutoHotkey
Check the server can find it:
ahk_status {}
Send text to a background window by HWND:
ahk_control_send { "text": "hello", "window": "ahk_id 0x1A2B3C" }
Run an arbitrary AHK script (must call ExitApp):
run_ahk { "code": "ControlSendText \"hi\", , \"ahk_exe notepad.exe\"\nExitApp" }
Point the server at a specific AHK exe by setting LOWLEVEL_CU_AHK to its path.
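Two details in the AHK calls above are easy to get wrong by hand: the `ahk_id 0x…` window spec and the requirement that a script ends by calling `ExitApp`. The sketch below builds both on the client side. `ahk_window_spec` and `run_ahk_arguments` are hypothetical helpers, and the check only looks at the last line of the script, which is a simplification.

```python
import json


def ahk_window_spec(hwnd: int) -> str:
    """Format a numeric window handle as an AHK 'ahk_id' window spec."""
    return f"ahk_id 0x{hwnd:X}"


def ends_with_exitapp(script: str) -> bool:
    """True if the last non-blank line of the script starts with ExitApp."""
    lines = script.strip().splitlines()
    return bool(lines) and lines[-1].split()[0].lower() == "exitapp"


def run_ahk_arguments(script: str) -> str:
    """Serialize run_ahk arguments, appending ExitApp when the script lacks it."""
    body = script.rstrip()
    if not ends_with_exitapp(body):
        body = f"{body}\nExitApp" if body else "ExitApp"
    # json.dumps escapes the embedded quotes and newlines.
    return json.dumps({"code": body})
```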
Macros: save repeated sequences as Skills
When you run a multi-step UI sequence the user is likely to repeat, don't leave
it as ad-hoc tool calls; capture it as a reusable macro Skill. The server tells
agents to do this automatically; see
macros/MACRO_SKILL_TEMPLATE.md for the template
and rules (resolve handles at run time, prefer background tools, parameterize the
variable parts, verify with a screenshot).
Linux (native X11)
The mouse, keyboard, screenshot, process, recording, window management and
background/unfocused targeting tools all work natively on Linux. Window control,
background input and per-window capture use X11 CLI tools; on Linux hwnd is an X11
window id.
Install the X11 helpers (Debian/Ubuntu):
sudo apt install xdotool wmctrl x11-utils imagemagick xvfb
Check what the server can see:
linux_status {}
Background-type into a window without focusing it (X11):
type_text { "window_title": "Editor", "text": "typed in the background" }
Caveat: X11 background typing uses `XSendEvent`; most apps accept it, but a few (notably `xterm` with its default `allowSendEvents: false`) ignore synthetic events. For those, focus the window first or use a real X session.
Headless-with-GUI on Linux (Xvfb)
Start a virtual display:
create_virtual_display { "display": 99, "width": 1280, "height": 800 }
Launch a GUI app onto it:
launch_on_virtual_display { "display": 99, "command": "xterm -e bash" }
List its windows:
list_virtual_display_windows { "display": 99 }
Drive a window on that display (note the display field routes input there):
type_text { "hwnd": 2097164, "display": 99, "text": "hello from headless" }
Capture the whole virtual display:
screenshot_virtual_display { "display": 99 }
Stop it when done:
stop_virtual_display { "display": 99 }
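For orientation, the equivalent manual steps outside the server are starting `Xvfb` on a display number and pointing GUI programs at it through the `DISPLAY` variable. The sketch below only builds the command line and environment; it is not taken from the server's code, the 24-bit colour depth is an assumption, and `xvfb_command` / `display_env` are hypothetical names.

```python
import os


def xvfb_command(display: int, width: int, height: int, depth: int = 24) -> list[str]:
    """Build an Xvfb command line for one virtual screen (depth is an assumption)."""
    return ["Xvfb", f":{display}", "-screen", "0", f"{width}x{height}x{depth}"]


def display_env(display: int, base_env=None) -> dict:
    """Return an environment that routes X11 clients to the given display."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = f":{display}"
    return env
```

These values could be passed to `subprocess.Popen(xvfb_command(99, 1280, 800))` and then `subprocess.Popen(["xterm"], env=display_env(99))` on a machine with Xvfb installed.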
Ephemeral WSL (Linux on a Windows host)
On Windows, spin up a throwaway Linux distro on demand to run Linux software, then tear it down. By default a tiny Alpine minirootfs is downloaded and imported in seconds; your existing distros are untouched.
Check WSL is available:
wsl_status {}
Provision a throwaway distro (downloads latest Alpine minirootfs):
wsl_create_temp {}
Run a command in it (use the name returned above):
wsl_run { "distro": "llcu-tmp-1782754365-53b8", "command": "apk add --no-cache curl && curl --version" }
Tear it down (irreversible: deletes the distro):
wsl_destroy { "name": "llcu-tmp-1782754365-53b8" }
You can also clone an existing distro instead of downloading:
wsl_create_temp { "clone_from": "Ubuntu-24.04" }
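The lifecycle above corresponds to three standard `wsl.exe` operations: `--import` a root filesystem tarball under a new name, run a command with `-d`, and `--unregister` to delete the distro. The sketch below builds those command lines; it is an illustration of the underlying CLI rather than the server's implementation, the function names are hypothetical, and wrapping the command in `sh -c` is an assumption made so that shell operators such as `&&` run inside the distro.

```python
def wsl_import_command(name: str, install_dir: str, rootfs_tar: str) -> list[str]:
    """Register a new distro from a root filesystem tarball."""
    return ["wsl", "--import", name, install_dir, rootfs_tar]


def wsl_run_command(name: str, command: str) -> list[str]:
    """Run a shell command inside the named distro (sh -c is an assumption)."""
    return ["wsl", "-d", name, "--", "sh", "-c", command]


def wsl_unregister_command(name: str) -> list[str]:
    """Unregister the distro, which deletes its filesystem."""
    return ["wsl", "--unregister", name]
```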
Run-as-admin mode
Per-command elevation β call run_command_as_admin (UAC prompt unless already
elevated):
run_command_as_admin { "command": "net session" }
Or run the whole server elevated (intended for HTTP mode):
uv run lowlevel-computer-use-mcp --http --admin
Check elevation:
is_admin {}
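An elevation check of this kind is commonly done with `IsUserAnAdmin` from `shell32` on Windows and an effective-UID test elsewhere. The sketch below shows that common approach; whether the server's `is_admin` tool uses the same calls is an assumption, and `is_elevated` is a hypothetical name.

```python
import ctypes
import os


def is_elevated() -> bool:
    """Best-effort check for an elevated (admin/root) process."""
    if os.name == "nt":
        try:
            # shell32.IsUserAnAdmin returns non-zero when running elevated.
            return bool(ctypes.windll.shell32.IsUserAnAdmin())
        except (AttributeError, OSError):
            return False
    # POSIX: root has effective UID 0.
    return os.geteuid() == 0
```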
Auto-start on boot
Register a logon Scheduled Task (admin + HTTP by default):
uv run lowlevel-computer-use-mcp install-startup
Install without admin privileges:
uv run lowlevel-computer-use-mcp install-startup --no-admin
Use a custom port:
uv run lowlevel-computer-use-mcp install-startup --port 9000
Check the task status:
uv run lowlevel-computer-use-mcp startup-status
Remove it:
uv run lowlevel-computer-use-mcp uninstall-startup
Once the boot service runs in HTTP mode, point a client at it as a remote MCP server:
claude mcp add --transport http lowlevel-computer-use-boot http://127.0.0.1:8765/mcp
The task uses an interactive logon trigger (not SYSTEM) so the desktop-automation tools keep access to your session.
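Before registering the HTTP endpoint with a client, it can be useful to confirm that something is listening on the port. A minimal sketch using a plain TCP connect is shown below; it only checks reachability, not that the listener is this MCP server, and `is_listening` is a hypothetical helper. The default port matches the URL in the example above.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8765, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```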
Requirements
- Python 3.10+
- uv (recommended; the GUI installer can bootstrap it)
- Windows: mouse/keyboard via `pyautogui`; windows via `pygetwindow`; background input + capture + headless desktop via `ctypes`/Win32 (`winio.py`); screenshots via `mss`; recording via `imageio` + bundled ffmpeg. Optional: AutoHotkey, WSL.
- Linux: mouse/keyboard via `pyautogui` (X11); window mgmt, background input and per-window capture via `xdotool`/`wmctrl`/`x11-utils`/ImageMagick (`linuxio.py`); headless-with-GUI via `Xvfb`; screenshots/recording via `mss`. Install the X11 helpers with your package manager (see the Linux section). Requires an X11 (or XWayland) session.
Safety notes
- `run_command`, `run_command_as_admin`, `kill_process`, `run_ahk` and `window_action(close)` are marked destructive.
- `pyautogui`'s fail-safe is disabled so automation isn't interrupted by the cursor reaching a screen corner; be deliberate with coordinates.
- The server has no authentication of its own; it trusts the MCP client that spawns it.
License
MIT