screengrab-tool-mcp
Lets an LLM see what's on your screen by capturing an entire monitor or a specific application window.
A small Model Context Protocol server that
lets an LLM see what's on your screen — either an entire monitor or a
specific application window. The screen-capture tool is cross-platform; the
window-level tools are Windows-only and use pywin32 under the hood.
Built with the official mcp Python SDK, mss
for fast cross-platform capture, and Pillow for image processing. Packaged with
uv so it runs with one command via uvx.
Tools
capture_screen (cross-platform)
Capture an entire monitor and return a downsized PNG.
| Parameter | Type | Default | Description |
|---|---|---|---|
| delay_seconds | integer | 0 | Seconds to wait before capturing. Range 0–60. |
| display_index | integer | 1 | 0 = all monitors stitched, 1 = primary, 2 = secondary, etc. Out-of-range values fall back to primary. |
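The display_index rules above can be sketched as a small pure function. This is an illustration rather than the server's actual code: resolve_display_index is a hypothetical name, the sample geometry is made up, and treating negative values as out of range is an assumption. The list shape follows the mss convention, where entry 0 is the stitched all-monitors rectangle.

```python
from typing import Sequence


def resolve_display_index(display_index: int, monitors: Sequence[dict]) -> dict:
    """Pick a monitor entry from an mss-style list.

    monitors[0] is the stitched all-monitors rectangle and monitors[1..N]
    are the individual displays, matching the table above.
    """
    if 0 <= display_index < len(monitors):
        return monitors[display_index]
    # Out-of-range values (negative ones included, by assumption) fall back
    # to the primary display.
    return monitors[1]


# Hypothetical two-display layout, shaped like the dicts mss reports.
monitors = [
    {"left": 0, "top": 0, "width": 4480, "height": 1440},     # 0: all monitors
    {"left": 0, "top": 0, "width": 2560, "height": 1440},     # 1: primary
    {"left": 2560, "top": 0, "width": 1920, "height": 1080},  # 2: secondary
]

chosen = resolve_display_index(7, monitors)  # out of range, so primary is used
```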
list_windows (Windows-only)
Enumerate visible top-level windows on the desktop, sorted most-recently-focused
first. Returns a JSON array of {hwnd, title, process, pid} so the model can
pick the one it wants. No parameters.
capture_window (Windows-only)
Capture a specific window matched by query string.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Title substring (case-insensitive) tried first; then process name (e.g. "code" matches Code.exe). Most recently focused wins on multi-match. |
| remember_as | string | | Optional friendly name to store the resolved window under for the rest of this server process. Recall later via capture_remembered. |
| delay_seconds | integer | 0 | Seconds to wait before capturing. Range 0–60. |
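The two-pass matching described for query can be sketched as follows. resolve_window and the sample window list are hypothetical, and comparing the query against the process name with or without its extension is an assumption; the README only says that "code" matches Code.exe. The input is assumed to be ordered most-recently-focused first, as list_windows returns it, so the first hit in each pass is the most recent one.

```python
from typing import Optional


def resolve_window(query: str, windows: list[dict]) -> Optional[dict]:
    """Return the best match for query, or None when nothing matches."""
    needle = query.casefold()
    # Pass 1: case-insensitive title substring.
    for win in windows:
        if needle in win["title"].casefold():
            return win
    # Pass 2: process name, with or without its extension (assumed rule).
    for win in windows:
        process = win["process"].casefold()
        if needle == process or needle == process.rsplit(".", 1)[0]:
            return win
    return None


# Hypothetical desktop, most recently focused first.
windows = [
    {"hwnd": 101, "title": "Inbox - Mail", "process": "mail.exe", "pid": 5151},
    {"hwnd": 102, "title": "Untitled-1", "process": "Code.exe", "pid": 4242},
    {"hwnd": 103, "title": "Mail settings", "process": "settings.exe", "pid": 6161},
]

by_title = resolve_window("mail", windows)    # two titles match; hwnd 101 is more recent
by_process = resolve_window("code", windows)  # no title matches; Code.exe does
```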
The capture pipeline tries PrintWindow first (no focus disturb). If that
returns a near-blank frame — common with hardware-accelerated apps like
browsers and Electron — it falls back to briefly bringing the window forward,
grabbing its rect with mss, then restoring the previous foreground window.
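The control flow of that fallback can be sketched with the two capture paths passed in as plain callables, so the example runs anywhere. Everything named here is hypothetical, and the blank check in particular is an assumption: the README does not say how a near-blank frame is detected, so this sketch simply looks at the spread of byte values.

```python
from typing import Callable

Frame = bytes  # raw pixel bytes stand in for a captured image in this sketch


def looks_blank(frame: Frame, tolerance: int = 8) -> bool:
    """Treat a frame as near-blank when all byte values sit in a narrow band."""
    if not frame:
        return True
    return max(frame) - min(frame) <= tolerance


def capture_with_fallback(
    background_capture: Callable[[], Frame],
    foreground_capture: Callable[[], Frame],
) -> tuple[Frame, str]:
    """Try the non-intrusive path first; fall back only if it came back blank."""
    frame = background_capture()  # stands in for the PrintWindow path
    if not looks_blank(frame):
        return frame, "printwindow"
    # Stands in for: bring the window forward, grab its rect, restore focus.
    return foreground_capture(), "foreground"


# A hardware-accelerated app that hands back an all-black frame in the background.
frame, path_used = capture_with_fallback(lambda: bytes(64), lambda: bytes(range(200)))
```

Returning which path was used makes it easy to log how often the focus-stealing fallback actually fires.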
capture_remembered (Windows-only)
Capture a previously remembered window by friendly name.
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | string | required | The name passed to capture_window's remember_as. last is always valid after the first successful window capture. |
| delay_seconds | integer | 0 | Seconds to wait before capturing. Range 0–60. |
Aliases live in process memory only — they vanish when the server restarts.
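A minimal sketch of such an alias store, assuming it maps friendly names to window handles (the README does not say exactly what is stored). AliasStore and its methods are hypothetical names, not the contents of aliases.py.

```python
from typing import Optional


class AliasStore:
    """In-memory map of friendly name to window handle; lost on restart."""

    def __init__(self) -> None:
        self._aliases: dict[str, int] = {}

    def record_capture(self, hwnd: int, remember_as: Optional[str] = None) -> None:
        # Every successful window capture refreshes the built-in "last" alias.
        self._aliases["last"] = hwnd
        if remember_as:
            self._aliases[remember_as] = hwnd

    def resolve(self, name: str) -> int:
        try:
            return self._aliases[name]
        except KeyError:
            raise KeyError(f"no remembered window named {name!r}") from None


store = AliasStore()
store.record_capture(101, remember_as="editor")  # capture_window(..., remember_as="editor")
store.record_capture(202)                        # a later capture without an alias
```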
save_last_capture
Persists a recent capture from the in-memory ring buffer to a file in the workspace. Captures from capture_screen, capture_window, and capture_remembered are pushed into a 5-slot ring as a side effect; this tool writes one of them to disk.
| Parameter | Type | Default | Description |
|---|---|---|---|
| index | int | 0 | Ring slot. 0 = most recent. Max 4. |
| name | str | (auto) | Filename (no path separators). .png appended if missing. |
| dir | str | (env or ./screenshots/) | Destination directory. |
Directory resolution: dir arg → SCREENGRAB_SAVE_DIR env var → ./screenshots/. Auto-created if missing. Filename collisions auto-suffix (foo.png → foo-1.png).
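The ring buffer, directory resolution, and collision handling can be sketched with the standard library alone. All function names are hypothetical, and the default filename "capture" is a placeholder: the README does not specify how the automatic name is generated.

```python
import os
from collections import deque
from pathlib import Path
from typing import Optional

RING: deque = deque(maxlen=5)  # newest capture sits at index 0


def push_capture(png_bytes: bytes) -> None:
    RING.appendleft(png_bytes)  # the oldest entry drops off once 5 are held


def resolve_save_dir(dir_arg: Optional[str] = None) -> Path:
    """dir argument, then SCREENGRAB_SAVE_DIR, then ./screenshots/."""
    target = Path(dir_arg or os.environ.get("SCREENGRAB_SAVE_DIR") or "screenshots")
    target.mkdir(parents=True, exist_ok=True)  # auto-create if missing
    return target


def unique_path(directory: Path, name: str) -> Path:
    """Append .png if missing, then -1, -2, ... until the name is free."""
    if "/" in name or "\\" in name:
        raise ValueError("name must not contain path separators")
    if not name.lower().endswith(".png"):
        name += ".png"
    candidate = directory / name
    stem, counter = candidate.stem, 1
    while candidate.exists():
        candidate = directory / f"{stem}-{counter}.png"
        counter += 1
    return candidate


def save_capture(index: int = 0, name: str = "capture",
                 dir_arg: Optional[str] = None) -> Path:
    """Write ring slot `index` to disk and return the path that was used."""
    path = unique_path(resolve_save_dir(dir_arg), name)
    path.write_bytes(RING[index])
    return path
```

Calling save_capture(0, "foo") twice in a row would produce foo.png and then foo-1.png in the resolved directory.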
Returned images are downscaled with Pillow's Image.thumbnail((2000, 2000))
before being base64-encoded as a PNG, so a raw 4K screen grab doesn't blow up
the context window. thumbnail preserves the aspect ratio and never upscales.
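To get a feel for what the 2000-pixel cap does to common screen sizes, the sizing arithmetic can be approximated without Pillow. fit_within is a hypothetical helper that sketches aspect-preserving fit-within-a-box scaling; it is not Pillow's code, and Pillow's own rounding may differ by a pixel on awkward sizes.

```python
def fit_within(width: int, height: int, max_side: int = 2000) -> tuple[int, int]:
    """Scale (width, height) down to fit a max_side square, keeping aspect ratio."""
    scale = min(max_side / width, max_side / height, 1.0)  # 1.0 cap: never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


uhd = fit_within(3840, 2160)      # a 4K display is roughly halved
full_hd = fit_within(1920, 1080)  # already inside the box, left untouched
```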
Install / run
You don't need to install anything globally. With uv
present on your PATH, the server runs straight from a local checkout via
uvx:
# from inside the project directory
uvx --from . screengrab-tool-mcp
Or, once published to PyPI:
uvx screengrab-tool-mcp
The server speaks MCP over stdio. All logs go to stderr — never stdout — so it's safe to pipe directly into an MCP client.
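A minimal sketch of that logging setup, using only the standard library. configure_logging and debug_print are hypothetical helper names and the format string is an arbitrary choice; the point is that nothing here ever writes to stdout.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> None:
    """Send every log record to stderr so stdout carries only protocol traffic."""
    logging.basicConfig(
        stream=sys.stderr,
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers installed earlier (Python 3.8+)
    )


def debug_print(*values: object) -> None:
    """A print() replacement that never touches stdout."""
    print(*values, file=sys.stderr)
```

Calling configure_logging() once at startup, before any tool runs, keeps third-party log output off stdout as well, as long as those libraries use the logging module.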
Adding it to your MCP client
Claude Code (CLI)
claude mcp add screengrab -- uvx --from C:/Users/mahaffey/screengrab-tool-mcp screengrab-tool-mcp
Or, if installed from PyPI:
claude mcp add screengrab -- uvx screengrab-tool-mcp
To remove it later:
claude mcp remove screengrab
Claude Desktop
Edit your claude_desktop_config.json (Settings → Developer → Edit Config):
{
"mcpServers": {
"screengrab": {
"command": "uvx",
"args": ["--from", "C:/Users/mahaffey/screengrab-tool-mcp", "screengrab-tool-mcp"]
}
}
}
Once published to PyPI, simplify to:
{
"mcpServers": {
"screengrab": {
"command": "uvx",
"args": ["screengrab-tool-mcp"]
}
}
}
VS Code / GitHub Copilot
Add to your .vscode/mcp.json (workspace) or user-level mcp.json:
{
"servers": {
"screengrab": {
"type": "stdio",
"command": "uvx",
"args": ["--from", "C:/Users/mahaffey/screengrab-tool-mcp", "screengrab-tool-mcp"]
}
}
}
Cursor
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"screengrab": {
"command": "uvx",
"args": ["screengrab-tool-mcp"]
}
}
}
Development
# install dependencies into a local venv
uv sync
# run directly
uv run screengrab-tool-mcp
# or run the module
uv run python -m screengrab_tool_mcp.server
Project layout
screengrab-tool-mcp/
├── pyproject.toml          # uv / hatchling packaging + script entry point
├── README.md
├── src/
│   └── screengrab_tool_mcp/
│       ├── __init__.py
│       ├── server.py       # MCP wiring, tool dispatch, stdio entry
│       ├── capture.py      # capture_screen and capture_window pipelines
│       ├── windows.py      # Win32 enumeration and query resolution
│       └── aliases.py      # in-memory window-alias store
└── tests/
    ├── test_aliases.py
    ├── test_windows.py
    └── test_capture.py
Notes & gotchas
- stdout is sacred. The MCP stdio transport carries JSON-RPC messages over stdout, so any stray output corrupts the protocol stream. The server configures logging.basicConfig(stream=sys.stderr, ...) for exactly this reason. If you add print(...) calls, route them to sys.stderr or you will break the connection.
- Permissions on macOS. macOS requires the parent process (Claude Desktop, VS Code, Terminal, etc.) to have Screen Recording permission in System Settings → Privacy & Security. Grant it once and restart the host.
- Wayland on Linux. mss uses X11. On a pure-Wayland session, capture is limited to the XWayland surface, so native Wayland windows may be missing; an X11 session works best.
- Display indexing. mss uses monitors[0] for the virtual all-monitors rectangle and monitors[1..N] for individual displays. Index 1 is the primary display.
- Window tools are Windows-only. list_windows, capture_window, and capture_remembered use pywin32 and assume a real interactive desktop session. They will return a clear error on non-Windows hosts. capture_screen remains cross-platform.
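The Windows-only guard can be sketched as a check that runs before a tool is dispatched. ensure_supported and the error wording are hypothetical; the README only promises a clear error on non-Windows hosts.

```python
import sys

WINDOWS_ONLY_TOOLS = {"list_windows", "capture_window", "capture_remembered"}


def ensure_supported(tool_name: str, platform: str = sys.platform) -> None:
    """Raise a readable error when a Windows-only tool runs elsewhere."""
    if tool_name in WINDOWS_ONLY_TOOLS and platform != "win32":
        raise RuntimeError(
            f"{tool_name} is Windows-only (current platform: {platform!r}); "
            "capture_screen works on every platform."
        )


ensure_supported("capture_screen", platform="darwin")  # cross-platform tool: no error
```

Taking the platform as a parameter keeps the check testable on any host.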
License
MIT