Screen MCP

Enables LLMs to capture screenshots and screen recordings through MCP with chunked session-based transfers for reliable image consumption. Supports multi-monitor selection, timeline capture, and compatibility with both vision and non-vision language models.

screen-mcp

A FastMCP server that runs on the client machine and exposes screenshot tools to an MCP host. It supports both direct screenshot capture and session-based chunked transfers, so an LLM can consume images reliably.

Official FastMCP documentation: gofastmcp.com/getting-started/welcome

Exposed tools

  • list_monitors: returns detected monitors (index and dimensions)
  • capture_screenshot: captures a screen image with hybrid mode (base64 for non-vision, native MCP image for vision)
  • capture_timeline: captures a timed screen sequence (ordered frames with timestamps)
  • start_timeline_capture: starts a timeline session and returns a timeline_id
  • get_timeline_manifest: returns chunked timeline metadata
  • get_timeline_chunk: retrieves a timeline JSON chunk
  • release_timeline_capture: explicitly releases a timeline session
  • start_screenshot_capture: starts a screenshot session and returns a capture_id
  • get_screenshot_manifest: returns metadata plus ASCII preview for non-vision LLMs
  • get_screenshot_chunk: returns a chunk of base64 image data
  • release_screenshot_capture: releases the screenshot session and frees memory

Quick tool guidance

  • Need available monitor info: list_monitors
  • Need a fast single screenshot with moderate payload: capture_screenshot
  • Need a more robust single screenshot with chunking: start_screenshot_capture -> get_screenshot_manifest -> get_screenshot_chunk (0..N-1) -> release_screenshot_capture
  • Need a short timeline in one call: capture_timeline
  • Need a robust timeline for large payloads: start_timeline_capture -> get_timeline_manifest -> get_timeline_chunk (0..N-1) -> release_timeline_capture

Best practices:

  • Always concatenate chunks in ascending chunk_index order.
  • Always call release_* after reading session data to free memory.
  • For non-vision models, consume preview_text from the manifest before loading full payload.
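The ordering rule above can be sketched as follows. This is a minimal illustration, not the project's own client code: it assumes each chunk is a slice of one continuous base64 string (so the chunks are joined first and decoded once), and the `reassemble_chunks` helper and the sample data are hypothetical.

```python
import base64


def reassemble_chunks(chunks: dict[int, str]) -> bytes:
    """Join base64 chunks in ascending chunk_index order, then decode once."""
    expected = list(range(len(chunks)))
    if sorted(chunks) != expected:
        # A gap or a stray index would silently corrupt the image.
        raise ValueError(f"missing or unexpected chunk indexes: {sorted(chunks)}")
    payload = "".join(chunks[index] for index in expected)
    return base64.b64decode(payload)


# Chunks may be collected in any order; the index decides the final position.
encoded = base64.b64encode(b"fake image bytes").decode("ascii")
received = {2: encoded[16:], 0: encoded[:8], 1: encoded[8:16]}
image_bytes = reassemble_chunks(received)
```

Decoding only after the join matters under this assumption, because a chunk boundary does not have to fall on a 4-character base64 group.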

Prerequisites

  • Linux with an active graphical session (X11/Wayland capture support)
  • DISPLAY environment variable available to the server process (mss requires it on Linux)
  • Python 3.10+

Local installation

uv sync

Or via Taskfile:

task setup

Run the MCP server (stdio)

task server

This task starts the server using mcpm run screen-mcp through uvx. It also registers or updates the local MCP server automatically when needed. Display-related environment variables are propagated during registration: DISPLAY, WAYLAND_DISPLAY, XAUTHORITY, XDG_RUNTIME_DIR.
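If you launch the server from your own script instead of the task, a quick check of the same variables can help diagnose a missing display. The sketch below is illustrative only: the helper names are hypothetical, and it does not reproduce what the Taskfile or mcpm actually do during registration.

```python
import os
from typing import Mapping

# Variable names taken from the paragraph above.
DISPLAY_VARS = ("DISPLAY", "WAYLAND_DISPLAY", "XAUTHORITY", "XDG_RUNTIME_DIR")


def display_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the display-related variables that are set and non-empty."""
    return {name: env[name] for name in DISPLAY_VARS if env.get(name)}


def has_display(env: Mapping[str, str] = os.environ) -> bool:
    """The prerequisites state that mss needs DISPLAY on Linux."""
    return bool(env.get("DISPLAY"))


if __name__ == "__main__":
    print("propagating:", display_env())
    if not has_display():
        print("DISPLAY is not set; screen capture will likely fail")
```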

MCP-compatible smoke-test client

task client

The smoke-test script is located in scripts/smoke_client.py and exercises:

  • list_monitors
  • start_screenshot_capture
  • get_screenshot_manifest
  • get_screenshot_chunk
  • release_screenshot_capture

It writes a verification image to artifacts/smoke_capture.jpg.

You can also run a specific action via --action:

uv run python scripts/smoke_client.py --action list-monitors
uv run python scripts/smoke_client.py --action capture-screenshot --monitor-index 0 --output artifacts/capture.jpg
uv run python scripts/smoke_client.py --action capture-timeline --duration-seconds 6 --output artifacts/timeline.json
uv run python scripts/smoke_client.py --action capture-timeline-session --duration-seconds 6 --chunk-size 120000 --output artifacts/timeline_session.json

Debugging and real-time inspection

task inspector

This launches the MCP Inspector against the mcpm run screen-mcp server.

Using the server in VS Code

  1. Open this project folder in VS Code.
  2. Create a .vscode/mcp.json file.
  3. Add a servers configuration to it, using one of the examples below.

Recommended local example for a cloned repo (unpublished package):

{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--project", "/absolute/path/to/screen-mcp", "screen-mcp"]
    }
  }
}

Example for running directly from a Git repo without global installation:

{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "git+https://github.com/<owner>/screen-mcp.git", "screen-mcp"]
    }
  }
}

Alternative via MCPM:

{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcpm", "run", "screen-mcp"]
    }
  }
}

Example tool calls

  • list_monitors()
  • capture_screenshot(monitor_index=0, image_format="jpeg", max_width=1600, quality=80)
  • capture_screenshot(monitor_index=0, image_format="jpeg", max_width=1600, quality=80, response_mode="image")
  • capture_timeline(duration_seconds=10, monitor_index=0, image_format="jpeg", max_width=900, quality=70)
  • start_timeline_capture(duration_seconds=10, monitor_index=0, image_format="jpeg", max_width=900, quality=70, chunk_size=120000)
  • get_timeline_manifest(timeline_id)
  • get_timeline_chunk(timeline_id, chunk_index)
  • release_timeline_capture(timeline_id)

Timeline behavior in capture_timeline:

  • fixed cadence: TIMELINE_FPS (default 2 images/s, configurable in source)
  • maximum duration: TIMELINE_MAX_DURATION_SECONDS (default 30s, configurable in source)
  • each frame includes: frame_index, t_offset_ms, captured_at, preview_text, image_sha256, image_size_bytes
  • temporal_hint makes chronological order explicit for an LLM
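A client can use the per-frame fields listed above to restore chronological order and check integrity. The sketch below is a hedged example: it assumes image_sha256 is the hexadecimal SHA-256 digest of the decoded image bytes and that image_size_bytes counts those same bytes, neither of which is stated here. The helper names and sample frames are hypothetical.

```python
import hashlib
from typing import Any


def order_frames(frames: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Sort frames by frame_index and check that t_offset_ms never decreases."""
    ordered = sorted(frames, key=lambda frame: frame["frame_index"])
    offsets = [frame["t_offset_ms"] for frame in ordered]
    if offsets != sorted(offsets):
        raise ValueError("t_offset_ms does not increase with frame_index")
    return ordered


def frame_matches(frame: dict[str, Any], image_bytes: bytes) -> bool:
    """Compare decoded image bytes with the frame's size and hash fields."""
    # Assumption: the hash covers the decoded bytes, not the base64 text.
    return (
        len(image_bytes) == frame["image_size_bytes"]
        and hashlib.sha256(image_bytes).hexdigest() == frame["image_sha256"]
    )


# Three synthetic frames at 2 images/s, i.e. 500 ms apart.
images = [b"frame-a", b"frame-b", b"frame-c"]
frames = [
    {
        "frame_index": index,
        "t_offset_ms": index * 500,
        "image_sha256": hashlib.sha256(image).hexdigest(),
        "image_size_bytes": len(image),
    }
    for index, image in enumerate(images)
]
ordered = order_frames([frames[2], frames[0], frames[1]])
all_valid = all(frame_matches(f, img) for f, img in zip(ordered, images))
```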

Robust flow recommendation:

  1. start_screenshot_capture(...) -> obtain capture_id
  2. get_screenshot_manifest(capture_id) -> metadata + preview_text
  3. get_screenshot_chunk(capture_id, chunk_index) -> reassemble chunks
  4. release_screenshot_capture(capture_id)
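The four steps above might be wrapped as in the following sketch. It is transport-agnostic: `call_tool` is a hypothetical callable standing in for whatever MCP client you use, and the result keys `chunk_count` and `data` are assumed names, so check the real manifest and chunk payloads before relying on them. The in-memory fake server exists only to make the example runnable.

```python
import base64
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def fetch_screenshot(call_tool: ToolCaller, start_args: dict[str, Any]) -> bytes:
    """Run start -> manifest -> chunks -> release and return decoded bytes."""
    started = call_tool("start_screenshot_capture", start_args)
    capture_id = started["capture_id"]
    try:
        manifest = call_tool("get_screenshot_manifest", {"capture_id": capture_id})
        parts = []
        for index in range(manifest["chunk_count"]):  # assumed field name
            chunk = call_tool(
                "get_screenshot_chunk",
                {"capture_id": capture_id, "chunk_index": index},
            )
            parts.append(chunk["data"])  # assumed field name
        return base64.b64decode("".join(parts))
    finally:
        # Release even when a chunk request fails, so the session is freed.
        call_tool("release_screenshot_capture", {"capture_id": capture_id})


def make_fake_server(image: bytes, chunk_size: int = 8):
    """In-memory stand-in for the server, for illustration only."""
    encoded = base64.b64encode(image).decode("ascii")
    chunks = [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]
    calls: list[str] = []

    def call_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
        calls.append(name)
        if name == "start_screenshot_capture":
            return {"capture_id": "demo"}
        if name == "get_screenshot_manifest":
            return {"chunk_count": len(chunks)}
        if name == "get_screenshot_chunk":
            return {"data": chunks[args["chunk_index"]]}
        return {}

    return call_tool, calls


call_tool, calls = make_fake_server(b"example image payload")
image = fetch_screenshot(call_tool, {})
```

Putting the release call in a `finally` block keeps the "always release" rule intact on error paths as well as on success.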

Base64 notes

  • For multi-client MCP, base64 is the most interoperable format: simple, JSON-friendly, compatible with vision and non-vision clients.
  • Tradeoff: base64 inflates the payload by about 33% (4 output characters for every 3 input bytes), and a large image returned as a single block risks truncation.
  • This project uses session-based chunked base64 transfer (capture_id) to make large exchanges reliable.
  • For non-vision LLMs, prefer get_screenshot_manifest (metadata + ASCII preview) before downloading the full image.
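The size overhead and its effect on chunk counts can be worked out with a few lines of arithmetic. This sketch assumes chunk_size is measured in base64 characters, which is not stated here; the 300,000-byte image is an arbitrary example, and 120000 is the chunk_size value used in this README's timeline examples.

```python
import base64


def base64_length(raw_bytes: int) -> int:
    """Length of padded base64 text for raw_bytes bytes of input."""
    return 4 * ((raw_bytes + 2) // 3)


def chunk_count(raw_bytes: int, chunk_size: int) -> int:
    """Number of chunks needed, assuming chunk_size counts base64 characters."""
    encoded = base64_length(raw_bytes)
    return (encoded + chunk_size - 1) // chunk_size


raw = 300_000
encoded = base64_length(raw)        # 400000 characters
overhead = encoded / raw - 1        # one third larger than the raw image
chunks = chunk_count(raw, 120_000)  # 4 chunks

# Cross-check the formula against the standard library encoder.
assert encoded == len(base64.b64encode(bytes(raw)))
```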

Hybrid mode in capture_screenshot:

  • response_mode="base64" (default): legacy behavior, JSON output with image_base64.
  • response_mode="image": native MCP image output for vision models, with metadata in structured_content.
  • response_mode="auto": reads SCREEN_MCP_CAPTURE_RESPONSE_MODE (base64 or image) and chooses automatically based on the client/host.
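One way the three modes could resolve is sketched below. This is not the server's implementation: it assumes SCREEN_MCP_CAPTURE_RESPONSE_MODE is an environment variable and that "auto" falls back to base64 when the variable is unset or invalid. The `resolve_response_mode` function is hypothetical, and any client or host detection the server performs is left out.

```python
import os
from typing import Mapping

VALID_MODES = ("base64", "image")
ENV_NAME = "SCREEN_MCP_CAPTURE_RESPONSE_MODE"


def resolve_response_mode(
    requested: str = "base64",
    env: Mapping[str, str] = os.environ,
) -> str:
    """Map a requested response_mode to a concrete one: base64 or image."""
    if requested in VALID_MODES:
        return requested  # an explicit choice wins
    if requested != "auto":
        raise ValueError(f"unsupported response_mode: {requested!r}")
    configured = env.get(ENV_NAME, "").strip().lower()
    # Assumed fallback: base64, which the list above marks as the default.
    return configured if configured in VALID_MODES else "base64"


mode = resolve_response_mode("auto", {ENV_NAME: "image"})
```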

Security and privacy

Screen captures may contain sensitive data. Add an explicit client-side policy for production use (consent, masking, window whitelisting, etc.).
