Window Screenshooter MCP Server

An MCP server that enables AI agents to capture targeted screenshots of specific application windows on Windows and Linux, with smart window state restoration and focus management.

A cross-platform Model Context Protocol (MCP) server that enables AI agents to capture screenshots of specific application windows! (≧◡≦)

Overview

Window Screenshooter is an MCP server built in Python that lets AI agents take targeted screenshots of specific application windows on Windows and Linux. Unlike screen capture tools that grab the entire screen, this server captures a single window, which suits AI verification workflows, automated testing, and application monitoring.

✨ New Feature: Smart Window State Restoration & Focus Management!

The latest version now includes automatic window state restoration and intelligent focus handling! 🎉

When capturing windows, the server will:

  • 📸 Save the original window state (minimized, maximized, position, etc.)
  • 🔄 Temporarily modify the window if needed for capture
  • ✨ Restore the window to its exact original state after capture
  • 🎯 Auto-detect your editor (Cursor, Trae, Windsurf, VS Code, etc.) and restore focus to it
  • 📉 Minimize captured windows if the calling application can't be found
  • 💫 Work seamlessly across all supported platforms

This means your workflow stays smooth - focus returns to your editor and windows don't get left in unexpected states!
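The save/modify/restore sequence above can be sketched as a context manager. This is a minimal illustration, not the server's actual code: it assumes a window object exposing `isMinimized`, `isMaximized`, `restore()`, `minimize()`, `maximize()`, and `activate()` (pywinctl windows provide these), and the name `preserved_window_state` is hypothetical. Position and size restoration are left out for brevity.

```python
from contextlib import contextmanager


@contextmanager
def preserved_window_state(window, editor=None):
    """Put `window` back into its original state once the block finishes.

    `editor` is the window that should regain focus, or None if the
    calling application could not be detected (assumption for this sketch).
    """
    was_minimized = window.isMinimized
    was_maximized = window.isMaximized
    try:
        if was_minimized:
            window.restore()  # temporarily un-minimize so it can be captured
        yield window
    finally:
        if was_minimized:
            window.minimize()
        elif was_maximized and not window.isMaximized:
            window.maximize()
        if editor is not None:
            editor.activate()  # hand focus back to the editor
        elif not was_minimized:
            window.minimize()  # fallback when no calling application is found
```

The capture itself would run inside the `with` block, so the cleanup in `finally` runs even if the capture raises.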

Key Features

  • ๐Ÿ–ผ๏ธ Window-Specific Capture: Target individual application windows by name or title
  • ๐ŸŒ Cross-Platform Support: Works on Windows and Linux with platform-optimized backends
  • ๐Ÿ”ง MCP Integration: Seamless integration with AI agents through Model Context Protocol
  • ๐Ÿ“ก STDIO Transport: Uses standard input/output for reliable communication
  • โšก Performance Optimized: Platform-specific implementations for maximum efficiency

Installation & Setup

Prerequisites

  • Python 3.12+
  • Windows or Linux

Quick Start

  1. Clone or download this repository:

    git clone <your-repo-url>
    cd window-screenshooter
    
  2. Install dependencies:

    pip install pywinctl pillow pywin32   # pywin32 is Windows-only; omit it on Linux
    # Or install from the project file (pyproject.toml)
    pip install -e .
    
  3. Test the server:

    python test-mcp.py
    
  4. Run the MCP server:

    # STDIO mode (for MCP clients)
    python server.py
    

MCP Tools

The server exposes three main MCP tools:

1. capture_window

Captures a screenshot of a specific window by title or identifier.

Parameters:

  • windowTitle (string): Exact or partial window title to match
  • outputPath (string, optional): Save location for screenshot
  • format (string, optional): Image format (PNG, JPEG) - default: PNG
  • quality (int, optional): JPEG quality (1-100) - default: 85

Returns: Base64-encoded image data or file path confirmation

Example:

# Save to file
await capture_window("Notepad", "screenshot.png", "PNG")

# Get base64 data
await capture_window("Calculator")
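The `await` calls above are shorthand for what an agent does. On the wire, an MCP client sends a JSON-RPC 2.0 `tools/call` request over STDIO after the usual `initialize` handshake. The sketch below builds such a message with the standard library only; the helper name `build_tool_call` is hypothetical, and the argument names are taken from the parameter list above.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return one newline-terminated JSON-RPC 2.0 `tools/call` message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport sends one JSON message per line.
    return json.dumps(message) + "\n"


line = build_tool_call(
    1,
    "capture_window",
    {"windowTitle": "Notepad", "outputPath": "screenshot.png", "format": "PNG"},
)
```

In practice an MCP client library handles this framing for you; the sketch only shows how the tool name and parameters travel.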

2. list_windows

Enumerates all available windows on the system.

Returns: Array of window objects with ID, title, position, and size information

Example:

await list_windows()
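To make the return value concrete, here is one possible shape for a window object. The field names (`id`, `title`, `x`, `y`, `width`, `height`) are assumptions for illustration; check the server's real output before relying on them.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class WindowRecord:
    """Hypothetical record for one entry returned by list_windows."""
    id: str
    title: str
    x: int
    y: int
    width: int
    height: int


def to_payload(records):
    """Serialize window records to a JSON array string."""
    return json.dumps([asdict(record) for record in records])


payload = to_payload([WindowRecord("0x1a2b", "Untitled - Notepad", 100, 80, 800, 600)])
```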

3. get_window_info

Retrieves detailed information about a specific window.

Parameters:

  • windowIdentifier (string): Window title or ID

Returns: Window metadata including position, size, visibility state, and process information

Example:

await get_window_info("Visual Studio Code")

Platform-Specific Features

Windows Implementation

  • Utilizes win32gui with BitBlt API for robust window capture
  • Can capture minimized, hidden, or overlapped windows
  • High-performance Graphics Capture API integration
  • Provides Windows handle (HWND) and process information

Linux Implementation

  • X11-based window capture using native protocols
  • Direct window buffer access for efficient capture
  • Support for common Linux desktop environments
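Choosing between the two backends can be as simple as a check on `sys.platform`. The sketch below is an assumption about how the dispatch might look, not the server's actual logic; the module names come from the project structure, and `backend_module_name` is a hypothetical helper.

```python
import sys


def backend_module_name(platform=None):
    """Map a sys.platform value to the capture module for that platform."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        return "windows_capture"
    if platform.startswith("linux"):
        return "linux_capture"
    raise RuntimeError(f"Unsupported platform: {platform}")
```

The chosen module could then be loaded with `importlib.import_module(backend_module_name())`, which keeps Windows-only imports such as `win32gui` from being touched on Linux.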

MCP Client Configuration

For Cursor IDE

Add to your MCP configuration file:

{
  "mcpServers": {
    "window-screenshooter": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/window-screenshooter",
      "transport": "stdio"
    }
  }
}

For Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "window-screenshooter": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/window-screenshooter"
    }
  }
}
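If the client launches a different `python` than the one where you installed the dependencies, the server will fail with import errors. One way around this is to write absolute paths into the configuration. The following sketch prints an entry using the current interpreter; `build_server_entry` is a hypothetical helper, and the keys simply mirror the examples above.

```python
import json
import sys
from pathlib import Path


def build_server_entry(project_dir, python_executable=sys.executable):
    """Build an mcpServers entry that uses absolute paths."""
    project_dir = Path(project_dir).resolve()
    return {
        "mcpServers": {
            "window-screenshooter": {
                "command": python_executable,
                "args": [str(project_dir / "server.py")],
                "cwd": str(project_dir),
            }
        }
    }


# Run from the repository root, then paste the output into your client config.
print(json.dumps(build_server_entry("."), indent=2))
```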

💡 IDE Configuration Tip

📝 Note: For optimal workflow integration, consider adding this rule to your IDE configuration:

"Before capturing windows or screens with the MCP screenshooter, ALWAYS list windows first to get correct names. If working on a Web project, the default browser is Brave. If working on a Unity project, the user wants the Unity game scene window. After you make a screenshot or capture a screen, ALWAYS use vision."

This helps ensure accurate window targeting and proper follow-up analysis of captured content! ✨
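The "list first, then capture" rule boils down to resolving a requested name against the titles that actually exist. Here is a small sketch of that step, assuming `titles` came from list_windows. The matching policy (exact match first, then a case-insensitive partial match that must be unique) is an illustrative choice, not necessarily what the server does.

```python
def resolve_window_title(requested, titles):
    """Pick the single title matching `requested`, or raise LookupError."""
    for title in titles:
        if title == requested:
            return title  # an exact match always wins
    needle = requested.casefold()
    partial = [title for title in titles if needle in title.casefold()]
    if len(partial) == 1:
        return partial[0]
    if not partial:
        raise LookupError(f"No window matches {requested!r}")
    raise LookupError(f"{requested!r} is ambiguous: {partial}")


titles = ["Untitled - Notepad", "Calculator", "server.py - Visual Studio Code"]
target = resolve_window_title("notepad", titles)
```

Failing loudly on ambiguous names keeps an agent from screenshotting the wrong window and then analyzing it with confidence.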

Usage Examples

AI Development Workflows

  • Code Verification: AI takes Unity editor screenshots to verify game object placement
  • UI Testing: Capture application states during automated testing sequences
  • Documentation: Generate visual documentation of application interfaces
  • Debugging: Visual confirmation of application behavior changes

Automation Scenarios

  • Quality Assurance: Screenshot comparison for regression testing
  • Process Monitoring: Capture application states for workflow verification
  • Training Data: Generate labeled screenshots for computer vision training

Error Handling

The server implements robust error handling for:

  • Window not found scenarios
  • Permission-denied capture attempts
  • Cross-platform compatibility issues
  • Invalid parameter validation
  • Graceful degradation when platform-specific features are unavailable
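As a rough illustration of the cases above, the sketch below validates the `format` and `quality` parameters and turns expected failures into a structured result instead of a crash. `WindowNotFoundError`, `validate_capture_args`, `run_tool`, and the error codes are all hypothetical names; the defaults and ranges follow the capture_window parameter list.

```python
class WindowNotFoundError(LookupError):
    """Hypothetical error raised when no window matches the requested title."""


SUPPORTED_FORMATS = {"PNG", "JPEG"}


def validate_capture_args(fmt="PNG", quality=85):
    """Normalize and check the format and JPEG quality parameters."""
    normalized = str(fmt).upper()
    if normalized not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    if isinstance(quality, bool) or not isinstance(quality, int):
        raise ValueError("quality must be an integer")
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    return normalized, quality


def run_tool(handler, **kwargs):
    """Call `handler` and convert expected failures into an error result."""
    try:
        return {"ok": True, "result": handler(**kwargs)}
    except WindowNotFoundError as exc:
        return {"ok": False, "error": "window_not_found", "detail": str(exc)}
    except PermissionError as exc:
        return {"ok": False, "error": "permission_denied", "detail": str(exc)}
    except ValueError as exc:
        return {"ok": False, "error": "invalid_parameter", "detail": str(exc)}
```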

Development & Contributing

Project Structure

window-screenshooter/
├── server.py               # Main MCP server implementation
├── windows_capture.py      # Windows-specific capture logic
├── linux_capture.py        # Linux-specific capture logic
├── test-mcp.py             # Test script for functionality
├── mcp-config-example.json # Example MCP configuration
├── pyproject.toml          # Project dependencies
└── README.md               # This file

Testing

Run the test script to verify functionality:

python test-mcp.py

Common Issues

  1. "Window not found" errors:
    • Check the exact window title with list_windows
    • Try partial title matching
    • Ensure the window is visible and not minimized
  2. Permission errors on Windows:
    • Run as administrator if needed
    • Check Windows security settings
  3. Import errors:
    • Ensure all dependencies are installed: pip install pywinctl pillow pywin32 (pywin32 is needed on Windows only)
    • Check the Python version (3.12+ is required)
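For import errors, a quick preflight check can tell you which packages are missing before you start the server. This sketch uses only the standard library; the module-to-package mapping reflects the dependencies listed in this README, and the function names are hypothetical.

```python
import sys
from importlib.util import find_spec

# Import name -> pip package name, per the dependency list in this README.
REQUIRED_MODULES = {"pywinctl": "pywinctl", "PIL": "pillow"}
if sys.platform.startswith("win"):
    REQUIRED_MODULES["win32gui"] = "pywin32"  # only installable on Windows


def missing_packages(modules=None):
    """Return the pip names of required modules that cannot be imported."""
    modules = REQUIRED_MODULES if modules is None else modules
    return sorted(pkg for mod, pkg in modules.items() if find_spec(mod) is None)


def python_is_supported(version_info=None, minimum=(3, 12)):
    """Check the interpreter version against the documented minimum."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing packages:", missing_packages() or "none")
```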

Platform Compatibility

  • Windows: Full support with native Win32 API
  • Linux: Basic support with X11 integration

License

This project is licensed under the MIT License - see the LICENSE file for details.
