mcp-screenshot

mcp-screenshot

Provides screenshot and OCR capabilities for macOS.

kazuph

Image & Video Processing
OS Automation
Visit Server

Tools

capture

Captures a screenshot of the specified region and performs OCR. Options: - region: 'left'/'right'/'full' (default: 'left') - format: 'json'/'markdown'/'vertical'/'horizontal' (default: 'markdown') The screenshot is saved to a dated directory in Downloads.

README

MCP Screenshot

An MCP server that captures screenshots and performs OCR text recognition.

<a href="https://glama.ai/mcp/servers/vcnmmaejv8"><img width="380" height="200" src="https://glama.ai/mcp/servers/vcnmmaejv8/badge" alt="mcp-screenshot MCP server" /></a>

Features

  • Screenshot capture (left half, right half, full screen)
  • OCR text recognition (supports Japanese and English)
  • Multiple output formats (JSON, Markdown, vertical, horizontal)

OCR Engines

This server uses two OCR engines:

  1. yomitoku

    • Primary OCR engine
    • High-accuracy Japanese text recognition
    • Runs as an API server
  2. Tesseract.js

    • Fallback OCR engine
    • Used when yomitoku is unavailable
    • Supports both Japanese and English recognition

Installation

npx -y @kazuph/mcp-screenshot

Claude Desktop Configuration

Add the following configuration to your claude_desktop_config.json:

{
  "mcpServers": {
    "screenshot": {
      "command": "npx",
      "args": ["-y", "@kazuph/mcp-screenshot"],
      "env": {
        "OCR_API_URL": "http://localhost:8000"  // yomitoku API base URL
      }
    }
  }
}

Environment Variables

Variable Name Description Default Value
OCR_API_URL yomitoku API base URL http://localhost:8000

Usage Example

You can use it by instructing Claude like this:

Please take a screenshot of the left half of the screen and recognize the text in it.

Tool Specification

capture

Takes a screenshot and performs OCR.

Options:

  • region: Screenshot area ('left'/'right'/'full', default: 'left')
  • format: Output format ('json'/'markdown'/'vertical'/'horizontal', default: 'markdown')

License

MIT

Author

kazuph

Recommended Servers

Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
@kazuph/mcp-fetch

@kazuph/mcp-fetch

Model Context Protocol server for fetching web content and processing images. This allows Claude Desktop (or any MCP client) to fetch web content and handle images appropriately.

Featured
Local
JavaScript
Claude Code MCP

Claude Code MCP

An implementation of Claude Code as a Model Context Protocol server that enables using Claude's software engineering capabilities (code generation, editing, reviewing, and file operations) through the standardized MCP interface.

Featured
Local
JavaScript
@kazuph/mcp-taskmanager

@kazuph/mcp-taskmanager

Model Context Protocol server for Task Management. This allows Claude Desktop (or any MCP client) to manage and execute tasks in a queue-based system.

Featured
Local
JavaScript
mermaid-mcp-server

mermaid-mcp-server

A Model Context Protocol (MCP) server that converts Mermaid diagrams to PNG images.

Featured
JavaScript
mcp-pinterest

mcp-pinterest

A Pinterest Model Context Protocol (MCP) server for image search and information retrieval

Featured
TypeScript
DeepSRT MCP Server

DeepSRT MCP Server

An MCP server that enables users to generate summaries of YouTube videos in multiple languages and formats through integration with DeepSRT's API.

Official
JavaScript
Beamlit MCP Server

Beamlit MCP Server

An MCP server implementation that enables seamless integration between Beamlit CLI and AI models using the Model Context Protocol standard.

Official
TypeScript
ScreenshotOne MCP Server

ScreenshotOne MCP Server

An official MCP server implementation that allows AI assistants to capture website screenshots through the ScreenshotOne API, enabling visual context from web pages during conversations.

Official
TypeScript
ThingsPanel MCP

ThingsPanel MCP

An integration server that connects AI models with ThingsPanel IoT platform, allowing AI assistants to interact with IoT devices through natural language for device control, data retrieval, and management operations.

Official
Python