macOS OCR MCP

Provides an OCR tool that extracts text from images using macOS's built-in Vision framework, returning text segments with confidence scores and bounding boxes.

macOS OCR MCP Tool

This project provides a Model Context Protocol (MCP) tool that performs Optical Character Recognition (OCR) on images using macOS's built-in Vision framework. It exposes an ocr_image tool that takes an image file path and returns the recognized text along with confidence scores and bounding boxes.

Project Setup

Dependencies

This project relies on Python 3.13+ and the following main dependencies:

  • ocrmac: For accessing macOS OCR capabilities.
  • Pillow: For image manipulation.
  • mcp[cli]>=1.7.1: For the Model Context Protocol server and client.

Installation

It is recommended to use a virtual environment.

  1. Create and activate a virtual environment:

    python -m venv .venv
    source .venv/bin/activate
    
  2. Install dependencies using uv:

    uv sync
    

Running the MCP Server

To start the MCP server, run main.py:

uv run main.py

Once the server is running, the ocr_image tool is available to any connected MCP client.

Available MCP Tools

ocr_image

  • Description: Runs OCR on the provided image file using macOS's built-in capabilities. Returns the recognized text segments, each with a confidence score and bounding box coordinates.
  • Input: file_path: str - The absolute or relative path to the image file.
  • Output (Example Success):
    {
      "filename": "path/to/your/image.png",
      "annotations": [
        {
          "text": "Hello World",
          "confidence": 0.95,
          "bounding_box": [0.1, 0.1, 0.5, 0.05] 
        },
        // ... more annotations
      ]
    }
    
  • Output (Example Error):
    {
      "error": "OCR functionality is only available on macOS."
    }
    
    or
    {
      "error": "File not found: path/to/nonexistent/image.png"
    }
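
Because success and error responses share one JSON shape, a caller has to check for the "error" key before reading "annotations". The following is a minimal client-side sketch of that check, not part of this project: the names OcrError and extract_text and the 0.5 confidence threshold are hypothetical, and the response is assumed to be already parsed into a Python dict.

```python
from typing import Any


class OcrError(RuntimeError):
    """Raised when the tool response carries an "error" key (hypothetical helper)."""


def extract_text(result: dict[str, Any], min_confidence: float = 0.5) -> list[str]:
    """Return the text of every annotation at or above min_confidence.

    `result` is assumed to look like the example outputs above: either
    {"error": "..."} or {"filename": "...", "annotations": [...]}.
    """
    if "error" in result:
        raise OcrError(result["error"])
    return [
        annotation["text"]
        for annotation in result.get("annotations", [])
        if annotation["confidence"] >= min_confidence
    ]


if __name__ == "__main__":
    sample = {
        "filename": "path/to/your/image.png",
        "annotations": [
            {"text": "Hello World", "confidence": 0.95,
             "bounding_box": [0.1, 0.1, 0.5, 0.05]},
        ],
    }
    print(extract_text(sample))
```

Raising an exception for the error case keeps the calling code from silently treating a failed OCR run as an image with no text.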
    

Note: This tool will only function correctly on a macOS system due to its reliance on the Vision framework.
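
The bounding_box values in the example output look like fractions of the image size rather than pixels. The sketch below converts such a box to pixel coordinates under an assumption this README does not state: that the four numbers are [x, y, width, height], normalized to the range 0 to 1 with the origin at the bottom-left corner, as in the Vision framework's convention. The function name to_pixel_box is hypothetical. Verify the convention against real output before relying on it.

```python
def to_pixel_box(bbox, image_width, image_height):
    """Convert a normalized box to (left, top, right, bottom) in pixels.

    Assumption: bbox is [x, y, width, height] in 0..1 with a bottom-left
    origin. The result uses a top-left origin, as most image libraries do.
    """
    x, y, w, h = bbox
    left = x * image_width
    right = (x + w) * image_width
    # Flip the vertical axis: the assumed origin is bottom-left,
    # while pixel coordinates count down from the top.
    bottom = (1.0 - y) * image_height
    top = (1.0 - y - h) * image_height
    return (round(left), round(top), round(right), round(bottom))


if __name__ == "__main__":
    # A box covering the middle half of the width of a 200x100 image.
    print(to_pixel_box([0.25, 0.5, 0.5, 0.25], 200, 100))  # (50, 25, 150, 50)
```

The returned tuple follows the (left, top, right, bottom) order that Pillow's Image.crop and ImageDraw.rectangle accept.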

Testing with MCP Inspector

You can use the MCP Inspector to connect to the running MCP server and test the tool.

Cursor MCP Configuration

To configure this MCP server in Cursor, you can add the following to your MCP JSON configuration file (e.g., ~/.cursor/mcp.json or project-specific .cursor/mcp.json):

{
  "mcpServers": {
    "ocrmac": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/macos-ocr-mcp",
        "run",
        "main.py"
      ]
    }
  }
}

This configuration tells Cursor how to start your MCP server. You can then call the ocrmac.ocr_image tool from within Cursor.
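
If the configuration file already lists other servers, the "ocrmac" entry has to be merged in rather than pasted over the file. The following is one way to script that, assuming the file (if present) contains valid JSON; the helper name add_ocrmac_server is hypothetical, and the project directory is whatever path you cloned the repository to.

```python
import json
from pathlib import Path


def add_ocrmac_server(config_path: Path, project_dir: str) -> dict:
    """Merge an "ocrmac" entry into the MCP config at config_path and save it.

    Existing servers are preserved. A missing file is treated as an empty
    config; a file with invalid JSON raises json.JSONDecodeError.
    """
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["ocrmac"] = {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "main.py"],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, add_ocrmac_server(Path.home() / ".cursor" / "mcp.json", "/path/to/macos-ocr-mcp") would update the user-level file mentioned above.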
