ocrmypdf-mcp

Exposes ocrmypdf as a single tool for OCR-ing scanned PDFs, with automatic PATH handling for Tesseract and Ghostscript on Windows to ensure compatibility with Claude Desktop.

A minimal MCP server that exposes ocrmypdf as a single tool, ocr_pdf, so Claude can OCR scanned PDFs and then hand them to markitdown (or any text tool) for downstream work.

Why this exists: the obvious "just call ocrmypdf" approach falls over on Windows with the Microsoft Store (MSIX) build of Claude Desktop, because MSIX launches MCP servers with a stripped-down PATH that doesn't include Tesseract or Ghostscript. This server auto-detects the standard Windows install locations and prepends them to PATH at startup, so OCR Just Works without futzing with system environment variables.

Works on Linux and macOS too — the PATH augmentation is a no-op outside Windows.
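The startup PATH fix described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's actual code: the function names prepend_dirs and augment_path_for_ocr are hypothetical, and the directories are simply the installer defaults named in the Prerequisites section (with Ghostscript's executables assumed to live in a bin subfolder).

```python
import glob
import os
import sys


def prepend_dirs(path_value, dirs, sep=os.pathsep):
    """Return path_value with dirs placed in front, skipping duplicates.

    Comparison is case-insensitive because Windows paths are.
    """
    existing = [p for p in path_value.split(sep) if p]
    seen = {p.lower() for p in existing}
    fresh = []
    for d in dirs:
        if d.lower() not in seen:
            fresh.append(d)
            seen.add(d.lower())
    return sep.join(fresh + existing)


def augment_path_for_ocr(program_files=r"C:\Program Files"):
    """Prepend default Tesseract/Ghostscript folders to PATH on Windows only."""
    if sys.platform != "win32":
        return  # no-op on Linux and macOS
    candidates = [os.path.join(program_files, "Tesseract-OCR")]
    # Ghostscript's folder name embeds its version, so match it with a glob.
    gs_bins = glob.glob(os.path.join(program_files, "gs", "gs*", "bin"))
    candidates += sorted(gs_bins, reverse=True)
    found = [d for d in candidates if os.path.isdir(d)]
    os.environ["PATH"] = prepend_dirs(os.environ.get("PATH", ""), found)
```

Calling augment_path_for_ocr() once before any OCR work means child processes started afterward inherit the extended PATH, which is why no system-wide environment change is needed.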

Prerequisites (Windows)

Two system installers, then pip install.

1. Tesseract OCR

UB-Mannheim build (the standard Windows distribution): https://github.com/UB-Mannheim/tesseract/wiki

Accept the default install location (C:\Program Files\Tesseract-OCR). Add language packs during install if you need anything beyond English.

2. Ghostscript

AGPL release for Windows (free): https://www.ghostscript.com/releases/gsdnld.html

Accept the default install location (C:\Program Files\gs\gs<version>\).

3. Verify (optional)

tesseract --version
gswin64c --version

If either says "not recognized," reopen PowerShell so it picks up the updated PATH, then retry.
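The same check can be done from Python, which is handy if you want to confirm what a Python process (rather than your shell) can see. A minimal sketch; missing_tools is a hypothetical helper, and the executable names are the Windows ones used above.

```python
import shutil


def missing_tools(names=("tesseract", "gswin64c"), which=shutil.which):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Tesseract and Ghostscript are both on PATH.")
```

On Linux and macOS the Ghostscript executable is named gs, so pass names=("tesseract", "gs") there.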

Install the server

git clone https://github.com/jcm4TX/ocrmypdf-mcp
cd ocrmypdf-mcp
pip install --user .

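This installs ocrmypdf, the mcp SDK, and the ocrmypdf-mcp executable. On Windows with Python 3.13 the executable lands at the path below; the Python313 segment changes with your Python version: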

C:\Users\<you>\AppData\Roaming\Python\Python313\Scripts\ocrmypdf-mcp.exe
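If you are unsure which folder applies to your interpreter, the path can be derived rather than guessed. The sketch below assumes the standard per-user layout that pip install --user uses with python.org builds on Windows (%APPDATA%\Python\Python<major><minor>\Scripts); other distributions, such as the Microsoft Store Python, may differ. user_script_path is a hypothetical helper name.

```python
import sys
from pathlib import PureWindowsPath


def user_script_path(appdata, name="ocrmypdf-mcp.exe", version=None):
    """Build the expected per-user Scripts path for a given Python version.

    appdata is the value of %APPDATA%; version defaults to the running
    interpreter's (major, minor).
    """
    major, minor = (version or sys.version_info)[:2]
    return (
        PureWindowsPath(appdata)
        / "Python"
        / f"Python{major}{minor}"
        / "Scripts"
        / name
    )
```

For example, user_script_path(os.environ["APPDATA"]) on the target machine gives the value to paste into the Claude Desktop config in the next section.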

Wire it up in Claude Desktop

Edit claude_desktop_config.json. On the MSIX (Microsoft Store) build of Claude Desktop, the path is:

%LOCALAPPDATA%\Packages\Claude_pzs8sxrjxfjjc\LocalCache\Roaming\Claude\claude_desktop_config.json

On the regular non-MSIX installer it's:

%APPDATA%\Claude\claude_desktop_config.json

Add an ocrmypdf-mcp entry under mcpServers:

{
  "mcpServers": {
    "ocrmypdf-mcp": {
      "command": "C:\\Users\\<you>\\AppData\\Roaming\\Python\\Python313\\Scripts\\ocrmypdf-mcp.exe",
      "args": []
    }
  }
}
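If the config file already lists other servers, editing it by hand risks a stray comma or an unescaped backslash. A small script can merge the entry instead. This is a sketch, not part of the project: add_server and update_config_file are hypothetical helpers, and the config path you pass in is whichever of the two locations above applies to your install.

```python
import json
from pathlib import Path


def add_server(config, name, command, args=None):
    """Return a copy of config with an mcpServers entry added or replaced."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args or [])}
    updated["mcpServers"] = servers
    return updated


def update_config_file(config_path, name, command):
    """Read the JSON config (if present), merge the entry, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_server(config, name, command)
    # json.dumps takes care of doubling the backslashes in Windows paths.
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
```

Existing entries under mcpServers are preserved; only the named entry is added or overwritten.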

Then fully quit Claude Desktop — right-click the tray icon and pick Quit, not just close the window — and relaunch.

Verify it loaded

In a new chat, ask "what MCP tools do you have for OCR?" — Claude should report ocr_pdf. If not, check the server log:

%LOCALAPPDATA%\Packages\Claude_pzs8sxrjxfjjc\LocalCache\Roaming\Claude\logs\mcp-server-ocrmypdf-mcp.log

Tool API

ocr_pdf(input_path, output_path?, language?, force_ocr?, deskew?)

| Arg | Type | Default | Meaning |
|---|---|---|---|
| input_path | str | required | Absolute path to input PDF |
| output_path | str | <stem>-ocr.pdf next to input | Where to write the OCR'd PDF |
| language | str | "eng" | Tesseract language code; join multiple with +, e.g. "eng+spa" |
| force_ocr | bool | false | Re-OCR pages that already have a text layer |
| deskew | bool | true | Straighten skewed pages before OCR |

Default behavior: pages without an existing text layer get OCR'd, pages that already have text pass through unchanged. Safe to run on mixed PDFs.
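To make the argument semantics concrete, here is one way the tool's parameters could map onto the ocrmypdf command line. This is an assumption about the mapping, not the server's actual implementation (it may well call the ocrmypdf Python API instead); default_output_path and build_ocrmypdf_args are hypothetical names. The flags themselves (--language, --deskew, --force-ocr, --skip-text) are standard ocrmypdf options, and --skip-text is the one that matches the pass-through behavior described above.

```python
from pathlib import Path


def default_output_path(input_path):
    """Return <stem>-ocr.pdf in the same folder as the input."""
    p = Path(input_path)
    return p.with_name(f"{p.stem}-ocr.pdf")


def build_ocrmypdf_args(input_path, output_path=None, language="eng",
                        force_ocr=False, deskew=True):
    """Translate ocr_pdf-style arguments into an ocrmypdf argument list."""
    out = Path(output_path) if output_path else default_output_path(input_path)
    args = ["ocrmypdf", "--language", language]
    if deskew:
        args.append("--deskew")
    # ocrmypdf treats --force-ocr and --skip-text as mutually exclusive:
    # either re-OCR everything, or leave pages that already have text alone.
    args.append("--force-ocr" if force_ocr else "--skip-text")
    args += [str(input_path), str(out)]
    return args
```

The resulting list can be handed to subprocess.run(args, check=True) if you want to reproduce a call outside Claude for debugging.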

Typical workflow

  1. You hand Claude a scanned PDF path.
  2. Claude calls ocr_pdf(input_path="...").
  3. Claude calls markitdown.convert_to_markdown on the resulting -ocr.pdf.
  4. Claude reads the markdown and answers your question.

Known limitations

  • Claude Desktop applies a per-request timeout to MCP tool calls (~4 minutes in current builds). Large multi-page documents may exceed it and surface as a client-side timeout even though the underlying ocrmypdf process completes successfully; the output PDF will still be on disk. If you hit this regularly, split the input into smaller page ranges first.
  • Complex multi-column scanned layouts (legal, probate, ledgers) can produce messy markdown when piped to markitdown afterward, because Tesseract interprets visual alignment as table structure. Post-processing the markdown to drop empty table-pipe rows recovers most of it.
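The markdown cleanup mentioned in the second limitation can be as small as a single filter. A minimal sketch, assuming "empty table-pipe rows" means lines that contain nothing but pipes and whitespace; drop_empty_table_rows is a hypothetical helper, not something shipped with this server or with markitdown.

```python
import re

# A row made only of pipes and whitespace, e.g. "|   |   |".
_EMPTY_ROW = re.compile(r"^\s*\|(?:\s*\|)+\s*$")


def drop_empty_table_rows(markdown):
    """Remove table rows whose cells are all empty; keep everything else."""
    kept = [line for line in markdown.splitlines() if not _EMPTY_ROW.match(line)]
    return "\n".join(kept)
```

Header separator rows such as | --- | --- | are left alone because they contain dashes, so surviving tables still render.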

License

MIT
