MCP-MinerU
Enables document and image parsing to extract text, tables, and formulas from PDFs, screenshots, and scanned documents. Features OCR capabilities, table recognition, LaTeX formula conversion, and MLX acceleration optimized for Apple Silicon.
MCP server for document and image parsing via MinerU. Extract text, tables, and formulas from PDFs, screenshots, and scanned documents with MLX acceleration on Apple Silicon.
Installation
claude mcp add --transport stdio --scope user mineru -- \
uvx --from mcp-mineru python -m mcp_mineru.server
This command registers the server at user scope, so it is available in all your Claude Code projects. uvx fetches and runs the package on demand, so no manual installation is required.
Alternative methods: See Installation Guide for PyPI, source installation, and Claude Desktop configuration.
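For MCP clients that are configured through a JSON file rather than a CLI, the same launch command can be written as an `mcpServers` entry. The following is a minimal sketch, assuming the common `mcpServers` layout used by clients such as Claude Desktop. The helper name `build_mcp_config` is hypothetical and not part of mcp-mineru; the Installation Guide remains the authoritative source for the exact configuration.

```python
import json

# Same launch command as the `claude mcp add` line above, split into
# the executable and its arguments.
COMMAND = "uvx"
ARGS = ["--from", "mcp-mineru", "python", "-m", "mcp_mineru.server"]


def build_mcp_config(name: str = "mineru") -> dict:
    """Return an `mcpServers` entry (assumed layout) for a JSON-configured client.

    Hypothetical helper for illustration; not shipped with mcp-mineru.
    """
    return {"mcpServers": {name: {"command": COMMAND, "args": list(ARGS)}}}


if __name__ == "__main__":
    # Print the fragment so it can be merged into the client's config file.
    print(json.dumps(build_mcp_config(), indent=2))
```

The server name `mineru` mirrors the name used in the command above; any other name works as long as the client config is consistent.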
Features
- Multiple format support: PDF, JPEG, PNG, and other image formats
- OCR capabilities: Built-in text extraction from screenshots and photos
- Table recognition: Preserves structure when extracting tables
- Formula extraction: Converts mathematical equations to LaTeX
- MLX acceleration: Optimized for Apple Silicon (M1/M2/M3/M4)
- Multiple backends: Choose a backend to trade speed against quality
Quick Start
Parse a PDF document
User: "Analyze the tables in research_paper.pdf"
Claude: [Calls parse_pdf tool] "The paper contains 3 tables..."
Extract text from a screenshot
User: "What does this screenshot say? image.png"
Claude: [Calls parse_pdf tool] "The screenshot contains..."
Check system capabilities
User: "Which backend should I use?"
Claude: [Calls list_backends tool] "Your system has Apple Silicon M4..."
For more examples, see Usage Examples.
Tools
parse_pdf
Parse PDF and image files to extract structured content as Markdown.
Parameters:
- file_path (required): Absolute path to file (PDF, JPEG, PNG, etc.)
- backend (optional): pipeline | vlm-mlx-engine | vlm-transformers
- formula_enable (optional): Enable formula recognition (default: true)
- table_enable (optional): Enable table recognition (default: true)
- start_page (optional): Starting page for PDFs (default: 0)
- end_page (optional): Ending page for PDFs (default: -1)
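To make the parameter list concrete, here is a sketch of the JSON-RPC `tools/call` request an MCP client would send to invoke parse_pdf. The envelope follows the Model Context Protocol; the helper `build_parse_pdf_request` and its validation rules are assumptions for illustration and are not part of mcp-mineru. Paths are checked with POSIX rules, since the server targets macOS.

```python
import json
from pathlib import PurePosixPath

# Backend names as listed in the parameter description above.
BACKENDS = ("pipeline", "vlm-mlx-engine", "vlm-transformers")


def build_parse_pdf_request(file_path, backend=None, formula_enable=True,
                            table_enable=True, start_page=0, end_page=-1,
                            request_id=1):
    """Build a `tools/call` request for parse_pdf (hypothetical helper)."""
    # The tool expects an absolute path; reject relative ones early.
    if not PurePosixPath(file_path).is_absolute():
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    if backend is not None and backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")

    arguments = {
        "file_path": file_path,
        "formula_enable": formula_enable,
        "table_enable": table_enable,
        "start_page": start_page,
        "end_page": end_page,
    }
    # Omit backend when unset so the server applies its own choice.
    if backend is not None:
        arguments["backend"] = backend

    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": "parse_pdf", "arguments": arguments}}


if __name__ == "__main__":
    request = build_parse_pdf_request("/Users/me/research_paper.pdf",
                                      backend="vlm-mlx-engine")
    print(json.dumps(request, indent=2))
```

In normal use Claude constructs this request itself; the sketch only shows which fields map to which parameters.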
list_backends
Check system capabilities and get backend recommendations.
Returns: System information, available backends, and performance recommendations.
Supported Formats
- PDF documents (.pdf)
- JPEG images (.jpg, .jpeg)
- PNG images (.png)
- Other image formats (WebP, GIF, etc.)
Performance
Benchmarked on Apple Silicon M4 (16GB RAM):
- pipeline: ~32s/page, CPU-only, good quality
- vlm-mlx-engine: ~38s/page, Apple Silicon optimized, excellent quality
- vlm-transformers: ~148s/page, highest quality, slowest
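As a rough planning aid, the per-page figures above can be scaled to a whole document. The sketch below assumes processing time grows linearly with page count, which the benchmark does not confirm; the function name `estimate_seconds` is hypothetical, and actual times will vary with hardware and page content.

```python
# Approximate seconds per page from the M4 (16GB RAM) benchmark above.
SECONDS_PER_PAGE = {
    "pipeline": 32,
    "vlm-mlx-engine": 38,
    "vlm-transformers": 148,
}


def estimate_seconds(pages: int, backend: str) -> int:
    """Estimate total parse time, assuming linear scaling with page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * SECONDS_PER_PAGE[backend]


if __name__ == "__main__":
    for backend in SECONDS_PER_PAGE:
        total = estimate_seconds(10, backend)
        print(f"{backend}: ~{total}s (~{total / 60:.1f} min) for 10 pages")
    # pipeline: ~320s (~5.3 min) for 10 pages
    # vlm-mlx-engine: ~380s (~6.3 min) for 10 pages
    # vlm-transformers: ~1480s (~24.7 min) for 10 pages
```

For long documents, the start_page and end_page parameters of parse_pdf can limit parsing to the pages that matter.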
Documentation
- Installation Guide - Detailed installation options
- Updating Guide - How to update to the latest version
- Usage Examples - More use cases and API reference
- MinerU Documentation - Underlying parsing engine
Development
git clone https://github.com/TINKPA/mcp-mineru.git
cd mcp-mineru
uv pip install -e ".[dev]"
# Run tests
pytest
# Format code
black src/
ruff check src/
License
Apache License 2.0 - see LICENSE file for details.
Acknowledgments
Built on top of MinerU by OpenDataLab.