PDF Card MCP
<!-- mcp-name: io.github.velyan/pdf-card-mcp -->
PDF Card MCP is a local-first MCP tool that converts dense PDFs into soft, minimal, card-based HTML readers. It preserves source text, renders source pages, crops detected tables, figures, and display formulas as images, and writes a standalone HTML file that can be moved across devices without losing assets.
The default reader is designed for comfortable reading: large type, small cards, search, section navigation, next/previous controls, keyboard navigation, a font-size slider, and source-page previews.
Status
This is an early open-source implementation. It is useful for text-layer PDFs now, with
best-effort table detection via pdfplumber, permissive raster rendering via pypdfium2,
and optional richer local table detection via gmft. Scanned PDFs need optional OCR support.
Install For Development
cd pdf-card-mcp
python -m venv .venv
source .venv/bin/activate
python -m pip install -e ".[dev]"
uv is recommended for MCPB packaging:
uv sync
uv run pdf-card-mcp path/to/document.pdf --output out/document.html
Install the optional local ML table detector when you want stronger table crops:
uv sync --extra table-ml
uv run --extra table-ml pdf-card-mcp path/to/document.pdf --table-engine gmft
CLI Usage
pdf-card-mcp path/to/document.pdf --output examples/out/document.html
The command writes:
document.html: standalone reader with embedded CSS, JavaScript, table crops, figure crops, formula crops, and source-page images.
document.manifest.json: structured metadata without embedded image payloads.
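For scripted use, the same invocation can be assembled from Python. This is a minimal sketch: build_cli_args and expected_outputs are hypothetical helper names, not part of the package, and the manifest location assumes it sits beside the HTML file and shares its stem, as the document.html / document.manifest.json pair suggests.

```python
from pathlib import Path


def build_cli_args(pdf_path, output_path, table_engine=None):
    """Build the argv list for the pdf-card-mcp CLI (flags taken from this README)."""
    args = ["pdf-card-mcp", str(pdf_path), "--output", str(output_path)]
    if table_engine is not None:
        args += ["--table-engine", table_engine]
    return args


def expected_outputs(output_path):
    """Assumption: the manifest is written next to the HTML file with the same stem."""
    html = Path(output_path)
    return html, html.with_name(html.stem + ".manifest.json")
```

The argument list can be handed to subprocess.run(args, check=True) once pdf-card-mcp is on the PATH.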
MCP Tool
The server exposes one primary tool:
convert_pdf_to_card_html
Inputs:
pdf_path: local PDF path.
output_path: optional HTML output path.
title: optional title override.
standalone: defaults to true; asset-folder output is reserved for a later release.
ocr: optional OCR fallback if pytesseract is installed.
max_pages: optional processing limit.
theme: defaults to soft.
table_engine: auto, pdfplumber, or gmft; auto uses gmft when installed.
text_engine: char_geometry or pdfplumber_words; defaults to char_geometry so missing spaces are repaired from PDF character positions instead of trusting fused words.
postprocess_engine: none or sampling; defaults to none. When set to sampling, the MCP server asks the host LLM for boundary-only card polish operations, validates exact source-text preservation, and rewrites the generated reader. If the MCP client does not support sampling, deterministic output is returned with a warning.
model_cache_dir: optional cache directory for local ML table model weights.
offline: use only already-cached optional ML models.
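The idea behind char_geometry, repairing missing spaces from character positions, can be illustrated with a gap test between neighbouring glyphs. This is a simplified sketch and not the package's actual algorithm: it assumes each character arrives as a (text, x0, x1) tuple on a single line, and the 0.25 gap ratio is an arbitrary illustrative threshold.

```python
def rebuild_line(chars, gap_ratio=0.25):
    """Join characters, inserting a space where the horizontal gap is wide.

    chars: list of (text, x0, x1) tuples sorted left to right on one line.
    gap_ratio: hypothetical threshold, as a fraction of the average glyph width.
    """
    if not chars:
        return ""
    avg_width = sum(x1 - x0 for _, x0, x1 in chars) / len(chars)
    out = [chars[0][0]]
    for (_, _, prev_x1), (text, x0, _) in zip(chars, chars[1:]):
        gap = x0 - prev_x1
        # Only add a space if neither side already carries one.
        if gap > gap_ratio * avg_width and text != " " and out[-1] != " ":
            out.append(" ")
        out.append(text)
    return "".join(out)
```

A word-based extractor would return the fused token when the PDF omits the space glyph; the gap test recovers the boundary from layout alone.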
Sampling post-processing is intentionally narrow. The host LLM may suggest merges, heading extraction, or front-matter/footnote classification, but Python validation rejects any operation that rewrites, deletes, invents, or reorders source text.
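The preservation rule can be pictured as a check that the cards, read in order, still spell out the same text. This is a minimal sketch under the assumption that a card is a plain string and that only whitespace at card boundaries may change; the real validator may be stricter or structured differently, and both function names are hypothetical.

```python
import re


def normalized(cards):
    """Concatenate card texts in order and collapse all whitespace runs."""
    return re.sub(r"\s+", " ", " ".join(cards)).strip()


def preserves_source(before, after):
    """True if the proposed cards carry exactly the same text, in the same order."""
    return normalized(before) == normalized(after)
```

Under this check a merge of two adjacent cards passes, while a rewrite, deletion, invention, or reordering changes the normalized string and is rejected.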
Run the server locally:
python -m pdf_card_mcp.server
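Once the server is running, an MCP client invokes the tool with a JSON-RPC tools/call request. The sketch below builds that payload using argument names from the input list in this README; the values are placeholders, make_tool_call is a hypothetical helper, and most clients construct this message through an SDK, not by hand.

```python
import json


def make_tool_call(request_id, pdf_path, **options):
    """Build an MCP tools/call request for convert_pdf_to_card_html."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "convert_pdf_to_card_html",
            "arguments": {"pdf_path": pdf_path, **options},
        },
    }


request = make_tool_call(1, "path/to/document.pdf", table_engine="auto", max_pages=20)
print(json.dumps(request, indent=2))
```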
MCPB Packaging
This repo is arranged so the root can be packed directly:
python scripts/build_mcpb.py --variant all
The slim bundle is written to dist/pdf-card-mcp-lite.mcpb. The full-quality UV bundle is written
to dist/pdf-card-mcp.mcpb and installs the table-ml extra. Neither bundle vendors ML model
weights; gmft downloads and caches them locally on first use unless offline=true is set
with a prewarmed cache.
The MCPB manifests use server.type = "uv", so hosts that support the UV runtime can install
dependencies from pyproject.toml instead of relying on a user-managed Python setup.
Privacy
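PDF processing is local. The tool does not upload document contents or call external APIs. Optional OCR runs locally when the user has installed OCR dependencies. Two opt-in features reach beyond the local process: gmft downloads its model weights on first use unless offline=true is set, and postprocess_engine=sampling asks the host LLM, through the MCP client, for card polish operations.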
How It Works
See docs/how-it-works.html for a self-contained visual explainer
of the conversion pipeline, including page rendering, table/figure crops, overlap suppression,
text-card merging, and standalone HTML output.
How Tables Are Handled
All detected tables are rendered as image cards. The converter uses pdfplumber to find table
regions and can optionally use gmft/Table Transformer for stronger local detection. It then
uses pypdfium2 to rasterize only the source table region into PNG. Captions are preserved as
reader text and alt text, but the table itself remains an image so layout and numeric alignment
survive conversion.
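Cropping a detected region involves a coordinate conversion. pdfplumber reports bounding boxes as (x0, top, x1, bottom) measured from the top-left corner of the page, while pypdfium2's render call accepts a crop expressed as margins to cut from each page edge. The sketch below shows that conversion, assuming the (left, bottom, right, top) margin order documented for pypdfium2's crop parameter and an unrotated page; crop_margins and the padding default are hypothetical and may not match the converter's own code.

```python
def crop_margins(bbox, page_width, page_height, padding=4.0):
    """Convert a top-left-origin bbox into (left, bottom, right, top) margins.

    bbox: (x0, top, x1, bottom) in PDF points, y measured downward from the top.
    padding: extra points kept around the table so ruling lines are not clipped.
    """
    x0, top, x1, bottom = bbox
    left = max(x0 - padding, 0.0)
    top_margin = max(top - padding, 0.0)
    right = max(page_width - x1 - padding, 0.0)
    bottom_margin = max(page_height - bottom - padding, 0.0)
    return (left, bottom_margin, right, top_margin)
```

The resulting tuple could be passed as crop= to a pypdfium2 page render, with a scale above 1 for a sharper PNG.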
If a document mentions tables but no reliable table regions are found, the manifest includes a warning so callers can decide whether to inspect the source pages.
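One way to picture this warning is a comparison between table mentions in the text and the number of detected regions. This is a hypothetical sketch: the regular expression only matches English "Table N" references, and table_warning and the message wording are illustrative, not the manifest's actual format.

```python
import re

TABLE_MENTION = re.compile(r"\bTable\s+(\d+)\b")


def table_warning(page_texts, detected_table_count):
    """Return a warning string when tables are mentioned but none were detected."""
    mentioned = set()
    for text in page_texts:
        mentioned.update(TABLE_MENTION.findall(text))
    if mentioned and detected_table_count == 0:
        labels = ", ".join(sorted(mentioned, key=int))
        return f"Text mentions Table {labels} but no table regions were detected."
    return None
```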
How Formulas Are Handled
Display formulas are treated as image cards when the PDF exposes them as centered, formula-like text blocks. The extracted formula string is retained for alt/search metadata, but the reader shows the source crop so subscripts, superscripts, arrows, and math spacing remain faithful.
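A "centered, formula-like" test could combine a horizontal centering check with a count of math symbols. The sketch below is an assumption about how such a heuristic might look, not the converter's actual rule; the symbol set, tolerances, and function name are all illustrative.

```python
MATH_CHARS = set("=+-×·/^_<>≤≥≈∑∫∏√→←∞")


def looks_like_display_formula(text, x0, x1, page_width,
                               center_tol=0.08, min_math_ratio=0.2):
    """Heuristic: the block is horizontally centered and dense in math symbols."""
    stripped = text.replace(" ", "")
    if not stripped:
        return False
    block_center = (x0 + x1) / 2
    centered = abs(block_center - page_width / 2) <= center_tol * page_width
    math_ratio = sum(ch in MATH_CHARS for ch in stripped) / len(stripped)
    return centered and math_ratio >= min_math_ratio
```

A block that passes would be cropped from the page image, with the extracted string kept only for alt text and search.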
License
MIT