pdf-card-mcp

Converts dense PDFs into soft, minimal, card-based HTML readers with preserved source text, rendered pages, and cropped tables/figures as images, all processed locally.

PDF Card MCP

<!-- mcp-name: io.github.velyan/pdf-card-mcp -->

PDF Card MCP is a local-first MCP tool that converts dense PDFs into soft, minimal, card-based HTML readers. It preserves source text, renders source pages, crops detected tables, figures, and display formulas as images, and writes a standalone HTML file that can be moved across devices without losing assets.

The default reader is designed for comfortable reading: large type, small cards, search, section navigation, next/previous controls, keyboard navigation, a font-size slider, and source-page previews.

Status

This is an early open-source implementation. It already works on PDFs that have a text layer: pdfplumber provides best-effort table detection, pypdfium2 provides permissively licensed raster rendering, and gmft optionally provides richer local table detection. Scanned PDFs need the optional OCR fallback.

Install For Development

cd path/to/pdf-card-mcp
python -m venv .venv
source .venv/bin/activate
python -m pip install -e ".[dev]"

uv is recommended for MCPB packaging:

uv sync
uv run pdf-card-mcp path/to/document.pdf --output out/document.html

Install the optional local ML table detector when you want stronger table crops:

uv sync --extra table-ml
uv run --extra table-ml pdf-card-mcp path/to/document.pdf --table-engine gmft

CLI Usage

pdf-card-mcp path/to/document.pdf --output examples/out/document.html

The command writes:

  • document.html: standalone reader with embedded CSS, JavaScript, table crops, figure crops, formula crops, and source-page images.
  • document.manifest.json: structured metadata without embedded image payloads.
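
As a rough sketch, a script can derive both output paths before invoking the CLI. It assumes the manifest is written next to the HTML file under the same stem, which the file names above suggest but this README does not state. `expected_outputs` and `build_command` are hypothetical helper names, not part of the package.

```python
from pathlib import Path


def expected_outputs(html_path: str) -> tuple[Path, Path]:
    """Return (html, manifest) paths for a given --output value.

    Assumption: the manifest sits beside the HTML file and swaps the
    .html suffix for .manifest.json.
    """
    html = Path(html_path)
    manifest = html.with_name(html.stem + ".manifest.json")
    return html, manifest


def build_command(pdf_path: str, html_path: str) -> list[str]:
    """Build the CLI invocation shown above as an argument list."""
    return ["pdf-card-mcp", pdf_path, "--output", html_path]


html, manifest = expected_outputs("examples/out/document.html")
print(manifest.name)  # document.manifest.json
```

The list returned by `build_command` can be passed to `subprocess.run(..., check=True)`, after which both paths can be checked with `Path.exists()`.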

MCP Tool

The server exposes one primary tool:

convert_pdf_to_card_html

Inputs:

  • pdf_path: local PDF path.
  • output_path: optional HTML output path.
  • title: optional title override.
  • standalone: defaults to true; asset-folder output is reserved for a later release.
  • ocr: optional OCR fallback if pytesseract is installed.
  • max_pages: optional processing limit.
  • theme: defaults to soft.
  • table_engine: auto, pdfplumber, or gmft; auto uses gmft when installed.
  • text_engine: char_geometry or pdfplumber_words; defaults to char_geometry so missing spaces are repaired from PDF character positions instead of trusting fused words.
  • postprocess_engine: none or sampling; defaults to none. When set to sampling, the MCP server asks the host LLM for boundary-only card polish operations, validates exact source-text preservation, and rewrites the generated reader. If the MCP client does not support sampling, deterministic output is returned with a warning.
  • model_cache_dir: optional cache directory for local ML table model weights.
  • offline: use only already-cached optional ML models.
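
The inputs above can be assembled into an MCP `tools/call` request. The sketch below is a client-side illustration only: the enumerated values come from the list above, the JSON-RPC envelope follows the MCP specification, and `build_tool_call` is a hypothetical helper. It does not check value types, because the README does not specify them.

```python
import json

# Allowed values taken from the input list above.
ENUMS = {
    "table_engine": {"auto", "pdfplumber", "gmft"},
    "text_engine": {"char_geometry", "pdfplumber_words"},
    "postprocess_engine": {"none", "sampling"},
}
KNOWN_KEYS = {
    "pdf_path", "output_path", "title", "standalone", "ocr", "max_pages",
    "theme", "model_cache_dir", "offline", *ENUMS,
}


def build_tool_call(pdf_path: str, request_id: int = 1, **options) -> dict:
    """Build a JSON-RPC tools/call request for convert_pdf_to_card_html."""
    unknown = set(options) - KNOWN_KEYS
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    for key, allowed in ENUMS.items():
        if key in options and options[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "convert_pdf_to_card_html",
            # Omitted options fall back to the server-side defaults.
            "arguments": {"pdf_path": pdf_path, **options},
        },
    }


request = build_tool_call("path/to/document.pdf", table_engine="gmft", max_pages=20)
print(json.dumps(request, indent=2))
```

Rejecting unknown keys and out-of-range engine names on the client keeps typos from reaching the server; most MCP client libraries build this envelope for you, so only the `arguments` mapping normally needs to be written by hand.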

Sampling post-processing is intentionally narrow. The host LLM may suggest merges, heading extraction, or front-matter/footnote classification, but Python validation rejects any operation that rewrites, deletes, invents, or reorders source text.
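
One way to picture the preservation check is a comparison of the concatenated card text before and after the suggested operations. This is a minimal sketch under the assumption that only whitespace may differ across card boundaries; it is not the project's validator, and `preserves_source_text` is a hypothetical name.

```python
import re


def normalized(text: str) -> str:
    """Collapse whitespace runs so re-wrapping is not counted as a rewrite."""
    return re.sub(r"\s+", " ", text).strip()


def preserves_source_text(before: list[str], after: list[str]) -> bool:
    """True when `after` carries exactly the text of `before`, in the same
    order, regardless of where the card boundaries fall."""
    return normalized(" ".join(before)) == normalized(" ".join(after))


cards = ["2 Methods We sampled 40 documents", "from three archives."]

# A merge and a heading extraction only move boundaries, so they pass.
print(preserves_source_text(cards, ["2 Methods", "We sampled 40 documents from three archives."]))  # True

# A rewrite changes the words themselves, so it is rejected.
print(preserves_source_text(cards, ["2 Methods", "We sampled forty documents from three archives."]))  # False
```

Because the comparison is order-sensitive and exact, deleted, invented, or reordered text fails the same check as a rewrite.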

Run the server locally:

python -m pdf_card_mcp.server

MCPB Packaging

This repo is arranged so the root can be packed directly:

python scripts/build_mcpb.py --variant all

The build writes the slim bundle to dist/pdf-card-mcp-lite.mcpb and the full-quality UV bundle, which installs the table-ml extra, to dist/pdf-card-mcp.mcpb. Neither bundle vendors ML model weights; gmft downloads and caches them locally on first use unless offline=true is set with a prewarmed cache.

The MCPB manifests use server.type = "uv", so hosts that support the UV runtime can install dependencies from pyproject.toml instead of relying on a user-managed Python setup.

Privacy

PDF processing is local. The tool does not upload document contents or call external APIs. Optional OCR runs locally when the user has installed OCR dependencies.

How It Works

See docs/how-it-works.html for a self-contained visual explainer of the conversion pipeline, including page rendering, table/figure crops, overlap suppression, text-card merging, and standalone HTML output.
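
To give a feel for one of those steps, the sketch below shows a plausible form of overlap suppression: a text block is dropped when most of its area already lies inside a table or figure crop, so the same content does not appear both as an image card and as a text card. The 0.5 threshold, the `Box` type, and the function names are assumptions made for illustration; the explainer describes the actual pipeline.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Rectangle in page coordinates with a top-left origin."""
    x0: float
    top: float
    x1: float
    bottom: float


def area(box: Box) -> float:
    # Inverted boxes (no real overlap) count as zero area.
    return max(0.0, box.x1 - box.x0) * max(0.0, box.bottom - box.top)


def intersection(a: Box, b: Box) -> Box:
    return Box(max(a.x0, b.x0), max(a.top, b.top), min(a.x1, b.x1), min(a.bottom, b.bottom))


def suppress_overlaps(text_boxes, crop_boxes, threshold=0.5):
    """Keep a text box unless more than `threshold` of its area is inside one crop."""
    kept = []
    for t in text_boxes:
        covered = max((area(intersection(t, c)) for c in crop_boxes), default=0.0)
        if area(t) == 0 or covered / area(t) <= threshold:
            kept.append(t)
    return kept


table = Box(50, 100, 550, 300)
cell_text = Box(60, 120, 200, 135)   # lies inside the table crop
paragraph = Box(50, 320, 550, 340)   # lies below the table
print(suppress_overlaps([cell_text, paragraph], [table]) == [paragraph])  # True
```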

How Tables Are Handled

All detected tables are rendered as image cards. The converter uses pdfplumber to find table regions and can optionally use gmft/Table Transformer for stronger local detection. It then uses pypdfium2 to rasterize only the source table region into PNG. Captions are preserved as reader text and alt text, but the table itself remains an image so layout and numeric alignment survive conversion.
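
The coordinate step behind such a crop can be sketched without either library. pdfplumber reports boxes as (x0, top, x1, bottom) in PDF points with a top-left origin, and a raster at a given DPI scales those by dpi / 72. The padding value, the default DPI, and the helper name `bbox_to_pixels` are assumptions for illustration, not the converter's settings.

```python
import math


def bbox_to_pixels(bbox, page_width, page_height, dpi=150, padding=4.0):
    """Convert a (x0, top, x1, bottom) box in PDF points (top-left origin)
    to an integer (left, upper, right, lower) pixel box at `dpi`."""
    x0, top, x1, bottom = bbox
    # Pad in points, then clamp to the page so the crop never leaves it.
    x0 = max(0.0, x0 - padding)
    top = max(0.0, top - padding)
    x1 = min(page_width, x1 + padding)
    bottom = min(page_height, bottom + padding)
    scale = dpi / 72.0  # 72 PDF points per inch
    # Round outward so the crop never clips the table edge.
    return (
        math.floor(x0 * scale),
        math.floor(top * scale),
        math.ceil(x1 * scale),
        math.ceil(bottom * scale),
    )


# US Letter page (612 x 792 points) rendered at 144 DPI, i.e. scale 2.0.
print(bbox_to_pixels((72, 144, 540, 360), page_width=612, page_height=792, dpi=144))
# (136, 280, 1088, 728)
```

The resulting tuple uses the same (left, upper, right, lower) order that common imaging libraries expect for a crop box.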

If a document mentions tables but no reliable table regions are found, the manifest includes a warning so callers can decide whether to inspect the source pages.
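
A check of this kind might look like the following sketch, which counts distinct "Table N" labels in the extracted text and warns when none were matched by a detected region. The regular expression, the warning wording, and `table_warnings` are hypothetical; the README does not describe how the converter detects mentions or how the manifest stores warnings.

```python
import re

# Matches references such as "Table 2" or "Table 12".
TABLE_MENTION = re.compile(r"\bTable\s+(\d+)\b")


def table_warnings(page_texts: list[str], detected_tables: int) -> list[str]:
    """Return a warning when the text cites tables but no region was detected."""
    labels = set()
    for text in page_texts:
        labels.update(TABLE_MENTION.findall(text))
    if labels and detected_tables == 0:
        return [f"Text mentions {len(labels)} table label(s) but no table regions were detected."]
    return []


pages = ["Results are summarised in Table 2.", "Table 2 and Table 3 list the raw counts."]
print(table_warnings(pages, detected_tables=0))
```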

How Formulas Are Handled

Display formulas are treated as image cards when the PDF exposes them as centered, formula-like text blocks. The extracted formula string is retained for alt/search metadata, but the reader shows the source crop so subscripts, superscripts, arrows, and math spacing remain faithful.
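
As an illustration of what "centered, formula-like" could mean in practice, the sketch below combines a geometric test (the block is centered and narrower than the page) with a simple density of math characters. All thresholds, the character set, and the name `looks_like_display_formula` are assumptions; the converter's own heuristic may differ.

```python
MATH_CHARS = set("=+−-×·/^_∑∫∏√≤≥≈≠→←∞∂∇()[]{}|<>")


def looks_like_display_formula(text, x0, x1, page_width,
                               center_tol=0.08, min_math_ratio=0.15):
    """Heuristic: short, centered, narrow block with a high share of math symbols."""
    stripped = text.strip()
    if not stripped or len(stripped.split()) > 25:
        return False
    block_center = (x0 + x1) / 2
    centered = abs(block_center - page_width / 2) <= center_tol * page_width
    narrow = (x1 - x0) < 0.8 * page_width
    math_ratio = sum(ch in MATH_CHARS for ch in stripped) / len(stripped)
    return centered and narrow and math_ratio >= min_math_ratio


# Centered equation on a 612-point-wide page.
print(looks_like_display_formula("E = mc^2", x0=250, x1=362, page_width=612))  # True

# Ordinary prose spanning the text column.
print(looks_like_display_formula("This paragraph explains the method in detail.",
                                 x0=72, x1=540, page_width=612))  # False
```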

License

MIT
