pdf-chapter-splitter

An MCP (Model Context Protocol) server that splits PDFs by chapter or section and returns the text in Claude-friendly chunks. It is designed for use with Claude Code to enable structured reading of large PDF documents.

How it works

[You] → provide a PDF file path
  ↓
[pdf-chapter-splitter MCP] → analyze structure & extract text
  ↓
[Claude Code] ← receives chapter text + metadata → analyzes content

Prerequisites

  • Node.js >= 18

No external binaries are required. The server uses pdfjs-dist (pure JavaScript).

Installation

Claude Code CLI (Recommended)

claude mcp add pdf-chapter-splitter -- npx -y pdf-chapter-splitter

This registers the MCP server with Claude Code. No manual configuration needed.

npx (Direct execution)

npx -y pdf-chapter-splitter

Manual installation

git clone https://github.com/keigoly/pdf-chapter-splitter.git
cd pdf-chapter-splitter
npm install
npm run build

If you installed manually, add the following to your .mcp.json (project-level) or ~/.claude/settings.json (global):

{
  "mcpServers": {
    "pdf-chapter-splitter": {
      "command": "node",
      "args": ["/path/to/pdf-chapter-splitter/dist/index.js"]
    }
  }
}

Tools

get_pdf_info

Get PDF metadata (title, author, page count, file size).

| Parameter | Type | Description |
| --- | --- | --- |
| file_path | string (required) | Absolute path to the PDF file |

get_toc

Extract table of contents (bookmarks/outline) from a PDF. Returns hierarchical chapter structure with page numbers.

| Parameter | Type | Description |
| --- | --- | --- |
| file_path | string (required) | Absolute path to the PDF file |

list_chapters

List chapters/sections detected from the PDF. Uses TOC if available, otherwise splits by pages.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file_path | string (required) | - | Absolute path to the PDF file |
| level | number (optional) | top-level only | Max TOC depth level to include (0 = top level only) |
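
The level parameter can be pictured as a depth cut-off applied while flattening the outline. The following is a minimal sketch of that idea, not the server's actual code: the `TocEntry` and `Chapter` shapes and the `flattenToc` helper are hypothetical names introduced here for illustration.

```typescript
// Hypothetical shape of a TOC entry; the server's real output may differ.
interface TocEntry {
  title: string;
  page: number; // 1-based page where the entry starts
  children: TocEntry[];
}

// Hypothetical flattened chapter record.
interface Chapter {
  index: number; // position in the flattened list
  title: string;
  page: number;
  depth: number; // 0 = top level
}

// Flatten the outline in document order, keeping entries with depth <= maxLevel.
function flattenToc(entries: TocEntry[], maxLevel = 0): Chapter[] {
  const chapters: Chapter[] = [];
  const walk = (nodes: TocEntry[], depth: number): void => {
    if (depth > maxLevel) return;
    for (const node of nodes) {
      chapters.push({ index: chapters.length, title: node.title, page: node.page, depth });
      walk(node.children, depth + 1);
    }
  };
  walk(entries, 0);
  return chapters;
}

const toc: TocEntry[] = [
  { title: "Introduction", page: 1, children: [] },
  {
    title: "Methods",
    page: 5,
    children: [{ title: "Setup", page: 6, children: [] }],
  },
];

console.log(flattenToc(toc).map((c) => c.title)); // top-level entries only
console.log(flattenToc(toc, 1).map((c) => c.title)); // also includes "Setup"
```

Because indices are assigned after filtering, the same chapter can have a different index at a different level, which is one plausible reason read_chapter asks for a matching level.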

read_chapter

Read the text content of a specific chapter by its index (from list_chapters).

| Parameter | Type | Description |
| --- | --- | --- |
| file_path | string (required) | Absolute path to the PDF file |
| chapter_index | number (required) | Chapter index from list_chapters output |
| level | number (optional) | Max TOC depth level (must match list_chapters) |
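
Claude Code normally issues tool calls for you, but it can help to see what the arguments look like on the wire. The sketch below builds two MCP `tools/call` requests (JSON-RPC 2.0) and passes the same level to both. The `toolCall` helper, the file path, and the chapter index are illustrative placeholders, not part of this project.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper that wraps a tool name and its arguments in a request.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

const filePath = "/path/to/book.pdf"; // placeholder path
const level = 1; // reuse the same value for both calls

// First list the chapters, then read one of them by its index.
const listReq = toolCall("list_chapters", { file_path: filePath, level });
const readReq = toolCall("read_chapter", { file_path: filePath, chapter_index: 2, level });

console.log(JSON.stringify(listReq));
console.log(JSON.stringify(readReq));
```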

read_pages

Read text content from a specific page range.

| Parameter | Type | Description |
| --- | --- | --- |
| file_path | string (required) | Absolute path to the PDF file |
| start_page | number (required) | Start page number (1-based) |
| end_page | number (required) | End page number (1-based, inclusive) |
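
Since both bounds are 1-based and end_page is inclusive, a request for pages 3 to 5 covers three pages. The following sketch shows one way to validate such a range before calling the tool. `resolvePageRange` is a hypothetical helper, and whether the server rejects or clamps out-of-range values is not stated in this README, so the strict behavior here is an assumption.

```typescript
// Expand a 1-based, inclusive page range into the list of page numbers.
// Assumption: invalid ranges are rejected rather than clamped.
function resolvePageRange(startPage: number, endPage: number, pageCount: number): number[] {
  if (Math.floor(startPage) !== startPage || Math.floor(endPage) !== endPage) {
    throw new RangeError("page numbers must be integers");
  }
  if (startPage < 1 || endPage < startPage || endPage > pageCount) {
    throw new RangeError(`invalid range ${startPage}-${endPage} for a ${pageCount}-page PDF`);
  }
  const pages: number[] = [];
  for (let p = startPage; p <= endPage; p++) pages.push(p);
  return pages;
}

console.log(resolvePageRange(3, 5, 10)); // pages 3, 4 and 5
```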

detect_headings

Detect heading candidates by analyzing font sizes. Useful for PDFs without TOC/bookmarks.

| Parameter | Type | Description |
| --- | --- | --- |
| file_path | string (required) | Absolute path to the PDF file |
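
As a rough illustration of font-size-based detection, the sketch below treats the font size that covers the most characters as body text and flags anything noticeably larger as a heading candidate. This is one common heuristic, not necessarily the one this server implements. The `TextItem` shape, the `detectHeadingCandidates` helper, and the 1.2 ratio are assumptions made for the example.

```typescript
// Hypothetical text run extracted from a PDF page.
interface TextItem {
  text: string;
  fontSize: number;
  page: number;
}

// Flag items whose font size is at least `ratio` times the body size.
function detectHeadingCandidates(items: TextItem[], ratio = 1.2): TextItem[] {
  // Count characters per font size; the most common size is treated as body text.
  const charsBySize: Record<string, number> = {};
  for (const item of items) {
    const key = String(item.fontSize);
    charsBySize[key] = (charsBySize[key] ?? 0) + item.text.length;
  }
  let bodySize = 0;
  let maxChars = -1;
  for (const key of Object.keys(charsBySize)) {
    const chars = charsBySize[key] ?? 0;
    if (chars > maxChars) {
      maxChars = chars;
      bodySize = Number(key);
    }
  }
  return items.filter(
    (item) => item.text.trim().length > 0 && item.fontSize >= bodySize * ratio
  );
}

const sampleItems: TextItem[] = [
  { text: "Chapter 1", fontSize: 18, page: 1 },
  { text: "This paragraph is ordinary body text that runs on for a while.", fontSize: 10, page: 1 },
  { text: "1.1 Scope", fontSize: 13, page: 2 },
  { text: "More body text follows here, set at the same size as before.", fontSize: 10, page: 2 },
];

console.log(detectHeadingCandidates(sampleItems).map((h) => h.text));
```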

3-level fallback strategy

  1. TOC/Bookmarks → Use as the chapter structure
  2. Font size analysis → Detect headings from larger text
  3. Auto-split → Split every 20 pages
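
The three steps above can be sketched as a simple chain: take the first source that yields chapters, and fall back to fixed 20-page chunks otherwise. The `PageRange` shape and the `autoSplit` and `chooseChapters` helpers are hypothetical names for illustration; the server's internal structure may differ.

```typescript
// Hypothetical chapter record with an inclusive, 1-based page range.
interface PageRange {
  title: string;
  startPage: number;
  endPage: number;
}

// Step 3: split the document into fixed-size chunks (20 pages by default).
function autoSplit(pageCount: number, chunkSize = 20): PageRange[] {
  const chunks: PageRange[] = [];
  for (let start = 1; start <= pageCount; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, pageCount);
    chunks.push({ title: `Pages ${start}-${end}`, startPage: start, endPage: end });
  }
  return chunks;
}

// Prefer TOC chapters, then heading-based chapters, then fixed chunks.
function chooseChapters(
  tocChapters: PageRange[],
  headingChapters: PageRange[],
  pageCount: number
): PageRange[] {
  if (tocChapters.length > 0) return tocChapters;
  if (headingChapters.length > 0) return headingChapters;
  return autoSplit(pageCount);
}

// A 45-page PDF with no TOC and no detected headings.
console.log(chooseChapters([], [], 45));
```

With a 45-page document the last chunk is shorter than the others, since the end page is capped at the page count.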

Constraints

| Item | Limit |
| --- | --- |
| Supported format | .pdf |
| Max response size | 200,000 chars / 1,800 lines |
| Auto-split chunk | 20 pages |
| External deps | None (pdfjs-dist only) |
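
To make the response size limit concrete, the sketch below caps a text at 1,800 lines and 200,000 characters, whichever is hit first. This README does not say whether the server truncates or returns an error when a chapter exceeds the limit, so truncation is an assumption here, and `limitResponse` is a hypothetical helper.

```typescript
const MAX_CHARS = 200000;
const MAX_LINES = 1800;

interface LimitedText {
  text: string;
  truncated: boolean;
}

// Apply the line cap first, then the character cap.
function limitResponse(text: string, maxChars = MAX_CHARS, maxLines = MAX_LINES): LimitedText {
  let out = text;
  const lines = out.split("\n");
  if (lines.length > maxLines) out = lines.slice(0, maxLines).join("\n");
  if (out.length > maxChars) out = out.slice(0, maxChars);
  return { text: out, truncated: out.length < text.length };
}

// Build a 2,000-line sample that exceeds the line cap.
const sampleLines: string[] = [];
for (let i = 0; i < 2000; i++) sampleLines.push("line " + i);
const limited = limitResponse(sampleLines.join("\n"));

console.log(limited.text.split("\n").length, limited.truncated);
```

If a chapter is longer than the limit, reading it in smaller pieces with read_pages is a reasonable workaround.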

Use cases

  • Structured reading: Read technical books chapter by chapter
  • Document review: Review specific sections of specs or reports
  • Paper analysis: Analyze academic papers section by section
  • Cross-PDF search: Get TOCs from multiple PDFs to find relevant chapters

License

MIT
