pdf-chapter-splitter
An MCP (Model Context Protocol) server that splits PDFs by chapter or section and returns the text in chunks sized for Claude. It is designed for use with Claude Code to enable structured reading of large PDF documents.
How it works
[You] → provide a PDF file path
↓
[pdf-chapter-splitter MCP] → analyzes structure and extracts text
↓
[Claude Code] ← receives chapter text + metadata → analyzes content
Prerequisites
- Node.js >= 18
No external binaries are required. The server uses pdfjs-dist (pure JavaScript).
Installation
Claude Code CLI (Recommended)
claude mcp add pdf-chapter-splitter -- npx -y pdf-chapter-splitter
This registers the MCP server with Claude Code. No manual configuration is needed.
npx (Direct execution)
npx -y pdf-chapter-splitter
Manual installation
git clone https://github.com/keigoly/pdf-chapter-splitter.git
cd pdf-chapter-splitter
npm install
npm run build
If you installed manually, add the following to your .mcp.json (project-level) or ~/.claude/settings.json (global):
{
  "mcpServers": {
    "pdf-chapter-splitter": {
      "command": "node",
      "args": ["/path/to/pdf-chapter-splitter/dist/index.js"]
    }
  }
}
Tools
get_pdf_info
Get PDF metadata (title, author, page count, file size).
| Parameter | Type | Description |
|---|---|---|
| file_path | string (required) | Absolute path to the PDF file |
get_toc
Extract table of contents (bookmarks/outline) from a PDF. Returns hierarchical chapter structure with page numbers.
| Parameter | Type | Description |
|---|---|---|
| file_path | string (required) | Absolute path to the PDF file |
list_chapters
List chapters/sections detected from the PDF. Uses TOC if available, otherwise splits by pages.
| Parameter | Type | Default | Description |
|---|---|---|---|
| file_path | string (required) | - | Absolute path to the PDF file |
| level | number (optional) | top-level only | Max TOC depth level to include (0 = top level only) |
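The effect of the level parameter can be illustrated with a minimal sketch. The TocEntry shape and the filterByLevel helper are hypothetical names used only for illustration; the server's actual types and filtering logic may differ.

```typescript
// Hypothetical shape of one outline entry; not taken from the server's source.
interface TocEntry {
  title: string;
  page: number; // 1-based page where the entry starts
  depth: number; // 0 = top level, 1 = first nested level, and so on
}

// Keep only entries at or above the requested depth (0 = top level only).
function filterByLevel(entries: TocEntry[], level: number = 0): TocEntry[] {
  return entries.filter((entry) => entry.depth <= level);
}

const sampleToc: TocEntry[] = [
  { title: "1. Introduction", page: 1, depth: 0 },
  { title: "1.1 Background", page: 3, depth: 1 },
  { title: "2. Methods", page: 10, depth: 0 },
];

const topLevelOnly = filterByLevel(sampleToc); // default: top level only
const withSubsections = filterByLevel(sampleToc, 1); // also includes depth 1

console.log(topLevelOnly.map((e) => e.title));
console.log(withSubsections.map((e) => e.title));
```

Because a deeper level yields more entries, the same chapter index can point at different sections depending on the level used.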
read_chapter
Read the text content of a specific chapter by its index (from list_chapters).
| Parameter | Type | Description |
|---|---|---|
| file_path | string (required) | Absolute path to the PDF file |
| chapter_index | number (required) | Chapter index from list_chapters output |
| level | number (optional) | Max TOC depth level (must match list_chapters) |
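For reference, here is a sketch of what a list_chapters call followed by a read_chapter call could look like at the protocol level. The JSON-RPC envelope follows the MCP tools/call convention; the toolCall helper, the file path, and the chapter index value are placeholders chosen for illustration. Claude Code builds these requests itself, so this is only to show how the arguments fit together.

```typescript
// JSON-RPC 2.0 envelope that MCP clients send to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps tool arguments in the envelope.
function toolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const pdfPath = "/absolute/path/to/book.pdf"; // placeholder path
const sharedLevel = 1; // use the same level in both calls so indices line up

const listRequest = toolCall(1, "list_chapters", { file_path: pdfPath, level: sharedLevel });
const readRequest = toolCall(2, "read_chapter", {
  file_path: pdfPath,
  chapter_index: 3, // illustrative value; take the real index from list_chapters output
  level: sharedLevel,
});

console.log(JSON.stringify(listRequest, null, 2));
console.log(JSON.stringify(readRequest, null, 2));
```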
read_pages
Read text content from a specific page range.
| Parameter | Type | Description |
|---|---|---|
| file_path | string (required) | Absolute path to the PDF file |
| start_page | number (required) | Start page number (1-based) |
| end_page | number (required) | End page number (1-based, inclusive) |
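The 1-based, inclusive range semantics can be sketched as follows. The resolvePageRange function is a hypothetical helper, and rejecting out-of-bounds ranges with an error is an assumption; the server may clamp or report errors differently.

```typescript
// Validate a 1-based, inclusive page range and list the pages it covers.
// Assumption: invalid ranges are rejected rather than clamped.
function resolvePageRange(startPage: number, endPage: number, pageCount: number): number[] {
  if (!Number.isInteger(startPage) || !Number.isInteger(endPage)) {
    throw new RangeError("page numbers must be integers");
  }
  if (startPage < 1 || endPage < startPage || endPage > pageCount) {
    throw new RangeError(`invalid range ${startPage}-${endPage} for a ${pageCount}-page PDF`);
  }
  const pages: number[] = [];
  for (let page = startPage; page <= endPage; page++) {
    pages.push(page); // endPage itself is included
  }
  return pages;
}

const selectedPages = resolvePageRange(5, 8, 100);
console.log(selectedPages); // [5, 6, 7, 8]
```

Note that start_page 5 and end_page 8 cover four pages, not three.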
detect_headings
Detect heading candidates by analyzing font sizes. Useful for PDFs without TOC/bookmarks.
| Parameter | Type | Description |
|---|---|---|
| file_path | string (required) | Absolute path to the PDF file |
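One plausible way to detect headings from font sizes is sketched below. This is not the server's actual algorithm: the TextItem shape, the detectHeadingCandidates function, and the 1.2 ratio threshold are all assumptions made for illustration. The idea is to treat the font size that carries the most characters as body text and flag anything noticeably larger.

```typescript
// Hypothetical text fragment with its rendered font size.
interface TextItem {
  text: string;
  fontSize: number;
  page: number;
}

// Flag items whose font size is at least `ratio` times the body font size.
function detectHeadingCandidates(items: TextItem[], ratio: number = 1.2): TextItem[] {
  // Count characters per font size; the size with the most characters is assumed to be body text.
  const charsBySize = new Map<number, number>();
  for (const item of items) {
    charsBySize.set(item.fontSize, (charsBySize.get(item.fontSize) ?? 0) + item.text.length);
  }
  let bodySize = 0;
  let maxChars = -1;
  charsBySize.forEach((chars, size) => {
    if (chars > maxChars) {
      maxChars = chars;
      bodySize = size;
    }
  });
  return items.filter(
    (item) => item.fontSize >= bodySize * ratio && item.text.trim().length > 0
  );
}

const sampleItems: TextItem[] = [
  { text: "Chapter 1", fontSize: 16, page: 1 },
  { text: "This is body text that runs on for a while.", fontSize: 10, page: 1 },
  { text: "More body text on the same page.", fontSize: 10, page: 1 },
  { text: "1.1 Scope", fontSize: 13, page: 2 },
  { text: "Body text again.", fontSize: 10, page: 2 },
];

const headingCandidates = detectHeadingCandidates(sampleItems);
console.log(headingCandidates.map((h) => `${h.text} (p.${h.page})`));
```

A heuristic like this yields candidates only, since large text can also be a title page, a pull quote, or a figure label.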
3-level fallback strategy
- TOC/Bookmarks → used directly as the chapter structure
- Font size analysis → headings detected from larger text
- Auto-split → split every 20 pages
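The fallback order above can be sketched as a small selection function. The Chapter and Heading shapes and the function names are hypothetical, and the exact conditions under which the server moves to the next strategy are an assumption (here: the previous strategy returned nothing).

```typescript
interface Heading {
  title: string;
  page: number; // 1-based start page
}

interface Chapter {
  title: string;
  startPage: number;
  endPage: number; // inclusive
}

// Turn a page-sorted list of starting points into contiguous chapters.
function chaptersFromStarts(starts: Heading[], pageCount: number): Chapter[] {
  return starts.map((start, i) => {
    const next = starts[i + 1];
    return {
      title: start.title,
      startPage: start.page,
      endPage: next ? Math.max(start.page, next.page - 1) : pageCount,
    };
  });
}

// Level 3: fixed-size chunks (20 pages, per the list above).
function autoSplit(pageCount: number, chunkSize: number = 20): Chapter[] {
  const chapters: Chapter[] = [];
  for (let start = 1; start <= pageCount; start += chunkSize) {
    const end = Math.min(start + chunkSize - 1, pageCount);
    chapters.push({ title: `Pages ${start}-${end}`, startPage: start, endPage: end });
  }
  return chapters;
}

// Try TOC first, then font-size headings, then auto-split.
function resolveChapters(toc: Heading[], fontHeadings: Heading[], pageCount: number): Chapter[] {
  if (toc.length > 0) return chaptersFromStarts(toc, pageCount);
  if (fontHeadings.length > 0) return chaptersFromStarts(fontHeadings, pageCount);
  return autoSplit(pageCount);
}

const fromToc = resolveChapters(
  [{ title: "A", page: 1 }, { title: "B", page: 10 }],
  [],
  30
);
const fromAutoSplit = resolveChapters([], [], 45);

console.log(fromToc);
console.log(fromAutoSplit.map((c) => c.title)); // ["Pages 1-20", "Pages 21-40", "Pages 41-45"]
```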
Constraints
| Item | Limit |
|---|---|
| Supported format | .pdf |
| Max response size | 200,000 chars / 1,800 lines |
| Auto-split chunk | 20 pages |
| External deps | None (pdfjs-dist only) |
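To make the response size limit concrete, here is a minimal sketch of how text might be capped. It assumes the response is cut at whichever limit (characters or lines) is reached first; the README does not say how the server enforces the limit, and the truncateResponse function and its return shape are hypothetical.

```typescript
const MAX_CHARS = 200_000;
const MAX_LINES = 1_800;

// Cap text at the line limit first, then at the character limit.
function truncateResponse(
  text: string,
  maxChars: number = MAX_CHARS,
  maxLines: number = MAX_LINES
): { text: string; truncated: boolean } {
  const lines = text.split("\n");
  let result = lines.length > maxLines ? lines.slice(0, maxLines).join("\n") : text;
  if (result.length > maxChars) {
    result = result.slice(0, maxChars);
  }
  // Any cut makes the result strictly shorter than the input.
  return { text: result, truncated: result.length < text.length };
}

const shortResult = truncateResponse("ok");
const lineCapped = truncateResponse("a\nb\nc", 100, 2);
const charCapped = truncateResponse("abcdef", 4, 10);

console.log(shortResult, lineCapped, charCapped);
```

If a chapter exceeds the limit, reading it in smaller page ranges with read_pages is one way to stay under it.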
Use cases
- Structured reading: Read technical books chapter by chapter
- Document review: Review specific sections of specs or reports
- Paper analysis: Analyze academic papers section by section
- Cross-PDF search: Get TOCs from multiple PDFs to find relevant chapters
License
MIT