Custom PDF MCP Server

A server for processing local PDF documents: it extracts text, retrieves table data, and collects metadata. It can also scan directories for PDFs and read specific pages, and it is designed for thesis literature analysis.

README

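A custom PDF processing server built on FastMCP, designed for working with the literature behind a graduation thesis.

Features

  • Read PDF text: extracts the text of an entire PDF or of specified pages
  • Extract table data: optionally extracts the table structures found in a PDF
  • Get PDF info: extracts PDF metadata (author, title, and so on)
  • List PDF files: scans a directory for all PDF files
  • Security restriction: only files under the current working directory can be accessed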
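
The security restriction above can be illustrated with a small path check. This is a minimal sketch, not the server's actual code: the helper name `resolve_inside` is hypothetical, and it assumes the restriction is enforced by resolving each requested path and rejecting anything that lands outside the base directory.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path against base_dir and reject anything that escapes it.

    Hypothetical helper; the real server may enforce its limit differently.
    """
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    # Joining with an absolute user_path discards base, so it is rejected too.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise PermissionError(f"{user_path!r} is outside {base}")
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept a path such as `papers/../../secret.pdf`.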

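Installation

Method 1: using uv (recommended)

# Clone the project into a directory named pdf-mcp
git clone https://github.com/Waicy/-pdf-mcp-.git pdf-mcp
cd pdf-mcp

# Create the virtual environment and install dependencies
uv sync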

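Method 2: using pip

# Clone the project into a directory named pdf-mcp
git clone https://github.com/Waicy/-pdf-mcp-.git pdf-mcp
cd pdf-mcp

# Install dependencies (this example uses the Tsinghua PyPI mirror)
pip install . --index-url https://pypi.tuna.tsinghua.edu.cn/simple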

Usage

1. Run the server directly to test it

# With uv
uv run pdf-mcp

# With plain Python
python src/pdf_mcp_server.py

2. Configure Claude Desktop

Add the following configuration to claude_desktop_config.json:

{
  "mcpServers": {
    "pdf-reader-custom": {
      "command": "uv",
      "args": [
         "--directory",
         "path/to/your/pdf-mcp",
         "run",
         "pdf-mcp"
      ]
    }
  }
}

Note

  • Replace path/to/your/pdf-mcp with the actual path to the project
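
To avoid typing the project path by hand, the entry can be generated. The following is a minimal sketch: `build_server_entry` is a hypothetical helper that is not part of the project, and it assumes an absolute path is the safer choice because the client may launch the server from a different working directory.

```python
import json
from pathlib import Path


def build_server_entry(project_dir: str) -> dict:
    """Return the mcpServers fragment with project_dir made absolute.

    Hypothetical helper; it mirrors the JSON shown in this README.
    """
    absolute = str(Path(project_dir).expanduser().resolve())
    return {
        "mcpServers": {
            "pdf-reader-custom": {
                "command": "uv",
                "args": ["--directory", absolute, "run", "pdf-mcp"],
            }
        }
    }


if __name__ == "__main__":
    # Print the fragment so it can be merged into claude_desktop_config.json by hand.
    print(json.dumps(build_server_entry("."), indent=2))
```

Run it from inside the project directory and merge the printed fragment into the existing configuration file rather than overwriting it.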

Available tools

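read_pdf_text

Reads a PDF file and extracts its text content.

Parameters:

  • file_path: path to the PDF file (relative to the working directory)
  • page_numbers: optional, list of page numbers to extract
  • extract_tables: optional, whether to extract table data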
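
How `page_numbers` might be interpreted can be sketched as follows. The README does not say whether page numbers are 1-based or what happens to out-of-range values, so this sketch assumes 1-based numbering (suggested by the `[1, 2, 3]` usage example) and silently skips invalid or repeated entries. The function name `select_page_indices` is hypothetical.

```python
from typing import List, Optional


def select_page_indices(
    page_count: int, page_numbers: Optional[List[int]] = None
) -> List[int]:
    """Map 1-based page numbers to 0-based indices.

    Assumptions (not confirmed by the README): numbering starts at 1,
    and out-of-range or duplicate page numbers are skipped, not errors.
    """
    if page_numbers is None:
        return list(range(page_count))  # no filter: every page
    indices: List[int] = []
    for number in page_numbers:
        index = number - 1
        if 1 <= number <= page_count and index not in indices:
            indices.append(index)
    return indices
```

A PDF library would then be asked only for the pages at the returned indices, which keeps large documents cheap to query.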

get_pdf_info

Returns basic information and metadata for a PDF file.

Parameters:

  • file_path: path to the PDF file

list_pdfs_in_directory

Lists all PDF files in the specified directory.

Parameters:

  • directory_path: directory path, defaults to the current directory
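
A directory scan of this kind can be sketched with the standard library alone. This is an illustration rather than the server's implementation: the name `list_pdfs` is hypothetical, and it assumes a non-recursive scan with a case-insensitive match on the `.pdf` extension, neither of which the README specifies.

```python
from pathlib import Path
from typing import List


def list_pdfs(directory_path: str = ".") -> List[str]:
    """Return the sorted names of PDF files directly inside directory_path.

    Assumptions: subdirectories are not searched, and ".PDF" counts as a PDF.
    """
    directory = Path(directory_path)
    if not directory.is_dir():
        raise NotADirectoryError(directory_path)
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix.lower() == ".pdf"
    )
```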

Usage examples

  1. Read an entire PDF:

    read_pdf_text("literature/some-paper.pdf")
    
  2. Read only specific pages:

    read_pdf_text("literature/some-paper.pdf", [1, 2, 3])
    
  3. Extract table data:

    read_pdf_text("literature/some-paper.pdf", extract_tables=True)
    
  4. Get PDF info:

    get_pdf_info("literature/some-paper.pdf")
    
  5. List all PDFs:

    list_pdfs_in_directory("literature")
    
