MCP Document Parse
Parses various document formats (PDF, Word, Excel, PowerPoint) into Markdown content using NiuTrans API, enabling extraction and reading of document text through natural language interactions.
MCP Document Parse Tool
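Project Overview
This is an MCP (Model Context Protocol) tool that parses documents in various formats (PDF, Word, Excel, PPT, and others) and returns their content. It provides a simple interface so that you can integrate document parsing into a variety of applications.
Supported File Formats
- PDF (.pdf) - both editable PDFs and scanned documents are supported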
- Word (.doc, .docx)
- Excel (.xls, .xlsx)
- PowerPoint (.ppt, .pptx)
Installation
Install and run the released version with uv:
uv tool install mcp-document-parse
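Environment Variables
- NIUTRANS_API_KEY (required): the API key for the document API provided by the NiuTrans Open Platform. It is free to use; log in to obtain it: https://niutrans.com/cloud/api/list
- NIUTRANS_DOCUMENT_APPID (required): the APPID for the document API provided by the NiuTrans Open Platform. It is free to use; log in to obtain it: https://niutrans.com/cloud/api/list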
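Before starting the server, it can help to confirm that both variables are actually set. The following is a minimal sketch of such a check; the variable names come from this README, while the helper name `missing_env_vars` is hypothetical and not part of the tool.

```python
import os

# Variable names are taken from this README; the helper itself is illustrative.
REQUIRED_VARS = ("NIUTRANS_API_KEY", "NIUTRANS_DOCUMENT_APPID")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```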
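Billing
This tool uses the document parsing API of the NiuTrans Open Platform, which is billed as follows:
| File type | Rate |
|---|---|
| PDF / Word / PPT | 1 page = 2 credits |
| Excel | 2000 characters = 2 credits |
💡 Free quota: the platform grants 100 free credits every day.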
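The rates above can be turned into a rough cost estimate. This sketch uses only the figures from the table; the function names are hypothetical, and rounding a partial block of Excel characters up to a full block is an assumption, since the README does not say how partial blocks are charged.

```python
# Rates from the billing table in this README.
CREDITS_PER_PAGE = 2           # PDF / Word / PPT: 1 page = 2 credits
EXCEL_CHARS_PER_BLOCK = 2000   # Excel: 2000 characters = 2 credits
CREDITS_PER_EXCEL_BLOCK = 2
DAILY_FREE_CREDITS = 100


def estimate_paged_credits(pages: int) -> int:
    """Estimated credits for a PDF, Word, or PPT file with the given page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * CREDITS_PER_PAGE


def estimate_excel_credits(characters: int) -> int:
    """Estimated credits for an Excel file with the given character count."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    # Assumption: a partial block of 2000 characters is billed as a full block.
    blocks = (characters + EXCEL_CHARS_PER_BLOCK - 1) // EXCEL_CHARS_PER_BLOCK
    return blocks * CREDITS_PER_EXCEL_BLOCK


def free_pages_per_day() -> int:
    """Number of PDF / Word / PPT pages covered by the daily free quota."""
    return DAILY_FREE_CREDITS // CREDITS_PER_PAGE
```

Under these figures, the daily free quota covers 50 pages of PDF, Word, or PPT content.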
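Requirements
- Python >= 3.9
- Dependencies are defined in pyproject.toml
MCP Client Configuration Example
If you installed the tool with uv tool install, you can configure it in mcp.json as follows: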
{
  "mcpServers": {
    "document_parse": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "mcp-document-parse"
      ],
      "env": {
        "NIUTRANS_API_KEY": "${env.NIUTRANS_API_KEY}",
        "NIUTRANS_DOCUMENT_APPID": "${env.NIUTRANS_DOCUMENT_APPID}"
      }
    }
  }
}
After starting an MCP-capable application, run ListTools to see the parse_document_by_path tool. ListResources is also supported and lets you read document://supported-types.
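If the tool does not show up in ListTools, a quick sanity check of the configuration file may help. The sketch below inspects a parsed mcp.json for the server entry and the two environment entries shown in the example; the function name `check_server_entry` is hypothetical, and the checks reflect only the layout used in this README, not a full schema for any particular MCP client.

```python
import json

# Environment entries expected by the example configuration in this README.
REQUIRED_ENV = ("NIUTRANS_API_KEY", "NIUTRANS_DOCUMENT_APPID")


def check_server_entry(config: dict, name: str = "document_parse") -> list:
    """Return a list of problems found in the server entry (empty if none)."""
    server = config.get("mcpServers", {}).get(name)
    if server is None:
        return [f"no server named {name!r} under 'mcpServers'"]
    problems = []
    if not server.get("command"):
        problems.append("missing 'command'")
    env = server.get("env", {})
    for key in REQUIRED_ENV:
        if key not in env:
            problems.append(f"missing env entry {key}")
    return problems


# The same configuration as in the example, parsed from a JSON string.
EXAMPLE = json.loads("""
{
  "mcpServers": {
    "document_parse": {
      "type": "stdio",
      "command": "uv",
      "args": ["tool", "run", "mcp-document-parse"],
      "env": {
        "NIUTRANS_API_KEY": "${env.NIUTRANS_API_KEY}",
        "NIUTRANS_DOCUMENT_APPID": "${env.NIUTRANS_DOCUMENT_APPID}"
      }
    }
  }
}
""")

print(check_server_entry(EXAMPLE))  # prints []
```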
Tools
parse_document_by_path
Converts the file at the given path to Markdown.
Parameters:
file_path (str): absolute path of the file; supported formats are pdf, doc, docx, xls, xlsx, ppt, and pptx
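A caller may want to check a path against these two requirements (absolute path, supported extension) before invoking the tool. This is a minimal client-side sketch, not the tool's own validation; `validate_file_path` is a hypothetical helper, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extensions listed in this README.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def validate_file_path(file_path: str) -> Optional[str]:
    """Return an error message if the path looks unusable, otherwise None."""
    path = Path(file_path)
    if not path.is_absolute():
        return "file_path must be an absolute path"
    # Assumption: the extension check ignores case (".PDF" is accepted).
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        return f"unsupported file extension: {path.suffix or '(none)'}"
    return None
```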
Returns:
- Success: {"status": "success", "text_content": "<file content>", "filename": "<file name>"}
- Failure: {"status": "error", "error": "<error message>"}
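Client code that consumes this result can branch on the status field. The sketch below assumes the result has already been decoded into a Python dict with the fields listed above; `extract_markdown` is a hypothetical helper, and raising exceptions on failure is one possible design, not something the tool prescribes.

```python
def extract_markdown(result: dict) -> str:
    """Return the Markdown text from a successful result, or raise on failure."""
    status = result.get("status")
    if status == "success":
        return result["text_content"]
    if status == "error":
        # Surface the tool's error message to the caller.
        raise RuntimeError(result.get("error", "unknown error"))
    raise ValueError(f"unexpected status: {status!r}")


if __name__ == "__main__":
    sample = {"status": "success", "text_content": "# Title", "filename": "a.pdf"}
    print(extract_markdown(sample))
```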
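document://supported-types
Returns information about the supported file types.
Returns:
- A JSON object containing the list of supported file types and their descriptions
License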
MIT License
Contact
For questions or suggestions, contact tianfengning@niutrans.com