MCP Document Parse

Parses documents in several formats (PDF, Word, Excel, PowerPoint) into Markdown using the NiuTrans API, so that document text can be extracted and read through natural language interactions.

MCP Document Parse Tool

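Project Overview

This is an MCP (Model Context Protocol) tool that parses documents in various formats (PDF, Word, Excel, PPT, and others) and returns their content. It provides a simple interface so that you can integrate document parsing into a variety of applications.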

Supported File Formats

  • PDF (.pdf) - both editable PDFs and scanned documents
  • Word (.doc, .docx)
  • Excel (.xls, .xlsx)
  • PowerPoint (.ppt, .pptx)

Installation

Install and launch the released version with uv

uv tool install mcp-document-parse

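Environment Variables

  • NIUTRANS_API_KEY (required): the API key for the document API on the NiuTrans open platform. It is free to use; log in to obtain it: https://niutrans.com/cloud/api/list
  • NIUTRANS_DOCUMENT_APPID (required): the APPID for the document API on the NiuTrans open platform. It is free to use; log in to obtain it: https://niutrans.com/cloud/api/list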
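
Because both variables are required, it can help to confirm they are set before launching an MCP client. The following is a minimal sketch, not part of the tool itself: the helper name missing_env_vars is hypothetical, and only the two variable names come from this README.

```python
import os

# The two variable names are taken from this README.
REQUIRED_VARS = ("NIUTRANS_API_KEY", "NIUTRANS_DOCUMENT_APPID")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

An empty value is treated the same as a missing one here, on the assumption that an empty key or APPID would be rejected by the API anyway.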

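Billing

This tool uses the document parsing API of the NiuTrans open platform. The billing rules are as follows:

File type            Billing rate
PDF / Word / PPT     1 page = 2 credits
Excel                2000 characters = 2 credits

💡 Free quota: the platform grants 100 credits every day, free of charge.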
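
The billing rules above can be turned into a rough cost estimate. The sketch below is illustrative only: the function name estimate_credits is hypothetical, and it assumes that a partially filled 2000-character block in an Excel file is billed as a full block, which this README does not state.

```python
import math

# Rates taken from the billing table in this README.
CREDITS_PER_PAGE = 2          # PDF / Word / PPT: 1 page = 2 credits
CREDITS_PER_EXCEL_BLOCK = 2   # Excel: 2000 characters = 2 credits
EXCEL_BLOCK_CHARS = 2000

PAGE_BILLED = {"pdf", "doc", "docx", "ppt", "pptx"}
CHAR_BILLED = {"xls", "xlsx"}


def estimate_credits(extension, pages=0, characters=0):
    """Estimate the credits needed to parse one file."""
    ext = extension.lower().lstrip(".")
    if ext in PAGE_BILLED:
        return pages * CREDITS_PER_PAGE
    if ext in CHAR_BILLED:
        # Assumption: partial blocks are rounded up to a whole block.
        blocks = math.ceil(characters / EXCEL_BLOCK_CHARS)
        return blocks * CREDITS_PER_EXCEL_BLOCK
    raise ValueError(f"unsupported file type: {extension!r}")


if __name__ == "__main__":
    print(estimate_credits("pdf", pages=10))
    print(estimate_credits("xlsx", characters=4500))
```

At 2 credits per page, the daily allowance of 100 credits covers 50 pages of PDF, Word, or PPT content.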

Requirements

  • Python >= 3.9
  • Dependencies are defined in pyproject.toml

MCP Client Configuration Example

If you installed the tool with uv tool install, you can configure it in mcp.json as follows:

{
  "mcpServers": {
    "document_parse": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "mcp-document-parse"
      ],
      "env": {
        "NIUTRANS_API_KEY": "${env.NIUTRANS_API_KEY}",
        "NIUTRANS_DOCUMENT_APPID": "${env.NIUTRANS_DOCUMENT_APPID}"
      }
    }
  }
}

After starting an MCP-capable application, run ListTools to see the parse_document_by_path tool. ListResources is also supported, for reading document://supported-types.

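Tools

parse_document_by_path

Converts the file at the given path to Markdown.

Parameters:

  • file_path (str): absolute path of the file; supported formats are pdf, doc, docx, xls, xlsx, ppt, and pptx

Returns:

  • Success: {"status": "success", "text_content": "file content", "filename": "file name"}
  • Failure: {"status": "error", "error": "error message"}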
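
A caller may want to check the path before invoking the tool and then branch on the status field of the result. The sketch below shows one way to do that. The helper names check_file_path and extract_markdown are hypothetical, and it assumes the result has already been decoded into a Python dict with the fields listed above; how the result actually arrives depends on your MCP client.

```python
import os

# Extensions listed for the file_path parameter in this README.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def check_file_path(file_path):
    """Return an error message, or None if the path looks acceptable."""
    if not os.path.isabs(file_path):
        return "file_path must be an absolute path"
    ext = os.path.splitext(file_path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported file type: {ext or '(none)'}"
    return None


def extract_markdown(result):
    """Return the Markdown text on success; raise RuntimeError on failure."""
    if result.get("status") == "success":
        return result["text_content"]
    raise RuntimeError(result.get("error", "unknown error"))


if __name__ == "__main__":
    # Hand-written sample result, not real tool output.
    sample = {"status": "success", "text_content": "# Title", "filename": "report.pdf"}
    print(extract_markdown(sample))
```

Checking the path locally avoids spending a tool call on a request that cannot succeed, though the tool remains the authority on what it accepts.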

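document://supported-types

Returns information about the supported file types.

Returns:

  • A JSON object containing the list of supported file types and their descriptions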

License

MIT License

Contact

For questions or suggestions, please contact tianfengning@niutrans.com
