doc-mcp-server

Enables AI to efficiently analyze and extract structured data from complex documents, especially Excel files, by providing tools like section reading and field mapping to reduce token usage and improve success rates.

📄 Document Analyzer MCP Server

PyPI version | License: MIT | Python 3.10+ | MCP

Help AI understand complex documents: an MCP server that works around AI context limitations


🎯 Key Features

  • ✅ Smart Document Analysis - Auto-detect sections, handle merged cells
  • ✅ Multi-format Support - Excel (.xlsx, .xls) | PDF/Word in development
  • ✅ Precise Field Mapping - Field mapping table + section-level reading
  • ✅ High Performance - Structured caching + lazy loading
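
Merged cells are a common obstacle when reading spreadsheets programmatically, because a merged range typically stores its value only in the top-left cell. One common way to handle this, sketched below, is to copy that value across the whole range before extracting fields. This is an illustration under assumed data shapes (a list-of-lists grid and zero-based, inclusive ranges), not the server's actual implementation; fill_merged_cells is a hypothetical name and the sample grid is made up.

```python
def fill_merged_cells(grid, merged_ranges):
    """Return a copy of `grid` with each merged range filled from its top-left cell.

    `merged_ranges` holds (first_row, first_col, last_row, last_col) tuples,
    zero-based and inclusive. Both shapes are assumptions for this sketch.
    """
    filled = [row[:] for row in grid]  # copy so the caller's grid is untouched
    for first_row, first_col, last_row, last_col in merged_ranges:
        value = grid[first_row][first_col]
        for r in range(first_row, last_row + 1):
            for c in range(first_col, last_col + 1):
                filled[r][c] = value
    return filled


# Made-up sample: row 0 is one merged header cell spanning all three columns.
grid = [
    ["Section 1", None, None],
    ["Company", "Acme", None],
]
print(fill_merged_cells(grid, [(0, 0, 0, 2)]))
```

After filling, every cell in the header row carries the section name, so a section can be located by scanning any column.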

🚀 Quick Start

Installation

macOS / Linux (Recommended with pipx)

# Install pipx
brew install pipx  # macOS
# or sudo apt install pipx  # Ubuntu/Debian

# Install doc-mcp-server
pipx install doc-mcp-server

Windows

pip install doc-mcp-server

For more installation options, see Full Installation Guide

Configure Claude Code

Add to ~/.claude.json or your project's config file:

{
  "mcpServers": {
    "document-analyzer": {
      "command": "doc-mcp-server"
    }
  }
}
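
If you would rather script this change than edit the file by hand, the entry can be merged into an existing JSON config. The sketch below is a minimal illustration that assumes the config file is a single JSON object with a top-level mcpServers key, as in the snippet above; add_mcp_server is a hypothetical helper name, not part of doc-mcp-server.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command):
    """Merge one MCP server entry into a JSON config file (hypothetical helper)."""
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, so other settings survive.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server("~/.claude.json", "document-analyzer", "doc-mcp-server") would produce the same entry as the manual configuration. Back up the file first, since the helper rewrites it in place.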

For detailed configuration, see Quick Start Guide

📚 Full Documentation

💡 Usage Example

# 1. Analyze document structure
analyze_document(file_path="/path/to/document.xlsx")

# 2. Read specific section
read_section(file_path="/path/to/document.xlsx", section_name="Section 1")

# 3. Read single field
read_field(file_path="/path/to/document.xlsx", field_key="Section1_CompanyName")
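
The calls above are written in shorthand; an MCP client issues them as JSON-RPC 2.0 tools/call requests. The sketch below builds such a request for read_field, mainly to show how the tool name and arguments are packaged. make_tool_call is a hypothetical helper, and the file path and field key are the placeholder values from the example, not real data.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request in a session needs its own id


def make_tool_call(tool_name, **arguments):
    """Build a JSON-RPC 2.0 'tools/call' request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = make_tool_call(
    "read_field",
    file_path="/path/to/document.xlsx",
    field_key="Section1_CompanyName",
)
print(json.dumps(request, indent=2))
```

In normal use the MCP client (for example Claude Code) builds and sends these messages itself, so this is only useful for understanding or debugging the traffic.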

đŸ› ī¸ Available Tools

Tool              Description
analyze_document  Analyze document structure and generate metadata
get_structure     Get cached document structure
read_field        Read specific field value
read_section      Read entire section data
write_field       Write field value (Excel only)
list_sections     List all sections
list_fields       List all fields
export_structure  Export document structure

🎯 Why Use This?

Problem: Large Excel files consume a massive number of tokens when an AI reads them directly

  • ❌ Traditional: Read the entire 323-row Excel file → 15000+ tokens → Often fails
  • ✅ Using MCP: Structured reading → 2000 tokens → 90%+ success rate

Performance Improvements:

  • 🚀 Token consumption reduced by 87% (15000 → 2000)
  • ✅ Success rate improved from 30% to 90%+
  • ⚡ Handles 323 rows × 24 columns with 4249 merged cells
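
The 87% figure follows directly from the two token counts quoted above. As a quick check, using only those numbers (reduction_percent is a hypothetical helper name):

```python
def reduction_percent(before, after):
    """Percentage saved when a quantity drops from `before` to `after`."""
    return (before - after) / before * 100


saving = reduction_percent(15000, 2000)
print(f"{saving:.1f}%")  # prints "86.7%", which rounds to the quoted 87%
```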

🤝 Contributing & Feedback


📄 License

MIT License - see LICENSE for details


Made with ❤️ by Yang Jiahui
