docsray-mcp

An MCP server that provides AI assistants with advanced document perception capabilities including text extraction, structure analysis, and deep content understanding through multiple tools and providers.

🔍 Docsray MCP Server

Docsray is a powerful Model Context Protocol (MCP) server that gives AI assistants like Claude advanced document perception capabilities. Extract text, navigate pages, analyze structure, and understand any document with ease.

✅ Status: Published to PyPI and TestPyPI - Working in Cursor, Claude Desktop, and other MCP clients

✨ Features

🎯 Five Powerful Tools

docsray_peek - Quick document overview with format detection and provider capabilities
docsray_map - Generate comprehensive document structure maps with caching
docsray_xray - AI-powered deep analysis extracting entities, relationships, and insights
docsray_extract - Extract content in multiple formats (markdown, text, JSON, tables)
docsray_seek - Navigate to specific pages, sections, or search for content

🔌 Multi-Provider Architecture

PyMuPDF4LLM - Lightning-fast PDF processing (✅ Implemented)
- Fast markdown extraction
- Basic table detection
- Multi-page support
- Always enabled as fallback
LlamaParse - Deep document understanding with LLMs (✅ Implemented)
- AI-powered entity extraction
- Custom analysis instructions
- Comprehensive caching in .docsray directories
- Rich format preservation (markdown, images, tables)
PyTesseract - OCR for scanned documents (🔄 Planned)
Mistral OCR - AI-powered OCR and analysis (🔄 Planned)
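
The .docsray caching mentioned above can be pictured with a minimal sketch. The file layout, the key derivation, and the function names (`cache_path`, `store`, `load_cached`) are assumptions made for illustration and are not taken from the Docsray source; the idea is only that a result is stored next to the document and keyed by its content hash, so a changed file never reuses a stale entry.

```python
import hashlib
import json
import time
from pathlib import Path


def cache_path(document: Path) -> Path:
    """Hypothetical layout: <document dir>/.docsray/<stem>.<content hash>.json"""
    digest = hashlib.sha256(document.read_bytes()).hexdigest()[:16]
    return document.parent / ".docsray" / f"{document.stem}.{digest}.json"


def store(document: Path, result: dict, now=None) -> Path:
    """Write a parse result next to the document, stamped with the store time."""
    path = cache_path(document)
    path.parent.mkdir(exist_ok=True)
    stamp = time.time() if now is None else now
    path.write_text(json.dumps({"stored_at": stamp, "result": result}), encoding="utf-8")
    return path


def load_cached(document: Path, ttl_seconds: float = 3600, now=None):
    """Return the cached result, or None when it is missing or older than the TTL."""
    path = cache_path(document)
    if not path.exists():
        return None
    entry = json.loads(path.read_text(encoding="utf-8"))
    current = time.time() if now is None else now
    if current - entry["stored_at"] > ttl_seconds:
        return None
    return entry["result"]
```

Hashing the file contents rather than the path means a renamed copy still hits the cache while an edited file does not.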

🚀 Key Benefits

Universal Input Support - Local files (./path, ../path, /absolute) and URLs (https://)
Intelligent Provider Selection - Automatically chooses the best tool for each task
Smart Caching - LlamaParse results cached in .docsray directories for instant access
Dynamic Discovery - Tools report actual capabilities based on what's enabled
Production Ready - Comprehensive error handling, logging, and unit and integration test suites
Self-Documenting - Built-in resources for discovery by MCP clients
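
As a rough illustration of Universal Input Support, the sketch below separates URL inputs from local paths. The function names `classify_source` and `resolve_local` are hypothetical, and accepting plain http in addition to https is an assumption; Docsray's own handling may differ.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return "url" for http(s) inputs and "file" for everything else."""
    scheme = urlparse(source).scheme.lower()
    # Assumption: plain http is treated like https.
    if scheme in ("http", "https"):
        return "url"
    return "file"


def resolve_local(source: str, base_dir: Path) -> Path:
    """Resolve ./relative, ../relative, and /absolute paths against base_dir."""
    path = Path(source).expanduser()
    if path.is_absolute():
        return path
    return (base_dir / path).resolve()
```

For example, `classify_source("./document.pdf")` yields `"file"`, while the arXiv link used later in this README is classified as `"url"`.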

📦 Installation

Quick Start with uvx (Recommended)

# Run directly without installation
uvx docsray-mcp start

# Or install globally
uv tool install docsray-mcp
# Then run with:
docsray start
# or
docsray-mcp start

Alternative: Install with pip

# Basic installation (PyMuPDF4LLM only)
pip install docsray-mcp

# With LlamaParse for AI analysis
pip install "docsray-mcp[ai]"

# Development installation
pip install -e ".[dev]"

🚀 Quick Start

1. Set up API Keys (Optional but Recommended)

Create a .env file in your project:

# For AI-powered analysis with LlamaParse
LLAMAPARSE_API_KEY=llx-your-key-here

# Or, instead of a .env file, export the variable in your shell
export LLAMAPARSE_API_KEY=llx-your-key-here

Get your free LlamaParse API key at cloud.llamaindex.ai
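
To make the lookup order concrete, here is a minimal sketch of how a key could be resolved from either source. The helper names (`parse_env_file`, `get_llamaparse_key`) and the rule that a real environment variable wins over the .env file are assumptions for illustration, not Docsray's documented behavior.

```python
from __future__ import annotations

import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines; comments and blank lines are skipped."""
    values: dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_llamaparse_key(env_file: Path = Path(".env")) -> str | None:
    """Assumed precedence: process environment first, then the .env file."""
    from_env = os.environ.get("LLAMAPARSE_API_KEY")
    if from_env:
        return from_env
    return parse_env_file(env_file).get("LLAMAPARSE_API_KEY")
```

When neither source provides a key, the function returns `None`, which matches the README's note that the key is optional.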

2. Configure with Your MCP Client

For Cursor

Add to your Cursor settings:

{
  "mcpServers": {
    "docsray": {
      "command": "uvx",
      "args": ["docsray-mcp", "start"],
      "env": {
        "LLAMAPARSE_API_KEY": "llx-your-key-here"
      }
    }
  }
}

For Claude Desktop

On macOS, add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "docsray": {
      "command": "uvx",
      "args": ["docsray-mcp", "start"],
      "env": {
        "LLAMAPARSE_API_KEY": "llx-your-key-here"
      }
    }
  }
}
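
If the client config file already lists other servers, editing it by hand risks clobbering them. The sketch below merges one entry under `mcpServers` and leaves everything else untouched. `add_server` is a hypothetical helper, not part of Docsray or of any MCP client.

```python
import json
from pathlib import Path


def add_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or replace one server entry, keeping all other settings."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # Create the mcpServers object only if the file does not have one yet.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Pass the `docsray` object from the JSON above as `entry`, and back up the config file before running a script like this against it.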

📚 Usage Examples

Basic Document Overview

Peek at ./document.pdf to see its structure and available formats

Extract Entities from Contracts

Xray ./contract.pdf and extract all parties, dates, payment terms, and obligations

Navigate Documents

Map the complete structure of ./manual.pdf including all sections and subsections

Extract Specific Content

Extract pages 10-20 from ./report.pdf as markdown

Analyze Web Documents

Analyze https://arxiv.org/pdf/2301.00234.pdf for methodology and key findings

Compare Providers

Extract text from document.pdf with provider pymupdf4llm (fast)
Xray document.pdf with provider llama-parse (AI analysis)
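
The prompts above are natural language; the MCP client decides which tool to call and with which arguments. As a hedged illustration, the dictionary below shows arguments an assistant might plausibly send for four of these prompts, using the parameter names from the API Reference in this README. The exact mapping is an assumption, as is the 1-based, inclusive reading of "pages 10-20"; `page_range` is a hypothetical helper.

```python
def page_range(first: int, last: int) -> list:
    """Inclusive page list, assuming 1-based page numbers."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return list(range(first, last + 1))


# Plausible tool arguments for the example prompts (illustrative only).
EXAMPLE_CALLS = {
    "docsray_peek": {
        "document_url": "./document.pdf",
        "depth": "structure",
    },
    "docsray_xray": {
        "document_url": "./contract.pdf",
        "analysis_type": ["entities"],
        "custom_instructions": "Extract all parties, dates, payment terms, and obligations",
        "provider": "llama-parse",
    },
    "docsray_map": {
        "document_url": "./manual.pdf",
        "analysis_depth": "comprehensive",
    },
    "docsray_extract": {
        "document_url": "./report.pdf",
        "pages": page_range(10, 20),
        "output_format": "markdown",
    },
}
```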

🛠️ Advanced Configuration

Environment Variables

# Provider Configuration
DOCSRAY_PYMUPDF4LLM_ENABLED=true  # Enabled by default
DOCSRAY_LLAMAPARSE_ENABLED=true
LLAMAPARSE_API_KEY=llx-your-key

# Performance Tuning
DOCSRAY_CACHE_ENABLED=true
DOCSRAY_CACHE_TTL=3600
DOCSRAY_MAX_CONCURRENT_REQUESTS=5
DOCSRAY_TIMEOUT_SECONDS=30

# Logging
DOCSRAY_LOG_LEVEL=INFO
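
A minimal sketch of reading these variables into a typed settings object follows. The `Settings` class and `load_settings` function are hypothetical, and treating the values shown above as defaults (and the TTL and timeout as seconds) is an assumption rather than documented behavior.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    llamaparse_enabled: bool
    cache_enabled: bool
    cache_ttl: int  # assumed to be seconds
    max_concurrent_requests: int
    timeout_seconds: int
    log_level: str


def load_settings(env: Mapping[str, str] | None = None) -> Settings:
    """Build Settings from a mapping, falling back to the assumed defaults."""
    env = os.environ if env is None else env
    return Settings(
        llamaparse_enabled=_as_bool(env.get("DOCSRAY_LLAMAPARSE_ENABLED", "true")),
        cache_enabled=_as_bool(env.get("DOCSRAY_CACHE_ENABLED", "true")),
        cache_ttl=int(env.get("DOCSRAY_CACHE_TTL", "3600")),
        max_concurrent_requests=int(env.get("DOCSRAY_MAX_CONCURRENT_REQUESTS", "5")),
        timeout_seconds=int(env.get("DOCSRAY_TIMEOUT_SECONDS", "30")),
        log_level=env.get("DOCSRAY_LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to unit test.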

Provider Capabilities

PyMuPDF4LLM (Always Available)

✅ Fast text extraction
✅ Markdown formatting
✅ Basic table detection
✅ Multi-page support
❌ No AI analysis
❌ No OCR

LlamaParse (When API Key Configured)

✅ AI-powered analysis
✅ Entity extraction
✅ Custom instructions
✅ Table extraction
✅ Image extraction
✅ Layout preservation
✅ Relationship mapping
✅ Result caching
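
The capability lists above suggest how `"provider": "auto"` could be resolved. The sketch below is one possible reading, not the actual Docsray logic: the capability labels, the `select_provider` function, and the cheapest-first preference order are all assumptions. It picks the first enabled provider that covers every requested capability and otherwise falls back to PyMuPDF4LLM, which this README describes as always enabled.

```python
from __future__ import annotations

FALLBACK = "pymupdf4llm"

# Hypothetical capability labels derived from the lists in this README.
CAPABILITIES = {
    "pymupdf4llm": {"text", "markdown", "tables", "multi-page"},
    "llama-parse": {
        "text", "markdown", "tables", "images", "layout",
        "entities", "relationships", "custom-instructions", "ai-analysis",
    },
}

# Assumed preference: try the fast local provider before the API-backed one.
PREFERENCE = ["pymupdf4llm", "llama-parse"]


def enabled_providers(llamaparse_key: str | None) -> list[str]:
    """LlamaParse counts as enabled only when an API key is configured."""
    return [n for n in PREFERENCE if n == FALLBACK or bool(llamaparse_key)]


def select_provider(needed: set[str], llamaparse_key: str | None = None,
                    requested: str = "auto") -> str:
    enabled = enabled_providers(llamaparse_key)
    if requested != "auto":
        if requested not in enabled:
            raise ValueError(f"provider {requested!r} is not enabled")
        return requested
    for name in enabled:
        if set(needed) <= CAPABILITIES[name]:
            return name
    return FALLBACK
```

With this rule, plain text extraction stays on PyMuPDF4LLM even when a key is present, and entity extraction moves to LlamaParse only when it is available.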

🧪 Testing

# Run all tests
pytest tests/

# Run only unit tests (no API calls)
pytest tests/unit/

# Run integration tests
pytest tests/integration/

# Run with coverage
pytest tests/ --cov=src/docsray --cov-report=html

Current test status: 52 tests passing across the unit and integration suites

📖 API Reference

Tool: docsray_peek

Get quick document overview and metadata.

{
  "document_url": "path/to/document.pdf",
  "depth": "structure",
  "provider": "auto"
}

Allowed values: depth is one of metadata, structure, preview; provider is one of auto, pymupdf4llm, llama-parse.

Tool: docsray_map

Generate comprehensive document structure map.

{
  "document_url": "path/to/document.pdf",
  "include_content": false,
  "analysis_depth": "deep",
  "provider": "auto"
}

Allowed values: analysis_depth is one of basic, deep, comprehensive.

Tool: docsray_xray

Deep AI-powered document analysis.

{
  "document_url": "path/to/document.pdf",
  "analysis_type": ["entities", "key-points"],
  "custom_instructions": "Extract all dates and amounts",
  "provider": "llama-parse"
}

Tool: docsray_extract

Extract content in various formats.

{
  "document_url": "path/to/document.pdf",
  "extraction_targets": ["text", "tables"],
  "output_format": "markdown",
  "pages": [1, 2, 3],
  "provider": "auto"
}

Allowed values: output_format is one of markdown, text, json. The pages field is optional and selects specific pages.

Tool: docsray_seek

Navigate to specific document locations.

{
  "document_url": "path/to/document.pdf",
  "target": {"page": 5},
  "extract_content": true,
  "provider": "auto"
}

The target field can instead be {"section": "Introduction"} or {"query": "search text"}.
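
MCP clients normally build requests for you, but it can help to see what one of these tool invocations looks like on the wire. Under the MCP specification, a tool is invoked with a JSON-RPC 2.0 request whose method is `tools/call`, with the tool name and arguments under `params`. The helper `tool_call_request` below is a hypothetical illustration and is not part of Docsray.

```python
import json


def tool_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Arguments follow the docsray_seek parameters documented above.
    print(tool_call_request(1, "docsray_seek", {
        "document_url": "path/to/document.pdf",
        "target": {"page": 5},
        "extract_content": True,
        "provider": "auto",
    }))
```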

🏗️ Architecture

docsray-mcp/
├── src/docsray/
│   ├── server.py           # FastMCP server with discovery resources
│   ├── providers/          # Provider implementations
│   │   ├── base.py        # Provider interface
│   │   ├── pymupdf4llm.py # Fast PDF extraction
│   │   └── llamaparse.py  # AI-powered analysis
│   ├── tools/             # MCP tool implementations
│   │   ├── peek.py        # Document overview
│   │   ├── map.py         # Structure mapping
│   │   ├── xray.py        # Deep analysis
│   │   ├── extract.py     # Content extraction
│   │   └── seek.py        # Navigation
│   └── utils/             # Utilities
│       ├── cache.py       # Document caching
│       └── llamaparse_cache.py  # LlamaParse .docsray cache
├── tests/
│   ├── unit/              # Fast isolated tests
│   ├── integration/       # Component interaction tests
│   └── manual/            # Debugging scripts
└── PROMPTS.md            # Example prompts for all use cases
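
The tree lists providers/base.py as the provider interface. The sketch below shows what such an interface could look like; it is an assumption for illustration only. The class names, method names, and signatures are hypothetical and are not copied from the Docsray source.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical base class that every provider would implement."""

    name: str = "base"

    @abstractmethod
    def capabilities(self) -> set[str]:
        """Report what this provider can do, for dynamic discovery."""

    @abstractmethod
    def extract(self, document_url: str, pages: list[int] | None = None) -> str:
        """Return extracted content for the whole document or selected pages."""


class UppercaseProvider(Provider):
    """Toy provider used only to exercise the interface."""

    name = "uppercase"

    def capabilities(self) -> set[str]:
        return {"text"}

    def extract(self, document_url: str, pages: list[int] | None = None) -> str:
        # A real provider would open and parse the document here.
        return document_url.upper()


class ProviderRegistry:
    """Maps provider names to instances so tools can look them up."""

    def __init__(self) -> None:
        self._providers: dict[str, Provider] = {}

    def register(self, provider: Provider) -> None:
        self._providers[provider.name] = provider

    def get(self, name: str) -> Provider:
        return self._providers[name]

    def names(self) -> list[str]:
        return sorted(self._providers)
```

Keeping tools dependent only on the abstract interface is what would let planned providers such as PyTesseract be added without touching the tool code.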

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Development Setup

# Clone the repository
git clone https://github.com/docsray/docsray-mcp.git
cd docsray-mcp

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run linting
ruff check src/

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🙏 Acknowledgments

Built on FastMCP framework
Document processing powered by PyMuPDF4LLM
AI analysis powered by LlamaParse
Inspired by the Model Context Protocol specification

📬 Support

Made with ❤️ for the MCP ecosystem
