DocNav MCP Server

DocNav is a Model Context Protocol (MCP) server which empowers LLM Agents to read, analyze, and manage lengthy documents intelligently, mimicking human-like comprehension and navigation capabilities.

Features

Document Navigation: Navigate through document sections, headings, and content structure
Content Extraction: Extract and summarize specific document sections
Search & Query: Find specific content within documents using intelligent search
Multi-format Support: Currently supports Markdown (.md) files, with planned support for PDF and other formats
MCP Integration: Seamless integration with MCP-compatible LLMs and applications

Architecture

DocNav follows a modular, extensible architecture:

Core MCP Server: Main server implementation using the MCP protocol
Document Processors: Pluggable processors for different file types
Navigation Engine: Handles document structure analysis and navigation
Content Extractors: Extract and format content from documents
Search Engine: Provides search and query capabilities across documents

Installation

Prerequisites

Python 3.10+
uv package manager

Setup

Clone the repository:

git clone https://github.com/shenyimings/DocNav-MCP.git
cd DocNav-MCP

Install dependencies:

uv sync

Usage

Starting the MCP Server

uv run server.py

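Connecting to the MCP Server

Add the following entry to your MCP client's configuration. Replace {{PATH_TO_UV}} with the output of `which uv` and {{PATH_TO_SRC}} with the directory that contains server.py:

{
  "mcpServers": {
    "docnav": {
      "command": "{{PATH_TO_UV}}",
      "args": [
        "--directory",
        "{{PATH_TO_SRC}}",
        "run",
        "server.py"
      ]
    }
  }
}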
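
If you would rather generate this entry than edit it by hand, the following is a minimal sketch using only the Python standard library. The build_config helper is a hypothetical name and is not part of DocNav; it looks up uv with shutil.which and resolves the source directory to an absolute path.

```python
import json
import shutil
from pathlib import Path
from typing import Optional


def build_config(src_dir: str, uv_path: Optional[str] = None) -> dict:
    """Build an mcpServers entry for DocNav (hypothetical helper)."""
    uv = uv_path or shutil.which("uv")
    if uv is None:
        raise FileNotFoundError("uv was not found on PATH")
    return {
        "mcpServers": {
            "docnav": {
                "command": uv,
                # Same arguments as the hand-written entry.
                "args": ["--directory", str(Path(src_dir).resolve()), "run", "server.py"],
            }
        }
    }


if __name__ == "__main__":
    # Falls back to the bare command name if uv is not on PATH.
    config = build_config(".", uv_path=shutil.which("uv") or "uv")
    print(json.dumps(config, indent=2))
```

Run it from the DocNav-MCP directory and paste the printed JSON into your client's configuration file.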

Available Tools

load_document: Load a document for navigation and analysis
  - Args: file_path (path to the document file)
  - Returns: Success message with the auto-generated document ID
get_outline: Get the document outline/table of contents
  - Args: doc_id (document identifier), max_depth (maximum heading depth, default 3)
  - Returns: Formatted document outline
  - Tip: Call this first after loading a document to understand its structure
read_section: Read the content of a specific document section
  - Args: doc_id (document identifier), section_id (e.g., 'h1_0', 'h2_1')
  - Returns: Section content, including subsections
search_document: Search for specific content within a document
  - Args: doc_id (document identifier), query (search term or phrase)
  - Returns: Formatted search results with context
navigate_section: Get navigation context for a section
  - Args: doc_id (document identifier), section_id (section to navigate to)
  - Returns: Navigation context with parent, siblings, and children
list_documents: List all currently loaded documents
  - Returns: List of loaded documents with metadata
get_document_stats: Get statistics about a loaded document
  - Args: doc_id (document identifier)
  - Returns: Document statistics and structure info
remove_document: Remove a document from the navigator
  - Args: doc_id (document identifier)
  - Returns: Success or error message
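
To make the section IDs, the max_depth argument, and the navigation context more concrete, here is a small standalone sketch of how a Markdown outline could be built. It is not DocNav's implementation: the Section class, the three helper functions, and the numbering scheme (h<level>_<n>, where n counts headings of that level from 0 in document order) are assumptions chosen to match the 'h1_0' and 'h2_1' examples above.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Section:
    section_id: str
    level: int
    title: str
    parent: Optional["Section"] = field(default=None, repr=False)
    children: List["Section"] = field(default_factory=list)


def parse_outline(markdown: str) -> List[Section]:
    """Return sections in document order, linked to parents and children."""
    sections: List[Section] = []
    counters: Dict[int, int] = {}  # headings seen so far, per level
    stack: List[Section] = []      # currently open ancestors, shallowest first
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        index = counters.get(level, 0)
        counters[level] = index + 1
        # Close every open section at the same or a deeper level.
        while stack and stack[-1].level >= level:
            stack.pop()
        parent = stack[-1] if stack else None
        section = Section(f"h{level}_{index}", level, match.group(2), parent)
        if parent is not None:
            parent.children.append(section)
        sections.append(section)
        stack.append(section)
    return sections


def format_outline(sections: List[Section], max_depth: int = 3) -> str:
    """Render an indented outline, hiding headings deeper than max_depth."""
    return "\n".join(
        f"{'  ' * (s.level - 1)}[{s.section_id}] {s.title}"
        for s in sections
        if s.level <= max_depth
    )


def navigation_context(section: Section, sections: List[Section]) -> dict:
    """Collect the IDs of a section's parent, siblings, and children."""
    if section.parent is not None:
        peers = section.parent.children
    else:
        peers = [s for s in sections if s.parent is None]
    return {
        "parent": section.parent.section_id if section.parent else None,
        "siblings": [s.section_id for s in peers if s is not section],
        "children": [c.section_id for c in section.children],
    }
```

The sketch ignores fenced code blocks, so a line beginning with # inside a code sample would be misread as a heading; a real processor would need to handle that case.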

Example Usage

# Load a document; the returned message includes the auto-generated document ID
result = await tools.load_document("path/to/document.md")

# Get the document outline (doc_id is the ID reported by load_document)
outline = await tools.get_outline(doc_id)

# Read a specific section (section_id is an ID such as 'h1_0')
section = await tools.read_section(doc_id, section_id)

# Search within the document
results = await tools.search_document(doc_id, "search query")
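
The tools object above is shorthand. From a standalone script, comparable calls can be made over stdio with the official MCP Python SDK (the mcp package, which must be installed separately). The sketch below assumes uv is on PATH and that both paths given on the command line exist. The tool name and the file_path argument come from the Available Tools list; the script layout and the main function are illustrative only.

```python
import asyncio
import sys


async def main(src_dir: str, file_path: str) -> None:
    # Imported here so the file can be loaded without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Same launch command as the client configuration:
    # uv --directory <src_dir> run server.py
    params = StdioServerParameters(
        command="uv", args=["--directory", src_dir, "run", "server.py"]
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Ask the server which tools it exposes.
            tools = await session.list_tools()
            print("Tools:", [tool.name for tool in tools.tools])

            # Load a document and print the server's reply.
            result = await session.call_tool("load_document", {"file_path": file_path})
            for item in result.content:
                # Text results carry a .text attribute; print anything else as-is.
                print(getattr(item, "text", item))


# Only runs when both paths are supplied:
#   python client.py /path/to/DocNav-MCP /absolute/path/to/document.md
if __name__ == "__main__" and len(sys.argv) == 3:
    asyncio.run(main(sys.argv[1], sys.argv[2]))
```

Because uv is started with --directory, the server most likely resolves relative paths against the source directory, so an absolute file_path is the safer choice.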

Development

Project Structure

docnav-mcp/
├── server.py                 # Main MCP server
├── docnav/
│   ├── __init__.py           # Package initialization
│   ├── models.py             # Data models
│   ├── navigator.py          # Document navigation engine
│   └── processors/
│       ├── __init__.py       # Processor package
│       ├── base.py           # Base processor interface
│       └── markdown.py       # Markdown processor
└── tests/
    └── ...                   # Test files

Development Guidelines

See CLAUDE.md for detailed development guidelines including:

Code quality standards
Testing requirements
Package management with uv
Formatting and linting rules

Adding New Document Processors

Create a new processor class inheriting from BaseProcessor
Implement the required methods: can_process, process, extract_section, search
Register the processor in the DocumentNavigator
Add comprehensive tests
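
The steps above might look roughly like the following sketch. The BaseProcessor shown here is a stand-in defined only for this example: the four method names come from the list above, but their signatures, the PlainTextProcessor class, the "p0", "p1" section IDs, and the pick_processor function are all assumptions. Check docnav/processors/base.py for the real interface before copying anything.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional


class BaseProcessor(ABC):
    """Stand-in for DocNav's base class; the real signatures may differ."""

    @abstractmethod
    def can_process(self, file_path: Path) -> bool: ...

    @abstractmethod
    def process(self, file_path: Path) -> List[str]: ...

    @abstractmethod
    def extract_section(self, sections: List[str], section_id: str) -> str: ...

    @abstractmethod
    def search(self, sections: List[str], query: str) -> List[str]: ...


class PlainTextProcessor(BaseProcessor):
    """Hypothetical processor: each blank-line-separated paragraph is a section."""

    def can_process(self, file_path: Path) -> bool:
        return file_path.suffix.lower() == ".txt"

    def process(self, file_path: Path) -> List[str]:
        text = file_path.read_text(encoding="utf-8")
        return [p.strip() for p in text.split("\n\n") if p.strip()]

    def extract_section(self, sections: List[str], section_id: str) -> str:
        # Section IDs here are "p0", "p1", ... (an invented scheme).
        return sections[int(section_id[1:])]

    def search(self, sections: List[str], query: str) -> List[str]:
        # Case-insensitive substring match; returns the IDs of matching sections.
        needle = query.lower()
        return [f"p{i}" for i, s in enumerate(sections) if needle in s.lower()]


def pick_processor(
    processors: List[BaseProcessor], file_path: Path
) -> Optional[BaseProcessor]:
    """Return the first registered processor that accepts the file, if any."""
    return next((p for p in processors if p.can_process(file_path)), None)
```

Dispatching through can_process keeps the navigator independent of any one file format, which is presumably what lets new processors be added without touching existing ones.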

Running Tests

# Run all tests
uv run tests/run_tests.py

Code Quality

# Format code
uv run --frozen ruff format .

# Check linting
uv run --frozen ruff check .

# Type checking
uv run --frozen pyright

Roadmap

[x] Complete Markdown processor implementation
[x] Add PDF document support (PyMuPDF)
[x] Improve test coverage and quality
[ ] Implement advanced search capabilities
[ ] Add document summarization features
[ ] Support for additional document formats (DOCX, TXT, etc.)
[ ] Performance optimizations for large documents
[ ] Caching mechanisms for frequently accessed documents
[ ] Add persistent storage for loaded documents

Contributing

Fork the repository
Create a feature branch
Follow the development guidelines in CLAUDE.md
Add tests for new functionality
Submit a pull request

License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.

Support

For issues and questions:

Open an issue on GitHub
Check the documentation in CLAUDE.md
Review existing issues and discussions
