Document Processing Server

Reads, converts, and manipulates documents in several formats (PDF, DOCX, TXT, HTML, CSV, Excel), with additional tools for text and HTML processing.

cablate

Digital Note Management
Content Fetching

Tools

document_reader

Read content from non-image document files at the specified paths. Supported formats: .pdf, .docx, .txt, .html, .csv

pdf_merger

Merge multiple PDF files into one

pdf_splitter

Split a PDF file into multiple files

docx_to_pdf

Convert DOCX files to PDF format

docx_to_html

Convert DOCX to HTML while preserving formatting

html_cleaner

Clean HTML by removing unnecessary tags and attributes

html_to_text

Convert HTML to plain text while preserving structure

html_to_markdown

Convert HTML to Markdown format

html_extract_resources

Extract all resources (images, videos, links) from HTML
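As a rough illustration of what this kind of extraction involves, the sketch below collects image, video, and link URLs from an HTML string with Python's standard `html.parser`. It is not the server's implementation, which this listing does not describe; the names `ResourceCollector` and `extract_resources` and the shape of the result are assumptions made for the example.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Hypothetical helper: gathers URLs of images, videos, and links."""

    def __init__(self):
        super().__init__()
        self.resources = {"images": [], "videos": [], "links": []}
        self._in_video = False  # True while inside a <video> element

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img" and attr_map.get("src"):
            self.resources["images"].append(attr_map["src"])
        elif tag == "video":
            self._in_video = True
            if attr_map.get("src"):
                self.resources["videos"].append(attr_map["src"])
        elif tag == "source" and self._in_video and attr_map.get("src"):
            # <source> children of <video> carry the actual media URL
            self.resources["videos"].append(attr_map["src"])
        elif tag == "a" and attr_map.get("href"):
            self.resources["links"].append(attr_map["href"])

    def handle_endtag(self, tag):
        if tag == "video":
            self._in_video = False


def extract_resources(html):
    """Return a dict of resource URLs found in the given HTML string."""
    collector = ResourceCollector()
    collector.feed(html)
    collector.close()
    return collector.resources
```

The real tool may report more detail (for example alt text or link titles); the sketch only shows the general idea of walking the tags and reading `src` and `href` attributes.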

html_formatter

Format and beautify HTML code

text_diff

Compare two text files and show differences
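To make the idea of a text diff concrete, here is a minimal sketch using Python's standard `difflib`. It produces a unified diff, which is one common output format; the listing does not say which format the server emits, and the function name `unified_text_diff` is hypothetical.

```python
import difflib


def unified_text_diff(old_text, new_text, old_name="old.txt", new_name="new.txt"):
    """Return a unified diff of two strings (empty string if they are equal)."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=old_name, tofile=new_name
    )
    return "".join(diff)


if __name__ == "__main__":
    print(unified_text_diff("alpha\nbeta\n", "alpha\ngamma\n"))
```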

text_splitter

Split text file by specified delimiter or line count
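The two splitting modes mentioned here can be sketched in a few lines of Python. This is an illustration under assumptions, not the server's code: the function names are hypothetical, and the sketch works on strings in memory rather than on files.

```python
def split_by_delimiter(text, delimiter):
    """Split text at every occurrence of the delimiter."""
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    return text.split(delimiter)


def split_by_line_count(text, lines_per_chunk):
    """Split text into chunks of at most `lines_per_chunk` lines each."""
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    lines = text.splitlines(keepends=True)
    return [
        "".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]
```

Keeping line endings (`keepends=True`) means the chunks can be concatenated to recover the original text.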

text_formatter

Format text with proper indentation and line spacing

text_encoding_converter

Convert text between different encodings
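Encoding conversion generally means decoding bytes with the source encoding and re-encoding the resulting text with the target encoding. The sketch below shows that round trip with Python's built-in codecs; the function name `convert_encoding` and its parameters are assumptions for illustration, not the tool's actual interface.

```python
def convert_encoding(data, source_encoding, target_encoding):
    """Decode bytes from one encoding and re-encode them in another."""
    text = data.decode(source_encoding)   # raises UnicodeDecodeError on bad input
    return text.encode(target_encoding)   # raises UnicodeEncodeError if unrepresentable


if __name__ == "__main__":
    big5_bytes = "中文".encode("big5")
    utf8_bytes = convert_encoding(big5_bytes, "big5", "utf-8")
    print(utf8_bytes.decode("utf-8"))
```

Conversion can fail when the target encoding has no representation for a character, so a real tool needs a policy for such cases (fail, replace, or skip).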

excel_read

Read Excel file and convert to JSON format while preserving structure

format_convert

Convert between different document formats (Markdown, HTML, XML, JSON)
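MCP clients invoke tools like the ones listed above by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request in Python. The tool name comes from the list above, but the argument name `inputPath` is purely hypothetical: this listing does not document each tool's input schema, which a client would normally discover through a `tools/list` request.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "inputPath" is an assumed argument name, used only for illustration.
request = build_tool_call(1, "html_to_markdown", {"inputPath": "/tmp/page.html"})

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the wire.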

README

Simple Document Processing MCP Server

A powerful Model Context Protocol (MCP) server providing comprehensive document processing capabilities.

Features

Document Reader

  • Read DOCX, PDF, TXT, HTML, CSV

Document Conversion

  • DOCX to HTML/PDF conversion
  • HTML to TXT/Markdown conversion
  • PDF manipulation (merge, split)

Text Processing

  • Conversion between text encodings (UTF-8, Big5, GBK)
  • Text formatting and cleaning
  • Text comparison and diff generation
  • Text splitting by lines or delimiter

HTML Processing

  • HTML cleaning and formatting
  • Resource extraction (images, links, videos)
  • Structure-preserving conversion

Installation

Installing via Smithery

To install Document Processing Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @cablate/mcp-doc-forge --client claude

Manual Installation

npm install -g @cablate/mcp-doc-forge

Usage

CLI

mcp-doc-forge

With Dive Desktop

  1. Click "+ Add MCP Server" in Dive Desktop
  2. Copy and paste this configuration:
{
  "mcpServers": {
    "doc-forge": {
      "command": "npx",
      "args": [
        "-y",
        "@cablate/mcp-doc-forge"
      ],
      "enabled": true
    }
  }
}
  3. Click "Save" to install the MCP server
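If the configuration is kept in a JSON file rather than pasted by hand, an entry of the same shape can be merged in with a short script. This is a minimal sketch: the helper name `add_server` and the label `doc-forge` are assumptions, and only the `command`, `args`, and `enabled` fields shown in the configuration above are used.

```python
import json


def add_server(config, name, package):
    """Add an npx-launched MCP server entry without touching existing ones."""
    servers = config.setdefault("mcpServers", {})
    servers[name] = {
        "command": "npx",
        "args": ["-y", package],
        "enabled": True,
    }
    return config


if __name__ == "__main__":
    config = add_server({}, "doc-forge", "@cablate/mcp-doc-forge")
    print(json.dumps(config, indent=2))
```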

License

MIT

Contributing

Community participation and contributions are welcome! Here are some ways to contribute:

  • ⭐️ Star the project if you find it helpful
  • 🐛 Submit Issues: Report problems or provide suggestions
  • 🔧 Create Pull Requests: Submit code improvements

Contact

If you have any questions or suggestions, feel free to reach out:

  • 📧 Email: reahtuoo310109@gmail.com
  • 📧 GitHub: CabLate
  • 🤝 Collaboration: Open to discussing project cooperation
  • 📚 Technical Guidance: Suggestions and guidance are sincerely welcome
