Document Processing Server

Provides comprehensive document processing, including reading, converting, and manipulating various document formats with advanced text and HTML processing capabilities.

cablate

Digital Note Management
Content Fetching

Tools

document_reader

Read content from non-image document files at the specified paths. Supported formats: .pdf, .docx, .txt, .html, .csv

pdf_merger

Merge multiple PDF files into one

pdf_splitter

Split a PDF file into multiple files

docx_to_pdf

Convert DOCX files to PDF format

docx_to_html

Convert DOCX to HTML while preserving formatting

html_cleaner

Clean HTML by removing unnecessary tags and attributes

html_to_text

Convert HTML to plain text while preserving structure
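
To make "preserving structure" concrete, here is a minimal sketch of one way to turn HTML into text while keeping block boundaries as line breaks. It uses only Python's standard html.parser module and is not the server's own implementation; the names BLOCK_TAGS, TextExtractor, and html_to_text are assumptions made for this illustration.

```python
from html.parser import HTMLParser

# Tags treated as block-level for this sketch (an assumption, not an exhaustive list).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect text nodes, inserting newlines around block-level tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return plain text with one line per block element (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(html_to_text("<h1>Title</h1><p>Hello <b>world</b></p>"))
```

Inline tags such as b are ignored, so their text stays on the same line as the surrounding paragraph.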

html_to_markdown

Convert HTML to Markdown format

html_extract_resources

Extract all resources (images, videos, links) from HTML
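
As a rough illustration of resource extraction, the sketch below walks an HTML string with Python's standard html.parser and groups URLs into images, videos, and links. It is an assumption-laden example rather than the server's actual logic: the class ResourceExtractor, the function extract_resources, and the shape of the returned dictionary are all hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, List


class ResourceExtractor(HTMLParser):
    """Collect src/href URLs from <img>, <video>/<source>, and <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.resources: Dict[str, List[str]] = {"images": [], "videos": [], "links": []}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "img" and attr.get("src"):
            self.resources["images"].append(attr["src"])
        elif tag in ("video", "source") and attr.get("src"):
            # Simplification: a <source src=...> inside <audio> is also counted here.
            self.resources["videos"].append(attr["src"])
        elif tag == "a" and attr.get("href"):
            self.resources["links"].append(attr["href"])


def extract_resources(html: str) -> Dict[str, List[str]]:
    """Return the URLs found in the document, grouped by kind (hypothetical helper)."""
    parser = ResourceExtractor()
    parser.feed(html)
    parser.close()
    return parser.resources


if __name__ == "__main__":
    page = '<a href="https://example.com">x</a><img src="a.png"><video src="v.mp4"></video>'
    print(extract_resources(page))
```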

html_formatter

Format and beautify HTML code

text_diff

Compare two text files and show differences
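
For readers unfamiliar with diff output, the following sketch shows the kind of result a text comparison produces, using Python's standard difflib module. It is only an illustration, not how the server computes its diffs; the function name diff_texts and the default file labels are assumptions.

```python
import difflib


def diff_texts(old: str, new: str, old_name: str = "a.txt", new_name: str = "b.txt") -> str:
    """Return a unified diff between two text blobs (hypothetical helper)."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff)


if __name__ == "__main__":
    # Removed lines are prefixed with "-", added lines with "+".
    print(diff_texts("alpha\nbeta\n", "alpha\ngamma\n"))
```

Identical inputs yield an empty string, which makes it easy to test whether two files differ at all.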

text_splitter

Split a text file by a specified delimiter or by line count
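
The two splitting modes can be sketched as follows. This is a minimal example under stated assumptions, not the tool's implementation: the function split_text and its parameters delimiter and lines_per_chunk are hypothetical names chosen for illustration.

```python
from typing import List, Optional


def split_text(
    text: str,
    delimiter: Optional[str] = None,
    lines_per_chunk: Optional[int] = None,
) -> List[str]:
    """Split text on a delimiter, or into chunks of N lines (hypothetical helper)."""
    # Exactly one of the two modes must be selected.
    if (delimiter is None) == (lines_per_chunk is None):
        raise ValueError("provide exactly one of delimiter or lines_per_chunk")
    if delimiter is not None:
        if delimiter == "":
            raise ValueError("delimiter must not be empty")
        return text.split(delimiter)
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    lines = text.splitlines(keepends=True)
    return [
        "".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


if __name__ == "__main__":
    print(split_text("a---b---c", delimiter="---"))
    print(split_text("1\n2\n3\n4\n5\n", lines_per_chunk=2))
```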

text_formatter

Format text with proper indentation and line spacing

text_encoding_converter

Convert text between different encodings
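
Encoding conversion amounts to decoding bytes with the source encoding and re-encoding them with the target one. The sketch below shows that idea with Python's built-in codecs; it is not the server's code, and the function name convert_encoding is an assumption.

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one text encoding to another (hypothetical helper).

    Raises UnicodeDecodeError if data is not valid in the source encoding, and
    UnicodeEncodeError if a character cannot be represented in the target.
    """
    text = data.decode(source)
    return text.encode(target)


if __name__ == "__main__":
    big5_bytes = "\u4e2d\u6587".encode("big5")  # two CJK characters encoded as Big5
    utf8_bytes = convert_encoding(big5_bytes, "big5", "utf-8")
    print(utf8_bytes.decode("utf-8"))
```

Strict error handling is used deliberately here so that a wrong source encoding fails loudly instead of silently producing garbled text.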

excel_read

Read an Excel file and convert it to JSON while preserving structure

format_convert

Convert between different document formats (Markdown, HTML, XML, JSON)

README

Simple Document Processing MCP Server

A powerful Model Context Protocol (MCP) server providing comprehensive document processing capabilities.

Features

Document Reader

  • Read DOCX, PDF, TXT, HTML, CSV

Document Conversion

  • DOCX to HTML/PDF conversion
  • HTML to TXT/Markdown conversion
  • PDF manipulation (merge, split)

Text Processing

  • Conversion between text encodings (UTF-8, Big5, GBK)
  • Text formatting and cleaning
  • Text comparison and diff generation
  • Text splitting by lines or delimiter

HTML Processing

  • HTML cleaning and formatting
  • Resource extraction (images, links, videos)
  • Structure-preserving conversion

Installation

Installing via Smithery

To install Document Processing Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @cablate/mcp-doc-forge --client claude

Manual Installation

npm install -g @cablate/mcp-doc-forge

Usage

CLI

mcp-doc-forge
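
Once the server is running, an MCP client invokes its tools with JSON-RPC 2.0 requests whose method is tools/call. The sketch below only builds such a request message; it does not start the server or perform the initialize handshake that must come first. The tool name document_reader comes from the tool list above, but the argument key filePath is an assumption, so check the tool's input schema (as reported by tools/list) for the real parameter names.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as one JSON string (hypothetical helper)."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # "filePath" is a placeholder argument name, not confirmed by the README.
    print(build_tool_call(1, "document_reader", {"filePath": "/tmp/example.pdf"}))
```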

With Dive Desktop

  1. Click "+ Add MCP Server" in Dive Desktop
  2. Copy and paste this configuration:
{
  "mcpServers": {
    "mcp-doc-forge": {
      "command": "npx",
      "args": [
        "-y",
        "@cablate/mcp-doc-forge"
      ],
      "enabled": true
    }
  }
}
  3. Click "Save" to install the MCP server

License

MIT

Contributing

Community participation and contributions are welcome! Here are some ways to contribute:

  • ⭐️ Star the project if you find it helpful
  • 🐛 Submit Issues: Report problems or provide suggestions
  • 🔧 Create Pull Requests: Submit code improvements

Contact

If you have any questions or suggestions, feel free to reach out:

  • 📧 Email: reahtuoo310109@gmail.com
  • 📧 GitHub: CabLate
  • 🤝 Collaboration: Open to discussing project cooperation
  • 📚 Technical Guidance: Suggestions and guidance are sincerely welcome
