Document Processing Server
Provides comprehensive document processing, including reading, converting, and manipulating various document formats with advanced text and HTML processing capabilities.
cablate
Tools
document_reader
Read content from non-image document files at specified paths. Supported formats: .pdf, .docx, .txt, .html, .csv
pdf_merger
Merge multiple PDF files into one
pdf_splitter
Split a PDF file into multiple files
docx_to_pdf
Convert DOCX files to PDF format
docx_to_html
Convert DOCX to HTML while preserving formatting
html_cleaner
Clean HTML by removing unnecessary tags and attributes
html_to_text
Convert HTML to plain text while preserving structure
html_to_markdown
Convert HTML to Markdown format
html_extract_resources
Extract all resources (images, videos, links) from HTML
html_formatter
Format and beautify HTML code
text_diff
Compare two text files and show differences
text_splitter
Split text file by specified delimiter or line count
text_formatter
Format text with proper indentation and line spacing
text_encoding_converter
Convert text between different encodings
excel_read
Read Excel file and convert to JSON format while preserving structure
format_convert
Convert between different document formats (Markdown, HTML, XML, JSON)
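MCP clients invoke tools such as these with a JSON-RPC `tools/call` request. As a minimal sketch, the helper below builds such a request for `document_reader`. The argument name `filePath` and the example path are assumptions for illustration only, since the tools' input schemas are not listed here; a client can look up the real parameter names in the server's `tools/list` response.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Build a request for one of the tools listed above.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// `filePath` is a hypothetical argument name, not confirmed by the server's schema.
const request = buildToolCall(1, "document_reader", {
  filePath: "/path/to/report.pdf",
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds and sends this message for you; the sketch only shows what travels over the wire.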
README
Simple Document Processing MCP Server
A powerful Model Context Protocol (MCP) server providing comprehensive document processing capabilities.
<a href="https://glama.ai/mcp/servers/pb9df6lnel"><img width="380" height="200" src="https://glama.ai/mcp/servers/pb9df6lnel/badge" alt="Simple Document Processing Server MCP server" /></a>
Features
Document Reader
- Read DOCX, PDF, TXT, HTML, CSV
Document Conversion
- DOCX to HTML/PDF conversion
- HTML to TXT/Markdown conversion
- PDF manipulation (merge, split)
Text Processing
- Conversion between text encodings (UTF-8, Big5, GBK)
- Text formatting and cleaning
- Text comparison and diff generation
- Text splitting by lines or delimiter
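To make the two splitting modes concrete, here is a rough sketch of splitting by line count and by delimiter. It is not the server's implementation; the function names `splitByLineCount` and `splitByDelimiter` are hypothetical, and the choice to trim pieces and drop empty ones is an assumption.

```typescript
// Split text into chunks of at most `linesPerChunk` lines each.
function splitByLineCount(text: string, linesPerChunk: number): string[] {
  if (!Number.isInteger(linesPerChunk) || linesPerChunk < 1) {
    throw new RangeError("linesPerChunk must be a positive integer");
  }
  // Accept both Unix (\n) and Windows (\r\n) line endings.
  const lines = text.split(/\r?\n/);
  const chunks: string[] = [];
  for (let i = 0; i < lines.length; i += linesPerChunk) {
    chunks.push(lines.slice(i, i + linesPerChunk).join("\n"));
  }
  return chunks;
}

// Split text on a literal delimiter, trimming whitespace and dropping empty pieces.
function splitByDelimiter(text: string, delimiter: string): string[] {
  return text
    .split(delimiter)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

const byLines = splitByLineCount("a\nb\nc\nd\ne", 2);
const byDelimiter = splitByDelimiter("one---two---\n", "---");
console.log(byLines.length, byDelimiter.length);
```

Splitting by line count keeps every chunk within a predictable size, while splitting by delimiter follows boundaries already present in the text, such as section separators.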
HTML Processing
- HTML cleaning and formatting
- Resource extraction (images, links, videos)
- Structure-preserving conversion
Installation
Installing via Smithery
To install Document Processing Server for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @cablate/mcp-doc-forge --client claude
Manual Installation
npm install -g @cablate/mcp-doc-forge
Usage
CLI
mcp-doc-forge
With Dive Desktop
- Click "+ Add MCP Server" in Dive Desktop
- Copy and paste this configuration:
{
"mcpServers": {
"mcp-doc-forge": {
"command": "npx",
"args": [
"-y",
"@cablate/mcp-doc-forge"
],
"enabled": true
}
}
}
- Click "Save" to install the MCP server
License
MIT
Contributing
Community participation and contributions are welcome. Here are some ways to contribute:
- ⭐️ Star the project if you find it helpful
- 🐛 Submit Issues: Report problems or provide suggestions
- 🔧 Create Pull Requests: Submit code improvements
Contact
If you have any questions or suggestions, feel free to reach out:
- 📧 Email: reahtuoo310109@gmail.com
- 📧 GitHub: CabLate
- 🤝 Collaboration: Project cooperation inquiries are welcome
- 📚 Technical Guidance: Suggestions and guidance are sincerely welcome