mcp-server-convert
A Model Context Protocol server that converts documents (PDF, DOCX, HTML, etc.) to Markdown, enabling AI agents to ingest and understand document content.
A lightweight Model Context Protocol (MCP) server that converts documents to Markdown. Supports PDF, DOCX, HTML, EPUB, CSV, JSON, and plain text files.
Perfect for AI agents that need to ingest and understand document content.
Features
- 📄 Multi-format support: PDF, DOCX, HTML, EPUB, CSV, JSON, images (via OCR), and plain text
- 🔧 6 MCP tools: convert_file, convert_url, list_supported_formats, batch_convert, extract_metadata, convert_directory
- 🐍 Zero external dependencies for core: Uses Python standard library + markdownify for HTML
- ⚡ Fast: In-memory processing, no temp files
- 🐳 Docker-ready: Single Dockerfile, one command deploy
Quick Start
Install & Run
# Clone
git clone https://github.com/demo112/mcp-server-convert.git
cd mcp-server-convert
# Install dependencies
pip install -r requirements.txt
# Run
python -m mcp_server_convert
Configure in Claude Code
Add to your MCP settings (~/.claude/settings.json):
{
"mcpServers": {
"convert": {
"command": "python",
"args": ["-m", "mcp_server_convert"],
"cwd": "/path/to/mcp-server-convert"
}
}
}
Docker
docker build -t mcp-server-convert .
docker run -i --rm mcp-server-convert
Configure with Docker
{
"mcpServers": {
"convert": {
"command": "docker",
"args": ["run", "-i", "--rm", "-v", "/path/to/files:/data", "mcp-server-convert"]
}
}
}
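With the volume mount shown above, the container sees host files only under /data, so paths passed to the tools presumably need to use the container-side location. A minimal sketch of that mapping follows; `to_container_path` is a hypothetical helper, not part of this server, and the roots are the placeholder values from the config.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_root="/path/to/files", container_root="/data"):
    """Translate a host path under the mounted folder to its path inside the container.

    Hypothetical helper: host_root and container_root mirror the
    "-v /path/to/files:/data" mount from the Docker configuration.
    """
    # relative_to raises ValueError if host_path is outside the mounted folder
    relative = PurePosixPath(host_path).relative_to(host_root)
    return str(PurePosixPath(container_root) / relative)


print(to_container_path("/path/to/files/reports/q1.pdf"))  # /data/reports/q1.pdf
```

A file outside the mounted folder raises ValueError, which matches the fact that the container cannot read it.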
Tools
convert_file
Convert a local file to Markdown.
Parameters:
- file_path (string, required): Absolute path to the file
- max_length (int, optional): Maximum output length in chars (default: 50000)
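MCP clients normally build this call for you, but it can help to see the message shape. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below constructs such a request for convert_file using the parameter names listed above; `build_tool_call` and the example path are illustrative assumptions, not part of this server.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Example path is a placeholder; max_length overrides the documented default of 50000.
request = build_tool_call(
    1, "convert_file", {"file_path": "/data/report.pdf", "max_length": 20000}
)
print(json.dumps(request, indent=2))
```

The same helper works for the other tools by changing the tool name and arguments.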
convert_url
Fetch a URL and convert its content to Markdown.
Parameters:
- url (string, required): URL to fetch and convert
- max_length (int, optional): Maximum output length in chars (default: 50000)
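The max_length parameter caps how much Markdown is returned. How the server applies the cap is not documented here; the sketch below shows one plausible approach, cutting at the last line break before the limit. Both `truncate_markdown` and the truncation marker are assumptions for illustration.

```python
def truncate_markdown(text, max_length=50000):
    """Cap text at max_length characters, preferring to cut at a line break.

    Illustrative only: the real server may truncate differently.
    The appended marker is not counted against max_length.
    """
    if len(text) <= max_length:
        return text
    cut = text.rfind("\n", 0, max_length)
    if cut <= 0:
        cut = max_length  # no usable line break, fall back to a hard cut
    return text[:cut] + "\n\n[output truncated]"


sample = "line one\nline two\nline three"
print(truncate_markdown(sample, max_length=20))
```

Cutting at a line break avoids returning half a table row or a broken list item to the agent.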
batch_convert
Convert multiple files at once.
Parameters:
- file_paths (array of strings, required): List of file paths
- max_length_per_file (int, optional): Max length per file (default: 50000)
convert_directory
Convert all supported files in a directory.
Parameters:
- dir_path (string, required): Path to directory
- recursive (bool, optional): Include subdirectories (default: true)
- max_files (int, optional): Maximum files to convert (default: 20)
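To make the interaction between recursive and max_files concrete, here is a rough sketch of how a directory scan with those two options could work. It is not the server's actual code: `collect_files`, the `SUPPORTED` subset, and the sorted traversal order are all assumptions.

```python
from pathlib import Path

# Illustrative subset of extensions; the real list comes from list_supported_formats.
SUPPORTED = {".pdf", ".docx", ".html", ".htm", ".txt", ".md"}


def collect_files(dir_path, recursive=True, max_files=20):
    """Return up to max_files supported files under dir_path (hypothetical helper)."""
    root = Path(dir_path)
    pattern = "**/*" if recursive else "*"
    matches = []
    # Sorting makes the selection deterministic when max_files cuts the list short.
    for path in sorted(root.glob(pattern)):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            matches.append(path)
            if len(matches) >= max_files:
                break
    return matches
```

With the default max_files of 20, a large directory tree is only partially converted, so raise the limit or narrow dir_path when you need full coverage.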
extract_metadata
Extract metadata from a file without full conversion.
Parameters:
- file_path (string, required): Path to the file
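The fields this tool returns are not listed in this README. As a rough idea of what can be gathered without converting the document, the sketch below reads file-level metadata with the standard library only; `basic_file_metadata` and its field names are assumptions, and format-specific fields such as author or page count would need the per-format libraries.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def basic_file_metadata(file_path):
    """Collect filesystem-level metadata without parsing the file (hypothetical helper)."""
    path = Path(file_path)
    stat = path.stat()
    mime_type, _ = mimetypes.guess_type(path.name)  # guessed from the name, may be None
    return {
        "name": path.name,
        "extension": path.suffix.lower(),
        "size_bytes": stat.st_size,
        "mime_type": mime_type,
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
    }
```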
list_supported_formats
List all supported file extensions and their conversion methods.
Supported Formats
| Format | Extension | Method |
|---|---|---|
| PDF | .pdf | PyMuPDF (fitz) |
| Word | .docx | python-docx |
| HTML | .html, .htm | markdownify |
| EPUB | .epub | ebooklib |
| CSV | .csv | pandas → markdown table |
| JSON | .json | Formatted markdown code block |
| XML | .xml | xmltodict → markdown |
| Excel | .xlsx | openpyxl → markdown table |
| PowerPoint | .pptx | python-pptx → markdown slides |
| Text | .txt, .md, .rst, .log | Direct passthrough |
| Images | .png, .jpg | pytesseract OCR (if available) |
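The table amounts to a dispatch from file extension to converter. The sketch below illustrates that idea for three of the simpler formats. It is an assumption-laden illustration rather than the server's code: the function names and the `CONVERTERS` registry are hypothetical, and the CSV converter uses the standard library `csv` module in place of the pandas path named in the table.

```python
import csv
import io
import json

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


def json_to_markdown(text):
    """Pretty-print JSON inside a fenced json code block."""
    data = json.loads(text)
    return f"{FENCE}json\n{json.dumps(data, indent=2, ensure_ascii=False)}\n{FENCE}"


def csv_to_markdown(text):
    """Render CSV text as a Markdown table, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header = rows[0]

    def fmt(cells):
        # Pad or trim each row to the header width and escape pipe characters.
        cells = (list(cells) + [""] * len(header))[: len(header)]
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "|" + "|".join(["---"] * len(header)) + "|"]
    lines.extend(fmt(row) for row in rows[1:])
    return "\n".join(lines)


# Hypothetical registry: extension -> converter taking and returning text.
CONVERTERS = {
    ".json": json_to_markdown,
    ".csv": csv_to_markdown,
    ".txt": lambda text: text,  # direct passthrough
}


def convert_text(extension, text):
    """Look up the converter for an extension and apply it."""
    converter = CONVERTERS.get(extension.lower())
    if converter is None:
        raise ValueError(f"Unsupported format: {extension}")
    return converter(text)


print(convert_text(".csv", "name,size\nreport.pdf,1024\n"))
```

A registry like this keeps each format isolated, so an optional dependency such as pytesseract can be left out without affecting the other converters.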
Support
If this tool helps your workflow, consider supporting its development:
- GitHub Sponsors: Sponsor via Liberapay
- ETH: 0xddD9f45e14c92846f47C1c1A4431aC2b41D87273
License
MIT