MCP URL Fetcher

A Model Context Protocol server that enables LLMs to fetch and process web content in multiple formats (HTML, JSON, Markdown, text) with automatic format detection.

MCP URL Format Converter

A Model Context Protocol (MCP) server that fetches content from any URL and converts it to your desired output format.

Overview

MCP URL Format Converter provides tools for retrieving content from any web URL and transforming it into various formats (HTML, JSON, Markdown, or plain text), regardless of the original content type. It's designed to work with any MCP-compatible client, including Claude for Desktop, enabling LLMs to access, transform, and analyze web content in a consistent format.

Features

  • 🔄 Format Conversion: Transform any web content to HTML, JSON, Markdown, or plain text
  • 🌐 Universal Input Support: Handle websites, APIs, raw files, and more
  • 🔍 Automatic Content Detection: Intelligently identifies source format
  • 🧰 Robust Library Support: Uses industry-standard libraries:
    • Cheerio for HTML parsing
    • Marked for Markdown processing
    • fast-xml-parser for XML handling
    • csvtojson for CSV conversion
    • sanitize-html for security
    • Turndown for HTML-to-Markdown conversion
  • 🔧 Advanced Format Processing:
    • HTML parsing with metadata extraction
    • JSON pretty-printing and structure preservation
    • Markdown rendering with styling
    • CSV-to-table conversion
    • XML-to-JSON transformation
  • 📜 History Tracking: Maintains logs of recently fetched URLs
  • 🛡️ Security Focus: Content sanitization to prevent XSS attacks
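
This README does not describe how automatic content detection works. As a minimal sketch, assuming the server checks the Content-Type header first and falls back to inspecting the body, detection might look like the following. The function name detectFormat and the heuristics are illustrative assumptions, not the server's actual implementation.

```typescript
// Hypothetical sketch: names and heuristics are assumptions, not the server's code.
type DetectedFormat = "html" | "json" | "markdown" | "xml" | "csv" | "text";

function detectFormat(contentType: string | null, body: string): DetectedFormat {
  // Strip parameters such as "; charset=utf-8" before comparing.
  const mime = (contentType ?? "").split(";")[0].trim().toLowerCase();

  if (mime === "application/json" || mime.endsWith("+json")) return "json";
  if (mime === "text/html" || mime === "application/xhtml+xml") return "html";
  if (mime === "text/markdown") return "markdown";
  if (mime === "text/csv") return "csv";
  if (mime === "application/xml" || mime === "text/xml" || mime.endsWith("+xml")) return "xml";

  // No usable header: fall back to inspecting the body.
  const trimmed = body.trim();
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
    try {
      JSON.parse(trimmed);
      return "json";
    } catch {
      // Not valid JSON; keep checking.
    }
  }
  if (/^<!doctype html|^<html[\s>]/i.test(trimmed)) return "html";
  if (trimmed.startsWith("<?xml")) return "xml";
  return "text";
}
```

Checking the header before the body keeps the common case cheap. The body checks only run when the server sends no recognizable type.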

Installation

Prerequisites

  • Node.js 16.x or higher
  • npm or yarn

Quick Start

  1. Clone the repository:

    git clone https://github.com/yourusername/mcp-url-converter.git
    cd mcp-url-converter
    
  2. Install dependencies:

    npm install
    
  3. Build the project:

    npm run build
    
  4. Run the server:

    npm start
    

Integration with Claude for Desktop

  1. Open your Claude for Desktop configuration file:

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
  2. Add the URL converter server to your configuration:

    {
      "mcpServers": {
        "url-converter": {
          "command": "node",
          "args": ["/absolute/path/to/mcp-url-converter/build/index.js"]
        }
      }
    }
    
  3. Restart Claude for Desktop

Available Tools

fetch

Fetches content from any URL. When format is auto (the default), the server detects the best output format automatically.

Parameters:

  • url (string, required): The URL to fetch content from
  • format (string, optional): Format to convert to (auto, html, json, markdown, text). Default: auto

Example:

Can you fetch https://example.com and choose the best format to display it?
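
On the wire, an MCP client turns a prompt like the one above into a tools/call JSON-RPC request. The sketch below builds such a request for the fetch tool. The helper name buildFetchCall and the request id are illustrative assumptions; the argument names come from the parameter list above.

```typescript
// Formats accepted by the fetch tool, per the parameter list above.
type FetchFormat = "auto" | "html" | "json" | "markdown" | "text";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the JSON-RPC message an MCP client would send.
function buildFetchCall(id: number, url: string, format: FetchFormat = "auto"): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch", arguments: { url, format } },
  };
}

// Example: ask for https://example.com rendered as Markdown.
const fetchRequest = buildFetchCall(1, "https://example.com", "markdown");
```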

fetch-json

Fetches content from any URL and converts it to JSON format.

Parameters:

  • url (string, required): The URL to fetch content from
  • prettyPrint (boolean, optional): Whether to pretty-print the JSON. Default: true

Example:

Can you fetch https://example.com and convert it to JSON format?
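
This README does not say how non-JSON content is mapped to JSON. A minimal sketch, assuming JSON bodies are re-serialized and anything else is wrapped in an object, could look like this. The function toJsonText and the { content } wrapper shape are assumptions for illustration only.

```typescript
// Hypothetical sketch of the prettyPrint option; the wrapper shape is an assumption.
function toJsonText(body: string, prettyPrint: boolean = true): string {
  let value: unknown;
  try {
    // Body is already JSON: parse it so it can be re-serialized consistently.
    value = JSON.parse(body);
  } catch {
    // Body is not JSON: wrap the raw text so the result is still valid JSON.
    value = { content: body };
  }
  // The third argument to JSON.stringify sets the indent width (2 spaces here).
  return prettyPrint ? JSON.stringify(value, null, 2) : JSON.stringify(value);
}
```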

fetch-html

Fetches content from any URL and converts it to HTML format.

Parameters:

  • url (string, required): The URL to fetch content from
  • extractText (boolean, optional): Whether to extract text content only. Default: false

Example:

Can you fetch https://api.example.com/users and convert it to HTML?
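
When a JSON API response like the one in this example is rendered as HTML, values must be escaped so that markup inside the data is displayed rather than interpreted. The following is a rough sketch of that idea, not the server's implementation (which is described as using sanitize-html). The names escapeHtml and jsonToHtml are hypothetical.

```typescript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical sketch: render a JSON response body as an escaped <pre> block.
// Throws a SyntaxError if the body is not valid JSON.
function jsonToHtml(body: string): string {
  const pretty = JSON.stringify(JSON.parse(body), null, 2);
  return `<pre>${escapeHtml(pretty)}</pre>`;
}
```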

fetch-markdown

Fetches content from any URL and converts it to Markdown format.

Parameters:

  • url (string, required): The URL to fetch content from

Example:

Can you fetch https://example.com and convert it to Markdown?

fetch-text

Fetches content from any URL and converts it to plain text format.

Parameters:

  • url (string, required): The URL to fetch content from

Example:

Can you fetch https://example.com and convert it to plain text?

web-search and deep-research

These tools provide interfaces to Perplexity search capabilities (when supported by the MCP host).

Available Resources

recent-urls://list

Returns a list of recently fetched URLs with timestamps and output formats.

Example:

What URLs have I fetched recently?
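
One way to back a resource like this is a bounded, newest-first log. The sketch below assumes an in-memory list with a fixed size limit. The class name RecentUrls, the default limit of 10, and the entry fields are assumptions; the README only states that URLs, timestamps, and output formats are returned.

```typescript
interface HistoryEntry {
  url: string;
  format: string;
  timestamp: string; // ISO 8601
}

// Hypothetical sketch: a bounded, newest-first log of fetched URLs.
class RecentUrls {
  private entries: HistoryEntry[] = [];
  private limit: number;

  constructor(limit: number = 10) {
    this.limit = limit;
  }

  record(url: string, format: string, now: Date = new Date()): void {
    this.entries.unshift({ url, format, timestamp: now.toISOString() });
    // Drop the oldest entries once the limit is exceeded.
    this.entries.length = Math.min(this.entries.length, this.limit);
  }

  // Return a copy so callers cannot mutate the internal log.
  list(): HistoryEntry[] {
    return [...this.entries];
  }
}
```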

Security

This server implements several security measures:

  • HTML sanitization using sanitize-html to prevent XSS attacks
  • Content validation before processing
  • Error handling and safe defaults
  • Input parameter validation with Zod
  • Safe output encoding
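
To illustrate the input parameter validation listed above, here is a dependency-free sketch for the fetch tool's arguments. The README says the server uses Zod; this hand-written check is a stand-in for illustration, not the server's schema. The name validateFetchArgs and the http(s)-only rule are assumptions.

```typescript
const FORMATS = ["auto", "html", "json", "markdown", "text"] as const;
type Format = (typeof FORMATS)[number];

type Validation =
  | { ok: true; value: { url: string; format: Format } }
  | { ok: false; error: string };

// Dependency-free stand-in for a schema check; the real server is described as using Zod.
function validateFetchArgs(args: unknown): Validation {
  if (typeof args !== "object" || args === null) {
    return { ok: false, error: "arguments must be an object" };
  }
  const { url, format = "auto" } = args as { url?: unknown; format?: unknown };

  // Assumption: only web URLs are allowed, which rejects schemes such as file: or javascript:.
  if (typeof url !== "string" || !/^https?:\/\/\S+$/i.test(url)) {
    return { ok: false, error: "url must be an http(s) URL" };
  }
  const known = FORMATS.find((f) => f === format);
  if (known === undefined) {
    return { ok: false, error: `format must be one of: ${FORMATS.join(", ")}` };
  }
  return { ok: true, value: { url, format: known } };
}
```

Returning a result object instead of throwing lets the caller turn a failed check into a tool error message without a try/catch.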

Testing

You can test the server using the MCP Inspector:

npm run test

Troubleshooting

Common Issues

  1. Connection errors: Verify that the URL is accessible and correctly formatted
  2. Conversion errors: Some complex content may not convert cleanly between formats
  3. Blocked requests: Some websites may block requests from unknown or automated clients

Debug Mode

For additional debugging information, set the DEBUG environment variable:

DEBUG=mcp:* npm start

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built with the Model Context Protocol
  • Uses modern, actively maintained libraries with a security focus
  • Sanitization approach based on OWASP recommendations

Last updated: 29 March 2025
