url-content-mcp
MCP server that fetches raw HTML content from a given URL to provide web context to LLMs.
URL Content MCP Server
This repository provides a Model Context Protocol (MCP) server that retrieves the raw HTML content from a given URL to provide context to Large Language Models (LLMs). It acts as a tool for LLMs to fetch real-time web page content beyond their training data.
Overview
The URL Content MCP server serves as a bridge between LLMs and the web. Through a standardized MCP interface, an LLM can request the content of a specific URL and receive the HTML content of that page. This allows AI assistants to access up-to-date web content on demand.
Features
- Fetch Web Page Content: Retrieve the HTML content of a web page given its URL.
- Real-Time Data: Access current information directly from web pages in real time.
- Optional Caching: Optionally cache fetched content in memory to avoid repeated network calls for the same URL during the server's runtime.
- STDIO and SSE Support: Run the server in stdio mode for integration as a subprocess, or in sse (HTTP Server-Sent Events) mode to serve requests over HTTP.
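The optional caching behavior can be pictured as a dictionary keyed by URL that lives only as long as the server process. The following is a minimal sketch of that idea, not the project's actual code; `CachedFetcher` and the injected `fetch` callable are hypothetical names used for illustration.

```python
from typing import Callable, Dict


class CachedFetcher:
    """Hypothetical in-memory cache wrapper (not the project's real class)."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                # performs the real network call
        self._cache: Dict[str, str] = {}   # url -> content, lost when the process exits

    def get(self, url: str) -> str:
        # Only call the underlying fetch the first time a URL is requested.
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


# Stand-in fetch function that records how often it is called.
calls = []


def fake_fetch(url: str) -> str:
    calls.append(url)
    return "<html>stub</html>"


fetcher = CachedFetcher(fake_fetch)
first = fetcher.get("http://example.com")
second = fetcher.get("http://example.com")  # served from the cache
```

A cache like this never expires entries, so a long-running server would keep returning the first version of a page it fetched.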
Requirements
- Python 3.8+ – The server is written in Python and requires version 3.8 or higher.
- Internet Access – The server needs network access to fetch web pages from the internet.
Note: This server fetches raw HTML content. Ensure the target URL is accessible and returns text/HTML content. Some websites may block automated requests or require specific user-agent headers.
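As an illustration of the user-agent point, the sketch below builds a request with an explicit User-Agent header using Python's standard library. The header value is a made-up placeholder, and nothing here is taken from the server's own code, which may set its headers differently.

```python
import urllib.request


def build_request(url: str, user_agent: str = "url-content-mcp-example/0.1") -> urllib.request.Request:
    # The default User-Agent string is a hypothetical placeholder.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://example.com")
# Passing `req` to urllib.request.urlopen would send the custom header.
```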
Installation
Clone this repository and install the package along with its dependencies:
git clone https://github.com/artryazanov/url-content-mcp.git
cd url-content-mcp
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
This will install the necessary Python packages as listed in requirements.txt. You can also install the package in editable mode (e.g., pip install -e .) if you plan to modify the code.
Usage
After installation, you can run the MCP server using the provided console script url-content-mcp or by executing the module. The server supports two modes of operation: STDIO (for direct integration with an MCP-compatible client) and SSE (for running as an HTTP server).
Running the Server (STDIO Mode)
By default, the server runs in stdio mode. In this mode, the server reads MCP requests from standard input and writes responses to standard output. This mode suits applications that manage the server as a subprocess and communicate with it over MCP (such as certain AI assistant platforms).
Example (running in stdio mode):
url-content-mcp
When running in stdio mode, the server will start and wait for incoming MCP requests via stdin (typically from an AI client). Each request (formatted according to the MCP protocol) will be processed, and the server will output the result to stdout as JSON.
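MCP messages are JSON-RPC 2.0 objects, so a tool call written to the server's stdin can be sketched as below. The request id and target URL are arbitrary example values, and `build_tool_call` is a hypothetical helper. A real client also performs the MCP initialization handshake before calling tools, which is omitted here.

```python
import json


def build_tool_call(request_id: int, url: str) -> str:
    """Serialize an MCP `tools/call` request for the fetch_url tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "fetch_url", "arguments": {"url": url}},
    }
    # The stdio transport is newline-delimited: one JSON message per line.
    return json.dumps(message)


line = build_tool_call(1, "http://example.com")
print(line)
```

In practice an MCP client library handles this framing, so hand-built messages are mainly useful for debugging.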
Running the Server (SSE/HTTP Mode)
To run the server as an HTTP service, use the --transport sse option. In SSE mode, the server will start an HTTP server and provide a RESTful endpoint for fetching URL content.
Example (running in SSE mode on port 8080):
url-content-mcp --transport sse --host 0.0.0.0 --port 8080 --enable-cache
This starts the server in SSE mode, listening on all interfaces (0.0.0.0) at port 8080, with caching enabled. In this mode, you can send HTTP GET requests to the server's /fetch/{url} endpoint to retrieve content. Note: The {url} in the path should be URL-encoded.
For example, to fetch the content of http://example.com, encode the URL and request:
http://localhost:8080/fetch/http%3A%2F%2Fexample.com
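The encoded path above can be produced with Python's standard library. This is a small sketch; `build_fetch_url` is a hypothetical helper name, and the base address assumes the server is running locally on port 8080 as in the example.

```python
from urllib.parse import quote


def build_fetch_url(base: str, target: str) -> str:
    # safe="" makes quote() percent-encode ":" and "/" as well.
    return f"{base}/fetch/{quote(target, safe='')}"


request_url = build_fetch_url("http://localhost:8080", "http://example.com")
print(request_url)  # http://localhost:8080/fetch/http%3A%2F%2Fexample.com
```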
This will return a JSON response containing the URL and the HTML content of the page. The response structure looks like:
{
"url": "http://example.com",
"content": "<!DOCTYPE html>...</html>"
}
If an error occurs during fetching (for example, a network error or a non-200 HTTP status), the response will include an "error" field with a message, and the "content" may be an empty string.
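A client can tell the two cases apart by checking for the "error" key. The sketch below assumes only the response fields described above; `parse_fetch_response` is a hypothetical helper and the sample bodies are invented for illustration.

```python
import json
from typing import Tuple


def parse_fetch_response(body: str) -> Tuple[bool, str]:
    """Return (ok, text): the page content on success, the error message otherwise."""
    data = json.loads(body)
    if "error" in data:
        return False, data["error"]
    return True, data.get("content", "")


ok_case = parse_fetch_response('{"url": "http://example.com", "content": "<html></html>"}')
err_case = parse_fetch_response('{"url": "http://example.com", "content": "", "error": "timeout"}')
```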
Note: When running in Docker or other container environments, use --host 0.0.0.0 to bind to all interfaces, and ensure the container's port is published (e.g., -p 8080:8080).
Command-Line Options
- --transport, -t (string): Transport protocol for the server. Either stdio (default) or sse (to run an HTTP server for SSE).
- --host (string): Host address to bind the HTTP server in SSE mode (default: 127.0.0.1).
- --port (int): Port number for SSE mode (default: 8080).
- --enable-cache (flag): Enable in-memory caching of fetched content. If this flag is set, the server will cache the content of each URL after the first fetch during its runtime.
Run url-content-mcp --help to see the usage information.
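For readers who want to see how these options fit together, here is a rough argparse sketch that mirrors the documented flags and defaults. It is a reconstruction for illustration only; the project's real argument parsing may use a different library or structure.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flags and defaults are copied from the option list; the rest is assumed.
    parser = argparse.ArgumentParser(prog="url-content-mcp")
    parser.add_argument("--transport", "-t", choices=["stdio", "sse"], default="stdio")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8080)
    parser.add_argument("--enable-cache", action="store_true")
    return parser


args = build_parser().parse_args(["--transport", "sse", "--host", "0.0.0.0", "--enable-cache"])
defaults = build_parser().parse_args([])
```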
Available MCP Tool
This server provides one MCP tool that the LLM can use:
fetch_url
- Description: Fetches the content of a web page at the given URL and returns the HTML content.
- Parameters:
  - url (string, required) – The web page URL to fetch.
- Returns: A JSON object with the following structure:
  - url: The URL that was fetched.
  - content: The HTML content of the page as a string. (This will contain the raw HTML, including tags.)
  - error (optional): An error message string, if an error occurred during fetching. This field is only present if there was an error (on success it is omitted).
The fetch_url tool is registered with the MCP server, so an LLM client can call this function to retrieve web page content. In stdio mode, the function is invoked via MCP tool calls in the protocol. In sse (HTTP) mode, the server exposes a GET endpoint /fetch/{url} (with the URL percent-encoded) that returns the same data.
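To make the return structure concrete, the following sketch shows one way a function with this contract could be written using only the standard library. It is not the project's implementation: the `opener` parameter, the 10-second timeout, and the UTF-8 fallback are all assumptions chosen for the example.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def fetch_url(url: str, opener: Callable = urllib.request.urlopen) -> Dict[str, str]:
    """Illustrative only: returns url and content, plus error on failure."""
    try:
        with opener(url, timeout=10) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            content = response.read().decode(charset, errors="replace")
        return {"url": url, "content": content}
    except (OSError, ValueError) as exc:
        # urlopen raises HTTPError/URLError (both OSError subclasses) on HTTP
        # error statuses and network failures, and ValueError on malformed URLs.
        return {"url": url, "content": "", "error": str(exc)}
```

Injecting `opener` keeps the function testable without network access; callers that want real requests can rely on the default.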
Testing
This project includes a test suite to ensure the server works correctly.
- Install test dependencies: pip install -r requirements-dev.txt (includes pytest).
- Run all tests: pytest
Docker
A Dockerfile is provided to containerize the MCP server. To build the Docker image:
docker build -t url-content-mcp .
To run the server via Docker (exposing port 8080 for SSE mode):
docker run --rm -it -p 8080:8080 url-content-mcp --transport sse --host 0.0.0.0
This will start the MCP server inside a container. You can then interact with it via HTTP requests to http://localhost:8080 (for SSE mode) or attach it to an MCP-compatible client in stdio mode.
License
This project is released under the Unlicense. See the LICENSE file for details.