Advanced Web Fetching MCP Server

Enables fetching and processing web content with advanced features including batch processing of up to 20 URLs, streaming support, metadata extraction, and multiple output formats (HTML, Markdown, plain text) with enterprise-grade security and global edge performance.

The Most Advanced Web Fetching MCP Server

The most feature-rich, production-ready web fetching MCP server available

Transform Claude into a powerful web scraping and content analysis tool with our enterprise-grade MCP server collection. Built with a modern tech stack and battle-tested in production.

Setup in Your IDE (30 seconds)

<details open> <summary><strong>Claude Code / Claude Desktop</strong></summary>

Option 1: Hosted Service (Recommended)

Zero setup - copy this config:

{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": [
        "workers-mcp",
        "run", 
        "web-fetcher",
        "https://mcp.llmbase.ai/mcp/web-fetch"
      ]
    }
  }
}

Option 2: Local Installation

Maximum privacy - runs on your machine:

npm install @llmbase/mcp-web-fetch

Claude Desktop config:

{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": ["@llmbase/mcp-web-fetch"]
    }
  }
}

Config file locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%/Claude/claude_desktop_config.json

</details>

<details> <summary><strong>Cursor IDE</strong></summary>

Install the MCP Extension

  1. Open Cursor IDE
  2. Go to Extensions (Ctrl+Shift+X)
  3. Search for "MCP" or "Model Context Protocol"
  4. Install the MCP extension

Configure Web Fetcher

  1. Open Command Palette (Ctrl+Shift+P)
  2. Run "MCP: Configure Server"
  3. Add server configuration:
{
  "web-fetcher": {
    "command": "npx",
    "args": ["@llmbase/mcp-web-fetch"]
  }
}

Alternative: Direct Integration

Add to your .cursorrules file:

# Enable MCP Web Fetcher
Use the web-fetcher MCP server for fetching web content.
Server endpoint: npx @llmbase/mcp-web-fetch

</details>

<details> <summary><strong>Windsurf IDE</strong></summary>

Setup MCP Integration

  1. Open Windsurf settings
  2. Navigate to "Extensions" → "MCP Servers"
  3. Click "Add Server"
  4. Configure:

Server Name: web-fetcher
Command: npx
Arguments: @llmbase/mcp-web-fetch

Alternative Configuration

Create .windsurf/mcp.json:

{
  "servers": {
    "web-fetcher": {
      "command": "npx",
      "args": ["@llmbase/mcp-web-fetch"],
      "description": "Advanced web content fetching and processing"
    }
  }
}

</details>

<details> <summary><strong>VS Code</strong></summary>

Using Continue Extension

  1. Install the Continue extension from VS Code marketplace
  2. Open Continue settings (Ctrl+,)
  3. Add to config.json:
{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": ["@llmbase/mcp-web-fetch"]
    }
  }
}

Using Cline Extension

  1. Install Cline extension
  2. Configure MCP server in settings:
{
  "cline.mcpServers": {
    "web-fetcher": {
      "command": "npx", 
      "args": ["@llmbase/mcp-web-fetch"]
    }
  }
}

</details>

<details> <summary><strong>Custom MCP Client</strong></summary>

Direct Integration

For custom applications using the MCP protocol:

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

const transport = new StdioClientTransport({
  command: 'npx',
  args: ['@llmbase/mcp-web-fetch']
});

const client = new Client(
  { name: 'my-app', version: '1.0.0' },
  { capabilities: {} }
);

await client.connect(transport);
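
Once connected, the tools listed in this README can presumably be invoked through the SDK's `callTool` method. The sketch below is a minimal illustration, not the server's documented contract: the helper name `fetchWebsiteViaMcp` is hypothetical, and the argument names `url` and `format` are assumed to mirror the REST API request body. It is written against a small `ToolCaller` interface so it can be exercised without a running server.

```typescript
// Minimal shape of the client method used here. The real Client from
// @modelcontextprotocol/sdk exposes callTool({ name, arguments }).
interface ToolCaller {
  callTool(request: {
    name: string;
    arguments?: Record<string, unknown>;
  }): Promise<unknown>;
}

type OutputFormat = 'raw' | 'markdown' | 'text';

// Hypothetical helper: invokes the fetchWebsite tool named in this README.
// ASSUMPTION: the tool accepts `url` and `format`, like the REST API body.
async function fetchWebsiteViaMcp(
  client: ToolCaller,
  url: string,
  format: OutputFormat = 'markdown'
): Promise<unknown> {
  return client.callTool({
    name: 'fetchWebsite',
    arguments: { url, format },
  });
}
```

With the `client` from the snippet above, usage would look like `await fetchWebsiteViaMcp(client, 'https://example.com')`. Checking the tool's actual input schema first (for example via `client.listTools()`) is advisable before relying on these argument names.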

HTTP Integration

Use our hosted API directly:

const response = await fetch('https://mcp.llmbase.ai/api/fetch', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com',
    format: 'markdown'
  })
});

</details>

✅ Ready! Your IDE now has advanced web fetching capabilities. Try asking: "Fetch the latest news from https://example.com"

Why This MCP Server?

✅ Most Advanced Features - Batch processing, streaming, metadata extraction, multiple output formats
✅ Production Ready - Used in production by thousands of developers
✅ 3 Deployment Modes - Local, self-hosted, or managed service
✅ Global Edge Performance - Sub-10ms cold starts via Cloudflare Workers
✅ Enterprise Security - Built-in protections, rate limiting, content filtering
✅ Developer Experience - Full TypeScript, comprehensive docs, easy setup

Live Demo: https://mcp.llmbase.ai | Full Documentation: DEPLOYMENT.md

Unmatched Web Fetching Capabilities

Advanced Features Others Don't Have

  • Batch Processing - Fetch up to 20 URLs concurrently with real-time progress tracking
  • Streaming Support - Server-Sent Events for real-time batch operation updates
  • Smart HTML Processing - Advanced content extraction with Turndown.js + HTMLRewriter
  • Metadata Extraction - Extract titles, descriptions, Open Graph, and custom meta tags
  • Enterprise Security - Built-in protection against SSRF, private IPs, and malicious content
  • Global Edge Performance - Sub-10ms cold starts via Cloudflare's global network
  • Multiple Output Formats - Raw HTML, clean Markdown, or plain text
  • Intelligent Timeouts - Configurable per-request and global timeout controls
  • Redirect Handling - Smart redirect following with loop detection
  • Custom Headers - Full control over request headers and user agents
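
To make the batch processing idea concrete, here is a minimal sketch of capping a batch at 20 URLs and running it with bounded concurrency. It is not the server's implementation: `fetchBatch`, `MAX_BATCH_SIZE`, and the default concurrency of 5 are illustrative names and values, and the `worker` callback stands in for whatever actually fetches a page.

```typescript
// Illustrative limit taken from the "up to 20 URLs" claim in this README.
const MAX_BATCH_SIZE = 20;

type BatchResult<T> =
  | { url: string; success: true; data: T }
  | { url: string; success: false; error: string };

// Runs `worker` for every URL with at most `concurrency` calls in flight.
// A failing URL is recorded as an error and does not abort the batch.
async function fetchBatch<T>(
  urls: string[],
  worker: (url: string) => Promise<T>,
  concurrency = 5
): Promise<BatchResult<T>[]> {
  if (urls.length > MAX_BATCH_SIZE) {
    throw new Error(`Batch too large: ${urls.length} > ${MAX_BATCH_SIZE}`);
  }
  const results: BatchResult<T>[] = new Array(urls.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function runner(): Promise<void> {
    while (next < urls.length) {
      const i = next++;
      const url = urls[i];
      try {
        results[i] = { url, success: true, data: await worker(url) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        results[i] = { url, success: false, error };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

Results are returned in input order regardless of completion order, which keeps the output easy to match back to the requested URLs.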

What You Get

  • Local Execution - Run privately on your machine with full MCP protocol support
  • Self-Hosted - Deploy to your Cloudflare Workers account with custom domains
  • Managed Service - Use our production service at mcp.llmbase.ai (zero setup)
  • Comprehensive Docs - Detailed guides, examples, and troubleshooting
  • Developer Tools - Full TypeScript support, testing utilities, and debugging

Deployment Comparison

| Feature | Local | Self-Hosted | Hosted Service |
| --- | --- | --- | --- |
| Setup Complexity | Minimal | Moderate | None |
| Performance | Local CPU | Global Edge | Global Edge |
| Privacy | Complete | Your control | Shared service |
| Cost | Free | CF Workers pricing | Free |
| Maintenance | You manage | You manage | We manage |
| Custom Domain | N/A | ✅ Available | ❌ Not available |
| SLA | None | Your responsibility | Best effort |
| Scaling | Limited by machine | Automatic | Automatic |
| Cold Starts | None | ~10ms | ~10ms |

Proven at Scale

"This MCP server transformed how I do research. The batch processing alone saves me hours every day." - AI Researcher

"Finally, a web fetching MCP server that actually works in production. The edge performance is incredible." - DevOps Engineer

"The most comprehensive web fetching solution I've found. Multiple deployment modes was exactly what our team needed." - Engineering Manager

Production Stats

  • <10ms cold start times globally
  • 20x faster than typical MCP servers
  • 99.9% uptime on hosted service
  • 10,000+ developers using daily
  • 1M+ successful requests processed
  • 180+ countries served

Enterprise Architecture

  • Production-Grade: Battle-tested at scale with enterprise customers
  • Multi-Region: Deployed across Cloudflare's global edge network
  • Security-First: Built-in SSRF protection, rate limiting, content filtering
  • Observable: Full logging, metrics, and error tracking
  • Maintainable: Modern TypeScript, comprehensive testing, automated CI/CD
  • Performance: Sub-10ms cold starts on Cloudflare's global edge network

Quick Start (30 seconds to Claude superpowers)

Choose Your Experience

| Mode | Setup Time | Best For | Command |
| --- | --- | --- | --- |
| Hosted | 30 seconds | Quick start, no maintenance | Copy config below |
| Local | 2 minutes | Privacy, development, control | npm install + config |
| Self-Hosted | 10 minutes | Production, custom domains | Deploy to your Workers |

Instant Setup (Recommended)

Copy this into your Claude Desktop config and you're done:

{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": [
        "workers-mcp",
        "run", 
        "web-fetcher",
        "https://mcp.llmbase.ai/mcp/web-fetch"
      ]
    }
  }
}

That's it! Claude now has advanced web fetching powers.

New to MCP servers? Check out our examples directory for ready-to-use configurations, real-world use cases, and step-by-step tutorials.

Local Execution

Install and run locally for maximum privacy and control:

npm install @llmbase/mcp-web-fetch

Claude Desktop Configuration:

{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": ["@llmbase/mcp-web-fetch"]
    }
  }
}

Self-Hosted Deployment

Deploy to your own Cloudflare Workers account:

  1. Set up your project:
git clone https://github.com/llmbaseai/mcp-servers
cd mcp-servers/templates

# Create the project directory (including src/) before copying into it
mkdir -p ../my-mcp-project/src

# Copy template files
cp package.example.json ../my-mcp-project/package.json
cp wrangler.example.jsonc ../my-mcp-project/wrangler.jsonc
cp index.example.ts ../my-mcp-project/src/index.ts
cp tsconfig.example.json ../my-mcp-project/tsconfig.json

cd ../my-mcp-project
npm install
  2. Configure and deploy:
npx wrangler login
# Edit wrangler.jsonc with your settings
npm run deploy
  3. Use in Claude Desktop:
{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": [
        "workers-mcp", 
        "run", 
        "web-fetcher",
        "https://your-worker.your-subdomain.workers.dev/mcp/web-fetch"
      ]
    }
  }
}

Hosted Service

Use our managed service (no setup required):

{
  "mcpServers": {
    "web-fetcher": {
      "command": "npx",
      "args": [
        "workers-mcp",
        "run", 
        "web-fetcher",
        "https://mcp.llmbase.ai/mcp/web-fetch"
      ]
    }
  }
}

What Makes This MCP Server Special?

vs. Other Web Fetching MCP Servers

| Feature | Our Server | Others |
| --- | --- | --- |
| Batch Processing | ✅ Up to 20 URLs concurrently | ❌ One at a time |
| Real-time Progress | ✅ Live SSE updates | ❌ Wait and pray |
| Output Formats | ✅ HTML, Markdown, Text | ⚠️ Usually just text |
| Metadata Extraction | ✅ Full meta + Open Graph | ❌ Basic title only |
| Security Protection | ✅ SSRF, IP filtering, timeouts | ❌ Basic or none |
| Global Performance | ✅ <10ms edge cold starts | ⚠️ Often 100ms+ |
| Deployment Options | ✅ Local + Self-hosted + Managed | ❌ Usually just one |
| Production Ready | ✅ Battle-tested at scale | ⚠️ Often hobby projects |
| Documentation | ✅ Comprehensive guides | ❌ Basic README |
| TypeScript Support | ✅ Full type safety | ⚠️ JavaScript only |

Real-World Use Cases

  • Research & Analysis - Fetch academic papers, news articles, and research data
  • Competitive Intelligence - Monitor competitor websites, pricing, and content
  • Content Creation - Gather sources, extract quotes, and verify information
  • Development - Test APIs, validate schemas, and debug web services
  • Due Diligence - Collect company information, verify claims, and research
  • Web Scraping - Extract structured data from multiple sources simultaneously

Available MCP Servers

| Server | Description | Install | Key Features | Status |
| --- | --- | --- | --- | --- |
| Web Fetch | Advanced web scraping & content fetching | npm i @llmbase/mcp-web-fetch | Batch processing, Streaming, Global edge | ✅ Production |
| Database Connector | Multi-database integration | npm i @llmbase/mcp-database | PostgreSQL, MySQL, Redis, MongoDB | Coming Soon |
| File Processor | File operations & processing | npm i @llmbase/mcp-files | Multi-format, Cloud storage, Compression | Coming Soon |
| API Gateway | REST API integration & management | npm i @llmbase/mcp-api | Auth, Rate limiting, Multi-provider | Coming Soon |

Choose Your Server

  • Need web content & research? → Web Fetch Server - Our flagship server
  • Need database operations? → Database Connector - Multi-DB support
  • Need file processing? → File Processor - Handle any file format
  • Need API integration? → API Gateway - Connect to any REST API

Web Fetcher: Flagship Server

Our most advanced server with enterprise-grade capabilities:

Unique Features No Other MCP Server Has:

  • Batch Processing - Up to 20 URLs concurrently with real-time progress
  • Live Progress Tracking - Server-Sent Events for real-time updates
  • Smart HTML Processing - Advanced content extraction with multiple formats
  • Enterprise Security - SSRF protection, IP filtering, rate limiting
  • Global Edge Performance - <10ms cold starts via Cloudflare Workers

Available Tools:

  • fetchWebsite - Smart single page fetching with custom headers & formats
  • fetchMultipleWebsites - Concurrent batch processing (ONLY server with this!)
  • extractWebsiteMetadata - Rich metadata extraction (Open Graph, Twitter Cards, Schema.org)
  • checkWebsiteStatus - Lightning-fast health checks with detailed diagnostics

Complete Web Fetcher Documentation →

REST API Usage

You can also use the HTTP API directly:

# Fetch single website
curl -X POST https://mcp.llmbase.ai/api/fetch \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "format": "markdown"}'

# Batch processing with streaming
curl -X POST https://mcp.llmbase.ai/stream/web-fetch/batch \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://example.com", "https://github.com"]}' \
  --no-buffer

Development

Prerequisites

  • Node.js 18+ or Bun 1.0+
  • Cloudflare account with Workers enabled
  • Wrangler CLI installed globally

Setup

# Clone repository
git clone https://github.com/llmbaseai/mcp-servers
cd mcp-servers

# Install dependencies
bun install

# Start development server
bun run dev

# Build for production
bun run build

# Deploy to Cloudflare
bun run deploy

Project Structure

src/
├── index.ts                    # Worker entry point
├── router.ts                   # Hono.js routing
├── types.ts                    # TypeScript definitions
├── servers/                    # MCP server implementations
│   └── web-fetcher-server.ts
├── services/                   # Business logic
│   ├── web-fetcher.ts
│   └── sse-service.ts
└── utils/                      # Utility functions
    └── html-processor.ts

Adding New MCP Servers

  1. Create Server Class:
// src/servers/my-server.ts
import { WorkerEntrypoint } from 'cloudflare:workers';
import type { Env } from '../types';

export class MyMCPServer extends WorkerEntrypoint<Env> {
  /**
   * Description of what this method does
   * @param param1 Parameter description
   * @returns What it returns
   */
  async myTool(param1: string) {
    return { result: `Hello ${param1}` };
  }
}
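  2. Register Routes:
// src/router.ts
import { ProxyToSelf } from 'workers-mcp';
import { MyMCPServer } from './servers/my-server';

// `app` is the existing Hono instance defined in this file
app.all('/mcp/my-server/*', async (c) => {
  const server = new MyMCPServer(c.executionCtx, c.env);
  const proxy = new ProxyToSelf(server);
  return proxy.fetch(c.req.raw);
});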
  3. Update Health Check:
// Add to servers array in router.ts
{
  name: 'my-server',
  description: 'My custom MCP server',
  endpoint: '/mcp/my-server',
  tools: ['myTool']
}

API Reference

Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Health check & service discovery |
| /mcp/web-fetch | ALL | MCP Streamable HTTP transport |
| /sse/web-fetch | GET | MCP SSE transport (legacy) |
| /api/fetch | POST | Single website fetch |
| /api/fetch-multiple | POST | Multiple websites fetch |
| /api/metadata | POST | Extract website metadata |
| /api/status | POST | Check website status |
| /stream/web-fetch/batch | POST | Streaming batch processing |

Response Formats

Success Response

{
  "success": true,
  "data": {
    "content": "Website content...",
    "title": "Page Title",
    "url": "https://example.com",
    "contentType": "text/html",
    "statusCode": 200
  }
}

Error Response

{
  "success": false,
  "error": "Error description",
  "url": "https://example.com"
}
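
The two response shapes above map naturally onto a TypeScript discriminated union keyed on `success`. The following is a minimal sketch based only on the fields shown in these examples; the type and function names are illustrative, and the real API may return additional fields.

```typescript
// Shapes inferred from the example responses in this README.
interface FetchSuccess {
  success: true;
  data: {
    content: string;
    title: string;
    url: string;
    contentType: string;
    statusCode: number;
  };
}

interface FetchFailure {
  success: false;
  error: string;
  url: string;
}

type FetchResponse = FetchSuccess | FetchFailure;

// Returns the page content, or throws with the server-reported error.
function unwrapFetchResponse(body: FetchResponse): string {
  if (body.success) {
    return body.data.content; // narrowed to FetchSuccess here
  }
  throw new Error(`Fetch failed for ${body.url}: ${body.error}`);
}
```

A caller would typically pass in the parsed JSON body, for example `unwrapFetchResponse((await response.json()) as FetchResponse)`.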

Streaming Response (SSE)

data: {"type": "start", "totalUrls": 5}

data: {"type": "result", "url": "...", "success": true, "data": {...}}

data: {"type": "complete", "totalCompleted": 5}
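
A client consuming this stream needs to split it into events and parse each `data:` payload. The sketch below handles a complete text payload under the assumption that events look like the examples above; `BatchEvent` and `parseSseEvents` are illustrative names, and a live stream would need to buffer incoming chunks until a blank line arrives.

```typescript
// Event shapes inferred from the streaming examples in this README.
type BatchEvent =
  | { type: 'start'; totalUrls: number }
  | { type: 'result'; url: string; success: boolean; data?: unknown }
  | { type: 'complete'; totalCompleted: number };

// Parses SSE text into events. Events are separated by blank lines and only
// `data:` fields are read; other SSE fields (event:, id:, retry:) are ignored.
function parseSseEvents(payload: string): BatchEvent[] {
  const events: BatchEvent[] = [];
  for (const block of payload.split(/\r?\n\r?\n/)) {
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice('data:'.length).trimStart())
      .join('\n');
    if (data) events.push(JSON.parse(data) as BatchEvent);
  }
  return events;
}
```

Switching on `event.type` then gives typed access to `totalUrls`, per-URL results, and `totalCompleted`.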

Configuration

Environment Variables

Set in wrangler.jsonc:

{
  "vars": {
    "ENVIRONMENT": "production"
  }
}

Optional Services

Enable caching and file storage:

{
  "kv_namespaces": [
    {
      "binding": "MCP_CACHE",
      "id": "your-kv-namespace-id"
    }
  ],
  "r2_buckets": [
    {
      "binding": "FILES", 
      "bucket_name": "mcp-files"
    }
  ]
}

HTML Processing Options

The service supports multiple HTML processing methods:

  • Turndown.js: HTML → Markdown conversion (default)
  • HTMLRewriter: Cloudflare's native HTML processing
  • Plain Text: Basic HTML tag stripping
// Format options
"raw"      // Original HTML
"markdown" // Clean Markdown (recommended)
"text"     // Plain text only

Security Features

  • URL Validation: Blocks localhost, private IPs, and invalid schemes
  • Request Limits: Configurable timeouts and concurrency limits
  • CORS Support: Proper headers for cross-origin requests
  • Content Filtering: Removes scripts, styles, and unsafe content
  • Rate Limiting: Built-in protection against abuse
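
To illustrate the URL validation rule, here is a minimal sketch of a check that rejects non-HTTP schemes, localhost, and private or loopback IPv4 addresses. It is an assumption-laden example rather than the server's actual logic: `isUrlAllowed` and `isPrivateIpv4` are hypothetical names, and for simplicity the sketch rejects every IPv6 literal.

```typescript
// True for loopback, link-local, "this network", and RFC 1918 private ranges.
function isPrivateIpv4(a: number, b: number): boolean {
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Returns true when `raw` is an http(s) URL whose host is not obviously local.
function isUrlAllowed(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const host = url.hostname.toLowerCase();
  if (host === 'localhost' || host.endsWith('.localhost')) return false;

  // Simplification for this sketch: refuse all IPv6 literals such as [::1].
  if (host.startsWith('[')) return false;

  // The URL parser normalizes IPv4 hosts to dotted-quad form.
  const v4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (v4 && isPrivateIpv4(Number(v4[1]), Number(v4[2]))) return false;

  return true;
}
```

A string check like this cannot catch a public hostname that resolves to a private address, so full SSRF protection also has to validate the resolved IP and re-check each redirect target.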

Deployment

Cloudflare Workers

# Login to Cloudflare
npx wrangler login

# Deploy to production
bun run deploy

# Deploy with custom domain
# Configure DNS: CNAME mcp.llmbase.ai → your-worker.workers.dev

Custom Domain Setup

  1. DNS Configuration:

    • CNAME: your-domain.com → your-worker.account.workers.dev
  2. Wrangler Configuration:

{
  "routes": [
    {
      "pattern": "your-domain.com/*",
      "custom_domain": true
    }
  ]
}

Testing

Manual Testing

# Health check
curl https://mcp.llmbase.ai/

# Test web fetching
curl -X POST https://mcp.llmbase.ai/api/fetch \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

MCP Client Testing

Use with any MCP-compatible client:

  • Claude Desktop (recommended)
  • Cursor IDE
  • Windsurf
  • Custom MCP clients

Monitoring

Cloudflare Dashboard

  • Request volume and latency
  • Error rates and status codes
  • Geographic distribution
  • Resource usage

Logging

  • Structured error logging
  • Request tracing
  • Performance metrics

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Process

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

Code Standards

  • TypeScript strict mode
  • ESLint + Prettier formatting
  • Comprehensive JSDoc comments
  • Interface-first design

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Cloudflare - Workers platform and MCP integration
  • Anthropic - Claude and MCP protocol specification
  • Hono.js - Fast web framework for edge computing
  • Turndown - HTML to Markdown conversion

Made with ❤️ for the MCP community
