WebFetch.MCP

Enables local LLMs to search the web and fetch clean content from URLs without API keys, using SearxNG and Mozilla Readability.

🌐 WebFetch.MCP v0.1.8

Live Web Access for Your Local AI: Tunable Search & Clean Content Extraction

License: MIT · Node.js 18+ · LM Studio Compatible · SearxNG Powered

🚨 The Problem

Local LLMs can't browse the web. Out of the box, LM Studio, like most MCP setups, leaves your model stuck in 2023 or earlier. No live data. No current events. Paste a URL into chat and all you get back is:

"I can't access the web."

A few third-party MCP servers exist, but they're API-locked, incomplete, or a pain to run. That leaves LM Studio users flying blind, unable to fetch or search live content reliably.

✅ The Solution: WebFetch.MCP

WebFetch.MCP is a drop-in, self-hosted MCP server that brings your local AI:

  • 🕒 Fresh, Real-Time Data: Go beyond your model's training cutoff.
  • 🌐 Reliable URL Fetch: Paste a link, get the clean content.
  • 🎛 Full Search Control: Choose engines, boost sources, filter by type/date/language.
  • 🔓 API-Free Freedom: No API keys, quotas, or tracking.
  • 🧠 AI-Ready Output: Structured, clean, distraction-free text your LLM can actually use.

Privacy Note: Search requests and web fetches are visible to your ISP and target sites. Use a VPN for enhanced privacy.

🏆 Why It's Different

| Feature | WebFetch.MCP | mrkrsl-web-search | mcp-server-fetch-python | Crawl4AI |
|---|---|---|---|---|
| Live Web Search | ✅ Yes | ✅ Yes | ❌ No | ✅ Yes |
| URL Content Fetch | ✅ Yes | ⚠️ Limited | ✅ Yes | ✅ Yes |
| Search Tunability | ✅ Full Control | ❌ API-limited | ❌ Basic | ⚠️ Limited |
| 70+ Search Engines | ✅ Yes | ❌ No | ❌ No | ⚠️ Few |
| Scientific/Technical Focus | ✅ Configurable | ❌ No | ❌ No | ❌ No |
| No API Keys | ✅ Yes | ❌ Required | ❌ Required | ✅ Basic only |
| Content Quality | ✅ Mozilla Readability | ⚠️ Basic | ⚠️ Basic | ✅ Advanced |
| JS Execution | ✅ Yes (JSDOM) | ❌ No | ✅ Yes | ✅ Yes |
| Setup Simplicity | ✅ Easy | ⚠️ Medium | ❌ Complex | ❌ Very Complex |
| Cost | ✅ Free | 💰 API costs | 💰 API costs | ✅ Free |

✨ Core Features

🎯 Precision Search

  • 70+ configurable engines: Google Scholar, arXiv, PubMed, IEEE, GitHub, Stack Overflow, weather.gov, and more.
  • Weighted source control: Boost authoritative and academic sources.
  • Data type filters: Papers, docs, code, or news only.
  • Freshness filters: Recent publications, latest docs, breaking news.

🔬 Scientific & Technical Focus

  • Academic: arXiv, PubMed, IEEE Xplore, ACM Digital Library.
  • Technical: MDN, Stack Overflow, GitHub, official docs.
  • Government: weather.gov, data.gov, NASA, NOAA.

📄 Clean Content Extraction

  • Mozilla Readability: industry-standard parsing.
  • JavaScript execution: handles SPAs & dynamic pages.
  • Removes ads, menus, widgets.
  • Optimized handling for research papers & technical docs.

⚙️ Complete Control

  • Enable only trusted engines.
  • Language & region targeting.
  • Domain/site restrictions.
  • Custom weighting per source.
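
As a rough illustration of how controls like these could map onto a SearxNG query, the sketch below builds a search URL by hand. The parameter names (`q`, `format`, `engines`, `language`, `time_range`) come from SearxNG's search API; the `buildSearchUrl` helper and its option names are hypothetical and are not taken from WebFetch.MCP's source, which may construct its requests differently.

```javascript
// Hypothetical helper (not part of WebFetch.MCP): build a SearxNG search URL.
// Only the query-string parameter names are SearxNG's; everything else is
// illustrative.
function buildSearchUrl(base, query, options = {}) {
  const url = new URL("/search", base);
  // A site: operator in the query text narrows results to one domain,
  // provided the underlying engines honor it.
  const q = options.site ? `${query} site:${options.site}` : query;
  url.searchParams.set("q", q);
  url.searchParams.set("format", "json");
  if (options.engines && options.engines.length > 0) {
    // SearxNG expects a comma-separated list of engine names.
    url.searchParams.set("engines", options.engines.join(","));
  }
  if (options.language) url.searchParams.set("language", options.language);
  if (options.timeRange) url.searchParams.set("time_range", options.timeRange);
  return url.toString();
}

const exampleUrl = buildSearchUrl("http://localhost:8080", "asyncio tutorial", {
  engines: ["arxiv", "github"],
  language: "en",
  timeRange: "year",
  site: "python.org",
});
console.log(exampleUrl);
```

Per-source weighting is not covered here; in SearxNG that is normally configured per engine in settings.yml rather than per request.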

📋 Prerequisites

⚡ Quick Start

1️⃣ Install SearxNG (5 min)

Docker Compose (Recommended)

git clone https://github.com/searxng/searxng-docker.git
cd searxng-docker
sed -i "s|ultrasecretkey|$(openssl rand -hex 32)|g" searxng/settings.yml
docker compose up -d

Test SearxNG

curl "http://localhost:8080/search?q=test&format=json"
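
If the request returns 403, the `json` output format may not be enabled under `search.formats` in searxng/settings.yml. Once it works, the response can be reduced to the fields an LLM needs. The sketch below assumes the usual SearxNG JSON shape (a `results` array whose items carry `title`, `url`, and `content`); the `summarizeResults` helper and the sample data are made up for illustration.

```javascript
// Sample response trimmed to the fields used below; the values are invented.
const sampleResponse = {
  query: "test",
  results: [
    { title: "Example Domain", url: "https://example.com/", content: "Illustrative snippet.", engine: "duckduckgo" },
    { title: "No snippet here", url: "https://example.org/", engine: "brave" },
  ],
};

// Hypothetical helper: keep only title, URL, and snippet for the first
// `limit` results, tolerating missing fields.
function summarizeResults(response, limit = 5) {
  const results = Array.isArray(response.results) ? response.results : [];
  return results.slice(0, limit).map((r) => ({
    title: r.title ?? "(untitled)",
    url: r.url,
    snippet: r.content ?? "",
  }));
}

// Against a live instance (Node.js 18+ ships a global fetch):
//   const res = await fetch("http://localhost:8080/search?q=test&format=json");
//   console.log(summarizeResults(await res.json()));
console.log(summarizeResults(sampleResponse));
```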

📖 SearxNG Installation Guide

2️⃣ Install WebFetch.MCP

git clone https://github.com/manull/webfetch-mcp.git
cd webfetch-mcp
npm install
node server.mjs

3️⃣ Connect to LM Studio

In LM Studio → Settings → Developer → MCP Servers:

{
  "mcpServers": {
    "webfetch": {
      "command": "node",
      "args": ["/full/path/to/webfetch-mcp/server.mjs"],
      "env": {
        "SEARXNG_BASE": "http://localhost:8080",
        "DEBUG": "false"
      }
    }
  }
}

Restart LM Studio. The web_search and web_fetch tools will now be available.
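
For orientation, this is roughly what an MCP client sends when the model invokes one of these tools. The `tools/call` method and the `name`/`arguments` parameters follow the Model Context Protocol; the argument names `query` and `url` are assumptions, since this README does not list the tools' input schemas, and the `toolCall` helper is hypothetical.

```javascript
// Hypothetical helper: build an MCP "tools/call" request (JSON-RPC 2.0).
let nextId = 1;
function toolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Tool names come from this README; the argument names are assumed.
const searchRequest = toolCall("web_search", { query: "transformer architectures" });
const fetchRequest = toolCall("web_fetch", { url: "https://example.com/article" });

console.log(JSON.stringify(searchRequest, null, 2));
console.log(JSON.stringify(fetchRequest, null, 2));
```

LM Studio builds these messages itself; the sketch is only useful when debugging the server by hand.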

4️⃣ Test It

In LM Studio:

🔍 Search for recent AI research on transformer architectures
📄 Fetch content from https://example.com/article

🔧 Configuration

| Variable | Default | Description |
|---|---|---|
| SEARXNG_BASE | http://localhost:8080 | SearxNG instance URL |
| DEBUG | false | Debug logging |
| DETAILED_LOG | true | Detailed log output |
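
A minimal sketch of how these variables and defaults might be read in Node.js. The variable names and defaults come from the table above; the `loadConfig` and `parseBool` helpers are hypothetical, and server.mjs may parse its environment differently (for example, it might accept `1` as well as `true`).

```javascript
// Hypothetical loader mirroring the documented variables and defaults.
function parseBool(value, fallback) {
  if (value === undefined || value === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

function loadConfig(env) {
  const base = env.SEARXNG_BASE || "http://localhost:8080";
  new URL(base); // throws a TypeError early if the URL is malformed
  return {
    searxngBase: base.replace(/\/+$/, ""), // drop trailing slashes
    debug: parseBool(env.DEBUG, false),
    detailedLog: parseBool(env.DETAILED_LOG, true),
  };
}

// In a real server this would be loadConfig(process.env).
console.log(loadConfig({}));
console.log(loadConfig({ SEARXNG_BASE: "http://searx.local:8888/", DEBUG: "true" }));
```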

⏱️ Smart Rate Limiting

WebFetch.MCP uses intelligent time-based rate limiting designed for real research workflows:

📊 Rate Limits:

  • 12 calls per 5-minute window - Generous limit for research sessions
  • 8 calls per 30-second burst - Prevents LLM spam while allowing quick queries
  • Automatic reset - No need to restart LM Studio between research sessions

🎯 Why This Works Better:

  • ✅ Research-friendly - Supports extended research sessions
  • ✅ Anti-spam protection - Prevents runaway LLM tool calling
  • ✅ No restarts needed - Limits reset automatically over time
  • ✅ Clear feedback - Shows remaining calls and reset times

📈 Example Usage Patterns:

  • Quick research: 5-8 rapid calls, then brief pause
  • Extended research: 12 calls spread over 5 minutes
  • Continuous work: Limits reset as you work, no interruption
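
One way to combine a burst limit with a longer window is a sliding-window limiter that checks both before accepting a call. The sketch below uses the two limits quoted above (8 calls per 30 seconds, 12 calls per 5 minutes); the `DualWindowLimiter` class is hypothetical, and whether WebFetch.MCP uses sliding or fixed windows is not stated here, so treat this as one possible reading rather than the server's implementation.

```javascript
// Hypothetical sliding-window limiter enforcing several limits at once.
class DualWindowLimiter {
  constructor(windows, now = () => Date.now()) {
    this.windows = windows; // e.g. [{ limit: 8, ms: 30000 }, ...]
    this.now = now;         // injectable clock, handy for testing
    this.calls = [];        // timestamps of accepted calls, oldest first
  }

  tryAcquire() {
    const t = this.now();
    const longest = Math.max(...this.windows.map((w) => w.ms));
    // Forget calls that have aged out of every window.
    this.calls = this.calls.filter((ts) => t - ts < longest);

    let retryInMs = 0;
    for (const { limit, ms } of this.windows) {
      const inWindow = this.calls.filter((ts) => t - ts < ms);
      if (inWindow.length >= limit) {
        // A slot frees up when the oldest call in this window expires.
        retryInMs = Math.max(retryInMs, inWindow[0] + ms - t);
      }
    }
    if (retryInMs > 0) return { allowed: false, retryInMs };

    this.calls.push(t);
    return { allowed: true, retryInMs: 0 };
  }
}

const LIMITS = [
  { limit: 8, ms: 30 * 1000 },      // burst window
  { limit: 12, ms: 5 * 60 * 1000 }, // research-session window
];

// Demo with a frozen fake clock: the ninth rapid call hits the burst limit.
let fakeNow = 0;
const limiter = new DualWindowLimiter(LIMITS, () => fakeNow);
for (let i = 1; i <= 9; i++) {
  const { allowed, retryInMs } = limiter.tryAcquire();
  console.log(`call ${i}: ${allowed ? "ok" : `blocked, retry in ${retryInMs / 1000}s`}`);
}
```

Because old timestamps simply age out, the limits reset on their own as time passes, which matches the "no restarts needed" behavior described above.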

📊 Example Usage

Search

🔍 Find Python asyncio docs site:python.org
🔍 Search for recent climate data from government sources

Fetch

📄 Extract content from https://news.example.com/article
📄 Get main text from https://arxiv.org/abs/2305.12345

🧪 Testing

curl "http://localhost:8080/search?format=json&q=test&count=5"
DEBUG=true node server.mjs

🤝 Contributing

We welcome:

  • 🐛 Bug reports → Open an issue
  • 🔧 Code PRs
  • 📖 Documentation improvements

📄 License

MIT. See LICENSE.

🙏 Acknowledgments


Built for LM Studio and local LLM users who need real-time, reliable, tunable access to the web.

⭐ Star this repo if you're done with "I can't access the web" from your AI.
