MCP Servers

WebFetch.MCP

Enables local LLMs to search the web and fetch clean content from URLs without API keys, using SearxNG and Mozilla Readability.

README

🌐 WebFetch.MCP v0.1.8

Live Web Access for Your Local AI — Tunable Search & Clean Content Extraction

🚨 The Problem

Local LLMs can't browse the web. Out of the box, LM Studio — and most MCP setups — leave your model stuck in 2023 or earlier. No live data. No current events. Paste a URL into chat and all you get back is:

"I can't access the web." A few third-party MCP servers exist, but they’re API-locked, incomplete, or a pain to run. That means LM Studio users are flying blind — unable to fetch or search live content reliably.

✅ The Solution — WebFetch.MCP

WebFetch.MCP is a drop-in, self-hosted MCP server that brings your local AI:

🕒 Fresh, Real-Time Data — Go beyond your model’s training cutoff.
🌐 Reliable URL Fetch — Paste a link, get the clean content.
🎛 Full Search Control — Choose engines, boost sources, filter by type/date/language.
🔓 API-Free Freedom — No API keys, quotas, or tracking.
🧠 AI-Ready Output — Structured, clean, distraction-free text your LLM can actually use.

Privacy Note: Search requests and web fetches are visible to your ISP and target sites. Use a VPN for enhanced privacy.

🏆 Why It’s Different

Feature	WebFetch.MCP	mrkrsl-web-search	mcp-server-fetch-python	Crawl4AI
Live Web Search	✅ Yes	✅ Yes	❌ No	✅ Yes
URL Content Fetch	✅ Yes	⚠️ Limited	✅ Yes	✅ Yes
Search Tunability	✅ Full Control	❌ API-limited	❌ Basic	⚠️ Limited
70+ Search Engines	✅ Yes	❌ No	❌ No	⚠️ Few
Scientific/Technical Focus	✅ Configurable	❌ No	❌ No	❌ No
No API Keys	✅ Yes	❌ Required	❌ Required	✅ Basic only
Content Quality	✅ Mozilla Readability	⚠️ Basic	⚠️ Basic	✅ Advanced
JS Execution	✅ Yes (JSDOM)	❌ No	✅ Yes	✅ Yes
Setup Simplicity	✅ Easy	⚠️ Medium	❌ Complex	❌ Very Complex
Cost	✅ Free	💰 API costs	💰 API costs	✅ Free

✨ Core Features

🎯 Precision Search

70+ configurable engines — Google Scholar, arXiv, PubMed, IEEE, GitHub, Stack Overflow, weather.gov, and more.
Weighted source control — Boost authoritative and academic sources.
Data type filters — Papers, docs, code, or news only.
Freshness filters — Recent publications, latest docs, breaking news.

🔬 Scientific & Technical Focus

Academic: arXiv, PubMed, IEEE Xplore, ACM Digital Library.
Technical: MDN, Stack Overflow, GitHub, official docs.
Government: weather.gov, data.gov, NASA, NOAA.

📄 Clean Content Extraction

Mozilla Readability — industry-standard parsing.
JavaScript execution — handles SPAs & dynamic pages.
Removes ads, menus, widgets.
Optimized handling for research papers & technical docs.

⚙️ Complete Control

Enable only trusted engines.
Language & region targeting.
Domain/site restrictions.
Custom weighting per source.

📋 Prerequisites

Node.js 18+ → Download
Docker & Docker Compose → Install
LM Studio → Download

⚡ Quick Start

1️⃣ Install SearxNG (5 min)

Docker Compose (Recommended)

git clone https://github.com/searxng/searxng-docker.git
cd searxng-docker
sed -i "s|ultrasecretkey|$(openssl rand -hex 32)|g" searxng/settings.yml
docker compose up -d

Test SearxNG

curl "http://localhost:8080/search?q=test&format=json"

📖 SearxNG Installation Guide

2️⃣ Install WebFetch.MCP

git clone https://github.com/manull/webfetch-mcp.git
cd webfetch-mcp
npm install
node server.mjs

3️⃣ Connect to LM Studio

In LM Studio → Settings → Developer → MCP Servers:

{
  "mcpServers": {
    "webfetch": {
      "command": "node",
      "args": ["/full/path/to/webfetch-mcp/server.mjs"],
      "env": {
        "SEARXNG_BASE": "http://localhost:8080",
        "DEBUG": "false"
      }
    }
  }
}

Restart LM Studio — web_search and web_fetch tools will now be available.

4️⃣ Test It

In LM Studio:

🔍 Search for recent AI research on transformer architectures
📄 Fetch content from https://example.com/article

🔧 Configuration

Variable	Default	Description
SEARXNG_BASE	http://localhost:8080	SearxNG instance URL
DEBUG	false	Debug logging
DETAILED_LOG	true	Detailed log output

⏱️ Smart Rate Limiting

WebFetch.MCP uses intelligent time-based rate limiting designed for real research workflows:

📊 Rate Limits:

12 calls per 5-minute window - Generous limit for research sessions
8 calls per 30-second burst - Prevents LLM spam while allowing quick queries
Automatic reset - No need to restart LM Studio between research sessions

🎯 Why This Works Better:

✅ Research-friendly - Supports extended research sessions
✅ Anti-spam protection - Prevents runaway LLM tool calling
✅ No restarts needed - Limits reset automatically over time
✅ Clear feedback - Shows remaining calls and reset times

📈 Example Usage Patterns:

Quick research: 5-8 rapid calls, then brief pause
Extended research: 12 calls spread over 5 minutes
Continuous work: Limits reset as you work, no interruption

📊 Example Usage

Search

🔍 Find Python asyncio docs site:python.org
🔍 Search for recent climate data from government sources

Fetch

📄 Extract content from https://news.example.com/article
📄 Get main text from https://arxiv.org/abs/2305.12345

🧪 Testing

curl "http://localhost:8080/search?format=json&q=test&count=5"
DEBUG=true node server.mjs

🤝 Contributing

We welcome:

🐛 Bug reports → Open an issue
🔧 Code PRs
📖 Documentation improvements

📄 License

MIT — see LICENSE.

🙏 Acknowledgments

SearxNG — Privacy-focused metasearch engine.
Mozilla Readability — Clean content extraction.
LM Studio — Local AI runtime.
Model Context Protocol — AI tool integration standard.

Built for LM Studio and local LLM users who need real-time, reliable, tunable access to the web.

⭐ Star this repo if you're done with "I can't access the web" from your AI.

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured