WebFetch.MCP
Enables local LLMs to search the web and fetch clean content from URLs without API keys, using SearxNG and Mozilla Readability.
WebFetch.MCP v0.1.8
Live Web Access for Your Local AI: Tunable Search & Clean Content Extraction
The Problem
Local LLMs can't browse the web. Out of the box, LM Studio, like most MCP setups, leaves your model stuck in 2023 or earlier. No live data. No current events. Paste a URL into chat and all you get back is:
"I can't access the web."
A few third-party MCP servers exist, but they're API-locked, incomplete, or a pain to run. That leaves LM Studio users flying blind, unable to fetch or search live content reliably.
The Solution: WebFetch.MCP
WebFetch.MCP is a drop-in, self-hosted MCP server that brings your local AI:
- Fresh, Real-Time Data: Go beyond your model's training cutoff.
- Reliable URL Fetch: Paste a link, get the clean content.
- Full Search Control: Choose engines, boost sources, filter by type/date/language.
- API-Free Freedom: No API keys, quotas, or tracking.
- AI-Ready Output: Structured, clean, distraction-free text your LLM can actually use.
Privacy Note: Search requests and web fetches are visible to your ISP and target sites. Use a VPN for enhanced privacy.
Why It's Different
| Feature | WebFetch.MCP | mrkrsl-web-search | mcp-server-fetch-python | Crawl4AI |
|---|---|---|---|---|
| Live Web Search | Yes | Yes | No | Yes |
| URL Content Fetch | Yes | Limited | Yes | Yes |
| Search Tunability | Full control | API-limited | Basic | Limited |
| 70+ Search Engines | Yes | No | No | Few |
| Scientific/Technical Focus | Configurable | No | No | No |
| No API Keys | Yes | No (keys required) | No (keys required) | Basic only |
| Content Quality | Mozilla Readability | Basic | Basic | Advanced |
| JS Execution | Yes (JSDOM) | No | Yes | Yes |
| Setup Simplicity | Easy | Medium | Complex | Very complex |
| Cost | Free | API costs | API costs | Free |
Core Features
Precision Search
- 70+ configurable engines: Google Scholar, arXiv, PubMed, IEEE, GitHub, Stack Overflow, weather.gov, and more.
- Weighted source control: Boost authoritative and academic sources.
- Data type filters: Papers, docs, code, or news only.
- Freshness filters: Recent publications, latest docs, breaking news.
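Weighted source control can be pictured as a re-ranking step over search results. The sketch below is only an illustration, not the server's actual code: the `ENGINE_WEIGHTS` table, the `rankResults` helper, and the sample data are hypothetical, and the `engine` and `score` fields are assumed to match what SearxNG returns in its JSON results. The real weighting may well be applied inside SearxNG itself rather than in WebFetch.MCP.

```javascript
// Hypothetical per-engine weights (assumption: keys match SearxNG engine names).
const ENGINE_WEIGHTS = { arxiv: 2.0, pubmed: 1.8, github: 1.2 };

// Multiply each result's score by the weight of the engine that produced it,
// then sort so the highest weighted score comes first.
function rankResults(results, weights = ENGINE_WEIGHTS) {
  return results
    .map((r) => ({ ...r, weighted: (r.score ?? 1) * (weights[r.engine] ?? 1) }))
    .sort((a, b) => b.weighted - a.weighted);
}

// Made-up sample results for illustration only.
const sample = [
  { title: "Blog post", url: "https://blog.example.com/a", engine: "duckduckgo", score: 1.5 },
  { title: "Preprint", url: "https://papers.example.org/p1", engine: "arxiv", score: 1.0 },
];

const ranked = rankResults(sample);
console.log(ranked.map((r) => r.title)); // the arXiv result now ranks first
```

Engines without an entry in the table keep a neutral weight of 1, so boosting a few trusted sources does not require listing every engine.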
Scientific & Technical Focus
- Academic: arXiv, PubMed, IEEE Xplore, ACM Digital Library.
- Technical: MDN, Stack Overflow, GitHub, official docs.
- Government: weather.gov, data.gov, NASA, NOAA.
Clean Content Extraction
- Mozilla Readability: industry-standard parsing.
- JavaScript execution: handles SPAs & dynamic pages.
- Removes ads, menus, widgets.
- Optimized handling for research papers & technical docs.
Complete Control
- Enable only trusted engines.
- Language & region targeting.
- Domain/site restrictions.
- Custom weighting per source.
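The controls above map naturally onto query parameters of the SearxNG search endpoint. The following is a minimal sketch under stated assumptions: `buildSearchUrl` and its option names (`engines`, `language`, `timeRange`, `site`) are hypothetical, while the query parameters they produce (`q`, `format`, `engines`, `language`, `time_range`) are taken from the SearxNG search API. How WebFetch.MCP builds its requests internally is not documented here.

```javascript
// Build a SearxNG JSON search URL from a query and optional filters.
// The opts shape is this sketch's own invention, not the server's tool schema.
function buildSearchUrl(base, query, opts = {}) {
  const url = new URL("/search", base);
  // A domain restriction is expressed with the common "site:" query operator.
  const q = opts.site ? `${query} site:${opts.site}` : query;
  url.searchParams.set("q", q);
  url.searchParams.set("format", "json");
  if (opts.engines?.length) url.searchParams.set("engines", opts.engines.join(","));
  if (opts.language) url.searchParams.set("language", opts.language);
  if (opts.timeRange) url.searchParams.set("time_range", opts.timeRange); // e.g. day, month, year
  return url.toString();
}

const searchUrl = buildSearchUrl("http://localhost:8080", "asyncio", {
  engines: ["arxiv", "github"],
  language: "en",
  timeRange: "year",
  site: "python.org",
});
console.log(searchUrl);
```

Using `URL` and `searchParams` keeps the query correctly percent-encoded, which matters once queries contain spaces, colons, or quotes.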
Prerequisites
Quick Start
1. Install SearxNG (5 min)
Docker Compose (Recommended)
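```bash
# Fetch the Docker Compose setup for SearxNG
git clone https://github.com/searxng/searxng-docker.git
cd searxng-docker
# Replace the placeholder secret key with a random one
sed -i "s|ultrasecretkey|$(openssl rand -hex 32)|g" searxng/settings.yml
# Start the containers in the background
docker compose up -d
```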
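Test SearxNG
```bash
curl "http://localhost:8080/search?q=test&format=json"
```
If this returns 403 Forbidden, the JSON output format is not enabled: add json to the search.formats list in searxng/settings.yml and restart the container.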
SearxNG Installation Guide
2. Install WebFetch.MCP
```bash
git clone https://github.com/manull/webfetch-mcp.git
cd webfetch-mcp
npm install
# Start the MCP server
node server.mjs
```
3. Connect to LM Studio
In LM Studio → Settings → Developer → MCP Servers:
```json
{
  "mcpServers": {
    "webfetch": {
      "command": "node",
      "args": ["/full/path/to/webfetch-mcp/server.mjs"],
      "env": {
        "SEARXNG_BASE": "http://localhost:8080",
        "DEBUG": "false"
      }
    }
  }
}
```
Restart LM Studio. The web_search and web_fetch tools will now be available.
4. Test It
In LM Studio:
- Search for recent AI research on transformer architectures
- Fetch content from https://example.com/article
Configuration
| Variable | Default | Description |
|---|---|---|
| SEARXNG_BASE | http://localhost:8080 | SearxNG instance URL |
| DEBUG | false | Debug logging |
| DETAILED_LOG | true | Detailed log output |
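As a rough illustration of how the variables in the table could be consumed, here is a small loader that applies the documented defaults. This is a sketch, not the code in server.mjs: the `loadConfig` helper and its return shape are hypothetical, and treating only the string "true" as enabled is an assumption.

```javascript
// Read the documented variables from an environment object and apply the
// defaults from the configuration table. Assumption: only "true" enables a flag.
function loadConfig(env = process.env) {
  const flag = (value, fallback) =>
    value === undefined ? fallback : value.trim().toLowerCase() === "true";
  return {
    // Drop trailing slashes so request paths can be appended safely.
    searxngBase: (env.SEARXNG_BASE ?? "http://localhost:8080").replace(/\/+$/, ""),
    debug: flag(env.DEBUG, false),
    detailedLog: flag(env.DETAILED_LOG, true),
  };
}

const config = loadConfig({ SEARXNG_BASE: "http://localhost:8080/", DEBUG: "true" });
console.log(config);
```

Passing the environment in as an argument keeps the loader easy to check without touching the real process environment.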
Smart Rate Limiting
WebFetch.MCP applies time-based rate limiting tuned for research workflows:
Rate Limits:
- 12 calls per 5-minute window - Generous limit for research sessions
- 8 calls per 30-second burst - Prevents LLM spam while allowing quick queries
- Automatic reset - No need to restart LM Studio between research sessions
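One way to combine the two limits above is a sliding-window limiter that checks every window on each call. The sketch below assumes that approach; the `RateLimiter` class, its `tryCall` method, and the returned fields are hypothetical and may differ from what the server actually does. Only the numbers (12 calls per 5 minutes, 8 calls per 30 seconds) come from this README.

```javascript
// Limits from the list above; the sliding-window strategy is an assumption.
const LIMITS = [
  { max: 12, ms: 5 * 60 * 1000 }, // 12 calls per 5-minute window
  { max: 8, ms: 30 * 1000 },      // 8 calls per 30-second burst
];

class RateLimiter {
  constructor(limits = LIMITS) {
    this.limits = limits;
    this.calls = []; // timestamps (ms) of accepted calls, oldest first
  }

  // Returns { allowed, remaining, retryAfterMs }; records the call only when allowed.
  tryCall(now = Date.now()) {
    // Forget calls older than the longest window: this is the automatic reset.
    const longest = Math.max(...this.limits.map((l) => l.ms));
    this.calls = this.calls.filter((t) => now - t < longest);

    let retryAfterMs = 0;
    let remaining = Infinity;
    for (const { max, ms } of this.limits) {
      const recent = this.calls.filter((t) => now - t < ms);
      if (recent.length >= max) {
        // The oldest call in this window must expire before another is accepted.
        retryAfterMs = Math.max(retryAfterMs, recent[0] + ms - now);
      }
      remaining = Math.min(remaining, max - recent.length);
    }
    if (retryAfterMs > 0) return { allowed: false, remaining: 0, retryAfterMs };

    this.calls.push(now);
    return { allowed: true, remaining: remaining - 1, retryAfterMs: 0 };
  }
}

// Simulated clock: nine calls one second apart. The ninth hits the burst limit.
const limiter = new RateLimiter();
const outcomes = [];
for (let i = 0; i < 9; i++) outcomes.push(limiter.tryCall(i * 1000).allowed);
console.log(outcomes);
```

Because old timestamps simply age out of each window, nothing needs to be restarted for the limits to recover, and the `remaining` and `retryAfterMs` values give the caller the feedback it needs to report how long to wait.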
Why This Works Better:
- Research-friendly - Supports extended research sessions
- Anti-spam protection - Prevents runaway LLM tool calling
- No restarts needed - Limits reset automatically over time
- Clear feedback - Shows remaining calls and reset times
Example Usage Patterns:
- Quick research: 5-8 rapid calls, then brief pause
- Extended research: 12 calls spread over 5 minutes
- Continuous work: Limits reset as you work, no interruption
Example Usage
Search
- Find Python asyncio docs site:python.org
- Search for recent climate data from government sources
Fetch
- Extract content from https://news.example.com/article
- Get main text from https://arxiv.org/abs/2305.12345
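Testing
```bash
# Query SearxNG directly to confirm JSON results are returned
curl "http://localhost:8080/search?format=json&q=test&count=5"
# Run the MCP server with debug logging enabled
DEBUG=true node server.mjs
```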
Contributing
We welcome:
- Bug reports: Open an issue
- Code PRs
- Documentation improvements
License
MIT. See LICENSE.
Acknowledgments
- SearxNG: Privacy-focused metasearch engine.
- Mozilla Readability: Clean content extraction.
- LM Studio: Local AI runtime.
- Model Context Protocol: AI tool integration standard.
Built for LM Studio and local LLM users who need real-time, reliable, tunable access to the web.
Star this repo if you're done with "I can't access the web" from your AI.
