Web Search MCP
Provides LLMs with real-time web search and content extraction capabilities, including text/news search, full-text URL reading, and targeted technical documentation search.
A comprehensive, production-ready research server for the Model Context Protocol (MCP). Provide your LLM clients with real-time access to the web and more.
✨ Features
- 🌐 Deep Web Search: Text and news search via DuckDuckGo.
- 📄 Content Extraction: Read clutter-free full text from any URL using trafilatura. Supports multiple output formats (text, markdown, JSON), metadata extraction, and content filtering.
- 🛡️ Bot Detection Bypass: Automatic fallback to Chrome TLS impersonation when sites block requests (Cloudflare, etc.).
- 💻 Technical Docs: Targeted search for developer documentation (Python, React, etc.).
🚀 Quick Start
Installation
Install directly using uv:
uv tool install git+https://github.com/sydasif/web-search-mcp.git
Configuration
Add the server to your MCP client configuration (e.g., claude_desktop_config.json). Optionally, set rate limits via environment variables to reduce the chance of being blocked by DuckDuckGo.
    {
      "mcpServers": {
        "web-search": {
          "command": "web-search-mcp",
          "env": {
            "SEARCH_MCP_RATE_LIMIT_SEARCH": "30",
            "SEARCH_MCP_RATE_LIMIT_FETCH": "20"
          }
        }
      }
    }
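If the client configuration already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch, not part of this project: the helper name `add_web_search_server` and its parameters are hypothetical, and it only builds the dictionary, leaving the choice of file path and the write step to you.

```python
import json


def add_web_search_server(config: dict, search_limit: int = 30, fetch_limit: int = 20) -> dict:
    """Return a copy of an MCP client config with the web-search entry added.

    Hypothetical helper for illustration; other servers in the config are kept.
    """
    # Round-trip through JSON as a simple deep copy of JSON-compatible data.
    merged = json.loads(json.dumps(config))
    servers = merged.setdefault("mcpServers", {})
    servers["web-search"] = {
        "command": "web-search-mcp",
        "env": {
            # Values are kept as strings, matching the configuration example.
            "SEARCH_MCP_RATE_LIMIT_SEARCH": str(search_limit),
            "SEARCH_MCP_RATE_LIMIT_FETCH": str(fetch_limit),
        },
    }
    return merged


existing = {"mcpServers": {"other-server": {"command": "other"}}}
print(json.dumps(add_web_search_server(existing), indent=2))
```

The input dictionary is left unmodified, so the result can be inspected before anything is written to disk.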
Available Environment Variables:
- SEARCH_MCP_RATE_LIMIT_SEARCH: Max search requests per minute (default: 30).
- SEARCH_MCP_RATE_LIMIT_FETCH: Max page fetch requests per minute (default: 20).
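To make the "requests per minute" semantics concrete, here is a rough sketch of how a per-minute limit could be read from the environment and enforced with a sliding window. This is an assumption about one possible approach, not the server's actual implementation; `read_limit` and `SlidingWindowLimiter` are hypothetical names.

```python
import os
import time
from collections import deque
from typing import Optional


def read_limit(name: str, default: int) -> int:
    """Read a positive integer from the environment, falling back to a default."""
    try:
        value = int(os.environ.get(name, ""))
    except ValueError:
        return default
    return value if value > 0 else default


class SlidingWindowLimiter:
    """Allow at most `max_per_window` calls within any `window`-second span."""

    def __init__(self, max_per_window: int, window: float = 60.0) -> None:
        self.max_per_window = max_per_window
        self.window = window
        self._stamps: deque = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_per_window:
            return False
        self._stamps.append(now)
        return True


# Defaults mirror the documented values (30 searches, 20 fetches per minute).
search_limiter = SlidingWindowLimiter(read_limit("SEARCH_MCP_RATE_LIMIT_SEARCH", 30))
fetch_limiter = SlidingWindowLimiter(read_limit("SEARCH_MCP_RATE_LIMIT_FETCH", 20))
print(search_limiter.allow())  # True: the first request is always within the limit
```

Lower values trade throughput for a smaller chance of being throttled by the search provider.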
Fetch Backend Options
The fetch_page tool supports three backend modes to handle sites with bot detection:
| Backend | Description | Use Case |
|---|---|---|
| auto (default) | Tries httpx first, falls back to curl on 403 or Cloudflare challenge | Recommended for most use cases |
| httpx | Lightweight async HTTP client | Fast, but may be blocked by some sites |
| curl | Uses curl_cffi with Chrome 131 TLS impersonation | Bypasses Cloudflare and similar bot filters |
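The auto mode can be pictured as "try the lightweight client, retry with the impersonating client if the response looks blocked". The sketch below illustrates that control flow only. It is not the project's code: `Response`, `looks_blocked`, `fetch_with_backend`, and the challenge markers are all assumptions, and the two HTTP clients are passed in as plain callables so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Response:
    status: int
    body: str


Fetcher = Callable[[str], Response]

# Assumed heuristics for spotting a Cloudflare challenge page in the body.
CHALLENGE_MARKERS = ("Just a moment...", "cf-chl")


def looks_blocked(resp: Response) -> bool:
    """Treat HTTP 403 or a challenge-page marker as a bot-detection block."""
    return resp.status == 403 or any(m in resp.body for m in CHALLENGE_MARKERS)


def fetch_with_backend(url: str, backend: str = "auto", *,
                       light: Fetcher, impersonating: Fetcher) -> Response:
    """Dispatch on the backend name; 'auto' falls back only when blocked."""
    if backend == "httpx":
        return light(url)
    if backend == "curl":
        return impersonating(url)
    if backend != "auto":
        raise ValueError(f"unknown backend: {backend!r}")
    resp = light(url)
    return impersonating(url) if looks_blocked(resp) else resp


# Stub clients standing in for an httpx-based and a curl_cffi-based fetcher.
def blocked_client(url: str) -> Response:
    return Response(403, "Forbidden")


def browser_like_client(url: str) -> Response:
    return Response(200, "<html>article text</html>")


result = fetch_with_backend("https://example.com",
                            light=blocked_client, impersonating=browser_like_client)
print(result.status)  # 200, served by the fallback client
```

Forcing backend to "httpx" or "curl" skips the detection step entirely, which is useful when you already know how a given site behaves.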
🛠️ Tool Reference
| Tool | Description | Key Parameters |
|---|---|---|
| web_search | Universal search (Web, News) | query, search_type ("text", "news"), max_results, time_range, region, page, response_format ("json", "markdown") |
| fetch_page | Extract clean article text from a URL | url, output_format ("csv", "html", "json", "markdown", "python", "txt", "xml", "xmltei"), include_metadata, include_tables, include_comments, include_images, max_length, timeout, backend ("httpx", "curl", "auto") |
| search_docs | Search specific tech documentation or domains | query, domain (e.g., "docs.python.org", "github.com") |
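MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below shows roughly what such requests could look like for the tools listed above. The helper `tool_call` is hypothetical, and the argument values (including the assumption that max_results is an integer) are illustrative rather than taken from the server's schema.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Illustrative argument values; consult the server's tool schema for the real types.
requests = [
    tool_call("web_search", query="python asyncio tutorial",
              search_type="text", max_results=5, response_format="markdown"),
    tool_call("fetch_page", url="https://example.com/article",
              output_format="markdown", backend="auto"),
    tool_call("search_docs", query="asyncio.gather", domain="docs.python.org"),
]

for request in requests:
    print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends these messages for you; the raw shape is mainly useful for debugging.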
💻 Development
<details>
<summary>Click to expand development instructions</summary>

- Clone the repository

      git clone https://github.com/sydasif/web-search-mcp.git
      cd web-search-mcp

- Sync dependencies

      uv sync

- Run tests

      # Run all tests
      uv run pytest
      # Run with coverage
      uv run pytest --cov=web_search_mcp

- Linting & Formatting

      uv run ruff check .

</details>
📄 License
This project is licensed under the MIT License.