# mcp-research

<!-- mcp-name: io.github.mabaam/mcp-research -->

A standalone MCP (Model Context Protocol) server providing web research tools. Three battle-tested tools for AI assistants: search the web, fetch & convert pages to markdown, and run compound multi-source research — all via the MCP stdio protocol.
## Tools
| Tool | Description |
|---|---|
| `web_search` | 3-tier search cascade: Brave API → DuckDuckGo → HTML scraper |
| `fetch_url` | Fetch any URL → clean markdown, with SSRF protection and 24h cache |
| `research` | Compound pipeline: query rewrite → search → parallel fetch → summarize → synthesize |
All tools are read-only — they fetch and transform public web content, never modify anything.
## Install

```bash
pip install mcp-research
```

Or run directly with uvx (zero-install):

```bash
uvx mcp-research
```
## Configuration
All configuration is via environment variables — no config files needed.
| Variable | Default | Description |
|---|---|---|
| `BRAVE_API_KEY` | (empty) | Brave Search API key. Falls back to DuckDuckGo if unset. |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama endpoint for summarization/synthesis. Set empty to disable. |
| `OLLAMA_MODEL` | `qwen2.5:14b` | Model to use for summarization and synthesis. |
| `MCP_RESEARCH_CACHE_DIR` | `~/.mcp-research/cache/` | URL fetch cache directory. |
| `MCP_RESEARCH_CACHE_TTL` | `24` | Cache TTL in hours. |
| `MCP_RESEARCH_LOG_DIR` | `~/.mcp-research/logs/` | Search log directory (NDJSON). |
| `MCP_RESEARCH_MAX_RESULTS` | `10` | Default max search results. |
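For example, the defaults can be overridden per-invocation by exporting variables before launching the server (the key and model name below are illustrative placeholders, not requirements):

```shell
# Illustrative values; adjust to your own environment.
export BRAVE_API_KEY="BSA..."       # enable the Brave Search tier
export OLLAMA_MODEL="llama3.1:8b"   # use a smaller local model for summaries
export MCP_RESEARCH_CACHE_TTL=6     # expire cached fetches after 6 hours
uvx mcp-research
```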
## Usage with Claude Code

Add to your Claude Code MCP config (`~/.claude/settings.json` or project `.mcp.json`):
```json
{
  "mcpServers": {
    "research": {
      "command": "uvx",
      "args": ["mcp-research"],
      "env": {
        "BRAVE_API_KEY": "BSA...",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
```
## Usage with Claude Desktop

Add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "research": {
      "command": "uvx",
      "args": ["mcp-research"],
      "env": {
        "BRAVE_API_KEY": "BSA..."
      }
    }
  }
}
```
## Tool Details
### web_search

```python
web_search(query, max_results=5, summarize=False, auto_fetch_top=False)
```
Searches the web using a 3-tier cascade for maximum reliability:

1. **Brave Search API** — fast, high quality (requires `BRAVE_API_KEY`)
2. **DuckDuckGo library** — no API key needed, retries on rate limit
3. **DuckDuckGo HTML scraper** — last-resort fallback

Options:

- `summarize`: Use Ollama to summarize results (requires running Ollama)
- `auto_fetch_top`: Also fetch and return the full content of the top result
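The tiered fallback above can be sketched as a simple loop over providers, trying the next tier whenever one errors out or returns nothing. The provider functions here are hypothetical stand-ins, not the actual mcp-research internals:

```python
# Tiered search cascade sketch; each tier is a stand-in function.

def brave_search(query):
    raise RuntimeError("no BRAVE_API_KEY set")  # simulate a missing key

def ddg_library_search(query):
    return [{"title": "Example result", "url": "https://example.com"}]

def ddg_html_scrape(query):
    return []

def web_search(query):
    """Try each tier in order; fall through on error or empty results."""
    for tier in (brave_search, ddg_library_search, ddg_html_scrape):
        try:
            results = tier(query)
            if results:          # empty results also fall through
                return results
        except Exception:
            continue             # tier failed, try the next one
    return []
```

The key design property is that a tier failure is indistinguishable from a tier returning nothing: either way, the next tier gets a chance.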
### fetch_url

```python
fetch_url(url, summarize=False, max_chars=50000)
```
Fetches a URL and converts it to clean markdown:
- SSRF protection: Blocks localhost, private IPs, non-HTTP schemes
- Smart retry: Exponential backoff on 429/5xx, per-hop redirect validation
- 24h cache: SHA-256 keyed, configurable TTL
- Content support: HTML → markdown, JSON → code block, binary → rejected
- Smart truncation: Breaks at heading/paragraph boundaries, not mid-text
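A minimal sketch of the SSRF guard described above, assuming the documented policy (block localhost, private IPs, and non-HTTP schemes) and using only the Python standard library; the actual mcp-research implementation may differ:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_url_allowed(url: str) -> bool:
    """Reject non-HTTP schemes and URLs resolving to private/loopback IPs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):  # reject file://, ftp://, ...
        return False
    host = parsed.hostname
    if not host:
        return False
    try:
        # Resolve the hostname and check every address it maps to.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True
```

Note that checking every resolved address (rather than just the first) matters, because a hostname can deliberately mix public and private records; the per-hop redirect validation mentioned above re-applies the same check after each redirect.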
### research

```python
research(query, depth="standard", context="")
```
Compound research pipeline:
- Query rewrite — Ollama optimizes your question into search keywords
- Web search — finds relevant pages (with zero-result retry expansion)
- Parallel fetch — fetches top N pages concurrently
- Summarize — Ollama summarizes each page
- Synthesize — Ollama produces a final cited answer
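The pipeline's data flow can be sketched with stand-in steps; in the real server each LLM stage calls Ollama and the fetch stage runs concurrently, but the stage-to-stage wiring looks roughly like this:

```python
# Pipeline sketch; every function body is a stand-in, not real behavior.

def rewrite_query(query):
    return query + " overview"        # stand-in for the LLM query rewrite

def search(query, n):
    return [f"https://example.com/page{i}" for i in range(n)]

def fetch(url):
    return f"content of {url}"        # stand-in for the parallel fetch

def summarize(text):
    return text[:20]                  # stand-in for per-page summarization

def research(query, pages=5):
    urls = search(rewrite_query(query), pages)
    summaries = [summarize(fetch(u)) for u in urls]
    return "\n".join(summaries)       # stand-in for the final synthesis
```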
Depth levels:

| Depth | Pages | Synthesis |
|---|---|---|
| `quick` | 2 | No |
| `standard` | 5 | Yes |
| `deep` | 10 | Yes |
All steps gracefully degrade without Ollama — you still get search results and raw page content.
## Development

```bash
git clone https://github.com/MABAAM/Maibaamcrawler.git
cd Maibaamcrawler
pip install -e .
python -m mcp_research
```
## License
MIT