mcp-research

<!-- mcp-name: io.github.mabaam/mcp-research -->

A standalone MCP (Model Context Protocol) server providing web research tools. Three battle-tested tools for AI assistants: search the web, fetch & convert pages to markdown, and run compound multi-source research — all via the MCP stdio protocol.

Tools

| Tool | Description |
| --- | --- |
| web_search | 3-tier search cascade: Brave API → DuckDuckGo → HTML scraper |
| fetch_url | Fetch any URL → clean markdown, with SSRF protection and 24h cache |
| research | Compound pipeline: query rewrite → search → parallel fetch → summarize → synthesize |

All tools are read-only — they fetch and transform public web content, never modify anything.

Install

pip install mcp-research

Or run directly with uvx (zero-install):

uvx mcp-research

Configuration

All configuration is via environment variables — no config files needed.

| Variable | Default | Description |
| --- | --- | --- |
| BRAVE_API_KEY | (empty) | Brave Search API key. Falls back to DuckDuckGo if unset. |
| OLLAMA_URL | http://localhost:11434 | Ollama endpoint for summarization/synthesis. Set empty to disable. |
| OLLAMA_MODEL | qwen2.5:14b | Model used for summarization and synthesis. |
| MCP_RESEARCH_CACHE_DIR | ~/.mcp-research/cache/ | URL fetch cache directory. |
| MCP_RESEARCH_CACHE_TTL | 24 | Cache TTL in hours. |
| MCP_RESEARCH_LOG_DIR | ~/.mcp-research/logs/ | Search log directory (NDJSON). |
| MCP_RESEARCH_MAX_RESULTS | 10 | Default maximum number of search results. |
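A minimal sketch of how a server like this might resolve those variables at startup, using the defaults from the table above (the function name `load_config` and the dict keys are illustrative, not the package's actual API):

```python
import os

def load_config() -> dict:
    """Read settings from environment variables, with documented defaults."""
    return {
        "brave_api_key": os.environ.get("BRAVE_API_KEY", ""),
        "ollama_url": os.environ.get("OLLAMA_URL", "http://localhost:11434"),
        "ollama_model": os.environ.get("OLLAMA_MODEL", "qwen2.5:14b"),
        "cache_dir": os.environ.get("MCP_RESEARCH_CACHE_DIR",
                                    os.path.expanduser("~/.mcp-research/cache/")),
        "cache_ttl_hours": int(os.environ.get("MCP_RESEARCH_CACHE_TTL", "24")),
        "max_results": int(os.environ.get("MCP_RESEARCH_MAX_RESULTS", "10")),
    }
```

Because every setting has a default, the server runs with no configuration at all; you only set variables to enable Brave search or to tune caching.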

Usage with Claude Code

Add to your Claude Code MCP config (~/.claude/settings.json or project .mcp.json):

{
  "mcpServers": {
    "research": {
      "command": "uvx",
      "args": ["mcp-research"],
      "env": {
        "BRAVE_API_KEY": "BSA...",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}

Usage with Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "research": {
      "command": "uvx",
      "args": ["mcp-research"],
      "env": {
        "BRAVE_API_KEY": "BSA..."
      }
    }
  }
}

Tool Details

web_search

web_search(query, max_results=5, summarize=False, auto_fetch_top=False)

Searches the web using a 3-tier cascade for maximum reliability:

  1. Brave Search API — fast, high quality (requires BRAVE_API_KEY)
  2. DuckDuckGo library — no API key needed, retries on rate limit
  3. DuckDuckGo HTML scraper — last-resort fallback
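The cascade can be sketched as a simple first-success loop: try each backend in order and return the first non-empty result set. This is an illustrative sketch, not the package's actual code; the backend callables are stand-ins:

```python
def cascade_search(query, backends):
    """Try (name, callable) backends in priority order.

    Returns (backend_name, results) for the first backend that
    succeeds with a non-empty result list, else (None, []).
    """
    for name, backend in backends:
        try:
            results = backend(query)
        except Exception:
            continue  # backend raised (e.g. rate limit); fall through to next tier
        if results:
            return name, results
    return None, []
```

With this shape, a missing BRAVE_API_KEY simply makes the first tier fail fast, and the DuckDuckGo tiers pick up the query unchanged.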

Options:

  • summarize: Use Ollama to summarize results (requires running Ollama)
  • auto_fetch_top: Also fetch and return the full content of the top result

fetch_url

fetch_url(url, summarize=False, max_chars=50000)

Fetches a URL and converts it to clean markdown:

  • SSRF protection: Blocks localhost, private IPs, non-HTTP schemes
  • Smart retry: Exponential backoff on 429/5xx, per-hop redirect validation
  • 24h cache: SHA-256 keyed, configurable TTL
  • Content support: HTML → markdown, JSON → code block, binary → rejected
  • Smart truncation: Breaks at heading/paragraph boundaries, not mid-text
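The SSRF guard described above can be sketched with the standard library alone: reject non-HTTP(S) schemes, then resolve the hostname and reject private, loopback, and link-local addresses. This is a minimal sketch of the idea, not the package's actual implementation:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    """Return False for URLs that could reach internal services (SSRF)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # block file://, ftp://, gopher://, etc.
    host = parsed.hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # unresolvable hostname: refuse rather than guess
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False  # e.g. 10.0.0.0/8, 127.0.0.1, 169.254.0.0/16
    return True
```

Note the "per-hop redirect validation" feature implies this check must be re-run on every redirect target, not just the initial URL, since a public page can redirect to an internal address.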

research

research(query, depth="standard", context="")

Compound research pipeline:

  1. Query rewrite — Ollama optimizes your question into search keywords
  2. Web search — finds relevant pages (with zero-result retry expansion)
  3. Parallel fetch — fetches top N pages concurrently
  4. Summarize — Ollama summarizes each page
  5. Synthesize — Ollama produces a final cited answer

Depth levels:

| Depth | Pages | Synthesis |
| --- | --- | --- |
| quick | 2 | No |
| standard | 5 | Yes |
| deep | 10 | Yes |

All steps gracefully degrade without Ollama — you still get search results and raw page content.
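The pipeline and its degradation behavior can be sketched as follows. This is an illustrative outline, not the package's code: the `search`, `fetch`, and `llm` arguments are hypothetical stand-ins for the real components, and the real server fetches pages concurrently rather than in a loop:

```python
# Depth presets from the table above: (pages to fetch, whether to synthesize).
DEPTH = {"quick": (2, False), "standard": (5, True), "deep": (10, True)}

def research(query, search, fetch, llm=None, depth="standard"):
    pages, synthesize = DEPTH[depth]
    # 1. Query rewrite (skipped when no LLM is available).
    q = llm(f"Rewrite as search keywords: {query}") if llm else query
    # 2. Search, 3. fetch top N (concurrently in the real server).
    results = search(q)[:pages]
    contents = [fetch(r) for r in results]
    if llm is None:
        # Graceful degradation: raw search results and page content only.
        return {"results": results, "contents": contents}
    # 4. Per-page summaries, 5. optional final synthesis.
    summaries = [llm(f"Summarize: {c}") for c in contents]
    answer = llm("Synthesize: " + " ".join(summaries)) if synthesize else None
    return {"results": results, "summaries": summaries, "answer": answer}
```

The key design point is that only steps 1, 4, and 5 depend on Ollama; search and fetch always run, so the tool is useful even on a machine with no local model.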

Development

git clone https://github.com/MABAAM/Maibaamcrawler.git
cd Maibaamcrawler
pip install -e .
python -m mcp_research

License

MIT
