Research Powerpack MCP

Enables AI assistants to perform comprehensive research by searching Google, mining Reddit discussions, scraping web content with JS rendering, and synthesizing findings with citations into structured context.

MCP server that gives your AI assistant research tools. Google search, Reddit deep-dives, web scraping with LLM extraction, and multi-model deep research — all as MCP tools that chain into each other.

```bash
npx mcp-research-powerpack
```

five tools, zero config to start. each API key you add unlocks more capabilities.



tools

| tool | what it does | requires |
| --- | --- | --- |
| web_search | parallel Google search across 3-100 keywords, CTR-weighted ranking, consensus detection | SERPER_API_KEY |
| search_reddit | same engine, filtered to reddit.com, 10-50 queries in parallel | SERPER_API_KEY |
| get_reddit_post | fetches 2-50 Reddit posts with full comment trees, optional LLM extraction | REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET |
| scrape_links | scrapes 1-50 URLs with JS rendering fallback, HTML-to-markdown, optional LLM extraction | SCRAPEDO_API_KEY |
| deep_research | sends questions to research-capable models (Grok, Gemini) with web search enabled, supports local file attachments | OPENROUTER_API_KEY |

tools are designed to chain: web_search suggests calling scrape_links, which suggests search_reddit, which suggests get_reddit_post, which suggests deep_research for synthesis.

install

Claude Desktop / Claude Code

add to your MCP config:

```json
{
  "mcpServers": {
    "research-powerpack": {
      "command": "npx",
      "args": ["mcp-research-powerpack"],
      "env": {
        "SERPER_API_KEY": "...",
        "OPENROUTER_API_KEY": "..."
      }
    }
  }
}
```

from source

```bash
git clone https://github.com/yigitkonur/mcp-research-powerpack.git
cd mcp-research-powerpack
pnpm install && pnpm build
pnpm start
```

HTTP mode

```bash
MCP_TRANSPORT=http MCP_PORT=3000 npx mcp-research-powerpack
```

exposes /mcp (POST/GET/DELETE with session headers) and /health.

API keys

each key unlocks a capability. missing keys silently disable their tools — the server never crashes.

| variable | enables | free tier |
| --- | --- | --- |
| SERPER_API_KEY | web_search, search_reddit | 2,500 searches/mo at serper.dev |
| REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET | get_reddit_post | unlimited (reddit.com/prefs/apps, "script" type) |
| SCRAPEDO_API_KEY | scrape_links | 1,000 credits/mo at scrape.do |
| OPENROUTER_API_KEY | deep_research, LLM extraction in scrape/reddit | pay-per-token at openrouter.ai |

configuration

optional tuning via environment variables:

| variable | default | description |
| --- | --- | --- |
| RESEARCH_MODEL | x-ai/grok-4-fast | primary deep research model |
| RESEARCH_FALLBACK_MODEL | google/gemini-2.5-flash | fallback if primary fails |
| LLM_EXTRACTION_MODEL | openai/gpt-oss-120b:nitro | model for scrape/reddit LLM extraction |
| DEFAULT_REASONING_EFFORT | high | research depth (low, medium, high) |
| DEFAULT_MAX_URLS | 100 | max search results per research question (10-200) |
| API_TIMEOUT_MS | 1800000 | request timeout in ms (30 minutes) |
| MCP_TRANSPORT | stdio | stdio or http |
| MCP_PORT | 3000 | port for HTTP mode |

how it works

search ranking

results from multiple queries are deduplicated by normalized URL and scored using CTR-weighted position values (position 1 = 100.0, position 10 = 12.56). URLs that appear across multiple queries get a consensus marker; the threshold starts at >= 3 query appearances and relaxes to >= 2, then >= 1, if nothing qualifies.
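a minimal sketch of that ranking step. only the endpoints of the weight curve (position 1 = 100.0, position 10 = 12.56) come from this README — the intermediate weights, the `normalizeUrl` rule, and all names here are illustrative, not the server's actual implementation:

```typescript
type Result = { url: string; position: number; query: string };

// hypothetical CTR curve; only the first and last values are documented
const CTR_WEIGHTS = [100.0, 55.0, 35.0, 27.0, 22.0, 19.0, 17.0, 15.0, 13.5, 12.56];

function normalizeUrl(url: string): string {
  const u = new URL(url);
  return u.origin + u.pathname.replace(/\/$/, "");
}

function rank(results: Result[]) {
  const byUrl = new Map<string, { score: number; queries: Set<string> }>();
  for (const r of results) {
    const key = normalizeUrl(r.url); // dedupe by normalized URL
    const entry = byUrl.get(key) ?? { score: 0, queries: new Set<string>() };
    entry.score += CTR_WEIGHTS[Math.min(r.position, 10) - 1] ?? 1;
    entry.queries.add(r.query);
    byUrl.set(key, entry);
  }
  // consensus threshold relaxes from >= 3 query appearances down to >= 1
  let threshold = 3;
  while (threshold > 1 && ![...byUrl.values()].some(e => e.queries.size >= threshold)) {
    threshold--;
  }
  return [...byUrl.entries()]
    .map(([url, e]) => ({ url, score: e.score, consensus: e.queries.size >= threshold }))
    .sort((a, b) => b.score - a.score);
}
```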

Reddit comment budget

global budget of 1,000 comments, max 200 per post. after the first pass, surplus from posts with fewer comments is redistributed to truncated posts in a second fetch pass.
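one plausible reading of that two-pass split, as a sketch — the fair-share heuristic in pass 1 is an assumption; only the 1,000-comment global budget, the 200-per-post cap, and the surplus redistribution are from the README:

```typescript
const GLOBAL_BUDGET = 1000;
const PER_POST_CAP = 200;

function allocate(commentCounts: number[]): number[] {
  const n = commentCounts.length;
  // pass 1: equal share per post, capped at 200 and at the post's actual size
  const fairShare = Math.min(PER_POST_CAP, Math.floor(GLOBAL_BUDGET / n));
  const alloc = commentCounts.map(c => Math.min(c, fairShare));
  // pass 2: surplus from small posts goes to posts that were truncated
  let surplus = GLOBAL_BUDGET - alloc.reduce((a, b) => a + b, 0);
  for (let i = 0; i < n && surplus > 0; i++) {
    const want = Math.min(commentCounts[i], PER_POST_CAP) - alloc[i];
    const extra = Math.min(want, surplus);
    alloc[i] += extra;
    surplus -= extra;
  }
  return alloc;
}
```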

scraping pipeline

three-mode fallback per URL: basic → JS rendering → JS + US geo-targeting. results go through HTML-to-markdown conversion (turndown), then optional LLM extraction with a 100k char input cap and 8,000 token output per URL.
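the fallback ladder can be sketched as follows; `scrapeWith` is a stand-in for the actual scrape.do client call, and the mode names are illustrative:

```typescript
type Mode = "basic" | "js" | "js+geo";

// escalation order: basic → JS rendering → JS + US geo-targeting
const MODES: Mode[] = ["basic", "js", "js+geo"];

async function scrapeWithFallback(
  url: string,
  scrapeWith: (url: string, mode: Mode) => Promise<string>,
): Promise<{ mode: Mode; html: string }> {
  let lastError: unknown;
  for (const mode of MODES) {
    try {
      return { mode, html: await scrapeWith(url, mode) };
    } catch (err) {
      lastError = err; // this mode failed; escalate to the next one
    }
  }
  throw lastError;
}
```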

deep research

32,000 token budget divided evenly across questions (1 question = 32k, 10 questions = 3.2k each). Gemini models get google_search tool access. Grok/Perplexity get search_parameters with citations. primary model fails → automatic fallback.
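the per-question budget is just an even split of the 32k total; a sketch (function name is illustrative):

```typescript
const TOTAL_TOKEN_BUDGET = 32_000;

// each question gets an equal slice of the total output budget
function perQuestionBudget(questionCount: number): number {
  return Math.floor(TOTAL_TOKEN_BUDGET / Math.max(1, questionCount));
}
```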

file attachments

deep_research can read local files and include them as context. files over 600 lines are smart-truncated (first 500 + last 100 lines). line numbers preserved.
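the truncation rule can be sketched as below — the 600-line threshold, first-500/last-100 split, and preserved line numbers are from the README; the omission-marker format is an assumption:

```typescript
function smartTruncate(lines: string[], head = 500, tail = 100): string[] {
  // line numbers are attached before truncating so they survive intact
  const numbered = lines.map((l, i) => `${i + 1}: ${l}`);
  if (lines.length <= head + tail) return numbered;
  return [
    ...numbered.slice(0, head),
    `... (${lines.length - head - tail} lines omitted) ...`,
    ...numbered.slice(lines.length - tail),
  ];
}
```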

concurrency

| operation | parallel limit |
| --- | --- |
| web search keywords | 8 |
| Reddit search queries | 8 |
| Reddit post fetches | 5 per batch (batches of 10) |
| URL scraping | 10 per batch (batches of 30) |
| LLM extraction | 3 |
| deep research questions | 3 |
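a minimal bounded-parallel map in the spirit of the pMap utility under utils/ — a sketch, not the actual implementation:

```typescript
async function pMap<T, R>(
  items: T[],
  fn: (item: T) => Promise<R>,
  limit: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // each worker pulls the next index until the queue is drained
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```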

all clients use manual retry with exponential backoff and jitter. the OpenAI SDK's built-in retry is disabled (maxRetries: 0).
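the retry loop can be sketched like this, assuming full jitter — the base delay, cap, and attempt count here are illustrative, not the server's actual values:

```typescript
// full jitter: a uniform draw in [0, min(cap, base * 2^attempt))
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * exp;
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err; // budget exhausted
      await new Promise(r => setTimeout(r, backoffDelay(attempt, baseMs)));
    }
  }
}
```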

project structure

```
src/
  index.ts                — entry point, STDIO + HTTP transport, signal handling
  worker.ts               — Cloudflare Workers entry (Durable Objects)
  config/
    index.ts              — env parsing (lazy Proxy objects), capability detection
    loader.ts             — YAML → Zod → JSON Schema pipeline, cached
    yaml/tools.yaml       — single source of truth for all tool definitions
  schemas/
    deep-research.ts      — Zod validation for research questions + file attachments
    scrape-links.ts       — Zod validation for URLs, timeout, LLM options
    web-search.ts         — Zod validation for keyword arrays
  tools/
    registry.ts           — tool lookup → capability check → validate → execute
    search.ts             — web_search handler
    reddit.ts             — search_reddit + get_reddit_post handlers
    scrape.ts             — scrape_links handler
    research.ts           — deep_research handler
  clients/
    search.ts             — Serper API client
    reddit.ts             — Reddit OAuth + comment fetching
    scraper.ts            — scrape.do client with fallback modes
    research.ts           — OpenRouter client with model-specific handling
  services/
    llm-processor.ts      — shared LLM extraction (singleton OpenAI client)
    markdown-cleaner.ts   — HTML → markdown via turndown
    file-attachment.ts    — local file reading with line ranges
  utils/
    concurrency.ts        — bounded parallel execution (pMap, pMapSettled)
    url-aggregator.ts     — CTR-weighted scoring and consensus detection
    errors.ts             — error classification, fetchWithTimeout
    logger.ts             — MCP logging protocol
    response.ts           — standardized output formatting
```

deploy

Cloudflare Workers

```bash
npx wrangler deploy
```

uses Durable Objects with SQLite storage. YAML-based tool definitions are replaced with inline definitions in the worker entry since there's no filesystem.

license

MIT
