MCP WebScout

A Model Context Protocol server that provides web search via DuckDuckGo and content extraction using Crawl4AI with optional LLM-powered analysis. It exposes two tools: search, which queries DuckDuckGo, and fetch, which retrieves a page through Crawl4AI's browser automation and can optionally pass it to an LLM for extraction.

A Model Context Protocol (MCP) server providing web search (DuckDuckGo) and intelligent content extraction with LLM-powered analysis.

Features

  • search: Search the web using DuckDuckGo
  • fetch: Fetch a web page with Crawl4AI, optionally extracting content with an LLM

System Requirements

| Requirement      | Version | Notes                                       |
|------------------|---------|---------------------------------------------|
| Python           | >= 3.10 | Required runtime environment                |
| pip              | latest  | Package manager (included with Python)      |
| Playwright       | latest  | Required by Crawl4AI for browser automation |
| DeepSeek API Key | -       | Required for LLM extraction mode            |
| Proxy (optional) | -       | Required for users in mainland China        |

Python Dependencies (14 packages, 6 listed below)

| Package           | Version  | Purpose                        |
|-------------------|----------|--------------------------------|
| mcp               | >=1.0.0  | MCP protocol implementation    |
| duckduckgo-search | >=3.0.0  | DuckDuckGo search API          |
| requests          | >=2.32.0 | HTTP requests                  |
| beautifulsoup4    | >=4.12.0 | HTML parsing                   |
| openai            | >=1.30.0 | OpenAI API client for DeepSeek |
| crawl4ai          | >=0.5.0  | Advanced web scraping          |
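
The requirements above can be checked before installation with a small preflight script. This is a minimal sketch and not part of the project: the function names `python_ok` and `missing_packages` are hypothetical, and it only tests whether each module can be found, not whether the installed version meets the minimum listed in the table.

```python
import importlib.util
import sys

# PyPI package name -> importable module name for the packages listed above.
REQUIRED_MODULES = {
    "mcp": "mcp",
    "duckduckgo-search": "duckduckgo_search",
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "openai": "openai",
    "crawl4ai": "crawl4ai",
}


def python_ok(version_info=sys.version_info) -> bool:
    """Return True when the interpreter satisfies the Python >= 3.10 requirement."""
    return tuple(version_info[:2]) >= (3, 10)


def missing_packages(modules=REQUIRED_MODULES) -> list:
    """Return the PyPI names whose modules cannot be found in this environment."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Missing packages:", missing_packages() or "none")
```

Run it inside the virtual environment after installing dependencies; an empty "missing" list means every listed module is importable.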

Quick Start

Get started in 5 steps:

1. Clone and Setup Environment

git clone <repository>
cd mcp-webscout
python -m venv .venv

On Windows:

.venv\Scripts\activate

On macOS/Linux:

source .venv/bin/activate

2. Install Dependencies

pip install -e ".[dev]"

3. Install Playwright Browsers

playwright install chromium

4. Configure Environment Variables

cp .env.example .env

Edit .env and add your configuration:

# Required for LLM extraction
DEEPSEEK_API_KEY=sk-your-actual-key-here

# Required for mainland China users
PROXY_URL=http://127.0.0.1:7890
USE_PROXY=true
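
To illustrate how these three variables fit together, here is a minimal sketch of reading and validating them in Python. It is not the server's actual loading code: the `Settings` class, `load_settings` function, and the accepted spellings of "true" are assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    deepseek_api_key: Optional[str]  # needed only for LLM extraction
    proxy_url: Optional[str]
    use_proxy: bool


def _as_bool(value: Optional[str]) -> bool:
    """Treat common truthy spellings as True; anything else (or unset) is False."""
    return (value or "").strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read DEEPSEEK_API_KEY, PROXY_URL and USE_PROXY from a mapping."""
    use_proxy = _as_bool(env.get("USE_PROXY"))
    proxy_url = env.get("PROXY_URL") or None
    # Assumption of this sketch: enabling the proxy without a URL is an error.
    if use_proxy and not proxy_url:
        raise ValueError("USE_PROXY is true but PROXY_URL is not set")
    return Settings(
        deepseek_api_key=env.get("DEEPSEEK_API_KEY") or None,
        proxy_url=proxy_url,
        use_proxy=use_proxy,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.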

5. Verify Installation

# Run tests
pytest tests/ -v

# Test the server
python -m mcp_webscout --help

Detailed Configuration

For detailed environment setup instructions, see ENV_SETUP.md.

Usage

As a Command

mcp-webscout

As a Python Module

python -m mcp_webscout

With Claude Desktop

Add to your claude_desktop_config.json:

Basic Configuration

{
  "mcpServers": {
    "webscout": {
      "command": "mcp-webscout"
    }
  }
}

With Environment Variables (Recommended)

{
  "mcpServers": {
    "webscout": {
      "command": "mcp-webscout",
      "env": {
        "DEEPSEEK_API_KEY": "sk-your-key-here",
        "PROXY_URL": "http://127.0.0.1:7890",
        "USE_PROXY": "true",
        "DEFAULT_MAX_LENGTH": "5000",
        "PYTHONUTF8": "1"
      }
    }
  }
}

Windows Configuration

{
  "mcpServers": {
    "webscout": {
      "command": "python",
      "args": ["-m", "mcp_webscout"],
      "env": {
        "DEEPSEEK_API_KEY": "sk-your-key-here",
        "PROXY_URL": "http://127.0.0.1:7890",
        "USE_PROXY": "true",
        "PYTHONUTF8": "1"
      }
    }
  }
}
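
If claude_desktop_config.json already lists other servers, the webscout entry has to be merged in rather than pasted over the file. The sketch below shows one way to build that entry in Python; `add_webscout` is a hypothetical helper written for this example, and reading or writing the actual file (whose location varies by platform) is left to the caller.

```python
import json


def add_webscout(config: dict, env: dict, windows: bool = False) -> dict:
    """Return a copy of config with a "webscout" entry under "mcpServers"."""
    # The Windows variant launches the module through the Python interpreter.
    entry = (
        {"command": "python", "args": ["-m", "mcp_webscout"]}
        if windows
        else {"command": "mcp-webscout"}
    )
    if env:
        entry["env"] = dict(env)
    servers = dict(config.get("mcpServers", {}))  # keep existing servers
    servers["webscout"] = entry
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    merged = add_webscout(
        existing,
        {"DEEPSEEK_API_KEY": "sk-your-key-here", "PYTHONUTF8": "1"},
    )
    print(json.dumps(merged, indent=2))
```

The function returns a new dictionary and leaves its input untouched, so the original configuration can be kept as a backup before writing the result.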

Tools

search

Search the web using DuckDuckGo.

Parameters:

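| Name        | Type    | Required | Description                         |
|-------------|---------|----------|-------------------------------------|
| query       | string  | Yes      | Search query                        |
| max_results | integer | No       | Maximum results (1-10, default: 5)  |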

Returns:

Formatted search results with titles, URLs, and snippets.

Example:

{
  "query": "Python programming",
  "max_results": 3
}
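
On the wire, an MCP client sends these arguments inside a JSON-RPC `tools/call` request. The sketch below builds such a request for the search tool; `search_request` is a hypothetical helper, and rejecting an out-of-range `max_results` on the client side is a choice made here, since the source does not say how the server handles such values.

```python
import json


def search_request(query: str, max_results: int = 5, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for the search tool."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # The documented range for max_results is 1-10.
    if not 1 <= max_results <= 10:
        raise ValueError("max_results must be between 1 and 10")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search",
            "arguments": {"query": query, "max_results": max_results},
        },
    }


if __name__ == "__main__":
    print(json.dumps(search_request("Python programming", max_results=3), indent=2))
```

In practice an MCP client library constructs this envelope for you; the helper only makes the shape of the call explicit.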

fetch

Fetch a web page with Crawl4AI and, in llm mode, extract content from it with an LLM.

Parameters:

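| Name       | Type    | Required | Description                                    |
|------------|---------|----------|------------------------------------------------|
| url        | string  | Yes      | URL to fetch                                   |
| mode       | string  | No       | Extraction mode: simple, llm (default: simple) |
| prompt     | string  | No       | Custom extraction prompt for LLM mode          |
| max_length | integer | No       | Maximum characters (default: 5000)             |
| use_proxy  | boolean | No       | Use proxy (default: true)                      |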

Returns:

Fetched and optionally extracted content.
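
The parameter table for fetch can be expressed as a small argument builder that applies the documented defaults. This is a sketch under stated assumptions, not the server's validation logic: `fetch_arguments` is a hypothetical helper, and restricting URLs to http/https and rejecting a prompt outside llm mode are choices made here (the server may simply ignore an unused prompt).

```python
from typing import Optional
from urllib.parse import urlparse

VALID_MODES = {"simple", "llm"}


def fetch_arguments(
    url: str,
    mode: str = "simple",
    prompt: Optional[str] = None,
    max_length: int = 5000,
    use_proxy: bool = True,
) -> dict:
    """Build the arguments object for the fetch tool using the documented defaults."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are meaningful to fetch.
    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
        raise ValueError("url must be an absolute http or https URL")
    if mode not in VALID_MODES:
        raise ValueError("mode must be 'simple' or 'llm'")
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if prompt is not None and mode != "llm":
        raise ValueError("prompt is only used in llm mode")

    arguments = {
        "url": url,
        "mode": mode,
        "max_length": max_length,
        "use_proxy": use_proxy,
    }
    if prompt is not None:
        arguments["prompt"] = prompt
    return arguments


if __name__ == "__main__":
    print(fetch_arguments("https://example.com", mode="llm", prompt="List the headings"))
```

Note that llm mode also depends on a DeepSeek API key being configured, as described under System Requirements.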
