websearch-mcp
An MCP server that provides web search, page fetching, and image description tools for AI agents. Uses SearXNG for search, Crawl4AI for content extraction, an OpenAI-compatible LLM for server-side synthesis, and a vision language model (VLM) for image analysis.

Prerequisites

- Python 3.12+
- SearXNG instance with JSON format enabled (`search.formats: [json]` in settings.yml)
- OpenAI-compatible LLM endpoint (OpenAI, Ollama, vLLM, LiteLLM, etc.)
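
SearXNG only returns JSON when that format is explicitly enabled. A minimal fragment for settings.yml (assuming a default SearXNG layout) might look like:

```yaml
# settings.yml - enable JSON responses alongside the default HTML
search:
  formats:
    - html
    - json
```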

Installation

```shell
# Run directly from GitHub
uvx --from "git+https://github.com/<org>/websearch-mcp" websearch-mcp

# Or clone and install locally
git clone https://github.com/<org>/websearch-mcp
cd websearch-mcp
uv sync
uv run websearch-mcp
```

Tools

web_search

Search the web via SearXNG, fetch the top result pages, and synthesize an answer with the configured LLM.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Search query |
| max_results | int | No | Max results (default: 10) |
| allowed_domains | string[] | No | Only include these domains |
| blocked_domains | string[] | No | Exclude these domains |
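
To illustrate the intended semantics of allowed_domains and blocked_domains, here is a small sketch of how such filtering could work; the function and data are illustrative, not the server's actual code:

```python
# Illustrative sketch: keep results whose hostname matches the allow list
# (if given) and does not match the block list. Subdomains of a listed
# domain are treated as matches.
from urllib.parse import urlparse

def filter_results(results, allowed_domains=None, blocked_domains=None):
    kept = []
    for r in results:
        host = urlparse(r["url"]).hostname or ""
        matches = lambda d: host == d or host.endswith("." + d)
        if allowed_domains and not any(matches(d) for d in allowed_domains):
            continue
        if blocked_domains and any(matches(d) for d in blocked_domains):
            continue
        kept.append(r)
    return kept

results = [
    {"url": "https://docs.python.org/3/"},
    {"url": "https://example.com/page"},
]
print(filter_results(results, allowed_domains=["python.org"]))
# -> [{'url': 'https://docs.python.org/3/'}]
```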

webfetch

Fetch a single URL, extract its content, and process it with the configured LLM.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | URL to fetch |
| prompt | string | No | Custom instruction for LLM processing |
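
An example argument payload for the tool call (the URL and prompt values are illustrative):

```json
{
  "url": "https://example.com/article",
  "prompt": "Summarize the key points as three bullets"
}
```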

image-description

Describe an image using a vision language model (VLM). Accepts either base64-encoded image data or an absolute filesystem path to an image file.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| image | string | Yes | Base64-encoded image data or absolute filesystem path |

Returns a JSON object with description, success status, and optional error message.
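
Preparing the image argument on the client side is plain standard-library base64; this sketch encodes raw bytes into the string form the tool accepts (the alternative is simply passing an absolute path and letting the server read the file):

```python
# Encode raw image bytes into the base64 string form of the `image` argument.
import base64

def to_base64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")

# The 8-byte PNG signature, as a tiny stand-in for real image bytes.
png_magic = b"\x89PNG\r\n\x1a\n"
print(to_base64(png_magic))  # -> iVBORw0KGgo=
```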

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| SEARXNG_URL | Yes | - | Base URL of SearXNG instance |
| LLM_BASE_URL | Yes | - | OpenAI-compatible endpoint base URL |
| LLM_API_KEY | Yes | - | API key for the LLM endpoint |
| LLM_MODEL | Yes | - | Model name for chat completions |
| CACHE_TTL_SECONDS | No | 900 | Cache TTL in seconds (0 to disable) |
| CACHE_MAX_ENTRIES | No | 1000 | Max cache entries before LRU eviction |
| FETCH_TIMEOUT | No | 30 | Per-page fetch timeout in seconds |
| LLM_TIMEOUT | No | 60 | LLM request timeout in seconds |
| MAX_CONTENT_SIZE | No | 5242880 | Max content size in bytes (5 MB) |
| DEFAULT_MAX_RESULTS | No | 10 | Default result count for web_search |
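
For a quick shell-based run, the four required variables can be exported before launching the server (values shown are illustrative, matching the Ollama example below):

```shell
export SEARXNG_URL="http://localhost:8888"
export LLM_BASE_URL="http://localhost:11434/v1"
export LLM_API_KEY="ollama"
export LLM_MODEL="llama3"
uvx --from "git+https://github.com/<org>/websearch-mcp" websearch-mcp
```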

VLM Configuration (for image-description tool)

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| VLM_BASE_URL | No | LLM_BASE_URL | OpenAI-compatible endpoint for the VLM |
| VLM_API_KEY | No | LLM_API_KEY | API key for the VLM endpoint |
| VLM_MODEL | No | LLM_MODEL | Model name for image description |
| MAX_IMAGE_SIZE | No | 10485760 | Max image size in bytes (10 MB) |

Agent Configuration

Claude Desktop (stdio)

```json
{
  "mcpServers": {
    "websearch": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/<org>/websearch-mcp", "websearch-mcp"],
      "env": {
        "SEARXNG_URL": "http://localhost:8888",
        "LLM_BASE_URL": "http://localhost:11434/v1",
        "LLM_API_KEY": "ollama",
        "LLM_MODEL": "llama3"
      }
    }
  }
}
```

Generic MCP Config (stdio)

```json
{
  "command": "uvx",
  "args": ["--from", "git+https://github.com/<org>/websearch-mcp", "websearch-mcp"],
  "env": {
    "SEARXNG_URL": "http://localhost:8888",
    "LLM_BASE_URL": "https://api.openai.com/v1",
    "LLM_API_KEY": "sk-...",
    "LLM_MODEL": "gpt-4o-mini"
  }
}
```

HTTP Transport

Start the server with the HTTP transport:

```shell
websearch-mcp --transport http --port 3000
```

Then point the MCP client at the HTTP endpoint:

```json
{
  "url": "http://localhost:3000/mcp"
}
```

Development

```shell
uv sync
uv run pytest tests/ -v
```

Example Usage

image-description tool

With base64-encoded image:

```python
# Using base64-encoded image data
image_b64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
result = await image_description(image_b64)
# Returns: {"description": "A small white square", "success": true, "error": null}
```

With filesystem path:

```python
# Using an absolute filesystem path
result = await image_description("/path/to/image.png")
# Returns: {"description": "A detailed description of the image", "success": true, "error": null}
```

With Ollama (using llava or another VLM):

```json
{
  "mcpServers": {
    "websearch": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/<org>/websearch-mcp", "websearch-mcp"],
      "env": {
        "SEARXNG_URL": "http://localhost:8888",
        "LLM_BASE_URL": "http://localhost:11434/v1",
        "LLM_API_KEY": "ollama",
        "LLM_MODEL": "llama3",
        "VLM_BASE_URL": "http://localhost:11434/v1",
        "VLM_API_KEY": "ollama",
        "VLM_MODEL": "llava"
      }
    }
  }
}
```
