WebSearch

Built as a Model Context Protocol (MCP) server that provides advanced web search, content extraction, web crawling, and scraping capabilities using the Firecrawl API.

Tools

search

Performs web searches and retrieves up-to-date information from the internet.

Args:
  • prompt: Specific query or topic to search for on the internet
  • limit: Maximum number of results to return (between 1 and 20)

Returns:
  • Search results with relevant information about the requested topic
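The documented bounds on the search arguments can be sketched as a small validation helper. This is illustrative only (the real tool lives in main.py, and `normalize_search_args` is a hypothetical name, not part of the project):

```python
# Hypothetical helper mirroring the documented `search` arguments.
def normalize_search_args(prompt: str, limit: int = 5) -> dict:
    """Clamp `limit` to the documented 1-20 range and reject empty prompts."""
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty query")
    limit = max(1, min(20, limit))  # documented bounds: between 1 and 20
    return {"prompt": prompt, "limit": limit}
```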

crawl

Crawls a website starting from the specified URL and extracts content from multiple pages.

Args:
  • url: The complete URL of the web page to start crawling from
  • maxDepth: The maximum depth level for crawling linked pages
  • limit: The maximum number of pages to crawl

Returns:
  • Content extracted from the crawled pages in markdown and HTML format
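A minimal sketch of how the crawl parameters might be validated and packaged before being forwarded to the Firecrawl API. The function name and dict layout are assumptions for illustration, not the server's actual code:

```python
from urllib.parse import urlparse

# Illustrative sketch of the documented `crawl` arguments.
def build_crawl_request(url: str, maxDepth: int = 2, limit: int = 10) -> dict:
    """Validate the starting URL and package the crawl parameters."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected a complete http(s) URL, got: {url!r}")
    return {"url": url, "maxDepth": maxDepth, "limit": limit}
```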

extract

Extracts specific information from a web page based on a prompt.

Args:
  • url: The complete URL of the web page to extract information from
  • prompt: Instructions specifying what information to extract from the page
  • enableWebSearch: Whether to allow web searches to supplement the extraction
  • showSources: Whether to include source references in the response

Returns:
  • Extracted information from the web page based on the prompt
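The extract arguments, including the two boolean flags, can be sketched the same way. Again the helper name is hypothetical; the real tool is defined in main.py:

```python
# Illustrative sketch of the documented `extract` arguments.
def build_extract_request(url: str, prompt: str,
                          enableWebSearch: bool = False,
                          showSources: bool = False) -> dict:
    """Package the extract parameters, keeping the boolean flags explicit."""
    if not url or not prompt:
        raise ValueError("both url and prompt are required")
    return {
        "url": url,
        "prompt": prompt,
        "enableWebSearch": bool(enableWebSearch),
        "showSources": bool(showSources),
    }
```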

scrape

Scrapes content from a single web page.

Args:
  • url: The complete URL of the target web page

Returns:
  • Scraped content, with optional screenshots

README

WebSearch - Advanced Web Search and Content Extraction Tool

A powerful web search and content extraction tool built with Python, leveraging the Firecrawl API for advanced web scraping, searching, and content analysis capabilities.

🚀 Features

  • Advanced Web Search: Perform intelligent web searches with customizable parameters
  • Content Extraction: Extract specific information from web pages using natural language prompts
  • Web Crawling: Crawl websites with configurable depth and limits
  • Web Scraping: Scrape web pages with support for various output formats
  • MCP Integration: Built as a Model Context Protocol (MCP) server for seamless integration

📋 Prerequisites

  • Python 3.8 or higher
  • uv package manager
  • Firecrawl API key
  • OpenAI API key (optional, for enhanced features)
  • Tavily API key (optional, for additional search capabilities)

🛠️ Installation

  1. Install uv:
# On Windows (using pip)
pip install uv

# On Unix/MacOS
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add uv to PATH (Unix/MacOS)
export PATH="$HOME/.local/bin:$PATH"

# Add uv to PATH (Windows - add to Environment Variables)
# Add: %USERPROFILE%\.local\bin
  2. Clone the repository:
git clone https://github.com/yourusername/websearch.git
cd websearch
  3. Create and activate a virtual environment with uv:
# Create virtual environment
uv venv

# Activate on Windows
.\.venv\Scripts\activate.ps1

# Activate on Unix/MacOS
source .venv/bin/activate
  4. Install dependencies with uv:
# Install project dependencies (reads pyproject.toml / uv.lock)
uv sync
  5. Set up environment variables:
# Create .env file
touch .env

# Add your API keys
FIRECRAWL_API_KEY=your_firecrawl_api_key
OPENAI_API_KEY=your_openai_api_key
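The `.env` format above is just `KEY=VALUE` lines; in practice a library such as python-dotenv loads it, but the parsing it performs can be sketched with the standard library alone (a simplified illustration, not the project's actual loader):

```python
# Simplified sketch of how KEY=VALUE lines in a .env file are parsed.
def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```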

🎯 Usage

Setting Up With Claude for Desktop

Instead of running the server directly, you can configure Claude for Desktop to access the WebSearch tools:

  1. Locate or create your Claude for Desktop configuration file:

    • Windows: %APPDATA%\Claude\claude_desktop_config.json
    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  2. Add the WebSearch server configuration to the mcpServers section:

{
  "mcpServers": {
    "websearch": {
      "command": "uv",
      "args": [
        "--directory",
        "D:\\ABSOLUTE\\PATH\\TO\\WebSearch",
        "run",
        "main.py"
      ]
    }
  }
}
  3. Make sure to replace the directory path with the absolute path to your WebSearch project folder.

  4. Save the configuration file and restart Claude for Desktop.

  5. Once configured, the WebSearch tools will appear in the tools menu (hammer icon) in Claude for Desktop.
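Before restarting Claude for Desktop, a short script can sanity-check the configuration file for the most common mistakes (missing server entry, missing `command`, malformed `args`). This checker is a hypothetical helper, not part of the project:

```python
import json

# Illustrative validator for the claude_desktop_config.json structure.
def check_mcp_config(raw: str, server_name: str = "websearch") -> list:
    """Return a list of problems found in a config JSON string (empty = OK)."""
    try:
        cfg = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    server = cfg.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"missing mcpServers.{server_name} entry"]
    problems = []
    if "command" not in server:
        problems.append("missing 'command'")
    if not isinstance(server.get("args"), list):
        problems.append("'args' must be a list")
    return problems
```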

Available Tools

  1. Search

  2. Extract Information

  3. Crawl Websites

  4. Scrape Content

📚 API Reference

Search

  • prompt (str): The search query
  • limit (int): Maximum number of results to return (between 1 and 20)
  • Returns: Search results in JSON format

Extract

  • urls (List[str]): List of URLs to extract information from
  • prompt (str): Instructions for extraction
  • enableWebSearch (bool): Enable supplementary web search
  • showSources (bool): Include source references
  • Returns: Extracted information in specified format

Crawl

  • url (str): Starting URL
  • maxDepth (int): Maximum crawl depth
  • limit (int): Maximum pages to crawl
  • Returns: Crawled content in markdown/HTML format

Scrape

  • url (str): Target URL
  • Returns: Scraped content with optional screenshots

🔧 Configuration

Environment Variables

The tool requires certain API keys to function. We provide a .env.example file that you can use as a template:

  1. Copy the example file:
# On Unix/MacOS
cp .env.example .env

# On Windows
copy .env.example .env
  2. Edit the .env file with your API keys:
# OpenAI API key - Required for AI-powered features
OPENAI_API_KEY=your_openai_api_key_here

# Firecrawl API key - Required for web scraping and searching
FIRECRAWL_API_KEY=your_firecrawl_api_key_here

Getting the API Keys

  1. OpenAI API Key:

    • Visit OpenAI's platform
    • Sign up or log in
    • Navigate to API keys section
    • Create a new secret key
  2. Firecrawl API Key:

    • Visit Firecrawl's website
    • Create an account
    • Navigate to your dashboard
    • Generate a new API key

If everything is configured correctly, running a test search should return a JSON response with search results.

Troubleshooting

If you encounter errors:

  1. Ensure all required API keys are set in your .env file
  2. Verify the API keys are valid and have not expired
  3. Check that the .env file is in the root directory of the project
  4. Make sure the environment variables are being loaded correctly
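Step 1 above can be automated with a tiny check that reports which required keys are absent or empty. The key names come from this README; the checker itself is an illustrative sketch:

```python
import os

# Key names taken from the README; the checker is illustrative only.
REQUIRED_KEYS = ("FIRECRAWL_API_KEY",)

def missing_keys(env=None):
    """Return the required API-key names that are absent or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```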

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Firecrawl for their powerful web scraping API
  • OpenAI for AI capabilities
  • The MCP community for the Model Context Protocol specification

📬 Contact

José Martín Rodriguez Mortaloni - @m4s1t425 - jmrodriguezm13@gmail.com


Made with ❤️ using Python and Firecrawl
