
MCP Firecrawl Server
This is a simple MCP server that provides tools to scrape websites and extract structured data using Firecrawl's APIs.
Setup
- Install dependencies:

```bash
npm install
```

- Create a `.env` file in the root directory with the following variables:

```
FIRECRAWL_API_TOKEN=your_token_here
SENTRY_DSN=your_sentry_dsn_here
```

  - `FIRECRAWL_API_TOKEN` (required): Your Firecrawl API token
  - `SENTRY_DSN` (optional): Sentry DSN for error tracking and performance monitoring

- Start the server:

```bash
npm start
```

Alternatively, you can set environment variables directly when running the server:

```bash
FIRECRAWL_API_TOKEN=your_token_here npm start
```
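As a rough orientation, the sketch below shows how a Node.js/TypeScript server like this one might load these variables, assuming the `dotenv` and `@sentry/node` packages. It is illustrative only; the actual source may load configuration differently.

```typescript
// Minimal sketch of loading the configuration described above.
// Assumes dotenv and @sentry/node; not necessarily how this server does it.
import "dotenv/config";
import * as Sentry from "@sentry/node";

const apiToken = process.env.FIRECRAWL_API_TOKEN;
if (!apiToken) {
  // The Firecrawl token is required; fail fast if it is missing.
  throw new Error("FIRECRAWL_API_TOKEN is not set");
}

// Sentry is optional: only initialize it when a DSN is provided.
if (process.env.SENTRY_DSN) {
  Sentry.init({ dsn: process.env.SENTRY_DSN, tracesSampleRate: 1.0 });
}
```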
Features
- Website Scraping: Extract content from websites in various formats
- Structured Data Extraction: Extract specific data points based on custom schemas
- Error Tracking: Integrated with Sentry for error tracking and performance monitoring
Usage
The server exposes two tools:
- `scrape-website`: Basic website scraping with multiple format options
- `extract-data`: Structured data extraction based on prompts and schemas
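For context, MCP tools like these are typically registered with the official TypeScript SDK (`@modelcontextprotocol/sdk`). The sketch below is illustrative only: the tool name matches this README, but the handler body is a placeholder and the server name and version are assumptions, not this project's actual source.

```typescript
// Illustrative sketch of registering an MCP tool with the TypeScript SDK.
// Not this server's actual source code.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "mcp-firecrawl", version: "1.0.0" });

server.tool(
  "scrape-website",
  { url: z.string(), formats: z.array(z.string()).optional() },
  async ({ url, formats }) => {
    // Placeholder: call Firecrawl's scrape API here (see the sketch further below).
    return {
      content: [{ type: "text", text: `Scraped ${url} as ${formats ?? ["markdown"]}` }],
    };
  }
);

// extract-data would be registered the same way, then the server is connected over stdio.
await server.connect(new StdioServerTransport());
```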
Tool: scrape-website
This tool scrapes a website and returns its content in the requested formats.
Parameters:
- `url` (string, required): The URL of the website to scrape
- `formats` (array of strings, optional): Array of desired output formats. Supported formats are:
  - `"markdown"` (default)
  - `"html"`
  - `"text"`
Example usage with MCP Inspector:
```bash
# Basic usage (defaults to markdown)
mcp-inspector --tool scrape-website --args '{
  "url": "https://example.com"
}'

# Multiple formats
mcp-inspector --tool scrape-website --args '{
  "url": "https://example.com",
  "formats": ["markdown", "html", "text"]
}'
```
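Under the hood, a `scrape-website` call maps onto Firecrawl's scrape endpoint. The sketch below shows one way such a request could be made with `fetch`, assuming the v1 REST API; it is an approximation, not a copy of this server's code.

```typescript
// Rough sketch of a Firecrawl scrape request, assuming the v1 REST API.
// Illustrative only; the server may use Firecrawl's SDK instead.
async function scrapeWebsite(url: string, formats: string[] = ["markdown"]) {
  const response = await fetch("https://api.firecrawl.dev/v1/scrape", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.FIRECRAWL_API_TOKEN}`,
    },
    body: JSON.stringify({ url, formats }),
  });

  if (!response.ok) {
    throw new Error(`Firecrawl scrape failed: ${response.status} ${response.statusText}`);
  }

  // The response body contains the scraped content keyed by format,
  // e.g. data.markdown and data.html.
  const { data } = await response.json();
  return data;
}
```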
Tool: extract-data
This tool extracts structured data from websites based on a provided prompt and schema.
Parameters:
- `urls` (array of strings, required): Array of URLs to extract data from
- `prompt` (string, required): The prompt describing what data to extract
- `schema` (object, required): Schema definition for the data to extract
The schema definition should be an object where keys are field names and values are types. Supported types are:
"string"
: For text fields"boolean"
: For true/false fields"number"
: For numeric fields- Arrays: Specified as
["type"]
where type is one of the above - Objects: Nested objects with their own type definitions
Example usage with MCP Inspector:
```bash
# Basic example extracting company information
mcp-inspector --tool extract-data --args '{
  "urls": ["https://example.com"],
  "prompt": "Extract the company mission, whether it supports SSO, and whether it is open source.",
  "schema": {
    "company_mission": "string",
    "supports_sso": "boolean",
    "is_open_source": "boolean"
  }
}'

# Complex example with nested data
mcp-inspector --tool extract-data --args '{
  "urls": ["https://example.com/products", "https://example.com/pricing"],
  "prompt": "Extract product information including name, price, and features.",
  "schema": {
    "products": [{
      "name": "string",
      "price": "number",
      "features": ["string"]
    }]
  }
}'
```
Both tools return descriptive error messages if scraping or extraction fails, and automatically log errors to Sentry when a DSN is configured.
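A common pattern for this behavior, shown below as a hedged sketch rather than the server's actual code, is to wrap each tool handler in a try/catch, report the exception to Sentry when it is configured, and surface a readable message to the MCP client.

```typescript
// Sketch of the error-handling pattern described above. Illustrative only.
import * as Sentry from "@sentry/node";

async function runTool<T>(name: string, work: () => Promise<T>) {
  try {
    return await work();
  } catch (error) {
    // Report to Sentry only when it was initialized (i.e. SENTRY_DSN was set).
    if (process.env.SENTRY_DSN) {
      Sentry.captureException(error);
    }
    // Return an MCP-style error result instead of crashing the server.
    const message = error instanceof Error ? error.message : String(error);
    return {
      isError: true,
      content: [{ type: "text", text: `${name} failed: ${message}` }],
    };
  }
}
```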
Troubleshooting
If you encounter issues:
- Verify your Firecrawl API token is valid
- Check that the URLs you're trying to scrape are accessible
- For complex schemas, ensure they follow the supported format
- Review Sentry logs for detailed error information (if configured)