Crawlbase MCP

A Model Context Protocol server that enables AI agents to fetch live web content with JavaScript rendering, proxy rotation, and anti-bot evasion.

What is Crawlbase MCP?

Crawlbase MCP is a Model Context Protocol (MCP) server that bridges AI agents and the live web. Instead of relying on outdated training data, your LLMs can now fetch fresh, structured, real-time content — powered by Crawlbase’s proven crawling infrastructure trusted by 70,000+ developers worldwide.

It handles the complexity of scraping for you:

  • JavaScript rendering for modern web apps
  • Proxy rotation & anti-bot evasion
  • Structured outputs (HTML, Markdown, screenshots)

How It Works

  • Get Free Crawlbase Tokens → Sign up at Crawlbase ↗️ and get free Normal and JavaScript tokens.
  • Set Up MCP Configuration → Configure the MCP server in your preferred client (Claude, Cursor, or Windsurf) by updating the MCP Servers settings.
  • Start Crawling → Use commands like crawl, crawl_markdown, or crawl_screenshot to bring live web data into your AI agent.

Setup & Integration

Claude Desktop

  1. Open Claude Desktop → Settings → Developer → Edit Config
  2. Add the following to claude_desktop_config.json, replacing your_token_here and your_js_token_here with the tokens from your dashboard:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}

Claude Code

Add to your claude.json configuration:

{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}

Cursor IDE

  1. Open Cursor IDE → File → Preferences → Cursor Settings → Tools and Integrations → Add Custom MCP
  2. Add the following to mcp.json, replacing your_token_here and your_js_token_here with the tokens from your dashboard:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}

Windsurf IDE

  1. Open Windsurf IDE → File → Preferences → Windsurf Settings → General → MCP Servers → Manage MCPs → View raw config
  2. Add the following to mcp_config.json, replacing your_token_here and your_js_token_here with the tokens from your dashboard:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}
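
All four clients above take the same "crawlbase" entry, so one way to avoid hand-editing mistakes is to merge it into a config you have already loaded. The following is a minimal sketch, not part of the Crawlbase tooling: the helper names crawlbase_server_entry and add_crawlbase are hypothetical, and the "other" server in the example is invented for illustration.

```python
import json


def crawlbase_server_entry(token: str, js_token: str) -> dict:
    """Build the server entry shown in the config snippets above."""
    return {
        "type": "stdio",
        "command": "npx",
        "args": ["@crawlbase/mcp@latest"],
        "env": {
            "CRAWLBASE_TOKEN": token,
            "CRAWLBASE_JS_TOKEN": js_token,
        },
    }


def add_crawlbase(config: dict, token: str, js_token: str) -> dict:
    """Return a copy of config with the crawlbase entry added.

    Existing entries under "mcpServers" are kept; the input is not mutated.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["crawlbase"] = crawlbase_server_entry(token, js_token)
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already defined.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_crawlbase(existing, "your_token_here", "your_js_token_here")
print(json.dumps(updated, indent=2))
```

Writing the result back to claude_desktop_config.json, mcp.json, or mcp_config.json is left out on purpose, since the file location differs per client.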

HTTP Transport Mode

For scenarios where you need a shared MCP server accessible over HTTP (e.g., multi-user environments, custom integrations), you can run the server in HTTP mode:

# Clone and install
git clone https://github.com/crawlbase/crawlbase-mcp.git
cd crawlbase-mcp
npm install

# Start HTTP server with tokens (default port: 3000)
CRAWLBASE_TOKEN=your_token CRAWLBASE_JS_TOKEN=your_js_token npm run start:http

# Or with custom port
CRAWLBASE_TOKEN=your_token CRAWLBASE_JS_TOKEN=your_js_token MCP_PORT=8080 npm run start:http

The server exposes:

  • POST /mcp - MCP Streamable HTTP endpoint
  • GET /health - Health check endpoint
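
A script that starts the HTTP server may want to wait until GET /health responds before sending MCP traffic. The sketch below uses only the Python standard library. It assumes that an HTTP 200 status means the server is ready; the response body format is not documented here, so it is ignored. The function names are illustrative.

```python
import time
import urllib.error
import urllib.request


def health_url(port: int = 3000, host: str = "localhost") -> str:
    """URL of the health check endpoint (default port 3000, as above)."""
    return f"http://{host}:{port}/health"


def probe_health(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 0.5,
                       probe=probe_health) -> bool:
    """Poll the health endpoint up to `attempts` times, sleeping between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, wait_until_healthy(health_url(8080)) would poll a server started with MCP_PORT=8080.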

Per-Request Token Authentication

HTTP mode supports per-request tokens via headers, allowing multiple users to share a single server:

curl -X POST http://localhost:3000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "X-Crawlbase-Token: your_token" \
  -H "X-Crawlbase-JS-Token: your_js_token" \
  -d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}'
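
The same request can be built from Python with the standard library. This is a sketch that mirrors the curl call above and nothing more: build_mcp_request and send are hypothetical helper names, and depending on the server an MCP session may need an initialize exchange before tools/list is accepted.

```python
import json
import urllib.request


def build_mcp_request(method, params=None, *, token=None, js_token=None,
                      base_url="http://localhost:3000", request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST to the /mcp endpoint."""
    payload = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        payload["params"] = params
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    # Per-request tokens are optional; omit them to use the server's env vars.
    if token:
        headers["X-Crawlbase-Token"] = token
    if js_token:
        headers["X-Crawlbase-JS-Token"] = js_token
    return urllib.request.Request(
        f"{base_url}/mcp",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return the raw response text (JSON or an SSE stream)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


request = build_mcp_request("tools/list", token="your_token", js_token="your_js_token")
print(request.get_method(), request.full_url)
```

Calling send(request) performs the actual network call; it is kept separate so the request can be inspected first.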

Headers:

  • X-Crawlbase-Token - Normal token for HTML requests
  • X-Crawlbase-JS-Token - JavaScript token for JS-rendered pages/screenshots

Headers override environment variables when provided, enabling multi-tenant deployments.
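
The precedence rule can be pictured as a small lookup: take the header value if the request carries one, otherwise fall back to the environment variable. The sketch below only illustrates that rule and is not the server's actual code; resolve_tokens is a hypothetical name, and the case-insensitive header match is an assumption based on how HTTP headers normally behave.

```python
def resolve_tokens(headers: dict, env: dict) -> dict:
    """Pick each token from the request headers first, then from the environment."""
    # HTTP header names are case-insensitive, so normalize before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}

    def pick(header_name: str, env_name: str):
        return lowered.get(header_name.lower()) or env.get(env_name)

    return {
        "token": pick("X-Crawlbase-Token", "CRAWLBASE_TOKEN"),
        "js_token": pick("X-Crawlbase-JS-Token", "CRAWLBASE_JS_TOKEN"),
    }


# Server started with shared tokens; one tenant supplies its own Normal token.
server_env = {"CRAWLBASE_TOKEN": "shared_token", "CRAWLBASE_JS_TOKEN": "shared_js_token"}
tenant_headers = {"x-crawlbase-token": "tenant_token"}
print(resolve_tokens(tenant_headers, server_env))
```

In this example the Normal token comes from the tenant's header while the JavaScript token falls back to the server's environment.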

🔑 Get your free tokens at Crawlbase ↗️.

Usage

Once configured, use these commands inside Claude, Cursor, or Windsurf:

  • crawl → Fetch raw HTML
  • crawl_markdown → Extract clean Markdown
  • crawl_screenshot → Capture screenshots

Example prompts:

  • “Crawl Hacker News and return top stories in markdown.”
  • “Take a screenshot of the TechCrunch homepage.”
  • “Fetch the Tesla investor relations page as HTML.”
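
Behind prompts like these, an MCP client sends a tools/call request that names one of the three tools. The sketch below builds such a payload following the general MCP request shape (method "tools/call" with "name" and "arguments" in params). The "url" argument name is an assumption, not confirmed by this README; the authoritative input schema is whatever tools/list returns for each tool.

```python
import json

# Tool names listed in the Usage section above.
TOOLS = ("crawl", "crawl_markdown", "crawl_screenshot")


def tool_call(tool: str, url: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC payload asking the server to run one Crawlbase tool."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # "url" is an assumed argument name; check the tool's input schema.
        "params": {"name": tool, "arguments": {"url": url}},
    }


payload = tool_call("crawl_markdown", "https://news.ycombinator.com")
print(json.dumps(payload, indent=2))
```

In normal use the client (Claude, Cursor, or Windsurf) builds this request for you; the payload is mainly useful when testing the HTTP transport by hand.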

Async Crawling with Cloud Storage

For larger jobs, Crawlbase MCP can push crawl results to Crawlbase Cloud Storage instead of returning them immediately. Your AI agent can then come back later to read, list, or clean up those pages — useful when crawling many URLs at once, revisiting a dataset across sessions, or keeping heavy HTML out of the chat until you actually need it.

Example prompts:

  • “Crawl these 50 product pages and save them to my Crawlbase storage. Once they're saved, summarize each one.”
  • “Save the Hacker News front page to storage so I can analyze it later.”
  • “How many pages do I have stored in Crawlbase right now? Show me the most recent 20.”
  • “Pull back everything I saved yesterday from my Crawlbase storage and give me a report.”
  • “Delete all the pages I have in Crawlbase storage — I'm done with that project.”

Use Cases

  • Market research → Pull live data from competitors, news, and reports
  • E-commerce monitoring → Track products, reviews, and prices in real time
  • News & finance feeds → Keep AI agents up-to-date with live events
  • Autonomous AI agents → Give them vision to act on fresh web data

Resources & Next Steps

Looking to supercharge your AI agents with live web data? Get started here:


Copyright 2026 Crawlbase