Crawlbase MCP
A Model Context Protocol server that enables AI agents to fetch live web content with JavaScript rendering, proxy rotation, and anti-bot evasion.
What is Crawlbase MCP?
Crawlbase MCP is a Model Context Protocol (MCP) server that bridges AI agents and the live web. Instead of relying on outdated training data, your LLMs can now fetch fresh, structured, real-time content — powered by Crawlbase’s proven crawling infrastructure trusted by 70,000+ developers worldwide.
It handles the complexity of scraping for you:
- JavaScript rendering for modern web apps
- Proxy rotation & anti-bot evasion
- Structured outputs (HTML, Markdown, screenshots)
How It Works
- Get Free Crawlbase Tokens → Sign up at Crawlbase ↗️ to get free Normal and JavaScript tokens.
- Set Up MCP Configuration → Configure the MCP server in your preferred client (Claude, Cursor, or Windsurf) by updating the MCP Servers settings.
- Start Crawling → Use commands like crawl, crawl_markdown, or crawl_screenshot to bring live web data into your AI agent.
Setup & Integration
Claude Desktop
- Open Claude Desktop → Settings → Developer → Edit Config
- Add the following to claude_desktop_config.json, replacing your_token_here and your_js_token_here with the tokens from your dashboard:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}
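If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge in Python; the helper names `add_crawlbase_server` and `update_config_file` are hypothetical and not part of Crawlbase MCP, and the file path is left to the caller because it differs per client and operating system.

```python
import json
from pathlib import Path


def add_crawlbase_server(config: dict, token: str, js_token: str) -> dict:
    """Return a copy of `config` with a "crawlbase" entry under "mcpServers".

    Existing servers and unrelated top-level keys are preserved.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["crawlbase"] = {
        "type": "stdio",
        "command": "npx",
        "args": ["@crawlbase/mcp@latest"],
        "env": {
            "CRAWLBASE_TOKEN": token,
            "CRAWLBASE_JS_TOKEN": js_token,
        },
    }
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path, token: str, js_token: str) -> None:
    """Merge the entry into the JSON file at `path`, creating it if missing."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_crawlbase_server(existing, token, js_token)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Calling `update_config_file(Path("claude_desktop_config.json"), "your_token_here", "your_js_token_here")` would rewrite a file in the current directory; point it at the real config location for your client and restart the client afterwards.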
Claude Code
Add to your claude.json configuration:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}
Cursor IDE
- Open Cursor IDE → File → Preferences → Cursor Settings → Tools and Integrations → Add Custom MCP
- Add the following to mcp.json, replacing your_token_here and your_js_token_here with the tokens from your dashboard:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}
Windsurf IDE
- Open Windsurf IDE → File → Preferences → Windsurf Settings → General → MCP Servers → Manage MCPs → View raw config
- Add the following to mcp_config.json, replacing your_token_here and your_js_token_here with the tokens from your dashboard:
{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": ["@crawlbase/mcp@latest"],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}
HTTP Transport Mode
For scenarios where you need a shared MCP server accessible over HTTP (e.g., multi-user environments, custom integrations), you can run the server in HTTP mode:
# Clone and install
git clone https://github.com/crawlbase/crawlbase-mcp.git
cd crawlbase-mcp
npm install
# Start HTTP server with tokens (default port: 3000)
CRAWLBASE_TOKEN=your_token CRAWLBASE_JS_TOKEN=your_js_token npm run start:http
# Or with custom port
CRAWLBASE_TOKEN=your_token CRAWLBASE_JS_TOKEN=your_js_token MCP_PORT=8080 npm run start:http
The server exposes:
- POST /mcp - MCP Streamable HTTP endpoint
- GET /health - Health check endpoint
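Under the MCP Streamable HTTP transport, a POST to /mcp may be answered either with a plain JSON body or with a server-sent event stream, depending on what the server chooses. The sketch below shows one way a custom client could extract the JSON-RPC messages from either form. The function name `parse_mcp_response` is hypothetical, and the event-stream handling is deliberately simplified: it reads only `data:` fields and treats a blank line as the end of an event.

```python
import json


def parse_mcp_response(content_type: str, body: str) -> list[dict]:
    """Return the JSON-RPC messages contained in an HTTP response body.

    Handles `application/json` bodies and a simplified subset of
    `text/event-stream` (only `data:` lines, events split on blank lines).
    """
    if content_type.lower().startswith("application/json"):
        parsed = json.loads(body)
        # A JSON body may hold one message or a batch (list) of messages.
        return parsed if isinstance(parsed, list) else [parsed]

    messages: list[dict] = []
    data_lines: list[str] = []
    # The extra "" guarantees the final event is flushed even without
    # a trailing blank line in the stream.
    for line in body.splitlines() + [""]:
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # the SSE format strips one leading space
            data_lines.append(value)
        elif line == "" and data_lines:
            payload = "\n".join(data_lines)
            if payload.strip():  # skip events that carry empty data
                messages.append(json.loads(payload))
            data_lines = []
    return messages
```

A caller would pass the response's Content-Type header and decoded body, then look up the message whose `id` matches the request it sent.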
Per-Request Token Authentication
HTTP mode supports per-request tokens via headers, allowing multiple users to share a single server:
curl -X POST http://localhost:3000/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-H "X-Crawlbase-Token: your_token" \
-H "X-Crawlbase-JS-Token: your_js_token" \
-d '{"jsonrpc": "2.0", "method": "tools/list", "id": 1}'
Headers:
- X-Crawlbase-Token - Normal token for HTML requests
- X-Crawlbase-JS-Token - JavaScript token for JS-rendered pages/screenshots
Headers override environment variables when provided, enabling multi-tenant deployments.
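For integrations that are not shell scripts, the same request can be assembled in code. The following Python sketch mirrors the curl call above using only the standard library. The helper names `build_mcp_request` and `send` are hypothetical, the default base URL assumes a local server on port 3000, and treating the JavaScript token as optional is an assumption based on the fallback to environment variables described above.

```python
import json
import urllib.request


def build_mcp_request(method, token, js_token=None, params=None,
                      request_id=1, base_url="http://localhost:3000"):
    """Return (url, headers, body) for a JSON-RPC call to the /mcp endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "X-Crawlbase-Token": token,
    }
    if js_token is not None:
        # Assumption: when omitted, the server falls back to its own
        # CRAWLBASE_JS_TOKEN environment variable.
        headers["X-Crawlbase-JS-Token"] = js_token

    payload = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        payload["params"] = params
    return f"{base_url}/mcp", headers, json.dumps(payload).encode("utf-8")


def send(url, headers, body, timeout=30.0):
    """POST the request and return the raw response body as text."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a server started with `npm run start:http`.
    print(send(*build_mcp_request("tools/list", "your_token", "your_js_token")))
```

Keeping request construction separate from sending makes it straightforward to give each user their own tokens while sharing one server process.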
🔑 Get your free tokens at Crawlbase ↗️.
Usage
Once configured, use these commands inside Claude, Cursor, or Windsurf:
- crawl → Fetch raw HTML
- crawl_markdown → Extract clean Markdown
- crawl_screenshot → Capture screenshots
Example prompts:
- “Crawl Hacker News and return top stories in markdown.”
- “Take a screenshot of the TechCrunch homepage.”
- “Fetch the Tesla investor relations page as HTML.”
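Behind prompts like these, the MCP client turns the model's tool choice into a JSON-RPC `tools/call` request. The sketch below builds such a message for the three tools listed above. The function name `build_tool_call` is hypothetical, and the argument key `url` is an assumption: the actual input schema for each tool is whatever the server reports in its `tools/list` response.

```python
import json

# Tool names as listed in this README.
CRAWL_TOOLS = ("crawl", "crawl_markdown", "crawl_screenshot")


def build_tool_call(tool: str, url: str, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` message for one of the crawl tools."""
    if tool not in CRAWL_TOOLS:
        raise ValueError(f"unknown tool {tool!r}; expected one of {CRAWL_TOOLS}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool,
            # Assumed argument name; confirm against the tool's input schema.
            "arguments": {"url": url},
        },
    }


if __name__ == "__main__":
    message = build_tool_call("crawl_markdown", "https://news.ycombinator.com")
    print(json.dumps(message, indent=2))
```

In normal use the client (Claude, Cursor, or Windsurf) generates this message for you; building it by hand is mainly useful when testing a server running in HTTP mode.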
Async Crawling with Cloud Storage
For larger jobs, Crawlbase MCP can push crawl results to Crawlbase Cloud Storage instead of returning them immediately. Your AI agent can then come back later to read, list, or clean up those pages — useful when crawling many URLs at once, revisiting a dataset across sessions, or keeping heavy HTML out of the chat until you actually need it.
Example prompts:
- “Crawl these 50 product pages and save them to my Crawlbase storage. Once they're saved, summarize each one.”
- “Save the Hacker News front page to storage so I can analyze it later.”
- “How many pages do I have stored in Crawlbase right now? Show me the most recent 20.”
- “Pull back everything I saved yesterday from my Crawlbase storage and give me a report.”
- “Delete all the pages I have in Crawlbase storage — I'm done with that project.”
Use Cases
- Market research → Pull live data from competitors, news, and reports
- E-commerce monitoring → Track products, reviews, and prices in real time
- News & finance feeds → Keep AI agents up-to-date with live events
- Autonomous AI agents → Give them vision to act on fresh web data
Resources & Next Steps
Looking to supercharge your AI agents with live web data? Get started here:
- ✍️ Learn More – See how MCP powers AI agents with real-time web data ↗️
- 🌐 Crawlbase Website – Get free tokens & start crawling today ↗️
Copyright 2026 Crawlbase
