ScrapeLab MCP

Enables undetectable web scraping and browser automation for AI agents with 84 tools including stealth navigation, element extraction, network interception, and auto cookie consent dismissal. Bypasses anti-bot systems like Cloudflare and DataDome while providing LLM-ready markdown output and full Chrome DevTools Protocol access.

The most complete stealth browser MCP server for AI agents.

84 tools. Undetectable by anti-bot systems. Full CDP access.
LLM-ready markdown. Auto cookie consent dismiss (100+ CMPs).
Accessibility snapshots, PDF export, HAR capture, network hooks, element cloning.

What is this?

An MCP server that gives AI agents (Claude, Cursor, Windsurf, etc.) a fully undetectable browser with 84 automation tools. Built on nodriver + Chrome DevTools Protocol + FastMCP.

Why not Playwright MCP? Playwright-driven browsers expose automation signals that anti-bot systems such as Cloudflare and DataDome commonly detect and block. ScrapeLab uses nodriver (the successor to undetected-chromedriver), which drives Chrome over CDP without setting the navigator.webdriver flag or leaving the usual automation fingerprints.

Key differentiators

Feature	ScrapeLab MCP	Playwright MCP	Stealth Browser MCP
Anti-bot bypass (Cloudflare, DataDome)	Yes	No	Yes
Markdown output (LLM-ready)	Yes	Yes	No
Cookie consent auto-dismiss (100+ CMPs)	Yes	No	No
Accessibility snapshots	Yes	Yes	No
PDF export	Yes	Yes	No
HAR export	Yes	No	No
Network interception + hooks	Deep (Python hooks)	Routes only	Deep
Element cloning (styles, events, animations)	Full CDP	No	Full CDP
Progressive element cloning	Yes	No	Yes
Tools	84	61	90
Modular sections (enable/disable)	Yes	Capabilities	Yes

LLM-Ready Markdown

get_page_content returns clean markdown instead of raw HTML — 98-99% smaller, ready for LLM consumption.

Mode	Engine	Best for	Size reduction
`readability=False` (default)	html2text	Full page structure, navigation, all content	~98%
`readability=True`	trafilatura	Article/main content only, precision extraction	~99%

Both modes strip scripts, styles, SVGs, cookie banners, navigation chrome, and HTML comments before conversion.
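
To illustrate the cleanup step described above, here is a minimal sketch that drops script, style, SVG, and nav subtrees plus HTML comments before any text is extracted. It uses only Python's standard `html.parser` and is not ScrapeLab's implementation (which, per the table above, relies on html2text and trafilatura). `SKIPPED_TAGS`, `BoilerplateStripper`, `strip_boilerplate`, and `size_reduction` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

# Tags whose whole subtree is dropped (assumed list, based on the
# elements the README says are stripped before conversion).
SKIPPED_TAGS = {"script", "style", "svg", "nav"}


class BoilerplateStripper(HTMLParser):
    """Collect visible text, ignoring skipped subtrees and HTML comments."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped subtree
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    # handle_comment is not overridden, so comments are discarded.

    def text(self):
        return "\n".join(self._chunks)


def strip_boilerplate(html: str) -> str:
    parser = BoilerplateStripper()
    parser.feed(html)
    parser.close()
    return parser.text()


def size_reduction(original: str, cleaned: str) -> float:
    """Percentage by which `cleaned` is smaller than `original`."""
    return 100.0 * (1 - len(cleaned) / len(original))


SAMPLE = """<html><head><style>body{margin:0}</style>
<script>trackUser();</script></head>
<body><nav><a href="/">Home</a></nav><!-- ad slot -->
<h1>Title</h1><p>First paragraph.</p>
<svg><circle r="4"/></svg></body></html>"""

cleaned = strip_boilerplate(SAMPLE)
print(cleaned)
print(f"{size_reduction(SAMPLE, cleaned):.0f}% smaller")
```

The reduction on a toy snippet like this says nothing about real pages; the 98-99% figures quoted above are the project's own numbers.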

Cookie Consent Auto-Dismiss

Every navigate call automatically dismisses cookie/GDPR consent popups. No manual clicks, no leftover overlays blocking your scraper.

Three-layer system:

1. DuckDuckGo autoconsent: 2863 rules covering 100+ consent management platforms (iubenda, Cookiebot, OneTrust, Quantcast, TrustArc, etc.)
2. CMP JS API fallback: calls platform APIs directly from the main page (_sp_.destroyMessages(), OneTrust.AllowAll(), __tcfapi, Didomi, Cookiebot), which handles cross-origin iframe popups such as SourcePoint
3. DOM click fallback: catches multi-step consent flows (e.g. iubenda's 2-click Italian flow) by re-clicking accept buttons

Disable per-instance with spawn_browser(auto_dismiss_consent=False).
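
The layered design can be sketched as an ordered chain of fallbacks, where each layer reports whether it dismissed a popup and a failing layer does not stop the next one from running. This is an illustrative sketch only: `dismiss_consent` and the three stand-in layer functions are hypothetical and do not reflect how ScrapeLab wires its layers internally.

```python
from __future__ import annotations

from typing import Callable, Optional

# Each layer is a callable that returns True when it dismissed a popup.
ConsentLayer = Callable[[], bool]


def dismiss_consent(layers: list[tuple[str, ConsentLayer]]) -> Optional[str]:
    """Try each layer in order; return the name of the first that succeeds."""
    for name, layer in layers:
        try:
            if layer():
                return name
        except Exception:
            # A broken layer must not block the remaining fallbacks.
            continue
    return None


# Hypothetical stand-ins for the three layers described above.
def autoconsent_rules() -> bool:
    return False  # no rule matched this page


def cmp_js_api() -> bool:
    raise RuntimeError("CMP API not present on this page")


def dom_click() -> bool:
    return True  # an accept button was found and clicked


winner = dismiss_consent([
    ("autoconsent", autoconsent_rules),
    ("cmp-js-api", cmp_js_api),
    ("dom-click", dom_click),
])
print(winner)
```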

Quickstart

1. Clone and install

git clone https://github.com/competitorch/ScrapeLabMCP.git
cd ScrapeLabMCP
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Add to your MCP client

Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "scrapelab-mcp": {
      "command": "/path/to/ScrapeLabMCP/.venv/bin/python",
      "args": ["/path/to/ScrapeLabMCP/src/server.py"]
    }
  }
}
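
If you prefer not to edit the JSON by hand, the entry above can be generated from the clone location. The sketch below is a small helper under stated assumptions: `build_server_entry` and `merge_into_config` are hypothetical names, the virtualenv lives in `.venv` inside the repository as in step 1, and POSIX-style paths are used. It only builds the dictionary and leaves writing the file to you.

```python
import json
from pathlib import PurePosixPath


def build_server_entry(repo_dir: str) -> dict:
    """Build the command/args pair for a clone located at `repo_dir`."""
    repo = PurePosixPath(repo_dir)
    return {
        "command": str(repo / ".venv" / "bin" / "python"),
        "args": [str(repo / "src" / "server.py")],
    }


def merge_into_config(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of `config` with `entry` added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


existing = {"mcpServers": {}}  # stand-in for the parsed config file
entry = build_server_entry("/path/to/ScrapeLabMCP")
print(json.dumps(merge_into_config(existing, "scrapelab-mcp", entry), indent=2))
```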

Claude Code CLI:

claude mcp add-json scrapelab-mcp '{
  "type": "stdio",
  "command": "/path/to/ScrapeLabMCP/.venv/bin/python",
  "args": ["/path/to/ScrapeLabMCP/src/server.py"]
}'

3. Use it

You: "Open a browser and navigate to example.com"
You: "Take a screenshot and get the accessibility snapshot"
You: "Get the page content as markdown"
You: "Export the page as PDF"
You: "Show me all network requests and export as HAR"

Tools Reference (84 tools)

Browser Management (10 tools)

Tool	Description
`spawn_browser`	Launch undetectable browser instance (headless, proxy, custom UA, auto-consent)
`navigate`	Navigate to URL with wait conditions + auto cookie consent dismiss
`close_instance`	Clean shutdown of browser instance
`list_instances`	List all active browser instances
`get_instance_state`	Full page state (URL, cookies, storage, viewport)
`go_back` / `go_forward`	Browser history navigation
`reload_page`	Reload with optional cache bypass
`get_accessibility_snapshot`	Structured accessibility tree — the fastest way for an LLM to understand a page
`save_as_pdf`	Export page as PDF with full layout control

Element Interaction (11 tools)

Tool	Description
`query_elements`	Find elements by CSS/XPath with visibility info
`click_element`	Natural click with fallback strategies
`type_text`	Human-like typing
`paste_text`	Instant paste via CDP
`scroll_page`	Directional scrolling
`wait_for_element`	Smart wait with timeout
`execute_script`	Run JavaScript in page context
`select_option`	Dropdown selection
`get_element_state`	Element properties and bounding box
`take_screenshot`	Screenshot (viewport, full page, or element)
`get_page_content`	HTML, text, or markdown (`readability=True` for article extraction)

Element Extraction (8 tools)

Deep extraction with optional save_to_file=True on every tool.
Style extraction supports method="js" or method="cdp" for maximum accuracy.

Tool	Description
`extract_element_styles`	300+ CSS properties, pseudo-elements, inheritance chain
`extract_element_structure`	DOM tree, attributes, data attributes, children
`extract_element_events`	Event listeners, inline handlers, framework detection
`extract_element_animations`	CSS animations, transitions, transforms, keyframes
`extract_element_assets`	Images, backgrounds, fonts, icons, videos
`extract_related_files`	Linked CSS/JS files, imports, modules
`clone_element_complete`	Master clone: all of the above in one call (`method="comprehensive"` or `"cdp"`)

Progressive Cloning (10 tools)

Lazy-load element data on demand — start lightweight, expand what you need.

Tool	Description
`clone_element_progressive`	Base structure with `element_id` for on-demand expansion
`expand_styles` / `expand_events` / `expand_children`	Expand specific data categories
`expand_css_rules` / `expand_pseudo_elements` / `expand_animations`	Expand detailed styling data
`list_stored_elements` / `clear_stored_element` / `clear_all_elements`	Manage stored elements
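
The lazy-loading idea behind these tools can be sketched as a store that returns a lightweight summary plus an `element_id`, then serves individual categories on request. This is an assumption-laden illustration, not the server's code: `ElementStore`, its method names, and the shape of the stored data are all hypothetical.

```python
import uuid


class ElementStore:
    """Keep full element data server-side; hand out small summaries."""

    def __init__(self):
        self._elements = {}

    def clone_progressive(self, full_data: dict) -> dict:
        # Store everything, but return only the tag and what can be expanded.
        element_id = uuid.uuid4().hex[:8]
        self._elements[element_id] = full_data
        return {
            "element_id": element_id,
            "tag": full_data.get("tag"),
            "available": sorted(k for k in full_data if k != "tag"),
        }

    def expand(self, element_id: str, category: str):
        # Fetch one category on demand (e.g. "styles" or "events").
        return self._elements[element_id][category]

    def list_stored(self) -> list:
        return sorted(self._elements)

    def clear(self, element_id: str) -> None:
        self._elements.pop(element_id, None)

    def clear_all(self) -> None:
        self._elements.clear()


store = ElementStore()
summary = store.clone_progressive({
    "tag": "button",
    "styles": {"color": "red"},
    "events": ["click"],
})
print(summary["available"])
print(store.expand(summary["element_id"], "styles"))
```

Keeping the heavy data out of the first response is what keeps the initial tool result small; the caller pays only for the categories it actually expands.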

Network & Traffic (12 tools)

Deep network monitoring with interception, search, and standard export formats.

Tool	Description
`list_network_requests`	All captured requests with type filtering
`get_request_details` / `get_response_details` / `get_response_content`	Inspect individual requests
`search_network_requests`	Search by URL pattern, method, status, body content
`modify_headers`	Modify request headers for future requests
`set_network_capture_filters` / `get_network_capture_filters`	Control what gets captured
`export_network_data` / `import_network_data`	JSON export/import
`export_har`	Export as HAR 1.2 — importable in Chrome DevTools, Postman, Fiddler
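
For readers unfamiliar with HAR, the sketch below shows the minimal shape of a HAR 1.2 document built from captured request records. The HAR field names follow the 1.2 format; the input record keys (`started`, `duration_ms`, `request_headers`, and so on) and the function names are assumptions made for this example and are not ScrapeLab's internal representation.

```python
import json
from datetime import datetime, timezone


def to_har_entry(record: dict) -> dict:
    """Map one captured request record to a HAR 1.2 entry."""
    headers = record.get("request_headers", {})
    return {
        "startedDateTime": record["started"].isoformat(),
        "time": record["duration_ms"],
        "request": {
            "method": record["method"],
            "url": record["url"],
            "httpVersion": "HTTP/1.1",
            "cookies": [],
            "headers": [{"name": k, "value": v} for k, v in headers.items()],
            "queryString": [],
            "headersSize": -1,  # -1 means "not available" in HAR
            "bodySize": -1,
        },
        "response": {
            "status": record["status"],
            "statusText": "",
            "httpVersion": "HTTP/1.1",
            "cookies": [],
            "headers": [],
            "content": {
                "size": record.get("size", 0),
                "mimeType": record.get("mime_type", ""),
            },
            "redirectURL": "",
            "headersSize": -1,
            "bodySize": -1,
        },
        "cache": {},
        # Without finer-grained timing data, attribute everything to "wait".
        "timings": {"send": 0, "wait": record["duration_ms"], "receive": 0},
    }


def to_har(records: list, creator: str = "har-sketch") -> dict:
    return {
        "log": {
            "version": "1.2",
            "creator": {"name": creator, "version": "0.1"},
            "entries": [to_har_entry(r) for r in records],
        }
    }


har = to_har([{
    "started": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    "duration_ms": 120,
    "method": "GET",
    "url": "https://example.com/",
    "status": 200,
    "mime_type": "text/html",
    "size": 1256,
    "request_headers": {"Accept": "text/html"},
}])
print(json.dumps(har, indent=2)[:200])
```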

Dynamic Hooks (7 tools)

AI-generated Python functions that intercept and modify network traffic in real-time.

Tool	Description
`create_dynamic_hook`	Full hook with custom Python function
`create_simple_dynamic_hook`	Template hook (block, redirect, add_headers, log)
`list_dynamic_hooks` / `get_dynamic_hook_details` / `remove_dynamic_hook`	Manage hooks
`get_hook_documentation`	Docs for writing hooks (overview, requirements, examples, patterns)
`validate_hook_function`	Validate hook code before deploying
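
The actual hook contract is documented by `get_hook_documentation` and is not reproduced in this README, so the following is only a sketch of what pre-deployment validation might look like. It assumes, purely for illustration, that a hook is Python source defining a one-argument function named `process_request` that returns an action dictionary; that name, the argument, and the return shape are all hypothetical.

```python
import ast


def validate_hook_source(source: str, entry_point: str = "process_request") -> list:
    """Return a list of problems found in hook source (empty means OK)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    # Look for the entry point among top-level function definitions.
    functions = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    match = next((f for f in functions if f.name == entry_point), None)
    if match is None:
        return [f"missing top-level function {entry_point!r}"]
    if len(match.args.args) != 1:
        return [f"{entry_point!r} must take exactly one argument"]
    return []


# Hypothetical hook: block URLs containing "tracker", pass the rest through.
EXAMPLE_HOOK = '''
def process_request(request):
    if "tracker" in request["url"]:
        return {"action": "block"}
    return {"action": "continue"}
'''

print(validate_hook_source(EXAMPLE_HOOK))
print(validate_hook_source("def other(): pass"))
```

Checking the source with `ast` catches structural mistakes without executing untrusted code, which is the main reason to validate before deploying.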

CDP Functions (12 tools)

Direct Chrome DevTools Protocol access for advanced automation.

Tool	Description
`execute_cdp_command`	Raw CDP command execution
`discover_global_functions` / `discover_object_methods`	Discover page APIs
`call_javascript_function` / `execute_function_sequence`	Call JS functions
`inject_and_execute_script`	Inject and run scripts
`inspect_function_signature`	Inspect function signatures
`create_persistent_function`	Functions that survive navigation
`create_python_binding` / `execute_python_in_browser`	Python-in-browser via py2js
`get_execution_contexts` / `list_cdp_commands` / `get_function_executor_info`	CDP introspection
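
Raw CDP commands are JSON messages with an `id`, a `method` in `Domain.command` form, and a `params` object; replies carry the same `id` with either `result` or `error`. The sketch below shows that wire format only. The helper names are hypothetical, and how `execute_cdp_command` accepts its arguments is not specified here.

```python
import json
from typing import Any, Optional


def build_cdp_command(message_id: int, method: str,
                      params: Optional[dict] = None) -> str:
    """Serialize a CDP command as it would travel over the WebSocket."""
    message = {"id": message_id, "method": method, "params": params or {}}
    return json.dumps(message)


def parse_cdp_reply(raw: str) -> Any:
    """Return the result payload, or raise if the browser reported an error."""
    reply = json.loads(raw)
    if "error" in reply:
        err = reply["error"]
        raise RuntimeError(f"CDP error {err.get('code')}: {err.get('message')}")
    return reply.get("result", {})


command = build_cdp_command(1, "Page.navigate", {"url": "https://example.com"})
print(command)

# A reply of the kind the browser might send back for the command above.
print(parse_cdp_reply('{"id": 1, "result": {"frameId": "abc"}}'))
```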

Cookies & Storage (3 tools)

Tool	Description
`get_cookies` / `set_cookie` / `clear_cookies`	Cookie management

Tab Management (5 tools)

Tool	Description
`new_tab` / `list_tabs` / `switch_tab` / `close_tab` / `get_active_tab`	Full tab lifecycle

Debugging (5 tools)

Tool	Description
`get_debug_view` / `clear_debug_view` / `export_debug_logs` / `get_debug_lock_status`	Debug system
`validate_browser_environment_tool`	Diagnose platform and browser issues

Modular Architecture

Load only what you need:

# Full suite (84 tools)
python src/server.py

# Core only — browser + element interaction
python src/server.py --minimal

# Disable specific sections
python src/server.py --disable-cdp-functions --disable-progressive-cloning

# List all sections
python src/server.py --list-sections

Sections

Section	Tools	Description
`browser-management`	10	Core browser ops, accessibility, PDF
`element-interaction`	11	Click, type, scroll, screenshot, markdown
`element-extraction`	8	Deep element cloning with save_to_file
`network-debugging`	12	Network monitoring, HAR export
`cdp-functions`	12	Raw CDP access
`progressive-cloning`	10	Lazy element expansion
`cookies-storage`	3	Cookie management
`tabs`	5	Tab management
`debugging`	5	Debug tools
`dynamic-hooks`	7	Network hook system
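
One way to picture how the flags map onto the sections in the table above is an `argparse` parser that generates one `--disable-<section>` switch per section. This is a sketch of the idea, not the server's actual argument handling; `SECTIONS`, `CORE_SECTIONS`, and `enabled_sections` are hypothetical, and treating `--minimal` as "browser-management plus element-interaction" is inferred from the "Core only" comment above.

```python
import argparse

SECTIONS = [
    "browser-management", "element-interaction", "element-extraction",
    "network-debugging", "cdp-functions", "progressive-cloning",
    "cookies-storage", "tabs", "debugging", "dynamic-hooks",
]
# Assumed meaning of --minimal: browser + element interaction only.
CORE_SECTIONS = {"browser-management", "element-interaction"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="server.py")
    parser.add_argument("--minimal", action="store_true")
    parser.add_argument("--list-sections", action="store_true")
    for section in SECTIONS:
        # argparse stores --disable-cdp-functions as disable_cdp_functions.
        parser.add_argument(f"--disable-{section}", action="store_true")
    return parser


def enabled_sections(argv: list) -> list:
    args = build_parser().parse_args(argv)
    if args.minimal:
        return [s for s in SECTIONS if s in CORE_SECTIONS]
    return [
        s for s in SECTIONS
        if not getattr(args, "disable_" + s.replace("-", "_"))
    ]


print(enabled_sections(["--minimal"]))
print(enabled_sections(["--disable-cdp-functions", "--disable-progressive-cloning"]))
```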

Environment Variables

Variable	Default	Description
`SCRAPELAB_IDLE_TIMEOUT`	`5`	Minutes before idle browser instances are auto-closed
`PORT`	`8000`	Port for HTTP/SSE transport
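
As a rough illustration of how these two variables could be consumed, the sketch below reads them with the defaults from the table and applies the idle timeout to a last-used timestamp. The function names and the settings dictionary are hypothetical; only the variable names and default values come from this README.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping] = None) -> dict:
    """Read settings from the environment, falling back to documented defaults."""
    env = os.environ if env is None else env
    return {
        "idle_timeout_minutes": float(env.get("SCRAPELAB_IDLE_TIMEOUT", "5")),
        "port": int(env.get("PORT", "8000")),
    }


def is_idle(last_used: float, now: float, idle_timeout_minutes: float) -> bool:
    """True when an instance has been unused for at least the timeout."""
    return (now - last_used) >= idle_timeout_minutes * 60


settings = load_settings({"SCRAPELAB_IDLE_TIMEOUT": "10"})
print(settings)
# Last used 11 minutes ago with a 10-minute timeout: eligible for cleanup.
print(is_idle(last_used=0.0, now=11 * 60, idle_timeout_minutes=settings["idle_timeout_minutes"]))
```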

Troubleshooting

No compatible browser found — Install Chrome, Chromium, or Edge. Run validate_browser_environment_tool() to diagnose.

Too many tools for your use case — Use --minimal or --disable-<section>.

Browser instances piling up — Instances auto-close after 5 minutes of inactivity (configurable via SCRAPELAB_IDLE_TIMEOUT).

License

MIT — see LICENSE.

Built by Edoardo Nardi
Stealth engine powered by nodriver
