Stealth Browser MCP Server

Provides stealth web browsing using dual browser engines (Chromium and Firefox) with automatic bot-detection bypass, enabling AI agents to browse, interact, and extract content from websites without being blocked.

A Model Context Protocol (MCP) server that provides stealth web browsing capabilities using dual browser engines — Patchright (Chromium) and Camoufox (Firefox) — with automatic bot-detection bypass.

Built for use with Claude Code and other MCP-compatible AI agents.

Features

Dual Engine Architecture — Patchright (Chromium) as primary engine, Camoufox (Firefox) as fallback with stronger anti-fingerprinting
Auto Bot-Block Detection — Detects Cloudflare, CAPTCHAs, and other bot protection; automatically retries with Firefox when engine: auto
Headed Mode via Xvfb — Runs real browser windows (not headless) to beat fingerprint detection
18 MCP Tools — Browse, interact, extract, scrape, crawl, structured data extraction, session management, persistent profile state save/load/list/delete, X/Twitter search extraction helpers, heuristic topic research summaries, thread readers, deep topic research, and saved report bundles
3-Tier Content Extraction — trafilatura → readability → innertext fallback chain
SSRF-Hardened — DNS resolution validation blocks localhost, private IPs, cloud metadata, file://
Session Pooling — Up to 5 isolated BrowserContext sessions per engine, with 10-minute idle eviction
Smart Truncation — Large pages truncated at 50K chars on paragraph boundaries
CAPTCHA Detection — Detects Cloudflare Turnstile, reCAPTCHA, hCaptcha; reports structured captcha_detected flag
Auto-Cleanup — Idle sessions evicted after 10 minutes, crashed browser auto-restarts
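
The Smart Truncation item above can be pictured with a short sketch. Only the 50,000-character limit comes from the feature list; the helper name `truncate_on_paragraph` and the use of blank lines as paragraph separators are assumptions made for illustration, and the server's real logic may differ.

```python
def truncate_on_paragraph(text: str, limit: int = 50_000) -> tuple[str, bool]:
    """Return (text, truncated), cutting at the last paragraph break before limit.

    Hypothetical helper, not the server's actual implementation.
    """
    if len(text) <= limit:
        return text, False
    head = text[:limit]
    # Assumption: paragraphs are separated by a blank line.
    cut = head.rfind("\n\n")
    if cut <= 0:
        # No paragraph boundary inside the window: fall back to a hard cut.
        return head, True
    return head[:cut], True
```

The boolean mirrors the `truncated` field that `browse` and `extract` report.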

Tools

`browse`

Navigate to a URL and return page content as clean markdown.

Parameter	Type	Required	Description
`url`	string	yes	URL to navigate to (http/https only)
`session_id`	string	no	Reuse an existing session. If omitted, creates a new one
`wait_for`	string	no	CSS selector to wait for before extracting
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: url, title, content, session_id, truncated, captcha_detected, extraction_method, timing_ms, status_code, engine
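
For clients that talk to the server directly over stdio, a `browse` call is an MCP `tools/call` request. The sketch below only builds the JSON-RPC message; the helper name `browse_request` is hypothetical, and the parameter names are the ones from the table above.

```python
import json


def browse_request(url: str, request_id: int = 1, **optional: str) -> str:
    """Build a JSON-RPC 2.0 `tools/call` message for the `browse` tool."""
    allowed = {"session_id", "wait_for", "engine"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unsupported browse parameters: {sorted(unknown)}")
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "browse", "arguments": {"url": url, **optional}},
    }
    return json.dumps(message)


print(browse_request("https://example.com", engine="firefox"))
```

MCP-aware clients such as Claude Code build this message themselves, so the sketch is mainly useful for debugging or custom integrations.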

`interact`

Interact with the current page in a session.

Parameter	Type	Required	Description
`session_id`	string	yes	Session from a previous `browse` call
`action`	string	yes	One of: `click`, `type`, `select`, `hover`, `scroll`
`selector`	string	yes	CSS selector for the target element
`value`	string	no	Required for `type` and `select`. For `scroll`, pixel amount

Returns: success, session_id, action_performed, page_url, timing_ms

`extract`

Re-extract content from the current page without re-navigating. Use this instead of browse when you're already on the page.

Parameter	Type	Required	Description
`session_id`	string	yes	Session to extract from
`mode`	string	no	`auto` (default), `article`, or `text` (raw innertext)

Returns: content, session_id, url, extraction_method, truncated

`close_session`

Close a browser session and free its resources.

Parameter	Type	Required	Description
`session_id`	string	yes	Session to close

Returns: status, session_id

`save_session_state`

Persist an active session's cookies and local storage to a named profile.

Parameter	Type	Required	Description
`session_id`	string	yes	Session to persist
`profile_name`	string	yes	Safe profile name to save under

Returns: status, session_id, profile_name, storage_state_path, meta

`load_session_state`

Create a new session from a previously saved profile.

Parameter	Type	Required	Description
`profile_name`	string	yes	Saved profile name
`session_id`	string	no	Optional custom session ID
`engine`	string	no	`chromium` (default) or `firefox`

Returns: status, session_id, profile_name, engine, meta

`list_saved_profiles`

List saved persistent profiles on disk.

Returns: profiles, count

`delete_saved_profile`

Delete a saved profile from disk.

Parameter	Type	Required	Description
`profile_name`	string	yes	Saved profile name

Returns: status, profile_name

`search_x`

Open an X search results page for a query and return structured tweet cards.

Parameter	Type	Required	Description
`query`	string	yes	Search query
`mode`	string	no	`latest` (default) or `top`
`max_items`	int	no	Max tweets to extract (1-50, default 20)
`scroll_rounds`	int	no	Additional scroll/collect rounds (0-10, default 0)
`session_id`	string	no	Reuse an existing session
`profile_name`	string	no	Load a persisted login profile into a fresh session
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: query, mode, search_url, session_id, tweets, extracted_count, scroll_rounds_completed, captcha_detected, engine

`extract_x_search_results`

Extract structured tweet cards from the current page of an existing X search session.

Parameter	Type	Required	Description
`session_id`	string	yes	Active session already on an X search page
`max_items`	int	no	Max tweets to extract (1-50, default 20)

Returns: session_id, tweets, extracted_count, page_url, page_title

`research_x_topic`

Run X search and produce a lightweight heuristic topic summary from the extracted tweets.

Parameter	Type	Required	Description
`query`	string	yes	Search query
`mode`	string	no	`latest` (default) or `top`
`max_items`	int	no	Max tweets to extract (1-50, default 20)
`scroll_rounds`	int	no	Additional scroll/collect rounds (0-10, default 0)
`session_id`	string	no	Reuse an existing session
`profile_name`	string	no	Load a persisted login profile into a fresh session
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: everything from search_x plus research, normalized, and report_markdown

`read_x_thread`

Open a tweet/thread URL and extract the visible main tweet plus replies from the detail page.

Parameter	Type	Required	Description
`url`	string	yes	X tweet/thread URL
`max_items`	int	no	Max visible tweets to extract (1-50, default 20)
`session_id`	string	no	Reuse an existing session
`profile_name`	string	no	Load a persisted login profile into a fresh session
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: main_tweet, replies, reply_count_extracted, and page metadata

`research_x_topic_deep`

Run X search, pick a few high-signal tweets, load their thread pages, and produce a richer deep-research summary.

Parameter	Type	Required	Description
`query`	string	yes	Search query
`mode`	string	no	`latest` (default) or `top`
`max_items`	int	no	Max search tweets to collect
`scroll_rounds`	int	no	Additional search scroll rounds
`deep_dive_count`	int	no	Number of thread URLs to inspect (default 3)
`thread_items`	int	no	Max tweets to extract per thread
`session_id`	string	no	Reuse an existing session
`profile_name`	string	no	Load a persisted login profile into a fresh session
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: deep_dive_candidates, threads, deep_research, normalized, and report_markdown in addition to the base search output

`save_x_research_report`

Run topic research (normal or deep) and save JSON + markdown report bundle to disk.

Parameter	Type	Required	Description
`query`	string	yes	Search query
`deep`	bool	no	If true, use deep research workflow
`mode`	string	no	`latest` (default) or `top`
`max_items`	int	no	Max tweets to collect
`scroll_rounds`	int	no	Additional search scroll rounds
`deep_dive_count`	int	no	Thread deep-dive count
`thread_items`	int	no	Max tweets per thread
`session_id`	string	no	Reuse an existing session
`profile_name`	string	no	Load a persisted login profile into a fresh session
`engine`	string	no	`auto` (default), `chromium`, or `firefox`
`report_name`	string	no	Optional custom output name

Returns: research output plus saved_report paths

`list_saved_x_reports`

List saved research report bundles from disk.

Returns: reports, count

`scrape_webpage`

Navigate to a URL, extract content in the requested format, and auto-close the session.

Parameter	Type	Required	Description
`url`	string	yes	URL to scrape (http/https only)
`output_format`	string	no	`markdown` (default), `text`, `html`, or `links`
`session_id`	string	no	Reuse session. If omitted, creates ephemeral session that auto-closes
`wait_for`	string	no	CSS selector to wait for before extracting
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: url, title, content, session_id, status_code, timing_ms, extraction_method, engine

`extract_structured_data`

Extract structured DOM data (metadata, links, tables, JSON-LD, etc.) from a webpage.

Parameter	Type	Required	Description
`url`	string	yes	URL to extract from (http/https only)
`session_id`	string	no	Reuse session. If omitted, creates ephemeral session
`include`	list	no	Sections to include. Default: all. Options: `metadata`, `og_tags`, `json_ld`, `headings`, `links`, `tables`, `forms`
`wait_for`	string	no	CSS selector to wait for before extracting
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: url, title, session_id, timing_ms, engine, + requested data sections

`crawl_pages`

Crawl multiple pages via BFS starting from a URL.

Parameter	Type	Required	Description
`url`	string	yes	Starting URL (http/https only)
`max_pages`	int	no	Maximum pages to crawl (1-20, default 5)
`link_pattern`	string	no	Regex to filter link hrefs
`output_format`	string	no	`markdown` (default), `text`, `html`, or `links`
`same_domain`	bool	no	Only follow same-domain links (default: true)
`engine`	string	no	`auto` (default), `chromium`, or `firefox`

Returns: pages (list of {url, title, content, status_code}), total_pages, total_timing_ms, engine

Installation

Prerequisites

System libraries (Ubuntu/Debian/WSL2):

sudo apt-get install -y libnspr4 libnss3 libatk1.0-0 libatk-bridge2.0-0 \
  libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxrandr2 libgbm1 \
  libpango-1.0-0 libcairo2 libasound2t64 xvfb

Python 3.12+ and uv (recommended) or pip.

Setup

git clone https://github.com/Axe240-commits/stealth-browser-mcp.git
cd stealth-browser-mcp
chmod +x setup.sh
./setup.sh

Or manually:

uv venv
uv pip install -e ".[dev]"
.venv/bin/python -m patchright install chromium

Verify

# Run tests
.venv/bin/python -m pytest tests/ -v

# Start server (will wait for MCP stdio input)
.venv/bin/python -m stealth_browser

Register with Claude Code

Add to ~/.claude/mcp_servers.json:

{
  "stealth-browser": {
    "type": "stdio",
    "command": "/path/to/stealth-browser-mcp/.venv/bin/python",
    "args": ["-m", "stealth_browser"]
  }
}

Then add permissions in ~/.claude/settings.json:

{
  "permissions": {
    "allow": [
      "mcp__stealth-browser__browse",
      "mcp__stealth-browser__interact",
      "mcp__stealth-browser__extract",
      "mcp__stealth-browser__close_session",
      "mcp__stealth-browser__save_session_state",
      "mcp__stealth-browser__load_session_state",
      "mcp__stealth-browser__list_saved_profiles",
      "mcp__stealth-browser__delete_saved_profile",
      "mcp__stealth-browser__search_x",
      "mcp__stealth-browser__extract_x_search_results",
      "mcp__stealth-browser__research_x_topic",
      "mcp__stealth-browser__read_x_thread",
      "mcp__stealth-browser__research_x_topic_deep",
      "mcp__stealth-browser__save_x_research_report",
      "mcp__stealth-browser__list_saved_x_reports",
      "mcp__stealth-browser__scrape_webpage",
      "mcp__stealth-browser__extract_structured_data",
      "mcp__stealth-browser__crawl_pages"
    ]
  }
}

Restart Claude Code. The tools will be available immediately.
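
The permission entries in settings.json follow the pattern `mcp__<server name>__<tool name>`. To avoid typing all 18 by hand, a small script can generate them; the helper name `permission_entries` is hypothetical and the tool list is copied from the Tools section.

```python
import json

TOOLS = [
    "browse", "interact", "extract", "close_session",
    "save_session_state", "load_session_state",
    "list_saved_profiles", "delete_saved_profile",
    "search_x", "extract_x_search_results", "research_x_topic",
    "read_x_thread", "research_x_topic_deep",
    "save_x_research_report", "list_saved_x_reports",
    "scrape_webpage", "extract_structured_data", "crawl_pages",
]


def permission_entries(server_name: str = "stealth-browser") -> list[str]:
    """Return one `mcp__<server>__<tool>` permission string per tool."""
    return [f"mcp__{server_name}__{tool}" for tool in TOOLS]


# Print a fragment that can be pasted into ~/.claude/settings.json.
print(json.dumps({"permissions": {"allow": permission_entries()}}, indent=2))
```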

Architecture

┌─────────────────────────────────────────────────┐
│  Claude Code / MCP Client                       │
│                                                 │
│  browse ─ interact ─ extract ─ close_session    │
│  scrape_webpage ─ extract_structured_data       │
│  crawl_pages                                    │
└────────────────┬────────────────────────────────┘
                 │ stdio (JSON-RPC)
┌────────────────▼────────────────────────────────┐
│  server.py — FastMCP Server (18 tools)          │
│  ├── security.py — SSRF validation (every URL)  │
│  ├── session.py — per-session lock + state      │
│  ├── browser_manager.py — dual engine pool      │
│  ├── extractor.py — 3-tier content extraction   │
│  ├── dom_extractor.py — structured DOM data     │
│  └── config.py — configuration                  │
└───────┬─────────────────┬───────────────────────┘
        │                 │
┌───────▼──────┐  ┌───────▼──────┐
│  Patchright   │  │  Camoufox    │
│  (Chromium)   │  │  (Firefox)   │
│  Primary      │  │  Fallback    │
└───────┬──────┘  └───────┬──────┘
        │                 │
┌───────▼─────────────────▼───────────────────────┐
│  Xvfb :99 — 1920x1080 (headed mode)            │
└─────────────────────────────────────────────────┘

Dual Engine & Auto-Fallback

With engine: auto (the default), every request:

Tries Patchright (Chromium) first — fast, low overhead
Checks for bot-block signals: HTTP 403, title keywords ("Just a moment", "Attention Required"), empty content
If blocked, automatically retries with Camoufox (Firefox) which has stronger anti-fingerprinting

For crawl_pages, the engine switch happens on the first page and sticks for the rest of the crawl.
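
The auto-fallback steps above can be sketched as follows. The names `PageResult`, `looks_blocked`, and `fetch_with_fallback` are hypothetical, and the injected `fetch` callable stands in for the real browser engines. Only the three block signals (HTTP 403, the two title keywords, empty content) come from the description.

```python
from dataclasses import dataclass
from typing import Callable

BLOCK_TITLE_KEYWORDS = ("just a moment", "attention required")


@dataclass
class PageResult:
    status_code: int
    title: str
    content: str
    engine: str


def looks_blocked(result: PageResult) -> bool:
    """Apply the documented bot-block signals to one page load."""
    if result.status_code == 403:
        return True
    title = result.title.lower()
    if any(keyword in title for keyword in BLOCK_TITLE_KEYWORDS):
        return True
    return not result.content.strip()  # empty content counts as blocked


def fetch_with_fallback(url: str,
                        fetch: Callable[[str, str], PageResult],
                        engine: str = "auto") -> PageResult:
    """Try Chromium first; retry once with Firefox if the page looks blocked."""
    if engine != "auto":
        return fetch(url, engine)  # explicit engine: no fallback
    result = fetch(url, "chromium")
    if looks_blocked(result):
        result = fetch(url, "firefox")
    return result
```

If Firefox is blocked as well, the sketch returns that second result unchanged, so the caller still sees the status code and title it got.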

Content Extraction Pipeline

trafilatura (best for articles, tables, links)
    ↓ fallback if < 200 chars
readability-lxml + html2text (complex HTML)
    ↓ fallback if < 200 chars
page.inner_text('body') (SPAs, JS-rendered content)
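
The fallback chain above reduces to a loop over extractors with a 200-character threshold. The sketch below takes the tiers as plain callables so it runs without trafilatura or readability installed; `extract_with_fallback` is a hypothetical name, and returning the last tier's output unconditionally is an assumption based on the diagram.

```python
from typing import Callable, Optional, Sequence

MIN_CHARS = 200  # threshold from the pipeline diagram

Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(html: str,
                          tiers: Sequence[tuple[str, Extractor]],
                          min_chars: int = MIN_CHARS) -> tuple[str, str]:
    """Return (method_name, text) from the first tier that yields enough text.

    If no tier reaches min_chars, the last tier's output is returned as-is.
    """
    name, text = "", ""
    for name, extractor in tiers:
        text = (extractor(html) or "").strip()
        if len(text) >= min_chars:
            break
    return name, text
```

The returned name plays the role of the `extraction_method` field in the tool responses.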

Session Management

Two persistent browsers (Chromium and Firefox) are launched when the MCP server starts
Each browse() call with no session_id creates a new BrowserContext (~100ms)
Sessions are isolated (separate cookies, storage, state)
Max 5 concurrent sessions, oldest evicted if at capacity
Idle sessions evicted after 10 minutes
All operations per session are serialized via asyncio.Lock
Each session tracks its engine type (chromium or firefox)
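
The capacity and idle-eviction rules above can be modelled with an ordered mapping. This is a bookkeeping sketch only: `SessionPool` is a hypothetical class, it tracks timestamps rather than real BrowserContext objects, and it assumes "oldest" means least recently used.

```python
import time
from collections import OrderedDict
from typing import Callable


class SessionPool:
    """Track session ids with a capacity limit and an idle timeout."""

    def __init__(self, max_sessions: int = 5, idle_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_sessions = max_sessions
        self.idle_seconds = idle_seconds
        self._clock = clock
        self._last_used: OrderedDict[str, float] = OrderedDict()

    def evict_idle(self) -> list[str]:
        """Drop sessions idle for at least idle_seconds; return their ids."""
        now = self._clock()
        stale = [sid for sid, last in self._last_used.items()
                 if now - last >= self.idle_seconds]
        for sid in stale:
            del self._last_used[sid]
        return stale

    def touch(self, session_id: str) -> list[str]:
        """Create or refresh a session; return the ids evicted along the way."""
        evicted = self.evict_idle()
        if session_id in self._last_used:
            self._last_used.move_to_end(session_id)
        elif len(self._last_used) >= self.max_sessions:
            oldest, _ = self._last_used.popitem(last=False)
            evicted.append(oldest)
        self._last_used[session_id] = self._clock()
        return evicted

    def active(self) -> list[str]:
        return list(self._last_used)
```

A real pool would also close the evicted BrowserContext and hold a per-session `asyncio.Lock`; both are left out here.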

Security (SSRF Protection)

Every URL is validated before navigation:

Scheme check — only http and https allowed
DNS resolution — hostname resolved to actual IPs
IP validation — all resolved IPs checked against private/reserved ranges
Redirect validation — redirects re-validated at each hop

Blocked:

localhost, 127.0.0.1, ::1
Private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
Cloud metadata (169.254.169.254)
Link-local, multicast, reserved IPs
Non-HTTP schemes such as file://, data:, javascript:, and ftp://
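
The scheme, DNS, and IP checks above can be sketched with the standard library. The function names are hypothetical and this is not the code in security.py. The resolver is injectable so the checks can be exercised without network access. Redirect re-validation is not shown: it would call the same function on every hop.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlparse


def resolve_ips(hostname: str) -> list[str]:
    """Resolve a hostname to the set of IP addresses it maps to."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def is_public_ip(ip: str) -> bool:
    """Reject loopback, private, link-local, multicast, and reserved addresses."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped  # check the embedded IPv4 address instead
    return not (addr.is_private or addr.is_loopback or addr.is_link_local
                or addr.is_multicast or addr.is_reserved or addr.is_unspecified)


def validate_url(url: str,
                 resolver: Callable[[str], list[str]] = resolve_ips) -> list[str]:
    """Return the resolved IPs if the URL is safe to visit, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"scheme not allowed: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no hostname")
    ips = resolver(parsed.hostname)
    if not ips:
        raise ValueError(f"could not resolve {parsed.hostname}")
    for ip in ips:  # every resolved address must pass, not just the first
        if not is_public_ip(ip):
            raise ValueError(f"blocked address {ip} for {parsed.hostname}")
    return ips
```

Checking all resolved addresses matters because a hostname can map to one public and one internal address at the same time.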

Usage Tips for AI Agents

Use extract to re-read the same page — don't call browse again
Use browse only for actual navigation (new URL or page change)
Reuse session_id across related operations
Always call close_session when done to free resources
Use scrape_webpage for one-shot scraping (auto-closes session)
Use crawl_pages to spider multiple pages from a starting URL
Default navigation uses domcontentloaded (fast, reliable) — use wait_for if you need a specific element
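
The tips above amount to one pattern: navigate once, re-read with `extract`, and always close the session. The sketch takes the tool-calling function as a parameter, so it does not depend on any particular MCP client library; `read_then_reextract` and the `ToolCaller` type are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Any coroutine that takes (tool_name, arguments) and returns the tool's result.
ToolCaller = Callable[[str, dict[str, Any]], Awaitable[dict[str, Any]]]


async def read_then_reextract(call_tool: ToolCaller, url: str) -> dict[str, Any]:
    """Navigate once, re-read the page as raw text, and always close the session."""
    page = await call_tool("browse", {"url": url})
    session_id = page["session_id"]
    try:
        # Re-read the same page without navigating again.
        return await call_tool("extract", {"session_id": session_id, "mode": "text"})
    finally:
        await call_tool("close_session", {"session_id": session_id})
```

The `finally` block frees the session even when `extract` raises, which keeps the pool from filling up with abandoned sessions.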

Project Structure

stealth-browser-mcp/
├── pyproject.toml              # Dependencies, build config
├── setup.sh                    # One-command setup
├── src/stealth_browser/
│   ├── __init__.py
│   ├── __main__.py             # Entry: python -m stealth_browser
│   ├── server.py               # MCP server, 18 tools, lifespan
│   ├── browser_manager.py      # Dual engine lifecycle, context pool
│   ├── session.py              # Session state, locking, actions
│   ├── extractor.py            # 3-tier content extraction
│   ├── dom_extractor.py        # Structured DOM data extraction
│   ├── security.py             # SSRF-hardened URL validation
│   ├── config.py               # Configuration dataclass
│   └── proxy.py                # Stub (Phase 2: Tor)
└── tests/
    ├── test_security.py        # URL/IP validation tests
    ├── test_extractor.py       # Extraction mode/fallback tests
    ├── test_dom_extractor.py   # DOM structured data tests
    └── test_server_helpers.py  # Server helper function tests

Configuration

Defaults in config.py — no config file needed:

Setting	Default	Description
`headless`	`False`	Headed mode (Xvfb) for better stealth
`use_xvfb`	`True`	Auto-start Xvfb for headed mode
`max_sessions`	`5`	Max concurrent browser sessions
`session_timeout_minutes`	`10`	Idle session eviction timeout
`navigation_timeout_ms`	`30000`	Page load timeout
`wait_until`	`domcontentloaded`	Navigation wait strategy
`max_content_length`	`50000`	Content truncation limit (chars)
`block_media`	`True`	Block images/fonts/media for speed
`camoufox_enabled`	`True`	Enable Firefox fallback engine
`crawl_max_pages_limit`	`20`	Hard cap for crawl_pages
`crawl_per_page_max`	`10000`	Content limit per crawled page
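
The table maps naturally onto a dataclass. The sketch below mirrors the documented settings and defaults; the class name `BrowserConfig` and the use of a frozen dataclass are assumptions, and the real config.py may be laid out differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BrowserConfig:
    """Defaults copied from the configuration table (hypothetical class name)."""
    headless: bool = False
    use_xvfb: bool = True
    max_sessions: int = 5
    session_timeout_minutes: int = 10
    navigation_timeout_ms: int = 30_000
    wait_until: str = "domcontentloaded"
    max_content_length: int = 50_000
    block_media: bool = True
    camoufox_enabled: bool = True
    crawl_max_pages_limit: int = 20
    crawl_per_page_max: int = 10_000


# Derive a variant without mutating the defaults, e.g. Chromium-only mode.
chromium_only = replace(BrowserConfig(), camoufox_enabled=False)
```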

Dependencies

Package	Purpose
mcp	MCP server framework (Anthropic)
patchright	Stealth Playwright fork (Chromium)
camoufox	Anti-fingerprint Firefox (fallback engine)
trafilatura	Article/content extraction
readability-lxml	Fallback HTML extraction
html2text	HTML to markdown conversion

Troubleshooting

Browser fails to launch: `error while loading shared libraries`

Chromium needs system libraries that aren't installed by default on minimal Linux/WSL2:

error while loading shared libraries: libnspr4.so: cannot open shared object file

Solution:

sudo apt-get install -y libnspr4 libnss3 libatk1.0-0 libatk-bridge2.0-0 \
  libdrm2 libxkbcommon0 libxcomposite1 libxdamage1 libxrandr2 libgbm1 \
  libpango-1.0-0 libcairo2 libasound2t64 xvfb

Camoufox won't start

Camoufox requires xvfb for headed mode:

sudo apt-get install -y xvfb

If Camoufox still fails to start, the server falls back gracefully: Chromium-only mode still works.

MCP server not showing in Claude Code

The server must be registered in ~/.claude/mcp_servers.json:

{
  "stealth-browser": {
    "type": "stdio",
    "command": "/absolute/path/to/.venv/bin/python",
    "args": ["-m", "stealth_browser"]
  }
}

After adding, restart Claude Code — MCP servers are loaded at startup only.

Tools show "Permission denied"

Add all 18 tools to the permissions in ~/.claude/settings.json (see the Register with Claude Code section above).

Page content is empty or too short

Try extract with mode="text" for SPAs/JS-heavy pages
Add wait_for parameter with a CSS selector to wait for dynamic content
Try engine: firefox — some sites respond better to Camoufox
The default domcontentloaded doesn't wait for lazy-loaded content — pass a selector that appears after the page fully renders

Bot-blocked on both engines

If engine: auto falls back to Firefox and still gets blocked, the site may require:

A different IP/proxy (Phase 2)
Manual CAPTCHA solving
Specific cookies/authentication

Session not found

Sessions are evicted after 10 minutes of inactivity or when the 5-session limit is reached. If you get "Session 'xyz' not found", create a new one with browse.

Phase 2 (Planned)

screenshot tool — for CAPTCHA/consent debugging
evaluate_js tool — targeted DOM queries
session_info tool — list active sessions and state
Per-toolcall hard timeout guard
Proxy/Tor opt-in support

License

MIT
