Browser-Debugger
Enables AI agents to automate and debug real Chromium browsers with capabilities like screenshots, video recording, performance analysis, visual regression testing, and OCR text extraction.
MCP Browser Debugger (Python)
Enterprise-grade Model Context Protocol (MCP) server enabling AI agents to perform real browser automation, debugging, visual testing, performance analysis, HAR export, Playwright trace replay, video recording, and OCR-based text extraction on real web applications.
Built with Playwright (Python) and FastMCP, this server allows AI clients to act as autonomous QA engineers operating on real Chromium browsers, not mock environments.
Overview
The MCP Browser Debugger is a stateful MCP server that bridges AI reasoning with real UI behavior.
It enables AI tools (Cursor, VS Code, Windsurf, etc.) to:
- Control a real Chrome browser (non-headless)
- Automatically attach to running localhost servers
- Interact with UI without fragile selectors
- Collect screenshots, videos, HAR files, logs, and traces
- Perform visual regression testing
- Extract visible text via OCR
What This Project Is / Is Not
This project IS
- A long-running MCP server
- A browser automation backend for AI agents
- A QA / debugging system for real web apps
- A stateful, artifact-producing tool
This project is NOT
- A Playwright test framework replacement
- A headless-only automation tool
- A stateless CLI utility
- A mock or simulated browser
Architecture
AI Client (Cursor / VS Code / Windsurf)
│
▼
Model Context Protocol (MCP)
│
▼
FastMCP Server (Python)
│
▼
Playwright (Chromium, real browser)
│
▼
Artifacts (Screenshots, Video, HAR, Trace, OCR)
Server Identity (Important)
| Purpose | Name |
|---|---|
| MCP config name | Browser-Debugger |
| FastMCP internal name | Browser_Debugger |
Warning: These names are intentionally different. Do not rename either.
Core Capabilities
Browser Automation
- Launches real Chrome (channel="chrome", non-headless)
- Auto-detects running localhost servers
- Automatically attaches to the most relevant server
- Creates session folders per project
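
The README does not say how localhost detection works. As a rough illustration only, the sketch below probes a hypothetical list of common dev-server ports with plain TCP connects and builds a URL from the first one that answers. The port list, function names, and preference order are all assumptions; since psutil appears in the requirements, the real server may inspect running processes instead.

```python
import socket

# Hypothetical list of common dev-server ports, in preference order.
# The real server may probe a different set or use psutil instead.
COMMON_DEV_PORTS = (3000, 5173, 8000, 8080, 4200, 5000)


def find_listening_ports(ports=COMMON_DEV_PORTS, host="127.0.0.1", timeout=0.2):
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


def pick_target_url(ports, host="localhost"):
    """Build a URL from the first open port, or return None if there is none."""
    return f"http://{host}:{ports[0]}" if ports else None
```

A caller could pass the result of `find_listening_ports()` to `pick_target_url()` and hand the URL to `browser_launch(url=...)`.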
Universal UI Interaction
- Intelligent form filling
- Button detection without selectors
- Semantic button scoring (submit/login/search)
- Keyboard fallback (Enter)
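
To make "semantic button scoring" concrete, here is a minimal sketch of one way such scoring could work. The keyword weights, function names, and the rule that a zero score triggers the Enter-key fallback are illustrative assumptions, not the server's actual logic.

```python
# Hypothetical keyword weights; the README only names submit/login/search
# as examples of actions the scorer recognizes.
KEYWORD_WEIGHTS = {"submit": 3, "login": 3, "log in": 3, "sign in": 3, "search": 2}


def score_button(label, wanted=None):
    """Score a visible button label.

    A match with the requested name dominates; known action keywords
    add a smaller bonus.
    """
    text = label.strip().lower()
    score = 0
    if wanted:
        target = wanted.strip().lower()
        if text == target:
            score += 10
        elif target in text:
            score += 5
    for keyword, weight in KEYWORD_WEIGHTS.items():
        if keyword in text:
            score += weight
    return score


def pick_button(labels, wanted=None):
    """Return the best-scoring label, or None to signal the Enter-key fallback."""
    if not labels:
        return None
    best = max(labels, key=lambda label: score_button(label, wanted))
    return best if score_button(best, wanted) > 0 else None
```

With this scheme, `pick_button(["Cancel", "Login"], wanted="Login")` selects "Login", while a page whose buttons match nothing returns `None`, leaving the caller to press Enter.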
Debugging & Diagnostics
- Console log capture
- Network request logging
- DOM inspection
- HAR export
Performance Analysis
- LCP (Largest Contentful Paint)
- FCP (First Contentful Paint)
- CLS (Cumulative Layout Shift)
- Load (Navigation Timing based)
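
The metrics above are easier to act on when compared against Google's published Web Vitals thresholds (LCP: 2.5 s / 4 s, FCP: 1.8 s / 3 s, CLS: 0.1 / 0.25). The sketch below rates a metrics dictionary against them. The dictionary shape is an assumption, since the README does not document what performance_get_metrics() returns.

```python
# (good_upper_bound, poor_lower_bound) per metric, from Google's published
# Web Vitals thresholds. LCP and FCP are in milliseconds; CLS is unitless.
THRESHOLDS = {
    "LCP": (2500, 4000),
    "FCP": (1800, 3000),
    "CLS": (0.1, 0.25),
}


def rate_metric(name, value):
    """Return "good", "needs improvement", or "poor" for one metric."""
    good, poor = THRESHOLDS[name]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


def rate_all(metrics):
    """Rate every known metric; keys without a threshold (e.g. load time) are skipped."""
    return {
        name: rate_metric(name, value)
        for name, value in metrics.items()
        if name in THRESHOLDS
    }
```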
Visual Testing
- Full-page screenshots
- Element screenshots
- Baseline creation
- Visual diff with severity classification
- Region-based comparison
- Multi-page regression testing
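
As an illustration of "visual diff with severity classification", the sketch below computes the fraction of changed pixels between two same-sized images and maps it to a label. It works on plain nested lists so it runs without dependencies; the actual server presumably uses Pillow and numpy (both are in the requirements). The tolerance and severity cut-offs are made-up values, not the project's.

```python
def diff_ratio(baseline, current, tolerance=10):
    """Fraction of pixels whose largest channel difference exceeds `tolerance`.

    Both images are lists of rows; each pixel is an (R, G, B) tuple.
    """
    if len(baseline) != len(current) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(baseline, current)
    ):
        raise ValueError("images must have the same dimensions")
    total = changed = 0
    for row_a, row_b in zip(baseline, current):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            if max(abs(a - b) for a, b in zip(px_a, px_b)) > tolerance:
                changed += 1
    return changed / total if total else 0.0


def classify_severity(ratio):
    """Map a changed-pixel ratio to a label. The cut-offs are illustrative only."""
    if ratio == 0:
        return "identical"
    if ratio < 0.01:
        return "minor"
    if ratio < 0.10:
        return "moderate"
    return "major"
```

Region-based comparison could reuse the same functions by slicing both images to the region of interest before calling `diff_ratio`.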
Video Recording
- Start/stop mid-session
- .webm export
- Cursor & click overlay
Playwright Tracing
- Enabled by default
- Captures actions, DOM, network, screenshots, timeline
- Integrated trace viewer support
Warning: Traces are saved only after browser_close() is called.
OCR (Optical Character Recognition)
- Extracts visible text from page or elements
- Confidence scoring
- Works with canvas, images, popups
- Powered by Tesseract OCR
OCR tools exist without Tesseract installed but will return a guided error until the binary is installed.
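
A "guided error" of this kind can be sketched with a PATH lookup before any OCR call. The function name and the shape of the returned dictionary are assumptions for illustration; the install commands in the hint are the ones given in the installation section.

```python
import shutil


def check_tesseract(binary="tesseract"):
    """Return a status dict; when the binary is missing, include install hints."""
    path = shutil.which(binary)
    if path:
        return {"available": True, "path": path}
    return {
        "available": False,
        "error": f"'{binary}' was not found on PATH.",
        "hint": (
            "Install Tesseract (macOS: brew install tesseract; "
            "Debian/Ubuntu: sudo apt-get install tesseract-ocr) "
            "and restart the server."
        ),
    }
```

An OCR tool could call `check_tesseract()` first and return the dictionary as-is when `available` is false, so the AI client sees actionable instructions rather than a stack trace.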
Requirements
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.8+ | Required |
| pip | Latest | Auto-upgraded |
| fastmcp | < 3.0 | Installed automatically |
| playwright | ≥ 1.40 | Installed automatically |
| psutil | ≥ 5.9 | Installed automatically |
| Pillow | ≥ 9.0 | Installed automatically |
| numpy | ≥ 1.21 | Installed automatically |
| pytesseract | ≥ 0.3 | Installed automatically |
| Tesseract OCR | 5.x | Required only for OCR |
| Node.js / npm | Latest | For trace viewer |
| Git | Latest | Recommended |
Installation
Windows (Recommended)
Run in Command Prompt or PowerShell
git clone https://github.com/Selvadinesh-giga/MCP-based-Browser-Debug-Server.git
cd MCP-based-Browser-Debug-Server
python install.py
What the installer does
- Validates Python version
- Creates a virtual environment
- Installs dependencies
- Installs Playwright Chromium
- Detects Tesseract OCR (optional)
- Registers MCP server for:
  - Cursor
  - VS Code
  - Windsurf
- Writes MCP config with absolute paths
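
For readers who want to see what "writes MCP config with absolute paths" might involve, here is a hedged sketch that builds a server entry pointing at the virtual environment's interpreter and merges it into a JSON config file. It assumes the `mcpServers` layout that Cursor uses; VS Code and Windsurf may expect a different file or key. The script name `server.py` is a placeholder, not taken from the repository.

```python
import json
from pathlib import Path


def build_mcp_entry(project_dir, server_script="server.py"):
    """Build one server entry that references the venv interpreter by absolute path.

    `server_script` is a placeholder name, not the project's real entry point.
    """
    root = Path(project_dir).resolve()
    # Virtual environments keep the interpreter in Scripts/ on Windows, bin/ elsewhere.
    windows_python = root / "venv" / "Scripts" / "python.exe"
    python = windows_python if windows_python.exists() else root / "venv" / "bin" / "python"
    return {"command": str(python), "args": [str(root / server_script)]}


def merge_into_config(config_path, name, entry):
    """Add or replace `name` under "mcpServers", keeping other servers intact."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    path.write_text(json.dumps(config, indent=2))
    return config
```

Absolute paths matter here because the IDE starts the server from its own working directory, so a relative path to the interpreter or script would not resolve.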
Optional: Install Tesseract OCR (Windows)
Required only for OCR features.
https://github.com/tesseract-ocr/tesseract/releases/download/5.5.0/tesseract-ocr-w64-setup-5.5.0.20241111.exe
Verify:
tesseract --version
macOS / Linux
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
python -m playwright install chromium
Install Tesseract if OCR is needed:
brew install tesseract # macOS
sudo apt-get install tesseract-ocr # Linux
Usage (AI-Side)
Launch Browser
browser_launch()
browser_launch(url="http://localhost:3000")
UI Interaction
app_interact(["admin@test.com", "password"], button="Login")
app_interact(["laptop"])
app_interact(button="Search")
Debugging
debug_read_logs()
dom_get_source()
dom_inspect_element(selector="#submit")
Performance
performance_get_metrics()
Screenshots & Video
media_take_screenshot()
media_record_video("start")
media_record_video("stop")
OCR
media_read_text_ocr()
media_read_text_ocr(selector=".error")
Visual Regression
visual_diff(action="create_baseline", name="homepage")
visual_diff(action="compare", baseline_path="baselines/homepage.png")
Trace Viewer
trace_viewer(action="open")
Close Session (Required)
browser_close()
Session Artifacts
.mcp_sessions/
└── Chrome_Session_YYYY_MM_DD_HH_MM_SS/
    ├── screenshot_*.png
    ├── baselines/
    ├── regression_<name>_<timestamp>/
    ├── full_recording.webm
    ├── network.har
    ├── trace.zip
    └── session.log
- Screenshots are stored directly in the session folder
- .mcp_sessions/.gitignore is created automatically
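
The layout above could be produced by something like the following sketch, which creates a timestamped session folder and a `.gitignore` beside it. The function name and the `.gitignore` contents (a single `*`) are assumptions; the README only states that the file is created.

```python
from datetime import datetime
from pathlib import Path


def create_session_dir(project_root, now=None):
    """Create .mcp_sessions/Chrome_Session_<timestamp>/ and return its path."""
    now = now or datetime.now()
    sessions_root = Path(project_root) / ".mcp_sessions"
    session_dir = sessions_root / now.strftime("Chrome_Session_%Y_%m_%d_%H_%M_%S")
    (session_dir / "baselines").mkdir(parents=True, exist_ok=True)

    # Ignore everything under .mcp_sessions/ so artifacts stay out of version control.
    gitignore = sessions_root / ".gitignore"
    if not gitignore.exists():
        gitignore.write_text("*\n")
    return session_dir
```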
Uninstall & Cleanup
python uninstall.py
Options:
- Remove MCP config only
- Remove config + virtual environment
- Full cleanup (Playwright cache, Tesseract)
Troubleshooting
- Restart IDE after installation
- Always call browser_close()
- Use absolute paths only
- OCR requires Tesseract binary
- Avoid running multiple sessions simultaneously
Use Cases
- AI-driven QA automation
- Bug reproduction & reporting
- Visual regression testing
- Performance audits
- Accessibility validation
- DevTools automation
Roadmap
- Multi-tab automation
- Headless CI execution
- AI-generated bug reports
- Cloud artifact sync
- Mobile browser testing
Acknowledgments
- FastMCP
- Playwright
- Tesseract OCR
- Model Context Protocol community