Pydoll MCP Server
Browser automation MCP server using Pydoll, enabling agents to navigate, observe, and interact with web pages via tools like page navigation, element clicking, and screenshot capture.
MCP server for browser automation built on the Pydoll library.
This project offers a local alternative to the Playwright MCP Server, with its own agent-oriented API built on Pydoll. It does not copy the Playwright API. It provides a predictable layer for agents: observe pages, choose elements, act by element_id, navigate, capture screenshots, execute JavaScript with limits, and handle iframes and shadow DOM.
Status
Local alpha (v0.1.0a1). HTTP on 127.0.0.1 is the primary transport. stdio transport is available as an option (--transport stdio).
Endpoints:
- /health - Health check (no auth)
- /mcp - Streamable HTTP MCP (bearer token required)
- /sse - Server-Sent Events (bearer token required)
Requirements
- Python >=3.10
- Chrome or Chromium installed
- Pydoll >=2.23.0
On this machine, Python is available via Anaconda at:
C:\Users\Yuri\anaconda3\python.exe
Installation
C:\Users\Yuri\anaconda3\python.exe -m pip install -e ".[dev]"
For release distribution:
pip install pydoll-mcp-server
Running
Set a token before starting:
# Windows (PowerShell)
$env:PYDOLL_MCP_AUTH_TOKEN = "your-secret-token"
# Linux / macOS
export PYDOLL_MCP_AUTH_TOKEN="your-secret-token"
Start the server (HTTP, the default):
C:\Users\Yuri\anaconda3\python.exe -m pydoll_mcp_server.cli --host 127.0.0.1 --port 8765
Or via stdio:
C:\Users\Yuri\anaconda3\python.exe -m pydoll_mcp_server.cli --transport stdio
Endpoints:
- GET http://127.0.0.1:8765/health - public health check, no token
- POST http://127.0.0.1:8765/mcp/ - Streamable HTTP MCP, with bearer token
- GET http://127.0.0.1:8765/sse/ - SSE MCP, with bearer token
MCP clients must send:
Authorization: Bearer <PYDOLL_MCP_AUTH_TOKEN>
PYDOLL_MCP_ALLOW_NO_AUTH=true should only be used in isolated development.
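As a rough illustration of the header requirement above, the following Python sketch builds the headers and a JSON-RPC `tools/call` body that a client might POST to the /mcp/ endpoint. The `tools/call` method and message shape come from the MCP specification; the helper names and the `url` argument for page_goto are assumptions made for illustration, not taken from this server's tool schema. A real session also needs the MCP initialize handshake first, which is omitted here.

```python
import json
import os


def build_headers(token: str) -> dict[str, str]:
    """HTTP headers for an authenticated request to the /mcp/ endpoint."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may reply with plain JSON or an event stream.
        "Accept": "application/json, text/event-stream",
    }


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Read the same variable the server uses; the fallback is a placeholder only.
token = os.environ.get("PYDOLL_MCP_AUTH_TOKEN", "your-secret-token")
headers = build_headers(token)
# The "url" argument name is hypothetical; check the tool's input schema.
body = build_tool_call(1, "page_goto", {"url": "https://example.com"})
```

In practice an MCP client library would handle the handshake and framing; the sketch only shows where the bearer token and tool name end up in the request.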
MCP Tools
Health and diagnostics:
- health_check
- server_status
- diagnostics_snapshot
- trace_start, trace_stop, trace_get, trace_cleanup
Lifecycle:
- browser_launch
- browser_list
- browser_close
- browser_attach
- tab_list
- tab_activate
- tab_close
- tab_recover
Navigation:
- page_goto
- page_reload
- page_back
- page_forward
- page_wait
Observation:
- page_get_text
- page_get_tree
- page_get_tree_deep
- page_screenshot
Elements:
- element_find
- element_find_deep
- element_click
- element_type
- element_fill
- element_get_text
- element_get_attribute
- element_screenshot
JavaScript and advanced helpers:
- js_evaluate_readonly
- js_evaluate
- user_agent_set
- viewport_set
- cookies_get
- cookies_set
- storage_get
- storage_set
- download_expect
- upload_files
Network inspection:
- network_enable
- network_disable
- network_list
- network_get_response
Console inspection:
- console_enable, console_disable, console_list (return UNSUPPORTED)
Agent-friendly model
page_get_tree returns a compact, limited tree by default. Interactive nodes receive element_id, selector_hint, xpath_hint, actionable, and resolution_confidence. An agent can observe the tree and call element_click or element_fill directly with the element_id, without calling element_find first.
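The observe-then-act flow described above might look like the following Python sketch on the agent side. The field names element_id, actionable, and resolution_confidence come from this README; the `children` nesting key, the numeric confidence scale, and the 0.8 threshold are assumptions, since the exact tree schema is not shown here.

```python
from typing import Any, Iterator


def iter_nodes(node: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    # The "children" key is an assumed name for nested nodes.
    for child in node.get("children", []):
        yield from iter_nodes(child)


def pick_actionable(tree: dict[str, Any], min_confidence: float = 0.8) -> list[str]:
    """Return element_id values for nodes an agent could act on directly."""
    return [
        node["element_id"]
        for node in iter_nodes(tree)
        if "element_id" in node
        and node.get("actionable")
        # Treating resolution_confidence as a 0..1 float is an assumption.
        and node.get("resolution_confidence", 0.0) >= min_confidence
    ]


# Hand-written stand-in for a page_get_tree result, for illustration only.
sample_tree = {
    "tag": "body",
    "children": [
        {"tag": "button", "element_id": "e1", "actionable": True,
         "resolution_confidence": 0.95},
        {"tag": "a", "element_id": "e2", "actionable": True,
         "resolution_confidence": 0.4},
        {"tag": "div", "children": [
            {"tag": "input", "element_id": "e3", "actionable": True,
             "resolution_confidence": 0.9},
        ]},
    ],
}

candidates = pick_actionable(sample_tree)
```

Each returned id could then be passed to element_click or element_fill without a separate element_find call.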
page_get_tree_deep is the recommended option when the page uses iframes or shadow DOM. It is more expensive, has its own timeout, and returns:
- frame_path
- shadow_path
- partial
- errors
- visibility and interaction metadata when available
The alpha covers simple iframes, same-origin nested iframes, and open shadow DOM. Closed shadow roots and complex cross-origin cases still require additional validation.
Security
- Bearer token is required by default.
- The default bind must remain 127.0.0.1.
- Unrestricted execute_cdp_cmd is not exposed.
- Operating system commands are not exposed.
- Arbitrary filesystem read or write is not exposed.
- Screenshots, downloads, and uploads use controlled directories or an allowlist.
- Cookies and storage are redacted by default on read.
- Sensitive attributes such as tokens, passwords, and cookies are redacted.
- Logs must redact bearer tokens, cookies, authorization headers, and sensitive fields.
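The log redaction rule in the list above could be approximated with a small filter like the one below. This is a minimal sketch using regular expressions, not the server's actual implementation; the pattern set, the `redact` helper name, and the `[REDACTED]` marker are all assumptions.

```python
import re

# Header-style fields whose whole value should be hidden.
# "set-cookie" is listed before "cookie" so the longer name matches first.
_HEADER = re.compile(
    r"\b(authorization|set-cookie|cookie)\s*:\s*[^\r\n]*",
    re.IGNORECASE,
)
# Bearer tokens that appear inline, outside a header line.
_BEARER = re.compile(r"\b(bearer)\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE)


def redact(line: str) -> str:
    """Replace sensitive header values and bearer tokens in a log line."""
    line = _HEADER.sub(lambda m: f"{m.group(1)}: [REDACTED]", line)
    line = _BEARER.sub(lambda m: f"{m.group(1)} [REDACTED]", line)
    return line
```

For example, `redact("Authorization: Bearer abc.def")` yields `Authorization: [REDACTED]`. A production filter would likely also cover structured log fields, which this sketch does not attempt.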
js_evaluate is a sensitive tool:
- Requires explicit tab_id.
- Uses a short timeout by default.
- Limits code and result size.
- Logs a summarized audit with hash, duration, and size.
- Must not log full code or full results.
- Warns or blocks dangerous patterns, depending on mode.
- May be disabled in the future via a safe-mode configuration.
js_evaluate_readonly is preferred for inspection, but should also be treated as sensitive.
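To illustrate the summarized audit described for js_evaluate (hash, duration, and size, without the full code or result), here is a hedged Python sketch. The record field names, the choice of SHA-256, and the `audit_record` helper are assumptions; the README does not specify the actual audit format.

```python
import hashlib


def audit_record(code: str, result: str, started: float, finished: float) -> dict:
    """Summarize one evaluation without storing the code or the result."""
    code_bytes = code.encode("utf-8")
    return {
        # A hash lets identical scripts be correlated without logging them.
        "code_sha256": hashlib.sha256(code_bytes).hexdigest(),
        "code_bytes": len(code_bytes),
        "result_bytes": len(result.encode("utf-8")),
        "duration_ms": round((finished - started) * 1000, 3),
    }


# Timestamps would normally come from time.monotonic(); fixed values are
# used here so the example is deterministic.
record = audit_record("document.title", '"Example"', started=10.0, finished=10.25)
```

The record carries enough to spot slow or oversized evaluations while keeping page content and script bodies out of the logs.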
Runtime directories
Runtime data is stored outside the repository by default:
- Windows: %LOCALAPPDATA%\pydoll-mcp-server
- macOS: ~/Library/Application Support/pydoll-mcp-server
- Linux: ~/.local/share/pydoll-mcp-server
Expected subdirectories:
- profiles/
- tmp/
- downloads/
- artifacts/
- logs/
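The per-platform defaults and subdirectories listed above can be expressed as a small resolver. This sketch mirrors only the paths given in this README; the `runtime_dir` helper is hypothetical, and whether the server honors overrides such as XDG_DATA_HOME is not stated here, so none are handled.

```python
import os
import sys
from collections.abc import Mapping
from pathlib import Path, PurePosixPath, PureWindowsPath

APP_NAME = "pydoll-mcp-server"
SUBDIRS = ("profiles", "tmp", "downloads", "artifacts", "logs")


def runtime_dir(platform: str, env: Mapping[str, str], home: str) -> str:
    """Return the default runtime directory for the given platform."""
    if platform.startswith("win"):
        return str(PureWindowsPath(env["LOCALAPPDATA"]) / APP_NAME)
    if platform == "darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support" / APP_NAME)
    # Everything else is treated like Linux in this sketch.
    return str(PurePosixPath(home) / ".local" / "share" / APP_NAME)


# Resolve for the current machine; nothing is created on disk here.
current = runtime_dir(sys.platform, os.environ, str(Path.home()))
```

Taking the platform, environment, and home directory as parameters keeps the function easy to test without touching the real filesystem.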
Vendored Pydoll documentation
Vendored Pydoll documentation is available at:
references/pydoll-docs/
Do not mix vendored documentation with MCP server code.
Testing
Core gates:
C:\Users\Yuri\anaconda3\python.exe -m pytest -q
C:\Users\Yuri\anaconda3\python.exe -m ruff check .
C:\Users\Yuri\anaconda3\python.exe -m mypy src
C:\Users\Yuri\anaconda3\python.exe -m pytest -m browser_smoke -q
Useful test suites by area:
C:\Users\Yuri\anaconda3\python.exe -m pytest tests\contract -q
C:\Users\Yuri\anaconda3\python.exe -m pytest tests\unit\test_concurrency.py -q
C:\Users\Yuri\anaconda3\python.exe -m pytest tests\unit\test_security.py tests\unit\test_files_security.py -q
C:\Users\Yuri\anaconda3\python.exe -m pytest tests\p2\ -q
browser_smoke opens Chrome/Chromium in headless mode and validates real flows against local fixtures.
Known limitations
- Console inspection is not available (returns UNSUPPORTED; depends on additional Pydoll Runtime API validation).
- browser_attach does not support reconnection across server sessions (returns UNSUPPORTED).
- Closed shadow roots and complex OOPIFs still require dedicated validation.
- Deep traversal is more expensive than page_get_tree and should be used explicitly.
- Downloads depend on Pydoll's expect_download flow and must remain in the controlled runtime dir.
- Uploads must only use paths allowed by the allowlist.
Plans and progress
- Overview: PLAN.md
- P1: plans/PLAN_P1.md (completed)
- P2: plans/PLAN_P2.md (completed)
- Agent progress logs: progress/
Agents should log short progress entries at:
progress/YYYY-MM-DD_AGENT_PLAN_XX.md
License
MIT