selenium-mcp

mcp-selenium

A production-ready MCP (Model Context Protocol) server that exposes Selenium 4 browser automation as MCP tools. It supports Chrome and Firefox with full BiDi (Bidirectional API) event streaming.


Architecture

selenium-mcp/
├── server.py               # MCP entrypoint (JSON-RPC 2.0 over stdio)
├── config/
│   ├── settings.py         # ENV + YAML config loader (singleton)
│   ├── default.yaml        # Default configuration values
│   └── logging_config.py   # Structured logging setup
├── driver/
│   ├── factory.py          # WebDriver factory (Chrome / Firefox + BiDi)
│   ├── session.py          # BrowserSession – wraps a single WebDriver
│   └── session_manager.py  # Registry of all active sessions
├── events/
│   ├── dispatcher.py       # Async pub/sub event dispatcher (asyncio.Queue)
│   ├── bidi_listeners.py   # BiDi WebSocket event listeners
│   └── network_interceptor.py  # CDP/BiDi network interception
├── tools/
│   ├── base.py             # BaseTool + error-screenshot decorator
│   ├── navigation_tools.py # open_page, navigate_back/forward, get_dom
│   ├── interaction_tools.py# click, type_text, get_text, wait_for
│   ├── script_tools.py     # execute_js, screenshot
│   ├── log_tools.py        # get_console_logs, get_network_logs, intercept_requests
│   ├── session_tools.py    # create_session, close_session, list_sessions
│   └── registry.py         # Tool name → callable map + MCP descriptors
└── models/
    ├── session.py          # SessionInfo, BrowserType, SessionStatus
    ├── events.py           # BrowserEvent, ConsoleLogEvent, NetworkRequestEvent, …
    ├── network.py          # NetworkLog, ConsoleLog, InterceptRule, PerformanceMetrics
    └── exceptions.py       # Custom exception hierarchy

Layered design

MCP Client (Claude / any MCP host)
        ↕  JSON-RPC 2.0 / stdio
    server.py  (MCPServer)
        ↕
    tools/registry.py  →  tools/*.py  (business logic)
        ↕
    driver/session_manager.py  →  driver/session.py
        ↕                               ↕
    driver/factory.py           events/bidi_listeners.py
    (WebDriver creation)        (BiDi / CDP event capture)
        ↕                               ↕
    Selenium 4 WebDriver       events/dispatcher.py
    (Chrome / Firefox)         (async pub/sub)
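
As a rough illustration of the tools/registry.py layer in the diagram above, a name-to-callable map with a small dispatch function might look like the sketch below. The names TOOL_REGISTRY, register_tool, and dispatch, as well as the handler body, are hypothetical and are not taken from the project's code.

```python
from typing import Any, Callable

ToolFn = Callable[..., dict[str, Any]]

# Hypothetical registry: tool name -> (callable, short description).
TOOL_REGISTRY: dict[str, tuple[ToolFn, str]] = {}


def register_tool(name: str, description: str) -> Callable[[ToolFn], ToolFn]:
    """Decorator that records a function under an MCP tool name."""
    def decorator(fn: ToolFn) -> ToolFn:
        TOOL_REGISTRY[name] = (fn, description)
        return fn
    return decorator


@register_tool("open_page", "Navigate to URL")
def open_page(url: str) -> dict[str, Any]:
    # A real handler would call into the session's WebDriver here.
    return {"navigated_to": url}


def dispatch(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Look up a tool by name and call it with keyword arguments."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    fn, _description = TOOL_REGISTRY[name]
    return fn(**arguments)
```

Keeping the lookup in one place means the server layer only needs the tool name and an arguments dictionary, which matches the shape of an MCP tool call.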

Quick start

Prerequisites

| Requirement | Version |
| --- | --- |
| Python | 3.11+ |
| Chrome / ChromeDriver | latest stable |
| Firefox / GeckoDriver | latest stable (optional) |

Installation

# 1. Clone / enter the project
git clone <repo> selenium-mcp
cd selenium-mcp

# 2. Create a virtual environment
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. (Optional) install as editable package
pip install -e .

Run the server

# Stdio mode (standard MCP transport)
python server.py

# Or via the installed entry-point
selenium-mcp

The server reads JSON-RPC 2.0 messages from stdin and writes responses to stdout.
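
For orientation, a client request that invokes a tool could be built as in the sketch below. The tools/call method and the name/arguments parameter shape come from the MCP specification, which also frames stdio messages as one JSON object per line. The helper name build_tool_call and the example URL are illustrative only.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize one MCP tools/call request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(message) + "\n"


line = build_tool_call(1, "open_page", {"url": "https://example.com"})
```

A real client performs the MCP initialize handshake before sending tool calls; that step is omitted here.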


Configuration

Configuration is resolved in the following priority order, highest first:

  1. Environment variables (SMCP_*)
  2. YAML file (SMCP_CONFIG_FILE env var or config/default.yaml)
  3. Built-in defaults

Key settings

| ENV variable | YAML key | Default | Description |
| --- | --- | --- | --- |
| SMCP_BROWSER | browser.default | chrome | Default browser (chrome/firefox) |
| SMCP_HEADLESS | browser.headless | true | Run headless |
| SMCP_MAX_SESSIONS | browser.max_sessions | 5 | Max concurrent sessions |
| SMCP_BIDI_ENABLED | bidi.enabled | true | Enable BiDi WebSocket |
| SMCP_LOG_LEVEL | server.log_level | INFO | Log level |
| SMCP_DEBUG | server.debug | false | Verbose debug logging |
| SMCP_SCREENSHOT_ON_ERROR | screenshot.on_error | true | Auto-screenshot on errors |
| SMCP_SCREENSHOT_DIR | screenshot.directory | screenshots/ | Screenshot output dir |
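
The precedence rules and the settings listed above can be sketched as a small resolver. This is a minimal sketch under stated assumptions, not the project's loader: the function names lookup_yaml and resolve_setting are hypothetical, the YAML file is assumed to be already parsed into a nested dictionary, and type coercion of environment strings (for example "false" to a boolean) is left out.

```python
import os
from collections.abc import Mapping
from typing import Any


def lookup_yaml(config: Mapping[str, Any], dotted_key: str) -> Any:
    """Walk a nested mapping using a dotted key such as 'browser.headless'."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return None
        node = node[part]
    return node


def resolve_setting(
    env_name: str,
    dotted_key: str,
    default: Any,
    yaml_config: Mapping[str, Any],
    environ: Mapping[str, str] = os.environ,
) -> Any:
    """Return the environment value, else the YAML value, else the default."""
    if env_name in environ:
        return environ[env_name]  # 1. Environment variable wins.
    from_yaml = lookup_yaml(yaml_config, dotted_key)
    if from_yaml is not None:
        return from_yaml  # 2. Then the YAML file.
    return default  # 3. Finally the built-in default.
```

Passing environ explicitly, as in resolve_setting("SMCP_BROWSER", "browser.default", "chrome", {"browser": {"default": "firefox"}}, environ={}), makes the lookup easy to exercise without touching the real process environment.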

Custom YAML config

SMCP_CONFIG_FILE=/path/to/my-config.yaml python server.py

Example my-config.yaml:

browser:
  default: firefox
  headless: false
  max_sessions: 3
bidi:
  enabled: true
screenshot:
  on_error: true
  directory: /tmp/mcp-screenshots

MCP Tools reference

Session management

| Tool | Description | Key params |
| --- | --- | --- |
| create_session | Open a new browser | browser, headless |
| close_session | Close a session | session_id |
| list_sessions | List active sessions | |
| get_session_info | Get session metadata | session_id |

Navigation

| Tool | Description | Key params |
| --- | --- | --- |
| open_page | Navigate to URL | url |
| navigate_back | History back | |
| navigate_forward | History forward | |
| get_dom | Full page HTML | |

Element interaction

| Tool | Description | Key params |
| --- | --- | --- |
| click | Click a CSS selector | selector |
| type_text | Type into an input | selector, text |
| get_text | Get element text | selector |
| wait_for | Wait until visible | selector, timeout |
| wait_for_dom_stable | Smart DOM-stability wait | timeout |

Script & media

| Tool | Description | Key params |
| --- | --- | --- |
| execute_js | Run JavaScript | script |
| screenshot | Capture viewport (base64 PNG) | |
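
Since screenshot returns the image as base64-encoded PNG data, a client will typically decode it before saving or inspecting it. The sketch below uses only the standard library. The function name decode_screenshot is hypothetical, and how the base64 string is wrapped inside the tool result is not specified here.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # First eight bytes of every PNG file.


def decode_screenshot(b64_png: str) -> bytes:
    """Decode a base64 string and check that it looks like a PNG."""
    # validate=True rejects characters outside the base64 alphabet.
    data = base64.b64decode(b64_png, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return data
```

The returned bytes can be written directly to a .png file opened in binary mode.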

Logs & network

| Tool | Description | Key params |
| --- | --- | --- |
| get_console_logs | Browser console entries | |
| get_network_logs | Network request/response log | |
| get_performance_metrics | Page timing data | |
| intercept_requests | Register URL intercept rule | pattern, action |

Connecting to MCP clients

Via uvx (recommended — zero install)

uvx mcp_selenium

Via pip

pip install mcp_selenium
selenium-mcp

Local development (uv run)

git clone https://github.com/SCV-Consultants/selenium-mcp.git
cd selenium-mcp
uv run selenium-mcp

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "selenium": {
      "command": "uvx",
      "args": ["mcp_selenium"],
      "env": {
        "SMCP_HEADLESS": "false",
        "SMCP_BROWSER": "chrome"
      }
    }
  }
}

Antigravity / Gemini

Add to your MCP config (mcp_config.json):

{
  "mcpServers": {
    "selenium": {
      "command": "uvx",
      "args": ["mcp_selenium"],
      "env": {
        "SMCP_HEADLESS": "false",
        "SMCP_BROWSER": "chrome"
      }
    }
  }
}

For local development, use uv run instead:

{
  "mcpServers": {
    "selenium": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/selenium-mcp", "selenium-mcp"],
      "env": {
        "SMCP_HEADLESS": "false",
        "SMCP_BROWSER": "chrome"
      }
    }
  }
}

Claude CLI

claude mcp add selenium -- uvx mcp_selenium

Install via Smithery

npx -y @smithery/cli install mcp_selenium --client claude

BiDi / Event system

When bidi.enabled: true, the server attaches BiDi WebSocket listeners to each session:

  • Console events – console.log, console.error, etc. are captured and stored per-session. Retrieved via get_console_logs.
  • JS errors – JavaScript runtime errors are captured as error-level console entries.
  • Network events – CDP Network.enable (Chrome) captures request/response data. Retrieved via get_network_logs.
  • Event dispatcher – All events flow through an asyncio.Queue-backed pub/sub hub (events/dispatcher.py). Custom async handlers can be registered per event type for real-time streaming use cases.
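
To make the last bullet concrete, here is a minimal sketch of an asyncio.Queue-backed pub/sub hub with per-event-type async handlers. It is not the code in events/dispatcher.py: the class name EventDispatcher, its methods, and the event-type strings are assumptions made for illustration.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable

Handler = Callable[[dict[str, Any]], Awaitable[None]]


class EventDispatcher:
    """Queue events on publish, then fan them out to subscribed handlers."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    async def publish(self, event_type: str, payload: dict[str, Any]) -> None:
        await self._queue.put((event_type, payload))

    async def drain(self) -> None:
        """Deliver every event currently in the queue, in FIFO order."""
        while not self._queue.empty():
            event_type, payload = self._queue.get_nowait()
            for handler in self._handlers.get(event_type, []):
                await handler(payload)
            self._queue.task_done()


async def demo() -> list[str]:
    seen: list[str] = []

    async def on_console(event: dict[str, Any]) -> None:
        seen.append(event["text"])

    dispatcher = EventDispatcher()
    dispatcher.subscribe("console", on_console)
    await dispatcher.publish("console", {"text": "hello"})
    await dispatcher.publish("network", {"url": "https://example.com"})
    await dispatcher.drain()
    return seen
```

Running asyncio.run(demo()) delivers only the console event to on_console, because nothing is subscribed to the network type. A long-running server would more likely consume the queue in a background task than drain it on demand.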

Error handling

All tools wrap failures in a typed exception hierarchy:

| Exception | Trigger |
| --- | --- |
| SessionNotFoundError | Invalid session_id |
| SessionLimitError | Too many concurrent sessions |
| ElementNotFoundError | CSS selector matched nothing |
| ElementInteractionError | Element not clickable/typeable |
| NavigationError | get() / history navigation failed |
| ScriptExecutionError | JavaScript threw or timed out |
| TimeoutError | wait_for condition not met |
| NetworkInterceptionError | CDP interception setup failed |
| BiDiNotSupportedError | BiDi requested but unavailable |

When screenshot.on_error: true, a PNG is saved to screenshot.directory automatically on any SeleniumMCPError.
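
The screenshot-on-error behaviour described above could be implemented with a decorator along the following lines. This is a sketch under assumptions: only two exceptions from the table are shown, they are assumed to derive directly from SeleniumMCPError, and the decorator name screenshot_on_error and its capture callback are hypothetical stand-ins for whatever tools/base.py actually provides.

```python
import functools
from typing import Any, Callable


class SeleniumMCPError(Exception):
    """Base class for the server's typed errors."""


class SessionNotFoundError(SeleniumMCPError):
    """Raised for an invalid session_id."""


class ElementNotFoundError(SeleniumMCPError):
    """Raised when a CSS selector matches nothing."""


def screenshot_on_error(
    capture: Callable[[str], None],
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Call capture(tool_name) when the wrapped tool raises SeleniumMCPError."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(tool)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            try:
                return tool(*args, **kwargs)
            except SeleniumMCPError:
                # A real capture would save a PNG to screenshot.directory.
                capture(tool.__name__)
                raise  # Re-raise so the caller still sees the typed error.
        return wrapper
    return decorator
```

Because the decorator catches only the base class, unrelated exceptions pass through without triggering a capture.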


Development

# Lint
ruff check .

# Type check
mypy .

# Tests
pytest tests/ -v

Retry mechanism

All element interactions use an internal _retry() helper that retries on StaleElementReferenceException and transient WebDriverException. The number of attempts and the backoff between them are configurable in the YAML config:

retry:
  max_attempts: 3
  backoff_seconds: 1.0
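
A retry helper driven by these two settings might look like the sketch below. It is not the project's _retry(): the function name retry, its parameters, and the fixed (rather than growing) delay between attempts are assumptions. The exception types are passed in so that the example runs without Selenium installed.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    retry_on: tuple[type[Exception], ...],
    max_attempts: int = 3,
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action until it succeeds or max_attempts is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            sleep(backoff_seconds)  # Fixed pause before the next attempt.
    raise AssertionError("unreachable")
```

With Selenium, retry_on would typically be a tuple of classes from selenium.common.exceptions such as StaleElementReferenceException. The sleep parameter exists so that tests can substitute a recorder for the real delay.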
