Selenium MCP Server

Exposes Selenium WebDriver as an MCP server, enabling AI agents and LLMs to control real browsers for automation tasks like navigation, element interaction, and screenshot capture.

This project exposes Selenium WebDriver as an MCP (Model Context Protocol) server, allowing AI agents to control a real browser through structured tools.

It enables LLMs and autonomous agents to perform tasks like:

Opening browsers
Navigating websites
Discovering UI elements
Clicking buttons and links
Typing into inputs
Extracting page text
Taking screenshots
More capabilities are in progress

This makes it possible to build AI-powered browser automation systems and autonomous QA agents.

Why This Project Exists
Architecture
Features
Installation
Running the Server
MCP Server Version
Available MCP Tools
Browser Session Flow
Example Agent Workflow
System Prompt for AI Agents
Prompt Customization
Logging
Configure Your MCP Client
Requirements
Use Cases
Contributing
License
Author

WHY THIS PROJECT EXISTS

Modern AI agents need a way to interact with real applications.

While traditional automation tools like Selenium exist, they are not directly usable by LLM agents.

This project bridges that gap by exposing Selenium functionality through MCP tools so that agents can:

Understand web pages
Discover UI elements
Perform actions
Validate results

ARCHITECTURE

flowchart TD
    A[LLM Agent] --> B[MCP Protocol]
    B --> C[Selenium MCP Server]

    C --> D[Browser Tools]
    C --> E[Navigation Tools]
    C --> F[Interaction Tools]
    C --> G[Element Tools]
    C --> H[Debug Tools]

    D --> I[Selenium WebDriver]
    E --> I
    F --> I
    G --> I
    H --> I
    
    I --> J[Browser]

FEATURES

MCP-compatible Selenium automation server
Browser session management
Navigation controls
UI element discovery
Accessibility-aware interaction
Screenshot capture
Page text extraction
Headless browser support
Multi-tab browser management (open, switch, close, track active tab)
Improved interactive element detection for modern UI frameworks (React, Angular, dynamic DOM)

INSTALLATION

Run the following command:

pip install selenium-mcp

RUNNING THE SERVER

Start the MCP server

You can start the Selenium MCP server using different transport modes depending on your use case.

Default (STDIO)

selenium-mcp run

Uses stdio transport
Best for local agent integrations
No network exposure

HTTP Mode (Recommended)

selenium-mcp run --transport http --host 127.0.0.1 --port 3345

Starts server at: http://127.0.0.1:3345

MCP endpoint: http://127.0.0.1:3345/mcp

Best for:

API integrations
Postman / curl testing
production-style usage
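
As a rough illustration of what a client sends to the HTTP endpoint, the sketch below builds (but does not send) an MCP initialize request using only Python's standard library. It assumes the server follows the standard MCP JSON-RPC handshake over HTTP. The protocol version string and the client name are placeholder assumptions, not values taken from this project.

```python
import json
import urllib.request

# Endpoint taken from the HTTP mode example above.
MCP_ENDPOINT = "http://127.0.0.1:3345/mcp"


def build_initialize_request(endpoint: str = MCP_ENDPOINT) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 `initialize` request for an MCP HTTP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; use the one your client and server agree on.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # An MCP HTTP server may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_initialize_request()
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it once the server is running in HTTP mode.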

SSE Mode (Streaming)

selenium-mcp run --transport sse --host 127.0.0.1 --port 3345

Starts server at: http://127.0.0.1:3345/sse

Best for:

streaming-based agents
real-time interactions

Note: SSE endpoints stream events, so opening one directly in a browser may show no output.

Expose Server on Network:

selenium-mcp run --transport http --host 0.0.0.0 --port 3345

Makes server accessible from:

other devices on the same network
Docker / VM environments

Notes:

Default port: 3336

Supported transports:

stdio (default)
http
sse

Ensure port is within range: 1–65535
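
The constraints in these notes can be checked before launching the server. The helper below is a minimal sketch: validate_run_options is a hypothetical name, not part of the selenium-mcp CLI, and it only encodes the transports, default port, and port range listed above.

```python
SUPPORTED_TRANSPORTS = ("stdio", "http", "sse")
DEFAULT_PORT = 3336  # default port stated in the notes above


def validate_run_options(transport: str = "stdio", port: int = DEFAULT_PORT) -> tuple[str, int]:
    """Return (transport, port) if both are acceptable, otherwise raise ValueError."""
    if transport not in SUPPORTED_TRANSPORTS:
        raise ValueError(
            f"unsupported transport {transport!r}; expected one of {SUPPORTED_TRANSPORTS}"
        )
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside the range 1-65535")
    return transport, port


print(validate_run_options("http", 3345))
```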

MCP SERVER VERSION

To check the current version of the Selenium MCP server, run the following command:

selenium-mcp version

AVAILABLE MCP TOOLS

Run the following command to list the tools supported by the MCP server:

selenium-mcp tools

The supported tools are grouped by category below.

BROWSER CONTROL

open_browser – Launch a new browser session
close_browser – Close the browser session
maximize_browser – Maximize browser window
fullscreen_browser – Switch browser to fullscreen

NAVIGATION

open_url – Navigate to a specific URL
navigate_back – Navigate back in browser history
navigate_forward – Navigate forward in history
refresh_page – Reload the page
wait_for_page – Wait for page to load
get_page_title – Get the current page title

TAB MANAGEMENT

get_tabs – Retrieve all open tabs in the current session
switch_tab – Switch to a specific tab using index
open_new_tab – Open a new tab and optionally navigate to a URL
close_tab – Close a specific tab by index
get_current_tab – Retrieve the currently active tab
name_tab – Assign a custom name to a tab for easier identification

These tools allow agents to manage multiple tabs within a single browser session.

ELEMENT DISCOVERY

get_interactive_elements – Discover visible interactive elements on the page
get_accessibility_tree – Retrieve simplified accessibility tree for the page

These tools allow agents to understand the UI structure before interacting with it.

Notes

Element detection is optimized for modern web applications (React, Angular, dynamic UI frameworks).
Elements are identified using interaction signals such as roles, click handlers, and focusability.
Only visible and meaningful elements are returned to reduce noise.

INTERACTION TOOLS

click_element – Click an element by index
type_into_element – Enter text into an input field

Elements must first be discovered using: get_interactive_elements
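
The discover-then-interact rule can be enforced on the client side. In the sketch below, the tools/call envelope is the standard MCP JSON-RPC shape, but the argument names session_id and index, and the zero-based indexing, are assumptions for illustration. Check the server's tool schemas for the real parameters.

```python
import json
from itertools import count

_request_ids = count(1)  # simple source of unique JSON-RPC ids


def build_tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in the MCP JSON-RPC `tools/call` envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def build_click(session_id: str, index: int, discovered_count: int) -> dict:
    """Build a click_element call, refusing indexes that were never discovered.

    `discovered_count` is the number of elements returned by the most recent
    get_interactive_elements call (client-side bookkeeping, assumed zero-based).
    """
    if not 0 <= index < discovered_count:
        raise ValueError(f"index {index} was not returned by get_interactive_elements")
    # Argument names are hypothetical; the real tool schema may differ.
    return build_tool_call("click_element", {"session_id": session_id, "index": index})


print(json.dumps(build_click("session-1", 2, discovered_count=5), indent=2))
```

Rejecting out-of-range indexes before the request is sent keeps an agent from acting on an element it never observed.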

PAGE ANALYSIS

get_page_text – Extract visible text from the page

Useful for:

validation
reasoning
information extraction

VISUAL DEBUGGING

take_screenshot – Capture a screenshot of the current browser window

Screenshot Storage Location

When screenshots are captured, they are automatically saved in a hidden folder inside your home directory.

macOS / Linux

Screenshots are stored at:

~/.selenium-mcp/screenshot

Example full path on macOS (on Linux, the home directory is typically /home/<your-username>):

/Users/<your-username>/.selenium-mcp/screenshot

On macOS, you can open the folder from Terminal (on most Linux desktops, xdg-open is the equivalent command):

open ~/.selenium-mcp/screenshot

Windows

Screenshots are stored at:

C:\Users\<your-username>\.selenium-mcp\screenshot

Example:

C:\Users\John\.selenium-mcp\screenshot

You can open it from File Explorer by entering the following in the address bar:

%USERPROFILE%\.selenium-mcp\screenshot

Custom Screenshot Directory (Optional)

You can override the default screenshot location using the environment variable: SELENIUM_MCP_SCREENSHOT_DIR

macOS / Linux

export SELENIUM_MCP_SCREENSHOT_DIR=~/my-screenshots

Windows (PowerShell)

$env:SELENIUM_MCP_SCREENSHOT_DIR="C:\my-screenshots"

All screenshots will then be saved to the specified directory.

Notes

The folder is created automatically the first time a screenshot is taken.
The .selenium-mcp directory is hidden by default because it starts with a dot (.).
You can safely delete screenshots anytime.

BROWSER SESSION FLOW

Each browser session is identified by a session_id.

Typical workflow for agents:

open_browser
open_url
wait_for_page
get_interactive_elements
(optional) get_tabs / switch_tab if multiple tabs are present
click_element or type_into_element

MULTI-TAB WORKFLOW

Agents can work with multiple tabs within the same browser session.

Example workflow:

open_browser
open_url
open_new_tab("https://example.com")
get_tabs
switch_tab(index)
perform actions
close_tab(index)

Notes

Each tab is tracked using an internal index.
The active tab is automatically managed and updated.
All actions are performed on the currently active tab.

EXAMPLE AGENT WORKFLOW

Example task:

Open the Chrome browser.
Navigate to google.com.
Type the text "Selenium MCP" in the search box.
Click the search button.

Agent steps:

open_browser
open_url("https://google.com")
wait_for_page
get_interactive_elements
type_into_element(index, "Selenium MCP")
click_element(index)
wait_for_page
get_page_text
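
A planned sequence like the one above can be sanity-checked before it is executed. The sketch below uses only tool names from this README. The rule that navigation invalidates previously discovered indexes is an assumption, and check_plan is a hypothetical helper rather than part of the server.

```python
NAVIGATION_TOOLS = {"open_url", "navigate_back", "navigate_forward", "refresh_page"}
INTERACTION_TOOLS = {"click_element", "type_into_element"}


def check_plan(steps: list[str]) -> list[str]:
    """Return a list of problems found in an ordered list of tool names."""
    problems = []
    if not steps or steps[0] != "open_browser":
        problems.append("plan should start with open_browser")
    discovered = False  # True once elements were discovered on the current page
    for position, tool in enumerate(steps, start=1):
        if tool in NAVIGATION_TOOLS:
            discovered = False  # assumption: navigation makes old indexes stale
        elif tool == "get_interactive_elements":
            discovered = True
        elif tool in INTERACTION_TOOLS and not discovered:
            problems.append(f"step {position}: {tool} before get_interactive_elements")
    return problems


example_steps = [
    "open_browser", "open_url", "wait_for_page", "get_interactive_elements",
    "type_into_element", "click_element", "wait_for_page", "get_page_text",
]
print(check_plan(example_steps))  # prints []
```

The check is deliberately simple: it does not model clicks that trigger navigation, so an agent should still re-run get_interactive_elements after any action that changes the page.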

SYSTEM PROMPT FOR AI AGENTS

This repository includes a production-grade system prompt designed specifically for browser automation agents that interact with this Selenium MCP server.

The prompt contains detailed operational guidelines that instruct the AI agent on how to:

initialize and control the browser
discover and interact with UI elements
analyze page structure using the accessibility tree
avoid hallucinating element indexes
handle navigation and page reloads
recover from stale elements
follow a deterministic execution loop (PLAN → ACT → OBSERVE → UPDATE PLAN)
enforce safety limits on tool usage

Prompt location

prompts/system_prompt.md

How to use

Whenever you build an AI agent that interacts with this MCP server, provide this prompt as the model's system prompt.

Why this prompt

Browser automation agents can easily make incorrect decisions if not guided properly. This system prompt provides strict operational rules and guardrails that help the agent:

use MCP tools correctly
avoid incorrect element interactions
minimize hallucinations
perform reliable browser automation tasks

Using this prompt significantly improves the stability, accuracy, and reliability of AI-driven browser automation.

Recommendation

It is strongly recommended that all AI agents interacting with this Selenium MCP server use this system prompt to ensure consistent and reliable behavior.

PROMPT CUSTOMIZATION

You may modify or extend the system prompt depending on your use case. However, it is recommended to preserve the core operational rules related to:

MCP tool usage
element discovery
navigation handling
safety limits

LOGGING

All application logs are stored in a user-specific directory:

~/.selenium-mcp/logs/

This directory is automatically created when the server starts.

Log file

Logs are written to:

~/.selenium-mcp/logs/selenium_mcp.log

Features:

Daily log file rotation
Automatic cleanup of older log files
Logs written to both console and file
Persistent logs independent of the project directory

Logs are stored in the user's home directory so they remain available even if the package is installed globally via pip. This makes it easier to debug issues and monitor MCP server activity across different projects.
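
For readers who want a similar setup in their own tooling, the sketch below shows one way to get daily rotation, cleanup of old files, and console plus file output with Python's standard logging module. It is not the server's actual configuration. The format string is inferred from the example log entry in this section, and the seven-day retention is an assumption.

```python
import logging
import sys
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

# Inferred from the example entry: timestamp [LEVEL] [logger-name] message
LOG_FORMAT = "%(asctime)s [%(levelname)s] [%(name)s] %(message)s"


def build_logger(log_dir: Path, keep_days: int = 7) -> logging.Logger:
    """Return a logger that writes to the console and a daily-rotated file."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("selenium-mcp")
    logger.setLevel(logging.INFO)
    if logger.handlers:  # already configured; avoid duplicate handlers
        return logger
    formatter = logging.Formatter(LOG_FORMAT)
    file_handler = TimedRotatingFileHandler(
        log_dir / "selenium_mcp.log",
        when="midnight",        # start a new file each day
        backupCount=keep_days,  # older rotated files are deleted (assumed retention)
        encoding="utf-8",
    )
    # Console output goes to stderr so it cannot mix with stdio protocol traffic.
    console_handler = logging.StreamHandler(sys.stderr)
    for handler in (file_handler, console_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling build_logger(Path.home() / ".selenium-mcp" / "logs") and then logger.info(...) would produce entries in the same shape as the example entry.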

Example Log Entry

2026-03-15 19:00:07,444 [INFO] [selenium-mcp] Initializing Selenium MCP Server...

macOS / Linux

Logs are stored in (macOS path shown; on Linux, the home directory is typically /home/<username>):

/Users/<username>/.selenium-mcp/logs/

Example:

/Users/john/.selenium-mcp/logs/selenium_mcp.log

You can open it from the terminal:

cd ~/.selenium-mcp/logs
ls

View logs:

cat selenium_mcp.log

tail -f selenium_mcp.log

Windows

Logs are stored in:

C:\Users\<username>\.selenium-mcp\logs\

Example:

C:\Users\John\.selenium-mcp\logs\selenium_mcp.log

Open it in File Explorer:

C:\Users\%USERNAME%\.selenium-mcp\logs\

Or from Command Prompt:

cd %USERPROFILE%\.selenium-mcp\logs
dir

CONFIGURE YOUR MCP CLIENT

Add the Selenium MCP server to your MCP client configuration.

Example STDIO mode:

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "selenium-mcp"
    }
  }
}

This tells the MCP client how to start the Selenium MCP server using stdio mode.

Example HTTP mode:

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "selenium-mcp",
      "args": ["run", "--transport", "http", "--host", "127.0.0.1", "--port", "3345"]
    }
  }
}

Runs MCP server over HTTP
Endpoint: http://127.0.0.1:3345/mcp

Example SSE mode:

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "selenium-mcp",
      "args": ["run", "--transport", "sse", "--host", "127.0.0.1", "--port", "3345"]
    }
  }
}

Runs MCP server with streaming (SSE) transport
Useful for real-time agent interactions
Endpoint: http://127.0.0.1:3345/sse

Client Examples

Claude Desktop

Config file location:

macOS

~/Library/Application Support/Claude/claude_desktop_config.json

Windows

%APPDATA%\Claude\claude_desktop_config.json

STDIO – Works for Claude Desktop

Add the following to the config file:

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "selenium-mcp"
    }
  }
}

Restart Claude Desktop after updating the configuration.

Uses stdio transport
Works out of the box with Claude Desktop
No additional configuration required
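
If the config file already lists other MCP servers, the selenium-mcp entry should be merged in rather than pasted over them. The sketch below shows one way to do that. add_selenium_mcp is a hypothetical helper, and it assumes the file uses the mcpServers layout shown above.

```python
import json
from pathlib import Path


def add_selenium_mcp(config_path: Path, command: str = "selenium-mcp") -> dict:
    """Add or update the selenium-mcp entry while keeping other configured servers."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # setdefault keeps any servers that are already configured.
    config.setdefault("mcpServers", {})["selenium-mcp"] = {"command": command}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Point config_path at the location listed above for your platform, and restart the client afterwards.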

Troubleshooting

If you encounter issues while setting up or running Selenium MCP, try the following solutions.

selenium-mcp: command not found

This usually means the CLI command is not available in your system PATH.

First verify the package is installed:

pip show selenium-mcp

Locate the installed command.

macOS / Linux

which selenium-mcp

Example output:

/Users/<username>/.local/bin/selenium-mcp

If the command is found, update your MCP client configuration to use the full path:

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "/Users/<username>/.local/bin/selenium-mcp"
    }
  }
}

Windows

Run:

where selenium-mcp

Example output:

C:\Users\<username>\AppData\Roaming\Python\Python311\Scripts\selenium-mcp.exe

Update your MCP client configuration:

{
  "mcpServers": {
    "selenium-mcp": {
      "command": "C:\\Users\\<username>\\AppData\\Roaming\\Python\\Python311\\Scripts\\selenium-mcp.exe"
    }
  }
}

Note: Windows paths in JSON require double backslashes (\\).
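
The lookup and the escaping can both be left to Python's standard library, which avoids hand-editing backslashes. This is a minimal sketch: config_for_command is a hypothetical helper, and the Windows path is the example path from above.

```python
import json
import shutil


def config_for_command(command_path: str) -> str:
    """Render an MCP client config; json.dumps escapes backslashes correctly."""
    config = {"mcpServers": {"selenium-mcp": {"command": command_path}}}
    return json.dumps(config, indent=2)


# shutil.which mirrors `which` / `where`: it returns None if the command is not on PATH.
resolved = shutil.which("selenium-mcp")
print(resolved or "selenium-mcp not found on PATH")

# A raw string keeps the single backslashes; the JSON output doubles them.
print(config_for_command(
    r"C:\Users\<username>\AppData\Roaming\Python\Python311\Scripts\selenium-mcp.exe"
))
```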

REQUIREMENTS

Python 3.10+
Web browser

USE CASES

This project can be used to build:

AI test automation agents
Autonomous QA assistants
LLM-powered browser copilots
Self-healing test frameworks
AI web scraping agents
Intelligent UI testing systems

CONTRIBUTING

Contributions are welcome.

Steps:

Fork the repository
Create a feature branch
Submit a pull request

LICENSE

MIT License

AUTHOR

Prashant Nayak

🔗 LinkedIn: https://www.linkedin.com/in/prashantjnayak

Built to help the QA and AI automation community build intelligent browser automation systems.

SUPPORT THE PROJECT

If this project helps you:

Star the repository
Share it with the QA community
