agent-browser-mcp-server
Provides complete browser automation capabilities for AI agents via 44 tools, including navigation, element interaction, state management, and session recording.
Agent-Browser MCP
A Model Context Protocol (MCP) server for agent-browser that exposes its browser automation capabilities to AI agents.
This project is an independent MCP server implementation that wraps the excellent agent-browser CLI tool, making its powerful browser automation features available through the Model Context Protocol.
Features
- 🔧 44 Tools - Complete coverage of agent-browser's functionality
- 🎯 Token-Efficient @ref System - Reduces token usage by caching element references
- 🌐 Full Playwright API - Leverage the complete browser automation capabilities
- 🔄 Auto-Launch - Browser starts automatically when needed
- 💾 State Persistence - Save and restore browser state across sessions
- 🎬 Video Recording - Record browser sessions for debugging
- 🌐 Network Interception - Monitor and modify network requests
- 📊 Session Management - Manage multiple tabs and windows
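The @ref idea from the feature list can be illustrated with a small sketch. This is not agent-browser's implementation: the `RefRegistry` class, the `@e1`-style ids, and the selector strings are all assumptions made for illustration. The point is only that a snapshot hands out short ids once, and later tool calls can send the short id instead of repeating a long selector.

```typescript
// Hypothetical sketch of a ref cache; not the actual agent-browser code.
class RefRegistry {
  private refs = new Map<string, string>();
  private next = 1;

  // Assign a short ref (assumed "@eN" format) to each selector in a snapshot.
  // Taking a new snapshot invalidates all refs from the previous one.
  snapshot(selectors: string[]): string[] {
    this.refs.clear();
    this.next = 1;
    return selectors.map((selector) => {
      const ref = `@e${this.next++}`;
      this.refs.set(ref, selector);
      return ref;
    });
  }

  // Resolve a ref back to its selector; unknown or stale refs throw.
  resolve(ref: string): string {
    const selector = this.refs.get(ref);
    if (selector === undefined) {
      throw new Error(`Unknown ref ${ref}; take a new snapshot first`);
    }
    return selector;
  }
}

const registry = new RefRegistry();
const refs = registry.snapshot([
  "form#login input[name='email']",
  "form#login button[type='submit']",
]);
```

In this sketch, a later click would carry only the short ref (for example `refs[1]`), and the server side would call `registry.resolve` to recover the selector.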
Installation
Using npm
npm install agent-browser-mcp-server
From Source
git clone https://github.com/hughedward/agent_browser_mcp.git
cd agent_browser_mcp
npm install
npm run build
Quick Start
For Claude Desktop
- Install the package
- Configure in Claude Desktop settings (~/.claude/settings.json):
{
  "mcpServers": {
    "agent-browser-mcp-server": {
      "command": "npx",
      "args": ["agent-browser-mcp-server"],
      "env": {
        "HEADED": "false"
      }
    }
  }
}
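If the settings file is generated by a script rather than edited by hand, the same entry can be built in code. The sketch below is one possible approach: the `buildMcpConfig` helper and its types are hypothetical names, and passing `BROWSER` through `env` assumes the server reads it the same way it reads `HEADED`.

```typescript
// Hypothetical helper that produces the mcpServers entry shown above.
type BrowserName = "chromium" | "firefox" | "webkit";

interface ServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildMcpConfig(headed: boolean, browser: BrowserName = "chromium") {
  const entry: ServerEntry = {
    command: "npx",
    args: ["agent-browser-mcp-server"],
    // Environment values must be strings, so the boolean is stringified.
    env: { HEADED: String(headed), BROWSER: browser },
  };
  return { mcpServers: { "agent-browser-mcp-server": entry } };
}

// Pretty-printed JSON, ready to merge into the settings file.
const configJson = JSON.stringify(buildMcpConfig(false), null, 2);
```

Merging the result into an existing settings file (instead of overwriting it) is left out here, since other servers may already be configured under `mcpServers`.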
Standalone
agent-browser-mcp-server
Available Tools
Core Tools
- browser_navigate - Navigate to a URL
- browser_snapshot - Capture page structure with @ref system
- browser_screenshot - Take screenshots
- browser_close - Close browser/page
Navigation & History
- browser_back - Go back in history
- browser_forward - Go forward in history
- browser_reload - Reload the current page
Element Interaction
- browser_click - Click an element
- browser_fill - Fill input fields
- browser_type - Type without clearing
- browser_select - Select dropdown options
- browser_check / browser_uncheck - Check/uncheck checkboxes
- browser_drag - Drag and drop
- browser_upload - Upload files
- browser_dblclick - Double click
- browser_focus - Focus elements
- browser_hover - Hover over elements
- browser_scroll - Scroll page
- browser_press - Press keyboard keys
Element Discovery
- browser_find - Semantic element search (role, text, label, placeholder, etc.)
- browser_get - Get element information
- browser_is - Check element state
Tabs & Windows
- browser_tab - Manage tabs
- browser_window - Manage windows
- browser_frame - Switch to iframes
Advanced Features
- browser_record - Record browser sessions
- browser_network - Monitor network requests
- browser_console - Access console
- browser_errors - Track JavaScript errors
- browser_trace - Performance tracing
- browser_profiler - Chrome DevTools profiling
- browser_evaluate - Execute JavaScript
- browser_pdf - Export to PDF
- browser_dialog - Handle JavaScript dialogs
- browser_download - Manage downloads
State & Storage
- browser_state - Save/load browser state
- browser_cookies - Manage cookies
- browser_storage - Access localStorage/sessionStorage
Utilities
- browser_wait - Wait for conditions
- browser_set - Set element attributes
- browser_mouse - Mouse control
- browser_diff - Compare pages
- browser_highlight - Debug highlighting
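MCP clients invoke tools such as these with a JSON-RPC `tools/call` request. The sketch below builds three such requests for a navigate, snapshot, click sequence. The envelope (`jsonrpc`, `method`, `params.name`, `params.arguments`) follows the MCP specification; the argument names `url` and `ref`, the `@e1` value, and the `toolCall` helper are assumptions, so check each tool's published input schema before relying on them.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and arguments in the envelope.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Argument names below (url, ref) are assumed, not taken from the server.
const steps: ToolCallRequest[] = [
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_snapshot"),
  toolCall("browser_click", { ref: "@e1" }),
];
```

In practice an MCP client library sends these messages for you; the sketch only shows what travels over the wire.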
Configuration
Environment Variables:
| Variable | Description | Default |
|---|---|---|
| HEADED | Run in headed mode (visible browser) | false |
| BROWSER | Browser to use (chromium/firefox/webkit) | chromium |
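A minimal sketch of how these two variables could be read with the defaults from the table. The `readSettings` function is hypothetical, and treating only the string "true" as headed is an assumption about the server's parsing, not documented behavior.

```typescript
type BrowserName = "chromium" | "firefox" | "webkit";

interface BrowserSettings {
  headed: boolean;
  browser: BrowserName;
}

const BROWSERS: readonly BrowserName[] = ["chromium", "firefox", "webkit"];

// Hypothetical parser: takes an env-like object so it can be tested
// without touching the real process environment.
function readSettings(env: Record<string, string | undefined>): BrowserSettings {
  // Assumption: only the string "true" (any case) enables headed mode.
  const headed = (env["HEADED"] ?? "false").toLowerCase() === "true";
  const requested = env["BROWSER"] ?? "chromium";
  const browser = BROWSERS.find((b) => b === requested);
  if (browser === undefined) {
    throw new Error(`Unsupported BROWSER "${requested}"; expected one of ${BROWSERS.join(", ")}`);
  }
  return { headed, browser };
}
```

In a Node.js process this would be called as `readSettings(process.env)`. Rejecting unknown browser names up front gives a clearer error than failing later at launch.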
Development
# Install dependencies
npm install
# Build
npm run build
# Run in development mode (auto-rebuild)
npm run dev
# Run tests
npm test
# Watch mode
npm run test:watch
# Start server
npm start
Documentation
- CLAUDE.md - Development guide for Claude Code
- TESTING_GUIDE.md - Testing instructions
- QUICK_TEST_GUIDE.md - Quick reference
Related Projects
- agent-browser - Original CLI tool this project wraps
- Model Context Protocol - The protocol this server implements
License
Apache-2.0
Note: This project is an independent implementation and is not officially affiliated with Vercel or the original agent-browser project.