MCP Servers

Gemini Web Automation MCP

Enables AI-powered web browsing automation using Google's Gemini 2.5 Computer Use API. Allows agents to navigate websites, click buttons, fill forms, and extract information through natural language commands with real-time progress tracking.

README

Gemini Web Automation MCP

Production-ready Model Context Protocol (MCP) server providing AI-powered web browsing automation using Google's Gemini 2.5 Computer Use API. Built with FastMCP and optimized for 4-5x faster performance than baseline implementations.

Overview
Features
Quick Start
Installation
Configuration
Usage
Architecture
Browser Actions
Performance
Best Practices
Troubleshooting
Development
Deployment
Roadmap
Changelog
Security
License
Support

Overview

Gemini Web Automation MCP enables Claude Desktop and other MCP clients to perform intelligent web browsing automation. The AI agent navigates websites, clicks buttons, fills forms, searches for information, and interacts with web pages like a human user.

Key Statistics:

7 MCP tools for comprehensive browser control
13 browser actions supported (click, type, scroll, navigate, etc.)
4-5x faster than naive implementations
90% context reduction with compact progress mode
1440x900 optimized resolution (Gemini recommended)

Features

Core Capabilities

7 Production-Ready MCP Tools:

browse_web - Synchronous web browsing with immediate completion
start_web_task - Start long-running tasks in background
check_web_task - Monitor progress with compact/full modes
wait - Intelligent rate limiting (1-60 seconds)
stop_web_task - Cancel running background tasks
list_web_tasks - View all active and completed tasks
get_web_screenshots - Retrieve session screenshots for verification

Advanced Features:

Real-time progress tracking with timestamped events
Automatic screenshot capture at each step
Safety decision framework (Gemini safety controls)
Context-aware polling with recommended delays
Background task management with status tracking
Session-based screenshot organization

Technical Highlights

MCP Protocol Compliant: Follows 2025 Model Context Protocol best practices
Performance Optimized: Conditional wait states (0.3-3s), fast page loads, single screenshot per turn
Safety-First: Implements Gemini's safety decision acknowledgment framework
Production Ready: Comprehensive error handling, logging, and validation
User-Friendly: Action-oriented tool names, clear descriptions, helpful examples

Quick Start

Prerequisites

Python 3.10 or higher
UV package manager (Install UV)
Gemini API key (Get one here)

Installation (Super Simple!)

Option A: One-Line Install (PyPI)

// Add to Claude Desktop config
{
  "mcpServers": {
    "computer-use": {
      "command": "uvx",
      "args": ["computer-use-mcp"],
      "env": {"GEMINI_API_KEY": "your_key_here"}
    }
  }
}

Then: playwright install chromium and restart Claude Desktop. Done!

Option B: Local Development

git clone https://github.com/vincenthopf/computer-use-mcp.git
cd computer-use-mcp
uv sync
playwright install chromium
cp .env.sample .env
# Edit .env with your GEMINI_API_KEY
uv run mcp dev server.py

The MCP Inspector will open at http://localhost:6274 where you can test all tools interactively.

Verification

Run validation tests to ensure everything is set up correctly:

uv run python3 test_server.py

All tests should pass with ✓ markers.

Installation

Method 1: UVX (Recommended - One Command)

The simplest way to use this MCP server. Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "computer-use": {
      "command": "uvx",
      "args": ["computer-use-mcp"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}

That's it! uvx automatically downloads and installs the package from PyPI. Restart Claude Desktop and you're ready to go.

Note: You'll also need to install Playwright browsers once:

playwright install chromium

Method 2: Local Development

For development or contributing:

git clone https://github.com/vincenthopf/computer-use-mcp.git
cd computer-use-mcp
uv sync
playwright install chromium
cp .env.sample .env
# Edit .env with your GEMINI_API_KEY
uv run mcp dev server.py

Method 3: Direct Git Install

Install directly from GitHub without cloning:

{
  "mcpServers": {
    "computer-use": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/vincenthopf/computer-use-mcp.git", "computer-use-mcp"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}

Configuration

Create a .env file in the project root with the following variables:

Required Configuration

# Gemini API Configuration
GEMINI_API_KEY=your_api_key_here
GEMINI_MODEL=gemini-2.5-computer-use-preview-10-2025

Optional Configuration

# Browser Configuration
SCREEN_WIDTH=1440        # Recommended by Google (don't change)
SCREEN_HEIGHT=900        # Recommended by Google (don't change)
HEADLESS=false           # Set to 'true' for faster headless mode

# Output Configuration (optional - defaults to system temp directory)
# SCREENSHOT_OUTPUT_DIR=/custom/path/to/screenshots

Note: Screen resolution of 1440x900 is optimized for Gemini's Computer Use model. Other resolutions may degrade performance.

Usage

Synchronous Workflow

For quick tasks that complete in under 10 seconds, use the synchronous browse_web tool:

# Example: Quick product search
result = browse_web(
    task="Go to Amazon and find the top 3 gaming laptops with prices",
    url="https://www.amazon.com"
)

# Response includes full results and progress history
{
    "ok": true,
    "data": "Found top 3 gaming laptops...",
    "session_id": "20251017_143022_abc123",
    "screenshot_dir": "output_screenshots/20251017_143022_abc123",
    "progress": [
        {"timestamp": "...", "type": "info", "message": "Started browser automation"},
        {"timestamp": "...", "type": "turn", "message": "Turn 1/30"},
        {"timestamp": "...", "type": "function_call", "message": "Action: navigate"}
    ]
}

Asynchronous Workflow

For long-running tasks (15+ seconds), use the async workflow to monitor progress:

Step 1: Start the task

result = start_web_task(
    task="Research top 10 AI companies and compile details",
    url="https://www.google.com"
)
# Returns immediately with task_id
# {"task_id": "abc-123-def", "status": "running"}

Step 2: Check progress

# Wait 5 seconds
wait(5)

# Check status (compact mode - recommended)
status = check_web_task(task_id="abc-123-def", compact=true)

# Response shows recent progress summary
{
    "ok": true,
    "task_id": "abc-123-def",
    "status": "running",
    "progress_summary": {
        "total_steps": 12,
        "recent_actions": [
            "Turn 8/30",
            "Action: type_text_at",
            "Action: click_at"
        ]
    },
    "recommended_poll_after": "2025-01-17T14:30:15Z"
}

Step 3: Get final result

# Continue polling until status is "completed"
final = check_web_task(task_id="abc-123-def")

# When completed, result field contains data
{
    "ok": true,
    "status": "completed",
    "result": {
        "ok": true,
        "data": "Compiled research on 10 AI companies...",
        "session_id": "..."
    }
}

All 7 MCP Tools

1. browse_web

Synchronous web browsing that waits for task completion.

Parameters:

task (string, required): Natural language description of what to accomplish
url (string, optional): Starting URL (defaults to Google)

Returns:

ok (boolean): Success status
data (string): Task completion message with results
session_id (string): Unique session identifier
screenshot_dir (string): Path to saved screenshots
progress (array): Full progress history with timestamps
error (string): Error message if task failed

Example:

browse_web(
    task="Go to example.com and click the login button",
    url="https://example.com"
)

2. start_web_task

Start a long-running web task in the background.

Parameters:

task (string, required): Natural language task description
url (string, optional): Starting URL

Returns:

ok (boolean): Task started successfully
task_id (string): Unique ID for checking progress
status (string): Always "running" initially
message (string): Instructions for checking progress

Example:

start_web_task(
    task="Research and compare prices for iPhone 15 on 5 different sites"
)

3. check_web_task

Monitor progress of a background task.

Parameters:

task_id (string, required): Task ID from start_web_task
compact (boolean, optional): Return summary only (default: true)

Returns (compact mode):

ok (boolean): Success status
task_id (string): Task identifier
status (string): "pending" | "running" | "completed" | "failed" | "cancelled"
progress_summary (object): Recent actions and total steps
result (object): Task results (when completed)
error (string): Error message (when failed)
recommended_poll_after (string): ISO timestamp for next check

Example:

check_web_task(task_id="abc-123", compact=true)

4. wait

Pause execution for rate limiting and polling delays.

Parameters:

seconds (integer, required): Wait time in seconds (1-60)

Returns:

ok (boolean): Success status
waited_seconds (integer): Actual wait time
message (string): Confirmation message

Example:

wait(5)  # Wait 5 seconds between status checks

5. stop_web_task

Cancel a running background task.

Parameters:

task_id (string, required): Task ID to cancel

Returns:

ok (boolean): Cancellation success
message (string): Confirmation message
task_id (string): Cancelled task ID
error (string): Error if task not found

Example:

stop_web_task(task_id="abc-123")

6. list_web_tasks

View all tasks (active and completed).

Parameters: None

Returns:

ok (boolean): Success status
tasks (array): List of task status objects
count (integer): Total number of tasks
active_count (integer): Number of running tasks

Example:

list_web_tasks()

7. get_web_screenshots

Retrieve screenshots from a completed session.

Parameters:

session_id (string, required): Session ID from browse_web or check_web_task

Returns:

ok (boolean): Success status
screenshots (array): List of screenshot file paths
session_id (string): Session identifier
count (integer): Number of screenshots
error (string): Error if session not found

Example:

get_web_screenshots(session_id="20251017_143022_abc123")

Architecture

┌─────────────────────────────────────────┐
│    MCP Client (Claude Desktop, etc.)    │
└─────────────────┬───────────────────────┘
                  │ MCP Protocol (JSON-RPC)
┌─────────────────▼───────────────────────┐
│           FastMCP Server                │
│  7 Tools: browse_web, start_web_task,  │
│  check_web_task, wait, stop_web_task,  │
│  list_web_tasks, get_web_screenshots   │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────▼───────────────────────┐
│       BrowserTaskManager                │
│  Manages async task queue, status      │
│  tracking, and background threads       │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────▼───────────────────────┐
│      GeminiBrowserAgent                 │
│  Executes browser automation loop:     │
│  Screenshot → Gemini → Actions → Loop  │
└─────────────────┬───────────────────────┘
                  │
        ┌─────────┴─────────┐
        │                   │
┌───────▼────────┐  ┌──────▼────────┐
│  Gemini API    │  │  Playwright   │
│  (Vision+AI)   │  │  (Chromium)   │
└────────────────┘  └───────────────┘

How It Works

Task Submission: MCP client sends natural language task + optional URL
Browser Launch: Playwright launches Chromium (1440x900 viewport)
Gemini Loop: Screenshot → Gemini vision analysis → Browser actions → Screenshot (repeat)
Completion: Gemini returns text response when task is complete
Cleanup: Browser closed, screenshots saved to session directory

Browser Actions

The Gemini Computer Use API supports 13 browser automation actions:

Action	Description	Example Use
`navigate`	Go to a URL	Navigate to https://example.com
`click_at`	Click at normalized coordinates (0-999)	Click button at position (500, 300)
`hover_at`	Hover at normalized coordinates	Hover over menu item
`type_text_at`	Type text at coordinates	Type search query in input field
`key_combination`	Press keyboard combinations	Press Enter, Ctrl+A, etc.
`scroll_document`	Page-level scrolling	Scroll down one page
`scroll_at`	Scroll at specific coordinates	Scroll within specific element
`drag_and_drop`	Drag from source to destination	Reorder list items
`go_back`	Navigate backward in history	Return to previous page
`go_forward`	Navigate forward in history	Go to next page
`search`	Navigate to Google search	Start a Google search
`wait_5_seconds`	Wait 5 seconds	Wait for dynamic content
`open_web_browser`	No-op (browser already open)	-

Coordinate System: Gemini uses a normalized 1000x1000 grid. The agent automatically converts to actual pixels based on your screen resolution.

Performance

Benchmark Results

Task Type	Duration	Turns	Description
Simple	5-8 seconds	1-2	Navigate to a page, verify content
Medium	20-40 seconds	5-8	Search, click, extract information
Complex	60-120 seconds	15-30	Multi-step workflows, data compilation

Optimization Strategies

Before Optimization:

6 seconds wait after every action
networkidle page loads (waits for all network requests)
3 screenshots captured per turn
Sequential action execution

After Optimization:

0.3-3 seconds conditional waits (navigation only)
domcontentloaded page loads (DOM ready)
1 screenshot per turn (reused for all responses)
Parallel function execution when possible

Performance Improvement: 4-5x faster execution

Latency Breakdown (Per Turn)

Gemini API call: 2-4 seconds (vision processing)
Browser action: 0.3-3 seconds (optimized waits)
Screenshot capture: <0.5 seconds
Total per turn: ~3-8 seconds

Best Practices

Task Design

1. Be Specific and Clear

# ❌ Bad: Vague task
"Search for stuff"

# ✅ Good: Specific task
"Go to Amazon, search for 'wireless headphones', and find the top 3 results with prices"

2. Choose the Right Tool

# ❌ Bad: Async for simple tasks
start_web_task("Navigate to google.com")  # Requires 3+ tool calls

# ✅ Good: Sync for simple tasks
browse_web("Navigate to google.com")  # Single tool call

3. Use Appropriate Polling

# ❌ Bad: Poll too frequently
check_web_task(task_id)  # Called every 0.5 seconds

# ✅ Good: Poll every 3-5 seconds
wait(5)
check_web_task(task_id)

Error Handling

4. Handle All Status States

status = check_web_task(task_id)

if status["status"] == "completed":
    process_result(status["result"])
elif status["status"] == "failed":
    handle_error(status["error"])
elif status["status"] == "running":
    continue_polling()

5. Use Compact Mode

# ✅ Recommended: Compact mode (90% smaller)
check_web_task(task_id, compact=true)

# ⚠️ Only when needed: Full progress
check_web_task(task_id, compact=false)

Security

6. Review Screenshots Always check saved screenshots to verify agent behavior, especially for sensitive operations.

7. Environment Variables Never commit API keys. Always use environment variables or secure vaults.

8. Rate Limiting Use the wait tool to respect rate limits and avoid overwhelming services.

9. Domain Validation Be cautious with user-provided URLs. Consider implementing domain allowlists for production.

10. Logging and Audit All actions are logged with timestamps. Review logs for debugging and compliance.

Troubleshooting

Common Issues

Error: 400 INVALID_ARGUMENT (safety decision)

Solution: This is fixed in v1.0.0. Update to latest version.

Context window filling up too fast

Solution: Use compact=true in check_web_task (default behavior).

Tasks timing out at 30 turns

Solution: Break complex tasks into smaller subtasks or increase max_turns in browser_agent.py.

Browser not visible during execution

Solution: Set HEADLESS=false in .env to see the browser window.

"No module named 'mcp'" error

Solution: Activate virtual environment and run uv sync.

"GEMINI_API_KEY environment variable not set"

Solution: Create .env file with your API key or set it in Claude Desktop config.

"Executable doesn't exist" (Playwright)

Solution: Run playwright install chromium.

MCP server not showing in Claude Desktop

Solution: Verify absolute paths in config, ensure uv is in PATH, restart Claude Desktop completely.

FAQ

Q: Can I use both synchronous and asynchronous workflows? A: Yes! Use browse_web for quick tasks (<10s) and start_web_task for long tasks (>15s).

Q: What happens to old completed tasks? A: Tasks auto-cleanup after 24 hours to free memory.

Q: Can I check progress of a synchronous task? A: No, but the response includes full progress history after completion.

Q: How many tasks can run simultaneously? A: Unlimited. Each task runs in its own browser instance and thread.

Q: Does Claude automatically poll async tasks? A: No, Claude must manually call check_web_task() multiple times with wait() between calls.

Q: Can I cancel a task mid-execution? A: Yes, use stop_web_task(task_id) to cancel any running task.

Development

Setup for Contributors

1. Fork and clone:

git clone https://github.com/YOUR_USERNAME/gemini-web-automation-mcp.git
cd gemini-web-automation-mcp

2. Install dependencies:

uv sync
playwright install chromium

3. Create .env:

cp .env.sample .env
# Add your GEMINI_API_KEY

4. Test your setup:

uv run mcp dev server.py

Testing

Manual Testing:

# Start MCP Inspector
uv run mcp dev server.py

# Opens web interface at http://localhost:6274
# Test all tools interactively

Validation Tests:

# Run automated validation
uv run python3 test_server.py

Coding Standards

Follow PEP 8 style guide
Use type hints for all function signatures
Write docstrings for public functions/classes
Keep functions focused and small
Use meaningful variable names

Example Function:

async def check_web_task(task_id: str, compact: bool = True) -> dict[str, Any]:
    """
    Check progress of a background web browsing task.

    Args:
        task_id: Task ID from start_web_task()
        compact: Return summary only (default: True)

    Returns:
        Dictionary containing task status and progress

    Raises:
        ValueError: If task_id is invalid
    """
    # Implementation
    pass

Contributing

How to Contribute:

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes following coding standards
Test your changes with MCP Inspector
Update documentation if needed
Commit with clear message: git commit -m "Add: Brief description"
Push to your fork: git push origin feature/amazing-feature
Open a Pull Request

Commit Message Prefixes:

Add: - New features
Fix: - Bug fixes
Update: - Updates to existing features
Refactor: - Code refactoring
Docs: - Documentation changes
Test: - Adding or updating tests

Pull Request Checklist:

All tests pass
Documentation updated
Code follows style guidelines
No breaking changes (or clearly documented)
Commit messages are clear

Tool Design Principles

When adding or modifying MCP tools:

User-focused naming: browse_web not execute_browser_automation
Clear descriptions: Explain what users accomplish, not technical details
Action-oriented: Use verbs (check, start, stop, wait)
Proper validation: Validate inputs and provide helpful error messages
Consistent responses: Always return {"ok": bool, ...} format

Deployment

Pre-Deployment Checklist

Project Files:

[x] README.md with comprehensive documentation
[x] LICENSE (MIT)
[x] CHANGELOG.md
[x] pyproject.toml with proper metadata
[x] .gitignore with comprehensive rules
[x] .env.sample template
[x] Core files (server.py, browser_agent.py, task_manager.py)

Code Quality:

[x] All tests passing
[x] Safety decision bug fixed
[x] Compact progress mode implemented
[x] MCP best practices followed
[x] Performance optimizations applied

GitHub Release Process

1. Create GitHub Repository:

# On GitHub, create new repository "gemini-web-automation-mcp"
# Do NOT initialize with README (we have it)

2. Push to GitHub:

git remote add origin https://github.com/YOUR_USERNAME/gemini-web-automation-mcp.git
git branch -M main
git push -u origin main

3. Create Release Tag:

git tag -a v1.0.0 -m "Version 1.0.0 - Initial production release"
git push origin v1.0.0

4. Create GitHub Release:

Go to Releases → Create a new release
Choose tag: v1.0.0
Release title: v1.0.0 - Initial Production Release
Description: Copy from CHANGELOG.md
Publish release

Repository Settings

Configure:

Description: Production-ready MCP server for AI-powered web automation
Topics: mcp, gemini, browser-automation, claude-desktop, ai-agents, playwright
Enable Issues for bug reports
Enable Discussions for community support

Distribution Methods

Method 1: Direct Git Clone (Recommended)

git clone https://github.com/yourusername/gemini-web-automation-mcp.git
cd gemini-web-automation-mcp
uv sync
playwright install chromium

Method 2: UVX (Future - when published to PyPI)

uvx gemini-web-automation-mcp

Roadmap

Planned Features

Priority 1: Security Enhancements

Human-in-the-loop confirmation UI for safety decisions
Domain allowlist/blocklist for navigation
Input sanitization and validation
Container-based sandboxing for production

Priority 2: Reliability

Retry logic with exponential backoff
Better error recovery mechanisms
Network resilience improvements
Connection timeout handling

Priority 3: Functionality

Cookie and session management
Form auto-fill templates
Multi-tab support
Mobile viewport emulation

Priority 4: Developer Experience

Proxy support for corporate environments
Custom wait conditions
Advanced screenshot comparison
Performance profiling tools

Known Limitations

Maximum 30 turns per task (configurable but not recommended to increase)
Browser automation only (no desktop OS-level control)
Single browser instance per task (no tab switching)
Limited to Chromium (Firefox/WebKit not supported)
Safety confirmations not yet implemented (future enhancement)

Changelog

[1.0.0] - 2025-01-17

Initial production-ready release

Added:

7 MCP tools for comprehensive browser automation
Real-time progress tracking with compact mode (90% size reduction)
Safety decision framework (fixes 400 INVALID_ARGUMENT error)
Context-aware polling with recommended delay timestamps
Performance optimizations (4-5x faster than baseline)
Automatic screenshot capture at each step
Background task management with status tracking
Comprehensive documentation and examples

Performance:

Conditional wait states (0.3-3s vs 6s per action)
Fast page loads (domcontentloaded vs networkidle)
Single screenshot per turn (eliminates duplicates)
Parallel function execution for batch operations

Security:

Safety decision acknowledgment implementation
Environment-based API key configuration
Comprehensive logging for audit trails
Screenshot-based verification capability

Full changelog →

Security

Safety Decision Framework

This MCP server implements Gemini's safety decision framework to prevent:

Financial transactions without confirmation
Sensitive data access without review
System-level changes without approval
CAPTCHA bypassing attempts
Potentially harmful actions

Best Practices

Sandboxing: Run in containerized environment for production deployment
API Key Security: Use environment variables, never commit keys to version control
Rate Limiting: Respect built-in polling delays (recommended 5 seconds)
Human-in-Loop: Review outputs before taking actions based on results
Logging: All actions are logged with timestamps for audit trail
Screenshot Review: Check saved screenshots to verify agent behavior

Reporting Vulnerabilities

For security vulnerabilities, please email security@example.com with:

Description of the vulnerability
Steps to reproduce
Potential impact
Suggested fix (if available)

Do not open public issues for security vulnerabilities.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with excellence using:

Google Gemini Team - Gemini 2.5 Computer Use API
Anthropic - Model Context Protocol specification
FastMCP - Excellent MCP server framework
Playwright - Robust browser automation library

Support

Get Help

Documentation: You're reading it!
GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions and share ideas

Useful Links

Community

Share your use cases and automations
Contribute improvements and bug fixes
Help others in GitHub Discussions
Star the repository if you find it useful

Built with Gemini 2.5 Computer Use API

MCP Protocol 2025 | Production Ready | Open Source

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured

Gemini Web Automation MCP

README

Gemini Web Automation MCP

Table of Contents

Overview

Features

Core Capabilities

Technical Highlights

Quick Start

Prerequisites

Installation (Super Simple!)

Verification

Installation

Method 1: UVX (Recommended - One Command)

Method 2: Local Development

Method 3: Direct Git Install

Configuration

Required Configuration

Optional Configuration

Usage

Synchronous Workflow

Asynchronous Workflow

All 7 MCP Tools

1. browse_web

2. start_web_task

3. check_web_task

4. wait

5. stop_web_task

6. list_web_tasks

7. get_web_screenshots

Architecture

How It Works

Browser Actions

Performance

Benchmark Results

Optimization Strategies

Latency Breakdown (Per Turn)

Best Practices

Task Design

Error Handling

Security

Troubleshooting

Common Issues

FAQ

Development

Setup for Contributors

Testing

Coding Standards

Contributing

Tool Design Principles

Deployment

Pre-Deployment Checklist

GitHub Release Process

Repository Settings

Distribution Methods

Roadmap

Planned Features

Known Limitations

Changelog

[1.0.0] - 2025-01-17

Security

Safety Decision Framework

Best Practices

Reporting Vulnerabilities

License

Acknowledgments

Support

Get Help

Useful Links

Community

Recommended Servers