AgentKit Browser Automation

AgentKit for the playwright-mcp server

tmahesh

A browser automation framework built with AgentKit, featuring a multi-agent system for web navigation and task execution.

Overview

This project implements a multi-agent system for browser automation, where different agents work together to:

  • Plan and break down tasks
  • Navigate web pages
  • Execute browser actions
  • Validate results

Architecture (TODO)

The system consists of four specialized agents:

  1. Planning Agent
    • Breaks down tasks into actionable steps
    • Creates detailed execution plans
    • Determines task completion criteria

  2. Navigator Agent
    • Determines the next actions to take
    • Manages state transitions
    • Handles action execution
    • Provides detailed logging and feedback

  3. Browser Agent
    • Executes browser automation actions
    • Interacts with web elements
    • Handles page navigation
    • Manages browser state

  4. Validation Agent
    • Validates task completion
    • Verifies results
    • Handles error cases
    • Provides feedback on success/failure
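
The hand-off between the four agents can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: every type and function name here (`Step`, `Action`, `BrowserState`, `runTask`, `stubAgents`) is hypothetical, and the agents are synchronous stubs. In the real system each agent would be an asynchronous LLM-backed AgentKit agent, with browser actions sent to playwright-mcp.

```typescript
// Hypothetical types for illustration; none of these names come from the project.
type Step = { description: string };
type Action = { kind: "goto" | "click"; target: string };
type BrowserState = { url: string; log: string[] };
type Verdict = { ok: boolean; reason: string };

interface Agents {
  planner: { plan(task: string): Step[] };
  navigator: { nextAction(step: Step, state: BrowserState): Action };
  browser: { execute(action: Action, state: BrowserState): BrowserState };
  validator: { validate(task: string, state: BrowserState): Verdict };
}

// Plan once, then let the navigator pick an action for each step,
// have the browser agent apply it, and validate the final state.
function runTask(
  task: string,
  agents: Agents,
  initial: BrowserState,
): { state: BrowserState; verdict: Verdict } {
  let state = initial;
  for (const step of agents.planner.plan(task)) {
    const action = agents.navigator.nextAction(step, state);
    state = agents.browser.execute(action, state);
  }
  return { state, verdict: agents.validator.validate(task, state) };
}

// Stub agents standing in for LLM-backed ones (assumption: simple string rules).
const stubAgents: Agents = {
  planner: {
    plan: (task) => task.split(" then ").map((description) => ({ description })),
  },
  navigator: {
    nextAction: (step) =>
      step.description.startsWith("open ")
        ? { kind: "goto", target: step.description.slice("open ".length) }
        : { kind: "click", target: step.description.replace(/^click /, "") },
  },
  browser: {
    // Only "goto" changes the URL; every action is appended to the log.
    execute: (action, state) => ({
      url: action.kind === "goto" ? action.target : state.url,
      log: [...state.log, `${action.kind} ${action.target}`],
    }),
  },
  validator: {
    validate: (_task, state) =>
      state.log.length > 0
        ? { ok: true, reason: `${state.log.length} action(s) executed` }
        : { ok: false, reason: "no actions were executed" },
  },
};

const demo = runTask(
  "open https://example.com then click Login",
  stubAgents,
  { url: "about:blank", log: [] },
);
console.log(demo.state.url, demo.verdict.reason);
```

Keeping each agent behind a small interface means a stub can be swapped for a model-backed implementation without touching the loop.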

Features

  • Intelligent Task Planning: Breaks down complex tasks into manageable steps
  • State Management: Tracks browser state and action results
  • Error Handling: Robust error handling and recovery mechanisms
  • Event System: Comprehensive event logging and monitoring
  • Flexible Action System: Extensible action registry for custom behaviors
  • Validation Framework: Built-in validation for task completion
  • Memory Management: Maintains context and history of actions
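
As a rough illustration of how an extensible action registry and an event log might fit together, here is a small sketch. The names `ActionRegistry`, `ActionHandler`, and `AgentEvent` are assumptions made for this example and are not taken from the project's source. Custom behaviors are registered by name, and each run appends a success or error event instead of throwing.

```typescript
// Hypothetical names for illustration only.
type ActionHandler = (args: Record<string, string>) => string;
type AgentEvent = {
  type: "action:ok" | "action:error";
  name: string;
  detail: string;
};

class ActionRegistry {
  private handlers = new Map<string, ActionHandler>();
  // Append-only history of what ran and how it ended.
  readonly events: AgentEvent[] = [];

  register(name: string, handler: ActionHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`action "${name}" is already registered`);
    }
    this.handlers.set(name, handler);
  }

  // Runs an action by name; failures are recorded as events and yield undefined.
  run(name: string, args: Record<string, string> = {}): string | undefined {
    try {
      const handler = this.handlers.get(name);
      if (!handler) throw new Error(`unknown action "${name}"`);
      const detail = handler(args);
      this.events.push({ type: "action:ok", name, detail });
      return detail;
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      this.events.push({ type: "action:error", name, detail });
      return undefined;
    }
  }
}

const registry = new ActionRegistry();
registry.register("goto", (args) => `navigated to ${args["url"]}`);
registry.run("goto", { url: "https://example.com" });
registry.run("scroll"); // not registered, so an error event is recorded
console.log(registry.events);
```

Recording failures as events rather than letting them propagate is one possible design. It lets a validation step inspect the history and decide whether to retry or give up.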

Getting Started

Prerequisites

  • Node.js (v14 or higher)
  • npm or yarn
  • OpenAI API key (for GPT models)

Installation

  1. Clone the repository:
git clone https://github.com/tmahesh/playwright-agent.git
cd playwright-agent
  2. Install dependencies:
npm install
  3. Set up environment variables:
cp .env.sample .env
# Edit .env with your OpenAI API key and other configurations
  4. Run each of the following commands in its own terminal to start playwright-mcp, index.ts, and inngest-cli:
npx @playwright/mcp@latest --port 8931

npx tsx index.ts

npx inngest-cli@latest dev --no-discovery -u http://localhost:3000/api/inngest -v
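
The three processes have to agree on two local addresses: port 8931 for playwright-mcp and http://localhost:3000/api/inngest for the endpoint served by index.ts. A startup check along these lines can catch a missing key or a mistyped URL early. This is only a sketch: the variable names `OPENAI_API_KEY`, `PLAYWRIGHT_MCP_URL`, and `INNGEST_SERVE_URL` are assumptions (the contents of `.env.sample` are not shown here), and `loadConfig` is a hypothetical helper. The exact path that playwright-mcp exposes under its port depends on its version, so only the base URL is handled.

```typescript
// Hypothetical config shape; the variable names below are assumptions.
type RuntimeConfig = {
  openAiApiKey: string;
  mcpBaseUrl: string;
  inngestServeUrl: string;
};

function loadConfig(env: Record<string, string | undefined>): RuntimeConfig {
  const openAiApiKey = env["OPENAI_API_KEY"]?.trim();
  if (!openAiApiKey) {
    throw new Error("OPENAI_API_KEY is not set; add it to .env");
  }
  // Defaults mirror the ports used in the commands above.
  // new URL() throws a TypeError if a value is not a valid absolute URL.
  const mcp = new URL(env["PLAYWRIGHT_MCP_URL"] ?? "http://localhost:8931");
  const inngest = new URL(
    env["INNGEST_SERVE_URL"] ?? "http://localhost:3000/api/inngest",
  );
  return {
    openAiApiKey,
    mcpBaseUrl: mcp.origin,
    inngestServeUrl: inngest.href,
  };
}

// Example with a placeholder key; a real entry point would pass process.env.
const config = loadConfig({ OPENAI_API_KEY: "sk-example" });
console.log(config.mcpBaseUrl, config.inngestServeUrl);
```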

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

Acknowledgments
