Safari MCP Server

Enables visual web access and browser automation through Safari, with tools for navigation, screenshots, element interaction, and tab management.

License: BSD 3-Clause

An MCP (Model Context Protocol) server for visual web access through Safari.

Features

  • Visual Web Access: Viewport screenshots for visual inspection
  • Full Authentication State: Operates on the user's actual Safari session with cookies preserved
  • Tab-Aware Targeting: Act operations target a captured working tab; observe operations follow the user's focus
  • Viewport Scrolling: Pixel amount or viewport-page index
  • Element Inspection: Tag, visibility, attributes, bounding rect for any CSS selector
  • Interaction Tools: Click, type, hover, select option
  • Wait Conditions: Selector to appear, disappear, or page text to render
  • Link Extraction: Anchor links as { text, href } pairs
  • Browser History: Back and forward navigation
  • Console Error Capture: Two-phase injection during and after page load

Prerequisites

  • macOS → System Settings → Desktop & Dock → Windows → "Prefer tabs when opening documents" option must be set to Always
  • macOS → System Settings → Privacy & Security → Screen & System Audio Recording → Terminal app must be enabled
  • Safari → Settings → Developer → Automation → "Allow JavaScript from Apple Events" option must be enabled

MCP Server Configuration

Add the server to the mcpServers section of mcp.json:

{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "@axivo/mcp-safari"],
      "env": {
        "SAFARI_WINDOW_HEIGHT": "1600"
      }
    }
  }
}

Environment Variables

All variables are optional:

  • SAFARI_PAGE_TIMEOUT - Page load and selector wait timeout, in milliseconds (default: 10000)
  • SAFARI_WINDOW_BOUNDS - Browser window margin offset from top-left corner, in pixels (default: 20)
  • SAFARI_WINDOW_HEIGHT - Browser window height, in pixels (default: 1024)
  • SAFARI_WINDOW_WIDTH - Browser window width, in pixels (default: 1280)
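
The defaults above can be illustrated with a minimal sketch of how such variables might be resolved. This is not the server's actual code: the `WindowConfig` type and the `readNumber` and `readWindowConfig` helpers are hypothetical names, and falling back to the default on an invalid value is an assumption. Only the variable names and default values come from this README.

```typescript
// Hypothetical sketch: only the SAFARI_* names and defaults come from the README.
interface WindowConfig {
  pageTimeout: number;   // milliseconds
  windowBounds: number;  // pixels
  windowHeight: number;  // pixels
  windowWidth: number;   // pixels
}

type Env = Record<string, string | undefined>;

// Parse a non-negative number from the environment, or return the fallback.
// Falling back on invalid input is an assumption, not documented behavior.
function readNumber(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value >= 0 ? value : fallback;
}

function readWindowConfig(env: Env): WindowConfig {
  return {
    pageTimeout: readNumber(env, "SAFARI_PAGE_TIMEOUT", 10000),
    windowBounds: readNumber(env, "SAFARI_WINDOW_BOUNDS", 20),
    windowHeight: readNumber(env, "SAFARI_WINDOW_HEIGHT", 1024),
    windowWidth: readNumber(env, "SAFARI_WINDOW_WIDTH", 1280),
  };
}

// Mirrors the configuration example above, where only the height is overridden.
const config = readWindowConfig({ SAFARI_WINDOW_HEIGHT: "1600" });
```

With the example configuration, only `windowHeight` differs from the defaults; the other three fields keep the values listed above.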

Prompt Examples

  • "Open Safari and use status tool for guidelines"
  • "Navigate to example.com"
  • "Search for example query"
  • "Take a screenshot of the current page"
  • "Read the page content to understand what's on the page"
  • "Click the 'Sign In' button"
  • "Type my email into the login form and submit"
  • "Refresh the page to see the latest changes"
  • "Go back to the previous page"
  • "Navigate forward two steps in browser history"
  • "Scroll down 500 pixels"
  • "Scroll to page 3 of this article"
  • "Search for 'Claude AI' and click the first result"
  • "List all open browser tabs"
  • "Open a new browser tab and go to example.com"
  • "Switch to the first browser tab"
  • "Close the second browser tab"
  • "Inspect the submit button before clicking it"
  • "Hover over the Products menu"
  • "Choose 'Canada' in the country dropdown"
  • "Wait for the loading spinner to disappear"
  • "Read all links on this page"

Note: The "use status tool" instruction helps Claude pause and process the _meta.usage guidelines before interacting with the browser.

MCP Tools

Call status first at session start to get the runtime state and the full tool surface. Tools fall into two types:

  • Act tools target a captured working tab
  • Observe tools target the front window's current tab
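
As a rough illustration of the "status first" ordering, the sketch below builds the JSON-RPC 2.0 `tools/call` messages an MCP client would send. In practice an MCP client or SDK constructs these messages. The `toolCall` helper and the `ToolCallRequest` type are hypothetical names used only for this example, while the tool and argument names are taken from the list that follows.

```typescript
// Shape of an MCP tool invocation over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Observe tool: call first to learn the open tabs and the tool surface.
const statusCall = toolCall("status");
// Act tool: operates on the captured working tab.
const navigateCall = toolCall("navigate", { url: "https://example.com" });
// Observe tool: reads from the front window's current tab.
const readCall = toolCall("read", { mode: "links" });

const session: ToolCallRequest[] = [statusCall, navigateCall, readCall];
```

The sketch only shows payload shape and ordering; it does not open a transport or send anything to the server.
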
  1. click

    • Click an element on the working tab
    • Type: act tool
    • Optional inputs:
      • key (string): Key to press (e.g., Escape, ArrowRight, Enter, Tab)
      • selector (string): CSS selector for the target element
      • text (string): Visible text or aria-label to match
      • wait (string): CSS selector to wait for after click
      • x (number): X coordinate in pixels
      • y (number): Y coordinate in pixels
    • Returns: Result with change detection
  2. close

    • Close the working tab
    • Type: act tool
  3. execute

    • Execute JavaScript in the working tab
    • Type: act tool
    • Required inputs:
      • script (string): JavaScript code
  4. hover

    • Dispatch hover events to reveal hover-triggered UI
    • Type: act tool
    • Optional inputs (one is required):
      • selector (string): CSS selector for the target element
      • text (string): Visible text to match
  5. inspect

    • Return element metadata for a CSS selector
    • Type: observe tool
    • Required inputs:
      • selector (string): CSS selector for the target element
    • Optional inputs:
      • index (number): Tab index in the front window
    • Returns: { found, tag, text, visible, disabled, attributes, rect }
  6. navigate

    • Navigate the working tab to a URL or through history
    • Type: act tool
    • Optional inputs (url or direction required):
      • direction (string: back or forward)
      • selector (string): CSS selector to wait for after load
      • steps (number, default: 1): Steps for history navigation
      • url (string): URL to navigate to
  7. open

    • Open a blank tab as the working target
    • Type: act tool
  8. read

    • Get page title, URL, and text or links from a tab
    • Type: observe tool
    • Optional inputs:
      • index (number): Tab index in the front window
      • mode (string: text or links, default: text)
      • selector (string): CSS selector to scope extraction
  9. refresh

    • Refresh the working tab
    • Type: act tool
    • Optional inputs:
      • hard (boolean, default: false): Bypass cache
      • selector (string): CSS selector to wait for after reload
  10. screenshot

    • Capture the Safari window, an element, the full page, or the screen
    • Type: observe tool
    • Optional inputs:
      • display (number): Display index for screen mode, 1-based, defaults to the main display
      • mode (string: element, page, screen, window, default: window): Capture mode
      • selector (string): CSS selector for element mode
      • settle (number): For page mode, milliseconds to wait after each scroll for content to settle (default: 500). Raise for slow dynamic sites, lower for static sites.
      • share (boolean, default: false): Save to disk and return only the file path instead of the inline image
    • Returns: Inline base64 image when share is false, or { path, width, height, mimeType, ... } when share is true. Browser metadata { innerHeight, scrollHeight, pages } is included for non-screen modes.
  11. scroll

    • Scroll by direction or to a viewport-page index
    • Type: observe tool
    • Optional inputs:
      • direction (string: up or down)
      • page (number): Viewport-page index to scroll to
      • pixels (number): Pixels to scroll, paired with direction
  12. search

    • Search using the browser's default engine
    • Type: act tool
    • Required inputs:
      • text (string): Search query
  13. select

    • Choose an option in a <select> element
    • Type: act tool
    • Required inputs:
      • selector (string): CSS selector for the <select>
    • Optional inputs (one is required):
      • text (string): Option visible text
      • value (string): Option value attribute
  14. status

    • Return current Safari tabs and full tool surface
    • Type: observe tool
    • Returns: { tabs, tools }
  15. type

    • Type text into an input field
    • Type: act tool
    • Required inputs:
      • text (string): Text to type
    • Optional inputs:
      • append (boolean, default: false): Append instead of replace
      • selector (string): CSS selector for the input
      • submit (boolean, default: false): Press Enter after typing
  16. wait

    • Wait for a selector or page-text condition
    • Type: observe tool
    • Optional inputs (exactly one of the first three required):
      • selector (string): CSS selector to wait for
      • selectorGone (string): CSS selector to wait for until it is absent
      • text (string): Page text to wait for
      • timeoutMs (number): Timeout in milliseconds
    • Returns: { matched, elapsedMs }
  17. window

    • Manage browser window tabs
    • Type: observe tool
    • Required inputs:
      • action (string: close, list, open, switch)
    • Optional inputs:
      • index (number): Tab index for close and switch
      • url (string): URL for open
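
Several tools above accept inputs that are individually optional but constrained as a group (for example, wait takes exactly one of selector, selectorGone, or text). A client could check those constraints before sending a call, as in the sketch below. The `checkArgs` and `countPresent` helpers are hypothetical, and reading "one is required" as "at least one" for hover, navigate, and select is an assumption; the server's own validation may differ.

```typescript
type Args = Record<string, unknown>;

// Count how many of the given keys are present (not undefined) in args.
function countPresent(args: Args, keys: string[]): number {
  return keys.filter((key) => args[key] !== undefined).length;
}

// Returns null when the arguments satisfy the documented constraint,
// otherwise a short message describing the problem.
function checkArgs(tool: string, args: Args): string | null {
  switch (tool) {
    case "wait":
      return countPresent(args, ["selector", "selectorGone", "text"]) === 1
        ? null
        : "wait needs exactly one of selector, selectorGone, text";
    case "navigate":
      return countPresent(args, ["url", "direction"]) >= 1
        ? null
        : "navigate needs url or direction";
    case "hover":
      return countPresent(args, ["selector", "text"]) >= 1
        ? null
        : "hover needs selector or text";
    case "select":
      if (typeof args["selector"] !== "string") return "select needs selector";
      return countPresent(args, ["text", "value"]) >= 1
        ? null
        : "select needs text or value";
    default:
      // Tools without group constraints are not checked in this sketch.
      return null;
  }
}

const okWait = checkArgs("wait", { selectorGone: ".spinner" });
const badWait = checkArgs("wait", { selector: "#a", text: "Done" });
```

Here `okWait` is `null` because exactly one condition is given, while `badWait` carries an error message because two conditions are supplied at once.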
