Safari MCP Server
Enables visual web access and browser automation through Safari, with tools for navigation, screenshots, element interaction, and tab management.
An MCP (Model Context Protocol) server for visual web access through Safari.
Features
- Visual Web Access: Viewport screenshots for visual inspection
- Full Authentication State: Operates on the user's actual Safari session with cookies preserved
- Tab-Aware Targeting: Act operations target a captured working tab; observe operations follow the user's focus
- Viewport Scrolling: Pixel amount or viewport-page index
- Element Inspection: Tag, visibility, attributes, bounding rect for any CSS selector
- Interaction Tools: Click, type, hover, select option
- Wait Conditions: Selector to appear, disappear, or page text to render
- Link Extraction: Anchor links as { text, href } pairs
- Browser History: Back and forward navigation
- Console Error Capture: Two-phase injection during and after page load
Prerequisites
- macOS → System Settings → Desktop & Dock → Windows → "Prefer tabs when opening documents" option must be set to Always
- macOS → System Settings → Privacy & Security → Screen & System Audio Recording → Terminal app must be enabled
- Safari → Settings → Developer → Automation → "Allow JavaScript from Apple Events" option must be enabled
MCP Server Configuration
Add to mcp.json servers configuration:
{
  "mcpServers": {
    "safari": {
      "command": "npx",
      "args": ["-y", "@axivo/mcp-safari"],
      "env": {
        "SAFARI_WINDOW_HEIGHT": "1600"
      }
    }
  }
}
Environment Variables
All variables are optional:
- SAFARI_PAGE_TIMEOUT - Page load and selector wait timeout, in milliseconds (default: 10000)
- SAFARI_WINDOW_BOUNDS - Browser window margin offset from top-left corner, in pixels (default: 20)
- SAFARI_WINDOW_HEIGHT - Browser window height, in pixels (default: 1024)
- SAFARI_WINDOW_WIDTH - Browser window width, in pixels (default: 1280)
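To make the defaults concrete, here is a minimal TypeScript sketch of how the four variables could resolve to a window configuration. Only the variable names and default values come from the list above. The SafariConfig type, the resolveSafariConfig and parseNonNegativeInt functions, and the fallback on malformed values are assumptions for illustration, not the server's actual implementation.

```typescript
// Hypothetical shape for the resolved settings; not the server's actual type.
interface SafariConfig {
  pageTimeout: number;  // milliseconds
  windowBounds: number; // pixels
  windowHeight: number; // pixels
  windowWidth: number;  // pixels
}

// Documented defaults from the environment variable list.
const DEFAULTS: SafariConfig = {
  pageTimeout: 10000,
  windowBounds: 20,
  windowHeight: 1024,
  windowWidth: 1280,
};

// Parse a non-negative integer, falling back to the default when the
// variable is unset or malformed (this fallback behavior is an assumption).
function parseNonNegativeInt(value: string | undefined, fallback: number): number {
  if (value === undefined) return fallback;
  const parsed = parseInt(value, 10);
  return !isNaN(parsed) && parsed >= 0 ? parsed : fallback;
}

// Accepts any string map so it can be exercised without a real environment.
function resolveSafariConfig(env: Record<string, string | undefined>): SafariConfig {
  return {
    pageTimeout: parseNonNegativeInt(env["SAFARI_PAGE_TIMEOUT"], DEFAULTS.pageTimeout),
    windowBounds: parseNonNegativeInt(env["SAFARI_WINDOW_BOUNDS"], DEFAULTS.windowBounds),
    windowHeight: parseNonNegativeInt(env["SAFARI_WINDOW_HEIGHT"], DEFAULTS.windowHeight),
    windowWidth: parseNonNegativeInt(env["SAFARI_WINDOW_WIDTH"], DEFAULTS.windowWidth),
  };
}

// Mirrors the configuration example: only the height is overridden.
const exampleConfig = resolveSafariConfig({ SAFARI_WINDOW_HEIGHT: "1600" });
```

With the configuration shown earlier, exampleConfig carries a height of 1600 and the documented defaults for the other three settings. In Node.js, the same function could be called with process.env.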
Prompt Examples
- "Open Safari and use status tool for guidelines"
- "Navigate to example.com"
- "Search for example query"
- "Take a screenshot of the current page"
- "Read the page content to understand what's on the page"
- "Click the 'Sign In' button"
- "Type my email into the login form and submit"
- "Refresh the page to see the latest changes"
- "Go back to the previous page"
- "Navigate forward two steps in browser history"
- "Scroll down 500 pixels"
- "Scroll to page 3 of this article"
- "Search for 'Claude AI' and click the first result"
- "List all open browser tabs"
- "Open a new browser tab and go to example.com"
- "Switch to the first browser tab"
- "Close the second browser tab"
- "Inspect the submit button before clicking it"
- "Hover over the Products menu"
- "Choose 'Canada' in the country dropdown"
- "Wait for the loading spinner to disappear"
- "Read all links on this page"
[!NOTE]
The "use status tool" instruction helps Claude pause and process the _meta.usage guidelines before interacting with the browser.
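As a rough illustration of what a prompt turns into, the sketch below builds the MCP tools/call requests a client might send for "Type my email into the login form and submit". The JSON-RPC envelope follows the Model Context Protocol, and the tool names and inputs are the ones documented under MCP Tools. The toolCall helper, the CSS selector, the email address, and the particular sequence of calls are hypothetical placeholders; the model decides the real sequence at run time.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its inputs in a tools/call request.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Placeholder selector and address; a real page would need its own values.
const emailField = "input[type='email']";

const requests: ToolCallRequest[] = [
  // Fetch runtime state and usage guidelines first.
  toolCall("status"),
  // Confirm the field exists and is visible before acting on it.
  toolCall("inspect", { selector: emailField }),
  // Replace the field content and press Enter afterwards.
  toolCall("type", { selector: emailField, text: "user@example.com", submit: true }),
  // Capture the window to verify the result visually.
  toolCall("screenshot"),
];

// Each request serializes to one JSON-RPC message.
const wireMessages: string[] = requests.map((request) => JSON.stringify(request));
```

Starting with status matches the guidance in the note above, and the inspect call before type follows the "Inspect the submit button before clicking it" pattern from the prompt list.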
MCP Tools
Call status first at session start to get the runtime state and full tool surface:
- Act tools target a captured working tab
- Observe tools target the front window's current tab
- click - Click an element on the working tab
  - Type: act tool
  - Optional inputs:
    - key (string): Key to press (e.g., Escape, ArrowRight, Enter, Tab)
    - selector (string): CSS selector for the target element
    - text (string): Visible text or aria-label to match
    - wait (string): CSS selector to wait for after click
    - x (number): X coordinate in pixels
    - y (number): Y coordinate in pixels
  - Returns: Result with change detection
- close - Close the working tab
  - Type: act tool
- execute - Execute JavaScript in the working tab
  - Type: act tool
  - Required inputs:
    - script (string): JavaScript code
- hover - Dispatch hover events to reveal hover-triggered UI
  - Type: act tool
  - Optional inputs (one is required):
    - selector (string): CSS selector for the target element
    - text (string): Visible text to match
- inspect - Return element metadata for a CSS selector
  - Type: observe tool
  - Required inputs:
    - selector (string): CSS selector for the target element
  - Optional inputs:
    - index (number): Tab index in the front window
  - Returns: { found, tag, text, visible, disabled, attributes, rect }
- navigate - Navigate the working tab to a URL or through history
  - Type: act tool
  - Optional inputs (url or direction required):
    - direction (string: back or forward)
    - selector (string): CSS selector to wait for after load
    - steps (number, default: 1): Steps for history navigation
    - url (string): URL to navigate to
- open - Open a blank tab as the working target
  - Type: act tool
- read - Get page title, URL, and text or links from a tab
  - Type: observe tool
  - Optional inputs:
    - index (number): Tab index in the front window
    - mode (string: text or links, default: text)
    - selector (string): CSS selector to scope extraction
- refresh - Refresh the working tab
  - Type: act tool
  - Optional inputs:
    - hard (boolean, default: false): Bypass cache
    - selector (string): CSS selector to wait for after reload
- screenshot - Capture the Safari window, an element, the full page, or the screen
  - Type: observe tool
  - Optional inputs:
    - display (number): Display index for screen mode, 1-based, defaults to the main display
    - mode (string: element, page, screen, window, default: window): Capture mode
    - selector (string): CSS selector for element mode
    - settle (number): For page mode, milliseconds to wait after each scroll for content to settle (default: 500). Raise for slow dynamic sites, lower for static sites.
    - share (boolean, default: false): Save to disk and return only the file path instead of the inline image
  - Returns: Inline base64 image when share is false, or { path, width, height, mimeType, ... } when share is true. Browser metadata { innerHeight, scrollHeight, pages } is included for non-screen modes.
- scroll - Scroll by direction or to a viewport-page index
  - Type: observe tool
  - Optional inputs:
    - direction (string: up or down)
    - page (number): Viewport-page index to scroll to
    - pixels (number): Pixels to scroll, paired with direction
- search - Search using the browser's default engine
  - Type: act tool
  - Required inputs:
    - text (string): Search query
- select - Choose an option in a <select> element
  - Type: act tool
  - Required inputs:
    - selector (string): CSS selector for the <select>
  - Optional inputs (one is required):
    - text (string): Option visible text
    - value (string): Option value attribute
- status - Return current Safari tabs and full tool surface
  - Type: observe tool
  - Returns: { tabs, tools }
- type - Type text into an input field
  - Type: act tool
  - Required inputs:
    - text (string): Text to type
  - Optional inputs:
    - append (boolean, default: false): Append instead of replace
    - selector (string): CSS selector for the input
    - submit (boolean, default: false): Press Enter after typing
- wait - Wait for a selector or page text condition
  - Type: observe tool
  - Optional inputs (exactly one of the first three required):
    - selector (string): CSS selector to wait for
    - selectorGone (string): CSS selector to wait for until it is absent
    - text (string): Page text to wait for
    - timeoutMs (number): Timeout in milliseconds
  - Returns: { matched, elapsedMs }
- window - Manage browser window tabs
  - Type: observe tool
  - Required inputs:
    - action (string: close, list, open, switch)
  - Optional inputs:
    - index (number): Tab index for close and switch
    - url (string): URL for open
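Several tools above mark a subset of their optional inputs as mandatory: hover and select need one of their optional inputs, navigate needs url or direction, and wait needs exactly one of selector, selectorGone, or text. A client can check these rules before sending a call. The TypeScript sketch below is one hypothetical way to do that; RULES, countProvided, and checkToolInputs are illustrative names rather than part of the server, and reading "one is required" as "at least one" is an assumption.

```typescript
type Inputs = Record<string, unknown>;

interface OneOfRule {
  group: string[]; // inputs of which one must be provided
  exact: boolean;  // true when more than one is also an error
}

// Groups taken from the tool list; only wait is documented as "exactly one".
const RULES: Record<string, OneOfRule> = {
  hover: { group: ["selector", "text"], exact: false },
  navigate: { group: ["url", "direction"], exact: false },
  select: { group: ["text", "value"], exact: false },
  wait: { group: ["selector", "selectorGone", "text"], exact: true },
};

// Count how many of the given keys carry a value.
function countProvided(inputs: Inputs, keys: string[]): number {
  return keys.filter((key) => inputs[key] !== undefined && inputs[key] !== null).length;
}

// Returns an error message, or null when the inputs satisfy the rule.
function checkToolInputs(tool: string, inputs: Inputs): string | null {
  if (!Object.prototype.hasOwnProperty.call(RULES, tool)) {
    return null; // no "one of" constraint documented for this tool
  }
  const rule = RULES[tool];
  const provided = countProvided(inputs, rule.group);
  if (provided === 0) {
    return tool + ": one of " + rule.group.join(", ") + " is required";
  }
  if (rule.exact && provided > 1) {
    return tool + ": set exactly one of " + rule.group.join(", ");
  }
  return null;
}
```

For example, checkToolInputs("wait", { selector: ".spinner", text: "Done" }) returns an error string because two of the three conditions are set, while checkToolInputs("wait", { selectorGone: ".spinner", timeoutMs: 5000 }) returns null.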