ios-agent-driver

An MCP server that lets an AI agent drive the iOS Simulator in a loop — so an agent can actually use your app: tap, type, swipe, read the screen, and verify what happened.

It bridges the gap between iOS development and agentic testing. The primitives to control a simulator exist (xcrun simctl, Meta's idb), but nothing packages them into tools an agent can call to close the perceive → decide → act → observe loop. This does.

  • Accessibility-tree-first perception. The agent reasons over labeled UI elements (describe_ui) and taps by label, not by guessing pixel coordinates — far more robust to layout changes.
  • Screenshot fallback. For custom-drawn views that don't expose accessibility, screenshot gives a vision fallback and a way to verify state.
  • Loud failures. A tap on a missing label returns the nearest labels on screen, not a silent no-op.
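The label-first tap and its "loud failure" path can be sketched as follows. This is an illustration only, not the server's actual implementation: the `UiElement` shape, `resolveTap`, and the use of edit distance to rank suggestions are all assumptions, and the real describe_ui JSON may look different.

```typescript
// Hypothetical element shape; the real describe_ui output may differ.
interface UiElement {
  label: string;
  frame: { x: number; y: number; width: number; height: number };
}

type TapResult =
  | { ok: true; x: number; y: number }
  | { ok: false; error: string; nearest: string[] };

// Levenshtein edit distance, computed with a single rolling row.
function editDistance(a: string, b: string): number {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let prevDiag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(
        above + 1, // deletion
        row[j - 1] + 1, // insertion
        prevDiag + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      );
      prevDiag = above;
    }
  }
  return row[b.length];
}

// Resolve a label to the center of its frame, or fail loudly with
// the k closest labels currently on screen.
function resolveTap(elements: UiElement[], label: string, k = 3): TapResult {
  const hit = elements.find((e) => e.label === label);
  if (hit) {
    const { x, y, width, height } = hit.frame;
    return { ok: true, x: x + width / 2, y: y + height / 2 };
  }
  const target = label.toLowerCase();
  const nearest = [...elements]
    .sort(
      (p, q) =>
        editDistance(p.label.toLowerCase(), target) -
        editDistance(q.label.toLowerCase(), target)
    )
    .slice(0, k)
    .map((e) => e.label);
  return { ok: false, error: `No element labeled "${label}"`, nearest };
}
```

Returning the candidate labels with the error gives the agent something to act on in its next step, instead of leaving it to retry blindly.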

How it works

Agent (Claude / any MCP client)
   goal: "log a leg workout, confirm it appears in History"
        observe → decide → act → observe  (loop)
        │  MCP (stdio)
   ios-agent-driver
        │                         │
   xcrun simctl              idb (+ companion)
   lifecycle, screenshots    accessibility tree,
   deeplinks, permissions    tap / type / swipe by element

Requirements

  • macOS with Xcode (provides xcrun simctl)
  • idb for UI perception and actions:
    brew tap facebook/fb && brew trust facebook/fb
    brew install facebook/fb/idb-companion   # source build — needs current Xcode Command Line Tools
    pip3 install fb-idb                       # the `idb` CLI; use pipx/venv if pip is externally-managed
    idb list-targets                          # confirm it sees your booted sim
    
    If the companion build errors with “Command Line Tools are too outdated”, update them (System Settings › Software Update, or xcode-select --install). Lifecycle tools work without idb; describe_ui / tap / type_text / swipe require it and will tell you how to install it if it's missing.
  • Node.js ≥ 18

Install

git clone https://github.com/CodeJonesW/ios-agent-driver.git
cd ios-agent-driver
npm install      # builds via the prepare script

Register with Claude Code

Add to your MCP config (user-level ~/.claude.json, or a project .mcp.json):

{
  "mcpServers": {
    "ios-agent-driver": {
      "command": "node",
      "args": ["/absolute/path/to/ios-agent-driver/dist/server.js"]
    }
  }
}

Or with the Claude Code CLI:

claude mcp add ios-agent-driver -- node /absolute/path/to/ios-agent-driver/dist/server.js

Tools

Tool            Backend  Purpose
list_sims       simctl   List devices (udid, name, state, runtime).
boot_sim        simctl   Boot a sim (defaults to booted, else first iPhone).
install_app     simctl   Install a built .app bundle.
launch          simctl   Launch an app by bundle id.
terminate       simctl   Terminate a running app.
reset_app       simctl   Uninstall + reinstall for a clean state.
deeplink        simctl   Open a URL / universal link.
set_permission  simctl   Grant/revoke/reset a privacy permission.
describe_ui     idb      Primary perception: accessibility tree as JSON.
screenshot      simctl   PNG of the current screen (vision fallback).
tap             idb      Tap by accessibility label (preferred) or x,y.
type_text       idb      Type into the focused field.
swipe           idb      Swipe/scroll by direction or coordinates.
press_button    idb      Hardware buttons (HOME, LOCK, …).
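
To make the simctl-backed rows concrete, here is a rough sketch of how a few of them could translate into `xcrun` arguments. The `SimctlCall` type and `simctlArgs` function are hypothetical names introduced for illustration, and the parameter names (`bundleId`, `outPath`, and so on) are assumptions; the server's real tool schemas and internals may differ. The `simctl` subcommands themselves (`launch`, `terminate`, `openurl`, `privacy`, `io … screenshot`) are standard.

```typescript
// Hypothetical tool inputs; names are illustrative, not the server's schema.
type SimctlCall =
  | { tool: "launch"; bundleId: string }
  | { tool: "terminate"; bundleId: string }
  | { tool: "deeplink"; url: string }
  | {
      tool: "set_permission";
      action: "grant" | "revoke" | "reset";
      service: string;
      bundleId: string;
    }
  | { tool: "screenshot"; outPath: string };

// Build the argument vector to pass to `xcrun`.
// "booted" is simctl's alias for the currently booted device.
function simctlArgs(call: SimctlCall, device = "booted"): string[] {
  switch (call.tool) {
    case "launch":
      return ["simctl", "launch", device, call.bundleId];
    case "terminate":
      return ["simctl", "terminate", device, call.bundleId];
    case "deeplink":
      return ["simctl", "openurl", device, call.url];
    case "set_permission":
      return ["simctl", "privacy", device, call.action, call.service, call.bundleId];
    case "screenshot":
      return ["simctl", "io", device, "screenshot", call.outPath];
  }
}
```

The resulting array could be handed to Node's `child_process.execFile("xcrun", args, …)`. Passing arguments as an array rather than a shell string avoids quoting problems with URLs and paths.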

The loop, by example

A typical agent goal runs as a bounded loop:

GOAL: "open Settings and confirm Notifications is enabled"
1. boot_sim
2. launch { bundle_id: "com.apple.Preferences" }
3. describe_ui            → see "Notifications" cell
4. tap { label: "Notifications" }
5. describe_ui            → assert the toggle state
   (re-read after each action; stop when the goal predicate holds
    or a step budget is exhausted)

The agent owns the loop and the success predicate; this server provides the primitives. That keeps the tool simple and the test logic where it belongs.
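
On the agent side, the bounded loop might look like the sketch below. Everything here is an assumption made for illustration: `CallTool` stands in for whatever your MCP client exposes for invoking a tool, `runLoop` and `fakeDriver` are hypothetical names, and the fake driver returns a plain list of labels rather than the real describe_ui JSON.

```typescript
// Hypothetical signature for "call an MCP tool and return its result".
type CallTool = (name: string, args?: Record<string, unknown>) => Promise<unknown>;

interface LoopOptions {
  maxSteps: number; // step budget
  goalReached: (ui: unknown) => boolean; // success predicate, owned by the agent
  decide: (ui: unknown) => { tool: string; args?: Record<string, unknown> };
}

// observe → decide → act, re-reading the UI after every action.
async function runLoop(
  callTool: CallTool,
  opts: LoopOptions
): Promise<{ ok: boolean; steps: number }> {
  for (let step = 0; step < opts.maxSteps; step++) {
    const ui = await callTool("describe_ui"); // observe
    if (opts.goalReached(ui)) return { ok: true, steps: step };
    const action = opts.decide(ui); // decide
    await callTool(action.tool, action.args); // act
  }
  // Budget exhausted: take one last observation to judge the final action.
  const finalUi = await callTool("describe_ui");
  return { ok: opts.goalReached(finalUi), steps: opts.maxSteps };
}

// In-memory stand-in for the server so the sketch runs without a simulator.
function fakeDriver(): CallTool {
  let screen = ["General", "Notifications"];
  return async (name, args) => {
    if (name === "describe_ui") return screen;
    if (name === "tap" && args?.label === "Notifications") {
      screen = ["Allow Notifications: on"];
    }
    return null;
  };
}
```

With a real client, `decide` is where the model chooses the next tool call, and `goalReached` encodes the assertion from step 5 of the example.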

Development

npm run build     # compile TypeScript → dist/
npm start         # run the server on stdio

License

MIT © Will Jones (CodeJonesW)
