RunCue

RunCue MCP enables coding agents to navigate, inspect, and verify iOS app UI using natural language tasks via WebDriverAgent. It provides tools for running UI flows, checking UI state, listing devices, and diagnosing WDA setup.

<p align="center"> <img src="docs/assets/runcue-logo.png" alt="RunCue logo" width="180" /> </p>

<h1 align="center">RunCue</h1>

RunCue is a developer UI navigation tool for iOS apps. It uses natural-language tasks to navigate, type, inspect, and verify app UI through WebDriverAgent, with MCP tools designed to work alongside build tools such as XcodeBuildMCP.

RunCue is intentionally scoped: it does not build, install, or debug your app. Use XcodeBuildMCP or your normal Xcode workflow for build, launch, screenshots, and logs. Use RunCue when you need an agent to reach a specific UI state.

Demo

Click the screenshot to open the demo recording.

<p align="center"> <a href="https://github.com/lihei12345/RunCue/blob/main/docs/assets/screen-recording-720p.mov"> <img src="docs/assets/terminal-screenshot.png" alt="RunCue terminal execution screenshot" width="860" /> </a> </p>

<p align="center"> <a href="https://github.com/lihei12345/RunCue/blob/main/docs/assets/screen-recording-720p.mov">Watch the demo recording</a> </p>

Features

  • WDA-only iOS automation for simulators and physical devices.
  • MCP tools for coding agents: runcue_run, runcue_check, runcue_devices, and runcue_doctor.
  • View-tree-first observation with screenshot fallback for WebView, SwiftUI, custom UI, and sparse accessibility trees.
  • Direct WDA text input through /keys, avoiding paste-menu workarounds.
  • Planner, locator, executor, and verifier loop for more stable multi-step navigation.
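To make the direct text input bullet concrete, here is a minimal sketch of what a WDA key-input request might look like. It assumes the upstream WebDriverAgent route `POST /session/{id}/wda/keys` with a `value` array of characters; the base URL (WDA's usual port 8100), the session id, and the helper name `buildKeysRequest` are illustrative and not taken from RunCue's source. The sketch only builds the request and does not send it.

```typescript
// Sketch only: builds (but does not send) a WebDriverAgent key-input request.
// buildKeysRequest, the base URL, and the session id are illustrative assumptions.

interface WdaKeysRequest {
  method: "POST";
  url: string;
  body: { value: string[] };
}

function buildKeysRequest(baseUrl: string, sessionId: string, text: string): WdaKeysRequest {
  // Trim trailing slashes so the joined URL has exactly one separator.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    method: "POST",
    url: `${root}/session/${encodeURIComponent(sessionId)}/wda/keys`,
    // Array.from splits by code point, so emoji are not cut in half.
    body: { value: Array.from(text) },
  };
}

const keysRequest = buildKeysRequest("http://127.0.0.1:8100/", "SESSION-ID", "Walmart");
```

Typing through this endpoint sends characters straight to the focused field, which is why no paste menu is involved.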

Requirements

  • macOS with Xcode installed.
  • Node.js 20 or newer.
  • A visible iOS Simulator or trusted physical iOS device.
  • An OpenAI-compatible vision-language model (VLM) API key.

For physical devices, you also need:

  • Device trust enabled.
  • Developer Mode enabled.
  • The device unlocked while running tasks.
  • WebDriverAgent signing configured through RUNCUE_WDA_TEAM_ID or RunCue config.

Install

npm install -g runcue

For local development from this repository:

npm install
npm run build
node dist/cli.js --help

Configure Models

RunCue needs a vision-language model, not a text-only LLM. The provider must be OpenAI-compatible and support image input for visual fallback, visual grounding, and screenshot checks.

Supported wire APIs:

  • chat: uses chat.completions with text and image_url content parts. This is the default.
  • responses: uses the OpenAI Responses API with input_text and input_image content parts.
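The difference between the two wire APIs is mostly the shape of the request body. The sketch below shows one plausible way to build each body for a prompt plus a PNG screenshot, following the public OpenAI request formats. The function names and the data-URL encoding of the screenshot are assumptions for illustration, not RunCue's actual implementation.

```typescript
// Sketch only: request bodies for the two wire APIs. Names are illustrative.

type WireApi = "chat" | "responses";

function buildVlmRequest(
  wireApi: WireApi,
  model: string,
  prompt: string,
  pngBase64: string,
): Record<string, unknown> {
  const imageUrl = `data:image/png;base64,${pngBase64}`;
  if (wireApi === "responses") {
    // Responses API: input_text and input_image parts, image given as a URL string.
    return {
      model,
      input: [
        {
          role: "user",
          content: [
            { type: "input_text", text: prompt },
            { type: "input_image", image_url: imageUrl },
          ],
        },
      ],
    };
  }
  // chat.completions: text and image_url parts, image wrapped in an object.
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Path appended to the provider's base URL for each wire API.
function endpointFor(wireApi: WireApi): string {
  return wireApi === "responses" ? "/responses" : "/chat/completions";
}
```

A provider that only implements one of these endpoints needs the matching wireApi value in its config.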

The local config file is:

~/.runcue/config.json

RunCue uses built-in defaults when this file does not exist. The file is created when you run runcue config set ..., or you can create it manually.

Inspect the effective config:

runcue config list

Set the default provider:

runcue config set provider my-vl

Environment variable references such as ${MY_VL_API_KEY} are resolved at runtime. A minimal custom provider looks like this:

{
  "vlm": {
    "default": "my-vl",
    "providers": {
      "my-vl": {
        "baseUrl": "https://api.example.com/v1",
        "model": "your-vl-model",
        "apiKey": "${MY_VL_API_KEY}",
        "wireApi": "chat",
        "inputMode": "viewtree"
      }
    }
  }
}
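As an illustration of how a reference such as ${MY_VL_API_KEY} could be resolved at runtime, here is a minimal sketch. The function name `resolveEnvRefs` and the choice to throw on a missing variable are assumptions made for this example; RunCue's actual resolver may behave differently, for instance by leaving unknown references untouched.

```typescript
// Sketch only: resolveEnvRefs is a hypothetical helper, not RunCue's API.

type Env = Record<string, string | undefined>;

function resolveEnvRefs(value: string, env: Env): string {
  // Replace every ${NAME} with env[NAME]; NAME must be a valid variable name.
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      // Assumption: fail early instead of sending a literal "${...}" as the key.
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}

const resolvedKey = resolveEnvRefs("${MY_VL_API_KEY}", { MY_VL_API_KEY: "sk-example" });
```

In a Node.js process the `env` argument would typically be `process.env`. Keeping the reference in the config file, rather than the key itself, means the file can be shared without exposing the secret.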

Provider fields:

| Field | Required | Meaning |
| --- | --- | --- |
| baseUrl | Yes | OpenAI-compatible API base URL. |
| model | Yes | VLM model name accepted by that provider. |
| apiKey | Yes | API key value or environment reference such as ${MY_VL_API_KEY}. |
| wireApi | No | chat or responses; defaults to chat. |
| inputMode | No | viewtree or screenshot; defaults to viewtree. Use screenshot only for providers/apps where visual-only operation is preferred. |
| headers | No | Extra HTTP headers to pass to the provider. |

RunCue ships with several built-in provider examples, including DashScope/Qwen VL. They are examples, not a requirement. For example:

export DASHSCOPE_API_KEY="your-dashscope-api-key"
runcue config set provider dashscope-vl-plus

Quick Start

List devices:

runcue devices

Check WDA readiness:

runcue doctor --device "iPhone 17 Pro Simulator" --platform ios-simulator

Run a navigation task:

runcue run "Open Maps, search for the nearest Walmart, and start navigation" \
  --device "iPhone 17 Pro Simulator" \
  --platform ios-simulator \
  --bundle-id com.apple.Maps \
  --fresh-app \
  --max-steps 10 \
  --timeout 120

For complex or non-standard app flows, include product-specific UI knowledge in the task or hints, for example:

runcue run "Open Maps, search for the nearest Walmart, and start navigation. In Apple Maps, if there is no normal Start Navigation button, tap the Route Steps item in the route card list to enter navigation." \
  --device "iPhone 17 Pro Simulator" \
  --platform ios-simulator \
  --bundle-id com.apple.Maps \
  --fresh-app
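When a script or agent harness drives the CLI, it can help to assemble the argument list programmatically instead of concatenating a shell string. The sketch below uses only the flags shown in the commands above; the `RunOptions` type and `buildRunArgs` helper are hypothetical names introduced for this example.

```typescript
// Sketch only: RunOptions and buildRunArgs are illustrative, not part of RunCue.

interface RunOptions {
  task: string;
  device: string;
  platform: string;
  bundleId?: string;
  freshApp?: boolean;
  maxSteps?: number;
  timeout?: number;
}

function buildRunArgs(options: RunOptions): string[] {
  const args: string[] = ["run", options.task, "--device", options.device, "--platform", options.platform];
  if (options.bundleId !== undefined) args.push("--bundle-id", options.bundleId);
  if (options.freshApp) args.push("--fresh-app");
  if (options.maxSteps !== undefined) args.push("--max-steps", String(options.maxSteps));
  if (options.timeout !== undefined) args.push("--timeout", String(options.timeout));
  return args;
}

const mapsArgs = buildRunArgs({
  task: "Open Maps, search for the nearest Walmart, and start navigation",
  device: "iPhone 17 Pro Simulator",
  platform: "ios-simulator",
  bundleId: "com.apple.Maps",
  freshApp: true,
  maxSteps: 10,
  timeout: 120,
});
```

Passing the resulting array to something like Node's `execFile("runcue", mapsArgs)` keeps the task text as a single argument, so commas, quotes, and spaces in the task need no shell escaping.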

XcodeBuildMCP + RunCue Workflow

RunCue is designed to cooperate with XcodeBuildMCP instead of replacing it. XcodeBuildMCP owns build, install, launch, screenshots, logs, and Xcode project state. RunCue owns UI navigation and state checks on the same device.

Coding Agent
  |
  | 1. Build, install, and launch the app
  v
XcodeBuildMCP
  |
  | build_run_sim / launch_app_sim
  | returns simulator name or UDID
  v
iOS Simulator or Device
  |
  | 2. Navigate to the target UI state on that same device
  v
RunCue MCP / CLI
  |
  | observe -> plan -> locate -> execute -> verify
  | through WebDriverAgent
  v
Target App UI State
  |
  | 3. Capture final evidence when needed
  v
XcodeBuildMCP
  |
  | screenshot / logs / test output
  v
Coding Agent

Practical rules:

  • Use XcodeBuildMCP first to prepare the app state.
  • Pass the exact simulator name or UDID from that build/run flow to RunCue.
  • While RunCue is running, let it perform the UI actions; avoid driving the same UI from other tools at the same time.
  • Use XcodeBuildMCP again after RunCue finishes for screenshots, logs, or build diagnostics.

MCP Usage

RunCue exposes an MCP server over stdio:

runcue mcp

Example Codex MCP configuration:

[mcp_servers.RunCue]
type = "stdio"
command = "runcue"
args = ["mcp"]

For a local checkout:

[mcp_servers.RunCue]
type = "stdio"
command = "node"
args = ["/absolute/path/to/RunCue/dist/cli.js", "mcp"]

MCP Tools

  • runcue_run: autonomously navigate a UI flow.
  • runcue_check: inspect the current UI state with a question.
  • runcue_devices: list iOS devices and simulators visible to Xcode.
  • runcue_doctor: diagnose WDA setup and signing issues.
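Under the hood, an MCP client invokes these tools with a JSON-RPC `tools/call` request written to the server's stdin as one line of JSON. The sketch below builds such a message for runcue_check. The envelope follows the MCP specification, but the argument names `question` and `device` are guesses for illustration; the real input schema should be read from the server's `tools/list` response.

```typescript
// Sketch only: argument names are hypothetical; check tools/list for the real schema.

type RunCueTool = "runcue_run" | "runcue_check" | "runcue_devices" | "runcue_doctor";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: RunCueTool; arguments: Record<string, unknown> };
}

function buildToolCall(id: number, name: RunCueTool, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const checkCall = buildToolCall(1, "runcue_check", {
  question: "Is the route card visible?", // hypothetical argument name
  device: "iPhone 17 Pro Simulator", // hypothetical argument name
});

// The stdio transport sends each message as a single newline-terminated line.
const checkCallLine = JSON.stringify(checkCall) + "\n";
```

In practice an MCP client library or the coding agent itself produces this message, so the sketch is mainly useful for debugging a server by hand.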

Architecture

RunCue iOS UI Navigation Architecture

The current architecture is WDA-only:

Coding Agent
  -> RunCue MCP / CLI
  -> Agent loop: planner -> locator -> executor -> verifier
  -> WebDriverAgent HTTP API
  -> iOS Simulator or physical iOS device

See docs/architecture.md for the current architecture and docs/tech-solution-v2.md for the longer design record.
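The planner, locator, executor, and verifier loop can be pictured roughly as follows. This is a conceptual sketch under stated assumptions: every interface, type, and function name here is hypothetical, and the real stages call the VLM and WebDriverAgent rather than the injected stand-ins used in this example. The step budget plays the role that a limit such as --max-steps presumably plays in the CLI.

```typescript
// Sketch only: all names are hypothetical and do not mirror RunCue's internals.

interface Observation { summary: string }
interface Action { kind: "tap" | "type" | "done"; target?: string; text?: string }

interface Stages {
  observe(): Promise<Observation>;
  plan(task: string, observation: Observation): Promise<Action>;
  locate(action: Action, observation: Observation): Promise<Action>;
  execute(action: Action): Promise<void>;
  verify(task: string, observation: Observation): Promise<boolean>;
}

interface LoopResult { success: boolean; steps: number }

async function runLoop(task: string, stages: Stages, maxSteps: number): Promise<LoopResult> {
  for (let step = 1; step <= maxSteps; step++) {
    const before = await stages.observe();
    const planned = await stages.plan(task, before);
    if (planned.kind !== "done") {
      // Resolve the planned target to something executable, then act on it.
      const located = await stages.locate(planned, before);
      await stages.execute(located);
    }
    // Re-observe after acting and let the verifier decide whether the task is complete.
    const after = await stages.observe();
    if (await stages.verify(task, after)) {
      return { success: true, steps: step };
    }
  }
  // Step budget exhausted without the verifier confirming the target state.
  return { success: false, steps: maxSteps };
}
```

Separating verification from planning means a step only counts as finished when a fresh observation confirms the target state, which is one way such a loop can stay stable across multi-step flows.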

Documentation

Development

npm install
npm run build
npm test
npm_config_cache=/private/tmp/runcue-npm-cache npm pack --dry-run

Third-Party Code

RunCue vendors appium-webdriveragent so the CLI can bootstrap WDA without asking users to manually clone a separate project. See THIRD_PARTY_NOTICES.md.

License

RunCue is licensed under the MIT License. See LICENSE.
