open-mobile-mcp

An MCP server for mobile automation that enables LLMs to interact with Android and iOS devices through screenshot, tap, swipe, log analysis, and app lifecycle management.

Open Mobile MCP Server 📱

An open-source Model Context Protocol (MCP) server for mobile automation. Give any LLM eyes and hands on a real Android or iOS device — screenshot, tap, swipe, read logs, and verify your app without writing test code.

Works with Claude Code, Claude Desktop, Cursor, and any other MCP-compatible client.

Features

  • Perception: Screenshots, semantic UI hierarchy, OCR, element finder, layout health analysis.
  • Interaction: Tap, swipe, type, pinch, rotate, long-press, hardware key presses.
  • Logging: Per-app Android log filtering via PID (pass deviceId + packageId to eliminate system noise). Background log watching via wait_for_log.
  • Environment: Metro bundler management, app lifecycle, deep links, screen recording, locale switching.
  • Text Input: Unicode/Cyrillic/CJK/Emoji support via ADB Keyboard with automatic keyboard restore.

Prerequisites

  1. Node.js (v18+)

  2. ADB installed and in PATH (for Android).

  3. Maestro (required for iOS; fallback for Android input).

    • Mac/Linux: curl -Ls "https://get.maestro.mobile.dev" | bash

    • Windows:

      powershell -Command "iwr -useb https://get.maestro.mobile.dev | iex"
      
  4. (Optional) ADB Keyboard — only needed for non-ASCII input (Unicode, Cyrillic, Emoji).

    • Download from GitHub and install: adb install ADBKeyboard.apk.

Configuration

macOS / Linux

{
  "mcpServers": {
    "open-mobile-mcp": {
      "command": "npx",
      "args": ["open-mobile-mcp"]
    }
  }
}

Windows

{
  "mcpServers": {
    "open-mobile-mcp": {
      "command": "npx",
      "args": ["open-mobile-mcp"],
      "env": {
        "MAESTRO_HOME": "C:\\Users\\YOUR_USER\\.maestro",
        "PATH": "C:\\Users\\YOUR_USER\\.maestro\\maestro\\bin;C:\\Windows\\system32;C:\\Windows;..."
      }
    }
  }
}

Note: On Windows, explicitly setting MAESTRO_HOME and PATH is often required for maestro to be found.
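
The Windows entry above can also be generated rather than typed by hand. The following is a minimal sketch, assuming the Maestro install lives under the user's profile directory as in the example; `windowsServerEntry`, `ServerEntry`, and both parameters are hypothetical names introduced here, not part of open-mobile-mcp.

```typescript
// Hypothetical helper: builds the Windows server entry shown above.
interface ServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function windowsServerEntry(userProfile: string, existingPath: string): ServerEntry {
  // Layout taken from the README example: <profile>\.maestro\maestro\bin
  const maestroHome = `${userProfile}\\.maestro`;
  return {
    command: "npx",
    args: ["open-mobile-mcp"],
    env: {
      MAESTRO_HOME: maestroHome,
      // Prepend the Maestro bin directory to whatever PATH the caller supplies.
      PATH: `${maestroHome}\\maestro\\bin;${existingPath}`,
    },
  };
}

const entry = windowsServerEntry(
  "C:\\Users\\YOUR_USER",
  "C:\\Windows\\system32;C:\\Windows",
);
console.log(JSON.stringify({ mcpServers: { "open-mobile-mcp": entry } }, null, 2));
```

Passing the existing PATH in as an argument keeps the rest of your PATH entries under your control; the sketch only prepends the Maestro directory.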

<details> <summary>Running from source</summary>

git clone https://github.com/xzaleksey/open-mobile-mcp.git
cd open-mobile-mcp
npm install && npm run build

Then use "command": "node", "args": ["/path/to/open-mobile-mcp/build/index.js"] in your MCP config. </details>

Tools

Perception

| Tool | Platform | Description |
| --- | --- | --- |
| device_list | Android/iOS | List connected emulators and simulators |
| get_viewport | Android/iOS | Screenshot (~800px wide). Use originalWidth/originalHeight for tap coordinates: the image is scaled down, so tapping at image pixels will miss. |
| get_semantic_hierarchy | Android/iOS | Pruned UI tree as JSON |
| get_screen_text | Android/iOS | OCR via Tesseract.js (default eng) |
| configure_ocr | Android/iOS | Set default OCR language (e.g. eng+fra) |
| find_element | Android/iOS | Find elements by testId, text, or contentDescription |
| wait_for_element | Android/iOS | Poll until element appears (default 20s) |
| get_element_image | Android/iOS | Cropped screenshot of a specific element |
| capture_diff | n/a | Compare two base64 screenshots, returns diff % |
| analyze_layout_health | Android/iOS | Detect deep nesting and layout performance issues |
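
Because get_viewport returns a scaled-down image, a point picked on the screenshot has to be scaled back up before it is passed to a raw coordinate tap. Here is a minimal sketch of that conversion. Only originalWidth and originalHeight are named in this README; the `Viewport` shape, the `imageWidth`/`imageHeight` fields, and `toDevicePoint` are assumptions made for illustration.

```typescript
// Assumed shape: only originalWidth/originalHeight come from the README.
interface Viewport {
  imageWidth: number;     // width of the scaled screenshot (hypothetical field)
  imageHeight: number;    // height of the scaled screenshot (hypothetical field)
  originalWidth: number;  // device width in pixels
  originalHeight: number; // device height in pixels
}

interface Point {
  x: number;
  y: number;
}

// Convert a point measured on the screenshot into device pixels.
function toDevicePoint(vp: Viewport, imagePoint: Point): Point {
  const scaleX = vp.originalWidth / vp.imageWidth;
  const scaleY = vp.originalHeight / vp.imageHeight;
  return {
    x: Math.round(imagePoint.x * scaleX),
    y: Math.round(imagePoint.y * scaleY),
  };
}

// Example: a 1080x2400 device captured as an 800x1778 screenshot.
const vp: Viewport = {
  imageWidth: 800,
  imageHeight: 1778,
  originalWidth: 1080,
  originalHeight: 2400,
};
const target = toDevicePoint(vp, { x: 400, y: 889 });
console.log(target); // centre of the screenshot maps to the centre of the device
```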

Interaction

| Tool | Platform | Description |
| --- | --- | --- |
| tap_on_element | Android/iOS | Recommended: find + tap by selector. Note: text matching is exact; emoji prefixes (e.g. 🇫🇷 French A2) break text matching, so check get_semantic_hierarchy for exact text first. |
| device_tap | Android/iOS | Raw coordinate tap. Must use original device pixels, not screenshot pixels. |
| device_swipe | Android/iOS | Swipe by coordinates |
| device_type | Android/iOS | Type text (handles Unicode) |
| device_pinch | Android | Two-finger pinch/zoom |
| device_rotate_gesture | Android | Two-finger rotation |
| device_press_key | Android/iOS | Hardware keys: back, home, recents, enter, delete, volume_up, volume_down, power, tab, search, space, menu, dpad_*. Also accepts raw Android keycodes. |
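
Since text matching in tap_on_element is exact, it can help to resolve the full on-screen label before tapping. The sketch below assumes the visible labels have already been collected from get_semantic_hierarchy into a plain string array (the hierarchy's JSON shape is not documented here); `resolveExactLabel` is a hypothetical helper, not a tool provided by the server.

```typescript
// Hypothetical helper: given the labels visible on screen and the fragment
// you know, return the full label to use as an exact-match selector.
function resolveExactLabel(labels: string[], fragment: string): string | undefined {
  // Prefer a label that already matches exactly.
  const exact = labels.find((label) => label === fragment);
  if (exact !== undefined) return exact;

  // Otherwise accept a containing label only if it is unambiguous.
  const candidates = labels.filter((label) => label.includes(fragment));
  return candidates.length === 1 ? candidates[0] : undefined;
}

const labels = ["🇫🇷 French A2", "🇩🇪 German B1", "Settings"];
const selector = resolveExactLabel(labels, "French A2");
console.log(selector); // the full label, including the emoji prefix
```

Returning `undefined` on ambiguous fragments is a deliberate choice in this sketch: tapping the wrong element is usually worse than failing early.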

Environment & Logs

| Tool | Platform | Description |
| --- | --- | --- |
| manage_bundler | Android/iOS | Start/stop/restart Metro. Pass deviceId + packageId for PID-based Android log filtering. |
| manage_platform_logs | Android/iOS | Manual control over adb logcat / xcrun capture. Pass deviceId + packageId for per-app filtering. |
| get_bundler_logs | Android/iOS | Recent Metro/Android/iOS logs. Returns a [status] line; if the buffer is empty, capture may not be running. |
| stream_errors | Android/iOS | Recent error/exception lines across all sources |
| get_network_logs | Android/iOS | Network logcat lines. For iOS, filters the internal log capture buffer (enable via manage_platform_logs). |
| wait_for_log | Android/iOS | Block until a log pattern matches. See subagent pattern below. |
| manage_app_lifecycle | Android/iOS | Launch, stop, install, or uninstall apps |
| open_deep_link | Android/iOS | Open a URL or deep link |
| clear_app_data | Android/iOS | Reset app to fresh-install state |
| get_app_info | Android | Version, permissions, install date |
| start_recording / stop_recording | Android/iOS | Screen recording to .mp4 (Android uses screenrecord, iOS uses simctl io). |
| run_maestro_flow | Android/iOS | Run a Maestro YAML flow |
| run_doctor | n/a | Run npx expo-doctor |
| install_deps | n/a | Run npx expo install <packages> |
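
To illustrate what PID-based log filtering means in practice, here is one common way to do it with plain adb: look up the app's process ID with `pidof`, then pass it to `adb logcat --pid`. This README does not state how the server implements its filtering, so treat this as a sketch of the general technique; the function names are hypothetical and the code only builds argument lists, it does not run adb.

```typescript
// Arguments for: adb -s <deviceId> shell pidof <packageId>
function pidofArgs(deviceId: string, packageId: string): string[] {
  return ["-s", deviceId, "shell", "pidof", packageId];
}

// Parse the first PID from pidof output; undefined if the app is not running.
function parsePid(stdout: string): number | undefined {
  const first = stdout.trim().split(/\s+/)[0] ?? "";
  const pid = Number.parseInt(first, 10);
  return Number.isInteger(pid) && pid > 0 ? pid : undefined;
}

// Arguments for: adb -s <deviceId> logcat --pid=<pid>
function logcatArgs(deviceId: string, pid: number): string[] {
  return ["-s", deviceId, "logcat", `--pid=${pid}`];
}

// Example with a made-up device ID, package name, and pidof output.
const lookup = pidofArgs("emulator-5554", "com.example.app");
const pid = parsePid("12345\n");
const follow = pid !== undefined ? logcatArgs("emulator-5554", pid) : [];
console.log(["adb", ...lookup].join(" "));
console.log(["adb", ...follow].join(" "));
```

Note that a PID changes whenever the app restarts, so a filter built this way has to be refreshed after a relaunch.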

wait_for_log — Background Subagent Pattern

wait_for_log blocks until a pattern appears in the log buffer. Calling it directly in the main agent freezes the conversation. Always delegate it to a background subagent in Claude Code:

// Step 1 — spawn the watcher BEFORE the action that will trigger the log
// (In Claude Code, use Agent tool with run_in_background: true)
// Subagent prompt: "Call wait_for_log with pattern 'route: /home', timeout 60000. Report the result."

// Step 2 — perform the action in the main agent
tap_on_element({ selector: "Home", strategy: "text" });

// Step 3 — main agent continues freely; gets notified when subagent finishes
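
The ordering in the steps above (start watching, then act) can be modelled with a small in-memory stand-in. This is a simplified sketch, not the server's implementation: `LogFeed` is a hypothetical class, and it assumes a watcher only sees lines that arrive after it starts, which is why the watcher is created before the action.

```typescript
// Hypothetical in-memory log feed used to illustrate the ordering.
class LogFeed {
  private listeners = new Set<(line: string) => void>();

  // Deliver a new log line to every active watcher.
  push(line: string): void {
    for (const listener of [...this.listeners]) listener(line);
  }

  // Resolve with the first matching line, or null once timeoutMs elapses.
  waitFor(pattern: RegExp, timeoutMs: number): Promise<string | null> {
    return new Promise((resolve) => {
      const onLine = (line: string): void => {
        if (!pattern.test(line)) return;
        clearTimeout(timer);
        this.listeners.delete(onLine);
        resolve(line);
      };
      const timer = setTimeout(() => {
        this.listeners.delete(onLine);
        resolve(null);
      }, timeoutMs);
      this.listeners.add(onLine);
    });
  }
}

async function demo(): Promise<string | null> {
  const feed = new LogFeed();
  // Step 1: start the watcher first, without awaiting it.
  const watcher = feed.waitFor(/route: \/home/, 1000);
  // Step 2: perform the action; here a pushed line stands in for the tap.
  feed.push("route: /home");
  // Step 3: collect the result once it is needed.
  return watcher;
}

demo().then((line) => console.log("matched:", line));
```

Awaiting `waitFor` before the action in this model would stall until the timeout, which mirrors why a direct wait_for_log call blocks the main agent.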

License

MIT
