appium-mcp
Professional mobile test automation framework with AI-powered test generation, Page Object Model support, and AWS Bedrock integration.
Appium MCP - AI-Driven Mobile Test Automation
Supported Platforms: iOS | Android
Python Version: 3.8+
License: MIT
Features
- Interactive Chatbot: Walk through your app testing in natural language
- AI-Generated Tests: Automatically generate test code from your interactions
- Page Object Model: Auto-generate clean, reusable page objects
- YAML Workflows: Define test flows in simple YAML, AI handles execution
- Multi-Platform: iOS and Android support
- Element Discovery: Automatic element locator generation
- Screenshot Capture: Automatic screenshots at each step
- AWS Integration: Use Claude AI via Bedrock for test generation
Prerequisites
Before you start, ensure you have:
- Python 3.8 or higher
python --version
- Node.js 14+ and npm (for Appium)
node --version
npm --version
Installation
Detailed Setup Guide: For complete step-by-step instructions, including:
- System requirements verification
- Virtual environment setup
- Appium server configuration
- AWS Bedrock integration (AI-powered tests)
- Device configuration (iOS/Android)
- Troubleshooting guide
See SETUP.md (recommended for first-time users)
From PyPI (Recommended)
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install appium-mcp
pip install appium-mcp
Detailed PyPI Installation Guide: See PYPI_INSTALL.md for complete step-by-step instructions.
From Source (Development)
git clone https://github.com/youcanautomate-yca/ai-driven-mobile-automation.git
cd ai-driven-mobile-automation
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e .
Setup Appium Server
For the complete Appium setup guide, including prerequisites, see INSTALLATION.md.
Appium server must be running for appium-mcp to work.
Installation
# Install globally (one-time)
npm install -g appium
# Install drivers
appium driver install xcuitest # iOS
appium driver install uiautomator2 # Android
Start Server
# Terminal 1: Start Appium (runs on port 4723)
appium
# You should see:
# [Appium] Welcome to Appium v2.x.x
# ...
# [Appium] Server listening on http://127.0.0.1:4723
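A script that launches workflows can wait for the server instead of relying on the startup banner. The sketch below polls the `/status` endpoint on the default port shown above. It assumes the response follows the W3C WebDriver shape (`{"value": {"ready": true, ...}}`); the helper names `is_ready` and `wait_for_appium` are illustrative and not part of appium-mcp.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(payload: dict) -> bool:
    """Return True if a /status payload reports a ready server (assumed W3C shape)."""
    value = payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def wait_for_appium(url="http://127.0.0.1:4723/status", timeout=30.0, interval=1.0) -> bool:
    """Poll the status URL until the server is ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if is_ready(json.load(resp)):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # server not up yet, or the response was not JSON
        time.sleep(interval)
    return False
```

Calling `wait_for_appium()` before `appium-mcp-run-yaml` would then fail fast when the server was never started.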
CLI Commands
Once installed, you have access to these commands:
Interactive Chatbot (Easy!)
appium-mcp-chatbot
Perfect for beginners - guides you through test generation step-by-step.
Run YAML Workflows
appium-mcp-run-yaml my_workflow.yaml
Execute predefined test workflows from YAML files.
Generate Tests
appium-mcp-generate-tests workflow.json
Auto-generate test scripts from recorded interactions.
Start MCP Server
appium-mcp-server
Start the Model Context Protocol server for integration.
YAML Workflow Examples
Example 1: Simple Login Test
Create a file login_test.yaml:
version: "1.0"
description: "Login test"
platform: "ios"
device_name: "iPhone 14"
bundle_id: "com.example.app"
app_path: "/path/to/app.app"
workflow:
  LoginScreen:
    - "Take a screenshot to see the app"
    - "Tap on the email field"
    - "Type user@example.com"
    - "Tap on the password field"
    - "Type MyPassword123"
    - "Tap the Sign In button"
    - "Wait 2 seconds for login to complete"
  HomeScreen:
    - "Take a screenshot to verify login success"
    - "Verify that Welcome message is visible"
Run it:
appium-mcp-run-yaml login_test.yaml
Example 2: E-Commerce Purchase Flow
Create purchase_flow.yaml:
version: "1.0"
description: "Complete purchase flow"
platform: "android"
device_name: "emulator"
app_package: "com.myapp"
app_activity: ".MainActivity"
workflow:
  HomePage:
    - "Take screenshot"
    - "Scroll down to see products"
    - "Tap on first product"
  ProductPage:
    - "Take screenshot"
    - "Tap on size selector"
    - "Choose size M"
    - "Tap Add to Cart button"
    - "Verify item added message"
  CartPage:
    - "Tap on shopping cart icon"
    - "Take screenshot"
    - "Tap Checkout button"
  CheckoutPage:
    - "Enter shipping address"
    - "Enter payment details"
    - "Tap Place Order"
    - "Verify order confirmation"
Run it:
appium-mcp-run-yaml purchase_flow.yaml
YAML File Structure
version: "1.0"                     # YAML version (required)
description: "Test description"    # What this test does
platform: "ios"                    # Target platform: "ios" or "android" (required)
device_name: "iPhone 14"           # Device name/simulator
bundle_id: "com.example.app"       # iOS bundle ID (iOS only)
app_package: "com.example.app"     # Android package (Android only)
app_activity: ".MainActivity"      # Android activity (Android only)
app_path: "/path/to/app.app"       # Path to app binary
workflow:                          # Test steps
  ScreenName:                      # Group steps by screen
    - "Natural language prompt"    # AI executes this
    - "Another step"
    - "..."
  NextScreen:
    - "Step 1"
    - "Step 2"
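A quick structural check can catch mistakes before a device session is started. The sketch below validates an already-parsed workflow document (for example, the dict returned by PyYAML's `yaml.safe_load`). It treats `version` and `platform` as required, as marked above; requiring `workflow` is an assumption, and `validate_workflow` is a hypothetical helper rather than something appium-mcp ships.

```python
VALID_PLATFORMS = {"ios", "android"}


def validate_workflow(doc: dict) -> list:
    """Return a list of problems; an empty list means the document looks well-formed."""
    problems = []
    # "workflow" is assumed to be required in addition to the documented keys
    for key in ("version", "platform", "workflow"):
        if key not in doc:
            problems.append(f"missing required key: {key}")

    platform = doc.get("platform")
    if platform is not None and platform not in VALID_PLATFORMS:
        problems.append(f"platform must be 'ios' or 'android', got {platform!r}")

    workflow = doc.get("workflow")
    if workflow is not None:
        if not isinstance(workflow, dict):
            problems.append("workflow must map screen names to lists of steps")
        else:
            for screen, steps in workflow.items():
                if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
                    problems.append(f"{screen}: steps must be a list of strings")
    return problems


example = {
    "version": "1.0",
    "platform": "ios",
    "workflow": {"Start": ["Take a screenshot", "Tap the start button"]},
}
print(validate_workflow(example))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.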
How YAML Workflows Work
- Write prompts in natural language - No tool names needed
- AI inspects the current screen - Analyzes app UI
- AI finds the right elements - Uses element locators
- AI performs the action - Click, type, scroll, etc.
- Auto-generates page objects - Reusable code
- Auto-generates test code - Ready-to-run tests
Example: Write one line:
"Tap on the email field and type test@example.com"
AI generates Python code:
class LoginPage:
    def enter_email(self, email):
        self.find_element("email_field").send_keys(email)
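The step-by-step flow described above can be pictured as a loop over screens and prompts. The sketch below is only an illustration of that control flow, not the project's implementation: `run_workflow` and `fake_executor` are hypothetical names, and the executor stands in for the AI step that inspects the screen and performs the action.

```python
from typing import Callable, Dict, List


def run_workflow(workflow: Dict[str, List[str]],
                 execute_step: Callable[[str, str], dict]) -> List[dict]:
    """Run every prompt screen by screen and collect one record per step."""
    records = []
    for screen, prompts in workflow.items():
        for index, prompt in enumerate(prompts, start=1):
            result = execute_step(screen, prompt)
            records.append({"screen": screen, "step": index, "prompt": prompt, **result})
            if result.get("status") == "error":
                return records  # stop at the first failing step
    return records


def fake_executor(screen: str, prompt: str) -> dict:
    # Stand-in for the AI: pretend the first word of the prompt is the action
    return {"status": "success", "action": prompt.split()[0].lower()}


records = run_workflow(
    {"LoginScreen": ["Tap on the email field", "Type user@example.com"]},
    fake_executor,
)
for record in records:
    print(record)
```

The collected records are the kind of data from which page objects and test code could later be generated.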
Full Example: Running a Test
Step 1: Create YAML file
Save as test_app.yaml:
version: "1.0"
description: "Simple app test"
platform: "ios"
device_name: "iPhone 14"
bundle_id: "com.testapp"
app_path: "./app/TestApp.app"
workflow:
  Start:
    - "Take a screenshot"
    - "Tap the start button"
    - "Wait 1 second"
    - "Take final screenshot"
Step 2: Start Appium (Terminal 1)
appium
Step 3: Run the test (Terminal 2)
cd my_project
source .venv/bin/activate
appium-mcp-run-yaml test_app.yaml
Step 4: View results
- Generated test: generated_tests/test_app.py
- Page objects: page_objects/StartPage.py
- Screenshots: screenshots/
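To confirm that a run produced output, the three locations listed above can be scanned from a script. This is a small sketch using only the directory names given in this README; the `test_*.py` file pattern and the helper names are assumptions.

```python
from pathlib import Path


def _names(directory: Path, pattern: str) -> list:
    """Sorted file names matching `pattern`, or [] if the directory is missing."""
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(pattern) if p.is_file())


def collect_artifacts(root=".") -> dict:
    """List generated tests, page objects, and screenshots under `root`."""
    root = Path(root)
    return {
        "tests": _names(root / "generated_tests", "test_*.py"),  # pattern is assumed
        "page_objects": _names(root / "page_objects", "*.py"),
        "screenshots": _names(root / "screenshots", "*"),
    }


print(collect_artifacts("."))
```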
Python API Usage
Use appium-mcp as a Python library:
from appium_mcp import MobileAutomationFramework
from appium.options.ios import XCUITestOptions
# Initialize framework
framework = MobileAutomationFramework()
# Create options
options = XCUITestOptions()
options.device_name = "iPhone 14"
options.bundle_id = "com.example.app"
# Create session
driver = framework.create_session(options)
# Use driver like standard Appium
driver.find_element("xpath", "//XCUIElementTypeButton[@name='Login']").click()
# Generate test code from session
test_code = framework.generate_test("my_test")
print(test_code)
# Cleanup
driver.quit()
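If a step raises before `driver.quit()` is reached, the session stays open on the Appium server. One way to guard against that is a small context manager around the `create_session` and `quit` calls used above. This is a sketch; `appium_session` is a hypothetical helper, not part of the appium-mcp API.

```python
from contextlib import contextmanager


@contextmanager
def appium_session(framework, options):
    """Create a session and always quit the driver, even if the body raises."""
    driver = framework.create_session(options)
    try:
        yield driver
    finally:
        driver.quit()


# Usage, with the framework and options objects from the example above:
#
#   with appium_session(framework, options) as driver:
#       driver.find_element("xpath", "//XCUIElementTypeButton[@name='Login']").click()
```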
Comprehensive Guides
- Complete Setup Guide (START HERE) - Step-by-step installation, system requirements, Appium setup, AWS Bedrock configuration, device setup, and troubleshooting for PyPI users
- Installation Guide - AWS Bedrock integration details, model comparison, cost estimation, and advanced setup
- YAML Workflow Guide - Complete YAML workflow reference and examples
- YAML Quick Start - 5-minute quick start for YAML workflows
- Chatbot Guide - Interactive chatbot step-by-step guide
- Development Guide - Contributing, testing, and building from source
Troubleshooting
"Appium server not responding"
# Make sure Appium is running
appium
# Check if running on correct port
curl http://localhost:4723/status
"Connection refused"
# Restart Appium
appium --port 4723
# Verify config
appium-mcp-run-yaml --help
"Element not found"
- Take a screenshot first: "Take a screenshot"
- Check the actual element name in the app
- Use more specific natural language: "Tap on the blue Login button"
Import errors after installation
# Reinstall in clean environment
python -m venv fresh_env
source fresh_env/bin/activate
pip install appium-mcp
Resources
- PyPI Package: https://pypi.org/project/appium-mcp/
- GitHub Repository: https://github.com/youcanautomate-yca/ai-driven-mobile-automation
- Appium Documentation: https://appium.io/docs/en/2.0/
- Report Issues: https://github.com/youcanautomate-yca/ai-driven-mobile-automation/issues
License
MIT License - see LICENSE file for details
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
Support
- GitHub Issues: https://github.com/youcanautomate-yca/ai-driven-mobile-automation/issues
- Email: youcanautomate@gmail.com
- Documentation: https://github.com/youcanautomate-yca/ai-driven-mobile-automation/wiki
Architecture
The server is organized into modular components:
ai-driven-mobile-automation/
├── server.py                    # Main MCP server and tool registration
├── session_store.py             # Global driver and session management
├── command.py                   # Core Appium command execution
├── logger.py                    # Logging utilities
├── tools_session.py             # Session management tools
├── tools_interactions.py        # Element interaction tools
├── tools_navigations.py         # Navigation tools
├── tools_app_management.py      # App management tools
├── tools_context.py             # Context switching tools
├── tools_ios.py                 # iOS-specific tools
├── tools_test_generation.py     # Test generation tools
├── tools_documentation.py       # Documentation/help tools
├── requirements.txt             # Python dependencies
└── README.md                    # This file
Mirrored Tools from TypeScript Implementation
Session Management
- select_platform - Select iOS or Android platform
- select_device - Select target device
- create_session - Create new Appium session
- delete_session - Delete active session
- open_notifications - Open notifications panel
Element Interactions
- appium_click - Click on element
- appium_find_element - Find element by strategy/selector
- appium_double_tap - Double tap element
- appium_long_press - Long press element
- appium_drag_and_drop - Drag element to target
- appium_press_key - Press key
- appium_set_value - Set text value
- appium_get_text - Get element text
- appium_get_active_element - Get focused element
- appium_screenshot - Take screenshot
- appium_element_screenshot - Screenshot of element
- appium_get_orientation - Get device orientation
- appium_set_orientation - Set device orientation
- appium_handle_alert - Handle alert dialogs
Navigation
- appium_scroll - Scroll up/down/left/right
- appium_scroll_to_element - Scroll until element visible
- appium_swipe - Perform swipe gesture
App Management
- appium_activate_app - Activate installed app
- appium_install_app - Install app
- appium_uninstall_app - Uninstall app
- appium_terminate_app - Terminate running app
- appium_list_apps - List installed apps
- appium_is_app_installed - Check if app installed
- appium_deep_link - Open deep link
Context Management
- appium_get_contexts - Get available contexts
- appium_switch_context - Switch between contexts
iOS Tools
- appium_boot_simulator - Boot iOS simulator
- appium_setup_wda - Setup WebDriverAgent
- appium_install_wda - Install WebDriverAgent
Test Generation
- appium_generate_locators - Generate element locators
- appium_generate_tests - Generate test scripts
Documentation
- appium_answer_appium - Answer Appium questions
YAML Workflow Automation
Simple, clean YAML workflows. Define screen names, write prompts, let AI handle everything else!
Quick Example
Create workflow.yml:
version: "1.0"
description: "Login and purchase"
platform: "ios"
device_name: "iPhone 16"
bundle_id: "com.example.ecommerce"
# Screen-based workflow - group prompts by screen
workflow:
  LoginScreen:
    - "Enter email user@example.com"
    - "Enter password mypassword"
    - "Click login button"
  HomeScreen:
    - "Click first product"
    - "View product details"
  CartScreen:
    - "Add item to cart"
    - "Click checkout"
  CheckoutScreen:
    - "Enter shipping address"
    - "Enter payment details"
    - "Complete purchase"
  OrderConfirmationScreen:
    - "Take screenshot"
Run Workflow
appium-mcp-run-yaml workflow.yml
What Happens Automatically
For each prompt:
- AI inspects the current screen
- Analyzes page source to find elements
- Determines the right element to interact with
- Performs the action (click, type, scroll, etc.)
- Creates page object with discovered elements
- Generates test code with reusable methods
Auto-Generated Page Objects
From this YAML:
LoginScreen:
- "Enter email user@example.com"
AI generates:
class LoginScreen:
    def login(self, email, password):
        self.email_field.send_keys(email)
        self.password_field.send_keys(password)
        self.login_button.click()
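The generated class above refers to attributes such as `self.email_field` without showing where they come from. One plausible arrangement, sketched here under assumptions, is a base class that resolves attribute names through a locator table on each access. `BasePage`, the `locators` mapping, and the selector values are all hypothetical; only `find_element(by, value)`, `send_keys`, and `click` are standard Appium/Selenium driver and element calls.

```python
class BasePage:
    # Maps attribute names to (strategy, selector) pairs; subclasses fill this in.
    locators = {}

    def __init__(self, driver):
        self.driver = driver

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, e.g. for self.email_field
        locators = type(self).locators
        if name in locators:
            by, value = locators[name]
            return self.driver.find_element(by, value)
        raise AttributeError(name)


class LoginScreen(BasePage):
    locators = {
        # Hypothetical selectors; real ones would come from element discovery
        "email_field": ("accessibility id", "email"),
        "password_field": ("accessibility id", "password"),
        "login_button": ("accessibility id", "login"),
    }

    def login(self, email, password):
        self.email_field.send_keys(email)
        self.password_field.send_keys(password)
        self.login_button.click()
```

Looking elements up on each access, rather than caching them, avoids stale references when the screen is redrawn, at the cost of one extra lookup per interaction.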
Why This Approach?
- No Tool Names - Just write natural language
- No Element Selectors - AI finds them automatically
- No Manual Page Objects - Generated automatically
- No ID/Xpath Maintenance - AI updates as UI changes
- Clean & Readable - Anyone can write the YAML
Advanced Features
- Error Recovery - AI retries with different approaches
- Screenshots - Automatic at each step
- Element Waiting - AI waits for elements to appear
- Scroll Handling - AI scrolls to find elements
- Multi-Platform - iOS and Android
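As an illustration of the error-recovery and element-waiting ideas above, the sketch below tries several locator candidates in order and repeats the whole list a few times before giving up. It is not the project's actual recovery logic; `find_with_fallbacks` and its parameters are hypothetical, and `find` can be any callable with the shape of a driver's `find_element(by, value)`.

```python
import time


def find_with_fallbacks(find, candidates, attempts=3, delay=0.5):
    """Try each (strategy, selector) pair in order, repeating the list up to `attempts` times."""
    last_error = None
    for attempt in range(attempts):
        if attempt:
            time.sleep(delay)  # give the UI time to settle between rounds
        for by, value in candidates:
            try:
                return find(by, value)
            except Exception as exc:  # a real driver raises NoSuchElementException here
                last_error = exc
    raise LookupError(f"no candidate matched after {attempts} attempts") from last_error
```

With a live session this could be called as `find_with_fallbacks(driver.find_element, [("accessibility id", "login"), ("xpath", "//*[@name='Login']")])`, where both selectors are example values.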
YAML Guides
- YAML Quick Start - 5-minute guide
- YAML Comprehensive Guide - Full reference
- Example Workflows - Ready-to-use examples
Usage Example
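import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch server.py as a subprocess and connect over stdio
    # (uses the official `mcp` Python SDK client; install with: pip install mcp)
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Call a tool
            result = await session.call_tool("select_platform", {"platform": "ios"})
            print(result)

asyncio.run(main())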
Tool Schemas
Each tool includes:
- name: Unique tool identifier
- description: Human-readable description
- inputSchema: JSON schema for parameters
- execute: Async function that implements the tool
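To make the four fields concrete, here is a sketch of what a definition for `select_platform` might look like as a plain dict, together with a small dispatcher that checks required arguments before awaiting `execute`. The schema contents, the `call_tool` helper, and the returned fields are assumptions for illustration; the actual definitions in the `tools_*.py` files may differ.

```python
import asyncio


async def select_platform(args: dict) -> dict:
    """Hypothetical tool body: echo the chosen platform back in a status response."""
    platform = args["platform"]
    return {"status": "success", "message": f"Platform set to {platform}", "platform": platform}


SELECT_PLATFORM_TOOL = {
    "name": "select_platform",
    "description": "Select iOS or Android platform",
    "inputSchema": {
        "type": "object",
        "properties": {"platform": {"type": "string", "enum": ["ios", "android"]}},
        "required": ["platform"],
    },
    "execute": select_platform,
}


async def call_tool(tool: dict, args: dict) -> dict:
    """Check required arguments from the schema, then await the tool's execute function."""
    missing = [key for key in tool["inputSchema"].get("required", []) if key not in args]
    if missing:
        return {"status": "error", "message": f"Missing arguments: {', '.join(missing)}"}
    return await tool["execute"](args)


print(asyncio.run(call_tool(SELECT_PLATFORM_TOOL, {"platform": "ios"})))
```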
Error Handling
All tools return a status JSON response:
{
  "status": "success|error|warning",
  "message": "Human-readable message",
  ...tool-specific fields
}
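A helper that builds this response keeps the `status` values consistent across tools. The sketch below assumes the three status strings shown above are the only valid ones; `make_response` is a hypothetical name, and the extra keyword arguments stand in for the tool-specific fields.

```python
import json

VALID_STATUSES = ("success", "error", "warning")


def make_response(status: str, message: str, **fields) -> str:
    """Serialize a status response; extra keyword arguments become tool-specific fields."""
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}, got {status!r}")
    return json.dumps({"status": status, "message": message, **fields})


print(make_response("success", "Element text retrieved", text="Login"))
```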
Logging
Logs are written to stderr with format:
[TOOL START] tool_name: arguments
[TOOL END] tool_name
[TOOL ERROR] tool_name: error message
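One way to produce these three line types uniformly is a decorator around each async tool function. The sketch below is an assumption about how such logging could be wired up, not a copy of `logger.py`; in particular, it does not emit `[TOOL END]` when a tool raises, which may differ from the real behavior.

```python
import asyncio
import functools
import sys


def logged_tool(func):
    """Wrap an async tool so its start, end, and errors are written to stderr."""
    name = func.__name__

    @functools.wraps(func)
    async def wrapper(args):
        print(f"[TOOL START] {name}: {args}", file=sys.stderr)
        try:
            result = await func(args)
        except Exception as exc:
            print(f"[TOOL ERROR] {name}: {exc}", file=sys.stderr)
            raise
        print(f"[TOOL END] {name}", file=sys.stderr)
        return result

    return wrapper


@logged_tool
async def echo(args):
    # Hypothetical tool used only to demonstrate the log lines
    return {"status": "success", "message": "ok", **args}


if __name__ == "__main__":
    asyncio.run(echo({"text": "hello"}))
```

Writing to stderr keeps stdout free for the MCP protocol stream when the server runs over stdio.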
Differences from TypeScript Implementation
- Async/Await: Python version uses async/await instead of Promise-based TypeScript
- Module Organization: Simplified into single Python files instead of separate TypeScript modules
- Error Handling: try/except blocks instead of TypeScript error handling
- Type System: Uses type hints instead of TypeScript types
Contributing
To add new tools:
- Create a function in the appropriate tools_*.py file
- Register it in server.py in the register_tools() function
- Document the tool parameters and return values
License
Same as parent appium-mcp project.