Automation MCP
Enables AI assistants to automate macOS desktop tasks including mouse control, keyboard input, screenshots, window management, and UI interaction.
🤖 Automation MCP
Automation MCP is a Model Context Protocol (MCP) server that provides AI models with complete desktop automation capabilities on macOS. It enables AI assistants to:
- 🖱️ Control your mouse (click, move, scroll, drag)
- ⌨️ Type and send keyboard input (including system shortcuts)
- 📸 Take screenshots and analyze screen content
- 🪟 Manage windows (focus, move, resize, minimize)
- 🎯 Interact with UI elements through coordinates
- 🎨 Analyze screen colors and highlight regions
- 🔍 Wait for images to appear on screen
🚀 Quick start
Make sure you have furi installed, and then run the following command:
furi add ashwwwin/automation-mcp
Then start the server:
furi start ashwwwin/automation-mcp
That's it. If you prefer not to use the CLI, the furi desktop app does the same thing.
🥲 Normal start (without furi)
Prerequisites
- Bun runtime - Install with:
curl -fsSL https://bun.sh/install | bash
1. Clone and Install
git clone https://github.com/ashwwwin/automation-mcp.git
cd automation-mcp
bun install
2. Start the Server
# Start with HTTP transport (recommended for web apps)
bun run index.ts
# Or start with stdio transport (for command line tools)
bun run index.ts --stdio
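The two start commands differ only in the `--stdio` flag. As a rough illustration of that switch, an entry point could select its transport as sketched below. This is not the project's actual code: `pickTransport` and the `Transport` type are hypothetical names, and the real index.ts may parse its arguments differently.

```typescript
// Hypothetical helper, for illustration only.
type Transport = "http" | "stdio";

function pickTransport(argv: string[]): Transport {
  // Per the README: HTTP is the default, --stdio opts into stdio.
  return argv.includes("--stdio") ? "stdio" : "http";
}

// Arguments as they would appear after `bun run index.ts`.
console.log(pickTransport([]));          // prints: http
console.log(pickTransport(["--stdio"])); // prints: stdio
```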
3. Grant Permissions
On first run, macOS will ask for permissions. You must grant these for full functionality:
- Accessibility - Allows keyboard/mouse control
- Screen Recording - Enables screenshots and screen analysis
You can also enable them manually in System Settings → Privacy & Security, under Accessibility and Screen Recording.
🛠️ Available Tools
🖱️ Mouse Control
- mouseClick - Click at coordinates with left/right/middle button
- mouseDoubleClick - Double-click at coordinates
- mouseMove - Move cursor to position
- mouseGetPosition - Get current cursor location
- mouseScroll - Scroll in any direction
- mouseDrag - Drag from current position to target
- mouseButtonControl - Press/release mouse buttons
- mouseMovePath - Follow a smooth path with multiple points
⌨️ Keyboard Input
- type - Type text or press key combinations
- keyControl - Advanced key press/release control
- systemCommand - Common shortcuts (copy, paste, undo, save, etc.)
📸 Screen Capture & Analysis
- screenshot - Capture full screen, regions, or specific windows
- screenInfo - Get screen dimensions
- screenHighlight - Highlight screen regions visually
- colorAt - Get color of any pixel
- waitForImage - Wait for images to appear (template matching)
🪟 Window Management
- getWindows - List all open windows
- getActiveWindow - Get current active window
- windowControl - Focus, move, resize, minimize windows
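MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `mouseClick`. The helper name `buildToolCall` is made up for this example, and the argument names (`x`, `y`, `button`) are assumptions: the server's real input schema is whatever it reports via `tools/list`.

```typescript
// JSON-RPC 2.0 request shape that MCP uses to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumed argument names; verify against the schema from tools/list.
const click = buildToolCall(1, "mouseClick", { x: 200, y: 150, button: "left" });
console.log(JSON.stringify(click, null, 2));
```

In practice an MCP client library or host application (such as Claude Desktop) constructs and sends these messages for you; the sketch only shows what travels over the wire.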
🔒 Security & Permissions
- Accessibility - Required for:
  - Mouse clicks and movement
  - Keyboard input simulation
  - Window management
- Screen Recording - Required for:
  - Taking screenshots
  - Screen analysis
  - Color detection
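When a tool call fails, it helps to know which permission that tool depends on. The lookup below is a minimal sketch of the split described above. The tool-to-permission mapping is inferred from this README rather than taken from the server's code, only a few tools are listed, and `permissionFor` is a hypothetical helper.

```typescript
type Permission = "Accessibility" | "Screen Recording";

// Assumed mapping, inferred from the permission lists in this README.
const REQUIRED_PERMISSION = new Map<string, Permission>([
  ["mouseClick", "Accessibility"],
  ["mouseMove", "Accessibility"],
  ["type", "Accessibility"],
  ["windowControl", "Accessibility"],
  ["screenshot", "Screen Recording"],
  ["colorAt", "Screen Recording"],
]);

// Returns the permission a tool likely needs, or undefined if it is not listed.
function permissionFor(tool: string): Permission | undefined {
  return REQUIRED_PERMISSION.get(tool);
}

console.log(permissionFor("screenshot")); // prints: Screen Recording
console.log(permissionFor("mouseClick")); // prints: Accessibility
```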
🚀 Integration Examples
With Claude Desktop + furi
If you've already configured furi with Claude Desktop, no further setup is needed.
Otherwise, add the following to your MCP configuration:
{
  "mcpServers": {
    "furi": {
      "command": "furi",
      "args": ["connect"]
    }
  }
}
With Claude Desktop
Add to your MCP configuration:
{
  "mcpServers": {
    "automation": {
      "command": "bun",
      "args": ["run", "/path/to/automation-mcp/index.ts", "--stdio"]
    }
  }
}
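The `/path/to/automation-mcp` part must be replaced with the location of your own clone. As a convenience, a small script can generate the entry for you. This is only a sketch: `buildClaudeConfig` is a hypothetical helper, and the path passed to it here is still the placeholder from the configuration above.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: builds the stdio configuration for a given clone location.
function buildClaudeConfig(repoPath: string): McpConfig {
  // Drop trailing slashes so the joined path has exactly one separator.
  const entryPoint = `${repoPath.replace(/\/+$/, "")}/index.ts`;
  return {
    mcpServers: {
      automation: { command: "bun", args: ["run", entryPoint, "--stdio"] },
    },
  };
}

// Placeholder path; substitute the directory where you cloned the repository.
console.log(JSON.stringify(buildClaudeConfig("/path/to/automation-mcp"), null, 2));
```

Paste the printed JSON into your MCP configuration, merging it with any existing `mcpServers` entries.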
🐛 Troubleshooting
Common Issues
Permission Denied Errors
- Ensure Accessibility and Screen Recording permissions are granted
- Ensure Xcode Command Line Tools are installed:
xcode-select --install
🙋‍♂️ Support
Having issues? Check the troubleshooting section above or open an issue with:
- Your operating system and version
- Error messages
- Steps to reproduce