computer-control-mcp-lands
Provides comprehensive computer control capabilities, including mouse and keyboard automation, screen capture, OCR text recognition, and window management, through the Model Context Protocol (MCP).
Computer Control MCP Lands
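🖥️ A powerful computer-control MCP server that provides mouse, keyboard, OCR, and other computer control features
📖 Introduction
Computer Control MCP Lands is a server built on the Model Context Protocol (MCP) that provides comprehensive computer control. It uses PyAutoGUI, RapidOCR, ONNXRuntime, and related libraries, and is similar to Anthropic's 'computer-use' feature but with zero external dependencies.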
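✨ Key Features
🖱️ Mouse Control
- Mouse movement and clicking
- Drag operations
- Scroll wheel control
- Multiple click modes (left click, right click, double click)
⌨️ Keyboard Control
- Text input
- Key press simulation
- Key combination support
- Special key handling
📸 Screenshots
- Full-screen capture
- Window capture
- Region capture
- Multi-monitor support
🔍 OCR Text Recognition
- High-accuracy text recognition
- Chinese and English support
- Coordinate location of recognized text
- Confidence scores
- Bounding box drawing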
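To make the coordinate, confidence, and bounding-box items concrete, here is a minimal sketch of how recognized text might be turned into a click target. The `(box, text, score)` result shape, the `find_text_center` helper, and the 0.8 threshold are assumptions for illustration, not this server's actual output format or API.

```python
from typing import Optional

# Hypothetical OCR result shape: (box, text, score), where box is four
# (x, y) corner points. This is an assumption, not the server's real output.
Box = list[tuple[float, float]]
OcrItem = tuple[Box, str, float]


def box_center(box: Box) -> tuple[int, int]:
    """Return the integer center of a four-point bounding box."""
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    return round(sum(xs) / len(xs)), round(sum(ys) / len(ys))


def find_text_center(
    items: list[OcrItem], target: str, min_score: float = 0.8
) -> Optional[tuple[int, int]]:
    """Return the center of the highest-scoring item whose text contains target."""
    matches = [it for it in items if target in it[1] and it[2] >= min_score]
    if not matches:
        return None
    best = max(matches, key=lambda it: it[2])
    return box_center(best[0])


sample = [
    ([(10, 10), (110, 10), (110, 40), (10, 40)], "File", 0.97),
    ([(200, 10), (300, 10), (300, 40), (200, 40)], "Fi1e", 0.42),
]
print(find_text_center(sample, "File"))  # (60, 25)
```

Filtering on the confidence score before clicking keeps a low-quality match, such as the misread "Fi1e" above, from becoming a click target.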
🪟 Window Management
- Window listing
- Window activation
- Window search
- Fuzzy matching
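The idea behind fuzzy window matching can be sketched as follows. The project lists fuzzywuzzy as a dependency, but this example swaps in the standard library's `difflib.SequenceMatcher` so that it runs without extra packages. The `best_window_match` helper, the substring shortcut, and the 0.6 threshold are assumptions, not the server's actual matching logic.

```python
from difflib import SequenceMatcher
from typing import Optional


def best_window_match(
    pattern: str, titles: list[str], threshold: float = 0.6
) -> Optional[str]:
    """Return the title most similar to pattern, or None if below threshold."""

    def score(title: str) -> float:
        p, t = pattern.lower(), title.lower()
        if p in t:
            return 1.0  # a case-insensitive substring hit counts as a full match
        return SequenceMatcher(None, p, t).ratio()

    scored = [(score(t), t) for t in titles]
    if not scored:
        return None
    best_score, best_title = max(scored, key=lambda pair: pair[0])
    return best_title if best_score >= threshold else None


# Hypothetical window titles for illustration.
titles = ["Untitled - Notepad", "Google Chrome", "Visual Studio Code"]
print(best_window_match("notepad", titles))  # Untitled - Notepad
```

Returning `None` below the threshold lets the caller report that no window matched instead of activating an unrelated one.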
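📷 Showcase
Screenshot feature

OCR text recognition and annotation

The image above shows the OCR feature, which accurately recognizes on-screen text and annotates it with bounding boxes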
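🚀 Quick Start
Installation
pip install computer-control-mcp-lands
Basic Usage
Run as an MCP server
computer-control-mcp-server
Use as a command-line tool
computer-control-mcp --help
Configuration Example
Add the following to your MCP client configuration: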
{
  "mcpServers": {
    "computer-control": {
      "command": "computer-control-mcp-server",
      "args": []
    }
  }
}
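If your client configuration already lists other servers, the entry has to be merged in rather than pasted over the whole file. A minimal sketch of that merge is below. The `add_server` helper and the `other-server` entry are hypothetical, and where the configuration file lives depends on your MCP client.

```python
import json
from typing import Optional


def add_server(
    config: dict, name: str, command: str, args: Optional[list[str]] = None
) -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args or [])}
    merged["mcpServers"] = servers
    return merged


# "other-server" is a placeholder for whatever the client already has configured.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
updated = add_server(existing, "computer-control", "computer-control-mcp-server")
print(json.dumps(updated, indent=2))
```

The helper copies the dictionaries it modifies, so the original configuration object is left unchanged and can be kept as a backup.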
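🛠️ Available Tools
Mouse Operations
- click_screen(x, y) - Click at the specified screen position
- move_mouse(x, y) - Move the mouse to the specified position
- drag_mouse(from_x, from_y, to_x, to_y) - Drag the mouse from one position to another
Keyboard Operations
- type_text(text) - Type text
- press_key(key) - Press the specified key
Screen Operations
- take_screenshot() - Capture the screen
- get_screen_size() - Get the screen size
Window Management
- list_windows() - List all windows
- activate_window(title_pattern) - Activate the specified window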
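An MCP client invokes these tools by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only illustrates the shape of that message, using the tool and parameter names listed above. The `tool_call` helper is hypothetical, and in practice an MCP client library handles this framing along with the initialization handshake.

```python
import itertools
import json

# Simple incrementing request IDs; real clients manage these themselves.
_ids = itertools.count(1)


def tool_call(name: str, **arguments) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as a JSON string."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


print(tool_call("click_screen", x=100, y=200))
print(tool_call("type_text", text="hello"))
```

Each request carries the tool name and a dictionary of named arguments, which is why the tools are documented with named parameters such as `x`, `y`, and `text`.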
📋 System Requirements
- Python 3.12+
- Windows
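A quick preflight check against these two requirements might look like the sketch below. The `meets_requirements` helper is hypothetical and is not part of the package.

```python
import sys


def meets_requirements(version: tuple[int, int], platform: str) -> bool:
    """Check the stated requirements: Python 3.12+ on Windows."""
    # sys.platform is "win32" on Windows, for both 32-bit and 64-bit builds.
    return version >= (3, 12) and platform.startswith("win")


print(meets_requirements(sys.version_info[:2], sys.platform))
```

Passing the version and platform as arguments keeps the check easy to test on any machine.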
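🔧 Dependencies
- pyautogui - Mouse and keyboard control
- mcp[cli] - MCP support
- pillow - Image processing
- pygetwindow - Window management
- fuzzywuzzy - Fuzzy matching
- rapidocr - OCR text recognition
- onnxruntime - AI inference engine
- opencv-python - Computer vision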
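🔒 Security Notes
This tool has full control over the system. Please:
- Use it only in trusted environments
- Avoid running untested scripts on production systems
- Review and update dependencies regularly
⭐ If this project helps you, please give it a star!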