computer-control-mcp-lands

Provides comprehensive computer control capabilities, including mouse and keyboard automation, screen capture, OCR text recognition, and window management, through the Model Context Protocol (MCP).


Computer Control MCP Lands

🖥️ A powerful computer-control MCP server that provides mouse, keyboard, OCR, and other computer control features

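📖 Introduction

Computer Control MCP Lands is a server built on the Model Context Protocol (MCP) that provides comprehensive computer control capabilities. It uses PyAutoGUI, RapidOCR, ONNXRuntime, and related technologies, and is similar to Anthropic's 'computer-use' feature, but with zero external dependencies.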

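✨ Key Features

🖱️ Mouse Control

  • Mouse movement and clicking
  • Drag-and-drop operations
  • Scroll wheel control
  • Multiple click modes (left click, right click, double click)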

⌨️ Keyboard Control

  • Text input
  • Key press simulation
  • Key combination support
  • Special key handling

📸 Screenshots

  • Full-screen capture
  • Window capture
  • Region capture
  • Multi-monitor support

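🔍 OCR Text Recognition

  • High-accuracy text recognition
  • Chinese and English support
  • Coordinate positioning
  • Confidence scoring
  • Bounding box drawing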

🪟 Window Management

  • Window listing
  • Window activation
  • Window lookup
  • Fuzzy matching

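📷 Showcase

Screenshot Feature

[Image: screenshot example]

OCR Text Recognition and Annotation

[Image: OCR recognition result]

The image above shows the OCR text recognition feature, which accurately recognizes on-screen text and draws bounding boxes around it.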

🚀 Quick Start

Installation

pip install computer-control-mcp-lands

Basic Usage

Run as an MCP server

computer-control-mcp-server

Use as a command-line tool

computer-control-mcp --help

Configuration Example

Add the following to your MCP client configuration:

{
  "mcpServers": {
    "computer-control": {
      "command": "computer-control-mcp-server",
      "args": []
    }
  }
}

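🛠️ Available Tools

Mouse Operations

  • click_screen(x, y) - Click at the specified screen position
  • move_mouse(x, y) - Move the mouse to the specified position
  • drag_mouse(from_x, from_y, to_x, to_y) - Drag the mouse from one position to another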
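
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for `click_screen`, using the `x` and `y` parameter names from the list above. The `build_tool_call` helper, the request id, and the coordinates are illustrative assumptions; in practice an MCP client library constructs and sends these messages for you.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; it only builds the message text
    and does not send anything to a server.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Example coordinates are arbitrary.
request = build_tool_call(1, "click_screen", {"x": 640, "y": 360})
print(request)
```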

Keyboard Operations

  • type_text(text) - Type text
  • press_key(key) - Press the specified key

Screen Operations

  • take_screenshot() - Capture the screen
  • get_screen_size() - Get the screen size
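
Because `click_screen` and `move_mouse` take absolute coordinates, it can help to check them against the size reported by `get_screen_size()` before sending a call. This is a minimal sketch, assuming the size is available as a `(width, height)` pair; the return format of `get_screen_size()` is not documented here, and `clamp_to_screen` is a hypothetical helper.

```python
def clamp_to_screen(x: int, y: int, screen_size: tuple[int, int]) -> tuple[int, int]:
    """Clamp a coordinate into the range [0, width - 1] x [0, height - 1]."""
    width, height = screen_size
    if width <= 0 or height <= 0:
        raise ValueError("screen size must be positive")
    return min(max(x, 0), width - 1), min(max(y, 0), height - 1)


screen = (1920, 1080)  # example value, not read from a real display
print(clamp_to_screen(2500, -20, screen))  # (1919, 0)
print(clamp_to_screen(640, 360, screen))   # (640, 360)
```

On multi-monitor setups, coordinates outside the primary screen can be legitimate, so treat this check as a starting point rather than a rule.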

Window Management

  • list_windows() - List all windows
  • activate_window(title_pattern) - Activate the specified window

📋 System Requirements

  • Python 3.12+
  • Windows

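🔧 Dependencies

  • pyautogui - Mouse and keyboard control
  • mcp[cli] - MCP protocol support
  • pillow - Image processing
  • pygetwindow - Window management
  • fuzzywuzzy - Fuzzy matching
  • rapidocr - OCR text recognition
  • onnxruntime - AI inference engine
  • opencv-python - Computer vision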

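🔒 Security Notes

This tool has full system control privileges. Please:

  • Use it only in trusted environments
  • Avoid running untested scripts on production systems
  • Review and update dependencies regularly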

⭐ If this project helps you, please give it a star!
