Vision MCP Server
Enables screenshot capture and visual analysis using cloud or local vision models, with tools to describe screens, list windows, and analyze images.
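An MCP server for screenshot capture and vision-model analysis. Works with Claude Code and other MCP clients.
Features
| Tool | Description |
|---|---|
| describe_screen | Captures a screenshot and analyzes it with a vision model (full screen, primary display, or a specific window) |
| take_screenshot | Captures and saves a screenshot without analyzing it |
| list_windows | Lists the titles of all currently visible windows |
| describe_image | Analyzes an existing image file |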
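MCP clients invoke tools such as these through a JSON-RPC `tools/call` request. The sketch below shows roughly what such a request looks like on the wire. The `buildToolCall` helper and the `path` argument name are assumptions made for illustration; this README does not document the tools' input schemas, so check the server's `tools/list` response for the real parameter names.

```javascript
// Hypothetical helper: builds a JSON-RPC 2.0 "tools/call" request object,
// the message shape MCP clients use to invoke a tool by name.
let nextId = 1;

function buildToolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// list_windows is called here without arguments.
const listRequest = buildToolCall("list_windows");

// The "path" argument name is an assumption, not taken from the server.
const imageRequest = buildToolCall("describe_image", { path: "C:\\photo.jpg" });

console.log(JSON.stringify(listRequest));
console.log(JSON.stringify(imageRequest));
```

In practice an MCP client such as Claude Code builds and sends these messages for you; the sketch is only meant to show what sits behind a tool invocation.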
Installation
Option 1: Install as a Claude Code plugin
claude /plugin install github.com/your-username/vision-mcp-server
Option 2: Manual configuration
Add the following to ~/.mcp.json:
{
  "mcpServers": {
    "vision": {
      "command": "node",
      "args": ["path/to/server.mjs"],
      "env": {
        "DASHSCOPE_API_KEY": "your Alibaba Cloud Bailian API key"
      }
    }
  }
}
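If ~/.mcp.json already lists other servers, the new entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; `addVisionServer` is a hypothetical helper, not part of this project, and reading and writing the file is left out.

```javascript
// Hypothetical helper: returns a copy of an existing MCP config object with
// the "vision" entry added, leaving any other configured servers untouched.
function addVisionServer(config, serverPath, apiKey) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      vision: {
        command: "node",
        args: [serverPath],
        env: { DASHSCOPE_API_KEY: apiKey },
      },
    },
  };
}

// Example: a config that already contains another (made-up) server.
const existingConfig = {
  mcpServers: { other: { command: "node", args: ["other.mjs"] } },
};
const updatedConfig = addVisionServer(
  existingConfig,
  "path/to/server.mjs",
  "your-api-key"
);

console.log(JSON.stringify(updatedConfig, null, 2));
```

The function returns a new object instead of mutating its input, so the original config can still be written back unchanged if something goes wrong.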
Prerequisites
- Node.js ≥ 18
- npm ≥ 9
- Alibaba Cloud Bailian API key: create one at bailian.console.aliyun.com → API Key
- Windows: supported (PowerShell + .NET)
- Optional: Ollama + minicpm-v:8b (local fallback model)
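The Node.js requirement above can be checked from a script before starting the server. This is a small sketch; `meetsNodeRequirement` is a hypothetical helper, and the server itself may or may not perform such a check.

```javascript
// Hypothetical helper: returns true when a Node.js version string such as
// "20.11.1" has a major version at or above the required minimum.
function meetsNodeRequirement(version, minMajor = 18) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// process.versions.node holds the running version without a leading "v".
if (!meetsNodeRequirement(process.versions.node)) {
  console.error(`Node.js >= 18 is required, found ${process.versions.node}`);
}
```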
Configuration
| Environment variable | Required | Default | Description |
|---|---|---|---|
| DASHSCOPE_API_KEY | ✅ | - | Alibaba Cloud Bailian API key |
| VISION_CLOUD_MODEL | ❌ | qwen-vl-plus | Cloud model name |
| VISION_LOCAL_MODEL | ❌ | minicpm-v:8b | Local fallback model |
| VISION_SCREENSHOT_DIR | ❌ | ~/Pictures/Screenshots | Directory where screenshots are saved |
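One way to read the table above is as a function from environment variables to settings. The sketch below applies the documented defaults. `resolveConfig` and the returned field names are hypothetical, and throwing when `DASHSCOPE_API_KEY` is missing is an assumption; the actual server may handle a missing key differently, for example by using the local model.

```javascript
// Hypothetical helper: resolves settings from an environment object using
// the defaults listed in the configuration table. homeDir stands in for "~".
function resolveConfig(env, homeDir) {
  // Assumption: a missing key is treated as a hard error.
  if (!env.DASHSCOPE_API_KEY) {
    throw new Error("DASHSCOPE_API_KEY is required");
  }
  return {
    apiKey: env.DASHSCOPE_API_KEY,
    cloudModel: env.VISION_CLOUD_MODEL || "qwen-vl-plus",
    localModel: env.VISION_LOCAL_MODEL || "minicpm-v:8b",
    screenshotDir:
      env.VISION_SCREENSHOT_DIR || `${homeDir}/Pictures/Screenshots`,
  };
}

// Example: only the required key plus one override are set.
const exampleConfig = resolveConfig(
  { DASHSCOPE_API_KEY: "your-api-key", VISION_CLOUD_MODEL: "qwen-vl-max" },
  "/home/user"
);
console.log(exampleConfig);
```

Passing `env` and `homeDir` as parameters, rather than reading `process.env` directly, keeps the sketch easy to test.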
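Usage examples
"Take a look at the page currently open in my browser"
"Take a full-screen screenshot"
"Analyze this image: C:\photo.jpg"
"Which windows are open right now?"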