MCP Vision Server

MCP Vision Server

Enables Claude Code to describe images and extract text using Kimi/Moonshot vision API. Supports local image files with customizable prompts.

Category
Visit Server

README

MCP Vision Server · 视觉识别服务

License: MIT

语言 / Language: 中文 | English

基于 Kimi/Moonshot 视觉 API 的 MCP 服务器,作为 Claude Code 全局插件使用。传入本地图片路径,返回 AI 对图片内容的详细描述、文字提取等。

MCP server for image recognition via Kimi/Moonshot vision API. Works as a global Claude Code plugin.


中文

功能

  • describe_image — 识别图片内容,返回文字描述
  • describe_image_to_file — 识别并保存为 UTF-8 文件(解决 Windows 终端中文乱码)
  • 支持 PNG / JPG / GIF / WebP / BMP,最大 20MB
  • 支持自定义提示词(如"提取所有文字""描述图表结构")

安装

pip install mcp-vision-server

或从源码安装:

git clone https://github.com/coffe-d/MCP-Vision-Server.git
cd mcp-vision-server
pip install -e .

获取 API Key

Moonshot 开放平台 注册并创建 API Key。

注册到 Claude Code

claude mcp add vision-server \
  --env KIMI_API_KEY="sk-你的密钥" \
  -- mcp-vision-server

注册后 Claude Code 即可使用 describe_imagedescribe_image_to_file 两个工具。

配置

环境变量 必填 默认值 说明
KIMI_API_KEY Moonshot API 密钥
KIMI_BASE_URL https://api.moonshot.cn/v1 API 地址
KIMI_MODEL moonshot-v1-8k-vision-preview 模型名称

工具说明

describe_image — 识别图片,返回文本描述。

参数 类型 必填 默认值 说明
image_path string 图片绝对路径
prompt string 自定义提示词
max_tokens int 4096 最大输出长度

describe_image_to_file — 识别图片,结果保存为 UTF-8 文件。适合中文环境避免终端乱码。

参数 类型 必填 默认值 说明
image_path string 图片绝对路径
output_path string 自动(同名 .md) 输出文件路径

常见问题

"KIMI_API_KEY environment variable is not set"

未设置环境变量。注册时确保使用了 --env KIMI_API_KEY="sk-..."

终端中文乱码

使用 describe_image_to_file 代替 describe_image,结果直接写入 UTF-8 文件。

"不支持的图片格式"

仅支持 PNG、JPG、JPEG、GIF、WebP、BMP 格式。

许可

MIT — 详见 LICENSE


English

Features

  • describe_image — Recognize image content and return text description
  • describe_image_to_file — Recognize and save result to a UTF-8 file
  • Supports PNG / JPG / GIF / WebP / BMP up to 20MB
  • Customizable prompt for targeted extraction

Install

pip install mcp-vision-server

Or from source:

git clone https://github.com/coffe-d/MCP-Vision-Server.git
cd mcp-vision-server
pip install -e .

Get an API key

Sign up at Moonshot Platform and create an API key.

Register with Claude Code

claude mcp add vision-server \
  --env KIMI_API_KEY="sk-your-key-here" \
  -- mcp-vision-server

Configuration

Variable Required Default Description
KIMI_API_KEY Yes Moonshot API key
KIMI_BASE_URL No https://api.moonshot.cn/v1 API base URL
KIMI_MODEL No moonshot-v1-8k-vision-preview Model name

API Reference

describe_image — Return image description as text.

Parameter Type Required Default Description
image_path string Yes Absolute path to image
prompt string No Custom prompt
max_tokens int No 4096 Max output tokens

describe_image_to_file — Save result to a UTF-8 file.

Parameter Type Required Default Description
image_path string Yes Absolute path to image
output_path string No auto (.md) Output file path

Troubleshooting

"KIMI_API_KEY environment variable is not set" — Make sure you passed --env KIMI_API_KEY="sk-..." when running claude mcp add.

Garbled Chinese in terminal — Use describe_image_to_file to write directly to UTF-8 file.

License

MIT — see LICENSE.

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured