MCP Image Recognition Server

Enables image analysis and recognition through multiple LLM vision models (Gemini, GPT-4o, Qwen-VL, Doubao) by accepting image URLs or Base64 data and returning text descriptions or answers to questions about the images.

MCP Image Recognition Server (Python)

An MCP server implementation in Python providing image recognition capabilities using various LLM providers (Gemini, OpenAI, Qwen/Tongyi, Doubao, etc.).

Features

- Image Recognition: Describe images or answer questions about them.
- Multi-Model Support: Dynamically switch between Gemini, GPT-4o, Qwen-VL, Doubao, etc.
- Flexible Input: Accepts image URLs or Base64 data.

Quick Setup (Recommended)

The repository includes setup scripts that create the environment and install the dependencies in one step.

Linux / macOS

git clone https://github.com/glasses666/mcp-image-recognition-py.git
cd mcp-image-recognition-py
./setup.sh

Windows

Clone or download this repository.
Double-click setup.bat.

After the script finishes, edit the .env file and add your API keys.

Installation & Usage (Manual)

If you prefer manual installation or want to use uv:

Prerequisites

- Python 3.10 or higher
- An API key for your preferred model provider (Google Gemini, OpenAI, Aliyun DashScope, etc.)

Method 1: Using `uv` (Recommended)

uv is an extremely fast Python package manager.

1. Run directly with `uv run`

You don't need to manually create a virtual environment.

# Clone the repo
git clone https://github.com/glasses666/mcp-image-recognition-py.git
cd mcp-image-recognition-py

# Create .env file with your API keys
cp .env.example .env
# Edit .env with your keys

# Run the server
uv run server.py

2. Using `uvx` (for ephemeral execution)

To run the server without cloning the repository yourself (experimental; uvx fetches it from git):

# Note: You still need to provide environment variables. 
# It's easier to clone and use 'uv run' for persistent config via .env
uvx --from git+https://github.com/glasses666/mcp-image-recognition-py mcp-image-recognition

Method 2: Standard Python (pip)

Linux / macOS

Clone and Setup:

git clone https://github.com/glasses666/mcp-image-recognition-py.git
cd mcp-image-recognition-py
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Configure:

cp .env.example .env
# Edit .env and add your API keys

Run:
```
python server.py
```

Windows

Clone and Setup:

git clone https://github.com/glasses666/mcp-image-recognition-py.git
cd mcp-image-recognition-py
python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt

Configure:

copy .env.example .env
# Edit .env and add your API keys

Run:
```
python server.py
```

Configuration

Create a .env file in the project root based on .env.example:

1. For Google Gemini (Recommended for speed/cost)

Get an API key from Google AI Studio.

GEMINI_API_KEY=your_google_api_key
DEFAULT_MODEL=gemini-1.5-flash

2. For Tongyi Qianwen (Qwen - Alibaba Cloud)

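Get an API key from Aliyun DashScope. DashScope is reached through its OpenAI-compatible endpoint, so the key goes in OPENAI_API_KEY and the endpoint in OPENAI_BASE_URL.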

OPENAI_API_KEY=your_dashscope_api_key
OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
DEFAULT_MODEL=qwen-vl-max

3. For Doubao (Volcengine)

Get an API key from Volcengine Ark. As with Qwen, the key and endpoint are supplied through the OPENAI_* variables.

OPENAI_API_KEY=your_volcengine_api_key
OPENAI_BASE_URL=https://ark.cn-beijing.volces.com/api/v3
DEFAULT_MODEL=doubao-pro-32k
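
The three configurations above share a small set of variable names, so a quick check before starting the server can catch a missing key. The sketch below is not taken from server.py. `parse_env` and `missing_settings` are hypothetical helpers, and two rules in it are assumptions made for illustration: that model names starting with `gemini` use GEMINI_API_KEY, and that every other model goes through the OpenAI-compatible settings.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env: dict[str, str]) -> list[str]:
    """Return the variables that appear to be required but are unset.

    Assumption: "gemini*" models need GEMINI_API_KEY; all other models
    need OPENAI_API_KEY (with OPENAI_BASE_URL selecting the endpoint).
    """
    model = env.get("DEFAULT_MODEL", "")
    if not model:
        # This sketch treats DEFAULT_MODEL as required.
        return ["DEFAULT_MODEL"]
    required = ["GEMINI_API_KEY"] if model.startswith("gemini") else ["OPENAI_API_KEY"]
    return [name for name in required if not env.get(name)]
```

For example, `missing_settings(parse_env(open(".env").read()))` returns an empty list when the selected model has its key set.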

MCP Client Configuration (Claude Desktop, etc.)

To use this server with an MCP client (like Claude Desktop), add it to your configuration file.

Configuration File Paths

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json (if available)
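
If you script your setup, the paths listed above can be resolved per platform. This is a minimal sketch; `claude_config_path` is a hypothetical helper, and the `system` argument is assumed to hold a value as returned by `platform.system()`.

```python
from pathlib import PurePosixPath, PureWindowsPath

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(system: str, home: str, appdata: str = "") -> str:
    """Return the Claude Desktop config path for the given platform.

    system:  "Darwin", "Windows", or "Linux" (as from platform.system()).
    home:    the user's home directory (used on macOS and Linux).
    appdata: the value of %APPDATA% (used on Windows only).
    """
    if system == "Darwin":
        base = PurePosixPath(home) / "Library" / "Application Support" / "Claude"
        return str(base / CONFIG_NAME)
    if system == "Windows":
        return str(PureWindowsPath(appdata) / "Claude" / CONFIG_NAME)
    # Linux path, which the list above marks as "if available".
    return str(PurePosixPath(home) / ".config" / "Claude" / CONFIG_NAME)
```

A typical call would pass `platform.system()`, `os.path.expanduser("~")`, and `os.environ.get("APPDATA", "")`.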

Configuration JSON

Option A: Using uv (Easiest)

If you have uv installed, you can let it handle the environment.

{
  "mcpServers": {
    "image-recognition": {
      "command": "/path/to/uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/mcp-image-recognition-py",
        "server.py"
      ],
      "env": {
        "GEMINI_API_KEY": "your_gemini_key_here",
        "OPENAI_API_KEY": "your_openai_key_here",
        "OPENAI_BASE_URL": "https://api.openai.com/v1",
        "DEFAULT_MODEL": "gemini-1.5-flash"
      }
    }
  }
}

Option B: Standard Python Venv

Ensure you provide the absolute path to the python executable in your virtual environment.

{
  "mcpServers": {
    "image-recognition": {
      "command": "/absolute/path/to/mcp-image-recognition-py/venv/bin/python", 
      "args": [
        "/absolute/path/to/mcp-image-recognition-py/server.py"
      ],
      "env": {
        "GEMINI_API_KEY": "your_gemini_key_here",
        "OPENAI_API_KEY": "your_openai_key_here",
        "OPENAI_BASE_URL": "https://api.openai.com/v1",
        "DEFAULT_MODEL": "gemini-1.5-flash"
      }
    }
  }
}

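Windows note: JSON treats a backslash as an escape character, so write paths with double backslashes \\ (e.g., C:\\Users\\Name\\...). For Option B, the interpreter is at venv\\Scripts\\python.exe rather than venv/bin/python.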
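
One way to avoid escaping mistakes is to generate the JSON instead of typing it, since `json.dumps` doubles the backslashes automatically. The sketch below builds the Option B entry. `build_server_entry` is a hypothetical helper, and the Windows paths are placeholders to replace with your own.

```python
import json


def build_server_entry(python_exe: str, server_script: str,
                       env: dict[str, str]) -> str:
    """Return a claude_desktop_config.json document for Option B."""
    config = {
        "mcpServers": {
            "image-recognition": {
                "command": python_exe,
                "args": [server_script],
                "env": env,
            }
        }
    }
    return json.dumps(config, indent=2)


# Placeholder paths; raw strings keep the single backslashes as typed.
text = build_server_entry(
    r"C:\Users\Name\mcp-image-recognition-py\venv\Scripts\python.exe",
    r"C:\Users\Name\mcp-image-recognition-py\server.py",
    {"GEMINI_API_KEY": "your_gemini_key_here",
     "DEFAULT_MODEL": "gemini-1.5-flash"},
)
print(text)  # paths appear with doubled backslashes in the JSON output
```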

Tool Usage

`recognize_image`

Analyzes an image and returns a text description.

Parameters:

- image (string, required): The image to analyze. Supports:
  - HTTP/HTTPS URLs (e.g., https://example.com/cat.jpg)
  - Base64-encoded strings (with or without the data:image/...;base64, prefix)
- prompt (string, optional): Instruction or question for the model. Default: "Describe this image".
- model (string, optional): Overrides the default model for this request.

License

MIT
