Gemini Image MCP Server

Provides image generation, modification, and analysis capabilities using Google's Gemini API, enabling AI-powered image operations through natural language.

A Model Context Protocol (MCP) server that provides image generation and manipulation capabilities using Google's Gemini API. This server integrates with Claude Desktop and other MCP-compatible clients to enable AI-powered image operations.

Features

Image Generation: Create images from text prompts using Gemini 2.5 Flash Image Preview
Image Modification: Modify existing images with natural language instructions
Image Analysis: Analyze images for objects, text, colors, emotions, and comprehensive insights
Batch Generation: Generate multiple images from different prompts in one operation
Style Transfer: Apply artistic styles to existing images
Rate Limiting: Built-in rate limiting to respect API quotas
Safety Settings: Configurable content safety levels

Available Tools

1. generateImage

Generate images from text prompts with customizable options.

Parameters:

prompt (required): Text description of the image to generate
width (optional): Image width in pixels
height (optional): Image height in pixels
aspectRatio (optional): One of 1:1, 16:9, 9:16, 4:3, 3:4
style (optional): One of realistic, artistic, cartoon, sketch, watercolor, oil-painting
quality (optional): One of standard, high, ultra
numberOfImages (optional): Number of images to generate (default: 1)
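
As a rough illustration of the parameters above, a client could check arguments before sending them. This is only a sketch: the names `GenerateImageArgs` and `validateGenerateImageArgs` are hypothetical, the allowed values are copied from the list above, and the rule that width, height, and numberOfImages are positive integers is an assumption. The server's actual schema may differ.

```typescript
// Allowed values copied from the parameter list above.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4"] as const;
const IMAGE_STYLES = ["realistic", "artistic", "cartoon", "sketch", "watercolor", "oil-painting"] as const;
const QUALITY_LEVELS = ["standard", "high", "ultra"] as const;

// Hypothetical client-side shape of the generateImage arguments.
interface GenerateImageArgs {
  prompt: string;
  width?: number;
  height?: number;
  aspectRatio?: (typeof ASPECT_RATIOS)[number];
  style?: (typeof IMAGE_STYLES)[number];
  quality?: (typeof QUALITY_LEVELS)[number];
  numberOfImages?: number;
}

// Returns a list of problems; an empty list means the arguments look valid.
function validateGenerateImageArgs(args: Record<string, unknown>): string[] {
  const problems: string[] = [];

  const prompt = args["prompt"];
  if (typeof prompt !== "string" || prompt.trim() === "") {
    problems.push("prompt is required and must be a non-empty string");
  }

  const enums: Array<[string, readonly string[]]> = [
    ["aspectRatio", ASPECT_RATIOS],
    ["style", IMAGE_STYLES],
    ["quality", QUALITY_LEVELS],
  ];
  for (const [key, allowed] of enums) {
    const value = args[key];
    if (value !== undefined && (typeof value !== "string" || allowed.indexOf(value) === -1)) {
      problems.push(key + " must be one of: " + allowed.join(", "));
    }
  }

  // Assumption: pixel sizes and image counts are positive integers.
  for (const key of ["width", "height", "numberOfImages"]) {
    const value = args[key];
    const ok = typeof value === "number" && isFinite(value) && value > 0 && Math.floor(value) === value;
    if (value !== undefined && !ok) {
      problems.push(key + " must be a positive integer");
    }
  }
  return problems;
}
```

Calling `validateGenerateImageArgs({ prompt: "a sunset", style: "watercolor" })` returns an empty list, while an unknown style or a missing prompt each add one entry.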

2. modifyImage

Modify existing images using natural language instructions.

Parameters:

imageBase64 (required): Base64 encoded image data
instructions (required): Text instructions for modification
preserveStyle (optional): Whether to preserve original artistic style
strength (optional): Modification strength from 0 to 1
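
A client often holds an image as a data URL rather than as bare base64. The sketch below builds modifyImage arguments under two assumptions that the parameter list does not confirm: that the server expects raw base64 without a `data:` prefix, and that out-of-range strength values should be clamped to 0..1 rather than rejected. `ModifyImageArgs` and `buildModifyImageArgs` are hypothetical names.

```typescript
// Hypothetical client-side shape of the modifyImage arguments.
interface ModifyImageArgs {
  imageBase64: string;
  instructions: string;
  preserveStyle?: boolean;
  strength?: number;
}

function buildModifyImageArgs(
  image: string,
  instructions: string,
  strength?: number,
  preserveStyle?: boolean
): ModifyImageArgs {
  // Strip a leading "data:<mime>;base64," prefix if present (assumed unwanted).
  const imageBase64 = image.replace(/^data:[^;,]+;base64,/, "");
  const args: ModifyImageArgs = { imageBase64, instructions };
  if (strength !== undefined) {
    // Clamp to the documented 0..1 range.
    args.strength = Math.min(1, Math.max(0, strength));
  }
  if (preserveStyle !== undefined) {
    args.preserveStyle = preserveStyle;
  }
  return args;
}
```

Optional fields are left out entirely when not supplied, so the server's own defaults (whatever they are) still apply.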

3. analyzeImage

Analyze images and extract various types of information.

Parameters:

imageBase64 (required): Base64 encoded image data
analysisType (optional): One of description, objects, text, colors, emotions, comprehensive
detail (optional): Analysis detail level - low, medium, high

4. batchGenerate

Generate multiple images from different prompts efficiently.

Parameters:

prompts (required): Array of text prompts
baseOptions (optional): Shared options to apply to all generations
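
One way to picture how baseOptions relates to prompts is as a per-prompt merge. The sketch below is an assumption about the semantics, not the server's code: `BaseOptions` and `expandBatch` are hypothetical, and the option fields are borrowed from the generateImage parameters.

```typescript
// Hypothetical shared options, borrowed from the generateImage parameters.
type BaseOptions = {
  style?: string;
  quality?: string;
  aspectRatio?: string;
  width?: number;
  height?: number;
};

// Expands a batch into one argument object per prompt,
// copying the shared options into each entry.
function expandBatch(
  prompts: string[],
  baseOptions: BaseOptions = {}
): Array<BaseOptions & { prompt: string }> {
  return prompts.map((prompt) => ({ ...baseOptions, prompt }));
}

const expanded = expandBatch(
  ["a cyberpunk cityscape", "a minimalist cityscape"],
  { quality: "high", aspectRatio: "16:9" }
);
```

Here `expanded` holds two objects that differ only in their `prompt` field.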

5. applyStyleTransfer

Apply artistic styles to existing images.

Parameters:

imageBase64 (required): Base64 encoded image data
style (required): One of anime, renaissance, impressionist, cyberpunk, minimalist, vintage, futuristic
intensity (optional): Style intensity from 0 to 100

Installation

Prerequisites

Node.js 18 or higher
Google Gemini API key (available from Google AI Studio)

Setup

Clone or download the project:

git clone <repository-url>
cd gemini-image-mcp

Install dependencies:

npm install

Create environment configuration:

cp .env.example .env

Edit .env and add your Gemini API key:

GEMINI_API_KEY=your-gemini-api-key-here

Build the project:

npm run build

Usage

With Claude Desktop

Add the server to your Claude Desktop configuration file:

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "gemini-image": {
      "command": "node",
      "args": ["/path/to/gemini-image-mcp/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-gemini-api-key-here"
      }
    }
  }
}

Standalone Usage

You can also run the server directly for testing:

npm start
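
When testing by hand, it helps to know what a tool call looks like on the wire. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch below only builds such a message; it assumes the server uses the stdio transport (as the Claude Desktop configuration suggests), and `makeToolCall` is a hypothetical helper. A real session must complete the MCP `initialize` handshake before any tool call is accepted.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that builds a tools/call request.
function makeToolCall(
  id: number,
  toolName: string,
  toolArgs: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

const toolCallRequest = makeToolCall(1, "generateImage", {
  prompt: "a sunset over mountains",
  style: "watercolor",
});

// The stdio transport frames messages as one JSON object per line.
const wireLine = JSON.stringify(toolCallRequest) + "\n";
```

Writing `wireLine` to the server's stdin after initialization should produce a JSON-RPC response with the same `id` on stdout.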

Configuration Options

Environment variables you can set:

GEMINI_API_KEY (required): Your Google Gemini API key
GEMINI_MODEL (optional): Model to use (default: gemini-2.5-flash-image-preview)
SAFETY_LEVEL (optional): Content safety level - LOW, MEDIUM, HIGH, BLOCK_NONE (default: MEDIUM)
MAX_REQUESTS_PER_MINUTE (optional): Rate limit (default: 10)
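
The defaults above can be summarized as a small loader. This is a sketch of how such configuration might be read, not the server's actual code: `ServerConfig` and `loadConfig` are hypothetical names, and rejecting unknown safety levels or non-integer rate limits is an assumption.

```typescript
const SAFETY_LEVELS = ["LOW", "MEDIUM", "HIGH", "BLOCK_NONE"] as const;
type SafetyLevel = (typeof SAFETY_LEVELS)[number];

// Hypothetical resolved configuration.
interface ServerConfig {
  apiKey: string;
  model: string;
  safetyLevel: SafetyLevel;
  maxRequestsPerMinute: number;
}

// Reads settings from an environment-like object, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env["GEMINI_API_KEY"];
  if (!apiKey) {
    throw new Error("GEMINI_API_KEY environment variable is required");
  }

  const safetyRaw = env["SAFETY_LEVEL"] || "MEDIUM";
  const safetyLevel = SAFETY_LEVELS.filter((level) => level === safetyRaw)[0];
  if (safetyLevel === undefined) {
    throw new Error("SAFETY_LEVEL must be one of: " + SAFETY_LEVELS.join(", "));
  }

  const rate = Number(env["MAX_REQUESTS_PER_MINUTE"] || "10");
  if (!isFinite(rate) || rate <= 0 || Math.floor(rate) !== rate) {
    throw new Error("MAX_REQUESTS_PER_MINUTE must be a positive integer");
  }

  return {
    apiKey,
    model: env["GEMINI_MODEL"] || "gemini-2.5-flash-image-preview",
    safetyLevel,
    maxRequestsPerMinute: rate,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.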

Examples

Once integrated with Claude Desktop, you can use natural language to interact with the tools:

Image Generation

"Generate an image of a sunset over mountains in watercolor style"

Image Modification

"Take this image and add a rainbow in the sky while preserving the original style"

Image Analysis

"Analyze this image and tell me what objects you can detect with confidence scores"

Batch Generation

"Generate 3 different versions of a futuristic cityscape: one cyberpunk style, one minimalist, and one realistic"

Style Transfer

"Apply an impressionist style to this photograph with high intensity"

Development

Running in Development Mode

npm run dev

Building

npm run build

Type Checking

npm run typecheck

Linting

npm run lint

API Limitations

Rate limiting is enforced based on your configuration
Image generation may take 10-30 seconds depending on complexity
Maximum image size depends on Gemini API limits
Content safety filters are applied based on your safety level setting
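
The README does not say how the rate limit is implemented. One common approach for a per-minute limit is a sliding-window log, sketched below under that assumption. `SlidingWindowRateLimiter` is a hypothetical class, not the server's own, and the explicit `now` parameter exists only to make the behavior easy to test.

```typescript
// Sliding-window log: remembers the timestamps of recent requests and
// allows a new one only if fewer than maxRequests fall inside the window.
class SlidingWindowRateLimiter {
  private timestamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number = 60000) {
    if (maxRequests < 1) {
      throw new Error("maxRequests must be at least 1");
    }
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Records the request and returns true if under the limit; otherwise returns false.
  tryAcquire(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxRequests) {
      return false;
    }
    this.timestamps.push(now);
    return true;
  }

  // Milliseconds until a slot frees up (0 if one is available now).
  msUntilAvailable(now: number = Date.now()): number {
    const cutoff = now - this.windowMs;
    const active = this.timestamps.filter((t) => t > cutoff);
    if (active.length < this.maxRequests) {
      return 0;
    }
    // The oldest active request is the first to leave the window.
    return active[0] + this.windowMs - now;
  }
}
```

With the default MAX_REQUESTS_PER_MINUTE of 10, the equivalent would be `new SlidingWindowRateLimiter(10)`. A caller that gets `false` from `tryAcquire` can wait `msUntilAvailable()` milliseconds before retrying.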

Troubleshooting

Common Issues

"GEMINI_API_KEY environment variable is required"
- Ensure you've set the API key in your environment or Claude Desktop config
"Rate limit exceeded"
- Wait for the rate limit window to reset or adjust MAX_REQUESTS_PER_MINUTE
"Image generation failed"
- Check your prompt for potentially unsafe content
- Verify your API key has proper permissions
- Try adjusting the safety level settings

Debug Logging

The server logs errors to stderr. Check the Claude Desktop console or your terminal for detailed error messages.

Contributing

Fork the repository
Create a feature branch
Make your changes
Run tests and linting
Submit a pull request

License

MIT License - see LICENSE file for details

Support

For issues and questions:

Check the troubleshooting section above
Review Claude Desktop MCP documentation
Create an issue in the repository
