Gemini Image Generation MCP Server
Provides image generation capabilities using Google's Gemini 2.0 Flash Preview model through the MCP protocol, enabling AI assistants to generate high-quality images from text prompts.
A Model Context Protocol (MCP) server that provides image generation capabilities using Google's Gemini 2.0 Flash Preview model. This server allows AI assistants to generate high-quality images from text prompts through the MCP protocol.
Features
- Image Generation: Generate images from text prompts using Gemini 2.0 Flash Preview
- Multiple Output Formats: Support for PNG, JPEG, and other image formats
- File Management: Automatic saving of generated images with organized file naming
- Base64 Support: Handle image data in base64 format for easy integration
- Status Monitoring: Check API connection status and model information
- Prompt Templates: Pre-built prompts for optimized image generation
Prerequisites
- Python 3.9 or higher
- Google AI API key (Gemini API access)
- uv package manager (recommended) or pip
Installation
Using uv (Recommended)
- Clone or download this repository:
git clone <repository-url>
cd image-generation-gemini-mcp
- Install dependencies:
uv sync
Using pip
- Install dependencies:
pip install -r requirements.txt
Setup
1. Get Google AI API Key
- Visit Google AI Studio
- Create a new API key
- Copy the API key for use in the next step
2. Set Environment Variable
Set your Gemini API key as an environment variable:
Windows (PowerShell):
$env:GEMINI_API_KEY="your-api-key-here"
Windows (Command Prompt):
set GEMINI_API_KEY=your-api-key-here
macOS/Linux:
export GEMINI_API_KEY="your-api-key-here"
For permanent setup, add the environment variable to your system's environment variables or shell profile.
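As a quick sanity check before starting the server, the variable can be verified from Python. This is a minimal sketch, not code from this project: the helper name `require_gemini_api_key` is hypothetical, and the error text simply mirrors the message listed under Troubleshooting.

```python
import os


def require_gemini_api_key(environ=None):
    """Return the Gemini API key, or raise if it is missing or blank.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable is required")
    return key


if __name__ == "__main__":
    try:
        require_gemini_api_key()
        print("GEMINI_API_KEY is set")
    except RuntimeError as exc:
        print(f"Not configured: {exc}")
```

Running the script in the same shell where you exported the key confirms that the variable is visible to child processes such as `uv run server.py`.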
Usage
Running the Server
Development Mode
To test the server locally:
uv run server.py
MCP Integration Mode
To run as an MCP server:
uv run server.py stdio
Integration with MCP Clients
Claude Desktop Integration
- Add the server to your Claude Desktop configuration file:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "gemini-image-generator": {
      "command": "uv",
      "args": ["run", "server.py", "stdio"],
      "cwd": "C:\\path\\to\\image-generation-gemini-mcp",
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
- Restart Claude Desktop
- The image generation tools will be available in your conversations
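If the configuration file already lists other servers, the entry can be merged in rather than pasted by hand. The following is a rough sketch using only the standard library; the helper names `add_server_entry` and `update_config_file` are hypothetical, and the entry itself copies the keys shown in the JSON above.

```python
import json
from pathlib import Path


def add_server_entry(config, project_dir, api_key):
    """Return a copy of `config` with the gemini-image-generator entry set.

    Existing entries under "mcpServers" are preserved.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["gemini-image-generator"] = {
        "command": "uv",
        "args": ["run", "server.py", "stdio"],
        "cwd": str(project_dir),
        "env": {"GEMINI_API_KEY": api_key},
    }
    updated["mcpServers"] = servers
    return updated


def update_config_file(config_path, project_dir, api_key):
    """Read the config file (if present), merge the entry, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_server_entry(config, project_dir, api_key)
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged
```

Consider backing up the existing file before calling `update_config_file`, since the sketch overwrites it in place.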
Other MCP Clients
For other MCP-compatible clients, configure them to run:
uv run server.py stdio
Set the working directory to this project folder and make sure the GEMINI_API_KEY environment variable is configured.
Available Tools
generate_image
Generate an image from a text prompt.
Parameters:
- prompt (string): Text description of the image to generate
- output_dir (string, optional): Directory to save the image (default: "./generated_images")
Returns:
- success (boolean): Whether generation was successful
- message (string): Status message or error description
- image_data (string): Base64 encoded image data (if successful)
- mime_type (string): MIME type of the generated image
- file_extension (string): File extension for the image
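A client that receives this result typically needs to turn `image_data` back into bytes. The sketch below assumes the result arrives as a plain dict with the fields listed above; the function name `decode_generated_image` is hypothetical, and the fallback to "png" when no extension is given is an assumption rather than documented server behavior. Whether `file_extension` carries a leading dot is not stated, so the sketch accepts both forms.

```python
import base64
import binascii


def decode_generated_image(result):
    """Return (image_bytes, extension) from a generate_image result dict.

    Raises ValueError when the call failed or the payload is not valid base64.
    """
    if not result.get("success"):
        raise ValueError(result.get("message", "image generation failed"))
    try:
        image_bytes = base64.b64decode(result["image_data"], validate=True)
    except (KeyError, TypeError, binascii.Error) as exc:
        raise ValueError("result has no valid base64 image_data") from exc
    # Assumption: fall back to "png" if the server did not report an extension.
    extension = (result.get("file_extension") or "").lstrip(".") or "png"
    return image_bytes, extension
```

The returned bytes can be written to disk directly or passed to an imaging library.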
save_image_from_base64
Save a base64 encoded image to a file.
Parameters:
- image_data (string): Base64 encoded image data
- filename (string): Name for the output file (include extension)
- output_dir (string, optional): Directory to save the image
Returns:
- success (string): "true" or "false"
- message (string): Status message
- file_path (string): Path to saved file (if successful)
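To illustrate the contract described above, here is one way such a tool could behave. This is an illustrative sketch and not the code in server.py: the function name, the default output directory, the message wording, and the choice to strip directory components from `filename` are all assumptions.

```python
import base64
import binascii
from pathlib import Path


def save_image_from_base64_sketch(image_data, filename, output_dir="./generated_images"):
    """Decode base64 image data and write it to output_dir/filename.

    Returns a dict shaped like the documented result, with "success"
    reported as the string "true" or "false".
    """
    try:
        image_bytes = base64.b64decode(image_data, validate=True)
    except (binascii.Error, TypeError, ValueError) as exc:
        return {"success": "false", "message": f"Invalid base64 data: {exc}", "file_path": ""}

    target_dir = Path(output_dir)
    try:
        target_dir.mkdir(parents=True, exist_ok=True)
        # Assumption: keep only the final path component so the file
        # always lands inside output_dir.
        target = target_dir / Path(filename).name
        target.write_bytes(image_bytes)
    except OSError as exc:
        return {"success": "false", "message": f"Could not save image: {exc}", "file_path": ""}

    return {
        "success": "true",
        "message": f"Saved {len(image_bytes)} bytes",
        "file_path": str(target),
    }
```

Note that `success` is a string here, unlike the boolean returned by `generate_image`, so callers should compare against "true" rather than rely on truthiness.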
Available Resources
gemini://api-status
Check the status of the Gemini API connection.
gemini://model-info
Get information about the Gemini image generation model capabilities.
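MCP clients read resources with the protocol's `resources/read` request. For readers curious about what travels over the stdio transport, the sketch below builds that JSON-RPC 2.0 message by hand; the helper name `resource_read_request` is hypothetical, and in practice an MCP client library constructs these messages (and performs the required initialization handshake first).

```python
import json


def resource_read_request(uri, request_id):
    """Build the JSON-RPC 2.0 message for an MCP resources/read call."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


if __name__ == "__main__":
    # One message per line, as used by the stdio transport.
    for request_id, uri in enumerate(("gemini://api-status", "gemini://model-info"), start=1):
        print(json.dumps(resource_read_request(uri, request_id)))
```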
Available Prompts
image_generation_prompt
Generate a detailed prompt optimized for image generation.
Parameters:
- subject (string): Main subject or object to generate
- style (string, optional): Art style (default: "photorealistic")
- mood (string, optional): Mood or atmosphere (default: "neutral")
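The exact template used by the server is not shown here. As a hypothetical illustration of how the three parameters might be combined into a single prompt, consider the sketch below; the function name `build_image_prompt` and the wording of the template are assumptions, while the defaults match those documented above.

```python
def build_image_prompt(subject, style="photorealistic", mood="neutral"):
    """Combine subject, style, and mood into one descriptive prompt string."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    # Hypothetical wording; the real template may differ.
    return (
        f"Create a {style} image of {subject}. "
        f"The overall mood should be {mood}. "
        "Include clear composition, lighting, and fine detail."
    )


if __name__ == "__main__":
    print(build_image_prompt("a friendly robot", style="cartoon", mood="cheerful"))
```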
Example Usage
Once integrated with an MCP client like Claude Desktop, you can:
- Generate an image:
  Please generate an image of a sunset over mountains
- Use specific styles:
  Create a cartoon-style image of a friendly robot
- Check API status:
  Can you check the status of the Gemini API?
- Get model information:
  What are the capabilities of the image generation model?
File Structure
image-generation-gemini-mcp/
├── server.py # Main MCP server implementation
├── requirements.txt # Python dependencies
├── pyproject.toml # Project configuration
├── README.md # This file
├── docs/ # Documentation files
│ ├── development-guidelines.md
│ ├── mcp-info.md
│ └── mcp-python-sdk-readme.md
└── generated_images/ # Default output directory (created automatically)
Troubleshooting
Common Issues
- "GEMINI_API_KEY environment variable is required"
  - Ensure you've set the GEMINI_API_KEY environment variable
  - Verify the API key is valid and has access to the Gemini API
- "Error connecting to Gemini API"
  - Check your internet connection
  - Verify your API key is correct and active
  - Ensure you have access to the Gemini 2.0 Flash Preview model
- "No image was generated"
  - Try rephrasing your prompt
  - Ensure your prompt is descriptive and clear
  - Check if there are any content policy restrictions
- Permission errors when saving files
  - Ensure the output directory is writable
  - Check file system permissions
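For the permission errors above, a short script can confirm whether the output directory can be created and written to by the current user. This is a diagnostic sketch only; the helper name `check_output_dir` is hypothetical, and the default path is the one documented for `generate_image`.

```python
import os
from pathlib import Path


def check_output_dir(output_dir="./generated_images"):
    """Return (ok, message) describing whether output_dir is usable."""
    path = Path(output_dir)
    try:
        # Create the directory if it does not exist yet.
        path.mkdir(parents=True, exist_ok=True)
    except OSError as exc:
        return False, f"cannot create {path}: {exc}"
    # Writing files into a directory needs write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return False, f"{path} is not writable by the current user"
    return True, f"{path} is writable"


if __name__ == "__main__":
    ok, message = check_output_dir()
    print(message)
```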
Getting Help
- Check the MCP documentation
- Review the Google AI documentation
- Ensure all dependencies are properly installed
Development
Code Quality
This project follows the MCP development guidelines:
- Formatting: uv run ruff format .
- Linting: uv run ruff check .
- Type checking: uv run pyright
Testing
Run tests with:
uv run pytest
License
This project is licensed under the MIT License. See the LICENSE file for details.
Contributing
Contributions are welcome! Please follow the MCP development guidelines and ensure all code is properly formatted and type-checked before submitting.