universal-image-mcp
MCP server for multi-provider AI image generation (AWS Bedrock, OpenAI, Google Gemini) enabling image generation, transformation, and editing through a unified interface.
Universal Image MCP - Multi-Provider AI Image Generation Server for Claude Desktop & MCP Clients
Universal MCP server for AI image generation supporting AWS Bedrock (Nova Canvas), OpenAI (GPT Image, DALL-E), and Google Gemini (Imagen 4). Generate, transform, and edit images using multiple AI models through a single Model Context Protocol interface.
What is Universal Image MCP?
Universal Image MCP is a Model Context Protocol (MCP) server that provides unified access to multiple AI image generation providers. Whether you're using Claude Desktop, Kiro IDE, or any MCP-compatible client, this server lets you generate and transform images using:
- AWS Bedrock - Amazon Nova Canvas for enterprise-grade image generation
- OpenAI - GPT Image 1.5, ChatGPT Image, DALL-E models
- Google Gemini - Gemini 2.5 Flash Image, Imagen 4, Imagen 4 Ultra
Perfect for developers building AI applications, content creators, and anyone needing programmatic access to multiple image generation APIs through a single interface.
Example Outputs
Comparison of architecture diagrams generated by different models using this MCP server:
<table><thead><tr><th>Style</th><th>Model</th><th>Output</th></tr></thead><tbody><tr><td><strong>Technical Diagram</strong></td><td>OpenAI <code>gpt-image-1.5</code></td><td><img src="https://raw.githubusercontent.com/manu-mishra/universal-image-mcp/main/images/universal-image-mcp-architecture.png" alt="Universal Image MCP Server Architecture Diagram - OpenAI GPT Image 1.5 - Multi-provider AI image generation with AWS Bedrock, OpenAI, Google Gemini integration"></td></tr><tr><td><strong>Technical Diagram</strong></td><td>Google <code>gemini-2.5-flash-image</code></td><td><img src="https://raw.githubusercontent.com/manu-mishra/universal-image-mcp/main/images/universal-image-mcp-architecture-gemini.png" alt="Universal Image MCP Server Architecture Diagram - Google Gemini 2.5 Flash - Model Context Protocol for AI image generation"></td></tr><tr><td><strong>Technical Diagram</strong></td><td>AWS <code>amazon.nova-canvas-v1:0</code></td><td><img src="https://raw.githubusercontent.com/manu-mishra/universal-image-mcp/main/images/universal-image-mcp-architecture-nova.png" alt="Universal Image MCP Server Architecture Diagram - AWS Bedrock Nova Canvas - Enterprise AI image generation architecture"></td></tr><tr><td><strong>3D Clay Art</strong></td><td>Google <code>gemini-2.5-flash-image</code></td><td><img src="https://raw.githubusercontent.com/manu-mishra/universal-image-mcp/main/images/universal-image-mcp-architecture-clay-gemini.png" alt="3D Clay Art Style MCP Architecture - Google Gemini 2.5 Flash - AI-generated technical diagram in cute clay aesthetic"></td></tr><tr><td><strong>3D Clay Art</strong></td><td>OpenAI <code>gpt-image-1.5</code></td><td><img src="https://raw.githubusercontent.com/manu-mishra/universal-image-mcp/main/images/universal-image-mcp-architecture-clay-openai.png" alt="3D Clay Art Style MCP Architecture - OpenAI GPT Image 1.5 - AI image generation server in 3D clay art style"></td></tr><tr><td><strong>3D Clay Art</strong></td><td>AWS 
<code>amazon.nova-canvas-v1:0</code></td><td><img src="https://raw.githubusercontent.com/manu-mishra/universal-image-mcp/main/images/universal-image-mcp-architecture-clay-nova.png" alt="3D Clay Art Style MCP Architecture - AWS Nova Canvas - Multi-provider image generation in pastel clay aesthetic"></td></tr></tbody></table>
View Prompt Used
Technical Diagram Prompt:
Technical architecture diagram of a Universal Image MCP Server system. The diagram shows:
Top layer: MCP Client (Claude Desktop, Kiro IDE) connecting via Model Context Protocol
Middle layer: Universal Image MCP Server (FastMCP) with three main components:
1. Server Module (server.py) - handles list_models, generate_image, transform_image, prompt_guide tools
2. Provider Module (providers.py) - manages lazy initialization and provider abstraction
3. Configuration - environment variables for ENABLE_AWS, ENABLE_OPENAI, ENABLE_GEMINI
Bottom layer: Three provider boxes side by side:
- AWS Bedrock (boto3) - Amazon Nova Canvas, with AWS credentials and region config
- OpenAI API - GPT Image 1.5, ChatGPT Image Latest, with API key
- Google Gemini API - Gemini 2.5 Flash, Imagen 4, with API key
Data flow arrows showing:
- Client sends tool requests to Server
- Server routes to appropriate Provider based on model_id
- Providers make API calls to their respective services
- Image data flows back through the chain
Clean, professional software architecture diagram style with boxes, arrows, and labels. Use blue and gray color scheme. Modern technical documentation aesthetic. Isometric or layered view showing clear separation of concerns.
3D Clay Art Prompt:
Same technical architecture content as above, but rendered in:
3D clay art style with soft rounded shapes, pastel colors, cute minimalist aesthetic, soft studio lighting, clean composition with depth and shadows.
Note: 3D Clay Art versions used s3tablearch.png as a reference image for style guidance.
Key Features
- 🔄 Multi-Provider Support - Switch between AWS Bedrock, OpenAI, and Google Gemini seamlessly
- 🚀 Dynamic Model Discovery - Automatically fetches latest available models from each provider API
- ⚡ Lazy Initialization - Provider clients load only when needed for optimal performance
- 🎨 Reference Image Support - Generate new images based on existing image styles
- 📐 Configurable Dimensions - Custom width/height for supported AI models
- 📚 Built-in Prompt Guide - Best practices for writing effective image generation prompts
- 🔌 MCP Protocol - Works with Claude Desktop, Kiro IDE, and all MCP-compatible clients
- 🐍 Python 3.11+ - Modern Python with type hints and async support
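The lazy initialization feature listed above can be illustrated with a small sketch. This is not the server's actual providers.py code; the get_client function and the created list are hypothetical names, and the dictionary stands in for a real SDK client. It only shows the general pattern of building a client on first use and reusing it afterwards.

```python
from functools import cache

# Records which hypothetical clients have been built, so laziness is observable.
created: list[str] = []


@cache
def get_client(provider: str) -> dict:
    """Build the client for `provider` on first use, then reuse it."""
    created.append(provider)
    # Assumption: a real server would construct a boto3, OpenAI, or Gemini
    # client here; a plain dict is used as a stand-in.
    return {"provider": provider}
```

Nothing is constructed at import time. The first get_client("openai") call builds the stand-in client, and later calls with the same name return the cached object.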
Quick Start Installation
Install via pip:
pip install universal-image-mcp
Or use with uvx (recommended for MCP servers):
uvx universal-image-mcp@latest
MCP Server Configuration
For Claude Desktop
Add to your Claude Desktop MCP configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "universal-image-mcp": {
      "command": "uvx",
      "args": ["universal-image-mcp@latest"],
      "env": {
        "ENABLE_AWS": "true",
        "AWS_PROFILE": "default",
        "AWS_REGION": "us-east-1",
        "ENABLE_OPENAI": "true",
        "OPENAI_API_KEY": "sk-...",
        "ENABLE_GEMINI": "true",
        "GEMINI_API_KEY": "..."
      }
    }
  }
}
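If the configuration file already lists other MCP servers, the entry above has to be merged in rather than pasted over the whole file. The following is a minimal sketch of one way to do that with the Python standard library. The add_server helper and the ENTRY constant are hypothetical and not part of this package, and the sketch assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path

# Hypothetical entry mirroring the configuration shown above (OpenAI only).
ENTRY = {
    "command": "uvx",
    "args": ["universal-image-mcp@latest"],
    "env": {"ENABLE_OPENAI": "true", "OPENAI_API_KEY": "sk-..."},
}


def add_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or replace one entry under "mcpServers", keeping everything else."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

On macOS this could be called as add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "universal-image-mcp", ENTRY), after replacing the placeholder key with a real one.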
For Kiro IDE
Add to ~/.kiro/settings/mcp.json:
{
  "mcpServers": {
    "universal-image-mcp": {
      "command": "uvx",
      "args": ["universal-image-mcp@latest"],
      "env": {
        "ENABLE_AWS": "true",
        "ENABLE_OPENAI": "true",
        "OPENAI_API_KEY": "sk-...",
        "ENABLE_GEMINI": "true",
        "GEMINI_API_KEY": "..."
      }
    }
  }
}
Getting API Keys and Credentials
Before using this MCP server, you'll need to obtain credentials for the providers you want to use.
AWS Bedrock Setup
AWS Bedrock uses your local AWS credentials. You have several options:
- AWS CLI Configuration (Recommended)
  - Install the AWS CLI
  - Run aws configure and provide your access key, secret key, and region
  - Official guide: AWS CLI Configuration
- AWS Credentials File
  - Create ~/.aws/credentials with your access keys
  - Create ~/.aws/config with your region settings
  - Official guide: Shared Config and Credentials Files
- Environment Variables
  - Set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION
  - Official guide: AWS CLI Authentication
Getting AWS Access Keys:
- Sign in to the AWS Console
- Navigate to IAM → Users → Your User → Security Credentials
- Create a new access key under "Access keys"
- Ensure your IAM user has permissions for Bedrock (e.g., the AmazonBedrockFullAccess policy)
OpenAI API Key
- Create an OpenAI Account
  - Visit OpenAI Platform
  - Sign up or log in to your account
- Generate API Key
  - Go to API Keys page
  - Click "Create new secret key"
  - Give it a descriptive name (optional)
  - Copy the key immediately (you won't be able to see it again)
- Add Billing Information
  - OpenAI requires payment information to use the API
  - Navigate to Billing to add payment details
Official Documentation: OpenAI Quickstart Guide
Google Gemini API Key
- Get a Gemini API Key
  - Visit Google AI Studio
  - Sign in with your Google account
  - Click "Get API Key" or "Create API Key"
  - Create a new project or select an existing one
  - Copy your API key
- Alternative: Google Cloud API Key
  - For production use, you can use Vertex AI on Google Cloud
  - This provides more enterprise features and billing controls
Official Documentation: Gemini API Quickstart
Environment Variables
<table><thead><tr><th>Variable</th><th>Required</th><th>Description</th></tr></thead><tbody><tr><td><code>ENABLE_AWS</code></td><td>No</td><td>Enable AWS Bedrock provider (<code>true</code>/<code>false</code>, default: <code>false</code>)</td></tr><tr><td><code>AWS_PROFILE</code></td><td>No</td><td>AWS profile name (default, SSO, or named profile)</td></tr><tr><td><code>AWS_REGION</code></td><td>No</td><td>AWS region (default: <code>us-east-1</code>)</td></tr><tr><td><code>ENABLE_OPENAI</code></td><td>No</td><td>Enable OpenAI provider (<code>true</code>/<code>false</code>, default: <code>false</code>)</td></tr><tr><td><code>OPENAI_API_KEY</code></td><td>If OpenAI enabled</td><td>OpenAI API key from <a href="https://platform.openai.com/account/api-keys">platform.openai.com</a></td></tr><tr><td><code>ENABLE_GEMINI</code></td><td>No</td><td>Enable Google Gemini provider (<code>true</code>/<code>false</code>, default: <code>false</code>)</td></tr><tr><td><code>GEMINI_API_KEY</code></td><td>If Gemini enabled</td><td>Google Gemini API key from <a href="https://ai.google.dev/gemini-api/docs/api-key">Google AI Studio</a></td></tr></tbody></table>
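To make the table concrete, here is a minimal sketch of how these variables could be read and validated. It is not the server's actual startup code; is_enabled and provider_settings are hypothetical names, and the rule that only the string "true" (in any letter case) enables a provider is an assumption. The defaults and the "required if enabled" rules come from the table.

```python
import os
from collections.abc import Mapping


def is_enabled(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Assumption: a flag is on only when its value is "true", ignoring case."""
    return env.get(name, "false").strip().lower() == "true"


def provider_settings(env: Mapping[str, str] = os.environ) -> dict[str, dict[str, str]]:
    """Collect settings for each enabled provider, checking required keys."""
    settings: dict[str, dict[str, str]] = {}
    if is_enabled("ENABLE_AWS", env):
        aws = {"region": env.get("AWS_REGION", "us-east-1")}  # default from the table
        if env.get("AWS_PROFILE"):
            aws["profile"] = env["AWS_PROFILE"]
        settings["aws"] = aws
    for provider, key_var in (("openai", "OPENAI_API_KEY"), ("gemini", "GEMINI_API_KEY")):
        if is_enabled(f"ENABLE_{provider.upper()}", env):
            if not env.get(key_var):
                raise ValueError(f"{key_var} is required when {provider} is enabled")
            settings[provider] = {"api_key": env[key_var]}
    return settings
```

Passing a plain dictionary instead of os.environ makes the function easy to try out, for example provider_settings({"ENABLE_AWS": "true"}).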
API Reference - MCP Tools
list_models()
List all available AI image generation models from enabled providers. Models are fetched dynamically from each provider's API, and deprecated models are filtered out automatically.
Returns: Formatted list of model IDs compatible with generate_image() and transform_image()
Example models:
- AWS: amazon.nova-canvas-v1:0
- OpenAI: gpt-image-1.5, chatgpt-image-latest
- Gemini: models/gemini-2.5-flash-image, models/imagen-4.0-generate-001
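The example IDs above follow a visible naming pattern per provider, which suggests how a request could be routed by model_id. The sketch below is an illustration only: provider_for is a hypothetical helper, and the prefix rules are inferred from the example IDs, not taken from the server's source, which may route differently.

```python
def provider_for(model_id: str) -> str:
    """Guess the provider from the model ID's prefix (assumed convention)."""
    if model_id.startswith("amazon."):
        return "aws"
    if model_id.startswith("models/"):
        return "gemini"
    if model_id.startswith(("gpt-image", "chatgpt-image", "dall-e")):
        return "openai"
    raise ValueError(f"no provider recognizes model_id {model_id!r}")
```

In practice the safest input is an ID copied verbatim from the list_models() output.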
generate_image(prompt, model_id, output_path, reference_image?, width?, height?)
Generate AI images from text prompts using any supported model.
<table><thead><tr><th>Parameter</th><th>Required</th><th>Default</th><th>Description</th></tr></thead><tbody><tr><td><code>prompt</code></td><td>Yes</td><td>-</td><td>Detailed text description of the image. Be specific about subject, style, lighting, colors, composition, and mood.</td></tr><tr><td><code>model_id</code></td><td>Yes</td><td>-</td><td>Model ID from <code>list_models()</code>. Examples: <code>amazon.nova-canvas-v1:0</code>, <code>gpt-image-1.5</code>, <code>models/gemini-2.5-flash-image</code></td></tr><tr><td><code>output_path</code></td><td>Yes</td><td>-</td><td>File path to save the generated image. Parent directories created automatically.</td></tr><tr><td><code>reference_image</code></td><td>No</td><td>None</td><td>Path to reference image for style/content guidance</td></tr><tr><td><code>width</code></td><td>No</td><td>1024</td><td>Image width in pixels. Note: Some models only support specific sizes.</td></tr><tr><td><code>height</code></td><td>No</td><td>1024</td><td>Image height in pixels. Note: Some models only support specific sizes.</td></tr></tbody></table>
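MCP clients normally issue this call for you, but it can help to see what goes over the wire. MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call, so a generate_image request could look like the output of the sketch below. The tool_call_request helper, the prompt text, and the output path are illustrative assumptions; the argument names come from the parameter table.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message, indent=2)


# Hypothetical arguments; only prompt, model_id, and output_path are required.
request = tool_call_request(
    1,
    "generate_image",
    {
        "prompt": "Isometric architecture diagram, blue and gray color scheme",
        "model_id": "gpt-image-1.5",
        "output_path": "out/diagram.png",
        "width": 1024,
        "height": 1024,
    },
)
```

The same helper works for transform_image by changing the tool name and supplying image_path, prompt, model_id, and output_path.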
transform_image(image_path, prompt, model_id, output_path)
Transform and edit existing images using AI-powered modifications based on text prompts.
<table><thead><tr><th>Parameter</th><th>Required</th><th>Description</th></tr></thead><tbody><tr><td><code>image_path</code></td><td>Yes</td><td>Path to source image to transform (PNG, JPEG, etc.)</td></tr><tr><td><code>prompt</code></td><td>Yes</td><td>AI transformation instructions (e.g., "Make it black and white", "Add a rainbow", "Convert to watercolor style")</td></tr><tr><td><code>model_id</code></td><td>Yes</td><td>Model ID from <code>list_models()</code></td></tr><tr><td><code>output_path</code></td><td>Yes</td><td>File path to save the transformed image</td></tr></tbody></table>
Use cases: Image editing, style transfer, AI-powered photo manipulation, artistic transformations
prompt_guide()
Get AI prompt engineering best practices for image generation. Returns comprehensive guidelines covering:
- Prompt structure (Subject + Details + Style + Lighting + Mood + Composition)
- Specific vs generic descriptions
- Style, lighting, and mood keywords
- Example prompts for different use cases
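The prompt structure listed above (Subject + Details + Style + Lighting + Mood + Composition) can be applied mechanically. The following sketch assembles a prompt from those six parts; build_prompt is a hypothetical helper, not something the server exposes, and joining the parts with commas is an assumption about what reads well to image models.

```python
def build_prompt(
    subject: str,
    details: str = "",
    style: str = "",
    lighting: str = "",
    mood: str = "",
    composition: str = "",
) -> str:
    """Join the non-empty parts in the order the prompt guide lists them."""
    parts = [subject, details, style, lighting, mood, composition]
    return ", ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    subject="technical architecture diagram of an MCP server",
    style="3D clay art style with soft rounded shapes",
    lighting="soft studio lighting",
    mood="cute minimalist aesthetic",
)
```

Parts left empty are skipped, so the helper works equally well for a short prompt with only a subject and a style.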
Supported AI Image Models
All models are discovered dynamically. Use list_models() to see current options.
AWS Bedrock Models
- Amazon Nova Canvas (amazon.nova-canvas-v1:0) - Enterprise-grade image generation with text and image input support
OpenAI Models
- GPT Image 1.5 (gpt-image-1.5) - Latest OpenAI image generation model
- ChatGPT Image Latest (chatgpt-image-latest) - ChatGPT-integrated image generation
Google Gemini Models
- Gemini 2.5 Flash Image (models/gemini-2.5-flash-image) - Fast, efficient image generation
- Gemini 3 Pro Image (models/gemini-3-pro-image-preview) - Advanced image generation capabilities
- Imagen 4 (models/imagen-4.0-generate-001) - Google's state-of-the-art image model
- Imagen 4 Ultra (models/imagen-4.0-ultra-generate-001) - Highest quality Imagen model
- Imagen 4 Fast (models/imagen-4.0-fast-generate-001) - Optimized for speed
Use Cases
- AI Application Development - Integrate multiple image generation providers into your apps
- Content Creation - Generate marketing materials, social media content, illustrations
- Prototyping & Design - Quickly visualize concepts and design ideas
- Image Editing Automation - Batch process and transform images with AI
- Research & Experimentation - Compare outputs across different AI models
- Claude Desktop Workflows - Enhance Claude conversations with image generation
- Developer Tools - Build MCP-compatible tools and extensions
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Related Projects
- Model Context Protocol - Official MCP documentation
- Claude Desktop - AI assistant with MCP support
- FastMCP - Python framework for building MCP servers
Keywords
mcp-server image-generation ai-images aws-bedrock openai google-gemini claude-desktop imagen nova-canvas python fastmcp model-context-protocol ai-art text-to-image image-transformation
License
MIT