nano-banana-claude
An MCP server that provides image generation using Google's Nano Banana Gemini models, with additional tools for background removal, upscaling, and format conversion via deterministic post-processing.
Generated by generate_image (Nano Banana flash, 16:9).
A small MCP server that gives Claude Code (or any MCP client) image generation via Google's Nano Banana Gemini image models — plus the deterministic post-processing the model can't do reliably on its own: true alpha transparency, AI upscaling, resizing, and format conversion.
The idea: keep generation in the model, but put the things that need real code (a genuine alpha channel, exact dimensions, super-resolution) behind tools so they're reliable instead of prompt-and-hope.
Tools
| Tool | What it does |
|---|---|
| generate_image | Text → image. model (flash/pro), aspect_ratio, format, output_path. |
| generate_transparent_image | Generate, then remove the background with rembg (U2-Net) → a real RGBA PNG. Nano Banana can't produce reliable alpha on its own. |
| edit_image | One or more input images + a prompt → edited / composited result. |
| process_image | Local Pillow ops, no API call: resize/fit/crop, convert (png/webp/jpeg + quality), optional background removal. |
| upscale_image | AI super-resolution with Real-ESRGAN (realesr-general-x4v3). Tiled for large images, alpha-preserving. |
Models: flash → gemini-2.5-flash-image (Nano Banana, default, fast/cheap),
pro → gemini-3-pro-image-preview (Nano Banana Pro, higher quality).
Requirements
- Python 3.10+
- A Gemini API key (free tier available) from https://aistudio.google.com/apikey
- ~500 MB disk for dependencies; CPU is fine (no GPU required)
Install
git clone https://github.com/tougenrip/nano-banana-claude.git
cd nano-banana-claude
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt
Two ML models download automatically on first use (not committed):
- rembg U2-Net (~176 MB) → ~/.u2net/ (first transparent-image call)
- Real-ESRGAN x4 (~4.6 MB) → ~/.cache/nano-banana/ (first upscale call)
Register with Claude Code
From inside the cloned directory:
claude mcp add nano-banana --scope user \
--env GEMINI_API_KEY=your-key-here \
-- "$(pwd)/.venv/bin/python" "$(pwd)/server.py"
--scope user makes it available in every project; drop it to scope to the current project.
Reconnect and the five nano-banana tools appear. The API key is read from the environment
at runtime and is never written into the code.
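As a rough illustration of what reading the key at runtime and calling Gemini over REST with stdlib urllib might involve, here is a minimal sketch. It is not the code in server.py: the function names `build_request` and `extract_image_bytes` are hypothetical, and the endpoint, header, and `generationConfig.imageConfig.aspectRatio` field reflect the public Gemini API as I understand it, so check them against the current API reference.

```python
import base64
import json
import os
import urllib.request

# Alias -> model id, as listed under Tools.
MODELS = {
    "flash": "gemini-2.5-flash-image",
    "pro": "gemini-3-pro-image-preview",
}

# Assumed REST root for the Gemini API.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(prompt, model="flash", aspect_ratio=None, api_key=None):
    """Build (but do not send) a generateContent request."""
    # The key comes from the environment unless one is passed explicitly.
    key = api_key or os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if aspect_ratio:
        # Assumed field layout for aspect ratio; verify before relying on it.
        body["generationConfig"] = {"imageConfig": {"aspectRatio": aspect_ratio}}
    return urllib.request.Request(
        f"{API_ROOT}/{MODELS[model]}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": key},
        method="POST",
    )


def extract_image_bytes(response):
    """Return the first inline image in a decoded JSON response, or None."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                return base64.b64decode(inline["data"])
    return None
```

Sending the request would then be `urllib.request.urlopen(build_request(...))` followed by `json.load` on the response and a call to `extract_image_bytes`.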
Usage
Once registered, just ask in natural language — Claude picks the tool:
- "Generate a 16:9 image of a neon city at night." → generate_image
- "Make a transparent PNG of a red sneaker." → generate_transparent_image
- "Put the logo in photo.png onto the mug in mug.jpg." → edit_image
- "Resize hero.png to 1200px wide as webp." → process_image
- "Upscale icon.png 4×." → upscale_image
Tools can chain: generate a transparent cutout, then upscale it — the alpha survives.
Outputs are written next to the working directory (or to output_path) and the absolute
path is returned. Set NANO_BANANA_OUTPUT_DIR to change the default location.
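The output rules above can be pictured with a small sketch. The helper name `resolve_output_path` is hypothetical, and the precedence shown (explicit output_path first, then NANO_BANANA_OUTPUT_DIR, then the working directory) is an assumption drawn from the description, not a copy of the server's logic.

```python
import os
from pathlib import Path


def resolve_output_path(filename, output_path=None):
    """Pick where an image is written and return the absolute path."""
    if output_path:
        # An explicit output_path wins over any default location.
        target = Path(output_path)
    else:
        # Fall back to the env override, then to the working directory.
        base = os.environ.get("NANO_BANANA_OUTPUT_DIR") or os.getcwd()
        target = Path(base) / filename
    target = target.expanduser().resolve()
    # Create missing parent directories so the later write cannot fail on them.
    target.parent.mkdir(parents=True, exist_ok=True)
    return str(target)
```

Returning the absolute path matters because the MCP client may not share the server's working directory.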
How transparency works
Image models paint pixels; asking for a "transparent background" usually yields a flat color
or a drawn-in checkerboard, not a real alpha channel. So generate_transparent_image
generates the subject on a plain background, then runs rembg (U2-Net) to compute an actual
alpha matte, producing a true RGBA PNG. process_image(remove_background=true) does the same
on any existing image.
<img src="examples/transparent-banana.png" alt="Transparent cutout example" width="320">
Output of generate_transparent_image — a real RGBA PNG (~90% of pixels fully
transparent), not a painted-on checkerboard.
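The generate-then-matte step described in this section could be sketched as follows. This is an illustration under stated assumptions rather than the server's implementation: `cutout` and `transparent_fraction` are hypothetical names, and the sketch relies on rembg's `remove` accepting and returning a Pillow image.

```python
def transparent_fraction(alpha_values):
    """Share of pixels whose alpha is exactly 0 (fully transparent)."""
    values = list(alpha_values)
    if not values:
        return 0.0
    return sum(1 for a in values if a == 0) / len(values)


def cutout(src_path, dst_path):
    """Remove the background from src_path and save an RGBA PNG."""
    # Third-party imports stay local so the helper above works without them.
    from PIL import Image
    from rembg import remove

    with Image.open(src_path) as img:
        result = remove(img)  # Pillow image in, Pillow image with alpha out
    result = result.convert("RGBA")
    result.save(dst_path, format="PNG")
    # The alpha band is one byte per pixel, so its raw bytes are the matte.
    return transparent_fraction(result.getchannel("A").tobytes())
```

A painted-on checkerboard would report a fraction of 0.0 here, since every pixel is opaque; a real cutout reports the share of background that was removed.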
Layout
server.py FastMCP server + tool definitions; Gemini REST via stdlib urllib
upscale.py Real-ESRGAN inference (onnxruntime), lazily loaded, with tiling
requirements.txt mcp, Pillow, rembg, onnxruntime
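To give a feel for the tiling mentioned for upscale.py, here is a minimal sketch of how an image might be split into overlapping tiles before each one is passed to the model. The function name `tile_boxes` and the tile and overlap sizes are hypothetical; the actual values and seam handling in upscale.py may differ.

```python
def _starts(length, tile, step):
    """Tile start offsets along one axis, stopping once the axis is covered."""
    starts = [0]
    while starts[-1] + tile < length:
        starts.append(starts[-1] + step)
    return starts


def tile_boxes(width, height, tile=256, overlap=16):
    """Yield (left, top, right, bottom) boxes that cover the whole image.

    Neighbouring tiles share `overlap` pixels so seams can be cropped or
    blended after each tile is upscaled.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    step = tile - overlap
    for top in _starts(height, tile, step):
        for left in _starts(width, tile, step):
            # Clamp the last row and column to the image bounds.
            yield (left, top, min(left + tile, width), min(top + tile, height))
```

With a 4× model, each box maps to an output region four times as large, so peak memory depends on the tile size rather than on the full image.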
License
MIT — see LICENSE.