nano-banana-claude

An MCP server that provides image generation using Google's Nano Banana Gemini models, with additional tools for background removal, upscaling, and format conversion via deterministic post-processing.

Example: a 3D banana character generated by this server with generate_image (Nano Banana flash, 16:9).

A small MCP server that gives Claude Code (or any MCP client) image generation via Google's Nano Banana Gemini image models — plus the deterministic post-processing the model can't do reliably on its own: true alpha transparency, AI upscaling, resizing, and format conversion.

The idea: keep generation in the model, but put the things that need real code (a genuine alpha channel, exact dimensions, super-resolution) behind tools so they're reliable instead of prompt-and-hope.

Tools

Tool                         What it does
generate_image               Text → image. Parameters: model (flash/pro), aspect_ratio, format, output_path.
generate_transparent_image   Generate, then remove the background with rembg (U2-Net) → a real RGBA PNG. Nano Banana can't produce reliable alpha on its own.
edit_image                   One or more input images + a prompt → edited / composited result.
process_image                Local Pillow ops, no API call: resize/fit/crop, convert (png/webp/jpeg + quality), optional background removal.
upscale_image                AI super-resolution with Real-ESRGAN (realesr-general-x4v3). Tiled for large images, alpha-preserving.

Models: flash → gemini-2.5-flash-image (Nano Banana, default, fast/cheap); pro → gemini-3-pro-image-preview (Nano Banana Pro, higher quality).
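To make the alias-to-model mapping concrete, here is a minimal sketch of how a request to the Gemini REST API could be built with stdlib urllib. It is not taken from server.py: the function name, the v1beta endpoint, and the imageConfig field are assumptions based on Google's public API documentation, and only the two model IDs come from this README.

```python
import json
import urllib.request

# Aliases as listed in this README; everything else below is an assumption.
MODEL_IDS = {
    "flash": "gemini-2.5-flash-image",
    "pro": "gemini-3-pro-image-preview",
}
# Assumed endpoint root (public Gemini REST API, v1beta).
API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_generate_request(prompt, api_key, model="flash", aspect_ratio=None):
    """Build (but do not send) a generateContent request for an image model."""
    if model not in MODEL_IDS:
        raise ValueError(f"unknown model alias: {model!r}")
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if aspect_ratio:
        # Assumed field layout for requesting a specific aspect ratio.
        body["generationConfig"] = {"imageConfig": {"aspectRatio": aspect_ratio}}
    return urllib.request.Request(
        f"{API_ROOT}/{MODEL_IDS[model]}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen(req) should return JSON in which the image arrives as base64 data inside the response parts; decoding and saving it is left out of this sketch.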

Requirements

Install

git clone https://github.com/tougenrip/nano-banana-claude.git
cd nano-banana-claude
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt

Two ML models download automatically on first use (not committed):

  • rembg U2-Net (~176 MB) → ~/.u2net/ — first transparent-image call
  • Real-ESRGAN x4 (~4.6 MB) → ~/.cache/nano-banana/ — first upscale call
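The download-on-first-use behavior can be illustrated with a small caching helper. This is a hedged sketch only: ensure_model is a hypothetical name, rembg manages its own U2-Net download internally, and the actual logic in upscale.py may differ.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def ensure_model(url, dest):
    """Download url to dest on first use; later calls reuse the cached file."""
    dest = Path(dest).expanduser()
    if dest.exists():
        return dest  # already cached, no network access needed
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary file first so an interrupted download
    # never leaves a truncated model at the final path.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".part")
    os.close(fd)
    try:
        urllib.request.urlretrieve(url, tmp_name)
        os.replace(tmp_name, dest)
    finally:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
    return dest
```

Calling the helper lazily, inside the first tool call that needs the model, is what keeps startup fast and the repository free of large binaries.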

Register with Claude Code

From inside the cloned directory:

claude mcp add nano-banana --scope user \
  --env GEMINI_API_KEY=your-key-here \
  -- "$(pwd)/.venv/bin/python" "$(pwd)/server.py"

--scope user makes it available in every project; drop it to scope to the current project. Reconnect and the five nano-banana tools appear. The API key is read from the environment at runtime and is never written into the code.
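Reading the key at runtime might look like the sketch below. The helper name and the error message are illustrative assumptions; only the GEMINI_API_KEY variable name comes from this README.

```python
import os


def get_api_key(env=None):
    """Return GEMINI_API_KEY from the environment, failing loudly if unset."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; pass it with --env when registering the server"
        )
    return key
```

Looking the key up per call, rather than at import time, means the server can start without a key and report a clear error only when a tool that needs the API is invoked.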

Usage

Once registered, just ask in natural language — Claude picks the tool:

  • "Generate a 16:9 image of a neon city at night." → generate_image
  • "Make a transparent PNG of a red sneaker." → generate_transparent_image
  • "Put the logo in photo.png onto the mug in mug.jpg." → edit_image
  • "Resize hero.png to 1200px wide as webp." → process_image
  • "Upscale icon.png 4×." → upscale_image

Tools can chain: generate a transparent cutout, then upscale it — the alpha survives.

Outputs are written next to the working directory (or to output_path) and the absolute path is returned. Set NANO_BANANA_OUTPUT_DIR to change the default location.
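The precedence described above (explicit output_path first, then NANO_BANANA_OUTPUT_DIR, then the working directory) can be sketched as follows. The function name, the stem parameter, and the default file naming are assumptions for illustration, not the server's actual behavior.

```python
import os
from pathlib import Path


def resolve_output_path(output_path=None, fmt="png", stem="image", env=None):
    """Pick where an output file goes and return it as an absolute path string."""
    env = os.environ if env is None else env
    if output_path:
        # An explicit output_path always wins.
        path = Path(output_path).expanduser()
    else:
        # Otherwise use the configured directory, falling back to the cwd.
        base = Path(env.get("NANO_BANANA_OUTPUT_DIR") or os.getcwd())
        path = base.expanduser() / f"{stem}.{fmt}"  # hypothetical naming scheme
    return str(path.resolve())
```

Returning the absolute path matters for an MCP client: the caller may have a different notion of the working directory than the server process.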

How transparency works

Image models paint pixels; asking for a "transparent background" usually yields a flat color or a drawn-in checkerboard, not a real alpha channel. So generate_transparent_image generates the subject on a plain background, then runs rembg (U2-Net) to compute an actual alpha matte, producing a true RGBA PNG. process_image(remove_background=true) does the same on any existing image.

<img src="examples/transparent-banana.png" alt="Transparent cutout example" width="320">

Output of generate_transparent_image — a real RGBA PNG (~90% of pixels fully transparent), not a painted-on checkerboard.
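One way to tell a true alpha channel from a painted checkerboard is to read the PNG header directly. The sketch below is not part of this project; it uses only the stdlib and the IHDR layout from the PNG specification, where color type 4 is greyscale with alpha and color type 6 is RGBA.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha_channel(data):
    """True if the PNG header declares an alpha channel (color type 4 or 6)."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk must be IHDR with a 13-byte payload.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing IHDR chunk")
    # IHDR payload starts with width, height, bit depth, color type.
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return color_type in (4, 6)
```

A usage example would be png_has_alpha_channel(open(path, "rb").read(26)). Note the limits of the check: it confirms that an alpha channel is declared, not that any pixel is actually transparent, and it does not cover palette PNGs that carry transparency in a tRNS chunk.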

Layout

server.py          FastMCP server + tool definitions; Gemini REST via stdlib urllib
upscale.py         Real-ESRGAN inference (onnxruntime), lazily loaded, with tiling
requirements.txt   mcp, Pillow, rembg, onnxruntime
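The tiling mentioned for upscale.py can be illustrated with a helper that splits an image into overlapping boxes, so each one can be upscaled on its own and memory use stays bounded. This is a sketch under assumptions: the tile size, the overlap, and the function names are hypothetical and not read from upscale.py.

```python
def _starts(size, tile, step):
    """Start offsets along one axis; the last tile sits flush with the edge."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile, step))
    starts.append(size - tile)
    return starts


def tile_boxes(width, height, tile=256, overlap=16):
    """Yield (left, top, right, bottom) boxes that together cover the image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in the range [0, tile)")
    step = tile - overlap
    for top in _starts(height, tile, step):
        for left in _starts(width, tile, step):
            yield (left, top, min(left + tile, width), min(top + tile, height))
```

With a 4x model, a box (l, t, r, b) in the input maps to (4*l, 4*t, 4*r, 4*b) in the output, and the overlapping borders give room to crop or blend away seam artifacts. Keeping every tile the same size when the image is larger than one tile also suits models exported with a fixed input shape.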

License

MIT — see LICENSE.
