Gemini Nano Banana MCP
An MCP (Model Context Protocol) server for AI-powered image generation, editing, and video generation using Google Gemini and Veo. Works with Claude Code, Cursor, and any MCP-compatible client.
Features
- Text-to-Image Generation - Generate images from text prompts via Gemini AI
- Image Editing - Edit existing images with natural language instructions
- Reference Images - Use reference images for style and content guidance
- Text-to-Video Generation - Generate videos from text prompts using Veo (veo-3.1, veo-3, veo-2)
- Image-to-Video - Use an image as the first frame for video generation
- First & Last Frame Interpolation - Generate videos between two keyframe images
- Session Memory - Continue editing the last image without re-specifying the path
- Configurable Models - Choose any Gemini model for images, any Veo model for videos
- Media History - Track and browse recently generated images and videos
- Cross-Platform - Works on macOS, Windows, and Linux
Quick Start
1. Get a Gemini API Key
Get your free API key from Google AI Studio.
2. Install
npm install -g @seungmanchoi/nano-banana-mcp
Or install from source:
git clone https://github.com/seungmanchoi/nano-banana-mcp.git
cd nano-banana-mcp
npm install
npm run build
3. Configure Your MCP Client
Claude Code
Add to ~/.claude/settings.json:
{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["-y", "@seungmanchoi/nano-banana-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Cursor
Add to your MCP settings:
{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["-y", "@seungmanchoi/nano-banana-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
From Source
If installed from source, use the absolute path:
{
  "mcpServers": {
    "nano-banana": {
      "command": "node",
      "args": ["/absolute/path/to/nano-banana-mcp/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
You can also skip the env field and configure the API key at runtime using the configure_api_key tool.
Authentication Modes
This server supports two authentication modes. The default is API key mode (above).
Mode A — API key (official, default)
Uses a Gemini API key from Google AI Studio. Supports image generation, image editing, and video (Veo). Note that image generation models are largely a paid feature.
Mode B — Free Google-cookie mode (consumer Gemini, unofficial)
Drives your logged-in gemini.google.com session via its session cookies instead of an API key. Free, and supports image generation + editing only (no video).
⚠️ This mode is unofficial. It talks to an undocumented internal endpoint, not the official API. It may break when Google changes things, cookies expire periodically and must be re-extracted, and use is a Terms-of-Service gray area. Intended for personal use with your own account.
1. Extract your cookies from a browser where you're logged into gemini.google.com:
- Open https://gemini.google.com and sign in.
- Open DevTools (F12) → Application → Cookies → https://gemini.google.com.
- Copy the values of __Secure-1PSID (required) and __Secure-1PSIDTS (recommended).
2a. Configure via environment variables:
{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["-y", "@seungmanchoi/nano-banana-mcp"],
      "env": {
        "GEMINI_AUTH_MODE": "gemini-web",
        "GEMINI_SECURE_1PSID": "your-__Secure-1PSID-value",
        "GEMINI_SECURE_1PSIDTS": "your-__Secure-1PSIDTS-value"
      }
    }
  }
}
2b. Or configure at runtime with the configure_google_login tool:
Use configure_google_login with secure1psid "<...>" and secure1psidts "<...>"
This switches the active mode to gemini-web and persists to ~/.nano-banana/config.json.
Run configure_api_key again at any time to switch back to API key mode.
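The choice between the two modes can be sketched as a small pure function over the environment variables. This is an illustration under assumptions, not the server's actual code: the names resolveAuth, AuthConfig, and Env are hypothetical, and returning null when required values are missing is a guess at reasonable behavior. Only the variable names and the two mode names come from this README.

```typescript
// Hypothetical types; the real server's internal shapes may differ.
type AuthConfig =
  | { mode: "apiKey"; apiKey: string }
  | { mode: "gemini-web"; secure1psid: string; secure1psidts?: string };

type Env = Record<string, string | undefined>;

// Pick the authentication mode from environment variables.
// Returns null when nothing usable is configured (assumed behavior).
function resolveAuth(env: Env): AuthConfig | null {
  if (env["GEMINI_AUTH_MODE"] === "gemini-web") {
    const secure1psid = env["GEMINI_SECURE_1PSID"];
    // __Secure-1PSID is required; __Secure-1PSIDTS is only recommended.
    if (!secure1psid) return null;
    const secure1psidts = env["GEMINI_SECURE_1PSIDTS"];
    return secure1psidts
      ? { mode: "gemini-web", secure1psid, secure1psidts }
      : { mode: "gemini-web", secure1psid };
  }
  // API key mode is the default.
  const apiKey = env["GEMINI_API_KEY"];
  return apiKey ? { mode: "apiKey", apiKey } : null;
}
```

In a Node.js process you would typically call resolveAuth(process.env); taking the environment as a parameter keeps the function easy to test.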
Model Configuration
Image Models
The default image model is gemini-2.0-flash-preview-image-generation. You can change it in several ways:
Option 1: Environment Variable
Set GEMINI_MODEL in your MCP client config:
{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["-y", "@seungmanchoi/nano-banana-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here",
        "GEMINI_MODEL": "gemini-2.0-flash-preview-image-generation"
      }
    }
  }
}
Option 2: Runtime Tool
Use the configure_model tool to change the default model at runtime. The setting persists across sessions in ~/.nano-banana/config.json.
Set the model to imagen-3.0-generate-002
Option 3: Per-Request Override
Pass the model parameter directly to generate_image, edit_image, or continue_editing to override the default for a single request:
Generate an image of a cat using model imagen-3.0-generate-002
Model Priority
- Per-request model parameter (highest priority)
- GEMINI_MODEL environment variable
- Config file (~/.nano-banana/config.json)
- Default: gemini-2.0-flash-preview-image-generation
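The priority order above amounts to a first-non-empty lookup. The sketch below shows that idea only; the function name resolveImageModel and the ModelSources interface are hypothetical and are not taken from the server's source. Treating whitespace-only values as unset is also an assumption.

```typescript
// Default taken from this README; everything else here is a hypothetical sketch.
const DEFAULT_IMAGE_MODEL = "gemini-2.0-flash-preview-image-generation";

interface ModelSources {
  requestModel?: string; // per-request model parameter
  envModel?: string;     // value of GEMINI_MODEL
  configModel?: string;  // value stored in ~/.nano-banana/config.json
}

// Return the first non-empty source, highest priority first.
function resolveImageModel(sources: ModelSources): string {
  const candidates = [sources.requestModel, sources.envModel, sources.configModel];
  for (const candidate of candidates) {
    // Assumption: blank strings count as "not set".
    if (candidate && candidate.trim() !== "") return candidate;
  }
  return DEFAULT_IMAGE_MODEL;
}
```

For example, resolveImageModel({ envModel: "imagen-3.0-generate-002", configModel: "imagen-3.0-fast-generate-001" }) picks the environment value, because it outranks the config file.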
Available Image Models
| Model | Tier | Description |
|---|---|---|
| gemini-2.0-flash-preview-image-generation | Free | Default. Native image generation via Gemini 2.0 Flash. |
| imagen-3.0-generate-002 | Paid | Best quality. Google's dedicated image generation model. |
| imagen-3.0-fast-generate-001 | Paid | Fast variant of Imagen 3, optimized for speed. |
Note: Free-tier API keys support gemini-2.0-flash-preview-image-generation. Imagen models require billing enabled on your Google Cloud project.
Available Video Models
| Model | Description |
|---|---|
| veo-3.1-generate-preview | Latest. Native audio, scene extension, reference images, 4K support. |
| veo-3-generate-preview | Previous generation with audio support. |
| veo-2-generate-preview | Older generation, stable. |
For the latest list of models, see Google AI documentation.
Tools
Image Tools
| Tool | Description |
|---|---|
| configure_api_key | Set or update the Gemini API key (switches to apiKey mode). Persists across sessions. |
| configure_google_login | Switch to free, unofficial gemini-web mode using consumer Gemini cookies (secure1psid, optional secure1psidts). Image generation + editing only. |
| configure_model | Set the default Gemini model for images (apiKey mode). Persists across sessions. |
| generate_image | Generate a new image from a text description. Supports optional model override. |
| edit_image | Edit an existing image with text instructions, optional reference images, and optional model override. |
| continue_editing | Continue editing the last generated/edited image in the session. Supports optional model override. |
| list_history | List recently generated and edited images with prompts and timestamps. |
Video Tools
| Tool | Description |
|---|---|
| generate_video | Generate a video from a text prompt. Supports text-to-video, image-to-video, and frame interpolation. |
| list_video_history | List recently generated videos with prompts, models, and timestamps. |
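MCP clients invoke these tools with a JSON-RPC 2.0 request whose method is tools/call. Clients such as Claude Code and Cursor build this message for you, so the sketch below is only meant to show roughly what goes over the wire. The helper buildToolCall and the ToolCallRequest interface are hypothetical; the argument names (prompt, aspectRatio, durationSeconds) follow the generate_video parameters listed in this README.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP tools/call method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "generate_video", {
  prompt: "Ocean waves crashing on a rocky shore at sunset",
  aspectRatio: "16:9",
  durationSeconds: 8,
});

// Print the request as it would be serialized for the server.
console.log(JSON.stringify(request, null, 2));
```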
Utility Tools
| Tool | Description |
|---|---|
| get_status | Check configuration status, active models, output directories, and last image/video info. |
Usage Examples
Image Generation
Generate an image of a sunset over mountains with a lake reflection
Edit the image at ~/nano-banana-images/gen_2025-01-01.png to add a boat on the lake
Continue editing - make the sky more vibrant with orange and pink tones
Video Generation
Generate a video of ocean waves crashing on a rocky shore at sunset
Generate a video of a cat playing with yarn, model veo-3.1-generate-preview, resolution 1080p, duration 8 seconds
Image-to-Video (First Frame)
Generate a video starting from the image at ~/nano-banana-images/gen_2025-01-01.png showing the scene coming to life with wind blowing through the trees
First + Last Frame Interpolation
Generate a video transitioning from the image at ~/images/start.png to ~/images/end.png with a smooth camera pan
Portrait Video
Generate a video of a person walking through a garden, aspect ratio 9:16
History & Status
Show me the last 5 images I generated
Show me recent video history
Check the current status
Switch to imagen-3.0-generate-002 model for higher quality
Video Generation Details
Configuration Options
| Parameter | Options | Default | Description |
|---|---|---|---|
| model | veo-3.1-generate-preview, veo-3-generate-preview, veo-2-generate-preview | veo-3.1-generate-preview | Veo model to use |
| aspectRatio | 16:9, 9:16 | 16:9 | Landscape or portrait |
| resolution | 720p, 1080p, 4k | 720p | Output resolution |
| durationSeconds | 4, 6, 8 (number) | Varies by model | Video length |
| numberOfVideos | 1+ | 1 | Number of variants |
| negativePrompt | Any text | - | Elements to avoid |
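A client-side check of these options before calling generate_video might look like the sketch below. The allowed values are copied from the table; the VideoOptions interface and validateVideoOptions function are hypothetical. The project lists Zod for validation, so the server itself presumably validates differently; plain TypeScript is used here only to keep the example dependency-free.

```typescript
// Allowed values copied from the table above; types and function are hypothetical.
const ASPECT_RATIOS = ["16:9", "9:16"];
const RESOLUTIONS = ["720p", "1080p", "4k"];
const DURATIONS = [4, 6, 8];

interface VideoOptions {
  aspectRatio?: string;
  resolution?: string;
  durationSeconds?: number;
  numberOfVideos?: number;
}

// Return a list of problems; an empty list means the options look valid.
function validateVideoOptions(opts: VideoOptions): string[] {
  const errors: string[] = [];
  if (opts.aspectRatio !== undefined && ASPECT_RATIOS.indexOf(opts.aspectRatio) === -1) {
    errors.push("aspectRatio must be one of " + ASPECT_RATIOS.join(", "));
  }
  if (opts.resolution !== undefined && RESOLUTIONS.indexOf(opts.resolution) === -1) {
    errors.push("resolution must be one of " + RESOLUTIONS.join(", "));
  }
  if (opts.durationSeconds !== undefined && DURATIONS.indexOf(opts.durationSeconds) === -1) {
    errors.push("durationSeconds must be one of " + DURATIONS.join(", "));
  }
  const count = opts.numberOfVideos;
  if (count !== undefined && (count < 1 || Math.floor(count) !== count)) {
    errors.push("numberOfVideos must be a whole number of at least 1");
  }
  return errors;
}
```

Omitted options are treated as valid so that the defaults apply.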
Generation Modes
- Text-to-Video: Provide only a prompt
- Image-to-Video: Provide prompt + imagePath (used as first frame)
- Frame Interpolation: Provide prompt + imagePath (first frame) + lastFramePath (last frame)
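The three modes follow from which of the path parameters are present. A minimal sketch of that decision is shown here; detectGenerationMode, VideoRequest, and GenerationMode are hypothetical names, and rejecting a lastFramePath without an imagePath is an assumption, since this README does not say how the server handles that case.

```typescript
type GenerationMode = "text-to-video" | "image-to-video" | "frame-interpolation";

interface VideoRequest {
  prompt: string;
  imagePath?: string;     // first frame
  lastFramePath?: string; // last frame
}

// Infer the generation mode from which frame paths were supplied.
function detectGenerationMode(req: VideoRequest): GenerationMode {
  // Assumption: a last frame on its own is treated as an error.
  if (req.lastFramePath && !req.imagePath) {
    throw new Error("lastFramePath requires imagePath (the first frame)");
  }
  if (req.imagePath && req.lastFramePath) return "frame-interpolation";
  if (req.imagePath) return "image-to-video";
  return "text-to-video";
}
```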
Important Notes
- Video generation takes 1-6 minutes depending on load
- Generated videos are saved as .mp4 files
- Videos are watermarked with SynthID technology
- Pricing: $0.75 per second of generated video
- Videos are retained on Google servers for 2 days after generation
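As a rough worked example of the per-second rate quoted above, the sketch below estimates the cost of a request. The function estimateVideoCostUsd is hypothetical, and it assumes that each requested variant (numberOfVideos) is billed for its full duration, which this README does not state. Actual pricing may differ by model, so check Google's current price list.

```typescript
// Rate quoted in this README; verify against current pricing before relying on it.
const PRICE_PER_SECOND_USD = 0.75;

// Assumption: every variant is billed for its full duration.
function estimateVideoCostUsd(durationSeconds: number, numberOfVideos: number = 1): number {
  return PRICE_PER_SECOND_USD * durationSeconds * numberOfVideos;
}

console.log(estimateVideoCostUsd(8));    // prints 6
console.log(estimateVideoCostUsd(4, 2)); // prints 6
```

Under these assumptions a single 8-second clip comes to about $6.00.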
API Key Configuration
The server loads the API key in the following priority order:
- Environment variable - GEMINI_API_KEY
- Config file - ~/.nano-banana/config.json
- Runtime - via the configure_api_key tool
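One way to read this order is: use the environment variable if it is set, otherwise fall back to the saved config file, and otherwise wait for the key to be supplied at runtime. The sketch below follows that reading. The names findApiKey, EnvVars, StoredConfig, and KeyLookup are hypothetical, and the apiKey field name inside config.json is an assumption, since this README does not document the file's schema.

```typescript
type EnvVars = Record<string, string | undefined>;

// Hypothetical shape of ~/.nano-banana/config.json after parsing.
interface StoredConfig {
  apiKey?: string;
}

type KeyLookup =
  | { source: "env" | "config"; apiKey: string }
  | { source: "none" };

// Look for the key in the environment first, then in the saved config.
function findApiKey(env: EnvVars, config: StoredConfig): KeyLookup {
  const fromEnv = env["GEMINI_API_KEY"];
  if (fromEnv) return { source: "env", apiKey: fromEnv };
  if (config.apiKey) return { source: "config", apiKey: config.apiKey };
  // Nothing found: the key has to be supplied at runtime via configure_api_key.
  return { source: "none" };
}
```

Returning the source alongside the key makes it straightforward to report where the active key came from.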
File Storage
Images
| Platform | Path |
|---|---|
| macOS / Linux | ~/nano-banana-images/ |
| Windows | Documents\nano-banana-images\ |
Videos
| Platform | Path |
|---|---|
| macOS / Linux | ~/nano-banana-videos/ |
| Windows | Documents\nano-banana-videos\ |
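The two tables can be summarized as a single path rule. The sketch below is illustrative only: outputDir, Platform, and MediaKind are hypothetical names, and it assumes that Documents refers to the Documents folder under the user's home directory on Windows. The platform strings match the values of Node's process.platform.

```typescript
type Platform = "darwin" | "linux" | "win32";
type MediaKind = "images" | "videos";

// Build the output directory for a given platform, home directory, and media kind.
function outputDir(platform: Platform, homeDir: string, kind: MediaKind): string {
  const folder = `nano-banana-${kind}`;
  if (platform === "win32") {
    // Assumption: "Documents" lives directly under the user's home directory.
    return `${homeDir}\\Documents\\${folder}`;
  }
  return `${homeDir}/${folder}`;
}

console.log(outputDir("linux", "/home/alice", "images"));
// prints /home/alice/nano-banana-images
console.log(outputDir("win32", "C:\\Users\\alice", "videos"));
// prints C:\Users\alice\Documents\nano-banana-videos
```

In a Node.js program you would normally pass process.platform and os.homedir(), and use the path module for joining instead of string templates.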
Project Structure
src/
├── index.ts # Entry point
├── server.ts # MCP server setup and request routing
├── config/
│ └── settings.ts # API key and model management
├── services/
│ ├── gemini.ts # Google Gemini & Veo API client
│ └── storage.ts # Image/video file I/O and history tracking
├── tools/
│ ├── definitions.ts # MCP tool schemas
│ └── handlers.ts # Tool request handlers
└── types/
└── index.ts # TypeScript type definitions
Development
npm run dev # Run with tsx (no build needed)
npm run build # Compile TypeScript
npm run typecheck # Type check without emitting
npm run lint # Run ESLint
Tech Stack
- Runtime: Node.js
- Language: TypeScript (strict mode, ES2022)
- MCP SDK: @modelcontextprotocol/sdk
- AI Models: Google Gemini (images) + Veo (videos)
- Validation: Zod
License
MIT