jgkme/kilo-image-gen-mcp
MCP server for generating, editing, and processing images via multiple providers including Kilo, OpenRouter, OpenAI, and Gemini, with local tools for background removal, resizing, and cropping.
JGKME kilo-image-gen-mcp
MCP server for image generation through Kilo Gateway and compatible providers.
Features
- generate_image for OpenRouter-first generation with response normalization
- kilo_generate_image for Kilo Gateway routing
- edit_image for prompt-driven image editing
- background_remove for local segmentation-backed transparent PNG cutouts
- resize_image and auto_crop for deterministic local transforms
- finalize_image for a one-call local workflow that can remove background, trim, crop, and resize
- Support for OpenAI gpt-image-1
- Gemini support for gemini-2.5-flash-image, gemini-3.1-flash-image-preview, and gemini-3-pro-image-preview
- Global Kilo skills for generation, editing, background removal, and transforms
- Reusable smoke validation commands for OpenRouter, OpenAI, and local background removal
- Debug mode via IMAGE_MCP_DEBUG=1 for full error details and response payloads
- background_remove, resize_image, auto_crop, and finalize_image work without any provider API key
- The smoke harness prints compact summaries by default; add --json or --verbose when you need raw output
Install
npm install -g @jgkme/kilo-image-gen-mcp
Configuration
Set the default provider with IMAGE_MCP_DEFAULT_PROVIDER.
Set the default model with IMAGE_MCP_DEFAULT_MODEL.
Set a project-specific image output root with IMAGE_MCP_PROJECT_OUTPUT_DIR.
Set the default local background-removal backend with IMAGE_MCP_DEFAULT_BG_BACKEND (rmbg or imgly).
Set IMAGE_MCP_DEFAULT_BG_ALPHA_THRESHOLD to a number like 24 if you want the quality backend to default to a tighter mask for logos/header assets.
Set IMAGE_MCP_DEBUG=1 to include full error details, provider response payloads, and stack traces in MCP error output.
The local transform tools, background removal, and finalize workflow do not require provider keys. Only generation and edit routes need API keys.
For MCP clients, use whatever field name the client expects for process environment variables. In Kilo, the working key is env for local MCP servers. Some other clients use environment or similar, but the server itself only reads standard process environment variables.
Provider environment variables:
| Provider | Variable |
|---|---|
| Kilo | KILO_API_KEY |
| OpenRouter | OPENROUTER_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Gemini | GEMINI_API_KEY |
Default model notes:
- If IMAGE_MCP_DEFAULT_MODEL is set, the server uses that model when model is omitted.
- Otherwise, the default follows the selected provider.
- Kilo Gateway can route to other compatible models if your account has access to them.
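The default resolution described above can be sketched roughly as follows. This is an illustration, not the server's actual code: the function name `resolveDefaults` and the `PROVIDER_DEFAULT_MODELS` table are hypothetical, and only the `openai` default (gpt-image-1) is taken from this README. It follows the kilo fallback for the provider; generate_image instead falls back to openrouter.

```typescript
type Provider = "kilo" | "openrouter" | "openai" | "gemini";
type Env = Record<string, string | undefined>;

// Hypothetical table: only the openai entry is documented in this README.
// Providers without an entry have no illustrative default in this sketch.
const PROVIDER_DEFAULT_MODELS: Partial<Record<Provider, string>> = {
  openai: "gpt-image-1",
};

function resolveDefaults(
  env: Env,
  args: { provider?: Provider; model?: string }
): { provider: Provider; model: string | undefined } {
  // Explicit argument first, then the environment, then the kilo fallback.
  // The env value is trusted as-is here; a real server would validate it.
  const provider =
    args.provider ||
    (env["IMAGE_MCP_DEFAULT_PROVIDER"] as Provider | undefined) ||
    "kilo";

  // Explicit model first, then IMAGE_MCP_DEFAULT_MODEL, then the provider default.
  const model =
    args.model ||
    env["IMAGE_MCP_DEFAULT_MODEL"] ||
    PROVIDER_DEFAULT_MODELS[provider];

  return { provider, model };
}
```

Passing the environment in as a plain object keeps the function easy to test; a caller would typically hand it the process environment.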
Tools
kilo_generate_image
| Input | Type | Notes |
|---|---|---|
| prompt | string | Required |
| provider | string | kilo, openrouter, openai, gemini |
| model | string | Optional. Defaults from IMAGE_MCP_DEFAULT_MODEL or provider default |
| purpose | string | Optional. Helps normalize the prompt for a specific use case |
| style | string | Optional. Helps normalize the prompt's visual style |
| size | string | Example 1024x1024 |
| width / height | number | Overrides size |
| aspect | string | square, landscape, portrait |
| steps | number | Optional |
| input_image | string | Path, base64, or URL |
| output_path | string | Optional output file |
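The interaction between size, width / height, and aspect can be sketched as below. This is a minimal illustration under assumptions: `resolveSize` and `ASPECT_PRESETS` are hypothetical names, the landscape and portrait pixel values are placeholders, and the choice to let an explicit size win over aspect is a guess, since the README only states that width / height override size and that aspect applies when width and height are absent.

```typescript
type Aspect = "square" | "landscape" | "portrait";

interface SizeInput {
  size?: string; // e.g. "1024x1024"
  width?: number;
  height?: number;
  aspect?: Aspect;
}

// Hypothetical presets; the server's real aspect-to-size mapping is not documented here.
const ASPECT_PRESETS: Record<Aspect, string> = {
  square: "1024x1024",
  landscape: "1536x1024",
  portrait: "1024x1536",
};

function resolveSize(input: SizeInput, fallback: string = "1024x1024"): string {
  // Explicit width and height override size.
  if (input.width && input.height) {
    return `${input.width}x${input.height}`;
  }
  // Assumption: an explicit size string is preferred over an aspect preset.
  if (input.size) return input.size;
  // aspect maps to a size only when nothing more specific was given.
  if (input.aspect) return ASPECT_PRESETS[input.aspect];
  return fallback;
}
```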
generate_image
OpenRouter-first image generation tool with broader request options and response normalization.
| Input | Type | Notes |
|---|---|---|
| prompt | string | Required |
| provider | string | Defaults to openrouter if omitted |
| model | string | Optional |
| purpose | string | Optional. Helps normalize the prompt for a specific use case |
| style | string | Optional. Helps normalize the prompt's visual style |
| input_image | string | Optional reference image |
| input_images | string[] | Optional reference image list for OpenRouter chat-image models |
| modalities | string[] | Optional override, defaults to ['image', 'text'] for OpenRouter |
| quality | string | Optional OpenRouter image_config hint |
| background | string | Optional OpenRouter image_config hint |
| output_format | string | Optional OpenRouter image_config hint |
| moderation | string | Optional OpenRouter image_config hint |
| output_path | string | Optional output file path |
OpenRouter responses are normalized from choices[0].message.images, message.content, data.output, and other common image payload shapes. Multiple returned images are preserved.
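As a rough idea of what this normalization involves, the sketch below collects image URLs from two of the shapes named above: choices[0].message.images and image parts inside message.content. The interfaces are simplified assumptions about the payload, `extractImages` is a hypothetical name, and the data.output shape is left out because its structure is not described here.

```typescript
// Assumed, simplified payload shapes; real responses may carry more fields.
interface ImagePart {
  type?: string;
  image_url?: { url?: string };
}

interface ChatMessage {
  images?: ImagePart[];
  content?: string | ImagePart[] | null;
}

interface ChatResponse {
  choices?: { message?: ChatMessage }[];
}

function extractImages(res: ChatResponse): string[] {
  const found: string[] = [];
  const message = res.choices?.[0]?.message;

  // Shape 1: a dedicated images array on the message.
  for (const part of message?.images ?? []) {
    if (part.image_url?.url) found.push(part.image_url.url);
  }

  // Shape 2: image parts mixed into a structured content array.
  const content = message?.content;
  if (Array.isArray(content)) {
    for (const part of content) {
      if (part.image_url?.url) found.push(part.image_url.url);
    }
  }

  // Every image found is kept, so multi-image responses are preserved.
  return found;
}
```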
Project output hint:
- If IMAGE_MCP_PROJECT_OUTPUT_DIR is set, the server writes generated images there.
- Otherwise, if the repo root contains .image-mcp-output or .image-mcp-output.json, that value is used.
- This is intended for project-specific layouts like Laravel public/storage/images without hardcoding framework detection.
Example hint file contents:
public/storage/images
Or JSON:
{ "outputDir": "public/storage/images" }
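The precedence between the environment variable and the hint file could look something like the sketch below. It works on the hint file's contents as a string so it stays independent of how the file is read. The function names and the fallback parameter are hypothetical, and treating a malformed JSON hint as "no hint" is an assumption rather than documented behavior.

```typescript
// Parse the contents of a hint file. A JSON hint carries an "outputDir" key;
// a plain-text hint is just the path itself.
function parseOutputHint(contents: string): string | undefined {
  const trimmed = contents.trim();
  if (trimmed === "") return undefined;

  if (trimmed.startsWith("{")) {
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null) {
        const dir = (parsed as { outputDir?: unknown }).outputDir;
        return typeof dir === "string" && dir !== "" ? dir : undefined;
      }
    } catch {
      // Assumption: malformed JSON is treated as if no hint were present.
    }
    return undefined;
  }

  return trimmed;
}

// IMAGE_MCP_PROJECT_OUTPUT_DIR wins; the hint file is consulted only without it.
function resolveOutputDir(
  envValue: string | undefined,
  hintContents: string | undefined,
  fallback: string
): string {
  if (envValue) return envValue;
  const hinted =
    hintContents !== undefined ? parseOutputHint(hintContents) : undefined;
  return hinted ?? fallback;
}
```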
list_image_models
Returns configured provider status, current defaults, and known model families.
get_provider_status
Returns configured providers and defaults.
edit_image
Same shape as kilo_generate_image, but requires input_image and routes the prompt as an edit instruction. Native edit routes are used for Kilo, OpenAI, and OpenRouter when available, with chat-based fallback for other providers.
background_remove
Locally removes a background with a segmentation model and preserves transparency in a PNG output.
| Input | Type | Notes |
|---|---|---|
| input_image | string | Required |
| backend | string | Optional. rmbg for fast/default or imgly for higher quality |
| model | string | Optional. u2netp, modnet, or briaai |
| max_resolution | number | Optional. Defaults to 2048 |
| alpha_feather | number | Optional. Softens the final alpha edge a little |
| alpha_threshold | number | Optional. Tightens the alpha mask after feathering |
| output_path | string | Optional output file path |
If backend is omitted, the server uses IMAGE_MCP_DEFAULT_BG_BACKEND when set, otherwise rmbg.
Backend notes:
- rmbg is the fast default and uses u2netp, modnet, or briaai
- imgly is the higher-quality path and uses small or medium
- Both backends run locally and do not need API keys
- Expect the quality backend to use more RAM and disk for model assets
- For logo/header artwork, a threshold like 24 often removes leftover white halo pixels better than the raw output
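To make the threshold idea concrete, here is one plausible reading of what alpha_threshold does: any pixel whose alpha falls below the threshold becomes fully transparent, which removes faint halo pixels around a cutout. This is a sketch of the general technique on a raw RGBA buffer, not the server's confirmed implementation, and `applyAlphaThreshold` is a hypothetical name.

```typescript
// Zero out every alpha value below `threshold` in an RGBA pixel buffer.
// The buffer is modified in place and returned for convenience.
function applyAlphaThreshold(
  rgba: Uint8ClampedArray,
  threshold: number
): Uint8ClampedArray {
  // Alpha is the 4th byte of each pixel, so start at index 3 and step by 4.
  for (let i = 3; i < rgba.length; i += 4) {
    if (rgba[i] < threshold) {
      rgba[i] = 0;
    }
  }
  return rgba;
}

// Two pixels: a faint white halo pixel (alpha 10) and a solid black pixel (alpha 200).
const pixels = new Uint8ClampedArray([255, 255, 255, 10, 0, 0, 0, 200]);
applyAlphaThreshold(pixels, 24);
```

With a threshold of 24, the halo pixel's alpha drops to 0 while the solid pixel is left untouched.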
Resource notes:
- CPU-only processing is fine for most images, but the quality backend may take longer on simple laptops
- Expect roughly 40 MB to 170 MB of model assets depending on the chosen backend/model
- GPU is optional; it helps if the host already has an accelerator, but it is not required
finalize_image
One-call local workflow for cleanup and export.
| Input | Type | Notes |
|---|---|---|
| input_image | string | Required |
| remove_background | boolean | Optional. Run local background removal first |
| background_backend | string | Optional. rmbg or imgly |
| background_model | string | Optional. Backend-specific model name |
| max_resolution | number | Optional. Used by the background-removal step |
| alpha_feather | number | Optional. Softens the final alpha edge a little |
| alpha_threshold | number | Optional. Tightens the alpha mask after feathering |
| trim | boolean | Optional. Trim transparent or empty borders |
| width / height | number | Optional. Resize target |
| fit | string | Optional. cover, contain, fill, inside, outside |
| gravity | string | Optional crop gravity |
| background | string | Optional flatten background color |
| output_path | string | Optional output file path |
Typical use: remove the background from a logo or product shot, then trim or resize it in the same call.
If background_backend is omitted, the server uses IMAGE_MCP_DEFAULT_BG_BACKEND when set, otherwise rmbg.
If you still see tiny edge residue after removal, try a small alpha_feather value like 0.4 to 0.8 or a mild alpha_threshold like 8 to 20.
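For clients that build requests by hand, a finalize_image call might be assembled as in the sketch below. The envelope follows the MCP tools/call JSON-RPC request; the argument names come from the input table above, while `buildFinalizeCall`, the file paths, and the chosen values are purely illustrative.

```typescript
// A subset of the finalize_image inputs listed in the table above.
interface FinalizeImageArgs {
  input_image: string;
  remove_background?: boolean;
  background_backend?: "rmbg" | "imgly";
  background_model?: string;
  alpha_feather?: number;
  alpha_threshold?: number;
  trim?: boolean;
  width?: number;
  height?: number;
  fit?: "cover" | "contain" | "fill" | "inside" | "outside";
  output_path?: string;
}

// Wrap the arguments in an MCP tools/call JSON-RPC request.
function buildFinalizeCall(id: number, args: FinalizeImageArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "finalize_image", arguments: args },
  };
}

// Hypothetical logo cleanup: remove background, tighten the mask, trim, resize.
const finalizeRequest = buildFinalizeCall(1, {
  input_image: "generated-images/logo.png", // illustrative path
  remove_background: true,
  background_backend: "imgly",
  alpha_threshold: 20,
  trim: true,
  width: 512,
  fit: "contain",
  output_path: "generated-images/logo-final.png", // illustrative path
});
```

Most MCP clients construct this envelope for you, so in practice only the arguments object matters.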
resize_image
Locally resizes an image with aspect-ratio-preserving defaults.
| Input | Type | Notes |
|---|---|---|
| input_image | string | Required |
| width / height | number | Optional target dimensions |
| fit | string | cover, contain, fill, inside, outside |
| background | string | Optional flatten background color |
| output_path | string | Optional output file path |
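The fit names match those used by common image libraries such as sharp, though whether the server relies on sharp is an assumption. As an illustration of an aspect-ratio-preserving mode, the sketch below computes the output dimensions for an inside-style fit, where the image is scaled to be as large as possible without exceeding either target dimension. `fitInside` is a hypothetical helper, not part of the tool's API.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

// Scale the source so it fits entirely within maxWidth x maxHeight,
// keeping the original aspect ratio.
function fitInside(
  source: Dimensions,
  maxWidth: number,
  maxHeight: number
): Dimensions {
  // The tighter of the two constraints determines the scale factor.
  const scale = Math.min(maxWidth / source.width, maxHeight / source.height);
  return {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
}

// A 2000x1000 image fitted inside a 512x512 box is limited by its width.
const fitted = fitInside({ width: 2000, height: 1000 }, 512, 512);
```

Here the result is 512 wide and 256 high: the width constraint is the binding one, and the height follows from the 2:1 aspect ratio.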
auto_crop
Locally crops to target dimensions or trims surrounding whitespace when no size is provided.
| Input | Type | Notes |
|---|---|---|
| input_image | string | Required |
| width / height | number | Optional crop target |
| gravity | string | Optional crop gravity |
| output_path | string | Optional output file path |
Behavior
- provider defaults to IMAGE_MCP_DEFAULT_PROVIDER or kilo
- model defaults to IMAGE_MCP_DEFAULT_MODEL when set, otherwise to the provider default
- openai defaults to gpt-image-1
- generate_image and edit_image add light prompt normalization when purpose, style, aspect, or quality are provided
- aspect maps to size when width and height are not provided
- input_image can be a file path, base64 string, or URL
- output_path writes the generated PNG to disk
- generate_image supports OpenRouter response normalization and multiple image payloads when present
- edit_image treats input_image as the reference image and preserves subject/composition unless instructed otherwise
- edit_image uses native edit endpoints for Kilo, OpenAI, and OpenRouter when possible
- background_remove, resize_image, and auto_crop are local deterministic tools that do not require provider API keys
- background_remove uses a local rmbg segmentation model by default and can switch to imgly for better edges
- finalize_image can chain local background removal, trim, crop, resize, and flatten in one call
- IMAGE_MCP_DEBUG=1 expands tool error output with details, response, and stack
- If a generated image still contains checkerboard-like residue in the background, use background_backend=imgly and finalize_image first; deeper alpha cleanup would require refinement knobs
Smoke tests
The repo includes a reusable validation script:
npm run smoke -- --provider openrouter --prompt "a colorful parrot perched on a branch"
npm run smoke -- --provider openai --prompt "a red panda wearing glasses" --purpose portrait
npm run smoke -- --tool background_remove --input-image generated-images/parrot.png --model modnet
npm run smoke -- --tool finalize_image --input-image generated-images/j-gemini-openrouter.png --remove-background true --background-backend imgly --background-model medium --trim true
Convenience commands are also available:
npm run smoke:openrouter
npm run smoke:openai
npm run smoke:bg
npm run demo:bg
npm run demo:finalize
Demo commands are useful when you want to test local background removal or finalization directly against an existing generated image without typing the full tool flags.
Provider notes
- kilo uses https://api.kilo.ai/api/gateway/images/generations
- openrouter supports many image models, but not every model supports every modality combination. The server adapts the request shape per model and falls back to image-only when needed.
- openai uses gpt-image-1 for generation and the Images API for edits
- gemini uses the OpenAI-compatible Gemini endpoint for generation and chat-style image flows where supported
- Errors return structured JSON text with code, message, details, and retryable
- For Kilo/OpenRouter, you can pick a different compatible model by setting model explicitly or by changing IMAGE_MCP_DEFAULT_MODEL
- In live Kilo runtime tests, the MCP can receive a non-empty KILO_API_KEY, but the Kilo image gateway still responds with Please pass a valid API key / PAID_MODEL_AUTH_REQUIRED for image generation requests. That indicates the backend is rejecting the token at the image endpoint, not that the MCP is dropping the key.
- Local image manipulation tools work without any provider API key after the model assets are installed or downloaded on first use.
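A client that wants to react to these structured errors could parse them along the lines of the sketch below. The four field names come from the notes above; their types, the `ToolError` interface, and the `parseToolError` helper are assumptions made for illustration.

```typescript
// Assumed field types; this README only names the fields.
interface ToolError {
  code: string;
  message: string;
  details?: unknown;
  retryable: boolean;
}

// Parse the JSON text of a tool error, returning undefined if it does not
// look like the structured shape described above.
function parseToolError(text: string): ToolError | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return undefined; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;

  const obj = parsed as Record<string, unknown>;
  const code = obj["code"];
  const message = obj["message"];
  if (typeof code !== "string" || typeof message !== "string") return undefined;

  return {
    code,
    message,
    details: obj["details"],
    // Treat anything other than an explicit true as not retryable.
    retryable: obj["retryable"] === true,
  };
}

// Example using the Kilo gateway rejection mentioned above.
const parsedError = parseToolError(
  '{"code":"PAID_MODEL_AUTH_REQUIRED","message":"Please pass a valid API key","retryable":false}'
);
```

A caller can then branch on retryable to decide whether to retry the same call or surface the message to the user.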
Kilo config
{
"mcp": {
"kilo-image-gen": {
"type": "local",
"command": ["npx", "-y", "@jgkme/kilo-image-gen-mcp"],
"env": {
"IMAGE_MCP_DEFAULT_PROVIDER": "kilo",
"IMAGE_MCP_DEFAULT_MODEL": "black-forest-labs/flux.2-pro",
"IMAGE_MCP_DEFAULT_BG_BACKEND": "imgly",
"KILO_API_KEY": "your_kilo_api_key_here",
"OPENROUTER_API_KEY": "your_openrouter_api_key_here",
"OPENAI_API_KEY": "your_openai_api_key_here",
"GEMINI_API_KEY": "your_gemini_api_key_here"
},
"enabled": true
}
}
}
If you are using another MCP client such as Cursor, continue using the same process environment variable names (KILO_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY), but match that client’s config schema for local server env injection. The key point is that the launched process must receive those variables in its runtime environment.
Troubleshooting
If generation fails, verify the provider API key and model name. If no output file is written, confirm output_path points to a writable location.
If you need full provider payloads for debugging, set IMAGE_MCP_DEBUG=1 in the MCP server environment and retry the same call.