z_ai_image_gen_mcp

MCP server for generating images and videos using Z.AI models (GLM-Image, CogView-4, CogVideoX-3, Vidu Q1, etc.) with support for synchronous and asynchronous generation, downloads, and multiple input modes.

Z.AI Image & Video Generation MCP Server

A Model Context Protocol (MCP) server that provides access to Z.AI's image and video generation models for LLM applications.

Features

  • Image Generation: GLM-Image and CogView-4 models for high-quality image generation
  • Video Generation: CogVideoX-3, Vidu Q1, and Vidu 2 models for AI video creation
  • Multiple Input Modes: Text-to-image/video, image-to-video, start-end frame animation
  • Asynchronous Processing: Submit long-running tasks and poll for results
  • Automatic Downloads: Generate and download in a single operation
  • Automatic Retries: Built-in retry logic with exponential backoff
  • Comprehensive Validation: Input validation with clear error messages
  • Type-Safe: Full TypeScript support with detailed type definitions

Installation

npm install GeorgH93/z_ai_image_gen_mcp

Configuration

Set your Z.AI API key as an environment variable:

export ZAI_API_KEY=your_api_key_here

Get your API key from the Z.AI API Keys page or sign up for the GLM Coding Plan.

Optional Configuration

| Environment Variable | Description | Default |
| --- | --- | --- |
| ZAI_API_BASE_URL | API base URL | https://api.z.ai/api |
| ZAI_DEFAULT_MODEL | Default model | glm-image |
| ZAI_DEFAULT_SIZE | Default image size | 1280x1280 |
| ZAI_REQUEST_TIMEOUT | Request timeout (ms) | 60000 |
| ZAI_MAX_RETRIES | Max retry attempts | 3 |
| ZAI_RETRY_DELAY | Initial retry delay (ms) | 1000 |
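
As a rough sketch of how these defaults could be resolved, the function below reads the variables from a plain object rather than the real environment. `resolveConfig`, `ZaiConfig`, and the field names are hypothetical and used only for illustration; the package's own `loadConfig` may behave differently.

```typescript
// Hypothetical config shape; field names are illustrative, not the package's actual types.
interface ZaiConfig {
  apiKey: string;
  baseUrl: string;
  defaultModel: string;
  defaultSize: string;
  requestTimeoutMs: number;
  maxRetries: number;
  retryDelayMs: number;
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer, falling back to the default when unset or invalid.
function intOr(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number(value);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

// Apply the documented defaults for every optional variable.
function resolveConfig(env: Env): ZaiConfig {
  const apiKey = env.ZAI_API_KEY;
  if (!apiKey) {
    throw new Error("ZAI_API_KEY is not set");
  }
  return {
    apiKey,
    baseUrl: env.ZAI_API_BASE_URL ?? "https://api.z.ai/api",
    defaultModel: env.ZAI_DEFAULT_MODEL ?? "glm-image",
    defaultSize: env.ZAI_DEFAULT_SIZE ?? "1280x1280",
    requestTimeoutMs: intOr(env.ZAI_REQUEST_TIMEOUT, 60000),
    maxRetries: intOr(env.ZAI_MAX_RETRIES, 3),
    retryDelayMs: intOr(env.ZAI_RETRY_DELAY, 1000),
  };
}

const exampleConfig = resolveConfig({ ZAI_API_KEY: "your_api_key_here", ZAI_MAX_RETRIES: "5" });
console.log(exampleConfig.defaultModel, exampleConfig.maxRetries); // prints: glm-image 5
```

Passing the environment in as an argument keeps the sketch easy to test; in a real process you would pass `process.env`.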

Usage

With Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "z-ai-image": {
      "command": "npx",
      "args": ["z-ai-image-mcp"],
      "env": {
        "ZAI_API_KEY": "your_api_key_here"
      }
    }
  }
}

With Other MCP Clients

Run the server directly:

npx z-ai-image-mcp

Or programmatically:

import { createServer, loadConfig } from 'z-ai-image-mcp';

const config = loadConfig();
const server = createServer(config);
// Connect to your transport...

With OpenCode

Add to your OpenCode configuration (opencode.json or opencode.jsonc in your project root):

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "z-ai-image": {
      "type": "local",
      "command": ["npx", "z-ai-image-mcp"],
      "enabled": true,
      "environment": {
        "ZAI_API_KEY": "your_api_key_here"
      }
    }
  }
}

Or using an environment variable reference:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "z-ai-image": {
      "type": "local",
      "command": ["npx", "z-ai-image-mcp"],
      "enabled": true,
      "environment": {
        "ZAI_API_KEY": "{env:ZAI_API_KEY}"
      }
    }
  }
}

Using with OpenCode prompts:

Generate a professional logo for a tech startup. Use z-ai-image.

Or add to your AGENTS.md:

When generating images, use the `z-ai-image` MCP server tools.

Per-agent configuration (optional):

To enable the MCP server only for specific agents:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "z-ai-image": {
      "type": "local",
      "command": ["npx", "z-ai-image-mcp"],
      "enabled": true,
      "environment": {
        "ZAI_API_KEY": "{env:ZAI_API_KEY}"
      }
    }
  },
  "tools": {
    "z-ai-image*": false
  },
  "agent": {
    "design-agent": {
      "tools": {
        "z-ai-image*": true
      }
    }
  }
}

Available Tools

1. list_models

List all available image generation models and their capabilities.

Use this tool to discover available models, their features, and recommended settings.

2. generate_image

Generate an image synchronously from a text prompt.

Parameters:

  • prompt (required): Text description of the image (max 4000 characters)
  • model (optional): glm-image or cogview-4-250304 (default: glm-image)
  • size (optional): Image dimensions, e.g., 1280x1280 (default: 1280x1280)
  • quality (optional): hd or standard (default: hd for GLM-Image)
  • user_id (optional): End user ID for abuse prevention (6-128 characters)

Example:

Generate an image of a cute kitten sitting on a windowsill with a sunset background.

3. generate_image_async

Start an asynchronous image generation task. Returns a task ID for polling.

Parameters:

  • prompt (required): Text description of the image
  • model (optional): Only glm-image supports async (default: glm-image)
  • size (optional): Image dimensions (default: 1280x1280)
  • quality (optional): Only hd supported for async (default: hd)
  • user_id (optional): End user ID for abuse prevention

Example:

Start async generation of a complex poster design.

4. get_async_result

Retrieve the result of an asynchronous image generation task.

Parameters:

  • task_id (required): The task ID from generate_image_async

Example:

Check the status of task ID "task-12345".

5. download_image

Download an image from a URL and return it as base64 or save to a file.

Parameters:

  • url (required): The URL of the image to download (e.g., from generate_image or get_async_result)
  • output (optional): base64 or file_output (default: base64)
  • file_output (optional): Absolute path to save the image file (required if output is file_output). Example: /path/to/image.png

Output Modes:

  • base64: Returns the image data directly as base64 (automatically switches to file output if the image is larger than 1 MB)
  • file_output: Saves the image to disk at the specified path

Example:

Download the generated image and save it to /home/user/images/logo.png

Note: Z.AI image URLs expire after 30 days. Use this tool to download and store images permanently.
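
The output-mode rules above can be sketched as a small decision function. This is an illustration only: `resolveOutputMode` is a hypothetical helper, and treating "1MB" as 1024 * 1024 bytes is an assumption, since the README does not say how the threshold is measured.

```typescript
type OutputMode = "base64" | "file_output";

// Assumption: "1MB" means 1024 * 1024 bytes; the server may use a different threshold.
const MAX_BASE64_BYTES = 1024 * 1024;

// Decide how an image of the given size should be returned.
function resolveOutputMode(
  requested: OutputMode | undefined,
  imageBytes: number,
  filePath?: string,
): OutputMode {
  const mode = requested ?? "base64"; // base64 is the documented default
  if (mode === "file_output") {
    if (!filePath) {
      throw new Error("file_output path is required when output is file_output");
    }
    return "file_output";
  }
  // Large images fall back to a file even when base64 was requested.
  return imageBytes > MAX_BASE64_BYTES ? "file_output" : "base64";
}

console.log(resolveOutputMode(undefined, 200 * 1024)); // prints: base64
console.log(resolveOutputMode("base64", 3 * 1024 * 1024)); // prints: file_output
```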

6. generate_and_download_image ⭐ Recommended

Generate an image and automatically download it in a single operation. This is the most convenient tool when you want the image data immediately.

Parameters:

  • prompt (required): Text description of the image (max 4000 characters)
  • model (optional): glm-image or cogview-4-250304 (default: glm-image)
  • size (optional): Image dimensions, e.g., 1280x1280 (default: 1280x1280)
  • quality (optional): hd or standard (default: hd for GLM-Image)
  • user_id (optional): End user ID for abuse prevention (6-128 characters)
  • output (optional): base64 or file_output (default: base64)
  • file_output (optional): Absolute path to save the image file (required if output is file_output)
  • poll_interval (optional): Seconds to wait between polling for async results (default: 3)
  • max_wait (optional): Maximum seconds to wait for generation (default: 120)

Output Modes:

  • base64: Returns the image data directly as base64 (automatically switches to file output if the image is larger than 1 MB)
  • file_output: Saves the image to disk at the specified path

Examples:

# Generate and get as base64
Generate a logo for my company and show me the image.

# Generate and save to file
Generate a logo and save it to /home/user/images/logo.png

Behavior:

  • For GLM-Image: Uses async API with automatic polling until complete
  • For CogView-4: Uses synchronous API
  • Automatically downloads the result once generation completes
  • Returns image as base64 or saves to specified path
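
The polling behavior described above might look roughly like the loop below. It is a sketch under stated assumptions, not the server's implementation: `pollUntilDone`, the `TaskResult` shape, and the status values `PROCESSING`, `SUCCESS`, and `FAIL` are hypothetical. Only the `poll_interval` and `max_wait` defaults (3 and 120 seconds) come from the parameter list.

```typescript
// Hypothetical task shape; the real API's status values and fields may differ.
type TaskStatus = "PROCESSING" | "SUCCESS" | "FAIL";

interface TaskResult {
  status: TaskStatus;
  url?: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the task finishes, fails, or the next wait would exceed maxWaitSec.
async function pollUntilDone(
  fetchResult: () => Promise<TaskResult>,
  pollIntervalSec = 3,
  maxWaitSec = 120,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<string> {
  let waitedSec = 0;
  for (;;) {
    const result = await fetchResult();
    if (result.status === "SUCCESS") {
      if (!result.url) throw new Error("Task succeeded but returned no URL");
      return result.url;
    }
    if (result.status === "FAIL") throw new Error("Generation failed");
    if (waitedSec + pollIntervalSec > maxWaitSec) {
      throw new Error(`Timed out after ${waitedSec}s`);
    }
    await wait(pollIntervalSec * 1000);
    waitedSec += pollIntervalSec;
  }
}

// Demo with a fake fetcher that succeeds on the third poll and a no-op wait.
let demoCalls = 0;
const fakeFetch = async (): Promise<TaskResult> =>
  ++demoCalls < 3
    ? { status: "PROCESSING" }
    : { status: "SUCCESS", url: "https://example.com/image.png" };

pollUntilDone(fakeFetch, 3, 120, async () => {}).then((url) => console.log(url));
```

The fetcher and the wait function are injected so the loop can be exercised without network access or real delays. With the defaults, this sketch waits at most 40 times before giving up.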

Video Generation Tools

7. list_video_models

List all available video generation models and their capabilities.

Use this tool to discover available video models, their features, and supported parameters.

8. generate_video

Generate a video asynchronously from text or images. Returns a task ID for polling.

Parameters:

  • model (required): Video generation model
    • cogvideox-3: Z.AI flagship model (up to 4K, 5-10s, audio support)
    • viduq1-text: Text-to-video, 1080P, 5s
    • viduq1-image: Image-to-video, 1080P, 5s
    • viduq1-start-end: Start-end frame, 1080P, 5s
    • vidu2-image: Image-to-video, 720P, 4s (faster, cheaper)
    • vidu2-start-end: Start-end frame, 720P, 4s
    • vidu2-reference: Reference-based, 720P, 4s
  • prompt (optional): Text description (max 512 characters)
  • image_url (optional): Image URL(s) for image-to-video generation
  • quality (CogVideoX-3): quality or speed
  • size (optional): Video resolution
  • duration (optional): Video duration in seconds
  • fps (CogVideoX-3): 30 or 60
  • with_audio (optional): Generate AI sound effects
  • style (Vidu Q1 text): general or anime
  • aspect_ratio (Vidu Q1/2): 16:9, 9:16, or 1:1
  • movement_amplitude (Vidu): auto, small, medium, or large
  • user_id (optional): End user ID for abuse prevention

Examples:

# Text-to-video
Generate a video of a cat playing with a ball.

# Image-to-video
Animate this image: [image_url]

# Start-end frame
Create a smooth transition from [first_frame] to [last_frame].
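
Several of the parameters above apply only to certain models. The sketch below checks a request against those notes before it is sent. `checkVideoParams` and `VideoParams` are hypothetical names. Two points are assumptions and not stated in this README: that mismatched parameters should be flagged (the server may simply ignore them), and that image-based models require `image_url`.

```typescript
type VideoModel =
  | "cogvideox-3"
  | "viduq1-text"
  | "viduq1-image"
  | "viduq1-start-end"
  | "vidu2-image"
  | "vidu2-start-end"
  | "vidu2-reference";

// Hypothetical subset of the generate_video parameters.
interface VideoParams {
  model: VideoModel;
  prompt?: string;
  image_url?: string | string[];
  quality?: "quality" | "speed";
  fps?: number;
  style?: "general" | "anime";
  aspect_ratio?: "16:9" | "9:16" | "1:1";
}

// Returns a list of problems; an empty list means the parameters look consistent.
function checkVideoParams(p: VideoParams): string[] {
  const problems: string[] = [];
  const isCog = p.model === "cogvideox-3";
  if (p.prompt !== undefined && p.prompt.length > 512) {
    problems.push("prompt exceeds 512 characters");
  }
  if (p.quality !== undefined && !isCog) {
    problems.push("quality applies to cogvideox-3 only");
  }
  if (p.fps !== undefined && (!isCog || (p.fps !== 30 && p.fps !== 60))) {
    problems.push("fps must be 30 or 60 and applies to cogvideox-3 only");
  }
  if (p.style !== undefined && p.model !== "viduq1-text") {
    problems.push("style applies to viduq1-text only");
  }
  if (p.aspect_ratio !== undefined && isCog) {
    problems.push("aspect_ratio applies to Vidu models only");
  }
  // Assumption: every model except cogvideox-3 and viduq1-text needs an input image.
  const needsImage = !isCog && p.model !== "viduq1-text";
  if (needsImage && (p.image_url === undefined || p.image_url.length === 0)) {
    problems.push("image_url is required for image-based models");
  }
  return problems;
}

console.log(checkVideoParams({ model: "cogvideox-3", prompt: "A cat playing with a ball", fps: 60 }));
console.log(checkVideoParams({ model: "viduq1-text", prompt: "A cat", fps: 30 }));
```

The first call returns an empty list; the second reports the `fps` mismatch.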

9. get_video_result

Retrieve the result of an asynchronous video generation task.

Parameters:

  • task_id (required): The task ID from generate_video

Note: Video generation typically takes 30 seconds to several minutes depending on duration and quality.

10. generate_and_download_video ⭐ Recommended

Generate a video and automatically download it. Polls for completion and saves the video file.

Parameters:

  • All parameters from generate_video plus:
  • file_output (optional): Absolute path to save the video file
  • poll_interval (optional): Seconds to wait between polling (default: 10)
  • max_wait (optional): Maximum seconds to wait (default: 300)

Example:

Generate a video of a sunset over the ocean and save it to /home/user/videos/sunset.mp4

Note: Videos are always saved to file (too large for base64). Video URLs expire after 1 day.

Models

GLM-Image

Z.AI's flagship image generation model with a hybrid autoregressive + diffusion architecture.

  • Best for: Complex compositions, text rendering, detailed illustrations, commercial posters
  • Quality options: hd (detailed, ~20s), standard (faster, ~5-10s)
  • Size range: 1024-2048px per dimension (divisible by 32)
  • Recommended sizes: 1280×1280, 1568×1056, 1056×1568, 1472×1088, 1088×1472, 1728×960, 960×1728
  • Async support: Yes

CogView-4-250304

General-purpose image generation with fast text understanding.

  • Best for: General image generation, quick iterations
  • Quality options: hd, standard
  • Size range: 512-2048px per dimension (divisible by 16)
  • Recommended sizes: 1024×1024, 768×1344, 864×1152, 1344×768, 1152×864, 1440×720, 720×1440
  • Async support: No
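
The size constraints for both image models can be checked before sending a request. The sketch below is illustrative; `checkSize` and `SIZE_RULES` are hypothetical names, and the server's own validation may differ. Note that 1728×960 and 960×1728 appear among the GLM-Image recommended sizes even though 960 is below the stated 1024-2048 range, so the sketch accepts the recommended sizes explicitly before applying the range rule.

```typescript
interface SizeRule {
  min: number;
  max: number;
  multipleOf: number;
  recommended: string[];
}

// Ranges and recommended sizes as listed for each model.
const SIZE_RULES: Record<string, SizeRule> = {
  "glm-image": {
    min: 1024,
    max: 2048,
    multipleOf: 32,
    recommended: ["1280x1280", "1568x1056", "1056x1568", "1472x1088", "1088x1472", "1728x960", "960x1728"],
  },
  "cogview-4-250304": {
    min: 512,
    max: 2048,
    multipleOf: 16,
    recommended: ["1024x1024", "768x1344", "864x1152", "1344x768", "1152x864", "1440x720", "720x1440"],
  },
};

// Returns null when the size is acceptable, otherwise a message describing the problem.
function checkSize(model: string, size: string): string | null {
  const rule = SIZE_RULES[model];
  if (!rule) return `unknown model: ${model}`;
  if (rule.recommended.indexOf(size) !== -1) return null;
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return `size must look like 1280x1280, got "${size}"`;
  for (const value of [Number(match[1]), Number(match[2])]) {
    if (value < rule.min || value > rule.max) {
      return `${value} is outside ${rule.min}-${rule.max}px`;
    }
    if (value % rule.multipleOf !== 0) {
      return `${value} is not divisible by ${rule.multipleOf}`;
    }
  }
  return null;
}

console.log(checkSize("glm-image", "1280x1280")); // prints: null
console.log(checkSize("glm-image", "1040x1040")); // prints: 1040 is not divisible by 32
```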

Video Models

CogVideoX-3

Z.AI's flagship video generation model with improved frame stability and clarity.

  • Best for: Text-to-video, image-to-video, start-end frame animation
  • Resolution: Up to 4K (3840x2160)
  • Duration: 5 or 10 seconds
  • Features: Audio generation, 30/60 FPS, quality/speed modes
  • Price: $0.20/video

Vidu Q1

High-quality video generation with 1080P output.

| Model | Capability | Duration | Price |
| --- | --- | --- | --- |
| viduq1-text | Text-to-video | 5s | $0.40 |
| viduq1-image | Image-to-video | 5s | $0.40 |
| viduq1-start-end | Start-end frame | 5s | $0.40 |

  • Features: General/anime styles, motion amplitude control

Vidu 2

Fast and cost-effective video generation with 720P output.

| Model | Capability | Duration | Price |
| --- | --- | --- | --- |
| vidu2-image | Image-to-video | 4s | $0.20 |
| vidu2-start-end | Start-end frame | 4s | $0.20 |
| vidu2-reference | Reference-based | 4s | $0.40 |

  • Features: Audio generation, motion amplitude control, multi-image reference
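
For budgeting a batch of videos, the prices listed above can be summed with a small helper. This is a sketch: `estimateCostCents` is a hypothetical function, the prices are assumed to be charged per generated video for every model, and actual billing may differ. Amounts are kept in integer cents to avoid floating-point rounding.

```typescript
// Prices in US cents, copied from the model descriptions above.
const PRICE_CENTS: Record<string, number> = {
  "cogvideox-3": 20,
  "viduq1-text": 40,
  "viduq1-image": 40,
  "viduq1-start-end": 40,
  "vidu2-image": 20,
  "vidu2-start-end": 20,
  "vidu2-reference": 40,
};

// Sum the listed price for each planned video; throws on an unknown model.
function estimateCostCents(models: string[]): number {
  let sum = 0;
  for (const model of models) {
    const price = PRICE_CENTS[model];
    if (price === undefined) throw new Error(`unknown model: ${model}`);
    sum += price;
  }
  return sum;
}

const exampleTotal = estimateCostCents(["cogvideox-3", "viduq1-text", "vidu2-image"]);
console.log(`$${(exampleTotal / 100).toFixed(2)}`); // prints: $0.80
```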

Error Handling

The server handles various error scenarios:

| Error Type | Description |
| --- | --- |
| AUTH_ERROR | Invalid or missing API key |
| RATE_LIMIT | Too many requests - will auto-retry |
| VALIDATION_ERROR | Invalid parameters |
| SERVER_ERROR | Z.AI server issues - will auto-retry |
| NETWORK_ERROR | Connection issues - will auto-retry |
| TIMEOUT_ERROR | Request timeout - will auto-retry |
| CONTENT_FILTER | Prompt blocked by content policy |
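
To illustrate which errors are retried and how the waits could grow, here is a minimal sketch. The retryable set follows the descriptions above, and the defaults match ZAI_MAX_RETRIES (3) and ZAI_RETRY_DELAY (1000 ms). The doubling factor is an assumption, since the README only says the backoff is exponential, and `retrySchedule` and the other names are hypothetical.

```typescript
type ErrorType =
  | "AUTH_ERROR"
  | "RATE_LIMIT"
  | "VALIDATION_ERROR"
  | "SERVER_ERROR"
  | "NETWORK_ERROR"
  | "TIMEOUT_ERROR"
  | "CONTENT_FILTER";

// Error types described as "will auto-retry".
const RETRYABLE: ErrorType[] = ["RATE_LIMIT", "SERVER_ERROR", "NETWORK_ERROR", "TIMEOUT_ERROR"];

function isRetryable(type: ErrorType): boolean {
  return RETRYABLE.indexOf(type) !== -1;
}

// Assumption: the delay doubles on each attempt (attempt 0 uses the initial delay).
function backoffDelayMs(attempt: number, initialDelayMs = 1000): number {
  return initialDelayMs * Math.pow(2, attempt);
}

// Delays before each retry, or an empty list for errors that are not retried.
function retrySchedule(type: ErrorType, maxRetries = 3, initialDelayMs = 1000): number[] {
  if (!isRetryable(type)) return [];
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(backoffDelayMs(attempt, initialDelayMs));
  }
  return delays;
}

console.log(retrySchedule("RATE_LIMIT").join(", ")); // prints: 1000, 2000, 4000
console.log(retrySchedule("AUTH_ERROR").length); // prints: 0
```

Production retry logic often adds random jitter to these delays; that is omitted here for clarity.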

Development

Setup

git clone <repo-url>
cd z-ai-image-mcp
npm install
cp .env.example .env
# Edit .env with your API key

Scripts

npm run build        # Build TypeScript
npm run dev          # Run in development mode
npm test             # Run all tests
npm run test:unit    # Run unit tests only
npm run test:integration  # Run integration tests
npm run test:e2e     # Run E2E tests
npm run test:coverage    # Run tests with coverage
npm run typecheck    # Type check without emit

License

MIT
