z_ai_image_gen_mcp

MCP server for generating images and videos using Z.AI models (GLM-Image, CogView-4, CogVideoX-3, Vidu Q1, etc.) with support for synchronous and asynchronous generation, downloads, and multiple input modes.

Z.AI Image & Video Generation MCP Server

A Model Context Protocol (MCP) server that provides access to Z.AI's image and video generation models for LLM applications.

Features

Image Generation: GLM-Image and CogView-4 models for high-quality image generation
Video Generation: CogVideoX-3, Vidu Q1, and Vidu 2 models for AI video creation
Multiple Input Modes: Text-to-image/video, image-to-video, start-end frame animation
Asynchronous Processing: Submit long-running tasks and poll for results
Automatic Downloads: Generate and download in a single operation
Automatic Retries: Built-in retry logic with exponential backoff
Comprehensive Validation: Input validation with clear error messages
Type-Safe: Full TypeScript support with detailed type definitions

Installation

npm install GeorgH93/z_ai_image_gen_mcp

Configuration

Set your Z.AI API key as an environment variable:

export ZAI_API_KEY=your_api_key_here

Get your API key from the Z.AI API Keys page or sign up for the GLM Coding Plan.

Optional Configuration

| Environment Variable | Description | Default |
| --- | --- | --- |
| `ZAI_API_BASE_URL` | API base URL | `https://api.z.ai/api` |
| `ZAI_DEFAULT_MODEL` | Default model | `glm-image` |
| `ZAI_DEFAULT_SIZE` | Default image size | `1280x1280` |
| `ZAI_REQUEST_TIMEOUT` | Request timeout (ms) | `60000` |
| `ZAI_MAX_RETRIES` | Max retry attempts | `3` |
| `ZAI_RETRY_DELAY` | Initial retry delay (ms) | `1000` |
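
To make the defaults above concrete, here is a minimal sketch of how such environment variables could be resolved into a configuration object. It is not the package's own `loadConfig`; the names `ZaiConfigSketch`, `EnvMap`, `readInt`, and `loadConfigSketch` are hypothetical and exist only for illustration. Only the variable names and default values come from this README.

```typescript
// Hypothetical config shape; field names are illustrative, not the package's types.
interface ZaiConfigSketch {
  apiKey: string;
  baseUrl: string;
  defaultModel: string;
  defaultSize: string;
  requestTimeoutMs: number;
  maxRetries: number;
  retryDelayMs: number;
}

type EnvMap = Record<string, string | undefined>;

// Parse a non-negative integer, falling back to the documented default when unset.
function readInt(env: EnvMap, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

function loadConfigSketch(env: EnvMap): ZaiConfigSketch {
  const apiKey = env["ZAI_API_KEY"];
  if (!apiKey) throw new Error("ZAI_API_KEY is required");
  return {
    apiKey,
    baseUrl: env["ZAI_API_BASE_URL"] ?? "https://api.z.ai/api",
    defaultModel: env["ZAI_DEFAULT_MODEL"] ?? "glm-image",
    defaultSize: env["ZAI_DEFAULT_SIZE"] ?? "1280x1280",
    requestTimeoutMs: readInt(env, "ZAI_REQUEST_TIMEOUT", 60000),
    maxRetries: readInt(env, "ZAI_MAX_RETRIES", 3),
    retryDelayMs: readInt(env, "ZAI_RETRY_DELAY", 1000),
  };
}

// Usage: in Node.js you would typically pass process.env instead of a literal.
const sketchConfig = loadConfigSketch({
  ZAI_API_KEY: "your_api_key_here",
  ZAI_MAX_RETRIES: "5",
});
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the function easy to test.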

Usage

With Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "z-ai-image": {
      "command": "npx",
      "args": ["z-ai-image-mcp"],
      "env": {
        "ZAI_API_KEY": "your_api_key_here"
      }
    }
  }
}

With Other MCP Clients

Run the server directly:

npx z-ai-image-mcp

Or programmatically:

import { createServer, loadConfig } from 'z-ai-image-mcp';

const config = loadConfig();
const server = createServer(config);
// Connect to your transport...

With OpenCode

Add to your OpenCode configuration (opencode.json or opencode.jsonc in your project root):

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "z-ai-image": {
      "type": "local",
      "command": ["npx", "z-ai-image-mcp"],
      "enabled": true,
      "environment": {
        "ZAI_API_KEY": "your_api_key_here"
      }
    }
  }
}

Or using an environment variable reference:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "z-ai-image": {
      "type": "local",
      "command": ["npx", "z-ai-image-mcp"],
      "enabled": true,
      "environment": {
        "ZAI_API_KEY": "{env:ZAI_API_KEY}"
      }
    }
  }
}

Using with OpenCode prompts:

Generate a professional logo for a tech startup. use z-ai-image

Or add to your AGENTS.md:

When generating images, use the `z-ai-image` MCP server tools.

Per-agent configuration (optional):

To enable the MCP server only for specific agents:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "z-ai-image": {
      "type": "local",
      "command": ["npx", "z-ai-image-mcp"],
      "enabled": true,
      "environment": {
        "ZAI_API_KEY": "{env:ZAI_API_KEY}"
      }
    }
  },
  "tools": {
    "z-ai-image*": false
  },
  "agent": {
    "design-agent": {
      "tools": {
        "z-ai-image*": true
      }
    }
  }
}

Available Tools

1. `list_models`

List all available image generation models and their capabilities.

Use this tool to discover available models, their features, and recommended settings.

2. `generate_image`

Generate an image synchronously from a text prompt.

Parameters:

prompt (required): Text description of the image (max 4000 characters)
model (optional): glm-image or cogview-4-250304 (default: glm-image)
size (optional): Image dimensions, e.g., 1280x1280 (default: 1280x1280)
quality (optional): hd or standard (default: hd for GLM-Image)
user_id (optional): End user ID for abuse prevention (6-128 characters)

Example:

Generate an image of a cute kitten sitting on a windowsill with a sunset background.
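
Behind a natural-language prompt like the one above, an MCP client sends a JSON-RPC `tools/call` request. The sketch below shows what that request might look like for `generate_image`, with the documented limits checked first. The helper names `checkGenerateImageArgs` and `buildToolCall` are hypothetical; the server's own validation may differ in detail.

```typescript
// Arguments accepted by generate_image, as listed in the parameters above.
interface GenerateImageArgs {
  prompt: string;
  model?: "glm-image" | "cogview-4-250304";
  size?: string;
  quality?: "hd" | "standard";
  user_id?: string;
}

// Collect violations of the documented limits instead of failing on the first one.
function checkGenerateImageArgs(args: GenerateImageArgs): string[] {
  const problems: string[] = [];
  if (args.prompt.length === 0) problems.push("prompt is required");
  if (args.prompt.length > 4000) problems.push("prompt exceeds 4000 characters");
  if (
    args.user_id !== undefined &&
    (args.user_id.length < 6 || args.user_id.length > 128)
  ) {
    problems.push("user_id must be 6-128 characters");
  }
  return problems;
}

// Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
function buildToolCall(id: number, args: GenerateImageArgs) {
  const problems = checkGenerateImageArgs(args);
  if (problems.length > 0) throw new Error(problems.join("; "));
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "generate_image", arguments: args },
  };
}

// Usage: the kitten example expressed as an explicit tool call.
const kittenCall = buildToolCall(1, {
  prompt: "A cute kitten sitting on a windowsill with a sunset background",
  size: "1280x1280",
});
```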

3. `generate_image_async`

Start an asynchronous image generation task. Returns a task ID for polling.

Parameters:

prompt (required): Text description of the image
model (optional): Only glm-image supports async (default: glm-image)
size (optional): Image dimensions (default: 1280x1280)
quality (optional): Only hd supported for async (default: hd)
user_id (optional): End user ID for abuse prevention

Example:

Start async generation of a complex poster design.

4. `get_async_result`

Retrieve the result of an asynchronous image generation task.

Parameters:

task_id (required): The task ID from generate_image_async

Example:

Check the status of task ID "task-12345".

5. `download_image`

Download an image from a URL and return it as base64 or save to a file.

Parameters:

url (required): The URL of the image to download (e.g., from generate_image or get_async_result)
output (optional): base64 or file_output (default: base64)
file_output (optional): Absolute path to save the image file (required if output is file_output). Example: /path/to/image.png

Output Modes:

base64: Returns the image data directly as base64 (automatically falls back to saving a file when the image is larger than 1 MB)
file_output: Saves the image to disk at the specified path

Example:

Download the generated image and save it to /home/user/images/logo.png

Note: Z.AI image URLs expire after 30 days. Use this tool to download and store images permanently.
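
The output-mode rules for `download_image` can be summarized in a short sketch. It assumes the 1MB threshold means 1024 × 1024 bytes of raw image data and that an absolute path starts with `/` or a Windows drive letter. Neither detail is confirmed by this README, and `decideOutput` and `looksAbsolute` are hypothetical names.

```typescript
type OutputMode = "base64" | "file_output";

// Assumption: "1MB" is read as 1024 * 1024 bytes of raw image data.
const BASE64_LIMIT_BYTES = 1024 * 1024;

interface OutputDecision {
  mode: OutputMode;
  reason: string;
}

// Rough absolute-path check for POSIX ("/...") and Windows ("C:\...") paths.
function looksAbsolute(path: string): boolean {
  return path.startsWith("/") || /^[A-Za-z]:[\\/]/.test(path);
}

// Decide how the image is returned, given the requested mode and the image size.
function decideOutput(
  requested: OutputMode | undefined,
  imageBytes: number,
  filePath?: string,
): OutputDecision {
  const mode = requested ?? "base64"; // base64 is the documented default
  if (mode === "file_output") {
    if (filePath === undefined || !looksAbsolute(filePath)) {
      throw new Error("file_output requires an absolute path");
    }
    return { mode, reason: "file requested" };
  }
  if (imageBytes > BASE64_LIMIT_BYTES) {
    return { mode: "file_output", reason: "image larger than base64 limit" };
  }
  return { mode: "base64", reason: "image fits inline" };
}

// Usage: a 2 MB image requested as base64 falls back to a file.
const largeImageDecision = decideOutput("base64", 2 * 1024 * 1024);
```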

6. `generate_and_download_image` ⭐ Recommended

Generate an image and automatically download it in a single operation. This is the most convenient tool when you want the image data immediately.

Parameters:

prompt (required): Text description of the image (max 4000 characters)
model (optional): glm-image or cogview-4-250304 (default: glm-image)
size (optional): Image dimensions, e.g., 1280x1280 (default: 1280x1280)
quality (optional): hd or standard (default: hd for GLM-Image)
user_id (optional): End user ID for abuse prevention (6-128 characters)
output (optional): base64 or file_output (default: base64)
file_output (optional): Absolute path to save the image file (required if output is file_output)
poll_interval (optional): Seconds to wait between polling for async results (default: 3)
max_wait (optional): Maximum seconds to wait for generation (default: 120)

Output Modes:

base64: Returns the image data directly as base64 (automatically falls back to saving a file when the image is larger than 1 MB)
file_output: Saves the image to disk at the specified path

Examples:

# Generate and get as base64
Generate a logo for my company and show me the image.

# Generate and save to file
Generate a logo and save it to /home/user/images/logo.png

Behavior:

For GLM-Image: Uses async API with automatic polling until complete
For CogView-4: Uses synchronous API
Automatically downloads the result once generation completes
Returns image as base64 or saves to specified path
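
The polling behavior described above can be sketched as a generic loop driven by `poll_interval` and `max_wait`. This is an illustration only, not the server's implementation: the `TaskState` shape and its status names are hypothetical, and `check` stands in for whatever call retrieves the async result.

```typescript
// Hypothetical task states; the real API's status names may differ.
type TaskState<T> =
  | { status: "processing" }
  | { status: "success"; result: T }
  | { status: "fail"; message: string };

interface PollOptions {
  pollIntervalSec: number; // mirrors poll_interval (documented default: 3)
  maxWaitSec: number; // mirrors max_wait (documented default: 120)
}

const sleepMs = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call check() until it reports success or failure, or until max_wait would be exceeded.
async function pollUntilDone<T>(
  check: () => Promise<TaskState<T>>,
  options: PollOptions = { pollIntervalSec: 3, maxWaitSec: 120 },
): Promise<T> {
  const deadline = Date.now() + options.maxWaitSec * 1000;
  for (;;) {
    const state = await check();
    if (state.status === "success") return state.result;
    if (state.status === "fail") throw new Error(state.message);
    // Give up if the next wait would run past the deadline.
    if (Date.now() + options.pollIntervalSec * 1000 > deadline) {
      throw new Error(`Timed out after ${options.maxWaitSec}s`);
    }
    await sleepMs(options.pollIntervalSec * 1000);
  }
}

// Usage with a stand-in check that succeeds on the third call.
let fakeCheckCalls = 0;
const fakeCheck = async (): Promise<TaskState<string>> =>
  ++fakeCheckCalls < 3
    ? { status: "processing" }
    : { status: "success", result: "https://example.com/image.png" };
```

Checking the deadline before sleeping means the loop does not wait a full interval only to time out afterwards.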

Video Generation Tools

7. `list_video_models`

List all available video generation models and their capabilities.

Use this tool to discover available video models, their features, and supported parameters.

8. `generate_video`

Generate a video asynchronously from text or images. Returns a task ID for polling.

Parameters:

model (required): Video generation model
- cogvideox-3: Z.AI flagship model (up to 4K, 5-10s, audio support)
- viduq1-text: Text-to-video, 1080P, 5s
- viduq1-image: Image-to-video, 1080P, 5s
- viduq1-start-end: Start-end frame, 1080P, 5s
- vidu2-image: Image-to-video, 720P, 4s (faster, cheaper)
- vidu2-start-end: Start-end frame, 720P, 4s
- vidu2-reference: Reference-based, 720P, 4s
prompt (optional): Text description (max 512 characters)
image_url (optional): Image URL(s) for image-to-video generation
quality (CogVideoX-3 only): quality or speed
size (optional): Video resolution
duration (optional): Video duration in seconds
fps (CogVideoX-3 only): 30 or 60
with_audio (optional): Generate AI sound effects
style (viduq1-text only): general or anime
aspect_ratio (Vidu Q1 and Vidu 2 only): 16:9, 9:16, or 1:1
movement_amplitude (Vidu models only): auto, small, medium, or large
user_id (optional): End user ID for abuse prevention

Examples:

# Text-to-video
Generate a video of a cat playing with a ball.

# Image-to-video
Animate this image: [image_url]

# Start-end frame
Create a smooth transition from [first_frame] to [last_frame].
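
Several `generate_video` parameters apply only to certain models. The sketch below encodes those rules as a pre-flight check. The function name `checkVideoArgs` is hypothetical, the `image_url` type (a string or an array of strings) is a guess based on "Image URL(s)", and the rule that image-driven models need an `image_url` is an assumption rather than something this README states.

```typescript
type VideoModel =
  | "cogvideox-3"
  | "viduq1-text"
  | "viduq1-image"
  | "viduq1-start-end"
  | "vidu2-image"
  | "vidu2-start-end"
  | "vidu2-reference";

interface GenerateVideoArgs {
  model: VideoModel;
  prompt?: string;
  image_url?: string | string[]; // assumed shape for "Image URL(s)"
  quality?: "quality" | "speed";
  fps?: 30 | 60;
  style?: "general" | "anime";
  aspect_ratio?: "16:9" | "9:16" | "1:1";
  movement_amplitude?: "auto" | "small" | "medium" | "large";
}

// Return a list of problems; an empty list means the arguments look consistent.
function checkVideoArgs(args: GenerateVideoArgs): string[] {
  const problems: string[] = [];
  const isVidu = args.model.startsWith("vidu");

  if (args.prompt !== undefined && args.prompt.length > 512) {
    problems.push("prompt exceeds 512 characters");
  }
  if (args.model !== "cogvideox-3") {
    if (args.quality !== undefined) problems.push("quality applies to cogvideox-3 only");
    if (args.fps !== undefined) problems.push("fps applies to cogvideox-3 only");
  }
  if (args.style !== undefined && args.model !== "viduq1-text") {
    problems.push("style applies to viduq1-text only");
  }
  if (!isVidu) {
    if (args.aspect_ratio !== undefined) problems.push("aspect_ratio applies to Vidu models only");
    if (args.movement_amplitude !== undefined) problems.push("movement_amplitude applies to Vidu models only");
  }
  // Assumption: models driven by images need at least one image_url.
  const needsImage =
    args.model.endsWith("-image") ||
    args.model.endsWith("-start-end") ||
    args.model.endsWith("-reference");
  if (needsImage && args.image_url === undefined) {
    problems.push(`${args.model} requires image_url`);
  }
  return problems;
}

// Usage: a text-to-video request in anime style passes the check.
const textToVideoProblems = checkVideoArgs({
  model: "viduq1-text",
  prompt: "A cat playing with a ball",
  style: "anime",
});
```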

9. `get_video_result`

Retrieve the result of an asynchronous video generation task.

Parameters:

task_id (required): The task ID from generate_video

Note: Video generation typically takes 30 seconds to several minutes depending on duration and quality.

10. `generate_and_download_video` ⭐ Recommended

Generate a video and automatically download it. Polls for completion and saves the video file.

Parameters:

All parameters from generate_video plus:
file_output (optional): Absolute path to save the video file
poll_interval (optional): Seconds to wait between polling (default: 10)
max_wait (optional): Maximum seconds to wait (default: 300)

Example:

Generate a video of a sunset over the ocean and save it to /home/user/videos/sunset.mp4

Note: Videos are always saved to file (too large for base64). Video URLs expire after 1 day.

Models

GLM-Image

Z.AI's flagship image generation model with a hybrid autoregressive + diffusion architecture.

Best for: Complex compositions, text rendering, detailed illustrations, commercial posters
Quality options: hd (detailed, ~20s), standard (faster, ~5-10s)
Size range: 1024-2048px per dimension (divisible by 32)
Recommended sizes: 1280×1280, 1568×1056, 1056×1568, 1472×1088, 1088×1472, 1728×960, 960×1728
Async support: Yes

CogView-4-250304

General-purpose image generation with fast text understanding.

Best for: General image generation, quick iterations
Quality options: hd, standard
Size range: 512-2048px per dimension (divisible by 16)
Recommended sizes: 1024×1024, 768×1344, 864×1152, 1344×768, 1152×864, 1440×720, 720×1440
Async support: No
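
The size constraints for both image models can be checked with a short function. The sketch below is illustrative and uses the hypothetical names `SIZE_RULES` and `isValidSize`. Two of the recommended GLM-Image sizes (1728×960 and 960×1728) fall below the stated 1024px minimum, so the sketch accepts any recommended size explicitly and applies the range rule only to other sizes. Any further limits the API may impose are not modeled.

```typescript
interface SizeRule {
  min: number; // minimum pixels per dimension
  max: number; // maximum pixels per dimension
  step: number; // each dimension must be divisible by this
  recommended: string[];
}

// Values copied from the model descriptions above.
const SIZE_RULES: Record<string, SizeRule> = {
  "glm-image": {
    min: 1024,
    max: 2048,
    step: 32,
    recommended: [
      "1280x1280", "1568x1056", "1056x1568", "1472x1088",
      "1088x1472", "1728x960", "960x1728",
    ],
  },
  "cogview-4-250304": {
    min: 512,
    max: 2048,
    step: 16,
    recommended: [
      "1024x1024", "768x1344", "864x1152", "1344x768",
      "1152x864", "1440x720", "720x1440",
    ],
  },
};

// Accept recommended sizes as-is; otherwise enforce range and divisibility.
function isValidSize(model: string, size: string): boolean {
  const rule = SIZE_RULES[model];
  if (rule === undefined) return false;
  if (rule.recommended.indexOf(size) !== -1) return true;
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (match === null) return false;
  return [Number(match[1]), Number(match[2])].every(
    (px) => px >= rule.min && px <= rule.max && px % rule.step === 0,
  );
}

// Usage: a custom landscape size for GLM-Image.
const customSizeOk = isValidSize("glm-image", "2048x1024");
```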

Video Models

CogVideoX-3

Z.AI's flagship video generation model with improved frame stability and clarity.

Best for: Text-to-video, image-to-video, start-end frame animation
Resolution: Up to 4K (3840x2160)
Duration: 5 or 10 seconds
Features: Audio generation, 30/60 FPS, quality/speed modes
Price: $0.20/video

Vidu Q1

High-quality video generation with 1080P output.

| Model | Capability | Duration | Price |
| --- | --- | --- | --- |
| `viduq1-text` | Text-to-video | 5s | $0.40 |
| `viduq1-image` | Image-to-video | 5s | $0.40 |
| `viduq1-start-end` | Start-end frame | 5s | $0.40 |

Features: General/anime styles, motion amplitude control

Vidu 2

Fast and cost-effective video generation with 720P output.

| Model | Capability | Duration | Price |
| --- | --- | --- | --- |
| `vidu2-image` | Image-to-video | 4s | $0.20 |
| `vidu2-start-end` | Start-end frame | 4s | $0.20 |
| `vidu2-reference` | Reference-based | 4s | $0.40 |

Features: Audio generation, motion amplitude control, multi-image reference
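
As a rough budgeting aid, the per-video prices listed for the video models can be summed for a batch of jobs. The sketch below uses the prices exactly as given in this README, which may change over time. `PRICE_CENTS` and `estimateCostUsd` are hypothetical names. Working in cents avoids floating-point rounding surprises.

```typescript
// Per-video prices in US cents, copied from the video model listings above.
const PRICE_CENTS: Record<string, number> = {
  "cogvideox-3": 20,
  "viduq1-text": 40,
  "viduq1-image": 40,
  "viduq1-start-end": 40,
  "vidu2-image": 20,
  "vidu2-start-end": 20,
  "vidu2-reference": 40,
};

interface VideoJob {
  model: string;
  count: number; // number of videos to generate with this model
}

// Sum the cost of all jobs and format it as a dollar amount.
function estimateCostUsd(jobs: VideoJob[]): string {
  let totalCents = 0;
  for (const job of jobs) {
    const price = PRICE_CENTS[job.model];
    if (price === undefined) throw new Error(`Unknown model: ${job.model}`);
    totalCents += price * job.count;
  }
  return (totalCents / 100).toFixed(2);
}

// Usage: 3 x $0.20 + 2 x $0.40 = $1.40
const batchCost = estimateCostUsd([
  { model: "cogvideox-3", count: 3 },
  { model: "viduq1-text", count: 2 },
]);
```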

Error Handling

The server handles various error scenarios:

| Error Type | Description |
| --- | --- |
| `AUTH_ERROR` | Invalid or missing API key |
| `RATE_LIMIT` | Too many requests - will auto-retry |
| `VALIDATION_ERROR` | Invalid parameters |
| `SERVER_ERROR` | Z.AI server issues - will auto-retry |
| `NETWORK_ERROR` | Connection issues - will auto-retry |
| `TIMEOUT_ERROR` | Request timeout - will auto-retry |
| `CONTENT_FILTER` | Prompt blocked by content policy |
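
The retry behavior implied by the error types above can be sketched as a classification step plus an exponential backoff schedule. This is an illustration, not the server's actual code. The doubling factor is an assumption (the README only says "exponential backoff"), and `isRetryable`, `backoffDelayMs`, and `retrySchedule` are hypothetical names. The defaults mirror `ZAI_MAX_RETRIES` and `ZAI_RETRY_DELAY`.

```typescript
type ZaiErrorType =
  | "AUTH_ERROR"
  | "RATE_LIMIT"
  | "VALIDATION_ERROR"
  | "SERVER_ERROR"
  | "NETWORK_ERROR"
  | "TIMEOUT_ERROR"
  | "CONTENT_FILTER";

// Error types that the list above marks as auto-retried.
const RETRYABLE_TYPES: ReadonlyArray<ZaiErrorType> = [
  "RATE_LIMIT",
  "SERVER_ERROR",
  "NETWORK_ERROR",
  "TIMEOUT_ERROR",
];

function isRetryable(errorType: ZaiErrorType): boolean {
  return RETRYABLE_TYPES.indexOf(errorType) !== -1;
}

// Delay before retry number `attempt` (0-based): initial * 2^attempt.
// Assumption: the delay doubles on each attempt.
function backoffDelayMs(attempt: number, initialDelayMs = 1000): number {
  return initialDelayMs * Math.pow(2, attempt);
}

// Full list of delays for a given retry budget.
function retrySchedule(maxRetries = 3, initialDelayMs = 1000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(backoffDelayMs(attempt, initialDelayMs));
  }
  return delays;
}

// Usage: with the default settings the waits are 1000, 2000, and 4000 ms.
const defaultSchedule = retrySchedule();
```

Errors such as `AUTH_ERROR` or `CONTENT_FILTER` are not retried because repeating the same request would not change the outcome.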

Development

Setup

git clone <repo-url>
cd z_ai_image_gen_mcp
npm install
cp .env.example .env
# Edit .env with your API key

Scripts

npm run build        # Build TypeScript
npm run dev          # Run in development mode
npm test             # Run all tests
npm run test:unit    # Run unit tests only
npm run test:integration  # Run integration tests
npm run test:e2e     # Run E2E tests
npm run test:coverage    # Run tests with coverage
npm run typecheck    # Type check without emit

License

MIT
