ai-slides-mcp

An MCP server (and CLI) that generates images through the ChatGPT web backend and assembles them into full-bleed PowerPoint decks, optionally brand-aware or styled after a reference image.

⚠️ Disclaimer

This project is provided for personal learning and research only. It reverse-engineers the ChatGPT web backend:

  • Not affiliated with, endorsed by, or sponsored by OpenAI.
  • Using it violates OpenAI's Terms of Service. Your account may be rate-limited or permanently banned.
  • Use a throwaway / secondary account — never your important one.

Provided as-is, with no warranty. You assume all risk. If you are not comfortable with these terms, do not use this software.

What it does

  • Generates images from text prompts at a chosen aspect ratio (16:9, 1:1, 3:4, 4:3, 9:16, or raw WxH).
  • Reasoning effort (thinking): optionally make the model reason longer before drawing (standard < extended < max). Higher levels markedly improve rendered text fidelity, e.g. Vietnamese diacritics.
  • Auto-enhances prompts via your ChatGPT account's text model before drawing (mirrors what the web UI does silently). Three slide styles: auto, slide, fintech.
  • Builds full-bleed PowerPoint (.pptx) decks from a set of images.
  • Branded decks: auto-detects brand colors from your logo, generates slides in those colors, and composites your logo onto every slide.
  • Styled decks: matches the design style and palette of a reference image (without copying its text/content).
  • Works as an MCP server across Claude Code, Codex, and Antigravity over stdio.
  • Single-account, fully local auth — your token never leaves your machine.

Requirements

  • uv — handles Python and dependencies (you do not need to install Python separately; uv fetches Python 3.12 automatically).
  • git
  • A ChatGPT account (use a secondary one — see disclaimer)

Install uv (one time per machine):

# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

Install

One-click installer (recommended)

After cloning, run the installer — it installs uv if missing, syncs deps, walks you through login, and registers the MCP server:

git clone https://github.com/vuhai2002/ai-slides-mcp.git
cd ai-slides-mcp
# Windows (PowerShell) — or right-click install.ps1 -> Run with PowerShell
powershell -ExecutionPolicy Bypass -File install.ps1
# macOS / Linux
bash install.sh

Manual install

git clone https://github.com/vuhai2002/ai-slides-mcp.git
cd ai-slides-mcp
uv sync
uv run cgimg login
# A browser opens -> log into ChatGPT -> you land on a platform.openai.com page (it may say "Oops").
# Copy the FULL callback URL from the address bar, then run:
uv run cgimg login --callback "<paste the callback URL here>"

If the repository you clone is private (for example, a private fork), the machine doing the clone must be signed in to a GitHub account that has access (e.g. via gh auth login).

Why login is two steps

Login uses OAuth with PKCE. Step 1 builds the authorization URL and stashes a one-time secret (the PKCE verifier) on disk. Step 2 exchanges the code in the callback URL for tokens, and that exchange needs the same verifier that step 1 generated. Because the two steps run as separate commands, the verifier has to persist on disk between building the URL and redeeming the code; otherwise it would be lost while you are in the browser.

Register as an MCP server

The config is the same shape for Claude Code, Codex, and Antigravity:

{
  "mcpServers": {
    "ai-slides": {
      "command": "uv",
      "args": ["run", "cgimg-mcp"],
      "cwd": "<absolute path to the cloned repo>"
    }
  }
}

Set cwd to the absolute path where you cloned the repo (e.g. D:\\ai-slides-mcp on Windows; backslashes must be doubled in JSON). For Claude Code you can instead run:

claude mcp add ai-slides -- uv run cgimg-mcp
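If hand-escaping the Windows path in the JSON config is error-prone, one option is to let a JSON serializer do it. The sketch below renders the same mcpServers entry for a given checkout path; mcp_config is a hypothetical helper, not part of this project.

```python
import json


def mcp_config(repo_path: str) -> str:
    """Render the mcpServers entry for a given checkout path (hypothetical helper)."""
    config = {
        "mcpServers": {
            "ai-slides": {
                "command": "uv",
                "args": ["run", "cgimg-mcp"],
                "cwd": repo_path,
            }
        }
    }
    # json.dumps escapes backslashes, so D:\ai-slides-mcp becomes "D:\\ai-slides-mcp".
    return json.dumps(config, indent=2)


# The Python literal below is the path D:\ai-slides-mcp.
print(mcp_config("D:\\ai-slides-mcp"))
```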

MCP tools

Six tools are exposed by the server (src/cgimg/server.py):

| Tool | Params | Returns | Description |
| --- | --- | --- | --- |
| login_status | (none) | {authed, ...} | Check whether a ChatGPT account is logged in. |
| generate_image | prompt, aspect="16:9", n=1, out_dir="out", enhance=True, style="auto", thinking="auto", brand_colors=None, reserve_corner=None | {paths} | Generate n image(s) from a prompt. Returns saved PNG paths. |
| build_pptx | image_paths, out_path="deck.pptx", aspect="16:9" | {path} | Assemble existing images into a full-bleed PPTX. |
| generate_slide_deck | prompts, aspect="16:9", out_pptx="deck.pptx", out_dir="out", enhance=True, style="slide", thinking="auto", brand_colors=None, reserve_corner=None | {path, image_paths} | Generate one image per prompt (one slide each, named s01.png...), then assemble into a PPTX. |
| branded_deck | logo_path, prompts, aspect="16:9", out_pptx="deck.pptx", out_dir="out", logo_position="top-left", logo_scale=0.15, thinking="auto" | {path, image_paths, brand_colors} | Auto-detects brand colors from the logo, generates slides in those colors, and composites the original logo onto each slide. |
| styled_deck | ref_image, prompts, aspect="16:9", out_pptx="deck.pptx", out_dir="out", thinking="auto" | {path, image_paths, brand_colors} | Matches a reference image's design style + colors (does not copy its text/content). |

For generate_slide_deck / branded_deck / styled_deck, each prompt is the content of one slide — pass raw slide content and the enhancer designs it.
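For orientation, this is roughly what a call to generate_slide_deck looks like on the wire. MCP clients normally build this JSON-RPC tools/call request for you; the sketch only illustrates the one-prompt-per-slide convention. The argument names come from the table above, while the slide text is placeholder content invented for the example.

```python
import json

# One entry per slide: raw content, not design instructions.
slides = [
    "What is RAG? Retrieval plus generation, and why it reduces hallucination",
    "RAG pipeline: ingest, chunk, embed, retrieve, generate",
    "Benefits: fresher answers, citations, lower fine-tuning cost",
]

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "generate_slide_deck",
        "arguments": {
            "prompts": slides,
            "aspect": "16:9",
            "out_pptx": "deck.pptx",
            "style": "slide",
            "thinking": "max",
        },
    },
}

print(json.dumps(request, indent=2, ensure_ascii=False))
```

Per the Returns column, a successful call should come back with a path and an image_paths list containing one image per prompt.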

Slide styles

When enhance=True, the prompt is expanded by your ChatGPT account's text model before drawing. The style argument picks the design treatment:

| Style | Look | Enhancement |
| --- | --- | --- |
| auto | General: a dense infographic, or a photographic scene if the prompt names a real scene. | Skipped if the prompt is already long (≥280 chars). |
| slide | Clean editorial presentation slide: light cream background, ONE warm accent color, a soft 3D hero visual, slide-number pill, bottom takeaway banner. | Always runs. |
| fintech | Premium light-blue dashboard look: glassmorphism cards, circular blue-gradient icon badges, optional 3D robot + chart widgets, bottom blue banner. | Always runs. |

Content completion (for slide / fintech): these styles expand your content into a full, information-rich slide. Every main point gets a bold label plus a two-line supporting description, sparse input is expanded into a sensible slide, and the prompt explicitly requires all text to be rendered in full (never dropped or abbreviated). The result stays legible (not a wall of tiny text), and the enhancer never fabricates statistics. See docs/styles.md for a full reference.

generate_image and the CLI gen accept style values auto, slide, and fintech. generate_slide_deck defaults to slide.
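The rules for when enhancement runs can be summarized in a few lines. This is a sketch of the behavior described in this section, not the project's actual code; should_enhance and AUTO_SKIP_LENGTH are hypothetical names.

```python
STYLES = {"auto", "slide", "fintech"}
AUTO_SKIP_LENGTH = 280  # auto skips enhancement for prompts of 280+ characters


def should_enhance(prompt: str, style: str = "auto", enhance: bool = True) -> bool:
    """Decide whether the text model should expand the prompt before drawing."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    if not enhance:
        return False  # enhance=False / --no-enhance sends the prompt as-is
    if style == "auto":
        return len(prompt) < AUTO_SKIP_LENGTH  # long prompts are assumed detailed enough
    return True  # slide and fintech always run the enhancer
```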

Reasoning effort (thinking)

ChatGPT's web UI has an "Intelligence" selector that makes the image model reason more before drawing. generate_image, the deck tools, and the CLI expose the same via thinking:

| thinking | Effect |
| --- | --- |
| auto (default) | Sends no preference; ChatGPT's own default applies (fast). |
| standard | Light reasoning. |
| extended | More reasoning. |
| max | Most reasoning; best for rendered text (e.g. Vietnamese diacritics), but slowest. |

Higher effort noticeably improves text fidelity inside the image. If generated slides garble Vietnamese diacritics at the default, try --thinking max (CLI) or thinking="max" (MCP). Values are passed to ChatGPT's image backend as thinking_effort; anything outside the set above is rejected by the backend.
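A minimal sketch of how a thinking value might map onto the request, assuming auto simply omits the field. The helper name is hypothetical, and the client-side validation is an addition for illustration: as described above, the backend is what rejects unknown values.

```python
THINKING_LEVELS = ("standard", "extended", "max")  # ordered from least to most effort


def thinking_fields(thinking: str = "auto") -> dict[str, str]:
    """Return the extra request fields for a thinking level (hypothetical helper)."""
    if thinking == "auto":
        return {}  # no preference sent; ChatGPT falls back to its own default
    if thinking not in THINKING_LEVELS:
        # Fail early instead of waiting for the backend to reject the request.
        raise ValueError(f"thinking must be auto or one of {THINKING_LEVELS}, got {thinking!r}")
    return {"thinking_effort": thinking}
```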

CLI usage

# 1. Log in (two-step, see Install above)
uv run cgimg login
uv run cgimg login --callback "<paste callback URL>"

# 2. Generate image(s)  (auto-enhance is ON by default; add --no-enhance to send the prompt as-is)
uv run cgimg gen "a serene mountain lake at dawn" --aspect 16:9 --n 1 --out out
uv run cgimg gen "AI agents for customer support" --style slide      # clean editorial slide
uv run cgimg gen "real-time fraud detection" --style fintech         # light-blue dashboard slide
uv run cgimg gen "ai agent" --aspect 1:1 --no-enhance                # send prompt verbatim
uv run cgimg gen "Trí tuệ nhân tạo cho doanh nghiệp" --thinking max  # max reasoning = best Vietnamese text
uv run cgimg gen "RAG pipeline" --style slide --accent "#10B981" --reserve-corner top-left  # brand color + clear corner for a logo

# 3. Generate a multi-slide deck (one image per prompt; slides saved s01.png, s02.png...)
uv run cgimg deck --prompts "Intro" "How it works" "Pricing" --style slide --out deck.pptx
uv run cgimg deck --prompts-file slides.txt --thinking max --accent "#10B981" --reserve-corner top-left
#   --prompts-file: one prompt per line (blank lines and lines starting with # are skipped)
#   --no-enhance + --style: applies the slide look via a concise offline template (good for dense Vietnamese text)

# 4. Build a deck from existing images
uv run cgimg ppt img1.png img2.png --out deck.pptx --aspect 16:9

# 5. Branded deck — slides in your brand colors with your logo composited on each
uv run cgimg branded logo.png \
  --prompts "What is RAG?" "RAG pipeline" "Benefits" \
  --out deck.pptx --position top-left --scale 0.15

# 6. Styled deck — match a reference image's design (its text/content is NOT copied)
uv run cgimg styled reference-slide.png \
  --prompts "Intro" "How it works" "Pricing" \
  --out deck.pptx

gen and deck accept --accent <hex> (forces a brand color) and --reserve-corner <pos> (keeps a corner clear for a logo you add later; the model draws no logo/text there). branded and styled also accept --aspect and --out-dir (default out). They print the deck path and each generated image path.
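The --prompts-file format (one prompt per line, blank lines and # lines skipped) is simple enough to generate or validate yourself. The parser below is a sketch of that format, not the CLI's own implementation; it assumes leading and trailing whitespace is ignored, which the source does not state.

```python
from pathlib import Path


def parse_prompts(text: str) -> list[str]:
    """Return one prompt per non-blank, non-comment line."""
    prompts = []
    for line in text.splitlines():
        stripped = line.strip()  # assumption: surrounding whitespace is not significant
        if not stripped or stripped.startswith("#"):
            continue
        prompts.append(stripped)
    return prompts


def read_prompts_file(path: str) -> list[str]:
    """Read a prompts file as UTF-8 so Vietnamese text survives intact."""
    return parse_prompts(Path(path).read_text(encoding="utf-8"))
```

Each returned entry becomes one slide, in file order.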

Aspect ratios

| Aspect | Size sent | ChatGPT returns |
| --- | --- | --- |
| 16:9 | 1920x1080 | ~1672x941 |
| 1:1 | 1024x1024 | 1024x1024 |
| 3:4 | 1024x1536 | ~1086x1448 |
| 9:16 | 1080x1920 | ~941x1672 |
| 4:3 | 1440x1080 | ~1448x1086 |
| WxH | as given | normalized by ChatGPT |

ChatGPT honors the ratio, not exact pixels — it normalizes to its own native dimensions (so 16:9 yields roughly 1672x941, not exactly 1920x1080). This is expected.

PPTX output is full-bleed: the image fills the slide edge to edge, so the image aspect should match the deck aspect to avoid cropping or letterboxing.
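Since returned pixel sizes are approximate, a ratio check with a small tolerance is one way to confirm that an image will fill a slide cleanly before building the deck. The function names and the 1% tolerance are assumptions chosen for this sketch.

```python
PRESETS = {"16:9": (16, 9), "1:1": (1, 1), "3:4": (3, 4), "4:3": (4, 3), "9:16": (9, 16)}


def target_ratio(aspect: str) -> float:
    """Convert a preset like '16:9' or a raw 'WxH' string into width / height."""
    if aspect in PRESETS:
        width, height = PRESETS[aspect]
    else:
        width_text, height_text = aspect.lower().split("x")
        width, height = int(width_text), int(height_text)
    return width / height


def matches_aspect(width: int, height: int, aspect: str, tolerance: float = 0.01) -> bool:
    """True when the image ratio is within the relative tolerance of the deck aspect."""
    target = target_ratio(aspect)
    return abs(width / height - target) / target <= tolerance
```

For example, matches_aspect(1672, 941, "16:9") holds even though the image is not 1920x1080, while a 1024x1024 image in a 16:9 deck would be flagged.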

How it works

  • Vendors chatgpt2api's proven OAuth + image backend (under src/cgimg/_vendor/).
  • Stores a single account token locally at %APPDATA%\cgimg\auth.json (Windows) or ~/.config/cgimg/auth.json (Linux/macOS). It is never committed.
  • Auto-refreshes the access token via the stored refresh_token when it expires.
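The token location described above can be resolved per platform along these lines. This is a sketch under the stated paths only; auth_path is a hypothetical helper, and the project's own lookup may handle more cases.

```python
import os
import sys
from pathlib import Path


def auth_path(platform: str = sys.platform, env=None, home=None) -> Path:
    """Return where auth.json is expected to live on the given platform."""
    env = os.environ if env is None else env
    if platform.startswith("win"):
        # Windows: %APPDATA%\cgimg\auth.json
        return Path(env["APPDATA"]) / "cgimg" / "auth.json"
    # Linux and macOS: ~/.config/cgimg/auth.json
    base = Path.home() if home is None else Path(home)
    return base / ".config" / "cgimg" / "auth.json"
```

Calling auth_path() with no arguments uses the current platform, which is handy for checking whether a login has been stored before starting the server.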

Examples

Showcase slides generated by this server live in examples/sample-slides/ — a mix of slide/fintech styles, dense multi-task slides, custom layouts, and tables.

Troubleshooting

| Symptom | Cause / Fix |
| --- | --- |
| not logged in | Run uv run cgimg login (two steps). |
| Token expired / auth errors | Re-run the login flow. |
| Generation is slow (~30–90s per image) | Normal: it polls ChatGPT until the image is ready. |
| Prompt rejected / blocked | ChatGPT's content moderation refused it. Adjust the prompt and retry; refusals are often transient (styled_deck auto-retries up to 3×). |
| Text in the image looks imperfect on a very dense slide | Image models can garble small text when a slide is packed. Reduce the content or split into more slides. |
| Image dims aren't exactly what you asked | Expected: ChatGPT honors the ratio and normalizes to its native size. |
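Because refusals are often transient, callers of the other tools may want the same kind of bounded retry that styled_deck applies. The wrapper below is a generic sketch: PromptRejected is a hypothetical exception standing in for however your caller detects a refusal, and the delay value is arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class PromptRejected(Exception):
    """Hypothetical error raised when moderation refuses a prompt."""


def with_retries(
    task: Callable[[], T],
    attempts: int = 3,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on PromptRejected up to `attempts` times in total."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except PromptRejected:
            if attempt == attempts:
                raise  # out of attempts: surface the last refusal to the caller
            sleep(delay)
    raise AssertionError("unreachable")
```

If the prompt is still refused after the final attempt, the error propagates, which is the point at which rewording the prompt is the better fix.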

Attribution & License

This project vendors code from:

  • chatgpt2api — Copyright (c) 2026 kunkun, MIT License. Powers the OAuth + image backend. Vendored under src/cgimg/_vendor/.

Vendored files retain their original behavior. See NOTICE for details.
