mcp-litmedia

Exposes litmedia.ai text-to-image and image-to-video generation tools via MCP, enabling AI agents to generate images and videos directly from prompts.

An MCP server that exposes litmedia.ai text-to-image and image-to-video generation tools to MCP-compatible clients (Claude Code, Claude Desktop, Cursor, etc.). Generate images and videos directly from your terminal or AI agent.

Disclaimer: this project talks to litmedia's private web app endpoints (not their official public API). It works today but may break if litmedia changes its request signing or model identifiers. For production / commercial usage, prefer their official skill API (Bearer token auth, stable contract).

Features

  • generate_image — text-to-image generation, returns public URL.
  • generate_video — image-to-video (URL or local path), returns public .mp4 URL. Supports modern models (Seedance 2.0, Kling V3, etc.) with their extended payload sent automatically.
  • get_user_info — remaining credits, VIP status, allowed model parameters.

Local image paths are uploaded automatically through litmedia's Aliyun OSS bucket (STS credentials fetched on demand). Remote URLs (including images you've just generated with generate_image) are passed through directly.

Requirements

  • Bun ≥ 1.0
  • An active litmedia.ai account with credits
  • An MCP-compatible client (this README uses Claude Code as the example)

Install

git clone https://github.com/Para-FR/mcp-litmedia.git
cd mcp-litmedia
bun install

Get your token and fingerprint

The server needs two values pulled from your browser after logging into litmedia. Both are read from the litmedia web app's localStorage.

  1. Open https://www.litmedia.ai/ and log in.
  2. Open DevTools (Cmd+Option+I on macOS) → Application → Local Storage → https://www.litmedia.ai.
  3. Find the key lit_video_pro$_token. The value is a JSON blob — copy the value field. This is your LITMEDIA_TOKEN.
  4. Find the key lit_video_pro$_fingerprint. Same thing — copy the value field. This is your LITMEDIA_FINGERPRINT.

The token is bound to your browser session and expires when you log out. Re-capture it whenever generation calls return 401.
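
If you prefer to script the extraction, here is a minimal sketch. It assumes the stored entry parses as a JSON object with a string `value` field, as described in the steps above. The helper name `extractStoredValue` is hypothetical and not part of this project.

```typescript
// Hypothetical helper: not part of mcp-litmedia.
// Assumes the localStorage entry is a JSON object with a string `value` field.
function extractStoredValue(raw: string): string {
  const parsed: unknown = JSON.parse(raw);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof (parsed as { value?: unknown }).value !== "string"
  ) {
    throw new Error("Expected a JSON object with a string `value` field");
  }
  return (parsed as { value: string }).value;
}

// Example with a made-up blob; the extra field is illustrative only.
const exampleBlob = JSON.stringify({ value: "abc123", expire: null });
console.log(extractStoredValue(exampleBlob)); // prints "abc123"
```

Paste the raw string copied from DevTools as the argument; the returned string is what goes into LITMEDIA_TOKEN or LITMEDIA_FINGERPRINT.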

Configure your MCP client

Claude Code

claude mcp add-json -s user litmedia '{
  "command": "bun",
  "args": ["run", "/absolute/path/to/mcp-litmedia/src/server.ts"],
  "env": {
    "LITMEDIA_TOKEN": "<your_token>",
    "LITMEDIA_FINGERPRINT": "<your_fingerprint>"
  }
}'

Verify with claude mcp list — you should see litmedia: ... ✓ Connected.

Other MCP clients (Claude Desktop, Cursor, etc.)

Add the equivalent JSON entry to your client's MCP configuration file:

{
  "mcpServers": {
    "litmedia": {
      "command": "bun",
      "args": ["run", "/absolute/path/to/mcp-litmedia/src/server.ts"],
      "env": {
        "LITMEDIA_TOKEN": "<your_token>",
        "LITMEDIA_FINGERPRINT": "<your_fingerprint>"
      }
    }
  }
}

Restart the client. The three tools (generate_image, generate_video, get_user_info) should appear in the available tools.
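
The same entry can be generated rather than hand-edited. The sketch below is only an illustration: `buildLitmediaEntry` is a hypothetical helper, and it assumes your client accepts the `mcpServers` layout shown above.

```typescript
import { resolve } from "node:path";

// Shape of one entry under "mcpServers", mirroring the JSON above.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Hypothetical helper: builds the litmedia entry for a given checkout directory.
function buildLitmediaEntry(repoDir: string, token: string, fingerprint: string): McpServerEntry {
  return {
    command: "bun",
    // resolve() turns a relative checkout path into the absolute path the config needs.
    args: ["run", resolve(repoDir, "src/server.ts")],
    env: { LITMEDIA_TOKEN: token, LITMEDIA_FINGERPRINT: fingerprint },
  };
}

const config = {
  mcpServers: { litmedia: buildLitmediaEntry(".", "<your_token>", "<your_fingerprint>") },
};
console.log(JSON.stringify(config, null, 2));
```

Run it from the repository root and paste the printed JSON into your client's configuration file.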

Tools

generate_image

Param    Type                     Required  Default  Notes
prompt   string                   yes       -        What to generate
ratio    "16:9" | "1:1" | "9:16"  no        "16:9"   Aspect ratio
model    number                   no        14       Image model ID
quality  string                   no        "1k"     Resolution tag
num      number (1-4)             no        1        Images per call

Returns one or more public CDN URLs. Cost: ~10 credits per image (depends on model and quality).
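
As a rough illustration of how the documented defaults combine, here is a sketch of the argument shape with defaults filled in. The type and function names are hypothetical; the actual validation in src/server.ts may differ.

```typescript
type ImageRatio = "16:9" | "1:1" | "9:16";

interface GenerateImageArgs {
  prompt: string;
  ratio?: ImageRatio;
  model?: number;
  quality?: string;
  num?: number;
}

// Hypothetical helper: fills in the documented defaults and checks the 1-4 range for `num`.
function withImageDefaults(args: GenerateImageArgs): Required<GenerateImageArgs> {
  const num = args.num ?? 1;
  if (!Number.isInteger(num) || num < 1 || num > 4) {
    throw new RangeError("num must be an integer between 1 and 4");
  }
  return {
    prompt: args.prompt,
    ratio: args.ratio ?? "16:9",
    model: args.model ?? 14,
    quality: args.quality ?? "1k",
    num,
  };
}

console.log(withImageDefaults({ prompt: "a fox in a library" }));
```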

generate_video

Param            Type                              Required  Default                        Notes
image            string                            yes       -                              HTTPS URL or local file path (png/jpg/jpeg/webp)
prompt           string                            yes       -                              Motion / animation description
duration         number                            no        5                              Seconds
model            number                            no        1                              Video model ID (modern IDs listed below the table)
quality          "360p" | "480p" | "720p" | "std"  no        "360p" legacy / "480p" modern  Resolution
ratio            "16:9" | "1:1" | "9:16"           no        empty legacy / "16:9" modern   Aspect ratio
negative_prompt  string                            no        ""                             What to avoid
sound_effect     boolean                           no        true                           Generate ambient audio
seed             number                            no        random                         Reproducibility seed

Modern models with extended payload: Seedance 2.0 = 41, Seedance 2.0 Fast = 42, Kling 2.5 Turbo = 29, Kling O1 = 32, Kling 2.6 = 33, Kling V3 = 43, Kling O3 = 44.

Returns a public .mp4 URL. Cost: ~20 credits for a 5s clip; higher for longer durations or modern models. Generation typically takes 30s–3min, occasionally longer for 15s 720p output.

For the full mapping of model display names → numeric IDs (Seedance, Kling, Veo, Sora, Wan, Vidu, Minimax, LitAI families), see the public litmedia-ai/skill repository.
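
The legacy/modern split in the defaults can be sketched as follows. This is an illustration only: it treats just the model IDs named in this README as "modern" (the real list may be longer), and `withVideoDefaults` is a hypothetical name.

```typescript
// Model IDs this README lists as "modern"; the real server may recognise more.
const MODERN_VIDEO_MODELS = new Set([29, 32, 33, 41, 42, 43, 44]);

interface GenerateVideoArgs {
  image: string;
  prompt: string;
  duration?: number;
  model?: number;
  quality?: "360p" | "480p" | "720p" | "std";
  ratio?: "16:9" | "1:1" | "9:16";
  negative_prompt?: string;
  sound_effect?: boolean;
  seed?: number;
}

// Hypothetical helper: applies the documented defaults, which depend on the model family.
function withVideoDefaults(args: GenerateVideoArgs) {
  const model = args.model ?? 1;
  const modern = MODERN_VIDEO_MODELS.has(model);
  return {
    image: args.image,
    prompt: args.prompt,
    duration: args.duration ?? 5,
    model,
    quality: args.quality ?? (modern ? "480p" : "360p"),
    ratio: args.ratio ?? (modern ? "16:9" : ""),
    negative_prompt: args.negative_prompt ?? "",
    sound_effect: args.sound_effect ?? true,
    seed: args.seed, // left undefined here; the documented default is a random seed
  };
}

console.log(withVideoDefaults({ image: "https://example.com/fox.png", prompt: "slow dolly-in", model: 41 }));
```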

get_user_info

No arguments. Returns:

  • vip_times / vip_total_times — remaining vs total credits
  • is_vip — VIP flag
  • allowed_image_model_parameters — image model combos available on your account
  • (also exposes video model combos in the underlying response)

Useful to check quota and discover which model/duration/quality combos your account can actually use.
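
For a quick budget check, the remaining credits can be divided by the approximate per-call costs quoted above. The sketch assumes `vip_times` and `vip_total_times` are plain numbers; the interface and function names are hypothetical and the real response carries more fields.

```typescript
// Assumed shape: only two of the fields named above, typed as numbers.
interface UserInfoSummary {
  vip_times: number; // remaining credits
  vip_total_times: number; // total credits
}

// Hypothetical helper: how many calls of a given approximate cost the remaining credits cover.
function affordableCalls(info: UserInfoSummary, creditsPerCall: number): number {
  if (creditsPerCall <= 0) throw new RangeError("creditsPerCall must be positive");
  return Math.floor(info.vip_times / creditsPerCall);
}

const sampleInfo: UserInfoSummary = { vip_times: 130, vip_total_times: 500 };
console.log(affordableCalls(sampleInfo, 10)); // prints 13 (images at ~10 credits each)
console.log(affordableCalls(sampleInfo, 20)); // prints 6 (5s clips at ~20 credits each)
```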

Usage examples

Once installed, just ask your MCP client naturally:

"Generate a cinematic image of a fox in a library, 16:9"

(client uses generate_image)

"Now animate that image: slow camera dolly-in, candle flames flicker"

(client passes the URL of the previous result to generate_video)

"Use Seedance 2.0 in 720p for a 15-second version"

(client calls generate_video with model: 41, quality: "720p", duration: 15)

"How many credits do I have left?"

(client uses get_user_info)

Tests

# Unit tests (signing golden cases)
bun test

# Live image generation (consumes credits)
LITMEDIA_TOKEN=... LITMEDIA_FINGERPRINT=... bun run test/integration.manual.ts

# Live video generation (consumes credits)
LITMEDIA_TOKEN=... LITMEDIA_FINGERPRINT=... \
  bun run test/integration.video.manual.ts \
  "https://app-images.litmedia.ai/.../your_image.png" \
  "the subject slowly moves, cinematic lighting"

How it works

The server speaks the same HTTP(S) protocol as the litmedia web app:

  1. Reads LITMEDIA_TOKEN and LITMEDIA_FINGERPRINT from the environment.
  2. Signs each request with two derived values: sign (SHA1 over token + timestamp + "member_sign") and signature (MD5 of SHA1 over timestamp + randomStr + salt).
  3. Sends the call to litvideo-api.litmedia.ai, polls the status endpoint until the asset URL is returned.
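
The two derived values in step 2 can be sketched with Node's built-in crypto module, which Bun also provides. This is not the code in src/signing.ts. Plain string concatenation, hashing the SHA1 hex digest rather than its raw bytes, the timestamp unit, the nonce format, and the salt value are all assumptions.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hex digest of a UTF-8 string with the given algorithm.
const hex = (algo: "sha1" | "md5", input: string): string =>
  createHash(algo).update(input).digest("hex");

// `sign`: SHA1 over token + timestamp + "member_sign" (plain concatenation is an assumption).
function makeSign(token: string, timestamp: string): string {
  return hex("sha1", token + timestamp + "member_sign");
}

// `signature`: MD5 of the SHA1 hex digest over timestamp + randomStr + salt.
// Hashing the hex string (rather than raw bytes) is an assumption; so is the salt value.
function makeSignature(timestamp: string, randomStr: string, salt: string): string {
  return hex("md5", hex("sha1", timestamp + randomStr + salt));
}

const timestamp = Math.floor(Date.now() / 1000).toString(); // seconds vs. milliseconds is a guess
const randomStr = randomBytes(8).toString("hex"); // nonce format is a guess
console.log(makeSign("<your_token>", timestamp));
console.log(makeSignature(timestamp, randomStr, "<salt>"));
```

Both functions are pure, which is what makes the "signing golden cases" style of unit test possible: fixed inputs always produce the same digests.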

For image-to-video with local files, the server fetches short-lived STS credentials from litmedia's /lit-video/get-sls endpoint, then uploads via ali-oss (multipart, idempotent content-hashed object key) to the litmedia OSS bucket. The resulting CDN URL feeds into the video generation request.
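
A content-hashed object key is what makes repeated uploads of the same file idempotent: identical bytes map to the same key. A minimal sketch of the idea follows. The SHA-256 choice, the `uploads` prefix, and the key layout are assumptions, not what src/uploader.ts necessarily does.

```typescript
import { createHash } from "node:crypto";
import { extname } from "node:path";

// Hypothetical key layout: "<prefix>/<sha256 of file bytes><lowercased original extension>".
function contentHashedKey(bytes: Uint8Array, fileName: string, prefix = "uploads"): string {
  const digest = createHash("sha256").update(bytes).digest("hex");
  return `${prefix}/${digest}${extname(fileName).toLowerCase()}`;
}

const sampleBytes = new TextEncoder().encode("fake image bytes");
console.log(contentHashedKey(sampleBytes, "fox.PNG"));
console.log(contentHashedKey(sampleBytes, "copy-of-fox.png")); // same key: same content, same extension
```

Because the key depends only on the content and extension, re-uploading an unchanged file overwrites the same object instead of creating a new one.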

Architecture

src/
├── signing.ts   – SHA1 + MD5 + nonce generator
├── client.ts    – HTTP client, types, polling, orchestrators
├── uploader.ts  – ali-oss wrapper with STS-based multipart upload
└── server.ts    – MCP server (stdio transport), tool registration & dispatch

Four modules, one responsibility each. Tests live in test/.

Limitations & disclaimers

  • Unofficial integration: this project mimics the web app's request format. litmedia can change their signing or model IDs at any time. If everything 401s or rejects with signature mismatch, that's likely what happened.
  • Token TTL: the token is your browser session token. It expires; re-capture it when calls start failing with 401.
  • Scope: text-to-image and image-to-video only. No text-to-video, lipsync, voice, history, or auto-download of finished assets. The returned URL can be downloaded with any HTTP client.
  • Allowed model combos vary by account tier. If generate_video errors with a "combo not allowed" type message, check get_user_info (allowed_model_parameters) to see what's available on your plan.
  • Cost transparency: each generate_image and generate_video call consumes account credits. The integration test scripts spend real credits.
  • Personal / educational use: use this on your own account that you've legitimately paid for. Don't use it for high-volume scraping or commercial redistribution.

For production use cases, switch to the official Bearer-token API.

License

MIT — see the LICENSE file.

Contributing

PRs welcome, especially:

  • Additional tool exposure (text-to-video, lipsync, voice)
  • Better model discovery (auto-introspection of get_user_info for combos)
  • Tests that don't require live credits
