omni-video-mcp
An MCP server that transforms LLM-enabled IDEs into professional video editors by pre-processing footage into text proxies, generating motion graphics via HTML/CSS, and orchestrating complex FFmpeg renders.
Omni-Video Studio MCP
The Omni-Video Studio MCP is an enterprise-grade, autonomous Model Context Protocol (MCP) server that empowers any LLM-enabled IDE (Cursor, Claude Code, Antigravity) to act as a professional video editor.
It evolves simple transcription-based editing into a deterministic, token-efficient, and pipeline-driven workflow, featuring agent-native motion graphics (Hyperframes), visual metadata proxies, and high-fidelity final renders.
🌟 Key Features
- Metadata Proxy Ingestion: Instead of streaming expensive video tokens to an LLM, this server pre-processes footage to extract a takes_packed.md (audio mapping) and a Visual Scene Graph. The agent edits using text proxies, cutting costs and accelerating reasoning.
- Hyperframes Engine: Forget complex Node.js dependencies (e.g., Remotion). The agent generates deterministic HTML/CSS motion graphics that are rendered to transparent video using Playwright.
- Advanced Rendering Pipeline: Built on FFmpeg filter graphs, the final render supports EDL (Edit Decision List) cuts, overlay rendering, subtitle burn-in, LUT color grading, and optional DeepFilterNet AI audio restoration.
- IDE-Agnostic: Because it adheres to the official MCP specification, it drops directly into Cursor, Antigravity, or Claude Desktop without custom plugins.
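To make the text-proxy idea concrete, here is a minimal sketch of how word-level timestamps could be packed into compact, timestamped phrase lines that an agent can read cheaply. The actual layout of takes_packed.md is not documented here, so the `Word` type, the `pack_words` function, the `[start-end] text` line format, and the 0.6 s silence threshold are all illustrative assumptions, not the server's real format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds
    end: float    # seconds


def _format(phrase: list[Word]) -> str:
    # Assumed line layout: "[start-end] phrase text"
    text = " ".join(w.text for w in phrase)
    return f"[{phrase[0].start:.2f}-{phrase[-1].end:.2f}] {text}"


def pack_words(words: list[Word], max_gap: float = 0.6) -> list[str]:
    """Group words into phrases, splitting at silences longer than max_gap."""
    lines: list[str] = []
    phrase: list[Word] = []
    for word in words:
        # A long pause since the previous word closes the current phrase.
        if phrase and word.start - phrase[-1].end > max_gap:
            lines.append(_format(phrase))
            phrase = []
        phrase.append(word)
    if phrase:
        lines.append(_format(phrase))
    return lines


words = [
    Word("Hello", 0.0, 0.4),
    Word("world", 0.45, 0.9),
    Word("Take", 2.0, 2.3),
    Word("two", 2.35, 2.6),
]
for line in pack_words(words):
    print(line)
```

Each phrase collapses to a single short line, so the agent reasons over a few dozen tokens per take rather than over raw video frames.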
📦 Installation
Prerequisites:
- Python 3.10+
- ffmpeg (must be installed on your system path)
- uv (recommended for dependency management)
# Clone the repository
git clone https://github.com/your-org/omni-video-mcp.git
cd omni-video-mcp
# Install dependencies
uv venv
source .venv/bin/activate
uv pip install -e .
# Install Playwright browsers (for Hyperframes)
playwright install chromium
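If the install misbehaves, a quick prerequisite check can help narrow things down. The sketch below is not part of the repository; the function names `missing_tools` and `python_ok` are hypothetical, and it only checks the prerequisites listed above (uv is recommended rather than strictly required).

```python
import shutil
import sys
from typing import Callable, Optional

# ffmpeg must be on the system path; uv is only recommended.
TOOLS = ("ffmpeg", "uv")


def missing_tools(
    tools: tuple[str, ...] = TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on the system path."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version: tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """True when the interpreter is Python 3.10 or newer."""
    return version >= (3, 10)


if __name__ == "__main__":
    problems = missing_tools()
    if not python_ok():
        problems.append("python>=3.10")
    if problems:
        print("missing: " + ", ".join(problems))
    else:
        print("all prerequisites found")
```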
🛠 Configuration
Add the server to your IDE's MCP settings file (e.g., ~/.gemini/antigravity/mcp_config.json, ~/.cursor/mcp.json, or Claude Desktop config):
{
  "mcpServers": {
    "omni-video-mcp": {
      "command": "uv",
      "args": [
        "run",
        "/path/to/omni-video-mcp/server.py"
      ],
      "env": {
        "ELEVENLABS_API_KEY": "your_api_key_here"
      }
    }
  }
}
Note: The ELEVENLABS_API_KEY is currently required for high-fidelity word-level transcription mapping during ingestion.
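If your settings file already lists other MCP servers, you can merge this entry in rather than editing the JSON by hand. The following is a minimal sketch under that assumption; `add_server` is a hypothetical helper, not something shipped with this project, and it expects the file to be either absent or valid JSON.

```python
import json
from pathlib import Path


def add_server(config_path: Path, server_script: str, api_key: str) -> dict:
    """Insert or replace the omni-video-mcp entry, keeping other servers intact."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["omni-video-mcp"] = {
        "command": "uv",
        "args": ["run", server_script],
        "env": {"ELEVENLABS_API_KEY": api_key},
    }
    # Create the parent directory for first-time setups, then write back.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_server(Path.home() / ".cursor" / "mcp.json", "/path/to/omni-video-mcp/server.py", "your_api_key_here")` would update the Cursor settings file mentioned above.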
🎬 How it Works (The Agent Pipeline)
When the agent uses this MCP server, it follows a 4-phase architecture:
- Phase 1: Ingestion (omni_video_ingest): The agent scans your raw .mp4/.mov files, extracting a packed markdown transcript and an initial Visual Scene Graph.
- Phase 2: Director's Cut (omni_video_preview): The agent uses the transcript to construct an EDL (Edit Decision List) of the best takes. Ambiguous cuts can be visually verified by generating filmstrip PNGs via the preview tool.
- Phase 3: VFX (omni_video_generate_vfx): The agent generates HTML/CSS motion graphics (lower thirds, b-roll layouts) and the server renders them deterministically into transparent .webm videos via Hyperframes.
- Phase 4: Sweetening & Render (omni_video_render): The agent passes the EDL, VFX timestamps, and render settings to the server, which builds a complex FFmpeg graph to concatenate the footage, grade it, restore the audio, and export a final master.
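As an illustration of the cut stage in Phase 4, the sketch below turns a list of EDL cuts into an FFmpeg filter_complex string using the standard trim, atrim, setpts, asetpts, and concat filters. It is not the server's actual implementation: the `Cut` type and the `build_filter_graph` and `build_command` names are hypothetical, it assumes a single source file with one video and one audio stream, and it leaves out overlays, subtitles, LUT grading, and audio restoration.

```python
from dataclasses import dataclass


@dataclass
class Cut:
    """One EDL entry: a time range to keep from the source (hypothetical)."""
    start: float  # seconds into the source
    end: float    # seconds into the source


def build_filter_graph(cuts: list[Cut]) -> str:
    """Trim each cut from input 0 and concatenate the pieces in order."""
    if not cuts:
        raise ValueError("EDL is empty")
    chains: list[str] = []
    labels: list[str] = []
    for i, cut in enumerate(cuts):
        if cut.end <= cut.start:
            raise ValueError(f"cut {i} has a non-positive duration")
        span = f"start={cut.start:.3f}:end={cut.end:.3f}"
        # Reset timestamps after trimming so each segment starts at zero.
        chains.append(f"[0:v]trim={span},setpts=PTS-STARTPTS[v{i}]")
        chains.append(f"[0:a]atrim={span},asetpts=PTS-STARTPTS[a{i}]")
        labels.append(f"[v{i}][a{i}]")
    chains.append(f"{''.join(labels)}concat=n={len(cuts)}:v=1:a=1[outv][outa]")
    return ";".join(chains)


def build_command(source: str, cuts: list[Cut], output: str) -> list[str]:
    """Assemble an argument list suitable for subprocess.run."""
    return [
        "ffmpeg", "-y", "-i", source,
        "-filter_complex", build_filter_graph(cuts),
        "-map", "[outv]", "-map", "[outa]",
        output,
    ]


edl = [Cut(0, 2.5), Cut(10, 12)]
print(" ".join(build_command("raw.mp4", edl, "master.mp4")))
```

Building the command as a list rather than a shell string avoids quoting problems with file names, and keeping the graph builder a pure function makes the EDL logic easy to test without invoking FFmpeg.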
🤝 Contributing
Contributions are welcome! If you're adding new render pipeline capabilities (like auto-tracking or local Whisper fallbacks), please open a PR. Record any new Python dependencies in pyproject.toml by running uv add <package>.
📄 License
MIT License