Enhanced Multimedia Analysis MCP
Enables AI agents to analyze images and videos, and generate optimized prompts for AI video generation systems.
A Model Context Protocol (MCP) server for professional multimedia content analysis and AI video generation prompt engineering
Overview
The Enhanced Multimedia Analysis MCP Server is a production-ready Model Context Protocol implementation that provides AI agents with sophisticated tools for analyzing visual content (images and videos) and generating optimized prompts for AI video generation systems.
Core Capabilities
- Systematic multi-dimensional content analysis via hotkey framework
- Professional prompt generation for AI video/image generators
- Platform-specific optimization (TikTok, Instagram, YouTube, Cinema)
- Character consistency tracking across scenes
- Four analysis depth levels (Quick, Standard, Deep, Comprehensive)
- Quick activation via the /aiv slash command
Key Benefits
- Reduces prompt engineering time from hours to minutes
- Improves prompt quality through systematic analysis
- Enables consistency across multiple generations
- Optimizes for platforms automatically
- Empowers AI agents with 100+ analysis dimensions
Quick Start
Installation
- Clone the repository:

      git clone https://github.com/yourusername/enhanced-multimedia-analysis-mcp.git
      cd enhanced-multimedia-analysis-mcp

- Install dependencies:

      pip install -r requirements.txt

- Install the /aiv command:

      ./scripts/install_aiv.sh

- Configure Claude Desktop by adding the server to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

      {
        "mcpServers": {
          "video-analysis": {
            "command": "python3",
            "args": ["/path/to/enhanced-multimedia-analysis-mcp/video_analysis_mcp.py"]
          }
        }
      }

- Restart Claude Desktop and test:

      /aiv sunset over mountains with dramatic clouds
Usage Examples
Basic Analysis

    /aiv A majestic eagle soaring over mountains at sunset

With Platform Optimization

    /aiv 30-second product video --platform Instagram --depth deep

Character-Focused Analysis

    /aiv Detective noir scene --focus character consistency, cinematography

With Custom Hotkeys

    /aiv Epic battle scene --hotkeys A1,C1,L1,E1 --format json
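The flags used in these examples could be parsed along the following lines. This is only a sketch built on Python's standard `argparse` and `shlex` modules, not the project's actual parser: the function names `split_csv`, `build_parser`, and `parse_aiv` are hypothetical, the allowed values are copied from the Available Options table in this README, and the defaults (`standard` depth, `markdown` format) are assumptions. With this particular sketch, a multi-word `--focus` value has to be quoted.

```python
import argparse
import shlex

# Allowed values copied from the Available Options table in this README.
DEPTHS = ("quick", "standard", "deep", "comprehensive")
PLATFORMS = ("TikTok", "Instagram", "YouTube", "Cinema")
FORMATS = ("markdown", "json")


def split_csv(value: str) -> list[str]:
    """Split a comma-separated option value and drop empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real /aiv command may parse its input differently.
    parser = argparse.ArgumentParser(prog="/aiv")
    parser.add_argument("description", nargs="+", help="free-text scene description")
    parser.add_argument("--depth", choices=DEPTHS, default="standard")  # assumed default
    parser.add_argument("--platform", choices=PLATFORMS)
    parser.add_argument("--focus", type=split_csv, default=[])
    parser.add_argument("--format", choices=FORMATS, default="markdown")  # assumed default
    parser.add_argument("--hotkeys", type=split_csv, default=[])
    parser.add_argument("--style")
    return parser


def parse_aiv(command: str) -> dict:
    """Turn an '/aiv ...' command line into a plain dictionary of options."""
    tokens = shlex.split(command)
    if tokens and tokens[0] == "/aiv":
        tokens = tokens[1:]
    options = vars(build_parser().parse_args(tokens))
    options["description"] = " ".join(options["description"])
    return options
```

For example, `parse_aiv('/aiv Detective noir scene --focus "character consistency, cinematography"')` yields a `focus` list with two entries, while the unquoted form would be rejected by this sketch because the extra words look like stray positional arguments.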
Available Options
| Option | Values | Purpose |
|---|---|---|
| --depth | quick / standard / deep / comprehensive | Analysis thoroughness |
| --platform | TikTok / Instagram / YouTube / Cinema | Platform optimization |
| --focus | comma-separated areas | Targeted analysis |
| --format | markdown / json | Output format |
| --hotkeys | comma-separated list | Custom hotkey selection |
| --style | "reference style" | Style reference |
Architecture

    ┌───────────────────────────────────────────────────────────────┐
    │                  Claude Desktop / MCP Client                  │
    │                     Slash Commands: /aiv                      │
    └───────────────────────────────┬───────────────────────────────┘
                                    │ JSON-RPC 2.0 over stdio
    ┌───────────────────────────────▼───────────────────────────────┐
    │                   Video Analysis MCP Server                   │
    │                    (video_analysis_mcp.py)                    │
    │                                                               │
    │   ┌───────────────────────────────────────────────────────┐   │
    │   │ 4 MCP Tools                                           │   │
    │   │ • video_analysis_analyze_image                        │   │
    │   │ • video_analysis_analyze_video                        │   │
    │   │ • video_analysis_analyze_multimedia                   │   │
    │   │ • video_analysis_get_hotkeys                          │   │
    │   └───────────────────────────┬───────────────────────────┘   │
    │                               │                               │
    │   ┌───────────────────────────▼───────────────────────────┐   │
    │   │ Analysis Engine (Hotkey-Based)                        │   │
    │   └───────────────────────────┬───────────────────────────┘   │
    │                               │                               │
    │   ┌───────────────────────────▼───────────────────────────┐   │
    │   │ Prompt Generator                                      │   │
    │   └───────────────────────────┬───────────────────────────┘   │
    │                               │                               │
    │   ┌───────────────────────────▼───────────────────────────┐   │
    │   │ Output Formatter (Markdown/JSON)                      │   │
    │   └───────────────────────────────────────────────────────┘   │
    └───────────────────────────────────────────────────────────────┘

Technical Stack
- Framework: MCP Python SDK (FastMCP)
- Validation: Pydantic v2 models
- Python Version: 3.10+
- Design Pattern: Tool-oriented, stateless
- Communication: JSON-RPC 2.0 over stdio
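To make the "JSON-RPC 2.0 over stdio" item concrete, the sketch below shows roughly what a client writes to the server's stdin when it calls a tool, and how a reply line could be unpacked. It uses only the standard library. The `tools/call` method and the `name`/`arguments` parameter shape follow the MCP specification; the helper names `make_tool_call` and `parse_response` are hypothetical, and the empty argument object passed to `video_analysis_get_hotkeys` is an assumption, since the tool's input schema is not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing request ids


def make_tool_call(tool_name: str, arguments: dict, request_id=None) -> str:
    """Serialize one MCP tools/call request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids) if request_id is None else request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


def parse_response(line: str) -> dict:
    """Return the result of a JSON-RPC 2.0 response line, raising on an error reply."""
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return message["result"]


if __name__ == "__main__":
    # Argument object is an assumption; the real tool may require fields.
    print(make_tool_call("video_analysis_get_hotkeys", {}), end="")
```

In practice an MCP client library handles this framing, along with the initialization handshake that precedes any tool call; the sketch is only meant to show what travels over the pipe.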
Documentation
- Master Specification - Complete system documentation
- /aiv Command Guide - Slash command usage and examples
- Installation Script - Located in scripts/install_aiv.sh
Configuration
Environment Variables
Configure the MCP server behavior using environment variables:

    # Output character limit
    export VIDEO_ANALYSIS_CHAR_LIMIT=25000
    # Debug logging (set to true to enable)
    export VIDEO_ANALYSIS_DEBUG=false
    # Enable caching (improves performance)
    export VIDEO_ANALYSIS_CACHE_ENABLED=true
    export VIDEO_ANALYSIS_CACHE_DIR=/tmp/video_analysis_cache
    export VIDEO_ANALYSIS_CACHE_TTL=3600
    # Set default analysis depth
    export VIDEO_ANALYSIS_DEFAULT_DEPTH=standard
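As an illustration of how a server might consume these variables, here is a minimal sketch that reads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical names, not part of the documented API. Using the values shown above as fallbacks is an assumption (the README does not state the real defaults), as is the set of strings treated as "true".

```python
import os
from dataclasses import dataclass

# Strings treated as "true"; this set is an assumption, not documented behavior.
TRUE_VALUES = {"1", "true", "yes", "on"}


def _env_bool(env, name: str, default: bool) -> bool:
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


@dataclass(frozen=True)
class Settings:
    # Fallbacks mirror the example values above; the real defaults may differ.
    char_limit: int = 25000
    debug: bool = False
    cache_enabled: bool = True
    cache_dir: str = "/tmp/video_analysis_cache"
    cache_ttl: int = 3600
    default_depth: str = "standard"


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    return Settings(
        char_limit=int(env.get("VIDEO_ANALYSIS_CHAR_LIMIT", 25000)),
        debug=_env_bool(env, "VIDEO_ANALYSIS_DEBUG", False),
        cache_enabled=_env_bool(env, "VIDEO_ANALYSIS_CACHE_ENABLED", True),
        cache_dir=env.get("VIDEO_ANALYSIS_CACHE_DIR", "/tmp/video_analysis_cache"),
        cache_ttl=int(env.get("VIDEO_ANALYSIS_CACHE_TTL", 3600)),
        default_depth=env.get("VIDEO_ANALYSIS_DEFAULT_DEPTH", "standard"),
    )
```

Passing the mapping explicitly, rather than always reading `os.environ`, keeps the function easy to exercise in tests.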
Claude Desktop Configuration
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

    {
      "mcpServers": {
        "video-analysis": {
          "command": "python3",
          "args": ["/path/to/video_analysis_mcp.py"],
          "env": {
            "VIDEO_ANALYSIS_CHAR_LIMIT": "25000",
            "VIDEO_ANALYSIS_CACHE_ENABLED": "true",
            "VIDEO_ANALYSIS_CACHE_DIR": "/tmp/video_analysis_cache"
          }
        }
      }
    }

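If the configuration file already lists other servers, hand-editing it risks dropping or corrupting those entries. The sketch below shows one way to merge the `video-analysis` entry into an existing file with the standard library. The helper name `add_server` is hypothetical and this is not part of the project's install script; it assumes the file, if present, contains a JSON object.

```python
import json
from pathlib import Path


def add_server(config_path, server_script, name="video-analysis", env=None) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other entries."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    entry = {"command": "python3", "args": [str(server_script)]}
    if env:
        # Values in the "env" block are strings in the example configuration above.
        entry["env"] = {key: str(value) for key, value in env.items()}

    config.setdefault("mcpServers", {})[name] = entry

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

A call such as `add_server(config_file, "/path/to/video_analysis_mcp.py", env={"VIDEO_ANALYSIS_CHAR_LIMIT": 25000})` would then leave any other configured servers untouched.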
Features
Analysis Framework
The system uses a hotkey-based analysis framework with 100+ dimensions organized into series, including:
- A-Series: Aesthetic & Visual Style (A1-A13)
- S-Series: Story & Narrative (S1-S12)
- C-Series: Character & Subject (C1-C12)
- K-Series: Cinematography (K1-K13)
- P-Series: Platform Optimization (P1-P10)
- E-Series: Execution & Technical (E1-E12)
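A hotkey identifier is a series letter followed by an index, so a `--hotkeys` list can be checked against the ranges above before any analysis runs. The following is a sketch under stated assumptions: the names `SERIES_SIZES`, `is_known_hotkey`, and `split_hotkeys` are hypothetical, and the table covers only the six series listed here, which together account for 72 identifiers. Identifiers from any other series (such as the `L1` that appears in one usage example) are reported as unknown by this sketch rather than treated as errors.

```python
import re

# Highest index per series, taken from the list above (A1-A13, S1-S12, ...).
SERIES_SIZES = {"A": 13, "S": 12, "C": 12, "K": 13, "P": 10, "E": 12}

# One uppercase series letter followed by a positive index without leading zeros.
_HOTKEY_PATTERN = re.compile(r"([A-Z])([1-9][0-9]*)")


def is_known_hotkey(hotkey: str) -> bool:
    """Return True if the hotkey falls inside one of the series listed above."""
    match = _HOTKEY_PATTERN.fullmatch(hotkey.strip().upper())
    if match is None:
        return False
    series, index = match.group(1), int(match.group(2))
    return index <= SERIES_SIZES.get(series, 0)


def split_hotkeys(raw: str) -> tuple[list[str], list[str]]:
    """Split a comma-separated hotkey string into (known, unknown) lists."""
    known, unknown = [], []
    for item in raw.split(","):
        item = item.strip().upper()
        if not item:
            continue
        (known if is_known_hotkey(item) else unknown).append(item)
    return known, unknown
```

Returning the unknown identifiers separately lets the caller decide whether to warn, ignore, or fail.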
Analysis Depths
| Depth | Hotkeys | Use Case | Time |
|---|---|---|---|
| Quick | 4-6 | Fast iterations | 2-5s |
| Standard | 8-12 | Balanced analysis | 5-10s |
| Deep | 15-25 | Detailed work | 10-20s |
| Comprehensive | 30-50 | Production-ready | 20-30s |
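The hotkey counts in this table can be read as a budget per depth level. A minimal sketch of that mapping follows; the names `DEPTH_HOTKEY_RANGE`, `fits_depth`, and `shallowest_depth_for` are hypothetical, and rounding a count that falls between two ranges (for example 7 or 13) up to the next depth is an assumption about how such gaps would be handled.

```python
from typing import Optional

# (minimum, maximum) hotkeys per depth, copied from the table above.
# Insertion order runs from least to most thorough.
DEPTH_HOTKEY_RANGE = {
    "quick": (4, 6),
    "standard": (8, 12),
    "deep": (15, 25),
    "comprehensive": (30, 50),
}


def fits_depth(depth: str, hotkey_count: int) -> bool:
    """Check whether a hotkey count lies within the range listed for a depth."""
    try:
        low, high = DEPTH_HOTKEY_RANGE[depth.lower()]
    except KeyError:
        raise ValueError(f"unknown depth: {depth!r}") from None
    return low <= hotkey_count <= high


def shallowest_depth_for(hotkey_count: int) -> Optional[str]:
    """Return the least thorough depth whose maximum covers the count, if any."""
    for depth, (_, high) in DEPTH_HOTKEY_RANGE.items():
        if hotkey_count <= high:
            return depth
    return None  # more hotkeys than even the comprehensive level lists
```

Under these assumptions, a custom selection of four hotkeys fits the quick level, while a selection of 20 would call for a deep analysis.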
Platform Optimizations
- TikTok: Vertical format, hook-first, trending sounds
- Instagram: Aesthetic-first, grid-aware, story integration
- YouTube: Thumbnail optimization, retention focus, SEO
- Cinema: Cinematic language, aspect ratios, theatrical quality
Deployment
Docker

    docker build -t video-analysis-mcp:1.1.0 .
    docker run -d --name video-analysis-mcp video-analysis-mcp:1.1.0

Note that the server communicates over stdio, so a container started detached with -d has no client attached to its stdin. When an MCP client should talk to the containerized server, the client typically launches the container itself with stdin kept open, for example with docker run -i --rm video-analysis-mcp:1.1.0 as the configured command.
Systemd Service
See docs/MASTER_SPECIFICATION.md for complete deployment instructions including:
- Systemd service configuration
- Kubernetes deployment
- Docker Compose setup
- Monitoring & observability
Testing
Run comprehensive tests:

    python3 -m pytest tests/

Test individual tools:

    # Test image analysis
    python3 -c "from video_analysis_mcp import test_image_analysis; test_image_analysis()"
    # Test video analysis
    python3 -c "from video_analysis_mcp import test_video_analysis; test_video_analysis()"

Performance
With Caching Enabled
| Scenario | No Cache | With Cache | Improvement |
|---|---|---|---|
| Standard Analysis | 5.2s | 0.08s | 98.5% faster |
| Deep Analysis | 12.5s | 0.09s | 99.3% faster |
| Quick Analysis | 2.3s | 0.06s | 97.4% faster |
| Comprehensive | 25.8s | 0.11s | 99.6% faster |
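Speedups of this size are what one would expect when a repeated request is answered from disk instead of being recomputed. The sketch below shows one possible shape for such a cache: a file per request, keyed by a hash of the tool name and arguments, with a time-to-live check. It is not the project's implementation. The function names are hypothetical, and treating VIDEO_ANALYSIS_CACHE_TTL as a number of seconds is an assumption.

```python
import hashlib
import json
import time
from pathlib import Path


def cache_key(tool: str, arguments: dict) -> str:
    """Derive a stable key; sort_keys makes it independent of argument order."""
    payload = json.dumps({"tool": tool, "arguments": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_put(cache_dir, key: str, value, now=None) -> None:
    """Store a JSON-serializable value together with the time it was stored."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    entry = {"stored_at": time.time() if now is None else now, "value": value}
    (directory / f"{key}.json").write_text(json.dumps(entry), encoding="utf-8")


def cache_get(cache_dir, key: str, ttl_seconds: float, now=None):
    """Return the cached value, or None if it is missing, unreadable, or expired."""
    path = Path(cache_dir) / f"{key}.json"
    try:
        entry = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    now = time.time() if now is None else now
    if now - entry["stored_at"] > ttl_seconds:
        return None
    return entry["value"]
```

The optional `now` parameter exists only so that expiry can be tested without waiting; a caller would normally omit it.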
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Development Setup
- Clone the repository
- Install development dependencies: pip install -r requirements-dev.txt
- Run tests: pytest tests/
- Follow the code style guide (PEP 8)
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built on the Model Context Protocol by Anthropic
- Uses FastMCP for MCP server implementation
- Inspired by professional video production workflows
Support
- Issues: GitHub Issues
- Documentation: docs/MASTER_SPECIFICATION.md
- Discussions: GitHub Discussions
Roadmap
- [ ] Real-time video file analysis
- [ ] Integration with popular AI video generators
- [ ] Web interface for prompt generation
- [ ] Batch processing capabilities
- [ ] Advanced caching strategies
- [ ] Multi-language support
Made with ❤️ for the AI video generation community
Version 1.1.0 | Changelog | Documentation