Youtube Vision MCP

An MCP (Model Context Protocol) server that uses the Google Gemini Vision API to work with YouTube videos. It lets users get descriptions, summaries, and answers to questions about a video, and extract its key moments.

Tools

summarize_youtube_video

Generates a summary of the YouTube video at a given URL using the Gemini Vision API.

ask_about_youtube_video

Answers a question about the video or provides a general description if no question is asked.

extract_key_moments

Extracts key moments (timestamps and descriptions) from a given YouTube video.

list_supported_models

Lists available Gemini models that support the 'generateContent' method.

README

YouTube Vision MCP Server (youtube-vision)

Features

  • Analyzes YouTube videos using the Gemini Vision API.
  • Provides multiple tools for different interactions:
    • General description or Q&A (ask_about_youtube_video)
    • Summarization (summarize_youtube_video)
    • Key moment extraction (extract_key_moments)
  • Lists available Gemini models supporting generateContent.
  • Configurable Gemini model via environment variable.
  • Communicates via stdio (standard input/output).
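
To make the stdio transport concrete, here is a minimal sketch of how a client might frame one request for the server's standard input. It assumes the standard MCP stdio convention of one JSON-RPC 2.0 message per line; the frameMessage helper is a hypothetical name, and a real client would first complete the MCP initialize handshake, which is omitted here.

```typescript
// Minimal JSON-RPC 2.0 request shape, as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: serialize one message as a single line of JSON
// followed by a newline. JSON.stringify without an indent argument
// escapes newlines inside strings, so the payload stays on one line.
function frameMessage(message: JsonRpcRequest): string {
  return JSON.stringify(message) + "\n";
}

// Ask the server which tools it exposes.
const listToolsRequest: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
};

const framedRequest = frameMessage(listToolsRequest);
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the pipe.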

Prerequisites

Before using this server, ensure you have the following:

  • Node.js: Version 18 or higher recommended. You can download it from nodejs.org.
  • Google Gemini API Key: Obtain your API key from Google AI Studio or Google Cloud Console.

Installation & Usage

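There are two main ways to use this server (Option 1 and Option 2 below). For Claude Desktop, an automated install via Smithery is also available.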

Installing via Smithery

To install youtube-vision-mcp for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @minbang930/youtube-vision-mcp --client claude

Option 1: Using npx (Recommended for quick use)

The easiest way to run this server is using npx, which downloads and runs the package without needing a permanent installation.

You can configure it in your MCP client's settings file (for example, Claude or VS Code):

{
  "mcpServers": {
    "youtube-vision": {
      "command": "npx",
      "args": [
        "-y",
        "youtube-vision"
      ],
      "env": {
        "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        "GEMINI_MODEL_NAME": "gemini-2.0-flash"
      }
    }
  }
}

Replace "YOUR_GEMINI_API_KEY" with your actual Google Gemini API key.

Option 2: Manual Installation (from Source)

If you want to modify the code or run it directly from the source:

  1. Clone the repository:

    git clone https://github.com/minbang930/Youtube-Vision-MCP.git
    cd Youtube-Vision-MCP
    
  2. Install dependencies:

    npm install
    
  3. Build the project:

    npm run build
    
  4. Configure and run: Run the compiled code directly with node dist/index.js (ensure GEMINI_API_KEY is set as an environment variable), or configure your MCP client to launch it with the node command and the absolute path to dist/index.js, passing the API key via the env setting as in the npx example.
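
As a rough illustration of step 4, the sketch below builds the client settings entry for a source install. The sourceInstallEntry helper and the /absolute/path/to/... directory are placeholders, not part of the project; the sketch assumes a POSIX-style path and the same mcpServers layout as the npx example.

```typescript
// Shape of one server entry in an MCP client's settings file,
// mirroring the npx example (command, args, env).
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Hypothetical helper: point the client at the compiled entry file
// inside a local checkout. repoDir must be an absolute path.
function sourceInstallEntry(repoDir: string, apiKey: string): McpServerEntry {
  // Drop trailing slashes so the joined path has a single separator.
  const base = repoDir.replace(/\/+$/, "");
  return {
    command: "node",
    args: [`${base}/dist/index.js`],
    env: { GEMINI_API_KEY: apiKey },
  };
}

// Placeholder values: substitute your own checkout path and API key.
const sourceConfig = {
  mcpServers: {
    "youtube-vision": sourceInstallEntry(
      "/absolute/path/to/Youtube-Vision-MCP/",
      "YOUR_GEMINI_API_KEY",
    ),
  },
};

const sourceConfigJson = JSON.stringify(sourceConfig, null, 2);
```

The resulting JSON string can be pasted into the client's settings file in place of the npx entry.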

Configuration

The server uses the following environment variables:

  • GEMINI_API_KEY (Required): Your Google Gemini API key.
  • GEMINI_MODEL_NAME (Optional): The specific Gemini model to use (e.g., gemini-1.5-flash). Defaults to gemini-2.0-flash. Important: For production or commercial use, ensure you select a model version that is not marked as "Experimental" or "Preview".

Environment variables should be set in the env section of your MCP client's settings file (e.g., mcp_settings.json).
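
The rules above (required key, optional model with a default) can be sketched as a small resolver. This is an illustration of the documented behavior, not the server's actual source; resolveConfig and ServerConfig are hypothetical names.

```typescript
interface ServerConfig {
  apiKey: string;
  modelName: string;
}

// Default taken from the documentation above.
const DEFAULT_MODEL = "gemini-2.0-flash";

// Hypothetical resolver: takes an environment-like record so it can be
// exercised without touching the real process environment.
function resolveConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = (env["GEMINI_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("GEMINI_API_KEY is required");
  }
  // Fall back to the default when the variable is unset or blank.
  const requested = (env["GEMINI_MODEL_NAME"] ?? "").trim();
  const modelName = requested === "" ? DEFAULT_MODEL : requested;
  return { apiKey, modelName };
}

// Only the key is set, so the default model is used.
const defaultedConfig = resolveConfig({ GEMINI_API_KEY: "YOUR_GEMINI_API_KEY" });
```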

Available Tools

1. ask_about_youtube_video

Answers a question about the video or provides a general description if no question is asked.

  • Input:
    • youtube_url (string, required): The URL of the YouTube video.
    • question (string, optional): The specific question to ask about the video. If omitted, a general description is generated.
  • Output: Text containing the answer or description.
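
For illustration, a client might build the tools/call request for this tool as sketched below. The envelope follows the standard MCP tools/call shape; buildAskRequest is a hypothetical helper, and the video URL is a placeholder.

```typescript
// Inputs documented for ask_about_youtube_video.
interface AskArgs {
  youtube_url: string;
  question?: string;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, string> };
}

// Hypothetical helper: omit `question` entirely when none is given,
// which per the description above yields a general description.
function buildAskRequest(id: number, args: AskArgs): ToolCallRequest {
  const toolArgs: Record<string, string> = { youtube_url: args.youtube_url };
  if (args.question !== undefined && args.question.trim() !== "") {
    toolArgs["question"] = args.question;
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "ask_about_youtube_video", arguments: toolArgs },
  };
}

// Placeholder URL: replace VIDEO_ID with a real video identifier.
const describeRequest = buildAskRequest(2, {
  youtube_url: "https://www.youtube.com/watch?v=VIDEO_ID",
});
const questionRequest = buildAskRequest(3, {
  youtube_url: "https://www.youtube.com/watch?v=VIDEO_ID",
  question: "What tools does the presenter use?",
});
```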

2. summarize_youtube_video

Generates a summary of a given YouTube video.

  • Input:
    • youtube_url (string, required): The URL of the YouTube video.
    • summary_length (string, optional): Desired summary length ('short', 'medium', 'long'). Defaults to 'medium'.
  • Output: Text containing the video summary.
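
A client may want to validate summary_length before calling the tool. The sketch below applies the documented default and rejects other values on the client side; how the server itself treats an invalid value is not stated here, so that part is an assumption, and normalizeSummaryLength is a hypothetical name.

```typescript
// The three values documented for summary_length.
const SUMMARY_LENGTHS = ["short", "medium", "long"] as const;
type SummaryLength = (typeof SUMMARY_LENGTHS)[number];

// Hypothetical client-side check: apply the documented default and
// reject anything outside the documented set.
function normalizeSummaryLength(value?: string): SummaryLength {
  if (value === undefined) {
    return "medium";
  }
  const match = SUMMARY_LENGTHS.find((length) => length === value);
  if (match === undefined) {
    throw new Error(`summary_length must be one of: ${SUMMARY_LENGTHS.join(", ")}`);
  }
  return match;
}

// Arguments for a summarize_youtube_video call (placeholder URL).
const summarizeArgs = {
  youtube_url: "https://www.youtube.com/watch?v=VIDEO_ID",
  summary_length: normalizeSummaryLength("short"),
};
```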

3. extract_key_moments

Extracts key moments (timestamps and descriptions) from a given YouTube video.

  • Input:
    • youtube_url (string, required): The URL of the YouTube video.
    • number_of_moments (integer, optional): Number of key moments to extract. Defaults to 3.
  • Output: Text describing the key moments with timestamps.
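
Since the output is free text, any post-processing depends on how the model formats its timestamps. The sketch below assumes timestamps appear as M:SS, MM:SS, or H:MM:SS (an assumption, not a documented format) and turns one into a link that opens the video at that moment using YouTube's t query parameter. Both helper names are hypothetical.

```typescript
// Convert "M:SS", "MM:SS", or "H:MM:SS" into a number of seconds.
// The accepted format is an assumption about the tool's text output.
function timestampToSeconds(stamp: string): number {
  if (!/^\d{1,2}(:\d{2}){1,2}$/.test(stamp)) {
    throw new Error(`Unrecognized timestamp: ${stamp}`);
  }
  // Fold left to right: each step shifts the running total by a factor of 60.
  return stamp
    .split(":")
    .map((part) => Number(part))
    .reduce((total, part) => total * 60 + part, 0);
}

// Append a start offset to a YouTube URL, using "&" when the URL
// already carries a query string and "?" otherwise.
function linkToMoment(youtubeUrl: string, stamp: string): string {
  const separator = youtubeUrl.includes("?") ? "&" : "?";
  return `${youtubeUrl}${separator}t=${timestampToSeconds(stamp)}s`;
}

// Placeholder URL: replace VIDEO_ID with a real video identifier.
const momentLink = linkToMoment("https://www.youtube.com/watch?v=VIDEO_ID", "1:30");
```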

4. list_supported_models

Lists available Gemini models that support the generateContent method (fetched via REST API).

  • Input: None
  • Output: Text listing the supported model names.
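
The filtering this tool describes can be sketched as follows. The response shape (a models array whose entries carry name and supportedGenerationMethods) is assumed to match the Gemini API's model listing endpoint, and the sample data is illustrative rather than real API output. The function name is hypothetical, and the network call is left out so the sketch runs offline.

```typescript
// Assumed subset of the Gemini model listing response.
interface GeminiModel {
  name: string; // e.g. "models/gemini-2.0-flash"
  supportedGenerationMethods?: string[];
}

interface ListModelsResponse {
  models?: GeminiModel[];
}

// Keep only models that advertise generateContent, and strip the
// "models/" prefix so the names match what GEMINI_MODEL_NAME expects.
function modelsSupportingGenerateContent(response: ListModelsResponse): string[] {
  return (response.models ?? [])
    .filter((model) => (model.supportedGenerationMethods ?? []).includes("generateContent"))
    .map((model) => model.name.replace(/^models\//, ""));
}

// Illustrative sample, not actual API output.
const sampleResponse: ListModelsResponse = {
  models: [
    { name: "models/gemini-2.0-flash", supportedGenerationMethods: ["generateContent", "countTokens"] },
    { name: "models/text-embedding-004", supportedGenerationMethods: ["embedContent"] },
  ],
};

const usableModels = modelsSupportingGenerateContent(sampleResponse);
```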

Important Notes

  • Model Selection for Production: When using this server for production or commercial purposes, please ensure the selected GEMINI_MODEL_NAME is a stable version suitable for production use. According to the Gemini API Terms of Service, models marked as "Experimental" or "Preview" are not permitted for production deployment.
  • API Terms of Service: Usage of this server relies on the Google Gemini API. Users are responsible for reviewing and complying with the Google APIs Terms of Service and the Gemini API Additional Terms of Service. Note that data usage policies may differ between free and paid tiers of the Gemini API. Do not submit sensitive or confidential information when using free tiers.
  • Content Responsibility: The accuracy and appropriateness of content generated via the Gemini API are not guaranteed. Use discretion before relying on or publishing generated content.
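
As a rough guard for the model selection note above, a deployment script could flag model names that look experimental or preview. This is only a naming heuristic (it assumes such models carry "exp", "experimental", or "preview" as a dash-separated part of their name); Google's model documentation remains the authority on a model's status. The function name is hypothetical.

```typescript
// Heuristic only: flags names with "exp", "experimental", or "preview"
// as a dash-separated segment, e.g. "gemini-2.0-flash-exp".
function looksPreRelease(modelName: string): boolean {
  return /(^|-)(exp|experimental|preview)(-|$)/i.test(modelName);
}

// Hypothetical pre-deployment check.
function assertProductionModel(modelName: string): void {
  if (looksPreRelease(modelName)) {
    throw new Error(`${modelName} looks experimental or preview; choose a stable model for production`);
  }
}

assertProductionModel("gemini-2.0-flash"); // passes without throwing
```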

License

This project is licensed under the MIT License. See the LICENSE file for details.
