MCP YouTube Transcript Pro

MCP YouTube Transcript Pro

Enables fetching YouTube video transcripts with metadata, including timed captions in multiple formats (JSON, SRT, VTT, CSV, TXT) and preprocessing options.

Category
Visit Server

README

MCP YouTube Transcript Pro

A production-ready Model Context Protocol (MCP) server for fetching YouTube video transcripts with metadata.

🎯 Features

  • 4 MCP Tools: Complete implementation of list_tracks, get_transcript, get_timed_transcript, get_video_info
  • Hybrid Architecture: YouTube Data API v3 for metadata + yt-dlp for robust content extraction
  • Full MCP Compliance: JSON-RPC 2.0 protocol over stdin/stdout
  • Battle-Tested: Comprehensive test suite with 100% success rate
  • Production Quality: TypeScript with strict types, proper error handling, detailed logging
  • No OAuth Required: Uses API key for metadata, yt-dlp for transcript content (no OAuth 2.0 complexity)

πŸ“‹ Prerequisites

  1. Node.js 20+ (for running the MCP server)
  2. YouTube Data API Key (free tier available)
  3. yt-dlp (for transcript extraction)

Installing yt-dlp

Windows (winget):

winget install yt-dlp

macOS (Homebrew):

brew install yt-dlp

Linux (curl):

sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp

Getting a YouTube API Key

  1. Go to Google Cloud Console
  2. Create a new project (or select existing)
  3. Enable "YouTube Data API v3"
  4. Create credentials β†’ API key
  5. Copy the API key

πŸš€ Quick Start

Installation

# Clone or navigate to the project directory
cd mcp-youtube-transcript-pro

# Install dependencies
npm install

# Create .env file with your API key
echo "YOUTUBE_API_KEY=your_api_key_here" > .env

# Build the project
npm run build

Running Tests

# Test all four MCP tools directly
npx ts-node test-mcp-tools.ts

# Test the JSON-RPC protocol implementation
npx ts-node test-mcp-protocol.ts

Starting the Server

# Start the MCP server (listens on stdin/stdout)
npm run start

πŸ”§ Usage with Claude Desktop

Add to your Claude Desktop configuration (claude_desktop_config.json):

{
  "mcpServers": {
    "youtube-transcript": {
      "command": "node",
      "args": [
        "H:\\-EMBLEM-PROJECT(s)-\\Tools\\packages\\mcp-youtube-transcript-pro\\dist\\index.js"
      ],
      "env": {
        "YOUTUBE_API_KEY": "your_api_key_here"
      }
    }
  }
}

Note: Replace the path with your actual installation directory.

πŸ“š MCP Tools

1. list_tracks

Lists available caption tracks for a YouTube video.

Input:

{
  "url": "https://www.youtube.com/watch?v=lxRAj1Gijic"
}

Output:

[
  {
    "lang": "en",
    "source": "youtube_api_manual"
  }
]

2. get_transcript

Returns a merged plain text transcript.

Input:

{
  "url": "lxRAj1Gijic",
  "lang": "en"
}

Output:

"today we're going to enhance your vs code to ensure that you've got the most efficient workspace..."

3. get_timed_transcript

Returns timestamped transcript segments in multiple formats.

Input:

{
  "url": "https://youtu.be/lxRAj1Gijic",
  "lang": "en",
  "format": "json"
}

Output (format: json, default):

[
  {
    "start": 0.08,
    "end": 0.32,
    "text": "today",
    "lang": "en",
    "source": "web_extraction"
  },
  ...
]

Supported Formats:

  • json (default): Array of TranscriptSegment objects
  • srt: SubRip subtitle format
  • vtt: WebVTT web caption format
  • csv: Spreadsheet format with 7 columns
  • txt: Plain text format

See Format Support below for detailed examples.

4. get_video_info

Returns video metadata including title, channel, duration, and available captions.

Input:

{
  "url": "https://www.youtube.com/watch?v=lxRAj1Gijic"
}

Output:

{
  "title": "The ULTIMATE VS Code Setup - Extensions & Settings 2025",
  "channelId": "UCRVtCne4XmwFLot1FHMfhuw",
  "duration": "PT15M23S",
  "captionsAvailable": [
    { "lang": "en", "source": "youtube_api_manual" }
  ]
}

πŸ“€ Format Support

The get_timed_transcript tool supports 5 output formats optimized for different use cases:

JSON (default)

Structured data format, perfect for programmatic processing.

[
  {
    "start": 0.08,
    "end": 4.359,
    "text": "today I'm going to be showing you the best extensions",
    "lang": "en",
    "source": "web_extraction"
  }
]

SRT (SubRip)

Standard subtitle format for video editing software (Adobe Premiere, Final Cut Pro, DaVinci Resolve).

1
00:00:00,080 --> 00:00:04,359
today I'm going to be showing you the best extensions

2
00:00:04,359 --> 00:00:07,000
and settings for VS Code in 2025

VTT (WebVTT)

Web-native caption format for HTML5 video players and browsers.

WEBVTT

00:00:00.080 --> 00:00:04.359
today I'm going to be showing you the best extensions

00:00:04.359 --> 00:00:07.000
and settings for VS Code in 2025

CSV

Spreadsheet format for data analysis (Excel, Google Sheets, Python pandas).

Sequence,Start,End,Duration,Text,Language,Source
1,00:00:00.080,00:00:04.359,00:00:04.279,"today I'm going to be showing you the best extensions",en,web_extraction
2,00:00:04.359,00:00:07.000,00:00:02.641,"and settings for VS Code in 2025",en,web_extraction

TXT (Plain Text)

Human-readable format for documentation or simple text extraction.

today I'm going to be showing you the best extensions and settings for VS Code in 2025

Usage Example

{
  "url": "https://youtu.be/lxRAj1Gijic",
  "format": "srt"
}

Format Comparison

Format File Size* Best For MIME Type
JSON 289 KB Data processing, APIs application/json
SRT 144 KB Video editing (Premiere, Final Cut) application/x-subrip
VTT 127 KB Web captions, HTML5 video text/vtt
CSV 175 KB Spreadsheet analysis, Excel text/csv
TXT 17.5 KB Documentation, simple text text/plain

*Based on 15-minute video with 3,624 transcript segments.

For detailed format specifications, compatibility information, and decision trees, see FORMATS.md.

πŸ”§ Preprocessing Options

The get_timed_transcript tool includes optional preprocessing parameters to clean and optimize transcript data before formatting. All options are disabled by default for backward compatibility.

filterEmpty

Remove segments with empty or whitespace-only text.

Use case: Clean up auto-generated captions that include timing markers for silent periods.

Example:

{
  "url": "https://youtu.be/lxRAj1Gijic",
  "filterEmpty": true
}

Before (1,089 segments):

[
  { "start": 0.08, "end": 0.32, "text": "today", ... },
  { "start": 0.32, "end": 0.56, "text": "", ... },
  { "start": 0.56, "end": 1.12, "text": "  ", ... },
  { "start": 1.12, "end": 1.44, "text": "we're", ... }
]

After (987 segments, 102 removed):

[
  { "start": 0.08, "end": 0.32, "text": "today", ... },
  { "start": 1.12, "end": 1.44, "text": "we're", ... }
]

mergeOverlaps

Merge segments with overlapping timestamps.

Use case: Fix word-level timing issues in auto-generated captions where end[n] > start[n+1].

Example:

{
  "url": "https://youtu.be/lxRAj1Gijic",
  "mergeOverlaps": true
}

Before (overlapping timestamps):

[
  { "start": 0.08, "end": 1.50, "text": "Hello", ... },
  { "start": 1.20, "end": 2.50, "text": "world", ... }
]

After (merged):

[
  { "start": 0.08, "end": 2.50, "text": "Hello world", ... }
]

removeSilence

Remove silence and pause markers from transcript.

Use case: Create clean reading transcripts without [silence], [pause], [Music] markers.

Example:

{
  "url": "https://youtu.be/lxRAj1Gijic",
  "removeSilence": true
}

Removed patterns (case-insensitive):

  • [silence]
  • [pause]
  • [Music]
  • Single period: .
  • Single dash: -
  • Empty/whitespace-only text

Before:

[
  { "start": 0.08, "end": 0.32, "text": "Hello", ... },
  { "start": 0.32, "end": 1.50, "text": "[silence]", ... },
  { "start": 1.50, "end": 2.80, "text": "[Music]", ... },
  { "start": 2.80, "end": 3.20, "text": "world", ... }
]

After (2 segments removed):

[
  { "start": 0.08, "end": 0.32, "text": "Hello", ... },
  { "start": 2.80, "end": 3.20, "text": "world", ... }
]

Combining Options

All three preprocessing options can be used together. They are applied in this order:

  1. removeSilence - Remove silence/pause markers
  2. filterEmpty - Remove empty segments
  3. mergeOverlaps - Merge overlapping timestamps

Example (all options enabled):

{
  "url": "https://youtu.be/lxRAj1Gijic",
  "filterEmpty": true,
  "mergeOverlaps": true,
  "removeSilence": true,
  "format": "srt"
}

Results:

  • Original: 1,089 segments
  • After removeSilence: 1,012 segments (77 removed)
  • After filterEmpty: 987 segments (25 removed)
  • After mergeOverlaps: 342 segments (645 merged)
  • Final: 342 clean, merged segments in SRT format

TypeScript Usage

import { get_timed_transcript } from './tools';

// Clean transcript for reading
const cleanTranscript = await get_timed_transcript({
  url: 'https://youtu.be/lxRAj1Gijic',
  filterEmpty: true,
  removeSilence: true,
  format: 'txt'
});

// Optimized subtitle file
const subtitles = await get_timed_transcript({
  url: 'https://youtu.be/lxRAj1Gijic',
  mergeOverlaps: true,
  filterEmpty: true,
  format: 'srt'
});

πŸ—οΈ Architecture

MCP Client (e.g., Claude Desktop)
    ↓ JSON-RPC 2.0 over stdin
MCP Server (index.ts)
    ↓
Tool Router (tools.ts)
    ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ YouTube Data API v3  β”‚  yt-dlp (web extraction)β”‚
β”‚ (youtube_api.ts)     β”‚  (web_extraction.ts)    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ β€’ List captions      β”‚ β€’ Get transcript contentβ”‚
β”‚ β€’ Get video metadata β”‚ β€’ Timestamped segments  β”‚
β”‚ β€’ API key auth       β”‚ β€’ No auth required      β”‚
β”‚ β€’ Quota limits       β”‚ β€’ No quota limits       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Why Hybrid?

  1. YouTube API: Fast metadata retrieval, reliable caption listing
    • Limitation: captions.download() requires OAuth 2.0 (not suitable for automated servers)
  2. yt-dlp: No authentication needed, actively maintained, handles edge cases
    • Advantage: Downloads transcript content without OAuth complexity
  3. Best of Both Worlds: API for metadata, yt-dlp for content extraction

πŸ“ Project Structure

mcp-youtube-transcript-pro/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ index.ts                 # MCP server entry point (JSON-RPC handler)
β”‚   β”œβ”€β”€ tools.ts                 # MCP tool implementations
β”‚   β”œβ”€β”€ types.ts                 # TypeScript interfaces
β”‚   └── adapters/
β”‚       β”œβ”€β”€ youtube_api.ts       # YouTube Data API v3 integration
β”‚       └── web_extraction.ts    # yt-dlp integration
β”œβ”€β”€ test-mcp-tools.ts            # Direct tool tests
β”œβ”€β”€ test-mcp-protocol.ts         # End-to-end protocol tests
β”œβ”€β”€ package.json
β”œβ”€β”€ tsconfig.json
β”œβ”€β”€ .env                         # YOUTUBE_API_KEY
└── dist/                        # Compiled JavaScript

πŸ§ͺ Test Results

All tests passing with 100% success rate:

=== MCP YouTube Transcript Pro - Tool Tests ===
βœ… list_tracks passed
βœ… get_video_info passed  
βœ… get_timed_transcript passed (3624 segments, 15.39 minutes)
βœ… get_transcript passed (17917 characters, 3624 words)

=== MCP JSON-RPC Protocol Tests ===
βœ… initialize passed
βœ… tools/list passed (4 tools)
βœ… tools/call (all 4 tools) passed
βœ… ping passed

πŸ› οΈ Development

Available Scripts

npm run build        # Compile TypeScript to dist/
npm run start        # Start the MCP server
npm run dev          # Start in development mode with auto-reload
npm run lint         # Run ESLint
npm test             # Run Jest tests

VS Code Tasks

Use Ctrl+Shift+B (or Cmd+Shift+B on macOS) to access pre-configured tasks:

  • Build: Compile TypeScript
  • Start: Run the server
  • Dev: Development mode with ts-node
  • Lint: Check code quality
  • Test: Run test suite
  • Install Dependencies: npm install

πŸ“ Environment Variables

Create a .env file in the project root:

YOUTUBE_API_KEY=your_youtube_data_api_v3_key_here

πŸ” Troubleshooting

"yt-dlp not found"

  • Solution: Install yt-dlp using package manager (see Prerequisites)
  • Verify: Run yt-dlp --version in terminal

"YOUTUBE_API_KEY environment variable not set"

  • Solution: Create .env file with your API key
  • Verify: Check that .env exists and contains YOUTUBE_API_KEY=...

"Cannot find module '../types'"

  • Solution: Rebuild the project with npm run build
  • Verify: Check that dist/ directory exists and contains compiled .js files

API Quota Exceeded

  • Issue: YouTube Data API has daily quota limits (free tier: 10,000 units/day)
  • Solution: Each API call uses ~3 units, yt-dlp has no quota limits
  • Workaround: The server uses yt-dlp for transcript content (no API quota impact)

πŸ“„ License

MIT License - see LICENSE file for details

🀝 Contributing

This project was built with AI assistance (GitHub Copilot - Claude Sonnet 4.5). Contributions are welcome!

See IMPLEMENTATION_COMPLETE.md for detailed implementation notes and lessons learned.

πŸ™ Acknowledgments

  • yt-dlp: Gold standard for YouTube content extraction
  • Google YouTube Data API: Reliable metadata and caption listing
  • Model Context Protocol: Standardized protocol for AI tool integration

Status: βœ… Production Ready Last Updated: October 17, 2025 Test Video: https://www.youtube.com/watch?v=lxRAj1Gijic

Run the container:

docker run -i mcp-youtube-transcript-pro

Note: Version 1.1.0 adds preprocessing options (filterEmpty, mergeOverlaps, removeSilence) and CSV/TXT output formats.

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured