gobbler-mcp

Converts YouTube videos, audio, documents, and web pages to clean markdown with YAML frontmatter, providing AI assistants with structured content via the Model Context Protocol (MCP).

<p align="center"> <img src="docs/assets/Gobby Feasting (small).png" alt="Gobby the Turkey mascot consuming PDF, HTML, DOCX, and VIDEO files, outputting clean MD blocks" width="500"> </p>

Gobbler

Universal Content Conversion to Markdown for AI

Gobbler transforms any content—YouTube videos, web pages, documents, audio files, even live browser sessions—into clean, structured markdown that AI systems can immediately reason about.

License: MIT | Python 3.11+

The Problem

AI assistants work best with markdown. But content exists in countless formats—PDFs, videos, web pages behind logins, audio recordings. Getting that content into a format AI can use requires:

  • Different tools for each content type
  • Custom scripts to extract and format
  • Lost metadata and inconsistent output
  • No unified way for AI agents to access content

Gobbler solves this. One tool, one output format, multiple access patterns.

The Solution

# Every content type → Same pattern → Same output format
gobbler youtube "https://youtube.com/watch?v=..." -o transcript.md
gobbler document report.pdf -o report.md
gobbler audio meeting.mp3 -o meeting.md
gobbler webpage "https://docs.example.com" -o docs.md

Every conversion produces markdown with YAML frontmatter:

---
source: https://youtube.com/watch?v=VIDEO_ID
type: youtube_transcript
title: "Video Title"
duration: 847
word_count: 2341
converted_at: 2025-01-03T10:30:00Z
---

# Video Title

Content here, ready for AI consumption...
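
Because every conversion shares this layout, a consumer can separate metadata from content with a few lines of code. The following is a minimal sketch, not part of Gobbler: the helper name `split_frontmatter` is hypothetical, and it only handles flat `key: value` lines like those in the sample above (nested YAML would need a real parser such as PyYAML). All values come back as strings.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a converted document into (metadata, markdown body).

    Simplification: only flat `key: value` lines are understood.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter present
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        return {}, text  # unterminated frontmatter: treat everything as body

    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip().strip('"')

    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


sample = """---
source: https://youtube.com/watch?v=VIDEO_ID
type: youtube_transcript
title: "Video Title"
duration: 847
word_count: 2341
converted_at: 2025-01-03T10:30:00Z
---

# Video Title

Content here, ready for AI consumption...
"""

meta, body = split_frontmatter(sample)
print(meta["type"], meta["title"])
```

A caller that needs typed values would convert fields such as `duration` with `int(...)` after splitting.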

Quick Start

# Install
git clone https://github.com/Enablement-Engineering/gobbler.git
cd gobbler && make install

# Start services (for web/document conversion)
make start-docker

# Convert content
gobbler youtube "https://youtube.com/watch?v=dQw4w9WgXcQ"
gobbler document paper.pdf --no-ocr -o paper.md
gobbler audio interview.mp3 --model small -o interview.md

📖 Full Documentation

Three Ways to Use Gobbler

1. CLI (For Humans & Scripts)

gobbler youtube URL              # YouTube transcripts
gobbler audio FILE               # Audio/video transcription
gobbler document FILE            # PDF, DOCX, PPTX, XLSX
gobbler webpage URL              # Web pages (JS-rendered)
gobbler batch youtube-playlist URL  # Batch processing
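
For scripts, the same commands can be driven from Python through `subprocess`. This is a hedged sketch rather than an official API: `build_gobbler_command` and `convert` are hypothetical helper names, only the subcommands and the `-o` flag shown in this README are used, and what the CLI prints to stdout when `-o` is omitted is an assumption.

```python
import subprocess
from typing import Optional, Sequence

# Subcommands taken from the list above.
ALLOWED_KINDS = {"youtube", "audio", "document", "webpage"}


def build_gobbler_command(
    kind: str,
    source: str,
    output: Optional[str] = None,
    extra_args: Sequence[str] = (),
) -> list[str]:
    """Build an argv list such as ["gobbler", "youtube", URL, "-o", "out.md"]."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported content type: {kind!r}")
    cmd = ["gobbler", kind, source, *extra_args]
    if output is not None:
        cmd += ["-o", output]
    return cmd


def convert(
    kind: str,
    source: str,
    output: Optional[str] = None,
    extra_args: Sequence[str] = (),
) -> str:
    """Run the CLI and return its stdout (assumed to be useful when -o is omitted).

    Raises subprocess.CalledProcessError if gobbler exits with a non-zero status.
    """
    cmd = build_gobbler_command(kind, source, output, extra_args)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Mirrors: gobbler document paper.pdf --no-ocr -o paper.md
    print(build_gobbler_command("document", "paper.pdf", "paper.md", ["--no-ocr"]))
```

Passing an argv list instead of a shell string avoids quoting problems with URLs that contain `?` or `&`.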

2. Skills (For AI Agents)

Skills are markdown instruction files (SKILL.md) that teach AI agents how to use the gobbler CLI. Compatible with:

  • OpenClaw - Personal AI assistant platform
  • Claude Code - Anthropic's coding agent
  • Cursor / Windsurf - AI-powered IDEs
  • Any agent that supports skill/tool discovery via markdown

skills/
├── gobbler-youtube/     # 📺 YouTube transcription
├── gobbler-audio/       # 🎙️ Audio/video transcription  
├── gobbler-document/    # 📄 Document conversion (PDF, DOCX, PPTX, XLSX)
├── gobbler-webpage/     # 🌐 Web scraping with JS rendering
├── gobbler-browser/     # 🔌 Browser automation + AI chat integrations
├── gobbler-setup/       # 🔧 Installation and troubleshooting
└── gobbler-utils/       # 📦 Batch processing utilities

Each skill includes OpenClaw-compatible metadata for automatic dependency checking:

metadata:
  openclaw:
    emoji: 📺
    requires:
      bins: [gobbler]  # CLI tools that must be installed
    install:
      - id: gobbler
        kind: script
        label: Install Gobbler
        script: |
          git clone https://github.com/Enablement-Engineering/gobbler.git
          cd gobbler && uv sync && uv tool install .

Usage with OpenClaw:

# Copy skills to your OpenClaw workspace
cp -r skills/* ~/.openclaw/skills/

# Or symlink for development
ln -s "$(pwd)"/skills/* ~/.openclaw/skills/

Usage with Claude Code:

# Add skills directory to CLAUDE.md or workspace instructions
echo "Skills available in ./skills/" >> CLAUDE.md

The gobbler-browser skill includes integrations for NotebookLM, Claude.ai, ChatGPT, and Gemini (DOM automation - may break with site updates).

Skills use progressive disclosure—agents only load skill metadata at startup, then read full CLI instructions when triggered.
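
Progressive disclosure can be pictured as a two-stage load. The sketch below is illustrative only and is not how any particular agent implements it: it assumes each SKILL.md begins with a `---` delimited metadata block, and the names `SkillStub`, `read_frontmatter`, and `discover_skills`, as well as the demo file content, are hypothetical.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillStub:
    name: str         # directory name, e.g. "gobbler-youtube"
    path: Path        # location of SKILL.md
    frontmatter: str  # raw metadata text, kept unparsed in this sketch

    def load_instructions(self) -> str:
        """Read the full file only when the skill is actually triggered."""
        return self.path.read_text(encoding="utf-8")


def read_frontmatter(path: Path) -> str:
    """Return only the leading `---` block, stopping before the body."""
    collected: list[str] = []
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return ""  # assumption violated: no metadata block
        for line in fh:
            if line.strip() == "---":
                break
            collected.append(line.rstrip("\n"))
    return "\n".join(collected)


def discover_skills(root: Path) -> dict[str, SkillStub]:
    """Index every <root>/<skill>/SKILL.md by directory name."""
    stubs: dict[str, SkillStub] = {}
    for skill_file in sorted(root.glob("*/SKILL.md")):
        name = skill_file.parent.name
        stubs[name] = SkillStub(name, skill_file, read_frontmatter(skill_file))
    return stubs


# Demo against a throwaway directory; the file content here is made up.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "gobbler-youtube"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: gobbler-youtube\n---\n\nRun: gobbler youtube URL\n",
        encoding="utf-8",
    )
    skills = discover_skills(Path(tmp))
    startup_view = skills["gobbler-youtube"].frontmatter       # loaded up front
    full_text = skills["gobbler-youtube"].load_instructions()  # loaded on demand
```

The point of the split is cost: the startup index stays small no matter how long the individual instruction files grow.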

3. MCP Protocol (For AI Coding Assistants)

# For Claude Code
claude mcp add gobbler-mcp -- uv --directory /path/to/gobbler run gobbler-mcp

// For OpenCode (opencode.json)
{
  "mcp": {
    "gobbler": {
      "type": "local",
      "command": ["uv", "--directory", "/path/to/gobbler", "run", "gobbler-mcp"]
    }
  }
}

// For Claude Desktop (~/.config/claude/claude_desktop_config.json; on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)
{
  "mcpServers": {
    "gobbler-mcp": {
      "command": "uv",
      "args": ["--directory", "/path/to/gobbler", "run", "gobbler-mcp"]
    }
  }
}
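
If the Claude Desktop config file already lists other servers, the entry should be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the `mcpServers` layout shown above; the function names are hypothetical and the config path is left to the caller because it differs by platform.

```python
import json
from pathlib import Path


def add_gobbler_server(config: dict, gobbler_dir: str) -> dict:
    """Return a copy of the config with the gobbler-mcp entry added or replaced."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # keep any existing servers
    servers["gobbler-mcp"] = {
        "command": "uv",
        "args": ["--directory", gobbler_dir, "run", "gobbler-mcp"],
    }
    updated["mcpServers"] = servers
    return updated


def update_config_file(config_path: Path, gobbler_dir: str) -> None:
    """Merge the entry into config_path, creating the file if it is missing."""
    existing = json.loads(config_path.read_text()) if config_path.exists() else {}
    merged = add_gobbler_server(existing, gobbler_dir)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(merged, indent=2) + "\n")


if __name__ == "__main__":
    print(json.dumps(add_gobbler_server({}, "/path/to/gobbler"), indent=2))
```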

Features

Content Conversion

| Type | Command | Backend |
|------|---------|---------|
| YouTube | gobbler youtube URL | youtube-transcript-api |
| Audio/Video | gobbler audio FILE | faster-whisper (local) |
| Documents | gobbler document FILE | Docling (Docker) |
| Web Pages | gobbler webpage URL | Crawl4AI (Docker) |

Browser Automation

Control browsers via the Gobbler extension for authenticated content:

gobbler browser extract          # Extract current page
gobbler notebooklm query "..."   # Query NotebookLM
gobbler chatgpt query "..."      # Send to ChatGPT
gobbler claude query "..."       # Send to Claude.ai
gobbler gemini query "..."       # Send to Gemini

Setup:

  1. Load the extension in Chrome:
    • Go to chrome://extensions/
    • Enable "Developer mode"
    • Click "Load unpacked" → select browser-extension/ folder
  2. Create a tab group named "Gobbler" (right-click any tab → Add to group → New group)
  3. Add tabs you want to control to the Gobbler group

Only tabs in the "Gobbler" group are accessible—this prevents accidental access to sensitive tabs.

Batch Processing

gobbler batch youtube-playlist "https://youtube.com/playlist?list=..."
gobbler batch directory ./documents --pattern "*.pdf"
gobbler batch webpages urls.txt --output-dir ./pages

Architecture

Gobbler provides three interfaces (Skills, the CLI, and the MCP server) that all sit on the same provider layer and backends:

┌─────────────────────────────────────────────────────────────┐
│                     Your Automations                        │
└──────────────┬──────────────┬──────────────┬───────────────┘
               │              │              │
      ┌────────▼────┐  ┌──────▼─────┐  ┌─────▼─────┐
      │   Skills    │  │  gobbler   │  │    MCP    │
      │(CLI instrs) │  │    CLI     │  │  Server   │
      └────────┬────┘  └──────┬─────┘  └─────┬─────┘
               │              │              │
               └──────────────┼──────────────┘
                              ▼
                    ┌─────────────────┐
                    │  Provider Layer │
                    │  (Converters)   │
                    └────────┬────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
    ┌─────────┐        ┌───────────┐       ┌─────────┐
    │ Whisper │        │  Docling  │       │Crawl4AI │
    │ (local) │        │ (Docker)  │       │(Docker) │
    └─────────┘        └───────────┘       └─────────┘

Skills are markdown files that teach an AI agent which CLI commands to run. The MCP server exposes the same functionality as tools for Claude Desktop and Claude Code.

Installation

Prerequisites

  • Python 3.11+
  • uv (Python package manager)
  • Docker Desktop (for web/document conversion)
  • ffmpeg (for audio extraction from video)

Install

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install
git clone https://github.com/Enablement-Engineering/gobbler.git
cd gobbler
make install

# Start Docker services
make start-docker

# Verify
gobbler --version

What Works Without Docker

  • YouTube transcripts - Fetched with youtube-transcript-api, no local service required
  • Audio transcription - Runs a local Whisper model via faster-whisper

What Needs Docker

  • Document conversion - Docling service (port 5001)
  • Web scraping - Crawl4AI service (port 11235)

Configuration

Config file: ~/.config/gobbler/config.yaml

services:
  docling: "http://localhost:5001"
  crawl4ai: "http://localhost:11235"

storage:
  type: "sqlite"
  path: "~/.config/gobbler/jobs.db"

logging:
  level: "INFO"
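
A consumer of this file typically layers user settings over built-in values and expands `~` in paths. The sketch below shows that idea under stated assumptions: treating the values above as defaults is an assumption, the function names are hypothetical, and parsing the YAML itself (for example with PyYAML) is left out so the example runs on the standard library alone.

```python
from copy import deepcopy
from pathlib import Path

# Assumed defaults, copied from the sample config above.
DEFAULTS = {
    "services": {
        "docling": "http://localhost:5001",
        "crawl4ai": "http://localhost:11235",
    },
    "storage": {"type": "sqlite", "path": "~/.config/gobbler/jobs.db"},
    "logging": {"level": "INFO"},
}


def merge_config(overrides: dict, defaults: dict = DEFAULTS) -> dict:
    """Recursively overlay user overrides on defaults without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(value, merged[key])
        else:
            merged[key] = value
    return merged


def resolve_storage_path(config: dict) -> Path:
    """Expand '~' so the database path can be opened directly."""
    return Path(config["storage"]["path"]).expanduser()


if __name__ == "__main__":
    cfg = merge_config({"logging": {"level": "DEBUG"}})
    print(cfg["logging"]["level"], resolve_storage_path(cfg))
```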

Troubleshooting

Document conversion crashes

# Use --no-ocr for digital PDFs (faster, less memory)
gobbler document file.pdf --no-ocr -o output.md

Service not responding

docker compose up -d
docker compose ps
curl http://localhost:5001/health   # Docling
curl http://localhost:11235/health  # Crawl4AI
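
The same check can be scripted, for example as a preflight step before a batch run. A minimal sketch using only the standard library, assuming the `/health` endpoints and ports shown in the curl commands above; `is_healthy` and `check_services` are hypothetical helper names.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Endpoints taken from the curl commands above.
SERVICES = {
    "docling": "http://localhost:5001/health",
    "crawl4ai": "http://localhost:11235/health",
}


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def check_services(
    services: Dict[str, str] = SERVICES,
    probe: Callable[[str], bool] = is_healthy,
) -> Dict[str, bool]:
    """Probe each service once and report its status by name."""
    return {name: probe(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

The `probe` parameter exists so the check can be exercised in tests without running Docker.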

YouTube "IP blocked"

# Set up TranscriptAPI.com for reliable access
export TRANSCRIPTAPI_KEY=your_key
gobbler youtube "URL"

See the gobbler-setup skill for complete troubleshooting.

Project Structure

gobbler/
├── src/
│   ├── gobbler_cli/       # CLI interface (typer)
│   ├── gobbler_core/      # Converters & utilities
│   ├── gobbler_mcp/       # MCP protocol server
│   ├── gobbler_relay/     # Browser extension bridge
│   └── gobbler_queue/     # Background job queue
├── skills/                # AI agent instruction files
├── browser-extension/     # Chrome/Firefox extension
└── docker-compose.yml     # External services

Documentation

📖 Full Documentation - Installation, configuration, and usage guides

License

MIT License - see LICENSE file.

Acknowledgments

Built on the shoulders of giants:
