Link Scan MCP Server πŸš€

Link Scan MCP Server - A comprehensive Model Context Protocol (MCP) server for scanning and summarizing links. Automatically detects and analyzes video links (YouTube, Instagram Reels) and text links (blogs, articles) to provide concise 3-sentence summaries. All features work without requiring API keys!

Python 3.11+ | MCP Compatible | License: MIT

✨ Features

πŸŽ₯ Video Link Analysis

  • YouTube Support
    • Comprehensive metadata extraction (title, description)
    • Subtitle extraction for first 7 seconds (yt-dlp)
    • Audio transcription using OpenAI Whisper
    • Integrated summarization combining all text sources
  • Instagram Reels Support
    • Audio download and transcription (first 7 seconds)
    • Automatic content summarization
  • Smart Link Detection
    • Automatic video/text link type detection
    • Error handling for unsupported URLs

πŸ“ Text Link Analysis

  • Web Content Extraction
    • BeautifulSoup-based HTML parsing
    • Main content area detection
    • Automatic navigation/ad removal
  • Intelligent Summarization
    • Llama3-powered text summarization
    • 3-sentence limit enforcement
    • Natural Korean output

πŸ€– AI-Powered Summarization

  • Llama3 Integration
    • Local LLM via Ollama (no API keys required)
    • Separate prompts for video and text content
    • Fallback to original text on errors
  • Whisper Transcription
    • High-quality speech-to-text conversion
    • Optimized for speed and accuracy
    • Supports multiple languages

🐳 Docker Support

  • One-Command Setup
    • Docker Compose configuration
    • Automatic Ollama service setup
    • Llama3 model auto-download
    • Development mode with hot reload

πŸ”§ Developer-Friendly

  • Type-safe with Pydantic models
  • Async/await support for better performance
  • Comprehensive error handling
  • Extensible architecture
  • Hot reload in development mode

πŸš€ Quick Start

Installation

# Clone the repository
git clone https://github.com/your-username/mcp-link-scan.git
cd mcp-link-scan

# Install dependencies
pip install -r requirements.txt

System Dependencies

ffmpeg (required for audio processing):

  • macOS: brew install ffmpeg
  • Ubuntu/Debian: sudo apt-get install ffmpeg
  • Windows: Download from https://ffmpeg.org/download.html

Ollama (required for summarization):

  • macOS: brew install ollama or download from https://ollama.com/download
  • Linux: curl -fsSL https://ollama.com/install.sh | sh
  • Windows: Download from https://ollama.com/download
  • After installation: ollama pull llama3:latest

Configuration

Create a .env file:

# Server settings
PORT=8000                    # Server port (default: 8000)
HOST=0.0.0.0                 # Server host (default: 0.0.0.0)
DEBUG=False                  # Debug mode (default: False)

# API path prefix (optional)
# Used when hosting multiple MCP servers on the same machine
# Default: /link-scan
API_PREFIX=/link-scan

# Ollama settings (optional)
# Set automatically when using Docker Compose
OLLAMA_API_URL=http://localhost:11434    # Ollama API URL (default: http://localhost:11434)
OLLAMA_MODEL=llama3:latest               # Ollama model to use (default: llama3)

Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| PORT | ❌ | 8000 | Port number the server listens on |
| HOST | ❌ | 0.0.0.0 | Host address the server binds to |
| DEBUG | ❌ | False | Enable debug mode (True/False) |
| API_PREFIX | ❌ | /link-scan | Path prefix for API endpoints |
| OLLAMA_API_URL | ❌ | http://localhost:11434 | Ollama API server URL |
| OLLAMA_MODEL | ❌ | llama3 | Name of the Ollama model to use |
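A minimal sketch of reading these variables with their documented defaults, using only `os.getenv`; the project itself may load them differently (e.g. via python-dotenv):

```python
import os

# Read each documented variable, falling back to its documented default.
PORT = int(os.getenv("PORT", "8000"))
HOST = os.getenv("HOST", "0.0.0.0")
DEBUG = os.getenv("DEBUG", "False").lower() == "true"
API_PREFIX = os.getenv("API_PREFIX", "/link-scan")
OLLAMA_API_URL = os.getenv("OLLAMA_API_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3")

print(PORT, API_PREFIX)
```

Note that `DEBUG` arrives as a string, so it must be compared against `"true"` rather than used directly as a boolean.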

Running as MCP Server

Local Mode (stdio):

python -m src.server

Remote Mode (HTTP):

python run_server.py

Or with uvicorn directly:

uvicorn src.server_http:app --host 0.0.0.0 --port 8000

Docker Setup (Recommended)

Using Docker Compose:

# Start all services (link-scan + Ollama)
docker-compose up -d

# Check logs
docker-compose logs -f

# Stop services
docker-compose down

Docker Compose automatically:

  • Sets up Ollama service with 8GB memory
  • Downloads Llama3 model
  • Configures link-scan service
  • Enables development mode with hot reload

Development Mode: The docker-compose.yml is configured for development with:

  • Source code volume mounting
  • Hot reload enabled (DEBUG=True)
  • Automatic code changes detection

Testing with MCP Inspector

You can test the server using the MCP Inspector tool:

# Test with Python
npx @modelcontextprotocol/inspector python run_server.py

# Or test stdio mode
npx @modelcontextprotocol/inspector python -m src.server

The MCP Inspector provides a web interface to:

  • View available tools and their schemas
  • Test tool execution with sample inputs
  • Debug server responses and error handling
  • Validate MCP protocol compliance

πŸ› οΈ Available Tools

1. scan_video_link

Scan and summarize video links (YouTube, Instagram Reels, etc.).

Parameters:

  • url (string, required): Video URL to scan

Example:

{
  "name": "scan_video_link",
  "arguments": {
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  }
}

Process:

  1. Detects link type (YouTube, Instagram, etc.)
  2. For YouTube: Extracts title, description, subtitles (first 7s)
  3. Downloads audio (first 7 seconds)
  4. Transcribes audio with Whisper
  5. Combines all text sources
  6. Summarizes with Llama3 (3 sentences max)
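Steps 5-6 of the pipeline above can be sketched in plain Python. The helper names and the sentence-splitting heuristic are illustrative assumptions, not the project's actual code:

```python
import re

# Sketch of combining text sources (step 5) and enforcing the
# 3-sentence limit (step 6); helper names are hypothetical.
def combine_sources(title, description, subtitles, transcript):
    """Join all non-empty text sources into one block for the LLM."""
    parts = [p.strip() for p in (title, description, subtitles, transcript) if p]
    return "\n".join(parts)

def clip_to_three_sentences(summary: str) -> str:
    """Enforce the 3-sentence limit on whatever the LLM returns."""
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    return " ".join(sentences[:3])

combined = combine_sources("Intro to Python", "Basics of the language.",
                           "Hello and welcome.", "Today we cover variables.")
print(clip_to_three_sentences("One. Two. Three. Four."))  # One. Two. Three.
```

Combining all four sources matters because any single source (a 7-second subtitle snippet, say) is often too thin to summarize on its own.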

2. scan_text_link

Scan and summarize text links (blogs, articles, etc.).

Parameters:

  • url (string, required): Text URL to scan

Example:

{
  "name": "scan_text_link",
  "arguments": {
    "url": "https://example.com/blog/article"
  }
}

Process:

  1. Fetches HTML content
  2. Extracts main text content
  3. Removes navigation, ads, and noise
  4. Summarizes with Llama3 (3 sentences max)
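Steps 1-3 of this pipeline can be sketched with BeautifulSoup. The tag list and the sample HTML are illustrative; the project's actual extraction logic lives in src/tools/text_handler.py and may use different heuristics:

```python
from bs4 import BeautifulSoup

# Sample page: a main article surrounded by navigation/ad noise.
HTML = """
<html><body>
  <nav>Home | About</nav>
  <article><p>Docker simplifies deployment.</p></article>
  <footer>Ads here</footer>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")
# Step 3: drop navigation, ads, and other non-content elements.
for tag in soup(["nav", "footer", "script", "style", "aside"]):
    tag.decompose()
# Step 2: prefer a semantic main-content element, fall back to <body>.
main = soup.find("article") or soup.body
text = main.get_text(separator=" ", strip=True)
print(text)  # Docker simplifies deployment.
```

Only the cleaned `text` is sent on to the Llama3 summarization step.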

πŸ“Š Example Outputs

Video Link Summary

Input: YouTube video URL

Output:

이 μ˜μƒμ€ Python ν”„λ‘œκ·Έλž˜λ° μ–Έμ–΄μ˜ κΈ°λ³Έ κ°œλ…μ„ μ†Œκ°œν•©λ‹ˆλ‹€. 
λ³€μˆ˜, ν•¨μˆ˜, 클래슀 λ“± 핡심 문법을 μ‹€μŠ΅ μ˜ˆμ œμ™€ ν•¨κ»˜ μ„€λͺ…ν•©λ‹ˆλ‹€. 
μ΄ˆλ³΄μžλ„ μ‰½κ²Œ 따라할 수 μžˆλ„λ‘ λ‹¨κ³„λ³„λ‘œ κ΅¬μ„±λ˜μ–΄ μžˆμŠ΅λ‹ˆλ‹€.

Text Link Summary

Input: Blog article URL

Output:

이 글은 Docker μ»¨ν…Œμ΄λ„ˆ 기술의 μž₯단점을 λΆ„μ„ν•©λ‹ˆλ‹€. 
가상화 기술과 λΉ„κ΅ν•˜μ—¬ λ¦¬μ†ŒμŠ€ νš¨μœ¨μ„±κ³Ό 배포 νŽΈμ˜μ„±μ„ κ°•μ μœΌλ‘œ μ œμ‹œν•©λ‹ˆλ‹€. 
λ‹€λ§Œ λ³΄μ•ˆκ³Ό λ³΅μž‘μ„± μΈ‘λ©΄μ—μ„œ μ£Όμ˜κ°€ ν•„μš”ν•˜λ‹€κ³  μ‘°μ–Έν•©λ‹ˆλ‹€.

(Translation: This article analyzes the strengths and weaknesses of Docker container technology. Compared with virtualization, it presents resource efficiency and deployment convenience as strengths. However, it advises caution regarding security and complexity.)

πŸ—οΈ Architecture

mcp-link-scan/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ server.py              # Local server (stdio)
β”‚   β”œβ”€β”€ server_http.py         # Remote server (HTTP)
β”‚   β”œβ”€β”€ tools/                  # MCP tools
β”‚   β”‚   β”œβ”€β”€ link_scanner.py     # Main tool definitions
β”‚   β”‚   β”œβ”€β”€ media_handler.py    # Video processing (Whisper)
β”‚   β”‚   └── text_handler.py    # Text extraction
β”‚   β”œβ”€β”€ utils/                  # Utilities
β”‚   β”‚   β”œβ”€β”€ link_detector.py    # Link type detection
β”‚   β”‚   β”œβ”€β”€ youtube_extractor.py # YouTube metadata/subtitles
β”‚   β”‚   └── llm_summarizer.py   # Llama3 integration
β”‚   └── prompts/                # LLM prompts
β”‚       └── __init__.py         # Video/text prompt templates
β”œβ”€β”€ docker/
β”‚   └── init-ollama.sh          # Ollama initialization script
β”œβ”€β”€ docker-compose.yml          # Docker services
β”œβ”€β”€ Dockerfile                  # Container build config
β”œβ”€β”€ requirements.txt            # Python dependencies
└── run_server.py               # Server entry point

πŸ”§ Development

Setting up Development Environment

# Clone and install
git clone https://github.com/your-username/mcp-link-scan.git
cd mcp-link-scan
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your settings

# Start Ollama (if not using Docker)
ollama serve
ollama pull llama3:latest

Development Mode with Docker

# Start in development mode (hot reload enabled)
docker-compose up -d

# View logs
docker-compose logs -f link-scan

# Code changes are automatically reloaded

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_link_scanner.py

Customizing Prompts

Edit src/prompts/__init__.py to customize LLM prompts:

# Video summarization prompt
VIDEO_SUMMARIZE_SYSTEM = """
Your custom system prompt here...
"""

# Text summarization prompt
TEXT_SUMMARIZE_SYSTEM = """
Your custom system prompt here...
"""

Configuring Whisper Model

Edit src/tools/media_handler.py:

# Change model size (tiny, base, small, medium, large)
_whisper_model = whisper.load_model("base")  # Default: "base"

πŸ“‹ Requirements

  • Python 3.11+
  • ffmpeg - Audio processing
  • Ollama - LLM runtime (for summarization)
  • yt-dlp - Video/audio download
  • openai-whisper - Speech-to-text
  • torch - PyTorch (for Whisper)
  • aiohttp - Async HTTP client
  • beautifulsoup4 - HTML parsing
  • fastapi - HTTP server framework
  • uvicorn - ASGI server
  • mcp - Model Context Protocol SDK

🌐 Deployment

PlayMCP Registration

  1. Deploy Server: Deploy to cloud hosting (Render, Railway, Fly.io, AWS, GCP, etc.)
  2. Get Server URL: Example: https://your-server.railway.app
  3. Register in PlayMCP: Use URL https://your-server.railway.app/messages

Important: Server URL must be publicly accessible and support HTTPS for production use.

Using with MCP Clients

Amazon Q CLI:

{
  "mcpServers": {
    "link-scan": {
      "command": "python",
      "args": ["run_server.py"],
      "cwd": "/path/to/mcp-link-scan"
    }
  }
}

Other MCP Clients:

{
  "mcpServers": {
    "link-scan": {
      "url": "https://your-server.com/messages"
    }
  }
}

🀝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (pytest)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

Development Workflow

# Install in development mode
pip install -e .

# Run tests
pytest

# Format code (if using formatters)
black src/ tests/
isort src/ tests/

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • yt-dlp team for the excellent YouTube extraction library
  • OpenAI Whisper team for the speech-to-text model
  • Ollama team for the local LLM runtime
  • MCP team for the Model Context Protocol specification
  • Pydantic team for the data validation library

πŸ“ž Support

πŸ—ΊοΈ Roadmap

  • [ ] Batch processing for multiple links
  • [ ] Caching layer for improved performance
  • [ ] Export functionality (JSON, CSV, etc.)
  • [ ] Advanced analytics (sentiment analysis, topic extraction)
  • [ ] Support for more video platforms (TikTok, Vimeo, etc.)
  • [ ] WebSocket support for real-time updates
  • [ ] Integration examples with popular MCP clients
  • [ ] Custom prompt templates via API
  • [ ] Multi-language support for summaries
  • [ ] Video thumbnail extraction

πŸ“ Notes

  • Audio downloads are temporarily stored and automatically cleaned up
  • Whisper model is loaded once and reused for better performance
  • Processing time depends on video length and Whisper model size
  • YouTube videos are processed for first 7 seconds only to reduce processing time
  • All text sources (title, description, subtitles, transcription) are combined for YouTube videos
  • Summaries are limited to 3 sentences maximum
  • For production, consider using GPU for faster Whisper conversion
  • Ollama timeout is set to 5 minute for tool calls
