MediaCrawler MCP Server

MediaCrawler MCP Server

MCP server for crawling social media platforms (e.g., Bilibili) by keywords, video IDs, or creator IDs, with support for MySQL, JSON, and CSV storage.

Category
Visit Server

README

🔥 MediaCrawler_MCP_Server - MCP for MediaCrawler 🕷️

🔗 MediaCrawler仓库地址

https://github.com/NanmiCoder/MediaCrawler

🔧 基于MediaCrawler改进

  • python版本:所有包版本可用于python 3.13,以支持mcp的使用
  • mysql存储:如表已经存在,初始化不会覆盖原有数据

✨ 更多设置(config)可见MediaCrawler仓库

🧰 可使用的MCP工具

  1. crawl_search(platform: str, store_type: str, keywords: str) - Start the crawler for the media platform by keywords
  2. crawl_detail(platform: str, store_type: str, video_id: list): - Start the crawler for the media platform by video ID
  3. crawl_creator(platform: str, store_type: str, creator_id: list) - Start the crawler for the media platform by creator id.

📦 Python包安装

# 进入项目目录
cd MediaCrawler_MCP_Server

# 使用 uv sync 命令来保证 python 版本和相关依赖包的一致性
uv sync

🌐 浏览器驱动安装

# 安装浏览器驱动
uv run playwright install

⚙️ 设置

设置环境变量:

MYSQL_DB_HOST=localhost     # Database host
MYSQL_DB_PORT=3306         # Optional: Database port (defaults to 3306 if not specified)
MYSQL_DB_USER=your_username
MYSQL_DB_PWD=your_password
MYSQL_DB_NAME=your_database
CRAWLER_MAX_NOTES_COUNT=20 # number of notes you want to crawl
MAX_CONCURRENCY_NUM=1 # number of concurrent crawlers
ENABLE_GET_COMMENTS=true # crawl the comments or not

🚀 使用

添加至 claude_desktop_config.json or cline_mcp_settings.json

"mediacrawler": {
      "disabled": false,
      "timeout": 600,
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "path/to/MediaCrawler_MCP_Server",
        "run",
        "main.py"
      ],
      "env": {
        "MYSQL_DB_HOST": "localhost",
        "MYSQL_DB_PORT": "3306",
        "MYSQL_DB_USER": "your_username",
        "MYSQL_DB_NAME": "your_database",
        "MYSQL_DB_PWD": "your_password",
        "CRAWLER_MAX_NOTES_COUNT": "20",
        "MAX_CONCURRENCY_NUM": "1",
        "ENABLE_GET_COMMENTS": "true"
      }
  }

🌰 例子

  1. 帮我爬取b站视频资料,关键词为"钱",存储模式为mysql。
  2. 帮我爬取b站视频号为BV1d54y1g7db,BV1Sz4y1U77N的视频存储模式为json。
  3. 帮我爬取b站up主视频资料,其id为20813884,存储模式为csv。

Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured