weixin-articles-mcp

Read WeChat (微信) Official Account articles with native multimodal output — body, images, and video keyframes returned as MCP content blocks. Handles all three embed types: Tencent Video, WeChat-native, and Channels (视频号 metadata via public API).

MCP server for reading WeChat (微信) Official Account articles, with native multimodal output — images and video keyframes returned as content blocks, not URLs.

For personal/research use. This tool reads only publicly accessible article URLs and does not bypass any authentication or anti-bot measures. See Disclaimer before using.

License: MIT · Python 3.10+

Why this exists

Other tools that read WeChat articles for LLMs return a list of image URLs, so your LLM has to fetch each one separately to actually see it, costing extra round-trips and context.

This server returns the images themselves. And the video keyframes. Your LLM sees what you see, in one shot.

| Tool | Article text | Images | Videos |
|---|---|---|---|
| WebFetch (built-in) | ✅ (often blocked by anti-bot) | ❌ URLs only | |
| Existing WeChat MCPs / Skills | | ❌ URLs only | |
| weixin-articles-mcp | | Native image blocks | Keyframes as image blocks |

Features

  • 📰 Reliable WeChat scraping — pure Python httpx GET, no Rust binary or headless browser required
  • 🖼️ Native image content — PNG/JPG returned as MCP Image blocks, GIFs filtered, capped at 10 per article
  • 🎬 Video handling for all three embed types:
    • WeChat Official Account native videos (<iframe data-mpvid="wxv_*">): mp4 extracted from inline JS, 8 evenly-spaced keyframes via ffmpeg
    • Tencent Video (v.qq.com iframes): yt-dlp + ffmpeg keyframes
    • WeChat Channels (视频号, <mp-common-videosnap>): full metadata via the public batch_get_video_snap API (duration, dimensions, hi-res cover, full description, like count, publisher verification) + cover image. mp4 stream is locked behind WeChat's finder protocol — see Why no Channels mp4? below
  • 🕒 Publish time recovery — extracts var ct Unix timestamp that other parsers miss
  • 🪶 Minimal install: pip install + optional ffmpeg for video; no Chromium, no Rust
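
The publish-time bullet above can be illustrated with a small sketch. It assumes the page embeds the timestamp as an inline script assignment shaped like var ct = "1700000000"; (the exact quoting and spacing are assumptions), and extract_publish_time is a hypothetical name, not this package's API.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed shape of the inline assignment: var ct = "1700000000";
# Quotes may be single, double, or absent; spacing may vary.
_CT_PATTERN = re.compile(r"""\bvar\s+ct\s*=\s*["']?(\d{9,11})["']?""")


def extract_publish_time(html: str) -> Optional[datetime]:
    """Return the publish time as an aware UTC datetime, or None if absent."""
    match = _CT_PATTERN.search(html)
    if match is None:
        return None
    # The captured value is a Unix timestamp in seconds.
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
```

Returning None rather than raising lets a caller fall back to whatever date the visible page markup provides.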

Install

# Core (article + images)
pip install weixin-articles-mcp

# With video keyframe support
pip install "weixin-articles-mcp[video]"
brew install ffmpeg   # or apt install ffmpeg on Linux

Configure

Claude Desktop / Claude Code

{
  "mcpServers": {
    "weixin-articles": {
      "command": "weixin-articles-mcp"
    }
  }
}

Cursor / Cline

Use the same JSON in your client's MCP server config.

Usage

Once configured, just paste a WeChat article URL into your conversation:

Read https://mp.weixin.qq.com/s/cexkyzQBRDG3uIF6g5cEbQ

Your LLM will receive:

  • Article metadata (title, account, publish time, cover URL)
  • Full article body in Markdown
  • All inline PNG/JPG images as native image content blocks
  • For each video, 8 keyframe images
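
The image rule in the list above (PNG/JPG kept, GIFs dropped, at most 10 per article) might be sketched as follows. Whether the server inspects file bytes, URLs, or Content-Type headers is not stated here, so sniffing the well-known file signatures is an assumption, and sniff_mime and select_images are hypothetical names.

```python
from typing import Iterable, List, Optional, Tuple

MAX_IMAGES = 10  # per-article cap stated in this README

# Well-known file signatures (magic bytes).
_PNG = b"\x89PNG\r\n\x1a\n"
_JPEG = b"\xff\xd8\xff"


def sniff_mime(data: bytes) -> Optional[str]:
    """Return a MIME type for PNG/JPEG payloads, or None for anything else."""
    if data.startswith(_PNG):
        return "image/png"
    if data.startswith(_JPEG):
        return "image/jpeg"
    return None  # GIF, WebP, SVG, and unknown formats are dropped


def select_images(blobs: Iterable[bytes], limit: int = MAX_IMAGES) -> List[Tuple[str, bytes]]:
    """Keep the first `limit` PNG/JPEG payloads, preserving article order."""
    selected: List[Tuple[str, bytes]] = []
    for data in blobs:
        if len(selected) >= limit:
            break
        mime = sniff_mime(data)
        if mime is not None:
            selected.append((mime, data))
    return selected
```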

Tool reference

read_article(url: str) -> list[content_block]

Returns a list of MCP content blocks:

  • [0] — text block: metadata + article body markdown
  • [1..N] — image blocks: article images (max 10, GIFs filtered)
  • For each video (max 3):
    • WeChat-native or Tencent: one text marker + 8 keyframe image blocks
    • WeChat Channels: one text marker (with duration, dimensions, like count, publisher, description) + 1 hi-res cover image block
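
As a rough illustration of "8 keyframe image blocks", the sketch below picks eight timestamps and builds one ffmpeg command per frame. The README only says the frames are evenly spaced; taking the midpoint of each of eight equal segments is an assumption, and both function names are hypothetical.

```python
from typing import List

KEYFRAME_COUNT = 8  # per-video count stated in this README


def keyframe_timestamps(duration_s: float, count: int = KEYFRAME_COUNT) -> List[float]:
    """Midpoints of `count` equal segments, so the very first and last frames are skipped."""
    if duration_s <= 0 or count <= 0:
        return []
    step = duration_s / count
    return [round(step * (i + 0.5), 3) for i in range(count)]


def ffmpeg_frame_command(video_path: str, timestamp_s: float, out_path: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that grabs a single frame."""
    return [
        "ffmpeg",
        "-y",                          # overwrite the output file if it exists
        "-ss", f"{timestamp_s:.3f}",   # seek on the input before decoding
        "-i", video_path,
        "-frames:v", "1",              # emit exactly one video frame
        "-q:v", "2",                   # high JPEG quality
        out_path,
    ]
```

Each command list can be passed to subprocess.run(cmd, check=True) once ffmpeg is installed.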

On failure, returns a single text block starting with Error:.
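
A client-side sketch of consuming this return value is shown below. It assumes the blocks have already been converted to plain dicts in the MCP wire shape ({"type": "text", "text": ...} and {"type": "image", "data": <base64>, "mimeType": ...}); SDK objects expose the same fields as attributes instead. split_result is a hypothetical helper, not part of this package.

```python
import base64
from typing import Any, Dict, List, Tuple


def split_result(blocks: List[Dict[str, Any]]) -> Tuple[str, List[bytes]]:
    """Return (joined text, decoded image bytes); raise on the 'Error:' convention."""
    if not blocks:
        raise ValueError("empty result")
    first = blocks[0]
    is_error = (
        len(blocks) == 1
        and first.get("type") == "text"
        and first.get("text", "").startswith("Error:")
    )
    if is_error:
        raise RuntimeError(first["text"])
    # Text blocks cover the article body plus any per-video markers.
    texts = [b["text"] for b in blocks if b.get("type") == "text"]
    # Image payloads are base64-encoded in the wire format.
    images = [base64.b64decode(b["data"]) for b in blocks if b.get("type") == "image"]
    return "\n\n".join(texts), images
```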

Roadmap

  • [x] WeChat article fetching with anti-bot handling
  • [x] Native image content blocks
  • [x] WeChat Official Account native video keyframe extraction
  • [x] Tencent Video keyframe extraction
  • [x] WeChat Channels (视频号) metadata enrichment via public API
  • [ ] ASR subtitles via faster-whisper (for native + Tencent videos)
  • [ ] Full-text search across read articles
  • [ ] Account subscription / new-article notifications

Why no Channels mp4?

Short answer: WeChat Channels (视频号) videos embedded in articles intentionally do not expose a downloadable mp4 stream to public web access. The mp4 lives inside WeChat's finder protocol, and reaching it requires (a) a logged-in WeChat client session, (b) handling finder-specific encryption (the first 128KB of the mp4 is XOR-encrypted with a fixed key), and (c) intercepting the stream from the WeChat PC client at the network level.

Every open-source WeChat Channels downloader in the wild — ltaoo/wx_channels_download, qiye45/wechatVideoDownload, putyy/res-downloader, KingsleyYau/WeChatChannelsDownloader and others — solves this with a MITM HTTPS proxy + WeChat PC client + root CA installation. That model is fundamentally incompatible with how an MCP server runs (no client, no user interaction, no admin install).

What we do instead: call WeChat's public batch_get_video_snap API (no cookie or session required) to give your LLM the next-best thing — high-resolution cover image, full description, duration, dimensions, like count, and publisher verification. For most use cases (reading and summarizing articles), this is enough to convey the video's substance.

If you specifically need the mp4 file, install one of the dedicated tools above alongside this MCP — they complement each other.

Architecture

src/weixin_articles_mcp/
├── server.py     # FastMCP entrypoint, tool registration
├── fetcher.py    # httpx GET with browser UA
├── parser.py     # WeChat DOM extraction (BeautifulSoup + lxml)
├── markdown.py   # HTML → Markdown (markdownify subclass)
└── media.py      # Image download + video download/keyframe extraction
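
To make the parser's job concrete, here is a sketch that tells the three embed types apart using the markers named under Features. The project itself lists BeautifulSoup + lxml; this sketch swaps in the standard library's html.parser to stay self-contained. The class and function names, and the check of a lazy-load data-src attribute alongside src, are assumptions.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple

Embed = Tuple[str, Dict[str, str]]  # (kind, tag attributes)


class EmbedFinder(HTMLParser):
    """Collect video embeds in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.embeds: List[Embed] = []

    def handle_starttag(self, tag, attrs):
        attr_map = {k: (v or "") for k, v in attrs}
        if tag == "mp-common-videosnap":
            self.embeds.append(("channels", attr_map))
        elif tag == "iframe":
            source = attr_map.get("src", "") + " " + attr_map.get("data-src", "")
            if attr_map.get("data-mpvid", "").startswith("wxv_"):
                self.embeds.append(("wechat_native", attr_map))
            elif "v.qq.com" in source:
                self.embeds.append(("tencent_video", attr_map))


def find_embeds(html: str, max_videos: int = 3) -> List[Embed]:
    """Return up to `max_videos` embeds (the README caps videos at 3)."""
    finder = EmbedFinder()
    finder.feed(html)
    finder.close()
    return finder.embeds[:max_videos]
```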

Contributing

PRs welcome. Particularly looking for help on:

  • WeChat Channels (视频号) URL handling
  • Resilience to template variants from less common publishers
  • More test fixtures (different article styles)

Open an issue: https://github.com/jj-cheng25/weixin-articles-mcp/issues

Disclaimer

This tool is provided for personal, educational, and research use only.

What this tool does:

  • Reads publicly accessible WeChat article URLs (mp.weixin.qq.com/s/...) using a standard browser User-Agent — the same content any user with a web browser can view
  • Calls only public WeChat API endpoints that accept empty authentication fields (i.e. designed by WeChat to be reachable without login)
  • Enforces a default 1-second minimum interval between requests to prevent the tool from being repurposed as a high-volume crawler

What this tool does not do:

  • Use cookies, login sessions, or any form of user credential
  • Bypass any technical protection, anti-bot measure, or encrypted stream (e.g. WeChat Channels mp4 is intentionally not supported — see Why no Channels mp4?)
  • Decrypt, reverse-engineer, or circumvent WeChat's protocol-level protections
  • Store, cache, or redistribute fetched content beyond the immediate response

User responsibilities:

  • Respect WeChat's Terms of Service when using this tool. Personal/research use of publicly accessible articles is generally aligned with how the content is intended to be consumed; high-volume scraping or commercial redistribution likely is not.
  • Respect copyright of fetched content. Article content remains the property of its original authors and publishers; this tool only fetches and forwards it to your LLM for inline processing.
  • Do not flood mp.weixin.qq.com — keep usage at human reading rates. The default rate limit is set conservatively, but you can tighten it further by setting WEIXIN_FETCH_INTERVAL_S=2.0 (or higher) in your environment.
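
The minimum-interval behaviour described above could look roughly like this. It is a sketch, not the server's code: the class name, the injectable clock and sleep parameters, and the fallback to 1.0 second on missing, invalid, or non-positive values are all assumptions. Only the WEIXIN_FETCH_INTERVAL_S variable name and the 1-second default come from this README.

```python
import os
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Block so that consecutive wait() calls are at least `interval_s` apart."""

    def __init__(
        self,
        interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval_s = interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous request was released

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.interval_s - now)
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


def interval_from_env(default: float = 1.0) -> float:
    """Read WEIXIN_FETCH_INTERVAL_S, falling back to `default` when unusable."""
    raw = os.environ.get("WEIXIN_FETCH_INTERVAL_S", "")
    try:
        value = float(raw)
    except ValueError:
        return default
    return value if value > 0 else default
```

Calling limiter.wait() immediately before each HTTP request keeps a single process at or below one request per interval.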

The authors and contributors of this project disclaim all liability arising from misuse. By using this software you accept full responsibility for ensuring your usage complies with applicable laws and the terms of service of the services it connects to.

License

MIT — see LICENSE.
