web-retrieval-mcp

A Model Context Protocol server that analyzes webpage design structures, providing detailed layout, navigation, content, form, image, and link information from a given URL.

🌐 Web Retrieval MCP

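A Model Context Protocol (MCP) tool dedicated to analyzing the design structure of web pages. Given a URL, it returns a detailed analysis of the page structure, covering layout, navigation, content areas, forms, images, and other elements.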

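✨ Features

  • 🏗️ Page layout analysis - Automatically identifies layout elements such as the header, footer, and sidebar
  • 📝 Heading structure parsing - Extracts and analyzes the H1-H6 heading hierarchy
  • 🧭 Navigation structure detection - Parses the site's navigation menus and link structure
  • 📄 Content area extraction - Identifies the main content areas and their text
  • 📋 Form analysis - Parses form fields, submission methods, and related details
  • 🖼️ Image resource statistics - Counts the page's image resources and records their attributes
  • 🔗 Link relationship analysis - Distinguishes internal links from external links
  • 🎨 Style feature detection - Detects responsive design, fonts, and other style information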

🚀 Quick Start

Install dependencies

npm install

Build the project

npm run build

Start the server

Stdio mode (local development)

npm start

SSE mode (via Supergateway)

# Install supergateway
npm install -g supergateway

# Start the SSE server
npm run sse

The server starts at http://localhost:3100.

🔧 Claude Configuration

Stdio mode configuration

Add the following to Claude's MCP configuration:

{
  "mcpServers": {
    "web-retrieval-mcp": {
      "command": "node",
      "args": ["path/to/web-retrieval-mcp/build/index.js"]
    }
  }
}

SSE mode configuration

{
  "mcpServers": {
    "web-retrieval-mcp": {
      "type": "sse",
      "url": "http://localhost:3100/sse",
      "timeout": 600
    }
  }
}

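📖 Usage

Tool: analyze_web_structure

Performs an in-depth analysis of the front-end design architecture and back-end interaction surface of the web page at the given URL.

Parameters

  • url (required): URL of the web page to analyze

Example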

analyze_web_structure({
  url: "https://example.com"
})
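On the wire, an MCP client issues a tool invocation like the one above as a JSON-RPC 2.0 `tools/call` request. The sketch below builds that envelope for this tool; the function name `buildAnalyzeRequest` and the `ToolCallRequest` type are hypothetical helpers for illustration, and in practice an MCP client library normally constructs this message for you.

```typescript
// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
// The type and function names here are illustrative, not part of this project.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildAnalyzeRequest(id: number, url: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    // The tool name and its single "url" argument come from the README.
    params: { name: "analyze_web_structure", arguments: { url } },
  };
}

// Serialize the request as it would be sent over stdio or SSE.
const requestJson = JSON.stringify(buildAnalyzeRequest(1, "https://example.com"));
```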

Sample output (excerpt)

# 🌐 Web Page Structure Analysis Report

**URL:** https://example.com
**Title:** Example Domain
**Description:** This domain is for use in illustrative examples

---

## 🏗️ Front-end Architecture Profile

- Framework candidates: React, Next.js
- SPA detection: ✅ Likely an SPA
- Routing hints: React Router
- Build tool hints: Webpack, Next.js build
- CSS framework: Tailwind
- Micro-frontend hints: None

## 🔌 Back-end Interaction Surface (touchpoints that reach the back end)

### Forms
- Form 1: POST -> https://example.com/api/login [CSRF]
  - Fields: hidden token(_csrf), text(username), password(password)

### API/HTTP Endpoints
- https://example.com/api/v1/user
- https://api.example.com/graphql

### WebSocket
- wss://ws.example.com/realtime

...

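🛠️ Development

Project structure

src/
├── index.ts                    # MCP server entry point
└── tools/                      # Tool modules
    └── web-structure-analyzer.ts # Web page structure analysis tool

Development mode

# Watch for file changes and recompile automatically
npm run dev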

📋 Tech Stack

  • TypeScript - Type-safe JavaScript
  • @modelcontextprotocol/sdk - MCP SDK
  • Cheerio - Server-side jQuery implementation, used for HTML parsing
  • Axios - HTTP client, used to fetch web page content

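🔒 Security Considerations

  • The request timeout is set to 10 seconds to avoid long waits
  • A standard browser User-Agent is used for better compatibility
  • The number of extracted links and content items is limited to avoid running out of memory
  • The URL format is validated to ensure safe input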
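Two of the safeguards above, URL format validation and capping the number of extracted items, can be sketched as follows. This is a minimal illustration under stated assumptions: the names `validateUrl`, `capItems`, and `MAX_ITEMS`, the cap of 50, and the restriction to http/https are all hypothetical and are not taken from this project's source.

```typescript
// Assumed cap for illustration; the README does not state the real limit.
const MAX_ITEMS = 50;

// Parse the input and accept only http(s) URLs.
// Restricting to http/https is an assumption about what "safe input" means here.
function validateUrl(input: string): URL {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    throw new Error(`Invalid URL: ${input}`);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed;
}

// Keep at most `max` items so a very large page cannot exhaust memory.
function capItems<T>(items: T[], max: number = MAX_ITEMS): T[] {
  return items.slice(0, max);
}
```

Validating before any request is made keeps malformed input from ever reaching the HTTP client, and capping after extraction bounds the size of the report regardless of page size.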

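📄 License

Apache License 2.0

👨‍💻 Author

Xingyu Chen

🤝 Contributing

Issues and pull requests are welcome!

📝 Changelog

v1.0.0

  • 🎉 Initial release
  • ✅ Basic web page structure analysis
  • ✅ Layout, navigation, content, form, image, and link analysis
  • ✅ Style feature detection
  • ✅ MCP protocol support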
