diffgrab

Tracks web page changes and detects content modifications, exposing structured diffs and snapshot history through MCP tools.

Web page change tracking with structured diffs. Integrates with markgrab and snapgrab, and is MCP native.

from diffgrab import DiffTracker

tracker = DiffTracker()
await tracker.track("https://example.com")
changes = await tracker.check()
for c in changes:
    if c.changed:
        print(c.summary)     # "3 lines added, 1 lines removed in sections: Introduction."
        print(c.unified_diff) # Standard unified diff output
await tracker.close()

Features

Change detection — track any URL, detect content changes via content hashing
Structured diffs — unified diff + section-level analysis (which headings changed)
Human-readable summaries — "5 lines added, 2 removed in sections: Intro, Methods"
Snapshot history — SQLite storage, browse past versions of any page
markgrab powered — HTML/YouTube/PDF/DOCX extraction via markgrab
Visual diff — optional screenshot comparison via snapgrab
MCP server — 5 tools for Claude Code / MCP clients
CLI included — diffgrab track, check, diff, history, untrack

How It Works

flowchart TD
    A["diffgrab track URL"] --> B["Fetch initial snapshot\n(markgrab + snapgrab)"]
    B --> C["Store baseline"]
    C --> D["diffgrab check"]
    D --> E["Fetch current page"]
    E --> F{"Content\nhash match?"}
    F -->|"changed"| G["Compute structured diff\n+ section analysis"]
    F -->|"unchanged"| H["No changes"]
    G --> I["📊 DiffResult\nadded / removed / modified"]
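The flow above can be approximated with the standard library alone. The following is a minimal sketch, not diffgrab's actual implementation: the function names `content_hash`, `compare`, and `changed_sections` are hypothetical, SHA-256 is an assumed hash choice, and the section logic assumes markdown-style `#` headings.

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """Return a SHA-256 hex digest of the snapshot text (assumed hash choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compare(before: str, after: str) -> dict:
    """Compare two snapshots, computing a diff only when the hashes differ."""
    if content_hash(before) == content_hash(after):
        return {"changed": False, "added": 0, "removed": 0, "diff": ""}

    diff_lines = list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )
    # The first two lines are the "--- before" / "+++ after" file headers.
    body = diff_lines[2:]
    added = sum(1 for line in body if line.startswith("+"))
    removed = sum(1 for line in body if line.startswith("-"))
    return {
        "changed": True,
        "added": added,
        "removed": removed,
        "diff": "\n".join(diff_lines),
    }


def changed_sections(before: str, after: str) -> list[str]:
    """List headings whose section contains an added or removed line."""

    def heading_per_line(lines: list[str]) -> list[str]:
        # Map every line to the most recent heading seen above it.
        current = "(top)"
        owners = []
        for line in lines:
            if line.startswith("#"):
                current = line.lstrip("#").strip()
            owners.append(current)
        return owners

    a, b = before.splitlines(), after.splitlines()
    owners_a, owners_b = heading_per_line(a), heading_per_line(b)
    sections: list[str] = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        for name in owners_a[i1:i2] + owners_b[j1:j2]:
            if name not in sections:
                sections.append(name)
    return sections
```

Comparing hashes first keeps the common "nothing changed" case cheap. The line-level diff and the heading lookup run only when the content actually differs.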

Install

pip install diffgrab

Optional extras:

pip install 'diffgrab[cli]'      # CLI with click + rich
pip install 'diffgrab[visual]'   # Visual diff with snapgrab
pip install 'diffgrab[mcp]'      # MCP server with fastmcp
pip install 'diffgrab[all]'      # Everything

Usage

Python API

import asyncio
from diffgrab import DiffTracker

async def main():
    tracker = DiffTracker()

    # Track a URL (takes initial snapshot)
    await tracker.track("https://example.com", interval_hours=12)

    # Check for changes
    changes = await tracker.check()
    for change in changes:
        if change.changed:
            print(change.summary)
            print(change.unified_diff)

    # Get diff between specific snapshots
    result = await tracker.diff("https://example.com", before_id=1, after_id=2)

    # Browse snapshot history
    history = await tracker.history("https://example.com", count=20)

    # Stop tracking
    await tracker.untrack("https://example.com")

    await tracker.close()

asyncio.run(main())
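If you want to drive checks from your own code rather than rely on any scheduling the library may do, a small polling loop is one option. This is a sketch under assumptions: `poll` is a hypothetical helper, not part of diffgrab, and it only expects a `check` coroutine that returns objects with a `changed` attribute, as `tracker.check()` does in the example above.

```python
import asyncio


async def poll(check, on_change, interval_seconds: float, rounds: int) -> None:
    """Call `check()` `rounds` times, sleeping `interval_seconds` between calls."""
    for i in range(rounds):
        for result in await check():
            if result.changed:
                on_change(result)
        # Skip the sleep after the final round so the loop returns promptly.
        if i < rounds - 1:
            await asyncio.sleep(interval_seconds)
```

Inside `main()` this could be called as `await poll(tracker.check, lambda r: print(r.summary), interval_seconds=3600, rounds=24)`.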

Convenience Functions

from diffgrab import track, check, diff, history, untrack

await track("https://example.com")
changes = await check()
result = await diff("https://example.com")
snaps = await history("https://example.com")
await untrack("https://example.com")

CLI

# Track a URL
diffgrab track https://example.com --interval 12

# Check all tracked URLs for changes
diffgrab check

# Check a specific URL
diffgrab check https://example.com

# Show diff between snapshots
diffgrab diff https://example.com
diffgrab diff https://example.com --before 1 --after 3

# View snapshot history
diffgrab history https://example.com --count 20

# Stop tracking
diffgrab untrack https://example.com

MCP Server

Add to your Claude Code MCP config:

{
  "mcpServers": {
    "diffgrab": {
      "command": "diffgrab-mcp",
      "args": []
    }
  }
}

Or with uvx:

{
  "mcpServers": {
    "diffgrab": {
      "command": "uvx",
      "args": ["--from", "diffgrab[mcp]", "diffgrab-mcp"]
    }
  }
}

MCP Tools:

| Tool | Description |
| --- | --- |
| `track_url` | Register a URL for change tracking |
| `check_changes` | Check tracked URLs for changes |
| `get_diff` | Get structured diff between snapshots |
| `get_history` | Browse snapshot history |
| `untrack_url` | Stop tracking a URL |

DiffResult

Every diff operation returns a DiffResult:

from dataclasses import dataclass

@dataclass
class DiffResult:
    url: str                           # The tracked URL
    changed: bool                      # Whether content changed
    added_lines: int                   # Lines added
    removed_lines: int                 # Lines removed
    changed_sections: list[str]        # Markdown headings with changes
    unified_diff: str                  # Standard unified diff
    summary: str                       # Human-readable summary
    before_snapshot_id: int | None     # DB ID of older snapshot
    after_snapshot_id: int | None      # DB ID of newer snapshot
    before_timestamp: str              # When older snapshot was taken
    after_timestamp: str               # When newer snapshot was taken
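As an illustration of consuming these fields, the sketch below builds a plain-text report from a list of results. The `report` helper is hypothetical, and the `SimpleNamespace` objects are stand-ins carrying only the fields used here. Real code would pass the list returned by `tracker.check()`.

```python
from types import SimpleNamespace


def report(results) -> str:
    """Build a one-line-per-URL report from DiffResult-like objects."""
    lines = []
    for r in results:
        if not r.changed:
            continue  # unchanged pages are left out of the report
        sections = ", ".join(r.changed_sections) or "(none detected)"
        lines.append(
            f"{r.url}: +{r.added_lines} / -{r.removed_lines} lines; sections: {sections}"
        )
    return "\n".join(lines) if lines else "No changes."


# Stand-in data for illustration only.
sample = [
    SimpleNamespace(url="https://example.com", changed=True, added_lines=3,
                    removed_lines=1, changed_sections=["Introduction"]),
    SimpleNamespace(url="https://example.org", changed=False, added_lines=0,
                    removed_lines=0, changed_sections=[]),
]
print(report(sample))
# prints: https://example.com: +3 / -1 lines; sections: Introduction
```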

Storage

Snapshots are stored in a SQLite database at ~/.local/share/diffgrab/diffgrab.db, which is created automatically. To use a custom path:

tracker = DiffTracker(db_path="/path/to/custom.db")
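Because the store is a regular SQLite file, it can be inspected with Python's built-in `sqlite3` module. The README does not document the schema, so this sketch assumes nothing about it and only lists the table names. `list_tables` is a hypothetical helper, and the file is opened read-only so the tracker's data cannot be modified by accident.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the table names in a SQLite file, opened read-only."""
    path = Path(db_path).expanduser().resolve()
    # A file: URI with mode=ro refuses to create or write to the database.
    conn = sqlite3.connect(path.as_uri() + "?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For the default location, call `list_tables("~/.local/share/diffgrab/diffgrab.db")` after at least one URL has been tracked. If the file does not exist yet, the read-only open raises `sqlite3.OperationalError`.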

QuartzUnit Ecosystem

| Package | Role | PyPI |
| --- | --- | --- |
| markgrab | HTML/YouTube/PDF/DOCX to markdown | `pip install markgrab` |
| snapgrab | URL to screenshot + metadata | `pip install snapgrab` |
| docpick | OCR + LLM document extraction | `pip install docpick` |
| feedkit | RSS feed collection | `pip install feedkit` |
| diffgrab | Web page change tracking | `pip install diffgrab` |
| browsegrab | Browser agent for LLMs | Coming soon |

Used in

newswatch — RSS news monitoring pipeline (feedkit → markgrab → embgrep → diffgrab)
watchdeck — Web page monitoring with visual diffs and safety guards

License

MIT

Part of the QuartzUnit ecosystem: composable Python libraries for data collection, extraction, search, and AI agent safety.
