Crawl Parity MCP

An MCP server for analyzing Googlebot vs AI crawler parity from Nginx access logs and Google Search Console data.

Part of the GEO Stack research programme — where we discovered that Googlebot and AI crawlers make independent authority assessments on new domains.

An MCP (Model Context Protocol) server for analyzing Googlebot vs AI crawler parity from Nginx access logs and Google Search Console (GSC) data. Determines how consistently AI crawlers access your content relative to Googlebot.

Installation

npm install

Requires Node.js 18+.

Usage

Start the Server

npm start

Or directly:

node src/index.js

Tool: analyze_logs

Parse an Nginx combined log format file and classify requests by crawler type.

Input:

  • log_path (string, required): Path to Nginx access log file

Output:

{
  "googlebot_requests": 1234,
  "ai_crawler_requests": 456,
  "parity_ratio": 37.0,
  "parity_level": "low_parity",
  "paths_googlebot": 89,
  "paths_ai_crawler": 42,
  "unique_paths": 120
}
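
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what such a message could look like for `analyze_logs`. The helper name `buildToolCall` and the log path are illustrative assumptions, not part of this package; in practice an MCP client library builds the message for you.

```javascript
// Build a JSON-RPC 2.0 request asking an MCP server to run a tool.
// `buildToolCall` is a hypothetical helper used only for illustration.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The log path is an example value; substitute the path to your own log.
const request = buildToolCall(1, "analyze_logs", {
  log_path: "/var/log/nginx/access.log",
});

console.log(JSON.stringify(request, null, 2));
```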

Tool: gsc_crossref

Cross-reference log analysis results with GSC search analytics data.

Input:

  • logs_analysis (object, required): Output from analyze_logs or similar structure
  • gsc_data (array, required): GSC analytics records with page, impressions, clicks, ctr, position

Output:

{
  "both": 45,
  "logs_only": 20,
  "gsc_only": 35,
  "total_analyzed": 100
}
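
One way to picture the cross-reference is as a set comparison between crawled paths and GSC pages. The sketch below is an assumption about the approach, not the server's actual code. It presumes the crawled paths are available as an array of strings (the example `analyze_logs` output above shows only counts) and that GSC `page` values may be full URLs. The names `toPath` and `crossReference` are hypothetical.

```javascript
// Reduce a GSC `page` value to a path so it can be compared with log paths.
// Assumption: GSC reports full URLs; bare paths are returned unchanged.
function toPath(page) {
  try {
    return new URL(page).pathname;
  } catch {
    return page;
  }
}

// Count paths seen in both sources, only in the logs, or only in GSC.
function crossReference(logPaths, gscData) {
  const fromLogs = new Set(logPaths);
  const fromGsc = new Set(gscData.map((row) => toPath(row.page)));

  let both = 0;
  let logsOnly = 0;
  for (const path of fromLogs) {
    if (fromGsc.has(path)) both += 1;
    else logsOnly += 1;
  }
  const gscOnly = fromGsc.size - both;

  return {
    both,
    logs_only: logsOnly,
    gsc_only: gscOnly,
    total_analyzed: both + logsOnly + gscOnly,
  };
}
```

With this approach the three buckets are disjoint, so they always sum to `total_analyzed`, as in the example output (45 + 20 + 35 = 100).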

Tool: parity_report

Generate a comprehensive crawl parity report combining logs and GSC data.

Input:

  • log_path (string, required): Path to Nginx access log file
  • gsc_data (array, required): GSC analytics records

Output:

{
  "timestamp": "2026-03-26T14:30:00.000Z",
  "logs_analysis": { ... },
  "gsc_crossref": { ... },
  "summary": {
    "googlebot_activity": "detected",
    "ai_crawler_activity": "detected",
    "parity_status": "low_parity",
    "parity_percentage": 37.0,
    "recommendation": "Low parity - AI crawlers are significantly underrepresented"
  }
}

Parity Levels

| Level | Ratio | Meaning |
| --- | --- | --- |
| high_parity | ≥80% | AI crawlers access your content nearly as often as Googlebot |
| medium_parity | 40-79% | Moderate parity; consider optimizing AI crawler discoverability |
| low_parity | <40% | AI crawlers significantly underrepresented; review robots.txt and crawlability |
| insufficient_data | N/A | Googlebot requests = 0; cannot calculate parity |
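
The thresholds above can be expressed as a small function. This is a sketch under stated assumptions: the ratio is taken to be AI crawler requests divided by Googlebot requests, as a percentage rounded to one decimal place (which matches 456 / 1234 ≈ 37.0 in the example output). The function name `classifyParity`, the `null` ratio for the no-data case, and the choice to compare the unrounded value against the thresholds are all hypothetical.

```javascript
// Map request counts to a parity ratio (percent) and a parity level.
function classifyParity(googlebotRequests, aiCrawlerRequests) {
  // Without Googlebot traffic there is no baseline to divide by.
  if (googlebotRequests === 0) {
    return { parity_ratio: null, parity_level: "insufficient_data" };
  }

  // Multiply before dividing to keep the arithmetic exact for whole percentages.
  const percent = (aiCrawlerRequests * 100) / googlebotRequests;

  let level = "low_parity";
  if (percent >= 80) level = "high_parity";
  else if (percent >= 40) level = "medium_parity";

  // Round to one decimal place for reporting only.
  return { parity_ratio: Math.round(percent * 10) / 10, parity_level: level };
}
```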

Bot Signatures

Googlebot Detection

  • Googlebot (followed by a space, a slash, or the end of the string)
  • Googlebot-Image
  • AdsBot-Google

AI Crawler Detection

  • GPTBot (OpenAI)
  • ClaudeBot (Anthropic)
  • PerplexityBot (Perplexity)
  • YouBot (You.com)
  • Bytespider (ByteDance)
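  • Google-Extended (Google AI; note that this is a robots.txt control token, and Google does not send it as a separate user-agent string in requests)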
  • CCBot (Common Crawl)
  • PetalBot (Huawei)
  • Applebot-Extended (Apple)
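
The signature lists above could be applied to a user-agent string roughly as follows. This is an illustrative sketch, not the server's actual matching code: the function name `classifyUserAgent`, the return labels, the case-sensitive matching, and the rule that Googlebot patterns are checked first are all assumptions.

```javascript
// Googlebot patterns, mirroring the list above.
const GOOGLEBOT_PATTERNS = [
  /Googlebot(?:[ \/]|$)/, // "Googlebot" followed by a space, a slash, or end of string
  /Googlebot-Image/,
  /AdsBot-Google/,
];

// AI crawler tokens, matched as plain substrings.
const AI_CRAWLER_TOKENS = [
  "GPTBot",
  "ClaudeBot",
  "PerplexityBot",
  "YouBot",
  "Bytespider",
  "Google-Extended",
  "CCBot",
  "PetalBot",
  "Applebot-Extended",
];

// Return "googlebot", "ai_crawler", or "other" (hypothetical labels).
function classifyUserAgent(userAgent) {
  if (GOOGLEBOT_PATTERNS.some((re) => re.test(userAgent))) return "googlebot";
  if (AI_CRAWLER_TOKENS.some((token) => userAgent.includes(token))) return "ai_crawler";
  return "other";
}
```

User-agent strings are trivially spoofed, so a match on these tokens identifies what a client claims to be, not what it is.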

Nginx Log Format

Expects combined log format:

ip - - [timestamp] "METHOD /path HTTP/1.1" status bytes "referer" "user-agent"
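
A line in this format can be split into fields with a single regular expression. The sketch below is one possible parser, offered as an assumption about how such parsing might work rather than a copy of the server's implementation. The names `COMBINED_LOG` and `parseLogLine` are hypothetical, and lines that do not match (for example, requests logged as "-") yield `null`.

```javascript
// Fields: ip, ident, user, [timestamp], "METHOD path protocol", status, bytes,
// "referer", "user-agent".
const COMBINED_LOG =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)(?: [^"]*)?" (\d{3}) (\d+) "([^"]*)" "([^"]*)"$/;

// Parse one combined-format line into an object, or return null if it does not match.
function parseLogLine(line) {
  const match = COMBINED_LOG.exec(line.trim());
  if (!match) return null;

  const [, ip, timestamp, method, path, status, bytes, referer, userAgent] = match;
  return {
    ip,
    timestamp,
    method,
    path,
    status: Number(status),
    bytes: Number(bytes),
    referer,
    userAgent,
  };
}
```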

License

MIT — See LICENSE file

Author

Artur Ferreira / The GEO Lab


A project by The GEO Lab — Generative Engine Optimisation research
