ai-visibility-mcp
Audits AI-bot visibility: robots.txt per-bot for 22 AI user-agents (GPTBot/ClaudeBot/PerplexityBot/etc), Cloudflare flags, JSON-LD, sitemap, llms.txt, SPA shell, plus cross-model brand mentions via Perplexity + OpenRouter. 0-100 score. SSRF-guarded, spend-capped.
MCP server that audits and fixes how AI sees your website. Robots, schema, LLM mentions, Cloudflare AI defaults — audit the problem, generate the fix, re-audit in one loop.
Most websites are accidentally invisible to AI search. Cloudflare's bot-management defaults block GPTBot / ClaudeBot / PerplexityBot. SPAs render an empty <div id="root"> to crawlers that don't run JS. Marketing teams have no idea their brand isn't surfacing in ChatGPT, Claude, or Perplexity answers — until traffic dries up.
ai-visibility-mcp closes the audit-and-fix loop inside a single agent session:
- Audit — find what's blocking AI visibility
- Fix — generate the artifact that corrects it
- Paste — site owner applies the output
- Re-audit — verify the fix was picked up
Tools
Audit tools
| Tool | Purpose | Needs API keys? |
|---|---|---|
| `check_ai_bot_access(domain)` | Per-bot robots.txt + Cloudflare AI-default flag for 22 AI user-agents | No |
| `audit_ai_visibility(domain)` | 0-100 composite score with explainable deductions (robots, meta, JSON-LD, sitemap, llms.txt, SPA shell) | No |
| `check_llm_mention(brand, query, aliases?, models?)` | Cross-model brand surfacing (Perplexity sonar + OpenAI gpt-4o-mini + Gemini 2.0 Flash by default) | Yes |
| `compare_competitors(your_domain, competitor_domains[])` | Parallel ranked audit, max 10 in flight | No |
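To illustrate what a per-bot robots.txt check involves, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the server's implementation: the `check_bots` helper, the five-bot list, and the sample robots.txt are assumptions made for the example, and the Cloudflare flag is not covered.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset only; the server's full list of 22 user-agents is not reproduced here.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]


def check_bots(robots_txt: str, bots=AI_BOTS, path: str = "/") -> dict[str, bool]:
    """Return {bot: allowed?} for one path, parsed from robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


# Hypothetical robots.txt that blocks two bots and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

result = check_bots(SAMPLE)
blocked = [bot for bot, allowed in result.items() if not allowed]
print(blocked)
```

Parsing the text directly, rather than letting the parser fetch the URL, keeps the fetch step under the caller's control so it can go through the same SSRF checks as every other outbound request.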
Generator tools (v0.3)
| Tool | Purpose | LLM call? |
|---|---|---|
| `generate_robots_patch(domain, allow_bots?, deny_bots?)` | Corrected robots.txt that opens access to AI bots; preserves existing rules; detects Cloudflare | No |
| `generate_json_ld(url, page_type?)` | Schema.org JSON-LD block for any page; auto-detects type (Product/Article/Organization/FAQPage/SoftwareApplication/WebSite); validates required fields | Yes (gpt-4o-mini) |
| `generate_llms_txt(domain, crawl_depth?, max_pages?)` | Spec-compliant llms.txt; crawls homepage + sitemap; graceful fallback to link extraction | Yes (gpt-4o-mini) |
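As a rough picture of the "validates required fields" step for JSON-LD, the sketch below builds a `<script>` tag from already-extracted fields. The `REQUIRED_FIELDS` map and the `build_json_ld` helper are hypothetical; the tool's real rules and its LLM-based extraction are not shown in this README.

```python
import json

# Hypothetical required-field map, for illustration only.
REQUIRED_FIELDS = {
    "Organization": ["name", "url"],
    "Article": ["headline", "author", "datePublished"],
    "Product": ["name"],
}


def build_json_ld(page_type: str, fields: dict) -> str:
    """Validate the fields, then wrap the JSON-LD document in a script tag."""
    missing = [f for f in REQUIRED_FIELDS.get(page_type, []) if not fields.get(f)]
    if missing:
        raise ValueError(f"{page_type} is missing required fields: {missing}")
    doc = {"@context": "https://schema.org", "@type": page_type, **fields}
    # Escape "</" so a value cannot close the surrounding <script> element early.
    body = json.dumps(doc, separators=(",", ":")).replace("</", "<\\/")
    return f'<script type="application/ld+json">{body}</script>'


tag = build_json_ld("Organization", {"name": "Example Corp", "url": "https://example.com/"})
print(tag)
```

Failing loudly on a missing field, instead of emitting a partial block, avoids pasting markup that structured-data validators would reject.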
Why this exists
- Cloudflare flipped defaults in 2024-2025 to block AI scrapers. Most site owners never updated their config, so AI bots get challenged and bounce.
- MCP marketplaces shipped in 2026 (MCP Hive, Smithery, mcp.so, Glama). Every AI agent needs tools that can audit the real web. This is one.
- Brand visibility in LLM answers is the new SEO. Nobody has a clean stack for measuring it from a single MCP call.
Install
Requires Python 3.10+ and uv.
```bash
git clone https://github.com/bestaiinsider/ai-visibility-mcp
cd ai-visibility-mcp
uv sync
cp .env.example .env   # fill in PERPLEXITY_API_KEY / OPENROUTER_API_KEY
```
Run
```bash
# stdio transport: Claude Desktop / Claude Code
uv run ai-visibility-mcp

# HTTP transport: remote agents
uv run ai-visibility-mcp --http --port 8000
```
Claude Desktop / Claude Code config
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (Desktop) or ~/.claude.json (CLI):
```json
{
  "mcpServers": {
    "ai-visibility": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/ai-visibility-mcp", "run", "ai-visibility-mcp"]
    }
  }
}
```
Audit-and-fix loop
```
# Step 1: audit
> audit_ai_visibility(domain="example.com")
  score: 55
  warnings:
    - "9/22 AI bots disallowed — site largely invisible to AI search"
    - "no JSON-LD structured data — LLMs lose entity grounding"
    - "no /llms.txt found at root"

# Step 2: generate fixes
> generate_robots_patch(domain="example.com")
  → new_robots: "User-agent: GPTBot\nAllow: /\n\nUser-agent: ClaudeBot\nAllow: /\n..."
  → diff: unified diff of exactly what changed
  → paste_target: "/robots.txt at site root, replaces existing"

> generate_json_ld(url="https://example.com/")
  → page_type_detected: "Organization"
  → script_tag: '<script type="application/ld+json">{"@context":"https://schema.org","@type":"Organization"...}</script>'
  → paste_target: "inside <head> of the page"

> generate_llms_txt(domain="example.com")
  → content: "# Example Corp\n\n> One-sentence summary...\n\n## Pages\n- [Home](...): ..."
  → paste_target: "/llms.txt"

# Step 3: site owner pastes the three artifacts

# Step 4: re-audit
> audit_ai_visibility(domain="example.com")
  score: 95  ← was 55
```
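The `content` value returned by `generate_llms_txt` in the loop above follows the llms.txt layout: an H1 title, a blockquote summary, then a section of link lines. A minimal sketch of rendering that layout from already-crawled pages follows. The `Page` type and `render_llms_txt` function are hypothetical, and the crawling and LLM summarisation steps are left out.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str


def render_llms_txt(site_name: str, summary: str, pages: list[Page]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, link list."""
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Pages"]
    lines += [f"- [{p.title}]({p.url}): {p.description}" for p in pages]
    return "\n".join(lines) + "\n"


text = render_llms_txt(
    "Example Corp",
    "One-sentence summary of what the site offers.",
    [Page("Home", "https://example.com/", "Landing page and product overview.")],
)
print(text)
```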
Example session
```
> check_ai_bot_access(domain="bandcamp.com")
  summary: { total: 22, allowed: 13, disallowed: 9 }
  warnings: ["9/22 AI bots disallowed — site largely invisible to AI search"]
  blocked: ["GPTBot", "ClaudeBot", "Google-Extended", "Bytespider",
            "CCBot", "Meta-ExternalAgent", "FacebookBot", "Amazonbot", "Diffbot"]
```
```
> audit_ai_visibility(domain="bandcamp.com")
  score: 49
  reasons:
    -36: 9 AI bots disallowed in robots.txt
    -10: no JSON-LD structured data
    -5: no /sitemap.xml
```
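The deductions in the bandcamp.com audit add up as 100 - 36 - 10 - 5 = 49. The sketch below reproduces that arithmetic under weights inferred from this one example (4 points per blocked bot, 10 for missing JSON-LD, 5 for a missing sitemap). The server's actual weights and its other checks (meta, llms.txt, SPA shell) are not documented here, so every name and number below is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    blocked_bots: int
    has_json_ld: bool
    has_sitemap: bool


def composite_score(f: Findings) -> tuple[int, list[str]]:
    """Start from 100 and subtract one explained deduction per finding."""
    total = 100
    reasons: list[str] = []

    def deduct(points: int, why: str) -> None:
        nonlocal total
        total -= points
        reasons.append(f"-{points}: {why}")

    # Assumed weights, inferred from the single example session.
    if f.blocked_bots:
        deduct(4 * f.blocked_bots, f"{f.blocked_bots} AI bots disallowed in robots.txt")
    if not f.has_json_ld:
        deduct(10, "no JSON-LD structured data")
    if not f.has_sitemap:
        deduct(5, "no /sitemap.xml")
    return max(total, 0), reasons


score, reasons = composite_score(Findings(blocked_bots=9, has_json_ld=False, has_sitemap=False))
print(score, reasons)
```

Keeping a reason string for every deduction is what makes the score explainable: the total can always be recomputed from the listed reasons.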
```
> check_llm_mention(brand="Anthropic", query="Who makes the leading foundation AI models?")
  share_of_voice: 0.667
  by_model:
    perplexity/sonar          mentioned=true   citations=3
    openrouter/gpt-4o-mini    mentioned=true   citations=0
    openrouter/gemini-flash   mentioned=false  citations=0
  est_total_cost_usd: 0.00088
  daily_spend_usd: 0.00088 / $5.00 cap
```
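The `share_of_voice` of 0.667 is consistent with two of the three queried models mentioning the brand. A small sketch, assuming the metric is the number of mentioning models divided by the number of models queried, rounded to three decimals. The real definition may weigh citations or aliases differently.

```python
def share_of_voice(by_model: dict[str, bool]) -> float:
    """Fraction of queried models whose answer mentioned the brand."""
    if not by_model:
        return 0.0
    return round(sum(by_model.values()) / len(by_model), 3)


# Values copied from the example session above.
mentions = {
    "perplexity/sonar": True,
    "openrouter/gpt-4o-mini": True,
    "openrouter/gemini-flash": False,
}
sov = share_of_voice(mentions)
print(sov)  # 0.667
```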
Security posture
This server makes outbound HTTP requests to caller-supplied domains and to LLM providers. v0.2 hardening:
- SSRF guard. All outbound HTTP refuses loopback, link-local (AWS / GCP / Azure metadata IPs), RFC1918, CGNAT, and IPv6 ULA addresses. Redirects are re-validated.
- Daily spend cap. LLM calls are gated by `MAX_DAILY_USD` (default $5.00), persisted to `~/.cache/ai-visibility-mcp/spend.json`. Loop-amplification can't drain your Perplexity / OpenRouter credits.
- Per-call cost ceiling. `MAX_COST_PER_CALL` (default $0.10) plus `LLM_MAX_OUTPUT_TOKENS` (default 1024) hard-bounds any single tool invocation.
- No persistence of user content. Nothing is logged to disk except the daily spend totals.
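The address classes named in the SSRF guard bullet can be expressed with Python's standard `ipaddress` module. This is a sketch of the address check only, not the server's code, and `is_blocked` is a hypothetical helper. A full guard would also resolve the hostname, test every resolved address, and repeat the check on each redirect hop.

```python
import ipaddress

BLOCKED_NETWORKS = [
    ipaddress.ip_network(net)
    for net in (
        "127.0.0.0/8", "::1/128",                          # loopback
        "169.254.0.0/16", "fe80::/10",                     # link-local, incl. 169.254.169.254 metadata
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC1918
        "100.64.0.0/10",                                   # CGNAT
        "fc00::/7",                                        # IPv6 ULA
    )
]


def is_blocked(ip: str) -> bool:
    """True if the address falls in any range the guard should refuse."""
    addr = ipaddress.ip_address(ip)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it cannot bypass the IPv4 ranges.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return any(addr in net for net in BLOCKED_NETWORKS)


for candidate in ("169.254.169.254", "10.1.2.3", "100.64.0.1", "fd00::1", "93.184.216.34"):
    print(candidate, is_blocked(candidate))
```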
Configuration
| Env var | Default | Purpose |
|---|---|---|
| `PERPLEXITY_API_KEY` | — | Required for Perplexity models in `check_llm_mention` |
| `OPENROUTER_API_KEY` | — | Required for OpenAI / Gemini / Claude via OpenRouter |
| `MAX_COST_PER_CALL` | `0.10` | USD ceiling per tool invocation |
| `MAX_DAILY_USD` | `5.00` | USD ceiling per UTC day, persisted |
| `LLM_MAX_OUTPUT_TOKENS` | `1024` | Hard cap on output tokens per LLM call |
| `AI_VISIBILITY_SPEND_FILE` | `~/.cache/ai-visibility-mcp/spend.json` | Override spend ledger location |
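To show how the two spend ceilings and the persisted ledger could interact, here is a hedged sketch. The environment variable names and defaults come from the table above. The ledger layout (`{"date", "spent_usd"}`) and the `try_spend` function are assumptions, since the real file format is not documented in this README.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def try_spend(cost_usd: float, ledger: Path, max_call: float, max_daily: float) -> bool:
    """Record the cost and return True only if both ceilings hold."""
    if cost_usd > max_call:
        return False
    today = datetime.now(timezone.utc).date().isoformat()
    state = {"date": today, "spent_usd": 0.0}
    if ledger.exists():
        saved = json.loads(ledger.read_text())
        if saved.get("date") == today:  # totals start over on a new UTC day
            state = saved
    if state["spent_usd"] + cost_usd > max_daily:
        return False
    state["spent_usd"] += cost_usd
    ledger.parent.mkdir(parents=True, exist_ok=True)
    ledger.write_text(json.dumps(state))
    return True


# Limits and ledger path read from the documented environment variables.
limits = {
    "max_call": float(os.environ.get("MAX_COST_PER_CALL", "0.10")),
    "max_daily": float(os.environ.get("MAX_DAILY_USD", "5.00")),
}
ledger_path = Path(
    os.environ.get("AI_VISIBILITY_SPEND_FILE", "~/.cache/ai-visibility-mcp/spend.json")
).expanduser()
```

A call such as `try_spend(0.00088, ledger_path, **limits)` would run before each LLM request. Checking the ceiling before the request, using an estimated cost, is what stops a looping agent from overshooting the cap.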
Development
```bash
uv sync --extra dev
uv run pytest          # 40 tests
uv run ruff check .    # lint
```
Status
v0.3 — audit + fix loop complete. 7 tools (4 audit + 3 generator), 40/40 tests, SSRF-hardened, spend-capped. Smoke-verified against tealhq.com / bandcamp.com / anthropic.com.
License
MIT.