wasp-mcp
A Model Context Protocol server that lets Claude query arbitrary webpages with token-efficient, structure-aware retrieval, reducing token costs by fetching only relevant sections.
Web Agent Semantic Protocol — MCP Server
wasp-mcp is a Model Context Protocol server that lets Claude (or any MCP client) query arbitrary webpages with token-efficient, structure-aware retrieval. Instead of dumping raw HTML into the context window, WASP builds a lightweight structural index (the manifest) from a page's headings, then fetches content only for the sections relevant to a query.
The result: answers grounded in real page content at a fraction of the token cost of naive scraping.
See the WASP Whitepaper for the full protocol specification.
How It Works
Every webpage has two useful layers:
- Structure — headings and section anchors that form a table of contents. Small, cheap to index.
- Content — the text under each heading. Expensive to send in full; most is irrelevant to any given query.
WASP exploits this split with a two-tier pipeline:
Tier 1 — get_manifest(url)
↓ Try GET /.well-known/wasp.json (site-native manifest, 3 s timeout)
↓ Fall back: fetch HTML → parse headings → generate manifest client-side
→ Returns: structured index (headings, anchors, depth, token estimates)
Tier 2 — fetch_chunk(url, anchor)
↓ Resolve anchor → DOM element (getElementById → querySelector → fuzzy match)
↓ Extract section text via Range API / heading-sibling walk
→ Returns: plain-text body of that section only
query_page(url, query)
↓ get_manifest → score chunks by keyword match → fetch_chunk for top results
↓ Build numbered [1. Heading] context → call Claude API → inline [N] citations
→ Returns: { answer, sources[] }
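The Tier 1 discovery step above can be sketched roughly as follows. Only the `/.well-known/wasp.json` path and the 3 s timeout come from the description; the names `wellKnownUrl`, `tryNativeManifest`, and the injectable `Fetcher` type are hypothetical and not taken from the wasp-mcp source.

```typescript
// Illustrative sketch only: helper names are assumptions, not wasp-mcp's API.
const WELL_KNOWN_PATH = "/.well-known/wasp.json";
const DISCOVERY_TIMEOUT_MS = 3000; // the 3 s timeout mentioned above

/** Map any page URL to the site-level manifest URL on the same origin. */
function wellKnownUrl(pageUrl: string): string {
  // An absolute-path reference replaces the page's path, query, and fragment.
  return new URL(WELL_KNOWN_PATH, pageUrl).toString();
}

/** Minimal fetch-like signature so the function can be exercised with a stub. */
type Fetcher = (
  url: string,
  init?: { signal?: AbortSignal }
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

/**
 * Try the site-native manifest first. Resolves to null on timeout, network
 * error, or a non-2xx response, so the caller can fall back to generating
 * the manifest from the page's HTML.
 */
async function tryNativeManifest(
  pageUrl: string,
  fetcher: Fetcher = fetch
): Promise<unknown> {
  try {
    const res = await fetcher(wellKnownUrl(pageUrl), {
      signal: AbortSignal.timeout(DISCOVERY_TIMEOUT_MS),
    });
    return res.ok ? await res.json() : null;
  } catch {
    return null; // aborted, unreachable, or invalid JSON
  }
}
```

A caller would presumably treat a `null` result as the signal to run the client-side fallback.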
A naive full-page scrape of a typical faculty profile costs ~16,700 tokens. The same query via WASP costs ~2,700, roughly a 6× reduction.
Install
Requirements: Node.js ≥ 18, an Anthropic API key.
git clone https://github.com/seanfeeney/wasp-mcp
cd wasp-mcp
npm install
npm run build
Set your API key:
export ANTHROPIC_API_KEY=sk-ant-...
Run the server (stdio transport, for Claude Desktop / Claude Code):
node dist/index.js
Add to Claude Code
Add wasp-mcp as a local MCP server in your Claude Code project config:
claude mcp add wasp -- node /absolute/path/to/wasp-mcp/dist/index.js
Or add it manually to a project-scoped .mcp.json file in the project root:
{
"mcpServers": {
"wasp": {
"command": "node",
"args": ["/absolute/path/to/wasp-mcp/dist/index.js"],
"env": {
"ANTHROPIC_API_KEY": "sk-ant-..."
}
}
}
}
Restart Claude Code after saving. Confirm the server is live:
/mcp
MCP Tools
get_manifest
Fetches the structural index for a URL. Tries the site's own /.well-known/wasp.json first; falls back to client-side DOM generation from the fetched HTML.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | Fully-qualified URL of the page |
Example
get_manifest("https://engineering.tamu.edu/cse/profiles/aklappenecker.html")
{
"wasp": "1.0",
"url": "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
"title": "Andreas Klappenecker — Texas A&M CSE",
"summary": "Faculty profile for Andreas Klappenecker.",
"keywords": ["quantum computing", "cryptography", "image processing"],
"chunks": [
{ "id": "chunk_001", "heading": "Andreas Klappenecker", "anchor": "#wasp-001", "depth": 1, "tokens": 5, "order": 1 },
{ "id": "chunk_002", "heading": "Research Interests", "anchor": "#wasp-002", "depth": 2, "tokens": 4, "order": 2 },
{ "id": "chunk_003", "heading": "Selected Publications","anchor": "#wasp-003", "depth": 2, "tokens": 5, "order": 3 }
],
"generated": "client"
}
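As a rough illustration of the client-side fallback, the sketch below derives manifest chunks from heading tags. It is a simplification under stated assumptions: it scans HTML with a regular expression instead of a real DOM parser, and the token estimate is an assumed heuristic of about 4 characters per token applied to the heading text. `buildChunks` and its helpers are hypothetical names.

```typescript
// Illustrative sketch only: not the wasp-mcp implementation.
interface Chunk {
  id: string;
  heading: string;
  anchor: string;
  depth: number;
  tokens: number;
  order: number;
}

/** Assumed heuristic: roughly 4 characters per token. */
function estimateTokens(text: string): number {
  return Math.floor(text.length / 4);
}

/** Strip inline tags and collapse whitespace in a heading's inner HTML. */
function plainText(html: string): string {
  return html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

/** Build one chunk per non-empty <h1>..<h6> element, in document order. */
function buildChunks(html: string): Chunk[] {
  const headingRe = /<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1\s*>/gi;
  const chunks: Chunk[] = [];
  let m: RegExpExecArray | null;
  while ((m = headingRe.exec(html)) !== null) {
    const heading = plainText(m[2] ?? "");
    if (!heading) continue; // skip empty headings
    const order = chunks.length + 1;
    const seq = String(order).padStart(3, "0");
    chunks.push({
      id: `chunk_${seq}`,
      heading,
      anchor: `#wasp-${seq}`, // synthetic anchor, mirroring the example above
      depth: Number(m[1]),
      tokens: estimateTokens(heading),
      order,
    });
  }
  return chunks;
}
```

A production version would more likely use an HTML parser, since regular expressions miss edge cases such as headings inside comments or scripts.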
fetch_chunk
Retrieves the plain-text body of a single section identified by its anchor. Anchor resolution uses a three-stage fallback: getElementById → querySelector → fuzzy heading match.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | Page URL (used for cache lookup; re-fetches if not cached) |
| `anchor` | string | yes | CSS anchor string from the manifest (e.g. "#research-interests") |
Example
fetch_chunk(
"https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
"#wasp-002"
)
Quantum computing, image processing, cryptography.
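The three-stage anchor fallback might look something like the sketch below. To stay runnable without a browser, it works against a minimal `DocLike` interface; `headings()`, `slugify`, and the optional `headingHint` parameter are assumptions made for illustration, and the fuzzy stage shown here (slug comparison) is only one plausible matching rule.

```typescript
// Illustrative sketch only: interfaces and matching rule are assumptions.
interface HeadingEl {
  id: string;
  textContent: string;
}

interface DocLike {
  getElementById(id: string): HeadingEl | null;
  querySelector(selector: string): HeadingEl | null;
  /** Hypothetical helper; in a real DOM, querySelectorAll("h1,h2,h3,h4,h5,h6"). */
  headings(): HeadingEl[];
}

/** Lowercase, hyphen-separated form of a heading or anchor. */
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function resolveAnchor(
  doc: DocLike,
  anchor: string,
  headingHint?: string // e.g. the manifest heading, useful for synthetic anchors
): HeadingEl | null {
  const id = anchor.replace(/^#/, "");
  if (!id) return null;

  // Stage 1: exact id lookup.
  const byId = doc.getElementById(id);
  if (byId) return byId;

  // Stage 2: treat the anchor as a CSS selector; invalid selectors throw.
  try {
    const bySelector = doc.querySelector(anchor);
    if (bySelector) return bySelector;
  } catch {
    // fall through to the fuzzy stage
  }

  // Stage 3: fuzzy match on slugified heading text.
  const wanted = slugify(headingHint ?? id);
  if (!wanted) return null;
  const match = doc.headings().find((h) => {
    const slug = slugify(h.textContent);
    return slug === wanted || slug.includes(wanted);
  });
  return match ?? null;
}
```

Once a heading element is resolved, the section body would be collected from the nodes that follow it up to the next heading of equal or higher level.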
query_page
Full end-to-end retrieval: builds the manifest, scores chunks against the query, fetches relevant section bodies, calls Claude, and returns a cited answer.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | Page to query |
| `query` | string | yes | Natural-language question |
| `provider` | string | no | LLM provider: "claude" (default), "openai", or "ollama" |
Example
query_page(
"https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
"What are this professor's research interests?"
)
{
"answer": "Professor Klappenecker's research interests are quantum computing [1], image processing [1], and cryptography [1].",
"sources": [
{ "heading": "Research Interests", "anchor": "#wasp-002" }
]
}
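The scoring and context-building steps of `query_page` could be approximated as below. This is a minimal sketch assuming plain keyword overlap between the query and each chunk heading; the stopword list, the `topK` default, and every function name are hypothetical, and the real scoring may weigh other manifest fields.

```typescript
// Illustrative sketch only: scoring rule and names are assumptions.
interface ManifestChunk {
  heading: string;
  anchor: string;
}

const STOPWORDS = new Set([
  "a", "an", "the", "is", "are", "what", "of", "this", "to", "in", "for", "and", "s",
]);

/** Lowercase alphanumeric terms with stopwords removed. */
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0 && !STOPWORDS.has(w));
}

/** Score = number of distinct query terms that appear in the heading. */
function scoreChunk(chunk: ManifestChunk, queryTerms: string[]): number {
  const headingTerms = new Set(tokenize(chunk.heading));
  return queryTerms.filter((t) => headingTerms.has(t)).length;
}

/** Keep the topK chunks with a non-zero score; ties keep page order. */
function selectChunks(chunks: ManifestChunk[], query: string, topK = 3): ManifestChunk[] {
  const terms = Array.from(new Set(tokenize(query)));
  return chunks
    .map((chunk, index) => ({ chunk, index, score: scoreChunk(chunk, terms) }))
    .filter((e) => e.score > 0)
    .sort((a, b) => b.score - a.score || a.index - b.index)
    .slice(0, topK)
    .map((e) => e.chunk);
}

/** Number each section so the model can cite it inline as [N]. */
function buildContext(sections: { heading: string; body: string }[]): string {
  return sections.map((s, i) => `[${i + 1}. ${s.heading}]\n${s.body}`).join("\n\n");
}
```

With the example manifest and query above, only the "Research Interests" chunk scores above zero, which is consistent with the single entry in `sources`.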
Token Efficiency
| Approach | Tokens sent to LLM | Example page |
|---|---|---|
| Raw HTML scrape | ~16,700 | TAMU faculty profile |
| WASP `query_page` | ~2,700 | same page, same query |
| Reduction | 6.1× | |
Token savings grow with page length. A 50,000-token documentation page may see a 20–40× reduction when only 2–3 sections are relevant.
Project Structure
wasp-mcp/
index.ts MCP server entry — registers tools
manifest.ts get_manifest() — discovery + DOM generation
chunks.ts fetch_chunk() — anchor resolution + text extraction
retrieval.ts query_page() — scoring, enrichment, LLM call
providers.ts claude / openai / ollama provider adapters
cache.ts In-memory URL → { manifest, html } cache with TTL
types.ts Shared TypeScript types
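For readers curious about what an in-memory cache with TTL involves, here is a minimal sketch. The class name `TtlCache`, the injectable clock, and the lazy eviction on read are assumptions made for illustration; cache.ts may be structured differently.

```typescript
// Illustrative sketch only: not the contents of cache.ts.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  /** `now` is injectable so expiry can be tested without real waiting. */
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  /** Return the cached value, or undefined if missing or expired. */
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  /** Store a value that expires ttlMs after the current time. */
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A page cache keyed by URL could then be declared as `new TtlCache<{ manifest: unknown; html: string }>(300000)`, where the five-minute TTL is an arbitrary example value.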
License
MIT © Sean Feeney, 2026