wasp-mcp

A Model Context Protocol server that lets Claude query arbitrary webpages with token-efficient, structure-aware retrieval, reducing token costs by fetching only relevant sections.

Web Agent Semantic Protocol — MCP Server

wasp-mcp is a Model Context Protocol server that lets Claude (or any MCP client) query arbitrary webpages with token-efficient, structure-aware retrieval. Instead of dumping raw HTML into the context window, WASP builds a lightweight structural index (the manifest) from a page's headings, then fetches content only for the sections relevant to a query.

The result: answers grounded in real page content at a fraction of the token cost of naive scraping.

See the WASP Whitepaper for the full protocol specification.


How It Works

Every webpage has two useful layers:

  1. Structure — headings and section anchors that form a table of contents. Small, cheap to index.
  2. Content — the text under each heading. Expensive to send in full; most is irrelevant to any given query.

WASP exploits this split with a two-tier pipeline:

Tier 1 — get_manifest(url)
  ↓ Try GET /.well-known/wasp.json (site-native manifest, 3 s timeout)
  ↓ Fall back: fetch HTML → parse headings → generate manifest client-side
  → Returns: structured index (headings, anchors, depth, token estimates)

Tier 2 — fetch_chunk(url, anchor)
  ↓ Resolve anchor → DOM element (getElementById → querySelector → fuzzy match)
  ↓ Extract section text via Range API / heading-sibling walk
  → Returns: plain-text body of that section only

query_page(url, query)
  ↓ get_manifest → score chunks by keyword match → fetch_chunk for top results
  ↓ Build numbered [1. Heading] context → call Claude API → inline [N] citations
  → Returns: { answer, sources[] }

A naive full-page scrape of a typical faculty profile costs ~16,700 tokens. The same query via WASP costs ~2,700 — a 6× reduction.
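
The Tier 1 discovery step in the pipeline above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the server's actual code: the names wellKnownManifestUrl, discoverManifest, and Fetcher are hypothetical, and resolving /.well-known/wasp.json against the page's origin is an assumption. The fetch function is injected so the sketch has no dependencies; in Node.js 18 or later the global fetch could be passed in.

```typescript
// Hypothetical minimal shape of a fetch-like function (assumption, not a wasp-mcp type).
type Fetcher = (
  url: string,
  init: { signal: AbortSignal }
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Resolve the site-native manifest location against the page's origin (assumed behavior).
function wellKnownManifestUrl(pageUrl: string): string {
  return new URL("/.well-known/wasp.json", pageUrl).toString();
}

// Try the site-native manifest; resolve to null on timeout, network error,
// non-2xx status, or invalid JSON so the caller can fall back to DOM generation.
async function discoverManifest(
  pageUrl: string,
  fetcher: Fetcher,
  timeoutMs: number = 3000 // the 3 s timeout from the pipeline description
): Promise<unknown> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetcher(wellKnownManifestUrl(pageUrl), {
      signal: controller.signal,
    });
    return res.ok ? await res.json() : null;
  } catch {
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

A null result is the cue to fetch the page HTML and generate the manifest from its headings instead.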


Install

Requirements: Node.js ≥ 18, an Anthropic API key.

git clone https://github.com/seanfeeney/wasp-mcp
cd wasp-mcp
npm install
npm run build

Set your API key:

export ANTHROPIC_API_KEY=sk-ant-...

Run the server (stdio transport, for Claude Desktop / Claude Code):

node dist/index.js

Add to Claude Code

Add wasp-mcp as a local MCP server in your Claude Code project config:

claude mcp add wasp -- node /absolute/path/to/wasp-mcp/dist/index.js

Or add it manually to .mcp.json at the project root (the file Claude Code reads for project-scoped MCP servers):

{
  "mcpServers": {
    "wasp": {
      "command": "node",
      "args": ["/absolute/path/to/wasp-mcp/dist/index.js"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Restart Claude Code after saving. Confirm the server is live:

/mcp

MCP Tools

get_manifest

Fetches the structural index for a URL. Tries the site's own /.well-known/wasp.json first; falls back to client-side DOM generation from the fetched HTML.

Parameters

Name  Type    Required  Description
url   string  yes       Fully-qualified URL of the page

Example

get_manifest("https://engineering.tamu.edu/cse/profiles/aklappenecker.html")

Returns:

{
  "wasp": "1.0",
  "url": "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
  "title": "Andreas Klappenecker — Texas A&M CSE",
  "summary": "Faculty profile for Andreas Klappenecker.",
  "keywords": ["quantum computing", "cryptography", "image processing"],
  "chunks": [
    { "id": "chunk_001", "heading": "Andreas Klappenecker", "anchor": "#wasp-001", "depth": 1, "tokens": 5, "order": 1 },
    { "id": "chunk_002", "heading": "Research Interests",   "anchor": "#wasp-002", "depth": 2, "tokens": 4, "order": 2 },
    { "id": "chunk_003", "heading": "Selected Publications","anchor": "#wasp-003", "depth": 2, "tokens": 5, "order": 3 }
  ],
  "generated": "client"
}
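
The client-side fallback can be pictured as a pass over the page's heading tags. The sketch below is an illustration under several assumptions, not wasp-mcp's implementation: it uses a regular expression rather than a real HTML parser, the names Chunk, stripTags, and headingsToChunks are hypothetical, and the chunk_NNN / #wasp-NNN numbering and the length-divided-by-four token estimate are guesses chosen only to be consistent with the example output above.

```typescript
// Shape of one manifest entry, mirroring the fields in the example output.
interface Chunk {
  id: string;
  heading: string;
  anchor: string;
  depth: number;
  tokens: number;
  order: number;
}

// Remove inline markup and collapse whitespace (HTML entities are not decoded here).
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

// Build manifest chunks from <h1>..<h6> elements in document order.
function headingsToChunks(html: string): Chunk[] {
  const headingPattern = /<h([1-6])\b[^>]*>([\s\S]*?)<\/h\1>/gi;
  const chunks: Chunk[] = [];
  let match: RegExpExecArray | null;
  while ((match = headingPattern.exec(html)) !== null) {
    const heading = stripTags(match[2]);
    if (heading === "") continue; // skip empty headings
    const order = chunks.length + 1;
    const suffix = String(order).padStart(3, "0");
    chunks.push({
      id: `chunk_${suffix}`,
      heading,
      anchor: `#wasp-${suffix}`, // assumed synthetic anchor scheme
      depth: Number(match[1]),
      tokens: Math.floor(heading.length / 4), // assumed rough estimate
      order,
    });
  }
  return chunks;
}
```

A production version would more likely walk a parsed DOM, which also handles malformed markup and nested elements more reliably than a regular expression.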

fetch_chunk

Retrieves the plain-text body of a single section identified by its anchor. Anchor resolution uses a three-stage fallback: getElementById → querySelector → fuzzy heading match.

Parameters

Name    Type    Required  Description
url     string  yes       Page URL (used for cache lookup; re-fetches if not cached)
anchor  string  yes       CSS anchor string from the manifest (e.g. "#research-interests")

Example

fetch_chunk(
  "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
  "#wasp-002"
)

Returns:

Quantum computing, image processing, cryptography.
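
The three-stage anchor fallback described above could look something like the following. This is a sketch under assumptions, not the project's code: ElementLike and DocumentLike are hypothetical minimal interfaces standing in for whatever DOM implementation the server uses, resolveAnchor and slug are made-up names, and "fuzzy heading match" is interpreted here as comparing slugified heading text, which may differ from the real matching rule.

```typescript
// Hypothetical minimal DOM surface needed for anchor resolution.
interface ElementLike {
  id: string;
  textContent: string | null;
}
interface DocumentLike {
  getElementById(id: string): ElementLike | null;
  querySelector(selector: string): ElementLike | null;
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

// Normalize text to a lowercase, hyphen-separated slug for loose comparison.
function slug(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function resolveAnchor(doc: DocumentLike, anchor: string): ElementLike | null {
  const id = anchor.replace(/^#/, "");

  // Stage 1: exact id lookup.
  const byId = doc.getElementById(id);
  if (byId) return byId;

  // Stage 2: treat the anchor as a CSS selector; invalid selectors may throw.
  try {
    const bySelector = doc.querySelector(anchor);
    if (bySelector) return bySelector;
  } catch {
    // fall through to the fuzzy stage
  }

  // Stage 3: fuzzy match against heading text (assumed slug comparison).
  const wanted = slug(id);
  const headings = doc.querySelectorAll("h1, h2, h3, h4, h5, h6");
  for (let i = 0; i < headings.length; i++) {
    const heading = headings[i];
    if (slug(heading.textContent ?? "") === wanted) return heading;
  }
  return null;
}
```

Ordering the stages from cheapest and most exact to loosest means a page with well-formed ids never reaches the fuzzy comparison.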

query_page

Full end-to-end retrieval: builds the manifest, scores chunks against the query, fetches relevant section bodies, calls the selected LLM provider (Claude by default), and returns a cited answer.

Parameters

Name      Type    Required  Description
url       string  yes       Page to query
query     string  yes       Natural-language question
provider  string  no        "claude" (default) | "openai" | "ollama"

Example

query_page(
  "https://engineering.tamu.edu/cse/profiles/aklappenecker.html",
  "What are this professor's research interests?"
)

Returns:

{
  "answer": "Professor Klappenecker's research interests are quantum computing [1], image processing [1], and cryptography [1].",
  "sources": [
    { "heading": "Research Interests", "anchor": "#wasp-002" }
  ]
}
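
The scoring and context-building steps of query_page can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names (terms, scoreChunk, selectChunks, buildContext) are hypothetical, scoring only against heading text is a simplification, and the real keyword matcher may weight terms differently or also use the manifest's keywords.

```typescript
// Lowercase, split on non-alphanumerics, drop very short words, de-duplicate.
function terms(text: string): string[] {
  const words = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 2);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Score = number of distinct query terms that appear in the heading.
function scoreChunk(query: string, heading: string): number {
  const headingTerms = terms(heading);
  return terms(query).filter((t) => headingTerms.indexOf(t) !== -1).length;
}

// Keep the best-scoring chunks, dropping those with no keyword overlap.
function selectChunks<T extends { heading: string }>(
  chunks: T[],
  query: string,
  limit: number = 3
): T[] {
  return chunks
    .map((chunk) => ({ chunk, score: scoreChunk(query, chunk.heading) }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((scored) => scored.chunk);
}

// Number each section so the model can cite it inline as [N].
function buildContext(sections: { heading: string; body: string }[]): string {
  return sections
    .map((s, i) => `[${i + 1}. ${s.heading}]\n${s.body}`)
    .join("\n\n");
}
```

For the example query above, only the "Research Interests" heading shares terms with the question, so under this scoring it would be the single section fetched and cited as [1].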

Token Efficiency

Approach         Tokens sent to LLM  Example page
Raw HTML scrape  ~16,700             TAMU faculty profile
WASP query_page  ~2,700              same page, same query
Reduction        6.1×

Token savings grow with page length. A 50,000-token documentation page may see a 20–40× reduction when only 2–3 sections are relevant.
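
The arithmetic behind such estimates is simple: for a 50,000-token page, a 20–40× reduction corresponds to sending roughly 1,250 to 2,500 tokens. The sketch below works through one case. The function names and the 500-token figures are hypothetical, and counting a fixed overhead (prompt and manifest) alongside the section bodies is an assumption about what gets sent.

```typescript
// Tokens sent for one query: assumed fixed overhead plus the selected section bodies.
function tokensSent(overheadTokens: number, sectionTokens: number[]): number {
  return overheadTokens + sectionTokens.reduce((sum, t) => sum + t, 0);
}

// Reduction factor relative to sending the whole page.
function reductionFactor(fullPageTokens: number, sentTokens: number): number {
  return fullPageTokens / sentTokens;
}

// Hypothetical page: 50,000 tokens in full, 500 tokens of overhead,
// and three relevant sections of 500 tokens each.
const sent = tokensSent(500, [500, 500, 500]); // 2000
console.log(reductionFactor(50000, sent)); // 25
```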


Project Structure

wasp-mcp/
  index.ts        MCP server entry — registers tools
  manifest.ts     get_manifest() — discovery + DOM generation
  chunks.ts       fetch_chunk() — anchor resolution + text extraction
  retrieval.ts    query_page() — scoring, enrichment, LLM call
  providers.ts    claude / openai / ollama provider adapters
  cache.ts        In-memory URL → { manifest, html } cache with TTL
  types.ts        Shared TypeScript types
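
As an illustration of what an in-memory cache with TTL like the one listed for cache.ts might involve, here is a minimal sketch. It is not the project's code: the class name TtlCache, its method names, and the injectable clock are all hypothetical, and the real module may evict entries differently.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

// Minimal in-memory cache whose entries expire after a fixed time-to-live.
class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns undefined for missing or expired keys; expired entries are evicted lazily.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```

Keyed by URL with a value of the form { manifest, html }, a cache like this would let fetch_chunk reuse the page that get_manifest already downloaded.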

License

MIT © Sean Feeney, 2026
