Fetch MCP Server

Reduces token consumption by 73-87% by cleaning web and API data before it reaches the LLM context window. Supports fetching URLs, searching the web, optimizing JSON, and more.

The high-efficiency networking layer for LLMs. Reduce token consumption by 73–87% by cleaning web and API data before it hits your context window.

No API keys required — search is powered by DuckDuckGo.

Why

When an LLM fetches a URL or calls an API, most of the response is noise — nav bars, scripts, tracking pixels, templated API URLs, null fields, repeated sub-objects. You pay for all of it in tokens, latency, and reduced reasoning room.

Fetch MCP sits between your agent and the network. It strips the noise, returns only what matters, and lets the agent drill into specifics on demand.

How It Works

Agent calls smart_fetch(url)
        │
        ▼
   ┌─────────┐
   │  Fetch   │
   └────┬─────┘
        │
   HTML ▼          JSON ▼
┌──────────────┐  ┌──────────────────────┐
│ → Markdown   │  │ Strip URL templates  │
│ Strip noise  │  │ Remove nulls/empties │
│ 73% savings  │  │ Dedup sub-objects    │
└──────────────┘  │ Schema-first mode    │
                  │ 87% savings          │
                  └──────────────────────┘

For JSON, the default behavior is schema-first: large arrays return the structure + 2 sample items instead of all data. The agent then uses jsonpath to fetch exactly what it needs.

1. smart_fetch("https://api.github.com/orgs/python/repos")
   → { _schema: {id: int, name: string, ...}, _count: 30, _sample: [...2 items] }

2. smart_fetch("https://api.github.com/orgs/python/repos", jsonpath="$[*].name")
   → ["cpython", "mypy", "typeshed", ...]
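The schema-first response above can be approximated in a few lines. This is a minimal sketch, not the server's actual implementation: the `_schema`, `_count`, and `_sample` keys, the 2-sample size, and the 5-item threshold come from this README, while `type_name` and `schema_first` are hypothetical helper names and the type labels are an assumption.

```python
import json

SAMPLE_SIZE = 2  # the README states that 2 sample items are returned
MIN_ITEMS = 5    # the README states that arrays with 5+ items are summarized


def type_name(value):
    """Map a JSON value to a short type label (labels are an assumption)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def schema_first(items):
    """Summarize a large list of dicts as schema + count + a few samples."""
    if not isinstance(items, list) or len(items) < MIN_ITEMS:
        return items  # small payloads pass through unchanged in this sketch
    schema = {}
    for item in items:
        if isinstance(item, dict):
            for key, value in item.items():
                # Record the type of the first value seen for each key.
                schema.setdefault(key, type_name(value))
    return {
        "_schema": schema,
        "_count": len(items),
        "_sample": items[:SAMPLE_SIZE],
    }


# Made-up data standing in for a 30-item API response.
repos = [{"id": i, "name": f"repo-{i}", "fork": False} for i in range(30)]
summary = schema_first(repos)
print(json.dumps(summary["_schema"]))
```

The summary stays the same size however long the array is, which is why list endpoints see the largest savings.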

Token Savings

To reproduce, run uv run python scripts/benchmark.py. Results from real endpoints:

HTML → Markdown

| Page | Raw tokens | Optimized | Saved |
|---|---|---|---|
| GitHub Blog | 92,352 | 26,459 | 71% |
| Hacker News | 11,790 | 4,237 | 64% |
| MDN: JavaScript | 51,417 | 8,855 | 83% |
| BBC News | 116,111 | 27,207 | 77% |
| Rust Lang | 5,107 | 1,163 | 77% |
| Go pkg: net/http | 121,427 | 55,383 | 54% |
| Python docs: asyncio | 6,692 | 1,473 | 78% |
| Socket.dev: Axios compromise | 138,981 | 23,788 | 83% |
| Total | 543,877 | 148,565 | 73% |

JSON → Schema-first

| Endpoint | Raw tokens | Pruned | Schema-first | Best |
|---|---|---|---|---|
| GitHub API: repos | 16,518 | 7,055 | 2,474 | 85% |
| GitHub API: issues | 20,790 | 16,690 | 3,785 | 82% |
| JSONPlaceholder: posts | 8,761 | 8,761 | 315 | 96% |
| JSONPlaceholder: todos | 8,240 | 8,240 | 202 | 98% |
| JSONPlaceholder: users | 1,839 | 1,839 | 529 | 71% |
| JSONPlaceholder: comments | 492 | 479 | 330 | 33% |
| npm: typescript | 1,750 | 1,745 | n/a | 0% |
| OpenLibrary: search | 1,646 | 1,640 | n/a | 0% |
| Total | 60,036 | | 11,020 | 82% |

At Sonnet pricing ($3 per million tokens), the 444,328 tokens saved across both tables come to about $1.33 per benchmark batch. At Opus pricing ($15 per million tokens), about $6.66.
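The dollar figures follow directly from the two Total rows. The sketch below redoes that arithmetic; the token counts and per-million prices are the ones quoted above, and `dollars_saved` is a hypothetical helper name.

```python
# Totals copied from the HTML and JSON benchmark tables above.
html_raw, html_optimized = 543_877, 148_565
json_raw, json_best = 60_036, 11_020

# Tokens that never reach the context window across both batches.
tokens_saved = (html_raw - html_optimized) + (json_raw - json_best)


def dollars_saved(tokens, price_per_million):
    """Cost of the given token count at a flat per-million-token price."""
    return tokens * price_per_million / 1_000_000


sonnet = dollars_saved(tokens_saved, 3.0)   # $3 per million tokens
opus = dollars_saved(tokens_saved, 15.0)    # $15 per million tokens
print(tokens_saved, f"${sonnet:.2f}", f"${opus:.2f}")
```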

Tools

| Tool | What it does |
|---|---|
| smart_fetch | Fetch any URL; auto-optimizes HTML (→ markdown) and JSON (→ schema-first) |
| browser_fetch | Fetch JavaScript-rendered pages with Playwright/Chrome |
| web_search | Search the web via DuckDuckGo, no API key needed |
| css_query | Fetch a page, return only elements matching a CSS selector |
| pdf_fetch | Fetch a PDF URL and return its text content (requires pdfminer.six) |
| optimize_json | Optimize any JSON blob; use on output from other MCP servers |

smart_fetch

Fetches a URL and auto-detects the content type:

  • HTML — strips navigation, ads, scripts, and tracking. Converts to clean markdown.
  • JSON arrays (5+ items) — returns schema + 2 sample items. Use jsonpath to drill in.
  • JSON objects / small arrays — prunes empty values, strips URL templates, deduplicates.

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | str | required | URL to fetch |
| jsonpath | str | None | JSONPath to extract specific fields (e.g. $[*].name, $[?@.id==42]) |
| max_depth | int | 5 | Max JSON nesting depth before flattening to dot-notation |
| extract_metadata | bool | False | Include YAML frontmatter with page metadata (HTML only) |
| max_chars | int | 20000 | Maximum characters in output (1,000–100,000) |
| headers | dict | None | Optional HTTP headers (e.g. {"Authorization": "Bearer token"}) |
| use_cache | bool | True | Return cached response if available (TTL-scoped per URL + params) |
| ttl | int | 1800 | Cache TTL in seconds (60–86400) |
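The use_cache and ttl parameters describe a cache scoped per URL and parameters. A minimal in-memory sketch of that idea follows. It is an illustration under assumptions, not the server's cache: the `TTLCache` class, its key derivation, and the choice to clamp (rather than reject) out-of-range TTLs are all hypothetical.

```python
import hashlib
import json
import time


class TTLCache:
    """Tiny in-memory cache where each entry carries its own expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    @staticmethod
    def make_key(url, **params):
        # Sort keys so identical parameters always produce the same key.
        payload = json.dumps({"url": url, "params": params}, sort_keys=True, default=str)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value, ttl=1800):
        # Assumption: clamp to the documented 60-86400 second range.
        ttl = max(60, min(ttl, 86_400))
        self._entries[key] = (self._clock() + ttl, value)


# Usage with a fake clock so the example runs instantly.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
key = TTLCache.make_key("https://example.com", jsonpath="$[*].name")
cache.set(key, "cached body", ttl=60)
hit = cache.get(key)
now[0] = 61.0  # advance past the TTL
miss = cache.get(key)
```

Including the parameters in the key means the same URL fetched with a different jsonpath is cached separately.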

browser_fetch

Fetches a URL with Playwright/Chrome, waits for the rendered page, and converts the final HTML to markdown.

Use this for pages that block simple HTTP clients or require JavaScript rendering. It does not bypass CAPTCHA; use headed mode when a human needs to complete a challenge or login before extraction.

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | str | required | URL to fetch |
| selector | str | None | Optional CSS selector to extract from the rendered page |
| wait_ms | int | 3000 | Milliseconds to wait after DOMContentLoaded |
| timeout_ms | int | 30000 | Navigation timeout in milliseconds |
| headed | bool | False | Open a visible browser window for manual CAPTCHA/login |
| extract_metadata | bool | False | Include YAML frontmatter with page metadata |
| max_chars | int | 20000 | Maximum characters in output (1,000–100,000) |
| headers | dict | None | Optional HTTP headers injected into the browser context |

web_search

Search the web via DuckDuckGo. Returns results as a markdown list.

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | required | Search query |
| max_results | int | 10 | Number of results (1–20) |
| region | str | "wt-wt" | Region code ("us-en", "wt-wt" for global) |

css_query

Fetch a page and return only content matching a CSS selector. Use when you know exactly which part of a page you need (a pricing table, an article body, a specific div).

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | str | required | URL to fetch |
| selector | str | required | CSS selector (e.g. #pricing-table, .product-card, article) |
| max_chars | int | 20000 | Maximum characters in output (1,000–100,000) |
| use_cache | bool | True | Return cached response if available |
| ttl | int | 1800 | Cache TTL in seconds (60–86400) |

pdf_fetch

Fetch a URL that serves a PDF and return its text as plain markdown. Falls back to HTML→markdown if the URL does not return a PDF.

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | str | required | URL of a PDF document |
| pages | str | None | Page range to extract, e.g. "1-5" or "3". Default: all pages. |
| headers | dict | None | Optional HTTP headers (e.g. {"Authorization": "Bearer token"}) |
| max_chars | int | 20000 | Maximum characters in output (1,000–100,000) |
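The pages parameter accepts either a range such as "1-5" or a single page such as "3". One way such a string could be turned into page indices is sketched below. The `parse_pages` helper is hypothetical, and it assumes page numbers are 1-based and inclusive, which the README does not state explicitly.

```python
def parse_pages(pages, total_pages):
    """Convert "1-5" or "3" into a list of zero-based page indices.

    Assumes 1-based, inclusive page numbers; None selects every page.
    """
    if pages is None:
        return list(range(total_pages))
    start_text, separator, end_text = pages.strip().partition("-")
    start = int(start_text)
    end = int(end_text) if separator else start
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {pages!r}")
    end = min(end, total_pages)  # ignore pages past the end of the document
    return list(range(start - 1, end))


first_five = parse_pages("1-5", total_pages=10)
single = parse_pages("3", total_pages=10)
everything = parse_pages(None, total_pages=3)
```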

optimize_json

Optimize any JSON payload — from other MCP servers, API responses, or files. This is the key tool for reducing token usage across your entire MCP stack.

Accepts raw JSON strings or file paths. When an MCP tool response is too large and gets saved to a file by Claude, pass the file path directly.

| Parameter | Type | Default | Description |
|---|---|---|---|
| data | str | required | Raw JSON string, or a file path to a JSON file |
| jsonpath | str | None | JSONPath to extract specific fields |
| max_depth | int | 5 | Max nesting depth before flattening |
| max_chars | int | 20000 | Maximum characters in output (1,000–100,000) |

Typical workflow with other MCP servers:

1. Call mcp__github__list_pull_requests → agent gets large JSON response
2. Call optimize_json(data=<response>) → schema + 2 samples, 85% fewer tokens
3. Call optimize_json(data=<response>, jsonpath="$[?@.state=='open'].title") → exactly what's needed
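To make the JSONPath in step 3 concrete, here is what `$[?@.state=='open'].title` selects, written as an equivalent plain-Python comprehension. The pull request records are made-up sample data, and this only illustrates the query's meaning; it is not how the tool evaluates JSONPath.

```python
# Made-up stand-in for a list_pull_requests response.
pull_requests = [
    {"number": 1, "state": "open", "title": "Add retry logic"},
    {"number": 2, "state": "closed", "title": "Fix typo"},
    {"number": 3, "state": "open", "title": "Bump dependencies"},
]

# $[?@.state=='open'].title
# Filter the top-level array on state, then project the title field.
open_titles = [pr["title"] for pr in pull_requests if pr.get("state") == "open"]

# $[*].number
# Project one field from every element, with no filter.
all_numbers = [pr["number"] for pr in pull_requests]

print(open_titles)
```

Only the matching titles come back, so the follow-up call costs a handful of tokens regardless of how large the original response was.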

JSON Optimization Pipeline

Applied by both smart_fetch (on JSON URLs) and optimize_json (on any JSON blob):

| Step | What it does | Impact |
|---|---|---|
| Schema-first mode | Large arrays → structure + 2 samples | Huge on list endpoints |
| URL template stripping | Removes forks_url, keys_url{/key_id}, etc. | ~30 keys per object in REST APIs |
| Empty/null removal | Strips null, "", [], {} | Moderate |
| Sub-object dedup | Identical nested dicts (e.g. owner) extracted once | Large on org/user APIs |
| Deep flattening | Dicts beyond max_depth → dot-notation keys | Prevents runaway nesting |
| JSONPath drill-in | Extract only matching fields on follow-up calls | Surgical precision |
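Three of the steps in the table (URL template stripping, empty/null removal, and deep flattening) can be sketched in plain Python. This is an illustration under assumptions rather than the project's code: `prune` and `flatten` are hypothetical names, dropping every key that ends in `_url` is a guess at how URL templates are detected, and the exact flattened shape is assumed.

```python
def is_empty(value):
    """True for the values the table lists as removable: null, "", [], {}."""
    return value is None or value == "" or value == [] or value == {}


def prune(value):
    """Recursively drop *_url keys and empty values; keep 0 and False."""
    if isinstance(value, dict):
        cleaned = {}
        for key, child in value.items():
            if key.endswith("_url"):  # assumption: suffix match identifies URL keys
                continue
            child = prune(child)
            if not is_empty(child):
                cleaned[key] = child
        return cleaned
    if isinstance(value, list):
        pruned_items = (prune(item) for item in value)
        return [item for item in pruned_items if not is_empty(item)]
    return value


def _dotted(mapping):
    """Collapse a nested dict into a single level of dot-notation keys."""
    out = {}
    for key, child in mapping.items():
        if isinstance(child, dict) and child:
            for sub_key, sub_value in _dotted(child).items():
                out[f"{key}.{sub_key}"] = sub_value
        else:
            out[key] = child
    return out


def flatten(value, max_depth, depth=1):
    """Keep dict nesting up to max_depth, then switch to dot-notation keys.

    Lists are left untouched in this sketch.
    """
    if not isinstance(value, dict):
        return value
    if depth < max_depth:
        return {k: flatten(v, max_depth, depth + 1) for k, v in value.items()}
    return _dotted(value)


repo = {
    "name": "cpython",
    "forks_url": "https://api.github.com/repos/python/cpython/forks",
    "keys_url": "https://api.github.com/repos/python/cpython/keys{/key_id}",
    "description": None,
    "topics": [],
    "owner": {"login": "python", "bio": ""},
    "stargazers_count": 0,
}
pruned = prune(repo)
flat = flatten({"deep": {"nested": {"data": 1}}}, max_depth=2)
```

Here `pruned` keeps only `name`, `owner.login`, and `stargazers_count`, and `flat` becomes `{"deep": {"nested.data": 1}}`, so nesting stops at two levels.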

CLI

The fetcher and optimizer are also available as a standalone CLI for shell pipes, scripts, and hooks.

# Smart-fetch any URL
uv run fetch-mcp smart_fetch https://example.com

# Smart-fetch JSON and extract specific fields with JSONPath
uv run fetch-mcp smart_fetch https://api.github.com/orgs/python/repos --jsonpath '$[*].name'

# Browser-fetch a JavaScript-rendered or HTTP-client-blocked page
uv run fetch-mcp browser_fetch https://example.com

# Open a visible browser for manual CAPTCHA/login, then extract after waiting
uv run fetch-mcp browser_fetch https://example.com --headed --wait-ms 30000

# Fetch a PDF and extract its text
uv run fetch-mcp pdf_fetch https://example.com/paper.pdf

# Extract specific pages from a PDF
uv run fetch-mcp pdf_fetch https://example.com/report.pdf --pages 1-5

# Optimize any JSON from stdin
curl -s https://api.github.com/orgs/python/repos | uv run fetch-mcp optimize

# Extract specific fields with JSONPath
cat response.json | uv run fetch-mcp optimize --jsonpath '$[*].name'

# Control nesting depth
echo '{"deep": {"nested": {"data": 1}}}' | uv run fetch-mcp optimize --max-depth 2

# View savings report
uv run fetch-mcp report

Savings Tracking

Every call to optimize_json, smart_fetch, and the CLI logs the before/after character counts to ~/.local/share/fetch-mcp/savings.jsonl. View the cumulative report:

uv run fetch-mcp report
Source                          Calls    Raw chars    Opt chars        Saved       %
------------------------------------------------------------------------------------
optimize_json                      12      284,103       41,220      242,883   85.5%
smart_fetch:https://api.gith       3       59,986       24,823       35,163   58.6%
hook:mcp__jira__jira_search         5       93,052       93,052            0    0.0%
------------------------------------------------------------------------------------
TOTAL                              20      437,141      159,095      278,046   63.6%

The hook:* entries track raw MCP response sizes before optimization. The optimize_json entries track actual savings.

Override the log path with REQUEST_MCP_SAVINGS_LOG=/custom/path.jsonl.
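Because the log is JSON Lines, a report like the one above can be rebuilt with a short script. The sketch below assumes each record carries the `source`, `raw_chars`, and `opt_chars` fields written by the hook command in the Integration section; the `summarize` function and the sample records are hypothetical.

```python
import json
from collections import defaultdict


def summarize(lines):
    """Aggregate savings records per source from an iterable of JSONL lines."""
    totals = defaultdict(lambda: {"calls": 0, "raw": 0, "opt": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        row = totals[record["source"]]
        row["calls"] += 1
        row["raw"] += record["raw_chars"]
        row["opt"] += record["opt_chars"]

    report = {}
    for source, row in totals.items():
        saved = row["raw"] - row["opt"]
        pct = 100 * saved / row["raw"] if row["raw"] else 0.0
        report[source] = {**row, "saved": saved, "pct": round(pct, 1)}
    return report


# Made-up records in the assumed log format.
sample_log = [
    '{"source": "optimize_json", "raw_chars": 1000, "opt_chars": 150}',
    '{"source": "optimize_json", "raw_chars": 1000, "opt_chars": 250}',
    '{"source": "hook:mcp__jira__jira_search", "raw_chars": 500, "opt_chars": 500}',
]
report = summarize(sample_log)
```

To run it against a real log, pass an open file handle instead of the sample list, since a file object also yields one line at a time.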

Setup

No local clone required — run directly from GitHub with uv:

uvx --from git+https://github.com/micaelmalta/fetch-mcp.git fetch-mcp

Or clone locally for development:

git clone https://github.com/micaelmalta/fetch-mcp.git && cd fetch-mcp
uv sync --group dev

Install as a Claude skill:

curl -fsSL https://raw.githubusercontent.com/micaelmalta/fetch-mcp/main/install.sh | bash

Install a different branch or tag:

curl -fsSL https://raw.githubusercontent.com/micaelmalta/fetch-mcp/main/install.sh | REQUEST_MCP_REF=your-branch-or-tag bash

Integration

Claude Code

1. Add the MCP server:

claude mcp add fetch-mcp -- uvx --from git+https://github.com/micaelmalta/fetch-mcp.git fetch-mcp

2. (Optional) Instruct the agent via CLAUDE.md:

## JSON Optimization

When any MCP tool (GitHub, Jira, Datadog, Confluence, etc.) returns a JSON response
larger than ~50 lines, pass it through the `optimize_json` tool from fetch-mcp before
reasoning over it. You can pass raw JSON or a file path directly. Use jsonpath to drill
into specifics rather than consuming the full payload.

3. (Optional) Auto-hook for logging + nudging:

Add to ~/.claude/settings.json to automatically log MCP response sizes and remind the agent to optimize:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "mcp__github__*|mcp__jira__*|mcp__datadog__*|mcp__confluence__*",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '{tool: .tool_name, chars: (.tool_response | tostring | length)}' | jq -r '\"\\(.tool) \\(.chars)\"' | { read -r tool chars; mkdir -p ~/.local/share/fetch-mcp; echo \"{\\\"ts\\\":\\\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\\\",\\\"source\\\":\\\"hook:$tool\\\",\\\"raw_chars\\\":$chars,\\\"opt_chars\\\":$chars,\\\"saved_chars\\\":0,\\\"saved_pct\\\":0}\" >> ~/.local/share/fetch-mcp/savings.jsonl; echo \"{\\\"hookSpecificOutput\\\":{\\\"hookEventName\\\":\\\"PostToolUse\\\",\\\"additionalContext\\\":\\\"MCP response was ${chars} chars. Pipe it through optimize_json from fetch-mcp to reduce token usage. You can pass raw JSON or a file path directly.\\\"}}\"; }"
          }
        ]
      }
    ]
  }
}

Add or remove MCP prefixes from the matcher as needed.
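The hook command is dense, so here is the same logic spelled out in Python for readability. It mirrors what the one-liner does (measure the response, build a log record, emit a reminder), with `handle_hook` as a hypothetical function name. Character counts may differ slightly from jq's `tostring | length` for non-string responses.

```python
import json
from datetime import datetime, timezone


def handle_hook(payload, now=None):
    """Return (log_record, hook_output) for one PostToolUse payload."""
    now = now or datetime.now(timezone.utc)
    response = payload.get("tool_response")
    # Strings are measured as-is; other values as compact JSON, like jq's tostring.
    if isinstance(response, str):
        text = response
    else:
        text = json.dumps(response, separators=(",", ":"), ensure_ascii=False)
    chars = len(text)

    # Same fields the shell command appends to savings.jsonl.
    record = {
        "ts": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": f"hook:{payload['tool_name']}",
        "raw_chars": chars,
        "opt_chars": chars,  # nothing has been optimized yet at hook time
        "saved_chars": 0,
        "saved_pct": 0,
    }
    output = {
        "hookSpecificOutput": {
            "hookEventName": "PostToolUse",
            "additionalContext": (
                f"MCP response was {chars} chars. Pipe it through optimize_json "
                "from fetch-mcp to reduce token usage. "
                "You can pass raw JSON or a file path directly."
            ),
        }
    }
    return record, output


# Made-up payload with the two fields the jq filter reads.
payload = {"tool_name": "mcp__jira__jira_search", "tool_response": {"issues": [1, 2, 3]}}
record, output = handle_hook(payload, now=datetime(2025, 1, 1, tzinfo=timezone.utc))
```

This also shows why hook:* rows report 0% saved: the hook records the raw size in both the raw and optimized columns.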

Cursor

1. Add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):

{
  "mcpServers": {
    "fetch-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "git+https://github.com/micaelmalta/fetch-mcp.git", "fetch-mcp"]
    }
  }
}

2. Add to Cursor Rules (Settings > Rules, or .cursorrules):

When any MCP tool returns a large JSON response (>50 lines), pass it through the
optimize_json tool from fetch-mcp before reasoning. You can pass raw JSON or a
file path directly. Use the jsonpath parameter to drill into specific fields.

OpenCode

1. Add to .opencode.json (project) or ~/.opencode.json (global):

{
  "mcpServers": {
    "fetch-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "git+https://github.com/micaelmalta/fetch-mcp.git", "fetch-mcp"]
    }
  }
}

2. Add to .opencode.md (project memory):

## JSON Optimization

When any MCP tool (GitHub, Jira, Datadog, Confluence, etc.) returns a JSON response
larger than ~50 lines, pass it through the `optimize_json` tool from fetch-mcp before
reasoning over it. You can pass raw JSON or a file path directly. Use jsonpath to drill
into specifics rather than consuming the full payload.

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "fetch-mcp": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/micaelmalta/fetch-mcp.git", "fetch-mcp"]
    }
  }
}

MCP Inspector (dev)

uv run mcp dev fetch_mcp/server.py

Integration Summary

| | Claude Code | Cursor | OpenCode | Claude Desktop |
|---|---|---|---|---|
| Add MCP | claude mcp add | .cursor/mcp.json | .opencode.json | claude_desktop_config.json |
| Instruct agent | CLAUDE.md | .cursorrules | .opencode.md | Server instructions (built-in) |
| Auto-hook + logging | PostToolUse hook | Not supported | Not supported | Not supported |
| CLI pipe | \| uv run fetch-mcp optimize | N/A | N/A | N/A |

Benchmark

uv run python scripts/benchmark.py

Fetches real pages and API endpoints, counts tokens with tiktoken (cl100k_base), and compares raw vs optimized output across HTML and JSON with cost estimates.

Dependencies

| Package | Purpose |
|---|---|
| mcp | FastMCP server framework |
| httpx | Async HTTP client |
| html-to-markdown | Rust-based HTML → Markdown (~200 MB/s) |
| beautifulsoup4 | CSS selector extraction |
| jsonpath-ng | JSONPath query support |
| ddgs | DuckDuckGo search (no API key) |
| truststore | System certificate store for SSL |
| pdfminer.six | PDF text extraction for pdf_fetch |
| tiktoken | Token counting (dev only, for benchmark) |
