distill-mcp-v2


<div align="center"> <h1>Distill v2</h1> <p><strong>Model Context Protocol (MCP) Server for Massive Context Engineering & Compression</strong></p>

</div>


⚑ Overview

distill-mcp-v2 is a high-performance, network-dependency-free Python FastMCP server designed to aggressively optimize Large Language Model (LLM) context windows. It provides specialized tools for compressing and analyzing massive AI-agent payloads without losing critical semantic information.

By filtering noise, stabilizing cache prefixes, and running multi-model token cost estimations locally, Distill v2 dramatically reduces API costs and preserves LLM reasoning abilities when dealing with heavy payloads like infinite logs, monolithic API schemas, and multi-agent war room transcripts.

🚀 Performance & Token Compression

In our stress-test benchmarks (audited via pytest and chaos blueprints), distill-mcp-v2 achieves up to 99.7% token compression while retaining 100% of the crucial context.

| Scenario | Payload Profile | Raw Tokens | Distilled Tokens | Savings % |
|---|---|---|---|---|
| Trace Avalanche | Heavy Java Stacktraces | 150,027 | 546 | 99.6% |
| Schema Monolith | Massive Microservice JSON | 56,588 | 1,355 | 97.6% |
| Incident War Room | Multi-Agent Chat Logs | 117,952 | 371 | 99.7% |

(Tested against Claude-3-Opus budgets. Scenario 1 reduced costs from $2.25/call to $0.008/call.)
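
The savings and per-call cost figures can be reproduced with a few lines of arithmetic. The sketch below assumes an input price of $15 per million tokens; that value is inferred from the quoted $2.25 for 150,027 tokens and is not stated in the benchmark table. The function names are illustrative and are not part of Distill's API.

```python
# Assumed price, inferred from the $2.25/call figure rather than taken from the report.
PRICE_PER_MILLION_INPUT_TOKENS = 15.00

# (raw tokens, distilled tokens) as listed in the benchmark table.
SCENARIOS = {
    "Trace Avalanche": (150_027, 546),
    "Schema Monolith": (56_588, 1_355),
    "Incident War Room": (117_952, 371),
}


def savings_percent(raw_tokens: int, distilled_tokens: int) -> float:
    """Share of tokens removed, as a percentage of the raw payload."""
    return (1 - distilled_tokens / raw_tokens) * 100


def cost_per_call(tokens: int, price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Input cost in dollars for one call carrying `tokens` tokens."""
    return tokens * price_per_million / 1_000_000


for name, (raw, distilled) in SCENARIOS.items():
    print(
        f"{name}: {savings_percent(raw, distilled):.1f}% saved, "
        f"${cost_per_call(raw):.2f} -> ${cost_per_call(distilled):.3f} per call"
    )
```

Only input tokens are counted here; output tokens and any caching discounts would change the absolute numbers but not the savings percentage.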

Read the full Benchmark & Execution Report for deeper insights.

🛠 Features & Toolset

Distill v2 exposes 8 precise tools to agents via the Model Context Protocol:

  1. distill_json: Compresses raw JSON payloads, retaining anomalies, exceptions, and errors.
  2. distill_logs: Compresses raw .log files, preserving head/tail contexts and stack traces.
  3. distill_schema: Compacts massive MCP tool catalogs and JSON schemas to structural parameters.
  4. distill_response: Progressively prunes, minifies, and truncates outputs to fit strict token budgets.
  5. distill_conversation: Extracts goals, decisions, blockers, and actions from multi-agent transcripts without leaking raw ISO timestamps.
  6. stabilize_for_cache: Maps chaotic raw identifiers (UUIDs, hex IDs) to sequential placeholders to stabilize LLM prompt caching.
  7. analyze_tokens: Accurately estimates token counts using tiktoken (cl100k_base).
  8. compare: Computes detailed diffs and cost-savings analyses between raw and distilled payloads.
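
To illustrate the idea behind stabilize_for_cache, here is a minimal sketch of mapping identifiers to sequential placeholders. It is not Distill's implementation: the function name `stabilize_ids`, the `<ID_n>` placeholder format, and the restriction to canonical UUIDs are all assumptions made for the example.

```python
import re

# Matches canonical 8-4-4-4-12 hexadecimal UUIDs only (an assumption of this sketch;
# the real tool is described as also handling other hex IDs).
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def stabilize_ids(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct UUID with <ID_1>, <ID_2>, ... in order of first appearance.

    Returns the rewritten text and the mapping from original ID to placeholder.
    """
    mapping: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        raw = match.group(0).lower()
        if raw not in mapping:
            mapping[raw] = f"<ID_{len(mapping) + 1}>"
        return mapping[raw]

    return UUID_RE.sub(replace, text), mapping


# Two payloads that differ only in their request ID become identical after stabilization.
run_a, _ = stabilize_ids("request 123e4567-e89b-12d3-a456-426614174000 timed out")
run_b, _ = stabilize_ids("request 9b2f1c3d-4e5a-4b6c-8d7e-0f1a2b3c4d5e timed out")
print(run_a)
print(run_a == run_b)
```

Because the placeholder depends only on the order of first appearance, prompts that share a structure but carry fresh IDs on every run produce the same prefix, which is what prompt caches key on.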

📦 Quick Start

Installation

Distill v2 requires Python 3.10+. We recommend using uv or pip in an isolated virtual environment.

# Clone the repository
git clone https://github.com/yatinkoul/distill.git
cd distill

# Create a virtual environment and activate it
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -e .

Running the Server

Run the FastMCP server, which natively binds to a stateless HTTP endpoint.

# Run the FastMCP server (default host: 0.0.0.0, port: 8000)
distill --host 0.0.0.0 --port 8000
# Or using uvicorn directly:
.venv/bin/uvicorn src.main:app --host 0.0.0.0 --port 8000

The server exposes a JSON-RPC endpoint at http://localhost:8000/mcp.
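
As a rough sketch of what a request to that endpoint could look like, the snippet below builds a JSON-RPC 2.0 message for the MCP `tools/call` method and shows how it might be posted with the standard library. The argument name `text` is hypothetical (the real parameter names come from the tool's schema), and depending on the server configuration an `initialize` handshake may be required before tool calls are accepted.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8000/mcp"  # endpoint given in this README


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def send(payload: dict, url: str = MCP_URL, timeout: float = 10.0) -> str:
    """POST the payload and return the raw response body as text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # HTTP-based MCP servers may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# `text` is a hypothetical argument name used for illustration only.
payload = build_tool_call("analyze_tokens", {"text": "hello world"})
print(json.dumps(payload, indent=2))
# With the server running locally: print(send(payload))
```

In practice an MCP client library would handle session setup and response parsing; the raw request is shown only to make the wire format concrete.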

🧪 Stress Testing & Development

The repository ships with an exhaustive stress-test runner that dynamically allocates ports, executes deterministic payload scenarios, and validates the integrity of the distilled outputs.
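
One common way to allocate a port dynamically is to bind to port 0 and let the operating system choose. The sketch below shows that technique; it is not necessarily how the stress-test runner does it, and `find_free_port` is an illustrative name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (address, port) for IPv4 sockets.
        return sock.getsockname()[1]


port = find_free_port()
print(f"a test server could be launched on port {port}")
```

The port is released when the socket closes, so another process could claim it before the test server starts; runners that use this approach usually tolerate that small race or retry on failure.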

To run the complete test suite (118/118 passing):

# Install development dependencies
pip install -e ".[dev]"

# Run the test suite
pytest

To manually execute the stress-testing blueprints and regenerate the benchmark report:

python stress_tests_blueprints/run_blueprints.py

📖 Documentation

🀝 Contributing

Contributions, issues, and feature requests are welcome! Please read our CONTRIBUTING.md (coming soon) for guidelines on how to propose improvements. Ensure all tests and linting (ruff) pass before submitting pull requests.

📜 License

This project is licensed under the MIT License; see the LICENSE file for details.
