LODA MCP Server

Provides token-efficient document search and retrieval for LLMs by returning relevant document sections within specified token budgets. It utilizes section-aware parsing and Bloom filter elimination to offer high-speed, zero-dependency access to large documents.

LLM-Optimized Document Access - A Model Context Protocol server for token-efficient document search in Claude Desktop and Claude Code.

What is LODA?

LODA (LLM-Optimized Document Access) is a search strategy designed specifically for how LLMs consume documents. Instead of returning raw matches or arbitrary chunks, LODA understands document structure and returns the most relevant sections within your token budget.

The Problem

When LLMs work with large documents, they face a fundamental challenge:

Traditional Approach	Problem
Load entire document	Exceeds context limits
Keyword search	No relevance ranking, returns too much
RAG/Vector search	Requires infrastructure, 200-500ms latency
Chunk-based retrieval	Arbitrary boundaries break coherence

In earlier experiments we also found a "gap zone": at document positions of 25-35%, traditional smart retrieval actually performed worse than brute-force loading of the whole document.

The Solution

LODA combines lightweight techniques to achieve vector search quality at grep-like speeds:

┌─────────────────┐     ┌──────────────────────┐     ┌──────────────────┐
│  Large Document │────▶│  LODA Search Engine  │────▶│ Relevant Sections│
│   (5000+ lines) │     │  • Bloom Filters     │     │  within budget   │
│                 │     │  • Token Budget      │     │  (~200 tokens)   │
│                 │     │  • Relevance Scoring │     │                  │
└─────────────────┘     │  • Smart Caching     │     └──────────────────┘
                        └──────────────────────┘

Results:

70-95% token savings compared to loading full document
1-5ms search latency (cached) vs 200-500ms for RAG
Zero external dependencies - no vector database needed

Quick Start

1. Installation

git clone https://github.com/patrickkarle/loda-mcp-server.git
cd loda-mcp-server
npm install

2. Configure Claude Desktop

Find your config file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

Add this to the file:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/full/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

3. Configure Claude Code

Add to your project's .claude/settings.json or global ~/.claude/settings.json:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/full/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

4. Use It!

Ask Claude:

"Use loda_search to find the authentication section in api-docs.md"

"Search architecture.md for deployment instructions with a 500 token budget"

How LODA Works

1. Bloom Filter Elimination

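Before scoring, LODA uses Bloom filters to eliminate sections that definitely don't contain your search terms. A Bloom filter never produces a false negative, so no matching section is discarded; an occasional false positive only means a non-matching section reaches the scoring step. Each membership check runs in constant time, and this step typically eliminates 80%+ of sections.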
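
As an illustration of the idea, here is a minimal Bloom filter sketch. It is not the code in loda/bloom_filter.js: the class name, the FNV-1a style hash, and the 1024-bit / 3-hash sizing are all assumptions chosen to keep the example short.

```javascript
// Minimal Bloom filter sketch (hypothetical names; not LODA's bloom_filter.js).
class BloomFilter {
  constructor(bitCount = 1024, hashCount = 3) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // FNV-1a style string hash; the seed varies the start value to get several hashes.
  hash(term, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < term.length; i++) {
      h ^= term.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.bitCount;
  }

  add(term) {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const bit = this.hash(term, seed);
      this.bits[bit >> 3] |= 1 << (bit & 7);
    }
  }

  // false => the term is definitely absent; true => the term may be present.
  mightContain(term) {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const bit = this.hash(term, seed);
      if ((this.bits[bit >> 3] & (1 << (bit & 7))) === 0) return false;
    }
    return true;
  }
}

// Build one filter per section from its lowercased words (assumed tokenization).
function buildSectionFilter(text) {
  const filter = new BloomFilter();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) filter.add(word);
  return filter;
}

const authFilter = buildSectionFilter("OAuth 2.0 authentication uses bearer tokens");
console.log(authFilter.mightContain("authentication")); // true: added terms are never missed
```

A section is skipped only when `mightContain` returns false for a query term, which is why elimination cannot drop a real match.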

2. Section-Aware Parsing

LODA respects your document's structure. It understands markdown headings and returns complete logical sections, not arbitrary text chunks.
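
A rough sketch of heading-based section splitting follows. The field names (`id`, `header`, `level`, `lineRange`) mirror the example response later in this README; the parsing rules themselves (ATX headings only, headings inside fenced code ignored, text before the first heading dropped) are assumptions, not a description of LODA's parser.

```javascript
// Split markdown into heading-delimited sections (illustrative sketch only).
function parseSections(markdown) {
  const sections = [];
  let current = null;
  let inFence = false;

  markdown.split("\n").forEach((line, index) => {
    // Toggle on fence lines so "# comment" inside code is not treated as a heading.
    if (/^\s*(`{3,}|~{3,})/.test(line)) inFence = !inFence;
    const match = inFence ? null : /^(#{1,6})\s+(.+?)\s*$/.exec(line);

    if (match) {
      if (current) sections.push(current);
      current = {
        id: `section-${sections.length + 1}`,
        header: match[2],
        level: match[1].length,
        lineRange: [index + 1, index + 1], // 1-based, inclusive
        content: "",
      };
    } else if (current) {
      current.content += line + "\n";
    }
    if (current) current.lineRange[1] = index + 1;
  });

  if (current) sections.push(current);
  return sections;
}

const sample = ["# API", "Intro text", "## Auth", "Use tokens.", "## Deploy", "Run it."].join("\n");
console.log(parseSections(sample).map((s) => `${s.id} ${s.header} ${s.lineRange.join("-")}`));
```

Because each section carries its full line range, a caller can return whole logical units instead of fixed-size chunks.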

3. Relevance Scoring

Each candidate section is scored based on:

Query term presence in content (0.8 base score)
Header match bonus (+0.2 for header matches)
Multi-term coverage (all terms weighted equally)
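
The three rules above can be read in more than one way. The sketch below shows one plausible interpretation: each query term contributes 0.8 if it appears in the section content and 0.2 if it appears in the header, and the per-term scores are averaged so every term carries equal weight. The function name and the substring matching are assumptions, and this formula may not reproduce the exact scores LODA reports.

```javascript
// One possible reading of the scoring rules (hypothetical; not relevance_scorer.js).
function scoreSection(section, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;

  const header = section.header.toLowerCase();
  const content = section.content.toLowerCase();

  let total = 0;
  for (const term of terms) {
    if (content.includes(term)) total += 0.8; // base score for a content match
    if (header.includes(term)) total += 0.2;  // bonus for a header match
  }
  return total / terms.length; // equal weight per term
}

const oauthSection = {
  header: "OAuth 2.0 Authentication",
  content: "Use the OAuth authentication flow to obtain a bearer token.",
};
console.log(scoreSection(oauthSection, "authentication oauth"));
```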

4. Token Budget Selection

You specify a token budget, and LODA returns the best sections that fit:

// "I need info about auth, but only have 500 tokens of context"
{
  query: "authentication",
  contextBudget: 500
}
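
To make the selection step concrete, here is a greedy sketch under stated assumptions: tokens are estimated at roughly 4 characters each (the figure noted for token_estimator.js), sections are taken in descending score order, and the top section is always returned even if it alone exceeds the budget. The function names and the greedy strategy are illustrative, not LODA's budget_manager.js.

```javascript
// Rough token estimate: about 4 characters per token.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Greedy budget selection sketch (hypothetical helper).
function selectWithinBudget(scored, contextBudget = null, maxSections = 5) {
  const ranked = scored.filter((s) => s.score > 0).sort((a, b) => b.score - a.score);
  const selected = [];
  let totalTokens = 0;

  for (const section of ranked) {
    if (selected.length >= maxSections) break;
    const fits = contextBudget === null || totalTokens + section.tokenEstimate <= contextBudget;
    // The best-scoring section is kept even when it does not fit.
    if (fits || selected.length === 0) {
      selected.push(section);
      totalTokens += section.tokenEstimate;
    }
  }
  return { selected, totalTokens };
}

const candidates = [
  { id: "section-4", score: 0.8, tokenEstimate: 66 },
  { id: "section-5", score: 1.0, tokenEstimate: 88 },
  { id: "section-9", score: 0.5, tokenEstimate: 400 },
];
console.log(selectWithinBudget(candidates, 500, 3));
```

With a 500-token budget, the two highest-scoring sections fit (88 + 66 tokens) and the 400-token section is skipped.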

5. Aggressive Caching

Document structures and Bloom filters are cached with a TTL (60s by default). Repeated searches on the same document are 10x+ faster.
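
The sketch below shows the TTL part of such a cache. It is a simplified stand-in rather than loda/loda_index.js: the class name and the injectable clock are assumptions, and the LRU eviction mentioned in the Architecture section is left out.

```javascript
// Minimal TTL cache sketch (hypothetical; no LRU eviction).
class TtlCache {
  constructor(ttlMs = 60_000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock makes expiry easy to test
    this.entries = new Map();
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns undefined on a miss or when the entry has expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

const cache = new TtlCache();
cache.set("api-docs.md", { sections: 21 });
console.log(cache.get("api-docs.md")); // served from cache until the TTL elapses
```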

API Reference

loda_search

The main search tool.

Parameters:

Parameter	Type	Required	Default	Description
`documentPath`	string	Yes	-	Path to document (relative to staging or absolute)
`query`	string	Yes	-	Search keywords or phrase
`contextBudget`	number	No	null	Maximum tokens to return (null = unlimited)
`maxSections`	number	No	5	Maximum sections to return

Example Request:

{
  "documentPath": "api-docs.md",
  "query": "authentication oauth",
  "contextBudget": 500,
  "maxSections": 3
}

Example Response:

{
  "query": "authentication oauth",
  "documentPath": "/path/to/api-docs.md",
  "sections": [
    {
      "id": "section-5",
      "header": "OAuth 2.0 Authentication",
      "level": 3,
      "score": 1.0,
      "lineRange": [27, 41],
      "tokenEstimate": 88
    },
    {
      "id": "section-4",
      "header": "API Key Authentication",
      "level": 3,
      "score": 0.8,
      "lineRange": [15, 26],
      "tokenEstimate": 66
    }
  ],
  "metadata": {
    "totalSections": 21,
    "candidatesAfterBloom": 5,
    "scoredAboveZero": 3,
    "returnedSections": 2,
    "totalTokens": 154,
    "budgetStatus": "SAFE",
    "truncated": false,
    "cacheHit": true
  }
}

Budget Status Values

Status	Meaning
`UNLIMITED`	No budget was specified
`SAFE`	Total tokens under 80% of budget
`WARNING`	Total tokens between 80% and 100% of budget
`EXCEEDED`	Over budget (the first section is always returned, even if it alone exceeds the budget)
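
The table maps onto a small classification function. The sketch below follows the thresholds listed above; the function name is hypothetical, and treating exactly 80% as `WARNING` is an assumption, since the table does not say which side the boundary falls on.

```javascript
// Classify token usage against a budget (illustrative helper).
function budgetStatus(totalTokens, contextBudget = null) {
  if (contextBudget === null) return "UNLIMITED";
  if (totalTokens > contextBudget) return "EXCEEDED";
  if (totalTokens >= contextBudget * 0.8) return "WARNING";
  return "SAFE";
}

console.log(budgetStatus(154, 500)); // "SAFE", as in the example response
```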

Other Tools

Tool	Description
`list_document_sections`	Get hierarchical structure of document
`read_section`	Read specific section by ID with context
`read_lines`	Read specific line range
`search_content`	Basic regex search (no LODA optimization)

Staging Directory

By default, LODA looks for documents in the staging/ subdirectory:

loda-mcp-server/
├── staging/              ← Put documents here
│   ├── api-docs.md
│   ├── architecture.md
│   └── user-guide.md
└── document_access_mcp_server.js

You can also use absolute paths to search any document on your system.

HTTP Mode (Development/Testing)

For testing without Claude, run the server in HTTP mode:

node document_access_mcp_server.js --mode=http --port=49400

Then test with curl:

# Health check
curl http://localhost:49400/health

# List tools
curl http://localhost:49400/tools

# Search
curl -X POST http://localhost:49400/tools/loda_search \
  -H "Content-Type: application/json" \
  -d '{"documentPath": "api-docs.md", "query": "authentication"}'
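
The same search can be issued from a Node script. This sketch reuses the endpoint and JSON body from the curl example and relies on the global `fetch` available in Node 18 and later; the helper names are hypothetical, and it assumes the server replies with a JSON body.

```javascript
// Build the HTTP request for a loda_search call (pure function, easy to inspect).
function buildSearchRequest(baseUrl, args) {
  return {
    url: `${baseUrl}/tools/loda_search`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(args),
    },
  };
}

// Send the request; assumes the server is running in HTTP mode.
async function lodaSearch(baseUrl, args) {
  const { url, options } = buildSearchRequest(baseUrl, args);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`loda_search failed: HTTP ${response.status}`);
  return response.json();
}

// Usage, with the server started via --mode=http --port=49400:
// lodaSearch("http://localhost:49400", { documentPath: "api-docs.md", query: "authentication" })
//   .then((result) => console.log(result));
```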

Performance

Metric	Target	Achieved
Search latency (cached)	<10ms	1-5ms
Search latency (cold)	<100ms	20-50ms
Token savings	>70%	70-95%
Bloom filter effectiveness	>80%	~85%
Cache hit rate	>80%	~90%

Testing

# Run all LODA tests
npm test

# Run specific component tests
npm test -- tests/loda_search_handler.test.js

# Run with coverage
npm test -- --coverage

Test Results: 46/46 Passing

Component	Tests	Status
token_estimator	6	✅
relevance_scorer	8	✅
budget_manager	6	✅
bloom_filter	10	✅
loda_index	8	✅
loda_search_handler	8	✅

Architecture

loda/
├── token_estimator.js      # Pure token estimation (~4 chars/token)
├── relevance_scorer.js     # Section relevance scoring
├── budget_manager.js       # Token budget selection
├── loda_index.js           # Cached document structure (TTL + LRU)
├── bloom_filter.js         # O(1) section elimination
├── loda_search_handler.js  # Main orchestrator
└── index.js                # Module entry

document_access_mcp_server.js  # MCP server with 5 tools

Research & Development

This project was built using the Continuum Development Process (CDP), a 13-phase methodology that emphasizes traceability and quality gates.

Why We Built This

We tried several approaches before arriving at LODA:

Approach	Why It Failed
Semantic Chunking	Arbitrary boundaries split logical units
RAG + Vector Search	Too much infrastructure for single-doc access
JIT-Steg Retrieval	"Gap zone" at 25-35% where overhead exceeded brute-force
Simple Grep	No relevance ranking, no token awareness

LODA combines the best of each: section awareness, fast elimination, budget control, and zero external dependencies.

Research Documents

ULTRATHINK Analysis - Problem analysis from 5+ perspectives
Research Notes - Literature review and approach comparison
Implementation Plan - Technical architecture
Testing Plan - 61 test cases specified

Configuration Examples

Claude Desktop (Windows)

%APPDATA%\Claude\claude_desktop_config.json:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["C:/Users/YourName/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

Claude Desktop (macOS/Linux)

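~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or ~/.config/Claude/claude_desktop_config.json (Linux). The example below uses a Linux-style home directory; on macOS, home directories live under /Users/yourname instead: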

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/home/yourname/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

Claude Code (Project-level)

.claude/settings.json:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

Contributing

Fork the repository
Create a feature branch
Write tests for new functionality
Submit a PR with documentation

License

MIT License - see LICENSE for details.

Acknowledgments

Built for Model Context Protocol
Developed using Continuum Development Process
Inspired by information retrieval research on probabilistic data structures

Made with 🧠 for LLMs that need to read documents efficiently.
