rlm-mcp-server

Provides recursive language model capabilities to AI assistants, enabling efficient exploration of large contexts through iterative Python code execution.

RLM MCP Server

"The difference between the Enterprise's computer and a Culture Mind is that the Mind doesn't try to hold everything in immediate consciousness—it knows how to efficiently explore vast data stores."

A containerized MCP (Model Context Protocol) server that provides Recursive Language Model capabilities to any AI assistant.

What is RLM?

Based on the paper "Recursive Language Models" from MIT CSAIL, RLM treats large contexts as external environments that can be explored programmatically rather than stuffed into a context window.

Instead of cramming a 500-page document into context, RLM has the model write Python code that explores the document (searching, parsing, counting, extracting) and builds up understanding iteratively until it can answer your question.

Quick Start

With Docker Compose (recommended)

# After cloning the repository, enter its directory
cd rlm-mcp-server

# With OpenAI
OPENAI_API_KEY=sk-xxx docker compose up

# With local llama.cpp server (running on port 8080)
RLM_API_BASE=http://host.docker.internal:8080/v1 docker compose up

# With Ollama
RLM_API_BASE=http://host.docker.internal:11434/v1 RLM_MODEL=llama3.2 docker compose up

With Docker directly

# Build
docker build -t rlm-mcp-server .

# Run with OpenAI
docker run -e OPENAI_API_KEY=sk-xxx -p 8765:8765 rlm-mcp-server

# Run with local LLM
docker run \
  -e RLM_API_BASE=http://host.docker.internal:8080/v1 \
  --add-host host.docker.internal:host-gateway \
  -p 8765:8765 \
  rlm-mcp-server

Integration with Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "rlm": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "--network", "host",
        "-e", "OPENAI_API_KEY",
        "rlm-mcp-server",
        "python", "src/server.py", "--transport", "stdio"
      ],
      "env": {
        "OPENAI_API_KEY": "your-key-here"
      }
    }
  }
}

See examples/ for more configuration options including local LLM setups.
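As a rough sketch of how the entry above could be adapted for a local LLM, the snippet below rebuilds the same `mcpServers` structure from a dictionary of environment variables. The helper name `claude_desktop_entry` is hypothetical, and the local values (`http://localhost:8080/v1`, `local-model`) are assumptions borrowed from other examples in this README; check examples/ for the configurations the project actually ships.

```python
import json


def claude_desktop_entry(env: dict) -> dict:
    """Build an mcpServers entry that forwards each variable in `env` to the container.

    Hypothetical helper: it mirrors the JSON shown above, nothing more.
    """
    args = ["run", "--rm", "-i", "--network", "host"]
    for name in env:
        # "-e NAME" with no value tells Docker to copy NAME from the
        # launching process, which the client fills from the "env" block.
        args += ["-e", name]
    args += ["rlm-mcp-server", "python", "src/server.py", "--transport", "stdio"]
    return {
        "mcpServers": {
            "rlm": {"command": "docker", "args": args, "env": dict(env)}
        }
    }


# Assumed local setup: llama.cpp listening on port 8080 of the host.
local_config = claude_desktop_entry(
    {"RLM_API_BASE": "http://localhost:8080/v1", "RLM_MODEL": "local-model"}
)
print(json.dumps(local_config, indent=2))
```

Calling the helper with only `OPENAI_API_KEY` reproduces the configuration shown above, which is a quick way to confirm the structure before swapping in local settings.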

Available Tools

Session Management

Tool	Description
`rlm_load_context`	Load text content into a session
`rlm_load_file`	Load a file into a session
`rlm_list_sessions`	List active sessions
`rlm_close_session`	Close a session to free memory

Querying

Tool	Description
`rlm_query`	Ask a question about loaded context (iterative exploration)
`rlm_quick_query`	One-shot: load and query in one call
`rlm_execute_code`	Execute Python directly against context (power user)

Configuration

Tool	Description
`rlm_config`	View current server configuration
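For orientation, MCP clients invoke tools such as these through JSON-RPC 2.0 requests using the `tools/call` method. The sketch below only builds the request payloads; it does not talk to a server. The tool names come from the tables above, but the argument names (`session_id`, `content`, `question`) are assumptions made for illustration, so consult the schemas the server reports for the real parameters.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a distinct id per request.
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are hypothetical; only the tool names are from this README.
load_request = tool_call(
    "rlm_load_context",
    {"session_id": "demo", "content": "Full text of a large document"},
)
query_request = tool_call(
    "rlm_query",
    {"session_id": "demo", "question": "Find all mentions of quarterly revenue"},
)

print(json.dumps(load_request, indent=2))
print(json.dumps(query_request, indent=2))
```

In practice an MCP client library handles this framing for you; the payloads are shown only to make the load-then-query sequence concrete.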

Usage Examples

Load and query a large document

Human: Load this 200-page PDF transcript and find all mentions of "quarterly revenue"

Claude: I'll use RLM to explore this large document.

[Uses rlm_load_context to load the document]
[Uses rlm_query with question "Find all mentions of quarterly revenue with surrounding context"]

Based on exploring the document, I found 47 mentions of quarterly revenue...

Analyze a codebase

Human: Here's our entire codebase (500 files). What authentication methods are used?

Claude: I'll load this into RLM and explore it programmatically.

[Uses rlm_load_context with the codebase]
[Uses rlm_query to explore authentication patterns]

After exploring the codebase, I found three authentication methods:
1. JWT tokens in /api/auth/...
2. OAuth2 in /integrations/...
3. API keys in /external/...

Environment Variables

Variable	Default	Description
`RLM_MODEL`	`gpt-4o-mini`	Primary model for RLM
`RLM_SUB_MODEL`	Same as RLM_MODEL	Model for iterations (can be cheaper)
`RLM_MAX_ITERATIONS`	`15`	Max exploration iterations
`RLM_API_BASE`	OpenAI	API endpoint (for local models)
`RLM_API_KEY` / `OPENAI_API_KEY`	-	API key
`RLM_SUB_API_BASE`	Same as RLM_API_BASE	Separate endpoint for sub-model
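The fallback rules in this table can be sketched as a small loader. This is an illustration of the documented defaults, not the server's actual code: the `RLMConfig` and `load_config` names are hypothetical, `None` stands in for "use the OpenAI endpoint", and the choice to prefer `RLM_API_KEY` over `OPENAI_API_KEY` when both are set is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RLMConfig:
    model: str
    sub_model: str
    max_iterations: int
    api_base: Optional[str]      # None means the default OpenAI endpoint
    sub_api_base: Optional[str]
    api_key: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> RLMConfig:
    """Resolve settings using the defaults listed in the table above."""
    model = env.get("RLM_MODEL", "gpt-4o-mini")
    api_base = env.get("RLM_API_BASE")
    return RLMConfig(
        model=model,
        # The sub-model and its endpoint fall back to the primary settings.
        sub_model=env.get("RLM_SUB_MODEL", model),
        max_iterations=int(env.get("RLM_MAX_ITERATIONS", "15")),
        api_base=api_base,
        sub_api_base=env.get("RLM_SUB_API_BASE", api_base),
        # Assumed precedence: RLM_API_KEY wins if both keys are present.
        api_key=env.get("RLM_API_KEY") or env.get("OPENAI_API_KEY"),
    )


# Passing a plain dict keeps the example independent of the real environment.
config = load_config({"RLM_MODEL": "gpt-4o", "OPENAI_API_KEY": "sk-xxx"})
print(config)
```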

Cost Optimization

You can run the iterative exploration on a cheaper or local model (`RLM_SUB_MODEL`) while a more capable primary model (`RLM_MODEL`) handles initialization:

RLM_MODEL=gpt-4o \
RLM_SUB_MODEL=gpt-4o-mini \
docker compose up

Or use a local model for iterations entirely:

RLM_MODEL=gpt-4o \
RLM_SUB_MODEL=local-model \
RLM_SUB_API_BASE=http://host.docker.internal:8080/v1 \
docker compose up

How It Works

1. Load Context: Your massive document/codebase is stored in a session
2. Question: You ask a question about the content
3. Exploration: The LLM writes Python code to explore the context
4. Iteration: Code executes, LLM sees results, writes more code
5. Answer: When confident, LLM provides final answer
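The steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the server's implementation: the `run_rlm_loop` name, the `FINAL:` prefix that signals an answer, and the scripted stand-in for the LLM are all hypothetical, and the bare `exec` call has no sandboxing.

```python
import contextlib
import io
import re
from typing import Callable, List


def run_rlm_loop(
    context: str,
    question: str,
    ask_llm: Callable[[str, List[str]], str],
    max_iterations: int = 15,
) -> str:
    """Alternate between LLM-written code and its output until an answer appears."""
    history: List[str] = []
    namespace = {"CONTEXT": context, "re": re}
    for _ in range(max_iterations):
        reply = ask_llm(question, history)
        if reply.startswith("FINAL:"):  # hypothetical end-of-exploration marker
            return reply[len("FINAL:"):].strip()
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(reply, namespace)  # illustration only: no sandboxing here
        history.append(buffer.getvalue())
    return "No answer within the iteration limit."


def scripted_llm(question: str, history: List[str]) -> str:
    """Stand-in for a real model: one exploration step, then an answer."""
    if not history:
        return "print(len(re.findall(r'revenue', CONTEXT)))"
    return f"FINAL: 'revenue' appears {history[-1].strip()} times."


answer = run_rlm_loop(
    "revenue up. revenue down.",
    "How often is revenue mentioned?",
    scripted_llm,
)
print(answer)
```

The model only ever sees the printed output of its own code, never the full context, which is what keeps large inputs out of the context window.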

The REPL environment has access to:

CONTEXT - the full loaded text
re - regex module
json - JSON module
Counter, defaultdict - from collections
Standard Python builtins
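To make this concrete, here is the kind of exploration code a model might write in that environment, using only the names listed above. The sample `CONTEXT` is a made-up stand-in so the snippet runs on its own; inside the REPL, `CONTEXT` would already hold the loaded text.

```python
import json
import re
from collections import Counter

# Stand-in for the session's loaded text (invented sample data).
CONTEXT = (
    "Q1 call: quarterly revenue rose 4%.\n"
    "Q2 call: costs fell.\n"
    "Q3 call: Quarterly revenue was flat; quarterly revenue guidance unchanged.\n"
)

pattern = re.compile(r"quarterly revenue", re.IGNORECASE)

# Count matches per line so later steps can zoom in on the dense regions.
mentions_per_line = Counter()
for line_no, line in enumerate(CONTEXT.splitlines(), start=1):
    found = len(pattern.findall(line))
    if found:
        mentions_per_line[line_no] = found

summary = {
    "total_mentions": sum(mentions_per_line.values()),
    "lines": sorted(mentions_per_line),
}
print(json.dumps(summary))
```

Printing a compact summary rather than the matching text itself keeps each iteration's output small, leaving the model free to request surrounding context for specific lines in a later step.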

Architecture

┌─────────────────────────────────────────────────────────┐
│                    Claude Desktop                        │
│                         or                               │
│                    Any MCP Client                        │
└────────────────────────┬────────────────────────────────┘
                         │ MCP Protocol
                         ▼
┌─────────────────────────────────────────────────────────┐
│                   RLM MCP Server                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │   Sessions   │  │   RLM Core   │  │  REPL Env    │  │
│  │   Storage    │  │   Engine     │  │  (Python)    │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└────────────────────────┬────────────────────────────────┘
                         │ OpenAI-compatible API
                         ▼
┌─────────────────────────────────────────────────────────┐
│           LLM Backend (OpenAI / Local / Ollama)          │
└─────────────────────────────────────────────────────────┘

Development

# Install dependencies
pip install -r requirements.txt

# Run locally (stdio mode for testing)
cd src && python server.py --transport stdio

# Run tests
pytest tests/

Transferring to Offline Lab (Mojoverse)

On Cybertron, build the image (see "With Docker directly" above), then export it to a compressed archive:

docker save rlm-mcp-server:latest | gzip > rlm-mcp-server.tar.gz

Transfer to Mojoverse via your usual method

Load on Mojoverse:

gunzip -c rlm-mcp-server.tar.gz | docker load

Run with local L4-powered llama.cpp:

RLM_API_BASE=http://localhost:8080/v1 docker compose up

License

MIT
