codebase-rlm

MCP server that uses Recursive Language Models to analyze codebases hierarchically, creating a persistent, queryable knowledge map to overcome context window limits and prevent context rot in RooCode.

🧠 Codebase RLM Navigator

Recursive Language Model Intelligence for RooCode

A Model Context Protocol (MCP) server that uses Recursive Language Models (RLM) to transform your codebase into an intelligent, hierarchical knowledge map. Built specifically for RooCode integration, it eliminates context window limitations and prevents context rot by analyzing code in layers—from individual files to modules to system architecture—creating a persistent intelligence map that can be queried instantly.

Now with Standalone Mode: Analyze codebases directly from the filesystem without requiring Repomix!

🔄 What is RLM (Recursive Language Models)?

RLM is an architectural pattern that breaks down large codebases into hierarchical layers of understanding:

Level 1 (File Layer): Each file is analyzed individually for functions, classes, and dependencies
Level 2 (Module Layer): Related files are grouped and analyzed for collaboration patterns
Level 3 (System Layer): The entire architecture is synthesized from module summaries

This recursive approach solves the context window problem by:

✅ Preventing Context Rot: Analysis is cached and reused, not regenerated each time
✅ Breaking the Context Limit: Only relevant layers are loaded on-demand
✅ Maintaining Coherence: Each layer builds on the previous, preserving relationships
✅ Enabling Instant Queries: Pre-computed summaries mean zero re-analysis time
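
The three levels above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `build_hierarchy` and `first_line` are hypothetical names, and `first_line` stands in for the LLM call so the sketch runs without Ollama.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Callable, Dict


def build_hierarchy(files: Dict[str, str], summarize: Callable[[str], str]) -> dict:
    """Summarize files, then modules from file summaries, then the whole system."""
    # Level 1: one summary per file, built from the file's source text.
    file_summaries = {path: summarize(src) for path, src in files.items()}

    # Level 2: group file summaries by parent folder and summarize each group.
    grouped = defaultdict(list)
    for path, summary in sorted(file_summaries.items()):
        module = str(PurePosixPath(path).parent)
        grouped[module].append(f"{path}: {summary}")
    module_summaries = {m: summarize("\n".join(parts)) for m, parts in grouped.items()}

    # Level 3: the system summary only ever sees module summaries, never raw code.
    system_input = "\n".join(f"{m}: {s}" for m, s in sorted(module_summaries.items()))
    return {
        "files": file_summaries,
        "modules": module_summaries,
        "system": summarize(system_input),
    }


def first_line(text: str) -> str:
    """Stand-in for the LLM call: keep the first line, capped at 60 characters."""
    return text.strip().splitlines()[0][:60]
```

Calling `build_hierarchy({"auth/login.py": "def login(): pass"}, first_line)` returns a dict with `files`, `modules`, and `system` keys. Each level consumes only the output of the level below it, which is why the input to any single summarization call stays small.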

🎯 What Problem Does This Solve?

When working with large codebases in RooCode (or any AI coding assistant), you face critical challenges:

Context Window Limitations: AI assistants can't load entire codebases at once
Context Rot: AI forgets architectural decisions between conversations
Manual File Navigation: You waste time explaining project structure repeatedly
Inefficient Re-Analysis: Same code analyzed multiple times, wasting tokens and time
Lost Relationships: AI doesn't understand how modules interact

Codebase RLM Navigator solves this using Recursive Language Models to create a persistent, hierarchical "intelligence map" that RooCode can query instantly—no context rot, no re-analysis, no token waste.

✨ Key Features

🚀 Intelligent Caching

Analyzes each file once and reuses the cached result until the file changes
MD5 hash-based change detection
Parallel processing with ThreadPoolExecutor (5 concurrent workers)
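
A rough sketch of how hash-based change detection and a five-worker pool can fit together. The function names and the cache shape (`{"hash": ..., "summary": ...}`) are assumptions for illustration; the real cache format in `smart_context.py` may differ.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def md5_of(text: str) -> str:
    """Fingerprint file content so unchanged files can be skipped."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def analyze_changed(
    files: Dict[str, str],
    cache: Dict[str, dict],
    analyze: Callable[[str], str],
    max_workers: int = 5,
) -> List[str]:
    """Re-analyze only files whose MD5 differs from the cached hash."""
    stale = {
        path: src
        for path, src in files.items()
        if cache.get(path, {}).get("hash") != md5_of(src)
    }
    # Analyze stale files concurrently; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(analyze, stale.values()))
    for (path, src), summary in zip(stale.items(), summaries):
        cache[path] = {"hash": md5_of(src), "summary": summary}
    return sorted(stale)  # paths that were (re)analyzed in this run
```

A second call with unchanged content returns an empty list, because every hash already matches the cache. Threads suit this workload since each worker mostly waits on the LLM response rather than using the CPU.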

🏗️ Three-Level RLM Architecture

File Level: Function signatures, dependencies, core responsibilities
Module Level: How files collaborate within folders
System Level: Overall architecture patterns and data flow

⚡ Instant Retrieval

No re-parsing on subsequent queries
Direct file content access
Fuzzy file path matching
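
The README does not say how fuzzy path matching is implemented. One plausible approach, sketched here with the standard library's `difflib`, tries an exact match, then a unique suffix match, then the closest string. `resolve_path` and the 0.6 cutoff are assumptions for illustration.

```python
import difflib
from typing import Iterable, Optional


def resolve_path(query: str, known_paths: Iterable[str]) -> Optional[str]:
    """Map a loosely typed path onto one of the indexed paths, or return None."""
    paths = list(known_paths)
    q = query.replace("\\", "/").lower()

    # 1. Exact match, ignoring case and slash direction.
    for p in paths:
        if p.lower() == q:
            return p

    # 2. Unique suffix match, e.g. "login.py" for "auth/login.py".
    suffix_hits = [p for p in paths if p.lower().endswith("/" + q)]
    if len(suffix_hits) == 1:
        return suffix_hits[0]

    # 3. Closest string by similarity ratio; 0.6 is difflib's default cutoff.
    lowered = {p.lower(): p for p in paths}
    close = difflib.get_close_matches(q, list(lowered), n=1, cutoff=0.6)
    return lowered[close[0]] if close else None
```

With this ordering a typo such as `auth/logn.py` still resolves to `auth/login.py`, while an ambiguous bare filename that exists in several folders falls through to the similarity step instead of picking one arbitrarily.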

🔒 100% Local & Private

Uses Ollama for LLM analysis (no cloud APIs)
All data stays on your machine
Works offline after initial setup

🎓 Use Cases

1️⃣ Onboarding to a New Codebase (Junior Dev Friendly)

Scenario: You just cloned a 50-file Python project and need to understand it quickly.

# In RooCode, simply ask:
"Use the codebase-rlm MCP tool to analyze this codebase and explain the architecture"

Important: Always specify "use the codebase-rlm MCP tool" to ensure RooCode uses the correct MCP server.

What happens:

RooCode calls mcp--codebase-rlm--analyze_codebase() → Scans the filesystem directly and analyzes all files in parallel using RLM.
Calls mcp--codebase-rlm--get_architecture() → Gets high-level system overview.
Calls mcp--codebase-rlm--list_modules() → Shows you all modules with summaries.

Result: In about 30 seconds on a GPU, you have an overview of the entire project structure without reading a single file. The RLM hierarchy prevents context rot because the analysis stays cached until the code changes.

2️⃣ Finding Where to Add a Feature

Scenario: You need to add user authentication but don't know where the auth logic lives.

# In RooCode:
"Use the codebase-rlm MCP tool to show me all modules in this project"
# Then:
"Use the codebase-rlm tool to read the details of the 'auth' module"
# Then:
"Use the codebase-rlm tool to get the full content of auth/login.py"

Important: Specify the MCP tool name to avoid confusion with other tools.

What happens:

mcp--codebase-rlm--list_modules() → Shows all folders with brief descriptions
mcp--codebase-rlm--read_module_details("auth") → Shows all files in auth folder with summaries
mcp--codebase-rlm--get_file_content("auth/login.py") → Retrieves exact file content

Result: You navigate directly to the right file without guessing. The RLM structure maintains context across queries.

3️⃣ Code Review Preparation

Scenario: You're reviewing a PR that touches 10 files across 3 modules.

# In RooCode:
"Use the codebase-rlm MCP tool to analyze the 'api', 'database', and 'utils' modules and explain how they interact"

What happens:

RooCode queries module summaries from the RLM hierarchy
Understands data flow between modules
Provides architectural context for your review

Result: You review with full context of how changes affect the system. No context rot between review sessions.

4️⃣ Refactoring Large Projects

Scenario: You need to refactor a monolithic app into microservices.

# In RooCode:
"Use the codebase-rlm MCP tool to show me the system architecture and identify tightly coupled modules"

What happens:

mcp--codebase-rlm--get_architecture() → Reveals architecture patterns from RLM analysis
Module summaries show dependencies
You identify refactoring boundaries

Result: Data-driven refactoring decisions instead of guesswork. The recursive analysis reveals hidden coupling.

🛠️ Installation & Setup

Prerequisites

Python 3.12+

python --version  # Must be 3.12 or higher

NVIDIA GPU with 6GB+ VRAM (strongly recommended for the ministral-3 model; without one, Ollama falls back to the CPU)
- Minimum: RTX 3060, RTX 4060, GTX 1660 Ti, or any GPU with 6GB+ VRAM
- Recommended: RTX 4070 (12GB+) for faster analysis
- Check your GPU: nvidia-smi

Ollama (Local LLM runtime - this runs the AI model on your computer)

# Install from https://ollama.ai/
# After installation, verify it's working:
ollama list

# Download the ministral-3 model (requires 6GB VRAM minimum):
ollama pull ministral-3:latest

Repomix (Optional - for manual snapshots)
```
npm install -g repomix
```

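Note: Without a compatible GPU, Ollama will run on CPU (10-50x slower). The figures in this README assume an NVIDIA GPU. Support for AMD and Intel GPUs depends on your Ollama version and platform, so check the Ollama documentation for your hardware.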

Step 1: Install Dependencies

pip install fastmcp requests

Git Ignore Setup

The project includes a .gitignore file that excludes the following:

.repomix_rlm_cache/ - RLM analysis cache directory
rlm-main/ - RLM main directory (external/submodule)
RLM-Research.pdf - Research documentation
guidelines.md - Project guidelines

These files are automatically excluded from git commits.

Step 2: Configure the MCP Server

Add this to your RooCode MCP settings file:

Windows: %APPDATA%\Code\User\globalStorage\rooveterinaryinc.roo-cline\settings\cline_mcp_settings.json

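macOS: ~/Library/Application Support/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json

Linux: ~/.config/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json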

{
  "mcpServers": {
    "codebase-rlm": {
      "command": "python",
      "args": ["C:/Users/YourUsername/path/to/smart_context.py"],
      "disabled": false,
      "description": "Recursive Language Model codebase analyzer"
    }
  }
}

Important Notes:

Replace the path with the actual location of smart_context.py
The server name is codebase-rlm (this is what you reference in prompts)
Always specify this MCP tool name when prompting RooCode to avoid confusion
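
If the settings file already lists other MCP servers, editing it by hand risks overwriting them. The helper below is a small sketch (the name `register_server` is hypothetical) that merges the `codebase-rlm` entry shown above into an existing file and leaves other entries alone.

```python
import json
from pathlib import Path


def register_server(settings_file: Path, script_path: str) -> dict:
    """Add or update the codebase-rlm entry without touching other servers."""
    settings = {}
    if settings_file.exists():
        raw = settings_file.read_text(encoding="utf-8")
        if raw.strip():
            settings = json.loads(raw)

    # Same fields as the JSON snippet in Step 2.
    servers = settings.setdefault("mcpServers", {})
    servers["codebase-rlm"] = {
        "command": "python",
        "args": [script_path],
        "disabled": False,
        "description": "Recursive Language Model codebase analyzer",
    }

    settings_file.parent.mkdir(parents=True, exist_ok=True)
    settings_file.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings
```

Pass the settings path for your platform and the absolute path to `smart_context.py`. Forward slashes in the script path work on Windows too and avoid JSON backslash escaping.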

Step 3: Restart RooCode

Close and reopen VS Code to load the MCP server.

📖 How to Use

Basic Workflow (Standalone - Recommended)

# 1. Navigate to your project directory in VS Code
# 2. In RooCode, ALWAYS specify the MCP tool:
"Use the codebase-rlm MCP tool to analyze this codebase and show me the architecture"

Advanced Workflow (Using Repomix)

If you prefer to use a pre-generated Repomix XML file:

# 1. Generate Repomix snapshot manually
repomix .

# 2. In RooCode:
"Use the codebase-rlm MCP tool to analyze this codebase using the repomix XML"

🚨 Important: How to Prompt RooCode

To avoid confusion with other MCP tools, always specify the tool name:

✅ Correct Prompts:

"Use the codebase-rlm MCP tool to analyze this codebase"
"Call the analyze_codebase tool from codebase-rlm"
"Use codebase-rlm to show me all modules"

❌ Incorrect Prompts (will confuse RooCode):

"Analyze this codebase"  # Too vague, might use wrong tool
"Show me the architecture"  # Doesn't specify which MCP server

Available MCP Tools

RooCode calls these tools when you specify the codebase-rlm MCP server:

| Tool | Purpose | Example Query |
| --- | --- | --- |
| `mcp--codebase-rlm--analyze_codebase(path, use_repomix_xml, repomix_xml_path)` | Scan files and build RLM intelligence map | "Use codebase-rlm to analyze this codebase" |
| `mcp--codebase-rlm--get_architecture()` | Get system-level overview | "Use codebase-rlm to show the architecture pattern" |
| `mcp--codebase-rlm--list_modules()` | List all modules with summaries | "Use codebase-rlm to show me all modules" |
| `mcp--codebase-rlm--read_module_details(module_path)` | Get detailed module info | "Use codebase-rlm to explain the 'api' module" |
| `mcp--codebase-rlm--get_file_content(file_path)` | Retrieve file content | "Use codebase-rlm to show me api/routes.py" |
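
Conceptually, the read-only tools are lookups over the cached map rather than fresh LLM calls. The sketch below illustrates that idea with an in-memory dict. The `INTEL_MAP` layout is hypothetical, and the functions take the map as an explicit argument, unlike the real MCP tools.

```python
from typing import Dict

# Hypothetical cache layout; the real on-disk format may differ.
INTEL_MAP = {
    "system": "Layered web app: api -> auth -> database.",
    "modules": {
        "auth": {
            "summary": "Login and token handling.",
            "files": {
                "auth/login.py": "Validates credentials and starts a session.",
                "auth/token.py": "Issues and verifies JWT tokens.",
            },
        },
        "api": {
            "summary": "HTTP routes.",
            "files": {"api/routes.py": "Maps URLs to handlers."},
        },
    },
}


def get_architecture(intel: dict) -> str:
    """Level 3: return the pre-computed system overview."""
    return intel["system"]


def list_modules(intel: dict) -> Dict[str, str]:
    """Level 2: module name mapped to its one-line summary."""
    return {name: mod["summary"] for name, mod in intel["modules"].items()}


def read_module_details(intel: dict, module_path: str) -> Dict[str, str]:
    """Level 1: file summaries for one module; raises KeyError if unknown."""
    module = intel["modules"].get(module_path.strip("/"))
    if module is None:
        raise KeyError(f"Unknown module: {module_path!r}")
    return dict(module["files"])
```

Because each call only reads one slice of the map, the assistant can load the system overview first and drill into a single module when it needs more detail.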

⚙️ Configuration

Edit the Config class in smart_context.py:

class Config:
    OLLAMA_BASE_URL = "http://localhost:11434"  # Ollama API endpoint
    LLM_MODEL = "ministral-3:latest"            # Model for analysis
    CACHE_DIR = "./.repomix_rlm_cache"          # Cache location
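
For orientation, this is roughly what a single request to the configured endpoint looks like. It uses Ollama's `/api/generate` route with streaming disabled. The sketch uses the standard library's `urllib` to stay dependency-free (the project itself installs `requests`), and `build_request` and `generate` are illustrative names, not functions from `smart_context.py`.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434"  # same default as Config
LLM_MODEL = "ministral-3:latest"            # same default as Config


def build_request(
    prompt: str, base_url: str = OLLAMA_BASE_URL, model: str = LLM_MODEL
) -> urllib.request.Request:
    """Prepare a non-streaming POST to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the model's text (requires Ollama running)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Changing `LLM_MODEL` to another pulled model, for example one from the table below, is all that a model switch requires on the request side.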

Recommended Models

| Model | Speed | Quality | VRAM Required | Use Case |
| --- | --- | --- | --- | --- |
| `ministral-3:latest` | ⚡⚡⚡ | ⭐⭐⭐ | 6GB | Default - Fast & good quality |
| `llama3.2:latest` | ⚡⚡ | ⭐⭐⭐⭐ | 8GB | Larger projects, better summaries |
| `qwen2.5-coder:latest` | ⚡⚡ | ⭐⭐⭐⭐⭐ | 8GB | Code-specialized, highest quality |

Note: VRAM = Video RAM on your graphics card. Check yours with nvidia-smi.

🚀 Advantages Over Traditional Methods

vs. Manual Code Reading

| Traditional | Codebase RLM |
| --- | --- |
| ❌ Read files one by one | ✅ Instant hierarchical overview |
| ❌ Forget context between sessions | ✅ Persistent cached analysis |
| ❌ Miss architectural patterns | ✅ AI-generated architecture insights |

vs. Grep/Search Tools

| Grep/Ripgrep | Codebase RLM |
| --- | --- |
| ❌ Keyword matching only | ✅ Semantic understanding |
| ❌ No context about file purpose | ✅ Function signatures + responsibilities |
| ❌ Can't explain relationships | ✅ Module collaboration analysis |

vs. Loading Entire Codebase into AI

| Full Context Dump | Codebase RLM |
| --- | --- |
| ❌ Exceeds context window | ✅ Recursive hierarchy breaks context limits |
| ❌ Expensive token usage | ✅ One-time analysis, infinite queries |
| ❌ Slow processing | ✅ Cached, instant retrieval |
| ❌ Context rot between sessions | ✅ Persistent RLM intelligence map |

🐛 Troubleshooting

Issue: "Ollama connection refused"

Solution:

# Check if Ollama is running
ollama list

# Start Ollama service
ollama serve
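
If you would rather check connectivity from Python, for example before kicking off a long analysis, a plain TCP probe against the configured port is enough to tell "not running" apart from other failures. `ollama_reachable` is a hypothetical helper, not part of the server.

```python
import socket
from urllib.parse import urlparse


def ollama_reachable(base_url: str = "http://localhost:11434", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections at the Ollama address."""
    parsed = urlparse(base_url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 11434  # Ollama's default port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False
```

A `False` result with the default URL usually means `ollama serve` is not running, or that `OLLAMA_BASE_URL` in `Config` points at a different host or port.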

Issue: "repomix-output.xml not found" (When using XML mode)

Solution:

# Either run without XML mode (recommended):
"Use codebase-rlm to analyze this codebase"

# Or generate the XML file manually:
repomix .

Issue: "Analysis is slow"

Causes & Fixes:

No GPU: An NVIDIA GPU with 6GB+ VRAM is recommended for ministral-3. Check yours with nvidia-smi
Large files: RLM only analyzes the first 500,000 characters of each file (configurable)
Slow model: The default ministral-3:latest is the fastest of the recommended models
CPU fallback: Ollama on CPU is 10-50x slower than on GPU

Issue: "Cache not updating after code changes"

Solution:

# Delete cache and re-analyze
rm -rf .repomix_rlm_cache
# Then in RooCode:
"Use codebase-rlm to analyze this codebase"

Issue: "RooCode uses wrong MCP tool"

Solution: Always specify the tool name in your prompt:

"Use the codebase-rlm MCP tool to [your request]"

🔧 Advanced Usage

Custom Prompts

Edit the prompts in RLMIndexer to customize analysis style:

# File-level analysis (line 139)
prompt = f"""You are a Senior Code Reviewer. Summarize this code file.
...
"""

# Module-level analysis (line 182)
prompt = f"""You are a Software Architect. Summarize this MODULE.
...
"""

# System-level analysis (line 256)
prompt = f"""You are the CTO. Provide a System Architecture Overview...
...
"""

Multi-Project Support

The cache is project-specific (based on XML path hash):

# Project A
cd /project-a
repomix .
# RooCode analyzes → cached in .repomix_rlm_cache/abc12345/

# Project B
cd /project-b
repomix .
# RooCode analyzes → cached in .repomix_rlm_cache/def67890/
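
The eight-character directory names above suggest a truncated hash of the project's path. The README does not state which hash function or truncation is used, so the sketch below assumes MD5 of the resolved path, cut to eight hex characters. `project_cache_dir` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def project_cache_dir(xml_path: str, cache_root: str = ".repomix_rlm_cache") -> Path:
    """Derive a stable per-project cache folder from the snapshot's path."""
    # Assumption: MD5 of the absolute path, first 8 hex characters.
    absolute = str(Path(xml_path).resolve())
    digest = hashlib.md5(absolute.encode("utf-8")).hexdigest()[:8]
    return Path(cache_root) / digest
```

Because the key comes from the absolute path, the same project checked out in a different location gets a different cache folder, which is one reason cached results do not transfer cleanly between machines.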

🤝 Integration with RooCode Workflows

Workflow 1: Feature Development

1. Ask RooCode: "Use the codebase-rlm MCP tool to analyze this codebase"
   → RooCode calls mcp--codebase-rlm--analyze_codebase()
   → Builds RLM hierarchy (file → module → system)

2. Ask RooCode: "Use codebase-rlm to find where I should add user profile editing"
   → RooCode calls mcp--codebase-rlm--list_modules() + read_module_details()
   → Navigates the RLM structure

3. Ask RooCode: "Use codebase-rlm to show me the user model file"
   → RooCode calls mcp--codebase-rlm--get_file_content()
   → Retrieves from cached RLM map

4. Ask RooCode: "Add a profile_picture field to the User model"
   → RooCode edits the file with full architectural context
   → No context rot—RLM maintains relationships

Workflow 2: Bug Investigation

1. Ask RooCode: "Use codebase-rlm to explain the authentication flow"
   → RooCode queries auth module from RLM hierarchy

2. Ask RooCode: "Use codebase-rlm to show me where JWT tokens are validated"
   → RooCode retrieves specific file content

3. Ask RooCode: "Fix the token expiration bug"
   → RooCode makes targeted changes with full context

📊 Performance Metrics

Test Project: 50 Python files, ~10,000 lines of code

| Operation | Time | Notes |
| --- | --- | --- |
| Initial Analysis (GPU) | ~30s | One-time cost with RTX 4070 |
| Initial Analysis (CPU) | ~15min | Without GPU (not recommended) |
| Subsequent Queries | <1s | Cached retrieval |
| Architecture Overview | <1s | Pre-computed |
| File Content Retrieval | <0.1s | Direct XML access |

Cache Size: ~500KB for 50 files (JSON summaries)

🛡️ Security & Privacy

✅ No cloud APIs: Everything runs locally via Ollama
✅ No telemetry: Zero data collection
✅ Gitignore respected: Repomix excludes sensitive files
✅ Cache is local: Stored in .repomix_rlm_cache/ (add to .gitignore)

🗺️ Roadmap

[ ] Support for non-XML Repomix formats (Markdown, Plain Text)
[ ] Interactive CLI for standalone usage
[ ] VS Code extension for direct integration
[ ] Support for incremental updates (only re-analyze changed files)
[ ] Multi-language syntax highlighting in summaries
[ ] Export architecture diagrams (Mermaid/PlantUML)
[ ] AMD/Intel GPU support (when Ollama adds support)

🤔 FAQ

Q: Does this work with non-Python projects?
A: Yes! Repomix supports all languages. The LLM analyzes any code.

Q: Can I use a different LLM provider (OpenAI, Anthropic)?
A: Currently Ollama-only, but you can modify OllamaClient to support other APIs.

Q: Do I really need an NVIDIA GPU?
A: Technically no, but CPU analysis is 10-50x slower. For a 50-file project: GPU = 30s, CPU = 15min.

Q: How much disk space does the cache use?
A: ~10KB per file (JSON summaries). A 100-file project = ~1MB cache.

Q: Does this replace reading code?
A: No, it's a navigation tool. Use it to find what to read, then read the actual code.

Q: Can I share the cache with my team?
A: Not recommended. Cache paths are machine-specific. Each dev should generate their own.

Q: What's the difference between RLM and RAG?
A: RAG retrieves chunks on-demand. RLM pre-analyzes and caches hierarchical summaries, preventing context rot.

📄 License

MIT License - Feel free to use, modify, and distribute.

🙏 Acknowledgments

Repomix - Codebase packaging tool
FastMCP - MCP server framework
Ollama - Local LLM runtime
RooCode - AI coding assistant (formerly Roo Cline)

📞 Support

Issues? Open an issue on GitHub or ask RooCode:

"Use the codebase-rlm MCP tool to help me debug the codebase-rlm server"

Built with ❤️ for developers who want to understand code faster

Stop reading files one by one. Start navigating with Recursive Language Model intelligence.
