codebase-rlm
MCP server that uses Recursive Language Models to analyze codebases hierarchically, creating a persistent, queryable knowledge map to overcome context window limits and prevent context rot in RooCode.
Codebase RLM Navigator
Recursive Language Model Intelligence for RooCode
A Model Context Protocol (MCP) server that uses Recursive Language Models (RLM) to transform your codebase into an intelligent, hierarchical knowledge map. Built specifically for RooCode integration, it works around context window limitations and prevents context rot by analyzing code in layers (individual files, then modules, then system architecture) and storing the result as a persistent intelligence map that can be queried instantly.
Now with Standalone Mode: Analyze codebases directly from the filesystem without requiring Repomix!
What is RLM (Recursive Language Models)?
RLM is an architectural pattern that breaks down large codebases into hierarchical layers of understanding:
- Level 1 (File Layer): Each file is analyzed individually for functions, classes, and dependencies
- Level 2 (Module Layer): Related files are grouped and analyzed for collaboration patterns
- Level 3 (System Layer): The entire architecture is synthesized from module summaries
This recursive approach solves the context window problem by:
- ✅ Preventing Context Rot: Analysis is cached and reused, not regenerated each time
- ✅ Breaking the Context Limit: Only relevant layers are loaded on-demand
- ✅ Maintaining Coherence: Each layer builds on the previous, preserving relationships
- ✅ Enabling Instant Queries: Pre-computed summaries mean zero re-analysis time
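The three levels can be sketched as follows. This is a minimal illustration, not the code in smart_context.py: `build_hierarchy` and `fake_summarize` are hypothetical names, and `fake_summarize` stands in for the LLM call. The point it shows is that each level only ever sees the summaries of the level below, never the whole codebase.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Callable, Dict


def build_hierarchy(files: Dict[str, str], summarize: Callable[[str], str]) -> dict:
    """Summarize files, then modules (folders), then the whole system."""
    # Level 1: one summary per file.
    file_summaries = {path: summarize(source) for path, source in files.items()}

    # Level 2: group file summaries by parent folder and summarize each group.
    grouped = defaultdict(list)
    for path, summary in file_summaries.items():
        module = str(PurePosixPath(path).parent)
        grouped[module].append(f"{path}: {summary}")
    module_summaries = {m: summarize("\n".join(items)) for m, items in grouped.items()}

    # Level 3: summarize the module summaries, never the raw source.
    system_input = "\n".join(f"{m}: {s}" for m, s in module_summaries.items())
    return {
        "files": file_summaries,
        "modules": module_summaries,
        "system": summarize(system_input),
    }


def fake_summarize(text: str) -> str:
    """Stand-in for an LLM call (assumption): keep the first 40 characters of line one."""
    return text.strip().split("\n", 1)[0][:40]


demo = build_hierarchy(
    {
        "auth/login.py": "def login(): ...",
        "auth/token.py": "def issue(): ...",
        "api/routes.py": "def route(): ...",
    },
    fake_summarize,
)
print(sorted(demo["modules"]))
```

Swapping `fake_summarize` for a real model call keeps the structure unchanged, which is why the input to the system level stays small even for large projects.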
What Problem Does This Solve?
When working with large codebases in RooCode (or any AI coding assistant), you face critical challenges:
- Context Window Limitations: AI assistants can't load entire codebases at once
- Context Rot: AI forgets architectural decisions between conversations
- Manual File Navigation: You waste time explaining project structure repeatedly
- Inefficient Re-Analysis: Same code analyzed multiple times, wasting tokens and time
- Lost Relationships: AI doesn't understand how modules interact
Codebase RLM Navigator solves this using Recursive Language Models to create a persistent, hierarchical "intelligence map" that RooCode can query instantly: no context rot, no re-analysis, no wasted tokens.
Key Features
Intelligent Caching
- Analyzes each file once and caches the result until the file changes
- MD5 hash-based change detection
- Parallel processing with ThreadPoolExecutor (5 concurrent workers)
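A rough sketch of how hash-based change detection and a five-worker pool fit together is shown below. The cache layout (`{"hash": ..., "summary": ...}`) and the function names are assumptions made for illustration; only `hashlib.md5` and `ThreadPoolExecutor` come from the feature list above.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def md5_of(text: str) -> str:
    """Return the MD5 hex digest of a file's content."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def analyze_changed(
    files: Dict[str, str],
    cache: Dict[str, dict],
    analyze: Callable[[str], str],
    max_workers: int = 5,
) -> List[str]:
    """Re-analyze only the files whose MD5 differs from the cached hash."""
    changed = [p for p, src in files.items() if cache.get(p, {}).get("hash") != md5_of(src)]
    # Analyze the changed files concurrently; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(lambda p: analyze(files[p]), changed))
    for path, summary in zip(changed, summaries):
        cache[path] = {"hash": md5_of(files[path]), "summary": summary}
    return changed


# str.upper stands in for the LLM analysis step.
cache: Dict[str, dict] = {}
files = {"a.py": "x = 1", "b.py": "y = 2"}
first_run = analyze_changed(files, cache, str.upper)
files["b.py"] = "y = 3"
second_run = analyze_changed(files, cache, str.upper)
print(first_run, second_run)
```

On the second run only `b.py` is re-analyzed, because its hash no longer matches the cached one.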
Three-Level RLM Architecture
- File Level: Function signatures, dependencies, core responsibilities
- Module Level: How files collaborate within folders
- System Level: Overall architecture patterns and data flow
Instant Retrieval
- No re-parsing on subsequent queries
- Direct file content access
- Fuzzy file path matching
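The README does not say how fuzzy path matching is implemented. One plausible approach, sketched here with the standard library's `difflib`, tries an exact match, then a unique suffix match, then the closest string. `resolve_path` and the 0.6 cutoff are illustrative assumptions.

```python
import difflib
from typing import List, Optional


def resolve_path(query: str, known_paths: List[str]) -> Optional[str]:
    """Map a loosely typed path to one of the indexed paths, or None."""
    if query in known_paths:
        return query
    normalized = query.replace("\\", "/").lower()
    # A bare file name such as "login.py" matches "auth/login.py" if it is unique.
    suffix_hits = [p for p in known_paths if p.lower().endswith(normalized)]
    if len(suffix_hits) == 1:
        return suffix_hits[0]
    # Otherwise fall back to the closest string by similarity ratio.
    close = difflib.get_close_matches(normalized, known_paths, n=1, cutoff=0.6)
    return close[0] if close else None


paths = ["auth/login.py", "auth/token.py", "api/routes.py"]
print(resolve_path("login.py", paths))
print(resolve_path("auth/logn.py", paths))
```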
100% Local & Private
- Uses Ollama for LLM analysis (no cloud APIs)
- All data stays on your machine
- Works offline after initial setup
Use Cases
1. Onboarding to a New Codebase (Junior Dev Friendly)
Scenario: You just cloned a 50-file Python project and need to understand it quickly.
# In RooCode, simply ask:
"Use the codebase-rlm MCP tool to analyze this codebase and explain the architecture"
Important: Always specify "use the codebase-rlm MCP tool" to ensure RooCode uses the correct MCP server.
What happens:
- RooCode calls mcp--codebase-rlm--analyze_codebase() → Scans the filesystem directly and analyzes all files in parallel using RLM.
- Calls mcp--codebase-rlm--get_architecture() → Gets high-level system overview.
- Calls mcp--codebase-rlm--list_modules() → Shows you all modules with summaries.
Result: In about 30 seconds, you understand the entire project structure without reading a single file. The RLM hierarchy prevents context rot because the analysis stays cached until the code changes.
2. Finding Where to Add a Feature
Scenario: You need to add user authentication but don't know where the auth logic lives.
# In RooCode:
"Use the codebase-rlm MCP tool to show me all modules in this project"
# Then:
"Use the codebase-rlm tool to read the details of the 'auth' module"
# Then:
"Use the codebase-rlm tool to get the full content of auth/login.py"
Important: Specify the MCP tool name to avoid confusion with other tools.
What happens:
- mcp--codebase-rlm--list_modules() → Shows all folders with brief descriptions
- mcp--codebase-rlm--read_module_details("auth") → Shows all files in auth folder with summaries
- mcp--codebase-rlm--get_file_content("auth/login.py") → Retrieves exact file content
Result: You navigate directly to the right file without guessing. The RLM structure maintains context across queries.
3. Code Review Preparation
Scenario: You're reviewing a PR that touches 10 files across 3 modules.
# In RooCode:
"Use the codebase-rlm MCP tool to analyze the 'api', 'database', and 'utils' modules and explain how they interact"
What happens:
- RooCode queries module summaries from the RLM hierarchy
- Understands data flow between modules
- Provides architectural context for your review
Result: You review with full context of how changes affect the system. No context rot between review sessions.
4. Refactoring Large Projects
Scenario: You need to refactor a monolithic app into microservices.
# In RooCode:
"Use the codebase-rlm MCP tool to show me the system architecture and identify tightly coupled modules"
What happens:
- mcp--codebase-rlm--get_architecture() → Reveals architecture patterns from RLM analysis
- Module summaries show dependencies
- You identify refactoring boundaries
Result: Data-driven refactoring decisions instead of guesswork. The recursive analysis reveals hidden coupling.
Installation & Setup
Prerequisites
- Python 3.12+
    python --version # Must be 3.12 or higher
- NVIDIA GPU with 6GB+ VRAM (Required for ministral-3 model)
  - Minimum: RTX 3060, RTX 4060, GTX 1660 Ti, or any GPU with 6GB+ VRAM
  - Recommended: RTX 4070 (12GB+) for faster analysis
  - Check your GPU:
    nvidia-smi
- Ollama (Local LLM runtime - this runs the AI model on your computer)
    # Install from https://ollama.ai/
    # After installation, verify it's working:
    ollama list
    # Download the ministral-3 model (requires 6GB VRAM minimum):
    ollama pull ministral-3:latest
- Repomix (Optional - for manual snapshots)
    npm install -g repomix
Note: Without a supported GPU, Ollama falls back to CPU (10-50x slower). The requirements and timings in this README assume an NVIDIA GPU; for AMD, Intel, or Apple hardware, check the Ollama documentation for current support.
Step 1: Install Dependencies
pip install fastmcp requests
Git Ignore Setup
The project includes a .gitignore file that excludes the following:
- .repomix_rlm_cache/ - RLM analysis cache directory
- rlm-main/ - RLM main directory (external/submodule)
- RLM-Research.pdf - Research documentation
- guidelines.md - Project guidelines
These files are automatically excluded from git commits.
Step 2: Configure the MCP Server
Add this to your RooCode MCP settings file:
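Windows: %APPDATA%\Code\User\globalStorage\rooveterinaryinc.roo-cline\settings\cline_mcp_settings.json
macOS: ~/Library/Application Support/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json
Linux: ~/.config/Code/User/globalStorage/rooveterinaryinc.roo-cline/settings/cline_mcp_settings.json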
{
  "mcpServers": {
    "codebase-rlm": {
      "command": "python",
      "args": ["C:/Users/YourUsername/path/to/smart_context.py"],
      "disabled": false,
      "description": "Recursive Language Model codebase analyzer"
    }
  }
}
Important Notes:
- Replace the path with the actual location of smart_context.py
- The server name is codebase-rlm (this is what you reference in prompts)
- Always specify this MCP tool name when prompting RooCode to avoid confusion
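To avoid typing the absolute path by hand, a small helper can print the entry for you. This is only a convenience sketch: `mcp_entry` is a hypothetical helper, and it assumes you run it from the folder that contains smart_context.py. Paste the printed JSON into the settings file yourself, merging it with any servers already listed.

```python
import json
from pathlib import Path


def mcp_entry(script_path: str) -> dict:
    """Build the mcpServers entry shown above for a given smart_context.py path."""
    # as_posix() yields forward slashes, so no backslash escaping is needed in JSON.
    absolute = Path(script_path).expanduser().resolve().as_posix()
    return {
        "mcpServers": {
            "codebase-rlm": {
                "command": "python",
                "args": [absolute],
                "disabled": False,
                "description": "Recursive Language Model codebase analyzer",
            }
        }
    }


entry = mcp_entry("smart_context.py")
print(json.dumps(entry, indent=2))
```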
Step 3: Restart RooCode
Close and reopen VS Code to load the MCP server.
How to Use
Basic Workflow (Standalone - Recommended)
# 1. Navigate to your project directory in VS Code
# 2. In RooCode, ALWAYS specify the MCP tool:
"Use the codebase-rlm MCP tool to analyze this codebase and show me the architecture"
Advanced Workflow (Using Repomix)
If you prefer to use a pre-generated Repomix XML file:
# 1. Generate Repomix snapshot manually
repomix .
# 2. In RooCode:
"Use the codebase-rlm MCP tool to analyze this codebase using the repomix XML"
Important: How to Prompt RooCode
To avoid confusion with other MCP tools, always specify the tool name:
✅ Correct Prompts:
"Use the codebase-rlm MCP tool to analyze this codebase"
"Call the analyze_codebase tool from codebase-rlm"
"Use codebase-rlm to show me all modules"
❌ Incorrect Prompts (will confuse RooCode):
"Analyze this codebase" # Too vague, might use wrong tool
"Show me the architecture" # Doesn't specify which MCP server
Available MCP Tools
RooCode calls these tools when you specify the codebase-rlm MCP server:
| Tool | Purpose | Example Query |
|---|---|---|
| mcp--codebase-rlm--analyze_codebase(path, use_repomix_xml, repomix_xml_path) | Scan files and build RLM intelligence map | "Use codebase-rlm to analyze this codebase" |
| mcp--codebase-rlm--get_architecture() | Get system-level overview | "Use codebase-rlm to show the architecture pattern" |
| mcp--codebase-rlm--list_modules() | List all modules with summaries | "Use codebase-rlm to show me all modules" |
| mcp--codebase-rlm--read_module_details(module_path) | Get detailed module info | "Use codebase-rlm to explain the 'api' module" |
| mcp--codebase-rlm--get_file_content(file_path) | Retrieve file content | "Use codebase-rlm to show me api/routes.py" |
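To make the read-only tools more concrete, here is a toy model of what they might return from an already-built map. The `INTEL_MAP` structure and its contents are invented for illustration; the real cache layout and return formats in smart_context.py may differ.

```python
from typing import Dict

# Hypothetical shape of a cached intelligence map (assumption, not the real layout).
INTEL_MAP: Dict = {
    "system": "Layered web app: api -> auth -> database",
    "modules": {
        "auth": {
            "summary": "Login and token handling",
            "files": {"auth/login.py": "Validates credentials"},
        },
        "api": {
            "summary": "HTTP routes",
            "files": {"api/routes.py": "Registers endpoints"},
        },
    },
}


def get_architecture() -> str:
    """Return the pre-computed system-level summary."""
    return INTEL_MAP["system"]


def list_modules() -> str:
    """Return one line per module with its summary."""
    modules = sorted(INTEL_MAP["modules"].items())
    return "\n".join(f"{name}: {info['summary']}" for name, info in modules)


def read_module_details(module_path: str) -> str:
    """Return one line per file in the module, or a message if it is unknown."""
    module = INTEL_MAP["modules"].get(module_path)
    if module is None:
        return f"Unknown module: {module_path}"
    return "\n".join(f"{p}: {s}" for p, s in sorted(module["files"].items()))


print(list_modules())
print(read_module_details("auth"))
```

None of these functions calls a model: they only read summaries that already exist, which is what makes follow-up queries fast.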
Configuration
Edit the Config class in smart_context.py:
class Config:
    OLLAMA_BASE_URL = "http://localhost:11434"  # Ollama API endpoint
    LLM_MODEL = "ministral-3:latest"            # Model for analysis
    CACHE_DIR = "./.repomix_rlm_cache"          # Cache location
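For orientation, this is roughly how those two settings turn into a call to Ollama's `/api/generate` endpoint. It is a sketch using only the standard library (the project itself installs `requests`), and `build_generate_request` and `generate` are hypothetical names rather than methods of the actual OllamaClient.

```python
import json
import urllib.request


class Config:
    OLLAMA_BASE_URL = "http://localhost:11434"  # Ollama API endpoint
    LLM_MODEL = "ministral-3:latest"            # Model for analysis


def build_generate_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": Config.LLM_MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{Config.OLLAMA_BASE_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


request = build_generate_request("Summarize this code file.")
print(request.full_url)
```

Changing `LLM_MODEL` to another model you have pulled is all that is needed to switch models; the request shape stays the same.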
Recommended Models
| Model | Speed | Quality | VRAM Required | Use Case |
|---|---|---|---|---|
| ministral-3:latest | ⚡⚡⚡ | ⭐⭐⭐ | 6GB | Default - Fast & good quality |
| llama3.2:latest | ⚡⚡ | ⭐⭐⭐⭐ | 8GB | Larger projects, better summaries |
| qwen2.5-coder:latest | ⚡⚡ | ⭐⭐⭐⭐⭐ | 8GB | Code-specialized, highest quality |
Note: VRAM = Video RAM on your graphics card. Check yours with nvidia-smi.
Advantages Over Traditional Methods
vs. Manual Code Reading
| Traditional | Codebase RLM |
|---|---|
| ❌ Read files one by one | ✅ Instant hierarchical overview |
| ❌ Forget context between sessions | ✅ Persistent cached analysis |
| ❌ Miss architectural patterns | ✅ AI-generated architecture insights |
vs. Grep/Search Tools
| Grep/Ripgrep | Codebase RLM |
|---|---|
| ❌ Keyword matching only | ✅ Semantic understanding |
| ❌ No context about file purpose | ✅ Function signatures + responsibilities |
| ❌ Can't explain relationships | ✅ Module collaboration analysis |
vs. Loading Entire Codebase into AI
| Full Context Dump | Codebase RLM |
|---|---|
| ❌ Exceeds context window | ✅ Recursive hierarchy breaks context limits |
| ❌ Expensive token usage | ✅ One-time analysis, unlimited queries |
| ❌ Slow processing | ✅ Cached, instant retrieval |
| ❌ Context rot between sessions | ✅ Persistent RLM intelligence map |
Troubleshooting
Issue: "Ollama connection refused"
Solution:
# Check if Ollama is running
ollama list
# Start Ollama service
ollama serve
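The same check can be scripted. The sketch below queries Ollama's `/api/tags` endpoint (the one that lists local models) and treats any connection failure as "not running". `ollama_is_running` is a hypothetical helper, and the default URL is the one from the Config class.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers on base_url, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if not ollama_is_running():
    print("Ollama is not reachable; start it with: ollama serve")
```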
Issue: "repomix-output.xml not found" (When using XML mode)
Solution:
# Either run without XML mode (recommended):
"Use codebase-rlm to analyze this codebase"
# Or generate the XML file manually:
repomix .
Issue: "Analysis is slow"
Causes & Fixes:
- No GPU: Requires NVIDIA GPU with 6GB+ VRAM for ministral-3. Check with nvidia-smi
- Large files: RLM only analyzes first 500,000 chars per file (configurable)
- Slow model: The default ministral-3:latest is the fastest option
- CPU fallback: Ollama on CPU is 10-50x slower than GPU
Issue: "Cache not updating after code changes"
Solution:
# Delete cache and re-analyze
rm -rf .repomix_rlm_cache
# Then in RooCode:
"Use codebase-rlm to analyze this codebase"
Issue: "RooCode uses wrong MCP tool"
Solution: Always specify the tool name in your prompt:
"Use the codebase-rlm MCP tool to [your request]"
Advanced Usage
Custom Prompts
Edit the prompts in RLMIndexer to customize analysis style:
# File-level analysis (line 139)
prompt = f"""You are a Senior Code Reviewer. Summarize this code file.
...
"""
# Module-level analysis (line 182)
prompt = f"""You are a Software Architect. Summarize this MODULE.
...
"""
# System-level analysis (line 256)
prompt = f"""You are the CTO. Provide a System Architecture Overview...
...
"""
Multi-Project Support
The cache is project-specific (based on XML path hash):
# Project A
cd /project-a
repomix .
# RooCode analyzes → cached in .repomix_rlm_cache/abc12345/
# Project B
cd /project-b
repomix .
# RooCode analyzes → cached in .repomix_rlm_cache/def67890/
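One way such per-project folders could be derived is by hashing the snapshot path and keeping a short prefix. The README only says the folder is "based on XML path hash", so the choice of MD5 and an 8-character prefix here is an assumption, as is the `project_cache_dir` name.

```python
import hashlib
from pathlib import Path


def project_cache_dir(xml_path: str, cache_root: str = ".repomix_rlm_cache") -> Path:
    """Derive a stable, project-specific cache folder from the snapshot path."""
    # Assumption: first 8 hex characters of the MD5 of the path string.
    digest = hashlib.md5(xml_path.encode("utf-8")).hexdigest()[:8]
    return Path(cache_root) / digest


dir_a = project_cache_dir("/project-a/repomix-output.xml")
dir_b = project_cache_dir("/project-b/repomix-output.xml")
print(dir_a, dir_b)
```

Because the folder name depends only on the path, re-running the analysis on the same project finds the same cache, while a different project gets its own folder.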
Integration with RooCode Workflows
Workflow 1: Feature Development
1. Ask RooCode: "Use the codebase-rlm MCP tool to analyze this codebase"
   → RooCode calls mcp--codebase-rlm--analyze_codebase()
   → Builds RLM hierarchy (file → module → system)
2. Ask RooCode: "Use codebase-rlm to find where I should add user profile editing"
   → RooCode calls mcp--codebase-rlm--list_modules() + read_module_details()
   → Navigates the RLM structure
3. Ask RooCode: "Use codebase-rlm to show me the user model file"
   → RooCode calls mcp--codebase-rlm--get_file_content()
   → Retrieves from cached RLM map
4. Ask RooCode: "Add a profile_picture field to the User model"
   → RooCode edits the file with full architectural context
   → No context rot: RLM maintains relationships
Workflow 2: Bug Investigation
1. Ask RooCode: "Use codebase-rlm to explain the authentication flow"
   → RooCode queries auth module from RLM hierarchy
2. Ask RooCode: "Use codebase-rlm to show me where JWT tokens are validated"
   → RooCode retrieves specific file content
3. Ask RooCode: "Fix the token expiration bug"
   → RooCode makes targeted changes with full context
Performance Metrics
Test Project: 50 Python files, ~10,000 lines of code
| Operation | Time | Notes |
|---|---|---|
| Initial Analysis (GPU) | ~30s | One-time cost with RTX 4070 |
| Initial Analysis (CPU) | ~15min | Without GPU (not recommended) |
| Subsequent Queries | <1s | Cached retrieval |
| Architecture Overview | <1s | Pre-computed |
| File Content Retrieval | <0.1s | Direct XML access |
Cache Size: ~500KB for 50 files (JSON summaries)
Security & Privacy
- ✅ No cloud APIs: Everything runs locally via Ollama
- ✅ No telemetry: Zero data collection
- ✅ Gitignore respected: Repomix excludes sensitive files
- ✅ Cache is local: Stored in .repomix_rlm_cache/ (add to .gitignore)
Roadmap
- [ ] Support for non-XML Repomix formats (Markdown, Plain Text)
- [ ] Interactive CLI for standalone usage
- [ ] VS Code extension for direct integration
- [ ] Support for incremental updates (only re-analyze changed files)
- [ ] Multi-language syntax highlighting in summaries
- [ ] Export architecture diagrams (Mermaid/PlantUML)
- [ ] AMD/Intel GPU support (when Ollama adds support)
FAQ
Q: Does this work with non-Python projects?
A: Yes! Repomix supports all languages. The LLM analyzes any code.
Q: Can I use a different LLM provider (OpenAI, Anthropic)?
A: Currently Ollama-only, but you can modify OllamaClient to support other APIs.
Q: Do I really need an NVIDIA GPU?
A: Technically no, but CPU analysis is 10-50x slower. For a 50-file project: GPU = 30s, CPU = 15min.
Q: How much disk space does the cache use?
A: ~10KB per file (JSON summaries). A 100-file project = ~1MB cache.
Q: Does this replace reading code?
A: No, it's a navigation tool. Use it to find what to read, then read the actual code.
Q: Can I share the cache with my team?
A: Not recommended. Cache paths are machine-specific. Each dev should generate their own.
Q: What's the difference between RLM and RAG?
A: RAG retrieves chunks on-demand. RLM pre-analyzes and caches hierarchical summaries, preventing context rot.
License
MIT License - Feel free to use, modify, and distribute.
Acknowledgments
- Repomix - Codebase packaging tool
- FastMCP - MCP server framework
- Ollama - Local LLM runtime
- RooCode - AI coding assistant (formerly Roo Cline)
Support
Issues? Open an issue on GitHub or ask RooCode:
"Use the codebase-rlm MCP tool to help me debug the Repomix RLM server"
Built with ❤️ for developers who want to understand code faster
Stop reading files one by one. Start navigating with Recursive Language Model intelligence.