MCP Codebase RAG Server

Provides semantic vector search over local codebases via MCP, enabling hybrid search (dense + sparse + RRF) for any MCP client like GitHub Copilot or Claude Desktop.


Self-hosted MCP server that adds semantic vector search over your local codebases to any MCP-capable client (GitHub Copilot, Cline, Claude Desktop, etc.).
Goal: Robust RAG for Copilot (or any MCP client) without paying for Cursor/Windsurf.
Zero cost. Zero limits. Full control.


📋 Overview

Problem Solved

  • GitHub Copilot Pro has an excellent model but limited codebase RAG
  • Cursor/Windsurf have good RAG but cost $15–20/month
  • Continue.dev has RAG but doesn't integrate natively with MCP-aware agents

Solution

This MCP server provides:

  1. Indexing of local codebases using vector embeddings
  2. Hybrid semantic search via the search_codebase tool, combining dense (embeddings) and sparse (BM25) retrieval with RRF fusion
  3. Multi-project support with isolated ChromaDB collections
  4. Universal integration with any MCP client

Tech Stack

| Component | Technology | Why |
| --- | --- | --- |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) | Fast, lightweight, 384-dim |
| Code embeddings | microsoft/unixcoder-base (optional) | Code-specific model, activated via EMBEDDING_MODEL |
| Vector DB | ChromaDB | Simple, persistent, zero config |
| Code parsing | Tree-sitter + BM25 + RRF | Universal language-agnostic chunking and hybrid search |
| MCP SDK | modelcontextprotocol/python-sdk | Official standard |
| Runtime | Python 3.11+ | - |

🚀 Installation

Prerequisites

  • Python 3.11+
  • pip or uv

Install

git clone https://github.com/di5rupt0r/codebase-rag.git
cd codebase-rag

# Install as a package (adds the `codebase-rag` command to your environment's bin directory, e.g. ~/.local/bin for a user install)
pip install -e .

Health Check

python scripts/health_check.py

Expected output:

๐Ÿ” MCP Codebase RAG Server Health Check
==================================================
Checking Embedding Provider... โœ“ OK (3.03s)
Checking ChromaDB Connection... โœ“ OK (0.14s)
Checking Search Functionality... โœ“ OK (2.36s)
Checking Data Directory... โœ“ OK (0.00s)
==================================================
Health Check Summary: 4/4 checks passed
๐ŸŽ‰ All systems operational!

📖 Quick Start

1. Index a Project

# Index the current directory
python scripts/index_project.py . --name my-project

# Index a specific path
python scripts/index_project.py ~/projects/api --name api-backend

# Force full reindex
python scripts/index_project.py . --name my-project --force

# Dry run to preview what will be indexed
python scripts/index_project.py . --name my-project --dry-run

2. Start the MCP Server

stdio (default, for local clients)

codebase-rag

HTTP (for remote clients or always-on service)

MCP_TRANSPORT=streamable-http MCP_PORT=8080 codebase-rag

3. Configure Your MCP Client

VS Code (GitHub Copilot / Cline), stdio mode

Add to your VS Code mcp.json:

{
  "servers": {
    "codebase-rag": {
      "type": "stdio",
      "command": "codebase-rag"
    }
  }
}

VS Code, HTTP mode (when running as a service)

{
  "servers": {
    "codebase-rag": {
      "type": "http",
      "url": "http://127.0.0.1:8080/mcp"
    }
  }
}

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "codebase-rag": {
      "command": "codebase-rag"
    }
  }
}

Search Capabilities

Hybrid Search Architecture

The server implements a hybrid search system that combines:

  1. Dense Search (Vector Embeddings)

    • Semantic similarity using sentence-transformers
    • Finds conceptually similar code
    • Base: ChromaDB vector similarity
  2. Sparse Search (BM25)

    • Exact lexical term matching
    • Finds precise identifiers and keywords
    • Base: rank-bm25 with regex tokenization
  3. Reciprocal Rank Fusion (RRF)

    • Intelligent fusion of dense + sparse results
    • k=60 (standard literature value)
    • Improves both precision and recall
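The fusion step in item 3 can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion with k=60 and 1-based ranks, not the server's actual code; the function name `rrf_fuse` and the chunk IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense = ["auth.py:15", "session.py:40", "users.py:8"]    # hypothetical dense ranking
sparse = ["session.py:40", "auth.py:15", "tokens.py:3"]  # hypothetical BM25 ranking

for chunk_id, score in rrf_fuse([dense, sparse]):
    print(f"{chunk_id}  {score:.4f}")
```

Under this formulation a chunk ranked first by both retrievers scores 2/61 ≈ 0.0328, so fused scores are small by construction and only meaningful relative to one another. A score near 0.0325 corresponds to a chunk ranked first in one list and second in the other.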

Search Results

{
  "results": [
    {
      "path": "src/auth.py",
      "content": "def authenticate_user(user, password): ...",
      "score": 0.0325,
      "type": "function",
      "name": "authenticate_user",
      "line_start": 15,
      "line_end": 25
    }
  ],
  "total_indexed_chunks": 1247,
  "query_time_ms": 23.4,
  "search_type": "hybrid_rrf"
}
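A client that receives this payload might turn each hit into a jump-to-location string. The sketch below relies only on the field names shown in the sample response; `format_hits` is a hypothetical helper, not part of the server.

```python
import json

# Sample payload copied from the response shown above.
RESPONSE = """
{
  "results": [
    {
      "path": "src/auth.py",
      "content": "def authenticate_user(user, password): ...",
      "score": 0.0325,
      "type": "function",
      "name": "authenticate_user",
      "line_start": 15,
      "line_end": 25
    }
  ],
  "total_indexed_chunks": 1247,
  "query_time_ms": 23.4,
  "search_type": "hybrid_rrf"
}
"""


def format_hits(payload: dict) -> list[str]:
    """Render each result as 'path:start-end  type name  (score)'."""
    lines = []
    for hit in payload.get("results", []):
        location = f"{hit['path']}:{hit['line_start']}-{hit['line_end']}"
        lines.append(f"{location}  {hit['type']} {hit['name']}  (score {hit['score']:.4f})")
    return lines


payload = json.loads(RESPONSE)
for line in format_hits(payload):
    print(line)
```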

Performance Characteristics

| Metric | Target | Description |
| --- | --- | --- |
| Tree-sitter parsing | < 50ms/file | Universal language parsing |
| BM25 indexing | < 10ms/query | In-memory reconstruction |
| RRF fusion | < 1ms | In-memory score calculation |
| Total query time | < 100ms | End-to-end hybrid search |
| Memory overhead | < 50MB | For 5k chunks |

Fallback Behavior

  • Tree-sitter unavailable → Line-based chunking
  • BM25 unavailable → Dense-only search
  • Both unavailable → Original dense search with keyword reranking
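One way to express these fallback rules is to probe for the optional packages at startup. The sketch below is an assumption about how such a check could look, not the server's code: `tree_sitter` and `rank_bm25` are the import names of the respective packages, while `choose_strategies` and the strategy labels other than `hybrid_rrf` are hypothetical.

```python
from importlib.util import find_spec
from typing import Callable


def module_available(name: str) -> bool:
    """Return True if the top-level module can be found in this environment."""
    return find_spec(name) is not None


def choose_strategies(available: Callable[[str], bool] = module_available) -> dict[str, str]:
    """Pick chunking and search modes based on which optional packages exist."""
    has_tree_sitter = available("tree_sitter")
    has_bm25 = available("rank_bm25")

    if not has_tree_sitter and not has_bm25:
        # Both missing: original dense search with keyword reranking.
        return {"chunking": "line_based", "search": "dense_keyword_rerank"}
    return {
        "chunking": "tree_sitter" if has_tree_sitter else "line_based",
        "search": "hybrid_rrf" if has_bm25 else "dense_only",
    }


print(choose_strategies())
```

Passing the availability check as a parameter keeps the decision logic testable without installing or uninstalling packages.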

๏ฟฝ๏ธ MCP Tools

search_codebase

Hybrid semantic search over an indexed project using vector embeddings + BM25 + RRF fusion.

Input:

{
  "query": "where is the authentication logic?",
  "top_k": 5,
  "project": "my-project",
  "file_types": [".py", ".js"]
}

Output:

{
  "results": [
    {
      "path": "src/auth.py",
      "content": "def authenticate_user(user, password):\n    ...",
      "score": 0.89
    }
  ],
  "total_indexed_chunks": 1247,
  "query_time_ms": 23
}
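MCP clients normally build the request for you, but at the protocol level a tool invocation is a JSON-RPC 2.0 `tools/call` message. The sketch below only serializes such a message for the input shown above; `build_tool_call` is a hypothetical helper, and a real session also needs the `initialize` handshake, which is omitted here.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message, indent=2)


request = build_tool_call(
    1,
    "search_codebase",
    {
        "query": "where is the authentication logic?",
        "top_k": 5,
        "project": "my-project",
        "file_types": [".py", ".js"],
    },
)
print(request)
```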

reindex_project

Re-index a project after large changes.

Input:

{
  "project_path": "/path/to/your/project",
  "project_name": "my-project",
  "force": false
}

list_indexed_projects

List all indexed projects.

get_files

List indexed files in a project.

Input: { "project": "my-project" }

get_file_content

Return the full content of an indexed file.

Input: { "path": "src/main.py" }


โš™๏ธ Configuration

Environment Variables

# ChromaDB path (default: ./data/chroma_db relative to install dir)
export CHROMA_DB_PATH="/custom/path/to/chroma"

# Embedding model (default: all-MiniLM-L6-v2)
# Use microsoft/unixcoder-base for better code-specific embeddings (~2GB, requires torch)
export EMBEDDING_MODEL="microsoft/unixcoder-base"

# HTTP transport settings (only needed in HTTP/service mode)
export MCP_TRANSPORT="streamable-http"
export MCP_HOST="127.0.0.1"
export MCP_PORT="8080"
# Set this when exposing via reverse proxy or Tailscale Funnel
export MCP_ALLOWED_HOST="your-hostname.example.com"

# Log level (default: INFO)
export LOG_LEVEL="DEBUG"
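For readers wiring these variables into their own tooling, the following sketch shows one way to read them with fallbacks. It is not the server's config module: the `Settings` class and `load_settings` are hypothetical, and the defaults for `MCP_TRANSPORT`, `MCP_HOST` and `MCP_PORT` are assumptions taken from the examples in this README.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    chroma_db_path: str
    embedding_model: str
    transport: str
    host: str
    port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to defaults when unset."""
    return Settings(
        # Documented default; the README resolves it relative to the install dir.
        chroma_db_path=env.get("CHROMA_DB_PATH", "./data/chroma_db"),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        transport=env.get("MCP_TRANSPORT", "stdio"),   # assumed default
        host=env.get("MCP_HOST", "127.0.0.1"),         # assumed default
        port=int(env.get("MCP_PORT", "8080")),         # assumed default
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


print(load_settings({"MCP_TRANSPORT": "streamable-http", "MCP_PORT": "9000"}))
```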

Chunking (Advanced)

Edit src/codebase_rag/config.py:

CHUNK_SIZE = 500          # characters per chunk
CHUNK_OVERLAP = 50        # overlap between chunks
DEFAULT_TOP_K = 5         # default results per search
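To see how the two chunking values interact, here is a minimal sketch of fixed-size character chunking with overlap. It illustrates the general technique only; how the server combines these settings with Tree-sitter chunking is not specified here, and `chunk_text` is a hypothetical name.

```python
CHUNK_SIZE = 500     # characters per chunk
CHUNK_OVERLAP = 50   # characters shared by consecutive chunks


def chunk_text(text: str, size: int = CHUNK_SIZE, overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "x" * 1000
print([len(c) for c in chunk_text(sample)])
```

With the defaults each window starts 450 characters after the previous one, so a larger overlap means more chunks and a larger index for the same file.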

Supported File Types

Python, JavaScript, TypeScript, JSX, TSX, Java, C, C++, Go, Rust, Ruby, PHP, C#, Shell, YAML, JSON.

Ignored Patterns

*.pyc, __pycache__, .git, node_modules, .venv, venv, *.egg-info, .pytest_cache
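The patterns above mix glob expressions (`*.pyc`) with plain directory names (`node_modules`). A simple way to apply such a list, assuming each pattern is matched against every path component, is sketched below; `is_ignored` is a hypothetical helper and the server's matching rules may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

IGNORED_PATTERNS = [
    "*.pyc", "__pycache__", ".git", "node_modules",
    ".venv", "venv", "*.egg-info", ".pytest_cache",
]


def is_ignored(path: str, patterns: list[str] = IGNORED_PATTERNS) -> bool:
    """Return True if any component of the path matches an ignore pattern."""
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


for candidate in [
    "src/main.py",
    "src/__pycache__/main.cpython-311.pyc",
    "node_modules/x/index.js",
]:
    print(candidate, is_ignored(candidate))
```

Matching whole components means `.github/` is not caught by the `.git` pattern, which is usually the desired behavior.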


📊 Benchmarks

| Operation | Expected Time | Notes |
| --- | --- | --- |
| Index 20 .py files (~5k LOC) | ~5–8s | First run; incremental is much faster |
| Vector search (top_k=5) | ~20–50ms | ChromaDB in-process |
| Query embedding | ~10–20ms | sentence-transformers, CPU |
| Server cold start | ~2–3s | Model loaded into memory |

🤖 Automation Scripts

Auto-discovery

Scan a directory for Git repositories and index them all automatically:

python scripts/auto_index.py ~/projects
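The discovery step amounts to walking a directory tree and stopping at every folder that contains a `.git` entry. The sketch below shows that idea using only the standard library; it is an assumption about the approach, not the contents of `auto_index.py`, and `find_git_repos` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def find_git_repos(root: Path) -> list[Path]:
    """Return directories under `root` that contain a `.git` entry."""
    repos = []
    for current, dirnames, filenames in os.walk(root):
        # Worktrees and submodules use a `.git` file instead of a directory.
        if ".git" in dirnames or ".git" in filenames:
            repos.append(Path(current))
            dirnames.clear()  # do not descend into a repository once found
    return sorted(repos)


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "api" / ".git").mkdir(parents=True)
    (root / "tools" / "cli" / ".git").mkdir(parents=True)
    (root / "notes").mkdir()
    for repo in find_git_repos(root):
        print(repo.relative_to(root))
```

Pruning the walk at each repository root avoids indexing vendored or nested repositories twice.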

Watch Mode

Watch a project for file changes and reindex incrementally, with a 5-second debounce:

python scripts/watch.py /path/to/project --name my-project

Git Hook (post-commit reindex)

Install a post-commit hook so changed files are reindexed automatically after every commit:

python scripts/setup_git_hook.py /path/to/your/repo my-project

🧪 Tests

# All tests (116 passing)
pytest -v

# Specific modules
pytest tests/test_config.py -v
pytest tests/test_embeddings.py -v
pytest tests/test_indexer.py -v
pytest tests/test_server.py -v

# With coverage
pytest --cov=codebase_rag --cov-report=html

🔧 Deploy as a systemd Service (Linux)

A template service file is provided at systemd/codebase-rag-server.service.
Replace YOUR_USERNAME with your actual Linux username before installing:

# Substitute your username in-place
sed -i "s/YOUR_USERNAME/$USER/g" systemd/codebase-rag-server.service

# Install and start
sudo cp systemd/codebase-rag-server.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable codebase-rag-server
sudo systemctl start codebase-rag-server

# Check
sudo systemctl status codebase-rag-server
sudo journalctl -u codebase-rag-server -f

Exposing Remotely via Tailscale Funnel (optional)

To use the server from a remote machine (Codespaces, company laptop, etc.):

# Expose port 8080 via Tailscale Funnel
tailscale funnel 8080

# Add to your service file:
# Environment="MCP_ALLOWED_HOST=your-machine.your-tailnet.ts.net"

# Then in your remote mcp.json:
# "url": "https://your-machine.your-tailnet.ts.net/mcp"

๐Ÿ› Troubleshooting

Slow first start: The embedding model (~100MB) is downloaded on first use. Run health_check.py to pre-load it.

High memory usage: The default model uses ~500MB RAM. If needed, use an even smaller model via EMBEDDING_MODEL.

Permission errors: Ensure the running user has write access to data/chroma_db/.

Debug mode:

LOG_LEVEL=DEBUG codebase-rag

๐Ÿ“ Contributing

  1. Fork the project
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Follow strict TDD: RED → GREEN → REFACTOR
  4. Atomic, descriptive commits
  5. Open a pull request with tests

# Dev setup
pip install -e ".[dev]"
pytest -v --cov=codebase_rag

📄 License

MIT License; see LICENSE.
