MCP Codebase Mentor

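An MCP (Model Context Protocol) server that acts as an AI mentor for any codebase. It uses dual-layer indexing (a file manifest plus a vector index) to support codebase initialization, tutorial generation, and semantic search.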

Features

  • Universal language support - File analysis is LLM-based rather than parser-based, so it is not tied to specific programming languages
  • Broad file coverage - Indexes code, tests, configs, and docs (binary files are skipped)
  • Smart filtering - Respects .gitignore and applies sensible defaults
  • Semantic search - Vector-based code search using LlamaIndex
  • Tutorial generation - Creates structured learning guides with architecture diagrams

Installation

# Clone the repository
git clone <repository-url>
cd mcp-codebase

# Install dependencies
npm install

# Build the project
npm run build

Usage with Cursor/Claude

Add the server to your MCP configuration, replacing /path/to/mcp-codebase with the absolute path to your clone:

{
  "mcpServers": {
    "codebase-mentor": {
      "command": "node",
      "args": ["/path/to/mcp-codebase/dist/index.js"]
    }
  }
}

Available Tools
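
Hosts such as Cursor or Claude normally issue tool calls for you, but it can help to see what goes over the wire. MCP tools are invoked with a JSON-RPC 2.0 `tools/call` request. The sketch below builds one for init_codebase; the `buildToolCall` helper and the `id` value are illustrative assumptions, not part of this server.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "init_codebase", { rootPath: "/path/to/your/project" });
console.log(JSON.stringify(request, null, 2));
```

The other tools follow the same pattern with a different `name` and `arguments` object.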

init_codebase

Initialize and index a codebase for AI mentoring.

init_codebase(rootPath: "/path/to/your/project")

This will:

  1. Crawl the directory structure (respecting .gitignore)
  2. Analyze each file with AI to extract summaries, imports, and exports
  3. Build a manifest with file metadata and dependency graph
  4. Create a vector index for semantic search
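
As a rough illustration of the crawl step, the sketch below decides whether a path should be indexed. It is a deliberately simplified stand-in, not the server's actual filter: it only understands `dir/`, `*.ext`, and exact file-name patterns (real .gitignore semantics are richer), and the default ignore lists are assumptions.

```typescript
// Default ignores are assumptions for illustration, not the server's actual list.
const DEFAULT_IGNORED_DIRS = ["node_modules", ".git", "dist"];
const BINARY_EXTENSIONS = [".png", ".jpg", ".pdf", ".exe"];

// Supports only three simplified pattern forms: "dir/", "*.ext", and exact file names.
function shouldIndex(relPath: string, ignorePatterns: string[]): boolean {
  const segments = relPath.split("/");
  const fileName = segments[segments.length - 1];
  const dirs = segments.slice(0, -1);

  if (dirs.some((d) => DEFAULT_IGNORED_DIRS.includes(d))) return false;
  if (BINARY_EXTENSIONS.some((ext) => fileName.toLowerCase().endsWith(ext))) return false;

  for (const raw of ignorePatterns) {
    const pattern = raw.trim();
    if (pattern === "" || pattern.startsWith("#")) continue; // blank line or comment
    if (pattern.endsWith("/")) {
      if (dirs.includes(pattern.slice(0, -1))) return false;
    } else if (pattern.startsWith("*.")) {
      if (fileName.endsWith(pattern.slice(1))) return false;
    } else if (fileName === pattern) {
      return false;
    }
  }
  return true;
}

const gitignore = ["# build output", "coverage/", "*.log", ".env"];
const candidates = [
  "src/index.ts",
  "node_modules/x/index.js",
  "coverage/lcov.info",
  "debug.log",
  ".env",
  "docs/logo.png",
  "README.md",
];
const indexed = candidates.filter((p) => shouldIndex(p, gitignore));
console.log(indexed);
```

Only `src/index.ts` and `README.md` survive the filter in this example; everything else is excluded by a default or by one of the sample patterns.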

Output files:

  • .mcp_manifest.json - File metadata and dependency graph
  • .mcp_index/ - Vector index for semantic search
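
The manifest schema is not documented here, so the following is only a guess at what "file metadata and dependency graph" could look like. The `FileEntry` and `Manifest` types and the `buildManifest` function are hypothetical names introduced for illustration.

```typescript
// Hypothetical manifest shape; the real .mcp_manifest.json schema may differ.
interface FileEntry {
  path: string;
  summary: string;
  imports: string[]; // import targets reported by the analysis step
  exports: string[];
}

interface Manifest {
  files: Record<string, FileEntry>;
  dependencyGraph: Record<string, string[]>; // file -> indexed files it imports
}

function buildManifest(entries: FileEntry[]): Manifest {
  const known = new Set(entries.map((e) => e.path));
  const files: Record<string, FileEntry> = {};
  const dependencyGraph: Record<string, string[]> = {};

  for (const entry of entries) {
    files[entry.path] = entry;
    // Keep only edges that point at files inside the indexed set,
    // which drops external packages and built-in modules.
    dependencyGraph[entry.path] = entry.imports.filter((dep) => known.has(dep));
  }
  return { files, dependencyGraph };
}

const manifest = buildManifest([
  {
    path: "src/index.ts",
    summary: "MCP server entry point",
    imports: ["src/tools/init.ts", "node:path"],
    exports: [],
  },
  {
    path: "src/tools/init.ts",
    summary: "init_codebase implementation",
    imports: [],
    exports: ["initCodebase"],
  },
]);
console.log(JSON.stringify(manifest.dependencyGraph));
```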

generate_tutorial

Generate a comprehensive "Zero to Hero" tutorial for a codebase.

generate_tutorial(rootPath: "/path/to/your/project", focusTopic?: "authentication")

Creates:

  • Project overview and architecture
  • Mermaid.js dependency diagrams
  • Structured learning path (chapters)
  • Key insights and patterns
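
To make the diagram output concrete, here is one way a dependency map could be turned into Mermaid flowchart text. This is a sketch under assumptions: the input shape (file path mapped to imported file paths) and the `toMermaid` function are hypothetical, and the server's own diagrams may be structured differently.

```typescript
function toMermaid(graph: Record<string, string[]>): string {
  // Assign short node ids (n0, n1, ...) because file paths contain "/" and ".".
  const ids = new Map<string, string>();
  const idFor = (file: string): string => {
    if (!ids.has(file)) ids.set(file, `n${ids.size}`);
    return ids.get(file) as string;
  };

  const edges: string[] = [];
  for (const file of Object.keys(graph)) {
    idFor(file); // register the node even if it has no outgoing edges
    for (const dep of graph[file]) {
      edges.push(`  ${idFor(file)} --> ${idFor(dep)}`);
    }
  }

  // Declare each node once with its path as a quoted label, then list the edges.
  const nodes = Array.from(ids, ([file, id]) => `  ${id}["${file}"]`);
  return ["graph TD", ...nodes, ...edges].join("\n");
}

const diagram = toMermaid({
  "src/index.ts": ["src/tools/init.ts", "src/tools/search.ts"],
  "src/tools/init.ts": ["src/core/crawler.ts"],
  "src/tools/search.ts": [],
  "src/core/crawler.ts": [],
});
console.log(diagram);
```

File paths that contain double quotes would need escaping before being used as labels; the sketch does not handle that case.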

search_codebase

Perform semantic search across a codebase.

search_codebase(rootPath: "/path/to/your/project", query: "how is authentication handled?")

Returns relevant code snippets with:

  • File paths and line numbers
  • Relevance scores
  • File context and summaries
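
The server delegates vector search to LlamaIndex, so the code below is not its implementation. It is a plain cosine-similarity sketch of the underlying idea: embed the query, score every indexed chunk, and return the top matches with paths, line ranges, and scores. The `Chunk` and `SearchHit` types, the file paths, and the toy 3-dimensional embeddings are all assumptions.

```typescript
interface Chunk {
  filePath: string;
  startLine: number;
  endLine: number;
  embedding: number[];
}

interface SearchHit {
  filePath: string;
  startLine: number;
  endLine: number;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function rankChunks(query: number[], chunks: Chunk[], topK: number): SearchHit[] {
  return chunks
    .map((c) => ({
      filePath: c.filePath,
      startLine: c.startLine,
      endLine: c.endLine,
      score: cosineSimilarity(query, c.embedding),
    }))
    .sort((x, y) => y.score - x.score) // highest relevance first
    .slice(0, topK);
}

// Toy embeddings; real ones come from an embedding model and are much longer.
const hits = rankChunks(
  [1, 0, 0],
  [
    { filePath: "src/auth/login.ts", startLine: 10, endLine: 42, embedding: [0.9, 0.1, 0] },
    { filePath: "src/utils/progress.ts", startLine: 1, endLine: 20, embedding: [0, 1, 0] },
    { filePath: "src/auth/session.ts", startLine: 5, endLine: 30, embedding: [0.6, 0.8, 0] },
  ],
  2
);
console.log(hits);
```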

Project Structure

mcp-codebase/
├── src/
│   ├── index.ts                    # MCP server entry point
│   ├── tools/
│   │   ├── init.ts                 # init_codebase implementation
│   │   ├── tutorial.ts             # generate_tutorial implementation
│   │   └── search.ts               # search_codebase implementation
│   ├── core/
│   │   ├── crawler.ts              # File system walker (.gitignore aware)
│   │   ├── analyzer.ts             # LLM-based file analysis
│   │   ├── manifest.ts             # Manifest CRUD operations
│   │   └── vectorIndex.ts          # LlamaIndex integration
│   ├── utils/
│   │   ├── fileFilter.ts           # Smart file filtering logic
│   │   ├── languageDetect.ts       # Language/file type detection
│   │   ├── progress.ts             # Progress reporter
│   │   └── git.ts                  # Git metadata extraction
│   ├── prompts/
│   │   ├── analyze.ts              # Universal file analysis prompt
│   │   └── curriculum.ts           # Tutorial generation prompt
│   └── types/
│       ├── manifest.ts             # Manifest type definitions
│       └── mcp.ts                  # MCP tool interfaces
├── package.json
├── tsconfig.json
└── README.md

Development

# Type checking
npm run typecheck

# Development mode with auto-reload
npm run dev

# Build for production
npm run build

Performance Expectations

For a typical repository:

  • 500 files: ~10-15 minutes (mostly AI analysis)
  • 1000 files: ~20-30 minutes
  • 5000 files: ~2 hours

Initialization runs once per codebase; subsequent queries use the cached index until you re-run init_codebase.
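
The 500- and 1000-file figures imply roughly 1.2 to 1.8 seconds of analysis per file. Assuming that rate stays constant and files are analyzed one after another (both assumptions, not measured properties of the server), a quick estimate for other repository sizes looks like this:

```typescript
// Per-file rates derived from "500 files: ~10-15 minutes".
const SECONDS_PER_FILE_LOW = 1.2;
const SECONDS_PER_FILE_HIGH = 1.8;

// Hypothetical helper: linear extrapolation of initialization time in minutes.
function estimateInitMinutes(fileCount: number): { low: number; high: number } {
  return {
    low: (fileCount * SECONDS_PER_FILE_LOW) / 60,
    high: (fileCount * SECONDS_PER_FILE_HIGH) / 60,
  };
}

for (const n of [500, 1000, 5000]) {
  const { low, high } = estimateInitMinutes(n);
  console.log(`${n} files: ~${Math.round(low)}-${Math.round(high)} minutes`);
}
```

For 5000 files this gives about 100 to 150 minutes, which brackets the "~2 hours" figure above.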

Storage

For a 500-file repository (~50MB source):

  • Manifest: ~100-200 KB
  • Vector Index: ~5-10 MB
  • Total overhead: ~10-20% of source size

Limitations

  1. LLM Dependency: Initialization requires an MCP host with sampling capability
  2. No Incremental Updates: Re-run init_codebase when files change significantly
  3. Binary Files: Skipped (images, PDFs, executables)
  4. Very Large Files: May hit LLM context limits (>100K tokens)
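
Limitations 3 and 4 can be anticipated before any file is sent for analysis. The sketch below is one possible pre-check, not the server's logic: it treats a NUL byte in the first 8 KB as a sign of a binary file and estimates tokens at about four bytes per token. Both heuristics, the `classify` function, and the `Verdict` type are assumptions; real tokenizers vary.

```typescript
const MAX_TOKENS = 100000; // threshold quoted in the limitations list
const BYTES_PER_TOKEN = 4; // rough heuristic, an assumption

type Verdict = "ok" | "binary" | "too-large";

function classify(content: Uint8Array): Verdict {
  // Heuristic: text files rarely contain NUL bytes, binary files usually do.
  const probe = content.subarray(0, 8192);
  for (let i = 0; i < probe.length; i++) {
    if (probe[i] === 0) return "binary";
  }
  const estimatedTokens = Math.ceil(content.length / BYTES_PER_TOKEN);
  return estimatedTokens > MAX_TOKENS ? "too-large" : "ok";
}

const smallText = new Uint8Array([104, 105]); // "hi"
const pngLike = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x00]); // contains a NUL byte
const hugeText = new Uint8Array(500000).fill(97); // 500,000 "a" bytes, about 125,000 tokens

console.log(classify(smallText), classify(pngLike), classify(hugeText));
```

A file flagged as too large could be split into smaller chunks before analysis rather than skipped, though whether the server does this is not stated here.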

License

MIT
