MCP Codebase Mentor
An MCP server that acts as an AI mentor for any codebase using dual-layer indexing, enabling codebase initialization, tutorial generation, and semantic search.
An MCP (Model Context Protocol) server that acts as an AI mentor for any codebase using dual-layer indexing.
Features
- Universal language support - AI handles all programming languages
- Complete file coverage - Indexes code, tests, configs, and docs
- Smart filtering - Respects .gitignore and applies sensible defaults
- Semantic search - Vector-based code search using LlamaIndex
- Tutorial generation - Creates structured learning guides with architecture diagrams
Installation
# Clone the repository
git clone <repository-url>
cd mcp-codebase
# Install dependencies
npm install
# Build the project
npm run build
Usage with Cursor/Claude
Add to your MCP configuration:
{
"mcpServers": {
"codebase-mentor": {
"command": "node",
"args": ["/path/to/mcp-codebase/dist/index.js"]
}
}
}
Available Tools
init_codebase
Initialize and index a codebase for AI mentoring.
init_codebase(rootPath: "/path/to/your/project")
This will:
- Crawl the directory structure (respecting .gitignore)
- Analyze each file with AI to extract summaries, imports, and exports
- Build a manifest with file metadata and dependency graph
- Create a vector index for semantic search
Output files:
- .mcp_manifest.json - File metadata and dependency graph
- .mcp_index/ - Vector index for semantic search
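To make the manifest idea concrete, here is a minimal sketch of how per-file analysis results could be folded into a manifest with a dependency graph. The `FileEntry` and `Manifest` shapes, their field names, and `buildManifest` are assumptions made for illustration; the actual schema in `src/types/manifest.ts` may differ.

```typescript
// Hypothetical manifest shape; field names are assumptions, not the project's schema.
interface FileEntry {
  path: string; // path relative to rootPath
  summary: string; // AI-generated summary of the file
  imports: string[]; // modules or files this file imports
  exports: string[]; // exported symbol names
}

interface Manifest {
  files: Record<string, FileEntry>;
  dependencyGraph: Record<string, string[]>; // file -> indexed files it depends on
}

function buildManifest(entries: FileEntry[]): Manifest {
  const files: Record<string, FileEntry> = {};
  const known: Record<string, boolean> = {};
  for (const entry of entries) {
    files[entry.path] = entry;
    known[entry.path] = true;
  }

  const dependencyGraph: Record<string, string[]> = {};
  for (const entry of entries) {
    // Keep only edges that point at files inside the indexed set,
    // so external packages such as "fs" do not appear as nodes.
    dependencyGraph[entry.path] = entry.imports.filter((p) => known[p] === true);
  }
  return { files, dependencyGraph };
}

const manifest = buildManifest([
  { path: "src/index.ts", summary: "Server entry point", imports: ["src/tools/init.ts", "fs"], exports: [] },
  { path: "src/tools/init.ts", summary: "init_codebase tool", imports: [], exports: ["initCodebase"] },
]);
console.log(JSON.stringify(manifest.dependencyGraph, null, 2));
```

Filtering out imports that are not in the indexed set keeps the graph limited to project files, which is one reasonable choice for a dependency view of a single codebase.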
generate_tutorial
Generate a comprehensive "Zero to Hero" tutorial for a codebase.
generate_tutorial(rootPath: "/path/to/your/project", focusTopic?: "authentication")
Creates:
- Project overview and architecture
- Mermaid.js dependency diagrams
- Structured learning path (chapters)
- Key insights and patterns
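As an illustration of the dependency diagrams, the sketch below renders a file-to-file dependency map as a Mermaid flowchart. The `toMermaid` helper and the graph shape are hypothetical and are not taken from the project's source; paths containing double quotes would need escaping, which this sketch does not handle.

```typescript
// Hypothetical helper: turn a dependency map (file -> files it imports)
// into Mermaid flowchart text.
function toMermaid(graph: Record<string, string[]>): string {
  const ids: Record<string, string> = {};
  let nextId = 0;

  // Mermaid node ids must be simple tokens, so map each path to n0, n1, ...
  const idFor = (path: string): string => {
    if (ids[path] === undefined) {
      ids[path] = "n" + nextId;
      nextId += 1;
    }
    return ids[path];
  };

  const lines: string[] = ["graph TD"];
  for (const from of Object.keys(graph)) {
    // Declare the node so files without edges still appear in the diagram.
    lines.push(`  ${idFor(from)}["${from}"]`);
    for (const to of graph[from]) {
      lines.push(`  ${idFor(from)} --> ${idFor(to)}["${to}"]`);
    }
  }
  return lines.join("\n");
}

const diagram = toMermaid({
  "src/index.ts": ["src/tools/init.ts"],
  "src/tools/init.ts": [],
});
console.log(diagram);
```

The resulting text can be pasted into any Mermaid renderer to view the graph.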
search_codebase
Perform semantic search across a codebase.
search_codebase(rootPath: "/path/to/your/project", query: "how is authentication handled?")
Returns relevant code snippets with:
- File paths and line numbers
- Relevance scores
- File context and summaries
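The server delegates vector search to LlamaIndex. The sketch below does not use LlamaIndex; it is a plain TypeScript illustration of the underlying idea, ranking pre-embedded chunks by cosine similarity to a query embedding. The `IndexedChunk` and `SearchHit` shapes, the `search` function, and the tiny two-dimensional embeddings are all assumptions made for the example.

```typescript
// Hypothetical chunk record: a slice of a file plus its embedding vector.
interface IndexedChunk {
  filePath: string;
  startLine: number;
  endLine: number;
  embedding: number[];
}

interface SearchHit {
  filePath: string;
  startLine: number;
  endLine: number;
  score: number; // cosine similarity; higher means more relevant
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and return the topK best matches.
function search(queryEmbedding: number[], chunks: IndexedChunk[], topK: number): SearchHit[] {
  return chunks
    .map((c) => ({
      filePath: c.filePath,
      startLine: c.startLine,
      endLine: c.endLine,
      score: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy data: real embeddings have hundreds of dimensions.
const hits = search(
  [1, 0],
  [
    { filePath: "src/auth.ts", startLine: 1, endLine: 40, embedding: [1, 0] },
    { filePath: "src/db.ts", startLine: 1, endLine: 25, embedding: [0, 1] },
    { filePath: "src/session.ts", startLine: 10, endLine: 30, embedding: [0.6, 0.8] },
  ],
  2,
);
console.log(hits.map((h) => h.filePath));
```

A linear scan like this is fine for illustration; a real vector index typically uses approximate nearest-neighbor structures to avoid scoring every chunk.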
Project Structure
mcp-codebase/
├── src/
│ ├── index.ts # MCP server entry point
│ ├── tools/
│ │ ├── init.ts # init_codebase implementation
│ │ ├── tutorial.ts # generate_tutorial implementation
│ │ └── search.ts # search_codebase implementation
│ ├── core/
│ │ ├── crawler.ts # File system walker (.gitignore aware)
│ │ ├── analyzer.ts # LLM-based file analysis
│ │ ├── manifest.ts # Manifest CRUD operations
│ │ └── vectorIndex.ts # LlamaIndex integration
│ ├── utils/
│ │ ├── fileFilter.ts # Smart file filtering logic
│ │ ├── languageDetect.ts # Language/file type detection
│ │ ├── progress.ts # Progress reporter
│ │ └── git.ts # Git metadata extraction
│ ├── prompts/
│ │ ├── analyze.ts # Universal file analysis prompt
│ │ └── curriculum.ts # Tutorial generation prompt
│ └── types/
│ ├── manifest.ts # Manifest type definitions
│ └── mcp.ts # MCP tool interfaces
├── package.json
├── tsconfig.json
└── README.md
Development
# Type checking
npm run typecheck
# Development mode with auto-reload
npm run dev
# Build for production
npm run build
Performance Expectations
For a typical repository:
- 500 files: ~10-15 minutes (mostly AI analysis)
- 1000 files: ~20-30 minutes
- 5000 files: ~2 hours
Initialization is a one-time operation. Subsequent queries use the cached index.
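The figures above work out to roughly 1.2 to 1.8 seconds per file (10 to 15 minutes for 500 files). Assuming that rate scales linearly, which is an extrapolation and not a measured guarantee, a rough estimate for other repository sizes could be sketched as follows. The `estimateInitMinutes` function is hypothetical.

```typescript
// Per-file rate derived from the README's 500-file figure (10-15 minutes).
// Linear scaling is an assumption; actual time depends on the LLM and file sizes.
const SECONDS_PER_FILE = { low: 1.2, high: 1.8 };

function estimateInitMinutes(fileCount: number): { low: number; high: number } {
  return {
    low: Math.round((fileCount * SECONDS_PER_FILE.low) / 60),
    high: Math.round((fileCount * SECONDS_PER_FILE.high) / 60),
  };
}

for (const count of [500, 1000, 5000]) {
  const est = estimateInitMinutes(count);
  console.log(`${count} files: ~${est.low}-${est.high} minutes`);
}
```

For 5000 files this gives about 100 to 150 minutes, which is in line with the "~2 hours" figure listed above.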
Storage
For a 500-file repository (~50MB source):
- Manifest: ~100-200 KB
- Vector Index: ~5-10 MB
- Total overhead: ~5-10 MB, or roughly 10-20% of source size
Limitations
- LLM Dependency: Initialization requires an MCP host with sampling capability
- No Incremental Updates: Re-run init_codebase when files change significantly
- Binary Files: Skipped (images, PDFs, executables)
- Very Large Files: May hit LLM context limits (>100K tokens)
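The binary-file and large-file limits can be pictured as a pre-analysis check. The sketch below is an illustration only: the extension list, the `decide` function, and the 4-characters-per-token heuristic are assumptions, not the project's actual filtering logic in `src/utils/fileFilter.ts`.

```typescript
// Assumed extension list; the project's real defaults may differ.
const BINARY_EXTENSIONS = [".png", ".jpg", ".jpeg", ".gif", ".pdf", ".exe", ".dll", ".zip"];
const MAX_TOKENS = 100000; // context limit mentioned in the README
const CHARS_PER_TOKEN = 4; // rough heuristic, not an exact tokenizer

interface Decision {
  analyze: boolean;
  reason: string;
}

// Return the lowercase extension including the dot, or "" if there is none.
function extensionOf(path: string): string {
  const base = path.slice(path.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  // dot <= 0 covers both "no dot" and dotfiles such as ".gitignore".
  return dot <= 0 ? "" : base.slice(dot).toLowerCase();
}

function decide(path: string, charCount: number): Decision {
  if (BINARY_EXTENSIONS.indexOf(extensionOf(path)) !== -1) {
    return { analyze: false, reason: "binary file" };
  }
  const estimatedTokens = Math.ceil(charCount / CHARS_PER_TOKEN);
  if (estimatedTokens > MAX_TOKENS) {
    return { analyze: false, reason: "exceeds estimated token limit" };
  }
  return { analyze: true, reason: "ok" };
}

console.log(decide("assets/logo.PNG", 2048));
console.log(decide("src/index.ts", 2000));
console.log(decide("data/big.json", 500000));
```

Instead of skipping oversized files outright, a caller could truncate or chunk them before analysis; the sketch only flags them.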
License
MIT