repo-index-mcp
A local codebase retrieval tool that indexes a git repository into SQLite and provides query and retrieval capabilities over MCP stdio, enabling coding agents to search and retrieve code chunks from local repositories without external data transmission.
Local codebase retrieval tool for coding agents. Phase 1 is a walking skeleton: index one git repo into SQLite, query chunks from the CLI, and expose retrieval over MCP stdio.
Install
pipx install .
For development:
python -m venv .venv
source .venv/bin/activate
pip install -e '.[dev]'
Use
Index a repo:
repo-index index /path/to/git/repo
Discover and index every git repo under a root:
repo-index index-root ~/code
Install freshness hooks for one repo or a repo root:
repo-index install-hooks /path/to/git/repo
repo-index install-hooks ~/code --recursive
Query it:
repo-index query "where is request retry handled" -k 5
Show indexed repos:
repo-index status
Run the Phase 0 eval set:
repo-index eval evals/golden.repo-index-mcp.jsonl . -k 10
Run the MCP server over stdio:
repo-index serve
Agent config example:
{
"mcpServers": {
"repo-index": {
"command": "repo-index",
"args": ["serve"]
}
}
}
Evals
Phase 0 eval docs live in docs/phase-0-baseline.md. The seed golden set lives in evals/golden.repo-index-mcp.jsonl.
Phase 2 behavior
- index-root discovers git repos under a directory.
- Reindexing compares tracked file content hashes and only re-embeds changed files.
- Deleted tracked files remove their old chunks from the index.
- install-hooks adds post-commit and post-merge hooks that run repo-index reindex "$PWD".
- status / list_repos report stale repos by comparing indexed commit to current HEAD.
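The hash-based reindexing described above can be sketched roughly as follows. This is an illustration only, not the project's actual code: the function names, the path-to-hash mapping, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex SHA-256 digest of a file's bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def plan_reindex(indexed: dict[str, str], current: dict[str, bytes]):
    """Compare stored hashes with the current contents of tracked files.

    indexed: path -> hash recorded at the last index run (hypothetical shape).
    current: path -> current bytes of each tracked file.
    Returns (paths to re-embed, paths whose old chunks should be removed).
    """
    # New or modified files: no stored hash, or the stored hash differs.
    to_embed = sorted(
        path for path, data in current.items()
        if indexed.get(path) != content_hash(data)
    )
    # Files that were indexed before but are no longer tracked.
    to_delete = sorted(path for path in indexed if path not in current)
    return to_embed, to_delete


indexed = {
    "a.py": content_hash(b"print('a')\n"),
    "b.py": content_hash(b"print('b')\n"),
    "old.py": content_hash(b"pass\n"),
}
current = {
    "a.py": b"print('a')\n",      # unchanged, skipped
    "b.py": b"print('b2')\n",     # modified, re-embedded
    "new.py": b"print('new')\n",  # added, embedded
}
to_embed, to_delete = plan_reindex(indexed, current)
print(to_embed, to_delete)
```

Unchanged files never reach the embedding step, which is what keeps a reindex after a small commit cheap.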
Current limits
- Naive line-window chunks.
- Local deterministic hash embeddings, not quality-tuned semantic embeddings.
- SQLite storage with cosine search done in Python; no ANN index or vector extension yet.
- get_symbol is best-effort search until tree-sitter symbol extraction lands.
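To make the first and third limits concrete, here is a minimal sketch of line-window chunking and an exhaustive cosine scan in plain Python. The window size, overlap, function names, and row layout are assumptions for illustration; the README does not state the values or schema the tool actually uses.

```python
import math


def line_window_chunks(text: str, window: int = 40, overlap: int = 10):
    """Split text into fixed-size line windows.

    Returns a list of (start_line, end_line, chunk_text) with 1-based,
    inclusive line numbers. window and overlap are illustrative defaults.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    lines = text.splitlines()
    step = window - overlap
    chunks = []
    for start in range(0, len(lines), step):
        block = lines[start:start + window]
        chunks.append((start + 1, start + len(block), "\n".join(block)))
        if start + window >= len(lines):
            break  # this window already reached the end of the file
    return chunks


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, rows, k: int = 5):
    """Score every (chunk_id, vector) row and return the k best matches."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The scan visits every stored vector, so query time grows linearly with the number of chunks. That is the cost an ANN index or a SQLite vector extension would reduce.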
Data boundary
Default embedding is local and deterministic. Source code is not sent to external APIs.
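One way a local, deterministic embedding can work is feature hashing: each token is hashed into a fixed-size vector, so the same text always maps to the same vector and nothing leaves the machine. The sketch below shows that technique under assumptions. The tokenizer, dimension, and function name are hypothetical, and the README does not say which hashing scheme the tool actually uses.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Feature-hashing embedding: deterministic, local, no model required.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and would give different vectors on each run.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[A-Za-z0-9_]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim  # which dimension
        sign = 1.0 if digest[4] & 1 else -1.0             # signed hashing
        vec[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # L2-normalize so cosine similarity reduces to a dot product.
    return [x / norm for x in vec] if norm else vec
```

Vectors like this capture token overlap rather than meaning, which matches the limit noted above that these are not quality-tuned semantic embeddings.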