Large Repo MCP Server
A stdio-based Model Context Protocol (MCP) server that serves as a standard toolkit for large, complex repositories. It uses ripgrep for fast, bounded search across codebases of any size.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│  MCP Client (Codex CLI / Claude Desktop / Claude Code)          │
│                                                                 │
│  tools/call ──► JSON-RPC 2.0 request                            │
│                 Content-Length: N\r\n\r\n{...}                  │
└─────────────┬───────────────────────────────────▲───────────────┘
              │ stdin                             │ stdout
              ▼                                   │
┌─────────────────────────────────────────────────────────────────┐
│  large-repo-mcp server                                          │
│                                                                 │
│  ┌──────────┐    ┌────────────┐    ┌──────────────────────┐     │
│  │ Frame    │───►│ JSON-RPC   │───►│ Tool Dispatch        │     │
│  │ Parser   │    │ Router     │    │                      │     │
│  │          │    │            │    │ project_search_rg    │     │
│  │ Content- │    │ initialize │    │ symbol_search        │     │
│  │ Length   │    │ tools/list │    │ read_range           │     │
│  │ framing  │    │ tools/call │    │ list_files           │     │
│  └──────────┘    │ ping       │    └──────────┬───────────┘     │
│                  └────────────┘               │                 │
│                                               │                 │
│  ┌────────────────────────────────────────────▼──────────┐      │
│  │ Security Layer                                        │      │
│  │                                                       │      │
│  │ • Path confinement (resolve + realpath + root check)  │      │
│  │ • Command allowlist (rg only, shell: false)           │      │
│  │ • Request size limit (1 MB)                           │      │
│  │ • Response size limit (200 KB)                        │      │
│  │ • Null-byte rejection                                 │      │
│  └───────────────────────────┬───────────────────────────┘      │
│                              │                                  │
│                              ▼                                  │
│                     ┌─────────────────┐                         │
│                     │ rg (ripgrep)    │                         │
│                     │ subprocess      │                         │
│                     │                 │                         │
│                     │ • 15s timeout   │                         │
│                     │ • JSON output   │                         │
│                     │ • Streaming     │                         │
│                     └─────────────────┘                         │
│  stderr ──► structured JSON logs                                │
└─────────────────────────────────────────────────────────────────┘
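The Content-Length framing handled by the Frame Parser can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the server's actual code: `encodeFrame`, `decodeFrame`, and the error messages are hypothetical, and only the 1 MB request limit comes from this README.

```typescript
// Hypothetical helpers; names and error messages are illustrative only.
const MAX_REQUEST_BYTES = 1024 * 1024; // 1 MB inbound limit stated in this README
const HEADER_END = "\r\n\r\n";

/** Wrap a JSON-RPC message in a Content-Length frame. */
function encodeFrame(message: unknown): Buffer {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  const header = Buffer.from(`Content-Length: ${body.length}${HEADER_END}`, "ascii");
  return Buffer.concat([header, body]);
}

/**
 * Try to read one frame from the front of `buffer`.
 * Returns null when more bytes are needed; throws on malformed or oversized frames.
 */
function decodeFrame(buffer: Buffer): { message: unknown; rest: Buffer } | null {
  const headerEnd = buffer.indexOf(HEADER_END);
  if (headerEnd === -1) return null; // header not complete yet

  const header = buffer.toString("ascii", 0, headerEnd);
  const match = /^Content-Length:\s*(\d+)$/im.exec(header);
  if (!match) throw new Error("Missing Content-Length header");

  const length = Number(match[1]);
  // Reject before parsing so an oversized body never reaches JSON.parse.
  if (length > MAX_REQUEST_BYTES) {
    throw new Error(`Request exceeds ${MAX_REQUEST_BYTES} bytes`);
  }

  const bodyStart = headerEnd + HEADER_END.length;
  const bodyEnd = bodyStart + length;
  if (buffer.length < bodyEnd) return null; // body not complete yet

  const message: unknown = JSON.parse(buffer.toString("utf8", bodyStart, bodyEnd));
  return { message, rest: buffer.subarray(bodyEnd) };
}
```

A caller would append each stdin chunk to a pending buffer and call `decodeFrame` in a loop until it returns null, carrying `rest` forward between calls.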
Request Flow
Client                       Server                          ripgrep
│                               │                               │
│  Content-Length: N\r\n\r\n    │                               │
│  {"jsonrpc":"2.0",            │                               │
│   "method":"tools/call",...}  │                               │
│──────────────────────────────►│                               │
│                               │ validate request size (≤1MB)  │
│                               │ parse JSON-RPC frame          │
│                               │ validate tool + arguments     │
│                               │ resolve & confine paths       │
│                               │                               │
│                               │ spawn rg --json ...           │
│                               │──────────────────────────────►│
│                               │                               │
│                               │ stream results (line-by-line) │
│                               │◄──────────────────────────────│
│                               │                               │
│                               │ enforce match/byte limits     │
│                               │ kill rg if limit reached      │
│                               │                               │
│  Content-Length: M\r\n\r\n    │                               │
│  {"jsonrpc":"2.0",            │                               │
│   "result":{...}}             │                               │
│◄──────────────────────────────│                               │
│                               │                               │
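The "enforce match/byte limits" step in the flow above could look something like the collector below. It is a sketch only: the class name, the reason strings, and the choice to measure each item by its JSON size are assumptions; the 500-match and 200 KB defaults are the limits this README states.

```typescript
// Hypothetical collector; class name and reason strings are illustrative.
type TruncateReason = "max_matches" | "max_bytes";

class BoundedCollector<T> {
  readonly items: T[] = [];
  truncateReason: TruncateReason | undefined = undefined;
  private bytes = 0;
  private readonly maxItems: number;
  private readonly maxBytes: number;

  constructor(maxItems = 500, maxBytes = 200 * 1024) {
    this.maxItems = maxItems;
    this.maxBytes = maxBytes;
  }

  /** Returns false once a limit is hit; the caller should then stop reading. */
  add(item: T): boolean {
    if (this.truncateReason !== undefined) return false;
    if (this.items.length >= this.maxItems) {
      this.truncateReason = "max_matches";
      return false;
    }
    // Assumption: budget is charged by the serialized size of each item.
    const size = Buffer.byteLength(JSON.stringify(item), "utf8");
    if (this.bytes + size > this.maxBytes) {
      this.truncateReason = "max_bytes";
      return false;
    }
    this.items.push(item);
    this.bytes += size;
    return true;
  }

  get truncated(): boolean {
    return this.truncateReason !== undefined;
  }
}
```

In a streaming loop, a false return from `add` is the point at which the rg child would be killed, so no further output is read once the response is already full.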
Safety Guarantees
| Guarantee | Mechanism | Limit |
|---|---|---|
| Fast, bounded search | ripgrep with match caps | 500 max matches |
| Response size cap | Byte-level output budget | 200 KB |
| Subprocess timeout | setTimeout + child.kill() | 15 seconds |
| Path confinement | path.resolve + realpath + root prefix check | repo root only |
| Symlink escape prevention | fs.realpath() resolves then re-validates | double-checked |
| Command execution | Allowlist (rg only) + shell: false | no shell injection |
| Request size limit | Frame-level rejection before parse | 1 MB |
| Input sanitization | Null-byte rejection, type validation | all tool inputs |
| Child process cleanup | Tracked set, killed on shutdown signals | SIGTERM/SIGINT/exit |
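The path confinement and symlink rows describe a two-step check: resolve the path lexically, then resolve symlinks and validate again. A minimal sketch of that idea follows, assuming a hypothetical `confinePath` helper and made-up error messages; the real server may structure this differently.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: the name and error messages are assumptions, not the server's API.
function confinePath(repoRoot: string, relativePath: string): string {
  if (relativePath.includes("\0")) throw new Error("Path contains a null byte");
  if (path.isAbsolute(relativePath)) throw new Error("Path must be relative");

  const root = fs.realpathSync(repoRoot);
  const rootPrefix = root.endsWith(path.sep) ? root : root + path.sep;
  const isInsideRoot = (candidate: string): boolean =>
    candidate === root || candidate.startsWith(rootPrefix);

  // First check: lexical resolution catches "../" traversal.
  const resolved = path.resolve(root, relativePath);
  if (!isInsideRoot(resolved)) throw new Error("Path escapes repo root");

  // Second check: realpath follows symlinks, so the real target is re-validated.
  const real = fs.realpathSync(resolved);
  if (!isInsideRoot(real)) throw new Error("Symlink escapes repo root");
  return real;
}
```

The prefix comparison appends the path separator to the root first, so a sibling directory such as `repo-other` is not mistaken for being inside `repo`.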
Supported Clients
- Codex CLI
- Claude Desktop
- Claude Code
- Any MCP client supporting stdio transport (protocol version 2024-11-05)
Requirements
- Node.js 18+
- ripgrep (rg) on PATH
  - symbol_search requires PCRE2 support (most official packages include it)
Verify ripgrep:
rg --version # should show version
rg --pcre2-version # should show PCRE2 version
Install ripgrep:
| Platform | Command |
|---|---|
| macOS | brew install ripgrep |
| Ubuntu/Debian | sudo apt-get install ripgrep |
| Fedora | sudo dnf install ripgrep |
| Windows | winget install BurntSushi.ripgrep.MSVC |
| Cargo | cargo install ripgrep --features pcre2 |
Install
npm install
npm run build
Run
npm start
Default repo root behavior:
- If REPO_ROOT is unset, the repo root is the current working directory (process.cwd()).
- Set REPO_ROOT to override it explicitly.
PowerShell:
$env:REPO_ROOT = "C:\absolute\path\to\repo"
npm start
MCP Configuration
Codex (Global)
Add to ~/.codex/config.toml:
[mcp_servers.large_repo_mcp]
command = "node"
args = ["C:/absolute/path/to/mcp/dist/server.js"]
startup_timeout_sec = 30
[mcp_servers.large_repo_mcp.env]
REPO_ROOT = "."
Codex (Project-Local)
Add to .codex/config.toml in a project:
[mcp_servers.large_repo_mcp]
command = "node"
args = ["C:/absolute/path/to/mcp/dist/server.js"]
startup_timeout_sec = 30
[mcp_servers.large_repo_mcp.env]
REPO_ROOT = "."
Claude Desktop
Add to Claude Desktop MCP config:
{
"mcpServers": {
"large_repo_mcp": {
"command": "node",
"args": ["C:/absolute/path/to/mcp/dist/server.js"],
"env": {
"REPO_ROOT": "C:/absolute/path/to/target-repo"
}
}
}
}
Claude Code
Add to .mcp.json in a project or ~/.claude/mcp.json globally:
{
"mcpServers": {
"large_repo_mcp": {
"command": "node",
"args": ["C:/absolute/path/to/mcp/dist/server.js"],
"env": {
"REPO_ROOT": "."
}
}
}
}
Environment Variables
| Variable | Default | Description |
|---|---|---|
| REPO_ROOT | process.cwd() | Repository root directory |
| LARGE_REPO_MCP_LOG_LEVEL | error | Log level: error, warn, info, debug |
| LARGE_REPO_MCP_DEBUG | (unset) | Set to 1 to force debug logging |
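One way these three variables could be combined at startup is sketched below. `resolveConfig` is a hypothetical name, and the handling of empty or unrecognized values (fall back to the defaults) is an assumption rather than documented behavior.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";
const LOG_LEVELS: readonly string[] = ["error", "warn", "info", "debug"];

interface ServerConfig {
  repoRoot: string;
  logLevel: LogLevel;
}

// Hypothetical resolver: the function name and fallback rules are assumptions.
function resolveConfig(env: Record<string, string | undefined>, cwd: string): ServerConfig {
  // Unset (or empty) REPO_ROOT falls back to the current working directory.
  const repoRoot = env["REPO_ROOT"] || cwd;

  // LARGE_REPO_MCP_DEBUG=1 forces debug regardless of the configured level.
  if (env["LARGE_REPO_MCP_DEBUG"] === "1") {
    return { repoRoot, logLevel: "debug" };
  }

  const requested = env["LARGE_REPO_MCP_LOG_LEVEL"] ?? "error";
  // Assumption: unrecognized values fall back to the default level.
  const logLevel = LOG_LEVELS.includes(requested) ? (requested as LogLevel) : "error";
  return { repoRoot, logLevel };
}
```

In a Node.js entry point this would typically be called as `resolveConfig(process.env, process.cwd())`.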
Tools
project_search_rg
Search the repository with ripgrep. Supports regex patterns and optional glob filters.
┌──────────────────────────────────────────────────────┐
│  project_search_rg                                   │
│                                                      │
│  Input                          Output               │
│  ─────                          ──────               │
│  pattern (string, required)     matches[]            │
│  globs (string[], max 50)         .path              │
│  maxMatches (1-500, def 100)      .line              │
│                                   .column            │
│                                   .text              │
│                                   .submatches[]      │
│                                 truncated            │
│                                 truncateReason       │
│                                 timedOut             │
│                                 serverVersion        │
│                                 durationMs           │
└──────────────────────────────────────────────────────┘
Example call:
{
"name": "project_search_rg",
"arguments": {
"pattern": "TODO|FIXME",
"globs": ["*.ts", "*.tsx"],
"maxMatches": 50
}
}
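To make the input bounds concrete, here is a sketch of how a call like the one above might be validated and turned into ripgrep arguments. `buildSearchArgs` and its error messages are hypothetical; `--json`, `--glob`, and `-e` are existing ripgrep flags, but the exact argument list the server passes is not documented here.

```typescript
interface SearchInput {
  pattern: string;
  globs?: string[];
  maxMatches?: number;
}

// Hypothetical validator and argument builder; names and messages are illustrative.
function buildSearchArgs(input: SearchInput): { args: string[]; maxMatches: number } {
  const { pattern, globs = [], maxMatches = 100 } = input;

  if (typeof pattern !== "string" || pattern.length === 0) {
    throw new Error("pattern is required");
  }
  if (globs.length > 50) throw new Error("at most 50 globs are allowed");
  if (!Number.isInteger(maxMatches) || maxMatches < 1 || maxMatches > 500) {
    throw new Error("maxMatches must be an integer from 1 to 500");
  }
  for (const value of [pattern, ...globs]) {
    if (typeof value !== "string" || value.includes("\0")) {
      throw new Error("inputs must be strings without null bytes");
    }
  }

  const args = ["--json"];
  for (const glob of globs) args.push("--glob", glob);
  // -e keeps a pattern that starts with "-" from being read as a flag.
  args.push("-e", pattern, "--", ".");

  // maxMatches is returned rather than passed to rg: rg's --max-count is
  // per file, so a global cap has to be enforced while reading the stream.
  return { args, maxMatches };
}
```

Because the arguments are an array handed to a non-shell spawn, characters such as `|` in `TODO|FIXME` reach ripgrep as part of the regex and are never interpreted by a shell.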
symbol_search
Search for an exact symbol using word-boundary matching (\b...\b). Auto-detects project type (TypeScript or Python) and scopes file types accordingly.
┌──────────────────────────────────────────────────────┐
│  symbol_search                                       │
│                                                      │
│  Input                          Output               │
│  ─────                          ──────               │
│  symbol (string, required)      projectType          │
│  maxMatches (1-500, def 100)    matchMode            │
│                                 matches[]            │
│                                   .path              │
│  Detection logic:                 .line              │
│  ┌────────────────────┐           .column            │
│  │ tsconfig.json?     │           .text              │
│  │ yes → typescript   │           .submatches[]      │
│  │ no  → check files  │                              │
│  │  .py only → python │                              │
│  │  .ts only → ts     │                              │
│  │  fallback → ts     │                              │
│  └────────────────────┘                              │
└──────────────────────────────────────────────────────┘
Example call:
{
"name": "symbol_search",
"arguments": {
"symbol": "handleRequest",
"maxMatches": 20
}
}
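The word-boundary matching described above amounts to escaping the symbol and wrapping it in `\b...\b`. A small sketch, assuming hypothetical helper names (`escapeRegex`, `buildSymbolPattern`):

```typescript
// Hypothetical helpers; the server's own escaping rules may differ.
function escapeRegex(text: string): string {
  // Backslash-escape every regex metacharacter so the symbol matches literally.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildSymbolPattern(symbol: string): string {
  if (symbol.length === 0 || symbol.includes("\0")) {
    throw new Error("symbol must be a non-empty string without null bytes");
  }
  return `\\b${escapeRegex(symbol)}\\b`;
}
```

With this pattern, `handleRequest` matches in `call handleRequest()` but not inside `handleRequests`, which is the difference between symbol_search and a plain substring search.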
read_range
Read a specific line range from a file. Path must be relative and resolve inside the repo root.
┌──────────────────────────────────────────────────────┐
│  read_range                                          │
│                                                      │
│  Input                          Output               │
│  ─────                          ──────               │
│  path (relative, required)      path (normalized)    │
│  startLine (int, required)      requestedLines       │
│  endLine (int, required)        returnedLines        │
│                                 lines[]              │
│  Constraints:                     .line              │
│  • max 500 lines per call         .text              │
│  • path confined to repo root   truncated            │
│  • symlinks resolved + checked  truncateReason       │
└──────────────────────────────────────────────────────┘
Example call:
{
"name": "read_range",
"arguments": {
"path": "src/server.ts",
"startLine": 1,
"endLine": 50
}
}
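The relationship between requestedLines, returnedLines, and truncated can be illustrated with a small slicing function. This is a sketch under assumptions: `sliceRange` and the `"max_lines"` reason string are hypothetical, line numbers are taken to be 1-based and inclusive as in the example call, and only the 500-line cap is treated as truncation.

```typescript
const MAX_LINES_PER_CALL = 500; // per-call cap stated in this README

interface RangeLine {
  line: number;
  text: string;
}

interface RangeResult {
  requestedLines: number;
  returnedLines: number;
  lines: RangeLine[];
  truncated: boolean;
  truncateReason?: string;
}

// Hypothetical helper; the "max_lines" reason string is an assumption.
function sliceRange(content: string, startLine: number, endLine: number): RangeResult {
  if (!Number.isInteger(startLine) || !Number.isInteger(endLine)) {
    throw new Error("startLine and endLine must be integers");
  }
  if (startLine < 1 || endLine < startLine) {
    throw new Error("expected 1 <= startLine <= endLine");
  }

  const all = content.split(/\r?\n/);
  // Ignore the empty piece produced by a trailing newline.
  if (all[all.length - 1] === "") all.pop();

  const requestedLines = endLine - startLine + 1;
  const lastLine = Math.min(endLine, startLine + MAX_LINES_PER_CALL - 1, all.length);

  const lines: RangeLine[] = [];
  for (let n = startLine; n <= lastLine; n++) {
    lines.push({ line: n, text: all[n - 1] ?? "" });
  }

  const truncated = requestedLines > MAX_LINES_PER_CALL;
  const result: RangeResult = { requestedLines, returnedLines: lines.length, lines, truncated };
  if (truncated) result.truncateReason = "max_lines";
  return result;
}
```

A request that runs past the end of the file simply returns fewer lines than requested, which a client can detect by comparing returnedLines with requestedLines.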
list_files
List repository files using ripgrep --files with optional glob filters. Excludes .git, node_modules, dist, build, and coverage by default.
┌──────────────────────────────────────────────────────┐
│  list_files                                          │
│                                                      │
│  Input                          Output               │
│  ─────                          ──────               │
│  globs (string[], max 50)       files[] (paths)      │
│  maxResults (1-500, def 500)    returned             │
│                                 truncated            │
│  Auto-excluded dirs:            truncateReason       │
│    .git, node_modules, dist,    timedOut             │
│    build, coverage                                   │
└──────────────────────────────────────────────────────┘
Example call:
{
"name": "list_files",
"arguments": {
"globs": ["src/**/*.ts"],
"maxResults": 100
}
}
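A rough sketch of how the default exclusions and the result cap might be expressed follows. The helper names are hypothetical, and expressing the exclusions as negated `--glob` arguments is an assumption about the implementation; `--files` and `--glob` themselves are existing ripgrep flags.

```typescript
const DEFAULT_EXCLUDED_DIRS = [".git", "node_modules", "dist", "build", "coverage"];

// Hypothetical builder: negated globs are one plausible way to apply the exclusions.
function buildListFilesArgs(globs: string[] = []): string[] {
  if (globs.length > 50) throw new Error("at most 50 globs are allowed");
  const args = ["--files"];
  for (const glob of globs) args.push("--glob", glob);
  // ripgrep gives later globs precedence, so the exclusions go last.
  for (const dir of DEFAULT_EXCLUDED_DIRS) args.push("--glob", `!${dir}`);
  return args;
}

// Apply the maxResults cap and report whether anything was cut off.
function capResults(
  files: string[],
  maxResults = 500,
): { files: string[]; returned: number; truncated: boolean } {
  const kept = files.slice(0, maxResults);
  return { files: kept, returned: kept.length, truncated: files.length > maxResults };
}
```

When truncated comes back true, narrowing the globs is usually more effective than raising maxResults, since 500 is already the upper bound.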
Project Type Detection
The server auto-detects whether a repository is TypeScript or Python to scope symbol_search file types. Detection runs once per process and is cached.
tsconfig.json exists?
├─ yes → TYPESCRIPT
└─ no  → Python project file exists?
         (pyproject.toml, requirements.txt, setup.py, etc.)
         ├─ yes → package.json exists?
         │        ├─ yes → both present, scan files:
         │        │          .py only → PYTHON
         │        │          .ts only → TYPESCRIPT
         │        │          fallback → TYPESCRIPT
         │        └─ no  → PYTHON
         └─ no  → package.json exists?
                  ├─ yes → TYPESCRIPT
                  └─ no  → count .ts vs .py files:
                             more .py → PYTHON
                             more .ts → TYPESCRIPT
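Read as code, the decision flow above might look like the sketch below. It is one interpretation of the chart, not the server's implementation: `RepoFacts`, `detectProjectType`, and `getProjectType` are hypothetical names, and sending a tie in the file-count branch to TypeScript is an assumption made to match the fallback elsewhere.

```typescript
type ProjectType = "typescript" | "python";

// Hypothetical summary of what a repository probe would report.
interface RepoFacts {
  hasTsconfig: boolean;
  hasPythonProjectFile: boolean; // pyproject.toml, requirements.txt, setup.py, etc.
  hasPackageJson: boolean;
  tsFileCount: number;
  pyFileCount: number;
}

function detectProjectType(facts: RepoFacts): ProjectType {
  if (facts.hasTsconfig) return "typescript";

  if (facts.hasPythonProjectFile) {
    if (!facts.hasPackageJson) return "python";
    // Both ecosystems have marker files: decide by which sources exist.
    if (facts.pyFileCount > 0 && facts.tsFileCount === 0) return "python";
    return "typescript"; // .ts only, or the mixed/empty fallback
  }

  if (facts.hasPackageJson) return "typescript";

  // No marker files at all: compare counts (ties go to TypeScript by assumption).
  return facts.pyFileCount > facts.tsFileCount ? "python" : "typescript";
}

// Detection runs once per process; later calls reuse the cached answer.
let cachedProjectType: ProjectType | undefined;

function getProjectType(probe: () => RepoFacts): ProjectType {
  if (cachedProjectType === undefined) {
    cachedProjectType = detectProjectType(probe());
  }
  return cachedProjectType;
}
```

Because the result is cached for the life of the process, a repository whose layout changes while the server is running keeps its original projectType until the server restarts.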
Dev Workflow
npm run typecheck # type-check without emitting
npm test # build + run unit & integration tests
npm run lint # run ESLint
npm run format:check # check Prettier formatting
Troubleshooting
| Problem | Solution |
|---|---|
| rg not found | Install ripgrep and ensure it is on PATH |
| symbol_search PCRE2 error | Install ripgrep with PCRE2 support (most official packages include it) |
| Request exceeds 1048576 bytes | Split large requests; max inbound JSON-RPC frame body is 1 MB |
| truncated: true | Increase maxMatches/maxResults (up to bounds), narrow your query, or add globs |
| No results from symbol_search | Check projectType in the response; detection may have picked the wrong language. Use project_search_rg as a fallback |
License
MIT