codebase-index

AST-aware codebase indexing with semantic search, exposed as an MCP server. Think Cursor's codebase awareness, but for Claude Code (or any MCP client).

What it does: Parses your codebase with tree-sitter, chunks it intelligently (functions, components, hooks, stores, configs, classes with method extraction, types, Prisma models), embeds locally with Ollama, stores in LanceDB, and exposes semantic search via MCP.

Quick Start

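# Prerequisites (brew is macOS-only; on other platforms install Ollama per its docs)
brew install ollama
ollama serve                   # runs in the foreground; leave it running
ollama pull nomic-embed-text   # run this in a second terminal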

# Clone the tool (one-time)
git clone https://github.com/LevelPanic/codebase-index.git ~/codebase-index
cd ~/codebase-index && npm install

# In your repo
cd ~/your-project
npx tsx ~/codebase-index/src/cli/index.ts init          # generates config + wires .mcp.json
npx tsx ~/codebase-index/src/cli/index.ts index --full   # build the index

The init command automatically:

Creates codebase-index.config.json with detected file patterns
Adds .codebase-index/ to .gitignore
Wires the MCP server into .mcp.json with the correct path

Restart Claude Code, and the search_codebase and get_file_context tools become available.

How It Works

Your repo files
     │
     ▼
  tree-sitter AST parsing
     │
     ▼
  Smart chunks (functions, components, hooks, stores, configs, classes, types, Prisma models)
     │
     ▼
  Ollama embeddings (nomic-embed-text, local, free)
     │
     ▼
  LanceDB vector storage (just files on disk)
     │
     ▼
  MCP server (stdio) → search_codebase / get_file_context

Chunking is AST-aware: no naive line splits. Each function, component, type definition, and Prisma model is its own chunk.
Context-enriched: function chunks include referenced type definitions inline, so embeddings capture the full picture.
Smart truncation: large chunks keep the signature, head, and tail instead of being cut off at the end, which preserves return statements and JSX output.
Class method extraction: large classes are split into individual method chunks instead of one truncated blob.
Chunk type detection: React hooks (useXxx), Zustand/Redux stores, config objects, and barrel files are all detected and tagged.
Small type batching: tiny adjacent type aliases are merged into a single chunk to reduce embedding calls.
Embeddings are local: Ollama runs on your machine. No API keys, no external network calls, no cost.
Storage is embedded: LanceDB is just a directory on disk. No database server.
Branch-aware: when on a feature branch, search results for modified files return live content from disk.
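
To make the smart truncation idea concrete, here is a minimal sketch of a signature + head + tail cut. It is not the tool's actual code: the function name truncateChunk, the line-based budget, the even head/tail split, and the marker text are all assumptions for illustration.

```typescript
// Hypothetical helper (not from the codebase-index source).
// Keeps the first line (assumed to hold the signature), some leading body
// lines, and the trailing lines, so the return statement or JSX survives.
function truncateChunk(source: string, maxLines: number): string {
  const lines = source.split("\n");
  // Nothing to do for short chunks; below 3 lines there is no room for a marker.
  if (lines.length <= maxLines || maxLines < 3) return source;

  const signature = lines[0];
  const budget = maxLines - 2; // reserve one line for the signature, one for the marker
  const headCount = Math.ceil(budget / 2);
  const tailCount = budget - headCount;

  const head = lines.slice(1, 1 + headCount);
  const tail = tailCount > 0 ? lines.slice(lines.length - tailCount) : [];
  const omitted = lines.length - 1 - headCount - tailCount;

  return [signature, ...head, `// ... ${omitted} lines omitted ...`, ...tail].join("\n");
}
```

A real implementation would more likely budget by characters or tokens than by lines, since embedding models limit input by tokens.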

Configuration

The config file codebase-index.config.json goes at your repo root. It's optional — the tool works with zero config on any TypeScript/JavaScript project.

{
  "include": ["src/**/*.{ts,tsx,js,jsx}"],
  "exclude": ["**/node_modules/**", "**/dist/**"],
  "output": ".codebase-index",
  "baseBranch": "main",
  "embedding": {
    "provider": "ollama",
    "url": "http://localhost:11434",
    "model": "nomic-embed-text"
  },
  "tags": []
}
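
The embedding block above points at a local Ollama instance. As a rough sketch of what a single embedding request could look like, the helpers below build a call to Ollama's /api/embeddings endpoint (which takes model and prompt and returns an embedding array) and validate the response. The README does not say which Ollama endpoint codebase-index actually calls, and the helper names buildEmbeddingRequest and parseEmbeddingResponse are hypothetical.

```typescript
// Mirrors the "embedding" section of codebase-index.config.json.
interface EmbeddingConfig {
  provider: "ollama";
  url: string;
  model: string;
}

// Hypothetical helper: builds the URL and request options for one embedding call.
// Assumes Ollama's /api/embeddings endpoint ({ model, prompt } in, { embedding } out).
function buildEmbeddingRequest(text: string, cfg: EmbeddingConfig) {
  return {
    url: `${cfg.url.replace(/\/+$/, "")}/api/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: cfg.model, prompt: text }),
    },
  };
}

// Hypothetical helper: checks the response shape before trusting it.
function parseEmbeddingResponse(json: unknown): number[] {
  const embedding =
    json !== null && typeof json === "object"
      ? (json as { embedding?: unknown }).embedding
      : undefined;
  if (!Array.isArray(embedding) || !embedding.every((v) => typeof v === "number")) {
    throw new Error("Unexpected embedding response shape");
  }
  return embedding as number[];
}
```

With a runtime that provides a global fetch (Node.js 18+ does), the two helpers compose as: build the request, call fetch(url, init), then pass the parsed JSON body to parseEmbeddingResponse.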

Monorepo Example

{
  "include": [
    "apps/**/*.{ts,tsx,js,jsx}",
    "packages/**/*.{ts,tsx,js,jsx}",
    "apps/**/schema.prisma"
  ],
  "tags": [
    {
      "name": "platform",
      "defaultValue": "all",
      "rules": [
        { "pattern": "facebook", "value": "meta" },
        { "pattern": "tiktok", "value": "tiktok" },
        { "pattern": "google-ads", "value": "google" }
      ]
    },
    {
      "name": "app",
      "defaultValue": "unknown",
      "rules": [
        { "pattern": "apps/web/", "value": "web" },
        { "pattern": "apps/api/", "value": "api" },
        { "pattern": "packages/", "value": "packages" }
      ]
    }
  ]
}
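
One plausible reading of the tags config is that each tag is resolved per file path: the first rule whose pattern appears in the path wins, otherwise defaultValue applies. The sketch below implements that reading. Substring matching and first-match-wins ordering are assumptions, not documented behavior, and resolveTags is a hypothetical name.

```typescript
interface TagRule {
  pattern: string;
  value: string;
}

interface TagConfig {
  name: string;
  defaultValue: string;
  rules: TagRule[];
}

// Hypothetical resolver: assumes plain substring matching and that the
// first matching rule wins. The real tool may use globs or regexes instead.
function resolveTags(filePath: string, tags: TagConfig[]): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const tag of tags) {
    const match = tag.rules.find((rule) => filePath.includes(rule.pattern));
    resolved[tag.name] = match ? match.value : tag.defaultValue;
  }
  return resolved;
}
```

Under these assumptions, apps/api/src/facebook/sync.ts would be tagged platform "meta" and app "api" with the monorepo config above, while packages/ui/Button.tsx would fall back to platform "all" and get app "packages".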

CLI

Run commands via npx tsx ~/codebase-index/src/cli/index.ts <command> from your repo directory. Or create a shell alias:

alias cbi="npx tsx ~/codebase-index/src/cli/index.ts"

cbi init [--force]     Generate config + wire .mcp.json
cbi index              Incremental index (changed files only)
cbi index --full       Full reindex (drop and rebuild)
cbi stats              Show index statistics
cbi serve              Start MCP server (stdio)

Indexing

Full index: Parses all files, embeds everything, rebuilds the database. ~15-20 min for ~5K files on Apple Silicon.
Incremental index: Only re-embeds files changed since last indexed commit. ~10 seconds for typical daily changes.
Stats: Shows chunk counts broken down by type and configured tags.
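
The README does not describe how the incremental index finds changed files. One way to do it, sketched here as an assumption, is to run git diff --name-status between the last indexed commit and HEAD and turn the output into a plan of files to re-embed and files whose chunks should be dropped. The names planIncremental and IncrementalPlan are hypothetical.

```typescript
interface IncrementalPlan {
  reembed: string[]; // files whose chunks should be rebuilt
  remove: string[];  // files whose chunks should be deleted
}

// Hypothetical planner: parses `git diff --name-status <lastIndexed> HEAD` output.
// Each line is "<status>\t<path>", or "<status>\t<old>\t<new>" for renames and copies.
function planIncremental(
  nameStatus: string,
  isIndexable: (path: string) => boolean,
): IncrementalPlan {
  const plan: IncrementalPlan = { reembed: [], remove: [] };
  for (const line of nameStatus.split("\n")) {
    if (line.trim() === "") continue;
    const [status = "", first = "", second = ""] = line.split("\t");
    const kind = status.charAt(0); // "R100" -> "R"
    if (kind === "D") {
      if (isIndexable(first)) plan.remove.push(first);
    } else if (kind === "R") {
      // Rename: drop chunks for the old path, embed the new one.
      if (isIndexable(first)) plan.remove.push(first);
      if (isIndexable(second)) plan.reembed.push(second);
    } else if (kind === "C") {
      // Copy: the source is unchanged, only the destination is new.
      if (isIndexable(second)) plan.reembed.push(second);
    } else if (isIndexable(first)) {
      // Added, modified, or type-changed files.
      plan.reembed.push(first);
    }
  }
  return plan;
}
```

Here isIndexable stands in for whatever check applies the include and exclude patterns from the config.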

MCP Tools

`search_codebase`

Semantic search across the codebase. Returns relevant code chunks ranked by similarity.

| Parameter | Type | Description |
| --- | --- | --- |
| `query` | string | Natural language search query |
| `limit` | number | Max results (default 10) |
| `<tag_name>` | enum | Filter by any configured tag |

`get_file_context`

Get all indexed chunks for a specific file. Shows the file's structure — functions, components, types.

| Parameter | Type | Description |
| --- | --- | --- |
| `file_path` | string | Relative path from repo root |
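
MCP clients such as Claude Code call these tools for you, but it can help to see the message shape. The sketch below builds the JSON-RPC tools/call request that MCP defines for invoking a tool, using the search_codebase parameters listed above. The helper name buildSearchCall is hypothetical, and passing a tag filter under its tag name (here platform, from the monorepo example) is an assumption based on the `<tag_name>` parameter.

```typescript
// Arguments for search_codebase. Extra keys are tag filters, which only
// exist if the corresponding tags are configured (an assumption about naming).
interface SearchArgs {
  query: string;
  limit?: number;
  [tagName: string]: string | number | undefined;
}

// Hypothetical helper: builds an MCP "tools/call" JSON-RPC request.
function buildSearchCall(id: number, args: SearchArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "search_codebase", arguments: args },
  };
}

// Example: search only chunks tagged platform = "meta".
const exampleCall = buildSearchCall(1, {
  query: "where are ad budgets validated",
  limit: 5,
  platform: "meta",
});
```

A get_file_context call has the same envelope, with name set to get_file_context and arguments containing only file_path.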

Chunk Types

The chunker detects and labels each chunk:

| Type | What it captures |
| --- | --- |
| `function` | Functions and arrow functions |
| `component` | React components (JSX-returning functions in `.tsx`/`.jsx`) |
| `hook` | React hooks (`useXxx` naming convention) |
| `store` | Zustand/Redux stores (`create()`, `Store`, `Slice`) |
| `config` | Plain object/array literals (route maps, constants, configs) |
| `class` | Class overview with method listing |
| `method` | Individual methods extracted from large classes |
| `type` | Type aliases, interfaces, enums |
| `model` | Prisma model/enum/type blocks |
| `summary` | File-level overview (imports + export listing) |
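
The table suggests name- and shape-based heuristics for the function-like chunk types. The sketch below shows one way such a classifier could be ordered. The Declaration shape, the classify function, and the precedence (store before hook before component) are assumptions for illustration. In particular, treating a create() initializer as a store even when the name starts with use is a guess at how Zustand stores like useCartStore would be handled.

```typescript
type ChunkType = "function" | "component" | "hook" | "store";

// Hypothetical summary of a declaration, assumed to be precomputed from the AST.
interface Declaration {
  name: string;
  filePath: string;
  returnsJsx: boolean;
  initializerCallee?: string; // e.g. "create" for `const useX = create(...)`
}

// Hypothetical classifier following the heuristics in the table above.
function classify(decl: Declaration): ChunkType {
  const isStore =
    decl.initializerCallee === "create" || /(Store|Slice)$/.test(decl.name);
  if (isStore) return "store";
  // "use" followed by an uppercase letter or digit, so "user" is not a hook.
  if (/^use[A-Z0-9]/.test(decl.name)) return "hook";
  if (/\.(tsx|jsx)$/.test(decl.filePath) && decl.returnsJsx) return "component";
  return "function";
}
```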

Prerequisites

Node.js 18+
Git — the repo must be a git repository with at least one commit
Ollama running locally with an embedding model (start ollama serve, then run ollama pull nomic-embed-text in another terminal)
C++ compiler (for tree-sitter native module — Xcode CLI tools on macOS, build-essential on Linux)

How Freshness Works

The index tracks the base branch (default: main). When you're on a feature branch:

1. Search results are returned from the index as normal.
2. If a result points to a file modified on your branch (git diff main...HEAD), the tool reads the live file from disk and re-parses it.
3. You get current content, not stale indexed content.

This means the index only needs to track main — feature branch changes are always live.
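
The freshness flow can be sketched as a post-processing step over search results. This is a simplified illustration, not the tool's code: it swaps in the whole live file text rather than re-parsing it into chunks as the README describes, and the names SearchResult, parseNameOnly, and overlayLiveContent are hypothetical. The file reader is injected so the logic stays independent of the filesystem.

```typescript
interface SearchResult {
  filePath: string;
  content: string;
  score: number;
}

// Parses `git diff --name-only main...HEAD` output into a set of paths.
function parseNameOnly(output: string): Set<string> {
  return new Set(
    output
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line !== ""),
  );
}

// Hypothetical overlay: results for files modified on the branch get live
// content; everything else is returned from the index unchanged.
function overlayLiveContent(
  results: SearchResult[],
  modifiedFiles: Set<string>,
  readLive: (filePath: string) => string | undefined,
): SearchResult[] {
  return results.map((result) => {
    if (!modifiedFiles.has(result.filePath)) return result;
    const live = readLive(result.filePath);
    // If the file was deleted or cannot be read, keep the indexed content.
    return live === undefined ? result : { ...result, content: live };
  });
}
```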

Using in Multiple Repos

Clone once, use everywhere. Run init in each repo — it auto-detects the structure and wires everything up:

cd ~/project-a && npx tsx ~/codebase-index/src/cli/index.ts init
cd ~/project-b && npx tsx ~/codebase-index/src/cli/index.ts init

Each repo gets its own config, its own .codebase-index/ directory, and its own .mcp.json entry.

License

MIT
