semantic-code-mcp
MCP server that provides semantic code search for Claude Code. Instead of iterative grep/glob, it indexes your codebase with embeddings and returns ranked results by meaning.
Supports Python, Rust, and Markdown — more languages planned.
How It Works
Claude Code ──(MCP/STDIO)──▶ semantic-code-mcp server
                                         │
                         ┌───────────────┼───────────────┐
                         ▼               ▼               ▼
                    AST Chunker       Embedder        LanceDB
                  (tree-sitter)  (sentence-trans)    (vectors)
- Chunking — tree-sitter parses source files into functions, classes, methods, structs, traits, markdown sections, etc.
- Embedding — sentence-transformers encodes each chunk (all-MiniLM-L6-v2, 384d)
- Storage — vectors stored in LanceDB (embedded, like SQLite)
- Search — hybrid semantic + keyword search with recency boosting
Indexing is incremental (mtime-based) and uses git ls-files for fast file discovery. The embedding model loads lazily on first query.
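The incremental pass described above can be sketched roughly as follows. This is an illustration, not the server's actual API: the function names and the shape of the `last_indexed` map are assumptions.

```python
import subprocess
from pathlib import Path


def discover_files(repo_root: str) -> list[Path]:
    """Use git ls-files for fast, .gitignore-aware file discovery."""
    out = subprocess.run(
        ["git", "ls-files"], cwd=repo_root, capture_output=True, text=True, check=True
    )
    return [Path(repo_root) / line for line in out.stdout.splitlines() if line]


def files_to_reindex(files: list[Path], last_indexed: dict[str, float]) -> list[Path]:
    """Keep only files whose mtime is newer than the timestamp stored at last index."""
    stale = []
    for f in files:
        if f.stat().st_mtime > last_indexed.get(str(f), 0.0):
            stale.append(f)
    return stale
```

On a re-run, only the files returned by `files_to_reindex` need to be re-chunked and re-embedded; everything else keeps its stored vectors.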
Installation
macOS / Windows
PyPI ships CPU-only torch on these platforms, so no extra flags are needed (~1.7GB install).
uvx semantic-code-mcp
Claude Code integration:
claude mcp add --scope user semantic-code -- uvx semantic-code-mcp
Linux
> [!IMPORTANT]
> Without the `--index` flag, PyPI installs CUDA-bundled torch (~3.5GB). Unless you need GPU acceleration (you don't; embeddings run on CPU), use the command below to get the CPU-only build (~1.7GB).
uvx --index pytorch-cpu=https://download.pytorch.org/whl/cpu semantic-code-mcp
Claude Code integration:
claude mcp add --scope user semantic-code -- \
  uvx --index pytorch-cpu=https://download.pytorch.org/whl/cpu semantic-code-mcp
<details> <summary>Claude Desktop / other MCP clients (JSON config)</summary>
{
  "mcpServers": {
    "semantic-code": {
      "command": "uvx",
      "args": ["--index", "pytorch-cpu=https://download.pytorch.org/whl/cpu", "semantic-code-mcp"]
    }
  }
}
On macOS/Windows you can omit the --index and pytorch-cpu args.
</details>
Updating
uvx caches the installed version. To get the latest release:
uvx --upgrade semantic-code-mcp
Or pin a specific version in your MCP config:
claude mcp add --scope user semantic-code -- uvx semantic-code-mcp@0.2.0
MCP Tools
search_code
Search code by meaning, not just text matching. Auto-indexes on first search.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | required | Natural language description of what you're looking for |
| project_path | str | required | Absolute path to the project root |
| limit | int | 10 | Maximum number of results |
Returns ranked results with file_path, line_start, line_end, name, chunk_type, content, and score.
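The hybrid ranking might blend its signals along these lines. The weights, the 30-day half-life, and the boost factor are all assumptions for illustration; the server's actual formula may differ.

```python
import math
import time


def hybrid_score(semantic: float, keyword: float, mtime: float,
                 alpha: float = 0.7, half_life_days: float = 30.0) -> float:
    """Blend semantic and keyword relevance, then boost recently edited files.

    semantic/keyword are similarity scores in [0, 1]; mtime is the file's
    modification time in seconds since the epoch.
    """
    base = alpha * semantic + (1 - alpha) * keyword
    age_days = max(0.0, (time.time() - mtime) / 86400)
    # Exponential decay: 1.0 for a file edited now, 0.5 after one half-life.
    recency = math.exp(-math.log(2) * age_days / half_life_days)
    return base * (1 + 0.2 * recency)  # up to a 20% boost for fresh files
```

The key property is that recency only nudges the ordering; a stale but highly relevant chunk still outranks a fresh but irrelevant one.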
index_codebase
Index a codebase for semantic search. Only processes new and changed files unless force=True.
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_path | str | required | Absolute path to the project root |
| force | bool | False | Re-index all files regardless of changes |
index_status
Check indexing status for a project.
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_path | str | required | Absolute path to the project root |
Returns is_indexed, files_count, and chunks_count.
Configuration
All settings are environment variables with the SEMANTIC_CODE_MCP_ prefix (via pydantic-settings):
| Variable | Default | Description |
|---|---|---|
| SEMANTIC_CODE_MCP_CACHE_DIR | ~/.cache/semantic-code-mcp | Where indexes are stored |
| SEMANTIC_CODE_MCP_LOCAL_INDEX | false | Store the index in .semantic-code/ within each project |
| SEMANTIC_CODE_MCP_EMBEDDING_MODEL | all-MiniLM-L6-v2 | Sentence-transformers model |
| SEMANTIC_CODE_MCP_DEBUG | false | Enable debug logging |
| SEMANTIC_CODE_MCP_PROFILE | false | Enable pyinstrument profiling |
Pass environment variables via the env field in your MCP config:
{
  "mcpServers": {
    "semantic-code": {
      "command": "uvx",
      "args": ["semantic-code-mcp"],
      "env": {
        "SEMANTIC_CODE_MCP_DEBUG": "true",
        "SEMANTIC_CODE_MCP_LOCAL_INDEX": "true"
      }
    }
  }
}
Or with Claude Code CLI:
claude mcp add --scope user semantic-code \
  -e SEMANTIC_CODE_MCP_DEBUG=true \
  -e SEMANTIC_CODE_MCP_LOCAL_INDEX=true \
  -- uvx semantic-code-mcp
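The prefix convention itself is simple to picture. A stdlib-only sketch of how prefixed variables map to setting names (the project actually uses pydantic-settings; this `load_settings` helper is hypothetical):

```python
import os

PREFIX = "SEMANTIC_CODE_MCP_"


def load_settings(environ: dict[str, str]) -> dict[str, str]:
    """Collect settings from prefixed environment variables, lowercasing keys."""
    return {
        key[len(PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }
```

So `SEMANTIC_CODE_MCP_DEBUG=true` surfaces as the `debug` setting, and unrelated environment variables are ignored.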
Tech Stack
| Component | Choice | Rationale |
|---|---|---|
| MCP Framework | FastMCP | Python decorators, STDIO transport |
| Embeddings | sentence-transformers | Local, no API costs, good quality |
| Vector Store | LanceDB | Embedded (like SQLite), no server needed |
| Chunking | tree-sitter | AST-based, respects code structure |
Development
uv sync # Install dependencies
uv run python -m semantic_code_mcp # Run server
uv run pytest # Run tests
uv run ruff check src/ # Lint
uv run ruff format src/ # Format
Pre-commit hooks enforce linting, formatting, type-checking (ty), security scanning (bandit), and Conventional Commits.
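Conventional Commits messages follow a `type(scope): description` shape. A minimal validator sketch for the first line of a message (the actual pre-commit hook likely uses a dedicated tool with a more complete grammar):

```python
import re

# Simplified Conventional Commits pattern: type, optional scope, optional "!", subject.
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([\w.-]+\))?!?: .+"
)


def is_conventional(message: str) -> bool:
    """Check the first line of a commit message against the pattern."""
    return COMMIT_RE.match(message.splitlines()[0]) is not None
```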
Releasing
Versions are derived from git tags automatically (hatch-vcs) — there's no hardcoded version in pyproject.toml.
git tag v0.2.0
git push origin v0.2.0
CI builds the package, publishes to PyPI, and creates a GitHub Release with auto-generated notes.
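Conceptually, hatch-vcs maps the most recent tag to the package version at build time. A simplified sketch of that mapping based on `git describe --tags` output (the real tool also handles local version segments and other edge cases, so treat this as an approximation):

```python
import re


def version_from_describe(describe: str) -> str:
    """Map `git describe --tags` output to a PEP 440-style version.

    'v0.2.0'            -> '0.2.0'       (checkout is exactly on a tag)
    'v0.2.0-3-gabc1234' -> '0.2.0.dev3'  (3 commits past the tag; simplified)
    """
    m = re.fullmatch(r"v?(.+?)(?:-(\d+)-g[0-9a-f]+)?", describe)
    base, distance = m.group(1), m.group(2)
    return f"{base}.dev{distance}" if distance else base
```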
Adding a New Language
The chunker system is designed to make adding languages straightforward. Each language needs:
- A tree-sitter grammar package (e.g. `tree-sitter-javascript`)
- A chunker subclass that walks the AST and extracts meaningful chunks
Steps:
uv add tree-sitter-mylang
Create src/semantic_code_mcp/chunkers/mylang.py:
from enum import StrEnum, auto

import tree_sitter_mylang as tsmylang
from tree_sitter import Language, Node

from semantic_code_mcp.chunkers.base import BaseTreeSitterChunker
from semantic_code_mcp.models import Chunk, ChunkType


class NodeType(StrEnum):
    function_definition = auto()
    # ... other node types


class MyLangChunker(BaseTreeSitterChunker):
    language = Language(tsmylang.language())
    extensions = (".ml",)

    def _extract_chunks(self, root: Node, file_path: str, lines: list[str]) -> list[Chunk]:
        chunks = []
        for node in root.children:
            match node.type:
                case NodeType.function_definition:
                    name = node.child_by_field_name("name").text.decode()
                    chunks.append(self._make_chunk(node, file_path, lines, ChunkType.function, name))
                # ... other node types
        return chunks
Register it in src/semantic_code_mcp/container.py:
from semantic_code_mcp.chunkers.mylang import MyLangChunker
def get_chunkers(self) -> list[BaseTreeSitterChunker]:
    return [PythonChunker(), RustChunker(), MarkdownChunker(), MyLangChunker()]
The CompositeChunker handles dispatch by file extension automatically. Use BaseTreeSitterChunker._make_chunk() for consistent chunk construction. See chunkers/python.py and chunkers/rust.py for complete examples.
Project Structure
- src/semantic_code_mcp/chunkers/ — language chunkers (base.py, composite.py, python.py, rust.py, markdown.py)
- src/semantic_code_mcp/services/ — IndexService (scan/chunk/index), SearchService (search + auto-index)
- src/semantic_code_mcp/indexer.py — embed + store pipeline
- docs/decisions/ — architecture decision records
- TODO.md — epics and planning
- CHANGELOG.md — completed work (Keep a Changelog format)
- .claude/rules/ — context-specific coding rules for AI agents
License
MIT