CodeBrain MCP Server
Enables semantic code search across codebases using AI embeddings and vector similarity, integrated with Claude Desktop and Cursor.
Semantic code search powered by AI embeddings and vector similarity.
Integrate intelligent code search directly into Claude Desktop and Cursor through the Model Context Protocol (MCP).
What It Does
CodeBrain indexes your codebase using AST-based splitting and AI embeddings, enabling:
- Semantic search - Find code by meaning, not just keywords
- Smart chunking - AST-aware code splitting (respects functions, classes, etc.)
- Fast retrieval - Vector similarity search with pgvector
- Multi-project - Index and search across multiple codebases
Quick Start
1. Prerequisites
# Docker running (for PostgreSQL + pgvector)
docker ps | grep codebrain
# Node.js 20+
node --version
# Dependencies installed
cd /Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain
pnpm install
2. Setup Database
# Start PostgreSQL with pgvector (if not running)
docker run -d \
  --name codebrain \
  -p 5484:5432 \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=codebrain \
  pgvector/pgvector:pg15
# Setup database schema
pnpm db:setup
pnpm db:migrate
3. Configure Environment
Edit .env:
GEMINI_API_KEY=your_api_key_here
DATABASE_URL=postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp
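A missing or malformed variable is easier to diagnose if it is checked before the server starts. The following is a minimal sketch, not the project's actual startup code: `loadEnv` and `CodeBrainEnv` are hypothetical names, and the only facts taken from this README are the two variable names and the example connection string.

```typescript
// Hypothetical helper: validates the two variables this README marks as required.
interface CodeBrainEnv {
  geminiApiKey: string;
  databaseUrl: URL;
  schema: string | null; // value of the ?schema= query parameter, if present
}

function loadEnv(env: Record<string, string | undefined>): CodeBrainEnv {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is required");
  }
  const rawUrl = env.DATABASE_URL;
  if (!rawUrl) {
    throw new Error("DATABASE_URL is required");
  }
  // The URL constructor throws a TypeError if the string is not a valid URL.
  const databaseUrl = new URL(rawUrl);
  return {
    geminiApiKey,
    databaseUrl,
    schema: databaseUrl.searchParams.get("schema"),
  };
}

// Example using the placeholder values from the .env snippet above.
const exampleEnv = loadEnv({
  GEMINI_API_KEY: "your_api_key_here",
  DATABASE_URL:
    "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp",
});
console.log(exampleEnv.databaseUrl.port, exampleEnv.schema);
```

In a real entry point the argument would be `process.env`; passing a plain object keeps the sketch easy to test.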
4. Test the Server
# Run tests
pnpm test
# Should show: ✅ 28 tests passed
# Test MCP server starts
npx tsx src/index.ts
# Should output a startup message: CodeBrain MCP Server started (stdio mode)
# Press Ctrl+C to stop
Connect to Cursor/Claude
For Cursor
- Open Cursor Settings → MCP Servers
- Add a server named codebrain
- Copy this config:
{
  "command": "npx",
  "args": [
    "-y",
    "tsx",
    "/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
  ],
  "env": {
    "GEMINI_API_KEY": "your_key_here",
    "DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
  }
}
- Restart Cursor
- Verify - Check MCP panel shows "codebrain" connected
Detailed guide: see CURSOR_SETUP.md
For Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "codebrain": {
      "command": "npx",
      "args": [
        "-y",
        "tsx",
        "/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
      ],
      "env": {
        "GEMINI_API_KEY": "your_key_here",
        "DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
      }
    }
  }
}
Restart Claude Desktop.
Available MCP Tools
1. index_codebase
Index a codebase for semantic search.
Parameters:
{
  projectName: string;  // Unique project identifier
  rootPath: string;     // Absolute path to code
  force?: boolean;      // Re-index existing files
}
Example:
"Index my React project at /Users/me/projects/my-app with name 'my-app'"
2. semantic_search
Search code semantically across indexed projects.
Parameters:
{
  query: string;         // What to search for
  projectName?: string;  // Filter by project
  topK?: number;         // Number of results (default: 5)
  threshold?: number;    // Similarity threshold (default: 0.5)
}
Example:
"Find authentication logic in my-app"
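To make the `topK` and `threshold` parameters concrete, here is a small in-memory sketch of cosine-similarity ranking. It is an illustration only: the real server delegates this work to pgvector, and the names `cosineSimilarity`, `rankChunks`, and `StoredChunk` are hypothetical. Whether the real threshold comparison is inclusive is also an assumption.

```typescript
interface StoredChunk {
  id: string;
  vector: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vector dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Defaults mirror the documented ones: topK = 5, threshold = 0.5.
// Assumption: results scoring exactly at the threshold are kept.
function rankChunks(
  query: number[],
  chunks: StoredChunk[],
  topK = 5,
  threshold = 0.5,
): ScoredChunk[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.vector) }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 2-dimensional vectors; real embeddings here are 768-dimensional.
const ranked = rankChunks(
  [1, 0],
  [
    { id: "c", vector: [0, 1] },
    { id: "b", vector: [1, 1] },
    { id: "a", vector: [1, 0] },
  ],
);
console.log(ranked.map((r) => r.id));
```

Raising `threshold` trades recall for precision; raising `topK` only matters when more chunks clear the threshold than the current limit.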
3. list_projects
List all indexed projects.
Parameters: None
Example:
"Show me all indexed projects"
4. get_project_stats
Get statistics for a project.
Parameters:
{
  projectName: string;  // Project to query
}
Example:
"Show me stats for the my-app project"
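The natural-language examples above are what you type into the client; the client then turns them into MCP tool calls. As a rough sketch, a call is a JSON-RPC 2.0 request with method `tools/call`, carrying the tool name and its arguments. The helper name `buildToolCall` is hypothetical, and in practice Cursor or Claude Desktop builds this message for you.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper that assembles a tools/call request for any of the four tools.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "Find authentication logic in my-app" expressed as a semantic_search call.
const searchRequest = buildToolCall(1, "semantic_search", {
  query: "authentication logic",
  projectName: "my-app",
  topK: 5,
  threshold: 0.5,
});

// Over the stdio transport, each message is written as one line of JSON.
console.log(JSON.stringify(searchRequest));
```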
Architecture
┌─────────────────────────────────────────────┐
│           Cursor / Claude Desktop           │
│                (MCP Client)                 │
└──────────────────────┬──────────────────────┘
                       │ MCP Protocol (stdio)
                       │
┌──────────────────────▼──────────────────────┐
│            CodeBrain MCP Server             │
│  ┌───────────────────────────────────────┐  │
│  │ AST Code Splitter                     │  │
│  │  - JavaScript/TypeScript              │  │
│  │  - Python, Go, Rust, Java, C++        │  │
│  └───────────────────┬───────────────────┘  │
│                      │                      │
│  ┌───────────────────▼───────────────────┐  │
│  │ Gemini Embeddings                     │  │
│  │  - 768-dimensional vectors            │  │
│  │  - Semantic descriptions              │  │
│  └───────────────────┬───────────────────┘  │
│                      │                      │
│  ┌───────────────────▼───────────────────┐  │
│  │ Vector Search                         │  │
│  │  - Cosine similarity                  │  │
│  │  - Threshold filtering                │  │
│  └───────────────────┬───────────────────┘  │
└──────────────────────┼──────────────────────┘
                       │
┌──────────────────────▼──────────────────────┐
│            PostgreSQL + pgvector            │
│  ┌───────────────────────────────────────┐  │
│  │ Projects → Files → Chunks → Embeds    │  │
│  │ Normalized relational schema          │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
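The last hop in the diagram, from the vector search layer to pgvector, could be expressed roughly as the parameterized query below. This is a sketch under assumptions: the table and column names (`embeddings`, `chunk_id`, `vector`) are hypothetical and may not match the project's Prisma schema. What is certain is that pgvector's `<=>` operator returns cosine distance, so similarity is `1 - distance`.

```typescript
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// pgvector accepts a vector as a bracketed, comma-separated text literal.
function toVectorLiteral(v: number[]): string {
  return `[${v.join(",")}]`;
}

// Hypothetical table and column names; adjust to the real schema.
function buildSearchQuery(
  queryEmbedding: number[],
  topK = 5,
  threshold = 0.5,
): SqlQuery {
  const text = [
    "SELECT chunk_id, 1 - (vector <=> $1::vector) AS similarity",
    "FROM embeddings",
    "WHERE 1 - (vector <=> $1::vector) >= $2",
    "ORDER BY vector <=> $1::vector",
    "LIMIT $3",
  ].join("\n");
  return { text, values: [toVectorLiteral(queryEmbedding), threshold, topK] };
}

const searchQuery = buildSearchQuery([0.1, 0.2, 0.3]);
console.log(searchQuery.text);
console.log(searchQuery.values);
```

Ordering by distance ascending is equivalent to ordering by similarity descending, and it is the form pgvector indexes can accelerate.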
Testing
# Run all tests
pnpm test
# Watch mode
pnpm test:watch
# Individual test suites
pnpm test:splitter # AST code splitter
pnpm test:indexing # Indexing workflow
pnpm test:search # Semantic search
pnpm test:embedding # Embedding generation
# Integration test (end-to-end)
pnpm test:integration
Graph Viewer (React)
Visualise the code graph in the browser with the React/Vite viewer.
# Start the Graph API server (serves graph JSON on http://localhost:4000)
pnpm graph:server
# In a separate terminal, install and run the viewer UI
cd apps/graph-viewer
pnpm install
pnpm dev
# Open the browser UI → http://localhost:5173
Override the API target with VITE_GRAPH_API_URL (inside apps/graph-viewer/.env) if the server runs elsewhere.
Project Structure
CodeBrain/
├── src/
│   ├── index.ts                     # MCP server entry point
│   ├── core/
│   │   ├── indexing.ts              # Indexing orchestration
│   │   ├── search.ts                # Semantic search
│   │   ├── splitter.ts              # AST-based code splitting
│   │   └── embedding/
│   │       ├── base-embedding.ts    # Embedding interface
│   │       └── gemini-embedding.ts  # Gemini implementation
│   └── test/
│       ├── *.test.ts                # Unit tests
│       └── utils.ts                 # Test utilities
├── db/
│   ├── index.ts                     # Prisma client
│   ├── setup.ts                     # Database setup script
│   └── vector-indexes.ts            # Vector index management
├── prisma/
│   └── schema.prisma                # Database schema
├── .env                             # Environment variables
├── mcp-config.json                  # MCP configuration template
├── CURSOR_SETUP.md                  # Cursor integration guide
└── README.md                        # This file
Database Schema
Project (1) ─┐
             └─> File (N) ─┐
                           └─> Chunk (N) ─┐
                                          └─> Embedding (N)
- Project: Root container (name, rootPath)
- File: Individual source files (path, language, hash)
- Chunk: Code segments (text, lines, AST metadata)
- Embedding: Vector representations (768-dim, model, similarity search)
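Expressed as types, the four entities might look like the sketch below. The authoritative definition lives in `prisma/schema.prisma`; the field names here are inferred from the bullet list above, and the ID and foreign-key fields, `astNodeType`, and the model name in the example are assumptions for illustration.

```typescript
const EMBEDDING_DIMENSIONS = 768;

interface Project {
  id: string;
  name: string;
  rootPath: string;
}

interface SourceFile {
  id: string;
  projectId: string; // assumed foreign key to Project
  path: string;
  language: string;
  hash: string; // content hash, used to skip unchanged files
}

interface Chunk {
  id: string;
  fileId: string; // assumed foreign key to SourceFile
  text: string;
  startLine: number;
  endLine: number;
  astNodeType: string; // hypothetical example of "AST metadata"
}

interface Embedding {
  id: string;
  chunkId: string; // assumed foreign key to Chunk
  model: string;
  vector: number[];
}

// Rejects vectors whose length differs from the documented 768 dimensions.
function hasValidDimensions(e: Embedding): boolean {
  return e.vector.length === EMBEDDING_DIMENSIONS;
}

const sampleEmbedding: Embedding = {
  id: "emb-1",
  chunkId: "chunk-1",
  model: "example-model", // placeholder, not the project's actual model name
  vector: new Array<number>(EMBEDDING_DIMENSIONS).fill(0),
};
console.log(hasValidDimensions(sampleEmbedding));
```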
Development
Scripts
pnpm dev # Start with auto-reload
pnpm start # Start server
pnpm build # Compile TypeScript
pnpm db:setup # Setup database + pgvector
pnpm db:migrate # Run migrations
pnpm db:generate # Generate Prisma client
pnpm db:studio # Open Prisma Studio
Environment Variables
# Required
GEMINI_API_KEY=your_gemini_api_key
DATABASE_URL=postgresql://user:pass@host:port/db?schema=cbmcp
# Optional
NODE_ENV=development
Troubleshooting
MCP Connection Issues
Problem: Server won't connect in Cursor
Solutions:
- Test manually: npx tsx src/index.ts (should output the startup message)
- Check that the absolute path in the config matches your directory
- Verify environment variables in MCP config
- Restart Cursor completely (Cmd+Q, then reopen)
- Check MCP output panel for error logs
Database Issues
Problem: type "vector" does not exist
Solution:
pnpm db:setup # This installs pgvector in cbmcp schema
Problem: Connection refused
Solution:
docker ps | grep codebrain # Verify container running
docker start codebrain # Start if stopped
Embedding Issues
Problem: GEMINI_API_KEY is required
Solution: Add API key to .env and MCP config
Performance Issues
Problem: Indexing is slow
Solutions:
- Embeddings are cached - subsequent runs are faster
- Adjust the batch size in indexing.ts if needed
- Consider excluding large directories (node_modules, etc.)
Performance
- Indexing: ~2-5 seconds per file (first time, includes embedding generation)
- Re-indexing: ~100ms per file (if unchanged, uses hash comparison)
- Search: ~500ms per query (includes embedding + vector search)
- Storage: ~10KB per code chunk (text + embedding + metadata)
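The gap between first-time indexing and re-indexing comes from skipping files whose content hash has not changed, which avoids a new embedding request. A minimal sketch of that check follows; the function names are hypothetical, and SHA-256 is an assumption, since this README does not say which hash the project uses.

```typescript
import { createHash } from "node:crypto";

// Assumption: SHA-256 over the file's UTF-8 content.
function contentHash(source: string): string {
  return createHash("sha256").update(source, "utf8").digest("hex");
}

// A file is re-embedded when it is new, when its content changed,
// or when the caller passes force (as in index_codebase's `force` flag).
function needsReindex(
  source: string,
  storedHash: string | undefined,
  force = false,
): boolean {
  if (force || storedHash === undefined) {
    return true;
  }
  return storedHash !== contentHash(source);
}

const original = "export const answer = 42;\n";
const stored = contentHash(original);

console.log(needsReindex(original, stored)); // unchanged file
console.log(needsReindex(original + "// edit\n", stored)); // modified file
console.log(needsReindex(original, stored, true)); // forced re-index
```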
Security
- API keys stored in environment variables (not in code)
- Database credentials configurable
- The MCP server runs locally; the only external API it calls is Gemini, which receives the text to be embedded
- Stored vector embeddings stay in your local PostgreSQL database
License
MIT
Contributing
This is a personal project, but feel free to fork and adapt for your needs!
Status
- ✅ Database setup and migrations
- ✅ AST-based code splitting
- ✅ Gemini embedding integration
- ✅ Vector similarity search
- ✅ MCP server implementation
- ✅ Comprehensive test suite (28 tests)
- ✅ Multi-project support
- ✅ Cursor/Claude integration ready
Ready for production use!