
# Embeddings Searcher for Claude Code Documentation

A focused embeddings-based search system for navigating markdown documentation in code repositories.

## Features

- **Semantic Search**: Uses sentence transformers to find relevant documentation based on meaning, not just keywords
- **Markdown-Focused**: Optimized for markdown documentation with intelligent chunking
- **Repository-Aware**: Organizes and searches across multiple repositories
- **MCP Integration**: Provides an MCP server for integration with Cursor/Claude
- **UV Package Management**: Uses UV for fast dependency management
## Quick Start for Claude Code

### 1. Clone and set up

```bash
git clone <this-repo>
cd kb
uv sync
```

### 2. Add your documentation

Place your documentation repositories in the `repos/` directory.

### 3. Index your documentation

```bash
uv run python embeddings_searcher.py --index
```

### 4. (Optional) Convert to ONNX for faster inference

```bash
uv run python onnx_convert.py --convert --test
```

### 5. Add the MCP server to Claude Code

```bash
claude mcp add documentation-searcher -- uv run --directory /absolute/path/to/kb python mcp_server.py
```

Replace `/absolute/path/to/kb` with the actual path to your project directory.
### 6. Use in Claude Code

Ask Claude Code questions like:

- "Search for authentication patterns"
- "Find API documentation"
- "Look up configuration options"

The MCP server will automatically search through your indexed documentation and return relevant results.
## Quick Start

### 1. Index Documentation

First, index all markdown documentation in your repositories:

```bash
uv run python embeddings_searcher.py --index
```

This will:

- Find all `.md`, `.markdown`, and `.txt` files in the `repos/` directory
- Chunk them intelligently based on markdown structure
- Generate embeddings using sentence transformers
- Store everything in a SQLite database (see the storage sketch below)
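As a rough illustration of that last step, chunk text and embeddings can be persisted in a single SQLite table, with vectors stored as raw bytes. This is a minimal sketch under assumed names; the actual schema of `embeddings_docs.db` may differ:

```python
# Minimal sketch: persist chunk embeddings in SQLite as float32 bytes.
# The table layout here is an assumption, not the project's actual schema.
import sqlite3
import numpy as np

conn = sqlite3.connect("embeddings_docs.db")
conn.execute("""CREATE TABLE IF NOT EXISTS chunks (
    id INTEGER PRIMARY KEY, repo TEXT, path TEXT,
    content TEXT, embedding BLOB)""")

def store_chunk(repo: str, path: str, content: str, vec: np.ndarray) -> None:
    conn.execute(
        "INSERT INTO chunks (repo, path, content, embedding) VALUES (?, ?, ?, ?)",
        (repo, path, content, vec.astype(np.float32).tobytes()))
    conn.commit()
```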
### 2. Search Documentation

```bash
# Basic search
uv run python embeddings_searcher.py --query "API documentation"

# Search within a specific repository
uv run python embeddings_searcher.py --query "authentication" --repo "my-project.git"

# Limit results and set a similarity threshold
uv run python embeddings_searcher.py --query "configuration" --max-results 5 --min-similarity 0.2
```
### 3. Get Statistics

```bash
# Show indexing statistics
uv run python embeddings_searcher.py --stats

# List indexed repositories
uv run python embeddings_searcher.py --list-repos
```
## MCP Server Integration

The project includes an MCP server for integration with Cursor/Claude:

```bash
# Start the MCP server
uv run python mcp_server.py
```

### MCP Tools Available

- **search_docs**: Search through documentation using semantic similarity
- **list_repos**: List all indexed repositories
- **get_stats**: Get indexing statistics
- **get_document**: Retrieve full document content by path
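For orientation, tools like these are typically registered as decorated functions with the official MCP Python SDK. The sketch below assumes a hypothetical `EmbeddingsSearcher` class and `search()` method; the real `mcp_server.py` may be structured differently:

```python
# Minimal sketch of exposing a search_docs tool via the MCP Python SDK's
# FastMCP helper. EmbeddingsSearcher and its search() signature are assumed.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("documentation-searcher")

@mcp.tool()
def search_docs(query: str, repo: str | None = None, max_results: int = 10) -> str:
    """Search indexed documentation by semantic similarity."""
    from embeddings_searcher import EmbeddingsSearcher  # assumed import
    searcher = EmbeddingsSearcher(db_path="embeddings_docs.db")  # assumed signature
    results = searcher.search(query, repo=repo, max_results=max_results)  # assumed
    return "\n\n".join(f"{r.path} ({r.score:.2f}): {r.text}" for r in results)

if __name__ == "__main__":
    mcp.run()  # serves over stdio, which is what `claude mcp add` expects
```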
## Project Structure

```
kb/
├── embeddings_searcher.py   # Main searcher implementation
├── mcp_server.py            # MCP server for Claude Code integration
├── onnx_convert.py          # ONNX model conversion utility
├── pyproject.toml           # UV project configuration
├── embeddings_docs.db       # SQLite database with embeddings
├── sentence_model.onnx      # ONNX model (generated)
├── model_config.json        # Model configuration (generated)
├── tokenizer/               # Tokenizer files (generated)
├── repos/                   # Your documentation repositories
│   ├── project1.git/
│   ├── project2.git/
│   └── documentation.git/
└── README.md                # This file
```
## How It Works

### Intelligent Chunking

The system chunks markdown documents based on (see the sketch after this list):

- Header structure (H1, H2, H3, etc.)
- Content length (500 words per chunk)
- Semantic boundaries
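A minimal sketch of header-aware chunking with the 500-word cap described above (the project's actual implementation may differ):

```python
# Split markdown at headers, capping each chunk's word count.
import re

def chunk_markdown(text: str, max_words: int = 500) -> list[str]:
    chunks: list[str] = []
    current: list[str] = []

    def flush() -> None:
        if current:
            chunks.append("\n".join(current))
            current.clear()

    for line in text.splitlines():
        is_header = re.match(r"#{1,6}\s", line) is not None  # H1-H6 opens a new chunk
        words_so_far = sum(len(l.split()) for l in current)
        if is_header or words_so_far + len(line.split()) > max_words:
            flush()
        current.append(line)
    flush()
    return chunks
```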
### Embedding Generation

- Uses the `all-MiniLM-L6-v2` sentence transformer model by default
- Supports ONNX models for faster inference
- Caches embeddings for efficient updates
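Generating embeddings with the default model follows standard sentence-transformers usage (not necessarily the project's exact code):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # default model per this README
chunks = ["## Auth\nSend the API token in a header...", "## Config\nSettings live in..."]
embeddings = model.encode(chunks, normalize_embeddings=True)  # shape (2, 384)
```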
### Search Algorithm

- Generates an embedding for your query
- Compares it against all document chunks using cosine similarity
- Returns ranked results with context and metadata
- Supports repository-specific searches
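The core ranking step reduces to a cosine-similarity comparison, sketched here with illustrative names (assuming unit-normalized vectors):

```python
import numpy as np

def rank_chunks(query_vec: np.ndarray, chunk_vecs: np.ndarray,
                max_results: int = 10, min_similarity: float = 0.1) -> list[tuple[int, float]]:
    """Return (chunk_index, similarity) pairs, best match first."""
    sims = chunk_vecs @ query_vec  # dot product equals cosine similarity for unit vectors
    order = np.argsort(-sims)[:max_results]
    return [(int(i), float(sims[i])) for i in order if sims[i] >= min_similarity]
```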
## CLI Options

### embeddings_searcher.py

```
# Indexing
--index                  # Index all repositories
--force                  # Force reindex of all documents

# Search
--query "search terms"   # Search query
--repo "repo-name"       # Search within a specific repository
--max-results 10         # Maximum results to return
--min-similarity 0.1     # Minimum similarity threshold

# Information
--stats                  # Show indexing statistics
--list-repos             # List indexed repositories

# Configuration
--kb-path /path/to/kb    # Path to knowledge base
--db-path embeddings.db  # Path to embeddings database
--model model-name       # Sentence transformer model
--ignore-dirs [DIRS...]  # Directories to ignore during indexing
```

### mcp_server.py

```
--kb-path /path/to/kb         # Path to knowledge base
--docs-db-path embeddings.db  # Path to docs embeddings database
--model model-name            # Sentence transformer model
```
## ONNX Model Conversion

For faster inference, you can convert the sentence transformer model to ONNX format:

```bash
# Convert the model to ONNX
uv run python onnx_convert.py --convert

# Test the ONNX model
uv run python onnx_convert.py --test

# Convert and test in one command
uv run python onnx_convert.py --convert --test
```
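Once converted, the model can be served with onnxruntime. This sketch assumes the generated `sentence_model.onnx` and `tokenizer/` files and a MiniLM-style export whose first output is the token embeddings; the exact input and output names depend on how `onnx_convert.py` exports the model:

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tokenizer")  # generated directory
session = ort.InferenceSession("sentence_model.onnx")

enc = tokenizer(["API endpoints"], padding=True, return_tensors="np")
input_names = {i.name for i in session.get_inputs()}
outputs = session.run(None, {k: v for k, v in enc.items() if k in input_names})

# Mean-pool token embeddings into one sentence vector (typical for MiniLM).
mask = enc["attention_mask"][..., None]
embedding = (outputs[0] * mask).sum(axis=1) / mask.sum(axis=1)
```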
## Example Usage

```bash
# Index documentation
uv run python embeddings_searcher.py --index

# Search for API documentation
uv run python embeddings_searcher.py --query "API endpoints"

# Search for authentication in a specific repository
uv run python embeddings_searcher.py --query "user authentication" --repo "my-project.git"

# Get detailed statistics
uv run python embeddings_searcher.py --stats
```
## Performance

- **Indexing**: ~1400 documents in ~1 minute
- **Search**: Sub-second response times
- **Storage**: ~50MB for an embeddings database with 6500+ chunks
- **Memory**: ~500MB during indexing, ~200MB during search
## Troubleshooting

### Unicode Errors

Some files may have encoding issues. The system automatically falls back to latin-1 encoding for problematic files, as sketched below.
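A minimal sketch of that fallback (the project's actual error handling may differ):

```python
from pathlib import Path

def read_text_with_fallback(path: Path) -> str:
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        return path.read_text(encoding="latin-1")  # latin-1 can decode any byte
```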
### Large Files

Files larger than 1MB are automatically skipped to prevent memory issues.

### Model Loading

If sentence-transformers is not available, the system will attempt to use ONNX models or fall back to dummy embeddings for testing.