local-docs-mcp
Enables AI assistants to perform semantic, hybrid, and filtered search on indexed local documentation with RAG capabilities.
Local Docs MCP
A modular semantic search system with MCP (Model Context Protocol) integration for searching local documentation. The Retrieval-Augmented Generation (RAG) system manages document chunks for knowledge retrieval and exposes semantic search to AI assistants through MCP.
Key Features
| Core Capability | Technical Implementation |
|---|---|
| Document Indexing | A full indexing pipeline that processes documents from the docs/ directory, chunks them, and creates embeddings using Ollama. CocoIndex makes indexing incremental: when files are edited or added, the system detects the changes and reprocesses only the affected content. |
| Vector Database | Uses Qdrant to store document embeddings for semantic search. |
| Retrieval | The search service offers three search strategies: semantic, hybrid, and metadata-filtered. |
| MCP Integration | The MCP server exposes these retrieval capabilities to AI assistants. |
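The selective-update idea in the Document Indexing row can be illustrated with a small sketch. This is not CocoIndex's actual mechanism, which the README does not describe; it only shows one common way to detect changes, by comparing content fingerprints between runs. The function names and the `previous` fingerprint store are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_updates(previous: dict[str, str], current_docs: dict[str, str]) -> dict[str, list[str]]:
    """Decide which documents need work by comparing fingerprints.

    previous:     path -> hash recorded at the last indexing run (hypothetical store)
    current_docs: path -> current file text
    """
    current = {path: content_hash(text) for path, text in current_docs.items()}
    return {
        # New or edited files: the hash is missing or differs from the stored one.
        "reindex": sorted(p for p, h in current.items() if previous.get(p) != h),
        # Files that disappeared since the last run.
        "delete": sorted(p for p in previous if p not in current),
        # Files whose content is identical and can be skipped.
        "unchanged": sorted(p for p, h in current.items() if previous.get(p) == h),
    }


# Example: one file edited, one removed, one added, one untouched.
old_docs = {"docs/a.md": "alpha", "docs/b.md": "beta", "docs/c.md": "gamma"}
previous = {path: content_hash(text) for path, text in old_docs.items()}
new_docs = {"docs/a.md": "alpha", "docs/b.md": "beta, edited", "docs/d.md": "delta"}
plan = plan_updates(previous, new_docs)
print(plan)
```

Only the paths listed under "reindex" would need to be re-chunked and re-embedded, which is what keeps repeated indexing runs cheap.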
Quick Start
Installation
- Clone and set up the project:
git clone git@github.com:nguyenchiencong/local-docs-mcp.git
cd local-docs-mcp
- Start required services:
# Start Qdrant
docker run -d -p 6334:6334 -p 6333:6333 qdrant/qdrant
# Make sure Ollama is running with the embedding model
ollama pull hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:F16
# Set up Postgres for CocoIndex
docker compose -f <(curl -L https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/postgres.yaml) up -d
- Configure environment:
# Copy the example file, then edit .env with your specific configuration
cp .env.example .env
# Don't forget to set up your .cocoignore file
cp .cocoignore.example .cocoignore
- Install dependencies:
uv sync
Usage
Before indexing, add your documents to the docs folder.
To index your documents:
uv run python -m src.indexing.main_flow
# To run the force reindex utility
uv run python -m src.indexing.force_reindex
To start the MCP server:
uv run python -m src.mcp_server.server
Make the CLI available on your PATH
If you want to run local-docs-mcp from any directory:
- Windows (PowerShell):
setx PATH "path\to\local-docs-mcp\.venv\Scripts;$($env:Path)"
- Linux/macOS (bash/zsh):
# bash shown; for zsh, append to ~/.zshrc instead
echo 'export PATH="/path/to/local-docs-mcp/.venv/bin:$PATH"' >> ~/.bashrc
To run MCP tools directly from the CLI (one-off calls):
# Start the server (default behavior)
local-docs-mcp
# Run a semantic search once and exit
local-docs-mcp semantic_search --query "vector search overview" --limit 5
# Run hybrid search
local-docs-mcp hybrid_search --query "async await" --semantic-weight 0.7 --limit 5
# Run a filtered search with metadata (JSON object)
local-docs-mcp search_with_metadata_filter --query "UI tutorial" --metadata-filter '{"filename": "ui.md"}' --limit 5
# Retrieve a specific document by ID
local-docs-mcp document_retrieval --document-id "doc-123"
# Fetch collection info
local-docs-mcp get_collection_info --json
Configuration
System Settings
All configuration is managed in pyproject.toml under the [tool.local-docs] section:
[tool.local-docs]
# Qdrant configuration
qdrant_url = "http://localhost:6334"
qdrant_collection = "local-docs-collection"
# Ollama configuration
ollama_url = "http://localhost:11434"
ollama_model = "hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:F16"
embedding_dimension = 1024
# Document configuration
docs_directory = "docs"
supported_extensions = [".md", ".rst", ".txt"]
# Chunking configuration
chunk_size = 1200
chunk_overlap = 200
# Search configuration
search_limit = 10
similarity_threshold = 0.15
search_hnsw_ef = 256
hybrid_semantic_weight = 0.85
mmr_lambda = 0.75
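To make chunk_size and chunk_overlap concrete, here is a minimal sketch of fixed-size chunking with overlap. It assumes both settings are measured in characters and that splitting is purely positional; the project's real splitter may be structure-aware and may count differently, so treat `chunk_text` as a hypothetical illustration only.

```python
def chunk_text(text: str, chunk_size: int = 1200, chunk_overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Example: 3000 characters with the defaults from the configuration above.
sample = "".join(chr(ord("a") + i % 26) for i in range(3000))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

With these values each window advances 1000 characters, so neighbouring chunks share their last and first 200 characters. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.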
Environment Variables: Override any setting with LOCAL_DOCS_* environment variables:
export LOCAL_DOCS_SEARCH_LIMIT=20
export LOCAL_DOCS_OLLAMA_MODEL="different-model"
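One plausible way such overrides could be resolved is sketched below. It assumes each variable name is the setting name upper-cased behind the LOCAL_DOCS_ prefix (consistent with the two examples above) and that values are cast to the type of the default. `resolve_settings` and `DEFAULTS` are hypothetical names, not the project's loader.

```python
import os

# A few defaults copied from the [tool.local-docs] section shown earlier.
DEFAULTS = {
    "search_limit": 10,
    "similarity_threshold": 0.15,
    "ollama_model": "hf.co/Qwen/Qwen3-Embedding-0.6B-GGUF:F16",
}


def resolve_settings(defaults: dict, environ=None, prefix: str = "LOCAL_DOCS_") -> dict:
    """Return defaults with any PREFIX + UPPERCASE_KEY environment values applied."""
    environ = os.environ if environ is None else environ
    resolved = dict(defaults)
    for key, default in defaults.items():
        raw = environ.get(prefix + key.upper())
        if raw is None:
            continue  # no override for this setting
        # Cast the string to the default's type (int, float, or str here).
        resolved[key] = type(default)(raw)
    return resolved


# Example: simulate the two exports shown above without touching the real environment.
fake_env = {"LOCAL_DOCS_SEARCH_LIMIT": "20", "LOCAL_DOCS_OLLAMA_MODEL": "different-model"}
settings = resolve_settings(DEFAULTS, fake_env)
print(settings)
```

Casting by the default's type is a simplification: it would not handle booleans or lists such as supported_extensions, which need explicit parsing.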
MCP Client Setup
Add this to your MCP client configuration (e.g., Claude Code):
{
  "mcpServers": {
    "local-docs-mcp": {
      "command": "uv",
      "args": ["run", "--project", "/path/to/local-docs-mcp", "-m", "src.mcp_server.server"]
    }
  }
}
MCP Tools
The MCP server exposes the following semantic search tools to AI assistants:
| Tool | Purpose | Parameters | Example Prompt |
|---|---|---|---|
| semantic_search | Perform semantic search on indexed documents. Finds content based on meaning and context rather than exact keywords. | query (required string), limit (optional number, default: 10), min_similarity_score (optional number, default: 0.0) | "Find information about error handling patterns in the codebase" |
| hybrid_search | Combine semantic search with keyword matching. Useful when exact terminology matters alongside conceptual meaning. | query (required string), semantic_weight (optional number, default: 0.7), limit (optional number, default: 10), min_similarity_score (optional number, default: 0.0) | "Search for 'async await' patterns and asynchronous programming concepts" |
| document_retrieval | Retrieve complete document by ID. Use this when you need the full context of a specific document found in search results. | document_id (required string) | "Get the full document for ID 'doc_12345'" |
| search_with_metadata_filter | Search with metadata constraints. Use this to narrow down search results by specific document properties. | query (required string), metadata_filter (optional object), limit (optional number, default: 10), min_similarity_score (optional number, default: 0.0) | "Search for API documentation in files with filename containing 'api'" |
| get_collection_info | Get information about the indexed document collection, including statistics and status. | none | "Show me collection statistics and indexing status" |
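How semantic_weight, limit, and min_similarity_score might interact in hybrid_search can be sketched as a weighted blend of two scores. This is an assumption about the scoring, not the server's confirmed implementation: the README does not say how keyword scores are computed or whether the threshold applies to the blended score. `hybrid_rank` and the score dictionaries are hypothetical, and scores are assumed to be normalized to the range 0 to 1.

```python
def hybrid_rank(
    semantic_scores: dict[str, float],
    keyword_scores: dict[str, float],
    semantic_weight: float = 0.7,
    limit: int = 10,
    min_similarity_score: float = 0.0,
) -> list[tuple[str, float]]:
    """Blend two score maps and return the top documents as (doc_id, score) pairs."""
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must be between 0 and 1")

    doc_ids = set(semantic_scores) | set(keyword_scores)
    combined = {
        # A document missing from one map contributes 0 for that component.
        doc_id: semantic_weight * semantic_scores.get(doc_id, 0.0)
        + (1.0 - semantic_weight) * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Assumption: the threshold is applied to the blended score.
    kept = [(d, s) for d, s in combined.items() if s >= min_similarity_score]
    # Highest score first; ties broken by document ID for a stable order.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:limit]


# Example with made-up scores for four documents.
semantic = {"a": 0.9, "b": 0.4, "c": 0.2}
keyword = {"b": 1.0, "c": 0.0, "d": 0.5}
ranked = hybrid_rank(semantic, keyword, semantic_weight=0.7)
print([doc_id for doc_id, _ in ranked])
```

Raising semantic_weight toward 1.0 makes the ranking follow embedding similarity alone, while lowering it favours documents that contain the exact query terms.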
Use Cases
Documentation Research:
- "What are signals and how do they work in Godot?"
- "Find tutorials about character controllers"
- "Explain the difference between KinematicBody and RigidBody"
Problem Solving:
- "How do I fix 'node not found' errors?"
- "What are the best practices for performance optimization?"
- "Search for debugging techniques in Godot"
Learning Paths:
- "I'm a beginner, show me getting started content"
- "What should I learn after basic GDScript?"
- "Find intermediate tutorials about physics"
Specific Searches:
- "Show me the top 5 most relevant results about animations"
- "Find only tutorial files about UI design"
- "Look for performance optimization guides"
Development
Running Tests
uv run pytest tests/
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Related Projects
- CocoIndex - Document indexing and processing
- Qdrant - Vector database for similarity search
- Ollama - Local AI model serving
- Model Context Protocol - Standard for AI tool integration