<div align="center"> <img src="docs/banner.png" alt="MCP-Markdown-RAG" width="800" style="border-radius:10px;"/> <h1>MCP-Markdown-RAG</h1> <p> <img alt="GitHub forks" src="https://img.shields.io/github/forks/Zackriya-Solutions/MCP-Markdown-RAG"/> <img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/Zackriya-Solutions/MCP-Markdown-RAG"> <img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/Zackriya-Solutions/MCP-Markdown-RAG"> </p> <p> <a href="LICENSE"> <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License" /> </a> <img src="https://img.shields.io/badge/MCP-Server-blue"/> </p> </div>
A Model Context Protocol (MCP) server that provides a local-first RAG engine for your markdown documents. This server uses a file-based Milvus vector database to index your notes, enabling Large Language Models (LLMs) to perform semantic search and retrieve relevant content from your local files.
> [!NOTE]
> This project is in active development. The API and implementation are subject to change. We are exploring future enhancements, including a potential port to an Obsidian plugin for seamless vault integration.
Key Features
- Local-First & Private: All your data is processed and stored locally. Nothing is sent to a third-party service for indexing.
- Semantic Search for Markdown: Go beyond simple keyword search. Find document sections based on conceptual meaning.
- MCP Compatible: Integrates with any MCP-supported host application like Claude Desktop, Windsurf, or Cursor.
- Simple Tooling: Provides two straightforward tools (`index_documents` and `search`) for managing and querying your knowledge base.
How It Works
The server operates in two main phases, exposing its functionality through MCP tools.
- Indexing:
  - The `index_documents` tool is called with a path to your markdown files.
  - The server reads the documents, splits them into logical chunks (e.g., by headings), and converts each chunk into a vector embedding.
  - These embeddings, along with their metadata (original text, file path), are stored in a local Milvus vector database.
  - You can run it in two modes:
    - Full Reindex (`force_reindex=True`): Clears and rebuilds the entire index from scratch.
    - Incremental Update (`force_reindex=False`, default): Automatically detects and re-indexes only changed files by comparing them against a tracking log. Deleted or modified chunks are pruned and replaced to keep the index up to date.
  - Recursive Indexing (`recursive=True`): Also indexes all subdirectories. With the default, `recursive=False`, subdirectories are skipped.
- Searching:
  - When you ask a question in a host application, it uses the `search` tool.
  - The server converts your query into a vector embedding.
  - It then performs a similarity search against the Milvus database to find the most semantically relevant document chunks.
  - The results are returned to the LLM, providing it with the context needed to answer your question accurately.
<div align="center" > <img src="docs/mcp_search.png" alt="MCP Search" width="800" style="border-radius:10px;"/> </div>
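The two phases above can be pictured with a small, self-contained sketch. It is not the server's implementation: `chunk_by_headings`, `toy_embed`, and this `search` function are hypothetical names, a bag-of-words count stands in for the real embedding model (so it matches shared words, not meaning), and a plain Python list stands in for the Milvus database.

```python
import math
import re
from collections import Counter


def chunk_by_headings(markdown_text):
    """Split markdown into chunks, starting a new chunk at every heading line."""
    chunks, current = [], []
    for line in markdown_text.splitlines():
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, limit=5):
    """Return the `limit` stored chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item["vector"]), reverse=True)
    return ranked[:limit]


notes = """# Gardening
Tomatoes need full sun and regular watering.

## Composting
Kitchen scraps break down into rich soil.

# Cooking
Roast tomatoes with olive oil and garlic.
"""

# Indexing phase: chunk, embed, and store each chunk with its metadata.
index = [
    {"text": chunk, "path": "notes.md", "vector": toy_embed(chunk)}
    for chunk in chunk_by_headings(notes)
]

# Search phase: embed the query and rank the stored chunks by similarity.
for hit in search(index, "how do I make compost from kitchen scraps", limit=1):
    print(hit["path"], "->", hit["text"].splitlines()[0])
```

Storing the original text and file path next to each vector is what lets the search step hand readable context back to the LLM rather than bare vectors.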
Available Tools
- `index_documents`
  - Description: Indexes Markdown documents for semantic search. Converts each file into structured vector chunks and inserts them into the Milvus database.
  - Incremental Indexing: Automatically reindexes only changed files unless `force_reindex=True` is passed.
  - Arguments:
    - `directory` (string, optional): The path to the folder containing .md files. Defaults to the current directory.
    - `force_reindex` (boolean, optional): If True, clears and rebuilds the full index. Defaults to False.
    - `recursive` (boolean, optional): If True, recursively indexes all subdirectories. Defaults to False.
- `search`
  - Description: Searches the indexed documents using semantic similarity.
  - Arguments:
    - `query` (string, required): Your natural language query.
    - `limit` (integer, optional): Maximum number of chunks to return (the default is usually 5 to 10).
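To make the interaction between `force_reindex`, `recursive`, and incremental indexing more concrete, here is one way the change detection could work. This is a sketch under assumptions: the README only says files are compared "against a tracking log", so the SHA-256 fingerprints, the JSON log format, the `tracking_log.json` file name, and the `plan_update` helper are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, used here as the change fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_update(directory, log_path, recursive=False, force_reindex=False):
    """Compare .md files with the tracking log.

    Returns (to_index, to_prune, new_log): files whose chunks must be
    (re)embedded, files whose old chunks must be removed, and the log
    contents to save once indexing has finished.
    """
    directory = Path(directory)
    pattern = "**/*.md" if recursive else "*.md"
    current = {str(p): file_digest(p) for p in sorted(directory.glob(pattern))}
    previous = {}
    # A forced reindex ignores the log, so every file counts as new.
    # (The caller would also clear the whole collection in that case.)
    if not force_reindex and Path(log_path).exists():
        previous = json.loads(Path(log_path).read_text())
    to_index = [p for p, digest in current.items() if previous.get(p) != digest]
    to_prune = [p for p in previous if current.get(p) != previous[p]]
    return to_index, to_prune, current


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    log = root / "tracking_log.json"  # hypothetical log file name
    (root / "a.md").write_text("# A\nfirst note\n")
    (root / "b.md").write_text("# B\nsecond note\n")

    # First run: no log exists yet, so both files are new.
    to_index, to_prune, state = plan_update(root, log)
    log.write_text(json.dumps(state))
    print(len(to_index), len(to_prune))  # 2 0

    # Edit one file and delete the other, then plan an incremental update.
    (root / "a.md").write_text("# A\nedited note\n")
    (root / "b.md").unlink()
    to_index, to_prune, state = plan_update(root, log)
    print(len(to_index), len(to_prune))  # 1 2
```

Content hashes are used here instead of modification times because they do not change when a file is merely touched or copied; the actual server may track changes differently.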
Installation & Setup
This server requires uv, which is used to run the Python server.
Step 1: Get the Server Code
Clone this repository to your local machine:
git clone https://github.com/Zackriya-Solutions/MCP-Markdown-RAG.git
Step 2: Configure Your Host App
Configure your MCP host application (e.g., Windsurf, Claude.app) to use the server. Add the following to your settings file:
{
  "mcpServers": {
    "markdown_rag": {
      "command": "uv",
      "args": [
        "--directory",
        "/ABSOLUTE/PATH/TO/MCP-Markdown-RAG",
        "run",
        "server.py"
      ]
    }
  }
}
Note: Replace `/ABSOLUTE/PATH/TO/MCP-Markdown-RAG` with the absolute path to where you cloned this repository.
Note: The first run and the first indexing will both take a while, because the server needs to download the embedding model (~50 MB).
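If you would rather not type the absolute path by hand, a short script can generate the configuration entry for you. This is only a convenience sketch: `host_config` is a hypothetical helper, and `./MCP-Markdown-RAG` assumes you run the script from the directory that contains your clone.

```python
import json
from pathlib import Path


def host_config(repo_dir):
    """Build the mcpServers entry with repo_dir resolved to an absolute path."""
    absolute = str(Path(repo_dir).expanduser().resolve())
    return {
        "mcpServers": {
            "markdown_rag": {
                "command": "uv",
                "args": ["--directory", absolute, "run", "server.py"],
            }
        }
    }


# Print JSON that can be pasted into the host application's settings file.
print(json.dumps(host_config("./MCP-Markdown-RAG"), indent=2))
```

If the settings file already contains an `mcpServers` object, merge the `markdown_rag` entry into it rather than replacing the whole object.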
What's Next? (Roadmap)
We are actively working on improving the server. Future plans include:
- Performance Optimization: Improve indexing by encoding inputs in batches, which should better manage CPU usage.
- Flexible Embedding Models: Add support for other embedding models, such as the `BGEM3-large` model for potentially higher accuracy.
- Obsidian Plugin: Explore creating a dedicated Obsidian plugin for a fully integrated experience.
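As a rough illustration of the batching idea in the first roadmap item, the sketch below groups chunks into fixed-size batches before encoding. Everything here is hypothetical: `batched`, `encode_in_batches`, and the stand-in `fake_encode_batch` are not part of the server, and a real embedding model would take the place of the fake encoder.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def encode_in_batches(chunks, encode_batch, batch_size=32):
    """Encode chunks batch by batch instead of one call per chunk."""
    vectors = []
    for batch in batched(chunks, batch_size):
        vectors.extend(encode_batch(batch))
    return vectors


def fake_encode_batch(texts):
    """Stand-in encoder: maps each text to a one-dimensional 'vector'."""
    return [[float(len(text))] for text in texts]


chunks = ["# Intro", "## Setup steps", "## Usage", "## FAQ", "## License"]
vectors = encode_in_batches(chunks, fake_encode_batch, batch_size=2)
print(len(vectors))  # one vector per chunk
```

The batch size caps how much text is handed to the encoder at once, which is the knob that would let indexing trade throughput against peak CPU and memory use.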
Debugging
You can use the MCP inspector to debug the server directly. Run the following command from the repository's root directory:
npx @modelcontextprotocol/inspector uv --directory /ABSOLUTE/PATH/TO/MCP-Markdown-RAG run server.py
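When a tool call misbehaves, it helps to know the shape of the request a host sends. In MCP, tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such requests for the two tools; the argument names come from the tool descriptions in this README, while the `tool_call` helper, the notes path, and the query text are placeholders.

```python
import json


def tool_call(request_id, name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Placeholder path: replace with the folder that holds your .md files.
index_request = tool_call(
    1,
    "index_documents",
    {"directory": "/ABSOLUTE/PATH/TO/notes", "force_reindex": False, "recursive": True},
)

# Placeholder query; `limit` caps the number of returned chunks.
search_request = tool_call(
    2,
    "search",
    {"query": "how do I configure the server?", "limit": 5},
)

print(json.dumps(index_request, indent=2))
print(json.dumps(search_request, indent=2))
```

Comparing these shapes with what the inspector shows can help separate argument mistakes from problems inside the server itself.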
Contributing
Contributions are welcome! Please feel free to open an issue or submit a pull request.
Acknowledgments
- The Model Context Protocol for the open standard that makes this possible.
- The Milvus Project for the powerful open-source vector database.