# EyeLevel RAG MCP Server
A local Retrieval-Augmented Generation (RAG) system implemented as an MCP (Model Context Protocol) server. This server allows you to ingest markdown files into a local knowledge base and perform semantic search to retrieve relevant context for LLM queries.
## Features
- Local RAG Implementation: No external dependencies or paid services required
- Markdown File Support: Ingest and search through `.md` files
- Semantic Search: Uses sentence transformers for embedding-based similarity search
- Persistent Storage: Automatically saves and loads the vector index using FAISS
- Chunk Management: Intelligently splits documents into searchable chunks
- Multiple Documents: Support for ingesting and searching across multiple markdown files
## Installation
- Clone this repository
- Install dependencies using uv:

  ```bash
  uv sync
  ```
### Dependencies
- `sentence-transformers`: For creating text embeddings
- `faiss-cpu`: For efficient vector similarity search
- `numpy`: For numerical operations
- `mcp[cli]`: For the MCP server framework
## Available Tools
### 1. `search_doc_for_rag_context(query: str)`
Searches the knowledge base for relevant context based on a user query.
Parameters:
- `query` (str): The search query
Returns:
- Relevant text chunks with relevance scores
### 2. `ingest_markdown_file(local_file_path: str)`
Ingests a markdown file into the knowledge base.
Parameters:
- `local_file_path` (str): Path to the markdown file to ingest
Returns:
- Status message indicating success or failure
### 3. `list_indexed_documents()`
Lists all documents currently in the knowledge base.
Returns:
- Summary of indexed files and chunk counts
### 4. `clear_knowledge_base()`
Clears all documents from the knowledge base.
Returns:
- Confirmation message
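For orientation, here is a minimal sketch of how such a tool might be registered, assuming the FastMCP helper that ships with `mcp[cli]`; the placeholder in-memory store and the return format are hypothetical, and the real `main.py` may be organized differently.

```python
# Minimal, hypothetical sketch of tool registration with the FastMCP
# helper from mcp[cli]; the real main.py may be organized differently.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("EyeLevel RAG")

# Placeholder in-memory store standing in for the FAISS-backed index.
_chunks: list[str] = [
    "Example chunk about installation.",
    "Example chunk about usage.",
]

@mcp.tool()
def search_doc_for_rag_context(query: str) -> str:
    """Search the knowledge base for context relevant to the query."""
    # The real tool embeds the query and searches FAISS; returning every
    # stored chunk keeps this sketch self-contained and runnable.
    return "\n\n".join(_chunks)

if __name__ == "__main__":
    mcp.run()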
## Usage
- Start the server:

  ```bash
  python main.py
  ```

- Ingest markdown files: Use the `ingest_markdown_file` tool to add your `.md` files to the knowledge base.
- Search for context: Use the `search_doc_for_rag_context` tool to find relevant information for your queries (see the scripted example below).
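To exercise the tools without an MCP-aware client, the MCP Python SDK's stdio client can drive the server directly. A hedged sketch; the `notes.md` path and the query string are hypothetical:

```python
# Hypothetical smoke test driving the server over stdio with the MCP
# Python SDK's client; an MCP-aware client such as Claude Desktop would
# normally handle this handshake for you.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    params = StdioServerParameters(command="python", args=["main.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # notes.md is a placeholder for any markdown file you have.
            await session.call_tool(
                "ingest_markdown_file", {"local_file_path": "notes.md"}
            )
            result = await session.call_tool(
                "search_doc_for_rag_context", {"query": "How do I install this?"}
            )
            print(result.content)

asyncio.run(main())
```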
## How It Works
- Document Processing: Markdown files are split into chunks based on paragraph and sentence boundaries
- Embedding Creation: Text chunks are converted to embeddings using the `all-MiniLM-L6-v2` model
- Vector Storage: Embeddings are stored in a FAISS index for fast similarity search
- Retrieval: User queries are embedded and matched against the stored vectors to find relevant content (see the sketch below)
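The whole loop fits in a short, self-contained sketch, assuming `sentence-transformers` embeddings stored in a flat L2 FAISS index; the paragraph-only chunking and the `example.md` file name are simplifications of what `main.py` actually does.

```python
# Hedged sketch of the embed-store-search loop; main.py's chunking and
# index handling may differ in the details.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings

# 1. Document processing: naive paragraph-based chunking.
text = open("example.md", encoding="utf-8").read()
chunks = [p.strip() for p in text.split("\n\n") if p.strip()]

# 2. Embedding creation: one vector per chunk.
embeddings = model.encode(chunks).astype("float32")

# 3. Vector storage: a flat L2 index over the chunk embeddings.
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# 4. Retrieval: embed the query and take the nearest chunks.
query_vec = model.encode(["How do I install this?"]).astype("float32")
distances, ids = index.search(query_vec, k=3)
for dist, i in zip(distances[0], ids[0]):
    print(f"[distance={dist:.3f}] {chunks[i][:80]}")
```

Lower L2 distance means a closer semantic match, so the first result printed is the most relevant chunk.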
## File Structure
- `main.py`: Main server implementation with RAG functionality
- `pyproject.toml`: Project dependencies and configuration
- `rag_index.faiss`: FAISS vector index (created automatically)
- `rag_documents.pkl`: Serialized documents and metadata (created automatically)
## Configuration
The RAG system uses the `all-MiniLM-L6-v2` sentence transformer model by default. This model provides a good balance between speed and quality for semantic search tasks.
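Swapping the model is likely a one-line change, assuming `main.py` constructs the model directly (the variable name below is hypothetical):

```python
from sentence_transformers import SentenceTransformer

# Default model: small and fast, 384-dimensional embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")

# A heavier alternative trades speed for retrieval quality. Note that
# switching models changes the embedding dimension, so the FAISS index
# must be rebuilt (clear and re-ingest) afterwards.
# model = SentenceTransformer("all-mpnet-base-v2")  # 768-dim
```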
## Example Workflow
- Prepare your markdown files with the content you want to search
- Use `ingest_markdown_file` to add each file to the knowledge base
- Use `search_doc_for_rag_context` to find relevant context for your questions
- The retrieved context can be used by an LLM to provide informed answers
## Notes
- The first time you run the server, it will download the sentence transformer model
- The vector index is automatically saved and loaded between sessions (see the persistence sketch below)
- Long documents are automatically chunked to optimize search performance
- The system supports multiple markdown files and maintains source file metadata
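A plausible shape for that persistence logic, using the file names listed under File Structure; the actual `main.py` may organize this differently:

```python
# Hedged sketch of saving/loading the knowledge base between sessions.
import os
import pickle

import faiss

INDEX_PATH = "rag_index.faiss"
DOCS_PATH = "rag_documents.pkl"

def save_knowledge_base(index, documents) -> None:
    """Persist the FAISS index and the chunk/metadata list side by side."""
    faiss.write_index(index, INDEX_PATH)
    with open(DOCS_PATH, "wb") as f:
        pickle.dump(documents, f)

def load_knowledge_base():
    """Reload a previously saved index, or signal a cold start."""
    if not (os.path.exists(INDEX_PATH) and os.path.exists(DOCS_PATH)):
        return None, []
    index = faiss.read_index(INDEX_PATH)
    with open(DOCS_PATH, "rb") as f:
        documents = pickle.load(f)
    return index, documents
```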