MCP RAG System

A comprehensive Retrieval-Augmented Generation (RAG) system built using the Model Context Protocol (MCP) for storing, processing, and searching PDF documents.

Features

🔧 Tools

  • upload_pdf: Upload and process PDF files with automatic text extraction and chunking
  • search_documents: Semantic search across all uploaded documents using vector embeddings
  • list_documents: View all uploaded documents and their metadata
  • delete_document: Remove documents and their associated chunks from the system
  • get_rag_stats: Get comprehensive statistics about the RAG system

📦 Resources

  • rag://documents: List all documents in the system
  • rag://document/{document_id}: Get full content of a specific document
  • rag://stats: Get system statistics

💬 Prompts

  • rag_query_prompt: Generate prompts for RAG-based question answering
  • document_summary_prompt: Create document summarization prompts
  • search_suggestions_prompt: Generate better search query suggestions
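As an illustration of how a prompt like rag_query_prompt assembles its inputs, here is a minimal sketch; the template wording is an assumption, not the server's exact text:

```python
def rag_query_prompt(query: str, context_chunks: str) -> str:
    """Build a RAG question-answering prompt from retrieved context.

    Illustrative template only; the server's actual wording may differ.
    """
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context_chunks}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = rag_query_prompt("What are the key findings?", "chunk 1 text...")
```

The same pattern applies to document_summary_prompt and search_suggestions_prompt: the arguments listed in the API Reference below are interpolated into a fixed instruction template.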

Installation

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Download required models: The system will automatically download the sentence-transformers model on first use.
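Based on the libraries referenced throughout this README, requirements.txt likely contains entries along these lines (exact package names and version pins are an assumption, so check the file in the repository):

```text
mcp
sentence-transformers
faiss-cpu
PyMuPDF
PyPDF2
numpy
```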

Usage

Starting the Server

python mcp_server.py

The server will start on http://localhost:8000 with SSE (Server-Sent Events) transport.

Using the Client

Demo Mode

python mcp_client.py
# Choose option 1 for demo mode

Interactive Mode

python mcp_client.py
# Choose option 2 for interactive mode

Available commands in interactive mode:

  • upload - Upload a PDF file
  • search - Search documents with a query
  • list - List all uploaded documents
  • stats - Show system statistics
  • quit - Exit the client

Example Workflow

  1. Upload a PDF:

    # Via tool call
    result = await session.call_tool("upload_pdf", arguments={
        "file_path": "/path/to/document.pdf",
        "document_name": "My Research Paper"
    })
    
  2. Search documents:

    # Via tool call
    result = await session.call_tool("search_documents", arguments={
        "query": "machine learning applications",
        "top_k": 5
    })
    
  3. Use RAG prompt:

    # Get search results first, then use in prompt
    prompt = await session.get_prompt("rag_query_prompt", arguments={
        "query": "What are the key findings?",
        "context_chunks": search_results_text
    })
    

System Architecture

Document Processing Pipeline

  1. PDF Upload → Text extraction using PyMuPDF/PyPDF2
  2. Text Chunking → Split into overlapping chunks (1000 chars, 200 overlap)
  3. Embedding Generation → Create vector embeddings using SentenceTransformers
  4. Storage → Store in FAISS index with metadata
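Step 2 of the pipeline above can be sketched as a sliding window: each chunk is 1000 characters and starts 800 characters after the previous one, so consecutive chunks share a 200-character overlap. This is an illustrative re-implementation, not the server's `_create_text_chunks` verbatim:

```python
def create_text_chunks(text: str, chunk_size: int = 1000, overlap: int = 200):
    """Split text into overlapping chunks (sketch of pipeline step 2)."""
    chunks = []
    step = chunk_size - overlap  # each chunk starts `step` chars after the last
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = create_text_chunks("x" * 2500)
# yields chunks of length 1000, 1000, 900
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.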

Storage Structure

rag_storage/
├── documents/         # Original extracted text
├── chunks/            # Individual text chunks
├── embeddings/        # Numpy arrays of embeddings
├── faiss_index.bin    # FAISS vector index
└── metadata.json      # Document and chunk metadata

Vector Search

  • Model: all-MiniLM-L6-v2 (384-dimensional embeddings)
  • Index: FAISS IndexFlatIP (Inner Product similarity)
  • Search: Cosine similarity for semantic matching
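Inner product equals cosine similarity once vectors are L2-normalized, which is why IndexFlatIP can serve as a cosine index. A numpy-only sketch of the same ranking logic (FAISS does this at scale with a compiled index):

```python
import numpy as np

def rank_chunks(query_vec, chunk_vecs, top_k=5):
    """Rank chunks by cosine similarity via normalized inner product."""
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    scores = normalize(chunk_vecs) @ normalize(query_vec)
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

rng = np.random.default_rng(0)
chunks = rng.normal(size=(10, 384)).astype("float32")
query = chunks[3] * 2.0  # same direction as chunk 3, different magnitude
results = rank_chunks(query, chunks, top_k=3)
# chunk 3 ranks first with score ~1.0 despite the magnitude difference
```

Normalizing both sides before indexing is what makes scores magnitude-invariant.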

Configuration

Chunk Settings

Modify in mcp_server.py:

def _create_text_chunks(text: str, chunk_size: int = 1000, overlap: int = 200):

Embedding Model

Change the model in RAGSystem.__init__():

self.embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

Storage Location

Set custom storage directory:

rag_system = RAGSystem(storage_dir="custom_rag_storage")

API Reference

Tools

upload_pdf

  • Parameters: file_path (str), document_name (optional str)
  • Returns: Document ID, chunk count, success status

search_documents

  • Parameters: query (str), top_k (optional int, default 5)
  • Returns: Ranked list of relevant chunks with scores

list_documents

  • Parameters: None
  • Returns: List of all documents with metadata

delete_document

  • Parameters: document_id (str)
  • Returns: Success status and confirmation message

get_rag_stats

  • Parameters: None
  • Returns: System statistics (documents, chunks, storage size)

Resources

rag://documents

Returns formatted list of all documents in the system.

rag://document/{document_id}

Returns full text content of specified document with metadata header.

rag://stats

Returns formatted system statistics.

Prompts

rag_query_prompt

  • Parameters: query (str), context_chunks (str)
  • Returns: Structured prompt for RAG-based QA

document_summary_prompt

  • Parameters: document_content (str)
  • Returns: Prompt for document summarization

search_suggestions_prompt

  • Parameters: query (str), available_documents (str)
  • Returns: Prompt for generating better search queries

Performance Considerations

Memory Usage

  • Embeddings: ~1.5KB per chunk (384 float32 values)
  • FAISS index: Scales linearly with number of chunks
  • Text storage: Depends on document size and chunking
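The ~1.5KB figure follows directly from the embedding shape: 384 dimensions times 4 bytes per float32. A quick back-of-the-envelope check:

```python
dims = 384
bytes_per_float32 = 4
per_chunk = dims * bytes_per_float32           # 1536 bytes, i.e. ~1.5 KB
per_100k_chunks_mb = per_chunk * 100_000 / 1e6  # ~154 MB of raw embeddings
```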

Search Speed

  • FAISS IndexFlatIP: O(n) search time
  • For large collections, consider IndexIVFFlat or IndexHNSW

Optimization Tips

  1. Batch uploads for multiple documents
  2. Adjust chunk size based on document type
  3. Use GPU with faiss-gpu for large datasets
  4. Implement caching for frequent queries
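Tip 4 can be as simple as memoizing the search call so repeated queries skip re-embedding. A minimal sketch, assuming a hypothetical `search_fn` callable with signature `(query, top_k) -> results`:

```python
from functools import lru_cache

def make_cached_search(search_fn, maxsize=256):
    """Wrap a search function so repeated identical queries hit a cache.

    search_fn is a hypothetical callable: (query, top_k) -> results.
    """
    @lru_cache(maxsize=maxsize)
    def cached(query: str, top_k: int = 5):
        return search_fn(query, top_k)
    return cached

calls = []
def fake_search(query, top_k):
    calls.append(query)
    return (query, top_k)

search = make_cached_search(fake_search)
search("machine learning", 5)
search("machine learning", 5)  # served from cache; fake_search runs once
```

Note that a cache like this must be invalidated whenever documents are uploaded or deleted, since stored results go stale.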

Troubleshooting

Common Issues

  1. PDF text extraction fails:

    • Ensure PDF is not password-protected
    • Try different PDF files to isolate the issue
    • Check PyMuPDF and PyPDF2 installation
  2. Memory errors with large documents:

    • Reduce chunk size
    • Process documents in batches
    • Monitor system memory usage
  3. Search returns no results:

    • Verify documents are uploaded successfully
    • Check query similarity to document content
    • Try broader search terms
  4. Server connection issues:

    • Ensure server is running on correct port
    • Check firewall settings
    • Verify MCP client configuration

Debug Mode

Enable detailed logging by modifying the server:

import logging
logging.basicConfig(level=logging.DEBUG)

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Submit a pull request

License

This project is licensed under the MIT License.
