mcp-powered-agentic-rag
An agentic Retrieval-Augmented Generation (RAG) system that combines a small curated machine learning knowledge base with real-time web search capabilities, powered by the Model Context Protocol (MCP).
Limitations of Naive RAG
Traditional RAG systems have several limitations:
- Static Knowledge Base: Naive RAG relies solely on pre-indexed documents, so it cannot answer questions about recent events, current information, or topics missing from the knowledge base.
- No Tool Selection: These systems cannot decide when to use different information sources. They always query the same vector database regardless of the question type.
- Limited Context Awareness: They cannot interpret query intent and route to the appropriate tool (e.g., domain-specific knowledge base vs. general web search).
- Single Source of Truth: All queries go through the same retrieval mechanism, even when an external source would answer the question better.
- No Fallback Mechanism: If the knowledge base doesn't contain relevant information, the system fails rather than seeking alternative sources.
How Agentic RAG Solves the Problem
Agentic RAG introduces intelligent decision-making and tool orchestration:
- Multi-Source Intelligence: The system can choose between a curated knowledge base (for domain-specific questions) and web search (for general or current information).
- Context-Aware Routing: A routing prompt guides the LLM to analyze query intent and pick the appropriate tool for the question type.
- Dynamic Information Retrieval: The system can fetch real-time information from the web when the knowledge base is insufficient.
- Tool Orchestration: Through MCP, the system can switch between tools based on the query context.
- Graceful Degradation: If one source fails, the system can automatically try alternative sources.
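The graceful degradation idea can be illustrated with a minimal sketch. It is not the project's implementation: in this project the LLM chooses tools through MCP, whereas the sketch hard-codes the order. The names `answer_with_fallback`, `knowledge_base`, and `web_search` are hypothetical stand-ins, and the one-entry FAQ dictionary is invented for illustration.

```python
from typing import Callable, Optional, Sequence

# A source takes a query and returns an answer, or None when it has nothing relevant.
Source = Callable[[str], Optional[str]]


def answer_with_fallback(query: str, sources: Sequence[tuple[str, Source]]) -> tuple[str, str]:
    """Try each (name, source) pair in order; return (name, answer) from the first that succeeds."""
    errors = []
    for name, source in sources:
        try:
            answer = source(query)
        except Exception as exc:  # a failing source should not abort the whole request
            errors.append(f"{name}: {exc}")
            continue
        if answer:
            return name, answer
        errors.append(f"{name}: no relevant result")
    raise LookupError("all sources failed: " + "; ".join(errors))


# Hypothetical stand-ins for the real tools (assumptions, not the project's code).
def knowledge_base(query: str) -> Optional[str]:
    faqs = {"what is overfitting": "Overfitting is when a model memorizes training data."}
    return faqs.get(query.lower().rstrip("?"))


def web_search(query: str) -> Optional[str]:
    return f"(web result for: {query})"


SOURCES = [("knowledge_base", knowledge_base), ("web_search", web_search)]
```

Calling `answer_with_fallback("What is overfitting?", SOURCES)` is answered by the knowledge base stand-in, while a question it does not cover falls through to the web search stand-in.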
Solution Overview
This project implements an Agentic RAG system that:
- Maintains a small curated ML knowledge base (50 expert FAQs) in ChromaDB Cloud
- Provides real-time web search via Firecrawl for general queries
- Leverages MCP (Model Context Protocol) for seamless tool integration with Claude
The system acts as an assistant that knows when to use its specialized knowledge base and when to search the web for questions the knowledge base does not cover.
Workflow
- User Query: User asks a question through Claude Desktop
- Intent Analysis: Intelligent prompt analyzes the query to determine:
  - Is this an ML-related question? → Use ml_faq_retrieval
  - Is this a general question? → Use firecrawl_web_search
- Tool Execution:
  - ML FAQ Tool: Queries ChromaDB Cloud, retrieves top 3 relevant FAQs
  - Web Search Tool: Searches the web via Firecrawl API
- Return to User: Formatted response is returned through Claude
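The workflow above can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the project performs intent analysis with an LLM prompt, while the sketch substitutes a keyword check so it can run on its own. The tool names `ml_faq_retrieval` and `firecrawl_web_search` come from the workflow; `ML_TERMS`, `choose_tool`, and `handle_query` are hypothetical.

```python
import re
from typing import Callable

# Hypothetical keyword list standing in for the LLM's intent analysis.
ML_TERMS = {
    "model", "training", "gradient", "overfitting",
    "neural", "regression", "embedding", "classifier",
}


def choose_tool(query: str) -> str:
    """Return the tool name for a query: ML questions go to the FAQ tool, the rest to web search."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return "ml_faq_retrieval" if words & ML_TERMS else "firecrawl_web_search"


def handle_query(query: str, tools: dict[str, Callable[[str], str]]) -> str:
    """Route the query, run the chosen tool, and format the response."""
    tool_name = choose_tool(query)
    result = tools[tool_name](query)
    return f"[{tool_name}] {result}"
```

With stub tools registered under both names, `handle_query` returns the tool's output prefixed by the name of the tool that produced it, which makes the routing decision visible while experimenting.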
Tech Stack
- FastMCP: Fast Model Context Protocol framework for building MCP servers
- ChromaDB Cloud: Cloud-hosted vector database for storing and querying FAQ embeddings
- Firecrawl: Web scraping and search API for real-time information retrieval
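To make "querying FAQ embeddings" concrete, here is a toy top-k retrieval sketch. It does not use ChromaDB: word counts stand in for learned embeddings and an in-memory list stands in for the cloud collection, so every name here (`embed`, `cosine`, `top_k`, `FAQS`) and the sample FAQs are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system uses a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, documents: list[str], k: int = 3) -> list[str]:
    """Return the k documents most similar to the query, best match first."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Invented sample FAQs, not taken from the project's knowledge base.
FAQS = [
    "What is overfitting and how can regularization reduce it?",
    "How does gradient descent update model weights?",
    "What is the difference between precision and recall?",
    "When should I use cross-validation?",
]
```

The default `k=3` mirrors the "top 3 relevant FAQs" behavior described in the workflow; a vector database performs the same ranking step over stored embeddings at scale.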
Setup
Prerequisites
- Python 3.12 or higher
- uv package manager installed
- ChromaDB Cloud account (for API key, tenant, and database)
- Firecrawl API key
Installation
- Clone and cd into the repository:
cd mcp-powered-agentic-rag
- Install dependencies with uv:
uv pip install -r requirements.txt
Or use uv's project management:
uv sync
- Set up environment variables:
Create a .env file in the project root:
CHROMA_API_KEY=your_chroma_api_key
CHROMA_TENANT=your_chroma_tenant
CHROMA_DATABASE=your_chroma_database
FIRECRAWL_API_KEY=your_firecrawl_api_key
- Verify setup:
uv run fastmcp dev server.py
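Before starting the server, it can help to confirm that the four variables from the environment step are present. The following standard-library sketch is an assumption about how such a check might look, not part of the project: `parse_env` and `missing_keys` are hypothetical helpers, and the parser only handles plain KEY=value lines.

```python
REQUIRED_KEYS = ("CHROMA_API_KEY", "CHROMA_TENANT", "CHROMA_DATABASE", "FIRECRAWL_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty, in REQUIRED_KEYS order."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

A typical use would be to read the .env file with `pathlib.Path(".env").read_text()`, pass the text to `parse_env`, and stop with a clear message if `missing_keys` returns anything.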
Usage
Running the MCP Server
Development Mode (with Inspector)
uv run fastmcp dev server.py
Production Mode
uv run python server.py
Integrating with Claude Desktop
Add the following to your Claude Desktop MCP configuration:
{
  "mcpServers": {
    "mcp-rag": {
      "command": "/path/to/uv",
      "args": [
        "--directory",
        "/path/to/mcp-powered-agentic-rag",
        "run",
        "server.py"
      ]
    }
  }
}
Configuration
ChromaDB Cloud Setup
- Create a ChromaDB Cloud account
- Create a database
- Get your API key, tenant ID, and database name
- Add them to the .env file
License
This project is licensed under the MIT License - see the LICENSE file for details.