mcp-agentic-rag
Provides RAG tools with local vector retrieval and web fallback using Firecrawl, enabling document ingestion and querying through MCP stdio transport.
Agentic RAG MCP
A minimal FastAPI + FastMCP project that combines local RAG retrieval with Firecrawl web fallback.
What this project does
- Loads a FastAPI application for document ingestion and vector queries.
- Uses ChromaDB for local vector storage and SentenceTransformers for embeddings.
- Provides an MCP tool server via fastmcp to expose RAG tools over stdio transport.
- Falls back to Firecrawl web search only when the local vector DB returns no documents.
Repository structure
- app/ - application source code
  - api/ - FastAPI routes and schemas
  - core/ - RAG logic, embeddings, fallback helper
  - services/ - ChromaDB service integration
  - mcp/ - FastMCP server entrypoint
- scripts/ - utility scripts (seed data, etc.)
- data/ - storage and persistence directories
- .env.example - environment variable template
- pyproject.toml - project dependencies and packaging config
Setup for a new user
1. Clone the repository
git clone https://github.com/sampathpulukurthi/agentic-rag-mcp.git
cd agentic-rag-mcp
2. Create a Python virtual environment
python3 -m venv .venv
source .venv/bin/activate
3. Install dependencies
python -m pip install -e .
4. Create environment variables
cp .env.example .env
Edit .env and set:
FIRECRAWL_API_KEY=your_firecrawl_api_key_here
5. Run the FastAPI backend
uvicorn app.main:app --host 127.0.0.1 --port 8000 --reload
Then verify:
curl http://127.0.0.1:8000/api/health
6. Run the MCP server
With the virtualenv active:
.venv/bin/python -m app.mcp.server
This starts the FastMCP server named mcp-agentic-rag using stdio transport.
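For context on what stdio transport means in practice: an MCP client exchanges JSON-RPC 2.0 messages with the server over stdin/stdout, one JSON object per line. The sketch below only frames such a message and does not talk to the server. The tool name `query_rag` and its arguments are hypothetical, since this README does not list the tool names the server exposes.

```python
import json
from itertools import count

# Monotonically increasing request ids, as JSON-RPC expects each request to be identifiable.
_ids = count(1)


def frame_request(method, params):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    }
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return json.dumps(message) + "\n"


# Hypothetical tool name and arguments, for illustration only.
call = frame_request(
    "tools/call",
    {"name": "query_rag", "arguments": {"query_text": "What is machine learning?", "k": 3}},
)
```

In normal use an MCP client (an IDE or desktop assistant) handles this framing, so the snippet is only meant to show what travels over the pipe.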
How to use
Ingest documents
curl -X POST http://127.0.0.1:8000/api/ingest \
-H "Content-Type: application/json" \
-d '{"documents": [{"id":"doc1","text":"Machine learning models can classify text.","metadata":{"topic":"ml"}}]}'
Query local vector store
curl -X POST http://127.0.0.1:8000/api/query \
-H "Content-Type: application/json" \
-d '{"query_text":"How do text classification models work?","k":3}'
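The same requests can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_post` and `post_json` are illustrative and not part of the project, and the response is returned as whatever JSON the endpoint produces.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000/api"


def build_post(endpoint, payload):
    """Build (but do not send) a JSON POST request for one of the API endpoints."""
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint, payload, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_post(endpoint, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With the backend running, for example:
# post_json("query", {"query_text": "How do text classification models work?", "k": 3})
```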
Query with fallback to Firecrawl
curl -X POST http://127.0.0.1:8000/api/query_with_fallback \
-H "Content-Type: application/json" \
-d '{"query_text":"What is machine learning?","k":5}'
If the vector store returns no documents, the endpoint will return fallback: true and web_results from Firecrawl.
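The decision described above can be sketched as follows. This is not the project's actual code: `search_local` and `search_web` are hypothetical callables standing in for the ChromaDB query and the Firecrawl search, and the `documents` key is an assumed name. Only `fallback` and `web_results` come from this README.

```python
def query_with_fallback(query_text, k, search_local, search_web):
    """Return local matches; call the web search only when there are none."""
    documents = search_local(query_text, k)
    if documents:
        # Local results exist, so the web search is never invoked.
        return {"fallback": False, "documents": documents}
    return {
        "fallback": True,
        "documents": [],
        "web_results": search_web(query_text),
    }
```

Passing the two searches in as arguments keeps the branch easy to test without a vector store or an API key.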
Notes
- There is currently no chat UI included in this repository.
- The app returns vector DB matches by default and only uses Firecrawl when local results are empty.
- If you want the fallback to trigger on weak matches as well, not only on empty results, the query_with_fallback logic can be updated to use a similarity threshold.
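One way such a threshold could look is sketched below. It assumes each match carries a `distance` value where smaller means more similar, which is how ChromaDB reports query results. The match shape, the function names, and the cutoff of 0.5 are all placeholders to be tuned for the embedding model and distance metric in use.

```python
def filter_by_distance(matches, max_distance=0.5):
    """Keep only matches close enough to the query (smaller distance = more similar)."""
    return [m for m in matches if m["distance"] <= max_distance]


def should_fall_back(matches, max_distance=0.5):
    """Fall back to web search when no match passes the distance cutoff."""
    return not filter_by_distance(matches, max_distance)
```

With this check, an empty result set still triggers the fallback, so the current behavior becomes a special case of the thresholded one.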