SecondBrain MCP Server
Enables external AI clients to query and interact with a personal knowledge assistant using retrieval-augmented generation and long-term memory.
<div align="center">
SecondBrain AI
Production-Ready AI Personal Knowledge Assistant
An intelligent knowledge management system powered by Agentic RAG, LangGraph, Long-Term Memory, and the Model Context Protocol (MCP).
</div>
Overview
SecondBrain AI is a production-ready AI knowledge assistant designed to transform personal documents into a searchable, conversational knowledge base.
Unlike a traditional chatbot, SecondBrain combines Retrieval-Augmented Generation (RAG), LangGraph-based agent orchestration, short-term and long-term memory, and the Model Context Protocol (MCP) to create an extensible AI assistant capable of understanding documents, remembering context across conversations, and exposing its capabilities to external AI clients.
The project follows a modular architecture with asynchronous document ingestion, background processing, vector search using Qdrant, persistent conversational memory through MongoDB, and containerized deployment using Docker Compose.
Key Highlights
- Agentic RAG powered by LangGraph
- Upload and chat with PDF documents
- Semantic vector search using Qdrant
- Persistent long-term memory with Mem0
- Conversation state managed using MongoDB Checkpointer
- FastAPI REST API
- Redis + RQ background workers
- Model Context Protocol (MCP) Server
- Fully Dockerized architecture
- Structured logging and centralized error handling
- Smoke-tested production workflow
Project Overview
SecondBrain AI is a production-ready personal knowledge assistant that enables users to build an intelligent, searchable knowledge base from their own documents. Instead of relying solely on a large language model's built-in knowledge, SecondBrain retrieves relevant information from user-provided content and generates grounded, context-aware responses.
The system is built around an Agentic Retrieval-Augmented Generation (RAG) architecture powered by LangGraph. User requests are orchestrated through a multi-step workflow that performs intent routing, semantic retrieval, memory lookup, and response generation before returning a final answer.
To support real-world usage, SecondBrain combines multiple components into a modular backend:
- FastAPI for serving REST APIs
- Qdrant for semantic vector search
- MongoDB for conversation checkpoints and persistence
- Mem0 for long-term memory management
- Redis + RQ for asynchronous background processing
- Docker Compose for reproducible deployment
- Model Context Protocol (MCP) for integration with external AI clients
The project follows production-oriented software engineering practices including modular architecture, centralized logging, environment-based configuration, containerization, background workers, and automated smoke testing.
Rather than being a simple chatbot, SecondBrain demonstrates how modern AI systems combine retrieval, reasoning, memory, orchestration, and external tools to build scalable intelligent applications.
Features
AI & Intelligence
- Agentic Retrieval-Augmented Generation (RAG) powered by LangGraph
- Semantic document retrieval using Qdrant Vector Database
- Long-term memory using Mem0
- Short-term conversational memory with MongoDB Checkpointer
- Context-aware response generation using Google Gemini
- Multi-step workflow orchestration for intelligent query handling
Document Processing
- PDF document ingestion
- Automatic text extraction and cleaning
- Intelligent document chunking
- Embedding generation
- Vector indexing for semantic search
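The cleaning and chunking steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual implementation: the function names `clean_text` and `chunk_text`, the character-based windows, and the default sizes are all assumptions made for the example.

```python
import re


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace (including line breaks) into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows; consecutive windows share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help retrieval. Real pipelines often split on sentence or token boundaries instead of raw characters.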
Backend & Infrastructure
- FastAPI REST API
- Asynchronous document processing using Redis + RQ
- Modular service-oriented architecture
- Centralized logging
- Custom exception handling
- Environment-based configuration
- Production-ready Docker deployment
Integrations
- Model Context Protocol (MCP) Server
- REST API endpoints
- Command Line Interface (CLI)
Quality & Reliability
- Smoke tests
- Dockerized development environment
- Persistent MongoDB storage
- Persistent Qdrant storage
- Modular project structure
System Architecture
The following diagram illustrates the high-level architecture of SecondBrain AI.
<p align="center"> <img src="assets/architecture-overview.png" alt="SecondBrain Architecture" width="100%"> </p>
SecondBrain follows a modular architecture where each component has a well-defined responsibility.
| Component | Responsibility |
|---|---|
| FastAPI | Exposes REST APIs |
| LangGraph | Orchestrates the AI workflow |
| Gemini | Generates responses |
| Qdrant | Stores vector embeddings |
| MongoDB | Stores conversation state |
| Mem0 | Manages long-term memory |
| Redis + RQ | Executes background jobs |
| MCP Server | Exposes tools for external AI clients |
Detailed Workflow
The following diagram illustrates the complete execution flow inside SecondBrain AI.
<p align="center"> <img src="assets/architecture-detailed.png" alt="SecondBrain Detailed Workflow" width="100%"> </p>
Request Flow
User
│
▼
FastAPI
│
▼
LangGraph Workflow
│
├── Route Request
├── Retrieve Documents
├── Retrieve Memory
├── Grade Documents
├── Rewrite Query (if needed)
├── Generate Response
└── Store Conversation Memory
│
▼
Gemini LLM
│
▼
Response
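The control flow of the request steps above can be sketched in plain Python. This is only an illustration of the route, retrieve, grade, rewrite, and generate loop: it does not use LangGraph, Qdrant, Mem0, or Gemini, and every name in it (`State`, `CORPUS`, `run`, and the helper functions) is hypothetical. Keyword overlap stands in for semantic search and a formatted string stands in for the LLM call.

```python
import string
from dataclasses import dataclass, field

# Toy stand-ins for the vector store, long-term memory, and conversation log.
CORPUS = ["Qdrant stores vector embeddings.", "RQ workers process uploads."]
LONG_TERM_MEMORY = ["The user prefers short answers."]
HISTORY: list[tuple[str, str]] = []


@dataclass
class State:
    query: str
    documents: list[str] = field(default_factory=list)
    memories: list[str] = field(default_factory=list)
    rewrites: int = 0
    answer: str = ""


def _tokens(text: str) -> set[str]:
    return set(text.lower().split())


def route(state: State) -> str:
    """Intent routing: greetings skip retrieval, everything else goes through RAG."""
    return "direct" if state.query.lower().strip("!. ") in {"hi", "hello"} else "rag"


def retrieve_documents(state: State) -> None:
    """Keyword overlap as a placeholder for semantic vector search."""
    wanted = _tokens(state.query)
    state.documents = [d for d in CORPUS if wanted & _tokens(d.rstrip("."))]


def retrieve_memory(state: State) -> None:
    state.memories = list(LONG_TERM_MEMORY)


def grade_documents(state: State) -> bool:
    """A real grader would score relevance; here any hit counts as relevant."""
    return bool(state.documents)


def rewrite_query(state: State) -> None:
    """Placeholder rewrite: strip punctuation so tokens can match."""
    state.query = state.query.translate(str.maketrans("", "", string.punctuation))
    state.rewrites += 1


def generate(state: State) -> None:
    """Placeholder for the LLM call: echo the query and its supporting context."""
    context = " ".join(state.documents) or "no supporting documents"
    state.answer = f"Answer to '{state.query}' using: {context}"


def run(query: str, max_rewrites: int = 1) -> State:
    state = State(query=query)
    if route(state) == "rag":
        retrieve_documents(state)
        retrieve_memory(state)
        # Retry retrieval with a rewritten query while nothing relevant is found.
        while not grade_documents(state) and state.rewrites < max_rewrites:
            rewrite_query(state)
            retrieve_documents(state)
    generate(state)
    HISTORY.append((state.query, state.answer))  # store conversation memory
    return state
```

Calling `run("Where are embeddings?")` finds nothing on the first pass because of the trailing question mark, rewrites the query once, and then retrieves the Qdrant sentence. The `max_rewrites` cap is the important design point: without it, a query with no matching documents would loop forever.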
Document Upload Flow
PDF Upload
│
▼
FastAPI
│
▼
Redis Queue
│
▼
RQ Worker
│
▼
PDF Loader
│
▼
Text Cleaning
│
▼
Chunking
│
▼
Gemini Embeddings
│
▼
Qdrant Vector Store
This architecture separates document ingestion from user interaction, enabling scalable background processing while keeping the API responsive.
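The separation described above follows an enqueue-then-poll pattern: the upload handler records a job and returns its ID immediately, and a worker picks the job up later. The sketch below shows that pattern with an in-memory queue from the standard library. It is not RQ and not the project's code; the names `enqueue_ingestion`, `work_once`, `job_status`, and the placeholder `ingest` function are assumptions made for illustration.

```python
import queue
import uuid

jobs: dict[str, dict] = {}
pending = queue.Queue()


def enqueue_ingestion(filename: str) -> str:
    """Upload-handler side: record the job, queue it, and return at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"filename": filename, "status": "queued", "chunks": None}
    pending.put(job_id)
    return job_id


def ingest(filename: str) -> int:
    """Placeholder for load, clean, chunk, embed, and index; returns a chunk count."""
    return 3


def work_once() -> bool:
    """Worker side: take one job off the queue and run it. Returns False if idle."""
    try:
        job_id = pending.get_nowait()
    except queue.Empty:
        return False
    job = jobs[job_id]
    job["status"] = "started"
    try:
        job["chunks"] = ingest(job["filename"])
        job["status"] = "finished"
    except Exception as exc:  # record the failure instead of crashing the worker
        job["status"] = "failed"
        job["error"] = str(exc)
    return True


def job_status(job_id: str) -> str:
    """What a status endpoint would look up for a given job ID."""
    return jobs[job_id]["status"]
```

In the real stack the queue lives in Redis, so the API and the worker can run in separate processes or containers and a slow PDF never blocks a chat request.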
Tech Stack
| Category | Technologies |
|---|---|
| Programming Language | Python 3.13 |
| Backend Framework | FastAPI |
| AI Framework | LangChain, LangGraph |
| LLM | Google Gemini |
| Embeddings | Gemini Embedding Model |
| Vector Database | Qdrant |
| Memory | Mem0, MongoDB Checkpointer |
| Database | MongoDB |
| Background Processing | Redis, RQ Worker |
| Document Processing | PyPDF |
| API Documentation | Swagger / OpenAPI |
| Protocol | Model Context Protocol (MCP) |
| Containerization | Docker, Docker Compose |
| Configuration | Python Dotenv |
| Testing | Smoke Tests |
| Version Control | Git, GitHub |
Project Structure
SecondBrain/
│
├── secondbrain/
│   ├── agent/          # Agent orchestration
│   ├── agents/         # Specialized AI agents
│   ├── api/            # FastAPI endpoints
│   ├── cli/            # Command-line interface
│   ├── core/           # Logging & exceptions
│   ├── graph/          # LangGraph workflow
│   ├── mcp_server/     # MCP server implementation
│   ├── memory/         # Short & long-term memory
│   ├── models/         # Request & response models
│   ├── queues/         # Redis queue & worker
│   ├── rag/            # RAG pipeline
│   ├── tools/          # AI tools
│   ├── data/           # Runtime data
│   └── main.py         # FastAPI application
│
├── tests/              # Smoke tests
├── assets/             # README images
├── logs/               # Application logs
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── pyproject.toml
├── .env.example
└── README.md
Core Components
| Module | Description |
|---|---|
| RAG Pipeline | Document ingestion, chunking, embeddings, retrieval |
| LangGraph | Agent workflow orchestration |
| Memory | Short-term & long-term conversational memory |
| FastAPI | REST API layer |
| Redis Worker | Background document processing |
| Qdrant | Vector similarity search |
| MongoDB | Conversation persistence |
| MCP Server | External AI tool integration |
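Vector similarity search, the role Qdrant plays in the table above, comes down to ranking stored embeddings by how close they are to the query embedding. The sketch below does this by brute force with cosine similarity in plain Python. It is an illustration only: Qdrant uses an approximate index rather than a full scan, the distance metric this project configures is not stated here, and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2):
    """Return the k (score, text) pairs whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In the real pipeline the vectors would come from the embedding model and have hundreds of dimensions or more; the ranking idea is the same.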
Installation & Quick Start
Prerequisites
Before getting started, ensure you have the following installed:
- Python 3.13+
- Docker & Docker Compose
- Git
- Google Gemini API Key
Clone the Repository
git clone https://github.com/MandarGavali/SecondBrain.git
cd SecondBrain
Configure Environment Variables
Create a .env file in the project root; the repository's .env.example can serve as a starting point.
GOOGLE_API_KEY=your_google_api_key
MONGODB_URI=mongodb://localhost:27017
QDRANT_URL=http://localhost:6333
REDIS_URL=redis://localhost:6379
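One way an application can read these four variables is sketched below. This is an assumption about how such a loader might look, not the project's actual configuration code: the `Settings` class and `load_settings` function are hypothetical, and the defaults simply mirror the values shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    mongodb_uri: str
    qdrant_url: str
    redis_url: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default) and fail fast without an API key."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is not set")
    return Settings(
        google_api_key=api_key,
        mongodb_uri=env.get("MONGODB_URI", "mongodb://localhost:27017"),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
    )
```

Note that inside a Docker Compose container, localhost refers to that container itself, so these URLs would normally point at the Compose service names instead. Check docker-compose.yml for the actual names.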
Run with Docker (Recommended)
Start the complete application stack:
docker compose up --build
This launches:
- FastAPI Server
- Redis
- MongoDB
- Qdrant
- Background Worker
Swagger UI:
http://localhost:8000/docs
Run Locally (Without Docker)
Create a virtual environment:
python -m venv venv
Activate it.
Windows:
venv\Scripts\activate
Linux / macOS:
source venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Start the FastAPI server:
uvicorn secondbrain.main:app --reload
Start the Redis worker in another terminal:
python -m secondbrain.queues.worker
The API will be available at:
http://localhost:8000
Swagger UI:
http://localhost:8000/docs
Docker Deployment
SecondBrain is fully containerized using Docker Compose, making the entire application stack reproducible with a single command.
Containers
| Container | Purpose |
|---|---|
| secondbrain-api | FastAPI application |
| secondbrain-worker | Background document processing |
| secondbrain-mongodb | Conversation state & memory |
| secondbrain-qdrant | Vector database |
| secondbrain-redis | Background job queue |
Start the Stack
docker compose up --build
Run in detached mode:
docker compose up -d
Stop the stack:
docker compose down
Rebuild after dependency changes:
docker compose up --build --force-recreate
Verify Services
docker ps
Expected running containers:
- secondbrain-api
- secondbrain-worker
- secondbrain-mongodb
- secondbrain-qdrant
- secondbrain-redis
Persistent Storage
Docker volumes are used to persist application data.
| Volume | Stores |
|---|---|
| mongodb_data | MongoDB data |
| qdrant_data | Vector embeddings |
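This ensures conversations and indexed documents remain available after the containers are restarted or recreated. Running docker compose down with the -v flag removes these volumes and the data stored in them.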
API Reference
Once the application is running, Swagger documentation is available at:
http://localhost:8000/docs
<p align="center"> <img src="assets/swagger-ui.png" alt="Swagger UI" width="100%"> </p>
REST Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /upload | Upload and index PDF documents |
| POST | /chat | Chat with indexed documents |
| GET | /jobs/{job_id} | Check background processing status |
Example Upload Request
curl -X POST \
"http://localhost:8000/upload" \
-F "file=@document.pdf"
Example Chat Request
POST /chat
{
  "query": "Summarize the uploaded document."
}
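The same requests can be built from Python using only the standard library. The sketch below is a hedged example: the endpoint paths and the "query" field come from this README, but the helper names (`build_chat_request`, `build_job_request`, `send`) are hypothetical, and the shape of the JSON the server returns is not documented here, so `send` simply returns whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_chat_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /chat request with the query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_job_request(job_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /jobs/{job_id} request for polling a background job."""
    return urllib.request.Request(f"{base_url}/jobs/{job_id}", method="GET")


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and parse the response, assuming the server replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `send(build_chat_request("Summarize the uploaded document."))` would issue the chat request shown above.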
Model Context Protocol (MCP) Integration
SecondBrain includes a dedicated Model Context Protocol (MCP) Server, allowing external AI clients (such as Claude Desktop, Cursor, VS Code, and other MCP-compatible applications) to interact directly with the knowledge base.
Instead of exposing only REST APIs, MCP enables AI assistants to invoke tools, retrieve documents, and access memory through a standardized protocol.
Available MCP Tools
| Tool | Description |
|---|---|
| ask_secondbrain | Query the complete RAG pipeline with memory support |
| search_documents | Perform semantic search across indexed documents |
| upload_document | Upload new documents to the knowledge base |
| memory | Access and manage long-term memory |
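MCP clients invoke tools such as these by sending JSON-RPC 2.0 messages with the method `tools/call`. The sketch below builds such a message by hand to show what travels over the wire. The tool name comes from the table above, but the argument name "query" is an assumption, since this README does not document each tool's input schema, and the `tools_call` helper is hypothetical.

```python
import itertools
import json

_ids = itertools.count(1)


def tools_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),  # each request needs its own ID so replies can be matched
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Assumed argument name; the real schema is whatever the server advertises via tools/list.
request = tools_call("ask_secondbrain", {"query": "What did I save about Qdrant?"})
```

In practice an MCP client library handles this framing, along with the initialization handshake that precedes any tool call, so application code rarely builds these messages directly.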
MCP Capabilities
- AI-assisted document search
- Agentic RAG workflow execution
- Long-term memory retrieval
- Knowledge base interaction
- Tool-based AI integration
- Standardized MCP interface
MCP Architecture
            AI Client
(Claude Desktop / Cursor / VS Code)
                 │
                 ▼
            MCP Server
                 │
                 ▼
         SecondBrain Tools
                 │
                 ▼
        LangGraph Workflow
                 │
         ┌───────┴────────┐
         ▼                ▼
      MongoDB          Qdrant
         │                │
         └───────┬────────┘
                 ▼
          Google Gemini
The MCP server lets external AI systems use SecondBrain's capabilities through a defined set of tools, without interacting directly with the internal application components.
Screenshots
System Architecture
<p align="center"> <img src="assets/architecture-overview.png" width="100%"> </p>
Engineering Workflow
<p align="center"> <img src="assets/architecture-detailed.png" width="100%"> </p>
FastAPI Swagger UI
<p align="center"> <img src="assets/swagger-ui.png" width="100%"> </p>
Future Improvements
SecondBrain is designed with extensibility in mind. Some planned enhancements include:
- Web-based user interface
- User authentication and role-based access control
- Support for additional document formats (DOCX, Markdown, HTML, TXT)
- Cloud storage integration (AWS S3, Google Cloud Storage)
- Hybrid Search (Semantic + Keyword Search)
- Streaming responses using Server-Sent Events (SSE)
- Observability with Prometheus & Grafana
- Monitoring and analytics dashboard
- Plugin architecture for custom tools
- Multi-agent collaboration workflows
- Voice input and speech synthesis
- Multi-language document support
- Web and mobile client applications
- Kubernetes deployment for horizontal scaling
Contributing
Contributions are welcome!
If you'd like to improve SecondBrain, please follow these steps:
- Fork the repository
- Create a feature branch
git checkout -b feature/your-feature
- Commit your changes
git commit -m "Add new feature"
- Push to your branch
git push origin feature/your-feature
- Open a Pull Request
Please ensure your code follows the existing project structure and coding style.
License
This project is licensed under the MIT License.
See the LICENSE file for more information.
Acknowledgements
This project was built using several outstanding open-source technologies.
Special thanks to the teams behind:
- LangChain
- LangGraph
- Google Gemini
- FastAPI
- Qdrant
- MongoDB
- Redis
- Mem0
- Docker
- Model Context Protocol (MCP)
Their work makes projects like SecondBrain possible.
<div align="center">
⭐ If you found this project interesting, consider giving it a star!
Built with ❤️ by Mandar Gavali
</div>