RAG-Anything MCP Server
Enables advanced RAG with knowledge graphs, supporting document ingestion, multimodal extraction, and multiple query modes (naive, local, global, hybrid) via the Model Context Protocol.
<div align="center">
RAG-Anything MCP Server
Model Context Protocol server for advanced RAG with Knowledge Graphs
Knowledge Graph + Document Processing + Multimodal AI
</div>
Overview
RAG-Anything MCP Server is a production-ready Model Context Protocol (MCP) server that combines:
- Knowledge Graph Queries - Multiple query modes (naive, local, global, hybrid, mix, bypass)
- Document Ingestion - Text and PDF processing with multimodal content extraction
- Entity Extraction - Automatic entity and relationship extraction from documents
- Hybrid Storage - Neo4j (graph) + PostgreSQL with pgvector (vectors)
- Multimodal Support - Process images, tables, and equations from PDFs
- MCP Compliant - Standard Model Context Protocol implementation
Features
Core Capabilities
| Feature | Description |
|---|---|
| Document Ingestion | Text and PDF processing with multimodal content extraction |
| Knowledge Graph Queries | Multiple query modes (naive, local, global, hybrid, mix, bypass) |
| Entity Extraction | Automatic entity and relationship extraction from documents |
| Hybrid Storage | Neo4j (graph) + PostgreSQL with pgvector (vectors) |
| MCP Compliant | Standard Model Context Protocol implementation |
| Multimodal Support | Process images, tables, and equations from PDFs |
Query Modes
- naive - Simple keyword search
- local - Local entity-based search
- global - Global community-based search
- hybrid - Combines local and global (recommended)
- mix - Mixes multiple strategies
- bypass - Direct LLM query without graph
Quick Start
Prerequisites
- Docker & Docker Compose (for database services)
- Python 3.13+
- OpenAI API Key
Option 1: All-in-One Docker (Recommended)
Start everything with Docker Compose:
# Clone the repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp
# Copy environment template
cp .env.example .env
# Edit .env and set your OpenAI API key
# OPENAI_API_KEY=sk-...
# Start all services (Neo4j + PostgreSQL + MCP Server)
docker-compose up -d
# View logs
docker-compose logs -f rag-mcp
# Stop services
docker-compose down
This will start:
- Neo4j 5.23 on bolt://localhost:7687 (HTTP UI on http://localhost:7474)
- PostgreSQL 16 + pgvector on localhost:5432
- MCP Server on http://localhost:8000
Option 2: Development Mode
Run the MCP server locally while databases run in Docker:
Windows:
# One-click startup - handles everything automatically
start-dev.bat
Linux/Mac:
# Make executable and run
chmod +x start-dev.sh
./start-dev.sh
Option 3: Manual Setup
# 1. Start Docker services (databases only)
docker-compose up -d neo4j postgres
# 2. Create virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# 3. Install dependencies
pip install -e .
# 4. Run the server
python main.py
Configuration
Create a .env file in the project root:
# =====================
# Neo4j Configuration
# =====================
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_neo4j_password
# =====================
# PostgreSQL Configuration (for RAG)
# =====================
RAG_DB_HOST=localhost
RAG_DB_PORT=5432
RAG_DB_NAME=rag_anythink
RAG_DB_USER=postgres
RAG_DB_PASSWORD=your_postgres_password
# =====================
# OpenAI Configuration
# =====================
OPENAI_API_KEY=your_openai_api_key
# =====================
# Document Processing
# =====================
# Parser: mineru, docling
KG_PARSER=mineru
# Parse method: auto, ocr, txt
KG_PARSE_METHOD=auto
# Enable image extraction from PDFs
KG_ENABLE_IMAGE=true
# Enable table extraction from PDFs
KG_ENABLE_TABLE=true
# Enable equation extraction from PDFs
KG_ENABLE_EQUATION=true
# =====================
# RAG Configuration
# =====================
# Working directory for RAG output
KG_WORKING_DIR=./rag_output
# Workspace name (production, development, etc.)
KG_WORKSPACE=production
# Context window for LLM (pages before/after for context)
KG_CONTEXT_WINDOW=1
# Maximum concurrent files for processing
KG_MAX_CONCURRENT_FILES=4
# Embedding dimension (depends on model)
KG_EMBEDDING_DIM=3072
# Maximum token size for embeddings
KG_MAX_TOKEN_SIZE=8192
# LLM model for knowledge graph operations
KG_LLM_MODEL=gpt-4o-mini
# Vision model for multimodal processing
KG_VISION_MODEL=gpt-4o
# Embedding model
KG_EMBEDDING_MODEL=text-embedding-3-large
# Default query mode (naive, local, global, hybrid, mix, bypass)
KG_DEFAULT_MODE=hybrid
# =====================
# MCP Configuration
# =====================
# MCP server name
RAG_MCP_NAME=rag-anything-mcp
# MCP server version
RAG_MCP_VERSION=1.0.0
# MCP server host
RAG_MCP_HOST=localhost
# MCP server port
RAG_MCP_PORT=8055
# MCP log level
RAG_MCP_LOG=info
# MCP transport protocol: stdio, http, sse, streamable-http
RAG_MCP_TRANSPORT=streamable-http
# Application log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=INFO
# Application log format: json, text
LOG_FORMAT=json
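To show how a few of these variables fit together, here is a minimal sketch that reads some of the `KG_*` settings and checks that the embedding dimension agrees with the chosen model. It uses only the standard library; the project itself is described as using Pydantic configuration, so `KGSettings` and `load_settings` are hypothetical names and not the server's actual loader. The check assumes the model's default output size is used (no shortened embeddings are requested).

```python
import os
from dataclasses import dataclass

# Default output sizes of OpenAI embedding models.
# Assumption: the server does not request shortened embeddings.
KNOWN_EMBEDDING_DIMS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}


@dataclass(frozen=True)
class KGSettings:
    """Hypothetical subset of the settings above, for illustration only."""

    llm_model: str
    embedding_model: str
    embedding_dim: int
    default_mode: str
    max_concurrent_files: int


def load_settings(env=os.environ) -> KGSettings:
    """Read KG_* variables, using the template values above as defaults."""
    settings = KGSettings(
        llm_model=env.get("KG_LLM_MODEL", "gpt-4o-mini"),
        embedding_model=env.get("KG_EMBEDDING_MODEL", "text-embedding-3-large"),
        embedding_dim=int(env.get("KG_EMBEDDING_DIM", "3072")),
        default_mode=env.get("KG_DEFAULT_MODE", "hybrid"),
        max_concurrent_files=int(env.get("KG_MAX_CONCURRENT_FILES", "4")),
    )
    # Only models with a known default size are checked; others pass through.
    expected = KNOWN_EMBEDDING_DIMS.get(settings.embedding_model)
    if expected is not None and expected != settings.embedding_dim:
        raise ValueError(
            f"KG_EMBEDDING_DIM={settings.embedding_dim} does not match "
            f"{settings.embedding_model} (expected {expected})"
        )
    return settings
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.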
MCP Tools
The server provides the following MCP tools:
| Tool | Description |
|---|---|
| ingest_document | Ingest text or PDF documents |
| query_knowledge_graph | Query with multiple modes (naive, local, global, hybrid) |
| query_multimodal | Query with images, tables, equations |
| process_document_file | Process PDF files with multimodal extraction |
| insert_content_list | Insert pre-parsed content |
| delete_data | Delete documents by ID |
| get_graph_statistics | Get graph statistics |
| get_config_info | Get configuration info |
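MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such a message with the standard library so the wire format is visible; it does not open a connection. The argument names `query` and `mode` are assumptions made for illustration, since this README does not list the tools' input schemas; a real client would perform the `initialize` handshake and read the schemas from `tools/list` first.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def build_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names ("query", "mode") are assumptions; check the tool's schema.
request = build_tool_call(
    "query_knowledge_graph",
    {"query": "What are the main topics?", "mode": "hybrid"},
)
print(request)
```

In practice an MCP client library handles this framing, so the helper is useful mainly for debugging or for reading server logs.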
Usage Examples
Python Client
import asyncio

from src.services.kg_service import KGService


async def main() -> None:
    # Initialize service
    service = KGService()
    await service.initialize()

    # Ingest a document
    result = await service.ingest_text(
        text="Your document text here...",
        metadata={"title": "My Document"},
    )

    # Query the knowledge graph
    response = await service.query(
        query_text="What are the main topics?",
        mode="hybrid",
    )
    print(response)


# The service methods are coroutines, so they must run inside an event loop
asyncio.run(main())
MCP Client (Claude Desktop)
Add to your Claude Desktop MCP config:
{
  "mcpServers": {
    "rag-anything": {
      "command": "docker-compose",
      "args": ["up", "rag-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-api-key"
      }
    }
  }
}
Project Structure
rag-anythink-mcp/
├── src/
│   ├── config/           # Pydantic configuration
│   ├── core/             # Interfaces and models
│   ├── database/         # Database connections (Neo4j, PostgreSQL)
│   │   └── kg/           # Knowledge Graph layer
│   ├── mcp/              # MCP servers
│   │   └── kg/           # RAG MCP server
│   ├── services/         # Business logic
│   ├── utils/            # Utilities
│   └── llm.py            # LLM clients
├── main.py               # Entry point
├── Dockerfile            # Docker image for MCP server
├── docker-compose.yml    # Multi-container orchestration
├── pyproject.toml        # Dependencies (uv)
├── start-dev.bat         # Windows dev startup
├── start-dev.sh          # Linux/Mac dev startup
└── README.md
Development
Setup Development Environment
# Clone repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install in development mode with all extras
pip install -e ".[full]"
Running Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=src
# Run specific test file
pytest tests/test_kg_service.py
Code Quality
# Format code
ruff format src/
# Check linting
ruff check src/
# Type checking
pyright src/
Docker Deployment
Full Stack Deployment
# Deploy all services
docker-compose up -d
# Check service health
docker-compose ps
# View logs
docker-compose logs -f
# Stop all services
docker-compose down
# Stop and remove volumes (clean slate)
docker-compose down -v
Individual Services
# Only databases (for local dev)
docker-compose up -d neo4j postgres
# Only MCP server (databases must be running)
docker-compose up -d rag-mcp
Health Checks
The services include built-in health checks:
- Neo4j: Cypher-shell connectivity test
- PostgreSQL: pg_isready checks
- MCP Server: Depends on healthy databases
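When the MCP server runs outside Docker (development mode), nothing waits for the databases on its behalf. The following sketch polls a TCP port until it accepts connections. `wait_for_port` is a hypothetical helper, not part of this project, and a successful TCP connection is a weaker signal than the Compose health checks above: it shows that the port is open, not that the database is ready to serve queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A completed connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(0.5)
    return False
```

For example, `wait_for_port("localhost", 7687)` and `wait_for_port("localhost", 5432)` would cover the Neo4j bolt port and the PostgreSQL port listed in the Quick Start section before `python main.py` is launched.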
Architecture
System Components
┌─────────────────┐
│   MCP Client    │
│  (Claude, etc)  │
└────────┬────────┘
         │ MCP Protocol
         ▼
┌─────────────────┐
│   MCP Server    │
│    (FastMCP)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   KG Service    │
│ (RAG-Anything)  │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌──────┐  ┌──────────┐
│Neo4j │  │PostgreSQL│
│Graph │  │+ pgvector│
└──────┘  └──────────┘
Data Flow
- Ingestion: Documents → Entity Extraction → Neo4j (graph) + PostgreSQL (vectors)
- Query: Query Text → Embedding → Vector Search + Graph Traversal → LLM Synthesis
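The query path can be pictured with a toy, in-memory sketch: rank chunks by cosine similarity to the query embedding, then expand the entities found in the top chunks by one hop in the graph. All data, names, and the `retrieve` function are invented for illustration; plain dictionaries stand in for pgvector and Neo4j, and this is not how RAG-Anything implements retrieval internally.

```python
import math

# Toy data standing in for pgvector rows and Neo4j edges (all values invented).
CHUNK_VECTORS = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.1, 0.9, 0.0],
    "chunk-c": [0.0, 0.2, 0.9],
}
CHUNK_ENTITIES = {
    "chunk-a": {"Neo4j"},
    "chunk-b": {"pgvector"},
    "chunk-c": {"FastMCP"},
}
GRAPH_EDGES = {
    "Neo4j": {"Cypher"},
    "pgvector": {"PostgreSQL"},
    "FastMCP": {"MCP"},
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vector, top_k=2):
    """Vector search for the top_k chunks, then one-hop graph expansion."""
    ranked = sorted(
        CHUNK_VECTORS,
        key=lambda chunk: cosine(query_vector, CHUNK_VECTORS[chunk]),
        reverse=True,
    )
    chunks = ranked[:top_k]
    entities = set()
    for chunk in chunks:
        for entity in CHUNK_ENTITIES[chunk]:
            entities.add(entity)
            entities |= GRAPH_EDGES.get(entity, set())
    return chunks, sorted(entities)


chunks, entities = retrieve([1.0, 0.2, 0.0])
print(chunks, entities)
```

In the real pipeline the retrieved chunks and graph context would then be passed to the LLM for the synthesis step.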
Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (pytest)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Development Guidelines
- Write tests for new features
- Follow PEP 8 style guidelines
- Update documentation as needed
- Keep commits atomic and well-described
FAQ
<details> <summary><b>How do I change the database passwords?</b></summary>
Update the passwords in both .env and docker-compose.yml. Make sure they match.
</details>
<details> <summary><b>Can I use a different embedding model?</b></summary>
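Yes. Set KG_EMBEDDING_MODEL and adjust KG_EMBEDDING_DIM in your .env file (the names used in the Configuration section) so that the dimension matches the model's output size, for example 3072 for text-embedding-3-large or 1536 for text-embedding-3-small.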
</details>
<details> <summary><b>How do I backup my data?</b></summary>
# Neo4j backup
docker exec rag-neo4j neo4j-admin database dump neo4j --to-path=/backups
# PostgreSQL backup
docker exec rag-postgres pg_dump -U postgres rag_anythink > backup.sql
</details>
<details> <summary><b>The server won't start - what do I do?</b></summary>
- Check Docker is running: docker ps
- Check service logs: docker-compose logs
- Verify environment variables in .env
- Ensure databases are healthy: docker-compose ps
</details>
License
This project is licensed under the MIT License - see the LICENSE file for details.