RAG-Anything MCP Server

Enables advanced RAG with knowledge graphs, supporting document ingestion, multimodal extraction, and multiple query modes (naive, local, global, hybrid) via the Model Context Protocol.

<div align="center">

RAG-Anything MCP Server

🚀 Model Context Protocol server for advanced RAG with Knowledge Graphs

License: MIT · Python 3.13+ · MCP

Knowledge Graph + Document Processing + Multimodal AI

</div>

📋 Overview

RAG-Anything MCP Server is a production-ready Model Context Protocol (MCP) server that combines:

  • 🧠 Knowledge Graph Queries - Multiple query modes (naive, local, global, hybrid, mix, bypass)
  • 📄 Document Ingestion - Text and PDF processing with multimodal content extraction
  • 🔍 Entity Extraction - Automatic entity and relationship extraction from documents
  • 💾 Hybrid Storage - Neo4j (graph) + PostgreSQL with pgvector (vectors)
  • 🖼️ Multimodal Support - Process images, tables, and equations from PDFs
  • 📡 MCP Compliant - Standard Model Context Protocol implementation

✨ Features

Core Capabilities

| Feature | Description |
| --- | --- |
| 📄 Document Ingestion | Text and PDF processing with multimodal content extraction |
| 🧠 Knowledge Graph Queries | Multiple query modes (naive, local, global, hybrid, mix, bypass) |
| 🔍 Entity Extraction | Automatic entity and relationship extraction from documents |
| 💾 Hybrid Storage | Neo4j (graph) + PostgreSQL with pgvector (vectors) |
| 📡 MCP Compliant | Standard Model Context Protocol implementation |
| 🖼️ Multimodal Support | Process images, tables, and equations from PDFs |

Query Modes

  • naive - Basic vector search over text chunks, without the knowledge graph
  • local - Local entity-based search
  • global - Global community-based search
  • hybrid - Combines local and global (recommended)
  • mix - Combines knowledge-graph retrieval with vector retrieval
  • bypass - Direct LLM query without graph
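The mode names above form a small closed set, so they are easy to validate before a query is sent. A minimal sketch, assuming a hypothetical `resolve_mode` helper (not part of this project) that falls back to the recommended hybrid mode when no mode is given:

```python
from enum import Enum
from typing import Optional


class QueryMode(str, Enum):
    """The six query modes listed in this README."""

    NAIVE = "naive"
    LOCAL = "local"
    GLOBAL = "global"
    HYBRID = "hybrid"
    MIX = "mix"
    BYPASS = "bypass"


def resolve_mode(
    requested: Optional[str], default: QueryMode = QueryMode.HYBRID
) -> QueryMode:
    """Hypothetical helper: normalize a mode name, or use the default if none is given."""
    if requested is None or not requested.strip():
        return default
    try:
        return QueryMode(requested.strip().lower())
    except ValueError:
        valid = ", ".join(m.value for m in QueryMode)
        raise ValueError(
            f"Unknown query mode {requested!r}; expected one of: {valid}"
        ) from None
```

Rejecting unknown names early gives a clearer error than passing a typo through to the server.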

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose (for database services)
  • Python 3.13+
  • OpenAI API Key

Option 1: All-in-One Docker (Recommended)

Start everything with Docker Compose:

# Clone the repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp

# Copy environment template
cp .env.example .env

# Edit .env and set your OpenAI API key
# OPENAI_API_KEY=sk-...

# Start all services (Neo4j + PostgreSQL + MCP Server)
docker-compose up -d

# View logs
docker-compose logs -f rag-mcp

# Stop services
docker-compose down

This will start:

  • Neo4j 5.23 on bolt://localhost:7687 (HTTP UI on http://localhost:7474)
  • PostgreSQL 16 + pgvector on localhost:5432
  • MCP Server on http://localhost:8000
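To confirm that the ports listed above are accepting connections after `docker-compose up -d`, a small standard-library check can help. This is only a sketch: the service names in `SERVICES` are labels chosen for this example, and an open TCP port shows that something is listening, not that the service is fully healthy.

```python
import socket

# Host/port pairs taken from the list above; adjust them if your
# docker-compose.yml or .env maps the services differently.
SERVICES = {
    "neo4j-bolt": ("localhost", 7687),
    "neo4j-http": ("localhost", 7474),
    "postgres": ("localhost", 5432),
    "mcp-server": ("localhost", 8000),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=SERVICES) -> dict:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:12s} {'up' if ok else 'DOWN'}")
```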

Option 2: Development Mode

Run the MCP server locally while databases run in Docker:

Windows:

# One-click startup - handles everything automatically
start-dev.bat

Linux/Mac:

# Make executable and run
chmod +x start-dev.sh
./start-dev.sh

Option 3: Manual Setup

# 1. Start Docker services (databases only)
docker-compose up -d neo4j postgres

# 2. Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -e .

# 4. Run the server
python main.py

βš™οΈ Configuration

Create a .env file in the project root:

# =====================
# Neo4j Configuration
# =====================
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_neo4j_password

# =====================
# PostgreSQL Configuration (for RAG)
# =====================
RAG_DB_HOST=localhost
RAG_DB_PORT=5432
RAG_DB_NAME=rag_anythink
RAG_DB_USER=postgres
RAG_DB_PASSWORD=your_postgres_password

# =====================
# OpenAI Configuration
# =====================
OPENAI_API_KEY=your_openai_api_key

# =====================
# Document Processing
# =====================
# Parser: mineru, docling
KG_PARSER=mineru
# Parse method: auto, ocr, txt
KG_PARSE_METHOD=auto
# Enable image extraction from PDFs
KG_ENABLE_IMAGE=true
# Enable table extraction from PDFs
KG_ENABLE_TABLE=true
# Enable equation extraction from PDFs
KG_ENABLE_EQUATION=true

# =====================
# RAG Configuration
# =====================
# Working directory for RAG output
KG_WORKING_DIR=./rag_output
# Workspace name (production, development, etc.)
KG_WORKSPACE=production
# Context window for LLM (pages before/after for context)
KG_CONTEXT_WINDOW=1
# Maximum concurrent files for processing
KG_MAX_CONCURRENT_FILES=4
# Embedding dimension (depends on model)
KG_EMBEDDING_DIM=3072
# Maximum token size for embeddings
KG_MAX_TOKEN_SIZE=8192
# LLM model for knowledge graph operations
KG_LLM_MODEL=gpt-4o-mini
# Vision model for multimodal processing
KG_VISION_MODEL=gpt-4o
# Embedding model
KG_EMBEDDING_MODEL=text-embedding-3-large
# Default query mode (naive, local, global, hybrid, mix, bypass)
KG_DEFAULT_MODE=hybrid

# =====================
# MCP Configuration
# =====================
# MCP server name
RAG_MCP_NAME=rag-anything-mcp
# MCP server version
RAG_MCP_VERSION=1.0.0
# MCP server host
RAG_MCP_HOST=localhost
# MCP server port
RAG_MCP_PORT=8055
# MCP log level
RAG_MCP_LOG=info
# MCP transport protocol: stdio, http, sse, streamable-http
RAG_MCP_TRANSPORT=streamable-http
# Application log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=INFO
# Application log format: json, text
LOG_FORMAT=json
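The embedding model and embedding dimension must agree, which is easy to get wrong when editing `.env` by hand. The sketch below reads a few of the `KG_*` variables and checks that pairing. It is an illustration only: `KGSettings` and `load_settings` are hypothetical names, the fallback values are simply the ones shown in the sample above, and the dimension table assumes the OpenAI models are used at their default output size.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Default output dimensions of common OpenAI embedding models.
KNOWN_EMBEDDING_DIMS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}


@dataclass(frozen=True)
class KGSettings:
    """Hypothetical settings holder; fields mirror some KG_* variables above."""

    embedding_model: str
    embedding_dim: int
    default_mode: str
    max_concurrent_files: int
    enable_image: bool


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as 'true' or '1'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> KGSettings:
    """Read settings from the environment, falling back to the sample values."""
    settings = KGSettings(
        embedding_model=env.get("KG_EMBEDDING_MODEL", "text-embedding-3-large"),
        embedding_dim=int(env.get("KG_EMBEDDING_DIM", "3072")),
        default_mode=env.get("KG_DEFAULT_MODE", "hybrid"),
        max_concurrent_files=int(env.get("KG_MAX_CONCURRENT_FILES", "4")),
        enable_image=_as_bool(env.get("KG_ENABLE_IMAGE", "true")),
    )
    expected = KNOWN_EMBEDDING_DIMS.get(settings.embedding_model)
    if expected is not None and expected != settings.embedding_dim:
        raise ValueError(
            f"KG_EMBEDDING_DIM={settings.embedding_dim} does not match the default "
            f"dimension {expected} of {settings.embedding_model}"
        )
    return settings
```

Models missing from the table are accepted without a dimension check, since their output size is not known to this sketch.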

🔌 MCP Tools

The server provides the following MCP tools:

| Tool | Description |
| --- | --- |
| ingest_document | Ingest text or PDF documents |
| query_knowledge_graph | Query with multiple modes (naive, local, global, hybrid) |
| query_multimodal | Query with images, tables, equations |
| process_document_file | Process PDF files with multimodal extraction |
| insert_content_list | Insert pre-parsed content |
| delete_data | Delete documents by ID |
| get_graph_statistics | Get graph statistics |
| get_config_info | Get configuration info |
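MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a request so its shape is visible. The argument names `query` and `mode` are assumptions, not taken from this server; the actual input schema for each tool is returned by a `tools/list` request. In practice an MCP client library also performs the initialization handshake and transport handling that are omitted here.

```python
import json
from itertools import count
from typing import Any, Dict

# Monotonic request ids, as JSON-RPC expects each request to carry a unique id.
_request_ids = count(1)


def build_tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "query_knowledge_graph",
    # Hypothetical argument names; check the tool's input schema via `tools/list`.
    {"query": "What are the main topics?", "mode": "hybrid"},
)
print(json.dumps(request, indent=2))
```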

📚 Usage Examples

Python Client

import asyncio

from src.services.kg_service import KGService


async def main() -> None:
    # Initialize service
    service = KGService()
    await service.initialize()

    # Ingest a document
    result = await service.ingest_text(
        text="Your document text here...",
        metadata={"title": "My Document"},
    )
    print(result)

    # Query the knowledge graph
    response = await service.query(
        query_text="What are the main topics?",
        mode="hybrid",
    )
    print(response)


# `await` is only valid inside a coroutine, so run the example through asyncio
asyncio.run(main())

MCP Client (Claude Desktop)

Add to your Claude Desktop MCP config:

{
  "mcpServers": {
    "rag-anything": {
      "command": "docker-compose",
      "args": ["up", "rag-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-api-key"
      }
    }
  }
}

πŸ—‚οΈ Project Structure

rag-anythink-mcp/
├── src/
│   ├── config/         # Pydantic configuration
│   ├── core/           # Interfaces and models
│   ├── database/       # Database connections (Neo4j, PostgreSQL)
│   │   └── kg/         # Knowledge Graph layer
│   ├── mcp/            # MCP servers
│   │   └── kg/         # RAG MCP server
│   ├── services/       # Business logic
│   ├── utils/          # Utilities
│   └── llm.py          # LLM clients
├── main.py             # Entry point
├── Dockerfile          # Docker image for MCP server
├── docker-compose.yml  # Multi-container orchestration
├── pyproject.toml      # Dependencies (uv)
├── start-dev.bat       # Windows dev startup
├── start-dev.sh        # Linux/Mac dev startup
└── README.md

πŸ› οΈ Development

Setup Development Environment

# Clone repository
git clone https://github.com/serkanyasr/rag-anythink-mcp.git
cd rag-anythink-mcp

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode with all extras
pip install -e ".[full]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_kg_service.py

Code Quality

# Format code
ruff format src/

# Check linting
ruff check src/

# Type checking
pyright src/

🐳 Docker Deployment

Full Stack Deployment

# Deploy all services
docker-compose up -d

# Check service health
docker-compose ps

# View logs
docker-compose logs -f

# Stop all services
docker-compose down

# Stop and remove volumes (clean slate)
docker-compose down -v

Individual Services

# Only databases (for local dev)
docker-compose up -d neo4j postgres

# Only MCP server (databases must be running)
docker-compose up -d rag-mcp

Health Checks

The services include built-in health checks:

  • Neo4j: Cypher-shell connectivity test
  • PostgreSQL: pg_isready checks
  • MCP Server: Depends on healthy databases

📖 Architecture

System Components

┌─────────────────┐
│   MCP Client    │
│  (Claude, etc)  │
└────────┬────────┘
         │ MCP Protocol
         ▼
┌─────────────────┐
│  MCP Server     │
│  (FastMCP)      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  KG Service     │
│  (RAG-Anything) │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌──────────┐
│ Neo4j │ │PostgreSQL│
│ Graph │ │+ pgvector│
└───────┘ └──────────┘

Data Flow

  1. Ingestion: Documents → Entity Extraction → Neo4j (graph) + PostgreSQL (vectors)
  2. Query: Query Text → Embedding → Vector Search + Graph Traversal → LLM Synthesis
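The retrieval part of the query flow can be illustrated with a toy, in-memory version. This is not the server's implementation: the vectors, chunk ids, entity names, and helper functions below are all made up for illustration, a plain dictionary stands in for each store, and the embedding and LLM synthesis steps are left out.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Toy stand-ins for the two stores: chunk vectors with the entities they mention
# (the role of PostgreSQL + pgvector) and entity relationships (the role of Neo4j).
CHUNKS: Dict[str, Tuple[List[float], List[str]]] = {
    "chunk-1": ([1.0, 0.0, 0.0], ["Neo4j"]),
    "chunk-2": ([0.0, 1.0, 0.0], ["pgvector"]),
    "chunk-3": ([0.7, 0.7, 0.0], ["Neo4j", "pgvector"]),
}
GRAPH: Dict[str, List[str]] = {
    "Neo4j": ["Cypher"],
    "pgvector": ["PostgreSQL"],
}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec: Sequence[float], k: int = 2) -> List[str]:
    """Return the ids of the k chunks most similar to the query vector."""
    ranked = sorted(
        CHUNKS, key=lambda cid: cosine(query_vec, CHUNKS[cid][0]), reverse=True
    )
    return ranked[:k]


def expand_entities(chunk_ids: Sequence[str]) -> List[str]:
    """Collect entities mentioned in the chunks plus their one-hop graph neighbors."""
    seen: List[str] = []
    for cid in chunk_ids:
        for entity in CHUNKS[cid][1]:
            for name in [entity, *GRAPH.get(entity, [])]:
                if name not in seen:
                    seen.append(name)
    return seen


def build_context(query_vec: Sequence[float], k: int = 2) -> dict:
    """Vector search followed by graph expansion; an LLM would consume the result."""
    chunk_ids = vector_search(query_vec, k)
    return {"chunks": chunk_ids, "entities": expand_entities(chunk_ids)}
```

Calling `build_context([1.0, 0.0, 0.0])` selects `chunk-1` and `chunk-3` and expands them to the entities `Neo4j`, `Cypher`, `pgvector`, and `PostgreSQL`, which is the kind of combined context the synthesis step would receive.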

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests (pytest)
  5. Commit your changes (git commit -m 'Add some amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Development Guidelines

  • Write tests for new features
  • Follow PEP 8 style guidelines
  • Update documentation as needed
  • Keep commits atomic and well-described

❓ FAQ

<details> <summary><b>How do I change the database passwords?</b></summary>

Update the passwords in both .env and docker-compose.yml. Make sure they match. </details>

<details> <summary><b>Can I use a different embedding model?</b></summary>

Yes! Set KG_EMBEDDING_MODEL and adjust KG_EMBEDDING_DIM in your .env file so that it matches the new model's output dimension. </details>

<details> <summary><b>How do I backup my data?</b></summary>

# Neo4j backup
docker exec rag-neo4j neo4j-admin database dump neo4j --to-path=/backups

# PostgreSQL backup
docker exec rag-postgres pg_dump -U postgres rag_anythink > backup.sql

</details>

<details> <summary><b>The server won't start - what do I do?</b></summary>

  1. Check Docker is running: docker ps
  2. Check service logs: docker-compose logs
  3. Verify environment variables in .env
  4. Ensure databases are healthy: docker-compose ps </details>

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

