MCP RAG Server

Enables AI assistants to search and retrieve information from your knowledge base using RAG (Retrieval-Augmented Generation) with hybrid search, document indexing, and ChromaDB vector storage.

An MCP (Model Context Protocol) server that exposes RAG capabilities to Claude Code and other MCP clients.

This is a standalone extraction from my production portfolio site. See it in action at danmonteiro.com.


The Problem

You're using Claude Code, but:

  • No access to your documents — Claude can't search your knowledge base
  • Context is manual — you're copy-pasting relevant docs into prompts
  • RAG is disconnected — your vector database isn't accessible to AI tools
  • Integration is custom — every project builds its own RAG bridge

The Solution

MCP RAG Server provides:

  • Standard MCP interface — works with Claude Code, Claude Desktop, and any MCP client
  • Full RAG pipeline — hybrid search, query expansion, semantic chunking built-in
  • Simple tools — rag_query, rag_search, index_document, get_stats
  • Zero config — point at ChromaDB and go

# In Claude Code, after configuring the server:
"Search my knowledge base for articles about RAG architecture"
# Claude automatically uses rag_query tool and gets relevant context

Results

From production usage:

| Without MCP RAG | With MCP RAG |
|---|---|
| Manual context copy-paste | Automatic retrieval |
| No document search | Hybrid search built-in |
| Static knowledge | Live vector database |
| Custom integration per project | Standard MCP protocol |

Design Philosophy

Why MCP?

MCP (Model Context Protocol) standardizes how AI applications connect to external tools:

┌──────────────┐     MCP Protocol     ┌──────────────┐
│  MCP Client  │◀────────────────────▶│  MCP Server  │
│ (Claude Code)│                      │ (This repo)  │
└──────────────┘                      └──────┬───────┘
                                             │
                                      ┌──────▼───────┐
                                      │ RAG Pipeline │
                                      │  (ChromaDB)  │
                                      └──────────────┘

Instead of building custom integrations, MCP provides a universal interface that any MCP-compatible client can use.
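Under the hood, an MCP client invokes a server tool with a JSON-RPC 2.0 `tools/call` request. A minimal sketch of the message a client such as Claude Code sends over the stdio transport (the `rag_query` name and arguments come from the tools listed below; the envelope shape follows the MCP spec, simplified here for illustration):

```typescript
// Sketch of the JSON-RPC 2.0 envelope an MCP client sends to call a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;                       // tool to invoke, e.g. "rag_query"
    arguments: Record<string, unknown>; // tool-specific input
  };
}

const request: ToolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "rag_query",
    arguments: { question: "What is hybrid search?", topK: 5 },
  },
};

// Over stdio, each message is one line of JSON on stdin/stdout.
console.log(JSON.stringify(request));
```

Because the envelope is standardized, the same request works against any MCP server; only the tool name and arguments differ.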

Tools Exposed

| Tool | Description |
|------|-------------|
| `rag_query` | Query with hybrid search; returns formatted context |
| `rag_search` | Raw similarity search; returns chunks with scores |
| `index_document` | Add a single document |
| `index_documents_batch` | Batch-index multiple documents |
| `delete_by_source` | Delete all docs from a source |
| `get_stats` | Collection statistics |
| `clear_collection` | Clear all data (requires confirmation) |

Quick Start

1. Prerequisites

# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma

# Set OpenAI API key (for embeddings)
export OPENAI_API_KEY="sk-..."

2. Install & Build

git clone https://github.com/0xrdan/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build

3. Configure Claude Code

Add to your Claude Code MCP configuration (~/.claude/mcp.json or project .mcp.json):

{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "CHROMA_URL": "http://localhost:8000",
        "CHROMA_COLLECTION": "my_knowledge_base"
      }
    }
  }
}

4. Use in Claude Code

# Restart Claude Code to load the server
claude

# Now Claude has access to RAG tools:
"Index this document into my knowledge base: [paste content]"
"Search for information about transformer architectures"
"What do my docs say about error handling?"

API Reference

rag_query

Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.

// Input
{
  question: string;      // Required: the question to search for
  topK?: number;         // Optional: number of results (default: 5)
  threshold?: number;    // Optional: min similarity 0-1 (default: 0.5)
  filters?: object;      // Optional: metadata filters
}

// Output
{
  context: string;       // Formatted context for LLM
  chunks: [{
    content: string;
    score: number;
    metadata: object;
  }];
  stats: {
    totalChunks: number;
    avgSimilarity: number;
  };
}
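The shapes above translate directly into TypeScript interfaces, and the `context` string is assembled from the retrieved chunks. A sketch of that assembly step (the `formatContext` helper is illustrative, not the server's actual implementation):

```typescript
interface Chunk {
  content: string;
  score: number;                      // similarity in [0, 1]
  metadata: Record<string, unknown>;
}

interface RagQueryResult {
  context: string;                    // formatted context for the LLM
  chunks: Chunk[];
  stats: { totalChunks: number; avgSimilarity: number };
}

// Illustrative helper: join chunks into a prompt-ready context block.
function formatContext(chunks: Chunk[]): RagQueryResult {
  const context = chunks
    .map((c, i) => `[${i + 1}] (score ${c.score.toFixed(2)})\n${c.content}`)
    .join("\n\n");
  const avg =
    chunks.reduce((sum, c) => sum + c.score, 0) / Math.max(chunks.length, 1);
  return {
    context,
    chunks,
    stats: { totalChunks: chunks.length, avgSimilarity: avg },
  };
}
```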

rag_search

Raw similarity search without context formatting.

// Input
{
  query: string;         // Required: search query
  topK?: number;         // Optional: number of results (default: 10)
  filters?: object;      // Optional: metadata filters
}

// Output: Array of chunks with scores
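Hybrid search fuses a dense (vector) ranking with a sparse (keyword) ranking. One common fusion method is reciprocal rank fusion (RRF); a minimal sketch, shown for intuition and not necessarily the fusion method this server uses:

```typescript
// Reciprocal Rank Fusion: combine two ranked lists of document IDs.
// score(d) = sum over lists of 1 / (k + rank(d)); k = 60 is conventional.
function rrfFuse(dense: string[], sparse: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [dense, sparse]) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Documents that rank well in either list surface near the top, which is why hybrid search tolerates both vocabulary mismatch (dense wins) and rare exact terms (sparse wins).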

index_document

Add a document to the knowledge base.

// Input
{
  id: string;            // Required: unique identifier
  title: string;         // Required: document title
  content: string;       // Required: document content
  source: string;        // Required: source identifier
  category?: string;     // Optional: category
  tags?: string[];       // Optional: tags array
}

// Output
{
  success: boolean;
  documentId: string;
  chunksIndexed: number;
}
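`chunksIndexed` reflects how the content was split before embedding. The README mentions semantic chunking; a simplified paragraph-packing sketch (the real pipeline's strategy may differ):

```typescript
// Naive chunker: split on blank lines, then pack paragraphs into chunks
// of at most maxChars characters. A stand-in for the real semantic chunker.
function chunkDocument(content: string, maxChars = 1000): string[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter(Boolean);
  const chunks: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    // +2 accounts for the "\n\n" joiner between paragraphs.
    if (current && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? `${current}\n\n${p}` : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```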

get_stats

Get collection statistics.

// Output
{
  totalChunks: number;
  totalDocuments: number;
  // ... other stats from RAG pipeline
}

Configuration

Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `OPENAI_API_KEY` | Yes | - | OpenAI API key for embeddings |
| `CHROMA_URL` | No | `http://localhost:8000` | ChromaDB URL |
| `CHROMA_COLLECTION` | No | `mcp_knowledge_base` | Collection name |
| `EMBEDDING_MODEL` | No | `text-embedding-3-large` | Embedding model |
| `EMBEDDING_DIMENSIONS` | No | Native | Reduced embedding dimensions |
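In code, that table amounts to env-var lookups with defaults and one required key. A sketch (variable names from the table; the server's actual loader may differ):

```typescript
interface RagServerConfig {
  openaiApiKey: string;
  chromaUrl: string;
  chromaCollection: string;
  embeddingModel: string;
  embeddingDimensions?: number; // undefined = model-native dimensions
}

// Hypothetical loader mirroring the environment-variable table above.
function loadConfig(env: Record<string, string | undefined>): RagServerConfig {
  const key = env.OPENAI_API_KEY;
  if (!key) throw new Error("OPENAI_API_KEY is required");
  const dims = env.EMBEDDING_DIMENSIONS;
  return {
    openaiApiKey: key,
    chromaUrl: env.CHROMA_URL ?? "http://localhost:8000",
    chromaCollection: env.CHROMA_COLLECTION ?? "mcp_knowledge_base",
    embeddingModel: env.EMBEDDING_MODEL ?? "text-embedding-3-large",
    embeddingDimensions: dims ? Number(dims) : undefined,
  };
}
```

Failing fast on the missing API key keeps misconfiguration errors at startup rather than at the first embedding call.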

Project Structure

mcp-rag-server/
├── src/
│   ├── server.ts        # Main MCP server implementation
│   └── index.ts         # Exports
├── mcp-config.example.json  # Example Claude Code configuration
├── package.json
└── README.md

Advanced Usage

Programmatic Server Creation

import { createServer } from 'mcp-rag-server';

const server = await createServer({
  vectorDB: {
    host: 'http://custom-chroma:8000',
    collectionName: 'my_collection',
  },
  rag: {
    topK: 10,
    enableHybridSearch: true,
  },
});

Using with Claude Desktop

The same configuration works with Claude Desktop's MCP support:

// ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"]
    }
  }
}

Part of the Context Continuity Stack

This repo exposes context continuity as a protocol-level capability — giving any MCP client access to persistent semantic memory.

| Layer | Role | This Repo |
|-------|------|-----------|
| Intra-session | Short-term memory | |
| Document-scoped | Injected content | |
| Retrieved | Long-term semantic memory via MCP | mcp-rag-server |
| Progressive | Staged responses | |

MCP RAG Server bridges the gap between vector databases and AI assistants. Instead of building custom integrations, any MCP-compatible tool (Claude Code, Claude Desktop, custom clients) gets instant access to your knowledge base.

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feat/add-new-tool)
  3. Make changes with semantic commits
  4. Open a PR with clear description

License

MIT License - see LICENSE for details.


Acknowledgments

Built with Claude Code.

Co-Authored-By: Claude <noreply@anthropic.com>
