RAG MCP Gateway

A smart proxy server for the Model Context Protocol (MCP) that aggregates multiple downstream MCP servers and provides Natural Language Search capabilities over their tools.

The gateway acts as a single entry point for an MCP client (such as Claude Desktop or an agent), letting it discover and use tools from many connected servers through semantic queries instead of exact name matching.

Architecture

The system is built around a central gateway with three modular internal components: a Connection Manager, an Indexer, and a Retriever.

graph TD
    Client[MCP Client] <-->|Stdio| Gateway[RAG MCP Gateway]
    
    subgraph "Internal Components"
        Gateway --> ConnectionManager
        Gateway --> Indexer
        Gateway --> Retriever
        
        subgraph "Indexing Pipeline"
            Indexer --> Discovery[Tool Discovery]
            Discovery --> Enrichment[LLM Enrichment]
            Enrichment --> Embedding[Vector Embedding]
            Embedding --> Orama[(Orama DB)]
            Enrichment --> Gemini[Google Gemini API]
        end
        
        subgraph "Retrieval Pipeline"
            Retriever --> Search[Parallel Dense/Sparse Search]
            Search --> RRF[RRF Fusion]
            RRF --> Rerank[Cross-Encoder Reranking]
            Rerank --> Model[Transformers.js]
            Search --> Orama
        end
    end
    
    subgraph "Downstream Servers"
        ConnectionManager <-->|Stdio| ServerA[Local Process]
        ConnectionManager <-->|SSE / HTTP| ServerB[Remote Server]
        ConnectionManager <-->|Docker| ServerC[Containerized Tool]
    end

Key Components

- Connection Manager: Handles persistent connections to multiple downstream MCP servers.
  - Transports: Supports Stdio, SSE, and Streamable-HTTP.
  - Docker Integration: Manages the lifecycle of Docker-based servers, including automatic container cleanup (stop and rm) before startup to avoid name conflicts.
- Indexer: Synchronizes the local index with downstream servers.
  - Tool Discovery: Polls listTools on all connected clients.
  - Enrichment: Uses Google Gemini to generate human-readable summaries and likely search questions for each tool, which gives the search more text to match against.
  - Smart Sync: Re-indexes only tools whose name, description, or schema has changed.
- Vector Store (Orama): An in-memory JavaScript vector database that persists to JSON. It handles both vector (dense) and full-text (sparse) indexing.
- Retriever: Implements a three-stage search pipeline:
  - Hybrid Search: Runs vector search and keyword search in parallel.
  - RRF Fusion: Combines both result lists using Reciprocal Rank Fusion to balance semantic and exact matches.
  - Reranking: A second-stage Cross-Encoder (a HuggingFace model run via Transformers.js) rescores the fused candidates against each tool's technical schema so that the most relevant tool ranks first.
- LLM Service: Provides the metadata enrichment used by the Indexer, so that even minimally documented tools can be found through natural language queries.
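
To make the RRF Fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of tool names. It is an illustration, not the gateway's code: the function name `rrfFuse`, the sample tool names, and the constant `k = 60` (a commonly used value) are all assumptions.

```typescript
// Reciprocal Rank Fusion (RRF): every result list adds 1 / (k + rank) to an
// item's score, so items ranked well by both searches rise to the top.
// ASSUMPTION: k = 60 is a commonly used constant, not a documented gateway setting.
function rrfFuse(lists: string[][], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score
  );
}

// Hypothetical ranked tool names from the dense and sparse searches.
const denseHits = ["weather_get_current", "weather_get_forecast", "fs_read_file"];
const sparseHits = ["weather_get_current", "geo_lookup", "weather_get_forecast"];
const fused = rrfFuse([denseHits, sparseHits]);
console.log(fused.map((hit) => hit.id));
```

Because RRF uses only ranks, the dense and sparse scores never need to share a scale, which is the usual reason for choosing it over a weighted score sum.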

Prerequisites

Node.js: v18 or higher
NPM: v9 or higher
Gemini API Key (Optional but Recommended): Used to generate better tool descriptions and search queries. Keys are available from Google AI Studio.

Installation

Clone the repository:
```
git clone <repository-url>
cd rag-mcp
```
Install dependencies:
```
npm install
```
Build the project:
```
npm run build
```

Configuration

The gateway is configured using a config.json file in the root directory. You can copy the example file to start:

cp config.example.json config.json

`config.json` Structure

Define your downstream servers in the mcpServers object:

{
  "mcpServers": {
    "weather": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-weather"]
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./allowed-dir"]
    },
    "remote-server": {
      "transport": "sse",
      "url": "http://localhost:3000/sse"
    }
  }
}
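
The example above mixes two kinds of entries: process-based servers (with `command` and `args`) and remote servers (with `transport` and `url`). The sketch below shows one way to tell them apart. The type names and `describeServer` are hypothetical and inferred only from the example; the gateway's real schema may accept more fields.

```typescript
// ASSUMPTION: these shapes are inferred from the example config only.
interface StdioServer {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}
interface RemoteServer {
  transport: string; // "sse" in the example; other values are not documented here
  url: string;
}
type ServerEntry = StdioServer | RemoteServer;
interface GatewayConfig {
  mcpServers: Record<string, ServerEntry>;
}

// Summarize how an entry would be reached.
function describeServer(id: string, entry: ServerEntry): string {
  if ("command" in entry) {
    return `${id}: stdio -> ${[entry.command, ...(entry.args ?? [])].join(" ")}`;
  }
  return `${id}: ${entry.transport} -> ${entry.url}`;
}

const exampleConfig: GatewayConfig = JSON.parse(`{
  "mcpServers": {
    "weather": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-weather"] },
    "remote-server": { "transport": "sse", "url": "http://localhost:3000/sse" }
  }
}`);

const serverSummary = Object.keys(exampleConfig.mcpServers).map((id) =>
  describeServer(id, exampleConfig.mcpServers[id])
);
console.log(serverSummary.join("\n"));
```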

Environment Variables

You can configure the gateway using the following environment variables. These can be set in your OS or passed via the env object in your Claude Desktop configuration.

| Variable | Description | Default |
| --- | --- | --- |
| `GEMINI_API_KEY` | Required for Enrichment. API Key for Google Generative AI. | - |
| `RAG_MCP_BASE_DIR` | Base directory for all relative paths. | `process.cwd()` |
| `RAG_MCP_CONFIG_PATH` | Path to the downstream servers config file. | `BASE_DIR/config.json` |
| `RAG_MCP_DB_PATH` | Path to the Orama persistence folder. | `BASE_DIR/data/orama_db` |
| `RAG_MCP_LOG_PATH` | Path to the debug log file. | `BASE_DIR/rag-mcp.log` |
| `RAG_MCP_LOGGING_ENABLED` | Set to `true` to enable debug logging to the log file. | `false` |
| `RAG_MCP_REBUILD_INDEX` | Set to `true` to force a full re-index on every startup. | `false` |
| `RAG_MCP_SEARCH_THRESHOLD` | Minimum relevance score (0.0 to 1.0) for search results. | `0.85` |
| `RAG_MCP_EMBEDDING_MODEL` | Required if Dense enabled. Transformers.js model for generating vector embeddings. | - |
| `RAG_MCP_RERANKING_MODEL` | Required if Reranker enabled. Transformers.js model for second-stage reranking. | - |
| `RAG_MCP_GENERATIVE_MODEL` | Required if LLM enabled. Google Gemini model for tool enrichment. | - |
| `RAG_MCP_ENABLE_LLM` | Enable LLM enrichment (summaries and questions) during indexing. | `false` |
| `RAG_MCP_ENABLE_DENSE` | Enable semantic vector search (Dense retrieval). | `true` |
| `RAG_MCP_ENABLE_SPARSE` | Enable full-text keyword search (Sparse retrieval). | `true` |
| `RAG_MCP_ENABLE_RERANKER` | Enable the cross-encoder reranking stage. | `true` |
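
The "Required if ..." rows interact with the feature flags, so with the defaults above an embedding model and a reranking model must be set. The sketch below applies the documented defaults and lists which variables would be missing. The helper names are hypothetical, and treating only the exact string `true` as enabled is an assumption about how the gateway parses flags.

```typescript
type EnvMap = Record<string, string | undefined>;

interface FeatureFlags {
  enableLlm: boolean;
  enableDense: boolean;
  enableSparse: boolean;
  enableReranker: boolean;
  searchThreshold: number;
}

// ASSUMPTION: a flag is on only when its value is exactly "true".
function readFlag(env: EnvMap, key: string, fallback: boolean): boolean {
  const raw = env[key];
  return raw === undefined ? fallback : raw === "true";
}

// Apply the defaults from the table above.
function readFlags(env: EnvMap): FeatureFlags {
  const searchThreshold = Number(env.RAG_MCP_SEARCH_THRESHOLD ?? "0.85");
  if (!(searchThreshold >= 0 && searchThreshold <= 1)) {
    throw new Error("RAG_MCP_SEARCH_THRESHOLD must be between 0.0 and 1.0");
  }
  return {
    enableLlm: readFlag(env, "RAG_MCP_ENABLE_LLM", false),
    enableDense: readFlag(env, "RAG_MCP_ENABLE_DENSE", true),
    enableSparse: readFlag(env, "RAG_MCP_ENABLE_SPARSE", true),
    enableReranker: readFlag(env, "RAG_MCP_ENABLE_RERANKER", true),
    searchThreshold,
  };
}

// List the variables that the enabled features need but that are not set.
function missingRequired(env: EnvMap): string[] {
  const flags = readFlags(env);
  const required: Array<[boolean, string]> = [
    [flags.enableDense, "RAG_MCP_EMBEDDING_MODEL"],
    [flags.enableReranker, "RAG_MCP_RERANKING_MODEL"],
    [flags.enableLlm, "RAG_MCP_GENERATIVE_MODEL"],
    [flags.enableLlm, "GEMINI_API_KEY"],
  ];
  return required.filter(([on, key]) => on && !env[key]).map(([, key]) => key);
}

console.log(missingRequired({ RAG_MCP_ENABLE_LLM: "true" }));
```

In a real process the `env` argument would be `process.env`; it is passed in here so the sketch stays free of Node-specific globals.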

Usage

1. Running Locally (Development)

You can run the server directly using ts-node:

# Set your API key first (Windows PowerShell)
$env:GEMINI_API_KEY="your-key-here"
npm run dev

2. Connecting to Claude Desktop

To use this gateway with Claude Desktop, edit your config file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Add the gateway to the mcpServers list:

{
  "mcpServers": {
    "rag-gateway": {
      "command": "node",
      "args": ["C:/path/to/rag-mcp/dist/src/server.js"],
      "env": {
        "GEMINI_API_KEY": "your-key-here",
        "RAG_MCP_LOGGING_ENABLED": "true"
      }
    }
  }
}

Note: Always use absolute paths for the command and arguments when configuring Claude Desktop.

How it Works

Once connected, the Gateway exposes two primary tools to the client:

`search_tool(query: string, limit?: number)`

This is the discovery mechanism. The Agent should call this first when it doesn't know which tool to use.

Input: query: "I need to check the weather in London", limit: 3
Process: The gateway embeds the query, searches the vector database, reranks the results, and returns at most `limit` matching tool schemas (`limit` defaults to 10).
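
The last part of that process, cutting the reranked candidates down to what the client sees, can be sketched as follows. This assumes the `RAG_MCP_SEARCH_THRESHOLD` value is applied to the final score before `limit`; the README does not state where in the pipeline the threshold applies, and `selectResults` and the sample scores are hypothetical.

```typescript
interface ScoredTool {
  toolName: string;
  score: number; // relevance in the range 0.0 to 1.0
}

// Keep candidates at or above the threshold, best first, capped at `limit`.
// Defaults mirror the documented values: threshold 0.85, limit 10.
function selectResults(
  candidates: ScoredTool[],
  threshold = 0.85,
  limit = 10
): ScoredTool[] {
  return candidates
    .filter((candidate) => candidate.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Hypothetical reranker output for the weather query.
const reranked: ScoredTool[] = [
  { toolName: "weather_get_current", score: 0.97 },
  { toolName: "fs_read_file", score: 0.12 },
  { toolName: "weather_get_forecast", score: 0.91 },
  { toolName: "geo_lookup", score: 0.86 },
];
console.log(selectResults(reranked, 0.85, 3).map((tool) => tool.toolName));
```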

`execute_tool(tool_name: string, arguments: object)`

This is the execution mechanism.

Input: tool_name: "weather_get_current", arguments: { city: "London" }
Process: The gateway looks up which downstream server owns "weather_get_current" and proxies the request to it.
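
The lookup-and-proxy step can be pictured as a map from tool name to owning server. The sketch below is a simplified stand-in, not the gateway's implementation: `ToolRouter`, `CallTool`, and the fake weather server are hypothetical names introduced for illustration.

```typescript
// A downstream call, reduced to a plain async function for this sketch.
type CallTool = (toolName: string, args: Record<string, unknown>) => Promise<unknown>;

class ToolRouter {
  private owners = new Map<string, string>(); // tool name -> server id
  private servers = new Map<string, CallTool>(); // server id -> call function

  addServer(serverId: string, callTool: CallTool, toolNames: string[]): void {
    this.servers.set(serverId, callTool);
    for (const toolName of toolNames) this.owners.set(toolName, serverId);
  }

  ownerOf(toolName: string): string | undefined {
    return this.owners.get(toolName);
  }

  // Find the owning server and forward the call unchanged.
  async execute(toolName: string, args: Record<string, unknown>): Promise<unknown> {
    const serverId = this.owners.get(toolName);
    const callTool = serverId === undefined ? undefined : this.servers.get(serverId);
    if (callTool === undefined) throw new Error(`Unknown tool: ${toolName}`);
    return callTool(toolName, args);
  }
}

// Hypothetical downstream server that just echoes what it received.
const router = new ToolRouter();
router.addServer(
  "weather",
  async (toolName, args) => ({ server: "weather", toolName, args }),
  ["weather_get_current"]
);
router
  .execute("weather_get_current", { city: "London" })
  .then((reply) => console.log(reply));
```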

Testing & Development

This project includes a suite of verification scripts in the tests/ directory to validate different components without needing a full MCP client.

Running Tests

Use ts-node to run specific test scenarios:

Verify Gateway Logic: Simulates a client connecting to the gateway and running searches.
```
npx ts-node tests/verify_gateway.ts
```
Verify Index Synchronization: Checks if tools are correctly added, updated, or removed from the vector index when downstream servers change.
```
npx ts-node tests/verify_index_sync.ts
```
Verify Transports: Tests the Connection Manager's handling of Stdio and SSE connections.
```
npx ts-node tests/verify_transports.ts
```
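
The added/updated/removed behavior that `verify_index_sync.ts` checks, together with the Smart Sync rule from Key Components, can be pictured as a fingerprint comparison. This is a sketch under assumptions: the gateway's actual change detection is not documented here, and `ToolDef`, `fingerprint`, and `diffTools` are hypothetical names.

```typescript
interface ToolDef {
  name: string;
  description?: string;
  inputSchema?: unknown;
}

// Serialize with sorted object keys so key order does not affect the result.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// The fingerprint covers the three fields the README names.
function fingerprint(tool: ToolDef): string {
  return stableStringify([tool.name, tool.description ?? "", tool.inputSchema ?? null]);
}

// Compare stored fingerprints (keyed by tool name) with the live tool list.
function diffTools(indexed: Map<string, string>, live: ToolDef[]) {
  const added: string[] = [];
  const updated: string[] = [];
  const removed: string[] = [];
  const seen = new Set<string>();
  for (const tool of live) {
    seen.add(tool.name);
    const previous = indexed.get(tool.name);
    if (previous === undefined) added.push(tool.name);
    else if (previous !== fingerprint(tool)) updated.push(tool.name);
  }
  indexed.forEach((_stored, toolName) => {
    if (!seen.has(toolName)) removed.push(toolName);
  });
  return { added, updated, removed };
}
```

With tools keyed by name, a renamed tool shows up as one removal plus one addition rather than as an update.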

Debugging

Since the server communicates over Stdio, standard output (console.log) is reserved for the protocol, so diagnostic output is written to a log file instead.

Logs: Check rag-mcp.log in the project root (must enable RAG_MCP_LOGGING_ENABLED=true).
Errors: Critical errors are also logged to the file.
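
A logger that respects this constraint never touches standard output. The following is a minimal sketch with a pluggable sink; `createLogger` and the line format are hypothetical, and gating every level behind the enabled flag is an assumption rather than documented behavior.

```typescript
type LogSink = (line: string) => void;

// Returns a log function that writes formatted lines to `sink`,
// or does nothing when logging is disabled.
function createLogger(
  enabled: boolean,
  sink: LogSink,
  now: () => Date = () => new Date()
) {
  return (level: "debug" | "error", message: string): void => {
    if (!enabled) return;
    sink(`${now().toISOString()} [${level}] ${message}`);
  };
}

// Demo with an in-memory sink and a fixed clock so the output is stable.
const capturedLines: string[] = [];
const writeLog = createLogger(
  true,
  (line) => capturedLines.push(line),
  () => new Date(0)
);
writeLog("debug", "connected to weather");
writeLog("error", "remote-server unreachable");
```

In the gateway's setting the sink would append to the file named by `RAG_MCP_LOG_PATH`; writing to standard error instead would also leave the protocol stream on standard output untouched.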

Project Structure

rag-mcp/
├── src/
│   ├── server.ts             # Gateway Entry Point (Stdio Server)
│   ├── indexer.ts            # Tool Discovery & Enrichment Logic
│   ├── retriever.ts          # Hybrid Search & Reranking Pipeline
│   ├── connection_manager.ts # Transport Management (Stdio/SSE/Docker)
│   ├── vector_store.ts       # Orama DB Wrapper (Dense/Sparse)
│   ├── models.ts             # Transformers.js Model Management
│   └── llm.ts                # Gemini API Integration
├── data/                     # Local Database & Persistence
├── tests/                    # Verification Scripts
├── config.json               # Downstream Servers Configuration
└── rag-mcp.log               # Debug Logs (if enabled)

Security & Best Practices

API Keys: Avoid hardcoding GEMINI_API_KEY. Use an environment variable or a secure secret manager.
Environment Forwarding: When using the Stdio transport, the Gateway forwards process.env plus any specific env defined in config.json to the child process. Be mindful of sensitive variables.
Local Persistence: Orama data is stored as a plain JSON file in the ./data directory. Ensure this directory is protected.
Network Access: Transformers.js will attempt to download models from HuggingFace on the first run. Ensure your environment allows this or pre-download the models.
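
The Environment Forwarding point above is easier to reason about with the merge written out. In this sketch the per-server `env` from config.json overrides inherited values, which is an assumption about precedence. The `blocked` list is a hypothetical hardening option for keeping sensitive variables away from child processes; it is not a documented gateway feature.

```typescript
type ParentEnv = Record<string, string | undefined>;

// Merge the inherited environment with a server's own `env` block.
function buildChildEnv(
  parentEnv: ParentEnv,
  serverEnv: Record<string, string> = {},
  blocked: string[] = [] // hypothetical: variable names that are not forwarded
): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const key of Object.keys(parentEnv)) {
    const value = parentEnv[key];
    if (value !== undefined && blocked.indexOf(key) === -1) merged[key] = value;
  }
  // ASSUMPTION: per-server values win over inherited ones.
  for (const key of Object.keys(serverEnv)) merged[key] = serverEnv[key];
  return merged;
}

const childEnv = buildChildEnv(
  { PATH: "/usr/bin", GEMINI_API_KEY: "your-key-here", LANG: "C" },
  { LANG: "en_US.UTF-8", API_TOKEN: "server-specific" },
  ["GEMINI_API_KEY"]
);
console.log(Object.keys(childEnv).sort());
```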

Troubleshooting

"No tools found"

Verify that downstream servers in config.json are running and accessible.
Check rag-mcp.log for connection errors (ensure RAG_MCP_LOGGING_ENABLED=true).
Run the refresh_index() tool to force a rescan of the downstream servers.

"Vector search is inaccurate"

Enable RAG_MCP_ENABLE_LLM=true and provide a GEMINI_API_KEY. Tools with poor descriptions need LLM enrichment to be discoverable via natural language.
Adjust RAG_MCP_SEARCH_THRESHOLD. A lower value (e.g., 0.7) returns more candidates but may include irrelevant results.

"Docker errors"

Ensure the Docker daemon is running.
The Gateway attempts to stop and rm containers with the same serverId on startup to avoid name conflicts. Ensure the system user has permissions to execute these commands.
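
To check permissions by hand, it helps to know which commands are involved. The sketch below builds the `docker stop` and `docker rm` invocations for a given server, assuming the container is named exactly after the serverId as described above. `dockerCleanupCommands` is a hypothetical helper, and the name check is a conservative guard added for this sketch.

```typescript
// Build the cleanup commands as argument arrays (no shell string interpolation).
function dockerCleanupCommands(serverId: string): string[][] {
  // Conservative guard: letters, digits, underscore, dot, and hyphen only.
  if (!/^[a-zA-Z0-9][a-zA-Z0-9_.-]+$/.test(serverId)) {
    throw new Error(`Unsafe container name: ${serverId}`);
  }
  return [
    ["docker", "stop", serverId],
    ["docker", "rm", serverId],
  ];
}

for (const argv of dockerCleanupCommands("containerized-tool")) {
  console.log(argv.join(" "));
}
```

Running the two printed commands as the same system user that launches the gateway shows quickly whether that user can reach the Docker daemon.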

"Model download failed"

If you deploy in an air-gapped environment, pre-cache the models in the ~/.cache/huggingface (or equivalent) directory before the first run.

License & Credits

Project License

The source code for RAG MCP Gateway is licensed under the ISC License.

Third-Party Licenses & Terms

This project uses several models and libraries that are subject to their own licenses:

Inference Engine: Transformers.js is licensed under the Apache License 2.0.
Vector Database: Orama is licensed under the Apache License 2.0.
