rag

A CLI tool and MCP server that turns markdown documentation into a searchable, queryable knowledge base.

It chunks .md files by heading, embeds them via Ollama, stores vectors in LanceDB, and exposes search + RAG through both a terminal CLI and MCP.

Prerequisites

Bun runtime
Ollama running locally with an embedding model (auto-pulled if missing)

Minimum hardware

Component	Requirement
RAM	4 GB (8 GB for larger doc sets)
CPU	Any x86-64 or ARM64, 2+ cores
GPU	Optional. Any NVIDIA GPU with 2+ GB VRAM. CPU-only fallback is functional but slower
Disk	100 MB for index (scales with doc count)

Indexing 5,000 chunks takes roughly 25 s on an RTX 3060 and roughly 3 min on CPU only.

Install

git clone https://github.com/FrameMuse/llm-rag.git
cd llm-rag
bun install

Add a shell alias:

alias rag='bun /path/to/llm-rag/scripts/cli.ts'

Quick start

cd my-docs-project
rag init              # create .rag/ project scope
rag index             # chunk, embed, index all .md files
rag mcp search "..."  # semantic search
rag mcp query "..."   # RAG: synthesize answer from docs

Commands

Command	Description
`rag init`	Create .rag/ config, mcp.json, .gitignore
`rag index`	Chunk files by heading, embed via Ollama, store in LanceDB
`rag serve`	Start MCP server (STDIO) for current .rag/ scope
`rag mcp <tool>`	One-shot CLI proxy for MCP tools
`rag info`	Show index statistics
`rag help`	Show usage

rag mcp tools

Tool	Usage	Description
`search`	`rag mcp search "query" [--limit N]`	Semantic vector search
`query`	`rag mcp query "question"`	RAG: retrieve chunks, synthesize answer
`list-documents`	`rag mcp list-documents`	List all indexed files
`get-document`	`rag mcp get-document <path>`	Show full document content
`config`	`rag mcp config`	Print mcp.json for opencode.json adoption

Project scope (.rag/)

project/
├── .rag/
│   ├── config.json       # { name, embedModel, ragModel, pattern }
│   ├── mcp.json          # MCP config snippet for opencode.json
│   ├── .gitignore        # *
│   └── data/lancedb/     # Vector index (generated by rag index)
├── *.md
└── ...

Each project keeps its index local. rag discovers .rag/ by walking up from the current directory (like git).
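The upward lookup can be sketched as follows. This is a minimal illustration, not the tool's actual code: `findRagRoot`, `isDirectory`, and the injectable `hasDir` parameter are hypothetical names introduced here so the walk can be tried without touching the filesystem.

```typescript
import { existsSync, statSync } from "node:fs";
import { dirname, join, resolve } from "node:path";

// Default check (hypothetical helper): does `path` exist and is it a directory?
const isDirectory = (path: string): boolean =>
  existsSync(path) && statSync(path).isDirectory();

// Returns the nearest directory, starting at `start` and walking toward the
// filesystem root, that contains a `.rag` directory, or null if none does.
function findRagRoot(
  start: string = process.cwd(),
  hasDir: (path: string) => boolean = isDirectory,
): string | null {
  let dir = resolve(start);
  for (;;) {
    if (hasDir(join(dir, ".rag"))) return dir;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

Calling `findRagRoot()` from a nested docs folder would return the project directory that holds .rag/, which is what lets commands run from any subdirectory.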

MCP integration

{
  "mcp": {
    "my-docs": {
      "type": "local",
      "command": ["rag", "serve"],
      "cwd": "/path/to/project",
      "enabled": true
    }
  }
}

Run rag mcp config from the project directory to print the snippet with cwd pre-filled.
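For illustration, a snippet of the shape shown above could be produced like this. `LocalMcpEntry` and `buildMcpSnippet` are hypothetical names, and the field values simply mirror the example. This is not the code behind rag mcp config.

```typescript
// Shape of one local MCP server entry, mirroring the example above.
interface LocalMcpEntry {
  type: "local";
  command: string[];
  cwd: string;
  enabled: boolean;
}

// Build the config snippet for a project name and its directory.
function buildMcpSnippet(
  name: string,
  cwd: string,
): { mcp: Record<string, LocalMcpEntry> } {
  return {
    mcp: {
      [name]: { type: "local", command: ["rag", "serve"], cwd, enabled: true },
    },
  };
}

// Print the snippet for the current directory as pretty-printed JSON.
console.log(JSON.stringify(buildMcpSnippet("my-docs", process.cwd()), null, 2));
```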

Architecture

flowchart LR
  MD[.md files] --> Chunker
  Chunker -->|heading split| Chunks
  Chunks -->|Ollama embed| Vectors
  Vectors -->|store| LanceDB
  Query -->|embed| LanceDB
  LanceDB -->|search| Results
  Question -->|embed + search| Context
  Context -->|Ollama chat| Answer

Chunker: splits by ## / ### headings, preserves heading hierarchy, merges tiny sections
Embedder: Ollama /api/embed in batches of 20, truncates to 500 tokens per chunk
Store: LanceDB embedded vector database (no external server)
RAG: retrieve top 8 chunks, build context prompt, call Ollama chat for synthesis
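The pure parts of this pipeline can be sketched in a few functions. Everything here is an assumption made for illustration: the names `Chunk`, `chunkByHeading`, `toBatches`, and `buildPrompt`, the 200-character threshold for "tiny" sections, the merge-into-previous rule, and the prompt wording. The sketch does not handle headings inside fenced code, token truncation, or the Ollama and LanceDB calls.

```typescript
interface Chunk {
  headings: string[]; // heading path, e.g. ["Install", "Linux"]
  text: string;
}

// Split markdown on ## / ### headings, keeping the heading path per chunk.
// Sections shorter than `minChars` are appended to the previous chunk.
function chunkByHeading(markdown: string, minChars = 200): Chunk[] {
  const chunks: Chunk[] = [];
  let h2: string | null = null;
  let h3: string | null = null;
  let body: string[] = [];

  // Emit the section collected so far under the current heading path.
  const flush = (): void => {
    const text = body.join("\n").trim();
    body = [];
    if (text === "") return;
    const headings = [h2, h3].filter((h): h is string => h !== null);
    const prev = chunks[chunks.length - 1];
    if (prev && text.length < minChars) {
      const label = headings.join(" > ");
      prev.text += "\n\n" + (label ? label + "\n" : "") + text;
    } else {
      chunks.push({ headings, text });
    }
  };

  for (const line of markdown.split(/\r?\n/)) {
    const match = /^(#{2,3})\s+(.+?)\s*$/.exec(line);
    if (!match) {
      body.push(line);
      continue;
    }
    flush();
    if (match[1] === "##") {
      h2 = match[2];
      h3 = null; // a new ## section resets the ### level
    } else {
      h3 = match[2];
    }
  }
  flush();
  return chunks;
}

// Split items into consecutive batches (the embedder uses batches of 20).
function toBatches<T>(items: T[], size = 20): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Build a context prompt from the first k retrieved chunks (k = 8 above).
function buildPrompt(question: string, retrieved: Chunk[], k = 8): string {
  const context = retrieved
    .slice(0, k)
    .map((c, i) => `[${i + 1}] ${c.headings.join(" > ")}\n${c.text}`)
    .join("\n\n");
  return `Answer using only the context below.\n\n${context}\n\nQuestion: ${question}`;
}
```

Merging a tiny section into its predecessor keeps very short fragments from being embedded on their own, at the cost of the merged text sitting under the earlier heading path.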

Configuration

.rag/config.json:

{
  "name": "my-docs",
  "embedModel": "nomic-embed-text",
  "ragModel": "llama3.2:3b",
  "pattern": "*.md"
}

Models are pulled automatically if missing. Override the defaults via rag init or by editing config.json directly.
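A typed reading of this file might look like the sketch below. `RagConfig`, `DEFAULTS`, and `parseConfig` are hypothetical names. Treating the example values as fallbacks for missing fields is an assumption, not documented behaviour.

```typescript
interface RagConfig {
  name: string;
  embedModel: string;
  ragModel: string;
  pattern: string;
}

// Assumed fallbacks, copied from the example config above.
const DEFAULTS = {
  embedModel: "nomic-embed-text",
  ragModel: "llama3.2:3b",
  pattern: "*.md",
};

// Parse the JSON text of config.json, fill in assumed defaults, and validate.
function parseConfig(json: string): RagConfig {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("config.json must contain a JSON object");
  }
  const merged: Record<string, unknown> = { ...DEFAULTS, ...raw };
  for (const key of ["name", "embedModel", "ragModel", "pattern"]) {
    if (typeof merged[key] !== "string" || merged[key] === "") {
      throw new Error(`config.json: "${key}" must be a non-empty string`);
    }
  }
  return merged as unknown as RagConfig;
}
```

For example, `parseConfig('{"name":"my-docs","pattern":"docs/*.md"}')` keeps the custom pattern and falls back to the assumed model names.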

License

MIT
