rag

rag is a CLI tool and MCP server that turns markdown documentation into a searchable, queryable knowledge base.

It chunks .md files by heading, embeds them via Ollama, stores vectors in LanceDB, and exposes search + RAG through both a terminal CLI and MCP.


Prerequisites

  • Bun runtime
  • Ollama running locally with an embedding model (auto-pulled if missing)

Minimum hardware

| Component | Requirement |
| --- | --- |
| RAM | 4 GB (8 GB for larger doc sets) |
| CPU | Any x86-64 or ARM64, 2+ cores |
| GPU | Optional. Any NVIDIA GPU with 2+ GB VRAM. CPU-only fallback is functional but slower |
| Disk | 100 MB for index (scales with doc count) |

Indexing 5000 chunks takes about 25 s on an RTX 3060 and about 3 min on CPU only.

Install

git clone https://github.com/FrameMuse/llm-rag.git
cd llm-rag
bun install

Add a shell alias:

alias rag='bun /path/to/llm-rag/scripts/cli.ts'

Quick start

cd my-docs-project
rag init              # create .rag/ project scope
rag index             # chunk, embed, index all .md files
rag mcp search "..."  # semantic search
rag mcp query "..."   # RAG: synthesize answer from docs

Commands

| Command | Description |
| --- | --- |
| rag init | Create .rag/ config, mcp.json, .gitignore |
| rag index | Chunk files by heading, embed via Ollama, store in LanceDB |
| rag serve | Start MCP server (STDIO) for current .rag/ scope |
| rag mcp <tool> | One-shot CLI proxy for MCP tools |
| rag info | Show index statistics |
| rag help | Show usage |

rag mcp tools

| Tool | Usage | Description |
| --- | --- | --- |
| search | rag mcp search "query" [--limit N] | Semantic vector search |
| query | rag mcp query "question" | RAG: retrieve chunks, synthesize answer |
| list-documents | rag mcp list-documents | List all indexed files |
| get-document | rag mcp get-document <path> | Show full document content |
| config | rag mcp config | Print mcp.json for opencode.json adoption |

Project scope (.rag/)

project/
├── .rag/
│   ├── config.json       # { name, embedModel, ragModel, pattern }
│   ├── mcp.json          # MCP config snippet for opencode.json
│   ├── .gitignore        # *
│   └── data/lancedb/     # Vector index (generated by rag index)
├── *.md
└── ...

Each project keeps its index local. rag discovers .rag/ by walking up from the current directory (like git).
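
The upward search can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name findRagRoot and the injectable exists parameter are hypothetical, and it assumes the first ancestor containing a .rag entry wins.

```typescript
import { existsSync } from "node:fs";
import { dirname, join, resolve } from "node:path";

// Hypothetical helper: returns the nearest directory (starting at `start`
// and moving toward the filesystem root) that contains a `.rag` entry,
// or null when no ancestor has one.
// `exists` is injectable so the walk can be exercised without touching disk.
function findRagRoot(
  start: string,
  exists: (path: string) => boolean = existsSync,
): string | null {
  let dir = resolve(start);
  while (true) {
    if (exists(join(dir, ".rag"))) return dir;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

Calling findRagRoot(process.cwd()) from any subdirectory of a project would then resolve to the project root, which is why commands such as rag index can run from nested folders.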

MCP integration

Register in opencode.json:

{
  "mcp": {
    "my-docs": {
      "type": "local",
      "command": ["rag", "serve"],
      "cwd": "/path/to/project",
      "enabled": true
    }
  }
}

Run rag mcp config from the project directory to print the snippet with cwd pre-filled.
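
For readers who generate this entry programmatically, the shape of the snippet can be expressed as a small typed builder. This is only a sketch: buildMcpSnippet and LocalMcpEntry are hypothetical names, and the fields simply mirror the JSON shown above.

```typescript
// Shape of one MCP entry, mirroring the opencode.json snippet in this README.
type LocalMcpEntry = {
  type: "local";
  command: string[];
  cwd: string;
  enabled: boolean;
};

// Hypothetical helper: builds the { mcp: { <name>: ... } } object for a project.
function buildMcpSnippet(
  name: string,
  cwd: string,
): { mcp: Record<string, LocalMcpEntry> } {
  return {
    mcp: {
      [name]: { type: "local", command: ["rag", "serve"], cwd, enabled: true },
    },
  };
}

// Two-space indentation matches the snippet shown in this README.
const snippet = JSON.stringify(
  buildMcpSnippet("my-docs", "/path/to/project"),
  null,
  2,
);
console.log(snippet);
```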

Architecture

flowchart LR
  MD[.md files] --> Chunker
  Chunker -->|heading split| Chunks
  Chunks -->|Ollama embed| Vectors
  Vectors -->|store| LanceDB
  Query -->|embed| LanceDB
  LanceDB -->|search| Results
  Question -->|embed + search| Context
  Context -->|Ollama chat| Answer

  • Chunker: splits by ## / ### headings, preserves heading hierarchy, merges tiny sections
  • Embedder: Ollama /api/embed in batches of 20, truncates to 500 tokens per chunk
  • Store: LanceDB embedded vector database (no external server)
  • RAG: retrieve top 8 chunks, build context prompt, call Ollama chat for synthesis
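
To make the chunking and batching steps concrete, here is a rough sketch of how they might look. It is an illustration under stated assumptions, not the project's implementation: the names Chunk, chunkByHeading, and toBatches are hypothetical, the 40-character threshold for "tiny" sections is invented for the example, and tiny sections are assumed to merge into the preceding chunk. It also does not skip headings that appear inside fenced code blocks.

```typescript
type Chunk = { headings: string[]; text: string };

// Splits markdown on "##" / "###" headings. Each chunk keeps its heading
// trail (e.g. ["Install", "Linux"]) and its heading line as part of the text.
// Assumption: sections shorter than `minChars` fold into the previous chunk.
function chunkByHeading(markdown: string, minChars = 40): Chunk[] {
  const chunks: Chunk[] = [];
  const path: string[] = []; // path[0] = current "##" title, path[1] = current "###" title
  let headings: string[] = [];
  let body: string[] = [];

  const flush = (): void => {
    const text = body.join("\n").trim();
    body = [];
    if (text === "") return;
    const prev = chunks[chunks.length - 1];
    if (prev !== undefined && text.length < minChars) {
      prev.text += "\n\n" + text; // merge tiny section into its predecessor
    } else {
      chunks.push({ headings, text });
    }
  };

  for (const line of markdown.split(/\r?\n/)) {
    const match = /^(#{2,3})\s+(.+?)\s*$/.exec(line);
    if (match === null) {
      body.push(line);
      continue;
    }
    flush(); // close the previous section before starting a new one
    const [, hashes = "", title = ""] = match;
    const depth = hashes.length - 2; // 0 for "##", 1 for "###"
    path.length = depth; // drop sibling and deeper headings
    path[depth] = title;
    headings = path.filter((h) => h !== undefined); // fresh copy, skips gaps
    body.push(line);
  }
  flush();
  return chunks;
}

// Groups items into fixed-size batches; 20 is the embedder batch size stated above.
function toBatches<T>(items: T[], size = 20): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch returned by toBatches would then be sent to the embedding endpoint in a single request, which keeps the number of round trips to Ollama low.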

Configuration

.rag/config.json:

{
  "name": "my-docs",
  "embedModel": "nomic-embed-text",
  "ragModel": "llama3.2:3b",
  "pattern": "*.md"
}

Models are pulled automatically if missing. Override them via rag init or by editing config.json directly.
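
A typed reader for this file might look like the sketch below. The names RagConfig, EXAMPLE_DEFAULTS, and parseConfig are hypothetical, and treating the example values above as fallbacks is an assumption made for illustration; the tool's real defaults and validation rules may differ.

```typescript
type RagConfig = {
  name: string;
  embedModel: string;
  ragModel: string;
  pattern: string;
};

// Assumption: the example values from this README act as fallbacks here.
const EXAMPLE_DEFAULTS: Omit<RagConfig, "name"> = {
  embedModel: "nomic-embed-text",
  ragModel: "llama3.2:3b",
  pattern: "*.md",
};

// Parses the text of .rag/config.json, filling in missing optional fields
// and rejecting values that are not non-empty strings.
function parseConfig(json: string): RagConfig {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("config.json must contain a JSON object");
  }
  const fields = raw as Record<string, unknown>;

  const pick = (key: keyof RagConfig, fallback?: string): string => {
    const value = fields[key];
    if (typeof value === "string" && value !== "") return value;
    if (value === undefined && fallback !== undefined) return fallback;
    throw new Error(`config.json: "${key}" must be a non-empty string`);
  };

  return {
    name: pick("name"),
    embedModel: pick("embedModel", EXAMPLE_DEFAULTS.embedModel),
    ragModel: pick("ragModel", EXAMPLE_DEFAULTS.ragModel),
    pattern: pick("pattern", EXAMPLE_DEFAULTS.pattern),
  };
}
```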

License

MIT
