journal-rag

Hybrid retrieval MCP server for searching team markdown journals using BM25 and local vector embeddings, with tools for search, browse, and regex lookup.

Source-control-friendly hybrid retrieval over team markdown journals. Heading-chunked BM25 + local vector embeddings fused via Reciprocal Rank Fusion (RRF), with regex as an escape hatch. Index built on startup with an optional gitignored JSON cache.

Embeddings run locally via @huggingface/transformers (default model: Xenova/all-MiniLM-L6-v2), with no API keys and no external calls at query time. The model itself is downloaded once on first run.

Each consuming repo commits journal-rag.config.json and markdown under docs/journal/ (or other configured folders). This package is the shared engine.
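
To illustrate what "heading-chunked" could mean in practice, the sketch below splits one markdown file into a chunk per heading. The Chunk shape and the chunkByHeading name are assumptions made for this example, not the package's actual API, and the real indexer may treat front matter or fenced code differently.

```typescript
// Hypothetical chunk shape; the real index may store different fields.
interface Chunk {
  heading: string;   // heading text, or "" for content before the first heading
  level: number;     // 1-6 for "#" to "######", 0 for the preamble
  startLine: number; // 1-based line where the chunk begins
  text: string;      // body text under the heading
}

function chunkByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { heading: "", level: 0, startLine: 1, text: "" };
  let body: string[] = [];
  let inFence = false;

  // Close the chunk being built; skip an empty preamble.
  const flush = () => {
    current.text = body.join("\n").trim();
    if (current.heading !== "" || current.text !== "") chunks.push(current);
  };

  markdown.split(/\r?\n/).forEach((line, i) => {
    // Toggle on code-fence lines so "# comment" inside code is not a heading.
    if (/^\s*(`{3,}|~{3,})/.test(line)) inFence = !inFence;
    const m = inFence ? null : /^(#{1,6})\s+(.*?)\s*$/.exec(line);
    if (m) {
      flush();
      current = { heading: m[2], level: m[1].length, startLine: i + 1, text: "" };
      body = [];
    } else {
      body.push(line);
    }
  });
  flush();
  return chunks;
}
```

Each chunk would then be indexed as its own BM25 document and embedded separately, so a hit points at a section rather than a whole file.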

Per-repo config

Create journal-rag.config.json at the repo root:

{
  "sources": ["docs/journal"],
  "cachePath": ".journal-rag/index.json",
  "embeddingModel": "Xenova/all-MiniLM-L6-v2"
}

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| sources | yes | | Directories containing markdown journals |
| cachePath | no | .journal-rag/index.json | BM25 chunk index cache path |
| embeddingModel | no | Xenova/all-MiniLM-L6-v2 | Hugging Face model ID for local embeddings |

The vector cache (vectors.json) is stored in the same directory as cachePath.
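
As a rough sketch of how these defaults and the vector cache location might be applied when the config is loaded: the field names and default values come from the config description above, while the resolveConfig helper, its types, and the rejection of an empty sources list are assumptions for illustration only.

```typescript
import * as path from "node:path";

// Shape of journal-rag.config.json as documented above.
interface JournalRagConfig {
  sources: string[];
  cachePath?: string;
  embeddingModel?: string;
}

// Hypothetical fully-resolved form with defaults filled in.
interface ResolvedConfig {
  sources: string[];
  cachePath: string;
  vectorCachePath: string;
  embeddingModel: string;
}

function resolveConfig(raw: JournalRagConfig): ResolvedConfig {
  // Assumption: a missing or empty "sources" list is treated as an error.
  if (!Array.isArray(raw.sources) || raw.sources.length === 0) {
    throw new Error('journal-rag.config.json: "sources" is required');
  }
  const cachePath = raw.cachePath ?? ".journal-rag/index.json";
  return {
    sources: raw.sources,
    cachePath,
    // vectors.json sits next to the BM25 chunk cache.
    vectorCachePath: path.join(path.dirname(cachePath), "vectors.json"),
    embeddingModel: raw.embeddingModel ?? "Xenova/all-MiniLM-L6-v2",
  };
}
```

With only sources set, both caches resolve under .journal-rag/, which is why a single .gitignore entry covers them.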

Add to .gitignore:

.journal-rag/

Build & install (once per machine)

cd c:/repos/journal-rag
npm install          # runs prepare → build
npm link             # puts journal + journal-mcp on your PATH

npm link registers two global commands:

| Command | What it runs |
| --- | --- |
| journal | CLI (search, list, get, …) |
| journal-mcp | MCP stdio server (for editor config) |

Re-run npm run build (or npm link again) after pulling server changes. Alternative to link: npm install -g . from this repo (same effect).

CLI (any teammate)

From a repo root with config:

journal search "HttpFacade singleton"        # hybrid BM25 + vector (default)
journal search "HttpFacade singleton" --bm25 # BM25-only (no embedding)
journal list --filter dialog
journal get docs/journal/2026-04-21_vapp-http-facade-and-singleton-sweep.md
journal index --rebuild

After npm link in this repo, journal search "..." works globally.

Set JOURNAL_RAG_WORKSPACE to an absolute repo root only when you must run the CLI from a subdirectory.

The first run downloads the embedding model (~80 MB) to the Hugging Face cache directory. Subsequent runs load from cache.

MCP tools

| Tool | Purpose |
| --- | --- |
| search_journal | Hybrid BM25 + vector search with RRF fusion (query, k). Falls back to BM25-only if vector index is unavailable. |
| get_entry | Full file by path or filename |
| list_entries | Browse metadata (filter optional) |
| search_regex | Exact / path / symbol lookup |
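
For a sense of why a regex tool is useful as an escape hatch next to ranked search, here is a minimal line-by-line regex lookup over in-memory entries. The Entry and RegexHit shapes, the searchRegex signature, and the limit parameter are all hypothetical; the real search_regex tool may accept different arguments and return different fields.

```typescript
// Hypothetical shapes; the real tool's input and output may differ.
interface Entry { path: string; content: string }
interface RegexHit { path: string; line: number; text: string }

function searchRegex(entries: Entry[], pattern: string, flags = "", limit = 50): RegexHit[] {
  // Drop "g" and "y": they make RegExp.test() stateful via lastIndex.
  const re = new RegExp(pattern, flags.replace(/[gy]/g, ""));
  const hits: RegexHit[] = [];
  for (const entry of entries) {
    const lines = entry.content.split(/\r?\n/);
    for (let i = 0; i < lines.length; i++) {
      if (!re.test(lines[i])) continue;
      hits.push({ path: entry.path, line: i + 1, text: lines[i].trim() });
      if (hits.length >= limit) return hits;
    }
  }
  return hits;
}
```

Unlike BM25 or embeddings, this matches punctuation and casing exactly, which suits symbol names such as HttpFacade.getInstance( or literal file paths.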

Editor setup

Use the stdio transport: the editor spawns journal-mcp, which runs dist/server.js under Node.

Put MCP config in the workspace, not your user profile

The server resolves journal-rag.config.json by walking up from its working directory. That file lives at each consuming repo's root (next to docs/journal/), not in journal-rag itself.
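
The upward walk described above can be sketched as follows. This is a minimal illustration, not the server's actual code: the findWorkspaceRoot name is hypothetical, and letting JOURNAL_RAG_WORKSPACE take precedence over the walk is an assumption about how the override interacts with cwd.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const CONFIG_NAME = "journal-rag.config.json";

// Returns the directory that contains the config, or null if none is found.
function findWorkspaceRoot(
  startDir: string,
  env: Record<string, string | undefined> = process.env,
): string | null {
  // Assumption: an explicit JOURNAL_RAG_WORKSPACE wins over the upward walk.
  const override = env.JOURNAL_RAG_WORKSPACE;
  if (override) {
    return fs.existsSync(path.join(override, CONFIG_NAME)) ? path.resolve(override) : null;
  }
  let dir = path.resolve(startDir);
  while (true) {
    if (fs.existsSync(path.join(dir, CONFIG_NAME))) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

Started from a home directory or an editor install directory, a walk like this never passes through the consuming repo, which is the failure mode the next paragraphs address.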

If you add the server to a global or user-level editor profile, the spawn cwd is usually wrong (home directory, editor install directory, whichever folder was opened last), so the server cannot find the config. Hardcoding "cwd": "C:/repos/my-repo" does not help either: it breaks the moment you open a second repo workspace.

Do this instead: commit workspace-level MCP config inside each repo that has journals. Teammates run npm link once (see above) so journal-mcp is on PATH — no machine-specific paths in the committed JSON.

Cursor

.cursor/mcp.json at the repo root (e.g. my-repo/.cursor/mcp.json) — safe to commit:

{
  "mcpServers": {
    "journal": {
      "command": "journal-mcp",
      "cwd": "${workspaceFolder}",
      "env": {
        "JOURNAL_RAG_WORKSPACE": "${workspaceFolder}"
      }
    }
  }
}

${workspaceFolder} resolves to the repo you opened. journal-mcp comes from npm link in the journal-rag repo.

VS Code (Copilot agent mode)

Same idea: .vscode/mcp.json in the repo, not User settings:

{
  "servers": {
    "journal": {
      "type": "stdio",
      "command": "journal-mcp",
      "cwd": "${workspaceFolder}"
    }
  }
}

JetBrains AI Assistant / Junie

Configure MCP at project scope (.idea / project settings), not the IDE default profile. Open the repo as the project root. Command: journal-mcp (after npm link).

If journal-mcp is not found

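Ensure npm's global bin directory is on your PATH. Run npm prefix -g to print the global prefix: executables live directly in that directory on Windows (usually %APPDATA%\npm) and in its bin/ subdirectory on macOS and Linux. The older npm bin -g command was removed in npm 9. Then re-run npm link from journal-rag. Fallback for a single machine only: "command": "node", "args": ["<absolute-path>/journal-rag/dist/server.js"].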

Fallback

If an editor cannot set cwd per workspace, set env JOURNAL_RAG_WORKSPACE to the absolute path of the consuming repo root in that workspace's MCP config.

Design notes

  • Corpus is small (~tens of files); BM25 over heading chunks matches how journals are written.
  • Vector embeddings (local, via Transformers.js) add semantic recall for paraphrased or conceptual queries.
  • Reciprocal Rank Fusion (RRF, k=60) merges BM25 and vector rankings without needing score normalization.
  • Index caches are optional and gitignored; markdown in git is the source of truth.
  • Vector cache is incremental — only new/changed chunks are re-embedded on rebuild.
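
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of chunk IDs, using k = 60 as stated in the notes above. The function name, the string IDs, and the tie-break by ID are assumptions for illustration; the package's own implementation may differ.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)),
// where rank starts at 1 for the top result of each ranking.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    // Highest fused score first; ties broken by ID for a stable order.
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Hypothetical chunk IDs ranked by each retriever.
const bm25Ranking = ["a", "b", "c"];
const vectorRanking = ["c", "a", "d"];
const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
console.log(fused.map((r) => r.id).join(", ")); // a, c, b, d
```

Only ranks enter the formula, so BM25 scores and cosine similarities never need to share a scale. Passing a single ranking returns it in its original order, which is consistent with a BM25-only fallback.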
