vault-recommender

Enables semantic recommendation for Obsidian vaults, allowing users to find related notes, forgotten knowledge, and missing connections through embedding similarity and wiki-link graph analysis.



Semantic recommendation engine for Obsidian vaults. Uses sentence-transformer embeddings + wiki-link graph boosting to surface related notes, forgotten knowledge, and missing connections.

Designed as a tool for LLMs — returns context-rich results with explanations, not just ranked paths.

How it works

Your vault (markdown files)
       │
   Parser ─── extracts frontmatter, body, wiki-links
       │
   Indexer ─── embeds each note as a 384-dim vector (all-MiniLM-L6-v2)
       │
   Link Graph ─── builds bidirectional wiki-link adjacency
       │
   Recommender ─── cosine similarity + graph boost + staleness boost
       │
   Ranked results with reasons
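The Parser and Link Graph stages can be pictured with a small sketch. This is not the project's code: the regular expression, the function names (`extract_links`, `build_link_graph`, `two_hop_neighbors`), and the choice to key notes by name are all assumptions made for illustration.

```python
import re

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures only the target.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(body: str) -> set[str]:
    """Return the set of wiki-link targets mentioned in a note body."""
    return {m.group(1).strip() for m in WIKI_LINK.finditer(body)}


def build_link_graph(notes: dict[str, str]) -> dict[str, set[str]]:
    """Build a bidirectional adjacency map from {note name: body}."""
    graph: dict[str, set[str]] = {name: set() for name in notes}
    for name, body in notes.items():
        for target in extract_links(body):
            # Ignore self-links and links to notes that do not exist.
            if target in notes and target != name:
                graph[name].add(target)
                graph[target].add(name)
    return graph


def two_hop_neighbors(graph: dict[str, set[str]], name: str) -> set[str]:
    """Notes reachable through exactly one intermediate note."""
    direct = graph.get(name, set())
    reached: set[str] = set()
    for neighbor in direct:
        reached |= graph.get(neighbor, set())
    return reached - direct - {name}
```

Storing each edge in both directions means a backlink counts the same as an outgoing link, which is what "bidirectional" suggests in the diagram above.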

Three scoring signals:

  • Semantic similarity: cosine similarity between note embeddings. Catches meaning, not just keywords.
  • Link graph boost: notes connected through wiki-links get a score boost. 2-hop neighbors (notes that share a linked note) surface "bridge" connections.
  • Staleness boost: notes untouched for 30+ days get a small boost. Surfaces forgotten-but-relevant knowledge.
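One way the three signals could combine is a simple additive score. The sketch below is a guess at the shape, not the project's formula: the boost weights, the `hops` argument, and the function name `score_note` are hypothetical, and only the 30-day threshold comes from the text above.

```python
import math
from datetime import datetime, timedelta

# Hypothetical weights: the README does not state the real values.
DIRECT_LINK_BOOST = 0.10
TWO_HOP_BOOST = 0.05
STALENESS_BOOST = 0.03
STALE_AFTER = timedelta(days=30)  # threshold named in the README


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_note(query_vec, note_vec, hops, last_modified: datetime, now: datetime):
    """Return (score, reasons). hops is 1 (direct link), 2 (two-hop) or None."""
    similarity = cosine_similarity(query_vec, note_vec)
    total = similarity
    reasons = [f"semantic similarity {similarity:.2f}"]
    if hops == 1:
        total += DIRECT_LINK_BOOST
        reasons.append("directly linked")
    elif hops == 2:
        total += TWO_HOP_BOOST
        reasons.append("shares a linked note")
    if now - last_modified >= STALE_AFTER:
        total += STALENESS_BOOST
        reasons.append("untouched for 30+ days")
    return total, reasons
```

Collecting the reasons alongside the score is what lets a result explain itself to an LLM rather than arriving as a bare ranked path.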

Installation

# From PyPI
uv tool install vault-recommender

# Or from source
git clone https://github.com/JoshuaOliphant/vault-recommender.git
cd vault-recommender
uv sync

Usage

CLI

# Build the index (run once, re-run when vault changes significantly)
vault-recommender --vault /path/to/vault index

# Recommend by topic
vault-recommender --vault /path/to/vault recommend --topic "career transition strategies"

# Recommend notes similar to a specific note
vault-recommender --vault /path/to/vault recommend --note "areas/career/plan.md"

# Find missing connections (similar but not linked)
vault-recommender --vault /path/to/vault recommend --note "areas/career/plan.md" --exclude-linked

# Auto-rebuild stale index before querying
vault-recommender --vault /path/to/vault recommend --topic "python testing" --rebuild

# JSON output (for LLM consumption)
vault-recommender --vault /path/to/vault recommend --topic "python testing" --json

The --rebuild flag checks whether any vault file is newer than the index. If so, it rebuilds automatically before querying. If the index is fresh, it skips silently.
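A freshness check of this kind can be approximated by comparing file modification times. The sketch below is an assumption about the approach, not the tool's implementation: the function name `index_is_stale`, the `*.md` filter, and treating the oldest index file as the index timestamp are all illustrative choices.

```python
from pathlib import Path


def index_is_stale(vault: Path, index_dir: Path) -> bool:
    """True if the index is missing or any markdown note is newer than it."""
    index_files = [p for p in index_dir.glob("*") if p.is_file()] if index_dir.is_dir() else []
    if not index_files:
        return True  # nothing built yet, so a rebuild is needed
    # Use the oldest index file as the conservative "built at" timestamp.
    index_mtime = min(p.stat().st_mtime for p in index_files)
    return any(note.stat().st_mtime > index_mtime for note in vault.rglob("*.md"))
```

A check like this only notices added or edited files; a deleted note leaves no newer timestamp behind, so deletions would need separate handling.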

HTTP Server (for hooks and fast queries)

The CLI cold-starts the embedding model on topic queries (~13s). For latency-sensitive use cases like Claude Code hooks, run the HTTP server instead:

# Start the server (loads index once, then serves fast queries)
vault-recommender --vault /path/to/vault serve

# Custom host/port
vault-recommender --vault /path/to/vault serve --host 0.0.0.0 --port 8000

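Endpoints (the examples assume the server is listening on localhost:7532; adjust the host and port if you passed --host or --port):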

# Health check
curl localhost:7532/health

# Recommend by topic
curl "localhost:7532/recommend?topic=career+transition&top_k=5"

# Recommend by note
curl "localhost:7532/recommend?note=areas/career/plan.md&top_k=3"

# Find missing connections
curl "localhost:7532/recommend?note=areas/career/plan.md&exclude_linked=true"

# Hot-reload index after re-indexing via CLI
curl -X POST localhost:7532/reload
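The same queries can be issued from Python with the standard library. This is a minimal sketch: the query parameters and port mirror the curl examples above, but the helper names (`recommend_url`, `recommend`) are hypothetical, and the assumption that the server replies with JSON is not confirmed by this README.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:7532"  # port taken from the curl examples


def recommend_url(*, topic=None, note=None, top_k=None,
                  exclude_linked=False, base_url=BASE_URL) -> str:
    """Build a /recommend URL for exactly one of topic or note."""
    if (topic is None) == (note is None):
        raise ValueError("pass exactly one of topic or note")
    params = {"topic": topic} if topic is not None else {"note": note}
    if top_k is not None:
        params["top_k"] = top_k
    if exclude_linked:
        params["exclude_linked"] = "true"
    return f"{base_url}/recommend?{urlencode(params)}"


def recommend(**kwargs):
    """Call the running server; assumes the response body is JSON."""
    with urlopen(recommend_url(**kwargs), timeout=5) as response:
        return json.load(response)
```

`urlencode` percent-encodes the slashes in a note path, so paths such as areas/career/plan.md are sent as a single parameter value.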

MCP Server (Claude Code integration)

Add to your .mcp.json:

{
  "mcpServers": {
    "vault-recommender": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/vault-recommender",
        "python",
        "-m",
        "vault_recommender.mcp_server"
      ],
      "env": {
        "VAULT_PATH": "/path/to/your/vault"
      }
    }
  }
}

This exposes four tools:

  • recommend_by_topic — open-ended semantic search
  • recommend_by_note — "notes like this one"
  • find_missing_connections — similar but unlinked notes
  • reload_index — force-reload the index after re-indexing via CLI

Python API

from pathlib import Path
from vault_recommender.recommender import create_recommender

vault = Path("/path/to/vault")
index_dir = Path(".vault-recommender-index")

rec = create_recommender(vault, index_dir)

# By topic
results = rec.similar_to_topic("career transition")

# By note
results = rec.similar_to_note("areas/career/plan.md")

# Each result has: path, title, score, snippet, tags, reason
for r in results:
    print(f"{r.score:.3f} {r.title}: {r.reason}")
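Because results are meant for LLM consumption, it can help to flatten them into a compact text block before adding them to a prompt. The sketch below uses a stand-in `Result` dataclass with the six fields listed above so it runs without the library installed; the real result type, and the `format_for_prompt` helper, are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical stand-in mirroring the documented result fields."""
    path: str
    title: str
    score: float
    snippet: str
    reason: str
    tags: list[str] = field(default_factory=list)


def format_for_prompt(results, min_score: float = 0.0) -> str:
    """Render results, highest score first, as an indented text list."""
    lines = []
    for r in sorted(results, key=lambda r: r.score, reverse=True):
        if r.score < min_score:
            continue
        tags = " ".join(f"#{t}" for t in r.tags)
        lines.append(f"- {r.title} ({r.path}, score {r.score:.3f}) {tags}".rstrip())
        lines.append(f"  why: {r.reason}")
        lines.append(f"  excerpt: {r.snippet}")
    return "\n".join(lines)
```

Any object exposing the same attribute names should work with `format_for_prompt`, since it relies only on attribute access.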

Performance

  • ~1,500 notes indexed in ~5 seconds (M-series Mac)
  • Queries return in <1 second (after model warm-up)
  • Index persists as numpy + JSON (~2MB for 1,500 notes)
  • Model: all-MiniLM-L6-v2 (~80MB, runs on CPU)
  • --help responds instantly (heavy imports deferred until needed)
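The index size figure is consistent with a back-of-the-envelope estimate. The calculation below assumes the embeddings are stored as 32-bit floats, which the README does not state, and it ignores the JSON metadata stored alongside them.

```python
NOTES = 1_500            # vault size quoted above
DIMENSIONS = 384         # all-MiniLM-L6-v2 embedding width
BYTES_PER_FLOAT32 = 4    # assumption: vectors are stored as float32


def embedding_bytes(notes: int, dims: int, bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw size of a dense notes x dims embedding matrix."""
    return notes * dims * bytes_per_value


size = embedding_bytes(NOTES, DIMENSIONS)
print(f"{size:,} bytes = {size / 1_000_000:.1f} MB")
# prints "2,304,000 bytes = 2.3 MB"
```

Under the same assumption the matrix grows linearly with vault size, so a 15,000-note vault would need roughly 23 MB for vectors alone.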

Requirements

  • Python 3.12+
  • An Obsidian vault (or any directory of markdown files with [[wiki-links]])

License

MIT
