obsidian-tools

Enables natural language interaction with Obsidian vaults through an MCP server, providing hybrid search, file management, and AI-powered analysis.

Obsidian Tools

An agentic Obsidian vault manager. Ask questions in natural language, search across notes semantically, explore wikilinks, transcribe meeting recordings, and manipulate and organize files and metadata, all conversationally. Works as an Obsidian sidebar plugin, a CLI chat agent, or an HTTP API.

What It Does

Your vault gets indexed into a vector database. An LLM agent then uses MCP tools to search, read, and modify your notes — combining semantic understanding with keyword matching to find what you need.

Search & Discovery — Hybrid semantic + keyword search with BM25 scoring, Reciprocal Rank Fusion, HyDE (Hypothetical Document Embeddings) for question-type queries, cross-encoder reranking, and per-source diversity limits. Link graph traversal (backlinks, outlinks), frontmatter queries, date range filtering, folder browsing.

Vault Management — Read, create, move, merge files; edit specific markdown sections by heading; update frontmatter fields. Batch operations (create, move, merge, frontmatter update) with a built-in confirmation flow — the agent previews affected files and waits for approval before executing.

File Readers — Audio transcription via Whisper, image description via a vision model, Office document extraction (.docx, .xlsx, .pptx), and PDF text extraction, all auto-dispatched through read_file.

AI-Powered Analysis — Summarize notes via LLM. Research pipeline: extract topics from a note or research an ad-hoc subject, gather findings from web search + vault search + page extraction, synthesize into a ## Research section or a new note.

Integrations — Web search via DuckDuckGo, interaction logging to daily notes, and persistent user preferences.
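
Reciprocal Rank Fusion, named under Search & Discovery above, merges ranked lists using only each item's position in each list. The sketch below is a minimal illustration of the technique, not the project's hybrid_search.py. The function name rrf_fuse, the constant k=60 (a commonly used default), and the sample note paths are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a semantic search and a BM25 search.
semantic = ["notes/a.md", "notes/b.md", "notes/c.md"]
keyword = ["notes/b.md", "notes/d.md", "notes/a.md"]
fused = rrf_fuse([semantic, keyword])
```

Because only ranks are used, the semantic and BM25 scores never need to be normalized against each other; a note that ranks well in both lists rises to the top.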

How It Works

Architecture diagram

  1. Indexer scans your vault and creates embeddings in ChromaDB (using nomic-embed-text-v1.5). Notes are split by headings, paragraphs, and sentences; each chunk carries a heading hierarchy prefix ([Note > Section > Subsection]), and chunks overlap across section boundaries for continuity.
  2. Search pipeline combines semantic search (ChromaDB) with BM25 keyword scoring via Reciprocal Rank Fusion. Question-type queries get HyDE augmentation (LLM generates a hypothetical answer, which is embedded alongside the original query). Results are reranked by a cross-encoder model and deduplicated per source.
  3. MCP Server exposes 20 tools for searching, reading, and modifying vault content
  4. LLM Agent (powered by Fireworks AI) orchestrates the tools to answer your questions
  5. Interfaces — chat in Obsidian via the sidebar plugin, from the terminal via the CLI agent, or programmatically via the HTTP API
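
The heading hierarchy prefix from step 1 can be illustrated with a small sketch. This is not the project's chunking.py: it splits on headings only, and it leaves out the paragraph and sentence splitting, the cross-section overlap, and any handling of `#` lines inside code blocks. The function name chunk_by_headings and the sample note are assumptions.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headings(note_name, text):
    """Split markdown into per-heading chunks, each prefixed with its heading path."""
    chunks = []
    path = []  # (level, title) pairs from the top-level heading down to the current one
    body = []  # lines collected under the current heading

    def flush():
        content = "\n".join(body).strip()
        if content:
            trail = " > ".join([note_name] + [title for _, title in path])
            chunks.append(f"[{trail}] {content}")
        body.clear()

    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


sample = "# Goals\nShip v1.\n## Risks\nScope creep.\n# Notes\nMisc."
chunks = chunk_by_headings("Project", sample)
```

Prefixing every chunk with its path means the embedding for "Scope creep." also carries the context "Project > Goals > Risks", which a bare paragraph would lose.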

Requirements

  • Python 3.11, 3.12, or 3.13 (not 3.14 — onnxruntime doesn't have wheels yet)
  • Fireworks AI API key — required for the chat agent and audio transcription
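
To check an interpreter against the version requirement before installing, a short script along these lines may help. It is a sketch, not part of the installer; the name is_supported is an assumption.

```python
import sys

SUPPORTED_MINORS = {11, 12, 13}  # taken from the requirement above


def is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.11, 3.12, or 3.13."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    status = "supported" if is_supported() else "not supported"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```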

Quick Start

git clone https://github.com/glibalien/obsidian-tools.git
cd obsidian-tools

# macOS / Linux
./install.sh

# Windows (PowerShell)
.\install.ps1

The installer will:

  1. Find or help you install a compatible Python (resolves the real binary, not pyenv shims)
  2. Create a virtual environment and install dependencies
  3. Walk you through .env configuration (vault path, API key, etc.)
  4. Optionally install background services (API server + vault indexer)
  5. Optionally run the initial vault index

That's it — once installed, open Obsidian and start chatting, or run the CLI agent with python src/agent.py.

<details> <summary>Manual installation</summary>

macOS Users (Homebrew)

If Homebrew has upgraded you to Python 3.14, use pyenv to install a compatible version:

brew install pyenv

# Add pyenv to your shell (add these to ~/.zshrc or ~/.bashrc)
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
echo 'eval "$(pyenv init -)"' >> ~/.zshrc
source ~/.zshrc

pyenv install 3.12.8

Clone and Set Up

git clone https://github.com/glibalien/obsidian-tools.git
cd obsidian-tools

# If using pyenv, create the venv with the real binary (not the shim):
$(pyenv which python3.12) -m venv .venv
# Otherwise:
python -m venv .venv

source .venv/bin/activate
pip install -r requirements.txt

Configure .env

cp .env.example .env

Edit .env:

VAULT_PATH=~/Documents/your-vault-name
CHROMA_PATH=./.chroma_db
FIREWORKS_API_KEY=your-api-key-here
FIREWORKS_MODEL=accounts/fireworks/models/gpt-oss-120b
API_PORT=8000
INDEX_INTERVAL=60

| Variable | Description |
| --- | --- |
| VAULT_PATH | Path to your Obsidian vault |
| CHROMA_PATH | Where to store the ChromaDB database (relative or absolute) |
| FIREWORKS_API_KEY | API key from Fireworks AI |
| FIREWORKS_MODEL | Fireworks model ID (default: gpt-oss-120b) |
| API_PORT | Port for the HTTP API server (default: 8000) |
| INDEX_INTERVAL | How often the vault indexer runs, in minutes (default: 60) |
| INDEX_WORKERS | Thread pool size for parallel file indexing (default: 4) |

See .env.example for additional optional variables (logging, session limits, Whisper/vision models, etc.).
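
As an illustration of how these values and their defaults fit together, here is a minimal sketch that parses KEY=VALUE text and applies the defaults listed above. It is not the project's config.py, which may well load .env differently; parse_env and load_settings are hypothetical names.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(text):
    """Apply the documented defaults and expand ~ in the vault path."""
    env = parse_env(text)
    return {
        "vault_path": os.path.expanduser(env["VAULT_PATH"]),
        "api_port": int(env.get("API_PORT", "8000")),
        "index_interval": int(env.get("INDEX_INTERVAL", "60")),
        "index_workers": int(env.get("INDEX_WORKERS", "4")),
    }


settings = load_settings("VAULT_PATH=~/Documents/your-vault-name\nAPI_PORT=8000\n")
```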

</details>

Usage

1. Index your vault

Before searching, build the vector index:

python src/index_vault.py

The indexer is incremental — subsequent runs only process files modified since the last run and prune deleted files. Use --full for a complete reindex.
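
The idea behind an incremental run can be sketched as follows. The README does not say how index_vault.py detects changes, so comparing file modification times against the last run, restricting the scan to .md files, and the function names are all assumptions.

```python
import os
import tempfile
from pathlib import Path


def files_to_reindex(vault_dir, last_run_ts):
    """Return markdown files modified after last_run_ts (seconds since the epoch)."""
    return sorted(
        p for p in Path(vault_dir).rglob("*.md")
        if p.stat().st_mtime > last_run_ts
    )


def deleted_sources(indexed_paths, vault_dir):
    """Return previously indexed relative paths whose files no longer exist."""
    root = Path(vault_dir)
    return sorted(rel for rel in indexed_paths if not (root / rel).exists())


# Demo against a throwaway directory standing in for a vault.
with tempfile.TemporaryDirectory() as vault:
    old = Path(vault, "old.md")
    new = Path(vault, "new.md")
    old.write_text("old")
    new.write_text("new")
    os.utime(old, (1_000, 1_000))  # pretend this note was last modified long ago
    os.utime(new, (3_000, 3_000))
    changed = [p.name for p in files_to_reindex(vault, last_run_ts=2_000)]
    missing = deleted_sources(["old.md", "gone.md"], vault)
```

Only new.md would be re-embedded here, and the stale entry for gone.md would be pruned from the index.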

To keep the index up to date automatically, see Running as a Service.

2. Choose your interface

Obsidian Plugin (recommended for daily use)

The plugin/ directory contains a chat sidebar that connects to the API server. It includes a built-in batch operation confirmation system — when the agent wants to modify many files at once, the plugin renders a preview of affected files with Confirm/Cancel buttons so you stay in control.

cd plugin && npm install && npm run build

# Copy to your vault (adjust path)
mkdir -p ~/Documents/your-vault/.obsidian/plugins/vault-agent
cp manifest.json main.js styles.css ~/Documents/your-vault/.obsidian/plugins/vault-agent/

Enable "Vault Agent" in Obsidian Settings > Community Plugins. The API server must be running (see below).

CLI Agent (for terminal users)

python src/agent.py

HTTP API (for programmatic access)

python src/api_server.py

The server binds to 127.0.0.1:8000 (localhost only). To send a message:

curl -X POST http://127.0.0.1:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Summarize this note", "active_file": "Projects/Marketing.md"}'

Sessions are keyed by active_file — same file continues the conversation, different file starts a new one.
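
The same request can be made from Python with only the standard library. This sketch reuses the URL and JSON fields from the curl example; the helper names are assumptions, and send_chat assumes the response body is JSON, which this README does not document.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/chat"  # default address from the curl example


def build_chat_request(message, active_file, url=API_URL):
    """Build a POST request carrying the same JSON body as the curl example."""
    payload = json.dumps({"message": message, "active_file": active_file})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message, active_file):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_chat_request(message, active_file)) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_chat_request("Summarize this note", "Projects/Marketing.md")
```

Calling send_chat twice with the same active_file continues one conversation; changing active_file starts another, as described above.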

MCP Client (Claude Code, etc.)

Copy .mcp.json.example to .mcp.json and update the paths:

{
  "mcpServers": {
    "obsidian-tools": {
      "command": "/path/to/obsidian-tools/.venv/bin/python",
      "args": ["/path/to/obsidian-tools/src/mcp_server.py"]
    }
  }
}
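
If you would rather generate the file than edit paths by hand, a sketch like this produces the same structure from a project directory. It assumes the POSIX virtual environment layout (.venv/bin/python) shown in the example; build_mcp_config is a hypothetical helper, not something the repository ships.

```python
import json
from pathlib import PurePosixPath


def build_mcp_config(project_dir):
    """Return the .mcp.json structure with paths rooted at project_dir."""
    root = PurePosixPath(project_dir)
    return {
        "mcpServers": {
            "obsidian-tools": {
                "command": str(root / ".venv" / "bin" / "python"),
                "args": [str(root / "src" / "mcp_server.py")],
            }
        }
    }


config = build_mcp_config("/path/to/obsidian-tools")
config_json = json.dumps(config, indent=2)  # text to write into .mcp.json
```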

Customizing for Your Vault

The agent's system prompt describes your vault's folder layout and frontmatter conventions. The default is tuned to the author's vault and almost certainly doesn't match yours. An uncustomized prompt wastes tokens on failed lookups and gives worse results.

The installer copies system_prompt.txt.example to system_prompt.txt (gitignored). Edit it:

  1. Update the Vault Structure section with your actual folder layout and frontmatter conventions
  2. Adjust the Interaction Logging section if you don't use daily notes
  3. Keep the Choosing the Right Tool and Available Tools sections as-is — they apply universally

Available Tools

| Tool | Description |
| --- | --- |
| find_notes | Unified discovery — hybrid/semantic/keyword search, frontmatter filters, date ranges, folder browsing |
| read_file | Read any vault file — markdown (with embed expansion), audio (Whisper), images (vision), Office docs, PDFs |
| transcribe_to_file | Transcribe audio to a new vault note with diarized speaker segments |
| get_note_info | Lightweight metadata — frontmatter, headings, size, timestamps, link counts |
| create_file | Create a new note with optional YAML frontmatter |
| batch_create_files | Create multiple files in one operation |
| edit_file | Edit file content — prepend, append, or target a specific section by heading |
| move_file | Move a file within the vault |
| batch_move_files | Move multiple files — explicit list or query-based targeting by frontmatter/folder |
| merge_files | Merge source into destination — smart (content-aware dedup) or concat |
| batch_merge_files | Batch merge matching files across two folders |
| update_frontmatter | Set, remove, append, or rename frontmatter fields |
| batch_update_frontmatter | Bulk frontmatter update — by path list, frontmatter query, or folder |
| find_links | Find backlinks, outlinks, or both for a note |
| compare_folders | Compare two folders by filename — find duplicates and unique files |
| log_interaction | Log interactions to daily notes |
| summarize_file | LLM-powered summarization — appends a ## Summary section to the note |
| research | Research topics in a note or an ad-hoc subject — web + vault search, page extraction, LLM synthesis |
| manage_preferences | List, add, or remove persistent user preferences |
| web_search | Search the web via DuckDuckGo |

Running as a Service

The installer can set up background services automatically. If you prefer manual setup:

<details> <summary>Linux (systemd)</summary>

mkdir -p ~/.config/systemd/user

for f in services/systemd/*.service services/systemd/*.timer; do
    sed -e "s|__PROJECT_DIR__|$PWD|g" \
        -e "s|__VENV_PYTHON__|$PWD/.venv/bin/python|g" \
        -e "s|__INDEX_INTERVAL__|60|g" \
        "$f" > ~/.config/systemd/user/$(basename "$f")
done

systemctl --user daemon-reload
systemctl --user enable --now obsidian-tools-api
systemctl --user enable --now obsidian-tools-indexer-scheduler.timer

# Check status
systemctl --user status obsidian-tools-api

# View logs
journalctl --user -u obsidian-tools-api -f

# Restart
systemctl --user restart obsidian-tools-api

Note: To run services without being logged in: sudo loginctl enable-linger $USER

</details>

<details> <summary>macOS (launchd)</summary>

for f in services/launchd/*.plist; do
    sed -e "s|__VENV_PYTHON__|$PWD/.venv/bin/python|g" \
        -e "s|__PROJECT_DIR__|$PWD|g" \
        -e "s|__USERNAME__|$(whoami)|g" \
        -e "s|__INDEX_INTERVAL_SEC__|3600|g" \
        "$f" > ~/Library/LaunchAgents/$(basename "$f")
done

launchctl load ~/Library/LaunchAgents/com.obsidian-tools.api.plist
launchctl load ~/Library/LaunchAgents/com.obsidian-tools.indexer.plist

# Check status
launchctl list | grep obsidian-tools

# View logs
tail -f ~/Library/Logs/obsidian-tools-api.log

</details>

<details> <summary>Windows (Task Scheduler)</summary>

$xml = (Get-Content services\taskscheduler\obsidian-tools-api.xml -Raw) `
    -replace '__VENV_PYTHON__', "$PWD\.venv\Scripts\python.exe" `
    -replace '__PROJECT_DIR__', "$PWD"
Register-ScheduledTask -TaskName "ObsidianToolsAPI" -Xml $xml

$xml = (Get-Content services\taskscheduler\obsidian-tools-indexer.xml -Raw) `
    -replace '__VENV_PYTHON__', "$PWD\.venv\Scripts\python.exe" `
    -replace '__PROJECT_DIR__', "$PWD" `
    -replace '__INDEX_INTERVAL__', '60'
Register-ScheduledTask -TaskName "ObsidianToolsIndexer" -Xml $xml

# Check status
Get-ScheduledTask | Where-Object TaskName -like 'ObsidianTools*'

</details>

Uninstall

./uninstall.sh        # macOS / Linux
.\uninstall.ps1       # Windows

Your .env and .chroma_db/ are preserved.

<details> <summary>Project structure</summary>

src/
├── mcp_server.py        # FastMCP server — registers tools from submodules
├── api_server.py        # FastAPI HTTP wrapper with session management
├── agent.py             # CLI chat agent with tool result continuation
├── config.py            # Shared configuration
├── chunking.py          # Structure-aware markdown chunking (headings, paragraphs, sentences)
├── bm25_index.py        # In-memory BM25 keyword index (lazy singleton from ChromaDB docs)
├── hybrid_search.py     # Semantic + BM25 search with RRF, HyDE, reranking, source diversity
├── search_vault.py      # Search interface
├── index_vault.py       # Vault indexer (incremental, parallel chunking, batched upserts)
├── log_chat.py          # Daily note logging
├── services/
│   ├── chroma.py        # ChromaDB connection, embedding helpers, cross-encoder reranker
│   ├── compaction.py    # Tool message compaction for token management
│   └── vault.py         # Path resolution, response helpers, utilities
└── tools/
    ├── files.py         # File operations (read, create, move, merge, batch)
    ├── frontmatter.py   # Frontmatter queries and updates
    ├── links.py         # Backlinks, outlinks, folder comparison
    ├── preferences.py   # User preferences
    ├── search.py        # Vault search, web search
    ├── editing.py       # Section and position-based editing
    ├── utility.py       # Interaction logging
    ├── readers.py       # File type handlers (audio, image, Office docs, PDF)
    ├── summary.py       # LLM-powered file summarization
    └── research.py      # Agentic research pipeline (extract → search → synthesize)

plugin/                  # Obsidian chat sidebar (optional)
services/                # Service templates (systemd, launchd, Task Scheduler)

</details>

License

MIT
