MCP Servers

astrolabe-mcp

Provides unified search and navigation across multiple project documentations via the Model Context Protocol (MCP). Any MCP-compatible agent can discover and retrieve knowledge from all indexed projects.

README

astrolabe-mcp

A transparent knowledge layer across multiple projects via the Model Context Protocol (MCP). Any agent — Claude Code, Claude Desktop, or any MCP-compatible client — connects and gets unified search and navigation across all your documentation.

"The night is still; the desert hearkens unto God, and star speaks unto star" — Lermontov

What is this

Astrolabe is a dumb server + smart agent architecture. The server only walks files, stores an index, and serves content. All classification, summarization, and keyword extraction happen on the agent side (the LLM you're already paying for). The agent reads a file, understands it, and calls update_index_tool() to enrich the index card.

The problem it solves: you work on multiple projects with scattered documentation — specs, references, tasks, reports, skills. Knowledge is siloed. Claude Code can't see files outside its current project. There's no single place to ask "do we have a reference for X?" across projects.

The solution: one MCP server indexes all your projects. Any agent connects and searches across everything.

Key Features

Git-aware scanning — uses git ls-files to respect .gitignore automatically, with rglob fallback for non-git directories
Cross-project search — find documents across all indexed projects from any agent
Agent-powered enrichment — the LLM classifies, summarizes, and tags documents
Progressive disclosure — browse the catalog first, read files only when needed
Section extraction — read specific sections by heading, not the entire file
Managed typing — fixed set of document types with undef as a catch-all
Binary-safe — media and office files are indexed by metadata and filename
Zero-intrusion — no frontmatter, no changes to your project files
Semantic search — optional deep_search via ChromaDB embeddings, finds documents by meaning even without enrichment
Web UI — local browser interface for browsing, searching, and editing index cards with markdown rendering
Divergence tracking — detects when one copy of a duplicated document is edited while others stay behind, flags the split for manual resolution via accept_divergence() or natural reconvergence on the next reindex

Quick Start

git clone https://github.com/zebrr/astrolabe-mcp.git
cd astrolabe-mcp
python3 -m venv .venv
source .venv/bin/activate   # macOS/Linux
# .venv\Scripts\activate    # Windows
pip install -e .

Optional features (install any combination):

Command	What it adds
`pip install -e ".[web]"`	Local web UI (FastAPI + Jinja2 + HTMX)
`pip install -e ".[embeddings]"`	Semantic search via ChromaDB (`deep_search` tool)
`pip install -e ".[dev]"`	Dev tools (ruff, mypy, pytest)
`pip install -e ".[web,embeddings]"`	All optional features
`pip install -e ".[dev,web,embeddings]"`	Everything

pip install -e . installs only the base: MCP server with keyword search, enrichment, and all core tools. Optional groups add features without breaking the base.

Copy and edit the config files:

cp runtime/config.example.json runtime/config.json
cp runtime/doc_types.example.yaml runtime/doc_types.yaml

Edit runtime/config.json — add your project paths (see Configuration).

Connect to Claude Code or Claude Desktop (see Connecting to Clients).

Done. The server starts automatically when the client launches.

Configuration

Projects (`runtime/config.json`)

{
  "projects": {
    "my-project": "/path/to/my-project",
    "api-docs": "/path/to/api-docs",
    "web-app": "/path/to/web-app"
  },
  "index_dir": ".",
  "storage": "json",
  "index_extensions": [
    ".md", ".yaml", ".yml", ".txt", ".py", ".sh",
    ".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp",
    ".mp3", ".wav", ".mp4", ".mov"
  ],
  "ignore_dirs": ["src", "lib", "app", "tests", "test"],
  "ignore_files": ["*.lock"],
  "max_file_size_kb": 50
}

Storage backend: "json" (default) or "sqlite". JSON works out of the box. Switch to SQLite for large indexes (500+ cards) — enrichment writes ~1KB per card instead of rewriting the entire file. Changing the setting auto-migrates the existing JSON index, no re-enrichment needed.

What gets indexed: files matching index_extensions in project directories. Git-aware scanning uses git ls-files to automatically exclude gitignored files (.venv/, node_modules/, __pycache__/, etc.). For non-git directories, falls back to recursive file walking.

Note: ignore_dirs and ignore_files are for domain-specific exclusions — git-tracked directories/files you don't want in the knowledge index. For example, src excludes source code that git tracks but isn't useful as knowledge documents. Gitignored paths are excluded automatically and don't need to be listed here.

Document Types (`runtime/doc_types.yaml`)

Defines the vocabulary of document types used during enrichment. The agent uses these descriptions to classify files:

document_types:
  instruction:
    description: >
      Project instruction, agent rules and workflow.
  reference:
    description: >
      Reference material on API, tool, approach, or methodology.
  spec:
    description: >
      Technical specification, architecture, design document.
  task:
    description: >
      Work assignment with context, steps, and acceptance criteria.
  # ... see doc_types.example.yaml for the full list

Current built-in types: instruction, reference, task, report, spec, document, skill, utility, project_state, binary_doc, media, undef.

Connecting to Clients

Astrolabe is an MCP server with stdio transport. The client starts the server process automatically — you just tell it how.

Claude Code

To make astrolabe available from any project, add to ~/.claude/settings.json:

{
  "mcpServers": {
    "astrolabe": {
      "command": "/absolute/path/to/astrolabe-mcp/.venv/bin/python",
      "args": ["-m", "astrolabe.server"],
      "cwd": "/absolute/path/to/astrolabe-mcp",
      "env": {
        "ASTROLABE_CONFIG": "/absolute/path/to/astrolabe-mcp/runtime/config.json"
      }
    }
  }
}

Replace /absolute/path/to/astrolabe-mcp with the real path. The full path to the venv Python is required.

For a single project only, use .mcp.json in that project's root (same format).

Claude Desktop

Add to claude_desktop_config.json:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Same JSON format as above.

Verify

After adding the config, restart the client. Ask the agent:

"Call get_cosmos() from astrolabe"

It should return your project list and index statistics.

How it Works

Architecture

models.py ← config.py ← index.py ← server.py
models.py ← reader.py ←──────────────┘
models.py ← search.py ←──────────────┘
chunker.py ← embeddings.py ←─────────┘  (optional)
                                      ← web/state.py ← web/app.py

On startup, the server reads config.json, scans all project directories, and builds an index of file metadata (name, path, size, content hash). Index cards start empty — no types, summaries, or keywords.

Enrichment

The agent enriches cards by reading files and filling in metadata. Here's a typical session:

Agent → get_cosmos()
       ← {projects: 3, total: 120, empty: 45, enriched: 75}

Agent → list_docs(stale=true)
       ← {total: 45, limit: 50, offset: 0, result: [{doc_id: "web-app::docs/API.md", ...}, ...]}

Agent → read_doc("web-app::docs/API.md")
       ← {content: "# REST API Reference\n\n## Authentication\n...", total_lines: 340}

Agent → update_index_tool(
          "web-app::docs/API.md",
          type="reference",
          summary="REST API reference: authentication, endpoints, rate limits, error codes.",
          keywords=["api", "rest", "authentication", "rate-limits", "endpoints"]
        )
       ← {status: "updated", enriched_at: "2026-03-06T12:00:00Z"}

After enrichment, the card is searchable:

Agent → search_docs("authentication api")
       ← {total: 3, max_results: 20, result: [{doc_id: "web-app::docs/API.md", relevance: 0.92, ...}]}

An included enrichment skill (enrich-index) automates batch enrichment — it processes all stale cards in a forked context. See .claude/skills/enrich-index/SKILL.md.

Search

search_docs(query) performs bilingual stem matching (English + Russian) over enriched cards with field weights:

Field	Weight
keywords	3.0
headings	2.0
summary	1.5
filename	0.8

Each query token and each word in a field are stemmed with both EN and RU Snowball stemmers. A token matches a word if their stem sets intersect — so "running" finds "run", and "документы" finds "документ". Filenames are split on _, -, . before matching. Results are sorted by relevance score.

Semantic Search (optional)

deep_search(query) performs semantic search over actual file content using ChromaDB embeddings. Unlike search_docs which matches keywords on enriched cards, deep_search finds documents by meaning — even unenriched ones.

When to use: when search_docs returns too few results, or when searching for a concept rather than an exact term. The agent is guided automatically — search_docs hints at deep_search when results are sparse.

Setup: add "embeddings": true to config.json and install with pip install -e ".[embeddings]". Run reindex_tool() to build embeddings (one-time, takes ~1-2 minutes for 1500 documents). After that, deep_search is available as a separate MCP tool.

How it works: files are split into ~800-character chunks and embedded using ChromaDB's built-in model (all-MiniLM-L6-v2, ~80MB, runs locally, no API keys). On query, deep_search combines semantic similarity with stem matching for hybrid scoring.

Storage: embeddings are stored locally (runtime/.chromadb/ by default, configurable via embeddings_dir). They are not cloud-synced — ChromaDB's internal files (HNSW index) are too large for reliable cloud drive sync. Embeddings are rebuilt per machine on first reindex_tool() call; subsequent runs only update new/changed documents (tracked via manifest).

MCP Tools

Tool	Description
`get_doc_types()`	Document type vocabulary from doc_types.yaml (descriptions + examples)
`get_cosmos()`	Entry point. Projects, document types, index stats
`list_docs(project?, type?, stale?, desync?, diverged?, limit?, offset?)`	List document cards with filters and pagination
`search_docs(query, project?, type?, max_results?)`	Fast keyword search with relevance ranking
`deep_search(query, project?, max_results?)`	Semantic search over file content (requires `embeddings: true`)
`get_card(doc_id)`	Index card metadata — type, summary, keywords (no file content)
`read_doc(doc_id, section?, range?)`	Read file content — full, by heading, or line range
`update_index_tool(doc_id, type?, summary?, keywords?, headings?)`	Enrich a card (type validated against doc_types.yaml)
`reindex_tool(project?, mode?)`	Rescan filesystem. `mode`: `update` (default) / `clean` (remove missing) / `rebuild` (reset all)
`accept_divergence(doc_id)`	Accept that a previously-duplicated document was intentionally edited out of its group. Clears `diverged_from` flag

doc_id format: project::rel_path — e.g., web-app::docs/API.md.

Section reading: read_doc("web-app::docs/API.md", section="Authentication") extracts from that heading to the next heading of the same or higher level. Returns available headings if the section isn't found.

Cross-Platform Sync

Astrolabe supports sharing a single index across machines (e.g., Windows + Mac) via a cloud folder (Google Drive, OneDrive).

Setup: each machine has its own runtime/config.json (gitignored) with index_dir pointing to the shared cloud folder. Different machines may have different subsets of projects configured.

How it works:

Hash normalization — line endings (CRLF/LF) are normalized before hashing, so the same file produces the same hash on any platform
Pass-through — cards from projects not in the local config are preserved during reindex (they belong to another machine's projects)
Desync detection — if a file is missing locally but exists in the index, get_cosmos() reports desync_documents. Run reindex() to update, or git pull if files are from another machine
Stale detection — hash-based: if content_hash differs from enriched_content_hash, the card is stale and needs re-enrichment. Reliable across machines (no timestamp dependency)
Shared doc_types — doc_types.yaml is loaded from next to the index file first, then next to the config file. When using a cloud index, place doc_types.yaml in the same cloud folder to share document type definitions across machines
Reindex modes — reindex_tool(mode="clean") removes cards for deleted/moved files while preserving enrichment. reindex_tool(mode="rebuild") resets all enrichment (nuclear option). Pass-through cards from other machines are always preserved

Private Index

Some projects shouldn't be visible in the shared cloud index (personal notes, private repos). Astrolabe supports a separate private storage alongside the shared one.

Add to runtime/config.json:

{
  "projects": {
    "shared-project": "/path/to/shared"
  },
  "private_projects": {
    "my-notes": "/path/to/my-notes"
  },
  "private_index_dir": "../private-index"
}

How it works:

private_projects are indexed separately in private_index_dir (local, not cloud-synced)
The server merges both indexes in memory — all tools work transparently across shared and private documents
update_index_tool() routes saves to the correct storage based on project
reindex_tool() splits results to the correct storage
One shared doc_types.yaml — the team agrees on types, private projects use the same vocabulary
Without private_projects/private_index_dir, behavior is identical to before

Web UI

Astrolabe includes an optional local web interface for browsing and managing the index in a browser.

Install and run:

pip install -e ".[web]"
.venv/bin/python -m astrolabe.web          # macOS/Linux
# .venv\Scripts\python -m astrolabe.web    # Windows

Opens at http://127.0.0.1:8420. Custom host/port: --host 0.0.0.0 --port 9000.

Features:

Dashboard — index health overview, project stats, document type breakdown. All elements are clickable links to filtered card lists
Card list — filterable by project, type, stale/empty/desync. Filters auto-apply on change
Card editing — inline editing of type, summary, keywords, and headings. Changes are persisted to the same storage the MCP server uses
Document reader — markdown rendering with section navigation
Search — live search from the header, results ranked by relevance
Reindex — trigger reindex (update/clean/rebuild) from the header

The web server runs as a separate process and shares the storage backend with the MCP server. Changes made in the web UI are immediately visible to MCP clients, and vice versa (click Refresh to reload).

Current Limitations

Binary files — PDF, Office documents are indexed by filename only (no content extraction yet)
Media files — images, audio, video indexed by filename only
No code parsing — .py/.sh files are read as plain text, no AST analysis
Semantic search model — fixed to all-MiniLM-L6-v2, no model choice yet
Embeddings are local — not synced via cloud drives (ChromaDB HNSW files are too large); rebuilt per machine on first reindex
No file writing — index card editing via Web UI, but no document content editing via MCP
Single index file — JSON uses filelock, SQLite uses its own locking; not designed for high-throughput multi-client scenarios

Contributing

Fork, install with pip install -e ".[dev,web,embeddings]", run ruff check src/ tests/ && mypy src/ && pytest -v before submitting.

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured

astrolabe-mcp

README

astrolabe-mcp

What is this

Key Features

Quick Start

Configuration

Projects (runtime/config.json)

Document Types (runtime/doc_types.yaml)

Connecting to Clients

Claude Code

Claude Desktop

Verify

How it Works

Architecture

Enrichment

Search

Semantic Search (optional)

MCP Tools

Cross-Platform Sync

Private Index

Web UI

Current Limitations

Contributing

License

Recommended Servers

Projects (`runtime/config.json`)

Document Types (`runtime/doc_types.yaml`)