diamond

Enables AI assistants to search and read offline documentation from synced libraries via MCP tools, providing hybrid keyword and semantic search without network calls.

I got tired of watching an AI agent make fourteen tool calls to resolve a TanStack issue.

Diamond is a documentation registry and MCP server. It crawls a docs site once, stores the content locally, and serves it to your AI assistant on demand — no network call, no hallucinated APIs, no nonsense.

Inspired by pnpm.

If you like it, well then, I made this for you.

Quick Start

# Get the code
git clone https://github.com/elmojones3/diamond.git

# Navigate to it
cd diamond

# Install dependencies
pnpm install

# (Optional) Make `diamond` available globally
pnpm link --global

# Sync a library's docs
diamond sync https://mswjs.io/docs --key msw --recursive

# Install as an MCP server (Claude Code, Claude Desktop, Cursor, Gemini CLI, or Codex)
diamond install --claude-code

# Start MCP over stdio (used by MCP hosts)
diamond mcp

# Or run a persistent local HTTP MCP endpoint
diamond serve --port 65535 --bg

That's it. Your AI assistant now has offline access to MSW's documentation.

How It Works

1. Crawl with a real browser. Diamond uses Playwright to render pages the same way Chrome does. It waits for JavaScript frameworks to hydrate, then clicks through tab panels and code switchers to capture content that plain scrapers miss. It respects robots.txt and identifies as DiamondCrawler.

2. Extract the signal. Mozilla Readability (the Firefox Reader View engine) strips navbars, sidebars, and footers. dom-to-semantic-markdown converts the clean HTML to structured Markdown that LLMs read well.

3. Hybrid Search (Keyword + Semantic). Diamond builds two indices for every library: a fast MiniSearch keyword index and a semantic vector index using all-MiniLM-L6-v2. This allows your AI to find exact technical terms and conceptually related content (e.g. searching for "problems" finds "Error Handling") fully offline.

4. Store without duplication. Content is hashed with SHA-256 and stored once, regardless of how many library versions reference it. Files in versioned directories are hardlinks into this store, so multiple versions cost almost nothing extra.

5. Serve over MCP. diamond mcp (stdio) or diamond serve (HTTP) exposes everything to any MCP-compatible AI host via tools and resource URIs:

docs://msw/latest/api/handlers     ← read a specific page
repo://my-library/src/index.ts     ← read a file from a local repo
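
Step 3 blends a keyword match with a semantic match. The following is a minimal sketch of one way such a blend can rank results. The documents, the hand-made three-dimensional vectors, and the 50/50 weighting are all illustrative assumptions; Diamond's real scoring with MiniSearch and all-MiniLM-L6-v2 embeddings is not documented here and may differ.

```typescript
// Toy hybrid ranking: blends a keyword-overlap score with cosine similarity.
// The vectors are hand-made stand-ins for real embeddings (assumption).

interface Doc {
  title: string;
  text: string;
  vector: number[]; // would come from an embedding model in practice
}

// Fraction of query terms that appear as whole words in the document.
function keywordScore(query: string, doc: Doc): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const words = new Set(`${doc.title} ${doc.text}`.toLowerCase().split(/\W+/));
  const hits = terms.filter((t) => words.has(t)).length;
  return terms.length === 0 ? 0 : hits / terms.length;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// alpha weights the keyword score; (1 - alpha) weights the semantic score.
function hybridSearch(query: string, queryVector: number[], docs: Doc[], alpha = 0.5): Doc[] {
  return docs
    .map((doc) => ({
      doc,
      score: alpha * keywordScore(query, doc) + (1 - alpha) * cosine(queryVector, doc.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .map((r) => r.doc);
}

const docs: Doc[] = [
  { title: "Error Handling", text: "Respond to failed requests", vector: [0.9, 0.1, 0.0] },
  { title: "Getting Started", text: "Install the package", vector: [0.0, 0.2, 0.9] },
];

// "problems" matches no keyword, but its (assumed) vector sits near "Error Handling".
const results = hybridSearch("problems", [1.0, 0.0, 0.0], docs);
console.log(results[0].title);
```

The keyword half catches exact identifiers that embeddings tend to blur, while the vector half catches paraphrases that share no words with the query.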

CLI

# Sync docs into the registry (use this for MCP access)
diamond sync <url> --key <name> --recursive

# One-shot crawl to a local directory (no registry)
diamond crawl <url> --key <name> --recursive

# Start MCP over stdio (for host-managed MCP processes)
diamond mcp

# Start persistent HTTP MCP server
diamond serve --port 65535

# Run persistent HTTP MCP server in the background
diamond serve --bg

# Inspect running background server and tail logs
diamond view server

# Register a local git repository (immediately indexed and searchable)
diamond repo add <path> --key <name>

# Watch registered repos and keep their indices up to date as files change
diamond watch

# Remove a library or repo from the registry (reclaims disk space for docs)
diamond remove <id>

# Automatic MCP configuration
diamond install --claude-code --claude-desktop --cursor --gemini-cli --codex

Which Server Command Should I Use?

  • Use diamond mcp for Claude Code, Cursor, Codex, and other MCP hosts that spawn a stdio MCP process.
  • Use diamond serve when you want a persistent local HTTP endpoint (http://127.0.0.1:<port>/mcp), including daemon mode via --bg.
  • diamond serve defaults to port 65535; override with --port or DIAMOND_PORT.
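
As a rough illustration of the port rules above, the sketch below resolves a port from a `--port` flag, then `DIAMOND_PORT`, then the default of 65535, and builds the endpoint URL. The function names are hypothetical, and the precedence of the flag over the environment variable is an assumption, not something the README states.

```typescript
// Resolves the HTTP port: --port flag, then DIAMOND_PORT, then the default.
// Flag-over-environment precedence is an assumption.

const DEFAULT_PORT = 65535;

function resolvePort(argv: string[], env: Record<string, string | undefined>): number {
  const flagIndex = argv.indexOf("--port");
  const raw = flagIndex !== -1 ? argv[flagIndex + 1] : env["DIAMOND_PORT"];
  if (raw === undefined) return DEFAULT_PORT;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}

// The local endpoint an HTTP-capable MCP host would be pointed at.
function endpointUrl(port: number): string {
  return `http://127.0.0.1:${port}/mcp`;
}

console.log(endpointUrl(resolvePort(["serve"], {})));
console.log(endpointUrl(resolvePort(["serve", "--port", "8080"], { DIAMOND_PORT: "9000" })));
```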

MCP Tools

Once your MCP host is configured with diamond mcp (or you run diamond serve for HTTP), your AI assistant has access to:

| Tool | What it does |
| --- | --- |
| list_registry | List all synced libraries and repos |
| sync_docs | Crawl and sync a library (callable from the AI) |
| search_library | Hybrid search (keyword + semantic) across docs or repos |
| remove_library | Remove a library or repo from the registry |
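
MCP hosts invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds one for search_library to show the envelope. The argument names `library` and `query` are hypothetical placeholders; the real input schema is whatever the server reports through `tools/list`.

```typescript
// Builds a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
// The argument names `library` and `query` are hypothetical (assumption).

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++, // each request needs its own id so responses can be matched
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall("search_library", { library: "msw", query: "error handling" });
console.log(JSON.stringify(request, null, 2));
```

In normal use the host builds and sends this message itself; writing it by hand is mainly useful for debugging a server.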

MCP Setup

You can use diamond install to automatically configure your tools, or manually edit your configuration files:

Claude Desktop (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "diamond": {
      "command": "diamond",
      "args": ["mcp"]
    }
  }
}

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "diamond": {
      "command": "diamond",
      "args": ["mcp"]
    }
  }
}

Codex (~/.codex/config.toml):

[mcp_servers.diamond]
command = "diamond"
args = ["mcp"]
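
When editing the JSON-based configs by hand, the main risk is overwriting servers that are already registered. The sketch below shows a merge that adds the diamond entry and leaves other entries intact. It mirrors the manual edit above and is not a description of how diamond install works internally; the type names and the `other-server` entry are hypothetical.

```typescript
// Adds a "diamond" entry to an MCP host config without touching other servers.
// Not a description of `diamond install` internals (assumption).

interface McpServer {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServer>;
  [other: string]: unknown; // hosts may keep unrelated settings in the same file
}

function withDiamond(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      diamond: { command: "diamond", args: ["mcp"] },
    },
  };
}

// An existing config with a hypothetical server already registered.
const existing: McpConfig = JSON.parse(
  '{"theme": "dark", "mcpServers": {"other": {"command": "other-server", "args": []}}}'
);
const updated = withDiamond(existing);
console.log(JSON.stringify(updated, null, 2));
```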

Storage Layout

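Diamond follows the XDG Base Directory Specification. Override the default locations with the standard XDG environment variables: XDG_CONFIG_HOME for the registry manifest and XDG_DATA_HOME for the store and versioned views.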

~/.config/diamond/registry.json    ← manifest of all synced libraries
~/.local/share/diamond/store/      ← content-addressable store (SHA-256)
~/.local/share/diamond/storage/    ← versioned views (hardlinked from store)
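
To make the store and its hardlinked views concrete, here is a minimal sketch that resolves the data directories from XDG_DATA_HOME, writes a page into a SHA-256 addressed store, and links it into two versioned views. The one-file-per-digest naming, the `msw/1.0.0/handlers.md` view layout, and all function names are assumptions for illustration, not Diamond's actual on-disk format.

```typescript
import { createHash } from "node:crypto";
import { existsSync, linkSync, mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Resolve the data directories from XDG_DATA_HOME, falling back to the spec default.
function dataDirs(env: Record<string, string | undefined>, home: string) {
  const dataHome = env["XDG_DATA_HOME"] || join(home, ".local", "share");
  return {
    store: join(dataHome, "diamond", "store"),
    storage: join(dataHome, "diamond", "storage"),
  };
}

// Write content into the store under its SHA-256 digest, only if it is new.
function putInStore(storeDir: string, content: string): string {
  const digest = createHash("sha256").update(content).digest("hex");
  const blobPath = join(storeDir, digest); // hypothetical naming scheme
  if (!existsSync(blobPath)) {
    mkdirSync(storeDir, { recursive: true });
    writeFileSync(blobPath, content);
  }
  return blobPath;
}

// Expose a stored blob inside a versioned view through a hardlink.
function linkIntoView(blobPath: string, target: string): void {
  mkdirSync(dirname(target), { recursive: true });
  if (!existsSync(target)) linkSync(blobPath, target);
}

// Demo in a throwaway directory so nothing under the real home is touched.
const sandbox = mkdtempSync(join(tmpdir(), "diamond-demo-"));
const dirs = dataDirs({ XDG_DATA_HOME: sandbox }, "/home/alice");
const page = "# Handlers\n\nIdentical page in two versions.\n";

const v1 = join(dirs.storage, "msw", "1.0.0", "handlers.md"); // hypothetical layout
const v2 = join(dirs.storage, "msw", "2.0.0", "handlers.md");
linkIntoView(putInStore(dirs.store, page), v1);
linkIntoView(putInStore(dirs.store, page), v2);

// Both views point at the same inode, so the bytes exist on disk once.
console.log(statSync(v1).ino === statSync(v2).ino);
```

Hardlinks only work within a single filesystem, so a layout like this needs the store and the views to live on the same volume.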

Requirements

  • Node.js 22+
  • pnpm
  • Playwright / Chromium — on first use, run npx playwright install chromium

Contributing

Contributions are welcome. See CONTRIBUTING.md to get started.

License

MIT
