MCP Knowledge Base Server

A local knowledge base server that connects to AI assistants, turning markdown files into a semantically searchable memory layer via OpenAI embeddings and SQLite.

Give your AI assistant a long-term memory. Drop markdown files into a folder — Claude instantly knows your notes, docs, and code snippets.

What is this?

MCP Knowledge Base Server is a personal knowledge base that connects directly to AI assistants like Claude Desktop and Cursor. It turns a local folder of markdown files into a queryable, semantically searchable memory layer, with no external database and no hosted infrastructure beyond the OpenAI embeddings API.

It works over the Model Context Protocol (MCP), a standard that lets AI assistants call tools and retrieve data from local servers. When you ask Claude "what do my notes say about deployment?", it uses this server to run a semantic similarity search over your embedded documents and return the most relevant chunks. The index and the similarity search stay on your machine; only embedding generation calls the OpenAI API.

Under the hood, it uses OpenAI's text-embedding-3-small model to generate vector embeddings for your markdown content, stores them in a local SQLite database, and performs cosine similarity search to find the most relevant material. The server auto-indexes on startup (only re-embedding changed files), so your knowledge base stays fresh with zero manual work.
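To make the search step concrete, here is a minimal sketch of cosine-similarity ranking over stored embeddings. The `Chunk` shape and the function names are illustrative assumptions, not the server's actual code or schema.

```typescript
// Hypothetical shape of an indexed chunk; the real schema is not documented here.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query embedding and keep the best topN.
function topMatches(
  query: number[],
  chunks: Chunk[],
  topN = 5
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topN);
}
```

A linear scan like this is simple and needs no vector index, at the cost of touching every chunk on each query.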

Features

  • 🔍 Semantic search across your knowledge base using OpenAI embeddings
  • 📄 Full document retrieval by title or file path
  • 🏷️ Topic browser — list all tags and categories with document counts
  • ✍️ AI-driven note creation — let Claude write notes back to your KB
  • Incremental indexing — only re-embeds changed files on startup
  • 🗄️ Zero-dependency storage — SQLite, no external DB required
  • 🔧 Simple CLI: mcp-kb index, mcp-kb stats, mcp-kb serve

Prerequisites

  • Node.js >= 18
  • An OpenAI API key (for embeddings — uses text-embedding-3-small, ~$0.02/1M tokens)

Quick Start

1. Clone and install

git clone https://github.com/YOUR_USERNAME/mcp-knowledge-base.git
cd mcp-knowledge-base
npm install

2. Set your OpenAI API key

cp .env.example .env
# Edit .env and add your key:
# OPENAI_API_KEY=sk-...

3. Add your markdown files

Drop .md files into the knowledge/ directory. They can have YAML frontmatter:

---
title: "My Note"
tags: [typescript, backend]
category: "guides"
---

# Content here
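
For illustration, the sketch below shows how frontmatter of this simple shape could be separated from the body. The project lists gray-matter for this job; this dependency-free stand-in is an assumption for demonstration only, and it handles just flat `key: value` pairs and inline `[a, b]` lists, not full YAML.

```typescript
interface ParsedNote {
  data: Record<string, string | string[]>;
  content: string;
}

// Strip one pair of surrounding quotes, if present.
function unquote(value: string): string {
  return value.replace(/^["']|["']$/g, "");
}

// Split a markdown source into frontmatter fields and body text.
// Not a YAML parser: nested keys and multi-line values are ignored.
function parseFrontmatter(source: string): ParsedNote {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { data: {}, content: source };

  const data: Record<string, string | string[]> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue; // skip lines that are not key: value
    const key = line.slice(0, sep).trim();
    const raw = line.slice(sep + 1).trim();
    if (raw.startsWith("[") && raw.endsWith("]")) {
      // Inline list such as [typescript, backend]
      data[key] = raw
        .slice(1, -1)
        .split(",")
        .map((item) => unquote(item.trim()))
        .filter((item) => item.length > 0);
    } else {
      data[key] = unquote(raw);
    }
  }
  return { data, content: source.slice(match[0].length) };
}
```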

4. Index your knowledge base

npm run build
npx mcp-kb index

5. Configure Claude Desktop

Add to your claude_desktop_config.json (location below):

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "knowledge-base": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-knowledge-base/dist/server.cjs"],
      "env": {
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

Then restart Claude Desktop.

CLI Reference

| Command | Description |
|---|---|
| npx mcp-kb index | Index all markdown files in the knowledge directory |
| npx mcp-kb index --force | Force re-index all files (ignores change detection) |
| npx mcp-kb index --dry-run | Preview what would be indexed without API calls |
| npx mcp-kb stats | Show KB statistics (docs, chunks, tags, DB size) |
| npx mcp-kb serve | Start the MCP server manually |

MCP Tools

Once connected, Claude has access to these tools:

| Tool | Description |
|---|---|
| search_knowledge | Semantic search: ask in natural language |
| get_document | Retrieve a full document by title or path |
| list_topics | Browse all tags and categories |
| add_note | Let Claude write a new note to your KB |
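
For readers curious what a tool invocation looks like on the wire, the sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP clients send. The `query` argument name is an assumption; the server's real input schema is whatever it reports via `tools/list`. The helper `buildToolCall` is likewise hypothetical.

```typescript
// JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "query" is an assumed argument name, not confirmed by this README.
const request = buildToolCall("search_knowledge", { query: "deployment checklist" });

// Over the stdio transport, each message is a single line of JSON.
const wireMessage = JSON.stringify(request) + "\n";
```

In practice Claude Desktop or Cursor builds these messages for you; the sketch is only meant to show what the four tools look like from the client side.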

Example Prompts

Once configured in Claude Desktop, try:

  • "What do my notes say about authentication middleware?"
  • "Show me everything tagged 'typescript'"
  • "What topics are covered in my knowledge base?"
  • "Write a summary of today's meeting and save it as a note"
  • "Get the full content of my deployment guide"

Configuration

Edit config.json to customize behavior:

{
  "knowledgeDir": "./knowledge",
  "embeddingModel": "text-embedding-3-small",
  "chunkSize": 500,
  "chunkOverlap": 50,
  "topN": 5,
  "openaiApiKey": "$OPENAI_API_KEY"
}

| Field | Default | Description |
|---|---|---|
| knowledgeDir | ./knowledge | Directory to scan for .md files |
| embeddingModel | text-embedding-3-small | OpenAI model for embeddings |
| chunkSize | 500 | Max tokens per chunk |
| chunkOverlap | 50 | Overlap tokens between chunks |
| topN | 5 | Default number of search results |
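
To show how chunkSize and chunkOverlap interact, here is a minimal sliding-window sketch. It assumes tokens can be approximated by whitespace-separated words; the server's actual tokenizer and splitting rules are not documented here, and `chunkText` is a hypothetical name.

```typescript
// Split text into chunks of at most chunkSize tokens, where each chunk
// repeats the last chunkOverlap tokens of the previous one.
// Assumption: a "token" is a whitespace-separated word.
function chunkText(text: string, chunkSize = 500, chunkOverlap = 50): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }

  const tokens = text.split(/\s+/).filter((token) => token.length > 0);
  const step = chunkSize - chunkOverlap; // how far the window advances each time
  const chunks: string[] = [];

  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= tokens.length) break; // window already reached the end
  }
  return chunks;
}
```

Under this word-based approximation and the defaults above, a 1,200-word document produces three chunks, starting at words 0, 450, and 900.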

Architecture

The server follows a three-layer architecture:

  • Ingestion scans markdown files, parses YAML frontmatter, splits content into overlapping chunks, and generates OpenAI embeddings, but only for files that have changed since the last run (detected via SHA-256 hashing).
  • Storage persists documents, chunks, and embeddings in a local SQLite database, using WAL mode for write performance and an in-memory cache for fast similarity lookups.
  • MCP Server exposes four tools to the AI assistant via stdio transport and handles all search, retrieval, and write operations.
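
The hash-based change detection in the ingestion layer can be sketched roughly as follows. This is an illustration under assumptions: the function names and the path-to-hash map are hypothetical, and the real server stores its hashes in SQLite rather than in memory.

```typescript
import { createHash } from "node:crypto";

// SHA-256 hex digest of a file's contents.
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// stored:  path -> hash recorded by the previous indexing run
// current: path -> file contents read during this run
// Returns the paths whose contents are new or changed and need re-embedding.
function filesToReindex(
  stored: Map<string, string>,
  current: Map<string, string>
): string[] {
  const changed: string[] = [];
  current.forEach((content, path) => {
    if (stored.get(path) !== contentHash(content)) {
      changed.push(path);
    }
  });
  return changed;
}
```

Comparing content hashes rather than modification times means a file that is touched but not edited costs no embedding calls.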

Tech Stack

  • Runtime: Node.js + TypeScript
  • MCP SDK: @modelcontextprotocol/sdk
  • Embeddings: OpenAI text-embedding-3-small
  • Storage: better-sqlite3 (WAL mode)
  • Markdown: gray-matter + markdown-it
  • CLI: commander
  • Build: tsup
  • Tests: vitest

Development

npm run build     # compile TypeScript
npm test          # run tests
npm run dev       # watch mode build

Limitations & Roadmap

Current limitations (v1):

  • Markdown files only (.md)
  • In-process similarity search — works well up to ~10k chunks
  • Single knowledge base (no namespacing)

Planned for future versions:

  • pgvector support for large knowledge bases
  • Multi-format ingestion (PDF, HTML, plain text)
  • Namespace support (work/personal/project-x)
  • Local embedding models (offline operation)
  • Obsidian vault compatibility with [[wikilinks]]

License

MIT — see LICENSE
