MCP Knowledge Base Server
A local knowledge base server that connects to AI assistants, turning markdown files into a semantically searchable memory layer via OpenAI embeddings and SQLite.
Give your AI assistant a long-term memory. Drop markdown files into a folder — Claude instantly knows your notes, docs, and code snippets.
What is this?
MCP Knowledge Base Server is a personal knowledge base that connects directly to AI assistants like Claude Desktop and Cursor. It turns a local folder of markdown files into a queryable, semantically searchable memory layer — without any cloud infrastructure or external databases.
It works over the Model Context Protocol (MCP), a standard that lets AI assistants call tools and retrieve data from local servers. When you ask Claude "what do my notes say about deployment?", it uses this server to run a semantic similarity search over your embedded documents and return the most relevant chunks. The similarity search itself runs locally against the stored embeddings; generating embeddings still requires calls to the OpenAI API.
Under the hood, it uses OpenAI's text-embedding-3-small model to generate vector embeddings for your markdown content, stores them in a local SQLite database, and performs cosine similarity search to find the most relevant material. The server auto-indexes on startup (only re-embedding changed files), so your knowledge base stays fresh with zero manual work.
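The cosine similarity ranking described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `EmbeddedChunk` shape and the `topNChunks` helper are hypothetical names, and the real server may store and score embeddings differently.

```typescript
// Hypothetical shape of an embedded chunk; the real schema is not shown in this README.
interface EmbeddedChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query embedding and keep the best topN.
function topNChunks(
  queryEmbedding: number[],
  chunks: EmbeddedChunk[],
  topN: number = 5
): { chunk: EmbeddedChunk; score: number }[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topN);
}
```

A brute-force scan like this is linear in the number of chunks, which fits the in-process design described under Limitations.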
Features
- 🔍 Semantic search across your knowledge base using OpenAI embeddings
- 📄 Full document retrieval by title or file path
- 🏷️ Topic browser — list all tags and categories with document counts
- ✍️ AI-driven note creation — let Claude write notes back to your KB
- ⚡ Incremental indexing — only re-embeds changed files on startup
- 🗄️ Zero-dependency storage — SQLite, no external DB required
- 🔧 Simple CLI: `mcp-kb index`, `mcp-kb stats`, `mcp-kb serve`
Prerequisites
- Node.js >= 18
- An OpenAI API key (for embeddings; uses `text-embedding-3-small`, ~$0.02/1M tokens)
Quick Start
1. Clone and install
git clone https://github.com/YOUR_USERNAME/mcp-knowledge-base.git
cd mcp-knowledge-base
npm install
2. Set your OpenAI API key
cp .env.example .env
# Edit .env and add your key:
# OPENAI_API_KEY=sk-...
3. Add your markdown files
Drop .md files into the knowledge/ directory. They can have YAML frontmatter:
---
title: "My Note"
tags: [typescript, backend]
category: "guides"
---
# Content here
4. Index your knowledge base
npm run build
npx mcp-kb index
5. Configure Claude Desktop
Add to your claude_desktop_config.json (location below):
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
{
  "mcpServers": {
    "knowledge-base": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-knowledge-base/dist/server.cjs"],
      "env": {
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}
Then restart Claude Desktop.
CLI Reference
| Command | Description |
|---|---|
| `npx mcp-kb index` | Index all markdown files in the knowledge directory |
| `npx mcp-kb index --force` | Force re-index all files (ignores change detection) |
| `npx mcp-kb index --dry-run` | Preview what would be indexed without API calls |
| `npx mcp-kb stats` | Show KB statistics (docs, chunks, tags, DB size) |
| `npx mcp-kb serve` | Start the MCP server manually |
MCP Tools
Once connected, Claude has access to these tools:
| Tool | Description |
|---|---|
| `search_knowledge` | Semantic search: ask in natural language |
| `get_document` | Retrieve a full document by title or path |
| `list_topics` | Browse all tags and categories |
| `add_note` | Let Claude write a new note to your KB |
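For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `search_knowledge`. The `buildToolCall` helper is hypothetical, and the `query` argument name is an assumption, since this README does not document the tools' input schemas.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// ASSUMPTION: "query" is an illustrative argument name, not confirmed by the README.
const request = buildToolCall(1, "search_knowledge", {
  query: "What do my notes say about deployment?",
});

console.log(JSON.stringify(request, null, 2));
```

In normal use Claude Desktop constructs these requests itself over the stdio transport; writing one by hand is mainly useful when debugging the server.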
Example Prompts
Once configured in Claude Desktop, try:
- "What do my notes say about authentication middleware?"
- "Show me everything tagged 'typescript'"
- "What topics are covered in my knowledge base?"
- "Write a summary of today's meeting and save it as a note"
- "Get the full content of my deployment guide"
Configuration
Edit config.json to customize behavior:
{
  "knowledgeDir": "./knowledge",
  "embeddingModel": "text-embedding-3-small",
  "chunkSize": 500,
  "chunkOverlap": 50,
  "topN": 5,
  "openaiApiKey": "$OPENAI_API_KEY"
}
| Field | Default | Description |
|---|---|---|
| `knowledgeDir` | `./knowledge` | Directory to scan for `.md` files |
| `embeddingModel` | `text-embedding-3-small` | OpenAI model for embeddings |
| `chunkSize` | `500` | Max tokens per chunk |
| `chunkOverlap` | `50` | Overlap tokens between chunks |
| `topN` | `5` | Default number of search results |
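To illustrate how `chunkSize` and `chunkOverlap` interact, here is a minimal sliding-window sketch. It assumes whitespace-separated words as a stand-in for tokens; the server's actual tokenizer and splitting rules are not described in this README, and `chunkWithOverlap` is a hypothetical name.

```typescript
// Split text into windows of at most chunkSize units, where each window
// repeats the last chunkOverlap units of the previous one.
// ASSUMPTION: words approximate tokens here; the real chunker may differ.
function chunkWithOverlap(
  text: string,
  chunkSize: number = 500,
  chunkOverlap: number = 50
): string[] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = chunkSize - chunkOverlap; // how far the window advances each time
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) {
      break; // this window already reached the end of the text
    }
  }
  return chunks;
}
```

With the defaults, the window advances 450 units at a time, so a sentence that straddles a chunk boundary is likely to appear intact in at least one chunk.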
Architecture
The server follows a clean three-layer architecture:
- Ingestion scans markdown files, parses YAML frontmatter, splits content into overlapping chunks, and generates OpenAI embeddings, only for files that have changed since the last run (via SHA-256 hashing).
- Storage persists documents, chunks, and embeddings in a local SQLite database using WAL mode for write performance and an in-memory cache for fast similarity lookups.
- MCP Server exposes four tools to the AI assistant via stdio transport, handling all search, retrieval, and write operations.
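The SHA-256 change detection used by the ingestion layer could look roughly like the sketch below. It assumes the previous run's hashes are available as a map from file path to hex digest; `sha256Hex` and `filesNeedingReindex` are hypothetical names, and the real server presumably keeps these hashes in SQLite.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a file's content.
function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Return the paths whose current content hash differs from the stored hash,
// including files that have no stored hash yet (new files).
// ASSUMPTION: both maps are keyed by file path; this layout is illustrative.
function filesNeedingReindex(
  currentContents: Map<string, string>,
  previousHashes: Map<string, string>
): string[] {
  const changed: string[] = [];
  currentContents.forEach((content, path) => {
    if (previousHashes.get(path) !== sha256Hex(content)) {
      changed.push(path);
    }
  });
  return changed;
}
```

Hashing content rather than comparing modification times means that a file which is touched but not edited is not re-embedded, which avoids unnecessary API calls.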
Tech Stack
- Runtime: Node.js + TypeScript
- MCP SDK: `@modelcontextprotocol/sdk`
- Embeddings: OpenAI `text-embedding-3-small`
- Storage: `better-sqlite3` (WAL mode)
- Markdown: `gray-matter` + `markdown-it`
- CLI: `commander`
- Build: `tsup`
- Tests: `vitest`
Development
npm run build # compile TypeScript
npm test # run tests
npm run dev # watch mode build
Limitations & Roadmap
Current limitations (v1):
- Markdown files only (`.md`)
- In-process similarity search; works well up to ~10k chunks
- Single knowledge base (no namespacing)
Planned for future versions:
- pgvector support for large knowledge bases
- Multi-format ingestion (PDF, HTML, plain text)
- Namespace support (work/personal/project-x)
- Local embedding models (offline operation)
- Obsidian vault compatibility with `[[wikilinks]]`
License
MIT — see LICENSE