
📚 llm-wiki-kit

Stop re-explaining your research to your AI agent every session.

License: MIT Python 3.10+


llm-wiki-kit gives your AI agent a persistent, structured memory that compounds over time. Drop PDFs, URLs, YouTube videos — your agent builds a wiki, connects the dots, and remembers everything across sessions.

Based on Karpathy's LLM Wiki pattern. Works with Claude, Codex, Cursor, Windsurf, and any MCP-compatible agent.


The Problem

Every time you start a new chat:

You: "Remember that paper on speculative decoding I shared last week?"
Agent: "I don't have access to previous conversations..."
You: *sighs, re-uploads PDF, re-explains context*

You're constantly re-teaching your agent things it should already know.

The Solution

With llm-wiki-kit, your agent maintains its own knowledge base:

You: "What did we learn about speculative decoding?"
Agent: *searches wiki* "Based on the 3 papers you've shared, the Eagle 
       architecture shows the best efficiency tradeoffs because..."

The wiki persists. Cross-references build up. Your agent gets smarter with every source you add.


⚡ Quickstart (2 minutes)

1. Install

pip install "llm-wiki-kit[all] @ git+https://github.com/iamsashank09/llm-wiki-kit.git"

2. Initialize a wiki

mkdir my-research && cd my-research
llm-wiki-kit init --agent claude

3. Connect your agent

Add to Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "llm-wiki-kit": {
      "command": "llm-wiki-kit",
      "args": ["serve", "--root", "/path/to/my-research"]
    }
  }
}

<details> <summary><b>Other agents (Codex, Cursor, Windsurf)</b></summary>

OpenAI Codex

codex mcp add llm-wiki-kit -- llm-wiki-kit serve --root /path/to/my-research

Cursor

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "llm-wiki-kit": {
      "command": "llm-wiki-kit",
      "args": ["serve", "--root", "/path/to/my-research"]
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "llm-wiki-kit": {
      "command": "llm-wiki-kit",
      "args": ["serve", "--root", "/path/to/my-research"]
    }
  }
}

</details>

4. Use it

You: "Ingest this paper: raw/attention-is-all-you-need.pdf"
Agent: *creates wiki pages, cross-references concepts, updates index*

You: "Now ingest https://youtube.com/watch?v=kCc8FmEb1nY"
Agent: *extracts transcript, links to existing transformer concepts*

You: "How does the attention mechanism in the paper relate to Karpathy's explanation?"
Agent: *searches wiki, synthesizes answer from both sources*

Your agent now has persistent memory that survives across sessions.


🔄 What Makes This Different

| Feature | Why It Matters |
| --- | --- |
| Multi-format ingest | PDFs, URLs, YouTube, markdown — just drop it in |
| Auto cross-referencing | Agent builds `[[wiki links]]` between related concepts |
| Persistent across sessions | Start fresh chats without losing context |
| Full-text search | Agent finds relevant pages instantly (SQLite FTS5) |
| Health checks | `wiki_lint` catches broken links, orphan pages, contradictions |
| Zero lock-in | It's just markdown files in a folder — view them in Obsidian, VS Code, anywhere |
| Works with any MCP agent | Claude, Codex, Cursor, Windsurf, and more |
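The full-text search feature can be pictured with a minimal SQLite FTS5 sketch. This is illustrative only: the table name and schema are invented, not llm-wiki-kit's actual internals, and it assumes your Python's SQLite build includes FTS5 (most standard builds do).

```python
import sqlite3

# Toy in-memory index; a real index would be rebuilt from the wiki's markdown files.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE pages USING fts5(path, body)")
con.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [
        ("concepts/attention.md", "Scaled dot-product attention weighs values by similarity."),
        ("sources/paper-1.md", "Speculative decoding drafts tokens cheaply, then verifies them."),
    ],
)

# FTS5 MATCH query, best matches first (bm25-based rank).
rows = con.execute(
    "SELECT path FROM pages WHERE pages MATCH ? ORDER BY rank",
    ("attention",),
).fetchall()
print(rows)  # [('concepts/attention.md',)]
```

Because the wiki is plain markdown, an index like this is disposable: it can always be regenerated by re-reading the files.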

📄 Supported Sources

Your agent can ingest anything:

| Drop this... | Get this... |
| --- | --- |
| `raw/paper.pdf` | Extracted text, page markers, metadata |
| `https://arxiv.org/abs/...` | Clean article content, auto-saved to `raw/` |
| `https://youtube.com/watch?v=...` | Full transcript with timestamps |
| `raw/notes.md` | Direct markdown ingestion |
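Under the hood, "just drop it in" comes down to routing each source by its shape. A toy dispatcher sketches the idea; the function name and return values are invented for illustration, not the package's API:

```python
def classify_source(src: str) -> str:
    """Route a source string to an ingestion path (illustrative only)."""
    if "youtube.com/watch" in src or "youtu.be/" in src:
        return "youtube"   # transcript extraction
    if src.startswith(("http://", "https://")):
        return "web"       # article extraction, saved to raw/
    if src.endswith(".pdf"):
        return "pdf"       # text + page markers
    if src.endswith(".md"):
        return "markdown"  # ingested as-is
    raise ValueError(f"unsupported source: {src}")
```

Note that the YouTube check must run before the generic URL check, since a YouTube link is also an `https://` URL.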

Install what you need:

pip install "llm-wiki-kit[pdf]"      # PDF support
pip install "llm-wiki-kit[web]"      # URL extraction  
pip install "llm-wiki-kit[youtube]"  # YouTube transcripts
pip install "llm-wiki-kit[all]"      # Everything

🧠 How It Works

┌─────────────────────────────────────────────────────────┐
│  YOU                                                    │
│  "Ingest this paper. How does it relate to X?"          │
└───────────────────────┬─────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────┐
│  WIKI (agent-maintained)                                │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ concepts/    │  │ sources/     │  │ synthesis/   │   │
│  │ attention.md │◄─┤ paper-1.md   ├─►│ cache.md     │   │
│  │ [[linked]]   │  │ [[linked]]   │  │ [[linked]]   │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
│                                                         │
│  + index.md (table of contents)                         │
│  + log.md (what happened when)                          │
└───────────────────────┬─────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────┐
│  RAW SOURCES (immutable)                                │
│  paper.pdf, article.html, transcript.md                 │
└─────────────────────────────────────────────────────────┘

The agent reads raw sources, writes wiki pages, and maintains the connections. You never touch the wiki directly — the agent does all the work.


🛠 Available Tools

Your agent gets these MCP tools:

| Tool | What it does |
| --- | --- |
| `wiki_ingest` | Process any source (file, URL, YouTube) |
| `wiki_write_page` | Create or update a wiki page |
| `wiki_read_page` | Read a specific page |
| `wiki_search` | Full-text search across all pages |
| `wiki_lint` | Find broken links, orphans, empty pages |
| `wiki_status` | Overview: page count, sources, recent activity |
| `wiki_log` | Append to the operation log |
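The broken-link half of a check like `wiki_lint` can be sketched in a few lines: scan every page for `[[wiki links]]` and flag targets with no matching file. The helper name and the exact link syntax handled here are simplifications for illustration, not the tool's real implementation:

```python
import re
from pathlib import Path

# Capture the target of [[target]] or [[target|alias]] or [[target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def find_broken_links(wiki_root: str) -> list[tuple[str, str]]:
    """Return (page, target) pairs where [[target]] has no matching .md file."""
    root = Path(wiki_root)
    titles = {p.stem for p in root.rglob("*.md")}
    broken = []
    for page in sorted(root.rglob("*.md")):
        for target in WIKILINK.findall(page.read_text(encoding="utf-8")):
            if target.strip() not in titles:
                broken.append((str(page.relative_to(root)), target.strip()))
    return broken
```

Orphan detection is the inverse check: pages whose name never appears as a link target anywhere else in the wiki.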

💡 Use Cases

Research: Feed papers into your wiki over weeks. Ask synthesis questions that span all your reading.

Technical onboarding: Ingest a codebase's docs. Your agent answers architecture questions from accumulated context.

Competitive intel: Add market reports, earnings calls, news. Agent maintains a living landscape that updates as you add more.

Learning: Watch YouTube tutorials, read blog posts. Agent builds a personalized wiki of everything you've studied.

Book notes: Ingest chapters as you read. Agent tracks characters, themes, plot threads, and connections.


🔍 Pro Tips

- Use Obsidian to visualize your wiki's graph — it's just a folder of markdown files
- Git init your wiki directory — get version history for free
- Let the agent link aggressively — the value compounds in the connections
- Run `wiki_lint` periodically — it catches contradictions and gaps in your knowledge base
- Start small — even 5-10 sources produce a surprisingly useful wiki

📦 Development

git clone https://github.com/iamsashank09/llm-wiki-kit
cd llm-wiki-kit
uv venv && source .venv/bin/activate
uv pip install -e ".[all]"

🙏 Credits

Based on the LLM Wiki idea by Andrej Karpathy.

📄 License

MIT — do whatever you want with it.
