MCP Servers

supermem

MCP server providing persistent AI memory with four-tier retrieval (SQLite FTS5, graph, vector, LLM agent) to give AI assistants structured, long-term memory without RAG.

README

supermem

Persistent AI memory without RAG — four-tier retrieval that uses an LLM agent only as a last resort, backed by SQLite FTS5, an embedded graph database, and your local markdown vault.

An MCP (Model Context Protocol) server that gives AI assistants — Claude Desktop, LM Studio, ChatGPT — persistent, structured memory backed by SQLite + an optional graph database. The LLM agent is tier 4, not the default path — most queries resolve in milliseconds via full-text search.

Quick Start (Personal, No GPU)

pip install supermem

# Point supermem at a directory of markdown files
export SUPERMEM_VAULT_PATH=~/notes
export SUPERMEM_LLM_PROVIDER=openrouter
export OPENROUTER_API_KEY=your_key_here

# Start the MCP server (add to Claude Desktop's mcp.json)
supermem serve

Add to Claude Desktop mcp.json:

{
  "mcpServers": {
    "supermem": {
      "command": "supermem",
      "args": ["serve"]
    }
  }
}

Quick Start (Production with Docker)

# Clone and configure
git clone https://github.com/lamenting-hawthorn/supermem
cp .env.example .env
# Edit .env: set SUPERMEM_VAULT_PATH, SUPERMEM_LLM_PROVIDER, API keys

# MCP server only (stdio, for Claude Desktop)
docker compose up supermem-mcp

# MCP server + HTTP dashboard
docker compose --profile worker up

# Dashboard at http://localhost:37777

Architecture: Four-Tier Retrieval

Every query goes through tiers in order, short-circuiting when enough results are found. Tiers 1–3 never call an LLM.

Query
  │
  ├─ Tier 1: SQLite FTS5 full-text search          ~1ms    always available
  │          porter tokenizer, WAL mode
  │
  ├─ Tier 2: Kuzu embedded graph expansion         ~5ms    optional (install kuzu)
  │          BFS traversal via [[wikilink]] edges
  │
  ├─ Tier 3: ChromaDB vector similarity            ~50ms   optional (SUPERMEM_VECTOR=true)
  │          sentence-transformer embeddings
  │
  └─ Tier 4: LLM agent fallback                   ~5-30s  always available
             navigates vault via Python sandbox

Short-circuit rule: if tier 1 returns ≥ min_results (default 3), tiers 2–4 are skipped entirely. Unavailable tiers are skipped with a WARNING log — no errors raised.

MCP Tool Reference

Tool	Parameters	Returns	Notes
`use_memory_agent`	`query: str`	Formatted answer	Backward-compatible. Routes through all 4 tiers; falls back to full agent only if tiers 1–3 insufficient
`supermem_hybrid`	`query: str`, `tier_limit: int = 4`	JSON with `obs_ids`, `source_tier`, `latency_ms`	Preferred for programmatic use. Token-efficient — returns IDs first
`get_observations`	`ids: list[int]`	JSON array of observation dicts	Fetch full content for specific IDs
`get_timeline`	`obs_id: int`, `window: int = 5`	JSON array of chronological observations	Context around a specific observation

Progressive Disclosure Pattern

# 1. Search — cheap, returns IDs only
result = await supermem_hybrid("Alice's project status", tier_limit=2)
# {"obs_ids": [42, 17, 88], "source_tier": 1, "latency_ms": 2.1}

# 2. Fetch — only for IDs you actually need
obs = await get_observations([42, 17])
# [{"id": 42, "content": "...", "tier_used": 1}, ...]

# 3. Timeline — context around interesting observations  
ctx = await get_timeline(42, window=3)

Environment Variables

Variable	Default	Description
`SUPERMEM_LLM_PROVIDER`	`openrouter`	`openrouter` \| `ollama` \| `claude` \| `lmstudio`
`SUPERMEM_LLM_MODEL`	provider default	Model string (e.g. `openai/gpt-4o-mini`, `llama3`)
`SUPERMEM_DB_PATH`	`~/.supermem/supermem.db`	SQLite database path
`SUPERMEM_VAULT_PATH`	`.memory_path` file	Markdown vault directory
`SUPERMEM_VECTOR`	`false`	Set `true` to enable ChromaDB tier
`SUPERMEM_API_KEY`	(none)	Bearer token for HTTP API auth (disabled if unset)
`SUPERMEM_RATE_LIMIT`	`60`	Requests/minute limit
`SUPERMEM_WORKER_PORT`	`37777`	HTTP dashboard port
`SUPERMEM_COMPRESS_EVERY`	`50`	Observations written before LLM compression
`OPENROUTER_API_KEY`	(required for openrouter)	OpenRouter API key
`ANTHROPIC_API_KEY`	(required for claude)	Anthropic API key
`OLLAMA_HOST`	`http://localhost:11434`	Ollama server URL
`LMSTUDIO_HOST`	`http://localhost:1234`	LM Studio server URL

Note: Local model inference (vLLM/CUDA) is an optional extra. Install with pip install supermem[local] if you need it. Not included in the default install.

Connector Guide

Import external data into your vault with one command:

# ChatGPT export (Settings → Data controls → Export data → .zip)
supermem connect chatgpt ~/Downloads/chatgpt_export.zip

# Notion workspace export (.zip)
supermem connect notion ~/Downloads/notion_export.zip

# Nuclino workspace export (.zip)
supermem connect nuclino ~/Downloads/nuclino_export.zip

# GitHub repositories (live via API)
supermem connect github owner/repo1,owner/repo2 --token ghp_xxx

# Google Docs (OAuth, opens browser)
supermem connect google_docs "My Doc Name"

All connectors write markdown to your vault, then automatically index the files into SQLite + graph. Private content wrapped in <private>...</private> tags is stripped before indexing.

CLI Reference

supermem serve            # Start MCP server (stdio transport, for Claude Desktop)
supermem serve --worker   # Start MCP server + HTTP dashboard on :37777
supermem chat             # Interactive terminal REPL (no client required)
supermem backup           # Create timestamped .tar.gz (vault + SQLite)
supermem backup --output /path/to/archive.tar.gz
supermem restore <archive.tar.gz>
supermem connect <type> <source> [--token TOKEN] [--max-items N]

HTTP Dashboard (Optional)

Start with supermem serve --worker or docker compose --profile worker up.

Endpoint	Method	Description
`/health`	GET	`{"status":"ok","db":true,"graph":false,"vector":false}`
`/sessions`	GET	Paginated session list with summaries
`/observations`	GET	Filter by session/date/type
`/search`	POST	`{"query": "...", "tier_limit": 4}`
`/index/rebuild`	POST	Reindex entire vault
`/backup`	GET	Streams vault + DB as `.tar.gz`
`/stats`	GET	`{obs_count, entity_count, session_count, db_size_mb}`

Auth: Authorization: Bearer <SUPERMEM_API_KEY>. Disabled when env var is unset.

Privacy

Wrap sensitive content in <private>...</private> tags. It is stripped before writing to any storage layer (SQLite, Kuzu, ChromaDB). The content passes through to the agent sandbox only — it never persists.

# Meeting Notes

Alice discussed the roadmap.
<private>Budget: $2.4M approved for Q3</private>
Next steps: ship v2 by June.

Running Tests

uv run pytest tests/ -v                          # all tests
uv run pytest tests/unit/ -v                     # unit only (fast, no network)
uv run pytest tests/integration/ -v              # integration (real storage)
uv run pytest tests/ --cov=supermem --cov-report=term-missing  # with coverage

Coverage gate: 60% (CI enforced). Kuzu and Anthropic tests are auto-skipped if packages are not installed.

License

Apache 2.0 — see LICENSE.

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured