<div align="center">

# 🧠 Memory MCP Server

**Give your AI assistant a persistent second brain**

<br />

Stop re-explaining your project every session.
Memory MCP learns what matters and keeps it ready – instant recall for the stuff you use most, semantic search for everything else.

</div>
## The Problem
Every new chat starts from scratch. You repeat yourself. Context balloons. Tool calls add latency.
Memory MCP fixes this. It gives Claude persistent memory with a two-tier architecture: a hot cache for instant access to frequently-used knowledge, and cold storage with semantic search for everything else.
The system learns what you use and automatically promotes it. No manual curation required.
## Before & After
| 😤 Without Memory MCP | 🎯 With Memory MCP |
|---|---|
| "Let me explain our architecture again..." | Project facts persist forever |
| Copy-paste the same patterns | Patterns auto-promoted to instant access |
| 500k+ token context windows | Hot cache keeps it lean (~20 items) |
| Tool call latency on every lookup | Hot cache: 0ms – already in context |
## Key Features

- 🚀 **Instant recall hot cache** – Frequently-used memories auto-injected into context. No tool calls needed.
- 🔍 **Semantic search** – Find memories by meaning, not just keywords. Knowledge graph connects related concepts.
- 🤖 **Self-organizing** – Learns what you use. Auto-promotes frequent patterns. Auto-demotes stale ones.
- 📦 **Local & private** – All data in SQLite. No cloud. No API keys. Works offline.
- 🍎 **Apple Silicon optimized** – MLX backend auto-detected on M-series Macs for faster embeddings.
## Quick Start

### Install

```bash
# uv (recommended)
uv tool install memory-mcp

# pip
pip install memory-mcp

# Homebrew (macOS)
brew install memory-mcp

# From source
git clone https://github.com/michael-denyer/memory-mcp.git
cd memory-mcp && uv sync
```

Apple Silicon? Add MLX support for faster embeddings:

```bash
uv tool install "memory-mcp[mlx]"
# or: pip install "memory-mcp[mlx]"
```
### Configure

Add to your MCP client config (e.g., `~/.claude.json` for Claude Code):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

<details>
<summary>From source? Use this config instead</summary>

```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/memory-mcp", "memory-mcp"]
    }
  }
}
```

</details>

Restart your client. That's it. The hot cache auto-populates from your project docs.

> First run: the embedding model (~90MB) downloads automatically. Takes 30-60 seconds, once.
## How It Works

```mermaid
flowchart LR
    subgraph LLM["Your AI Assistant"]
        REQ((Request))
    end
    subgraph Hot["HOT CACHE · 0ms"]
        HC[Frequent memories]
        WS[Working set]
    end
    subgraph Cold["COLD STORAGE · ~50ms"]
        VS[(Vector search)]
        KG[(Knowledge graph)]
    end
    REQ -->|"auto-injected"| HC
    REQ -->|"recall()"| VS
    VS <-->|"related"| KG
```
Two tiers, automatic promotion:
| Tier | Latency | What happens |
|---|---|---|
| Hot Cache | 0ms | Auto-injected every request. No tool call needed. |
| Cold Storage | ~50ms | Semantic search when you need deeper recall. |
Memories used 3+ times automatically promote to hot cache. Unused memories demote after 14 days. Pin important ones to keep them hot forever.
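
For a feel of the lifecycle, here is a minimal sketch of those promotion and demotion rules, using the documented defaults (`PROMOTION_THRESHOLD=3`, `DEMOTION_DAYS=14`). The function and field names are illustrative, not the server's actual internals:

```python
from datetime import datetime, timedelta

PROMOTION_THRESHOLD = 3  # accesses before auto-promotion (default)
DEMOTION_DAYS = 14       # idle days before auto-demotion (default)

def should_promote(access_count: int, pinned: bool, in_hot_cache: bool) -> bool:
    """A memory accessed often enough (or pinned) moves into the hot cache."""
    return not in_hot_cache and (pinned or access_count >= PROMOTION_THRESHOLD)

def should_demote(last_accessed: datetime, pinned: bool, in_hot_cache: bool) -> bool:
    """An unpinned hot memory left unused long enough falls back to cold storage."""
    idle = datetime.now() - last_accessed > timedelta(days=DEMOTION_DAYS)
    return in_hot_cache and not pinned and idle
```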
## What Makes It Different

| Feature | Memory MCP | Others |
|---|---|---|
| Hot cache | Auto-injected, 0ms | Most require tool calls |
| Self-organizing | Learns from usage | Manual curation |
| Pattern mining | Extracts from outputs | Not available |
| Setup | One command, local SQLite | Often needs cloud/services |
## Reference

*Everything below is detailed documentation. You don't need to read it to get started.*
### Tools

#### Memory Operations

| Tool | Description |
|---|---|
| `remember(content, type, tags)` | Store a memory with semantic embedding |
| `recall(query, limit, threshold, expand_relations)` | Semantic search with confidence gating and optional multi-hop expansion |
| `recall_by_tag(tag)` | Filter memories by tag |
| `forget(memory_id)` | Delete a memory |
| `list_memories(limit, offset, type)` | Browse all memories |
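
Clients like Claude Code wire these tools up for you, but you can also script against the server directly. A minimal sketch using the official MCP Python SDK, assuming `memory-mcp` is on your PATH (the tool names and arguments come from the table above; the rest is standard SDK plumbing):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Spawn the server over STDIO, the same way an MCP client would.
    params = StdioServerParameters(command="memory-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            await session.call_tool(
                "remember",
                arguments={"content": "We deploy with Docker Compose", "type": "project"},
            )
            result = await session.call_tool(
                "recall", arguments={"query": "how do we deploy?", "limit": 5}
            )
            print(result.content)

asyncio.run(main())
```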
#### Hot Cache Management

| Tool | Description |
|---|---|
| `hot_cache_status()` | Show contents, metrics, and effectiveness |
| `promote(memory_id)` | Manually promote to hot cache |
| `demote(memory_id)` | Remove from hot cache (keeps in cold storage) |
| `pin_memory(memory_id)` | Pin memory (prevents auto-eviction) |
| `unpin_memory(memory_id)` | Unpin memory (allows auto-eviction) |
#### Pattern Mining

| Tool | Description |
|---|---|
| `log_output(content)` | Log content for pattern extraction |
| `run_mining(hours)` | Extract patterns from recent logs |
| `review_candidates()` | See patterns ready for promotion |
| `approve_candidate(id)` / `reject_candidate(id)` | Accept or reject patterns |
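
The server's actual extraction heuristics aren't documented here, but as a rough, hypothetical illustration of the idea: mining boils down to spotting snippets that recur across logged outputs and surfacing them as candidates for review:

```python
from collections import Counter

def mine_candidates(logs: list[str], min_occurrences: int = 3) -> list[str]:
    """Toy stand-in for pattern mining: recurring non-trivial lines become candidates."""
    counts = Counter(
        line.strip()
        for log in logs
        for line in log.splitlines()
        if len(line.strip()) > 10  # skip trivial lines
    )
    return [snippet for snippet, n in counts.items() if n >= min_occurrences]

logs = [
    "uv run pytest -v\nall tests passed",
    "uv run pytest -v\n2 failures",
    "uv run pytest -v\nall tests passed",
]
print(mine_candidates(logs))  # ['uv run pytest -v']
```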
#### Cold Start / Seeding

| Tool | Description |
|---|---|
| `bootstrap_project(root, files, promote)` | Auto-detect and seed from project docs (README.md, CLAUDE.md, etc.) |
| `seed_from_text(content, type, promote)` | Parse text into memories |
| `seed_from_file(path, type, promote)` | Import from file (e.g., CLAUDE.md) |
#### Knowledge Graph

| Tool | Description |
|---|---|
| `link_memories(from_id, to_id, relation, metadata)` | Create relationship between memories |
| `unlink_memories(from_id, to_id, relation)` | Remove relationship(s) |
| `get_related_memories(memory_id, relation, direction)` | Find connected memories |

Relation types: `relates_to`, `depends_on`, `supersedes`, `refines`, `contradicts`, `elaborates`
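
Multi-hop expansion (what `recall(..., expand_relations=True)` turns on) amounts to a bounded walk over these typed edges. A small sketch of that traversal, with a hypothetical in-memory edge list standing in for the SQLite-backed graph:

```python
from collections import deque

# Hypothetical edges: (from_id, relation, to_id).
EDGES = [
    (1, "depends_on", 2),
    (2, "refines", 3),
    (1, "relates_to", 4),
]

def expand(start: int, max_hops: int = 2) -> set[int]:
    """Breadth-first walk over outgoing relations, up to max_hops away."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for src, _relation, dst in EDGES:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append((dst, depth + 1))
    return seen - {start}

print(expand(1))  # {2, 3, 4}
```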
#### Trust Management

| Tool | Description |
|---|---|
| `strengthen_trust(memory_id, amount, reason)` | Increase confidence in a memory |
| `weaken_trust(memory_id, amount, reason)` | Decrease confidence (e.g., when found outdated) |
#### Retrieval Quality

| Tool | Description |
|---|---|
| `mark_memory_used(memory_id, feedback)` | Mark a recalled memory as actually helpful |
| `retrieval_quality_stats(memory_id, days)` | Get stats on which memories are retrieved vs. used |
#### Session Tracking

| Tool | Description |
|---|---|
| `get_or_create_session(session_id, topic)` | Track conversation context |
| `get_session_memories(session_id)` | Retrieve memories from a session |
| `end_session(session_id, promote_top)` | End session and promote top episodic memories to long-term storage |
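
A sketch of what `end_session` promotion plausibly reduces to, under the defaults documented in Configuration below (`EPISODIC_PROMOTE_TOP_N=3`, `EPISODIC_PROMOTE_THRESHOLD=0.6`); the ID/salience pairs here are made up:

```python
EPISODIC_PROMOTE_TOP_N = 3        # default
EPISODIC_PROMOTE_THRESHOLD = 0.6  # default

def select_for_promotion(episodic: list[tuple[int, float]]) -> list[int]:
    """Keep the top-N episodic memories whose salience clears the threshold."""
    eligible = [(mid, s) for mid, s in episodic if s >= EPISODIC_PROMOTE_THRESHOLD]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return [mid for mid, _ in eligible[:EPISODIC_PROMOTE_TOP_N]]

session = [(11, 0.9), (12, 0.4), (13, 0.7), (14, 0.65), (15, 0.8)]
print(select_for_promotion(session))  # [11, 15, 13]
```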
### Memory Types

| Type | Use for |
|---|---|
| `project` | Architecture, conventions, tech stack |
| `pattern` | Reusable code patterns, commands |
| `reference` | API docs, external references |
| `conversation` | Facts from discussions |
| `episodic` | Session-bound short-term context (auto-expires after 7 days) |
### Confidence Gating
Recall results include confidence levels based on semantic similarity:
| Confidence | Similarity | Recommended action |
|---|---|---|
| high | > 0.85 | Use directly |
| medium | 0.70 - 0.85 | Verify context |
| low | < 0.70 | Reason from scratch |
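
In code, the gating is just a bucketing of the similarity score against two configurable thresholds (a sketch with the defaults; at the default `DEFAULT_CONFIDENCE_THRESHOLD`, low-similarity results are dropped entirely):

```python
def confidence(similarity: float) -> str:
    """Bucket a cosine-similarity score using the default thresholds."""
    if similarity > 0.85:   # HIGH_CONFIDENCE_THRESHOLD
        return "high"
    if similarity >= 0.70:  # DEFAULT_CONFIDENCE_THRESHOLD
        return "medium"
    return "low"            # filtered from results unless the threshold is lowered

for score in (0.92, 0.78, 0.55):
    print(score, confidence(score))
```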
### Configuration

Environment variables (prefix `MEMORY_MCP_`):

#### Core Settings

| Variable | Default | Description |
|---|---|---|
| `DB_PATH` | `~/.memory-mcp/memory.db` | SQLite database location |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Sentence transformer model |
| `EMBEDDING_BACKEND` | `auto` | `auto`, `mlx`, or `sentence-transformers` |
#### Hot Cache

| Variable | Default | Description |
|---|---|---|
| `HOT_CACHE_MAX_ITEMS` | `20` | Maximum items in hot cache |
| `PROMOTION_THRESHOLD` | `3` | Access count for auto-promotion |
| `DEMOTION_DAYS` | `14` | Days without access before demotion |
| `AUTO_PROMOTE` | `true` | Enable automatic promotion |
| `AUTO_DEMOTE` | `true` | Enable automatic demotion |
#### Retrieval

| Variable | Default | Description |
|---|---|---|
| `DEFAULT_RECALL_LIMIT` | `5` | Default results per recall |
| `DEFAULT_CONFIDENCE_THRESHOLD` | `0.7` | Minimum similarity for results |
| `HIGH_CONFIDENCE_THRESHOLD` | `0.85` | Threshold for "high" confidence |
| `RECALL_EXPAND_RELATIONS` | `false` | Enable multi-hop recall via knowledge graph |
#### Salience & Promotion

| Variable | Default | Description |
|---|---|---|
| `SALIENCE_PROMOTION_THRESHOLD` | `0.5` | Minimum salience score for auto-promotion |
| `SALIENCE_IMPORTANCE_WEIGHT` | `0.25` | Weight for importance in salience |
| `SALIENCE_TRUST_WEIGHT` | `0.25` | Weight for trust in salience |
| `SALIENCE_ACCESS_WEIGHT` | `0.25` | Weight for access count in salience |
| `SALIENCE_RECENCY_WEIGHT` | `0.25` | Weight for recency in salience |
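
Read together, these defaults make salience an equally weighted blend of four signals. A sketch of that blend; how each component is normalized into [0, 1] is an assumption here, not taken from the docs:

```python
WEIGHTS = {"importance": 0.25, "trust": 0.25, "access": 0.25, "recency": 0.25}

def salience(importance: float, trust: float, access: float, recency: float) -> float:
    """Weighted blend of normalized signals; compare against SALIENCE_PROMOTION_THRESHOLD."""
    parts = {"importance": importance, "trust": trust, "access": access, "recency": recency}
    return sum(WEIGHTS[name] * value for name, value in parts.items())

# A frequently used, recently touched memory clears the 0.5 promotion floor:
print(round(salience(importance=0.6, trust=0.8, access=0.9, recency=0.7), 2))  # 0.75
```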
#### Episodic Memory

| Variable | Default | Description |
|---|---|---|
| `EPISODIC_PROMOTE_TOP_N` | `3` | Top N episodic memories to promote on session end |
| `EPISODIC_PROMOTE_THRESHOLD` | `0.6` | Minimum salience for episodic promotion |
| `RETENTION_EPISODIC_DAYS` | `7` | Days to retain episodic memories |
#### Working Set

| Variable | Default | Description |
|---|---|---|
| `WORKING_SET_ENABLED` | `true` | Enable `memory://working-set` resource |
| `WORKING_SET_MAX_ITEMS` | `10` | Maximum items in working set |
### MCP Resources

The server exposes two MCP resources for instant memory access:

#### Hot Cache (`memory://hot-cache`)
Auto-injectable system context with high-confidence patterns. Contents are automatically available in Claude's context without tool calls.
- Memories promoted to hot cache appear here
- Keeps system prompts lean (~10-20 items max)
- Auto-bootstrap: If empty, auto-seeds from project docs (README.md, CLAUDE.md, etc.)
#### Working Set (`memory://working-set`)
Session-aware active memory context (Engram-inspired). Provides contextually relevant memories:
- Recently recalled memories (that were actually used)
- Predicted next memories (from access pattern learning)
- Top salience hot items (to fill remaining slots)
Smaller and more focused than hot-cache (~10 items) - designed for active work context.
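
One plausible reading of that composition, as a hypothetical sketch (not the server's actual selection logic): fill slots from each source in priority order, deduplicating, until the cap is hit:

```python
WORKING_SET_MAX_ITEMS = 10  # default

def build_working_set(recent: list[int], predicted: list[int], hot: list[int]) -> list[int]:
    """Recently used recalls first, then predicted-next, then top-salience hot items."""
    working_set: list[int] = []
    for source in (recent, predicted, hot):
        for memory_id in source:
            if memory_id not in working_set:
                working_set.append(memory_id)
            if len(working_set) == WORKING_SET_MAX_ITEMS:
                return working_set
    return working_set

print(build_working_set(recent=[1, 2], predicted=[2, 3], hot=[4, 5, 6]))  # [1, 2, 3, 4, 5, 6]
```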
#### Enabling Auto-Injection
Add the MCP server to your settings (see Quick Start). Both resources are automatically available. Verify with /mcp in Claude Code.
### Multi-Client Setup

Memory MCP works with any MCP-compatible client (Claude Code, Codex, etc.).

#### Shared Memory (Recommended)

Both clients share the same database, so memories learned in one are available in the other.

Claude Code (`~/.claude.json`):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

Codex (or other MCP client):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```
#### Separate Memory per Client

Use different database paths via the `MEMORY_MCP_DB_PATH` environment variable:

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp",
      "env": {
        "MEMORY_MCP_DB_PATH": "~/.memory-mcp/claude.db"
      }
    }
  }
}
```
### Automatic Output Logging

For pattern mining to work automatically, install the Claude Code hook.

#### Prerequisites

The hook script requires `jq` for JSON parsing:

```bash
# macOS
brew install jq

# Ubuntu/Debian
sudo apt install jq
```

#### Installation

```bash
chmod +x hooks/memory-log-response.sh
```

Add to `~/.claude/settings.json`:

```json
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "/path/to/memory-mcp/hooks/memory-log-response.sh"
      }]
    }]
  }
}
```
### CLI Commands

```bash
# Bootstrap hot cache from project docs (auto-detects README.md, CLAUDE.md, etc.)
memory-mcp-cli bootstrap

# Bootstrap from specific directory
memory-mcp-cli bootstrap -r /path/to/project

# Bootstrap specific files only
memory-mcp-cli bootstrap -f README.md -f ARCHITECTURE.md

# Log content for mining
echo "Some content" | memory-mcp-cli log-output

# Run pattern extraction
memory-mcp-cli run-mining --hours 24

# Seed from a file
memory-mcp-cli seed ~/project/CLAUDE.md -t project --promote

# Consolidate similar memories (preview first with --dry-run)
memory-mcp-cli consolidate --dry-run
memory-mcp-cli consolidate

# Show memory system status
memory-mcp-cli status
```

<details>
<summary>From source? Prefix commands with <code>uv run</code></summary>

```bash
uv run memory-mcp-cli bootstrap
uv run memory-mcp-cli status
# etc.
```

</details>
### Development

```bash
# Run tests
uv run pytest -v

# Run with debug logging
uv run memory-mcp 2>&1 | head -50
```
### System Requirements
| Requirement | Minimum | Notes |
|---|---|---|
| Python | 3.10+ | 3.11+ recommended for performance |
| Disk | ~2-3 GB | Dependencies (~2GB) + embedding model (~90MB) + database |
| RAM | 200-400 MB | During embedding operations |
| First Run | 30-60 seconds | One-time ~90MB model download |
| Startup | 2-5 seconds | After model is cached |
Apple Silicon users: install with MLX support for faster embeddings:

```bash
pip install "memory-mcp[mlx]"
```
### Example Usage

```text
You: "Remember that this project uses PostgreSQL with pgvector"
Claude: [calls remember(..., memory_type="project")]
        → Stored as memory #1

You: "What database do we use?"
Claude: [calls recall("database configuration")]
        → {confidence: "high", memories: [{content: "PostgreSQL with pgvector..."}]}

You: "Promote that to hot cache"
Claude: [calls promote(1)]
        → Memory #1 now in hot cache - available instantly next session
```
### Troubleshooting

#### Server Won't Start

Symptom: Claude Code shows "memory" server as disconnected

- Check the command works directly:

  ```bash
  memory-mcp
  ```

- Verify installation:

  ```bash
  which memory-mcp  # Should return a path
  ```

- Check the Python version (requires 3.10+):

  ```bash
  python --version
  ```

#### Dimension Mismatch Error

Symptom: Vector dimension mismatch error during recall

This happens when the embedding model changes. Rebuild vectors:

```bash
memory-mcp-cli db-rebuild-vectors
```

#### Hot Cache Not Updating

Symptom: Promoted memories don't appear in hot cache

- Check hot cache status:

  ```bash
  memory-mcp-cli status
  ```

- Verify the memory exists: `[In Claude] list_memories(limit=10)`
- Manually promote: `[In Claude] promote(memory_id)`

#### Pattern Mining Not Working

Symptom: `run_mining` finds no patterns

- Check mining is enabled:

  ```bash
  echo $MEMORY_MCP_MINING_ENABLED  # Should not be "false"
  ```

- Verify logs exist:

  ```bash
  memory-mcp-cli run-mining --hours 24
  ```

- Check the hook is installed (see Automatic Output Logging)

#### Hook Script Fails

Symptom: Hook runs but nothing is logged

- Check `jq` is installed:

  ```bash
  which jq  # Should return a path
  ```

- Make the script executable:

  ```bash
  chmod +x hooks/memory-log-response.sh
  ```

- Test manually:

  ```bash
  echo "test content" | memory-mcp-cli log-output
  ```

#### Slow First Startup

Symptom: First run takes 30-60 seconds

This is expected: the embedding model (~90MB) downloads on first use. Subsequent starts take 2-5 seconds.

#### Database Corruption

Symptom: SQLite errors or unexpected behavior

- Back up and recreate:

  ```bash
  mv ~/.memory-mcp/memory.db ~/.memory-mcp/memory.db.bak
  # Server will create a fresh database on next start
  ```

- Re-bootstrap from project docs:

  ```bash
  memory-mcp-cli bootstrap
  ```
### Security Note
This server is designed for local use only. It runs unauthenticated over STDIO transport and should not be exposed to networks or untrusted clients.
### License
MIT