Memory MCP Server


Provides a persistent "second brain" for Claude featuring zero-latency hot caching, semantic cold storage, and automatic pattern mining from activity logs. It enables users to store, search, and automatically extract project facts and code patterns for enhanced contextual recall.


<div align="center">

🧠 Memory MCP Server

Give your AI assistant a persistent second brain

License: MIT Python 3.10+ Works with Claude Code MCP Compatible

<br />

Stop re-explaining your project every session.

Memory MCP learns what matters and keeps it ready — instant recall for the stuff you use most, semantic search for everything else.

</div>


The Problem

Every new chat starts from scratch. You repeat yourself. Context balloons. Tool calls add latency.

Memory MCP fixes this. It gives Claude persistent memory with a two-tier architecture: a hot cache for instant access to frequently-used knowledge, and cold storage with semantic search for everything else.

The system learns what you use and automatically promotes it. No manual curation required.

Before & After

| 😤 Without Memory MCP | 🎯 With Memory MCP |
| --- | --- |
| "Let me explain our architecture again..." | Project facts persist forever |
| Copy-paste the same patterns | Patterns auto-promoted to instant access |
| 500k+ token context windows | Hot cache keeps it lean (~20 items) |
| Tool call latency on every lookup | Hot cache: 0ms — already in context |

Key Features

🚀 Instant recall hot cache — Frequently-used memories auto-injected into context. No tool calls needed.

🔍 Semantic search — Find memories by meaning, not just keywords. Knowledge graph connects related concepts.

🤖 Self-organizing — Learns what you use. Auto-promotes frequent patterns. Auto-demotes stale ones.

📦 Local & private — All data in SQLite. No cloud. No API keys. Works offline.

🍎 Apple Silicon optimized — MLX backend auto-detected on M-series Macs for faster embeddings.

Quick Start

Install

```shell
# uv (recommended)
uv tool install memory-mcp

# pip
pip install memory-mcp

# Homebrew (macOS)
brew install memory-mcp

# From source
git clone https://github.com/michael-denyer/memory-mcp.git
cd memory-mcp && uv sync
```

Apple Silicon? Add MLX support for faster embeddings (quoted so the brackets survive shell globbing):

```shell
uv tool install "memory-mcp[mlx]"
# or: pip install "memory-mcp[mlx]"
```

Configure

Add to your MCP client config (e.g., ~/.claude.json for Claude Code):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

<details> <summary>From source? Use this config instead</summary>

```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/memory-mcp", "memory-mcp"]
    }
  }
}
```

</details>

Restart your client. That's it. The hot cache auto-populates from your project docs.

First run: Embedding model (~90MB) downloads automatically. Takes 30-60 seconds once.

How It Works

```mermaid
flowchart LR
    subgraph LLM["Your AI Assistant"]
        REQ((Request))
    end

    subgraph Hot["HOT CACHE · 0ms"]
        HC[Frequent memories]
        WS[Working set]
    end

    subgraph Cold["COLD STORAGE · ~50ms"]
        VS[(Vector search)]
        KG[(Knowledge graph)]
    end

    REQ -->|"auto-injected"| HC
    REQ -->|"recall()"| VS
    VS <-->|"related"| KG
```

Two tiers, automatic promotion:

| Tier | Latency | What happens |
| --- | --- | --- |
| Hot Cache | 0ms | Auto-injected every request. No tool call needed. |
| Cold Storage | ~50ms | Semantic search when you need deeper recall. |

Memories used 3+ times automatically promote to hot cache. Unused memories demote after 14 days. Pin important ones to keep them hot forever.
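The promotion and demotion rules above amount to two simple checks. A minimal sketch, assuming the server tracks per-memory access counts and timestamps (the names here are illustrative, not the server's internals):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

PROMOTION_THRESHOLD = 3   # accesses before auto-promotion (PROMOTION_THRESHOLD)
DEMOTION_DAYS = 14        # idle days before auto-demotion (DEMOTION_DAYS)

@dataclass
class Memory:
    access_count: int
    last_accessed: datetime
    pinned: bool = False
    in_hot_cache: bool = False

def should_promote(m: Memory) -> bool:
    """A memory used often enough moves into the hot cache."""
    return not m.in_hot_cache and m.access_count >= PROMOTION_THRESHOLD

def should_demote(m: Memory, now: datetime) -> bool:
    """Stale, unpinned hot-cache entries fall back to cold storage."""
    if not m.in_hot_cache or m.pinned:
        return False
    return now - m.last_accessed > timedelta(days=DEMOTION_DAYS)
```

Pinning simply short-circuits the demotion check, which is why pinned memories stay hot indefinitely.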

What Makes It Different

| | Memory MCP | Others |
| --- | --- | --- |
| Hot cache | Auto-injected, 0ms | Most require tool calls |
| Self-organizing | Learns from usage | Manual curation |
| Pattern mining | Extracts from outputs | Not available |
| Setup | One command, local SQLite | Often needs cloud/services |

Reference

Everything below is detailed documentation. You don't need to read it to get started.

Tools

Memory Operations

| Tool | Description |
| --- | --- |
| `remember(content, type, tags)` | Store a memory with semantic embedding |
| `recall(query, limit, threshold, expand_relations)` | Semantic search with confidence gating and optional multi-hop expansion |
| `recall_by_tag(tag)` | Filter memories by tag |
| `forget(memory_id)` | Delete a memory |
| `list_memories(limit, offset, type)` | Browse all memories |

Hot Cache Management

| Tool | Description |
| --- | --- |
| `hot_cache_status()` | Show contents, metrics, and effectiveness |
| `promote(memory_id)` | Manually promote to hot cache |
| `demote(memory_id)` | Remove from hot cache (keeps in cold storage) |
| `pin_memory(memory_id)` | Pin memory (prevents auto-eviction) |
| `unpin_memory(memory_id)` | Unpin memory (allows auto-eviction) |

Pattern Mining

| Tool | Description |
| --- | --- |
| `log_output(content)` | Log content for pattern extraction |
| `run_mining(hours)` | Extract patterns from recent logs |
| `review_candidates()` | See patterns ready for promotion |
| `approve_candidate(id)` / `reject_candidate(id)` | Accept or reject patterns |

Cold Start / Seeding

| Tool | Description |
| --- | --- |
| `bootstrap_project(root, files, promote)` | Auto-detect and seed from project docs (README.md, CLAUDE.md, etc.) |
| `seed_from_text(content, type, promote)` | Parse text into memories |
| `seed_from_file(path, type, promote)` | Import from file (e.g., CLAUDE.md) |

Knowledge Graph

| Tool | Description |
| --- | --- |
| `link_memories(from_id, to_id, relation, metadata)` | Create a relationship between memories |
| `unlink_memories(from_id, to_id, relation)` | Remove relationship(s) |
| `get_related_memories(memory_id, relation, direction)` | Find connected memories |

Relation types: `relates_to`, `depends_on`, `supersedes`, `refines`, `contradicts`, `elaborates`
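Multi-hop expansion (the `expand_relations` option on `recall`) can be pictured as a bounded breadth-first walk over these links. A sketch assuming a plain adjacency map rather than the server's actual graph store:

```python
from collections import deque

# Hypothetical adjacency map: memory_id -> [(relation, neighbor_id), ...]
GRAPH = {
    1: [("depends_on", 2)],
    2: [("refines", 3)],
    3: [],
}

def expand(seed_ids, graph, max_hops=2):
    """Collect memory ids reachable from the seed hits within max_hops."""
    seen = set(seed_ids)
    frontier = deque((mid, 0) for mid in seed_ids)
    while frontier:
        mid, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not walk past the hop limit
        for _relation, neighbor in graph.get(mid, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen
```

Capping the hop count keeps expansion cheap and prevents one strong hit from dragging in the whole graph.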

Trust Management

| Tool | Description |
| --- | --- |
| `strengthen_trust(memory_id, amount, reason)` | Increase confidence in a memory |
| `weaken_trust(memory_id, amount, reason)` | Decrease confidence (e.g., found outdated) |

Retrieval Quality

| Tool | Description |
| --- | --- |
| `mark_memory_used(memory_id, feedback)` | Mark a recalled memory as actually helpful |
| `retrieval_quality_stats(memory_id, days)` | Get stats on which memories are retrieved vs. used |

Session Tracking

| Tool | Description |
| --- | --- |
| `get_or_create_session(session_id, topic)` | Track conversation context |
| `get_session_memories(session_id)` | Retrieve memories from a session |
| `end_session(session_id, promote_top)` | End a session and promote top episodic memories to long-term storage |

Memory Types

| Type | Use for |
| --- | --- |
| `project` | Architecture, conventions, tech stack |
| `pattern` | Reusable code patterns, commands |
| `reference` | API docs, external references |
| `conversation` | Facts from discussions |
| `episodic` | Session-bound short-term context (auto-expires after 7 days) |

Confidence Gating

Recall results include confidence levels based on semantic similarity:

| Confidence | Similarity | Recommended action |
| --- | --- | --- |
| high | > 0.85 | Use directly |
| medium | 0.70 - 0.85 | Verify context |
| low | < 0.70 | Reason from scratch |
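The gating above is a plain threshold check. A sketch mirroring the default thresholds (not the server's actual code):

```python
# Defaults mirror HIGH_CONFIDENCE_THRESHOLD and DEFAULT_CONFIDENCE_THRESHOLD
HIGH_CONFIDENCE = 0.85
CONFIDENCE_FLOOR = 0.70

def confidence_level(similarity: float) -> str:
    """Map a similarity score to the confidence label recall() reports."""
    if similarity > HIGH_CONFIDENCE:
        return "high"
    if similarity >= CONFIDENCE_FLOOR:
        return "medium"
    return "low"
```

Results below the floor are still returned in some configurations, but the "low" label signals the assistant should reason from scratch rather than trust the match.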

Configuration

Environment variables (prefix `MEMORY_MCP_`):

Core Settings

| Variable | Default | Description |
| --- | --- | --- |
| `DB_PATH` | `~/.memory-mcp/memory.db` | SQLite database location |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Sentence-transformer model |
| `EMBEDDING_BACKEND` | `auto` | `auto`, `mlx`, or `sentence-transformers` |

Hot Cache

| Variable | Default | Description |
| --- | --- | --- |
| `HOT_CACHE_MAX_ITEMS` | `20` | Maximum items in hot cache |
| `PROMOTION_THRESHOLD` | `3` | Access count for auto-promotion |
| `DEMOTION_DAYS` | `14` | Days without access before demotion |
| `AUTO_PROMOTE` | `true` | Enable automatic promotion |
| `AUTO_DEMOTE` | `true` | Enable automatic demotion |

Retrieval

| Variable | Default | Description |
| --- | --- | --- |
| `DEFAULT_RECALL_LIMIT` | `5` | Default results per recall |
| `DEFAULT_CONFIDENCE_THRESHOLD` | `0.7` | Minimum similarity for results |
| `HIGH_CONFIDENCE_THRESHOLD` | `0.85` | Threshold for "high" confidence |
| `RECALL_EXPAND_RELATIONS` | `false` | Enable multi-hop recall via knowledge graph |

Salience & Promotion

| Variable | Default | Description |
| --- | --- | --- |
| `SALIENCE_PROMOTION_THRESHOLD` | `0.5` | Minimum salience score for auto-promotion |
| `SALIENCE_IMPORTANCE_WEIGHT` | `0.25` | Weight for importance in salience |
| `SALIENCE_TRUST_WEIGHT` | `0.25` | Weight for trust in salience |
| `SALIENCE_ACCESS_WEIGHT` | `0.25` | Weight for access count in salience |
| `SALIENCE_RECENCY_WEIGHT` | `0.25` | Weight for recency in salience |
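With the defaults, salience is an even blend of four signals. A sketch assuming each signal is already normalized to [0, 1]:

```python
def salience(importance, trust, access, recency,
             weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted blend of four normalized signals.

    Default weights match the SALIENCE_*_WEIGHT settings; scores at or
    above 0.5 qualify for auto-promotion under the default
    SALIENCE_PROMOTION_THRESHOLD.
    """
    w_imp, w_trust, w_acc, w_rec = weights
    return w_imp * importance + w_trust * trust + w_acc * access + w_rec * recency
```

Raising one weight (and lowering another, so they still sum to 1.0) biases promotion toward that signal, e.g. favoring recency for fast-moving projects.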

Episodic Memory

| Variable | Default | Description |
| --- | --- | --- |
| `EPISODIC_PROMOTE_TOP_N` | `3` | Top N episodic memories to promote on session end |
| `EPISODIC_PROMOTE_THRESHOLD` | `0.6` | Minimum salience for episodic promotion |
| `RETENTION_EPISODIC_DAYS` | `7` | Days to retain episodic memories |

Working Set

| Variable | Default | Description |
| --- | --- | --- |
| `WORKING_SET_ENABLED` | `true` | Enable the `memory://working-set` resource |
| `WORKING_SET_MAX_ITEMS` | `10` | Maximum items in working set |

MCP Resources

The server exposes two MCP resources for instant memory access:

Hot Cache (memory://hot-cache)

Auto-injectable system context with high-confidence patterns. Contents are automatically available in Claude's context without tool calls.

  • Memories promoted to hot cache appear here
  • Keeps system prompts lean (~10-20 items max)
  • Auto-bootstrap: If empty, auto-seeds from project docs (README.md, CLAUDE.md, etc.)

Working Set (memory://working-set)

Session-aware active memory context (Engram-inspired). Provides contextually relevant memories:

  1. Recently recalled memories (that were actually used)
  2. Predicted next memories (from access pattern learning)
  3. Top salience hot items (to fill remaining slots)

Smaller and more focused than hot-cache (~10 items) - designed for active work context.
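The three-step fill order can be sketched as a priority merge with de-duplication (illustrative only, not the server's code):

```python
def build_working_set(recently_used, predicted_next, hot_by_salience, max_items=10):
    """Fill the working set in priority order, skipping duplicates.

    Sources are consumed in order: recalled-and-used memories first,
    then predicted-next memories, then top-salience hot items.
    """
    working_set = []
    for source in (recently_used, predicted_next, hot_by_salience):
        for memory_id in source:
            if memory_id not in working_set:
                working_set.append(memory_id)
            if len(working_set) == max_items:
                return working_set
    return working_set
```

Because higher-priority sources fill slots first, salience-ranked hot items only appear when session signals leave room.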

Enabling Auto-Injection

Add the MCP server to your settings (see Quick Start). Both resources are automatically available. Verify with /mcp in Claude Code.

Multi-Client Setup

Memory MCP works with any MCP-compatible client (Claude Code, Codex, etc.).

Shared Memory (Recommended)

Both clients share the same database - memories learned in one are available in the other:

Claude Code (~/.claude.json):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

Codex (or other MCP client):

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp"
    }
  }
}
```

Separate Memory per Client

Use different database paths via MEMORY_MCP_DB_PATH environment variable:

```json
{
  "mcpServers": {
    "memory": {
      "command": "memory-mcp",
      "env": {
        "MEMORY_MCP_DB_PATH": "~/.memory-mcp/claude.db"
      }
    }
  }
}
```

Automatic Output Logging

For pattern mining to work automatically, install the Claude Code hook.

Prerequisites

The hook script requires jq for JSON parsing:

```shell
# macOS
brew install jq

# Ubuntu/Debian
sudo apt install jq
```

Installation

```shell
chmod +x hooks/memory-log-response.sh
```

Add to ~/.claude/settings.json:

```json
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "/path/to/memory-mcp/hooks/memory-log-response.sh"
      }]
    }]
  }
}
```
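The stock hook is a jq-based shell script; this Python sketch shows the equivalent flow (the payload field name is an assumption for illustration, so check the JSON your hook actually receives):

```python
import json
import subprocess
import sys

def extract_text(hook_event: dict) -> str:
    """Pull the response text out of the Stop-hook payload.

    The "last_assistant_message" key is a hypothetical field name,
    not a documented part of the hook's schema.
    """
    return hook_event.get("last_assistant_message", "")

if __name__ == "__main__":
    text = extract_text(json.load(sys.stdin))
    if text:
        # Equivalent of piping the extracted text to `memory-mcp-cli log-output`
        subprocess.run(["memory-mcp-cli", "log-output"],
                       input=text, text=True, check=True)
```

Either way, the logged text ends up in the mining queue that `run_mining` later scans for patterns.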

CLI Commands

```shell
# Bootstrap hot cache from project docs (auto-detects README.md, CLAUDE.md, etc.)
memory-mcp-cli bootstrap

# Bootstrap from specific directory
memory-mcp-cli bootstrap -r /path/to/project

# Bootstrap specific files only
memory-mcp-cli bootstrap -f README.md -f ARCHITECTURE.md

# Log content for mining
echo "Some content" | memory-mcp-cli log-output

# Run pattern extraction
memory-mcp-cli run-mining --hours 24

# Seed from a file
memory-mcp-cli seed ~/project/CLAUDE.md -t project --promote

# Consolidate similar memories (preview first with --dry-run)
memory-mcp-cli consolidate --dry-run
memory-mcp-cli consolidate

# Show memory system status
memory-mcp-cli status
```

<details> <summary>From source? Prefix commands with uv run</summary>

```shell
uv run memory-mcp-cli bootstrap
uv run memory-mcp-cli status
# etc.
```

</details>

Development

```shell
# Run tests
uv run pytest -v

# Run with debug logging
uv run memory-mcp 2>&1 | head -50
```

System Requirements

| Requirement | Minimum | Notes |
| --- | --- | --- |
| Python | 3.10+ | 3.11+ recommended for performance |
| Disk | ~2-3 GB | Dependencies (~2 GB) + embedding model (~90 MB) + database |
| RAM | 200-400 MB | During embedding operations |
| First run | 30-60 seconds | One-time ~90 MB model download |
| Startup | 2-5 seconds | After the model is cached |

Apple Silicon users: Install with MLX support for faster embeddings (quoted so the brackets survive shell globbing):

```shell
pip install "memory-mcp[mlx]"
```

Example Usage

```text
You: "Remember that this project uses PostgreSQL with pgvector"
Claude: [calls remember(..., memory_type="project")]
→ Stored as memory #1

You: "What database do we use?"
Claude: [calls recall("database configuration")]
→ {confidence: "high", memories: [{content: "PostgreSQL with pgvector..."}]}

You: "Promote that to hot cache"
Claude: [calls promote(1)]
→ Memory #1 now in hot cache - available instantly next session
```

Troubleshooting

Server Won't Start

Symptom: Claude Code shows "memory" server as disconnected

  1. Check the command works directly:

    memory-mcp
    
  2. Verify installation:

    which memory-mcp  # Should return a path
    
  3. Check Python version: Requires 3.10+

    python --version
    

Dimension Mismatch Error

Symptom: Vector dimension mismatch error during recall

This happens when the embedding model changes. Rebuild vectors:

```shell
memory-mcp-cli db-rebuild-vectors
```

Hot Cache Not Updating

Symptom: Promoted memories don't appear in hot cache

  1. Check hot cache status:

    memory-mcp-cli status
    
  2. Verify memory exists:

    [In Claude] list_memories(limit=10)
    
  3. Manually promote:

    [In Claude] promote(memory_id)
    

Pattern Mining Not Working

Symptom: run_mining finds no patterns

  1. Check mining is enabled:

    echo $MEMORY_MCP_MINING_ENABLED  # Should not be "false"
    
  2. Verify logs exist:

    memory-mcp-cli run-mining --hours 24
    
  3. Check hook is installed (see Automatic Output Logging)

Hook Script Fails

Symptom: Hook runs but nothing is logged

  1. Check jq is installed:

    which jq  # Should return a path
    
  2. Make script executable:

    chmod +x hooks/memory-log-response.sh
    
  3. Test manually:

    echo "test content" | memory-mcp-cli log-output
    

Slow First Startup

Symptom: First run takes 30-60 seconds

This is expected: the embedding model (~90 MB) downloads on first use. Subsequent starts take 2-5 seconds.

Database Corruption

Symptom: SQLite errors or unexpected behavior

  1. Backup and recreate:

    mv ~/.memory-mcp/memory.db ~/.memory-mcp/memory.db.bak
    # Server will create fresh database on next start
    
  2. Re-bootstrap from project docs:

    memory-mcp-cli bootstrap
    

Security Note

This server is designed for local use only. It runs unauthenticated over STDIO transport and should not be exposed to networks or untrusted clients.

License

MIT
