<div align="center">
# 🧠 Hippocampus Memory MCP Server

**Persistent, Semantic Memory for Large Language Models**

[Features](#features) • [Installation](#installation) • [Quick Start](#quick-start) • [Documentation](#documentation) • [Architecture](#architecture)
</div>
## Overview
A Python-based Model Context Protocol (MCP) server that gives LLMs persistent, hippocampus-inspired memory across sessions. Store, retrieve, consolidate, and forget memories using semantic similarity search powered by vector embeddings.
**Why "Hippocampus"?** Just as the human brain's hippocampus consolidates short-term memories into long-term storage, this server intelligently manages LLM memory through biology-inspired patterns:
- **Consolidation** - merge similar memories to reduce redundancy
- **Forgetting** - remove outdated information based on age and importance
- **Semantic Retrieval** - find relevant memories through meaning, not keywords
## Features
| Feature | Description |
|---|---|
| Vector Storage | FAISS-powered semantic similarity search |
| MCP Compliant | Full MCP 1.2.0 spec compliance via FastMCP |
| Bio-Inspired | Hippocampus-style consolidation and forgetting |
| Security | Input validation, rate limiting, injection prevention |
| Semantic Search | Sentence-transformer embeddings (CPU-optimized) |
| Unlimited Storage | No memory count limits, only per-item size limits |
| 100% Free | Local embedding model - no API costs |
## Quick Start

### 5 Core MCP Tools
```python
memory_read         # Retrieve memories by semantic similarity
memory_write        # Store new memories with tags & metadata
memory_consolidate  # Merge similar memories
memory_forget       # Remove memories by age/importance/tags
memory_stats        # Get system statistics
```
## Installation

### Quick Install (Recommended)
```bash
pip install hippocampus-memory-mcp
```
**Prerequisites:** Python 3.9+ • ~200MB disk space (for the embedding model)
### Claude Desktop Integration

Add to your Claude Desktop config (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["-m", "memory_mcp_server.server"]
    }
  }
}
```
That's it! Claude will now have persistent memory across conversations.
### Install from Source (Alternative)
```bash
# Clone the repository
git clone https://github.com/jameslovespancakes/Memory-MCP.git
cd Memory-MCP

# Install dependencies
pip install -r requirements.txt

# Run the server
python -m memory_mcp_server.server
```
## Documentation

### Memory Operations via MCP
Once connected to Claude, use natural language:
"Remember that I prefer Python for backend development"
β Claude calls memory_write()
"What do you know about my programming preferences?"
β Claude calls memory_read()
"Consolidate similar memories to clean up storage"
β Claude calls memory_consolidate()
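Behind the scenes, the client sends a standard MCP `tools/call` request over stdio. An illustrative JSON-RPC payload for the first example above (the argument values are invented for illustration; parameter names follow the API examples below):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_write",
    "arguments": {
      "text": "User prefers Python for backend development",
      "tags": ["preference", "programming"],
      "importance_score": 3.0
    }
  }
}
```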
### Direct API Usage

#### Writing Memories
```python
from memory_mcp_server.storage import MemoryStorage
from memory_mcp_server.tools import MemoryTools

# Note: the awaits below must run inside an async function (e.g. via asyncio.run)
storage = MemoryStorage(storage_path="my_memory")
await storage._ensure_initialized()
tools = MemoryTools(storage)

# Store with tags and importance
await tools.memory_write(
    text="User prefers dark mode UI",
    tags=["preference", "ui"],
    importance_score=3.0,
    metadata={"category": "settings"}
)
```
#### Reading Memories
```python
# Semantic search
result = await tools.memory_read(
    query_text="What are my UI preferences?",
    top_k=5,
    min_similarity=0.3
)

# Filter by tags and date
result = await tools.memory_read(
    query_text="Python learning",
    tags=["learning", "python"],
    date_range_start="2024-01-01"
)
```
#### Consolidating Memories
```python
# Merge similar memories (similarity threshold: 0.85)
result = await tools.memory_consolidate(similarity_threshold=0.85)
print(f"Merged {result['consolidated_groups']} groups")
```
#### Forgetting Memories
```python
# Remove by age
await tools.memory_forget(max_age_days=30)

# Remove by importance
await tools.memory_forget(min_importance_score=2.0)

# Remove by tags
await tools.memory_forget(tags_to_forget=["temporary"])
```
### Testing
Run the included test suite:
```bash
python test_memory.py
```
This tests all 5 operations with sample data.
## Architecture
```
┌─────────────────────────────────────────────────────┐
│          MCP Client (Claude Desktop, etc.)          │
└────────────────────┬────────────────────────────────┘
                     │ JSON-RPC over stdio
┌────────────────────▼────────────────────────────────┐
│             FastMCP Server (server.py)              │
│  ├─ memory_read                                     │
│  ├─ memory_write                                    │
│  ├─ memory_consolidate                              │
│  ├─ memory_forget                                   │
│  └─ memory_stats                                    │
└────────────────────┬────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────┐
│               Memory Tools (tools.py)               │
│  ├─ Input validation & sanitization                 │
│  └─ Rate limiting (100 req/min)                     │
└────────────────────┬────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────┐
│             Storage Layer (storage.py)              │
│  ├─ Sentence Transformers (all-MiniLM-L6-v2)        │
│  ├─ FAISS Vector Index (cosine similarity)          │
│  └─ JSON persistence (memories.json)                │
└─────────────────────────────────────────────────────┘
```
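For orientation, a FastMCP server of this shape can be sketched as follows. Tool names match the five above, but the bodies are placeholders, not the real `server.py`:

```python
# Minimal sketch of a FastMCP stdio server; placeholder bodies, not the actual server.py.
from typing import Optional

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("memory")

@mcp.tool()
async def memory_write(text: str, tags: Optional[list[str]] = None,
                       importance_score: float = 1.0) -> str:
    """Store a new memory (placeholder body)."""
    return f"stored: {text[:40]}"

@mcp.tool()
async def memory_read(query_text: str, top_k: int = 5) -> list[str]:
    """Retrieve memories by semantic similarity (placeholder body)."""
    return []

if __name__ == "__main__":
    mcp.run()  # serves JSON-RPC over stdio by default
```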
## Memory Lifecycle
| Step | Process | Technology |
|---|---|---|
| Write | Text → 384-dim vector embedding | Sentence Transformers (CPU) |
| Store | Normalized vector → FAISS index | FAISS IndexFlatIP |
| Search | Query → embedding → top-k similar | Cosine similarity |
| Consolidate | Group similar (>0.85) → merge | Vector clustering |
| Forget | Filter by age/importance/tags → delete | Metadata filtering |
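The search step boils down to inner products over L2-normalized vectors, which equal cosine similarity. A self-contained sketch using the same model and a FAISS `IndexFlatIP`, independent of this package's internals:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim, CPU-friendly

texts = ["User prefers dark mode UI", "Python is used for the backend"]
vecs = model.encode(texts).astype(np.float32)
faiss.normalize_L2(vecs)                 # unit vectors: inner product == cosine

index = faiss.IndexFlatIP(vecs.shape[1])  # exact inner-product index
index.add(vecs)

query = model.encode(["What are my UI preferences?"]).astype(np.float32)
faiss.normalize_L2(query)
scores, ids = index.search(query, k=2)    # top-k by cosine similarity
print(ids[0], scores[0])
```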
## Security
| Protection | Implementation |
|---|---|
| Injection Prevention | Regex filtering of script tags, eval(), path traversal |
| Rate Limiting | 100 requests per 60-second window per client |
| Size Limits | 50KB text, 5KB metadata, 20 tags per memory |
| Input Validation | Pydantic models + custom sanitization |
| Safe Logging | stderr only (prevents JSON-RPC corruption) |
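As an illustration of the rate-limiting row (not necessarily the code used here), a sliding-window limiter over a 60-second window can be as small as:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` calls per `window_seconds`."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.calls: deque = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        while self.calls and now - self.calls[0] > self.window:
            self.calls.popleft()              # drop calls outside the window
        if len(self.calls) >= self.max_requests:
            return False                      # over budget: reject
        self.calls.append(now)
        return True
```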
## Configuration

### Environment Variables
```bash
MEMORY_STORAGE_PATH="memory_data"   # Storage directory
EMBEDDING_MODEL="all-MiniLM-L6-v2"  # Model name
RATE_LIMIT_REQUESTS=100             # Max requests
RATE_LIMIT_WINDOW=60                # Time window (seconds)
```
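These variables would typically be read with fallbacks to the defaults shown above, along these lines (an illustrative sketch; only the variable names come from the list above):

```python
import os

STORAGE_PATH = os.getenv("MEMORY_STORAGE_PATH", "memory_data")
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
RATE_LIMIT_REQUESTS = int(os.getenv("RATE_LIMIT_REQUESTS", "100"))
RATE_LIMIT_WINDOW = int(os.getenv("RATE_LIMIT_WINDOW", "60"))
```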
### Storage Limits

- Unlimited total memories (no count limit)
- Per-memory limits: 50KB text, 5KB metadata, 20 tags
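A hedged sketch of how such per-memory limits can be enforced before a write (illustrative, not this package's actual validator):

```python
MAX_TEXT_BYTES = 50 * 1024
MAX_METADATA_BYTES = 5 * 1024
MAX_TAGS = 20

def validate_memory(text: str, tags: list, metadata_json: str) -> None:
    """Raise ValueError if a memory exceeds the documented per-item limits."""
    if len(text.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("text exceeds 50KB limit")
    if len(metadata_json.encode("utf-8")) > MAX_METADATA_BYTES:
        raise ValueError("metadata exceeds 5KB limit")
    if len(tags) > MAX_TAGS:
        raise ValueError("more than 20 tags")
```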
## Troubleshooting
<details>
<summary><b>Model won't download</b></summary>

The first run downloads `all-MiniLM-L6-v2` (~90MB). Ensure an internet connection and write permission for `~/.cache/`.

</details>
<details>
<summary><b>PyTorch compatibility errors</b></summary>

```bash
pip uninstall torch transformers sentence-transformers -y
pip install torch==2.1.0 transformers==4.35.2 sentence-transformers==2.2.2
```

</details>
<details>
<summary><b>Memory errors on large operations</b></summary>

The model runs on CPU. Ensure 2GB+ of free RAM, and reduce `top_k` in read operations if needed.

</details>
## License

MIT License - feel free to use it in your projects!
## Contributing
PRs welcome! Please:
- Follow MCP security guidelines
- Add tests for new features
- Update documentation
<div align="center">
**Built with 🧠 for persistent LLM memory**
</div>