claude-memory
Provides persistent, searchable memory for Claude Code using local SQLite, semantic embeddings, and full-text search, enabling Claude to recall and retrieve context across sessions and projects without external services.
<p align="center"> <img src="assets/banner.svg" alt="claude-memory" width="680"> </p>
claude-memory
Persistent, searchable memory for Claude Code — backed by SQLite, semantic embeddings, and full-text search. Connected via MCP.
Everything runs locally. No API keys. No cloud services. Your data never leaves your machine.
Claude remembers. Across sessions. Across projects. Forever.
You: "What do you remember about my auth setup?"
Claude: *searches 847 memories semantically*
*finds 3 relevant entries across 2 projects*
*ranks by importance, recency, and relevance*
"Based on my memory: you use JWT with refresh token rotation,
the auth middleware lives in src/middleware/auth.ts, and you
switched from Passport to a custom solution last month because..."
Why this exists
Claude Code ships with MEMORY.md — a per-project markdown file, capped at ~200 lines, loaded in full every message. It works for small notes. It doesn't scale.
| | MEMORY.md | claude-memory |
|---|---|---|
| Scope | Single project | Global (all projects, all sessions) |
| Search | None (full file loaded every turn) | Semantic + full-text hybrid search |
| Capacity | ~200 lines before truncation | Unlimited (SQLite) |
| Structure | Flat markdown | Categories, tags, relations, importance |
| Duplicates | Manual | Automatic 85% similarity detection |
| Relevance | All or nothing | Importance scoring with time decay |
| Connections | None | Typed relationship graph |
Quick start
Requires Node.js 18+
git clone https://github.com/Tim-Fischer-zh/claude-memory.git
cd claude-memory
./install.sh
Restart Claude Code. The embedding model (~23MB) downloads on first use — subsequent starts are instant.
To uninstall:
./uninstall.sh # backs up your database
./uninstall.sh --force # deletes everything, no backup
<details> <summary>What the installer does</summary> <br>
- Copies source to `~/.claude/memory-server/`
- Runs `npm install` and compiles TypeScript
- Registers the MCP server in `~/.claude.json` (merges safely; won't overwrite your other servers)
- Installs Claude rules to `~/.claude/rules/` (won't overwrite custom rules)
// Added to ~/.claude.json
{
"mcpServers": {
"memory": {
"command": "node",
"args": ["~/.claude/memory-server/dist/index.js"]
}
}
}
</details>
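The "merges safely" step can be pictured as a non-destructive object merge. The sketch below is illustrative only: the installer itself is a shell script, and `registerServer`, `ClaudeConfig`, and `McpServerEntry` are hypothetical names. Only the `mcpServers` key and the `memory` entry come from this README.

```typescript
// Hypothetical types: only `mcpServers` and the "memory" entry appear in the README.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are carried over untouched
}

// Returns a new config with `name` registered. Other servers and unrelated
// keys are preserved. Assumption: an existing entry with the same name is replaced.
function registerServer(
  config: ClaudeConfig,
  name: string,
  entry: McpServerEntry
): ClaudeConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}
```

Returning a new object instead of mutating the input keeps the original config available for a backup or a diff before anything is written to disk.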
Tools
11 MCP tools, organized by function.
Store & retrieve
| Tool | What it does |
|---|---|
| `remember` | Store knowledge with category, tags, and source tracking. Checks for duplicates and warns if a memory more than 85% similar already exists. |
| `recall` | Hybrid search: semantic similarity + full-text matching + importance scoring. Finds "JWT middleware" when you search "auth setup". |
| `update_memory` | Modify content, category, tags, or importance. Auto re-embeds on content change. |
| `forget` | Delete a memory and cascade to its embedding and relationships. |
Browse
| Tool | What it does |
|---|---|
| `list_categories` | Overview of categories, entry counts, and embedding model status. |
Connect
| Tool | What it does |
|---|---|
| `relate` | Link two memories: `related`, `supersedes`, `caused_by`, `contradicts`, `supports`, `depends_on`. |
| `find_related` | Traverse the relationship graph + find semantically similar entries. |
Maintain
| Tool | What it does |
|---|---|
| `consolidate` | Scan for clusters of duplicate/similar memories. Returns groups ranked by similarity. |
| `merge` | Combine multiple memories into one. Preserves relationships, merges tags, deletes originals. |
Explore
| Tool | What it does |
|---|---|
| `visualize` | Open the memory graph UI in the browser: force-directed graph, categories, search, stats. |
How it works
Semantic search
Memories are embedded locally using all-MiniLM-L6-v2 (384 dimensions) via Transformers.js. Searching "immutability preferences" finds a memory stored as "always use functional patterns, never mutate state" — no keyword overlap needed.
The model loads in the background. Until ready, search falls back to full-text only — the server is always responsive.
Hybrid ranking
Every search combines three signals:
score = semantic_similarity × 0.5
+ full_text_relevance × 0.2
+ importance_score × 0.3
Importance blends manual priority, access frequency, and pin status with time decay:
importance = (manual_importance × 0.4 + access_frequency × 0.3 + pinned × 0.3)
× decay
decay = pinned ? 1.0 : e^(-0.005 × days_since_last_access)
Frequently accessed, manually prioritized, or pinned memories rank higher. Unused memories fade — unless pinned.
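The two formulas above translate almost directly into code. This is a minimal sketch, not the server's actual implementation: the function and field names are hypothetical, and it assumes every input signal is already normalized to the 0..1 range.

```typescript
// Hypothetical input shape; assumes all numeric signals are normalized to 0..1.
interface ImportanceInputs {
  manualImportance: number;
  accessFrequency: number;
  pinned: boolean;
  daysSinceLastAccess: number;
}

const DECAY_RATE = 0.005; // per day, taken from the decay formula above

// importance = (manual * 0.4 + frequency * 0.3 + pinned * 0.3) * decay
function importanceScore(m: ImportanceInputs): number {
  const base =
    m.manualImportance * 0.4 +
    m.accessFrequency * 0.3 +
    (m.pinned ? 1 : 0) * 0.3;
  // Pinned memories never decay; everything else fades exponentially.
  const decay = m.pinned ? 1.0 : Math.exp(-DECAY_RATE * m.daysSinceLastAccess);
  return base * decay;
}

// score = semantic * 0.5 + full-text * 0.2 + importance * 0.3
function hybridScore(
  semanticSimilarity: number,
  fullTextRelevance: number,
  importance: number
): number {
  return semanticSimilarity * 0.5 + fullTextRelevance * 0.2 + importance * 0.3;
}
```

With a decay rate of 0.005 per day, an unpinned memory's decay factor halves after ln(2) / 0.005 ≈ 139 days without access.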
Deduplication
When storing, the server checks cosine similarity against all existing memories. If a match exceeds 85% similarity, it returns the existing memory instead of creating a duplicate.
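A minimal sketch of that check, assuming embeddings are held in memory as `Float32Array` values. `cosineSimilarity`, `findDuplicate`, and the `StoredEmbedding` shape are hypothetical names for illustration, not the server's API.

```typescript
// Hypothetical record shape for an already-stored memory.
interface StoredEmbedding {
  memoryId: number;
  embedding: Float32Array;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Returns the most similar existing memory above the threshold, or null.
function findDuplicate(
  candidate: Float32Array,
  existing: StoredEmbedding[],
  threshold = 0.85
): { memoryId: number; similarity: number } | null {
  let best: { memoryId: number; similarity: number } | null = null;
  for (const e of existing) {
    const similarity = cosineSimilarity(candidate, e.embedding);
    if (similarity > threshold && (best === null || similarity > best.similarity)) {
      best = { memoryId: e.memoryId, similarity };
    }
  }
  return best;
}
```

A linear scan like this compares the candidate against every stored vector, which matches the "against all existing memories" wording above.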
Relationship graph
Memories form a directed graph:
"Use JWT for auth" ──caused_by──▶ "Security audit findings"
"Switch to bun" ──supersedes──▶ "Use npm for all projects"
"Redis caching layer" ──depends_on──▶ "Redis deployment config"
"Use REST not GraphQL" ──contradicts─▶ "Evaluate GraphQL for API"
find_related traverses explicit edges and surfaces semantically similar entries — giving you both explicit and implicit connections.
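The explicit-edge half of this can be sketched as a breadth-first walk over the relation list. The code below makes assumptions the README does not confirm: edges are followed in both directions, the walk stops at a depth limit, and `traverseRelations` and `Relation` are hypothetical names.

```typescript
type RelationType =
  | "related" | "supersedes" | "caused_by"
  | "contradicts" | "supports" | "depends_on";

// Hypothetical in-memory form of a memory_relations row.
interface Relation {
  sourceId: number;
  targetId: number;
  type: RelationType;
}

// Breadth-first walk from startId. Returns memory id -> hop distance,
// excluding the start node. Assumption: edges are followed in both directions.
function traverseRelations(
  startId: number,
  relations: Relation[],
  maxDepth = 2
): Map<number, number> {
  const depthById = new Map<number, number>([[startId, 0]]);
  let frontier = [startId];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: number[] = [];
    for (const id of frontier) {
      for (const r of relations) {
        const neighbor =
          r.sourceId === id ? r.targetId :
          r.targetId === id ? r.sourceId : undefined;
        if (neighbor !== undefined && !depthById.has(neighbor)) {
          depthById.set(neighbor, depth);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  depthById.delete(startId);
  return depthById;
}
```

The visited map doubles as cycle protection, so mutually linked memories (for example two entries that `contradicts` each other) are reported once.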
Consolidation
Over time, small related memories accumulate:
"Tim uses TypeScript for all projects" ┐
"Always use strict TypeScript" ├── 87% similar → merge candidates
"TypeScript with strict mode is preferred" ┘
consolidate finds these clusters. merge combines them into one clean entry, preserving all relationships.
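One way to find such clusters is single-link grouping with a union-find structure over pairwise similarities. This is a sketch of that general technique, not necessarily what `consolidate` does internally. `clusterBySimilarity` is a hypothetical name, the similarity matrix is assumed to be precomputed, the threshold is a free parameter (the README does not state the one `consolidate` uses), and ranking the groups is left out.

```typescript
// sim[i][j] is the precomputed similarity between memory i and memory j.
// Returns groups of indices with at least two members.
function clusterBySimilarity(sim: number[][], threshold: number): number[][] {
  const n = sim.length;
  const parent = Array.from({ length: n }, (_, i) => i);

  // Find the representative of x, shortening the path as we go.
  const find = (x: number): number => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  };

  // Union every pair at or above the threshold.
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      if (sim[i][j] >= threshold) parent[find(i)] = find(j);
    }
  }

  // Collect members per representative and drop singletons.
  const groups = new Map<number, number[]>();
  for (let i = 0; i < n; i++) {
    const root = find(i);
    const members = groups.get(root) ?? [];
    members.push(i);
    groups.set(root, members);
  }
  return Array.from(groups.values()).filter((g) => g.length > 1);
}
```

Single-link grouping is transitive: if A matches B and B matches C, all three land in one cluster even when A and C fall below the threshold. That suits merge candidates that a human or Claude reviews before calling `merge`.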
Source tracking
Every memory can record where it came from:
- `source_project`: project directory (e.g. `~/projects/my-app`)
- `source_session`: Claude Code session ID
- `source_file`: file being worked on

Agent support
The installed rules ensure every agent and subagent:
- Calls `recall` before starting work to load relevant context
- Has access to memory tools for storing findings
- Sets `source_project` when remembering
This works with the Agent tool, Tasks, agent teams, swarms, and pipelines.
Performance
<p align="center"> <img src="assets/benchmarks.svg" alt="Benchmark results" width="800"> </p>
<details> <summary>Run the benchmark yourself</summary> <br>
node tools/benchmark.js
Seeds 1,000 test memories if your database has fewer than 50, runs all benchmarks, then cleans up. Your real data is never modified. </details>
Visualizer
Built-in web UI for exploring your memory database — force-directed graph, category filters, search, and live stats.
# standalone
node tools/visualize.js
# or via Claude
"show me my memory" → Claude runs the visualize tool
Open localhost:4200.
<details> <summary>Features</summary> <br>
- Force graph — D3 force-directed layout, edges for typed relationships
- Cluster mode — group nodes by category
- Radial mode — circular layout by category
- Sidebar — searchable, filterable card list synced with the graph
- Tooltips — hover any node for content, tags, importance
- Stats bar — total memories, relations, embeddings, recalls, pinned count
- Color-coded — distinct colors per category, edge colors per relation type
- Interactive — drag, zoom, click-to-highlight
</details>
Architecture
┌─────────────┐     stdio/MCP     ┌──────────────────┐     ┌──────────┐
│             │◄─────────────────►│  Memory Server   │────►│  SQLite  │
│ Claude Code │                   │                  │     │  + FTS5  │
│             │                   │  index.ts        │     │  + WAL   │
└─────────────┘                   │  db.ts           │     └──────────┘
                                  │  embeddings.ts   │────►┌──────────┐
                                  └──────────────────┘     │  MiniLM  │
                                                           │  L6-v2   │
                                                           └──────────┘
Schema
memories (
id, content, category, tags,
source_project, source_session, source_file,
importance, access_count, last_accessed_at, pinned,
created_at, updated_at
)
memory_embeddings (
memory_id → memories.id,
embedding BLOB -- Float32Array × 384
)
memory_relations (
source_id → memories.id,
target_id → memories.id,
relation_type -- related | supersedes | caused_by | contradicts | supports | depends_on
)
memories_fts -- FTS5 virtual table over content, category, tags
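Storing a `Float32Array × 384` in a BLOB column comes down to reinterpreting the vector's bytes. A minimal sketch, assuming the SQLite driver accepts and returns `Uint8Array` values; `embeddingToBlob` and `blobToEmbedding` are hypothetical helpers, and the bytes are stored in the platform's native byte order.

```typescript
const EMBEDDING_DIM = 384; // all-MiniLM-L6-v2 output size
const BYTES_PER_FLOAT = 4;

// Copy the vector's raw bytes into a standalone Uint8Array for a BLOB column.
function embeddingToBlob(embedding: Float32Array): Uint8Array {
  if (embedding.length !== EMBEDDING_DIM) {
    throw new Error(`expected ${EMBEDDING_DIM} dimensions, got ${embedding.length}`);
  }
  return new Uint8Array(
    embedding.buffer,
    embedding.byteOffset,
    embedding.byteLength
  ).slice();
}

// Rebuild the vector from BLOB bytes. The copy guarantees 4-byte alignment,
// which a view into a driver-owned buffer may not have.
function blobToEmbedding(blob: Uint8Array): Float32Array {
  if (blob.byteLength !== EMBEDDING_DIM * BYTES_PER_FLOAT) {
    throw new Error(`expected ${EMBEDDING_DIM * BYTES_PER_FLOAT} bytes, got ${blob.byteLength}`);
  }
  const copy = blob.slice();
  return new Float32Array(copy.buffer);
}
```

Each embedding takes 384 × 4 = 1,536 bytes, so 1,000 memories need about 1.5 MB of vector storage.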
Project structure
src/
  index.ts         MCP server: tool definitions, request handling
  db.ts            Database: schema, queries, scoring, relations
  embeddings.ts    Embedding model: lazy loading, cosine similarity
rules/
  memory.md        Claude rule: automatic memory usage
  agents.md        Agent rule: memory-aware agent spawning
tools/
  visualize.js     Web UI: D3 force graph, category explorer
  benchmark.js     Performance benchmarks
install.sh         Safe installer (preserves existing config)
uninstall.sh       Uninstaller with database backup
Storage paths
| Path | Contents |
|---|---|
| `~/.claude/memory-server/dist/` | Compiled server |
| `~/.claude/memory-server/memory.db` | Your knowledge base |
| `~/.claude/memory-server/models/` | Cached embedding model (~23MB) |
| `~/.claude/rules/claude-memory.md` | Installed Claude rule |
| `~/.claude/rules/claude-memory-agents.md` | Agent awareness rule |
The database uses WAL mode for safe concurrent access.
License
MIT