memento
Provides persistent memory for AI coding agents via MCP, enabling agents to store and semantically recall facts, events, and lessons across sessions, all running locally without cloud dependencies.
<p align="center"> <img src="https://raw.githubusercontent.com/UmoLab/memento/main/logo.jpg" alt="memento logo" width="200"/> </p>
Persistent memory for AI coding agents. Local-first. Zero cloud. Plug-and-play with Claude Code via MCP.
Stop losing context between sessions. memento gives your AI agent a long-term brain that runs entirely on your laptop. Stored facts (preferences, decisions, conventions) survive across sessions and are recalled by semantic similarity — so the next time you ask Claude to "use my dark theme," it remembers without you re-explaining.
Built for Claude Code but works with any MCP-compatible client.
Why
Every Claude Code session starts cold. You re-explain your preferences, your project's conventions, your architecture decisions. After 50 sessions, you've typed the same context dozens of times.
memento solves this with persistent memory:
- Stored automatically by your agent via MCP tools (memento_store, memento_recall, etc.)
- Recalled by semantic similarity (vector + full-text hybrid search)
- Survives sessions, restarts, machines (single SQLite file)
- 100% local — no API calls, no telemetry, no cloud lock-in
- Smart recall — confidence decay downranks forgotten facts (90-day half-life), access boost promotes frequently used memories
- Conflict detection — warns when you try to store a contradictory or near-duplicate fact
- Battle-tested — 179 tests, 92% coverage (with branch tracking), real concurrency tests
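To make the 90-day half-life concrete, here is a minimal sketch of exponential decay. Only the half-life figure comes from this README; the function names, the logarithmic access boost, and the way the terms are multiplied together are assumptions for illustration, not memento's actual scoring code.

```python
import math

HALF_LIFE_DAYS = 90.0  # the half-life quoted in this README


def decay_factor(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay multiplier: 1.0 at age 0, 0.5 after one half-life."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return 0.5 ** (age_days / half_life_days)


def ranked_score(similarity: float, age_days: float, access_count: int = 0) -> float:
    """Hypothetical ranking: similarity x decay x a mild access boost."""
    # Assumed boost shape; memento's real formula is not documented here.
    boost = 1.0 + 0.1 * math.log1p(access_count)
    return similarity * decay_factor(age_days) * boost


if __name__ == "__main__":
    for days in (0, 90, 180):
        print(days, round(decay_factor(days), 3))
```

Under this model a memory untouched for 180 days keeps a quarter of its original weight, so it can still surface when nothing fresher matches.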
30-second quickstart
1. Install
pip install memento
⚠️ ~1.1 GB total (sentence-transformers + PyTorch). The first run also downloads an ~80 MB embedding model; later runs reuse the cached copy.
2. Initialize the database
memento init
Creates ~/.local/share/memento/memory.db.
3. Wire into Claude Code
Add to ~/.claude/mcp_servers.json (or .mcp.json in your project):
{
  "mcpServers": {
    "memory": {
      "command": "memento",
      "args": ["start"],
      "env": {"MEMENTO_PATH": "~/.local/share/memento/memory.db"}
    }
  }
}
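If the config file already lists other MCP servers, hand-editing risks clobbering them. The sketch below merges the entry shown above into an existing file and leaves other servers alone. The helper name add_memento_server is hypothetical and not part of memento; the keys and values are the ones from the snippet above.

```python
import json
import tempfile
from pathlib import Path


def add_memento_server(config_path: Path, db_path: str) -> dict:
    """Merge the memento entry into an MCP config file, keeping other servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["memory"] = {
        "command": "memento",
        "args": ["start"],
        "env": {"MEMENTO_PATH": db_path},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Demo in a throwaway directory so no real config is touched.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / ".mcp.json"
        add_memento_server(path, "~/.local/share/memento/memory.db")
        print(path.read_text(encoding="utf-8"))
```

To apply it for real, pass the path of your project's .mcp.json instead of the temporary one.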
4. Use it
Open Claude Code. It now has access to:
- store: save a fact (preference, decision, convention)
- recall: semantic search across stored facts
- update / forget: mutate stored facts
- recent / browse: inspect what's stored
- about: see what memento knows about you
Tell Claude: "Remember that I use PostgreSQL for all production databases and I never hardcode API keys." Next session, it'll know without you re-explaining.
First day with memento
A 5-minute walkthrough to prove the round-trip works:
$ memento init
Initialized memento at ~/.local/share/memento/memory.db
$ memento list
No memories yet. Run `memento init` then store something.
# Stash something via Python (or via Claude Code)
$ python -c "from memento import Memory; \
m = Memory(); \
m.store('I prefer dark mode in editors', 'fact', 0.8, subject_key='user.theme')"
$ memento list
01KVF2CKW772 fact imp=0.80 [user.theme] I prefer dark mode in editors
$ memento search "what theme does the user like"
1. 01KVF2CKW772 fact imp=0.80 [user.theme] I prefer dark mode in editors
$ memento show 01KVF2CKW772
Memory 01KVF2CKW772QQG4ZRPSDEH99E
kind: fact
subject_key: user.theme
importance: 0.80
source: manual
...
content:
I prefer dark mode in editors
If you see this — Claude Code will too. The next session, just open Claude and ask "what theme does the user like?" — it'll know.
What's stored
Three kinds of memories:
| Kind | Use for | Example |
|---|---|---|
| fact | User preferences, project facts, conventions | "User prefers dark mode" |
| event | Things that happened, dated context | "Migrated from Postgres to SQLite on 2026-01-15" |
| lesson | Procedural rules the agent should follow | "When using sqlite-vec, always include AND k = ?" |
Each memory can have a subject_key (e.g. user.theme, project.db_choice) for deterministic lookup, plus metadata and importance (0.0–1.0).
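One way to picture "deterministic lookup" by subject_key is a unique column plus an upsert: storing under the same key replaces the old row instead of adding a second one. The notes table and helper names below are hypothetical and are not memento's real schema; the sketch uses only the standard sqlite3 module and needs SQLite 3.24 or newer for ON CONFLICT.

```python
import sqlite3

# Hypothetical table for illustration only; memento's real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY,
    subject_key TEXT UNIQUE,
    content TEXT NOT NULL,
    importance REAL NOT NULL CHECK (importance BETWEEN 0.0 AND 1.0)
)
"""


def upsert(conn: sqlite3.Connection, subject_key: str, content: str, importance: float) -> None:
    """Insert a row, or overwrite the existing row with the same subject_key."""
    conn.execute(
        """
        INSERT INTO notes (subject_key, content, importance)
        VALUES (?, ?, ?)
        ON CONFLICT(subject_key) DO UPDATE SET
            content = excluded.content,
            importance = excluded.importance
        """,
        (subject_key, content, importance),
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, subject_key: str):
    """Return the content stored under subject_key, or None if absent."""
    row = conn.execute(
        "SELECT content FROM notes WHERE subject_key = ?", (subject_key,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    upsert(conn, "user.theme", "User prefers dark mode", 0.8)
    upsert(conn, "user.theme", "User prefers light mode", 0.8)
    print(lookup(conn, "user.theme"))
```

The CHECK constraint mirrors the 0.0–1.0 importance range, so out-of-range values fail at write time rather than skewing ranking later.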
Architecture
┌──────────────────────────────────────────────────┐
│ Claude Code (or any MCP client) │
│ │ │
│ │ MCP protocol (JSON-RPC over stdio) │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ memento mcp_server │ │
│ │ (10 tools: memento_store, memento_recall, …) │ │
│ └──────────────┬───────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ memento core (Python) │ │
│ │ ┌──────────┐ ┌──────────┐ ┌─────────┐ │ │
│ │ │ Store │ │ Recall │ │ Embedder│ │ │
│ │ │ (SQLite) │ │(vec+fts) │ │ (sbert)│ │ │
│ │ └──────────┘ └──────────┘ └─────────┘ │ │
│ └──────────────┬───────────────────────────┘ │
│ │ │
│ ▼ │
│ SQLite + FTS5 + vec0 (~/.local/share/…) │
└──────────────────────────────────────────────────┘
Everything in one SQLite file. SQLite gives ACID transactions, WAL for concurrent reads, and single-file backup. No Docker, no Postgres, no Redis.
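The two SQLite properties mentioned above can be seen with the standard sqlite3 module alone. This is a generic sketch, not memento's code: the demo table and function names are made up, and it only shows that WAL is enabled through a pragma and that a failed transaction leaves no partial rows behind.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal(db_path: Path) -> sqlite3.Connection:
    """Open a database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(db_path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode != "wal":
        conn.close()
        raise RuntimeError(f"WAL not available, got {mode!r}")
    return conn


def rows_after_failed_write(db_path: Path) -> int:
    """Attempt an insert that fails mid-transaction and count surviving rows."""
    conn = open_wal(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS demo (content TEXT NOT NULL)")
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute("INSERT INTO demo VALUES ('half-written')")
            raise RuntimeError("simulated failure mid-transaction")
    except RuntimeError:
        pass
    count = conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(rows_after_failed_write(Path(tmp) / "demo.db"))
```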
When to use memento
You want persistent memory that just works, runs on your laptop, doesn't phone home, and integrates with Claude Code in under a minute.
When NOT to use it
You need multi-tenant cloud storage, embeddings for non-text modalities, or 100k+ memories per user (vector search in SQLite caps out around there).
⚠️ Auto-execute awareness
Claude Code (and other MCP clients) may auto-execute memento_* tools without explicit confirmation if the user enables auto-mode. Every memento_store call writes a row to your local SQLite file. Two practical consequences:
- Use subject_key for deterministic facts. Free-form memento_store calls without a subject_key will dedupe via vector similarity: convenient, but you'll get a fresh row each time the wording shifts.
- Audit before clearing. memento_forget is a soft archive (recoverable). memento_forget --hard and memento_forget-prefix are destructive: they DELETE rows from the on-disk SQLite file. There is no undo.
If your agent runs in a context where another process might invoke these tools, require explicit user confirmation before each memento_* tool call.
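The "dedupe via vector similarity" behaviour can be pictured as a cosine-similarity check against stored embeddings. The sketch below is pure Python with toy two-dimensional vectors; the 0.9 threshold and the function names are assumptions, and memento's real threshold and implementation are not documented here.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_near_duplicate(
    new_vec: List[float],
    stored: Dict[str, List[float]],
    threshold: float = 0.9,  # assumed cutoff, not memento's documented value
) -> Optional[str]:
    """Return the id of the most similar stored vector at or above threshold."""
    best_id, best_sim = None, threshold
    for mem_id, vec in stored.items():
        sim = cosine_similarity(new_vec, vec)
        if sim >= best_sim:
            best_id, best_sim = mem_id, sim
    return best_id


if __name__ == "__main__":
    stored = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
    print(find_near_duplicate([1.0, 0.1], stored))  # close to "a"
    print(find_near_duplicate([1.0, 1.0], stored))  # close to neither
```

A reworded fact whose embedding falls below the cutoff is treated as new, which is why a subject_key is the safer choice for facts that must stay unique.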
Backing up and restoring
Everything is in one SQLite file. To back up:
cp ~/.local/share/memento/memory.db backup.db
# Or export to portable JSONL:
memento export ~/.local/share/memento/memory.db > backup.jsonl
To restore:
# From a SQLite file:
cp backup.db ~/.local/share/memento/memory.db
# From a JSONL file:
memento init ~/.local/share/memento/memory.db --force
memento import_data ~/.local/share/memento/memory.db backup.jsonl
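A plain cp is fine while nothing has the database open. If the MCP server is running and the file is in WAL mode, recent writes may still sit in the companion -wal file, so a copy of the main file alone can be stale. A hedged alternative is SQLite's online backup API, exposed in Python as sqlite3.Connection.backup (Python 3.7+). The helper name backup_db is hypothetical and not part of memento.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_db(src_path: Path, dest_path: Path) -> None:
    """Copy a live SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # produces a consistent snapshot of the source
    finally:
        dest.close()
        src.close()


if __name__ == "__main__":
    # Self-contained demo on a throwaway database.
    with tempfile.TemporaryDirectory() as tmp:
        src, dest = Path(tmp) / "memory.db", Path(tmp) / "backup.db"
        conn = sqlite3.connect(src)
        conn.execute("CREATE TABLE demo (content TEXT)")
        conn.execute("INSERT INTO demo VALUES ('hello')")
        conn.commit()
        conn.close()
        backup_db(src, dest)
        check = sqlite3.connect(dest)
        print(check.execute("SELECT content FROM demo").fetchall())
        check.close()
```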
Upgrading the embedding model
If you switch MEMENTO_EMBEDDING_MODEL or upgrade sentence-transformers, existing memories keep the embeddings produced by the old model, possibly with a different dimension. Run:
memento reembed
This re-encodes every memory using the current model. Without it, sqlite-vec queries on mixed-dim embeddings will crash.
Usage
CLI
memento init # create empty DB
memento verify # health check (8 checks)
memento stats # show counts, model, schema version
memento list # show recent memories (formatted)
memento show <id> # show one memory by id
memento search "query" # semantic search via CLI
memento start # run MCP server on stdio
memento export <path> > backup.jsonl # export all memories to JSONL
memento import_data <path> <file.jsonl> # import from JSONL backup
memento import <path> <pmb-db-path> # import from PMB-format DB
memento reembed # re-encode stale embeddings
memento about # show user.* facts + recent events
memento forget <id> # soft-archive one memory
memento forget <id> --hard # physical delete (irreversible)
memento forget-prefix <prefix> # archive all memories by subject_key prefix
Python API
from memento import Memory

mem = Memory(path="~/.local/share/memento/memory.db")

# Store a fact (with subject_key for deterministic dedup)
mem.store(
    content="User prefers dark mode",
    kind="fact",
    importance=0.8,
    subject_key="user.theme",
)

# Semantic recall (results are MemoryRecord, sorted by relevance)
results = mem.recall("what theme does the user prefer?")
for r in results:
    print(f"[{r.kind}] {r.content}")

# Update
mem.update(id=results[0].id, content="User prefers dark mode in editors, light in terminals")

# Forget
mem.forget(id=results[0].id)

# Browse by prefix
for item in mem.browse(subject_key_prefix="user."):
    print(item.content)
MCP tools
When running as an MCP server, these tools are exposed:
| Tool | Purpose |
|---|---|
| memento_store | Store a fact/event/lesson |
| memento_recall | Semantic search across all memories |
| memento_update | Update content/importance/metadata |
| memento_forget | Soft-delete (archive); hard=True purges from disk |
| memento_forget_prefix | Soft-archive all memories by subject_key prefix |
| memento_recent | Browse recent memories |
| memento_browse | List by subject_key prefix |
| memento_stats | Counts, schema version, model info |
| memento_store_many | Atomic batch insert |
| memento_recall_many | Batch semantic search |
All tool names are prefixed with memento_ to avoid collisions with other MCP servers exposing generic names like store or recall.
For migration from PMB-format DBs, use the Python API:
from memento.importers.pmb import import_pmb
Storage format
Single SQLite file with three tables:
- memories: content + metadata (subject_key, kind, importance, source, timestamps)
- memory_fts: FTS5 virtual table for keyword fallback
- memory_vec: sqlite-vec virtual table for semantic search
Schema migrations: schema_version table tracks applied migrations; memento init upgrades existing DBs in place.
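A version-table migration scheme like the one described can be sketched in a few lines of sqlite3. Everything here is illustrative: the MIGRATIONS list, the items table, and the single version column in schema_version are assumptions, since this README does not show memento's real migrations or table layout.

```python
import sqlite3

# Hypothetical migrations; memento's real ones are not shown in this README.
MIGRATIONS = [
    (1, "CREATE TABLE items (id INTEGER PRIMARY KEY, content TEXT NOT NULL)"),
    (2, "ALTER TABLE items ADD COLUMN importance REAL NOT NULL DEFAULT 0.5"),
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply pending migrations in order and return the resulting version."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    for version, sql in MIGRATIONS:
        if version <= current:
            continue  # already applied
        # Explicit transaction so the DDL and the version row land together.
        conn.execute("BEGIN")
        try:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            conn.commit()
        except Exception:
            conn.rollback()
            raise
    return conn.execute("SELECT MAX(version) FROM schema_version").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(migrate(conn))  # first call applies every migration
    print(migrate(conn))  # second call finds nothing pending
```

Because applied versions are recorded, running the upgrade again on an up-to-date database does nothing, which is what makes an in-place upgrade on init safe to repeat.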
Testing
git clone https://github.com/UmoLab/memento
cd memento
pip install -e ".[dev]"
pytest # 179 tests, ~10 min on CPU
pytest --cov=memento # with coverage
ruff check src tests # lint
License
Apache 2.0. See LICENSE.
Credits
- sqlite-vec — vector search in SQLite
- sentence-transformers — local embeddings
- fastmcp — MCP server framework
- python-ulid — sortable IDs