engram
Provides persistent, local-first AI memory across sessions via MCP tools for storing, searching, and retrieving context from past interactions.
Engram
<img width="1408" height="768" alt="Gemini_Generated_Image_9nr9z59nr9z59nr9" src="https://github.com/user-attachments/assets/e04422f2-1974-48c2-8568-238bc2641bdf" />
The AI memory layer that never forgets.
Engram is a local-first, open-source AI memory system. It solves one problem: AI sessions end, but your work doesn't. Every decision, debug session, and architecture choice you've worked through with an AI disappears the moment the conversation closes. Engram makes them permanent and searchable, and reloads the essentials into a new session as a cold-start context of about 170 tokens.
Everything lives on your machine. No cloud. No API key required to run.
Benchmark Targets
| Benchmark | Metric | Target |
|---|---|---|
| LongMemEval | Single-session QA | ≥ 0.68 F1 |
| LongMemEval | Multi-session QA | ≥ 0.61 F1 |
| LoCoMo | Entity recall | ≥ 0.72 |
| LoCoMo | Event recall | ≥ 0.69 |
| ES compression | Factual paragraphs | 8–10× |
| ES compression | Code-heavy content | 4–6× |
| ES compression | Mixed | ~6× |
| Cold-start context | L0 + L1 tokens | ≤ 170 |
| Search latency p99 | ChromaDB 100k | < 200ms |
| Search latency p99 | FAISS 100k | < 50ms |
Quick Start
# Install
pip install engram
# Or with optional backends
pip install "engram[faiss]" # FAISS speed backend
pip install "engram[sqlitevec]" # zero-dependency fallback
pip install "engram[all]" # everything
# Initialise your memory château
engram init ~/myproject
# Mine a project directory
engram mine ~/myproject --wing myapp
# Mine a conversation export
engram mine ~/Downloads/claude-export --mode convos --wing myapp
# Search
engram search "auth migration decisions" --wing myapp
# Load cold-start context (~170 tokens)
engram wake-up
Memory Château Architecture
Wing (person or project)
└── Room (named topic: "auth-migration", "ci-pipeline")
    ├── Hall (memory type: facts | events | discoveries | preferences | advice)
    │   ├── Closet (ES-compressed summary, fast AI read)
    │   └── Drawer (verbatim original, never summarised)
    └── Tunnel (cross-wing link when same room spans multiple wings)
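One way to picture this hierarchy in code is a set of nested dataclasses. The sketch below is illustrative only: the class names, field names, and the `find_tunnels` helper are assumptions made for this example and are not taken from `engram/palace.py`. It also assumes a tunnel is simply a room name that appears in more than one wing.

```python
from collections import defaultdict
from dataclasses import dataclass, field

HALL_TYPES = ("facts", "events", "discoveries", "preferences", "advice")


@dataclass
class Hall:
    memory_type: str   # one of HALL_TYPES
    closet: str = ""   # ES-compressed summary
    drawer: str = ""   # verbatim original text

    def __post_init__(self):
        if self.memory_type not in HALL_TYPES:
            raise ValueError(f"unknown hall type: {self.memory_type}")


@dataclass
class Room:
    name: str
    halls: dict = field(default_factory=dict)  # memory_type -> Hall


@dataclass
class Wing:
    name: str
    rooms: dict = field(default_factory=dict)  # room name -> Room


def find_tunnels(wings):
    """Map each room name found in more than one wing to the wings that share it."""
    seen = defaultdict(list)
    for wing in wings:
        for room_name in wing.rooms:
            seen[room_name].append(wing.name)
    return {room: names for room, names in seen.items() if len(names) > 1}


# Hypothetical wings: "auth-migration" exists in both, so it forms a tunnel.
myapp = Wing("myapp", {"auth-migration": Room("auth-migration")})
infra = Wing("infra", {
    "auth-migration": Room("auth-migration"),
    "ci-pipeline": Room("ci-pipeline"),
})
tunnels = find_tunnels([myapp, infra])
```

Here `tunnels` contains only `auth-migration`, because `ci-pipeline` lives in a single wing.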
Memory Layers
| Layer | Content | Size | When Loaded |
|---|---|---|---|
| L0 | Identity — who is this AI? | ~50 tokens | Always |
| L1 | Critical facts in ES | ~120 tokens | Always |
| L2 | Room recall — current project | On demand | When topic arises |
| L3 | Deep semantic search | On demand | When explicitly queried |
Total cold-start context: ~170 tokens (L0 + L1 only).
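The budget above can be illustrated with a small packing routine: always include L0, then add L1 facts until the next one would exceed the limit. This is a sketch under stated assumptions, not Engram's `layers.py`. The function names are hypothetical, and tokens are approximated by whitespace-separated words, which will differ from a real tokenizer's count.

```python
def approx_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption, not a real tokenizer)."""
    return len(text.split())


def build_cold_start(identity: str, critical_facts: list, budget: int = 170) -> str:
    """Join L0 (identity) with as many L1 facts as fit inside the token budget."""
    parts = [identity]
    used = approx_tokens(identity)
    for fact in critical_facts:
        cost = approx_tokens(fact)
        if used + cost > budget:
            break  # stop at the first fact that would exceed the budget
        parts.append(fact)
        used += cost
    return "\n".join(parts)


# Hypothetical identity and facts, for illustration only.
context = build_cold_start(
    "You are the assistant for the myapp project.",
    ["auth: migrating sessions to JWT", "ci: pipeline runs on every push"],
)
```

Ordering the facts by importance before packing means the least critical ones are the first to be dropped when the budget is tight.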
Using with Claude via MCP
Start the MCP server:
python -m engram.mcp_server
Add to ~/.config/claude/claude_desktop_config.json:
{
  "mcpServers": {
    "engram": {
      "command": "python",
      "args": ["-m", "engram.mcp_server"]
    }
  }
}
Claude will then have access to all 22 engram_* tools including engram_wake_up,
engram_search, engram_add_memory, engram_kg_add, engram_replay, and more.
See examples/mcp_setup.md for the full tool list.
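If the Claude config file already lists other MCP servers, the `engram` entry has to be merged in rather than pasted over the whole file. The helper below is a minimal sketch of that merge using only the standard library. `add_engram_server` is a hypothetical name and is not part of Engram.

```python
import json
from pathlib import Path


def add_engram_server(config_path: Path) -> dict:
    """Merge an "engram" entry into mcpServers, keeping any servers already configured."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers["engram"] = {"command": "python", "args": ["-m", "engram.mcp_server"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Example call, using the path given above (adjust for your platform):
# add_engram_server(Path("~/.config/claude/claude_desktop_config.json").expanduser())
```

Restart the Claude client after editing the file so that it picks up the new server entry.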
Using with Local Models
Engram works entirely offline. The vector backends use local embeddings:
from engram.backends import get_backend
from engram.palace import Palace
from engram.searcher import Searcher

# Build a searcher over the local palace
palace = Palace()
backend = get_backend("chromadb")  # or "faiss" / "sqlitevec"
searcher = Searcher(backend, palace)

# Print each hit with its final score
results = searcher.search("auth migration")
for r in results:
    print(r["text"], r["final_score"])
No model API required. ChromaDB embeds locally using its bundled models.
Full CLI Reference
Setup
engram init <dir> # guided onboarding + ES bootstrap
Mining
engram mine <dir> # mine project files
engram mine <dir> --mode convos # mine conversation exports
engram mine <dir> --mode convos --wing myapp # tag with a wing
engram mine <dir> --since 2026-01-01 # skip files older than date
engram mine <dir> --plugin obsidian # use Obsidian vault plugin
engram mine <dir> --plugin notion # Notion export
engram mine <dir> --plugin linear # Linear issues export
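To make the `--since` filter concrete, here is a rough sketch of date-based file selection. It assumes the cutoff is compared against each file's modification time. The README does not say which timestamp Engram actually uses, and `files_since` is a hypothetical helper, not Engram's miner.

```python
from datetime import datetime
from pathlib import Path


def files_since(root: Path, since: str):
    """Yield files under root modified on or after the ISO date `since` (local time)."""
    cutoff = datetime.fromisoformat(since).timestamp()
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime >= cutoff:
            yield path


# Example call with a hypothetical project directory:
# for path in files_since(Path("~/myproject").expanduser(), "2026-01-01"):
#     print(path)
```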
Watch Mode
engram watch <dir> # auto-mine on file changes
engram watch <dir> --wing myapp --mode convos # tag + conversation mode
Search
engram search "query"
engram search "query" --wing myapp
engram search "query" --room auth-migration
engram search "query" --no-decay # disable recency weighting
engram search "query" --results 20 # max results
Memory Stack
engram wake-up # L0 + L1 context dump (~170 tokens)
engram wake-up --wing myapp # wing-scoped L1
engram wake-up --rebuild # rebuild L1 from drawers
Compression
engram compress # ES compress all closets
engram compress --wing myapp # wing-scoped
engram compress --wing myapp --room auth # room-scoped
Knowledge Graph
engram kg query "Kai"
engram kg query "Kai" --all # include expired triples
engram kg add "Kai" works_on "Orion" --from 2025-06-01
engram kg invalidate "Kai" works_on "Orion" --ended 2026-03-01
engram kg timeline "auth-migration"
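The commands above treat facts as triples with a validity window: `add` opens a window, `invalidate` closes it, and `query` hides closed triples unless `--all` is given. A minimal sketch of that idea with the standard `sqlite3` module follows. The table schema and function names are assumptions for illustration and do not reflect `engram/knowledge_graph.py`. It also simplifies "expired" to mean "has an end date".

```python
import sqlite3


def open_kg(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database holding one table of time-bounded triples."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS triples (
               subject TEXT, predicate TEXT, object TEXT,
               valid_from TEXT, valid_to TEXT)"""
    )
    return conn


def kg_add(conn, subject, predicate, obj, valid_from):
    """Insert an open-ended triple (valid_to is NULL until invalidated)."""
    conn.execute(
        "INSERT INTO triples VALUES (?, ?, ?, ?, NULL)",
        (subject, predicate, obj, valid_from),
    )


def kg_invalidate(conn, subject, predicate, obj, ended):
    """Close the open triple instead of deleting it, so history is preserved."""
    conn.execute(
        "UPDATE triples SET valid_to = ? "
        "WHERE subject = ? AND predicate = ? AND object = ? AND valid_to IS NULL",
        (ended, subject, predicate, obj),
    )


def kg_query(conn, entity, include_expired=False):
    """Return triples mentioning entity, oldest first; expired ones only on request."""
    sql = (
        "SELECT subject, predicate, object, valid_from, valid_to "
        "FROM triples WHERE (subject = ? OR object = ?)"
    )
    if not include_expired:
        sql += " AND valid_to IS NULL"
    return conn.execute(sql + " ORDER BY valid_from", (entity, entity)).fetchall()


conn = open_kg()
kg_add(conn, "Kai", "works_on", "Orion", "2025-06-01")
kg_invalidate(conn, "Kai", "works_on", "Orion", "2026-03-01")
current = kg_query(conn, "Kai")                        # no open triples remain
history = kg_query(conn, "Kai", include_expired=True)  # includes the closed triple
```

Storing dates as ISO strings keeps them sortable as plain text, which is why `ORDER BY valid_from` yields a chronological timeline here.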
Maintenance
engram conflicts # interactive TUI conflict resolver
engram audit # health check
engram audit --fix # auto-resolve safe issues
engram replay --room auth-migration # chronological room story
engram replay --room auth-migration --wing myapp
engram status # château overview
engram split <dir> # split concatenated transcripts
engram split <dir> --dry-run
Configuration
~/.engram/config.json
{
  "palace_path": "~/.engram/palace",
  "vector_backend": "chromadb",
  "decay_factor": 0.005,
  "decay_max_days": 90,
  "collection_name": "engram_drawers",
  "people_map": {}
}
| Key | Default | Description |
|---|---|---|
| `palace_path` | `~/.engram/palace` | Root of the château filesystem |
| `vector_backend` | `chromadb` | One of `chromadb`, `faiss`, or `sqlitevec` |
| `decay_factor` | `0.005` | Recency boost per day of age below `decay_max_days`: score × (1 + factor × max(0, decay_max_days − age_days)) |
| `decay_max_days` | `90` | Age in days at which the recency boost reaches zero |
| `collection_name` | `engram_drawers` | ChromaDB collection name |
~/.engram/identity.txt — plain text, becomes your L0 context.
~/.engram/wing_config.json — generated by engram init.
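As a rough illustration of how these settings could be read, the sketch below overlays a user's `config.json` on the documented defaults and checks the backend name. `load_config` is a hypothetical helper written for this example. Engram's own `config.py` may load and validate settings differently.

```python
import json
from pathlib import Path

# Defaults copied from the configuration shown above.
DEFAULTS = {
    "palace_path": "~/.engram/palace",
    "vector_backend": "chromadb",
    "decay_factor": 0.005,
    "decay_max_days": 90,
    "collection_name": "engram_drawers",
    "people_map": {},
}
VALID_BACKENDS = {"chromadb", "faiss", "sqlitevec"}


def load_config(path: Path) -> dict:
    """Return the defaults overlaid with any keys found in the JSON file at path."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    if config["vector_backend"] not in VALID_BACKENDS:
        raise ValueError(f"unsupported vector_backend: {config['vector_backend']}")
    # Expand "~" so the rest of the program sees an absolute path.
    config["palace_path"] = str(Path(config["palace_path"]).expanduser())
    return config


# Example call:
# config = load_config(Path("~/.engram/config.json").expanduser())
```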
Module Reference
| File | Description |
|---|---|
| `engram/palace.py` | Wing/Room/Hall/Closet/Drawer data model |
| `engram/config.py` | Config loading, `~/.engram/` management |
| `engram/shorthand.py` | Engram Shorthand (ES) compression dialect |
| `engram/knowledge_graph.py` | Temporal KG, SQLite backend |
| `engram/miner.py` | Project file ingest pipeline |
| `engram/convo_miner.py` | Conversation export ingest (Claude, ChatGPT, Slack) |
| `engram/searcher.py` | Semantic search + recency weighting |
| `engram/layers.py` | L0–L3 memory stack |
| `engram/watcher.py` | FSEvents/inotify watch mode |
| `engram/conflict.py` | Contradiction detection + TUI resolver |
| `engram/audit.py` | Memory health audit |
| `engram/replay.py` | Session/room replay |
| `engram/agents.py` | Specialist agent diary system |
| `engram/palace_graph.py` | Room navigation graph |
| `engram/onboarding.py` | Guided init + ES bootstrap |
| `engram/cli.py` | Typer CLI entry point |
| `engram/mcp_server.py` | MCP server (22 tools) |
| `engram/backends/base.py` | Abstract VectorBackend interface |
| `engram/backends/chromadb_backend.py` | ChromaDB backend (default) |
| `engram/backends/faiss_backend.py` | FAISS backend (speed-optimised) |
| `engram/backends/sqlitevec_backend.py` | sqlite-vec backend (zero-dependency) |
| `engram/plugins/obsidian.py` | Obsidian vault plugin miner |
| `engram/plugins/notion.py` | Notion export plugin miner |
| `engram/plugins/linear.py` | Linear issues plugin miner |
| `engram/hooks/engram_save_hook.sh` | Claude Code auto-save hook |
| `engram/hooks/engram_precompact_hook.sh` | Claude Code pre-compact hook |
Engram Shorthand (ES)
ES is a lossless compression dialect that any LLM can read without a decoder.
from engram.shorthand import compress, decompress
text = (
    "The authentication module is a critical component that has a dependency "
    "on the database and is responsible for verifying user credentials."
)
es = compress(text, confidence=4)
# → "auth module:★★ component + dependency db & responsible verifying user credentials [★★★★]"
decompress(es)
# → expands symbols back to natural language
# Code-aware compression
code = "def authenticate(user: str, token: str) -> bool:"
compress(code, is_code=True)
# → "fn:authenticate(user:str,token:str)->bool"
# Diff notation
compress("+add_middleware()\n-manual_verify()", is_diff=True, diff_filename="auth.py")
# → "CHANGE:auth.py add:add_middleware() rm:manual_verify()"
Recency Weighting
final_score = semantic_score × (1 + recency_boost)
recency_boost = decay_factor × max(0, decay_max_days − age_days)
Pinned drawers (engram_pin) bypass decay entirely.
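The two formulas translate directly into a few lines of Python. The sketch below uses the default `decay_factor` and `decay_max_days` from the configuration section. The function names are hypothetical, and the treatment of pinned drawers is an assumption: here a pinned drawer is scored as if its age were zero, which is one possible reading of "bypass decay".

```python
def recency_boost(age_days: float, decay_factor: float = 0.005,
                  decay_max_days: int = 90) -> float:
    """Boost shrinks linearly with age and is zero from decay_max_days onward."""
    return decay_factor * max(0, decay_max_days - age_days)


def final_score(semantic_score: float, age_days: float, pinned: bool = False,
                decay_factor: float = 0.005, decay_max_days: int = 90) -> float:
    """Apply the recency boost to a semantic similarity score."""
    # Assumption: a pinned drawer keeps the full boost regardless of its age.
    effective_age = 0 if pinned else age_days
    boost = recency_boost(effective_age, decay_factor, decay_max_days)
    return semantic_score * (1 + boost)


fresh = final_score(0.8, age_days=0)     # full boost: 0.005 * 90 = 0.45
stale = final_score(0.8, age_days=200)   # past decay_max_days, so no boost
```

With the defaults, a memory created today has its semantic score multiplied by 1.45, and anything 90 or more days old keeps its raw semantic score.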
Claude Code Hooks
Copy hooks to ~/.engram/hooks/ then add to ~/.claude/settings.json:
{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Write|Edit",
      "hooks": [{"type": "command", "command": "~/.engram/hooks/engram_save_hook.sh"}]
    }],
    "PreCompact": [{
      "hooks": [{"type": "command", "command": "~/.engram/hooks/engram_precompact_hook.sh"}]
    }]
  }
}
Set ENGRAM_WING=myapp and ENGRAM_ROOM=current-task in your environment.
Requirements
- Python 3.9+
- chromadb ≥ 0.4.0
- typer ≥ 0.9.0
- rich ≥ 13.0.0
- watchdog ≥ 3.0.0
- questionary ≥ 2.0.0
- pyyaml ≥ 6.0
Optional:
- faiss-cpu (FAISS backend)
- sqlite-vec (sqlite-vec backend)
No API key. No internet after install.
Contributing
- Fork the repo
- pip install -e ".[dev]"
- pre-commit install
- Run tests: pytest tests/ -v
- Submit a PR
License
MIT © 2026 Tushae Thomas