engram
Local, private memory layer for notes and files with temporal reasoning and citation. Enables agents to query and persist memories via the Model Context Protocol.
<div align="center">
🧠 engram
Your local, private memory layer
Index your notes and files, then recall anything, with citations and a sense of time. 100% on your machine.
No cloud. No account. No data leaving your laptop. Just npx engram.
npx engram ingest ~/notes
npx engram recall "what did I decide about pricing"
</div>
Your notes, journals, and docs are a second brain you can't query. Hosted "AI memory" tools want you to upload all of it to their cloud. engram is the opposite: it builds a searchable memory on your machine and never phones home.
npx engram ingest ~/notes ~/journal # index markdown, text, PDF, HTML …
npx engram recall "auth bug clock skew" # ranked passages, with citations
npx engram recall "hiring" --since week # time-aware: only recent memories
npx engram ask "summarize my pricing decisions" # (optional) local LLM answer
Every result tells you exactly where it came from (`file:line` and the date), so you can trust it and jump to the source.
Supported files: Markdown, text, `org`, `rst`, PDF, and HTML, all via zero-dependency extractors. PDF extraction is best-effort: text-based PDFs work great; scanned (image-only), encrypted, or custom-CID-font PDFs may extract poorly. (EPUB is on the roadmap.)
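To make the `file:line` citation format concrete, here is a minimal sketch of how a file might be split into chunks that remember where they start. The function names (`chunkText`, `cite`) and the blank-line splitting rule are assumptions for illustration, not engram's actual implementation.

```javascript
// Hypothetical sketch: split text into paragraph chunks, tracking the
// 1-based line where each chunk starts so results can be cited as file:line.
function chunkText(text, source, mtime) {
  const lines = text.split(/\r?\n/);
  const chunks = [];
  let buf = [];
  let startLine = 0;
  const flush = () => {
    if (buf.length > 0) {
      chunks.push({ text: buf.join("\n"), source, line: startLine, timestamp: mtime });
      buf = [];
    }
  };
  lines.forEach((line, i) => {
    if (line.trim() === "") {
      flush(); // a blank line ends the current chunk
    } else {
      if (buf.length === 0) startLine = i + 1;
      buf.push(line);
    }
  });
  flush();
  return chunks;
}

// Format a citation as "source:line (YYYY-MM-DD)", using the UTC date.
function cite(chunk) {
  const date = new Date(chunk.timestamp).toISOString().slice(0, 10);
  return `${chunk.source}:${chunk.line} (${date})`;
}

const chunks = chunkText(
  "# Pricing\n\nDecided on $9/mo.\nRevisit in Q3.\n",
  "notes/pricing.md",
  Date.UTC(2026, 4, 1)
);
console.log(chunks.map(cite));
// [ 'notes/pricing.md:1 (2026-05-01)', 'notes/pricing.md:3 (2026-05-01)' ]
```

Storing the start line with each chunk at ingest time is what lets recall point back to the exact place in the source without re-reading the file.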
Why engram
- Local-first & private. Memory lives in one JSON file on disk. Optional embeddings and answers run through a local Ollama, so nothing ever leaves your box.
- Temporal reasoning, not a flat vector dump. Every memory carries a timestamp (file mtime and dates found in the text). Recall is recency-aware and supports `--since week`, `--since 2026-05-01`, etc., so "what was I working on lately" actually works.
- Cited recall. Results come back as `source:line (date)` with a snippet.
- Works with zero setup. A built-in BM25 lexical engine means recall works offline with no model at all. Add a local embedding model for semantic recall when you want it: it's an enhancement, never a requirement.
- Zero dependencies. Pure Node built-ins. A few hundred readable lines.
- A memory backend for your agents, too. `engram serve` exposes a tiny local API (`/remember`, `/recall`) so your AI agents get private, persistent memory.
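The `--since` values mentioned above (`week`, `7d`, `2026-05-01`) all reduce to a cutoff timestamp. A minimal sketch of that conversion follows. The helper names (`parseSince`, `filterSince`), the `day` and `month` keywords, and the 30-day month are assumptions; engram's real parser may accept different forms.

```javascript
// Hypothetical sketch: turn a --since value into a cutoff timestamp (ms).
const DAY_MS = 24 * 60 * 60 * 1000;
const NAMED_DAYS = new Map([["day", 1], ["week", 7], ["month", 30]]); // assumed keywords

function parseSince(value, now = Date.now()) {
  // Named window, e.g. "week"
  if (NAMED_DAYS.has(value)) return now - NAMED_DAYS.get(value) * DAY_MS;
  // Relative days, e.g. "7d"
  const rel = /^(\d+)d$/.exec(value);
  if (rel) return now - Number(rel[1]) * DAY_MS;
  // Absolute date, e.g. "2026-05-01" (date-only ISO strings parse as UTC midnight)
  if (/^\d{4}-\d{2}-\d{2}$/.test(value)) {
    const t = Date.parse(value);
    if (!Number.isNaN(t)) return t;
  }
  throw new Error(`unrecognized --since value: ${value}`);
}

// Keep only memories whose timestamp is at or after the cutoff.
function filterSince(memories, since, now = Date.now()) {
  const cutoff = parseSince(since, now);
  return memories.filter((m) => m.timestamp >= cutoff);
}

const now = Date.UTC(2026, 4, 10);
const memories = [
  { text: "old", timestamp: Date.UTC(2026, 3, 1) },
  { text: "recent", timestamp: Date.UTC(2026, 4, 8) },
];
console.log(filterSince(memories, "week", now).map((m) => m.text)); // [ 'recent' ]
```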
Install & use
# index some notes (markdown, txt, org, rst …)
npx engram ingest ~/Documents/notes
# …or keep it live: re-indexes automatically as you edit
npx engram watch ~/Documents/notes
# recall: lexical + temporal, fully offline
npx engram recall "postgres migration plan"
npx engram recall "standup notes" --since 7d --limit 5
# optional: semantic recall + answers via a LOCAL Ollama
npx engram ingest ~/notes --embed # one-time, computes embeddings
npx engram recall "that idea about caching" --semantic
npx engram ask "what are my open questions about auth?"
# housekeeping
npx engram status
npx engram forget old-project
New here? `examples/` has three sample notes and a 30-second walkthrough you can run against this repo: ingest → recall → temporal filter.
How it works
    files ──chunk──▶ memory store (one local JSON file)
                     │  each chunk: text · source:line · timestamp · term-freqs · [embedding]
    recall(query) ───┤
                     ├─ BM25 lexical score (always on, offline)
                     ├─ semantic cosine (optional, local Ollama)
                     └─ temporal recency + filter (the part most tools miss)
                     ↓  ranked, cited passages
The store is a plain JSON file (default `~/.engram/store.json`). Back it up, inspect it, delete it: it's yours.
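The ranking pipeline above can be sketched as standard BM25 multiplied by a recency boost. This is an illustration under stated assumptions, not engram's actual scoring code: the constants `K1` and `B` are the usual BM25 defaults, while the exponential half-life decay and the way it is blended into the score are guesses at one reasonable design.

```javascript
// Hypothetical sketch of BM25 lexical scoring blended with a recency boost.
const K1 = 1.2;
const B = 0.75;
const DAY_MS = 86_400_000;

const tokenize = (s) => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

function termFreqs(text) {
  const tf = new Map();
  for (const t of tokenize(text)) tf.set(t, (tf.get(t) ?? 0) + 1);
  return tf;
}

function rank(query, chunks, now = Date.now(), halfLifeDays = 30) {
  const docs = chunks.map((chunk) => {
    const tf = termFreqs(chunk.text);
    let len = 0;
    for (const n of tf.values()) len += n;
    return { chunk, tf, len };
  });
  const avgLen = docs.reduce((sum, d) => sum + d.len, 0) / (docs.length || 1);
  const terms = [...new Set(tokenize(query))];
  // Document frequency: how many chunks contain each query term.
  const df = new Map(terms.map((t) => [t, docs.filter((d) => d.tf.has(t)).length]));

  return docs
    .map((d) => {
      let bm25 = 0;
      for (const t of terms) {
        const f = d.tf.get(t) ?? 0;
        if (f === 0) continue;
        const n = df.get(t);
        const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
        bm25 += (idf * f * (K1 + 1)) / (f + K1 * (1 - B + (B * d.len) / avgLen));
      }
      // Assumed blend: decay is 1 for a brand-new memory, 0.5 after one half-life.
      const ageDays = Math.max(0, now - d.chunk.timestamp) / DAY_MS;
      const recency = Math.pow(0.5, ageDays / halfLifeDays);
      return { ...d.chunk, score: bm25 * (1 + recency) };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score);
}

const now = Date.UTC(2026, 4, 10);
const results = rank("pricing decision", [
  { text: "Pricing decision: go with $9 per month", source: "a.md", line: 1, timestamp: now - 2 * DAY_MS },
  { text: "Pricing decision: go with $9 per month", source: "b.md", line: 1, timestamp: now - 300 * DAY_MS },
  { text: "Standup notes about the auth bug", source: "c.md", line: 1, timestamp: now },
], now);
console.log(results.map((r) => `${r.source}:${r.line}`)); // [ 'a.md:1', 'b.md:1' ]
```

Because recency multiplies the lexical score instead of replacing it, a chunk with no matching terms never surfaces just for being new, and two equally relevant chunks are ordered newest first.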
Memory for agents
npx engram serve # http://127.0.0.1:7077 (local only)
curl -s localhost:7077/remember -d '{"text":"Ship date is 2026-07-01"}'
curl -s localhost:7077/recall -d '{"query":"ship date"}'
The open, local alternative to a hosted agent-memory service. Point your agent at it and its memories stay on your machine, with the same temporal ranking.
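From a Node-based agent, the same two calls might look like the sketch below. The port and the `/remember` and `/recall` paths come from the curl examples above; the `content-type` header, the helper names, and the shape of the JSON response are assumptions, so inspect what your server actually returns.

```javascript
// Hypothetical client for the local engram API (Node 18+ for global fetch).
// fetchImpl is injectable so the helper can be exercised without a live server.
async function engramPost(path, body, { baseUrl = "http://127.0.0.1:7077", fetchImpl = fetch } = {}) {
  const res = await fetchImpl(new URL(path, baseUrl).href, {
    method: "POST",
    headers: { "content-type": "application/json" }, // assumed; curl -d sends a form type by default
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`engram ${path} failed: HTTP ${res.status}`);
  return res.json(); // response shape is not documented here; treat it as opaque JSON
}

const remember = (text, opts) => engramPost("/remember", { text }, opts);
const recall = (query, opts) => engramPost("/recall", { query }, opts);
```

With `npx engram serve` running, `await remember("Ship date is 2026-07-01")` followed by `await recall("ship date")` mirrors the two curl commands.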
Use it as an MCP server (Claude, etc.)
engram speaks the Model Context Protocol over stdio, so Claude Desktop / Claude Code can use your memory as a tool: `engram_recall`, `engram_remember`, `engram_status`. Add to `claude_desktop_config.json` (or a project `.mcp.json`):
{
  "mcpServers": {
    "engram": {
      "command": "npx",
      "args": ["-y", "engram", "mcp"]
    }
  }
}
Now the model can recall your notes and persist new memories mid-conversation, all locally. Zero dependencies, no SDK: it's a few hundred lines of pure Node implementing JSON-RPC over stdio (spec revision 2025-06-18).
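For a sense of what travels over stdio, here is a minimal sketch of the framing: the MCP stdio transport sends one JSON-RPC 2.0 message per line. The tool name `engram_recall` is from this README, but the `query` argument name is an assumption borrowed from the HTTP API, and the helpers (`encodeMessage`, `createDecoder`) are illustrative, not engram's code. A real client also performs the `initialize` handshake before calling tools.

```javascript
// Sketch of newline-delimited JSON-RPC framing as used by MCP over stdio.
function encodeMessage(msg) {
  // JSON.stringify escapes newlines inside strings, so each message stays on one line.
  return JSON.stringify(msg) + "\n";
}

// Returns a function that accepts raw text chunks and yields any complete messages.
function createDecoder() {
  let buffer = "";
  return (chunk) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    return lines.filter((l) => l.trim() !== "").map((l) => JSON.parse(l));
  };
}

// A tools/call request for engram_recall. The "query" argument name is assumed.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "engram_recall", arguments: { query: "ship date" } },
};

const wire = encodeMessage(request);
const decode = createDecoder();
console.log(decode(wire.slice(0, 20))); // incomplete line: no messages yet
console.log(decode(wire.slice(20)));    // the complete request, parsed back
```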
Optional: local embeddings (Ollama)
engram never ships your data anywhere. For semantic recall it talks to a local Ollama:
ollama pull nomic-embed-text # embeddings
ollama pull llama3.2 # for `engram ask`
Without Ollama, engram still works great in lexical + temporal mode.
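As a rough picture of the semantic path, the sketch below embeds text through a local Ollama and compares vectors with cosine similarity. It assumes Ollama's `/api/embed` endpoint on its default port 11434 and the `nomic-embed-text` model pulled above; the helper names are hypothetical, and engram's own request and ranking code may differ.

```javascript
// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a, b) {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Hypothetical helper: request one embedding from a local Ollama (Node 18+).
// Endpoint, port, and response field are assumptions about the Ollama HTTP API.
async function embed(text, { model = "nomic-embed-text", baseUrl = "http://127.0.0.1:11434", fetchImpl = fetch } = {}) {
  const res = await fetchImpl(`${baseUrl}/api/embed`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ model, input: text }),
  });
  if (!res.ok) throw new Error(`Ollama embed failed: HTTP ${res.status}`);
  const data = await res.json();
  return data.embeddings[0];
}

console.log(cosine([1, 0], [1, 0])); // 1
console.log(cosine([1, 0], [0, 1])); // 0
```

Semantic recall then amounts to embedding the query once and sorting stored chunks by `cosine(queryVector, chunk.embedding)`, optionally combined with the lexical and recency scores.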
Commands
| Command | Description |
| --- | --- |
| `engram ingest <path...>` | index files/folders (`--embed` for semantic) |
| `engram watch <path...>` | index, then auto-reindex on change (live memory) |
| `engram recall <query>` | cited passages (`--since`, `--until`, `--limit`, `--semantic`) |
| `engram ask <query>` | compose an answer from memory (needs Ollama) |
| `engram status` | what's stored |
| `engram forget <substr>` | remove memories by source |
| `engram serve` | local memory API (HTTP) for agents |
| `engram mcp` | run as an MCP server (stdio) for Claude/agents |
Status
Early MVP. Lexical + temporal recall, citations, ingest/forget, incremental
re-index, live watch mode (auto-reindex on change), the local agent API, an
MCP server (stdio), PDF + HTML ingestion (zero-dep extractors), and
optional Ollama embeddings/answers all work today. Roadmap: EPUB, and a SQLite
store for large vaults.
Star/watch to follow along.
Sibling projects
Part of a small, local-first, zero-dependency toolkit for building AI agents:
- 🧠 engram: a local, private memory layer for agents (and you) (this repo)
- 🍳 skillet: a package manager for agent skills
- tracelet: local DevTools to debug agent runs
License
MIT. See LICENSE.