agent-memory

Provides long-term memory for AI agents with per-user scoped recall, ranking by relevance, recency, and importance, consolidation of repeated events, and automatic forgetting.

agent-memory

Long-term memory for agents, done properly. A memory service that keeps per-user memories, ranks recall by relevance, recency, and importance together, consolidates repeated events into semantic facts, and forgets low-value memories over time. Retrieval is always scoped, so one user's memory can never surface for another. Exposed as an MCP server so an agent can remember and recall over the protocol. Fully offline and keyless.

An agent that forgets everything between sessions cannot help you twice. But naive memory is worse than none: dump every message into a vector store and recall returns the most recent chatter instead of the fact you need, or worse, one user's data leaks into another's session. This service treats memory as a ranking and governance problem, and it builds on my retrieval and evaluation work.

What this demonstrates

| Capability | Where |
| --- | --- |
| Scoped memory: retrieval never crosses users | memory.py |
| Ranking by relevance, recency, and importance together | memory.py |
| Grounded recall with a refusal path | recall.py |
| Consolidation of repeats into semantic facts | memory.py |
| Decay and forgetting of low-value memories | memory.py |
| Exposed as an MCP server | server.py |
| Recall quality and isolation gated in CI | evals.py |

Architecture

flowchart LR
    E[events per user] --> S[(scoped store)]
    S --> C[consolidate repeats to facts]
    S --> D[decay and forget low value]
    Q[question + user scope] --> R{{relevance + recency + importance}}
    S --> R
    R --> A[grounded recall or refuse]

Quickstart

make dev            # venv + install -e ".[dev]"

amem demo           # ranked recall vs a most-recent baseline, plus consolidation and decay
amem recall "what is my favorite programming language" --scope alice
amem eval           # the recall and isolation gate
amem serve          # live MCP server: remember / recall / consolidate

No keys, no network. Embeddings come from a deterministic hashing vectorizer; in production, swap in a real embedder behind the same interface.
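To make "deterministic hashing vectorizer" concrete, here is a minimal sketch of the general technique, not the code in embed.py. The names `embed` and `cosine`, the 256-dimension size, and the choice of MD5 as the stable hash are all assumptions made for illustration. It uses `hashlib` rather than Python's built-in `hash`, because the built-in is salted per process and would not be reproducible across runs.

```python
import hashlib
import math
import re

DIM = 256  # assumed dimensionality; the project's actual value is not stated


def embed(text: str, dim: int = DIM) -> list[float]:
    """Map text to a fixed-size unit vector using a stable hash of each token."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # First four bytes pick the bucket, the fifth picks a sign so that
        # colliding tokens tend to cancel instead of piling up.
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = embed("what is my favorite programming language")
    m = embed("my favorite programming language is Rust")
    print(round(cosine(q, m), 3))
```

Because the function needs no model weights and no network, the same text always yields the same vector. A learned embedder could replace it as long as it keeps the same text-in, vector-out shape.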

The gate that matters

amem eval answers questions whose facts were introduced in earlier sessions (report):

| metric | value | gate |
| --- | --- | --- |
| full_recall | 1.000 | >= 0.90 |
| naive_recall (most-recent) | 0.200 | < full |
| precision_at_k | 1.000 | >= 0.80 |
| cross_scope_leaks | 0 | = 0 |
| consolidation_merged | 2 | reported |
| forgotten_after_decay | 3 | reported |

The comparison is the point. Ranked recall answers every question; a most-recent baseline answers one in five, because the fact you asked about is usually an older memory buried under recent chatter. Scoped retrieval leaks nothing across users even though two users have a memory about the same topic. CI fails if recall drops, if the baseline is not beaten, or if a single cross-scope leak appears.
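The gate conditions in the table can be written as a small check. This is a hedged sketch, not the contents of evals.py: the function name `check_gate` and the dictionary shape are hypothetical, while the metric names, thresholds, and example values are taken from the table above. The two "reported" metrics are informational and are not gated.

```python
def check_gate(metrics: dict[str, float]) -> list[str]:
    """Return a list of human-readable gate failures; empty means the gate passes."""
    failures = []
    if metrics["full_recall"] < 0.90:
        failures.append("full_recall below 0.90")
    if not metrics["naive_recall"] < metrics["full_recall"]:
        failures.append("most-recent baseline was not beaten")
    if metrics["precision_at_k"] < 0.80:
        failures.append("precision_at_k below 0.80")
    if metrics["cross_scope_leaks"] != 0:
        failures.append("cross-scope leak detected")
    return failures


if __name__ == "__main__":
    reported = {
        "full_recall": 1.000,
        "naive_recall": 0.200,
        "precision_at_k": 1.000,
        "cross_scope_leaks": 0,
    }
    problems = check_gate(reported)
    # A CI job would exit non-zero when any gate fails.
    raise SystemExit(1 if problems else 0)
```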

What it does

  • Ranks, not just stores. Recall combines relevance, recency, and importance, so a salient old fact beats a trivial recent one. Naive most-recent recall gets this wrong, which the eval measures directly.
  • Keeps users apart. Two users each say "my favorite language is ..."; each only ever recalls their own. Isolation is structural, not best-effort.
  • Consolidates. Repeated events are folded into a single semantic fact with boosted importance, so the store does not bloat with duplicates.
  • Forgets on purpose. Low-importance, old memories decay below a threshold and are pruned, while important facts survive. Forgetting is a feature, and the eval reports what was dropped.
  • Refuses. With nothing relevant in scope, recall says it does not remember rather than returning noise.
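The consolidation bullet above can be sketched as follows. This is an illustration under stated assumptions, not the logic in memory.py: the `Memory` fields, the rule that "repeated" means identical text after case and whitespace normalization, and the 0.1 importance boost per extra repeat are all hypothetical. Grouping is keyed on scope as well as text, so two users saying the same thing are never merged.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    scope: str         # the user this memory belongs to
    text: str
    importance: float  # assumed range 0..1
    timestamp: float   # seconds since epoch


def _key(m: Memory) -> tuple[str, str]:
    """Repeats are detected per scope on case- and whitespace-normalized text."""
    return (m.scope, " ".join(m.text.lower().split()))


def consolidate(memories: list[Memory], boost: float = 0.1) -> list[Memory]:
    """Fold repeated events into one memory each, boosting importance per repeat."""
    groups: dict[tuple[str, str], list[Memory]] = {}
    for m in memories:
        groups.setdefault(_key(m), []).append(m)

    merged = []
    for group in groups.values():
        newest = max(group, key=lambda m: m.timestamp)
        strongest = max(m.importance for m in group)
        importance = min(1.0, strongest + boost * (len(group) - 1))
        merged.append(Memory(newest.scope, newest.text, importance, newest.timestamp))
    return merged
```

A real implementation would more likely detect repeats by embedding similarity than by exact text; exact matching keeps the sketch short and deterministic.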

Design decisions

  • Scope is the first filter. Retrieval is restricted to the caller's scope before ranking, so cross-user leakage is impossible by construction.
  • One score, three signals. Relevance alone recalls stale facts; recency alone recalls chatter; importance alone ignores the query. Combining them is what makes recall useful.
  • Decay is query-free. Forgetting uses recency and importance only, so the store sheds low-value memories independent of any particular question.
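The three design decisions above fit in a few lines. The sketch below is illustrative only and does not reproduce memory.py or recall.py: the weights (0.6 / 0.2 / 0.2), the 30-day half-life, both thresholds, and the token-overlap stand-in for embedding similarity are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    scope: str
    text: str
    importance: float  # assumed range 0..1
    age_days: float


def relevance(query: str, text: str) -> float:
    """Token-overlap (Jaccard) stand-in for embedding similarity."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def recency(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a new memory, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def recall(store: list[Memory], query: str, scope: str,
           k: int = 3, min_relevance: float = 0.1) -> list[Memory]:
    """Filter by scope first, then rank by one combined score."""
    in_scope = [m for m in store if m.scope == scope]  # other users never reach ranking
    scored = []
    for m in in_scope:
        rel = relevance(query, m.text)
        if rel < min_relevance:
            continue  # irrelevant memories cannot win on recency or importance alone
        score = 0.6 * rel + 0.2 * recency(m.age_days) + 0.2 * m.importance
        scored.append((score, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]  # empty list means "I do not remember"


def decay(store: list[Memory], threshold: float = 0.3) -> list[Memory]:
    """Query-free forgetting: keep memories whose retention score clears the threshold."""
    return [m for m in store
            if 0.5 * recency(m.age_days) + 0.5 * m.importance >= threshold]
```

Because the scope filter runs before any scoring, a memory from another user cannot appear in the result regardless of how well it matches the query. An empty result is the refusal path: the caller reports that nothing is remembered instead of returning the least-bad match.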

Layout

src/agent_memory/  embed · memory · recall · server · evals · cli
data/  memories.jsonl · questions.jsonl
reports/  memory_report_example.md

Related repositories

Part of a portfolio on production ML and LLM engineering.

License

MIT (c) 2026 Taha Siddiqui
