agent-memory
MCP server for zero-config, traceable long-term memory using SQLite, enabling agents to store, search, trace, and monitor memory with tools like memory_store, memory_search, and memory_health.
Zero-config, traceable, MCP-native long-term memory for agents.
agent-memory targets a gap in the current memory stack: a local-first engine that works with pip install, runs on pure SQLite, and makes memory evolution explainable instead of opaque.
Install from PyPI with pip install agent-memory-engine.
Current packaged release: 0.2.1.
Documentation
- English docs index: docs/README.md
- Chinese docs index: docs/zh-CN/README.md
- Teaching series entry: docs/teaching/01-project-overview.md
- Delivery tutorial: docs/project-delivery-and-tutorial.md
- MCP guide: docs/mcp-integration.md
- Release guide: docs/release-and-pypi.md
- Benchmark report: docs/benchmark-results.md
Why this exists
- Mem0 proves demand, but pulls in heavier infra such as Neo4j or Qdrant.
- Local agents and personal copilots need a memory layer that is easy to embed, debug, export, and ship.
- The project spans a broad technical surface area: storage, retrieval, ranking, decay, provenance, conflict handling, MCP, and evaluation.
Current Status
- SQLite backend with WAL, FTS5, audit log, evolution log, entity index, and causal parent links
- Go service workspace with SQLite storage engine, schema migration, REST gateway, gRPC server, auth hooks, metrics, tracing bootstrap, and Cobra CLI
- Schema indexes for type, layer, recency, trust, source, relation, and audit hot paths
- Python SDK via MemoryClient
- Python MemoryClient now supports embedded and remote modes through SQLiteBackend and RemoteBackend
- In service mode, fused retrieval orchestration can run inside the Go service over REST/gRPC
- Rule-based intent router with Reciprocal Rank Fusion
- Adaptive forgetting utilities with dual-threshold layer transitions
- Heuristic conflict detection with contradiction edges and trust-score adjustment
- Optional LLM-backed conflict adjudication for top semantic candidates
- Governance helpers for health reports, audit reads, and JSONL export/import
- Optional MCP server and REST API adapters with dependency-friendly fallbacks
- sqlite-vec integration with safe fallback to Python cosine scan when unavailable
- Deterministic local fallback embeddings for testability and zero-friction startup
- LLM-first conversation extraction with heuristic fallback
- Trace graph reports with ancestors, descendants, relations, and evolution history
- Idempotent maintenance cycle for decay, promotion/demotion, conflict upkeep, and consolidation
- Benchmark helpers and LOCOMO-Lite style starter data
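To make the "adaptive forgetting with dual-threshold layer transitions" item concrete, here is a minimal sketch. It assumes an Ebbinghaus-style exponential retention curve and two separate thresholds, so a memory is promoted only above the upper one and demoted only below the lower one. The layer names, threshold values, and function names are hypothetical and are not taken from the project's code.

```python
import math

# Hypothetical layer names and thresholds, chosen only for illustration.
LAYERS = ["archive", "long_term", "working"]
PROMOTE_AT = 0.7  # retention at or above this moves a memory one layer up
DEMOTE_AT = 0.3   # retention at or below this moves a memory one layer down


def retention(hours_since_access: float, stability_hours: float) -> float:
    """Exponential forgetting curve: R = exp(-t / S)."""
    return math.exp(-hours_since_access / stability_hours)


def next_layer(layer: str, score: float) -> str:
    """Dual-threshold rule: scores between the two thresholds keep the layer."""
    idx = LAYERS.index(layer)
    if score >= PROMOTE_AT:
        idx = min(idx + 1, len(LAYERS) - 1)
    elif score <= DEMOTE_AT:
        idx = max(idx - 1, 0)
    return LAYERS[idx]


fresh = retention(hours_since_access=2, stability_hours=48)
stale = retention(hours_since_access=200, stability_hours=48)
print(next_layer("long_term", fresh))  # recently used: moves up
print(next_layer("long_term", stale))  # long unused: moves down
print(next_layer("long_term", 0.5))    # in the dead band: unchanged
```

The gap between the two thresholds acts as a dead band, which keeps a memory from flipping between layers when its score hovers around a single cut-off.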
Quickstart
pip install agent-memory-engine
agent-memory store "User prefers SQLite for local-first agents." --source-id demo
agent-memory search "Why SQLite?"
agent-memory health
For development:
pip install -e '.[dev]'
.venv/bin/python -m pytest -q
from agent_memory import MemoryClient
client = MemoryClient()
item = client.add(
"The user prefers SQLite for local-first agent projects.",
source_id="demo-session",
)
results = client.search("What database does the user prefer?")
print(results[0].item.content)
trace = client.trace_graph(item.id)
print(trace.descendants)
health = client.health()
print(health.suggestions)
Service mode
make proto
cd go-server && go run ./cmd/server
export AGENT_MEMORY_MODE=remote
export AGENT_MEMORY_GO_SERVER_URL=http://127.0.0.1:8080
export AGENT_MEMORY_GRPC_TARGET=127.0.0.1:9090
agent-memory search "Why SQLite?"
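The environment variables above can be read into a small settings object before a client is built. The sketch below is only an illustration of that idea: ServiceSettings and load_settings are hypothetical names, and treating embedded as the default mode is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceSettings:
    """Hypothetical container for the mode-related environment variables."""
    mode: str
    rest_url: Optional[str] = None
    grpc_target: Optional[str] = None


def load_settings(env: Optional[Mapping[str, str]] = None) -> ServiceSettings:
    env = os.environ if env is None else env
    # Assumption: embedded is used when AGENT_MEMORY_MODE is not set.
    mode = env.get("AGENT_MEMORY_MODE", "embedded").strip().lower()
    if mode not in ("embedded", "remote"):
        raise ValueError(f"unsupported AGENT_MEMORY_MODE: {mode!r}")
    if mode == "embedded":
        return ServiceSettings(mode=mode)
    rest_url = env.get("AGENT_MEMORY_GO_SERVER_URL")
    grpc_target = env.get("AGENT_MEMORY_GRPC_TARGET")
    if not rest_url and not grpc_target:
        raise ValueError("remote mode needs a REST URL or a gRPC target")
    return ServiceSettings(mode=mode, rest_url=rest_url, grpc_target=grpc_target)


settings = load_settings({
    "AGENT_MEMORY_MODE": "remote",
    "AGENT_MEMORY_GO_SERVER_URL": "http://127.0.0.1:8080",
    "AGENT_MEMORY_GRPC_TARGET": "127.0.0.1:9090",
})
print(settings)
```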
Architecture
graph TD
A["Python SDK / MCP"] --> B["MemoryClient"]
B --> C{"Mode"}
C -->|"embedded"| D["SQLiteBackend (Python)"]
C -->|"remote"| E["RemoteBackend"]
E --> F["Go REST / gRPC"]
F --> G["SQLite Storage Engine"]
G --> H[("SQLite + WAL + vector fallback")]
B --> I["Intent Router / Conflict / Trust"]
Core components
- src/agent_memory/client.py: high-level SDK entry point
- src/agent_memory/storage/remote_backend.py: REST/gRPC bridge to the Go service
- proto/memory/v1/: shared Protobuf contracts
- go-server/cmd/server/main.go: Go service entrypoint with graceful shutdown
- go-server/internal/storage/sqlite.go: Go storage engine
- go-server/internal/gateway/handler.go: Go REST handlers
- go-server/internal/grpc/server.go: Go gRPC implementation
- src/agent_memory/storage/sqlite_backend.py: SQLite persistence, FTS, vector fallback, trace queries
- src/agent_memory/controller/router.py: intent-aware retrieval routing and RRF fusion
- src/agent_memory/controller/forgetting.py: Ebbinghaus-inspired adaptive forgetting
- src/agent_memory/controller/conflict.py: contradiction detection and conflict records
- src/agent_memory/controller/consolidation.py: overlap grouping and merge-draft generation
- src/agent_memory/controller/trust.py: multi-factor trust scoring
- src/agent_memory/governance/health.py: stale/orphan/conflict monitoring
- src/agent_memory/interfaces/mcp_server.py: eight MCP tools
- src/agent_memory/extraction/pipeline.py: conversation-to-memory extraction
- benchmarks/: storage/retrieval microbenchmarks and synthetic eval seeds
Design choices
- SQLite + WAL keeps deployment zero-config while fitting agent workloads: many reads, occasional writes.
- Rule routing over LLM routing keeps routing latency predictable and testable.
- RRF instead of score averaging avoids calibration problems across lexical, entity, and semantic retrieval.
- sqlite-vec plus fallback gives C/SQL vector search when available while keeping the package runnable everywhere.
- Soft delete preserves provenance and causal trace integrity.
- Hash fallback embeddings make the package runnable even before a local embedding model is available.
- Unique relation edges keep maintenance idempotent and health metrics stable.
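The RRF choice in the list above is easiest to see in code. The following is a minimal sketch of Reciprocal Rank Fusion over three ranked lists. The function name rrf_fuse, the memory ids, and the constant k=60 (a commonly used default for RRF) are assumptions for illustration, not the project's router implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked_ids in rankings:
        for rank, memory_id in enumerate(ranked_ids, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


# Hypothetical results from three retrievers with incomparable raw scores.
lexical = ["m1", "m2", "m3"]
entity = ["m2", "m4"]
semantic = ["m3", "m1", "m2"]

print(rrf_fuse([lexical, entity, semantic]))
```

Only rank positions enter the formula, so a BM25 score, an entity-match count, and a cosine similarity never have to be put on a common scale. That is the calibration problem score averaging would run into.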
Project layout
agent-memory/
├── deploy/
├── go-server/
├── proto/
├── docs/plans/
├── examples/
├── src/agent_memory/
│ ├── controller/
│ ├── embedding/
│ ├── extraction/
│ └── storage/
└── tests/
Benchmarks
Synthetic LOCOMO-Lite run on the bundled starter dataset (30 dialogues / 150 questions):
| Metric | agent-memory | Semantic-only baseline |
|---|---|---|
| Overall hit rate | 50.0% | 23.3% |
| Factual recall | 53.3% | 6.7% |
| Temporal recall | 36.7% | 3.3% |
| Causal recall | 53.3% | 6.7% |
| p95 retrieval latency | 16.64ms | 11.50ms |
- Full report: docs/benchmark-results.md
- Re-run locally: python benchmarks/locomo_lite/evaluate.py
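For readers who want to see how metrics like those in the table are typically computed, here is a small sketch of a hit rate and a nearest-rank p95 latency. It does not describe how the evaluation script works: the function names and the sample data are made up for illustration.

```python
from typing import List, Sequence


def hit_rate(hits: Sequence[bool]) -> float:
    """Fraction of questions whose expected memory was retrieved."""
    return sum(hits) / len(hits)


def p95(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    ordered: List[float] = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


# Made-up sample data, not taken from the benchmark report.
sample_hits = [True, False, True, True]
sample_latencies = [float(ms) for ms in range(1, 21)]  # 1.0 .. 20.0 ms

print(f"hit rate: {hit_rate(sample_hits):.1%}")
print(f"p95 latency: {p95(sample_latencies)} ms")
```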
MCP Usage
Install MCP support and launch the stdio server:
pip install -e '.[mcp]'
python -m agent_memory.interfaces.mcp_server
Claude Desktop configuration:
{
"mcpServers": {
"agent-memory": {
"command": "python",
"args": ["-m", "agent_memory.interfaces.mcp_server"],
"env": {
"AGENT_MEMORY_DB_PATH": "/absolute/path/to/default.db"
}
}
}
}
Typical tools:
- memory_store: store a memory with provenance
- memory_search: run intent-aware retrieval
- memory_trace: inspect causal ancestry and evolution
- memory_health: inspect stale/conflict/orphan metrics
More details: docs/mcp-integration.md
Demos
python examples/demo_cross_session.py --db /tmp/agent-memory-demo.db
python examples/interactive_chat.py --db chat_memory.db --provider none
python examples/mcp_server.py
Release Notes
- Changelog: CHANGELOG.md
- benchmarks/locomo_lite/latest_results.json is regenerated by the evaluation script
- docs/screenshots/ is reserved for verified MCP client screenshots
- Delivery record and full tutorial: docs/project-delivery-and-tutorial.md
- Release and PyPI guide: docs/release-and-pypi.md
- Expansion and optimization review: docs/plans/2026-03-24-agent-memory-expansion-review.md
Dev Notes
- Run all tests with .venv/bin/python -m pytest -q
- Run Go tests with cd go-server && go test ./...
- Run Go benchmarks with make go-bench
- Use the built-in CLI with agent-memory --help
- sqlite-vec is installed as a package dependency; if the extension cannot be loaded at runtime, vector search safely falls back to Python cosine scan
- Try microbenchmarks with python benchmarks/bench_storage.py and python benchmarks/bench_retrieval.py
- Compare Python and Go with make bench-compare
- Try the demo runner with python examples/benchmark_runner.py
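The fallback path mentioned in the notes above (deterministic embeddings plus a Python cosine scan) can be pictured with a short sketch. It assumes a simple signed feature-hashing scheme over whitespace tokens. The dimension, the hashing scheme, and the names hash_embed and cosine_scan are hypothetical and may differ from what the package does.

```python
import hashlib
import math
from typing import List, Sequence, Tuple

DIM = 64  # hypothetical embedding size


def hash_embed(text: str, dim: int = DIM) -> List[float]:
    """Deterministic embedding: hash each token into a signed bucket, then normalize."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine_scan(query: str, memories: Sequence[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Brute-force scan: score every memory against the query and keep the best."""
    q = hash_embed(query)
    scored = []
    for text in memories:
        m = hash_embed(text)
        # Both vectors are unit length, so the dot product is the cosine.
        scored.append((sum(a * b for a, b in zip(q, m)), text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


memories = [
    "user prefers sqlite",
    "weather is sunny today",
    "go service exposes grpc",
]
results = cosine_scan("user prefers sqlite", memories, top_k=2)
print(results[0][1])
```

A hashed embedding carries no semantic knowledge and only rewards token overlap, but it needs no model download and returns the same vector on every run, which is what makes it useful for tests and first-run startup.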