local-memory-mcp
A local-first long-term memory system for AI coding agents, exposed as an MCP server. Built for Kiro CLI but compatible with any MCP-capable client.
Why
AI agents forget everything between sessions. This server gives them persistent, searchable, semantically aware memory, stored entirely on your machine.
Architecture
┌─────────────────────────────────────────────────┐
│ MCP Server (stdio) │
│ │
│ Tools: store_memory · recall · search_memories │
│ get_memory · forget · relate │
│ query_graph · consolidate · memory_stats│
├─────────────────────────────────────────────────┤
│ Memory Engine │
│ ┌───────────┬──────────┬───────────────────┐ │
│ │ Retrieval │ Embeddings│ Consolidation │ │
│ │ (hybrid) │ (local) │ (decay + merge) │ │
│ └───────────┴──────────┴───────────────────┘ │
├─────────────────────────────────────────────────┤
│ Storage Layer │
│ ┌──────────┬───────────┬────────┬──────────┐ │
│ │ Memories │ Vectors │ FTS5 │ Knowledge│ │
│ │ (SQLite) │(sqlite-vec)│(SQLite)│ Graph │ │
│ └──────────┴───────────┴────────┴──────────┘ │
└─────────────────────────────────────────────────┘
Hybrid retrieval combines four signals into a single score:
| Signal | Weight | Source |
|---|---|---|
| Vector similarity | 50% | sqlite-vec (L2 distance on 384-dim embeddings) |
| Full-text search | 25% | SQLite FTS5 |
| Recency | 15% | Exponential decay, 30-day half-life |
| Importance | 10% | User-assigned + access-frequency boosting |
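The weighting above can be sketched as a plain weighted sum. This is a minimal illustration, not the code in src/engine/retrieval.ts: the weights and the 30-day half-life come from the table, while the `Signals` shape, the function names, and the assumption that each input is already normalized to [0, 1] (including however the L2 distance is turned into a similarity) are hypothetical. Access-frequency boosting is not modeled.

```typescript
// Weights from the table above; all other names here are illustrative assumptions.
const WEIGHTS = { vector: 0.5, fts: 0.25, recency: 0.15, importance: 0.1 } as const;
const HALF_LIFE_DAYS = 30;

interface Signals {
  vectorSimilarity: number; // assumed already normalized to [0, 1]
  ftsScore: number;         // assumed already normalized to [0, 1]
  ageDays: number;          // days since the memory was stored or last accessed
  importance: number;       // assumed to lie in [0, 1]
}

/** Exponential decay with a 30-day half-life: 1 at age 0, 0.5 at 30 days. */
function recencyScore(ageDays: number): number {
  return Math.pow(0.5, Math.max(0, ageDays) / HALF_LIFE_DAYS);
}

/** Weighted sum of the four signals. */
function hybridScore(s: Signals): number {
  return (
    WEIGHTS.vector * s.vectorSimilarity +
    WEIGHTS.fts * s.ftsScore +
    WEIGHTS.recency * recencyScore(s.ageDays) +
    WEIGHTS.importance * s.importance
  );
}
```

Under these assumptions, a memory that is 60 days old contributes a recency score of 0.25, so at most 0.0375 of its final score comes from recency.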
Embeddings run fully locally via Transformers.js (ONNX runtime) using all-MiniLM-L6-v2. No API keys. No network calls after first model download.
Knowledge graph stores typed entities and weighted relations in SQLite with BFS traversal up to 3 hops.
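A hop-limited breadth-first traversal of this kind can be sketched in memory as follows. The `Relation` shape, the `traverse` name, and the choice to follow relations only in their stored direction are assumptions made for illustration; the actual implementation reads from SQLite tables and may differ.

```typescript
// Hypothetical in-memory shape for a typed, weighted relation.
interface Relation {
  from: string;
  to: string;
  type: string;
  weight: number;
}

/** Returns every entity reachable from `start` within `maxHops`, mapped to its hop count. */
function traverse(relations: Relation[], start: string, maxHops = 3): Map<string, number> {
  const hops = Math.min(Math.max(maxHops, 1), 3); // clamp to the documented 1-3 range

  // Build an adjacency list, following relations in their stored direction (an assumption).
  const adjacency = new Map<string, string[]>();
  for (const r of relations) {
    if (!adjacency.has(r.from)) adjacency.set(r.from, []);
    adjacency.get(r.from)!.push(r.to);
  }

  const distance = new Map<string, number>([[start, 0]]);
  let frontier = [start];
  for (let depth = 1; depth <= hops && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of adjacency.get(node) ?? []) {
        if (!distance.has(neighbor)) {
          distance.set(neighbor, depth); // first visit is the shortest hop count
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return distance;
}
```

Processing one frontier per depth level keeps the hop limit explicit and guarantees that each entity is recorded at its shortest distance from the start.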
Consolidation applies importance decay, merges near-duplicate memories, and prunes forgotten ones.
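Near-duplicate detection of this kind usually comes down to comparing the cosine similarity of two embeddings against a threshold. The sketch below assumes that approach; the function names are hypothetical, and 0.85 is the consolidationThreshold default listed under Configuration.

```typescript
/** Cosine similarity of two equal-length vectors; returns 0 if either is all zeros. */
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Default taken from the Configuration section; the helper itself is illustrative.
const CONSOLIDATION_THRESHOLD = 0.85;

/** True when two embeddings are similar enough to be merge candidates. */
function isNearDuplicate(
  a: readonly number[],
  b: readonly number[],
  threshold = CONSOLIDATION_THRESHOLD,
): boolean {
  return cosineSimilarity(a, b) > threshold;
}
```

Passing 0.92 as the threshold gives the stricter check that the deduplicationThreshold setting describes for newly stored memories.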
Dual-scope storage
Every memory lives in one of two scopes:
- Global (~/.local-memory/memory.db): your preferences, facts, cross-project knowledge
- Project (.local-memory/memory.db in repo root): project-specific context, decisions, patterns
The agent can query either scope or both. Project scope auto-detects from .git, package.json, Cargo.toml, pyproject.toml, or go.mod.
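One plausible way to implement this detection is to walk upward from the working directory until a marker file or directory is found. The sketch below assumes that upward walk, which the text above does not confirm; `findProjectRoot` and the injectable `exists` parameter are hypothetical names, and only the marker list comes from this README.

```typescript
import { existsSync } from "node:fs";
import { dirname, join, resolve } from "node:path";

// Marker names taken from the paragraph above.
const PROJECT_MARKERS = [".git", "package.json", "Cargo.toml", "pyproject.toml", "go.mod"];

/**
 * Walks from `startDir` toward the filesystem root and returns the first
 * directory containing a project marker, or null if none is found.
 * `exists` is injectable so the lookup can be exercised without touching the disk.
 */
function findProjectRoot(
  startDir: string,
  exists: (path: string) => boolean = existsSync,
): string | null {
  let dir = resolve(startDir);
  while (true) {
    if (PROJECT_MARKERS.some((marker) => exists(join(dir, marker)))) return dir;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

Calling `findProjectRoot(process.cwd())` would then yield the directory in which a project-scoped .local-memory/memory.db could live, with MEMORY_PROJECT taking precedence when set.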
Tools
| Tool | Description |
|---|---|
| store_memory | Store a memory with type, scope, importance, and optional entity extraction |
| recall | Semantic recall: hybrid search combining all four signals |
| search_memories | Keyword-based full-text search |
| get_memory | Fetch a specific memory by ID |
| forget | Delete a memory and cascade to embeddings + graph |
| relate | Create/strengthen entity relationships in the knowledge graph |
| query_graph | Traverse the knowledge graph from an entity (BFS, 1-3 hops) |
| consolidate | Decay old memories, merge duplicates, prune weak ones |
| memory_stats | Counts for memories, entities, and relations per scope |
Quickstart
Install
git clone https://github.com/smankoo/local-memory-mcp.git
cd local-memory-mcp
npm install
npm run build
Configure Kiro CLI
Option A — Auto-configure:
npx tsx scripts/install-kiro.ts # global
npx tsx scripts/install-kiro.ts --project # project-level
Option B — Manual:
Add to ~/.kiro/settings/mcp.json:
{
  "mcpServers": {
    "memory": {
      "command": "node",
      "args": ["/absolute/path/to/local-memory-mcp/dist/index.js"],
      "env": {
        "MEMORY_DIR": "~/.local-memory"
      }
    }
  }
}
Then restart Kiro CLI and run /mcp to verify.
Other MCP clients
Any client that speaks MCP over stdio works. The server binary is dist/index.js:
node /path/to/local-memory-mcp/dist/index.js
Environment variables
| Variable | Default | Description |
|---|---|---|
| MEMORY_DIR | ~/.local-memory | Global data directory |
| MEMORY_PROJECT | auto-detected CWD | Project root override |
| EMBEDDING_MODEL | Xenova/all-MiniLM-L6-v2 | HuggingFace model for embeddings |
Configuration
Tuning knobs are in src/utils/config.ts:
| Parameter | Default | Description |
|---|---|---|
| deduplicationThreshold | 0.92 | Cosine similarity above which a new memory updates the existing one |
| consolidationThreshold | 0.85 | Similarity above which two memories are merged during consolidation |
| decayRate | 0.995 | Daily importance multiplier (0.995^30 ≈ 0.86, so ~14% decay/month) |
| minImportanceBeforePrune | 0.05 | Memories below this with <2 accesses get pruned |
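The decay and prune defaults can be read as the following arithmetic. This is a sketch under stated assumptions: the two default values come from the table, while the function names and the idea that decay compounds once per elapsed day are illustrative, not taken from src/engine/consolidation.ts.

```typescript
// Defaults from the table above; the helper names are hypothetical.
const config = {
  decayRate: 0.995,
  minImportanceBeforePrune: 0.05,
};

/** Importance after `daysElapsed` days of compounding daily decay. */
function decayedImportance(importance: number, daysElapsed: number): number {
  return importance * Math.pow(config.decayRate, daysElapsed);
}

/** A memory is prunable when it is both weak and rarely accessed. */
function shouldPrune(importance: number, accessCount: number): boolean {
  return importance < config.minImportanceBeforePrune && accessCount < 2;
}
```

With these defaults, a memory stored at importance 0.5 and never boosted would take roughly 460 days to drop below the 0.05 prune floor, since 0.995^n falls to 0.1 at n ≈ 459.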
Development
npm run dev # watch mode
npm test # run tests (downloads model on first run, ~60s)
npm run build # production build
Project structure
src/
├── index.ts               # Entry point: stdio transport
├── server.ts              # MCP tool definitions
├── engine/
│   ├── memory-engine.ts   # Orchestrator: store, recall, forget, consolidate
│   ├── retrieval.ts       # Hybrid scoring (vector + FTS + recency + importance)
│   ├── embeddings.ts      # Local embedding via Transformers.js
│   ├── consolidation.ts   # Decay, merge, prune lifecycle
│   └── graph.ts           # Entity relationship engine
├── storage/
│   ├── database.ts        # SQLite + sqlite-vec + FTS5 initialization
│   ├── schema.ts          # Drizzle ORM schema
│   ├── memory-store.ts    # CRUD for memories table
│   ├── vector-store.ts    # sqlite-vec operations
│   ├── fts-store.ts       # FTS5 search with query sanitization
│   └── graph-store.ts     # Entity + relation tables, BFS traversal
└── utils/
    ├── config.ts          # Environment + defaults
    ├── scoring.ts         # Recency decay, hybrid scoring, cosine similarity
    └── id.ts              # nanoid generation
Tech stack
- TypeScript + tsup (ESM, Node 22)
- better-sqlite3 + Drizzle ORM for structured storage
- sqlite-vec for vector similarity search
- SQLite FTS5 for full-text search
- @huggingface/transformers for local embeddings (ONNX)
- @modelcontextprotocol/sdk for the MCP server
- Vitest for testing
License
MIT