local-memory-mcp

A local-first long-term memory system for AI coding agents, exposed as an MCP server. Built for Kiro CLI but compatible with any MCP-capable client.

Why

AI agents forget everything between sessions. This gives them persistent, searchable, semantically-aware memory — stored entirely on your machine.

Architecture

┌─────────────────────────────────────────────────┐
│  MCP Server (stdio)                             │
│                                                 │
│  Tools: store_memory · recall · search_memories │
│         get_memory · forget · relate            │
│         query_graph · consolidate · memory_stats│
├─────────────────────────────────────────────────┤
│  Memory Engine                                  │
│  ┌───────────┬────────────┬──────────────────┐  │
│  │ Retrieval │ Embeddings │ Consolidation    │  │
│  │ (hybrid)  │ (local)    │ (decay + merge)  │  │
│  └───────────┴────────────┴──────────────────┘  │
├─────────────────────────────────────────────────┤
│  Storage Layer                                  │
│  ┌──────────┬────────────┬────────┬──────────┐  │
│  │ Memories │ Vectors    │ FTS5   │ Knowledge│  │
│  │ (SQLite) │(sqlite-vec)│(SQLite)│ Graph    │  │
│  └──────────┴────────────┴────────┴──────────┘  │
└─────────────────────────────────────────────────┘

Hybrid retrieval combines four signals into a single score:

| Signal | Weight | Source |
|---|---|---|
| Vector similarity | 50% | sqlite-vec (L2 distance on 384-dim embeddings) |
| Full-text search | 25% | SQLite FTS5 |
| Recency | 15% | Exponential decay, 30-day half-life |
| Importance | 10% | User-assigned + access-frequency boosting |
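
As a rough illustration of how the four weighted signals could be combined, here is a minimal TypeScript sketch. It is not the code in src/utils/scoring.ts. The names Signals, recencyScore, and hybridScore are hypothetical, and the sketch assumes the vector and full-text signals have already been normalized to the range 0 to 1.

```typescript
// Weights and half-life come from the table above; all names are illustrative.
const WEIGHTS = { vector: 0.5, fts: 0.25, recency: 0.15, importance: 0.1 } as const;
const HALF_LIFE_DAYS = 30;

interface Signals {
  vectorSimilarity: number; // assumed normalized to [0, 1], higher is closer
  ftsScore: number;         // assumed normalized to [0, 1]
  ageDays: number;          // age of the memory in days
  importance: number;       // assumed to lie in [0, 1]
}

// Exponential decay with a 30-day half-life: 1.0 at age 0, 0.5 at 30 days.
function recencyScore(ageDays: number): number {
  return Math.pow(0.5, Math.max(0, ageDays) / HALF_LIFE_DAYS);
}

// Weighted sum of the four signals.
function hybridScore(s: Signals): number {
  return (
    WEIGHTS.vector * s.vectorSimilarity +
    WEIGHTS.fts * s.ftsScore +
    WEIGHTS.recency * recencyScore(s.ageDays) +
    WEIGHTS.importance * s.importance
  );
}
```

Because the weights sum to 1, a memory that maxes out every signal scores 1, and a strong vector match alone can contribute at most 0.5.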

Embeddings run fully locally via Transformers.js (ONNX runtime) using all-MiniLM-L6-v2. No API keys. No network calls after first model download.

Knowledge graph stores typed entities and weighted relations in SQLite with BFS traversal up to 3 hops.
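
To make the hop limit concrete, the following TypeScript sketch runs a breadth-first traversal over an in-memory list of relations. It is an illustration only: the Relation shape and the neighborsWithin function are hypothetical, the real store reads from SQLite, and treating relations as undirected is an assumption made here for simplicity.

```typescript
interface Relation {
  from: string;
  to: string;
  type: string;
  weight: number;
}

// Returns every entity reachable from `start` within `maxHops`,
// mapped to the number of hops at which it was first reached.
function neighborsWithin(
  relations: Relation[],
  start: string,
  maxHops = 3,
): Map<string, number> {
  // Build an adjacency list (relations treated as undirected: an assumption).
  const adjacency = new Map<string, string[]>();
  for (const r of relations) {
    if (!adjacency.has(r.from)) adjacency.set(r.from, []);
    if (!adjacency.has(r.to)) adjacency.set(r.to, []);
    adjacency.get(r.from)!.push(r.to);
    adjacency.get(r.to)!.push(r.from);
  }

  const hops = new Map<string, number>([[start, 0]]);
  let frontier = [start];
  // Expand one level per iteration, stopping at the hop limit.
  for (let depth = 1; depth <= maxHops && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of adjacency.get(node) ?? []) {
        if (!hops.has(neighbor)) {
          hops.set(neighbor, depth);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  hops.delete(start); // the start entity is not its own neighbor
  return hops;
}
```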

Consolidation applies importance decay, merges near-duplicate memories, and prunes forgotten ones.
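
The decay and prune steps can be sketched in a few lines of TypeScript. This is a simplified illustration, not the code in src/engine/consolidation.ts: MemoryRecord, decayImportance, and consolidatePass are hypothetical names, the merge step is left out, and the constants are the defaults listed under Configuration.

```typescript
interface MemoryRecord {
  id: string;
  importance: number;
  accessCount: number;
}

const DECAY_RATE = 0.995;        // daily importance multiplier
const MIN_IMPORTANCE = 0.05;     // prune threshold
const MIN_ACCESSES_TO_KEEP = 2;  // memories accessed this often are never pruned

// Importance after `daysElapsed` days of compounding daily decay.
function decayImportance(importance: number, daysElapsed: number): number {
  return importance * Math.pow(DECAY_RATE, daysElapsed);
}

// Applies decay, then splits memories into kept and pruned sets.
function consolidatePass(memories: MemoryRecord[], daysElapsed: number) {
  const kept: MemoryRecord[] = [];
  const pruned: MemoryRecord[] = [];
  for (const m of memories) {
    const decayed = { ...m, importance: decayImportance(m.importance, daysElapsed) };
    const prunable =
      decayed.importance < MIN_IMPORTANCE &&
      decayed.accessCount < MIN_ACCESSES_TO_KEEP;
    (prunable ? pruned : kept).push(decayed);
  }
  return { kept, pruned };
}
```

Under these defaults, a memory that is read at least twice survives pruning no matter how far its importance decays.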

Dual-scope storage

Every memory lives in one of two scopes:

  • Global (~/.local-memory/memory.db) — your preferences, facts, cross-project knowledge
  • Project (.local-memory/memory.db in repo root) — project-specific context, decisions, patterns

The agent can query either scope or both. Project scope auto-detects from .git, package.json, Cargo.toml, pyproject.toml, or go.mod.
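
One plausible way to implement this detection is to walk up from the working directory until a marker file is found. The TypeScript sketch below assumes that behavior; the README does not say whether the server searches parent directories, and findProjectRoot with its injectable exists parameter is a hypothetical helper.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Marker files and directories named in the paragraph above.
const PROJECT_MARKERS = [".git", "package.json", "Cargo.toml", "pyproject.toml", "go.mod"];

// Walks upward from `startDir` and returns the first directory that
// contains a marker, or null when the filesystem root is reached.
// `exists` is injectable so the logic can be exercised without touching disk.
function findProjectRoot(
  startDir: string,
  exists: (p: string) => boolean = fs.existsSync,
): string | null {
  let dir = path.resolve(startDir);
  while (true) {
    if (PROJECT_MARKERS.some((marker) => exists(path.join(dir, marker)))) {
      return dir;
    }
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}
```

With a root in hand, the project database path would be path.join(root, ".local-memory", "memory.db").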

Tools

| Tool | Description |
|---|---|
| store_memory | Store a memory with type, scope, importance, and optional entity extraction |
| recall | Semantic recall: hybrid search combining all four signals |
| search_memories | Keyword-based full-text search |
| get_memory | Fetch a specific memory by ID |
| forget | Delete a memory and cascade to embeddings + graph |
| relate | Create/strengthen entity relationships in the knowledge graph |
| query_graph | Traverse the knowledge graph from an entity (BFS, 1-3 hops) |
| consolidate | Decay old memories, merge duplicates, prune weak ones |
| memory_stats | Counts for memories, entities, and relations per scope |

Quickstart

Install

git clone https://github.com/smankoo/local-memory-mcp.git
cd local-memory-mcp
npm install
npm run build

Configure Kiro CLI

Option A — Auto-configure:

npx tsx scripts/install-kiro.ts          # global
npx tsx scripts/install-kiro.ts --project # project-level

Option B — Manual:

Add to ~/.kiro/settings/mcp.json:

{
  "mcpServers": {
    "memory": {
      "command": "node",
      "args": ["/absolute/path/to/local-memory-mcp/dist/index.js"],
      "env": {
        "MEMORY_DIR": "~/.local-memory"
      }
    }
  }
}

Then restart Kiro CLI and run /mcp to verify.

Other MCP clients

Any client that speaks MCP over stdio works. The server entry point is dist/index.js:

node /path/to/local-memory-mcp/dist/index.js

Environment variables

| Variable | Default | Description |
|---|---|---|
| MEMORY_DIR | ~/.local-memory | Global data directory |
| MEMORY_PROJECT | auto-detected CWD | Project root override |
| EMBEDDING_MODEL | Xenova/all-MiniLM-L6-v2 | HuggingFace model for embeddings |
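
For readers wiring this into their own tooling, the sketch below shows one way these variables and defaults could be resolved in TypeScript. It is not the code in src/utils/config.ts. ResolvedConfig, expandHome, and resolveConfig are hypothetical names, and expanding a leading ~ is an assumption: values passed through an MCP client's JSON config are not expanded by a shell, so a program has to do it itself if it wants that behavior.

```typescript
import * as os from "node:os";
import * as path from "node:path";

interface ResolvedConfig {
  memoryDir: string;
  projectRoot: string | undefined; // undefined means "auto-detect"
  embeddingModel: string;
}

// Expands a leading "~" to the current user's home directory.
function expandHome(p: string): string {
  if (p === "~") return os.homedir();
  if (p.startsWith("~/")) return path.join(os.homedir(), p.slice(2));
  return p;
}

// Applies the documented defaults to whatever is set in the environment.
function resolveConfig(env: Record<string, string | undefined>): ResolvedConfig {
  return {
    memoryDir: expandHome(env["MEMORY_DIR"] ?? "~/.local-memory"),
    projectRoot: env["MEMORY_PROJECT"],
    embeddingModel: env["EMBEDDING_MODEL"] ?? "Xenova/all-MiniLM-L6-v2",
  };
}
```

Calling resolveConfig(process.env) would produce the effective settings for the current process.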

Configuration

Tuning knobs are in src/utils/config.ts:

| Parameter | Default | Description |
|---|---|---|
| deduplicationThreshold | 0.92 | Cosine similarity above which a new memory updates the existing one |
| consolidationThreshold | 0.85 | Similarity above which two memories are merged during consolidation |
| decayRate | 0.995 | Daily importance multiplier (0.995^30 ≈ 0.86, so ~14% decay/month) |
| minImportanceBeforePrune | 0.05 | Memories below this with <2 accesses get pruned |
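
To show how the two similarity thresholds differ in practice, here is a small TypeScript sketch. The function names are hypothetical and the code is not taken from the repository; it only applies the default values above to a plain cosine similarity.

```typescript
const DEDUPLICATION_THRESHOLD = 0.92; // checked when a new memory is stored
const CONSOLIDATION_THRESHOLD = 0.85; // checked during a consolidation run

// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// At store time: should the new memory update an existing one instead?
function shouldUpdateExisting(similarity: number): boolean {
  return similarity > DEDUPLICATION_THRESHOLD;
}

// At consolidation time: should two stored memories be merged?
function shouldMerge(similarity: number): boolean {
  return similarity > CONSOLIDATION_THRESHOLD;
}
```

A pair with similarity 0.90 is therefore stored as two separate memories but becomes a merge candidate the next time consolidate runs.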

Development

npm run dev          # watch mode
npm test             # run tests (downloads model on first run, ~60s)
npm run build        # production build

Project structure

src/
├── index.ts                # Entry point — stdio transport
├── server.ts               # MCP tool definitions
├── engine/
│   ├── memory-engine.ts    # Orchestrator — store, recall, forget, consolidate
│   ├── retrieval.ts        # Hybrid scoring (vector + FTS + recency + importance)
│   ├── embeddings.ts       # Local embedding via Transformers.js
│   ├── consolidation.ts    # Decay, merge, prune lifecycle
│   └── graph.ts            # Entity relationship engine
├── storage/
│   ├── database.ts         # SQLite + sqlite-vec + FTS5 initialization
│   ├── schema.ts           # Drizzle ORM schema
│   ├── memory-store.ts     # CRUD for memories table
│   ├── vector-store.ts     # sqlite-vec operations
│   ├── fts-store.ts        # FTS5 search with query sanitization
│   └── graph-store.ts      # Entity + relation tables, BFS traversal
└── utils/
    ├── config.ts           # Environment + defaults
    ├── scoring.ts          # Recency decay, hybrid scoring, cosine similarity
    └── id.ts               # nanoid generation

Tech stack

  • TypeScript + tsup (ESM, Node 22)
  • better-sqlite3 + Drizzle ORM for structured storage
  • sqlite-vec for vector similarity search
  • SQLite FTS5 for full-text search
  • @huggingface/transformers for local embeddings (ONNX)
  • @modelcontextprotocol/sdk for the MCP server
  • Vitest for testing

License

MIT
