# memory-mcp

Persistent, self-organizing semantic memory for AI agents — served as an MCP server.
## What is this?
memory-mcp is a Model Context Protocol server that gives AI agents durable, searchable memory backed by PostgreSQL and pgvector. Drop it into any MCP-compatible client (Claude Code, Cursor, Windsurf, etc.) and your agent gains the ability to remember, retrieve, and reason over information across sessions — without you managing any schema or storage logic.
**What it does autonomously:**

- Chunks and embeds incoming text
- Categorizes memories into a hierarchical taxonomy (`ltree` dot-paths)
- Deduplicates against existing memories and resolves conflicts
- Synthesizes a System Primer — a compressed, always-current summary of everything it knows — and surfaces it at session start
- Expires stale memories via TTL and prompts for verification of aging facts
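As a rough illustration of the first step, here is a naive paragraph chunker in the spirit of the server's `MIN_SECTION_LENGTH` filter. The real extraction is LLM-driven; this splitter and the `naive_chunks` name are illustrative assumptions, not project code.

```python
# Hypothetical sketch: split raw text into paragraph chunks and drop
# anything shorter than a minimum length (the actual server uses an LLM
# for semantic section extraction; this is only a naive approximation).
MIN_SECTION_LENGTH = 100

def naive_chunks(text: str, min_len: int = MIN_SECTION_LENGTH) -> list[str]:
    parts = [p.strip() for p in text.split("\n\n")]
    return [p for p in parts if len(p) >= min_len]
```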
## Why memory-mcp?

| | memory-mcp | Simple vector DB | LangChain / LlamaIndex memory |
|---|---|---|---|
| Schema management | Automatic | Manual | Manual |
| Deduplication | Semantic + LLM | None | None |
| Taxonomy | Auto-assigned ltree | None | None |
| Session bootstrap | System Primer | Manual RAG | Manual |
| Conflict resolution | LLM-evaluated | None | None |
| Ephemeral context | Built-in (TTL store) | No | No |
| Self-hostable | Yes (Docker) | Varies | No |
| MCP-native | Yes | No | No |
## Architecture

```text
AI Agent (Claude Code / Cursor / Windsurf)
                     │ HTTP (MCP — Streamable HTTP)
                     ▼
┌──────────────────────────────────────────┐
│                server.py                 │
│ ┌─────────────────┐  ┌─────────────────┐ │
│ │ Production MCP  │  │    Admin MCP    │ │
│ │    :8766/mcp    │  │    :8767/mcp    │ │
│ └────────┬────────┘  └────────┬────────┘ │
│          │       tools/       │          │
│  ┌───────▼────────────────────▼───────┐  │
│  │    ingestion · search · context    │  │
│  │ crud · admin_tools · context_store │  │
│  └─────────────────┬──────────────────┘  │
│                    │                     │
│  ┌─────────────────▼──────────────────┐  │
│  │         Background Workers         │  │
│  │    Ingestion Queue · TTL Daemon    │  │
│  │  System Primer Auto-Regeneration   │  │
│  └─────────────────┬──────────────────┘  │
└────────────────────┼─────────────────────┘
                     │ asyncpg
                     ▼
           PostgreSQL + pgvector
          ┌──────────────────┐
          │ memories         │  chunks, embeddings, ltree paths
          │ memory_edges     │  sequence_next, relates_to, supersedes
          │ ingestion_staging│  async job queue
          │ context_store    │  ephemeral TTL store
          └──────────────────┘
                    │
         ┌──────────▼──────────┐
         │   Backup Service    │  pg_dump → private GitHub repo
         └─────────────────────┘
```
**Two servers, one process:**

- **Production** (`:8766`) — tools safe for the agent to call freely
- **Admin** (`:8767`) — a superset that adds destructive tools (delete, prune, bulk-move). Point your agent at production; use admin for maintenance.
## Quickstart (Docker)

Prerequisites: Docker + Docker Compose, an OpenAI API key.

```sh
# 1. Clone
git clone https://github.com/isaacriehm/memory-mcp.git
cd memory-mcp

# 2. Configure
cp .env.example .env
$EDITOR .env   # set OPENAI_API_KEY and DB_PASSWORD at minimum

# 3. Start
docker compose up -d

# Production MCP endpoint: http://localhost:8766/mcp
# Admin MCP endpoint:      http://localhost:8767/mcp
```

To rebuild after code changes:

```sh
docker compose up -d --build memory-api
```
## Connecting to an MCP Client

### Claude Code

Add to your project's `.claude/settings.json` or `~/.claude/settings.json`:

```json
{
  "mcpServers": {
    "memory": {
      "type": "http",
      "url": "http://localhost:8766/mcp"
    }
  }
}
```

Or via the CLI:

```sh
claude mcp add memory --transport http http://localhost:8766/mcp
```

Then add this instruction to your CLAUDE.md so the agent always bootstraps memory at session start:

```markdown
## Memory

At the start of every session, call `initialize_context` before anything else.
This returns your System Primer — your identity, current knowledge taxonomy, and retrieval guide.
Always consult it before answering questions about prior context.
```
### Cursor / Windsurf

Add to your MCP settings (`.cursor/mcp.json` or equivalent):

```json
{
  "mcpServers": {
    "memory": {
      "url": "http://localhost:8766/mcp"
    }
  }
}
```
## MCP Tools

### Production Tools (:8766)

| Tool | Description |
|---|---|
| `initialize_context` | Call first every session. Returns the System Primer + verification prompts for aging memories. |
| `memorize_context` | Ingest raw text. Automatically chunks, embeds, categorizes, and deduplicates. Supports `ttl_days`. |
| `check_ingestion_status` | Poll an async ingestion job by `job_id`. Returns `pending`, `processing`, `complete`, or `failed`. |
| `search_memory` | Hybrid vector + BM25 search with Reciprocal Rank Fusion. Filter by `category_path`. |
| `list_categories` | Return all occupied taxonomy paths with memory counts. |
| `explore_taxonomy` | Drill into a collapsed `[+N more]` branch from `list_categories`. |
| `fetch_document` | Reconstruct a full document by following `sequence_next` edges from a memory ID. |
| `trace_history` | Inspect the full supersession chain (oldest → newest) for a memory. |
| `confirm_memory_validity` | Confirm an aging memory is still accurate. Advances its `verify_after` date. |
| `update_memory` | Rewrite a memory's content in place (preserves identity, edges, history). |
| `set_context` | Write a key/value pair to the ephemeral context store with a TTL. |
| `get_context` | Retrieve an ephemeral context entry by key. |
| `list_context_keys` | List active (non-expired) context keys, optionally filtered by scope. |
| `delete_context` | Explicitly delete a context entry before its TTL expires. |
| `extend_context_ttl` | Push a context entry's expiry forward by N hours. |
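`search_memory` merges its two ranked result lists with Reciprocal Rank Fusion. A minimal sketch of the standard RRF formula is below; the `rrf_fuse` name and the conventional `k=60` constant are illustrative assumptions, not taken from this codebase.

```python
# Sketch of Reciprocal Rank Fusion: each list contributes 1/(k + rank)
# per document, and documents are re-ranked by their summed score.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["m3", "m1", "m7"]   # ranked by cosine similarity
keyword_hits = ["m1", "m9", "m3"]  # ranked by BM25
print(rrf_fuse([vector_hits, keyword_hits]))  # → ['m1', 'm3', 'm9', 'm7']
```

Documents that appear high in both lists (here `m1` and `m3`) dominate the fused ranking, which is why RRF works well for combining semantic and keyword retrieval without score normalization.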
### Admin-Only Tools (:8767)

| Tool | Description |
|---|---|
| `delete_memory` | Hard-delete a memory by ID (cascades edges). |
| `prune_history` | Batch-delete superseded memories older than N days. |
| `export_memories` | Export all active memories to JSON. |
| `recategorize_memory` | Move a single memory to a new taxonomy path. |
| `bulk_move_category` | Move an entire taxonomy branch (e.g. `old.prefix` → `new.prefix`). |
| `update_memory_metadata` | Patch a memory's metadata JSONB in place. |
| `run_diagnostics` | Report on pool health, memory counts, and ingestion queue depth. |
| `get_ingestion_stats` | Breakdown of ingestion job statuses. |
| `flush_staging` | Clear all completed/failed staging jobs immediately. |
## Taxonomy

Memories are organized into a dot-path hierarchy using PostgreSQL `ltree`. The system assigns paths automatically during ingestion. You can override with `recategorize_memory` or `bulk_move_category`.

Example paths:

```text
user.profile.personal
user.health.medical
projects.myapp.architecture
projects.myapp.decisions
organizations.acme.business
concepts.ai.behavior
reference.system.primer   ← auto-generated System Primer lives here
```

Search is subtree-aware — passing `category_path: "projects.myapp"` returns everything under that branch.
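Subtree filtering behaves like `ltree`'s descendant matching (`path <@ ancestor` in Postgres). A plain-Python equivalent, with the `in_subtree` helper as an illustrative stand-in:

```python
# Sketch of subtree-aware category filtering: a path matches if it equals
# the ancestor or starts with the ancestor followed by a label separator.
def in_subtree(path: str, ancestor: str) -> bool:
    return path == ancestor or path.startswith(ancestor + ".")

paths = [
    "projects.myapp.architecture",
    "projects.myapp.decisions",
    "projects.other.notes",
]
hits = [p for p in paths if in_subtree(p, "projects.myapp")]
```

Note the `"." ` separator check: it prevents `projects.myapp2` from matching a `projects.myapp` filter, mirroring `ltree` label semantics.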
## System Primer

`initialize_context` returns a synthesized summary stored at `reference.system.primer`. It includes:

- A compressed user/agent profile
- The full taxonomy tree with memory counts
- Retrieval guidance

The primer auto-regenerates in the background when ≥10 new memories are ingested or when the previous primer is older than 1 hour. You can force regeneration via the admin tool `synthesize_system_primer`.
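The regeneration policy described above boils down to a two-condition check. A hedged sketch, with `should_regenerate` and `NEW_MEMORY_THRESHOLD` as illustrative names (only `PRIMER_UPDATE_MAX_AGE_S` appears in the actual configuration):

```python
# Sketch of the primer auto-regeneration trigger: fire when enough new
# memories have arrived since the last primer, or the primer is stale.
PRIMER_UPDATE_MAX_AGE_S = 3600   # real env var; default 1 hour
NEW_MEMORY_THRESHOLD = 10        # hypothetical name for the >=10 rule

def should_regenerate(new_memories: int, primer_age_s: float) -> bool:
    return (new_memories >= NEW_MEMORY_THRESHOLD
            or primer_age_s > PRIMER_UPDATE_MAX_AGE_S)
```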
## Environment Variables

Copy `.env.example` to `.env` and fill in your values.

### Required

| Variable | Description |
|---|---|
| `DATABASE_URL` | PostgreSQL connection string (e.g. `postgresql://user:pass@localhost:5432/memory`) |
| `OPENAI_API_KEY` | OpenAI API key for embeddings and LLM calls |
| `DB_PASSWORD` | PostgreSQL password (used by Docker Compose) |
### Optional — Models & Embeddings

| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `text-embedding-3-small` | OpenAI embedding model |
| `EXTRACT_MODEL` | `gpt-5-mini` | LLM for semantic section extraction and categorization |
| `CONFLICT_MODEL` | `gpt-5-nano` | LLM for conflict/dedup evaluation |
| `EMBED_DIM` | `1536` | Embedding vector dimension (must match the model) |
### Optional — Search & Limits

| Variable | Default | Description |
|---|---|---|
| `DEFAULT_SEARCH_LIMIT` | `10` | Default result count for `search_memory` |
| `DEFAULT_LIST_LIMIT` | `50` | Default result count for `list_categories` |
| `DUP_THRESHOLD` | `0.95` | Cosine similarity threshold for deduplication |
| `CONFLICT_THRESHOLD` | `0.55` | Similarity threshold for conflict detection |
| `RELATES_TO_THRESHOLD` | `0.65` | Similarity threshold for `relates_to` edge creation |
| `MIN_SECTION_LENGTH` | `100` | Minimum character length for a chunk to be stored |
| `MAX_TAXONOMY_PATHS` | `40` | Max taxonomy paths assigned per ingestion |
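The similarity thresholds partition an incoming chunk's fate by its cosine similarity to the nearest existing memory. The `classify` helper below is a hypothetical sketch of that decision, not project code:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def classify(sim: float, dup: float = 0.95, conflict: float = 0.55) -> str:
    if sim >= dup:
        return "duplicate"        # merge with / drop in favor of existing memory
    if sim >= conflict:
        return "conflict-check"   # hand off to the conflict LLM for evaluation
    return "new"                  # store as a fresh memory

print(classify(0.97))  # → duplicate
```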
### Optional — OpenAI & Concurrency

| Variable | Default | Description |
|---|---|---|
| `OPENAI_TIMEOUT_S` | `60` | Per-request OpenAI timeout in seconds |
| `OPENAI_MAX_RETRIES` | `5` | Exponential-backoff retry limit |
| `MAX_CONCURRENT_API_CALLS` | `5` | Semaphore limit for parallel OpenAI requests |
| `EXTRACT_REASONING` | `low` | Reasoning effort for the extraction LLM |
| `CONFLICT_REASONING` | `minimal` | Reasoning effort for the conflict LLM |
### Optional — Database

| Variable | Default | Description |
|---|---|---|
| `PG_POOL_MIN` | `1` | asyncpg minimum pool connections |
| `PG_POOL_MAX` | `10` | asyncpg maximum pool connections |
| `STAGING_RETENTION_DAYS` | `7` | Days to retain completed/failed staging jobs |
### Optional — Server

| Variable | Default | Description |
|---|---|---|
| `PRODUCTION_PORT` | `8766` | Production MCP server port |
| `ADMIN_PORT` | `8767` | Admin MCP server port |
| `MCP_TRANSPORT` | `streamable-http` | FastMCP transport mode |
| `FASTMCP_JSON_RESPONSE` | — | Set to `1` to force JSON responses |
| `LOG_LEVEL` | `INFO` | `DEBUG` / `INFO` / `WARNING` |
### Optional — System Primer

| Variable | Default | Description |
|---|---|---|
| `PRIMER_UPDATE_MAX_AGE_S` | `3600` | Max seconds before automatic primer regeneration |
### Optional — Context Store

| Variable | Default | Description |
|---|---|---|
| `CONTEXT_DEFAULT_TTL_HOURS` | `24` | Default TTL for context store entries |
| `CONTEXT_MAX_VALUE_LENGTH` | `50000` | Max character length for context values |
| `CONTEXT_MAX_KEY_LENGTH` | `200` | Max character length for context keys |
### Optional — Backup Service

| Variable | Description |
|---|---|
| `GITHUB_PAT` | GitHub Personal Access Token with `repo` scope |
| `GITHUB_BACKUP_REPO` | Target repo in `owner/repo` format |
| `BACKUP_INTERVAL_SECONDS` | Seconds between backups (default: `21600` = 6 hours) |
## Running Locally (Development)

Requirements: Python 3.11+, PostgreSQL with pgvector.

```sh
# Create and activate a virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure
cp .env.example .env
$EDITOR .env

# Start the server
python -m server
# Production: http://0.0.0.0:8766
# Admin:      http://0.0.0.0:8767
```
## Backup Service

The `backup/` directory contains a containerized PostgreSQL backup job that:

- Runs `pg_dump` on the configured interval (default: every 6 hours)
- Commits the dump to a private GitHub repository

The backup service starts automatically with `docker compose up`. Set `GITHUB_PAT` and `GITHUB_BACKUP_REPO` in your `.env` to enable it. If those variables are unset, the service will error on startup — remove the `memory-backup` service from `docker-compose.yml` if you don't need backups.
## CLI Scripts

Standalone scripts in `scripts/` (require `DATABASE_URL` in the environment):

```sh
# Export all memories to a timestamped JSON file
python scripts/export_memories.py

# Generate an interactive graph visualization
python scripts/visualize_memories.py
open memory_map.html
```
## Contributing

See CONTRIBUTING.md.

## License