MCP Servers

mnemos

Persistent memory engine for AI coding agents. Single Go binary, zero runtime dependencies, MCP-native. Stores, searches, and deduplicates memories across sessions using embedded SQLite with hybrid FTS + semantic search, memory decay, relation graph, and token-budget context assembly.

README

mnemos

Your AI agent has the memory of a goldfish. Mnemos fixes that.

A persistent memory engine for AI coding agents. Single Go binary, zero runtime dependencies, MCP-native.

Mnemos stores, searches, and manages memories across sessions using embedded SQLite — no external services, no Docker, no cloud, no Python, no Node.js. Just one binary and a .db file.

Agent (Claude Code / Kiro / Cursor / Windsurf / ...)
    ↓ MCP stdio
mnemos serve
    ↓
SQLite + FTS5 (~/.mnemos/mnemos.db)

What does it actually do?

Every time your agent learns something worth keeping — an architecture decision, a bug fix, a project convention — it calls mnemos_store. Next session, it calls mnemos_context and gets that knowledge back, as if it never forgot.

No more re-explaining your project structure every Monday morning.

The memory lifecycle:

Agent finishes something meaningful (fixed a bug, made a decision, learned a pattern)
Calls mnemos_store with the content
Mnemos deduplicates, classifies, and indexes it
Next session: mnemos_context assembles relevant memories within a token budget
Agent picks up right where it left off

Why mnemos?

	claude-mem	engram	neural-memory	mnemos
MCP native	✅	✅	✅	✅
Single binary / zero install	❌	✅	❌ (pip)	✅
Zero config to start	✅	✅	❌	✅
Hybrid search (FTS + semantic RRF)	❌	❌	❌	✅
Memory decay / lifecycle	❌	❌	✅	✅
Deduplication	❌	❌	❌	✅ (3-tier)
Relation graph	❌	❌	✅ (spreading activation)	✅
Token-budget context assembly	❌	partial	❌	✅
Human-readable Markdown mirror	✅	❌	❌	✅
Works with Kiro / Cursor / Windsurf	❌	✅	✅	✅
No Python / Node runtime required	✅	✅	❌	✅
Written in Go	❌	✅	❌ (Python)	✅

Install

# curl (macOS / Linux)
curl -fsSL https://raw.githubusercontent.com/s60yucca/mnemos/main/install.sh | bash

# Homebrew
brew install s60yucca/tap/mnemos

# Build from source (requires Go 1.23+)
git clone https://github.com/s60yucca/mnemos
cd mnemos && make build
# binary at: bin/mnemos

Initialize on first run:

mnemos init
# Creates ~/.mnemos/mnemos.db and ~/.mnemos/config.yaml

Use with Claude Code

Add to ~/.claude.json (global) or .mcp.json in your project root:

{
  "mcpServers": {
    "mnemos": {
      "command": "mnemos",
      "args": ["serve"],
      "env": {
        "MNEMOS_PROJECT_ID": "my-project"
      }
    }
  }
}

Restart Claude Code. Mnemos tools appear automatically.

Use with Kiro

Add to ~/.kiro/settings/mcp.json (global) or .kiro/settings/mcp.json in your workspace:

{
  "mcpServers": {
    "mnemos": {
      "command": "mnemos",
      "args": ["serve"],
      "env": {
        "MNEMOS_PROJECT_ID": "my-project"
      },
      "disabled": false,
      "autoApprove": ["mnemos_search", "mnemos_get", "mnemos_context"]
    }
  }
}

For automatic memory usage on every session, add a steering file at .kiro/steering/mnemos.md telling the agent to call mnemos_context at session start and mnemos_store when it learns something. Kiro will follow it automatically.

Use with Cursor / Windsurf / any MCP client

Same JSON config — mnemos speaks standard MCP over stdio. Works with any client that supports MCP tools.

MCP Tools

Tool	What it does
`mnemos_store`	Store a memory with optional type, tags, project scope
`mnemos_search`	Hybrid FTS + semantic search with RRF ranking
`mnemos_get`	Fetch a memory by ID
`mnemos_update`	Update content, summary, or tags
`mnemos_delete`	Soft-delete (recoverable via maintain)
`mnemos_relate`	Link two memories with a typed relation
`mnemos_context`	Assemble relevant memories within a token budget
`mnemos_maintain`	Run decay, archival, and garbage collection

Resources: mnemos://memories/{project_id}, mnemos://stats

Prompts: load_context (session start), save_session (session end)

CLI

mnemos init                                           # first-time setup
mnemos store "JWT uses RS256, tokens expire in 1h"    # store a memory
mnemos search "authentication"                        # hybrid search
mnemos search "auth" --mode text                      # text-only search
mnemos list --project myapp                           # list memories
mnemos get <id>                                       # fetch by id
mnemos update <id> --content "updated text"           # update
mnemos delete <id>                                    # soft delete
mnemos delete <id> --hard                             # permanent delete
mnemos relate <src-id> <tgt-id> --type depends_on     # create relation
mnemos stats --project myapp                          # storage stats
mnemos maintain                                       # decay + GC
mnemos serve                                          # start MCP server (stdio)
mnemos serve --rest --port 8080                       # start REST server
mnemos version                                        # print version

Global flags: --project <id>, --config <path>, --log-level debug|info|warn|error

Memory Types

Mnemos auto-classifies memories based on content. You can override manually.

Type	Decay rate	Use for
`short_term`	fast (~1 day)	todos, temp notes, WIP
`episodic`	medium (~1 month)	session events, bug fixes
`long_term`	slow (~6 months)	architecture decisions
`semantic`	very slow	facts, definitions, knowledge
`working`	fast	active task context

How search works

Mnemos uses Reciprocal Rank Fusion (RRF) to combine two search signals:

FTS5 — SQLite full-text search with BM25 ranking. Fast, offline, no setup.
Semantic — vector cosine similarity via embeddings. Optional, requires Ollama or OpenAI.

With only FTS5 (default), search is keyword-based but still very good. Enable embeddings to find memories by meaning — e.g. query "token expiry" finds a memory about "JWT RS256 1h lifetime".

Configuration

~/.mnemos/config.yaml:

data_dir: ~/.mnemos
log_level: info
log_format: text        # text or json

embeddings:
  provider: noop          # noop (default) | ollama | openai
  base_url: http://localhost:11434
  model: nomic-embed-text
  dims: 384
  api_key: ""

dedup:
  fuzzy_threshold: 0.85
  semantic_threshold: 0.92

lifecycle:
  decay_interval: 24h
  gc_retention_days: 30
  archive_threshold: 0.1

mirror:
  enabled: false          # set true to write human-readable Markdown files
  base_dir: ~/.mnemos/mirror

Environment variables override config — prefix with MNEMOS_:

MNEMOS_PROJECT_ID=myapp        # scope memories to a project
MNEMOS_LOG_LEVEL=debug

# Only needed if using semantic embeddings (optional):
MNEMOS_EMBEDDINGS_PROVIDER=ollama
MNEMOS_EMBEDDINGS_API_KEY=sk-...

Embedding Providers

Embeddings are optional. By default mnemos uses noop — pure FTS5 text search, zero config, works fully offline.

Enable embeddings only if you want semantic similarity search (find memories by meaning, not just keywords).

Ollama (local, free, no API key):

embeddings:
  provider: ollama
  base_url: http://localhost:11434
  model: nomic-embed-text
  dims: 768

OpenAI:

embeddings:
  provider: openai
  model: text-embedding-3-small
  dims: 1536
  api_key: sk-...

Performance

Benchmarked on macOS (Apple M-series), SQLite WAL mode, embeddings disabled (noop), cold process start per operation.

Operation	350 memories	1500 memories	Notes
`store` (new)	57 ms	24 ms	includes dedup check
`store` (dedup hit)	55 ms	22 ms	hash match, no write
`search` text (FTS5)	60 ms	54 ms	BM25 ranking
`search` hybrid (RRF)	42 ms	39 ms	FTS + noop vector
`list`	34 ms	26 ms	sorted by created_at
`maintain` (decay+GC)	27 ms	108 ms	full table scan
binary size	12 MB	—	single static binary
startup time	~50 ms	—	cold start

Most operations stay under 60 ms regardless of dataset size. With semantic embeddings enabled, store adds ~50–200 ms per memory for embedding generation — search quality improves significantly.

REST API

mnemos serve --rest --port 8080

POST   /memories              store
GET    /memories/{id}         get
PATCH  /memories/{id}         update
DELETE /memories/{id}         soft-delete
GET    /memories              list
POST   /memories/search       search
POST   /memories/{id}/relate  relate
GET    /stats                 stats
POST   /maintain              maintenance

Build

make build    # → bin/mnemos
make test     # all tests
make lint     # golangci-lint
make release  # goreleaser snapshot

License

MIT

Recommended Servers

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official

Featured

TypeScript

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official

Featured

Local

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official

Featured

TypeScript

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official

Featured

Python

E2B

Using MCP to run code via e2b.

Official

Featured

Neon Database

MCP server for interacting with Neon Management API and databases

Official

Featured

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official

Featured

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official

Featured