mnemos

Your AI agent has the memory of a goldfish. Mnemos fixes that.

A persistent memory engine for AI coding agents. Single Go binary, zero runtime dependencies, MCP-native.

Mnemos stores, searches, and manages memories across sessions using embedded SQLite — no external services, no Docker, no cloud, no Python, no Node.js. Just one binary and a .db file.

Agent (Claude Code / Kiro / Cursor / Windsurf / ...)
    ↓ MCP stdio
mnemos serve
    ↓
SQLite + FTS5 (~/.mnemos/mnemos.db)

What does it actually do?

Every time your agent learns something worth keeping — an architecture decision, a bug fix, a project convention — it calls mnemos_store. Next session, it calls mnemos_context and gets that knowledge back, as if it never forgot.

No more re-explaining your project structure every Monday morning.

The memory lifecycle:

  1. Agent finishes something meaningful (fixed a bug, made a decision, learned a pattern)
  2. Calls mnemos_store with the content
  3. Mnemos deduplicates, classifies, and indexes it
  4. Next session: mnemos_context assembles relevant memories within a token budget
  5. Agent picks up right where it left off
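Step 4 above is essentially a greedy knapsack over search scores: take the most relevant memories until the token budget is spent. A minimal sketch — the `Memory` struct and its field names are illustrative, not mnemos's actual schema:

```go
package main

import (
	"fmt"
	"sort"
)

// Memory is a simplified stand-in for a stored mnemos memory.
type Memory struct {
	Content   string
	Relevance float64 // combined search score
	Tokens    int     // estimated token count
}

// assembleContext greedily packs the highest-relevance memories
// into a fixed token budget, mirroring step 4 of the lifecycle.
func assembleContext(memories []Memory, budget int) []Memory {
	sort.Slice(memories, func(i, j int) bool {
		return memories[i].Relevance > memories[j].Relevance
	})
	var out []Memory
	used := 0
	for _, m := range memories {
		if used+m.Tokens > budget {
			continue // this memory would blow the budget; try smaller ones
		}
		out = append(out, m)
		used += m.Tokens
	}
	return out
}

func main() {
	ms := []Memory{
		{"JWT uses RS256", 0.9, 50},
		{"Fixed race in worker pool", 0.7, 120},
		{"Team prefers table-driven tests", 0.5, 80},
	}
	for _, m := range assembleContext(ms, 150) {
		fmt.Println(m.Content)
	}
}
```

Note the greedy pass skips an oversized memory rather than stopping, so smaller lower-ranked memories can still fill the remaining budget.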

Why mnemos?

Compared with alternatives like claude-mem, engram, and neural-memory — some of which need a pip install and a Python runtime — mnemos offers:

  • MCP native
  • Single binary, zero install
  • Zero config to start
  • Hybrid search (FTS + semantic RRF)
  • Memory decay / lifecycle
  • 3-tier deduplication
  • Relation graph with spreading activation
  • Token-budget context assembly
  • Human-readable Markdown mirror
  • Works with Kiro / Cursor / Windsurf
  • No Python / Node runtime required
  • Written in Go

Install

# curl (macOS / Linux)
curl -fsSL https://raw.githubusercontent.com/s60yucca/mnemos/main/install.sh | bash

# Homebrew
brew install s60yucca/tap/mnemos

# Build from source (requires Go 1.23+)
git clone https://github.com/s60yucca/mnemos
cd mnemos && make build
# binary at: bin/mnemos

Initialize on first run:

mnemos init
# Creates ~/.mnemos/mnemos.db and ~/.mnemos/config.yaml

Use with Claude Code

Add to ~/.claude.json (global) or .mcp.json in your project root:

{
  "mcpServers": {
    "mnemos": {
      "command": "mnemos",
      "args": ["serve"],
      "env": {
        "MNEMOS_PROJECT_ID": "my-project"
      }
    }
  }
}

Restart Claude Code. Mnemos tools appear automatically.


Use with Kiro

Add to ~/.kiro/settings/mcp.json (global) or .kiro/settings/mcp.json in your workspace:

{
  "mcpServers": {
    "mnemos": {
      "command": "mnemos",
      "args": ["serve"],
      "env": {
        "MNEMOS_PROJECT_ID": "my-project"
      },
      "disabled": false,
      "autoApprove": ["mnemos_search", "mnemos_get", "mnemos_context"]
    }
  }
}

For automatic memory usage on every session, add a steering file at .kiro/steering/mnemos.md telling the agent to call mnemos_context at session start and mnemos_store when it learns something. Kiro will follow it automatically.


Use with Cursor / Windsurf / any MCP client

Same JSON config — mnemos speaks standard MCP over stdio. Works with any client that supports MCP tools.


MCP Tools

Tool             What it does
mnemos_store     Store a memory with optional type, tags, project scope
mnemos_search    Hybrid FTS + semantic search with RRF ranking
mnemos_get       Fetch a memory by ID
mnemos_update    Update content, summary, or tags
mnemos_delete    Soft-delete (recoverable via maintain)
mnemos_relate    Link two memories with a typed relation
mnemos_context   Assemble relevant memories within a token budget
mnemos_maintain  Run decay, archival, and garbage collection

Resources: mnemos://memories/{project_id}, mnemos://stats

Prompts: load_context (session start), save_session (session end)


CLI

mnemos init                                           # first-time setup
mnemos store "JWT uses RS256, tokens expire in 1h"    # store a memory
mnemos search "authentication"                        # hybrid search
mnemos search "auth" --mode text                      # text-only search
mnemos list --project myapp                           # list memories
mnemos get <id>                                       # fetch by id
mnemos update <id> --content "updated text"           # update
mnemos delete <id>                                    # soft delete
mnemos delete <id> --hard                             # permanent delete
mnemos relate <src-id> <tgt-id> --type depends_on     # create relation
mnemos stats --project myapp                          # storage stats
mnemos maintain                                       # decay + GC
mnemos serve                                          # start MCP server (stdio)
mnemos serve --rest --port 8080                       # start REST server
mnemos version                                        # print version

Global flags: --project <id>, --config <path>, --log-level debug|info|warn|error


Memory Types

Mnemos auto-classifies memories based on content. You can override manually.

Type        Decay rate         Use for
short_term  fast (~1 day)      todos, temp notes, WIP
episodic    medium (~1 month)  session events, bug fixes
long_term   slow (~6 months)   architecture decisions
semantic    very slow          facts, definitions, knowledge
working     fast               active task context

How search works

Mnemos uses Reciprocal Rank Fusion (RRF) to combine two search signals:

  • FTS5 — SQLite full-text search with BM25 ranking. Fast, offline, no setup.
  • Semantic — vector cosine similarity via embeddings. Optional, requires Ollama or OpenAI.

With only FTS5 (default), search is keyword-based but still very good. Enable embeddings to find memories by meaning — e.g. query "token expiry" finds a memory about "JWT RS256 1h lifetime".
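RRF itself is simple: each memory earns 1/(k + rank) from every result list it appears in, and the summed scores decide the final order. A minimal sketch — k = 60 follows the original RRF formulation; mnemos's exact constant is an internal detail:

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse combines ranked result lists with Reciprocal Rank Fusion:
// a document scores 1/(k + rank) per list it appears in, summed
// across lists, and documents are returned in descending score order.
func rrfFuse(lists [][]string, k float64) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, id := range list {
			scores[id] += 1.0 / (k + float64(rank+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		return scores[ids[i]] > scores[ids[j]]
	})
	return ids
}

func main() {
	fts := []string{"m1", "m2", "m3"}      // BM25 order
	semantic := []string{"m3", "m1", "m4"} // cosine-similarity order
	fmt.Println(rrfFuse([][]string{fts, semantic}, 60))
	// → [m1 m3 m2 m4]
}
```

Memories that both signals agree on (m1, m3) float to the top, while a memory found by only one signal still makes the list.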


Configuration

~/.mnemos/config.yaml:

data_dir: ~/.mnemos
log_level: info
log_format: text        # text or json

embeddings:
  provider: noop          # noop (default) | ollama | openai
  base_url: http://localhost:11434
  model: nomic-embed-text
  dims: 384
  api_key: ""

dedup:
  fuzzy_threshold: 0.85
  semantic_threshold: 0.92

lifecycle:
  decay_interval: 24h
  gc_retention_days: 30
  archive_threshold: 0.1

mirror:
  enabled: false          # set true to write human-readable Markdown files
  base_dir: ~/.mnemos/mirror
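The two dedup thresholds above suggest how the 3-tier check (exact hash, then fuzzy text, then semantic similarity) could gate a store. A sketch under those assumptions — the Jaccard word-set similarity here is a toy stand-in for whatever fuzzy measure mnemos actually uses:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// contentHash is tier 1: an exact match on normalized content.
func contentHash(s string) string {
	sum := sha256.Sum256([]byte(strings.TrimSpace(s)))
	return hex.EncodeToString(sum[:])
}

// jaccard is a toy word-set similarity standing in for tier 2's
// fuzzy text comparison.
func jaccard(a, b string) float64 {
	wa, wb := map[string]bool{}, map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(a)) {
		wa[w] = true
	}
	for _, w := range strings.Fields(strings.ToLower(b)) {
		wb[w] = true
	}
	inter := 0
	for w := range wa {
		if wb[w] {
			inter++
		}
	}
	union := len(wa) + len(wb) - inter
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

// isDuplicate applies the tiers in order, using the default config
// thresholds (fuzzy 0.85, semantic 0.92). semanticSim would come
// from embedding cosine similarity when a provider is configured.
func isDuplicate(newMem, existing string, semanticSim float64) (bool, string) {
	if contentHash(newMem) == contentHash(existing) {
		return true, "hash"
	}
	if jaccard(newMem, existing) >= 0.85 {
		return true, "fuzzy"
	}
	if semanticSim >= 0.92 {
		return true, "semantic"
	}
	return false, ""
}

func main() {
	dup, tier := isDuplicate("JWT uses RS256", "JWT uses RS256", 0)
	fmt.Println(dup, tier)
}
```

Cheap tiers run first, so exact re-stores never pay for a fuzzy or semantic comparison.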

Environment variables override config — prefix with MNEMOS_:

MNEMOS_PROJECT_ID=myapp        # scope memories to a project
MNEMOS_LOG_LEVEL=debug

# Only needed if using semantic embeddings (optional):
MNEMOS_EMBEDDINGS_PROVIDER=ollama
MNEMOS_EMBEDDINGS_API_KEY=sk-...

Embedding Providers

Embeddings are optional. By default mnemos uses noop — pure FTS5 text search, zero config, works fully offline.

Enable embeddings only if you want semantic similarity search (find memories by meaning, not just keywords).

Ollama (local, free, no API key):

embeddings:
  provider: ollama
  base_url: http://localhost:11434
  model: nomic-embed-text
  dims: 768

OpenAI:

embeddings:
  provider: openai
  model: text-embedding-3-small
  dims: 1536
  api_key: sk-...

Performance

Benchmarked on macOS (Apple M-series), SQLite WAL mode, embeddings disabled (noop), cold process start per operation.

Operation            350 memories  1500 memories  Notes
store (new)          57 ms         24 ms          includes dedup check
store (dedup hit)    55 ms         22 ms          hash match, no write
search text (FTS5)   60 ms         54 ms          BM25 ranking
search hybrid (RRF)  42 ms         39 ms          FTS + noop vector
list                 34 ms         26 ms          sorted by created_at
maintain (decay+GC)  27 ms         108 ms         full table scan
binary size          12 MB                        single static binary
startup time         ~50 ms                       cold start

Most operations stay under 60 ms regardless of dataset size. With semantic embeddings enabled, store adds ~50–200 ms per memory for embedding generation — search quality improves significantly.


REST API

mnemos serve --rest --port 8080
POST   /memories              store
GET    /memories/{id}         get
PATCH  /memories/{id}         update
DELETE /memories/{id}         soft-delete
GET    /memories              list
POST   /memories/search       search
POST   /memories/{id}/relate  relate
GET    /stats                 stats
POST   /maintain              maintenance
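For programmatic use, storing a memory is a plain HTTP POST. The Go sketch below assumes JSON field names (content, type, tags) modeled on the CLI flags — the actual wire schema may differ:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newStoreRequest builds a POST /memories request for a running
// `mnemos serve --rest` instance. The JSON field names are
// assumptions based on the CLI flags, not a documented schema.
func newStoreRequest(baseURL, content, memType string, tags []string) (*http.Request, error) {
	body, err := json.Marshal(map[string]any{
		"content": content,
		"type":    memType,
		"tags":    tags,
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/memories", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newStoreRequest("http://localhost:8080",
		"JWT uses RS256, tokens expire in 1h", "semantic", []string{"auth"})
	if err != nil {
		panic(err)
	}
	if _, err := http.DefaultClient.Do(req); err != nil {
		fmt.Println("is `mnemos serve --rest --port 8080` running?", err)
	}
}
```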

Build

make build    # → bin/mnemos
make test     # all tests
make lint     # golangci-lint
make release  # goreleaser snapshot

License

MIT
