# mnemo-mcp

Persistent AI memory with SQLite hybrid search (FTS5 + semantic), built-in Qwen3 embedding, and rclone sync across machines.

## Mnemo MCP Server

mcp-name: io.github.n24q02m/mnemo-mcp

Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.
## Features

- Hybrid search: FTS5 full-text + sqlite-vec semantic + Qwen3-Embedding-0.6B (built-in)
- Zero-config mode: works out of the box with local embedding; no API keys needed
- Auto-detect embedding: set `API_KEYS` for cloud embedding, with automatic fallback to local
- Embedded sync: rclone is auto-downloaded and managed as a subprocess
- Multi-machine: JSONL-based merge sync via rclone (Google Drive, S3, etc.)
- Proactive memory: tool descriptions guide the AI to save preferences, decisions, and facts
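Hybrid search merges two independently ranked result lists, one from FTS5 keyword matching and one from sqlite-vec semantic similarity. The fusion method mnemo-mcp actually uses is not documented here; a common approach is reciprocal rank fusion, sketched below with illustrative memory IDs:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked ID lists into one, rewarding items
    that rank highly in any individual list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Keyword (FTS5) ranking and semantic (sqlite-vec) ranking for the same query:
fts_hits = ["m3", "m1", "m7"]
vec_hits = ["m1", "m9", "m3"]
print(reciprocal_rank_fusion([fts_hits, vec_hits]))  # ['m1', 'm3', 'm9', 'm7']
```

Items that appear near the top of both lists ("m1", "m3") outrank items found by only one retriever.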
## Quick Start

The recommended way to run this server is via `uvx`:

```bash
uvx mnemo-mcp@latest
```

Alternatively, you can use `pipx run mnemo-mcp`.
### Option 1: uvx (Recommended)

```jsonc
{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses the built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        // -- first run downloads a ~570MB model, cached for subsequent runs
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                  // optional, default: false
        "SYNC_REMOTE": "gdrive",                 // required when SYNC_ENABLED=true
        "SYNC_INTERVAL": "300",                  // optional, auto-sync every 5 min (0 = manual only)
        "RCLONE_CONFIG_GDRIVE_TYPE": "drive",    // required when SYNC_ENABLED=true
        "RCLONE_CONFIG_GDRIVE_TOKEN": "<base64>" // required when SYNC_ENABLED=true; from: uvx mnemo-mcp setup-sync drive
      }
    }
  }
}
```
### Option 2: Docker

```jsonc
{
  "mcpServers": {
    "mnemo": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "--name", "mcp-mnemo",
        "-v", "mnemo-data:/data",           // persists memories across restarts
        "-e", "LITELLM_PROXY_URL",          // optional: pass-through from env below
        "-e", "LITELLM_PROXY_KEY",          // optional: pass-through from env below
        "-e", "API_KEYS",                   // optional: pass-through from env below
        "-e", "EMBEDDING_API_BASE",         // optional: pass-through from env below
        "-e", "EMBEDDING_API_KEY",          // optional: pass-through from env below
        "-e", "SYNC_ENABLED",               // optional: pass-through from env below
        "-e", "SYNC_REMOTE",                // required when SYNC_ENABLED=true: pass-through
        "-e", "SYNC_INTERVAL",              // optional: pass-through from env below
        "-e", "RCLONE_CONFIG_GDRIVE_TYPE",  // required when SYNC_ENABLED=true: pass-through
        "-e", "RCLONE_CONFIG_GDRIVE_TOKEN", // required when SYNC_ENABLED=true: pass-through
        "n24q02m/mnemo-mcp:latest"
      ],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses the built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                  // optional, default: false
        "SYNC_REMOTE": "gdrive",                 // required when SYNC_ENABLED=true
        "SYNC_INTERVAL": "300",                  // optional, auto-sync every 5 min (0 = manual only)
        "RCLONE_CONFIG_GDRIVE_TYPE": "drive",    // required when SYNC_ENABLED=true
        "RCLONE_CONFIG_GDRIVE_TOKEN": "<base64>" // required when SYNC_ENABLED=true; from: uvx mnemo-mcp setup-sync drive
      }
    }
  }
}
```
### Pre-install (optional)

Pre-download dependencies before adding the server to your MCP client config. This avoids a slow first-run startup:

```bash
# Pre-download the embedding model (~570MB) and validate API keys
uvx mnemo-mcp warmup

# With cloud embedding (validates the API key; skips the local download if cloud works)
API_KEYS="GOOGLE_API_KEY:AIza..." uvx mnemo-mcp warmup
```
### Sync setup (one-time)

```bash
# Google Drive
uvx mnemo-mcp setup-sync drive

# Other providers (any rclone remote type)
uvx mnemo-mcp setup-sync dropbox
uvx mnemo-mcp setup-sync onedrive
uvx mnemo-mcp setup-sync s3
```

This opens a browser for OAuth and prints the env vars (`RCLONE_CONFIG_*`) to set. Both raw JSON and base64-encoded tokens are supported.
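Since both raw JSON and base64-encoded tokens are accepted, you can convert between the two yourself. A minimal sketch, with a placeholder token (real values come only from `setup-sync`):

```python
import base64
import json

# Placeholder rclone OAuth token; the real one is printed by `setup-sync`.
token = {
    "access_token": "ya29.example",
    "token_type": "Bearer",
    "refresh_token": "1//example",
    "expiry": "2025-01-01T00:00:00Z",
}

raw = json.dumps(token, separators=(",", ":"))       # raw JSON form
encoded = base64.b64encode(raw.encode()).decode()    # base64 form

# Either string can be used as the value of RCLONE_CONFIG_GDRIVE_TOKEN.
print(encoded)
```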
## Configuration

| Variable | Default | Description |
|---|---|---|
| `DB_PATH` | `~/.mnemo-mcp/memories.db` | Database location |
| `LITELLM_PROXY_URL` | — | LiteLLM Proxy URL (e.g. `http://10.0.0.20:4000`). Enables proxy mode |
| `LITELLM_PROXY_KEY` | — | LiteLLM Proxy virtual key (e.g. `sk-...`) |
| `API_KEYS` | — | API keys (`ENV:key,ENV:key`). Optional: enables cloud semantic search (SDK mode) |
| `EMBEDDING_API_BASE` | — | Custom embedding endpoint URL (optional, for SDK mode) |
| `EMBEDDING_API_KEY` | — | Custom embedding endpoint key (optional) |
| `EMBEDDING_BACKEND` | (auto-detect) | `litellm` (cloud API) or `local` (Qwen3). Auto: `API_KEYS` -> `litellm`, else `local` (always available) |
| `EMBEDDING_MODEL` | (auto-detect) | LiteLLM model name (optional) |
| `EMBEDDING_DIMS` | `0` (auto = 768) | Embedding dimensions (0 = auto-detect, default 768) |
| `SYNC_ENABLED` | `false` | Enable rclone sync |
| `SYNC_REMOTE` | — | rclone remote name (required when sync is enabled) |
| `SYNC_FOLDER` | `mnemo-mcp` | Remote folder (optional) |
| `SYNC_INTERVAL` | `0` | Auto-sync interval in seconds (optional, 0 = manual) |
| `LOG_LEVEL` | `INFO` | Log level (optional) |
## Embedding (3-Mode Architecture)

Embedding is always available: a local model is built in and requires no configuration.

Embedding access supports 3 modes, resolved by priority:

| Priority | Mode | Config | Use case |
|---|---|---|---|
| 1 | Proxy | `LITELLM_PROXY_URL` + `LITELLM_PROXY_KEY` | Production (OCI VM, self-hosted gateway) |
| 2 | SDK | `API_KEYS` or `EMBEDDING_API_BASE` | Dev/local with direct API access |
| 3 | Local | Nothing needed | Offline, always available as fallback |

There is no cross-mode fallback: if a proxy is configured but unreachable, calls fail rather than silently falling back to the direct API.
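The priority resolution above can be sketched as a simple cascade (illustrative only; the server's real logic may differ):

```python
def resolve_embedding_mode(env: dict[str, str]) -> str:
    """Resolve the embedding mode by priority: proxy > SDK > local."""
    if env.get("LITELLM_PROXY_URL") and env.get("LITELLM_PROXY_KEY"):
        return "proxy"
    if env.get("API_KEYS") or env.get("EMBEDDING_API_BASE"):
        return "sdk"
    return "local"  # zero config: always available

print(resolve_embedding_mode({}))                                  # local
print(resolve_embedding_mode({"API_KEYS": "GOOGLE_API_KEY:..."}))  # sdk
```

Note that once "proxy" is selected, the sketch never falls through to "sdk", mirroring the no-cross-mode-fallback rule.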
- Local mode: Qwen3-Embedding-0.6B, always available with zero config.
- GPU auto-detection: if a GPU is available (CUDA/DirectML) and `llama-cpp-python` is installed, the server automatically uses the GGUF model (~480MB) instead of ONNX (~570MB) for better performance.
- All embeddings are stored at 768 dims (default), so switching providers never breaks the vector table.
- Override with `EMBEDDING_BACKEND=local` to force local embedding even when API keys are set.

`API_KEYS` supports multiple providers in a single string:

```bash
API_KEYS=GOOGLE_API_KEY:AIza...,OPENAI_API_KEY:sk-...,COHERE_API_KEY:co-...
```
Cloud embedding providers (auto-detected from `API_KEYS`, in priority order):

| Priority | Env Var (LiteLLM) | Model | Native Dims | Stored |
|---|---|---|---|---|
| 1 | `GEMINI_API_KEY` | `gemini/gemini-embedding-001` | 3072 | 768 |
| 2 | `OPENAI_API_KEY` | `text-embedding-3-large` | 3072 | 768 |
| 3 | `COHERE_API_KEY` | `embed-multilingual-v3.0` | 1024 | 768 |

All embeddings are truncated to 768 dims (default) for storage, so switching models never breaks the vector table. Override with `EMBEDDING_DIMS` if needed.

The `API_KEYS` format maps your env var to LiteLLM's expected var (e.g., `GOOGLE_API_KEY:key` auto-sets `GEMINI_API_KEY`). Set `EMBEDDING_MODEL` explicitly for other providers.
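Truncating every provider's native vector to a fixed 768 dims is what keeps the vector table schema stable. A sketch of the idea; whether mnemo-mcp re-normalizes after truncation is an assumption here, but re-normalizing keeps cosine-similarity scores comparable:

```python
import math

def truncate_embedding(vec: list[float], dims: int = 768) -> list[float]:
    """Keep the first `dims` components and re-normalize to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

# A toy 4-dim vector truncated to 2 dims:
v = truncate_embedding([3.0, 4.0, 0.0, 0.0], dims=2)
print(v)  # [0.6, 0.8]
```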
## MCP Tools

### memory — Core memory operations

| Action | Required | Optional |
|---|---|---|
| `add` | `content` | `category`, `tags` |
| `search` | `query` | `category`, `tags`, `limit` |
| `list` | — | `category`, `limit` |
| `update` | `memory_id` | `content`, `category`, `tags` |
| `delete` | `memory_id` | — |
| `export` | — | — |
| `import` | `data` (JSONL) | `mode` (merge/replace) |
| `stats` | — | — |
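The `import` action takes JSONL: one memory object per line. The exact field names of the export schema are not specified in this README; the record shape below is hypothetical, chosen to match the `add` action's parameters:

```python
import json

# Hypothetical memory records; the real export schema may differ.
memories = [
    {"content": "User prefers tabs over spaces", "category": "preference", "tags": ["style"]},
    {"content": "Project targets Python 3.13", "category": "fact", "tags": ["env"]},
]

# Serialize as JSONL: one compact JSON object per line.
jsonl = "\n".join(json.dumps(m, ensure_ascii=False) for m in memories)
print(jsonl)

# Round-trip: parse each line back into a dict.
parsed = [json.loads(line) for line in jsonl.splitlines()]
```

In `merge` mode the imported records are combined with existing memories; `replace` discards the existing set first.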
### config — Server configuration

| Action | Required | Optional |
|---|---|---|
| `status` | — | — |
| `sync` | — | — |
| `set` | `key`, `value` | — |

### help — Full documentation

```python
help(topic="memory")  # or "config"
```
## MCP Resources

| URI | Description |
|---|---|
| `mnemo://stats` | Database statistics and server status |
| `mnemo://recent` | 10 most recently updated memories |
## MCP Prompts

| Prompt | Parameters | Description |
|---|---|---|
| `save_summary` | `summary` | Generates a prompt to save a conversation summary as a memory |
| `recall_context` | `topic` | Generates a prompt to recall relevant memories about a topic |
## Architecture

```
MCP Client (Claude, Cursor, etc.)
              |
        FastMCP Server
        /     |     \
   memory   config   help
     |        |        |
 MemoryDB  Settings  docs/
   /    \
 FTS5  sqlite-vec
          |
   EmbeddingBackend
      /         \
 LiteLLM     Qwen3 ONNX
    |        (local CPU)
 Gemini / OpenAI / Cohere

Sync: rclone (embedded) -> Google Drive / S3 / ...
```
## Development

```bash
# Install
uv sync

# Run
uv run mnemo-mcp

# Lint
uv run ruff check src/
uv run ty check src/

# Test
uv run pytest
```
## Compatible With

Any MCP client (Claude, Cursor, etc.).

## Also by n24q02m

| Server | Description | Install |
|---|---|---|
| better-notion-mcp | Notion API for AI agents | `npx -y @n24q02m/better-notion-mcp@latest` |
| wet-mcp | Web search, content extraction, library docs | `uvx --python 3.13 wet-mcp@latest` |
| better-email-mcp | Email (IMAP/SMTP) for AI agents | `npx -y @n24q02m/better-email-mcp@latest` |
| better-godot-mcp | Godot Engine for AI agents | `npx -y @n24q02m/better-godot-mcp@latest` |
## Related Projects

- modalcom-ai-workers — GPU-accelerated AI workers on Modal.com (embedding, reranking)
- qwen3-embed — local embedding/reranking library used by mnemo-mcp

## Contributing

See CONTRIBUTING.md.

## License

MIT. See LICENSE.