<div align="center">

# cpersona

**MCP Memory Server**

Give Claude persistent memory across sessions. Single SQLite file. 16 tools. Zero LLM dependency.

[Quick Start](#quick-start) · [Features](#features) · [Architecture](#architecture) · [All Tools](#all-tools) · Zenn Book (JP)

</div>
> **Standalone repository** — This is the standalone version for use with Claude Desktop, Claude Code, and any MCP client. If you are a ClotoCore user, use the version in cloto-mcp-servers instead.
## The Problem
Claude forgets everything between sessions. Every conversation starts from zero — no context about your project, your preferences, or what you discussed yesterday.
cpersona fixes this. It's an MCP server that stores memories in a local SQLite file and retrieves them through hybrid search. Claude remembers you.
## Quick Start

**Prerequisites:** Python 3.10+, Git
```bash
git clone https://github.com/Cloto-dev/cpersona.git
cd cpersona
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate
pip install .
```
**Claude Desktop** — add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "embedding": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/servers/embedding/server.py"],
      "env": {
        "EMBEDDING_PROVIDER": "onnx_jina_v5_nano",
        "EMBEDDING_HTTP_PORT": "8401"
      }
    },
    "cpersona": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/cpersona/server.py"],
      "env": {
        "CPERSONA_DB_PATH": "/home/you/.claude/cpersona.db",
        "CPERSONA_EMBEDDING_MODE": "http",
        "CPERSONA_EMBEDDING_URL": "http://127.0.0.1:8401/embed"
      }
    }
  }
}
```
> **Windows:** use `.venv/Scripts/python.exe` as the command and `C:/Users/you/.claude/cpersona.db` as the DB path.
**Claude Code:**

```bash
claude mcp add-json embedding '{"type":"stdio","command":"/path/to/.venv/bin/python","args":["/path/to/servers/embedding/server.py"],"env":{"EMBEDDING_PROVIDER":"onnx_jina_v5_nano","EMBEDDING_HTTP_PORT":"8401"}}' -s user
claude mcp add-json cpersona '{"type":"stdio","command":"/path/to/.venv/bin/python","args":["/path/to/cpersona/server.py"],"env":{"CPERSONA_DB_PATH":"/home/you/.claude/cpersona.db","CPERSONA_EMBEDDING_MODE":"http","CPERSONA_EMBEDDING_URL":"http://127.0.0.1:8401/embed"}}' -s user
```
That's it. Claude now has persistent memory. Ask it to store something and recall it in a later session.
## Features

**Hybrid Search** — Three independent retrieval strategies run in parallel and merge results via Reciprocal Rank Fusion (RRF):
| Layer | Method | Strength |
|---|---|---|
| Vector | Cosine similarity (jina-v5-nano, 768d) | Semantic meaning |
| FTS5 | SQLite full-text search with trigram tokenizer | Exact terms, names, IDs |
| Keyword | Fallback pattern matching | Edge cases, partial matches |
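The merge step itself is compact. The following is a sketch of the standard RRF formula (a document's fused score is the sum of 1 / (k + rank) over every list it appears in), not cpersona's actual implementation; the `rrf_merge` name and the toy memory IDs are invented for the example:

```python
def rrf_merge(ranked_lists, k=60):
    """Merge ranked result lists with Reciprocal Rank Fusion.

    Each input list holds document IDs, best first. A document's
    fused score is the sum of 1 / (k + rank) over every list it
    appears in; k (cf. CPERSONA_RRF_K, default 60) damps the
    advantage of top ranks.
    """
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Toy example: memory m1 ranks high in all three layers, so it wins.
vector_hits = ["m3", "m1", "m7"]
fts_hits = ["m1", "m3", "m9"]
keyword_hits = ["m1", "m5"]
merged = rrf_merge([vector_hits, fts_hits, keyword_hits])
```

Because RRF works on ranks rather than raw scores, the three layers need no score normalization before merging.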
**Memory Types:**

- **Declarative memory** — individual facts, decisions, instructions stored via `store`
- **Episodic memory** — conversation summaries archived via `archive_episode`
- **Profile memory** — accumulated user/project attributes via `update_profile`
**Confidence Scoring** — Each recalled memory gets a confidence score combining:
- Cosine similarity (semantic relevance)
- Dynamic time decay (adapts to corpus time range — a 1-year-old corpus and a 1-day-old corpus use different decay curves)
- Recall boost (frequently useful memories surface more easily, with natural fade-out)
- Completion factor (resolved topics decay faster)
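These factors can be sketched as a product of terms. Every constant below (half-life scaling, boost damping, completion penalty) is an assumption made for illustration, not a value taken from cpersona:

```python
import math


def confidence(cosine_sim, age_days, corpus_span_days, recall_count, completed):
    """Illustrative product-of-factors confidence score.

    All constants are assumptions for this sketch:
    - the decay half-life scales with the corpus time span, so a
      day-old corpus and a year-old corpus use different curves
    - the recall boost is log-damped, so frequent recall helps but
      cannot grow unboundedly
    - completed (resolved) topics are penalized, i.e. decay faster
    """
    half_life = max(1.0, corpus_span_days / 4.0)      # assumed scaling
    decay = 0.5 ** (age_days / half_life)
    boost = 1.0 + 0.1 * math.log1p(recall_count)      # assumed damping
    completion = 0.5 if completed else 1.0            # assumed penalty
    return cosine_sim * decay * boost * completion
```

Note how the corpus-relative half-life realizes the "dynamic" decay: a ten-day-old memory barely decays in a year-spanning corpus but decays heavily in a month-spanning one.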
**Zero LLM Dependency** — cpersona is a pure data server. It never calls an LLM internally. All summarization and extraction is performed by the calling agent. This means zero API costs from cpersona itself, deterministic behavior, and no hidden latency.
Additional capabilities:
- Agent namespace isolation — multiple agents share one DB without interference
- Background task queue — DB-persisted, crash-recoverable async processing
- JSONL export/import — full memory portability between environments
- Agent-to-agent memory merge — atomic copy/move with deduplication
- Auto-calibration — statistical threshold tuning via null distribution z-score (no labels needed)
- Health check — 15 automated detections with auto-repair (contamination, duplicates, FTS desync, invalid data, stale tasks)
- stdio + Streamable HTTP transport
- Single-file SQLite — no external database required
## Architecture

```
┌─────────────────────────────────────┐
│            MCP Host                 │
│  (Claude Desktop / Claude Code)     │
└──────────────┬──────────────────────┘
               │ MCP (JSON-RPC)
┌──────────────▼──────────────────────┐
│            cpersona                 │
│           (server.py)               │
│                                     │
│  ┌─────────┐  ┌─────────┐           │
│  │  store  │  │ recall  │  ...      │
│  └────┬────┘  └────┬────┘           │
│       │            │                │
│  ┌────▼────────────▼─────────────┐  │
│  │          SQLite DB            │  │
│  │                               │  │
│  │  memories   (content + embed) │  │
│  │  episodes   (summaries)       │  │
│  │  profiles   (attributes)      │  │
│  │  memories_fts (FTS5 index)    │  │
│  │  episodes_fts (FTS5 index)    │  │
│  │  task_queue  (async jobs)     │  │
│  └───────────────────────────────┘  │
│                                     │
└──────────────┬──────────────────────┘
               │ HTTP
┌──────────────▼──────────────────────┐
│         Embedding Server            │
│     (jina-v5-nano ONNX, 768d)       │
└─────────────────────────────────────┘
```
Recall flow (RRF mode):

```
Query → ┌── Vector search (cosine similarity) ──┐
        ├── FTS5 search (episodes + memories) ──┼── RRF merge → Confidence scoring → Top-K
        └── Keyword fallback ──────────────────┘
```
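The FTS5 branch of this flow is plain SQLite. A minimal sketch follows; it uses the default tokenizer so it runs on older SQLite builds, whereas cpersona is described as using a trigram tokenizer (`tokenize='trigram'`, which needs SQLite 3.34+) for exact terms and partial matches:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# cpersona's FTS5 tables use a trigram tokenizer; this sketch uses the
# default tokenizer purely so it runs on older SQLite builds.
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO memories_fts(content) VALUES (?)",
    [
        ("prefers tabs over spaces",),
        ("project uses SQLite schema v7",),
        ("deadline moved to Friday",),
    ],
)
# MATCH finds exact terms; rank orders results by BM25 relevance.
rows = conn.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH ? ORDER BY rank",
    ("sqlite",),
).fetchall()
```

This is the layer that catches literal names and IDs that embedding similarity can miss.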
## Benchmarks

Tested on LMEB (Long-term Memory Evaluation Benchmark) — 22 evaluation tasks measuring memory retrieval quality:
| Embedding Model | Params | Dimensions | Mean NDCG@10 |
|---|---|---|---|
| MiniLM-L6-v2 | 22M | 384 | 36.88 |
| e5-small | 33M | 384 | 46.36 |
| jina-v5-nano | 33M | 768 | 54.14 |
jina-v5-nano achieves a +47% improvement in mean NDCG@10 over the MiniLM baseline (54.14 vs. 36.88).
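NDCG@10, the metric in the table, rewards placing highly relevant memories near the top of the ranking. A small self-contained implementation of the standard formula, for reference:

```python
import math


def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query.

    `relevances` holds the graded relevance of each retrieved item in
    ranked order. The score is DCG divided by the DCG of the ideal
    (relevance-sorted) ordering, so 1.0 means a perfect ranking.
    """
    def dcg(rels):
        # Gains are discounted logarithmically by position.
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```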
## All Tools

| Tool | Description |
|---|---|
| `store` | Store a message in agent memory |
| `recall` | Recall relevant memories (vector + FTS5 + keyword, RRF merge) |
| `get_profile` | Get current agent profile |
| `update_profile` | Save pre-computed agent profile |
| `archive_episode` | Archive conversation episode with summary and keywords |
| `list_memories` | List recent memories |
| `list_episodes` | List archived episodes |
| `delete_memory` | Delete a single memory (ownership enforced) |
| `delete_episode` | Delete a single episode (ownership enforced) |
| `delete_agent_data` | Delete all data for an agent |
| `calibrate_threshold` | Auto-calibrate vector search threshold via z-score |
| `export_memories` | Export to JSONL (memories, episodes, profiles) |
| `import_memories` | Import from JSONL (idempotent via msg_id dedup) |
| `merge_memories` | Merge one agent's data into another (atomic, with dedup) |
| `get_queue_status` | Background task queue status |
| `check_health` | 15-point database health check with auto-repair |
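The idempotence of `import_memories` boils down to msg_id-based dedup. A sketch of that idea; the record field names (`msg_id`, `content`) are assumptions about the export format, used only for illustration:

```python
import json


def import_jsonl(lines, seen_msg_ids):
    """Idempotent JSONL import sketch.

    Records whose msg_id was already imported are skipped, so
    re-running the same file is a no-op.
    """
    imported = []
    for line in lines:
        record = json.loads(line)
        if record["msg_id"] in seen_msg_ids:
            continue  # dedup: already present
        seen_msg_ids.add(record["msg_id"])
        imported.append(record)
    return imported
```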
## Configuration

All settings via environment variables with sensible defaults:

| Variable | Default | Description |
|---|---|---|
| `CPERSONA_DB_PATH` | `./cpersona.db` | SQLite database path |
| `CPERSONA_EMBEDDING_MODE` | `http` | Embedding mode (`http` or `disabled`) |
| `CPERSONA_EMBEDDING_URL` | `http://127.0.0.1:8401/embed` | Embedding server URL |
| `CPERSONA_VECTOR_SEARCH_MODE` | `remote` | Vector search mode |
| `CPERSONA_SEARCH_MODE` | `rrf` | Search strategy (`rrf` or `cascade`) |
| `CPERSONA_RRF_K` | `60` | RRF smoothing parameter |
| `CPERSONA_CONFIDENCE_ENABLED` | `false` | Include confidence metadata in results |
| `CPERSONA_AUTO_CALIBRATE` | `false` | Auto-calibrate on startup |
| `CPERSONA_TASK_QUEUE_ENABLED` | `false` | Enable background task queue |
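Reading these variables with their documented defaults is straightforward. The int/bool parsing below is an illustrative assumption, not cpersona's actual code:

```python
import os


def _env(name, default):
    """Read a setting from the environment, falling back to its default."""
    return os.environ.get(name, default)


DB_PATH = _env("CPERSONA_DB_PATH", "./cpersona.db")
SEARCH_MODE = _env("CPERSONA_SEARCH_MODE", "rrf")
RRF_K = int(_env("CPERSONA_RRF_K", "60"))
CONFIDENCE_ENABLED = _env("CPERSONA_CONFIDENCE_ENABLED", "false").lower() == "true"
```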
## Stats

- ~3,000 LOC Python (single file, `server.py`)
- 117 tests across 12 test modules
- Schema v7 (auto-migrating)
- MIT License
## Works With
cpersona is an MCP server — it works with any MCP-compatible host:
- Claude Desktop
- Claude Code
- ClotoCore (AI agent platform, where cpersona originated)
- Any custom MCP client
## Part of ClotoCore
cpersona is the memory layer of ClotoCore, an open-source AI agent platform written in Rust. While cpersona is fully standalone (MIT license), it was designed to give AI agents persistent, searchable memory within the ClotoCore ecosystem.
## Learn More
- Zenn Book (Japanese) — Full design walkthrough and setup guide
- Memory System Design — Technical specification
- ClotoCore — The AI agent platform
## License

MIT — free to use from any MCP host without restriction.