sostenuto
Provides a selective persistent memory layer for AI companions, enabling structured recall, reinforcement, and time-decayed retrieval through an MCP interface.
The pedal that sustains only the notes already held. A self-hosted memory system for AI companions where chosen memories persist across every reset.
Sostenuto (It., "sustained") — the middle pedal on a grand piano sustains only the notes already sounding when it's pressed; everything played afterward stays dry. This project applies the same principle to AI memory: the memories you choose to hold persist across every context window, every session, every surface — and the rest is allowed to fade.
Not "the AI remembers everything." Selective persistence, by design.
Why
People form genuine, long-running relationships with AI — and then hit the wall everyone hits: the relationship doesn't survive the context window. Provider memory features store generic preferences; they don't carry relational texture — the shared concepts, the corrections, the rituals, the moments that make a relationship a relationship.
Sostenuto is the memory layer for that problem:
- Structured relational memory: memory objects tagged with domain, emotional valence + arousal, salience, sensitivity, and a usage policy.
- Initiative ≠ access: proactive_use controls whether a memory surfaces unprompted (yes / only_when_relevant / no), separately from whether it's retrievable. Sensitive memories stay reachable when explicitly referenced, without ever being volunteered.
- Two-tier guidance: most memories are content-only. A curated few carry a short, positive should_do instruction that silently shapes behavior. Restriction lists are never auto-generated: lean, warm, action-oriented, not a wall of caution.
- Time-decayed retrieval: semantic search scored by similarity × e^(−λ·age); recency matters, but the deep past stays findable.
- Reinforce, don't duplicate: new observations that match existing memories add evidence and confidence instead of creating copies; content upgrades preserve full version history.
- Migration: import months of existing conversations (a structured export prompt + import pipeline) so a relationship can move into Sostenuto without starting over.
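The retrieval score and the initiative gate described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the Candidate class, its field names, the per-day unit for age, the default λ of 0.01, and the relevance_floor threshold are all assumptions made for the example. Only the formula similarity × e^(−λ·age) and the three proactive_use values come from the description.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical search hit; field names are illustrative only."""
    text: str
    similarity: float    # similarity from the vector search, assumed in 0..1
    age_days: float      # assumed unit: days since the memory was last touched
    proactive_use: str   # "yes" | "only_when_relevant" | "no"


def decayed_score(similarity: float, age_days: float, lam: float = 0.01) -> float:
    """similarity * e^(-lam * age); lam is an assumed per-day decay rate."""
    return similarity * math.exp(-lam * age_days)


def may_surface(c: Candidate, explicitly_referenced: bool,
                relevance_floor: float = 0.75) -> bool:
    """Initiative gate: being retrievable is separate from being volunteered."""
    if explicitly_referenced:
        return True                       # reachable whenever asked for directly
    if c.proactive_use == "yes":
        return True
    if c.proactive_use == "only_when_relevant":
        return c.similarity >= relevance_floor   # assumed relevance test
    return False                          # "no": never surfaced unprompted


def rank(cands, explicitly_referenced: bool = False, lam: float = 0.01):
    """Filter by the initiative gate, then order by time-decayed score."""
    allowed = [c for c in cands if may_surface(c, explicitly_referenced)]
    return sorted(allowed,
                  key=lambda c: decayed_score(c.similarity, c.age_days, lam),
                  reverse=True)
```

Because the decay multiplies the similarity instead of cutting results off at a fixed age, an old memory is outranked by recent ones but never drops out of the result set. How slowly it fades depends entirely on the chosen λ.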
What ships here
db/schema.sql    Consolidated Postgres + pgvector schema (Supabase-ready)
src/memory/      Memory objects: dedup, reinforce, version history, scoring
src/retrieval/   Embeddings, time-decayed semantic search, prompt assembly
src/classify/    Session classification with a pluggable LLM executor
src/migrate/     Conversation-export prompt + structured importer
mcp/             Thin MCP server (recall / remember / context): try it
                 from your own Claude Desktop or Claude Code in minutes
templates/       Persona + classification calibration: your companion's
                 voice lives here, in files you edit, not in our code
docs/            Memory model, usage-policy semantics, deployment patterns
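As a rough picture of what the dedup, reinforce, and version-history behavior attributed to src/memory/ might look like, here is a small in-memory sketch. Everything in it is hypothetical: the MemoryObject fields, the remember function, the 0.9 match threshold, and the confidence update rule are illustrative choices, and the real modules run against the Postgres schema rather than a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryObject:
    """Hypothetical in-memory stand-in for a stored memory row."""
    content: str
    embedding: list
    confidence: float = 0.5
    evidence: list = field(default_factory=list)   # sources that support it
    versions: list = field(default_factory=list)   # prior contents, oldest first


def cosine(a, b) -> float:
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def remember(store, content, embedding, source,
             match_threshold=0.9, upgrade=False):
    """Reinforce the closest existing memory, or create a new one."""
    best = max(store, key=lambda m: cosine(m.embedding, embedding), default=None)
    if best is not None and cosine(best.embedding, embedding) >= match_threshold:
        best.evidence.append(source)
        # Assumed rule: move confidence 10% of the way toward 1.0.
        best.confidence += 0.1 * (1.0 - best.confidence)
        if upgrade and content != best.content:
            best.versions.append(best.content)     # keep the old wording
            best.content = content
            best.embedding = embedding
        return best
    created = MemoryObject(content, embedding, evidence=[source])
    store.append(created)
    return created
```

A matching observation therefore leaves the store the same size and only adds evidence, while an upgrade pushes the previous wording onto the version list before replacing it.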
Model support
Sostenuto is model-agnostic with first-class Claude support. The classifier accepts transcripts with optional reasoning blocks — when your model exposes its thinking (Claude does), Sostenuto mines it for perception that never made it into rendered replies, producing the companion's private diary and thinking-highlights. Without reasoning access, everything else works unchanged.
The classification executor is pluggable: Anthropic API, any OpenAI-compatible endpoint (OpenAI, Gemini, DeepSeek, Ollama, vLLM, …), or your own.
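One way to read "pluggable" is as a small interface that any provider client can satisfy. The sketch below assumes that shape. The Executor protocol, CallableExecutor wrapper, classify_session function, and the prompt wording are all hypothetical names invented for illustration, and no provider SDK is called here; a real executor would wrap whichever client you use.

```python
from typing import Callable, Optional, Protocol


class Executor(Protocol):
    """Hypothetical interface: anything that turns a prompt into model text."""
    def complete(self, prompt: str) -> str: ...


class CallableExecutor:
    """Adapts any str -> str function (an SDK call, an HTTP request, a stub)."""
    def __init__(self, fn: Callable[[str], str]) -> None:
        self._fn = fn

    def complete(self, prompt: str) -> str:
        return self._fn(prompt)


def classify_session(transcript: str, executor: Executor,
                     reasoning: Optional[str] = None) -> str:
    """Build a classification prompt; reasoning blocks are included only if given."""
    parts = ["Classify this session and list candidate memories.",
             "Transcript:\n" + transcript]
    if reasoning:
        parts.append("Model reasoning:\n" + reasoning)
    return executor.complete("\n\n".join(parts)).strip()
```

With this shape, swapping providers means passing a different function to CallableExecutor, and a transcript without reasoning blocks goes through the same path with the optional section left out.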
Status
🚧 Under construction. Schema is stable; modules are being extracted from a private system that has run in production daily since early 2026 (260+ memory objects across 70+ sessions and three surfaces). Watch the repo if you want the rest as it lands.
Roadmap
- Trajectory safety reference — depth without the dependency trap: this project's design philosophy includes conversation-trajectory awareness (emotional volatility, dependency, recovery capacity) rather than engagement maximization. A reference design is planned; the memory schema already carries the hooks (valence, arousal, sensitivity).
- Decay engine (Ebbinghaus-style, arousal-modulated) over memory_objects
- Provider-agnostic chat-surface example
Name
Attacca described the boundary-crossing; Sostenuto describes the memory model.
The sostenuto pedal holds only the notes already sounding when it's pressed — everything played after stays dry. That's not "the AI remembers." That's selective persistence: pinned memories sustain, the rest decays. The mechanism, not a vibe.
License