sostenuto

Provides a selective persistent memory layer for AI companions, enabling structured recall, reinforcement, and time-decayed retrieval through an MCP interface.

The pedal that sustains only the notes already held. A self-hosted memory system for AI companions where chosen memories persist across every reset.


Sostenuto (It., "sustained") — the middle pedal on a grand piano sustains only the notes already sounding when it's pressed; everything played afterward stays dry. This project applies the same principle to AI memory: the memories you choose to hold persist across every context window, every session, every surface — and the rest is allowed to fade.

Not "the AI remembers everything." Selective persistence, by design.

Why

People form genuine, long-running relationships with AI — and then hit the wall everyone hits: the relationship doesn't survive the context window. Provider memory features store generic preferences; they don't carry relational texture — the shared concepts, the corrections, the rituals, the moments that make a relationship a relationship.

Sostenuto is the memory layer for that problem:

  • Structured relational memory — memory objects tagged with domain, emotional valence + arousal, salience, sensitivity, and a usage policy.
  • Initiative ≠ access: proactive_use controls whether a memory surfaces unprompted (yes / only_when_relevant / no), separately from whether it's retrievable. Sensitive memories stay reachable when explicitly referenced, without ever being volunteered.
  • Two-tier guidance: most memories are content-only. A curated few carry a short, positive should_do instruction that silently shapes behavior. Restriction lists are never auto-generated, so guidance stays lean, warm, and action-oriented rather than becoming a wall of caution.
  • Time-decayed retrieval — semantic search scored by similarity × e^(−λ·age); recency matters, but the deep past stays findable.
  • Reinforce, don't duplicate — new observations that match existing memories add evidence and confidence instead of creating copies; content upgrades preserve full version history.
  • Migration — import months of existing conversations (a structured export prompt + import pipeline) so a relationship can move into Sostenuto without starting over.
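
As a rough illustration of the scoring rule and the initiative/access split described above, here is a minimal Python sketch. It is not the project's implementation: the Memory class, the decay_rate default, the age unit (days), and the explicit_reference flag are all assumptions made for the example. Only the formula similarity × e^(−λ·age) and the proactive_use values come from this README.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    # Hypothetical shape; the real schema lives in db/schema.sql.
    content: str
    similarity: float  # similarity to the query, assumed precomputed
    age_days: float    # age of the memory; the unit is an assumption
    proactive_use: str = "only_when_relevant"  # "yes" | "only_when_relevant" | "no"


def decayed_score(similarity: float, age_days: float, decay_rate: float = 0.01) -> float:
    """similarity × e^(−λ·age), with λ = decay_rate (value chosen for illustration)."""
    return similarity * math.exp(-decay_rate * age_days)


def rank(memories, explicit_reference: bool = False, decay_rate: float = 0.01):
    """Order memories by decayed score.

    Memories marked proactive_use == "no" are skipped unless the caller says
    the user explicitly referenced them (assumed gating rule).
    """
    eligible = [m for m in memories if explicit_reference or m.proactive_use != "no"]
    return sorted(
        eligible,
        key=lambda m: decayed_score(m.similarity, m.age_days, decay_rate),
        reverse=True,
    )
```

Because the exponential never reaches zero, an old memory with a strong match still receives a positive score, which is one way to read "the deep past stays findable".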

What ships here

db/schema.sql        Consolidated Postgres + pgvector schema (Supabase-ready)
src/memory/          Memory objects: dedup, reinforce, version history, scoring
src/retrieval/       Embeddings, time-decayed semantic search, prompt assembly
src/classify/        Session classification with a pluggable LLM executor
src/migrate/         Conversation-export prompt + structured importer
mcp/                 Thin MCP server (recall / remember / context) — try it
                     from your own Claude Desktop or Claude Code in minutes
templates/           Persona + classification calibration — your companion's
                     voice lives here, in files you edit, not in our code
docs/                Memory model, usage-policy semantics, deployment patterns
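
To make the "reinforce, don't duplicate" behaviour attributed to src/memory/ more concrete, the following Python sketch shows one possible shape for it. Everything here is hypothetical: MemoryObject, remember, upgrade, the 0.6 threshold, the 0.1 confidence step, and the token-overlap similarity (a stand-in for embedding similarity) are assumptions, not the repository's code.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryObject:
    # Hypothetical in-memory shape, not the actual table layout.
    content: str
    confidence: float = 0.5
    evidence: list = field(default_factory=list)
    history: list = field(default_factory=list)  # earlier versions of content


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens; a toy stand-in for embedding similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0


def remember(store: list, observation: str, threshold: float = 0.6) -> MemoryObject:
    """Reinforce the closest existing memory, or create a new one if none is close."""
    best = max(store, key=lambda m: token_overlap(m.content, observation), default=None)
    if best is not None and token_overlap(best.content, observation) >= threshold:
        best.evidence.append(observation)                  # add evidence
        best.confidence = min(1.0, best.confidence + 0.1)  # raise confidence
        return best
    created = MemoryObject(content=observation, evidence=[observation])
    store.append(created)
    return created


def upgrade(memory: MemoryObject, new_content: str) -> None:
    """Replace the content while keeping the previous version in history."""
    memory.history.append(memory.content)
    memory.content = new_content
```

In this sketch a near-duplicate observation returns the existing object with one more evidence entry, while an unrelated observation appends a new object to the store.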

Model support

Sostenuto is model-agnostic with first-class Claude support. The classifier accepts transcripts with optional reasoning blocks — when your model exposes its thinking (Claude does), Sostenuto mines it for perception that never made it into rendered replies, producing the companion's private diary and thinking-highlights. Without reasoning access, everything else works unchanged.

The classification executor is pluggable: Anthropic API, any OpenAI-compatible endpoint (OpenAI, Gemini, DeepSeek, Ollama, vLLM, …), or your own.
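
One way to picture a pluggable executor is as a single-method interface that any backend can satisfy. The Python sketch below is an assumption-laden illustration: the Executor protocol, its complete method, StubExecutor, classify_session, and the JSON reply format are invented for the example and may not match the interface in src/classify/.

```python
import json
from typing import Protocol


class Executor(Protocol):
    """Hypothetical contract: take a prompt, return the model's raw text reply."""

    def complete(self, prompt: str) -> str: ...


class StubExecutor:
    """Stand-in backend that returns a fixed JSON reply without calling any API."""

    def complete(self, prompt: str) -> str:
        return json.dumps({"domain": "example", "salience": 0.5})


def classify_session(transcript: str, executor: Executor) -> dict:
    """Build a classification prompt and parse the executor's JSON reply."""
    prompt = "Classify this session and reply with JSON only:\n" + transcript
    return json.loads(executor.complete(prompt))
```

A real backend (the Anthropic API, an OpenAI-compatible endpoint, or a local model) would implement the same method and be passed in place of StubExecutor; the calling code would not need to change.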

Status

🚧 Under construction. Schema is stable; modules are being extracted from a private system that has run in production daily since early 2026 (260+ memory objects across 70+ sessions and three surfaces). Watch the repo if you want the rest as it lands.

Roadmap

  • Trajectory safety reference — depth without the dependency trap: this project's design philosophy includes conversation-trajectory awareness (emotional volatility, dependency, recovery capacity) rather than engagement maximization. A reference design is planned; the memory schema already carries the hooks (valence, arousal, sensitivity).
  • Decay engine (Ebbinghaus-style, arousal-modulated) over memory_objects
  • Provider-agnostic chat-surface example

Name

Attacca described the boundary-crossing; Sostenuto describes the memory model.

The sostenuto pedal holds only the notes already sounding when it's pressed — everything played after stays dry. That's not "the AI remembers." That's selective persistence: pinned memories sustain, the rest decays. The mechanism, not a vibe.

License

MIT
