🧠 Cuba-Memorys
Persistent memory for AI agents – a Model Context Protocol (MCP) server that gives AI coding assistants long-term memory with a knowledge graph, Hebbian learning, GraphRAG enrichment, and anti-hallucination grounding.
12 tools with Cuban soul. Zero manual setup. Mathematically rigorous.
v1.6.0 – KG-neighbor query expansion, embedding LRU cache, community summaries, batch access tracking, pool race fix, session-aware search.
Why Cuba-Memorys?
AI agents forget everything between conversations. Cuba-Memorys solves this by giving them:
- A knowledge graph – Entities, observations, and relations that persist across sessions
- Error memory – Never repeat the same mistake twice (anti-repetition guard)
- Hebbian learning – Memories strengthen with use and fade adaptively (FSRS v4 spaced repetition)
- Anti-hallucination – Verify claims against stored knowledge with graduated confidence + source triangulation
- Semantic search – 4-signal RRF fusion (TF-IDF + pg_trgm + full-text + optional pgvector HNSW)
- KG-neighbor query expansion – Auto-expands low-recall searches via graph neighbors (PRF + Zep/Graphiti)
- GraphRAG – Top results enriched with degree-1 graph neighbors for topological context
- REM Sleep – Autonomous background consolidation (FSRS decay + prune + PageRank) after 15 min idle
- Graph intelligence – Personalized PageRank, Louvain community detection with summaries, betweenness centrality
- Embedding cache – LRU cache eliminates redundant ONNX inference (~84% reduction)
- Session-aware search – Results matching active session goals boosted by 15%
| Feature | Cuba-Memorys | Basic Memory MCPs |
|---|---|---|
| Knowledge graph with relations | ✅ | ❌ |
| Hebbian learning (Oja's rule) | ✅ | ❌ |
| FSRS v4 adaptive spaced repetition | ✅ | ❌ |
| 4-signal RRF fusion search | ✅ | ❌ |
| KG-neighbor query expansion (V9) | ✅ | ❌ |
| Embedding LRU cache (V10) | ✅ | ❌ |
| GraphRAG topological enrichment | ✅ | ❌ |
| REM Sleep autonomous consolidation | ✅ | ❌ |
| Community summaries (V12) | ✅ | ❌ |
| Source triangulation (V3) | ✅ | ❌ |
| Session-aware search (V13) | ✅ | ❌ |
| Batch access tracking (V14) | ✅ | ❌ |
| Token-budget truncation (V1) | ✅ | ❌ |
| Post-fusion deduplication (V2) | ✅ | ❌ |
| Adaptive confidence weights (V4) | ✅ | ❌ |
| Information density gating (V5) | ✅ | ❌ |
| Session-aware FSRS decay (V6) | ✅ | ❌ |
| Conditional pgvector + HNSW | ✅ | ❌ |
| Modular architecture (CC avg A) | ✅ | ❌ |
| Optional BGE embeddings (ONNX) | ✅ | ❌ |
| Contradiction detection | ✅ | ❌ |
| Graduated confidence scoring | ✅ | ❌ |
| Personalized PageRank | ✅ | ❌ |
| Louvain community detection + summaries | ✅ | ❌ |
| Betweenness centrality (bridges) | ✅ | ❌ |
| Shannon entropy (knowledge diversity) | ✅ | ❌ |
| Chi-squared concept drift detection | ✅ | ❌ |
| Error pattern detection + MTTR | ✅ | ❌ |
| Entity duplicate detection (SQL similarity) | ✅ | ❌ |
| Observation versioning (audit trail) | ✅ | ❌ |
| Temporal validity (valid_from/valid_until) | ✅ | ❌ |
| Write-time dedup gate | ✅ | ❌ |
| Auto-supersede contradictions | ✅ | ❌ |
| Full JSON export/backup (bounded) | ✅ | ❌ |
| Fuzzy search (typo-tolerant) | ✅ | ❌ |
| Spreading activation | ✅ | ❌ |
| Batch observations (10x fewer calls) | ✅ | ❌ |
| Entity type validation | ✅ | ❌ |
| Graceful shutdown (SIGTERM/SIGINT) | ✅ | ❌ |
| Auto-provisions its own DB | ✅ | ❌ |
Quick Start
1. Prerequisites
- Python 3.14+
- Docker (for PostgreSQL)
2. Install
git clone https://github.com/LeandroPG19/cuba-memorys.git
cd cuba-memorys
docker compose up -d
pip install -e .
# Optional: BGE embeddings for semantic search (~130MB model)
pip install -e ".[embeddings]"
3. Configure your AI editor
Add to your MCP configuration:
{
  "mcpServers": {
    "cuba-memorys": {
      "command": "/path/to/cuba_memorys_launcher.sh",
      "disabled": false
    }
  }
}
Or run directly:
DATABASE_URL="postgresql://cuba:memorys2026@127.0.0.1:5488/brain" python -m cuba_memorys
The server auto-creates the brain database and all tables on first run.
🇨🇺 Las 12 Herramientas (The 12 Tools)
Every tool is named after Cuban culture – memorable, professional, and meaningful.
Knowledge Graph
| Tool | Meaning | Description |
|---|---|---|
| `cuba_alma` | Alma – soul, essence | CRUD knowledge entities. Types: concept, project, technology, person, pattern, config. Triggers spreading activation on neighbors. |
| `cuba_cronica` | Crónica – chronicle | Attach observations to entities with contradiction detection, dedup gate, and information density gating (V5). Supports batch_add with density parity. Types: fact, decision, lesson, preference, context, tool_usage. |
| `cuba_puente` | Puente – bridge | Connect entities with typed relations (uses, causes, implements, depends_on, related_to). Traverse walks the graph. Infer discovers transitive connections (A→B→C). |
Search & Verification
| Tool | Meaning | Description |
|---|---|---|
| `cuba_faro` | Faro – lighthouse | Search with 4-signal RRF fusion + KG-neighbor expansion (V9) when recall is low. `verify` mode with source triangulation (V3) and adaptive confidence (V4). All results get batch access tracking (V14) for FSRS accuracy. Token-budget truncation (V1) + post-fusion dedup (V2). Session-aware: boosts results matching active goals (V13). |
Error Memory
| Tool | Meaning | Description |
|---|---|---|
| `cuba_alarma` | Alarma – alarm | Report errors immediately. Auto-detects patterns (≥3 similar = warning). Hebbian boosting for retrieval. |
| `cuba_remedio` | Remedio – remedy | Mark an error as resolved. Cross-references similar unresolved errors. |
| `cuba_expediente` | Expediente – case file | Search past errors/solutions. Anti-repetition guard: warns if a similar approach previously failed. |
Sessions & Decisions
| Tool | Meaning | Description |
|---|---|---|
| `cuba_jornada` | Jornada – workday | Track working sessions with goals and outcomes. Goals are used for session-aware decay (V6) and search boost (V13). |
| `cuba_decreto` | Decreto – decree | Record architecture decisions with context, alternatives, and rationale. |
Memory Maintenance
| Tool | Meaning | Description |
|---|---|---|
| `cuba_zafra` | Zafra – sugar harvest | Memory consolidation: decay (FSRS v4 adaptive, session-aware V6), prune, merge, summarize, pagerank (personalized), find_duplicates, export (bounded JSON backup), stats, backfill. |
| `cuba_eco` | Eco – echo | RLHF feedback: positive (Oja's rule boost), negative (decrease), correct (update with versioning). |
| `cuba_vigia` | Vigía – watchman | Graph analytics: summary (counts + token estimate), health (staleness, Shannon entropy, MTTR, DB size), drift (chi-squared), communities (Louvain + V12 summaries for ≥3 members), bridges (betweenness centrality). |
Mathematical Foundations
Cuba-Memorys is built on peer-reviewed algorithms, not ad-hoc heuristics:
FSRS v4 Adaptive Decay – Wozniak (1987) / Ye (2023)
R(t, S) = (1 + t/(9·S))^(-1)
S_new = S · (1 + e^0.1 · (11 - D) · S^(-0.2) · (e^((1-R)·0.9) - 1))
FSRS v4 provides adaptive memory decay. Stability grows with successful recalls – memories that are reinforced survive longer. V6: active session goals exempt observations from decay.
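As a concrete sketch (not the project's actual code), the two formulas above map directly to Python, with the constants exactly as printed in this section:

```python
import math

def retrievability(t_days: float, stability: float) -> float:
    """FSRS v4 forgetting curve: R(t, S) = (1 + t/(9*S))^-1."""
    return (1 + t_days / (9 * stability)) ** -1

def next_stability(s: float, d: float, r: float) -> float:
    """Stability update after a successful recall, given difficulty d
    and retrievability r at review time (constants per this README)."""
    return s * (1 + math.exp(0.1) * (11 - d) * s ** -0.2
                * (math.exp((1 - r) * 0.9) - 1))
```

For example, `retrievability(9, 1.0)` returns 0.5: at stability 1, recall probability halves after nine days, which matches the "decays in ~9 days" figure later in this README.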
Oja's Rule (1982) – Hebbian Learning
Positive: Δw = +η · (1 - w²)  → converges to 1.0, cannot explode
Negative: Δw = -η · (1 + w²)  → converges to 0.01 (floor)
Where η = 0.05. The w² term provides natural saturation – self-normalizing without explicit clipping.
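A minimal sketch of one update step, assuming the floor is enforced with a simple clamp (the exact clamping mechanism is not specified here):

```python
def oja_update(w: float, positive: bool, eta: float = 0.05) -> float:
    """One Hebbian step with Oja-style saturation; w stays in [0.01, 1.0]."""
    if positive:
        w += eta * (1 - w * w)   # step shrinks as w -> 1: natural saturation
    else:
        w -= eta * (1 + w * w)   # step grows with w: fast forgetting
    return min(max(w, 0.01), 1.0)
```

Repeated positive feedback drives a weight asymptotically toward 1.0 without overshooting; repeated negative feedback sinks it to the 0.01 floor.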
TF-IDF + RRF Fusion – Salton (1975) / Cormack (2009)
tfidf(t, d) = tf(t, d) · log(N / df(t))
RRF(d) = Σ 1/(k + rank_i(d)),  where k = 60
Reciprocal Rank Fusion combines multiple ranked lists from independent signals into a single robust ranking. V2: post-fusion deduplication removes semantic duplicates via word-overlap ratio.
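The fusion formula is small enough to show in full; this is a generic sketch, not the project's internals:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1/(k + rank(d))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

A document ranked well by several independent signals beats one ranked at the top by a single signal, which is what makes RRF robust to any one signal misfiring.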
KG-Neighbor Query Expansion (V9) – Zep/Graphiti (2025)
R₁ = φ(q)                                       # Initial search
q' = q ∪ {e.name : e ∈ neighbors(R₁, depth=1)}  # Expand via KG
R₂ = φ_trgm(q')                                 # Re-search
Final = RRF(R₁, R₂)                             # Fuse
Inspired by Zep/Graphiti BFS search (arXiv 2501.13956 §3.1). Only triggered when initial results < limit (low recall). Caps neighbor names at 5 to prevent query bloat. The literature reports a 15–25% recall improvement.
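In outline, the whole V9 loop looks like the sketch below; `search_fn` and `neighbors_fn` are illustrative stand-ins for the real search and graph-neighbor lookups:

```python
def kg_expand_search(query, search_fn, neighbors_fn,
                     limit=5, max_names=5, k=60):
    """V9 sketch: on low recall, re-search with KG-neighbor names appended."""
    r1 = search_fn(query)
    if len(r1) >= limit:                        # recall is fine: no expansion
        return r1
    names: list[str] = []
    for result in r1:
        for name in neighbors_fn(result):       # depth-1 graph neighbors
            if name not in names:
                names.append(name)
    expanded = f"{query} {' '.join(names[:max_names])}"  # cap: no query bloat
    scores: dict[str, float] = {}
    for ranking in (r1, search_fn(expanded)):   # RRF-fuse both result lists
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

The expansion only fires when the first pass under-delivers, so well-recalled queries pay no extra cost.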
Source Triangulation (V3) – Multi-Source Confidence
diversity = unique_entities / total_evidence
confidence_adjusted = confidence × (0.7 + 0.3 × diversity)
Penalizes single-source evidence to reduce confirmation bias. Multiple independent entities providing evidence increases confidence.
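The adjustment is a one-liner in practice; a sketch following the formula above:

```python
def triangulated_confidence(confidence: float,
                            evidence_entities: list[str]) -> float:
    """Scale confidence by source diversity; single-source evidence is penalized."""
    if not evidence_entities:
        return 0.0
    diversity = len(set(evidence_entities)) / len(evidence_entities)
    return confidence * (0.7 + 0.3 * diversity)
```

Three pieces of evidence from one entity keep only 80% of the raw confidence, while three independent entities keep 100%.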
Shannon Entropy – Information Density Gating (V5)
H = -Σ (c/n) · log₂(c/n)
density = H / H_max ∈ [0, 1]
Observations with density < 0.3 get reduced initial importance (0.3 instead of 0.5). Prevents low-information noise from polluting the knowledge base. Applied in both add and batch_add with parity.
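A word-level sketch of the density score (the real gate may tokenize differently):

```python
import math
from collections import Counter

def information_density(text: str) -> float:
    """Normalized word-level Shannon entropy: 0 = pure repetition, 1 = all unique."""
    words = text.lower().split()
    n = len(words)
    if n <= 1:
        return 0.0
    h = -sum((c / n) * math.log2(c / n) for c in Counter(words).values())
    return h / math.log2(n)          # H_max = log2(n) for n words
```

Repetitive noise like "config config config" scores 0.0 and would fall under the 0.3 gate, getting the reduced 0.3 initial importance.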
Optional BGE Embeddings – BAAI (2023)
model: Qdrant/bge-small-en-v1.5-onnx-Q (quantized, ~130MB)
runtime: ONNX (no PyTorch dependency)
similarity: cosine(embed(query), embed(observation))
Auto-downloads on first use. Falls back to TF-IDF if not installed. V10: LRU cache (256 entries) eliminates redundant ONNX calls (~84% reduction per session).
GraphRAG Enrichment
Top-3 search results are enriched with degree-1 graph neighbors via a single batched SQL query. Each result gets a graph_context array containing neighbor name, entity type, relation type, and Hebbian strength.
REM Sleep Daemon
After 15 minutes of user inactivity, an autonomous consolidation coroutine runs:
- FSRS Decay – Applies memory decay using the Ye (2023) v4 algorithm. V6: excludes observations linked to active session goals
- Prune – Removes low-importance (< 0.1), rarely-accessed observations
- PageRank – Recalculates personalized importance scores
- Relation Decay – Weakens unused relation strengths
- TF-IDF Rebuild – Refreshes the TF-IDF index
- Cache Clear – Clears the search cache + the V10 embedding LRU cache
Cancels immediately on new user activity. Prevents concurrent runs.
Conditional pgvector
IF pgvector extension detected:
    → Migrate embedding column: float4[] → vector(384)
    → Create HNSW index (m=16, ef_construction=64, vector_cosine_ops)
    → Add vector cosine distance as 4th RRF signal
    → Persist embeddings on observation insert (async, V7/V11)
ELSE:
    → Graceful degradation: TF-IDF + trigrams (unchanged)
Zero-downtime: auto-detects at startup, no configuration needed.
Personalized PageRank – Brin & Page (1998)
PR(v) = (1-α)/N + α · Σ PR(u)/deg(u),  where α = 0.85
personalization: biased toward recently active entities
final_importance = 0.6·PR + 0.4·current_importance
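A pure-Python power-iteration sketch of the personalized variant (the project itself uses networkx): the uniform (1-α)/N teleport is replaced by a vector biased toward recently active entities, and dangling-node mass is redistributed along the same vector. The entity names are illustrative.

```python
def personalized_pagerank(edges, personalization, alpha=0.85, iters=100):
    """PR(v) = (1-a)*tele(v) + a * sum_u PR(u)/outdeg(u)."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [v for u, v in edges if u == n] for n in nodes}
    total = sum(personalization.get(n, 0.0) for n in nodes) or 1.0
    tele = {n: personalization.get(n, 0.0) / total for n in nodes}
    pr = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - alpha) * tele[n] for n in nodes}
        for u in nodes:
            if out[u]:
                for v in out[u]:
                    nxt[v] += alpha * pr[u] / len(out[u])
            else:                               # dangling node: mass -> teleport
                for v in nodes:
                    nxt[v] += alpha * pr[u] * tele[v]
        pr = nxt
    return pr

pr = personalized_pagerank(
    [("FastAPI", "Pydantic"), ("FastAPI", "SQLAlchemy"),
     ("SQLAlchemy", "PostgreSQL")],
    personalization={"FastAPI": 1.0},           # recently active entity
)
# final_importance per the formula above: 0.6 * pr[n] + 0.4 * current_importance[n]
```

Biasing the teleport toward active entities concentrates rank near the current working set instead of spreading it uniformly over the graph.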
Community Detection + Summaries (V12)
communities = Louvain(G, resolution=1.0) – Blondel et al. (2008)
For communities with ≥3 entities, V12 generates structured summaries from the top-3 observations per entity (by importance). Inspired by Zep/Graphiti hierarchical KG (arXiv 2501.13956 §2.3).
Other Algorithms
- Shannon Entropy – knowledge diversity scoring (`diversity = H/H_max ∈ [0,1]`)
- Spreading Activation – Collins & Loftus (1975): neighbors get a 0.6% importance boost per access
- Chi-Squared Drift – Pearson (1900): detects error-type distribution changes
- Contradiction Detection – TF-IDF overlap (>0.7) + negation patterns (EN + ES)
Architecture
v1.6.0 uses a modular architecture, decomposed into focused modules with an average cyclomatic-complexity grade of A.
cuba-memorys/
├── docker-compose.yml       # Dedicated PostgreSQL (port 5488)
├── pyproject.toml           # Package metadata + optional deps
├── README.md
└── src/cuba_memorys/
    ├── __init__.py          # Version (1.6.0)
    ├── __main__.py          # Entry point
    ├── server.py            # Thin re-export (24 LOC)
    ├── protocol.py          # JSON-RPC transport, event loop, REM Sleep daemon
    ├── handlers.py          # 12 MCP tool handlers (CC-reduced via sub-function extraction)
    ├── constants.py         # Tool definitions, thresholds, enums
    ├── db.py                # asyncpg pool + orjson + pgvector detection + V15 race fix
    ├── schema.sql           # 5 tables, 15+ indexes, pg_trgm, versioning, V8 temporal index
    ├── hebbian.py           # FSRS v4, Oja's rule, spreading activation, info density
    ├── search.py            # LRU cache, RRF fusion + V2 dedup, NEIGHBORS_SQL
    ├── tfidf.py             # TF-IDF semantic search (scikit-learn)
    └── embeddings.py        # Optional BGE embeddings (ONNX) + V10 LRU cache
Database Schema
| Table | Purpose | Key Features |
|---|---|---|
| `brain_entities` | Knowledge graph nodes | tsvector + pg_trgm indexes, importance ∈ [0,1], FSRS stability/difficulty |
| `brain_observations` | Facts attached to entities | 9 types, provenance, versioning, temporal validity, vector(384) embedding (if pgvector) |
| `brain_relations` | Graph edges | 5 types, bidirectional delete, Hebbian strength |
| `brain_errors` | Error memory | JSONB context, synapse weight, MTTR tracking |
| `brain_sessions` | Working sessions | Goals (JSONB), outcome tracking |
Search Pipeline
Cuba-Memorys uses Reciprocal Rank Fusion (RRF, k=60) to combine up to 4 independent ranked signals:
| # | Signal | Source | Condition |
|---|---|---|---|
| 1 | Entities (ts_rank + trigrams + importance + freshness) | `brain_entities` | Always |
| 2 | Observations (ts_rank + trigrams + TF-IDF + importance) | `brain_observations` | Always |
| 3 | Errors (ts_rank + trigrams + synapse_weight) | `brain_errors` | Always |
| 4 | Vector cosine distance (HNSW) | `brain_observations.embedding` | pgvector installed |
Each signal produces an independent ranking. RRF fuses them: score(d) = Σ 1/(60 + rank_i(d)).
Post-fusion pipeline:
- V2 Deduplication – Removes semantic duplicates via word-overlap ratio
- V9 KG-neighbor expansion – When results < limit, expands the query via graph neighbors
- V13 Session boost – Active session goal keywords boost matching results by 15%
- GraphRAG enrichment – Top-3 results get degree-1 neighbor context
- V1 Token-budget truncation – Character budget allocated proportionally to RRF score
- V14 Batch access tracking – All returned observations get an `access_count` bump for FSRS accuracy
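The V1 truncation step might look like the sketch below; the field names and the ellipsis marker are illustrative, not the project's actual output format:

```python
def truncate_to_budget(results: list[dict], total_chars: int = 4000) -> list[dict]:
    """Allocate a character budget to each result in proportion to its RRF score."""
    total_score = sum(r["score"] for r in results) or 1.0
    for r in results:
        budget = max(1, int(total_chars * r["score"] / total_score))
        if len(r["text"]) > budget:
            r["text"] = r["text"][:budget - 1] + "…"   # mark the truncation
    return results
```

High-scoring results keep more of their text, so the context window is spent where relevance is highest.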
Dependencies
Core:
- `asyncpg` – PostgreSQL async driver
- `orjson` – Fast JSON serialization (handles UUID/datetime)
- `scikit-learn` – TF-IDF vectorization
- `networkx` – PageRank + Louvain + betweenness centrality
- `scipy` – Chi-squared statistical tests
- `rapidfuzz` – Entity duplicate detection
- `numpy` – Numerical operations
Optional (pip install -e ".[embeddings]"):
- `onnxruntime` – ONNX model inference
- `huggingface-hub` – Auto-download of the BGE model
- `tokenizers` – Fast tokenization
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | – | PostgreSQL connection string (required) |
Docker Compose
Runs a dedicated PostgreSQL 18 Alpine instance:
- Port: 5488 (avoids conflicts with 5432/5433)
- Resources: 256MB RAM, 0.5 CPU
- Restart: always (auto-starts on boot)
- Healthcheck: `pg_isready` every 10s
How It Works in Practice
1. The agent learns from your project
Agent: I learned that FastAPI endpoints must use async def with response_model.
→ cuba_alma(create, "FastAPI", technology)
→ cuba_cronica(add, "FastAPI", "All endpoints must be async def with response_model")
2. Error memory prevents repeated mistakes
Agent: I got IntegrityError: duplicate key on numero_parte.
→ cuba_alarma("IntegrityError", "duplicate key on numero_parte")
→ Similar error found! Solution: "Add SELECT EXISTS before INSERT with FOR UPDATE"
3. Anti-hallucination grounding
Agent: Let me verify this claim before responding...
→ cuba_faro("FastAPI uses Django ORM", mode="verify")
→ confidence: 0.0, level: "unknown"
→ recommendation: "No supporting evidence found. High hallucination risk."
4. Memories adapt with FSRS v4
Initial stability: S = 1.0 (decays in ~9 days)
After 5 reviews: S = 8.2 (decays in ~74 days)
After 20 reviews: S = 45.0 (survives ~13 months)
5. KG-neighbor query expansion (V9)
Query: "deployment strategy" → 2 results (below limit of 5)
→ KG neighbors: [Docker, CI/CD, NGINX]
→ Expanded: "deployment strategy Docker CI/CD NGINX"
→ 5 results (recall improved by graph context)
6. Community summaries (V12)
→ cuba_vigia(metric="communities")
→ Community 0 (4 members): [FastAPI, Pydantic, SQLAlchemy, PostgreSQL]
  Summary: "FastAPI: All endpoints async def; Pydantic: V2 strict models; ..."
→ Community 1 (3 members): [React, Next.js, TypeScript]
  Summary: "React: 19 hooks patterns; Next.js: App Router; ..."
Verification
Tested with the NEMESIS protocol (3-tier) – v1.6.0 results:
🟢 Normal (12/12)  – All 12 tools, all CRUD actions, GraphRAG enrichment,
                     Hebbian cross-reference, RRF fusion, Oja's rule,
                     dedup gate, anti-repetition guard, batch_add,
                     V14 access_count bump confirmed, V12 community summaries,
                     V9 KG expansion, V10 embedding cache transparent
🟡 Pessimist (4/4) – Empty queries, non-existent entities, duplicate observations
                     (dedup gate: similarity 0.85), ungrounded claims
                     (confidence: 0.0, level: "unknown", hallucination warning)
🔴 Extreme (3/3)   – SQL injection (parametrized queries → stored as text, not executed),
                     XSS (HTML entities escaped), repetitive content AAA×430
                     (density gating: importance reduced to 0.3)
Previous versions: v1.3.0 (12/12, 8/8, 9/9), v1.1.0 (13/13), v1.0.1 (18/18).
Performance
| Operation | Avg latency |
|---|---|
| RRF hybrid search | < 5ms |
| Analytics | < 2.5ms |
| Entity CRUD | < 1ms |
| PageRank (100 entities) | < 50ms |
| GraphRAG enrichment | < 2ms |
Version History
| Version | Key Changes |
|---|---|
| 1.6.0 | V9 KG-neighbor query expansion, V10 embedding LRU cache, V11 async rebuild_embeddings, V12 community summaries, V14 batch access tracking, V15 pool race fix |
| 1.5.0 | V1 token-budget truncation, V2 post-fusion dedup, V3 source triangulation, V4 adaptive confidence, V5 batch_add density parity, V6 session-aware decay, V7 async embed, V8 temporal index |
| 1.3.0 | Modular architecture (CC avg D→A), 87% CC reduction |
| 1.1.0 | GraphRAG, REM Sleep, conditional pgvector, 4-signal RRF |
| 1.0.0 | Initial release: 12 tools, Hebbian learning, FSRS |
License
CC BY-NC 4.0 – Free to use and modify, not for commercial use.
Author
Leandro PΓ©rez G.
- GitHub: @LeandroPG19
- Email: leandropatodo@gmail.com
Credits
Mathematical foundations: Wozniak (1987), Ye (2023, FSRS v4), Oja (1982), Salton (1975, TF-IDF), Cormack (2009, RRF), Brin & Page (1998, PageRank), Collins & Loftus (1975), Shannon (1948), Pearson (1900, χ²), Blondel et al. (2008, Louvain), McCabe (1976, CC), BAAI (2023, BGE), Malkov & Yashunin (2018, HNSW), Karpicke & Roediger (2008, testing effect), Zep/Graphiti (2025, arXiv 2501.13956).