Recall

Long-term memory system for MCP-compatible AI assistants with semantic search and relationship tracking.

Features

  • Persistent Memory Storage: Store preferences, decisions, patterns, and session context
  • Semantic Search: Find relevant memories using natural language queries via ChromaDB vectors
  • Memory Relationships: Create edges between memories (supersedes, relates_to, caused_by, contradicts)
  • Namespace Isolation: Global memories vs project-scoped memories
  • Context Generation: Auto-format memories for session context injection
  • Deduplication: Content-hash based duplicate detection (see the sketch below)
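
The README does not spell out the hashing scheme beyond "content hash"; the sketch below is one plausible implementation, assuming a SHA-256 digest over whitespace-normalized content, truncated to the 16-hex-character form that appears in the memory_store_tool response later in this document. Treat it as illustrative, not the actual code.

import hashlib

def content_hash(content: str) -> str:
    # Hypothetical: collapse whitespace so trivially different copies collide,
    # then truncate the digest to the 16-hex-char form seen in tool responses.
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

def is_duplicate(content: str, known_hashes: set[str]) -> bool:
    # A store can be skipped when an identical memory already exists.
    return content_hash(content) in known_hashes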

Installation

# Clone the repository
git clone https://github.com/yourorg/recall.git
cd recall

# Install with uv
uv sync

# Ensure Ollama is running with required models
ollama pull mxbai-embed-large  # Required: embeddings for semantic search
ollama pull llama3.2           # Optional: session summarization for auto-capture hook
ollama serve
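
Before starting the server, it is worth confirming that Ollama is reachable and the embedding model is present (both commands are standard Ollama tooling, not part of Recall):

# List locally available models; mxbai-embed-large should appear
ollama list

# Or query the HTTP API the server will use
curl -s http://localhost:11434/api/tags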

Usage

Run as MCP Server

uv run python -m recall

CLI Options

uv run python -m recall --help

Options:
  --sqlite-path PATH      SQLite database path (default: ~/.recall/recall.db)
  --chroma-path PATH      ChromaDB storage path (default: ~/.recall/chroma_db)
  --collection NAME       ChromaDB collection name (default: memories)
  --ollama-host HOST      Ollama server URL (default: http://localhost:11434)
  --ollama-model MODEL    Embedding model (default: mxbai-embed-large)
  --ollama-timeout SECS   Request timeout (default: 30)
  --log-level LEVEL       DEBUG, INFO, WARNING, ERROR, CRITICAL (default: INFO)
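
For example, to run against a non-default database location and a remote Ollama host (both values here are illustrative):

uv run python -m recall --sqlite-path /data/recall/recall.db --ollama-host http://gpu-box:11434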

meta-mcp Configuration

Add Recall to your meta-mcp servers.json:

{
  "recall": {
    "command": "uv",
    "args": [
      "run",
      "--directory",
      "/path/to/recall",
      "python",
      "-m",
      "recall"
    ],
    "env": {
      "RECALL_LOG_LEVEL": "INFO",
      "RECALL_OLLAMA_HOST": "http://localhost:11434",
      "RECALL_OLLAMA_MODEL": "mxbai-embed-large"
    },
    "description": "Long-term memory system with semantic search",
    "tags": ["memory", "context", "semantic-search"]
  }
}

Or for Claude Code / other MCP clients (claude.json):

{
  "mcpServers": {
    "recall": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/recall",
        "python",
        "-m",
        "recall"
      ],
      "env": {
        "RECALL_LOG_LEVEL": "INFO"
      }
    }
  }
}

Environment Variables

Variable                      Default                  Description
RECALL_SQLITE_PATH            ~/.recall/recall.db      SQLite database file path
RECALL_CHROMA_PATH            ~/.recall/chroma_db      ChromaDB persistent storage directory
RECALL_COLLECTION_NAME        memories                 ChromaDB collection name
RECALL_OLLAMA_HOST            http://localhost:11434   Ollama server URL
RECALL_OLLAMA_MODEL           mxbai-embed-large        Embedding model name
RECALL_OLLAMA_TIMEOUT         30                       Ollama request timeout in seconds
RECALL_LOG_LEVEL              INFO                     Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
RECALL_DEFAULT_NAMESPACE      global                   Default namespace for memories
RECALL_DEFAULT_IMPORTANCE     0.5                      Default importance score (0.0-1.0)
RECALL_DEFAULT_TOKEN_BUDGET   4000                     Default token budget for context

MCP Tool Examples

memory_store_tool

Store a new memory with semantic indexing. Uses the fast daemon path when available (<10ms), falling back to synchronous embedding otherwise.

{
  "content": "User prefers dark mode in all applications",
  "memory_type": "preference",
  "namespace": "global",
  "importance": 0.8,
  "metadata": {"source": "explicit_request"}
}

Response (fast path via daemon):

{
  "success": true,
  "queued": true,
  "queue_id": 42,
  "namespace": "global"
}

Response (sync path fallback):

{
  "success": true,
  "queued": false,
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "content_hash": "a1b2c3d4e5f67890"
}

daemon_status_tool

Check whether the recall daemon is running (this tool takes no arguments):

{}

Response:

{
  "running": true,
  "status": {
    "pid": 12345,
    "store_queue": {"pending_count": 5},
    "embed_worker_running": true
  }
}

memory_recall_tool

Search memories by semantic similarity:

{
  "query": "user interface preferences",
  "n_results": 5,
  "namespace": "global",
  "memory_type": "preference",
  "min_importance": 0.5,
  "include_related": true
}

Response:

{
  "success": true,
  "memories": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "User prefers dark mode in all applications",
      "type": "preference",
      "namespace": "global",
      "importance": 0.8,
      "created_at": "2024-01-15T10:30:00",
      "accessed_at": "2024-01-15T14:22:00",
      "access_count": 3
    }
  ],
  "total": 1,
  "score": 0.92
}

memory_relate_tool

Create a relationship between memories:

{
  "source_id": "mem_new_123",
  "target_id": "mem_old_456",
  "relation": "supersedes",
  "weight": 1.0
}

Response:

{
  "success": true,
  "edge_id": 42
}

memory_context_tool

Generate formatted context for session injection:

{
  "query": "coding style preferences",
  "project": "myproject",
  "token_budget": 4000
}

Response:

{
  "success": true,
  "context": "# Memory Context\n\n## Preferences\n\n- User prefers dark mode [global]\n- Use 2-space indentation [project:myproject]\n\n## Recent Decisions\n\n- Decided to use FastAPI for the backend [project:myproject]\n",
  "token_estimate": 125
}

memory_forget_tool

Delete memories by ID or semantic search:

{
  "memory_id": "550e8400-e29b-41d4-a716-446655440000",
  "confirm": true
}

Or delete by search:

{
  "query": "outdated preferences",
  "namespace": "project:oldproject",
  "n_results": 10,
  "confirm": true
}

Response:

{
  "success": true,
  "deleted_ids": ["550e8400-e29b-41d4-a716-446655440000"],
  "deleted_count": 1
}

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     MCP Server (FastMCP)                     │
│  memory_store │ memory_recall │ memory_relate │ memory_forget │
└───────────────────────────┬─────────────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │   FAST PATH       │       │   SYNC PATH       │
    │   <10ms           │       │   10-60s          │
    └─────────┬─────────┘       └─────────┬─────────┘
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │  recall-daemon    │       │   HybridStore     │
    │  (Unix socket)    │       │ (Direct Ollama)   │
    │                   │       └─────────┬─────────┘
    │  ┌─────────────┐  │                 │
    │  │ StoreQueue  │  │     ┌───────────┼───────────┐
    │  │ EmbedWorker │  │     │           │           │
    │  └─────────────┘  │     │           │           │
    └─────────┬─────────┘   ┌─▼─────┐ ┌───▼───┐ ┌─────▼─────┐
              │             │SQLite │ │Chroma │ │  Ollama   │
              └─────────────►Store  │ │ Store │ │  Client   │
                            └───────┘ └───────┘ └───────────┘

The daemon provides fast (<10ms) memory storage by queueing operations and processing embeddings asynchronously. When the daemon is unavailable, the MCP server falls back to synchronous embedding (10-60s).
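
A minimal sketch of that dispatch, assuming the daemon accepts a "store" command over the same JSON-over-Unix-socket protocol as the status check shown later in this README (the command name and the sync_store placeholder are assumptions, not the actual API):

import json
import socket

DAEMON_SOCKET = "/tmp/recall-daemon.sock"  # same socket the status check below uses

def sync_store(content: str, **fields) -> dict:
    # Placeholder for the HybridStore path: embed via Ollama (blocks 10-60s),
    # then write to SQLite and ChromaDB directly.
    raise NotImplementedError

def store_memory(content: str, **fields) -> dict:
    # Hypothetical dispatcher mirroring the fast-path / sync-path split.
    payload = {"cmd": "store", "content": content, **fields}  # "store" cmd is assumed
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(0.1)  # the fast path must answer in milliseconds
            sock.connect(DAEMON_SOCKET)
            sock.sendall(json.dumps(payload).encode() + b"\n")
            return json.loads(sock.makefile().readline())  # e.g. {"queued": true, ...}
    except OSError:
        return sync_store(content, **fields)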

Daemon Setup (macOS)

The recall daemon provides fast (<10ms) memory storage by processing embeddings asynchronously. Without the daemon, each store operation blocks for 10-60 seconds waiting for Ollama embeddings.

Quick Install

# From the recall directory
./hooks/install-daemon.sh

This will:

  1. Copy hook scripts to ~/.claude/hooks/
  2. Install the launchd plist to ~/Library/LaunchAgents/
  3. Start the daemon automatically

Manual Install

# 1. Copy hook scripts
cp hooks/recall*.py ~/.claude/hooks/
chmod +x ~/.claude/hooks/recall*.py

# 2. Create logs directory
mkdir -p ~/.claude/hooks/logs

# 3. Install plist with path substitution
sed "s|{{HOME}}|$HOME|g; s|{{RECALL_DIR}}|$(pwd)|g" \
  hooks/com.recall.daemon.plist.template > ~/Library/LaunchAgents/com.recall.daemon.plist

# 4. Load the daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist
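
You can then confirm that launchd registered the agent (assuming the plist's Label is com.recall.daemon, matching its filename):

launchctl list | grep com.recall.daemon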

Daemon Commands

# Check status
echo '{"cmd": "status"}' | nc -U /tmp/recall-daemon.sock | jq

# Stop daemon
launchctl unload ~/Library/LaunchAgents/com.recall.daemon.plist

# Start daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist

# View logs
tail -f ~/.claude/hooks/logs/recall-daemon.log

Hooks Configuration

Add recall hooks to your Claude Code settings (~/.claude/settings.json). See hooks/settings.example.json for the full configuration.
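
As a rough sketch only — hooks/settings.example.json is the authoritative reference, and the script name below is hypothetical — a Claude Code hook entry generally takes this shape:

{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/recall_context.py"
          }
        ]
      }
    ]
  }
}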

Development

# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest tests/

# Run tests with coverage
uv run pytest tests/ --cov=recall --cov-report=html

# Type checking
uv run mypy src/recall

# Run specific integration tests
uv run pytest tests/integration/test_mcp_server.py -v

Requirements

  • Python 3.13+
  • Ollama with:
    • mxbai-embed-large model (required for semantic search)
    • llama3.2 model (optional, for session auto-capture hook)
  • ~500MB disk space for ChromaDB indices

License

MIT
