# Recall
Long-term memory system for MCP-compatible AI assistants with semantic search and relationship tracking.
## Features
- Persistent Memory Storage: Store preferences, decisions, patterns, and session context
- Semantic Search: Find relevant memories using natural language queries via ChromaDB vectors
- Memory Relationships: Create edges between memories (supersedes, relates_to, caused_by, contradicts)
- Namespace Isolation: Global memories vs project-scoped memories
- Context Generation: Auto-format memories for session context injection
- Deduplication: Content-hash based duplicate detection
## Installation

```bash
# Clone the repository
git clone https://github.com/yourorg/recall.git
cd recall

# Install with uv
uv sync

# Ensure Ollama is running with required models
ollama pull mxbai-embed-large   # Required: embeddings for semantic search
ollama pull llama3.2            # Optional: session summarization for auto-capture hook
ollama serve
```
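Before starting the server, you can sanity-check that Ollama is reachable and the embedding model is pulled. A minimal sketch using only the standard library (`/api/tags` is Ollama's model-listing endpoint):

```python
# check_ollama.py - verify Ollama is up and the embedding model is available
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"

with urllib.request.urlopen(f"{OLLAMA_HOST}/api/tags", timeout=5) as resp:
    models = [m["name"] for m in json.load(resp)["models"]]

print("Available models:", models)
if not any(name.startswith("mxbai-embed-large") for name in models):
    raise SystemExit("mxbai-embed-large not found - run: ollama pull mxbai-embed-large")
```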
## Usage

### Run as MCP Server

```bash
uv run python -m recall
```

### CLI Options

```bash
uv run python -m recall --help
```

Options:

```
  --sqlite-path PATH      SQLite database path (default: ~/.recall/recall.db)
  --chroma-path PATH      ChromaDB storage path (default: ~/.recall/chroma_db)
  --collection NAME       ChromaDB collection name (default: memories)
  --ollama-host HOST      Ollama server URL (default: http://localhost:11434)
  --ollama-model MODEL    Embedding model (default: mxbai-embed-large)
  --ollama-timeout SECS   Request timeout (default: 30)
  --log-level LEVEL       DEBUG, INFO, WARNING, ERROR, CRITICAL (default: INFO)
```
### meta-mcp Configuration

Add Recall to your meta-mcp `servers.json`:

```json
{
  "recall": {
    "command": "uv",
    "args": [
      "run",
      "--directory",
      "/path/to/recall",
      "python",
      "-m",
      "recall"
    ],
    "env": {
      "RECALL_LOG_LEVEL": "INFO",
      "RECALL_OLLAMA_HOST": "http://localhost:11434",
      "RECALL_OLLAMA_MODEL": "mxbai-embed-large"
    },
    "description": "Long-term memory system with semantic search",
    "tags": ["memory", "context", "semantic-search"]
  }
}
```
Or for Claude Code / other MCP clients (`claude.json`):

```json
{
  "mcpServers": {
    "recall": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/recall",
        "python",
        "-m",
        "recall"
      ],
      "env": {
        "RECALL_LOG_LEVEL": "INFO"
      }
    }
  }
}
```
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `RECALL_SQLITE_PATH` | `~/.recall/recall.db` | SQLite database file path |
| `RECALL_CHROMA_PATH` | `~/.recall/chroma_db` | ChromaDB persistent storage directory |
| `RECALL_COLLECTION_NAME` | `memories` | ChromaDB collection name |
| `RECALL_OLLAMA_HOST` | `http://localhost:11434` | Ollama server URL |
| `RECALL_OLLAMA_MODEL` | `mxbai-embed-large` | Embedding model name |
| `RECALL_OLLAMA_TIMEOUT` | `30` | Ollama request timeout in seconds |
| `RECALL_LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| `RECALL_DEFAULT_NAMESPACE` | `global` | Default namespace for memories |
| `RECALL_DEFAULT_IMPORTANCE` | `0.5` | Default importance score (0.0-1.0) |
| `RECALL_DEFAULT_TOKEN_BUDGET` | `4000` | Default token budget for context |
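Each variable falls back to the default in the table when unset. A minimal sketch of that resolution (illustrative only, not Recall's actual config loader):

```python
import os
from pathlib import Path

# Resolve a few settings the way the table above describes: the environment
# variable if set, otherwise the documented default.
sqlite_path = Path(os.environ.get("RECALL_SQLITE_PATH", "~/.recall/recall.db")).expanduser()
namespace = os.environ.get("RECALL_DEFAULT_NAMESPACE", "global")
importance = float(os.environ.get("RECALL_DEFAULT_IMPORTANCE", "0.5"))
token_budget = int(os.environ.get("RECALL_DEFAULT_TOKEN_BUDGET", "4000"))
```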
## MCP Tool Examples

### memory_store_tool

Store a new memory with semantic indexing. Uses the fast daemon path when available (<10ms) and falls back to synchronous embedding otherwise.

```json
{
  "content": "User prefers dark mode in all applications",
  "memory_type": "preference",
  "namespace": "global",
  "importance": 0.8,
  "metadata": {"source": "explicit_request"}
}
```
Response (fast path via daemon):

```json
{
  "success": true,
  "queued": true,
  "queue_id": 42,
  "namespace": "global"
}
```
Response (sync path fallback):

```json
{
  "success": true,
  "queued": false,
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "content_hash": "a1b2c3d4e5f67890"
}
```
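The `content_hash` drives deduplication: storing the same content twice resolves to the same hash. The exact scheme is internal to Recall; a plausible sketch, assuming a truncated SHA-256 over normalized content (which matches the 16-hex-character shape above but is not confirmed):

```python
import hashlib

def content_hash(content: str) -> str:
    # Assumption: normalize whitespace and case, then truncate a SHA-256
    # digest to 16 hex chars. Recall's real scheme may differ.
    normalized = " ".join(content.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

# Near-identical phrasings hash the same, so the duplicate is detected.
assert content_hash("User prefers dark mode") == content_hash("user  prefers dark mode")
```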
### daemon_status_tool

Check whether the recall daemon is running:

```json
{}
```

Response:

```json
{
  "running": true,
  "status": {
    "pid": 12345,
    "store_queue": {"pending_count": 5},
    "embed_worker_running": true
  }
}
```
### memory_recall_tool

Search memories by semantic similarity:

```json
{
  "query": "user interface preferences",
  "n_results": 5,
  "namespace": "global",
  "memory_type": "preference",
  "min_importance": 0.5,
  "include_related": true
}
```

Response:

```json
{
  "success": true,
  "memories": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "User prefers dark mode in all applications",
      "type": "preference",
      "namespace": "global",
      "importance": 0.8,
      "created_at": "2024-01-15T10:30:00",
      "accessed_at": "2024-01-15T14:22:00",
      "access_count": 3
    }
  ],
  "total": 1,
  "score": 0.92
}
```
### memory_relate_tool

Create a relationship between memories:

```json
{
  "source_id": "mem_new_123",
  "target_id": "mem_old_456",
  "relation": "supersedes",
  "weight": 1.0
}
```

Response:

```json
{
  "success": true,
  "edge_id": 42
}
```
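Edges are directed, from `source_id` to `target_id`, with a `weight` for ranking related memories. A hypothetical SQLite edge table to make the shape concrete (column names are illustrative, not Recall's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical edge table; relation types come from the Features list above.
conn.execute("""
    CREATE TABLE edges (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        source_id TEXT NOT NULL,
        target_id TEXT NOT NULL,
        relation  TEXT NOT NULL CHECK (relation IN
                  ('supersedes', 'relates_to', 'caused_by', 'contradicts')),
        weight    REAL NOT NULL DEFAULT 1.0
    )
""")
cur = conn.execute(
    "INSERT INTO edges (source_id, target_id, relation, weight) VALUES (?, ?, ?, ?)",
    ("mem_new_123", "mem_old_456", "supersedes", 1.0),
)
print({"success": True, "edge_id": cur.lastrowid})
```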
### memory_context_tool

Generate formatted context for session injection:

```json
{
  "query": "coding style preferences",
  "project": "myproject",
  "token_budget": 4000
}
```

Response:

```json
{
  "success": true,
  "context": "# Memory Context\n\n## Preferences\n\n- User prefers dark mode [global]\n- Use 2-space indentation [project:myproject]\n\n## Recent Decisions\n\n- Decided to use FastAPI for the backend [project:myproject]\n",
  "token_estimate": 125
}
```
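Callers can compare `token_estimate` against their budget before injecting the context. The estimator itself is internal to Recall; as a purely illustrative sketch, a common rule of thumb for English text is roughly four characters per token:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 chars/token for English prose); Recall's actual
    # estimator may count differently.
    return max(1, len(text) // 4)

budget = 4000
context = "# Memory Context\n\n## Preferences\n\n- User prefers dark mode [global]\n"
assert estimate_tokens(context) <= budget
```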
### memory_forget_tool

Delete memories by ID or semantic search:

```json
{
  "memory_id": "550e8400-e29b-41d4-a716-446655440000",
  "confirm": true
}
```

Or delete by search:

```json
{
  "query": "outdated preferences",
  "namespace": "project:oldproject",
  "n_results": 10,
  "confirm": true
}
```

Response:

```json
{
  "success": true,
  "deleted_ids": ["550e8400-e29b-41d4-a716-446655440000"],
  "deleted_count": 1
}
```
## Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                     MCP Server (FastMCP)                     │
│ memory_store │ memory_recall │ memory_relate │ memory_forget │
└───────────────────────────┬──────────────────────────────────┘
                            │
              ┌─────────────┴─────────────┐
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │     FAST PATH     │       │     SYNC PATH     │
    │       <10ms       │       │      10-60s       │
    └─────────┬─────────┘       └─────────┬─────────┘
              │                           │
    ┌─────────▼─────────┐       ┌─────────▼─────────┐
    │   recall-daemon   │       │    HybridStore    │
    │   (Unix socket)   │       │  (Direct Ollama)  │
    │                   │       └─────────┬─────────┘
    │  ┌─────────────┐  │                 │
    │  │ StoreQueue  │  │     ┌───────────┼───────────┐
    │  │ EmbedWorker │  │     │           │           │
    │  └─────────────┘  │     │           │           │
    └─────────┬─────────┘   ┌─▼─────┐ ┌───▼───┐ ┌─────▼─────┐
              │             │SQLite │ │Chroma │ │  Ollama   │
              └────────────►│Store  │ │ Store │ │  Client   │
                            └───────┘ └───────┘ └───────────┘
```
The daemon provides fast (<10ms) memory storage by queueing operations and processing embeddings asynchronously. When the daemon is unavailable, the MCP server falls back to synchronous embedding (10-60s).
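The fallback behavior might look like the following sketch. The `store` command name and wire format are assumptions (only the `status` command is documented below); the socket path comes from the Daemon Commands section:

```python
import json
import socket

DAEMON_SOCKET = "/tmp/recall-daemon.sock"  # see Daemon Commands below

def store_sync(payload: dict) -> dict:
    # Placeholder for the SYNC PATH (HybridStore embedding via Ollama, 10-60s).
    return {"success": True, "queued": False}

def store_memory(payload: dict) -> dict:
    """Sketch of the FAST PATH / SYNC PATH split from the diagram above."""
    try:
        # FAST PATH: enqueue on the daemon and return immediately (<10ms).
        # Assumption: a "store" command exists alongside the documented "status".
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(0.5)
            sock.connect(DAEMON_SOCKET)
            sock.sendall(json.dumps({"cmd": "store", **payload}).encode() + b"\n")
            return json.loads(sock.recv(65536))
    except OSError:
        # Daemon not running: fall back to synchronous embedding.
        return store_sync(payload)
```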
## Daemon Setup (macOS)
The recall daemon provides fast (<10ms) memory storage by processing embeddings asynchronously. Without the daemon, each store operation blocks for 10-60 seconds waiting for Ollama embeddings.
### Quick Install

```bash
# From the recall directory
./hooks/install-daemon.sh
```

This will:
- Copy hook scripts to `~/.claude/hooks/`
- Install the launchd plist to `~/Library/LaunchAgents/`
- Start the daemon automatically
### Manual Install

```bash
# 1. Copy hook scripts
cp hooks/recall*.py ~/.claude/hooks/
chmod +x ~/.claude/hooks/recall*.py

# 2. Create logs directory
mkdir -p ~/.claude/hooks/logs

# 3. Install plist with path substitution
sed "s|{{HOME}}|$HOME|g; s|{{RECALL_DIR}}|$(pwd)|g" \
  hooks/com.recall.daemon.plist.template > ~/Library/LaunchAgents/com.recall.daemon.plist

# 4. Load the daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist
```
### Daemon Commands

```bash
# Check status
echo '{"cmd": "status"}' | nc -U /tmp/recall-daemon.sock | jq

# Stop daemon
launchctl unload ~/Library/LaunchAgents/com.recall.daemon.plist

# Start daemon
launchctl load ~/Library/LaunchAgents/com.recall.daemon.plist

# View logs
tail -f ~/.claude/hooks/logs/recall-daemon.log
```
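The same status query works from Python if `nc` is not available (a sketch assuming the daemon answers one JSON message per connection, as the `nc` pipeline above suggests):

```python
import json
import socket

# Query the daemon over its Unix socket, mirroring the nc command above.
with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
    sock.connect("/tmp/recall-daemon.sock")
    sock.sendall(b'{"cmd": "status"}\n')
    reply = json.loads(sock.recv(65536))

print(json.dumps(reply, indent=2))
```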
## Hooks Configuration

Add recall hooks to your Claude Code settings (`~/.claude/settings.json`). See `hooks/settings.example.json` for the full configuration.
## Development

```bash
# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest tests/

# Run tests with coverage
uv run pytest tests/ --cov=recall --cov-report=html

# Type checking
uv run mypy src/recall

# Run specific integration tests
uv run pytest tests/integration/test_mcp_server.py -v
```
## Requirements

- Python 3.13+
- Ollama with:
  - `mxbai-embed-large` model (required for semantic search)
  - `llama3.2` model (optional, for session auto-capture hook)
- ~500MB disk space for ChromaDB indices
## License
MIT