Vertex AI Memory Bank MCP Server

Vertex AI Memory Bank MCP Server

Enables LLMs to generate and retrieve long-term memories using Vertex AI Memory Bank, with tools for memory initialization, generation, retrieval, creation, deletion, and listing.

Category
Visit Server

README

<mark style="background-color: #e1e100">This is a personal project by Ivan Nardini to explore how to build a Model Context Protocol (MCP) server for Vertex AI Memory Bank.</mark>

<mark style="background-color: #e1e100">Vertex AI Memory Bank MCP server is not a Google product. And it is not officially support.</mark>


Vertex AI Memory Bank MCP Server

A simple MCP (Model Context Protocol) server that enables LLMs to generate and retrieve long-term memories using Vertex AI's Memory Bank.

Why This Project?

This server demonstrates how to build an MCP server with Vertex AI Memory Bank. It has been inspired by a developer request and released for developers.

Prerequisites

  • Python 3.11 or higher
  • Google Cloud account with Vertex AI API enabled
  • Basic understanding of async Python (helpful but not required)

Quick Start

Setup Google Cloud

# Install gcloud CLI (if not already installed)
# https://cloud.google.com/sdk/docs/install

# Authenticate
gcloud auth application-default login

# Set your project
gcloud config set project YOUR_PROJECT_ID

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com

Install

# Clone the repository
git clone https://github.com/yourusername/vertex-ai-memory-bank-mcp.git
cd vertex-ai-memory-bank-mcp

# Install with pip
pip install -r requirements.txt

# OR install with uv (faster, recommended)
uv sync

# For running examples (optional)
pip install -e ".[examples]"
# OR with uv
uv sync --extra examples

Configure

# Copy the example environment file
cp .env.example .env

# Edit .env with your project details
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1

Run Your First Example

Interactive Tutorial (Recommended): Open get_started_with_memory_bank_mcp.ipynb in Jupyter

Or try the command-line examples:

# Basic MCP Client Usage
python examples/basic_usage.py

# Gemini Agent with Memory
python examples/gemini_memory_agent.py

# Automatic Tool Calling with Gemini
python examples/automatic_tool_calling.py

Use with Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "memory-bank": {
      "command": "python",
      "args": ["/path/to/memory_bank_server.py"],
      "env": {
        "GOOGLE_CLOUD_PROJECT": "your-project-id",
        "GOOGLE_CLOUD_LOCATION": "us-central1"
      }
    }
  }
}

Key Concepts

Memory Scope

Memories are scoped to users or contexts:

scope = {"user_id": "alice123"}

Memory Topics

Categorize what to remember:

topics = ["USER_PREFERENCES", "USER_PERSONAL_INFO"]

Semantic Search

Find relevant memories with similarity search:

search_query = "programming preferences"
top_k = 5

Available Tools

Tool Purpose Example Use Case
initialize_memory_bank Set up connection to Vertex AI First-time setup
generate_memories Extract memories from conversations After chat sessions
retrieve_memories Fetch relevant memories Personalize responses
create_memory Manually add a memory Store user preferences
delete_memory Remove specific memory User requests deletion
list_memories View all stored memories Debugging/inspection

Common Patterns

Pattern 1: Conversation Memory

# After each conversation turn
await session.call_tool(
    "generate_memories",
    {
        "conversation": conversation_history,
        "scope": {"user_id": user_id},
        "wait_for_completion": True
    }
)

Pattern 2: Explicit Memory

# Store specific facts
await session.call_tool(
    "create_memory",
    {
        "fact": "User prefers dark mode",
        "scope": {"user_id": user_id}
    }
)

Pattern 3: Context Retrieval

# Get relevant context before responding
memories = await session.call_tool(
    "retrieve_memories",
    {
        "scope": {"user_id": user_id},
        "search_query": user_message,
        "top_k": 5
    }
)

Project Structure

vertex-ai-memory-bank-mcp/
├── memory_bank_server.py                     # Main entry point
├── src/                                       # Modular source code
│   ├── __init__.py
│   ├── server.py                             # Server orchestration
│   ├── tools.py                              # MCP tool implementations
│   ├── config.py                             # Configuration management
│   ├── app_state.py                          # Application state
│   ├── validators.py                         # Input validation
│   └── formatters.py                         # Data formatting
├── examples/                                 # Usage examples
│   ├── basic_usage.py                        # Basic MCP client usage
│   ├── automatic_tool_calling.py             # Automatic function calling
│   └── claude_config.json                    # Claude Desktop config
├── get_started_with_memory_bank_mcp.ipynb    # Getting started tutorial
├── pyproject.toml                            # Project config (pip & uv)
├── requirements.txt                          # Dependencies (pip)
├── uv.lock                                   # Lock file (uv)
├── .env.example                              # Environment template
├── .gitignore                                # Git ignore rules
├── .python-version                           # Python version
├── README.md                                 # This file
└── LICENSE                                   # Apache 2.0 License

Troubleshooting

"Connection closed" error

Solution: Check that your MCP server is using stderr for logging, not stdout.

"Not authenticated"

Solution: Run gcloud auth application-default login

Contributing

This project is meant to inspire. Feel free to fork and create your own version as well as share your production implementations.

Resources

License

This project is licensed under the Apache 2.0 License.


Recommended Servers

playwright-mcp

playwright-mcp

A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.

Official
Featured
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.

Official
Featured
Local
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.

Official
Featured
Local
TypeScript
VeyraX MCP

VeyraX MCP

Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.

Official
Featured
Local
graphlit-mcp-server

graphlit-mcp-server

The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.

Official
Featured
TypeScript
Kagi MCP Server

Kagi MCP Server

An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.

Official
Featured
Python
E2B

E2B

Using MCP to run code via e2b.

Official
Featured
Neon Database

Neon Database

MCP server for interacting with Neon Management API and databases

Official
Featured
Exa Search

Exa Search

A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.

Official
Featured
Qdrant Server

Qdrant Server

This repository is an example of how to create a MCP server for Qdrant, a vector search engine.

Official
Featured