GCP Sales Analytics MCP Server
Enables querying both Google Cloud SQL (PostgreSQL) and BigQuery public datasets through an AI agent that automatically routes questions to the appropriate data source for sales and e-commerce analytics.
README
GCP Sales Analytics POC
A proof of concept demonstrating an intelligent AI agent that can query both Google Cloud SQL (PostgreSQL) and BigQuery public datasets, automatically choosing the right data source based on the question.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ ADK Agent (Claude) │
│ Intelligently routes queries to appropriate data source │
└────────────────┬────────────────────────────────────────────┘
│
├─────────────────┬──────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ MCP Server │ │ MCP Server │ │ Tools │
│ (Cloud SQL) │ │ (BigQuery) │ │ (Schema) │
└──────┬───────┘ └──────┬───────┘ └──────────────┘
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Cloud SQL │ │ BigQuery │
│ PostgreSQL │ │ thelook_ │
│ │ │ ecommerce │
│ • customers │ │ • products │
│ • orders │ │ • users │
│ • vendors │ │ • events │
└──────────────┘ │ • inventory │
│ • orders │
└──────────────┘
Features
- Dual Data Source Access: Query both Cloud SQL and BigQuery seamlessly
- Intelligent Routing: Agent automatically determines which data source to use
- MCP Server: Standards-compliant Model Context Protocol server
- Synthetic Data: Pre-populated with 50 customers, 50 vendors, and 50 orders
- Infrastructure as Code: Terraform for Cloud SQL deployment
- Production-Ready: Error handling, logging, and security best practices
Data Sources
Cloud SQL (PostgreSQL)
Contains transactional data with three tables:
- customers: Customer information (name, email, address, etc.)
- orders: Order details (amounts, dates, status, products)
- vendors: Vendor information
Use for queries about:
- Specific customer information
- Recent order details
- Vendor data
- Current sales transactions
BigQuery (thelook_ecommerce)
Public e-commerce analytics dataset with comprehensive data:
- products, users, events
- inventory_items, order_items
- distribution_centers
Use for queries about:
- Product analytics
- User behavior patterns
- Inventory analysis
- Historical trends
- Large-scale analytics
Prerequisites
- Google Cloud Platform account with billing enabled
- GCP Project with appropriate permissions
- Tools installed:
gcloudCLIterraform(>= 1.0)python3(>= 3.9)psql(PostgreSQL client)
- Anthropic API key for Claude
Setup
1. Clone and Configure
cd /Users/matthewiames/Desktop/gcp_mcp
# Copy environment template
cp .env.example .env
# Edit .env with your values
# Required: ANTHROPIC_API_KEY, GCP_PROJECT_ID
nano .env
2. Install Python Dependencies
# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
3. Authenticate with GCP
# Login to GCP
gcloud auth login
# Set application default credentials
gcloud auth application-default login
# Set your project
gcloud config set project YOUR_PROJECT_ID
4. Deploy Infrastructure
# Run the automated deployment script
./scripts/deploy.sh
The deployment script will:
- Deploy Cloud SQL instance with Terraform
- Create database schema (customers, orders, vendors)
- Seed database with synthetic data
- Verify BigQuery access
This process takes approximately 5-10 minutes.
Usage
Running the MCP Server
# In terminal 1
python3 mcp_server/server.py
The MCP server provides tools for:
query_cloudsql- Execute SQL against Cloud SQLquery_bigquery- Execute SQL against BigQuerylist_cloudsql_tables- List Cloud SQL tableslist_bigquery_tables- List BigQuery tablesget_cloudsql_schema- Get table schema from Cloud SQLget_bigquery_schema- Get table schema from BigQuery
Running the Agent (Interactive Mode)
# In terminal 2
python3 adk_agent/agent.py
Example interactions:
📊 You: What are the total sales from our database?
🤖 Agent: I'll query the Cloud SQL database to calculate total sales...
[Uses query_cloudsql tool]
The total sales amount from all orders is $24,567.89 across 50 orders.
📊 You: Show me the top 5 products from BigQuery
🤖 Agent: I'll query the BigQuery thelook_ecommerce dataset...
[Uses query_bigquery tool]
Here are the top 5 products by sales:
1. Product A - $15,234
2. Product B - $12,456
...
Running Tests
# Run automated tests
python3 scripts/test_agent.py
Project Structure
gcp_mcp/
├── README.md # This file
├── requirements.txt # Python dependencies
├── .env.example # Environment template
├── .gitignore # Git ignore rules
│
├── terraform/ # Infrastructure as Code
│ ├── main.tf # Cloud SQL resources
│ ├── variables.tf # Input variables
│ ├── outputs.tf # Output values
│ └── terraform.tfvars.example
│
├── data/ # Database schema
│ └── schema.sql # Table definitions
│
├── scripts/ # Deployment and utilities
│ ├── deploy.sh # Automated deployment
│ ├── cleanup.sh # Resource cleanup
│ ├── generate_seed_data.py # Synthetic data generation
│ └── test_agent.py # Agent tests
│
├── mcp_server/ # MCP Server implementation
│ └── server.py # Database MCP server
│
└── adk_agent/ # ADK Agent implementation
└── agent.py # Sales analytics agent
How It Works
Agent Decision Making
The agent uses Claude's function calling capabilities with a specialized system prompt that guides data source selection:
- Question Analysis: Agent analyzes the user's question
- Schema Discovery: May first list tables to understand available data
- Source Selection: Chooses Cloud SQL or BigQuery based on:
- Keywords (customers, vendors → Cloud SQL)
- Query type (analytics, trends → BigQuery)
- Data recency requirements
- Query Execution: Formulates and executes appropriate SQL
- Result Presentation: Formats and explains results
Example Decision Flow
User: "What are my recent orders?"
↓
Agent thinks: "recent orders" + "my" suggests transactional data
↓
Decision: Use Cloud SQL
↓
Tool: query_cloudsql
↓
SQL: SELECT * FROM orders ORDER BY order_date DESC LIMIT 10
Configuration
Environment Variables (.env)
# GCP Configuration
GCP_PROJECT_ID=your-project-id
GCP_REGION=us-central1
CLOUDSQL_INSTANCE_NAME=sales-poc-db
CLOUDSQL_DATABASE=salesdb
CLOUDSQL_USER=salesuser
CLOUDSQL_PASSWORD=your-secure-password
# BigQuery
BIGQUERY_DATASET=bigquery-public-data.thelook_ecommerce
# Anthropic
ANTHROPIC_API_KEY=your-api-key
# Database connection (set after deployment)
DB_HOST=your-cloudsql-ip
Terraform Variables
See terraform/terraform.tfvars.example
Security Considerations
This is a proof of concept with simplified security:
- Cloud SQL has public IP (uses authorized networks)
- Database credentials in environment variables
- No VPC or private networking
- SQL injection prevention (SELECT-only queries)
For production:
- Use Cloud SQL Proxy or private IP
- Store credentials in Secret Manager
- Implement VPC and private networking
- Add query validation and sanitization
- Enable Cloud SQL backups
- Use IAM authentication
- Implement rate limiting
Cost Estimation
Approximate costs for running this POC:
- Cloud SQL (db-f1-micro): ~$7-10/month
- BigQuery: Pay per query (~$5/TB scanned, public datasets may be free)
- Anthropic API: Pay per token
Important: Run ./scripts/cleanup.sh when done to avoid ongoing charges.
Cleanup
To destroy all resources:
./scripts/cleanup.sh
This will:
- Destroy the Cloud SQL instance
- Remove all data
- Clean up Terraform state
Troubleshooting
Connection Issues
# Test Cloud SQL connection
psql -h $DB_HOST -U $CLOUDSQL_USER -d $CLOUDSQL_DATABASE
# Check instance status
gcloud sql instances describe sales-poc-db-XXXX
API Access Issues
# Enable required APIs
gcloud services enable sqladmin.googleapis.com
gcloud services enable bigquery.googleapis.com
# Check authentication
gcloud auth list
Terraform Issues
cd terraform
# Re-initialize
terraform init
# Check state
terraform show
Extending the POC
Ideas for enhancement:
-
Add More Data Sources
- Cloud Spanner
- Firestore
- External APIs
-
Enhanced Agent Capabilities
- Data visualization
- Report generation
- Predictive analytics
-
Production Features
- Caching layer
- Query optimization
- Audit logging
- Monitoring and alerts
-
Advanced MCP Features
- Streaming responses
- Batch operations
- Transaction support
Resources
- Model Context Protocol (MCP)
- Anthropic Claude API
- Google Cloud SQL
- BigQuery Public Datasets
- Terraform GCP Provider
License
This is a proof of concept for demonstration purposes.
Support
For issues or questions:
- Check the troubleshooting section
- Review logs in the terminal
- Check GCP Console for resource status
Recommended Servers
playwright-mcp
A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
Magic Component Platform (MCP)
An AI-powered tool that generates modern UI components from natural language descriptions, integrating with popular IDEs to streamline UI development workflow.
Audiense Insights MCP Server
Enables interaction with Audiense Insights accounts via the Model Context Protocol, facilitating the extraction and analysis of marketing insights and audience data including demographics, behavior, and influencer engagement.
VeyraX MCP
Single MCP tool to connect all your favorite tools: Gmail, Calendar and 40 more.
graphlit-mcp-server
The Model Context Protocol (MCP) Server enables integration between MCP clients and the Graphlit service. Ingest anything from Slack to Gmail to podcast feeds, in addition to web crawling, into a Graphlit project - and then retrieve relevant contents from the MCP client.
Kagi MCP Server
An MCP server that integrates Kagi search capabilities with Claude AI, enabling Claude to perform real-time web searches when answering questions that require up-to-date information.
E2B
Using MCP to run code via e2b.
Neon Database
MCP server for interacting with Neon Management API and databases
Exa Search
A Model Context Protocol (MCP) server lets AI assistants like Claude use the Exa AI Search API for web searches. This setup allows AI models to get real-time web information in a safe and controlled way.
Qdrant Server
This repository is an example of how to create a MCP server for Qdrant, a vector search engine.