redshift-utils-mcp
Tools
handle_check_cluster_health
Performs a health assessment of the Redshift cluster. Executes a series of diagnostic SQL scripts concurrently based on the specified level ('basic' or 'full') and aggregates the raw results or errors from each script into a dictionary.

Args:
- ctx: The MCP context object.
- level: Level of detail: 'basic' for operational status, 'full' for comprehensive table design/maintenance checks. Defaults to 'basic'.
- time_window_days: Lookback period in days for time-sensitive checks (e.g., queue waits, commit waits). Defaults to 1.

Returns: A dictionary where keys are script names and values are either the raw list of dictionary results from the SQL query or an Exception object if that specific script failed.

Raises: DataApiError if a critical error occurs during script execution that prevents gathering results (e.g., a configuration error). Individual script errors are captured within the returned dictionary.
handle_diagnose_locks
Identifies active lock contention in the cluster. Fetches all current lock information, filters it by the optional target PID, target table name, and minimum wait time, and formats the results into a list of contention details plus a summary.

Args:
- ctx: The MCP context object.
- target_pid: Optional. Filter results to locks held or waited for by this specific process ID (PID).
- target_table_name: Optional. Filter results to locks on this table name (schema qualification recommended if ambiguous).
- min_wait_seconds: Minimum seconds a lock must have been in a waiting state to be included. Defaults to 5.

Returns: A list of dictionaries, where each dictionary represents a row from the lock contention query result.

Raises: DataApiError if fetching the initial lock information fails.
handle_diagnose_query_performance
Analyzes a specific query's execution performance. Fetches the query text, execution plan, metrics, alerts, compilation info, skew details, and optionally historical run data, then uses a formatting utility to synthesize this into a structured report with potential issues and recommendations.

Args:
- ctx: The MCP context object.
- query_id: The numeric ID of the Redshift query to analyze.
- compare_historical: Fetch performance data for previous runs of the same query text. Defaults to True.

Returns: A dictionary conforming to the DiagnoseQueryPerformanceResult structure. On success it contains a detailed performance breakdown, issues, and recommendations.

Raises: QueryNotFound if the specified query_id cannot be found in key system tables. DataApiError if a critical error occurs during script execution or parsing.
handle_execute_ad_hoc_query
Executes an arbitrary SQL query provided by the user via the Redshift Data API. Designed as an escape hatch for advanced users or queries not covered by the specialized tools. Returns a structured dictionary indicating success (with results) or failure (with error details).

Args:
- ctx: The MCP context object.
- sql_query: The exact SQL query string to execute.

Returns: A dictionary conforming to the ExecuteAdHocQueryResult structure:
- On success: {"status": "success", "columns": [...], "rows": [...], "row_count": ...}
- On error: {"status": "error", "error_message": "...", "error_type": "..."} (the actual return may be handled by FastMCP error handling for raised exceptions)

Raises: DataApiConfigError if the configuration is invalid. SqlExecutionError if the SQL execution itself fails. DataApiTimeoutError if the Data API call times out. DataApiError for other Data API related or unexpected issues. ClientError for AWS client-side errors.
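As an illustration of the result shape documented above, a minimal consumer-side sketch (assuming the tool response has already been deserialized into a Python dict named `result`; the helper name is hypothetical) might branch on the `status` field:

```python
# Hypothetical helper for handling an ExecuteAdHocQueryResult-shaped dict.
def summarize_ad_hoc_result(result: dict) -> str:
    if result.get("status") == "success":
        # Success responses carry column names, rows, and a row count.
        columns = ", ".join(str(col) for col in result.get("columns", []))
        return f"{result.get('row_count', 0)} row(s); columns: {columns}"
    # Error responses carry an error type and message instead.
    return f"{result.get('error_type', 'UnknownError')}: {result.get('error_message', '')}"
```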
handle_get_table_definition
Retrieves the DDL (Data Definition Language) statement for a specific table. Executes a SQL script designed to generate or retrieve the CREATE TABLE statement for the given table.

Args:
- ctx: The MCP context object.
- schema_name: The schema name of the table.
- table_name: The name of the table.

Returns: A dictionary conforming to the GetTableDefinitionResult structure:
- On success: {"status": "success", "ddl": "<CREATE TABLE statement>"}
- On table not found or DDL retrieval error: {"status": "error", "error_message": "...", "error_type": "..."}

Raises: TableNotFound if the specified table is not found. DataApiError if a critical, unexpected error occurs during execution.
handle_inspect_table
Retrieves detailed information about a specific Redshift table. Fetches the table OID, then concurrently executes various inspection scripts covering design, storage, health, usage, and encoding.

Args:
- ctx: The MCP context object.
- schema_name: The schema name of the table.
- table_name: The name of the table.

Returns: A dictionary where keys are script names and values are either the raw list of dictionary results from the SQL query, the extracted DDL string, or an Exception object if that specific script failed.

Raises: TableNotFound if the specified table cannot be found via its OID. DataApiError if a critical error occurs during script execution (e.g., an OID lookup failure).
handle_monitor_workload
Analyzes cluster workload patterns over a specified time window. Executes various SQL scripts concurrently to gather data on resource usage, WLM performance, top queries, queuing, COPY performance, and disk-based queries.

Args:
- ctx: The MCP context object.
- time_window_days: Lookback period in days for the workload analysis. Defaults to 2.
- top_n_queries: Number of top queries (by total execution time) to consider for the 'top_queries.sql' script. Defaults to 10.

Returns: A dictionary where keys are script names (e.g., 'workload/top_queries.sql') and values are either a list of result rows (as dictionaries) or the Exception object if that script failed.

Raises: DataApiError if a critical error occurs during configuration loading. Individual script errors are returned in the result dictionary.
README
Redshift Utils MCP Server
Overview
This project implements a Model Context Protocol (MCP) server designed specifically to interact with Amazon Redshift databases.
It bridges the gap between Large Language Models (LLMs) or AI assistants (like those in Claude, Cursor, or custom applications) and your Redshift data warehouse, enabling secure, standardized data access and interaction. This allows users to query data, understand database structure, and perform monitoring and diagnostic operations using natural language or AI-driven prompts.
This server is for developers, data analysts, or teams looking to integrate LLM capabilities directly with their Amazon Redshift data environment in a structured and secure manner.
Features
- ✨ Secure Redshift Connection (via Data API): Connects to your Amazon Redshift cluster using the AWS Redshift Data API via Boto3, with database credentials retrieved from AWS Secrets Manager and connection details supplied via environment variables.
- 🔍 Schema Discovery: Exposes MCP resources for listing schemas and tables within a specified schema.
- 📊 Metadata & Statistics: Provides a tool (`handle_inspect_table`) to gather detailed table metadata, statistics (like size, row counts, skew, stats staleness), and maintenance status.
- 📝 Read-Only Query Execution: Offers a secure MCP tool (`handle_execute_ad_hoc_query`) to execute arbitrary SELECT queries against the Redshift database, enabling data retrieval based on LLM requests.
- 📈 Query Performance Analysis: Includes a tool (`handle_diagnose_query_performance`) to retrieve and analyze the execution plan, metrics, and historical data for a specific query ID.
- 🔍 Table Inspection: Provides a tool (`handle_inspect_table`) to perform a comprehensive inspection of a table, including design, storage, health, and usage.
- 🩺 Cluster Health Check: Offers a tool (`handle_check_cluster_health`) to perform a basic or full health assessment of the cluster using various diagnostic queries.
- 🔒 Lock Diagnosis: Provides a tool (`handle_diagnose_locks`) to identify and report on current lock contention and blocking sessions.
- 📊 Workload Monitoring: Includes a tool (`handle_monitor_workload`) to analyze cluster workload patterns over a time window, covering WLM, top queries, and resource usage.
- 📝 DDL Retrieval: Offers a tool (`handle_get_table_definition`) to retrieve the `SHOW TABLE` output (DDL) for a specified table.
- 🛡️ Input Sanitization: Utilizes parameterized queries via the Boto3 Redshift Data API client where applicable to mitigate SQL injection risks (see the sketch after this list).
- 🧩 Standardized MCP Interface: Adheres to the Model Context Protocol specification for seamless integration with compatible clients (e.g., Claude Desktop, Cursor IDE, custom applications).
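The input-sanitization point above refers to the Data API's native query parameters. The following is a minimal sketch of that pattern with Boto3; the cluster identifier, database, secret ARN, table, and column names are placeholders, not values from this project:

```python
import time
import boto3

# Placeholder connection details; substitute your own cluster, database, and secret ARN.
client = boto3.client("redshift-data", region_name="us-east-1")

resp = client.execute_statement(
    ClusterIdentifier="your-cluster-id",
    Database="your_database_name",
    SecretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:your-redshift-secret-XXXXXX",
    # Named placeholders keep user-supplied values out of the SQL text itself.
    Sql="SELECT order_id, amount FROM sales WHERE region = :region LIMIT 10",
    Parameters=[{"name": "region", "value": "EMEA"}],
)

# The Data API is asynchronous: poll until the statement finishes, then fetch rows.
statement_id = resp["Id"]
while True:
    desc = client.describe_statement(Id=statement_id)
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED" and desc.get("HasResultSet"):
    result = client.get_statement_result(Id=statement_id)
    print(result["Records"])
else:
    print("Statement ended with status:", desc["Status"], desc.get("Error"))
```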
Prerequisites
Software:
- Python 3.8+
- `uv` (recommended package manager)
- Git (for cloning the repository)
Infrastructure & Access:
- Access to an Amazon Redshift cluster.
- An AWS account with permissions to use the Redshift Data API (`redshift-data:*`) and access the specified Secrets Manager secret (`secretsmanager:GetSecretValue`).
- A Redshift user account whose credentials are stored in AWS Secrets Manager. This user needs the necessary permissions within Redshift to perform the actions enabled by this server (e.g., `CONNECT` to the database, `SELECT` on target tables, and `SELECT` on relevant system views like `pg_class`, `pg_namespace`, `svv_all_schemas`, `svv_tables`, `svv_table_info`). Using a role with the principle of least privilege is strongly recommended. See Security Considerations.
Credentials:
Your Redshift connection details are managed via AWS Secrets Manager, and the server connects using the Redshift Data API. You need:
- The Redshift cluster identifier.
- The database name within the cluster.
- The ARN of the AWS Secrets Manager secret containing the database credentials (username and password).
- The AWS region where the cluster and secret reside.
- Optionally, an AWS profile name if not using default credentials/region.
These details will be configured via environment variables as detailed in the Configuration section.
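If the credentials are not yet stored, a minimal sketch of creating such a secret with Boto3 follows; the secret name, username, and password shown are placeholders:

```python
import json
import boto3

# Placeholder values; use your own secret name, username, and password.
secretsmanager = boto3.client("secretsmanager", region_name="us-east-1")

response = secretsmanager.create_secret(
    Name="your-redshift-secret",
    Description="Redshift credentials for the redshift-utils-mcp server",
    # The Data API expects a JSON secret containing the database username and password.
    SecretString=json.dumps({"username": "your_redshift_user", "password": "your_password"}),
)

# The returned ARN is the value to set as REDSHIFT_SECRET_ARN.
print(response["ARN"])
```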
Configuration
Set Environment Variables:
This server requires the following environment variables to connect to your Redshift cluster via the AWS Data API. You can set these directly in your shell, using a systemd service file, a Docker environment file, or by creating a .env file in the project's root directory (if using a tool like uv or python-dotenv that supports loading from .env).
Example using shell export:
```bash
export REDSHIFT_CLUSTER_ID="your-cluster-id"
export REDSHIFT_DATABASE="your_database_name"
export REDSHIFT_SECRET_ARN="arn:aws:secretsmanager:us-east-1:123456789012:secret:your-redshift-secret-XXXXXX"
export AWS_REGION="us-east-1" # Or AWS_DEFAULT_REGION
# export AWS_PROFILE="your-aws-profile-name" # Optional
```
Example .env file (see .env.example):
```
# .env file for Redshift MCP Server configuration
# Ensure this file is NOT committed to version control if it contains secrets. Add it to .gitignore.
REDSHIFT_CLUSTER_ID="your-cluster-id"
REDSHIFT_DATABASE="your_database_name"
REDSHIFT_SECRET_ARN="arn:aws:secretsmanager:us-east-1:123456789012:secret:your-redshift-secret-XXXXXX"
AWS_REGION="us-east-1" # Or AWS_DEFAULT_REGION
# AWS_PROFILE="your-aws-profile-name" # Optional
```
Required Variables Table:
| Variable Name | Required | Description | Example Value |
|---|---|---|---|
| `REDSHIFT_CLUSTER_ID` | Yes | Your Redshift cluster identifier. | `my-redshift-cluster` |
| `REDSHIFT_DATABASE` | Yes | The name of the database to connect to. | `mydatabase` |
| `REDSHIFT_SECRET_ARN` | Yes | AWS Secrets Manager ARN for Redshift credentials. | `arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret-abcdef` |
| `AWS_REGION` | Yes | AWS region for Data API and Secrets Manager. | `us-east-1` |
| `AWS_DEFAULT_REGION` | No | Alternative to `AWS_REGION` for specifying the AWS region. | `us-west-2` |
| `AWS_PROFILE` | No | AWS profile name to use from your credentials file (`~/.aws/...`). | `my-redshift-profile` |
Note: Ensure the AWS credentials used by Boto3 (via environment, profile, or IAM role) have permissions to access the specified REDSHIFT_SECRET_ARN and use the Redshift Data API (redshift-data:*).
Usage
Connecting with Claude Desktop / Anthropic Console:
Add the following configuration block to your mcp.json file. Adjust command, args, env, and workingDirectory based on your installation method and setup.
```json
{
  "mcpServers": {
    "redshift-utils-mcp": {
      "command": "uvx",
      "args": ["redshift_utils_mcp"],
      "env": {
        "REDSHIFT_CLUSTER_ID": "your-cluster-id",
        "REDSHIFT_DATABASE": "your_database_name",
        "REDSHIFT_SECRET_ARN": "arn:aws:secretsmanager:...",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}
```
Connecting with Cursor IDE:
- Start the MCP server locally using the instructions in the Usage / Quickstart section.
- In Cursor, open the Command Palette (Cmd/Ctrl + Shift + P).
- Type "Connect to MCP Server" or navigate to the MCP settings.
- Add a new server connection.
- Choose the `stdio` transport type.
- Enter the command and arguments required to start your server (`uvx redshift_utils_mcp`). Ensure any necessary environment variables are available to the command being run.
- Cursor should detect the server and its available tools/resources.
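For programmatic clients, the same stdio launch can be driven from Python with the official MCP SDK. The following is an illustrative sketch (assuming the `mcp` package is installed and the environment variables from the Configuration section are already set) that starts the server and calls one of its tools:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server over stdio, the same way the mcp.json example above does.
server_params = StdioServerParameters(command="uvx", args=["redshift_utils_mcp"])


async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List the tools the server exposes, then run a basic health check.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            result = await session.call_tool(
                "handle_check_cluster_health", {"level": "basic"}
            )
            print(result.content)


asyncio.run(main())
```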
Available MCP Resources
| Resource URI Pattern | Description | Example URI |
|---|---|---|
| `/scripts/{script_path}` | Retrieves the raw content of a SQL script file from the server's `sql_scripts` directory. | `/scripts/health/disk_usage.sql` |
| `redshift://schemas` | Lists all accessible user-defined schemas in the connected database. | `redshift://schemas` |
| `redshift://wlm/configuration` | Retrieves the current Workload Management (WLM) configuration details. | `redshift://wlm/configuration` |
| `redshift://schema/{schema_name}/tables` | Lists all accessible tables and views within the specified `{schema_name}`. | `redshift://schema/public/tables` |
Replace {script_path} and {schema_name} with the actual values when making requests.
Accessibility of schemas/tables depends on the permissions granted to the Redshift user configured via REDSHIFT_SECRET_ARN.
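Within an already-initialized `ClientSession` (as in the earlier sketch), these resources can be fetched with `read_resource`. A short illustrative sketch, where `public` is only an example schema name:

```python
from pydantic import AnyUrl
from mcp import ClientSession


async def show_schemas_and_tables(session: ClientSession) -> None:
    # `session` is assumed to be an initialized ClientSession, as in the earlier sketch.
    schemas = await session.read_resource(AnyUrl("redshift://schemas"))
    print(schemas.contents)

    # Drill into a specific schema; "public" is only an example value.
    tables = await session.read_resource(AnyUrl("redshift://schema/public/tables"))
    print(tables.contents)
```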
Available MCP Tools
| Tool Name | Description | Key Parameters (Required*) | Example Invocation |
|---|---|---|---|
| `handle_check_cluster_health` | Performs a health assessment of the Redshift cluster using a set of diagnostic SQL scripts. | `level` (optional), `time_window_days` (optional) | `use_mcp_tool("redshift-admin", "handle_check_cluster_health", {"level": "full"})` |
| `handle_diagnose_locks` | Identifies active lock contention and blocking sessions in the cluster. | `min_wait_seconds` (optional) | `use_mcp_tool("redshift-admin", "handle_diagnose_locks", {"min_wait_seconds": 10})` |
| `handle_diagnose_query_performance` | Analyzes a specific query's execution performance, including plan, metrics, and historical data. | `query_id`* | `use_mcp_tool("redshift-admin", "handle_diagnose_query_performance", {"query_id": 12345})` |
| `handle_execute_ad_hoc_query` | Executes an arbitrary SQL query provided by the user via the Redshift Data API. Designed as an escape hatch. | `sql_query`* | `use_mcp_tool("redshift-admin", "handle_execute_ad_hoc_query", {"sql_query": "SELECT ..."})` |
| `handle_get_table_definition` | Retrieves the DDL (Data Definition Language) statement (`SHOW TABLE`) for a specific table. | `schema_name`*, `table_name`* | `use_mcp_tool("redshift-admin", "handle_get_table_definition", {"schema_name": "public", ...})` |
| `handle_inspect_table` | Retrieves detailed information about a specific Redshift table, covering design, storage, health, and usage. | `schema_name`*, `table_name`* | `use_mcp_tool("redshift-admin", "handle_inspect_table", {"schema_name": "analytics", ...})` |
| `handle_monitor_workload` | Analyzes cluster workload patterns over a specified time window using various diagnostic scripts. | `time_window_days` (optional), `top_n_queries` (optional) | `use_mcp_tool("redshift-admin", "handle_monitor_workload", {"time_window_days": 7})` |
TO DO
- [ ] Improve Prompt Options
- [ ] Add support for more credential methods
- [ ] Add Support for Redshift Serverless
Contributing
Contributions are welcome! Please follow these guidelines.
Find/Report Issues: Check the GitHub Issues page for existing bugs or feature requests. Feel free to open a new issue if needed.
Security Considerations
Security is critical when providing database access via an MCP server. Please consider the following:
🔒 Credentials Management: This server uses AWS Secrets Manager via the Redshift Data API, which is a more secure approach than storing credentials directly in environment variables or configuration files. Ensure your AWS credentials used by Boto3 (via environment, profile, or IAM role) are managed securely and have the minimum necessary permissions. Never commit your AWS credentials or .env files containing secrets to version control.
🛡️ Principle of Least Privilege: Configure the Redshift user whose credentials are in AWS Secrets Manager with the minimum permissions required for the server's intended functionality. For example, if only read access is needed, grant only CONNECT and SELECT privileges on the necessary schemas/tables and SELECT on the required system views. Avoid using highly privileged users like admin or the cluster superuser.
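As a concrete illustration of that least-privilege setup, an administrator could run GRANT statements like the ones below once (sketched here through the Data API's `batch_execute_statement`, though any SQL client works); the user, schema, password, and secret values are placeholders:

```python
import boto3

# Placeholder identifiers; substitute your own cluster, database, admin secret, and object names.
client = boto3.client("redshift-data", region_name="us-east-1")

statements = [
    # The same username/password pair should then be stored in the Secrets Manager secret.
    "CREATE USER mcp_readonly PASSWORD 'Example-Password-1'",
    "GRANT USAGE ON SCHEMA public TO mcp_readonly",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO mcp_readonly",
]

# batch_execute_statement runs the statements in order as a single transaction.
resp = client.batch_execute_statement(
    ClusterIdentifier="your-cluster-id",
    Database="your_database_name",
    SecretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:admin-credentials",  # admin credentials, not the MCP user's
    Sqls=statements,
)
print(resp["Id"])
```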
For guidance on creating restricted Redshift users and managing permissions, refer to the official Amazon Redshift security documentation (https://docs.aws.amazon.com/redshift/latest/mgmt/security.html).
License
This project is licensed under the MIT License. See the LICENSE file for details.
References
- This project relies heavily on the Model Context Protocol specification.
- Built using the official MCP SDK provided by Model Context Protocol.
- Utilizes the AWS SDK for Python (Boto3) to interact with the Amazon Redshift Data API.
- Many of the diagnostic SQL scripts are adapted from the excellent awslabs/amazon-redshift-utils repository.