PrismSRE

PrismSRE is an AI-powered Kubernetes troubleshooting agent that uses Google's Gemini models and the Model Context Protocol to provide autonomous diagnostics and real-time insights for cluster issues. It features a glassmorphism dashboard and secure read-only access to pods, logs, and deployments.

💎 PrismSRE

The next-generation, AI-powered Site Reliability Engineer for your Kubernetes clusters.

<img width="1536" height="1024" alt="gpt-overview" src="https://github.com/user-attachments/assets/a7a051f3-6cda-40a1-bd65-d357960b72a5" />

PrismSRE is a production-grade Kubernetes troubleshooting system that acts as an autonomous AI agent. It turns raw cluster metrics and logs into actionable SRE insights. Built on the Google Agent Development Kit (ADK), the Model Context Protocol (MCP), and a Glassmorphism dashboard, PrismSRE provides immediate diagnostics for your Kubernetes workloads.


✨ Features

  • 🧠 Autonomous Diagnostics: Powered by Google's Gemini models, capable of analyzing CrashLoopBackOff, OOMKilled, and stuck rollouts.
  • πŸ›‘οΈ Secure by Design: Employs the Model Context Protocol (FastMCP) to enforce strict read-only access to the Kubernetes cluster. The AI agent operates outside the direct execution context.
  • 🎨 Glassmorphism UI: A breathtaking, dependency-free, single-file HTML dashboard using Vanilla JS and Tailwind CSS.
  • ⚡ Real-time Context Gathering: Automatically fetches pod status, deployment definitions, and tail logs through MCP tools without requiring raw shell access.
  • ☁️ Cloud Agnostic: Compatible with GKE, K3s, Minikube, and standard Kubernetes distributions.
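
To make the diagnostic targets above concrete, here is a minimal sketch of how CrashLoopBackOff and OOMKilled can be recognized in a pod's status. It is not PrismSRE's actual implementation: `diagnose_pod` is a hypothetical helper, the input is assumed to be the dict form of `kubectl get pod <name> -o json`, and the sample values are made up.

```python
from typing import Any


def diagnose_pod(pod: dict[str, Any]) -> list[str]:
    """Return human-readable findings for one pod (hypothetical helper)."""
    findings: list[str] = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for cs in statuses:
        name = cs.get("name", "<unknown>")
        # A container that keeps crashing sits in the "waiting" state
        # with reason CrashLoopBackOff between restarts.
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            restarts = cs.get("restartCount", 0)
            findings.append(f"{name}: CrashLoopBackOff after {restarts} restarts")
        # The previous termination is recorded under lastState.terminated.
        last = cs.get("lastState", {}).get("terminated") or {}
        if last.get("reason") == "OOMKilled":
            code = last.get("exitCode")
            findings.append(f"{name}: last run was OOMKilled (exit code {code})")
    return findings


# Sample data shaped like a Kubernetes Pod status; the values are invented.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "api",
                "restartCount": 7,
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}

if __name__ == "__main__":
    for finding in diagnose_pod(sample_pod):
        print(finding)
```

In the real agent, the Gemini model would reason over this kind of context together with logs and events; the sketch only shows where the signals live in the pod object.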

πŸ—οΈ Architecture

For a deep dive into the system design, security boundaries, and component interaction, please see the Architecture Documentation.


🚀 Getting Started

Prerequisites

  • Python 3.11+
  • A running Kubernetes cluster (GKE, K3s, Minikube, etc.)
  • kubectl configured and authenticated to your cluster
  • A Google Gemini API Key

Local Development

  1. Clone the repository:

    git clone https://github.com/barbaria888/PrismSRE.git
    cd PrismSRE
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Configure Environment Variables:

    cp .env.example .env
    

    Add your GOOGLE_API_KEY to the .env file.

  4. Run the Dashboard Server:

    uvicorn app:app --reload --host 0.0.0.0 --port 8000
    

    Navigate to http://localhost:8000 in your browser.
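
If the dashboard starts but the agent cannot reach Gemini, a missing key is a likely cause. The following pre-flight check is a minimal sketch, assuming the .env file uses plain KEY=VALUE lines; `parse_env` and `has_api_key` are hypothetical helpers and not part of PrismSRE.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(text: str) -> bool:
    """True when GOOGLE_API_KEY is present and non-empty."""
    return bool(parse_env(text).get("GOOGLE_API_KEY"))


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text(encoding="utf-8") if env_path.is_file() else ""
    print("GOOGLE_API_KEY set" if has_api_key(text) else "GOOGLE_API_KEY missing from .env")
```

Run it from the repository root before starting uvicorn.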


☸️ Running in Your Own Cluster

To deploy PrismSRE as a long-running service inside your Kubernetes cluster, follow these steps.

1. Create the Secret

The agent requires your Gemini API key to operate. We provide a compatible Secret manifest, secret.yaml. Edit it with your actual key (base64-encoded if it goes under data, plaintext if it goes under stringData), then apply:

kubectl apply -f secret.yaml
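
If you prefer to generate the Secret rather than edit it by hand, the following is a minimal sketch. It assumes the Secret is named kubeops-ai-secret (the name the Deployment in step 3 references through envFrom) and holds a single GOOGLE_API_KEY entry; the shipped secret.yaml may differ. `build_secret` is a hypothetical helper.

```python
import base64
import json


def build_secret(api_key: str, name: str = "kubeops-ai-secret", namespace: str = "default") -> dict:
    """Build a Secret manifest with the key base64-encoded under `data`."""
    encoded = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # envFrom exposes each key here as an environment variable.
        "data": {"GOOGLE_API_KEY": encoded},
    }


if __name__ == "__main__":
    # "replace-me" is a placeholder, not a real key.
    print(json.dumps(build_secret("replace-me"), indent=2))
```

kubectl accepts JSON manifests, so the output can be piped to `kubectl apply -f -`.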

2. Containerize the Application

Build and push the Docker image to your container registry:

# Use the Dockerfile included in the project, or write a simple one for FastAPI
docker build -t your-registry/prismsre:latest .
docker push your-registry/prismsre:latest

3. Deploy to Kubernetes

You can deploy the application using standard Kubernetes manifests. Ensure you grant the necessary RBAC permissions (read-only access to Pods, Deployments, and Logs).

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prismsre-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prismsre-reader
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "pods/log", "deployments", "events"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prismsre-reader-binding
subjects:
- kind: ServiceAccount
  name: prismsre-sa
  namespace: default
roleRef:
  kind: ClusterRole
  name: prismsre-reader
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prismsre
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prismsre
  template:
    metadata:
      labels:
        app: prismsre
    spec:
      serviceAccountName: prismsre-sa
      containers:
      - name: prismsre
        image: your-registry/prismsre:latest
        ports:
        - containerPort: 8000
        envFrom:
        - secretRef:
            name: kubeops-ai-secret
---
apiVersion: v1
kind: Service
metadata:
  name: prismsre-service
spec:
  type: ClusterIP
  selector:
    app: prismsre
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000

Apply the manifests (saved here as deployment.yaml):

kubectl apply -f deployment.yaml

<img width="959" height="449" alt="Image" src="https://github.com/user-attachments/assets/96424554-5c10-46db-90f9-08877387d2da" />

(Note: for external access, configure an Ingress or change the Service type to LoadBalancer.)
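
Before applying RBAC changes, it can help to confirm that a role grants only read verbs. The check below is a minimal sketch, not part of PrismSRE: `non_read_only_rules` is a hypothetical helper, and the rules are copied by hand from the prismsre-reader ClusterRole in the manifest above.

```python
READ_ONLY_VERBS = {"get", "list", "watch"}


def non_read_only_rules(rules: list[dict]) -> list[dict]:
    """Return the rules that grant any verb outside get/list/watch."""
    return [r for r in rules if not set(r.get("verbs", [])) <= READ_ONLY_VERBS]


# Rules copied from the prismsre-reader ClusterRole.
prismsre_rules = [
    {
        "apiGroups": ["", "apps"],
        "resources": ["pods", "pods/log", "deployments", "events"],
        "verbs": ["get", "list", "watch"],
    }
]

if __name__ == "__main__":
    offending = non_read_only_rules(prismsre_rules)
    print("read-only" if not offending else f"write access granted by: {offending}")
```

A wildcard verb ("*") is also flagged, since it is not in the read-only set.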


πŸ›‘οΈ Security Considerations

  • No Root Access: The agent operates strictly with ClusterRole read-only permissions.
  • No Direct Shell: Uses the Model Context Protocol to execute only predefined tools, so prompt injection attempts cannot make the agent run arbitrary bash commands.
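
The predefined-tools idea can be illustrated with a plain allowlist dispatcher. This is a sketch under stated assumptions, not the FastMCP API or PrismSRE's code: the tool names and `call_tool` are hypothetical, and the tool bodies return placeholder strings where a real tool would query the Kubernetes API.

```python
from typing import Callable


# Hypothetical read-only tools; real ones would query the Kubernetes API.
def get_pod_status(namespace: str, pod: str) -> str:
    return f"status of {namespace}/{pod}"


def tail_logs(namespace: str, pod: str) -> str:
    return f"logs of {namespace}/{pod}"


# The registry is the only path from a model request to executed code.
TOOLS: dict[str, Callable[..., str]] = {
    "get_pod_status": get_pod_status,
    "tail_logs": tail_logs,
}


def call_tool(name: str, **kwargs: str) -> str:
    """Run a tool only if it is registered; reject everything else."""
    if name not in TOOLS:
        raise PermissionError(f"tool not allowed: {name}")
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(call_tool("tail_logs", namespace="default", pod="api"))
```

Because the model can only name an entry in the registry, a request such as `run_shell` is rejected before anything executes.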

📄 License

This project is licensed under the MIT License.
