AI Vision Debug MCP Server
A Model Context Protocol server that provides AI vision capabilities for analyzing UI screenshots, offering tools for screen analysis, file operations, and UI/UX report generation.
samihalawa
AI Vision MCP Server
A Model Context Protocol (MCP) server that provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants.
Features
- Screenshot URL: Capture screenshots of any website by providing a URL
- Visual Analysis: Analyze UI elements, layouts, and content in screenshots
- File Operations: Read and modify files with line-specific precision
- Report Generation: Create comprehensive UI/UX analysis reports
- Debugging Session: Maintain context across multiple analysis steps
Installation
# Clone the repository
git clone https://github.com/samihalawa/mcp-server-ai-vision.git
cd mcp-server-ai-vision
# Install dependencies
npm install
# Build the server
npm run build
Usage
Starting the Server
npm start
Configuration
Add the server to your MCP configuration:
{
  "servers": {
    "ai-vision": {
      "command": "/path/to/node",
      "args": ["/path/to/mcp-server-ai-vision/build/index.js"],
      "enabled": true,
      "port": 3005,
      "environment": {
        "NODE_PATH": "/path/to/node_modules",
        "PATH": "/usr/local/bin:/usr/bin:/bin",
        "GEMINI_API_KEY": "your-gemini-api-key"
      }
    }
  }
}
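The paths and the API key in the configuration above are placeholders that need to be replaced before the server can start. As a minimal sketch, the TypeScript below walks a parsed configuration object and lists every string value that still looks like an unedited placeholder. The helper name `findPlaceholders` and the marker strings are assumptions made for illustration; they are not part of this server.

```typescript
// Hypothetical helper for illustration; not part of mcp-server-ai-vision.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Substrings that mark values copied verbatim from the example configuration.
const PLACEHOLDER_MARKERS = ["/path/to/", "your-gemini-api-key"];

// Recursively collect the key paths of string values that contain a marker.
function findPlaceholders(value: Json, path = ""): string[] {
  if (typeof value === "string") {
    const text = value;
    return PLACEHOLDER_MARKERS.some((marker) => text.includes(marker)) ? [path] : [];
  }
  const found: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      found.push(...findPlaceholders(item, `${path}[${index}]`));
    });
  } else if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      found.push(...findPlaceholders(child, path ? `${path}.${key}` : key));
    }
  }
  return found;
}

// Example: the node path has been filled in, the script path and key have not.
const exampleConfig: Json = {
  servers: {
    "ai-vision": {
      command: "/usr/local/bin/node",
      args: ["/path/to/mcp-server-ai-vision/build/index.js"],
      enabled: true,
      port: 3005,
      environment: { GEMINI_API_KEY: "your-gemini-api-key" },
    },
  },
};

console.log(findPlaceholders(exampleConfig));
```

Running a check like this before launching an MCP client can catch a forgotten path or key earlier than a failed server start would.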
Available Tools
screenshot_url
Take a screenshot of a URL using a web browser.
Parameters:
- url (string, required): URL to capture a screenshot of (e.g., http://localhost:4999, https://google.com)
- fullPage (boolean, optional): Whether to capture the full page or just the viewport. Default: false
- waitForSelector (string, optional): CSS selector to wait for before taking the screenshot
- waitTime (number, optional): Time to wait in milliseconds before taking the screenshot. Default: 1000
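To make the defaults above concrete, here is a small sketch of how a caller might type the screenshot_url arguments and fill in the documented default values. The names `ScreenshotUrlArgs`, `ResolvedScreenshotArgs`, and `resolveScreenshotArgs` are illustrative, and the http/https check is an assumption based on the example URLs; the server's own validation is not shown in this README.

```typescript
// Argument shape taken from the parameter list; the type names are illustrative.
interface ScreenshotUrlArgs {
  url: string;
  fullPage?: boolean;
  waitForSelector?: string;
  waitTime?: number;
}

interface ResolvedScreenshotArgs {
  url: string;
  fullPage: boolean;
  waitTime: number;
  waitForSelector?: string;
}

// Hypothetical helper: applies the documented defaults (fullPage false, waitTime 1000).
function resolveScreenshotArgs(args: ScreenshotUrlArgs): ResolvedScreenshotArgs {
  // Assumption: only http and https URLs are accepted, as in the README examples.
  if (!/^https?:\/\//.test(args.url)) {
    throw new Error(`url must start with http:// or https://, got: ${args.url}`);
  }
  const resolved: ResolvedScreenshotArgs = {
    url: args.url,
    fullPage: args.fullPage ?? false,
    waitTime: args.waitTime ?? 1000,
  };
  // Only carry the selector over when the caller supplied one.
  if (args.waitForSelector !== undefined) {
    resolved.waitForSelector = args.waitForSelector;
  }
  return resolved;
}

console.log(resolveScreenshotArgs({ url: "http://localhost:4999" }));
```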
analyze_screen
Analyze a screenshot with AI vision.
Parameters: None (uses the most recent screenshot)
read_file
Read content from a file between specified line numbers.
Parameters:
- path (string): Path to the file
- startLine (number): Starting line number (1-indexed)
- endLine (number): Ending line number (1-indexed)
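The 1-indexed line range is easy to get off by one. The sketch below shows one way such a range can be mapped onto a zero-indexed array, assuming that endLine is inclusive (the README does not say so explicitly). The function name `sliceLines` is hypothetical and this is not the server's implementation.

```typescript
// Hypothetical helper: returns lines startLine..endLine, 1-indexed, both ends inclusive.
function sliceLines(content: string, startLine: number, endLine: number): string {
  if (!Number.isInteger(startLine) || !Number.isInteger(endLine)) {
    throw new RangeError("line numbers must be integers");
  }
  if (startLine < 1 || endLine < startLine) {
    throw new RangeError(`invalid line range ${startLine}-${endLine}`);
  }
  const lines = content.split("\n");
  // slice() takes a 0-indexed start and an exclusive end, so only the start shifts by one.
  // An endLine past the end of the file is clamped by slice() rather than rejected.
  return lines.slice(startLine - 1, endLine).join("\n");
}

const sampleText = "alpha\nbeta\ngamma\ndelta";
console.log(JSON.stringify(sliceLines(sampleText, 2, 3))); // prints "beta\ngamma"
```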
modify_file
Modify content in a file between specified line numbers.
Parameters:
- path (string): Path to the file
- startLine (number): Starting line number to replace (1-indexed)
- endLine (number): Ending line number to replace (1-indexed)
- content (string): New content to replace the specified lines
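As an illustration of what replacing a line range involves, the following sketch swaps lines startLine through endLine for new content. It assumes an inclusive endLine and newline-separated text; `replaceLines` is a hypothetical name, and the server's actual behavior (for example, around trailing newlines) may differ.

```typescript
// Hypothetical helper: replaces lines startLine..endLine (1-indexed, inclusive).
function replaceLines(
  content: string,
  startLine: number,
  endLine: number,
  replacement: string
): string {
  if (!Number.isInteger(startLine) || !Number.isInteger(endLine)) {
    throw new RangeError("line numbers must be integers");
  }
  const lines = content.split("\n");
  if (startLine < 1 || endLine < startLine || startLine > lines.length) {
    throw new RangeError(`invalid line range ${startLine}-${endLine}`);
  }
  // splice() removes (endLine - startLine + 1) lines starting at the 0-indexed
  // position and inserts the replacement lines in their place.
  lines.splice(startLine - 1, endLine - startLine + 1, ...replacement.split("\n"));
  return lines.join("\n");
}

const beforeEdit = "alpha\nbeta\ngamma\ndelta";
console.log(JSON.stringify(replaceLines(beforeEdit, 2, 3, "BETA"))); // prints "alpha\nBETA\ndelta"
```

Because the replacement can contain more or fewer lines than the range it replaces, line numbers after the edited range shift. When applying several edits to one file, working from the bottom of the file upward keeps earlier line numbers valid.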
generate_report
Generate a comprehensive UI/UX analysis report.
Parameters:
- testUrl (string): URL of the application being tested
- appName (string, optional): Name of the application being analyzed
- date (string, optional): Date of the analysis (YYYY-MM-DD)
- observations (object): Observations structured as components, data state, interactions, etc.
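The shape of the observations object is only loosely described, so the sketch below should be read as one possible arrangement rather than the server's schema. It types the generate_report arguments and runs a few sanity checks before a call. The type and function names, the treatment of testUrl and observations as required, and every value inside the example are assumptions made for illustration.

```typescript
// Argument shape based on the parameter list; the names are illustrative.
interface GenerateReportArgs {
  testUrl: string;
  appName?: string;
  date?: string; // expected as YYYY-MM-DD
  observations: Record<string, unknown>;
}

// Hypothetical pre-flight check: returns a list of problems, empty when none are found.
function checkReportArgs(args: GenerateReportArgs): string[] {
  const problems: string[] = [];
  if (args.testUrl.trim() === "") {
    problems.push("testUrl is empty");
  }
  // This checks the format only, not whether the calendar date exists.
  if (args.date !== undefined && !/^\d{4}-\d{2}-\d{2}$/.test(args.date)) {
    problems.push("date must use the YYYY-MM-DD format");
  }
  if (Object.keys(args.observations).length === 0) {
    problems.push("observations is empty");
  }
  return problems;
}

const reportArgs: GenerateReportArgs = {
  testUrl: "https://example.com",
  appName: "Example App",
  date: "2024-01-15",
  observations: {
    // Keys follow the categories named in the parameter list; values are invented.
    components: ["header", "login form"],
    dataState: "empty list shown before first load",
    interactions: ["submit button gives no visual feedback"],
  },
};

console.log(checkReportArgs(reportArgs)); // prints []
```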
Example Workflow
- Take a screenshot of a website:
  screenshot_url(url: "https://example.com")
- Analyze the screenshot:
  analyze_screen()
- Generate a report based on the analysis:
  generate_report(testUrl: "https://example.com", observations: {...})
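The calls above are written in shorthand. On the wire, an MCP client sends each one as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds those three requests as plain objects. The helper `toolCall` is a hypothetical name, the request ids are arbitrary, and the observations object is left empty because the README elides its contents. A real client would also complete the MCP initialize handshake first, and an MCP client library would normally construct these messages for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in a request.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const workflowRequests: ToolCallRequest[] = [
  toolCall(1, "screenshot_url", { url: "https://example.com" }),
  // analyze_screen takes no parameters; it uses the most recent screenshot.
  toolCall(2, "analyze_screen"),
  // observations is left empty here; fill it with the findings from the analysis.
  toolCall(3, "generate_report", { testUrl: "https://example.com", observations: {} }),
];

for (const request of workflowRequests) {
  console.log(JSON.stringify(request));
}
```

The order matters: analyze_screen relies on a screenshot having been captured earlier in the same session.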
Requirements
- Node.js 14+
- Playwright for browser automation
- Gemini API key for AI vision analysis
License
MIT