AgentViewport
Enables AI agents to view and control the local desktop through a browser-based viewport and MCP tools.
AgentViewport is a bridge tool that turns your local desktop into a browser-based viewport and an MCP (Model Context Protocol) Server. It enables AI agents (like Claude Desktop, Cursor, or custom agents) to "see" your screen and interact with it (click, type, drag) via a standardized protocol using their embedded browser tools.
<img src="icon.png" width="200" alt="AgentViewport Icon" />
Features
- MCP Server: Exposes get_screenshot, mouse_click, key_type, and more to AI agents.
- Low Latency Viewport: Streams your desktop to http://localhost:3000 for remote viewing or debugging.
- System Tray Integration: Background execution with a convenient tray menu.
- Safety Mode: Emergency kill-switch (defaults to Tray Exit).
- Portable: Can be built into a standalone .exe with bundled dependencies.
Prerequisites
- Node.js 20+ - Download from nodejs.org
- Windows 10/11 - Currently Windows-only (uses native Windows APIs for screen capture and input simulation)
- npm - Comes with Node.js
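The Node.js requirement can be checked before installing. The snippet below is a minimal sketch; meetsNodeRequirement is a hypothetical helper and not part of this project.

```javascript
// Hypothetical helper: does a version string satisfy "Node.js 20+"?
// Accepts both "20.11.1" (process.versions.node) and "v20.11.1" (process.version).
function meetsNodeRequirement(version, minimumMajor = 20) {
  const major = Number.parseInt(String(version).replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

if (!meetsNodeRequirement(process.versions.node)) {
  console.error(`Node.js 20+ is required, found ${process.versions.node}`);
}
```

Running node --version in a terminal gives the same information without any code.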
Installation
Option A: Run from Source
- Clone the repository:
  git clone https://github.com/guybnd/agent-viewport.git
  cd agent-viewport
- Install dependencies:
  npm install
- Run the application:
  npm start
Option B: Build Standalone Executable
- Install dependencies (if not already done):
  npm install
- Build the executable:
  npm run build
- Copy runtime dependencies:
  node copy_assets.js
- Run from the dist folder:
  .\dist\AgentViewport.exe
[!IMPORTANT] The dist folder contains three items that must stay together:
- AgentViewport.exe - The main executable
- vendor/ - Native modules required at runtime
- icon.png - Tray icon
If you move the executable, move the entire dist folder contents together.
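If the executable fails to start after being moved, one quick check is whether all three items made it across. The following is a small sketch; missingDistItems is a hypothetical helper, and the comparison is case-insensitive on the assumption that the folder is on a typical Windows file system.

```javascript
// Items that must stay together in the dist folder, as listed above.
const REQUIRED_DIST_ITEMS = ["AgentViewport.exe", "vendor", "icon.png"];

// Hypothetical helper: given the names found in a folder,
// return the required items that are missing.
function missingDistItems(entries) {
  const present = new Set(entries.map((name) => name.toLowerCase()));
  return REQUIRED_DIST_ITEMS.filter((item) => !present.has(item.toLowerCase()));
}

// Example with an incomplete copy of the folder:
console.log(missingDistItems(["AgentViewport.exe", "icon.png"]));
```

In practice the entries could come from require("node:fs").readdirSync on the folder that holds the executable.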
Configuration
Using with Claude Desktop (MCP)
To let Claude "see" your computer, add AgentViewport to your claude_desktop_config.json:
Windows Config Location: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "agent-viewport": {
      "command": "node",
      "args": ["C:\\path\\to\\agent-viewport\\server.js"]
    }
  }
}
If using the built .exe:
{
  "mcpServers": {
    "agent-viewport": {
      "command": "C:\\path\\to\\AgentViewport.exe",
      "args": []
    }
  }
}
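If claude_desktop_config.json already lists other MCP servers, the agent-viewport entry should be merged in rather than pasted over the file. The sketch below shows one way to do that merge on a parsed config object. addMcpServer and the "other" server entry are hypothetical and only illustrate the idea.

```javascript
// Hypothetical helper: return a copy of a Claude Desktop config object with
// one MCP server entry added, leaving other servers and settings untouched.
function addMcpServer(config, name, command, args = []) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers || {}),
      [name]: { command, args },
    },
  };
}

// Illustrative existing config that already has one (made-up) server.
const existing = { mcpServers: { other: { command: "node", args: ["other.js"] } } };

const updated = addMcpServer(existing, "agent-viewport", "node", [
  "C:\\path\\to\\agent-viewport\\server.js",
]);
console.log(JSON.stringify(updated, null, 2));
```

Because the helper returns a new object, the original config stays unchanged and can be kept as a backup before writing the file.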
Application Config
A configuration file is automatically created at %AppData%\AgentViewport\agent-viewport.config.json.
You can modify:
- port: Web server port (Default: 3000)
- targetWidth: Screenshot resizing width (Default: 2560)
- fps: Streaming FPS (Default: 10)
- jpegQuality: Compression quality (Default: 70)
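To make the relationship between the file and the defaults concrete, here is a minimal sketch of how user values could be layered over the documented defaults. The helper names (configPath, applyDefaults, frameIntervalMs) and the merge rules are assumptions for illustration. The actual loading logic lives in server.js and may differ.

```javascript
// Defaults as documented in the list above.
const DEFAULT_CONFIG = { port: 3000, targetWidth: 2560, fps: 10, jpegQuality: 70 };

// Build the config file path from an environment object such as process.env.
function configPath(env) {
  return `${env.APPDATA}\\AgentViewport\\agent-viewport.config.json`;
}

// Assumed merge rule: numeric user values override defaults,
// anything else (missing, wrong type, unknown key) is ignored.
function applyDefaults(userConfig) {
  const merged = { ...DEFAULT_CONFIG };
  for (const key of Object.keys(DEFAULT_CONFIG)) {
    if (typeof userConfig[key] === "number") merged[key] = userConfig[key];
  }
  return merged;
}

// Time between streamed frames implied by the fps setting.
function frameIntervalMs(config) {
  return 1000 / config.fps;
}

const config = applyDefaults({ port: 8080 });
console.log(config, frameIntervalMs(config));
```

With the default fps of 10, a frame is sent roughly every 100 ms. Raising fps or jpegQuality makes the stream smoother or sharper at the cost of more CPU and bandwidth.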
Tools Available to Agents
| Tool Name | Description | Arguments |
|---|---|---|
| get_screenshot | Returns a base64 encoded JPEG of the main screen. | None |
| list_monitors | Returns screen resolution. | None |
| mouse_click | Click at x, y. | x, y, button (left/right) |
| mouse_drag | Drag from current pos to x, y. | x, y |
| key_type | Type text or press keys. | text, key |
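For custom agents, it can help to see what a tool invocation looks like on the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the tools/call method. The sketch below builds such a request for mouse_click using the argument names from the table, and adds a small check that a base64 string decodes to JPEG data. buildToolCall and looksLikeJpeg are hypothetical helpers. The exact shape of the get_screenshot response is not documented here, so the check only inspects the decoded bytes.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP uses.
function buildToolCall(id, name, args = {}) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: left-click at (640, 360). Coordinates are illustrative.
const clickRequest = buildToolCall(1, "mouse_click", { x: 640, y: 360, button: "left" });
console.log(JSON.stringify(clickRequest));

// Check that a base64 string decodes to data beginning with the
// JPEG start-of-image marker (bytes FF D8).
function looksLikeJpeg(base64Data) {
  const bytes = Buffer.from(base64Data, "base64");
  return bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xd8;
}
```

Clients such as Claude Desktop handle this framing themselves. The request builder is mainly useful when debugging a custom agent or reading protocol logs.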
Security & Privacy
[!CAUTION] Use with Caution: Giving an AI agent access to your screen and input controls (mouse/keyboard) is a high-privilege action. Always monitor the agent's activity in real-time. Do not leave the agent unattended while it has control over your session.
- Local Execution: The server binds to localhost:3000 by default. It is not accessible from the internet unless you explicitly use a tunnel or port forwarding.
- MCP Security: The MCP server communicates via stdio (Standard Input/Output), which is a local-only transport managed by your AI client (e.g., Claude Desktop).
- No Cloud Processing: Screenshots and input data are processed entirely on your machine. No data is sent to external servers by this tool.
- Transparency: This project is open-source. You can inspect server.js to see exactly how screenshots are captured and how mouse/keyboard inputs are handled.
Troubleshooting
| Issue | Solution |
|---|---|
| Port 3000 in use | Change port in %AppData%\AgentViewport\agent-viewport.config.json or close the conflicting app |
| Clicks not registering | Make sure you're clicking inside the video area, not the letterboxed margins |
| Build fails with EPERM | Close the running AgentViewport.exe (via tray or Task Manager) before rebuilding |
| Native module errors | Delete node_modules and dist, then run npm install and rebuild |
| Tray icon not appearing | Check the system tray overflow area (^ arrow in taskbar) |
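The "Clicks not registering" entry follows from how a letterboxed video maps onto the screen. The sketch below illustrates that mapping under the assumption that the stream is scaled to fit the viewport element and centered, so a click in the margins has no corresponding screen position. viewportToScreen is a hypothetical function, not the project's actual implementation.

```javascript
// Hypothetical mapping from a click inside the viewport element to screen
// coordinates, assuming the video is scaled to fit and centered (letterboxed).
// Returns null when the click lands in a letterbox margin.
function viewportToScreen(clickX, clickY, elemW, elemH, screenW, screenH) {
  const scale = Math.min(elemW / screenW, elemH / screenH);
  const offsetX = (elemW - screenW * scale) / 2; // left/right margin width
  const offsetY = (elemH - screenH * scale) / 2; // top/bottom margin height
  const x = (clickX - offsetX) / scale;
  const y = (clickY - offsetY) / scale;
  if (x < 0 || y < 0 || x >= screenW || y >= screenH) return null;
  return { x: Math.floor(x), y: Math.floor(y) };
}

// A 1920x1080 screen shown in a 960x960 element leaves 210px margins
// above and below the video.
console.log(viewportToScreen(480, 480, 960, 960, 1920, 1080)); // inside the video
console.log(viewportToScreen(480, 100, 960, 960, 1920, 1080)); // in the top margin
```

Resizing the browser window so the video fills it shrinks the margins and makes stray clicks less likely.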
Acknowledgements
This project is made possible by these incredible open-source libraries:
- RobotJS - Desktop automation.
- Model Context Protocol SDK - MCP server framework.
- Sharp - High-performance image processing.
- screenshot-desktop - Cross-platform screenshots.
- Socket.io - Real-time viewport streaming.
- SysTray2 - System tray integration.
License
MIT