vnc-mcp

Enables MCP-compatible LLMs to interact with any desktop accessible over VNC, providing tools for screen reading (OCR), mouse and keyboard control, and automation.

🤖🖥️ vnc-mcp allows MCP-compatible LLMs to interact with any desktop accessible over VNC

https://github.com/user-attachments/assets/5eff0be6-81f4-4baf-ad67-2f68041a022a

Claude 4 Opus getting comfortable in a Kubuntu X11 session. The real limit on this use of LLMs is now their speed. They might do better at this task if reinforcement learning taught them to decode embeddings more efficiently or to prefer OCR at all times.

Features

  • Connects to any VNC server that complies with the RFB protocol specification (RFC 6143), using pyvnc.
  • Exposes a number of MCP tools that an LLM can use to interact with a computer through the VNC client.
  • OCR for screen reading and text extraction, offering much better integration with non-multimodal LLMs. (Support for all OCR languages unfortunately makes the Docker image pretty huge...)

Requirements

  • A VNC server that complies with the RFB protocol specification (RFC 6143).
  • A VNC server that supports spec-compliant authentication (not sure which RFC this aligns with). This means ARD (Apple Remote Desktop) servers are not supported. Mac users should consider using https://github.com/baryhuang/mcp-remote-macos-use instead.
  • (Optional) A VNC server that shares your existing desktop session, so you can see the actions the LLM takes and collaborate with it. This project was tested with krfb, the preferred VNC server for both X11 and Wayland KDE sessions. It should be built into most KDE distros (I use Fedora 42 Kinoite). It may work with other DEs, but I am unsure. GNOME should also have a built-in VNC server. Windows (ew) users may have to experiment with alternative servers like TightVNC and TigerVNC. My experience with desktop-sharing VNC servers on Windows is not good, so I would suggest either using RDP or switching to a Linux distribution with KDE or GNOME as the DE.
  • (Optional) A graphics-accelerated VM running KDE or another DE with a same-session VNC server. A VM is preferred because the LLM then cannot control the desktop that hosts its own chat UI, which spares you the frustration of wrestling the mouse away from your LLM in order to see its output. NOTE: if you want to use QEMU virtio OpenGL acceleration, make sure you do NOT have the proprietary NVIDIA drivers installed. They will prevent you from starting a VM with OpenGL acceleration. This is not a problem on macOS if you start the VM through UTM, as virtio should work perfectly there with Metal.
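One rough way to check the first requirement before wiring anything into an MCP client is to read the server's opening message. Under RFC 6143, a server begins the handshake by sending a 12-byte ProtocolVersion string such as `RFB 003.008\n`. The sketch below is not part of vnc-mcp or pyvnc, and the function names `parse_rfb_version` and `probe_vnc_server` are made up for illustration. It only confirms that something on the port speaks RFB; it says nothing about whether authentication will succeed.

```python
import socket


def parse_rfb_version(banner: bytes) -> tuple[int, int]:
    """Parse a 12-byte RFB ProtocolVersion message like b"RFB 003.008\\n"."""
    well_formed = (
        len(banner) == 12
        and banner.startswith(b"RFB ")
        and banner[7:8] == b"."
        and banner.endswith(b"\n")
        and banner[4:7].isdigit()
        and banner[8:11].isdigit()
    )
    if not well_formed:
        raise ValueError(f"not an RFB ProtocolVersion message: {banner!r}")
    return int(banner[4:7]), int(banner[8:11])


def probe_vnc_server(host: str, port: int = 5900, timeout: float = 15.0) -> tuple[int, int]:
    """Connect, read the server's version banner, and return (major, minor)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        banner = b""
        # recv() may return fewer bytes than requested, so loop until we have 12.
        while len(banner) < 12:
            chunk = sock.recv(12 - len(banner))
            if not chunk:  # server closed the connection early
                break
            banner += chunk
    return parse_rfb_version(banner)


if __name__ == "__main__":
    # Hypothetical target: a VNC server listening on this machine.
    print(probe_vnc_server("localhost", 5900))
```

A connection that is refused, times out, or returns something other than an `RFB xxx.yyy` banner suggests the address is wrong or the service is not a VNC server at all.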

The following DEs and VNC servers have been tested in a VM with virgl acceleration:

  • ❔ Fedora 42 Workstation (GNOME)
  • ❌ Fedora 42 KDE - The VNC server started by krfb does not work; it only shows a black screen with no resolution. krdp is no better: it only shows a white screen. This may be an artifact of my VM setup, but I cannot be sure.
  • ❔ Ubuntu 24.04 (GNOME)
  • ✅ Kubuntu 24.04 (KDE) - krfb had to be installed from the Ubuntu repositories, but it works fine. Only worked with unattended access. Possibly works because it uses X11 instead of Wayland.
  • ✅ Windows 11 w/ TightVNC 2.8.85 - Tested on bare metal and Hyper-V. Claude had to click anywhere on the desktop to get a picture. The first poll was always black, but it worked as expected afterward!

VNC is used as opposed to RDP for two reasons:

  • RDP is much more difficult to implement and much more computationally expensive.
  • RDP servers (even more so than most VNC servers) tend to be session-based, meaning that a new desktop session is spawned with each new connection. This prevents working collaboratively with an LLM and obscures the actions the LLM is taking. This is not good, considering the broad access to your computer an LLM will have.

Installation

Although vnc-mcp is a Python package and could theoretically be started with uvx, the best way to use it with an MCP client is with a Docker container.

Docker provides several advantages:

  • Scoping access
  • Security
  • Privacy
  • Portability (important because vnc-mcp requires Python 3.12, which may be replaced by 3.13 on modern distributions)

If you do not have a Docker daemon set up on your host yet, I suggest using rootless Podman. See the Podman documentation for setup.

If you go with rootless Docker or rootless Podman, remember to set the DOCKER_HOST or CONTAINER_HOST environment variable, respectively.

If your UID is 1000, a rootless podman socket address would be unix:///run/user/1000/podman/podman.sock and a rootless docker socket address would be unix:///run/user/1000/docker.sock (unless you configure them in another way).

With Podman, you must additionally install the Docker CLI (available as docker-cli in most distribution repositories) and make sure the DOCKER_HOST environment variable points to your Podman socket.

Installation into an MCP client

The configuration below is given in mcphost/Claude Desktop format. Be sure to configure the environment variables in the env block to match your setup.

The Docker container pulls the latest image every time it starts (--pull=always) and runs on the host's network (--network=host), which is useful for reaching a VNC server on localhost without any trickery.

{
  ...
  "mcpServers": {
    ...
    "desktop": {
      "command": "docker",
      "args": [
        "run",
        "--env", "VNCMCP_HOST",
        "--env", "VNCMCP_PORT",
        "--env", "VNCMCP_TIMEOUT",
        "--env", "VNCMCP_USERNAME",
        "--env", "VNCMCP_PASSWORD",
        "-i",
        "--rm",
        "--pull=always",
        "--network=host",
        "ghcr.io/regulad/vnc-mcp:latest"
      ],
      "env": {
        "DOCKER_HOST": "unix:///run/user/1000/podman/podman.sock",
        "VNCMCP_HOST": "localhost",
        "VNCMCP_PORT": "5900",
        "VNCMCP_TIMEOUT": "15.0",
        "VNCMCP_USERNAME": "regulad",
        "VNCMCP_PASSWORD": "ChangeMeImInsecure!!"
      }
    },
    ...
  },
  ...
}
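If you maintain this entry for several machines, it can be generated rather than edited by hand. The sketch below rebuilds the same "desktop" entry from a dictionary of settings. The function name `build_server_entry` is hypothetical; the flags, variable names, and image are copied from the configuration above. Note that each `--env NAME` argument carries no value: the Docker CLI forwards the value from its own environment, which the MCP client fills in from the env block.

```python
import json

IMAGE = "ghcr.io/regulad/vnc-mcp:latest"

# Variables forwarded into the container, as listed in the configuration above.
PASSTHROUGH_VARS = [
    "VNCMCP_HOST",
    "VNCMCP_PORT",
    "VNCMCP_TIMEOUT",
    "VNCMCP_USERNAME",
    "VNCMCP_PASSWORD",
]


def build_server_entry(env: dict[str, str]) -> dict:
    """Build one mcpServers entry that launches vnc-mcp through the Docker CLI."""
    args = ["run"]
    for name in PASSTHROUGH_VARS:
        # "--env NAME" without "=value" passes the CLI's own value through.
        args += ["--env", name]
    args += ["-i", "--rm", "--pull=always", "--network=host", IMAGE]
    return {"command": "docker", "args": args, "env": dict(env)}


if __name__ == "__main__":
    entry = build_server_entry(
        {
            "DOCKER_HOST": "unix:///run/user/1000/podman/podman.sock",
            "VNCMCP_HOST": "localhost",
            "VNCMCP_PORT": "5900",
            "VNCMCP_TIMEOUT": "15.0",
            "VNCMCP_USERNAME": "regulad",
            "VNCMCP_PASSWORD": "ChangeMeImInsecure!!",
        }
    )
    print(json.dumps({"mcpServers": {"desktop": entry}}, indent=2))
```

The printed JSON can be merged into the client's existing configuration file alongside any other servers already defined there.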

Usage

Please see the Command-line Reference for details.

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

Run the test suite with nox. To test the MCP server interactively, run it under the MCP Inspector:

npx @modelcontextprotocol/inspector poetry run vnc-mcp

License

Distributed under the terms of the AGPL 3.0 or later license, vnc-mcp is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from @regulad's neopy template.
