taw-computer

An MCP server that provides AI agents with a full Ubuntu desktop environment inside Docker, enabling them to perform complex computer tasks like browsing, coding, testing, and GUI automation.

<p align="center"> <img src="https://img.shields.io/badge/🖥️_taw--computer-Give_any_AI_a_real_computer-blue?style=for-the-badge&labelColor=000" alt="taw-computer" height="40"> </p>

<p align="center"> <strong>Your AI writes code. This lets it <em>use</em> a computer.</strong> </p>

<p align="center"> <a href="https://github.com/the-agents-work/taw-computer/stargazers"><img src="https://img.shields.io/github/stars/the-agents-work/taw-computer?style=flat-square&logo=github&color=yellow" alt="Stars"></a>&nbsp; <a href="https://github.com/the-agents-work/taw-computer/blob/main/LICENSE"><img src="https://img.shields.io/github/license/the-agents-work/taw-computer?style=flat-square&color=green" alt="MIT License"></a>&nbsp; <a href="https://github.com/the-agents-work/taw-computer/issues"><img src="https://img.shields.io/github/issues/the-agents-work/taw-computer?style=flat-square" alt="Issues"></a>&nbsp; <a href="#quick-start"><img src="https://img.shields.io/badge/get_started-5_minutes-brightgreen?style=flat-square" alt="Get Started"></a> </p>

<p align="center"> <a href="#quick-start">Quick Start</a> · <a href="#demo">Demo</a> · <a href="https://shipkit.cc">Hosted Version</a> · <a href="CONTRIBUTING.md">Contributing</a> · <a href="https://github.com/the-agents-work/taw-computer/issues">Report Bug</a> </p>


What if your AI could do everything you do on a computer?

Not just write code, but open a browser, click buttons, fill forms, run servers, test in real browsers, install anything, and see the screen?

taw-computer is an open-source MCP server that gives AI agents a full Ubuntu desktop inside Docker. Your AI connects, gets a real computer, and works like a human would.

No internal LLM. No chat UI. Your AI is the brain. This is the body.

<br>

Demo

<!-- 🎬 Replace with your actual demo GIF/video --> <!-- Record: start Claude Code → "build me a landing page" → AI creates VM, codes, opens browser, shows result --> <!-- Recommended: use asciinema for terminal, or screen record VNC + terminal side by side -->

<p align="center"> <em>📹 Demo coming soon. <a href="https://github.com/the-agents-work/taw-computer/stargazers">Star this repo</a> to get notified!</em> </p>

<!-- When ready, uncomment: <p align="center"> <img src="docs/demo.gif" alt="taw-computer demo: AI builds a website from scratch" width="800"> <br> <sub>AI builds a full website from a single prompt: writing code, installing packages, and testing in a real browser.</sub> </p> -->

<br>

Why taw-computer?

Other tools let AI write code. taw-computer lets AI use a computer.

| | ChatGPT / Claude | Cursor / Copilot | Lovable / Bolt | taw-computer |
|---|---|---|---|---|
| Write code | ✅ | ✅ | ✅ | ✅ |
| Run shell commands | ❌ | Limited | Sandboxed | Full Ubuntu |
| Browse the web | ❌ | ❌ | ❌ | Real Chromium |
| See & click the screen | ❌ | ❌ | ❌ | Desktop + VNC |
| Install any software | ❌ | ❌ | ❌ | apt/npm/pip |
| Test in real browser | ❌ | ❌ | Preview only | Playwright + CDP |
| Persist across sessions | ❌ | ❌ | ✅ | Snapshots |
| Self-hostable | ❌ | ❌ | ❌ | 100% yours |

<br>

Quick start

Get running in under 5 minutes:

# 1. Clone & build
git clone https://github.com/the-agents-work/taw-computer.git
cd taw-computer
docker build -f images/Dockerfile.taw -t taw-computer-base .

# 2. Install & start
npm install && npm start

Then add to your AI client:

<details open> <summary><strong>Claude Code</strong> (~/.claude/mcp.json)</summary>

{
  "mcpServers": {
    "taw-computer": {
      "command": "npx",
      "args": ["tsx", "/path/to/taw-computer/mcp/index.ts"]
    }
  }
}

</details>

<details> <summary><strong>Cursor</strong></summary>

Add to Cursor MCP settings (Settings → MCP Servers), using the same JSON format as above.

</details>

<details> <summary><strong>Claude Desktop</strong></summary>

Add to claude_desktop_config.json, using the same JSON format as above.

</details>

<details> <summary><strong>Any MCP client</strong></summary>

taw-computer speaks standard MCP over stdio. Any client that supports MCP can connect.

</details>

<details> <summary><strong>Remote server (SSH)</strong>: run on a beefy machine, use from your laptop</summary>

Got a powerful server / Mac Mini / VPS? Run taw-computer there and connect from anywhere:

{
  "mcpServers": {
    "taw-computer": {
      "command": "ssh",
      "args": ["user@your-server", "cd /path/to/taw-computer && npx tsx mcp/index.ts"]
    }
  }
}

Your laptop (Claude Code)
    ↕ SSH (stdin/stdout piped over network)
Remote server (taw-computer + Docker)
    ↕ Docker
Ubuntu sandbox

Setup:

  1. On the server: install Docker, clone repo, build image, npm install
  2. On the server: enable SSH (sudo systemctl enable ssh)
  3. On your laptop: ssh-copy-id user@your-server (passwordless login)
  4. Add the MCP config above and you're done!

Watch via VNC: open http://your-server:6080 in your browser.

</details>

That's it. Now tell your AI "Create a VM and build me a website" and watch it work.
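
If you script your setup, note that the local and SSH configs shown above differ only in `command` and `args`. The sketch below generates both shapes. `localConfig` and `sshConfig` are hypothetical helper names used for illustration; they are not part of taw-computer, and the paths and host are the same placeholders used above.

```typescript
// Shape of one entry under "mcpServers", as used in the configs above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Local setup: the client starts the server directly with npx + tsx.
function localConfig(repoPath: string): McpConfig {
  return {
    mcpServers: {
      "taw-computer": {
        command: "npx",
        args: ["tsx", `${repoPath}/mcp/index.ts`],
      },
    },
  };
}

// Remote setup: the client starts the same server on another host through ssh,
// so stdin/stdout travel over the SSH connection.
function sshConfig(sshTarget: string, repoPath: string): McpConfig {
  return {
    mcpServers: {
      "taw-computer": {
        command: "ssh",
        args: [sshTarget, `cd ${repoPath} && npx tsx mcp/index.ts`],
      },
    },
  };
}

// Print both variants as JSON, ready to paste into a client config file.
console.log(JSON.stringify(localConfig("/path/to/taw-computer"), null, 2));
console.log(JSON.stringify(sshConfig("user@your-server", "/path/to/taw-computer"), null, 2));
```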

<br>

What can it do?

🖥️ "Build me a landing page"

AI creates a VM → scaffolds Next.js → writes components → starts dev server → opens browser to check → iterates until it looks right

🌐 "Go to Amazon and find the best laptop under $1000"

AI opens Chromium → navigates to Amazon → searches → scrolls → extracts prices → compares → reports back

🧪 "Run E2E tests on my deployed app"

AI launches Playwright → navigates to your URL → fills forms → clicks buttons → asserts results → reports failures

🔧 "Set up a PostgreSQL database with sample data"

AI runs apt install postgresql → creates database → writes seed script → runs it → verifies with queries

📸 "What does my app look like on mobile?"

AI takes desktop screenshot → resizes viewport → screenshots again → compares → suggests CSS fixes

<br>

How it works

┌─────────────────────────────────────────────────────┐
│  Your AI Client                                     │
│  Claude Code · Cursor · Claude Desktop · any MCP    │
└───────────────────────┬─────────────────────────────┘
                        │ MCP protocol (stdio)
┌───────────────────────▼─────────────────────────────┐
│  taw-computer MCP server          30+ tools         │
│  vm · shell · files · browser · desktop · search    │
└───────────────────────┬─────────────────────────────┘
                        │ Docker API
┌───────────────────────▼─────────────────────────────┐
│  Ubuntu 22.04 Sandbox           isolated container  │
│                                                     │
│   bash    Chromium + CDP    xfce4 Desktop + VNC     │
│     git npm pip curl          Playwright            │
│       python node              xdotool scrot        │
│                                                     │
│   /workspace ← your project files live here         │
└─────────────────────────────────────────────────────┘

<br>

30+ tools

<details open> <summary><strong>VM Management</strong>: create, destroy, snapshot, resume</summary>

| Tool | What it does |
|---|---|
| vm_create | Spin up a new sandbox. Returns VNC URL to watch live |
| vm_destroy | Destroy (auto-saves snapshot for later) |
| vm_reset | Destroy + delete snapshot (fresh start) |
| vm_restart | Restart container, keep all files |
| vm_status | CPU, RAM, disk, uptime, top processes |
| vm_list | List running sandboxes |
| vm_rename | Rename a VM |
| snapshot_list | List saved snapshots |
| snapshot_delete | Delete a snapshot |

</details>
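
The difference between `vm_destroy` (keeps a snapshot) and `vm_reset` (deletes it) is easiest to see in a toy model. The sketch below is an in-memory illustration only: `ToyVmManager` is a hypothetical class, not the project's SandboxManager, and it assumes that creating a VM under a name that has a saved snapshot restores that snapshot, which the table above does not spell out.

```typescript
// Toy in-memory model of the lifecycle in the table above.
// Illustrative only: no Docker involved, and all names are hypothetical.
type Files = Map<string, string>;

class ToyVmManager {
  private running = new Map<string, Files>();
  private snapshots = new Map<string, Files>();
  private maxSandboxes: number;

  constructor(maxSandboxes: number = 3) {
    this.maxSandboxes = maxSandboxes;
  }

  // vm_create. Assumption: a snapshot saved under the same name is restored.
  create(name: string): void {
    if (this.running.has(name)) throw new Error(`${name} is already running`);
    if (this.running.size >= this.maxSandboxes) throw new Error("MAX_SANDBOXES reached");
    this.running.set(name, new Map<string, string>(this.snapshots.get(name)));
  }

  write(name: string, path: string, content: string): void {
    this.files(name).set(path, content);
  }

  read(name: string, path: string): string | undefined {
    return this.files(name).get(path);
  }

  // vm_destroy: stop the sandbox but keep a snapshot for later.
  destroy(name: string): void {
    this.snapshots.set(name, new Map<string, string>(this.files(name)));
    this.running.delete(name);
  }

  // vm_reset: stop the sandbox and delete its snapshot (fresh start).
  reset(name: string): void {
    this.running.delete(name);
    this.snapshots.delete(name);
  }

  // snapshot_list
  snapshotList(): string[] {
    return Array.from(this.snapshots.keys());
  }

  private files(name: string): Files {
    const files = this.running.get(name);
    if (!files) throw new Error(`${name} is not running`);
    return files;
  }
}

const vms = new ToyVmManager();
vms.create("site");
vms.write("site", "/workspace/index.html", "<h1>hi</h1>");
vms.destroy("site"); // snapshot kept
vms.create("site"); // assumed to restore the snapshot
console.log(vms.read("site", "/workspace/index.html"));
vms.reset("site"); // snapshot deleted
console.log(vms.snapshotList());
```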

<details> <summary><strong>Shell & Files</strong>: full Ubuntu command line + file ops</summary>

| Tool | What it does |
|---|---|
| exec | Run any command: git, npm, pip, curl, docker, anything |
| fs_read | Read a file |
| fs_write | Write a file (creates parent dirs) |
| fs_edit | Find-and-replace in a file |
| fs_list | ls / recursive find |
| fs_search | grep for patterns |
| code_search | ripgrep with regex, file types, context |
| file_upload | Upload file into VM (base64, max 50MB) |

</details>
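
Because `file_upload` takes base64, the payload on the wire is roughly a third larger than the file itself. The sketch below shows the arithmetic. It assumes the 50MB limit means 50 MiB of raw file data, which the table does not specify; `base64Length` and `checkUpload` are hypothetical helper names.

```typescript
// Assumption: "50MB" is read as 50 MiB of raw (pre-encoding) file data.
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024;

// Length of the padded base64 string for `byteLength` raw bytes:
// every group of 3 bytes (or part of one) becomes 4 characters.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Check a file size against the assumed limit before encoding and sending it.
function checkUpload(byteLength: number): { ok: boolean; encodedLength: number } {
  return {
    ok: byteLength <= MAX_UPLOAD_BYTES,
    encodedLength: base64Length(byteLength),
  };
}

console.log(checkUpload(10 * 1024 * 1024)); // a 10 MiB file
console.log(checkUpload(60 * 1024 * 1024)); // over the assumed limit
```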

<details> <summary><strong>Browser (CDP/Playwright)</strong>: real browser, not a simulator</summary>

| Tool | What it does |
|---|---|
| browser_navigate | Go to URL, wait for load |
| browser_snapshot | Screenshot + numbered overlays on every clickable element |
| browser_click_ref | Click element #N from snapshot |
| browser_type_ref | Type into element #N |
| browser_extract | Read page text (CSS selector or full page) |
| browser_eval | Run JavaScript in page |
| browser_wait_for | Wait for selector / text / network idle |
| browser_console_logs | Read console.log, console.error, etc. |
| browser_network_errors | Catch 404s, CORS errors, failed requests |
| browser_run_test | Run a Playwright test script |
| browser_open | Open Chrome via desktop (fallback) |
| browser_close | Kill Chrome |
| web_search | Google search → top 8 results |

</details>

<details> <summary><strong>Desktop</strong>: see and control the GUI</summary>

| Tool | What it does |
|---|---|
| desktop_screenshot | JPEG screenshot of the whole desktop |
| desktop_click | Click at (x, y) |
| desktop_type | Type text into focused window |
| desktop_key | Key combos: ctrl+c, alt+tab, Return, etc. |
| desktop_scroll | Scroll up/down |
| desktop_drag | Drag from A to B |

</details>
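
Your MCP client normally handles the wire format for you, but it can help to see what a tool invocation looks like. In MCP, a client calls a tool with a JSON-RPC `tools/call` request, and the stdio transport carries one JSON message per line. The sketch below builds such requests. `toolCall` and `frame` are hypothetical helper names; the `ref`, `text`, and `submit` arguments come from the Set-of-Mark walkthrough in this README, and argument names for other tools should be read from the server's tool list rather than guessed.

```typescript
// A JSON-RPC 2.0 request that invokes one MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a tools/call request with a fresh request id.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Frame a request for the stdio transport: a single line ending in a newline.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const click = toolCall("browser_click_ref", { ref: 2 });
const type = toolCall("browser_type_ref", { ref: 2, text: "laptop", submit: true });
console.log(frame(click) + frame(type));
```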

<br>

Set-of-Mark: how browser automation actually works

Most "computer use" tools guess pixel coordinates. We use Set-of-Mark prompting instead, where the AI sees numbered badges on every interactive element:

Step 1: browser_snapshot
        → AI sees screenshot with [1] Login  [2] Search  [3] Cart  ...

Step 2: browser_click_ref(ref=2)
        → clicks the Search box precisely

Step 3: browser_type_ref(ref=2, text="laptop", submit=true)
        → types and presses Enter

Step 4: browser_snapshot
        → sees new page with results [4] [5] [6] ...

No coordinate guessing. No CSS selector fragility. The AI sees what it's clicking.
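
The core idea can be sketched in a few lines: number the interactive elements at snapshot time, then resolve a ref back to that element's bounding box when the AI asks to click it. This is a simplified illustration under stated assumptions, not the code in mcp/browser.ts. The `PageElement` shape, `markElements`, and `clickPoint` are hypothetical, and a real implementation would collect elements from the live page and draw the badges onto the screenshot.

```typescript
interface Box { x: number; y: number; width: number; height: number }
interface PageElement { label: string; interactive: boolean; box: Box }
interface Mark { ref: number; label: string; box: Box }

// Snapshot step: give every visible interactive element a 1-based ref.
function markElements(elements: PageElement[]): Mark[] {
  return elements
    .filter((el) => el.interactive && el.box.width > 0 && el.box.height > 0)
    .map((el, i) => ({ ref: i + 1, label: el.label, box: el.box }));
}

// Click step: resolve a ref to the center of that element's box.
function clickPoint(marks: Mark[], ref: number): { x: number; y: number } {
  const mark = marks.find((m) => m.ref === ref);
  if (!mark) throw new Error(`No element with ref ${ref}; take a new snapshot`);
  return {
    x: mark.box.x + mark.box.width / 2,
    y: mark.box.y + mark.box.height / 2,
  };
}

const elements: PageElement[] = [
  { label: "Login", interactive: true, box: { x: 10, y: 10, width: 60, height: 30 } },
  { label: "Welcome banner", interactive: false, box: { x: 0, y: 50, width: 800, height: 100 } },
  { label: "Search", interactive: true, box: { x: 100, y: 10, width: 200, height: 30 } },
  { label: "Cart", interactive: true, box: { x: 320, y: 10, width: 40, height: 30 } },
];

const marks = markElements(elements);
console.log(marks.map((m) => `[${m.ref}] ${m.label}`).join("  "));
console.log(clickPoint(marks, 2));
```

Here the non-interactive banner gets no badge, so Search is ref 2 and the click lands at the center of its box, (200, 25). The model only ever chooses a number; the coordinates are computed from the element it refers to.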

<br>

VNC: watch your AI work in real time

Every sandbox comes with a noVNC web viewer. Open the URL in your browser and watch:

  • 🖱️ AI navigating websites and clicking buttons
  • ⌨️ AI writing code in the terminal
  • 🏗️ AI building and testing applications
  • 🐛 AI debugging by inspecting the screen

Perfect for demos, debugging, and building trust in AI agents.

<br>

What's inside each sandbox

| | Included |
|---|---|
| OS | Ubuntu 22.04 |
| Desktop | xfce4 + Xvfb + x11vnc + noVNC |
| Browser | Playwright Chromium (native arm64 + amd64) |
| Languages | Node.js 20, Python 3, build-essential |
| CLI | git, curl, wget, jq, ripgrep, tree, nano, vim |
| DB clients | PostgreSQL, MariaDB, Redis |
| Dev tools | GitHub CLI, yq, httpie |
| Automation | xdotool, scrot, imagemagick, xclip |

<br>

Configuration

| Variable | Default | Description |
|---|---|---|
| MAX_SANDBOXES | 3 | Max concurrent VMs |
| SANDBOX_TYPE | auto | auto / docker / firecracker |
| DOCKER_IMAGE | taw-computer-base | Base image |
| DOCKER_MEMORY_MB | 4096 | RAM per container |
| DOCKER_CPUS | 2 | CPUs per container |
| DESKTOP_RESOLUTION | 1280x720 | Screen resolution |
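
To make the defaults concrete, here is a minimal sketch of how environment variables like these could be parsed and validated. It is not the project's sandbox/config.ts: `loadConfig`, `positiveNumber`, and the `SandboxConfig` shape are hypothetical, and only the variable names and default values come from the table above. The environment is passed in as a plain object so the function is easy to test; in Node you would pass `process.env`.

```typescript
type SandboxType = "auto" | "docker" | "firecracker";

interface SandboxConfig {
  maxSandboxes: number;
  sandboxType: SandboxType;
  dockerImage: string;
  dockerMemoryMb: number;
  dockerCpus: number;
  desktopResolution: { width: number; height: number };
}

type Env = Record<string, string | undefined>;

const SANDBOX_TYPES: readonly SandboxType[] = ["auto", "docker", "firecracker"];

// Read a positive number from the environment, falling back to the default.
function positiveNumber(env: Env, key: string, fallback: number, integerOnly: boolean): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  const valid = Number.isFinite(value) && value > 0 && (!integerOnly || Number.isInteger(value));
  if (!valid) {
    throw new Error(`${key} must be a positive ${integerOnly ? "integer" : "number"}, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): SandboxConfig {
  const rawType = env["SANDBOX_TYPE"] ?? "auto";
  const sandboxType = SANDBOX_TYPES.find((t) => t === rawType);
  if (!sandboxType) {
    throw new Error(`SANDBOX_TYPE must be one of ${SANDBOX_TYPES.join(", ")}, got "${rawType}"`);
  }

  const rawResolution = env["DESKTOP_RESOLUTION"] ?? "1280x720";
  const match = /^(\d+)x(\d+)$/.exec(rawResolution);
  if (!match) {
    throw new Error(`DESKTOP_RESOLUTION must look like 1280x720, got "${rawResolution}"`);
  }

  return {
    maxSandboxes: positiveNumber(env, "MAX_SANDBOXES", 3, true),
    sandboxType,
    dockerImage: env["DOCKER_IMAGE"] ?? "taw-computer-base",
    dockerMemoryMb: positiveNumber(env, "DOCKER_MEMORY_MB", 4096, true),
    dockerCpus: positiveNumber(env, "DOCKER_CPUS", 2, false),
    desktopResolution: { width: Number(match[1]), height: Number(match[2]) },
  };
}

// Override two variables; everything else falls back to the documented defaults.
const config = loadConfig({ MAX_SANDBOXES: "5", DESKTOP_RESOLUTION: "1920x1080" });
console.log(config);
```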

<br>

Requirements

| | Minimum |
|---|---|
| Docker | Docker Desktop or Docker Engine |
| Node.js | 20+ |
| RAM | ~4GB per sandbox |
| Disk | ~5GB for base image |
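
As a rough worked example of the RAM figure, the sketch below multiplies the per-sandbox memory by the number of concurrent sandboxes, using the documented defaults of 3 sandboxes and 4096 MB each. `peakSandboxRamMb` is a hypothetical helper, and the result is an upper bound for the containers only; it ignores whatever the host and Docker themselves need.

```typescript
// Upper bound on container RAM when every sandbox slot is in use.
function peakSandboxRamMb(maxSandboxes: number, memoryPerSandboxMb: number): number {
  return maxSandboxes * memoryPerSandboxMb;
}

// Defaults from the configuration table: 3 sandboxes at 4096 MB each.
const ramMb = peakSandboxRamMb(3, 4096);
console.log(`${ramMb} MB (${ramMb / 1024} GB) for sandboxes at full capacity`);
```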

<br>

Project structure

taw-computer/
├── mcp/
│   ├── index.ts            # MCP server: stdio, 30+ tool handlers
│   └── browser.ts          # Playwright CDP + Set-of-Mark engine
├── sandbox/
│   ├── SandboxManager.ts   # Abstract interface
│   ├── DockerSandbox.ts    # Docker implementation
│   ├── FirecrackerSandbox.ts # Firecracker microVM (optional)
│   ├── NetworkManager.ts   # Network isolation
│   ├── config.ts           # Env-based config
│   └── index.ts            # Auto-detect backend
├── images/
│   └── Dockerfile.taw      # Ubuntu sandbox image
├── .github/
│   ├── workflows/ci.yml    # CI: typecheck + Docker build
│   └── ISSUE_TEMPLATE/     # Bug report + feature request
├── package.json
├── CONTRIBUTING.md
└── LICENSE (MIT)

<br>

Contributing

We'd love your help! See CONTRIBUTING.md.

Ideas for first contributions:

  • 🎨 Record a demo GIF for this README
  • 📝 Write a tutorial ("Build X with taw-computer")
  • 🔧 Add a new MCP tool (audio? clipboard? multi-tab?)
  • 🐳 Build a slimmer Docker image
  • 🧪 Add automated tests
  • 📦 Support Podman / containerd

<br>

Hosted version

Don't want to self-host? shipkit.cc offers managed taw-computer with:

  • Chat UI (just type what you want)
  • Auth & team collaboration
  • One-click app sharing
  • No Docker setup needed

<br>

FAQ

<details> <summary><strong>How is this different from Lovable / Bolt / v0?</strong></summary>

Those are closed-source, hosted-only products that generate code. taw-computer gives AI a real computer: it can run servers, browse the web, install anything, and interact with any desktop app. It's also open source and self-hostable.

</details>

<details> <summary><strong>How is this different from Open Interpreter / OpenHands?</strong></summary>

Open Interpreter runs code on your local machine (risky). OpenHands uses its own LLM orchestration. taw-computer is just the computer: no built-in LLM, no opinions about orchestration. Your existing AI client (Claude Code, Cursor, etc.) is the brain. taw-computer is a pure MCP server.

</details>

<details> <summary><strong>Is it safe? Can the AI break my system?</strong></summary>

Each sandbox is an isolated Docker container with its own filesystem, network, and process space. Nothing inside can touch your host system. Containers have memory/CPU/PID limits. When you're done, destroy the VM.

</details>

<details> <summary><strong>Can I use it with GPT-4 / Gemini / local models?</strong></summary>

Yes. Any AI client that supports MCP can connect. The server doesn't care which LLM is behind the client.

</details>

<details> <summary><strong>Does it work on Mac / Windows / Linux?</strong></summary>

Yes. Anywhere Docker runs, taw-computer runs. The sandbox image supports both arm64 (Apple Silicon) and amd64 (Intel/AMD).

</details>

<br>

Star History

<p align="center"> <a href="https://star-history.com/#the-agents-work/taw-computer&Date"> <img src="https://api.star-history.com/svg?repos=the-agents-work/taw-computer&type=Date" alt="Star History" width="600"> </a> </p>

<p align="center"> If taw-computer is useful to you, <a href="https://github.com/the-agents-work/taw-computer/stargazers"><strong>give it a ⭐</strong></a>. It helps others find it. </p>

<br>

License

MIT. Do whatever you want with it.
