AI Gaming Agent MCP Server

Enables AI agents to remotely control gaming PCs for automated gameplay with tools for screen capture, mouse/keyboard control, workflows, and system operations.

An MCP (Model Context Protocol) server that lets AI agents such as Claude remotely control gaming PCs for automated gameplay.

Master Claude (Orchestrator)
    |
    | MCP Protocol (JSON-RPC over HTTP/SSE)
    |
    +---> PC #1 (MCP Server :8765) --> PyAutoGUI + Optional VLM
    +---> PC #2 (MCP Server :8765) --> PyAutoGUI + Optional VLM
    +---> PC #N (MCP Server :8765) --> PyAutoGUI + Optional VLM

Features

24 MCP Tools: Screen capture, mouse/keyboard control, file operations, system commands, workflow automation
Workflow Automation: Chain multiple actions into single commands with run_workflow and demo_terminal_workflow
Multi-Monitor Support: Target specific monitors for screenshots and actions
Dual Transport Modes: HTTP/SSE for remote control, stdio for local clients
Optional Local VLM: Use Ollama (Qwen2.5-VL, Moondream) for fast local screen analysis
Security-First: Bearer token auth, path restrictions, command blocklist, audit logging
Cross-Platform: Windows, Linux, macOS with auto-detection

Quick Start

Installation

pip install ai-gaming-agent

Or install from source:

git clone https://github.com/developerz-ai/ai-gaming-agent-mcp.git
cd ai-gaming-agent-mcp
pip install -e .

Start the Server

The server supports two transport modes:

HTTP/SSE Transport (Recommended for Remote Control):

gaming-agent serve --transport http --port 8765 --password your-secret-password

Stdio Transport (For Local MCP Clients):

gaming-agent serve --transport stdio

Connect from Claude Desktop

For HTTP/SSE Transport:

Add to your Claude Desktop config (~/.config/claude/claude_desktop_config.json on Linux, ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "gaming-pc": {
      "transport": "sse",
      "url": "http://YOUR-PC-IP:8765/mcp",
      "headers": {
        "Authorization": "Bearer your-secret-password"
      }
    }
  }
}

For Stdio Transport:

{
  "mcpServers": {
    "gaming-pc": {
      "command": "gaming-agent",
      "args": ["serve", "--transport", "stdio"]
    }
  }
}
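
For clients other than Claude Desktop, the same connection details can be used to form a request by hand. The sketch below only builds the HTTP request and sends nothing. The `tools/call` method and its `name`/`arguments` parameters come from the MCP specification; the helper name `build_tool_call` is illustrative, and it is an assumption that the server accepts this exact shape at `/mcp`. A real client would normally use an MCP SDK, which also performs the `initialize` handshake first.

```python
import json
import urllib.request


def build_tool_call(url, password, tool, arguments, request_id=1):
    """Build (but do not send) an HTTP request carrying an MCP tools/call message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Same Bearer token that was passed to --password on the server.
            "Authorization": f"Bearer {password}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


req = build_tool_call(
    "http://YOUR-PC-IP:8765/mcp", "your-secret-password", "get_screen_size", {}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```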

Quick Test

Test the complete automation capability with a single command:

# Start the server
gaming-agent serve --transport http --password test123

# In Claude Desktop (after connecting), ask:
"Use the demo_terminal_workflow tool to open a terminal, type 'echo hello world', and close it"

This will:

✓ Auto-detect your terminal (gnome-terminal, konsole, xterm, Terminal.app, cmd)
✓ Open a new terminal window
✓ Type "echo hello world"
✓ Press Enter to execute
✓ Capture a screenshot for verification
✓ Close the terminal with the appropriate hotkey

Available Tools

Total: 24 MCP Tools

Workflow Tools (New!)

Tool	Description
`run_workflow`	Execute a sequence of tool actions with optional delays
`demo_terminal_workflow`	Complete demo: open terminal, type command, execute, screenshot, close

Screen Tools

Tool	Description
`screenshot`	Capture current screen (returns base64 PNG)
`get_screen_size`	Get screen dimensions
`analyze_screen`	Use local VLM to analyze screen content (requires Ollama)

Mouse Tools

Tool	Description
`click`	Click at coordinates
`double_click`	Double-click at coordinates
`move_to`	Move mouse cursor
`drag_to`	Drag from current position
`scroll`	Scroll mouse wheel
`get_mouse_position`	Get current cursor location

Keyboard Tools

Tool	Description
`type_text`	Type a string of text (supports fast paste mode via clipboard)
`press_key`	Press a single key
`hotkey`	Press key combination

Fast Text Input with Paste

The type_text tool supports a clipboard-based paste mode for significantly faster text input:

{
  "tool": "type_text",
  "args": {
    "text": "long command or text here",
    "use_paste": true
  }
}

Benefits:

10x faster than character-by-character typing for long text
Ideal for: Pasting long commands, scripts, credentials
How it works: Copies text to clipboard, then uses Ctrl+V (Linux/Windows) or Cmd+V (macOS) to paste
Default: use_paste=false (uses character-by-character typing)

When to use:

✓ Large blocks of text
✓ Complex commands with special characters
✓ When speed matters more than real-time character visibility
✗ Games that don't support paste input
✗ When character-by-character input is explicitly required
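
The guidance above can be folded into a small client-side helper that decides when to set `use_paste`. This is only a sketch: `type_text_args`, the 40-character threshold, and the `target_accepts_paste` flag are all hypothetical and should be tuned per game or application.

```python
def type_text_args(text, paste_threshold=40, target_accepts_paste=True):
    """Build arguments for type_text, enabling paste mode for long or multi-line text."""
    wants_paste = len(text) >= paste_threshold or "\n" in text
    # Never paste into targets known to ignore clipboard input (many games do).
    use_paste = target_accepts_paste and wants_paste
    return {"text": text, "use_paste": use_paste}
```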

File Tools

Tool	Description
`read_file`	Read file contents
`write_file`	Write content to file
`list_files`	List directory contents
`upload_file`	Upload file to PC
`download_file`	Download file from PC

System Tools

Tool	Description
`execute_command`	Run shell command
`get_system_info`	Get CPU/RAM/GPU usage
`list_windows`	List open windows
`focus_window`	Bring window to foreground

Workflow Automation

`run_workflow` - Composite Command Execution

Execute multiple tools in sequence with a single command. Perfect for complex automation tasks.

Example: Open terminal and run command

{
  "steps": [
    {
      "tool": "execute_command",
      "args": {"command": "gnome-terminal"},
      "wait_ms": 1500,
      "description": "Open terminal"
    },
    {
      "tool": "type_text",
      "args": {"text": "ls -la"},
      "wait_ms": 200,
      "description": "Type command"
    },
    {
      "tool": "press_key",
      "args": {"key": "enter"},
      "wait_ms": 1000,
      "description": "Execute command"
    },
    {
      "tool": "screenshot",
      "args": {},
      "description": "Capture result"
    }
  ]
}

Step Fields:

tool (required): Name of the tool to execute
args (optional): Arguments to pass to the tool
wait_ms (optional): Milliseconds to wait after this step
description (optional): Human-readable step description
continue_on_error (optional): Continue workflow if this step fails
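
The step fields above can be checked on the client before a workflow is sent. The following is a sketch of such a check, not the server's own validation; `validate_steps` is a hypothetical helper, and the rule that `wait_ms` must be a non-negative integer is an assumption.

```python
ALLOWED_FIELDS = {"tool", "args", "wait_ms", "description", "continue_on_error"}


def validate_steps(steps):
    """Return a list of problems; an empty list means the steps look well-formed."""
    problems = []
    for index, step in enumerate(steps):
        tool = step.get("tool")
        if not isinstance(tool, str) or not tool:
            problems.append(f"step {index}: 'tool' is required")
        unknown = set(step) - ALLOWED_FIELDS
        if unknown:
            problems.append(f"step {index}: unknown fields {sorted(unknown)}")
        wait_ms = step.get("wait_ms", 0)
        if not isinstance(wait_ms, int) or wait_ms < 0:
            problems.append(f"step {index}: 'wait_ms' must be a non-negative integer")
    return problems
```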

Returns:

{
  "success": true,
  "total_steps": 4,
  "completed_steps": 4,
  "failed_step": null,
  "results": [...],
  "total_time_ms": 3523,
  "error": null
}
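
A caller can reduce this return value to a one-line status. A small sketch using only the fields shown above; whether `failed_step` holds an index or a name is not stated here, so it is printed as-is. The `summarize` helper is illustrative.

```python
def summarize(result):
    """Turn a run_workflow result dict into a short human-readable status line."""
    if result.get("success"):
        return (
            f"ok: {result['completed_steps']}/{result['total_steps']} steps "
            f"in {result['total_time_ms']} ms"
        )
    return f"failed at step {result.get('failed_step')}: {result.get('error')}"
```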

`demo_terminal_workflow` - Ready-Made Terminal Demo

A convenience tool that demonstrates the full automation capability in one call.

Usage:

{
  "text": "echo hello world",
  "terminal_wait_ms": 2000,
  "post_type_wait_ms": 500,
  "post_enter_wait_ms": 1000,
  "capture_screenshot": true,
  "close_terminal": true
}

What it does:

Auto-detects platform terminal (gnome-terminal, konsole, xterm, Terminal.app, cmd)
Opens the terminal application
Waits for terminal to fully load
Types the provided command
Presses Enter to execute
Waits for command output
Captures screenshot for verification (optional)
Closes terminal with platform-appropriate hotkey (optional)

Returns:

{
  "success": true,
  "terminal_command": "gnome-terminal",
  "platform": "Linux",
  "text_typed": "echo hello world",
  "screenshot": {"success": true, "image": "...", ...},
  "steps_completed": ["detect_terminal", "open_terminal", "wait_for_terminal",
                      "type_text", "press_enter", "capture_screenshot",
                      "close_terminal"],
  "total_time_ms": 4523,
  "error": null
}

Platform Support:

Linux: gnome-terminal, konsole, xfce4-terminal, mate-terminal, tilix, terminator, xterm
macOS: Terminal.app
Windows: cmd.exe
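
Terminal auto-detection along these lines can be sketched with the standard library. This is an illustration under assumptions, not the package's actual code: the function name `detect_terminal` and the search order (taken from the Linux list above) are hypothetical.

```python
import platform
import shutil

LINUX_TERMINALS = [
    "gnome-terminal", "konsole", "xfce4-terminal", "mate-terminal",
    "tilix", "terminator", "xterm",
]


def detect_terminal(system=None, which=shutil.which):
    """Return a terminal name for the platform, or None if nothing is found."""
    system = system or platform.system()
    if system == "Darwin":
        return "Terminal.app"
    if system == "Windows":
        return "cmd.exe"
    # On Linux, pick the first candidate that is actually on PATH.
    for name in LINUX_TERMINALS:
        if which(name):
            return name
    return None
```

Passing `which` as a parameter keeps the function testable on machines where none of these terminals are installed.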

Configuration

Create ~/.gaming-agent/config.json:

{
  "server": {
    "host": "0.0.0.0",
    "port": 8765,
    "password": "your-secure-password"
  },
  "vlm": {
    "enabled": false,
    "provider": "ollama",
    "model": "qwen2.5vl:3b",
    "endpoint": "http://localhost:11434"
  },
  "security": {
    "allowed_paths": ["/home/user/games", "C:\\Games"],
    "blocked_commands": ["rm -rf", "format", "del /f"],
    "max_command_timeout": 30
  }
}
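
A partial config file is easier to work with if missing keys fall back to defaults. The sketch below shows one way a client or wrapper script could load the file; the `DEFAULTS` values, `merge`, and `load_config` are assumptions for illustration, and the server's real defaults may differ.

```python
import json
from pathlib import Path

# Assumed defaults, mirroring the example file above.
DEFAULTS = {
    "server": {"host": "0.0.0.0", "port": 8765, "password": ""},
    "vlm": {"enabled": False, "provider": "ollama",
            "endpoint": "http://localhost:11434"},
    "security": {"allowed_paths": [], "blocked_commands": [],
                 "max_command_timeout": 30},
}


def merge(defaults, overrides):
    """Recursively overlay overrides on defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path=None):
    """Load config.json if it exists, filling gaps from DEFAULTS."""
    path = Path(path) if path else Path.home() / ".gaming-agent" / "config.json"
    user = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    return merge(DEFAULTS, user)
```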

Enabling VLM (Optional)

To use the analyze_screen tool with local vision models:

Install Ollama (if not already installed):

curl -fsSL https://ollama.com/install.sh | sh

Pull a vision model:

ollama pull qwen2.5vl:3b   # Lightweight, fast
# or
ollama pull moondream      # Alternative

Install VLM dependencies:
```
pip install "ai-gaming-agent[vlm]"
```

Enable in config (~/.gaming-agent/config.json):

{
  "vlm": {
    "enabled": true,
    "provider": "ollama",
    "model": "qwen2.5vl:3b",
    "endpoint": "http://localhost:11434"
  }
}

Use in workflows:

{
  "tool": "analyze_screen",
  "args": {
    "prompt": "What is the current health percentage?"
  }
}
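
To check that the local model responds before enabling it in the agent, Ollama can be queried directly. The sketch below uses Ollama's `/api/generate` endpoint, which accepts base64-encoded images. The helper names are hypothetical, and this is not necessarily how `analyze_screen` talks to Ollama internally.

```python
import base64
import json
import urllib.request


def build_generate_request(endpoint, model, prompt, png_bytes):
    """Build a non-streaming Ollama /api/generate request with one image attached."""
    body = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,
    }
    return urllib.request.Request(
        endpoint.rstrip("/") + "/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_vlm(endpoint, model, prompt, png_bytes, timeout=60):
    """Send the request and return the model's text answer."""
    request = build_generate_request(endpoint, model, prompt, png_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With Ollama running, `ask_vlm(endpoint, model, "What is the current health percentage?", png_bytes)` would return the answer as plain text, where `endpoint` and `model` are the values from the `vlm` config section.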

Deployment Options

Option A: Claude Does All Vision (Simplest)

Claude analyzes all screenshots
Gaming PCs are simple executors
No GPU needed on gaming PCs
Use HTTP/SSE transport with Bearer auth

Option B: Hybrid with Local VLM (Recommended)

Claude for high-level decisions and orchestration
Local VLM (Qwen2.5-VL, Moondream) for fast visual processing
Best balance of speed and intelligence
Reduces API costs and latency

Option C: Full Local (Privacy)

Local orchestrator (e.g., Qwen2.5-72B via Ollama)
No cloud APIs, complete privacy
Requires powerful hardware (GPU recommended)
Use stdio transport for local control

Practical Examples

Example 1: Terminal Automation

# Ask Claude: "Run the demo_terminal_workflow with the command 'uname -a'"
# Result: Opens terminal, runs command, captures output, closes

Example 2: Multi-Step Workflow

# Ask Claude: "Create a workflow that:
# 1. Opens a file browser
# 2. Navigates to Downloads
# 3. Takes a screenshot
# 4. Closes the window"

# Claude will use run_workflow with execute_command, type_text, screenshot, hotkey

Example 3: Game Automation with VLM

# Ask Claude: "Use analyze_screen to check if the game menu is visible,
# then click the 'Start Game' button at coordinates you detect"

# Claude will:
# 1. Call analyze_screen with prompt "Is there a Start Game button? Where?"
# 2. Use VLM response to determine coordinates
# 3. Call click tool with detected coordinates

Example 4: Batch File Operations

# Ask Claude: "Create a workflow that backs up all .save files from
# C:\Games\MyGame to C:\Backups\saves-{date}"

# Claude will use run_workflow with list_files, read_file, write_file

Security

Always use strong, unique passwords (min 16 chars, random)
Limit file access to game directories only in allowed_paths
Use a VPN when accessing the server over the internet; never expose it directly to the public internet
Enable TLS/HTTPS in production (for example, behind a reverse proxy such as nginx)
Regularly rotate passwords (weekly for high-security environments)
Monitor ~/.gaming-agent/audit.log for suspicious activity
Set appropriate max_command_timeout to prevent runaway processes
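
To make the `allowed_paths` and `blocked_commands` settings concrete, the sketch below shows one plausible way such checks could work. It is an assumption about the general technique, not the server's actual implementation, and both function names are hypothetical.

```python
from pathlib import Path


def is_path_allowed(candidate, allowed_paths):
    """True if candidate resolves to an allowed root or something beneath one."""
    resolved = Path(candidate).resolve()  # collapses ".." before comparing
    for root in allowed_paths:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


def is_command_blocked(command, blocked_commands):
    """True if any blocked pattern occurs in the whitespace-normalized command."""
    normalized = " ".join(command.lower().split())
    return any(pattern.lower() in normalized for pattern in blocked_commands)
```

Substring blocklists are best-effort: a pattern such as `format` also matches harmless commands that contain the word `information`, and a determined caller can rephrase a destructive command. Treat the blocklist as a safety net alongside the other measures in this section.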

Development

# Install with uv (recommended)
uv sync --extra dev

# Or with pip
pip install -e ".[dev]"

# Lint
uv run ruff check src tests

# Run unit tests
uv run pytest tests/ --ignore=tests/integration -v

Testing

Unit Tests (CI)

Unit tests run in CI on every push/PR. They test configuration, file operations, and tool interfaces without requiring a display.

# Run unit tests only
uv run pytest tests/ --ignore=tests/integration -v

Integration Tests (Local Only)

Integration tests perform real GUI automation and require:

A real display (X11, Wayland, Windows, macOS)
tesseract-ocr for OCR verification
pyautogui working with your display

These tests CANNOT run in CI because they need a real desktop environment.

# Install integration test dependencies
uv sync --extra integration

# Install tesseract (Linux)
sudo apt install tesseract-ocr

# Install tesseract (macOS)
brew install tesseract

# Run integration tests locally
uv run pytest tests/integration -v

What Integration Tests Do

Test	Description
`test_screenshot_returns_image`	Captures real screen content
`test_ocr_screen_content`	Uses OCR to read text from screen
`test_mouse_move`	Moves mouse cursor to position
`test_mouse_click`	Performs real mouse click
`test_type_text`	Types actual text
`test_terminal_workflow`	Opens terminal, types command, closes
`test_terminal_with_ocr_verification`	Opens terminal, runs command, verifies output with OCR
`test_batch_gui_operations`	Runs 7 GUI operations in sequence

Terminal Workflow Test

The most comprehensive test opens a terminal, types a command, and verifies the output:

# What the test does:
1. Opens system terminal (gnome-terminal, konsole, xterm, etc.)
2. Types: echo "AGENT_TEST_abc12345"
3. Presses Enter
4. Takes screenshot
5. Runs OCR on screenshot
6. Verifies "AGENT_TEST_abc12345" appears in OCR output
7. Closes terminal with Alt+F4
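
Steps 2 and 6 can be sketched as two small helpers. The OCR call itself is left out, so `ocr_text` stands for whatever string the OCR engine returned. The helper names and the normalization rule are assumptions; OCR output often drops underscores or inserts spaces, which is why only letters and digits are compared.

```python
import re
import uuid


def make_marker():
    """Create a unique marker such as AGENT_TEST_abc12345."""
    return f"AGENT_TEST_{uuid.uuid4().hex[:8]}"


def _normalize(text):
    # Keep only lowercase letters and digits so OCR noise does not break matching.
    return re.sub(r"[^a-z0-9]", "", text.lower())


def marker_in_ocr(marker, ocr_text):
    """True if the marker appears in the OCR output, ignoring case and punctuation."""
    return _normalize(marker) in _normalize(ocr_text)
```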

CI Configuration

CI runs on GitHub Actions with Python 3.12 only:

# .github/workflows/ci.yml
- Checkout code
- Install uv
- Install Python 3.12
- Install system deps (xvfb, scrot, python3-tk)
- Run ruff lint
- Run unit tests (integration tests excluded)

Integration tests are auto-skipped in CI via the CI=true environment variable.
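
A skip like this is commonly implemented with a small environment check. The following is a sketch of the pattern rather than the project's exact code; `running_in_ci` is a hypothetical helper.

```python
import os


def running_in_ci(env=None):
    """True when the CI environment variable is set to 'true' (case-insensitive)."""
    env = os.environ if env is None else env
    return env.get("CI", "").lower() == "true"


# In an integration test module this would typically feed a pytest marker:
#   pytestmark = pytest.mark.skipif(running_in_ci(), reason="needs a real display")
```

GitHub Actions sets `CI=true` automatically, so no extra workflow configuration is needed for this check to take effect.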

License

MIT
