system-media-control-mcp
Enables AI agents to monitor system resources and control media, brightness, volume, and other Windows PC functions through natural language commands.
System & Media Control MCP Client/Server (Windows)
A Node.js-based Model Context Protocol (MCP) automation system that enables Artificial Intelligence (Google Gemini, OpenAI, Anthropic Claude, or Local AI models like Ollama/LM Studio) to query and control your Windows PC using natural language.
📖 Table of Contents
- Features
- How It Works
- Prerequisites
- Installation & Setup
- Configuration
- Usage
- Available Tools (30 total)
🌟 Features
💻 System & Resource Monitoring
- get_system_status: CPU load, RAM usage, primary disk (C:) free capacity, and system uptime.
- get_system_info: Retrieve detailed static system hardware specifications (CPU, motherboard, RAM slots, OS, BIOS).
- get_disk_space: Returns storage space metrics (total, used, free) for all active system drives.
- get_top_processes: Top 5 CPU-intensive running processes with PIDs and CPU percentages.
- get_battery_status: Battery charge percentage, charging state, and estimated remaining minutes (laptops).
- get_network_info: Retrieves local IPv4 addresses, active network adapters, and external IP.
- get_dns_servers: Retrieves configured DNS server IP addresses for the active network interfaces.
- get_network_latency: Tests latency (ping response time) to a default or custom target (e.g. google.com).
- get_wifi_networks: Scans and lists nearby Wi-Fi network SSIDs and signal strengths.
- get_wifi_status: Retrieves status details of the currently active Wi-Fi connection (SSID, Signal quality, Transmission rate).
- get_active_window: Retrieves the title, process name, and PID of the focused foreground window.
- get_gpu_info: GPU name, driver version, and VRAM size.
- get_audio_devices: Lists available system output and input audio hardware controllers.
🎛 PC & Media Control
- media_control: Simulates keyboard media keys (Play/Pause, Next Track, Previous Track, Stop).
- get_volume / set_volume / set_mute: Checks system volume level, sets level (0-100), and mutes/unmutes audio.
- get_brightness / set_brightness: Reads or changes monitor brightness (0-100%).
- system_power_control: Locks the screen, puts the PC to sleep, schedules a shutdown/restart (with a 60s warning), or aborts active power schedules.
- close_process: Force-terminates a running process by name or process ID.
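As a rough illustration of the system_power_control entry above, the sketch below maps each action to a standard Windows command line. The names `POWER_COMMANDS` and `buildPowerCommand` are hypothetical, and how the server actually invokes these actions is an assumption; only the action names and the 60-second delay come from this README.

```javascript
// Hypothetical mapping from system_power_control actions to Windows commands.
// The executables and flags are standard Windows utilities; whether the
// server uses exactly these is an assumption made for illustration.
const POWER_COMMANDS = new Map([
  ["lock", ["rundll32.exe", ["user32.dll,LockWorkStation"]]],
  // May hibernate instead of sleeping when hibernation is enabled.
  ["sleep", ["rundll32.exe", ["powrprof.dll,SetSuspendState", "0,1,0"]]],
  ["shutdown", ["shutdown.exe", ["/s", "/t", "60"]]], // 60s warning
  ["restart", ["shutdown.exe", ["/r", "/t", "60"]]], // 60s warning
  ["abort_shutdown", ["shutdown.exe", ["/a"]]],
]);

// Return the executable and argument list for an action, or throw if the
// action is not one of the five documented values.
function buildPowerCommand(action) {
  const entry = POWER_COMMANDS.get(action);
  if (!entry) {
    throw new Error(`Unknown power action: ${action}`);
  }
  const [file, args] = entry;
  return { file, args: [...args] };
}
```

Returning the executable and its arguments separately, instead of one command string, lets the result be passed to `child_process.spawn` without shell quoting.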
📋 Automation & Clipboard Utilities
- get_clipboard: Reads text contents currently on the Windows clipboard.
- set_clipboard: Copies a text string to the system clipboard.
- clear_clipboard: Clears all text contents currently on the Windows clipboard.
- open_url: Launches a website URL in the default or a specified browser (Chrome, Firefox, Edge, Brave).
- send_keystrokes: Sends sequential keyboard commands and inputs to automate apps and pages.
- launch_app: Spawns a desktop application by command name (e.g. `notepad`, `calc`, `explorer`).
- take_screenshot: Captures a PNG screenshot of the primary screen and saves it locally.
- empty_recycle_bin: Empties the Windows Recycle Bin.
- show_desktop: Minimizes all active GUI windows instantly.
⚙️ How It Works
graph TD
User([User Prompt]) --> Client[MCP Client - client.js]
Client -->|1. Prompt + Tools| LLM[AI Provider: Gemini/OpenAI/Claude/Ollama]
LLM -->|2. Tool Call Request| Client
Client -->|3. JSON-RPC over stdin| Server[MCP Server - server.js]
Server -->|4. Runs Native PowerShell/C#| WinAPI[Windows Core Audio / CIM / Clipboard]
WinAPI -->|5. Output| Server
Server -->|6. JSON-RPC over stdout| Client
Client -->|7. Tool Response| LLM
LLM -->|8. Natural Language Answer| Client
Client --> Display([Console Display])
- Background Spawn: The client runs as a Node.js process and spawns the MCP server as a background process (`child_process.spawn`).
- JSON-RPC handshake: During startup, the client and server perform a line-by-line handshake over `stdin` and `stdout` using JSON-RPC 2.0.
- AI routing: The client sends your chosen AI provider the user prompt together with the server's tools list.
- Tool Call Interception: When the AI responds with a Tool Call request, the client intercepts it, translates it to a `tools/call` JSON-RPC message, and forwards it to the server's `stdin`.
- Windows Integration: The server runs native PowerShell scripts, compiling C# interfaces on the fly to bypass the COM binder inside PowerShell, and interacts directly with core APIs and user32 keyboard emulation.
- Result Loop: The tool result is returned over `stdout` to the client, which feeds it back to the AI. The AI then formulates the final user response.
- Interactive loop: When executed without arguments, the client enters a continuous loop where you can chat with the PC Agent.
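The message flow described above can be sketched as two small helpers: one that serializes a `tools/call` request as a single line, and one that reassembles complete lines from `stdout` chunks. This is a minimal sketch, not the project's actual client code; `encodeToolCall` and `createLineDecoder` are hypothetical names, and newline-delimited framing is assumed from the "line-by-line" description.

```javascript
let nextId = 1;

// Build a JSON-RPC 2.0 request for an MCP tool call, serialized as one line.
function encodeToolCall(name, args = {}) {
  const request = {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
  return JSON.stringify(request) + "\n";
}

// Accumulate stdout chunks and return every complete line as a parsed message.
// A chunk may end in the middle of a message, so the trailing partial line is
// kept in the buffer until the rest arrives.
function createLineDecoder() {
  let buffer = "";
  return function decode(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop();
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  };
}
```

With a server process started through `child_process.spawn`, the client would write `encodeToolCall("set_volume", { level: 30 })` to the child's `stdin` and pass each `stdout` data chunk (converted to a string) through the decoder.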
💻 Prerequisites
- Operating System: Windows (required for CIM, user32, and COM audio objects).
- Runtime: Node.js v18 or higher.
📦 Installation & Setup
- Clone this repository:
  git clone https://github.com/noackjona-hash/system-media-control-mcp.git
  cd system-media-control-mcp
- Install npm packages:
  npm install
⚙️ Configuration
Create a file named .env in the root folder (automatically ignored by git) and set your preferred AI provider:
# Choose AI Provider: gemini, openai, anthropic, groq, github, deepseek, mistral, together, openrouter, watsonx, perplexity, nvidia, local, mock, etc.
AI_PROVIDER=gemini
# Google Gemini API
GEMINI_API_KEY=AIzaSy...
GEMINI_MODEL=gemini-2.5-flash
# OpenAI API
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini
# Anthropic Claude API
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
# Groq AI API
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.3-70b-versatile
# GitHub Models API
GITHUB_TOKEN=ghp_...
GITHUB_MODEL=gpt-4o
# Other supported providers (DeepSeek, Mistral, Together, OpenRouter, Watsonx, Perplexity, Cloudflare, Nebius, NVIDIA NIM, Upstage, Moonshot, etc.)
DEEPSEEK_API_KEY=sk-...
MISTRAL_API_KEY=sk-...
PERPLEXITY_API_KEY=pplx-...
# Local AI / Ollama / LM Studio (OpenAI-compatible)
LOCAL_API_BASE=http://localhost:11434/v1
LOCAL_MODEL=llama3.2:1b
(If no API keys are configured, the client defaults to Mock Mode, which runs keyword analysis locally to simulate tool execution. Perfect for offline testing!)
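To make the fallback behavior concrete, here is a minimal sketch of how a client could pick a provider from these variables. The function `resolveProvider` and the `KEY_VARS` table are hypothetical, only a few providers are listed, and falling back to mock mode when the selected provider's key is missing is an assumption based on the note above.

```javascript
// Hypothetical table: which environment variable holds the key for each provider.
// Only some of the providers named in the .env example are included here.
const KEY_VARS = new Map([
  ["gemini", "GEMINI_API_KEY"],
  ["openai", "OPENAI_API_KEY"],
  ["anthropic", "ANTHROPIC_API_KEY"],
  ["groq", "GROQ_API_KEY"],
  ["github", "GITHUB_TOKEN"],
]);

// Decide which provider to use from an environment-like object.
// Assumption: a missing key or unknown provider name falls back to mock mode.
function resolveProvider(env) {
  const requested = (env.AI_PROVIDER || "").trim().toLowerCase();

  // Local OpenAI-compatible servers need a base URL instead of an API key.
  if (requested === "local" && env.LOCAL_API_BASE) {
    return {
      provider: "local",
      baseUrl: env.LOCAL_API_BASE,
      model: env.LOCAL_MODEL,
    };
  }

  const keyVar = KEY_VARS.get(requested);
  if (keyVar && env[keyVar]) {
    return { provider: requested, apiKey: env[keyVar] };
  }

  return { provider: "mock" };
}
```

In a real client this would be called as `resolveProvider(process.env)` after the `.env` file has been loaded.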
🚀 Usage
Continuous Interactive Chat Loop (Recommended)
Run the client with no arguments to start a persistent shell session:
npm start
You can chat continuously:
Ask PC Agent > How busy is my PC?
Ask PC Agent > Open github.com
Ask PC Agent > Mute the volume
Ask PC Agent > exit
CLI Command Mode (Single-Shot)
Pass your instruction directly as a command-line argument:
npm start "Show my desktop"
npm start "Copy 'Antigravity' to my clipboard"
npm start "Check the battery status"
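The two modes differ only in whether an instruction is present on the command line. The sketch below shows one way a client could make that decision; `selectMode` and `isExitCommand` are hypothetical helpers, and treating `exit` case-insensitively is an assumption.

```javascript
// Decide between single-shot and interactive mode from process.argv.
// argv[0] is the node binary and argv[1] the script path, so user input
// starts at index 2.
function selectMode(argv) {
  const prompt = argv.slice(2).join(" ").trim();
  return prompt
    ? { mode: "single-shot", prompt }
    : { mode: "interactive" };
}

// In interactive mode, typing "exit" ends the session.
function isExitCommand(line) {
  return line.trim().toLowerCase() === "exit";
}
```

Called as `selectMode(process.argv)`, this returns the instruction to run once, or signals that the chat loop should start.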
🛠 Available Tools
| Tool Name | Parameters | Description |
|---|---|---|
| `get_system_status` | None | Returns CPU load, RAM usage (GB/%), C: disk space, and uptime. |
| `get_top_processes` | None | Returns the top 5 CPU consuming active processes. |
| `get_battery_status` | None | Returns charge level (%), charging status, and remaining time. |
| `get_brightness` | None | Gets current screen brightness percentage. |
| `set_brightness` | `level` (0-100) | Sets screen brightness to the specified level. |
| `get_volume` | None | Gets master volume (0-100) and mute status. |
| `set_volume` | `level` (0-100) | Sets master volume level. |
| `set_mute` | `mute` (boolean) | Mutes or unmutes system audio. |
| `media_control` | `action` (string) | Simulates keypress: play_pause, next_track, prev_track, stop. |
| `system_power_control` | `action` (string) | Performs power action: lock, sleep, shutdown, restart, abort_shutdown. |
| `get_clipboard` | None | Returns text content on the Windows clipboard. |
| `set_clipboard` | `text` (string) | Copies the text to the Windows clipboard. |
| `open_url` | `url` (string), `browser` (string) | Opens the URL in default or specified browser (chrome, firefox, edge, brave). |
| `launch_app` | `app` (string) | Launches the application (e.g. notepad). |
| `get_network_info` | None | Returns local IPs, network adapter name, external IP, and SSID. |
| `show_desktop` | None | Minimizes all active windows to show the desktop. |
| `get_gpu_info` | None | Returns GPU name, driver version, memory, and status. |
| `get_audio_devices` | None | Lists all active audio output and input devices. |
| `close_process` | `target` (string) | Force closes process by name or PID. |
| `empty_recycle_bin` | None | Empties the Windows Recycle Bin. |
| `get_disk_space` | None | Gets storage metrics (total, used, free) for all active drives. |
| `take_screenshot` | `filename` (string) | Captures primary display screenshot and saves it locally. |
| `get_wifi_networks` | None | Scans and lists nearby Wi-Fi network SSIDs and signal strengths. |
| `get_system_info` | None | Retrieves static system specs (Motherboard, CPU, RAM modules info). |
| `get_wifi_status` | None | Retrieves details of current Wi-Fi SSID, rate, and signal quality. |
| `get_network_latency` | `target` (string) | Tests latency (ping response time) to custom or default addresses. |
| `clear_clipboard` | None | Clears all text contents currently on the Windows clipboard. |
| `get_dns_servers` | None | Retrieves configured DNS server IP addresses for the network interface. |
| `get_active_window` | None | Retrieves the title, process name, and PID of the focused foreground window. |
| `send_keystrokes` | `keys` (array of strings) | Sends keystrokes to active window (type, tabs, enter, delays). |
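For reference, an MCP server advertises each tool with a `name`, a `description`, and a JSON Schema `inputSchema`. The sketch below shows what the set_volume entry might look like, plus a hypothetical `validateLevel` check for the 0-100 range. Declaring `level` as an integer and rejecting out-of-range values (instead of clamping them) are assumptions, not confirmed server behavior.

```javascript
// Possible tool definition as it could appear in a tools/list response.
// The schema details (integer type, required field) are assumptions.
const setVolumeTool = {
  name: "set_volume",
  description: "Sets master volume level.",
  inputSchema: {
    type: "object",
    properties: {
      level: { type: "integer", minimum: 0, maximum: 100 },
    },
    required: ["level"],
  },
};

// Hypothetical server-side guard applied before touching the audio API.
function validateLevel(args) {
  const level = args ? args.level : undefined;
  if (!Number.isInteger(level) || level < 0 || level > 100) {
    throw new RangeError("level must be an integer between 0 and 100");
  }
  return level;
}
```

The same range check would apply to set_brightness, which takes the same `level` parameter.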