Model Context Protocol (MCP) MSPaint App Automation
A simple Model Context Protocol (MCP) server and client that solve math problems and show the solution in the MSPaint application
shettysaish20
This project demonstrates how to automate interactions with a legacy Windows application (MSPaint) using the Model Context Protocol (MCP). It leverages pywinauto to control the Paint application and fastmcp to define tools that can be called by an AI agent. The AI agent, powered by Google's Gemini model, uses these tools to perform tasks such as drawing rectangles and adding text to the Paint canvas.
Table of Contents
- Introduction
- Model Context Protocol (MCP)
- Project Structure
- Requirements
- Setup
- Usage
- How It Works
- Key Components
- Troubleshooting
- Contributing
- License
Introduction
This project showcases the automation of MSPaint using an AI agent. The agent can open Paint, draw rectangles, and add text, all driven by natural language instructions. This is achieved through the Model Context Protocol (MCP), which allows the AI agent to call specific functions (tools) defined in the Python code.
Model Context Protocol (MCP)
The Model Context Protocol (MCP) is an open protocol that gives AI models a standardized way to interact with external tools and resources: calling functions, retrieving data, and performing actions outside the model. In this project, MCP is used to expose Paint automation functions as tools that the AI agent can use.
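To make "exposing a function as a tool" concrete, here is a minimal standard-library sketch of the underlying idea: a registry that maps tool names to callables, plus a text description that could be shown to the model. This is not the fastmcp API. The names `tool`, `TOOLS`, and `describe_tools` are hypothetical, and the `draw_rectangle` body is a placeholder that does not touch Paint.

```python
import inspect
from typing import Callable, Dict

# Hypothetical in-process registry; a real MCP server publishes its tools
# over the protocol instead of keeping them in a module-level dict.
TOOLS: Dict[str, Callable] = {}


def tool(func: Callable) -> Callable:
    """Register a function under its own name so it can be looked up later."""
    TOOLS[func.__name__] = func
    return func


@tool
def draw_rectangle(x1: int, y1: int, x2: int, y2: int) -> str:
    """Draw a rectangle from (x1, y1) to (x2, y2)."""
    # Placeholder: the real tool would drive the Paint window here.
    return f"rectangle ({x1},{y1})-({x2},{y2})"


def describe_tools() -> str:
    """Build one line per registered tool: name, signature, and docstring."""
    lines = []
    for name, func in sorted(TOOLS.items()):
        signature = inspect.signature(func)
        doc = inspect.getdoc(func) or ""
        lines.append(f"{name}{signature}: {doc}")
    return "\n".join(lines)
```

Calling `describe_tools()` yields text that could be pasted into a system prompt, which is roughly the role the tool descriptions play for the agent in this project.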
Project Structure
├── MSPaint-MCP-Server/
│ ├── mcp_server.py # Defines the MCP server with tools for Paint automation
│ ├── mcp_client.py # Defines the MCP client that interacts with the server and AI model
│ ├── requirements.txt # Lists the project dependencies
│ └── .env # Stores the Gemini API key
├── README.md # This file
Requirements
- Python 3.11+
- Conda (recommended for environment management)
- Google Gemini API key
- pywin32
- pywinauto
- fastmcp
- python-dotenv
- google-genai
Setup
- Create a Conda environment:
  conda create -n eagenv python=3.11
  conda activate eagenv
- Install dependencies:
  pip install -r requirements.txt
- Set up the Gemini API key:
  - Create a .env file in the MSPaint-MCP-Server/ directory (next to mcp_server.py, as shown under Project Structure).
  - Add your Gemini API key to the .env file:
    GEMINI_API_KEY=YOUR_API_KEY
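As a rough sketch of how the client side might read this key, the helper below loads `.env` with python-dotenv when it is installed and fails early with a clear message if the key is missing. The function name `get_gemini_api_key` and its `env` parameter are assumptions made for illustration, not names taken from the project's code.

```python
import os
from typing import Mapping, Optional

try:
    # python-dotenv is listed in the requirements; degrade gracefully if absent.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def get_gemini_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GEMINI_API_KEY, or raise a clear error if it is missing.

    `env` is a hypothetical hook that lets callers pass a mapping directly
    (useful in tests); by default the process environment is used.
    """
    if env is None:
        if load_dotenv is not None:
            load_dotenv()  # copies variables from a .env file, if found, into os.environ
        env = os.environ
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to the .env file.")
    return key
```

Failing at startup keeps a missing key from surfacing later as a less obvious API error.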
Usage
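- Run the MCP client:
  python mcp_paint_app/mcp_client.py
This starts the MCP client, which connects to the MCP server, initializes the AI agent, and begins the automation process. If your checkout uses the MSPaint-MCP-Server/ directory shown under Project Structure, adjust the path accordingly.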
How It Works
- MCP Server (mcp_server.py):
  - Defines the tools for interacting with MSPaint (e.g., open_paint, draw_rectangle, add_text_in_paint).
  - Uses pywinauto to control the MSPaint application.
  - Exposes these tools via the fastmcp library.
- MCP Client (mcp_client.py):
  - Connects to the MCP server.
  - Uses the Google Gemini model to generate instructions.
  - Parses the model's output to determine which tool to call.
  - Calls the appropriate tool on the MCP server.
  - Handles the response from the tool and feeds it back to the model.
- AI Agent (Google Gemini):
  - Receives a query (e.g., "Return the sum of first 20 Fibonacci numbers.").
  - Uses the available tools (defined in the system prompt) to solve the problem.
  - Generates function calls (e.g., FUNCTION_CALL: fibonacci_numbers|20) to use the tools.
  - Provides a final answer (e.g., FINAL_ANSWER: 6765) and uses Paint to display the result.
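The parsing step described above can be sketched as follows. This is an illustration that assumes only the two line formats shown in the examples (`FUNCTION_CALL: name|arg1|arg2...` and `FINAL_ANSWER: value`). The function name `parse_model_line` and the returned tuple shape are hypothetical and may differ from what mcp_client.py actually does.

```python
from typing import List, Tuple


def parse_model_line(line: str) -> Tuple[str, str, List[str]]:
    """Classify one line of model output.

    Returns (kind, value, args), where kind is "call", "final", or "other".
    For "call", value is the tool name and args are the raw string arguments.
    """
    text = line.strip()
    if text.startswith("FUNCTION_CALL:"):
        payload = text[len("FUNCTION_CALL:"):].strip()
        # Fields are pipe-separated: the tool name first, then its arguments.
        name, *args = [part.strip() for part in payload.split("|")]
        return "call", name, args
    if text.startswith("FINAL_ANSWER:"):
        answer = text[len("FINAL_ANSWER:"):].strip()
        return "final", answer, []
    # Anything else is passed through untouched for the caller to handle.
    return "other", text, []
```

Arguments arrive as strings, so a client built this way has to convert them to each tool's parameter types (for example, `int("20")`) before calling the tool.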
Key Components
- mcp_server.py: Contains the core logic for automating MSPaint. The open_paint, draw_rectangle, and add_text_in_paint functions are the key tools used by the AI agent.
- mcp_client.py: Manages the interaction between the AI agent and the MCP server. It sets up the system prompt, calls the tools, and handles the responses.
- requirements.txt: Lists all the necessary Python packages for the project.
- .env: Stores the Google Gemini API key.
Troubleshooting
- Permission Issues: If you encounter permission issues, try running the scripts as an administrator.
- Coordinate Issues: The coordinates used for clicking in MSPaint may need to be adjusted based on your screen resolution and window size. Use the debugging print statements in the code to identify the correct coordinates.
- Tool Selection Issues: If the AI agent is not selecting the correct tools, review the system prompt and ensure that the tool descriptions are accurate.
- API Key Issues: Ensure that your Gemini API key is correctly set in the .env file.
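For the coordinate issue above, one possible starting point is to scale click positions proportionally from the screen size they were recorded on to the current one. The helper below is a sketch under that assumption. `scale_point` is a hypothetical name, and Paint's toolbar and canvas do not necessarily scale linearly with resolution, so the results may still need manual adjustment.

```python
from typing import Tuple

Point = Tuple[int, int]
Size = Tuple[int, int]


def scale_point(point: Point, reference_size: Size, actual_size: Size) -> Point:
    """Map a point recorded at reference_size onto a screen of actual_size.

    Assumes simple proportional scaling along each axis.
    """
    ref_w, ref_h = reference_size
    act_w, act_h = actual_size
    if ref_w <= 0 or ref_h <= 0:
        raise ValueError("reference_size must have positive width and height")
    x, y = point
    return round(x * act_w / ref_w), round(y * act_h / ref_h)


# Example: a click recorded at the center of a 1920x1080 screen,
# replayed on a 2560x1440 screen.
print(scale_point((960, 540), (1920, 1080), (2560, 1440)))  # prints (1280, 720)
```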
Contributing
Contributions are welcome! Please submit a pull request with your changes.
License