ByteBot MCP Server
Enables autonomous task execution and direct desktop computer control through ByteBot's dual-API architecture, supporting intelligent hybrid workflows with mouse/keyboard operations, screen capture, file I/O, and automatic intervention handling.
Production-grade Model Context Protocol (MCP) server for ByteBot's dual-API architecture, providing intelligent hybrid workflow orchestration for autonomous task execution and desktop computer control.
Overview
This MCP server integrates ByteBot's Agent API (task management) and Desktop API (computer control) into a unified interface for AI assistants like Claude. It enables:
- Autonomous Task Execution: Create and manage tasks for ByteBot to execute independently
- Direct Computer Control: Mouse, keyboard, screen capture, and file operations
- Hybrid Workflows: Intelligent orchestration with automatic monitoring and intervention handling
- Real-time Updates: Optional WebSocket support for live task status notifications
Features
Agent API Tools (Task Management)
- bytebot_create_task: Create new tasks with priority levels
- bytebot_list_tasks: List and filter tasks by status/priority
- bytebot_get_task: Get detailed task information with message history
- bytebot_get_in_progress_task: Check the currently running task
- bytebot_update_task: Update task status or priority
- bytebot_delete_task: Delete tasks
Desktop API Tools (Computer Control)
Mouse Operations:
- bytebot_move_mouse: Move cursor to coordinates
- bytebot_click: Click with left/right/middle button
- bytebot_drag: Drag from one position to another
- bytebot_scroll: Scroll in any direction
Keyboard Operations:
- bytebot_type_text: Type text strings
- bytebot_paste_text: Paste text (for special characters)
- bytebot_press_keys: Keyboard shortcuts (Ctrl+C, Alt+Tab, etc.)
Screen Operations:
- bytebot_screenshot: Capture screen as base64 PNG
- bytebot_cursor_position: Get current cursor position
File I/O:
- bytebot_read_file: Read file content (base64)
- bytebot_write_file: Write file content (base64)
System:
- bytebot_switch_application: Switch to an application
- bytebot_wait: Wait for a specified duration
Hybrid Orchestration Tools (Priority 1)
- bytebot_create_and_monitor_task: Create a task and wait for completion
- bytebot_monitor_task: Monitor an existing task until it reaches a terminal state
- bytebot_intervene_in_task: Provide help when a task needs intervention
- bytebot_execute_workflow: Multi-step workflow with automatic error recovery
Prerequisites
- Node.js: 20.x or higher
- ByteBot Instance: Running and accessible at configured endpoints
  - Agent API (default: http://localhost:9991)
  - Desktop API (default: http://localhost:9990)
Installation
# Clone or download this repository
cd bytebot-mcp-server
# Install dependencies
npm install
# Build TypeScript code
npm run build
Configuration
1. Create Environment File
Copy the example environment file and customize:
cp .env.example .env
2. Edit .env File
# ByteBot Agent API (Task Management)
BYTEBOT_AGENT_URL=http://localhost:9991
# ByteBot Desktop API (Computer Control)
BYTEBOT_DESKTOP_URL=http://localhost:9990
# WebSocket Configuration (Optional)
BYTEBOT_WS_URL=ws://localhost:9991
ENABLE_WEBSOCKET=false
# Server Configuration
MCP_SERVER_NAME=bytebot-mcp
# Timeouts (milliseconds)
REQUEST_TIMEOUT=30000
DESKTOP_ACTION_TIMEOUT=10000
# Retry Configuration
MAX_RETRIES=3
RETRY_DELAY=1000
# Monitoring Configuration
TASK_POLL_INTERVAL=2000
TASK_MONITOR_TIMEOUT=300000
# File Configuration
MAX_FILE_SIZE=10485760
# Logging
LOG_LEVEL=info
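As a rough illustration of how these settings might be consumed, the sketch below reads a few of them from an environment-like object and falls back to the defaults listed above. The names `ByteBotConfig`, `intFromEnv`, and `loadConfig` are hypothetical and are not taken from the server's source; the actual loader may validate differently.

```typescript
// Hypothetical config loader; a sketch, not the server's actual implementation.
interface ByteBotConfig {
  agentUrl: string;
  desktopUrl: string;
  requestTimeoutMs: number;
  maxRetries: number;
  taskPollIntervalMs: number;
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer setting, falling back to the documented default.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): ByteBotConfig {
  return {
    agentUrl: env["BYTEBOT_AGENT_URL"] ?? "http://localhost:9991",
    desktopUrl: env["BYTEBOT_DESKTOP_URL"] ?? "http://localhost:9990",
    requestTimeoutMs: intFromEnv(env, "REQUEST_TIMEOUT", 30000),
    maxRetries: intFromEnv(env, "MAX_RETRIES", 3),
    taskPollIntervalMs: intFromEnv(env, "TASK_POLL_INTERVAL", 2000),
  };
}

// In a real process this would be loadConfig(process.env).
const config = loadConfig({ REQUEST_TIMEOUT: "60000" });
console.log(config.agentUrl, config.requestTimeoutMs);
```

Failing fast on a malformed number is a design choice made for this sketch; silently using the default would hide typos in .env.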
3. Remote ByteBot Configuration
If ByteBot is running on a remote server:
BYTEBOT_AGENT_URL=http://your-server.com:9991
BYTEBOT_DESKTOP_URL=http://your-server.com:9990
BYTEBOT_WS_URL=ws://your-server.com:9991
MCP Client Setup
Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"bytebot": {
"command": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
}
}
Zed Editor
Add to your Zed settings:
{
"context_servers": {
"bytebot": {
"command": {
"path": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"]
},
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
}
}
Continue.dev
Add to .continue/config.json:
{
"mcpServers": [
{
"name": "bytebot",
"command": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
]
}
Usage Examples
Example 1: Basic Task Creation
User: Create a task for ByteBot to search Wikipedia for "quantum computing"
Claude uses: bytebot_create_task
{
"description": "Go to wikipedia.org and search for 'quantum computing'",
"priority": "MEDIUM"
}
Response:
{
"id": "task-123",
"status": "PENDING",
"priority": "MEDIUM",
"createdAt": "2024-01-15T10:30:00Z"
}
Example 2: Hybrid Workflow (Create → Monitor → Complete)
User: Create a task to log into example.com and wait for it to complete
Claude uses: bytebot_create_and_monitor_task
{
"description": "Navigate to example.com and log in with credentials from keychain",
"timeout": 60000,
"pollInterval": 2000
}
Response:
{
"taskId": "task-456",
"finalStatus": "COMPLETED",
"completedAt": "2024-01-15T10:31:45Z",
"messagesCount": 12,
"task": { ... full task details ... }
}
Example 3: Task Needs Intervention
User: Create a task to fill out a complex form
Claude uses: bytebot_create_and_monitor_task
{
"description": "Fill out the registration form at example.com/register"
}
Response (after monitoring):
{
"taskId": "task-789",
"finalStatus": "NEEDS_HELP",
"task": {
"id": "task-789",
"status": "NEEDS_HELP",
"messages": [
{
"role": "assistant",
"content": "I need the user's phone number to complete this form"
}
]
}
}
User: My phone number is 555-1234
Claude uses: bytebot_intervene_in_task
{
"taskId": "task-789",
"message": "User's phone number is 555-1234",
"action": "resume",
"continueMonitoring": true
}
Response:
{
"taskId": "task-789",
"status": "COMPLETED",
"intervention": "applied"
}
Example 4: Interactive Desktop Control
User: Take a screenshot and click at position (500, 300)
Claude uses: bytebot_screenshot
Response: { "screenshot": "iVBORw0KG..." }
Claude uses: bytebot_click
{
"x": 500,
"y": 300,
"button": "left"
}
Response: ✓ bytebot_click completed successfully
Example 5: Multi-Step Workflow
User: Execute a workflow to open Firefox, navigate to GitHub, and take a screenshot
Claude uses: bytebot_execute_workflow
{
"steps": [
{
"name": "Open Firefox",
"description": "Switch to Firefox browser application"
},
{
"name": "Navigate to GitHub",
"description": "Navigate to github.com in the browser"
},
{
"name": "Take Screenshot",
"description": "Capture a screenshot of the GitHub homepage"
}
],
"priority": "HIGH"
}
Response:
{
"steps": [
{ "name": "Open Firefox", "taskId": "task-001", "status": "COMPLETED" },
{ "name": "Navigate to GitHub", "taskId": "task-002", "status": "COMPLETED" },
{ "name": "Take Screenshot", "taskId": "task-003", "status": "COMPLETED" }
],
"overallStatus": "completed",
"totalInterventions": 0
}
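A caller will usually want to know which step, if any, did not finish. The sketch below inspects a response shaped like the one above; the `StepResult` type and the `firstUnfinishedStep` helper are hypothetical names, and the only field values assumed are those shown in the example.

```typescript
// Shape taken from the example response; the helper itself is illustrative.
interface StepResult {
  name: string;
  taskId: string;
  status: string;
}

// Return the first step whose status is not COMPLETED, or undefined if all finished.
function firstUnfinishedStep(steps: StepResult[]): StepResult | undefined {
  return steps.find((step) => step.status !== "COMPLETED");
}

const steps: StepResult[] = [
  { name: "Open Firefox", taskId: "task-001", status: "COMPLETED" },
  { name: "Navigate to GitHub", taskId: "task-002", status: "NEEDS_HELP" },
];

const blocked = firstUnfinishedStep(steps);
console.log(blocked ? `Blocked at: ${blocked.name}` : "All steps completed");
```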
Example 6: File Operations
User: Read the contents of /home/user/data.txt
Claude uses: bytebot_read_file
{
"path": "/home/user/data.txt"
}
Response: { "content": "SGVsbG8gV29ybGQh..." } // Base64 encoded
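Because file content travels as base64, a client has to decode it after reading and encode it before writing. The following is a minimal sketch using Node's `Buffer`; the helper names are hypothetical, and it assumes the file holds UTF-8 text (binary files should stay as a `Buffer`).

```typescript
import { Buffer } from "node:buffer";

// Decode base64 file content into a UTF-8 string.
function decodeFileContent(base64: string): string {
  return Buffer.from(base64, "base64").toString("utf8");
}

// Encode a UTF-8 string as base64 before sending it as file content.
function encodeFileContent(text: string): string {
  return Buffer.from(text, "utf8").toString("base64");
}

const encoded = encodeFileContent("hello");
console.log(encoded); // aGVsbG8=
console.log(decodeFileContent(encoded)); // hello
```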
Troubleshooting
Error: "Cannot connect to ByteBot server"
Cause: ByteBot is not running or endpoint URL is incorrect
Solution:
- Verify ByteBot is running: curl http://localhost:9991/tasks
- Check that the .env file has the correct URLs
- Ensure no firewall is blocking connections
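The same reachability check can be scripted. This sketch requests the /tasks path used by the curl command and reports what happened; `tasksUrl`, `checkAgentApi`, and the 5-second timeout are assumptions made for illustration, and the fetch function is injectable so the logic can be exercised without a running ByteBot instance.

```typescript
// Hypothetical helpers; only the /tasks path comes from the curl command.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; status: number }>;

// Join the base URL and the tasks path, tolerating a trailing slash.
function tasksUrl(baseUrl: string): string {
  const base = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
  return new URL("tasks", base).toString();
}

// Resolve to a short human-readable description of the Agent API's state.
async function checkAgentApi(
  baseUrl: string,
  fetchImpl: FetchLike = fetch,
  timeoutMs = 5000,
): Promise<string> {
  try {
    const response = await fetchImpl(tasksUrl(baseUrl), {
      signal: AbortSignal.timeout(timeoutMs),
    });
    return response.ok ? "reachable" : `responded with HTTP ${response.status}`;
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return `unreachable: ${reason}`;
  }
}

// Usage (not executed here):
//   checkAgentApi("http://localhost:9991").then(console.log);
```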
Error: "Request to ByteBot timed out"
Cause: Task took longer than configured timeout
Solution:
- Increase REQUEST_TIMEOUT in .env for Agent API calls
- Increase DESKTOP_ACTION_TIMEOUT for Desktop API calls
- Use bytebot_create_and_monitor_task with a custom timeout: { "description": "Long running task", "timeout": 600000 }
Error: "Task with ID xyz not found"
Cause: Task was deleted or ID is incorrect
Solution:
- List all tasks: bytebot_list_tasks
- Verify the task ID from the response
- Check if task was accidentally deleted
Warning: "Screenshot size is 8.5MB"
Cause: Screenshot is very large (high resolution display)
Solution:
- This is only a warning; the screenshot is still captured
- Consider reducing screen resolution if you capture screenshots frequently
- Screenshots larger than 5MB trigger this warning
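The decoded size of a screenshot can be estimated from the base64 string without decoding it, since every four base64 characters carry three bytes. The sketch below applies that arithmetic; it assumes the 5MB threshold means 5 × 1024 × 1024 bytes (mirroring how MAX_FILE_SIZE expresses 10MB), and the helper names are hypothetical.

```typescript
// Assumption: "5MB" is 5 * 1024 * 1024 bytes; the README does not say which unit is used.
const SCREENSHOT_WARN_BYTES = 5 * 1024 * 1024;

// Number of bytes a base64 string decodes to (input must contain no whitespace).
function base64DecodedBytes(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return Math.floor((base64.length * 3) / 4) - padding;
}

function isLargeScreenshot(base64: string): boolean {
  return base64DecodedBytes(base64) > SCREENSHOT_WARN_BYTES;
}

console.log(base64DecodedBytes("aGVsbG8=")); // 5
console.log(isLargeScreenshot("aGVsbG8=")); // false
```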
Error: "Task must be in NEEDS_HELP state"
Cause: Attempting to intervene in task that doesn't need help
Solution:
- Check task status first: bytebot_get_task
- Only use bytebot_intervene_in_task when status is NEEDS_HELP
- Use bytebot_update_task to manually change status if needed
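A client can avoid this error by checking the status before building the intervention request. The sketch below does so, producing arguments in the shape shown in Example 3; `TaskSummary`, `InterventionArgs`, and `buildIntervention` are hypothetical names, and "resume" is the only action value the README shows, so no others are assumed.

```typescript
interface TaskSummary {
  id: string;
  status: string;
}

// Argument shape copied from Example 3.
interface InterventionArgs {
  taskId: string;
  message: string;
  action: "resume";
  continueMonitoring: boolean;
}

// Refuse to build an intervention unless the task is actually waiting for help.
function buildIntervention(task: TaskSummary, message: string): InterventionArgs {
  if (task.status !== "NEEDS_HELP") {
    throw new Error(`Task ${task.id} is ${task.status}; intervention requires NEEDS_HELP`);
  }
  return { taskId: task.id, message, action: "resume", continueMonitoring: true };
}

console.log(buildIntervention({ id: "task-789", status: "NEEDS_HELP" }, "User's phone number is 555-1234"));
```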
WebSocket Connection Failed
Cause: WebSocket URL incorrect or ByteBot doesn't support WebSocket
Solution:
- Set ENABLE_WEBSOCKET=false in .env to disable WebSocket
- Server will automatically fall back to HTTP polling
- WebSocket is optional - all features work without it
Error: "File size exceeds maximum allowed size"
Cause: Trying to upload/read file larger than 10MB
Solution:
- Increase MAX_FILE_SIZE in .env (in bytes)
- Split large files into smaller chunks
- Compress files before uploading
API Reference
Task Priority Levels
- LOW: Background tasks, non-urgent
- MEDIUM: Default priority (recommended)
- HIGH: Important tasks, process soon
- URGENT: Critical tasks, process immediately
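When working with a task list on the client side, the four levels can be modelled as a union type with an explicit rank. This is only a sketch: the `Priority` type, the rank table, and `byPriorityDesc` are hypothetical, and nothing here describes how ByteBot itself schedules tasks.

```typescript
type Priority = "LOW" | "MEDIUM" | "HIGH" | "URGENT";

// Higher number means more urgent; the ordering follows the list above.
const PRIORITY_RANK: Record<Priority, number> = {
  LOW: 0,
  MEDIUM: 1,
  HIGH: 2,
  URGENT: 3,
};

// Return a new array sorted from most to least urgent, leaving the input untouched.
function byPriorityDesc<T extends { priority: Priority }>(tasks: T[]): T[] {
  return [...tasks].sort((a, b) => PRIORITY_RANK[b.priority] - PRIORITY_RANK[a.priority]);
}

const sorted = byPriorityDesc([
  { id: "task-1", priority: "LOW" as Priority },
  { id: "task-2", priority: "URGENT" as Priority },
  { id: "task-3", priority: "MEDIUM" as Priority },
]);
console.log(sorted.map((task) => task.id).join(", "));
```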
Task Lifecycle States
- PENDING: Task created, waiting to start
- IN_PROGRESS: Task currently executing
- NEEDS_HELP: Task blocked, requires intervention
- NEEDS_REVIEW: Task complete but needs verification
- COMPLETED: Task finished successfully
- CANCELLED: Task cancelled by user
- FAILED: Task failed with error
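These states suggest how a polling monitor might decide when to stop. The sketch below assumes monitoring ends as soon as the task leaves PENDING or IN_PROGRESS (Example 3 shows monitoring returning on NEEDS_HELP; treating NEEDS_REVIEW the same way is an assumption). The function names are hypothetical, `getStatus` stands in for a call to bytebot_get_task, and the default interval and timeout mirror TASK_POLL_INTERVAL and TASK_MONITOR_TIMEOUT.

```typescript
type TaskStatus =
  | "PENDING"
  | "IN_PROGRESS"
  | "NEEDS_HELP"
  | "NEEDS_REVIEW"
  | "COMPLETED"
  | "CANCELLED"
  | "FAILED";

// Assumption: any state other than PENDING or IN_PROGRESS ends monitoring.
function shouldStopMonitoring(status: TaskStatus): boolean {
  return status !== "PENDING" && status !== "IN_PROGRESS";
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Poll until the task stops running or the timeout elapses.
async function monitorTask(
  getStatus: () => Promise<TaskStatus>,
  pollIntervalMs = 2000,
  timeoutMs = 300000,
): Promise<TaskStatus> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getStatus();
    if (shouldStopMonitoring(status)) return status;
    if (Date.now() >= deadline) {
      throw new Error(`Task still ${status} after ${timeoutMs} ms`);
    }
    await sleep(pollIntervalMs);
  }
}
```

Returning NEEDS_HELP to the caller rather than treating it as an error lets the caller follow up with bytebot_intervene_in_task.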
Mouse Buttons
- left: Primary button (default)
- right: Context menu button
- middle: Scroll wheel click
Scroll Directions
- up: Scroll up
- down: Scroll down
- left: Scroll left
- right: Scroll right
Common Applications
- firefox: Mozilla Firefox
- chrome: Google Chrome
- safari: Safari (macOS)
- terminal: Terminal/Command Prompt
- vscode: Visual Studio Code
Architecture
┌─────────────────────────────────────────────┐
│             MCP Client (Claude)             │
└─────────────────┬───────────────────────────┘
                  │ stdio transport
┌─────────────────▼───────────────────────────┐
│             ByteBot MCP Server              │
│  ┌────────────────────────────────────────┐ │
│  │    Agent Tools    │    Desktop Tools   │ │
│  │          Hybrid Orchestrator           │ │
│  └────────────┬──────────────┬────────────┘ │
└───────────────┼──────────────┼──────────────┘
                │              │
     ┌──────────▼──┐    ┌──────▼──────┐
     │  Agent API  │    │ Desktop API │
     │ (port 9991) │    │ (port 9990) │
     └─────────────┘    └─────────────┘
                │              │
     ┌──────────▼──────────────▼──────┐
     │        ByteBot Instance        │
     └────────────────────────────────┘
Development
Build
npm run build
Type Check
npm run type-check
Watch Mode
npm run dev
Environment Variables Reference
| Variable | Default | Description |
|---|---|---|
| BYTEBOT_AGENT_URL | http://localhost:9991 | ByteBot Agent API endpoint |
| BYTEBOT_DESKTOP_URL | http://localhost:9990 | ByteBot Desktop API endpoint |
| BYTEBOT_WS_URL | ws://localhost:9991 | WebSocket endpoint for real-time updates |
| ENABLE_WEBSOCKET | false | Enable WebSocket connections |
| MCP_SERVER_NAME | bytebot-mcp | Server identifier |
| REQUEST_TIMEOUT | 30000 | HTTP request timeout (ms) |
| DESKTOP_ACTION_TIMEOUT | 10000 | Desktop action timeout (ms) |
| MAX_RETRIES | 3 | Maximum retry attempts for failed requests |
| RETRY_DELAY | 1000 | Initial retry delay (ms) |
| TASK_POLL_INTERVAL | 2000 | Task status polling interval (ms) |
| TASK_MONITOR_TIMEOUT | 300000 | Maximum task monitoring duration (ms) |
| MAX_FILE_SIZE | 10485760 | Maximum file size in bytes (10MB) |
| LOG_LEVEL | info | Logging level (debug/info/warn/error) |
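RETRY_DELAY is described as an initial delay, which suggests the wait grows between attempts, although the README does not say how. The sketch below assumes the delay doubles after each failure and that MAX_RETRIES counts retries after the first attempt; both are assumptions, and `backoffDelays` and `withRetry` are hypothetical names.

```typescript
// Assumption: the delay doubles after each failed attempt (not confirmed by the README).
function backoffDelays(maxRetries: number, initialDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, i) => initialDelayMs * 2 ** i);
}

// Run an operation once, then retry up to maxRetries times with growing delays.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  initialDelayMs = 1000,
): Promise<T> {
  const delays = backoffDelays(maxRetries, initialDelayMs);
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break; // no retries left
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
  throw lastError;
}

console.log(backoffDelays(3, 1000)); // [ 1000, 2000, 4000 ]
```

With the defaults, a request that keeps failing is attempted four times and waits 7 seconds in total before the last error is rethrown.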
License
MIT
Support
For issues and questions:
- ByteBot Documentation: https://docs.bytebot.ai
- MCP Specification: https://modelcontextprotocol.io
- Report issues: Create an issue in this repository
Version History
1.0.0 (2024-01-15)
- Initial release
- Agent API integration (task management)
- Desktop API integration (computer control)
- Hybrid orchestration tools
- WebSocket support for real-time updates
- Comprehensive error handling and retry logic
- Full TypeScript implementation with strict typing