SideButton
Open source AI agent platform with MCP server, browser automation, knowledge packs, and workflow engine. Works with Claude Code, ChatGPT, Cursor, and any MCP client.
Open-source AI agent platform — MCP server, knowledge packs, and workflow automation tools.
<p align="center"> <a href="https://sidebutton.com/media/sidebutton-open-source-platform-release"> <img src="https://sidebutton.com/media/sidebutton-agent-stack.png" alt="The AI Agent Stack — SideButton" width="700" /> </a> </p>
SideButton exposes 40+ tools to AI agents. Agents can run YAML workflows, load knowledge packs, and control a real browser. Connect Claude Code, Cursor, ChatGPT, or any other MCP client.
npx sidebutton@latest
# Dashboard at http://localhost:9876
What you get
| Component | Description |
|---|---|
| MCP Server | 40+ AI agent tools for browser control, workflow execution, and knowledge pack access. Stdio and SSE transports. |
| REST API | 60+ endpoints. Trigger workflows remotely from webhooks, cron jobs, mobile apps, or other agents. |
| Workflow Engine | Workflow automation with 34+ step types: browser, shell, LLM, control flow. Workflows are defined in YAML. |
| Knowledge Packs | Installable domain knowledge: CSS selectors, data models, state machines. Role playbooks let a coding agent act as a software engineer, QA, or PM. |
| Chrome Extension | 40+ browser commands. Real DOM access via WebSocket, not screenshots. Recording mode. |
| Dashboard | Svelte UI: workflow browser, run logs, skill pack manager, system status. |
Quick Start
# Install and start
npx sidebutton@latest
# Or from source
pnpm install && pnpm build && pnpm start
# Open http://localhost:9876
CLI
pnpm cli serve # Start server with dashboard
pnpm cli serve --stdio # Start with stdio transport (for Claude Desktop)
pnpm cli list # List available workflows
pnpm cli status # Check server status
# Skill pack management
pnpm cli registry add <path|url> # Install skill packs from a registry
pnpm cli registry update [name] # Update installed packs
pnpm cli registry remove <name> # Uninstall packs and remove registry
pnpm cli search [query] # Search available skill packs
# Creating skill packs
pnpm cli init [domain] # Scaffold a new skill pack
pnpm cli validate [path] # Validate pack structure
pnpm cli publish [source] # Publish to a registry
MCP Server
SideButton is an AI agent platform and MCP server. AI coding agents connect to it directly for browser control, workflow automation, and domain knowledge.
Works with Claude Code, Cursor, Claude Desktop, VS Code, Windsurf, ChatGPT — any MCP client.
Claude Code
Add to ~/.claude/settings.json:
{
  "mcpServers": {
    "sidebutton": {
      "type": "sse",
      "url": "http://localhost:9876/mcp"
    }
  }
}
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "sidebutton": {
      "command": "npx",
      "args": ["sidebutton", "--stdio"]
    }
  }
}
Cursor
Add to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "sidebutton": {
      "url": "http://localhost:9876/mcp"
    }
  }
}
MCP Tools
| Tool | Description |
|---|---|
| `run_workflow` | Execute a workflow by ID |
| `list_workflows` | List all available workflows |
| `get_workflow` | Get workflow YAML definition |
| `get_run_log` | Get execution log for a run |
| `list_run_logs` | List recent workflow executions |
| `get_browser_status` | Check browser extension connection |
| `capture_page` | Capture selectors from current page |
| `navigate` | Navigate browser to URL |
| `snapshot` | Get page accessibility snapshot |
| `click` | Click an element |
| `type` | Type text into an element |
| `scroll` | Scroll the page |
| `screenshot` | Capture page screenshot |
| `hover` | Hover over element |
| `extract` | Extract text from element |
| `extract_all` | Extract all matching elements |
| `extract_map` | Extract structured data from repeated elements |
| `select_option` | Select dropdown option |
| `fill` | Fill input value (React-compatible) |
| `press_key` | Send keyboard keys |
| `scroll_into_view` | Scroll element into viewport |
| `evaluate` | Execute JavaScript in browser |
| `exists` | Check if element exists |
| `wait` | Wait for element or delay |
| `check_writing_quality` | Evaluate text quality |
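MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for `run_workflow`. The `tools/call` envelope follows the MCP specification, but the argument names (`workflow_id`, `params`) and the helper `buildToolCall` are assumptions made for illustration; the server's `tools/list` response is the authority on each tool's input schema.

```typescript
// Minimal sketch of the JSON-RPC 2.0 message an MCP client sends to call a tool.
// The { name, arguments } params shape is standard MCP; the argument names used
// for run_workflow below are assumptions, not SideButton's confirmed schema.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Assumed arguments; check the schema reported by tools/list before relying on them.
const request = buildToolCall(1, "run_workflow", {
  workflow_id: "check_ticket",
  params: { ticket_id: "PROJ-123" },
});

console.log(JSON.stringify(request, null, 2));
```

Most users never build this message by hand, since the MCP client does it, but seeing the envelope can help when debugging a connection.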
REST API
60+ JSON endpoints for external integrations. The same workflows are available through MCP locally and through REST remotely.
# Run a workflow
curl -X POST http://localhost:9876/api/workflows/check_ticket/run \
-H "Content-Type: application/json" \
-d '{"params": {"ticket_id": "PROJ-123"}}'
# List workflows
curl http://localhost:9876/api/workflows
# Get run log
curl http://localhost:9876/api/runs/latest
Trigger workflows from webhooks, cron jobs, mobile apps, or other agents on different machines.
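For callers that are not shell scripts, the "run a workflow" request can be assembled in code. The following is a sketch only: the endpoint path and body shape are taken from the curl example, while the helper name `buildRunRequest` and the `RunRequest` type are hypothetical.

```typescript
// Sketch of a helper that assembles the "run a workflow" REST request.
// Path and body mirror the curl example; the names here are illustrative.
interface RunRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildRunRequest(
  baseUrl: string,
  workflowId: string,
  params: Record<string, string>,
): RunRequest {
  return {
    // Encode the ID so unusual characters cannot break the path.
    url: `${baseUrl}/api/workflows/${encodeURIComponent(workflowId)}/run`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ params }),
  };
}

const runRequest = buildRunRequest("http://localhost:9876", "check_ticket", {
  ticket_id: "PROJ-123",
});
console.log(runRequest.url);

// To send it (Node 18+ or a browser), hand the pieces to fetch:
// const { url, ...init } = runRequest;
// const response = await fetch(url, init);
```

Keeping request construction separate from sending makes the helper easy to test without a running server.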
Workflow Engine
YAML-first orchestration. 34+ step types:
Step Types
| Type | Description |
|---|---|
| **Browser** | |
| `browser.navigate` | Open a URL |
| `browser.click` | Click an element by selector |
| `browser.type` | Type text into an element |
| `browser.fill` | Fill input value (React-compatible) |
| `browser.scroll` | Scroll the page |
| `browser.extract` | Extract text from element into variable |
| `browser.extractAll` | Extract all matching elements |
| `browser.extractMap` | Extract structured data from repeated elements |
| `browser.wait` | Wait for element or fixed delay |
| `browser.exists` | Check if element exists |
| `browser.hover` | Position cursor on element |
| `browser.key` | Send keyboard keys |
| `browser.snapshot` | Capture accessibility snapshot |
| `browser.injectCSS` | Inject CSS styles into page |
| `browser.injectJS` | Execute JavaScript in page |
| `browser.select_option` | Select dropdown option |
| `browser.scrollIntoView` | Scroll element into view |
| **Shell** | |
| `shell.run` | Execute a bash command |
| `terminal.open` | Open a visible terminal window (macOS) |
| `terminal.run` | Run command in terminal window |
| **LLM** | |
| `llm.classify` | Structured classification with categories |
| `llm.generate` | Free-form text generation |
| **Control Flow** | |
| `control.if` | Conditional branching |
| `control.retry` | Retry with backoff |
| `control.stop` | End workflow with message |
| `workflow.call` | Call another workflow with parameters |
| **Data** | |
| `data.first` | Extract first item from list |
LLM steps work with Ollama (local), OpenAI, Anthropic, and Google.
Example
id: check_ticket_status
title: "Check Jira ticket and classify"
steps:
  - type: browser.navigate
    url: "https://your-org.atlassian.net/browse/{{ticket_id}}"
  - type: browser.extract
    selector: "[data-testid='status-field']"
    as: current_status
  - type: control.if
    condition: "{{current_status}} != 'Done'"
    then:
      - type: llm.classify
        prompt: "Should this ticket be closed? Context: {{current_status}}"
        classes: [close, keep_open]
        as: decision
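To make the structure of this workflow explicit, here is a hypothetical TypeScript model of it. The field names mirror the YAML above, but the `Step` and `Workflow` interfaces and the `listStepTypes` helper are assumptions for illustration; the real definitions in `@sidebutton/core` may differ.

```typescript
// Hypothetical shape for the example workflow; not the engine's actual types.
interface Step {
  type: string;
  as?: string; // variable name that receives the step's result
  then?: Step[]; // nested steps, used here by control.if
  [field: string]: unknown; // step-specific fields such as url, selector, prompt
}

interface Workflow {
  id: string;
  title: string;
  steps: Step[];
}

// Collect every step type in document order, descending into `then` branches.
function listStepTypes(steps: Step[]): string[] {
  const types: string[] = [];
  for (const step of steps) {
    types.push(step.type);
    if (step.then) types.push(...listStepTypes(step.then));
  }
  return types;
}

const checkTicket: Workflow = {
  id: "check_ticket_status",
  title: "Check Jira ticket and classify",
  steps: [
    { type: "browser.navigate", url: "https://your-org.atlassian.net/browse/{{ticket_id}}" },
    { type: "browser.extract", selector: "[data-testid='status-field']", as: "current_status" },
    {
      type: "control.if",
      condition: "{{current_status}} != 'Done'",
      then: [
        {
          type: "llm.classify",
          prompt: "Should this ticket be closed? Context: {{current_status}}",
          classes: ["close", "keep_open"],
          as: "decision",
        },
      ],
    },
  ],
};

console.log(listStepTypes(checkTicket.steps).join(", "));
```

The nested `then` array is what makes `control.if` a branch: its child steps only belong to the conditional, not to the top-level sequence.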
Variable Interpolation
Use {{variable}} syntax to reference extracted values or parameters:
steps:
  - type: browser.extract
    selector: ".username"
    as: user
  - type: shell.run
    cmd: "echo 'Hello, {{user}}!'"
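The substitution itself can be pictured as a simple template pass. This is a minimal sketch of the idea, not the engine's implementation: the function name `interpolate` is hypothetical, and leaving unknown variable names untouched is an assumption about the desired behavior.

```typescript
// Minimal sketch of {{variable}} substitution (illustrative, not the engine's code).
// Assumption: names with no known value are left in place unchanged.
function interpolate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, key: string) => {
    // Only accept the caller's own keys, never inherited object properties.
    const value = Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : undefined;
    return value === undefined ? match : value;
  });
}

const greeting = interpolate("echo 'Hello, {{user}}!'", { user: "alice" });
console.log(greeting); // echo 'Hello, alice!'
```

Here `user` plays the role of the value stored by `as: user` in the extract step.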
Knowledge Packs
Knowledge packs (called skill packs in code and CLI commands) are installable bundles of domain knowledge for a web app or domain. They support AI code review, automated testing, and enterprise AI agent deployments. A pack can include:
- Selectors — CSS selectors for UI elements
- Data models — entity types, fields, relationships, valid states
- State machines — valid transitions per state
- Role playbooks — role-specific procedures (QA, SE, PM, SD)
- Common tasks — step-by-step procedures, gotchas, edge cases
sidebutton install github.com
sidebutton install atlassian.net
11 domains, 28+ modules published. Open registry — build and share packs for any web app.
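As an illustration of the "state machines" item in the list above, valid transitions per state can be modeled as a lookup table. Everything in this sketch is hypothetical: the ticket states, the `TransitionMap` type, and `canTransition` are invented for the example and do not describe the actual pack file format.

```typescript
// Illustrative only: a state machine as "valid transitions per state".
// States and shape are hypothetical, not a real knowledge pack format.
type TransitionMap = Record<string, string[]>;

const ticketStates: TransitionMap = {
  "To Do": ["In Progress"],
  "In Progress": ["In Review", "To Do"],
  "In Review": ["Done", "In Progress"],
  "Done": [],
};

// True only when `to` is listed as a valid next state for `from`.
function canTransition(machine: TransitionMap, from: string, to: string): boolean {
  const allowed = Object.prototype.hasOwnProperty.call(machine, from)
    ? machine[from]
    : undefined;
  return allowed !== undefined && allowed.indexOf(to) !== -1;
}

console.log(canTransition(ticketStates, "To Do", "In Progress")); // true
console.log(canTransition(ticketStates, "To Do", "Done")); // false
```

An agent with this kind of knowledge can reject an impossible action, such as closing a ticket that was never started, before touching the browser.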
Chrome Extension
Install from the Chrome Web Store.
- 40+ browser commands — navigate, click, type, extract, scroll, wait, snapshot
- Real DOM access via CSS selectors — not pixel coordinates, not screenshots
- Recording mode — capture manual actions as workflows
- Embed buttons — inject action buttons into any web page
- WebSocket connection — stable reconnection, works with local or remote server
After installing:
- Navigate to any website
- Click the SideButton extension icon
- Click "Connect This Tab"
Dashboard & Observability
Svelte UI at http://localhost:9876:
- Workflow browser — list, search, run
- Run logs — step-by-step execution traces with timing, variables, errors
- Skill pack manager — install, browse, inspect
- System status — extension connection, LLM config, server health
SideButton handles AI agent orchestration — from workflow execution to knowledge injection.
Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│                            @sidebutton/server                            │
│                                                                          │
│  ┌─────────────────────┐   ┌──────────────────────────────────────────┐  │
│  │  stdio Transport    │   │  Fastify HTTP + WebSocket (port 9876)    │  │
│  │  ─────────────────  │   │  ────────────────────────────────────    │  │
│  │  stdin  → JSON-RPC  │   │  GET  /      → Dashboard (Svelte)        │  │
│  │  stdout ← JSON-RPC  │   │  GET  /ws    → Chrome Extension WS       │  │
│  │  (Claude Desktop)   │   │  POST /mcp   → MCP JSON-RPC (SSE)        │  │
│  └──────────┬──────────┘   │  GET  /api/* → REST API                  │  │
│             │              └────────────────────┬─────────────────────┘  │
│             │                                   │                        │
│             └─────────────────┬─────────────────┘                        │
│                               ▼                                          │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                          @sidebutton/core                          │  │
│  │                                                                    │  │
│  │  - Workflow types & parser (YAML)                                  │  │
│  │  - Step executors (37 step types)                                  │  │
│  │  - Variable interpolation                                          │  │
│  │  - Execution context & events                                      │  │
│  └────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘
       ▲                  ▲                   ▲                    ▲
       │ stdio            │ WebSocket         │ HTTP POST          │ REST
       ▼                  ▼                   ▼                    ▼
┌──────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌───────────────────┐
│Claude Desktop│ │ Chrome Extension│ │   Claude Code   │ │    Mobile App     │
│ (MCP stdio)  │ │ (Browser Auto)  │ │    (MCP SSE)    │ │   (REST Client)   │
└──────────────┘ └─────────────────┘ └─────────────────┘ └───────────────────┘
Project Structure
sidebutton/
├── packages/
│   ├── core/                    # @sidebutton/core: workflow engine
│   │   └── src/
│   │       ├── types.ts         # Workflow types
│   │       ├── parser.ts        # YAML loader
│   │       ├── executor.ts      # Workflow runner
│   │       └── steps/           # Step implementations
│   ├── server/                  # @sidebutton/server: MCP + HTTP + CLI
│   │   ├── bin/                 # CLI entry point
│   │   └── src/
│   │       ├── server.ts        # Fastify HTTP server
│   │       ├── stdio-mode.ts    # stdio transport entry point
│   │       ├── extension.ts     # WebSocket client
│   │       ├── mcp/             # MCP handlers
│   │       │   ├── handler.ts   # MCP JSON-RPC logic
│   │       │   ├── stdio.ts     # stdio transport adapter
│   │       │   └── tools.ts     # Tool definitions
│   │       └── cli.ts           # Commander CLI
│   └── dashboard/               # Svelte web UI
│       └── src/
│           ├── App.svelte
│           └── lib/
├── extension/                   # Chrome extension
├── workflows/                   # Public workflow library
├── actions/                     # User-created workflows
├── skills/                      # Installed skill packs
└── run_logs/                    # Execution history
Environment Variables
| Variable | Required For | Description |
|---|---|---|
| `OPENAI_API_KEY` | `llm.*` steps | OpenAI API key for LLM workflows |
| `ANTHROPIC_API_KEY` | `llm.*` steps | Anthropic API key (alternative) |
Development
pnpm install # Install dependencies
pnpm build # Build all packages
pnpm start # Start server
pnpm cli list # List workflows
pnpm cli status # Check status
Watch Mode
pnpm dev # Full dev mode (all packages)
pnpm dev:server # Server with auto-restart on :9876
pnpm dev:dashboard # Dashboard watch build
pnpm dev:core # Core library watch build
Platform Automation Disclaimer
SideButton is a general-purpose browser automation framework. When automating third-party platforms:
- Review Terms of Service: Many platforms prohibit or restrict automation. You are responsible for complying with the terms of any platform you automate.
- Account Risk: Automation may result in account restrictions or suspension on some platforms.
- Use Responsibly: Only automate actions you would perform manually. Respect rate limits and platform guidelines.
The authors do not endorse or encourage violations of third-party terms of service.
Legal
License
This project uses mixed licensing. See LICENSING.md for details.
- Engine, server, CLI, dashboard — Apache-2.0
- Browser extension — FSL-1.1-Apache-2.0 (converts to Apache-2.0 on 2029-03-15)