Mochi
Browser automation MCP server with persistent memory for AI assistants, enabling automated web testing and workflow replay with self-healing selectors.
Browser companion for AI assistants. Browser automation MCP + persistent project memory + in-page hint messaging, all in one Claude Code plugin.
A QA-tester MCP for AI assistants — with memory.
Each AI session runs inside its own Chrome tab group (your other tabs are untouched). What's new: every successful action is auto-traced, and the agent can save the trace as a named workflow scoped to a domain. Next time you ask "test the login flow on staging", the agent replays the saved workflow with cached selectors — no re-discovery, no re-screenshotting. If a selector breaks (refactor, A/B test, redesign), the engine self-heals by ARIA role + name and updates the cache.
selector cache + workflow store
(file-based, <project>/.continuum/)
▲
AI client (Claude Code / Codex / Cursor)
│ stdio (MCP) ▲
▼ │
server/ ──── auto-launches Chrome ─┴─▶ Chrome
│ WebSocket │
▼ ▼
extension/ background.js ◀────────────── manifest V3 extension
│
▼
session = { tab group, primary tab, tab set, CDP }
Layout
| Directory | What it is |
|---|---|
| `server/` | Node MCP server. WS server (port 9009) + file-based memory + workflow replay engine. |
| `extension/` | Chrome MV3 extension. Owns the tab group, CDP attachments, and DOM helpers. |
| `mcp/` | Reference clone of upstream Browser MCP (not used; archival). |
Memory (selector cache, workflows, runs) lives in <project>/.continuum/
alongside the Continuum chain data — pure files, no database. Override with
SUPER_TESTER_DATA_DIR.
Install (one plugin, one extension, done)
Mochi is a Claude Code plugin bundling browser automation, context-chain
memory, and popup hint messaging. The server is pre-bundled into a single
file (server/dist/server.bundle.mjs) by GitHub Actions on every push to
the default branch — so installing means cloning the repo. No npm install,
no native binaries, no setup script.
Install from GitHub (recommended)
# Inside any Claude Code session:
/plugin marketplace add DevZonayed/Mochi
/plugin install mochi@mochi
That's it for the plugin. Then load the Chrome extension once:
chrome://extensions → Developer mode → Load unpacked → select
~/.claude/plugins/cache/mochi/mochi/0.4.1/extension
Restart Claude Code. Press ⌘⇧M (macOS) or Ctrl+Shift+M (other) on any tab to send a hint with picked elements + screenshot.
Install from a local checkout (if hacking on the plugin itself)
/plugin marketplace add /absolute/path/to/your/Mochi/checkout
/plugin install mochi@mochi
If you change source files, rebuild the bundle before reloading:
cd /path/to/Mochi/server && npm install && npm run build
Then /reload-plugins inside Claude (or /plugin uninstall mochi && /plugin install mochi@mochi for a clean refresh).
The plugin auto-registers:
- Two MCP servers: `browser` (automation tools) and `continuum` (recall tool)
- Seven slash commands: `/continuum:checkpoint`, `/continuum:recall`, `/continuum:status`, `/continuum:dream`, `/continuum:feedback`, `/continuum:rename`, `/continuum:render`
- Seven hooks: SessionStart, PreCompact, SessionEnd, PreToolUse, PostToolUse, Stop, UserPromptSubmit
- The `/browser` skill
For the in-page hint modal: press ⌘⇧M (macOS) or Ctrl+Shift+M on any page once the extension is loaded.
Migrating from the older super-tester plugin name
If you had an earlier install when the plugin was called super-tester, run:
/plugin uninstall super-tester
/plugin marketplace remove super-tester
…then follow the install steps above using the new mochi names. Your
.continuum/ chain data, screenshots, and feedback queue are tied to your
project dir (not the plugin name) — they survive the rename.
Updating
Whenever new versions of mochi land (bumped version in plugin.json):
/plugin update mochi
Claude Code re-copies the source files into its plugin cache and switches the
active install over. MCP servers respawn on next Claude restart. For hook /
command / skill changes only, /reload-plugins works mid-session.
Legacy install script
The older ./install.sh script (which edited ~/.claude.json directly) is
still in the repo but is now redundant with the plugin route. Don't use both
at the same time — pick one.
./install.sh --uninstall # remove the legacy .claude.json entry if present
Tools (MCP)
54 tools, grouped by purpose.
Session + tabs
| Tool | What it does |
|---|---|
| `browser_session_start` | New tab group + primary tab. Pass newWindow:true to spawn a fresh window. |
| `browser_session_end` | Detach debugger, ungroup or close session tabs. |
| `browser_navigate` | Navigate primary tab, wait for load. |
| `browser_open_tab` | Open new tab inside the session group. |
| `browser_list_tabs` | List session tabs + CDP attachment state. |
| `browser_close_tab` | Close a specific session tab. |
Discovery + interaction
| Tool | What it does |
|---|---|
| `browser_text` | Compact visible text lines, optionally filtered by query. Use before snapshot for reading/searching content. |
| `browser_links` | Compact visible links with text, href, selector ref, and box. |
| `browser_snapshot` | ARIA tree with stable refs + pixel boxes. Defaults to compact, viewport-only, redacted, depth-limited, and 12KB capped. |
| `browser_snapshot_query` | Search the stored snapshot by text/name/role/ref/tag and return tiny excerpts. |
| `browser_snapshot_node` | Return one compact subtree from the stored snapshot by ref, text, or query path. |
| `browser_click` | Click by selector (CDP). Pass intent to cache the selector for next time. |
| `browser_click_at` | Click at pixel coords (CDP). |
| `browser_type` | Focus + clear + insertText. Pass intent to cache. |
| `browser_press_key` | Real keyboard event (CDP). |
| `browser_scroll` | Absolute {x,y} or relative {deltaX,deltaY}. |
| `browser_go_back` / `browser_go_forward` | History navigation. |
| `browser_wait` | Sleep up to 60s. |
| `browser_screenshot` | PNG/JPEG (viewport / fullPage / elementRef). |
Viewport + window
| Tool | What it does |
|---|---|
| `browser_window_resize` | Resize/move/maximize the actual window (only safe with newWindow). |
| `browser_emulate_viewport` | Device Mode via CDP. Presets + custom width/height/DPR/UA. |
| `browser_clear_emulation` | Reset viewport overrides. |
Assertions
| Tool | What it does |
|---|---|
| `browser_assert` | Verify url-contains, url-equals, title-contains, element-exists, element-missing, text-contains, text-equals. Returns {ok, got}. |
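The assertion kinds map naturally onto a small dispatch function. The following is a minimal Python sketch, not Mochi's implementation: `PageState` and `run_assert` are hypothetical names, only the URL, title, and element kinds are covered, and the value reported in `got` for element checks is an assumption. Only the `{ok, got}` envelope comes from the table above.

```python
from dataclasses import dataclass, field


@dataclass
class PageState:
    """Hypothetical stand-in for what the browser reports about a page."""
    url: str
    title: str
    selectors: set[str] = field(default_factory=set)  # selectors known to match


def run_assert(page: PageState, kind: str, value: str) -> dict:
    """Evaluate one assertion and return the {ok, got} envelope."""
    if kind == "url-contains":
        return {"ok": value in page.url, "got": page.url}
    if kind == "url-equals":
        return {"ok": page.url == value, "got": page.url}
    if kind == "title-contains":
        return {"ok": value in page.title, "got": page.title}
    if kind == "element-exists":
        found = value in page.selectors
        return {"ok": found, "got": found}  # assumed: got reports presence
    if kind == "element-missing":
        found = value in page.selectors
        return {"ok": not found, "got": found}
    raise ValueError(f"unsupported assertion kind: {kind}")
```

Returning `got` alongside `ok` lets the agent see what the page actually contained without a second round trip.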
File uploads
| Tool | What it does |
|---|---|
| `browser_upload_stage` | Stage a file into the per-project library at .continuum/uploads/. Accepts path, https url, dataUrl, or base64. Returns a stable stashId (sha256-based, idempotent) reusable across many uploads. |
| `browser_upload_file` | Attach a file to a page target via a strategy chain: direct (DOM.setFileInputFiles) → intercept (file-chooser dialog) → drop (synthesized DataTransfer + DragEvent) → paste (synthesized ClipboardEvent). Bypasses the native OS file picker entirely. Target by selector, ref, trigger, or auto: { near }. Smart-wait confirms upload (preview thumbnail / 2xx upload response / custom successSelector). |
Sources can be inlined into browser_upload_file or pre-staged with
browser_upload_stage; staging dedupes by sha256, so the same logo across
ten sites is fetched/decoded once and reused. Frame-traversal handles
same-origin iframes automatically. See
docs/superpowers/specs/2026-05-19-browser-file-upload-design.md
for the full design.
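Content-addressed staging is what makes the stash ID idempotent. A minimal Python sketch of the idea, under stated assumptions: `stage_bytes` is a hypothetical helper, the 16-character ID length is made up for the example, and the real on-disk layout of `.continuum/uploads/` may differ.

```python
import hashlib
from pathlib import Path


def stage_bytes(library: Path, data: bytes, suffix: str = "") -> str:
    """Store data under a content-derived name and return its stash ID."""
    stash_id = hashlib.sha256(data).hexdigest()[:16]  # assumed ID length
    library.mkdir(parents=True, exist_ok=True)
    target = library / f"{stash_id}{suffix}"
    if not target.exists():  # same bytes give the same ID, so no second write
        target.write_bytes(data)
    return stash_id
```

Because the ID depends only on the bytes, staging the same logo twice returns the same ID and leaves a single file in the library.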
Playbooks (personal ops memory)
| Tool | What it does |
|---|---|
| `browser_playbook_list` | List playbooks under .continuum/playbooks/, filter by origin/tag/verifiable. |
| `browser_playbook_get` | Return one playbook with meta + body sections + workflow JSON. |
| `browser_playbook_save` | Create/update a playbook (validates frontmatter and required sections). |
| `browser_playbook_delete` | Remove a playbook + workflow + screenshots. |
| `browser_playbook_match` | Score-match playbooks against a URL, intent, or task description. |
| `browser_playbook_run` | Replay a playbook (with self-heal) using provided inputs; recursively executes composes/next chains; returns verdict + evidence. |
| `browser_playbook_propose_update` | Given a successful trace, create or update the matching playbook. Inputs and steps auto-inferred. |
| `browser_playbook_secret_check` | Validate that a playbook's type: secret inputs are resolvable (env or .continuum/secrets/). Returns availability only, never values. |
| `browser_playbook_seed_from_codebase` | Static-analyze the project's frontend (Next.js / Vite / CRA) and emit draft playbooks per route + form. Solves cold-start on in-house apps. |
| `browser_playbook_diff_accept` | Bless a run's per-step screenshots as the new visual reference; bumps playbook_version. |
| `browser_playbook_export` | Export one or more playbooks to a single JSON bundle file. Includes embedded base64 screenshots. |
| `browser_playbook_import` | Import a playbook bundle (file / inline JSON / https URL). Supports overwrite + rewriteOrigin (e.g., staging → production). |
| `browser_playbook_dashboard` | Generate a self-contained HTML dashboard from the library; opens in the active browser session. |
v1.5 capabilities:
- Typed secrets: `inputs[].type: secret` resolves at runtime from `${env:VAR}` or `${secret:name}` (reads `.continuum/secrets/<name>.txt`, which is `chmod 0700` with an auto-protective `.gitignore`). Secret values never appear in `.continuum/runs/` traces or promoted playbook bodies.
- Codebase-derived drafts: point `browser_playbook_seed_from_codebase` at this project and it walks your routes (Next.js App/Pages Router, Vite, CRA), extracts forms + `data-testid`s + `aria-label`s, and emits draft playbooks per route. Password fields auto-typed as `secret`.
- Visual diff regression: during `browser_playbook_run`, each step's screenshot is compared (pixelmatch) against the playbook's reference. `warn` between 5–20% diff; `fail` ≥20% (configurable per playbook). Use `browser_playbook_diff_accept` to bless intentional UI changes.
See docs/superpowers/specs/2026-05-20-playbooks-v1-5-design.md.
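The visual-diff thresholds reduce to a small classification step. A minimal Python sketch, assuming the diff is expressed as the fraction of changed pixels: `diff_ratio` and `classify_diff` are hypothetical names, the `"pass"` label for diffs under 5% is an assumption, and treating exactly 5% as `warn` is a guess at the boundary.

```python
def diff_ratio(changed_pixels: int, total_pixels: int) -> float:
    """Fraction of pixels that differ between a screenshot and its reference."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    return changed_pixels / total_pixels


def classify_diff(ratio: float, warn_at: float = 0.05, fail_at: float = 0.20) -> str:
    """Map a diff ratio onto pass / warn / fail using per-playbook thresholds."""
    if ratio >= fail_at:
        return "fail"
    if ratio >= warn_at:
        return "warn"
    return "pass"  # assumed label for diffs below the warn threshold
```

Keeping the thresholds as parameters mirrors the "configurable per playbook" note above.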
v2 capabilities (Sharing & Polish):
- 1Password integration: `inputs[].type: secret` refs accept `${1password:vault/item/field}` (alias `${op:...}`). When the `op` CLI is installed and signed in, values are resolved via `op read` at run time and never logged.
- Vue + SvelteKit codebase seeding: Nuxt projects (`nuxt.config.*`) and SvelteKit projects (`svelte.config.*` + `@sveltejs/kit`) are detected by `browser_playbook_seed_from_codebase` in addition to Next.js / Vite / CRA.
- Blocked-verdict UX: `browser_playbook_run` returns `verdict: "blocked"` with a `needs[]` array (one entry per missing required input + a `hint` per source) instead of throwing. The main agent uses the hints to prompt the user (or fix env) and then retries.
- Cross-project playbook bundles: `browser_playbook_export` writes a single JSON containing markdown + workflow + base64 screenshots; `browser_playbook_import` restores them anywhere. Supports overwrite + origin rewrite for staging → production migration.
- HTML dashboard: `/mochi:playbook ui` (or `browser_playbook_dashboard`) generates a self-contained dashboard with search, tag filters, and inline drill-down per playbook.
See docs/superpowers/specs/2026-05-20-playbooks-v2-design.md.
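To make the secret references and the blocked verdict concrete, here is a minimal Python sketch of an availability check that never reads secret values. Everything beyond the reference syntax and the `verdict` / `needs` / `hint` field names is an assumption: the function names, the shape of each `needs` entry, the `"ready"` verdict, and the choice to report 1Password references as unavailable (a real check would consult the `op` CLI).

```python
import os
import re
from pathlib import Path

# Matches ${env:VAR}, ${secret:name}, ${1password:...} and its ${op:...} alias.
REF = re.compile(r"^\$\{(env|secret|1password|op):(.+)\}$")


def parse_ref(ref: str) -> tuple[str, str]:
    """Split a secret reference into (source, key), normalizing the op alias."""
    match = REF.match(ref)
    if match is None:
        raise ValueError(f"not a secret reference: {ref!r}")
    source, key = match.groups()
    return ("1password" if source == "op" else source), key


def is_available(ref: str, secrets_dir: Path) -> bool:
    """Report whether a reference could be resolved, without reading its value."""
    source, key = parse_ref(ref)
    if source == "env":
        return key in os.environ
    if source == "secret":
        return (secrets_dir / f"{key}.txt").is_file()
    return False  # 1Password lookups are not attempted in this sketch


def check_inputs(refs: dict[str, str], secrets_dir: Path) -> dict:
    """Return a blocked verdict listing every input that cannot be resolved."""
    needs = [
        {"input": name, "hint": f"provide {ref}"}  # assumed entry shape
        for name, ref in refs.items()
        if not is_available(ref, secrets_dir)
    ]
    return {"verdict": "blocked", "needs": needs} if needs else {"verdict": "ready"}
```

Returning a structured verdict instead of raising gives the calling agent something to act on: it can prompt for the missing input and retry.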
Combined with the bundled qa-tester subagent and the smart-router rule in
plugins/qa/CLAUDE.md, the playbook library is your personal ops memory —
each browser task you do once becomes replayable, chainable, and
scheduleable. Main Claude routes verifiable + repeatable tasks to the
isolated qa-tester subagent (which returns a pass/fail verdict + evidence)
while operational tasks (multi-step, decisive, may need mid-flow input)
stay in the main conversation and use playbooks as guidance.
Slash commands:
- `/qa <task>`: dispatch the qa-tester subagent for a verifiable browser task.
- `/mochi:playbook list|show|run|delete|match [args]`: manage playbooks.
- `/mochi:schedule-playbook <id>`: wire up cron via the host's `schedule` skill.
- `/mochi:unschedule-playbook <id>`: cancel a scheduled playbook.
See docs/superpowers/specs/2026-05-20-personal-ops-playbooks-design.md
for the full design.
Memory: selector cache (per origin)
| Tool | What it does |
|---|---|
| `browser_recall_selector` | "Do I already know how to find X on this site?" Returns cached selector or null. |
| `browser_forget_selector` | Drop a cached entry. |
| `browser_list_selectors` | Inspect the cache. |
Memory: workflows
| Tool | What it does |
|---|---|
| `browser_workflow_save` | Persist current session's auto-traced actions as a named workflow. |
| `browser_workflow_run` | Replay. Cached selector → self-heal by role+name → screenshot on miss. |
| `browser_workflow_list` / `_get` / `_delete` | Manage workflows. |
| `browser_workflow_export` / `_import` | Portable JSON (commit alongside your app's tests). |
| `browser_run_history` | Last N runs of a workflow. |
Compact inspection ladder
Use the smallest inspection tool that can answer the current question:
1. `browser_text {query?, limit?}` for page copy, search results, lists, and visible facts.
2. `browser_links {query?, limit?}` for navigation choices.
3. `browser_snapshot` for clickable refs and visible actionable UI.
4. `browser_snapshot_query` for targeted search inside the stored snapshot.
5. `browser_snapshot_node` for the one subtree you need.
6. `browser_snapshot {mode:"full", scope:"all", maxBytes:0}` only as an explicit last resort.
This keeps Claude Code, Codex, and parallel agents from flooding their context with full-page accessibility trees.
Visual placement loop
browser_snapshot → compact tree with refs + boxes
browser_screenshot → image (viewport / fullPage / elementRef)
↓
agent correlates ref ↔ box ↔ pixel position
↓
browser_click {ref, intent:"…"} ← intent caches the selector
Resize vs. emulate — when to use which
| Goal | Tool |
|---|---|
| Test a real responsive layout at iPhone size | browser_emulate_viewport {preset:"iphone-15-pro"} (no window change, includes touch + UA) |
| Test how the UI behaves at a real 2560×1440 monitor | browser_window_resize {width:2560, height:1440} (only safe in a session-owned window) |
| Verify a layout breakpoint at exactly 768px wide | browser_emulate_viewport {width:768, height:1024} |
| Reset back to native | browser_clear_emulation |
emulate_viewport is preferred — it's deterministic, doesn't disturb anything else, and matches what Chrome DevTools' Device Mode does. window_resize is for when you genuinely need real OS-level window dimensions.
Memory model
Two layers, stored as plain JSON files under <project>/.continuum/ (no
database, no native bindings).
1) Selector cache — keyed by (origin, intent)
Every browser_click / browser_type call may carry an intent ("click sign in
button", "email field"). On success, the resolved selector is cached at
(origin, intent). The agent can short-circuit discovery by calling
browser_recall_selector before snapshotting:
browser_recall_selector {intent:"click sign in button"}
→ {found:true, selector:'button[aria-label="Sign in"]', last_box:{...}}
browser_click {ref:'button[aria-label="Sign in"]', intent:"click sign in button"}
The cache survives Chrome restarts, project reloads, server restarts.
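A file-backed cache keyed by (origin, intent) can be very small. The Python sketch below is illustrative only: the `SelectorCache` class, the single-file JSON layout, and the `{"found": False}` miss shape are assumptions, not Mochi's actual storage format.

```python
import json
from pathlib import Path


class SelectorCache:
    """Selectors keyed by origin, then intent, persisted as one JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.entries: dict[str, dict[str, str]] = {}
        if path.exists():  # reload whatever an earlier process wrote
            self.entries = json.loads(path.read_text())

    def remember(self, origin: str, intent: str, selector: str) -> None:
        """Record the selector that worked and flush to disk immediately."""
        self.entries.setdefault(origin, {})[intent] = selector
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.entries, indent=2))

    def recall(self, origin: str, intent: str) -> dict:
        """Return the cached selector for this origin and intent, if any."""
        selector = self.entries.get(origin, {}).get(intent)
        if selector is None:
            return {"found": False}  # assumed shape for a miss
        return {"found": True, "selector": selector}
```

Writing on every `remember` call is what lets the cache outlive the process: a fresh instance pointed at the same file sees the earlier entries.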
2) Workflows — keyed by (origin, name)
Every successful action inside a session is appended to an in-memory trace.
browser_workflow_save {name:"login"} persists the trace as an ordered list of
steps. browser_workflow_run {name:"login"} replays them.
Replay strategy per step:
- Try the step's stored selector. If it resolves → click.
- Else: try other entries from the selector cache for the same intent.
- Else: self-heal by ARIA role + name from a fresh snapshot. If found, update both the step record AND the selector cache, continue.
- Else: return a rich failure envelope (tried selectors, role/name, screenshot, suggestion) so the agent can recover.
The agent doesn't need to think about caching — just pass intent. Workflows
build themselves out of normal exploration and replay deterministically next
time.
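The replay strategy above can be sketched as a single fallback chain. This is a simplified Python illustration under stated assumptions: `resolve_step` is a hypothetical name, `resolves` and `heal` stand in for real DOM lookups and a snapshot search, the cache is reduced to a dict of intent to selector list, and the failure envelope omits the screenshot and suggestion.

```python
from typing import Callable, Optional


def resolve_step(
    step: dict,
    cache: dict[str, list[str]],
    resolves: Callable[[str], bool],
    heal: Callable[[str, str], Optional[str]],
) -> dict:
    """Find a working selector for one step, healing the records if needed."""
    tried: list[str] = []

    # 1) The selector stored on the step itself.
    if resolves(step["selector"]):
        return {"status": "pass", "selector": step["selector"],
                "selector_source": "step_cache"}
    tried.append(step["selector"])

    # 2) Other cached selectors recorded for the same intent.
    for candidate in cache.get(step["intent"], []):
        if candidate in tried:
            continue
        if resolves(candidate):
            return {"status": "pass", "selector": candidate,
                    "selector_source": "selector_cache"}
        tried.append(candidate)

    # 3) Self-heal by ARIA role + name, then update step and cache.
    healed = heal(step["role"], step["name"])
    if healed is not None:
        step["selector"] = healed
        cache.setdefault(step["intent"], []).insert(0, healed)
        return {"status": "pass", "selector": healed,
                "selector_source": "self_healed"}

    # 4) Nothing worked: report what was tried so the agent can recover.
    return {"status": "fail", "tried": tried,
            "role": step["role"], "name": step["name"]}
```

Because a healed selector is written back to both the step and the cache, the next replay succeeds at the first rung instead of healing again.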
Step-by-step feedback contract
Every replayed step returns:
{
"step": 2,
"action": "click",
"intent": "click sign in button",
"status": "pass", // pass | fail | skipped
"selector": "button[aria-label=\"Sign in\"]",
"selector_source": "step_cache", // step_cache | selector_cache | self_healed
"durationMs": 12
}
Failures additionally include tried, role, name, screenshotDataUrl, and
a suggestion.
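Per-step results roll up into a run-level summary. A minimal Python sketch, assuming a run fails as soon as any step fails and that skipped steps do not count as passed: `summarize_run` is a hypothetical helper, and the field names `stepsTotal` and `stepsPassed` follow the replay output shown under "Typical agent flow".

```python
def summarize_run(results: list[dict]) -> dict:
    """Fold per-step results into a run-level envelope."""
    passed = sum(1 for r in results if r["status"] == "pass")
    failed = any(r["status"] == "fail" for r in results)
    return {
        "status": "fail" if failed else "pass",  # assumed rule: any failure fails the run
        "stepsTotal": len(results),
        "stepsPassed": passed,
        "results": results,
    }
```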
Typical agent flow
First time ("test the login flow"):
browser_session_start
browser_navigate {url:"https://staging.myapp.com/login"}
browser_recall_selector {intent:"email field"} → not found
browser_snapshot
browser_type {ref:"input[name=email]", text:"…", intent:"email field"}
browser_recall_selector {intent:"click sign in"} → not found
browser_click {ref:"button.signin", intent:"click sign in"}
browser_assert {kind:"url-contains", value:"/dashboard"}
browser_workflow_save {name:"login"}
Next time ("retest login"):
browser_session_start
browser_workflow_run {name:"login", origin:"https://staging.myapp.com"}
→ {status:"pass", stepsTotal:5, stepsPassed:5, results:[…]}
If the UI was refactored, the run still passes — the engine self-heals and updates the cache. If it can't find the element at all, the agent gets a screenshot and a suggestion, and falls back to snapshot + AI discovery.
Portability
Workflows are portable JSON. Commit them alongside your app:
# in agent flow:
browser_workflow_export {name:"login"} # returns JSON payload
# write to repo: tests/super-tester/login.json
# later, on a fresh machine:
browser_workflow_import {payload: <json>}
Concurrent Claude sessions
You can run multiple Claude Code sessions at once, each with its own
Mochi scope. The first MCP server to start binds port 9009 and becomes
the broker; subsequent MCP servers detect the conflict and connect to the
broker as clients, forwarding their browser commands through it. Each
Claude session gets its own clientId, and the extension keeps a separate tab
group per client. Sessions are fully isolated — Session A's clicks/navigates
never touch Session B's tabs.
Claude session 1 ──stdio──► MCP-A ─────► (broker, owns port 9009 + extension WS)
└──┐
Claude session 2 ──stdio──► MCP-B ─────► (client → forwards via MCP-A)
└──┐
Claude session 3 ──stdio──► MCP-C ─────► (client → forwards via MCP-A)
Extension holds Map<clientId, Session> — one tab group per Claude session.
The selector cache and workflow store are shared across sessions
(per-origin, in .continuum/), so a workflow recorded in Session A can be
replayed from Session B without re-learning anything.
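The "first to bind wins" broker election described in this section can be illustrated with a plain TCP socket. This Python sketch shows the general technique only: `claim_role` is a hypothetical name, and the real server is a Node process using a WebSocket server rather than a raw socket.

```python
import errno
import socket
from typing import Optional


def claim_role(port: int, host: str = "127.0.0.1") -> tuple[str, Optional[socket.socket]]:
    """Try to bind the broker port; fall back to the client role if it is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return "client", None  # someone else is already the broker
        raise
    return "broker", sock  # caller keeps the socket open while it is the broker
```

Letting the operating system arbitrate the port avoids a separate lock file or coordination service: whichever process binds first is the broker, and everyone else learns that from the bind error.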
Claude Code shortcuts
After installing the mochi plugin (see Install
above), the browser MCP server runs automatically. No claude mcp add-json
needed.
After restarting Claude Code, use:
/browser test localhost:3000
use browser to verify the login flow
use the browser MCP and check console errors
The MCP tools are named browser_session_start, browser_navigate,
browser_snapshot, browser_click, browser_screenshot,
browser_console_messages, browser_network_requests, and related
browser_* tools.
If the broker process dies (for example, the first Claude Code session exits),
the remaining MCP clients automatically race to recover. One client promotes
itself to the new broker, the extension reconnects to it, and clients request
their previous clientId so existing tab groups remain attached to the right
Claude session. New Claude sessions can then connect to the recovered broker.
Commands from the same client are serialized inside the extension to prevent
same-session races such as session_start overlapping navigate or
session_end. Different client sessions still run in parallel, each scoped to
its own tab group.
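Per-client serialization with cross-client parallelism is commonly implemented with one lock per client. A minimal Python sketch using asyncio, with hypothetical names (`ClientQueues`, `demo`); the extension itself is JavaScript and may use a promise chain or queue instead.

```python
import asyncio
from collections import defaultdict


class ClientQueues:
    """Run commands one at a time per client, but concurrently across clients."""

    def __init__(self) -> None:
        self._locks = defaultdict(asyncio.Lock)  # one lock per clientId

    async def run(self, client_id: str, command):
        async with self._locks[client_id]:
            return await command()


async def demo() -> list[str]:
    queues = ClientQueues()
    log: list[str] = []

    async def command(label: str) -> None:
        log.append(f"{label}:start")
        await asyncio.sleep(0.01)  # stand-in for browser work
        log.append(f"{label}:end")

    await asyncio.gather(
        queues.run("A", lambda: command("A1")),
        queues.run("A", lambda: command("A2")),
        queues.run("B", lambda: command("B1")),
    )
    return log
```

In the demo, the second command for client A cannot start until the first finishes, while client B's command overlaps with both.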
Boundary guarantees
- Spawned tabs (target=_blank, `window.open`, etc.) are auto-grouped into the session group via `chrome.tabs.onCreated`.
- Drag a tab out of the group → it's released from the session, no longer touched.
- All operations validate that the target tab is still in the session group. If you ungroup or close the group, the next tool call fails cleanly.
- Other Chrome windows / tabs / groups are never queried, never modified.
- Per-client isolation: every operation is scoped to the originating Claude session's tab group. Cross-session reads/writes are impossible at the protocol level.
- Service-worker restart recovery: session metadata is persisted in `chrome.storage.local` and restored against live tab groups when the extension wakes back up.
Environment variables
| Variable | Default | Purpose |
|---|---|---|
| `SUPER_TESTER_WS_PORT` | `9009` | WS port the extension connects to (must match in code). |
| `SUPER_TESTER_AUTO_LAUNCH` | `true` | Set false to disable Chrome auto-launch. |
| `SUPER_TESTER_CHROME_PATH` | platform-detected | Override Chrome binary path. |
| `SUPER_TESTER_EXTENSION_PATH` | unset | If set, Chrome launches with --load-extension=<path>. |
| `SUPER_TESTER_PROFILE_DIR` | `~/.super-tester/super-tester-profile` | Dedicated --user-data-dir. |
| `SUPER_TESTER_EXTENSION_WAIT_MS` | `20000` | How long to wait for the extension to connect on cold start. |
| `SUPER_TESTER_DATA_DIR` | `<project>/.continuum/` | Override where selector cache and workflows are stored. |
| `SUPER_TESTER_PROJECT_DIR` | `process.cwd()` | Where to start looking for a project root (.git / package.json) for the per-project data dir. |
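Project-root detection of this kind usually walks up from the starting directory until it finds a marker. A minimal Python sketch under stated assumptions: `find_project_root` and `data_dir` are hypothetical names, falling back to the starting directory when no marker is found is a guess, and the `override` parameter stands in for `SUPER_TESTER_DATA_DIR`.

```python
from pathlib import Path
from typing import Optional

MARKERS = (".git", "package.json")


def find_project_root(start: Path) -> Path:
    """Walk upward from start until a directory contains a project marker."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in MARKERS):
            return candidate
    return start  # assumed fallback when no marker is found


def data_dir(start: Path, override: Optional[str] = None) -> Path:
    """Pick the data directory: an explicit override wins, else <root>/.continuum."""
    if override:
        return Path(override)
    return find_project_root(start) / ".continuum"
```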
Caveats
- Chromium-only (Tab Groups API). Won't work in Firefox.
- Debugger banner: the first time you call `browser_click` / `browser_type` / `browser_press_key` / `browser_screenshot` with `fullPage` or `elementRef` on a tab, Chrome shows a "Mochi started debugging this browser" banner (Chrome shows the extension's display name). The session keeps the attachment alive until `browser_session_end` (or the tab closes). This is intentional and unavoidable for real input dispatch.
- DevTools collision: if you open Chrome DevTools on a session tab, CDP attach will fail until you close DevTools.
- `--load-extension` requires Developer Mode in the target profile.
- WebSocket reconnect from an MV3 service worker is best-effort: a 30-second alarm pings every cycle; opening the popup wakes the SW immediately.
- Hard-coded `ws://127.0.0.1:9009` in the extension: change there + via `SUPER_TESTER_WS_PORT` if needed.
Troubleshooting
| Symptom | Likely cause / fix |
|---|---|
| `extension didn't connect within timeout` | Extension isn't loaded, Chrome on a different profile, or the SW died. Click the Mochi icon → confirm the status dot is green. |
| `tab not in session group` | The tab was dragged out, or the session was ended/cleared. Call browser_session_start again. |
| `element not found: …` | Use browser_snapshot first; pass the ref it returns. |
| `chrome.debugger attach failed — another debugger…` | Close Chrome DevTools on the session tab (or other debugging extensions), then retry. |
| `element has zero size` from `browser_screenshot` | The elementRef is hidden / display:none. Snapshot first to confirm visibility. |
| Chrome opens but with the wrong profile | Set SUPER_TESTER_PROFILE_DIR to a clean directory. |
| Server logs `[ws] error: EADDRINUSE` | Another instance is running on port 9009. Kill it or change SUPER_TESTER_WS_PORT. |
Contributing
Issues, ideas, and PRs welcome. A few pointers:
- Bug reports / feature requests: use the issue templates.
- Open-ended questions or ideas: start a discussion instead of an issue.
- Pull requests: keep them focused (one concern per PR). The PR template prompts for context. Don't hand-edit `server/dist/`; it's auto-rebuilt by CI from `server/src/`.
- Security issues: please don't open a public issue. See SECURITY.md for the private disclosure process.
License
MIT © Jonayed Ahamed