playwright-fixer-mcp

Automated Playwright E2E test repair powered by a self-improving, governed MCP server that runs failing tests, collects failure artifacts, reasons about root causes, validates and applies fixes, and re-runs to verify.

Built on the three-layer AI automation architecture — Knowledge, Capability, and Governance — this tool turns failing Playwright tests into a closed verification loop: it runs the test, collects failure artifacts, reasons about the root cause, applies a rule-validated fix, and re-runs to verify — without hallucinating about runtime state.

We don't increase model intelligence. We reduce model freedom. The result is a system that works not by magic, but by design.


Why This Exists

When a Playwright test fails, most teams do one of two things:

  1. Send the error to an LLM and hope it guesses right.
  2. Alert an engineer to investigate manually.

Both share the same flaw: whoever investigates, model or engineer, is reasoning blind. An error such as "Timeout 5000ms exceeded" contains almost no actionable information. No selector. No DOM context. No iframe structure. No screenshot.

This tool solves that by treating E2E debugging as a state reconstruction problem, not a prompting problem.

Instead of better prompts, the system provides:

  • Deterministic context: the model never guesses about runtime state; tools provide it
  • Constrained reasoning: exactly one failure context is resolved and one rule bundle is injected
  • Closed verification: every proposed fix is validated by validate_and_apply_fix (the locked door) and re-run before it is accepted
  • Self-improving governance: learned fix patterns become rule proposals, are reviewed by humans, and are promoted only through the governed review workflow

Architecture

The system separates three concerns that most AI automation tools mix together. Mixing them is the most common failure mode.

┌──────────────────────────────────────────────────────────┐
│  Layer 3 — Governance  (.mdc rule files)                 │
│                                                          │
│  Behavioral contracts, workflow procedures, approval     │
│  gates. The AI executes these procedures — it doesn't    │
│  decide whether to follow them.                          │
│                                                          │
│  playwright-mcp.mdc · rule-evolution-review.mdc          │
└──────────────────────────────┬───────────────────────────┘
                               │ constrains
┌──────────────────────────────▼───────────────────────────┐
│  Layer 2 — Capability  (index.js — this MCP server)      │
│                                                          │
│  Neutral execution: run tests, read specs, analyze       │
│  failures, validate + apply fixes, propose rule updates. │
│                                                          │
│  LOCATOR_VIOLATION_RULES is enforced here as a hard      │
│  constraint (locked-door pattern) — not as a suggestion. │
└──────────────────────────────┬───────────────────────────┘
                               │ consults
┌──────────────────────────────▼───────────────────────────┐
│  Layer 1 — Knowledge  (playwright-test-standards.mdc)    │
│                                                          │
│  Locator priority, DSL conventions, evolved heuristics.  │
│  Guides reasoning — not enforced. Accumulates learned    │
│  rules over time via the rule evolution system.          │
└──────────────────────────────────────────────────────────┘

Key distinction: apply_approved_rules is intentionally not an MCP tool. It lives in rule-evolution-review.mdc as a governed procedure. Keeping it out of the automated executor prevents the system from modifying its own rule layer without human approval.

Closed Repair Loop

@AUT-xxx  (or "npx playwright test --grep @AUT-xxx")
    │
    ▼
resolve_spec_by_tag ──→ read_spec_file
    │                        │
    │                   (understand full test intent before running)
    │
    ▼
run_test_and_analyze_failure (attemptNumber: 1)
    │
    ├── passed: true  ──→ propose_rule_evolution ──→ PENDING queue
    │
    ├── shouldStop: true  ──→ escalate to human (attempt limit reached)
    │
    └── passed: false
            │
            ▼
       analyze_and_fix_selector
       (error message + screenshot path + spec context + rule bundle)
            │
            ▼
       validate_and_apply_fix  ←── LOCKED DOOR: enforces LOCATOR_VIOLATION_RULES
            │
            ├── error violations  ──→ regenerate fix, retry validate
            │
            └── ok
                    │
                    ▼
               run_test_and_analyze_failure (attemptNumber + 1)
               (loop until pass or shouldStop)
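
The loop in the diagram can be sketched as plain control flow. This is a minimal illustration, not the server's actual code: `tools` is a hypothetical object whose methods stand in for the MCP tools of the same name, `generateFix` stands in for the model producing a candidate patch, and the argument shapes, the `ok` / `violations` fields, and the cap of three regenerations are all assumptions.

```javascript
// Sketch of the closed repair loop. `tools` holds stand-ins for the MCP tools;
// `generateFix(guidance, violations)` stands in for the model (both hypothetical).
function runRepairLoop(tools, generateFix, tag) {
  const specPath = tools.resolve_spec_by_tag(tag);
  const spec = tools.read_spec_file(specPath); // understand intent before running

  let attemptNumber = 1;
  let appliedFix = null;
  for (;;) {
    const result = tools.run_test_and_analyze_failure({ tag, attemptNumber });
    if (result.passed) {
      // Assumption: a proposal is written only when a fix was actually applied.
      if (appliedFix) tools.propose_rule_evolution({ tag, fix: appliedFix });
      return { status: "passed", attempts: attemptNumber };
    }
    if (result.shouldStop) {
      return { status: "escalated", attempts: attemptNumber };
    }

    const guidance = tools.analyze_and_fix_selector({ result, spec });
    let fix = generateFix(guidance, []);
    let verdict = tools.validate_and_apply_fix({ specPath, fix });

    // Error-level violations block the write: regenerate and validate again.
    // The cap of 3 regenerations is an assumption of this sketch.
    let regenerations = 0;
    while (!verdict.ok && regenerations < 3) {
      fix = generateFix(guidance, verdict.violations);
      verdict = tools.validate_and_apply_fix({ specPath, fix });
      regenerations += 1;
    }
    if (!verdict.ok) return { status: "escalated", attempts: attemptNumber };

    appliedFix = fix;
    attemptNumber += 1; // re-run to verify the fix
  }
}
```

Because every tool is injected, the same function can be exercised with stubs, which makes the stop conditions easy to check in isolation.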

Rule Evolution Lifecycle

[Automated Loop]           [Human Review]           [Governance Workflow]
propose_rule_evolution  →  PENDING in queue  →  mark APPROVED / REJECTED
                                                          │
                                             "apply approved rules"
                                              triggers rule-evolution-review.mdc
                                                          │
                                             ┌────────────▼────────────┐
                                             │ writes to .mdc file     │
                                             │ removes from Pending    │
                                             │ appends to History Log  │
                                             └─────────────────────────┘

The "trainable parameters" are .mdc rule files — not model weights. The system improves without retraining anything.


Prerequisites

  • Node.js 18+
  • Cursor with MCP support
  • Playwright installed in your project (npm install -D @playwright/test)

Installation

Option A — npm (recommended)

# Install as a dev dependency in your Playwright project
npm install -D playwright-fixer-mcp

Option B — Clone from GitHub

git clone https://github.com/your-username/playwright-fixer-mcp.git
cd playwright-fixer-mcp
npm install

Setup

After installation, run the setup command to copy the Cursor rule templates into your project:

# From your Playwright project root
npx playwright-fixer-mcp setup

# With an explicit project root
npx playwright-fixer-mcp setup --project-root=/path/to/your/project

# Force overwrite existing rule files
npx playwright-fixer-mcp setup --force

This copies four files into .cursor/rules/ in your project:

| File | Layer | Purpose |
| --- | --- | --- |
| playwright-mcp.mdc | Governance | Workflow trigger, closed-loop procedure, hard constraints |
| playwright-test-standards.mdc | Knowledge | Locator priority, DSL conventions, evolved heuristics |
| rule-evolution-review.mdc | Governance | Rule promotion workflow (human-triggered) |
| rule-evolution-queue.md | Queue | Pending / history log for rule proposals |

Note: rule-evolution-queue.md is never overwritten if it already contains [PENDING] entries, even with --force. Your pending proposals are safe.
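
The overwrite behaviour described in the note can be pictured as a small decision function. The sketch below is an assumption about how such a check might look, not the setup command's real code: `shouldWriteTemplate` and its argument shape are hypothetical, and it assumes that without --force an existing file is left alone.

```javascript
// Hypothetical helper: decide whether setup may write one template file.
const QUEUE_FILE = "rule-evolution-queue.md";

function shouldWriteTemplate({ fileName, existingContent, force }) {
  // existingContent is null when the file does not exist yet.
  if (existingContent === null) return true;
  // Pending proposals are protected regardless of --force.
  if (fileName === QUEUE_FILE && existingContent.includes("[PENDING]")) {
    return false;
  }
  // Assumption: existing files are only replaced when --force is given.
  return Boolean(force);
}
```

The queue check comes before the --force check on purpose, so that no flag combination can discard proposals that are still awaiting review.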


Cursor MCP Configuration

Add the server to your Cursor MCP config: create .cursor/mcp.json in your project root (or add it under Cursor → Settings → MCP).

If installed as dev dependency

{
  "mcpServers": {
    "playwright-fixer": {
      "command": "node",
      "args": ["./node_modules/playwright-fixer-mcp/index.js"]
    }
  }
}

If cloned locally

{
  "mcpServers": {
    "playwright-fixer": {
      "command": "node",
      "args": ["/absolute/path/to/playwright-fixer-mcp/index.js"]
    }
  }
}

Restart Cursor after adding the configuration. Verify the server appears under MCP tools.


Usage

Running a Test by Tag

Type a test tag in the Cursor chat — the closed-loop repair activates automatically:

@AUT-589-1

or

npx playwright test --grep @AUT-589-1

The system will:

  1. Find the spec file containing @AUT-589-1 (via resolve_spec_by_tag)
  2. Read the full spec + page object to understand test intent (via read_spec_file)
  3. Run the test (via run_test_and_analyze_failure)
  4. If it fails: collect screenshot + error → analyze → generate fix → validate against locator rules → apply → re-run
  5. If it passes: propose the learned fix pattern as a rule update (propose_rule_evolution)

The loop makes at most 2 attempts before stopping and escalating to human review.
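
Step 1 above, finding the spec that contains a tag, is the kind of lookup that can be done without any model involvement. The following sketch illustrates one way to do it over an in-memory map of spec sources; the function names and the input shape are hypothetical (the real tool scans tests/ on disk), and throwing when the match is not unique is an assumption of this sketch.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// `specs` maps a spec path to its source text (hypothetical input shape).
function findSpecsByTag(specs, tag) {
  // The lookahead keeps @AUT-589-1 from matching inside @AUT-589-10.
  const pattern = new RegExp(escapeRegExp(tag) + "(?![\\w-])");
  return Object.keys(specs).filter((path) => pattern.test(specs[path]));
}

// Fail loudly instead of guessing when the tag is missing or ambiguous.
function resolveSpecByTag(specs, tag) {
  const matches = findSpecsByTag(specs, tag);
  if (matches.length !== 1) {
    throw new Error(`Expected exactly one spec for ${tag}, found ${matches.length}`);
  }
  return matches[0];
}
```

The boundary check matters for tags that share a prefix: a plain substring search for @AUT-589-1 would also hit a spec tagged @AUT-589-10.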

Reviewing and Promoting Rule Proposals

After a successful auto-fix, a PENDING entry is written to .cursor/rules/rule-evolution-queue.md.

To review and promote:

  1. Open .cursor/rules/rule-evolution-queue.md
  2. Read the proposed rule under ## Pending (awaiting review)
  3. Change <!-- APPROVED | REJECTED | APPLIED --> to either <!-- APPROVED --> or <!-- REJECTED -->
  4. Tell Cursor: "apply approved rules"

The rule-evolution-review.mdc governance workflow activates:

  • Appends approved rules to the target .mdc file
  • Removes the entry from the Pending section
  • Writes a permanent record to the History Log

Hard constraint: The AI cannot self-approve. Only entries the human has explicitly marked are processed.
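
The hard constraint can be illustrated by how a decision marker might be read. This is only a sketch: the promotion itself is a Cursor rule procedure rather than code, the function names are hypothetical, and the exact entry layout in the queue file is assumed. The point it shows is that the untouched template marker (all options separated by "|") does not count as a decision.

```javascript
// Returns "APPROVED" or "REJECTED" when the human set an explicit marker,
// otherwise null. The entry layout is an assumption of this sketch.
function readDecision(entryText) {
  const marker = entryText.match(/<!--\s*([A-Z |]+?)\s*-->/);
  if (!marker) return null;
  const value = marker[1];
  return value === "APPROVED" || value === "REJECTED" ? value : null;
}

// Split queue entries into those with an explicit decision and the rest.
function partitionQueue(entries) {
  const result = { approved: [], rejected: [], untouched: [] };
  for (const entry of entries) {
    const decision = readDecision(entry);
    if (decision === "APPROVED") result.approved.push(entry);
    else if (decision === "REJECTED") result.rejected.push(entry);
    else result.untouched.push(entry); // stays in Pending
  }
  return result;
}
```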


Project Structure (after setup)

your-project/
├── .cursor/
│   ├── mcp.json                             ← Cursor MCP server config
│   └── rules/
│       ├── playwright-mcp.mdc               ← Governance: workflow + hard constraints
│       ├── playwright-test-standards.mdc    ← Knowledge: locators, DSL, evolved rules
│       ├── rule-evolution-review.mdc        ← Governance: rule promotion procedure
│       └── rule-evolution-queue.md          ← Queue: pending/history rule proposals
├── tests/
│   └── **/*.spec.js                         ← Your Playwright specs (tagged @AUT-xxx)
├── node_modules/
│   └── playwright-fixer-mcp/
│       └── index.js                         ← MCP server (Capability layer)
└── package.json

Test Tag Format

Every test must use the @AUT-xxx tag format for the system to locate and run it:

import { test } from '@playwright/test';
import helperFunctions from '../helpers.js';
import PageObjectName from '../../pageObjects/PageObjectName.js';

test("description @BaseCase @PageName @testCaseName @AUT-xxx", async () => {
  const { browser, context, page } = await helperFunctions.setup_Backgound_Step();
  const pageObj = new PageObjectName(page);
  await helperFunctions.given_A_Page(page, PageObjectName);
  await helperFunctions.click(page, pageObj.someButton);
  await helperFunctions.check_Element_Contains_Text(page, pageObj.result, 'expected text');
  await browser.close();
});

Always use helperFunctions — direct page.click() / page.fill() calls bypass the failure normalization layer that the CONTEXT_RESOLVERS depend on.


Failure Normalization

The system uses a DSL layer (helperFunctions) to convert raw Playwright errors into semantic signals:

// Raw Playwright error (no semantic information for the system):
// Timeout 5000ms exceeded

// Normalized error from helperFunctions (classifiable):
throw new Error(`Element "${elementName}" not found with selector: ${selector}`);

This is how CONTEXT_RESOLVERS deterministically classifies failures into hover, fill, iframe, or default — without LLM inference.
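
A rough sketch of both halves, normalization and classification, is given here. The detection patterns, the `EXAMPLE_RESOLVERS` list, and the function names are hypothetical; only the message format and the { contextKey, test } entry shape follow this README. The real patterns live in index.js.

```javascript
// Build the normalized error in the format shown above, keeping the raw
// Playwright error as `cause` so no information is lost.
function toNormalizedError(rawError, elementName, selector) {
  return new Error(
    `Element "${elementName}" not found with selector: ${selector}`,
    { cause: rawError }
  );
}

// Hypothetical detection patterns, for illustration only.
const EXAMPLE_RESOLVERS = [
  { contextKey: "iframe", test: (msg) => /iframe|frameLocator/i.test(msg || "") },
  { contextKey: "hover", test: (msg) => /\bhover\b/i.test(msg || "") },
  { contextKey: "fill", test: (msg) => /\bfill\b/i.test(msg || "") },
];

// First match wins; anything unmatched falls back to "default".
function resolveFailureContext(message, resolvers = EXAMPLE_RESOLVERS) {
  const hit = resolvers.find((resolver) => resolver.test(message));
  return hit ? hit.contextKey : "default";
}
```

Since classification is plain pattern matching over the message, the same input always yields the same context key, which is what makes the selected rule bundle predictable.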


Locator Rules (Enforced)

validate_and_apply_fix is the only valid write path to spec files. It enforces these constraints before writing:

| Rule ID | Severity | Description |
| --- | --- | --- |
| NO_XPATH | error | XPath locators are blocked. Use getByRole, getByLabel, or getByText. |
| IFRAME_USE_FRAME_LOCATOR | error | In iframe context, page.locator() is blocked; must use frame.locator(). |
| CSS_CLASS_SELECTOR | warn | CSS class selectors are warned. Prefer semantic locators. |

Error-level violations block the write and return the violation list. The model must regenerate a compliant fix and call validate_and_apply_fix again.
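
The locked-door behaviour can be sketched as a validation step that always runs before the write. The regular expressions below are hypothetical stand-ins (the real LOCATOR_VIOLATION_RULES live in index.js), `writeSpec` is an injected placeholder for the file write, and the context-dependent IFRAME_USE_FRAME_LOCATOR rule is left out for brevity.

```javascript
// Hypothetical patterns for illustration; rule shape follows this README.
const EXAMPLE_RULES = [
  {
    id: "NO_XPATH",
    severity: "error",
    test: (code) => /xpath=|locator\(\s*['"`]\/\//.test(code),
    message: "XPath locators are blocked. Use getByRole, getByLabel, or getByText.",
  },
  {
    id: "CSS_CLASS_SELECTOR",
    severity: "warn",
    test: (code) => /locator\(\s*['"`]\.[\w-]+/.test(code),
    message: "CSS class selectors are warned. Prefer semantic locators.",
  },
];

// Collect every violated rule; only error-level violations block the write.
function validateFix(code, rules = EXAMPLE_RULES) {
  const violations = rules
    .filter((rule) => rule.test(code))
    .map(({ id, severity, message }) => ({ id, severity, message }));
  const blocked = violations.some((v) => v.severity === "error");
  return { ok: !blocked, violations };
}

// The write happens only behind the check; there is no other path to writeSpec.
function validateAndApplyFix(code, writeSpec, rules = EXAMPLE_RULES) {
  const verdict = validateFix(code, rules);
  if (verdict.ok) writeSpec(code);
  return verdict;
}
```

Warnings are returned alongside a successful write, so the model and the reviewer still see them even though they do not block.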

Locator Priority (highest → lowest)

1. getByRole('button', { name: '...' })   ← preferred
2. getByLabel('...')
3. getByPlaceholder('...')
4. getByText('...')
5. [data-testid] / [data-qa]
6. CSS class selectors                     ← warn
7. XPath                                   ← blocked
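
To make the ordering concrete, here is a small sketch that ranks candidate locator snippets by the tiers listed above and picks the most preferred one. The tier patterns and function names are hypothetical and deliberately simple; they are not how the server itself evaluates locators.

```javascript
// Ordered from most to least preferred; patterns are illustrative only.
const LOCATOR_TIERS = [
  { tier: 1, test: (s) => /getByRole\(/.test(s) },
  { tier: 2, test: (s) => /getByLabel\(/.test(s) },
  { tier: 3, test: (s) => /getByPlaceholder\(/.test(s) },
  { tier: 4, test: (s) => /getByText\(/.test(s) },
  { tier: 5, test: (s) => /\[data-(testid|qa)/.test(s) },
  { tier: 6, test: (s) => /['"`]\.[\w-]+/.test(s) }, // CSS class selector
  { tier: 7, test: (s) => /xpath=|['"`]\/\//.test(s) }, // XPath
];

// Returns the tier number of a snippet, or null when nothing matches.
function rankLocator(snippet) {
  const hit = LOCATOR_TIERS.find((entry) => entry.test(snippet));
  return hit ? hit.tier : null;
}

// Picks the candidate with the lowest (most preferred) tier.
function pickPreferred(candidates) {
  const tierOf = (s) => rankLocator(s) ?? Number.POSITIVE_INFINITY;
  return [...candidates].sort((a, b) => tierOf(a) - tierOf(b))[0];
}
```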

Tool Reference

Automated Loop Tools (Layer 2 — Capability)

| Tool | Description |
| --- | --- |
| run_test_and_analyze_failure | Run npx playwright test --grep @<tag>. Returns passed, resolvedContext, error artifacts. Enforces retry stop-loss (max 2 attempts before shouldStop: true). |
| analyze_and_fix_selector | Build a fix suggestion payload from error message + screenshot path + resolved failure context + rule bundle. Returns structured guidance for the model. |
| validate_and_apply_fix | Validate proposed code changes against LOCATOR_VIOLATION_RULES, then write to spec. The only valid write path. Returns violations if blocked. |
| resolve_spec_by_tag | Scan tests/ to find the spec file containing a given @AUT-xxx tag. Deterministic; never guess. |
| read_spec_file | Read the full spec + auto-resolve its imported page object. Provides complete test context before analysis. |
| read_page_object_selectors | Extract getter → elementName → selector mapping from a page object file. |
| get_failure_artifacts | Scan test-results/ for the latest screenshot and trace.zip. Optionally filter by tag. |
| get_iframe_context | Extract iframe selector and frame-related operations from a spec + page object. |
| propose_rule_evolution | Write a learned fix pattern as a PENDING proposal to the rule evolution queue. Does not modify any .mdc file; human approval required. |

Reference / Info Tools

| Tool | Description |
| --- | --- |
| get_playwright_fix_workflow | Return the full fix workflow reference (read spec → run → fail → fix → verify → evolve). |
| get_internal_locator_rules | Return the locator priority rules. |
| get_tag_run_rule | Return the tag trigger rule (when @AUT-xxx triggers the closed loop). |

Not an MCP Tool (by design)

apply_approved_rules — This is a governance operation triggered by human intent ("apply approved rules"), not by the automated loop. It lives in rule-evolution-review.mdc as a Cursor rule procedure. Keeping it out of index.js enforces the architectural boundary: the executor cannot promote its own rule proposals.


Customization

Adding a New Failure Context

Edit index.js to add a new entry to both CONTEXT_RESOLVERS and FAILURE_CONTEXT_MAP:

// In CONTEXT_RESOLVERS — detection pattern
const CONTEXT_RESOLVERS = [
  // ...existing entries...
  {
    contextKey: "select",
    test: (msg) => /selectOption|dropdown|ant-select/i.test(msg || ""),
  },
];

// In FAILURE_CONTEXT_MAP — instructions and rule blocks for this context
const FAILURE_CONTEXT_MAP = {
  // ...existing entries...
  select: {
    instruction: "3. **This error relates to a dropdown/select**: use `getByRole('option')` or `page.selectOption()`.\n",
    extraRuleBlocks: [],
  },
};

No other code changes required. The resolver runs first-match-wins.

Adding a Custom Locator Violation Rule

const LOCATOR_VIOLATION_RULES = [
  // ...existing rules...
  {
    id: "NO_DATA_TESTID_WHEN_ROLE_EXISTS",
    severity: "warn",
    test: (code) => /\[data-testid/i.test(code), // also matches [data-testid="..."], not only the bare [data-testid]
    message: "Prefer getByRole over data-testid when a semantic role exists (LOCATOR_RULES).",
  },
];

severity: "error" blocks the write. severity: "warn" allows it but surfaces the issue.

Adjusting the Retry Limit

const MAX_AUTO_RETRIES = 3; // stop at attempt 3; allows attempts 1 and 2

Increase for environments with higher test flakiness.
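
Read together with the comment on the constant, the stop-loss amounts to a single comparison. The helper below is a hypothetical sketch of that check, assuming shouldStop is derived only from the attempt number.

```javascript
const MAX_AUTO_RETRIES = 3; // stop at attempt 3; allows attempts 1 and 2

// Hypothetical helper: compute the stop-loss flag for a given attempt.
function shouldStopAt(attemptNumber, maxAutoRetries = MAX_AUTO_RETRIES) {
  return attemptNumber >= maxAutoRetries;
}
```

Under this reading, raising the constant to 4 would allow attempts 1 through 3 before escalation.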


Stage Evolution

This tool implements Stages 2–4 of the AI automation evolution path:

Stage 1 → Static rules in skill.md
           (AI reads knowledge before acting)

Stage 2 → MCP selects rule bundles by context     ← CONTEXT_RESOLVERS
           (System decides the framework)

Stage 3 → Closed verification loop                ← repair loop in playwright-mcp.mdc
           (Outputs become testable hypotheses)

Stage 4 → Rules evolve based on execution results ← rule-evolution-queue.md
           (System improves without retraining)

Stage 5 → Self-improving governed AI environment  ← you build this on top
           (The environment becomes the intelligence layer)

The "trainable parameters" are rule bundles in .mdc files — not model weights.


Architecture Articles

This tool was built based on the following series:

  1. Stop Prompting Your Way Out of Playwright Failures — The state reconstruction problem and closed-loop architecture
  2. The Environment Is the Prompt: Why MCP Rules Supersede Static Skill Files — Knowledge vs. constraints; why governance belongs in the environment
  3. Rules That Learn: How We Built a Self-Improving Test Governance System — The execution context separation; why apply_approved_rules is not an MCP tool
  4. The Three Layers of AI Automation Systems — Knowledge / Capability / Governance — and the cost of mixing them

License

MIT
