feat(verify): add optional Playwright-MCP automated UI verification (#1604)

When Playwright-MCP is available in the session, GSD UI verification steps can be automated by capturing screenshots and checking them against UI-SPEC.md instead of relying on manual checkbox review. The workflows fall back to the existing manual flow when Playwright-MCP is not configured. Closes #1420.
2026-04-25 17:25:23 +02:00 · 2026-04-03 12:40:47 -04:00
parent 8fce097222
commit 522860ceef
4 changed files with 145 additions and 0 deletions
--- a/agents/gsd-ui-auditor.md
+++ b/agents/gsd-ui-auditor.md
@@ -86,6 +86,46 @@ This gate runs unconditionally on every audit. The .gitignore ensures screenshot

 </gitignore_gate>

+<playwright_mcp_approach>
+
+## Automated Screenshot Capture via Playwright-MCP (preferred when available)
+
+Before attempting the CLI screenshot approach, check whether `mcp__playwright__*`
+tools are available in this session. If they are, use them instead of the CLI approach:
+
+```
+# Preferred: Playwright-MCP automated verification (exact tool names vary by server)
+# 1. Navigate to the component URL
+mcp__playwright__navigate(url="http://localhost:3000")
+
+# 2. Take desktop screenshot
+mcp__playwright__screenshot(name="desktop", width=1440, height=900)
+
+# 3. Take mobile screenshot
+mcp__playwright__screenshot(name="mobile", width=375, height=812)
+
+# 4. For specific components listed in UI-SPEC.md, navigate to each
+#    component route and capture targeted screenshots for comparison
+#    against the spec's stated dimensions, colors, and layout.
+
+# 5. Compare screenshots against UI-SPEC.md requirements:
+#    - Dimensions: Is component X width 70vw as specified?
+#    - Color: Is the accent color applied only on declared elements?
+#    - Layout: Are spacing values within the declared spacing scale?
+#    Report any visual discrepancies as automated findings.
+```
+
+**When Playwright-MCP is available:**
+- Use it for all screenshot capture (skip the CLI approach below)
+- Each UI checkpoint from UI-SPEC.md can be verified automatically
+- Discrepancies are reported as pillar findings with screenshot evidence
+- Items requiring subjective judgment are flagged as `needs_human_review: true`
+
+**When Playwright-MCP is NOT available:** fall back to the CLI screenshot approach
+below. Behavior is unchanged from the standard code-only audit path.
+
+</playwright_mcp_approach>
+
 <screenshot_approach>

 ## Screenshot Capture (CLI only — no MCP, no persistent browser)
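The availability check that the `<playwright_mcp_approach>` section asks for could be sketched roughly as follows. This is an illustration only, not code from this commit: `pickScreenshotApproach`, the `toolNames` array, and the way a session enumerates its tools are all assumptions. Only the `mcp__playwright__` prefix and the two viewport sizes come from the guidance above.

```javascript
// Hypothetical sketch: how a session exposes its tool list is an assumption.
const PLAYWRIGHT_MCP_PREFIX = 'mcp__playwright__';

// Viewports named in the guidance above (desktop 1440x900, mobile 375x812).
const VIEWPORTS = [
  { name: 'desktop', width: 1440, height: 900 },
  { name: 'mobile', width: 375, height: 812 },
];

// Returns 'playwright-mcp' when any tool carries the Playwright-MCP prefix,
// otherwise 'cli' (the unchanged CLI screenshot path).
function pickScreenshotApproach(toolNames) {
  const hasPlaywrightMcp = toolNames.some((name) =>
    name.startsWith(PLAYWRIGHT_MCP_PREFIX)
  );
  return hasPlaywrightMcp ? 'playwright-mcp' : 'cli';
}

// Builds one capture job per route and viewport for the MCP path.
function planCaptures(routes) {
  return routes.flatMap((route) =>
    VIEWPORTS.map((viewport) => ({ route, ...viewport }))
  );
}
```

Keeping the decision in one pure function means the fallback rule ("no Playwright-MCP tools, use the CLI path") can be checked without a browser.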
--- a/get-shit-done/workflows/ui-review.md
+++ b/get-shit-done/workflows/ui-review.md
@@ -146,6 +146,26 @@ Full review: {path to UI-REVIEW.md}
 ───────────────────────────────────────────────────────────────
 ```

+## Automated UI Verification (when Playwright-MCP is available)
+
+If `mcp__playwright__*` tools are accessible in this session:
+
+1. Navigate to each UI component described in the phase's UI-SPEC.md using
+   `mcp__playwright__navigate` (or equivalent Playwright-MCP tool).
+2. Take a screenshot of each component using `mcp__playwright__screenshot`.
+3. Compare against the spec's visual requirements — dimensions, color palette,
+   layout, spacing scale, and typography.
+4. Report any dimension, color, or layout discrepancies automatically as
+   additional findings within the relevant pillar section of UI-REVIEW.md.
+5. Flag items that require human judgment (brand feel, content tone) as
+   `needs_human_review: true` in the findings — these are surfaced to the user
+   separately after the automated pass completes.
+
+If Playwright-MCP is not available in this session, this section is skipped
+entirely. The audit falls back to the standard code-only review described above.
+No configuration change is required — the availability of `mcp__playwright__*`
+tools is detected at runtime.
+
 ## 5. Commit (if configured)

 ```bash
--- a/get-shit-done/workflows/verify-work.md
+++ b/get-shit-done/workflows/verify-work.md
@@ -86,6 +86,42 @@ Provide a phase number to start testing (e.g., /gsd:verify-work 4)
 Continue to `create_uat_file`.
 </step>

+<step name="automated_ui_verification">
+**Automated UI Verification (when Playwright-MCP is available)**
+
+Before running manual UAT, check whether this phase has a UI component and whether
+`mcp__playwright__*` or `mcp__puppeteer__*` tools are available in the current session.
+
+```bash
+UI_PHASE_FLAG=$(node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" config-get workflow.ui_phase --raw 2>/dev/null || echo "true")
+UI_SPEC_FILE=$(ls "${PHASE_DIR}"/*-UI-SPEC.md 2>/dev/null | head -1)
+```
+
+**If Playwright-MCP tools are available in this session (`mcp__playwright__*` tools
+respond to tool calls) AND (`UI_PHASE_FLAG` is `true` OR `UI_SPEC_FILE` is non-empty):**
+
+For each UI checkpoint listed in the phase's UI-SPEC.md (or inferred from SUMMARY.md):
+
+1. Use `mcp__playwright__navigate` (or equivalent) to open the component's URL.
+2. Use `mcp__playwright__screenshot` to capture a screenshot.
+3. Compare the screenshot visually against the spec's stated requirements
+   (dimensions, color, layout, spacing).
+4. Automatically mark checkpoints as **passed** or **needs review** based on the
+   visual comparison — no manual question required for items that clearly match.
+5. Flag items that require human judgment (subjective aesthetics, content accuracy)
+   and present only those as manual UAT questions.
+
+If automated verification is not available, fall back to the standard manual
+checkpoint questions defined in this workflow. This step is entirely
+conditional: if Playwright-MCP is not configured, behavior is unchanged.
+
+**Display summary line before proceeding:**
+```
+UI checkpoints: {N} auto-verified, {M} queued for manual review
+```
+
+</step>
+
 <step name="find_summaries">
 **Find what to test:**
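The gating condition and summary line from the `automated_ui_verification` step can be restated as a small sketch. Nothing here is part of the commit: `shouldAutoVerify` and `summaryLine` are hypothetical names, the inputs mirror `UI_PHASE_FLAG` and `UI_SPEC_FILE` as plain strings, and counting only `passed` checkpoints as auto-verified is an assumption about how the step tallies results.

```javascript
// Hypothetical sketch of: Playwright-MCP available AND
// (UI_PHASE_FLAG is "true" OR UI_SPEC_FILE is non-empty).
function shouldAutoVerify({ playwrightAvailable, uiPhaseFlag, uiSpecFile }) {
  const uiPhase = String(uiPhaseFlag).trim() === 'true';
  const hasSpec = Boolean(uiSpecFile && uiSpecFile.trim());
  return Boolean(playwrightAvailable) && (uiPhase || hasSpec);
}

// Assumption: a checkpoint is "auto-verified" only when its status is 'passed';
// everything else is queued for manual review.
function summaryLine(results) {
  const autoVerified = results.filter((r) => r.status === 'passed').length;
  const manual = results.length - autoVerified;
  return `UI checkpoints: ${autoVerified} auto-verified, ${manual} queued for manual review`;
}
```

For example, a session with Playwright-MCP, `workflow.ui_phase` set to `false`, and a spec file present still qualifies, because either UI signal is enough.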

--- a/tests/playwright-ui-verify.test.cjs
+++ b/tests/playwright-ui-verify.test.cjs
@@ -0,0 +1,49 @@
+const { test, describe } = require('node:test');
+const assert = require('node:assert');
+const fs = require('fs');
+const path = require('path');
+
+describe('Playwright-MCP UI verification integration', () => {
+  test('verify-work.md mentions automated UI verification', () => {
+    const content = fs.readFileSync(
+      path.join(__dirname, '..', 'get-shit-done', 'workflows', 'verify-work.md'), 'utf-8'
+    );
+    assert.ok(
+      content.toLowerCase().includes('playwright') || (content.includes('automated') && content.includes('UI')),
+      'verify-work.md should mention automated UI verification option'
+    );
+  });
+
+  test('ui-review.md mentions Playwright-MCP when available', () => {
+    const content = fs.readFileSync(
+      path.join(__dirname, '..', 'get-shit-done', 'workflows', 'ui-review.md'), 'utf-8'
+    );
+    assert.ok(
+      content.toLowerCase().includes('playwright'),
+      'ui-review.md should reference Playwright-MCP'
+    );
+  });
+
+  test('gsd-ui-auditor.md includes automated screenshot guidance', () => {
+    const content = fs.readFileSync(
+      path.join(__dirname, '..', 'agents', 'gsd-ui-auditor.md'), 'utf-8'
+    );
+    assert.ok(
+      content.includes('mcp__playwright') && content.toLowerCase().includes('screenshot'),
+      'gsd-ui-auditor.md should mention automated screenshot verification'
+    );
+  });
+
+  test('automated verification is optional/conditional (falls back to manual)', () => {
+    const verifyContent = fs.readFileSync(
+      path.join(__dirname, '..', 'get-shit-done', 'workflows', 'verify-work.md'), 'utf-8'
+    );
+    // Must include a fallback / "if available" conditional
+    const hasConditional =
+      verifyContent.includes('if available') ||
+      verifyContent.includes('when available') ||
+      verifyContent.includes('if Playwright') ||
+      verifyContent.includes('fall back');
+    assert.ok(hasConditional, 'Playwright integration must be conditional with manual fallback');
+  });
+});
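The assertions above match substrings anywhere in each file, so unrelated text can satisfy them. A tighter variant might scope the check to the new step itself. The helpers below are a hypothetical sketch rather than part of this commit, and they assume the step is delimited by `<step name="automated_ui_verification">` and the next `</step>`, as in the verify-work.md hunk.

```javascript
// Hypothetical helper: returns the text between <step name="..."> and the
// next </step>, or null when the step is missing or unterminated.
function extractStep(markdown, stepName) {
  const openTag = `<step name="${stepName}">`;
  const start = markdown.indexOf(openTag);
  if (start === -1) return null;
  const end = markdown.indexOf('</step>', start);
  if (end === -1) return null;
  return markdown.slice(start + openTag.length, end);
}

// True only when the automated step both references the Playwright-MCP tools
// and states a manual fallback.
function stepHasManualFallback(markdown) {
  const step = extractStep(markdown, 'automated_ui_verification');
  return (
    step !== null &&
    step.includes('mcp__playwright__') &&
    /fall back/i.test(step)
  );
}
```

A test could then read verify-work.md with `fs.readFileSync` and assert `stepHasManualFallback(content)`, which fails if the fallback wording only appears elsewhere in the file.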