Generate unit and E2E tests for a completed phase based on its SUMMARY.md, CONTEXT.md, and implementation. Classifies each changed file into TDD (unit), E2E (browser), or Skip categories, presents a test plan for user approval, then generates tests following RED-GREEN conventions.

Users currently hand-craft /gsd-quick prompts for test generation after each phase. This workflow standardizes the process with proper classification, quality gates, and gap reporting.

<required_reading> Read all files referenced by the invoking prompt's execution_context before starting. </required_reading>

Parse `$ARGUMENTS` for:

  • Phase number (integer, decimal, or letter-suffix) → store as `$PHASE_ARG`
  • Remaining text after the phase number → store as `$EXTRA_INSTRUCTIONS` (optional)

Example: `/gsd-add-tests 12 focus on edge cases` → `$PHASE_ARG=12`, `$EXTRA_INSTRUCTIONS="focus on edge cases"`
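
For illustration only, a TypeScript sketch of this parsing rule (the function name and regex are assumptions, not part of the workflow tooling):

    // Hypothetical sketch: split $ARGUMENTS into a phase token and trailing instructions.
    // Accepts integer ("12"), decimal ("12.1"), or letter-suffixed ("12a") phase numbers.
    function parseArguments(args: string): { phaseArg: string; extraInstructions: string } | null {
      const match = args.trim().match(/^(\d+(?:\.\d+)?[a-zA-Z]?)\s*(.*)$/);
      if (!match) return null; // no phase number: caller prints the usage error and exits
      return { phaseArg: match[1], extraInstructions: match[2] };
    }

    // parseArguments("12 focus on edge cases")
    //   → { phaseArg: "12", extraInstructions: "focus on edge cases" }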

If no phase argument provided:

ERROR: Phase number required
Usage: /gsd-add-tests <phase> [additional instructions]
Example: /gsd-add-tests 12
Example: /gsd-add-tests 12 focus on edge cases in the pricing module

Exit.

Load phase operation context:
INIT=$(node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" init phase-op "${PHASE_ARG}")
if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi

Extract from init JSON: phase_dir, phase_number, phase_name.

Verify the phase directory exists. If not:

ERROR: Phase directory not found for phase ${PHASE_ARG}
Ensure the phase exists in .planning/phases/

Exit.

Read the phase artifacts (in order of priority):

  1. ${phase_dir}/*-SUMMARY.md — what was implemented, files changed
  2. ${phase_dir}/CONTEXT.md — acceptance criteria, decisions
  3. ${phase_dir}/*-VERIFICATION.md — user-verified scenarios (if UAT was done)

If no SUMMARY.md exists:

ERROR: No SUMMARY.md found for phase ${PHASE_ARG}
This command works on completed phases. Run /gsd-execute-phase first.

Exit.

Present banner:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 GSD ► ADD TESTS — Phase ${phase_number}: ${phase_name}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Extract the list of files modified by the phase from SUMMARY.md ("Files Changed" or equivalent section).

For each file, classify into one of three categories:

| Category | Criteria | Test Type |
|----------|----------|-----------|
| TDD      | Pure functions where `expect(fn(input)).toBe(output)` is writable | Unit tests |
| E2E      | UI behavior verifiable by browser automation | Playwright/E2E tests |
| Skip     | Not meaningfully testable or already covered | None |

TDD classification — apply when:

  • Business logic: calculations, pricing, tax rules, validation
  • Data transformations: mapping, filtering, aggregation, formatting
  • Parsers: CSV, JSON, XML, custom format parsing
  • Validators: input validation, schema validation, business rules
  • State machines: status transitions, workflow steps
  • Utilities: string manipulation, date handling, number formatting

E2E classification — apply when:

  • Keyboard shortcuts: key bindings, modifier keys, chord sequences
  • Navigation: page transitions, routing, breadcrumbs, back/forward
  • Form interactions: submit, validation errors, field focus, autocomplete
  • Selection: row selection, multi-select, shift-click ranges
  • Drag and drop: reordering, moving between containers
  • Modal dialogs: open, close, confirm, cancel
  • Data grids: sorting, filtering, inline editing, column resize

Skip classification — apply when:

  • UI layout/styling: CSS classes, visual appearance, responsive breakpoints
  • Configuration: config files, environment variables, feature flags
  • Glue code: dependency injection setup, middleware registration, routing tables
  • Migrations: database migrations, schema changes
  • Simple CRUD: basic create/read/update/delete with no business logic
  • Type definitions: records, DTOs, interfaces with no logic

Read each file to verify classification. Don't classify based on filename alone.

Present the classification to the user for confirmation before proceeding:

Text mode (`workflow.text_mode: true` in config or the `--text` flag): Set TEXT_MODE=true if `--text` is present in `$ARGUMENTS` or text_mode from the init JSON is true. When TEXT_MODE is active, replace every AskUserQuestion call with a plain-text numbered list and ask the user to type the number of their choice (an illustrative rendering follows the AskUserQuestion block below). This is required on non-Claude runtimes (OpenAI Codex, Gemini CLI, etc.), where AskUserQuestion is not available.

AskUserQuestion(
  header: "Test Classification",
  question: |
    ## Files classified for testing

    ### TDD (Unit Tests) — {N} files
    {list of files with brief reason}

    ### E2E (Browser Tests) — {M} files
    {list of files with brief reason}

    ### Skip — {K} files
    {list of files with brief reason}

    {if $EXTRA_INSTRUCTIONS: "Additional instructions: ${EXTRA_INSTRUCTIONS}"}

    How would you like to proceed?
  options:
    - "Approve and generate test plan"
    - "Adjust classification (I'll specify changes)"
    - "Cancel"
)

If user selects "Adjust classification": apply their changes and re-present. If user selects "Cancel": exit gracefully.
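
When TEXT_MODE is active, the options portion of this question is presented as a plain-text numbered list instead, for example:

    How would you like to proceed?

      1. Approve and generate test plan
      2. Adjust classification (I'll specify changes)
      3. Cancel

    Type the number of your choice.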

Before generating the test plan, discover the project's existing test structure:
# Find existing test directories
find . -type d \( -name "*test*" -o -name "*spec*" -o -name "*__tests__*" \) 2>/dev/null | head -20
# Find existing test files for convention matching
find . -type f \( -name "*.test.*" -o -name "*.spec.*" -o -name "*Tests.fs" -o -name "*Test.fs" \) 2>/dev/null | head -20
# Check for test runners
ls package.json *.sln 2>/dev/null || true

Identify:

  • Test directory structure (where unit tests live, where E2E tests live)
  • Naming conventions (.test.ts, .spec.ts, *Tests.fs, etc.)
  • Test runner commands (how to execute unit tests, how to execute E2E tests)
  • Test framework (xUnit, NUnit, Jest, Playwright, etc.)
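
To help with test-runner discovery in JavaScript/TypeScript projects, a small Node/TypeScript sketch that inspects package.json scripts for likely commands (the script names checked here are heuristic assumptions):

    import { readFileSync } from "node:fs";

    // Heuristic sketch: look at package.json "scripts" for test-like entries.
    // Projects vary, so treat the result as a suggestion to confirm with the user.
    function discoverTestCommands(pkgPath = "package.json"): { unit?: string; e2e?: string } {
      const pkg = JSON.parse(readFileSync(pkgPath, "utf8"));
      const scripts: Record<string, string> = pkg.scripts ?? {};
      const unitKey = ["test", "test:unit", "unit"].find((k) => k in scripts);
      const e2eKey = ["test:e2e", "e2e", "playwright"].find((k) => k in scripts);
      return {
        unit: unitKey ? `npm run ${unitKey}` : undefined,
        e2e: e2eKey ? `npm run ${e2eKey}` : undefined,
      };
    }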

If test structure is ambiguous, ask the user:

AskUserQuestion(
  header: "Test Structure",
  question: "I found multiple test locations. Where should I create tests?",
  options: [list discovered locations]
)
For each approved file, create a detailed test plan.

For TDD files, plan tests following RED-GREEN-REFACTOR:

  1. Identify testable functions/methods in the file
  2. For each function: list input scenarios, expected outputs, edge cases
  3. Note: since code already exists, tests may pass immediately — that's OK, but verify they test the RIGHT behavior

For E2E files, plan tests following RED-GREEN gates:

  1. Identify user scenarios from CONTEXT.md/VERIFICATION.md
  2. For each scenario: describe the user action, expected outcome, assertions
  3. Note: RED gate means confirming the test would fail if the feature were broken

Present the complete test plan:

AskUserQuestion(
  header: "Test Plan",
  question: |
    ## Test Generation Plan

    ### Unit Tests ({N} tests across {M} files)
    {for each file: test file path, list of test cases}

    ### E2E Tests ({P} tests across {Q} files)
    {for each file: test file path, list of test scenarios}

    ### Test Commands
    - Unit: {discovered test command}
    - E2E: {discovered e2e command}

    Ready to generate?
  options:
    - "Generate all"
    - "Cherry-pick (I'll specify which)"
    - "Adjust plan"
)

If "Cherry-pick": ask user which tests to include. If "Adjust plan": apply changes and re-present.

For each approved TDD test:
  1. Create test file following discovered project conventions (directory, naming, imports)

  2. Write the test with a clear arrange/act/assert structure (a hedged example follows this list):

    // Arrange — set up inputs and expected outputs
    // Act — call the function under test
    // Assert — verify the output matches expectations
    
  3. Run the test:

    {discovered test command}
    
  4. Evaluate result:

    • Test passes: Good — the implementation satisfies the test. Verify the test checks meaningful behavior (not just that it compiles).
    • Test fails with assertion error: This may be a genuine bug discovered by the test. Flag it:
      ⚠️ Potential bug found: {test name}
      Expected: {expected}
      Actual: {actual}
      File: {implementation file}
      
      Do NOT fix the implementation — this is a test-generation command, not a fix command. Record the finding.
    • Test fails with error (import, syntax, etc.): This is a test error. Fix the test and re-run.
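
As an illustration only, a generated unit test in this shape might look like the following Jest-style TypeScript sketch (the `calculateLineTotal` module and its behavior are hypothetical, not taken from any real phase):

    import { calculateLineTotal } from "../src/pricing/calculateLineTotal"; // hypothetical module

    describe("calculateLineTotal", () => {
      it("applies the quantity discount above the threshold", () => {
        // Arrange — set up inputs and expected outputs
        const input = { unitPrice: 10, quantity: 12, discountThreshold: 10, discountRate: 0.1 };
        const expected = 108; // 12 * 10, minus the 10% discount

        // Act — call the function under test
        const actual = calculateLineTotal(input);

        // Assert — verify the output matches expectations
        expect(actual).toBe(expected);
      });
    });
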
For each approved E2E test:
  1. Check for existing tests covering the same scenario:

    grep -r "{scenario keyword}" {e2e test directory} 2>/dev/null || true
    

    If found, extend rather than duplicate.

  2. Create a test file targeting the user scenario from CONTEXT.md/VERIFICATION.md (see the sketch below)

  3. Run the E2E test:

    {discovered e2e command}
    
  4. Evaluate result:

    • GREEN (passes): Record success
    • RED (fails): Determine if it's a test issue or a genuine application bug. Flag bugs:
      ⚠️ E2E failure: {test name}
      Scenario: {description}
      Error: {error message}
      
    • Cannot run: Report blocker. Do NOT mark as complete.
      🛑 E2E blocker: {reason tests cannot run}
      

No-skip rule: If E2E tests cannot execute (missing dependencies, environment issues), report the blocker and mark the test as incomplete. Never mark success without actually running the test.
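
For illustration, an E2E test targeting such a scenario might look like this Playwright-style TypeScript sketch (the route, field label, and message text are hypothetical):

    import { test, expect } from "@playwright/test";

    // Hypothetical scenario: submitting the pricing form with an invalid quantity
    // shows a validation error message.
    test("shows a validation error for an invalid quantity", async ({ page }) => {
      await page.goto("/pricing");                  // hypothetical route
      await page.getByLabel("Quantity").fill("-1"); // hypothetical field label
      await page.getByRole("button", { name: "Submit" }).click();

      await expect(page.getByText("Quantity must be at least 1")).toBeVisible();
    });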

Create a test coverage report and present to user:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 GSD ► TEST GENERATION COMPLETE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

## Results

| Category | Generated | Passing | Failing | Blocked |
|----------|-----------|---------|---------|---------|
| Unit     | {N}       | {n1}    | {n2}    | {n3}    |
| E2E      | {M}       | {m1}    | {m2}    | {m3}    |

## Files Created/Modified
{list of test files with paths}

## Coverage Gaps
{areas that couldn't be tested and why}

## Bugs Discovered
{any assertion failures that indicate implementation bugs}

Record test generation in project state:

node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state-snapshot

If there are passing tests to commit:

git add {test files}
git commit -m "test(phase-${phase_number}): add unit and E2E tests from add-tests command"

Present next steps:

---

## ▶ Next Up

{if bugs discovered:}
**Fix discovered bugs:** `/gsd-quick fix the {N} test failures discovered in phase ${phase_number}`

{if blocked tests:}
**Resolve test blockers:** {description of what's needed}

{otherwise:}
**All tests passing!** Phase ${phase_number} is fully tested.

---

**Also available:**
- `/gsd-add-tests {next_phase}` — test another phase
- `/gsd-verify-work {phase_number}` — run UAT verification

---

<success_criteria>

  • Phase artifacts loaded (SUMMARY.md, CONTEXT.md, optionally VERIFICATION.md)
  • All changed files classified into TDD/E2E/Skip categories
  • Classification presented to user and approved
  • Project test structure discovered (directories, conventions, runners)
  • Test plan presented to user and approved
  • TDD tests generated with arrange/act/assert structure
  • E2E tests generated targeting user scenarios
  • All tests executed — no untested tests marked as passing
  • Bugs discovered by tests flagged (not fixed)
  • Test files committed with proper message
  • Coverage gaps documented
  • Next steps presented to user

</success_criteria>