refactor(workflows): extract discuss-phase modes/templates/advisor for progressive disclosure (closes #2551) (#2607)

* refactor(workflows): extract discuss-phase modes/templates/advisor for progressive disclosure (closes #2551)

  Splits 1,347-line workflows/discuss-phase.md into a 495-line dispatcher plus per-mode files in workflows/discuss-phase/modes/ and templates in workflows/discuss-phase/templates/. Mirrors the progressive-disclosure pattern that #2361 enforced for agents.

  - Per-mode files: power, all, auto, chain, text, batch, analyze, default, advisor
  - Templates lazy-loaded at the step that produces the artifact (CONTEXT.md template at write_context, DISCUSSION-LOG.md template at git_commit, checkpoint.json schema when checkpointing)
  - Advisor mode gated behind `[ -f $HOME/.claude/get-shit-done/USER-PROFILE.md ]`, the inverse of #2174's --advisor flag (don't pay the cost when unused)
  - scout_codebase phase-type→map selection table extracted to references/scout-codebase.md
  - New tests/workflow-size-budget.test.cjs enforces tiered budgets across all workflows/*.md (XL=1700 / LARGE=1500 / DEFAULT=1000) plus the explicit <500 ceiling for discuss-phase.md per #2551
  - Existing tests updated to read from the new file locations after the split (functional equivalence preserved: content moved, not removed)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2607): align modes/auto.md check_existing with parent (Update it, not Skip)

  CodeRabbit flagged drift between the parent step (which auto-selects "Update it") and modes/auto.md (which documented "Skip"). The pre-refactor file had both: line 182 said "Skip" in the overview, line 250 said "Update it" in the actual step. The step is authoritative. Fix the new mode file to match.

  Refs: PR #2607 review comment 3127783430

* test(#2607): harden discuss-phase regression tests after #2551 split

  CodeRabbit identified four test smells where the split weakened coverage:

  - workflow-size-budget: assertion was unreachable (entered if-block on match, then asserted occurrences === 0, which always failed). Now unconditional.
  - bug-2549-2550-2552: bounded-read assertion checked concatenated source, so src.includes('3') was satisfied by unrelated content in scout-codebase.md (e.g., "3-5 most relevant files"). Now reads parent only with a stricter regex. Also asserts SCOUT_REF exists.
  - chain-flag-plan-phase: filter(existsSync) silently skipped a missing modes/chain.md. Now fails loudly via explicit asserts.
  - discuss-checkpoint: same silent-filter pattern across three sources. Now asserts each required path before reading.

  Refs: PR #2607 review comments 3127783457, 3127783452, plus nitpicks for chain-flag-plan-phase.test.cjs:21-24 and discuss-checkpoint.test.cjs:22-27

* docs(#2607): fix INVENTORY count, context.md placeholders, scout grep portability

  - INVENTORY.md: subdirectory note said "50 top-level references" but the section header now says 51. Updated to 51.
  - templates/context.md: footer hardcoded XX-name instead of declared placeholders [X]/[Name], which would leak sample text into generated CONTEXT.md files. Now uses the declared placeholders.
  - references/scout-codebase.md: no-maps fallback used grep -rl with "\\|" alternation (GNU grep only; silent on BSD/macOS grep). Switched to grep -rlE with extended regex for portability.

  Refs: PR #2607 review comments 3127783404, 3127783448, plus nitpick for scout-codebase.md:32-40

* docs(#2607): label fenced examples + clarify overlay/advisor precedence

  - analyze.md / text.md / default.md: add language tags (markdown/text) to fenced example blocks to silence markdownlint MD040 warnings flagged by CodeRabbit (one fence in analyze.md, two in text.md, five in default.md).
  - discuss-phase.md: document overlay stacking rules in discuss_areas: fixed outer→inner order --analyze → --batch → --text, with a pointer to each overlay file for mode-specific precedence.
  - advisor.md: add tie-breaker rules for NON_TECHNICAL_OWNER signals: explicit technical_background overrides inferred signals; otherwise OR-aggregate; contradictory explanation_depth values resolve by most-recent-wins.

  Refs: PR #2607 review comments 3127783415, 3127783437, plus nitpicks for default.md:24, discuss-phase.md:345-365, and advisor.md:51-56

* fix(#2607): extract codebase_drift_gate body to keep execute-phase under XL budget

  PR #2605 added 80 lines to execute-phase.md (1622 -> 1702), pushing it over the XL_BUDGET=1700 line cap enforced by tests/workflow-size-budget.test.cjs (introduced by this PR). Per the test's own remediation hint and #2551's progressive-disclosure pattern, extract the codebase_drift_gate step body to get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md and leave a brief pointer in the workflow. execute-phase.md is now 1633 lines. Budget is NOT relaxed; the offending workflow is tightened.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:25:23 +02:00 · 2026-04-22 21:57:24 -04:00
parent 220da8e487
commit 41dc475c46
27 changed files with 1649 additions and 1127 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -314,6 +314,15 @@ bin/install.js          — Installer (multi-runtime)
 get-shit-done/
  bin/lib/              — Core library modules (.cjs)
  workflows/            — Workflow definitions (.md)
+                          Large workflows split per progressive-disclosure
+                          pattern: workflows/<name>/modes/*.md +
+                          workflows/<name>/templates/*. Parent dispatches
+                          to mode files. See workflows/discuss-phase/ as
+                          the canonical example (#2551). New modes for
+                          discuss-phase land in
+                          workflows/discuss-phase/modes/<mode>.md.
+                          Per-file budgets enforced by
+                          tests/workflow-size-budget.test.cjs.
  references/           — Reference documentation (.md)
  templates/            — File templates
 agents/                 — Agent definitions (.md) — CANONICAL SOURCE
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -131,6 +131,33 @@ Orchestration logic that commands reference. Contains the step-by-step process i

 **Total workflows:** see [`docs/INVENTORY.md`](INVENTORY.md#workflows) for the authoritative count and full roster.

+#### Progressive disclosure for workflows
+
+Workflow files are loaded verbatim into Claude's context every time the
+corresponding `/gsd:*` command is invoked. To keep that cost bounded, the
+workflow size budget enforced by `tests/workflow-size-budget.test.cjs`
+mirrors the agent budget from #2361:
+
+| Tier      | Per-file line limit | Intended for |
+|-----------|---------------------|--------------|
+| `XL`      | 1700 | top-level orchestrators (`execute-phase`, `plan-phase`, `new-project`) |
+| `LARGE`   | 1500 | multi-step planners and large feature workflows |
+| `DEFAULT` | 1000 | focused single-purpose workflows (the target tier) |
+
+`workflows/discuss-phase.md` is held to a stricter <500-line ceiling per
+issue #2551. When a workflow grows beyond its tier, extract per-mode bodies
+into `workflows/<workflow>/modes/<mode>.md`, templates into
+`workflows/<workflow>/templates/`, and shared knowledge into
+`get-shit-done/references/`. The parent file becomes a thin dispatcher that
+Reads only the mode and template files needed for the current invocation.
+
+`workflows/discuss-phase/` is the canonical example of this pattern —
+parent dispatches, modes/ holds per-flag behavior (`power.md`, `all.md`,
+`auto.md`, `chain.md`, `text.md`, `batch.md`, `analyze.md`, `default.md`,
+`advisor.md`), and templates/ holds CONTEXT.md, DISCUSSION-LOG.md, and
+checkpoint.json schemas that are read only when the corresponding output
+file is being written.
+
 ### Agents (`agents/*.md`)

 Specialized agent definitions with frontmatter specifying:
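The tier table in the ARCHITECTURE.md hunk above can be read as a simple lookup plus a comparison. The following is a minimal sketch of that idea, not the actual contents of `tests/workflow-size-budget.test.cjs` (which this diff does not show). The function names are hypothetical, and treating the limit as inclusive is an assumption.

```bash
# Sketch only: tier limits are taken from the ARCHITECTURE.md table;
# budget_for_tier and within_budget are hypothetical helper names.

# Print the per-file line limit for a tier name.
budget_for_tier() {
  case "$1" in
    XL)    echo 1700 ;;
    LARGE) echo 1500 ;;
    *)     echo 1000 ;;  # DEFAULT, the target tier
  esac
}

# Succeed when a file with the given line count fits its tier.
# Assumption: the limit is inclusive (a file of exactly 1700 lines passes XL).
within_budget() {
  local line_count="$1" tier="$2"
  [ "$line_count" -le "$(budget_for_tier "$tier")" ]
}

# Figures from the commit message: execute-phase.md at 1702 lines was over
# the XL cap, and fits again at 1633 lines after the extraction.
within_budget 1702 XL || echo "1702 lines: over XL budget"
within_budget 1633 XL && echo "1633 lines: within XL budget"
```

In a real check the line count would come from something like `wc -l < file`; it is passed in directly here so the sketch stays independent of the repository layout.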
--- a/docs/INVENTORY-MANIFEST.json
+++ b/docs/INVENTORY-MANIFEST.json
@@ -242,6 +242,7 @@
      "project-skills-discovery.md",
      "questioning.md",
      "revision-loop.md",
+      "scout-codebase.md",
      "sketch-interactivity.md",
      "sketch-theme-system.md",
      "sketch-tooling.md",
--- a/docs/INVENTORY.md
+++ b/docs/INVENTORY.md
@@ -269,7 +269,7 @@ Full roster at `get-shit-done/workflows/*.md`. Workflows are thin orchestrators

 ---

-## References (50 shipped)
+## References (51 shipped)

 Full roster at `get-shit-done/references/*.md`. References are shared knowledge documents that workflows and agents `@-reference`. The groupings below match [`docs/ARCHITECTURE.md`](ARCHITECTURE.md#references-get-shit-donereferencesmd) — core, workflow, thinking-model clusters, and the modular planner decomposition.

@@ -303,6 +303,7 @@ Full roster at `get-shit-done/references/*.md`. References are shared knowledge
 | `continuation-format.md` | Session continuation/resume format. |
 | `domain-probes.md` | Domain-specific probing questions for discuss-phase. |
 | `gate-prompts.md` | Gate/checkpoint prompt templates. |
+| `scout-codebase.md` | Phase-type→codebase-map selection table for discuss-phase scout step (extracted via #2551). |
 | `revision-loop.md` | Plan revision iteration patterns. |
 | `universal-anti-patterns.md` | Universal anti-patterns to detect and avoid. |
 | `artifact-types.md` | Planning artifact type definitions. |
@@ -354,7 +355,7 @@ The `gsd-planner` agent is decomposed into a core agent plus reference modules t
 | `planner-revision.md` | Plan revision patterns for iterative refinement. |
 | `planner-source-audit.md` | Planner source-audit and authority-limit rules. |

-> **Subdirectory:** `get-shit-done/references/few-shot-examples/` contains additional few-shot examples (`plan-checker.md`, `verifier.md`) that are referenced from specific agents. These are not counted in the 50 top-level references.
+> **Subdirectory:** `get-shit-done/references/few-shot-examples/` contains additional few-shot examples (`plan-checker.md`, `verifier.md`) that are referenced from specific agents. These are not counted in the 51 top-level references.

 ---

--- a/get-shit-done/references/scout-codebase.md
+++ b/get-shit-done/references/scout-codebase.md
@@ -0,0 +1,51 @@
+# Codebase scout — map selection table
+
+> Lazy-loaded reference for the `scout_codebase` step in
+> `workflows/discuss-phase.md` (extracted via #2551 progressive-disclosure
+> refactor). Read this only when prior `.planning/codebase/*.md` maps exist
+> and the workflow needs to pick which 2–3 to load.
+
+## Phase-type → recommended maps
+
+Read 2–3 maps based on inferred phase type. Do NOT read all seven —
+that inflates context without improving discussion quality.
+
+| Phase type (infer from title + ROADMAP entry) | Read these maps |
+|---|---|
+| UI / frontend / styling / design | CONVENTIONS.md, STRUCTURE.md, STACK.md |
+| Backend / API / service / data model | STACK.md, ARCHITECTURE.md, INTEGRATIONS.md |
+| Integration / third-party / provider | STACK.md, INTEGRATIONS.md, ARCHITECTURE.md |
+| Infrastructure / DevOps / CI / deploy | STACK.md, ARCHITECTURE.md, INTEGRATIONS.md |
+| Testing / QA / coverage | TESTING.md, CONVENTIONS.md, STRUCTURE.md |
+| Documentation / content | CONVENTIONS.md, STRUCTURE.md |
+| Mixed / unclear | STACK.md, ARCHITECTURE.md, CONVENTIONS.md |
+
+Read CONCERNS.md only if the phase explicitly addresses known concerns or
+security issues.
+
+## Single-read rule
+
+Read each map file in a **single** Read call. Do not read the same file at
+two different offsets — split reads break prompt-cache reuse and cost more
+than a single full read.
+
+## No-maps fallback
+
+If no `.planning/codebase/*.md` maps exist:
+1. Extract key terms from the phase goal (e.g., "feed" → "post", "card",
+   "list"; "auth" → "login", "session", "token")
+2. `grep -rlE "{term1}|{term2}" src/ app/ --include="*.ts" ...` (use `-E`
+   for extended regex so the `|` alternation works on both GNU grep and BSD
+   grep / macOS), and `ls` the conventional component/hook/util dirs
+3. Read the 3–5 most relevant files
+
+## Output (internal `<codebase_context>`)
+
+From the scan, identify:
+- **Reusable assets** — components, hooks, utilities usable in this phase
+- **Established patterns** — state management, styling, data fetching
+- **Integration points** — routes, nav, providers where new code connects
+- **Creative options** — approaches the architecture enables or constrains
+
+Used in `analyze_phase` and `present_gray_areas`. NOT written to a file —
+session-only.
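The phase-type table in the scout-codebase.md hunk above amounts to a lookup from an inferred phase type to a short list of map files. A minimal sketch follows, assuming short keys such as `ui` or `backend` for the phase types. Those keys and both function names are hypothetical, since the reference leaves the inference to the model rather than to a script.

```bash
# Sketch only: the map lists mirror the table in references/scout-codebase.md;
# the phase-type keys and function names are hypothetical.

# Print the recommended map files for a phase type.
maps_for_phase_type() {
  case "$1" in
    ui)             echo "CONVENTIONS.md STRUCTURE.md STACK.md" ;;
    backend)        echo "STACK.md ARCHITECTURE.md INTEGRATIONS.md" ;;
    integration)    echo "STACK.md INTEGRATIONS.md ARCHITECTURE.md" ;;
    infrastructure) echo "STACK.md ARCHITECTURE.md INTEGRATIONS.md" ;;
    testing)        echo "TESTING.md CONVENTIONS.md STRUCTURE.md" ;;
    docs)           echo "CONVENTIONS.md STRUCTURE.md" ;;
    *)              echo "STACK.md ARCHITECTURE.md CONVENTIONS.md" ;;  # mixed / unclear
  esac
}

# Print only the recommended maps that exist under the given directory.
existing_maps() {
  local dir="$1" phase_type="$2" map
  for map in $(maps_for_phase_type "$phase_type"); do
    if [ -f "$dir/$map" ]; then
      echo "$dir/$map"
    fi
  done
}
```

For example, `existing_maps .planning/codebase testing` would list whichever of TESTING.md, CONVENTIONS.md, and STRUCTURE.md are present. An empty result corresponds to the no-maps fallback.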
--- a/get-shit-done/workflows/discuss-phase.md
+++ b/get-shit-done/workflows/discuss-phase.md
--- a/get-shit-done/workflows/discuss-phase/modes/advisor.md
+++ b/get-shit-done/workflows/discuss-phase/modes/advisor.md
@@ -0,0 +1,173 @@
+# Advisor mode — research-backed comparison tables
+
+> **Lazy-loaded and gated.** The parent `workflows/discuss-phase.md` Reads
+> this file ONLY when `ADVISOR_MODE` is true (i.e., when
+> `$HOME/.claude/get-shit-done/USER-PROFILE.md` exists). Skip the Read
+> entirely when no profile is present — that's the inverse of the
+> `--advisor` flag from #2174 (don't pay the cost when unused).
+
+## Activation
+
+```bash
+PROFILE_PATH="$HOME/.claude/get-shit-done/USER-PROFILE.md"
+if [ -f "$PROFILE_PATH" ]; then
+  ADVISOR_MODE=true
+else
+  ADVISOR_MODE=false
+fi
+```
+
+If `ADVISOR_MODE` is false, do **not** Read this file — proceed with the
+standard `default.md` discussion flow.
+
+## Calibration tier
+
+Resolve `vendor_philosophy` calibration tier:
+1. **Priority 1:** Read `config.json` > `preferences.vendor_philosophy`
+   (project-level override)
+2. **Priority 2:** Read USER-PROFILE.md `Vendor Choices/Philosophy` rating
+   (global)
+3. **Priority 3:** Default to `"standard"` if neither has a value or value
+   is `UNSCORED`
+
+Map to calibration tier:
+- `conservative` OR `thorough-evaluator` → `full_maturity`
+- `opinionated` → `minimal_decisive`
+- `pragmatic-fast` OR any other value OR empty → `standard`
+
+Resolve advisor model:
+```bash
+ADVISOR_MODEL=$(gsd-sdk query resolve-model gsd-advisor-researcher --raw)
+```
+
+## Non-technical owner detection
+
+Read USER-PROFILE.md and check for product-owner signals:
+
+```bash
+PROFILE_CONTENT=$(cat "$HOME/.claude/get-shit-done/USER-PROFILE.md" 2>/dev/null || true)
+```
+
+Set `NON_TECHNICAL_OWNER = true` if ANY of the following are present:
+- `learning_style: guided`
+- The word `jargon` appears in a `frustration_triggers` section
+- `explanation_depth: practical-detailed` (without a technical modifier)
+- `explanation_depth: high-level`
+
+**Tie-breaker / precedence (when signals conflict):**
+1. An explicit `technical_background: true` (or any `explanation_depth` value
+   tagged with a technical modifier such as `practical-detailed:technical`)
+   **overrides** all inferred non-technical signals — set
+   `NON_TECHNICAL_OWNER = false`.
+2. Otherwise, ANY single matching signal is sufficient to set
+   `NON_TECHNICAL_OWNER = true` (signals are OR-aggregated, not weighted).
+3. Contradictory `explanation_depth` values: the most recent entry wins.
+
+Log the resolved value and the matched/overriding signal so the user can
+audit why a given framing was used.
+
+When `NON_TECHNICAL_OWNER` is true, reframe gray area labels and
+descriptions in product-outcome language before presenting them. Preserve
+the same underlying decision — only change the framing:
+
+- Technical implementation term → outcome the user will experience
+  - "Token architecture" → "Color system: which approach prevents the dark theme from flashing white on open"
+  - "CSS variable strategy" → "Theme colors: how your brand colors stay consistent in both light and dark mode"
+  - "Component API surface area" → "How the building blocks connect: how tightly coupled should these parts be"
+  - "Caching strategy: SWR vs React Query" → "Loading speed: should screens show saved data right away or wait for fresh data"
+
+This reframing applies to:
+1. Gray area labels and descriptions in `present_gray_areas`
+2. Advisor research rationale rewrites in the synthesis step below
+
+## advisor_research step
+
+After the user selects gray areas in `present_gray_areas`, spawn parallel
+research agents.
+
+1. Display brief status: `Researching {N} areas...`
+
+2. For EACH user-selected gray area, spawn a `Task()` in parallel:
+
+   ```
+   Task(
+     prompt="First, read @~/.claude/agents/gsd-advisor-researcher.md for your role and instructions.
+
+     <gray_area>{area_name}: {area_description from gray area identification}</gray_area>
+     <phase_context>{phase_goal and description from ROADMAP.md}</phase_context>
+     <project_context>{project name and brief description from PROJECT.md}</project_context>
+     <calibration_tier>{resolved calibration tier: full_maturity | standard | minimal_decisive}</calibration_tier>
+
+     Research this gray area and return a structured comparison table with rationale.
+     ${AGENT_SKILLS_ADVISOR}",
+     subagent_type="general-purpose",
+     model="{ADVISOR_MODEL}",
+     description="Research: {area_name}"
+   )
+   ```
+
+   All `Task()` calls spawn simultaneously — do NOT wait for one before
+   starting the next.
+
+3. After ALL agents return, **synthesize results** before presenting:
+
+   For each agent's return:
+   a. Parse the markdown comparison table and rationale paragraph
+   b. Verify all 5 columns are present (Option | Pros | Cons | Complexity | Recommendation); fill any missing columns rather than showing a broken table
+   c. Verify option count matches calibration tier:
+      - `full_maturity`: 3-5 options acceptable
+      - `standard`: 2-4 options acceptable
+      - `minimal_decisive`: 1-2 options acceptable
+      If agent returned too many, trim least viable. If too few, accept as-is.
+   d. Rewrite rationale paragraph to weave in project context and ongoing discussion context that the agent did not have access to
+   e. If agent returned only 1 option, convert from table format to direct recommendation: "Standard approach for {area}: {option}. {rationale}"
+   f. **If `NON_TECHNICAL_OWNER` is true:** apply a plain language rewrite to the rationale paragraph. Replace implementation-level terms with outcome descriptions the user can reason about without technical context. The Recommendation column value and the table structure remain intact. Do not remove detail; translate it. Example: "SWR uses stale-while-revalidate to serve cached responses immediately" → "This approach shows you something right away, then quietly updates in the background — users see data instantly."
+
+4. Store synthesized tables for use in `discuss_areas` (table-first flow).
+
+## discuss_areas (advisor table-first flow)
+
+For each selected area:
+
+1. **Present the synthesized comparison table + rationale paragraph** (from
+   `advisor_research`)
+
+2. **Use AskUserQuestion** (or text-mode equivalent if `--text` overlay):
+   - header: `{area_name}`
+   - question: `Which approach for {area_name}?`
+   - options: extract from the table's Option column (AskUserQuestion adds
+     "Other" automatically)
+
+3. **Record the user's selection:**
+   - If user picks from table options → record as locked decision for that
+     area
+   - If user picks "Other" → receive their input, reflect it back for
+     confirmation, record
+
+4. **Thinking partner (conditional):** same rule as default mode — if
+   `features.thinking_partner` is enabled and tradeoff signals are
+   detected, offer a 3-5 bullet analysis before locking in.
+
+5. **After recording pick, decide whether follow-up questions are needed:**
+   - If the pick has ambiguity that would affect downstream planning →
+     ask 1-2 targeted follow-up questions using AskUserQuestion
+   - If the pick is clear and self-contained → move to next area
+   - Do NOT ask the standard 4 questions — the table already provided the
+     context
+
+6. **After all areas processed:**
+   - header: "Done"
+   - question: "That covers [list areas]. Ready to create context?"
+   - options: "Create context" / "Revisit an area"
+
+## Scope creep handling (advisor mode)
+
+If user mentions something outside the phase domain:
+```
+"[Feature] sounds like a new capability — that belongs in its own phase.
+I'll note it as a deferred idea.
+
+Back to [current area]: [return to current question]"
+```
+
+Track deferred ideas internally.
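The "Calibration tier" section in the advisor.md hunk above describes a three-level priority lookup followed by a fixed mapping, and the synthesis step ties each tier to an acceptable option count. A minimal sketch of that logic follows. The function names are hypothetical, and treating `UNSCORED` as "no value" at both the project and profile level is an assumption, because the source states the rule only for the final default.

```bash
# Sketch only: values and ranges are taken from modes/advisor.md;
# the function names are hypothetical.

# Resolve vendor_philosophy: project override first, then the global
# profile rating, else "standard".
# Assumption: UNSCORED counts as "no value" at either level.
resolve_vendor_philosophy() {
  local project_value="$1" profile_value="$2" value
  for value in "$project_value" "$profile_value"; do
    if [ -n "$value" ] && [ "$value" != "UNSCORED" ]; then
      echo "$value"
      return 0
    fi
  done
  echo "standard"
}

# Map a vendor_philosophy value to a calibration tier.
calibration_tier() {
  case "$1" in
    conservative|thorough-evaluator) echo "full_maturity" ;;
    opinionated)                     echo "minimal_decisive" ;;
    *)                               echo "standard" ;;  # pragmatic-fast, empty, anything else
  esac
}

# Print "min max": the acceptable option count for a tier.
option_range() {
  case "$1" in
    full_maturity)    echo "3 5" ;;
    minimal_decisive) echo "1 2" ;;
    *)                echo "2 4" ;;  # standard
  esac
}
```

Reading the two inputs (from `config.json` and USER-PROFILE.md) is left out on purpose, because the diff does not show the profile's on-disk format.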
--- a/get-shit-done/workflows/discuss-phase/modes/all.md
+++ b/get-shit-done/workflows/discuss-phase/modes/all.md
@@ -0,0 +1,28 @@
+# --all mode — auto-select ALL gray areas, discuss interactively
+
+> **Lazy-loaded.** Read this file from `workflows/discuss-phase.md` when
+> `--all` is present in `$ARGUMENTS`. Behavior overlays the default mode.
+
+## Effect
+
+- In `present_gray_areas`: auto-select ALL gray areas without asking the user
+  (skips the AskUserQuestion area-selection step).
+- Discussion for each area proceeds **fully interactively** — the user drives
+  every question for every area (use the default-mode `discuss_areas` flow).
+- Does NOT auto-advance to plan-phase afterward — use `--chain` or `--auto`
+  if you want auto-advance.
+- Log: `[--all] Auto-selected all gray areas: [list area names].`
+
+## Why this mode exists
+
+This is the "discuss everything" shortcut: skip the selection friction, keep
+full interactive control over each individual question.
+
+## Combination rules
+
+- `--all --auto`: `--auto` wins for the discussion phase too (Claude picks
+  recommended answers); `--all`'s contribution is just area auto-selection.
+- `--all --chain`: areas auto-selected, discussion interactive, then
+  auto-advance to plan/execute (chain semantics).
+- `--all --batch` / `--all --text` / `--all --analyze`: layered overlays
+  apply during discussion as documented in their respective files.
--- a/get-shit-done/workflows/discuss-phase/modes/analyze.md
+++ b/get-shit-done/workflows/discuss-phase/modes/analyze.md
@@ -0,0 +1,44 @@
+# --analyze mode — trade-off tables before each question
+
+> **Lazy-loaded overlay.** Read this file from `workflows/discuss-phase.md`
+> when `--analyze` is present in `$ARGUMENTS`. Combinable with default,
+> `--all`, `--chain`, `--text`, `--batch`.
+
+## Effect
+
+Before presenting each question (or question group, in batch mode), provide
+a brief **trade-off analysis** for the decision:
+- 2-3 options with pros/cons based on codebase context and common patterns
+- A recommended approach with reasoning
+- Known pitfalls or constraints from prior phases
+
+## Example
+
+```markdown
+**Trade-off analysis: Authentication strategy**
+
+| Approach | Pros | Cons |
+|----------|------|------|
+| Session cookies | Simple, httpOnly prevents XSS | Requires CSRF protection, sticky sessions |
+| JWT (stateless) | Scalable, no server state | Token size, revocation complexity |
+| OAuth 2.0 + PKCE | Industry standard for SPAs | More setup, redirect flow UX |
+
+💡 Recommended: OAuth 2.0 + PKCE — your app has social login in requirements (REQ-04) and this aligns with the existing NextAuth setup in `src/lib/auth.ts`.
+
+How should users authenticate?
+```
+
+This gives the user context to make informed decisions without extra
+prompting.
+
+When `--analyze` is absent, present questions directly as before (no
+trade-off table).
+
+## Sourcing the analysis
+
+- Pros/cons should reflect the codebase context loaded in `scout_codebase`
+  and any prior decisions surfaced in `load_prior_context`.
+- The recommendation must explicitly tie to project context (e.g.,
+  existing libraries, prior phase decisions, documented requirements).
+- If a related ADR or spec is referenced in CONTEXT.md `<canonical_refs>`,
+  cite it in the recommendation.
--- a/get-shit-done/workflows/discuss-phase/modes/auto.md
+++ b/get-shit-done/workflows/discuss-phase/modes/auto.md
@@ -0,0 +1,56 @@
+# --auto mode — fully autonomous discuss-phase
+
+> **Lazy-loaded.** Read this file from `workflows/discuss-phase.md` when
+> `--auto` is present in `$ARGUMENTS`. After the discussion completes, the
+> parent's `auto_advance` step also reads `modes/chain.md` to drive the
+> auto-advance to plan-phase.
+
+## Effect across steps
+
+- **`check_existing`**: if CONTEXT.md exists, auto-select "Update it" — load
+  existing context and continue to `analyze_phase` (matches the parent step's
+  documented `--auto` branch). If no context exists, continue without
+  prompting. For interrupted checkpoints, auto-select "Resume". For existing
+  plans, auto-select "Continue and replan after". Log every decision so the
+  user can audit.
+- **`cross_reference_todos`**: fold all todos with relevance score >= 0.4
+  automatically. Log the selection.
+- **`present_gray_areas`**: auto-select ALL gray areas. Log:
+  `[--auto] Selected all gray areas: [list area names].`
+- **`discuss_areas`**: for each discussion question, choose the recommended
+  option (first option, or the one explicitly marked "recommended") **without
+  using AskUserQuestion**. Skip interactive prompts entirely. Log each
+  auto-selected choice inline so the user can review decisions in the
+  context file:
+  ```
+  [auto] [Area] — Q: "[question text]" → Selected: "[chosen option]" (recommended default)
+  ```
+- After all areas are auto-resolved, skip the "Explore more gray areas"
+  prompt and proceed directly to `write_context`.
+- After `write_context`, **auto-advance** to plan-phase via `modes/chain.md`.
+
+## CRITICAL — Auto-mode pass cap
+
+In `--auto` mode, the discuss step MUST complete in a **single pass**. After
+writing CONTEXT.md once, you are DONE — proceed immediately to
+`write_context` and then auto_advance. Do NOT re-read your own CONTEXT.md to
+find "gaps", "undefined types", or "missing decisions" and run additional
+passes. This creates a self-feeding loop where each pass generates references
+that the next pass treats as gaps, consuming unbounded time and resources.
+
+Check the pass cap from config:
+```bash
+MAX_PASSES=$(gsd-sdk query config-get workflow.max_discuss_passes 2>/dev/null || echo "3")
+```
+
+If you have already written and committed CONTEXT.md, the discuss step is
+complete. Move on.
+
+## Combination rules
+
+- `--auto --text` / `--auto --batch`: text/batch overlays are no-ops in
+  auto mode (no user prompts to render).
+- `--auto --analyze`: trade-off tables can still be logged for the audit
+  trail; selection still uses the recommended option.
+- `--auto --power`: `--power` wins (power mode generates files for offline
+  answering — incompatible with autonomous selection).
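Two of the `--auto` rules in the auto.md hunk above are mechanical enough to sketch: folding todos at a relevance score of 0.4 or higher, and choosing the recommended option without prompting. The sketch below is illustrative only. Both function names are hypothetical, and it assumes an option is "explicitly marked" when its label contains the word "recommended", with that mark taking priority over the first-option fallback.

```bash
# Sketch only: threshold and selection rule come from modes/auto.md;
# the function names and the "recommended" substring convention are assumptions.

# Succeed when a todo's relevance score meets the >= 0.4 fold threshold.
should_fold_todo() {
  awk -v score="$1" 'BEGIN { exit !(score >= 0.4) }'
}

# Print the option whose label is marked "recommended";
# fall back to the first option when none is marked.
pick_recommended() {
  local option
  for option in "$@"; do
    case "$option" in
      *[Rr]ecommended*) echo "$option"; return 0 ;;
    esac
  done
  echo "$1"
}
```

For example, `pick_recommended "JWT" "OAuth 2.0 + PKCE (recommended)"` selects the second label, while a list with no marked entry yields its first element.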
--- a/get-shit-done/workflows/discuss-phase/modes/batch.md
+++ b/get-shit-done/workflows/discuss-phase/modes/batch.md
@@ -0,0 +1,52 @@
+# --batch mode — grouped question batches
+
+> **Lazy-loaded overlay.** Read this file from `workflows/discuss-phase.md`
+> when `--batch` is present in `$ARGUMENTS`. Combinable with default,
+> `--all`, `--chain`, `--text`, `--analyze`.
+
+## Argument parsing
+
+Parse optional `--batch` from `$ARGUMENTS`:
+- Accept `--batch`, `--batch=N`, or `--batch N`
+- Default to **4 questions per batch** when no number is provided
+- Clamp explicit sizes to **2–5** so a batch stays answerable
+- If `--batch` is absent, keep the existing one-question-at-a-time flow
+  (default mode).
+
+## Effect on discuss_areas
+
+`--batch` mode: ask **2–5 numbered questions in one plain-text turn** per
+area, instead of the default 4 single-question AskUserQuestion turns.
+
+- Group closely related questions for the current area into a single
+  message
+- Keep each question concrete and answerable in one reply
+- When options are helpful, include short inline choices per question
+  rather than a separate AskUserQuestion for every item
+- After the user replies, reflect back the captured decisions, note any
+  unanswered items, and ask only the minimum follow-up needed before
+  moving on
+- Preserve adaptiveness between batches: use the full set of answers to
+  decide the next batch or whether the area is sufficiently clear
+
+## Philosophy
+
+Stay adaptive, but let the user choose the pacing.
+- Default mode: 4 single-question turns, then check whether to continue
+- `--batch` mode: 1 grouped turn with 2–5 numbered questions, then check
+  whether to continue
+
+Each answer set should reveal the next question or next batch.
+
+## Example batch
+
+```
+Authentication — please answer 1–4:
+
+1. Which auth strategy?  (a) Session cookies  (b) JWT  (c) OAuth 2.0 + PKCE
+2. Where do tokens live?  (a) httpOnly cookie  (b) localStorage  (c) memory only
+3. Session lifetime?       (a) 1h  (b) 24h  (c) 30d  (d) configurable
+4. Account recovery?       (a) email reset  (b) magic link  (c) both
+
+Reply with your choices (e.g. "1c, 2a, 3b, 4c") or describe in your own words.
+```
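The argument-parsing rules in the batch.md hunk above (three accepted spellings, a default of 4, a clamp to 2–5) can be sketched as a single function. This is a minimal illustration under the assumption that `$ARGUMENTS` is a whitespace-separated string; `parse_batch_size` is a hypothetical name.

```bash
# Sketch only: accepted forms, default, and clamp come from modes/batch.md;
# parse_batch_size is a hypothetical helper.

# Print the batch size for an argument string, or nothing when --batch is absent.
parse_batch_size() {
  local args=" $1 " size=4
  # Patterns are kept in variables so bash treats them as regexes without quoting issues.
  local with_size='[[:space:]]--batch(=|[[:space:]]+)([0-9]+)[[:space:]]'
  local bare='[[:space:]]--batch[[:space:]]'

  if [[ "$args" =~ $with_size ]]; then
    size=$((10#${BASH_REMATCH[2]}))   # force base 10 so "08" is not read as octal
  elif [[ ! "$args" =~ $bare ]]; then
    return 0                          # --batch absent: default one-question flow
  fi

  # Clamp explicit sizes to 2-5 so a batch stays answerable.
  if (( size < 2 )); then size=2; fi
  if (( size > 5 )); then size=5; fi
  echo "$size"
}
```

One consequence of accepting `--batch N` is that a bare number directly after `--batch` is always read as the size. A caller who passes the phase number in that position (for example `--batch 12`) would get a clamped batch size of 5, so placing the phase number before the flag avoids the ambiguity.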
--- a/get-shit-done/workflows/discuss-phase/modes/chain.md
+++ b/get-shit-done/workflows/discuss-phase/modes/chain.md
@@ -0,0 +1,97 @@
+# --chain mode — interactive discuss, then auto-advance
+
+> **Lazy-loaded.** Read this file from `workflows/discuss-phase.md` when
+> `--chain` is present in `$ARGUMENTS`, or when the parent's `auto_advance`
+> step needs to dispatch to plan-phase under `--auto`.
+
+## Effect
+
+- Discussion is **fully interactive** — questions, gray-area selection, and
+  follow-ups behave exactly the same as default mode.
+- After discussion completes, **auto-advance to plan-phase → execute-phase**
+  (same downstream behavior as `--auto`).
+- This is the middle ground: the user controls the discuss decisions, then
+  plan and execute run autonomously.
+
+## auto_advance step (executed by the parent file)
+
+1. Parse `--auto` and `--chain` flags from `$ARGUMENTS`. **Note:** `--all`
+   is NOT an auto-advance trigger — it only affects area selection. A
+   session with `--all` but without `--auto` or `--chain` returns to manual
+   next-steps after discussion completes.
+
+2. **Sync chain flag with intent** — if user invoked manually (no `--auto`
+   and no `--chain`), clear the ephemeral chain flag from any previous
+   interrupted `--auto` chain. This does NOT touch `workflow.auto_advance`
+   (the user's persistent settings preference):
+   ```bash
+   if [[ ! "$ARGUMENTS" =~ --auto ]] && [[ ! "$ARGUMENTS" =~ --chain ]]; then
+     gsd-sdk query config-set workflow._auto_chain_active false 2>/dev/null
+   fi
+   ```
+
+3. Read consolidated auto-mode (`active` = chain flag OR user preference):
+   ```bash
+   AUTO_MODE=$(gsd-sdk query check auto-mode --pick active 2>/dev/null || echo "false")
+   ```
+
+4. **If `--auto` or `--chain` flag present AND `AUTO_MODE` is not true:**
+   Persist chain flag to config (handles direct usage without new-project):
+   ```bash
+   gsd-sdk query config-set workflow._auto_chain_active true
+   ```
+
+5. **If `--auto` flag present OR `--chain` flag present OR `AUTO_MODE` is
+   true:** display banner and launch plan-phase.
+
+   Banner:
+   ```
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+    GSD ► AUTO-ADVANCING TO PLAN
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+   Context captured. Launching plan-phase...
+   ```
+
+   Launch plan-phase using the Skill tool to avoid nested Task sessions
+   (which cause runtime freezes due to deep agent nesting — see #686):
+   ```
+   Skill(skill="gsd-plan-phase", args="${PHASE} --auto ${GSD_WS}")
+   ```
+
+   This keeps the auto-advance chain flat — discuss, plan, and execute all
+   run at the same nesting level rather than spawning increasingly deep
+   Task agents.
+
+6. **Handle plan-phase return:**
+
+   - **PHASE COMPLETE** → Full chain succeeded. Display:
+     ```
+     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      GSD ► PHASE ${PHASE} COMPLETE
+     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+     Auto-advance pipeline finished: discuss → plan → execute
+
+     /clear then:
+
+     Next: /gsd:discuss-phase ${NEXT_PHASE} ${WAS_CHAIN ? "--chain" : "--auto"} ${GSD_WS}
+     ```
+   - **PLANNING COMPLETE** → Planning done, execution didn't complete:
+     ```
+     Auto-advance partial: Planning complete, execution did not finish.
+     Continue: /gsd:execute-phase ${PHASE} ${GSD_WS}
+     ```
+   - **PLANNING INCONCLUSIVE / CHECKPOINT** → Stop chain:
+     ```
+     Auto-advance stopped: Planning needs input.
+     Continue: /gsd:plan-phase ${PHASE} ${GSD_WS}
+     ```
+   - **GAPS FOUND** → Stop chain:
+     ```
+     Auto-advance stopped: Gaps found during execution.
+     Continue: /gsd:plan-phase ${PHASE} --gaps ${GSD_WS}
+     ```
+
+7. **If neither `--auto` nor `--chain` is present and auto-advance is not enabled in config:** route to
+   `confirm_creation` step (existing behavior — show manual next steps).
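Reviewer sketch (not part of the patch): the branching in steps 2 through 7 of `modes/chain.md` can be read as one pure decision function. The function name and the action labels below are hypothetical, chosen only for illustration. `autoMode` stands for the value read in step 3, which in the real workflow is read after step 2 has had a chance to clear the flag.

```javascript
// Hypothetical helper: mirrors steps 2-7 of modes/chain.md as a pure function.
// `args` is the raw $ARGUMENTS string; `autoMode` is the boolean from step 3.
function planChainActions(args, autoMode) {
  // Substring match, like the bash `=~ --auto` guard in step 2.
  const hasAuto = /--auto/.test(args);
  const hasChain = /--chain/.test(args);
  const actions = [];

  // Step 2: a manual invocation clears any stale ephemeral chain flag.
  if (!hasAuto && !hasChain) actions.push('clear-chain-flag');

  // Step 4: persist the chain flag when a flag is present but auto-mode is off.
  if ((hasAuto || hasChain) && !autoMode) actions.push('set-chain-flag');

  // Steps 5 and 7: launch plan-phase, or fall back to manual next steps.
  actions.push(hasAuto || hasChain || autoMode ? 'launch-plan-phase' : 'confirm-creation');
  return actions;
}
```

For example, `planChainActions('3 --chain', false)` yields the set-flag action followed by the launch action, while a bare `planChainActions('3', false)` clears the flag and routes to `confirm_creation`.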
--- a/get-shit-done/workflows/discuss-phase/modes/default.md
+++ b/get-shit-done/workflows/discuss-phase/modes/default.md
@@ -0,0 +1,141 @@
+# Default mode — interactive discuss-phase
+
+> **Lazy-loaded.** Read this file from `workflows/discuss-phase.md` when no
+> mode flag is present (the baseline interactive flow). When `--text`,
+> `--batch`, or `--analyze` is also present, layer the corresponding overlay
+> file from this directory on top of the rules below.
+
+This document defines `discuss_areas` for the default flow. The shared steps
+that come before (`initialize`, `check_blocking_antipatterns`, `check_spec`,
+`check_existing`, `load_prior_context`, `cross_reference_todos`,
+`scout_codebase`, `analyze_phase`, `present_gray_areas`) live in the parent
+file and run for every mode.
+
+## discuss_areas (default, interactive)
+
+For each selected area, conduct a focused discussion loop.
+
+**Research-before-questions mode:** Check if `workflow.research_before_questions` is enabled in config (from init context or `.planning/config.json`). When enabled, before presenting questions for each area:
+1. Do a brief web search for best practices related to the area topic
+2. Summarize the top findings in 2-3 bullet points
+3. Present the research alongside the question so the user can make a more informed decision
+
+Example with research enabled:
+```text
+Let's talk about [Authentication Strategy].
+
+📊 Best practices research:
+• OAuth 2.0 + PKCE is the current standard for SPAs (replaces implicit flow)
+• Session tokens with httpOnly cookies preferred over localStorage for XSS protection
+• Consider passkey/WebAuthn support — adoption is accelerating in 2025-2026
+
+With that context: How should users authenticate?
+```
+
+When disabled (default), skip the research and present questions directly as before.
+
+**Philosophy:** stay adaptive. Default flow is 4 single-question turns, then
+check whether to continue. Each answer should reveal the next question.
+
+**For each area:**
+
+1. **Announce the area:**
+   ```text
+   Let's talk about [Area].
+   ```
+
+2. **Ask 4 questions using AskUserQuestion:**
+   - header: "[Area]" (max 12 chars — abbreviate if needed)
+   - question: Specific decision for this area
+   - options: 2-3 concrete choices (AskUserQuestion adds "Other" automatically), with the recommended choice highlighted and brief explanation why
+   - **Annotate options with code context** when relevant:
+     ```text
+     "How should posts be displayed?"
+     - Cards (reuses existing Card component — consistent with Messages)
+     - List (simpler, would be a new pattern)
+     - Timeline (needs new Timeline component — none exists yet)
+     ```
+   - Include "You decide" as an option when reasonable; this captures Claude's discretion
+   - **Context7 for library choices:** When a gray area involves library selection (e.g., "magic links" → query next-auth docs) or API approach decisions, use `mcp__context7__*` tools to fetch current documentation and inform the options. Don't use Context7 for every question — only when library-specific knowledge improves the options.
+
+3. **After the current set of questions, check:**
+   - header: "[Area]" (max 12 chars)
+   - question: "More questions about [area], or move to next? (Remaining: [list other unvisited areas])"
+   - options: "More questions" / "Next area"
+
+   When building the question text, list the remaining unvisited areas so the user knows what's ahead. For example: "More questions about Layout, or move to next? (Remaining: Loading behavior, Content ordering)"
+
+   If "More questions" → ask another 4 single questions, then check again
+   If "Next area" → proceed to next selected area
+   If "Other" (free text) → interpret intent: continuation phrases ("chat more", "keep going", "yes", "more") map to "More questions"; advancement phrases ("done", "move on", "next", "skip") map to "Next area". If ambiguous, ask: "Continue with more questions about [area], or move to the next area?"
+
+4. **After all initially-selected areas complete:**
+   - Summarize what was captured from the discussion so far
+   - AskUserQuestion:
+     - header: "Done"
+     - question: "We've discussed [list areas]. Which gray areas remain unclear?"
+     - options: "Explore more gray areas" / "I'm ready for context"
+   - If "Explore more gray areas":
+     - Identify 2-4 additional gray areas based on what was learned
+     - Return to present_gray_areas logic with these new areas
+     - Loop: discuss new areas, then prompt again
+   - If "I'm ready for context": Proceed to write_context
+
+**Canonical ref accumulation during discussion:**
+When the user references a doc, spec, or ADR during any answer — e.g., "read adr-014", "check the MCP spec", "per browse-spec.md" — immediately:
+1. Read the referenced doc (or confirm it exists)
+2. Add it to the canonical refs accumulator with full relative path
+3. Use what you learned from the doc to inform subsequent questions
+
+These user-referenced docs are often MORE important than ROADMAP.md refs because they represent docs the user specifically wants downstream agents to follow. Never drop them.
+
+**Question design:**
+- Options should be concrete, not abstract ("Cards" not "Option A")
+- Each answer should inform the next question or next batch
+- If user picks "Other" to provide freeform input (e.g., "let me describe it", "something else", or an open-ended reply), ask your follow-up as plain text — NOT another AskUserQuestion. Wait for them to type at the normal prompt, then reflect their input back and confirm before resuming AskUserQuestion or the next numbered batch.
+
+**Thinking partner (conditional):**
+If `features.thinking_partner` is enabled in config, check the user's answer for tradeoff signals
+(see `references/thinking-partner.md` for signal list). If tradeoff detected:
+
+```text
+I notice competing priorities here — {option_A} optimizes for {goal_A} while {option_B} optimizes for {goal_B}.
+
+Want me to think through the tradeoffs before we lock this in?
+[Yes, analyze] / [No, decision made]
+```
+
+If yes: provide 3-5 bullet analysis (what each optimizes/sacrifices, alignment with PROJECT.md goals, recommendation). Then return to normal flow.
+
+**Scope creep handling:**
+If user mentions something outside the phase domain:
+```text
+"[Feature] sounds like a new capability — that belongs in its own phase.
+I'll note it as a deferred idea.
+
+Back to [current area]: [return to current question]"
+```
+
+Track deferred ideas internally.
+
+**Incremental checkpoint — save after each area completes:**
+
+After each area is resolved (user says "Next area"), immediately write a checkpoint file with all decisions captured so far. This prevents data loss if the session is interrupted mid-discussion.
+
+**Checkpoint file:** `${phase_dir}/${padded_phase}-DISCUSS-CHECKPOINT.json`
+
+Schema: read `workflows/discuss-phase/templates/checkpoint.json` for the
+canonical structure — copy it and substitute the live values.
+
+**On session resume:** Handled in the parent's `check_existing` step. After
+`write_context` completes successfully, the parent's `git_commit` step
+deletes the checkpoint.
+
+**Track discussion log data internally:**
+For each question asked, accumulate:
+- Area name
+- All options presented (label + description)
+- Which option the user selected (or their free-text response)
+- Any follow-up notes or clarifications the user provided
+
+This data is used to generate DISCUSSION-LOG.md in the parent's `git_commit` step.
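Reviewer sketch (not part of the patch): the "Other" free-text rule in step 3 of `modes/default.md` maps continuation phrases to "More questions" and advancement phrases to "Next area". A minimal way to picture that mapping is shown below, assuming simple whole-word phrase matching. The function name is hypothetical, and a real agent would interpret intent more flexibly than this; for instance, "no more" would be misread here.

```javascript
// Phrase lists taken from the step 3 description in modes/default.md.
const CONTINUE_PHRASES = ['chat more', 'keep going', 'yes', 'more'];
const ADVANCE_PHRASES = ['done', 'move on', 'next', 'skip'];

// Hypothetical helper: classify a free-text "Other" reply.
// Returns 'More questions', 'Next area', or 'ambiguous' (ask the user to clarify).
function classifyOtherReply(reply) {
  const text = reply.trim().toLowerCase();
  // Whole-word match so that "nextjs" does not count as "next".
  const hits = (phrases) => phrases.some((p) => new RegExp(`\\b${p}\\b`).test(text));
  const wantsMore = hits(CONTINUE_PHRASES);
  const wantsNext = hits(ADVANCE_PHRASES);
  if (wantsMore && !wantsNext) return 'More questions';
  if (wantsNext && !wantsMore) return 'Next area';
  return 'ambiguous';
}
```

Replies that match both lists, or neither, fall through to the clarifying question described in the step.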
--- a/get-shit-done/workflows/discuss-phase/modes/power.md
+++ b/get-shit-done/workflows/discuss-phase/modes/power.md
@@ -0,0 +1,44 @@
+# --power mode — bulk question generation, async answering
+
+> **Lazy-loaded.** Read this file from `workflows/discuss-phase.md` when
+> `--power` is present in `$ARGUMENTS`. The full step-by-step instructions
+> live in the existing `discuss-phase-power.md` workflow file (kept stable
+> at its original path so installed `@`-references continue to resolve).
+
+## Dispatch
+
+```text
+Read @~/.claude/get-shit-done/workflows/discuss-phase-power.md
+```
+
+Execute it end-to-end. Do not continue with the standard interactive steps.
+
+## Summary of flow
+
+The power user mode generates ALL questions upfront into machine-readable
+and human-friendly files, then waits for the user to answer at their own
+pace before processing all answers in a single pass.
+
+1. Run the same phase analysis (gray area identification) as standard mode
+2. Write all questions to
+   `{phase_dir}/{padded_phase}-QUESTIONS.json` and
+   `{phase_dir}/{padded_phase}-QUESTIONS.html`
+3. Notify user with file paths and wait for a "refresh" or "finalize"
+   command
+4. On "refresh": read the JSON, process answered questions, update stats
+   and HTML
+5. On "finalize": read all answers from JSON, generate CONTEXT.md in the
+   standard format
+
+## When to use
+
+Large phases with many gray areas, or when users prefer to answer
+questions offline / asynchronously rather than interactively in the chat
+session.
+
+## Combination rules
+
+- `--power --auto`: power wins. Power mode is incompatible with
+  autonomous selection — its purpose is offline answering.
+- `--power --chain`: after the power-mode finalize step writes
+  CONTEXT.md, the chain auto-advance still applies (Read `chain.md`).
--- a/get-shit-done/workflows/discuss-phase/modes/text.md
+++ b/get-shit-done/workflows/discuss-phase/modes/text.md
@@ -0,0 +1,55 @@
+# --text mode — plain-text overlay (no AskUserQuestion)
+
+> **Lazy-loaded overlay.** Read this file from `workflows/discuss-phase.md`
+> when `--text` is present in `$ARGUMENTS`, OR when
+> `workflow.text_mode: true` is set in config (e.g., per-project default).
+
+## Effect
+
+When text mode is active, **do not use AskUserQuestion at all**. Instead,
+present every question as a plain-text numbered list and ask the user to
+type their choice number. Free-text input maps to the "Other" branch of
+the equivalent AskUserQuestion call.
+
+This is required for Claude Code remote sessions (`/rc` mode) where the
+Claude App cannot forward TUI menu selections back to the host.
+
+## Activation
+
+- Per-session: pass `--text` flag to any command (e.g.,
+  `/gsd:discuss-phase --text`)
+- Per-project: `gsd-sdk query config-set workflow.text_mode true`
+
+Text mode applies to ALL workflows in the session, not just discuss-phase.
+
+## Question rendering
+
+Replace this:
+```text
+AskUserQuestion(
+  header="Layout",
+  question="How should posts be displayed?",
+  options=["Cards", "List", "Timeline"]
+)
+```
+
+With this:
+```text
+Layout — How should posts be displayed?
+  1. Cards
+  2. List
+  3. Timeline
+  4. Other (type freeform)
+
+Reply with a number, or describe your preference.
+```
+
+Wait for the user's reply at the normal prompt. Parse:
+- Numeric reply → mapped to that option
+- Free text → treated as "Other" — reflect it back, confirm, then proceed
+
+## Empty-answer handling
+
+The same answer-validation rules from the parent file apply: empty
+responses trigger one retry, then a clarifying question. Do not proceed
+with empty input.
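Reviewer sketch (not part of the patch): the rendering and parsing rules in `modes/text.md` can be illustrated with two small functions. Both names are hypothetical. Treating an out-of-range number as free text is an assumption of this sketch; the mode file does not say what to do in that case.

```javascript
// Hypothetical helper: render an AskUserQuestion-style prompt as plain text.
function renderTextQuestion(header, question, options) {
  const lines = options.map((opt, i) => `  ${i + 1}. ${opt}`);
  lines.push(`  ${options.length + 1}. Other (type freeform)`);
  // \u2014 is the dash used in the mode file's example header line.
  return [
    `${header} \u2014 ${question}`,
    ...lines,
    '',
    'Reply with a number, or describe your preference.',
  ].join('\n');
}

// Hypothetical helper: parse the user's reply.
function parseTextReply(reply, options) {
  const text = reply.trim();
  // Empty input must not be accepted; the caller retries or clarifies.
  if (text === '') return { kind: 'empty' };
  if (/^\d+$/.test(text)) {
    const n = Number(text);
    if (n >= 1 && n <= options.length) return { kind: 'option', value: options[n - 1] };
    // The extra "Other" entry: the caller should ask for the freeform text.
    if (n === options.length + 1) return { kind: 'other', value: '' };
  }
  // Assumption: anything else, including out-of-range numbers, is free text.
  return { kind: 'other', value: text };
}
```

With `['Cards', 'List', 'Timeline']` as options, a reply of `2` resolves to `List`, and a reply such as `a grid` takes the "Other" branch so it can be reflected back and confirmed.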
--- a/get-shit-done/workflows/discuss-phase/templates/checkpoint.json
+++ b/get-shit-done/workflows/discuss-phase/templates/checkpoint.json
@@ -0,0 +1,18 @@
+{
+  "phase": "{PHASE_NUM}",
+  "phase_name": "{phase_name}",
+  "timestamp": "{ISO timestamp}",
+  "areas_completed": ["Area 1", "Area 2"],
+  "areas_remaining": ["Area 3", "Area 4"],
+  "decisions": {
+    "Area 1": [
+      {"question": "...", "answer": "...", "options_presented": ["..."]},
+      {"question": "...", "answer": "...", "options_presented": ["..."]}
+    ],
+    "Area 2": [
+      {"question": "...", "answer": "...", "options_presented": ["..."]}
+    ]
+  },
+  "deferred_ideas": ["..."],
+  "canonical_refs": ["..."]
+}
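Reviewer sketch (not part of the patch): one way to produce an object matching the `checkpoint.json` template from live discussion state. The function and parameter names are hypothetical; only the output keys come from the template.

```javascript
// Hypothetical helper: build a checkpoint object with the template's keys.
function buildCheckpoint({
  phase,
  phaseName,
  selectedAreas,
  completedAreas,
  decisions,
  deferredIdeas = [],
  canonicalRefs = [],
}) {
  return {
    phase: String(phase),
    phase_name: phaseName,
    timestamp: new Date().toISOString(),
    areas_completed: completedAreas,
    // Remaining areas are the selected ones not yet completed, in order.
    areas_remaining: selectedAreas.filter((a) => !completedAreas.includes(a)),
    decisions,
    deferred_ideas: deferredIdeas,
    canonical_refs: canonicalRefs,
  };
}
```

The result can be serialized with `JSON.stringify(checkpoint, null, 2)` and written to the checkpoint path after each area completes.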
--- a/get-shit-done/workflows/discuss-phase/templates/context.md
+++ b/get-shit-done/workflows/discuss-phase/templates/context.md
@@ -0,0 +1,136 @@
+# CONTEXT.md template — for discuss-phase write_context step
+
+> **Lazy-loaded.** Read this file only inside the `write_context` step of
+> `workflows/discuss-phase.md`, immediately before writing
+> `${phase_dir}/${padded_phase}-CONTEXT.md`. Do not put a reference to this
+> file in `<required_reading>` — that defeats the progressive-disclosure
+> savings introduced by issue #2551.
+
+## Variable substitutions
+
+The caller substitutes:
+- `[X]` → phase number
+- `[Name]` → phase name
+- `[date]` → ISO date when context was gathered
+- `{padded_phase}` → zero-padded phase number (e.g., `07`, `15`)
+- `{N}` → counts (requirements, etc.)
+
+## Conditional sections
+
+- **`<spec_lock>`** — include only when `spec_loaded = true` (a `*-SPEC.md`
+  was found by `check_spec`). Otherwise omit the entire `<spec_lock>` block.
+- **Folded Todos / Reviewed Todos** — include subsections only when the
+  `cross_reference_todos` step folded or reviewed at least one todo.
+
+## Template body
+
+```markdown
+# Phase [X]: [Name] - Context
+
+**Gathered:** [date]
+**Status:** Ready for planning
+
+<domain>
+## Phase Boundary
+
+[Clear statement of what this phase delivers — the scope anchor]
+
+</domain>
+
+[If spec_loaded = true, insert this section:]
+<spec_lock>
+## Requirements (locked via SPEC.md)
+
+**{N} requirements are locked.** See `{padded_phase}-SPEC.md` for full requirements, boundaries, and acceptance criteria.
+
+Downstream agents MUST read `{padded_phase}-SPEC.md` before planning or implementing. Requirements are not duplicated here.
+
+**In scope (from SPEC.md):** [copy the "In scope" bullet list from SPEC.md Boundaries]
+**Out of scope (from SPEC.md):** [copy the "Out of scope" bullet list from SPEC.md Boundaries]
+
+</spec_lock>
+
+<decisions>
+## Implementation Decisions
+
+### [Category 1 that was discussed]
+- **D-01:** [Decision or preference captured]
+- **D-02:** [Another decision if applicable]
+
+### [Category 2 that was discussed]
+- **D-03:** [Decision or preference captured]
+
+### Claude's Discretion
+[Areas where user said "you decide" — note that Claude has flexibility here]
+
+### Folded Todos
+[If any todos were folded into scope from the cross_reference_todos step, list them here.
+Each entry should include the todo title, original problem, and how it fits this phase's scope.
+If no todos were folded: omit this subsection entirely.]
+
+</decisions>
+
+<canonical_refs>
+## Canonical References
+
+**Downstream agents MUST read these before planning or implementing.**
+
+[MANDATORY section. Write the FULL accumulated canonical refs list here.
+Sources: ROADMAP.md refs + REQUIREMENTS.md refs + user-referenced docs during
+discussion + any docs discovered during codebase scout. Group by topic area.
+Every entry needs a full relative path — not just a name.]
+
+### [Topic area 1]
+- `path/to/adr-or-spec.md` — [What it decides/defines that's relevant]
+- `path/to/doc.md` §N — [Specific section reference]
+
+### [Topic area 2]
+- `path/to/feature-doc.md` — [What this doc defines]
+
+[If no external specs: "No external specs — requirements fully captured in decisions above"]
+
+</canonical_refs>
+
+<code_context>
+## Existing Code Insights
+
+### Reusable Assets
+- [Component/hook/utility]: [How it could be used in this phase]
+
+### Established Patterns
+- [Pattern]: [How it constrains/enables this phase]
+
+### Integration Points
+- [Where new code connects to existing system]
+
+</code_context>
+
+<specifics>
+## Specific Ideas
+
+[Any particular references, examples, or "I want it like X" moments from discussion]
+
+[If none: "No specific requirements — open to standard approaches"]
+
+</specifics>
+
+<deferred>
+## Deferred Ideas
+
+[Ideas that came up but belong in other phases. Don't lose them.]
+
+### Reviewed Todos (not folded)
+[If any todos were reviewed in cross_reference_todos but not folded into scope,
+list them here so future phases know they were considered.
+Each entry: todo title + reason it was deferred (out of scope, belongs in Phase Y, etc.)
+If no reviewed-but-deferred todos: omit this subsection entirely.]
+
+[If none: "None — discussion stayed within phase scope"]
+
+</deferred>
+
+---
+
+*Phase: [X]-[Name]*
+*Context gathered: [date]*
+```
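Reviewer sketch (not part of the patch): the variable substitutions listed for the CONTEXT.md template amount to plain token replacement. The function name is hypothetical, and two-digit zero padding is an assumption based on the `07` and `15` examples. The sketch targets the `{padded_phase}` token as it appears in the template body and leaves `{N}` counts to the caller.

```javascript
// Hypothetical helper: fill the simple tokens of the CONTEXT.md template.
function fillContextTemplate(template, { phase, name, date }) {
  // split/join replaces every occurrence and treats the token literally.
  const fill = (text, token, value) => text.split(token).join(value);
  // Assumption: phases are padded to two digits, as in the 07 / 15 examples.
  const padded = String(phase).padStart(2, '0');
  let out = template;
  out = fill(out, '[X]', String(phase));
  out = fill(out, '[Name]', name);
  out = fill(out, '[date]', date);
  out = fill(out, '{padded_phase}', padded);
  return out;
}
```

Bracketed guidance such as `[Decision or preference captured]` is left untouched, since that text is written by the agent rather than substituted mechanically.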
--- a/get-shit-done/workflows/discuss-phase/templates/discussion-log.md
+++ b/get-shit-done/workflows/discuss-phase/templates/discussion-log.md
@@ -0,0 +1,50 @@
+# DISCUSSION-LOG.md template — for discuss-phase git_commit step
+
+> **Lazy-loaded.** Read this file only inside the `git_commit` step of
+> `workflows/discuss-phase.md`, immediately before writing
+> `${phase_dir}/${padded_phase}-DISCUSSION-LOG.md`.
+
+## Purpose
+
+Audit trail for human review (compliance, learning, retrospectives). NOT
+consumed by downstream agents — those read CONTEXT.md only.
+
+## Template body
+
+```markdown
+# Phase [X]: [Name] - Discussion Log
+
+> **Audit trail only.** Do not use as input to planning, research, or execution agents.
+> Decisions are captured in CONTEXT.md — this log preserves the alternatives considered.
+
+**Date:** [ISO date]
+**Phase:** [phase number]-[phase name]
+**Areas discussed:** [comma-separated list]
+
+---
+
+[For each gray area discussed:]
+
+## [Area Name]
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| [Option 1] | [Description from AskUserQuestion] | |
+| [Option 2] | [Description] | ✓ |
+| [Option 3] | [Description] | |
+
+**User's choice:** [Selected option or free-text response]
+**Notes:** [Any clarifications, follow-up context, or rationale the user provided]
+
+---
+
+[Repeat for each area]
+
+## Claude's Discretion
+
+[List areas where user said "you decide" or deferred to Claude]
+
+## Deferred Ideas
+
+[Ideas mentioned during discussion that were noted for future phases]
+```
--- a/get-shit-done/workflows/execute-phase.md
+++ b/get-shit-done/workflows/execute-phase.md
@@ -1271,83 +1271,14 @@ If `TEXT_MODE` is true, present as a plain-text numbered list. Otherwise use Ask
 </step>

 <step name="codebase_drift_gate">
-Post-execution structural drift detection (#2003). Runs after the last wave
-commits, before verification. **Non-blocking by contract:** any internal
-error here MUST fall through and continue to `verify_phase_goal`. The phase
+Post-execution structural drift detection (#2003). Non-blocking by contract:
+any internal error here MUST fall through to `verify_phase_goal`. The phase
 is never failed by this gate.

-```bash
-DRIFT=$(gsd-sdk query verify.codebase-drift 2>/dev/null || echo '{"skipped":true,"reason":"sdk-failed"}')
-```
-
-Parse JSON for: `skipped`, `reason`, `action_required`, `directive`,
-`spawn_mapper`, `affected_paths`, `elements`, `threshold`, `action`,
-`last_mapped_commit`, `message`.
-
-**If `skipped` is true (no STRUCTURE.md, missing git, or any internal error):**
-Log one line — `Codebase drift check skipped: {reason}` — and continue to
-`verify_phase_goal`. Do NOT prompt the user. Do NOT block.
-
-**If `action_required` is false:** Continue silently to `verify_phase_goal`.
-
-**If `action_required` is true AND `directive` is `warn`:**
-Print the `message` field verbatim. The format is:
-
-```text
-Codebase drift detected: {N} structural element(s) since last mapping.
-
-New directories:
-  - {path}
-New barrel exports:
-  - {path}
-New migrations:
-  - {path}
-New route modules:
-  - {path}
-
-Run /gsd:map-codebase --paths {affected_paths} to refresh planning context.
-```
-
-Then continue to `verify_phase_goal`. Do NOT block. Do NOT spawn anything.
-
-**If `action_required` is true AND `directive` is `auto-remap`:**
-
-First load the mapper agent's skill bundle (the executor's `AGENT_SKILLS`
-from step `init_context` is for `gsd-executor`, not the mapper):
-
-```bash
-AGENT_SKILLS_MAPPER=$(gsd-sdk query agent-skills gsd-codebase-mapper 2>/dev/null || true)
-```
-
-Then spawn `gsd-codebase-mapper` agents with the `--paths` hint:
-
-```text
-Task(
-  subagent_type="gsd-codebase-mapper",
-  description="Incremental codebase remap (drift)",
-  prompt="Focus: arch
-Today's date: {date}
--paths {affected_paths joined by comma}
-
-Refresh STRUCTURE.md and ARCHITECTURE.md scoped to the listed paths only.
-Stamp last_mapped_commit in each document's frontmatter.
-${AGENT_SKILLS_MAPPER}"
-)
-```
-
-If the spawn fails or the agent reports an error: log `Codebase drift
-auto-remap failed: {reason}` and continue to `verify_phase_goal`. The phase
-is NOT failed by a remap failure.
-
-If the remap succeeds: log `Codebase drift auto-remap completed for paths:
-{affected_paths}` and continue to `verify_phase_goal`.
-
-The two relevant config keys (continue on error / failure if either is invalid):
- `workflow.drift_threshold` (integer, default 3) — minimum drift elements before action
- `workflow.drift_action` — `warn` (default) or `auto-remap`
-
-This step is fully non-blocking — it never fails the phase, and any
-exception path returns control to `verify_phase_goal`.
+Load and follow the full step spec from
+`get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md` —
+covers the SDK call, JSON contract, `warn` vs `auto-remap` branches, mapper
+spawn template, and the two `workflow.drift_*` config keys.
 </step>

 <step name="verify_phase_goal">
--- a/get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md
+++ b/get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md
@@ -0,0 +1,79 @@
+# Step: codebase_drift_gate
+
+Post-execution structural drift detection (#2003). Runs after the last wave
+commits, before verification. **Non-blocking by contract:** any internal
+error here MUST fall through and continue to `verify_phase_goal`. The phase
+is never failed by this gate.
+
+```bash
+DRIFT=$(gsd-sdk query verify.codebase-drift 2>/dev/null || echo '{"skipped":true,"reason":"sdk-failed"}')
+```
+
+Parse JSON for: `skipped`, `reason`, `action_required`, `directive`,
+`spawn_mapper`, `affected_paths`, `elements`, `threshold`, `action`,
+`last_mapped_commit`, `message`.
+
+**If `skipped` is true (no STRUCTURE.md, missing git, or any internal error):**
+Log one line — `Codebase drift check skipped: {reason}` — and continue to
+`verify_phase_goal`. Do NOT prompt the user. Do NOT block.
+
+**If `action_required` is false:** Continue silently to `verify_phase_goal`.
+
+**If `action_required` is true AND `directive` is `warn`:**
+Print the `message` field verbatim. The format is:
+
+```text
+Codebase drift detected: {N} structural element(s) since last mapping.
+
+New directories:
+  - {path}
+New barrel exports:
+  - {path}
+New migrations:
+  - {path}
+New route modules:
+  - {path}
+
+Run /gsd:map-codebase --paths {affected_paths} to refresh planning context.
+```
+
+Then continue to `verify_phase_goal`. Do NOT block. Do NOT spawn anything.
+
+**If `action_required` is true AND `directive` is `auto-remap`:**
+
+First load the mapper agent's skill bundle (the executor's `AGENT_SKILLS`
+from step `init_context` is for `gsd-executor`, not the mapper):
+
+```bash
+AGENT_SKILLS_MAPPER=$(gsd-sdk query agent-skills gsd-codebase-mapper 2>/dev/null || true)
+```
+
+Then spawn `gsd-codebase-mapper` agents with the `--paths` hint:
+
+```text
+Task(
+  subagent_type="gsd-codebase-mapper",
+  description="Incremental codebase remap (drift)",
+  prompt="Focus: arch
+Today's date: {date}
+--paths {affected_paths joined by comma}
+
+Refresh STRUCTURE.md and ARCHITECTURE.md scoped to the listed paths only.
+Stamp last_mapped_commit in each document's frontmatter.
+${AGENT_SKILLS_MAPPER}"
+)
+```
+
+If the spawn fails or the agent reports an error: log `Codebase drift
+auto-remap failed: {reason}` and continue to `verify_phase_goal`. The phase
+is NOT failed by a remap failure.
+
+If the remap succeeds: log `Codebase drift auto-remap completed for paths:
+{affected_paths}` and continue to `verify_phase_goal`.
+
+The two relevant config keys (if either is invalid, continue rather than fail the phase):
+- `workflow.drift_threshold` (integer, default 3) — minimum drift elements before action
+- `workflow.drift_action` — `warn` (default) or `auto-remap`
+
+This step is fully non-blocking — it never fails the phase, and any
+exception path returns control to `verify_phase_goal`.
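Reviewer sketch (not part of the patch): the non-blocking contract of `codebase_drift_gate` can be summarized as a router that always ends at `verify_phase_goal`. The function name, the returned field names, and the `unparseable-json` reason are hypothetical. Treating an unrecognized `directive` as `warn` is also an assumption of this sketch.

```javascript
// Hypothetical helper: decide what the gate does with the SDK's JSON output.
// Every path returns next: 'verify_phase_goal', so the phase is never failed.
function routeDriftGate(rawJson) {
  const next = 'verify_phase_goal';
  let drift;
  try {
    drift = JSON.parse(rawJson);
  } catch {
    drift = null;
  }
  // Assumption: unparseable output is treated like any other skipped check.
  if (drift === null || typeof drift !== 'object') {
    drift = { skipped: true, reason: 'unparseable-json' };
  }

  if (drift.skipped) {
    return { next, log: `Codebase drift check skipped: ${drift.reason}` };
  }
  if (!drift.action_required) return { next };
  if (drift.directive === 'auto-remap') {
    return { next, spawnMapper: true, paths: drift.affected_paths };
  }
  // `warn`, or any directive this sketch does not recognize: print and move on.
  return { next, print: drift.message };
}
```

A failed mapper spawn would be handled by the caller in the same spirit: log the failure and continue to `verify_phase_goal`.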
--- a/tests/agent-frontmatter.test.cjs
+++ b/tests/agent-frontmatter.test.cjs
@@ -359,16 +359,23 @@ describe('VERIFY: data-flow trace, environment audit, and behavioral spot-checks

 describe('DISCUSS: discussion log generation', () => {
  test('discuss-phase workflow references DISCUSSION-LOG.md generation', () => {
-    const content = fs.readFileSync(
+    // After #2551 progressive-disclosure refactor, the DISCUSSION-LOG.md template
+    // body lives in workflows/discuss-phase/templates/discussion-log.md and is
+    // read at the git_commit step. Both files together must satisfy the
+    // documentation contract.
+    const parent = fs.readFileSync(
      path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8'
    );
+    const tplPath = path.join(WORKFLOWS_DIR, 'discuss-phase', 'templates', 'discussion-log.md');
+    const tpl = fs.existsSync(tplPath) ? fs.readFileSync(tplPath, 'utf-8') : '';
+    const content = parent + '\n' + tpl;
    assert.ok(
      content.includes('DISCUSSION-LOG.md'),
      'discuss-phase must reference DISCUSSION-LOG.md generation'
    );
    assert.ok(
      content.includes('Audit trail only'),
-      'discuss-phase must mark discussion log as audit-only'
+      'discuss-phase (or its discussion-log template after #2551) must mark discussion log as audit-only'
    );
  });

--- a/tests/bug-2549-2550-2552-discuss-phase-context.test.cjs
+++ b/tests/bug-2549-2550-2552-discuss-phase-context.test.cjs
@@ -16,17 +16,35 @@ const path = require('node:path');
 const DISCUSS_PHASE = path.join(
  __dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase.md',
 );
+// After #2551 progressive-disclosure refactor, the scout_codebase phase-type
+// table and split-reads warning live in references/scout-codebase.md.
+const SCOUT_REF = path.join(
+  __dirname, '..', 'get-shit-done', 'references', 'scout-codebase.md',
+);
+
+function readDiscussContext() {
+  // Both files are required after #2551 — fail loudly if either is missing
+  // rather than silently weakening the regression coverage.
+  for (const p of [DISCUSS_PHASE, SCOUT_REF]) {
+    assert.ok(fs.existsSync(p), `Required discuss-phase context source missing: ${p}`);
+  }
+  return [DISCUSS_PHASE, SCOUT_REF].map(p => fs.readFileSync(p, 'utf-8')).join('\n');
+}

 describe('discuss-phase context fixes (#2549, #2550, #2552)', () => {
  let src;
  test('discuss-phase.md source exists', () => {
    assert.ok(fs.existsSync(DISCUSS_PHASE), 'discuss-phase.md must exist');
-    src = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
+    assert.ok(
+      fs.existsSync(SCOUT_REF),
+      'references/scout-codebase.md must exist after #2551 extraction',
+    );
+    src = readDiscussContext();
  });

  // ─── #2549: load_prior_context cap ──────────────────────────────────────
  test('#2549: load_prior_context must NOT instruct reading ALL prior CONTEXT.md files', () => {
-    if (!src) src = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
+    if (!src) src = readDiscussContext();
    assert.ok(
      !src.includes('For each CONTEXT.md where phase number < current phase'),
      'load_prior_context must not unboundedly read all prior CONTEXT.md files',
@@ -34,9 +52,13 @@ describe('discuss-phase context fixes (#2549, #2550, #2552)', () => {
  });

  test('#2549: load_prior_context must reference a bounded read (3 phases or DECISIONS-INDEX)', () => {
-    if (!src) src = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
-    const hasBound = src.includes('3') && src.includes('prior CONTEXT.md');
-    const hasIndex = src.includes('DECISIONS-INDEX.md');
+    // Read ONLY the parent file — `src.includes('3')` against the
+    // concatenated source can be satisfied by unrelated occurrences of "3"
+    // in scout-codebase.md (e.g., "3-5 most relevant files"), masking a
+    // regression where the parent drops the bounded-read instruction.
+    const parent = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
+    const hasBound = /\b(?:most recent|latest|last|up to)\s+3\b[\s\S]{0,160}\bprior CONTEXT\.md\b/i.test(parent);
+    const hasIndex = parent.includes('DECISIONS-INDEX.md');
    assert.ok(
      hasBound || hasIndex,
      'load_prior_context must reference a bounded read (e.g., most recent 3 phases) or DECISIONS-INDEX.md',
@@ -45,7 +67,7 @@ describe('discuss-phase context fixes (#2549, #2550, #2552)', () => {

  // ─── #2550: scout_codebase phase-type selection ──────────────────────────
  test('#2550: scout_codebase must not instruct reading all 7 codebase maps', () => {
-    if (!src) src = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
+    if (!src) src = readDiscussContext();
    assert.ok(
      !src.includes('Read the most relevant ones (CONVENTIONS.md, STRUCTURE.md, STACK.md based on phase type)'),
      'scout_codebase must not use the old vague "most relevant" instruction without a selection table',
@@ -53,7 +75,7 @@ describe('discuss-phase context fixes (#2549, #2550, #2552)', () => {
  });

  test('#2550: scout_codebase must include a phase-type-to-maps selection table', () => {
-    if (!src) src = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
+    if (!src) src = readDiscussContext();
    // The table maps phase types to specific map selections
    assert.ok(
      src.includes('Phase type') && src.includes('Read these maps'),
@@ -68,7 +90,7 @@ describe('discuss-phase context fixes (#2549, #2550, #2552)', () => {

  // ─── #2552: no split reads ───────────────────────────────────────────────
  test('#2552: scout_codebase must explicitly prohibit split reads of the same file', () => {
-    if (!src) src = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
+    if (!src) src = readDiscussContext();
    const prohibitsSplit = src.includes('split reads') || src.includes('split read');
    assert.ok(
      prohibitsSplit,
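Reviewer sketch (not part of the patch): a quick demonstration of why the bounded-read test for #2549 moved from `src.includes('3')` to an anchored regex. The regex is copied from the test above; the two sample strings are invented for illustration and do not come from the workflow files.

```javascript
// Regex copied from the #2549 bounded-read test.
const BOUNDED_READ = /\b(?:most recent|latest|last|up to)\s+3\b[\s\S]{0,160}\bprior CONTEXT\.md\b/i;

// The old, looser check that the test replaced.
const looseCheck = (s) => s.includes('3') && s.includes('prior CONTEXT.md');

// Invented sample: an instruction that does bound the read.
const bounded = 'Read the most recent 3 prior CONTEXT.md files.';
// Invented sample: the bound is gone, but an unrelated "3" appears elsewhere.
const regressed = 'Read every prior CONTEXT.md file. See the 3-5 most relevant files.';

console.log(BOUNDED_READ.test(bounded));   // true
console.log(BOUNDED_READ.test(regressed)); // false
console.log(looseCheck(regressed));        // true: the old check misses the regression
```

The regex requires the quantity phrase and `prior CONTEXT.md` to appear within 160 characters of each other, which is what ties the number to the instruction it is meant to bound.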
--- a/tests/chain-flag-plan-phase.test.cjs
+++ b/tests/chain-flag-plan-phase.test.cjs
@@ -16,6 +16,15 @@ const path = require('path');
 describe('plan-phase chain flag preservation (#1620)', () => {
  const planPath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'plan-phase.md');
  const discussPath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase.md');
+  // After #2551, discuss-phase chain logic moved to modes/chain.md.
+  const discussChainPath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase', 'modes', 'chain.md');
+  const readDiscuss = () => {
+    // Fail loudly if either source is missing — silent filtering would let a
+    // regression that deletes modes/chain.md pass this whole suite.
+    assert.ok(fs.existsSync(discussPath), `discuss-phase.md missing: ${discussPath}`);
+    assert.ok(fs.existsSync(discussChainPath), `discuss-phase/modes/chain.md missing after #2551 split: ${discussChainPath}`);
+    return [discussPath, discussChainPath].map(p => fs.readFileSync(p, 'utf8')).join('\n');
+  };

  test('plan-phase sync-flag guard checks both --auto AND --chain', () => {
    const content = fs.readFileSync(planPath, 'utf8');
@@ -37,7 +46,7 @@ describe('plan-phase chain flag preservation (#1620)', () => {

  test('plan-phase and discuss-phase use the same guard pattern for clearing _auto_chain_active', () => {
    const planContent = fs.readFileSync(planPath, 'utf8');
-    const discussContent = fs.readFileSync(discussPath, 'utf8');
+    const discussContent = readDiscuss();

    const guardPattern = 'if [[ ! "$ARGUMENTS" =~ --auto ]] && [[ ! "$ARGUMENTS" =~ --chain ]]; then';

@@ -47,7 +56,7 @@ describe('plan-phase chain flag preservation (#1620)', () => {
    );
    assert.ok(
      discussContent.includes(guardPattern),
-      'discuss-phase should use the dual-flag guard pattern'
+      'discuss-phase (or discuss-phase/modes/chain.md after #2551 split) should use the dual-flag guard pattern'
    );
  });

--- a/tests/discuss-checkpoint.test.cjs
+++ b/tests/discuss-checkpoint.test.cjs
@@ -14,9 +14,25 @@ const path = require('path');

 describe('discuss-phase incremental checkpoint saves (#1485)', () => {
  const workflowPath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase.md');
+  // After #2551 progressive-disclosure refactor, checkpoint logic lives in the
+  // default mode file and the JSON schema lives in the templates directory.
+  const defaultModePath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase', 'modes', 'default.md');
+  const checkpointTplPath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase', 'templates', 'checkpoint.json');
+
+  function readAll() {
+    // Fail loudly if any required source is missing — silent filtering would
+    // let a regression that deletes the extracted default-mode or checkpoint
+    // template pass the suite.
+    for (const p of [workflowPath, defaultModePath, checkpointTplPath]) {
+      assert.ok(fs.existsSync(p), `Required discuss-phase checkpoint source missing: ${p}`);
+    }
+    return [workflowPath, defaultModePath, checkpointTplPath]
+      .map(p => fs.readFileSync(p, 'utf8'))
+      .join('\n');
+  }

  test('workflow writes checkpoint file after each area completes', () => {
-    const content = fs.readFileSync(workflowPath, 'utf8');
+    const content = readAll();
    assert.ok(
      content.includes('DISCUSS-CHECKPOINT.json'),
      'workflow should reference checkpoint JSON file'
@@ -28,14 +44,14 @@ describe('discuss-phase incremental checkpoint saves (#1485)', () => {
  });

  test('checkpoint includes decisions, areas completed, and areas remaining', () => {
-    const content = fs.readFileSync(workflowPath, 'utf8');
+    const content = readAll();
    assert.ok(content.includes('areas_completed'), 'checkpoint should track completed areas');
    assert.ok(content.includes('areas_remaining'), 'checkpoint should track remaining areas');
    assert.ok(content.includes('"decisions"'), 'checkpoint should include decisions object');
  });

  test('check_existing step detects checkpoint for session resume', () => {
-    const content = fs.readFileSync(workflowPath, 'utf8');
+    const content = readAll();
    // The check_existing step should look for checkpoint files
    assert.ok(
      content.includes('DISCUSS-CHECKPOINT.json') && content.includes('Resume'),
--- a/tests/discuss-phase-power.test.cjs
+++ b/tests/discuss-phase-power.test.cjs
@@ -37,12 +37,17 @@ describe('discuss-phase power user mode (#1513)', () => {

  describe('main workflow file (discuss-phase.md)', () => {
    test('has power_user_mode section or references discuss-phase-power.md', () => {
-      const content = fs.readFileSync(workflowPath, 'utf8');
-      const hasPowerSection = content.includes('power_user_mode') || content.includes('power user mode');
+      // After #2551, the power dispatch lives in discuss-phase/modes/power.md and
+      // the parent references it via the dispatch table.
+      const parentContent = fs.readFileSync(workflowPath, 'utf8');
+      const powerModePath = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase', 'modes', 'power.md');
+      const powerMode = fs.existsSync(powerModePath) ? fs.readFileSync(powerModePath, 'utf8') : '';
+      const content = parentContent + '\n' + powerMode;
+      const hasPowerSection = content.includes('power_user_mode') || content.includes('power user mode') || content.includes('modes/power.md');
      const hasReference = content.includes('discuss-phase-power');
      assert.ok(
        hasPowerSection || hasReference,
-        'discuss-phase.md should have power_user_mode section or reference discuss-phase-power.md'
+        'discuss-phase.md (or modes/power.md after #2551) should have power_user_mode section or reference discuss-phase-power.md'
      );
    });

--- a/tests/thinking-partner.test.cjs
+++ b/tests/thinking-partner.test.cjs
@@ -91,48 +91,51 @@ describe('Thinking Partner Integration (#1726)', () => {
  });

  // Workflow integration tests
+  // After #2551 progressive-disclosure refactor, the thinking-partner block
+  // moved into the per-mode files (default.md, advisor.md) since the prompt
+  // is mode-specific (only fires inside discuss_areas, after a user answer).
  describe('Discuss-phase integration', () => {
-    test('discuss-phase.md contains thinking partner conditional block', () => {
-      const content = fs.readFileSync(
+    function readDiscussFamily() {
+      const candidates = [
        path.join(GSD_ROOT, 'workflows', 'discuss-phase.md'),
-        'utf-8'
-      );
+        path.join(GSD_ROOT, 'workflows', 'discuss-phase', 'modes', 'default.md'),
+        path.join(GSD_ROOT, 'workflows', 'discuss-phase', 'modes', 'advisor.md'),
+      ];
+      return candidates
+        .filter(p => fs.existsSync(p))
+        .map(p => fs.readFileSync(p, 'utf-8'))
+        .join('\n');
+    }
+
+    test('discuss-phase.md contains thinking partner conditional block', () => {
+      const content = readDiscussFamily();
      assert.ok(
        content.includes('Thinking partner (conditional)'),
-        'discuss-phase.md should contain thinking partner conditional block'
+        'discuss-phase workflow family should contain thinking partner conditional block'
      );
    });

    test('discuss-phase references features.thinking_partner config', () => {
-      const content = fs.readFileSync(
-        path.join(GSD_ROOT, 'workflows', 'discuss-phase.md'),
-        'utf-8'
-      );
+      const content = readDiscussFamily();
      assert.ok(
        content.includes('features.thinking_partner'),
-        'discuss-phase.md should reference the config key'
+        'discuss-phase workflow family should reference the config key'
      );
    });

    test('discuss-phase references thinking-partner.md for signal list', () => {
-      const content = fs.readFileSync(
-        path.join(GSD_ROOT, 'workflows', 'discuss-phase.md'),
-        'utf-8'
-      );
+      const content = readDiscussFamily();
      assert.ok(
        content.includes('references/thinking-partner.md'),
-        'discuss-phase.md should reference the signal list doc'
+        'discuss-phase workflow family should reference the signal list doc'
      );
    });

    test('discuss-phase offers skip option', () => {
-      const content = fs.readFileSync(
-        path.join(GSD_ROOT, 'workflows', 'discuss-phase.md'),
-        'utf-8'
-      );
+      const content = readDiscussFamily();
      assert.ok(
        content.includes('No, decision made'),
-        'discuss-phase.md should offer a skip/decline option'
+        'discuss-phase workflow family should offer a skip/decline option'
      );
    });
  });
--- a/tests/workflow-size-budget.test.cjs
+++ b/tests/workflow-size-budget.test.cjs
@@ -0,0 +1,317 @@
+/**
+ * Workflow size budget.
+ *
+ * Workflow definitions in `get-shit-done/workflows/*.md` are loaded verbatim
+ * into Claude's context every time the corresponding `/gsd:*` command is
+ * invoked. Unbounded growth is paid on every invocation across every session.
+ *
+ * Tiered the same way as agent budgets (#2361):
+ *   - XL       : top-level orchestrators (e.g., execute-phase, plan-phase)
+ *   - LARGE    : multi-step planners
+ *   - DEFAULT  : focused single-purpose workflows (target tier)
+ *
+ * Raising a budget is a deliberate choice — adjust the constant, write a
+ * rationale in the PR, and confirm the bloat is not duplicated content
+ * that belongs in `get-shit-done/references/` or a per-mode subdirectory
+ * (see `workflows/discuss-phase/modes/` for the progressive-disclosure
+ * pattern introduced by #2551).
+ *
+ * See:
+ *   - https://github.com/gsd-build/get-shit-done/issues/2551 (this test)
+ *   - https://github.com/gsd-build/get-shit-done/issues/2361 (agent budget)
+ */
+
+const { test, describe } = require('node:test');
+const assert = require('node:assert/strict');
+const fs = require('fs');
+const path = require('path');
+
+const WORKFLOWS_DIR = path.join(__dirname, '..', 'get-shit-done', 'workflows');
+
+const XL_BUDGET = 1700;
+const LARGE_BUDGET = 1500;
+const DEFAULT_BUDGET = 1000;
+
+// Top-level orchestrators that own end-to-end multi-phase rubrics.
+// Grandfathered at current sizes; see issue #2551 for the progressive-disclosure
+// pattern that future shrinks should follow.
+const XL_WORKFLOWS = new Set([
+  'execute-phase',  // 1622
+  'plan-phase',     // 1493
+  'new-project',    // 1391
+]);
+
+// Multi-step planners and bigger feature workflows. Grandfathered.
+const LARGE_WORKFLOWS = new Set([
+  'docs-update',           // 1155
+  'autonomous',            // 789
+  'complete-milestone',    // 847
+  'verify-work',           // 740
+  'transition',            // 693
+  'help',                  // 667
+  'discuss-phase-assumptions', // 670
+  'progress',              // 619
+  'new-milestone',         // 611
+  'update',                // 587
+  'quick',                 // 971
+  'code-review',           // 515
+]);
+
+const ALL_WORKFLOWS = fs.readdirSync(WORKFLOWS_DIR)
+  .filter(f => f.endsWith('.md'))
+  .map(f => path.basename(f, '.md'));
+
+function budgetFor(workflow) {
+  if (XL_WORKFLOWS.has(workflow)) return { tier: 'XL', limit: XL_BUDGET };
+  if (LARGE_WORKFLOWS.has(workflow)) return { tier: 'LARGE', limit: LARGE_BUDGET };
+  return { tier: 'DEFAULT', limit: DEFAULT_BUDGET };
+}
+
+function lineCount(filePath) {
+  const content = fs.readFileSync(filePath, 'utf-8');
+  if (content.length === 0) return 0;
+  const trailingNewline = content.endsWith('\n') ? 1 : 0;
+  return content.split('\n').length - trailingNewline;
+}
+
+describe('SIZE: workflow line-count budget', () => {
+  for (const workflow of ALL_WORKFLOWS) {
+    const { tier, limit } = budgetFor(workflow);
+    test(`${workflow} (${tier}) stays within ${limit} lines`, () => {
+      const filePath = path.join(WORKFLOWS_DIR, workflow + '.md');
+      const lines = lineCount(filePath);
+      assert.ok(
+        lines <= limit,
+        `${workflow}.md has ${lines} lines — exceeds ${tier} budget of ${limit}. ` +
+        `Extract per-mode bodies to a workflows/${workflow}/modes/ subdirectory, ` +
+        `templates to workflows/${workflow}/templates/, or shared references ` +
+        `to get-shit-done/references/. See workflows/discuss-phase/ for the pattern.`
+      );
+    });
+  }
+});
+
+describe('SIZE: discuss-phase progressive disclosure (issue #2551)', () => {
+  // Issue #2551 explicitly targets discuss-phase.md at <500 lines, separate from
+  // the per-tier grandfathered budgets above. This is the headline metric of the
+  // refactor — every other workflow above 500 is grandfathered at its current
+  // size and may shrink later by following the same pattern.
+  const DISCUSS_PHASE_TARGET = 500;
+  test(`discuss-phase.md is under ${DISCUSS_PHASE_TARGET} lines (issue #2551 target)`, () => {
+    const filePath = path.join(WORKFLOWS_DIR, 'discuss-phase.md');
+    const lines = lineCount(filePath);
+    assert.ok(
+      lines < DISCUSS_PHASE_TARGET,
+      `discuss-phase.md has ${lines} lines — must be under ${DISCUSS_PHASE_TARGET} per #2551. ` +
+      `Per-mode logic belongs in workflows/discuss-phase/modes/<mode>.md, ` +
+      `templates in workflows/discuss-phase/templates/.`
+    );
+  });
+
+  const SUBDIR = path.join(WORKFLOWS_DIR, 'discuss-phase');
+
+  test('mode files exist for every documented mode', () => {
+    const expected = ['power', 'all', 'auto', 'chain', 'text', 'batch', 'analyze', 'default', 'advisor'];
+    for (const mode of expected) {
+      const p = path.join(SUBDIR, 'modes', `${mode}.md`);
+      assert.ok(
+        fs.existsSync(p),
+        `Expected mode file ${path.relative(WORKFLOWS_DIR, p)} — missing. ` +
+        `Each --flag in commands/gsd/discuss-phase.md must have a matching mode file.`
+      );
+    }
+  });
+
+  test('every mode file is a real, non-empty workflow doc', () => {
+    const modesDir = path.join(SUBDIR, 'modes');
+    if (!fs.existsSync(modesDir)) {
+      assert.fail(`workflows/discuss-phase/modes/ directory does not exist`);
+    }
+    for (const file of fs.readdirSync(modesDir)) {
+      if (!file.endsWith('.md')) continue;
+      const p = path.join(modesDir, file);
+      const content = fs.readFileSync(p, 'utf-8');
+      assert.ok(content.trim().length > 100,
+        `${file} is empty or near-empty (${content.trim().length} chars after trimming); extraction must preserve behavior, not stub it out`);
+    }
+  });
+
+  test('templates extracted to discuss-phase/templates/', () => {
+    const expected = ['context.md', 'discussion-log.md', 'checkpoint.json'];
+    for (const t of expected) {
+      const p = path.join(SUBDIR, 'templates', t);
+      assert.ok(fs.existsSync(p),
+        `Expected template ${path.relative(WORKFLOWS_DIR, p)} — missing.`);
+    }
+  });
+
+  test('parent discuss-phase.md dispatches to mode files (power)', () => {
+    const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
+    assert.ok(
+      /discuss-phase\/modes\/power\.md/.test(parent) ||
+      /discuss-phase-power\.md/.test(parent),
+      `Parent discuss-phase.md must reference workflows/discuss-phase/modes/power.md ` +
+      `(or the legacy discuss-phase-power.md alias) somewhere in its dispatch logic.`
+    );
+  });
+
+  test('parent dispatches to all extracted modes (auto, chain, all, advisor)', () => {
+    const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
+    for (const mode of ['auto', 'chain', 'all', 'advisor']) {
+      assert.ok(
+        new RegExp(`discuss-phase/modes/${mode}\\.md`).test(parent),
+        `Parent discuss-phase.md must reference workflows/discuss-phase/modes/${mode}.md`
+      );
+    }
+  });
+
+  test('parent reads CONTEXT.md template at the write step (not at top)', () => {
+    const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
+    // The template reference must appear inside or near the write_context step,
+    // not in the top-level <required_reading> block (which would defeat lazy load).
+    const requiredReadingMatch = parent.match(/<required_reading>([\s\S]*?)<\/required_reading>/);
+    if (requiredReadingMatch) {
+      assert.ok(
+        !/discuss-phase\/templates\/context\.md/.test(requiredReadingMatch[1]),
+        `CONTEXT.md template must NOT be in <required_reading> — that defeats lazy loading. ` +
+        `Read it inside the write_context step, just before writing the file.`
+      );
+    }
+    assert.ok(
+      /discuss-phase\/templates\/context\.md/.test(parent),
+      `Parent must reference workflows/discuss-phase/templates/context.md somewhere ` +
+      `(inside write_context step) so the template loads only when CONTEXT.md is being written.`
+    );
+  });
+
+  test('advisor block is gated behind USER-PROFILE.md existence check', () => {
+    const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
+    // The guard MUST be a file-existence check (test -f or equivalent), not an
+    // unconditional Read of the advisor mode file.
+    assert.ok(
+      /USER-PROFILE\.md/.test(parent),
+      'Parent must reference USER-PROFILE.md to detect advisor mode'
+    );
+    assert.ok(
+      /test\s+-[ef]\s+["'$].*USER-PROFILE/.test(parent) ||
+      /\[\[\s+-[ef]\s+["'$].*USER-PROFILE/.test(parent) ||
+      /\[\s+-[ef]\s+["'$].*USER-PROFILE/.test(parent),
+      'Advisor mode detection must use a file-existence guard (test -f / [ -f ]) ' +
+      'so the advisor mode file is only Read when USER-PROFILE.md exists.'
+    );
+    // Confirm advisor.md Read is conditional on ADVISOR_MODE
+    const advisorReadGuarded =
+      /ADVISOR_MODE[\s\S]{0,200}?modes\/advisor\.md/.test(parent) ||
+      /modes\/advisor\.md[\s\S]{0,200}?ADVISOR_MODE/.test(parent) ||
+      /if[\s\S]{0,200}?ADVISOR_MODE[\s\S]{0,400}?advisor\.md/.test(parent);
+    assert.ok(
+      advisorReadGuarded,
+      'Read of modes/advisor.md must be guarded by ADVISOR_MODE (which derives from USER-PROFILE.md existence). ' +
+      'Skip the Read entirely when no profile is present.'
+    );
+  });
+
+  test('auto mode file documents skipping interactive questions (regression)', () => {
+    const auto = fs.readFileSync(path.join(SUBDIR, 'modes', 'auto.md'), 'utf-8');
+    assert.ok(
+      /skip[\s\S]{0,80}interactive|without\s+(?:using\s+)?AskUserQuestion|recommended\s+(?:option|default)/i.test(auto),
+      `auto.md must preserve the documented behavior: skip interactive questions ` +
+      `and pick the recommended option without using AskUserQuestion.`
+    );
+  });
+
+  test('auto mode preserves the single-pass cap (regression for inline rule)', () => {
+    const auto = fs.readFileSync(path.join(SUBDIR, 'modes', 'auto.md'), 'utf-8');
+    assert.ok(
+      /single\s+pass|max_discuss_passes|MAX_PASSES|pass\s+cap/i.test(auto),
+      `auto.md must preserve the auto-mode pass cap rule from the original workflow. ` +
+      `Without it, the workflow can self-feed and consume unbounded resources.`
+    );
+  });
+
+  test('all mode file documents auto-selecting all gray areas (regression)', () => {
+    const allMode = fs.readFileSync(path.join(SUBDIR, 'modes', 'all.md'), 'utf-8');
+    assert.ok(
+      /auto-select(?:ed)?\s+ALL|select\s+ALL|all\s+gray\s+areas/i.test(allMode),
+      `all.md must preserve the documented behavior: auto-select ALL gray areas ` +
+      `without asking the user.`
+    );
+  });
+
+  test('chain mode documents auto-advance to plan-phase (regression)', () => {
+    const chain = fs.readFileSync(path.join(SUBDIR, 'modes', 'chain.md'), 'utf-8');
+    assert.ok(
+      /plan-phase/.test(chain) && /(auto-advance|auto\s+plan)/i.test(chain),
+      `chain.md must preserve the documented auto-advance to plan-phase behavior.`
+    );
+  });
+
+  test('text mode documents replacing AskUserQuestion (regression)', () => {
+    const textMode = fs.readFileSync(path.join(SUBDIR, 'modes', 'text.md'), 'utf-8');
+    assert.ok(
+      /AskUserQuestion/.test(textMode) && /(numbered\s+list|plain[-\s]text)/i.test(textMode),
+      `text.md must preserve the rule: replace AskUserQuestion with plain-text numbered lists.`
+    );
+  });
+
+  test('batch mode documents 2-5 question grouping (regression)', () => {
+    const batch = fs.readFileSync(path.join(SUBDIR, 'modes', 'batch.md'), 'utf-8');
+    assert.ok(
+      /2[-\s–]5|2\s+to\s+5|--batch=N|--batch\s+N/.test(batch),
+      `batch.md must preserve the 2-5 questions-per-batch rule.`
+    );
+  });
+
+  test('analyze mode documents trade-off table presentation (regression)', () => {
+    const analyze = fs.readFileSync(path.join(SUBDIR, 'modes', 'analyze.md'), 'utf-8');
+    assert.ok(
+      /trade[-\s]off|tradeoff|pros[\s\S]{0,30}cons/i.test(analyze),
+      `analyze.md must preserve the trade-off analysis presentation rule.`
+    );
+  });
+
+  test('CONTEXT.md template preserves all required sections', () => {
+    const tpl = fs.readFileSync(path.join(SUBDIR, 'templates', 'context.md'), 'utf-8');
+    for (const section of ['<domain>', '<decisions>', '<canonical_refs>', '<code_context>', '<specifics>', '<deferred>']) {
+      assert.ok(tpl.includes(section),
+        `CONTEXT.md template missing required section ${section} — extraction dropped content.`);
+    }
+    // spec_lock is conditional but the template still has to include it as a documented option
+    assert.ok(/spec_lock/i.test(tpl),
+      `CONTEXT.md template must document the conditional <spec_lock> section for SPEC.md integration.`);
+  });
+
+  test('checkpoint template is valid JSON', () => {
+    const raw = fs.readFileSync(path.join(SUBDIR, 'templates', 'checkpoint.json'), 'utf-8');
+    assert.doesNotThrow(() => JSON.parse(raw),
+      `checkpoint.json template must parse as valid JSON — downstream code reads it.`);
+    const parsed = JSON.parse(raw);
+    for (const key of ['phase', 'phase_name', 'timestamp', 'areas_completed', 'areas_remaining', 'decisions']) {
+      assert.ok(key in parsed,
+        `checkpoint.json template missing required field "${key}" — schema regression vs original workflow.`);
+    }
+  });
+
+  test('parent does not leak per-mode bodies inline (would defeat extraction)', () => {
+    const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
+    // Heuristic: the parent should not contain the full DISCUSSION-LOG.md template body
+    // (extracted to templates/discussion-log.md) — that's the heaviest single block.
+    // Look for unique strings that ONLY appear in the original inline template.
+    const inlineDiscussionLogSignal = /\| Option \| Description \| Selected \|/g;
+    const occurrences = (parent.match(inlineDiscussionLogSignal) || []).length;
+    assert.ok(occurrences === 0,
+      `Parent discuss-phase.md still contains the inline DISCUSSION-LOG.md table — ` +
+      `that block must move to workflows/discuss-phase/templates/discussion-log.md.`);
+  });
+
+  test('parent still references $ARGUMENTS or mode flags for dispatch', () => {
+    // Sanity check: the parent file should explicitly handle the mode dispatch
+    // rather than silently doing nothing on an unknown flag pattern.
+    const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
+    assert.ok(
+      /ARGUMENTS|--auto|--chain|--all|--power/.test(parent),
+      'Parent must dispatch on $ARGUMENTS — losing the flag-parsing block would silently ' +
+      'fall back to default mode and obscure user errors.'
+    );
+  });
+});