Files
get-shit-done/tests/workflow-size-budget.test.cjs
Tom Boucher 41dc475c46 refactor(workflows): extract discuss-phase modes/templates/advisor for progressive disclosure (closes #2551) (#2607)
* refactor(workflows): extract discuss-phase modes/templates/advisor for progressive disclosure (closes #2551)

Splits 1,347-line workflows/discuss-phase.md into a 495-line dispatcher plus
per-mode files in workflows/discuss-phase/modes/ and templates in
workflows/discuss-phase/templates/. Mirrors the progressive-disclosure
pattern that #2361 enforced for agents.

- Per-mode files: power, all, auto, chain, text, batch, analyze, default, advisor
- Templates lazy-loaded at the step that produces the artifact (CONTEXT.md
  template at write_context, DISCUSSION-LOG.md template at git_commit,
  checkpoint.json schema when checkpointing)
- Advisor mode gated behind `[ -f $HOME/.claude/get-shit-done/USER-PROFILE.md ]`
  — inverse of #2174's --advisor flag (don't pay the cost when unused)
- scout_codebase phase-type→map selection table extracted to
  references/scout-codebase.md
- New tests/workflow-size-budget.test.cjs enforces tiered budgets across
  all workflows/*.md (XL=1700 / LARGE=1500 / DEFAULT=1000) plus the
  explicit <500 ceiling for discuss-phase.md per #2551
- Existing tests updated to read from the new file locations after the
  split (functional equivalence preserved — content moved, not removed)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2607): align modes/auto.md check_existing with parent (Update it, not Skip)

CodeRabbit flagged drift between the parent step (which auto-selects "Update
it") and modes/auto.md (which documented "Skip"). The pre-refactor file had
both — line 182 said "Skip" in the overview, line 250 said "Update it" in the
actual step. The step is authoritative. Fix the new mode file to match.

Refs: PR #2607 review comment 3127783430

* test(#2607): harden discuss-phase regression tests after #2551 split

CodeRabbit identified four test smells where the split weakened coverage:

- workflow-size-budget: assertion was unreachable (entered if-block on match,
  then asserted occurrences === 0 — always failed). Now unconditional.
- bug-2549-2550-2552: bounded-read assertion checked concatenated source, so
  src.includes('3') was satisfied by unrelated content in scout-codebase.md
  (e.g., "3-5 most relevant files"). Now reads parent only with a stricter
  regex. Also asserts SCOUT_REF exists.
- chain-flag-plan-phase: filter(existsSync) silently skipped a missing
  modes/chain.md. Now fails loudly via explicit asserts.
- discuss-checkpoint: same silent-filter pattern across three sources. Now
  asserts each required path before reading.

Refs: PR #2607 review comments 3127783457, 3127783452, plus nitpicks for
chain-flag-plan-phase.test.cjs:21-24 and discuss-checkpoint.test.cjs:22-27

* docs(#2607): fix INVENTORY count, context.md placeholders, scout grep portability

- INVENTORY.md: subdirectory note said "50 top-level references" but the
  section header now says 51. Updated to 51.
- templates/context.md: footer hardcoded XX-name instead of declared
  placeholders [X]/[Name], which would leak sample text into generated
  CONTEXT.md files. Now uses the declared placeholders.
- references/scout-codebase.md: no-maps fallback used grep -rl with
  "\\|" alternation (GNU grep only — silent on BSD/macOS grep). Switched
  to grep -rlE with extended regex for portability.

Refs: PR #2607 review comments 3127783404, 3127783448, plus nitpick for
scout-codebase.md:32-40

* docs(#2607): label fenced examples + clarify overlay/advisor precedence

- analyze.md / text.md / default.md: add language tags (markdown/text) to
  fenced example blocks to silence markdownlint MD040 warnings flagged by
  CodeRabbit (one fence in analyze.md, two in text.md, five in default.md).
- discuss-phase.md: document overlay stacking rules in discuss_areas — fixed
  outer→inner order --analyze → --batch → --text, with a pointer to each
  overlay file for mode-specific precedence.
- advisor.md: add tie-breaker rules for NON_TECHNICAL_OWNER signals — explicit
  technical_background overrides inferred signals; otherwise OR-aggregate;
  contradictory explanation_depth values resolve by most-recent-wins.

Refs: PR #2607 review comments 3127783415, 3127783437, plus nitpicks for
default.md:24, discuss-phase.md:345-365, and advisor.md:51-56

* fix(#2607): extract codebase_drift_gate body to keep execute-phase under XL budget

PR #2605 added 80 lines to execute-phase.md (1622 -> 1702), pushing it over
the XL_BUDGET=1700 line cap enforced by tests/workflow-size-budget.test.cjs
(introduced by this PR). Per the test's own remediation hint and #2551's
progressive-disclosure pattern, extract the codebase_drift_gate step body to
get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md and leave
a brief pointer in the workflow. execute-phase.md is now 1633 lines.

Budget is NOT relaxed; the offending workflow is tightened.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 21:57:24 -04:00

318 lines
14 KiB
JavaScript
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
/**
* Workflow size budget.
*
* Workflow definitions in `get-shit-done/workflows/*.md` are loaded verbatim
* into Claude's context every time the corresponding `/gsd:*` command is
* invoked. Unbounded growth is paid on every invocation across every session.
*
* Tiered the same way as agent budgets (#2361):
* - XL : top-level orchestrators (e.g., execute-phase, autonomous)
* - LARGE : multi-step planners
* - DEFAULT : focused single-purpose workflows (target tier)
*
* Raising a budget is a deliberate choice — adjust the constant, write a
* rationale in the PR, and confirm the bloat is not duplicated content
* that belongs in `get-shit-done/references/` or a per-mode subdirectory
* (see `workflows/discuss-phase/modes/` for the progressive-disclosure
* pattern introduced by #2551).
*
* See:
* - https://github.com/gsd-build/get-shit-done/issues/2551 (this test)
* - https://github.com/gsd-build/get-shit-done/issues/2361 (agent budget)
*/
const { test, describe } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const path = require('path');
const WORKFLOWS_DIR = path.join(__dirname, '..', 'get-shit-done', 'workflows');
const XL_BUDGET = 1700;
const LARGE_BUDGET = 1500;
const DEFAULT_BUDGET = 1000;
// Top-level orchestrators that own end-to-end multi-phase rubrics.
// Grandfathered at current sizes — see PR #2551 for #2551 progressive-disclosure
// pattern that future shrinks should follow.
const XL_WORKFLOWS = new Set([
'execute-phase', // 1622
'plan-phase', // 1493
'new-project', // 1391
]);
// Multi-step planners and bigger feature workflows. Grandfathered.
const LARGE_WORKFLOWS = new Set([
'docs-update', // 1155
'autonomous', // 789
'complete-milestone', // 847
'verify-work', // 740
'transition', // 693
'help', // 667
'discuss-phase-assumptions', // 670
'progress', // 619
'new-milestone', // 611
'update', // 587
'quick', // 971
'code-review', // 515
]);
const ALL_WORKFLOWS = fs.readdirSync(WORKFLOWS_DIR)
.filter(f => f.endsWith('.md'))
.map(f => f.replace('.md', ''));
function budgetFor(workflow) {
if (XL_WORKFLOWS.has(workflow)) return { tier: 'XL', limit: XL_BUDGET };
if (LARGE_WORKFLOWS.has(workflow)) return { tier: 'LARGE', limit: LARGE_BUDGET };
return { tier: 'DEFAULT', limit: DEFAULT_BUDGET };
}
function lineCount(filePath) {
const content = fs.readFileSync(filePath, 'utf-8');
if (content.length === 0) return 0;
const trailingNewline = content.endsWith('\n') ? 1 : 0;
return content.split('\n').length - trailingNewline;
}
describe('SIZE: workflow line-count budget', () => {
for (const workflow of ALL_WORKFLOWS) {
const { tier, limit } = budgetFor(workflow);
test(`${workflow} (${tier}) stays under ${limit} lines`, () => {
const filePath = path.join(WORKFLOWS_DIR, workflow + '.md');
const lines = lineCount(filePath);
assert.ok(
lines <= limit,
`${workflow}.md has ${lines} lines — exceeds ${tier} budget of ${limit}. ` +
`Extract per-mode bodies to a workflows/${workflow}/modes/ subdirectory, ` +
`templates to workflows/${workflow}/templates/, or shared references ` +
`to get-shit-done/references/. See workflows/discuss-phase/ for the pattern.`
);
});
}
});
describe('SIZE: discuss-phase progressive disclosure (issue #2551)', () => {
// Issue #2551 explicitly targets discuss-phase.md at <500 lines, separate from
// the per-tier grandfathered budgets above. This is the headline metric of the
// refactor — every other workflow above 500 is grandfathered at its current
// size and may shrink later by following the same pattern.
const DISCUSS_PHASE_TARGET = 500;
test(`discuss-phase.md is under ${DISCUSS_PHASE_TARGET} lines (issue #2551 target)`, () => {
const filePath = path.join(WORKFLOWS_DIR, 'discuss-phase.md');
const lines = lineCount(filePath);
assert.ok(
lines < DISCUSS_PHASE_TARGET,
`discuss-phase.md has ${lines} lines — must be under ${DISCUSS_PHASE_TARGET} per #2551. ` +
`Per-mode logic belongs in workflows/discuss-phase/modes/<mode>.md, ` +
`templates in workflows/discuss-phase/templates/.`
);
});
const SUBDIR = path.join(WORKFLOWS_DIR, 'discuss-phase');
test('mode files exist for every documented mode', () => {
const expected = ['power', 'all', 'auto', 'chain', 'text', 'batch', 'analyze', 'default', 'advisor'];
for (const mode of expected) {
const p = path.join(SUBDIR, 'modes', `${mode}.md`);
assert.ok(
fs.existsSync(p),
`Expected mode file ${path.relative(WORKFLOWS_DIR, p)} — missing. ` +
`Each --flag in commands/gsd/discuss-phase.md must have a matching mode file.`
);
}
});
test('every mode file is a real, non-empty workflow doc', () => {
const modesDir = path.join(SUBDIR, 'modes');
if (!fs.existsSync(modesDir)) {
assert.fail(`workflows/discuss-phase/modes/ directory does not exist`);
}
for (const file of fs.readdirSync(modesDir)) {
if (!file.endsWith('.md')) continue;
const p = path.join(modesDir, file);
const content = fs.readFileSync(p, 'utf-8');
assert.ok(content.trim().length > 100,
`${file} is empty or near-empty (${content.length} chars) — extraction must preserve behavior, not stub it out`);
}
});
test('templates extracted to discuss-phase/templates/', () => {
const expected = ['context.md', 'discussion-log.md', 'checkpoint.json'];
for (const t of expected) {
const p = path.join(SUBDIR, 'templates', t);
assert.ok(fs.existsSync(p),
`Expected template ${path.relative(WORKFLOWS_DIR, p)} — missing.`);
}
});
test('parent discuss-phase.md dispatches to mode files (power)', () => {
const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
assert.ok(
/discuss-phase\/modes\/power\.md/.test(parent) ||
/discuss-phase-power\.md/.test(parent),
`Parent discuss-phase.md must reference workflows/discuss-phase/modes/power.md ` +
`(or the legacy discuss-phase-power.md alias) somewhere in its dispatch logic.`
);
});
test('parent dispatches to all extracted modes (auto, chain, all, advisor)', () => {
const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
for (const mode of ['auto', 'chain', 'all', 'advisor']) {
assert.ok(
new RegExp(`discuss-phase/modes/${mode}\\.md`).test(parent),
`Parent discuss-phase.md must reference workflows/discuss-phase/modes/${mode}.md`
);
}
});
test('parent reads CONTEXT.md template at the write step (not at top)', () => {
const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
// The template reference must appear inside or near the write_context step,
// not in the top-level <required_reading> block (which would defeat lazy load).
const requiredReadingMatch = parent.match(/<required_reading>([\s\S]*?)<\/required_reading>/);
if (requiredReadingMatch) {
assert.ok(
!/discuss-phase\/templates\/context\.md/.test(requiredReadingMatch[1]),
`CONTEXT.md template must NOT be in <required_reading> — that defeats lazy loading. ` +
`Read it inside the write_context step, just before writing the file.`
);
}
assert.ok(
/discuss-phase\/templates\/context\.md/.test(parent),
`Parent must reference workflows/discuss-phase/templates/context.md somewhere ` +
`(inside write_context step) so the template loads only when CONTEXT.md is being written.`
);
});
test('advisor block is gated behind USER-PROFILE.md existence check', () => {
const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
// The guard MUST be a file-existence check (test -f or equivalent), not an
// unconditional Read of the advisor mode file.
assert.ok(
/USER-PROFILE\.md/.test(parent),
'Parent must reference USER-PROFILE.md to detect advisor mode'
);
assert.ok(
/test\s+-[ef]\s+["'$].*USER-PROFILE/.test(parent) ||
/\[\[\s+-[ef]\s+["'$].*USER-PROFILE/.test(parent) ||
/\[\s+-[ef]\s+["'$].*USER-PROFILE/.test(parent),
'Advisor mode detection must use a file-existence guard (test -f / [ -f ]) ' +
'so the advisor mode file is only Read when USER-PROFILE.md exists.'
);
// Confirm advisor.md Read is conditional on ADVISOR_MODE
const advisorReadGuarded =
/ADVISOR_MODE[\s\S]{0,200}?modes\/advisor\.md/.test(parent) ||
/modes\/advisor\.md[\s\S]{0,200}?ADVISOR_MODE/.test(parent) ||
/if[\s\S]{0,200}?ADVISOR_MODE[\s\S]{0,400}?advisor\.md/.test(parent);
assert.ok(
advisorReadGuarded,
'Read of modes/advisor.md must be guarded by ADVISOR_MODE (which derives from USER-PROFILE.md existence). ' +
'Skip the Read entirely when no profile is present.'
);
});
test('auto mode file documents skipping interactive questions (regression)', () => {
const auto = fs.readFileSync(path.join(SUBDIR, 'modes', 'auto.md'), 'utf-8');
assert.ok(
/skip[\s\S]{0,80}interactive|without\s+(?:using\s+)?AskUserQuestion|recommended\s+(?:option|default)/i.test(auto),
`auto.md must preserve the documented behavior: skip interactive questions ` +
`and pick the recommended option without using AskUserQuestion.`
);
});
test('auto mode preserves the single-pass cap (regression for inline rule)', () => {
const auto = fs.readFileSync(path.join(SUBDIR, 'modes', 'auto.md'), 'utf-8');
assert.ok(
/single\s+pass|max_discuss_passes|MAX_PASSES|pass\s+cap/i.test(auto),
`auto.md must preserve the auto-mode pass cap rule from the original workflow. ` +
`Without it, the workflow can self-feed and consume unbounded resources.`
);
});
test('all mode file documents auto-selecting all gray areas (regression)', () => {
const allMode = fs.readFileSync(path.join(SUBDIR, 'modes', 'all.md'), 'utf-8');
assert.ok(
/auto-select(?:ed)?\s+ALL|select\s+ALL|all\s+gray\s+areas/i.test(allMode),
`all.md must preserve the documented behavior: auto-select ALL gray areas ` +
`without asking the user.`
);
});
test('chain mode documents auto-advance to plan-phase (regression)', () => {
const chain = fs.readFileSync(path.join(SUBDIR, 'modes', 'chain.md'), 'utf-8');
assert.ok(
/plan-phase/.test(chain) && /(auto-advance|auto\s+plan)/i.test(chain),
`chain.md must preserve the documented auto-advance to plan-phase behavior.`
);
});
test('text mode documents replacing AskUserQuestion (regression)', () => {
const textMode = fs.readFileSync(path.join(SUBDIR, 'modes', 'text.md'), 'utf-8');
assert.ok(
/AskUserQuestion/.test(textMode) && /(numbered\s+list|plain[-\s]text)/i.test(textMode),
`text.md must preserve the rule: replace AskUserQuestion with plain-text numbered lists.`
);
});
test('batch mode documents 2-5 question grouping (regression)', () => {
const batch = fs.readFileSync(path.join(SUBDIR, 'modes', 'batch.md'), 'utf-8');
assert.ok(
/2[-\s]5|2\s+to\s+5|--batch=N|--batch\s+N/.test(batch),
`batch.md must preserve the 2-5 questions-per-batch rule.`
);
});
test('analyze mode documents trade-off table presentation (regression)', () => {
const analyze = fs.readFileSync(path.join(SUBDIR, 'modes', 'analyze.md'), 'utf-8');
assert.ok(
/trade[-\s]off|tradeoff|pros[\s\S]{0,30}cons/i.test(analyze),
`analyze.md must preserve the trade-off analysis presentation rule.`
);
});
test('CONTEXT.md template preserves all required sections', () => {
const tpl = fs.readFileSync(path.join(SUBDIR, 'templates', 'context.md'), 'utf-8');
for (const section of ['<domain>', '<decisions>', '<canonical_refs>', '<code_context>', '<specifics>', '<deferred>']) {
assert.ok(tpl.includes(section),
`CONTEXT.md template missing required section ${section} — extraction dropped content.`);
}
// spec_lock is conditional but the template still has to include it as a documented option
assert.ok(/spec_lock/i.test(tpl),
`CONTEXT.md template must document the conditional <spec_lock> section for SPEC.md integration.`);
});
test('checkpoint template is valid JSON', () => {
const raw = fs.readFileSync(path.join(SUBDIR, 'templates', 'checkpoint.json'), 'utf-8');
assert.doesNotThrow(() => JSON.parse(raw),
`checkpoint.json template must parse as valid JSON — downstream code reads it.`);
const parsed = JSON.parse(raw);
for (const key of ['phase', 'phase_name', 'timestamp', 'areas_completed', 'areas_remaining', 'decisions']) {
assert.ok(key in parsed,
`checkpoint.json template missing required field "${key}" — schema regression vs original workflow.`);
}
});
test('parent does not leak per-mode bodies inline (would defeat extraction)', () => {
const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
// Heuristic: the parent should not contain the full DISCUSSION-LOG.md template body
// (extracted to templates/discussion-log.md) — that's the heaviest single block.
// Look for unique strings that ONLY appear in the original inline template.
const inlineDiscussionLogSignal = /\| Option \| Description \| Selected \|/g;
const occurrences = (parent.match(inlineDiscussionLogSignal) || []).length;
assert.ok(occurrences === 0,
`Parent discuss-phase.md still contains the inline DISCUSSION-LOG.md table — ` +
`that block must move to workflows/discuss-phase/templates/discussion-log.md.`);
});
test('negative: invalid mode flag combinations document a clear error path', () => {
// Sanity check: the parent file should explicitly handle the mode dispatch
// rather than silently doing nothing on an unknown flag pattern.
const parent = fs.readFileSync(path.join(WORKFLOWS_DIR, 'discuss-phase.md'), 'utf-8');
assert.ok(
/ARGUMENTS|--auto|--chain|--all|--power/.test(parent),
'Parent must dispatch on $ARGUMENTS — losing the flag-parsing block would silently ' +
'fall back to default mode and obscure user errors.'
);
});
});