mirror of
https://github.com/glittercowboy/get-shit-done
synced 2026-04-25 17:25:23 +02:00
* refactor(workflows): extract discuss-phase modes/templates/advisor for progressive disclosure (closes #2551)

  Splits 1,347-line workflows/discuss-phase.md into a 495-line dispatcher plus per-mode files in workflows/discuss-phase/modes/ and templates in workflows/discuss-phase/templates/. Mirrors the progressive-disclosure pattern that #2361 enforced for agents.

  - Per-mode files: power, all, auto, chain, text, batch, analyze, default, advisor
  - Templates lazy-loaded at the step that produces the artifact (CONTEXT.md template at write_context, DISCUSSION-LOG.md template at git_commit, checkpoint.json schema when checkpointing)
  - Advisor mode gated behind `[ -f $HOME/.claude/get-shit-done/USER-PROFILE.md ]`, the inverse of #2174's --advisor flag (don't pay the cost when unused)
  - scout_codebase phase-type→map selection table extracted to references/scout-codebase.md
  - New tests/workflow-size-budget.test.cjs enforces tiered budgets across all workflows/*.md (XL=1700 / LARGE=1500 / DEFAULT=1000) plus the explicit <500 ceiling for discuss-phase.md per #2551
  - Existing tests updated to read from the new file locations after the split (functional equivalence preserved: content moved, not removed)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2607): align modes/auto.md check_existing with parent (Update it, not Skip)

  CodeRabbit flagged drift between the parent step (which auto-selects "Update it") and modes/auto.md (which documented "Skip"). The pre-refactor file had both: line 182 said "Skip" in the overview, line 250 said "Update it" in the actual step. The step is authoritative. Fix the new mode file to match.

  Refs: PR #2607 review comment 3127783430

* test(#2607): harden discuss-phase regression tests after #2551 split

  CodeRabbit identified four test smells where the split weakened coverage:

  - workflow-size-budget: assertion was unreachable (entered if-block on match, then asserted occurrences === 0, so it always failed). Now unconditional.
  - bug-2549-2550-2552: bounded-read assertion checked concatenated source, so src.includes('3') was satisfied by unrelated content in scout-codebase.md (e.g., "3-5 most relevant files"). Now reads parent only with a stricter regex. Also asserts SCOUT_REF exists.
  - chain-flag-plan-phase: filter(existsSync) silently skipped a missing modes/chain.md. Now fails loudly via explicit asserts.
  - discuss-checkpoint: same silent-filter pattern across three sources. Now asserts each required path before reading.

  Refs: PR #2607 review comments 3127783457, 3127783452, plus nitpicks for chain-flag-plan-phase.test.cjs:21-24 and discuss-checkpoint.test.cjs:22-27

* docs(#2607): fix INVENTORY count, context.md placeholders, scout grep portability

  - INVENTORY.md: subdirectory note said "50 top-level references" but the section header now says 51. Updated to 51.
  - templates/context.md: footer hardcoded XX-name instead of declared placeholders [X]/[Name], which would leak sample text into generated CONTEXT.md files. Now uses the declared placeholders.
  - references/scout-codebase.md: no-maps fallback used grep -rl with "\\|" alternation (GNU grep only; silent on BSD/macOS grep). Switched to grep -rlE with extended regex for portability.

  Refs: PR #2607 review comments 3127783404, 3127783448, plus nitpick for scout-codebase.md:32-40

* docs(#2607): label fenced examples + clarify overlay/advisor precedence

  - analyze.md / text.md / default.md: add language tags (markdown/text) to fenced example blocks to silence markdownlint MD040 warnings flagged by CodeRabbit (one fence in analyze.md, two in text.md, five in default.md).
  - discuss-phase.md: document overlay stacking rules in discuss_areas: fixed outer→inner order --analyze → --batch → --text, with a pointer to each overlay file for mode-specific precedence.
  - advisor.md: add tie-breaker rules for NON_TECHNICAL_OWNER signals: explicit technical_background overrides inferred signals; otherwise OR-aggregate; contradictory explanation_depth values resolve by most-recent-wins.

  Refs: PR #2607 review comments 3127783415, 3127783437, plus nitpicks for default.md:24, discuss-phase.md:345-365, and advisor.md:51-56

* fix(#2607): extract codebase_drift_gate body to keep execute-phase under XL budget

  PR #2605 added 80 lines to execute-phase.md (1622 -> 1702), pushing it over the XL_BUDGET=1700 line cap enforced by tests/workflow-size-budget.test.cjs (introduced by this PR). Per the test's own remediation hint and #2551's progressive-disclosure pattern, extract the codebase_drift_gate step body to get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md and leave a brief pointer in the workflow. execute-phase.md is now 1633 lines. Budget is NOT relaxed; the offending workflow is tightened.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
101 lines
4.6 KiB
JavaScript
'use strict';

/**
 * Bugs #2549, #2550, #2552: discuss-phase context bloat and cache invalidation.
 *
 * #2549: load_prior_context must cap prior CONTEXT.md reads (was O(phases))
 * #2550: scout_codebase must select maps by phase type (was always all 7)
 * #2552: scout_codebase must not instruct split reads of the same file
 */

const { test, describe } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('node:fs');
const path = require('node:path');

const DISCUSS_PHASE = path.join(
  __dirname, '..', 'get-shit-done', 'workflows', 'discuss-phase.md',
);
// After #2551 progressive-disclosure refactor, the scout_codebase phase-type
// table and split-reads warning live in references/scout-codebase.md.
const SCOUT_REF = path.join(
  __dirname, '..', 'get-shit-done', 'references', 'scout-codebase.md',
);

function readDiscussContext() {
  // Both files are required after #2551; fail loudly if either is missing
  // rather than silently weakening the regression coverage.
  for (const p of [DISCUSS_PHASE, SCOUT_REF]) {
    assert.ok(fs.existsSync(p), `Required discuss-phase context source missing: ${p}`);
  }
  return [DISCUSS_PHASE, SCOUT_REF].map(p => fs.readFileSync(p, 'utf-8')).join('\n');
}

describe('discuss-phase context fixes (#2549, #2550, #2552)', () => {
  let src;
  test('discuss-phase.md source exists', () => {
    assert.ok(fs.existsSync(DISCUSS_PHASE), 'discuss-phase.md must exist');
    assert.ok(
      fs.existsSync(SCOUT_REF),
      'references/scout-codebase.md must exist after #2551 extraction',
    );
    src = readDiscussContext();
  });

  // ─── #2549: load_prior_context cap ──────────────────────────────────────
  test('#2549: load_prior_context must NOT instruct reading ALL prior CONTEXT.md files', () => {
    if (!src) src = readDiscussContext();
    assert.ok(
      !src.includes('For each CONTEXT.md where phase number < current phase'),
      'load_prior_context must not unboundedly read all prior CONTEXT.md files',
    );
  });

  test('#2549: load_prior_context must reference a bounded read (3 phases or DECISIONS-INDEX)', () => {
    // Read ONLY the parent file: `src.includes('3')` against the
    // concatenated source can be satisfied by unrelated occurrences of "3"
    // in scout-codebase.md (e.g., "3-5 most relevant files"), masking a
    // regression where the parent drops the bounded-read instruction.
    const parent = fs.readFileSync(DISCUSS_PHASE, 'utf-8');
    const hasBound = /\b(?:most recent|latest|last|up to)\s+3\b[\s\S]{0,160}\bprior CONTEXT\.md\b/i.test(parent);
    const hasIndex = parent.includes('DECISIONS-INDEX.md');
    assert.ok(
      hasBound || hasIndex,
      'load_prior_context must reference a bounded read (e.g., most recent 3 phases) or DECISIONS-INDEX.md',
    );
  });

  // ─── #2550: scout_codebase phase-type selection ──────────────────────────
  test('#2550: scout_codebase must not instruct reading all 7 codebase maps', () => {
    if (!src) src = readDiscussContext();
    assert.ok(
      !src.includes('Read the most relevant ones (CONVENTIONS.md, STRUCTURE.md, STACK.md based on phase type)'),
      'scout_codebase must not use the old vague "most relevant" instruction without a selection table',
    );
  });

  test('#2550: scout_codebase must include a phase-type-to-maps selection table', () => {
    if (!src) src = readDiscussContext();
    // The table maps phase types to specific map selections
    assert.ok(
      src.includes('Phase type') && src.includes('Read these maps'),
      'scout_codebase must include a phase-type to map-selection table',
    );
    // Key phase types must be covered
    assert.ok(src.includes('UI') || src.includes('frontend'), 'Table must cover UI/frontend phases');
    assert.ok(src.includes('Backend') || src.includes('API'), 'Table must cover backend phases');
    assert.ok(src.includes('Testing'), 'Table must cover testing phases');
    assert.ok(src.includes('Mixed'), 'Table must have a fallback for mixed/unclear phases');
  });

  // ─── #2552: no split reads ───────────────────────────────────────────────
  test('#2552: scout_codebase must explicitly prohibit split reads of the same file', () => {
    if (!src) src = readDiscussContext();
    const prohibitsSplit = src.includes('split reads') || src.includes('split read');
    assert.ok(
      prohibitsSplit,
      'scout_codebase must explicitly warn against split reads (same file, two offsets) that break prompt cache',
    );
  });
});