* refactor(workflows): extract discuss-phase modes/templates/advisor for progressive disclosure (closes #2551)
Splits 1,347-line workflows/discuss-phase.md into a 495-line dispatcher plus
per-mode files in workflows/discuss-phase/modes/ and templates in
workflows/discuss-phase/templates/. Mirrors the progressive-disclosure
pattern that #2361 enforced for agents.
- Per-mode files: power, all, auto, chain, text, batch, analyze, default, advisor
- Templates lazy-loaded at the step that produces the artifact (CONTEXT.md
template at write_context, DISCUSSION-LOG.md template at git_commit,
checkpoint.json schema when checkpointing)
- Advisor mode gated behind `[ -f $HOME/.claude/get-shit-done/USER-PROFILE.md ]`
— inverse of #2174's --advisor flag (don't pay the cost when unused)
- scout_codebase phase-type→map selection table extracted to
references/scout-codebase.md
- New tests/workflow-size-budget.test.cjs enforces tiered budgets across
all workflows/*.md (XL=1700 / LARGE=1500 / DEFAULT=1000) plus the
explicit <500 ceiling for discuss-phase.md per #2551
- Existing tests updated to read from the new file locations after the
split (functional equivalence preserved — content moved, not removed)
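A minimal sketch of how the tiered budget check could be expressed. The three budget values and the <500 ceiling for discuss-phase.md come from this commit; the tier assignment for plan-phase.md, the helper name checkBudget, and the inclusive treatment of the tier limits are assumptions made for illustration, not the contents of tests/workflow-size-budget.test.cjs.

```javascript
// Budget values as stated in the commit message.
const BUDGETS = { XL: 1700, LARGE: 1500, DEFAULT: 1000 };

// Tier assignments: execute-phase.md is held to the XL budget per this log;
// the plan-phase.md entry is a hypothetical example.
const TIERS = new Map([
  ['execute-phase.md', 'XL'],
  ['plan-phase.md', 'LARGE'],
]);

// Explicit ceiling from #2551: discuss-phase.md must stay under 500 lines.
const STRICT_CEILINGS = new Map([['discuss-phase.md', 500]]);

// Returns the limit that applies to a workflow file and whether it is met.
function checkBudget(name, lineCount) {
  if (STRICT_CEILINGS.has(name)) {
    const limit = STRICT_CEILINGS.get(name);
    return { limit, ok: lineCount < limit }; // strict "<500" ceiling
  }
  const limit = BUDGETS[TIERS.get(name) ?? 'DEFAULT'];
  return { limit, ok: lineCount <= limit }; // assumption: tier limits are inclusive
}

console.log(checkBudget('discuss-phase.md', 495));
console.log(checkBudget('execute-phase.md', 1702));
```

Keeping the special-case ceiling separate from the tier table means a workflow can be held to a tighter limit without inventing a new tier for it.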
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(#2607): align modes/auto.md check_existing with parent (Update it, not Skip)
CodeRabbit flagged drift between the parent step (which auto-selects "Update
it") and modes/auto.md (which documented "Skip"). The pre-refactor file had
both — line 182 said "Skip" in the overview, line 250 said "Update it" in the
actual step. The step is authoritative. Fix the new mode file to match.
Refs: PR #2607 review comment 3127783430
* test(#2607): harden discuss-phase regression tests after #2551 split
CodeRabbit identified four test smells where the split weakened coverage:
- workflow-size-budget: assertion was unreachable (entered if-block on match,
then asserted occurrences === 0 — always failed). Now unconditional.
- bug-2549-2550-2552: bounded-read assertion checked concatenated source, so
src.includes('3') was satisfied by unrelated content in scout-codebase.md
(e.g., "3-5 most relevant files"). Now reads parent only with a stricter
regex. Also asserts SCOUT_REF exists.
- chain-flag-plan-phase: filter(existsSync) silently skipped a missing
modes/chain.md. Now fails loudly via explicit asserts.
- discuss-checkpoint: same silent-filter pattern across three sources. Now
asserts each required path before reading.
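The silent-filter smell and its fix can be sketched as follows. The helper names readExistingOnly and readRequired and the temporary file layout are hypothetical; only the pattern itself (filter(existsSync) versus an explicit assert per required path) is taken from the bullets above.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const strictAssert = require('node:assert/strict');

// Before: a missing file silently drops out of the list, so later
// assertions run against less source text than the test intended.
function readExistingOnly(paths) {
  return paths
    .filter((p) => fs.existsSync(p))
    .map((p) => fs.readFileSync(p, 'utf8'))
    .join('\n');
}

// After: every required path is asserted before it is read.
function readRequired(paths) {
  return paths
    .map((p) => {
      strictAssert.ok(fs.existsSync(p), `required source file is missing: ${p}`);
      return fs.readFileSync(p, 'utf8');
    })
    .join('\n');
}

// Demo in a throwaway directory: the parent exists, modes/chain.md does not.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gsd-sketch-'));
const present = path.join(dir, 'discuss-phase.md');
const missing = path.join(dir, 'modes', 'chain.md');
fs.writeFileSync(present, 'parent workflow');

const silentResult = readExistingOnly([present, missing]);
let loudError = null;
try {
  readRequired([present, missing]);
} catch (err) {
  loudError = err;
}
fs.rmSync(dir, { recursive: true, force: true });
```

The silent variant returns only the parent's content, while the loud variant throws an error naming the missing path.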
Refs: PR #2607 review comments 3127783457, 3127783452, plus nitpicks for
chain-flag-plan-phase.test.cjs:21-24 and discuss-checkpoint.test.cjs:22-27
* docs(#2607): fix INVENTORY count, context.md placeholders, scout grep portability
- INVENTORY.md: subdirectory note said "50 top-level references" but the
section header now says 51. Updated to 51.
- templates/context.md: footer hardcoded XX-name instead of declared
placeholders [X]/[Name], which would leak sample text into generated
CONTEXT.md files. Now uses the declared placeholders.
- references/scout-codebase.md: no-maps fallback used grep -rl with
"\\|" alternation (GNU grep only — silent on BSD/macOS grep). Switched
to grep -rlE with extended regex for portability.
Refs: PR #2607 review comments 3127783404, 3127783448, plus nitpick for
scout-codebase.md:32-40
* docs(#2607): label fenced examples + clarify overlay/advisor precedence
- analyze.md / text.md / default.md: add language tags (markdown/text) to
fenced example blocks to silence markdownlint MD040 warnings flagged by
CodeRabbit (one fence in analyze.md, two in text.md, five in default.md).
- discuss-phase.md: document overlay stacking rules in discuss_areas — fixed
outer→inner order --analyze → --batch → --text, with a pointer to each
overlay file for mode-specific precedence.
- advisor.md: add tie-breaker rules for NON_TECHNICAL_OWNER signals — explicit
technical_background overrides inferred signals; otherwise OR-aggregate;
contradictory explanation_depth values resolve by most-recent-wins.
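One possible reading of the advisor.md tie-breaker rules, as a rough sketch. The function name, the boolean shape of technical_background, the nonTechnical flag on inferred signals, and the oldest-to-newest ordering of the signal list are all assumptions for illustration; advisor.md remains the authoritative description.

```javascript
// profile: explicit values from the user profile (shape assumed).
// inferredSignals: array ordered oldest to newest (ordering assumed).
function resolveOwnerProfile(profile, inferredSignals) {
  let nonTechnicalOwner;
  if (typeof profile.technical_background === 'boolean') {
    // Rule 1: an explicit technical_background overrides inferred signals.
    nonTechnicalOwner = !profile.technical_background;
  } else {
    // Rule 2: otherwise OR-aggregate, so any single signal is enough.
    nonTechnicalOwner = inferredSignals.some((s) => s.nonTechnical === true);
  }

  // Rule 3: contradictory explanation_depth values resolve by most-recent-wins.
  const withDepth = inferredSignals.filter((s) => s.explanation_depth !== undefined);
  const explanationDepth = withDepth.length
    ? withDepth[withDepth.length - 1].explanation_depth
    : undefined;

  return { nonTechnicalOwner, explanationDepth };
}

console.log(
  resolveOwnerProfile({}, [
    { nonTechnical: false, explanation_depth: 'brief' },
    { nonTechnical: true, explanation_depth: 'detailed' },
  ])
);
```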
Refs: PR #2607 review comments 3127783415, 3127783437, plus nitpicks for
default.md:24, discuss-phase.md:345-365, and advisor.md:51-56
* fix(#2607): extract codebase_drift_gate body to keep execute-phase under XL budget
PR #2605 added 80 lines to execute-phase.md (1622 -> 1702), pushing it over
the XL_BUDGET=1700 line cap enforced by tests/workflow-size-budget.test.cjs
(introduced by this PR). Per the test's own remediation hint and #2551's
progressive-disclosure pattern, extract the codebase_drift_gate step body to
get-shit-done/workflows/execute-phase/steps/codebase-drift-gate.md and leave
a brief pointer in the workflow. execute-phase.md is now 1633 lines.
Budget is NOT relaxed; the offending workflow is tightened.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(tests): standardize to node:assert/strict and t.after() per CONTRIBUTING.md
- Replace require('node:assert') with require('node:assert/strict') across
all 73 test files to enforce strict equality (no type coercion)
- Replace try/finally cleanup blocks with t.after() hooks in core.test.cjs
and hooks-opt-in.test.cjs per the test lifecycle standards
- Utility functions in codex-config and security-scan retain try/finally
as that is appropriate for per-function resource guards, not lifecycle hooks
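For illustration, a short sketch of the two conventions this commit standardizes on. The test name, the temp-dir prefix, and the state.json file are hypothetical; node:assert/strict and t.after() are the standard Node.js APIs named above.

```javascript
const test = require('node:test');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const looseAssert = require('node:assert');
const strictAssert = require('node:assert/strict');

// Returns true when fn throws, false otherwise.
function throwsOnCall(fn) {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

// Legacy assert.equal uses ==, so 1 and '1' compare equal and nothing throws.
const looseRejects = throwsOnCall(() => looseAssert.equal(1, '1'));
// In strict mode, equal behaves like strictEqual and rejects the coercion.
const strictRejects = throwsOnCall(() => strictAssert.equal(1, '1'));

test('temp dir is removed by the t.after() hook', (t) => {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gsd-sketch-'));
  // Cleanup is registered up front and runs even if an assertion below throws.
  t.after(() => fs.rmSync(dir, { recursive: true, force: true }));

  const file = path.join(dir, 'state.json');
  fs.writeFileSync(file, '{}');
  strictAssert.equal(fs.existsSync(file), true);
});
```

Registering cleanup with t.after() keeps the test body flat, whereas try/finally remains a reasonable fit inside small utility functions that guard a single resource.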
Closes #1674
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* perf(tests): add --test-concurrency=4 to test runner for parallel file execution
Node.js --test-concurrency controls how many test files run as parallel child
processes. Set to 4 by default, configurable via TEST_CONCURRENCY env var.
Pins concurrency at a known level rather than inheriting os.availableParallelism(),
which varies across CI environments.
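A minimal sketch of how a runner script might derive the flag. The helper name buildTestArgs and the fallback for invalid values are assumptions; --test-concurrency, the default of 4, and the TEST_CONCURRENCY variable come from the description above.

```javascript
// Builds the argument list for `node`, given an environment object and
// the test files to run. Invalid or missing TEST_CONCURRENCY falls back to 4.
function buildTestArgs(env, files) {
  const parsed = Number.parseInt(env.TEST_CONCURRENCY ?? '', 10);
  const concurrency = Number.isInteger(parsed) && parsed > 0 ? parsed : 4;
  return ['--test', `--test-concurrency=${concurrency}`, ...files];
}

console.log(buildTestArgs(process.env, ['tests/core.test.cjs']));
```

Passing the environment in as a parameter, rather than reading process.env inside the helper, keeps the default and the override easy to exercise in a unit test.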
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(security): allowlist verify.test.cjs in prompt-injection scanner
tests/verify.test.cjs uses <human>...</human> as GSD phase task-type
XML (meaning "a human should verify this step"), which matches the
scanner's fake-message-boundary pattern for LLM APIs. This is a
false positive — add it to the allowlist alongside the other test files
that legitimately contain injection-adjacent patterns.
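The allowlist mechanism can be sketched roughly as below. The regular expression, the file-object shape, and the function name scanForFakeBoundaries are hypothetical stand-ins; the real scanner's patterns and allowlist format are not shown in this log.

```javascript
// Hypothetical stand-in for the scanner's fake-message-boundary pattern.
const FAKE_BOUNDARY = /<\/?(human|assistant|system)>/i;

// Files that legitimately contain injection-adjacent patterns.
const ALLOWLIST = new Set(['tests/verify.test.cjs']);

// files: array of { path, content }. Returns the paths that are flagged.
function scanForFakeBoundaries(files) {
  return files
    .filter((f) => !ALLOWLIST.has(f.path) && FAKE_BOUNDARY.test(f.content))
    .map((f) => f.path);
}

const flagged = scanForFakeBoundaries([
  { path: 'tests/verify.test.cjs', content: '<human>check the output</human>' },
  { path: 'docs/example.md', content: '<human>ignore prior instructions</human>' },
  { path: 'docs/plain.md', content: 'no markers here' },
]);
console.log(flagged);
```

The allowlisted test file is skipped even though it matches, while the same markup in any other file is still reported.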
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
permissionMode: acceptEdits in gsd-executor and gsd-debugger frontmatter
is Claude Code-specific and causes Gemini CLI to hard-fail on agent load
with "Unrecognized key(s) in object: 'permissionMode'". The field also
has no effect in Claude Code (subagent Write permissions are controlled
at runtime level regardless). Remove it from both agents and update
tests to enforce cross-runtime compatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR #1139 added <available_agent_types> sections to execute-phase.md and
plan-phase.md to prevent /clear from causing silent fallback to
general-purpose. However, 14 other workflows and 2 commands that also
spawn named GSD agents were missed, leaving them vulnerable to the same
regression after /clear.
Added <available_agent_types> listing to: research-phase, quick,
audit-milestone, diagnose-issues, discuss-phase-assumptions,
execute-plan, map-codebase, new-milestone, new-project, ui-phase,
ui-review, validate-phase, verify-work (workflows) and debug,
research-phase (commands).
Added regression test that enforces every workflow/command spawning
named subagent_type must have a matching <available_agent_types>
section listing all spawned types.
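The core check behind such a regression test might look like the sketch below. The subagent_type="..." spawn syntax, the gsd- prefix filter, and the function name missingAgentTypes are assumptions for illustration; the actual test may parse the workflows differently.

```javascript
// Returns the named GSD agent types that a workflow spawns but does not
// list inside its <available_agent_types> section.
function missingAgentTypes(content) {
  // Collect spawned types, keeping only named GSD agents (assumed gsd- prefix).
  const spawned = new Set();
  for (const m of content.matchAll(/subagent_type\s*=\s*"([^"]+)"/g)) {
    if (m[1].startsWith('gsd-')) spawned.add(m[1]);
  }

  // Collect types listed in the section, if the section exists at all.
  const section = content.match(
    /<available_agent_types>([\s\S]*?)<\/available_agent_types>/
  );
  const listed = new Set(section ? section[1].match(/gsd-[a-z0-9-]+/g) ?? [] : []);

  return [...spawned].filter((type) => !listed.has(type)).sort();
}

const sampleWorkflow = [
  'Task(subagent_type="gsd-executor", isolation="worktree")',
  'Task(subagent_type="gsd-debugger")',
  'Task(subagent_type="general-purpose")',
  '<available_agent_types>',
  '- gsd-executor',
  '</available_agent_types>',
].join('\n');

console.log(missingAgentTypes(sampleWorkflow));
```

An empty result means every spawned GSD agent is listed; a test can assert that for each workflow and command file.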
Fixes #1357
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Worktree agents (gsd-executor, gsd-debugger) prompt for edit permissions
on every new directory they touch, even when the user has "accept edits"
enabled. This is caused by Claude Code's directory-scoped permission
model not propagating to worktree paths.
Setting permissionMode: acceptEdits in the agent frontmatter tells Claude
Code to auto-approve file edits for these agents, bypassing the per-
directory prompts. This is safe because these agents are already granted
Write/Edit in their tools list and are spawned in isolated worktrees.
- Add permissionMode: acceptEdits to gsd-executor.md frontmatter
- Add permissionMode: acceptEdits to gsd-debugger.md frontmatter
- Add regression tests verifying worktree agents have the field
- Add test ensuring all isolation="worktree" spawns are covered
Upstream: anthropics/claude-code#29110, anthropics/claude-code#28041
Fixes #1334
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verification checked structure but not whether data actually flows
end-to-end or whether external dependencies are available. Adds:
- Step 4b (Data-Flow Trace): Level 4 verification traces upstream from
wired artifacts to verify data sources produce real data, catching
hollow components that render empty/hardcoded values
- Step 7b (Behavioral Spot-Checks): lightweight smoke tests that verify
runnable code produces expected output, not just that it exists
- Step 2.6 (Environment Audit): researcher probes target machine for
external tools/services/runtimes before planning, so plans include
fallback strategies for missing dependencies
Closes #1245
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add CLAUDE.md enforcement across the three core agents:
- gsd-plan-checker: new Dimension 10 verifies plans respect project
conventions, forbidden patterns, and required tools from CLAUDE.md
- gsd-phase-researcher: outputs Project Constraints section from
CLAUDE.md so planner can verify compliance
- gsd-executor: treats CLAUDE.md directives as hard constraints,
with precedence over plan instructions
Includes 4 regression tests validating the new dimension and
enforcement directives across all three agents.
Closes #1260
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Generate {phase_num}-DISCUSSION-LOG.md alongside CONTEXT.md during
discuss-phase sessions
- Log captures all options presented per gray area (not just the
selected one), user's choice, notes, Claude's discretion items,
and deferred ideas
- File is explicitly marked as audit-only — not for agent consumption
- Add discussion-log.md template with format specification
- Add an instruction to the discuss_areas step to track accumulated Q&A data
- Commit discussion log alongside CONTEXT.md in same git commit
- Add regression tests for workflow reference and template existence
* fix: remove dangling skills: from agent frontmatter and strip in Gemini converter (closes #1023, closes #953, closes #930)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: invert skills frontmatter test to assert absence (fixes CI)
The PR deliberately removed skills: from agent frontmatter (breaks
Gemini CLI), but the test still asserted its presence. Inverted the
assertion to ensure skills: stays removed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New test suite covering:
- HDOC: anti-heredoc instruction present in all 9 file-writing agents
- SKILL: skills: frontmatter present in all 11 agents
- HOOK: commented hooks pattern in file-writing agents
- SPAWN: no stale workaround patterns, valid agent type references
- AGENT: required frontmatter fields (name, description, tools, color)
509 total tests (462 existing + 47 new), 0 failures.
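As an illustration of the AGENT group, a minimal sketch of a required-field check. The parsing approach (top-level keys read with a regular expression rather than a YAML parser) and the helper names are assumptions; the four required fields are those listed above.

```javascript
const REQUIRED_FIELDS = ['name', 'description', 'tools', 'color'];

// Extracts the top-level keys of a leading frontmatter block, or null
// when the document has none. This is a sketch, not a full YAML parser.
function frontmatterKeys(markdown) {
  const block = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!block) return null;
  const keys = new Set();
  for (const line of block[1].split(/\r?\n/)) {
    const key = line.match(/^([A-Za-z_][\w-]*):/);
    if (key) keys.add(key[1]);
  }
  return keys;
}

// Returns a list of human-readable problems; empty means the agent passes.
function frontmatterProblems(markdown) {
  const keys = frontmatterKeys(markdown);
  if (!keys) return ['missing frontmatter'];
  return REQUIRED_FIELDS.filter((k) => !keys.has(k)).map((k) => `missing ${k}`);
}

const sampleAgent = [
  '---',
  'name: gsd-executor',
  'description: Executes plans',
  '---',
  'Agent body',
].join('\n');

console.log(frontmatterProblems(sampleAgent));
```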
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>