Files
get-shit-done/get-shit-done/workflows/execute-plan.md
yuiooo1102-droid dcb503961a feat: harness engineering improvements — post-merge test gate, shared file isolation, behavioral verification (#1486)
* feat: harness engineering improvements — post-merge test gate, shared file isolation, behavioral verification

Three improvements inspired by Anthropic's harness engineering research
(March 2026) and real-world pain points from parallel worktree execution:

1. Post-merge test gate (execute-phase.md)
   - Run project test suite after merging each wave's worktrees
   - Catches cross-plan integration failures that individual Self-Checks miss
   - Addresses the Generator self-evaluation blind spot (agents praise own work)

2. Shared file isolation (execute-phase.md)
   - Executors no longer modify STATE.md or ROADMAP.md in parallel mode
   - Orchestrator updates tracking files centrally after merge
   - Eliminates the #1 source of merge conflicts in parallel execution

3. Behavioral verification (verify-phase.md)
   - Verifier runs project test suite and CLI commands, not just grep
   - Follows Anthropic's Generator/Evaluator separation principle
   - Tests actual behavior against success criteria, not just file existence

Real-world evidence: In a session executing 37 plans across 8 phases with
parallel worktrees, we observed:
- 4 test failures after merge that all Self-Checks missed (models.py type loss)
- STATE.md/ROADMAP.md conflicts on every single parallel merge
- Verifier reporting PASSED while merged code had broken imports

References:
- Anthropic Engineering Blog: Harness Design for Long-Running Apps (2026-03-24)
- Issue #1451: Massive git worktree problem
- Issue #1413: Autonomous execution without manual context clearing

* fix: address review feedback — test runner detection, parallel isolation, edge cases

- Replace hardcoded jest/vitest with `npm test` (reads project's scripts.test)
- Add Go detection to post-merge test gate (was only in verify-phase)
- Add 5-minute timeout to post-merge test gate to prevent pipeline stalls
- Track cumulative wave failures via WAVE_FAILURE_COUNT for cross-wave awareness
- Guard orchestrator tracking commit against unchanged files (prevent empty commits)
- Align execute-plan.md with parallel isolation model (skip STATE.md/ROADMAP.md
  updates when running in parallel mode, orchestrator handles centrally)
- Scope behavioral verification CLI checks: skip when no fixtures/test data exist,
  mark as NEEDS HUMAN instead of inventing inputs

* fix: pass PARALLEL_MODE to executor agents to activate shared file isolation

The executor spawn prompt in execute-phase.md instructed agents not to
modify STATE.md/ROADMAP.md, but execute-plan.md gates this behavior on
PARALLEL_MODE which was never defined in the executor context. This adds
the variable to the spawn prompt and wraps all three shared-file steps
(update_current_position, update_roadmap, git_commit_metadata) with
explicit conditional guards.

* fix: replace unreliable PARALLEL_MODE env var with git worktree auto-detection

Address PR #1486 review feedback (trek-e):

1. PARALLEL_MODE was never reliably set — the <env> block instructed the LLM
   to export a bash variable, but each Bash tool call runs in a fresh shell
   so the variable never persisted. Replace with self-contained worktree
   detection: `[ -f .git ]` returns true in worktrees (.git is a file) and
   false in main repos (.git is a directory). Each bash block detects
   independently with no external state dependency.

2. TEST_EXIT only checked for timeout (124) — test failures (non-zero,
   non-124) were silently ignored, making the "If tests fail" prose
   unreachable. Add full if/elif/else handling: 0=pass, 124=timeout,
   else=fail with WAVE_FAILURE_COUNT increment.

3. Add Go detection to regression_gate (was missing go.mod check).
   Replace hardcoded npx jest/vitest with npm test for consistency.

4. Renumber steps from 4/4b/4c/5/5/5b to 4a/4b/4c/4d/5/6/7/8/9.

* fix: address remaining review blockers — timeout, tracking guard, shell safety

- verify-phase.md: wrap behavioral_verification test suite in timeout 300
- execute-phase.md: gate tracking update on TEST_EXIT=0, skip on failure/timeout
- Quote all TEST_EXIT variables, add default initialization
- Add else branch for unrecognized project types
- Renumber steps to align with upstream (5.x series)

* fix: rephrase worktree success_criteria to satisfy substring test guard

The worktree mode success_criteria line literally contained "STATE.md"
and "ROADMAP.md" inside a prohibition ("No modifications to..."), but
the test guard in execute-phase-worktree-artifacts.test.cjs uses a
substring check and cannot distinguish prohibition from requirement.

Rephrase to "shared orchestrator artifacts" so the substring check
passes while preserving the same intent.
2026-04-10 10:42:45 -04:00

24 KiB

Execute a phase prompt (PLAN.md) and create the outcome summary (SUMMARY.md).

<required_reading> Read STATE.md before any operation to load project context. Read config.json for planning behavior settings.

@~/.claude/get-shit-done/references/git-integration.md </required_reading>

<available_agent_types> Valid GSD subagent types (use exact names — do not fall back to 'general-purpose'):

  • gsd-executor — Executes plan tasks, commits, creates SUMMARY.md </available_agent_types>
Load execution context (paths only to minimize orchestrator context):
INIT=$(node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" init execute-phase "${PHASE}")
if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi

Extract from init JSON: executor_model, commit_docs, sub_repos, phase_dir, phase_number, plans, summaries, incomplete_plans, state_path, config_path.

If .planning/ missing: error.

```bash # Use plans/summaries from INIT JSON, or list files (ls .planning/phases/XX-name/*-PLAN.md 2>/dev/null || true) | sort (ls .planning/phases/XX-name/*-SUMMARY.md 2>/dev/null || true) | sort ```

Find first PLAN without matching SUMMARY. Decimal phases supported (01.1-hotfix/):

PHASE=$(echo "$PLAN_PATH" | grep -oE '[0-9]+(\.[0-9]+)?-[0-9]+')
# config settings can be fetched via gsd-tools config-get if needed
Auto-approve: ` Execute {phase}-{plan}-PLAN.md [Plan X of Y for Phase Z]` → parse_segments. Present plan identification, wait for confirmation. ```bash PLAN_START_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ") PLAN_START_EPOCH=$(date +%s) ``` ```bash grep -n "type=\"checkpoint" .planning/phases/XX-name/{phase}-{plan}-PLAN.md ```

Routing by checkpoint type:

Checkpoints Pattern Execution
None A (autonomous) Single subagent: full plan + SUMMARY + commit
Verify-only B (segmented) Segments between checkpoints. After none/human-verify → SUBAGENT. After decision/human-action → MAIN
Decision C (main) Execute entirely in main context

Pattern A: init_agent_tracking → capture EXPECTED_BASE=$(git rev-parse HEAD) → spawn Task(subagent_type="gsd-executor", model=executor_model) with prompt: execute plan at [path], autonomous, all tasks + SUMMARY + commit, follow deviation/auth rules, report: plan name, tasks, SUMMARY path, commit hash → track agent_id → wait → update tracking → report. Include isolation="worktree" only if workflow.use_worktrees is not false (read via config-get workflow.use_worktrees). When using isolation="worktree", include a <worktree_branch_check> block in the prompt instructing the executor to run git merge-base HEAD {EXPECTED_BASE} and, if the result differs from {EXPECTED_BASE}, reset the branch base with git reset --soft {EXPECTED_BASE} before starting work. This corrects a known issue on Windows where EnterWorktree creates branches from main instead of the feature branch HEAD.

Pattern B: Execute segment-by-segment. Autonomous segments: spawn subagent for assigned tasks only (no SUMMARY/commit). Checkpoints: main context. After all segments: aggregate, create SUMMARY, commit. See segment_execution.

Pattern C: Execute in main using standard flow (step name="execute").

Fresh context per subagent preserves peak quality. Main context stays lean.

```bash if [ ! -f .planning/agent-history.json ]; then echo '{"version":"1.0","max_entries":50,"entries":[]}' > .planning/agent-history.json fi rm -f .planning/current-agent-id.txt if [ -f .planning/current-agent-id.txt ]; then INTERRUPTED_ID=$(cat .planning/current-agent-id.txt) echo "Found interrupted agent: $INTERRUPTED_ID" fi ```

If interrupted: ask user to resume (Task resume parameter) or start fresh.

Tracking protocol: On spawn: write agent_id to current-agent-id.txt, append to agent-history.json: {"agent_id":"[id]","task_description":"[desc]","phase":"[phase]","plan":"[plan]","segment":[num|null],"timestamp":"[ISO]","status":"spawned","completion_timestamp":null}. On completion: status → "completed", set completion_timestamp, delete current-agent-id.txt. Prune: if entries > max_entries, remove oldest "completed" (never "spawned").

Run for Pattern A/B before spawning. Pattern C: skip.

Pattern B only (verify-only checkpoints). Skip for A/C.
  1. Parse segment map: checkpoint locations and types

  2. Per segment:

    • Subagent route: spawn gsd-executor for assigned tasks only. Prompt: task range, plan path, read full plan for context, execute assigned tasks, track deviations, NO SUMMARY/commit. Track via agent protocol.
    • Main route: execute tasks using standard flow (step name="execute")
  3. After ALL segments: aggregate files/deviations/decisions → create SUMMARY.md → commit → self-check:

    • Verify key-files.created exist on disk with [ -f ]
    • Check git log --oneline --all --grep="{phase}-{plan}" returns ≥1 commit
    • Append ## Self-Check: PASSED or ## Self-Check: FAILED to SUMMARY

    Known Claude Code bug (classifyHandoffIfNeeded): If any segment agent reports "failed" with classifyHandoffIfNeeded is not defined, this is a Claude Code runtime bug — not a real failure. Run spot-checks; if they pass, treat as successful.

```bash cat .planning/phases/XX-name/{phase}-{plan}-PLAN.md ``` This IS the execution instructions. Follow exactly. If plan references CONTEXT.md: honor user's vision throughout.

If plan contains <interfaces> block: These are pre-extracted type definitions and contracts. Use them directly — do NOT re-read the source files to discover types. The planner already extracted what you need.

```bash node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" phases list --type summaries --raw # Extract the second-to-last summary from the JSON result ``` If previous SUMMARY has unresolved "Issues Encountered" or "Next Phase Readiness" blockers: AskUserQuestion(header="Previous Issues", options: "Proceed anyway" | "Address first" | "Review previous"). Deviations are normal — handle via rules below.
  1. Read @context files from prompt
  2. MCP tools: If CLAUDE.md or project instructions reference MCP tools (e.g. jCodeMunch for code navigation), prefer them over Grep/Glob when available. Fall back to Grep/Glob if MCP tools are not accessible.
  3. Per task:
    • MANDATORY read_first gate: If the task has a <read_first> field, you MUST read every listed file BEFORE making any edits. This is not optional. Do not skip files because you "already know" what's in them — read them. The read_first files establish ground truth for the task.
    • type="auto": if tdd="true" → TDD execution. Implement with deviation rules + auth gates. Verify done criteria. Commit (see task_commit). Track hash for Summary.
    • type="checkpoint:*": STOP → checkpoint_protocol → wait for user → continue only after confirmation.
    • MANDATORY acceptance_criteria check: After completing each task, if it has <acceptance_criteria>, verify EVERY criterion before moving to the next task. Use grep, file reads, or CLI commands to confirm each criterion. If any criterion fails, fix the implementation before proceeding. Do not skip criteria or mark them as "will verify later".
  4. Run <verification> checks
  5. Confirm <success_criteria> met
  6. Document deviations in Summary

<authentication_gates>

Authentication Gates

Auth errors during execution are NOT failures — they're expected interaction points.

Indicators: "Not authenticated", "Unauthorized", 401/403, "Please run {tool} login", "Set {ENV_VAR}"

Protocol:

  1. Recognize auth gate (not a bug)
  2. STOP task execution
  3. Create dynamic checkpoint:human-action with exact auth steps
  4. Wait for user to authenticate
  5. Verify credentials work
  6. Retry original task
  7. Continue normally

Example: vercel --yes → "Not authenticated" → checkpoint asking user to vercel login → verify with vercel whoami → retry deploy → continue

In Summary: Document as normal flow under "## Authentication Gates", not as deviations.

</authentication_gates>

<deviation_rules>

Deviation Rules

You WILL discover unplanned work. Apply automatically, track all for Summary.

Rule Trigger Action Permission
1: Bug Broken behavior, errors, wrong queries, type errors, security vulns, race conditions, leaks Fix → test → verify → track [Rule 1 - Bug] Auto
2: Missing Critical Missing essentials: error handling, validation, auth, CSRF/CORS, rate limiting, indexes, logging Add → test → verify → track [Rule 2 - Missing Critical] Auto
3: Blocking Prevents completion: missing deps, wrong types, broken imports, missing env/config/files, circular deps Fix blocker → verify proceeds → track [Rule 3 - Blocking] Auto
4: Architectural Structural change: new DB table, schema change, new service, switching libs, breaking API, new infra STOP → present decision (below) → track [Rule 4 - Architectural] Ask user

Rule 4 format:

⚠️ Architectural Decision Needed

Current task: [task name]
Discovery: [what prompted this]
Proposed change: [modification]
Why needed: [rationale]
Impact: [what this affects]
Alternatives: [other approaches]

Proceed with proposed change? (yes / different approach / defer)

Priority: Rule 4 (STOP) > Rules 1-3 (auto) > unsure → Rule 4 Edge cases: missing validation → R2 | null crash → R1 | new table → R4 | new column → R1/2 Heuristic: Affects correctness/security/completion? → R1-3. Maybe? → R4.

</deviation_rules>

<deviation_documentation>

Documenting Deviations

Summary MUST include deviations section. None? → ## Deviations from Plan\n\nNone - plan executed exactly as written.

Per deviation: [Rule N - Category] Title — Found during: Task X | Issue | Fix | Files modified | Verification | Commit hash

End with: Total deviations: N auto-fixed (breakdown). Impact: assessment.

</deviation_documentation>

<tdd_plan_execution>

TDD Execution

For type: tdd plans — RED-GREEN-REFACTOR:

  1. Infrastructure (first TDD plan only): detect project, install framework, config, verify empty suite
  2. RED: Read <behavior> → failing test(s) → run (MUST fail) → commit: test({phase}-{plan}): add failing test for [feature]
  3. GREEN: Read <implementation> → minimal code → run (MUST pass) → commit: feat({phase}-{plan}): implement [feature]
  4. REFACTOR: Clean up → tests MUST pass → commit: refactor({phase}-{plan}): clean up [feature]

Errors: RED doesn't fail → investigate test/existing feature. GREEN doesn't pass → debug, iterate. REFACTOR breaks → undo.

See ~/.claude/get-shit-done/references/tdd.md for structure. </tdd_plan_execution>

<precommit_failure_handling>

Pre-commit Hook Failure Handling

Your commits may trigger pre-commit hooks. Auto-fix hooks handle themselves transparently — files get fixed and re-staged automatically.

If running as a parallel executor agent (spawned by execute-phase): Use --no-verify on all commits. Pre-commit hooks cause build lock contention when multiple agents commit simultaneously (e.g., cargo lock fights in Rust projects). The orchestrator validates once after all agents complete.

If running as the sole executor (sequential mode): If a commit is BLOCKED by a hook:

  1. The git commit command fails with hook error output
  2. Read the error — it tells you exactly which hook and what failed
  3. Fix the issue (type error, lint violation, secret leak, etc.)
  4. git add the fixed files
  5. Retry the commit
  6. Budget 1-2 retry cycles per commit </precommit_failure_handling>

<task_commit>

Task Commit Protocol

After each task (verification passed, done criteria met), commit immediately.

1. Check: git status --short

2. Stage individually (NEVER git add . or git add -A):

git add src/api/auth.ts
git add src/types/user.ts

3. Commit type:

Type When Example
feat New functionality feat(08-02): create user registration endpoint
fix Bug fix fix(08-02): correct email validation regex
test Test-only (TDD RED) test(08-02): add failing test for password hashing
refactor No behavior change (TDD REFACTOR) refactor(08-02): extract validation to helper
perf Performance perf(08-02): add database index
docs Documentation docs(08-02): add API docs
style Formatting style(08-02): format auth module
chore Config/deps chore(08-02): add bcrypt dependency

4. Format: {type}({phase}-{plan}): {description} with bullet points for key changes.

<sub_repos_commit_flow> Sub-repos mode: If sub_repos is configured (non-empty array from init context), use commit-to-subrepo instead of standard git commit. This routes files to their correct sub-repo based on path prefix.

node ~/.claude/get-shit-done/bin/gsd-tools.cjs commit-to-subrepo "{type}({phase}-{plan}): {description}" --files file1 file2 ...

The command groups files by sub-repo prefix and commits atomically to each. Returns JSON: { committed: true, repos: { "backend": { hash: "abc", files: [...] }, ... } }.

Record hashes from each repo in the response for SUMMARY tracking.

If sub_repos is empty or not set: Use standard git commit flow below. </sub_repos_commit_flow>

5. Record hash:

TASK_COMMIT=$(git rev-parse --short HEAD)
TASK_COMMITS+=("Task ${TASK_NUM}: ${TASK_COMMIT}")

6. Check for untracked generated files:

git status --short | grep '^??'

If new untracked files appeared after running scripts or tools, decide for each:

  • Commit it — if it's a source file, config, or intentional artifact
  • Add to .gitignore — if it's a generated/runtime output (build artifacts, .env files, cache files, compiled output)
  • Do NOT leave generated files untracked

</task_commit>

On `type="checkpoint:*"`: automate everything possible first. Checkpoints are for verification/decisions only.

Display: CHECKPOINT: [Type] box → Progress {X}/{Y} → Task name → type-specific content → YOUR ACTION: [signal]

Type Content Resume signal
human-verify (90%) What was built + verification steps (commands/URLs) "approved" or describe issues
decision (9%) Decision needed + context + options with pros/cons "Select: option-id"
human-action (1%) What was automated + ONE manual step + verification plan "done"

After response: verify if specified. Pass → continue. Fail → inform, wait. WAIT for user — do NOT hallucinate completion.

See ~/.claude/get-shit-done/references/checkpoints.md for details.

When spawned via Task and hitting checkpoint: return structured state (cannot interact with user directly).

Required return: 1) Completed Tasks table (hashes + files) 2) Current Task (what's blocking) 3) Checkpoint Details (user-facing content) 4) Awaiting (what's needed from user)

Orchestrator parses → presents to user → spawns fresh continuation with your completed tasks state. You will NOT be resumed. In main context: use checkpoint_protocol above.

If verification fails:

Check if node repair is enabled (default: on):

NODE_REPAIR=$(node "./.claude/get-shit-done/bin/gsd-tools.cjs" config-get workflow.node_repair 2>/dev/null || echo "true")

If NODE_REPAIR is true: invoke @./.claude/get-shit-done/workflows/node-repair.md with:

  • FAILED_TASK: task number, name, done-criteria
  • ERROR: expected vs actual result
  • PLAN_CONTEXT: adjacent task names + phase goal
  • REPAIR_BUDGET: workflow.node_repair_budget from config (default: 2)

Node repair will attempt RETRY, DECOMPOSE, or PRUNE autonomously. Only reaches this gate again if repair budget is exhausted (ESCALATE).

If NODE_REPAIR is false OR repair returns ESCALATE: STOP. Present: "Verification failed for Task [X]: [name]. Expected: [criteria]. Actual: [result]. Repair attempted: [summary of what was tried]." Options: Retry | Skip (mark incomplete) | Stop (investigate). If skipped → SUMMARY "Issues Encountered".

```bash PLAN_END_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ") PLAN_END_EPOCH=$(date +%s)

DURATION_SEC=$(( PLAN_END_EPOCH - PLAN_START_EPOCH )) DURATION_MIN=$(( DURATION_SEC / 60 ))

if $DURATION_MIN -ge 60 ; then HRS=$(( DURATION_MIN / 60 )) MIN=$(( DURATION_MIN % 60 )) DURATION="${HRS}h ${MIN}m" else DURATION="${DURATION_MIN} min" fi

</step>

<step name="generate_user_setup">
```bash
grep -A 50 "^user_setup:" .planning/phases/XX-name/{phase}-{plan}-PLAN.md | head -50

If user_setup exists: create {phase}-USER-SETUP.md using template ~/.claude/get-shit-done/templates/user-setup.md. Per service: env vars table, account setup checklist, dashboard config, local dev notes, verification commands. Status "Incomplete". Set USER_SETUP_CREATED=true. If empty/missing: skip.

Create `{phase}-{plan}-SUMMARY.md` at `.planning/phases/XX-name/`. Use `~/.claude/get-shit-done/templates/summary.md`.

Frontmatter: phase, plan, subsystem, tags | requires/provides/affects | tech-stack.added/patterns | key-files.created/modified | key-decisions | requirements-completed (MUST copy requirements array from PLAN.md frontmatter verbatim) | duration ($DURATION), completed ($PLAN_END_TIME date).

Title: # Phase [X] Plan [Y]: [Name] Summary

One-liner SUBSTANTIVE: "JWT auth with refresh rotation using jose library" not "Authentication implemented"

Include: duration, start/end times, task count, file count.

Next: more plans → "Ready for {next-plan}" | last → "Phase complete, ready for next step".

**Skip this step if running in parallel mode** (the orchestrator in execute-phase.md handles STATE.md/ROADMAP.md updates centrally after merging worktrees to avoid merge conflicts).

Update STATE.md using gsd-tools:

# Auto-detect parallel mode: .git is a file in worktrees, a directory in main repo
IS_WORKTREE=$([ -f .git ] && echo "true" || echo "false")

# Skip in parallel mode — orchestrator handles STATE.md centrally
if [ "$IS_WORKTREE" != "true" ]; then
  # Advance plan counter (handles last-plan edge case)
  node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state advance-plan

  # Recalculate progress bar from disk state
  node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state update-progress

  # Record execution metrics
  node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state record-metric \
    --phase "${PHASE}" --plan "${PLAN}" --duration "${DURATION}" \
    --tasks "${TASK_COUNT}" --files "${FILE_COUNT}"
fi
From SUMMARY: Extract decisions and add to STATE.md:
# Add each decision from SUMMARY key-decisions
# Prefer file inputs for shell-safe text (preserves `$`, `*`, etc. exactly)
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state add-decision \
  --phase "${PHASE}" --summary-file "${DECISION_TEXT_FILE}" --rationale-file "${RATIONALE_FILE}"

# Add blockers if any found
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state add-blocker --text-file "${BLOCKER_TEXT_FILE}"
Update session info using gsd-tools:
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" state record-session \
  --stopped-at "Completed ${PHASE}-${PLAN}-PLAN.md" \
  --resume-file "None"

Keep STATE.md under 150 lines.

If SUMMARY "Issues Encountered" ≠ "None": yolo → log and continue. Interactive → present issues, wait for acknowledgment. **Skip this step if running in parallel mode** (the orchestrator handles ROADMAP.md updates centrally after merging worktrees).
# Auto-detect parallel mode: .git is a file in worktrees, a directory in main repo
IS_WORKTREE=$([ -f .git ] && echo "true" || echo "false")

# Skip in parallel mode — orchestrator handles ROADMAP.md centrally
if [ "$IS_WORKTREE" != "true" ]; then
  node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" roadmap update-plan-progress "${PHASE}"
fi

Counts PLAN vs SUMMARY files on disk. Updates progress table row with correct count and status (In Progress or Complete with date).

Mark completed requirements from the PLAN.md frontmatter `requirements:` field:
node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" requirements mark-complete ${REQ_IDS}

Extract requirement IDs from the plan's frontmatter (e.g., requirements: [AUTH-01, AUTH-02]). If no requirements field, skip.

Task code already committed per-task. Commit plan metadata:
# Auto-detect parallel mode: .git is a file in worktrees, a directory in main repo
IS_WORKTREE=$([ -f .git ] && echo "true" || echo "false")

# In parallel mode: exclude STATE.md and ROADMAP.md (orchestrator commits these)
if [ "$IS_WORKTREE" = "true" ]; then
  node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" commit "docs({phase}-{plan}): complete [plan-name] plan" --files .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md .planning/REQUIREMENTS.md
else
  node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" commit "docs({phase}-{plan}): complete [plan-name] plan" --files .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md .planning/STATE.md .planning/ROADMAP.md .planning/REQUIREMENTS.md
fi
If .planning/codebase/ doesn't exist: skip.
FIRST_TASK=$(git log --oneline --grep="feat({phase}-{plan}):" --grep="fix({phase}-{plan}):" --grep="test({phase}-{plan}):" --reverse | head -1 | cut -d' ' -f1)
git diff --name-only ${FIRST_TASK}^..HEAD 2>/dev/null || true

Update only structural changes: new src/ dir → STRUCTURE.md | deps → STACK.md | file pattern → CONVENTIONS.md | API client → INTEGRATIONS.md | config → STACK.md | renamed → update paths. Skip code-only/bugfix/content changes.

node "$HOME/.claude/get-shit-done/bin/gsd-tools.cjs" commit "" --files .planning/codebase/*.md --amend
If `USER_SETUP_CREATED=true`: display `⚠️ USER SETUP REQUIRED` with path + env/config tasks at TOP.
(ls -1 .planning/phases/[current-phase-dir]/*-PLAN.md 2>/dev/null || true) | wc -l
(ls -1 .planning/phases/[current-phase-dir]/*-SUMMARY.md 2>/dev/null || true) | wc -l
Condition Route Action
summaries < plans A: More plans Find next PLAN without SUMMARY. Yolo: auto-continue. Interactive: show next plan, suggest /gsd-execute-phase {phase} + /gsd-verify-work. STOP here.
summaries = plans, current < highest phase B: Phase done Show completion, suggest /gsd-plan-phase {Z+1} + /gsd-verify-work {Z} + /gsd-discuss-phase {Z+1}
summaries = plans, current = highest phase C: Milestone done Show banner, suggest /gsd-complete-milestone + /gsd-verify-work + /gsd-add-phase

All routes: /clear first for fresh context.

<success_criteria>

  • All tasks from PLAN.md completed
  • All verifications pass
  • USER-SETUP.md generated if user_setup in frontmatter
  • SUMMARY.md created with substantive content
  • STATE.md updated (position, decisions, issues, session) — unless parallel mode (orchestrator handles)
  • ROADMAP.md updated — unless parallel mode (orchestrator handles)
  • If codebase map exists: map updated with execution changes (or skipped if no significant changes)
  • If USER-SETUP.md created: prominently surfaced in completion output </success_criteria>