refactor: deepen GSDTools query execution seams (#3085 )

* refactor: deepen gsdtools query execution seams * docs: add changeset for query seam deepening * docs: fix changeset summary text * fix: address coderabbit query seam findings * test: address remaining coderabbit findings and notes * refactor: use internal gsdtools error type import
Deepen query dispatch seam with Command Topology Module (#3078 )
2026-05-05 23:02:20 +02:00 · 2026-05-03 18:56:41 -04:00 · 2026-05-03 18:11:38 -04:00 · 2026-05-03 16:31:48 -04:00 · 2026-05-03 15:57:01 -04:00 · 2026-05-03 15:29:34 -04:00
922 changed files with 85646 additions and 13968 deletions
--- a/.changeset/README.md
+++ b/.changeset/README.md
@@ -0,0 +1,44 @@
+# Changeset Fragments
+
+This directory holds **per-PR CHANGELOG fragments**. Every PR with user-facing changes drops one (or more) `<random-name>.md` files here describing its CHANGELOG entry. Fragments are consolidated into the top-level `CHANGELOG.md` at release time.
+
+## Why
+
+Two PRs that both edit the `### Fixed` block of `CHANGELOG.md` always conflict on merge — git can't pick a serialization order without human input. Two PRs that each add a fresh `.changeset/<unique-name>.md` never conflict because they don't share lines.
+
+See [#2975](https://github.com/gsd-build/get-shit-done/issues/2975) for the full rationale.
+
+## Adding a fragment
+
+```bash
+node scripts/changeset/new.cjs \
+  --type Fixed \
+  --pr 1234 \
+  --body "fix the thing — explain the user-visible change in one sentence"
+```
+
+This writes `.changeset/<adjective>-<noun>-<noun>.md` with frontmatter and a body. Three random words → concurrent PRs don't collide.
+
+## Format
+
+```md
+---
+type: Fixed
+pr: 1234
+---
+**`/gsd-foo` no longer drops trailing slashes** — explain the user-visible change.
+```
+
+Allowed `type:` values follow [Keep a Changelog](https://keepachangelog.com/): `Added`, `Changed`, `Deprecated`, `Removed`, `Fixed`, `Security`.
+
+## Opting out
+
+PRs that legitimately have no user-facing impact can add the `no-changelog` label. CI honors it. When unsure, add the fragment.
+
+## At release time
+
+```bash
+node scripts/changeset/cli.cjs render --version vX.Y.Z --date YYYY-MM-DD
+```
+
+Reads every fragment, groups bullets by `type:`, replaces `## [Unreleased]` with a new `## [vX.Y.Z] - YYYY-MM-DD` block, opens a fresh `## [Unreleased]` above, deletes consumed fragments. Idempotent.
--- a/.changeset/blue-stones-topology.md
+++ b/.changeset/blue-stones-topology.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+---
+
+**Query command dispatch deepened with Command Topology Module** — query dispatch now consumes a single topology seam that resolves command tokens, binds native handler adapters, and returns structured no-match diagnosis, improving locality and reducing dispatch seam drift.
--- a/.changeset/bold-finches-rally.md
+++ b/.changeset/bold-finches-rally.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3058
+---
+**GSD transport raw-mode handling and timeout fallback hardened** — fixes undefined raw formatting edge case and adds raw-path coverage to prevent regressions.
--- a/.changeset/brave-mice-build.md
+++ b/.changeset/brave-mice-build.md
@@ -0,0 +1,8 @@
+---
+type: Changed
+pr: 3069
+---
+
+**query command metadata now flows through a canonical Command Definition Module seam** — registry assembly, mutation semantics, and alias generation consume one Interface (`family`, `canonical`, `aliases`, `mutation`, `output_mode`, `handler_key`) to improve locality and reduce drift.
+
+**query fallback error mapping cleanup** — the CJS fallback catch path now passes original `err` to `mapFallbackDispatchError` (follow-up to prior review feedback missed in PR #3066).
--- a/.changeset/bright-pumas-fold.md
+++ b/.changeset/bright-pumas-fold.md
@@ -0,0 +1,6 @@
+---
+type: Changed
+pr: 3075
+---
+
+**query architecture deepening pass** — extracted Query Runtime Context, Native Dispatch Adapter, and Query CLI Output Modules so dispatch policy, runtime context policy, and CLI projection logic each live behind focused seams with higher locality and leverage.
--- a/.changeset/calm-birds-greet.md
+++ b/.changeset/calm-birds-greet.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2990
+---
+gsd-code-fixer worktree no longer fails on the same-branch checkout — the agent now creates a new gsd-reviewfix/ branch via git worktree add -b and fast-forwards the user's branch on cleanup. See #2990.
--- a/.changeset/calm-ibex-jump.md
+++ b/.changeset/calm-ibex-jump.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 2986
+---
+Test suite for config-schema.cjs is now mutation-resistant — 95 typed assertions kill the 124 surviving Stryker mutants from the 4.62% baseline. Tests target static-key fast path, dynamic-pattern .some semantics, polarity, and regex-anchor tightening. See #2986.
--- a/.changeset/calm-tigers-frolic.md
+++ b/.changeset/calm-tigers-frolic.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3008
+---
+**`tests/install-minimal.test.cjs:307` no longer races on shared `os.tmpdir()` under parallel CI** — the previous shape compared `listTmpStageDirs()` snapshots before and after the throw. Under `scripts/run-tests.cjs --test-concurrency=4`, `tests/install-minimal-all-runtimes.test.cjs` runs in a parallel process and creates/removes `gsd-minimal-skills-*` dirs in the shared OS tmpdir between snapshots, so `deepStrictEqual` failed deterministically when the parallel process happened to have a live stage dir during the snapshot window. Fix: stub `fs.mkdtempSync` to record THIS call's stage dir, then assert that exact path no longer exists after the throw — no global filesystem snapshot, no race. (#3008)
--- a/.changeset/codex-bare-node-fix.md
+++ b/.changeset/codex-bare-node-fix.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3022
+---
+**Codex SessionStart hook now uses absolute Node binary path** — closes the gap left after #3002. The Codex install path wrote `command = "node ${path}"` directly into config.toml, bypassing `resolveNodeRunner()`. Under GUI/minimal-PATH runtimes (`/usr/bin:/bin:/usr/sbin:/sbin`), bare `node` failed to resolve, exit 127. Now routed through new `buildCodexHookBlock()` helper. Reinstall path migrates legacy bare-node entries via new `rewriteLegacyCodexHookBlock()`. See #3017.
--- a/.changeset/codex-discuss-fallback.md
+++ b/.changeset/codex-discuss-fallback.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: TBD
+---
+**Codex skill adapter no longer instructs the agent to silently default discuss-phase decisions.** When `request_user_input` was rejected (Default mode), the generated adapter said "pick a reasonable default" — so `$gsd-discuss-phase` proceeded toward writing CONTEXT.md / DISCUSSION-LOG.md / checkpoints without ever asking the user. Adapter prose now requires the agent to STOP, present plain-text questions, and wait, with explicit named exceptions (`--auto`/`--all`/explicit user approval). See #3018.
--- a/.changeset/cool-monkeys-smell.md
+++ b/.changeset/cool-monkeys-smell.md
@@ -0,0 +1,6 @@
+---
+type: Changed
+pr: 3074
+---
+
+**query CLI path extracted into a dedicated Query CLI Adapter Module** — `sdk/src/cli.ts` now delegates query-specific dispatch, error mapping, and output/exit handling to `sdk/src/query/query-cli-adapter.ts` for better locality and testability.
--- a/.changeset/curious-bears-march.md
+++ b/.changeset/curious-bears-march.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3012
+---
+**Post-install message and update.md no longer recommend the removed `/gsd-reapply-patches` command** — after PR #2824 consolidated 86 skills into ~58, `/gsd-reapply-patches` was folded into a flag (`/gsd-update --reapply`). The 1.39.1 hotfix (#2954) updated `help.md` but missed `bin/install.js`'s `reportLocalPatches` runtime emitter, `get-shit-done/workflows/update.md` Step 4, and the English + zh-CN/ja-JP/ko-KR doc set. Users hit "Unknown command" after every install with backed-up patches. All five runtime branches in `reportLocalPatches` (claude, opencode, kilo, copilot, gemini, codex, cursor) now emit the consolidated form. Regression: `tests/bug-3010-reapply-patches-references.test.cjs` scans `bin/install.js`, every workflow file, and every doc (excluding CHANGELOG history and help.md's deprecation notice) for stale recommendations. See #3010.
--- a/.changeset/docs-1-40-0-audit.md
+++ b/.changeset/docs-1-40-0-audit.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 0
+---
+**Documentation refreshed for v1.40.0** — full audit of `docs/` against the 1.40.0-rc.1 release surface. Updates command lists, walkthroughs, and inventory rows for the 86→59 skill consolidation (#2790), the six namespace meta-skills with two-stage routing (#2792), the `/gsd-health --context` guard, the phase-lifecycle status-line read-side (#2833), and the Gemini colon-form / non-Gemini hyphen-form slash-command split. Translations in ja-JP/ko-KR/zh-CN/pt-BR mirror the structural changes; new English prose is marked with `<!-- TODO i18n -->` for human translator follow-up. CHANGELOG.md `[Unreleased]` section regrouped under Feature/Enhancement/Fix headers.
--- a/.changeset/dynamic-routing.md
+++ b/.changeset/dynamic-routing.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: TBD
+---
+**`dynamic_routing` block in `.planning/config.json` for failure-tier escalation (#3024).** Each agent declares a default tier (`light` / `standard` / `heavy`); when `dynamic_routing.enabled: true`, the resolver picks `tier_models[default_tier]` for the first spawn and escalates one tier up on orchestrator-detected soft failure (capped by `max_escalations`). Disabled by default — fully backward compatible. Composes with `model_overrides` (higher precedence) and `models.<phase_type>` (lower) for full cost-control flexibility. Adds new resolver `resolveModelForTier(cwd, agent, attempt)` to `core.cjs` for orchestrator integration.
--- a/.changeset/eager-hawks-rally.md
+++ b/.changeset/eager-hawks-rally.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2975
+---
+**Changeset-fragment workflow** — eliminates CHANGELOG.md merge conflicts. Each PR drops `.changeset/<random-name>.md` with frontmatter (`type:`, `pr:`) plus a markdown body; the release-time `npm run changelog:render` consolidates fragments into `CHANGELOG.md` and deletes them. CI lint (`npm run lint:changeset`) requires a fragment on any PR touching user-facing files (`bin/`, `get-shit-done/`, `agents/`, `commands/`, `hooks/`, `sdk/src/`); contributors can opt out via the `no-changelog` label for purely internal changes. See [.changeset/README.md](.changeset/README.md) and CONTRIBUTING.md for the workflow.
--- a/.changeset/gemini-skip-local-when-global.md
+++ b/.changeset/gemini-skip-local-when-global.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3037
+---
+**Gemini local install no longer duplicates `/gsd:*` commands across user and workspace scopes** — when GSD is already installed at the user scope (`~/.gemini/commands/gsd/`) and you run `npx get-shit-done-cc --gemini --local` in a project, the installer now skips writing `commands/gsd/` to `<project>/.gemini/` and prints a one-line warning explaining why. Previously, both scopes received the same 65 command files, and Gemini's conflict detector renamed every `/gsd:*` command to `/workspace.gsd:*` and `/user.gsd:*`, breaking the documented namespace. Closes #3037.
--- a/.changeset/happy-jays-greet.md
+++ b/.changeset/happy-jays-greet.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2994
+---
+/gsd-reapply-patches Step 5 verifier now resolves at runtime — moved scripts/verify-reapply-patches.cjs to get-shit-done/bin/ which is shipped by the installer. The legacy scripts/ directory is not copied to user installs. See #2994.
--- a/.changeset/happy-tigers-travel.md
+++ b/.changeset/happy-tigers-travel.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**Query mutation event mapping moved to dedicated module** — preserves event payloads while improving registry locality and test surface.
--- a/.changeset/help-passthrough.md
+++ b/.changeset/help-passthrough.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3026
+---
+**`gsd-sdk query <subcommand> --help` now reaches the handler instead of returning top-level usage.** The query argv parser harvested `--help` as a global flag and `main()` short-circuited dispatch — there was no path to discover what arguments a query subcommand accepts. The parser now leaves `--help` in `queryArgv` so the handler/fallback can render contextual help. The `gsd-tools.cjs` fallback now renders top-level usage on `--help` (instead of erroring), preserving #1818's anti-hallucination invariant by NOT executing the destructive command. See #3019.
--- a/.changeset/humble-goats-swim.md
+++ b/.changeset/humble-goats-swim.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**Alias-family handler maps moved to dedicated catalog module** — keeps command keys/order while reducing createRegistry coupling and improving family-level locality.
--- a/.changeset/install-shell-path-probe.md
+++ b/.changeset/install-shell-path-probe.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3028
+---
+**Installer no longer prints `✓ GSD SDK ready` when the shim is unreachable from the user's runtime shells.** The previous check used `process.env.PATH` from the install subprocess, which often differs from the user's later interactive shells (POSIX `~/.local/bin` not in login shell, node-version-manager PATH shims). Added `getUserShellPath()` helper that probes `$SHELL -lc 'printf %s "$PATH"'` and `isGsdSdkOnPath(pathString?)` overload that accepts an explicit PATH; the install-time check now downgrades to the actionable `⚠` diagnostic from PR #3014 when install-PATH and user-shell-PATH disagree. Windows cross-shell support tracked separately. See #3020.
--- a/.changeset/issue-driven-orchestration.md
+++ b/.changeset/issue-driven-orchestration.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2840
+---
+**`docs/issue-driven-orchestration.md` — recipe for driving GSD from a tracker issue** — new guide that maps Symphony-style orchestration concepts (workflow, isolated agent workspace, proof-of-work, human review gate, follow-up capture) onto existing GSD primitives (`/gsd-new-workspace`, `/gsd-manager`, `/gsd-autonomous`, `/gsd-verify-work`, `/gsd-review`, `/gsd-ship`, `STATE.md`, phase artifacts). Documentation only — no new commands, no daemon, no tracker integration.
--- a/.changeset/jolly-newts-roam.md
+++ b/.changeset/jolly-newts-roam.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2994
+---
+/gsd-reapply-patches Step 5 verifier now resolves at runtime — moved scripts/verify-reapply-patches.cjs to get-shit-done/bin/ which is shipped by the installer. The legacy scripts/ directory is not copied to user installs. See #2994.
--- a/.changeset/jolly-pumas-dance.md
+++ b/.changeset/jolly-pumas-dance.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2979
+---
+Managed JS hooks now resolve under GUI/minimal-PATH runtimes — installer emits process.execPath (absolute, quoted, forward-slash-normalized) as the runner for every .js hook command instead of bare node. See #2979.
--- a/.changeset/lively-goats-run.md
+++ b/.changeset/lively-goats-run.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2995
+---
+Post-install path smoke test for workflow-invoked scripts — audits every node ${GSD_HOME}/...cjs invocation in workflows resolves at the runtime-installed path. See #2995.
--- a/.changeset/lively-otters-gather.md
+++ b/.changeset/lively-otters-gather.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3011
+---
+**Actionable diagnostic when `gsd-sdk` is not on PATH after install** — Windows users (and others on multi-shell setups) reported that the previous "GSD SDK files are present but `gsd-sdk` is not on your PATH" warning gave them no way to fix it: no path to look at, no shell-specific commands, no mention of the npx-cache caveat. New `formatSdkPathDiagnostic({ shimDir, platform, runDir })` helper returns a typed IR with the resolved shim location, platform-specific PATH-export commands (PowerShell / cmd.exe / Git Bash on Windows; `export PATH` on POSIX), and an npx-specific note when running under an `_npx` cache segment (where the shim may be written to a temp dir that won't persist). The console renderer in `bin/install.js` emits the lines from the IR; tests assert on the typed fields directly. (#3011)
--- a/.changeset/mcp-token-budget-docs.md
+++ b/.changeset/mcp-token-budget-docs.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 3032
+---
+**Documentation: MCP tool schema as a context-budget concern (#3025).** Adds new sections to `get-shit-done/references/context-budget.md` and `docs/USER-GUIDE.md` explaining that every enabled MCP server injects its tool schema into every turn — heavyweight servers (browser/playwright, Mac-tools, Windows-tools) can cost 20k+ tokens each, often dwarfing what `model_profile` tuning saves. The toggle lives in `.claude/settings.json` (`enabledMcpjsonServers` / `disabledMcpjsonServers`) and is a Claude Code harness concern, not a GSD concern. Includes a pre-phase audit checklist (browser, platform-specific, cross-project, duplicates) and notes the multiplier interaction with `model_profile`. Companion to #3023 (per-phase-type model map) and #3024 (dynamic routing); together they cover the three biggest cost levers.
--- a/.changeset/merry-foxes-climb.md
+++ b/.changeset/merry-foxes-climb.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2997
+---
+SDK config-set/config-get and init responses no longer echo plaintext API keys. New sdk/src/query/secrets.ts ports SECRET_CONFIG_KEYS masking from CJS; init bundles only mask string values to preserve the boolean availability-flag contract. See #2997.
--- a/.changeset/merry-lynx-sing.md
+++ b/.changeset/merry-lynx-sing.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2992
+---
+/gsd-update queries wrong npm package names — moved package name into a deterministic check-latest-version.cjs script and updated the workflow to use ${GSD_DIR} from get_installed_version. See #2992.
--- a/.changeset/merry-lynx-wander.md
+++ b/.changeset/merry-lynx-wander.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3007
+---
+**PR templates now point at the changeset workflow** — the `Fix`, `Enhancement`, and `Feature` PR templates previously asked contributors to tick `CHANGELOG.md updated`, which contradicted the post-#2978 rule that `CHANGELOG.md` must not be edited directly. Each checkbox now references `npm run changeset` (and the `no-changelog` opt-out where applicable).
--- a/.changeset/merry-moles-chatter.md
+++ b/.changeset/merry-moles-chatter.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**CLI query CJS fallback execution extracted to dedicated adapter module** — preserves logs/help passthrough behavior while improving fallback locality and testability.
--- a/.changeset/noble-badgers-roar.md
+++ b/.changeset/noble-badgers-roar.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**Query mutation event emission now uses a dedicated decorator seam** — preserves fire-and-forget behavior while reducing registry coupling and improving testability.
--- a/.changeset/per-phase-type-models.md
+++ b/.changeset/per-phase-type-models.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 3030
+---
+**`models` block in `.planning/config.json` for per-phase-type model selection (#3023).** A new resolution layer between per-agent `model_overrides` and the `model_profile` tier table. Six named slots (`planning` / `discuss` / `research` / `execution` / `verification` / `completion`) accept tier aliases (`opus` / `sonnet` / `haiku` / `inherit`). Lets you express "Opus for planning, Sonnet for the rest" in two lines without learning the agent taxonomy. Fully backward compatible — configs without `models` behave exactly as today.
--- a/.changeset/plucky-ibex-gather.md
+++ b/.changeset/plucky-ibex-gather.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2998
+---
+gsd-pristine/ is now populated by the installer when local patches are detected — saveLocalPatches calls a new populatePristineDir helper that runs the install transform pipeline into a tmp staging dir and copies modified files into pristineDir. The reapply-patches Step 5 verifier no longer falls back to its over-broad heuristic. See #2998.
--- a/.changeset/plucky-moles-roam.md
+++ b/.changeset/plucky-moles-roam.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2997
+---
+SDK config-set/config-get and init responses no longer echo plaintext API keys. New sdk/src/query/secrets.ts ports SECRET_CONFIG_KEYS masking from CJS; init bundles only mask string values to preserve the boolean availability-flag contract. See #2997.
--- a/.changeset/plucky-otters-roam.md
+++ b/.changeset/plucky-otters-roam.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2995
+---
+Post-install path smoke test for workflow-invoked scripts — audits every node ${GSD_HOME}/...cjs invocation in workflows resolves at the runtime-installed path. See #2995.
--- a/.changeset/quick-geese-hum.md
+++ b/.changeset/quick-geese-hum.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**Query fallback orchestration now shared** — CLI and SDK query dispatch now use one planning seam for native vs CJS fallback decisions with behavior parity preserved.
--- a/.changeset/rapid-goats-munch.md
+++ b/.changeset/rapid-goats-munch.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**Query/transport policy data now converged in shared module** — mutation and raw-output policy wiring now share one source of truth to reduce drift.
--- a/.changeset/research-flag-and-stale-refs.md
+++ b/.changeset/research-flag-and-stale-refs.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3042
+---
+**`/gsd-research-phase` consolidated into `/gsd-plan-phase --research-phase <N>`** — the standalone research command's slash-command stub was never registered (#3042). Rather than restore the orphan, the research-only capability now lives as a flag on `/gsd-plan-phase`. New modifiers: `--view` prints existing `RESEARCH.md` to stdout without spawning, `--research` forces refresh, otherwise prompts `update / view / skip` when `RESEARCH.md` already exists. Also scrubs four other stale slash-command references (`/gsd-check-todos`, `/gsd-new-workspace`, `/gsd-status`, residual `/gsd-plan-milestone-gaps`) across English + 4 localized doc sets (#3044). Closes #3042 and #3044.
--- a/.changeset/scrub-stale-command-routes.md
+++ b/.changeset/scrub-stale-command-routes.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 3029
+---
+**`/gsd-code-review-fix` and `/gsd-plan-milestone-gaps` no longer surface as "Unknown command"** — both were consolidated by #2790 (`/gsd-code-review --fix` and inline gap planning in `/gsd-audit-milestone` respectively), but several user-facing surfaces still emitted the old slash forms in their offer text. Fixed audit-milestone offer blocks, gsd-complete-milestone routing, code-review/execute-phase offer text, gsd-code-fixer agent role card, and the doc surfaces (USER-GUIDE, FEATURES, INVENTORY, AGENTS, CONFIGURATION). Closes #3029, closes #3034.
--- a/.changeset/silly-foxes-wander.md
+++ b/.changeset/silly-foxes-wander.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2990
+---
+gsd-code-fixer worktree no longer fails on the same-branch checkout — the agent now creates a new gsd-reviewfix/ branch via git worktree add -b and fast-forwards the user's branch on cleanup. See #2990.
--- a/.changeset/silly-newts-swim.md
+++ b/.changeset/silly-newts-swim.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2982
+---
+Extended no-source-grep lint to catch var-binding readFileSync.includes() pattern. Tests now fail when source-grep is hidden behind a parser wrapper. See #2982.
--- a/.changeset/steady-ravens-shape.md
+++ b/.changeset/steady-ravens-shape.md
@@ -0,0 +1,6 @@
+---
+type: Changed
+pr: 3065
+---
+
+**Dispatch policy seam now returns a structured result contract** across native and fallback query execution paths (`ok`, typed error `kind`, `details`, and final `exit_code`), with CLI consuming the unified result instead of mixed throw/result handling.
--- a/.changeset/sturdy-jays-glide.md
+++ b/.changeset/sturdy-jays-glide.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3060
+---
+**Query static command registrations now split into domain catalog modules** — preserves command order/strings while improving registry locality and maintenance.
--- a/.changeset/tidy-tunas-zip.md
+++ b/.changeset/tidy-tunas-zip.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 3085
+---
+**`GSDTools` query execution internals now use deep Module seams** — refactors runtime composition, native/subprocess adapters, and output projection behind stable public interfaces for better locality and testability.
--- a/.changeset/typed-rivers-flow.md
+++ b/.changeset/typed-rivers-flow.md
@@ -0,0 +1,5 @@
+---
+type: Changed
+pr: 2974
+---
+Migrated 8 test files from raw text matching (`stdout.includes(...)`, `assert.match(stderr, ...)`) to typed-IR assertions per CONTRIBUTING.md. Adds shared `ERROR_REASON` enum and `--json-errors` flag in `core.cjs`, typed `GRAPHIFY_REASON` in `graphify.cjs`, pure `buildSdkFailFastReport()` IR builder in `bin/install.js`, and Claude Code JSON envelope output (`hookSpecificOutput` with typed fields) for `gsd-session-state.sh` and `gsd-phase-boundary.sh`. Tests now assert on structured fields (`reason`, `context`, `state_present`, `planning_modified`, etc.) instead of substring matching. See #2974.
--- a/.changeset/update-banner-opt-in.md
+++ b/.changeset/update-banner-opt-in.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2795
+---
+**Optional update banner for non-GSD statusline users** — when the installer detects you've declined or kept a non-GSD statusline, it now offers an opt-in `SessionStart` banner that surfaces update availability via the existing `~/.cache/gsd/gsd-update-check.json` cache. Silent when up-to-date, rate-limits failure diagnostics to once per 24h, removed cleanly by `npx get-shit-done-cc --uninstall`.
--- a/.changeset/witty-hawks-jump.md
+++ b/.changeset/witty-hawks-jump.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2973
+---
+/gsd-profile-user --refresh writes dev-preferences.md to ~/.claude/skills/gsd-dev-preferences/SKILL.md instead of the legacy commands/gsd/ directory. Installer migrates any preserved legacy file to the new location. See #2973.
--- a/.changeset/witty-newts-greet.md
+++ b/.changeset/witty-newts-greet.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2992
+---
+/gsd-update queries wrong npm package names — moved package name into a deterministic check-latest-version.cjs script and updated the workflow to use ${GSD_DIR} from get_installed_version. See #2992.
--- a/.changeset/zesty-jays-wake.md
+++ b/.changeset/zesty-jays-wake.md
@@ -0,0 +1,5 @@
+---
+type: Fixed
+pr: 2979
+---
+Managed JS hooks now resolve under GUI/minimal-PATH runtimes — installer emits process.execPath (absolute, quoted, forward-slash-normalized) as the runner for every .js hook command instead of bare node. See #2979.
--- a/.changeset/zesty-moles-forage.md
+++ b/.changeset/zesty-moles-forage.md
@@ -0,0 +1,5 @@
+---
+type: Added
+pr: 2982
+---
+Extended no-source-grep lint to catch var-binding readFileSync.includes() pattern. Tests now fail when source-grep is hidden behind a parser wrapper. See #2982.
--- a/.coderabbit.yaml
+++ b/.coderabbit.yaml
@@ -0,0 +1,26 @@
+# CodeRabbit configuration — gsd-build/get-shit-done
+#
+# Schema: https://docs.coderabbit.ai/reference/yaml-template/
+#
+# Project context: GSD ships a CLI tool + an agent runtime, not a documented
+# public library. We carry rich JSDoc on internal helpers that warrant it
+# (see bin/install.js, get-shit-done/bin/lib/*.cjs) but we do not enforce a
+# blanket docstring coverage bar — see issue #2932 for rationale.
+
+reviews:
+  pre_merge_checks:
+    # Disable docstring coverage check.
+    #
+    # The check produces false-positive warnings on PRs whose new code is
+    # entirely test files: it counts test(...) / beforeEach / afterEach
+    # arrow-function callbacks as functions and then reports 0% coverage
+    # because nothing has JSDoc. There is no per-check path filter in CR's
+    # documented schema that would let us exclude tests/** while keeping
+    # the check active elsewhere, and the top-level path_filters approach
+    # would silence ALL CR review on tests (security scans, out-of-scope
+    # checks, line-level findings) which we want to keep.
+    #
+    # All other CR pre-merge checks (out-of-scope, security, title) remain
+    # at their defaults.
+    docstrings:
+      mode: off
--- a/.githooks/pre-commit
+++ b/.githooks/pre-commit
@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+if git diff --cached --name-only | grep -Eq "^sdk/src/query/command-manifest\.|^sdk/src/query/command-aliases\.generated\.ts$|^get-shit-done/bin/lib/command-aliases\.generated\.cjs$|^sdk/scripts/gen-command-aliases\.ts$"; then
+  npm run check:alias-drift
+fi
--- a/.githooks/pre-push
+++ b/.githooks/pre-push
@@ -0,0 +1,48 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+zero_sha='0000000000000000000000000000000000000000'
+blocked_regex="${GSD_BLOCKED_AUTHOR_REGEX:-}"
+
+# Local-only guard: no-op unless the developer opts in via env var, e.g.
+# export GSD_BLOCKED_AUTHOR_REGEX='@example-corp\.com$'
+if [[ -z "$blocked_regex" ]]; then
+  exit 0
+fi
+
+violations=()
+
+while read -r local_ref local_sha remote_ref remote_sha; do
+  # branch/tag deletion
+  if [[ "$local_sha" == "$zero_sha" ]]; then
+    continue
+  fi
+
+  if [[ "$remote_sha" == "$zero_sha" ]]; then
+    # New remote ref: inspect commits not already on any remote
+    commit_list=$(git rev-list "$local_sha" --not --remotes)
+  else
+    commit_list=$(git rev-list "$remote_sha..$local_sha")
+  fi
+
+  while read -r commit; do
+    [[ -z "$commit" ]] && continue
+    author_email=$(git show -s --format='%ae' "$commit")
+    lower_email=$(printf '%s' "$author_email" | tr '[:upper:]' '[:lower:]')
+    if printf '%s' "$lower_email" | grep -Eq "$blocked_regex"; then
+      violations+=("$commit <$author_email>")
+    fi
+  done <<< "$commit_list"
+done
+
+if [[ ${#violations[@]} -gt 0 ]]; then
+  {
+    echo "Push blocked: commit author email matched local blocked regex ($blocked_regex)."
+    echo "Rewrite author info before pushing these commits:"
+    for v in "${violations[@]}"; do
+      echo "  - $v"
+    done
+    echo "Suggested fix: git rebase -i <base> --exec \"git commit --amend --no-edit --author='Your Name <non-enterprise@email>'\""
+  } >&2
+  exit 1
+fi
--- a/.github/PULL_REQUEST_TEMPLATE/enhancement.md
+++ b/.github/PULL_REQUEST_TEMPLATE/enhancement.md
@@ -73,7 +73,7 @@ Closes #
 - [ ] Changes are scoped to the approved enhancement — nothing extra included
 - [ ] All existing tests pass (`npm test`)
 - [ ] New or updated tests cover the enhanced behavior
- [ ] CHANGELOG.md updated
+- [ ] `.changeset/` fragment added (`npm run changeset -- --type Changed --pr <NNN> --body "..."`) — or `no-changelog` label applied if not user-facing
 - [ ] Documentation updated if behavior or output changed
 - [ ] No unnecessary dependencies added

--- a/.github/PULL_REQUEST_TEMPLATE/feature.md
+++ b/.github/PULL_REQUEST_TEMPLATE/feature.md
@@ -94,7 +94,7 @@ Closes #
 - [ ] Implementation scope matches the approved spec exactly
 - [ ] All existing tests pass (`npm test`)
 - [ ] New tests cover the happy path, error cases, and edge cases
- [ ] CHANGELOG.md updated with a user-facing description of the feature
+- [ ] `.changeset/` fragment added with a user-facing description of the feature (`npm run changeset -- --type Added --pr <NNN> --body "..."`)
 - [ ] Documentation updated — commands, workflows, references, README if applicable
 - [ ] No unnecessary external dependencies added
 - [ ] Works on Windows (backslash paths handled)
--- a/.github/PULL_REQUEST_TEMPLATE/fix.md
+++ b/.github/PULL_REQUEST_TEMPLATE/fix.md
@@ -63,7 +63,7 @@ Fixes #
 - [ ] Fix is scoped to the reported bug — no unrelated changes included
 - [ ] Regression test added (or explained why not)
 - [ ] All existing tests pass (`npm test`)
- [ ] CHANGELOG.md updated if this is a user-facing fix
+- [ ] `.changeset/` fragment added if this is a user-facing fix (`npm run changeset -- --type Fixed --pr <NNN> --body "..."`) — or `no-changelog` label applied
 - [ ] No unnecessary dependencies added

 ## Breaking changes
--- a/.github/workflows/canary.yml
+++ b/.github/workflows/canary.yml
@@ -0,0 +1,157 @@
+# Release stream policy:
+#   dev   → @canary  (this workflow — preview builds for the long-lived integration branch)
+#   main  → @next    (RC train, see release.yml)
+#   main  → @latest  (stable cuts, see release.yml)
+#
+# Streams do not mix. The publish/tag steps below gate on `refs/heads/dev` so a
+# workflow_dispatch run on any other branch (including main) completes the
+# build/test/dry-run validation but does not publish or tag.
+
+name: Canary
+
+on:
+  workflow_dispatch:
+    inputs:
+      dry_run:
+        description: 'Dry run (skip npm publish, tagging, and push)'
+        required: false
+        type: boolean
+        default: false
+
+concurrency:
+  group: canary
+  cancel-in-progress: false
+
+env:
+  NODE_VERSION: 24
+
+jobs:
+  canary:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    permissions:
+      contents: write
+      id-token: write
+    environment: npm-publish
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+        with:
+          fetch-depth: 0
+
+      - uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f  # v6.3.0
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+          registry-url: 'https://registry.npmjs.org'
+          cache: 'npm'
+
+      - name: Determine canary version
+        id: canary
+        run: |
+          # Strip any pre-release suffix from package.json version to get base (e.g. 1.39.0-rc.4 → 1.39.0)
+          RAW=$(node -p "require('./package.json').version")
+          BASE=$(echo "$RAW" | sed 's/-.*//')
+          # Find next sequential canary number from existing tags
+          N=1
+          while git tag -l "v${BASE}-canary.${N}" | grep -q .; do
+            N=$((N + 1))
+          done
+          CANARY_VERSION="${BASE}-canary.${N}"
+          echo "canary_version=$CANARY_VERSION" >> "$GITHUB_OUTPUT"
+
+      - name: Configure git identity
+        run: |
+          git config user.name "github-actions[bot]"
+          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+
+      - name: Bump to canary version
+        env:
+          CANARY_VERSION: ${{ steps.canary.outputs.canary_version }}
+        run: |
+          npm version "$CANARY_VERSION" --no-git-tag-version
+          cd sdk && npm version "$CANARY_VERSION" --no-git-tag-version && cd ..
+
+      - name: Install and test
+        run: |
+          npm ci
+          npm test
+
+      - name: Build SDK dist for tarball
+        run: npm run build:sdk
+
+      - name: Verify tarball ships sdk/dist/cli.js (bug #2647)
+        run: bash scripts/verify-tarball-sdk-dist.sh
+
+      - name: Dry-run publish validation
+        run: |
+          npm publish --dry-run --tag canary
+          cd sdk && npm publish --dry-run --tag canary
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+
+      - name: Tag and push
+        if: ${{ github.ref == 'refs/heads/dev' && !inputs.dry_run }}
+        env:
+          CANARY_VERSION: ${{ steps.canary.outputs.canary_version }}
+        run: |
+          git tag "v${CANARY_VERSION}"
+          git push origin "v${CANARY_VERSION}"
+
+      - name: Publish to npm (canary)
+        if: ${{ github.ref == 'refs/heads/dev' && !inputs.dry_run }}
+        run: npm publish --provenance --access public --tag canary
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+
+      - name: Publish SDK to npm (canary)
+        if: ${{ github.ref == 'refs/heads/dev' && !inputs.dry_run }}
+        run: cd sdk && npm publish --provenance --access public --tag canary
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+
+      - name: Verify publish
+        if: ${{ github.ref == 'refs/heads/dev' && !inputs.dry_run }}
+        env:
+          CANARY_VERSION: ${{ steps.canary.outputs.canary_version }}
+        run: |
+          PUBLISHED="NOT_FOUND"
+          SDK_PUBLISHED="NOT_FOUND"
+          for delay in 5 10 20 30 45; do
+            PUBLISHED=$(npm view get-shit-done-cc@"$CANARY_VERSION" version 2>/dev/null || echo "NOT_FOUND")
+            SDK_PUBLISHED=$(npm view @gsd-build/sdk@"$CANARY_VERSION" version 2>/dev/null || echo "NOT_FOUND")
+            if [ "$PUBLISHED" = "$CANARY_VERSION" ] && [ "$SDK_PUBLISHED" = "$CANARY_VERSION" ]; then
+              break
+            fi
+            echo "Not yet live (sleeping ${delay}s)..."
+            sleep "$delay"
+          done
+          if [ "$PUBLISHED" != "$CANARY_VERSION" ]; then
+            echo "::error::Published version verification failed. Expected $CANARY_VERSION, got $PUBLISHED"
+            exit 1
+          fi
+          echo "Verified: get-shit-done-cc@$CANARY_VERSION is live on npm"
+          if [ "$SDK_PUBLISHED" != "$CANARY_VERSION" ]; then
+            echo "::error::SDK version verification failed. Expected $CANARY_VERSION, got $SDK_PUBLISHED"
+            exit 1
+          fi
+          echo "Verified: @gsd-build/sdk@$CANARY_VERSION is live on npm"
+          CANARY_TAG=$(npm dist-tag ls get-shit-done-cc 2>/dev/null | grep "canary:" | awk '{print $2}')
+          echo "canary dist-tag points to: $CANARY_TAG"
+
+      - name: Summary
+        env:
+          CANARY_VERSION: ${{ steps.canary.outputs.canary_version }}
+          DRY_RUN: ${{ inputs.dry_run }}
+          PUBLISH_ELIGIBLE: ${{ github.ref == 'refs/heads/dev' && !inputs.dry_run }}
+          BRANCH_REF: ${{ github.ref }}
+        run: |
+          echo "## Canary v${CANARY_VERSION}" >> "$GITHUB_STEP_SUMMARY"
+          if [ "$DRY_RUN" = "true" ]; then
+            echo "**DRY RUN** — npm publish, tagging, and push skipped" >> "$GITHUB_STEP_SUMMARY"
+          elif [ "$PUBLISH_ELIGIBLE" != "true" ]; then
+            echo "**VALIDATION ONLY** — publish/tag skipped for \`${BRANCH_REF}\`; canary publish is gated to \`refs/heads/dev\`." >> "$GITHUB_STEP_SUMMARY"
+          else
+            echo "- Published to npm as \`canary\`" >> "$GITHUB_STEP_SUMMARY"
+            echo "- SDK also published: \`@gsd-build/sdk@${CANARY_VERSION}\` on \`canary\`" >> "$GITHUB_STEP_SUMMARY"
+            echo "- Tagged \`v${CANARY_VERSION}\`" >> "$GITHUB_STEP_SUMMARY"
+            echo "- Install: \`npx get-shit-done-cc@canary\`" >> "$GITHUB_STEP_SUMMARY"
+          fi
--- a/.github/workflows/changeset-required.yml
+++ b/.github/workflows/changeset-required.yml
@@ -0,0 +1,24 @@
+name: Changeset Required
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened, labeled, unlabeled]
+
+permissions:
+  contents: read
+  pull-requests: read
+
+jobs:
+  changeset-lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '24'
+      - name: Run changeset lint
+        env:
+          GITHUB_BASE_REF: ${{ github.base_ref }}
+        run: node scripts/changeset/lint.cjs
--- a/.github/workflows/hotfix.yml
+++ b/.github/workflows/hotfix.yml
@@ -1,5 +1,27 @@
 name: Hotfix Release

+# Hotfix flow for X.YY.Z patch releases (Z > 0).
+#
+# create:
+#   - Branches hotfix/X.YY.Z from the highest existing vX.YY.* tag (1.27.2 from
+#     v1.27.1, 1.27.1 from v1.27.0). The base IS the cumulative-fix anchor for
+#     the previous patch.
+#   - Auto-cherry-picks every fix:/chore: commit on origin/main that isn't
+#     already in the base, oldest-first. Patch-equivalents (already applied)
+#     are skipped via `git cherry`. feat:/refactor: are NEVER auto-included.
+#   - Conflicts fail the workflow with the offending SHA so the operator can
+#     resolve manually on the branch and re-run finalize with auto_cherry_pick=false.
+#   - Step summary lists every included SHA so the eventual vX.YY.Z tag
+#     self-documents what shipped.
+#
+# finalize:
+#   - install-smoke gate (cross-platform, parity with release.yml/release-sdk.yml)
+#   - Bundles SDK as both loose tree (sdk/dist/cli.js) and recoverable tarball
+#     (sdk-bundle/gsd-sdk.tgz) — parity with release-sdk.yml so a hotfix shipped
+#     during the @gsd-build-token outage carries the same payload shape.
+#   - Publishes to @latest, tags vX.YY.Z, re-points @next → vX.YY.Z, opens
+#     merge-back PR.
+
 on:
  workflow_dispatch:
    inputs:
@@ -14,6 +36,11 @@ on:
        description: 'Patch version (e.g., 1.27.1)'
        required: true
        type: string
+      auto_cherry_pick:
+        description: 'Auto-cherry-pick fix:/chore: commits from origin/main since base tag (create only)'
+        required: false
+        type: boolean
+        default: true
      dry_run:
        description: 'Dry run (skip npm publish, tagging, and push)'
        required: false
@@ -54,10 +81,13 @@ jobs:
          MAJOR_MINOR=$(echo "$VERSION" | cut -d. -f1-2)
          TARGET_TAG="v${VERSION}"
          BRANCH="hotfix/${VERSION}"
-          BASE_TAG=$(git tag -l "v${MAJOR_MINOR}.*" \
-            | grep -E "^v[0-9]+\.[0-9]+\.[0-9]+$" \
+          # Append TARGET_TAG to the candidate list, then sort -V, then walk the
+          # sorted list and print whatever immediately precedes TARGET_TAG. This
+          # is semver-correct for multi-digit patches (v1.27.10 > v1.27.9) where
+          # a plain `awk '$1 < target'` lexicographic compare would mis-order.
+          BASE_TAG=$( ( git tag -l "v${MAJOR_MINOR}.*" | grep -E "^v[0-9]+\.[0-9]+\.[0-9]+$"; echo "$TARGET_TAG" ) \
            | sort -V \
-            | awk -v target="$TARGET_TAG" '$1 < target { last=$1 } END { if (last != "") print last }')
+            | awk -v target="$TARGET_TAG" '$1 == target { print prev; exit } { prev = $1 }')
          if [ -z "$BASE_TAG" ]; then
            echo "::error::No prior stable tag found for ${MAJOR_MINOR}.x before $TARGET_TAG"
            exit 1
@@ -95,29 +125,160 @@ jobs:
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

-      - name: Create hotfix branch
-        if: inputs.dry_run != 'true'
+      - name: Create hotfix branch from base tag and push (skeleton)
+        env:
+          BRANCH: ${{ needs.validate-version.outputs.branch }}
+          BASE_TAG: ${{ needs.validate-version.outputs.base_tag }}
+          DRY_RUN: ${{ inputs.dry_run }}
+        run: |
+          set -euo pipefail
+          git checkout -b "$BRANCH" "$BASE_TAG"
+          # Push the skeleton branch up-front so any subsequent cherry-pick
+          # conflict leaves a remote artefact the operator can fetch, resolve,
+          # and re-push. Skipped on dry-run — local checkout still exercises
+          # the same cherry-pick + bump flow so conflicts are caught.
+          if [ "$DRY_RUN" != "true" ]; then
+            git push -u origin "$BRANCH"
+          fi
+
+      - name: Cherry-pick fix/chore commits from origin/main since base tag
+        if: ${{ inputs.auto_cherry_pick }}
+        env:
+          BRANCH: ${{ needs.validate-version.outputs.branch }}
+          BASE_TAG: ${{ needs.validate-version.outputs.base_tag }}
+          DRY_RUN: ${{ inputs.dry_run }}
+        run: |
+          set -euo pipefail
+          git fetch origin main:refs/remotes/origin/main
+
+          # `git cherry $BASE_TAG origin/main` lists every commit on main not
+          # patch-equivalent in BASE_TAG. + means needs picking, - means
+          # already applied (skipped silently).
+          CANDIDATES=$(git cherry "$BASE_TAG" origin/main | awk '/^\+ / {print $2}')
+
+          if [ -z "$CANDIDATES" ]; then
+            echo "No commits on origin/main beyond $BASE_TAG."
+            echo "## Cherry-pick summary" >> "$GITHUB_STEP_SUMMARY"
+            echo "" >> "$GITHUB_STEP_SUMMARY"
+            echo "Base: \`$BASE_TAG\` — no commits to consider." >> "$GITHUB_STEP_SUMMARY"
+            exit 0
+          fi
+
+          # Re-order chronologically (oldest first) for predictable application.
+          ORDERED=$(git log --reverse --format='%H' "$BASE_TAG..origin/main" \
+            | grep -F -f <(echo "$CANDIDATES") || true)
+
+          INCLUDED=""
+          SKIPPED=""
+          while IFS= read -r SHA; do
+            [ -z "$SHA" ] && continue
+            SUBJECT=$(git log -1 --format='%s' "$SHA")
+            # fix: or chore:, optional scope, optional ! breaking marker
+            if echo "$SUBJECT" | grep -qE '^(fix|chore)(\([^)]+\))?!?: '; then
+              echo "→ cherry-picking $SHA  $SUBJECT"
+              if ! git cherry-pick -x "$SHA"; then
+                # Abort restores HEAD to the last successful pick. On real
+                # runs, push that state so the operator can fetch, resolve
+                # $SHA manually, and finalize with auto_cherry_pick=false.
+                git cherry-pick --abort || true
+                if [ "$DRY_RUN" != "true" ]; then
+                  git push --force-with-lease origin "$BRANCH" || git push origin "$BRANCH" || true
+                fi
+                {
+                  echo "## Cherry-pick conflict"
+                  echo ""
+                  echo "Failed at: \`${SHA}\` — \`${SUBJECT}\`"
+                  echo ""
+                  if [ "$DRY_RUN" = "true" ]; then
+                    echo "**Dry run:** branch was not pushed, so the picks below were discarded with the runner."
+                    if [ -n "$INCLUDED" ]; then
+                      echo ""
+                      echo "Already-applied picks (lost — must be re-applied before resolving \`${SHA}\`):"
+                      echo ""
+                      echo "$INCLUDED"
+                    fi
+                    echo ""
+                    echo "**To resolve:** re-run \`create\` with \`auto_cherry_pick=true\` (real, not dry-run) to materialize the partial branch on origin, then resolve \`${SHA}\` manually. Re-running with \`auto_cherry_pick=false\` would recreate the branch from \`${BASE_TAG}\` and lose every pick listed above."
+                  else
+                    echo "Branch \`${BRANCH}\` was pushed with picks applied up to (but not including) the conflicting commit."
+                    echo ""
+                    echo "**To resolve:** \`git fetch origin && git checkout ${BRANCH} && git cherry-pick -x ${SHA}\`, fix the conflict, push, then re-run \`finalize\` with \`auto_cherry_pick=false\`."
+                  fi
+                } >> "$GITHUB_STEP_SUMMARY"
+                echo "::error::Cherry-pick of $SHA failed. See summary."
+                exit 1
+              fi
+              INCLUDED="${INCLUDED}- \`${SHA}\` ${SUBJECT}"$'\n'
+            else
+              echo "  skip $SHA  $SUBJECT  (not fix/chore)"
+              SKIPPED="${SKIPPED}- \`${SHA}\` ${SUBJECT}"$'\n'
+            fi
+          done <<< "$ORDERED"
+
+          {
+            echo "## Cherry-pick summary"
+            echo ""
+            echo "Base: \`$BASE_TAG\`"
+            echo ""
+            if [ -n "$INCLUDED" ]; then
+              echo "### Included (fix/chore)"
+              echo ""
+              echo "$INCLUDED"
+            else
+              echo "_No fix/chore commits to include._"
+              echo ""
+            fi
+            if [ -n "$SKIPPED" ]; then
+              echo "### Skipped (feat/refactor/etc — not auto-included)"
+              echo ""
+              echo "$SKIPPED"
+            fi
+          } >> "$GITHUB_STEP_SUMMARY"
+
+      - name: Bump version and push
        env:
          BRANCH: ${{ needs.validate-version.outputs.branch }}
          BASE_TAG: ${{ needs.validate-version.outputs.base_tag }}
          VERSION: ${{ inputs.version }}
+          DRY_RUN: ${{ inputs.dry_run }}
        run: |
-          git checkout -b "$BRANCH" "$BASE_TAG"
-          # Bump version in package.json
+          set -euo pipefail
          npm version "$VERSION" --no-git-tag-version
          git add package.json package-lock.json
+          # Keep sdk/package.json in lockstep (parity with release-sdk.yml).
+          if [ -f sdk/package.json ]; then
+            (cd sdk && npm version "$VERSION" --no-git-tag-version)
+            git add sdk/package.json
+            [ -f sdk/package-lock.json ] && git add sdk/package-lock.json
+          fi
          git commit -m "chore: bump version to $VERSION for hotfix"
-          git push origin "$BRANCH"
-          echo "## Hotfix branch created" >> "$GITHUB_STEP_SUMMARY"
-          echo "- Branch: \`$BRANCH\`" >> "$GITHUB_STEP_SUMMARY"
-          echo "- Based on: \`$BASE_TAG\`" >> "$GITHUB_STEP_SUMMARY"
-          echo "- Apply your fix, push, then run this workflow again with \`finalize\`" >> "$GITHUB_STEP_SUMMARY"
+          if [ "$DRY_RUN" != "true" ]; then
+            git push origin "$BRANCH"
+          else
+            echo "DRY RUN — branch not pushed. Local checkout exercised the cherry-pick and bump flow."
+          fi
+          {
+            echo "## Hotfix branch created"
+            echo ""
+            echo "- Branch: \`$BRANCH\`"
+            echo "- Based on: \`$BASE_TAG\`"
+            echo "- Apply additional manual fixes if needed, then run \`finalize\`."
+          } >> "$GITHUB_STEP_SUMMARY"

-  finalize:
+  install-smoke:
    needs: validate-version
    if: inputs.action == 'finalize'
+    permissions:
+      contents: read
+    uses: ./.github/workflows/install-smoke.yml
+    with:
+      ref: ${{ needs.validate-version.outputs.branch }}
+
+  finalize:
+    needs: [validate-version, install-smoke]
+    if: inputs.action == 'finalize'
    runs-on: ubuntu-latest
-    timeout-minutes: 10
+    timeout-minutes: 15
    permissions:
      contents: write
      pull-requests: write
@@ -140,31 +301,83 @@ jobs:
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

+      - name: Detect prior publish (reconciliation mode)
+        id: prior_publish
+        env:
+          VERSION: ${{ inputs.version }}
+        run: |
+          EXISTING=$(npm view get-shit-done-cc@"$VERSION" version 2>/dev/null || true)
+          if [ -n "$EXISTING" ]; then
+            echo "::warning::get-shit-done-cc@${VERSION} is already on the registry — entering reconciliation mode (skip publish, continue with tag/release/PR/dist-tag)."
+            echo "skip_publish=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "skip_publish=false" >> "$GITHUB_OUTPUT"
+          fi
+
      - name: Install and test
        run: |
          npm ci
          npm run test:coverage

-      - name: Create PR to merge hotfix back to main
-        if: ${{ !inputs.dry_run }}
+      - name: Build SDK dist for tarball
+        run: npm run build:sdk
+
+      - name: Verify CC tarball ships sdk/dist/cli.js (bug #2647 guard)
+        run: bash scripts/verify-tarball-sdk-dist.sh
+
+      - name: Pack SDK as tarball and bundle into CC source tree
        env:
-          GH_TOKEN: ${{ github.token }}
-          BRANCH: ${{ needs.validate-version.outputs.branch }}
          VERSION: ${{ inputs.version }}
        run: |
-          EXISTING_PR=$(gh pr list --base main --head "$BRANCH" --state open --json number --jq '.[0].number')
-          if [ -n "$EXISTING_PR" ]; then
-            echo "PR #$EXISTING_PR already exists; updating"
-            gh pr edit "$EXISTING_PR" \
-              --title "chore: merge hotfix v${VERSION} back to main" \
-              --body "Merge hotfix changes back to main after v${VERSION} release."
-          else
-            gh pr create \
-              --base main \
-              --head "$BRANCH" \
-              --title "chore: merge hotfix v${VERSION} back to main" \
-              --body "Merge hotfix changes back to main after v${VERSION} release."
+          set -e
+          cd sdk
+          npm pack
+          TARBALL="gsd-build-sdk-${VERSION}.tgz"
+          if [ ! -f "$TARBALL" ]; then
+            echo "::error::Expected $TARBALL but npm pack did not produce it."
+            ls -la
+            exit 1
          fi
+          mkdir -p ../sdk-bundle
+          mv "$TARBALL" ../sdk-bundle/gsd-sdk.tgz
+          cd ..
+          ls -la sdk-bundle/
+
+      - name: Add sdk-bundle to CC files whitelist (in-tree, not committed)
+        run: |
+          node <<'NODE'
+          const fs = require('fs');
+          const pkg = JSON.parse(fs.readFileSync('package.json', 'utf8'));
+          if (!Array.isArray(pkg.files)) {
+            console.error('::error::package.json files is not an array');
+            process.exit(1);
+          }
+          if (!pkg.files.includes('sdk-bundle')) {
+            pkg.files.push('sdk-bundle');
+            fs.writeFileSync('package.json', JSON.stringify(pkg, null, 2) + '\n');
+            console.log('Added sdk-bundle/ to package.json files whitelist');
+          }
+          NODE
+
+      - name: Verify CC tarball will contain sdk-bundle/gsd-sdk.tgz
+        run: |
+          set -e
+          TARBALL=$(npm pack --ignore-scripts 2>/dev/null | tail -1)
+          if [ -z "$TARBALL" ] || [ ! -f "$TARBALL" ]; then
+            echo "::error::npm pack produced no tarball"
+            exit 1
+          fi
+          if ! tar -tzf "$TARBALL" | grep -q "package/sdk-bundle/gsd-sdk.tgz"; then
+            echo "::error::CC tarball is missing package/sdk-bundle/gsd-sdk.tgz"
+            exit 1
+          fi
+          echo "✅ CC tarball contains sdk-bundle/gsd-sdk.tgz"
+          rm -f "$TARBALL"
+
+      - name: Dry-run publish validation
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: npm publish --dry-run --tag latest

      - name: Tag and push
        if: ${{ !inputs.dry_run }}
@@ -185,55 +398,98 @@ jobs:
          fi

      - name: Publish to npm (latest)
-        if: ${{ !inputs.dry_run }}
-        run: npm publish --provenance --access public
+        if: ${{ !inputs.dry_run && steps.prior_publish.outputs.skip_publish != 'true' }}
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: npm publish --provenance --access public --tag latest

-      - name: Create GitHub Release
+      - name: Re-point next dist-tag at this hotfix
+        if: ${{ !inputs.dry_run }}
+        env:
+          VERSION: ${{ inputs.version }}
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: |
+          npm dist-tag add "get-shit-done-cc@${VERSION}" next
+          echo "✅ next dist-tag re-pointed to v${VERSION} (matches latest)"
+
+      - name: Create GitHub Release (idempotent)
        if: ${{ !inputs.dry_run }}
        env:
          GH_TOKEN: ${{ github.token }}
          VERSION: ${{ inputs.version }}
        run: |
-          gh release create "v${VERSION}" \
-            --title "v${VERSION} (hotfix)" \
-            --generate-notes
+          if gh release view "v${VERSION}" >/dev/null 2>&1; then
+            echo "GitHub Release v${VERSION} already exists; ensuring --latest flag is set"
+            gh release edit "v${VERSION}" --latest || true
+          else
+            gh release create "v${VERSION}" \
+              --title "v${VERSION} (hotfix)" \
+              --generate-notes \
+              --latest
+          fi

-      - name: Clean up next dist-tag
+      - name: Create PR to merge hotfix back to main
        if: ${{ !inputs.dry_run }}
        env:
+          GH_TOKEN: ${{ github.token }}
+          BRANCH: ${{ needs.validate-version.outputs.branch }}
          VERSION: ${{ inputs.version }}
-          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
        run: |
-          # Point next to the stable release so @next never returns something
-          # older than @latest. This prevents stale pre-release installs.
-          npm dist-tag add "get-shit-done-cc@${VERSION}" next 2>/dev/null || true
-          echo "✓ next dist-tag updated to v${VERSION}"
+          EXISTING_PR=$(gh pr list --base main --head "$BRANCH" --state open --json number --jq '.[0].number')
+          if [ -n "$EXISTING_PR" ]; then
+            gh pr edit "$EXISTING_PR" \
+              --title "chore: merge hotfix v${VERSION} back to main" \
+              --body "Merge hotfix changes back to main after v${VERSION} release."
+          else
+            gh pr create \
+              --base main \
+              --head "$BRANCH" \
+              --title "chore: merge hotfix v${VERSION} back to main" \
+              --body "Merge hotfix changes back to main after v${VERSION} release."
+          fi

-      - name: Verify publish
+      - name: Verify publish landed on registry
        if: ${{ !inputs.dry_run }}
        env:
          VERSION: ${{ inputs.version }}
        run: |
-          sleep 10
-          PUBLISHED=$(npm view get-shit-done-cc@"$VERSION" version 2>/dev/null || echo "NOT_FOUND")
+          PUBLISHED="NOT_FOUND"
+          for delay in 5 10 20 30 45; do
+            PUBLISHED=$(npm view get-shit-done-cc@"$VERSION" version 2>/dev/null || echo "NOT_FOUND")
+            if [ "$PUBLISHED" = "$VERSION" ]; then
+              break
+            fi
+            echo "Waiting ${delay}s for registry to catch up (saw: $PUBLISHED)..."
+            sleep "$delay"
+          done
          if [ "$PUBLISHED" != "$VERSION" ]; then
-            echo "::error::Published version verification failed. Expected $VERSION, got $PUBLISHED"
+            echo "::error::Version $VERSION did not appear on the registry within timeout"
            exit 1
          fi
-          echo "✓ Verified: get-shit-done-cc@$VERSION is live on npm"
+          LATEST_VER=$(npm view get-shit-done-cc dist-tags.latest 2>/dev/null || echo "NOT_FOUND")
+          if [ "$LATEST_VER" != "$VERSION" ]; then
+            echo "::error::dist-tag 'latest' resolves to '$LATEST_VER', expected '$VERSION'"
+            exit 1
+          fi
+          echo "✓ Verified: get-shit-done-cc@$VERSION is live on @latest"

      - name: Summary
        env:
          VERSION: ${{ inputs.version }}
+          BASE_TAG: ${{ needs.validate-version.outputs.base_tag }}
          DRY_RUN: ${{ inputs.dry_run }}
        run: |
-          echo "## Hotfix v${VERSION}" >> "$GITHUB_STEP_SUMMARY"
-          if [ "$DRY_RUN" = "true" ]; then
-            echo "**DRY RUN** — npm publish, tagging, and push skipped" >> "$GITHUB_STEP_SUMMARY"
-          else
-            echo "- Published to npm as \`latest\`" >> "$GITHUB_STEP_SUMMARY"
-            echo "- Tagged \`v${VERSION}\`" >> "$GITHUB_STEP_SUMMARY"
-            echo "- PR created to merge back to main" >> "$GITHUB_STEP_SUMMARY"
-          fi
+          {
+            echo "## Hotfix v${VERSION}"
+            echo ""
+            echo "- Base (cumulative-fix anchor): \`${BASE_TAG}\`"
+            if [ "$DRY_RUN" = "true" ]; then
+              echo "- **DRY RUN** — npm publish, tagging, and push skipped"
+            else
+              echo "- Published to npm as \`latest\`"
+              echo "- \`next\` dist-tag re-pointed to v${VERSION}"
+              echo "- Tagged \`v${VERSION}\` (anchor for the next hotfix's cherry-pick base)"
+              echo "- SDK bundled at \`sdk-bundle/gsd-sdk.tgz\` inside CC tarball"
+              echo "- Merge-back PR opened against main"
+            fi
+          } >> "$GITHUB_STEP_SUMMARY"
--- a/.github/workflows/install-smoke.yml
+++ b/.github/workflows/install-smoke.yml
@@ -1,10 +1,13 @@
 name: Install Smoke

-# Exercises the real install path: `npm pack` → `npm install -g <tarball>`
-# → run `bin/install.js` → assert `gsd-sdk` is on PATH.
+# Exercises the real install paths:
+#   tarball: `npm pack` → `npm install -g <tarball>` → assert gsd-sdk on PATH
+#   unpacked: `npm install -g <dir>` (no pack) → assert gsd-sdk on PATH + executable
 #
-# Closes the CI gap that let #2439 ship: the rest of the suite only reads
-# `bin/install.js` as a string and never executes it.
+# The tarball path is the canonical ship path. The unpacked path reproduces the
+# mode-644 failure class (issue #2453): npm does NOT chmod bin targets when
+# installing from an unpacked local directory, so any stale tsc output lacking
+# execute bits will be caught by the unpacked job before release.
 #
 # - PRs: path-filtered, minimal runner (ubuntu + Node LTS) for fast signal.
 # - Push to release branches / main: full matrix.
@@ -16,6 +19,7 @@ on:
      - main
    paths:
      - 'bin/install.js'
+      - 'bin/gsd-sdk.js'
      - 'sdk/**'
      - 'package.json'
      - 'package-lock.json'
@@ -40,6 +44,9 @@ concurrency:
  cancel-in-progress: true

 jobs:
+  # ---------------------------------------------------------------------------
+  # Job 1: tarball install (existing canonical path)
+  # ---------------------------------------------------------------------------
  smoke:
    runs-on: ${{ matrix.os }}
    timeout-minutes: 12
@@ -78,6 +85,31 @@ jobs:
        if: steps.skip.outputs.skip != 'true'
        with:
          ref: ${{ inputs.ref || github.ref }}
+          # Need enough history to merge origin/main for stale-base detection.
+          fetch-depth: 0
+
+      # The default `refs/pull/N/merge` ref GitHub produces for PRs is cached
+      # against the recorded merge-base, not current main. When main advances
+      # after the PR was opened, the merge ref stays stale and CI can fail on
+      # issues that were already fixed upstream. Explicitly merge current
+      # origin/main into the PR head so smoke always tests the PR against the
+      # latest trunk. If the merge conflicts, emit a clear "rebase onto main"
+      # diagnostic instead of a downstream build error that looks unrelated.
+      - name: Rebase check — merge origin/main into PR head
+        if: steps.skip.outputs.skip != 'true' && github.event_name == 'pull_request'
+        shell: bash
+        run: |
+          set -euo pipefail
+          git config user.email "ci@gsd-build"
+          git config user.name "CI Rebase Check"
+          git fetch origin main
+          if ! git merge --no-edit --no-ff origin/main; then
+            echo "::error::This PR cannot cleanly merge origin/main. Rebase your branch onto current main and push again."
+            echo "::error::Conflicting files:"
+            git diff --name-only --diff-filter=U
+            git merge --abort
+            exit 1
+          fi

      - name: Set up Node.js ${{ matrix.node-version }}
        if: steps.skip.outputs.skip != 'true'
@@ -90,6 +122,23 @@ jobs:
        if: steps.skip.outputs.skip != 'true'
        run: npm ci

+      # Isolated SDK typecheck — if the build fails, emit a clear "stale base
+      # or real type error" diagnostic instead of letting the failure cascade
+      # into the tarball install step, where the downstream PATH assertion
+      # misreports it as "gsd-sdk not on PATH — installSdkIfNeeded regression".
+      - name: SDK typecheck (fails fast on type regressions)
+        if: steps.skip.outputs.skip != 'true'
+        shell: bash
+        run: |
+          set -euo pipefail
+          if ! npm run build:sdk; then
+            echo "::error::SDK build (npm run build:sdk) failed."
+            echo "::error::Common cause: your PR base is behind main and picks up intermediate type errors that are already fixed on trunk."
+            echo "::error::Fix: git fetch origin main && git rebase origin/main && git push --force-with-lease"
+            echo "::error::If the error persists on a fresh rebase, the type error is real — fix it in sdk/src/ and push."
+            exit 1
+          fi
+
      - name: Pack root tarball
        if: steps.skip.outputs.skip != 'true'
        id: pack
@@ -109,7 +158,7 @@ jobs:
          echo "$NPM_BIN" >> "$GITHUB_PATH"
          echo "npm global bin: $NPM_BIN"

-      - name: Install tarball globally (runs bin/install.js → installSdkIfNeeded)
+      - name: Install tarball globally
        if: steps.skip.outputs.skip != 'true'
        shell: bash
        env:
@@ -121,13 +170,14 @@ jobs:
          cd "$TMPDIR_ROOT"
          npm install -g "$WORKSPACE/$TARBALL"
          command -v get-shit-done-cc
-          # `--claude --local` is the non-interactive code path (see
-          # install.js main block: when both a runtime and location are set,
-          # installAllRuntimes runs with isInteractive=false, no prompts).
-          # We tolerate non-zero here because the authoritative assertion is
-          # the next step: gsd-sdk must land on PATH. Some runtime targets
-          # may exit before the SDK step for unrelated reasons on CI.
-          get-shit-done-cc --claude --local || true
+          # `--claude --local` is the non-interactive code path. Don't swallow
+          # non-zero exit — if the installer fails, that IS the CI failure, and
+          # its own error message is more useful than the downstream "shim
+          # regression" assertion masking the real cause.
+          if ! get-shit-done-cc --claude --local; then
+            echo "::error::get-shit-done-cc --claude --local failed. See the install.js output above for the real error (SDK build, PATH resolution, chmod, etc.)."
+            exit 1
+          fi

      - name: Assert gsd-sdk resolves on PATH
        if: steps.skip.outputs.skip != 'true'
@@ -135,7 +185,7 @@ jobs:
        run: |
          set -euo pipefail
          if ! command -v gsd-sdk >/dev/null 2>&1; then
-            echo "::error::gsd-sdk is not on PATH after install — installSdkIfNeeded() regression"
+            echo "::error::gsd-sdk is not on PATH after tarball install — shim regression"
            NPM_BIN="$(npm config get prefix)/bin"
            echo "npm global bin: $NPM_BIN"
            ls -la "$NPM_BIN" | grep -i gsd || true
@@ -150,3 +200,99 @@ jobs:
          set -euo pipefail
          gsd-sdk --version || gsd-sdk --help
          echo "✓ gsd-sdk is executable"
+
+  # ---------------------------------------------------------------------------
+  # Job 2: unpacked-dir install — reproduces the mode-644 failure class (#2453)
+  #
+  # `npm install -g <directory>` does NOT chmod bin targets when the source
+  # file was produced by a build script (tsc emits 0o644). This job catches
+  # regressions where sdk/dist/cli.js loses its execute bit before publish.
+  # ---------------------------------------------------------------------------
+  smoke-unpacked:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+        with:
+          ref: ${{ inputs.ref || github.ref }}
+          fetch-depth: 0
+
+      # See the `smoke` job above for rationale — refs/pull/N/merge is cached
+      # against the recorded merge-base, not current main. Explicitly merge
+      # origin/main so smoke-unpacked also runs against the latest trunk.
+      - name: Rebase check — merge origin/main into PR head
+        if: github.event_name == 'pull_request'
+        shell: bash
+        run: |
+          set -euo pipefail
+          git config user.email "ci@gsd-build"
+          git config user.name "CI Rebase Check"
+          git fetch origin main
+          if ! git merge --no-edit --no-ff origin/main; then
+            echo "::error::This PR cannot cleanly merge origin/main. Rebase your branch onto current main and push again."
+            echo "::error::Conflicting files:"
+            git diff --name-only --diff-filter=U
+            git merge --abort
+            exit 1
+          fi
+
+      - name: Set up Node.js 22
+        uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f  # v6.3.0
+        with:
+          node-version: 22
+          cache: 'npm'
+
+      - name: Install root deps
+        run: npm ci
+
+      - name: Build SDK dist (sdk/dist is gitignored — must build for unpacked install)
+        run: npm run build:sdk
+
+      - name: Ensure npm global bin is on PATH
+        shell: bash
+        run: |
+          NPM_BIN="$(npm config get prefix)/bin"
+          echo "$NPM_BIN" >> "$GITHUB_PATH"
+          echo "npm global bin: $NPM_BIN"
+
+      - name: Strip execute bit from sdk/dist/cli.js to simulate tsc-fresh output
+        shell: bash
+        run: |
+          set -euo pipefail
+          # Simulate the exact state tsc produces: cli.js at mode 644.
+          chmod 644 sdk/dist/cli.js
+          echo "Stripped execute bit: $(stat -c '%a' sdk/dist/cli.js 2>/dev/null || stat -f '%p' sdk/dist/cli.js)"
+
+      - name: Install from unpacked directory (no npm pack)
+        shell: bash
+        run: |
+          set -euo pipefail
+          TMPDIR_ROOT=$(mktemp -d)
+          cd "$TMPDIR_ROOT"
+          npm install -g "$GITHUB_WORKSPACE"
+          command -v get-shit-done-cc
+          get-shit-done-cc --claude --local || true
+
+      - name: Assert gsd-sdk resolves on PATH after unpacked install
+        shell: bash
+        run: |
+          set -euo pipefail
+          if ! command -v gsd-sdk >/dev/null 2>&1; then
+            echo "::error::gsd-sdk is not on PATH after unpacked install — #2453 regression"
+            NPM_BIN="$(npm config get prefix)/bin"
+            ls -la "$NPM_BIN" | grep -i gsd || true
+            exit 1
+          fi
+          echo "✓ gsd-sdk resolves at: $(command -v gsd-sdk)"
+
+      - name: Assert gsd-sdk is executable after unpacked install (#2453)
+        shell: bash
+        run: |
+          set -euo pipefail
+          # This is the exact check that would have caught #2453 before release.
+          # The shim (bin/gsd-sdk.js) invokes sdk/dist/cli.js via `node`, so
+          # the execute bit on cli.js is not needed for the shim path. However
+          # installSdkIfNeeded() also chmods cli.js in-place as a safety net.
+          gsd-sdk --version || gsd-sdk --help
+          echo "✓ gsd-sdk is executable after unpacked install"
--- a/.github/workflows/release-sdk.yml
+++ b/.github/workflows/release-sdk.yml
@@ -0,0 +1,790 @@
+# Release SDK Bundle
+#
+# Stopgap workflow_dispatch publish path: builds get-shit-done-cc with the
+# compiled SDK and the SDK .tgz bundled inside the CC tarball, then
+# publishes the CC package to ONE chosen dist-tag (dev | next | latest)
+# per run.
+#
+# Why this exists: @gsd-build/sdk publishes from canary.yml and release.yml
+# fail because the @gsd-build npm token is currently unavailable. CC users
+# do not consume @gsd-build/sdk directly — bin/gsd-sdk.js resolves
+# sdk/dist/cli.js from inside the installed CC package, so the bundled
+# copy is sufficient for full functionality. This workflow ships CC alone
+# (no separate @gsd-build/sdk publish attempt) and additionally bakes a
+# bundled gsd-sdk-<version>.tgz at sdk-bundle/gsd-sdk.tgz inside the CC
+# tarball as a recoverable npm-installable artifact.
+#
+# Existing canary.yml and release.yml are intentionally untouched. They
+# remain the canonical two-package publish path; restore them to primary
+# use once @gsd-build/sdk ownership is recovered.
+#
+# Tracking issues: #2925 (initial workflow), #2929 (CI-gate parity with release.yml)
+
+name: Release SDK Bundle
+
+on:
+  workflow_dispatch:
+    inputs:
+      action:
+        description: 'publish = normal dev/next/latest publish; hotfix = create hotfix/X.YY.Z branch from latest vX.YY.* tag, cherry-pick fix:/chore: from main, publish to @latest'
+        required: true
+        type: choice
+        default: publish
+        options:
+          - publish
+          - hotfix
+      tag:
+        description: 'npm dist-tag (publish action only; hotfix forces latest)'
+        required: false
+        type: choice
+        default: latest
+        options:
+          - dev
+          - next
+          - latest
+      version:
+        description: 'Version. publish: explicit (e.g. 1.50.0-dev.3) or empty to derive. hotfix: REQUIRED patch (e.g. 1.27.1, Z>0).'
+        required: false
+        type: string
+      ref:
+        description: 'Branch or ref to build from. Ignored for hotfix (workflow uses hotfix/X.YY.Z).'
+        required: false
+        type: string
+      auto_cherry_pick:
+        description: 'Hotfix only: auto-cherry-pick fix:/chore: commits from origin/main since base tag.'
+        required: false
+        type: boolean
+        default: true
+      dry_run:
+        description: 'Dry run (skip npm publish, git tag, and push). Hotfix branch creation/push also skipped.'
+        required: false
+        type: boolean
+        default: false
+
+# Per stream (dist-tag for publish, version for hotfix) — no concurrent publishes for the same stream.
+concurrency:
+  group: release-sdk-${{ inputs.action == 'hotfix' && format('hotfix-{0}', inputs.version) || inputs.tag }}
+  cancel-in-progress: false
+
+env:
+  NODE_VERSION: 24
+
+jobs:
+  # Resolves the effective git ref for this run.
+  #
+  # action=publish  → outputs inputs.ref verbatim (may be empty = workflow ref)
+  # action=hotfix   → branches hotfix/X.YY.Z from highest existing vX.YY.* tag,
+  #                   auto-cherry-picks fix:/chore: from origin/main, pushes,
+  #                   and outputs the new branch as ref. Idempotent: if branch
+  #                   already exists (operator pre-prepared it via hotfix.yml),
+  #                   we just check it out and re-run the cherry-pick step
+  #                   no-ops since `git cherry` will report nothing new.
+  prepare:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    permissions:
+      contents: write
+    outputs:
+      ref: ${{ steps.out.outputs.ref }}
+      base_tag: ${{ steps.hotfix.outputs.base_tag }}
+    steps:
+      - name: Validate hotfix inputs
+        if: inputs.action == 'hotfix'
+        env:
+          VERSION: ${{ inputs.version }}
+        run: |
+          if [ -z "$VERSION" ]; then
+            echo "::error::action=hotfix requires the 'version' input (e.g. 1.27.1)"
+            exit 1
+          fi
+          if ! echo "$VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[1-9][0-9]*$'; then
+            echo "::error::Hotfix version must match X.YY.Z with Z>0 (got: $VERSION)"
+            exit 1
+          fi
+
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+        if: inputs.action == 'hotfix'
+        with:
+          fetch-depth: 0
+
+      - name: Configure git identity
+        if: inputs.action == 'hotfix'
+        run: |
+          git config user.name "github-actions[bot]"
+          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+
+      - name: Prepare hotfix branch
+        id: hotfix
+        if: inputs.action == 'hotfix'
+        env:
+          VERSION: ${{ inputs.version }}
+          AUTO_CHERRY_PICK: ${{ inputs.auto_cherry_pick }}
+          DRY_RUN: ${{ inputs.dry_run }}
+        run: |
+          set -euo pipefail
+          # Stash the shipped-paths classifier from the dispatched ref's
+          # working tree BEFORE `git checkout -b ... "$BASE_TAG"` below
+          # overwrites it. Base tags predating #2980 don't have the
+          # classifier in their tree, so the loop must reference a
+          # location that survives the working-tree swap. Bug #2983.
+          CLASSIFIER_SRC="scripts/diff-touches-shipped-paths.cjs"
+          if [ ! -f "$CLASSIFIER_SRC" ]; then
+            echo "::error::shipped-paths classifier not found at $CLASSIFIER_SRC in dispatched ref — refusing to run"
+            exit 1
+          fi
+          CLASSIFIER="${RUNNER_TEMP}/diff-touches-shipped-paths.cjs"
+          cp "$CLASSIFIER_SRC" "$CLASSIFIER"
+          if [ ! -f "$CLASSIFIER" ]; then
+            echo "::error::failed to stage classifier at $CLASSIFIER"
+            exit 1
+          fi
+
+          MAJOR_MINOR=$(echo "$VERSION" | cut -d. -f1-2)
+          TARGET_TAG="v${VERSION}"
+          BRANCH="hotfix/${VERSION}"
+          # Semver-correct selection: append TARGET_TAG, sort -V, take preceding entry.
+          # Plain lexicographic compare mis-orders multi-digit patches (v1.27.10 vs v1.27.9).
+          BASE_TAG=$( ( git tag -l "v${MAJOR_MINOR}.*" | grep -E "^v[0-9]+\.[0-9]+\.[0-9]+$"; echo "$TARGET_TAG" ) \
+            | sort -V \
+            | awk -v target="$TARGET_TAG" '$1 == target { print prev; exit } { prev = $1 }')
+          if [ -z "$BASE_TAG" ]; then
+            echo "::error::No prior stable tag found for ${MAJOR_MINOR}.x before $TARGET_TAG"
+            exit 1
+          fi
+          echo "base_tag=$BASE_TAG" >> "$GITHUB_OUTPUT"
+          echo "branch=$BRANCH" >> "$GITHUB_OUTPUT"
+
+          # Idempotent branch creation — operator may have pre-prepared via hotfix.yml.
+          git fetch origin main:refs/remotes/origin/main
+          if git ls-remote --exit-code origin "refs/heads/$BRANCH" >/dev/null 2>&1; then
+            echo "Branch $BRANCH already exists on origin; checking out"
+            git fetch origin "$BRANCH"
+            git checkout "$BRANCH"
+            BRANCH_PRE_EXISTED=1
+          else
+            git checkout -b "$BRANCH" "$BASE_TAG"
+            BRANCH_PRE_EXISTED=0
+            # Push the skeleton up-front (real runs only) so cherry-pick conflicts
+            # leave a remote artefact the operator can resolve. Dry-run keeps
+            # everything local — no orphan branch created on origin.
+            if [ "$DRY_RUN" != "true" ]; then
+              git push -u origin "$BRANCH"
+            fi
+          fi
+
+          if [ "$AUTO_CHERRY_PICK" = "true" ]; then
+            CANDIDATES=$(git cherry HEAD origin/main | awk '/^\+ / {print $2}')
+            if [ -n "$CANDIDATES" ]; then
+              ORDERED=$(git log --reverse --format='%H' "${BASE_TAG}..origin/main" \
+                | grep -F -f <(echo "$CANDIDATES") || true)
+              INCLUDED=""
+              # POLICY_SKIPPED — commits intentionally not picked because they
+              # don't match the fix/chore filter (feat/refactor/docs/etc).
+              # CONFLICT_SKIPPED — fix/chore commits whose cherry-pick failed
+              # and were skipped per the full-automation policy (#2968).
+              # NON_SHIPPED_SKIPPED — fix/chore commits whose diff doesn't
+              # touch any path in the npm tarball's `files` whitelist
+              # (CI / test / docs / planning-only changes). They can't
+              # affect the published package's behavior, so picking them
+              # into a hotfix is meaningless — and picking workflow-file
+              # changes specifically would also fail the push step because
+              # the default GITHUB_TOKEN lacks the `workflow` scope. The
+              # shipped-paths filter is the precise root cause: bug #2980.
+              # Operators reviewing the run summary need these distinct so
+              # the manual-review queue (CONFLICT_SKIPPED) isn't buried in
+              # the noise from the other two buckets.
+              POLICY_SKIPPED=""
+              CONFLICT_SKIPPED=""
+              NON_SHIPPED_SKIPPED=""
+              while IFS= read -r SHA; do
+                [ -z "$SHA" ] && continue
+                SUBJECT=$(git log -1 --format='%s' "$SHA")
+                if echo "$SUBJECT" | grep -qE '^(fix|chore)(\([^)]+\))?!?: '; then
+                  # Merge commits with fix:/chore: titles can't be cherry-picked
+                  # without `-m <parent>` and we can't pick the parent
+                  # automatically. They fail BEFORE entering cherry-pick state
+                  # (no CHERRY_PICK_HEAD), so an unconditional `--skip` would
+                  # then fail and brick the loop. Skip them upfront with a
+                  # distinct reason. Bug #2968 / CodeRabbit on PR #2970.
+                  PARENT_COUNT=$(git rev-list --parents -n 1 "$SHA" | awk '{print NF - 1}')
+                  if [ "$PARENT_COUNT" -gt 1 ]; then
+                    REASON="merge commit — manual -m parent selection required"
+                    echo "↷ skipping $SHA — $REASON"
+                    CONFLICT_SKIPPED="${CONFLICT_SKIPPED}- \`${SHA}\` ${SUBJECT} ($REASON)"$'\n'
+                    continue
+                  fi
+                  # Pre-pick guard: a hotfix release can only be affected
+                  # by commits whose diff intersects the npm tarball's
+                  # shipped paths (package.json `files` whitelist plus
+                  # package.json itself, which `npm pack` always
+                  # includes). Commits that touch only CI workflows,
+                  # tests, docs, or planning artifacts cannot change what
+                  # ships, so picking them into a hotfix is meaningless.
+                  # As a side benefit, this excludes
+                  # `.github/workflows/*` changes whose push would
+                  # otherwise be rejected by GitHub because the default
+                  # GITHUB_TOKEN lacks the `workflow` scope. The filter
+                  # is implemented in
+                  # scripts/diff-touches-shipped-paths.cjs rather than
+                  # inline so the rules (read package.json `files`,
+                  # treat entries as file-OR-directory prefix, the
+                  # `package.json`-always-shipped rule) are
+                  # unit-testable. Bug #2980.
+                  #
+                  # Use $CLASSIFIER (staged at workflow-start, before
+                  # `git checkout -b ... "$BASE_TAG"` swapped the working
+                  # tree) rather than `scripts/...` directly — base tags
+                  # older than #2980 don't have the classifier in their
+                  # tree. Capture the exit code via PIPESTATUS and
+                  # dispatch on it: 0 = shipped, 1 = not shipped, 2+ =
+                  # classifier error → fail-fast (don't silently treat
+                  # tooling errors as informational skips). Bug #2983.
+                  #
+                  # PIPESTATUS capture must happen IMMEDIATELY after the
+                  # pipeline — the previous form (`pipeline || true; RC=
+                  # ${PIPESTATUS[1]}`) had a subtle bug: when the
+                  # pipeline fails (exit 1 or 2 — exactly the cases we
+                  # care about), `|| true` runs `true` as a one-command
+                  # pipeline, overwriting PIPESTATUS to (0). The fix is
+                  # to wrap the pipeline in `set +e`/`set -e` and snapshot
+                  # PIPESTATUS into a local array on the very next line.
+                  # CodeRabbit on PR #2984.
+                  set +e
+                  git diff-tree --no-commit-id --name-only -r "$SHA" \
+                    | node "$CLASSIFIER"
+                  PIPE_RC=("${PIPESTATUS[@]}")
+                  set -e
+                  DIFFTREE_RC="${PIPE_RC[0]}"
+                  CLASSIFIER_RC="${PIPE_RC[1]}"
+                  if [ "$DIFFTREE_RC" -ne 0 ]; then
+                    echo "::error::git diff-tree failed for $SHA (exit $DIFFTREE_RC) — refusing to classify on incomplete input."
+                    exit "$DIFFTREE_RC"
+                  fi
+                  case "$CLASSIFIER_RC" in
+                    0) ;;
+                    1)
+                      REASON="touches no shipped paths (CI / test / docs / planning only)"
+                      echo "↷ skipping $SHA — $REASON"
+                      NON_SHIPPED_SKIPPED="${NON_SHIPPED_SKIPPED}- \`${SHA}\` ${SUBJECT}"$'\n'
+                      continue
+                      ;;
+                    *)
+                      echo "::error::shipped-paths classifier failed for $SHA (exit $CLASSIFIER_RC). Refusing to silently skip — bug #2983."
+                      exit "$CLASSIFIER_RC"
+                      ;;
+                  esac
+                  echo "→ cherry-picking $SHA  $SUBJECT"
+                  # Pin merge.conflictStyle=merge on the cherry-pick so the
+                  # awk classifier below sees deterministic marker shapes —
+                  # diff3/zdiff3 would inject `||||||| ancestor` lines into
+                  # the HEAD section and cause context-missing conflicts to
+                  # misclassify as real. Bug #2966.
+                  if ! git -c merge.conflictStyle=merge cherry-pick -x --allow-empty --keep-redundant-commits "$SHA"; then
+                    # Full automation policy (bug #2968): any conflict the
+                    # cherry-pick can't auto-resolve is skipped, not aborted.
+                    # The hotfix run completes with whatever applies cleanly;
+                    # the CONFLICT_SKIPPED list below becomes the operator's
+                    # review queue (see "Cherry-pick summary" in the run
+                    # summary).
+                    #
+                    # Classify the conflict for the skip reason (operator-
+                    # facing diagnostic — doesn't change control flow):
+                    #   - context absent at base: HEAD section in every
+                    #     conflict marker is empty (the picked commit modifies
+                    #     code that doesn't exist at the base). Bug #2966.
+                    #   - merge conflict: HEAD section has content (both base
+                    #     and patch want different content for the same
+                    #     region). Typical when the base tag was cut from a
+                    #     branch that has diverged from main. Bug #2968.
+                    UNMERGED=$(git diff --name-only --diff-filter=U)
+                    REASON="merge conflict — manual review"
+                    if [ -n "$UNMERGED" ]; then
+                      ALL_EMPTY_HEAD=true
+                      while IFS= read -r CONFLICTED; do
+                        [ -z "$CONFLICTED" ] && continue
+                        # Guard the classifier against degenerate cases that
+                        # would otherwise skew toward "context absent" (the
+                        # auto-skip path) when they're actually unsafe to skip:
+                        #   - file missing or unreadable: don't pretend the
+                        #     conflict is benign; treat as real.
+                        #   - file listed as unmerged but no conflict markers
+                        #     present: anomalous git state; treat as real so
+                        #     the pick goes to the manual-review queue.
+                        # CodeRabbit on PR #2970.
+                        if [ ! -r "$CONFLICTED" ] || ! grep -q '^<<<<<<< ' "$CONFLICTED" 2>/dev/null; then
+                          ALL_EMPTY_HEAD=false
+                          break
+                        fi
+                        REAL=$(awk '
+                          /^<<<<<<< / { in_head=1; head=""; next }
+                          /^=======$/ && in_head { in_head=0; next }
+                          /^>>>>>>> / {
+                            if (head ~ /[^[:space:]]/) { print "real"; exit }
+                            head=""
+                            next
+                          }
+                          in_head { head = head $0 "\n" }
+                        ' "$CONFLICTED" 2>/dev/null || echo "real")
+                        if [ "$REAL" = "real" ]; then
+                          ALL_EMPTY_HEAD=false
+                          break
+                        fi
+                      done <<< "$UNMERGED"
+                      if [ "$ALL_EMPTY_HEAD" = "true" ]; then
+                        REASON="context absent at base"
+                      fi
+                    fi
+
+                    echo "↷ skipping $SHA — $REASON"
+                    # Guard `--skip`: cherry-pick can fail before entering the
+                    # conflict state (e.g. unreadable commit, empty-without-
+                    # --allow-empty edge cases the flag misses). Calling
+                    # `--skip` outside an in-progress cherry-pick exits non-
+                    # zero and would brick the loop. CodeRabbit on PR #2970.
+                    if git rev-parse -q --verify CHERRY_PICK_HEAD >/dev/null 2>&1; then
+                      git cherry-pick --skip
+                    fi
+                    CONFLICT_SKIPPED="${CONFLICT_SKIPPED}- \`${SHA}\` ${SUBJECT} ($REASON)"$'\n'
+                    continue
+                  fi
+                  INCLUDED="${INCLUDED}- \`${SHA}\` ${SUBJECT}"$'\n'
+                else
+                  POLICY_SKIPPED="${POLICY_SKIPPED}- \`${SHA}\` ${SUBJECT}"$'\n'
+                fi
+              done <<< "$ORDERED"
+              {
+                echo "## Cherry-pick summary"
+                echo ""
+                echo "Base: \`$BASE_TAG\` → Branch: \`$BRANCH\`$([ "$DRY_RUN" = "true" ] && echo " (DRY RUN — local only)")"
+                echo ""
+                if [ -n "$INCLUDED" ]; then
+                  echo "### Included (fix/chore)"
+                  echo ""
+                  echo "$INCLUDED"
+                else
+                  echo "_No fix/chore commits to include._"
+                fi
+                if [ -n "$NON_SHIPPED_SKIPPED" ]; then
+                  echo "### Skipped — touches no shipped paths (informational)"
+                  echo ""
+                  echo "These fix/chore commits don't touch any path in the npm tarball's \`files\` whitelist (or \`package.json\`), so they cannot change the published package's behavior. CI / test / docs / planning-only changes belong on \`main\`, not in a hotfix. No action needed."
+                  echo ""
+                  echo "$NON_SHIPPED_SKIPPED"
+                fi
+                if [ -n "$CONFLICT_SKIPPED" ]; then
+                  echo "### Skipped — cherry-pick conflict (manual review)"
+                  echo ""
+                  echo "$CONFLICT_SKIPPED"
+                fi
+                if [ -n "$POLICY_SKIPPED" ]; then
+                  echo "### Not auto-included (feat/refactor/docs/etc)"
+                  echo ""
+                  echo "$POLICY_SKIPPED"
+                fi
+              } >> "$GITHUB_STEP_SUMMARY"
+            fi
+          fi
+
+          # Bump version on the branch (committed) so downstream install-smoke +
+          # release jobs build the correct version. The release job's own in-tree
+          # bump becomes a no-op when the file already has the right version.
+          CURRENT=$(node -p "require('./package.json').version")
+          if [ "$CURRENT" != "$VERSION" ]; then
+            npm version "$VERSION" --no-git-tag-version
+            git add package.json package-lock.json
+            if [ -f sdk/package.json ]; then
+              (cd sdk && npm version "$VERSION" --no-git-tag-version)
+              git add sdk/package.json
+              [ -f sdk/package-lock.json ] && git add sdk/package-lock.json
+            fi
+            git commit -m "chore: bump version to $VERSION for hotfix"
+          fi
+          if [ "$DRY_RUN" != "true" ]; then
+            git push origin "$BRANCH"
+          else
+            echo "DRY RUN — cherry-picks applied locally; branch not pushed. Downstream install-smoke will run against \`$BASE_TAG\` (the cherry-pick verification above is the dry-run signal)."
+          fi
+
+      - name: Determine effective ref
+        id: out
+        env:
+          ACTION: ${{ inputs.action }}
+          INPUT_REF: ${{ inputs.ref }}
+          DRY_RUN: ${{ inputs.dry_run }}
+          BASE_TAG: ${{ steps.hotfix.outputs.base_tag }}
+          BRANCH: ${{ steps.hotfix.outputs.branch }}
+        run: |
+          if [ "$ACTION" = "hotfix" ]; then
+            if [ "$DRY_RUN" = "true" ]; then
+              echo "ref=$BASE_TAG" >> "$GITHUB_OUTPUT"
+            else
+              echo "ref=$BRANCH" >> "$GITHUB_OUTPUT"
+            fi
+          else
+            echo "ref=$INPUT_REF" >> "$GITHUB_OUTPUT"
+          fi
+
+  # Cross-platform install validation gate (parity with release.yml).
+  install-smoke:
+    needs: prepare
+    permissions:
+      contents: read
+    uses: ./.github/workflows/install-smoke.yml
+    with:
+      ref: ${{ needs.prepare.outputs.ref }}
+
+  release:
+    needs: [prepare, install-smoke]
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+    permissions:
+      contents: write  # tag + push + GitHub Release
+      id-token: write  # provenance
+      # The merge-back PR step (and the pull-request scope it required)
+      # was removed in #2983 — auto-cherry-pick hotfix flow only picks
+      # commits already on main, so there's nothing to merge back.
+    environment: npm-publish
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+        with:
+          fetch-depth: 0
+          ref: ${{ needs.prepare.outputs.ref }}
+
+      - uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f  # v6.3.0
+        with:
+          node-version: ${{ env.NODE_VERSION }}
+          registry-url: 'https://registry.npmjs.org'
+          cache: 'npm'
+
+      - name: Determine version
+        id: ver
+        env:
+          ACTION: ${{ inputs.action }}
+          INPUT_TAG: ${{ inputs.tag }}
+          INPUT_OVERRIDE: ${{ inputs.version }}
+        run: |
+          set -e
+          # Hotfix forces version=inputs.version and dist-tag=latest.
+          if [ "$ACTION" = "hotfix" ]; then
+            if [ -z "$INPUT_OVERRIDE" ]; then
+              echo "::error::action=hotfix requires the 'version' input"
+              exit 1
+            fi
+            VERSION="$INPUT_OVERRIDE"
+            EFFECTIVE_TAG="latest"
+            echo "version=$VERSION" >> "$GITHUB_OUTPUT"
+            echo "tag=$EFFECTIVE_TAG" >> "$GITHUB_OUTPUT"
+            echo "→ Hotfix: will publish v${VERSION} to dist-tag '${EFFECTIVE_TAG}'"
+            exit 0
+          fi
+          RAW=$(node -p "require('./package.json').version")
+          BASE=$(echo "$RAW" | sed 's/-.*//')
+          if [ -n "$INPUT_OVERRIDE" ]; then
+            VERSION="$INPUT_OVERRIDE"
+          else
+            case "$INPUT_TAG" in
+              dev)
+                N=1
+                while git tag -l "v${BASE}-dev.${N}" | grep -q .; do
+                  N=$((N + 1))
+                done
+                VERSION="${BASE}-dev.${N}"
+                ;;
+              next)
+                N=1
+                while git tag -l "v${BASE}-rc.${N}" | grep -q .; do
+                  N=$((N + 1))
+                done
+                VERSION="${BASE}-rc.${N}"
+                ;;
+              latest)
+                VERSION="$BASE"
+                ;;
+              *)
+                echo "::error::Unknown tag '$INPUT_TAG' (expected dev|next|latest)"
+                exit 1
+                ;;
+            esac
+          fi
+          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
+          echo "tag=$INPUT_TAG" >> "$GITHUB_OUTPUT"
+          echo "→ Will publish v${VERSION} to dist-tag '${INPUT_TAG}'"
+
+      # Reconciliation mode: if version is already on npm (a prior run
+       # published successfully but a downstream step failed), don't hard-fail.
+       # Set a flag and skip the publish step below; tag/release/PR/dist-tag
+       # steps still execute so the rerun can finish reconciling state.
+      - name: Detect prior publish (reconciliation mode)
+        id: prior_publish
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+        run: |
+          EXISTING=$(npm view get-shit-done-cc@"$VERSION" version 2>/dev/null || true)
+          if [ -n "$EXISTING" ]; then
+            echo "::warning::get-shit-done-cc@${VERSION} is already on the registry — entering reconciliation mode (skip publish, continue with tag/release/PR/dist-tag)."
+            echo "skip_publish=true" >> "$GITHUB_OUTPUT"
+          else
+            echo "skip_publish=false" >> "$GITHUB_OUTPUT"
+          fi
+
+      # Tolerant tag-existence check (matches release.yml pattern). An
+      # operator re-running after a mid-flight publish-step failure should
+      # not be blocked just because the tag step succeeded last time. Only
+      # error if the existing tag points at a different commit than HEAD.
+      - name: Check git tag (skip if matches HEAD, error if mismatched)
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+        run: |
+          if git rev-parse -q --verify "refs/tags/v${VERSION}" >/dev/null; then
+            EXISTING_SHA=$(git rev-parse "refs/tags/v${VERSION}")
+            HEAD_SHA=$(git rev-parse HEAD)
+            if [ "$EXISTING_SHA" != "$HEAD_SHA" ]; then
+              echo "::error::git tag v${VERSION} already exists pointing at ${EXISTING_SHA}, but HEAD is ${HEAD_SHA}"
+              exit 1
+            fi
+            echo "::notice::tag v${VERSION} already exists at HEAD; tag step will skip"
+          fi
+
+      - name: Configure git identity
+        run: |
+          git config user.name "github-actions[bot]"
+          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+
+      - name: Bump in-tree version (not committed)
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+        run: |
+          # --allow-same-version: prepare may have already committed this bump
+          # on the hotfix branch (release checks out BRANCH in real runs,
+          # BASE_TAG in dry-runs — only the latter has the older version).
+          npm version "$VERSION" --no-git-tag-version --allow-same-version
+          cd sdk && npm version "$VERSION" --no-git-tag-version --allow-same-version
+
+      - name: Install dependencies
+        run: npm ci
+
+      - name: Run full test suite with coverage (parity with release.yml)
+        run: npm run test:coverage
+
+      - name: Build SDK dist for tarball
+        run: npm run build:sdk
+
+      - name: Verify CC tarball ships sdk/dist/cli.js (bug #2647 guard)
+        run: bash scripts/verify-tarball-sdk-dist.sh
+
+      - name: Pack SDK as tarball and bundle into CC source tree
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+        run: |
+          set -e
+          cd sdk
+          npm pack
+          # npm pack emits gsd-build-sdk-<version>.tgz in the cwd
+          TARBALL="gsd-build-sdk-${VERSION}.tgz"
+          if [ ! -f "$TARBALL" ]; then
+            echo "::error::Expected $TARBALL but npm pack did not produce it. Listing sdk/:"
+            ls -la
+            exit 1
+          fi
+          mkdir -p ../sdk-bundle
+          mv "$TARBALL" ../sdk-bundle/gsd-sdk.tgz
+          cd ..
+          ls -la sdk-bundle/
+
+      - name: Add sdk-bundle to CC files whitelist (in-tree, not committed)
+        run: |
+          node <<'NODE'
+          const fs = require('fs');
+          const pkg = JSON.parse(fs.readFileSync('package.json', 'utf8'));
+          if (!Array.isArray(pkg.files)) {
+            console.error('::error::package.json files is not an array');
+            process.exit(1);
+          }
+          if (!pkg.files.includes('sdk-bundle')) {
+            pkg.files.push('sdk-bundle');
+            fs.writeFileSync('package.json', JSON.stringify(pkg, null, 2) + '\n');
+            console.log('Added sdk-bundle/ to package.json files whitelist');
+          } else {
+            console.log('sdk-bundle/ already in files whitelist');
+          }
+          NODE
+
+      - name: Verify CC tarball will contain sdk-bundle/gsd-sdk.tgz
+        run: |
+          set -e
+          TARBALL=$(npm pack --ignore-scripts 2>/dev/null | tail -1)
+          if [ -z "$TARBALL" ] || [ ! -f "$TARBALL" ]; then
+            echo "::error::npm pack produced no tarball"
+            exit 1
+          fi
+          echo "Inspecting $TARBALL for sdk-bundle/gsd-sdk.tgz:"
+          if ! tar -tzf "$TARBALL" | grep -q "package/sdk-bundle/gsd-sdk.tgz"; then
+            echo "::error::CC tarball is missing package/sdk-bundle/gsd-sdk.tgz"
+            tar -tzf "$TARBALL" | grep -E "sdk-bundle|sdk/dist" | head -20
+            exit 1
+          fi
+          echo "✅ CC tarball contains sdk-bundle/gsd-sdk.tgz"
+          rm -f "$TARBALL"
+
+      - name: Dry-run publish validation
+        # Skip the rehearsal when the version is already on npm
+        # (reconciliation mode). `npm publish --dry-run` contacts the
+        # registry and fails with "You cannot publish over the
+        # previously published versions" if the version exists, even
+        # though no actual publish would be attempted. The real publish
+        # step (further down) is gated on the same condition; gate the
+        # rehearsal too so re-runs of an already-published hotfix don't
+        # fail here on a check that doesn't apply. Bug #2987.
+        if: ${{ steps.prior_publish.outputs.skip_publish != 'true' }}
+        env:
+          TAG: ${{ steps.ver.outputs.tag }}
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: npm publish --dry-run --tag "$TAG"
+
+      - name: Tag and push
+        if: ${{ !inputs.dry_run }}
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+        run: |
+          if git rev-parse -q --verify "refs/tags/v${VERSION}" >/dev/null; then
+            echo "Tag v${VERSION} already exists at HEAD (per pre-flight check); skipping git tag step"
+          else
+            git tag "v${VERSION}"
+          fi
+          git push origin "v${VERSION}"
+
+      - name: Publish to npm (CC bundle, SDK included as both loose tree and .tgz)
+        if: ${{ !inputs.dry_run && steps.prior_publish.outputs.skip_publish != 'true' }}
+        env:
+          TAG: ${{ steps.ver.outputs.tag }}
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: npm publish --provenance --access public --tag "$TAG"
+
+      # Keep `next` from going stale relative to `latest`. When publishing a
+      # stable release, also point `next` at it so users on `@next` don't
+      # get stuck on an older pre-release than what's now stable. Parity
+      # with release.yml#finalize "Clean up next dist-tag" step.
+      - name: Re-point next dist-tag at the new latest (only when tag=latest)
+        if: ${{ !inputs.dry_run && steps.ver.outputs.tag == 'latest' }}
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: |
+          npm dist-tag add "get-shit-done-cc@${VERSION}" next
+          echo "✅ next dist-tag re-pointed to v${VERSION} (matches latest)"
+
+      - name: Create GitHub Release (idempotent)
+        if: ${{ !inputs.dry_run }}
+        env:
+          GH_TOKEN: ${{ github.token }}
+          VERSION: ${{ steps.ver.outputs.version }}
+          TAG: ${{ steps.ver.outputs.tag }}
+        run: |
+          # Per-tag release flags:
+          #   dev, next → --prerelease (won't be highlighted as the latest release on the repo page)
+          #   latest    → --latest (becomes the highlighted release)
+          # Idempotent: if release already exists (rerun after a transient
+          # downstream failure), edit the latest flag instead of failing.
+          if gh release view "v${VERSION}" >/dev/null 2>&1; then
+            echo "GitHub Release v${VERSION} already exists; reconciling --latest flag"
+            if [ "$TAG" = "latest" ]; then
+              gh release edit "v${VERSION}" --latest || true
+            fi
+          elif [ "$TAG" = "latest" ]; then
+            gh release create "v${VERSION}" \
+              --title "v${VERSION}" \
+              --generate-notes \
+              --latest
+          else
+            gh release create "v${VERSION}" \
+              --title "v${VERSION}" \
+              --generate-notes \
+              --prerelease
+          fi
+          echo "✅ GitHub Release v${VERSION} ready"
+
+      # Merge-back PR step removed — bug #2983.
+      #
+      # The auto-cherry-pick hotfix flow only picks commits already on
+      # main (`git cherry HEAD origin/main` outputs unmerged commits;
+      # we filter to fix:/chore: from main). By construction every code
+      # commit on the hotfix branch is already on main. The only
+      # hotfix-branch-only commit is `chore: bump version to X.Y.Z for
+      # hotfix`, which would either no-op against main (already past
+      # X.Y.Z) or rewind main's in-progress version — strictly
+      # counterproductive in either case.
+      #
+      # The original merge-back step also failed in production with
+      # `GitHub Actions is not permitted to create or approve pull
+      # requests (createPullRequest)` (org policy), but even if the
+      # policy were lifted the PR would have nothing useful to merge.
+      # Run 25232968975 was the trigger for removal.
+
+      - name: Verify publish landed on registry
+        if: ${{ !inputs.dry_run }}
+        env:
+          VERSION: ${{ steps.ver.outputs.version }}
+          TAG: ${{ steps.ver.outputs.tag }}
+        run: |
+          PUBLISHED="NOT_FOUND"
+          for delay in 5 10 20 30 45; do
+            PUBLISHED=$(npm view get-shit-done-cc@"$VERSION" version 2>/dev/null || echo "NOT_FOUND")
+            if [ "$PUBLISHED" = "$VERSION" ]; then
+              break
+            fi
+            echo "Waiting ${delay}s for registry to catch up (saw: $PUBLISHED)..."
+            sleep "$delay"
+          done
+          if [ "$PUBLISHED" != "$VERSION" ]; then
+            echo "::error::Version $VERSION did not appear on the registry within timeout"
+            exit 1
+          fi
+          TAG_VERSION=$(npm view get-shit-done-cc dist-tags."$TAG" 2>/dev/null || echo "NOT_FOUND")
+          if [ "$TAG_VERSION" != "$VERSION" ]; then
+            echo "::error::dist-tag '$TAG' resolves to '$TAG_VERSION', expected '$VERSION'"
+            exit 1
+          fi
+          echo "✅ get-shit-done-cc@${VERSION} live on dist-tag '${TAG}'"
+
+      - name: Summary
+        env:
+          ACTION: ${{ inputs.action }}
+          VERSION: ${{ steps.ver.outputs.version }}
+          TAG: ${{ steps.ver.outputs.tag }}
+          BASE_TAG: ${{ needs.prepare.outputs.base_tag }}
+          BRANCH: ${{ needs.prepare.outputs.ref }}
+          DRY_RUN: ${{ inputs.dry_run }}
+        run: |
+          {
+            if [ "$ACTION" = "hotfix" ]; then
+              echo "## Release SDK Bundle (hotfix): v${VERSION} → @${TAG}"
+              echo ""
+              echo "- Base (cumulative-fix anchor): \`${BASE_TAG}\`"
+              echo "- Branch: \`${BRANCH}\`"
+            else
+              echo "## Release SDK Bundle: v${VERSION} → @${TAG}"
+            fi
+            echo ""
+            if [ "$DRY_RUN" = "true" ]; then
+              echo "**DRY RUN** — npm publish, git tag, push, and GitHub Release were skipped."
+            else
+              echo "- Published \`get-shit-done-cc@${VERSION}\` to dist-tag \`${TAG}\`"
+              echo "- SDK bundled inside the CC tarball at:"
+              echo "  - \`sdk/dist/cli.js\` (loose tree, consumed by \`bin/gsd-sdk.js\` shim)"
+              echo "  - \`sdk-bundle/gsd-sdk.tgz\` (npm-installable artifact)"
+              echo "- Git tag \`v${VERSION}\` pushed"
+              echo "- GitHub Release \`v${VERSION}\` created"
+              if [ "$TAG" = "latest" ]; then
+                echo "- \`next\` dist-tag re-pointed at \`v${VERSION}\` (kept current with \`latest\`)"
+              fi
+              if [ "$ACTION" = "hotfix" ]; then
+                # Auto-cherry-pick hotfixes only pick commits already on
+                # main, so there's nothing to merge back. The merge-back
+                # PR step was removed in #2983; this line surfaces the
+                # explicit non-action so operators don't expect a PR
+                # that was never opened.
+                echo "- No merge-back PR (auto-picked commits are already on main)"
+              fi
+              echo "- Install: \`npm install -g get-shit-done-cc@${TAG}\`"
+            fi
+          } >> "$GITHUB_STEP_SUMMARY"
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -189,8 +189,11 @@ jobs:
          git add package.json package-lock.json sdk/package.json
          git commit -m "chore: bump to ${PRE_VERSION}"

-      - name: Build SDK
-        run: cd sdk && npm ci && npm run build
+      - name: Build SDK dist for tarball
+        run: npm run build:sdk
+
+      - name: Verify tarball ships sdk/dist/cli.js (bug #2647)
+        run: bash scripts/verify-tarball-sdk-dist.sh

      - name: Dry-run publish validation
        run: |
@@ -330,8 +333,11 @@ jobs:
          npm ci
          npm run test:coverage

-      - name: Build SDK
-        run: cd sdk && npm ci && npm run build
+      - name: Build SDK dist for tarball
+        run: npm run build:sdk
+
+      - name: Verify tarball ships sdk/dist/cli.js (bug #2647)
+        run: bash scripts/verify-tarball-sdk-dist.sh

      - name: Dry-run publish validation
        run: |
@@ -342,23 +348,32 @@ jobs:

      - name: Create PR to merge release back to main
        if: ${{ !inputs.dry_run }}
+        continue-on-error: true
        env:
          GH_TOKEN: ${{ github.token }}
          BRANCH: ${{ needs.validate-version.outputs.branch }}
          VERSION: ${{ inputs.version }}
        run: |
-          EXISTING_PR=$(gh pr list --base main --head "$BRANCH" --state open --json number --jq '.[0].number')
+          # Non-fatal: repos that disable "Allow GitHub Actions to create and
+          # approve pull requests" cause this step to fail with GraphQL 403.
+          # The release itself (tag + npm publish + GitHub Release) must still
+          # proceed. Open the merge-back PR manually afterwards with:
+          #   gh pr create --base main --head release/${VERSION} \
+          #     --title "chore: merge release v${VERSION} to main"
+          EXISTING_PR=$(gh pr list --base main --head "$BRANCH" --state open --json number --jq '.[0].number' 2>/dev/null || echo "")
          if [ -n "$EXISTING_PR" ]; then
            echo "PR #$EXISTING_PR already exists; updating"
            gh pr edit "$EXISTING_PR" \
              --title "chore: merge release v${VERSION} to main" \
-              --body "Merge release branch back to main after v${VERSION} stable release."
+              --body "Merge release branch back to main after v${VERSION} stable release." \
+              || echo "::warning::Could not update merge-back PR (likely PR-creation policy disabled). Open it manually after release."
          else
            gh pr create \
              --base main \
              --head "$BRANCH" \
              --title "chore: merge release v${VERSION} to main" \
-              --body "Merge release branch back to main after v${VERSION} stable release."
+              --body "Merge release branch back to main after v${VERSION} stable release." \
+              || echo "::warning::Could not create merge-back PR (likely PR-creation policy disabled). Open it manually after release."
          fi

      - name: Tag and push
--- a/.github/workflows/require-issue-link.yml
+++ b/.github/workflows/require-issue-link.yml
@@ -24,19 +24,20 @@ jobs:
            echo "found=false" >> "$GITHUB_OUTPUT"
          fi

-      - name: Comment and fail if no issue link
+      - name: Comment, close, and fail if no issue link
        if: steps.check.outputs.found == 'false'
        uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
        with:
          # Uses GitHub API SDK — no shell string interpolation of untrusted input
          script: |
            const repoUrl = `https://github.com/${context.repo.owner}/${context.repo.repo}`;
+            const prNumber = context.payload.pull_request.number;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
-              issue_number: context.payload.pull_request.number,
+              issue_number: prNumber,
              body: [
-                '## Missing issue link',
+                '## Missing issue link — PR auto-closed',
                '',
                'This PR does not reference an issue. **All PRs must link to an open issue** using a closing keyword in the PR body:',
                '',
@@ -46,7 +47,13 @@ jobs:
                '',
                `If no issue exists for this change, [open one first](${repoUrl}/issues/new/choose), then update this PR body with the reference.`,
                '',
-                'This PR will remain blocked until a valid `Closes #NNN`, `Fixes #NNN`, or `Resolves #NNN` line is present in the description.',
+                'To resume work after fixing the body: edit the PR description to add a valid `Closes #NNN`, `Fixes #NNN`, or `Resolves #NNN` line, then click **Reopen pull request**. The workflow will re-evaluate on reopen.',
              ].join('\n')
            });
-            core.setFailed('PR body must contain a closing issue reference (e.g. "Closes #123")');
+            await github.rest.pulls.update({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              pull_number: prNumber,
+              state: 'closed',
+            });
+            core.setFailed('PR body must contain a closing issue reference (e.g. "Closes #123") — PR closed.');
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -16,6 +16,21 @@ concurrency:
  cancel-in-progress: true

 jobs:
+  # Static lint: no source-grep tests in the test suite.
+  # Runs once (not per matrix node version) since it is a file-content check.
+  lint-tests:
+    runs-on: ubuntu-latest
+    timeout-minutes: 2
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+      - name: Set up Node.js
+        uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f  # v6.3.0
+        with:
+          node-version: 24
+      - name: Lint — no source-grep tests
+        shell: bash
+        run: node scripts/lint-no-source-grep.cjs
+
  test:
    runs-on: ${{ matrix.os }}
    timeout-minutes: 10
@@ -35,6 +50,31 @@ jobs:

    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+        with:
+          # Fetch full history so we can merge origin/main for stale-base detection.
+          fetch-depth: 0
+
+      # GitHub's `refs/pull/N/merge` is cached against the recorded merge-base.
+      # When main advances after a PR is opened, the cache stays stale and CI
+      # runs against the pre-advance state — hiding bugs that are already fixed
+      # on trunk and surfacing type errors that were introduced and then patched
+      # on main in between. Explicitly merge current origin/main here so tests
+      # always run against the latest trunk.
+      - name: Rebase check — merge origin/main into PR head
+        if: github.event_name == 'pull_request'
+        shell: bash
+        run: |
+          set -euo pipefail
+          git config user.email "ci@gsd-build"
+          git config user.name "CI Rebase Check"
+          git fetch origin main
+          if ! git merge --no-edit --no-ff origin/main; then
+            echo "::error::This PR cannot cleanly merge origin/main. Rebase your branch onto current main and push again."
+            echo "::error::Conflicting files:"
+            git diff --name-only --diff-filter=U
+            git merge --abort
+            exit 1
+          fi

      - name: Set up Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f  # v6.3.0
@@ -45,6 +85,21 @@ jobs:
      - name: Install dependencies
        run: npm ci

+      - name: Build SDK dist (required by installer)
+        run: npm run build:sdk
+
+      # Seam contract gate: keep manifest -> generated aliases -> registry/CJS adapters aligned.
+      # Run once per workflow on the primary Linux node to avoid redundant matrix cost.
+      - name: SDK seam coverage tests
+        if: matrix.os == 'ubuntu-latest' && matrix.node-version == 24
+        shell: bash
+        run: cd sdk && npx vitest run src/query/command-seam-coverage.test.ts
+
+      - name: SDK generated alias artifact drift check
+        if: matrix.os == 'ubuntu-latest' && matrix.node-version == 24
+        shell: bash
+        run: node sdk/scripts/check-command-aliases-fresh.mjs
+
      - name: Run tests with coverage
        shell: bash
        run: npm run test:coverage
--- a/.out-of-scope/agent-template-rendering.md
+++ b/.out-of-scope/agent-template-rendering.md
@@ -0,0 +1,104 @@
+# Render agent definitions from templates at install/config-change time
+
+**Source:** [#2758](https://github.com/gsd-build/get-shit-done/issues/2758)
+**Decision:** wontfix — closed on the technical merits
+**Date:** 2026-05-02
+
+## Proposal summary
+
+Move config-gated prose out of `agents/*.md` into `agents/templates/*.md.tmpl`,
+rendered at install time and after `.planning/config.json` writes via a new
+`gsd-sdk agents render` subcommand. Conditional branches resolve at render time
+(deterministic code) instead of at inference time (LLM interpretation).
+
+Three named benefits:
+
+1. Token reduction proportional to disabled features.
+2. Deterministic feature gating (impossible-by-construction vs. test-for).
+3. Single source of truth for contributor-facing gating.
+
+Cites PR #2279 (Codex/OpenCode model embedding at install time) as direct
+precedent for compile-time embedding.
+
+## Why GSD does not own this
+
+### 1. The determinism claim is theoretical, not observed
+
+The proposal's strongest argument is that config-gated branches in agent prose
+are a determinism failure surface. The actual patterns in the codebase today are
+already heavily mitigated:
+
+- The `use_worktrees` branch in `gsd-executor` is resolved deterministically via
+  `gsd-sdk query config-get` in bash — it is not LLM-interpreted.
+- "Skip if `workflow.X` is `false`" prose patterns are short, stable, and
+  follow a uniform "missing key = enabled" convention. There is no documented
+  history of LLMs running disabled checks or skipping enabled ones because of
+  this prose.
+
+A theoretical failure surface should not be traded for a real, high-risk
+patch-migration surface (`gsd-local-patches/` rebase logic, by the reporter's own
+admission "the highest-risk piece of the change"). The reporter was asked for
+documented evidence; none was provided.
+
+### 2. Token waste is small and bounded
+
+The codebase has roughly 5 `workflow.*` toggle references in agent files and
+~20 "Skip if" conditional-prose patterns total — most 1–2 sentences. The
+"real spend across multi-phase milestones" claim was not measured against
+`gsd-context-monitor` output despite being asked. Without a measured baseline,
+the token-savings argument is asserted rather than demonstrated, and the savings
+ceiling on ~20 short conditionals is small enough that it does not justify a new
+template-and-rendering subsystem with a CI-enforced template/generated split.
+
+### 3. The deterministic-gating need is already served
+
+PR #2279 established orchestrator-time config embedding for the cases that
+genuinely need deterministic resolution (model selection, reasoning effort,
+worktree mode). That mechanism is the right layer for orchestration-time
+decisions and can be extended toggle-by-toggle along the existing path without
+introducing a parallel templating subsystem. The proposal's own "Alternative #1"
+(continue the orchestrator-embedding pattern) was rejected on the grounds that
+agent-internal conditionals belong in the agent layer, but the asks behind the
+proposal — determinism, lower token cost — are equally satisfied by extending
+PR #2279 incrementally without a second mechanism.
+
+Adding a templating layer alongside orchestrator-embedding means two mechanisms
+own the same problem. The proposal does not specify a partition rule, and the
+reporter did not respond when asked for one.
+
+### 4. Patch-migration risk is disproportionate to benefit
+
+The `/gsd-reapply-patches` three-way-merge migration for `gsd-local-patches/`
+is, in the proposal's own words, the highest-risk piece of the change. It exists
+solely to absorb a contributor-workflow shift — the user-facing surface is
+unchanged. Risk that flows entirely from internal restructuring, where the
+benefit is unmeasured token savings and a theoretical determinism gain, is the
+wrong trade.
+
+The reduced-scope variant (Alternative #5: fresh installs only, defer the
+migration) avoids that specific risk but still ships a parallel mechanism for
+benefits that remain unmeasured and that PR #2279's path can absorb.
+
+## Re-open criteria
+
+This may be revisited if a contributor:
+
+- Provides measured token deltas via `gsd-context-monitor` against a
+  representative all-toggles-off config, and the delta is materially larger
+  than what extending PR #2279's orchestrator-embedding path one toggle at a
+  time would produce.
+- Documents a real LLM misinterpretation of an existing toggle conditional
+  (executor ignored `workflow.use_worktrees: false`, verifier ran when
+  `workflow.verifier: false`, etc.) — not a projected failure mode.
+- Proposes a clear partition rule between orchestrator-time embedding (PR #2279)
+  and any new install-time templating layer, so the two mechanisms do not
+  overlap.
+
+## Related
+
+- PR #2279 — Codex/OpenCode model embedding at install time (the established
+  precedent for deterministic compile-time embedding into agent files)
+- v1.37.0 release notes — shared-boilerplate extraction (reference files for
+  mandatory-initial-read, project-skills-discovery)
+- `get-shit-done/workflows/` — workflow-level config embedding before subagent
+  spawn (the path of least friction for incremental deterministic gating)
--- a/.out-of-scope/temporal-context.md
+++ b/.out-of-scope/temporal-context.md
@@ -0,0 +1,56 @@
+# Temporal context as a first-class GSD signal
+
+**Source:** [#2756](https://github.com/gsd-build/get-shit-done/issues/2756)
+**Decision:** wontfix — closed without further engagement
+**Date:** 2026-05-02
+
+## Proposal summary
+
+Reporter proposed treating idle-time-between-turns as a first-class context signal in
+GSD. Three flavors floated across the issue:
+
+1. **Passive** — block at session resume injecting "you've been idle Nh, here's what was
+   open" into the orchestrator prompt.
+2. **Active** — `/resume-context` slash command.
+3. **Retrospective** — `HANDOFF.json` written at session end, read at next start.
+
+Framed initially as a `claude-inject-idle-time` plugin, with a request that GSD treat
+the pattern as core.
+
+## Why GSD does not own this
+
+- **Subagent gap unsolved.** Passive injection lands in the orchestrator's context
+  only. Subagents (the workers that actually do GSD's planning, execution, verification)
+  spawn fresh and never see the temporal signal. The proposal does not solve this, and
+  any GSD-core integration would inherit the gap. Until the subagent boundary is
+  addressed, "first-class temporal context" is at best a partial feature.
+- **`HANDOFF.json` duplicates existing artifacts.** GSD already persists session
+  continuity through `.planning/state/*` and per-phase artifacts (PLAN.md, RESEARCH.md,
+  REVIEW.md, VERIFICATION.md). A separate handoff file would either drift from those or
+  redundantly mirror them. The right primitive for "what was I doing" already exists.
+- **Statusline / TUI re-entry is platform-level, not GSD-level.** A statusline showing
+  idle time belongs in Claude Code itself or in a thin user plugin, not in GSD's phase
+  machinery.
+- **Scope is unstable.** Reporter agreed with the narrowed minimum ask ("doc mention
+  only, rest opt-in"), then partially retracted it in a follow-up comment ("very
+  integral to myself"). The maintainer asked which version of the ask should move
+  forward; reporter did not respond.
+
+## Re-open criteria
+
+This may be revisited if a reporter:
+
+- Engages with the subagent-gap problem and proposes a concrete mechanism for
+  temporal context to reach subagents (not just the orchestrator).
+- Demonstrates a use case `.planning/state/*` provably cannot serve.
+- Commits to a single stable scope (doc mention OR core integration OR plugin
+  reference) rather than oscillating between them mid-thread.
+
+A drive-by enhancement request that the author does not return to engage with after
+maintainer questions is not actionable. Future proposers: please plan to participate
+through to a triage decision rather than dropping an issue and moving on.
+
+## Related
+
+- `.planning/state/` — existing session-continuity artifacts
+- `get-shit-done/references/` — where any future plugin-interface doc would live
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
--- a/CONTEXT.md
+++ b/CONTEXT.md
@@ -0,0 +1,41 @@
+# Context
+
+## Domain terms
+
+### Dispatch Policy Module
+Module owning dispatch error mapping, fallback policy, timeout classification, and CLI exit mapping contract.
+
+Canonical error kind set:
+- `unknown_command`
+- `native_failure`
+- `native_timeout`
+- `fallback_failure`
+- `validation_error`
+- `internal_error`
+
+### Command Definition Module
+Canonical command metadata Interface powering alias, catalog, and semantics generation.
+
+### Query Runtime Context Module
+Module owning query-time context resolution for `projectDir` and `ws`, including precedence and validation policy used by query adapters.
+
+### Native Dispatch Adapter Module
+Adapter Module that satisfies native query dispatch at the Dispatch Policy seam, so policy modules consume a focused dispatch Interface instead of closure-wired call sites.
+
+### Query CLI Output Module
+Module owning projection from dispatch results/errors to CLI `{ exitCode, stdoutChunks, stderrLines }` output contract.
+
+### Query Execution Policy Module
+Module owning query transport routing policy projection (`preferNative`, fallback policy, workstream subprocess forcing) at execution seam.
+
+### Query Subprocess Adapter Module
+Adapter Module owning subprocess execution contract for query commands (JSON/raw invocation, `@file:` indirection parsing, timeout/exit error projection).
+
+### Query Command Resolution Module
+Canonical command normalization and resolution Interface (`query-command-resolution-strategy`) used by internal query/transport paths after dead-wrapper convergence.
+
+### Command Topology Module
+Module owning command resolution, policy projection (`mutation`, `output_mode`), unknown-command diagnosis, and handler Adapter binding at one seam for query dispatch.
+
+### Query Pre-Project Config Policy Module
+Module policy that defines query-time behavior when `.planning/config.json` is absent: use built-in defaults for parity-sensitive query Interfaces, and emit parity-aligned empty model ids for pre-project model resolution surfaces.
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -81,6 +81,20 @@ PRs that arrive without a properly-labeled linked issue are closed automatically

 ## Pull Request Guidelines

+### Architecture & Domain Standards (Maintainer-Defined)
+
+The following files are maintainer-owned coding standards and must be treated as canonical when contributing:
+
+- `CONTEXT.md` — domain language and module naming standards
+- `docs/adr/` — Architecture Decision Records (ADRs) for accepted architectural decisions
+
+Contributor requirements:
+- Read `CONTEXT.md` before naming or refactoring modules/interfaces/seams.
+- Use `CONTEXT.md` vocabulary consistently in code comments, tests, issue/PR text, and docs for the touched area.
+- Check relevant ADRs in `docs/adr/` before proposing or implementing architectural changes.
+- If a change intentionally revisits an ADR decision, call it out explicitly in the linked issue and PR rationale.
+- Do not rewrite maintainer intent in `CONTEXT.md`/ADRs as part of drive-by cleanup; propose focused updates tied to approved scope.
+
 **Every PR must link to an approved issue.** PRs without a linked issue are closed without review, no exceptions.

 - **No draft PRs** — draft PRs are automatically closed. Only open a PR when it is complete, tested, and ready for review. If your work is not finished, keep it on your local branch until it is.
@@ -91,6 +105,23 @@ PRs that arrive without a properly-labeled linked issue are closed automatically
 - **CI must pass** — all matrix jobs (Ubuntu × Node 22, 24; macOS × Node 24) must be green
 - **Scope matches the approved issue** — if your PR does more than what the issue describes, the extra changes will be asked to be removed or moved to a new issue

+## CHANGELOG Entries — Drop a Fragment
+
+**Do not edit `CHANGELOG.md` directly.** Two PRs that both append to a `### Fixed` block always conflict on merge — git can't pick a serialization order without a human. Instead, every PR with user-facing changes drops a fragment file in `.changeset/`.
+
+```bash
+npm run changeset -- --type Fixed --pr <YOUR_PR_NUMBER> \
+  --body "**\`/gsd-foo\` no longer drops trailing slashes** — explain the user-visible change."
+```
+
+This writes `.changeset/<adjective>-<noun>-<noun>.md`. Three random words → concurrent PRs never collide. Allowed `type:` values follow [Keep a Changelog](https://keepachangelog.com/): `Added`, `Changed`, `Deprecated`, `Removed`, `Fixed`, `Security`.
+
+Fragments are consolidated into `CHANGELOG.md` at release time by the release workflow. See [`.changeset/README.md`](.changeset/README.md) for the format spec and [#2975](https://github.com/gsd-build/get-shit-done/issues/2975) for the rationale.
+
+**CI enforcement:** the `Changeset Required` workflow (`scripts/changeset/lint.cjs`) fails any PR that touches `bin/`, `get-shit-done/`, `agents/`, `commands/`, `hooks/`, or `sdk/src/` without a `.changeset/*.md` fragment.
+
+**Opt-out:** PRs with no user-facing impact (test refactors, lint config changes, CI tweaks, formatting-only changes) can add the `no-changelog` label. The lint honors it. When unsure whether a change is user-facing, **add the fragment**.
+
 ## Testing Standards

 All tests use Node.js built-in test runner (`node:test`) and assertion library (`node:assert`). **Do not use Jest, Mocha, Chai, or any external test framework.**
@@ -229,6 +260,136 @@ const content = `
 `;
 ```

+### Prohibited: Source-Grep Tests
+
+**Never read source-code `.cjs` files with `readFileSync` to assert that strings exist within them.** This is source-grep theater: it proves a literal is present in a file, not that the feature works at runtime.
+
+```javascript
+// BAD — source-grep theater
+const configSrc = fs.readFileSync(
+  path.join(GSD_ROOT, 'bin', 'lib', 'config-schema.cjs'), 'utf-8'
+);
+assert.ok(
+  configSrc.includes("'workflow.plan_bounce'"),
+  'VALID_CONFIG_KEYS should contain workflow.plan_bounce'
+);
+```
+
+This test passes even if `workflow.plan_bounce` is present but misspelled in the schema, removed from the validation path, or moved to a different file under a different name. It survives every behavioral regression and fails only on trivial renames.
+
+The correct pattern for config key tests — use the CLI:
+
+```javascript
+// GOOD — behavioral test via the CLI
+test('config-set accepts workflow.plan_bounce', (t) => {
+  const tmpDir = createTempProject();
+  t.after(() => cleanup(tmpDir));
+
+  const result = runGsdTools('config-set workflow.plan_bounce true', tmpDir);
+  assert.ok(result.success, `config-set should accept workflow.plan_bounce: ${result.error}`);
+
+  const configPath = path.join(tmpDir, '.planning', 'config.json');
+  const config = JSON.parse(fs.readFileSync(configPath, 'utf-8'));
+  assert.strictEqual(config.workflow?.plan_bounce, true, 'value must be persisted');
+});
+```
+
+This single test covers key registration in `VALID_CONFIG_KEYS`, the key's namespace resolution in `KNOWN_TOP_LEVEL`, and value persistence — all behaviors that the source-grep test could not touch.
+
+**Why this pattern broke at scale:** Commit `990c3e64` in this repo updated 5 source-grep tests in one pass when `VALID_CONFIG_KEYS` moved between files. Zero of those tests were testing behavior. If they had been behavioral tests, the migration would have been invisible.
+
+**CI enforcement:** A linter (`scripts/lint-no-source-grep.cjs`, run as `npm run lint:tests`) detects violations. Any test file that calls `readFileSync` on a `.cjs` path in a source directory without the exemption annotation below will fail the `lint-tests` CI job.
+
+### Exception: `allow-test-rule: <reason>`
+
+Some tests legitimately read source files. There are six recognized categories:
+
+| Reason | When to use |
+|--------|-------------|
+| `source-text-is-the-product` | Agent `.md`, workflow `.md`, command `.md` files — their text IS what the runtime loads. Testing text content tests the deployed contract. |
+| `architectural-invariant` | Implementation must use a specific primitive (e.g., `Atomics.wait`, atomic file writes) that cannot be tested by observing outputs. |
+| `structural-regression-guard` | A specific code pattern must (or must not) exist to prevent a class of bug (e.g., regex global-state misuse). Behavioral tests cannot distinguish which pattern was used. |
+| `docs-parity` | A reference doc must stay in sync with source-defined constants (e.g., `CONFIG_DEFAULTS`). The source is the canonical list; there is no runtime API to enumerate it. |
+| `integration-test-input` | A source file is used as a real fixture input to a transformation function under test — the file is not inspected for strings but passed as data. |
+| `structural-implementation-guard` | A feature's interception or wiring point is not reachable end-to-end via `runGsdTools`. Used temporarily until a behavioral path exists. |
+| `pending-migration-to-typed-ir` | **Tracked for correction, not exempted.** Test was identified by the lint as carrying a raw-text-matching pattern that contradicts the rule above. Each annotated file MUST cite the open migration issue (e.g. `// allow-test-rule: pending-migration-to-typed-ir [#NNNN]`) so the tracking is auditable. New tests cannot use this category — they must refactor production to expose typed IR. The annotation is removed when the test is corrected. |
+
+Annotate with a standalone `//` comment before the file's opening block comment:
+
+```javascript
+// allow-test-rule: architectural-invariant
+// state.cjs locking must use Atomics.wait(), not a spin-loop. Behavioral tests
+// cannot observe which sleep primitive was chosen — only source inspection can.
+
+/**
+ * Regression tests for locking bugs #1909...
+ */
+```
+
+The annotation **must** be a standalone `// allow-test-rule:` line, not inside a `/** */` block comment — the CI linter scans for the pattern `// allow-test-rule:`.
+
+### Prohibited: Raw Text Matching on Test Outputs (file content, stdout, stderr)
+
+**Source-grep is not just `readFileSync` of a `.cjs` file.** The same anti-pattern shows up wherever a test pattern-matches against text that a system-under-test produced, regardless of whether that text came from a source file, a rendered shim, a child process's stdout, or a free-form `reason` string. **All forms are forbidden.**
+
+The following are all violations of the same rule:
+
+```javascript
+// BAD — substring match on text written by the code under test
+const cmdContent = fs.readFileSync(path.join(tmpDir, 'gsd-sdk.cmd'), 'utf8');
+assert.ok(cmdContent.includes(`@node ${jsonQuoted} %*`), '.cmd embeds shim path');
+
+// BAD — regex match on a child process's human-readable stdout formatter
+const r = cp.spawnSync(SCRIPT, ['--patches-dir', dir]);
+assert.match(r.stdout, /Failures: 1/);
+assert.match(r.stdout, /not a regular file/);
+
+// BAD — "structured parser" that hides string ops behind a function wrapper
+function parseCmdShim(content) {
+  const lines = content.split('\r\n').filter((l) => l.length > 0);
+  return { header: lines[0], usesCRLF: content.includes('\r\n') };
+}
+
+// BAD — assert.match on a free-form `reason` string from a JSON report
+assert.ok(/not a regular file/.test(report.results[0].reason));
+```
+
+Each of these passes on accidental near-matches (a comment containing `@node` somewhere, a stack trace that happens to say `Failures: 1`, a mis-typed reason that still contains the substring you're matching) and fails on harmless reformatting (changing `Failures: 1` to `1 failure`, swapping CRLF rendering style, rewording the error prose).
+
+#### The rule
+
+> **Tests assert on typed structured values. If the code under test produces text, the code under test must also expose a structured intermediate representation, and the test must assert on that IR — never on the rendered text.**
+
+Concretely: for any system-under-test that produces text output (a file renderer, a CLI formatter, an error-message builder), the production code MUST expose a typed alternative that the test consumes:
+
+| Output kind | Required structured surface | What the test asserts on |
+|---|---|---|
+| Rendered file (shim, template, generated code) | A pure builder function returning the IR (`{ invocation, eol, fileNames, render }`) | `triple.invocation.target === expected`, `triple.eol.cmd === '\r\n'` |
+| CLI human-formatter output | A `--json` mode that emits the same data structurally | `report.results[0].reason === REASON.FAIL_INSTALLED_NOT_REGULAR_FILE` |
+| Error / status / reason | A frozen enum (`Object.freeze({ FAIL_X: 'fail_x', ... })`) | `assert.equal(result.reason, REASON.FAIL_X)` |
+| File presence after a write | `fs.statSync().isFile()`, `.size > 0`, `.mtimeMs` advances | Filesystem facts; never read the file content back |
+
+#### Concrete examples from this repo
+
+`buildWindowsShimTriple(shimSrc)` in `bin/install.js` is the canonical IR pattern: pure function, no I/O, returns `{ invocation, eol, fileNames, render }`. `trySelfLinkGsdSdkWindows` calls it and writes `triple.render[kind]()` to disk. Tests assert on `triple.invocation.target`, `triple.eol.cmd`, `Object.keys(triple).sort()` — never on the rendered text. Filesystem-level tests assert `fs.statSync(target).size === Buffer.byteLength(triple.render.cmd())` to prove the writer writes what the renderer produces, **without comparing content**.
+
+`scripts/verify-reapply-patches.cjs` exposes a frozen `REASON` enum and emits it through `--json`. Tests assert `report.results[0].reason === REASON.FAIL_USER_LINES_MISSING`. The human formatter exists for operator console output only — tests must not depend on its prose. Adding a new reason code requires updating the `REASON` enum, the `--json` output, AND the test that locks `Object.keys(REASON).sort()` — three coordinated changes that prevent the code surface from drifting from the test surface.
+
+#### Hiding grep behind a function is still grep
+
+`parseCmdShim`, `parsePs1Invocation`, etc. that internally do `content.split(...)`, `lines[1].trim()`, `content.includes(...)` are still string manipulation. The fact that the entry point looks like a parser doesn't change what's happening underneath — the test is still asserting on the lexical shape of rendered text. The fix is not "wrap the grep in a function with a typed-looking return value." The fix is to **eliminate the rendered text from the test path entirely** by surfacing the IR.
+
+#### When you cannot eliminate text matching
+
+There are exactly two cases where text content is the legitimate object of a test, both already covered by the existing exemption matrix:
+
+1. `source-text-is-the-product` — workflow `.md` / agent `.md` / command `.md` files where the deployed text IS what the runtime loads.
+2. `docs-parity` — a reference doc must mirror source-defined constants and there is no runtime enumeration API.
+
+For everything else, if a test reaches for `.includes()` / `.startsWith()` / `assert.match(text, /…/)`, the production code is missing a typed surface. **Add the typed surface; do not work around it.**
+
+**CI enforcement:** `scripts/lint-no-source-grep.cjs` is being extended (see issue tracker for the latest scope) to flag `String#includes`/`String#startsWith`/`String#endsWith`/`assert.match` on `readFileSync` results and on `cp.spawnSync` stdout/stderr in test files, with the same `// allow-test-rule:` exemption mechanism.
+
 ### Node.js Version Compatibility

 **Node 22 is the minimum supported version.** Node 24 is the primary CI target. All tests must pass on both.
@@ -278,8 +439,93 @@ node --test tests/core.test.cjs
 npm run test:coverage
 ```

+### Pre-PR Seam Checks (Manifest/Alias Routing)
+
+If you touched any of the command-manifest or generated alias files, run:
+
+```bash
+npm run check:alias-drift
+```
+
+This verifies generated alias artifacts are in sync with manifest source-of-truth.
+
+Optional local pre-commit hook entry (Git-native):
+
+```bash
+# one-time setup
+mkdir -p .githooks
+cat > .githooks/pre-commit <<'EOF'
+#!/usr/bin/env bash
+set -euo pipefail
+
+if git diff --cached --name-only | grep -Eq "^sdk/src/query/command-manifest\.|^sdk/src/query/command-aliases\.generated\.ts$|^get-shit-done/bin/lib/command-aliases\.generated\.cjs$|^sdk/scripts/gen-command-aliases\.ts$"; then
+  npm run check:alias-drift
+fi
+EOF
+chmod +x .githooks/pre-commit
+git config core.hooksPath .githooks
+```
+
+Optional local pre-push hook to block a private author-email pattern:
+
+```bash
+# set locally in your shell profile (example)
+export GSD_BLOCKED_AUTHOR_REGEX='@example-corp\\.com$'
+
+cat > .githooks/pre-push <<'EOF'
+#!/usr/bin/env bash
+set -euo pipefail
+
+zero_sha='0000000000000000000000000000000000000000'
+blocked_regex="${GSD_BLOCKED_AUTHOR_REGEX:-}"
+[[ -z "$blocked_regex" ]] && exit 0
+violations=()
+
+while read -r local_ref local_sha remote_ref remote_sha; do
+  [[ "$local_sha" == "$zero_sha" ]] && continue
+  if [[ "$remote_sha" == "$zero_sha" ]]; then
+    commits=$(git rev-list "$local_sha" --not --remotes)
+  else
+    commits=$(git rev-list "$remote_sha..$local_sha")
+  fi
+  while read -r commit; do
+    [[ -z "$commit" ]] && continue
+    email=$(git show -s --format='%ae' "$commit" | tr '[:upper:]' '[:lower:]')
+    if printf '%s' "$email" | grep -Eq "$blocked_regex"; then
+      violations+=("$commit <$email>")
+    fi
+  done <<< "$commits"
+done
+
+if [[ ${#violations[@]} -gt 0 ]]; then
+  echo "Push blocked: commit author email matched local blocked regex ($blocked_regex)." >&2
+  printf '  - %s\n' "${violations[@]}" >&2
+  exit 1
+fi
+EOF
+chmod +x .githooks/pre-push
+```
+
+### CI Test Quality Checks
+
+The following checks run on every PR in addition to the test suite:
+
+| Job | What it checks | How to pass |
+|-----|----------------|-------------|
+| `lint-tests` | No source-grep tests (see above) | Replace with `runGsdTools()` behavioral tests, or add `// allow-test-rule: <reason>` |
+
+Run locally before pushing: `npm run lint:tests`
+
 ### Test Requirements by Contribution Type

+### Architecture-Aware Testing Requirements
+
+When work touches architecture, routing, policy, registry assembly, or command semantics:
+- Write tests against module **interfaces** and seam behavior, not implementation trivia.
+- Prefer invariant/contract tests that protect ADR-backed behavior and `CONTEXT.md` terminology.
+- Ensure tests validate canonical behavior through the defined seam (for example: structured result contracts, canonical command metadata, and adapter parity), not source-text coupling.
+- If ADRs define expected behavior, tests should assert those expectations directly.
+
 The required tests differ depending on what you are contributing:

 **Bug Fix:** A regression test is required. Write the test first — it must demonstrate the original failure before your fix is applied, then pass after the fix. A PR that fixes a bug without a regression test will be asked to add one. "Tests pass" does not prove correctness; it proves the bug isn't present in the tests that exist.
@@ -314,6 +560,15 @@ bin/install.js          — Installer (multi-runtime)
 get-shit-done/
  bin/lib/              — Core library modules (.cjs)
  workflows/            — Workflow definitions (.md)
+                          Large workflows split per progressive-disclosure
+                          pattern: workflows/<name>/modes/*.md +
+                          workflows/<name>/templates/*. Parent dispatches
+                          to mode files. See workflows/discuss-phase/ as
+                          the canonical example (#2551). New modes for
+                          discuss-phase land in
+                          workflows/discuss-phase/modes/<mode>.md.
+                          Per-file budgets enforced by
+                          tests/workflow-size-budget.test.cjs.
  references/           — Reference documentation (.md)
  templates/            — File templates
 agents/                 — Agent definitions (.md) — CANONICAL SOURCE
--- a/README.ja-JP.md
+++ b/README.ja-JP.md
@@ -75,15 +75,17 @@ GSDはそれを解決します。Claude Codeを信頼性の高いものにする

 ビルトインの品質ゲートが本当の問題を検出します：スキーマドリフト検出はマイグレーション漏れのORM変更をフラグし、セキュリティ強制は検証を脅威モデルに紐付け、スコープ削減検出はプランナーが要件を暗黙的に落とすのを防止します。

-### v1.32.0 ハイライト
+### v1.39.0 ハイライト

- **STATE.md整合性ゲート** — `state validate`がSTATE.mdとファイルシステムの差分を検出、`state sync`が実際のプロジェクト状態から再構築
- **`--to N`フラグ** — 自律実行を特定のフェーズ完了後に停止
- **リサーチゲート** — RESEARCH.mdに未解決の質問がある場合、計画をブロック
- **検証マイルストーンスコープフィルタリング** — 後のフェーズで対処されるギャップは「ギャップ」ではなく「延期」としてマーク
- **読み取り後編集ガード** — 非Claudeランタイムでの無限リトライループを防止するアドバイザリーフック
- **コンテキスト削減** — Markdownのトランケーションとキャッシュフレンドリーなプロンプト順序でトークン使用量を削減
- **4つの新ランタイム** — Trae、Kilo、Augment、Cline（合計12ランタイム）
+完全なリストは [v1.39.0 リリースノート](https://github.com/gsd-build/get-shit-done/releases/tag/v1.39.0) を参照してください。
+
+- **`--minimal` インストールプロファイル** — エイリアス `--core-only`。メインループの6スキル（`new-project`、`discuss-phase`、`plan-phase`、`execute-phase`、`help`、`update`）のみをインストールし、`gsd-*` サブエージェントはゼロ。コールドスタート時のシステムプロンプトのオーバーヘッドを ~12kトークンから ~700トークンへ削減（≥94%減）。32K〜128Kコンテキストのローカル LLM やトークン課金 API に有効。
+- **`/gsd-edit-phase`** — `ROADMAP.md` 上の既存フェーズの任意フィールドをその場で編集（番号や位置は変更されない）。`--force` で確認 diff をスキップ、`depends_on` の参照を検証し、書き込み時に `STATE.md` も更新。
+- **マージ後ビルド & テストゲート** — `execute-phase` のステップ 5.6 が `workflow.build_command` の設定を自動検出し、無ければ Xcode（`.xcodeproj`）、Makefile、Justfile、Cargo、Go、Python、npm の順にフォールバック。Xcode/iOS プロジェクトでは `xcodebuild build` と `xcodebuild test` を自動実行。並列・直列両モードで動作。
+- **ランタイム別レビューモデル選択** — `review.models.<cli>` で各外部レビュー CLI（codex、gemini など）が使うモデルをプランナー/実行プロファイルとは独立に指定可能。
+- **ワークストリーム設定の継承** — `GSD_WORKSTREAM` が設定されている場合、ルートの `.planning/config.json` を先に読み込み、ワークストリーム設定をディープマージ（衝突時はワークストリーム側が優先）。ワークストリーム設定で明示的に `null` を指定するとルート値を上書き可能。
+- **手動カナリアリリースワークフロー** — `.github/workflows/canary.yml` が `workflow_dispatch` 経由で `dev` ブランチから `{base}-canary.{N}` ビルドを `@canary` dist-tag に手動公開（`get-shit-done-cc` と `@gsd-build/sdk`）。
+- **スキルの統合：86 → 59** — 4つの新しいグループ化スキル（`capture`、`phase`、`config`、`workspace`）が31のマイクロスキルを吸収。既存の親スキル6つはラップアップやサブ操作をフラグ化：`update --sync/--reapply`、`sketch --wrap-up`、`spike --wrap-up`、`map-codebase --fast/--query`、`code-review --fix`、`progress --do/--next`。機能の欠損なし。

 ---

@@ -597,6 +599,7 @@ lmn012o feat(08-02): create registration endpoint
 |---------|--------------|
 | `/gsd-add-phase` | ロードマップにフェーズを追加 |
 | `/gsd-insert-phase [N]` | フェーズ間に緊急作業を挿入 |
+| `/gsd-edit-phase [N] [--force]` | 既存フェーズの任意フィールドをその場で編集 — 番号と位置は変更されない |
 | `/gsd-remove-phase [N]` | 将来のフェーズを削除し番号を振り直し |
 | `/gsd-list-phase-assumptions [N]` | 計画前にClaudeの意図するアプローチを確認 |
 | `/gsd-plan-milestone-gaps` | 監査で見つかったギャップを埋めるフェーズを作成 |
--- a/README.ko-KR.md
+++ b/README.ko-KR.md
@@ -75,15 +75,17 @@ GSD가 그걸 고칩니다. Claude Code를 신뢰할 수 있게 만드는 컨텍

 내장 품질 게이트가 실제 문제를 잡아냅니다: 스키마 드리프트 감지는 마이그레이션 누락된 ORM 변경을 플래그하고, 보안 강제는 검증을 위협 모델에 고정시키고, 스코프 축소 감지는 플래너가 요구사항을 몰래 빠뜨리는 걸 방지합니다.

-### v1.32.0 하이라이트
+### v1.39.0 하이라이트

- **STATE.md 일관성 게이트** — `state validate`가 STATE.md와 파일시스템 간 드리프트를 감지, `state sync`가 실제 프로젝트 상태에서 재구성
- **`--to N` 플래그** — 자율 실행을 특정 단계 완료 후 중지
- **리서치 게이트** — RESEARCH.md에 미해결 질문이 있으면 기획을 차단
- **검증 마일스톤 스코프 필터링** — 이후 단계에서 처리될 격차는 "격차"가 아닌 "지연됨"으로 표시
- **읽기-후-편집 가드** — 비Claude 런타임에서 무한 재시도 루프를 방지하는 어드바이저리 훅
- **컨텍스트 축소** — 마크다운 잘라내기 및 캐시 친화적 프롬프트 순서로 토큰 사용량 절감
- **4개의 새 런타임** — Trae, Kilo, Augment, Cline (총 12개 런타임)
+전체 목록은 [v1.39.0 릴리스 노트](https://github.com/gsd-build/get-shit-done/releases/tag/v1.39.0)를 참고하세요.
+
+- **`--minimal` 설치 프로파일** — 별칭 `--core-only`. 메인 루프 6개 스킬(`new-project`, `discuss-phase`, `plan-phase`, `execute-phase`, `help`, `update`)만 설치하고 `gsd-*` 서브에이전트는 설치하지 않음. 콜드 스타트 시스템 프롬프트 오버헤드를 ~12k 토큰에서 ~700 토큰으로 축소(≥94% 감소). 32K–128K 컨텍스트의 로컬 LLM이나 토큰 과금 API에 유용.
+- **`/gsd-edit-phase`** — `ROADMAP.md`에 있는 기존 단계의 임의 필드를 그 자리에서 수정(번호와 위치는 변경되지 않음). `--force`는 확인 diff를 건너뛰고, `depends_on` 참조를 검증하며 쓰기 시 `STATE.md`도 갱신.
+- **머지 후 빌드 & 테스트 게이트** — `execute-phase` 5.6 단계가 `workflow.build_command` 설정을 우선 자동 감지하고, 없으면 Xcode(`.xcodeproj`), Makefile, Justfile, Cargo, Go, Python, npm 순으로 폴백. Xcode/iOS 프로젝트는 `xcodebuild build` 및 `xcodebuild test`를 자동 실행. 병렬·직렬 모드 모두에서 동작.
+- **런타임별 리뷰 모델 선택** — `review.models.<cli>`로 각 외부 리뷰 CLI(codex, gemini 등)가 플래너/실행 프로파일과 독립적으로 자체 모델을 선택할 수 있음.
+- **워크스트림 설정 상속** — `GSD_WORKSTREAM`이 설정되면 루트 `.planning/config.json`을 먼저 로드한 뒤 워크스트림 설정을 딥 머지(충돌 시 워크스트림 우선). 워크스트림 설정에서 명시적 `null`은 루트 값을 덮어씀.
+- **수동 카나리 릴리스 워크플로** — `.github/workflows/canary.yml`이 `workflow_dispatch`로 `dev` 브랜치에서 `{base}-canary.{N}` 빌드를 `@canary` dist-tag로 수동 게시(`get-shit-done-cc`와 `@gsd-build/sdk`).
+- **스킬 통합: 86 → 59** — 4개의 새로운 그룹 스킬(`capture`, `phase`, `config`, `workspace`)이 31개의 마이크로 스킬을 흡수. 기존 6개의 부모 스킬은 래퍼업/하위 동작을 플래그로 흡수: `update --sync/--reapply`, `sketch --wrap-up`, `spike --wrap-up`, `map-codebase --fast/--query`, `code-review --fix`, `progress --do/--next`. 기능 손실 없음.

 ---

@@ -594,6 +596,7 @@ lmn012o feat(08-02): create registration endpoint
 |---------|------------|
 | `/gsd-add-phase` | 로드맵에 단계 추가 |
 | `/gsd-insert-phase [N]` | 단계 사이에 긴급 작업 삽입 |
+| `/gsd-edit-phase [N] [--force]` | 기존 단계의 임의 필드를 그 자리에서 수정 — 번호와 위치는 그대로 |
 | `/gsd-remove-phase [N]` | 미래 단계 제거, 번호 재정렬 |
 | `/gsd-list-phase-assumptions [N]` | 기획 전 Claude의 의도된 접근 방식 확인 |
 | `/gsd-plan-milestone-gaps` | 감사에서 발견된 갭을 해소하기 위한 단계 생성 |
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@

 **English** · [Português](README.pt-BR.md) · [简体中文](README.zh-CN.md) · [日本語](README.ja-JP.md) · [한국어](README.ko-KR.md)

-**A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code, OpenCode, Gemini CLI, Kilo, Codex, Copilot, Cursor, Windsurf, Antigravity, Augment, Trae, Qwen Code, Cline, and CodeBuddy.**
+**A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code, OpenCode, Gemini CLI, Kilo, Codex, Copilot, Cursor, Windsurf, Antigravity, Augment, Trae, Qwen Code, Hermes Agent, Cline, and CodeBuddy.**

 **Solves context rot — the quality degradation that happens as Claude fills its context window.**

@@ -41,7 +41,7 @@ npx get-shit-done-cc@latest

 **Trusted by engineers at Amazon, Google, Shopify, and Webflow.**

-[Why I Built This](#why-i-built-this) · [How It Works](#how-it-works) · [Commands](#commands) · [Why It Works](#why-it-works) · [User Guide](docs/USER-GUIDE.md)
+[Why I Built This](#why-i-built-this) · [How It Works](#how-it-works) · [Commands](#commands) · [Why It Works](#why-it-works) · [User Guide](docs/USER-GUIDE.md) · [Walkthrough](docs/USER-GUIDE.md#end-to-end-walkthrough)

 </div>

@@ -89,11 +89,17 @@ People who want to describe what they want and have it built correctly — witho

 Built-in quality gates catch real problems: schema drift detection flags ORM changes missing migrations, security enforcement anchors verification to threat models, and scope reduction detection prevents the planner from silently dropping your requirements.

-### v1.37.0 Highlights
+### v1.39.0 Highlights

- **Spiking & sketching** — `/gsd-spike` runs 2–5 focused experiments with Given/When/Then verdicts; `/gsd-sketch` produces 2–3 interactive HTML mockup variants per design question — both store artifacts in `.planning/` and pair with wrap-up commands to package findings into project-local skills
- **Agent size-budget enforcement** — Tiered line-count limits (XL: 1 600, Large: 1 000, Default: 500) keep agent prompts lean; violations surface in CI
- **Shared boilerplate extraction** — Mandatory-initial-read and project-skills-discovery logic extracted to reference files, reducing duplication across a dozen agents
+See the [v1.39.0 release notes](https://github.com/gsd-build/get-shit-done/releases/tag/v1.39.0) for the full list.
+
+- **`--minimal` install profile** — alias `--core-only`, writes only the six main-loop skills (`new-project`, `discuss-phase`, `plan-phase`, `execute-phase`, `help`, `update`) and zero `gsd-*` subagents. Cuts cold-start system-prompt overhead from ~12k tokens to ~700 (≥94% reduction). Useful for local LLMs with 32K–128K context and token-billed APIs.
+- **`/gsd-edit-phase`** — modify any field of an existing phase in `ROADMAP.md` in place, without changing its number or position. `--force` skips the confirmation diff; `depends_on` references are validated and `STATE.md` is updated on write.
+- **Post-merge build & test gate** — `execute-phase` step 5.6 now auto-detects the build command from `workflow.build_command`, then falls back to Xcode (`.xcodeproj`), Makefile, Justfile, Cargo, Go, Python, or npm. Xcode/iOS projects get `xcodebuild build` + `xcodebuild test` automatically. Runs in both parallel and serial mode.
+- **Per-runtime review-model selection** — `review.models.<cli>` lets each external review CLI (codex, gemini, etc.) pick its own model independently of the planner/executor profile.
+- **Workstream config inheritance** — when `GSD_WORKSTREAM` is set, the root `.planning/config.json` is loaded first and deep-merged with the workstream config (workstream wins on conflict). Explicit `null` in a workstream config now correctly overrides a root value.
+- **Manual canary release workflow** — `.github/workflows/canary.yml` publishes `{base}-canary.{N}` builds of `get-shit-done-cc` and `@gsd-build/sdk` to the `@canary` dist-tag from `dev` on demand via `workflow_dispatch`.
+- **Skill consolidation: 86 → 59** — four new grouped skills (`capture`, `phase`, `config`, `workspace`) absorb 31 micro-skills. Six existing parents absorb wrap-up and sub-operations as flags: `update --sync/--reapply`, `sketch --wrap-up`, `spike --wrap-up`, `map-codebase --fast/--query`, `code-review --fix`, `progress --do/--next`. Zero functional loss.

 ---

@@ -104,11 +110,11 @@ npx get-shit-done-cc@latest
 ```

 The installer prompts you to choose:
-1. **Runtime** — Claude Code, OpenCode, Gemini, Kilo, Codex, Copilot, Cursor, Windsurf, Antigravity, Augment, Trae, Qwen Code, CodeBuddy, Cline, or all (interactive multi-select — pick multiple runtimes in a single install session)
+1. **Runtime** — Claude Code, OpenCode, Gemini, Kilo, Codex, Copilot, Cursor, Windsurf, Antigravity, Augment, Trae, Qwen Code, Hermes Agent, CodeBuddy, Cline, or all (interactive multi-select — pick multiple runtimes in a single install session)
 2. **Location** — Global (all projects) or local (current project only)

 Verify with:
- Claude Code / Gemini / Copilot / Antigravity / Qwen Code: `/gsd-help`
+- Claude Code / Gemini / Copilot / Antigravity / Qwen Code / Hermes Agent: `/gsd-help`
 - OpenCode / Kilo / Augment / Trae / CodeBuddy: `/gsd-help`
 - Codex: `$gsd-help`
 - Cline: GSD installs via `.clinerules` — verify by checking `.clinerules` exists
@@ -179,6 +185,10 @@ npx get-shit-done-cc --trae --local         # Install to ./.trae/
 npx get-shit-done-cc --qwen --global        # Install to ~/.qwen/
 npx get-shit-done-cc --qwen --local         # Install to ./.qwen/

+# Hermes Agent
+npx get-shit-done-cc --hermes --global      # Install to ~/.hermes/ (honors $HERMES_HOME)
+npx get-shit-done-cc --hermes --local       # Install to ./.hermes/
+
 # CodeBuddy
 npx get-shit-done-cc --codebuddy --global   # Install to ~/.codebuddy/
 npx get-shit-done-cc --codebuddy --local    # Install to ./.codebuddy/
@@ -192,11 +202,62 @@ npx get-shit-done-cc --all --global      # Install to all directories
 ```

 Use `--global` (`-g`) or `--local` (`-l`) to skip the location prompt.
-Use `--claude`, `--opencode`, `--gemini`, `--kilo`, `--codex`, `--copilot`, `--cursor`, `--windsurf`, `--antigravity`, `--augment`, `--trae`, `--qwen`, `--codebuddy`, `--cline`, or `--all` to skip the runtime prompt.
+Use `--claude`, `--opencode`, `--gemini`, `--kilo`, `--codex`, `--copilot`, `--cursor`, `--windsurf`, `--antigravity`, `--augment`, `--trae`, `--qwen`, `--hermes`, `--codebuddy`, `--cline`, or `--all` to skip the runtime prompt.
 The GSD SDK CLI (`gsd-sdk`) is installed automatically (required by `/gsd-*` commands). Pass `--no-sdk` to skip the SDK install, or `--sdk` to force a reinstall.

 </details>

+<details>
+<summary><strong>Minimal Install (local LLMs and token-billed APIs)</strong></summary>
+
+GSD ships 86 skills and 33 subagents. Every runtime (Claude Code, OpenCode, etc.) eagerly enumerates skill descriptions and subagent descriptions into the system prompt on **every turn** — about **~12k tokens** of fixed overhead before you've typed anything. Frontier models with large context (Sonnet 4.6, Opus 4.7 — 200K to 1M ctx) absorb that without a noticeable hit. **Local LLMs with 32K–128K context, and any model where you're paying per token, will feel it.**
+
+Pass `--minimal` (alias `--core-only`) to install only the **main GSD loop**:
+
+```bash
+npx get-shit-done-cc --claude --global --minimal
+# or any other runtime — works the same
+npx get-shit-done-cc --opencode --global --minimal
+```
+
+What you get:
+
+| Surface | Default install | `--minimal` install |
+|---|---|---|
+| Skills | 86 (`new-project`, `discuss-phase`, `plan-phase`, `execute-phase`, …82 more) | **6** (`new-project`, `discuss-phase`, `plan-phase`, `execute-phase`, `help`, `update`) |
+| Subagents | 33 `gsd-*` agents | **0** |
+| Cold-start system-prompt overhead | ~12k tokens | **~700 tokens** (≥94% reduction) |
+| Manifest mode field | `"full"` | `"minimal"` |
+
+The 6 core skills are exactly the ones you need to drive a project from zero: `new-project` to bootstrap, then the `discuss → plan → execute` loop, plus `help` for discovery and `update` to upgrade later.
+
+**This is a hard floor, not a ceiling.** Each `/gsd-*` command you start using and each subagent it dispatches loads its body content into the conversation for that turn — that's normal token use, not eager overhead. But:
+
+> [!IMPORTANT]
+> **The savings disappear the moment you re-install without `--minimal`.** Running `npx get-shit-done-cc@latest` (or `gsd update` from inside a session) without the flag puts the full 86-skill / 33-agent surface back on disk, and every subsequent session pays the full ~12k-token floor again. If you want to stay minimal, **always pass `--minimal` when updating**:
+>
+> ```bash
+> npx get-shit-done-cc@latest --claude --global --minimal
+> ```
+>
+> Need a specific skill that isn't in the core set (e.g., `gsd-autonomous`, `gsd-ship`, `gsd-debug`)? You have two options:
+> 1. **Permanent expand:** re-install without `--minimal` to get the full surface (and the full token floor).
+> 2. **One-shot:** run the slash command's underlying logic by reading the source from `commands/gsd/<name>.md` in the GSD package and executing it manually — no install change needed.
+>
+> Tip: `cat ~/.claude/get-shit-done/.gsd-manifest.json | jq .mode` (or `gsd-file-manifest.json` depending on layout) confirms which mode you're in.
+
+When to use `--minimal`:
+- Local model with 32K–128K context (Qwen3, Llama, Mistral, etc.)
+- Token-metered API where every turn matters
+- Throwaway directory or non-GSD project where you want `/gsd-new-project` available without paying for the rest
+- CI runners or ephemeral containers where install footprint matters
+
+When **not** to use `--minimal`:
+- Active GSD project where you regularly invoke the broader command set (`autonomous`, `ship`, `code-review`, `debug`, etc.) — re-installing each time is friction without payoff.
+- Frontier models with 200K–1M context — the savings are noise.
+
+</details>
+
 <details>
 <summary><strong>Development Installation</strong></summary>

@@ -263,6 +324,8 @@ If you prefer not to use that flag, add this to your project's `.claude/settings

 ## How It Works

+> **New to GSD?** See the [end-to-end walkthrough](docs/USER-GUIDE.md#end-to-end-walkthrough) in the User Guide — it shows a complete project from `/gsd-new-project` through `/gsd-verify-work` with concrete example outputs.
+
 > **Already have code?** Run `/gsd-map-codebase` first. It spawns parallel agents to analyze your stack, architecture, conventions, and concerns. Then `/gsd-new-project` knows your codebase — questions focus on what you're adding, and planning automatically loads your patterns.

 ### 1. Initialize Project
@@ -588,7 +651,7 @@ You're never locked in. The system adapts.

 | Command | What it does |
 |---------|--------------|
-| `/gsd-new-workspace` | Create isolated workspace with repo copies (worktrees or clones) |
+| `/gsd-workspace --new` | Create isolated workspace with repo copies (worktrees or clones) |
 | `/gsd-list-workspaces` | Show all GSD workspaces and their status |
 | `/gsd-remove-workspace` | Remove workspace and clean up worktrees |

@@ -632,9 +695,9 @@ You're never locked in. The system adapts.
 |---------|--------------|
 | `/gsd-add-phase` | Append phase to roadmap |
 | `/gsd-insert-phase [N]` | Insert urgent work between phases |
+| `/gsd-edit-phase [N] [--force]` | Modify any field of an existing phase in place — number and position unchanged |
 | `/gsd-remove-phase [N]` | Remove future phase, renumber |
 | `/gsd-list-phase-assumptions [N]` | See Claude's intended approach before planning |
-| `/gsd-plan-milestone-gaps` | Create phases to close gaps from audit |

 ### Session

@@ -676,7 +739,7 @@ You're never locked in. The system adapts.
 | `/gsd-settings` | Configure model profile and workflow agents |
 | `/gsd-set-profile <profile>` | Switch model profile (quality/balanced/budget/inherit) |
 | `/gsd-add-todo [desc]` | Capture idea for later |
-| `/gsd-check-todos` | List pending todos |
+| `/gsd-capture --list` | List pending todos |
 | `/gsd-debug [desc]` | Systematic debugging with persistent state |
 | `/gsd-do <text>` | Route freeform text to the right GSD command automatically |
 | `/gsd-note <text>` | Zero-friction idea capture — append, list, or promote notes to todos |
@@ -693,6 +756,8 @@ You're never locked in. The system adapts.

 GSD stores project settings in `.planning/config.json`. Configure during `/gsd-new-project` or update later with `/gsd-settings`. For the full config schema, workflow toggles, git branching options, and per-agent model breakdown, see the [User Guide](docs/USER-GUIDE.md#configuration-reference).

+When `GSD_WORKSTREAM` is set, GSD loads the root `.planning/config.json` first and deep-merges the workstream's `config.json` on top — workstream values win on conflict, and an explicit `null` in a workstream config overrides a root value.
+
 ### Core Settings

 | Setting | Options | Default | What it controls |
@@ -721,6 +786,8 @@ Use `inherit` when using non-Anthropic providers (OpenRouter, local models) or t

 Or configure via `/gsd-settings`.

+Per-runtime review-model overrides live under `review.models.<cli>` (e.g. `review.models.codex`, `review.models.gemini`) and let each external review CLI pick its own model independently of the planner/executor profile.
+
 ### Workflow Agents

 These spawn additional agents during planning/execution. They improve quality but add tokens and time.
@@ -736,6 +803,7 @@ These spawn additional agents during planning/execution. They improve quality bu
 | `workflow.skip_discuss` | `false` | Skip discuss-phase in autonomous mode |
 | `workflow.text_mode` | `false` | Text-only mode for remote sessions (no TUI menus) |
 | `workflow.use_worktrees` | `true` | Toggle worktree isolation for execution |
+| `workflow.build_command` | _(auto-detect)_ | Override the post-merge build gate command. Falls back to Xcode (`.xcodeproj`), Makefile, Justfile, Cargo, Go, Python, or npm; Xcode/iOS projects also run `xcodebuild test`. |

 Use `/gsd-settings` to toggle these, or override per-invocation:
 - `/gsd-plan-phase --skip-research`
@@ -866,6 +934,7 @@ npx get-shit-done-cc --antigravity --global --uninstall
 npx get-shit-done-cc --augment --global --uninstall
 npx get-shit-done-cc --trae --global --uninstall
 npx get-shit-done-cc --qwen --global --uninstall
+npx get-shit-done-cc --hermes --global --uninstall
 npx get-shit-done-cc --codebuddy --global --uninstall
 npx get-shit-done-cc --cline --global --uninstall

@@ -882,6 +951,7 @@ npx get-shit-done-cc --antigravity --local --uninstall
 npx get-shit-done-cc --augment --local --uninstall
 npx get-shit-done-cc --trae --local --uninstall
 npx get-shit-done-cc --qwen --local --uninstall
+npx get-shit-done-cc --hermes --local --uninstall
 npx get-shit-done-cc --codebuddy --local --uninstall
 npx get-shit-done-cc --cline --local --uninstall
 ```
--- a/README.pt-BR.md
+++ b/README.pt-BR.md
@@ -73,15 +73,17 @@ Para quem quer descrever o que precisa e receber isso construído do jeito certo

 Quality gates embutidos capturam problemas reais: detecção de schema drift sinaliza mudanças ORM sem migrations, segurança ancora verificação a modelos de ameaça, e detecção de redução de escopo impede o planner de descartar requisitos silenciosamente.

-### Destaques v1.32.0
+### Destaques v1.39.0

- **Gates de consistência STATE.md** — `state validate` detecta divergência entre STATE.md e o filesystem; `state sync` reconstrói a partir do estado real do projeto
- **Flag `--to N`** — Para a execução autônoma após completar uma fase específica
- **Research gate** — Bloqueia planejamento quando RESEARCH.md tem perguntas abertas não resolvidas
- **Filtro de escopo do verificador** — Lacunas abordadas em fases posteriores são marcadas como "adiadas", não como lacunas
- **Guard de leitura antes de edição** — Hook consultivo previne loops de retry infinitos em runtimes não-Claude
- **Redução de contexto** — Truncamento de Markdown e ordenação de prompts cache-friendly para menor uso de tokens
- **4 novos runtimes** — Trae, Kilo, Augment e Cline (12 runtimes no total)
+Lista completa nas [notas de release v1.39.0](https://github.com/gsd-build/get-shit-done/releases/tag/v1.39.0).
+
+- **Perfil de instalação `--minimal`** — alias `--core-only`. Instala apenas os 6 skills do loop principal (`new-project`, `discuss-phase`, `plan-phase`, `execute-phase`, `help`, `update`) e nenhum subagente `gsd-*`. Reduz o overhead do system prompt no cold-start de ~12k para ~700 tokens (≥94% de redução). Útil para LLMs locais com contexto de 32K–128K e APIs cobradas por token.
+- **`/gsd-edit-phase`** — edita qualquer campo de uma fase existente em `ROADMAP.md` no lugar, sem alterar o número ou a posição. `--force` pula o diff de confirmação; referências em `depends_on` são validadas e o `STATE.md` é atualizado na escrita.
+- **Build & test gate pós-merge** — o passo 5.6 de `execute-phase` agora detecta automaticamente o comando de build em `workflow.build_command`, com fallback para Xcode (`.xcodeproj`), Makefile, Justfile, Cargo, Go, Python ou npm. Projetos Xcode/iOS rodam `xcodebuild build` e `xcodebuild test` automaticamente. Funciona em modo paralelo e serial.
+- **Modelo de review por runtime** — `review.models.<cli>` permite que cada CLI externa de review (codex, gemini, etc.) escolha seu próprio modelo, independente do perfil de planner/executor.
+- **Herança de configuração de workstream** — quando `GSD_WORKSTREAM` está definido, o `.planning/config.json` raiz é carregado primeiro e merge-deep com o config da workstream (workstream vence em conflito). Um `null` explícito no config da workstream sobrescreve corretamente o valor raiz.
+- **Workflow manual de canary release** — `.github/workflows/canary.yml` publica builds `{base}-canary.{N}` de `get-shit-done-cc` e `@gsd-build/sdk` na dist-tag `@canary` a partir de `dev`, sob demanda via `workflow_dispatch`.
+- **Consolidação de skills: 86 → 59** — 4 novos skills agrupados (`capture`, `phase`, `config`, `workspace`) absorvem 31 micro-skills. 6 skills pais existentes absorvem wrap-up e sub-operações como flags: `update --sync/--reapply`, `sketch --wrap-up`, `spike --wrap-up`, `map-codebase --fast/--query`, `code-review --fix`, `progress --do/--next`. Sem perda funcional.

 ---

--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -73,15 +73,17 @@ GSD 解决的就是这个问题。它是让 Claude Code 变得可靠的上下文

 适合那些想把自己的需求说明白，然后让系统正确构建出来的人，而不是假装自己在运营一个 50 人工程组织的人。

-### v1.32.0 亮点
+### v1.39.0 亮点

- **STATE.md 一致性检查** — `state validate` 检测 STATE.md 与文件系统之间的偏差；`state sync` 从实际项目状态重建
- **`--to N` 标志** — 在完成特定阶段后停止自主执行
- **研究门控** — 当 RESEARCH.md 有未解决的开放问题时阻止规划
- **验证里程碑范围过滤** — 后续阶段将处理的差距标记为"延迟"而非差距
- **读取后编辑保护** — 咨询性 hook 防止非 Claude 运行时的无限重试循环
- **上下文缩减** — Markdown 截断和缓存友好的 prompt 排序，降低 token 使用量
- **4 个新运行时** — Trae、Kilo、Augment 和 Cline（共 12 个运行时）
+完整列表请参阅 [v1.39.0 发行说明](https://github.com/gsd-build/get-shit-done/releases/tag/v1.39.0)。
+
+- **`--minimal` 安装档** — 别名 `--core-only`。仅安装主循环的 6 个核心技能（`new-project`、`discuss-phase`、`plan-phase`、`execute-phase`、`help`、`update`），不安装任何 `gsd-*` 子代理。将冷启动系统提示开销从 ~12k token 降至 ~700 token（≥94% 减少）。适合 32K–128K 上下文的本地 LLM 和按 token 计费的 API。
+- **`/gsd-edit-phase`** — 就地修改 `ROADMAP.md` 中已有阶段的任意字段，不改变其编号或位置。`--force` 跳过确认 diff，验证 `depends_on` 引用，并在写入时更新 `STATE.md`。
+- **合并后构建与测试门** — `execute-phase` 步骤 5.6 优先自动检测 `workflow.build_command` 配置，否则按 Xcode（`.xcodeproj`）、Makefile、Justfile、Cargo、Go、Python、npm 顺序回退。Xcode/iOS 项目自动运行 `xcodebuild build` 和 `xcodebuild test`。在并行与串行模式下均生效。
+- **每运行时评审模型选择** — `review.models.<cli>` 让每个外部评审 CLI（codex、gemini 等）独立于规划/执行档选择自己的模型。
+- **工作流设置继承** — 设置 `GSD_WORKSTREAM` 后，先加载根 `.planning/config.json`，再与该工作流的配置进行深合并（冲突时工作流优先）。工作流配置中显式 `null` 会覆盖根值。
+- **手动 canary 发布工作流** — `.github/workflows/canary.yml` 通过 `workflow_dispatch` 从 `dev` 分支按需将 `{base}-canary.{N}` 构建（`get-shit-done-cc` 与 `@gsd-build/sdk`）发布到 `@canary` dist-tag。
+- **技能整合：86 → 59** — 4 个新分组技能（`capture`、`phase`、`config`、`workspace`）吸收了 31 个微技能。6 个已有父技能将收尾与子操作合并为标志：`update --sync/--reapply`、`sketch --wrap-up`、`spike --wrap-up`、`map-codebase --fast/--query`、`code-review --fix`、`progress --do/--next`。功能无损失。

 ---

@@ -589,6 +591,7 @@ lmn012o feat(08-02): create registration endpoint
 |------|------|
 | `/gsd-add-phase` | 在路线图末尾追加 phase |
 | `/gsd-insert-phase [N]` | 在 phase 之间插入紧急工作 |
+| `/gsd-edit-phase [N] [--force]` | 就地修改已有 phase 的任意字段 — 编号与位置保持不变 |
 | `/gsd-remove-phase [N]` | 删除未来 phase，并重编号 |
 | `/gsd-list-phase-assumptions [N]` | 在规划前查看 Claude 打算采用的方案 |
 | `/gsd-plan-milestone-gaps` | 为 audit 发现的缺口创建 phase |
--- a/VERSIONING.md
+++ b/VERSIONING.md
@@ -67,15 +67,38 @@ main                              ← stable, always deployable

 ### Patch Release (Hotfix)

-For critical bugs that can't wait for the next minor release.
+For fixes that need to ship without waiting for the next minor.

-1. Trigger `hotfix.yml` with version (e.g., `1.27.1`)
-2. Workflow creates `hotfix/1.27.1` branch from the latest patch tag for that minor version (e.g., `v1.27.0` or `v1.27.1`)
-3. Cherry-pick or apply fix on the hotfix branch
-4. Push — CI runs tests automatically
-5. Trigger `hotfix.yml` finalize action
-6. Workflow runs full test suite, bumps version, tags, publishes to `latest`
-7. Merge hotfix branch back to main
+A hotfix `vX.YY.Z` cumulatively includes everything in `vX.YY.{Z-1}` plus every `fix:`/`chore:` commit landed on `main` since that base. The base tag is the anchor — `git cherry $BASE_TAG main` reveals exactly which commits are still unshipped, and the new `vX.YY.Z` tag becomes the next hotfix's base, so the cycle is self-documenting.
+
+#### Two paths
+
+**Path A — `hotfix.yml` (canonical, two-step):**
+
+1. Trigger `hotfix.yml` with `action=create`, `version=1.27.1`, `auto_cherry_pick=true` (default).
+   - Workflow detects `BASE_TAG` = highest `v1.27.*` < `v1.27.1` (so `1.27.1` branches from `v1.27.0`; `1.27.2` would branch from `v1.27.1`).
+   - Branches `hotfix/1.27.1` from `BASE_TAG`.
+   - Auto-cherry-picks every `fix:`/`chore:` commit on `origin/main` not already in the base, oldest-first. Patch-equivalents are skipped via `git cherry`. `feat:`/`refactor:` are **never** auto-included.
+   - On conflict the workflow halts with the offending SHA. Resolve manually on the branch, then re-run finalize with `auto_cherry_pick=false`.
+   - Bumps `package.json` (and `sdk/package.json`), pushes the branch, and lists every included SHA in the run summary.
+2. (Optional) push additional manual commits to `hotfix/1.27.1`.
+3. Trigger `hotfix.yml` with `action=finalize`. The workflow:
+   - Runs `install-smoke` cross-platform gate.
+   - Runs full test suite + coverage.
+   - Builds SDK, bundles `sdk-bundle/gsd-sdk.tgz` inside the CC tarball (parity with `release-sdk.yml`).
+   - Tags `v1.27.1`, publishes to `@latest`, re-points `@next → v1.27.1`.
+   - Opens merge-back PR against `main`.
+
+**Path B — `release-sdk.yml` (stopgap, one-shot):**
+
+Active while the `@gsd-build/sdk` npm token is unavailable; bundles the SDK inside the CC tarball.
+
+1. Trigger `release-sdk.yml` with `action=hotfix`, `version=1.27.1`, `auto_cherry_pick=true`.
+   - The `prepare` job creates the branch and cherry-picks (same logic as Path A).
+   - `install-smoke` runs against the new branch.
+   - The `release` job tags, publishes to `@latest`, re-points `@next`, opens merge-back PR.
+   - Idempotent: if `hotfix/1.27.1` already exists (e.g. you ran `hotfix.yml create` first), the prepare job checks it out and re-runs cherry-pick as a no-op.
+2. `dry_run=true` exercises the full pipeline without pushing the branch or publishing.

 ### Minor Release (Standard Cycle)

--- a/agents/gsd-code-fixer.md
+++ b/agents/gsd-code-fixer.md
@@ -1,6 +1,6 @@
 ---
 name: gsd-code-fixer
-description: Applies fixes to code review findings from REVIEW.md. Reads source files, applies intelligent fixes, and commits each fix atomically. Spawned by /gsd-code-review-fix.
+description: Applies fixes to code review findings from REVIEW.md. Reads source files, applies intelligent fixes, and commits each fix atomically. Spawned by /gsd-code-review --fix.
 tools: Read, Edit, Write, Bash, Grep, Glob
 color: "#10B981"
 # hooks:
@@ -10,7 +10,7 @@ color: "#10B981"
 <role>
 You are a GSD code fixer. You apply fixes to issues found by the gsd-code-reviewer agent.

-Spawned by `/gsd-code-review-fix` workflow. You produce REVIEW-FIX.md artifact in the phase directory.
+Spawned by `/gsd-code-review --fix` workflow. You produce REVIEW-FIX.md artifact in the phase directory.

 Your job: Read REVIEW.md findings, fix source code intelligently (not blind application), commit each fix atomically, and produce REVIEW-FIX.md report.

@@ -209,6 +209,152 @@ If a finding references multiple files (in Fix section or Issue section):

 <execution_flow>

+<step name="setup_worktree">
+**Isolation: create a dedicated git worktree BEFORE touching any files.**
+
+This agent runs as a background process that makes commits. Operating on the main working tree would race the foreground session (shared index, HEAD, and on-disk files). Instead, every instance runs in its own isolated worktree.
+
+The cleanup tail (commit fixes -> remove worktree -> drop recovery sentinel) MUST be **transactional**: either all of (worktree, branch advance, sentinel) end in a clean state, or — if the process is interrupted (system restart, OOM kill) between the last commit and `git worktree remove` — a discoverable recovery sentinel is left behind so a future run, `/gsd-resume-work`, or `/gsd-progress` can complete the cleanup. The bug fixed by #2839 was that the cleanup tail was non-transactional and silently left orphan worktrees + unmerged branches with no resume marker.
+
+```bash
+# Derive worktree path from padded_phase (parsed from config in next step,
+# but the shell snippet below is illustrative — adapt once config is parsed).
+# In practice: parse padded_phase from config first, then run:
+branch=$(git branch --show-current)
+test -n "$branch" || { echo "Detached HEAD is not supported for review-fix (#2686)"; exit 1; }
+
+# Recovery-sentinel handling (#2839):
+# Path is ${phase_dir}/.review-fix-recovery-pending.json. If it already exists,
+# a previous run was interrupted between fix commits and `git worktree remove`.
+# The pre-existing sentinel records the orphan worktree_path, branch, and
+# padded_phase so this run can complete recovery before starting fresh.
+sentinel="${phase_dir}/.review-fix-recovery-pending.json"
+if [ -f "$sentinel" ]; then
+  echo "Detected pre-existing recovery sentinel from a prior interrupted run: $sentinel"
+  # Recovery must extract BOTH worktree_path AND reviewfix_branch (#3001 CR):
+  # if a prior run died after `git worktree remove` but before
+  # `git branch -D`, the orphan branch survives and clutters `git branch`
+  # output forever. Emit both fields newline-separated so we can read them
+  # independently.
+  prior_recovery=$(node -e '
+    const fs = require("fs");
+    try {
+      const parsed = JSON.parse(fs.readFileSync(process.argv[1], "utf-8"));
+      process.stdout.write((parsed.worktree_path || "") + "\n" + (parsed.reviewfix_branch || ""));
+    } catch (err) {
+      process.stderr.write(`Warning: malformed recovery sentinel ${process.argv[1]}: ${err.message}\n`);
+      process.stdout.write("\n");
+    }
+  ' "$sentinel")
+  prior_wt="$(printf '%s' "$prior_recovery" | sed -n '1p')"
+  prior_branch="$(printf '%s' "$prior_recovery" | sed -n '2p')"
+  if [ -n "$prior_wt" ] && git worktree list --porcelain | grep -q "^worktree $prior_wt$"; then
+    echo "Removing orphan worktree from prior run: $prior_wt"
+    git worktree remove "$prior_wt" --force || true
+  fi
+  if [ -n "$prior_branch" ]; then
+    # Best-effort: branch may already be gone (cleaned by an earlier
+    # partial recovery, or never created if `git worktree add -b` itself
+    # failed). `|| true` keeps recovery non-fatal.
+    echo "Removing orphan reviewfix branch from prior run: $prior_branch"
+    git branch -D "$prior_branch" 2>/dev/null || true
+  fi
+  rm -f "$sentinel"
+fi
+
+wt=$(mktemp -d "/tmp/sv-${padded_phase}-reviewfix-XXXXXX")
+
+# Create a temp branch from the current branch tip so the worktree
+# attaches to that NEW branch rather than the user's currently-checked-out
+# branch (#2990: git refuses to check out the same branch in two
+# worktrees by default; the original `git worktree add "$wt" "$branch"`
+# failed before the agent could do any work). The temp branch shares
+# history with $branch up to the moment of creation, so commits made
+# inside the worktree fast-forward $branch on cleanup.
+reviewfix_branch="gsd-reviewfix/${padded_phase}-$$"
+git worktree add -b "$reviewfix_branch" "$wt" "$branch"
+
+# Write the recovery sentinel ONLY AFTER `git worktree add` succeeds.
+# Writing it before would leave a sentinel pointing at a worktree that does
+# not exist if `git worktree add` itself failed.
+node -e '
+  const fs = require("fs");
+  const [sentinelPath, worktree_path, branch, reviewfix_branch, padded_phase] = process.argv.slice(1);
+  fs.writeFileSync(sentinelPath, JSON.stringify({
+    worktree_path,
+    branch,
+    reviewfix_branch,
+    padded_phase,
+    started_at: new Date().toISOString()
+  }, null, 2));
+' "$sentinel" "$wt" "$branch" "$reviewfix_branch" "$padded_phase"
+
+cd "$wt"
+```
+
+Concrete steps:
+1. Parse `padded_phase` and `phase_dir` from the `<config>` block (needed for the path and for the sentinel location).
+2. Resolve the current branch: `branch=$(git branch --show-current)`. If empty (detached HEAD), print an error and exit — detached-HEAD state is not supported; commits made in a detached-HEAD worktree would not advance the branch.
+3. **Recovery check (#2839, #2990):** If `${phase_dir}/.review-fix-recovery-pending.json` already exists, a prior run was interrupted. Parse the JSON, attempt to remove the orphan worktree it points at (best-effort, with `--force`), and delete the stale `reviewfix_branch` (best-effort, with `git branch -D`), then delete the stale sentinel before continuing. This makes a re-run of `/gsd-code-review --fix` self-healing.
+4. Create a unique worktree path: `wt=$(mktemp -d "/tmp/sv-${padded_phase}-reviewfix-XXXXXX")`. The `mktemp` suffix ensures concurrent runs for the same phase do not collide.
+5. Run `git worktree add -b "$reviewfix_branch" "$wt" "$branch"` — this creates a NEW branch (`gsd-reviewfix/${padded_phase}-$$`) starting from the current branch tip and attaches the worktree to that new branch. Attaching to a new branch (rather than `$branch` directly) is what allows the worktree to coexist with the user's checkout — git refuses to check out the same branch in two worktrees by default (#2990). Commits made inside the worktree advance `$reviewfix_branch`; the cleanup tail fast-forwards `$branch` to `$reviewfix_branch` so the user's branch ends up with the agent's commits.
+6. **Write the recovery sentinel** at `${phase_dir}/.review-fix-recovery-pending.json` containing `{worktree_path, branch, reviewfix_branch, padded_phase, started_at}`. Doing this AFTER `git worktree add` ensures the sentinel only ever points at a real worktree. The sentinel includes `reviewfix_branch` so recovery can clean both the orphan worktree AND its temp branch.
+7. All subsequent file reads, edits, and commits happen inside `$wt` (which is on `$reviewfix_branch`, not `$branch`).
+
+**If `git worktree add` fails**, surface the error and exit — do not force-remove the path, as another concurrent run may be holding it. Do not write the sentinel (the worktree does not exist). Do not delete `$reviewfix_branch` either; if `-b` failed, no temp branch was created.
+
+**Cleanup tail (transactional, ALWAYS — even on failure):** After writing REVIEW-FIX.md and before returning to the orchestrator, run the cleanup in this exact order:
+
+```bash
+# Step 1 (#2990): fast-forward $branch to capture the commits the agent
+# made on $reviewfix_branch. Run from the main repo (not $wt) — the user's
+# checkout owns $branch. --ff-only ensures we never silently drop or
+# rewrite history if the user committed to $branch concurrently; on
+# divergence, this fails loudly and the temp branch is left for the
+# user to inspect/merge manually. We deliberately resolve the main repo
+# path via `git worktree list --porcelain` rather than assuming $PWD,
+# because the agent ran inside $wt.
+# Strip the literal "worktree " prefix and print the rest of the line, then
+# exit on the first match. This preserves paths that contain spaces
+# (awk '$2' would truncate "/path/with spaces/repo" to "/path/with").
+main_repo="$(git worktree list --porcelain | awk '/^worktree / { sub(/^worktree /, ""); print; exit }')"
+ff_status=0
+# Capture the exit code of `git merge` directly. `if ! cmd; then ff_status=$?`
+# captures the exit code of the `!` operator (always 1 when the inner cmd
+# failed) — masking the real merge exit code. Use the success/else split
+# instead so $? in the else-branch is the merge command's exit code.
+if git -C "$main_repo" merge --ff-only "$reviewfix_branch" 2>&1; then
+  ff_status=0
+else
+  ff_status=$?
+  echo "WARN: could not fast-forward $branch to $reviewfix_branch (exit $ff_status)."
+  echo "      The temp branch $reviewfix_branch is preserved for manual merge."
+fi
+
+# Step 2: drop the worktree. If this succeeds and the process is then
+# killed, the next run finds a sentinel pointing at a worktree that no
+# longer exists — the recovery branch handles this gracefully (best-effort
+# remove + sentinel delete). If we reversed the order (sentinel removed
+# first, then worktree remove), an interruption between the two steps
+# would leave NO sentinel and an orphan worktree — exactly the bug from
+# #2839.
+git worktree remove "$wt" --force
+
+# Step 3: delete the temp branch ONLY if the fast-forward succeeded. If
+# it didn't, leaving the branch lets the user inspect/merge manually.
+if [ "$ff_status" -eq 0 ]; then
+  git -C "$main_repo" branch -D "$reviewfix_branch" || true
+fi
+
+# Step 4: drop the recovery sentinel ONLY after `git worktree remove`
+# returns successfully. This atomic-ish ordering is what makes the
+# cleanup tail transactional from the orchestrator's perspective.
+rm -f "$sentinel"
+```
+
+This cleanup is unconditional — register it mentally as a finally-block obligation. If the agent exits early (config error, no findings, etc.), still run the cleanup tail in order (fast-forward → worktree remove → temp branch delete → sentinel rm) before exit. The sentinel must NEVER be removed before `git worktree remove` succeeds. The temp branch must NEVER be deleted while the fast-forward is in a diverged state.
+</step>
+
 <step name="load_context">
 **1. Read mandatory files:** Load all files from `<required_reading>` block if present.

@@ -312,6 +458,7 @@ Use `gsd-sdk query commit` with conventional format (message first, then every s
 ```bash
 gsd-sdk query commit \
  "fix({padded_phase}): {finding_id} {short_description}" \
+  --files \
  {all_modified_files}
 ```

@@ -321,7 +468,7 @@ Examples:

 **Multiple files:** List ALL modified files after the message (space-separated):
 ```bash
-gsd-sdk query commit "fix(02): CR-01 ..." \
+gsd-sdk query commit "fix(02): CR-01 ..." --files \
  src/api/auth.ts src/types/user.ts tests/auth.test.ts
 ```

@@ -437,6 +584,10 @@ _Iteration: {N}_

 <critical_rules>

+**ALWAYS run inside the isolated worktree** — set up via `branch=$(git branch --show-current)` + `wt=$(mktemp -d "/tmp/sv-${padded_phase}-reviewfix-XXXXXX")` + `git worktree add -b "$reviewfix_branch" "$wt" "$branch"` at the very start (see `setup_worktree` step). Using `mktemp` ensures concurrent runs do not collide. Attaching to a NEW branch `$reviewfix_branch` (not `$branch` directly) is required because git refuses to check out the same branch in two worktrees by default — `$branch` is already checked out in the user's main repo (#2990). Commits advance `$reviewfix_branch`; the cleanup tail fast-forwards `$branch` to `$reviewfix_branch` so the user's branch ends up with the agent's commits. Every file read, edit, and commit must happen inside `$wt`. Run the four-step cleanup tail unconditionally when done (treat it as a finally block). If `git worktree add` fails, exit with an error rather than force-removing a path another run may hold. This prevents racing the foreground session on the shared main working tree (#2686).
+
+**ALWAYS run the transactional cleanup tail in order** (#2839, #2990): the cleanup is four steps with strict ordering. (1) `git -C "$main_repo" merge --ff-only "$reviewfix_branch"` — fast-forward the user's branch to capture the agent's commits; on divergence, fail loudly and preserve the temp branch. (2) `git worktree remove "$wt" --force`. (3) `git -C "$main_repo" branch -D "$reviewfix_branch"` ONLY if the fast-forward succeeded; otherwise leave the temp branch for manual merge. (4) `rm -f "$sentinel"` (the recovery sentinel at `${phase_dir}/.review-fix-recovery-pending.json`). The sentinel is written AFTER `git worktree add` succeeds and removed only AFTER `git worktree remove` returns successfully. The temp branch is deleted only when the fast-forward succeeded. This ordering is what makes the cleanup tail transactional — an interruption between commits and `git worktree remove` leaves the sentinel behind (with `reviewfix_branch` recorded) so a future run, `/gsd-resume-work`, or `/gsd-progress` can detect and complete the recovery. Reversing the order recreates the orphan-worktree bug.
+
 **ALWAYS use the Write tool to create files** — never use `Bash(cat << 'EOF')` or heredoc commands for file creation.

 **DO read the actual source file** before applying any fix — never blindly apply REVIEW.md suggestions without understanding current code state.
--- a/agents/gsd-code-reviewer.md
+++ b/agents/gsd-code-reviewer.md
@@ -8,7 +8,7 @@ color: "#F59E0B"
 ---

 <role>
-You are a GSD code reviewer. You analyze source files for bugs, security vulnerabilities, and code quality issues.
+Source files from a completed implementation have been submitted for adversarial review. Find every bug, security vulnerability, and quality defect — do not validate that work was done.

 Spawned by `/gsd-code-review` workflow. You produce REVIEW.md artifact in the phase directory.

@@ -16,6 +16,22 @@ Spawned by `/gsd-code-review` workflow. You produce REVIEW.md artifact in the ph
 If the prompt contains a `<required_reading>` block, you MUST use the `Read` tool to load every file listed there before performing any other actions. This is your primary context.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every submitted implementation contains defects. Your starting hypothesis: this code has bugs, security gaps, or quality failures. Surface what you can prove.
+
+**Common failure modes — how code reviewers go soft:**
+- Stopping at obvious surface issues (console.log, empty catch) and assuming the rest is sound
+- Accepting plausible-looking logic without tracing through edge cases (nulls, empty collections, boundary values)
+- Treating "code compiles" or "tests pass" as evidence of correctness
+- Reading only the file under review without checking called functions for bugs they introduce
+- Downgrading findings from BLOCKER to WARNING to avoid seeming harsh
+
+**Required finding classification:** Every finding in REVIEW.md must carry:
+- **BLOCKER** — incorrect behavior, security vulnerability, or data loss risk; must be fixed before this code ships
+- **WARNING** — degrades quality, maintainability, or robustness; should be fixed
+Findings without a classification are not valid output.
+</adversarial_stance>
+
 <project_context>
 Before reviewing, discover project context:

--- a/agents/gsd-codebase-mapper.md
+++ b/agents/gsd-codebase-mapper.md
@@ -94,6 +94,19 @@ Based on focus, determine which documents you'll write:
 - `arch` → ARCHITECTURE.md, STRUCTURE.md
 - `quality` → CONVENTIONS.md, TESTING.md
 - `concerns` → CONCERNS.md
+
+**Optional `--paths` scope hint (#2003):**
+The prompt may include a line of the form:
+
+```text
+--paths <p1>,<p2>,...
+```
+
+When present, restrict your exploration (Glob/Grep/Bash globs) to files under the listed repo-relative path prefixes. This is the incremental-remap path used by the post-execute codebase-drift gate in `/gsd:execute-phase`. You still produce the same documents, but their "where to add new code" / "directory layout" sections focus on the provided subtrees rather than re-scanning the whole repository.
+
+**Path validation:** Reject any `--paths` value containing `..`, starting with `/`, or containing shell metacharacters (`;`, `` ` ``, `$`, `&`, `|`, `<`, `>`). If all provided paths are invalid, log a warning in your confirmation and fall back to the default whole-repo scan.
+
+If no `--paths` hint is provided, behave exactly as before.
 </step>

 <step name="explore_codebase">
@@ -326,10 +339,42 @@ Ready for orchestrator summary.
 ## ARCHITECTURE.md Template (arch focus)

 ```markdown
+<!-- refreshed: [YYYY-MM-DD] -->
 # Architecture

 **Analysis Date:** [YYYY-MM-DD]

+## System Overview
+
+```text
+┌─────────────────────────────────────────────────────────────┐
+│                      [Top Layer Name]                        │
+├──────────────────┬──────────────────┬───────────────────────┤
+│   [Component A]  │   [Component B]  │    [Component C]      │
+│  `[path/to/a]`   │  `[path/to/b]`   │   `[path/to/c]`       │
+└────────┬─────────┴────────┬─────────┴──────────┬────────────┘
+         │                  │                     │
+         ▼                  ▼                     ▼
+┌─────────────────────────────────────────────────────────────┐
+│                    [Middle Layer Name]                       │
+│         `[path/to/layer]`                                    │
+└─────────────────────────────────────────────────────────────┘
+         │
+         ▼
+┌─────────────────────────────────────────────────────────────┐
+│  [Store / Output / External]                                 │
+│  `[path/to/store]`                                           │
+└─────────────────────────────────────────────────────────────┘
+```
+
+## Component Responsibilities
+
+| Component | Responsibility | File |
+|-----------|----------------|------|
+| [Name] | [What it owns] | `[path]` |
+| [Name] | [What it owns] | `[path]` |
+| [Name] | [What it owns] | `[path]` |
+
 ## Pattern Overview

 **Overall:** [Pattern name]
@@ -350,7 +395,13 @@ Ready for orchestrator summary.

 ## Data Flow

-**[Flow Name]:**
+### Primary Request Path
+
+1. [Step 1 — entry point] (`[file:line]`)
+2. [Step 2 — processing] (`[file:line]`)
+3. [Step 3 — output/response] (`[file:line]`)
+
+### [Secondary Flow Name]

 1. [Step 1]
 2. [Step 2]
@@ -373,6 +424,27 @@ Ready for orchestrator summary.
 - Triggers: [What invokes it]
 - Responsibilities: [What it does]

+## Architectural Constraints
+
+- **Threading:** [Threading model — e.g., single-threaded event loop, worker threads used for X]
+- **Global state:** [Any module-level singletons or shared mutable state — list files]
+- **Circular imports:** [Known circular dependency chains, if any]
+- **[Other constraint]:** [Description]
+
+## Anti-Patterns
+
+### [Anti-Pattern Name]
+
+**What happens:** [The incorrect pattern observed in this codebase]
+**Why it's wrong:** [The problem it causes here]
+**Do this instead:** [The correct pattern with file reference]
+
+### [Anti-Pattern Name]
+
+**What happens:** [The incorrect pattern observed in this codebase]
+**Why it's wrong:** [The problem it causes here]
+**Do this instead:** [The correct pattern with file reference]
+
 ## Error Handling

 **Strategy:** [Approach]
--- a/agents/gsd-debugger.md
+++ b/agents/gsd-debugger.md
@@ -1168,7 +1168,7 @@ Root cause: {root_cause}"

 Then commit planning docs via CLI (respects `commit_docs` config automatically):
 ```bash
-gsd-sdk query commit "docs: resolve debug {slug}" .planning/debug/resolved/{slug}.md
+gsd-sdk query commit "docs: resolve debug {slug}" --files .planning/debug/resolved/{slug}.md
 ```

 **Append to knowledge base:**
@@ -1199,7 +1199,7 @@ Then append the entry:

 Commit the knowledge base update alongside the resolved session:
 ```bash
-gsd-sdk query commit "docs: update debug knowledge base with {slug}" .planning/debug/knowledge-base.md
+gsd-sdk query commit "docs: update debug knowledge base with {slug}" --files .planning/debug/knowledge-base.md
 ```

 Report completion and offer next steps.
--- a/agents/gsd-doc-classifier.md
+++ b/agents/gsd-doc-classifier.md
@@ -110,7 +110,7 @@ Regardless of type, extract:
 </step>

 <step name="write_output">
-Write to `{OUTPUT_DIR}/{slug}.json` where `slug` is the filename without extension (replace non-alphanumerics with `-`).
+Write to `{OUTPUT_DIR}/{slug}-{source_hash}.json` where `slug` is the filename without extension (replace non-alphanumerics with `-`), and `source_hash` is the first 8 hex chars of SHA-256 of the **full source file path** (POSIX-style) so parallel classifiers never collide on sibling `README.md` files.

 JSON schema:

--- a/agents/gsd-doc-verifier.md
+++ b/agents/gsd-doc-verifier.md
@@ -12,18 +12,34 @@ color: orange
 ---

 <role>
-You are a GSD doc verifier. You check factual claims in project documentation against the live codebase.
+A documentation file has been submitted for factual verification against the live codebase. Every checkable claim must be verified — do not assume claims are correct because the doc was recently written.

-You are spawned by the `/gsd-docs-update` workflow. Each spawn receives a `<verify_assignment>` XML block containing:
+Spawned by the `/gsd-docs-update` workflow. Each spawn receives a `<verify_assignment>` XML block containing:
 - `doc_path`: path to the doc file to verify (relative to project_root)
 - `project_root`: absolute path to project root

-Your job: Extract checkable claims from the doc, verify each against the codebase using filesystem tools only, then write a structured JSON result file. Returns a one-line confirmation to the orchestrator only — do not return doc content or claim details inline.
+Extract checkable claims from the doc, verify each against the codebase using filesystem tools only, then write a structured JSON result file. Returns a one-line confirmation to the orchestrator only — do not return doc content or claim details inline.

 **CRITICAL: Mandatory Initial Read**
 If the prompt contains a `<required_reading>` block, you MUST use the `Read` tool to load every file listed there before performing any other actions. This is your primary context.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every factual claim in the doc is wrong until filesystem evidence proves it correct. Your starting hypothesis: the documentation has drifted from the code. Surface every false claim.
+
+**Common failure modes — how doc verifiers go soft:**
+- Checking only explicit backtick file paths and skipping implicit file references in prose
+- Accepting "the file exists" without verifying the specific content the claim describes (e.g., a function name, a config key)
+- Missing command claims inside nested code blocks or multi-line bash examples
+- Stopping verification after finding the first PASS evidence for a claim rather than exhausting all checkable sub-claims
+- Marking claims UNCERTAIN when the filesystem can answer the question with a grep
+
+**Required finding classification:**
+- **BLOCKER** — a claim is demonstrably false (file missing, function doesn't exist, command not in package.json); doc will mislead readers
+- **WARNING** — a claim cannot be verified from the filesystem alone (behavior claim, runtime claim) or is partially correct
+Every extracted claim must resolve to PASS, FAIL (BLOCKER), or UNVERIFIABLE (WARNING with reason).
+</adversarial_stance>
+
 <project_context>
 Before verifying, discover project context:

--- a/agents/gsd-eval-auditor.md
+++ b/agents/gsd-eval-auditor.md
@@ -12,10 +12,26 @@ color: "#EF4444"
 ---

 <role>
-You are a GSD eval auditor. Answer: "Did the implemented AI system actually deliver its planned evaluation strategy?"
+An implemented AI phase has been submitted for evaluation coverage audit. Answer: "Did the implemented system actually deliver its planned evaluation strategy?" — not whether it looks like it might.
 Scan the codebase, score each dimension COVERED/PARTIAL/MISSING, write EVAL-REVIEW.md.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume the eval strategy was not implemented until codebase evidence proves otherwise. Your starting hypothesis: AI-SPEC.md documents intent; the code does something different or less. Surface every gap.
+
+**Common failure modes — how eval auditors go soft:**
+- Marking PARTIAL instead of MISSING because "some tests exist" — partial coverage of a critical eval dimension is MISSING until the gap is quantified
+- Accepting metric logging as evidence of evaluation without checking that logged metrics drive actual decisions
+- Crediting AI-SPEC.md documentation as implementation evidence
+- Not verifying that eval dimensions are scored against the rubric, only that test files exist
+- Downgrading MISSING to PARTIAL to soften the report
+
+**Required finding classification:**
+- **BLOCKER** — an eval dimension is MISSING or a guardrail is unimplemented; AI system must not ship to production
+- **WARNING** — an eval dimension is PARTIAL; coverage is insufficient for confidence but not absent
+Every planned eval dimension must resolve to COVERED, PARTIAL (WARNING), or MISSING (BLOCKER).
+</adversarial_stance>
+
 <required_reading>
 Read `~/.claude/get-shit-done/references/ai-evals.md` before auditing. This is your scoring framework.
 </required_reading>
--- a/agents/gsd-executor.md
+++ b/agents/gsd-executor.md
@@ -72,10 +72,11 @@ if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi

 Extract from init JSON: `executor_model`, `commit_docs`, `sub_repos`, `phase_dir`, `plans`, `incomplete_plans`.

-Also read STATE.md for position, decisions, blockers:
+Also load planning state (position, decisions, blockers) via the SDK — **use `node` to invoke the CLI** (not `npx`):
 ```bash
-cat .planning/STATE.md 2>/dev/null
+node ./node_modules/@gsd-build/sdk/dist/cli.js query state.load 2>/dev/null
 ```
+If the SDK is not installed under `node_modules`, use the same `query state.load` argv with your local `gsd-sdk` CLI on `PATH`.

 If STATE.md missing but .planning/ exists: offer to reconstruct or continue without.
 If .planning/ missing: Error — project not initialized.
@@ -357,6 +358,30 @@ If RED or GREEN gate commits are missing, add a warning to SUMMARY.md under a `#
 <task_commit_protocol>
 After each task completes (verification passed, done criteria met), commit immediately.

+**0. Pre-commit HEAD safety assertion (worktree mode only, MANDATORY before every commit — #2924):**
+When running inside a Claude Code worktree (`.git` is a file, not a directory), assert HEAD is on a per-agent branch BEFORE staging or committing. If HEAD has drifted onto a protected ref, HALT — never self-recover via `git update-ref refs/heads/<protected>`:
+```bash
+if [ -f .git ]; then  # worktree
+  HEAD_REF=$(git symbolic-ref --quiet HEAD || echo "DETACHED")
+  ACTUAL_BRANCH=$(git rev-parse --abbrev-ref HEAD)
+  # Deny-list: never commit on a protected ref.
+  if [ "$HEAD_REF" = "DETACHED" ] || \
+     echo "$ACTUAL_BRANCH" | grep -Eq '^(main|master|develop|trunk|release/.*)$'; then
+    echo "FATAL: refusing to commit — worktree HEAD is on '$ACTUAL_BRANCH' (expected per-agent branch)." >&2
+    echo "DO NOT use 'git update-ref' to rewind the protected branch — surface as blocker (#2924)." >&2
+    exit 1
+  fi
+  # Positive allow-list: HEAD must be on the canonical Claude Code worktree-agent
+  # branch namespace (`worktree-agent-<id>`). This catches feature/* and any other
+  # arbitrary branch that the deny-list would silently allow (#2924).
+  if ! echo "$ACTUAL_BRANCH" | grep -Eq '^worktree-agent-[A-Za-z0-9._/-]+$'; then
+    echo "FATAL: refusing to commit — worktree HEAD '$ACTUAL_BRANCH' is not in the worktree-agent-* namespace." >&2
+    echo "Agent commits must live on per-agent branches; surface as blocker (#2924)." >&2
+    exit 1
+  fi
+fi
+```
+
 **1. Check modified files:** `git status --short`

 **2. Stage task-related files individually** (NEVER `git add .` or `git add -A`):
@@ -425,6 +450,15 @@ back, those deletions appear on the main branch, destroying prior-wave work (#20
 - `git rm` on files not explicitly created by the current task
 - `git checkout -- .` or `git restore .` (blanket working-tree resets that discard files)
 - `git reset --hard` except inside the `<worktree_branch_check>` step at agent startup
+- `git update-ref refs/heads/<protected>` (where protected is `main`, `master`,
+  `develop`, `trunk`, or `release/*`). This is an absolute prohibition (#2924).
+  If you discover that your worktree HEAD is attached to a protected branch and your
+  commits landed there, **DO NOT** "recover" by force-rewinding the protected ref —
+  that silently destroys concurrent commits in multi-active scenarios (parallel
+  agents, user committing while you run). HALT and surface a blocker. The setup-time
+  `<worktree_branch_check>` and per-commit `<pre_commit_head_assertion>` are the
+  correct prevention; if either fails, the workflow MUST stop, not self-heal.
+- `git push --force` / `git push -f` to any branch you did not create.

 If you need to discard changes to a specific file you modified during this task, use:
 ```bash
@@ -562,7 +596,7 @@ gsd-sdk query state.add-blocker "Blocker description"

 <final_commit>
 ```bash
-gsd-sdk query commit "docs({phase}-{plan}): complete [plan-name] plan" \
+gsd-sdk query commit "docs({phase}-{plan}): complete [plan-name] plan" --files \
  .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md .planning/STATE.md .planning/ROADMAP.md .planning/REQUIREMENTS.md
 ```

--- a/agents/gsd-integration-checker.md
+++ b/agents/gsd-integration-checker.md
@@ -6,9 +6,9 @@ color: blue
 ---

 <role>
-You are an integration checker. You verify that phases work together as a system, not just individually.
+A set of completed phases has been submitted for cross-phase integration audit. Verify that phases actually wire together — not that each phase individually looks complete.

-Your job: Check cross-phase wiring (exports used, APIs called, data flows) and verify E2E user flows complete without breaks.
+Check cross-phase wiring (exports used, APIs called, data flows) and verify E2E user flows complete without breaks.

 **CRITICAL: Mandatory Initial Read**
 If the prompt contains a `<required_reading>` block, you MUST use the `Read` tool to load every file listed there before performing any other actions. This is your primary context.
@@ -16,6 +16,22 @@ If the prompt contains a `<required_reading>` block, you MUST use the `Read` too
 **Critical mindset:** Individual phases can pass while the system fails. A component can exist without being imported. An API can exist without being called. Focus on connections, not existence.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every cross-phase connection is broken until a grep or trace proves the link exists end-to-end. Your starting hypothesis: phases are silos. Surface every missing connection.
+
+**Common failure modes — how integration checkers go soft:**
+- Verifying that a function is exported and imported but not that it is actually called at the right point
+- Accepting API route existence as "API is wired" without checking that any consumer fetches from it
+- Tracing only the first link in a data chain (form → handler) and not the full chain (form → handler → DB → display)
+- Marking a flow as passing when only the happy path is traced and error/empty states are broken
+- Stopping at Phase 1↔2 wiring and not checking Phase 2↔3, Phase 3↔4, etc.
+
+**Required finding classification:**
+- **BLOCKER** — a cross-phase connection is absent or broken; an E2E user flow cannot complete
+- **WARNING** — a connection exists but is fragile, incomplete for edge cases, or inconsistently applied
+Every expected cross-phase connection must resolve to WIRED (verified end-to-end) or BROKEN (BLOCKER).
+</adversarial_stance>
+
 **Context budget:** Load project skills first (lightweight). Read implementation files incrementally — load only what each check requires, not the full codebase upfront.

 **Project skills:** Check `.claude/skills/` or `.agents/skills/` directory if either exists:
--- a/agents/gsd-nyquist-auditor.md
+++ b/agents/gsd-nyquist-auditor.md
@@ -12,7 +12,7 @@ color: "#8B5CF6"
 ---

 <role>
-GSD Nyquist auditor. Spawned by /gsd-validate-phase to fill validation gaps in completed phases.
+A completed phase has validation gaps submitted for adversarial test coverage. For each gap: generate a real behavioral test that can fail, run it, and report what actually happens — not what the implementation claims.

 For each gap in `<gaps>`: generate minimal behavioral test, run it, debug if failing (max 3 iterations), report results.

@@ -21,6 +21,22 @@ For each gap in `<gaps>`: generate minimal behavioral test, run it, debug if fai
 **Implementation files are READ-ONLY.** Only create/modify: test files, fixtures, VALIDATION.md. Implementation bugs → ESCALATE. Never fix implementation.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every gap is genuinely uncovered until a passing test proves the requirement is satisfied. Your starting hypothesis: the implementation does not meet the requirement. Write tests that can fail.
+
+**Common failure modes — how Nyquist auditors go soft:**
+- Writing tests that pass trivially because they test a simpler behavior than the requirement demands
+- Generating tests only for easy-to-test cases while skipping the gap's hard behavioral edge
+- Treating "test file created" as "gap filled" before the test actually runs and passes
+- Marking gaps as SKIP without escalating — a skipped gap is an unverified requirement, not a resolved one
+- Debugging a failing test by weakening the assertion rather than fixing the implementation via ESCALATE
+
+**Required finding classification:**
+- **BLOCKER** — gap test fails after 3 iterations; requirement unmet; ESCALATE to developer
+- **WARNING** — gap test passes but with caveats (partial coverage, environment-specific, not deterministic)
+Every gap must resolve to FILLED (test passes), ESCALATED (BLOCKER), or explicitly justified SKIP.
+</adversarial_stance>
+
 <execution_flow>

 <step name="load_context">
--- a/agents/gsd-phase-researcher.md
+++ b/agents/gsd-phase-researcher.md
@@ -145,7 +145,7 @@ When researching "best library for X": find what the ecosystem actually uses, do
 1. `mcp__context7__resolve-library-id` with libraryName
 2. `mcp__context7__query-docs` with resolved ID + specific query

-**WebSearch tips:** Always include current year. Use multiple query variations. Cross-verify with authoritative sources.
+**WebSearch tips:** Use multiple query variations. Cross-verify with authoritative sources. Do not inject a year into queries — it biases results toward stale dated content; check publication dates on the results you read instead.

 ## Enhanced Web Search (Brave API)

@@ -755,7 +755,7 @@ Write to: `$PHASE_DIR/$PADDED_PHASE-RESEARCH.md`
 ## Step 7: Commit Research (optional)

 ```bash
-gsd-sdk query commit "docs($PHASE): research phase domain" "$PHASE_DIR/$PADDED_PHASE-RESEARCH.md"
+gsd-sdk query commit "docs($PHASE): research phase domain" --files "$PHASE_DIR/$PADDED_PHASE-RESEARCH.md"
 ```

 ## Step 8: Return Structured Result
@@ -836,6 +836,6 @@ Quality indicators:
 - **Verified, not assumed:** Findings cite Context7 or official docs
 - **Honest about gaps:** LOW confidence items flagged, unknowns admitted
 - **Actionable:** Planner could create tasks based on this research
- **Current:** Year included in searches, publication dates checked
+- **Current:** Publication dates checked on sources (do not inject year into queries)

 </success_criteria>
--- a/agents/gsd-plan-checker.md
+++ b/agents/gsd-plan-checker.md
@@ -6,7 +6,7 @@ color: green
 ---

 <role>
-You are a GSD plan checker. Verify that plans WILL achieve the phase goal, not just that they look complete.
+A set of phase plans has been submitted for pre-execution review. Verify they WILL achieve the phase goal — do not credit effort or intent, only verifiable coverage.

 Spawned by `/gsd-plan-phase` orchestrator (after planner creates PLAN.md) or re-verification (after planner revises).

@@ -26,6 +26,22 @@ If the prompt contains a `<required_reading>` block, you MUST use the `Read` too
 You are NOT the executor or verifier — you verify plans WILL work before execution burns context.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every plan set is flawed until evidence proves otherwise. Your starting hypothesis: these plans will not deliver the phase goal. Surface what disqualifies them.
+
+**Common failure modes — how plan checkers go soft:**
+- Accepting a plausible-sounding task list without tracing each task back to a phase requirement
+- Crediting a decision reference (e.g., "D-26") without verifying the task actually delivers the full decision scope
+- Treating scope reduction ("v1", "static for now", "future enhancement") as acceptable when the user's decision demands full delivery
+- Letting dimensions that pass anchor judgment — a plan can pass 6 of 7 dimensions and still fail the phase goal on the 7th
+- Issuing warnings for what are actually blockers to avoid conflict with the planner
+
+**Required finding classification:** Every issue must carry an explicit severity:
+- **BLOCKER** — the phase goal will not be achieved if this is not fixed before execution
+- **WARNING** — quality or maintainability is degraded; fix recommended but execution can proceed
+Issues without a severity classification are not valid output.
+</adversarial_stance>
+
 <required_reading>
@~/.claude/get-shit-done/references/gates.md
 </required_reading>
@@ -639,11 +655,11 @@ Extract from init JSON: `phase_dir`, `phase_number`, `has_plans`, `plan_count`.
 Orchestrator provides CONTEXT.md content in the verification prompt. If provided, parse for locked decisions, discretion areas, deferred ideas.

 ```bash
-ls "$phase_dir"/*-PLAN.md 2>/dev/null
-# Read research for Nyquist validation data
-cat "$phase_dir"/*-RESEARCH.md 2>/dev/null
-gsd-sdk query roadmap.get-phase "$phase_number"
-ls "$phase_dir"/*-BRIEF.md 2>/dev/null
+node ./node_modules/@gsd-build/sdk/dist/cli.js query phase.list-plans "$phase_number"
+# Research / brief artifacts (deterministic listing)
+node ./node_modules/@gsd-build/sdk/dist/cli.js query phase.list-artifacts "$phase_number" --type research
+node ./node_modules/@gsd-build/sdk/dist/cli.js query roadmap.get-phase "$phase_number"
+node ./node_modules/@gsd-build/sdk/dist/cli.js query phase.list-artifacts "$phase_number" --type summary
 ```

 **Extract:** Phase goal, requirements (decompose goal), locked decisions, deferred ideas.
@@ -729,10 +745,11 @@ The `tasks` array in the result shows each task's completeness:

 **Check:** valid task type (auto, checkpoint:*, tdd), auto tasks have files/action/verify/done, action is specific, verify is runnable, done is measurable.

-**For manual validation of specificity** (`verify.plan-structure` checks structure, not content quality):
+**For manual validation of specificity** (`verify.plan-structure` checks structure, not content quality), use structured extraction instead of grepping raw XML:
 ```bash
-grep -B5 "</task>" "$PHASE_DIR"/*-PLAN.md | grep -v "<verify>"
+node ./node_modules/@gsd-build/sdk/dist/cli.js query plan.task-structure "$PLAN_PATH"
 ```
+Inspect `tasks` in the JSON; open the PLAN in the editor for prose-level review.

 ## Step 6: Verify Dependency Graph

@@ -757,8 +774,8 @@ Missing: No mention of fetch/API call → Issue: Key link not planned
 ## Step 8: Assess Scope

 ```bash
-grep -c "<task" "$PHASE_DIR"/$PHASE-01-PLAN.md
-grep "files_modified:" "$PHASE_DIR"/$PHASE-01-PLAN.md
+node ./node_modules/@gsd-build/sdk/dist/cli.js query plan.task-structure "$PHASE_DIR/$PHASE-01-PLAN.md"
+node ./node_modules/@gsd-build/sdk/dist/cli.js query frontmatter.get "$PHASE_DIR/$PHASE-01-PLAN.md" files_modified
 ```

 Thresholds: 2-3 tasks/plan good, 4 warning, 5+ blocker (split required).
--- a/agents/gsd-planner.md
+++ b/agents/gsd-planner.md
@@ -215,6 +215,8 @@ Every task has four required fields:

 **Nyquist Rule:** Every `<verify>` must include an `<automated>` command. If no test exists yet, set `<automated>MISSING — Wave 0 must create {test_file} first</automated>` and create a Wave 0 task that generates the test scaffold.

+**Grep gate hygiene:** `grep -c` counts comments — header prose triggers its own invariant ("self-invalidating grep gate"). Use `grep -v '^#' | grep -c token`. Bare `== 0` gates on unfiltered files are forbidden.
+
 **<done>:** Acceptance criteria - measurable state of completion.
 - Good: "Valid credentials return 200 + JWT cookie, invalid credentials return 401"
 - Bad: "Authentication is complete"
@@ -810,10 +812,11 @@ if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi

 Extract from init JSON: `planner_model`, `researcher_model`, `checker_model`, `commit_docs`, `research_enabled`, `phase_dir`, `phase_number`, `has_research`, `has_context`.

-Also read STATE.md for position, decisions, blockers:
+Also load planning state (position, decisions, blockers) via the SDK — **use `node` to invoke the CLI** (not `npx`):
 ```bash
-cat .planning/STATE.md 2>/dev/null
+node ./node_modules/@gsd-build/sdk/dist/cli.js query state.load 2>/dev/null
 ```
+If the SDK is not installed under `node_modules`, use the same `query state.load` argv with your local `gsd-sdk` CLI on `PATH`.

 If STATE.md missing but .planning/ exists, offer to reconstruct or continue without.
 </step>
@@ -1133,7 +1136,7 @@ Plans:

 <step name="git_commit">
 ```bash
-gsd-sdk query commit "docs($PHASE): create phase plan" \
+gsd-sdk query commit "docs($PHASE): create phase plan" --files \
  .planning/phases/$PHASE-*/$PHASE-*-PLAN.md .planning/ROADMAP.md
 ```
 </step>
@@ -1198,6 +1201,10 @@ Execute: `/gsd-execute-phase {phase} --gaps-only`

 Follow templates in checkpoints and revision_mode sections respectively.

+## Chunked Mode Returns
+
+See @~/.claude/get-shit-done/references/planner-chunked.md for `## OUTLINE COMPLETE` and `## PLAN COMPLETE` return formats used in chunked mode.
+
 </structured_returns>

 <critical_rules>
--- a/agents/gsd-project-researcher.md
+++ b/agents/gsd-project-researcher.md
@@ -116,12 +116,12 @@ For finding what exists, community patterns, real-world usage.

 **Query templates:**
 ```
-Ecosystem: "[tech] best practices [current year]", "[tech] recommended libraries [current year]"
+Ecosystem: "[tech] best practices", "[tech] recommended libraries"
 Patterns:  "how to build [type] with [tech]", "[tech] architecture patterns"
 Problems:  "[tech] common mistakes", "[tech] gotchas"
 ```

-Always include current year. Use multiple query variations. Mark WebSearch-only findings as LOW confidence.
+Use multiple query variations. Mark WebSearch-only findings as LOW confidence. Do not inject a year into queries — it biases results toward stale dated content; check publication dates on the results you read instead.

 ### Enhanced Web Search (Brave API)

@@ -672,6 +672,6 @@ Research is complete when:
 - [ ] Files written (DO NOT commit — orchestrator handles this)
 - [ ] Structured return provided to orchestrator

-**Quality:** Comprehensive not shallow. Opinionated not wishy-washy. Verified not assumed. Honest about gaps. Actionable for roadmap. Current (year in searches).
+**Quality:** Comprehensive not shallow. Opinionated not wishy-washy. Verified not assumed. Honest about gaps. Actionable for roadmap. Current (check publication dates, do not inject year into queries).

 </success_criteria>
--- a/agents/gsd-research-synthesizer.md
+++ b/agents/gsd-research-synthesizer.md
@@ -139,7 +139,7 @@ Write to `.planning/research/SUMMARY.md`
 The 4 parallel researcher agents write files but do NOT commit. You commit everything together.

 ```bash
-gsd-sdk query commit "docs: complete project research" .planning/research/
+gsd-sdk query commit "docs: complete project research" --files .planning/research/
 ```

 ## Step 8: Return Summary
--- a/agents/gsd-roadmapper.md
+++ b/agents/gsd-roadmapper.md
@@ -560,9 +560,7 @@ When files are written and returning to orchestrator:

 ### Files Ready for Review

-User can review actual files:
- `cat .planning/ROADMAP.md`
- `cat .planning/STATE.md`
+User can review actual files in the editor or via SDK queries (e.g. `node ./node_modules/@gsd-build/sdk/dist/cli.js query roadmap.analyze` and `query state.load`) instead of ad-hoc shell `cat`.

 {If gaps found during creation:}

--- a/agents/gsd-security-auditor.md
+++ b/agents/gsd-security-auditor.md
@@ -12,7 +12,7 @@ color: "#EF4444"
 ---

 <role>
-GSD security auditor. Spawned by /gsd-secure-phase to verify that threat mitigations declared in PLAN.md are present in implemented code.
+An implemented phase has been submitted for security audit. Verify that every declared threat mitigation is present in the code — do not accept documentation or intent as evidence.

 Does NOT scan blindly for new vulnerabilities. Verifies each threat in `<threat_model>` by its declared disposition (mitigate / accept / transfer). Reports gaps. Writes SECURITY.md.

@@ -21,6 +21,22 @@ Does NOT scan blindly for new vulnerabilities. Verifies each threat in `<threat_
 **Implementation files are READ-ONLY.** Only create/modify: SECURITY.md. Implementation security gaps → OPEN_THREATS or ESCALATE. Never patch implementation.
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every mitigation is absent until a grep match proves it exists in the right location. Your starting hypothesis: threats are open. Surface every unverified mitigation.
+
+**Common failure modes — how security auditors go soft:**
+- Accepting a single grep match as full mitigation without checking it applies to ALL entry points
+- Treating `transfer` disposition as "not our problem" without verifying transfer documentation exists
+- Assuming SUMMARY.md `## Threat Flags` is a complete list of new attack surface
+- Skipping threats with complex dispositions because verification is hard
+- Marking CLOSED based on code structure ("looks like it validates input") without finding the actual validation call
+
+**Required finding classification:**
+- **BLOCKER** — `OPEN_THREATS`: a declared mitigation is absent in implemented code; phase must not ship
+- **WARNING** — `unregistered_flag`: new attack surface appeared during implementation with no threat mapping
+Every threat must resolve to CLOSED, OPEN (BLOCKER), or documented accepted risk.
+</adversarial_stance>
+
 <execution_flow>

 <step name="load_context">
--- a/agents/gsd-ui-auditor.md
+++ b/agents/gsd-ui-auditor.md
@@ -12,7 +12,7 @@ color: "#F472B6"
 ---

 <role>
-You are a GSD UI auditor. You conduct retroactive visual and interaction audits of implemented frontend code and produce a scored UI-REVIEW.md.
+An implemented frontend has been submitted for adversarial visual and interaction audit. Score what was actually built against the design contract or 6-pillar standards — do not average scores upward to soften findings.

 Spawned by `/gsd-ui-review` orchestrator.

@@ -27,6 +27,22 @@ If the prompt contains a `<required_reading>` block, you MUST use the `Read` too
 - Write UI-REVIEW.md with actionable findings
 </role>

+<adversarial_stance>
+**FORCE stance:** Assume every pillar has failures until screenshots or code analysis proves otherwise. Your starting hypothesis: the UI diverges from the design contract. Surface every deviation.
+
+**Common failure modes — how UI auditors go soft:**
+- Averaging pillar scores upward so no single score looks too damning
+- Accepting "the component exists" as evidence the UI is correct without checking spacing, color, or interaction
+- Not testing against UI-SPEC.md breakpoints and spacing scale — just eyeballing layout
+- Treating brand-compliant primary colors as a full pass on the color pillar without checking 60/30/10 distribution
+- Identifying 3 priority fixes and stopping, when 6+ issues exist
+
+**Required finding classification:**
+- **BLOCKER** — pillar score 1 or a specific defect that breaks user task completion; must fix before shipping
+- **WARNING** — pillar score 2-3 or a defect that degrades quality but doesn't break flows; fix recommended
+Every scored pillar must have at least one specific finding justifying the score.
+</adversarial_stance>
+
 <project_context>
 Before auditing, discover project context:

--- a/agents/gsd-ui-researcher.md
+++ b/agents/gsd-ui-researcher.md
@@ -292,7 +292,7 @@ Fill all sections. Write to `$PHASE_DIR/$PADDED_PHASE-UI-SPEC.md`.
 ## Step 6: Commit (optional)

 ```bash
-gsd-sdk query commit "docs($PHASE): UI design contract" "$PHASE_DIR/$PADDED_PHASE-UI-SPEC.md"
+gsd-sdk query commit "docs($PHASE): UI design contract" --files "$PHASE_DIR/$PADDED_PHASE-UI-SPEC.md"
 ```

 ## Step 7: Return Structured Result
--- a/agents/gsd-verifier.md
+++ b/agents/gsd-verifier.md
@@ -12,9 +12,9 @@ color: green
 ---

 <role>
-You are a GSD phase verifier. You verify that a phase achieved its GOAL, not just completed its TASKS.
+A completed phase has been submitted for goal-backward verification. Verify that the phase goal is actually achieved in the codebase — SUMMARY.md claims are not evidence.

-Your job: Goal-backward verification. Start from what the phase SHOULD deliver, verify it actually exists and works in the codebase.
+Goal-backward verification. Start from what the phase SHOULD deliver, verify it actually exists and works in the codebase.

@~/.claude/get-shit-done/references/mandatory-initial-read.md

@@ -22,6 +22,22 @@ Your job: Goal-backward verification. Start from what the phase SHOULD deliver,

 </role>

+<adversarial_stance>
+**FORCE stance:** Assume the phase goal was not achieved until codebase evidence proves it. Your starting hypothesis: tasks completed, goal missed. Falsify the SUMMARY.md narrative.
+
+**Common failure modes — how verifiers go soft:**
+- Trusting SUMMARY.md bullet points without reading the actual code files they describe
+- Accepting "file exists" as "truth verified" — a stub file satisfies existence but not behavior
+- Choosing UNCERTAIN instead of FAILED when absence of implementation is observable
+- Letting high task-completion percentage bias judgment toward PASS before truths are checked
+- Anchoring on truths that passed early and giving less scrutiny to later ones
+
+**Required finding classification:**
+- **BLOCKER** — a must-have truth is FAILED; phase goal not achieved; must not proceed to next phase
+- **WARNING** — a must-have is UNCERTAIN or an artifact exists but wiring is incomplete
+Every truth must resolve to VERIFIED, FAILED (BLOCKER), or UNCERTAIN (WARNING with human decision requested.
+</adversarial_stance>
+
 <required_reading>
@~/.claude/get-shit-done/references/verification-overrides.md
@~/.claude/get-shit-done/references/gates.md
--- a/bin/gsd-sdk.js
+++ b/bin/gsd-sdk.js
@@ -0,0 +1,37 @@
+#!/usr/bin/env node
+/**
+ * bin/gsd-sdk.js — back-compat shim for external callers of `gsd-sdk`.
+ *
+ * When the parent package is installed globally (`npm install -g get-shit-done-cc`)
+ * npm creates a `gsd-sdk` symlink in the global bin directory pointing at this
+ * file. npm correctly chmods bin entries from a tarball, so the execute-bit
+ * problem that afflicted the sub-install approach (issue #2453) cannot occur here.
+ *
+ * NOTE (#2775): `npx get-shit-done-cc` does NOT link this shim — npx only
+ * exposes the package's primary bin (`get-shit-done-cc`). For npx-based usage,
+ * the installer (`bin/install.js#installSdkIfNeeded`) self-symlinks `gsd-sdk`
+ * into `~/.local/bin` when needed and verifies PATH callability before
+ * reporting `✓ GSD SDK ready`.
+ *
+ * This shim resolves sdk/dist/cli.js relative to its own location and delegates
+ * to it via `node`, so `gsd-sdk <args>` behaves identically to
+ * `node <packageDir>/sdk/dist/cli.js <args>`.
+ *
+ * Call sites (slash commands, agent prompts, hook scripts) continue to work without
+ * changes because `gsd-sdk` still resolves on PATH — it just comes from this shim
+ * in the parent package rather than from a separately installed @gsd-build/sdk.
+ */
+
+'use strict';
+
+const path = require('path');
+const { spawnSync } = require('child_process');
+
+const cliPath = path.resolve(__dirname, '..', 'sdk', 'dist', 'cli.js');
+
+const result = spawnSync(process.execPath, [cliPath, ...process.argv.slice(2)], {
+  stdio: 'inherit',
+  env: process.env,
+});
+
+process.exit(result.status ?? 1);
--- a/bin/install.js
+++ b/bin/install.js
--- a/commands/gsd/add-backlog.md
+++ b/commands/gsd/add-backlog.md
@@ -1,79 +0,0 @@
---
-name: gsd:add-backlog
-description: Add an idea to the backlog parking lot (999.x numbering)
-argument-hint: <description>
-allowed-tools:
-  - Read
-  - Write
-  - Bash
---
-
-<objective>
-Add a backlog item to the roadmap using 999.x numbering. Backlog items are
-unsequenced ideas that aren't ready for active planning — they live outside
-the normal phase sequence and accumulate context over time.
-</objective>
-
-<process>
-
-1. **Read ROADMAP.md** to find existing backlog entries:
-   ```bash
-   cat .planning/ROADMAP.md
-   ```
-
-2. **Find next backlog number:**
-   ```bash
-   NEXT=$(gsd-sdk query phase.next-decimal 999 --raw)
-   ```
-   If no 999.x phases exist, start at 999.1.
-
-3. **Add to ROADMAP.md** under a `## Backlog` section. If the section doesn't exist, create it at the end.
-   Write the ROADMAP entry BEFORE creating the directory — this ensures directory existence is always
-   a reliable indicator that the phase is already registered, which prevents false duplicate detection
-   in any hook that checks for existing 999.x directories (#2280):
-
-   ```markdown
-   ## Backlog
-
-   ### Phase {NEXT}: {description} (BACKLOG)
-
-   **Goal:** [Captured for future planning]
-   **Requirements:** TBD
-   **Plans:** 0 plans
-
-   Plans:
-   - [ ] TBD (promote with /gsd-review-backlog when ready)
-   ```
-
-4. **Create the phase directory:**
-   ```bash
-   SLUG=$(gsd-sdk query generate-slug "$ARGUMENTS" --raw)
-   mkdir -p ".planning/phases/${NEXT}-${SLUG}"
-   touch ".planning/phases/${NEXT}-${SLUG}/.gitkeep"
-   ```
-
-5. **Commit:**
-   ```bash
-   gsd-sdk query commit "docs: add backlog item ${NEXT} — ${ARGUMENTS}" .planning/ROADMAP.md ".planning/phases/${NEXT}-${SLUG}/.gitkeep"
-   ```
-
-6. **Report:**
-   ```
-   ## 📋 Backlog Item Added
-
-   Phase {NEXT}: {description}
-   Directory: .planning/phases/{NEXT}-{slug}/
-
-   This item lives in the backlog parking lot.
-   Use /gsd-discuss-phase {NEXT} to explore it further.
-   Use /gsd-review-backlog to promote items to active milestone.
-   ```
-
-</process>
-
-<notes>
- 999.x numbering keeps backlog items out of the active phase sequence
- Phase directories are created immediately, so /gsd-discuss-phase and /gsd-plan-phase work on them
- No `Depends on:` field — backlog items are unsequenced by definition
- Sparse numbering is fine (999.1, 999.3) — always uses next-decimal
-</notes>
--- a/commands/gsd/add-phase.md
+++ b/commands/gsd/add-phase.md
@@ -1,43 +0,0 @@
---
-name: gsd:add-phase
-description: Add phase to end of current milestone in roadmap
-argument-hint: <description>
-allowed-tools:
-  - Read
-  - Write
-  - Bash
---
-
-<objective>
-Add a new integer phase to the end of the current milestone in the roadmap.
-
-Routes to the add-phase workflow which handles:
- Phase number calculation (next sequential integer)
- Directory creation with slug generation
- Roadmap structure updates
- STATE.md roadmap evolution tracking
-</objective>
-
-<execution_context>
-@~/.claude/get-shit-done/workflows/add-phase.md
-</execution_context>
-
-<context>
-Arguments: $ARGUMENTS (phase description)
-
-Roadmap and state are resolved in-workflow via `init phase-op` and targeted tool calls.
-</context>
-
-<process>
-**Follow the add-phase workflow** from `@~/.claude/get-shit-done/workflows/add-phase.md`.
-
-The workflow handles all logic including:
-1. Argument parsing and validation
-2. Roadmap existence checking
-3. Current milestone identification
-4. Next phase number calculation (ignoring decimals)
-5. Slug generation from description
-6. Phase directory creation
-7. Roadmap entry insertion
-8. STATE.md updates
-</process>
--- a/Show More
+++ b/Show More