52 Commits

Author SHA1 Message Date
Tom Boucher
8bc255c266 fix(workstream): normalize migration workstream names (#3269)
* fix(workstream): normalize migrate-name to valid slug

* docs(context): record workstream migrate-name slug invariant

* fix(catalog-cjs): balanced fallback for unknown profile (CR finding A)

profiles[profile] could return undefined for any profile key absent from
the catalog entry, causing downstream callers like formatAgentToModelMapAsTable
to crash on .length. Add ?? profiles.balanced fallback to match the SDK adapter.
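A minimal sketch of the fallback described above, assuming a catalog entry shaped as a plain object keyed by profile name. The `profiles` contents and the `aliasForProfile` helper name are illustrative, not taken from the repository:

```javascript
// Hypothetical catalog entry: only some profile keys are present.
const profiles = { quality: 'opus', balanced: 'sonnet' };

// Look up a profile; fall back to the balanced entry when the key is absent,
// so callers never receive undefined.
function aliasForProfile(entry, profile) {
  return entry[profile] ?? entry.balanced;
}

console.log(aliasForProfile(profiles, 'quality')); // opus
console.log(aliasForProfile(profiles, 'budget'));  // sonnet (fallback)
```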

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(sdk): anchor path resolution on import.meta.url not cwd (CR finding B)

resolve(process.cwd(), '..') breaks when Vitest is invoked from the repo root
because cwd is already the repo root and '..' goes one level above. Replace
with a file-relative path using fileURLToPath(new URL('../../../', import.meta.url))
anchored at the test file's location (sdk/src/query/).
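The file-relative resolution can be illustrated with the global `URL` class alone. This is a sketch under assumptions: the `file:///repo/...` URL stands in for `import.meta.url`, and the real test additionally converts the result with `fileURLToPath`:

```javascript
// Hypothetical test-file URL; inside an ES module this would be import.meta.url.
const testFileUrl = 'file:///repo/sdk/src/query/example.test.ts';

// Three levels up from sdk/src/query/ lands on the repo root,
// regardless of the directory the test runner was started from.
const repoRootUrl = new URL('../../../', testFileUrl);

console.log(repoRootUrl.pathname); // /repo/
```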

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: derive Group B runtime list from catalog (CR finding C)

Hardcoded ['kilo', 'cline', ...] throws TypeError if a runtime name is
removed from the catalog. Derive group B dynamically via
Object.keys(catalog.runtimeTierDefaults).filter(r => !r.opus) so the
test never goes stale and auto-covers future Group B additions.
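A sketch of deriving the list from catalog data, assuming each `runtimeTierDefaults` entry is an object and that Group B entries lack an `opus` default. The catalog shape, the `claude` key, and the model IDs are placeholders. The sketch filters on each entry's value via `Object.entries`, since a filter over the key strings alone would have no `opus` property to test:

```javascript
// Hypothetical catalog shape: Group A runtimes carry tier defaults, Group B entries do not.
const catalog = {
  runtimeTierDefaults: {
    claude: { opus: 'opus-id', sonnet: 'sonnet-id', haiku: 'haiku-id' },
    hermes: { opus: 'hermes-large-id' },
    kilo: {},
    cline: {},
  },
};

// Keep the runtimes whose entry has no opus default.
const groupB = Object.entries(catalog.runtimeTierDefaults)
  .filter(([, defaults]) => !defaults.opus)
  .map(([name]) => name);

console.log(groupB); // [ 'kilo', 'cline' ]
```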

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(workflow): add hermes to Step B runtime options (CR finding D)

hermes appears in the Group A built-in defaults table but was missing from
the AskUserQuestion options in Step B, forcing users to manually type it via
'Other (Group B or custom)'. Add explicit hermes entry for UI consistency.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(config): refresh dynamic_routing tier table; fix stale L671 (findings E+F)

Finding E: tier table was missing 6 heavy-tier agents and 15 standard/light
agents added by this PR. Updated all three rows to match catalog routingTier
assignments (33 agents total).

Finding F: removed stale '18 of 31' claim and agent enumeration; replaced
with accurate note that all 33 agents have explicit catalog entries. Updated
authoritative source pointers to model-catalog.cjs / model-catalog.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(core): add profile-fallback unit tests for quality and budget (CR nitpick G)

The PR introduced quality→opus and budget→haiku unknown-agent fallbacks but
only balanced→sonnet and inherit→inherit were tested. Add two tests covering
the remaining two branches to complete coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* adr: define planning workspace and worktree seam

* refactor(worktree): extract worktree safety policy module

* refactor(workstream): extract active workstream pointer store seam

* test(worktree): cover policy branch paths and persist seam guardrails

* refactor(worktree): centralize health inventory seam for W017

* fix(workspace): align SDK project path policy with CJS planningDir

* refactor(query): unify SDK planning path projection seam

* refactor(init): route workspace projection through planningPaths seam

* docs(adr): add SDK architecture and planning path ADRs

* refactor(worktree): deepen name, pointer, inventory, and config seams

* docs(config): harmonize claude-opus-4-6 to 4-7 in resolve_model_ids example (CR finding 2)

* fix(sdk): return undefined for model_profile='inherit' sentinel (CR finding 3)

* docs(adr): renumber conflicting 0003-sdk-package-seam-module to 0007, update seam-map reference (CR finding 4)

* fix(workstream): align CJS and SDK name validation to accept dots, guard path traversal via includes('..') (CR finding 5)

* fix(sdk): guard writeActiveWorkstream against non-existent workstream directory, k014/k031 parity (CR finding 6)

* chore(changeset): add #3269 changeset (CR finding 1 — proper changeset for this PR)

* docs(inventory): register 3 new CLI modules in INVENTORY.md/MANIFEST (active-workstream-store, workstream-name-policy, worktree-safety)

* fix(sdk): use relPlanningPath(workstream) in planningPaths, fix setActiveWorkstream/getActiveWorkstream name errors in workstream.ts

* fix(sdk): validate GSD_WORKSTREAM in planningPaths before use (#3269 regression)

planningPaths() called resolveWorkspaceContext() which returned GSD_WORKSTREAM
raw (no validation). An invalid value like '../evil' was used as effectiveWorkstream,
constructing a bad path; roadmapAnalyze() caught the ENOENT and returned an
error object without phase_count instead of the root ROADMAP result.

Fix: validate envCtx.workstream with validateWorkstreamName() in planningPaths()
before accepting it as effectiveWorkstream. Invalid env → null → root .planning/
fallback, preserving the bug-2791 contract: invalid GSD_WORKSTREAM is silently
ignored and falls back to the root context (phase_count: 0 for empty root ROADMAP).

The bug-2791 regression test now passes. No other call sites read GSD_WORKSTREAM
without validation: query-runtime-context.ts already validates; cli.ts already
validates; context-engine.ts takes a caller-validated workstream parameter.
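The validate-then-fall-back behavior can be sketched as follows. The validator's character set, the `planningDirFor` helper, and the `.planning/workstreams/<name>` layout are assumptions for illustration; only the `'..'` guard, the acceptance of dots, and the fallback to root `.planning/` come from the messages above:

```javascript
// Hypothetical validator: letters, digits, dots, underscores, hyphens; no '..' sequences.
function validateWorkstreamName(name) {
  return typeof name === 'string'
    && /^[A-Za-z0-9._-]+$/.test(name)
    && !name.includes('..');
}

// An invalid or missing env value becomes null, which selects the root .planning/ directory.
function planningDirFor(envWorkstream) {
  const effectiveWorkstream = validateWorkstreamName(envWorkstream) ? envWorkstream : null;
  return effectiveWorkstream
    ? `.planning/workstreams/${effectiveWorkstream}` // assumed layout
    : '.planning';
}

console.log(planningDirFor('../evil')); // .planning
console.log(planningDirFor('v1.2'));    // .planning/workstreams/v1.2
```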

Closes #3268 (regression introduced by #3269 workstream-name-policy work).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 00:15:04 -04:00
Tom Boucher
96806003c5 fix(#3229): shared model catalog source of truth for agent profiles + runtime tier defaults (#3230)
* docs(adr): add ADR-0003 model catalog module

* fix(#3229): add shared model catalog as source of truth for agent profiles and runtime tier defaults

Research / design (ADR-0003):
- Existing drift came from 4 independent model truths:
  1. CJS model-profiles.cjs
  2. SDK config-query.ts stale copy (18 agents)
  3. settings-advanced.md runtime tier table
  4. session-runner Claude-only profile map
- New design: one machine-readable Model Catalog Module in sdk/shared/
  that both packages ship and consume.

Implementation:
- sdk/shared/model-catalog.json — canonical source of truth for:
  - full 33-agent registry
  - per-agent golden (quality) alias + balanced/budget aliases
  - adaptive derivation from routingTier
  - agent→phaseType map
  - agent→dynamic-routing default tier map
  - runtime tier defaults for all supported runtimes
- get-shit-done/bin/lib/model-catalog.cjs — CJS adapter over the catalog
- sdk/src/model-catalog.ts — SDK adapter over the same catalog
- CJS model-profiles.cjs now re-exports derived data from model-catalog.cjs
- SDK config-query.ts now re-exports MODEL_PROFILES/VALID_PROFILES from
  model-catalog.ts instead of maintaining its own list
- sdk/src/query/helpers.ts runtime list now comes from the catalog (fixes hermes drift)
- sdk/src/session-runner.ts Claude profile→model-id mapping now resolves via catalog
- docs/CONFIGURATION.md + settings-advanced.md runtime tables updated to match catalog

Behavior changes:
- resolve-model now covers every shipped agent file on disk (33 agents)
- unknown-agent fallback is profile-semantic, not hardcoded sonnet:
  quality→opus, budget→haiku, balanced/adaptive→sonnet, inherit→inherit
- Group B runtimes remain known runtimes but do not get built-in tier defaults
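The profile-semantic fallback for unknown agents could look roughly like this. The `Map` and the `fallbackAlias` name are illustrative, and defaulting an unrecognized profile to the balanced alias is an assumption rather than documented behavior:

```javascript
// Fallback aliases per profile, as listed in the behavior changes above.
const UNKNOWN_AGENT_FALLBACK = new Map([
  ['quality', 'opus'],
  ['budget', 'haiku'],
  ['balanced', 'sonnet'],
  ['adaptive', 'sonnet'],
  ['inherit', 'inherit'],
]);

// Assumption: an unrecognized profile name is treated like balanced.
function fallbackAlias(profile) {
  return UNKNOWN_AGENT_FALLBACK.get(profile) ?? UNKNOWN_AGENT_FALLBACK.get('balanced');
}

console.log(fallbackAlias('quality')); // opus
console.log(fallbackAlias('budget'));  // haiku
```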

Tests (RED→GREEN):
- root tests: shipped agent files must equal MODEL_PROFILES keys
- sdk tests: shipped agent files must equal MODEL_PROFILES keys
- direct fix assertion: gsd-code-reviewer resolves to opus under quality with no unknown_agent
- runtime defaults parity test: settings-advanced.md + CONFIGURATION.md tables must match catalog
- helper tests: hermes included in SUPPORTED_RUNTIMES and getRuntimeConfigDir()

Closes #3229

* chore(changeset): update #3229 changeset pr field to 3230

* fix(ci): update inherit fallback expectations and inventory parity for model catalog
2026-05-08 21:25:37 -04:00
Tom Boucher
c0be29607a docs: v1.41.0 release documentation — CHANGELOG promotion, release notes, FEATURES update (#3219)
- Promote CHANGELOG [Unreleased] → [1.41.0] - 2026-05-07; add fresh [Unreleased] header
- Fix CONFIGURATION.md version labels: 'added in v1.40' → 'added in v1.41' for models and dynamic_routing
- Create docs/RELEASE-v1.41.0.md in compact v1.39.0 bullet format
- Rewrite docs/RELEASE-v1.40.0-rc.1.md to compact bullet format (removes wall-of-text entries)
- Add docs/FEATURES.md v1.41.0 section (features 126–131: per-phase models, dynamic routing, update banner, issue-driven orchestration, graphify staleness, MVP SDK verbs)
- Update docs/FEATURES.md TOC
- Trim README "Notable extras" table (highlight page, not a command menu)

Fixes #3218

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 00:19:26 -04:00
Tom Boucher
29eb8be06d feat(graphify): commit-based staleness from built_at_commit (#3170) (#3171)
* test(graphify): TDD-red design contract for #3170 commit-staleness signal

Captures the proposed extension to graphifyStatus() as 8 failing
assertions across 3 groups (git-aware, non-git, back-compat). Suite is
describe.skip()'d so npm test stays green on the branch — removing
.skip is the green-light moment when the enhancement is approved and
implementation lands.

Verified against safishamsi/graphify v0.7.0 release notes: the field
on graph.json is built_at_commit (full git HEAD), not commit_hash as
originally guessed in #3170. Tests assert against the verified name.

Design highlights captured in the file's docstring:
- Tri-state commit_stale (true/false/null) — null means "we don't
  know" (pre-v0.7 graph or no git), distinct from false ("known fresh")
- Argument-injection fence /^[0-9a-f]{4,40}$/i validates built_at_commit
  before it reaches `git` as an argv element
- Existing graphifyStatus() fields (node_count, edge_count, stale,
  age_hours, etc.) are unchanged — back-compat fenced

Per the issue's enhancement template: no PR will be opened until the
issue is labeled `approved-enhancement`.

Refs #3170
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(graphify): surface commit-based staleness from graphify v0.7+ built_at_commit

Closes #3170

graphify v0.7+ embeds built_at_commit (full git HEAD) into graph.json at
write time. GSD's existing graphifyStatus() ignored it; staleness was
mtime-only, which is a poor proxy for "does this graph reflect the
current code." A CI-built graph rebuilt minutes ago against an old
checkout reads as FRESH on mtime but is materially stale.

graphifyStatus() now returns four additional fields on the success path:

  built_at_commit   short hash from graph.built_at_commit, or null
  current_commit    short hash of git HEAD, or null when no git
  commits_behind    git rev-list --count <built>..HEAD, or null
  commit_stale      true | false | null

Tri-state on commit_stale is load-bearing. null means "we don't know"
(pre-v0.7 graph, non-git cwd, unreachable commit) — semantically
distinct from false ("known fresh"). Agents reading null should fall
back to mtime; reading false can confidently skip a rebuild.
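One way to read the tri-state contract, as a sketch: `commitStale` and `shouldRebuild` are hypothetical helpers, and treating the existing `stale` field as the mtime-based signal is an assumption:

```javascript
// commitsBehind: count from `git rev-list --count <built>..HEAD`, or null when unknown.
function commitStale(commitsBehind) {
  if (commitsBehind === null) return null; // "we don't know"
  return commitsBehind > 0;                // true = stale, false = known fresh
}

// A consumer falls back to the mtime-based `stale` field only when the answer is unknown.
function shouldRebuild(status) {
  if (status.commit_stale === null) return status.stale;
  return status.commit_stale;
}

console.log(shouldRebuild({ commit_stale: null, stale: true }));  // true (mtime fallback)
console.log(shouldRebuild({ commit_stale: false, stale: true })); // false (known fresh)
```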

Security: built_at_commit is on-disk and user-influenceable. Without
validation, a hostile value (e.g. "--upload-pack=evil") would reach git
as an argv element and be interpreted as an option. The
/^[0-9a-f]{4,40}$/i fence treats any value that does not match as absent.
spawnSync's array args (no shell) are defense in depth, not the boundary.
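The fence can be exercised in isolation. In this sketch the regex is the one quoted above, while the `sanitizeBuiltAtCommit` wrapper is a hypothetical name:

```javascript
// Only 4 to 40 hex characters are accepted; anything else is treated as absent.
const COMMIT_HASH_FENCE = /^[0-9a-f]{4,40}$/i;

function sanitizeBuiltAtCommit(value) {
  return typeof value === 'string' && COMMIT_HASH_FENCE.test(value) ? value : null;
}

console.log(sanitizeBuiltAtCommit('abc1234'));            // abc1234
console.log(sanitizeBuiltAtCommit('--upload-pack=evil')); // null
```

Because the pattern is anchored at both ends and admits no `-`, a value can never begin with an option prefix by the time it is passed to `git`.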

Skill (commands/gsd/graphify.md) Step 2b renders one conditional line:

  Source commit: abc1234 (5 commits behind HEAD)
  Source commit: abc1234 (current)
  Source commit: abc1234 (freshness unknown)

Pre-v0.7 graphs omit the line entirely — no confusing "Source commit:
unknown" rendered.
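The conditional line could be produced by a function along these lines. This is a sketch only: the skill itself is a markdown workflow, `renderSourceCommit` is a hypothetical helper, and singular/plural wording is not handled:

```javascript
// Returns the line to display, or null when the graph carries no built_at_commit.
function renderSourceCommit(status) {
  const hash = status.built_at_commit;
  if (!hash) return null; // pre-v0.7 graph: omit the line entirely
  if (status.commit_stale === null) return `Source commit: ${hash} (freshness unknown)`;
  if (status.commit_stale) return `Source commit: ${hash} (${status.commits_behind} commits behind HEAD)`;
  return `Source commit: ${hash} (current)`;
}

console.log(renderSourceCommit({ built_at_commit: 'abc1234', commits_behind: 5, commit_stale: true }));
// Source commit: abc1234 (5 commits behind HEAD)
```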

Also documents `graphify hook install` in docs/CONFIGURATION.md for
multi-dev teams who would otherwise hit graph.json merge conflicts on
parallel rebuilds (sub-enhancement 2 from #3170).

TDD red→green: tests/enh-3170-graphify-commit-staleness.test.cjs
(8 assertions across git-aware, non-git, back-compat) was committed
describe.skip()'d in c567f23d when the issue was filed; this commit
removes .skip and lands the implementation that makes them green.
Full suite 7503/7503 passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 11:59:53 -04:00
Tom Boucher
858c821829 docs: sweep stale /gsd-* command references across all user-facing docs
Replace 30 absorbed/deleted standalone command forms with their
consolidated flag-based equivalents across 25 files (English + 4
locales + AGENTS/CLI-TOOLS/CONFIGURATION):

  /gsd-session-report        → /gsd-pause-work --report
  /gsd-list-phase-assumptions → /gsd-discuss-phase --assumptions
  /gsd-analyze-dependencies  → /gsd-manager --analyze-deps
  /gsd-research-phase        → /gsd-plan-phase --research-phase
  /gsd-plan-milestone-gaps   → /gsd-audit-milestone
  /gsd-code-review-fix       → /gsd-code-review --fix
  /gsd-spike-wrap-up         → /gsd-spike --wrap-up
  /gsd-sketch-wrap-up        → /gsd-sketch --wrap-up
  /gsd-set-profile           → /gsd-config --profile
  /gsd-check-todos           → /gsd-capture --list
  /gsd-add-todo              → /gsd-capture
  /gsd-add-backlog           → /gsd-capture --backlog
  /gsd-plant-seed            → /gsd-capture --seed
  /gsd-note                  → /gsd-capture --note
  /gsd-add-phase             → /gsd-phase
  /gsd-insert-phase          → /gsd-phase --insert
  /gsd-edit-phase            → /gsd-phase --edit
  /gsd-remove-phase          → /gsd-phase --remove
  /gsd-new-workspace         → /gsd-workspace --new
  /gsd-list-workspaces       → /gsd-workspace --list
  /gsd-remove-workspace      → /gsd-workspace --remove
  /gsd-sync-skills           → /gsd-update --sync
  /gsd-reapply-patches       → /gsd-update --reapply
  /gsd-scan                  → /gsd-map-codebase --fast
  /gsd-intel                 → /gsd-map-codebase --query
  /gsd-next                  → /gsd-progress --next
  /gsd-do                    → /gsd-progress --do
  /gsd-status                → /gsd-progress
  /gsd-join-discord          → /gsd-help

Skipped: CHANGELOG, RELEASE notes, superpowers/specs (historical)
Suite: 6971/6971 pass
2026-05-05 11:01:15 -04:00
Tom Boucher
eb365f7336 docs: audit and update docs/ for v1.40.0 release (#3048)
* docs(en): update FEATURES/USER-GUIDE/COMMANDS for v1.40.0 surface

- FEATURES.md: append v1.40.0 section (#122 skill consolidation, #123
  namespace meta-skills, #124 context-window guard, #125 phase-lifecycle
  status-line read-side); add to TOC.
- USER-GUIDE.md: add slash-command form (hyphen vs colon) primer and
  namespace routing primer; replace deleted slash forms in walkthroughs
  (`/gsd-add-backlog`, `/gsd-plant-seed`, `/gsd-add-phase`,
  `/gsd-set-profile`, `/gsd-list-workspaces`, etc.) with consolidated
  forms (`/gsd-capture --backlog`, `/gsd-phase --insert`,
  `/gsd-config --profile`, `/gsd-workspace --list`, etc.); fix
  `/gsd-spike-wrap-up` and `/gsd-sketch-wrap-up` to flag form.
- COMMANDS.md: clarify Command Syntax (Gemini = colon form, others =
  hyphen form); add Namespace Meta-Skills section with all six routers;
  add `--context` to /gsd-health flag table.

Refs #3047

* docs(en): refresh INVENTORY/CLI-TOOLS/STATE-MD-LIFECYCLE for v1.40.0

- INVENTORY.md: workflow-row "Invoked by" column updated to point at
  consolidated commands (`/gsd-phase` family, `/gsd-workspace --list`,
  `/gsd-config --advanced/--integrations/--profile`,
  `/gsd-sketch --wrap-up`, `/gsd-spike --wrap-up`); CLI-modules row for
  `secrets.cjs` updated to `/gsd-config --integrations`. Command count
  and namespace meta-skills section already reflect 65 shipped (= 59
  consolidated sub-skills + 6 ns-* routers).
- CLI-TOOLS.md: add `validate context` row under Validation Commands
  with the 60%/70% threshold envelope used by `/gsd-health --context`.
- STATE-MD-LIFECYCLE.md: flip status header from "proposed" to
  "shipped in v1.40.0" since `parseStateMd()` and `formatGsdState()`
  now read and render `active_phase`, `next_action`, `next_phases`,
  and `progress`.

`docs/AGENTS.md` audited and verified clean — `gsd-code-fixer` row
already lists the correct `/gsd-code-review --fix` spawner; no
deleted-skill references found. `docs/INVENTORY-MANIFEST.json`
audited and verified clean — already enumerates the 65 commands
(including six ns-* routers) and contains no deleted slash forms.

Refs #3047

* docs(en): cleanup ARCHITECTURE/CONFIGURATION for v1.40.0

- ARCHITECTURE.md: split Commands install-target list to call out the
  Gemini colon form (`/gsd:command-name`) vs hyphen form for every
  other runtime. Add a new subsection covering two-stage hierarchical
  routing via the six namespace meta-skills (#2792) and a paired note
  on the MCP token-budget interaction so readers see the two big
  per-turn cost levers in one place.
- CONFIGURATION.md: rewrite three references to the deleted
  `/gsd-settings-advanced` and `/gsd-settings-integrations` slash
  forms to use the consolidated `/gsd-config --advanced` /
  `/gsd-config --integrations` invocations. Add a new "STATE.md
  Frontmatter (Phase Lifecycle)" section documenting the four
  optional fields (`active_phase`, `next_action`, `next_phases`,
  `progress`) read by the v1.40 status-line, with a pointer to
  STATE-MD-LIFECYCLE.md for the full reference.

`docs/manual-update.md` audited and verified clean — already documents
`/gsd-update --reapply` (the consolidated form), no reference to the
deleted `/gsd-reapply-patches`.

Refs #3047

* docs(i18n): mirror v1.40.0 slash-command rename into ja-JP/ko-KR/zh-CN/pt-BR

Mechanical token-level renames only — every reference to a deleted
micro-skill slash form is rewritten to the consolidated form on the
matching parent skill. No prose was machine-translated; new prose
sections (slash-form primer, namespace routing primer, v1.40 feature
entries, STATE.md frontmatter) were left for human translator
follow-up.

Renames applied uniformly across all four trees:
  /gsd-add-todo, /gsd-add-note, /gsd-add-backlog,
  /gsd-plant-seed, /gsd-check-todos      → /gsd-capture[ --note|
                                            --backlog|--seed|--list]
  /gsd-add-phase, /gsd-insert-phase,
  /gsd-remove-phase, /gsd-edit-phase     → /gsd-phase[ --insert|
                                            --remove|--edit]
  /gsd-new-workspace, /gsd-list-workspaces,
  /gsd-remove-workspace                  → /gsd-workspace[ --new|
                                            --list|--remove]
  /gsd-settings-advanced,
  /gsd-settings-integrations,
  /gsd-set-profile                       → /gsd-config[ --advanced|
                                            --integrations|--profile]
  /gsd-sketch-wrap-up                    → /gsd-sketch --wrap-up
  /gsd-spike-wrap-up                     → /gsd-spike --wrap-up
  /gsd-reapply-patches                   → /gsd-update --reapply
  /gsd-code-review-fix                   → /gsd-code-review --fix
  /gsd-plan-milestone-gaps               → /gsd-audit-milestone

Refs #3047

* docs(changelog): regroup [Unreleased] under Feature/Enhancement/Fix

Replace the existing Keep-a-Changelog `Added` / `Changed` /
`Performance` / `Removed` / `Fixed` sub-headers in the [Unreleased]
block with the issue/PR template taxonomy:

  Added                 → Feature
  Changed / Performance → Enhancement
  Removed               → Enhancement
  Fixed                 → Fix

Order within the release: Feature → Enhancement → Fix. Every bullet
preserved verbatim — only headers and grouping changed; the awkward
inline-versioned headers (`### Added — 1.40.0-rc.1`,
`### Changed — 1.40.0-rc.1`, `### Fixed — 1.40.0-rc.1`) folded into
the same buckets with the `— 1.40.0-rc.1` suffix dropped, since the
[Unreleased] block IS 1.40.0-rc.1.

The [1.39.2] hotfix block called out in #3047's spec does not yet
exist in CHANGELOG.md (the previously released hotfix is [1.39.1]),
so this commit only regroups [Unreleased]. Older release blocks
([1.39.1] and earlier) are frozen and untouched.

Refs #3047

* docs(changeset): add fragment for v1.40.0 doc audit

Refs #3047

* docs(en): strip leading / from deleted slash-command tokens in FEATURES

REQ-CONSOLIDATE-03 and REQ-CONSOLIDATE-04 listed deleted commands by
their `/gsd-foo` form for the historical record. The docs-parity tests
in bug-3010, bug-3029-3034, and bug-3042-3044 use the regex
`/\/gsd-[a-z0-9][a-z0-9-]*/g` to scan user-facing surfaces for any
remaining mention of removed slash forms — they cannot tell prose
about a deleted command from a live recommendation.

Strip the leading slash from the bare-name references (preserve the
historical text otherwise). Tests now require a `/` prefix to match,
so `gsd-add-todo` reads identically to a human but no longer trips
the parser.
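The effect of dropping the leading slash can be checked against the quoted regex directly. The two sample strings are made up for illustration:

```javascript
// The docs-parity pattern quoted above: a match requires the leading '/'.
const SLASH_COMMAND = /\/gsd-[a-z0-9][a-z0-9-]*/g;

const before = 'Deleted: /gsd-add-todo, /gsd-note';
const after = 'Deleted: gsd-add-todo, gsd-note';

console.log(before.match(SLASH_COMMAND)); // [ '/gsd-add-todo', '/gsd-note' ]
console.log(after.match(SLASH_COMMAND));  // null
```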

Verified locally: 65/65 tests pass across the three docs-parity
suites that were red on CI run 25270072600.

Refs #3047

* docs(en): fix CR feedback + drop literal /gsd:plan-phase from USER-GUIDE

CI: tests/bug-2543-gsd-slash-namespace.test.cjs flagged
docs/USER-GUIDE.md:35 for embedding the literal `/gsd:plan-phase`
token in the parenthetical Gemini-form example. The test scans every
.md under docs/ for `/gsd:<live-cmd>` because non-Gemini surfaces must
not advertise the colon form. Replaced the literal example with a
prose substitution rule.

CR: docs/ARCHITECTURE.md:125 — the namespace meta-skills were listed
by file-prefix (`gsd-ns-workflow`) but the invocable frontmatter `name:`
is the bare form (`gsd-workflow`). Verified against the six
`commands/gsd/ns-*.md` files. Replaced with the canonical names and
noted the file/name disagreement in-line.

CR: docs/COMMANDS.md:723 — `v1.40` aligned to canonical `v1.40.0`.

CR: docs/FEATURES.md:2679 — REQ-CTX-GUARD-02 advertised the wrong
invocation (`gsd-tools validate context`). The shipped handler is
exposed via `gsd-sdk query validate.context` and requires explicit
`--tokens-used <int>` + `--context-window <int>` flags (verified
against sdk/src/query/validate.ts:849-882 and
get-shit-done/bin/lib/validate-command-router.cjs:19-36).

CR: docs/zh-CN/README.md:533 — added `inherit` to the profile-options
parenthetical to match the canonical set (verified against
model-profiles.cjs:29 `VALID_PROFILES = […MODEL_PROFILES['gsd-planner'], 'inherit']`).

Verified locally: 74/74 tests pass across the four docs-parity suites
that were red on CI runs 25270072600 and 25270182903.

Refs #3047
2026-05-03 07:33:27 -04:00
Tom Boucher
7714b5244b fix(workflows,docs): scrub stale /gsd-code-review-fix and /gsd-plan-milestone-gaps refs (#3029, #3034) (#3038)
* fix(workflows,docs): scrub stale /gsd-code-review-fix and /gsd-plan-milestone-gaps refs (#3029, #3034)

#2790 consolidated /gsd-code-review-fix into /gsd-code-review --fix and
deleted /gsd-plan-milestone-gaps in favor of inline gap planning as part
of /gsd-audit-milestone's output. The deletion was propagated through
some surfaces (#2950 covered help/do/settings/discuss-phase/etc.) but
several user-facing surfaces still emitted the old forms:

#3029 — /gsd-code-review-fix references in:
- agents/gsd-code-fixer.md (description, "Spawned by", recovery prose)
- get-shit-done/workflows/code-review.md (offer text)
- get-shit-done/workflows/execute-phase.md (offer text)
- get-shit-done/workflows/code-review-fix.md (internal retry hints)
- docs/INVENTORY.md (agent + workflow rows)
- docs/CONFIGURATION.md (workflow.code_review row)
- docs/USER-GUIDE.md (3 occurrences in walkthrough)
- docs/AGENTS.md (gsd-code-fixer agent stub)
- docs/FEATURES.md (commands list + REQ-REVIEW-04)

All replaced with /gsd-code-review --fix. Internal retry hints in the
workflow file itself updated to point at the new form. Release notes
(docs/RELEASE-*.md) and gsd-ns-review's "absorbed by" deletion note
left unchanged — historical/explanatory content.

#3034 — /gsd-plan-milestone-gaps references in:
- get-shit-done/workflows/audit-milestone.md (<offer_next> blocks for
  gaps_found and tech_debt: lines 281, 323)
- commands/gsd/complete-milestone.md (gaps_found pre-flight: lines 46, 57)

Replaced with inline closure path:
  /gsd-phase --insert <N> "Close gap: <REQ-ID> ..."
  /gsd-discuss-phase <N>
  /gsd-plan-phase <N>
  /gsd-execute-phase <N>

Plus a Nyquist-coverage hint pointing at /gsd-validate-phase /
/gsd-secure-phase for retroactive audit-chain hygiene gaps. The
gsd-ns-project SKILL.md "deleted by #2790" note is preserved
(it's the canonical pointer for future readers asking what
happened to the command).

Tests:
- tests/bug-3029-3034-stale-command-routes.test.cjs — parser-based
  assertions per fixed surface, plus a structural cross-check that
  gsd-ns-project keeps the deletion note. 15 tests, all green.
- 6905/6905 full suite passes.

Closes #3029
Closes #3034

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: address CR feedback on PR #3038 — argument order, structural tests, agent count

CR findings on PR #3038:

1. **docs/USER-GUIDE.md (Major)** — `--fix` examples used flag-first form
   (`/gsd-code-review --fix 3`), but the supported CLI grammar is
   phase-first (`/gsd-code-review 3 --fix`). The original sed-based
   replacement preserved the position of the `gsd-code-review-fix`
   token, producing the wrong order. Fixed in USER-GUIDE.md (3
   occurrences) and the same drift in the workflow surfaces:
   - get-shit-done/workflows/code-review-fix.md (2 retry hints)
   - get-shit-done/workflows/code-review.md (offer text)
   - get-shit-done/workflows/execute-phase.md (offer text)

2. **docs/AGENTS.md (Minor)** — internal count drift: line 483 said
   "Ten additional agents" but line 725 said "12 advanced/specialized".
   Filesystem reality: 33 agents total, 21 primary, 12 specialized
   (count of `### ` stubs in the Advanced and Specialized section).
   Updated lines 3, 13, 483 to use 12/33 and added the two missing
   names (doc-classifier, doc-synthesizer) to the inline list at
   line 13.

3. **tests:94 (Major refactor suggestion)** — `.includes()` token checks
   were source-grep style. Refactored to a typed-IR pattern: extract
   the SET of slash-command tokens via regex, assert membership on the
   parsed Set instead of substring scanning the raw file text. Added
   the `allow-test-rule` comment explaining the IR-build vs
   IR-assertion split per scripts/lint-no-source-grep.cjs convention.

4. **tests:130 (Major)** — replacement-path assertion was file-wide and
   could false-pass on generic mentions of "inline" elsewhere in the
   file. Refactored: `extractOfferBlocks(content)` returns the typed
   list of `<offer_next>` and "Pre-flight" blocks where the deleted
   command previously lived, and the assertion runs against those
   blocks specifically. Now requires `/gsd-phase --insert` or
   inline-audit prose to appear in the same offer block, not just
   somewhere in the file.
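A rough sketch of the block-scoped assertion from findings 3 and 4. It assumes literal `<offer_next>...</offer_next>` tags and ignores the "Pre-flight" blocks the real helper also handles; the regexes and the sample document are illustrative, not the test file's actual code:

```javascript
// Build the IR: the inner text of each <offer_next> block.
function extractOfferBlocks(content) {
  return [...content.matchAll(/<offer_next>([\s\S]*?)<\/offer_next>/g)].map((m) => m[1]);
}

// Build the IR: the set of slash-command tokens in a piece of text.
function slashCommands(text) {
  return new Set(text.match(/\/gsd-[a-z0-9][a-z0-9-]*/g) ?? []);
}

const doc = [
  'Prose elsewhere mentions inline planning.',
  '<offer_next>',
  'Run /gsd-phase --insert 4 "Close gap" and then /gsd-plan-phase 4',
  '</offer_next>',
].join('\n');

// Assert on the parsed Set for the specific block, not on the raw file text.
const tokens = slashCommands(extractOfferBlocks(doc)[0]);
console.log(tokens.has('/gsd-phase'));               // true
console.log(tokens.has('/gsd-plan-milestone-gaps')); // false
```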

15/15 targeted tests pass. Full suite passes 6905/6905. Lints clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-02 17:23:44 -04:00
Tom Boucher
e1d661ece0 feat(#3024): dynamic routing with failure-tier escalation (#3031)
* feat(#3024): dynamic routing with failure-tier escalation

Adds a `dynamic_routing` block to .planning/config.json that lets
the resolver start agents on a cheap tier and escalate one tier up
when the orchestrator detects a soft failure (verification
inconclusive, plan-check FLAG, etc.). Solves the "pay Opus rates as
insurance" anti-pattern by making escalation observed-quality-driven.

Architecture:
- AGENT_DEFAULT_TIERS map (light/standard/heavy) — every agent in
  MODEL_PROFILES declares a default tier; tests assert coverage
  so adding a new agent without updating the map fails CI.
- nextTier(currentTier) helper — light → standard → heavy → heavy
  (heavy stays at heavy; can't go further).
- resolveModelForTier(cwd, agentType, attempt) — new resolver. The
  orchestrator tracks the attempt counter and passes 0 for the
  first spawn, 1+ on escalation. The resolver caps internally at
  max_escalations so the orchestrator can blindly bump the counter.
- Schema validation: dynamic_routing.enabled / escalate_on_failure /
  max_escalations / tier_models.<light|standard|heavy>. Unknown
  tiers and unknown sub-keys rejected at config-set time.
- SDK schema mirror updated to keep CJS/SDK in lockstep (#2653).
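The tier ladder and the attempt cap described above can be sketched as follows. `nextTier` follows the stated behavior (heavy stays heavy, null on invalid input); `tierForAttempt` is a hypothetical helper showing how a capped attempt counter might map to a tier:

```javascript
const TIERS = ['light', 'standard', 'heavy'];

// light -> standard -> heavy -> heavy; null for an unknown tier.
function nextTier(currentTier) {
  const index = TIERS.indexOf(currentTier);
  if (index === -1) return null;
  return TIERS[Math.min(index + 1, TIERS.length - 1)];
}

// Hypothetical: apply at most maxEscalations steps, however high the attempt counter goes.
function tierForAttempt(defaultTier, attempt, maxEscalations = 1) {
  let tier = defaultTier;
  const steps = Math.min(attempt, maxEscalations);
  for (let i = 0; i < steps; i += 1) tier = nextTier(tier);
  return tier;
}

console.log(tierForAttempt('light', 0)); // light
console.log(tierForAttempt('light', 3)); // standard (capped at one escalation)
```

Capping inside the helper is what lets a caller increment the counter on every retry without tracking the limit itself.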

Resolution precedence (highest → lowest):
  1. model_overrides[<agent>]              (full IDs accepted)
  2. dynamic_routing.tier_models[<tier>]   (NEW; escalation-aware)
  3. models[<phase_type>]                  (#3023 phase-type map)
  4. model_profile                         (per-agent column)
  5. Runtime default
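The ladder reads as a first-match lookup. In the sketch below the config keys mirror the ones named above, while the function signature, the pre-resolved `profileModel` argument, and the sample agent and phase-type names are assumptions:

```javascript
// Walk the layers from highest to lowest precedence and return the first one with a value.
function resolveModel(config, { agent, phaseType, tier, profileModel, runtimeDefault }) {
  const dr = config.dynamic_routing ?? {};
  const layers = [
    ['model_overrides', config.model_overrides?.[agent]],
    ['dynamic_routing', dr.enabled ? dr.tier_models?.[tier] : undefined],
    ['models', config.models?.[phaseType]],
    ['model_profile', profileModel],
    ['runtime default', runtimeDefault],
  ];
  const [source, model] = layers.find(([, value]) => value != null) ?? [null, null];
  return { source, model };
}

// Hypothetical config and request.
const config = {
  model_overrides: { 'gsd-planner': 'opus' },
  dynamic_routing: { enabled: true, tier_models: { light: 'haiku' } },
  models: { planning: 'sonnet' },
};
const request = { phaseType: 'planning', tier: 'light', profileModel: 'sonnet', runtimeDefault: 'sonnet' };

console.log(resolveModel(config, { ...request, agent: 'gsd-planner' }).source);  // model_overrides
console.log(resolveModel(config, { ...request, agent: 'gsd-verifier' }).source); // dynamic_routing
```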

Backward compatibility: dynamic_routing is disabled by default
(enabled: false or block omitted). resolveModelForTier short-
circuits to resolveModelInternal in that case, so callers can
adopt unconditionally without breaking existing behavior.

This PR delivers the JS-layer infrastructure: schema + tier map +
resolver. Orchestrator adoption (workflow markdown updates that
detect soft failures and call resolveModelForTier with attempt+1)
is incremental follow-up — verifier / plan-checker / integration-
checker each adopt the protocol when ready.

Tests (23 cases, all structural-IR — no stdout grep):
- Schema invariants: AGENT_DEFAULT_TIERS coverage, VALID_AGENT_TIERS
  exact match, every assignment uses a valid tier
- nextTier helper: light→standard→heavy→heavy, null on invalid input
- Disabled mode: no block + enabled:false both no-op (back-compat)
- Enabled mode: attempt=0 returns default tier model, attempt=1
  escalates, beyond max_escalations caps, heavy agents stay heavy,
  default max_escalations=1 when omitted
- Precedence: per-agent override beats dynamic_routing,
  dynamic_routing beats phase-type models
- Validation: every settings key accepted, unknown tiers/sub-keys
  rejected, bare `dynamic_routing` rejected as config-set target

Documentation:
- get-shit-done/references/model-profiles.md — full reference section
- docs/CONFIGURATION.md — full settings table + escalation flow
- docs/USER-GUIDE.md — task-oriented "Cheap-by-default" section
- docs/FEATURES.md — config row cross-link

Verification:
- 23/23 pass on regression test
- 6843/6843 full suite (23 net new from 6820)
- lint-no-source-grep clean (376 test files)
- SDK schema mirror keeps CJS/SDK in sync per #2653 parity test

Closes #3024

* fix(#3024): honor escalate_on_failure:false + 3 CR follow-ups

CodeRabbit on PR #3031 (4 findings — 1 Major + 2 Minor + 1 Nitpick):

1. **Major (inline)** — get-shit-done/bin/lib/core.cjs:1668
   resolveModelForTier ignored dynamic_routing.escalate_on_failure.
   When the user set it to false, escalation should be disabled, but
   the resolver only checked attempt/max_escalations. An orchestrator
   that always passes attempt+1 on retry would silently escalate
   despite the user opting out.
   Fix: gate effectiveAttempt on `dr.escalate_on_failure !== false`
   so false short-circuits every attempt back to the default tier.

2. **Minor (inline)** — docs/CONFIGURATION.md:123-126
   The dynamic_routing rows in the Core Settings table had 4 cells
   instead of 5 (missing the Options column), breaking the table
   structure. Added explicit Options values for enabled / escalate_on_failure
   / max_escalations rows.

3. **Minor (outside-diff)** — references/model-profiles.md:179-195
   "Resolution Logic" sketch was pre-#3024 and didn't include
   dynamic_routing in the precedence ladder. Updated to a 6-step
   block with dynamic_routing at step 3 (between override and
   phase-type).

4. **Nitpick** — tests/feat-3024-dynamic-routing.test.cjs:189+
   Tests used `if (lightAgent) { ... }` guards that silent-pass
   when AGENT_DEFAULT_TIERS drifts. Replaced all 5 conditional
   skips with `assert.ok(lightAgent, '...')` preconditions so a
   tier-mapping change surfaces as a test failure.

Plus: 2 new regression tests for the Major fix:
- escalate_on_failure:false caps every attempt at default tier
- escalate_on_failure:true (explicit) still escalates normally

Verification:
- 25/25 pass on regression test (23 prior + 2 escalate_on_failure)
- 6845/6845 full suite (2 net new)
- lint-no-source-grep clean
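
The Major fix above can be sketched roughly as follows. This is an illustration under assumptions, not the code in core.cjs: the helper names `effectiveAttempt` and `tierForAttempt`, the `TIER_ORDER` array, the zero-based attempt count, and the fallback cap of 1 are all hypothetical. Only `escalate_on_failure`, `max_escalations`, and the `!== false` gate come from the description.

```javascript
// Hypothetical tier order, lowest to highest (assumption for this sketch).
const TIER_ORDER = ['light', 'standard', 'heavy'];

// Number of escalation steps to apply for a given attempt.
// `dr` stands in for the parsed dynamic_routing block; attempt 0 is the first try.
function effectiveAttempt(dr, attempt) {
  // Kill-switch: an explicit false pins every attempt to the default tier.
  if (dr.escalate_on_failure === false) return 0;
  // Assumed fallback cap of 1 when max_escalations is not an integer.
  const cap = Number.isInteger(dr.max_escalations) ? dr.max_escalations : 1;
  return Math.max(0, Math.min(attempt, cap));
}

// Tier to use for an agent whose default tier is `defaultTier`.
function tierForAttempt(dr, defaultTier, attempt) {
  const start = TIER_ORDER.indexOf(defaultTier);
  if (start === -1) throw new Error(`unknown tier: ${defaultTier}`);
  const index = Math.min(start + effectiveAttempt(dr, attempt), TIER_ORDER.length - 1);
  return TIER_ORDER[index];
}
```

With `escalate_on_failure: false`, an orchestrator that keeps passing attempt+1 still gets the default tier back on every call.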

* docs(#3024): align precedence + add fence language tags (CR follow-up)

CodeRabbit (3 minor):

1. docs/CONFIGURATION.md:691 — "Per-Phase-Type Models → Resolution
   precedence" was a 4-step block written pre-#3024; readers got
   contradictory rules between the per-phase-type section and the
   later dynamic_routing section. Updated to the same 5-step ladder
   with dynamic_routing at step 2, and noted that dynamic_routing
   is disabled by default so this section's behavior is unchanged
   when the kill-switch is off.

2. docs/CONFIGURATION.md:770 — escalation-flow code fence missing
   language tag (MD040). Added `text`.

3. references/model-profiles.md:184 — resolution-ladder code fence
   missing language tag (MD040). Added `text`.

No code changes; docs only. Verification: regression test still 25/25.

* docs(#3024): clarify precedence prose — five layers, not four (CR nitpick)

CodeRabbit nitpick: the "Per-Phase-Type Models → Resolution
precedence" prose said "The four layers compose..." but the ladder
above lists five (including Runtime default). Also "dynamic_routing
escalates per-attempt above all of them" misreads as suggesting
dynamic_routing wins over model_overrides — actually overrides still
win at step 1.

Reworded top-down so the precedence direction is unambiguous:
  - model_profile = base
  - models = phase-level override
  - dynamic_routing = per-attempt escalation
  - model_overrides = per-agent exception (top)
  - runtime default = fallback

No code changes; docs only.

* docs(#3024): note escalate_on_failure:false in escalation-flow diagram (CR)

CodeRabbit nitpick: the escalation-flow diagram in
docs/CONFIGURATION.md described the soft-failure → respawn →
tier_models[next_tier_up] path, but didn't surface the
`dynamic_routing.escalate_on_failure: false` kill-switch right next
to it. Users reading the flow diagram (which is the canonical place
to understand attempt behavior) wouldn't see that the kill-switch
overrides the soft-failure branch.

Added a one-paragraph note immediately after the flow listing,
before the tier-sequence example, so the kill-switch is visible
exactly where users decide whether escalation will happen.

No code changes; docs only.
2026-05-02 14:26:35 -04:00
Tom Boucher
d812c66020 feat(#3023): per-phase-type model map in .planning/config.json (#3030)
* feat(#3023): per-phase-type model map in .planning/config.json

Adds a new `models` block to .planning/config.json with six phase-type
slots (planning / discuss / research / execution / verification /
completion). Lets users express coarse tuning ("Opus for planning,
Sonnet for the rest") without learning the agent taxonomy.

Resolution precedence (highest → lowest):
  1. Per-agent `model_overrides[agent]`     (full IDs; targeted exception)
  2. Phase-type `models[phase_type]`         (NEW; tier alias)
  3. Profile table (`model_profile`)          (per-agent column)
  4. Runtime default

The three layers compose: `models` defaults a phase, `model_overrides`
carves an exception. Phase-type values are tier aliases (opus/sonnet/
haiku/inherit) so the runtime-resolution chain (#2517) stays correct
end-to-end without further branching.
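
A minimal sketch of the four-step ladder above, for orientation only. The table contents, the `gsd-planner` entry, the function name `resolveModel`, and the use of `null` to stand for "runtime default" are assumptions for illustration; the real resolver is resolveModelInternal in core.cjs and may differ in detail.

```javascript
// Hypothetical lookup tables (illustration only).
const AGENT_TO_PHASE_TYPE = { 'gsd-executor': 'execution', 'gsd-planner': 'planning' };
const PROFILE_TIERS = { balanced: { 'gsd-executor': 'sonnet', 'gsd-planner': 'opus' } };
const TIER_ALIASES = new Set(['opus', 'sonnet', 'haiku', 'inherit']);

function resolveModel(config, agent) {
  // 1. Per-agent override: a full model ID, returned as-is.
  const override = config.model_overrides?.[agent];
  if (override) return override;
  // 2. Phase-type slot: only a known tier alias counts; anything else falls through.
  const phaseTier = config.models?.[AGENT_TO_PHASE_TYPE[agent]];
  if (TIER_ALIASES.has(phaseTier)) return phaseTier;
  // 3. Profile table, per-agent column.
  const profileTier = PROFILE_TIERS[config.model_profile]?.[agent];
  if (profileTier) return profileTier;
  // 4. Runtime default (represented here as null).
  return null;
}
```

A config without a `models` block skips step 2 entirely, which is what keeps existing configs behaving as before.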

Implementation:
- model-profiles.cjs: new AGENT_TO_PHASE_TYPE map + VALID_PHASE_TYPES
  set. Each agent in MODEL_PROFILES gets one phase-type assignment;
  tests assert coverage so adding a new agent without updating the
  table fails CI.
- core.cjs (resolveModelInternal): inserts phase-type tier lookup
  between per-agent override and profile-derived tier. Skips runtime
  resolution when the resolved tier is 'inherit' (was previously gated
  only on profile === 'inherit'; phase-type can now produce inherit
  independently).
- core.cjs (loadConfig): pass `parsed.models` through both code paths
  so resolveModelInternal can read it.
- config-schema.cjs + sdk/src/query/config-schema.ts: dynamic-pattern
  validator accepts only the six known phase-types; unknown slots
  rejected at config-set time.

Backward compat: configs without `models` behave exactly as today.

Tests (15 cases, all structural-IR — no stdout grep):
- Schema: AGENT_TO_PHASE_TYPE coverage, VALID_PHASE_TYPES exact match
- Resolver: phase-type alone; per-agent override beats phase-type;
  phase-type beats profile; issue's full example; "inherit"; empty
  block is no-op; no block is no-op
- Validation: each of the 6 slots accepted; unknown slot rejected;
  bare `models` (no slot) rejected

Verification:
- 15/15 pass on new regression test
- 6808/6808 full suite (5 net new), 0 fail
- lint-no-source-grep clean across 375 test files

Closes #3023

* docs(#3023): document `models` per-phase-type config in user-facing docs

Adds `models` block coverage to the three user-facing docs that ship
with each release:

1. docs/CONFIGURATION.md
   - New "Per-Phase-Type Models" section between "Per-Agent Overrides"
     and "Non-Claude Runtimes" with:
       * full example mixing models + model_overrides
       * phase-type → agent mapping table
       * resolution-precedence pseudocode
       * accepted values (tier alias only)
       * "When to use which" decision matrix
       * validation behavior + example error
   - Added `"models": {}` to the Full Schema snippet
   - Added a row for `models.<phase_type>` to the config keys table
     (next to model_profile_overrides for adjacency)

2. docs/FEATURES.md
   - Added a row for models.<phase_type> in the Configurable Settings
     table (right under model_profile)
   - Cross-link to CONFIGURATION.md for the full surface

3. docs/USER-GUIDE.md
   - New task-oriented "Tuning model cost by phase" section above
     "Using Non-Claude Runtimes" — leads with the concrete config
     and shows the override pattern (one-shot phase + targeted exception)
   - Cross-link to CONFIGURATION.md

Verification:
- 29/29 pass on config-schema-docs-parity + docs-update + new feature
  test (parity-check passes, so the config-schema entry I added in the
  feature commit is now matched by a docs row)
- 6808/6808 full suite pass
- lint-no-source-grep clean

Doc style follows the same pattern used by the existing model_profile,
model_overrides, and model_profile_overrides sections — example-led,
table-backed, cross-referenced. Each doc surfaces the feature at the
right depth (reference / settings table / task guide).

* fix(#3023): mirror phase-type tier in resolveReasoningEffortInternal (CR Major)

CodeRabbit caught a real Codex correctness bug + 3 minor docs/test issues:

1. **Major (outside-diff)** — resolveReasoningEffortInternal in core.cjs
   derived its tier exclusively from the profile table, ignoring the
   models.<phase_type> override added in #3023. Failure mode on Codex:

     Config: model_profile=balanced, models.execution=opus, agent=gsd-executor
     resolveModelInternal:           tier=opus    → gpt-5.4
     resolveReasoningEffortInternal: tier=sonnet  → reasoning_effort=medium
                                                     ↑
                                                  WRONG — should be xhigh
                                                  (opus tier on Codex)

   The runtime received a mismatched (model, effort) pair. Mirrored the
   phase-type lookup from resolveModelInternal so both functions derive
   from the same tier source. 'inherit' phase-type returns null effort
   (no runtime entry maps to 'inherit'; let runtime decide).

2. Minor — .changeset/per-phase-type-models.md `pr: TBD` → `pr: 3030`.

3. Minor (outside-diff) — model-profiles.md "Resolution Logic" section
   omitted the new phase-type tier. Updated the 4-step block to a 5-step
   block including `models[phase_type]` between override and profile,
   plus a paragraph noting that `model` and `reasoning_effort` derive
   from the same tier source.

4. Nitpick — added 2 typo-safety tests:
   - models.research = "haiku3" (typo) → falls through to profile
   - models.research = "openai/gpt-5" (full ID) → falls through to profile
   Plus 5 new reasoning_effort tests covering the Major fix:
   - exported correctly
   - phase-type override flips both model AND effort to same tier
   - inherit phase-type returns null effort
   - per-agent override still bypasses phase-type for effort
   - claude runtime ignores models.* (no effort propagation)

Verification:
- 24/24 pass on regression test (15 original + 2 typo-safety + 5 effort + 2 outside-diff related)
- 6815/6815 full suite (7 net new from 6808)
- lint-no-source-grep clean

The reasoning_effort tests are written semantically (phase-type override
must produce the SAME effort as a profile-only opus config) rather than
hard-coding tier-specific effort strings, so changes to the runtime tier
map don't break them.

* fix(#3023): phase-type override beats profile=inherit (CR Major round 2)

CodeRabbit caught another precedence inversion: when
  { model_profile: 'inherit', models: { execution: 'opus' } }
both resolvers short-circuited on `profile === 'inherit'` BEFORE the
phase-type override could be honored. Result: model returned 'inherit'
and reasoning_effort returned null — both contradicting the documented
precedence where models[phase_type] wins over model_profile.

Fix in resolveModelInternal:
- Compute tier from phase-type FIRST. If phase-type is a valid alias,
  it wins. Otherwise, fall back to profile-derived tier OR 'inherit'
  (when profile === 'inherit').
- Gate the runtime-resolution branch on `tier !== 'inherit'` (was
  `profile !== 'inherit'`) so phase-type=opus can flip runtime mapping
  on even when profile=inherit.
- Gate the inherit-return on `tier === 'inherit'` (was
  `profile === 'inherit'`).

Fix in resolveReasoningEffortInternal:
- Remove the `if (profile === 'inherit') return null;` early-return.
- Compute tier from phase-type first, fall back to profile. If
  phase-type is explicitly 'inherit' OR the resolved tier is 'inherit',
  return null (no runtime entry maps to inherit).

Tests added (5 new):
- model: phase-type wins over profile=inherit (with explicit opus, with
  haiku for one phase + planner-without-slot still inheriting)
- model: profile=inherit + no models block → all agents inherit (no
  regression on existing inherit semantics)
- model: profile=inherit + models block but agent has no slot → that
  agent inherits, agents with slots get phase-type tier
- effort: phase-type opus + profile=inherit → produces opus-tier
  effort, NOT null (the original bug)

Verification:
- 27/27 pass on regression test (24 prior + 3 model + 1 effort)
- 6820/6820 full suite (5 net new)
- lint-no-source-grep clean

The effort test reads the expected value by running a profile-only opus
config and comparing — semantic check, not hard-coded effort string. So
runtime tier map changes don't break the test.
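
The two fixes can be pictured as one shared tier computation feeding both resolvers. The sketch below is an assumption-laden illustration: the function names, the argument shapes, and the sonnet model placeholder are hypothetical, and only the Codex opus row (gpt-5.4, xhigh) and the sonnet effort (medium) are taken from the example earlier in this commit.

```javascript
const VALID_TIERS = new Set(['opus', 'sonnet', 'haiku', 'inherit']);

// Hypothetical Codex tier map; the sonnet model ID is a placeholder.
const CODEX_TIER_MAP = {
  opus: { model: 'gpt-5.4', reasoning_effort: 'xhigh' },
  sonnet: { model: '<codex-sonnet-model>', reasoning_effort: 'medium' },
};

// Phase-type is consulted first; the profile only matters when no valid slot is set.
function resolveTier(profile, profileTier, phaseTier) {
  if (VALID_TIERS.has(phaseTier)) return phaseTier;
  if (profile === 'inherit') return 'inherit';
  return profileTier;
}

// Model and effort are derived from the same tier, so they cannot disagree.
function resolveModelAndEffort(profile, profileTier, phaseTier) {
  const tier = resolveTier(profile, profileTier, phaseTier);
  if (tier === 'inherit') return { model: 'inherit', reasoning_effort: null };
  const entry = CODEX_TIER_MAP[tier];
  // No runtime entry for this tier: return the alias and let the runtime pick the effort.
  if (!entry) return { model: tier, reasoning_effort: null };
  return { model: entry.model, reasoning_effort: entry.reasoning_effort };
}
```

Gating on the resolved tier rather than on the profile is what lets `models.execution = opus` take effect even when `model_profile` is `inherit`.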
2026-05-02 13:19:15 -04:00
Tom Boucher
8de8acee46 fix(workflows): assert HEAD on per-agent branch before worktree commits (#2924) (#2941)
* fix(workflows): assert HEAD on per-agent branch before worktree commits

Worktree-mode setup could leave HEAD attached to a protected branch (master),
causing agent commits to land there. The previous response was a destructive
self-recovery via 'git update-ref refs/heads/master <sha>', which silently
rewinds the protected branch and destroys concurrent commits in multi-active
scenarios (parallel agents, user committing while agent runs).

- Reorder <worktree_branch_check> in execute-phase.md and quick.md to assert
  HEAD via 'git symbolic-ref' BEFORE any 'git reset --hard'. HALT with a
  blocker if HEAD is on main/master/develop/trunk/release/* or detached.
- Add a per-commit HEAD assertion (step 0) to gsd-executor.md
  <task_commit_protocol>; HEAD attachment can drift after 'git checkout <sha>'.
- Forbid 'git update-ref refs/heads/<protected>' in
  <destructive_git_prohibition>; surface the blocker rather than self-heal.
- Remove '--no-verify' as the worktree-mode default in execute-phase.md,
  execute-plan.md, quick.md, and references/git-integration.md. Hooks now
  run on every executor commit; opt out only via workflow.worktree_skip_hooks.
- Add regression test that parses the worktree_branch_check blocks structurally
  and asserts the symbolic-ref check precedes the reset --hard, no workflow
  performs update-ref on a protected ref, and --no-verify is no longer the
  default in any parallel-execution prompt.

* fix(#2924): address CodeRabbit review findings on worktree HEAD PR

- Add positive worktree-agent-* allow-list to <task_commit_protocol> step 0
  in gsd-executor.md and to <worktree_branch_check> in execute-phase.md and
  quick.md. The deny-list (main|master|develop|trunk|release/*) silently
  allowed feature/* and other arbitrary branches outside the agent namespace.
- Register workflow.worktree_skip_hooks in both config schemas
  (sdk/src/query/config-schema.ts and get-shit-done/bin/lib/config-schema.cjs)
  and document it in docs/CONFIGURATION.md so config-set accepts it.
- Fix stash lifecycle in execute-phase.md post-wave hook validation: stash
  under a named ref and pop after the hook run; warn on pop failure.
- Pre-dispatch PLAN.md commit in quick.md: gate on git diff --cached --quiet
  for idempotency and exit 1 with a clear error on commit failure (both the
  --no-verify and the normal branches) — no more swallowing real errors.
- Test fixes (tests/bug-2924-worktree-head-attachment.test.cjs):
  - Parse the protected-branch alternation structurally and require
    main, master, develop, trunk, release/.* (release/* was previously
    skipped by the \\b...\\b regex).
  - Use fs.readdirSync(dir, { recursive: true }) so workflows in nested
    subdirectories are also asserted against the update-ref ban.
  - Add allow-list assertions for execute-phase.md, quick.md, and
    gsd-executor.md to lock in the new positive namespace check.
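
The combined deny-list and allow-list check can be sketched as a pure function. The workflows do this in shell; the JavaScript form, the function name `checkHead`, and the result shape are assumptions for illustration. `head` is expected to be the output of `git symbolic-ref --quiet --short HEAD`, or `null` when that command fails because HEAD is detached.

```javascript
// Protected branches named in the workflow: main, master, develop, trunk, release/*.
const PROTECTED = /^(main|master|develop|trunk|release\/.*)$/;
// Positive namespace check: only per-agent worktree branches may receive commits.
const AGENT_NAMESPACE = /^worktree-agent-.+$/;

function checkHead(head) {
  if (head === null) return { ok: false, reason: 'detached HEAD' };
  if (PROTECTED.test(head)) return { ok: false, reason: `protected branch: ${head}` };
  if (!AGENT_NAMESPACE.test(head)) return { ok: false, reason: `outside agent namespace: ${head}` };
  return { ok: true };
}
```

The deny-list alone would let `feature/x` through; the allow-list is what rejects it.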

* test(#2924): assert sub-section end marker exists before slicing

* test(#2924): use section boundary instead of fixed window for parallel-agents slice
2026-05-01 09:23:02 -04:00
Tom Boucher
8788ab2381 feat: post-merge build & test gate — Build step, iOS/Xcode, serial mode (#2751)
* feat: post-merge build & test gate — Build step, iOS/Xcode, serial mode

Step 5.6 of execute-phase is extended per #2720:
- Renamed from "Post-merge test gate (parallel mode only)" to "Post-merge build & test gate"
- Gate now runs in both parallel mode (after worktree merge) and serial mode (after last plan)
- Added Step A: Build gate resolving BUILD_CMD from the
  workflow.build_command config key, then auto-detecting via priority:
  config override → Xcode (.xcodeproj) → Makefile build: → Justfile →
  Cargo/Go/Python/npm. Xcode uses xcodebuild -list -json to get the first
  scheme, then xcodebuild build -scheme ... -destination
  'platform=iOS Simulator,name=iPhone 16'. Build failure increments
  WAVE_FAILURE_COUNT.
- Added Xcode/iOS detection to Step B (Test gate): when *.xcodeproj is
  present and no workflow.test_command is configured, uses xcodebuild test
  instead of the previous "no test runner detected" skip. The scheme is
  reused from Step A when available.
- Documented workflow.build_command and workflow.test_command in docs/CONFIGURATION.md (table + JSON schema)
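
The detection priority in Step A can be sketched as a first-match scan. The real gate lives in a workflow prompt, not in JavaScript; the function name `detectBuildKind`, the `files` map, and the marker files for Cargo/Go/Python/npm are assumptions for illustration. The sketch returns a label rather than a command, since only the Xcode command line is spelled out above.

```javascript
// `files` maps file or directory names at the repo root to their text contents.
function detectBuildKind(files, config = {}) {
  // 1. An explicit workflow.build_command always wins.
  const configured = config.workflow?.build_command;
  if (configured) return { kind: 'config', command: configured };
  // 2. Xcode project present.
  if (Object.keys(files).some((name) => name.endsWith('.xcodeproj'))) return { kind: 'xcode' };
  // 3. Makefile with a `build:` target.
  if (/^build:/m.test(files['Makefile'] ?? '')) return { kind: 'make' };
  // 4. Justfile.
  if ('Justfile' in files || 'justfile' in files) return { kind: 'just' };
  // 5. Language toolchains, detected via assumed marker files.
  const markers = [
    ['Cargo.toml', 'cargo'],
    ['go.mod', 'go'],
    ['pyproject.toml', 'python'],
    ['package.json', 'npm'],
  ];
  for (const [file, kind] of markers) {
    if (file in files) return { kind };
  }
  return { kind: 'none' };
}
```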

Closes #2720

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(execute-phase): extract Step 5.6 body to post-merge-gate.md sub-file

Moves the build-detection logic and xcodebuild commands from the inline
Step 5.6 body into execute-phase/steps/post-merge-gate.md, replacing it
with a single Read() reference. Reduces execute-phase.md from 1755 to
1647 lines, satisfying the ≤1700 XL-tier budget enforced by
tests/workflow-size-budget.test.cjs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 13:33:50 -04:00
Tom Boucher
b40110111d feat(#2306): plan-review-convergence v2 — CYCLE_SUMMARY contract, config gate, local model reviewers (#2718)
* feat(#2306): plan-review-convergence v2 — CYCLE_SUMMARY contract, config gate, local model reviewers

Fixes the false-stall detection bug in the plan→review→replan convergence
loop. REVIEWS.md accumulates history across cycles so raw grep inflated
HIGH counts; HIGH count now comes from a per-cycle CYCLE_SUMMARY contract
emitted in the review agent's return message.

Key changes:
- workflow.plan_review_convergence config gate (disabled by default, same
  pattern as workflow.code_review / workflow.nyquist_validation)
- Review agent prompt defines CYCLE_SUMMARY: current_high=<N> contract with
  PARTIALLY RESOLVED / FULLY RESOLVED counting rules
- Orchestrator aborts on an absent or malformed CYCLE_SUMMARY and distinguishes the two cases
- Warns when HIGH_COUNT > 0 but ## Current HIGH Concerns section is missing
- Stall detection and --ws forwarding preserved and tested
- Local model reviewers: --ollama, --lm-studio, --llama-cpp flags added to
  convergence workflow and review workflow; all three use OpenAI-compatible
  /v1/chat/completions endpoint with jq --rawfile for safe JSON encoding
- review.ollama_host / review.lm_studio_host / review.llama_cpp_host config
  keys registered and documented (default to localhost:11434/1234/8080)
- review.models.ollama / .lm_studio / .llama_cpp model-name config support
- 58 tests (up from 29 in PR #2339), all passing
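
Reading the contract out of the agent's return message might look like the sketch below. The orchestrator does this in shell; the JavaScript form, the function name `parseCycleSummary`, and the result shape are assumptions. Only the `CYCLE_SUMMARY: current_high=<N>` format and the absent-versus-malformed distinction come from the description.

```javascript
function parseCycleSummary(message) {
  // Only the first CYCLE_SUMMARY line is considered.
  const line = message.split('\n').find((l) => l.includes('CYCLE_SUMMARY:'));
  if (line === undefined) return { status: 'absent' };
  // The count must be a plain non-negative integer.
  const match = /CYCLE_SUMMARY:\s*current_high=(\d+)\b/.exec(line);
  if (!match) return { status: 'malformed', line };
  return { status: 'ok', currentHigh: Number(match[1]) };
}
```

Because the count comes from this single line, history accumulated in REVIEWS.md cannot inflate it.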

Closes #2306
Closes #2339
Co-authored-by: Tom Boucher <trekkie@nomorestars.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): sync sdk/src/query/config-schema.ts with CJS schema (#2306)

Add workflow.plan_review_convergence, review.ollama_host,
review.lm_studio_host, and review.llama_cpp_host to the SDK-side
TypeScript mirror — required by the CJS↔SDK parity test (#2653).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(#2306): resolve CodeRabbit review findings

- Anchor HIGH_COUNT extraction with head -1 to prevent multi-match when
  agent return message contains multiple CYCLE_SUMMARY lines (e.g. quoted
  back from prompt context)
- Replace hardcoded reviewers list in REVIEWS.md frontmatter template with
  runtime-derived placeholder — the static list did not reflect which
  reviewers were actually invoked
- Broaden workflow.plan_review_convergence docs to include local reviewers
  (Ollama, LM Studio, llama.cpp) alongside cloud reviewers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ci): restore reviewers frontmatter list with runtime note

The cursor-reviewer.test.cjs (and equivalent per-reviewer tests) assert
that each supported reviewer appears on the reviewers: line — these are
wiring tests that catch when a new reviewer is added to invocation but
not to the REVIEWS.md template. Replacing the list with a placeholder
broke those tests.

Restore the full static list and add an inline comment clarifying that
the actual committed frontmatter should be filtered to only the reviewers
invoked that run — satisfying both the per-reviewer tests and the
CodeRabbit correctness note.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 14:18:34 -04:00
Tom Boucher
eba0c99698 fix(#2623): resolve parent .planning root for sub_repos workspaces in SDK query dispatch (#2629)
* fix(#2623): resolve parent .planning root for sub_repos workspaces in SDK query dispatch

When `gsd-sdk query` is invoked from inside a `sub_repos`-listed child repo,
`projectDir` defaulted to `process.cwd()` which pointed at the child repo,
not the parent workspace that owns `.planning/`. Handlers then directly
checked `${projectDir}/.planning` and reported `project_exists: false`.

The legacy `gsd-tools.cjs` CLI does not have this gap — it calls
`findProjectRoot(cwd)` from `bin/lib/core.cjs`, which walks up from the
starting directory checking each ancestor's `.planning/config.json` for a
`sub_repos` entry that lists the starting directory's top-level segment.

This change ports that walk-up as a new `findProjectRoot` helper in
`sdk/src/query/helpers.ts` and applies it once in `cli.ts:main()` before
dispatching `query`, `run`, `init`, or `auto`. Resolution is idempotent:
if `projectDir` already owns `.planning/` (including an explicit
`--project-dir` pointing at the workspace root), the helper returns it
unchanged. The walk is capped at 10 parent levels and never crosses
`$HOME`. All filesystem errors are swallowed.
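
A simplified sketch of the walk-up, assuming POSIX-style absolute paths and injected lookups instead of real filesystem calls. The `env` parameter, with its `hasPlanning` and `readSubRepos` callbacks, is hypothetical and exists only to keep the example self-contained; the actual helper in `sdk/src/query/helpers.ts` reads `.planning/config.json` directly and also handles cases (heuristic fallback, legacy `multiRepo`) that are omitted here.

```javascript
// Parent of a POSIX-style absolute path; the root is its own parent.
function parentOf(dir) {
  const i = dir.lastIndexOf('/');
  return i <= 0 ? '/' : dir.slice(0, i);
}

function findProjectRoot(startDir, env) {
  const { hasPlanning, readSubRepos, homeDir, maxLevels = 10 } = env;
  // Idempotent: a directory that already owns .planning/ is returned unchanged.
  if (hasPlanning(startDir)) return startDir;
  let dir = startDir;
  for (let level = 0; level < maxLevels; level++) {
    if (dir === homeDir) break; // never climb above $HOME
    const parent = parentOf(dir);
    if (parent === dir) break; // reached the filesystem root
    // Top-level segment of startDir as seen from this ancestor.
    const relative = startDir.slice(parent === '/' ? 1 : parent.length + 1);
    const topSegment = relative.split('/')[0];
    let subRepos = [];
    try {
      subRepos = readSubRepos(parent) ?? [];
    } catch {
      // Lookup errors are swallowed; the ancestor is treated as a non-match.
    }
    if (subRepos.includes(topSegment)) return parent;
    dir = parent;
  }
  return startDir;
}
```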

Regression coverage:
- `helpers.test.ts` — 8 unit tests covering own-`.planning` guard (#1362),
  sub_repos match, nested-path match, `planning.sub_repos` shape,
  heuristic fallback, unparseable config, legacy `multiRepo: true`.
- `sub-repos-root.integration.test.ts` — end-to-end baseline (reproduces
  the bug without the walk-up) and fixed behavior (walk-up + dispatch of
  `init.new-milestone` reports `project_exists: true` with the parent
  workspace as `project_root`).

sdk vitest: 1511 pass / 24 fail (all 24 failures pre-existing on main,
baseline is 26 failing — `comm -23` against baseline produces zero new
failures). CJS: 5410 pass / 0 fail.

Closes #2623

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2623): remove stray .planing typo from integration test setup

Address CodeRabbit nitpick: the mkdir('.planing') call on line 23 was
dead code from a typo, with errors silently swallowed via .catch(() => {}).
The test already creates '.planning' correctly on the next line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 11:58:23 -04:00
Tom Boucher
5a8a6fb511 fix(#2256): pass per-agent model overrides through Codex/OpenCode transport (#2628)
The Codex and OpenCode install paths read `model_overrides` only from
`~/.gsd/defaults.json` (global). A per-project override set in
`.planning/config.json` — the reporter's exact setup for
`gsd-codebase-mapper` — was silently dropped, so the child agent inherited
the runtime's default model regardless of `model_overrides`.

Neither runtime has an inline `model` parameter on its spawn API
(Codex `spawn_agent(agent_type, message)`, OpenCode `task(description,
prompt, subagent_type, task_id, command)`), so the per-agent model must
reach the child via the static config GSD writes at install time. That
config was being populated from the wrong source.

Fix: add `readGsdEffectiveModelOverrides(targetDir)` which merges
`~/.gsd/defaults.json` with per-project `.planning/config.json`, with
per-project keys winning on conflict. Both install sites now call it and
walk up from the install root to locate `.planning/` — matching the
precedence `readGsdRuntimeProfileResolver` already uses for #2517.
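
The merge itself is small. A sketch, assuming both inputs are already-parsed JSON objects (the real `readGsdEffectiveModelOverrides(targetDir)` also locates and reads the files; the function name and model IDs below are placeholders):

```javascript
// Merge global defaults with per-project overrides; per-project keys win on conflict.
function mergeModelOverrides(globalDefaults, projectConfig) {
  return {
    ...(globalDefaults?.model_overrides ?? {}),
    ...(projectConfig?.model_overrides ?? {}),
  };
}

const merged = mergeModelOverrides(
  { model_overrides: { 'gsd-codebase-mapper': 'model-a', 'gsd-planner': 'model-b' } },
  { model_overrides: { 'gsd-codebase-mapper': 'model-c' } },
);
// merged['gsd-codebase-mapper'] is 'model-c'; merged['gsd-planner'] stays 'model-b'.
```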

Also update the Codex Task()->spawn_agent mapping block so it no longer
says "omit" without context: it now documents that per-agent overrides
are embedded in the agent TOML and notes the restriction that Codex
only permits `spawn_agent` when the user explicitly requested sub-agents
(do the work inline otherwise).

Regression tests (`tests/bug-2256-model-overrides-transport.test.cjs`)
cover: global-only, project-only, project-wins-on-conflict, walking up
from a nested `targetDir`, Codex TOML `model =` emission, and OpenCode
frontmatter `model:` emission.

Closes #2256

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 11:58:06 -04:00
Tom Boucher
f30da8326a feat: add gates ensuring discuss-phase decisions are translated to plans and verified (closes #2492) (#2611)
* feat(#2492): add gates ensuring discuss-phase decisions are translated and verified

Two gates close the loop between CONTEXT.md `<decisions>` and downstream
work, fixing #2492:

- Plan-phase **translation gate** (BLOCKING). After requirements
  coverage, refuses to mark a phase planned when a trackable decision
  is not cited (by id `D-NN` or by 6+-word phrase) in any plan's
  `must_haves`, `truths`, or body. Failure message names each missed
  decision with id, category, text, and remediation paths.

- Verify-phase **validation gate** (NON-BLOCKING). Searches plans,
  SUMMARY.md, files modified, and recent commit subjects for each
  trackable decision. Misses are written to VERIFICATION.md as a
  warning section but do not change verification status. Asymmetry is
  deliberate — fuzzy-match miss should not fail an otherwise green
  phase.

Shared helper `parseDecisions()` lives in `sdk/src/query/decisions.ts`
so #2493 can consume the same parser.

Decisions opt out of both gates via `### Claude's Discretion` heading
or `[informational]` / `[folded]` / `[deferred]` tags.

Both gates skip silently when `workflow.context_coverage_gate=false`
(default `true`).
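
The citation rule (id `D-NN`, or a 6+-word phrase) can be sketched as below. This is not the SDK implementation: the function names, the normalization, and the choice of the first six words as the phrase are assumptions for illustration.

```javascript
// Lowercase, drop punctuation, split into words.
function normalizeWords(text) {
  return text.toLowerCase().replace(/[^a-z0-9\s-]/g, ' ').split(/\s+/).filter(Boolean);
}

// `decision` is { id: 'D-NN', text: '...' }; `haystack` is the plan text to search.
function isCited(decision, haystack) {
  // Explicit ID citation; word boundaries keep D-1 from matching D-10.
  // IDs are assumed to be of the form D-NN, so no regex escaping is needed.
  if (new RegExp(`\\b${decision.id}\\b`).test(haystack)) return true;
  // Soft match: needs at least six normalized words to be meaningful.
  const words = normalizeWords(decision.text);
  if (words.length < 6) return false;
  const phrase = words.slice(0, 6).join(' ');
  // Pad with spaces so the phrase only matches on whole-word boundaries.
  return ` ${normalizeWords(haystack).join(' ')} `.includes(` ${phrase} `);
}
```

Short decisions never soft-match in this sketch, so they need an explicit `D-NN` citation to count as covered.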

Closes #2492

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2492): make plan-phase decision gate actually block (review F1, F8, F9, F10, F15)

- F1: replace `${context_path}` with `${CONTEXT_PATH}` in the plan-phase
  gate snippet so the BLOCKING gate receives a non-empty path. The
  variable was defined in Step 4 (`CONTEXT_PATH=$(_gsd_field "$INIT" ...)`)
  and the gate snippet referenced the lowercase form, leaving the gate to
  run with an empty path argument and silently skip.
- F15: wrap the SDK call with `jq -e '.data.passed == true' || exit 1` so
  failure halts the workflow instead of being printed and ignored. The
  verify-phase counterpart deliberately keeps no exit-1 (non-blocking by
  design) and now carries an inline note documenting the asymmetry.
- F10: tag the JSON example fence as `json` and the options-list fence as
  `text` (MD040).
- F8/F9: anchor the heading-presence test regexes to `^## 13[a-z]?\\.` so
  prose substrings like "Requirements Coverage Gate" mentioned in body
  text cannot satisfy the assertion. Added two new regression tests
  (variable-name match, exit-1 guard) so a future revert is caught.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2492): tighten decision-coverage gates against false positives and config drift (review F3,F4,F5,F6,F7,F16,F18,F19)

- F3: forward `workstream` arg through both gate handlers so workstream-scoped
  `workflow.context_coverage_gate=false` actually skips. Added negative test
  that creates a workstream config disabling the gate while the root config
  has it enabled and asserts the workstream call is skipped.
- F4: restrict the plan-phase haystack to designated sections — front-matter
  `must_haves` / `truths` / `objective` plus body sections under headings
  matching `must_haves|truths|tasks|objective`. HTML comments and fenced
  code blocks are stripped before extraction so a commented-out citation or
  a literal example never counts as coverage. Verify-phase keeps the broader
  artifact-wide haystack by design (non-blocking).
- F5: reject decisions with fewer than 6 normalized words from soft-matching
  (previously only rejected when the resulting phrase was under 12 chars
  AFTER slicing — too lenient). Short decisions now require an explicit
  `D-NN` citation, with regression tests for the boundary.
- F6: walk every `*-SUMMARY.md` independently and use `matchAll` with the
  `/g` flag so multiple `files_modified:` blocks across multiple summaries
  are all aggregated. Previously only the first block in the concatenated
  string was parsed, silently dropping later plans' files.
- F7: validate every `files_modified` path stays inside `projectDir` after
  resolution (rejects absolute paths, `../` traversal). Cap each file read
  at 256 KB. Skipped paths emit a stderr warning naming the entry.
- F16: validate `workflow.context_coverage_gate` is boolean in
  `loadGateConfig`; warn loudly on numeric or other-shaped values and
  default to ON. Mirrors the schema-vs-loadConfig validation gap from
  #2609.
- F18: bump verify-phase `git log -n` cap from 50 to 200 so longer-running
  phases are not undercounted. Documented as a precision-vs-recall tradeoff
  appropriate for a non-blocking gate.
- F19: tighten `QueryResult` / `QueryHandler` to be parameterized
  (`<T = unknown>`). Drops the `as unknown as Record<string, unknown>`
  casts in the gate handlers and surfaces shape mismatches at compile time
  for callers that pass a typed `data` value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2492): harden decisions parser and verify-phase glob (review F11,F12,F13,F14,F17,F20)

- F11: strip fenced code blocks from CONTEXT.md before searching for
  `<decisions>` so an example block inside ``` ``` is not mis-parsed.
- F12: accept tab-indented continuation lines (previously required a leading
  space) so decisions split with `\t` continue cleanly.
- F13: parse EVERY `<decisions>` block in the file via `matchAll`, not just
  the first. CONTEXT.md may legitimately carry more than one block.
- F14: `decisions.parse` handler now resolves a relative path against
  `projectDir` — symmetric with the gate handlers — and still accepts
  absolute paths.
- F17: replace `ls "${PHASE_DIR}"/*-CONTEXT.md | head -1` in verify-phase.md
  with a glob loop (ShellCheck SC2012 fix). Also avoids spawning an extra
  subprocess and survives filenames with whitespace.
- F20: extend the unicode quote-stripping in the discretion-heading match
  to cover U+2018/2019/201A/201B and the U+201C-F double-quote variants
  plus backtick, so any rendering of "Claude's Discretion" collapses to
  the same key.

Each fix has a regression test in `decisions.test.ts`.
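
F11 and F13 together amount to "strip fences, then collect every block". A rough sketch, not the code in `decisions.ts`; the function name and regexes are assumptions, and the fence marker is built from a string so the example can itself sit inside a fenced block.

```javascript
// Three backticks, built indirectly (see note above).
const FENCE = '`'.repeat(3);
// A fenced block: opening fence line, anything, closing fence line.
const FENCED_BLOCK = new RegExp('^' + FENCE + '[^\\n]*\\n[\\s\\S]*?^' + FENCE + '[ \\t]*$', 'gm');

function extractDecisionBlocks(markdown) {
  // Drop fenced code first so an example <decisions> block is never parsed.
  const withoutFences = markdown.replace(FENCED_BLOCK, '');
  const blocks = [];
  // matchAll with the g flag visits every block, not just the first.
  for (const match of withoutFences.matchAll(/<decisions>([\s\S]*?)<\/decisions>/g)) {
    blocks.push(match[1].trim());
  }
  return blocks;
}
```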

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 00:26:53 -04:00
Tom Boucher
1a3d953767 feat: add unified post-planning gap checker (closes #2493) (#2610)
* feat: add unified post-planning gap checker (closes #2493)

Adds a unified post-planning gap checker as Step 13e of plan-phase.md.
After all plans are generated and committed, scans REQUIREMENTS.md and
CONTEXT.md <decisions> against every PLAN.md in the phase directory and
emits a single Source | Item | Status table.

Why
- The existing Requirements Coverage Gate (§13) blocks/re-plans on REQ
  gaps but emits two separate per-source signals. Issue #2493 asks for
  one unified report after planning so that requirements AND
  discuss-phase decisions slipping through are surfaced in one place
  before execution starts.

What
- New workflow.post_planning_gaps boolean config key, default true,
  added to VALID_CONFIG_KEYS, CONFIG_DEFAULTS, hardcoded.workflow, and
  cmdConfigSet (boolean validation).
- New get-shit-done/bin/lib/decisions.cjs — shared parser for
  CONTEXT.md <decisions> blocks (D-NN entries). Designed for reuse by
  the related #2492 plan/verify decision gates.
- New get-shit-done/bin/lib/gap-checker.cjs — parses REQUIREMENTS.md
  (checkbox + traceability table forms), reads CONTEXT.md decisions,
  walks PHASE_DIR/*-PLAN.md, runs word-boundary coverage detection
  (REQ-1 must not match REQ-10), formats a sorted report.
- New gsd-tools gap-analysis CLI command wired through gsd-tools.cjs.
- workflows/plan-phase.md gains §13e between §13d (commit plans) and
  §14 (Present Final Status). Existing §13 gate preserved — §13e is
  additive and non-blocking.
- sdk/prompts/workflows/plan-phase.md gets an equivalent
  post_planning_gaps step for headless mode.
- Docs: CONFIGURATION.md, references/planning-config.md, INVENTORY.md,
  INVENTORY-MANIFEST.json all updated.

Tests
- tests/post-planning-gaps-2493.test.cjs: 30 test cases covering step
  insertion position, decisions parser, gap detector behavior
  (covered/not-covered, false-positive guard, missing-file
  resilience, malformed-input resilience, gate on/off, deterministic
  natural sort), and full config integration.
- Full suite: 5234 / 5234 pass.

Design decisions
- Numbered §13e (sub-step), not §14 — §14 already exists (Present
  Final Status); inserting before it preserves downstream auto-advance
  step numbers.
- Existing §13 gate kept, not replaced — §13 blocks/re-plans on
  REQ gaps; §13e is the unified post-hoc report. Per spec: "default
  behavior MUST be backward compatible."
- Word-boundary ID matching avoids REQ-1 matching REQ-10 and avoids
  brittle semantic/substring matching.
- Shared decisions.cjs parser so #2492 can reuse the same regex.
- Natural-sort keys (REQ-02 before REQ-10) for deterministic output.
- Boolean validation in cmdConfigSet rejects non-boolean values,
  matching the precedent set by drift_threshold/drift_action.
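
The word-boundary matching and natural-sort decisions above can be
sketched roughly as follows. This is a minimal illustration, not the
gap-checker.cjs code: the helper names (escapeRegExp, isCovered,
naturalCompare) are hypothetical, and treating letters, digits, and
hyphens as ID characters is an assumption.

```javascript
// Hypothetical helpers; names are illustrative, not the gap-checker.cjs API.

// Escape regex metacharacters so an ID is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// True when `id` appears in `text` as a whole token.
// Assumption: an ID token is bounded by anything that is not a letter,
// digit, or hyphen, so REQ-1 does not match inside REQ-10.
function isCovered(id, text) {
  const pattern = new RegExp(
    `(?<![A-Za-z0-9-])${escapeRegExp(id)}(?![A-Za-z0-9-])`
  );
  return pattern.test(text);
}

// Natural-sort comparator: digit runs compare as numbers,
// so REQ-02 sorts before REQ-10.
function naturalCompare(a, b) {
  return a.localeCompare(b, 'en', { numeric: true, sensitivity: 'base' });
}

const planText = 'Implements REQ-10 and D-03.';
console.log(isCovered('REQ-1', planText));
console.log(isCovered('REQ-10', planText));
console.log(['REQ-10', 'REQ-02'].sort(naturalCompare));
```

Anchoring on explicit ID characters rather than `\b` keeps a hyphenated
ID such as REQ-1 from matching inside REQ-1-a, at the cost of baking in
an assumption about what an ID may contain.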

Closes #2493

* fix(#2493): expose post_planning_gaps in loadConfig() + sync schema example

Address CodeRabbit review on PR #2610:

- core.cjs loadConfig(): return post_planning_gaps from both the
  config.json branch and the global ~/.gsd/defaults.json fallback so
  callers can rely on config.post_planning_gaps regardless of whether
  the key is present (comment 3127977404, Major).
- docs/CONFIGURATION.md: add workflow.post_planning_gaps to the Full
  Schema JSON example so copy/paste users see the new toggle alongside
  security_block_on (comment 3127977392, Minor).
- tests/post-planning-gaps-2493.test.cjs: regression coverage for
  loadConfig() — default true when key absent, honors explicit
  true/false from workflow.post_planning_gaps.
2026-04-22 23:03:59 -04:00
Tom Boucher
cc17886c51 feat: make model profiles runtime-aware for Codex/non-Claude runtimes (closes #2517) (#2609)
* feat: make model profiles runtime-aware for Codex/non-Claude runtimes (closes #2517)

Adds an optional top-level `runtime` config key plus a
`model_profile_overrides[runtime][tier]` map. When `runtime` is set,
profile tiers (opus/sonnet/haiku) resolve to runtime-native model IDs
(and reasoning_effort where supported) instead of bare Claude aliases.

Codex defaults from the spec:
  opus   -> gpt-5.4        reasoning_effort: xhigh
  sonnet -> gpt-5.3-codex  reasoning_effort: medium
  haiku  -> gpt-5.4-mini   reasoning_effort: medium

Claude defaults mirror MODEL_ALIAS_MAP. Unknown runtimes fall back to
the Claude-alias safe default rather than emit IDs the runtime cannot
accept. reasoning_effort is only emitted into Codex install paths;
never returned from resolveModelInternal and never written to Claude
agent frontmatter.

Backwards compatible: any user without `runtime` set sees identical
behavior — the new branch is gated on `config.runtime != null`.

Precedence (highest to lowest):
  1. per-agent model_overrides
  2. runtime-aware tier resolution (when `runtime` is set)
  3. resolve_model_ids: "omit"
  4. Claude-native default
  5. inherit (literal passthrough)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2517): address adversarial review of #2609 (findings 1-16)

Addresses all 16 findings from the adversarial review of PR #2609.
Each finding is enumerated below with its resolution.

CRITICAL
- F1: readGsdRuntimeProfileResolver(targetDir) now probes per-project
  .planning/config.json AND ~/.gsd/defaults.json with per-project winning,
  so the PR's headline claim ("set runtime in project config and Codex
  TOML emit picks it up") actually holds end-to-end.
- F2: resolveTierEntry field-merges user overrides with built-in defaults.
  The CONFIGURATION.md string-shorthand example
    `{ codex: { opus: "gpt-5-pro" } }`
  now keeps reasoning_effort from the built-in entry. Partial-object
  overrides like `{ opus: { reasoning_effort: 'low' } }` keep the
  built-in model. Both paths regression-tested.
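
The field-merge behavior in F2 can be pictured with a sketch like the
one below. The Codex values are the defaults listed in the first commit
of this PR; the `model` field name, the function signature, and the null
return for unknown tiers are assumptions, so the real resolveTierEntry
in core.cjs may differ.

```javascript
// Built-in Codex tier defaults as listed in the commit message.
// Assumption: entries are objects with `model` and `reasoning_effort`.
const RUNTIME_PROFILE_MAP = {
  codex: {
    opus: { model: 'gpt-5.4', reasoning_effort: 'xhigh' },
    sonnet: { model: 'gpt-5.3-codex', reasoning_effort: 'medium' },
    haiku: { model: 'gpt-5.4-mini', reasoning_effort: 'medium' },
  },
};

// Hypothetical sketch of field-merging user overrides onto built-ins.
function resolveTierEntry(runtime, tier, overrides = {}) {
  const builtIn = (RUNTIME_PROFILE_MAP[runtime] || {})[tier];
  const raw = (overrides[runtime] || {})[tier];
  if (raw === undefined) return builtIn ? { ...builtIn } : null;
  // String shorthand overrides the model only.
  const user = typeof raw === 'string' ? { model: raw } : raw;
  // Later spread wins per field, so unspecified fields keep built-in values.
  return { ...builtIn, ...user };
}

console.log(resolveTierEntry('codex', 'opus', { codex: { opus: 'gpt-5-pro' } }));
console.log(
  resolveTierEntry('codex', 'opus', { codex: { opus: { reasoning_effort: 'low' } } })
);
```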

MAJOR
- F3: resolveReasoningEffortInternal gates strictly on the
  RUNTIMES_WITH_REASONING_EFFORT allowlist regardless of override
  presence. Override + unknown-runtime no longer leaks reasoning_effort.
- F4: runtime:"claude" is now a no-op for resolution (it is the implicit
  default). It no longer hijacks resolve_model_ids:"omit". Existing
  tests for `runtime:"claude"` returning Claude IDs were rewritten to
  reflect the no-op semantics; new test asserts the omit case returns "".
- F5: _readGsdConfigFile in install.js writes a stderr warning on JSON
  parse failure instead of silently returning null. Read failure and
  parse failure are warned separately. Library require is hoisted to top
  of install.js so it is not co-mingled with config-read failure modes.
- F6: install.js requires for core.cjs / model-profiles.cjs are hoisted
  to the top of the file with __dirname-based absolute paths so global
  npm install works regardless of cwd. Test asserts both lib paths exist
  relative to install.js __dirname.
- F7: docs/CONFIGURATION.md `runtime` row no longer lists `opencode` as
  a valid runtime — install-path emission for non-Codex runtimes is
  explicitly out of scope per #2517 / #2612, and the doc now points at
  #2612 for the follow-on work. resolveModelInternal still accepts any
  runtime string (back-compat) and falls back safely for unknown values.
- F8: Tests now isolate HOME (and GSD_HOME) to a per-test tmpdir so the
  developer's real ~/.gsd/defaults.json cannot bleed into assertions.
  Same pattern CodeRabbit caught on PRs #2603 / #2604.
- F9: `runtime` and `model_profile_overrides` documented as flat-only
  in core.cjs comments — not routed through `get()` because they are
  top-level keys per docs/CONFIGURATION.md and introducing nested
  resolution for two new keys was not worth the edge-case surface.
- F10/F13: loadConfig now invokes _warnUnknownProfileOverrides on the
  raw parsed config so direct .planning/config.json edits surface
  unknown runtime values (e.g. typo `runtime: "codx"`) and unknown
  tier values (e.g. `model_profile_overrides.codex.banana`) at read
  time. Warnings only — preserves back-compat for runtimes added
  later. Per-process warning cache prevents log spam across repeated
  loadConfig calls.

MINOR / NIT
- F11: Removed dead `tier || 'sonnet'` defensive shortcut. The local
  is now `const alias = tier;` with a comment explaining why `tier`
  is guaranteed truthy at that point (every MODEL_PROFILES entry
  defines `balanced`, the fallback profile).
- F12: Extracted resolveTierEntry() in core.cjs as the single source
  of truth for runtime-aware tier resolution. core.cjs and bin/install.js
  both consume it — no duplicated lookup logic between the two files.
- F14: Added regression tests for findings #1, #2, #3, #4, #6, #10, #13
  in tests/issue-2517-runtime-aware-profiles.test.cjs. Each must-fix
  path has a corresponding test that fails against the pre-fix code
  and passes against the post-fix code.
- F15: docs/CONFIGURATION.md `model_profile` row cross-references
  #1713 / #1806 next to the `adaptive` enum value.
- F16: RUNTIME_PROFILE_MAP remains in core.cjs as the single source of
  truth; install.js imports it through the exported resolveTierEntry
  helper rather than carrying its own copy. Doc files (CONFIGURATION.md,
  USER-GUIDE.md, settings.md) intentionally still embed the IDs as text
  — code comment in core.cjs flags that those doc files must be updated
  whenever the constant changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 23:00:37 -04:00
Tom Boucher
220da8e487 feat: /gsd-settings-integrations — configure third-party search and review integrations (closes #2529) (#2604)
* feat(#2529): /gsd-settings-integrations — third-party integrations command

Adds /gsd-settings-integrations for configuring API keys, code-review CLI
routing, and agent-skill injection. Distinct from /gsd-settings (workflow
toggles) because these are connectivity, not pipeline shape.

Three sections:
- Search Integrations: brave_search / firecrawl / exa_search API keys,
  plus search_gitignored toggle.
- Code Review CLI Routing: review.models.{claude,codex,gemini,opencode}
  shell-command strings.
- Agent Skills Injection: agent_skills.<agent-type> free-text input,
  validated against [a-zA-Z0-9_-]+.

Security:
- New secrets.cjs module with ****<last-4> masking convention.
- cmdConfigSet now masks value/previousValue in CLI output for secret keys.
- Plaintext is written only to .planning/config.json; never echoed to
  stdout/stderr, never written to audit/log files by this flow.
- Slug validators reject path separators, whitespace, shell metacharacters.
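
A minimal sketch of the masking convention and slug validation described
above. isSecretKey and maskSecret are named in this PR, but their bodies
here are guesses; in particular, fully masking values of four characters
or fewer is an assumption, and displayValue and isValidSlug are
hypothetical helper names.

```javascript
// Secret keys named in the commit message.
const SECRET_KEYS = new Set(['brave_search', 'firecrawl', 'exa_search']);

function isSecretKey(key) {
  return SECRET_KEYS.has(key);
}

// ****<last-4> convention.
// Assumption: values of 4 characters or fewer are masked entirely,
// so a short secret is never echoed in full.
function maskSecret(value) {
  const s = String(value ?? '');
  return s.length > 4 ? `****${s.slice(-4)}` : '****';
}

// Hypothetical helper: what a CLI would print for a key/value pair.
function displayValue(key, value) {
  return isSecretKey(key) ? maskSecret(value) : value;
}

// Slug pattern from the commit message: [a-zA-Z0-9_-]+, anchored at both
// ends so path separators, whitespace, and shell metacharacters fail.
function isValidSlug(slug) {
  return /^[a-zA-Z0-9_-]+$/.test(slug);
}

console.log(displayValue('brave_search', 'sk-abcdef1234'));
console.log(isValidSlug('gsd-executor'), isValidSlug('../etc'));
```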

Tests (tests/settings-integrations.test.cjs — 25 cases):
- Artifact presence / frontmatter.
- Field round-trips via gsd-tools config-set for all four search keys,
  review.models.<cli>, agent_skills.<agent-type>.
- Config-merge safety: unrelated keys preserved across writes.
- Masking: config-set output never contains plaintext sentinel.
- Logging containment: plaintext secret sentinel appears only in
  config.json under .planning/, nowhere else on disk.
- Negative: path-traversal, shell-metachar, and empty-slug rejected.
- /gsd:settings workflow mentions /gsd:settings-integrations.

Docs:
- docs/COMMANDS.md: new command entry with security note.
- docs/CONFIGURATION.md: integration settings section (keys, routing,
  skills injection) with masking documentation.
- docs/CLI-TOOLS.md: reviewer CLI routing and secret-handling sections.
- docs/INVENTORY.md + INVENTORY-MANIFEST.json regenerated.

Closes #2529

* fix(#2529): mask secrets in config-get; address CodeRabbit review

cmdConfigGet was emitting plaintext for brave_search/firecrawl/exa_search.
Apply the same isSecretKey/maskSecret treatment used by config-set so the
CLI surface never echoes raw API keys; plaintext still lives only in
config.json on disk.

Also addresses CodeRabbit review items in the same PR area:
- #3127146188: config-get plaintext leak (root fix above)
- #3127146211: rename test sentinels to concat-built markers so secret
  scanners stop flagging the test file. Behavior preserved.
- #3127146207: add explicit 'text' language to fenced code blocks (MD040).
- nitpick: unify masked-value wording in read_current legend
  ('****<last-4>' instead of '**** already set').
- nitpick: extend round-trip test to cover search_gitignored toggle.

New regression test 'config-get masks secrets and never echoes plaintext'
verifies the fix for all three secret keys.

* docs(#2529): bump INVENTORY counts post-rebase (commands 84→85, workflows 82→83)

* fix(test): bump CLI Modules count 27→28 after rebase onto main (CI #24811455435)

PR #2604 was rebased onto main before #2605 (drift.cjs) merged. The
pull_request CI runs against the merge ref (refs/pull/2604/merge),
which now contains 28 .cjs files in get-shit-done/bin/lib/, but
docs/INVENTORY.md headline still said "(27 shipped)".

inventory-counts.test.cjs failed with:
  AssertionError: docs/INVENTORY.md "CLI Modules (27 shipped)" disagrees
  with get-shit-done/bin/lib/ file count (28)

Rebased branch onto current origin/main (picks up drift.cjs row, which
was already added by #2605) and bumped the headline to 28.

Full suite: 5200/5200 pass.
2026-04-22 21:41:00 -04:00
Tom Boucher
1a694fcac3 feat: auto-remap codebase after significant phase execution (closes #2003) (#2605)
* feat: auto-remap codebase after significant phase execution (#2003)

Adds a post-phase structural drift detector that compares the committed tree
against `.planning/codebase/STRUCTURE.md` and either warns or auto-remaps
the affected subtrees when drift exceeds a configurable threshold.

## Summary
- New `bin/lib/drift.cjs` — pure detector covering four drift categories:
  new directories outside mapped paths, new barrel exports at
  `(packages|apps)/*/src/index.*`, new migration files, and new route
  modules. Prioritizes the most-specific category per file.
- New `verify codebase-drift` CLI subcommand + SDK handler, registered as
  `gsd-sdk query verify.codebase-drift`.
- New `codebase_drift_gate` step in `execute-phase` between
  `schema_drift_gate` and `verify_phase_goal`. Non-blocking by contract —
  any error logs and the phase continues.
- Two new config keys: `workflow.drift_threshold` (int, default 3) and
  `workflow.drift_action` (`warn` | `auto-remap`, default `warn`), with
  enum/integer validation in `config-set`.
- `gsd-codebase-mapper` learns an optional `--paths <p1,p2,...>` scope hint
  for incremental remapping; agent/workflow docs updated.
- `last_mapped_commit` lives in YAML frontmatter on each
  `.planning/codebase/*.md` file; `readMappedCommit`/`writeMappedCommit`
  round-trip helpers ship in `drift.cjs`.

## Tests
- 55 new tests in `tests/drift-detection.test.cjs` covering:
  classification, threshold gating at 2/3/4 elements, warn vs. auto-remap
  routing, affected-path scoping, `--paths` sanitization (traversal,
  absolute, shell metacharacter rejection), frontmatter round-trip,
  defensive paths (missing STRUCTURE.md, malformed input, non-git repos),
  CLI JSON output, and documentation parity.
- Full suite: 5044 pass / 0 fail.

## Documentation
- `docs/CONFIGURATION.md` — rows for both new keys.
- `docs/ARCHITECTURE.md` — section on the post-execute drift gate.
- `docs/AGENTS.md` — `--paths` flag on `gsd-codebase-mapper`.
- `docs/USER-GUIDE.md` — user-facing behavior note + toggle commands.
- `docs/FEATURES.md` — new 27a section with REQ-DRIFT-01..06.
- `docs/INVENTORY.md` + `docs/INVENTORY-MANIFEST.json` — drift.cjs listed.
- `get-shit-done/workflows/execute-phase.md` — `codebase_drift_gate` step.
- `get-shit-done/workflows/map-codebase.md` — `parse_paths_flag` step.
- `agents/gsd-codebase-mapper.md` — `--paths` directive under parse_focus.

## Design decisions
- **Frontmatter over sidecar JSON** for `last_mapped_commit`: keeps the
  baseline attached to the file, survives git moves, survives per-doc
  regeneration, no extra file lifecycle.
- **Substring match against STRUCTURE.md** for `isPathMapped`: the map is
  free-form markdown, not a structured manifest; any mention of a path
  prefix counts as "mapped territory". Cheap, no parser, zero false
  negatives on reasonable maps.
- **Category priority migration > route > barrel > new_dir** so a file
  matching multiple rules counts exactly once at the most specific level.
- **Empty-tree SHA fallback** (`4b825dc6…`) when `last_mapped_commit` is
  absent — semantically correct (no baseline means everything is drift)
  and deterministic across repos.
- **Four layers of non-blocking** — detector try/catch, CLI try/catch, SDK
  handler try/catch, and workflow `|| echo` shell fallback. Any single
  layer failing still returns a valid skipped result.
- **SDK handler delegates to `gsd-tools.cjs`** rather than re-porting the
  detector to TypeScript, keeping drift logic in one canonical place.
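
The category-priority decision above can be sketched as a first-match
walk over an ordered list. Only the priority order and the barrel glob
come from this commit message; the migration and route patterns, the
classify and summarize names, and the caller-supplied `isMapped`
predicate (standing in for isPathMapped) are assumptions.

```javascript
// Priority order from the commit message: most specific first.
const PRIORITY = ['migration', 'route', 'barrel', 'new_dir'];

// Assumed matchers. Only the barrel pattern mirrors the commit message
// ((packages|apps)/*/src/index.*); the other two are placeholders.
const MATCHERS = {
  migration: (p) => /(^|\/)migrations\/[^/]+$/.test(p),
  route: (p) => /(^|\/)routes\/[^/]+$/.test(p),
  barrel: (p) => /^(packages|apps)\/[^/]+\/src\/index\.[^/]+$/.test(p),
};

// Returns the single highest-priority category for a path, or null.
// `isMapped` is a caller-supplied predicate standing in for isPathMapped.
function classify(filePath, isMapped) {
  for (const category of PRIORITY) {
    const hit =
      category === 'new_dir' ? !isMapped(filePath) : MATCHERS[category](filePath);
    if (hit) return category;
  }
  return null;
}

// Counts each file exactly once under its chosen category.
function summarize(paths, isMapped) {
  const counts = {};
  for (const p of paths) {
    const category = classify(p, isMapped);
    if (category) counts[category] = (counts[category] || 0) + 1;
  }
  return counts;
}

console.log(classify('apps/web/src/index.ts', () => false));
```

Because the loop returns on the first hit, a barrel file in an unmapped
directory is reported as a barrel export rather than a new directory.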

Closes #2003

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(mapper): tag --paths fenced block as text (CodeRabbit MD040)

Comment 3127255172.

* docs(config): use /gsd- dash command syntax in drift_action row (CodeRabbit)

Comment 3127255180. Matches the convention used by every other command
reference in docs/CONFIGURATION.md.

* fix(execute-phase): initialize AGENT_SKILLS_MAPPER + tag fenced blocks

Two CodeRabbit findings on the auto-remap branch of the drift gate:

- 3127255186 (must-fix): the mapper Task prompt referenced
  ${AGENT_SKILLS_MAPPER} but only AGENT_SKILLS (for gsd-executor) is
  loaded at init_context (line 72). Without this fix the literal
  placeholder string would leak into the spawned mapper's prompt.
  Add an explicit gsd-sdk query agent-skills gsd-codebase-mapper step
  right before the Task spawn.
- 3127255183: tag the warn-message and Task() fenced code blocks as
  text to satisfy markdownlint MD040.

* docs(map-codebase): wire PATH_SCOPE_HINT through every mapper prompt

CodeRabbit (review id 4158286952, comment 3127255190) flagged that the
parse_paths_flag step defined incremental-remap semantics but did not
inject a normalized variable into the spawn_agents and sequential_mapping
mapper prompts, so incremental remap could silently regress to a
whole-repo scan.

- Define SCOPED_PATHS / PATH_SCOPE_HINT in parse_paths_flag.
- Inject ${PATH_SCOPE_HINT} into all four spawn_agents Task prompts.
- Document the same scope contract for sequential_mapping mode.

* fix(drift): writeMappedCommit tolerates missing target file

CodeRabbit (review id 4158286952, drift.cjs:349-355 nitpick) noted that
readMappedCommit returns null on ENOENT but writeMappedCommit threw — an
asymmetry that breaks first-time stamping of a freshly produced doc that
the caller has not yet written.

- Catch ENOENT on the read; treat absent file as empty content.
- Add a regression test that calls writeMappedCommit on a non-existent
  path and asserts the file is created with correct frontmatter.
  Test was authored to fail before the fix (ENOENT) and passes after.
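
The read/write symmetry described above might look like the sketch
below. It is not the drift.cjs code: the frontmatter regex and the
choice to prepend a new frontmatter block are assumptions, and the
sketch expects the parent directory to exist already.

```javascript
const fs = require('node:fs');

// Assumed frontmatter shape: a leading block fenced by --- lines.
const FRONTMATTER = /^---\n([\s\S]*?)\n---\n?/;
const FIELD = /^last_mapped_commit:.*$/m;

// Returns the stored SHA, or null when the file or field is absent.
function readMappedCommit(file) {
  let content;
  try {
    content = fs.readFileSync(file, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return null;
    throw err;
  }
  const fm = content.match(FRONTMATTER);
  const m = fm && fm[1].match(/^last_mapped_commit:\s*(\S+)\s*$/m);
  return m ? m[1] : null;
}

// Stamps the SHA into frontmatter, creating the file when it is missing.
function writeMappedCommit(file, sha) {
  let content = '';
  try {
    content = fs.readFileSync(file, 'utf8');
  } catch (err) {
    if (err.code !== 'ENOENT') throw err; // absent file is treated as empty
  }
  const line = `last_mapped_commit: ${sha}`;
  const fm = content.match(FRONTMATTER);
  let next;
  if (!fm) {
    next = `---\n${line}\n---\n${content}`;
  } else {
    const body = content.slice(fm[0].length);
    const fields = FIELD.test(fm[1])
      ? fm[1].replace(FIELD, () => line)
      : `${fm[1]}\n${line}`;
    next = `---\n${fields}\n---\n${body}`;
  }
  fs.writeFileSync(file, next);
}
```

With both helpers treating ENOENT as "no baseline yet", a caller can
stamp a doc before or after writing its body without special-casing the
first run.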

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 21:21:44 -04:00
Tom Boucher
9c0a153a5f feat: /gsd-settings-advanced — power-user config tuning command (closes #2528) (#2603)
* feat: /gsd-settings-advanced — power-user config tuning command (closes #2528)

Adds a second-tier interactive configuration command covering the power-user
knobs that don't belong in the common-case /gsd-settings prompt. Six sectioned
AskUserQuestion batches cover planning, execution, discussion, cross-AI, git,
and runtime settings (19 config keys total). Current values are pre-selected;
numeric fields reject non-numeric input; writes route through
gsd-sdk query config-set so unrelated keys are preserved.

- commands/gsd/settings-advanced.md — command entry
- get-shit-done/workflows/settings-advanced.md — six-section workflow
- get-shit-done/workflows/settings.md — advertise advanced command
- get-shit-done/bin/lib/config-schema.cjs — add context_window to VALID_CONFIG_KEYS
- docs/COMMANDS.md, docs/CONFIGURATION.md, docs/INVENTORY.md — docs + inventory
- tests/gsd-settings-advanced.test.cjs — 81 tests (files, frontmatter,
  field coverage, pre-selection, merge-preserves-siblings, VALID_CONFIG_KEYS
  membership, confirmation table, /gsd-settings cross-link, negative scenarios)

All 5073 tests pass; coverage 88.66% (>= 70% threshold).

* docs(settings-advanced): clarify per-field numeric bounds and label fenced blocks

Addresses CodeRabbit review on PR #2603:
- Numeric-input rule now states min is field-specific: plan_bounce_passes
  and max_discuss_passes require >= 1; other numeric fields accept >= 0.
  Resolves the inconsistency between the global rule and the field-level
  prompts (CodeRabbit comment 3127136557).
- Adds 'text' fence language to seven previously unlabeled code blocks in
  the workflow (six AskUserQuestion sections plus the confirmation banner)
  to satisfy markdownlint MD040 (CodeRabbit comment 3127136561).

* test(settings-advanced): tighten section assertion, fix misleading test name, add executable numeric-input coverage

Addresses CodeRabbit review on PR #2603:
- Required section list now asserts the full 'Runtime / Output' heading
  rather than the looser 'Runtime' substring (comment 3127136564).
- Renames the subagent_timeout coercion test to match the actual key
  under test (was titled 'context_window' but exercised
  workflow.subagent_timeout — comment 3127136573).
- Adds two executable behavioral tests at the config-set boundary
  (comment 3127136579):
  * Non-numeric input on a numeric key currently lands as a string —
    locks in that the workflow's AskUserQuestion re-prompt loop is the
    layer responsible for type rejection. If a future change adds CLI-side
    numeric validation, the assertion flips and the test surfaces it.
  * Numeric string on workflow.max_discuss_passes is coerced to Number —
    locks in the parser invariant for a second numeric key.
2026-04-22 20:50:15 -04:00
Tom Boucher
533973700c feat(#2538): add last: /cmd suffix to statusline (opt-in) (#2594)
Adds a `statusline.show_last_command` config toggle (default: false) that
appends ` │ last: /<cmd>` to the statusline, showing the most recently
invoked slash command in the current session.

The suffix is derived by tailing the active Claude Code transcript
(provided as transcript_path in the hook input) and extracting the last
<command-name> tag. Reads only the final 256 KiB to stay cheap per render.
Graceful degradation: missing transcript, no recorded command, unreadable
config, or parse errors all silently omit the suffix without breaking the
statusline.
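
A rough sketch of the tail-and-extract approach described above. The
readTail and lastCommandSuffix names are hypothetical, and the regex
assumes the tag body is a single token with an optional leading slash;
the actual hook may parse the transcript differently.

```javascript
const fs = require('node:fs');

const TAIL_BYTES = 256 * 1024; // final 256 KiB, per the commit message

// Reads at most the last `maxBytes` of a file as UTF-8 text.
function readTail(file, maxBytes = TAIL_BYTES) {
  const fd = fs.openSync(file, 'r');
  try {
    const { size } = fs.fstatSync(fd);
    const length = Math.min(size, maxBytes);
    if (length === 0) return '';
    const buf = Buffer.alloc(length);
    const bytesRead = fs.readSync(fd, buf, 0, length, size - length);
    return buf.toString('utf8', 0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}

// Returns the statusline suffix, or '' when anything goes wrong.
function lastCommandSuffix(transcriptPath) {
  try {
    const tail = readTail(transcriptPath);
    // Assumed tag shape: <command-name>/cmd</command-name>, slash optional.
    const tags = [
      ...tail.matchAll(/<command-name>\s*\/?([^<\s]+)\s*<\/command-name>/g),
    ];
    if (tags.length === 0) return '';
    return ` │ last: /${tags[tags.length - 1][1]}`;
  } catch {
    return ''; // missing or unreadable transcript: omit the suffix silently
  }
}
```

Reading a fixed-size tail keeps the per-render cost bounded regardless
of transcript length; a command whose tag falls before the final
256 KiB is simply not shown.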

Closes #2538
2026-04-22 12:04:21 -04:00
Tom Boucher
c47a6a2164 fix: correct VALID_CONFIG_KEYS — remove internal state key, add missing public keys, migration hints (#2561)
* fix(#2530-2535): correct VALID_CONFIG_KEYS set — remove internal state key, add missing public keys, add migration hints

- Remove workflow._auto_chain_active from VALID_CONFIG_KEYS (internal runtime state, not user-settable) (#2530)
- Add hooks.workflow_guard to VALID_CONFIG_KEYS (read by gsd-workflow-guard.js hook, already documented) (#2531)
- Add workflow.ui_review to VALID_CONFIG_KEYS (read in autonomous.md via config-get) (#2532)
- Add workflow.max_discuss_passes to VALID_CONFIG_KEYS (read in discuss-phase.md via config-get) (#2533)
- Add CONFIG_KEY_SUGGESTIONS entries for sub_repos → planning.sub_repos and plan_checker → workflow.plan_check (#2535)
- Document workflow.ui_review and workflow.max_discuss_passes in docs/CONFIGURATION.md
- Clear INTERNAL_KEYS exemption in parity test (workflow._auto_chain_active removed from schema entirely)
- Add regression test file tests/bug-2530-valid-config-keys.test.cjs covering all 6 bugs
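
The migration hints above could work along these lines. The two
suggestion entries are the ones listed in this commit; the key set is
only a subset for illustration, and the checkConfigKey name and its
return shape are hypothetical.

```javascript
// Illustrative subset of the real key set.
const VALID_CONFIG_KEYS = new Set([
  'hooks.workflow_guard',
  'workflow.ui_review',
  'workflow.max_discuss_passes',
  'workflow.plan_check',
  'planning.sub_repos',
]);

// Legacy key -> current key, as listed in the commit message.
const CONFIG_KEY_SUGGESTIONS = {
  sub_repos: 'planning.sub_repos',
  plan_checker: 'workflow.plan_check',
};

// Hypothetical validator: accepts known keys and attaches a hint
// when a rejected key has a known replacement.
function checkConfigKey(key) {
  if (VALID_CONFIG_KEYS.has(key)) return { ok: true };
  const suggestion = CONFIG_KEY_SUGGESTIONS[key];
  return suggestion ? { ok: false, suggestion } : { ok: false };
}

console.log(checkConfigKey('sub_repos'));
console.log(checkConfigKey('workflow._auto_chain_active'));
```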

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: align SDK VALID_CONFIG_KEYS with CJS — remove internal key, add missing public keys

- Remove workflow._auto_chain_active from SDK (internal runtime state, not user-settable)
- Add workflow.ui_review, workflow.max_discuss_passes, hooks.workflow_guard to SDK
- Add ui_review and max_discuss_passes to Full Schema example in CONFIGURATION.md

Resolves CodeRabbit review on #2561.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 11:28:25 -04:00
Tom Boucher
b2534e8a05 feat(plan-phase): chunked mode + filesystem fallback for Windows stdio hang (#2499)
* feat(plan-phase): chunked mode + filesystem fallback for Windows stdio hang (#2310)

Addresses the 2026-04-16 Windows incident where gsd-planner wrote all 5
PLAN.md files to disk but Task() never returned, hanging the orchestrator
for 30+ minutes. Two mitigations:

1. Filesystem fallback (steps 9a, 11a): when Task() returns with an
   empty/truncated response but PLAN.md files exist on disk, surface a
   recoverable prompt (Accept plans / Retry planner / Stop) instead of
   silently failing. Directly addresses the post-restart recovery path.

2. Chunked mode (--chunked flag / workflow.plan_chunked config): splits the
   single long-lived planner Task into a short outline Task (~2 min) followed
   by N short per-plan Tasks (~3-5 min each). Each plan is committed
   individually for crash resilience. A hang loses one plan, not all of them.
   Resume detection skips plans already on disk on re-run.

RCA confirmed: task state mtime 14:29 vs PLAN.md writes 14:32-14:52 =
subagent completed normally, IPC return was dropped by Windows stdio deadlock.
Neither mitigation fixes the root cause (requires upstream Task() timeout
support); both bound damage and enable recovery.

New reference file planner-chunked.md keeps OUTLINE COMPLETE / PLAN COMPLETE
return formats out of gsd-planner.md (which sits at 46K near its size limit).

Closes #2310

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(plan-phase): address CodeRabbit review comments on #2499

- docs/CONFIGURATION.md: add workflow.plan_chunked to full JSON schema example
- plan-phase.md step 8.5.1: validate PLAN-OUTLINE.md with grep for OUTLINE
  COMPLETE marker before reusing (not just file existence)
- plan-phase.md step 8.5.2: validate per-plan PLAN.md has YAML frontmatter
  (head -1 grep for ---) before skipping in resume path
- plan-phase.md: add language tags (text/javascript/bash) to bare fenced
  code blocks in steps 8.5, 9a, 11a (markdownlint MD040)
- Rejected: commit_docs gate on per-plan commits (gsd-sdk query commit
  already respects commit_docs internally — comment was a false positive)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(plan-phase): route Accept-plans through step 9 PLANNING COMPLETE handling

Honors --skip-verify / plan_checker_enabled=false in 9a fallback path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 08:40:39 -04:00
Tom Boucher
2b494407e5 feat(assembly): add link mode for CLAUDE.md @-reference sections (#2484)
* feat(assembly): add link mode for CLAUDE.md @-reference sections (#2415)

Adds `claude_md_assembly.mode: "link"` config option that writes
`@.planning/<source>` instead of inlining content between GSD markers,
reducing typical CLAUDE.md size by ~65%. Per-block overrides available
via `claude_md_assembly.blocks.<section>`. Falls back to embed for
sections without a real source file (workflow, fallbacks).
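
As a rough sketch of the mode resolution described above: a per-block
override wins over the global mode, and link mode only applies when the
section has a source file. The renderSection name, the section object
shape, and `embed` as the default mode are assumptions.

```javascript
// Hypothetical section shape: { name, source, content }.
// `source` is a path under .planning/, or null when no real file exists.
function renderSection(section, config = {}) {
  const assembly = config.claude_md_assembly || {};
  // Per-block override first, then the global mode.
  // Assumption: 'embed' is the default when neither is set.
  const mode = (assembly.blocks || {})[section.name] || assembly.mode || 'embed';
  if (mode === 'link' && section.source) {
    return `@.planning/${section.source}`;
  }
  // Embed mode, or link mode falling back because there is no source file.
  return section.content;
}

const linkConfig = { claude_md_assembly: { mode: 'link' } };
console.log(
  renderSection({ name: 'project', source: 'PROJECT.md', content: '...' }, linkConfig)
);
console.log(
  renderSection({ name: 'workflow', source: null, content: 'inline text' }, linkConfig)
);
```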

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): add positive assertion for embedded workflow content (CodeRabbit #2484)

The negative assertion only confirmed @GSD defaults wasn't written.
Add assert.ok(content.includes('GSD Workflow Enforcement')) to verify
the workflow section is actually embedded inline when link mode falls back.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 18:27:55 -04:00
Rezolv
86fb9c85c3 docs(sdk): registry docs and gsd-sdk query call sites (#2302 Track B) (#2340)
* feat(sdk): golden parity harness and query handler CJS alignment (#2302 Track A)

Golden/read-only parity tests and registry alignment, query handler fixes
(check-completion, state-mutation, commit, validate, summary, etc.), and
WAITING.json dual-write for .gsd/.planning readers.

Refs gsd-build/get-shit-done#2341

* fix(sdk): getMilestoneInfo matches GSD ROADMAP (🟡, last bold, STATE fallback)

- Recognize in-flight 🟡 milestone bullets like 🚧.
- Derive from last **vX.Y Title** before ## Phases when emoji absent.
- Fall back to STATE.md milestone when ROADMAP is missing; use last bare vX.Y
  in cleaned text instead of first (avoids v1.0 from shipped list).
- Fixes init.execute-phase milestone_version and buildStateFrontmatter after
  state.begin-phase (syncStateFrontmatter).

* feat(sdk): phase list, plan task structure, requirements extract handlers

- Register phase.list-plans, phase.list-artifacts, plan.task-structure,
  requirements.extract-from-plans (SDK-only; golden-policy exceptions).
- Add unit tests; document in QUERY-HANDLERS.md.
- writeProfile: honor --output, render dimensions, return profile_path and dimensions_scored.

* feat(sdk): centralize getGsdAgentsDir in query helpers

Extract agent directory resolution to helpers (GSD_AGENTS_DIR, primary
~/.claude/agents, legacy path). Use from init and docs-init init bundles.

docs(15): add 15-CONTEXT for autonomous phase-15 run.

* feat(sdk): query CLI CJS fallback and session correlation

- createRegistry(eventStream, sessionId) threads correlation into mutation events
- gsd-sdk query falls back to gsd-tools.cjs when no native handler matches
  (disable with GSD_QUERY_FALLBACK=off); stderr bridge warnings
- Export createRegistry from @gsd-build/sdk; add sdk/README.md
- Update QUERY-HANDLERS.md and registry module docs for fallback + sessionId
- Agents: prefer node dist/cli.js query over cat/grep for STATE and plans

* fix(sdk): init phase_found parity, docs-init agents path, state field extract

- Normalize findPhase not-found to null before roadmap fallback (matches findPhaseInternal)

- docs-init: use detectRuntime + resolveAgentsDir for checkAgentsInstalled

- state.cjs stateExtractField: horizontal whitespace only after colon (YAML progress guard)

- Tests: commit_docs default true; config-get golden uses temp config; golden integration green

Refs: #2302

* refactor(sdk): share SessionJsonlRecord in profile-extract-messages

CodeRabbit nit: dedupe JSONL record shape for isGenuineUserMessage and streamExtractMessages.

* fix(sdk): address CodeRabbit major threads (paths, gates, audit, verify)

- Resolve @file: and CLI JSON indirection relative to projectDir; guard empty normalized query command

- plan.task-structure + intel extract/patch-meta: resolvePathUnderProject containment

- check.config-gates: safe string booleans; plan_checker alias precedence over plan_check default

- state.validate/sync: phaseTokenMatches + comparePhaseNum ordering

- verify.schema-drift: token match phase dirs; files_modified from parsed frontmatter

- audit-open: has_scan_errors, unreadable rows, human report when scans fail

- requirements PLANNED key PLAN for root PLAN.md; gsd-tools timeout note

- ingest-docs: repo-root path containment; classifier output slug-hash

Golden parity test strips has_scan_errors until CJS adds field.

* fix: Resolve CodeRabbit security and quality findings
- Secure intel.ts and cli.ts against path traversal
- Catch and validate git add status in commit.ts
- Expand roadmap milestone marker extraction
- Fix parsing array-of-objects in frontmatter YAML
- Fix unhandled config evaluations
- Improve coverage test parity mapping

* docs(sdk): registry docs and gsd-sdk query call sites (#2302 Track B)

Update CHANGELOG, architecture and user guides, workflow call sites, and read-guard tests for gsd-sdk query; sync ARCHITECTURE.md command/workflow counts and directory-tree totals with the repo (80 commands, 77 workflows).

Address CodeRabbit: fix markdown tables and emphasis; align CLI-TOOLS GSDTools and state.read docs with implementation; correct roadmap handler name in universal-anti-patterns; resolve settings workflow config path without relying on config_path from state.load.

Refs gsd-build/get-shit-done#2340

* test: raise planner character extraction limit to 48K

* fix(sdk): resolve build TS error and doc conflict markers
2026-04-20 18:09:21 -04:00
Tom Boucher
9d55d531a4 fix(#2432,#2424): pre-dispatch PLAN.md commit + reapply-patches baseline detection; docs(#2397): config schema drift (#2469)
- quick.md Step 5.6: commit PLAN.md to base branch before worktree executor
  spawn when USE_WORKTREES is active, preventing CC #36182 path-resolution
  drift that caused silent writes to main repo instead of worktree
- reapply-patches.md Option A: replace first-add commit heuristic with
  pristine_hashes SHA-256 matching from backup-meta.json so baseline detection
  works correctly on multi-cycle repos; first-add fallback kept for older
  installers without pristine_hashes
- CONFIGURATION.md: move security_enforcement/security_asvs_level/security_block_on
  to workflow.* (matches templates/config.json and workflow readers); rename
  context_profile → context (matches VALID_CONFIG_KEYS in config.cjs); add
  planning.sub_repos to schema example
- universal-anti-patterns.md + context-budget.md: fix context_window_tokens →
  context_window (the actual key name in config.cjs)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 10:11:00 -04:00
Tom Boucher
62eaa8dd7b docs: close doc drift vectors — bidirectional parity, manifest, schema-driven config (#2479)
Option A — ghost-entry guard (INVENTORY ⊆ actual):
  tests/inventory-source-parity.test.cjs parses every declared row in
  INVENTORY.md and asserts the source file exists. Catches deletions and
  renames that leave ghost entries behind.

Option B — auto-generated structural manifest:
  scripts/gen-inventory-manifest.cjs walks all six family dirs and emits
  docs/INVENTORY-MANIFEST.json. tests/inventory-manifest-sync.test.cjs
  fails CI when a new surface ships without a manifest update, surfacing
  exactly which entries are missing.

Option C — schema-driven config validation + docs parity:
  get-shit-done/bin/lib/config-schema.cjs extracted from config.cjs as
  the single source of truth for VALID_CONFIG_KEYS and dynamic patterns.
  config.cjs now imports from it. tests/config-schema-docs-parity.test.cjs
  asserts every exact-match key appears in docs/CONFIGURATION.md, surfacing
  14 previously undocumented keys (planning.sub_repos, workflow.ai_integration_phase,
  git.base_branch, learnings.max_inject, and 10 others) — all now documented
  in their appropriate sections.
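
The single-source-of-truth idea in Option C can be sketched as below. The key list is an illustrative subset taken from this message, and the `features.*` pattern, `DYNAMIC_KEY_PATTERNS`, `isValidConfigKey`, and `undocumentedKeys` are assumed names rather than the actual exports of config-schema.cjs.

```javascript
// Illustrative subset; the real list lives in get-shit-done/bin/lib/config-schema.cjs.
const VALID_CONFIG_KEYS = new Set([
  'git.base_branch',
  'learnings.max_inject',
  'planning.sub_repos',
]);

// Assumed dynamic pattern: any key under the features.* namespace is accepted.
const DYNAMIC_KEY_PATTERNS = [/^features\.[a-z_]+$/];

// A key is valid if it matches exactly or falls under a dynamic pattern.
function isValidConfigKey(key) {
  return VALID_CONFIG_KEYS.has(key) || DYNAMIC_KEY_PATTERNS.some((re) => re.test(key));
}

// Docs parity: every exact-match key must appear somewhere in the docs text.
function undocumentedKeys(docsText) {
  return [...VALID_CONFIG_KEYS].filter((key) => !docsText.includes(key));
}
```

Because both the validator and the docs-parity test read the same set, a key added to the schema without a docs row shows up in `undocumentedKeys` instead of drifting silently.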

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 09:39:05 -04:00
Logan
fbf30792f3 docs: authoritative shipped-surface inventory with filesystem-backed parity tests (#2390)
* docs: finish trust-bug fixes in user guide and commands

Correct load-bearing defects in the v1.36.0 docs corpus so readers stop
acting on wrong defaults and stale exhaustiveness claims.

- README.md: drop "Complete feature"/"Every command"/"All 18 agents"
  exhaustiveness claims; replace version-pinned "What's new in v1.32"
  bullet with a CHANGELOG pointer.
- CONFIGURATION.md: fix `claude_md_path` default (null/none -> `./CLAUDE.md`)
  in both Full Schema and core settings table; correct `workflow.tdd_mode`
  provenance from "Added in v1.37" to "Added in v1.36".
- USER-GUIDE.md: fix `workflow.discuss_mode` default (`standard` ->
  `discuss`) in the workflow-toggles table AND in the abbreviated Full
  Schema JSON block above it; align the Options cell with the shipped
  enum.
- COMMANDS.md: drop "Complete command syntax" subtitle overclaim to
  match the README posture.
- AGENTS.md: weaken "All 21 specialized agents" header to reflect that
  the `agents/` filesystem is authoritative (shipped roster is 31).

Part 1 of a stacked docs refresh series (PR 1/4).

* docs: refresh shipped surface coverage for v1.36

Close the v1.36.0 shipped-surface gaps in the docs corpus.

- COMMANDS.md: add /gsd-graphify section (build/query/status/diff) and
  its config gate; expand /gsd-quick with --validate flag and list/
  status/resume subcommands; expand /gsd-thread with list --open, list
  --resolved, close <slug>, status <slug>.
- CLI-TOOLS.md: replace the hardcoded "15 domain modules" count with a
  pointer to the Module Architecture table; add a graphify verb-family
  section (build/query/status/diff/snapshot); add Graphify and Learnings
  rows to the Module Architecture table.
- FEATURES.md: add TOC entries for #116 TDD Pipeline Mode and #117
  Knowledge Graph Integration; add the #117 body with REQ-GRAPH-01..05.
- CONFIGURATION.md: move security_enforcement / security_asvs_level /
  security_block_on from root into `workflow.*` in Full Schema to match
  templates/config.json and the gsd-sdk runtime reads; update Security
  Settings table to use the workflow.* prefix; add planning.sub_repos
  to Full Schema and description table; add a Graphify Settings section
  documenting graphify.enabled and graphify.build_timeout.

Note: VALID_CONFIG_KEYS in bin/lib/config.cjs does not yet include
workflow.security_* or planning.sub_repos, so config-set currently
rejects them. That is a pre-existing validator gap that this PR does
not attempt to fix; the docs now correctly describe where these keys
live per the shipped template and runtime reads.

Part 2 of a stacked docs refresh series (PR 2/4), based on PR 1.

* docs: make inventory authoritative and reconcile architecture

Upgrade docs/INVENTORY.md from "complete for agents, selective for others"
to authoritative across all six shipped-surface families, and reconcile
docs/ARCHITECTURE.md against the new inventory so the PR that introduces
INVENTORY does not also introduce an INVENTORY/ARCHITECTURE contradiction.

- docs/AGENTS.md: weaken "21 specialized agents" header to 21 primary +
  10 advanced (31 shipped); add new "Advanced and Specialized Agents"
  section with concise role cards for the 10 previously-omitted shipped
  agents (pattern-mapper, debug-session-manager, code-reviewer,
  code-fixer, ai-researcher, domain-researcher, eval-planner,
  eval-auditor, framework-selector, intel-updater); footnote the Agent
  Tool Permissions Summary as primary-agents-only so it no longer
  misleads.

- docs/INVENTORY.md (rewritten to be authoritative):
  * Full 31-agent roster with one-line role + spawner + primary-doc
    status per agent (unchanged from prior partial work).
  * Commands: full 75-row enumeration grouped by Core Workflow, Phase &
    Milestone Management, Session & Navigation, Codebase Intelligence,
    Review/Debug/Recovery, and Docs/Profile/Utilities — each row
    carries a one-line role derived from the command's frontmatter and
    a link to the source file.
  * Workflows: full 72-row enumeration covering every
    get-shit-done/workflows/*.md, with a one-line role per workflow and
    a column naming the user-facing command (or internal orchestrator)
    that invokes it.
  * References: full 41-row enumeration grouped by Core, Workflow,
    Thinking-Model clusters, and the Modular Planner decomposition,
    matching the groupings docs/ARCHITECTURE.md already uses; notes
    the few-shot-examples subdirectory separately.
  * CLI Modules and Hooks: unchanged — already full rosters.
  * Maintenance section rewritten to describe the drift-guard test
    suite that will land in PR4 (inventory-counts, commands-doc-parity,
    agents-doc-parity, cli-modules-doc-parity, hooks-doc-parity).

- docs/ARCHITECTURE.md reconciled against INVENTORY:
  * References block: drop the stale "(35 total)" count; point at
    INVENTORY.md#references-41-shipped for the authoritative count.
  * CLI Tools block: drop the stale "19 domain modules" count; point
    at INVENTORY.md#cli-modules-24-shipped for the authoritative roster.
  * Agent Spawn Categories: relabel as "Primary Agent Spawn Categories"
    and add a footer naming the 10 advanced agents and pointing at
    INVENTORY.md#agents-31-shipped for the full 31-agent roster.

- docs/CONFIGURATION.md: preserve the six model-profile rows added in
  the prior partial work, and tighten the fallback note so it names the
  13 shipped agents without an explicit profile row, documents
  model_overrides as the escape hatch, and points at INVENTORY.md for
  the authoritative 31-agent roster.

Part 3 of a stacked docs refresh series (PR 3/4). Remaining consistency
work (USER-GUIDE config-section delete-and-link, FEATURES.md TOC
reorder, ARCHITECTURE.md Hook-table expansion + installation-layout
collapse, CLI-TOOLS.md module-row additions, workflow-discuss-mode
invocation normalization, and the five doc-parity tests) lands in PR4.

* test(docs): add consistency guards and remove duplicate refs

Consolidates USER-GUIDE.md's command/config duplicates into pointers to
COMMANDS.md and CONFIGURATION.md (kills a ghost `resolve_model_ids` key
and a stale `discuss_mode: standard` default); reorders FEATURES.md TOC
chronologically so v1.32 precedes v1.34/1.35/1.36; expands
ARCHITECTURE.md's Hook table to the 11 shipped hooks
(gsd-read-injection-scanner, gsd-check-update-worker) and collapses
the installation-layout hook enumeration to the *.js/*.sh pattern form;
adds audit/gsd2-import/intel rows and state signal-*, audit-open,
from-gsd2 verbs to CLI-TOOLS.md; normalizes workflow-discuss-mode.md
invocations to `node gsd-tools.cjs config-set`.

Adds five drift guards anchored on docs/INVENTORY.md as the
authoritative roster: inventory-counts (all six families), plus
commands, agents, cli-modules, and hooks parity checks asserting that
every shipped surface has a row somewhere.

* fix(convergence): thread --ws to review agent; add stall and max-cycles behavioral tests

- Thread GSD_WS through to review agent spawn in plan-review-convergence
  workflow (step 5a) so --ws scoping is symmetric with planning step
- Add behavioral stall detection test: asserts workflow compares
  HIGH_COUNT >= prev_high_count and emits a stall warning
- Add behavioral --max-cycles 1 test: asserts workflow reaches escalation
  gate when cycle >= MAX_CYCLES with HIGH > 0 after a single cycle
- Include original PR files (commands, workflow, tests) as the branch
  predated the PR commits
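
The stall and max-cycles behaviors asserted by these tests can be sketched as a small decision function. This is a simplified model under assumed names (`nextStep`, `action`, `stalled`); the actual workflow is a markdown orchestration, not this code.

```javascript
// Decide what a plan-review convergence loop does after one review cycle.
// highCount: HIGH findings this cycle; prevHighCount: previous cycle's count, or null.
function nextStep({ cycle, maxCycles, highCount, prevHighCount }) {
  if (highCount === 0) return { action: 'converged', stalled: false };
  // Stall: the HIGH count did not drop compared with the previous cycle.
  const stalled = prevHighCount !== null && highCount >= prevHighCount;
  // Escalation gate: out of cycles while HIGH findings remain.
  if (cycle >= maxCycles) return { action: 'escalate', stalled };
  return { action: 'replan', stalled };
}
```

With `--max-cycles 1`, a single cycle that still reports HIGH findings reaches the escalation gate immediately, which is the case the second behavioral test covers.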

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(docs,config): PR #2390 review — security_* config keys and REQ-GRAPH-02 scope

Addresses trek-e's review items that don't require rebase:

- config.cjs: add workflow.security_enforcement, workflow.security_asvs_level,
  workflow.security_block_on to VALID_CONFIG_KEYS so gsd-sdk config-set accepts
  them (closes the gap where docs/CONFIGURATION.md listed keys the validator
  rejected).
- core.cjs: add matching CONFIG_DEFAULTS entries (true / 1 / 'high') so the
  canonical defaults table matches the documented values.
- config.cjs: wire the three keys into the new-project workflow defaults so
  fresh configs inherit them.
- planning-config.md: document the three keys in the Workflow Fields table,
  keeping the CONFIG_DEFAULTS ↔ doc parity test happy.
- config-field-docs.test.cjs: extend NAMESPACE_MAP so the flat keys in
  CONFIG_DEFAULTS resolve to their workflow.* doc rows.
- FEATURES.md REQ-GRAPH-02: split the slash-command surface (build|query|
  status|diff) from the CLI surface which additionally exposes `snapshot`
  (invoked automatically at the tail of `graphify build`). The prior text
  overstated the slash-command surface.

* docs(inventory): refresh rosters and counts for post-rebase drift

origin/main accumulated surfaces since this PR was authored:

- Agents: 31 → 33 (+ gsd-doc-classifier, gsd-doc-synthesizer)
- Commands: 76 → 82 (+ ingest-docs, ultraplan-phase, spike, spike-wrap-up,
  sketch, sketch-wrap-up)
- Workflows: 73 → 79 (same 6 names)
- References: 41 → 49 (+ debugger-philosophy, doc-conflict-engine,
  mandatory-initial-read, project-skills-discovery, sketch-interactivity,
  sketch-theme-system, sketch-tooling, sketch-variant-patterns)

Adds rows in the existing sub-groupings, introduces a Sketch References
subsection, and bumps all four headline counts. Roles are pulled from
source frontmatter / purpose blocks for each file. All 5 parity tests
(inventory-counts, agents-doc-parity, commands-doc-parity,
cli-modules-doc-parity, hooks-doc-parity) pass against this state —
156 assertions, 0 failures.

Also updates the 'Coverage note' advanced-agent count 10 → 12 and the
few-shot-examples footnote "41 top-level references" → "49" to keep the
file internally consistent.

* docs(agents): add advanced stubs for gsd-doc-classifier and gsd-doc-synthesizer

Both agents ship on main (spawned by /gsd-ingest-docs) but had no
coverage in docs/AGENTS.md. Adds the "advanced stub" entries (Role,
property table, Key behaviors) following the template used by the other
10 advanced/specialized agents in the same section.

Also updates the Agent Tool Permissions Summary scope note from
"10 advanced/specialized agents" to 12 to reflect the two new stubs.

* docs(commands): add entries for ingest-docs, ultraplan-phase, plan-review-convergence

These three commands ship on main (plan-review-convergence via trek-e's
4b452d29 commit on this branch) but had no user-facing section in
docs/COMMANDS.md — they lived only in INVENTORY.md. The commands-doc-parity
test already passes via INVENTORY, but the user-facing doc was missing
canonical explanations, argument tables, and examples.

- /gsd-plan-review-convergence → Core Workflow (after /gsd-plan-phase)
- /gsd-ultraplan-phase → Core Workflow (after plan-review-convergence)
- /gsd-ingest-docs → Brownfield (after /gsd-import, since both consume
  the references/doc-conflict-engine.md contract)

Content pulled from each command's frontmatter and workflow purpose block.

* test: remove redundant ARCHITECTURE.md count tests

tests/architecture-counts.test.cjs and tests/command-count-sync.test.cjs
were added when docs/ARCHITECTURE.md carried hardcoded counts for commands/
workflows/agents. With the PR #2390 cleanup, ARCHITECTURE.md no longer
owns those numbers — docs/INVENTORY.md does, enforced by
tests/inventory-counts.test.cjs (scans the same filesystem directories
with the same readdirSync filter).

Keeping these ARCHITECTURE-specific tests would re-introduce the hardcoded
counts they guard, defeating trek-e's review point. The single-source-of-
truth parity tests already catch the same drift scenarios.

Related: #2257 (the regression this replaced).

---------

Co-authored-by: Tom Boucher <trekkie@nomorestars.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 09:31:34 -04:00
Tom Boucher
e24cb18b72 feat(workflow): add opt-in TDD pipeline mode (#2119)
* feat(workflow): add opt-in TDD pipeline mode (workflow.tdd_mode)

Add workflow.tdd_mode config key (default: false) that enables
red-green-refactor as a first-class phase execution mode. When
enabled, the planner aggressively applies type: tdd to eligible
tasks and the executor enforces RED/GREEN/REFACTOR gate sequence
with fail-fast on unexpected GREEN before RED. An end-of-phase
collaborative review checkpoint verifies gate compliance.
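
A minimal sketch of the gate sequence check, assuming each gate reports whether the test suite passed at that point. The event shape and `checkGateSequence` are hypothetical; the executor enforces this through workflow instructions rather than a function like this.

```javascript
const GATES = ['RED', 'GREEN', 'REFACTOR'];

// events: ordered gate results, e.g. [{ gate: 'RED', testsPass: false }, ...].
// Fails fast on the first violation and reports why.
function checkGateSequence(events) {
  for (let i = 0; i < events.length; i++) {
    const { gate, testsPass } = events[i];
    if (gate !== GATES[i]) {
      return { ok: false, reason: `unexpected gate ${gate} at position ${i}` };
    }
    // RED must fail first; a passing suite here means the test proves nothing.
    if (gate === 'RED' && testsPass) {
      return { ok: false, reason: 'unexpected GREEN before RED' };
    }
    // GREEN and REFACTOR both require a passing suite.
    if (gate !== 'RED' && !testsPass) {
      return { ok: false, reason: `${gate} gate requires passing tests` };
    }
  }
  if (events.length < GATES.length) return { ok: false, reason: 'incomplete gate sequence' };
  return { ok: true, reason: null };
}
```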

Closes #1871

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): allowlist plan-phase.md in prompt injection scan

plan-phase.md exceeds 50K chars after TDD mode integration.
This is legitimate orchestration complexity, not prompt stuffing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: trigger CI run

* ci: trigger CI run

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 14:42:01 -04:00
Tom Boucher
362e5ac36c fix(docs): correct plan_bounce_passes default from 1 to 2
The actual code default in config.cjs and config.json template is 2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 10:39:31 -04:00
Tom Boucher
4553d356d2 docs: add v1.36.0 feature documentation for PRs #2100-#2111
Document 8 new features (108-115) in FEATURES.md, add --bounce/--cross-ai
flags to COMMANDS.md, new /gsd-extract-learnings command, 8 new config keys
in CONFIGURATION.md, and skill-manifest + --ws flag in CLI-TOOLS.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 09:54:21 -04:00
Tom Boucher
6c2795598a docs: release notes and documentation updates for v1.35.0 (#2079)
Closes #2080
2026-04-10 22:29:06 -04:00
RodZ
dced50d887 docs: remove duplicate keys in CONFIGURATION.md (#1895)
The Full Schema JSON block had `context_profile` listed twice, and the
"Hook Settings" section was duplicated later in the document.
2026-04-07 08:18:20 -04:00
Tom Boucher
641ea8ad42 docs: update documentation for v1.34.0 release (#1868)
2026-04-06 16:25:41 -04:00
Rezolv
12cdf6090c feat(workflows): auto-copy learnings to global store at phase completion (#1828)
* feat(workflows): add auto-copy learnings to global store at phase completion

* fix(workflows): address review feedback for learnings auto-copy

- Replace shell-interpolated ${phase_dir} with agent context instruction
- Remove unquoted glob pattern in bash snippet
- Use gsd-tools learnings copy instead of manual file detection
- Document features.* dynamic namespace in config.cjs

* docs(config): add features.* namespace to CONFIGURATION.md schema
2026-04-05 19:33:43 -04:00
Jeremy McSpadden
ade67cf9f9 fix: update MODEL_ALIAS_MAP to current Claude model IDs (#1691)
Fixes #1690

- opus: claude-opus-4-0 → claude-opus-4-6
- sonnet: claude-sonnet-4-5 → claude-sonnet-4-6
- haiku: claude-haiku-3-5 → claude-haiku-4-5
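
For illustration, the updated alias table could be expressed as below. The map values come from the list above; the `resolveModelId` helper and its pass-through behavior for unknown names are assumptions, not confirmed behavior of the shipped code.

```javascript
// Alias targets after this fix (values taken from the commit message).
const MODEL_ALIAS_MAP = {
  opus: 'claude-opus-4-6',
  sonnet: 'claude-sonnet-4-6',
  haiku: 'claude-haiku-4-5',
};

// Hypothetical helper: resolve an alias, passing full model IDs through unchanged.
function resolveModelId(aliasOrId) {
  return Object.prototype.hasOwnProperty.call(MODEL_ALIAS_MAP, aliasOrId)
    ? MODEL_ALIAS_MAP[aliasOrId]
    : aliasOrId;
}
```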

Also updates the stale haiku reference in sdk/src/session-runner.ts
and documentation examples in CONFIGURATION.md (en, ja-JP, ko-KR).
2026-04-04 15:49:56 -04:00
Tom Boucher
c8d7ab3501 docs: fill documentation gaps from v1.32.0 audit
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 08:54:14 -04:00
Tom Boucher
acf82440e5 docs: update English documentation for v1.32.0 release
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 08:28:50 -04:00
Quang Do
d4767ac2e0 fix: replace /gsd: slash command format with /gsd- skill format in all user-facing content (#1579)
* fix: replace /gsd: command format with /gsd- skill format in all suggestions

All next-step suggestions shown to users were still using the old colon
format (/gsd:xxx) which cannot be copy-pasted as skills. Migrated all
occurrences across agents/, commands/, get-shit-done/, docs/, README files,
bin/install.js (hardcoded defaults for claude runtime), and
get-shit-done/bin/lib/*.cjs (generate-claude-md templates and error messages).
Updated tests to assert new hyphen format instead of old colon format.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: migrate remaining /gsd: format to /gsd- in hooks, workflows, and sdk

Addresses remaining user-facing occurrences missed in the initial migration:

- hooks/: fix 4 user-facing messages (pause-work, update, fast, quick)
  and 2 comments in gsd-workflow-guard.js
- get-shit-done/workflows/: fix 21 Skill() literal calls that Claude
  executes directly (installer does not transform workflow content)
- sdk/prompt-sanitizer.ts: update regex to strip /gsd- format in addition
  to legacy /gsd: format; update JSDoc comment
- tests/: update autonomous-ui-steps, prompt-sanitizer to assert new format
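
A sketch of a sanitizer pattern that covers both command formats. The regex and the `stripGsdCommands` name are illustrative assumptions and not the actual contents of sdk/prompt-sanitizer.ts.

```javascript
// Matches both the legacy colon form (/gsd:plan-phase) and the hyphen form (/gsd-plan-phase).
// Note: this simple pattern makes no attempt to exclude file paths such as /gsd-tools.cjs.
const GSD_COMMAND_RE = /\/gsd[:-][a-z0-9][a-z0-9-]*/gi;

// Remove command mentions, then collapse the whitespace they leave behind.
function stripGsdCommands(text) {
  return text.replace(GSD_COMMAND_RE, '').replace(/[ \t]{2,}/g, ' ').trim();
}
```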

Note: commands/gsd/*.md frontmatter (name: gsd:xxx) intentionally unchanged
— installer derives skillName from directory path, not the name field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(plan-phase): preserve --chain flag in auto-advance sync and handle ui-phase gate in chain mode

Bug 1: step 15 sync-flag check only guarded against --auto, causing
_auto_chain_active to be cleared when plan-phase is invoked without
--auto in ARGUMENTS even though a --chain pipeline was active. Added
--chain to the guard condition, matching discuss-phase behaviour.

Bug 2: UI Design Contract gate (step 5.6) always exited the workflow
when UI-SPEC was missing, breaking the discuss --chain pipeline
silently. When _auto_chain_active is true, the gate now auto-invokes
gsd-ui-phase --auto via Skill() and continues to step 6 without
prompting. Manual invocations retain the existing AskUserQuestion flow.
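
The Bug 1 guard can be illustrated with a small predicate. `shouldClearAutoChain` is a hypothetical name, and the real check is expressed in the plan-phase workflow text rather than in code.

```javascript
// Returns true only when neither --auto nor --chain was passed, i.e. when it is
// safe to clear the _auto_chain_active flag for a manual invocation.
function shouldClearAutoChain(argumentsText) {
  const flags = argumentsText.trim().split(/\s+/);
  return !flags.includes('--auto') && !flags.includes('--chain');
}
```

Before the fix, the equivalent condition only checked `--auto`, so a `--chain` pipeline lost its flag as soon as plan-phase ran.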

* fix: remove <sub>/clear</sub> pattern and duplicate old-format command in discuss-phase.md

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 07:24:31 -04:00
Alex Alecu
fc1a4ccba1 merge: sync Kilo runtime branch with main
Bring the latest main branch updates into feat/kilo-runtime-support while preserving KILO_CONFIG resolution, Kilo agent permission conversion, and relative .claude path rewrites.
2026-04-02 16:00:09 +03:00
Tom Boucher
94a8005f97 docs: update documentation for v1.31.0 release
- CHANGELOG.md: 13 added, 1 changed, 21 fixed entries
- README.md: quick mode flags, --chain, security/schema features, commands table
- docs/FEATURES.md: v1.29-v1.31 feature sections (#56-#68)
- docs/CONFIGURATION.md: 5 new config options, Security Settings section
- Localized READMEs: ja-JP, ko-KR, zh-CN, pt-BR synced with English changes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:17:26 -04:00
Alex Alecu
ac4836d270 feat: add Kilo CLI runtime support
2026-03-31 15:59:31 +03:00
Tom Boucher
db3eeb8fe4 feat: agent skill injection via config (#1355)
Add agent_skills config section that maps agent types to skill directory
paths. At spawn time, workflows load configured skills and inject them
as <agent_skills> blocks in Task() prompts, giving subagents access to
project-specific skill files.
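
A rough sketch of the spawn-time injection, assuming `agent_skills` maps an agent type to one path or a list of paths. The config shape, the block's inner format, the example agent name, and the `buildAgentSkillsBlock` name are all assumptions made for illustration.

```javascript
// Assumed config shape: { agent_skills: { 'gsd-executor': ['skills/testing'] } }.
// Returns an <agent_skills> block for the Task() prompt, or '' when none are configured.
function buildAgentSkillsBlock(config, agentType) {
  const configured = (config.agent_skills && config.agent_skills[agentType]) || [];
  const dirs = Array.isArray(configured) ? configured : [configured];
  if (dirs.length === 0) return '';
  return ['<agent_skills>', ...dirs.map((dir) => `- ${dir}`), '</agent_skills>'].join('\n');
}
```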

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:47:40 -04:00
Tom Boucher
7457e33263 docs: v1.28 release documentation update
Add documentation for all new features merged since v1.27:

- Forensics command (/gsd:forensics) — post-mortem workflow investigation
- Milestone Summary (/gsd:milestone-summary) — project summary for onboarding
- Workstream Namespacing (/gsd:workstreams) — parallel milestone work
- Manager Dashboard (/gsd:manager) — interactive phase command center
- Assumptions Discussion Mode (workflow.discuss_mode) — codebase-first context
- UI Phase Auto-Detection — surface /gsd:ui-phase for UI-heavy projects
- Multi-Runtime Installer Selection — select multiple runtimes interactively

Updated files:
- README.md: new commands, config keys, assumptions mode callout
- docs/COMMANDS.md: 4 new command entries with full syntax
- docs/FEATURES.md: 7 new feature entries (#49-#55) with requirements
- docs/CONFIGURATION.md: 3 new workflow config keys
- docs/AGENTS.md: 2 new agents, count 15→18
- docs/USER-GUIDE.md: assumptions mode, forensics, workstreams, non-Claude runtimes
- docs/README.md: updated index with discuss-mode doc link

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 12:13:17 -04:00
Tom Boucher
d478e7f485 docs: clarify model profile setup for non-Claude runtimes
Document resolve_model_ids: "omit" (set automatically by installer for
non-Claude runtimes), explain model_overrides with non-Claude model IDs,
and add a decision table for choosing between inherit, omit, and
overrides. Updates CONFIGURATION.md, USER-GUIDE.md, and the
model-profiles.md skill reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 03:05:36 -04:00
Tom Boucher
d5f2a7ea19 docs: update README and docs/ for v1.27 release
Add documentation for all new v1.27 features:
- 7 new commands (/gsd:fast, /gsd:review, /gsd:plant-seed, /gsd:thread,
  /gsd:add-backlog, /gsd:review-backlog, /gsd:pr-branch)
- Security hardening (security.cjs, prompt guard hook, workflow guard hook)
- Multi-repo workspace support, discussion audit trail, advisor mode
- New config options (research_before_questions, hooks.workflow_guard)
- Updated component counts in ARCHITECTURE.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:21:53 -04:00
Tom Boucher
841da5a80d Merge pull request #1155 from gsd-build/Solvely/quick-task-branching
feat(quick): add quick-task branch support
2026-03-19 12:03:58 -04:00
Tom Boucher
214a621cb2 docs: update changelog, architecture, CLI tools, config, features, and user guide for parallel execution fixes
Documentation updates for #1116 fixes and code review findings:

- CHANGELOG.md: Add --no-verify commit flag, post-wave hook validation,
  STATE.md file locking, duplicate function removal, cross-platform init
- docs/ARCHITECTURE.md: Add 'Parallel Commit Safety' section explaining
  --no-verify strategy and STATE.md lockfile mechanism
- docs/CLI-TOOLS.md: Document --no-verify flag on commit command with
  usage guidance
- docs/CONFIGURATION.md: Add note about pre-commit hooks and parallel
  execution behavior under parallelization settings
- docs/FEATURES.md: Add --no-verify to executor capabilities, add
  'Parallel Safety' section
- docs/USER-GUIDE.md: Add troubleshooting entries for parallel execution
  build lock errors and Windows EPERM crashes, update recovery table
2026-03-18 17:18:01 -04:00
Tom Boucher
a9be67f504 docs: comprehensive v1.26 release documentation update (#1187)
Updates all docs to reflect v1.26.0 features and changes:

README.md:
- Add /gsd:ship and /gsd:next to command tables
- Add /gsd:session-report to Session section
- Update workflow to show ship step and auto-advance
- Update inherit profile description for non-Anthropic providers

docs/COMMANDS.md:
- Add /gsd:next command reference with full state detection logic
- Add /gsd:session-report command reference with report contents

docs/FEATURES.md:
- Add Auto-Advance (Next) feature (#14)
- Add Cross-Phase Regression Gate feature (#20)
- Add Requirements Coverage Gate feature (#21)
- Add Session Reporting feature (#24)
- Fix all section numbering (was broken with duplicates)
- Update inherit profile to mention non-Anthropic providers
- Renumber all 39 features consistently

docs/USER-GUIDE.md:
- Add /gsd:ship to workflow diagram
- Add /gsd:next and /gsd:session-report to command tables
- Add HANDOFF.json and reports/ to file structure
- Add troubleshooting for non-Anthropic model providers
- Add recovery entries for session-report and next
- Update example workflow to include ship and session-report

docs/CONFIGURATION.md:
- Update inherit profile to mention non-Anthropic providers
2026-03-18 14:54:02 -04:00
Colin
3a0c81133b feat(quick): add quick-task branch support
2026-03-17 12:04:02 -04:00