get-shit-done

mirror of https://github.com/glittercowboy/get-shit-done synced 2026-05-13 18:46:38 +02:00

Author	SHA1	Message	Date
Tom Boucher	b37c487325	feat(security): package legitimacy gate against slopsquatting (#3215 ) * feat(security): package legitimacy gate against slopsquatting (#2827) GSD's research → plan → execute pipeline had no install-time legitimacy gate: a hallucinated package name that passes `npm view` could flow all the way to `gsd-executor` running `npm install <malicious-pkg>` with no human checkpoint. This PR closes that gap. Changes: - gsd-phase-researcher: runs slopcheck on every recommended package; emits `## Package Legitimacy Audit` table; strips [SLOP] packages; ecosystem-specific verification (pip/npm/cargo); WebSearch-sourced packages tagged [ASSUMED]; ctx7 fallback uses `command -v` guard instead of `npx --yes` - gsd-planner: injects `checkpoint:human-verify` before [ASSUMED]/[SUS] installs; adds T-{phase}-SC STRIDE row to <threat_model> template; ctx7 fallback also uses `command -v` guard - gsd-executor: RULE 3 excludes package installs from auto-fix; failed installs surface as checkpoints, never silent substitutions - tests/package-legitimacy-gate.test.cjs: 24 structural assertions covering the full gate (node:test + node:assert, no raw .includes()) - docs: USER-GUIDE, COMMANDS, ARCHITECTURE updated with gate description - .changeset: Security fragment for v1.51 release notes Closes #2827 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: expand Package Legitimacy Gate documentation Add full user-facing depth to the gate docs across USER-GUIDE, COMMANDS, and ARCHITECTURE: - USER-GUIDE: rewrite gate section with concrete RESEARCH.md/PLAN.md examples, slopcheck verdict table, [ASSUMED] WebSearch tagging explanation, slopcheck-unavailable troubleshooting, and graceful degradation behavior - COMMANDS.md: expand /gsd-plan-phase gate note with verdict bullets; add install-failure checkpoint behavior to /gsd-execute-phase - ARCHITECTURE.md: expand gate section with threat model rationale, layer table, claim provenance integration, ecosystem coverage, and graceful degradation semantics Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): harden package legitimacy checkpoint semantics * fix(planner): satisfy size gates and tighten package gate wording --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 09:08:06 -04:00
Tom Boucher	924c697097	docs: replace retired /gsd-intel with /gsd-map-codebase --query (#3258 ) (#3260 ) * test: forbid stale /gsd-intel references in workflow/reference docs (#3258) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: replace retired /gsd-intel with /gsd-map-codebase --query (#3258) Fixes 5 stale references across the two primary source files called out in the issue. PR #2790 folded /gsd-intel into /gsd-map-codebase --query; these prose surfaces were not updated at that time. Fixes #3258 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: fix additional stale /gsd-intel references found in adversarial sweep (#3258) Sweep found 7 more occurrences in docs/INVENTORY.md (x2), docs/USER-GUIDE.md (x4), docs/FEATURES.md (x2), and agents/gsd-intel-updater.md (x2). All replaced with /gsd-map-codebase --query. The gsd-intel-updater agent name itself (without leading slash) is intentionally preserved. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * changeset: pr=3260 for #3258 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: fail loudly on unreadable files in bug-3258 regression scan (CR finding) Replace silent early-return on readFileSync failure with an explicit throw so unreadable files surface as test failures rather than skipped coverage gaps. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 09:06:37 -04:00
Tom Boucher	94f835af40	docs: add Prerelease editions install guidance (Next/Nightly/Insiders/Preview) (#3173 ) * docs: add Prerelease editions install guidance (Next/Nightly/Insiders/Preview) Documents the existing <RUNTIME>_CONFIG_DIR override pattern for users on prerelease runtime editions (Windsurf Next, Cursor Nightly, VS Code Insiders, Codex preview, JetBrains EAP, etc.) and explicitly states they are best-effort and not separately tested under release CI — consistent with the free-string runtime policy in #2517. Resolves the discoverability gap behind issue #3161 without enumerating each prerelease channel as a named runtime. Future "add <runtime>-next/-nightly" requests can be redirected to the new section. Closes #3172 * chore(changeset): set PR number for prerelease docs fragment	2026-05-06 12:44:48 -04:00
Tom Boucher	858c821829	docs: sweep stale /gsd-* command references across all user-facing docs Replace 30 absorbed/deleted standalone command forms with their consolidated flag-based equivalents across 25 files (English + 4 locales + AGENTS/CLI-TOOLS/CONFIGURATION): /gsd-session-report → /gsd-pause-work --report /gsd-list-phase-assumptions → /gsd-discuss-phase --assumptions /gsd-analyze-dependencies → /gsd-manager --analyze-deps /gsd-research-phase → /gsd-plan-phase --research-phase /gsd-plan-milestone-gaps → /gsd-audit-milestone /gsd-code-review-fix → /gsd-code-review --fix /gsd-spike-wrap-up → /gsd-spike --wrap-up /gsd-sketch-wrap-up → /gsd-sketch --wrap-up /gsd-set-profile → /gsd-config --profile /gsd-check-todos → /gsd-capture --list /gsd-add-todo → /gsd-capture /gsd-add-backlog → /gsd-capture --backlog /gsd-plant-seed → /gsd-capture --seed /gsd-note → /gsd-capture --note /gsd-add-phase → /gsd-phase /gsd-insert-phase → /gsd-phase --insert /gsd-edit-phase → /gsd-phase --edit /gsd-remove-phase → /gsd-phase --remove /gsd-new-workspace → /gsd-workspace --new /gsd-list-workspaces → /gsd-workspace --list /gsd-remove-workspace → /gsd-workspace --remove /gsd-sync-skills → /gsd-update --sync /gsd-reapply-patches → /gsd-update --reapply /gsd-scan → /gsd-map-codebase --fast /gsd-intel → /gsd-map-codebase --query /gsd-next → /gsd-progress --next /gsd-do → /gsd-progress --do /gsd-status → /gsd-progress /gsd-join-discord → /gsd-help Skipped: CHANGELOG, RELEASE notes, superpowers/specs (historical) Suite: 6971/6971 pass	2026-05-05 11:01:15 -04:00
Tom Boucher	d978ad6b2f	merge: sync main into PR #3114 and keep canonical next/profile commands	2026-05-04 23:32:42 -04:00
Tom Boucher	72f4c3b362	fix(docs): replace stale /gsd-next references with /gsd-progress --next	2026-05-04 22:54:01 -04:00
Tom Boucher	eb365f7336	docs: audit and update docs/ for v1.40.0 release (#3048 ) * docs(en): update FEATURES/USER-GUIDE/COMMANDS for v1.40.0 surface - FEATURES.md: append v1.40.0 section (#122 skill consolidation, #123 namespace meta-skills, #124 context-window guard, #125 phase-lifecycle status-line read-side); add to TOC. - USER-GUIDE.md: add slash-command form (hyphen vs colon) primer and namespace routing primer; replace deleted slash forms in walkthroughs (`/gsd-add-backlog`, `/gsd-plant-seed`, `/gsd-add-phase`, `/gsd-set-profile`, `/gsd-list-workspaces`, etc.) with consolidated forms (`/gsd-capture --backlog`, `/gsd-phase --insert`, `/gsd-config --profile`, `/gsd-workspace --list`, etc.); fix `/gsd-spike-wrap-up` and `/gsd-sketch-wrap-up` to flag form. - COMMANDS.md: clarify Command Syntax (Gemini = colon form, others = hyphen form); add Namespace Meta-Skills section with all six routers; add `--context` to /gsd-health flag table. Refs #3047 * docs(en): refresh INVENTORY/CLI-TOOLS/STATE-MD-LIFECYCLE for v1.40.0 - INVENTORY.md: workflow-row "Invoked by" column updated to point at consolidated commands (`/gsd-phase` family, `/gsd-workspace --list`, `/gsd-config --advanced/--integrations/--profile`, `/gsd-sketch --wrap-up`, `/gsd-spike --wrap-up`); CLI-modules row for `secrets.cjs` updated to `/gsd-config --integrations`. Command count and namespace meta-skills section already reflect 65 shipped (= 59 consolidated sub-skills + 6 ns-* routers). - CLI-TOOLS.md: add `validate context` row under Validation Commands with the 60 %/70 % threshold envelope used by `/gsd-health --context`. - STATE-MD-LIFECYCLE.md: flip status header from "proposed" to "shipped in v1.40.0" since `parseStateMd()` and `formatGsdState()` now read and render `active_phase`, `next_action`, `next_phases`, and `progress`. `docs/AGENTS.md` audited and verified clean — `gsd-code-fixer` row already lists the correct `/gsd-code-review --fix` spawner; no deleted-skill references found. `docs/INVENTORY-MANIFEST.json` audited and verified clean — already enumerates the 65 commands (including six ns-* routers) and contains no deleted slash forms. Refs #3047 * docs(en): cleanup ARCHITECTURE/CONFIGURATION for v1.40.0 - ARCHITECTURE.md: split Commands install-target list to call out the Gemini colon form (`/gsd:command-name`) vs hyphen form for every other runtime. Add a new subsection covering two-stage hierarchical routing via the six namespace meta-skills (#2792) and a paired note on the MCP token-budget interaction so readers see the two big per-turn cost levers in one place. - CONFIGURATION.md: rewrite three references to the deleted `/gsd-settings-advanced` and `/gsd-settings-integrations` slash forms to use the consolidated `/gsd-config --advanced` / `/gsd-config --integrations` invocations. Add a new "STATE.md Frontmatter (Phase Lifecycle)" section documenting the four optional fields (`active_phase`, `next_action`, `next_phases`, `progress`) read by the v1.40 status-line, with a pointer to STATE-MD-LIFECYCLE.md for the full reference. `docs/manual-update.md` audited and verified clean — already documents `/gsd-update --reapply` (the consolidated form), no reference to the deleted `/gsd-reapply-patches`. Refs #3047 * docs(i18n): mirror v1.40.0 slash-command rename into ja-JP/ko-KR/zh-CN/pt-BR Mechanical token-level renames only — every reference to a deleted micro-skill slash form is rewritten to the consolidated form on the matching parent skill. No prose was machine-translated; new prose sections (slash-form primer, namespace routing primer, v1.40 feature entries, STATE.md frontmatter) were left for human translator follow-up. Renames applied uniformly across all four trees: /gsd-add-todo, /gsd-add-note, /gsd-add-backlog, /gsd-plant-seed, /gsd-check-todos → /gsd-capture[ --note\| --backlog\|--seed\|--list] /gsd-add-phase, /gsd-insert-phase, /gsd-remove-phase, /gsd-edit-phase → /gsd-phase[ --insert\| --remove\|--edit] /gsd-new-workspace, /gsd-list-workspaces, /gsd-remove-workspace → /gsd-workspace[ --new\| --list\|--remove] /gsd-settings-advanced, /gsd-settings-integrations, /gsd-set-profile → /gsd-config[ --advanced\| --integrations\|--profile] /gsd-sketch-wrap-up → /gsd-sketch --wrap-up /gsd-spike-wrap-up → /gsd-spike --wrap-up /gsd-reapply-patches → /gsd-update --reapply /gsd-code-review-fix → /gsd-code-review --fix /gsd-plan-milestone-gaps → /gsd-audit-milestone Refs #3047 * docs(changelog): regroup [Unreleased] under Feature/Enhancement/Fix Replace the existing Keep-a-Changelog \`Added\` / \`Changed\` / \`Performance\` / \`Removed\` / \`Fixed\` sub-headers in the [Unreleased] block with the issue/PR template taxonomy: Added → Feature Changed / Performance → Enhancement Removed → Enhancement Fixed → Fix Order within the release: Feature → Enhancement → Fix. Every bullet preserved verbatim — only headers and grouping changed; the awkward inline-versioned headers (\`### Added — 1.40.0-rc.1\`, \`### Changed — 1.40.0-rc.1\`, \`### Fixed — 1.40.0-rc.1\`) folded into the same buckets with the \`— 1.40.0-rc.1\` suffix dropped, since the [Unreleased] block IS 1.40.0-rc.1. The [1.39.2] hotfix block called out in #3047's spec does not yet exist in CHANGELOG.md (the previously released hotfix is [1.39.1]), so this commit only regroups [Unreleased]. Older release blocks ([1.39.1] and earlier) are frozen and untouched. Refs #3047 * docs(changeset): add fragment for v1.40.0 doc audit Refs #3047 * docs(en): strip leading / from deleted slash-command tokens in FEATURES REQ-CONSOLIDATE-03 and REQ-CONSOLIDATE-04 listed deleted commands by their `/gsd-foo` form for the historical record. The docs-parity tests in bug-3010, bug-3029-3034, and bug-3042-3044 use the regex `/\/gsd-[a-z0-9][a-z0-9-]/g` to scan user-facing surfaces for any remaining mention of removed slash forms — they cannot tell prose about a deleted command from a live recommendation. Strip the leading slash from the bare-name references (preserve the historical text otherwise). Tests now require a `/` prefix to match, so `gsd-add-todo` reads identically to a human but no longer trips the parser. Verified locally: 65/65 tests pass across the three docs-parity suites that were red on CI run 25270072600. Refs #3047 docs(en): fix CR feedback + drop literal /gsd:plan-phase from USER-GUIDE CI: tests/bug-2543-gsd-slash-namespace.test.cjs flagged docs/USER-GUIDE.md:35 for embedding the literal `/gsd:plan-phase` token in the parenthetical Gemini-form example. The test scans every .md under docs/ for `/gsd:<live-cmd>` because non-Gemini surfaces must not advertise the colon form. Replaced the literal example with a prose substitution rule. CR: docs/ARCHITECTURE.md:125 — the namespace meta-skills were listed by file-prefix (`gsd-ns-workflow`) but the invocable frontmatter `name:` is the bare form (`gsd-workflow`). Verified against the six `commands/gsd/ns-*.md` files. Replaced with the canonical names and noted the file/name disagreement in-line. CR: docs/COMMANDS.md:723 — `v1.40` aligned to canonical `v1.40.0`. CR: docs/FEATURES.md:2679 — REQ-CTX-GUARD-02 advertised the wrong invocation (`gsd-tools validate context`). The shipped handler is exposed via `gsd-sdk query validate.context` and requires explicit `--tokens-used <int>` + `--context-window <int>` flags (verified against sdk/src/query/validate.ts:849-882 and get-shit-done/bin/lib/validate-command-router.cjs:19-36). CR: docs/zh-CN/README.md:533 — added `inherit` to the profile-options parenthetical to match the canonical set (verified against model-profiles.cjs:29 `VALID_PROFILES = […MODEL_PROFILES['gsd-planner'], 'inherit']`). Verified locally: 74/74 tests pass across the four docs-parity suites that were red on CI runs 25270072600 and 25270182903. Refs #3047	2026-05-03 07:33:27 -04:00
Tom Boucher	1e6737cd8e	feat(plan-phase): --research-phase flag + scrub stale slash-command refs (#3042 , #3044 ) (#3045 ) * feat(plan-phase): --research-phase flag absorbs deleted /gsd-research-phase + scrub stale refs (#3042, #3044) #3042 (orphaned research-phase): /gsd-research-phase had a workflow file but no slash-command stub. Rather than restore the orphan, the research- only capability is now a flag on /gsd-plan-phase: /gsd-plan-phase --research-phase <N> When set, the workflow scopes to phase N, runs the research step (Section 5 of the existing plan-phase workflow), then early-exits before the planner/plan-checker/verifier chain. Per RCA against the deleted standalone, the flag adds two modifiers to fully cover the original surface (Option B from the RCA discussion): - --view : print existing RESEARCH.md to stdout, no spawn. Cheapest mode for the correction-without-replanning loop the issue reporter explicitly called out. Errors with a clear hint if RESEARCH.md is missing. - --research : reuse the existing "force re-research" semantics. In research-only mode this skips the existing-RESEARCH.md prompt and re-spawns unconditionally. - Neither flag, RESEARCH.md exists : prompt update/view/skip. Mirrors the deleted standalone's existing-artifact menu (#3042 RCA). #3044 (stale slash-command refs): scrubbed five deleted commands from all user-facing surfaces, including English docs, 4 localized doc sets (ja-JP, ko-KR, zh-CN, pt-BR), workflows, templates, and references. /gsd-check-todos → /gsd-capture --list /gsd-new-workspace → /gsd-workspace --new /gsd-status → /gsd-progress /gsd-plan-milestone-gaps → table rows / orphan sections removed (PR #3038 only scrubbed workflows/agent; missed the docs surfaces this PR covers) /gsd-research-phase → /gsd-plan-phase --research-phase Includes a fix to docs/issue-driven-orchestration.md (PR #3036) which itself referenced /gsd-new-workspace 4 times — self-correction. Removed: - get-shit-done/workflows/research-phase.md (orphan, capability absorbed into --research-phase flag) Tests: - tests/bug-3042-3044-research-flag-and-stale-refs.test.cjs — 46 structural-IR tests across both bugs: - argument-hint advertises --research-phase + --view - workflow parses --research-phase, sets RESEARCH_ONLY, early-exits before planner - --view prints RESEARCH.md without spawning - --research forces refresh in research-only mode - existing-RESEARCH.md prompt path with update/view/skip - workflows/research-phase.md is removed - 5 deleted slash-commands absent from 17 English user-facing surfaces + 16 localized doc surfaces (4 locales × 4 docs each) - replacement command tokens present where deleted ones lived 6950/6950 full suite pass. Lints clean. Closes #3042 Closes #3044 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: address all 8 CR findings on PR #3045 Major (3): - get-shit-done/workflows/plan-phase.md:344 — added explicit early-exit guard at Section 5.1: "Skip if RESEARCH_ONLY=true". Without it, an LLM could fall through "use existing, skip to step 6" → planner spawn, violating the research-only contract. The guard makes the early-exit unreachable from any non-research-only branch. - get-shit-done/references/continuation-format.md (3 examples) + zh-CN/.../continuation-format.md (3 examples) — pointed to `/gsd-plan-phase --research-phase` but docs/COMMANDS.md didn't document the flag. Added a full --research-phase + --view + --research modifier section to the /gsd-plan-phase flag table in COMMANDS.md so the canonical reference matches the continuation examples. Minor (5): - docs/FEATURES.md:1632 — `/gsd-plan-phase --research-phase` → `/gsd-plan-phase --research-phase <N>` (include required arg). - get-shit-done/templates/README.md:46 — NN-VALIDATION.md producer reverted from `/gsd-plan-phase --research-phase` (Nyquist) to plain `/gsd-plan-phase` (Nyquist). VALIDATION.md is created during normal Nyquist flow, not research-only mode — the bulk replacement was wrong for that line. - get-shit-done/workflows/help.md:89 — signature line was missing `--research`; added it alongside `--research-phase` and `--view`. - tests/bug-3042-3044-...:197 — promptHasView/promptHasSkip were tautological (matched anywhere in 1700-line workflow). Tightened to a proximity check anchored on "RESEARCH.md already exists" prompt header within a 600-char window. Updated workflow to emit that literal phrase. - tests/feat-2840-...:95 — workspace assertion used `/gsd-workspace` but the documented replacement is `/gsd-workspace --new`. Tightened to require both tokens (in 3 places: requiredCommands list, regex in conceptPairs, error message). 6950/6950 full suite pass. Lint clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-02 23:12:50 -04:00
Tom Boucher	7714b5244b	fix(workflows,docs): scrub stale /gsd-code-review-fix and /gsd-plan-milestone-gaps refs (#3029 , #3034 ) (#3038 ) * fix(workflows,docs): scrub stale /gsd-code-review-fix and /gsd-plan-milestone-gaps refs (#3029, #3034) #2790 consolidated /gsd-code-review-fix into /gsd-code-review --fix and deleted /gsd-plan-milestone-gaps in favor of inline gap planning as part of /gsd-audit-milestone's output. The deletion was propagated through some surfaces (#2950 covered help/do/settings/discuss-phase/etc.) but several user-facing surfaces still emitted the old forms: #3029 — /gsd-code-review-fix references in: - agents/gsd-code-fixer.md (description, "Spawned by", recovery prose) - get-shit-done/workflows/code-review.md (offer text) - get-shit-done/workflows/execute-phase.md (offer text) - get-shit-done/workflows/code-review-fix.md (internal retry hints) - docs/INVENTORY.md (agent + workflow rows) - docs/CONFIGURATION.md (workflow.code_review row) - docs/USER-GUIDE.md (3 occurrences in walkthrough) - docs/AGENTS.md (gsd-code-fixer agent stub) - docs/FEATURES.md (commands list + REQ-REVIEW-04) All replaced with /gsd-code-review --fix. Internal retry hints in the workflow file itself updated to point at the new form. Release notes (docs/RELEASE-.md) and gsd-ns-review's "absorbed by" deletion note left unchanged — historical/explanatory content. #3034 — /gsd-plan-milestone-gaps references in: - get-shit-done/workflows/audit-milestone.md (<offer_next> blocks for gaps_found and tech_debt: lines 281, 323) - commands/gsd/complete-milestone.md (gaps_found pre-flight: lines 46, 57) Replaced with inline closure path: /gsd-phase --insert <N> "Close gap: <REQ-ID> ..." /gsd-discuss-phase <N> /gsd-plan-phase <N> /gsd-execute-phase <N> Plus a Nyquist-coverage hint pointing at /gsd-validate-phase / /gsd-secure-phase for retroactive audit-chain hygiene gaps. The gsd-ns-project SKILL.md "deleted by #2790" note is preserved (it's the canonical pointer for future readers asking what happened to the command). Tests: - tests/bug-3029-3034-stale-command-routes.test.cjs — parser-based assertions per fixed surface, plus a structural cross-check that gsd-ns-project keeps the deletion note. 15 tests, all green. - 6905/6905 full suite passes. Closes #3029 Closes #3034 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> fix: address CR feedback on PR #3038 — argument order, structural tests, agent count CR findings on PR #3038: 1. docs/USER-GUIDE.md (Major) — `--fix` examples used flag-first form (`/gsd-code-review --fix 3`), but the supported CLI grammar is phase-first (`/gsd-code-review 3 --fix`). The original sed-based replacement preserved the position of the `gsd-code-review-fix` token, producing the wrong order. Fixed in USER-GUIDE.md (3 occurrences) and the same drift in the workflow surfaces: - get-shit-done/workflows/code-review-fix.md (2 retry hints) - get-shit-done/workflows/code-review.md (offer text) - get-shit-done/workflows/execute-phase.md (offer text) 2. docs/AGENTS.md (Minor) — internal count drift: line 483 said "Ten additional agents" but line 725 said "12 advanced/specialized". Filesystem reality: 33 agents total, 21 primary, 12 specialized (count of `### ` stubs in the Advanced and Specialized section). Updated lines 3, 13, 483 to use 12/33 and added the two missing names (doc-classifier, doc-synthesizer) to the inline list at line 13. 3. tests:94 (Major refactor suggestion) — `.includes()` token checks were source-grep style. Refactored to a typed-IR pattern: extract the SET of slash-command tokens via regex, assert membership on the parsed Set instead of substring scanning the raw file text. Added the `allow-test-rule` comment explaining the IR-build vs IR-assertion split per scripts/lint-no-source-grep.cjs convention. 4. tests:130 (Major) — replacement-path assertion was file-wide and could false-pass on generic mentions of "inline" elsewhere in the file. Refactored: `extractOfferBlocks(content)` returns the typed list of `<offer_next>` and "Pre-flight" blocks where the deleted command previously lived, and the assertion runs against those blocks specifically. Now requires `/gsd-phase --insert` or inline-audit prose to appear in the same offer block, not just somewhere in the file. 15/15 targeted tests pass. 6905/6905 full suite pass. Lints clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-02 17:23:44 -04:00
Tom Boucher	117b3ec009	docs: add issue-driven orchestration guide (#2840 ) (#3036 ) * docs: add issue-driven orchestration guide (#2840) Adds docs/issue-driven-orchestration.md — a recipe for driving GSD from a GitHub / Linear / Jira issue using existing primitives. Maps Symphony-style orchestration concepts onto GSD commands without vendoring code, adding a daemon, or introducing tracker integration. Concept mapping covers: - WORKFLOW.md → ROADMAP.md / STATE.md / phase CONTEXT.md / phase PLAN.md - isolated agent workspace → /gsd-new-workspace --strategy worktree - agent dispatch → /gsd-manager (interactive), /gsd-autonomous (unattended) - per-phase steps → /gsd-discuss-phase → /gsd-plan-phase → /gsd-execute-phase - proof-of-work → /gsd-verify-work (UAT.md persists across /clear) - adversarial review → /gsd-review (cross-AI peer review) - human merge gate → /gsd-ship - follow-up capture → /gsd-note, /gsd-plant-seed, /gsd-new-milestone End-to-end flow walks through 7 numbered steps from picking the tracker issue to capturing follow-ups. Safety boundaries (isolated worktrees, explicit human review, no automatic public posting, verification before ship) and non-goals (no vendoring, no daemon, no mandatory tracker, no gate bypass, no command-surface expansion) are spelled out explicitly so the doc cannot drift into "let's just add one more flag". Cross-linked from docs/README.md (Documentation Index) and docs/USER-GUIDE.md (Table of Contents preamble). Tests: tests/feat-2840-issue-driven-orchestration-guide.test.cjs — 9 structural-IR tests parse the guide into a typed record and assert on flags (commandsPresent, conceptPairs, nonGoalFlags, safetyFlags, numberedSteps). Fence-language MD040 check enforced. Cross-link presence enforced. No raw-text assertions on prose. 6890/6890 tests pass. Lint:tests clean (allow-test-rule comment justifies the doc-shape parser per scripts/lint-no-source-grep.cjs escape hatch). Lint:changeset clean. Closes #2840 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(test): guard USER-GUIDE.md existsSync before read (CR #3036) CR Minor: cross-linked-from-USER-GUIDE.md test called fs.readFileSync directly without first asserting fs.existsSync, asymmetric with the README.md test above. A missing USER-GUIDE.md would throw ENOENT instead of producing a meaningful assertion message. Mirror the null-guard pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-02 16:57:42 -04:00
Tom Boucher	95d2bc20f8	feat(hooks): opt-in SessionStart update banner for non-statusline users (#2795 ) (#3035 ) * feat(hooks): opt-in SessionStart update banner for non-statusline users (#2795) When a user declines (or keeps a non-GSD) statusline at install time, the installer now offers an opt-in SessionStart banner that surfaces GSD update availability. The banner reads the existing ~/.cache/gsd/gsd-update-check.json cache (written by gsd-check-update-worker.js) and emits a single systemMessage line only when update_available is true: GSD update available: <installed> → <latest>. Run /gsd-update. It is silent when up-to-date and rate-limits "check failed" diagnostics to once per 24h via a sentinel file so a corrupt cache doesn't nag every session. Removed cleanly by `npx get-shit-done-cc --uninstall` which strips both the script and the SessionStart entry. The banner is never offered when GSD's statusline is being installed (statusline already surfaces update info, so re-prompting would be noise). Implementation: - hooks/gsd-update-banner.js — pure functions buildBannerOutput, shouldSuppressFailureWarning, readCache; thin main() wires them. - bin/install.js — handleUpdateBanner() prompt, parseUpdateBannerInput(), buildUpdateBannerHookEntry(), buildUpdateBannerPromptText(); chained into installAllRuntimes() so finalize() receives both flags. updateBannerCommand computed alongside the other JS-hook commands; finishInstall() registers the SessionStart entry only when shouldInstallBanner === true and the hook file is present at the target. - Hook ships in scripts/build-hooks.js HOOKS_TO_COPY, listed in MANAGED_HOOKS for stale-detection in gsd-check-update-worker.js, in the uninstall hook-removal lists in install.js, and in the rewriteLegacyManagedNodeHookCommands allowlist. Tests: - tests/feat-2795-update-banner.test.cjs — 22 tests, structural-IR assertions on parsed JSON envelopes (no raw-text matching). Covers pure-function branches (cache present/absent, parseError, rate-limit suppression, missing version fields), end-to-end hook invocation against fixture cache states, and install.js wiring (prompt text, input parsing, hook entry shape). - tests/trae-install.test.cjs — updated install() return-shape assertion to include updateBannerCommand: null for the no-settings runtime. - 6881/6881 tests pass. Docs (bundled in same commit per the bundle-docs-with-code skill): - docs/USER-GUIDE.md — new "Surface GSD Update Notifications Without GSD's Statusline" task section with opt-in/opt-out instructions. - docs/FEATURES.md — REQ-HOOK-08 added; "Update Banner" subsection under the Hook System feature with cache flow + removal path. - docs/INVENTORY.md — hook count 11 → 12, new row for gsd-update-banner.js. - docs/INVENTORY-MANIFEST.json — regenerated. Closes #2795 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(install): gate banner prompt on actual installability (CR #3035) CodeRabbit findings on PR #3035: - bin/install.js (Major): continueAfterStatusline gated banner prompt on the raw `shouldInstallStatusline` flag from handleStatusline. But finishInstall later silently skips the statusline write on local installs unless --force-statusline is set (#2248). Two consequences: 1. Interactive local Claude/Gemini installs got neither a statusline nor a banner offer. 2. Codex/Cursor/Copilot/Windsurf/Trae/Cline-only installs (where every result.updateBannerCommand is null) still got prompted even though the choice was silently ignored. Fix: derive willInstallStatusline = shouldInstallStatusline && (isGlobal \|\| forceStatusline), and gate the banner prompt on a canInstallBanner precondition computed from results[].updateBannerCommand. Pass the raw shouldInstallStatusline through to finalize unchanged so per-runtime statusline gating in finishInstall is unaffected. - tests/feat-2795-update-banner.test.cjs (Minor): rate-limit suppression test parsed r1.stdout without first asserting r1.status === 0. Other e2e tests in this file (lines 210, 241) do this. A non-zero exit would surface as a cryptic SyntaxError instead of a status assertion failure. Fix applied verbatim. 6881/6881 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-02 16:33:16 -04:00
Tom Boucher	8c43ba7301	docs(#3025 ): MCP tool schema as a context-budget concern (#3032 ) * docs(#3025): MCP tool schema as a context-budget concern Adds documentation covering the largest GSD cost lever that GSD itself does not own: MCP tool schema injection. Every enabled MCP server adds its schema to every turn (often 20k+ tokens for heavyweight servers like browser/playwright, mac-tools, etc.), which can dwarf whatever `model_profile` tuning saves. Two doc surfaces (per the bundle-docs-with-code skill depth gradient): 1. get-shit-done/references/context-budget.md - New "MCP Tool Schema Cost (Harness Concern)" section. - Explains schemas-per-turn cost framing. - Names enabledMcpjsonServers / disabledMcpjsonServers and .claude/settings.json explicitly. - Pre-phase audit checklist: browser/playwright, platform-specific, cross-project/stale, duplicate/shadow. - Explicit "GSD does not manage MCP enablement — harness concern" statement so users don't hunt for a GSD setting. - Links to Anthropic Claude Code MCP docs as canonical reference. - Notes compounding interaction with model_profile (additive levers). 2. docs/USER-GUIDE.md - New task-oriented "Trim MCP servers to reduce per-turn cost" section above "Using Non-Claude Runtimes". - Same checklist condensed. - Cross-link to context-budget.md for the full reference. Tests: - tests/feat-3025-mcp-token-budget-docs.test.cjs (12 cases) parses both docs into typed semantic-flag records and asserts behavioral invariants (mentions key, includes audit, names harness, etc.) rather than substring-matching prose. Adheres to CONTRIBUTING.md no-source-grep — section can be reworded freely as long as the required semantics survive. - Markdownlint pre-flight tests (MD040 fence language, MD056 table column count) per the bundle-docs-with-code skill so CR can't ratchet on prose nitpicks across multiple review rounds. Verification: - 12/12 pass on regression test - 6857/6857 full suite (12 net new) - lint-no-source-grep clean (377 test files) Companion to #3023 (per-phase-type model map) and #3024 (dynamic routing). Together they cover the three biggest cost levers users ask about; this issue covers the one GSD does not own. Closes #3025 * docs(#3025): batch 3 CR fixes — pr id, relative link, named flag CodeRabbit on PR #3032 (3 minor — 2 inline + 1 nitpick), all in one push per the bundle-docs-with-code skill (avoid per-round nitpick ratchet): 1. Inline (Minor) — .changeset/mcp-token-budget-docs.md:3 `pr: TBD` → `pr: 3032` so changeset tooling can link the entry. 2. Inline (Minor) — docs/USER-GUIDE.md:1101 Used a hardcoded `https://github.com/.../blob/main/...` URL for the cross-link to `context-budget.md`. Rest of USER-GUIDE.md uses relative links. Switched to `../get-shit-done/references/context- budget.md#mcp-tool-schema-cost-harness-concern` so feature-branch work shows the right content and rename-resilience is preserved. 3. Nitpick — tests/feat-3025-mcp-token-budget-docs.test.cjs:234 The cross-link assertion used an inline `/context-budget/i.test(...)` while every other invariant in the file lived as a named flag in `parseMcpBudgetSection`. Per CONTRIBUTING.md no-source-grep, added `crossLinksContextBudget` to the parser and asserted on `parsed.crossLinksContextBudget` so the cross-link rule sits next to its siblings. Verification: - 12/12 pass on regression test (no count change; refactor only) - No source code changes, only docs + tests * test(#3025): strip inline markdown before phrase-match (CR nitpick) CodeRabbit caught that the `explainsHarnessNotGsd` primary regex branch couldn't match "GSD does not manage" in context-budget.md because the markdown bold markers (``) sit between contiguous words. The test passed today only via the fallback `harness (concern\|setting\|controlled)` branch — the primary branch was effectively dead code. Fix: strip inline markdown emphasis (``, ``, `~~`) and inline- code backticks before any phrase-matching in `parseMcpBudgetSection`. All seven flag computations now run against the stripped text so markdown formatting can't silently invalidate any invariant. Underscores are intentionally NOT stripped — `model_profile` and other snake_case identifiers must survive intact for the mentionsModelProfileInteraction check to find them. Verification: 12/12 still pass; primary branches now fire on real markdown content rather than relying on fallbacks. test(#3025): guard markdownlint tests against null section (CR nitpick) CodeRabbit caught that the MD040 and MD056 markdownlint pre-flight tests called `section.match(...)` and `section.split('\n')` directly on the value returned by `extractSection`, which returns null when no matching header is found. If the MCP section is ever removed (regression), both tests would throw `TypeError: Cannot read properties of null` instead of producing a clean assertion failure naming the actual problem. The semantic tests above are protected because parseMcpBudgetSection short-circuits to a typed-falsy record on null input. The markdownlint tests bypassed that guard since they need raw section text, not parsed flags. Added `assert.ok(section, ...)` preconditions to both so a missing section produces a meaningful failure message. No content changes; defensive programming only. Verification: 12/12 still pass.	2026-05-02 15:24:26 -04:00
Tom Boucher	e1d661ece0	feat(#3024 ): dynamic routing with failure-tier escalation (#3031 ) * feat(#3024): dynamic routing with failure-tier escalation Adds a `dynamic_routing` block to .planning/config.json that lets the resolver start agents on a cheap tier and escalate one tier up when the orchestrator detects a soft failure (verification inconclusive, plan-check FLAG, etc.). Solves the "pay Opus rates as insurance" anti-pattern by making escalation observed-quality-driven. Architecture: - AGENT_DEFAULT_TIERS map (light/standard/heavy) — every agent in MODEL_PROFILES declares a default tier; tests assert coverage so adding a new agent without updating the map fails CI. - nextTier(currentTier) helper — light → standard → heavy → heavy (heavy stays at heavy; can't go further). - resolveModelForTier(cwd, agentType, attempt) — new resolver. The orchestrator tracks the attempt counter and passes 0 for the first spawn, 1+ on escalation. The resolver caps internally at max_escalations so the orchestrator can blindly bump the counter. - Schema validation: dynamic_routing.enabled / escalate_on_failure / max_escalations / tier_models.<light\|standard\|heavy>. Unknown tiers and unknown sub-keys rejected at config-set time. - SDK schema mirror updated to keep CJS/SDK in lockstep (#2653). Resolution precedence (highest → lowest): 1. model_overrides[<agent>] (full IDs accepted) 2. dynamic_routing.tier_models[<tier>] (NEW; escalation-aware) 3. models[<phase_type>] (#3023 phase-type map) 4. model_profile (per-agent column) 5. Runtime default Backward compatibility: dynamic_routing is disabled by default (enabled: false or block omitted). resolveModelForTier short- circuits to resolveModelInternal in that case, so callers can adopt unconditionally without breaking existing behavior. This PR delivers the JS-layer infrastructure: schema + tier map + resolver. Orchestrator adoption (workflow markdown updates that detect soft failures and call resolveModelForTier with attempt+1) is incremental follow-up — verifier / plan-checker / integration- checker each adopt the protocol when ready. Tests (23 cases, all structural-IR — no stdout grep): - Schema invariants: AGENT_DEFAULT_TIERS coverage, VALID_AGENT_TIERS exact match, every assignment uses a valid tier - nextTier helper: light→standard→heavy→heavy, null on invalid input - Disabled mode: no block + enabled:false both no-op (back-compat) - Enabled mode: attempt=0 returns default tier model, attempt=1 escalates, beyond max_escalations caps, heavy agents stay heavy, default max_escalations=1 when omitted - Precedence: per-agent override beats dynamic_routing, dynamic_routing beats phase-type models - Validation: every settings key accepted, unknown tiers/sub-keys rejected, bare `dynamic_routing` rejected as config-set target Documentation: - get-shit-done/references/model-profiles.md — full reference section - docs/CONFIGURATION.md — full settings table + escalation flow - docs/USER-GUIDE.md — task-oriented "Cheap-by-default" section - docs/FEATURES.md — config row cross-link Verification: - 23/23 pass on regression test - 6843/6843 full suite (23 net new from 6820) - lint-no-source-grep clean (376 test files) - SDK schema mirror keeps CJS/SDK in sync per #2653 parity test Closes #3024 * fix(#3024): honor escalate_on_failure:false + 3 CR follow-ups CodeRabbit on PR #3031 (4 findings — 1 Major + 2 Minor + 1 Nitpick): 1. Major (inline) — get-shit-done/bin/lib/core.cjs:1668 resolveModelForTier ignored dynamic_routing.escalate_on_failure. When the user set it to false, escalation should be disabled, but the resolver only checked attempt/max_escalations. An orchestrator that always passes attempt+1 on retry would silently escalate despite the user opting out. Fix: gate effectiveAttempt on `dr.escalate_on_failure !== false` so false short-circuits every attempt back to the default tier. 2. Minor (inline) — docs/CONFIGURATION.md:123-126 The dynamic_routing rows in the Core Settings table had 4 cells instead of 5 (missing the Options column), breaking the table structure. Added explicit Options values for enabled / escalate_on_failure / max_escalations rows. 3. Minor (outside-diff) — references/model-profiles.md:179-195 "Resolution Logic" sketch was pre-#3024 and didn't include dynamic_routing in the precedence ladder. Updated to a 6-step block with dynamic_routing at step 3 (between override and phase-type). 4. Nitpick — tests/feat-3024-dynamic-routing.test.cjs:189+ Tests used `if (lightAgent) { ... }` guards that silent-pass when AGENT_DEFAULT_TIERS drifts. Replaced all 5 conditional skips with `assert.ok(lightAgent, '...')` preconditions so a tier-mapping change surfaces as a test failure. Plus: 2 new regression tests for the Major fix: - escalate_on_failure:false caps every attempt at default tier - escalate_on_failure:true (explicit) still escalates normally Verification: - 25/25 pass on regression test (23 prior + 2 escalate_on_failure) - 6845/6845 full suite (2 net new) - lint-no-source-grep clean * docs(#3024): align precedence + add fence language tags (CR follow-up) CodeRabbit (3 minor): 1. docs/CONFIGURATION.md:691 — "Per-Phase-Type Models → Resolution precedence" was a 4-step block written pre-#3024; readers got contradictory rules between the per-phase-type section and the later dynamic_routing section. Updated to the same 5-step ladder with dynamic_routing at step 2, and noted that dynamic_routing is disabled by default so this section's behavior is unchanged when the kill-switch is off. 2. docs/CONFIGURATION.md:770 — escalation-flow code fence missing language tag (MD040). Added `text`. 3. references/model-profiles.md:184 — resolution-ladder code fence missing language tag (MD040). Added `text`. No code changes; docs only. Verification: regression test still 25/25. * docs(#3024): clarify precedence prose — five layers, not four (CR nitpick) CodeRabbit nitpick: the "Per-Phase-Type Models → Resolution precedence" prose said "The four layers compose..." but the ladder above lists five (including Runtime default). Also "dynamic_routing escalates per-attempt above all of them" misreads as suggesting dynamic_routing wins over model_overrides — actually overrides still win at step 1. Reworded top-down so the precedence direction is unambiguous: - model_profile = base - models = phase-level override - dynamic_routing = per-attempt escalation - model_overrides = per-agent exception (top) - runtime default = fallback No code changes; docs only. * docs(#3024): note escalate_on_failure:false in escalation-flow diagram (CR) CodeRabbit nitpick: the escalation-flow diagram in docs/CONFIGURATION.md described the soft-failure → respawn → tier_models[next_tier_up] path, but didn't surface the `dynamic_routing.escalate_on_failure: false` kill-switch right next to it. Users reading the flow diagram (which is the canonical place to understand attempt behavior) wouldn't see that the kill-switch overrides the soft-failure branch. Added a one-paragraph note immediately after the flow listing, before the tier-sequence example, so the kill-switch is visible exactly where users decide whether escalation will happen. No code changes; docs only.	2026-05-02 14:26:35 -04:00
Tom Boucher	d812c66020	feat(#3023 ): per-phase-type model map in .planning/config.json (#3030 ) * feat(#3023): per-phase-type model map in .planning/config.json Adds a new `models` block to .planning/config.json with six phase-type slots (planning / discuss / research / execution / verification / completion). Lets users express coarse tuning ("Opus for planning, Sonnet for the rest") without learning the agent taxonomy. Resolution precedence (highest → lowest): 1. Per-agent `model_overrides[agent]` (full IDs; targeted exception) 2. Phase-type `models[phase_type]` (NEW; tier alias) 3. Profile table (`model_profile`) (per-agent column) 4. Runtime default The three layers compose: `models` defaults a phase, `model_overrides` carves an exception. Phase-type values are tier aliases (opus/sonnet/ haiku/inherit) so the runtime-resolution chain (#2517) stays correct end-to-end without further branching. Implementation: - model-profiles.cjs: new AGENT_TO_PHASE_TYPE map + VALID_PHASE_TYPES set. Each agent in MODEL_PROFILES gets one phase-type assignment; tests assert coverage so adding a new agent without updating the table fails CI. - core.cjs (resolveModelInternal): inserts phase-type tier lookup between per-agent override and profile-derived tier. Skips runtime resolution when the resolved tier is 'inherit' (was previously gated only on profile === 'inherit'; phase-type can now produce inherit independently). - core.cjs (loadConfig): pass `parsed.models` through both code paths so resolveModelInternal can read it. - config-schema.cjs + sdk/src/query/config-schema.ts: dynamic-pattern validator accepts only the six known phase-types; unknown slots rejected at config-set time. Backward compat: configs without `models` behave exactly as today. Tests (15 cases, all structural-IR — no stdout grep): - Schema: AGENT_TO_PHASE_TYPE coverage, VALID_PHASE_TYPES exact match - Resolver: phase-type alone; per-agent override beats phase-type; phase-type beats profile; issue's full example; "inherit"; empty block is no-op; no block is no-op - Validation: each of the 6 slots accepted; unknown slot rejected; bare `models` (no slot) rejected Verification: - 15/15 pass on new regression test - 6808/6808 full suite (5 net new), 0 fail - lint-no-source-grep clean across 375 test files Closes #3023 * docs(#3023): document `models` per-phase-type config in user-facing docs Adds `models` block coverage to the three user-facing docs that ship with each release: 1. docs/CONFIGURATION.md - New "Per-Phase-Type Models" section between "Per-Agent Overrides" and "Non-Claude Runtimes" with: * full example mixing models + model_overrides * phase-type → agent mapping table * resolution-precedence pseudocode * accepted values (tier alias only) * "When to use which" decision matrix * validation behavior + example error - Added `"models": {}` to the Full Schema snippet - Added a row for `models.<phase_type>` to the config keys table (next to model_profile_overrides for adjacency) 2. docs/FEATURES.md - Added a row for models.<phase_type> in the Configurable Settings table (right under model_profile) - Cross-link to CONFIGURATION.md for the full surface 3. docs/USER-GUIDE.md - New task-oriented "Tuning model cost by phase" section above "Using Non-Claude Runtimes" — leads with the concrete config and shows the override pattern (one-shot phase + targeted exception) - Cross-link to CONFIGURATION.md Verification: - 29/29 pass on config-schema-docs-parity + docs-update + new feature test (parity-check passes, so the config-schema entry I added in the feature commit is now matched by a docs row) - 6808/6808 full suite pass - lint-no-source-grep clean Doc style follows the same pattern used by the existing model_profile, model_overrides, and model_profile_overrides sections — example-led, table-backed, cross-referenced. Each doc surfaces the feature at the right depth (reference / settings table / task guide). * fix(#3023): mirror phase-type tier in resolveReasoningEffortInternal (CR Major) CodeRabbit caught a real Codex correctness bug + 3 minor docs/test issues: 1. Major (outside-diff) — resolveReasoningEffortInternal in core.cjs derived its tier exclusively from the profile table, ignoring the models.<phase_type> override added in #3023. Failure mode on Codex: Config: model_profile=balanced, models.execution=opus, agent=gsd-executor resolveModelInternal: tier=opus → gpt-5.4 resolveReasoningEffortInternal: tier=sonnet → reasoning_effort=medium ↑ WRONG — should be xhigh (opus tier on Codex) The runtime received a mismatched (model, effort) pair. Mirrored the phase-type lookup from resolveModelInternal so both functions derive from the same tier source. 'inherit' phase-type returns null effort (no runtime entry maps to 'inherit'; let runtime decide). 2. Minor — .changeset/per-phase-type-models.md `pr: TBD` → `pr: 3030`. 3. Minor (outside-diff) — model-profiles.md "Resolution Logic" section omitted the new phase-type tier. Updated the 4-step block to a 5-step block including `models[phase_type]` between override and profile, plus a paragraph noting that `model` and `reasoning_effort` derive from the same tier source. 4. Nitpick — added 2 typo-safety tests: - models.research = "haiku3" (typo) → falls through to profile - models.research = "openai/gpt-5" (full ID) → falls through to profile Plus 5 new reasoning_effort tests covering the Major fix: - exported correctly - phase-type override flips both model AND effort to same tier - inherit phase-type returns null effort - per-agent override still bypasses phase-type for effort - claude runtime ignores models.* (no effort propagation) Verification: - 24/24 pass on regression test (15 original + 2 typo-safety + 5 effort + 2 outside-diff related) - 6815/6815 full suite (7 net new from 6808) - lint-no-source-grep clean The reasoning_effort tests are written semantically (phase-type override must produce the SAME effort as a profile-only opus config) rather than hard-coding tier-specific effort strings, so changes to the runtime tier map don't break them. * fix(#3023): phase-type override beats profile=inherit (CR Major round 2) CodeRabbit caught another precedence inversion: when { model_profile: 'inherit', models: { execution: 'opus' } } both resolvers short-circuited on `profile === 'inherit'` BEFORE the phase-type override could be honored. Result: model returned 'inherit' and reasoning_effort returned null — both contradicting the documented precedence where models[phase_type] wins over model_profile. Fix in resolveModelInternal: - Compute tier from phase-type FIRST. If phase-type is a valid alias, it wins. Otherwise, fall back to profile-derived tier OR 'inherit' (when profile === 'inherit'). - Gate the runtime-resolution branch on `tier !== 'inherit'` (was `profile !== 'inherit'`) so phase-type=opus can flip runtime mapping on even when profile=inherit. - Gate the inherit-return on `tier === 'inherit'` (was `profile === 'inherit'`). Fix in resolveReasoningEffortInternal: - Remove the `if (profile === 'inherit') return null;` early-return. - Compute tier from phase-type first, fall back to profile. If phase-type is explicitly 'inherit' OR the resolved tier is 'inherit', return null (no runtime entry maps to inherit). Tests added (5 new): - model: phase-type wins over profile=inherit (with explicit opus, with haiku for one phase + planner-without-slot still inheriting) - model: profile=inherit + no models block → all agents inherit (no regression on existing inherit semantics) - model: profile=inherit + models block but agent has no slot → that agent inherits, agents with slots get phase-type tier - effort: phase-type opus + profile=inherit → produces opus-tier effort, NOT null (the original bug) Verification: - 27/27 pass on regression test (24 prior + 3 model + 1 effort) - 6820/6820 full suite (5 net new) - lint-no-source-grep clean The effort test reads the expected value by running a profile-only opus config and comparing — semantic check, not hard-coded effort string. So runtime tier map changes don't break the test.	2026-05-02 13:19:15 -04:00
Tom Boucher	f2decefede	fix(#3010 ): post-install message and docs use /gsd-update --reapply (#3012 ) * fix(#3010): post-install message and docs use /gsd-update --reapply PR #2824 consolidated 86 skills into ~58, removing the standalone /gsd-reapply-patches command and folding it into a flag on /gsd-update (/gsd-update --reapply). The 1.39.1 hotfix (#2954) updated help.md but missed three other surfaces that still recommended the dead form: 1. bin/install.js reportLocalPatches() — runtime emitter shown after every install with backed-up patches. All branches updated: - claude/opencode/kilo/copilot: /gsd-update --reapply - gemini: /gsd:update --reapply - codex: $gsd-update --reapply - cursor: gsd-update --reapply (mention the skill name) 2. get-shit-done/workflows/update.md — Step 4 prose and the check_local_patches block both referenced /gsd-reapply-patches. Replaced with /gsd-update --reapply (with backticks around the command per CR feedback for copy/paste UX). 3. Localized docs (en/ja-JP/ko-KR/zh-CN) — 14 files across ARCHITECTURE.md / COMMANDS.md / FEATURES.md / INVENTORY.md / USER-GUIDE.md / manual-update.md still listed the removed command. Tests: - bug-3010-reapply-patches-references.test.cjs (4 tests): scans bin/install.js's reportLocalPatches body, every workflow file, and every doc (excluding CHANGELOG history and help.md's deprecation notice) for the removed command form, and verifies each runtime branch emits the consolidated form via captured console output. - tests/copilot-install.test.cjs:1081-1115 — stale assertions that hard-coded the removed string updated to assert /gsd-update --reapply. Verification: 115/115 pass across both files. Co-authored-by: Patrick Clery <patrick@patrickclery.com> Closes #3010 * test(#3010): broaden dead-command scan + tighten runtime exact-match CodeRabbit follow-up findings on #3012: 1. Workflow + docs scans only matched "/gsd-reapply-patches", missing the gemini ("/gsd:reapply-patches") and codex ("$gsd-reapply-patches") spellings. A regression that re-introduced either form in localized docs would have passed silently. Extracted a DEAD_COMMAND_PATTERNS array + findDeadCommands() helper used by both scans, so all three removed forms are checked uniformly. Match output also reports which spellings hit, for faster diagnosis. 2. reportLocalPatches runtime test asserted output.includes('update --reapply'), which is too loose — a malformed prefix like '/gsd:update --reapply' on the claude branch would have passed. Replaced with an exact {runtime → expected token} map covering all 7 branches: claude/opencode/kilo/copilot → /gsd-update --reapply gemini → /gsd:update --reapply codex → $gsd-update --reapply cursor → gsd-update --reapply Negative assertion also runs DEAD_COMMAND_PATTERNS against output for every runtime, so dead forms can't slip in regardless of branch. Verification: 4/4 pass on bug-3010-reapply-patches-references.test.cjs. * test(#3010): add prefix-absence guard for cursor runtime (CR follow-up) CodeRabbit (Minor): the cursor expected token "gsd-update --reapply" is a substring of every prefixed form ("/gsd-update --reapply" for claude/ opencode/kilo/copilot, "\$gsd-update --reapply" for codex). The positive output.includes(expectedToken) check therefore can't distinguish correct cursor output from a regression where the installer emits a prefixed form for cursor — both pass the substring check. Add an explicit prefix-absence assertion for cursor that fails if any of /, \$, or : appears immediately before "gsd-update --reapply" in output. The gemini form ("/gsd:update --reapply") doesn't share the substring (gsd:update vs gsd-update) so it's already caught by the positive includes failing on cursor's expected bare token. Verification: 4/4 pass. --------- Co-authored-by: Patrick Clery <patrick@patrickclery.com>	2026-05-02 09:38:34 -04:00
Jeremy McSpadden	4d394a249d	fix(commands): normalize gsd slash namespace drift (#2858 ) * fix(commands): normalize gsd slash namespace drift * fix(#2855): address CodeRabbit findings on namespace drift PR Three CR findings, all valid: 1. autonomous.md line 783 still had `gsd:discuss-phase` (the PR's own normalization missed this line). Switched to `gsd-discuss-phase` and updated the matching test in autonomous-interactive.test.cjs that was asserting the now-retired colon form. 2. tests/bug-2543-gsd-slash-namespace.test.cjs source-grepped the fix-slash-commands.cjs script with .includes() rather than driving its transform behaviour. Refactored fix-slash-commands.cjs to export a pure transformContent(src, cmdNames) function, kept the CLI behaviour unchanged via require.main, and replaced the source-grep block with five behavioural cases: rewrite, multi-occurrence, idempotence on canonical input, no-op on gsd-sdk/gsd-tools, and word-boundary safety. 3. tests/bug-2808-skill-hyphen-name.test.cjs matched `name:` anywhere in SKILL.md; a stray name: in the body could satisfy the assertion. Scoped the lookup to the YAML frontmatter block via the suggested diff (parse the leading --- ... --- region first, then find name: inside it). Full suite: 5854/5854 passing. * fix(#2855): address remaining CodeRabbit findings on PR #2858 Three structural concerns flagged on the namespace-drift fix PR: 1. scripts/fix-slash-commands.cjs:24 — `buildPattern([])` compiled `/gsd:()(?=[^a-zA-Z0-9_-]\|$)/g`. The empty capture group still matches any `/gsd:` token followed by a non-word boundary (whitespace, EOL, punctuation), rewriting it to a stray `/gsd-`. Verified live: `transformContent("/gsd:", [])` → `"/gsd-"`. Added a guard returning null from `buildPattern` on empty input and updated `transformContent` and `processDir` to no-op when the pattern is null. 2. tests/autonomous-interactive.test.cjs:44-47 — assertion was `content.includes('gsd-discuss-phase') && content.includes('INTERACTIVE')`, which would false-pass on any unrelated co-occurrence (e.g. `INTERACTIVE=""` initialization plus a stray `gsd-discuss-phase` prose mention). Replaced with a structural extraction: locate the `If \`INTERACTIVE\` is set:` branch, bound it by the next `*If` / `<step>` boundary, and assert the `Skill(skill="gsd-discuss-phase", ...)` invocation lives inside that region. Tolerates whitespace around `(`, `skill`, and `=`. 3. tests/bug-2808-skill-hyphen-name.test.cjs:104 — colon-call regex was `Skill\(skill=...` and missed valid formatting like `Skill(skill = "gsd:cmd")` or `Skill( skill = ...)`. Loosened to `Skill\(\sskill\s=\s...` so reformatting drift can't slip past the namespace guard. Verification: 5854/5854 pass on `npm test` from the rebased branch. * fix(#2855): drop pre-validation filter that hid namespace drift CR finding on tests/bug-2808-skill-hyphen-name.test.cjs:128: the test collected generated skill directories with `.filter(entry => entry.isDirectory() && entry.name.startsWith('gsd-'))`, then validated namespace invariants over that filtered list. Anything that violated the prefix invariant — `gsd:extract-learnings` (colon form), `extract_learnings` without prefix, `Gsd-foo` mis-cased — would silently disappear from the iteration and the test would falsely pass. Drop the `startsWith('gsd-')` filter so every generated directory shows up. Add explicit assertions before the existing per-skill loop: - directory list is non-empty (catches a broken converter that produces nothing) - every directory begins with `gsd-` - every directory contains no `:` - every directory contains no `_` Re-audited the full PR diff for the same anti-pattern: only this one site filtered before validating the namespace; bug-2643 and commands-doc-parity also use `readdirSync().filter()` but only by file extension, which is correct. 5854/5854 on `npm test`. * fix(#2855): address remaining CR findings (1 active + 2 nitpicks) Three findings on PR #2858, all the same root cause: input narrowing before validation lets drift slip past the guards. 1. tests/bug-2808-...:104 (active) — `colonCallRe` captured local names with `[a-z0-9-]+`, which excluded the underscore. A drift like `Skill(skill="gsd:extract_learnings")` (deprecated colon syntax with the old underscore filename) silently slid through. Broadened the capture to `[^'"\s)]+` so any malformed local name is surfaced; surrounding pattern (whitespace tolerance, escape support, flags) unchanged. 2. tests/bug-2643-...:43 (nitpick) — `extractSkillNamesHyphen` and `extractSkillNamesColon` had the same over-strict capture plus relied on a single regex over raw bytes, which the project test- rigor memory bans (`feedback_no_source_grep_tests.md`). Replaced with `extractSkillCalls(content)` — a small structural extractor that walks `Skill(` openers, locates each call's matching `)`, parses the body's `skill = "..."` keyword argument with permissive whitespace + quoting + escape handling, and returns `{ name, raw }` records. The two namespace-form helpers become thin filters over the structured output. Tightened the body class to `[^'"\\]+` so a trailing escape `\` before the closing quote (as in `Skill(skill=\"gsd-foo\", …)` written inside another string context) doesn't get included in the captured name. 3. tests/bug-2543-...:44 (nitpick) — `DOC_SEARCH_FILES` was a hand- curated 7-entry array. Every doc added in the future would silently weaken drift detection until someone remembered to extend the list. Replaced with `discoverDocSearchFiles(ROOT)`: globs every `.md` under `docs/` and adds `README.md` if present. New docs are picked up automatically. Re-audited the diff surface for similar narrowings; no other sites filter or constrain before validating namespace invariants. 5854/5854 on `npm test`. * fix(#2855): recurse docs/ tree so localized translations are scanned too CR finding: discoverDocSearchFiles() stopped at docs/*.md, leaving localized translation trees (docs/ja-JP/, docs/zh-CN/, docs/ko-KR/, docs/pt-BR/) and other nested doc collections (docs/skills/, docs/superpowers/) invisible to the namespace-drift invariant. Verified the gap: docs/ has 6 nested directories with ~30 .md files that the previous top-level-only scan was skipping. None contain /gsd: references today, but a future translation update or new doc subdir could leak drift. Switch to an iterative stack walk so every .md under docs/ is scanned regardless of depth. Stack form (rather than recursion) avoids the risk of running into the call-stack limit on deep doc trees. 5854/5854 on `npm test`. --------- Co-authored-by: Tom Boucher <trekkie@nomorestars.com>	2026-04-29 22:56:59 -04:00
Tom Boucher	71a3f86fbe	docs: add end-to-end workflow walkthrough to USER-GUIDE.md (#2749 ) Adds a concrete single-phase walkthrough (webhook validator project) showing ROADMAP.md, CONTEXT.md, PLAN.md, SUMMARY.md, and STATE.md excerpts and how each command consumes the previous step's output. Also adds links to the walkthrough from README.md's nav bar and How It Works section. Closes #2359 Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-26 13:33:47 -04:00
Tom Boucher	f30da8326a	feat: add gates ensuring discuss-phase decisions are translated to plans and verified (closes #2492 ) (#2611 ) * feat(#2492): add gates ensuring discuss-phase decisions are translated and verified Two gates close the loop between CONTEXT.md `<decisions>` and downstream work, fixing #2492: - Plan-phase translation gate (BLOCKING). After requirements coverage, refuses to mark a phase planned when a trackable decision is not cited (by id `D-NN` or by 6+-word phrase) in any plan's `must_haves`, `truths`, or body. Failure message names each missed decision with id, category, text, and remediation paths. - Verify-phase validation gate (NON-BLOCKING). Searches plans, SUMMARY.md, files modified, and recent commit subjects for each trackable decision. Misses are written to VERIFICATION.md as a warning section but do not change verification status. Asymmetry is deliberate — fuzzy-match miss should not fail an otherwise green phase. Shared helper `parseDecisions()` lives in `sdk/src/query/decisions.ts` so #2493 can consume the same parser. Decisions opt out of both gates via `### Claude's Discretion` heading or `[informational]` / `[folded]` / `[deferred]` tags. Both gates skip silently when `workflow.context_coverage_gate=false` (default `true`). Closes #2492 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(#2492): make plan-phase decision gate actually block (review F1, F8, F9, F10, F15) - F1: replace `${context_path}` with `${CONTEXT_PATH}` in the plan-phase gate snippet so the BLOCKING gate receives a non-empty path. The variable was defined in Step 4 (`CONTEXT_PATH=$(_gsd_field "$INIT" ...)`) and the gate snippet referenced the lowercase form, leaving the gate to run with an empty path argument and silently skip. - F15: wrap the SDK call with `jq -e '.data.passed == true' \|\| exit 1` so failure halts the workflow instead of being printed and ignored. The verify-phase counterpart deliberately keeps no exit-1 (non-blocking by design) and now carries an inline note documenting the asymmetry. - F10: tag the JSON example fence as `json` and the options-list fence as `text` (MD040). - F8/F9: anchor the heading-presence test regexes to `^## 13[a-z]?\\.` so prose substrings like "Requirements Coverage Gate" mentioned in body text cannot satisfy the assertion. Added two new regression tests (variable-name match, exit-1 guard) so a future revert is caught. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(#2492): tighten decision-coverage gates against false positives and config drift (review F3,F4,F5,F6,F7,F16,F18,F19) - F3: forward `workstream` arg through both gate handlers so workstream-scoped `workflow.context_coverage_gate=false` actually skips. Added negative test that creates a workstream config disabling the gate while the root config has it enabled and asserts the workstream call is skipped. - F4: restrict the plan-phase haystack to designated sections — front-matter `must_haves` / `truths` / `objective` plus body sections under headings matching `must_haves\|truths\|tasks\|objective`. HTML comments and fenced code blocks are stripped before extraction so a commented-out citation or a literal example never counts as coverage. Verify-phase keeps the broader artifact-wide haystack by design (non-blocking). - F5: reject decisions with fewer than 6 normalized words from soft-matching (previously only rejected when the resulting phrase was under 12 chars AFTER slicing — too lenient). Short decisions now require an explicit `D-NN` citation, with regression tests for the boundary. - F6: walk every `-SUMMARY.md` independently and use `matchAll` with the `/g` flag so multiple `files_modified:` blocks across multiple summaries are all aggregated. Previously only the first block in the concatenated string was parsed, silently dropping later plans' files. - F7: validate every `files_modified` path stays inside `projectDir` after resolution (rejects absolute paths, `../` traversal). Cap each file read at 256 KB. Skipped paths emit a stderr warning naming the entry. - F16: validate `workflow.context_coverage_gate` is boolean in `loadGateConfig`; warn loudly on numeric or other-shaped values and default to ON. Mirrors the schema-vs-loadConfig validation gap from #2609. - F18: bump verify-phase `git log -n` cap from 50 to 200 so longer-running phases are not undercounted. Documented as a precision-vs-recall tradeoff appropriate for a non-blocking gate. - F19: tighten `QueryResult` / `QueryHandler` to be parameterized (`<T = unknown>`). Drops the `as unknown as Record<string, unknown>` casts in the gate handlers and surfaces shape mismatches at compile time for callers that pass a typed `data` value. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> fix(#2492): harden decisions parser and verify-phase glob (review F11,F12,F13,F14,F17,F20) - F11: strip fenced code blocks from CONTEXT.md before searching for `<decisions>` so an example block inside ``` ``` is not mis-parsed. - F12: accept tab-indented continuation lines (previously required a leading space) so decisions split with `\t` continue cleanly. - F13: parse EVERY `<decisions>` block in the file via `matchAll`, not just the first. CONTEXT.md may legitimately carry more than one block. - F14: `decisions.parse` handler now resolves a relative path against `projectDir` — symmetric with the gate handlers — and still accepts absolute paths. - F17: replace `ls "${PHASE_DIR}"/*-CONTEXT.md \| head -1` in verify-phase.md with a glob loop (ShellCheck SC2012 fix). Also avoids spawning an extra subprocess and survives filenames with whitespace. - F20: extend the unicode quote-stripping in the discretion-heading match to cover U+2018/2019/201A/201B and the U+201C-F double-quote variants plus backtick, so any rendering of "Claude's Discretion" collapses to the same key. Each fix has a regression test in `decisions.test.ts`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 00:26:53 -04:00
Tom Boucher	cc17886c51	feat: make model profiles runtime-aware for Codex/non-Claude runtimes (closes #2517 ) (#2609 ) * feat: make model profiles runtime-aware for Codex/non-Claude runtimes (closes #2517) Adds an optional top-level `runtime` config key plus a `model_profile_overrides[runtime][tier]` map. When `runtime` is set, profile tiers (opus/sonnet/haiku) resolve to runtime-native model IDs (and reasoning_effort where supported) instead of bare Claude aliases. Codex defaults from the spec: opus -> gpt-5.4 reasoning_effort: xhigh sonnet -> gpt-5.3-codex reasoning_effort: medium haiku -> gpt-5.4-mini reasoning_effort: medium Claude defaults mirror MODEL_ALIAS_MAP. Unknown runtimes fall back to the Claude-alias safe default rather than emit IDs the runtime cannot accept. reasoning_effort is only emitted into Codex install paths; never returned from resolveModelInternal and never written to Claude agent frontmatter. Backwards compatible: any user without `runtime` set sees identical behavior — the new branch is gated on `config.runtime != null`. Precedence (highest to lowest): 1. per-agent model_overrides 2. runtime-aware tier resolution (when `runtime` is set) 3. resolve_model_ids: "omit" 4. Claude-native default 5. inherit (literal passthrough) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(#2517): address adversarial review of #2609 (findings 1-16) Addresses all 16 findings from the adversarial review of PR #2609. Each finding is enumerated below with its resolution. CRITICAL - F1: readGsdRuntimeProfileResolver(targetDir) now probes per-project .planning/config.json AND ~/.gsd/defaults.json with per-project winning, so the PR's headline claim ("set runtime in project config and Codex TOML emit picks it up") actually holds end-to-end. - F2: resolveTierEntry field-merges user overrides with built-in defaults. The CONFIGURATION.md string-shorthand example `{ codex: { opus: "gpt-5-pro" } }` now keeps reasoning_effort from the built-in entry. Partial-object overrides like `{ opus: { reasoning_effort: 'low' } }` keep the built-in model. Both paths regression-tested. MAJOR - F3: resolveReasoningEffortInternal gates strictly on the RUNTIMES_WITH_REASONING_EFFORT allowlist regardless of override presence. Override + unknown-runtime no longer leaks reasoning_effort. - F4: runtime:"claude" is now a no-op for resolution (it is the implicit default). It no longer hijacks resolve_model_ids:"omit". Existing tests for `runtime:"claude"` returning Claude IDs were rewritten to reflect the no-op semantics; new test asserts the omit case returns "". - F5: _readGsdConfigFile in install.js writes a stderr warning on JSON parse failure instead of silently returning null. Read failure and parse failure are warned separately. Library require is hoisted to top of install.js so it is not co-mingled with config-read failure modes. - F6: install.js requires for core.cjs / model-profiles.cjs are hoisted to the top of the file with __dirname-based absolute paths so global npm install works regardless of cwd. Test asserts both lib paths exist relative to install.js __dirname. - F7: docs/CONFIGURATION.md `runtime` row no longer lists `opencode` as a valid runtime — install-path emission for non-Codex runtimes is explicitly out of scope per #2517 / #2612, and the doc now points at #2612 for the follow-on work. resolveModelInternal still accepts any runtime string (back-compat) and falls back safely for unknown values. - F8: Tests now isolate HOME (and GSD_HOME) to a per-test tmpdir so the developer's real ~/.gsd/defaults.json cannot bleed into assertions. Same pattern CodeRabbit caught on PRs #2603 / #2604. - F9: `runtime` and `model_profile_overrides` documented as flat-only in core.cjs comments — not routed through `get()` because they are top-level keys per docs/CONFIGURATION.md and introducing nested resolution for two new keys was not worth the edge-case surface. - F10/F13: loadConfig now invokes _warnUnknownProfileOverrides on the raw parsed config so direct .planning/config.json edits surface unknown runtime values (e.g. typo `runtime: "codx"`) and unknown tier values (e.g. `model_profile_overrides.codex.banana`) at read time. Warnings only — preserves back-compat for runtimes added later. Per-process warning cache prevents log spam across repeated loadConfig calls. MINOR / NIT - F11: Removed dead `tier \|\| 'sonnet'` defensive shortcut. The local is now `const alias = tier;` with a comment explaining why `tier` is guaranteed truthy at that point (every MODEL_PROFILES entry defines `balanced`, the fallback profile). - F12: Extracted resolveTierEntry() in core.cjs as the single source of truth for runtime-aware tier resolution. core.cjs and bin/install.js both consume it — no duplicated lookup logic between the two files. - F14: Added regression tests for findings #1, #2, #3, #4, #6, #10, #13 in tests/issue-2517-runtime-aware-profiles.test.cjs. Each must-fix path has a corresponding test that fails against the pre-fix code and passes against the post-fix code. - F15: docs/CONFIGURATION.md `model_profile` row cross-references #1713 / #1806 next to the `adaptive` enum value. - F16: RUNTIME_PROFILE_MAP remains in core.cjs as the single source of truth; install.js imports it through the exported resolveTierEntry helper rather than carrying its own copy. Doc files (CONFIGURATION.md, USER-GUIDE.md, settings.md) intentionally still embed the IDs as text — code comment in core.cjs flags that those doc files must be updated whenever the constant changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 23:00:37 -04:00
Tom Boucher	1a694fcac3	feat: auto-remap codebase after significant phase execution (closes #2003 ) (#2605 ) * feat: auto-remap codebase after significant phase execution (#2003) Adds a post-phase structural drift detector that compares the committed tree against `.planning/codebase/STRUCTURE.md` and either warns or auto-remaps the affected subtrees when drift exceeds a configurable threshold. ## Summary - New `bin/lib/drift.cjs` — pure detector covering four drift categories: new directories outside mapped paths, new barrel exports at `(packages\|apps)//src/index.`, new migration files, and new route modules. Prioritizes the most-specific category per file. - New `verify codebase-drift` CLI subcommand + SDK handler, registered as `gsd-sdk query verify.codebase-drift`. - New `codebase_drift_gate` step in `execute-phase` between `schema_drift_gate` and `verify_phase_goal`. Non-blocking by contract — any error logs and the phase continues. - Two new config keys: `workflow.drift_threshold` (int, default 3) and `workflow.drift_action` (`warn` \| `auto-remap`, default `warn`), with enum/integer validation in `config-set`. - `gsd-codebase-mapper` learns an optional `--paths <p1,p2,...>` scope hint for incremental remapping; agent/workflow docs updated. - `last_mapped_commit` lives in YAML frontmatter on each `.planning/codebase/.md` file; `readMappedCommit`/`writeMappedCommit` round-trip helpers ship in `drift.cjs`. ## Tests - 55 new tests in `tests/drift-detection.test.cjs` covering: classification, threshold gating at 2/3/4 elements, warn vs. auto-remap routing, affected-path scoping, `--paths` sanitization (traversal, absolute, shell metacharacter rejection), frontmatter round-trip, defensive paths (missing STRUCTURE.md, malformed input, non-git repos), CLI JSON output, and documentation parity. - Full suite: 5044 pass / 0 fail. ## Documentation - `docs/CONFIGURATION.md` — rows for both new keys. - `docs/ARCHITECTURE.md` — section on the post-execute drift gate. - `docs/AGENTS.md` — `--paths` flag on `gsd-codebase-mapper`. - `docs/USER-GUIDE.md` — user-facing behavior note + toggle commands. - `docs/FEATURES.md` — new 27a section with REQ-DRIFT-01..06. - `docs/INVENTORY.md` + `docs/INVENTORY-MANIFEST.json` — drift.cjs listed. - `get-shit-done/workflows/execute-phase.md` — `codebase_drift_gate` step. - `get-shit-done/workflows/map-codebase.md` — `parse_paths_flag` step. - `agents/gsd-codebase-mapper.md` — `--paths` directive under parse_focus. ## Design decisions - Frontmatter over sidecar JSON* for `last_mapped_commit`: keeps the baseline attached to the file, survives git moves, survives per-doc regeneration, no extra file lifecycle. - Substring match against STRUCTURE.md for `isPathMapped`: the map is free-form markdown, not a structured manifest; any mention of a path prefix counts as "mapped territory". Cheap, no parser, zero false negatives on reasonable maps. - Category priority migration > route > barrel > new_dir so a file matching multiple rules counts exactly once at the most specific level. - Empty-tree SHA fallback (`4b825dc6…`) when `last_mapped_commit` is absent — semantically correct (no baseline means everything is drift) and deterministic across repos. - Four layers of non-blocking — detector try/catch, CLI try/catch, SDK handler try/catch, and workflow `\|\| echo` shell fallback. Any single layer failing still returns a valid skipped result. - SDK handler delegates to `gsd-tools.cjs` rather than re-porting the detector to TypeScript, keeping drift logic in one canonical place. Closes #2003 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(mapper): tag --paths fenced block as text (CodeRabbit MD040) Comment 3127255172. * docs(config): use /gsd- dash command syntax in drift_action row (CodeRabbit) Comment 3127255180. Matches the convention used by every other command reference in docs/CONFIGURATION.md. * fix(execute-phase): initialize AGENT_SKILLS_MAPPER + tag fenced blocks Two CodeRabbit findings on the auto-remap branch of the drift gate: - 3127255186 (must-fix): the mapper Task prompt referenced ${AGENT_SKILLS_MAPPER} but only AGENT_SKILLS (for gsd-executor) is loaded at init_context (line 72). Without this fix the literal placeholder string would leak into the spawned mapper's prompt. Add an explicit gsd-sdk query agent-skills gsd-codebase-mapper step right before the Task spawn. - 3127255183: tag the warn-message and Task() fenced code blocks as text to satisfy markdownlint MD040. * docs(map-codebase): wire PATH_SCOPE_HINT through every mapper prompt CodeRabbit (review id 4158286952, comment 3127255190) flagged that the parse_paths_flag step defined incremental-remap semantics but did not inject a normalized variable into the spawn_agents and sequential_mapping mapper prompts, so incremental remap could silently regress to a whole-repo scan. - Define SCOPED_PATHS / PATH_SCOPE_HINT in parse_paths_flag. - Inject ${PATH_SCOPE_HINT} into all four spawn_agents Task prompts. - Document the same scope contract for sequential_mapping mode. * fix(drift): writeMappedCommit tolerates missing target file CodeRabbit (review id 4158286952, drift.cjs:349-355 nitpick) noted that readMappedCommit returns null on ENOENT but writeMappedCommit threw — an asymmetry that breaks first-time stamping of a freshly produced doc that the caller has not yet written. - Catch ENOENT on the read; treat absent file as empty content. - Add a regression test that calls writeMappedCommit on a non-existent path and asserts the file is created with correct frontmatter. Test was authored to fail before the fix (ENOENT) and passes after. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 21:21:44 -04:00
Rezolv	86fb9c85c3	docs(sdk): registry docs and gsd-sdk query call sites (#2302 Track B) (#2340 ) * feat(sdk): golden parity harness and query handler CJS alignment (#2302 Track A) Golden/read-only parity tests and registry alignment, query handler fixes (check-completion, state-mutation, commit, validate, summary, etc.), and WAITING.json dual-write for .gsd/.planning readers. Refs gsd-build/get-shit-done#2341 * fix(sdk): getMilestoneInfo matches GSD ROADMAP (🟡, last bold, STATE fallback) - Recognize in-flight 🟡 milestone bullets like 🚧. - Derive from last vX.Y Title before ## Phases when emoji absent. - Fall back to STATE.md milestone when ROADMAP is missing; use last bare vX.Y in cleaned text instead of first (avoids v1.0 from shipped list). - Fixes init.execute-phase milestone_version and buildStateFrontmatter after state.begin-phase (syncStateFrontmatter). * feat(sdk): phase list, plan task structure, requirements extract handlers - Register phase.list-plans, phase.list-artifacts, plan.task-structure, requirements.extract-from-plans (SDK-only; golden-policy exceptions). - Add unit tests; document in QUERY-HANDLERS.md. - writeProfile: honor --output, render dimensions, return profile_path and dimensions_scored. * feat(sdk): centralize getGsdAgentsDir in query helpers Extract agent directory resolution to helpers (GSD_AGENTS_DIR, primary ~/.claude/agents, legacy path). Use from init and docs-init init bundles. docs(15): add 15-CONTEXT for autonomous phase-15 run. * feat(sdk): query CLI CJS fallback and session correlation - createRegistry(eventStream, sessionId) threads correlation into mutation events - gsd-sdk query falls back to gsd-tools.cjs when no native handler matches (disable with GSD_QUERY_FALLBACK=off); stderr bridge warnings - Export createRegistry from @gsd-build/sdk; add sdk/README.md - Update QUERY-HANDLERS.md and registry module docs for fallback + sessionId - Agents: prefer node dist/cli.js query over cat/grep for STATE and plans * fix(sdk): init phase_found parity, docs-init agents path, state field extract - Normalize findPhase not-found to null before roadmap fallback (matches findPhaseInternal) - docs-init: use detectRuntime + resolveAgentsDir for checkAgentsInstalled - state.cjs stateExtractField: horizontal whitespace only after colon (YAML progress guard) - Tests: commit_docs default true; config-get golden uses temp config; golden integration green Refs: #2302 * refactor(sdk): share SessionJsonlRecord in profile-extract-messages CodeRabbit nit: dedupe JSONL record shape for isGenuineUserMessage and streamExtractMessages. * fix(sdk): address CodeRabbit major threads (paths, gates, audit, verify) - Resolve @file: and CLI JSON indirection relative to projectDir; guard empty normalized query command - plan.task-structure + intel extract/patch-meta: resolvePathUnderProject containment - check.config-gates: safe string booleans; plan_checker alias precedence over plan_check default - state.validate/sync: phaseTokenMatches + comparePhaseNum ordering - verify.schema-drift: token match phase dirs; files_modified from parsed frontmatter - audit-open: has_scan_errors, unreadable rows, human report when scans fail - requirements PLANNED key PLAN for root PLAN.md; gsd-tools timeout note - ingest-docs: repo-root path containment; classifier output slug-hash Golden parity test strips has_scan_errors until CJS adds field. * fix: Resolve CodeRabbit security and quality findings - Secure intel.ts and cli.ts against path traversal - Catch and validate git add status in commit.ts - Expand roadmap milestone marker extraction - Fix parsing array-of-objects in frontmatter YAML - Fix unhandled config evaluations - Improve coverage test parity mapping * docs(sdk): registry docs and gsd-sdk query call sites (#2302 Track B) Update CHANGELOG, architecture and user guides, workflow call sites, and read-guard tests for gsd-sdk query; sync ARCHITECTURE.md command/workflow counts and directory-tree totals with the repo (80 commands, 77 workflows). Address CodeRabbit: fix markdown tables and emphasis; align CLI-TOOLS GSDTools and state.read docs with implementation; correct roadmap handler name in universal-anti-patterns; resolve settings workflow config path without relying on config_path from state.load. Refs gsd-build/get-shit-done#2340 * test: raise planner character extraction limit to 48K * fix(sdk): resolve build TS error and doc conflict markers	2026-04-20 18:09:21 -04:00
Logan	fbf30792f3	docs: authoritative shipped-surface inventory with filesystem-backed parity tests (#2390 ) * docs: finish trust-bug fixes in user guide and commands Correct load-bearing defects in the v1.36.0 docs corpus so readers stop acting on wrong defaults and stale exhaustiveness claims. - README.md: drop "Complete feature"/"Every command"/"All 18 agents" exhaustiveness claims; replace version-pinned "What's new in v1.32" bullet with a CHANGELOG pointer. - CONFIGURATION.md: fix `claude_md_path` default (null/none -> `./CLAUDE.md`) in both Full Schema and core settings table; correct `workflow.tdd_mode` provenance from "Added in v1.37" to "Added in v1.36". - USER-GUIDE.md: fix `workflow.discuss_mode` default (`standard` -> `discuss`) in the workflow-toggles table AND in the abbreviated Full Schema JSON block above it; align the Options cell with the shipped enum. - COMMANDS.md: drop "Complete command syntax" subtitle overclaim to match the README posture. - AGENTS.md: weaken "All 21 specialized agents" header to reflect that the `agents/` filesystem is authoritative (shipped roster is 31). Part 1 of a stacked docs refresh series (PR 1/4). * docs: refresh shipped surface coverage for v1.36 Close the v1.36.0 shipped-surface gaps in the docs corpus. - COMMANDS.md: add /gsd-graphify section (build/query/status/diff) and its config gate; expand /gsd-quick with --validate flag and list/ status/resume subcommands; expand /gsd-thread with list --open, list --resolved, close <slug>, status <slug>. - CLI-TOOLS.md: replace the hardcoded "15 domain modules" count with a pointer to the Module Architecture table; add a graphify verb-family section (build/query/status/diff/snapshot); add Graphify and Learnings rows to the Module Architecture table. - FEATURES.md: add TOC entries for #116 TDD Pipeline Mode and #117 Knowledge Graph Integration; add the #117 body with REQ-GRAPH-01..05. - CONFIGURATION.md: move security_enforcement / security_asvs_level / security_block_on from root into `workflow.` in Full Schema to match templates/config.json and the gsd-sdk runtime reads; update Security Settings table to use the workflow. prefix; add planning.sub_repos to Full Schema and description table; add a Graphify Settings section documenting graphify.enabled and graphify.build_timeout. Note: VALID_CONFIG_KEYS in bin/lib/config.cjs does not yet include workflow.security_* or planning.sub_repos, so config-set currently rejects them. That is a pre-existing validator gap that this PR does not attempt to fix; the docs now correctly describe where these keys live per the shipped template and runtime reads. Part 2 of a stacked docs refresh series (PR 2/5), based on PR 1. * docs: make inventory authoritative and reconcile architecture Upgrade docs/INVENTORY.md from "complete for agents, selective for others" to authoritative across all six shipped-surface families, and reconcile docs/ARCHITECTURE.md against the new inventory so the PR that introduces INVENTORY does not also introduce an INVENTORY/ARCHITECTURE contradiction. - docs/AGENTS.md: weaken "21 specialized agents" header to 21 primary + 10 advanced (31 shipped); add new "Advanced and Specialized Agents" section with concise role cards for the 10 previously-omitted shipped agents (pattern-mapper, debug-session-manager, code-reviewer, code-fixer, ai-researcher, domain-researcher, eval-planner, eval-auditor, framework-selector, intel-updater); footnote the Agent Tool Permissions Summary as primary-agents-only so it no longer misleads. - docs/INVENTORY.md (rewritten to be authoritative): * Full 31-agent roster with one-line role + spawner + primary-doc status per agent (unchanged from prior partial work). * Commands: full 75-row enumeration grouped by Core Workflow, Phase & Milestone Management, Session & Navigation, Codebase Intelligence, Review/Debug/Recovery, and Docs/Profile/Utilities — each row carries a one-line role derived from the command's frontmatter and a link to the source file. * Workflows: full 72-row enumeration covering every get-shit-done/workflows/.md, with a one-line role per workflow and a column naming the user-facing command (or internal orchestrator) that invokes it. References: full 41-row enumeration grouped by Core, Workflow, Thinking-Model clusters, and the Modular Planner decomposition, matching the groupings docs/ARCHITECTURE.md already uses; notes the few-shot-examples subdirectory separately. * CLI Modules and Hooks: unchanged — already full rosters. * Maintenance section rewritten to describe the drift-guard test suite that will land in PR4 (inventory-counts, commands-doc-parity, agents-doc-parity, cli-modules-doc-parity, hooks-doc-parity). - docs/ARCHITECTURE.md reconciled against INVENTORY: * References block: drop the stale "(35 total)" count; point at INVENTORY.md#references-41-shipped for the authoritative count. * CLI Tools block: drop the stale "19 domain modules" count; point at INVENTORY.md#cli-modules-24-shipped for the authoritative roster. * Agent Spawn Categories: relabel as "Primary Agent Spawn Categories" and add a footer naming the 10 advanced agents and pointing at INVENTORY.md#agents-31-shipped for the full 31-agent roster. - docs/CONFIGURATION.md: preserve the six model-profile rows added in the prior partial work, and tighten the fallback note so it names the 13 shipped agents without an explicit profile row, documents model_overrides as the escape hatch, and points at INVENTORY.md for the authoritative 31-agent roster. Part 3 of a stacked docs refresh series (PR 3/4). Remaining consistency work (USER-GUIDE config-section delete-and-link, FEATURES.md TOC reorder, ARCHITECTURE.md Hook-table expansion + installation-layout collapse, CLI-TOOLS.md module-row additions, workflow-discuss-mode invocation normalization, and the five doc-parity tests) lands in PR4. * test(docs): add consistency guards and remove duplicate refs Consolidates USER-GUIDE.md's command/config duplicates into pointers to COMMANDS.md and CONFIGURATION.md (kills a ghost `resolve_model_ids` key and a stale `discuss_mode: standard` default); reorders FEATURES.md TOC chronologically so v1.32 precedes v1.34/1.35/1.36; expands ARCHITECTURE.md's Hook table to the 11 shipped hooks (gsd-read-injection-scanner, gsd-check-update-worker) and collapses the installation-layout hook enumeration to the .js/.sh pattern form; adds audit/gsd2-import/intel rows and state signal-, audit-open, from-gsd2 verbs to CLI-TOOLS.md; normalizes workflow-discuss-mode.md invocations to `node gsd-tools.cjs config-set`. Adds five drift guards anchored on docs/INVENTORY.md as the authoritative roster: inventory-counts (all six families), commands/agents/cli-modules/hooks parity checks that every shipped surface has a row somewhere. fix(convergence): thread --ws to review agent; add stall and max-cycles behavioral tests - Thread GSD_WS through to review agent spawn in plan-review-convergence workflow (step 5a) so --ws scoping is symmetric with planning step - Add behavioral stall detection test: asserts workflow compares HIGH_COUNT >= prev_high_count and emits a stall warning - Add behavioral --max-cycles 1 test: asserts workflow reaches escalation gate when cycle >= MAX_CYCLES with HIGH > 0 after a single cycle - Include original PR files (commands, workflow, tests) as the branch predated the PR commits Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(docs,config): PR #2390 review — security_* config keys and REQ-GRAPH-02 scope Addresses trek-e's review items that don't require rebase: - config.cjs: add workflow.security_enforcement, workflow.security_asvs_level, workflow.security_block_on to VALID_CONFIG_KEYS so gsd-sdk config-set accepts them (closed the gap where docs/CONFIGURATION.md listed keys the validator rejected). - core.cjs: add matching CONFIG_DEFAULTS entries (true / 1 / 'high') so the canonical defaults table matches the documented values. - config.cjs: wire the three keys into the new-project workflow defaults so fresh configs inherit them. - planning-config.md: document the three keys in the Workflow Fields table, keeping the CONFIG_DEFAULTS ↔ doc parity test happy. - config-field-docs.test.cjs: extend NAMESPACE_MAP so the flat keys in CONFIG_DEFAULTS resolve to their workflow.* doc rows. - FEATURES.md REQ-GRAPH-02: split the slash-command surface (build\|query\| status\|diff) from the CLI surface which additionally exposes `snapshot` (invoked automatically at the tail of `graphify build`). The prior text overstated the slash-command surface. * docs(inventory): refresh rosters and counts for post-rebase drift origin/main accumulated surfaces since this PR was authored: - Agents: 31 → 33 (+ gsd-doc-classifier, gsd-doc-synthesizer) - Commands: 76 → 82 (+ ingest-docs, ultraplan-phase, spike, spike-wrap-up, sketch, sketch-wrap-up) - Workflows: 73 → 79 (same 6 names) - References: 41 → 49 (+ debugger-philosophy, doc-conflict-engine, mandatory-initial-read, project-skills-discovery, sketch-interactivity, sketch-theme-system, sketch-tooling, sketch-variant-patterns) Adds rows in the existing sub-groupings, introduces a Sketch References subsection, and bumps all four headline counts. Roles are pulled from source frontmatter / purpose blocks for each file. All 5 parity tests (inventory-counts, agents-doc-parity, commands-doc-parity, cli-modules-doc-parity, hooks-doc-parity) pass against this state — 156 assertions, 0 failures. Also updates the 'Coverage note' advanced-agent count 10 → 12 and the few-shot-examples footnote "41 top-level references" → "49" to keep the file internally consistent. * docs(agents): add advanced stubs for gsd-doc-classifier and gsd-doc-synthesizer Both agents ship on main (spawned by /gsd-ingest-docs) but had no coverage in docs/AGENTS.md. Adds the "advanced stub" entries (Role, property table, Key behaviors) following the template used by the other 10 advanced/specialized agents in the same section. Also updates the Agent Tool Permissions Summary scope note from "10 advanced/specialized agents" to 12 to reflect the two new stubs. * docs(commands): add entries for ingest-docs, ultraplan-phase, plan-review-convergence These three commands ship on main (plan-review-convergence via trek-e's 4b452d29 commit on this branch) but had no user-facing section in docs/COMMANDS.md — they lived only in INVENTORY.md. The commands-doc-parity test already passes via INVENTORY, but the user-facing doc was missing canonical explanations, argument tables, and examples. - /gsd-plan-review-convergence → Core Workflow (after /gsd-plan-phase) - /gsd-ultraplan-phase → Core Workflow (after plan-review-convergence) - /gsd-ingest-docs → Brownfield (after /gsd-import, since both consume the references/doc-conflict-engine.md contract) Content pulled from each command's frontmatter and workflow purpose block. * test: remove redundant ARCHITECTURE.md count tests tests/architecture-counts.test.cjs and tests/command-count-sync.test.cjs were added when docs/ARCHITECTURE.md carried hardcoded counts for commands/ workflows/agents. With the PR #2390 cleanup, ARCHITECTURE.md no longer owns those numbers — docs/INVENTORY.md does, enforced by tests/inventory-counts.test.cjs (scans the same filesystem directories with the same readdirSync filter). Keeping these ARCHITECTURE-specific tests would re-introduce the hardcoded counts they guard, defeating trek-e's review point. The single-source-of- truth parity tests already catch the same drift scenarios. Related: #2257 (the regression this replaced). --------- Co-authored-by: Tom Boucher <trekkie@nomorestars.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 09:31:34 -04:00
Tom Boucher	2e97dee0d0	docs: update release notes and command reference for v1.37.0 (#2382 ) * fix(tests): clear CLAUDECODE env var in read-guard test runner The hook skips its advisory on two env vars: CLAUDE_SESSION_ID and CLAUDECODE. runHook() cleared CLAUDE_SESSION_ID but inherited CLAUDECODE from process.env, so tests run inside a Claude Code session silently no-oped and produced no stdout, causing JSON.parse to throw. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): update ARCHITECTURE.md counts and add TEXT_MODE fallback to sketch workflow Four new spike/sketch files were added in 1.37.0 but two housekeeping items were missed: ARCHITECTURE.md component counts (75→79 commands, 72→76 workflows) and the required TEXT_MODE fallback in sketch.md for non-Claude runtimes (#2012). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): update directory-tree slash command count in ARCHITECTURE.md Missed the second count in the directory tree (# 75 slash commands → 79). The prose "Total commands" was updated but the tree annotation was not, causing command-count-sync.test.cjs to fail. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: update release notes and command reference for v1.37.0 Covers spike/sketch commands, agent size-budget enforcement, and shared boilerplate extraction across README, COMMANDS, FEATURES, and USER-GUIDE. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 13:45:30 -04:00
Tom Boucher	712e381f13	docs: document required Bash permission patterns for executor subagents (#2071 ) (#2288 ) * docs: document required Bash permission patterns for gsd-executor subagents (Closes #2071) Adds a new "Executor Subagent Gets Permission denied on Bash Commands" section to USER-GUIDE.md Troubleshooting. Documents the wildcard Bash patterns that must be added to ~/.claude/settings.json (or per-project .claude/settings.local.json) for each supported stack so fresh installs aren't blocked mid-execution. Covers: git write commands, gh, Rails/Ruby, Python/uv, Node/npm/pnpm/bun, and Rust/Cargo. Includes a complete example settings.json snippet for Rails. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(phase): preserve zero-padded prefix in renameDecimalPhases Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 22:47:12 -04:00
Rezolv	d3a79917fa	feat: Phase 2 caller migration — gsd-sdk query in workflows, agents, commands (#2179 ) * feat: Phase 2 caller migration — gsd-sdk query in workflows (#2122) Cherry-picked orchestration rewrites from feat/sdk-foundation (#2008, `4018fee`) onto current main, resolving conflicts to keep upstream worktree guards and post-merge test gate. SDK stub registry omitted (out of Phase 2 scope per #2122). Refs: #2122 #2008 Made-with: Cursor * docs: add gsd-sdk query migration blurb Made-with: Cursor * docs(workflows): extend Phase 2 gsd-sdk query caller migration - Swap node gsd-tools.cjs for gsd-sdk query in review, plan-phase, execute-plan, ship, extract_learnings, ai-integration-phase, eval-review, next, thread - Document graphify CJS-only in gsd-planner; dual-path in CLI-TOOLS and ARCHITECTURE - Update tests: workstreams gsd-sdk path, thread frontmatter.get, workspace init., CRLF-safe autonomous frontmatter parse - CHANGELOG: Phase 2 caller migration scope Made-with: Cursor docs(phase2): USER-GUIDE + remaining gsd-sdk query call sites - USER-GUIDE: dual-path CLI section; state validate/sync use full CJS path - Commands: debug (config-get+tdd), quick (security note), intel Task prompt - Agent: gsd-debug-session-manager resolve-model via jq - Workflows: milestone-summary, forensics, next, complete-milestone/verify-work (audit-open CJS notes), discuss-phase, progress, verify-phase, add/insert/remove phase, transition, manager, quick workflow; remove-phase commit without --files - Test: quick-session-management accepts frontmatter.get - CHANGELOG: Phase 2 follow-up bullet Made-with: Cursor * docs(phase2): align gsd-sdk query examples in commands and agents - init.* query names; frontmatter.get uses positional field name - state.* handlers use positional args; commit uses positional paths - CJS-only notes for from-gsd2 and graphify; learnings.query wording - CHANGELOG: Phase 2 orchestration doc pass Made-with: Cursor * docs(phase2): normalize gsd-sdk query commit to positional file paths - Strip --files from commit examples in workflows, references, commands - Keep commit-to-subrepo ... --files (separate handler) - git-planning-commit.md: document positional args - Tests: new-project commit line, state.record-session, gates CRLF, roadmap.analyze - CHANGELOG [Unreleased] Made-with: Cursor * feat(sdk): gsd-sdk query parity with gsd-tools and PR 2179 registry fixes - Route query via longest-prefix match and dotted single-token expansion; fall back to runGsdToolsQuery (same argv as node gsd-tools.cjs) for full CLI coverage. - Parse gsd-sdk query permissively so gsd-tools flags (--json, --verify, etc.) are not rejected by strict parseArgs. - resolveGsdToolsPath: honor GSD_TOOLS_PATH; prefer bundled get-shit-done copy over project .claude installs; export runGsdToolsQuery from the SDK. - Fix gsd-tools audit-open (core.output; pass object for --json JSON). - Register summary-extract as alias of summary.extract; fix audit-fix workflow to call audit-uat instead of invalid init.audit-uat (PR review). Updates QUERY-HANDLERS.md and CHANGELOG [Unreleased]. Made-with: Cursor * fix(sdk): Phase 2 scope — Trek-e review (#2179, #2122) - Remove gsd-sdk query passthrough to gsd-tools.cjs; drop GSD_TOOLS_PATH - Consolidate argv routing in resolveQueryArgv(); update USAGE and QUERY-HANDLERS - Surface @file: read failures in GSDTools.parseOutput - execute-plan: defer Task Commit Protocol to gsd-executor - stale-colon-refs: skip .planning/ and root CLAUDE.md (gitignored overlays) - CHANGELOG [Unreleased]: maintainer review and routing notes Made-with: Cursor	2026-04-15 22:46:31 -04:00
Tom Boucher	990b87abd4	feat(discuss-phase): adapt gray area language for non-technical owners via USER-PROFILE.md (#2125 ) (#2173 ) When USER-PROFILE.md signals a non-technical product owner (learning_style: guided, jargon in frustration_triggers, or high-level explanation_depth), discuss-phase now reframes gray area labels and advisor_research rationale paragraphs in product-outcome language. Same technical decisions, translated framing so product owners can participate meaningfully without needing implementation vocabulary. Closes #2125 Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 16:45:29 -04:00
Tom Boucher	6c2795598a	docs: release notes and documentation updates for v1.35.0 (#2079 ) Closes #2080	2026-04-10 22:29:06 -04:00
Tom Boucher	641ea8ad42	docs: update documentation for v1.34.0 release (#1868 )	2026-04-06 16:25:41 -04:00
Tom Boucher	c8d7ab3501	docs: fill documentation gaps from v1.32.0 audit Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-04 08:54:14 -04:00
Tom Boucher	acf82440e5	docs: update English documentation for v1.32.0 release Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-04 08:28:50 -04:00
Quang Do	d4767ac2e0	fix: replace /gsd: slash command format with /gsd- skill format in all user-facing content (#1579 ) * fix: replace /gsd: command format with /gsd- skill format in all suggestions All next-step suggestions shown to users were still using the old colon format (/gsd:xxx) which cannot be copy-pasted as skills. Migrated all occurrences across agents/, commands/, get-shit-done/, docs/, README files, bin/install.js (hardcoded defaults for claude runtime), and get-shit-done/bin/lib/.cjs (generate-claude-md templates and error messages). Updated tests to assert new hyphen format instead of old colon format. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> fix: migrate remaining /gsd: format to /gsd- in hooks, workflows, and sdk Addresses remaining user-facing occurrences missed in the initial migration: - hooks/: fix 4 user-facing messages (pause-work, update, fast, quick) and 2 comments in gsd-workflow-guard.js - get-shit-done/workflows/: fix 21 Skill() literal calls that Claude executes directly (installer does not transform workflow content) - sdk/prompt-sanitizer.ts: update regex to strip /gsd- format in addition to legacy /gsd: format; update JSDoc comment - tests/: update autonomous-ui-steps, prompt-sanitizer to assert new format Note: commands/gsd/.md frontmatter (name: gsd:xxx) intentionally unchanged — installer derives skillName from directory path, not the name field. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> fix(plan-phase): preserve --chain flag in auto-advance sync and handle ui-phase gate in chain mode Bug 1: step 15 sync-flag check only guarded against --auto, causing _auto_chain_active to be cleared when plan-phase is invoked without --auto in ARGUMENTS even though a --chain pipeline was active. Added --chain to the guard condition, matching discuss-phase behaviour. Bug 2: UI Design Contract gate (step 5.6) always exited the workflow when UI-SPEC was missing, breaking the discuss --chain pipeline silently. When _auto_chain_active is true, the gate now auto-invokes gsd-ui-phase --auto via Skill() and continues to step 6 without prompting. Manual invocations retain the existing AskUserQuestion flow. * fix: remove <sub>/clear</sub> pattern and duplicate old-format command in discuss-phase.md --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-04 07:24:31 -04:00
Tom Boucher	4d4c3cce22	Merge pull request #1560 from LakshmanTurlapati/fix/session-scoped-workstream-routing [codex] isolate active workstream state per session	2026-04-03 11:26:36 -04:00
LakshmanTurlapati	5ce8183928	fix: isolate active workstream state per session	2026-04-02 20:01:08 -05:00
Alex Alecu	ac4836d270	feat: add Kilo CLI runtime support	2026-03-31 15:59:31 +03:00
Tom Boucher	7457e33263	docs: v1.28 release documentation update Add documentation for all new features merged since v1.27: - Forensics command (/gsd:forensics) — post-mortem workflow investigation - Milestone Summary (/gsd:milestone-summary) — project summary for onboarding - Workstream Namespacing (/gsd:workstreams) — parallel milestone work - Manager Dashboard (/gsd:manager) — interactive phase command center - Assumptions Discussion Mode (workflow.discuss_mode) — codebase-first context - UI Phase Auto-Detection — surface /gsd:ui-phase for UI-heavy projects - Multi-Runtime Installer Selection — select multiple runtimes interactively Updated files: - README.md: new commands, config keys, assumptions mode callout - docs/COMMANDS.md: 4 new command entries with full syntax - docs/FEATURES.md: 7 new feature entries (#49-#55) with requirements - docs/CONFIGURATION.md: 3 new workflow config keys - docs/AGENTS.md: 2 new agents, count 15→18 - docs/USER-GUIDE.md: assumptions mode, forensics, workstreams, non-Claude runtimes - docs/README.md: updated index with discuss-mode doc link Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 12:13:17 -04:00
Tom Boucher	d478e7f485	docs: clarify model profile setup for non-Claude runtimes Document resolve_model_ids: "omit" (set automatically by installer for non-Claude runtimes), explain model_overrides with non-Claude model IDs, and add a decision table for choosing between inherit, omit, and overrides. Updates CONFIGURATION.md, USER-GUIDE.md, and the model-profiles.md skill reference. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 03:05:36 -04:00
Tom Boucher	5c4d5e5f47	feat: add multi-project workspace commands (#1241 ) Three new commands for managing isolated GSD workspaces: - /gsd:new-workspace — create workspace with repo worktrees/clones - /gsd:list-workspaces — scan ~/gsd-workspaces/ for active workspaces - /gsd:remove-workspace — clean up workspace and git worktrees Supports both multi-repo orchestration (subset of repos from a parent directory) and feature branch isolation (worktree of current repo with independent .planning/). Includes init functions, command routing, workflows, 24 tests, and user documentation. Closes #1241 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-20 17:02:48 -04:00
Tom Boucher	d5f2a7ea19	docs: update README and docs/ for v1.27 release Add documentation for all new v1.27 features: - 7 new commands (/gsd:fast, /gsd:review, /gsd:plant-seed, /gsd:thread, /gsd:add-backlog, /gsd:review-backlog, /gsd:pr-branch) - Security hardening (security.cjs, prompt guard hook, workflow guard hook) - Multi-repo workspace support, discussion audit trail, advisor mode - New config options (research_before_questions, hooks.workflow_guard) - Updated component counts in ARCHITECTURE.md Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-20 12:21:53 -04:00
Tom Boucher	841da5a80d	Merge pull request #1155 from gsd-build/Solvely/quick-task-branching feat(quick): add quick-task branch support	2026-03-19 12:03:58 -04:00
Tom Boucher	214a621cb2	docs: update changelog, architecture, CLI tools, config, features, and user guide for parallel execution fixes Documentation updates for #1116 fixes and code review findings: - CHANGELOG.md: Add --no-verify commit flag, post-wave hook validation, STATE.md file locking, duplicate function removal, cross-platform init - docs/ARCHITECTURE.md: Add 'Parallel Commit Safety' section explaining --no-verify strategy and STATE.md lockfile mechanism - docs/CLI-TOOLS.md: Document --no-verify flag on commit command with usage guidance - docs/CONFIGURATION.md: Add note about pre-commit hooks and parallel execution behavior under parallelization settings - docs/FEATURES.md: Add --no-verify to executor capabilities, add 'Parallel Safety' section - docs/USER-GUIDE.md: Add troubleshooting entries for parallel execution build lock errors and Windows EPERM crashes, update recovery table	2026-03-18 17:18:01 -04:00
Tom Boucher	a9be67f504	docs: comprehensive v1.26 release documentation update (#1187 ) Updates all docs to reflect v1.26.0 features and changes: README.md: - Add /gsd:ship and /gsd:next to command tables - Add /gsd:session-report to Session section - Update workflow to show ship step and auto-advance - Update inherit profile description for non-Anthropic providers docs/COMMANDS.md: - Add /gsd:next command reference with full state detection logic - Add /gsd:session-report command reference with report contents docs/FEATURES.md: - Add Auto-Advance (Next) feature (#14) - Add Cross-Phase Regression Gate feature (#20) - Add Requirements Coverage Gate feature (#21) - Add Session Reporting feature (#24) - Fix all section numbering (was broken with duplicates) - Update inherit profile to mention non-Anthropic providers - Renumber all 39 features consistently docs/USER-GUIDE.md: - Add /gsd:ship to workflow diagram - Add /gsd:next and /gsd:session-report to command tables - Add HANDOFF.json and reports/ to file structure - Add troubleshooting for non-Anthropic model providers - Add recovery entries for session-report and next - Update example workflow to include ship and session-report docs/CONFIGURATION.md: - Update inherit profile to mention non-Anthropic providers	2026-03-18 14:54:02 -04:00
Tom Boucher	f54f3df776	feat: structured session handoff artifact for cross-session continuity (#940 ) (#1122 ) Enhance /gsd:pause-work to write .planning/HANDOFF.json alongside .continue-here.md. The JSON provides machine-readable state that /gsd:resume-work can parse for precise resumption. HANDOFF.json includes: - Task position (phase, plan, task number, status) - Completed and remaining tasks with commit hashes - Blockers with type classification (technical/human_action/external) - Human actions pending (API keys, approvals, manual testing) - Uncommitted files list - Context notes for mental model restoration Resume-work changes: - HANDOFF.json is primary resumption source (highest priority) - Surfaces blockers and human actions immediately on session start - Validates uncommitted files against git status - Deletes HANDOFF.json after successful resumption - Falls back to .continue-here.md if no JSON exists Also checks for placeholder content in SUMMARY.md files to catch false completions (frontmatter claims complete but body has TBD). Fixes #940	2026-03-18 10:01:25 -06:00
Tom Boucher	a97e4c2c6f	feat: /gsd:ship command for PR creation from verified phase work (#829 ) (#1123 ) * feat: /gsd:ship command for PR creation from verified phase work (#829) New command that bridges local completion → merged PR, closing the plan → execute → verify → ship loop. Workflow (workflows/ship.md): 1. Preflight: verification passed, clean tree, correct branch, gh auth 2. Push branch to remote 3. Auto-generate rich PR body from planning artifacts: - Phase goal from ROADMAP.md - Changes from SUMMARY.md files - Requirements addressed (REQ-IDs) - Verification status - Key decisions 4. Create PR via gh CLI (supports --draft) 5. Optional code review request 6. Update STATE.md with shipping status Files: - commands/gsd/ship.md: New command entry point - get-shit-done/workflows/ship.md: Full workflow implementation - get-shit-done/workflows/help.md: Add ship to help output - docs/COMMANDS.md: Command reference - docs/FEATURES.md: Feature spec with REQ-SHIP-01 through 05 - docs/USER-GUIDE.md: Add to command table - CHANGELOG.md: Document new command Fixes #829 * fix(tests): update expected skill count from 39 to 40 for new ship command The Copilot install E2E tests hardcode the expected number of skill directories and manifest entries. Adding commands/gsd/ship.md increased the count from 39 to 40.	2026-03-18 10:01:08 -06:00
Colin	3a0c81133b	feat(quick): add quick-task branch support	2026-03-17 12:04:02 -04:00
jorge g	698985feb1	feat: add inherit model profile for OpenCode /model (#951 )	2026-03-15 11:40:08 -06:00
Fana	9481fdf802	feat: add /gsd:ui-phase + /gsd:ui-review design contract and visual audit layer (closes #986 ) Add pre-planning design contract generation and post-execution visual audit for frontend phases. Closes the gap where execute-phase runs without a visual contract, producing inconsistent spacing, color, and typography across components. New agents: gsd-ui-researcher (UI-SPEC.md), gsd-ui-checker (6-dimension validation), gsd-ui-auditor (6-pillar scored audit + registry re-vetting). Third-party shadcn registry blocks are machine-vetted at contract time, verification time, and audit time — three enforcement points, not a checkbox. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-14 21:57:59 -06:00
Fana	ef032bc8f1	feat: harden Nyquist defaults, add retroactive validation, compress prompts (#855 ) * fix: change nyquist_validation default to true and harden absent-key skip conditions new-project.md never wrote the key, so agents reading config directly treated absent as falsy. Changed all agent skip conditions from "is false" to "explicitly set to false; absent = enabled". Default changed from false to true in core.cjs, config.cjs, and templates/config.json. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: enforce VALIDATION.md creation with verification gate and Check 8e Step 5.5 was narrative markdown that Claude skipped under context pressure. Now MANDATORY with Write tool requirement and file-existence verification. Step 7.5 gates planner spawn on VALIDATION.md presence. Check 8e blocks Dimension 8 if file missing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add W008/W009 health checks and addNyquistKey repair for Nyquist drift detection W008 warns when workflow.nyquist_validation key is absent from config.json (agents may skip validation). W009 warns when RESEARCH.md has Validation Architecture section but no VALIDATION.md file exists. addNyquistKey repair adds the missing key with default true value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add /gsd:validate-phase command and gsd-nyquist-auditor agent Retroactively applies Nyquist validation to already-executed phases. Works mid-milestone and post-milestone. Detects existing test coverage, maps gaps to phase requirements, writes missing tests, debugs failing ones, and produces {phase}-VALIDATION.md from existing artifacts. Handles three states: VALIDATION.md exists (audit + update), no VALIDATION.md (reconstruct from PLAN.md + SUMMARY.md), phase not yet executed (exit cleanly with guidance). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: audit-milestone reports Nyquist compliance gaps across phases Adds Nyquist coverage table to audit-milestone output when workflow.nyquist_validation is true. Identifies phases missing VALIDATION.md or with nyquist_compliant: false/partial. Routes to /gsd:validate-phase for resolution. Updates USER-GUIDE with retroactive validation documentation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: compress Nyquist prompts to match GSD meta-prompt density conventions Auditor agent: deleted philosophy section (35 lines), compressed execution flow 60%, removed redundant constraints. Workflow: cut purpose bloat, collapsed state narrative, compressed auditor spawn template. Command: removed redundant process section. Plan-phase Steps 5.5/7.5: replaced hedging language with directives. Audit-milestone Step 5.5: collapsed sub-steps into inline instructions. Net: -376 lines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 11:19:44 -06:00
Lex Christopherson	c298a1a4dc	refactor: rename depth setting to granularity (closes #879 ) The "depth" setting with values quick/standard/comprehensive implied investigation thoroughness, but it only controls phase count. Renamed to "granularity" with values coarse/standard/fine to accurately reflect what it controls: how finely scope is sliced into phases. Includes backward-compatible migration in loadConfig and config-ensure that auto-renames depth→granularity with value mapping in both .planning/config.json and ~/.gsd/defaults.json. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 10:53:34 -06:00
Bantuson	e0f9c738c9	feat: add Nyquist validation layer to plan-phase pipeline Adds automated feedback architecture research to gsd-phase-researcher, enforces it as Dimension 8 in gsd-plan-checker, and introduces {phase}-VALIDATION.md as the per-phase validation contract. Ensures every phase plan includes automated verify commands before execution begins. Opt-out via workflow.nyquist_validation: false. Closes #122 (partial), related #117 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 13:46:39 +02:00
Pablo Alonso	7de17fca8a	docs: add User Guide with workflow diagrams, troubleshooting, and config reference Create docs/USER-GUIDE.md as a detailed companion to the README, covering: - Workflow diagrams (project lifecycle, planning coordination, execution waves, brownfield) - Command reference with "When to Use" guidance for each command - Full config.json schema including workflow toggles, git branching, and per-agent model profiles - Practical usage examples for common scenarios - Troubleshooting section for common issues - Recovery quick reference table Add link from README navigation bar and Configuration section to the User Guide. Closes #457 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 14:12:35 +01:00

50 Commits