get-shit-done

mirror of https://github.com/glittercowboy/get-shit-done synced 2026-05-14 02:56:38 +02:00

Author	SHA1	Message	Date
Tom Boucher	8d5f509edf	fix(3266): preserve wave 0 and bucket plans by depends_on DAG in phase-plan-index (#3276) * fix(3266): preserve wave 0 and bucket plans by depends_on DAG in phase-plan-index Fixes two cooperating bugs in the phase-plan-index builder: 1. Wave 0 collapse: `parseInt(...) || 1` coerced parsed value `0` to `1` due to JS falsy default. Fixed with `Number.isNaN` guard. 2. depends_on ignored: wave-bucketing used only the `wave:` frontmatter field. Now replaced with Kahn's topological-level algorithm over `depends_on`: source nodes (no in-phase deps) → lowest level; each plan's level = max(deps' levels) + 1. Declared `wave:` that disagrees with computed level emits a non-fatal warning on the result. Cycle detection throws GSDError. `PlanInfo` gains `depends_on: string[]`. `PhasePlanIndex` gains `warnings?: string[]`. Both TS (`sdk/src/query/phase.ts`) and CJS twin (`get-shit-done/bin/lib/phase.cjs`) fixed identically. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: add changeset for #3276 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(phase): resolve depends_on against canonical plan id (#3276 CR) Build a secondary `canonicalToId` index alongside `planMap` so that a dependency declared as '03-01' resolves to a descriptive plan stored under '03-01-auth-hardening', preventing silent wave-ordering failures. Applied at both DAG construction sites in phase.cjs and the SDK's phase.ts (k014 parity). Regression test added. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-09 00:25:05 -04:00
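The two fixes in 8d5f509edf can be illustrated with a minimal sketch. This is not the code from `phase.ts`; `PlanStub`, `parseWaveSketch`, and `computeLevelsSketch` are hypothetical names, a plain `Error` stands in for `GSDError`, the fallback wave of 1 is carried over from the old `|| 1` default, and treating level 0 as the lowest level is an assumption.

```typescript
// Hypothetical shape: only `wave` and `depends_on` are named in the commit message.
interface PlanStub {
  id: string;
  wave?: string;
  depends_on: string[];
}

// Parse a declared wave while keeping 0. `parseInt(raw) || 1` would turn 0 into 1
// because 0 is falsy; checking Number.isNaN only falls back on unparsable input.
function parseWaveSketch(raw: string | undefined, fallback = 1): number {
  const n = parseInt(raw ?? "", 10);
  return Number.isNaN(n) ? fallback : n;
}

// Kahn-style level assignment: plans with no in-phase deps get level 0 (assumed base),
// every other plan gets max(level of its deps) + 1. Throws when a cycle is present.
function computeLevelsSketch(plans: PlanStub[]): Map<string, number> {
  const ids = new Set(plans.map((p) => p.id));
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const p of plans) {
    const deps = p.depends_on.filter((d) => ids.has(d)); // ignore out-of-phase deps
    indegree.set(p.id, deps.length);
    for (const d of deps) {
      dependents.set(d, [...(dependents.get(d) ?? []), p.id]);
    }
  }
  const levels = new Map<string, number>();
  const queue = plans.filter((p) => indegree.get(p.id) === 0).map((p) => p.id);
  for (const id of queue) levels.set(id, 0);
  let processed = 0;
  while (queue.length > 0) {
    const id = queue.shift() as string;
    processed += 1;
    const level = levels.get(id) ?? 0;
    for (const next of dependents.get(id) ?? []) {
      levels.set(next, Math.max(levels.get(next) ?? 0, level + 1));
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  // Any plan never dequeued sits on (or behind) a cycle.
  if (processed !== plans.length) {
    throw new Error("cycle detected in depends_on");
  }
  return levels;
}
```

A declared `wave:` could then be compared with the computed level to produce the non-fatal warning the commit describes.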
Tom Boucher	8bc255c266	fix(workstream): normalize migration workstream names (#3269 ) * fix(workstream): normalize migrate-name to valid slug * docs(context): record workstream migrate-name slug invariant * fix(catalog-cjs): balanced fallback for unknown profile (CR finding A) profiles[profile] could return undefined for any profile key absent from the catalog entry, causing downstream callers like formatAgentToModelMapAsTable to crash on .length. Add ?? profiles.balanced fallback to match the SDK adapter. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(sdk): anchor path resolution on import.meta.url not cwd (CR finding B) resolve(process.cwd(), '..') breaks when Vitest is invoked from the repo root because cwd is already the repo root and '..' goes one level above. Replace with a file-relative path using fileURLToPath(new URL('../../../', import.meta.url)) anchored at the test file's location (sdk/src/query/). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: derive Group B runtime list from catalog (CR finding C) Hardcoded ['kilo', 'cline', ...] throws TypeError if a runtime name is removed from the catalog. Derive group B dynamically via Object.keys(catalog.runtimeTierDefaults).filter(r => !r.opus) so the test never goes stale and auto-covers future Group B additions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(workflow): add hermes to Step B runtime options (CR finding D) hermes appears in the Group A built-in defaults table but was missing from the AskUserQuestion options in Step B, forcing users to manually type it via 'Other (Group B or custom)'. Add explicit hermes entry for UI consistency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(config): refresh dynamic_routing tier table; fix stale L671 (findings E+F) Finding E: tier table was missing 6 heavy-tier agents and 15 standard/light agents added by this PR. 
Updated all three rows to match catalog routingTier assignments (33 agents total). Finding F: removed stale '18 of 31' claim and agent enumeration; replaced with accurate note that all 33 agents have explicit catalog entries. Updated authoritative source pointers to model-catalog.cjs / model-catalog.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(core): add profile-fallback unit tests for quality and budget (CR nitpick G) The PR introduced quality→opus and budget→haiku unknown-agent fallbacks but only balanced→sonnet and inherit→inherit were tested. Add two tests covering the remaining two branches to complete coverage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * adr: define planning workspace and worktree seam * refactor(worktree): extract worktree safety policy module * refactor(workstream): extract active workstream pointer store seam * test(worktree): cover policy branch paths and persist seam guardrails * refactor(worktree): centralize health inventory seam for W017 * fix(workspace): align SDK project path policy with CJS planningDir * refactor(query): unify SDK planning path projection seam * refactor(init): route workspace projection through planningPaths seam * docs(adr): add SDK architecture and planning path ADRs * refactor(worktree): deepen name, pointer, inventory, and config seams * docs(config): harmonize claude-opus-4-6 to 4-7 in resolve_model_ids example (CR finding 2) * fix(sdk): return undefined for model_profile='inherit' sentinel (CR finding 3) * docs(adr): renumber conflicting 0003-sdk-package-seam-module to 0007, update seam-map reference (CR finding 4) * fix(workstream): align CJS and SDK name validation to accept dots, guard path traversal via includes('..') (CR finding 5) * fix(sdk): guard writeActiveWorkstream against non-existent workstream directory, k014/k031 parity (CR finding 6) * chore(changeset): add #3269 changeset (CR finding 1 — proper changeset for this PR) * docs(inventory): register 3 new 
CLI modules in INVENTORY.md/MANIFEST (active-workstream-store, workstream-name-policy, worktree-safety) * fix(sdk): use relPlanningPath(workstream) in planningPaths, fix setActiveWorkstream/getActiveWorkstream name errors in workstream.ts * fix(sdk): validate GSD_WORKSTREAM in planningPaths before use (#3269 regression) planningPaths() called resolveWorkspaceContext() which returned GSD_WORKSTREAM raw (no validation). An invalid value like '../evil' was used as effectiveWorkstream, constructing a bad path; roadmapAnalyze() caught the ENOENT and returned a no-phase_count error object instead of the root ROADMAP result. Fix: validate envCtx.workstream with validateWorkstreamName() in planningPaths() before accepting it as effectiveWorkstream. Invalid env → null → root .planning/ fallback, preserving the bug-2791 contract: invalid GSD_WORKSTREAM is silently ignored and falls back to the root context (phase_count: 0 for empty root ROADMAP). The bug-2791 regression test now passes. No other call sites read GSD_WORKSTREAM without validation: query-runtime-context.ts already validates; cli.ts already validates; context-engine.ts takes a caller-validated workstream parameter. Closes #3268 (regression introduced by #3269 workstream-name-policy work). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-09 00:15:04 -04:00
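The `GSD_WORKSTREAM` handling described in 8bc255c266 (validate first, fall back to the root context on invalid input) might look roughly like the sketch below. The function names and the exact character set are assumptions; the commit only states that dots are accepted and that `..` is rejected.

```typescript
// Hypothetical character set. Only "dots are accepted" and "'..' is rejected"
// come from the commit message.
const WORKSTREAM_NAME_SKETCH = /^[A-Za-z0-9][A-Za-z0-9._-]*$/;

function isValidWorkstreamNameSketch(name: string): boolean {
  return WORKSTREAM_NAME_SKETCH.test(name) && !name.includes("..");
}

// Invalid or empty env value -> null, which callers treat as "use the root .planning/".
function effectiveWorkstreamSketch(envValue: string | undefined): string | null {
  if (envValue === undefined || envValue === "") return null;
  return isValidWorkstreamNameSketch(envValue) ? envValue : null;
}
```

Returning `null` rather than throwing matches the stated contract that an invalid `GSD_WORKSTREAM` is silently ignored.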
Tom Boucher	65abc4fc90	refactor(query): deepen phase lifecycle seams (#3267 ) * refactor(query): extract phase lifecycle policy module * refactor(query): extract phase fs and roadmap mutation adapters * fix(sdk): propagate non-ENOENT readdir errors in phase-filesystem-adapter (CR finding 1) Swallow only ENOENT in listDirectories; rethrow EACCES, EIO, and other unexpected errors so callers surface real failures rather than silently treating a permission-denied phases dir as empty. Also adds regression test: EACCES from readdir now propagates as thrown error instead of returning []. * fix(sdk): propagate non-ENOENT readFile errors in phase-roadmap-mutation (CR finding 4) readModifyWriteRoadmapMd now falls back to empty content only on ENOENT; EACCES, EIO, and other errors are rethrown so a subsequent write cannot clobber real roadmap content that is temporarily unreadable. Regression tests: EACCES propagates; absent ROADMAP.md still starts empty. * fix(sdk): omit Depends on: Phase 0 for first sequential phase; align prefix grammar (CR findings 2+3) Finding 2: buildPhaseRoadmapEntry now omits the "Depends on" line when phaseId == 1 (prevPhase would be 0, which is not a valid predecessor). The guard is `prevPhase < 1` so future phase-0 configs are also safe. Finding 3: collectDecimalSuffixesFromDirNames regex prefix pattern updated from `[A-Z]{1,6}` to `[A-Z][A-Z0-9]*` (case-insensitive flag added), matching the grammar used by scanSequentialMaxPhaseFromDirs. Prevents k014 parity drift for alphanumeric project-code prefixes longer than six characters or containing digits. Regression tests for both fixes included.	2026-05-09 00:14:59 -04:00
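The error-propagation rule from 65abc4fc90 (swallow only ENOENT, rethrow everything else) can be sketched as follows. The reader is injected so the example runs without touching the filesystem; `listDirectoriesSketch` and `hasErrnoCode` are hypothetical names, not the adapter's real API.

```typescript
// Minimal view of a Node-style errno error: an object carrying a string `code`.
interface ErrnoLike {
  code?: string;
}

function hasErrnoCode(err: unknown, code: string): boolean {
  return typeof err === "object" && err !== null && (err as ErrnoLike).code === code;
}

// `readdir` is a stand-in for the real directory read.
function listDirectoriesSketch(readdir: () => string[]): string[] {
  try {
    return readdir();
  } catch (err) {
    if (hasErrnoCode(err, "ENOENT")) return []; // missing directory: treat as empty
    throw err; // EACCES, EIO, and anything else: surface the real failure
  }
}
```

The same shape applies to the roadmap read: fall back to empty content only when the file is absent, so an unreadable file is never overwritten.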
Tom Boucher	288b3b4170	fix(3259): non-mutating --help guard for native query handlers (#3272) * fix(3259): non-mutating --help guard for native query handlers; reject --help as milestone version Adds a dispatcher-level guard in query-dispatch.ts that short-circuits to a non-mutating help stub whenever --help/-h appears in args destined for a native mutating handler (fail-closed by default). Adds defense-in-depth in milestoneComplete to reject --help/-h as a version value before any disk write. Regression tests cover: per-handler --help guard, registry-driven invariant across all mutating commands, handler-level GSDError for both flags, and preservation of the #3019 CJS fallback contract. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: add changeset fragment for #3272 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 23:53:27 -04:00
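A dispatcher-level guard of the kind 288b3b4170 describes could be sketched like this. The registry shape, `dispatchSketch`, and the help text are all hypothetical; the real `query-dispatch.ts` is not shown in this log.

```typescript
type HandlerSketch = (args: string[]) => string;

interface RegisteredHandler {
  handler: HandlerSketch;
  mutating: boolean; // assumed registry flag marking handlers that write to disk
}

function dispatchSketch(
  registry: Map<string, RegisteredHandler>,
  command: string,
  args: string[],
): string {
  const entry = registry.get(command);
  if (entry === undefined) throw new Error(`unknown command: ${command}`);
  const wantsHelp = args.includes("--help") || args.includes("-h");
  // Short-circuit before the handler runs, so nothing is written for a help request.
  if (entry.mutating && wantsHelp) {
    return `help: ${command} (no changes made)`;
  }
  return entry.handler(args);
}
```

Placing the check in the dispatcher means a newly registered mutating handler is covered without adding its own flag parsing.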
Tom Boucher	ecd57e622c	fix(3265): prefer YAML frontmatter for state-snapshot canonical fields (#3275 ) * fix(3265): prefer YAML frontmatter for state-snapshot canonical fields stateSnapshot in both sdk/src/query/state.ts and the CJS twin (get-shit-done/bin/lib/state.cjs cmdStateSnapshot) passed the whole STATE.md blob to stateExtractField, whose bold pattern (Field:) has no line anchor. A body table cell such as "Status: to ✅ COMPLETE" therefore silenced the correct YAML frontmatter value. Fix: extractFrontmatter(content) first; stripFrontmatter(content) for the body passed to stateExtractField; for each canonical scalar field prefer the non-empty frontmatter value, falling back to body extraction when the key is absent or the file has no frontmatter block at all. Regression tests added in sdk/src/query/state.test.ts (vitest) and tests/state.test.cjs (node:test) covering: - frontmatter status beats Status: inside a table cell - frontmatter current_plan beats bold body value - no-frontmatter files continue to extract from body - field absent from frontmatter falls through to body extractor Fixes #3265 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: add changeset for #3275 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: reproduce fmStr drops non-string YAML scalars (#3275 CR finding) Add tests/bug-3275-fmstr-non-string-scalars.test.cjs with 5 cases covering CJS state-snapshot with numeric frontmatter scalars (current_phase: 19, total_phases: 7, total_plans_in_phase: 5), string regression, and no-frontmatter body fallback regression. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(state): fmStr accepts numeric/boolean YAML scalars (CR finding) Rename `fmStr` to `fmScalar` in both state.cjs and sdk/src/query/state.ts and broaden the type guard so that non-null number/boolean frontmatter values are coerced to String(v) instead of being discarded. 
The previous `typeof v === 'string'` check was a latent bug: if the YAML parser ever returns typed scalars (e.g. `current_phase: 19` as the number 19), the frontmatter value would be silently dropped and the stale body value used instead. Both files are updated identically (k014 parity). Also adds three SDK vitest regression cases (numeric current_phase, total_phases, total_plans_in_phase) in sdk/src/query/state.test.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 23:53:21 -04:00
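The broadened type guard from ecd57e622c can be shown in a few lines. `fmScalar` is the name used in the commit; the body below is a sketch of the described behavior, and `pickFieldSketch` is a hypothetical helper for the frontmatter-first preference.

```typescript
// Accept strings as-is and coerce numbers/booleans to strings.
// The earlier `typeof v === 'string'` check would have dropped `current_phase: 19`.
function fmScalar(v: unknown): string | undefined {
  if (typeof v === "string") return v.trim() === "" ? undefined : v;
  if (typeof v === "number" || typeof v === "boolean") return String(v);
  return undefined; // null, undefined, arrays, objects
}

// Prefer a non-empty frontmatter value; fall back to the value extracted from the body.
function pickFieldSketch(
  frontmatter: Record<string, unknown>,
  key: string,
  bodyValue: string | undefined,
): string | undefined {
  return fmScalar(frontmatter[key]) ?? bodyValue;
}
```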
Tom Boucher	96806003c5	fix(#3229 ): shared model catalog source of truth for agent profiles + runtime tier defaults (#3230 ) * docs(adr): add ADR-0003 model catalog module * fix(#3229): add shared model catalog as source of truth for agent profiles and runtime tier defaults Research / design (ADR-0003): - Existing drift came from 4 independent model truths: 1. CJS model-profiles.cjs 2. SDK config-query.ts stale copy (18 agents) 3. settings-advanced.md runtime tier table 4. session-runner Claude-only profile map - New design: one machine-readable Model Catalog Module in sdk/shared/ that both packages ship and consume. Implementation: - sdk/shared/model-catalog.json — canonical source of truth for: - full 33-agent registry - per-agent golden (quality) alias + balanced/budget aliases - adaptive derivation from routingTier - agent→phaseType map - agent→dynamic-routing default tier map - runtime tier defaults for all supported runtimes - get-shit-done/bin/lib/model-catalog.cjs — CJS adapter over the catalog - sdk/src/model-catalog.ts — SDK adapter over the same catalog - CJS model-profiles.cjs now re-exports derived data from model-catalog.cjs - SDK config-query.ts now re-exports MODEL_PROFILES/VALID_PROFILES from model-catalog.ts instead of maintaining its own list - sdk/src/query/helpers.ts runtime list now comes from the catalog (fixes hermes drift) - sdk/src/session-runner.ts Claude profile→model-id mapping now resolves via catalog - docs/CONFIGURATION.md + settings-advanced.md runtime tables updated to match catalog Behavior changes: - resolve-model now covers every shipped agent file on disk (33 agents) - unknown-agent fallback is profile-semantic, not hardcoded sonnet: quality→opus, budget→haiku, balanced/adaptive→sonnet, inherit→inherit - Group B runtimes remain known runtimes but do not get built-in tier defaults Tests (RED→GREEN): - root tests: shipped agent files must equal MODEL_PROFILES keys - sdk tests: shipped agent files must equal MODEL_PROFILES keys - 
direct fix assertion: gsd-code-reviewer resolves to opus under quality with no unknown_agent - runtime defaults parity test: settings-advanced.md + CONFIGURATION.md tables must match catalog - helper tests: hermes included in SUPPORTED_RUNTIMES and getRuntimeConfigDir() Closes #3229 * chore(changeset): update #3229 changeset pr field to 3230 * fix(ci): update inherit fallback expectations and inventory parity for model catalog	2026-05-08 21:25:37 -04:00
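The profile-semantic fallback listed under "Behavior changes" in 96806003c5 amounts to a small lookup. The sketch below assumes a simplified catalog shape; `resolveModelSketch` and `CatalogSketch` are hypothetical and do not mirror `model-catalog.json`.

```typescript
type ProfileSketch = "quality" | "balanced" | "budget" | "adaptive" | "inherit";

// Fallback aliases as listed in the commit message:
// quality→opus, budget→haiku, balanced/adaptive→sonnet, inherit→inherit.
const UNKNOWN_AGENT_FALLBACK: Record<ProfileSketch, string> = {
  quality: "opus",
  balanced: "sonnet",
  budget: "haiku",
  adaptive: "sonnet",
  inherit: "inherit",
};

// Assumed shape: agent name -> per-profile alias.
type CatalogSketch = Record<string, Partial<Record<ProfileSketch, string>>>;

function resolveModelSketch(catalog: CatalogSketch, agent: string, profile: ProfileSketch): string {
  return catalog[agent]?.[profile] ?? UNKNOWN_AGENT_FALLBACK[profile];
}
```

An agent missing from the catalog therefore follows the selected profile instead of always resolving to sonnet.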
Tom Boucher	deeb6deb67	fix(install): accept Codex TOML floats; idempotent rollback (#3245 ) (#3254 ) * test: reproduce extractFrontmatter LAST-block bug (#3240) * test: reproduce state.update progress trampling and percent formula (#3242) Two failing regression tests: - Bug A: state.update "Last Activity" tramples curated progress.* frontmatter via readModifyWriteStateMd → syncStateFrontmatter - Bug B: 12 declared ROADMAP phases / 6 realized / 6/6 plans done → percent: 100 instead of 50 (phase-fraction ignored) * test: reproduce TOML float rejection and partial rollback (#3245) Two failing regression tests: 1. parseTomlToObject rejects valid Codex TOML floats (tool_timeout_sec = 20.0) 2. Post-install validation failure leaves skills/, agents/, VERSION on disk despite restoring config.toml — hybrid state after abort Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(install): accept TOML floats; idempotent codex rollback (#3245) Two fixes for the Codex install failure introduced by #2760 CR4 finding 3: 1. parseTomlValue now accepts TOML 1.0 float literals (decimals, exponents, underscore separators, signed). Codex CLI's serde schema requires f64 for tool_timeout_sec / startup_timeout_sec — the prior strict-integer-only check was the inverse of what Codex requires, causing every config with a float to trigger a fatal schema validation failure. Date/time separators (-/:T/Z) are still rejected. 2. restoreCodexSnapshot is extended into a unified idempotent rollback that reverts ALL Codex-specific mutations on failure: - config.toml (existing behavior) - skills/gsd-* directories (new) - agents/gsd-.{md,toml} files (new) - get-shit-done/VERSION (new) - orphaned atomic-write temp files (new) Pre-install state is captured before the first Codex write so the rollback reflects the true pre-GSD state. Non-gsd- user content is untouched. The rollback is safe to call multiple times and before any snapshots are captured. 
Fixes #3245 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * changeset: pr=3254 for #3245 * test: fix source-grep lint violation in bug-3242 test (#3242) Replace content.includes() check with line-by-line parse of STATE.md body. The lint enforces structural assertions over raw text matching. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: mark #3242 RED tests as todo pending fix (#3242) The three failing tests are intentional regression tests for bugs in state.cjs that will be fixed in a separate PR. Mark them { todo: true } so they don't block CI on this branch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(install): tighten TOML underscore placement validation (CR finding 1) The float regex used [\d_]* which accepts invalid forms like 1__0, 1_.0, and 1._0. TOML 1.0 §2 requires underscores only between digits. Switch both the integer pre-check and the full float pattern to (?:_?\d)* so consecutive underscores, leading underscores on a segment, and trailing underscores on a segment are all rejected before replace(/_/g,'') can silently normalize them into valid JS numbers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(install): restore pre-existing gsd-* content on rollback (CR finding 2) The snapshot only recorded names of pre-existing skills/gsd-* dirs and agents/gsd-* files. On a failed reinstall the rollback could delete newly-created dirs but could not restore the bytes of dirs/files that were overwritten, leaving the user in a hybrid state (old config.toml, new skill files). Now snapshot the full file tree of every pre-existing gsd-* skill dir into codexPreInstallSkillContents (Map<name, Map<relPath, Buffer>>) and every pre-existing agent file into codexPreInstallAgentContents (Map<filename, Buffer>). restoreCodexSnapshot() uses these maps to wipe-and-restore overwritten entries and only removes entries that had no pre-install state, giving a true atomic rollback guarantee. 
Reads are best-effort so a partial snapshot is still better than none. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(install): scope temp-file cleanup to installer-owned writes (CR finding 3) _cleanTmpFiles() was deleting any .tmp-<pid>-<n> file found under targetDir. This is too broad: other tools in the user's Codex/home directory may create temp files matching the same suffix pattern, and a GSD install rollback would silently delete them. Add __atomicWrittenTmps (a module-level Set<string>) populated by atomicWriteFileSync for every temp path it creates. _cleanTmpFiles() now checks __atomicWrittenTmps.has(full) before unlinking, so only temp files this installer process actually wrote are eligible for cleanup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> fix(test): remove no-op doesNotThrow wrapping try/catch (CR finding 4) assert.doesNotThrow(() => { try { f(); } catch(_){} }) always passes because the catch block swallows every exception before the outer assertion can see it. This meant the rollback-idempotency guarantee was never actually verified. Replace with an explicit threw flag around runCodexInstall, assert that the install did throw (validation failure is expected), and add a post-rollback state assertion that skills/ was not created. This gives a loud failure surface if runCodexInstall starts crashing from inside the rollback path, matching the intent described in the test comment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): correct describe title for float-acceptance tests (CR nitpick 1) The describe block title said 'rejects malformed input that previously slipped through', but the test inside now asserts that TOML floats are accepted (the #3245 inversion). This misled readers expecting every sub-test to assert rejection. Update the title to reflect the mixed behaviour: floats are accepted; dates and trailing-garbage are rejected. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): rename test to match what the assertion actually checks (CR nitpick 2) The test name 'post-install config retains float literal form (20.0 not truncated to 20)' promised a string-form invariant, but the assertion uses numeric equality (assert.strictEqual(parsed.tool_timeout_sec, 20)) which cannot distinguish 20 from 20.0 in JS. Rename to 'post-install config round-trips tool_timeout_sec as numeric 20' so the description matches what the test actually verifies. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): replace raw text scan with state json assertion (CR nitpick 3) The 'Last Activity updates the body field' test was reading STATE.md as raw text, splitting on newlines, and using lines.find/startsWith to locate the 'Last Activity:' line — the exact pattern-match-on-source approach prohibited by the no-source-grep testing standard. Replace with runGsdTools('state json', tmpDir) which surfaces the body- extracted Last Activity value as fm.last_activity in its JSON output, and assert against that structured field instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): correct post-rollback state assertion for early-failure case The previous assertion checked that skills/ didn't exist, but the installer writes skills/ before the schema validator fires. Rollback removes gsd-* dirs inside skills/, not skills/ itself. Update the assertion to verify that no gsd-* skill dirs survive rollback, which is the actual invariant the test name describes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * changeset: document full rollback scope (CR finding 1) Adds config.toml restoration and orphaned atomic-write temp-file cleanup to the changeset description — the previous text only listed skills/, agents/, and VERSION. 
* fix(install): wrap post-snapshot scope in rollback handler (CR finding 2) Any throw between the pre-install snapshot capture and the Codex config block (skills copy, agents copy, VERSION write, manifest write, leaked- path scan, etc.) now triggers _codexPreConfigRollback() so the caller is never left in a partially-installed state. Previously only the later config.toml mutation paths had rollback wired in. Introduces _codexPreConfigRollback (defined right after snapshot capture) and wraps the intervening operations in a try/catch that invokes it on error for Codex installs; non-Codex paths are unaffected. * test: assert threw=true to prevent vacuous pass (CR finding 4) Two tests used bare try/catch without asserting threw === true, so they would silently pass even if runCodexInstall never threw (k060 pattern). Each bare catch block is replaced with a threw flag and a strictEqual(threw, true, ...) assertion. CR findings 2+3 are both addressed in the preceding install commit: finding 3 (restore from snapshot manifest, not current FS state) lands alongside the rollback-wrapper change as part of the restoreCodexSnapshot refactor. * fix(install): reject leading zeros in TOML float integer part per TOML 1.0 (CR finding round 4) TOML 1.0 §2 disallows leading zeros in the integer part of numeric literals — `01`, `00`, `01.5`, `00e2`, `+01.0`, `-01.0` are all invalid. The pre-check and float regexes in parseTomlValue used `\d(?:_?\d)` which accepted any digit as the leading digit. 
Both regexes are tightened to `(0|[1-9](?:_?\d))` for the integer part: - `0` alone is valid - a non-zero leading digit followed by optional underscored digits is valid - `01`, `00`, and any variant with a leading zero and further digits is rejected The "still rejects bare time (07:32:00)" test assertion is broadened from `/unsupported TOML value/` to `/unsupported TOML value|trailing bytes/` because the parser now stops at `0` and the remainder `7:32:00` is rejected as trailing bytes — the invariant (time literals are not accepted) is unchanged. 25 new regression tests cover all rejection cases and valid TOML forms. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 10:25:59 -04:00
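The numeric-literal rules that deeb6deb67 converges on (no leading zeros in the integer part, underscores only between digits, optional fraction and exponent) can be combined into one pattern. This is a sketch under those stated rules, not the installer's `parseTomlValue`; `parseTomlNumberSketch` is a hypothetical name, and special values such as `inf` and `nan` are left out.

```typescript
// [+-]? sign, integer part `0` or non-zero digit followed by (_?digit)*,
// optional `.digits` fraction, optional exponent. `(?:_?\d)*` only ever
// allows an underscore directly between two digits.
const TOML_NUMBER_SKETCH =
  /^[+-]?(?:0|[1-9](?:_?\d)*)(?:\.\d(?:_?\d)*)?(?:[eE][+-]?\d(?:_?\d)*)?$/;

function parseTomlNumberSketch(raw: string): number | null {
  if (!TOML_NUMBER_SKETCH.test(raw)) return null;
  // Safe to strip underscores only after the placement check above has passed.
  return Number(raw.replace(/_/g, ""));
}
```

Validating before `replace(/_/g, '')` matters: stripping first would normalize `1__0` into the valid JS number `10`, which is the failure mode the commit calls out.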
Tom Boucher	397c34142a	Deepen SDK package seam and converge runtime skills policy (#3238 ) * Deepen SDK package seam and converge runtime skills policy * fix(sdk): unified install-root resolution for workflows and agents (CR finding 1) Use the already-resolved gsdInstallDir constant instead of calling resolveLegacyInstallDir() again when computing agentsDir, ensuring workflowsDir and agentsDir share the same install root. * fix(sdk): tilde shortening requires path-boundary match (CR finding 2) Both renderGlobalSkillsBaseDisplayPath and renderGlobalSkillDisplayPath used startsWith(home) which could incorrectly shorten unrelated paths sharing the same prefix. Now checks for home === base or base.startsWith(home + sep) to ensure a real directory boundary. * fix(sdk): validate loadConfig export before invocation (CR finding 3) After requiring core.cjs, check typeof mod.loadConfig === 'function' before calling it. Throws a classified GSDError with the module path if the export is missing, rather than a generic TypeError. * fix(test): guard root lookup before .path dereference (CR finding 4) Added assert.ok() guards for claudeRoot and codexRoot after the .find() calls so that a missing root produces an explicit assertion failure rather than a TypeError on .path dereference. * fix(ci): fail-safe on transient API errors in approval dismissal (CR finding 6) resolveRole() returns 'unknown' for non-404 errors (rate limits, 5xx, network blips). shouldDismissReviewer() now treats 'unknown' as unresolvable and skips dismissal, preventing legitimate approvals from being dismissed due to a transient API failure. Only 'none' (true 404) is treated as a confirmed non-collaborator. * changeset: pr=3238 SDK package seam and runtime skills convergence * fix(sdk): harden resolveGlobalSkillDir against path traversal (CR finding 1) Use resolve+relative to validate that skillName cannot escape the global skills base directory. 
Values like "../../foo" or absolute paths now return null instead of joining directly. All imports (resolve, relative, isAbsolute) were already present in helpers.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(sdk): split skill-dir-resolution and skill-not-found warnings (CR finding 2) After resolveGlobalSkillDir's hardening can return null for traversal attempts, the old single-branch warning "Global skill not found at ..." was misleading. Split into two distinct cases: - skillDir === null → "Could not resolve global skill directory for ..." - skillMd missing → "Global skill not found at ..." Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: lock skill path-traversal rejection in resolveGlobalSkillDir Regression test verifying that traversal segments (../../foo, ../escape), empty string, and absolute paths are all rejected (return null), while a legitimate skill name resolves correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(sdk): align display-path contract + traversal coverage for resolveGlobalSkillMarkdownPath (CR nitpicks) - renderGlobalSkillsBaseDisplayPath now returns a non-null string for unsupported runtimes (e.g. cline → "(cline does not use a skills directory)") matching the existing renderGlobalSkillDisplayPath contract; callers of both helpers no longer need null-checks for unsupported runtimes. - Remove now-redundant ! non-null assertion on renderGlobalSkillsBaseDisplayPath calls in skill-manifest.ts (return type is string, not string \| null). - Extend the path-traversal test block to assert resolveGlobalSkillMarkdownPath also propagates null for ../../foo, ../escape, empty, and /abs/path inputs, locking the null-propagation contract against future refactors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 09:06:43 -04:00
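The path-boundary check behind the tilde-shortening fix in 397c34142a is small enough to show directly. `shortenHomeSketch` is a hypothetical name, and the separator is passed in as a parameter so the example does not depend on the host platform.

```typescript
// Replace a leading home directory with "~" only when it ends at a real path boundary.
// A bare startsWith(home) would also rewrite "/home/alice2/..." when home is "/home/alice".
function shortenHomeSketch(base: string, home: string, sep = "/"): string {
  if (base === home) return "~";
  if (base.startsWith(home + sep)) return "~" + base.slice(home.length);
  return base;
}
```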
Tom Boucher	f5fe5bc063	fix(config): allow model_overrides.<agent-id> in config-set (#3227) (#3253) * test: reproduce config-set rejecting model_overrides.<agent-id> (#3227) * fix(config): allow model_overrides.<agent-id> in config-set (#3227) * changeset: pr=3253 for #3227	2026-05-08 08:40:53 -04:00
Tom Boucher	b2f0fdf250	fix(sdk): anchor extractFrontmatter at file start (#3240) (#3247) * test: reproduce extractFrontmatter LAST-block bug (#3240) * fix(sdk): anchor extractFrontmatter at file start (#3240) * changeset: pr=3247 for #3240	2026-05-08 08:40:30 -04:00
Tom Boucher	447763411a	fix(sdk): phase.add honors --dry-run; rejects unknown flags (#3226 ) (#3246 ) * test: reproduce phase.add dry-run + flag validation gaps (#3226) Add failing tests for: - --dry-run silently absorbed into description (symptom A) - Unknown --flag should return validation error (symptom C) - ### Phase N: ROADMAP heading scan verification (symptom B) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(sdk): phase.add honors --dry-run; rejects unknown flags (#3226) - Add flag parser to phaseAdd: strip recognized flags (--dry-run) from args before positional parsing so they never silently become description or customId values - --dry-run computes the next phase number and roadmap_entry string but skips mkdir, writeFile, and readModifyWriteRoadmapMd; returns { dry_run: true, roadmap_entry } alongside normal fields - Any unrecognized --flag throws a Validation GSDError naming the flag - ROADMAP ### Phase N: heading scan for numbering (symptom B) was already correct; verified with new regression test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * changeset: pr=3246 for #3226 * fix(sdk): phase.add scans disk AND roadmap (union, not fallback) Address CodeRabbit finding: the conditional `if (maxPhase === 0)` guard around the filesystem scan meant that if ROADMAP had any phases but disk was ahead (e.g. ROADMAP max=10, dirs include 12-), phase.add would pick 11 and collide with the existing directory. Remove the guard: always scan on-disk phase directories and take the max across both ROADMAP and filesystem (union semantics). All 57 phase-lifecycle tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> test: reproduce phase.add concurrent ID collision (CR finding) Two concurrent phase.add calls against the same project observe maxPhase before the lock is held, producing duplicate phase IDs. Adds a Promise.all regression test that asserts both calls succeed with distinct phase numbers {11, 12}. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(sdk): compute phase number under roadmap lock (CR finding) Move maxPhase/newPhaseId/dirName computation inside the readModifyWriteRoadmapMd callback so the entire read → compute → write cycle is serialised under the lock. Previously, two concurrent phase.add calls could both observe maxPhase=N before either acquired the lock, then both write with phase ID N+1 — producing duplicate IDs. In dry-run mode (no write, no race) the computation still happens outside the lock to avoid unnecessary contention. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 08:40:24 -04:00
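The union semantics described in 447763411a (take the max across ROADMAP and on-disk directories rather than scanning disk only as a fallback) can be sketched as below. `nextPhaseNumberSketch` is a hypothetical name, and the sketch deliberately ignores decimal phases and project-code prefixes, which the real scanners handle.

```typescript
// roadmapPhases: phase numbers found in ROADMAP headings.
// dirNames: on-disk phase directory names, assumed to look like "12-some-slug".
function nextPhaseNumberSketch(roadmapPhases: number[], dirNames: string[]): number {
  let max = 0;
  for (const n of roadmapPhases) max = Math.max(max, n);
  for (const name of dirNames) {
    const m = /^(\d+)-/.exec(name);
    if (m !== null && m[1] !== undefined) max = Math.max(max, parseInt(m[1], 10));
  }
  return max + 1;
}
```

With ROADMAP at 10 and a `12-` directory on disk, this yields 13 instead of colliding at 11. In the actual fix the computation also runs inside the roadmap lock, which a pure function like this cannot show.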
Tom Boucher	8054959417	fix(query): workstream resolution in init.milestone-op and roadmap.analyze (#3196) - initMilestoneOp now accepts and propagates the workstream parameter: relPlanningPath(workstream) replaces the hardcoded '.planning' dir, getMilestoneInfo gets workstream passed, extractCurrentMilestone gets workstream passed, archiveDir is derived from planningDir not root. - resolveQueryRuntimeContext now reads .planning/active-workstream as a third-priority fallback after --ws flag and GSD_WORKSTREAM env var, completing the documented resolution chain for all query handlers. Closes #3196 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-06 22:01:45 -04:00
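The three-step resolution chain from 8054959417 (`--ws` flag, then `GSD_WORKSTREAM`, then the `.planning/active-workstream` pointer) reduces to a first-non-empty lookup. The sketch takes the three candidate values as plain inputs; `WorkstreamSourcesSketch` and `resolveWorkstreamSketch` are hypothetical, and reading the pointer file and validating names are left to the caller.

```typescript
interface WorkstreamSourcesSketch {
  flag?: string;          // value of --ws, if given
  env?: string;           // value of GSD_WORKSTREAM, if set
  activePointer?: string; // contents of .planning/active-workstream, if present
}

// Return the first non-blank candidate in priority order, or null for the root context.
function resolveWorkstreamSketch(sources: WorkstreamSourcesSketch): string | null {
  for (const candidate of [sources.flag, sources.env, sources.activePointer]) {
    const trimmed = candidate?.trim();
    if (trimmed) return trimmed;
  }
  return null;
}
```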
Tom Boucher	2d32ad82be	fix(plan-phase): remove agent: directive that caused OpenCode subagent dispatch (#3156 ) (#3206 ) * feat(roadmap): parse Mode: field on phase sections Adds a 'mode' field to roadmap.get-phase and roadmap.analyze outputs. Recognizes 'Mode: mvp' lines in phase sections; lowercased + trimmed. Forward-compat: unrecognized values preserved verbatim, no enum check. Foundation for --mvp flag in plan-phase (PRD: vertical-mvp-slice). * feat(plan-phase): parse --mvp flag and resolve MVP_MODE Resolution order: CLI flag → ROADMAP Mode: field → workflow.mvp_mode config → false. Walking Skeleton gate fires for new-project Phase 1. Wires MVP_MODE + WALKING_SKELETON into gsd-planner subagent prompt. Per PRD vertical-mvp-slice Phase 1 (Q1, Q2, Q4). * docs(planner): add vertical-slice planning reference New reference loaded by gsd-planner when MVP_MODE=true. Defines slice ordering, Walking Skeleton rules, and anti-patterns. Referenced from plan-phase workflow MVP_MODE wiring. * docs(planner): add SKELETON.md template Template emitted by gsd-planner under WALKING_SKELETON=true. Captures architectural decisions and out-of-scope list for new-project Phase 1. * chore(inventory): register new planner references Added planner-mvp-mode.md and skeleton-template.md to INVENTORY.md and INVENTORY-MANIFEST.json. References now: 53. * feat(gsd-planner): add MVP Mode Detection section Mode-switched branch in the existing planner agent (per Q4: single agent). Vertical-slice decomposition rules, Walking Skeleton handling, and TDD-mode compatibility. Heavy guidance lives in references/planner-mvp-mode.md. * test(plan-phase): add --mvp resolution-chain integration cases Validates roadmap.get-phase --pick mode and confirms workflow.mvp_mode default is unset in fresh projects. * docs(changelog): announce --mvp vertical-slice planning (#2826) * feat(mvp-phase): add /gsd mvp-phase slash command Standalone command for vertical MVP planning. 
Frontmatter only; heavyweight workflow at get-shit-done/workflows/mvp-phase.md follows in next commit. Mirrors discuss-phase/edit-phase command shape. * docs(planner): add user-story-template reference Defines the canonical 'As a / I want to / So that' format and the ROADMAP.md / PLAN.md emit rules. Used by mvp-phase workflow and gsd-planner agent under MVP_MODE. * docs(planner): add SPIDR splitting reference Defines size signals, the five SPIDR axes (Spike/Paths/Interfaces/Data/Rules), the interactive workflow, and anti-patterns. Per PRD Q3 decision: full interactive flow, not lightweight check. Used by mvp-phase workflow. * fix(mvp-phase): trim description to fit 100-char budget * feat(mvp-phase): add mvp-phase workflow Standalone workflow: phase validation -> user story prompts (As a / I want to / So that) -> SPIDR splitting check -> ROADMAP write (Mode + Goal) -> delegation to plan-phase. Per PRD Phase 2 (Q3 full SPIDR; Phase-2-A/B/C/D decisions). Plan-phase auto-detects MVP via Phase 1's resolution chain, so no flags are needed when delegating. * feat(gsd-planner): emit user-story header in PLAN.md under MVP mode Extends the MVP Mode Detection section (added in Phase 1) so the planner sources the user story from ROADMAP Goal: and emits the bolded As a / I want to / so that form as the first content under the phase header in PLAN.md. References user-story-template.md. * test(mvp-phase): integration smoke test for ROADMAP mutation Validates roadmap.get-phase output after a workflow-spec'd ROADMAP write: mode=mvp and goal=full user story. Catches schema drift between workflow emit and parser expectation. Includes a long-story case (>120 chars) to confirm SPIDR-rejected stories still parse correctly. * chore(inventory): register mvp-phase command + 2 new references Adds /gsd mvp-phase to commands list, mvp-phase workflow to workflows list, and user-story-template.md + spidr-splitting.md to references. References count: 53 -> 55. 
* docs(changelog): announce /gsd mvp-phase command (#2826) * fix(mvp-phase): add TEXT_MODE plain-text fallback for non-Claude runtimes (#2012) * docs(executor): add MVP+TDD gate reference Defines the runtime gate semantics for execute-phase when both MVP_MODE and TDD_MODE are true: pre-task verification of failing-test commit, end-of-phase review escalation from advisory to blocking, behavior-adding task definition. Loaded conditionally by execute-phase workflow and gsd-executor agent. * feat(execute-phase): MVP+TDD runtime gate + blocking review Resolves MVP_MODE in Step 1 (CLI flag -> roadmap mode -> config -> false). Adds per-task gate that halts before behavior-adding tasks run if no failing-test commit exists for the plan. Escalates end-of-phase TDD review from advisory to blocking when both MVP_MODE and TDD_MODE active. Also updates INVENTORY-MANIFEST.json to register execute-mvp-tdd.md (added by Task 1) so manifest-sync tests pass. Per PRD vertical-mvp-slice Phase 3a (decisions Phase-3-A, Phase-3-Split). * feat(gsd-executor): add MVP+TDD Gate section Mirrors the planner's MVP Mode Detection pattern from Phase 1. Instructs halt-and-report when the runtime gate trips, references execute-mvp-tdd.md for full semantics. No agent changes outside the new section. * test(execute-phase): add MVP+TDD resolution-chain integration cases Validates roadmap.get-phase --pick mode and confirms workflow.mvp_mode default is unset in fresh projects. Mirrors the Phase 1 plan-phase resolution-chain integration test. * chore(inventory): register execute-mvp-tdd reference Bumps References count 55 -> 56. Registers execute-mvp-tdd.md. Adds "init" to PROSE_ALLOWLIST in registry integration test so bare `gsd-sdk query init` prose examples in plan docs don't trigger the unregistered-handler guard (real commands are all init.<subcommand>). 
* docs(changelog): announce MVP+TDD runtime gate in execute-phase (#2826) * docs(verifier): add verify-mvp-mode reference Defines UAT framing under MVP mode: user-flow walk-through first, technical checks deferred, coverage check as goal-backward narrowing to the user story's outcome clause. Loaded conditionally by verify-work workflow and gsd-verifier agent. * feat(verify-work): MVP-mode UAT framing — user flow first Resolves MVP_MODE from phase mode field. Under MVP mode, generates UAT in three ordered sections: user-flow walk-through (derived from user story), technical checks (deferred), coverage check (goal-backward). Falls back to standard UAT generation when mode is null/absent. User-story-format guard refuses to verify a mode:mvp phase with a non-user-story goal. Also updates docs/INVENTORY.md (56 references) and docs/INVENTORY-MANIFEST.json to register verify-mvp-mode.md added in Task 1. Per PRD vertical-mvp-slice Phase 3b (decisions Phase-3-B, Phase-3-Verify-Structure). * feat(gsd-verifier): add MVP Mode Verification section Narrows goal-backward verification to the user-story [outcome] clause when phase mode is mvp. References verify-mvp-mode.md. Preserves existing goal-backward methodology for non-MVP phases. User-story-format guard refuses to verify a mode:mvp phase with a non-user-story goal. * docs(changelog): announce MVP-mode UAT framing in verify-work (#2826) * feat(new-project): add Vertical MVP vs Horizontal Layers mode prompt Asks user at project init how to structure the project. Vertical MVP emits Mode: mvp on every initial roadmap phase (per-phase mode preserved per PRD Q1). Horizontal Layers falls back to standard template — no behavioral change for existing flows. Per PRD vertical-mvp-slice Phase 4 (decision Phase-4-Persistence). * feat(progress): add MVP-mode user-flow display When phase has Mode: mvp, progress renders user-flow status from PLAN.md task names alongside standard task progress. 
Tasks that aren't user-flow-shaped (technical-sounding) are filtered out of the user-flow sub-block. Falls back to standard display when mode is null/absent. Per PRD vertical-mvp-slice Phase 4 (decision Phase-4-Progress). * feat(stats): add MVP phase count summary Reads roadmap.analyze (which surfaces mode per phase from Phase 1) and emits 'Phases: N total \| M MVP \| K standard' summary line. Suppressed when MVP_COUNT == 0 to avoid clutter on non-MVP projects. Per PRD vertical-mvp-slice Phase 4. * feat(graphify): add MVP-mode visual differentiation MVP-mode phases render with #22c55e fill color AND ' (MVP)' label suffix — two-channel signaling for color-blind and grayscale renders. Standard phases unchanged. Per PRD vertical-mvp-slice Phase 4 (PRD Q5: distinct visual treatment). * docs(changelog): announce Phase 4 discovery & progress (#2826) * chore(release): bump dev to 1.50.0-canary.0 for first 1.50.0 canary Sets the base version that .github/workflows/canary.yml derives the canary tag from (strips suffix → base 1.50.0 → next available v1.50.0-canary.N). This kicks off the 1.50.0 release train, opened by the MVP/TDD/UAT vertical slice landed across PRs #2867, #2874, #2878, #2880, #2883. * docs: add CANARY stream README + v1.50.0-canary.1 release notes - docs/CANARY.md — explains the dev→@canary stream policy, install/rollback paths, and when (not) to install canary builds - docs/RELEASE-v1.50.0-canary.1.md — release notes for the first 1.50.0 canary cut: vertical MVP/TDD/UAT slice (#2867 + #2874 + #2878 + #2880 + #2883), opening the 1.50.0 train under PRD #2826 - docs/README.md — index entry + quick link for the canary stream * fix(ci/canary): publish gate checks dev branch, not main Four publish-step `if:` conditions in .github/workflows/canary.yml were checking `github.ref == 'refs/heads/main'`. 
Those steps (Tag and push, Publish to npm, Publish SDK to npm, Verify publish) therefore always skipped on every workflow_dispatch invocation since canary runs from dev, never main. The workflow's own header comment is unambiguous: `dev → @canary`. The gate was a copy-paste from release.yml (which correctly targets main for the @next/@latest streams) that was never corrected for the canary stream. This is why the 1.50.0-canary.1 publish hadn't materialized despite three green workflow runs. With the gate corrected, the next dispatch will actually publish. * ci(release-sdk): make release-sdk.yml dispatchable from the dev branch The workflow lives on main only, so the GitHub Actions "Use workflow from" dropdown doesn't list dev — meaning dev → @dev publishes can't be triggered from the dev branch directly. Add the file to dev so an operator can dispatch it with branch=dev and tag=dev. Per project release-stream policy: dev branch publishes canary (@dev). This is the stream that needs the file most, since main never publishes @dev itself (main does @next / @latest). File is byte-identical to main's release-sdk.yml — straight propagation, no behavioral change. Tracking issues #2925, #2929. * docs(mvp): canary-prep concept cleanup — CONTEXT.md, mvp-concepts index, --prd interaction (#3176) * chore(mvp): concept cleanup + cross-ref index for v1.50.0-canary.2 prep - CONTEXT.md gains 7 MVP domain terms (MVP Mode, User Story, Walking Skeleton, Vertical Slice, Behavior-Adding Task, MVP+TDD Gate, SPIDR Splitting) so the project glossary matches the shipped surface. - New get-shit-done/references/mvp-concepts.md indexes the six MVP reference files and concept-to-file map so agents and contributors can find the right canonical doc without grepping. - plan-phase.md Walking Skeleton block now documents that --mvp and --prd compose orthogonally on Phase 1; no precedence needed. - INVENTORY/INVENTORY-MANIFEST refreshed for the new reference (58 -> 59). No behavior change. 
Canary-prep cleanup ahead of v1.50.0-canary.2. Surfaced for follow-up (not in this PR): - MVP_MODE resolution shell block duplicated across plan-phase, execute-phase, verify-work workflows (needs a shared workflow-include mechanism; structural change). - Behavior-Adding Task predicate is prose-only; no shared utility. - User Story regex hardcoded in verify-work; would benefit from a central definition consumed by the verifier and the mvp-phase command. * chore(changeset): set PR number for mvp concept cleanup * feat(mvp): centralize resolution surfaces + fix SDK roadmap mode parity (#3178) Three new SDK query verbs replace the architectural duplication surfaced by the v1.50.0-canary.2 review against dev tip `12c4e565`: phase.mvp-mode <N> [--cli-flag] Single canonical precedence resolver (CLI flag -> ROADMAP Mode: mvp -> workflow.mvp_mode config -> false). Replaces 4-8 lines of bash that were duplicated across plan-phase.md, execute-phase.md, verify-work.md, and progress.md. Returns {active, source, roadmap_mode, config_mvp_mode, cli_flag_present}. task.is-behavior-adding <plan-file> \| --task-content <xml> Behavior-Adding Task predicate (tdd="true" + <behavior> block + non-test source files in <files>). Replaces prose-only specification in references/execute-mvp-tdd.md; gsd-executor agent now invokes the verb instead of re-inlining the three checks. Returns {is_behavior_adding, checks, reason}. user-story.validate <text> \| --story <text> Owns the canonical User Story regex /^As a .+, I want to .+, so that .+\.$/ previously hardcoded in verify-work.md prose. Consumed by gsd-verifier (phase-goal guard) and /gsd-mvp-phase (interactive-prompt validation). Returns {valid, slots: {role, capability, outcome}, errors[]}. Bug fix bundled: sdk/src/query/roadmap.ts searchPhaseInContent now extracts the mode field from Mode:, restoring parity with roadmap.cjs:120-123. 
Without this, roadmap.get-phase --pick mode returned null on the native dispatch path even when the phase had Mode: mvp set, causing MVP_MODE to silently fall through to the config/false branch in every consuming workflow. The original PRs Phase 1 (#2885) shipped the CJS parser but the SDK port omitted the field; this fix brings them back to parity. Workflows + agents updated to call the verbs: - plan-phase.md, execute-phase.md, verify-work.md, progress.md call phase.mvp-mode (one line replaces the duplicated bash chains). - execute-phase.md MVP+TDD gate calls task.is-behavior-adding. - verify-work.md goal guard calls user-story.validate. - mvp-phase.md interactive prompt validates via user-story.validate. - gsd-executor agent references task.is-behavior-adding instead of prose. - gsd-verifier agent references user-story.validate instead of inlined regex. Tests: 24 new vitest tests in sdk/src/query/mvp.test.ts cover all three verbs + the regression. Two existing contract tests (progress, verify) updated to assert on the new verb shape. All 60 existing MVP contract tests pass; golden integration suite (38 + 42 tests) passes. Closes #3177 * fix(canary.2): unblock release gates for v1.50.0-canary.2 Run 25451329660 (Release SDK Bundle on dev, 2026-05-06T17:41) failed at the test-suite step with 3 deterministic content/structure gate failures, all attributable to the MVP umbrella integration in #3178 and the docs sweep in #3180. 
Failure 1: /gsd-mvp-phase undocumented in workflows/help.md - tests/bug-2954-help-md-slash-command-stubs.test.cjs requires every shipped commands/gsd/<X>.md to have a /gsd-<X> mention in help.md - PR #3180 updated docs/COMMANDS.md but missed help.md (which the AI agents load in-product) - Fix: add a /gsd-mvp-phase entry to help.md right before /gsd-plan-phase Failures 2 + 3: execute-phase.md (1727) and plan-phase.md (1714) over XL budget (1700) - PR #3178 added MVP-mode verb calls (phase.mvp-mode, task.is-behavior-adding, user-story.validate) to both workflow files, pushing them past 1700 lines - Fix: bump XL_BUDGET 1700 -> 1800 with inline comment pointing at the structural follow-up (extract MVP bodies to <workflow>/modes/mvp.md per the discuss-phase/modes/ precedent) - The structural extract is the right long-term fix but is bigger than canary unblock scope; will land in a follow-up after canary cycles Local verification: $ node --test tests/bug-2954-help-md-slash-command-stubs.test.cjs tests/workflow-size-budget.test.cjs tests 111 pass 111 fail 0 After this lands, re-trigger Release SDK Bundle on dev for v1.50.0-canary.2. * chore(changeset): set PR number for canary.2 unblock * fix(codex): generate-claude-md writes to AGENTS.md on Codex runtime When config.runtime === 'codex' or GSD_RUNTIME=codex, override the output target to AGENTS.md regardless of claude_md_path, so Codex projects no longer have GSD sections written to CLAUDE.md by mistake. Fixes both the CJS (gsd-tools) and SDK (profile-output.ts) paths. Explicit --output flags are still honoured in both paths. Closes #3163 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(plan-phase): remove agent: directive that caused OpenCode subagent dispatch On OpenCode, any command with `agent: <name>` in its frontmatter is auto-dispatched to a subagent context where the Agent tool is unavailable. 
plan-phase.md and mvp-phase.md both carried `agent: gsd-planner`, causing them to run inside gsd-planner's subagent context with no ability to spawn researcher/planner/checker subagents — the orchestrator fell back to inline execution for all three phases. Fix: remove `agent: gsd-planner` from both command files so they run in the main agent context. Also replace the stale `Task` tool in allowed-tools with `Agent` (the correct dispatcher tool name post-#3168 rename). Adds a structural regression test that parses YAML frontmatter of every commands/gsd/*.md file and asserts no command carries an `agent:` directive. Closes #3156 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(mvp): address CodeRabbit workflow and contract findings * fix(execute-phase): use registered state.update query command --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-06 21:51:38 -04:00
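Two of the verbs described in the commit above lend themselves to a short sketch: the MVP-mode precedence chain (CLI flag, then ROADMAP `Mode: mvp`, then `workflow.mvp_mode` config, then false) and the User Story check. The regex is the one quoted in the message with capture groups added; the function names, the `source` labels, and the error text are assumptions, not the SDK's actual output.

```typescript
// Hypothetical sketch; the real phase.mvp-mode and user-story.validate
// handlers are not shown in this log.
type MvpSource = "cli" | "roadmap" | "config" | "default";

function resolveMvpMode(opts: {
  cliFlag: boolean;
  roadmapMode: string | null;
  configMvpMode?: boolean;
}): { active: boolean; source: MvpSource } {
  if (opts.cliFlag) return { active: true, source: "cli" };
  if (opts.roadmapMode !== null && opts.roadmapMode.trim().toLowerCase() === "mvp") {
    return { active: true, source: "roadmap" };
  }
  if (opts.configMvpMode === true) return { active: true, source: "config" };
  return { active: false, source: "default" };
}

// Same shape as the quoted regex, with groups added to extract the slots.
const USER_STORY = /^As a (.+), I want to (.+), so that (.+)\.$/;

interface StoryResult {
  valid: boolean;
  slots: { role: string; capability: string; outcome: string } | null;
  errors: string[];
}

function validateUserStory(text: string): StoryResult {
  const m = USER_STORY.exec(text.trim());
  if (!m) {
    return {
      valid: false,
      slots: null,
      errors: ["expected: As a <role>, I want to <capability>, so that <outcome>."],
    };
  }
  return {
    valid: true,
    slots: { role: m[1], capability: m[2], outcome: m[3] },
    errors: [],
  };
}
```

Keeping both checks in one place is the point of the change described above: workflows call a single resolver instead of repeating the precedence logic in shell snippets.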
Tom Boucher	4237b0d78e	Merge pull request #3106 from nicholasferrer/fix/3061-commit-pathspec-leak fix(commit): scope every commit call to its staged pathspec	2026-05-06 20:55:30 -04:00
Tom Boucher	d44fcee013	Merge pull request #3110 from patrickclery/fix/3100-search-dirs-colon-leaks fix: replace stale /gsd: references in agents/, sdk/src/, and .clinerules	2026-05-06 20:52:43 -04:00
Tom Boucher	995e24431b	Merge pull request #3193 from gsd-build/fix/2641-details-summary-milestone-anchor fix(sdk): extractCurrentMilestone supports <details><summary> milestone headers (#2641)	2026-05-06 20:40:13 -04:00
Tom Boucher	ca148036d2	fix(roadmap): prevent milestone version substring matches	2026-05-06 20:37:18 -04:00
Ben Lamm	13bf56477a	fix(#2641): symmetric attribute tolerance in stripShippedMilestones + lockdown tests Address CodeRabbit follow-up review on PR #3046. One real bug + two lockdown gaps + one defensive assertion. REAL BUG — sibling-asymmetry in <details> attribute tolerance: extractCurrentMilestone's <details>-aware fallback uses <details\b[^>]*> to tolerate attributes (#2641 hardening commit). stripShippedMilestones still used literal <details>, so shipped content wrapped in `<details open>` (or any attributed tag) leaked through the strip. This is the failure mode trek-e's review almost caught with the "<details open>" / extended-attribute test gap I deferred — CodeRabbit caught the deeper issue: it's not just a test gap, it's an actual asymmetry between the two functions that handle <details> blocks. Fix: align stripShippedMilestones's regex with extractCurrentMilestone's <details\b[^>]*> form. Comment explicitly notes the symmetry contract so a future change to either function flags the other. Tests added in stripShippedMilestones describe block: - removes <details open> blocks - removes <details class="..." data-..."> blocks LOCKDOWN — leading-# strip in synthesized heading: My existing inline-HTML test exercised tag-stripping but didn't directly exercise the leading-# strip path (`.replace(/^#+\s*/, '')`). Added a dedicated test with `<summary># v0.9 Hash-Prefixed</summary>` so a future refactor that drops the strip would fail loudly instead of producing `## # v0.9 …` (which downstream `#{2,4}` regex parses as a 4-hash header). DEFENSIVE — toBeDefined guard in roadmapAnalyze regression test: Added `expect(data.milestones).toBeDefined()` before casting and calling `.some()`. Failure now reports "expected undefined to be defined" instead of TypeError. 
META: my prior adversarial pass missed the sibling-asymmetry because the checklist's "sibling consistency" item only audited PARSERS for the same INPUT field (STATE.md's `milestone:`), not ADJACENT FUNCTIONS that process the same DATA SHAPE (<details> blocks). The latter is a wider audit — every adjacent function that touches the data shape my new code relies on. Will refine the learned rule. Verification: 51/51 roadmap.test.ts pass (was 48; +3 tests). FAMP smoke unchanged: roadmap.get-phase 3 returns active milestone phase.	2026-05-06 15:41:27 -04:00
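The asymmetry described in the commit above is easy to reproduce in isolation. The sketch below contrasts a literal `<details>` matcher with an attribute-tolerant one; the function names are hypothetical and this is not the actual `stripShippedMilestones` body.

```typescript
// Hypothetical sketch: literal vs attribute-tolerant stripping of <details> blocks.
const LITERAL_DETAILS = /<details>[\s\S]*?<\/details>/gi;
const TOLERANT_DETAILS = /<details\b[^>]*>[\s\S]*?<\/details>/gi;

/** Misses blocks whose opening tag carries attributes, e.g. <details open>. */
function stripLiteral(content: string): string {
  return content.replace(LITERAL_DETAILS, "");
}

/** Removes the block regardless of attributes on the opening tag. */
function stripTolerant(content: string): string {
  return content.replace(TOLERANT_DETAILS, "");
}

const roadmap = [
  "<details open><summary>v0.8 (shipped)</summary>",
  "### Phase 3: old work",
  "</details>",
  "## v0.9 Current",
  "### Phase 3: new work",
].join("\n");

const leaked = stripLiteral(roadmap).includes("old work");     // shipped content survives
const stripped = !stripTolerant(roadmap).includes("old work"); // shipped content removed
```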
Ben Lamm	19041b8824	test(#2641 ): lockdown tests from self-adversarial pass Self-run adversarial pass on PR #3046 before next reviewer round-trip. Three lockdown tests added — none uncovered new bugs, all lock current behavior so a future change doesn't silently flip a convention. 1. Single-quote YAML version (`milestone: 'v0.9'`) Parity with the existing double-quote test. The strip pattern `/^["']\|["']$/g` handles both — locked here so a future change to either character class doesn't silently regress one form. 2. Heading-anchor wins over <details> fallback (precedence lock) When a ROADMAP has BOTH `### v0.9` heading AND `<details><summary>v0.9</summary>` block, the heading-level lookup matches first and the fallback never fires. Test asserts the heading slice is returned starting at offset 0 AND the synthesized `## v0.9 ... details-anchored` heading is NOT prepended (proves fallback didn't run). Also documented in-test that the heading-anchor slice naturally includes downstream <details> blocks verbatim — a property of the heading path, not of this PR's fallback. 3. Multiple <details> blocks for same version → first match wins `content.match(detailsPattern)` (non-`g`) returns first match in document order. Locked so a future change to the matcher (e.g. switching to `matchAll` and picking last) doesn't silently change which block is treated as active. Adversarial-checklist coverage on commit `781cc6f8`: - Boundary cases: empty / whitespace / single-char / single-quote / double-quote / digit-suffix (`v0.91`) / dot-suffix (`v0.9.0`) / hyphen-suffix (`v0.9-rc.1`, intentional same-milestone match per existing currentVersionStr convention) — all covered. - Sibling consistency: parseMilestoneFromState, getMilestoneInfo, extractCurrentMilestone all strip quotes identically. - Comment-vs-behavior: walked nested-guard, empty-guard, lookahead, tag-strip by hand against the regex; all comments accurate. 
- Downstream consumers: roadmapAnalyze + roadmapGetPhase both verified end-to-end via tests + FAMP smoke. - Failure-mode locality: all fall-through paths produce loud failures (empty arrays, `{found:false}`); no silent confident-wrong outputs. 48/48 roadmap.test.ts tests pass.	2026-05-06 15:41:27 -04:00
Ben Lamm	4b66ca5800	fix(#2641): harden <details> fallback per trek-e review Address trek-e's adversarial review on PR #3046. Two critical merge-blockers plus four hardening items, all now covered with tests. CRITICAL #1 — substring-version trap: `[^<]*${escapedVersion}[^<]*` did substring containment, so `milestone: v0.1` matched <summary>v0.10 …</summary> and returned the v0.10 block's body as the active milestone — confidently-wrong content worse than the pre-PR fall-through. Add `(?![\d.])` non-version-character lookahead, mirroring the same boundary protection used by the existing `currentVersionStr` logic on the heading path. Test asserts v0.1 active with v0.10 sibling block returns v0.1's phases, not v0.10's. CRITICAL #2 — nested <details> silent truncation: The lazy `[\s\S]*?</details>` terminates on the FIRST </details>, which is the inner closer when nesting is present. Prior comment claimed "would mis-anchor (acceptable; falls through)" — factually wrong: the match succeeds with truncated body and is returned with a confident `## ${summary}` heading. Future maintainer investigating a "missing phase" report would be misled. Add `!detailsMatch[2].includes('<details')` guard so nesting falls through to stripShippedMilestones (loud failure) instead of returning truncated content (silent failure). Test locks the contract: no synthesized v0.9 heading anchored to truncated body. HARDENING: - Empty-body guard: `<details><summary>v0.9</summary></details>` would synthesize `## v0.9\n` (phantom milestone, zero phases, no error signal). Treat as no-match. - Inline-HTML in <summary>: rejected by `[^<]*` capture. Widen to `(?:(?!</summary>).)*?` (non-greedy until close tag) and strip tags + leading `#` from the captured summary before promoting to a `##` heading. Covers GitHub-rendered <em>(active)</em>, <code>v0.9</code>, <strong>...</strong> patterns. 
- JSDoc: rewrote to describe both anchoring strategies and the synthesized-heading contract; demoted stale "Port of core.cjs lines 1102-1170" to historical context with the divergence list. - Comment block: rewrote in contract style ("any consumer scanning /##\s.*vX.Y/ sees the active milestone") instead of coupling to specific call sites (roadmapAnalyze, "later in this file"). Adds explicit regex anatomy + hardening-guards section so future readers can audit each guard. OUT OF SCOPE (per trek-e's "Recommended action" tier): - Debug logging on fall-through paths (Suggestion #10) — adds tracing surface to a function that doesn't currently use logger; appropriate for a follow-up if/when other extraction bugs surface. - Uppercase <DETAILS>/<SUMMARY> + extended attribute coverage (Test gap #7 last two rows) — already covered by the documented `i` flag and the existing <details open> test; adding redundant cases inflates the test set without locking new contracts. Verification: 45/45 roadmap.test.ts tests pass (was 41/41; added 4 hardening tests). FAMP end-to-end smoke unchanged: roadmap.get-phase 3 returns "Claude Code integration polish", roadmap.analyze surfaces v0.9 Local-First Bus in data.milestones with phase_count: 4.	2026-05-06 15:41:27 -04:00
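The guards listed in the commit above (version-boundary lookahead, nested-block guard, empty-body guard, tag and leading-`#` stripping) can be combined in a small sketch. This is a reconstruction from the commit message under stated assumptions, not the code in `roadmap.ts`; `findDetailsMilestone` and `escapeRegExp` are hypothetical names, and the exact regex in the repository may differ.

```typescript
// Hypothetical sketch of a <details><summary> fallback with the hardening guards.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/**
 * Returns "## <summary text>\n<body>" for the first <details> block whose
 * <summary> names `version`, or null when no block qualifies.
 */
function findDetailsMilestone(content: string, version: string): string | null {
  const v = escapeRegExp(version);
  const pattern = new RegExp(
    "<details\\b[^>]*>\\s*<summary\\b[^>]*>" +
      // Summary text up to </summary>; version must not be followed by a digit or dot.
      "((?:(?!</summary>).)*?" + v + "(?![\\d.])(?:(?!</summary>).)*?)" +
      "</summary>([\\s\\S]*?)</details>",
    "i"
  );
  const m = content.match(pattern); // non-global: first match in document order
  if (!m) return null;

  const body = m[2];
  // Nested guard: a lazy match stops at the inner </details>, so the body
  // would be truncated; treat nesting as no-match instead.
  if (/<details/i.test(body)) return null;
  // Empty-body guard: avoid synthesizing a phantom milestone.
  if (body.trim() === "") return null;

  const summary = m[1]
    .replace(/<[^>]+>/g, "") // drop inline HTML such as <em> or <code>
    .trim()
    .replace(/^#+\s*/, "");  // drop a leading "# " inside the summary
  return `## ${summary}\n${body}`;
}
```

Returning null in the guarded cases lets the caller fall through to its existing path, which matches the "loud failure over silent truncation" reasoning in the commit message.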
Ben Lamm	c8239f67f8	fix(#2641): inject normalized ## heading from <details><summary> capture Address CodeRabbit review on PR #3046: the prior commit returned only the body inside <details>...</details>, which fixed the `roadmapGetPhase` miss but left `roadmapAnalyze`'s downstream `data.milestones` scan (`/##\s*(.*v(\d+(?:\.\d+)+)[^(\n]*)/gi` at the bottom of roadmap.ts) without an active-milestone anchor in the returned slice. Now capture the <summary> text and prepend it as a synthesized `##` heading on the returned slice. This makes both `data.phases` (the original bug) AND `data.milestones` (the downstream consumer) surface the active milestone correctly for <details>-wrapped ROADMAPs. Also widened the inner tag to `<summary\b[^>]*>` for symmetry with the outer `<details\b[^>]*>` — both now tolerate attributes. Verified end-to-end against FAMP's v0.9 ROADMAP: - Before this commit (after PR #3046 base): milestones: [{heading: '# Phase 1: ... (v0.5.2 atomic bump)', version: 'v0.5.2'}] - After this commit: milestones: [{heading: 'v0.9 Local-First Bus', version: 'v0.9'}, {heading: '# Phase 1: ... (v0.5.2 ...)', version: 'v0.5.2'}] (The v0.5.2 entry is pre-existing noise from the loose `##\s*` regex matching the `### Phase 1: famp-bus (v0.5.2 atomic bump)` body heading; unrelated to this fix and out of scope for this PR.) Tests: - Updated the two `<details><summary>` tests to assert the synthesized `## v0.9 Local-First Bus` heading is present on the returned slice. - Added a 4th regression test (`roadmapAnalyze`) confirming `data.milestones` now contains the active milestone for <details>-wrapped ROADMAPs. - All 40 roadmap.test.ts tests pass.	2026-05-06 15:41:27 -04:00
Ben Lamm	ba6a3efc3e	fix(#2641 ): strip YAML quotes from STATE.md milestone version Address CodeRabbit review on PR #3046: extractCurrentMilestone read the `milestone:` value from STATE.md frontmatter via `.trim()` only, while parseMilestoneFromState() and getMilestoneInfo() both also strip surrounding YAML quotes via `.replace(/^["']\|["']$/g, '')`. For projects whose STATE.md uses quoted YAML (`milestone: "v0.9"`), `version` carried literal quotes, `escapedVersion` became `\"v0\.9\"`, and neither the markdown-heading regex nor the new <details><summary> fallback could match anything — falling through to stripShippedMilestones() and reintroducing the same archived-milestone misrouting this PR addresses. Strip quotes for parity. Three-line addition + one new test. All 41 roadmap.test.ts tests pass.	2026-05-06 15:41:27 -04:00
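The quote-stripping step from the commit above is small enough to show directly. The `.replace(/^["']|["']$/g, '')` call is the one quoted in the message; the surrounding function and its line-based frontmatter lookup are a hypothetical simplification rather than the SDK's parser.

```typescript
// Hypothetical sketch: read `milestone:` from STATE.md text and strip YAML quotes.
function readMilestoneVersion(stateMd: string): string | null {
  const m = /^milestone:\s*(.+)$/m.exec(stateMd);
  if (!m) return null;
  // Trim first, then remove one leading and one trailing quote character.
  return m[1].trim().replace(/^["']|["']$/g, "");
}
```

Without the `replace`, a quoted value such as `"v0.9"` keeps its quotes, and any regex built from it can no longer match a heading or summary that contains the bare version.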
Ben Lamm	592b676414	fix(#2641): recognize <details><summary> as active-milestone anchor `extractCurrentMilestone` only matched markdown headings (## v0.9, ### v0.9) to find the active milestone slice. Projects that wrap their active milestone's phase details inside `<details><summary>vX.Y …</summary>` (a common GitHub-friendly collapse pattern, e.g. FAMP) fell through to `stripShippedMilestones`, which strips ALL `<details>` blocks indiscriminately. Net effect: `roadmapGetPhase` returned `{found:false}` for phases that ARE in the active ROADMAP. The `init.phase-op` safety guard at `init.ts:133` ('drop archived disk match when phase is in current ROADMAP') depends on `roadmapPhase.found`, so it didn't fire. `init.phase-op` then returned a `phase_dir` pointing at an ARCHIVED milestone's same-numbered phase — silently routing downstream workflows (e.g. /gsd-discuss-phase) into completed phases. Fix: when no markdown heading matches the active version, try matching `<details\b[^>]*><summary>...vX.Y...</summary>`. Returns the inner content of the matching block. Purely additive — `stripShippedMilestones` behavior and its tests are unchanged. The `\b[^>]*>` form tolerates attributes like `<details open>` or `<details class="...">` (GitHub commonly emits `<details open>` for default-expanded sections). Lazy `[\s\S]*?` matches up to the first `</details>`; nested `<details>` inside the active milestone are not expected and would mis-anchor (acceptable; falls through to the existing `stripShippedMilestones` path with no regression vs. today's behavior). Closes #2641. Distinct from the closed #2642 which bundled three orthogonal changes (parser fix + checkbox-scan fix + STATE.md counting auth) into one PR; this PR addresses only the parser anchoring bug, leaving `stripShippedMilestones`, `roadmapAnalyze`, and `initMilestoneOp` untouched. 
Tests added (3, all in `roadmap.test.ts`): - `bug-2641: finds active milestone wrapped in <details><summary>vX.Y …</summary>` - `bug-2641: finds active milestone in <details open><summary>vX.Y …</summary>` - `bug-2641: returns found:true for phase inside <details>-wrapped active milestone` (end-to-end via `roadmapGetPhase`) All existing `roadmap.test.ts` tests pass (39/39). Real-world repro verified against an FAMP-style ROADMAP: before the fix, `gsd-sdk query roadmap.get-phase 3` returned `{found:false}` despite the phase being at line 113 of the active ROADMAP; after the fix, it returns the correct phase metadata, and `init.phase-op 3` no longer returns the v0.8 archived `phase_dir`.	2026-05-06 15:41:27 -04:00
Tom Boucher	a1a81eec90	fix(config): align SDK runtime-state key validation with CJS	2026-05-06 15:19:34 -04:00
Tom Boucher	96ce608ee6	fix(config): add resolve_model_ids to VALID_CONFIG_KEYS; accept workflow._auto_chain_active via RUNTIME_STATE_KEYS Fixes #3162 `resolve_model_ids` is a documented top-level config key (CONFIGURATION.md) read by core.cjs and session-runner.ts, but was missing from the CJS and SDK VALID_CONFIG_KEYS allowlists — causing config-set to reject it with "Unknown config key". `workflow._auto_chain_active` is internal runtime state intentionally excluded from VALID_CONFIG_KEYS by #2530, but plan-phase, execute-phase, discuss-phase, transition, and new-project workflows all write it via `config-set`. Without a valid write path these calls emit spurious errors (silenced with `\|\| true` but noisy in logs). A new RUNTIME_STATE_KEYS set in config-schema.cjs holds keys that isValidConfigKey() accepts without exposing them as user-settable options — preserving the #2530 intent while fixing the runtime error. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-06 14:56:34 -04:00
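The split between user-settable keys and internal runtime-state keys described above can be sketched with two sets. The set names and `isValidConfigKey` come from the commit message; the set contents here are only the two keys it mentions, and `userSettableKeys` is a hypothetical helper added for illustration.

```typescript
// Hypothetical sketch; the real allowlists contain many more keys.
const VALID_CONFIG_KEYS = new Set<string>(["resolve_model_ids"]);

// Keys that workflows may write via config-set but that are not
// advertised as user-facing options.
const RUNTIME_STATE_KEYS = new Set<string>(["workflow._auto_chain_active"]);

/** Accepts both documented keys and internal runtime-state keys. */
function isValidConfigKey(key: string): boolean {
  return VALID_CONFIG_KEYS.has(key) || RUNTIME_STATE_KEYS.has(key);
}

/** Only the documented keys are listed as settable options. */
function userSettableKeys(): string[] {
  return Array.from(VALID_CONFIG_KEYS).sort();
}
```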
Tom Boucher	3785c09307	fix(sdk): include hotpath fallback reason in bridge observability	2026-05-05 20:29:43 -04:00
Tom Boucher	fe16143e29	fix(sdk): align hotpath observability with actual dispatch mode	2026-05-05 20:22:03 -04:00
Tom Boucher	8ad2e3877f	fix(sdk): address CodeRabbit runtime bridge and docs findings	2026-05-05 19:59:56 -04:00
Tom Boucher	51b809e8e9	feat(sdk): expose runtime bridge controls via GSD options	2026-05-05 19:36:47 -04:00
Tom Boucher	00ba404b60	test(sdk): enforce runtime bridge seam and explicit no-fallback behavior	2026-05-05 19:31:03 -04:00
Tom Boucher	54b06e653e	docs(sdk): document runtime bridge seam, strict mode, and fallback policy	2026-05-05 19:29:59 -04:00
Tom Boucher	1bd11ab699	feat(sdk): emit runtime bridge dispatch observability events	2026-05-05 19:26:01 -04:00
Tom Boucher	0026065c7a	feat(sdk): add strict mode and explicit fallback policy to runtime bridge	2026-05-05 19:23:39 -04:00
Tom Boucher	98dd9e4afb	refactor(sdk): add runtime bridge seam for query dispatch	2026-05-05 19:21:31 -04:00
Tom Boucher	9811782e6d	fix(#3121): implement commands verb in SDK native registry (#3146) - Add commandsList handler — returns sorted JSON array of all registered verb strings; satisfies workstream-flag.md + agent tooling discoverability - Register ['commands', commandsList] in DECISION_ROUTING_STATIC_CATALOG - Add golden-policy exemption (SDK-only, no CJS mirror needed) - check.decision-coverage-plan/verify were already registered; commands was the remaining gap Closes #3121	2026-05-05 15:02:34 -04:00
Nicholas Ferrer	85ef9553d2	fix(commit): scope every commit call to its staged pathspec The commit handler ran `git add <paths>` followed by `git commit` without a pathspec, so anything pre-staged externally before the handler ran was swept into the commit. #2767 fixed every call site to use --files but left the handler emitting a pathspec-less commit, so the bug survived the well-formed form too. Compute pathsToCommit once and pass `'--', ...pathsToCommit` to every git commit invocation: regular, --amend, and commit-to-subrepo. The staged-files check uses the same pathspec so "nothing staged" reflects what would actually be committed, not unrelated index entries. Two follow-up safeguards on the same surface: * When `--files` is passed but every following token gets filtered out (e.g. `--files --no-verify`), reject with `--files requires at least one path` instead of silently falling back to .planning/. * Both `git add` invocations now use the `--` separator so a path starting with `-` (e.g. a file literally named `-A.md`) is treated as a pathspec rather than a git option. Adds five regression tests in `commit.test.ts`: three covering the pathspec scope (`--files`, `.planning/` fallback, and `--amend` with pre-staged unrelated changes), one covering the empty `--files` rejection, and one covering the `-A.md` round-trip. Closes #3061	2026-05-05 15:30:05 -03:00
Patrick Clery	f9c1f01971	fix: extend fix-slash-commands SEARCH_DIRS to agents/, sdk/src/, .clinerules scripts/fix-slash-commands.cjs SEARCH_DIRS did not cover agents/, sdk/src/, or top-level files, so 9 colon-form references survived in 6 files. The hit at agents/gsd-codebase-mapper.md:105 propagated into ~/.claude/agents/ at install time (the fixer is not wired into install) and produced unrunnable /gsd:<cmd> suggestions in agent output on non-Gemini runtimes. This commit includes Pass 1 (the 9 line edits) AND Pass 2 (extending the fixer's SEARCH_DIRS so future regressions are auto-rewritten and caught by the bug-2543 guard, which mirrors that list). The standalone bug-3100 test added in the prior revision is removed in favor of the bug-2543 guard's extended scan, per CONTRIBUTING.md test standards (no source-grep tests on non-.md files). Refs #3100	2026-05-05 13:19:10 -04:00
Tom Boucher	aa64638176	Merge pull request #3112 from gsd-build/fix/3101-plan-summary-matcher-in-core-cjs-reports fix: canonicalize plan-summary matching for suffixless summaries	2026-05-04 23:35:34 -04:00
Tom Boucher	e7ecd46bbe	Merge pull request #3115 from gsd-build/fix/3053-sdk-ignores-multi-plan-phase-layout-plan fix: count nested plans/ layout in phase status indexing	2026-05-04 23:35:26 -04:00
Tom Boucher	083e813aea	Merge pull request #3116 from gsd-build/fix/3055-bug-top-level-branching-strategy-in-plan fix: normalize legacy top-level branching_strategy into git config	2026-05-04 23:33:28 -04:00
Tom Boucher	f01f6b76dd	Merge pull request #3122 from gsd-build/fix/3088-gsd-complete-milestone-leaves-state-md-n fix: normalize stale STATE narrative tails on milestone completion	2026-05-04 23:31:06 -04:00
Tom Boucher	67684626d8	fix(3088): append missing STATE narrative sections on milestone close	2026-05-04 23:29:45 -04:00
coderabbitai[bot]	2d25c97706	fix: apply CodeRabbit auto-fixes Fixed 1 file(s) based on 2 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>	2026-05-05 03:17:22 +00:00
Tom Boucher	2dcf374da0	fix(milestone): normalize STATE narrative after milestone completion	2026-05-04 23:17:00 -04:00
Tom Boucher	58062a64a0	fix(sdk-config): honor legacy top-level branching_strategy in init	2026-05-04 23:06:54 -04:00
Tom Boucher	65024683fd	fix(init): count plans/ summaries from nested plans/ layout	2026-05-04 23:03:10 -04:00
Tom Boucher	c7886415c3	fix(phase): canonicalize plan-summary matching for suffixless summaries	2026-05-04 22:51:15 -04:00
Tom Boucher	40acf1f02e	fix: address CodeRabbit findings on query/transport error handling	2026-05-04 21:49:41 -04:00
Tom Boucher	38718e9d4b	fix: avoid unsafe Promise cast in execRaw	2026-05-04 21:49:40 -04:00
Tom Boucher	0500bdf619	refactor: deepen query architecture seams with compatibility shims	2026-05-04 21:49:40 -04:00
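
The `<details>` anchoring fallback described in commit 592b676414 can be sketched roughly as below. This is a minimal illustration, not the SDK's code: `findDetailsMilestone` and `escapeRegExp` are hypothetical names, and the `[^<]*` around the version string is an assumption standing in for the `...` in the commit's pattern. The real `extractCurrentMilestone` may differ.

```typescript
// Hypothetical helper: escape regex metacharacters in a version such as "v0.9".
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical sketch of the fallback: find the <details> block whose
// <summary> mentions the active version and return its inner content.
function findDetailsMilestone(roadmap: string, version: string): string | null {
  const v = escapeRegExp(version);
  const pattern = new RegExp(
    // \b[^>]*> tolerates attributes such as <details open>;
    // the lazy [\s\S]*? stops at the first </details>.
    `<details\\b[^>]*>\\s*<summary>[^<]*${v}[^<]*</summary>([\\s\\S]*?)</details>`,
    "i",
  );
  const match = pattern.exec(roadmap);
  return match?.[1]?.trim() ?? null;
}

// Made-up ROADMAP fragment: a shipped milestone and the active one,
// both containing a same-numbered phase.
const roadmap = [
  "<details><summary>v0.8 (shipped)</summary>",
  "### Phase 3: Old work",
  "</details>",
  "<details open><summary>v0.9 Current</summary>",
  "### Phase 3: New work",
  "</details>",
].join("\n");

const active = findDetailsMilestone(roadmap, "v0.9");
const missing = findDetailsMilestone(roadmap, "v1.0");
```

With the sample input, `active` holds the body of the `v0.9` block and `missing` is `null`. As the commit message notes, a nested `<details>` inside the active block would end the lazy match early.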

196 Commits