mirror of
https://github.com/glittercowboy/get-shit-done
synced 2026-05-13 18:46:38 +02:00
* test: reproduce extractFrontmatter LAST-block bug (#3240)

* test: reproduce state.update progress trampling and percent formula (#3242)

Two failing regression tests:
- Bug A: state.update "Last Activity" tramples curated progress.* frontmatter
  via readModifyWriteStateMd → syncStateFrontmatter
- Bug B: 12 declared ROADMAP phases / 6 realized / 6/6 plans done →
  percent: 100 instead of 50 (phase-fraction ignored)

* test: reproduce TOML float rejection and partial rollback (#3245)

Two failing regression tests:
1. parseTomlToObject rejects valid Codex TOML floats (tool_timeout_sec = 20.0)
2. Post-install validation failure leaves skills/, agents/, VERSION on disk
   despite restoring config.toml, leaving a hybrid state after abort

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): accept TOML floats; idempotent codex rollback (#3245)

Two fixes for the Codex install failure introduced by #2760 CR4 finding 3:

1. parseTomlValue now accepts TOML 1.0 float literals (decimals, exponents,
   underscore separators, signed). Codex CLI's serde schema requires f64 for
   tool_timeout_sec / startup_timeout_sec; the prior strict-integer-only
   check was the inverse of what Codex requires, causing every config with a
   float to trigger a fatal schema validation failure. Date/time separators
   (-/:T/Z) are still rejected.

2. restoreCodexSnapshot is extended into a unified idempotent rollback that
   reverts ALL Codex-specific mutations on failure:
   - config.toml (existing behavior)
   - skills/gsd-* directories (new)
   - agents/gsd-*.{md,toml} files (new)
   - get-shit-done/VERSION (new)
   - orphaned atomic-write temp files (new)

Pre-install state is captured before the first Codex write so the rollback
reflects the true pre-GSD state. Non-gsd-* user content is untouched. The
rollback is safe to call multiple times and before any snapshots are captured.

Fixes #3245

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* changeset: pr=3254 for #3245

* test: fix source-grep lint violation in bug-3242 test (#3242)

Replace content.includes() check with line-by-line parse of STATE.md body.
The lint enforces structural assertions over raw text matching.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: mark #3242 RED tests as todo pending fix (#3242)

The three failing tests are intentional regression tests for bugs in
state.cjs that will be fixed in a separate PR. Mark them { todo: true } so
they don't block CI on this branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): tighten TOML underscore placement validation (CR finding 1)

The float regex used [\d_]* which accepts invalid forms like 1__0, 1_.0, and
1._0. TOML 1.0 §2 requires underscores only between digits. Switch both the
integer pre-check and the full float pattern to (?:_?\d)* so consecutive
underscores, leading underscores on a segment, and trailing underscores on a
segment are all rejected before replace(/_/g,'') can silently normalize them
into valid JS numbers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): restore pre-existing gsd-* content on rollback (CR finding 2)

The snapshot only recorded names of pre-existing skills/gsd-* dirs and
agents/gsd-* files. On a failed reinstall the rollback could delete
newly-created dirs but could not restore the bytes of dirs/files that were
overwritten, leaving the user in a hybrid state (old config.toml, new skill
files).

Now snapshot the full file tree of every pre-existing gsd-* skill dir into
codexPreInstallSkillContents (Map<name, Map<relPath, Buffer>>) and every
pre-existing agent file into codexPreInstallAgentContents
(Map<filename, Buffer>). restoreCodexSnapshot() uses these maps to
wipe-and-restore overwritten entries and only removes entries that had no
pre-install state, giving a true atomic rollback guarantee.
Reads are best-effort so a partial snapshot is still better than none.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): scope temp-file cleanup to installer-owned writes (CR finding 3)

_cleanTmpFiles() was deleting any *.tmp-<pid>-<n> file found under targetDir.
This is too broad: other tools in the user's Codex/home directory may create
temp files matching the same suffix pattern, and a GSD install rollback would
silently delete them.

Add __atomicWrittenTmps (a module-level Set<string>) populated by
atomicWriteFileSync for every temp path it creates. _cleanTmpFiles() now
checks __atomicWrittenTmps.has(full) before unlinking, so only temp files
this installer process actually wrote are eligible for cleanup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): remove no-op doesNotThrow wrapping try/catch (CR finding 4)

assert.doesNotThrow(() => { try { f(); } catch(_){} }) always passes because
the catch block swallows every exception before the outer assertion can see
it. This meant the rollback-idempotency guarantee was never actually verified.

Replace with an explicit threw flag around runCodexInstall, assert that the
install did throw (validation failure is expected), and add a post-rollback
state assertion that skills/ was not created. This gives a loud failure
surface if runCodexInstall starts crashing from inside the rollback path,
matching the intent described in the test comment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): correct describe title for float-acceptance tests (CR nitpick 1)

The describe block title said 'rejects malformed input that previously
slipped through', but the test inside now asserts that TOML floats are
accepted (the #3245 inversion). This misled readers expecting every sub-test
to assert rejection. Update the title to reflect the mixed behaviour: floats
are accepted; dates and trailing-garbage are rejected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): rename test to match what the assertion actually checks (CR nitpick 2)

The test name 'post-install config retains float literal form (20.0 not
truncated to 20)' promised a string-form invariant, but the assertion uses
numeric equality (assert.strictEqual(parsed.tool_timeout_sec, 20)) which
cannot distinguish 20 from 20.0 in JS. Rename to 'post-install config
round-trips tool_timeout_sec as numeric 20' so the description matches what
the test actually verifies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): replace raw text scan with state json assertion (CR nitpick 3)

The 'Last Activity updates the body field' test was reading STATE.md as raw
text, splitting on newlines, and using lines.find/startsWith to locate the
'Last Activity:' line, which is the exact pattern-match-on-source approach
prohibited by the no-source-grep testing standard.

Replace with runGsdTools('state json', tmpDir) which surfaces the
body-extracted Last Activity value as fm.last_activity in its JSON output,
and assert against that structured field instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): correct post-rollback state assertion for early-failure case

The previous assertion checked that skills/ didn't exist, but the installer
writes skills/ before the schema validator fires. Rollback removes gsd-* dirs
inside skills/, not skills/ itself. Update the assertion to verify that no
gsd-* skill dirs survive rollback, which is the actual invariant the test
name describes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* changeset: document full rollback scope (CR finding 1)

Adds config.toml restoration and orphaned atomic-write temp-file cleanup to
the changeset description; the previous text only listed skills/, agents/,
and VERSION.

* fix(install): wrap post-snapshot scope in rollback handler (CR finding 2)

Any throw between the pre-install snapshot capture and the Codex config block
(skills copy, agents copy, VERSION write, manifest write, leaked-path scan,
etc.) now triggers _codexPreConfigRollback() so the caller is never left in a
partially-installed state. Previously only the later config.toml mutation
paths had rollback wired in.

Introduces _codexPreConfigRollback (defined right after snapshot capture) and
wraps the intervening operations in a try/catch that invokes it on error for
Codex installs; non-Codex paths are unaffected.

* test: assert threw=true to prevent vacuous pass (CR finding 4)

Two tests used bare try/catch without asserting threw === true, so they would
silently pass even if runCodexInstall never threw (k060 pattern). Each bare
catch block is replaced with a threw flag and a strictEqual(threw, true, ...)
assertion.

CR findings 2+3 are both addressed in the preceding install commit: finding 3
(restore from snapshot manifest, not current FS state) lands alongside the
rollback-wrapper change as part of the restoreCodexSnapshot refactor.

* fix(install): reject leading zeros in TOML float integer part per TOML 1.0 (CR finding round 4)

TOML 1.0 §2 disallows leading zeros in the integer part of numeric literals:
`01`, `00`, `01.5`, `00e2`, `+01.0`, `-01.0` are all invalid. The pre-check
and float regexes in parseTomlValue used `\d(?:_?\d)*` which accepted any
digit as the leading digit.

Both regexes are tightened to `(0|[1-9](?:_?\d)*)` for the integer part:
- `0` alone is valid
- a non-zero leading digit followed by optional underscored digits is valid
- `01`, `00`, and any variant with a leading zero and further digits is
  rejected

The "still rejects bare time (07:32:00)" test assertion is broadened from
`/unsupported TOML value/` to `/unsupported TOML value|trailing bytes/`
because the parser now stops at `0` and the remainder `7:32:00` is rejected
as trailing bytes; the invariant (time literals are not accepted) is
unchanged.

25 new regression tests cover all rejection cases and valid TOML forms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
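
The float rules described in the install fixes above (no leading zeros in the integer part, underscores only between digits, optional sign, fraction and/or exponent) can be combined into one pattern. The following is a sketch under those stated rules, not the actual parseTomlValue code: TOML_FLOAT_RE and parseTomlFloatSketch are hypothetical names, and the special values inf/nan and plain integers are deliberately out of scope.

```javascript
// Integer part: `0` alone, or a non-zero digit followed by digits that may
// each be preceded by a single underscore (so 1__0, 1_.0 and 01 fail).
const INT_PART = '(?:0|[1-9](?:_?\\d)*)';
// Digit run for the fraction and exponent: underscores only between digits.
const DIGITS = '\\d(?:_?\\d)*';
const EXP = `[eE][+-]?${DIGITS}`;

// A float needs a fraction, an exponent, or a fraction followed by an exponent.
const TOML_FLOAT_RE = new RegExp(
  `^[+-]?${INT_PART}(?:\\.${DIGITS}(?:${EXP})?|${EXP})$`
);

// Hypothetical helper: validate first, strip underscores only afterwards,
// so invalid placements are never silently normalized into a JS number.
function parseTomlFloatSketch(raw) {
  if (!TOML_FLOAT_RE.test(raw)) {
    throw new Error(`unsupported TOML value: ${raw}`);
  }
  return Number(raw.replace(/_/g, ''));
}

// Accepted forms.
for (const ok of ['20.0', '+1.0', '-0.5', '1_000.5', '5e+22', '6.626e-34']) {
  console.log(ok, '->', parseTomlFloatSketch(ok));
}

// Rejected forms taken from the commit descriptions above.
for (const bad of ['01.5', '00e2', '-01.0', '1__0.0', '1_.0', '1._0', '07:32:00']) {
  try {
    parseTomlFloatSketch(bad);
    console.log(bad, '-> unexpectedly accepted');
  } catch (err) {
    console.log(bad, '-> rejected');
  }
}
```

Anchoring the pattern with `^` and `$` means a time literal such as `07:32:00` is rejected as a whole here, whereas the commit describes a parser that stops at `0` and then reports trailing bytes. Either way the literal is not accepted.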