get-shit-done/bin
Tom Boucher deeb6deb67 fix(install): accept Codex TOML floats; idempotent rollback (#3245) (#3254)
* test: reproduce extractFrontmatter LAST-block bug (#3240)

* test: reproduce state.update progress trampling and percent formula (#3242)

Two failing regression tests:
- Bug A: state.update "Last Activity" tramples curated progress.* frontmatter via readModifyWriteStateMd → syncStateFrontmatter
- Bug B: 12 declared ROADMAP phases / 6 realized / 6/6 plans done → percent: 100 instead of 50 (phase-fraction ignored)
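
The Bug B arithmetic can be sketched as follows. The formula is an
assumption: it is one reading consistent with the numbers in the test
description (plan completion scaled by the fraction of declared phases
that are realized), not necessarily what state.cjs computes. The
function and field names are hypothetical.

```javascript
// Hypothetical formula, chosen only because it reproduces the 100 vs 50
// figures quoted above. Not taken from state.cjs.
function progressPercent({ declaredPhases, realizedPhases, plansDone, plansTotal }) {
  if (declaredPhases === 0 || plansTotal === 0) return 0;
  const phaseFraction = realizedPhases / declaredPhases; // 6 / 12
  const planFraction = plansDone / plansTotal;           // 6 / 6
  return Math.round(100 * phaseFraction * planFraction);
}

const scenario = { declaredPhases: 12, realizedPhases: 6, plansDone: 6, plansTotal: 6 };

// Ignoring the phase fraction gives the buggy figure.
const planOnly = Math.round(100 * (scenario.plansDone / scenario.plansTotal));
const withPhases = progressPercent(scenario);

console.log(planOnly, withPhases); // prints: 100 50
```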

* test: reproduce TOML float rejection and partial rollback (#3245)

Two failing regression tests:
1. parseTomlToObject rejects valid Codex TOML floats (tool_timeout_sec = 20.0)
2. Post-install validation failure leaves skills/, agents/, VERSION on disk
   despite restoring config.toml — hybrid state after abort

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): accept TOML floats; idempotent codex rollback (#3245)

Two fixes for the Codex install failure introduced by #2760 CR4 finding 3:

1. parseTomlValue now accepts TOML 1.0 float literals (decimals,
   exponents, underscore separators, signed). Codex CLI's serde schema
   requires f64 for tool_timeout_sec / startup_timeout_sec — the prior
   strict-integer-only check was the inverse of what Codex requires,
   causing every config with a float to trigger a fatal schema validation
   failure. Date/time literals, which use -, :, T, or Z as separators,
   are still rejected.

2. restoreCodexSnapshot is extended into a unified idempotent rollback
   that reverts ALL Codex-specific mutations on failure:
   - config.toml (existing behavior)
   - skills/gsd-* directories (new)
   - agents/gsd-*.{md,toml} files (new)
   - get-shit-done/VERSION (new)
   - orphaned atomic-write temp files (new)
   Pre-install state is captured before the first Codex write so the
   rollback reflects the true pre-GSD state. Non-gsd-* user content is
   untouched. The rollback is safe to call multiple times and before any
   snapshots are captured.

Fixes #3245

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* changeset: pr=3254 for #3245

* test: fix source-grep lint violation in bug-3242 test (#3242)

Replace content.includes() check with line-by-line parse of STATE.md body.
The lint enforces structural assertions over raw text matching.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: mark #3242 RED tests as todo pending fix (#3242)

The three failing tests are intentional regression tests for bugs in
state.cjs that will be fixed in a separate PR. Mark them { todo: true }
so they don't block CI on this branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): tighten TOML underscore placement validation (CR finding 1)

The float regex used [\d_]* which accepts invalid forms like 1__0, 1_.0,
and 1._0. TOML 1.0 §2 requires underscores only between digits. Switch
both the integer pre-check and the full float pattern to (?:_?\d)* so
consecutive underscores, leading underscores on a segment, and trailing
underscores on a segment are all rejected before replace(/_/g,'') can
silently normalize them into valid JS numbers.
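
A minimal sketch of the difference, using simplified decimal-only
patterns rather than the installer's actual regexes. LOOSE, STRICT, and
parseStrict are hypothetical names; the point is only that a character
class allowing `_` anywhere lets malformed literals reach the
underscore-stripping step.

```javascript
// Illustrative patterns only; the real parseTomlValue regexes are not shown here.
const LOOSE = /^[+-]?\d[\d_]*\.[\d_]*\d$/;          // underscores allowed anywhere
const STRICT = /^[+-]?\d(?:_?\d)*\.\d(?:_?\d)*$/;   // underscores only between digits

// Hypothetical helper: validate first, strip underscores second.
function parseStrict(literal) {
  if (!STRICT.test(literal)) {
    throw new Error(`unsupported TOML value: ${literal}`);
  }
  return Number(literal.replace(/_/g, ''));
}

for (const literal of ['1_000.5', '1__0.0', '1_.0', '1._0']) {
  console.log(literal, 'loose:', LOOSE.test(literal), 'strict:', STRICT.test(literal));
}

// Without the strict check, stripping underscores hides the malformed input.
const silentlyNormalized = Number('1__0.0'.replace(/_/g, ''));
```

Only `1_000.5` passes the strict pattern; all four pass the loose one.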

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): restore pre-existing gsd-* content on rollback (CR finding 2)

The snapshot only recorded names of pre-existing skills/gsd-* dirs and
agents/gsd-* files. On a failed reinstall the rollback could delete
newly-created dirs but could not restore the bytes of dirs/files that
were overwritten, leaving the user in a hybrid state (old config.toml,
new skill files).

Now snapshot the full file tree of every pre-existing gsd-* skill dir
into codexPreInstallSkillContents (Map<name, Map<relPath, Buffer>>) and
every pre-existing agent file into codexPreInstallAgentContents
(Map<filename, Buffer>). restoreCodexSnapshot() uses these maps to
wipe-and-restore overwritten entries and only removes entries that had
no pre-install state, giving a true atomic rollback guarantee.
Reads are best-effort so a partial snapshot is still better than none.
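
The wipe-and-restore idea can be sketched for a single directory as
below. This is an illustration under stated assumptions, not the
installer's code: snapshotTree and restoreTree are hypothetical helpers,
the Map<relPath, Buffer> shape follows the description above, and empty
subdirectories are not preserved in this simplified version.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: read every file under `dir` into memory, keyed by relative path.
function snapshotTree(dir, rel = '', out = new Map()) {
  for (const entry of fs.readdirSync(path.join(dir, rel), { withFileTypes: true })) {
    const childRel = path.join(rel, entry.name);
    if (entry.isDirectory()) {
      snapshotTree(dir, childRel, out);
    } else if (entry.isFile()) {
      // Best-effort read: an unreadable file is skipped, not fatal.
      try { out.set(childRel, fs.readFileSync(path.join(dir, childRel))); } catch { /* skip */ }
    }
  }
  return out;
}

// Hypothetical helper: wipe `dir`, then rewrite the snapshot if there was one.
function restoreTree(dir, snapshot) {
  fs.rmSync(dir, { recursive: true, force: true });
  if (!snapshot) return; // no pre-install state, so removal is the whole rollback
  fs.mkdirSync(dir, { recursive: true });
  for (const [rel, bytes] of snapshot) {
    const full = path.join(dir, rel);
    fs.mkdirSync(path.dirname(full), { recursive: true });
    fs.writeFileSync(full, bytes);
  }
}

// Demo in a throwaway temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'rollback-sketch-'));
const skill = path.join(root, 'skills', 'gsd-example');
fs.mkdirSync(skill, { recursive: true });
fs.writeFileSync(path.join(skill, 'SKILL.md'), 'old');

const before = snapshotTree(skill);
fs.writeFileSync(path.join(skill, 'SKILL.md'), 'new'); // simulated reinstall overwrite
fs.writeFileSync(path.join(skill, 'extra.md'), 'new'); // file that did not exist before

restoreTree(skill, before);
restoreTree(skill, before); // calling it twice leaves the same result

const restored = fs.readFileSync(path.join(skill, 'SKILL.md'), 'utf8');
const extraExists = fs.existsSync(path.join(skill, 'extra.md'));
```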

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(install): scope temp-file cleanup to installer-owned writes (CR finding 3)

_cleanTmpFiles() was deleting any *.tmp-<pid>-<n> file found under
targetDir. This is too broad: other tools in the user's Codex/home
directory may create temp files matching the same suffix pattern, and a
GSD install rollback would silently delete them.

Add __atomicWrittenTmps (a module-level Set<string>) populated by
atomicWriteFileSync for every temp path it creates. _cleanTmpFiles()
now checks __atomicWrittenTmps.has(full) before unlinking, so only temp
files this installer process actually wrote are eligible for cleanup.
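
A minimal sketch of the ownership check, assuming a temp-name scheme of
`<target>.tmp-<pid>-<n>`. The names writtenTmps and cleanTmpFiles are
stand-ins for __atomicWrittenTmps and _cleanTmpFiles, and the function
bodies are illustrative rather than copied from the installer.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const writtenTmps = new Set(); // stand-in for __atomicWrittenTmps
let counter = 0;

// Sketch of an atomic write: write a temp file, then rename it over the target.
function atomicWriteFileSync(target, data) {
  const tmp = `${target}.tmp-${process.pid}-${counter++}`;
  writtenTmps.add(tmp); // record ownership before touching the disk
  fs.writeFileSync(tmp, data);
  fs.renameSync(tmp, target); // if this throws, the temp file is orphaned
}

// Only unlink temp files this process recorded itself.
function cleanTmpFiles(dir) {
  for (const name of fs.readdirSync(dir)) {
    const full = path.join(dir, name);
    if (/\.tmp-\d+-\d+$/.test(name) && writtenTmps.has(full)) {
      fs.rmSync(full, { force: true });
      writtenTmps.delete(full);
    }
  }
}

// Demo: one foreign temp file, one orphan left by a failed rename.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'tmp-scope-sketch-'));
const foreign = path.join(dir, 'other-tool.tmp-123-0');
fs.writeFileSync(foreign, 'not ours');

const blocked = path.join(dir, 'blocked');
fs.mkdirSync(blocked); // renaming a file onto a directory fails
try { atomicWriteFileSync(blocked, 'x'); } catch { /* expected failure */ }

cleanTmpFiles(dir);
const remaining = fs.readdirSync(dir).sort();
```

After cleanup, the foreign temp file and the `blocked` directory remain,
and the orphan written by this process is gone.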

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): remove no-op doesNotThrow wrapping try/catch (CR finding 4)

assert.doesNotThrow(() => { try { f(); } catch(_){} }) always passes
because the catch block swallows every exception before the outer
assertion can see it. This meant the rollback-idempotency guarantee was
never actually verified.

Replace with an explicit threw flag around runCodexInstall, assert that
the install did throw (validation failure is expected), and add a
post-rollback state assertion that skills/ was not created. This gives
a loud failure surface if runCodexInstall starts crashing from inside
the rollback path, matching the intent described in the test comment.
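
The vacuous pattern and its replacement can be compared side by side.
failingInstall is a hypothetical stand-in for runCodexInstall; the rest
uses only node:assert.

```javascript
const { doesNotThrow, strictEqual } = require('node:assert');

// Hypothetical stand-in for an install that aborts on schema validation.
function failingInstall() {
  throw new Error('schema validation failed');
}

// Vacuous: the inner catch swallows the error, so doesNotThrow can never fail.
doesNotThrow(() => {
  try { failingInstall(); } catch (_) { /* swallowed */ }
});

// Explicit: record whether the call threw, then assert on the flag.
let threw = false;
try {
  failingInstall();
} catch (_) {
  threw = true;
}
strictEqual(threw, true, 'install should abort on validation failure');
```

If failingInstall stopped throwing, the first assertion would still
pass while the second would fail.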

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): correct describe title for float-acceptance tests (CR nitpick 1)

The describe block title said 'rejects malformed input that previously
slipped through', but the test inside now asserts that TOML floats are
accepted (the #3245 inversion). This misled readers expecting every
sub-test to assert rejection. Update the title to reflect the mixed
behaviour: floats are accepted; dates and trailing-garbage are rejected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): rename test to match what the assertion actually checks (CR nitpick 2)

The test name 'post-install config retains float literal form (20.0 not
truncated to 20)' promised a string-form invariant, but the assertion
uses numeric equality (assert.strictEqual(parsed.tool_timeout_sec, 20))
which cannot distinguish 20 from 20.0 in JS. Rename to 'post-install
config round-trips tool_timeout_sec as numeric 20' so the description
matches what the test actually verifies.
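
A short illustration of why numeric equality cannot carry a string-form
invariant. The rawToml text is a hypothetical example; only the key name
comes from the test described above.

```javascript
const { strictEqual } = require('node:assert');

// Hypothetical config text for illustration.
const rawToml = 'tool_timeout_sec = 20.0\n';
const parsedValue = Number('20.0'); // what a parser hands back for the literal

strictEqual(parsedValue, 20); // passes: 20.0 and 20 are the same JS number
console.log(Object.is(20.0, 20), String(parsedValue)); // prints: true 20

// A string-form invariant has to inspect the serialized text instead.
const keepsFloatForm = /^tool_timeout_sec\s*=\s*20\.0\s*$/m.test(rawToml);
```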

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): replace raw text scan with state json assertion (CR nitpick 3)

The 'Last Activity updates the body field' test was reading STATE.md as
raw text, splitting on newlines, and using lines.find/startsWith to
locate the 'Last Activity:' line — the exact pattern-match-on-source
approach prohibited by the no-source-grep testing standard.

Replace with runGsdTools('state json', tmpDir) which surfaces the body-
extracted Last Activity value as fm.last_activity in its JSON output,
and assert against that structured field instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): correct post-rollback state assertion for early-failure case

The previous assertion checked that skills/ didn't exist, but the
installer writes skills/ before the schema validator fires. Rollback
removes gsd-* dirs inside skills/, not skills/ itself. Update the
assertion to verify that no gsd-* skill dirs survive rollback, which
is the actual invariant the test name describes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* changeset: document full rollback scope (CR finding 1)

Adds config.toml restoration and orphaned atomic-write temp-file
cleanup to the changeset description — the previous text only listed
skills/, agents/, and VERSION.

* fix(install): wrap post-snapshot scope in rollback handler (CR finding 2)

Any throw between the pre-install snapshot capture and the Codex config
block (skills copy, agents copy, VERSION write, manifest write, leaked-
path scan, etc.) now triggers _codexPreConfigRollback() so the caller
is never left in a partially-installed state. Previously only the later
config.toml mutation paths had rollback wired in.

Introduces _codexPreConfigRollback (defined right after snapshot capture)
and wraps the intervening operations in a try/catch that invokes it on
error for Codex installs; non-Codex paths are unaffected.
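
The shape of that wrapper can be sketched as below. All names are
hypothetical, an in-memory Set stands in for the filesystem, and
rethrowing the original error after rollback is an assumption rather
than something the commit text states.

```javascript
// Hypothetical wrapper: run the post-snapshot steps, roll back on any throw.
function runInstallSteps(steps, { isCodex, rollback }) {
  try {
    for (const step of steps) step();
  } catch (err) {
    if (isCodex) rollback(); // non-Codex installs skip the rollback
    throw err;               // assumption: the original error still propagates
  }
}

const disk = new Set(['config.toml']); // stand-in for files present before install
const preInstall = new Set(disk);      // snapshot captured before the first write
const rollback = () => {
  disk.clear();
  for (const name of preInstall) disk.add(name);
};

let caught = null;
try {
  runInstallSteps(
    [
      () => disk.add('skills/gsd-example'),
      () => disk.add('get-shit-done/VERSION'),
      () => { throw new Error('manifest write failed'); },
    ],
    { isCodex: true, rollback },
  );
} catch (err) {
  caught = err;
}

console.log([...disk], caught.message);
```

The failure in the third step leaves `disk` holding only `config.toml`,
and the caller still sees the original error.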

* test: assert threw=true to prevent vacuous pass (CR finding 4)

Two tests used bare try/catch without asserting threw === true, so they
would silently pass even if runCodexInstall never threw (k060 pattern).
Each bare catch block is replaced with a threw flag and a
strictEqual(threw, true, ...) assertion.

CR findings 2+3 are both addressed in the preceding install commit:
finding 3 (restore from snapshot manifest, not current FS state) lands
alongside the rollback-wrapper change as part of the restoreCodexSnapshot
refactor.

* fix(install): reject leading zeros in TOML float integer part per TOML 1.0 (CR finding round 4)

TOML 1.0 §2 disallows leading zeros in the integer part of numeric
literals — `01`, `00`, `01.5`, `00e2`, `+01.0`, `-01.0` are all invalid.
The pre-check and float regexes in parseTomlValue used `\d(?:_?\d)*` which
accepted any digit as the leading digit.

Both regexes are tightened to `(0|[1-9](?:_?\d)*)` for the integer part:
- `0` alone is valid
- a non-zero leading digit followed by optional underscored digits is valid
- `01`, `00`, and any variant with a leading zero and further digits is rejected

The "still rejects bare time (07:32:00)" test assertion is broadened from
`/unsupported TOML value/` to `/unsupported TOML value|trailing bytes/`
because the parser now stops at `0` and the remainder `7:32:00` is rejected
as trailing bytes — the invariant (time literals are not accepted) is unchanged.

25 new regression tests cover all rejection cases and valid TOML forms.
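
A sketch of the tightened integer part in context. The patterns below
are assembled for illustration from the fragment quoted above and cover
only decimal integers and floats (no inf, nan, hex, octal, or binary
forms). INT_PART, DIGITS, and isTomlNumber are hypothetical names, not
the installer's.

```javascript
// Integer part: a lone 0, or a non-zero digit followed by underscore-separated digits.
const INT_PART = '(?:0|[1-9](?:_?\\d)*)';
// Fraction and exponent digits: underscores only between digits.
const DIGITS = '\\d(?:_?\\d)*';

const INTEGER = new RegExp(`^[+-]?${INT_PART}$`);
const FLOAT = new RegExp(
  `^[+-]?${INT_PART}(?:\\.${DIGITS}(?:[eE][+-]?${DIGITS})?|[eE][+-]?${DIGITS})$`,
);

// Hypothetical helper combining both checks.
function isTomlNumber(literal) {
  return INTEGER.test(literal) || FLOAT.test(literal);
}

const accepted = ['0', '20.0', '0.5', '1e10', '6.626e-34', '1_000.25'];
const rejected = ['01', '00', '01.5', '00e2', '+01.0', '-01.0', '07:32:00'];

for (const literal of [...accepted, ...rejected]) {
  console.log(literal, isTomlNumber(literal));
}
```

Every literal in `accepted` passes and every literal in `rejected`
fails, matching the cases listed above.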

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 10:25:59 -04:00