get-shit-done

mirror of https://github.com/glittercowboy/get-shit-done synced 2026-05-13 18:46:38 +02:00

Author	SHA1	Message	Date
Tom Boucher	c4d3fe62a5	fix(install): require persistent SDK reachability before reporting ready (#3231 ) (#3249 ) * test: reproduce false GSD SDK ready signals on Linux (#3231) * fix(install): require persistent SDK reachability before reporting ready (#3231) * changeset: pr=3249 for #3231 * fix(install): filter _npx from login-shell PATH probe (CR finding 1) Apply filterNpxFromPath() to the getUserShellPath() result before passing it to isGsdSdkOnPath(), mirroring the same filtering already applied to process.env.PATH. Without this, a transient _npx entry in the login-shell PATH can falsely satisfy the cross-shell reachability check and reintroduce the false-ready condition this PR fixes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): unconditional legacy-shim replacement assertion (CR finding 2) Replace readFileSync+includes source-grep check with isLegacyGsdSdkShim() and add an else branch asserting that when sdkReady is false, a warning/error was emitted. Previously the sdkReady===false path had no assertion at all, allowing the test to pass without verifying any postcondition. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: replace text-grep assertions with structured ones (CR finding 2 + nitpick) Finding 2: restructure the legacy-shim replacement assertion to branch on isLegacyGsdSdkShim() state (a behavioral fact) rather than console output, and add an unconditional postcondition for both branches. Nitpick 3 (4 locations): - lines 149-153: replace /GSD SDK ready/.test(combined) with isGsdSdkOnPath(filterNpxFromPath(PATH)) === false - lines 167-169, 185-189: split filterNpxFromPath result into segments array and use array.includes() instead of string.includes() on the raw PATH string - lines 375-377: replace /GSD SDK ready/.test(combined) with fs.existsSync(shimPath) + isGsdSdkOnPath(filterNpxFromPath(localBin)) All 8 tests pass. lint-no-source-grep: 0 violations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(build-hooks): per-PID staging dir eliminates concurrent-cleanup TOCTOU race When multiple test before() hooks spawned build-hooks.js concurrently (--test-concurrency=4), a race existed: Process A would finish all copies, call rmdirSync('.dist-staging/') in cleanup, then Process B — still in its copy loop — would call copyFileSync(src, '.dist-staging/hook.pid.ts') and get ENOENT because the staging directory was gone. On macOS/Linux, copyFileSync reports the SOURCE path in ENOENT errors when the destination directory is missing, making the failure appear to be a missing source file (hooks/gsd-statusline.js) rather than a missing destination directory. This misled the diagnosis. Fix: make STAGE_DIR per-PID ('.dist-staging-<pid>/') so each builder owns its own staging directory. No other process touches it, eliminating all contention on staging-dir creation and cleanup. Update .gitignore to match the new 'hooks/.dist-staging-/' glob. Reproduces as: CI test matrix (macos-24, ubuntu-22, ubuntu-24) all failing with ENOENT on hooks/gsd-statusline.js in bug-2136 before() hook. The new test file added in this PR (bug-3231) shifts the concurrency schedule just enough to expose the race on every CI run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> test: assert on captured console output, not tautological PATH state (CR finding) The two discarded `captureConsole()` return values in the bug-3231 test were flagged by CodeRabbit as tautological assertions. Fix: - Test 1 (transient _npx PATH): capture stdout/stderr and assert the installer does NOT emit "GSD SDK ready" (the false-positive the PR fixes), and that it does emit some diagnostic output instead. - Test 3 (clean install): capture stdout/stderr and assert the installer DOES emit "GSD SDK ready" after successfully self-linking into a persistent PATH dir — confirming the positive path works correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-08 09:39:33 -04:00
Tom Boucher	f4c4ec6211	docs(build-hooks): correct staging-dir cleanup comment The previous comment claimed "rmdir-on-non-empty is a no-op" — that is factually wrong. fs.rmdirSync throws ENOTEMPTY on non-empty directories. The actual race-safety mechanism is: 1. fs.readdirSync(STAGE_DIR) -> leftovers 2. fs.rmdirSync(STAGE_DIR) only when leftovers.length === 0 3. Outer try/catch swallows TOCTOU ENOTEMPTY (peer added a file between readdir and rmdir) and ENOENT (peer already cleaned up). Comment now references the leftovers variable and both fs calls so a future reader can map narrative to code without reverse-engineering it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 23:50:52 -04:00
Tom Boucher	c47c2c5def	fix(build-hooks): handle Windows EPERM/EBUSY on rename, fall back to copy POSIX rename(2) atomically replaces dest even when readers hold open handles. Windows MoveFileEx (which fs.renameSync uses with MOVEFILE_REPLACE_EXISTING) cannot — it throws EPERM/EBUSY when another process has the destination open. Concurrent install.js readers and antivirus scanners are realistic triggers; both release within ms. renameAtomicWithRetry() preserves the bare renameSync call on POSIX (no overhead) and on Windows retries up to 4 times with 10/30/90/270ms backoff, then falls back to copyFileSync + unlinkSync. If even copy fails because dest is hard-locked, log a non-fatal warning and leave the prior dest in place — a subsequent build retries from a fresh state. The build no longer crashes on Windows transient locking. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 23:48:31 -04:00
Tom Boucher	c4f11db5e9	fix(build-hooks): atomic rename to prevent race with concurrent install reads scripts/build-hooks.js used fs.copyFileSync (truncate-then-write, non-atomic). Under --test-concurrency=4, multiple builder invocations raced; a parallel install.js subprocess could readFileSync between truncate and write and observe an empty file, then write that emptiness into the install target. Surfaced as the release-blocking bug-2136-sh-hook-version part 4 failure on main even though the same SHA passed every install-smoke matrix entry. Fix: stage outputs to hooks/.dist-staging/ then fs.renameSync into hooks/dist/. POSIX rename(2) is atomic, so concurrent readers always observe a complete file. The existing bug-2136 part 4 test locks the post-fix invariant. Failing run: https://github.com/gsd-build/get-shit-done/actions/runs/25472202941/job/74738276687 Closes #3214 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 23:34:29 -04:00
Tom Boucher	d44fcee013	Merge pull request #3110 from patrickclery/fix/3100-search-dirs-colon-leaks fix: replace stale /gsd: references in agents/, sdk/src/, and .clinerules	2026-05-06 20:52:43 -04:00
coderabbitai[bot]	96d2556209	fix: apply CodeRabbit auto-fixes Fixed 1 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>	2026-05-05 22:36:38 +00:00
Tom Boucher	a411e08e88	fix(coderabbit): resolve all 12 findings on PR #3152 MAJOR (security/correctness): - commands/gsd/debug.md: add Write to allowed-tools (session file creation requires it — workflow explicitly says 'use Write tool, never heredoc') - workflows/debug.md: add SLUG sanitization guard to steps 1b+1c (status/ continue subcommands used raw user input in file paths — path traversal) - workflows/thread.md: sanitize $ARGUMENTS in RESUME mode before file path construction (was bypassing the sanitization guard in CLOSE/STATUS modes) MINOR (consistency/correctness): - docs/INVENTORY-MANIFEST.json: remove stale top-level 'workflows' array (duplicate of families.workflows introduced in earlier update) - commands/gsd/resume-work.md: normalize process to 'Execute end-to-end.' - commands/gsd/settings.md: normalize process to 'Execute end-to-end.' - commands/gsd/update.md: normalize otherwise branch to 'execute end-to-end.' - docs/adr/0002: add Status: Accepted + Date header (ADR convention) - workflows/extract-learnings.md: rename step extract_learnings → extract-learnings - tests/extract-learnings.test.cjs: tighten step-name assertion to exact name ARCHITECTURE: - scripts/command-contract-helpers.cjs: extract CANONICAL_TOOLS, parseFrontmatter, executionContextRefs as shared module — single source of truth consumed by both lint script and test suite (prevents silent lint/test disagreement) - scripts/lint-command-contract.cjs: require() helpers instead of duplicating - tests/command-contract.test.cjs: require() helpers; move readFileSync calls inside test() callbacks (registration-time throws surface as named failures)	2026-05-05 16:06:29 -04:00
Tom Boucher	81f9534b5a	feat(adr-0002): command contract validation module + prose @-ref cleanup + workflow extraction ADR-0002: commands/gsd/*.md contract now enforced at two layers: LINT (scripts/lint-command-contract.cjs — new CI step): - name: present, starts with gsd: or gsd- - description: non-empty - allowed-tools: non-empty, all entries canonical - execution_context @-refs: resolve on disk, no trailing prose on same line - handles both @~/ and $HOME/ path prefixes TEST (tests/command-contract.test.cjs — 361 assertions): - Behavioral contract for all 65 command files - Replaces scattered coverage in enh-2790 + bug-3135 - Per-command per-rule test — one failure names the exact file + rule CI (.github/workflows/test.yml): - 'Lint — command contract (ADR-0002)' step added to lint-tests job PROSE @-REF CLEANUP (39 command files, ~900 tokens/invocation recovered): - Removed redundant @~/.claude/get-shit-done/... paths from <process> prose - execution_context block is now the single authoritative load declaration - Routing commands (sketch, spike, update, pause-work, etc.) keep routing instructions; only the inert path token is stripped WORKFLOW EXTRACTION (debug.md + thread.md, ~15,000 chars / ~3,750 tokens): - get-shit-done/workflows/debug.md: full process extracted from commands/gsd/debug.md - get-shit-done/workflows/thread.md: full process extracted from commands/gsd/thread.md - Command files reduced to frontmatter + objective + execution_context + context - debug.md: 9,603 → 1,703 chars; thread.md: 7,868 → 585 chars RENAME: - get-shit-done/workflows/extract_learnings.md → extract-learnings.md (aligns with hyphen convention of all other workflow files) DOCS: - docs/INVENTORY.md: count 85→87, new rows, rename row, fix add-todo --backlog attribution - docs/INVENTORY-MANIFEST.json: +debug.md +thread.md +extract-learnings.md -extract_learnings.md Closes ADR-0002 implementation.	2026-05-05 15:18:13 -04:00
Patrick Clery	f9c1f01971	fix: extend fix-slash-commands SEARCH_DIRS to agents/, sdk/src/, .clinerules scripts/fix-slash-commands.cjs SEARCH_DIRS did not cover agents/, sdk/src/, or top-level files, so 9 colon-form references survived in 6 files. The hit at agents/gsd-codebase-mapper.md:105 propagated into ~/.claude/agents/ at install time (the fixer is not wired into install) and produced unrunnable /gsd:<cmd> suggestions in agent output on non-Gemini runtimes. This commit includes Pass 1 (the 9 line edits) AND Pass 2 (extending the fixer's SEARCH_DIRS so future regressions are auto-rewritten and caught by the bug-2543 guard, which mirrors that list). The standalone bug-3100 test added in the prior revision is removed in favor of the bug-2543 guard's extended scan, per CONTRIBUTING.md test standards (no source-grep tests on non-.md files). Refs #3100	2026-05-05 13:19:10 -04:00
Tom Boucher	95d2bc20f8	feat(hooks): opt-in SessionStart update banner for non-statusline users (#2795 ) (#3035 ) * feat(hooks): opt-in SessionStart update banner for non-statusline users (#2795) When a user declines (or keeps a non-GSD) statusline at install time, the installer now offers an opt-in SessionStart banner that surfaces GSD update availability. The banner reads the existing ~/.cache/gsd/gsd-update-check.json cache (written by gsd-check-update-worker.js) and emits a single systemMessage line only when update_available is true: GSD update available: <installed> → <latest>. Run /gsd-update. It is silent when up-to-date and rate-limits "check failed" diagnostics to once per 24h via a sentinel file so a corrupt cache doesn't nag every session. Removed cleanly by `npx get-shit-done-cc --uninstall` which strips both the script and the SessionStart entry. The banner is never offered when GSD's statusline is being installed (statusline already surfaces update info, so re-prompting would be noise). Implementation: - hooks/gsd-update-banner.js — pure functions buildBannerOutput, shouldSuppressFailureWarning, readCache; thin main() wires them. - bin/install.js — handleUpdateBanner() prompt, parseUpdateBannerInput(), buildUpdateBannerHookEntry(), buildUpdateBannerPromptText(); chained into installAllRuntimes() so finalize() receives both flags. updateBannerCommand computed alongside the other JS-hook commands; finishInstall() registers the SessionStart entry only when shouldInstallBanner === true and the hook file is present at the target. - Hook ships in scripts/build-hooks.js HOOKS_TO_COPY, listed in MANAGED_HOOKS for stale-detection in gsd-check-update-worker.js, in the uninstall hook-removal lists in install.js, and in the rewriteLegacyManagedNodeHookCommands allowlist. Tests: - tests/feat-2795-update-banner.test.cjs — 22 tests, structural-IR assertions on parsed JSON envelopes (no raw-text matching). Covers pure-function branches (cache present/absent, parseError, rate-limit suppression, missing version fields), end-to-end hook invocation against fixture cache states, and install.js wiring (prompt text, input parsing, hook entry shape). - tests/trae-install.test.cjs — updated install() return-shape assertion to include updateBannerCommand: null for the no-settings runtime. - 6881/6881 tests pass. Docs (bundled in same commit per the bundle-docs-with-code skill): - docs/USER-GUIDE.md — new "Surface GSD Update Notifications Without GSD's Statusline" task section with opt-in/opt-out instructions. - docs/FEATURES.md — REQ-HOOK-08 added; "Update Banner" subsection under the Hook System feature with cache flow + removal path. - docs/INVENTORY.md — hook count 11 → 12, new row for gsd-update-banner.js. - docs/INVENTORY-MANIFEST.json — regenerated. Closes #2795 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(install): gate banner prompt on actual installability (CR #3035) CodeRabbit findings on PR #3035: - bin/install.js (Major): continueAfterStatusline gated banner prompt on the raw `shouldInstallStatusline` flag from handleStatusline. But finishInstall later silently skips the statusline write on local installs unless --force-statusline is set (#2248). Two consequences: 1. Interactive local Claude/Gemini installs got neither a statusline nor a banner offer. 2. Codex/Cursor/Copilot/Windsurf/Trae/Cline-only installs (where every result.updateBannerCommand is null) still got prompted even though the choice was silently ignored. Fix: derive willInstallStatusline = shouldInstallStatusline && (isGlobal \|\| forceStatusline), and gate the banner prompt on a canInstallBanner precondition computed from results[].updateBannerCommand. Pass the raw shouldInstallStatusline through to finalize unchanged so per-runtime statusline gating in finishInstall is unaffected. - tests/feat-2795-update-banner.test.cjs (Minor): rate-limit suppression test parsed r1.stdout without first asserting r1.status === 0. Other e2e tests in this file (lines 210, 241) do this. A non-zero exit would surface as a cryptic SyntaxError instead of a status assertion failure. Fix applied verbatim. 6881/6881 tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-02 16:33:16 -04:00
Tom Boucher	4277f7d7e8	fix(#2994 ): move verify-reapply-patches.cjs to get-shit-done/bin/ so it ships to user installs (#3000 ) * fix(#2994): move verify-reapply-patches.cjs to get-shit-done/bin/ so installer ships it scripts/verify-reapply-patches.cjs (added in #2972 to close the verified-yes-without-checking gap from #2969) shipped in the npm tarball but never reached user installs: bin/install.js copies get-shit-done/ recursively but does not copy the top-level scripts/ directory. Effect: every fresh install hit `Cannot find module …/scripts/verify-reapply-patches.cjs` on Step 5 of /gsd-reapply-patches. The whole point of moving verification out of LLM-driven prose into a deterministic script is undone if the script does not resolve at runtime. Fix: move the script to get-shit-done/bin/verify-reapply-patches.cjs (same pattern as gsd-tools.cjs and other runtime bin scripts that the installer ships) and update reapply-patches.md Step 5 to invoke ${GSD_HOME}/get-shit-done/bin/verify-reapply-patches.cjs. Tests: - bug-2969 SCRIPT path updated to the new location - New bug-2994-verify-reapply-patches-installed-path.test.cjs parses reapply-patches.md into structured invocation records and asserts every node ${GSD_HOME}/... reference lives under get-shit-done/ (the installed tree). Catches future regressions where someone moves a runtime-needed script back to scripts/. Closes #2994 * chore(#2994): add changeset fragment for PR #3000 * chore(#2994): add changeset fragment for PR #3000 * docs(#2994): update verifier-script-location comment to reflect new path (CR) CodeRabbit on PR #3000: the parenthetical at line 278 still said the script ships under scripts/, but this PR moved it to get-shit-done/bin/. Updated the prose to reference the new location and the installer target path. * chore(#3000): drop direct CHANGELOG.md edit; release entry now lives in .changeset/ The changeset-fragment workflow (#2975) renders fragments into CHANGELOG.md at release time. Direct edits to [Unreleased] on each PR caused merge conflicts on every concurrent PR. This commit restores CHANGELOG.md to match origin/main; the release entry for this fix is preserved in the .changeset/*.md fragment(s) on this branch, which the release workflow consolidates.	2026-05-02 00:29:34 -04:00
Tom Boucher	c2ada7e799	feat(#2995 ): post-install path audit for workflow-invoked scripts (#2996 ) * feat(#2995): post-install path audit for workflow-invoked scripts Catches the gap class surfaced by #2994: a workflow references a script via ${GSD_HOME}/<path> that ships in the npm tarball but is not copied to the user's config dir at install time. Unit tests don't catch it because they resolve the script via path.join(__dirname, '..', 'scripts', …) — the source layout, not the deployed layout. Implementation built TDD per #2995, vertical slices with structured-IR assertions: scripts/audit-workflow-script-paths.cjs - Pure auditWorkflowScriptPaths({ workflowsDir, repoRoot, installedPrefixes }) returns { ok, findings: [{ workflow, path, kind }] } via the AUDIT_FINDING enum. - Two finding kinds: MISSING_FROM_REPO (typo / file deleted) and NOT_INSTALLED (#2994 class — first segment outside installed prefixes). - Tolerates ${GSD_HOME:-...} default-fallback syntax. tests/bug-2995-post-install-script-paths.test.cjs - 9 tests across 3 suites: • Pure-function pass and per-finding-kind detection (5 tests on synthetic fixtures). • Real workflow audit (2 tests asserting the actual repo's get-shit-done/workflows/ has no NEW gaps and KNOWN_GAPS stays consistent with audit findings). • Enum shape lock + extractReferences edge cases. - All assertions on typed AUDIT_FINDING enum / structured records; zero raw text matching. - KNOWN_GAPS is a Set keyed on `workflow\|path\|kind` strings; currently contains the #2994 entry. The companion test fails if a KNOWN_GAPS entry no longer matches a real finding (forces the allow-list to shrink as gaps fix). The audit immediately catches #2994's gap on `reapply-patches.md`. The allow-list contains exactly that entry; new gaps fail CI; #2994's fix will remove the entry as part of the same PR. Closes #2995 Refs #2994 * chore(#2995): add changeset fragment for PR #2996 * chore(#2995): add changeset fragment for PR #2996 * fix(#2995): emit both NOT_INSTALLED + MISSING_FROM_REPO; clean up fixture leak (CR) CodeRabbit on PR #2996 found two issues: 1. (Low value) auditWorkflowScriptPaths short-circuited on NOT_INSTALLED, masking MISSING_FROM_REPO for the same ref. Removed the `continue` so both findings emit in one run; added a regression test. 2. (Low value) bug-2995 test created tmpRoot in before() but never wrote into it; per-fixture mkdtempSync dirs leaked. Rooted fixture repos under tmpRoot so the after() cleanup actually frees them.	2026-05-01 21:13:45 -04:00
Tom Boucher	918f987a19	feat(#2982 ): extend no-source-grep lint to catch var-binding readFileSync.includes() (#2985 ) * feat(#2982): extend no-source-grep lint to catch var-binding readFileSync.includes() The base lint (scripts/lint-no-source-grep.cjs) only catches readFileSync(...).<text-method>() chained directly. The much more common var-binding form escapes it: const src = fs.readFileSync(p, 'utf8'); // 50 lines later if (src.includes('foo')) {} // ← still grep, lint missed it Scan of the test suite found ~141 files using this pattern. Implementation built TDD per #2982 with structured-IR assertions: scripts/lint-no-source-grep-extras.cjs - detectVarBindingViolations(src) — pure detector, two passes: pass 1 collects vars bound from readFileSync, pass 2 finds any <var>.<includes\|startsWith\|endsWith\|match\|search>( on those vars. - detectWrappedAssertOkMatch(src) — flags assert.ok(<expr>.match(...)) which escapes the assert.match rule. - VIOLATION enum exposes stable codes for tests to assert on. scripts/lint-no-source-grep.cjs - Wires the new detectors into the existing per-file check; one additional violation row per file with the first 3 sample tokens. tests/bug-2982-lint-var-binding.test.cjs - 13 tests, all assertions on typed VIOLATION enum / structured records. Covers all 5 text-match methods, multi-var, no-bind, string literal (must NOT trigger), wrapped assert.ok(.match), and assert.match (must NOT double-flag). Migration backlog (#2974 expanded scope): - 42 files annotated `// allow-test-rule: source-text-is-the-product` (legitimate — they read .md/.json/.yml files whose deployed text IS the product) - 3 files annotated `// allow-test-rule: pending-migration-to-typed-ir [#2974]` (read .cjs/.js source — clear migration debt) - 95 files annotated `pending-migration-to-typed-ir [#2974]` with `Per-file review may reclassify as source-text-is-the-product during migration` (mixed — manual review under #2974) After this lands the lint reports 0 violations on main; new violations in PRs surface immediately. Closes #2982 Refs #2974 * test(#2982): fix truncated test name per CR The label ended with a bare '(' from a copy-paste mishap. Now reads 'does NOT flag .matchAll(...) — matchAll is not match, so assert.ok(.matchAll(...)) is not flagged'. * chore(#2982): add changeset fragment for PR #2985 * chore(#2982): add changeset fragment for PR #2985	2026-05-01 19:50:10 -04:00
Tom Boucher	9d5db87249	feat(#2975 ): adopt changeset-fragment workflow to eliminate CHANGELOG conflicts (#2978 ) * feat(#2975): adopt changeset-fragment workflow to eliminate CHANGELOG conflicts Two PRs that both edit `### Fixed` in CHANGELOG.md always conflict on merge. Recently bit on #2960/#2972 in the same session — fix-the-conflict-and-rebase tax. Replace the shared-file model with per-PR fragment files that never share lines. Implementation built TDD per #2975, vertical slices with structured-IR assertions throughout: scripts/changeset/parse.cjs - fragment text → typed record + frozen FRAGMENT_ERROR enum (8 tests) scripts/changeset/render.cjs - fragments → structured IR with Keep-a-Changelog section ordering (2 tests) scripts/changeset/serialize.cjs - IR ↔ markdown round-trip pair (parse(serialize(ir)) === ir, 3 tests) scripts/changeset/cli.cjs - file-I/O wrapper with --json mode; reads .changeset/, folds into CHANGELOG.md, deletes consumed fragments. Idempotent. (1 test) scripts/changeset/lint.cjs - pure verdict (changedFiles, labels) → { ok, reason } via LINT_REASON enum. Honors `no-changelog` label. (5 tests) scripts/changeset/new.cjs - fragment scaffolder with random adjective-noun-noun filename. Tests assert via parseFragment round-trip. (3 tests) Total: 22 tests, all assertions on typed structured fields. No regex on text, no String#includes on file content. Lint clean across 356 test files. Supporting: .changeset/README.md - format spec + workflow docs .changeset/eager-hawks-rally.md - dogfood fragment for THIS PR (will be the first thing the new release tool consumes) .github/workflows/changeset-required.yml - CI: every PR runs lint.cjs package.json - npm run changeset, changelog:render, lint:changeset CONTRIBUTING.md - new "CHANGELOG Entries — Drop a Fragment" section between PR Guidelines and Testing Standards Closes #2975 * fix(#2975): address CodeRabbit findings on changeset workflow 7 valid findings (4 Major, 3 Minor); all addressed: scripts/changeset/parse.cjs - Preserve fragment body verbatim. Previously body.trim() ate intentional leading whitespace (code blocks, etc.); now trim() is used only for the emptiness check, and a single trailing newline is stripped (the editor-added one) so well-formed fragments round-trip byte-for-byte. Added a regression test asserting a code-block-leading body is preserved. scripts/changeset/cli.cjs - Validate flag values during argument parsing. parseArgs now returns { ok, opts \| error }; rejects `--repo` etc. with no following value or with another flag as the value. main() surfaces the error message before exiting 2. - Handle post-write fragment-deletion failures. After CHANGELOG.md is written, any unlink failure is captured into a structured deleteFailures list with reason 'fail_fragment_delete'; cmdRender returns exitCode=1 with the partial-failure detail instead of leaving the changelog updated and fragments behind (which would cause double-consumption on rerun). scripts/changeset/lint.cjs - Treat CHANGELOG.md as a linted user-facing path. Direct edits to CHANGELOG.md (the bypass route around the new workflow) now fail the lint with FAIL_MISSING_FRAGMENT. Added a regression test for that case. - Use cp.execFileSync instead of cp.execSync for the git diff call. Eliminates the shell-interpolation surface on GITHUB_BASE_REF; git's own arg parser remains the validator. scripts/changeset/new.cjs - Atomic fragment creation. existsSync() + writeFileSync was racy under concurrent invocations. Now writeFileSync uses { flag: 'wx' } which fails EEXIST on collision; the random-name retry loop catches EEXIST and re-rolls. Throws explicitly after 16 attempts rather than silently overwriting. .changeset/README.md - Add language tag `md` to the format example fence (markdownlint MD040). All 25 changeset tests pass; lint clean (356 test files, 0 violations). * fix(#2975): sanitize --type and validate flag values in new.cjs (CR fixes) Two CR findings on scripts/changeset/new.cjs: 1. (Minor) `type` was embedded in frontmatter without sanitization. A newline in the value (e.g. `--type 'Fixed\ntype: Added'`) would corrupt the fragment. scaffoldFragment now validates `type` against the Keep-a-Changelog ALLOWED_TYPES set BEFORE writing — same set parse.cjs uses on consume. Throws with a typed error referencing the allowed values; tests cover the newline case + 4 other non-allowed values. 2. (Minor) `--repo` (and other value-taking flags) without a value silently set opts.repo to undefined, which produced a cryptic ERR_INVALID_ARG_TYPE deep inside path.join. parseArgs now mirrors the cli.cjs convention: returns { ok, opts \| error }, validates that the next token exists and is not itself another flag, and surfaces a precise "missing value for --repo" message before exit. Added 3 tests: missing-trailing-value, flag-as-value, well-formed. 29 tests pass across the changeset suite (4 new regression tests).	2026-05-01 18:12:20 -04:00
Tom Boucher	fb92d1e596	fix(#2983 ): classifier exit-code discipline, base-tag staging, drop vestigial merge-back (#2984 ) * fix(#2983): classifier exit-code discipline, base-tag staging, drop vestigial merge-back Three issues surfaced by CodeRabbit's post-merge review of #2981 plus a production failure on the v1.39.1 release run. (1) Overloaded classifier exit code scripts/diff-touches-shipped-paths.cjs reused exit 1 for both the legitimate "no shipped paths" result and Node's default exit on uncaught throw, so any classifier failure (corrupt package.json, EPERM, etc.) was indistinguishable from a normal skip — the workflow's `if ! ... ; then skip` idiom would silently drop the commit. Distinct exit codes now: 0 shipped — at least one path is in the npm `files` whitelist 1 not shipped — CI / test / docs / planning only 2 classifier error — workflow MUST fail-fast uncaughtException + unhandledRejection + try/catch around fs/JSON parsing all route to exit 2 with stderr context. (2) Classifier missing at the base tag (CRITICAL) `Prepare hotfix branch` runs `git checkout -b "$BRANCH" "$BASE_TAG"` BEFORE the cherry-pick loop, replacing the working tree with the base tag's contents. Base tags predating #2980 (notably v1.39.0, the most likely next hotfix base) don't have scripts/diff-touches-shipped-paths.cjs at all — `node <missing>` exits non-zero — `if !` skips every commit — empty hotfix branch published. Strictly worse than the original #2980 push-rejection, which at least failed loudly. Stage the classifier from the dispatched ref's working tree into $RUNNER_TEMP at the top of the run script (before any working-tree- mutating git command). The cherry-pick loop now references $CLASSIFIER (staged) instead of the in-tree path. Sanity guards: refuse to start if scripts/diff-touches-shipped-paths.cjs is missing in the dispatched ref, refuse to proceed if cp didn't materialize $CLASSIFIER. The cherry-pick loop captures node's exit via ${PIPESTATUS[1]} and dispatches via explicit case: 0 proceed with cherry-pick 1 skip into NON_SHIPPED_SKIPPED * emit ::error:: + exit "$CLASSIFIER_RC" (3) Drop the merge-back PR step Auto-cherry-pick only picks commits already on main (`git cherry HEAD origin/main` outputs the unmerged ones; we filter fix:/chore: from main). By construction every code commit on the hotfix branch is already on main. The only hotfix-branch-only commit is `chore: bump version to X.Y.Z for hotfix`, which either no-ops against main or rewinds main's in-progress version. The merge-back PR was vestigial. It also failed in production on run 25232968975 with `GitHub Actions is not permitted to create or approve pull requests (createPullRequest)` — org policy blocks PR creation from the workflow's GH_TOKEN. Even without that block, the PR would have nothing useful to merge. Step removed. The `pull-requests: write` permission granted solely for the merge-back step has been dropped from the release job (least-privilege). Regression coverage tests/bug-2983-classifier-exit-codes-and-base-tag-staging.test.cjs adds 12 assertions across two describe blocks: - 5 classifier behavioral: exit 0/1 preserved, exit 2 on missing package.json, exit 2 on malformed JSON, exit-code constants exported. - 7 workflow contract: classifier staged before checkout, target is $RUNNER_TEMP, missing-source guard, missing-staged guard, PIPESTATUS-based dispatch, error branch fails workflow, loop uses staged path (not in-tree). tests/bug-2980-hotfix-only-picks-shipping-changes.test.cjs updated where it asserted the pre-#2983 `if ! ... ; then` shape: now accepts the post-#2983 case-dispatch form. The test still proves the classifier participates; bug-2983 enforces the specific shape. Run summary references for the curious reviewer: - Run 25232010071 — original #2980 trigger (workflow-file push rejection) - Run 25232968975 — failed merge-back step that prompted the "is this even useful?" question that drove the removal Closes #2983 * fix(#2983): address CodeRabbit findings on PR #2984 Two findings, both real, both fixed. (1) [Critical] PIPESTATUS capture clobbered by `\|\| true` Pre-fix shape: git diff-tree ... \| node "$CLASSIFIER" \|\| true CLASSIFIER_RC="${PIPESTATUS[1]}" When the classifier exits 1 ("not shipped" — common case) or 2 (error), `\|\| true` triggers the right-hand side. `true` is a one-command "pipeline" that overwrites PIPESTATUS to (0). ${PIPESTATUS[1]} on the next line is therefore unset (or stale under set -u). The case dispatch then matched the empty string — falling into `)` and failing the workflow on every non-shipped commit, OR matching `0)` after some shells default-init unset to 0 and silently picking commits that don't ship. Local repro confirms the issue: $ bash -c 'set -euo pipefail; false \| sh -c "exit 7" \|\| true; \ echo "PIPESTATUS: ${PIPESTATUS[]}"; \ echo "[1]: ${PIPESTATUS[1]:-<unset>}"' PIPESTATUS: 0 [1]: <unset> Fix: bracket the pipeline in `set +e`/`set -e`, snapshot PIPESTATUS into a local array on the very next line, then dispatch on the snapshot: set +e git diff-tree ... \| node "$CLASSIFIER" PIPE_RC=("${PIPESTATUS[@]}") set -e DIFFTREE_RC="${PIPE_RC[0]}" CLASSIFIER_RC="${PIPE_RC[1]}" The snapshot must happen on the first line after the pipeline; any intervening simple command resets PIPESTATUS. The array form is invariant against that. Bonus from the new shape: $DIFFTREE_RC is now also captured. git diff-tree is unlikely to fail on a known-good $SHA, but if it does, we no longer feed partial/empty input to the classifier and call it "not shipped." A non-zero DIFFTREE_RC emits ::error::git diff-tree failed and exits. (2) [Minor] Stale "Merge-back PR opened against main" summary line The hotfix run summary still printed: echo "- Merge-back PR opened against main" But the merge-back step itself was removed in the previous commit on this branch. Operators reading the summary would expect a PR that doesn't exist. Replaced with explicit non-action text: echo "- No merge-back PR (auto-picked commits are already on main)" Test coverage bug-2983 test file gains 3 assertions: - PIPE_RC array-snapshot pattern is required (regex matches the exact `PIPE_RC=("${PIPESTATUS[@]}")` form). - The `pipeline \|\| true; ${PIPESTATUS[1]}` antipattern is explicitly forbidden via assert.doesNotMatch. - DIFFTREE_RC is captured from PIPE_RC[0] and a non-zero value triggers ::error::git diff-tree failed. - Run summary forbids `Merge-back PR opened against main` and requires the new non-action sentence. bug-2964 test's loop-anchor window bumped 6 KB → 8 KB to accommodate the additional pre-pick scaffolding (the test's own comment had already anticipated this kind of growth, citing prior precedents from #2970 and #2980). Mark CodeRabbit comments resolved post-commit. Refs CR finding ids 3175253571, 3175253578 on PR #2984.	2026-05-01 17:25:20 -04:00
Tom Boucher	7424271aa0	fix(#2980 ): hotfix cherry-pick only picks commits that change what ships (#2981 ) * fix(#2980): pre-skip workflow-file cherry-picks in release-sdk hotfix loop The default GITHUB_TOKEN issued to the release-sdk run lacks the `workflow` scope, so the prepare job's `git push origin "$BRANCH"` is rejected by GitHub when any cherry-picked commit modifies a file under `.github/workflows/`: ! [remote rejected] hotfix/X.YY.Z -> hotfix/X.YY.Z (refusing to allow a GitHub App to create or update workflow ... without `workflows` permission) Pre-#2980 behavior: the auto_cherry_pick loop happily picked workflow-file commits, then the trailing push exploded with no clear signal which commit was the culprit. v1.39.1 hit this on PR #2977 (run 25232010071) — earlier release-sdk fixes (#2965, #2967, #2970) had been skipped on conflict so their workflow-file changes never reached the push step, masking the bug; #2977 was the first workflow-file commit to apply cleanly and the push immediately exploded. Fix: pre-pick guard in the cherry-pick loop. Inspect each candidate commit's file list via `git diff-tree --no-commit-id --name-only -r` BEFORE attempting the pick. If any path matches `^\.github/workflows/`, skip the commit, emit a `::warning::` annotation naming the dropped commit, and append to a new `WORKFLOW_SKIPPED` bucket. The run summary surfaces this bucket in its own section, distinct from `CONFLICT_SKIPPED` (real merge conflicts) and `POLICY_SKIPPED` (feat/refactor exclusions), so operators reviewing the run never confuse the remediation paths. The loud-warning piece is non-negotiable: silent drops were explicitly rejected as a failure mode during the option-1/2/3 tradeoff discussion. If a workflow-file fix genuinely needs to ship in a hotfix, the operator applies it manually on the hotfix branch using a token with `workflow` scope, or lands it on main and re-cuts the release. Regression covered by tests/bug-2980-skip-workflow-file-cherrypicks.test.cjs (5 assertions: pre-pick guard exists, uses `git diff-tree`, emits `::warning::`, lands in dedicated bucket, surfaces in summary). The bug-2964 test's 4 KB window after the cherry-pick-loop anchor was nudged to 6 KB to accommodate the new pre-pick scaffolding — the test's own comment had already anticipated this kind of growth (citing #2970's merge-commit pre-skip as prior precedent). Closes #2980 * refactor(#2980): replace workflow-file pre-skip with shipped-paths filter The previous commit on this branch caught only the .github/workflows/* subset of the bug, treating the symptom (push rejection on workflow-file changes) rather than the root cause (the fix:/chore: filter is too broad — it picks any commit with that conventional-commit type even when the diff cannot affect the published npm package). CI-only fixes (release-sdk.yml itself, hotfix tooling, test-only commits) shouldn't flow through hotfix runs at all — they cannot change what `npm install get-shit-done-cc@X.YY.Z` produces. The .github/workflows/* push rejection is just the loudest of these "shouldn't have been picked" cases; tests/, docs/, .planning/ commits get picked silently with the same lack of effect on consumers. Replace the workflow-file pre-skip with a shipped-paths filter: - New scripts/diff-touches-shipped-paths.cjs reads package.json `files`, plus package.json itself (always-shipped per `npm pack` semantics), and exits 0 iff any input path is in the shipped set. Lockfile is not shipped (npm pack excludes it unless explicitly in `files`). - Workflow loop now pipes `git diff-tree --no-commit-id --name-only -r` through the classifier; on exit 1 the commit is skipped and appended to a new NON_SHIPPED_SKIPPED bucket (replaces WORKFLOW_SKIPPED). - Run summary surfaces NON_SHIPPED_SKIPPED as informational — no ::warning:: annotation. A non-shipping commit cannot affect the package, so a yellow alert would imply remediation is possible and would mislead operators. The classifier in a separate .cjs file (rather than inline bash heredoc) is so its rules — directory-prefix vs exact-match, package.json-always-shipped, lockfile-not-shipped — are unit-testable in tests/bug-2980-hotfix-only-picks-shipping-changes.test.cjs (11 new assertions: 4 static workflow + 6 classifier behavioral + 1 mixed- diff edge case). Why this dissolves the original push-rejection bug: workflow files aren't in `files`, so workflow-only commits are skipped pre-pick. The push step never sees them. If a workflow-file fix genuinely needs to ship in a hotfix release (extremely rare — the hotfix workflow is read from main's ref, not the hotfix branch's), the operator applies it manually using a token with `workflow` scope. The pre-skip puts that requirement in the run summary explicitly. Closes #2980	2026-05-01 16:59:49 -04:00
Tom Boucher	ef43f5161f	fix(#2969 ): deterministic Step 5 verification gate for /gsd-reapply-patches (#2972 ) * fix(#2969): deterministic Step 5 verification gate for /gsd-reapply-patches The prior Step 5 "Hunk Verification Gate" was prescribed correctly in the workflow text — but executed laxly by the LLM, which filled in `verified: yes` without actually checking content presence. The reporter observed three distinct files (skills/gsd-discuss-phase/SKILL.md, skills/gsd-autonomous/ SKILL.md, get-shit-done/workflows/new-project.md) where archives contained substantive user-added blocks that did not survive into the merged result, yet the gate reported clean. Move verification from LLM-driven prose into a deterministic Node script the workflow calls. The script can't be shortcut. Changes: - scripts/verify-reapply-patches.cjs (new): pure Node, no external deps. For each file in the patches dir, computes user-added significant lines as the line-set diff between backup and pristine baseline (when available; falls back to "every significant backup line" when no pristine — over-broad but the safe direction for this bug class). Asserts each line appears literally in the merged installed file via String.prototype.includes. Filters trivial lines (length < 12 chars, pure punctuation, decorative comments) so harmless drift doesn't trigger false failures. Exits 0 on pass, 1 on any miss with per-file diagnostic, 2 on usage error. Supports --json for workflow consumption. - get-shit-done/workflows/reapply-patches.md: rewrite Step 5 to call the script and parse its JSON output. The Step 4 Hunk Verification Table remains as advisory Claude-readable summary, but the gate is now the script's exit code. - tests/bug-2969-verify-reapply-patches.test.cjs (new): 6 tests covering (a) pass when every line survives, (b) fail when a line is missing, (c) fail when the merged file is deleted entirely, (d) --json structured report shape, (e) backup-meta.json is correctly skipped as metadata, (f) no-pristine-dir fallback exercises the safe over-broad path. All pass. Out of scope: the manifest-baseline tightening described in #2969 Failure 1 (saveLocalPatches comparing against the wrong baseline so prior silent wipes poison subsequent updates). That's a separate, bigger architectural change involving pristine-content infrastructure; this PR addresses the gate fidelity half so users at least see the diagnostic when content goes missing. Closes #2969 (partial — Failure 2 only) * fix(#2969): preserve #1999 Hunk Verification Table assertions alongside new script gate CI failure on PR #2972 surfaced that tests/reapply-patches.test.cjs (the #1999 contract) asserts Step 5 references: - "Hunk Verification Table" - `verified: no` failure condition - explicit STOP/halt/abort directive - "table absent / missing" halt path My initial Step 5 rewrite for #2969 substituted the deterministic script for the table-based gate entirely, stripping those references. The script is the strictly stronger gate, but the existing #1999 test enforces the table-based safety net as a defense-in-depth contract. Restore both gates as a layered Step 5: - 5a (binding): deterministic verifier script — script gate, exits non-zero on any miss, cannot be shortcut by the LLM - 5b (advisory): Hunk Verification Table review — preserved as redundant safety net for the case where the script has a bug or the pristine baseline is unavailable Both gates must pass. Verified: tests/reapply-patches.test.cjs (5 tests in the #1999 suite) and tests/bug-2969-verify-reapply-patches.test.cjs (6 tests in the #2969 suite) all pass — 21/21 total in this fixture. * fix(#2969): address CodeRabbit findings on workflow + script Five CR findings on PR #2972, all valid; addressed in this commit: 1. (Major) Stderr was merged into VERIFY_OUTPUT via `2>&1`, so any Node warning, deprecation notice, or stack trace would corrupt the JSON parse downstream. Capture stdout only; stderr remains on the controlling terminal for operator visibility. 2. (Major) verifyFile() crashed with EISDIR/EACCES instead of producing a structured diagnostic when the installed path was a directory or unreadable. Wrap statSync/readFileSync in try/catch and emit a per-file fail row; the whole-run gate continues with structured output. Added test case asserting the directory-at-installed-path case fails with `not a regular file` diagnostic instead of crashing. 3. (Minor) PRISTINE_FLAG built as a single string + unquoted expansion would split paths with spaces. Switched to a bash array (VERIFY_ARGS) that preserves whitespace through expansion. 4. (Minor) Fenced code block missing language tag (markdownlint MD040). Added `text` tag to the error message block. 5. (Minor) Usage comment said pristine fallback was "backup-meta lookup" but the actual code path falls back to significant-line checks from backup content. Corrected the comment to match implementation. Verified all 21 tests in tests/reapply-patches.test.cjs (#1999 contract) + tests/bug-2969-verify-reapply-patches.test.cjs (now 7 tests with the new directory case) pass. * test(#2969): structured JSON assertions, no substring matching on script output Replace every assert.match(r.stdout, /pattern/) call with structured assertions on the parsed JSON report from the script's own --json mode. The script's --json contract IS the structured shape we test against — the test author should never depend on the human-readable formatter output, just as no test should depend on substring presence in source. Changes: - All 7 tests now run the verifier with --json (via a runVerifier() helper) and parse the resulting JSON document into { status, report, stderr }. Diagnostic stderr is preserved as a separate channel for debug output but is not used for assertions. - Each previously substring-matched diagnostic ("Failures: 1", "not a regular file", "installed file missing after merge", file path, dropped line) is now a deepEqual / equal / Array.includes against typed report fields: report.failures, report.results[i].status, report.results[i].reason, report.results[i].file, report.results[i].missing[]. - Added an explicit "documented shape" test asserting the JSON output has exactly the keys { file, missing, reason, status } per result — locks the public contract of the --json mode. - DRY'd up fixture reset into a resetFixture() helper since every test starts with a fresh patches/installed/pristine triple. Linter: scripts/lint-no-source-grep.cjs reports 0 violations across 348 test files. Combined run of bug-2969-...test.cjs (7 tests) + reapply-patches.test.cjs (5 tests in the #1999 suite) all pass — 22/22 in the relevant fixture. * fix(#2969): typed REASON enum + raw-text-matching rule shipped repo-wide This commit closes the loop on the no-source-grep discipline: 1. scripts/verify-reapply-patches.cjs: - Frozen REASON enum exposes the diagnostic surface as stable codes: OK_NO_USER_LINES_VS_PRISTINE, OK_NO_SIGNIFICANT_BACKUP_LINES, FAIL_INSTALLED_MISSING, FAIL_INSTALLED_NOT_REGULAR_FILE, FAIL_READ_ERROR, FAIL_USER_LINES_MISSING. - Each result.reason is now a code from this enum, not free text. Tests assert via REASON.X equality, not regex on prose. - REASON exported from module.exports. 2. tests/bug-2969-verify-reapply-patches.test.cjs: - Full rewrite. Every assertion on typed structured fields: report.results[0].status === 'fail', report.results[0].reason === REASON.FAIL_INSTALLED_NOT_REGULAR_FILE, report.results[0].missing.includes(droppedLine) (Array set membership, not String substring). - Locks the REASON enum surface via Object.keys(REASON).sort() deepEqual. - Locks the JSON report shape via Object.keys(report).sort() deepEqual. - Zero regex, zero String#includes, zero startsWith/endsWith on text. 3. CONTRIBUTING.md: - New section "Prohibited: Raw Text Matching on Test Outputs" with concrete BAD/GOOD examples (substring on file content; assert.match on stdout; "structured parser" hiding string ops; regex on free-form reason fields). - The rule statement: "Tests assert on typed structured values. If the code under test produces text, the code under test must also expose a structured intermediate representation, and the test must assert on that IR — never on the rendered text." - Required structured-surface table: file IR, --json mode, frozen enum, fs facts. - "Hiding grep behind a function is still grep" callout — the parser-wrapper anti-pattern. - New `pre-existing-text-matching` exemption category for the 8 grandfathered files. Marked Transitional; new tests cannot use it. 4. scripts/lint-no-source-grep.cjs: - Three new patterns enforced (in addition to the existing .cjs-source readFileSync rule): - assert.match/doesNotMatch on .stdout/.stderr - .stdout/.stderr.<includes\|startsWith\|endsWith>( - readFileSync(...).<includes\|startsWith\|endsWith>( - Aggregated violations per file (multiple findings now report together). - Updated diagnostic message references both CONTRIBUTING.md sections. 5. 8 pre-existing tests annotated with `// allow-test-rule: pre-existing-text-matching` so the lint passes on this commit; each carries the prose "Tracked for migration to typed-IR assertions; do not copy this pattern." Files: bug-2649, bug-2687, bug-2796, bug-2838, bug-2943, graphify, hooks-opt-in, security-scan. Verification: lint 0 violations across 348 test files; full suite passes. * fix(#2969): rename exemption category to pending-migration-to-typed-ir + cite tracking issue Per maintainer feedback: 1. "Grandfathered" / "legacy" framing is wrong — both terms imply permanent or condoned exemption. The 8 files are tracked for correction, not exempted. 2. Each annotated file must cite the tracking issue so the migration work is auditable. Changes: - CONTRIBUTING.md: rename exemption category from `pre-existing-text-matching` to `pending-migration-to-typed-ir`. Update prose to "Tracked for correction, not exempted" and require each annotation to cite the open migration issue (e.g. `// allow-test-rule: pending-migration-to-typed-ir [#NNNN]`). - 8 test files: update annotation to cite #2974 (the tracking issue opened for migrating these files to typed-IR assertions).	2026-05-01 16:14:39 -04:00
Jeremy McSpadden	4d394a249d	fix(commands): normalize gsd slash namespace drift (#2858 ) * fix(commands): normalize gsd slash namespace drift * fix(#2855): address CodeRabbit findings on namespace drift PR Three CR findings, all valid: 1. autonomous.md line 783 still had `gsd:discuss-phase` (the PR's own normalization missed this line). Switched to `gsd-discuss-phase` and updated the matching test in autonomous-interactive.test.cjs that was asserting the now-retired colon form. 2. tests/bug-2543-gsd-slash-namespace.test.cjs source-grepped the fix-slash-commands.cjs script with .includes() rather than driving its transform behaviour. Refactored fix-slash-commands.cjs to export a pure transformContent(src, cmdNames) function, kept the CLI behaviour unchanged via require.main, and replaced the source-grep block with five behavioural cases: rewrite, multi-occurrence, idempotence on canonical input, no-op on gsd-sdk/gsd-tools, and word-boundary safety. 3. tests/bug-2808-skill-hyphen-name.test.cjs matched `name:` anywhere in SKILL.md; a stray name: in the body could satisfy the assertion. Scoped the lookup to the YAML frontmatter block via the suggested diff (parse the leading --- ... --- region first, then find name: inside it). Full suite: 5854/5854 passing. * fix(#2855): address remaining CodeRabbit findings on PR #2858 Three structural concerns flagged on the namespace-drift fix PR: 1. scripts/fix-slash-commands.cjs:24 — `buildPattern([])` compiled `/gsd:()(?=[^a-zA-Z0-9_-]\|$)/g`. The empty capture group still matches any `/gsd:` token followed by a non-word boundary (whitespace, EOL, punctuation), rewriting it to a stray `/gsd-`. Verified live: `transformContent("/gsd:", [])` → `"/gsd-"`. Added a guard returning null from `buildPattern` on empty input and updated `transformContent` and `processDir` to no-op when the pattern is null. 2. tests/autonomous-interactive.test.cjs:44-47 — assertion was `content.includes('gsd-discuss-phase') && content.includes('INTERACTIVE')`, which would false-pass on any unrelated co-occurrence (e.g. `INTERACTIVE=""` initialization plus a stray `gsd-discuss-phase` prose mention). Replaced with a structural extraction: locate the `If \`INTERACTIVE\` is set:` branch, bound it by the next `*If` / `<step>` boundary, and assert the `Skill(skill="gsd-discuss-phase", ...)` invocation lives inside that region. Tolerates whitespace around `(`, `skill`, and `=`. 3. tests/bug-2808-skill-hyphen-name.test.cjs:104 — colon-call regex was `Skill\(skill=...` and missed valid formatting like `Skill(skill = "gsd:cmd")` or `Skill( skill = ...)`. Loosened to `Skill\(\sskill\s=\s...` so reformatting drift can't slip past the namespace guard. Verification: 5854/5854 pass on `npm test` from the rebased branch. * fix(#2855): drop pre-validation filter that hid namespace drift CR finding on tests/bug-2808-skill-hyphen-name.test.cjs:128: the test collected generated skill directories with `.filter(entry => entry.isDirectory() && entry.name.startsWith('gsd-'))`, then validated namespace invariants over that filtered list. Anything that violated the prefix invariant — `gsd:extract-learnings` (colon form), `extract_learnings` without prefix, `Gsd-foo` mis-cased — would silently disappear from the iteration and the test would falsely pass. Drop the `startsWith('gsd-')` filter so every generated directory shows up. Add explicit assertions before the existing per-skill loop: - directory list is non-empty (catches a broken converter that produces nothing) - every directory begins with `gsd-` - every directory contains no `:` - every directory contains no `_` Re-audited the full PR diff for the same anti-pattern: only this one site filtered before validating the namespace; bug-2643 and commands-doc-parity also use `readdirSync().filter()` but only by file extension, which is correct. 5854/5854 on `npm test`. * fix(#2855): address remaining CR findings (1 active + 2 nitpicks) Three findings on PR #2858, all the same root cause: input narrowing before validation lets drift slip past the guards. 1. tests/bug-2808-...:104 (active) — `colonCallRe` captured local names with `[a-z0-9-]+`, which excluded the underscore. A drift like `Skill(skill="gsd:extract_learnings")` (deprecated colon syntax with the old underscore filename) silently slid through. Broadened the capture to `[^'"\s)]+` so any malformed local name is surfaced; surrounding pattern (whitespace tolerance, escape support, flags) unchanged. 2. tests/bug-2643-...:43 (nitpick) — `extractSkillNamesHyphen` and `extractSkillNamesColon` had the same over-strict capture plus relied on a single regex over raw bytes, which the project test- rigor memory bans (`feedback_no_source_grep_tests.md`). Replaced with `extractSkillCalls(content)` — a small structural extractor that walks `Skill(` openers, locates each call's matching `)`, parses the body's `skill = "..."` keyword argument with permissive whitespace + quoting + escape handling, and returns `{ name, raw }` records. The two namespace-form helpers become thin filters over the structured output. Tightened the body class to `[^'"\\]+` so a trailing escape `\` before the closing quote (as in `Skill(skill=\"gsd-foo\", …)` written inside another string context) doesn't get included in the captured name. 3. tests/bug-2543-...:44 (nitpick) — `DOC_SEARCH_FILES` was a hand- curated 7-entry array. Every doc added in the future would silently weaken drift detection until someone remembered to extend the list. Replaced with `discoverDocSearchFiles(ROOT)`: globs every `.md` under `docs/` and adds `README.md` if present. New docs are picked up automatically. Re-audited the diff surface for similar narrowings; no other sites filter or constrain before validating namespace invariants. 5854/5854 on `npm test`. * fix(#2855): recurse docs/ tree so localized translations are scanned too CR finding: discoverDocSearchFiles() stopped at docs/*.md, leaving localized translation trees (docs/ja-JP/, docs/zh-CN/, docs/ko-KR/, docs/pt-BR/) and other nested doc collections (docs/skills/, docs/superpowers/) invisible to the namespace-drift invariant. Verified the gap: docs/ has 6 nested directories with ~30 .md files that the previous top-level-only scan was skipping. None contain /gsd: references today, but a future translation update or new doc subdir could leak drift. Switch to an iterative stack walk so every .md under docs/ is scanned regardless of depth. Stack form (rather than recursion) avoids the risk of running into the call-stack limit on deep doc trees. 5854/5854 on `npm test`. --------- Co-authored-by: Tom Boucher <trekkie@nomorestars.com>	2026-04-29 22:56:59 -04:00
Tom Boucher	e81592878e	feat(#2789 ): trim skill description anti-patterns; enforce 100-char budget (#2823 ) * feat(#2789): trim skill description anti-patterns; enforce 100-char budget - Trim descriptions in all commands/gsd/.md files over 100 chars - Remove flag documentation from descriptions (belongs in argument-hint) - Remove Triggers: keyword stuffing - Add scripts/lint-descriptions.cjs — fails on descriptions > 100 chars - Add npm script: lint:descriptions - Add tests/enh-2789-description-budget.test.cjs Closes #2789 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> docs(#2789): add CHANGELOG entry for description budget lint * docs(#2789): update COMMANDS.md descriptions; add skill description standards note Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 08:14:11 -04:00
Tom Boucher	aeef87de7f	docs(test-standards): enforce no-source-grep rule with CI linter + CONTRIBUTING.md (#2700 ) * docs(test-standards): enforce no-source-grep rule with CI linter + update CONTRIBUTING.md Adds scripts/lint-no-source-grep.cjs — a static linter that detects readFileSync on .cjs source files in tests without an allow-test-rule annotation. Wires it into CI as a new lint-tests job in test.yml and as npm run lint:tests. Resolves all 9 existing violations across the test suite: - Rewrites workspace routing tests (3) as behavioral runGsdTools calls that verify each command is router-recognized (exit != "Unknown init workflow") - Adds allow-test-rule annotations with explanatory comments to 7 legitimate structural tests: architectural invariants (locking, orphan-worktree), structural regression guards (milestone-regex-global), docs-parity (config-field-docs), integration-test-input (copilot-install), and structural-implementation-guards (bug-1891, discuss-mode) Updates CONTRIBUTING.md Testing Standards section with: - "Prohibited: Source-Grep Tests" section with the before/after pattern, root cause analysis of why it breaks (commit `990c3e64`), and CI reference - allow-test-rule exemption table (6 recognized categories with when-to-use) - "CI Test Quality Checks" table showing lint-tests job and local run command Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve CodeRabbit findings on PR #2700 - CONTRIBUTING.md: "four recognized categories" → "six" (table has 6 rows) - workspace.test.cjs: use positional args in routing tests (no --name flag) - lint-no-source-grep.cjs: add source-dir guard to READ_WITH_INLINE_CJS_RE (mirrors CJS_PATH_CONST_RE's protection against false positives on temp files) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): tighten allow-test-rule and add recursive test discovery - ALLOW_ANNOTATION now requires at least one non-whitespace char after the colon so bare '// allow-test-rule:' cannot bypass the lint gate - findTestFiles() recurses into subdirectories so nested *.test.cjs files are covered if the tests/ tree ever grows subdirs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 11:34:55 -04:00
Tom Boucher	259c1d07d3	fix(#2647 ): guard tarball ships sdk/dist so gsd-sdk query works (#2671 ) v1.38.3 shipped without sdk/dist/ because the outer `files` whitelist and `prepublishOnly` chain had drifted. The `gsd-sdk` bin shim then fell through to a stale @gsd-build/sdk@0.1.0 (pre-`query`), breaking every workflow that called `gsd-sdk query <noun>` on fresh installs. Current package.json already restores `sdk/dist` + `build:sdk` prepublish; this PR locks the fix in with: - tests/bug-2647-outer-tarball-sdk-dist.test.cjs — asserts `files` includes `sdk/dist`, `prepublishOnly` invokes `build:sdk`, the shim resolves sdk/dist/cli.js, `npm pack --dry-run` lists sdk/dist/cli.js, and the built CLI exposes a `query` subcommand. - scripts/verify-tarball-sdk-dist.sh — packs, extracts, installs prod deps, and runs `node sdk/dist/cli.js query --help` against the real tarball output. - .github/workflows/release.yml — runs the verify script in both next and stable release jobs before `npm publish`. Partial fix for #2649 (same root cause on the sibling sdk package). Fixes #2647	2026-04-24 18:05:18 -04:00
Tom Boucher	73c1af5168	fix(#2543 ): replace legacy /gsd-<cmd> syntax with /gsd:<cmd> across all source files (#2595 ) Commands are now installed as commands/gsd/<name>.md and invoked as /gsd:<name> in Claude Code. The old hyphen form /gsd-<name> was still hardcoded in hundreds of places across workflows, references, templates, lib modules, and command files — causing "Unknown command" errors whenever GSD suggested a command to the user. Replace all /gsd-<cmd> occurrences where <cmd> is a known command name (derived at runtime from commands/gsd/*.md) using a targeted Node.js script. Agent names, tool names (gsd-sdk, gsd-tools), directory names, and path fragments are not touched. Adds regression test tests/bug-2543-gsd-slash-namespace.test.cjs that enforces zero legacy occurrences going forward. Removes inverted tests/stale-colon-refs.test.cjs (bug #1748) which enforced the now-obsolete hyphen form; the new bug-2543 test supersedes it. Updates 5 assertion tests that hardcoded the old hyphen form to accept the new colon form. Closes #2543 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:04:25 -04:00
Tom Boucher	62eaa8dd7b	docs: close doc drift vectors — bidirectional parity, manifest, schema-driven config (#2479 ) Option A — ghost-entry guard (INVENTORY ⊆ actual): tests/inventory-source-parity.test.cjs parses every declared row in INVENTORY.md and asserts the source file exists. Catches deletions and renames that leave ghost entries behind. Option B — auto-generated structural manifest: scripts/gen-inventory-manifest.cjs walks all six family dirs and emits docs/INVENTORY-MANIFEST.json. tests/inventory-manifest-sync.test.cjs fails CI when a new surface ships without a manifest update, surfacing exactly which entries are missing. Option C — schema-driven config validation + docs parity: get-shit-done/bin/lib/config-schema.cjs extracted from config.cjs as the single source of truth for VALID_CONFIG_KEYS and dynamic patterns. config.cjs now imports from it. tests/config-schema-docs-parity.test.cjs asserts every exact-match key appears in docs/CONFIGURATION.md, surfacing 14 previously undocumented keys (planning.sub_repos, workflow.ai_integration_phase, git.base_branch, learnings.max_inject, and 10 others) — all now documented in their appropriate sections. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 09:39:05 -04:00
Jeremy McSpadden	13a96ee994	fix(build): include gsd-read-injection-scanner in hooks/dist (#2406 ) The scanner was added in #2201 but never added to the HOOKS_TO_COPY allowlist in scripts/build-hooks.js, so it never landed in hooks/dist/. install.js reads from hooks/dist/, so every install on 1.37.0/1.37.1 emitted "Skipped read injection scanner hook — not found at target" and the read-time prompt-injection scanner was silently disabled. - Add gsd-read-injection-scanner.js to HOOKS_TO_COPY - Add it to EXPECTED_ALL_HOOKS regression test in install-hooks-copy Fixes #2406 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 08:42:36 -05:00
Tom Boucher	c35997fb0b	feat(hooks): add gsd-read-injection-scanner PostToolUse hook (#2201 ) (#2328 ) * feat: add /gsd-spec-phase — Socratic spec refinement with ambiguity scoring (#2213) Introduces `/gsd-spec-phase <phase>` as an optional pre-step before discuss-phase. Clarifies WHAT a phase delivers (requirements, boundaries, acceptance criteria) with quantitative ambiguity scoring before discuss-phase handles HOW to implement. - `commands/gsd/spec-phase.md` — slash command routing to workflow - `get-shit-done/workflows/spec-phase.md` — full Socratic interview loop (up to 6 rounds, 5 rotating perspectives: Researcher, Simplifier, Boundary Keeper, Failure Analyst, Seed Closer) with weighted 4-dimension ambiguity gate (≤ 0.20 to write SPEC.md) - `get-shit-done/templates/spec.md` — SPEC.md template with falsifiable requirements (Current/Target/Acceptance per requirement), Boundaries, Acceptance Criteria, Ambiguity Report, and Interview Log; includes two full worked examples - `get-shit-done/workflows/discuss-phase.md` — new `check_spec` step detects `{padded_phase}-SPEC.md` at startup; displays "Found SPEC.md — N requirements locked. Focusing on implementation decisions."; `analyze_phase` respects `spec_loaded` flag to skip "what/why" gray areas; `write_context` emits `<spec_lock>` section with boundary summary and canonical ref to SPEC.md - `docs/ARCHITECTURE.md` — update command/workflow counts (74→75, 71→72) Closes #2213 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(hooks): add gsd-read-injection-scanner PostToolUse hook (#2201) Adds a new PostToolUse hook that scans content returned by the Read tool for prompt injection patterns, including four summarisation-specific patterns (retention-directive, permanence-claim, etc.) that survive context compression. Defense-in-depth for long GSD sessions where the context summariser cannot distinguish user instructions from content read from external files. - Advisory-only (warns without blocking), consistent with gsd-prompt-guard.js - LOW severity for 1-2 patterns, HIGH for 3+ - Inlined pattern library (hook independence) - Exclusion list: .planning/, REVIEW.md, CHECKPOINT, security docs, hook sources - Wired in install.js as PostToolUse matcher: Read, timeout: 5s - Added to MANAGED_HOOKS for staleness detection - 19 tests covering all 13 acceptance criteria (SCAN-01–07, EXCL-01–06, EDGE-01–06) Closes #2201 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): add read-injection-scanner files to prompt-injection-scan allowlist Test payloads in tests/read-injection-scanner.test.cjs and inlined patterns in hooks/gsd-read-injection-scanner.js legitimately contain injection strings. Add both to the CI script allowlist to prevent false-positive failures. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): assert exitCode, stdout, and signal explicitly in EDGE-05 Addresses CodeRabbit feedback: the success path discarded the return value so a malformed-JSON input that produced stdout would still pass. Now captures and asserts all three observable properties. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 17:22:31 -04:00
Tom Boucher	50f61bfd9a	fix(hooks): complete stale-hooks false-positive fix — stamp .sh version headers + fix detector regex (#2224 ) * fix(hooks): stamp gsd-hook-version in .sh hooks and fix stale detection regex (#2136, #2206) Three-part fix for the persistent "⚠ stale hooks — run /gsd-update" false positive that appeared on every session after a fresh install. Root cause: the stale-hook detector (gsd-check-update.js) could only match the JS comment syntax // in its version regex — never the bash # syntax used in .sh hooks. And the bash hooks had no version header at all, so they always landed in the "unknown / stale" branch regardless. Neither partial fix (PR #2207 regex only, PR #2215 install stamping only) was sufficient alone: - Regex fix without install stamping: hooks install with literal "{{GSD_VERSION}}", the {{-guard silently skips them, bash hook staleness permanently undetectable after future updates. - Install stamping without regex fix: hooks are stamped correctly with "# gsd-hook-version: 1.36.0" but the detector's // regex can't read it; still falls to the unknown/stale branch on every session. Fix: 1. Add "# gsd-hook-version: {{GSD_VERSION}}" header to gsd-phase-boundary.sh, gsd-session-state.sh, gsd-validate-commit.sh 2. Extend install.js (both bundled and Codex paths) to substitute {{GSD_VERSION}} in .sh files at install time (same as .js hooks) 3. Extend gsd-check-update.js versionMatch regex to handle bash "#" comment syntax: /(?:\/\/\|#) gsd-hook-version:\s(.+)/ Tests: 11 new assertions across 5 describe blocks covering all three fix parts independently plus an E2E install+detect round-trip. 3885/3885 pass. Approach credit: PR #2207 (j2h4u / Maxim Brashenko) for the regex fix; PR #2215 (nitsan2dots) for the install.js substitution approach. Closes #2136, #2206, #2209, #2210, #2212 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> refactor(hooks): extract check-update worker to dedicated file, eliminating template-literal regex escaping Move stale-hook detection logic from inline `node -e '<template literal>'` subprocess to a standalone gsd-check-update-worker.js. Benefits: - Regex is plain JS with no double-escaping (root cause of the (?:\\/\\/\|#) confusion) - Worker is independently testable and can be read directly by tests - Uses execFileSync (array args) to satisfy security hook that blocks execSync - MANAGED_HOOKS now includes gsd-check-update-worker.js itself Update tests to read worker file instead of main hook for regex/configDir assertions. All 3886 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 17:57:38 -04:00
Tom Boucher	2703422be8	refactor(tests): standardize to node:assert/strict and t.after() per CONTRIBUTING.md (#1675 ) * refactor(tests): standardize to node:assert/strict and t.after() per CONTRIBUTING.md - Replace require('node:assert') with require('node:assert/strict') across all 73 test files to enforce strict equality (no type coercion) - Replace try/finally cleanup blocks with t.after() hooks in core.test.cjs and hooks-opt-in.test.cjs per the test lifecycle standards - Utility functions in codex-config and security-scan retain try/finally as that is appropriate for per-function resource guards, not lifecycle hooks Closes #1674 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * perf(tests): add --test-concurrency=4 to test runner for parallel file execution Node.js --test-concurrency controls how many test files run as parallel child processes. Set to 4 by default, configurable via TEST_CONCURRENCY env var. Fixes tests at a known level rather than inheriting os.availableParallelism() which varies across CI environments. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(security): allowlist verify.test.cjs in prompt-injection scanner tests/verify.test.cjs uses <human>...</human> as GSD phase task-type XML (meaning "a human should verify this step"), which matches the scanner's fake-message-boundary pattern for LLM APIs. This is a false positive — add it to the allowlist alongside the other test files that legitimately contain injection-adjacent patterns. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 14:29:03 -04:00
Tom Boucher	ca6a273685	fix: remove marketing text from runtime prompt, fix #1656 and #1657 (#1672 ) * chore: ignore .worktrees directory Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(install): remove marketing taglines from runtime selection prompt Closes #1654 The runtime selection menu had promotional copy appended to some entries ("open source, the #1 AI coding platform on OpenRouter", "open source, free models"). Replaced with just the name and path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(kilo): update test to assert marketing tagline is removed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): use process.execPath so tests pass in shells without node on PATH Three test patterns called bare `node` via shell, which fails in Claude Code sessions where `node` is not on PATH: - helpers.cjs string branch: execSync(`node ...`) → execFileSync(process.execPath) with a shell-style tokenizer that handles quoted args and inner-quote stripping - hooks-opt-in.test.cjs: spawnSync('bash', ...) for hooks that call `node` internally → spawnHook() wrapper that injects process.execPath dir into PATH - concurrency-safety.test.cjs: exec(`node ...`) for concurrent patch test → exec(`"${process.execPath}" ...`) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve #1656 and #1657 — bash hooks missing from dist, SDK install prompt #1656: Community bash hooks (gsd-session-state.sh, gsd-validate-commit.sh, gsd-phase-boundary.sh) were never included in HOOKS_TO_COPY in build-hooks.js, so hooks/dist/ never contained them and the installer could not copy them to user machines. Fixed by adding the three .sh files to the copy array with chmod +x preservation and skipping JS syntax validation for shell scripts. #1657: promptSdk() called installSdk() which ran `npm install -g @gsd-build/sdk` — a package that does not exist on npm, causing visible errors during interactive installs. Removed promptSdk(), installSdk(), --sdk flag, and all call sites. Regression tests in tests/bugs-1656-1657.test.cjs guard both fixes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: sort runtime list alphabetically after Claude Code - Claude Code stays pinned at position 1 - Remaining 10 runtimes sorted A-Z: Antigravity(2), Augment(3), Codex(4), Copilot(5), Cursor(6), Gemini(7), Kilo(8), OpenCode(9), Trae(10), Windsurf(11) - Updated runtimeMap, allRuntimes, and prompt display in promptRuntime() - Updated multi-runtime-select, kilo-install, copilot-install tests to match Also fix #1656 regression test: run build-hooks.js in before() hook so hooks/dist/ is populated on CI (directory is gitignored; build runs via prepublishOnly before publish, not during npm ci). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 14:15:30 -04:00
Tom Boucher	9d626de5fa	fix(hooks): add read-before-edit guard for non-Claude runtimes (#1645 ) * fix(hooks): add read-before-edit guidance for non-Claude runtimes When models that don't natively enforce read-before-edit hit the guard, the error message now includes explicit instruction to Read first. This prevents infinite retry loops that burn through usage. Closes #1628 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(build): register gsd-read-guard.js in HOOKS_TO_COPY and harden tests The hook was missing from scripts/build-hooks.js, so global installs would never receive the hook file in hooks/dist/. Also adds tests for build registration, install uninstall list, and non-string file_path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-04 07:35:18 -04:00
Tom Boucher	98f05d43b8	fix: security scan self-detection and Windows test compatibility - Add base64-scan.sh and secret-scan.sh to prompt injection scanner allowlist (scanner was flagging its own pattern strings) - Skip executable bit check on Windows (no Unix permissions) - Skip bash script execution tests on Windows (requires Git Bash) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 13:30:15 -04:00
Tom Boucher	feec5a37a2	ci(security): add prompt injection, base64, and secret scanning Add CI security pipeline to catch prompt injection attacks, base64-obfuscated payloads, leaked secrets, and .planning/ directory commits in PRs. This is critical for get-shit-done because the entire codebase is markdown prompts — a prompt injection in a workflow file IS the attack surface. New files: - scripts/prompt-injection-scan.sh: scans for instruction override, role manipulation, system boundary injection, DAN/jailbreak, and tool call injection patterns in changed files - scripts/base64-scan.sh: extracts base64 blobs >= 40 chars, decodes them, and checks decoded content against injection patterns (skips data URIs and binary content) - scripts/secret-scan.sh: detects AWS keys, OpenAI/Anthropic keys, GitHub PATs, Stripe keys, private key headers, and generic credential patterns - .github/workflows/security-scan.yml: runs all three scans plus a .planning/ directory check on every PR - .base64scanignore / .secretscanignore: per-repo false positive allowlists - tests/security-scan.test.cjs: 51 tests covering script existence, pattern matching, false positive avoidance, and workflow structure All scripts support --diff (CI), --file, and --dir modes. Cross-platform (macOS + Linux). SHA-pinned actions. Environment variables used for github context in run blocks (no direct interpolation). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 13:23:51 -04:00
Tom Boucher	62db008570	security: add prompt injection guards, path traversal prevention, and input validation Defense-in-depth security hardening for a codebase where markdown files become LLM system prompts. Adds centralized security module, PreToolUse hook for injection detection, and CI-ready codebase scan. New files: - security.cjs: path traversal prevention, prompt injection scanner/sanitizer, safe JSON parsing, field name validation, shell arg validation - gsd-prompt-guard.js: PreToolUse hook scans .planning/ writes for injection - security.test.cjs: 62 unit tests for all security functions - prompt-injection-scan.test.cjs: CI scan of all agent/workflow/command files Hardened code paths: - readTextArgOrFile: path traversal guard (--prd, --text-file) - cmdStateUpdate/Patch: field name validation prevents regex injection - cmdCommit: sanitizeForPrompt strips invisible chars from commit messages - gsd-tools --fields: safeJsonParse wraps unprotected JSON.parse - cmdFrontmatterGet/Set: null byte rejection - cmdVerifyPathExists: null byte rejection - install.js: registers prompt guard hook, updates uninstaller Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-20 11:38:26 -04:00
Tom Boucher	a6ba3e268e	feat: PreToolUse workflow guard hook for rogue edit prevention (#678 ) (#1197 ) New opt-in PreToolUse hook that warns when Claude edits files outside a GSD workflow context (no active /gsd: command or subagent). Soft guard — advises, does not block. The edit proceeds but Claude sees a reminder to use /gsd:fast or /gsd:quick for state tracking. Enable: set hooks.workflow_guard: true in .planning/config.json Default: disabled (false) Allows without warning: - .planning/ files (GSD state management) - Config files (.gitignore, .env, CLAUDE.md, settings.json) - Subagent contexts (executor, planner, etc.) Includes 3s stdin timeout guard and silent fail-safe. Closes #678	2026-03-18 17:36:07 -04:00
Tom Boucher	14c1dd845b	fix(build): add syntax validation to hook build script (#1165 ) Prevents shipping hooks with JavaScript SyntaxError (like the duplicate const cwd declaration that caused PostToolUse errors for all users in v1.25.1). The build script now validates each hook file's syntax via vm.Script before copying to dist/. If any hook has a SyntaxError, the build fails with a clear error message and exits non-zero, blocking npm publish. Refs #1107, #1109, #1125, #1161	2026-03-18 09:56:51 -06:00
Lex Christopherson	02a5319777	fix(ci): propagate coverage env in cross-platform test runner The run-tests.cjs child process now inherits NODE_V8_COVERAGE from the parent so c8 collects coverage data. Also restores npm scripts to use the cross-platform runner for both test and test:coverage commands. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-27 10:07:02 -06:00
Lex Christopherson	ccb8ae1d18	fix(ci): cross-platform test runner for Windows glob expansion npm scripts pass `tests/*.test.cjs` to node/c8 as a literal string on Windows (PowerShell/cmd don't expand globs). Adding `shell: bash` to CI steps doesn't help because c8 spawns node as a child process using the system shell. Use a Node script to enumerate test files cross-platform. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-27 10:00:26 -06:00
vinicius-tersi	7542d364b4	feat: context window monitor hook with agent-side WARNING/CRITICAL alerts Adds PostToolUse hook that reads context metrics from statusline bridge file and injects alerts into agent conversation when context is low. Features: - Two-tier alerts: WARNING (<=35% remaining) and CRITICAL (<=25%) - Smart debounce: 5 tool uses between warnings, severity escalation bypasses - Silent fail: never blocks tool execution - Security: session_id sanitized to prevent path traversal Ref #212	2026-02-20 14:40:08 -06:00
Lex Christopherson	d1fda80c7f	revert: remove codebase intelligence system Rolled back the intel system due to overengineering concerns: - 1200+ line hook with SQLite graph database - 21MB sql.js dependency - Entity generation spawning additional Claude calls - Complex system with unclear value Removed: - /gsd:analyze-codebase command - /gsd:query-intel command - gsd-intel-index.js, gsd-intel-session.js, gsd-intel-prune.js hooks - gsd-entity-generator, gsd-indexer agents - entity.md template - sql.js dependency Preserved: - Model profiles feature - Statusline hook - All other v1.9.x improvements -3,065 lines removed Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 10:28:53 -06:00
Lex Christopherson	cdad7b8ad7	fix: update build script to use gsd-statusline.js Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 12:11:08 -06:00

39 Commits