mirror of
https://github.com/glittercowboy/get-shit-done
synced 2026-05-13 18:46:38 +02:00
a33cbe72f569e75e72d94de85d0930296d1ca1df
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
918f987a19 |
feat(#2982): extend no-source-grep lint to catch var-binding readFileSync.includes() (#2985)
* feat(#2982): extend no-source-grep lint to catch var-binding readFileSync.includes() The base lint (scripts/lint-no-source-grep.cjs) only catches readFileSync(...).<text-method>() chained directly. The much more common var-binding form escapes it: const src = fs.readFileSync(p, 'utf8'); // 50 lines later if (src.includes('foo')) {} // ← still grep, lint missed it Scan of the test suite found ~141 files using this pattern. Implementation built TDD per #2982 with structured-IR assertions: scripts/lint-no-source-grep-extras.cjs - detectVarBindingViolations(src) — pure detector, two passes: pass 1 collects vars bound from readFileSync, pass 2 finds any <var>.<includes|startsWith|endsWith|match|search>( on those vars. - detectWrappedAssertOkMatch(src) — flags assert.ok(<expr>.match(...)) which escapes the assert.match rule. - VIOLATION enum exposes stable codes for tests to assert on. scripts/lint-no-source-grep.cjs - Wires the new detectors into the existing per-file check; one additional violation row per file with the first 3 sample tokens. tests/bug-2982-lint-var-binding.test.cjs - 13 tests, all assertions on typed VIOLATION enum / structured records. Covers all 5 text-match methods, multi-var, no-bind, string literal (must NOT trigger), wrapped assert.ok(.match), and assert.match (must NOT double-flag). Migration backlog (#2974 expanded scope): - 42 files annotated `// allow-test-rule: source-text-is-the-product` (legitimate — they read .md/.json/.yml files whose deployed text IS the product) - 3 files annotated `// allow-test-rule: pending-migration-to-typed-ir [#2974]` (read .cjs/.js source — clear migration debt) - 95 files annotated `pending-migration-to-typed-ir [#2974]` with `Per-file review may reclassify as source-text-is-the-product during migration` (mixed — manual review under #2974) After this lands the lint reports 0 violations on main; new violations in PRs surface immediately. Closes #2982 Refs #2974 * test(#2982): fix truncated test name per CR The label ended with a bare '(' from a copy-paste mishap. Now reads 'does NOT flag .matchAll(...) — matchAll is not match, so assert.ok(.matchAll(...)) is not flagged'. * chore(#2982): add changeset fragment for PR #2985 * chore(#2982): add changeset fragment for PR #2985 |
||
|
|
ef43f5161f |
fix(#2969): deterministic Step 5 verification gate for /gsd-reapply-patches (#2972)
* fix(#2969): deterministic Step 5 verification gate for /gsd-reapply-patches The prior Step 5 "Hunk Verification Gate" was prescribed correctly in the workflow text — but executed laxly by the LLM, which filled in `verified: yes` without actually checking content presence. The reporter observed three distinct files (skills/gsd-discuss-phase/SKILL.md, skills/gsd-autonomous/ SKILL.md, get-shit-done/workflows/new-project.md) where archives contained substantive user-added blocks that did not survive into the merged result, yet the gate reported clean. Move verification from LLM-driven prose into a deterministic Node script the workflow calls. The script can't be shortcut. Changes: - scripts/verify-reapply-patches.cjs (new): pure Node, no external deps. For each file in the patches dir, computes user-added significant lines as the line-set diff between backup and pristine baseline (when available; falls back to "every significant backup line" when no pristine — over-broad but the safe direction for this bug class). Asserts each line appears literally in the merged installed file via String.prototype.includes. Filters trivial lines (length < 12 chars, pure punctuation, decorative comments) so harmless drift doesn't trigger false failures. Exits 0 on pass, 1 on any miss with per-file diagnostic, 2 on usage error. Supports --json for workflow consumption. - get-shit-done/workflows/reapply-patches.md: rewrite Step 5 to call the script and parse its JSON output. The Step 4 Hunk Verification Table remains as advisory Claude-readable summary, but the gate is now the script's exit code. - tests/bug-2969-verify-reapply-patches.test.cjs (new): 6 tests covering (a) pass when every line survives, (b) fail when a line is missing, (c) fail when the merged file is deleted entirely, (d) --json structured report shape, (e) backup-meta.json is correctly skipped as metadata, (f) no-pristine-dir fallback exercises the safe over-broad path. All pass. Out of scope: the manifest-baseline tightening described in #2969 Failure 1 (saveLocalPatches comparing against the wrong baseline so prior silent wipes poison subsequent updates). That's a separate, bigger architectural change involving pristine-content infrastructure; this PR addresses the gate fidelity half so users at least see the diagnostic when content goes missing. Closes #2969 (partial — Failure 2 only) * fix(#2969): preserve #1999 Hunk Verification Table assertions alongside new script gate CI failure on PR #2972 surfaced that tests/reapply-patches.test.cjs (the #1999 contract) asserts Step 5 references: - "Hunk Verification Table" - `verified: no` failure condition - explicit STOP/halt/abort directive - "table absent / missing" halt path My initial Step 5 rewrite for #2969 substituted the deterministic script for the table-based gate entirely, stripping those references. The script is the strictly stronger gate, but the existing #1999 test enforces the table-based safety net as a defense-in-depth contract. Restore both gates as a layered Step 5: - 5a (binding): deterministic verifier script — script gate, exits non-zero on any miss, cannot be shortcut by the LLM - 5b (advisory): Hunk Verification Table review — preserved as redundant safety net for the case where the script has a bug or the pristine baseline is unavailable Both gates must pass. Verified: tests/reapply-patches.test.cjs (5 tests in the #1999 suite) and tests/bug-2969-verify-reapply-patches.test.cjs (6 tests in the #2969 suite) all pass — 21/21 total in this fixture. * fix(#2969): address CodeRabbit findings on workflow + script Five CR findings on PR #2972, all valid; addressed in this commit: 1. (Major) Stderr was merged into VERIFY_OUTPUT via `2>&1`, so any Node warning, deprecation notice, or stack trace would corrupt the JSON parse downstream. Capture stdout only; stderr remains on the controlling terminal for operator visibility. 2. (Major) verifyFile() crashed with EISDIR/EACCES instead of producing a structured diagnostic when the installed path was a directory or unreadable. Wrap statSync/readFileSync in try/catch and emit a per-file fail row; the whole-run gate continues with structured output. Added test case asserting the directory-at-installed-path case fails with `not a regular file` diagnostic instead of crashing. 3. (Minor) PRISTINE_FLAG built as a single string + unquoted expansion would split paths with spaces. Switched to a bash array (VERIFY_ARGS) that preserves whitespace through expansion. 4. (Minor) Fenced code block missing language tag (markdownlint MD040). Added `text` tag to the error message block. 5. (Minor) Usage comment said pristine fallback was "backup-meta lookup" but the actual code path falls back to significant-line checks from backup content. Corrected the comment to match implementation. Verified all 21 tests in tests/reapply-patches.test.cjs (#1999 contract) + tests/bug-2969-verify-reapply-patches.test.cjs (now 7 tests with the new directory case) pass. * test(#2969): structured JSON assertions, no substring matching on script output Replace every assert.match(r.stdout, /pattern/) call with structured assertions on the parsed JSON report from the script's own --json mode. The script's --json contract IS the structured shape we test against — the test author should never depend on the human-readable formatter output, just as no test should depend on substring presence in source. Changes: - All 7 tests now run the verifier with --json (via a runVerifier() helper) and parse the resulting JSON document into { status, report, stderr }. Diagnostic stderr is preserved as a separate channel for debug output but is not used for assertions. - Each previously substring-matched diagnostic ("Failures: 1", "not a regular file", "installed file missing after merge", file path, dropped line) is now a deepEqual / equal / Array.includes against typed report fields: report.failures, report.results[i].status, report.results[i].reason, report.results[i].file, report.results[i].missing[]. - Added an explicit "documented shape" test asserting the JSON output has exactly the keys { file, missing, reason, status } per result — locks the public contract of the --json mode. - DRY'd up fixture reset into a resetFixture() helper since every test starts with a fresh patches/installed/pristine triple. Linter: scripts/lint-no-source-grep.cjs reports 0 violations across 348 test files. Combined run of bug-2969-...test.cjs (7 tests) + reapply-patches.test.cjs (5 tests in the #1999 suite) all pass — 22/22 in the relevant fixture. * fix(#2969): typed REASON enum + raw-text-matching rule shipped repo-wide This commit closes the loop on the no-source-grep discipline: 1. scripts/verify-reapply-patches.cjs: - Frozen REASON enum exposes the diagnostic surface as stable codes: OK_NO_USER_LINES_VS_PRISTINE, OK_NO_SIGNIFICANT_BACKUP_LINES, FAIL_INSTALLED_MISSING, FAIL_INSTALLED_NOT_REGULAR_FILE, FAIL_READ_ERROR, FAIL_USER_LINES_MISSING. - Each result.reason is now a code from this enum, not free text. Tests assert via REASON.X equality, not regex on prose. - REASON exported from module.exports. 2. tests/bug-2969-verify-reapply-patches.test.cjs: - Full rewrite. Every assertion on typed structured fields: report.results[0].status === 'fail', report.results[0].reason === REASON.FAIL_INSTALLED_NOT_REGULAR_FILE, report.results[0].missing.includes(droppedLine) (Array set membership, not String substring). - Locks the REASON enum surface via Object.keys(REASON).sort() deepEqual. - Locks the JSON report shape via Object.keys(report).sort() deepEqual. - Zero regex, zero String#includes, zero startsWith/endsWith on text. 3. CONTRIBUTING.md: - New section "Prohibited: Raw Text Matching on Test Outputs" with concrete BAD/GOOD examples (substring on file content; assert.match on stdout; "structured parser" hiding string ops; regex on free-form reason fields). - The rule statement: "Tests assert on typed structured values. If the code under test produces text, the code under test must also expose a structured intermediate representation, and the test must assert on that IR — never on the rendered text." - Required structured-surface table: file IR, --json mode, frozen enum, fs facts. - "Hiding grep behind a function is still grep" callout — the parser-wrapper anti-pattern. - New `pre-existing-text-matching` exemption category for the 8 grandfathered files. Marked Transitional; new tests cannot use it. 4. scripts/lint-no-source-grep.cjs: - Three new patterns enforced (in addition to the existing .cjs-source readFileSync rule): - assert.match/doesNotMatch on .stdout/.stderr - .stdout/.stderr.<includes|startsWith|endsWith>( - readFileSync(...).<includes|startsWith|endsWith>( - Aggregated violations per file (multiple findings now report together). - Updated diagnostic message references both CONTRIBUTING.md sections. 5. 8 pre-existing tests annotated with `// allow-test-rule: pre-existing-text-matching` so the lint passes on this commit; each carries the prose "Tracked for migration to typed-IR assertions; do not copy this pattern." Files: bug-2649, bug-2687, bug-2796, bug-2838, bug-2943, graphify, hooks-opt-in, security-scan. Verification: lint 0 violations across 348 test files; full suite passes. * fix(#2969): rename exemption category to pending-migration-to-typed-ir + cite tracking issue Per maintainer feedback: 1. "Grandfathered" / "legacy" framing is wrong — both terms imply permanent or condoned exemption. The 8 files are tracked for correction, not exempted. 2. Each annotated file must cite the tracking issue so the migration work is auditable. Changes: - CONTRIBUTING.md: rename exemption category from `pre-existing-text-matching` to `pending-migration-to-typed-ir`. Update prose to "Tracked for correction, not exempted" and require each annotation to cite the open migration issue (e.g. `// allow-test-rule: pending-migration-to-typed-ir [#NNNN]`). - 8 test files: update annotation to cite #2974 (the tracking issue opened for migrating these files to typed-IR assertions). |
||
|
|
aeef87de7f |
docs(test-standards): enforce no-source-grep rule with CI linter + CONTRIBUTING.md (#2700)
* docs(test-standards): enforce no-source-grep rule with CI linter + update CONTRIBUTING.md
Adds scripts/lint-no-source-grep.cjs — a static linter that detects readFileSync
on .cjs source files in tests without an allow-test-rule annotation. Wires it
into CI as a new lint-tests job in test.yml and as npm run lint:tests.
Resolves all 9 existing violations across the test suite:
- Rewrites workspace routing tests (3) as behavioral runGsdTools calls that
verify each command is router-recognized (exit != "Unknown init workflow")
- Adds allow-test-rule annotations with explanatory comments to 7 legitimate
structural tests: architectural invariants (locking, orphan-worktree),
structural regression guards (milestone-regex-global), docs-parity
(config-field-docs), integration-test-input (copilot-install), and
structural-implementation-guards (bug-1891, discuss-mode)
Updates CONTRIBUTING.md Testing Standards section with:
- "Prohibited: Source-Grep Tests" section with the before/after pattern,
root cause analysis of why it breaks (commit
|