test(resilience): address PR #2944 review (exhaustive taxonomy sanity check)

Addresses the P2 comment from Greptile on PR #2944:

> "The invariant is verified only for IMPUTATION.crisis_monitoring_absent
>  vs IMPUTATION.curated_list_absent, so a future IMPUTE entry that
>  violates the ordering (e.g. a stable-absence override with a score
>  below an unmonitored one) would not be caught here. Extending the
>  loop to group all stable-absence and unmonitored entries from both
>  tables and assert the min stable-absence > max unmonitored would
>  make the guard exhaustive."

Fair. The earlier test only pinned the two base entries, so a
regression in a per-metric IMPUTE override (the exact layer most
likely to drift over time) would have slipped through CI. This
commit rewrites the semantic-sanity test to:

1. Collect every entry from both IMPUTATION and IMPUTE into a
   single labeled list.
2. Partition by imputationClass.
3. Assert min stable-absence score > max unmonitored score.
4. Assert min stable-absence certaintyCoverage > max unmonitored
   certaintyCoverage.
5. Report the offending entries in the failure message so a
   reviewer who hits the failure immediately sees which entry broke
   the invariant and from which table.

Under the current values the invariant holds:
- stable-absence scores: IMPUTATION.crisis_monitoring_absent=85,
  IMPUTE.ipcFood=88, IMPUTE.unhcrDisplacement=85.
- unmonitored scores: IMPUTATION.curated_list_absent=50,
  IMPUTE.wtoData=60, IMPUTE.bisEer=50, IMPUTE.bisCredit=50.
- min stable-absence = 85 > max unmonitored = 60, so the check
  passes with a 25-point gap.
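
The arithmetic above can be reproduced in isolation with a minimal
sketch. It assumes hypothetical, trimmed-down stand-ins for the two
tables that carry only the imputationClass and score values quoted in
this message; the Entry type and the checkOrdering helper are
illustrative names, not part of the real module, and certaintyCoverage
is left out because its values are not listed here.

```typescript
// Hypothetical stand-ins: only the fields and values quoted in this
// commit message are reproduced. The real tables may hold more entries.
type ImputationClass = 'stable-absence' | 'unmonitored';
interface Entry {
  imputationClass: ImputationClass;
  score: number;
}

const IMPUTATION: Record<string, Entry> = {
  crisis_monitoring_absent: { imputationClass: 'stable-absence', score: 85 },
  curated_list_absent: { imputationClass: 'unmonitored', score: 50 },
};

const IMPUTE: Record<string, Entry> = {
  ipcFood: { imputationClass: 'stable-absence', score: 88 },
  unhcrDisplacement: { imputationClass: 'stable-absence', score: 85 },
  wtoData: { imputationClass: 'unmonitored', score: 60 },
  bisEer: { imputationClass: 'unmonitored', score: 50 },
  bisCredit: { imputationClass: 'unmonitored', score: 50 },
};

// Illustrative helper: pools every entry from every table, partitions by
// class, and compares the weakest stable-absence score against the
// strongest unmonitored score.
function checkOrdering(tables: Record<string, Record<string, Entry>>) {
  const all: { label: string; entry: Entry }[] = [];
  for (const [tableName, table] of Object.entries(tables)) {
    for (const [key, entry] of Object.entries(table)) {
      all.push({ label: `${tableName}.${key}`, entry });
    }
  }

  const stable = all.filter((e) => e.entry.imputationClass === 'stable-absence');
  const unmonitored = all.filter((e) => e.entry.imputationClass === 'unmonitored');

  // Math.min() / Math.max() over an empty list return Infinity / -Infinity,
  // which would make the comparison pass vacuously, so reject empty groups.
  if (stable.length === 0 || unmonitored.length === 0) {
    throw new Error('expected at least one entry in each class');
  }

  const minStable = Math.min(...stable.map((e) => e.entry.score));
  const maxUnmonitored = Math.max(...unmonitored.map((e) => e.entry.score));
  return { minStable, maxUnmonitored, gap: minStable - maxUnmonitored };
}

const result = checkOrdering({ IMPUTATION, IMPUTE });
console.log(result); // { minStable: 85, maxUnmonitored: 60, gap: 25 }
```

Comparing the minimum of one group against the maximum of the other is
equivalent to checking every stable-absence/unmonitored pair, but keeps
the failure message down to two numbers plus the labeled entry lists.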

Testing:
- npx tsx --test tests/resilience-dimension-scorers.test.mts: 51/51
  pass (46 existing + 5 taxonomy tests, including the rewritten
  semantic-sanity check)
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Elie Habib
2026-04-11 14:26:33 +04:00
parent 61ca872b9a
commit 0f3d05ac04


@@ -755,18 +755,47 @@ describe('resilience imputation taxonomy (T1.7)', () => {
     }
   });
-  it('stable-absence entries score higher than unmonitored (semantic sanity)', () => {
+  it('stable-absence entries score higher than unmonitored, across BOTH tables (semantic sanity)', () => {
     // stable-absence = strong positive signal (feed is comprehensive,
     // nothing happened). unmonitored = we do not know, penalized.
     // If this assertion ever fails, the semantic meaning of the classes
     // has drifted and the taxonomy needs to be re-argued.
+    // The invariant must hold across every entry in both IMPUTATION and
+    // IMPUTE, otherwise a per-metric override can silently break the
+    // ordering (e.g. a `stable-absence` override with a score lower than
+    // an `unmonitored` entry would pass a tables-only check but violate
+    // the taxonomy's semantic meaning).
+    //
+    // Raised in review of PR #2944: the earlier version of this test
+    // only checked the two base entries in IMPUTATION and would have
+    // missed a regression in an IMPUTE override.
+    const allEntries = [
+      ...Object.entries(IMPUTATION).map(([k, v]) => ({ label: `IMPUTATION.${k}`, entry: v })),
+      ...Object.entries(IMPUTE).map(([k, v]) => ({ label: `IMPUTE.${k}`, entry: v })),
+    ];
+    const stableAbsence = allEntries.filter((e) => e.entry.imputationClass === 'stable-absence');
+    const unmonitored = allEntries.filter((e) => e.entry.imputationClass === 'unmonitored');
+    assert.ok(stableAbsence.length > 0, 'expected at least one stable-absence entry across both tables');
+    assert.ok(unmonitored.length > 0, 'expected at least one unmonitored entry across both tables');
+    const minStableScore = Math.min(...stableAbsence.map((e) => e.entry.score));
+    const maxUnmonitoredScore = Math.max(...unmonitored.map((e) => e.entry.score));
     assert.ok(
-      IMPUTATION.crisis_monitoring_absent.score > IMPUTATION.curated_list_absent.score,
-      `stable-absence score (${IMPUTATION.crisis_monitoring_absent.score}) should be higher than unmonitored (${IMPUTATION.curated_list_absent.score})`,
+      minStableScore > maxUnmonitoredScore,
+      `every stable-absence entry must score higher than every unmonitored entry. ` +
+        `min stable-absence score = ${minStableScore}, max unmonitored score = ${maxUnmonitoredScore}. ` +
+        `stable-absence entries: ${stableAbsence.map((e) => `${e.label}=${e.entry.score}`).join(', ')}. ` +
+        `unmonitored entries: ${unmonitored.map((e) => `${e.label}=${e.entry.score}`).join(', ')}.`,
     );
+    const minStableCertainty = Math.min(...stableAbsence.map((e) => e.entry.certaintyCoverage));
+    const maxUnmonitoredCertainty = Math.max(...unmonitored.map((e) => e.entry.certaintyCoverage));
     assert.ok(
-      IMPUTATION.crisis_monitoring_absent.certaintyCoverage > IMPUTATION.curated_list_absent.certaintyCoverage,
-      `stable-absence certainty (${IMPUTATION.crisis_monitoring_absent.certaintyCoverage}) should be higher than unmonitored (${IMPUTATION.curated_list_absent.certaintyCoverage})`,
+      minStableCertainty > maxUnmonitoredCertainty,
+      `every stable-absence entry must have higher certaintyCoverage than every unmonitored entry. ` +
+        `min stable-absence certainty = ${minStableCertainty}, max unmonitored certainty = ${maxUnmonitoredCertainty}. ` +
+        `stable-absence entries: ${stableAbsence.map((e) => `${e.label}=${e.entry.certaintyCoverage}`).join(', ')}. ` +
+        `unmonitored entries: ${unmonitored.map((e) => `${e.label}=${e.entry.certaintyCoverage}`).join(', ')}.`,
     );
   });
 });