mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
test(resilience): address PR #2944 review (exhaustive taxonomy sanity check)
Addresses the P2 comment from Greptile on PR #2944:

> "The invariant is verified only for IMPUTATION.crisis_monitoring_absent
> vs IMPUTATION.curated_list_absent, so a future IMPUTE entry that
> violates the ordering (e.g. a stable-absence override with a score
> below an unmonitored one) would not be caught here. Extending the
> loop to group all stable-absence and unmonitored entries from both
> tables and assert the min stable-absence > max unmonitored would
> make the guard exhaustive."

Fair. The earlier test only pinned the two base entries, so a
regression in a per-metric IMPUTE override (the exact layer most
likely to drift over time) would have slipped through CI.

This commit rewrites the semantic-sanity test to:

1. Collect every entry from both IMPUTATION and IMPUTE into a single
   labeled list.
2. Partition by imputationClass.
3. Assert min stable-absence score > max unmonitored score.
4. Assert min stable-absence certaintyCoverage > max unmonitored
   certaintyCoverage.
5. Report the offending entries in the failure message so a reviewer
   who hits the failure immediately sees which entry broke the
   invariant and from which table.

Under the current values the invariant holds:

- stable-absence scores: IMPUTATION.crisis_monitoring_absent=85,
  IMPUTE.ipcFood=88, IMPUTE.unhcrDisplacement=85.
- unmonitored scores: IMPUTATION.curated_list_absent=50,
  IMPUTE.wtoData=60, IMPUTE.bisEer=50, IMPUTE.bisCredit=50.
- min stable-absence = 85 > max unmonitored = 60, so the check passes
  with a 25-point gap.

Testing:

- npx tsx --test tests/resilience-dimension-scorers.test.mts: 51/51 pass
  (46 existing + 5 taxonomy tests, including the rewritten
  semantic-sanity check)
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code + Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -755,18 +755,47 @@ describe('resilience imputation taxonomy (T1.7)', () => {
     }
   });

-  it('stable-absence entries score higher than unmonitored (semantic sanity)', () => {
+  it('stable-absence entries score higher than unmonitored, across BOTH tables (semantic sanity)', () => {
     // stable-absence = strong positive signal (feed is comprehensive,
     // nothing happened). unmonitored = we do not know, penalized.
     // If this assertion ever fails, the semantic meaning of the classes
     // has drifted and the taxonomy needs to be re-argued.
+    // The invariant must hold across every entry in both IMPUTATION and
+    // IMPUTE, otherwise a per-metric override can silently break the
+    // ordering (e.g. a `stable-absence` override with a score lower than
+    // an `unmonitored` entry would pass a tables-only check but violate
+    // the taxonomy's semantic meaning).
+    //
+    // Raised in review of PR #2944: the earlier version of this test
+    // only checked the two base entries in IMPUTATION and would have
+    // missed a regression in an IMPUTE override.
+    const allEntries = [
+      ...Object.entries(IMPUTATION).map(([k, v]) => ({ label: `IMPUTATION.${k}`, entry: v })),
+      ...Object.entries(IMPUTE).map(([k, v]) => ({ label: `IMPUTE.${k}`, entry: v })),
+    ];
+
+    const stableAbsence = allEntries.filter((e) => e.entry.imputationClass === 'stable-absence');
+    const unmonitored = allEntries.filter((e) => e.entry.imputationClass === 'unmonitored');
+
+    assert.ok(stableAbsence.length > 0, 'expected at least one stable-absence entry across both tables');
+    assert.ok(unmonitored.length > 0, 'expected at least one unmonitored entry across both tables');
+
+    const minStableScore = Math.min(...stableAbsence.map((e) => e.entry.score));
+    const maxUnmonitoredScore = Math.max(...unmonitored.map((e) => e.entry.score));
     assert.ok(
-      IMPUTATION.crisis_monitoring_absent.score > IMPUTATION.curated_list_absent.score,
-      `stable-absence score (${IMPUTATION.crisis_monitoring_absent.score}) should be higher than unmonitored (${IMPUTATION.curated_list_absent.score})`,
+      minStableScore > maxUnmonitoredScore,
+      `every stable-absence entry must score higher than every unmonitored entry. ` +
+        `min stable-absence score = ${minStableScore}, max unmonitored score = ${maxUnmonitoredScore}. ` +
+        `stable-absence entries: ${stableAbsence.map((e) => `${e.label}=${e.entry.score}`).join(', ')}. ` +
+        `unmonitored entries: ${unmonitored.map((e) => `${e.label}=${e.entry.score}`).join(', ')}.`,
     );
+
+    const minStableCertainty = Math.min(...stableAbsence.map((e) => e.entry.certaintyCoverage));
+    const maxUnmonitoredCertainty = Math.max(...unmonitored.map((e) => e.entry.certaintyCoverage));
     assert.ok(
-      IMPUTATION.crisis_monitoring_absent.certaintyCoverage > IMPUTATION.curated_list_absent.certaintyCoverage,
-      `stable-absence certainty (${IMPUTATION.crisis_monitoring_absent.certaintyCoverage}) should be higher than unmonitored (${IMPUTATION.curated_list_absent.certaintyCoverage})`,
+      minStableCertainty > maxUnmonitoredCertainty,
+      `every stable-absence entry must have higher certaintyCoverage than every unmonitored entry. ` +
+        `min stable-absence certainty = ${minStableCertainty}, max unmonitored certainty = ${maxUnmonitoredCertainty}. ` +
+        `stable-absence entries: ${stableAbsence.map((e) => `${e.label}=${e.entry.certaintyCoverage}`).join(', ')}. ` +
+        `unmonitored entries: ${unmonitored.map((e) => `${e.label}=${e.entry.certaintyCoverage}`).join(', ')}.`,
     );
   });
 });
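
For readers who want to try the invariant outside the test suite, here is a minimal standalone sketch of the score half of the check. It is not the repository's code: the `Entry` and `Labeled` types, the `label` and `checkOrdering` helpers, and the regressed `ipcFood` value of 55 are all hypothetical, and the real tables also carry `certaintyCoverage` (omitted here because the commit message does not list those values). The scores are the ones quoted in the commit message.

```typescript
// Hypothetical, simplified entry shape; the real entries have more fields.
type ImputationClass = 'stable-absence' | 'unmonitored';
interface Entry { score: number; imputationClass: ImputationClass }
interface Labeled { label: string; entry: Entry }

// Base entries, scores taken from the commit message.
const crisisMonitoringAbsent: Entry = { score: 85, imputationClass: 'stable-absence' };
const curatedListAbsent: Entry = { score: 50, imputationClass: 'unmonitored' };

const IMPUTATION: Record<string, Entry> = {
  crisis_monitoring_absent: crisisMonitoringAbsent,
  curated_list_absent: curatedListAbsent,
};

// Per-metric overrides, scores taken from the commit message.
const IMPUTE: Record<string, Entry> = {
  ipcFood: { score: 88, imputationClass: 'stable-absence' },
  unhcrDisplacement: { score: 85, imputationClass: 'stable-absence' },
  wtoData: { score: 60, imputationClass: 'unmonitored' },
  bisEer: { score: 50, imputationClass: 'unmonitored' },
  bisCredit: { score: 50, imputationClass: 'unmonitored' },
};

// Tag each entry with its table of origin so a failure can name the culprit.
function label(tableName: string, table: Record<string, Entry>): Labeled[] {
  return Object.entries(table).map(([key, entry]) => ({ label: `${tableName}.${key}`, entry }));
}

// Exhaustive check: min stable-absence score must exceed max unmonitored score.
function checkOrdering(entries: Labeled[]) {
  const stable = entries.filter((e) => e.entry.imputationClass === 'stable-absence');
  const unmonitored = entries.filter((e) => e.entry.imputationClass === 'unmonitored');
  const minStable = Math.min(...stable.map((e) => e.entry.score));
  const maxUnmonitored = Math.max(...unmonitored.map((e) => e.entry.score));
  return { ok: minStable > maxUnmonitored, minStable, maxUnmonitored };
}

const current = checkOrdering([...label('IMPUTATION', IMPUTATION), ...label('IMPUTE', IMPUTE)]);

// Hypothetical regression: a stable-absence override drops below wtoData (60).
const regressed = checkOrdering([
  ...label('IMPUTATION', IMPUTATION),
  ...label('IMPUTE', { ...IMPUTE, ipcFood: { score: 55, imputationClass: 'stable-absence' } }),
]);

// The earlier base-pair-only comparison never looks at IMPUTE at all.
const oldCheckPasses = crisisMonitoringAbsent.score > curatedListAbsent.score;

console.log(current, regressed, oldCheckPasses);
```

With the quoted values, `current` reports the 85 vs 60 gap described in the commit message. In the hypothetical regression the exhaustive check fails, while the base-pair comparison still passes because it does not depend on IMPUTE; that is the gap the review comment pointed out.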