worldmonitor/tests/resilience-dimension-freshness.test.mts at main

mirror of https://github.com/koala73/worldmonitor.git synced 2026-04-25 17:14:57 +02:00


Elie Habib c48ceea463 feat(resilience): PR 2 dimension wiring — split reserveAdequacy + add sovereignFiscalBuffer (#3324)

* feat(resilience): PR 2 dimension wiring — split reserveAdequacy + add sovereignFiscalBuffer

Plan §3.4 follow-up to #3305 + #3319. Lands the scorer + dimension
registration so the SWF seed from the Railway cron feeds a real score
once the bake-in window closes. No weight rebalance yet (separate
commit with Spearman sensitivity check), no health.js graduation yet
(7-day ON_DEMAND window per
feedback_health_required_key_needs_railway_cron_first.md), no
bootstrap wiring yet (follow-up PR).

Shape of the change

Retirement:
- reserveAdequacy joins fuelStockDays in RESILIENCE_RETIRED_DIMENSIONS.
The legacy scorer now mirrors scoreFuelStockDays: returns
coverage=0 / imputationClass=null so the dimension is filtered out
of the confidence / coverage averages via the registry filter in
computeLowConfidence, computeOverallCoverage, and the widget's
formatResilienceConfidence. Kept in RESILIENCE_DIMENSION_ORDER for
structural continuity (tests, cached payload shape, registry
membership). Indicator registry tier demoted to 'experimental'.
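
The retirement pattern above can be sketched roughly as follows. This is an illustration only, not the repository's code: the `DimensionScore` shape and the `scoreRetired` and `averageCoverage` helpers are hypothetical names. Only the dimension IDs and the coverage=0 / imputationClass=null stub shape come from this commit message.

```typescript
// Hypothetical result shape; the real type in the repository may differ.
interface DimensionScore {
  id: string;
  score: number;
  coverage: number;
  imputationClass: string | null;
}

// Retired dimension IDs named in the commit message.
const RETIRED_DIMENSIONS: ReadonlySet<string> = new Set<string>([
  "fuelStockDays",
  "reserveAdequacy",
]);

// A retired scorer returns a fixed stub: zero coverage, no imputation class.
function scoreRetired(id: string): DimensionScore {
  return { id, score: 0, coverage: 0, imputationClass: null };
}

// Average coverage over active dimensions only, so that retired stubs
// stay in the payload but do not pull the visible coverage reading down.
function averageCoverage(dims: readonly DimensionScore[]): number {
  const active = dims.filter((d) => !RETIRED_DIMENSIONS.has(d.id));
  if (active.length === 0) return 0;
  return active.reduce((sum, d) => sum + d.coverage, 0) / active.length;
}
```

Filtering at the averaging step, rather than deleting the dimension, is what lets the retired IDs remain in the ordered list and the cached payload shape.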

Two new active dimensions:
- liquidReserveAdequacy (replaces the liquid-reserves half of the
retired reserveAdequacy). Same source (WB FI.RES.TOTL.MO, total
reserves in months of imports) but re-anchored 1..12 months
instead of 1..18. Twelve months ≈ IMF "full reserve adequacy"
benchmark for a diversified emerging-market importer — the tighter
ceiling prevents wealthy commodity-exporters from claiming outsized
credit for on-paper reserve stocks that are not the relevant
shock-absorption buffer.
- sovereignFiscalBuffer. Reads resilience:recovery:sovereign-wealth:v1
(populated by scripts/seed-sovereign-wealth.mjs, landed in #3305 +
wired into Railway cron in #3319). Computes the saturating
transform:
    effectiveMonths = Σ [ aum/annualImports × 12 × access × liquidity × transparency ]
    score           = 100 × (1 − exp(−effectiveMonths / 12))
Exponential saturation prevents Norway-type outliers (effective
months in the 100s) from dominating the recovery pillar.
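
A minimal sketch of the saturating transform, written directly from the two formulas above. The `FundInput` shape, the field names, and the reading of access, liquidity and transparency as 0..1 multipliers are assumptions for illustration; the actual scorer may structure its inputs differently.

```typescript
// Hypothetical per-fund record; field names are illustrative only.
interface FundInput {
  aum: number;          // assets under management, same currency as annualImports
  access: number;       // assumed 0..1 multiplier
  liquidity: number;    // assumed 0..1 multiplier
  transparency: number; // assumed 0..1 multiplier
}

// effectiveMonths = sum over funds of aum/annualImports * 12 * access * liquidity * transparency
function effectiveMonths(funds: readonly FundInput[], annualImports: number): number {
  if (annualImports <= 0) return 0; // assumption: guard against a degenerate denominator
  return funds.reduce(
    (sum, f) => sum + (f.aum / annualImports) * 12 * f.access * f.liquidity * f.transparency,
    0,
  );
}

// score = 100 * (1 - exp(-effectiveMonths / 12))
function saturatingScore(months: number): number {
  return 100 * (1 - Math.exp(-months / 12));
}
```

Under this formula 12 effective months score about 63.2, while 120 effective months score about 99.995, so a tenfold larger buffer adds fewer than 37 points.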

Three code paths in scoreSovereignFiscalBuffer:
1. Seed key absent entirely → IMPUTE.recoverySovereignFiscalBuffer
(score 50 / coverage 0.3 / unmonitored). Covers the Railway-cron
bake-in window before the first successful tick.
2. Seed present, country NOT in manifest → score=0 with FULL coverage.
Substantive absence, NOT imputation — per plan §3.4 "What happens
to no-SWF countries." 0 × weight = 0 in the numerator, so the
country correctly scores lower than SWF-holding peers on this dim.
3. Seed present, country in payload → saturating score, coverage
derated by the partial-seed completeness signal (so a Mubadala or
Temasek scrape drift on a multi-fund country shows up as lower
confidence rather than a silently-understated total).
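
The three branches can be sketched as below. This is a hedged outline, not the real `scoreSovereignFiscalBuffer`: the `SwfSeed` payload shape, the `completeness` field, and the choice to use completeness directly as coverage are assumptions. The imputed values (score 50, coverage 0.3) and the `totalEffectiveMonths` field name are taken from this commit message.

```typescript
interface ScoreResult {
  score: number;
  coverage: number;
  imputed: boolean;
}

// Hypothetical seed payload shape, keyed by country code.
interface SwfCountryEntry {
  totalEffectiveMonths: number;
  completeness: number; // assumed 0..1 share of the country's funds scraped successfully
}
interface SwfSeed {
  countries: Record<string, SwfCountryEntry>;
}

function scoreSovereignFiscalBufferSketch(seed: SwfSeed | null, country: string): ScoreResult {
  // Path 1: seed key absent entirely, so fall back to the imputed placeholder.
  if (seed === null) return { score: 50, coverage: 0.3, imputed: true };

  const entry: SwfCountryEntry | undefined = seed.countries[country];

  // Path 2: seed present but country not listed. Substantive absence: score 0, full coverage.
  if (entry === undefined) return { score: 0, coverage: 1, imputed: false };

  // Path 3: saturating score, with coverage derated by seed completeness.
  const score = 100 * (1 - Math.exp(-entry.totalEffectiveMonths / 12));
  return { score, coverage: entry.completeness, imputed: false };
}
```

Keeping paths 1 and 2 distinct matters because they look alike in the data (no entry for the country) but mean different things: "we have not looked yet" versus "we looked and there is no fund".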

Indicator registry:
- Demoted recoveryReserveMonths (tied to retired reserveAdequacy) to
tier='experimental'.
- Added recoveryLiquidReserveMonths: WB FI.RES.TOTL.MO, anchors 1..12,
tier='core', coverage=188.
- Added recoverySovereignWealthEffectiveMonths: the new SWF signal,
tier='experimental' for now because the manifest only has 8 funds
(below the 180-core / 137-§3.6-gate threshold). Graduating to 'core'
requires expanding the manifest past ~137 entries — a later PR.
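
To illustrate what moving the anchors from 1..18 to 1..12 does to a score, here is a sketch that assumes a clamped linear scale between the two anchors. The commit message does not show the repository's normalization function, so `normalizeBetweenAnchors` is a hypothetical helper and the linear shape is an assumption.

```typescript
// Hypothetical helper: maps a raw value onto 0..100 between a floor and a
// ceiling anchor, clamping anything outside the range.
function normalizeBetweenAnchors(value: number, floor: number, ceiling: number): number {
  if (ceiling <= floor) throw new Error("ceiling must exceed floor");
  const fraction = (value - floor) / (ceiling - floor);
  return 100 * Math.min(1, Math.max(0, fraction));
}

// Twelve months of import cover under the old and new anchors.
const underOldAnchors = normalizeBetweenAnchors(12, 1, 18);
const underNewAnchors = normalizeBetweenAnchors(12, 1, 12);
```

With these assumptions, 12 months of import cover scores about 64.7 under the old 1..18 anchors and 100 under the new 1..12 anchors, and anything above 12 months earns no further credit.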

Tests updated

- resilience-release-gate: 19→21 dim count; RETIRED_DIMENSIONS
allowlist now includes reserveAdequacy alongside fuelStockDays.
- resilience-dimension-scorers: scoreReserveAdequacy monotonicity +
"high reserves score well" tests migrated to
scoreLiquidReserveAdequacy (same source, new 1..12 anchor). New
retirement-shape test for scoreReserveAdequacy mirroring the PR 3
fuelStockDays retirement test. Four new scorer tests pin the three
code paths of scoreSovereignFiscalBuffer: absent seed, no-SWF
country, and SWF country, plus the partial-completeness derate
within the third path.
- resilience-scorers fixture: baseline 60.12→60.35, recovery-domain
flat mean 47.33→48.75, overall 63.27→63.6. Each number commented
with the driver (split adds liquidReserveAdequacy 18@1.0 +
sovereignFiscalBuffer 50@0.3 at IMPUTE; retired reserveAdequacy
drops out).
- resilience-dimension-monotonicity: target scoreLiquidReserveAdequacy
instead of scoreReserveAdequacy.
- resilience-handlers: response-shape dim count 19→21.
- resilience-indicator-registry: coverage 19→21 dimensions.
- resilience-dimension-freshness: allowlisted the new sovereign-wealth
seed-meta key in KNOWN_SEEDS_NOT_IN_HEALTH for the ON_DEMAND window.
- resilience-methodology-lint HEADING_TO_DIMENSION: added the two new
heading mappings. Methodology doc gets H4 sections for Liquid
Reserve Adequacy and Sovereign Fiscal Buffer; Reserve Adequacy
section is annotated as retired.
- resilience-retired-dimensions-parity: client-side
RESILIENCE_RETIRED_DIMENSION_IDS gets reserveAdequacy. Parser
upgraded to strip inline `// …` comments from the array body so a
future reviewer can drop a rationale next to an entry without
breaking parity.
- resilience-confidence-averaging: fixture updated to include both
retired dims (reserveAdequacy + fuelStockDays) — confirms the
registry filter correctly excludes BOTH from the visible coverage
reading.
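
The comment-tolerant parsing mentioned for resilience-retired-dimensions-parity could look something like the sketch below. The function name `parseIdArray` and the regex approach are assumptions; the actual test may parse the client source differently. The sketch strips `//` comments line by line before locating the array, so it would also eat a `//` inside a string literal, which is acceptable only because the array holds plain identifiers.

```typescript
// Hypothetical helper: extracts the string literals of a named array
// declaration from source text, ignoring inline `// ...` comments.
// constName is assumed to be a plain identifier (it is not regex-escaped).
function parseIdArray(source: string, constName: string): string[] {
  // Drop inline comments first so brackets or quotes inside a rationale
  // cannot confuse the array match below.
  const stripped = source
    .split("\n")
    .map((line) => line.replace(/\/\/.*$/, ""))
    .join("\n");

  const match = stripped.match(new RegExp(`${constName}[^=]*=\\s*\\[([\\s\\S]*?)\\]`));
  if (match === null) throw new Error(`array ${constName} not found`);

  const ids: string[] = [];
  const literal = /['"]([^'"]+)['"]/g;
  let m: RegExpExecArray | null;
  while ((m = literal.exec(match[1])) !== null) {
    ids.push(m[1]);
  }
  return ids;
}
```

A parity test can then compare the parsed client-side list against the server-side retired set and fail on any difference.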

Extraction harness (scripts/compare-resilience-current-vs-proposed.mjs):
- recoveryLiquidReserveMonths: reads the same reserve-adequacy seed
field as recoveryReserveMonths.
- recoverySovereignWealthEffectiveMonths: reads the new SWF seed key
on field totalEffectiveMonths. Absent-payload → 0 for correlation
math (matches the substantive-no-SWF scorer branch).

Out of scope for this commit (follow-ups)

- Recovery-domain weight rebalance + Spearman sensitivity rerun
against the PR 0 baseline.
- health.js graduation (SEED_META entry + ON_DEMAND_KEYS removal) once
Railway cron has ~7 days of clean runs.
- api/bootstrap.js wiring once an RPC consumer needs the SWF data.
- Manifest expansion past 137 countries so sovereignFiscalBuffer can
graduate from tier='experimental' to tier='core'.

Tests: 6573/6573 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on all touched files.

* fix(resilience): PR 2 review — add widget labels for new dimensions

P2 review finding on PR #3324. DIMENSION_LABELS in src/components/
resilience-widget-utils.ts covered only the old 19 dimension IDs, so
the two new active dims (liquidReserveAdequacy, sovereignFiscalBuffer)
would render with their raw internal IDs in the confidence grid for
every country once the scorer started emitting them. The widget test
at getResilienceDimensionLabel also asserted only the 19-label set,
so the gap would have shipped silently.

Fix: add user-facing short labels for both new dims. "Reserves" is
already claimed by the retired reserveAdequacy, so the replacement
disambiguates with "Liquid Reserves"; sovereignFiscalBuffer →
"Sovereign Wealth" per the methodology doc H4 heading.

Also added a regression guard — new test asserts EVERY id in
RESILIENCE_DIMENSION_ORDER resolves to a non-id label. Any future
dimension that ships without a matching DIMENSION_LABELS entry now
fails CI loudly instead of leaking the ID into the UI.
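
The regression guard can be sketched as follows. The names `DIMENSION_ORDER`, `getLabel` and `findUnlabeled` are stand-ins invented for this example, and the order list is truncated to three IDs. The label strings are the ones named in this commit message, and the fall-back-to-raw-ID behaviour is inferred from the description of the bug.

```typescript
// Truncated stand-in for RESILIENCE_DIMENSION_ORDER.
const DIMENSION_ORDER: readonly string[] = [
  "reserveAdequacy",
  "liquidReserveAdequacy",
  "sovereignFiscalBuffer",
];

// Stand-in for DIMENSION_LABELS, using the labels named in the commit message.
const LABELS: Record<string, string> = {
  reserveAdequacy: "Reserves",
  liquidReserveAdequacy: "Liquid Reserves",
  sovereignFiscalBuffer: "Sovereign Wealth",
};

// Assumed behaviour: an unknown ID falls back to the raw ID itself.
function getLabel(id: string): string {
  return LABELS[id] ?? id;
}

// Returns every ID whose label is just the ID, i.e. has no real label.
function findUnlabeled(ids: readonly string[]): string[] {
  return ids.filter((id) => getLabel(id) === id);
}
```

A test that asserts `findUnlabeled(DIMENSION_ORDER)` is empty fails as soon as a new dimension is added to the order list without a matching label.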

Tests: 502/502 resilience tests pass (+1 new coverage check).
Typecheck clean on both configs.

* fix(resilience): PR 2 review — remove dead IMPUTE.recoveryReserveAdequacy entry

Greptile P2: the retired scoreReserveAdequacy stub no longer reads
from IMPUTE (it hardcodes coverage=0 / imputationClass=null per the
retirement pattern), making IMPUTE.recoveryReserveAdequacy dead code.
Removed the entry + added a breadcrumb comment pointing at the
replacement IMPUTE.recoveryLiquidReserveAdequacy.

The second P2 (bootstrap.js not wired) is a deliberate non-goal: the
reviewer explicitly flagged it "for visibility" only, since it is
already tracked in the PR body. No action in this commit; bootstrap
wiring lands alongside the SEED_META graduation after the ~7-day
Railway-cron bake-in.

Tests: 502/502 resilience tests still pass. Typecheck clean.

2026-04-23 09:01:30 +04:00

21 KiB
