mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
feat(resilience): PR 3 — dead-signal cleanup (plan §3.5, §3.6) (#3297)
* feat(resilience): PR 3 §3.5 — retire fuelStockDays from core score permanently
First commit in PR 3 of the resilience repair plan. Retires
`fuelStockDays` from the core score with no replacement.
Why permanent, not replaced:
IEA emergency-stockholding rules are defined in days of NET IMPORTS
and do not bind net exporters by design. Norway/Canada/US measured
in days-of-imports are incomparable to Germany/Japan measured the
same way — the construct is fundamentally different across the two
country classes. No globally-comparable recovery-fuel signal can
be built from this source; the pre-repair probe showed 100% imputed
at 50 for every country in the April 2026 freeze.
scoreFuelStockDays:
- Rewritten to return coverage=0 + observedWeight=0 +
imputationClass='source-failure' for every country regardless
of seed content.
- Drops the dimension from the `recovery` domain's coverage-
weighted mean automatically; remaining recovery dimensions
pick up the share via re-normalisation in
`_shared.ts#coverageWeightedMean`.
- No explicit weight transfer needed — the coverage-weighted
blend handles redistribution.
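A minimal sketch of why no explicit weight transfer is needed, assuming the coverage-weighted mean weights each dimension by its coverage. The `DimensionScore` shape and the sample scores are hypothetical; the real helper in `_shared.ts` may differ in detail.

```typescript
// Hypothetical shape; the real type may carry more fields.
interface DimensionScore {
  id: string;
  score: number; // 0..100
  coverage: number; // 0..1
}

// Each dimension is weighted by its coverage, so a dimension pinned at
// coverage=0 adds nothing to the numerator or the denominator.
function coverageWeightedMean(dims: DimensionScore[]): number {
  let weighted = 0;
  let total = 0;
  for (const d of dims) {
    weighted += d.score * d.coverage;
    total += d.coverage;
  }
  return total > 0 ? weighted / total : 0;
}

// Illustrative values only.
const recovery: DimensionScore[] = [
  { id: "externalDebtCoverage", score: 70, coverage: 1 },
  { id: "importConcentration", score: 60, coverage: 1 },
  { id: "fuelStockDays", score: 50, coverage: 0 }, // retired
];

const recoveryMean = coverageWeightedMean(recovery);
const meanWithoutRetired = coverageWeightedMean(
  recovery.filter((d) => d.id !== "fuelStockDays"),
);
console.log(recoveryMean, meanWithoutRetired);
```

Under this assumption the retired dimension's imputed score of 50 has no effect on the domain mean, and the remaining dimensions share the full weight.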
Registry:
- recoveryFuelStockDays re-tagged from tier='enrichment' to
tier='experimental' so the Core coverage gate treats it as
out-of-score.
- Description updated to make the retirement explicit; entry
stays in the registry for structural continuity (the
dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER
for the 19-dimension tests; removing the dimension entirely is
a PR 4 structural-audit concern).
Housekeeping:
- Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no
longer read; noUnusedLocals would reject it).
- Removed `RecoveryFuelStocksCountry` interface for the same
reason. Comment at the removed declaration instructs future
maintainers not to re-add the type as a reservation; when a
new recovery-fuel concept lands, introduce a fresh interface.
Plan reference: §3.5 point 1 of
`docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.
51 resilience tests pass, typecheck + biome clean. The
`recovery` domain's published score will shift slightly for every
country because the 0.10 slot that fuelStockDays was imputing to
now redistributes; the compare-harness acceptance-gate rerun at
merge time will quantify the shift per plan §6 gates.
* feat(resilience): PR 3 §3.5 — retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves
BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35
fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60
countries max — so every non-BIS country fell through to
curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45).
Combined with reserveMarginPct already removed in PR 1, currencyExternal
was the clearest "construct absent for most of the world" carrier left
in the scorer.
Changes:
_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX
reserves only. Coverage ladder:
    inflation + reserves → 0.85 (observed primary + secondary)
    inflation only       → 0.55
    reserves only        → 0.40
    neither              → 0.30 (IMPUTE.bisEer retained for snapshot
                                 continuity; its semantics now read as
                                 "no IMF + no WB reserves")
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved
via comment only, flagged by noUnusedLocals), stddev() helper,
getCountryBisExchangeRates() loader, BisExchangeRate interface,
dateToSortableNumber() — all were exclusive callers of the retired
BIS path.
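The coverage ladder above can be sketched as a small pure function. The function name and boolean parameters are hypothetical; only the four coverage values come from this commit.

```typescript
// Hypothetical helper: maps data availability to the coverage values
// listed in the ladder above.
function currencyExternalCoverage(
  hasInflation: boolean,
  hasReserves: boolean,
): number {
  if (hasInflation && hasReserves) return 0.85; // observed primary + secondary
  if (hasInflation) return 0.55; // IMF inflation only
  if (hasReserves) return 0.4; // WB reserves only
  return 0.3; // neither source present: imputed
}

console.log(currencyExternalCoverage(true, true));
console.log(currencyExternalCoverage(false, false));
```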
_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core,
sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability
anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental
(BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).
scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability →
imf-macro-country-field field=inflationPct so the registry-parity
test passes and the correlation harness sees the new construct.
tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266
(BIS-outage coverage 0.35 branch) which collapsed to the inflation-
only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55; inflation+
reserves 0.55 → 0.85.
tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score
shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).
All 6324 data-tier tests pass. typecheck:api clean. No change to
seeders or Redis keys; this is a pure scorer + registry rebuild.
* feat(resilience): PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)
Plan §2.1 diagnosis table showed externalDebtCoverage saturating at
score=100 across all 9 probe countries — including stressed states.
Signal was collapsed. Root cause: (worst=5, best=0) gave every country
with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's
reserve-adequacy threshold (ratio=1.0) to score 80 — well into "no
worry" territory instead of the "mild warning" it should be.
Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score
50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure).
Ratios above 2.0 clamp to 0, consistent with "beyond this point the
country is already in crisis; exact value stops mattering."
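The arithmetic behind the re-anchoring, as a sketch that assumes a plain linear, clamped, lower-is-better normalizer. The function name is hypothetical and the real normalizer in `_dimension-scorers.ts` may differ.

```typescript
// Linear lower-is-better normalization between goalposts, clamped to 0..100.
function normalizeLowerBetter(
  value: number,
  worst: number,
  best: number,
): number {
  const raw = ((worst - value) * 100) / (worst - best);
  return Math.min(100, Math.max(0, raw));
}

// Old goalposts (worst=5, best=0) vs new goalposts (worst=2, best=0).
const oldAtThreshold = normalizeLowerBetter(1.0, 5, 0); // Greenspan-Guidotti ratio
const newAtThreshold = normalizeLowerBetter(1.0, 2, 0);
const oldLowRatio = normalizeLowerBetter(0.2, 5, 0);
const newLowRatio = normalizeLowerBetter(0.2, 2, 0);
const newAboveRange = normalizeLowerBetter(4, 2, 0); // clamps to 0

console.log({ oldAtThreshold, newAtThreshold, oldLowRatio, newLowRatio, newAboveRange });
```

With these assumptions the threshold ratio of 1.0 moves from about 80 to about 50, and a ratio of 0.2 moves from about 96 to about 90, matching the fixture drift reported below.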
Files changed:
- _indicator-registry.ts: recoveryDebtToReserves goalposts
{worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to
cite Greenspan-Guidotti; inline comment documents anchor + rationale.
- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound
changed from (0..5) to (0..2), with inline comment.
- docs/methodology/country-resilience-index.mdx: goalpost table row
5-0 → 2-0, description cites Greenspan-Guidotti.
- docs/methodology/indicator-sources.yaml:
* constructStatus: dead-signal → observed-mechanism (signal is now
discriminating).
* reviewNotes updated to describe the new anchor.
* mechanismTestRationale names the Greenspan-Guidotti rule.
- tests/resilience-dimension-monotonicity.test.mts: updated the
comment + picked values inside the (0..2) discriminating band (0.3
and 1.5). Old values (1 vs 4) had 4 clamping to 0.
- tests/resilience-dimension-scorers.test.mts: Norway (NO) score threshold
relaxed from >90 to >=85 (Norway's ratio=0.2 now scores 90, was 96).
- tests/resilience-scorers.test.mts: fixture drift:
* domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
* baselineScore 63.63 → 60.12 (extDebt is baseline type).
* overallScore 65.52 → 63.27.
* stressScore / stressFactor unchanged (extDebt is baseline-only).
All 6324 data-tier tests pass. typecheck:api clean.
* feat(resilience): PR 3 §3.6 — CI gate on indicator coverage and nominal weight
Plan §3.6 adds a new acceptance criterion (also §5 item 5):
> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.
This commit enforces the NOMINAL-WEIGHT half as a unit test that runs
on every CI build. The EFFECTIVE-INFLUENCE half is produced by
scripts/validate-resilience-sensitivity.mjs as a committed artifact;
the gate file only asserts that script still exists so a refactor that
removes it breaks the build loudly.
Why the gate exists (plan §3.6):
"A dimension at 30% observed coverage carries the same effective
weight as one at 95%. This contradicts the OECD/JRC handbook on
uncertainty analysis."
Implementation:
tests/resilience-coverage-influence-gate.test.mts — three tests:
1. Nominal-weight gate: for every core indicator with coverage < 137
countries (70% of the ~195-country universe), computes its nominal
overall weight as
indicator.weight × (1/dimensions-in-domain) × domain-weight
and asserts it does not exceed 5%. Equal-share-per-dimension is
the *upper bound* on runtime weight (coverage-weighted mean gives
a lower share when a dimension drops out), so this is a strict
bound: if the nominal number passes, the runtime number also
passes for every country.
2. Effective-influence contract: asserts the sensitivity script
exists at its expected path. Removing it (intentionally or by
refactor) breaks the build.
3. Audit visibility: prints the top 10 core indicators by nominal
overall weight. No assertion beyond "ran" — the list lets
reviewers spot outliers that pass the gate but are near the cap.
Current state (observed from audit output):
recoveryReserveMonths: nominal=4.17% coverage=188
recoveryDebtToReserves: nominal=4.17% coverage=185
recoveryImportHhi: nominal=4.17% coverage=190
inflationStability: nominal=3.40% coverage=185
electricityConsumption: nominal=3.30% coverage=217
ucdpConflict: nominal=3.09% coverage=193
Every core indicator has coverage ≥ 180 (already enforced by the
pre-existing indicator-tiering test), so the nominal-weight gate has
no current violators — its purpose is catching future drift, not
flagging today's state.
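A sketch of the nominal-weight check applied to the audit values listed above. The interface, constant, and function names are hypothetical; the thresholds (137 countries, 5%) and the audit rows come from this commit message.

```typescript
interface CoreIndicatorAudit {
  id: string;
  nominalWeight: number; // share of the overall score, 0..1
  coverage: number; // number of countries with observed data
}

const MIN_COVERAGE_COUNTRIES = Math.ceil(0.7 * 195); // 137
const MAX_NOMINAL_WEIGHT = 0.05;

// An indicator violates the gate only if it is BOTH under-covered and over-weighted.
function gateViolations(indicators: CoreIndicatorAudit[]): CoreIndicatorAudit[] {
  return indicators.filter(
    (i) => i.coverage < MIN_COVERAGE_COUNTRIES && i.nominalWeight > MAX_NOMINAL_WEIGHT,
  );
}

const audit: CoreIndicatorAudit[] = [
  { id: "recoveryReserveMonths", nominalWeight: 0.0417, coverage: 188 },
  { id: "recoveryDebtToReserves", nominalWeight: 0.0417, coverage: 185 },
  { id: "recoveryImportHhi", nominalWeight: 0.0417, coverage: 190 },
  { id: "inflationStability", nominalWeight: 0.034, coverage: 185 },
  { id: "electricityConsumption", nominalWeight: 0.033, coverage: 217 },
  { id: "ucdpConflict", nominalWeight: 0.0309, coverage: 193 },
];

// Hypothetical regression: a low-coverage indicator given too much weight.
const drifted: CoreIndicatorAudit = { id: "hypotheticalIndicator", nominalWeight: 0.06, coverage: 60 };

console.log(gateViolations(audit).length);
console.log(gateViolations([...audit, drifted]).map((i) => i.id));
```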
All 6327 data-tier tests pass. typecheck:api clean.
* docs(resilience): PR 3 methodology doc — document §3.5 dead-signal retirements + §3.6 coverage gate
Methodology-doc update capturing the three §3.5 landings and the §3.6 CI
gate. Five edits:
1. **Known construct limitations section (#5 and #6):** strikethrough the
original "dead signals" and "no coverage-based weight cap" items,
annotate them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" +
specifics of what shipped.
2. **Currency & External H4 section:** completely rewritten. Old table
(fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is
replaced by the two-indicator post-PR-3 table (inflationStability at
0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out
(0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as
experimental-tier drill-downs only.
3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the
methodology-lint H4-to-dimension mapping does not break; body
rewritten to explain that the dimension is retired from core but the
seeder still runs for IEA-member drill-downs.
4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description
cites Greenspan-Guidotti reserve-adequacy rule.
5. **New v2.2 changelog entry** — PR 3 dead-signal cleanup, covering
§3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit
updates.
No scoring or code changes in this commit. Methodology-lint test passes
(H4 mapping intact). All 6327 data-tier tests pass.
* fix(resilience): PR 3 §3.6 gate — correct share-denominator for coverage-weighted aggregation
Reviewer catch (thanks). The previous gate computed each indicator's
nominal overall weight as
indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight
and claimed this was an upper bound ("actual runtime weight is ≤ this
when some dimensions drop out on coverage"). That is BACKWARDS for
this scorer.
The domain aggregation is coverage-weighted
(server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean),
so when a dimension pins at coverage=0 it is EXCLUDED from the
denominator and the surviving dimensions' shares go UP, not down.
PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to
coverage=0 for every country — so in the current live state the
recovery domain has 5 contributing dimensions (not 6), and each core
recovery indicator's nominal share is
1.0 × 1/5 × 0.25 = 5.00% (was mis-reported as 4.17%)
The old gate therefore under-estimated nominal influence and could
silently pass exactly the kind of low-coverage overweight regression
it is meant to block.
Fix:
- Added `coreBearingDimensions(domainId)` helper that counts only
dimensions that have ≥1 core indicator in the registry. A dimension
with only experimental/enrichment entries (post-retirement
fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count,
not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict
upper bound — explicitly calls out the dynamic case (source failure
raising surviving dim shares further) as the sensitivity script's
responsibility.
- Added a new regression test: asserts (a) at least one recovery
dimension is all-non-core (fuelStockDays post-retirement),
(b) fuelStockDays has zero core indicators, and
(c) recoveryDebtToReserves nominal = 0.05 exactly (not 0.0417). Any
reversion of the retirement or regression to the N_total denominator
will fail loudly.
Top-10 audit output now correctly shows:
recoveryReserveMonths: nominal=5% coverage=188
recoveryDebtToReserves: nominal=5% coverage=185
recoveryImportHhi: nominal=5% coverage=190
(was 4.17% each under the old math)
All 486 resilience tests pass. typecheck:api clean.
Note: the 5% figure is exactly AT the cap, not over it. "exceed" means
strictly > 5%, so it still passes. But now the reviewer / audit log
reflects reality.
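A sketch of the denominator fix, assuming a flat registry of indicator entries. The helper names mirror those in this commit, but their signatures here are guesses, and the `placeholder*` / `dim*` rows are hypothetical stand-ins for the recovery dimensions not named in this message.

```typescript
type Tier = "core" | "enrichment" | "experimental";

interface RegistryEntry {
  id: string;
  dimension: string;
  tier: Tier;
  weight: number;
}

const debtToReserves: RegistryEntry = {
  id: "recoveryDebtToReserves", dimension: "externalDebtCoverage", tier: "core", weight: 1.0,
};

// Six recovery dimensions; fuelStockDays has no core indicator after retirement.
const recoveryRegistry: RegistryEntry[] = [
  debtToReserves,
  { id: "recoveryImportHhi", dimension: "importConcentration", tier: "core", weight: 1.0 },
  { id: "recoveryFuelStockDays", dimension: "fuelStockDays", tier: "experimental", weight: 1.0 },
  { id: "placeholderA", dimension: "dimA", tier: "core", weight: 1.0 },
  { id: "placeholderB", dimension: "dimB", tier: "core", weight: 1.0 },
  { id: "placeholderC", dimension: "dimC", tier: "core", weight: 1.0 },
];

const RECOVERY_DOMAIN_WEIGHT = 0.25; // from the 1.0 × 1/5 × 0.25 arithmetic above

function allDimensions(registry: RegistryEntry[]): string[] {
  return Array.from(new Set(registry.map((e) => e.dimension)));
}

// Only dimensions with at least one core indicator dilute the share.
function coreBearingDimensions(registry: RegistryEntry[]): string[] {
  return Array.from(
    new Set(registry.filter((e) => e.tier === "core").map((e) => e.dimension)),
  );
}

function nominalOverallWeight(entry: RegistryEntry, dimensionCount: number, domainWeight: number): number {
  return entry.weight * (1 / dimensionCount) * domainWeight;
}

const oldNominal = nominalOverallWeight(debtToReserves, allDimensions(recoveryRegistry).length, RECOVERY_DOMAIN_WEIGHT);
const newNominal = nominalOverallWeight(debtToReserves, coreBearingDimensions(recoveryRegistry).length, RECOVERY_DOMAIN_WEIGHT);
const exceedsCap = newNominal > 0.05; // "exceed" is strict, so sitting at the cap passes

console.log({ oldNominal, newNominal, exceedsCap });
```

Dividing by the raw dimension count reproduces the under-reported figure of about 4.17%; dividing by the core-bearing count gives the corrected 5%.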
* fix(resilience): PR 3 review — retired-dim confidence drag + false source-failure label
Addresses the Codex review P1 + P2 on PR #3297.
P1 — retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct),
but computeLowConfidence, computeOverallCoverage, and the widget's
formatResilienceConfidence averaged across all 19 dimensions. That
dragged every country's reported averageCoverage down — US went from
0.8556 (active dims only) to 0.8105 (all dims) — enough drift to
misclassify edge countries as lowConfidence and to shift the ranking
widget's overallCoverage pill for every country.
Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in
_dimension-scorers.ts and filter it out of all three averages. The
filter is keyed on the retired-dim REGISTRY, not on coverage === 0,
because a non-retired dim can legitimately emit coverage=0 on a
genuinely sparse-data country via weightedBlend fall-through — those
entries MUST keep dragging confidence down (that is the sparse-data
signal lowConfidence exists to surface). Verified: sparse-country
release-gate test (marks sparse WHO/FAO countries as low confidence)
still passes with the registry-keyed filter; would have failed with
a naive coverage=0 filter.
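A sketch contrasting the registry-keyed filter with the naive `coverage === 0` filter. The set name follows this commit; the function names, the `DimensionCoverage` shape, and the sample values are hypothetical.

```typescript
interface DimensionCoverage {
  id: string;
  coverage: number; // 0..1
}

const RESILIENCE_RETIRED_DIMENSIONS = new Set<string>(["fuelStockDays"]);

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((s, v) => s + v, 0) / values.length;
}

// Registry-keyed: drops only dimensions that are retired by design.
function overallCoverageRegistryKeyed(dims: DimensionCoverage[]): number {
  return mean(
    dims.filter((d) => !RESILIENCE_RETIRED_DIMENSIONS.has(d.id)).map((d) => d.coverage),
  );
}

// Naive: drops every coverage=0 dimension, which hides genuine data sparsity.
function overallCoverageNaive(dims: DimensionCoverage[]): number {
  return mean(dims.filter((d) => d.coverage !== 0).map((d) => d.coverage));
}

// Illustrative sparse-data country: one live dimension has no data at all.
const sparseCountry: DimensionCoverage[] = [
  { id: "currencyExternal", coverage: 0.9 },
  { id: "importConcentration", coverage: 0 }, // not retired, genuinely missing
  { id: "fuelStockDays", coverage: 0 }, // retired
];

const registryKeyed = overallCoverageRegistryKeyed(sparseCountry);
const naive = overallCoverageNaive(sparseCountry);
console.log({ registryKeyed, naive });
```

In this example the registry-keyed average stays at 0.45, so the missing live dimension still drags confidence down, while the naive filter would report 0.9 and mask the gap.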
Server-client parity: widget-utils cannot import server code, so
RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept
in lockstep by tests/resilience-retired-dimensions-parity.test.mts
(parses the widget file as text, same pattern as existing widget-util
tests that can't import the widget module directly).
P2 — false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure',
which the widget maps to "Source down: upstream seeder failed" with
a `!` icon for every country. That is semantically wrong for an
intentional retirement. Flipped to null so the widget's absent-path
renders a neutral cell without a false outage label. null is already
a legal value of ResilienceDimensionScore.imputationClass; no type
change needed.
Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the
registry-keyed filter semantic for computeOverallCoverage +
computeLowConfidence. Includes a negative-control test proving
non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new):
lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired
coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for
fuelStockDays flipped to null.
All 494 resilience tests + full 6336/6336 data-tier suite pass.
Typecheck clean for both tsconfig.json and tsconfig.api.json.
* docs(resilience): align methodology + registry metadata with shipped imputationClass=null
Follow-up to the previous PR 3 review commit that flipped
scoreFuelStockDays's imputationClass from 'source-failure' to null to
avoid a false "Source down" widget label on every country. The code
changed; the doc and registry metadata did not, leaving three sites
in the methodology mdx and two comment/description sites in the
registry still claiming imputationClass='source-failure'. Any future
reviewer (or tooling that treats the registry description as
authoritative) would be misled.
This commit rewrites those sites to describe the shipped behavior:
- imputationClass=null (not 'source-failure'), with the rationale
- exclusion from confidence/coverage averages via the
RESILIENCE_RETIRED_DIMENSIONS registry filter
- the distinction between structural retirement (filtered) and
runtime coverage=0 (kept so sparse-data countries still flag
lowConfidence)
Touched:
- docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
- server/worldmonitor/resilience/v1/_indicator-registry.ts
(recoveryFuelStockDays comment block + description field)
No code-behavior change. Docs-only.
Tests: 157 targeted resilience tests pass (incl. methodology-lint +
widget + release-gate + confidence-averaging). Typecheck clean on
both tsconfig.json and tsconfig.api.json.
@@ -30,8 +30,8 @@ The first-publication repair is sequenced as PR 0 → PR 1 → PR 3 → PR 2 →
2. **Gas and coal penalized as vulnerability even when domestic.** Current `gasShare` / `coalShare` penalties conflate fossil-dominance with fossil-import-dependence. Replaced in PR 1 with a single `importedFossilDependence` composite using World Bank `EG.IMP.CONS.ZS` × `EG.ELC.FOSL.ZS` under the **Option B (power-system framing)** decision documented in the Energy Domain section.
3. **No nuclear credit in `scoreEnergy`.** Nuclear-heavy generation scores no points despite firm low-carbon characteristics. Fixed in PR 1 by collapsing `renewShare` + new nuclear share + hydroelectric into a single `lowCarbonGenerationShare` indicator sourced from World Bank `EG.ELC.NUCL.ZS + EG.ELC.RNEW.ZS + EG.ELC.HYRO.ZS`. Hydro is summed explicitly because WB RNEW excludes hydroelectric; without HYRO, hydro-heavy countries (Norway ~95%, Paraguay ~99%, Brazil ~65%, Canada ~60%) would score near zero on this 0.20-weight signal despite having near-100% low-carbon grids.
4. **Sovereign-wealth buffers invisible to `reserveAdequacy`.** Current dimension only sees central-bank reserves; SWF assets are not counted. Fixed in PR 2 by splitting the dimension into `liquidReserveAdequacy` + `sovereignFiscalBuffer` with a three-component haircut (access × liquidity × transparency) and a saturating transform.
5. **Dead and regional-only signals in the global core score.** `fuelStockDays` (100% imputed globally), `euGasStorageStress` (EU-only), and `currencyExternal` (BIS 64-economy coverage) currently carry material weight despite insufficient coverage for a world ranking. Retired or scoped regional-only in PR 3.
6. **No coverage-based weight cap.** A dimension at 30% observed coverage carries the same weight as one at 95%. Fixed in PR 3 with a CI-enforced rule: no indicator with observed coverage below 70% may exceed 5% nominal weight or 5% effective influence.
5. **Dead and regional-only signals in the global core score.** ~~`fuelStockDays` (100% imputed globally), `euGasStorageStress` (EU-only), and `currencyExternal` (BIS 64-economy coverage) currently carry material weight despite insufficient coverage for a world ranking.~~ **Landed in PR 3 §3.5**: `fuelStockDays` permanently retired (coverage=0, imputationClass=null for every country — the scorer tags `null` rather than `source-failure` so the widget does not render a false "Source down" label, and the dimension is excluded from confidence/coverage averages via the `RESILIENCE_RETIRED_DIMENSIONS` registry); `currencyExternal` rebuilt on IMF inflation + WB reserves (no BIS); BIS `fxVolatility` + `fxDeviation` demoted to experimental tier; `externalDebtCoverage` re-goalposted from (0..5) to (0..2) per Greenspan-Guidotti to stop saturating at 100.
6. **No coverage-based weight cap.** ~~A dimension at 30% observed coverage carries the same weight as one at 95%.~~ **Landed in PR 3 §3.6**: CI-enforced gate (`tests/resilience-coverage-influence-gate.test.mts`) fails the build if any core indicator with coverage < 137 countries (70% of the ~195 universe) carries more than 5% nominal weight in the overall score. The effective-influence half runs via `scripts/validate-resilience-sensitivity.mjs` as a committed artifact.
Each item maps to an acceptance gate and a spec in the repair plan. Until PR 1–PR 3 land, published rankings reflect the current construct and should be read in that context.
@@ -84,13 +84,16 @@ Each dimension is scored from 0-100 using a weighted blend of its sub-metrics. B
#### Currency & External
PR 3 §3.5 point 2 retired the BIS-backed core construct. BIS REER and DSR cover only the 64 BIS-reporting economies, so the old composite fell through to curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45) for ~130 of 195 countries. The rebuilt dimension uses two globally-covered World Bank / IMF series.
| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| fxVolatility | Annualized BIS real effective exchange rate volatility | Lower is better | 50 - 0 | 0.60 | BIS | Monthly |
| fxDeviation | Absolute deviation of BIS real EER from equilibrium (100) | Lower is better | 35 - 0 | 0.25 | BIS | Monthly |
| fxReservesAdequacy | Total reserves in months of imports (World Bank FI.RES.TOTL.MO) | Higher is better | 1 - 12 | 0.15 | World Bank | Annual |
| inflationStability | Headline consumer inflation, % YoY (IMF WEO); primary signal for currency stability globally | Lower is better | 50 - 0 | 0.60 | IMF | Annual |
| fxReservesAdequacy | Total reserves in months of imports (World Bank FI.RES.TOTL.MO) | Higher is better | 1 - 12 | 0.40 | World Bank | Annual |
For non-BIS countries (~160 countries), a fallback chain applies: (1) IMF inflation + World Bank reserves proxy, (2) IMF inflation alone, (3) reserves alone, (4) conservative imputation (score 50, certainty 0.3).
Coverage ladder (post-PR-3): both present → 0.85; inflation only → 0.55; reserves only → 0.40; neither → 0.30 (curated_list_absent imputation, subject to source-failure re-tagging on adapter outage).
Retained as experimental (enrichment-only, ~64 BIS-reporting countries): `fxVolatility` (annualized BIS REER volatility, 50-0 goalpost) and `fxDeviation` (absolute deviation of BIS REER from 100, 35-0). These do not contribute to the core overall score; they surface on the country drill-down for BIS-tracked economies.
#### Trade & Sanctions
@@ -242,7 +245,7 @@ This domain forms the recovery-capacity pillar. It measures a country's ability
| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryDebtToReserves | Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD) | Lower is better | 5 - 0 | 1.00 | World Bank | Annual |
| recoveryDebtToReserves | Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD); anchored on Greenspan-Guidotti reserve-adequacy rule | Lower is better | 2 - 0 | 1.00 | World Bank | Annual |
#### Import Concentration
@@ -262,11 +265,15 @@ State continuity is a derived dimension: it reads from existing WGI, UCDP, and d
#### Fuel Stock Days
PR 3 §3.5 point 1 permanently retired `fuelStockDays` from the core overall score. The dimension remains registered for schema continuity but pins at `coverage=0`, `score=50`, `imputationClass=null` for every country. Domain averages skip it via the coverage-weighted mean (coverage=0 contributes zero weight), and the user-facing confidence / coverage-percent averages exclude it via the `RESILIENCE_RETIRED_DIMENSIONS` registry filter in `computeLowConfidence`, `computeOverallCoverage`, and the widget's `formatResilienceConfidence`. `imputationClass` is deliberately `null` rather than `source-failure` — a retirement is structural, not a runtime outage, and the widget maps `source-failure` to a "Source down: upstream seeder failed" label with a `!` icon which would manufacture a false outage signal for every country on a deliberate construct retirement.
Why retired: fuel-stock disclosure is an IEA/OECD-member obligation covering ~45 countries. Every non-member was imputed via `unmonitored` (score 50, coverage 0.30). Combined with its 1/6 share of the recovery domain, this was the single largest "construct-absent-for-most-of-the-world" carrier in the scorer — the primary reason UAE landed at rank 69 with `energy=53, reserveAdequacy=25, fuelStockDays=50/unmonitored` in the pre-repair audit.
| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryFuelStockDays | Days of fuel stock cover (IEA Oil Stocks / EIA Weekly Petroleum Status) | Higher is better | 0 - 120 | 1.00 | IEA/EIA | Monthly |
| recoveryFuelStockDays | Days of fuel stock cover (IEA Oil Stocks / EIA Weekly Petroleum Status) — **experimental tier, not part of core score** | Higher is better | 0 - 120 | 1.00 | IEA/EIA | Monthly |
Fuel stock days is an Enrichment-tier signal (coverage ~45 countries, IEA/OECD members). Countries without fuel stock data are imputed with the `unmonitored` class.
The seeder still runs on its weekly schedule so the data surfaces on IEA/OECD-member country drill-downs. It stays retired from the core score unless a globally-comparable concept (strategic-reserve disclosure mandated across >180 countries) emerges.
## Normalization
@@ -576,6 +583,17 @@ Self-assessed against the standard composite-indicator review axes on a 0-10 sca
- **New seeders (weekly):** `seed-low-carbon-generation.mjs` (EG.ELC.NUCL.ZS + EG.ELC.RNEW.ZS + EG.ELC.HYRO.ZS), `seed-fossil-electricity-share.mjs` (EG.ELC.FOSL.ZS), `seed-power-reliability.mjs` (EG.ELC.LOSS.ZS). Bundled by `seed-bundle-resilience-energy-v2.mjs` for a single Railway cron service. Net-energy-imports (`EG.IMP.CONS.ZS`) is NOT a new seeder — it reuses the existing `seed-resilience-static.mjs` path. All three new seed keys are gated as `ON_DEMAND_KEYS` in `api/health.js` until Railway cron provisions and the first clean run lands; graduate out of the set after ~7 days of clean runs.
- **Acceptance gates (plan §6):** Spearman vs baseline >= 0.85; no country moves >15 points; matched-pair gap signs verified; cohort median shifts capped at 10 points; per-indicator effective influence measured via the PR 0 apparatus. Results committed as `docs/snapshots/resilience-ranking-live-post-pr1-<date>.json` and `docs/snapshots/resilience-energy-v2-acceptance-<date>.json` at flag-flip time.
### v2.2 (April 2026) — PR 3 dead-signal cleanup
**Status: landing.** PR 3 in the resilience repair plan (`docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`). Addresses plan §3.5 (dead signals and regional-only signals in the core score) and §3.6 (coverage-based nominal-weight cap). Unlike PR 1, no flag — changes apply immediately because the retired constructs were never producing global signal.
- **§3.5 point 1 — `fuelStockDays` permanently retired from the core score.** IEA/EIA fuel-stock disclosure covers ~45 OECD-member countries; every other country was imputed `unmonitored`. `scoreFuelStockDays` now pins at `score=50, coverage=0, imputationClass=null` for every country. Coverage-weighted domain aggregation excludes it (coverage=0 contributes zero weight), and user-facing confidence / coverage averages exclude it via the `RESILIENCE_RETIRED_DIMENSIONS` registry filter (distinct from non-retired runtime coverage=0 entries, which must keep dragging confidence down — that is the sparse-data signal). `imputationClass=null` (not `source-failure`) because retirement is structural, not a runtime outage; `source-failure` would render a false "Source down" label in the widget on every country. The `recoveryFuelStockDays` registry entry remains (tier=`experimental`) so the data surfaces on IEA-member drill-downs. Re-retention requires a globally-comparable strategic-reserve disclosure concept (>180 countries) to emerge.
- **§3.5 point 2 — `currencyExternal` rebuilt on IMF inflation + WB reserves.** BIS REER / DSR covered only the 64 BIS-reporting economies; the old composite fell through to curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45) for ~130 of 195 countries. New dimension: `inflationStability` (IMF WEO headline inflation, weight 0.60) + `fxReservesAdequacy` (WB reserves in months, weight 0.40). Coverage ladder: both=0.85, inflation-only=0.55, reserves-only=0.40, neither=0.30. Legacy `fxVolatility` + `fxDeviation` kept as `tier='experimental'` on country drill-downs for the 64 BIS economies.
- **§3.5 point 3 — `externalDebtCoverage` re-goalposted from (0..5) to (0..2).** The old goalpost made ratios < 0.5 all score above 90, saturating at 100 across the full 9-country probe (including stressed states). New goalpost is anchored on Greenspan-Guidotti: ratio=1.0 (short-term debt matches reserves = reserve inadequacy threshold) → score 50; ratio=2.0 (double the threshold = acute rollover-shock exposure) → score 0. Ratios above 2.0 clamp to 0.
- **§3.6 — Coverage-and-influence gate on indicator weight.** `tests/resilience-coverage-influence-gate.test.mts` fails the build if any core indicator with observed coverage below 70% of the ~195-country universe (<137 countries) carries more than 5% nominal weight in the overall score. The effective-influence half (variance-explained, Pearson-derivative) runs through `scripts/validate-resilience-sensitivity.mjs` and is committed as an artifact per plan §5 acceptance-criterion 9.
- **Acceptance gates (plan §6):** Spearman vs prior-state >= 0.85, no country swings >5 points from PR 1 state (plan §3.5 deliverable row 4), all release-gate anchors hold, matched-pair directions verified. Sensitivity rerun and post-PR-3 snapshot committed as `docs/snapshots/resilience-ranking-live-post-pr3-<date>.json` at flag-flip/ranking-refresh time.
- **Construct-audit updates:** `docs/methodology/indicator-sources.yaml` updates `recoveryDebtToReserves.constructStatus` from `dead-signal` to `observed-mechanism` citing the Greenspan-Guidotti anchor.
### Editorial notes
- This document is maintained at parity with OECD/JRC composite-indicator standards: every dimension has a named source, direction, goalpost range, weight rationale, cadence, and imputation class. A methodology doc linter (Phase 1 T1.8) validates that the list of dimensions in the indicator registry matches the list documented here and fails CI if they drift.
@@ -591,9 +591,9 @@
coveragePct: 0.75
lastObservedYear: 2023
license: CC-BY-4.0
mechanismTestRationale: Short-term external debt to reserves ratio — rollover-shock exposure.
constructStatus: dead-signal
reviewNotes: Saturates at 100 for every country in the 9-country probe (goalpost 0-5 is too generous). PR 3 re-goalpost.
mechanismTestRationale: Short-term external debt to reserves ratio — rollover-shock exposure. Goalpost anchored on Greenspan-Guidotti (ratio≥1 = reserve inadequacy).
|
||||
constructStatus: observed-mechanism
|
||||
reviewNotes: PR 3 §3.5 point 3 re-goalposted from (0..5) to (0..2). Old goalpost saturated at 100 across the 9-country probe including stressed states; new anchor maps ratio=1.0 to score 50 and ratio=2.0 to score 0.
|
||||
|
||||
- indicator: recoveryImportHhi
|
||||
dimension: importConcentration
|
||||
|
||||
@@ -242,9 +242,13 @@ const EXTRACTION_RULES = {
   householdDebtService: { type: 'not-implemented', reason: 'BIS DSR curated series needs per-country quarterly DSR selection matching the scorer window' },

   // ── currencyExternal ────────────────────────────────────────────────
+  // PR 3 §3.5: BIS retired from core; inflationStability (IMF macro) is
+  // the new primary with reserves secondary. fxVolatility/fxDeviation
+  // stay experimental-only (BIS monthly-change math not exported).
+  inflationStability: { type: 'imf-macro-country-field', field: 'inflationPct' },
+  fxReservesAdequacy: { type: 'static-path', path: ['fxReservesMonths', 'months'] },
   fxVolatility: { type: 'not-implemented', reason: 'BIS REER annualized volatility needs scorer monthly-change std-dev; helper not exported' },
   fxDeviation: { type: 'not-implemented', reason: 'BIS REER absolute deviation from 100 needs scorer latest-value selection; helper not exported' },
-  fxReservesAdequacy: { type: 'static-path', path: ['fxReservesMonths', 'months'] },

   // ── tradeSanctions ──────────────────────────────────────────────────
   sanctionCount: { type: 'sanctions-count' },
@@ -167,12 +167,9 @@ interface ImfMacroEntry {
   year?: number | null;
 }

-interface BisExchangeRate {
-  countryCode?: string;
-  realEer?: number;
-  realChange?: number;
-  date?: string;
-}
+// BisExchangeRate interface removed in PR 3 §3.5: only the
+// now-removed getCountryBisExchangeRates() + scoreCurrencyExternal's
+// BIS path used it.

 interface NationalDebtEntry {
   iso3?: string;
@@ -235,7 +232,10 @@ interface SocialVelocityPost {
 const RESILIENCE_STATIC_PREFIX = 'resilience:static:';
 const RESILIENCE_SHIPPING_STRESS_KEY = 'supply_chain:shipping_stress:v1';
 const RESILIENCE_TRANSIT_SUMMARIES_KEY = 'supply_chain:transit-summaries:v1';
-const RESILIENCE_BIS_EXCHANGE_KEY = 'economic:bis:eer:v1';
+// RESILIENCE_BIS_EXCHANGE_KEY removed in PR 3 §3.5: scoreCurrencyExternal
+// no longer reads BIS EER. fxVolatility / fxDeviation indicators remain
+// registered as tier='experimental' for drill-down panels; those panels
+// read BIS directly via their own handlers, not via this scorer.
 const RESILIENCE_BIS_DSR_KEY = 'economic:bis:dsr:v1';
 const RESILIENCE_NATIONAL_DEBT_KEY = 'economic:national-debt:v1';
 const RESILIENCE_IMF_MACRO_KEY = 'economic:imf:macro:v2';
@@ -258,7 +258,11 @@ const RESILIENCE_RECOVERY_FISCAL_SPACE_KEY = 'resilience:recovery:fiscal-space:v
 const RESILIENCE_RECOVERY_RESERVE_ADEQUACY_KEY = 'resilience:recovery:reserve-adequacy:v1';
 const RESILIENCE_RECOVERY_EXTERNAL_DEBT_KEY = 'resilience:recovery:external-debt:v1';
 const RESILIENCE_RECOVERY_IMPORT_HHI_KEY = 'resilience:recovery:import-hhi:v1';
-const RESILIENCE_RECOVERY_FUEL_STOCKS_KEY = 'resilience:recovery:fuel-stocks:v1';
+// RESILIENCE_RECOVERY_FUEL_STOCKS_KEY removed in PR 3: scoreFuelStockDays
+// no longer reads any source key. If a new globally-comparable
+// recovery-fuel concept lands in a future PR, add a new key with an
+// explicit semantic (e.g. resilience:fuel-import-volatility:v1) rather
+// than resurrecting this one.

 // PR 1 energy-construct v2 seed keys (plan §3.1–§3.3). Written by
 // scripts/seed-low-carbon-generation.mjs, scripts/seed-fossil-
@@ -448,13 +452,9 @@ function mean(values: number[]): number | null {
   return values.reduce((sum, value) => sum + value, 0) / values.length;
 }

-function stddev(values: number[]): number | null {
-  if (values.length < 2) return null;
-  const avg = mean(values);
-  if (avg == null) return null;
-  const variance = values.reduce((sum, value) => sum + (value - avg) ** 2, 0) / values.length;
-  return Math.sqrt(variance);
-}
+// stddev() removed in PR 3 §3.5: its only caller was scoreCurrencyExternal's
+// BIS-volatility path which is now retired. Re-introduce if a future
+// scorer genuinely needs a series-volatility computation.

 // T1.7 schema pass: tie-break order when multiple imputed metrics share
 // weight. Earlier classes in this list win on ties. stable-absence expresses
@@ -565,15 +565,8 @@ function matchesCountryText(value: unknown, countryCode: string): boolean {
   return false;
 }

-function dateToSortableNumber(value: unknown): number {
-  if (typeof value === 'string') {
-    const compact = value.replace(/[^0-9]/g, '');
-    const numeric = Number(compact);
-    if (Number.isFinite(numeric) && numeric > 0) return numeric;
-  }
-  const numeric = Number(value);
-  return Number.isFinite(numeric) ? numeric : 0;
-}
+// dateToSortableNumber() removed in PR 3 §3.5: only the now-removed
+// getCountryBisExchangeRates() used it.

 async function defaultSeedReader(key: string): Promise<unknown | null> {
   return getCachedJson(key, true);
@@ -631,14 +624,9 @@ function getImfLaborEntry(raw: unknown, countryCode: string): ImfLaborEntry | nu
   return (countries[countryCode] as ImfLaborEntry | undefined) ?? null;
 }

-function getCountryBisExchangeRates(raw: unknown, countryCode: string): BisExchangeRate[] {
-  const rates: BisExchangeRate[] = Array.isArray((raw as { rates?: unknown[] } | null)?.rates)
-    ? ((raw as { rates?: BisExchangeRate[] }).rates ?? [])
-    : [];
-  return rates
-    .filter((entry) => matchesCountryIdentifier(entry.countryCode, countryCode))
-    .sort((left, right) => dateToSortableNumber(left.date) - dateToSortableNumber(right.date));
-}
+// getCountryBisExchangeRates() removed in PR 3 §3.5: only scoreCurrencyExternal
+// called it, and that scorer no longer reads BIS EER. Drill-down panels
+// that want BIS series read it via their own dedicated handler.

 function getLatestDebtEntry(raw: unknown, countryCode: string): NationalDebtEntry | null {
   const iso3 = ISO2_TO_ISO3[countryCode.toUpperCase()];
@@ -890,76 +878,87 @@ function scoreFxReserves(months: number): number {
   return normalizeHigherBetter(Math.min(months, 12), 1, 12);
 }

+// PR 3 §3.5 point 3: retire the BIS-dependent primary path. BIS EER
+// covers ~64 economies — a core signal that's null for ~150 countries
+// is structurally wrong for a world-ranking score. The scorer now
+// uses only global-coverage inputs:
+// - inflationStability: IMF `inflationPct` (CPI, ~185 countries)
+// - fxReservesAdequacy: WB `FI.RES.TOTL.MO` (~160 countries)
+// BIS `realChange` / `realEer` are still read for drill-down panels
+// via the fxVolatility / fxDeviation registry entries (now re-tagged
+// `tier='experimental'` so they're excluded from the Core coverage
+// gate), but the SCORER path ignores them entirely. A country that
+// used to take the "BIS primary" branch now takes the same path as
+// a non-BIS country, producing consistent per-country-reproducibility
+// regardless of whether BIS tracks them.
+//
+// Weight split in the core blend:
+// inflationStability 0.6 | fxReservesAdequacy 0.4
+// Mirrors the pre-existing "fallback when no BIS" blend weights.
 export async function scoreCurrencyExternal(
   countryCode: string,
   reader: ResilienceSeedReader = defaultSeedReader,
 ): Promise<ResilienceDimensionScore> {
-  const [bisExchangeRaw, imfMacroRaw, staticRecord] = await Promise.all([
-    reader(RESILIENCE_BIS_EXCHANGE_KEY),
+  const [imfMacroRaw, staticRecord] = await Promise.all([
     reader(RESILIENCE_IMF_MACRO_KEY),
     readStaticCountry(countryCode, reader),
   ]);
-  const countryRates = getCountryBisExchangeRates(bisExchangeRaw, countryCode);
-  const latest = countryRates[countryRates.length - 1] ?? null;
-  const volSource = countryRates
-    .map((entry) => safeNum(entry.realChange))
-    .filter((value): value is number => value != null)
-    .slice(-12);
-  const vol = volSource.length >= 2
-    ? (stddev(volSource) ?? 0) * Math.sqrt(12)
-    : volSource.length === 1
-      ? Math.abs(volSource[0]!) * Math.sqrt(12)
-      : null;

+  const imfEntry = getImfMacroEntry(imfMacroRaw, countryCode);
+  const hasInflation = imfMacroRaw != null && imfEntry?.inflationPct != null;
+  const inflationScore = hasInflation
+    ? normalizeLowerBetter(Math.min(imfEntry!.inflationPct!, 50), 0, 50)
+    : null;
+
   const reservesMonths = getFxReservesMonths(staticRecord);
   const reservesScore = reservesMonths != null ? scoreFxReserves(reservesMonths) : null;

-  // Country not in BIS EER (curated ~40 economies), or BIS seed is down entirely.
-  // Use IMF CPI inflation + WB FX reserves as currency stability proxies.
-  // Inflation covers ~185 countries, reserves ~160 countries via World Bank FI.RES.TOTL.MO.
-  if (countryRates.length === 0) {
-    const imfEntry = getImfMacroEntry(imfMacroRaw, countryCode);
-    const hasInflation = imfMacroRaw != null && imfEntry?.inflationPct != null;
-    const hasReserves = reservesScore != null;
-
-    if (hasInflation && hasReserves) {
-      const inflScore = normalizeLowerBetter(Math.min(imfEntry!.inflationPct!, 50), 0, 50);
-      const blended = inflScore * 0.6 + reservesScore * 0.4;
-      const coverage = bisExchangeRaw != null ? 0.55 : 0.45;
-      return { score: roundScore(blended), coverage, observedWeight: 1, imputedWeight: 0, imputationClass: null, freshness: { lastObservedAtMs: 0, staleness: '' } };
-    }
-    if (hasInflation) {
-      const coverage = bisExchangeRaw != null ? 0.45 : 0.35;
-      return { score: normalizeLowerBetter(Math.min(imfEntry!.inflationPct!, 50), 0, 50), coverage, observedWeight: 1, imputedWeight: 0, imputationClass: null, freshness: { lastObservedAtMs: 0, staleness: '' } };
-    }
-    if (hasReserves) {
-      const coverage = bisExchangeRaw != null ? 0.4 : 0.3;
-      return { score: reservesScore, coverage, observedWeight: 1, imputedWeight: 0, imputationClass: null, freshness: { lastObservedAtMs: 0, staleness: '' } };
-    }
-    // No BIS EER, no IMF inflation fallback, no WB reserves fallback.
-    // This is true structural absence: the country isn't covered by any
-    // currency-stability source we track. Tag with curated_list_absent
-    // (= 'unmonitored') so the taxonomy is the single source of truth
-    // and the aggregation pass can still re-tag it as 'source-failure'
-    // when the underlying adapter fails. The prior absence-based branch
-    // returned { score: 50, imputationClass: null } which silently
-    // bypassed the taxonomy; replaced in T1.7 source-failure wiring.
+  if (hasInflation && reservesScore != null) {
+    const blended = inflationScore! * 0.6 + reservesScore * 0.4;
     return {
-      score: IMPUTE.bisEer.score,
-      coverage: IMPUTE.bisEer.certaintyCoverage,
-      observedWeight: 0,
-      imputedWeight: 1,
-      imputationClass: IMPUTE.bisEer.imputationClass,
+      score: roundScore(blended),
+      coverage: 0.85,
+      observedWeight: 1,
+      imputedWeight: 0,
+      imputationClass: null,
       freshness: { lastObservedAtMs: 0, staleness: '' },
     };
   }
+  if (hasInflation) {
+    return {
+      score: inflationScore!,
+      coverage: 0.55,
+      observedWeight: 1,
+      imputedWeight: 0,
+      imputationClass: null,
+      freshness: { lastObservedAtMs: 0, staleness: '' },
+    };
+  }
+  if (reservesScore != null) {
+    return {
+      score: reservesScore,
+      coverage: 0.4,
+      observedWeight: 1,
+      imputedWeight: 0,
+      imputationClass: null,
+      freshness: { lastObservedAtMs: 0, staleness: '' },
+    };
+  }

-  // BIS EER data present: volatility + deviation are primary, reserves supplementary.
-  return weightedBlend([
-    { score: vol == null ? null : normalizeLowerBetter(vol, 0, 50), weight: 0.6 },
-    { score: latest == null ? null : normalizeLowerBetter(Math.abs((safeNum(latest.realEer) ?? 100) - 100), 0, 35), weight: 0.25 },
-    { score: reservesScore, weight: 0.15 },
-  ]);
+  // Neither global-coverage source present. True structural absence;
+  // keep the curated_list_absent → unmonitored taxonomy so the
+  // aggregation pass can still re-tag as source-failure on adapter
+  // outage. (IMPUTE.bisEer is the existing entry; we keep its
+  // identity/name for snapshot continuity but the semantics now read
+  // as "no IMF + no WB reserves" rather than "no BIS".)
+  return {
+    score: IMPUTE.bisEer.score,
+    coverage: IMPUTE.bisEer.certaintyCoverage,
+    observedWeight: 0,
+    imputedWeight: 1,
+    imputationClass: IMPUTE.bisEer.imputationClass,
+    freshness: { lastObservedAtMs: 0, staleness: '' },
+  };
 }

 export async function scoreTradeSanctions(
|
||||
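
The post-PR-3 branch order in `scoreCurrencyExternal` can be condensed into a small pure function. This is a sketch, not the scorer itself: `normalizeSketch` is an assumed linear goalpost helper (the real `normalizeLowerBetter` / `normalizeHigherBetter` bodies are not part of this diff), `sketchCurrencyExternal` is a hypothetical name, rounding is omitted, and the structural-absence case returns `null` where the real scorer returns the `IMPUTE.bisEer` fallback.

```typescript
// Assumed linear 0..100 goalpost mapping with clamping: `best` scores 100
// and `worst` scores 0. Not the project's own helper.
function normalizeSketch(value: number, best: number, worst: number): number {
  const fraction = (worst - value) / (worst - best);
  return 100 * Math.min(1, Math.max(0, fraction));
}

interface CurrencySketch {
  score: number;
  coverage: number;
}

function sketchCurrencyExternal(
  inflationPct: number | null,
  reservesMonths: number | null,
): CurrencySketch | null {
  // Inflation: lower is better, capped at 50 (goalposts best 0, worst 50).
  const inflationScore =
    inflationPct == null ? null : normalizeSketch(Math.min(inflationPct, 50), 0, 50);
  // Reserves: higher is better, capped at 12 months (worst 1, best 12).
  const reservesScore =
    reservesMonths == null ? null : normalizeSketch(Math.min(reservesMonths, 12), 12, 1);

  if (inflationScore != null && reservesScore != null) {
    return { score: inflationScore * 0.6 + reservesScore * 0.4, coverage: 0.85 };
  }
  if (inflationScore != null) return { score: inflationScore, coverage: 0.55 };
  if (reservesScore != null) return { score: reservesScore, coverage: 0.4 };
  return null; // the real scorer returns the IMPUTE.bisEer entry here
}
```

Because BIS data no longer enters any branch, two countries with the same inflation and reserves inputs get the same score and coverage whether or not BIS tracks them.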
@@ -1407,11 +1406,11 @@ interface RecoveryImportHhiCountry {
   year?: number | null;
 }

-interface RecoveryFuelStocksCountry {
-  fuelStockDays?: number | null;
-  meetsObligation?: boolean | null;
-  belowObligation?: boolean | null;
-}
+// RecoveryFuelStocksCountry interface removed in PR 3 — scoreFuelStockDays
+// no longer reads any payload. Do NOT re-add the type as a reservation;
+// the tsc noUnusedLocals rule rejects unused locals. When a new
+// recovery-fuel concept lands, introduce a fresh interface with a
+// different name + the actual shape it needs.

 function getRecoveryCountryEntry<T>(raw: unknown, countryCode: string): T | null {
   const countries = (raw as { countries?: Record<string, T> } | null)?.countries;
@@ -1480,8 +1479,12 @@ export async function scoreExternalDebtCoverage(
       freshness: { lastObservedAtMs: 0, staleness: '' },
     };
   }
+  // PR 3 §3.5 point 3: goalpost re-anchored on Greenspan-Guidotti.
+  // Ratio 1.0 (short-term debt matches reserves) = score 50; ratio 2.0
+  // = score 0 (acute rollover-shock exposure). See registry entry
+  // recoveryDebtToReserves for the construct rationale.
   return weightedBlend([
-    { score: normalizeLowerBetter(entry.debtToReservesRatio, 0, 5), weight: 1.0 },
+    { score: normalizeLowerBetter(entry.debtToReservesRatio, 0, 2), weight: 1.0 },
   ]);
 }
|
||||
|
||||
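
The effect of moving the goalpost from (0..5) to (0..2) is easiest to see on a few ratios. A worked sketch, assuming `normalizeLowerBetter` is a linear map clamped to 0..100 (its body is not shown in this diff, so `lowerBetterSketch` below is a stand-in, not the project's helper):

```typescript
// Stand-in for normalizeLowerBetter(value, min, max): linear, clamped,
// min maps to 100 and max maps to 0. This is an assumption.
function lowerBetterSketch(value: number, min: number, max: number): number {
  const fraction = (max - value) / (max - min);
  return 100 * Math.min(1, Math.max(0, fraction));
}

const oldGoalpost = (ratio: number): number => lowerBetterSketch(ratio, 0, 5);
const newGoalpost = (ratio: number): number => lowerBetterSketch(ratio, 0, 2);

// Print old vs new scores for ratios around the Greenspan-Guidotti threshold.
for (const ratio of [0.5, 1.0, 2.0, 3.0]) {
  console.log(`ratio=${ratio} old=${oldGoalpost(ratio)} new=${newGoalpost(ratio)}`);
}
```

Under this linear assumption a ratio of 1.0 scores 80 against the old range and 50 against the new one, and any ratio at or above 2.0 clamps to 0, which matches the mapping stated in the comment above.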
@@ -1551,26 +1554,75 @@ export async function scoreStateContinuity(
   ]);
 }

+// PR 3 §3.5 point 1: retired permanently from the core score. IEA
+// emergency-stockholding rules are defined in days of NET IMPORTS
+// and do not bind net exporters by design; the net-importer vs net-
+// exporter framings are incomparable, so no global resilience signal
+// can be built from this data. Published coverage for the IEA/EIA
+// connector sat at 100% imputed at 50 for every country in the
+// pre-repair probe (`fuelStockDays` was `source-failure` for every
+// ISO in the April 2026 freeze snapshot).
+//
+// Returning `coverage: 0` + `observedWeight: 0` drops the dimension
+// from the `recovery` domain's coverage-weighted mean entirely; the
+// remaining recovery dimensions pick up its share of the domain
+// weight via auto-redistribution (no explicit weight transfer needed
+// — `coverageWeightedMean` in `_shared.ts` already does this).
+//
+// Does NOT return in PR 4. A new globally-comparable recovery-fuel
+// concept (e.g. fuel-import-volatility or strategic-buffer-ratio
+// with a unified net-importer/net-exporter definition) could replace
+// this scorer in a future PR, but that is out of scope for the
+// first-publication repair.
+//
+// The dimension `fuelStockDays` remains in `RESILIENCE_DIMENSION_ORDER`
+// for structural continuity (tests, pillar membership, registry
+// shape); retiring the dimension entirely is a PR 4 structural-audit
+// concern. The `recoveryFuelStockDays` indicator is re-tagged as
+// `tier: 'experimental'` in the registry so the Core coverage gate
+// does not consider it active.
+// Authoritative registry of dimensions retired from the core score.
+// Retired dimensions still appear in `RESILIENCE_DIMENSION_ORDER` for
+// structural continuity (tests, pillar membership, registry shape) and
+// their scorers still run (returning coverage=0). This set exists so
+// downstream confidence/coverage averages (`computeLowConfidence`,
+// `computeOverallCoverage`, the widget's `formatResilienceConfidence`)
+// can explicitly exclude retired dims — distinct from coverage=0
+// dimensions that reflect genuine data sparsity, which must still drag
+// the confidence reading down so sparse-data countries stay flagged as
+// low-confidence. See `tests/resilience-confidence-averaging.test.mts`
+// for the exact semantic this set enables.
+//
+// Client-side mirror: `RESILIENCE_RETIRED_DIMENSION_IDS` in
+// `src/components/resilience-widget-utils.ts`. Kept in lockstep via
+// `tests/resilience-retired-dimensions-parity.test.mts`.
+export const RESILIENCE_RETIRED_DIMENSIONS: ReadonlySet<ResilienceDimensionId> = new Set([
+  'fuelStockDays',
+]);
+
 export async function scoreFuelStockDays(
-  countryCode: string,
-  reader: ResilienceSeedReader = defaultSeedReader,
+  _countryCode: string,
+  _reader: ResilienceSeedReader = defaultSeedReader,
 ): Promise<ResilienceDimensionScore> {
-  const raw = await reader(RESILIENCE_RECOVERY_FUEL_STOCKS_KEY);
-  const entry = getRecoveryCountryEntry<RecoveryFuelStocksCountry>(raw, countryCode);
-  // The seeder writes `fuelStockDays`, not `stockDays`.
-  if (!entry || entry.fuelStockDays == null) {
-    return {
-      score: IMPUTE.recoveryFuelStocks.score,
-      coverage: IMPUTE.recoveryFuelStocks.certaintyCoverage,
-      observedWeight: 0,
-      imputedWeight: 1,
-      imputationClass: IMPUTE.recoveryFuelStocks.imputationClass,
-      freshness: { lastObservedAtMs: 0, staleness: '' },
-    };
-  }
-  return weightedBlend([
-    { score: normalizeHigherBetter(Math.min(entry.fuelStockDays, 120), 0, 120), weight: 1.0 },
-  ]);
+  // imputationClass is `null` (not 'source-failure') because the dimension
+  // is retired by design, not failing at runtime. 'source-failure' renders
+  // as "Source down: upstream seeder failed" with a `!` icon in the widget
+  // (see IMPUTATION_CLASS_LABELS in src/components/resilience-widget-utils.ts);
+  // surfacing that label for every country would manufacture a false outage
+  // signal for a deliberate construct retirement. The dimension is excluded
+  // from confidence/coverage averages via the `RESILIENCE_RETIRED_DIMENSIONS`
+  // registry filter in `computeLowConfidence`, `computeOverallCoverage`, and
+  // the widget's `formatResilienceConfidence`. The filter is registry-keyed
+  // (not `coverage === 0`) so genuinely sparse-data countries still surface
+  // as low-confidence from non-retired coverage=0 dims.
+  return {
+    score: 50,
+    coverage: 0,
+    observedWeight: 0,
+    imputedWeight: 0,
+    imputationClass: null,
+    freshness: { lastObservedAtMs: 0, staleness: '' },
+  };
 }

 export const RESILIENCE_DIMENSION_SCORERS: Record<
|
||||
|
||||
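
The claim that `coverage: 0` drops the retired dimension from the domain mean without an explicit weight transfer can be checked with a small sketch. It assumes the domain score is `sum(score * coverage) / sum(coverage)`; the real `coverageWeightedMean` in `_shared.ts` is not shown in this diff and may also apply per-dimension weights, so `coverageWeightedMeanSketch` and the sample scores are illustrative only.

```typescript
interface DimensionSketch {
  id: string;
  score: number;
  coverage: number;
}

// Assumed form of a coverage-weighted mean: each dimension contributes
// in proportion to its coverage, so coverage=0 contributes nothing.
function coverageWeightedMeanSketch(dims: DimensionSketch[]): number | null {
  const totalCoverage = dims.reduce((sum, dim) => sum + dim.coverage, 0);
  if (totalCoverage === 0) return null;
  return dims.reduce((sum, dim) => sum + dim.score * dim.coverage, 0) / totalCoverage;
}

const observed: DimensionSketch[] = [
  { id: 'externalDebtCoverage', score: 70, coverage: 0.8 },
  { id: 'importConcentration', score: 40, coverage: 0.4 },
];
// The retired scorer returns score 50 with coverage 0 for every country.
const withRetired: DimensionSketch[] = [
  ...observed,
  { id: 'fuelStockDays', score: 50, coverage: 0 },
];

const domainWithout = coverageWeightedMeanSketch(observed);
const domainWith = coverageWeightedMeanSketch(withRetired);
```

In this form the placeholder score of 50 never reaches the result, since it is multiplied by a coverage of 0 in the numerator and adds nothing to the denominator.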
@@ -108,11 +108,49 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
     license: 'non-commercial',
   },

-  // ── currencyExternal (3 sub-metrics, plus IMF inflation fallback for non-BIS) ─
+  // ── currencyExternal ─────────────────────────────────────────────────────
+  // PR 3 §3.5 point 3 rebalanced the dimension's core scoring:
+  // - BIS-dependent signals (fxVolatility, fxDeviation) moved to
+  // tier='experimental'. BIS EER covers ~64 economies, which is too
+  // narrow for a world-ranking Core signal. They remain in the registry
+  // for drill-down / enrichment panels but scoreCurrencyExternal no
+  // longer reads them.
+  // - Core scoring is now: inflationStability (IMF CPI, ~185 countries)
+  // at weight 0.6, fxReservesAdequacy (WB FI.RES.TOTL.MO, ~188 countries)
+  // at weight 0.4. Both are global-coverage, so every country gets the
+  // same construct regardless of BIS membership.
+  {
+    id: 'inflationStability',
+    dimension: 'currencyExternal',
+    description: 'IMF CPI inflation (lower is better). Global-coverage primary signal for currency stability. Core input to scoreCurrencyExternal under PR 3 §3.5. A future PR may upgrade this to a 5-year inflation-volatility computation once the seeder tracks the series; headline inflation is a reasonable first-cut for stability ranking.',
+    direction: 'lowerBetter',
+    goalposts: { worst: 50, best: 0 },
+    weight: 0.6,
+    sourceKey: 'economic:imf:macro:v2',
+    scope: 'global',
+    cadence: 'annual',
+    tier: 'core',
+    coverage: 185,
+    license: 'open-data',
+  },
+  {
+    id: 'fxReservesAdequacy',
+    dimension: 'currencyExternal',
+    description: 'Total reserves in months of imports (World Bank FI.RES.TOTL.MO). Global-coverage core signal for currency stability; paired with inflationStability in scoreCurrencyExternal after PR 3 §3.5 rebalancing.',
+    direction: 'higherBetter',
+    goalposts: { worst: 1, best: 12 },
+    weight: 0.4,
+    sourceKey: 'resilience:static:*',
+    scope: 'global',
+    cadence: 'annual',
+    tier: 'core',
+    coverage: 188,
+    license: 'open-data',
+  },
   {
     id: 'fxVolatility',
     dimension: 'currencyExternal',
-    description: 'Annualized BIS real effective exchange rate volatility (std-dev of monthly changes * sqrt(12)). Fallback chain when BIS absent: (1) IMF inflation + WB reserves proxy, (2) IMF inflation alone, (3) reserves alone, (4) conservative imputation.',
+    description: 'Annualized BIS real effective exchange rate volatility (std-dev of monthly changes * sqrt(12)). Enrichment-only for the ~64 BIS-tracked economies after PR 3 §3.5 — NOT read by scoreCurrencyExternal. Available via drill-down panels only.',
     direction: 'lowerBetter',
     goalposts: { worst: 50, best: 0 },
     weight: 0.6,
@@ -120,17 +158,14 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
     scope: 'curated',
     cadence: 'monthly',
     imputation: { type: 'conservative', score: 50, certainty: 0.3 },
-    // BIS REER is curated (~60 countries). Demoted to Enrichment by the
-    // Phase 2 A4 coverage gate; the IMF inflation + WB reserves fallback
-    // chain still feeds the Core fxReservesAdequacy signal globally.
-    tier: 'enrichment',
+    tier: 'experimental',
     coverage: 60,
     license: 'non-commercial',
   },
   {
     id: 'fxDeviation',
     dimension: 'currencyExternal',
-    description: 'Absolute deviation of latest BIS real EER from 100 (equilibrium index). Fallback chain when BIS absent: (1) IMF inflation + WB reserves proxy, (2) IMF inflation alone, (3) reserves alone, (4) conservative imputation.',
+    description: 'Absolute deviation of latest BIS real EER from 100 (equilibrium index). Enrichment-only for the ~64 BIS-tracked economies after PR 3 §3.5 — NOT read by scoreCurrencyExternal. Available via drill-down panels only.',
     direction: 'lowerBetter',
     goalposts: { worst: 35, best: 0 },
     weight: 0.25,
@@ -138,25 +173,10 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
     scope: 'curated',
     cadence: 'monthly',
     imputation: { type: 'conservative', score: 50, certainty: 0.3 },
-    // BIS REER curated source, same coverage limitation as fxVolatility.
-    tier: 'enrichment',
+    tier: 'experimental',
     coverage: 60,
     license: 'non-commercial',
   },
-  {
-    id: 'fxReservesAdequacy',
-    dimension: 'currencyExternal',
-    description: 'Total reserves in months of imports (World Bank FI.RES.TOTL.MO). Supplementary metric for BIS countries (weight 0.15), primary metric alongside IMF inflation for non-BIS countries (~160 countries).',
-    direction: 'higherBetter',
-    goalposts: { worst: 1, best: 12 },
-    weight: 0.15,
-    sourceKey: 'resilience:static:*',
-    scope: 'global',
-    cadence: 'annual',
-    tier: 'core',
-    coverage: 188,
-    license: 'open-data',
-  },

   // ── tradeSanctions (4 sub-metrics) ────────────────────────────────────────
   {
@@ -905,9 +925,16 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
   {
     id: 'recoveryDebtToReserves',
     dimension: 'externalDebtCoverage',
-    description: 'Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD); values above 1 signal reserve inadequacy for debt service',
+    description: 'Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD); Greenspan-Guidotti rule treats ratio≥1 as reserve inadequacy, ratio≥2 as acute rollover-shock exposure',
     direction: 'lowerBetter',
-    goalposts: { worst: 5, best: 0 },
+    // PR 3 §3.5 point 3: re-goalposted from (0..5) to (0..2). Old goalpost
+    // saturated at 100 across the full 9-country probe including stressed
+    // states. New anchor: ratio=1.0 (Greenspan-Guidotti reserve-adequacy
+    // threshold) maps to score 50; ratio=2.0 (double the threshold, acute
+    // distress) maps to 0. Ratios above 2.0 clamp to 0 — consistent with
+    // "beyond this point the precise value stops mattering, the country
+    // is already in a rollover-crisis regime."
+    goalposts: { worst: 2, best: 0 },
     weight: 1.0,
     sourceKey: 'resilience:recovery:external-debt:v1',
     scope: 'global',
@@ -978,17 +1005,31 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
   },

   // ── fuelStockDays (1 sub-metric) ─────────────────────────────────────────
+  // PR 3 §3.5 point 1: RETIRED from the core score. IEA emergency-
+  // stockholding is defined in days of NET IMPORTS; the net-importer
+  // vs net-exporter framings are incomparable, so no global resilience
+  // signal can be built from this data. scoreFuelStockDays now returns
+  // coverage=0 + imputationClass=null for every country (filtered out
+  // of confidence/coverage averages via the RESILIENCE_RETIRED_DIMENSIONS
+  // registry in _dimension-scorers.ts). imputationClass is deliberately
+  // `null` rather than 'source-failure' — a retirement is structural,
+  // not a runtime outage, and surfacing 'source-failure' would manufacture
+  // a false "Source down" label in the widget for every country. The
+  // registry entry stays at tier='experimental' so the Core coverage
+  // gate treats it as out-of-score; the dimension itself remains
+  // registered for structural continuity (PR 4 structural-audit may
+  // remove it entirely).
   {
     id: 'recoveryFuelStockDays',
     dimension: 'fuelStockDays',
-    description: 'Days of fuel stock cover (IEA Oil Stocks / EIA Weekly Petroleum Status); strategic buffer for energy-dependent recovery',
+    description: 'RETIRED in PR 3. Legacy days-of-fuel-stock-cover (IEA Oil Stocks / EIA Weekly Petroleum Status). Does not contribute to the score — scoreFuelStockDays returns coverage=0 + imputationClass=null, and the dimension is excluded from confidence/coverage averages via the RESILIENCE_RETIRED_DIMENSIONS registry. Kept in the registry as tier=experimental for structural continuity; a globally-comparable recovery-fuel concept could replace this in a future PR.',
     direction: 'higherBetter',
     goalposts: { worst: 0, best: 120 },
     weight: 1.0,
     sourceKey: 'resilience:recovery:fuel-stocks:v1',
     scope: 'global',
     cadence: 'monthly',
-    tier: 'enrichment',
+    tier: 'experimental',
     coverage: 45,
     license: 'open-data',
   },
@@ -17,6 +17,7 @@ import {
   RESILIENCE_DIMENSION_ORDER,
   RESILIENCE_DIMENSION_TYPES,
   RESILIENCE_DOMAIN_ORDER,
+  RESILIENCE_RETIRED_DIMENSIONS,
   createMemoizedSeedReader,
   getResilienceDomainWeight,
   scoreAllDimensions,
@@ -338,8 +339,22 @@ function parseHistoryPoints(raw: unknown): ResilienceHistoryPoint[] {
   return history.sort((left, right) => left.date.localeCompare(right.date));
 }

-function computeLowConfidence(dimensions: ResilienceDimension[], imputationShare: number): boolean {
-  const averageCoverage = mean(dimensions.map((dimension) => dimension.coverage)) ?? 0;
+export function computeLowConfidence(dimensions: ResilienceDimension[], imputationShare: number): boolean {
+  // Exclude RETIRED dimensions (fuelStockDays, post-PR-3) from the
+  // confidence reading. They contribute zero weight to domain scoring
+  // via coverageWeightedMean, so including them in a flat coverage mean
+  // would drag the user-facing confidence signal down for every country
+  // purely because of a deliberate construct retirement.
+  //
+  // IMPORTANT: we do NOT filter by `coverage === 0` because a genuinely
+  // sparse-data country can legitimately produce coverage=0 on non-
+  // retired dims via weightedBlend fall-through, and those coverage=0
+  // entries SHOULD drag the confidence down — that is precisely the
+  // sparse-data signal lowConfidence exists to surface.
+  const scoring = dimensions.filter(
+    (dimension) => !RESILIENCE_RETIRED_DIMENSIONS.has(dimension.id as ResilienceDimensionId),
+  );
+  const averageCoverage = mean(scoring.map((dimension) => dimension.coverage)) ?? 0;
   return averageCoverage < LOW_CONFIDENCE_COVERAGE_THRESHOLD || imputationShare > LOW_CONFIDENCE_IMPUTATION_SHARE_THRESHOLD;
 }
|
||||
|
||||
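
The difference between filtering by registry membership and filtering by `coverage === 0` shows up as soon as one non-retired dimension is genuinely sparse. A sketch with made-up coverage values; `RETIRED_SKETCH`, `meanCoverage`, and the sample data are hypothetical, and only the dimension ids come from the diff:

```typescript
interface CoverageSketch {
  id: string;
  coverage: number;
}

// Local stand-in for RESILIENCE_RETIRED_DIMENSIONS.
const RETIRED_SKETCH: ReadonlySet<string> = new Set(['fuelStockDays']);

function meanCoverage(dims: CoverageSketch[]): number {
  if (dims.length === 0) return 0;
  return dims.reduce((sum, dim) => sum + dim.coverage, 0) / dims.length;
}

const sample: CoverageSketch[] = [
  { id: 'currencyExternal', coverage: 0.9 },
  { id: 'tradeSanctions', coverage: 0.9 },
  { id: 'externalDebtCoverage', coverage: 0 }, // genuinely sparse
  { id: 'fuelStockDays', coverage: 0 },        // retired by design
];

// Flat mean: the retired dimension drags every country down.
const flatMean = meanCoverage(sample);
// Registry-keyed filter: only the retired dimension is excluded.
const registryFiltered = meanCoverage(sample.filter((dim) => !RETIRED_SKETCH.has(dim.id)));
// coverage === 0 filter: also hides the genuinely sparse dimension.
const zeroFiltered = meanCoverage(sample.filter((dim) => dim.coverage !== 0));
```

With these numbers the three readings are 0.45, 0.6, and 0.9: the registry-keyed filter removes the structural zero while the sparse `externalDebtCoverage` entry still lowers the average, which is the behaviour the comment in `computeLowConfidence` asks for.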
@@ -634,8 +649,18 @@ export async function getCachedResilienceScores(countryCodes: string[]): Promise
 
 export const GREY_OUT_COVERAGE_THRESHOLD = 0.40;
 
-function computeOverallCoverage(response: GetResilienceScoreResponse): number {
-  const coverages = response.domains.flatMap((domain) => domain.dimensions.map((dimension) => dimension.coverage));
+export function computeOverallCoverage(response: GetResilienceScoreResponse): number {
+  // Exclude RETIRED dimensions (fuelStockDays, post-PR-3) — their
+  // coverage=0 is structural, not a sparsity signal, and should not
+  // drag down the ranking widget's overallCoverage pill. Non-retired
+  // coverage=0 dims (genuine weightedBlend fall-through) stay in the
+  // average because they reflect real data sparsity for that country.
+  // See `computeLowConfidence` for the matching rationale.
+  const coverages = response.domains.flatMap((domain) =>
+    domain.dimensions
+      .filter((dimension) => !RESILIENCE_RETIRED_DIMENSIONS.has(dimension.id as ResilienceDimensionId))
+      .map((dimension) => dimension.coverage),
+  );
   if (coverages.length === 0) return 0;
   return coverages.reduce((sum, coverage) => sum + coverage, 0) / coverages.length;
 }
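As a rough illustration of why the filter in this hunk is keyed on the retired-dimension registry rather than on `coverage === 0`, the sketch below compares both filters on a made-up three-dimension country. The `Dim` shape, the `meanCoverage` helper, and the sample values are hypothetical; only the `fuelStockDays` ID comes from the diff.

```typescript
// Hypothetical minimal shape; the real type is the generated ResilienceDimension.
interface Dim {
  id: string;
  coverage: number;
}

// Assumed stand-in for RESILIENCE_RETIRED_DIMENSIONS (only 'fuelStockDays' is confirmed).
const RETIRED: ReadonlySet<string> = new Set(['fuelStockDays']);

// Flat mean of coverage over the dimensions that pass `keep`; 0 when none pass.
function meanCoverage(dims: Dim[], keep: (d: Dim) => boolean): number {
  const kept = dims.filter(keep).map((d) => d.coverage);
  return kept.length === 0 ? 0 : kept.reduce((sum, c) => sum + c, 0) / kept.length;
}

const sparseCountry: Dim[] = [
  { id: 'macroFiscal', coverage: 0.9 },
  { id: 'currencyExternal', coverage: 0 }, // genuine sparsity: must stay in the mean
  { id: 'fuelStockDays', coverage: 0 }, // retired: structural zero, must be dropped
];

// Registry-keyed filter: (0.9 + 0) / 2, the sparsity signal survives.
const byRegistry = meanCoverage(sparseCountry, (d) => !RETIRED.has(d.id));
// Zero-keyed filter: only 0.9 remains, so the sparsity is hidden.
const byZero = meanCoverage(sparseCountry, (d) => d.coverage !== 0);
// No filter: (0.9 + 0 + 0) / 3, the retirement drags every country down.
const unfiltered = meanCoverage(sparseCountry, () => true);

console.log({ byRegistry, byZero, unfiltered });
```

The registry-keyed mean sits between the two failure modes: it ignores the structural zero but still counts the genuine one.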
@@ -1,5 +1,17 @@
 import type { ResilienceScoreResponse } from '@/services/resilience';
 
+// Client-side mirror of the server-side authoritative set
+// (`RESILIENCE_RETIRED_DIMENSIONS` in
+// server/worldmonitor/resilience/v1/_dimension-scorers.ts). Duplicated
+// because the widget module cannot import server code; kept in lockstep
+// by `tests/resilience-retired-dimensions-parity.test.mts`. Retired
+// dimensions are filtered out of the displayed coverage percentage so
+// a deliberate construct retirement does not silently drag the user-
+// facing confidence reading down for every country.
+const RESILIENCE_RETIRED_DIMENSION_IDS: ReadonlySet<string> = new Set([
+  'fuelStockDays',
+]);
+
 // Gated locked-preview fixture rendered when the resilience widget is
 // visible to non-entitled users. The preview is blurred and
 // non-interactive via the .resilience-widget__preview CSS class, so
@@ -144,7 +156,21 @@ export function getResilienceDomainLabel(domainId: string): string {
 
 export function formatResilienceConfidence(data: ResilienceScoreResponse): string {
   if (data.lowConfidence) return 'Low confidence — sparse data';
-  const coverages = data.domains.flatMap((d) => d.dimensions.map((dim) => dim.coverage));
+  // Exclude RETIRED dimensions (fuelStockDays, post-PR-3) from the
+  // displayed coverage percentage. Retirement is structural — the
+  // scorer returns coverage=0 by design and the dimension contributes
+  // zero weight to the domain score server-side — so including it in
+  // a flat client-side mean would drag the displayed percentage down
+  // for every country even though the dimension is not part of the
+  // score. Non-retired coverage=0 dims (genuine data sparsity) stay in
+  // the average because they reflect a real confidence signal for that
+  // country; the server already sets `lowConfidence` when the overall
+  // picture is too sparse, which short-circuits above.
+  const coverages = data.domains.flatMap((d) =>
+    d.dimensions
+      .filter((dim) => !RESILIENCE_RETIRED_DIMENSION_IDS.has(dim.id))
+      .map((dim) => dim.coverage),
+  );
   const avgCoverage = coverages.length > 0
     ? Math.round((coverages.reduce((s, c) => s + c, 0) / coverages.length) * 100)
     : 0;
139
tests/resilience-confidence-averaging.test.mts
Normal file
@@ -0,0 +1,139 @@
+import assert from 'node:assert/strict';
+import { describe, it } from 'node:test';
+
+import {
+  computeLowConfidence,
+  computeOverallCoverage,
+} from '../server/worldmonitor/resilience/v1/_shared';
+import type {
+  GetResilienceScoreResponse,
+  ResilienceDimension,
+} from '../src/generated/server/worldmonitor/resilience/v1/service_server';
+
+// PR 3 §3.5 follow-up (reviewer P1): the retired dimension (fuelStockDays,
+// post-retirement) returns coverage=0 structurally and contributes zero
+// weight to the domain score via coverageWeightedMean. The user-facing
+// confidence/coverage averages must exclude retired dims — otherwise
+// the retirement silently drags the reported averageCoverage down for
+// every country even though the dimension is not part of the score.
+//
+// Reviewer anchor: on the US profile, including retired dims gave
+// averageCoverage=0.8105 vs 0.8556 when retired dims are excluded —
+// enough drift to misclassify edge countries as lowConfidence and to
+// shift the widget's overallCoverage pill for the whole ranking.
+//
+// Critical invariant: the filter is keyed on the retired-dim REGISTRY,
+// not on `coverage === 0`. Non-retired dimensions can legitimately
+// emit coverage=0 on genuinely sparse-data countries via weightedBlend
+// fall-through, and those entries MUST continue to drag confidence
+// down — that is the sparse-data signal lowConfidence exists to
+// surface. A too-aggressive `coverage === 0` filter would hide the
+// sparsity and e.g. let South Sudan pass as full-confidence.
+
+function dim(id: string, coverage: number): ResilienceDimension {
+  return {
+    id,
+    score: 50,
+    coverage,
+    observedWeight: coverage > 0 ? 1 : 0,
+    imputedWeight: 0,
+    imputationClass: '',
+    freshness: { lastObservedAtMs: '0', staleness: '' },
+  };
+}
+
+describe('computeOverallCoverage: retired-dim exclusion', () => {
+  it('excludes retired dimensions from the average', () => {
+    const response = {
+      domains: [
+        {
+          id: 'recovery',
+          dimensions: [
+            dim('fiscalSpace', 0.9),
+            dim('reserveAdequacy', 0.8),
+            // Retired: must not pull the average down.
+            dim('fuelStockDays', 0),
+          ],
+        },
+      ],
+    } as unknown as GetResilienceScoreResponse;
+
+    // (0.9 + 0.8) / 2 = 0.85. With retired included the flat mean
+    // would be (0.9 + 0.8 + 0) / 3 ≈ 0.5667 — the regression shape.
+    assert.equal(computeOverallCoverage(response).toFixed(4), '0.8500');
+  });
+
+  it('keeps NON-retired coverage=0 dims in the average (sparse-data signal)', () => {
+    // A genuinely sparse-data country can emit coverage=0 on non-retired
+    // dims via weightedBlend fall-through. Those entries must stay in
+    // the average so sparse countries still surface as low confidence
+    // via the flat mean path.
+    const response = {
+      domains: [
+        {
+          id: 'economic',
+          dimensions: [
+            dim('macroFiscal', 0.9),
+            // NON-retired coverage=0: represents genuine data sparsity.
+            dim('currencyExternal', 0),
+          ],
+        },
+      ],
+    } as unknown as GetResilienceScoreResponse;
+
+    // (0.9 + 0) / 2 = 0.45. If the filter were keyed on coverage=0,
+    // the genuine sparsity would be hidden and this would be 0.9.
+    assert.equal(computeOverallCoverage(response).toFixed(4), '0.4500');
+  });
+
+  it('returns 0 when ALL dims are retired (degenerate case)', () => {
+    const response = {
+      domains: [
+        { id: 'recovery', dimensions: [dim('fuelStockDays', 0)] },
+      ],
+    } as unknown as GetResilienceScoreResponse;
+    assert.equal(computeOverallCoverage(response), 0);
+  });
+});
+
+describe('computeLowConfidence: retired-dim exclusion', () => {
+  it('does not flip lowConfidence purely on retired-dim drag', () => {
+    // Three active dims at 0.72 = 0.72 mean (above the low-confidence
+    // threshold). A single retired dim at coverage=0 must not flip the
+    // flag by dragging the flat mean below the threshold — that was
+    // the regression on the US profile.
+    const dims = [
+      dim('fiscalSpace', 0.72),
+      dim('reserveAdequacy', 0.72),
+      dim('externalDebtCoverage', 0.72),
+      dim('fuelStockDays', 0), // retired
+    ];
+    assert.equal(computeLowConfidence(dims, 0), false,
+      'retired fuelStockDays must not flip lowConfidence for an otherwise well-covered country');
+  });
+
+  it('DOES flip lowConfidence for non-retired coverage=0 dims (sparse data)', () => {
+    // A sparse-data country: multiple non-retired dims at coverage=0
+    // via weightedBlend fall-through. The flat mean drops below the
+    // threshold and the flag must fire — this is the sparse-data
+    // signal lowConfidence exists to surface. A too-aggressive filter
+    // on coverage=0 would hide this.
+    const dims = [
+      dim('macroFiscal', 0.9),
+      dim('currencyExternal', 0), // non-retired coverage=0
+      dim('tradeSanctions', 0), // non-retired coverage=0
+      dim('cyberDigital', 0), // non-retired coverage=0
+    ];
+    assert.equal(computeLowConfidence(dims, 0), true,
+      'non-retired coverage=0 dims must drag lowConfidence down — that is the sparse-data signal');
+  });
+
+  it('respects the imputationShare threshold independently', () => {
+    // Imputation-share check is a separate arm of the OR; retired-dim
+    // filtering must not suppress a legitimate high-imputation-share
+    // trigger.
+    const dims = [dim('fiscalSpace', 0.95)];
+    assert.equal(computeLowConfidence(dims, 0.6), true,
+      'imputationShare > 0.4 must flip lowConfidence even when coverage looks strong');
+  });
+});
194
tests/resilience-coverage-influence-gate.test.mts
Normal file
@@ -0,0 +1,194 @@
+import assert from 'node:assert/strict';
+import { existsSync } from 'node:fs';
+import { dirname, join } from 'node:path';
+import { describe, it } from 'node:test';
+import { fileURLToPath } from 'node:url';
+
+import { INDICATOR_REGISTRY } from '../server/worldmonitor/resilience/v1/_indicator-registry.ts';
+import {
+  RESILIENCE_DIMENSION_DOMAINS,
+  getResilienceDomainWeight,
+  type ResilienceDimensionId,
+  type ResilienceDomainId,
+} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';
+
+// PR 3 §3.6 — Coverage-and-influence cap on indicator weight.
+//
+// Rule (plan §3.6, verbatim):
+//   No indicator with observed coverage below 70% may exceed 5% nominal
+//   weight OR 5% effective influence in the post-change sensitivity run.
+//
+// This file enforces the NOMINAL-WEIGHT half (static, runs every build).
+// The effective-influence half is checked by the variable-importance
+// output of scripts/validate-resilience-sensitivity.mjs and committed as
+// an artifact; see plan §5 acceptance-criteria item 9.
+//
+// Why the gate exists (plan §3.6):
+//   "A dimension at 30% observed coverage carries the same effective
+//   weight as one at 95%. This contradicts the OECD/JRC handbook on
+//   uncertainty analysis."
+//
+// Assumption: the global universe is ~195 countries (UN members + a few
+// territories commonly ranked). "70% coverage" → 137+ countries.
+
+const GLOBAL_COUNTRY_UNIVERSE = 195;
+const COVERAGE_FLOOR = Math.ceil(GLOBAL_COUNTRY_UNIVERSE * 0.7); // 137
+const NOMINAL_WEIGHT_CAP = 0.05; // 5%
+
+// Nominal overall weight of an indicator = weight in dimension
+//                                          × dimension share of domain
+//                                          × domain weight in overall score.
+//
+// `dimension share of domain` is NOT 1/N_total — the scorer aggregates
+// by coverage-weighted mean (server/worldmonitor/resilience/v1/_shared.ts
+// coverageWeightedMean), so a dimension that pins at coverage=0 drops
+// out of the denominator and the surviving dimensions' shares go UP,
+// not down. PR 3 commit 1 retires fuelStockDays by pinning its scorer
+// at coverage=0 for every country — so in the current live state the
+// recovery domain has 5 contributing dimensions (not 6), and each core
+// recovery indicator's nominal share is 1/5 × 0.25 = 5%, not the
+// 1/6 × 0.25 = 4.17% a naive N-based count would report.
+//
+// We therefore count "effective contributing dimensions" per domain:
+// dimensions that have at least one tier='core' indicator in the
+// registry. A dimension with only experimental/enrichment indicators
+// (e.g. fuelStockDays, post-retirement) scores coverage=0 in the core
+// path and is excluded from the coverage-weighted domain mean, so it
+// does not dilute the core dimensions' shares.
+//
+// This still under-estimates the WORST case — a live source-failure
+// run can drop a usually-contributing dimension to coverage=0, further
+// raising surviving dimensions' shares. The worst-case upper bound is
+// indicator.weight × domain_weight (single surviving dimension, 1/1
+// share). Enforcing THAT bound would fail most indicators, so we
+// enforce the baseline (all core-bearing dimensions present) here and
+// rely on the sensitivity-script's effective-influence output (plan
+// §3.6 second half, plan §5 acceptance item 9) to catch the dynamic
+// case.
+//
+// Indicator weights within a dimension are normalized to sum to 1 for
+// non-experimental tiers (enforced by the indicator-registry test).
+
+function dimensionsInDomain(domainId: ResilienceDomainId): ResilienceDimensionId[] {
+  return (Object.keys(RESILIENCE_DIMENSION_DOMAINS) as ResilienceDimensionId[])
+    .filter((dimId) => RESILIENCE_DIMENSION_DOMAINS[dimId] === domainId);
+}
+
+function coreBearingDimensions(domainId: ResilienceDomainId): Set<ResilienceDimensionId> {
+  const dimsInDomain = new Set(dimensionsInDomain(domainId));
+  const withCore = new Set<ResilienceDimensionId>();
+  for (const entry of INDICATOR_REGISTRY) {
+    if (entry.tier === 'core' && dimsInDomain.has(entry.dimension)) {
+      withCore.add(entry.dimension);
+    }
+  }
+  return withCore;
+}
+
+function nominalOverallWeight(indicator: typeof INDICATOR_REGISTRY[number]): number {
+  const domainId = RESILIENCE_DIMENSION_DOMAINS[indicator.dimension];
+  if (domainId == null) return 0;
+  const domainWeight = getResilienceDomainWeight(domainId);
+  // Count only dimensions that have ≥1 core indicator — retired or
+  // all-experimental dimensions contribute coverage=0 to the scorer and
+  // are excluded from the coverage-weighted domain mean.
+  const contributing = coreBearingDimensions(domainId).size;
+  const dimensionShare = contributing > 0 ? 1 / contributing : 0;
+  return indicator.weight * dimensionShare * domainWeight;
+}
+
+describe('resilience coverage-and-influence gate (PR 3 §3.6)', () => {
+  it('no indicator with <70% country coverage carries >5% nominal weight in the overall score', () => {
+    const violations = INDICATOR_REGISTRY
+      // Only core indicators contribute to the overall (public) score.
+      // Enrichment and experimental are drill-down-only, so their
+      // nominal-weight-in-overall is 0 regardless of registry weight.
+      .filter((e) => e.tier === 'core')
+      .filter((e) => e.coverage < COVERAGE_FLOOR)
+      .map((e) => ({
+        id: e.id,
+        dimension: e.dimension,
+        coverage: e.coverage,
+        weight: e.weight,
+        nominalOverall: Number(nominalOverallWeight(e).toFixed(4)),
+      }))
+      .filter((v) => v.nominalOverall > NOMINAL_WEIGHT_CAP);
+
+    assert.deepEqual(
+      violations,
+      [],
+      `Indicators below ${COVERAGE_FLOOR}-country coverage floor with nominal overall weight > ${NOMINAL_WEIGHT_CAP * 100}%:\n${
+        violations.map((v) => ` - ${v.id} (dim=${v.dimension}, coverage=${v.coverage}, nominal=${(v.nominalOverall * 100).toFixed(2)}%)`).join('\n')
+      }\n\nFix options:\n 1. Demote to enrichment or experimental tier.\n 2. Lower the indicator's weight within its dimension.\n 3. Improve coverage to ≥${COVERAGE_FLOOR} countries.`,
+    );
+  });
+
+  it('effective-influence artifact reference exists (sensitivity-script contract)', () => {
+    // The plan (§3.6, §5 item 9) requires post-change variable-importance
+    // to confirm the nominal-weight gate is not violated in the dynamic
+    // (variance-explained) dimension either. That artifact is produced
+    // by scripts/validate-resilience-sensitivity.mjs and not re-computed
+    // here (it requires seeded Redis). This test only asserts the gate
+    // script exists, so removing it via refactor breaks the build.
+    const here = dirname(fileURLToPath(import.meta.url));
+    const sensScript = join(here, '..', 'scripts', 'validate-resilience-sensitivity.mjs');
+    assert.ok(existsSync(sensScript),
+      `plan §3.6 effective-influence half is enforced by ${sensScript} — file is missing`);
+  });
+
+  it('retired dimensions (coverage=0 for every country) do not count in the per-domain share denominator', () => {
+    // Regression guard for the §3.6 gate math. When PR 3 commit 1
+    // pinned fuelStockDays at coverage=0, the coverage-weighted domain
+    // aggregation raised the surviving recovery dimensions' shares from
+    // 1/6 to 1/5. Any gate that uses 1/N_total as the divisor will
+    // under-report nominal influence and can silently pass a regression
+    // that drives a low-coverage indicator above the 5% cap.
+    //
+    // This test asserts the helper correctly excludes all-experimental
+    // dimensions from the share denominator.
+    const recoveryDimsTotal = dimensionsInDomain('recovery').length;
+    const recoveryCoreBearing = coreBearingDimensions('recovery').size;
+    assert.ok(recoveryCoreBearing < recoveryDimsTotal,
+      `expected at least one recovery dimension to be all-non-core (post-fuelStockDays-retirement); got ${recoveryCoreBearing}/${recoveryDimsTotal}. If this flips, the fuelStockDays retirement was reverted and §3.6 math assumptions need review.`);
+
+    // Explicit: fuelStockDays is the dimension we retired. Confirm it
+    // has zero core indicators.
+    const fuelStockCoreCount = INDICATOR_REGISTRY.filter(
+      (e) => e.dimension === 'fuelStockDays' && e.tier === 'core',
+    ).length;
+    assert.equal(fuelStockCoreCount, 0,
+      'fuelStockDays must have zero core indicators post-PR 3 §3.5 retirement. If this fails, un-retire must be intentional + the gate math reviewed.');
+
+    // And the recovery-domain core indicators should each compute 5%
+    // under the corrected formula (1.0 × 1/5 × 0.25), not 4.17%.
+    const debtToReserves = INDICATOR_REGISTRY.find((e) => e.id === 'recoveryDebtToReserves');
+    assert.ok(debtToReserves != null, 'recoveryDebtToReserves must exist');
+    const computed = nominalOverallWeight(debtToReserves!);
+    // 0.05 exactly, allow fp wiggle
+    assert.ok(Math.abs(computed - 0.05) < 1e-9,
+      `recoveryDebtToReserves nominal weight should be 0.05 (1.0 × 1/5 × 0.25) post-retirement; got ${computed}. If this is 0.0417, the share denominator is using 1/6 instead of 1/5 — fuelStockDays retirement is not being excluded.`);
+  });
+
+  it('reports the current nominal-weight distribution for audit', () => {
+    // Visibility-only (no assertion beyond "ran cleanly"). The output
+    // lets reviewers eyeball the distribution and spot outliers that
+    // technically pass (coverage ≥ floor) but still carry unusually
+    // high weight for a narrow construct.
+    const ranked = INDICATOR_REGISTRY
+      .filter((e) => e.tier === 'core')
+      .map((e) => ({
+        id: e.id,
+        nominalOverall: Number((nominalOverallWeight(e) * 100).toFixed(2)),
+        coverage: e.coverage,
+      }))
+      .sort((a, b) => b.nominalOverall - a.nominalOverall)
+      .slice(0, 10);
+    if (ranked.length > 0) {
+      console.warn('[PR 3 §3.6] top 10 core indicators by nominal overall weight:');
+      for (const r of ranked) {
+        console.warn(` ${r.id}: nominal=${r.nominalOverall}% coverage=${r.coverage}`);
+      }
+    }
+    assert.ok(ranked.length > 0, 'expected at least one core indicator');
+  });
+});
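The share arithmetic described in the test file above can be checked in isolation. The following is a minimal sketch, assuming a recovery domain weight of 0.25 and an indicator weight of 1.0 (both values appear in the test comments); `nominalWeight` and `exceedsCap` are hypothetical helpers written for this note, not functions from the repository.

```typescript
// Nominal overall weight = indicator weight × (1 / contributing dimensions) × domain weight.
function nominalWeight(
  indicatorWeight: number,
  contributingDimensions: number,
  domainWeight: number,
): number {
  if (contributingDimensions <= 0) return 0;
  return indicatorWeight * (1 / contributingDimensions) * domainWeight;
}

// The gate uses a strict comparison, so a weight of exactly 5% passes.
function exceedsCap(weight: number, cap = 0.05): boolean {
  return weight > cap;
}

// Naive count: all 6 recovery dimensions in the denominator (about 4.17%).
const naive = nominalWeight(1.0, 6, 0.25);
// Corrected count: fuelStockDays retired, 5 contributing dimensions (5%).
const corrected = nominalWeight(1.0, 5, 0.25);
// Worst case from the comments: a single surviving dimension (25%).
const worstCase = nominalWeight(1.0, 1, 0.25);

console.log({
  naive: naive.toFixed(4),
  corrected: corrected.toFixed(4),
  worstCase: worstCase.toFixed(4),
  worstCaseExceedsCap: exceedsCap(worstCase),
});
```

Under these assumptions the corrected share lands on the cap rather than above it, which is why a denominator that silently reverts to 6 would under-report influence instead of failing loudly.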
@@ -98,12 +98,12 @@ describe('resilience dimension monotonicity — scoreExternalDebtCoverage', () =
   }
 
   it('higher debtToReservesRatio → lower score', async () => {
-    // NOTE: the current scorer saturates at 100 for ratio ≤ 0 (goalpost
-    // lower-better, worst=5 best=0). Picking values inside the 0-5 band
-    // to get a meaningful gradient. PR 3 §3.6 re-goalposts this.
-    const good = await scoreWith(1);
-    const bad = await scoreWith(4);
-    assert.ok(good.score > bad.score, `debtToReservesRatio 1→4 should lower score; got ${good.score} → ${bad.score}`);
+    // PR 3 §3.5 point 3: goalpost is now lower-better worst=2 best=0
+    // (Greenspan-Guidotti anchor). Any ratio ≥ 2 clamps to 0, so pick
+    // values inside the discriminating band to get a meaningful gradient.
+    const good = await scoreWith(0.3);
+    const bad = await scoreWith(1.5);
+    assert.ok(good.score > bad.score, `debtToReservesRatio 0.3→1.5 should lower score; got ${good.score} → ${bad.score}`);
   });
 });
 
@@ -227,53 +227,36 @@ describe('resilience dimension scorers', () => {
       `coverage should be ~0.45 (only sanctions loaded), got ${score.coverage}`);
   });
 
-  it('scoreCurrencyExternal: non-BIS country with no IMF data falls back to curated_list_absent (score 50)', async () => {
-    // BIS loaded, IMF macro also null — no inflation proxy available → curated_list_absent imputation.
-    const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
-      return null; // economic:imf:macro:v1 also null
-    };
-    const score = await scoreCurrencyExternal('MZ', reader); // Mozambique not in BIS
-    assert.equal(score.score, 50, 'curated_list_absent must impute score=50 when IMF also missing');
+  it('scoreCurrencyExternal: no IMF and no reserves → curated_list_absent imputation (score 50)', async () => {
+    // PR 3 §3.5: BIS retired. Without IMF inflation or WB reserves,
+    // scorer falls through to IMPUTE.bisEer (kept for snapshot continuity).
+    const reader = async (_key: string): Promise<unknown | null> => null;
+    const score = await scoreCurrencyExternal('MZ', reader);
+    assert.equal(score.score, 50, 'curated_list_absent must impute score=50 when IMF+reserves missing');
     assert.equal(score.coverage, 0.3, 'curated_list_absent certaintyCoverage=0.3');
   });
 
-  it('scoreCurrencyExternal: non-BIS country with IMF inflation uses inflation proxy (coverage 0.45)', async () => {
-    // BIS loaded, IMF macro has inflation → use inflation proxy instead of curated_list_absent.
+  it('scoreCurrencyExternal: IMF inflation only (no reserves) uses inflation proxy (coverage 0.55)', async () => {
+    // PR 3 §3.5: BIS retired. IMF inflation alone gives inflation-only path (0.55).
     const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
       if (key === 'economic:imf:macro:v2') return { countries: { MZ: { inflationPct: 8, currentAccountPct: -5, year: 2024 } } };
       return null;
     };
     const score = await scoreCurrencyExternal('MZ', reader);
     // normalizeLowerBetter(min(8,50), 0, 50) = (50-8)/50*100 = 84
     assert.equal(score.score, 84, 'low-inflation country gets high currency score via IMF proxy');
-    assert.equal(score.coverage, 0.45, 'IMF inflation proxy coverage=0.45 (better than pure imputation)');
+    assert.equal(score.coverage, 0.55, 'IMF inflation only (no reserves) → coverage 0.55');
   });
 
-  it('scoreCurrencyExternal: non-BIS country with hyperinflation is capped at score 0', async () => {
+  it('scoreCurrencyExternal: hyperinflation is capped at score 0 (inflation-only path)', async () => {
     const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
       if (key === 'economic:imf:macro:v2') return { countries: { ZW: { inflationPct: 250, currentAccountPct: -8, year: 2024 } } };
       return null;
     };
     const score = await scoreCurrencyExternal('ZW', reader);
     // min(250, 50) = 50 → normalizeLowerBetter(50, 0, 50) = 0
     assert.equal(score.score, 0, 'hyperinflation ≥50% is capped → score 0');
-    assert.equal(score.coverage, 0.45, 'hyperinflation still gets IMF proxy coverage=0.45');
-  });
-
-  it('scoreCurrencyExternal: BIS outage + IMF inflation present → uses proxy with coverage=0.35', async () => {
-    // BIS seed is completely down (null), but IMF macro is available.
-    // The inflation proxy should still be applied — BIS outage must not block the IMF path.
-    const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:imf:macro:v2') return { countries: { MZ: { inflationPct: 6, currentAccountPct: -2, year: 2024 } } };
-      return null; // economic:bis:eer:v1 null = BIS seed outage
-    };
-    const score = await scoreCurrencyExternal('MZ', reader);
-    // normalizeLowerBetter(min(6,50), 0, 50) = (50-6)/50*100 = 88
-    assert.equal(score.score, 88, 'BIS outage must not block IMF inflation proxy');
-    assert.equal(score.coverage, 0.35, 'BIS outage reduces proxy coverage to 0.35 (primary source unavailable)');
+    assert.equal(score.coverage, 0.55, 'hyperinflation still gets IMF inflation-only coverage 0.55');
   });
 
   it('scoreCurrencyExternal: both BIS and IMF null → curated_list_absent imputation (T1.7)', async () => {
@@ -308,9 +291,9 @@ describe('resilience dimension scorers', () => {
     assert.ok(withReserves.coverage > 0, 'coverage must be positive with BIS + reserves');
   });
 
-  it('scoreCurrencyExternal: non-BIS country with good reserves scores higher than with bad reserves', async () => {
+  it('scoreCurrencyExternal: good reserves score higher than bad reserves (inflation+reserves path)', async () => {
+    // PR 3 §3.5: BIS retired. inflation+reserves path → coverage 0.85.
     const makeReader = (months: number) => async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
       if (key === 'economic:imf:macro:v2') return { countries: { MZ: { inflationPct: 15, currentAccountPct: -5, year: 2024 } } };
       if (key === 'resilience:static:MZ') return { fxReservesMonths: { source: 'worldbank', months, year: 2023 } };
       return null;
@@ -319,7 +302,7 @@ describe('resilience dimension scorers', () => {
     const badRes = await scoreCurrencyExternal('MZ', makeReader(1.5));
     assert.ok(goodRes.score > badRes.score, `good reserves (${goodRes.score}) must score higher than bad (${badRes.score})`);
     assert.equal(goodRes.coverage, badRes.coverage, 'coverage should be the same when both have inflation+reserves');
-    assert.equal(goodRes.coverage, 0.55, 'non-BIS with inflation+reserves gets coverage=0.55');
+    assert.equal(goodRes.coverage, 0.85, 'inflation+reserves path gets coverage=0.85');
   });
 
   it('scoreMacroFiscal: IMF current account loaded, surplus country scores higher than deficit', async () => {
@@ -1150,8 +1133,9 @@ describe('resilience source-failure aggregation (T1.7)', () => {
   });
 
   it('scoreExternalDebtCoverage: low debt-to-reserves ratio scores well', async () => {
+    // PR 3 §3.5: goalpost tightened (5→2). NO ratio=0.2 → (2-0.2)/2 = 90.
     const no = await scoreExternalDebtCoverage('NO', fixtureReader);
-    assert.ok(no.score > 90, `NO with ratio 0.2 should score >90, got ${no.score}`);
+    assert.ok(no.score >= 85, `NO with ratio 0.2 should score >=85, got ${no.score}`);
   });
 
   it('scoreImportConcentration: low HHI scores well', async () => {
@@ -1167,17 +1151,29 @@ describe('resilience source-failure aggregation (T1.7)', () => {
     assert.equal(no.imputationClass, null, 'NO has real data, no imputation class');
   });
 
-  it('scoreFuelStockDays: country with stock data scores based on coverage', async () => {
+  // PR 3 §3.5: fuelStockDays retired permanently from the core score.
+  // scoreFuelStockDays returns coverage=0 + observedWeight=0 +
+  // imputationClass=null for every country regardless of seed content —
+  // the previous two behavioural tests no longer apply because there is
+  // no distinction between "has data" and "missing data" any more. New
+  // regression test: assert the retirement shape holds identically for
+  // a country that USED to have data and a country that never did, so no
+  // future commit silently re-enables the old branch.
+  //
+  // imputationClass is pinned to `null` (not 'source-failure') because
+  // 'source-failure' renders as "Source down: upstream seeder failed"
+  // with a `!` icon in the widget — semantically wrong for an intentional
+  // retirement. `null` lets the widget render the dimension as a neutral
+  // "absent" cell without a false outage label.
+  it('scoreFuelStockDays: retired — returns coverage=0 + null imputationClass for every country', async () => {
     const no = await scoreFuelStockDays('NO', fixtureReader);
-    // NO fixture: fuelStockDays=90 → normalizeHigherBetter(90, 0, 120) = 75
-    assert.ok(no.score > 60, `NO with 90 fuelStockDays should score >60, got ${no.score}`);
-    assert.ok(no.observedWeight > 0, 'real fuel-stock data must have observed weight');
-  });
-
-  it('scoreFuelStockDays: country without fuel stock data returns unmonitored', async () => {
     const ye = await scoreFuelStockDays('YE', fixtureReader);
-    assert.equal(ye.imputationClass, 'unmonitored');
-    assert.equal(ye.observedWeight, 0);
+    for (const [label, result] of [['NO', no], ['YE', ye]] as const) {
+      assert.equal(result.coverage, 0, `${label}: retired dimension must have coverage=0`);
+      assert.equal(result.observedWeight, 0, `${label}: retired dimension must have observedWeight=0`);
+      assert.equal(result.imputedWeight, 0, `${label}: retired dimension must have imputedWeight=0`);
+      assert.equal(result.imputationClass, null, `${label}: retired dimension must not tag source-failure (intentional retirement, not a runtime outage)`);
+    }
   });
 
   it('recovery domain is present in scoreAllDimensions output', async () => {
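The expected scores quoted in the test comments above (84, 0, and 90) all follow from one lower-is-better goalpost normalisation. The sketch below reproduces that arithmetic. The `(value, best, worst)` argument order and the clamping are assumptions inferred from those comments; the repository's real `normalizeLowerBetter` may differ, for example in rounding.

```typescript
// Lower-is-better goalpost normalisation onto a 0..100 scale.
// Values at or below `best` score 100; values at or above `worst` score 0.
// Hypothetical re-implementation for illustration only.
function normalizeLowerBetterSketch(value: number, best: number, worst: number): number {
  const clamped = Math.min(Math.max(value, best), worst);
  return ((worst - clamped) / (worst - best)) * 100;
}

// Inflation proxy, goalposts best=0 and worst=50 (from the test comments).
const lowInflation = normalizeLowerBetterSketch(8, 0, 50); // (50 - 8) / 50 * 100
const hyperInflation = normalizeLowerBetterSketch(250, 0, 50); // clamps to the worst goalpost

// Debt-to-reserves ratio, goalposts best=0 and worst=2 after the re-goalposting.
const lowDebtRatio = normalizeLowerBetterSketch(0.2, 0, 2); // (2 - 0.2) / 2 * 100
const highDebtRatio = normalizeLowerBetterSketch(4, 0, 2); // any ratio >= 2 scores 0

console.log({ lowInflation, hyperInflation, lowDebtRatio, highDebtRatio });
```

This also shows why the monotonicity test moved its probe values from 1 and 4 to 0.3 and 1.5: with worst=2, a ratio of 4 sits on the clamp, so it no longer exercises the gradient.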
@@ -49,12 +49,28 @@ function installRedisFixtures() {

 describe('resilience release gate', () => {
   it('keeps all 19 dimension scorers non-placeholder for the required countries', async () => {
+    // PR 3 §3.5: fuelStockDays is retired — scoreFuelStockDays emits
+    // coverage=0 + imputationClass=null for every country. The retirement
+    // is intentional (construct incomparable across net importers / net
+    // exporters). Allow-list it so the zero-coverage placeholder check
+    // still catches unintended regressions in the OTHER 18 dimensions.
+    //
+    // imputationClass=null (not 'source-failure') because the widget maps
+    // 'source-failure' to a "Source down: upstream seeder failed" label
+    // with a `!` icon — surfacing that for every country on a deliberate
+    // retirement would manufacture a false outage signal.
+    const RETIRED_DIMENSIONS = new Set(['fuelStockDays']);
     for (const countryCode of REQUIRED_DIMENSION_COUNTRIES) {
       const scores = await scoreAllDimensions(countryCode, fixtureReader);
       const entries = Object.entries(scores);
       assert.equal(entries.length, 19, `${countryCode} should have all resilience dimensions`);
       for (const [dimensionId, score] of entries) {
         assert.ok(Number.isFinite(score.score), `${countryCode} ${dimensionId} should produce a numeric score`);
+        if (RETIRED_DIMENSIONS.has(dimensionId)) {
+          assert.equal(score.coverage, 0, `${countryCode} ${dimensionId} is retired and must stay at coverage=0`);
+          assert.equal(score.imputationClass, null, `${countryCode} ${dimensionId} retired dimensions must tag null imputationClass (not source-failure)`);
+          continue;
+        }
         assert.ok(score.coverage > 0, `${countryCode} ${dimensionId} should not fall back to zero-coverage placeholder scoring`);
       }
     }
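The release gate above allow-lists the retired dimension because it emits coverage=0 by design. A minimal sketch of why such a dimension drops out of a domain score, assuming a plain coverage-weighted mean of the form sum(score × coverage) / sum(coverage). The name `coverageWeightedMeanSketch` and the sample values are hypothetical; the project's real `coverageWeightedMean` may apply additional per-dimension weights.

```typescript
// Hypothetical shape: only the two fields this sketch needs.
interface ScoredDimensionSketch {
  score: number;
  coverage: number;
}

// Assumed formula: sum(score * coverage) / sum(coverage).
// A dimension with coverage=0 adds nothing to either sum.
function coverageWeightedMeanSketch(dims: ScoredDimensionSketch[]): number {
  const totalCoverage = dims.reduce((sum, d) => sum + d.coverage, 0);
  if (totalCoverage === 0) return 0; // nothing observed at all
  const weighted = dims.reduce((sum, d) => sum + d.score * d.coverage, 0);
  return weighted / totalCoverage;
}

const activeDims: ScoredDimensionSketch[] = [
  { score: 80, coverage: 1 },
  { score: 40, coverage: 1 },
];
// Same domain plus a retired dimension pinned at coverage=0.
const withRetiredDim: ScoredDimensionSketch[] = [
  ...activeDims,
  { score: 50, coverage: 0 },
];

const meanActive = coverageWeightedMeanSketch(activeDims);
const meanWithRetired = coverageWeightedMeanSketch(withRetiredDim);
```

Under this assumption both means are identical, which is why no explicit weight transfer is needed when a dimension is retired at coverage=0.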
55
tests/resilience-retired-dimensions-parity.test.mts
Normal file
@@ -0,0 +1,55 @@
+import assert from 'node:assert/strict';
+import { readFileSync } from 'node:fs';
+import { fileURLToPath } from 'node:url';
+import { dirname, resolve } from 'node:path';
+import { describe, it } from 'node:test';
+
+import { RESILIENCE_RETIRED_DIMENSIONS } from '../server/worldmonitor/resilience/v1/_dimension-scorers';
+
+// Keep the client-side mirror (`RESILIENCE_RETIRED_DIMENSION_IDS` in
+// src/components/resilience-widget-utils.ts) in lockstep with the
+// server-side authoritative set. Server and widget cannot share a
+// module, but their retired-dim view must never diverge — divergence
+// would leave one surface filtering the wrong set and re-introduce
+// the PR 3 §3.5 drag regression on that surface.
+//
+// We parse the widget file as text (rather than importing it) because
+// the widget module indirectly pulls in browser-only types that crash
+// a plain node test runner. Same pattern as existing widget-util tests.
+
+const here = dirname(fileURLToPath(import.meta.url));
+const WIDGET_UTILS_PATH = resolve(here, '../src/components/resilience-widget-utils.ts');
+
+function parseClientRetiredIds(): Set<string> {
+  const source = readFileSync(WIDGET_UTILS_PATH, 'utf8');
+  const match = source.match(
+    /const RESILIENCE_RETIRED_DIMENSION_IDS:\s*ReadonlySet<string>\s*=\s*new Set\(\[([^\]]*)\]\)/,
+  );
+  if (!match) {
+    throw new Error(
+      'Could not locate RESILIENCE_RETIRED_DIMENSION_IDS constant in resilience-widget-utils.ts. ' +
+      'If the constant was renamed or reformatted, update this parser to match.',
+    );
+  }
+  const ids = match[1]!
+    .split(',')
+    .map((entry) => entry.trim())
+    .filter((entry) => entry.length > 0)
+    .map((entry) => entry.replace(/^['"]|['"]$/g, ''));
+  return new Set(ids);
+}
+
+describe('retired-dimensions client/server parity', () => {
+  it('server RESILIENCE_RETIRED_DIMENSIONS matches client RESILIENCE_RETIRED_DIMENSION_IDS', () => {
+    const serverSet = new Set<string>(RESILIENCE_RETIRED_DIMENSIONS);
+    const clientSet = parseClientRetiredIds();
+
+    const serverOnly = [...serverSet].filter((id) => !clientSet.has(id));
+    const clientOnly = [...clientSet].filter((id) => !serverSet.has(id));
+
+    assert.deepEqual(serverOnly, [],
+      `Server-only retired dims: ${serverOnly.join(', ')}. Update RESILIENCE_RETIRED_DIMENSION_IDS in src/components/resilience-widget-utils.ts.`);
+    assert.deepEqual(clientOnly, [],
+      `Client-only retired dims: ${clientOnly.join(', ')}. Update RESILIENCE_RETIRED_DIMENSIONS in server/worldmonitor/resilience/v1/_dimension-scorers.ts.`);
+  });
+});
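The parity test reads the widget constant as text instead of importing it. A minimal sketch of that parse step run against an in-memory string, so the behaviour of the regex and the quote stripping can be checked without the real file. `parseRetiredIdsFromSource` and `sampleWidgetSource` are hypothetical names, and the sample declaration is only an assumed layout of the constant in `resilience-widget-utils.ts`.

```typescript
// Hypothetical helper: same regex and clean-up chain as the parity test,
// but taking the source text as an argument instead of reading a file.
function parseRetiredIdsFromSource(source: string): Set<string> {
  const match = source.match(
    /const RESILIENCE_RETIRED_DIMENSION_IDS:\s*ReadonlySet<string>\s*=\s*new Set\(\[([^\]]*)\]\)/,
  );
  if (!match) {
    throw new Error('RESILIENCE_RETIRED_DIMENSION_IDS not found in source text');
  }
  const ids = (match[1] ?? '')
    .split(',')                                   // one entry per array element
    .map((entry) => entry.trim())                 // drop newlines and indentation
    .filter((entry) => entry.length > 0)          // drop the empty tail after a trailing comma
    .map((entry) => entry.replace(/^['"]|['"]$/g, '')); // strip surrounding quotes
  return new Set(ids);
}

// Assumed layout of the client constant, including a trailing comma.
const sampleWidgetSource = `
export const RESILIENCE_RETIRED_DIMENSION_IDS: ReadonlySet<string> = new Set([
  'fuelStockDays',
]);
`;

const parsedRetiredIds = parseRetiredIdsFromSource(sampleWidgetSource);
```

The regex is deliberately strict about the declaration's shape, so reformatting the constant (for example, dropping the `ReadonlySet<string>` annotation) makes the parser throw rather than silently return an empty set.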
@@ -60,10 +60,17 @@ describe('resilience scorer contracts', () => {
     // source-failure when the adapter is in seed-meta failedDatasets. This is the
     // single source of truth for "no currency data"; null-imputationClass paths
     // on non-real-data return branches are no longer permitted.
+    // PR 3 §3.5: fuelStockDays removed from this set — scoreFuelStockDays
+    // now returns coverage=0 + imputationClass=null for every country
+    // (retired), so it passes the default coverage=0 assertion below
+    // instead of the T1.7 fall-through assertion. The `null` tag (rather
+    // than 'source-failure') reflects the intentional retirement — see
+    // the widget `formatDimensionConfidence` absent-path which would
+    // otherwise surface a false "Source down" label on every country.
     const coverageZeroExempt = new Set([
       'currencyExternal',
       'fiscalSpace', 'reserveAdequacy', 'externalDebtCoverage',
-      'importConcentration', 'stateContinuity', 'fuelStockDays',
+      'importConcentration', 'stateContinuity',
     ]);
     for (const [dimensionId, scorer] of Object.entries(RESILIENCE_DIMENSION_SCORERS)) {
       const result = await scorer('US');
@@ -92,13 +99,17 @@ describe('resilience scorer contracts', () => {
       return [domainId, average];
     }));

+    // PR 3 §3.5: economic 68.33 → 66.33 after currencyExternal rebuild.
+    // Recovery 54.83 → 47.33 after externalDebtCoverage goalpost was
+    // tightened from (0..5) to (0..2) per §3.5 point 3 (US ratio=1.5
+    // now scores 25 instead of 70).
     assert.deepEqual(domainAverages, {
-      economic: 68.33,
+      economic: 66.33,
       infrastructure: 79,
       energy: 80,
       'social-governance': 61.75,
       'health-food': 60.5,
-      recovery: 54.83,
+      recovery: 47.33,
     });

     function round(v: number, d = 2) { return Number(v.toFixed(d)); }
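The comment in this hunk says a US ratio of 1.5 scores 25 after the goalpost change instead of 70. A minimal sketch that reproduces both numbers, assuming a linear lower-is-better normalisation between a best and a worst goalpost. `normalizeLowerBetterSketch` is a hypothetical name; the repository's actual normaliser is not shown in this diff and may differ in clamping or rounding.

```typescript
// Assumption: linear lower-is-better goalposts, clamped to [0, 100].
// best  -> score 100, worst -> score 0.
function normalizeLowerBetterSketch(value: number, best: number, worst: number): number {
  const raw = (100 * (worst - value)) / (worst - best);
  return Math.max(0, Math.min(100, raw));
}

const usDebtRatio = 1.5;

// Old goalposts (0..5): 100 * (5 - 1.5) / 5 = 70
const scoreOldGoalpost = normalizeLowerBetterSketch(usDebtRatio, 0, 5);

// Tightened goalposts (0..2): 100 * (2 - 1.5) / 2 = 25
const scoreNewGoalpost = normalizeLowerBetterSketch(usDebtRatio, 0, 2);
```

Tightening the worst goalpost from 5 to 2 leaves the input unchanged but moves it from 30% to 75% of the way to the worst case, which is what pulls the recovery average down.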
@@ -126,9 +137,16 @@ describe('resilience scorer contracts', () => {
     const stressScore = round(coverageWeightedMean(stressDims));
     const stressFactor = round(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);

-    assert.equal(baselineScore, 62.64);
-    assert.equal(stressScore, 65.84);
-    assert.equal(stressFactor, 0.3416);
+    // PR 3 §3.5: 62.64 → 63.63 (fuelStockDays retirement) → 60.12
+    // (externalDebtCoverage goalpost tightened; US score drops from 70
+    // to 25, pulling the coverage-weighted baseline mean down).
+    assert.equal(baselineScore, 60.12);
+    // PR 3 §3.5: 65.84 → 67.85 (fuelStockDays retirement) → 67.21
+    // (currencyExternal rebuilt on IMF inflation + WB reserves, coverage
+    // shifts and US stress score moves). stressFactor updates in lockstep:
+    // 1 - 67.21/100 = 0.3279, clamped to 0.5.
+    assert.equal(stressScore, 67.21);
+    assert.equal(stressFactor, 0.3279);

     const overallScore = round(
       RESILIENCE_DOMAIN_ORDER.map((domainId) => {
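The stressFactor expectations in this hunk follow directly from the stressScore values. A small standalone sketch of that arithmetic, reusing the expression from the test (`1 - stressScore / 100`, clamped to [0, 0.5], rounded to four decimals). The names `roundTo` and `stressFactorFrom` are hypothetical wrappers introduced here for illustration.

```typescript
// Same rounding approach as the test's local `round` helper.
function roundTo(v: number, d = 2): number {
  return Number(v.toFixed(d));
}

// Hypothetical wrapper around the expression used in the test:
// 1 - stressScore/100, clamped to the [0, 0.5] range, 4 decimals.
function stressFactorFrom(stressScore: number): number {
  return roundTo(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);
}

const factorAfterPr3 = stressFactorFrom(67.21);  // new expectation in the hunk
const factorBeforePr3 = stressFactorFrom(65.84); // removed expectation
const factorLowStress = stressFactorFrom(30);    // 0.7 before clamping, capped at 0.5
const factorOverRange = stressFactorFrom(120);   // negative before clamping, floored at 0
```

For the US fixture the value 0.3279 sits inside the clamp range, so the 0.5 ceiling only comes into play for stress scores below 50.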
@@ -140,7 +158,10 @@ describe('resilience scorer contracts', () => {
         return round(cwMean) * getResilienceDomainWeight(domainId);
       }).reduce((sum, v) => sum + v, 0),
     );
-    assert.equal(overallScore, 65.57);
+    // PR 3 §3.5: 65.57 → 65.82 (fuelStockDays retirement) → 65.52
+    // (currencyExternal rebuild) → 63.27 (externalDebtCoverage goalpost
+    // tightened 0..5 → 0..2; US recovery-domain contribution drops).
+    assert.equal(overallScore, 63.27);
   });

   it('baselineScore is computed from baseline + mixed dimensions only', async () => {
@@ -211,7 +232,9 @@ describe('resilience scorer contracts', () => {
     );

     assert.ok(expected > 0, 'overall should be positive');
-    assert.equal(expected, 65.57, 'overallScore should match sum(domainScore * domainWeight)');
+    // PR 3 §3.5: 65.82 → 65.52 (currencyExternal rebuild) → 63.27 after
+    // externalDebtCoverage goalpost tightened from (0..5) to (0..2).
+    assert.equal(expected, 63.27, 'overallScore should match sum(domainScore * domainWeight); 65.52 → 63.27 after PR 3 §3.5 externalDebtCoverage re-goalpost');
   });

   it('stressFactor is still computed (informational) and clamped to [0, 0.5]', () => {
@@ -76,6 +76,45 @@ test('formatResilienceConfidence shows sparse-data copy when low confidence is s
   );
 });

+// PR 3 §3.5 follow-up: retired dimensions (fuelStockDays, post-PR-3)
+// return coverage=0 structurally (by design, not by sparsity) and
+// contribute zero weight to domain scoring. The widget's displayed
+// coverage percentage must exclude them — otherwise a deliberate
+// construct retirement would drag the user-facing confidence reading
+// down for every country even though the dimension is not part of the
+// score. Reviewer P1 anchor: US shows avgCoverage=0.8105 with retired
+// dim included vs 0.8556 with retired excluded.
+//
+// Important: the filter is keyed on the retired-dim ID, NOT on
+// `coverage === 0`. A non-retired dimension can legitimately emit
+// coverage=0 on a genuinely sparse-data country (via weightedBlend
+// fall-through), and those entries must continue to drag confidence
+// down — that is the sparse-data signal lowConfidence exists to
+// surface.
+test('formatResilienceConfidence excludes retired dimensions by ID (not by coverage=0)', () => {
+  const withRetired: ResilienceScoreResponse = {
+    ...baseResponse,
+    domains: [
+      { id: 'economic', score: 80, weight: 0.22, dimensions: [
+        { id: 'macroFiscal', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 },
+        // Non-retired dim with coverage=0: must STAY in the average
+        // (genuine data sparsity, not a retirement).
+        { id: 'currencyExternal', score: 50, coverage: 0, observedWeight: 0, imputedWeight: 0 },
+      ] },
+      { id: 'recovery', score: 65, weight: 1.0, dimensions: [
+        { id: 'fiscalSpace', score: 72, coverage: 0.8, observedWeight: 0.8, imputedWeight: 0.2 },
+        // Retired dimension: coverage=0 is structural; must be excluded.
+        { id: 'fuelStockDays', score: 50, coverage: 0, observedWeight: 0, imputedWeight: 0 },
+      ] },
+    ],
+  };
+  // Average over non-retired entries: (0.9 + 0 + 0.8) / 3 = 0.5667 → 57%.
+  // If fuelStockDays were included: (0.9 + 0 + 0.8 + 0) / 4 = 0.425 → 43%.
+  // If we filtered by coverage=0: (0.9 + 0.8) / 2 = 0.85 → 85% (the
+  // over-aggressive filter that would mask genuine sparsity).
+  assert.equal(formatResilienceConfidence(withRetired), 'Coverage 57% ✓');
+});
+
 test('formatResilienceChange30d preserves explicit sign formatting', () => {
   assert.equal(formatResilienceChange30d(2.41), '30d +2.4');
   assert.equal(formatResilienceChange30d(-1.26), '30d -1.3');
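The test comment contrasts filtering by retired ID with filtering by `coverage === 0`. A minimal sketch of that difference using the same four coverage values as the test fixture. `coveragePercentSketch` and `RETIRED_IDS_SKETCH` are hypothetical stand-ins; the widget's real `formatResilienceConfidence` also produces the label text and the low-confidence copy, which this sketch leaves out.

```typescript
// Hypothetical minimal shape: only the fields needed for the average.
interface DimensionCoverageSketch {
  id: string;
  coverage: number;
}

// Stand-in for the client-side retired-dimension set.
const RETIRED_IDS_SKETCH: ReadonlySet<string> = new Set(['fuelStockDays']);

// Averages coverage over the dimensions that pass `keep`, as a whole percent.
function coveragePercentSketch(
  dims: DimensionCoverageSketch[],
  keep: (dim: DimensionCoverageSketch) => boolean,
): number {
  const kept = dims.filter(keep);
  if (kept.length === 0) return 0;
  const average = kept.reduce((sum, dim) => sum + dim.coverage, 0) / kept.length;
  return Math.round(average * 100);
}

const fixtureDims: DimensionCoverageSketch[] = [
  { id: 'macroFiscal', coverage: 0.9 },
  { id: 'currencyExternal', coverage: 0 }, // genuinely sparse, not retired
  { id: 'fiscalSpace', coverage: 0.8 },
  { id: 'fuelStockDays', coverage: 0 },    // retired
];

// Filter by retired ID: the sparse dimension still counts. (0.9 + 0 + 0.8) / 3 → 57
const percentById = coveragePercentSketch(fixtureDims, (dim) => !RETIRED_IDS_SKETCH.has(dim.id));

// Over-aggressive filter on coverage=0: sparsity is hidden. (0.9 + 0.8) / 2 → 85
const percentByZeroCoverage = coveragePercentSketch(fixtureDims, (dim) => dim.coverage !== 0);
```

Keying the filter on the ID keeps `currencyExternal` in the denominator, so a country with real data gaps still shows a lower coverage figure.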