feat(resilience): PR 3 — dead-signal cleanup (plan §3.5, §3.6) (#3297)

* feat(resilience): PR 3 §3.5 — retire fuelStockDays from core score permanently

First commit in PR 3 of the resilience repair plan. Retires `fuelStockDays` from the core score with no replacement.

Why permanent, not replaced: IEA emergency-stockholding rules are defined in days of NET IMPORTS and do not bind net exporters by design. Norway/Canada/US measured in days-of-imports are incomparable to Germany/Japan measured the same way — the construct is fundamentally different across the two country classes. No globally-comparable recovery-fuel signal can be built from this source; the pre-repair probe showed 100% imputed at 50 for every country in the April 2026 freeze.

scoreFuelStockDays:
- Rewritten to return coverage=0 + observedWeight=0 + imputationClass='source-failure' for every country regardless of seed content.
- Drops the dimension from the `recovery` domain's coverage-weighted mean automatically; remaining recovery dimensions pick up the share via re-normalisation in `_shared.ts#coverageWeightedMean`.
- No explicit weight transfer needed — the coverage-weighted blend handles redistribution.

Registry:
- recoveryFuelStockDays re-tagged from tier='enrichment' to tier='experimental' so the Core coverage gate treats it as out-of-score.
- Description updated to make the retirement explicit; entry stays in the registry for structural continuity (the dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER for the 19-dimension tests; removing the dimension entirely is a PR 4 structural-audit concern).

Housekeeping:
- Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no longer read; noUnusedLocals would reject it).
- Removed `RecoveryFuelStocksCountry` interface for the same reason. Comment at the removed declaration instructs future maintainers not to re-add the type as a reservation; when a new recovery-fuel concept lands, introduce a fresh interface.

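The redistribution described for `scoreFuelStockDays` can be illustrated with a minimal sketch. It assumes the domain mean weights each dimension by its coverage and nothing else; the real `coverageWeightedMean` in `_shared.ts` is not shown in this commit and may differ. `DimScoreSketch`, `coverageWeightedMeanSketch`, and every dimension id other than `fuelStockDays` are hypothetical names used only for illustration.

```typescript
// Hypothetical, reduced shape. The real ResilienceDimensionScore has more fields.
interface DimScoreSketch {
  id: string;
  score: number; // 0..100
  coverage: number; // 0..1
}

// Assumption: each dimension is weighted by its coverage, so a dimension with
// coverage 0 adds nothing to the numerator or to the denominator.
function coverageWeightedMeanSketch(dims: DimScoreSketch[]): number | null {
  let weightedSum = 0;
  let totalCoverage = 0;
  for (const dim of dims) {
    weightedSum += dim.score * dim.coverage;
    totalCoverage += dim.coverage;
  }
  return totalCoverage > 0 ? weightedSum / totalCoverage : null;
}

// Five active recovery dimensions (ids are placeholders) plus the retired one.
const activeDims: DimScoreSketch[] = [
  { id: 'dimA', score: 60, coverage: 1 },
  { id: 'dimB', score: 40, coverage: 1 },
  { id: 'dimC', score: 80, coverage: 1 },
  { id: 'dimD', score: 70, coverage: 1 },
  { id: 'dimE', score: 50, coverage: 1 },
];
const retiredDim: DimScoreSketch = { id: 'fuelStockDays', score: 50, coverage: 0 };

const meanWithRetired = coverageWeightedMeanSketch([...activeDims, retiredDim]);
const meanWithoutRetired = coverageWeightedMeanSketch(activeDims);
console.log(meanWithRetired, meanWithoutRetired);
```

Under this assumption the retired dimension's score never reaches the domain mean, which is why no explicit weight transfer is needed.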
Plan reference: §3.5 point 1 of `docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.

51 resilience tests pass, typecheck + biome clean. The `recovery` domain's published score will shift slightly for every country because the 0.10 slot that fuelStockDays was imputing to now redistributes; the compare-harness acceptance-gate rerun at merge time will quantify the shift per plan §6 gates.

* feat(resilience): PR 3 §3.5 — retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves

BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35 fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60 countries max — so every non-BIS country fell through to curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45). Combined with reserveMarginPct already removed in PR 1, currencyExternal was the clearest "construct absent for most of the world" carrier left in the scorer.

Changes:

_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX reserves only. Coverage ladder:
    inflation + reserves → 0.85 (observed primary + secondary)
    inflation only → 0.55
    reserves only → 0.40
    neither → 0.30 (IMPUTE.bisEer retained for snapshot continuity; semantics read as "no IMF + no WB reserves" now)
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved via comment only, flagged by noUnusedLocals), stddev() helper, getCountryBisExchangeRates() loader, BisExchangeRate interface, dateToSortableNumber() — all were used exclusively by the retired BIS path.

_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core, sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental (BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).

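The coverage ladder and the 0.60/0.40 blend above can be sketched as a pure function. This is an illustration under stated assumptions, not the shipped scorer: `normalizeLowerBetterSketch` and `normalizeHigherBetterSketch` assume a linear, clamped 0..100 mapping, rounding is omitted, and the imputed score for the "neither" branch is left as `null` because the value of `IMPUTE.bisEer.score` is not stated here. All names ending in `Sketch` are hypothetical.

```typescript
// Assumption: linear mapping onto 0..100, clamped at both goalposts.
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}
function normalizeLowerBetterSketch(value: number, best: number, worst: number): number {
  return clamp01((worst - value) / (worst - best)) * 100;
}
function normalizeHigherBetterSketch(value: number, worst: number, best: number): number {
  return clamp01((value - worst) / (best - worst)) * 100;
}

interface CurrencyExternalSketch {
  score: number | null; // null stands in for the imputed score, not shown here
  coverage: number;
}

function scoreCurrencyExternalSketch(
  inflationPct: number | null,
  reservesMonths: number | null,
): CurrencyExternalSketch {
  const inflationScore =
    inflationPct != null ? normalizeLowerBetterSketch(Math.min(inflationPct, 50), 0, 50) : null;
  const reservesScore =
    reservesMonths != null ? normalizeHigherBetterSketch(Math.min(reservesMonths, 12), 1, 12) : null;

  if (inflationScore != null && reservesScore != null) {
    return { score: inflationScore * 0.6 + reservesScore * 0.4, coverage: 0.85 };
  }
  if (inflationScore != null) return { score: inflationScore, coverage: 0.55 };
  if (reservesScore != null) return { score: reservesScore, coverage: 0.4 };
  return { score: null, coverage: 0.3 };
}

const both = scoreCurrencyExternalSketch(5, 6.5);
const inflationOnly = scoreCurrencyExternalSketch(5, null);
const reservesOnly = scoreCurrencyExternalSketch(null, 6.5);
const neither = scoreCurrencyExternalSketch(null, null);
console.log(both, inflationOnly, reservesOnly, neither);
```

The point of the ladder is that coverage depends only on which global-coverage inputs are present, so a BIS-tracked country and a non-BIS country with the same inputs land on the same rung.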
scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability → imf-macro-country-field field=inflationPct so the registry-parity test passes and the correlation harness sees the new construct.

tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266 (BIS-outage coverage 0.35 branch) which collapsed to the inflation-only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55; inflation+reserves 0.55 → 0.85.

tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).

All 6324 data-tier tests pass. typecheck:api clean. No change to seeders or Redis keys; this is a pure scorer + registry rebuild.

* feat(resilience): PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)

Plan §2.1 diagnosis table showed externalDebtCoverage saturating at score=100 across all 9 probe countries — including stressed states. Signal was collapsed.

Root cause: (worst=5, best=0) gave every country with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's reserve-adequacy threshold (ratio=1.0) to score 80 — well into "no worry" territory instead of the "mild warning" it should be.

Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score 50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure). Ratios above 2.0 clamp to 0, consistent with "beyond this point the country is already in crisis; exact value stops mattering."

Files changed:
- _indicator-registry.ts: recoveryDebtToReserves goalposts {worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to cite Greenspan-Guidotti; inline comment documents anchor + rationale.
- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound changed from (0..5) to (0..2), with inline comment.
- docs/methodology/country-resilience-index.mdx: goalpost table row 5-0 → 2-0, description cites Greenspan-Guidotti.
- docs/methodology/indicator-sources.yaml:
  * constructStatus: dead-signal → observed-mechanism (signal is now discriminating).
  * reviewNotes updated to describe the new anchor.
  * mechanismTestRationale names the Greenspan-Guidotti rule.
- tests/resilience-dimension-monotonicity.test.mts: updated the comment + picked values inside the (0..2) discriminating band (0.3 and 1.5). Old values (1 vs 4) had 4 clamping to 0.
- tests/resilience-dimension-scorers.test.mts: NO score threshold relaxed >90 → >=85 (NO ratio=0.2 now scores 90, was 96).
- tests/resilience-scorers.test.mts: fixture drift:
  * domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
  * baselineScore 63.63 → 60.12 (extDebt is baseline type).
  * overallScore 65.52 → 63.27.
  * stressScore / stressFactor unchanged (extDebt is baseline-only).

All 6324 data-tier tests pass. typecheck:api clean.

* feat(resilience): PR 3 §3.6 — CI gate on indicator coverage and nominal weight

Plan §3.6 adds a new acceptance criterion (also §5 item 5):

> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.

This commit enforces the NOMINAL-WEIGHT half as a unit test that runs on every CI build. The EFFECTIVE-INFLUENCE half is produced by scripts/validate-resilience-sensitivity.mjs as a committed artifact; the gate file only asserts that script still exists so a refactor that removes it breaks the build loudly.

Why the gate exists (plan §3.6): "A dimension at 30% observed coverage carries the same effective weight as one at 95%. This contradicts the OECD/JRC handbook on uncertainty analysis."

Implementation: tests/resilience-coverage-influence-gate.test.mts — three tests:

1. Nominal-weight gate: for every core indicator with coverage < 137 countries (70% of the ~195-country universe), computes its nominal overall weight as indicator.weight × (1/dimensions-in-domain) × domain-weight and asserts it does not exceed 5%. Equal-share-per-dimension is the *upper bound* on runtime weight (coverage-weighted mean gives a lower share when a dimension drops out), so this is a strict bound: if the nominal number passes, the runtime number also passes for every country.
2. Effective-influence contract: asserts the sensitivity script exists at its expected path. Removing it (intentionally or by refactor) breaks the build.
3. Audit visibility: prints the top 10 core indicators by nominal overall weight. No assertion beyond "ran" — the list lets reviewers spot outliers that pass the gate but are near the cap.

Current state (observed from audit output):
  recoveryReserveMonths: nominal=4.17% coverage=188
  recoveryDebtToReserves: nominal=4.17% coverage=185
  recoveryImportHhi: nominal=4.17% coverage=190
  inflationStability: nominal=3.40% coverage=185
  electricityConsumption: nominal=3.30% coverage=217
  ucdpConflict: nominal=3.09% coverage=193

Every core indicator has coverage ≥ 180 (already enforced by the pre-existing indicator-tiering test), so the nominal-weight gate has no current violators — its purpose is catching future drift, not flagging today's state.

All 6327 data-tier tests pass. typecheck:api clean.

* docs(resilience): PR 3 methodology doc — document §3.5 dead-signal retirements + §3.6 coverage gate

Methodology-doc update capturing the three §3.5 landings and the §3.6 CI gate. Five edits:

1. **Known construct limitations section (#5 and #6):** strikethrough the original "dead signals" and "no coverage-based weight cap" items, annotate them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" + specifics of what shipped.
2. **Currency & External H4 section:** completely rewritten.
   Old table (fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is replaced by the two-indicator post-PR-3 table (inflationStability at 0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out (0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as experimental-tier drill-downs only.
3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the methodology-lint H4-to-dimension mapping does not break; body rewritten to explain that the dimension is retired from core but the seeder still runs for IEA-member drill-downs.
4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description cites Greenspan-Guidotti reserve-adequacy rule.
5. **New v2.2 changelog entry** — PR 3 dead-signal cleanup, covering §3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit updates.

No scoring or code changes in this commit. Methodology-lint test passes (H4 mapping intact). All 6327 data-tier tests pass.

* fix(resilience): PR 3 §3.6 gate — correct share-denominator for coverage-weighted aggregation

Reviewer catch (thanks). The previous gate computed each indicator's nominal overall weight as

    indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight

and claimed this was an upper bound ("actual runtime weight is ≤ this when some dimensions drop out on coverage"). That is BACKWARDS for this scorer. The domain aggregation is coverage-weighted (server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean), so when a dimension pins at coverage=0 it is EXCLUDED from the denominator and the surviving dimensions' shares go UP, not down.

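The denominator problem can be made concrete with a small sketch that computes the same indicator's nominal share under both counts. The registry rows below are illustrative: only `recoveryDebtToReserves`, `recoveryReserveMonths`, `recoveryImportHhi`, `recoveryFuelStockDays`, and the `fuelStockDays` / `externalDebtCoverage` dimension ids come from this PR; the other ids, the uniform weight of 1.0, and the helper names ending in `Sketch` are assumptions. The recovery domain weight of 0.25 is taken from the arithmetic quoted in this PR.

```typescript
type TierSketch = 'core' | 'enrichment' | 'experimental';

interface IndicatorSketch {
  id: string;
  dimension: string;
  weight: number;
  tier: TierSketch;
}

const RECOVERY_DOMAIN_WEIGHT = 0.25;

// Six recovery dimensions; fuelStockDays carries no core indicator after retirement.
const recoveryIndicators: IndicatorSketch[] = [
  { id: 'recoveryReserveMonths', dimension: 'dimA', weight: 1.0, tier: 'core' },
  { id: 'recoveryDebtToReserves', dimension: 'externalDebtCoverage', weight: 1.0, tier: 'core' },
  { id: 'recoveryImportHhi', dimension: 'dimC', weight: 1.0, tier: 'core' },
  { id: 'placeholderD', dimension: 'dimD', weight: 1.0, tier: 'core' },
  { id: 'placeholderE', dimension: 'dimE', weight: 1.0, tier: 'core' },
  { id: 'recoveryFuelStockDays', dimension: 'fuelStockDays', weight: 1.0, tier: 'experimental' },
];

// Old denominator: every dimension in the domain, retired or not.
function allDimensionsSketch(indicators: IndicatorSketch[]): Set<string> {
  return new Set(indicators.map((ind) => ind.dimension));
}

// New denominator: only dimensions with at least one core indicator.
function coreBearingDimensionsSketch(indicators: IndicatorSketch[]): Set<string> {
  return new Set(indicators.filter((ind) => ind.tier === 'core').map((ind) => ind.dimension));
}

function nominalOverallWeightSketch(
  indicator: IndicatorSketch,
  dimensionCount: number,
  domainWeight: number,
): number {
  return indicator.weight * (1 / dimensionCount) * domainWeight;
}

const debtToReserves = recoveryIndicators.find((ind) => ind.id === 'recoveryDebtToReserves')!;
const oldShare = nominalOverallWeightSketch(
  debtToReserves,
  allDimensionsSketch(recoveryIndicators).size,
  RECOVERY_DOMAIN_WEIGHT,
);
const newShare = nominalOverallWeightSketch(
  debtToReserves,
  coreBearingDimensionsSketch(recoveryIndicators).size,
  RECOVERY_DOMAIN_WEIGHT,
);
console.log(oldShare.toFixed(4), newShare.toFixed(4));
```

With six dimensions in the denominator the share comes out near 4.17%; with five core-bearing dimensions it is 5%, which sits exactly at the cap rather than comfortably below it.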
PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to coverage=0 for every country — so in the current live state the recovery domain has 5 contributing dimensions (not 6), and each core recovery indicator's nominal share is

    1.0 × 1/5 × 0.25 = 5.00%  (was mis-reported as 4.17%)

The old gate therefore under-estimated nominal influence and could silently pass exactly the kind of low-coverage overweight regression it is meant to block.

Fix:
- Added `coreBearingDimensions(domainId)` helper that counts only dimensions that have ≥1 core indicator in the registry. A dimension with only experimental/enrichment entries (post-retirement fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count, not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict upper bound — explicitly calls out the dynamic case (source failure raising surviving dim shares further) as the sensitivity script's responsibility.
- Added a new regression test: asserts (a) at least one recovery dimension is all-non-core (fuelStockDays post-retirement), (b) fuelStockDays has zero core indicators, and (c) recoveryDebtToReserves nominal = 0.05 exactly (not 0.0417) — any reversion of the retirement or regression to N_total-denominator will fail loudly.

Top-10 audit output now correctly shows:
  recoveryReserveMonths: nominal=5% coverage=188
  recoveryDebtToReserves: nominal=5% coverage=185
  recoveryImportHhi: nominal=5% coverage=190
  (was 4.17% each under the old math)

All 486 resilience tests pass. typecheck:api clean.

Note: the 5% figure is exactly AT the cap, not over it. "exceed" means strictly > 5%, so it still passes. But now the reviewer / audit log reflects reality.

* fix(resilience): PR 3 review — retired-dim confidence drag + false source-failure label

Addresses the Codex review P1 + P2 on PR #3297.

P1 — retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct), but computeLowConfidence, computeOverallCoverage, and the widget's formatResilienceConfidence averaged across all 19 dimensions. That dragged every country's reported averageCoverage down — US went from 0.8556 (active dims only) to 0.8105 (all dims) — enough drift to misclassify edge countries as lowConfidence and to shift the ranking widget's overallCoverage pill for every country.

Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in _dimension-scorers.ts and filter it out of all three averages. The filter is keyed on the retired-dim REGISTRY, not on coverage === 0, because a non-retired dim can legitimately emit coverage=0 on a genuinely sparse-data country via weightedBlend fall-through — those entries MUST keep dragging confidence down (that is the sparse-data signal lowConfidence exists to surface).

Verified: sparse-country release-gate test (marks sparse WHO/FAO countries as low confidence) still passes with the registry-keyed filter; would have failed with a naive coverage=0 filter.

Server-client parity: widget-utils cannot import server code, so RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept in lockstep by tests/resilience-retired-dimensions-parity.test.mts (parses the widget file as text, same pattern as existing widget-util tests that can't import the widget module directly).

P2 — false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure', which the widget maps to "Source down: upstream seeder failed" with a `!` icon for every country. That is semantically wrong for an intentional retirement. Flipped to null so the widget's absent-path renders a neutral cell without a false outage label.
null is already a legal value of ResilienceDimensionScore.imputationClass; no type change needed.

Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the registry-keyed filter semantic for computeOverallCoverage + computeLowConfidence. Includes a negative-control test proving non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new): lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for fuelStockDays flipped to null.

All 494 resilience tests + full 6336/6336 data-tier suite pass. Typecheck clean for both tsconfig.json and tsconfig.api.json.

* docs(resilience): align methodology + registry metadata with shipped imputationClass=null

Follow-up to the previous PR 3 review commit that flipped scoreFuelStockDays's imputationClass from 'source-failure' to null to avoid a false "Source down" widget label on every country. The code changed; the doc and registry metadata did not, leaving three sites in the methodology mdx and two comment/description sites in the registry still claiming imputationClass='source-failure'. Any future reviewer (or tooling that treats the registry description as authoritative) would be misled.

This commit rewrites those sites to describe the shipped behavior:
- imputationClass=null (not 'source-failure'), with the rationale
- exclusion from confidence/coverage averages via the RESILIENCE_RETIRED_DIMENSIONS registry filter
- the distinction between structural retirement (filtered) and runtime coverage=0 (kept so sparse-data countries still flag lowConfidence)

Touched:
- docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
- server/worldmonitor/resilience/v1/_indicator-registry.ts (recoveryFuelStockDays comment block + description field)

No code-behavior change. Docs-only.

Tests: 157 targeted resilience tests pass (incl. methodology-lint + widget + release-gate + confidence-averaging). Typecheck clean on both tsconfig.json and tsconfig.api.json.
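The distinction between structural retirement and runtime coverage=0 can be shown with a minimal sketch comparing three ways of averaging coverage. It is not the shipped `computeOverallCoverage`; the function names, the reduced `DimCoverageSketch` shape, and every dimension id other than `fuelStockDays` are hypothetical.

```typescript
interface DimCoverageSketch {
  id: string;
  coverage: number; // 0..1
}

// Stand-in for the registry of retired dimensions.
const RETIRED_DIMS_SKETCH: ReadonlySet<string> = new Set(['fuelStockDays']);

function meanOf(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Pre-fix behavior: every dimension counts, so the retired one drags the average.
function unfilteredAverageSketch(dims: DimCoverageSketch[]): number {
  return meanOf(dims.map((d) => d.coverage));
}

// Registry-keyed filter: drop retired ids only; sparse coverage=0 dims still count.
function registryKeyedAverageSketch(
  dims: DimCoverageSketch[],
  retired: ReadonlySet<string>,
): number {
  return meanOf(dims.filter((d) => !retired.has(d.id)).map((d) => d.coverage));
}

// Naive filter: drops every coverage=0 dim and so hides genuine data sparsity.
function naiveZeroFilterAverageSketch(dims: DimCoverageSketch[]): number {
  return meanOf(dims.filter((d) => d.coverage !== 0).map((d) => d.coverage));
}

const sparseCountry: DimCoverageSketch[] = [
  { id: 'dimA', coverage: 0.9 },
  { id: 'dimB', coverage: 0.6 },
  { id: 'dimC', coverage: 0 }, // genuinely sparse, must keep dragging confidence
  { id: 'fuelStockDays', coverage: 0 }, // retired by design, must be ignored
];

const unfiltered = unfilteredAverageSketch(sparseCountry);
const registryKeyed = registryKeyedAverageSketch(sparseCountry, RETIRED_DIMS_SKETCH);
const naive = naiveZeroFilterAverageSketch(sparseCountry);
console.log({ unfiltered, registryKeyed, naive });
```

In this fixture the registry-keyed average sits between the other two: higher than the unfiltered value because the retired dimension is gone, lower than the naive value because the sparse dimension still counts.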
2026-04-25 17:14:57 +02:00 · 2026-04-22 23:57:28 +04:00
parent c067a7dd63
commit 7cf37c604c
15 changed files with 838 additions and 210 deletions
--- a/server/worldmonitor/resilience/v1/_dimension-scorers.ts
+++ b/server/worldmonitor/resilience/v1/_dimension-scorers.ts
@@ -167,12 +167,9 @@ interface ImfMacroEntry {
  year?: number | null;
 }

-interface BisExchangeRate {
-  countryCode?: string;
-  realEer?: number;
-  realChange?: number;
-  date?: string;
-}
+// BisExchangeRate interface removed in PR 3 §3.5: only the
+// now-removed getCountryBisExchangeRates() + scoreCurrencyExternal's
+// BIS path used it.

 interface NationalDebtEntry {
  iso3?: string;
@@ -235,7 +232,10 @@ interface SocialVelocityPost {
 const RESILIENCE_STATIC_PREFIX = 'resilience:static:';
 const RESILIENCE_SHIPPING_STRESS_KEY = 'supply_chain:shipping_stress:v1';
 const RESILIENCE_TRANSIT_SUMMARIES_KEY = 'supply_chain:transit-summaries:v1';
-const RESILIENCE_BIS_EXCHANGE_KEY = 'economic:bis:eer:v1';
+// RESILIENCE_BIS_EXCHANGE_KEY removed in PR 3 §3.5: scoreCurrencyExternal
+// no longer reads BIS EER. fxVolatility / fxDeviation indicators remain
+// registered as tier='experimental' for drill-down panels; those panels
+// read BIS directly via their own handlers, not via this scorer.
 const RESILIENCE_BIS_DSR_KEY = 'economic:bis:dsr:v1';
 const RESILIENCE_NATIONAL_DEBT_KEY = 'economic:national-debt:v1';
 const RESILIENCE_IMF_MACRO_KEY = 'economic:imf:macro:v2';
@@ -258,7 +258,11 @@ const RESILIENCE_RECOVERY_FISCAL_SPACE_KEY = 'resilience:recovery:fiscal-space:v
 const RESILIENCE_RECOVERY_RESERVE_ADEQUACY_KEY = 'resilience:recovery:reserve-adequacy:v1';
 const RESILIENCE_RECOVERY_EXTERNAL_DEBT_KEY = 'resilience:recovery:external-debt:v1';
 const RESILIENCE_RECOVERY_IMPORT_HHI_KEY = 'resilience:recovery:import-hhi:v1';
-const RESILIENCE_RECOVERY_FUEL_STOCKS_KEY = 'resilience:recovery:fuel-stocks:v1';
+// RESILIENCE_RECOVERY_FUEL_STOCKS_KEY removed in PR 3: scoreFuelStockDays
+// no longer reads any source key. If a new globally-comparable
+// recovery-fuel concept lands in a future PR, add a new key with an
+// explicit semantic (e.g. resilience:fuel-import-volatility:v1) rather
+// than resurrecting this one.

 // PR 1 energy-construct v2 seed keys (plan §3.1–§3.3). Written by
 // scripts/seed-low-carbon-generation.mjs, scripts/seed-fossil-
@@ -448,13 +452,9 @@ function mean(values: number[]): number | null {
  return values.reduce((sum, value) => sum + value, 0) / values.length;
 }

-function stddev(values: number[]): number | null {
-  if (values.length < 2) return null;
-  const avg = mean(values);
-  if (avg == null) return null;
-  const variance = values.reduce((sum, value) => sum + (value - avg) ** 2, 0) / values.length;
-  return Math.sqrt(variance);
-}
+// stddev() removed in PR 3 §3.5: its only caller was scoreCurrencyExternal's
+// BIS-volatility path which is now retired. Re-introduce if a future
+// scorer genuinely needs a series-volatility computation.

 // T1.7 schema pass: tie-break order when multiple imputed metrics share
 // weight. Earlier classes in this list win on ties. stable-absence expresses
@@ -565,15 +565,8 @@ function matchesCountryText(value: unknown, countryCode: string): boolean {
  return false;
 }

-function dateToSortableNumber(value: unknown): number {
-  if (typeof value === 'string') {
-    const compact = value.replace(/[^0-9]/g, '');
-    const numeric = Number(compact);
-    if (Number.isFinite(numeric) && numeric > 0) return numeric;
-  }
-  const numeric = Number(value);
-  return Number.isFinite(numeric) ? numeric : 0;
-}
+// dateToSortableNumber() removed in PR 3 §3.5: only the now-removed
+// getCountryBisExchangeRates() used it.

 async function defaultSeedReader(key: string): Promise<unknown | null> {
  return getCachedJson(key, true);
@@ -631,14 +624,9 @@ function getImfLaborEntry(raw: unknown, countryCode: string): ImfLaborEntry | nu
  return (countries[countryCode] as ImfLaborEntry | undefined) ?? null;
 }

-function getCountryBisExchangeRates(raw: unknown, countryCode: string): BisExchangeRate[] {
-  const rates: BisExchangeRate[] = Array.isArray((raw as { rates?: unknown[] } | null)?.rates)
-    ? ((raw as { rates?: BisExchangeRate[] }).rates ?? [])
-    : [];
-  return rates
-    .filter((entry) => matchesCountryIdentifier(entry.countryCode, countryCode))
-    .sort((left, right) => dateToSortableNumber(left.date) - dateToSortableNumber(right.date));
-}
+// getCountryBisExchangeRates() removed in PR 3 §3.5: only scoreCurrencyExternal
+// called it, and that scorer no longer reads BIS EER. Drill-down panels
+// that want BIS series read it via their own dedicated handler.

 function getLatestDebtEntry(raw: unknown, countryCode: string): NationalDebtEntry | null {
  const iso3 = ISO2_TO_ISO3[countryCode.toUpperCase()];
@@ -890,76 +878,87 @@ function scoreFxReserves(months: number): number {
  return normalizeHigherBetter(Math.min(months, 12), 1, 12);
 }

+// PR 3 §3.5 point 2: retire the BIS-dependent primary path. BIS EER
+// covers ~64 economies — a core signal that's null for ~150 countries
+// is structurally wrong for a world-ranking score. The scorer now
+// uses only global-coverage inputs:
+//   - inflationStability: IMF `inflationPct` (CPI, ~185 countries)
+//   - fxReservesAdequacy: WB `FI.RES.TOTL.MO` (~160 countries)
+// BIS `realChange` / `realEer` are still read for drill-down panels
+// via the fxVolatility / fxDeviation registry entries (now re-tagged
+// `tier='experimental'` so they're excluded from the Core coverage
+// gate), but the SCORER path ignores them entirely. A country that
+// used to take the "BIS primary" branch now takes the same path as
+// a non-BIS country, producing consistent per-country-reproducibility
+// regardless of whether BIS tracks them.
+//
+// Weight split in the core blend:
+//   inflationStability 0.6 | fxReservesAdequacy 0.4
+// Mirrors the pre-existing "fallback when no BIS" blend weights.
 export async function scoreCurrencyExternal(
  countryCode: string,
  reader: ResilienceSeedReader = defaultSeedReader,
 ): Promise<ResilienceDimensionScore> {
-  const [bisExchangeRaw, imfMacroRaw, staticRecord] = await Promise.all([
-    reader(RESILIENCE_BIS_EXCHANGE_KEY),
+  const [imfMacroRaw, staticRecord] = await Promise.all([
    reader(RESILIENCE_IMF_MACRO_KEY),
    readStaticCountry(countryCode, reader),
  ]);
-  const countryRates = getCountryBisExchangeRates(bisExchangeRaw, countryCode);
-  const latest = countryRates[countryRates.length - 1] ?? null;
-  const volSource = countryRates
-    .map((entry) => safeNum(entry.realChange))
-    .filter((value): value is number => value != null)
-    .slice(-12);
-  const vol = volSource.length >= 2
-    ? (stddev(volSource) ?? 0) * Math.sqrt(12)
-    : volSource.length === 1
-      ? Math.abs(volSource[0]!) * Math.sqrt(12)
-      : null;
+
+  const imfEntry = getImfMacroEntry(imfMacroRaw, countryCode);
+  const hasInflation = imfMacroRaw != null && imfEntry?.inflationPct != null;
+  const inflationScore = hasInflation
+    ? normalizeLowerBetter(Math.min(imfEntry!.inflationPct!, 50), 0, 50)
+    : null;

  const reservesMonths = getFxReservesMonths(staticRecord);
  const reservesScore = reservesMonths != null ? scoreFxReserves(reservesMonths) : null;

-  // Country not in BIS EER (curated ~40 economies), or BIS seed is down entirely.
-  // Use IMF CPI inflation + WB FX reserves as currency stability proxies.
-  // Inflation covers ~185 countries, reserves ~160 countries via World Bank FI.RES.TOTL.MO.
-  if (countryRates.length === 0) {
-    const imfEntry = getImfMacroEntry(imfMacroRaw, countryCode);
-    const hasInflation = imfMacroRaw != null && imfEntry?.inflationPct != null;
-    const hasReserves = reservesScore != null;
-
-    if (hasInflation && hasReserves) {
-      const inflScore = normalizeLowerBetter(Math.min(imfEntry!.inflationPct!, 50), 0, 50);
-      const blended = inflScore * 0.6 + reservesScore * 0.4;
-      const coverage = bisExchangeRaw != null ? 0.55 : 0.45;
-      return { score: roundScore(blended), coverage, observedWeight: 1, imputedWeight: 0, imputationClass: null, freshness: { lastObservedAtMs: 0, staleness: '' } };
-    }
-    if (hasInflation) {
-      const coverage = bisExchangeRaw != null ? 0.45 : 0.35;
-      return { score: normalizeLowerBetter(Math.min(imfEntry!.inflationPct!, 50), 0, 50), coverage, observedWeight: 1, imputedWeight: 0, imputationClass: null, freshness: { lastObservedAtMs: 0, staleness: '' } };
-    }
-    if (hasReserves) {
-      const coverage = bisExchangeRaw != null ? 0.4 : 0.3;
-      return { score: reservesScore, coverage, observedWeight: 1, imputedWeight: 0, imputationClass: null, freshness: { lastObservedAtMs: 0, staleness: '' } };
-    }
-    // No BIS EER, no IMF inflation fallback, no WB reserves fallback.
-    // This is true structural absence: the country isn't covered by any
-    // currency-stability source we track. Tag with curated_list_absent
-    // (= 'unmonitored') so the taxonomy is the single source of truth
-    // and the aggregation pass can still re-tag it as 'source-failure'
-    // when the underlying adapter fails. The prior absence-based branch
-    // returned { score: 50, imputationClass: null } which silently
-    // bypassed the taxonomy; replaced in T1.7 source-failure wiring.
+  if (hasInflation && reservesScore != null) {
+    const blended = inflationScore! * 0.6 + reservesScore * 0.4;
    return {
-      score: IMPUTE.bisEer.score,
-      coverage: IMPUTE.bisEer.certaintyCoverage,
-      observedWeight: 0,
-      imputedWeight: 1,
-      imputationClass: IMPUTE.bisEer.imputationClass,
+      score: roundScore(blended),
+      coverage: 0.85,
+      observedWeight: 1,
+      imputedWeight: 0,
+      imputationClass: null,
+      freshness: { lastObservedAtMs: 0, staleness: '' },
+    };
+  }
+  if (hasInflation) {
+    return {
+      score: inflationScore!,
+      coverage: 0.55,
+      observedWeight: 1,
+      imputedWeight: 0,
+      imputationClass: null,
+      freshness: { lastObservedAtMs: 0, staleness: '' },
+    };
+  }
+  if (reservesScore != null) {
+    return {
+      score: reservesScore,
+      coverage: 0.4,
+      observedWeight: 1,
+      imputedWeight: 0,
+      imputationClass: null,
      freshness: { lastObservedAtMs: 0, staleness: '' },
    };
  }

-  // BIS EER data present: volatility + deviation are primary, reserves supplementary.
-  return weightedBlend([
-    { score: vol == null ? null : normalizeLowerBetter(vol, 0, 50), weight: 0.6 },
-    { score: latest == null ? null : normalizeLowerBetter(Math.abs((safeNum(latest.realEer) ?? 100) - 100), 0, 35), weight: 0.25 },
-    { score: reservesScore, weight: 0.15 },
-  ]);
+  // Neither global-coverage source present. True structural absence;
+  // keep the curated_list_absent → unmonitored taxonomy so the
+  // aggregation pass can still re-tag as source-failure on adapter
+  // outage. (IMPUTE.bisEer is the existing entry; we keep its
+  // identity/name for snapshot continuity but the semantics now read
+  // as "no IMF + no WB reserves" rather than "no BIS".)
+  return {
+    score: IMPUTE.bisEer.score,
+    coverage: IMPUTE.bisEer.certaintyCoverage,
+    observedWeight: 0,
+    imputedWeight: 1,
+    imputationClass: IMPUTE.bisEer.imputationClass,
+    freshness: { lastObservedAtMs: 0, staleness: '' },
+  };
 }

 export async function scoreTradeSanctions(
@@ -1407,11 +1406,11 @@ interface RecoveryImportHhiCountry {
  year?: number | null;
 }

-interface RecoveryFuelStocksCountry {
-  fuelStockDays?: number | null;
-  meetsObligation?: boolean | null;
-  belowObligation?: boolean | null;
-}
+// RecoveryFuelStocksCountry interface removed in PR 3 — scoreFuelStockDays
+// no longer reads any payload. Do NOT re-add the type as a reservation;
+// the tsc noUnusedLocals rule rejects unused locals. When a new
+// recovery-fuel concept lands, introduce a fresh interface with a
+// different name + the actual shape it needs.

 function getRecoveryCountryEntry<T>(raw: unknown, countryCode: string): T | null {
  const countries = (raw as { countries?: Record<string, T> } | null)?.countries;
@@ -1480,8 +1479,12 @@ export async function scoreExternalDebtCoverage(
      freshness: { lastObservedAtMs: 0, staleness: '' },
    };
  }
+  // PR 3 §3.5 point 3: goalpost re-anchored on Greenspan-Guidotti.
+  // Ratio 1.0 (short-term debt matches reserves) = score 50; ratio 2.0
+  // = score 0 (acute rollover-shock exposure). See registry entry
+  // recoveryDebtToReserves for the construct rationale.
  return weightedBlend([
-    { score: normalizeLowerBetter(entry.debtToReservesRatio, 0, 5), weight: 1.0 },
+    { score: normalizeLowerBetter(entry.debtToReservesRatio, 0, 2), weight: 1.0 },
  ]);
 }

@@ -1551,26 +1554,75 @@ export async function scoreStateContinuity(
  ]);
 }

+// PR 3 §3.5 point 1: retired permanently from the core score. IEA
+// emergency-stockholding rules are defined in days of NET IMPORTS
+// and do not bind net exporters by design; the net-importer vs net-
+// exporter framings are incomparable, so no global resilience signal
+// can be built from this data. Published coverage for the IEA/EIA
+// connector sat at 100% imputed at 50 for every country in the
+// pre-repair probe (`fuelStockDays` was `source-failure` for every
+// ISO in the April 2026 freeze snapshot).
+//
+// Returning `coverage: 0` + `observedWeight: 0` drops the dimension
+// from the `recovery` domain's coverage-weighted mean entirely; the
+// remaining recovery dimensions pick up its share of the domain
+// weight via auto-redistribution (no explicit weight transfer needed
+// — `coverageWeightedMean` in `_shared.ts` already does this).
+//
+// Does NOT return in PR 4. A new globally-comparable recovery-fuel
+// concept (e.g. fuel-import-volatility or strategic-buffer-ratio
+// with a unified net-importer/net-exporter definition) could replace
+// this scorer in a future PR, but that is out of scope for the
+// first-publication repair.
+//
+// The dimension `fuelStockDays` remains in `RESILIENCE_DIMENSION_ORDER`
+// for structural continuity (tests, pillar membership, registry
+// shape); retiring the dimension entirely is a PR 4 structural-audit
+// concern. The `recoveryFuelStockDays` indicator is re-tagged as
+// `tier: 'experimental'` in the registry so the Core coverage gate
+// does not consider it active.
+// Authoritative registry of dimensions retired from the core score.
+// Retired dimensions still appear in `RESILIENCE_DIMENSION_ORDER` for
+// structural continuity (tests, pillar membership, registry shape) and
+// their scorers still run (returning coverage=0). This set exists so
+// downstream confidence/coverage averages (`computeLowConfidence`,
+// `computeOverallCoverage`, the widget's `formatResilienceConfidence`)
+// can explicitly exclude retired dims — distinct from coverage=0
+// dimensions that reflect genuine data sparsity, which must still drag
+// the confidence reading down so sparse-data countries stay flagged as
+// low-confidence. See `tests/resilience-confidence-averaging.test.mts`
+// for the exact semantics this set enables.
+//
+// Client-side mirror: `RESILIENCE_RETIRED_DIMENSION_IDS` in
+// `src/components/resilience-widget-utils.ts`. Kept in lockstep via
+// `tests/resilience-retired-dimensions-parity.test.mts`.
+export const RESILIENCE_RETIRED_DIMENSIONS: ReadonlySet<ResilienceDimensionId> = new Set([
+  'fuelStockDays',
+]);
+
 export async function scoreFuelStockDays(
-  countryCode: string,
-  reader: ResilienceSeedReader = defaultSeedReader,
+  _countryCode: string,
+  _reader: ResilienceSeedReader = defaultSeedReader,
 ): Promise<ResilienceDimensionScore> {
-  const raw = await reader(RESILIENCE_RECOVERY_FUEL_STOCKS_KEY);
-  const entry = getRecoveryCountryEntry<RecoveryFuelStocksCountry>(raw, countryCode);
-  // The seeder writes `fuelStockDays`, not `stockDays`.
-  if (!entry || entry.fuelStockDays == null) {
-    return {
-      score: IMPUTE.recoveryFuelStocks.score,
-      coverage: IMPUTE.recoveryFuelStocks.certaintyCoverage,
-      observedWeight: 0,
-      imputedWeight: 1,
-      imputationClass: IMPUTE.recoveryFuelStocks.imputationClass,
-      freshness: { lastObservedAtMs: 0, staleness: '' },
-    };
-  }
-  return weightedBlend([
-    { score: normalizeHigherBetter(Math.min(entry.fuelStockDays, 120), 0, 120), weight: 1.0 },
-  ]);
+  // imputationClass is `null` (not 'source-failure') because the dimension
+  // is retired by design, not failing at runtime. 'source-failure' renders
+  // as "Source down: upstream seeder failed" with a `!` icon in the widget
+  // (see IMPUTATION_CLASS_LABELS in src/components/resilience-widget-utils.ts);
+  // surfacing that label for every country would manufacture a false outage
+  // signal for a deliberate construct retirement. The dimension is excluded
+  // from confidence/coverage averages via the `RESILIENCE_RETIRED_DIMENSIONS`
+  // registry filter in `computeLowConfidence`, `computeOverallCoverage`, and
+  // the widget's `formatResilienceConfidence`. The filter is registry-keyed
+  // (not `coverage === 0`) so genuinely sparse-data countries still surface
+  // as low-confidence from non-retired coverage=0 dims.
+  return {
+    score: 50,
+    coverage: 0,
+    observedWeight: 0,
+    imputedWeight: 0,
+    imputationClass: null,
+    freshness: { lastObservedAtMs: 0, staleness: '' },
+  };
 }

 export const RESILIENCE_DIMENSION_SCORERS: Record<
--- a/server/worldmonitor/resilience/v1/_indicator-registry.ts
+++ b/server/worldmonitor/resilience/v1/_indicator-registry.ts
@@ -108,11 +108,49 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
    license: 'non-commercial',
  },

-  // ── currencyExternal (3 sub-metrics, plus IMF inflation fallback for non-BIS) ─
+  // ── currencyExternal ─────────────────────────────────────────────────────
+  // PR 3 §3.5 point 3 rebalanced the dimension's core scoring:
+  //   - BIS-dependent signals (fxVolatility, fxDeviation) moved to
+  //     tier='experimental'. BIS EER covers ~64 economies, which is too
+  //     narrow for a world-ranking Core signal. They remain in the registry
+  //     for drill-down / enrichment panels but scoreCurrencyExternal no
+  //     longer reads them.
+  //   - Core scoring is now: inflationStability (IMF CPI, ~185 countries)
+  //     at weight 0.6, fxReservesAdequacy (WB FI.RES.TOTL.MO, ~188 countries)
+  //     at weight 0.4. Both are global-coverage, so every country gets the
+  //     same construct regardless of BIS membership.
+  {
+    id: 'inflationStability',
+    dimension: 'currencyExternal',
+    description: 'IMF CPI inflation (lower is better). Global-coverage primary signal for currency stability. Core input to scoreCurrencyExternal under PR 3 §3.5. A future PR may upgrade this to a 5-year inflation-volatility computation once the seeder tracks the series; headline inflation is a reasonable first-cut for stability ranking.',
+    direction: 'lowerBetter',
+    goalposts: { worst: 50, best: 0 },
+    weight: 0.6,
+    sourceKey: 'economic:imf:macro:v2',
+    scope: 'global',
+    cadence: 'annual',
+    tier: 'core',
+    coverage: 185,
+    license: 'open-data',
+  },
+  {
+    id: 'fxReservesAdequacy',
+    dimension: 'currencyExternal',
+    description: 'Total reserves in months of imports (World Bank FI.RES.TOTL.MO). Global-coverage core signal for currency stability; paired with inflationStability in scoreCurrencyExternal after PR 3 §3.5 rebalancing.',
+    direction: 'higherBetter',
+    goalposts: { worst: 1, best: 12 },
+    weight: 0.4,
+    sourceKey: 'resilience:static:*',
+    scope: 'global',
+    cadence: 'annual',
+    tier: 'core',
+    coverage: 188,
+    license: 'open-data',
+  },
  {
    id: 'fxVolatility',
    dimension: 'currencyExternal',
-    description: 'Annualized BIS real effective exchange rate volatility (std-dev of monthly changes * sqrt(12)). Fallback chain when BIS absent: (1) IMF inflation + WB reserves proxy, (2) IMF inflation alone, (3) reserves alone, (4) conservative imputation.',
+    description: 'Annualized BIS real effective exchange rate volatility (std-dev of monthly changes * sqrt(12)). Enrichment-only for the ~64 BIS-tracked economies after PR 3 §3.5 — NOT read by scoreCurrencyExternal. Available via drill-down panels only.',
    direction: 'lowerBetter',
    goalposts: { worst: 50, best: 0 },
    weight: 0.6,
@@ -120,17 +158,14 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
    scope: 'curated',
    cadence: 'monthly',
    imputation: { type: 'conservative', score: 50, certainty: 0.3 },
-    // BIS REER is curated (~60 countries). Demoted to Enrichment by the
-    // Phase 2 A4 coverage gate; the IMF inflation + WB reserves fallback
-    // chain still feeds the Core fxReservesAdequacy signal globally.
-    tier: 'enrichment',
+    tier: 'experimental',
    coverage: 60,
    license: 'non-commercial',
  },
  {
    id: 'fxDeviation',
    dimension: 'currencyExternal',
-    description: 'Absolute deviation of latest BIS real EER from 100 (equilibrium index). Fallback chain when BIS absent: (1) IMF inflation + WB reserves proxy, (2) IMF inflation alone, (3) reserves alone, (4) conservative imputation.',
+    description: 'Absolute deviation of latest BIS real EER from 100 (equilibrium index). Enrichment-only for the ~64 BIS-tracked economies after PR 3 §3.5 — NOT read by scoreCurrencyExternal. Available via drill-down panels only.',
    direction: 'lowerBetter',
    goalposts: { worst: 35, best: 0 },
    weight: 0.25,
@@ -138,25 +173,10 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
    scope: 'curated',
    cadence: 'monthly',
    imputation: { type: 'conservative', score: 50, certainty: 0.3 },
-    // BIS REER curated source, same coverage limitation as fxVolatility.
-    tier: 'enrichment',
+    tier: 'experimental',
    coverage: 60,
    license: 'non-commercial',
  },
-  {
-    id: 'fxReservesAdequacy',
-    dimension: 'currencyExternal',
-    description: 'Total reserves in months of imports (World Bank FI.RES.TOTL.MO). Supplementary metric for BIS countries (weight 0.15), primary metric alongside IMF inflation for non-BIS countries (~160 countries).',
-    direction: 'higherBetter',
-    goalposts: { worst: 1, best: 12 },
-    weight: 0.15,
-    sourceKey: 'resilience:static:*',
-    scope: 'global',
-    cadence: 'annual',
-    tier: 'core',
-    coverage: 188,
-    license: 'open-data',
-  },

  // ── tradeSanctions (4 sub-metrics) ────────────────────────────────────────
  {
@@ -905,9 +925,16 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
  {
    id: 'recoveryDebtToReserves',
    dimension: 'externalDebtCoverage',
-    description: 'Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD); values above 1 signal reserve inadequacy for debt service',
+    description: 'Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD); the Greenspan-Guidotti rule treats a ratio above 1 as reserve inadequacy, and this scorer treats ratio≥2 as acute rollover-shock exposure',
    direction: 'lowerBetter',
-    goalposts: { worst: 5, best: 0 },
+    // PR 3 §3.5 point 3: re-goalposted from (0..5) to (0..2). Old goalpost
+    // saturated at 100 across the full 9-country probe including stressed
+    // states. New anchor: ratio=1.0 (Greenspan-Guidotti reserve-adequacy
+    // threshold) maps to score 50; ratio=2.0 (double the threshold, acute
+    // distress) maps to 0. Ratios above 2.0 clamp to 0 — consistent with
+    // "beyond this point the precise value stops mattering, the country
+    // is already in a rollover-crisis regime."
+    goalposts: { worst: 2, best: 0 },
    weight: 1.0,
    sourceKey: 'resilience:recovery:external-debt:v1',
    scope: 'global',
@@ -978,17 +1005,31 @@ export const INDICATOR_REGISTRY: IndicatorSpec[] = [
  },

  // ── fuelStockDays (1 sub-metric) ─────────────────────────────────────────
+  // PR 3 §3.5 point 1: RETIRED from the core score. IEA emergency-
+  // stockholding is defined in days of NET IMPORTS; the net-importer
+  // vs net-exporter framings are incomparable, so no global resilience
+  // signal can be built from this data. scoreFuelStockDays now returns
+  // coverage=0 + imputationClass=null for every country (filtered out
+  // of confidence/coverage averages via the RESILIENCE_RETIRED_DIMENSIONS
+  // registry in _dimension-scorers.ts). imputationClass is deliberately
+  // `null` rather than 'source-failure' — a retirement is structural,
+  // not a runtime outage, and surfacing 'source-failure' would manufacture
+  // a false "Source down" label in the widget for every country. The
+  // registry entry stays at tier='experimental' so the Core coverage
+  // gate treats it as out-of-score; the dimension itself remains
+  // registered for structural continuity (PR 4 structural-audit may
+  // remove it entirely).
  {
    id: 'recoveryFuelStockDays',
    dimension: 'fuelStockDays',
-    description: 'Days of fuel stock cover (IEA Oil Stocks / EIA Weekly Petroleum Status); strategic buffer for energy-dependent recovery',
+    description: 'RETIRED in PR 3. Legacy days-of-fuel-stock-cover (IEA Oil Stocks / EIA Weekly Petroleum Status). Does not contribute to the score — scoreFuelStockDays returns coverage=0 + imputationClass=null, and the dimension is excluded from confidence/coverage averages via the RESILIENCE_RETIRED_DIMENSIONS registry. Kept in the registry as tier=experimental for structural continuity; a globally-comparable recovery-fuel concept could replace this in a future PR.',
    direction: 'higherBetter',
    goalposts: { worst: 0, best: 120 },
    weight: 1.0,
    sourceKey: 'resilience:recovery:fuel-stocks:v1',
    scope: 'global',
    cadence: 'monthly',
-    tier: 'enrichment',
+    tier: 'experimental',
    coverage: 45,
    license: 'open-data',
  },
--- a/server/worldmonitor/resilience/v1/_shared.ts
+++ b/server/worldmonitor/resilience/v1/_shared.ts
@@ -17,6 +17,7 @@ import {
  RESILIENCE_DIMENSION_ORDER,
  RESILIENCE_DIMENSION_TYPES,
  RESILIENCE_DOMAIN_ORDER,
+  RESILIENCE_RETIRED_DIMENSIONS,
  createMemoizedSeedReader,
  getResilienceDomainWeight,
  scoreAllDimensions,
@@ -338,8 +339,22 @@ function parseHistoryPoints(raw: unknown): ResilienceHistoryPoint[] {
  return history.sort((left, right) => left.date.localeCompare(right.date));
 }

-function computeLowConfidence(dimensions: ResilienceDimension[], imputationShare: number): boolean {
-  const averageCoverage = mean(dimensions.map((dimension) => dimension.coverage)) ?? 0;
+export function computeLowConfidence(dimensions: ResilienceDimension[], imputationShare: number): boolean {
+  // Exclude RETIRED dimensions (fuelStockDays, post-PR-3) from the
+  // confidence reading. They contribute zero weight to domain scoring
+  // via coverageWeightedMean, so including them in a flat coverage mean
+  // would drag the user-facing confidence signal down for every country
+  // purely because of a deliberate construct retirement.
+  //
+  // IMPORTANT: we do NOT filter by `coverage === 0` because a genuinely
+  // sparse-data country can legitimately produce coverage=0 on non-
+  // retired dims via weightedBlend fall-through, and those coverage=0
+  // entries SHOULD drag the confidence down — that is precisely the
+  // sparse-data signal lowConfidence exists to surface.
+  const scoring = dimensions.filter(
+    (dimension) => !RESILIENCE_RETIRED_DIMENSIONS.has(dimension.id as ResilienceDimensionId),
+  );
+  const averageCoverage = mean(scoring.map((dimension) => dimension.coverage)) ?? 0;
  return averageCoverage < LOW_CONFIDENCE_COVERAGE_THRESHOLD || imputationShare > LOW_CONFIDENCE_IMPUTATION_SHARE_THRESHOLD;
 }

@@ -634,8 +649,18 @@ export async function getCachedResilienceScores(countryCodes: string[]): Promise

 export const GREY_OUT_COVERAGE_THRESHOLD = 0.40;

-function computeOverallCoverage(response: GetResilienceScoreResponse): number {
-  const coverages = response.domains.flatMap((domain) => domain.dimensions.map((dimension) => dimension.coverage));
+export function computeOverallCoverage(response: GetResilienceScoreResponse): number {
+  // Exclude RETIRED dimensions (fuelStockDays, post-PR-3) — their
+  // coverage=0 is structural, not a sparsity signal, and should not
+  // drag down the ranking widget's overallCoverage pill. Non-retired
+  // coverage=0 dims (genuine weightedBlend fall-through) stay in the
+  // average because they reflect real data sparsity for that country.
+  // See `computeLowConfidence` for the matching rationale.
+  const coverages = response.domains.flatMap((domain) =>
+    domain.dimensions
+      .filter((dimension) => !RESILIENCE_RETIRED_DIMENSIONS.has(dimension.id as ResilienceDimensionId))
+      .map((dimension) => dimension.coverage),
+  );
  if (coverages.length === 0) return 0;
  return coverages.reduce((sum, coverage) => sum + coverage, 0) / coverages.length;
 }