feat(resilience): PR 3 — dead-signal cleanup (plan §3.5, §3.6) (#3297)

* feat(resilience): PR 3 §3.5 - retire fuelStockDays from core score permanently

First commit in PR 3 of the resilience repair plan. Retires `fuelStockDays` from the core score with no replacement.

Why permanent, not replaced: IEA emergency-stockholding rules are defined in days of NET IMPORTS and do not bind net exporters by design. Norway/Canada/US measured in days-of-imports are incomparable to Germany/Japan measured the same way; the construct is fundamentally different across the two country classes. No globally-comparable recovery-fuel signal can be built from this source; the pre-repair probe showed 100% imputed at 50 for every country in the April 2026 freeze.

scoreFuelStockDays:
- Rewritten to return coverage=0 + observedWeight=0 + imputationClass='source-failure' for every country regardless of seed content.
- Drops the dimension from the `recovery` domain's coverage-weighted mean automatically; remaining recovery dimensions pick up the share via re-normalisation in `_shared.ts#coverageWeightedMean`.
- No explicit weight transfer needed: the coverage-weighted blend handles redistribution.

Registry:
- recoveryFuelStockDays re-tagged from tier='enrichment' to tier='experimental' so the Core coverage gate treats it as out-of-score.
- Description updated to make the retirement explicit; entry stays in the registry for structural continuity (the dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER for the 19-dimension tests; removing the dimension entirely is a PR 4 structural-audit concern).

Housekeeping:
- Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no longer read; noUnusedLocals would reject it).
- Removed `RecoveryFuelStocksCountry` interface for the same reason. Comment at the removed declaration instructs future maintainers not to re-add the type as a reservation; when a new recovery-fuel concept lands, introduce a fresh interface.
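The redistribution described above can be illustrated with a minimal sketch. This is not the code in `_shared.ts`; the `DimensionScoreSketch` shape, the function name, and the fixture scores are assumptions made for illustration. It only shows why pinning one dimension at coverage=0 raises the share of the surviving dimensions without any explicit weight transfer.

```typescript
// Hypothetical, simplified shape; the real type is generated elsewhere.
interface DimensionScoreSketch {
  id: string;
  score: number;    // 0..100
  coverage: number; // 0..1
}

// Sketch of a coverage-weighted mean: each dimension's share of the
// domain score is proportional to its coverage, so a dimension pinned
// at coverage=0 drops out of both numerator and denominator.
function coverageWeightedMeanSketch(dims: DimensionScoreSketch[]): number {
  const totalCoverage = dims.reduce((sum, d) => sum + d.coverage, 0);
  if (totalCoverage === 0) return 0;
  const weighted = dims.reduce((sum, d) => sum + d.score * d.coverage, 0);
  return weighted / totalCoverage;
}

// Illustrative fixture (made-up scores): fuelStockDays imputed at 50.
const beforeRetirement: DimensionScoreSketch[] = [
  { id: 'fiscalSpace', score: 60, coverage: 1 },
  { id: 'reserveAdequacy', score: 80, coverage: 1 },
  { id: 'fuelStockDays', score: 50, coverage: 1 },
];

// Retirement: same dimensions, but fuelStockDays now reports coverage=0.
const afterRetirement = beforeRetirement.map((d) =>
  d.id === 'fuelStockDays' ? { ...d, coverage: 0 } : d,
);

console.log(coverageWeightedMeanSketch(beforeRetirement)); // (60 + 80 + 50) / 3
console.log(coverageWeightedMeanSketch(afterRetirement));  // (60 + 80) / 2 = 70
```

In this toy fixture the two surviving dimensions move from a one-third share each to a one-half share each, which is the same mechanism that shifts the published `recovery` score.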

Plan reference: §3.5 point 1 of `docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.

51 resilience tests pass, typecheck + biome clean. The `recovery` domain's published score will shift slightly for every country because the 0.10 slot that fuelStockDays was imputing to now redistributes; the compare-harness acceptance-gate rerun at merge time will quantify the shift per plan §6 gates.

* feat(resilience): PR 3 §3.5 - retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves

BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35 fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60 countries at most, so every non-BIS country fell through to curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45). Combined with reserveMarginPct already removed in PR 1, currencyExternal was the clearest "construct absent for most of the world" carrier left in the scorer.

Changes:

_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX reserves only. Coverage ladder:
    inflation + reserves → 0.85 (observed primary + secondary)
    inflation only       → 0.55
    reserves only        → 0.40
    neither              → 0.30 (IMPUTE.bisEer retained for snapshot continuity; semantics read as "no IMF + no WB reserves" now)
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved via comment only, flagged by noUnusedLocals), stddev() helper, getCountryBisExchangeRates() loader, BisExchangeRate interface, dateToSortableNumber(). All were used only by the retired BIS path.

_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core, sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental (BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).
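The coverage ladder listed under `_dimension-scorers.ts` above can be read as a pure function of which inputs were observed. The sketch below is an assumption-labelled illustration, not the body of scoreCurrencyExternal: the function name and the boolean parameters are hypothetical, and only the four coverage values come from this commit.

```typescript
// Hypothetical helper: maps "which inputs were observed" to the
// coverage value listed in the commit's ladder. The real scorer reads
// seeded IMF/WB data; this sketch takes booleans for clarity.
function currencyExternalCoverageSketch(
  hasInflation: boolean, // IMF macro inflationPct observed
  hasReserves: boolean,  // WB FX reserves observed
): number {
  if (hasInflation && hasReserves) return 0.85; // observed primary + secondary
  if (hasInflation) return 0.55;                // inflation only
  if (hasReserves) return 0.40;                 // reserves only
  return 0.30;                                  // neither: imputed floor
}

console.log(currencyExternalCoverageSketch(true, false)); // 0.55
```

The ordering reflects the new registry weights: inflation is the primary signal (0.60) and reserves the secondary one (0.40), so losing reserves costs less coverage than losing inflation.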

scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability → imf-macro-country-field field=inflationPct so the registry-parity test passes and the correlation harness sees the new construct.

tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266 (BIS-outage coverage 0.35 branch) which collapsed to the inflation-only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55; inflation+reserves 0.55 → 0.85.

tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).

All 6324 data-tier tests pass. typecheck:api clean. No change to seeders or Redis keys; this is a pure scorer + registry rebuild.

* feat(resilience): PR 3 §3.5 point 3 - re-goalpost externalDebtCoverage (0..5 → 0..2)

Plan §2.1 diagnosis table showed externalDebtCoverage saturating at score=100 across all 9 probe countries, including stressed states. Signal was collapsed.

Root cause: (worst=5, best=0) gave every country with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's reserve-adequacy threshold (ratio=1.0) to score 80, well into "no worry" territory instead of the "mild warning" it should be.

Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score 50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure). Ratios above 2.0 clamp to 0, consistent with "beyond this point the country is already in crisis; exact value stops mattering."

Files changed:
- _indicator-registry.ts: recoveryDebtToReserves goalposts {worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to cite Greenspan-Guidotti; inline comment documents anchor + rationale.
- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound changed from (0..5) to (0..2), with inline comment.
- docs/methodology/country-resilience-index.mdx: goalpost table row 5-0 → 2-0, description cites Greenspan-Guidotti.
- docs/methodology/indicator-sources.yaml:
  * constructStatus: dead-signal → observed-mechanism (signal is now discriminating).
  * reviewNotes updated to describe the new anchor.
  * mechanismTestRationale names the Greenspan-Guidotti rule.
- tests/resilience-dimension-monotonicity.test.mts: updated the comment + picked values inside the (0..2) discriminating band (0.3 and 1.5). Old values (1 vs 4) had 4 clamping to 0.
- tests/resilience-dimension-scorers.test.mts: NO (Norway) score threshold relaxed >90 → >=85 (NO ratio=0.2 now scores 90, was 96).
- tests/resilience-scorers.test.mts: fixture drift:
  * domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
  * baselineScore 63.63 → 60.12 (extDebt is baseline type).
  * overallScore 65.52 → 63.27.
  * stressScore / stressFactor unchanged (extDebt is baseline-only).

All 6324 data-tier tests pass. typecheck:api clean.

* feat(resilience): PR 3 §3.6 - CI gate on indicator coverage and nominal weight

Plan §3.6 adds a new acceptance criterion (also §5 item 5):

> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.

This commit enforces the NOMINAL-WEIGHT half as a unit test that runs on every CI build. The EFFECTIVE-INFLUENCE half is produced by scripts/validate-resilience-sensitivity.mjs as a committed artifact; the gate file only asserts that script still exists so a refactor that removes it breaks the build loudly.

Why the gate exists (plan §3.6): "A dimension at 30% observed coverage carries the same effective weight as one at 95%. This contradicts the OECD/JRC handbook on uncertainty analysis."

Implementation: tests/resilience-coverage-influence-gate.test.mts, three tests:

1. Nominal-weight gate: for every core indicator with coverage < 137 countries (70% of the ~195-country universe), computes its nominal overall weight as indicator.weight × (1/dimensions-in-domain) × domain-weight and asserts it does not exceed 5%. Equal-share-per-dimension is the *upper bound* on runtime weight (coverage-weighted mean gives a lower share when a dimension drops out), so this is a strict bound: if the nominal number passes, the runtime number also passes for every country.
2. Effective-influence contract: asserts the sensitivity script exists at its expected path. Removing it (intentionally or by refactor) breaks the build.
3. Audit visibility: prints the top 10 core indicators by nominal overall weight. No assertion beyond "ran"; the list lets reviewers spot outliers that pass the gate but are near the cap.

Current state (observed from audit output):

    recoveryReserveMonths:  nominal=4.17% coverage=188
    recoveryDebtToReserves: nominal=4.17% coverage=185
    recoveryImportHhi:      nominal=4.17% coverage=190
    inflationStability:     nominal=3.40% coverage=185
    electricityConsumption: nominal=3.30% coverage=217
    ucdpConflict:           nominal=3.09% coverage=193

Every core indicator has coverage ≥ 180 (already enforced by the pre-existing indicator-tiering test), so the nominal-weight gate has no current violators. Its purpose is catching future drift, not flagging today's state.

All 6327 data-tier tests pass. typecheck:api clean.

* docs(resilience): PR 3 methodology doc - document §3.5 dead-signal retirements + §3.6 coverage gate

Methodology-doc update capturing the three §3.5 landings and the §3.6 CI gate. Five edits:

1. **Known construct limitations section (#5 and #6):** strikethrough the original "dead signals" and "no coverage-based weight cap" items, annotate them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" + specifics of what shipped.
2. **Currency & External H4 section:** completely rewritten.
   Old table (fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is replaced by the two-indicator post-PR-3 table (inflationStability at 0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out (0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as experimental-tier drill-downs only.
3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the methodology-lint H4-to-dimension mapping does not break; body rewritten to explain that the dimension is retired from core but the seeder still runs for IEA-member drill-downs.
4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description cites Greenspan-Guidotti reserve-adequacy rule.
5. **New v2.2 changelog entry:** PR 3 dead-signal cleanup, covering §3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit updates.

No scoring or code changes in this commit. Methodology-lint test passes (H4 mapping intact). All 6327 data-tier tests pass.

* fix(resilience): PR 3 §3.6 gate - correct share-denominator for coverage-weighted aggregation

Reviewer catch (thanks). The previous gate computed each indicator's nominal overall weight as

    indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight

and claimed this was an upper bound ("actual runtime weight is ≤ this when some dimensions drop out on coverage"). That is BACKWARDS for this scorer. The domain aggregation is coverage-weighted (server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean), so when a dimension pins at coverage=0 it is EXCLUDED from the denominator and the surviving dimensions' shares go UP, not down.
PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to coverage=0 for every country, so in the current live state the recovery domain has 5 contributing dimensions (not 6), and each core recovery indicator's nominal share is

    1.0 × 1/5 × 0.25 = 5.00% (was mis-reported as 4.17%)

The old gate therefore under-estimated nominal influence and could silently pass exactly the kind of low-coverage overweight regression it is meant to block.

Fix:
- Added `coreBearingDimensions(domainId)` helper that counts only dimensions that have ≥1 core indicator in the registry. A dimension with only experimental/enrichment entries (post-retirement fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count, not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict upper bound; it now explicitly calls out the dynamic case (source failure raising surviving dim shares further) as the sensitivity script's responsibility.
- Added a new regression test: asserts (a) at least one recovery dimension is all-non-core (fuelStockDays post-retirement), (b) fuelStockDays has zero core indicators, and (c) recoveryDebtToReserves nominal = 0.05 exactly (not 0.0417). Any reversion of the retirement or regression to N_total-denominator will fail loudly.

Top-10 audit output now correctly shows:

    recoveryReserveMonths:  nominal=5% coverage=188
    recoveryDebtToReserves: nominal=5% coverage=185
    recoveryImportHhi:      nominal=5% coverage=190

(was 4.17% each under the old math)

All 486 resilience tests pass. typecheck:api clean.

Note: the 5% figure is exactly AT the cap, not over it. "exceed" means strictly > 5%, so it still passes. But now the reviewer / audit log reflects reality.

* fix(resilience): PR 3 review - retired-dim confidence drag + false source-failure label

Addresses the Codex review P1 + P2 on PR #3297.

P1 - retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct), but computeLowConfidence, computeOverallCoverage, and the widget's formatResilienceConfidence averaged across all 19 dimensions. That dragged every country's reported averageCoverage down: US went from 0.8556 (active dims only) to 0.8105 (all dims), enough drift to misclassify edge countries as lowConfidence and to shift the ranking widget's overallCoverage pill for every country.

Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in _dimension-scorers.ts and filter it out of all three averages. The filter is keyed on the retired-dim REGISTRY, not on coverage === 0, because a non-retired dim can legitimately emit coverage=0 on a genuinely sparse-data country via weightedBlend fall-through. Those entries MUST keep dragging confidence down (that is the sparse-data signal lowConfidence exists to surface).

Verified: sparse-country release-gate test (marks sparse WHO/FAO countries as low confidence) still passes with the registry-keyed filter; would have failed with a naive coverage=0 filter.

Server-client parity: widget-utils cannot import server code, so RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept in lockstep by tests/resilience-retired-dimensions-parity.test.mts (parses the widget file as text, same pattern as existing widget-util tests that can't import the widget module directly).

P2 - false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure', which the widget maps to "Source down: upstream seeder failed" with a `!` icon for every country. That is semantically wrong for an intentional retirement. Flipped to null so the widget's absent-path renders a neutral cell without a false outage label.
null is already a legal value of ResilienceDimensionScore.imputationClass; no type change needed.

Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the registry-keyed filter semantic for computeOverallCoverage + computeLowConfidence. Includes a negative-control test proving non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new): lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for fuelStockDays flipped to null.

All 494 resilience tests + full 6336/6336 data-tier suite pass. Typecheck clean for both tsconfig.json and tsconfig.api.json.

* docs(resilience): align methodology + registry metadata with shipped imputationClass=null

Follow-up to the previous PR 3 review commit that flipped scoreFuelStockDays's imputationClass from 'source-failure' to null to avoid a false "Source down" widget label on every country.

The code changed; the doc and registry metadata did not, leaving three sites in the methodology mdx and two comment/description sites in the registry still claiming imputationClass='source-failure'. Any future reviewer (or tooling that treats the registry description as authoritative) would be misled.

This commit rewrites those sites to describe the shipped behavior:
- imputationClass=null (not 'source-failure'), with the rationale
- exclusion from confidence/coverage averages via the RESILIENCE_RETIRED_DIMENSIONS registry filter
- the distinction between structural retirement (filtered) and runtime coverage=0 (kept so sparse-data countries still flag lowConfidence)

Touched:
- docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
- server/worldmonitor/resilience/v1/_indicator-registry.ts (recoveryFuelStockDays comment block + description field)

No code-behavior change. Docs-only.

Tests: 157 targeted resilience tests pass (incl. methodology-lint + widget + release-gate + confidence-averaging). Typecheck clean on both tsconfig.json and tsconfig.api.json.
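The distinction between structural retirement and runtime coverage=0 can be made concrete with a small sketch. It is an illustration under stated assumptions, not the shipped `computeOverallCoverage`: the set name `RETIRED_IDS_SKETCH`, the `DimCoverageSketch` shape, and the function name are hypothetical, and the fixture values are chosen only to show the behaviour.

```typescript
// Hypothetical stand-in for the retired-dimension registry.
const RETIRED_IDS_SKETCH: ReadonlySet<string> = new Set(['fuelStockDays']);

interface DimCoverageSketch {
  id: string;
  coverage: number; // 0..1
}

// Flat mean of coverage over non-retired dimensions. The filter is keyed
// on the registry of retired ids, NOT on coverage === 0, so a non-retired
// dimension reporting coverage=0 still pulls the average down.
function overallCoverageSketch(dims: DimCoverageSketch[]): number {
  const active = dims.filter((d) => !RETIRED_IDS_SKETCH.has(d.id));
  if (active.length === 0) return 0;
  return active.reduce((sum, d) => sum + d.coverage, 0) / active.length;
}

// Retired dim is ignored: mean of 0.9 and 0.8.
const wellCovered = overallCoverageSketch([
  { id: 'fiscalSpace', coverage: 0.9 },
  { id: 'reserveAdequacy', coverage: 0.8 },
  { id: 'fuelStockDays', coverage: 0 },
]);

// Non-retired coverage=0 is kept: mean of 0.9 and 0.
const sparse = overallCoverageSketch([
  { id: 'macroFiscal', coverage: 0.9 },
  { id: 'currencyExternal', coverage: 0 },
]);

console.log(wellCovered.toFixed(4), sparse.toFixed(4)); // "0.8500" "0.4500"
```

A filter on `coverage === 0` would return 0.9 for the sparse case and hide exactly the signal that lowConfidence is meant to surface.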
2026-04-25 17:14:57 +02:00 · 2026-04-22 23:57:28 +04:00
parent c067a7dd63
commit 7cf37c604c
15 changed files with 838 additions and 210 deletions
--- a/tests/resilience-confidence-averaging.test.mts
+++ b/tests/resilience-confidence-averaging.test.mts
@@ -0,0 +1,139 @@
+import assert from 'node:assert/strict';
+import { describe, it } from 'node:test';
+
+import {
+  computeLowConfidence,
+  computeOverallCoverage,
+} from '../server/worldmonitor/resilience/v1/_shared';
+import type {
+  GetResilienceScoreResponse,
+  ResilienceDimension,
+} from '../src/generated/server/worldmonitor/resilience/v1/service_server';
+
+// PR 3 §3.5 follow-up (reviewer P1): the retired dimension (fuelStockDays,
+// post-retirement) returns coverage=0 structurally and contributes zero
+// weight to the domain score via coverageWeightedMean. The user-facing
+// confidence/coverage averages must exclude retired dims — otherwise
+// the retirement silently drags the reported averageCoverage down for
+// every country even though the dimension is not part of the score.
+//
+// Reviewer anchor: on the US profile, including retired dims gave
+// averageCoverage=0.8105 vs 0.8556 when retired dims are excluded —
+// enough drift to misclassify edge countries as lowConfidence and to
+// shift the widget's overallCoverage pill for the whole ranking.
+//
+// Critical invariant: the filter is keyed on the retired-dim REGISTRY,
+// not on `coverage === 0`. Non-retired dimensions can legitimately
+// emit coverage=0 on genuinely sparse-data countries via weightedBlend
+// fall-through, and those entries MUST continue to drag confidence
+// down — that is the sparse-data signal lowConfidence exists to
+// surface. A too-aggressive `coverage === 0` filter would hide the
+// sparsity and e.g. let South Sudan pass as full-confidence.
+
+function dim(id: string, coverage: number): ResilienceDimension {
+  return {
+    id,
+    score: 50,
+    coverage,
+    observedWeight: coverage > 0 ? 1 : 0,
+    imputedWeight: 0,
+    imputationClass: '',
+    freshness: { lastObservedAtMs: '0', staleness: '' },
+  };
+}
+
+describe('computeOverallCoverage: retired-dim exclusion', () => {
+  it('excludes retired dimensions from the average', () => {
+    const response = {
+      domains: [
+        {
+          id: 'recovery',
+          dimensions: [
+            dim('fiscalSpace', 0.9),
+            dim('reserveAdequacy', 0.8),
+            // Retired: must not pull the average down.
+            dim('fuelStockDays', 0),
+          ],
+        },
+      ],
+    } as unknown as GetResilienceScoreResponse;
+
+    // (0.9 + 0.8) / 2 = 0.85. With retired included the flat mean
+    // would be (0.9 + 0.8 + 0) / 3 ≈ 0.5667 — the regression shape.
+    assert.equal(computeOverallCoverage(response).toFixed(4), '0.8500');
+  });
+
+  it('keeps NON-retired coverage=0 dims in the average (sparse-data signal)', () => {
+    // A genuinely sparse-data country can emit coverage=0 on non-retired
+    // dims via weightedBlend fall-through. Those entries must stay in
+    // the average so sparse countries still surface as low confidence
+    // via the flat mean path.
+    const response = {
+      domains: [
+        {
+          id: 'economic',
+          dimensions: [
+            dim('macroFiscal', 0.9),
+            // NON-retired coverage=0: represents genuine data sparsity.
+            dim('currencyExternal', 0),
+          ],
+        },
+      ],
+    } as unknown as GetResilienceScoreResponse;
+
+    // (0.9 + 0) / 2 = 0.45. If the filter were keyed on coverage=0,
+    // the genuine sparsity would be hidden and this would be 0.9.
+    assert.equal(computeOverallCoverage(response).toFixed(4), '0.4500');
+  });
+
+  it('returns 0 when ALL dims are retired (degenerate case)', () => {
+    const response = {
+      domains: [
+        { id: 'recovery', dimensions: [dim('fuelStockDays', 0)] },
+      ],
+    } as unknown as GetResilienceScoreResponse;
+    assert.equal(computeOverallCoverage(response), 0);
+  });
+});
+
+describe('computeLowConfidence: retired-dim exclusion', () => {
+  it('does not flip lowConfidence purely on retired-dim drag', () => {
+    // Three active dims at 0.72 = 0.72 mean (above the low-confidence
+    // threshold). A single retired dim at coverage=0 must not flip the
+    // flag by dragging the flat mean below the threshold — that was
+    // the regression on the US profile.
+    const dims = [
+      dim('fiscalSpace', 0.72),
+      dim('reserveAdequacy', 0.72),
+      dim('externalDebtCoverage', 0.72),
+      dim('fuelStockDays', 0), // retired
+    ];
+    assert.equal(computeLowConfidence(dims, 0), false,
+      'retired fuelStockDays must not flip lowConfidence for an otherwise well-covered country');
+  });
+
+  it('DOES flip lowConfidence for non-retired coverage=0 dims (sparse data)', () => {
+    // A sparse-data country: multiple non-retired dims at coverage=0
+    // via weightedBlend fall-through. The flat mean drops below the
+    // threshold and the flag must fire — this is the sparse-data
+    // signal lowConfidence exists to surface. A too-aggressive filter
+    // on coverage=0 would hide this.
+    const dims = [
+      dim('macroFiscal', 0.9),
+      dim('currencyExternal', 0),   // non-retired coverage=0
+      dim('tradeSanctions', 0),     // non-retired coverage=0
+      dim('cyberDigital', 0),       // non-retired coverage=0
+    ];
+    assert.equal(computeLowConfidence(dims, 0), true,
+      'non-retired coverage=0 dims must drag lowConfidence down — that is the sparse-data signal');
+  });
+
+  it('respects the imputationShare threshold independently', () => {
+    // Imputation-share check is a separate arm of the OR; retired-dim
+    // filtering must not suppress a legitimate high-imputation-share
+    // trigger.
+    const dims = [dim('fiscalSpace', 0.95)];
+    assert.equal(computeLowConfidence(dims, 0.6), true,
+      'imputationShare > 0.4 must flip lowConfidence even when coverage looks strong');
+  });
+});
--- a/tests/resilience-coverage-influence-gate.test.mts
+++ b/tests/resilience-coverage-influence-gate.test.mts
@@ -0,0 +1,194 @@
+import assert from 'node:assert/strict';
+import { existsSync } from 'node:fs';
+import { dirname, join } from 'node:path';
+import { describe, it } from 'node:test';
+import { fileURLToPath } from 'node:url';
+
+import { INDICATOR_REGISTRY } from '../server/worldmonitor/resilience/v1/_indicator-registry.ts';
+import {
+  RESILIENCE_DIMENSION_DOMAINS,
+  getResilienceDomainWeight,
+  type ResilienceDimensionId,
+  type ResilienceDomainId,
+} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';
+
+// PR 3 §3.6 — Coverage-and-influence cap on indicator weight.
+//
+// Rule (plan §3.6, verbatim):
+//   No indicator with observed coverage below 70% may exceed 5% nominal
+//   weight OR 5% effective influence in the post-change sensitivity run.
+//
+// This file enforces the NOMINAL-WEIGHT half (static, runs every build).
+// The effective-influence half is checked by the variable-importance
+// output of scripts/validate-resilience-sensitivity.mjs and committed as
+// an artifact; see plan §5 acceptance-criteria item 9.
+//
+// Why the gate exists (plan §3.6):
+//   "A dimension at 30% observed coverage carries the same effective
+//   weight as one at 95%. This contradicts the OECD/JRC handbook on
+//   uncertainty analysis."
+//
+// Assumption: the global universe is ~195 countries (UN members + a few
+// territories commonly ranked). "70% coverage" → 137+ countries.
+
+const GLOBAL_COUNTRY_UNIVERSE = 195;
+const COVERAGE_FLOOR = Math.ceil(GLOBAL_COUNTRY_UNIVERSE * 0.7); // 137
+const NOMINAL_WEIGHT_CAP = 0.05; // 5%
+
+// Nominal overall weight of an indicator = weight in dimension
+//   × dimension share of domain
+//   × domain weight in overall score.
+//
+// `dimension share of domain` is NOT 1/N_total — the scorer aggregates
+// by coverage-weighted mean (server/worldmonitor/resilience/v1/_shared.ts
+// coverageWeightedMean), so a dimension that pins at coverage=0 drops
+// out of the denominator and the surviving dimensions' shares go UP,
+// not down. PR 3 commit 1 retires fuelStockDays by pinning its scorer
+// at coverage=0 for every country — so in the current live state the
+// recovery domain has 5 contributing dimensions (not 6), and each core
+// recovery indicator's nominal share is 1/5 × 0.25 = 5%, not the
+// 1/6 × 0.25 = 4.17% a naive N-based count would report.
+//
+// We therefore count "effective contributing dimensions" per domain:
+// dimensions that have at least one tier='core' indicator in the
+// registry. A dimension with only experimental/enrichment indicators
+// (e.g. fuelStockDays, post-retirement) scores coverage=0 in the core
+// path and is excluded from the coverage-weighted domain mean, so it
+// does not dilute the core dimensions' shares.
+//
+// This still under-estimates the WORST case — a live source-failure
+// run can drop a usually-contributing dimension to coverage=0, further
+// raising surviving dimensions' shares. The worst-case upper bound is
+// indicator.weight × domain_weight (single surviving dimension, 1/1
+// share). Enforcing THAT bound would fail most indicators, so we
+// enforce the baseline (all core-bearing dimensions present) here and
+// rely on the sensitivity-script's effective-influence output (plan
+// §3.6 second half, plan §5 acceptance item 9) to catch the dynamic
+// case.
+//
+// Indicator weights within a dimension are normalized to sum to 1 for
+// non-experimental tiers (enforced by the indicator-registry test).
+
+function dimensionsInDomain(domainId: ResilienceDomainId): ResilienceDimensionId[] {
+  return (Object.keys(RESILIENCE_DIMENSION_DOMAINS) as ResilienceDimensionId[])
+    .filter((dimId) => RESILIENCE_DIMENSION_DOMAINS[dimId] === domainId);
+}
+
+function coreBearingDimensions(domainId: ResilienceDomainId): Set<ResilienceDimensionId> {
+  const dimsInDomain = new Set(dimensionsInDomain(domainId));
+  const withCore = new Set<ResilienceDimensionId>();
+  for (const entry of INDICATOR_REGISTRY) {
+    if (entry.tier === 'core' && dimsInDomain.has(entry.dimension)) {
+      withCore.add(entry.dimension);
+    }
+  }
+  return withCore;
+}
+
+function nominalOverallWeight(indicator: typeof INDICATOR_REGISTRY[number]): number {
+  const domainId = RESILIENCE_DIMENSION_DOMAINS[indicator.dimension];
+  if (domainId == null) return 0;
+  const domainWeight = getResilienceDomainWeight(domainId);
+  // Count only dimensions that have ≥1 core indicator — retired or
+  // all-experimental dimensions contribute coverage=0 to the scorer and
+  // are excluded from the coverage-weighted domain mean.
+  const contributing = coreBearingDimensions(domainId).size;
+  const dimensionShare = contributing > 0 ? 1 / contributing : 0;
+  return indicator.weight * dimensionShare * domainWeight;
+}
+
+describe('resilience coverage-and-influence gate (PR 3 §3.6)', () => {
+  it('no indicator with <70% country coverage carries >5% nominal weight in the overall score', () => {
+    const violations = INDICATOR_REGISTRY
+      // Only core indicators contribute to the overall (public) score.
+      // Enrichment and experimental are drill-down-only, so their
+      // nominal-weight-in-overall is 0 regardless of registry weight.
+      .filter((e) => e.tier === 'core')
+      .filter((e) => e.coverage < COVERAGE_FLOOR)
+      .map((e) => ({
+        id: e.id,
+        dimension: e.dimension,
+        coverage: e.coverage,
+        weight: e.weight,
+        nominalOverall: Number(nominalOverallWeight(e).toFixed(4)),
+      }))
+      .filter((v) => v.nominalOverall > NOMINAL_WEIGHT_CAP);
+
+    assert.deepEqual(
+      violations,
+      [],
+      `Indicators below ${COVERAGE_FLOOR}-country coverage floor with nominal overall weight > ${NOMINAL_WEIGHT_CAP * 100}%:\n${
+        violations.map((v) => `  - ${v.id} (dim=${v.dimension}, coverage=${v.coverage}, nominal=${(v.nominalOverall * 100).toFixed(2)}%)`).join('\n')
+      }\n\nFix options:\n  1. Demote to enrichment or experimental tier.\n  2. Lower the indicator's weight within its dimension.\n  3. Improve coverage to ≥${COVERAGE_FLOOR} countries.`,
+    );
+  });
+
+  it('effective-influence artifact reference exists (sensitivity-script contract)', () => {
+    // The plan (§3.6, §5 item 9) requires post-change variable-importance
+    // to confirm the nominal-weight gate is not violated in the dynamic
+    // (variance-explained) dimension either. That artifact is produced
+    // by scripts/validate-resilience-sensitivity.mjs and not re-computed
+    // here (it requires seeded Redis). This test only asserts the gate
+    // script exists, so removing it via refactor breaks the build.
+    const here = dirname(fileURLToPath(import.meta.url));
+    const sensScript = join(here, '..', 'scripts', 'validate-resilience-sensitivity.mjs');
+    assert.ok(existsSync(sensScript),
+      `plan §3.6 effective-influence half is enforced by ${sensScript} — file is missing`);
+  });
+
+  it('retired dimensions (coverage=0 for every country) do not count in the per-domain share denominator', () => {
+    // Regression guard for the §3.6 gate math. When PR 3 commit 1
+    // pinned fuelStockDays at coverage=0, the coverage-weighted domain
+    // aggregation raised the surviving recovery dimensions' shares from
+    // 1/6 to 1/5. Any gate that uses 1/N_total as the divisor will
+    // under-report nominal influence and can silently pass a regression
+    // that drives a low-coverage indicator above the 5% cap.
+    //
+    // This test asserts the helper correctly excludes all-experimental
+    // dimensions from the share denominator.
+    const recoveryDimsTotal = dimensionsInDomain('recovery').length;
+    const recoveryCoreBearing = coreBearingDimensions('recovery').size;
+    assert.ok(recoveryCoreBearing < recoveryDimsTotal,
+      `expected at least one recovery dimension to be all-non-core (post-fuelStockDays-retirement); got ${recoveryCoreBearing}/${recoveryDimsTotal}. If this flips, the fuelStockDays retirement was reverted and §3.6 math assumptions need review.`);
+
+    // Explicit: fuelStockDays is the dimension we retired. Confirm it
+    // has zero core indicators.
+    const fuelStockCoreCount = INDICATOR_REGISTRY.filter(
+      (e) => e.dimension === 'fuelStockDays' && e.tier === 'core',
+    ).length;
+    assert.equal(fuelStockCoreCount, 0,
+      'fuelStockDays must have zero core indicators post-PR 3 §3.5 retirement. If this fails, un-retire must be intentional + the gate math reviewed.');
+
+    // And the recovery-domain core indicators should each compute 5%
+    // under the corrected formula (1.0 × 1/5 × 0.25), not 4.17%.
+    const debtToReserves = INDICATOR_REGISTRY.find((e) => e.id === 'recoveryDebtToReserves');
+    assert.ok(debtToReserves != null, 'recoveryDebtToReserves must exist');
+    const computed = nominalOverallWeight(debtToReserves!);
+    // 0.05 exactly, allow fp wiggle
+    assert.ok(Math.abs(computed - 0.05) < 1e-9,
+      `recoveryDebtToReserves nominal weight should be 0.05 (1.0 × 1/5 × 0.25) post-retirement; got ${computed}. If this is 0.0417, the share denominator is using 1/6 instead of 1/5: the retired fuelStockDays dimension is not being excluded.`);
+  });
+
+  it('reports the current nominal-weight distribution for audit', () => {
+    // Visibility-only: the sole assertion is that the core set is
+    // non-empty. The output lets reviewers eyeball the distribution and
+    // spot outliers that technically pass (coverage ≥ floor) but still
+    // carry unusually high weight for a narrow construct.
+    const ranked = INDICATOR_REGISTRY
+      .filter((e) => e.tier === 'core')
+      .map((e) => ({
+        id: e.id,
+        nominalOverall: Number((nominalOverallWeight(e) * 100).toFixed(2)),
+        coverage: e.coverage,
+      }))
+      .sort((a, b) => b.nominalOverall - a.nominalOverall)
+      .slice(0, 10);
+    if (ranked.length > 0) {
+      console.warn('[PR 3 §3.6] top 10 core indicators by nominal overall weight:');
+      for (const r of ranked) {
+        console.warn(`  ${r.id}: nominal=${r.nominalOverall}%  coverage=${r.coverage}`);
+      }
+    }
+    assert.ok(ranked.length > 0, 'expected at least one core indicator');
+  });
+});
--- a/tests/resilience-dimension-monotonicity.test.mts
+++ b/tests/resilience-dimension-monotonicity.test.mts
@@ -98,12 +98,12 @@ describe('resilience dimension monotonicity — scoreExternalDebtCoverage', () =
  }

  it('higher debtToReservesRatio → lower score', async () => {
-    // NOTE: the current scorer saturates at 100 for ratio ≤ 0 (goalpost
-    // lower-better, worst=5 best=0). Picking values inside the 0-5 band
-    // to get a meaningful gradient. PR 3 §3.6 re-goalposts this.
-    const good = await scoreWith(1);
-    const bad = await scoreWith(4);
-    assert.ok(good.score > bad.score, `debtToReservesRatio 1→4 should lower score; got ${good.score} → ${bad.score}`);
+    // PR 3 §3.5 point 3: goalpost is now lower-better worst=2 best=0
+    // (Greenspan-Guidotti anchor). Any ratio ≥ 2 clamps to 0, so pick
+    // values inside the discriminating band to get a meaningful gradient.
+    const good = await scoreWith(0.3);
+    const bad = await scoreWith(1.5);
+    assert.ok(good.score > bad.score, `debtToReservesRatio 0.3→1.5 should lower score; got ${good.score} → ${bad.score}`);
  });
 });

--- a/tests/resilience-dimension-scorers.test.mts
+++ b/tests/resilience-dimension-scorers.test.mts
@@ -227,53 +227,36 @@ describe('resilience dimension scorers', () => {
      `coverage should be ~0.45 (only sanctions loaded), got ${score.coverage}`);
  });

-  it('scoreCurrencyExternal: non-BIS country with no IMF data falls back to curated_list_absent (score 50)', async () => {
-    // BIS loaded, IMF macro also null — no inflation proxy available → curated_list_absent imputation.
-    const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
-      return null; // economic:imf:macro:v1 also null
-    };
-    const score = await scoreCurrencyExternal('MZ', reader); // Mozambique not in BIS
-    assert.equal(score.score, 50, 'curated_list_absent must impute score=50 when IMF also missing');
+  it('scoreCurrencyExternal: no IMF and no reserves → curated_list_absent imputation (score 50)', async () => {
+    // PR 3 §3.5: BIS retired. Without IMF inflation or WB reserves,
+    // scorer falls through to IMPUTE.bisEer (kept for snapshot continuity).
+    const reader = async (_key: string): Promise<unknown | null> => null;
+    const score = await scoreCurrencyExternal('MZ', reader);
+    assert.equal(score.score, 50, 'curated_list_absent must impute score=50 when IMF+reserves missing');
    assert.equal(score.coverage, 0.3, 'curated_list_absent certaintyCoverage=0.3');
  });

-  it('scoreCurrencyExternal: non-BIS country with IMF inflation uses inflation proxy (coverage 0.45)', async () => {
-    // BIS loaded, IMF macro has inflation → use inflation proxy instead of curated_list_absent.
+  it('scoreCurrencyExternal: IMF inflation only (no reserves) uses inflation proxy (coverage 0.55)', async () => {
+    // PR 3 §3.5: BIS retired. IMF inflation alone gives inflation-only path (0.55).
    const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
      if (key === 'economic:imf:macro:v2') return { countries: { MZ: { inflationPct: 8, currentAccountPct: -5, year: 2024 } } };
      return null;
    };
    const score = await scoreCurrencyExternal('MZ', reader);
    // normalizeLowerBetter(min(8,50), 0, 50) = (50-8)/50*100 = 84
    assert.equal(score.score, 84, 'low-inflation country gets high currency score via IMF proxy');
-    assert.equal(score.coverage, 0.45, 'IMF inflation proxy coverage=0.45 (better than pure imputation)');
+    assert.equal(score.coverage, 0.55, 'IMF inflation only (no reserves) → coverage 0.55');
  });

-  it('scoreCurrencyExternal: non-BIS country with hyperinflation is capped at score 0', async () => {
+  it('scoreCurrencyExternal: hyperinflation is capped at score 0 (inflation-only path)', async () => {
    const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
      if (key === 'economic:imf:macro:v2') return { countries: { ZW: { inflationPct: 250, currentAccountPct: -8, year: 2024 } } };
      return null;
    };
    const score = await scoreCurrencyExternal('ZW', reader);
    // min(250, 50) = 50 → normalizeLowerBetter(50, 0, 50) = 0
    assert.equal(score.score, 0, 'hyperinflation ≥50% is capped → score 0');
-    assert.equal(score.coverage, 0.45, 'hyperinflation still gets IMF proxy coverage=0.45');
-  });
-
-  it('scoreCurrencyExternal: BIS outage + IMF inflation present → uses proxy with coverage=0.35', async () => {
-    // BIS seed is completely down (null), but IMF macro is available.
-    // The inflation proxy should still be applied — BIS outage must not block the IMF path.
-    const reader = async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:imf:macro:v2') return { countries: { MZ: { inflationPct: 6, currentAccountPct: -2, year: 2024 } } };
-      return null; // economic:bis:eer:v1 null = BIS seed outage
-    };
-    const score = await scoreCurrencyExternal('MZ', reader);
-    // normalizeLowerBetter(min(6,50), 0, 50) = (50-6)/50*100 = 88
-    assert.equal(score.score, 88, 'BIS outage must not block IMF inflation proxy');
-    assert.equal(score.coverage, 0.35, 'BIS outage reduces proxy coverage to 0.35 (primary source unavailable)');
+    assert.equal(score.coverage, 0.55, 'hyperinflation still gets IMF inflation-only coverage 0.55');
  });

  it('scoreCurrencyExternal: both BIS and IMF null → curated_list_absent imputation (T1.7)', async () => {
@@ -308,9 +291,9 @@ describe('resilience dimension scorers', () => {
    assert.ok(withReserves.coverage > 0, 'coverage must be positive with BIS + reserves');
  });

-  it('scoreCurrencyExternal: non-BIS country with good reserves scores higher than with bad reserves', async () => {
+  it('scoreCurrencyExternal: good reserves score higher than bad reserves (inflation+reserves path)', async () => {
+    // PR 3 §3.5: BIS retired. inflation+reserves path → coverage 0.85.
    const makeReader = (months: number) => async (key: string): Promise<unknown | null> => {
-      if (key === 'economic:bis:eer:v1') return { rates: [{ countryCode: 'US', realChange: 1.2, realEer: 101, date: '2025-09' }] };
      if (key === 'economic:imf:macro:v2') return { countries: { MZ: { inflationPct: 15, currentAccountPct: -5, year: 2024 } } };
      if (key === 'resilience:static:MZ') return { fxReservesMonths: { source: 'worldbank', months, year: 2023 } };
      return null;
@@ -319,7 +302,7 @@ describe('resilience dimension scorers', () => {
    const badRes = await scoreCurrencyExternal('MZ', makeReader(1.5));
    assert.ok(goodRes.score > badRes.score, `good reserves (${goodRes.score}) must score higher than bad (${badRes.score})`);
    assert.equal(goodRes.coverage, badRes.coverage, 'coverage should be the same when both have inflation+reserves');
-    assert.equal(goodRes.coverage, 0.55, 'non-BIS with inflation+reserves gets coverage=0.55');
+    assert.equal(goodRes.coverage, 0.85, 'inflation+reserves path gets coverage=0.85');
  });

  it('scoreMacroFiscal: IMF current account loaded, surplus country scores higher than deficit', async () => {
@@ -1150,8 +1133,9 @@ describe('resilience source-failure aggregation (T1.7)', () => {
  });

  it('scoreExternalDebtCoverage: low debt-to-reserves ratio scores well', async () => {
+    // PR 3 §3.5: goalpost tightened (5→2). NO ratio=0.2 → (2-0.2)/2 × 100 = 90.
    const no = await scoreExternalDebtCoverage('NO', fixtureReader);
-    assert.ok(no.score > 90, `NO with ratio 0.2 should score >90, got ${no.score}`);
+    assert.ok(no.score >= 85, `NO with ratio 0.2 should score >=85, got ${no.score}`);
  });

  it('scoreImportConcentration: low HHI scores well', async () => {
@@ -1167,17 +1151,29 @@ describe('resilience source-failure aggregation (T1.7)', () => {
    assert.equal(no.imputationClass, null, 'NO has real data, no imputation class');
  });

-  it('scoreFuelStockDays: country with stock data scores based on coverage', async () => {
+  // PR 3 §3.5: fuelStockDays retired permanently from the core score.
+  // scoreFuelStockDays returns coverage=0 + observedWeight=0 +
+  // imputationClass=null for every country regardless of seed content —
+  // the previous two behavioural tests no longer apply because there is
+  // no distinction between "has data" and "missing data" any more. New
+  // regression test: assert the retirement shape holds identically for
+  // a country that USED to have data and a country that never did, so no
+  // future commit silently re-enables the old branch.
+  //
+  // imputationClass is pinned to `null` (not 'source-failure') because
+  // 'source-failure' renders as "Source down: upstream seeder failed"
+  // with a `!` icon in the widget — semantically wrong for an intentional
+  // retirement. `null` lets the widget render the dimension as a neutral
+  // "absent" cell without a false outage label.
+  it('scoreFuelStockDays: retired — returns coverage=0 + null imputationClass for every country', async () => {
    const no = await scoreFuelStockDays('NO', fixtureReader);
-    // NO fixture: fuelStockDays=90 → normalizeHigherBetter(90, 0, 120) = 75
-    assert.ok(no.score > 60, `NO with 90 fuelStockDays should score >60, got ${no.score}`);
-    assert.ok(no.observedWeight > 0, 'real fuel-stock data must have observed weight');
-  });
-
-  it('scoreFuelStockDays: country without fuel stock data returns unmonitored', async () => {
    const ye = await scoreFuelStockDays('YE', fixtureReader);
-    assert.equal(ye.imputationClass, 'unmonitored');
-    assert.equal(ye.observedWeight, 0);
+    for (const [label, result] of [['NO', no], ['YE', ye]] as const) {
+      assert.equal(result.coverage, 0, `${label}: retired dimension must have coverage=0`);
+      assert.equal(result.observedWeight, 0, `${label}: retired dimension must have observedWeight=0`);
+      assert.equal(result.imputedWeight, 0, `${label}: retired dimension must have imputedWeight=0`);
+      assert.equal(result.imputationClass, null, `${label}: retired dimension must not tag source-failure (intentional retirement, not a runtime outage)`);
+    }
  });

  it('recovery domain is present in scoreAllDimensions output', async () => {
--- a/tests/resilience-release-gate.test.mts
+++ b/tests/resilience-release-gate.test.mts
@@ -49,12 +49,28 @@ function installRedisFixtures() {

 describe('resilience release gate', () => {
  it('keeps all 19 dimension scorers non-placeholder for the required countries', async () => {
+    // PR 3 §3.5: fuelStockDays is retired — scoreFuelStockDays emits
+    // coverage=0 + imputationClass=null for every country. The retirement
+    // is intentional (construct incomparable across net importers / net
+    // exporters). Allow-list it so the zero-coverage placeholder check
+    // still catches unintended regressions in the OTHER 18 dimensions.
+    //
+    // imputationClass=null (not 'source-failure') because the widget maps
+    // 'source-failure' to a "Source down: upstream seeder failed" label
+    // with a `!` icon — surfacing that for every country on a deliberate
+    // retirement would manufacture a false outage signal.
+    const RETIRED_DIMENSIONS = new Set(['fuelStockDays']);
    for (const countryCode of REQUIRED_DIMENSION_COUNTRIES) {
      const scores = await scoreAllDimensions(countryCode, fixtureReader);
      const entries = Object.entries(scores);
      assert.equal(entries.length, 19, `${countryCode} should have all resilience dimensions`);
      for (const [dimensionId, score] of entries) {
        assert.ok(Number.isFinite(score.score), `${countryCode} ${dimensionId} should produce a numeric score`);
+        if (RETIRED_DIMENSIONS.has(dimensionId)) {
+          assert.equal(score.coverage, 0, `${countryCode} ${dimensionId} is retired and must stay at coverage=0`);
+          assert.equal(score.imputationClass, null, `${countryCode} ${dimensionId} is retired and must carry imputationClass=null (not source-failure)`);
+          continue;
+        }
        assert.ok(score.coverage > 0, `${countryCode} ${dimensionId} should not fall back to zero-coverage placeholder scoring`);
      }
    }
--- a/tests/resilience-retired-dimensions-parity.test.mts
+++ b/tests/resilience-retired-dimensions-parity.test.mts
@@ -0,0 +1,55 @@
+import assert from 'node:assert/strict';
+import { readFileSync } from 'node:fs';
+import { fileURLToPath } from 'node:url';
+import { dirname, resolve } from 'node:path';
+import { describe, it } from 'node:test';
+
+import { RESILIENCE_RETIRED_DIMENSIONS } from '../server/worldmonitor/resilience/v1/_dimension-scorers';
+
+// Keep the client-side mirror (`RESILIENCE_RETIRED_DIMENSION_IDS` in
+// src/components/resilience-widget-utils.ts) in lockstep with the
+// server-side authoritative set. Server and widget cannot share a
+// module, but their retired-dim view must never diverge — divergence
+// would leave one surface filtering the wrong set and re-introduce
+// the PR 3 §3.5 drag regression on that surface.
+//
+// We parse the widget file as text (rather than importing it) because
+// the widget module indirectly pulls in browser-only types that crash
+// a plain node test runner. Same pattern as existing widget-util tests.
+
+const here = dirname(fileURLToPath(import.meta.url));
+const WIDGET_UTILS_PATH = resolve(here, '../src/components/resilience-widget-utils.ts');
+
+function parseClientRetiredIds(): Set<string> {
+  const source = readFileSync(WIDGET_UTILS_PATH, 'utf8');
+  const match = source.match(
+    /const RESILIENCE_RETIRED_DIMENSION_IDS:\s*ReadonlySet<string>\s*=\s*new Set\(\[([^\]]*)\]\)/,
+  );
+  if (!match) {
+    throw new Error(
+      'Could not locate RESILIENCE_RETIRED_DIMENSION_IDS constant in resilience-widget-utils.ts. ' +
+      'If the constant was renamed or reformatted, update this parser to match.',
+    );
+  }
+  const ids = match[1]!
+    .split(',')
+    .map((entry) => entry.trim())
+    .filter((entry) => entry.length > 0)
+    .map((entry) => entry.replace(/^['"]|['"]$/g, ''));
+  return new Set(ids);
+}
+
+describe('retired-dimensions client/server parity', () => {
+  it('server RESILIENCE_RETIRED_DIMENSIONS matches client RESILIENCE_RETIRED_DIMENSION_IDS', () => {
+    const serverSet = new Set<string>(RESILIENCE_RETIRED_DIMENSIONS);
+    const clientSet = parseClientRetiredIds();
+
+    const serverOnly = [...serverSet].filter((id) => !clientSet.has(id));
+    const clientOnly = [...clientSet].filter((id) => !serverSet.has(id));
+
+    assert.deepEqual(serverOnly, [],
+      `Server-only retired dims: ${serverOnly.join(', ')}. Update RESILIENCE_RETIRED_DIMENSION_IDS in src/components/resilience-widget-utils.ts.`);
+    assert.deepEqual(clientOnly, [],
+      `Client-only retired dims: ${clientOnly.join(', ')}. Update RESILIENCE_RETIRED_DIMENSIONS in server/worldmonitor/resilience/v1/_dimension-scorers.ts.`);
+  });
+});
--- a/tests/resilience-scorers.test.mts
+++ b/tests/resilience-scorers.test.mts
@@ -60,10 +60,17 @@ describe('resilience scorer contracts', () => {
    //     source-failure when the adapter is in seed-meta failedDatasets. This is the
    //     single source of truth for "no currency data"; null-imputationClass paths
    //     on non-real-data return branches are no longer permitted.
+    // PR 3 §3.5: fuelStockDays removed from this set — scoreFuelStockDays
+    // now returns coverage=0 + imputationClass=null for every country
+    // (retired), so it passes the default coverage=0 assertion below
+    // instead of the T1.7 fall-through assertion. The `null` tag (rather
+    // than 'source-failure') reflects the intentional retirement — see
+    // the widget `formatDimensionConfidence` absent-path which would
+    // otherwise surface a false "Source down" label on every country.
    const coverageZeroExempt = new Set([
      'currencyExternal',
      'fiscalSpace', 'reserveAdequacy', 'externalDebtCoverage',
-      'importConcentration', 'stateContinuity', 'fuelStockDays',
+      'importConcentration', 'stateContinuity',
    ]);
    for (const [dimensionId, scorer] of Object.entries(RESILIENCE_DIMENSION_SCORERS)) {
      const result = await scorer('US');
@@ -92,13 +99,17 @@ describe('resilience scorer contracts', () => {
      return [domainId, average];
    }));

+    // PR 3 §3.5: economic 68.33 → 66.33 after currencyExternal rebuild.
+    // Recovery 54.83 → 47.33 after externalDebtCoverage goalpost was
+    // tightened from (0..5) to (0..2) per §3.5 point 3 (US ratio=1.5
+    // now scores 25 instead of 70).
    assert.deepEqual(domainAverages, {
-      economic: 68.33,
+      economic: 66.33,
      infrastructure: 79,
      energy: 80,
      'social-governance': 61.75,
      'health-food': 60.5,
-      recovery: 54.83,
+      recovery: 47.33,
    });

    function round(v: number, d = 2) { return Number(v.toFixed(d)); }
@@ -126,9 +137,16 @@ describe('resilience scorer contracts', () => {
    const stressScore = round(coverageWeightedMean(stressDims));
    const stressFactor = round(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);

-    assert.equal(baselineScore, 62.64);
-    assert.equal(stressScore, 65.84);
-    assert.equal(stressFactor, 0.3416);
+    // PR 3 §3.5: 62.64 → 63.63 (fuelStockDays retirement) → 60.12
+    // (externalDebtCoverage goalpost tightened; US score drops from 70
+    // to 25, pulling the coverage-weighted baseline mean down).
+    assert.equal(baselineScore, 60.12);
+    // PR 3 §3.5: 65.84 → 67.85 (fuelStockDays retirement) → 67.21
+    // (currencyExternal rebuilt on IMF inflation + WB reserves, coverage
+    // shifts and US stress score moves). stressFactor updates in lockstep:
+    //   1 - 67.21/100 = 0.3279 (below the 0.5 cap, so the clamp is a no-op).
+    assert.equal(stressScore, 67.21);
+    assert.equal(stressFactor, 0.3279);

    const overallScore = round(
      RESILIENCE_DOMAIN_ORDER.map((domainId) => {
@@ -140,7 +158,10 @@ describe('resilience scorer contracts', () => {
        return round(cwMean) * getResilienceDomainWeight(domainId);
      }).reduce((sum, v) => sum + v, 0),
    );
-    assert.equal(overallScore, 65.57);
+    // PR 3 §3.5: 65.57 → 65.82 (fuelStockDays retirement) → 65.52
+    // (currencyExternal rebuild) → 63.27 (externalDebtCoverage goalpost
+    // tightened 0..5 → 0..2; US recovery-domain contribution drops).
+    assert.equal(overallScore, 63.27);
  });

  it('baselineScore is computed from baseline + mixed dimensions only', async () => {
@@ -211,7 +232,9 @@ describe('resilience scorer contracts', () => {
    );

    assert.ok(expected > 0, 'overall should be positive');
-    assert.equal(expected, 65.57, 'overallScore should match sum(domainScore * domainWeight)');
+    // PR 3 §3.5: 65.82 → 65.52 (currencyExternal rebuild) → 63.27 after
+    // externalDebtCoverage goalpost tightened from (0..5) to (0..2).
+    assert.equal(expected, 63.27, 'overallScore should match sum(domainScore * domainWeight); 65.52 → 63.27 after PR 3 §3.5 externalDebtCoverage re-goalpost');
  });

  it('stressFactor is still computed (informational) and clamped to [0, 0.5]', () => {
--- a/tests/resilience-widget.test.mts
+++ b/tests/resilience-widget.test.mts
@@ -76,6 +76,45 @@ test('formatResilienceConfidence shows sparse-data copy when low confidence is s
  );
 });

+// PR 3 §3.5 follow-up: retired dimensions (fuelStockDays, post-PR-3)
+// return coverage=0 structurally (by design, not by sparsity) and
+// contribute zero weight to domain scoring. The widget's displayed
+// coverage percentage must exclude them — otherwise a deliberate
+// construct retirement would drag the user-facing confidence reading
+// down for every country even though the dimension is not part of the
+// score. Reviewer P1 anchor: US shows avgCoverage=0.8105 with retired
+// dim included vs 0.8556 with retired excluded.
+//
+// Important: the filter is keyed on the retired-dim ID, NOT on
+// `coverage === 0`. A non-retired dimension can legitimately emit
+// coverage=0 on a genuinely sparse-data country (via weightedBlend
+// fall-through), and those entries must continue to drag confidence
+// down — that is the sparse-data signal lowConfidence exists to
+// surface.
+test('formatResilienceConfidence excludes retired dimensions by ID (not by coverage=0)', () => {
+  const withRetired: ResilienceScoreResponse = {
+    ...baseResponse,
+    domains: [
+      { id: 'economic', score: 80, weight: 0.22, dimensions: [
+        { id: 'macroFiscal', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 },
+        // Non-retired dim with coverage=0: must STAY in the average
+        // (genuine data sparsity, not a retirement).
+        { id: 'currencyExternal', score: 50, coverage: 0, observedWeight: 0, imputedWeight: 0 },
+      ] },
+      { id: 'recovery', score: 65, weight: 1.0, dimensions: [
+        { id: 'fiscalSpace', score: 72, coverage: 0.8, observedWeight: 0.8, imputedWeight: 0.2 },
+        // Retired dimension: coverage=0 is structural; must be excluded.
+        { id: 'fuelStockDays', score: 50, coverage: 0, observedWeight: 0, imputedWeight: 0 },
+      ] },
+    ],
+  };
+  // Average over non-retired entries: (0.9 + 0 + 0.8) / 3 = 0.5667 → 57%.
+  // If fuelStockDays were included: (0.9 + 0 + 0.8 + 0) / 4 = 0.425 → 43%.
+  // If we filtered by coverage=0: (0.9 + 0.8) / 2 = 0.85 → 85% (the
+  // over-aggressive filter that would mask genuine sparsity).
+  assert.equal(formatResilienceConfidence(withRetired), 'Coverage 57% ✓');
+});
+
 test('formatResilienceChange30d preserves explicit sign formatting', () => {
  assert.equal(formatResilienceChange30d(2.41), '30d +2.4');
  assert.equal(formatResilienceChange30d(-1.26), '30d -1.3');