feat(resilience): PR 0 diagnostic freeze + fairness-audit harness (no scoring changes) (#3284)

* feat(resilience): PR 0 diagnostic freeze + fairness-audit harness

Lands the before-state and measurement apparatus every subsequent resilience-scorer PR validates against. Zero scoring changes.

Per the v3 plan at docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md this is tranche 0 of five.

What lands:

- Construct contract published in the methodology doc: absolute resilience not development-adjusted, mechanism test for every indicator, peer-relative views published separately from the core.
- Known construct limitations section: six construct errors scheduled for PR 1-3 repair with explicit mapping to plan tranches.
- Indicator-source manifest at docs/methodology/indicator-sources.yaml with source, seriesId, seriesUrl, coveragePct, lastObservedYear, license, mechanismTestRationale, and a constructStatus classification.
- Pre-repair ranking snapshot at docs/snapshots/resilience-ranking-live-pre-repair-2026-04-22.json (217 items + 5 greyedOut, captured 2026-04-22 08:38 UTC at commit 425507d15).
- Cohort configuration at tests/helpers/resilience-cohorts.mts: six cohorts covering 87 countries (net-fuel-exporters, net-energy-importers-oecd, nuclear-heavy-generation, coal-heavy-domestic, small-island-importers, fragile-states).
- Matched-pair sanity panel at tests/helpers/resilience-matched-pairs.mts: six pairs (FR/DE, NO/CA, UAE/BH, JP/KR, IN/ZA, SG/CH) with expected-direction rationale and minGap for acceptance gate 7.
- scripts/compare-resilience-current-vs-proposed.mjs extended to emit cohortSummary and matchedPairSummary alongside the existing output shape (backward compatible).
- tests/resilience-cohort-config.test.mts: 11 validations ensuring the cohort + matched-pair configs stay well-formed.

Deferred to PR 0.5 (before PR 1 lands):

- Monotonicity test harness for all 19 dimension scorers pinning the sign of every indicator.
- Pearson-derivative variable-influence baseline inside the sensitivity script producing the nominal-weight-vs-effective-influence table that plan acceptance gate 8 requires.

Verification: typecheck:all clean, 430/430 resilience tests pass, 11/11 new cohort-config tests pass, snapshot auto-discovered and validated by the existing snapshot-test harness.

* feat(resilience): PR 0 follow-ups: monotonicity harness, variable-influence baseline, cross-consumer formula gate

Completes the PR 0 scope per the v3 plan §5 deliverables. Three adds:

1. Monotonicity test harness

   tests/resilience-dimension-monotonicity.test.mts pins the direction of movement for 14 indicators across 7 dimensions (reserve adequacy, fiscal space 3x, external debt coverage, import concentration, governance WGI, food/water 2x, energy 5x). Each test builds two synthetic ResilienceSeedReader fixtures differing only in the target indicator and asserts the dimension score moves in the documented direction. The scoreEnergy tests explicitly flag three indicators (gasShare, coalShare, electricityConsumption) that PR 1 §3.1-3.2 overturns so future readers understand which directional claims the plan intentionally replaces.

2. Variable-influence baseline

   scripts/compare-resilience-current-vs-proposed.mjs now computes per-dimension Pearson correlation against the current overallScore scaled by the dimension's nominal domain weight (a Pearson-derivative approximation of Sobol indices). The output carries a variableInfluence[] array sorted by abs(effectiveInfluence) desc. Acceptance gate 8 from the plan compares post-change effective influence against assigned nominal weight; divergences flag a wealth-proxy or saturated-signal construct problem.

3. Cross-consumer formula gate

   Five external consumers of resilience:score:v10:* now filter stale-formula entries so a flag flip does not serve mixed-formula data downstream:

   - server/worldmonitor/supply-chain/v1/get-route-impact.ts: readResilienceScore() checks _formula via the new getCurrentCacheFormula export and returns 0 on mismatch.
   - scripts/validate-resilience-correlation.mjs, scripts/validate-resilience-backtest.mjs, scripts/backtest-resilience-outcomes.mjs, scripts/benchmark-resilience-external.mjs: each inlines a currentCacheFormulaLocal() helper that mirrors the server's formula derivation from env, skips parsed entries whose _formula disagrees, and logs the skip count so operators can notice a mismatch during the flip window.

   A mixed-formula cohort (some countries d6-tagged, others pc-tagged) would confound every correlation, AUC, and Spearman this repair plan depends on for its acceptance gates. These guards close that gap.

Verification: typecheck:all clean, 444/444 resilience tests pass (+14 from the new monotonicity harness).

* fix(resilience): PR 0 review follow-ups: sample-union + doc tense

Two review-driven fixes on top of PR 0.

1. scripts/compare-resilience-current-vs-proposed.mjs: the cohort and matched-pair summaries were computed against the historical 52-country sensitivity seed, which silently excluded the small-island-importers cohort (zero members in the seed) and the sg-vs-ch matched pair (Singapore not in the seed). With the current script those acceptance gates are partially measured at best.

   SAMPLE now = union(historical 52 seed, every cohort member, every matched-pair endpoint). The imports for RESILIENCE_COHORTS and MATCHED_PAIRS moved from inside main() to module scope so the union can be computed before the script runs.

   Net sample size grows from 52 to ~95 countries. Still fast enough for an interactive pass; makes the acceptance gates honest.

2. docs/methodology/country-resilience-index.mdx: the construct contract wording read as present-tense compliance ("Every indicator in the scorer passes a single mechanism test"), which contradicted the immediately-following passage about indicators that currently fail the test. Reworded to "is being evaluated against" and added an explicit PR-0-does-not-change-scoring paragraph that names the known-failing indicators (electricityConsumption, gas/coal flat penalties, WHO per-capita health spend) and points at the repair plan for the replacement schedule.

Verification: typecheck:all clean, 444/444 resilience tests pass.

* fix(resilience): compare-script loads frozen baseline + emits per-indicator influence

Addresses two P1 review findings on PR #3284:

1. Script previously compared current-6d vs proposed-pillar-combined from the SAME checkout; never loaded the frozen pre-PR-0 baseline, so acceptance gates 2/6/7 ("no country moved >15pts vs baseline", cohort median shift vs baseline, matched-pair gap change vs baseline) could not be enforced for later scorer PRs.

   Now auto-discovers the most recent resilience-ranking-live-pre-repair-<date>.json (or post-<pr>-<date>) in docs/snapshots/ and emits a baselineComparison block with: spearmanVsBaseline, maxCountryAbsDelta, biggestDriftsVsBaseline, cohortShiftVsBaseline, matchedPairGapChange. If no baseline is found, the block is emitted with status 'unavailable' so callers distinguish missing-baseline from passed-baseline.

2. variableInfluence was emitted only at the dimension level, which hid the exact sub-indicators the repair plan targets (electricityConsumption, gasShare, coalShare, etc.) inside their parent dimension.

   Added extractIndicatorValues() which pulls twelve construct-risk indicators per country from the shared memoized reader, then computes per-indicator Pearson correlation against the current overall score. Emitted as perIndicatorInfluence[], sorted by absolute effective influence. Acceptance gate 8 ("effective influence agrees in sign and rank-order with assigned nominal weights") is now computable at the indicator level, not only at the dimension level.

No production code touched; diagnostic-harness only.

* fix(resilience): baseline-snapshot selection by structured parse, not filename sort

Addresses P1 review on compare-resilience-current-vs-proposed.mjs:118-130. Plain filename sort breaks the "immediate-prior state" contract two ways:

1. Lexical ordering: `pre-repair` sorts after `post-*` (both start with 'p'; at the second character 'r' > 'o'), so the PR-0 freeze would keep winning even after post-PR snapshots exist. Later scorer PRs would then report acceptance-gate deltas against the original pre-repair freeze instead of the immediately-prior post-PR-(N-1) snapshot; the gate would appear valid while measuring against the wrong baseline.

2. Lexical ordering: `pr10` < `pr9` (digit-by-digit), so PR-10 would lose the selection to PR-9.

Fix: parseBaselineSnapshotMeta() extracts (kind, prNumber, date) from the filename. Sort keys are (kindRank desc, prNumber desc, date desc):

- post always beats pre-repair (kindRank 1 vs 0)
- among posts, prNumber compared numerically (10 beats 9)
- date breaks ties (same-PR re-snapshots, later capture wins)
- unlabeled post tags get prNumber 0 so they sort between pre-repair and any numbered PR snapshot

Surfaced in output: baselineKind / baselinePrNumber / baselineDate alongside baselineFile so the operator can verify which snapshot was selected without having to reopen the file.

Module now isMain-guarded per feedback_seed_isMain_guard memory so tests can import parseBaselineSnapshotMeta without firing the scoring run.

Added tests/resilience-baseline-snapshot-ordering.test.mjs (9 tests) pinning the ordering contract for every known failure mode.

Diagnostic-harness change only. No production code touched.

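
The ordering contract described for the baseline-snapshot fix can be sketched roughly as below. This is a minimal illustration, not the harness's actual code: the names `parseSnapshotMetaSketch`, `compareSnapshotsSketch`, and `selectBaselineSketch` are hypothetical, and the `post-pr<N>-<date>` filename shape is an assumption (the commit message only says `post-<pr>-<date>`).

```typescript
// Hypothetical sketch of structured baseline selection. The real
// parseBaselineSnapshotMeta() may use a different filename grammar;
// the `post-pr<N>-<date>` tag shape below is an assumption.
interface SnapshotMeta {
  file: string;
  kind: "pre-repair" | "post";
  kindRank: number; // post = 1 beats pre-repair = 0
  prNumber: number; // 0 for pre-repair and for unlabeled post tags
  date: string; // ISO YYYY-MM-DD, so string comparison is chronological
}

const SNAPSHOT_RE =
  /^resilience-ranking-live-(?:(pre-repair)|post-(.+?))-(\d{4}-\d{2}-\d{2})\.json$/;

function parseSnapshotMetaSketch(file: string): SnapshotMeta | null {
  const m = SNAPSHOT_RE.exec(file);
  if (m === null) return null;
  const date = m[3] ?? "";
  if (m[1] !== undefined) {
    return { file, kind: "pre-repair", kindRank: 0, prNumber: 0, date };
  }
  // Numbered tags look like "pr10"; any other tag counts as unlabeled (0).
  const pr = /^pr(\d+)$/.exec(m[2] ?? "");
  const prNumber = pr === null ? 0 : Number(pr[1]);
  return { file, kind: "post", kindRank: 1, prNumber, date };
}

// Sort keys: kindRank desc, then prNumber desc (numeric), then date desc.
function compareSnapshotsSketch(a: SnapshotMeta, b: SnapshotMeta): number {
  if (a.kindRank !== b.kindRank) return b.kindRank - a.kindRank;
  if (a.prNumber !== b.prNumber) return b.prNumber - a.prNumber;
  if (a.date === b.date) return 0;
  return a.date < b.date ? 1 : -1;
}

function selectBaselineSketch(files: string[]): SnapshotMeta | null {
  const metas: SnapshotMeta[] = [];
  for (const file of files) {
    const meta = parseSnapshotMetaSketch(file);
    if (meta !== null) metas.push(meta); // non-snapshot files are ignored
  }
  metas.sort(compareSnapshotsSketch);
  return metas[0] ?? null;
}
```

Given the pre-repair snapshot plus hypothetical `post-pr9` and `post-pr10` files, a lexical sort would rank `pre-repair` highest, whereas this comparator selects `post-pr10`.
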
* fix(resilience): full scorable universe + registry-driven per-indicator influence

Addresses two fresh P1 review findings on the PR 0 compare harness.

Finding 1: acceptance math ran on a curated ~95-country sample, so plan gate 2 could miss large regressions in excluded countries.

- Main scoring loop now iterates the FULL scorable universe (listScorableCountries()), not the 52-country seed + cohort union.
- Removed SAMPLE / HISTORICAL_SENSITIVITY_SEED constants.
- Added scorableUniverseSize + cohortMissingFromScorable to output so operators see universe size and any cohort/pair endpoint that listScorable refuses to score (fail-loud, not silent drop).

Finding 3: per-indicator influence was a hand-picked 12-indicator subset, hiding most registry indicators from the baseline that later scorer PRs need.

- Extraction is now driven by INDICATOR_REGISTRY. Every Core + Enrichment indicator gets a row with explicit extractionStatus: implemented | not-implemented (with reason) | unregistered-in-harness
- EXTRACTION_RULES covers 40/59 indicators across 11 shape families (static-path, static-wb-infrastructure, static-wgi, static-wgi-mean, static-who, energy-mix-field, gas-storage-field, recovery-country-field, imf-macro/labor-country-field, national-debt, sanctions-count).
- Remaining 19 indicators need either a scorer trace hook (PR 0.5) or a safe aggregation duplicate; each carries a reason string.
- extractionCoverage summary (totalIndicators / implemented / notImplemented / unregisteredInHarness / coreImplemented / coreTotal) exposed in output so PR 0.5 progress is measurable.

Added tests/resilience-indicator-extraction-plan.test.mjs (11 tests) pinning: every registry entry has an extraction row; not-implemented rows carry a reason; all 12 plan-named construct-risk indicators stay extractable; Core-tier coverage floor of 45%; shape-family unit tests.

Diagnostic-harness change only. No production code touched.

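
A rough sketch of the registry-driven extraction plan described in this commit follows. The type shapes and the function names `buildExtractionPlanSketch` and `summarizeCoverageSketch` are assumptions for illustration; only the three `extractionStatus` values and the `extractionCoverage` field names come from the commit message, and the real `INDICATOR_REGISTRY` and `EXTRACTION_RULES` are likely richer than shown.

```typescript
// Hypothetical shapes; the real registry and rule tables may differ.
type Tier = "core" | "enrichment";
type ExtractionStatus =
  | "implemented"
  | "not-implemented"
  | "unregistered-in-harness";

interface RegistryEntry {
  id: string;
  tier: Tier;
}

type ExtractionRule =
  | { status: "implemented"; extract: (iso2: string) => number | null }
  | { status: "not-implemented"; reason: string };

interface PlanRow {
  indicator: string;
  tier: Tier;
  extractionStatus: ExtractionStatus;
  reason?: string;
}

// One row per registry entry, so a missing rule is visible instead of dropped.
function buildExtractionPlanSketch(
  registry: RegistryEntry[],
  rules: Map<string, ExtractionRule>
): PlanRow[] {
  return registry.map((entry): PlanRow => {
    const base = { indicator: entry.id, tier: entry.tier };
    const rule = rules.get(entry.id);
    if (rule === undefined) {
      return { ...base, extractionStatus: "unregistered-in-harness" };
    }
    if (rule.status === "not-implemented") {
      return { ...base, extractionStatus: "not-implemented", reason: rule.reason };
    }
    return { ...base, extractionStatus: "implemented" };
  });
}

interface ExtractionCoverage {
  totalIndicators: number;
  implemented: number;
  notImplemented: number;
  unregisteredInHarness: number;
  coreImplemented: number;
  coreTotal: number;
}

function summarizeCoverageSketch(rows: PlanRow[]): ExtractionCoverage {
  const count = (pred: (r: PlanRow) => boolean): number =>
    rows.filter(pred).length;
  return {
    totalIndicators: rows.length,
    implemented: count((r) => r.extractionStatus === "implemented"),
    notImplemented: count((r) => r.extractionStatus === "not-implemented"),
    unregisteredInHarness: count(
      (r) => r.extractionStatus === "unregistered-in-harness"
    ),
    coreImplemented: count(
      (r) => r.tier === "core" && r.extractionStatus === "implemented"
    ),
    coreTotal: count((r) => r.tier === "core"),
  };
}
```

Because every registry entry yields a row, a Core-tier coverage floor can be asserted as `coreImplemented / coreTotal` without silently skipping indicators that have no rule.
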
* fix(resilience): wire event-aggregate per-indicator influence via exported scorer helpers

Addresses P1 review on PR 0 compare harness. Previous commit marked 16 Core-tier indicators as 'not-implemented' because they needed scorer event-window/severity-weighting math; that left the gate-9 acceptance apparatus incomplete for a large part of the shipped score.

Fix: export the scorer-internal aggregation helpers so the harness calls them directly. Zero aggregation math is duplicated in the harness, so harness and scorer cannot drift.

Exported from _dimension-scorers.ts (purely additive): summarizeCyber, summarizeOutages, summarizeGps, summarizeUcdp, summarizeUnrest, summarizeSocialVelocity, getCountryDisplacement, getThreatSummaryScore, countTradeRestrictions, countTradeBarriers.

13 extraction rules moved from not-implemented to implemented: cyberThreats, internetOutages, infraOutages, gpsJamming, ucdpConflict, unrestEvents, socialVelocity, newsThreatScore, displacementTotal, displacementHosted, tradeRestrictions, tradeBarriers, recoveryConflictPressure, recoveryDisplacementVelocity.

Coverage: 52/59 total (88%), 46/50 Core-tier (92%).

Four Core indicators remain not-implemented for STRUCTURAL reasons, NOT missing code. Scorer inputs are genuinely global scalars with zero per-country variance, so Pearson(indicator, overall) is 0 or NaN by construction:

- shippingStress, transitDisruption, energyPriceStress: scorer reads a global scalar applied to every country; a per-country effective signal would need re-expression as (global x per-country exposure), which is a derived signal in a different entry.
- aquastatWaterAvailability: needs a distinct sub-indicator path resolver; enrichment follow-up.

New test asserts the three no-per-country-variance indicators STAY not-implemented with a matching reason, so any future extraction that appears to cover them without fixing the underlying construct fails.

Dispatcher split into STATIC / SIMPLE / AGGREGATE extractor tables to stay under biome complexity limit. Core-tier floor test raised from 45% to 80%.

89 resilience tests pass, typecheck clean, biome clean. No production behaviour changes.

* fix(resilience): tag-gated AQUASTAT extractor closes the last fixable Core gap

Reviewer flagged aquastatWaterAvailability as the only remaining Core indicator where the not-implemented status was structurally fixable rather than conceptually impossible.

Both aquastatWaterStress and aquastatWaterAvailability share a single .aquastat.value field; the scorer's scoreAquastatValue splits them by the sibling .aquastat.indicator tag keyword (stress/withdrawal/dependency to stress family; availability/renewable/access to availability family). The harness now mirrors this branching:

- classifyAquastatFamily implements the scorer's priority order (stress-family match wins even if the tag also contains an availability keyword, matching the sequential if-check at _dimension-scorers.ts L770-776).
- static-aquastat-stress / static-aquastat-availability extractors return the value only when the family matches, so stress-family readings never corrupt the availability Pearson and vice versa.

Core-tier coverage: 46/50 to 47/50 (94%). The 3 remaining Core not-implemented indicators (shippingStress, transitDisruption, energyPriceStress) are all structural impossibilities: scorer inputs are global scalars with zero per-country variance.

New contract test pins both directions of the tag gate plus the priority-order edge case (a tag containing both families' keywords routes to stress).

90 resilience tests pass, typecheck clean, biome clean.

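
The tag gate described for the AQUASTAT extractor might look roughly like the sketch below. The keyword lists and the stress-first priority come from the commit message; the function names `classifyAquastatFamilySketch` and `extractAquastatSketch`, the `AquastatReading` shape, the `"unknown"` fallback, and the case-insensitive substring match are assumptions, not the scorer's confirmed behaviour.

```typescript
// Hypothetical sketch of the tag-gated AQUASTAT extraction.
type AquastatFamily = "stress" | "availability" | "unknown";

const STRESS_KEYWORDS = ["stress", "withdrawal", "dependency"];
const AVAILABILITY_KEYWORDS = ["availability", "renewable", "access"];

// Stress keywords are checked first, so a tag that mentions both
// families routes to "stress" (mirrors the sequential if-check).
function classifyAquastatFamilySketch(tag: string | undefined): AquastatFamily {
  const normalized = (tag ?? "").toLowerCase(); // case-folding is an assumption
  if (STRESS_KEYWORDS.some((k) => normalized.includes(k))) return "stress";
  if (AVAILABILITY_KEYWORDS.some((k) => normalized.includes(k))) {
    return "availability";
  }
  return "unknown";
}

// Assumed shape of the per-country `.aquastat` record.
interface AquastatReading {
  value?: number;
  indicator?: string;
}

// Returns the shared value only when the tag belongs to the requested
// family, so one family's readings never leak into the other's series.
function extractAquastatSketch(
  reading: AquastatReading | undefined,
  family: "stress" | "availability"
): number | null {
  if (reading === undefined || typeof reading.value !== "number") return null;
  return classifyAquastatFamilySketch(reading.indicator) === family
    ? reading.value
    : null;
}
```

With this gate, a reading tagged with both "renewable" and "dependency" contributes to the stress series only, which is the priority-order edge case the contract test pins.
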
2026-04-25 17:14:57 +02:00 · 2026-04-22 16:44:12 +04:00
parent 2765b46dad
commit da0f26a3cf
16 changed files with 5207 additions and 34 deletions
--- a/docs/methodology/country-resilience-index.mdx
+++ b/docs/methodology/country-resilience-index.mdx
@@ -12,6 +12,29 @@ This document describes the **currently shipping** behavior of the index. The ve

 Everything documented below describes the **currently shipping** state: schemaVersion `"2.0"` shape, 6 domains × 19 dimensions × 3 pillars, and the 6-domain weighted `overall_score`. When an operator flips the pillar-combined flag on, the subsection on [Pillar-combined score activation](#pillar-combined-score-activation-flag-gated-default-off) documents what changes.

+## Construct contract
+
+Country Resilience measures **absolute national shock-absorption and recovery capacity at a point in time**. It does not adjust for income level. Development-adjacent indicators enter only when they measure a direct resilience mechanism. Those indicators use threshold or saturating transforms so the score rewards functional capacity, not affluence itself. Peer-relative over- and under-performance will be published separately as an analytical overlay, not inside the core score.
+
+The scorer will treat development as relevant only where it creates a direct and measurable shock-absorption mechanism. Pure level-of-affluence proxies are excluded. Development-relative overperformance will be reported separately and will not alter the ordinal country ranking.
+
+Every indicator in the scorer is being evaluated against a single **mechanism test**: *what direct shock channel does this measure?* An indicator whose only answer is "this country is rich" is excluded from the core score regardless of its historical correlation with resilience outcomes. An indicator whose answer is "capacity X absorbs shock Y" can enter but must use a threshold or saturating transform so it rewards the mechanism rather than the level of resource that drives it.
+
+This PR (the diagnostic freeze) does not change any scoring behaviour. It ships the mechanism-test framework and the apparatus to measure compliance; it does not claim compliance. Several indicators in the current scorer fail the test (notably `electricityConsumption`, `gasShare` / `coalShare` as flat domestic-fossil penalties, and `WHO per-capita health spend`). They are tagged `wealth-proxy` or equivalent in `docs/methodology/indicator-sources.yaml` and scheduled for replacement in PR 1 / PR 4 of the repair plan. Published rankings today reflect the pre-repair scorer; the mechanism-test contract applies fully only after PR 4.
+
+## Known construct limitations (in repair)
+
+The first-publication repair is sequenced as PR 0 → PR 1 → PR 3 → PR 2 → PR 4 under the plan above. At the time of writing (PR 0 shipping), the following six construct errors are known and scheduled:
+
+1. **`electricityConsumption` is a wealth proxy, not a resilience signal.** Weight 0.30 on the `energy` dimension; rewards per-capita load rather than grid-integrity capacity. Replaced in PR 1 with `powerLossesPct`, `reserveMarginPct`, and `accessToElectricityPct` (the last moved to the `infrastructure` domain).
+2. **Gas and coal penalized as vulnerability even when domestic.** Current `gasShare` / `coalShare` penalties conflate fossil-dominance with fossil-import-dependence. Replaced in PR 1 with a single `importedFossilDependence` composite using World Bank `EG.IMP.CONS.ZS`.
+3. **No nuclear credit in `scoreEnergy`.** Nuclear-heavy generation scores no points despite firm low-carbon characteristics. Fixed in PR 1 by collapsing `renewShare` + new `nuclearShare` into a single `lowCarbonGenerationShare` indicator.
+4. **Sovereign-wealth buffers invisible to `reserveAdequacy`.** Current dimension only sees central-bank reserves; SWF assets are not counted. Fixed in PR 2 by splitting the dimension into `liquidReserveAdequacy` + `sovereignFiscalBuffer` with a three-component haircut (access × liquidity × transparency) and a saturating transform.
+5. **Dead and regional-only signals in the global core score.** `fuelStockDays` (100% imputed globally), `euGasStorageStress` (EU-only), and `currencyExternal` (BIS 64-economy coverage) currently carry material weight despite insufficient coverage for a world ranking. Retired or scoped regional-only in PR 3.
+6. **No coverage-based weight cap.** A dimension at 30% observed coverage carries the same weight as one at 95%. Fixed in PR 3 with a CI-enforced rule: no indicator with observed coverage below 70% may exceed 5% nominal weight or 5% effective influence.
+
+Each item maps to an acceptance gate and a spec in the repair plan. Until PR 1–PR 3 land, published rankings reflect the current construct and should be read in that context.
+
 ## In the dashboard

 CRI is surfaced across three places in the product, all driven from the same currently-shipping score:
--- a/docs/methodology/indicator-sources.yaml
+++ b/docs/methodology/indicator-sources.yaml
@@ -0,0 +1,686 @@
+# Resilience scorer indicator-source manifest (PR 0 scaffold, 2026-04-22).
+#
+# One entry per sub-indicator used inside a dimension scorer. Each entry
+# answers the mechanism test from docs/plans/2026-04-22-001-fix-resilience-
+# scorer-structural-bias-plan.md §1.1: what direct shock channel does this
+# measure?
+#
+# Fields:
+#   indicator              — scorer variable name (matches the weighted-blend entry)
+#   dimension              — parent dimension id (matches RESILIENCE_DIMENSION_ORDER)
+#   domain                 — parent domain id
+#   weight                 — current nominal weight inside the dimension blend
+#   direction              — higher-better | lower-better | composite
+#   source                 — authority that publishes the series
+#   seriesId               — canonical series id where applicable (e.g. EG.IMP.CONS.ZS)
+#   seriesUrl              — direct link to source documentation
+#   coveragePct            — observed-data coverage across the 222-country static index (first-pass estimate; authoritative value lives in the matching seeder's seed-meta.coverage field)
+#   lastObservedYear       — most-recent year with global data in the source
+#   license                — reuse license (CC-BY, CC0, OGL, Proprietary-with-fair-use, etc.)
+#   mechanismTestRationale — one-sentence answer to "what direct shock channel does this measure?"
+#   constructStatus        — observed-mechanism | wealth-proxy | imputed-floor | regional-only | dead-signal
+#
+# constructStatus is the v3-plan classification:
+#   observed-mechanism  — passes the mechanism test; kept as-is pending goalpost review
+#   wealth-proxy        — fails the mechanism test; slated for removal or threshold-transform
+#   imputed-floor       — data source not wired; producing only the imputed midpoint
+#   regional-only       — data source covers <50% of scorable countries
+#   dead-signal         — saturated or compressed; signal collapsed across the ranking
+
+# ECONOMIC DOMAIN (weight 0.17) -------------------------------------------
+
+- indicator: govRevenuePct
+  dimension: macroFiscal
+  domain: economic
+  weight: 0.50
+  direction: higher-better
+  source: IMF
+  seriesId: GGR_G01_GDP_PT
+  seriesUrl: https://www.imf.org/external/datamapper/GGR_G01_GDP_PT@FM/
+  coveragePct: 0.90
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: Government revenue % GDP is the policy-response headroom a state can deploy during a fiscal shock; higher = more ability to absorb.
+  constructStatus: observed-mechanism
+  reviewNotes: Goalpost 5-45 is probably too wide at the top; Nordic revenue/GDP ≥ 50% saturates. Review in PR 4 goalpost pass.
+
+- indicator: debtGrowthRate
+  dimension: macroFiscal
+  domain: economic
+  weight: 0.20
+  direction: lower-better
+  source: National debt databases (IMF + WB cross-source)
+  seriesId: TBD
+  seriesUrl: TBD
+  coveragePct: 0.80
+  lastObservedYear: 2024
+  license: Mixed (per-source)
+  mechanismTestRationale: Rising debt growth indicates deteriorating fiscal trajectory and reduced capacity to finance a shock response.
+  constructStatus: observed-mechanism
+
+- indicator: currentAccountPct
+  dimension: macroFiscal
+  domain: economic
+  weight: 0.30
+  direction: higher-better
+  source: IMF
+  seriesId: BCA_NGDPD
+  seriesUrl: https://www.imf.org/external/datamapper/BCA_NGDPD@WEO/
+  coveragePct: 0.85
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: Current account surplus indicates external-payments resilience; deficit indicates vulnerability to external-finance shocks.
+  constructStatus: observed-mechanism
+
+- indicator: fxVolatility
+  dimension: currencyExternal
+  domain: economic
+  weight: 0.60
+  direction: lower-better
+  source: BIS Data Portal
+  seriesId: Broad Effective Exchange Rate
+  seriesUrl: https://data.bis.org/topics/EER
+  coveragePct: 0.30  # BIS covers 64 economies
+  lastObservedYear: 2024
+  license: CC-BY
+  mechanismTestRationale: FX volatility measures monetary-shock transmission risk.
+  constructStatus: regional-only
+  reviewNotes: BIS EER covers only 64 economies. Replace with FR.INR.RINR (real interest rate) or IMF inflation volatility in PR 3.
+
+- indicator: fxDeviation
+  dimension: currencyExternal
+  domain: economic
+  weight: 0.25
+  direction: lower-better
+  source: BIS Data Portal
+  seriesId: EER deviation from equilibrium
+  seriesUrl: https://data.bis.org/topics/EER
+  coveragePct: 0.30
+  lastObservedYear: 2024
+  license: CC-BY
+  mechanismTestRationale: EER deviation from equilibrium proxies mis-aligned exchange rates that create abrupt-correction risk.
+  constructStatus: regional-only
+  reviewNotes: Same 64-economy limitation. Retire in PR 3.
+
+- indicator: fxReservesAdequacy
+  dimension: currencyExternal
+  domain: economic
+  weight: 0.15
+  direction: higher-better
+  source: World Bank
+  seriesId: FI.RES.TOTL.MO
+  seriesUrl: https://data.worldbank.org/indicator/FI.RES.TOTL.MO
+  coveragePct: 0.85
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: Reserves in months of imports directly measures immediate external-finance cushion.
+  constructStatus: observed-mechanism
+  reviewNotes: Also enters reserveAdequacy at 1.0 weight. Double-counting risk across dimensions; review in PR 2.
+
+- indicator: sanctionCount
+  dimension: tradeSanctions
+  domain: economic
+  weight: 0.45
+  direction: lower-better
+  source: OFAC
+  seriesId: Consolidated Sanctions List (count per country)
+  seriesUrl: https://sanctionslist.ofac.treas.gov/
+  coveragePct: 1.00
+  lastObservedYear: 2026
+  license: US Government public domain
+  mechanismTestRationale: Active sanctions restrict trade/finance channels; higher count = more channels restricted.
+  constructStatus: observed-mechanism
+  reviewNotes: OFAC-only. PR 4 adds EU/UK/CN sanctions for directional completeness.
+
+- indicator: tradeRestrictions
+  dimension: tradeSanctions
+  domain: economic
+  weight: 0.15
+  direction: lower-better
+  source: WTO
+  seriesId: Trade Monitoring Database
+  seriesUrl: https://www.wto.org/english/tratop_e/tpr_e/trade_monitoring_e.htm
+  coveragePct: 0.75
+  lastObservedYear: 2025
+  license: Open
+  mechanismTestRationale: Active trade restrictions (in-force, weighted 3×) directly measure market-access loss.
+  constructStatus: observed-mechanism
+
+- indicator: tradeBarriers
+  dimension: tradeSanctions
+  domain: economic
+  weight: 0.15
+  direction: lower-better
+  source: WTO
+  seriesId: Trade Barriers Notifications
+  seriesUrl: https://tradebarriers.wto.org/
+  coveragePct: 0.70
+  lastObservedYear: 2025
+  license: Open
+  mechanismTestRationale: Notified trade barriers (not yet in force) indicate near-term market-access risk.
+  constructStatus: observed-mechanism
+
+- indicator: appliedTariffRate
+  dimension: tradeSanctions
+  domain: economic
+  weight: 0.25
+  direction: lower-better
+  source: World Bank / WITS
+  seriesId: TM.TAX.MRCH.WM.AR.ZS
+  seriesUrl: https://data.worldbank.org/indicator/TM.TAX.MRCH.WM.AR.ZS
+  coveragePct: 0.90
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: Applied tariff rates measure cost of trade restriction on imports.
+  constructStatus: observed-mechanism
+
+# INFRASTRUCTURE DOMAIN (weight 0.15) -------------------------------------
+
+- indicator: cyberThreats
+  dimension: cyberDigital
+  domain: infrastructure
+  weight: 0.45
+  direction: lower-better
+  source: Cyber threat feeds (mixed Western-origin)
+  seriesId: severity-weighted count (critical 3×, high 2×, medium 1×, low 0.5×)
+  seriesUrl: Internal seed
+  coveragePct: 0.70
+  lastObservedYear: 2026
+  license: Proprietary feeds (aggregated)
+  mechanismTestRationale: Severity-weighted cyber threat count directly measures ongoing cyber-attack pressure on national digital infrastructure.
+  constructStatus: observed-mechanism
+  reviewNotes: Western-feed bias; non-English cyber activity under-represented. PR 4 §4.8 tracks this.
+
+- indicator: internetOutages
+  dimension: cyberDigital
+  domain: infrastructure
+  weight: 0.35
+  direction: lower-better
+  source: Cloudflare Radar + internal monitoring
+  seriesId: Outage penalty (total 4×, major 2×, partial 1×)
+  seriesUrl: https://radar.cloudflare.com/
+  coveragePct: 0.95
+  lastObservedYear: 2026
+  license: CC-BY-4.0
+  mechanismTestRationale: Internet outages directly measure digital-infrastructure availability under current stress.
+  constructStatus: observed-mechanism
+
+- indicator: gpsJamming
+  dimension: cyberDigital
+  domain: infrastructure
+  weight: 0.20
+  direction: lower-better
+  source: GPSJam
+  seriesId: Hex penalty (high 3×, medium 1×)
+  seriesUrl: https://gpsjam.org/
+  coveragePct: 0.95
+  lastObservedYear: 2026
+  license: Open data
+  mechanismTestRationale: GPS jamming intensity measures electronic-warfare / navigation-disruption exposure.
+  constructStatus: observed-mechanism
+
+- indicator: logisticsPerformanceIndex
+  dimension: logisticsSupply
+  domain: infrastructure
+  weight: TBD
+  direction: higher-better
+  source: World Bank LPI
+  seriesId: LP.LPI.OVRL.XQ
+  seriesUrl: https://lpi.worldbank.org/
+  coveragePct: 0.85
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: Logistics Performance Index measures functional capacity to move goods, which is directly stressed during supply-chain disruptions.
+  constructStatus: observed-mechanism
+  reviewNotes: Goalpost anchors OECD-centric; PR 4 review.
+
+- indicator: infrastructureSubcomponents
+  dimension: infrastructure
+  domain: infrastructure
+  weight: TBD
+  direction: higher-better
+  source: World Bank + WEF Global Competitiveness
+  seriesId: Composite
+  seriesUrl: TBD
+  coveragePct: 0.80
+  lastObservedYear: 2024
+  license: Mixed
+  mechanismTestRationale: Physical infrastructure quality is the baseline capacity for delivering services during normal and crisis periods.
+  constructStatus: observed-mechanism
+
+# ENERGY DOMAIN (weight 0.11) --------------------------------------------
+
+- indicator: dependency
+  dimension: energy
+  domain: energy
+  weight: 0.25
+  direction: lower-better
+  source: IEA (via static seed)
+  seriesId: Energy import dependency (%)
+  seriesUrl: https://www.iea.org/data-and-statistics/data-browser
+  coveragePct: 0.50  # IEA detail covers OECD + major non-OECD
+  lastObservedYear: 2023
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: Share of energy consumption that is imported; direct supply-shock exposure.
+  constructStatus: observed-mechanism
+  reviewNotes: PR 1 §3.2 replaces with World Bank EG.IMP.CONS.ZS (better coverage) as part of importedFossilDependence composite.
+
+- indicator: gasShare
+  dimension: energy
+  domain: energy
+  weight: 0.12
+  direction: lower-better
+  source: IEA World Energy Balances via static seed
+  seriesId: Natural gas share of primary energy
+  seriesUrl: https://www.iea.org/data-and-statistics/data-browser
+  coveragePct: 0.85
+  lastObservedYear: 2023
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: CURRENT SCORER applies this as a vulnerability (lower-better) but it CONFLATES fossil-dominance with fossil-import-dependence. Domestic gas is a resilience asset, not a vulnerability.
+  constructStatus: wealth-proxy
+  reviewNotes: PR 1 §3.2 removes as standalone input; folds into importedFossilDependence under power-system framing.
+
+- indicator: coalShare
+  dimension: energy
+  domain: energy
+  weight: 0.08
+  direction: lower-better
+  source: IEA World Energy Balances via static seed
+  seriesId: Coal share of primary energy
+  seriesUrl: https://www.iea.org/data-and-statistics/data-browser
+  coveragePct: 0.85
+  lastObservedYear: 2023
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: Same concern as gasShare — penalty is climate-frame, not resilience-frame. Fails mechanism test under absolute-resilience contract.
+  constructStatus: wealth-proxy
+  reviewNotes: PR 1 §3.2 removes.
+
+- indicator: renewShare
+  dimension: energy
+  domain: energy
+  weight: 0.05
+  direction: higher-better
+  source: IEA / World Bank
+  seriesId: EG.ELC.RNEW.ZS
+  seriesUrl: https://data.worldbank.org/indicator/EG.ELC.RNEW.ZS
+  coveragePct: 0.85
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: Share of electricity from renewables; proxies low-carbon-firm-generation capacity (for hydro/geothermal) and diversity-of-supply (for wind/solar).
+  constructStatus: observed-mechanism
+  reviewNotes: PR 1 §3.3 collapses with nuclearShare (currently missing) into one lowCarbonGenerationShare indicator.
+
+- indicator: storageStress
+  dimension: energy
+  domain: energy
+  weight: 0.10
+  direction: lower-better
+  source: GIE AGSI+
+  seriesId: EU gas storage fill % (per country)
+  seriesUrl: https://agsi.gie.eu/
+  coveragePct: 0.15  # EU + UK + a handful
+  lastObservedYear: 2026
+  license: Open
+  mechanismTestRationale: EU gas storage fill directly measures winter-heating-shock buffer. European-only platform.
+  constructStatus: regional-only
+  reviewNotes: PR 1 §3.5 renames to euGasStorageStress and scopes to EU-only (weight 0 for non-EU).
+
+- indicator: exposedEnergyStress
+  dimension: energy
+  domain: energy
+  weight: 0.10
+  direction: composite
+  source: Internal composite (energy-price-stress × import-exposure)
+  seriesId: Derived
+  seriesUrl: Internal seed
+  coveragePct: 0.70
+  lastObservedYear: 2026
+  license: Internal
+  mechanismTestRationale: Combines energy-price shocks with import-exposure to measure price-shock transmission.
+  constructStatus: observed-mechanism
+  reviewNotes: PR 1 may simplify given importedFossilDependence covers import-exposure directly.
+
+- indicator: electricityConsumption
+  dimension: energy
+  domain: energy
+  weight: 0.30
+  direction: higher-better
+  source: World Bank
+  seriesId: EG.USE.ELEC.KH.PC
+  seriesUrl: https://data.worldbank.org/indicator/EG.USE.ELEC.KH.PC
+  coveragePct: 0.90
+  lastObservedYear: 2022
+  license: CC-BY-4.0
+  mechanismTestRationale: FAILS the mechanism test. Per-capita electricity consumption tracks GDP per capita; it is a level-of-load measure, not a resilience mechanism. IEA energy-security framing treats EFFICIENCY (lower load for same output) as resilience, which this indicator inversely rewards.
+  constructStatus: wealth-proxy
+  reviewNotes: PR 1 §3.1 removes. Replaced with powerLossesPct (EG.ELC.LOSS.ZS), reserveMarginPct (IEA), and accessToElectricityPct (EG.ELC.ACCS.ZS) moved to infrastructure domain.
+
+# SOCIAL-GOVERNANCE DOMAIN (weight 0.19) ---------------------------------
+
+- indicator: wgiComposite
+  dimension: governanceInstitutional
+  domain: social-governance
+  weight: 1.0
+  direction: higher-better
+  source: World Bank WGI
+  seriesId: Voice/Accountability, Political Stability, Government Effectiveness, Regulatory Quality, Rule of Law, Control of Corruption
+  seriesUrl: https://info.worldbank.org/governance/wgi/
+  coveragePct: 0.98
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: WGI subscores measure state capacity to design and enforce policy response to shocks. Passes the mechanism test conditionally — the composite is a direct policy-response-capacity signal.
+  constructStatus: observed-mechanism
+  reviewNotes: Weights review in PR 4. Individual WGI subscores may need separate weighting vs equal-blend.
+
+- indicator: gpiScore
+  dimension: socialCohesion
+  domain: social-governance
+  weight: 0.40  # approximate
+  direction: lower-better
+  source: Institute for Economics and Peace
+  seriesId: Global Peace Index
+  seriesUrl: https://www.visionofhumanity.org/
+  coveragePct: 0.75
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: GPI measures internal conflict, militarization, and external conflict intensity — direct social-cohesion-shock exposure.
+  constructStatus: observed-mechanism
+  reviewNotes: Known Western-democracy bias in GPI methodology; PR 4 review.
+
+- indicator: displacementMetric
+  dimension: socialCohesion
+  domain: social-governance
+  weight: 0.30  # approximate
+  direction: lower-better
+  source: UNHCR
+  seriesId: totalDisplaced
+  seriesUrl: https://data.unhcr.org/
+  coveragePct: 0.95
+  lastObservedYear: 2025
+  license: Open
+  mechanismTestRationale: "Total displaced persons directly measures ongoing forced-migration pressure. BIAS: currently blends origin + host; penalizes Jordan/Turkey/Germany for HOSTING."
+  constructStatus: wealth-proxy  # flagged for bias; wealth-proxy serves as the bias label here
+  reviewNotes: PR 4 §4.2 splits origin (negative signal) from host (mixed signal).
+
+- indicator: unrestMetric
+  dimension: socialCohesion
+  domain: social-governance
+  weight: 0.30  # approximate
+  direction: lower-better
+  source: Internal unrest seed (cross-source signals + UCDP)
+  seriesId: unrestCount + sqrt(fatalities)
+  seriesUrl: Internal seed
+  coveragePct: 0.85
+  lastObservedYear: 2026
+  license: Mixed
+  mechanismTestRationale: Active unrest events measure current social-cohesion stress.
+  constructStatus: observed-mechanism
+
+- indicator: borderSecuritySubs
+  dimension: borderSecurity
+  domain: social-governance
+  weight: TBD
+  direction: composite
+  source: Composite (UNHCR displacement + UCDP conflict + governance)
+  seriesId: Derived
+  seriesUrl: Internal seed
+  coveragePct: 0.80
+  lastObservedYear: 2026
+  license: Mixed
+  mechanismTestRationale: Border-security composite captures cross-border shock transmission exposure.
+  constructStatus: observed-mechanism
+  reviewNotes: Inherits displacement host-vs-sending bias from socialCohesion. PR 4 fix.
+
+- indicator: rsfPressFreedom
+  dimension: informationCognitive
+  domain: social-governance
+  weight: TBD
+  direction: higher-better
+  source: Reporters Sans Frontieres
+  seriesId: Press Freedom Index
+  seriesUrl: https://rsf.org/en/index
+  coveragePct: 0.85
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: Press freedom proxies quality of information-shock response and independent verification capacity.
+  constructStatus: observed-mechanism
+
+- indicator: languageNormalizedSocialVelocity
+  dimension: informationCognitive
+  domain: social-governance
+  weight: TBD
+  direction: composite
+  source: Reddit + cross-source + internal language-coverage-weighting
+  seriesId: Internal
+  seriesUrl: Internal seed
+  coveragePct: 0.95
+  lastObservedYear: 2026
+  license: Mixed
+  mechanismTestRationale: Language-normalized social-information velocity measures information-shock propagation speed adjusted for source-density bias.
+  constructStatus: observed-mechanism
+
+# HEALTH-FOOD DOMAIN (weight 0.13) ---------------------------------------
+
+- indicator: whoHealthExpenditure
+  dimension: healthPublicService
+  domain: health-food
+  weight: TBD
+  direction: higher-better
+  source: WHO Global Health Observatory
+  seriesId: Current health expenditure per capita, PPP
+  seriesUrl: https://www.who.int/data/gho
+  coveragePct: 0.95
+  lastObservedYear: 2022
+  license: CC-BY-4.0
+  mechanismTestRationale: Health expenditure per capita proxies health-system capacity. FAILS the strict mechanism test — it measures SPEND, not CAPACITY. Should be replaced with a surge-capacity / bed-density / ICU-density threshold signal.
+  constructStatus: wealth-proxy
+  reviewNotes: PR 4 §4.9 replacement.
+
+- indicator: ipcPhase
+  dimension: foodWater
+  domain: health-food
+  weight: 0.15
+  direction: lower-better
+  source: FAO IPC (Integrated Food Security Phase Classification)
+  seriesId: IPC Phase (1-5)
+  seriesUrl: https://www.ipcinfo.org/
+  coveragePct: 0.40  # IPC covers acutely-affected countries
+  lastObservedYear: 2025
+  license: Open
+  mechanismTestRationale: IPC phase directly measures current food-security-crisis severity.
+  constructStatus: observed-mechanism
+  reviewNotes: Coverage is inherently partial — IPC only tracks countries with current/imminent food crises. Imputed to a resilient-default for non-tracked countries.
+
+- indicator: aquastatWaterStress
+  dimension: foodWater
+  domain: health-food
+  weight: 0.25
+  direction: lower-better
+  source: FAO AQUASTAT
+  seriesId: Water stress (withdrawal / renewable resources)
+  seriesUrl: https://www.fao.org/aquastat/
+  coveragePct: 0.85
+  lastObservedYear: 2020
+  license: Open
+  mechanismTestRationale: Water stress directly measures water-supply-shock exposure.
+  constructStatus: observed-mechanism
+
+- indicator: aquastatWaterAvailability
+  dimension: foodWater
+  domain: health-food
+  weight: 0.15
+  direction: higher-better
+  source: FAO AQUASTAT
+  seriesId: Water availability (m³/capita)
+  seriesUrl: https://www.fao.org/aquastat/
+  coveragePct: 0.85
+  lastObservedYear: 2020
+  license: Open
+  mechanismTestRationale: Water availability per capita proxies baseline water-security-shock buffer.
+  constructStatus: observed-mechanism
+
+# RECOVERY DOMAIN (weight 0.25) ------------------------------------------
+
+- indicator: recoveryGovRevenue
+  dimension: fiscalSpace
+  domain: recovery
+  weight: 0.40
+  direction: higher-better
+  source: IMF
+  seriesId: GGR_G01_GDP_PT
+  seriesUrl: https://www.imf.org/external/datamapper/GGR_G01_GDP_PT@FM/
+  coveragePct: 0.90
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: Government revenue % GDP for recovery scenarios — policy-response fiscal headroom.
+  constructStatus: observed-mechanism
+  reviewNotes: Duplicate of macroFiscal.govRevenuePct. PR 4 may de-duplicate.
+
+- indicator: recoveryFiscalBalance
+  dimension: fiscalSpace
+  domain: recovery
+  weight: 0.30
+  direction: higher-better
+  source: IMF
+  seriesId: GGXCNL_G01_GDP_PT
+  seriesUrl: https://www.imf.org/external/datamapper/GGXCNL_G01_GDP_PT@FM/
+  coveragePct: 0.85
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: General government net lending/borrowing as % of GDP — direct fiscal-response-capacity signal.
+  constructStatus: observed-mechanism
+
+- indicator: recoveryDebtToGdp
+  dimension: fiscalSpace
+  domain: recovery
+  weight: 0.30
+  direction: lower-better
+  source: IMF
+  seriesId: GGXWDG_NGDP_PT
+  seriesUrl: https://www.imf.org/external/datamapper/GGXWDG_NGDP_PT@FM/
+  coveragePct: 0.90
+  lastObservedYear: 2024
+  license: Proprietary-with-fair-use
+  mechanismTestRationale: General government gross debt to GDP — fiscal-stress cushion.
+  constructStatus: observed-mechanism
+  reviewNotes: Goalpost 0-150 is too linear; Japan at 260% (mostly domestic, yen-denominated) scores 0 despite weak real fiscal-stress risk. PR 4 §4.4 adds holder-composition modifier.
+
+- indicator: recoveryReserveMonths
+  dimension: reserveAdequacy
+  domain: recovery
+  weight: 1.00
+  direction: higher-better
+  source: World Bank
+  seriesId: FI.RES.TOTL.MO
+  seriesUrl: https://data.worldbank.org/indicator/FI.RES.TOTL.MO
+  coveragePct: 0.85
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: Central-bank reserves in months of imports — immediate external-liquidity cushion.
+  constructStatus: observed-mechanism
+  reviewNotes: PR 2 §3.4 renames to liquidReserveAdequacy; new dimension sovereignFiscalBuffer added.
+
+- indicator: recoveryDebtToReserves
+  dimension: externalDebtCoverage
+  domain: recovery
+  weight: 1.00
+  direction: lower-better
+  source: World Bank
+  seriesId: DT.DOD.DSTC.CD / FI.RES.TOTL.CD
+  seriesUrl: https://data.worldbank.org/indicator/DT.DOD.DSTC.CD
+  coveragePct: 0.75
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: Short-term external debt to reserves ratio — rollover-shock exposure.
+  constructStatus: dead-signal
+  reviewNotes: Saturates at 100 for every country in the 9-country probe (goalpost 0-5 is too generous). PR 3 re-goalpost.
+
+- indicator: recoveryImportHhi
+  dimension: importConcentration
+  domain: recovery
+  weight: 1.00
+  direction: lower-better
+  source: UN Comtrade
+  seriesId: HS2 bilateral Herfindahl-Hirschman Index
+  seriesUrl: https://comtrade.un.org/
+  coveragePct: 0.70
+  lastObservedYear: 2023
+  license: Open
+  mechanismTestRationale: Import-partner concentration (HHI) — supplier-shock exposure.
+  constructStatus: observed-mechanism
+  reviewNotes: Coverage gap for UAE and small-island states; PR 1+ audit.
+
+- indicator: recoveryWgiContinuity
+  dimension: stateContinuity
+  domain: recovery
+  weight: 0.50
+  direction: higher-better
+  source: World Bank WGI
+  seriesId: Mean of WGI subscores
+  seriesUrl: https://info.worldbank.org/governance/wgi/
+  coveragePct: 0.98
+  lastObservedYear: 2023
+  license: CC-BY-4.0
+  mechanismTestRationale: WGI composite as state-continuity proxy — institutional durability through shocks.
+  constructStatus: observed-mechanism
+  reviewNotes: Duplicate of governanceInstitutional.wgiComposite. PR 4 de-duplicate.
+
+- indicator: recoveryConflictPressure
+  dimension: stateContinuity
+  domain: recovery
+  weight: 0.30
+  direction: lower-better
+  source: UCDP
+  seriesId: Armed conflict events / fatalities
+  seriesUrl: https://ucdp.uu.se/
+  coveragePct: 0.95
+  lastObservedYear: 2026
+  license: Open
+  mechanismTestRationale: UCDP conflict intensity — direct state-continuity-shock metric.
+  constructStatus: observed-mechanism
+
+- indicator: recoveryDisplacementVelocity
+  dimension: stateContinuity
+  domain: recovery
+  weight: 0.20
+  direction: lower-better
+  source: UNHCR
+  seriesId: Displacement as share of population
+  seriesUrl: https://data.unhcr.org/
+  coveragePct: 0.95
+  lastObservedYear: 2025
+  license: Open
+  mechanismTestRationale: Displacement velocity — population-scale state-continuity stress.
+  constructStatus: observed-mechanism
+  reviewNotes: Inherits host-vs-sending bias. PR 4 §4.2 fix.
+
+- indicator: recoveryFuelStockDays
+  dimension: fuelStockDays
+  domain: recovery
+  weight: 1.00
+  direction: higher-better
+  source: IEA / EIA
+  seriesId: Days of fuel stock cover
+  seriesUrl: https://www.iea.org/data-and-statistics/data-tools/oil-stocks-of-iea-countries
+  coveragePct: 0.30  # imputed for every country
+  lastObservedYear: null
+  license: Proprietary
+  mechanismTestRationale: Days of fuel stock for import-shock coverage. IEA rules bind only net importers; net exporters get no observed value.
+  constructStatus: imputed-floor
+  reviewNotes: PR 3 §3.5 retires from core score (permanent). Enrichment-only if an IEA/EIA connector is ever wired up.
+
+# PENDING ADDITIONS FOR PR 1+ --------------------------------------------
+
+# PR 1 additions (not yet in the scorer):
+# - powerLossesPct       → EG.ELC.LOSS.ZS (transmission+distribution losses, lower-better)
+# - reserveMarginPct     → IEA electricity balance (generation reserve margin, higher-better)
+# - accessToElectricityPct → EG.ELC.ACCS.ZS (threshold/saturating, moved to infrastructure domain)
+# - importedFossilDependence → EG.IMP.CONS.ZS × fossil-generation-share (Option B power-system framing)
+# - lowCarbonGenerationShare → EG.ELC.NUCL.ZS + EG.ELC.RNEW.ZS (higher-better)
+
+# PR 2 additions (not yet in the scorer):
+# - liquidReserveAdequacy  → FI.RES.TOTL.MO (rename of current reserveMonths)
+# - sovereignFiscalBuffer  → IFSWF + official disclosures × access × liquidity × transparency
+
+# PR 3 replacements (not yet in the scorer):
+# - realInterestRate       → FR.INR.RINR (replaces currencyExternal for non-BIS countries)
--- a/docs/snapshots/resilience-ranking-live-pre-repair-2026-04-22.json
+++ b/docs/snapshots/resilience-ranking-live-pre-repair-2026-04-22.json
--- a/scripts/backtest-resilience-outcomes.mjs
+++ b/scripts/backtest-resilience-outcomes.mjs
@@ -28,6 +28,15 @@ const __dirname = dirname(fileURLToPath(import.meta.url));
 const VALIDATION_DIR = join(__dirname, '..', 'docs', 'methodology', 'country-resilience-index', 'validation');

 const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
+
+// Mirror of _shared.ts#currentCacheFormula. Must stay in lockstep; see
+// the same comment in scripts/validate-resilience-correlation.mjs for
+// the rationale.
+function currentCacheFormulaLocal() {
+  const combine = (process.env.RESILIENCE_PILLAR_COMBINE_ENABLED ?? 'false').toLowerCase() === 'true';
+  const v2 = (process.env.RESILIENCE_SCHEMA_V2_ENABLED ?? 'true').toLowerCase() === 'true';
+  return combine && v2 ? 'pc' : 'd6';
+}
 const BACKTEST_RESULT_KEY = 'resilience:backtest:outcomes:v1';
 const BACKTEST_TTL_SECONDS = 7 * 24 * 60 * 60;

@@ -214,6 +223,8 @@ async function fetchAllResilienceScores(url, token) {
  const commands = ALL_COUNTRIES.map((cc) => ['GET', `${RESILIENCE_SCORE_CACHE_PREFIX}${cc}`]);
  const batchSize = 50;
  const scores = new Map();
+  const current = currentCacheFormulaLocal();
+  let staleFormulaSkipped = 0;

  for (let i = 0; i < commands.length; i += batchSize) {
    const batch = commands.slice(i, i + batchSize);
@@ -225,6 +236,12 @@ async function fetchAllResilienceScores(url, token) {
      if (typeof raw !== 'string') continue;
      try {
        const parsed = JSON.parse(raw);
+        // Cross-formula gate: mixed-formula cohorts would confound
+        // the AUC for recovery-prediction models.
+        if (parsed?._formula !== current) {
+          staleFormulaSkipped++;
+          continue;
+        }
        if (parsed?.overallScore != null) {
          scores.set(batchCodes[j], parsed.overallScore);
        }
@@ -232,6 +249,9 @@ async function fetchAllResilienceScores(url, token) {
    }
  }

+  if (staleFormulaSkipped > 0) {
+    console.warn(`[backtest-resilience-outcomes] skipped ${staleFormulaSkipped} stale-formula score entries (current=${current})`);
+  }
  return scores;
 }

--- a/scripts/benchmark-resilience-external.mjs
+++ b/scripts/benchmark-resilience-external.mjs
@@ -363,6 +363,15 @@ function median(arr) {
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
 }

+// Mirror of _shared.ts#currentCacheFormula. Must stay in lockstep; a
+// mixed-formula benchmark would produce a meaningless Spearman / Pearson
+// against INFORM / HDI / WRI reference indices.
+function currentCacheFormulaLocal() {
+  const combine = (process.env.RESILIENCE_PILLAR_COMBINE_ENABLED ?? 'false').toLowerCase() === 'true';
+  const v2 = (process.env.RESILIENCE_SCHEMA_V2_ENABLED ?? 'true').toLowerCase() === 'true';
+  return combine && v2 ? 'pc' : 'd6';
+}
+
 async function readWmScoresFromRedis() {
  const { url, token } = getRedisCredentials();
  const rankingResp = await fetch(`${url}/get/${encodeURIComponent('resilience:ranking:v10')}`, {
@@ -379,8 +388,18 @@ async function readWmScoresFromRedis() {
    return new Map();
  }
  const parsed = JSON.parse(rankingData.result);
+  // Cross-formula gate: the ranking payload carries a `_formula` tag
+  // written by get-resilience-ranking.ts#stampRankingCacheTag. If the
+  // tag disagrees with the current formula (because the flag just
+  // flipped and the ranking cron hasn't rebuilt yet), reject the
+  // ranking rather than benchmarking against a stale-formula cohort.
+  const current = currentCacheFormulaLocal();
+  if (parsed && typeof parsed === 'object' && parsed._formula !== current) {
+    console.warn(`[benchmark] Ranking _formula=${parsed._formula ?? 'undefined'} does not match current=${current} — skipping (stale-formula cache entry)`);
+    return new Map();
+  }
  // The ranking cache stores a GetResilienceRankingResponse object
-  // with { items, greyedOut }, not a bare array.
+  // with { items, greyedOut, _formula }, not a bare array.
  const ranking = Array.isArray(parsed) ? parsed : (parsed?.items ?? []);
  const scores = new Map();
  for (const item of ranking) {
@@ -388,7 +407,7 @@ async function readWmScoresFromRedis() {
      scores.set(item.countryCode, item.overallScore);
    }
  }
-  console.log(`[benchmark] Read ${scores.size} WM resilience scores from Redis`);
+  console.log(`[benchmark] Read ${scores.size} WM resilience scores from Redis (formula=${current})`);
  return scores;
 }

--- a/scripts/compare-resilience-current-vs-proposed.mjs
+++ b/scripts/compare-resilience-current-vs-proposed.mjs
--- a/scripts/validate-resilience-backtest.mjs
+++ b/scripts/validate-resilience-backtest.mjs
@@ -29,6 +29,15 @@ loadEnvFile(import.meta.url);
 // Source of truth: server/worldmonitor/resilience/v1/_shared.ts
 const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';

+// Mirror of _shared.ts#currentCacheFormula — must stay in lockstep so
+// the backtest only ingests same-formula cache entries. A mixed-formula
+// cohort would confound the recovery-prediction correlations.
+function currentCacheFormulaLocal() {
+  const combine = (process.env.RESILIENCE_PILLAR_COMBINE_ENABLED ?? 'false').toLowerCase() === 'true';
+  const v2 = (process.env.RESILIENCE_SCHEMA_V2_ENABLED ?? 'true').toLowerCase() === 'true';
+  return combine && v2 ? 'pc' : 'd6';
+}
+
 const MIN_SCORED_COUNTRIES = 5;

 let _scoreAllDimensions = null;
@@ -188,12 +197,27 @@ function pearsonCorrelation(xs, ys) {
 async function fetchScoresForCountries(url, token, countryCodes) {
  const commands = countryCodes.map((cc) => ['GET', `${RESILIENCE_SCORE_CACHE_PREFIX}${cc}`]);
  const results = await redisPipeline(url, token, commands);
+  const current = currentCacheFormulaLocal();
+  let staleFormulaSkipped = 0;

  const scores = new Map();
  for (let i = 0; i < countryCodes.length; i++) {
    const raw = results[i]?.result;
    if (typeof raw !== 'string') continue;
-    try { scores.set(countryCodes[i], JSON.parse(raw)); } catch { /* skip */ }
+    try {
+      const parsed = JSON.parse(raw);
+      // Cross-formula gate: only ingest same-formula entries. A
+      // mixed-formula cohort would produce a meaningless correlation
+      // between baseline resilience and post-shock recovery.
+      if (parsed?._formula !== current) {
+        staleFormulaSkipped++;
+        continue;
+      }
+      scores.set(countryCodes[i], parsed);
+    } catch { /* skip */ }
+  }
+  if (staleFormulaSkipped > 0) {
+    console.warn(`[validate-resilience-backtest] skipped ${staleFormulaSkipped} stale-formula entries (current=${current})`);
  }
  return scores;
 }
--- a/scripts/validate-resilience-correlation.mjs
+++ b/scripts/validate-resilience-correlation.mjs
@@ -5,6 +5,17 @@ import { loadEnvFile, getRedisCredentials } from './_seed-utils.mjs';
 // Source of truth: server/worldmonitor/resilience/v1/_shared.ts → RESILIENCE_SCORE_CACHE_PREFIX
 const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';

+// Mirror of server/worldmonitor/resilience/v1/_shared.ts#currentCacheFormula.
+// Must stay in lockstep with the server-side definition so this script
+// skips cross-formula cache entries for the same reasons the server
+// does — correlations benchmarked against a mixed-formula cohort of
+// d6 + pc entries would be meaningless.
+function currentCacheFormulaLocal() {
+  const combine = (process.env.RESILIENCE_PILLAR_COMBINE_ENABLED ?? 'false').toLowerCase() === 'true';
+  const v2 = (process.env.RESILIENCE_SCHEMA_V2_ENABLED ?? 'true').toLowerCase() === 'true';
+  return combine && v2 ? 'pc' : 'd6';
+}
+
 const REFERENCE_INDICES = {
  ndgain: {
    NO: 0.76, IS: 0.72, NZ: 0.71, DK: 0.74, SE: 0.73, FI: 0.72, CH: 0.73, AU: 0.70,
@@ -79,6 +90,8 @@ function spearmanRho(x, y) {
 async function fetchWorldMonitorScores(url, token, countryCodes) {
  const commands = countryCodes.map((c) => ['GET', `${RESILIENCE_SCORE_CACHE_PREFIX}${c}`]);
  const results = await redisPipeline(url, token, commands);
+  const current = currentCacheFormulaLocal();
+  const skipped = { staleFormula: 0, noOverallScore: 0, malformed: 0 };

  const scores = new Map();
  for (let i = 0; i < countryCodes.length; i++) {
@@ -86,10 +99,27 @@ async function fetchWorldMonitorScores(url, token, countryCodes) {
    if (typeof raw !== 'string') continue;
    try {
      const parsed = JSON.parse(raw);
+      // Cross-formula gate: the benchmark/validation scripts run off
+      // live cache entries. A mixed-formula cohort (some countries
+      // scored under d6, others under pc because their cache entries
+      // landed on either side of a flag flip) would produce a
+      // meaningless Spearman. Skip stale-formula entries so the
+      // correlation runs only against same-formula peers.
+      if (parsed?._formula !== current) {
+        skipped.staleFormula++;
+        continue;
+      }
      if (typeof parsed?.overallScore === 'number' && parsed.overallScore > 0) {
        scores.set(countryCodes[i], parsed.overallScore);
+      } else {
+        skipped.noOverallScore++;
      }
-    } catch { /* skip */ }
+    } catch {
+      skipped.malformed++;
+    }
+  }
+  if (skipped.staleFormula > 0) {
+    console.warn(`[validate-resilience-correlation] skipped ${skipped.staleFormula} stale-formula entries (current=${current})`);
  }
  return scores;
 }
--- a/server/worldmonitor/resilience/v1/_dimension-scorers.ts
+++ b/server/worldmonitor/resilience/v1/_dimension-scorers.ts
@@ -605,7 +605,7 @@ function getLatestDebtEntry(raw: unknown, countryCode: string): NationalDebtEntr
  return null;
 }

-function countTradeRestrictions(raw: unknown, countryCode: string): number {
+export function countTradeRestrictions(raw: unknown, countryCode: string): number {
  const restrictions: TradeRestriction[] = Array.isArray((raw as { restrictions?: unknown[] } | null)?.restrictions)
    ? ((raw as { restrictions?: TradeRestriction[] }).restrictions ?? [])
    : [];
@@ -617,7 +617,7 @@ function countTradeRestrictions(raw: unknown, countryCode: string): number {
  }, 0);
 }

-function countTradeBarriers(raw: unknown, countryCode: string): number {
+export function countTradeBarriers(raw: unknown, countryCode: string): number {
  const barriers: TradeBarrier[] = Array.isArray((raw as { barriers?: unknown[] } | null)?.barriers)
    ? ((raw as { barriers?: TradeBarrier[] }).barriers ?? [])
    : [];
@@ -630,7 +630,7 @@ function isInWtoReporterSet(raw: unknown, countryCode: string): boolean {
  return reporters.includes(countryCode);
 }

-function summarizeOutages(raw: unknown, countryCode: string): { total: number; major: number; partial: number } {
+export function summarizeOutages(raw: unknown, countryCode: string): { total: number; major: number; partial: number } {
  const outages: InternetOutage[] = Array.isArray((raw as { outages?: unknown[] } | null)?.outages)
    ? ((raw as { outages?: InternetOutage[] }).outages ?? [])
    : [];
@@ -648,7 +648,7 @@ function summarizeOutages(raw: unknown, countryCode: string): { total: number; m
  }, { total: 0, major: 0, partial: 0 });
 }

-function summarizeGps(raw: unknown, countryCode: string): { high: number; medium: number } {
+export function summarizeGps(raw: unknown, countryCode: string): { high: number; medium: number } {
  const hexes: GpsJamHex[] = Array.isArray((raw as { hexes?: unknown[] } | null)?.hexes)
    ? ((raw as { hexes?: GpsJamHex[] }).hexes ?? [])
    : [];
@@ -664,7 +664,7 @@ function summarizeGps(raw: unknown, countryCode: string): { high: number; medium
  }, { high: 0, medium: 0 });
 }

-function summarizeCyber(raw: unknown, countryCode: string): { weightedCount: number } {
+export function summarizeCyber(raw: unknown, countryCode: string): { weightedCount: number } {
  const threats: CyberThreat[] = Array.isArray((raw as { threats?: unknown[] } | null)?.threats)
    ? ((raw as { threats?: CyberThreat[] }).threats ?? [])
    : [];
@@ -683,7 +683,7 @@ function summarizeCyber(raw: unknown, countryCode: string): { weightedCount: num
  };
 }

-function summarizeUnrest(raw: unknown, countryCode: string): { unrestCount: number; fatalities: number } {
+export function summarizeUnrest(raw: unknown, countryCode: string): { unrestCount: number; fatalities: number } {
  const events: UnrestEvent[] = Array.isArray((raw as { events?: unknown[] } | null)?.events)
    ? ((raw as { events?: UnrestEvent[] }).events ?? [])
    : [];
@@ -697,7 +697,7 @@ function summarizeUnrest(raw: unknown, countryCode: string): { unrestCount: numb
  }, { unrestCount: 0, fatalities: 0 });
 }

-function summarizeUcdp(raw: unknown, countryCode: string): { eventCount: number; deaths: number; typeWeight: number } {
+export function summarizeUcdp(raw: unknown, countryCode: string): { eventCount: number; deaths: number; typeWeight: number } {
  const events: UcdpEvent[] = Array.isArray((raw as { events?: unknown[] } | null)?.events)
    ? ((raw as { events?: UcdpEvent[] }).events ?? [])
    : [];
@@ -711,20 +711,20 @@ function summarizeUcdp(raw: unknown, countryCode: string): { eventCount: number;
  }, { eventCount: 0, deaths: 0, typeWeight: 0 });
 }

-function getCountryDisplacement(raw: unknown, countryCode: string): CountryDisplacement | null {
+export function getCountryDisplacement(raw: unknown, countryCode: string): CountryDisplacement | null {
  const summary = (raw as { summary?: { countries?: CountryDisplacement[] } } | null)?.summary;
  const countries = Array.isArray(summary?.countries) ? summary.countries : [];
  return countries.find((entry) => matchesCountryIdentifier(entry.code, countryCode)) ?? null;
 }

-function summarizeSocialVelocity(raw: unknown, countryCode: string): number {
+export function summarizeSocialVelocity(raw: unknown, countryCode: string): number {
  const posts: SocialVelocityPost[] = Array.isArray((raw as { posts?: unknown[] } | null)?.posts)
    ? ((raw as { posts?: SocialVelocityPost[] }).posts ?? [])
    : [];
  return posts.reduce((sum, post) => sum + (matchesCountryText(post.title, countryCode) ? (safeNum(post.velocityScore) ?? 0) : 0), 0);
 }

-function getThreatSummaryScore(raw: unknown, countryCode: string): number | null {
+export function getThreatSummaryScore(raw: unknown, countryCode: string): number | null {
  if (!raw || typeof raw !== 'object') return null;
  const byCountry = (raw as Record<string, unknown>).byCountry ?? raw; // backward-compat: old payload was a flat ISO2 map
  const counts = (byCountry as Record<string, Record<string, number>>)?.[countryCode.toUpperCase()];
--- a/server/worldmonitor/supply-chain/v1/get-route-impact.ts
+++ b/server/worldmonitor/supply-chain/v1/get-route-impact.ts
@@ -24,7 +24,7 @@ import { lazyFetchBilateralHs4 } from './_bilateral-hs4-lazy';
 import { ROUTE_IMPACT_KEY } from '../../../_shared/cache-keys';
 import { CHOKEPOINT_REGISTRY } from '../../../_shared/chokepoint-registry';
 import { BYPASS_CORRIDORS_BY_CHOKEPOINT } from '../../../_shared/bypass-corridors';
-import { RESILIENCE_SCORE_CACHE_PREFIX } from '../../resilience/v1/_shared';
+import { RESILIENCE_SCORE_CACHE_PREFIX, getCurrentCacheFormula } from '../../resilience/v1/_shared';
 import COUNTRY_PORT_CLUSTERS from '../../../../scripts/shared/country-port-clusters.json';

 const CACHE_TTL_SECONDS = 86400; // 24h
@@ -129,10 +129,20 @@ function emptyResponse(_req: GetRouteImpactRequest, comtradeSource: string): Get
 async function readResilienceScore(iso2: string): Promise<number> {
  try {
    const raw = await getCachedJson(`${RESILIENCE_SCORE_CACHE_PREFIX}${iso2}`, true);
-    if (raw && typeof raw === 'object' && 'overallScore' in (raw as object)) {
-      return (raw as { overallScore: number }).overallScore;
+    if (!raw || typeof raw !== 'object' || !('overallScore' in (raw as object))) {
+      return 0;
    }
-    return 0;
+    // Cross-formula gate: score cache entries written under a different
+    // formula than the current one are stale and must not be served
+    // downstream. Returning 0 here mirrors the not-found case — the
+    // caller (computeImpact) treats 0 as "no resilience signal" and
+    // renders the lane without a resilience modifier. A fresh
+    // per-country rescoring is triggered naturally on the next call
+    // to the resilience handler, so the staleness is self-healing.
+    const tag = (raw as { _formula?: unknown })._formula;
+    const current = getCurrentCacheFormula();
+    if (tag !== current) return 0;
+    return (raw as { overallScore: number }).overallScore;
  } catch {
    return 0;
  }
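The cross-formula gate in the hunk above can be read as a pure function with three outcomes: missing entry, stale or untagged entry, and current entry. The following is a minimal sketch, not the handler's code. `gateScore`, `CachedScore`, and the formula strings `'v1'` / `'v2'` are hypothetical names used for illustration; only `overallScore` and `_formula` come from the diff.

```typescript
// Hypothetical shape of a cached score entry. Only `overallScore` and
// `_formula` appear in the diff; everything else here is illustrative.
interface CachedScore {
  overallScore: number;
  _formula?: unknown;
}

// Returns the cached score only when the entry was written under the
// formula currently in force. Any other case yields 0, which the caller
// is documented to treat as "no resilience signal".
function gateScore(raw: unknown, currentFormula: string): number {
  if (!raw || typeof raw !== 'object' || !('overallScore' in raw)) return 0;
  const entry = raw as CachedScore;
  if (entry._formula !== currentFormula) return 0; // stale or untagged entry
  return entry.overallScore;
}
```

Note that an entry with no `_formula` tag at all is rejected the same way as a mismatched one, which matches the strict `!==` comparison in the diff.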
--- a/tests/helpers/resilience-cohorts.mts
+++ b/tests/helpers/resilience-cohorts.mts
@@ -0,0 +1,133 @@
+// Cohort definitions for the resilience-scorer fairness audit.
+// Referenced by scripts/compare-resilience-current-vs-proposed.mjs and
+// tests/resilience-cohort-config.test.mts. See
+// docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md
+// §5 (PR 0) and §7 for the role these cohorts play in the acceptance gates.
+//
+// Membership is curated, not live-derived. Each cohort lists every country
+// that clearly falls into the category under widely-accepted definitions
+// (IEA + WB for exporters/importers, IAEA for nuclear-heavy, etc.). The
+// median-shift gate per cohort (§6 gate 6) is computed from these lists.
+//
+// Borderline cases are deliberately excluded: if a country only fits a
+// cohort in some years, we leave it out so the cohort median stays a
+// stable reference across PRs.
+
+export interface ResilienceCohort {
+  /** Unique id used in reports and commit messages. */
+  id: string;
+  /** Human-readable cohort name. */
+  label: string;
+  /** One-sentence definition citing the objective criterion used. */
+  definition: string;
+  /** Source or authority that grounds the definition. */
+  source: string;
+  /** ISO-3166 alpha-2 country codes in the cohort. */
+  countryCodes: readonly string[];
+}
+
+export const RESILIENCE_COHORTS: readonly ResilienceCohort[] = [
+  {
+    id: 'net-fuel-exporters',
+    label: 'Net fuel exporters',
+    definition:
+      'Countries whose net petroleum + gas exports exceed domestic consumption on a 5-year rolling average. These countries are the archetype the current scorer under-scores via gas/coal penalties and the net-import-biased fuel-stock metric.',
+    source: 'IEA World Energy Balances + UN Comtrade HS27 cross-check. List curated 2026-04-22.',
+    countryCodes: [
+      'AE', 'SA', 'QA', 'KW', 'OM', 'BH',  // Gulf
+      'NO', 'CA',                           // Wealthy democracies
+      'RU', 'IR', 'IQ',                     // Major non-aligned
+      'KZ', 'AZ', 'TM',                     // Post-Soviet
+      'VE', 'CO', 'EC',                     // South America
+      'NG', 'DZ', 'LY', 'AO',               // Africa
+      'BN',                                 // Southeast Asia
+    ],
+  },
+  {
+    id: 'net-energy-importers-oecd',
+    label: 'Net energy importers (OECD core)',
+    definition:
+      'OECD countries with EG.IMP.CONS.ZS > 20% (net energy imports as share of primary energy use). Validates that exporter-aimed fixes do not accidentally uplift these as a side effect.',
+    source: 'World Bank WDI EG.IMP.CONS.ZS, 2022 values. Curated 2026-04-22.',
+    countryCodes: [
+      'DE', 'FR', 'IT', 'ES', 'PT',  // EU core + periphery
+      'BE', 'NL', 'AT', 'CH',        // Continental Europe
+      'JP', 'KR',                    // East Asia
+      'GB', 'IE',                    // UK + Ireland
+      'GR', 'HU', 'CZ', 'SK',        // Southern + Central EU
+      'TR',                          // Bridge economy
+    ],
+  },
+  {
+    id: 'nuclear-heavy-generation',
+    label: 'Nuclear-heavy generation mix',
+    definition:
+      'Countries where nuclear supplied ≥ 15% of electricity generation in the most recent reporting year. Validates that the new lowCarbonGenerationShare indicator correctly rewards firm low-carbon generation (PR 1 §3.3).',
+    source: 'IAEA PRIS (Power Reactor Information System) + World Bank EG.ELC.NUCL.ZS. Curated 2026-04-22.',
+    countryCodes: [
+      'FR', 'SK', 'UA', 'HU', 'BE', 'BG', 'SI',  // European heavy adopters
+      'CZ', 'FI', 'SE', 'CH',                    // Other European adopters
+      'KR', 'US',                                // East Asia + North America
+      'AE',                                       // UAE (Barakah)
+      'RU',                                       // Russia
+      'AR',                                       // Argentina (small but material share)
+    ],
+  },
+  {
+    id: 'coal-heavy-domestic',
+    label: 'Coal-heavy domestic producers',
+    definition:
+      'Countries where coal supplied ≥ 30% of electricity generation AND the coal is predominantly domestic (not imported). Validates that the new importedFossilDependence composite correctly distinguishes domestic from imported coal exposure.',
+    source: 'World Bank EG.ELC.COAL.ZS + WITS/Comtrade domestic-vs-imports cross-check. Curated 2026-04-22.',
+    countryCodes: [
+      'IN', 'CN', 'ID',        // Asia heavyweights
+      'ZA', 'BW',              // Southern Africa
+      'AU', 'US',              // OECD domestic producers
+      'PL', 'RS', 'BA', 'KZ',  // Central/Eastern Europe + post-Soviet
+      'MN',                    // Mongolia
+    ],
+  },
+  {
+    id: 'small-island-importers',
+    label: 'Small-island fuel importers',
+    definition:
+      'Small-island developing states that import essentially all fossil fuels. Data coverage is thin for this cohort, so it catches fixes that depend on data these states structurally lack.',
+    source: 'UN SIDS list, subset with > 100k population. Curated 2026-04-22.',
+    countryCodes: [
+      'FJ', 'WS', 'TO', 'VU', 'SB', 'PG', 'KI', 'TV',  // Pacific
+      'MV',                                              // Indian Ocean
+      'MU', 'SC', 'CV',                                  // Africa-adjacent
+      'BB', 'TT', 'JM', 'LC', 'VC', 'GD',                // Caribbean
+    ],
+  },
+  {
+    id: 'fragile-states',
+    label: 'Fragile states (low-band anchors)',
+    definition:
+      'Countries consistently in the bottom band of multiple composite indices (Fund for Peace FSI top-10 fragile 2019-2023, UN LDC, UCDP conflict-affected). Release-gate anchors must continue to score these at or below the LOW_BAND_CEILING through every PR.',
+    source: 'Intersection of Fund for Peace FSI, UN LDC list, and UCDP conflict-event database. Curated 2026-04-22.',
+    countryCodes: [
+      'YE', 'SO', 'SD', 'SS',              // Horn + NE Africa
+      'CF', 'TD', 'NE', 'ML', 'BF', 'BI',  // Sahel + Great Lakes
+      'CD', 'ET',                          // Central/East Africa
+      'HT',                                // Caribbean
+      'SY', 'IQ', 'AF',                    // MENA
+      'MM', 'LB',                          // Asia + Levant
+    ],
+  },
+] as const;
+
+export function cohortMembershipFor(countryCode: string): readonly string[] {
+  const cc = countryCode.trim().toUpperCase();
+  return RESILIENCE_COHORTS
+    .filter((cohort) => cohort.countryCodes.includes(cc))
+    .map((cohort) => cohort.id);
+}
+
+export function unionMembership(): readonly string[] {
+  const seen = new Set<string>();
+  for (const cohort of RESILIENCE_COHORTS) {
+    for (const cc of cohort.countryCodes) seen.add(cc);
+  }
+  return [...seen];
+}
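The header comment says the per-cohort median-shift gate is computed from these membership lists. As a rough illustration of that computation, the sketch below takes two ISO2-to-score maps and returns the change in the cohort median. `median` and `cohortMedianShift` are hypothetical helpers, not functions from the compare script, and the handling of countries missing from a snapshot (skipped here) is an assumption.

```typescript
// Median of a list of numbers; null when the list is empty.
function median(values: number[]): number | null {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Median score shift (proposed minus baseline) for one cohort.
// Assumption: a country absent from either map is skipped, so both
// medians are taken over the same membership.
function cohortMedianShift(
  countryCodes: readonly string[],
  baseline: Record<string, number>,
  proposed: Record<string, number>,
): number | null {
  const shared = countryCodes.filter((cc) => cc in baseline && cc in proposed);
  const before = median(shared.map((cc) => baseline[cc]));
  const after = median(shared.map((cc) => proposed[cc]));
  return before === null || after === null ? null : after - before;
}
```

A caller would iterate `RESILIENCE_COHORTS` and pass each cohort's `countryCodes`; a `null` result signals that no member of the cohort appears in both snapshots.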
--- a/tests/helpers/resilience-matched-pairs.mts
+++ b/tests/helpers/resilience-matched-pairs.mts
@@ -0,0 +1,92 @@
+// Matched-pair sanity panel for the resilience-scorer fairness audit.
+// Referenced by scripts/compare-resilience-current-vs-proposed.mjs and
+// tests/resilience-cohort-config.test.mts. See
+// docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md
+// §7 for the role these pairs play in the acceptance gates.
+//
+// Each pair tests a specific scorer-behavior axis under pre-chosen,
+// publicly-defensible directional expectations. Acceptance gate #7
+// enforces that each pair's within-pair-gap sign stays as documented
+// across every scorer-changing PR. A pair flipping direction stops the
+// PR and forces the construct change to be re-examined.
+
+export interface MatchedPair {
+  /** Unique id used in reports. */
+  id: string;
+  /** ISO-3166 alpha-2 for the country expected to score higher. */
+  higherExpected: string;
+  /** ISO-3166 alpha-2 for the country expected to score lower. */
+  lowerExpected: string;
+  /** Scorer-behavior axis the pair tests. */
+  axis: string;
+  /**
+   * One-paragraph rationale. Documents both why the direction is
+   * defensible today AND the conditions under which it could flip.
+   * The rationale should be neutral — not a score target, but a
+   * statement about the underlying resilience mechanism.
+   */
+  rationale: string;
+  /**
+   * Minimum gap (higher - lower) required. If the gap shrinks below this
+   * after a PR's change, the sanity gate flags it as a near-flip even
+   * though the sign hasn't changed. Default 3 points.
+   */
+  minGap?: number;
+}
+
+export const MATCHED_PAIRS: readonly MatchedPair[] = [
+  {
+    id: 'fr-vs-de',
+    higherExpected: 'FR',
+    lowerExpected: 'DE',
+    axis: 'Nuclear-heavy vs non-nuclear OECD importers',
+    rationale:
+      'France (~65% nuclear) has firm low-carbon electricity generation that Germany lacks post-phase-out; both are net energy importers but France\'s shock-absorption capacity via generation-mix independence is materially higher. A scorer that loses this gap under PR 1 has mis-weighted generation-mix vs other infrastructure signals. Germany\'s stronger fiscal/export sector does not close the gap in the current scorer; it shouldn\'t close it under PR 1 either.',
+    minGap: 3,
+  },
+  {
+    id: 'no-vs-ca',
+    higherExpected: 'NO',
+    lowerExpected: 'CA',
+    axis: 'SWF-fueled fossil exporter vs non-SWF fossil exporter',
+    rationale:
+      'Norway and Canada share the net-fuel-exporter + OECD + good-governance profile. Norway\'s sovereign-wealth buffer (GPFG, $1.6T) produces a materially larger shock-absorption cushion that Canada does not have. A scorer that loses this gap under PR 2 indicates the sovereignFiscalBuffer dimension is under-weighted OR the transparency/access/liquidity haircuts are over-penalizing Norway\'s fiscal-rule-bound withdrawals.',
+    minGap: 3,
+  },
+  {
+    id: 'uae-vs-bh',
+    higherExpected: 'AE',
+    lowerExpected: 'BH',
+    axis: 'Gulf with large SWF scale vs small-scale Gulf',
+    rationale:
+      'UAE\'s SWF scale (ADIA + Mubadala + ICD ≈ $1.7T for a population of ~10M) is roughly an order of magnitude higher per capita than Bahrain\'s (Mumtalakat ≈ $20B for ~1.5M): about $170k vs $13k per head, and about 85x in absolute terms. UAE infrastructure and recovery-domain indicators dominate. A scorer that shows AE ≈ BH after PR 1+PR 2 is mis-scaling the SWF haircut transform.',
+    minGap: 5,
+  },
+  {
+    id: 'jp-vs-kr',
+    higherExpected: 'JP',
+    lowerExpected: 'KR',
+    axis: 'Nuclear-adopters with different post-Fukushima trajectories',
+    rationale:
+      'Japan is a more established, more governance-tested nuclear adopter with deeper bureaucratic institutions and a slightly stronger liquid-reserve cushion; South Korea is more dynamic but has higher concentration in semiconductor exports and lower SWF adequacy. The pair is intentionally narrow (within ~5 points expected) because both are strong OECD Asian economies. A wide gap or a direction flip under any PR indicates the scorer is over-reacting to governance-style differences or geopolitical-volatility proxies.',
+    minGap: 1,
+  },
+  {
+    id: 'in-vs-za',
+    higherExpected: 'IN',
+    lowerExpected: 'ZA',
+    axis: 'Coal-heavy domestic producers',
+    rationale:
+      'India and South Africa are both coal-heavy domestic producers with weak governance relative to OECD peers. India has materially higher macro-fiscal resilience (larger reserves, larger economy, more diversified export base, growing nuclear share) than South Africa (load-shedding crisis, weaker fiscal space). A scorer that loses this gap after PR 1 indicates the importedFossilDependence composite is over-crediting South Africa for its domestic coal without weighting its power-system-reliability collapse.',
+    minGap: 3,
+  },
+  {
+    id: 'sg-vs-ch',
+    higherExpected: 'SG',
+    lowerExpected: 'CH',
+    axis: 'Small high-infrastructure economies (SWF scale vs neutrality premium)',
+    rationale:
+      'Both are small, wealthy, governance-strong, high-infrastructure economies. Singapore\'s combined SWF (GIC + Temasek ≈ $1T) is materially larger per capita than Switzerland\'s SNB-held reserves despite similar GDP per capita. Singapore also has more explicit reserve-for-crisis access rules. Expect SG > CH by a small but real margin after PR 2. A wide gap would indicate over-crediting the SWF transform; a flipped direction would indicate the liquidReserveAdequacy dimension is picking up Switzerland\'s SNB strength disproportionately.',
+    minGap: 1,
+  },
+] as const;
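The `minGap` doc comment describes two failure modes: a sign flip, and a gap that shrinks below the minimum while keeping its sign. A minimal sketch of that classification follows. `checkPair`, `PairCheck`, and the status labels are hypothetical and are not taken from the compare script; the default of 3 points is the one stated in the `MatchedPair` doc comment. The scores in the usage note are invented.

```typescript
type PairStatus = 'ok' | 'near-flip' | 'flipped' | 'missing';

// Subset of MatchedPair needed for the gate check.
interface PairCheck {
  higherExpected: string;
  lowerExpected: string;
  minGap?: number; // defaults to 3 points per the MatchedPair doc comment
}

// Classifies one pair against a map of ISO2 -> overall score.
function checkPair(
  pair: PairCheck,
  scores: Record<string, number>,
): { gap: number | null; status: PairStatus } {
  const hi = scores[pair.higherExpected];
  const lo = scores[pair.lowerExpected];
  if (hi === undefined || lo === undefined) return { gap: null, status: 'missing' };
  const gap = hi - lo;
  if (gap < 0) return { gap, status: 'flipped' }; // expected direction reversed
  if (gap < (pair.minGap ?? 3)) return { gap, status: 'near-flip' }; // sign holds, cushion gone
  return { gap, status: 'ok' };
}
```

For example, with invented scores `{ FR: 66, DE: 65 }` the `fr-vs-de` pair keeps its sign but falls under its 3-point minimum, so it would be reported as a near-flip rather than a pass.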
--- a/tests/resilience-baseline-snapshot-ordering.test.mjs
+++ b/tests/resilience-baseline-snapshot-ordering.test.mjs
@@ -0,0 +1,124 @@
+// Contract test for the baseline-snapshot selection logic used by
+// scripts/compare-resilience-current-vs-proposed.mjs. The selector is
+// what drives acceptance gates 2 / 6 / 7 (max country drift, cohort,
+// matched-pair) for every scorer PR in the resilience repair plan.
+// A plain filename sort breaks on two axes:
+//   1. `pre-repair` sorts after `post-*` lexically (`pr...` → 'r' > 'o'),
+//      so the pre-repair freeze would keep winning forever.
+//   2. `post-pr9` sorts after `post-pr10` lexically, so PR-10 would
+//      lose to PR-9.
+// These tests pin the parsed ordering so neither failure mode silently
+// regresses.
+
+import test from 'node:test';
+import assert from 'node:assert/strict';
+
+const mod = await import('../scripts/compare-resilience-current-vs-proposed.mjs');
+const { parseBaselineSnapshotMeta } = mod;
+
+function orderedFilenames(filenames) {
+  return filenames
+    .map(parseBaselineSnapshotMeta)
+    .filter((m) => m != null)
+    .sort((a, b) => {
+      if (a.kindRank !== b.kindRank) return b.kindRank - a.kindRank;
+      if (a.prNumber !== b.prNumber) return b.prNumber - a.prNumber;
+      return b.date.localeCompare(a.date);
+    })
+    .map((m) => m.filename);
+}
+
+test('parseBaselineSnapshotMeta: pre-repair filename is recognised', () => {
+  const meta = parseBaselineSnapshotMeta('resilience-ranking-live-pre-repair-2026-04-22.json');
+  assert.ok(meta);
+  assert.equal(meta.kind, 'pre-repair');
+  assert.equal(meta.kindRank, 0);
+  assert.equal(meta.prNumber, -1);
+  assert.equal(meta.date, '2026-04-22');
+});
+
+test('parseBaselineSnapshotMeta: post-pr<N> filename parses prNumber numerically', () => {
+  const meta = parseBaselineSnapshotMeta('resilience-ranking-live-post-pr10-2026-05-01.json');
+  assert.ok(meta);
+  assert.equal(meta.kind, 'post');
+  assert.equal(meta.kindRank, 1);
+  assert.equal(meta.prNumber, 10);
+  assert.equal(meta.date, '2026-05-01');
+  assert.equal(meta.tag, 'pr10');
+});
+
+test('parseBaselineSnapshotMeta: post-<freeform-tag> falls back to prNumber 0', () => {
+  const meta = parseBaselineSnapshotMeta('resilience-ranking-live-post-handcal-2026-06-01.json');
+  assert.ok(meta);
+  assert.equal(meta.kind, 'post');
+  assert.equal(meta.prNumber, 0);
+  assert.equal(meta.tag, 'handcal');
+});
+
+test('parseBaselineSnapshotMeta: unrelated filenames return null', () => {
+  assert.equal(parseBaselineSnapshotMeta('resilience-ranking-2026-04-21.json'), null);
+  assert.equal(parseBaselineSnapshotMeta('resilience-ranking-pillar-combined-projected-2026-04-21.json'), null);
+  assert.equal(parseBaselineSnapshotMeta('README.md'), null);
+});
+
+test('selection ordering: post always beats pre-repair regardless of date', () => {
+  const ordered = orderedFilenames([
+    'resilience-ranking-live-pre-repair-2026-06-01.json',
+    'resilience-ranking-live-post-pr1-2026-05-01.json',
+  ]);
+  assert.deepEqual(ordered, [
+    'resilience-ranking-live-post-pr1-2026-05-01.json',
+    'resilience-ranking-live-pre-repair-2026-06-01.json',
+  ]);
+});
+
+test('selection ordering: pr10 beats pr9 (numeric, not lexical)', () => {
+  const ordered = orderedFilenames([
+    'resilience-ranking-live-post-pr9-2026-05-15.json',
+    'resilience-ranking-live-post-pr10-2026-06-01.json',
+    'resilience-ranking-live-post-pr2-2026-05-01.json',
+  ]);
+  assert.deepEqual(ordered, [
+    'resilience-ranking-live-post-pr10-2026-06-01.json',
+    'resilience-ranking-live-post-pr9-2026-05-15.json',
+    'resilience-ranking-live-post-pr2-2026-05-01.json',
+  ]);
+});
+
+test('selection ordering: realistic PR-0..PR-4 ladder picks the latest PR', () => {
+  const ordered = orderedFilenames([
+    'resilience-ranking-live-pre-repair-2026-04-22.json',
+    'resilience-ranking-live-post-pr1-2026-05-10.json',
+    'resilience-ranking-live-post-pr3-2026-06-02.json',
+    'resilience-ranking-live-post-pr2-2026-05-25.json',
+    'resilience-ranking-live-post-pr4-2026-06-18.json',
+  ]);
+  assert.equal(ordered[0], 'resilience-ranking-live-post-pr4-2026-06-18.json');
+  assert.equal(ordered.at(-1), 'resilience-ranking-live-pre-repair-2026-04-22.json');
+});
+
+test('selection ordering: same pr number, later date wins', () => {
+  // Edge case: a PR re-snapshotted after a hotfix. The later capture
+  // should win so "immediate prior" remains the most recent observation
+  // of that PR's landed state.
+  const ordered = orderedFilenames([
+    'resilience-ranking-live-post-pr2-2026-05-25.json',
+    'resilience-ranking-live-post-pr2-2026-05-27.json',
+  ]);
+  assert.equal(ordered[0], 'resilience-ranking-live-post-pr2-2026-05-27.json');
+});
+
+test('selection ordering: unlabeled post tag sorts between pre-repair and pr1', () => {
+  // Guards against a future misnamed snapshot sneaking in and either
+  // beating a numbered PR or losing to the original pre-repair.
+  const ordered = orderedFilenames([
+    'resilience-ranking-live-pre-repair-2026-04-22.json',
+    'resilience-ranking-live-post-handcal-2026-05-02.json',
+    'resilience-ranking-live-post-pr1-2026-05-10.json',
+  ]);
+  assert.deepEqual(ordered, [
+    'resilience-ranking-live-post-pr1-2026-05-10.json',
+    'resilience-ranking-live-post-handcal-2026-05-02.json',
+    'resilience-ranking-live-pre-repair-2026-04-22.json',
+  ]);
+});
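The tests above pin the parser's observable contract without showing the parser itself. One way to satisfy that contract is sketched below. `parseSnapshotMeta`, `SNAPSHOT_RE`, and `compareSnapshots` are hypothetical stand-ins; the real `parseBaselineSnapshotMeta` lives in the compare script and may be written differently. The field values (`kindRank` 0/1, `prNumber` -1/0/N) are the ones the tests assert, and the comparator mirrors the one in `orderedFilenames`.

```typescript
interface SnapshotMeta {
  filename: string;
  kind: 'pre-repair' | 'post';
  kindRank: number;
  prNumber: number;
  date: string;
  tag?: string;
}

// Assumed filename grammar, inferred from the test fixtures only.
const SNAPSHOT_RE =
  /^resilience-ranking-live-(?:(pre-repair)|post-([A-Za-z0-9]+))-(\d{4}-\d{2}-\d{2})\.json$/;

function parseSnapshotMeta(filename: string): SnapshotMeta | null {
  const m = SNAPSHOT_RE.exec(filename);
  if (!m) return null;
  const date = m[3];
  if (m[1]) return { filename, kind: 'pre-repair', kindRank: 0, prNumber: -1, date };
  const tag = m[2];
  const pr = /^pr(\d+)$/.exec(tag);
  // Free-form tags fall back to prNumber 0, so they sort below every pr<N>.
  return { filename, kind: 'post', kindRank: 1, prNumber: pr ? Number(pr[1]) : 0, date, tag };
}

// Newest-first ordering: kind, then numeric PR number, then date.
function compareSnapshots(a: SnapshotMeta, b: SnapshotMeta): number {
  if (a.kindRank !== b.kindRank) return b.kindRank - a.kindRank;
  if (a.prNumber !== b.prNumber) return b.prNumber - a.prNumber;
  return b.date.localeCompare(a.date);
}
```

Parsing the PR number with `Number` rather than comparing tag strings is what keeps `pr10` ahead of `pr9`, and ranking on `kind` before anything else is what keeps a late-dated pre-repair file from outranking a post snapshot.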
--- a/tests/resilience-cohort-config.test.mts
+++ b/tests/resilience-cohort-config.test.mts
@@ -0,0 +1,124 @@
+// Validates the cohort and matched-pair configuration used by the PR 0
+// diagnostic-freeze harness. These configs are load-bearing for the
+// fairness audit in docs/plans/2026-04-22-001-fix-resilience-scorer-
+// structural-bias-plan.md §7 — a silent regression in them would
+// corrupt the acceptance-gate evidence for every subsequent scorer PR.
+
+import assert from 'node:assert/strict';
+import { describe, it } from 'node:test';
+
+import { RESILIENCE_COHORTS, unionMembership } from './helpers/resilience-cohorts.mts';
+import { MATCHED_PAIRS } from './helpers/resilience-matched-pairs.mts';
+
+const ISO2_RE = /^[A-Z]{2}$/;
+
+describe('resilience cohort configuration', () => {
+  it('every cohort has at least 3 members', () => {
+    for (const cohort of RESILIENCE_COHORTS) {
+      assert.ok(
+        cohort.countryCodes.length >= 3,
+        `cohort ${cohort.id} has ${cohort.countryCodes.length} members; medians are unreliable below 3`,
+      );
+    }
+  });
+
+  it('every cohort country code is a valid ISO-3166 alpha-2', () => {
+    for (const cohort of RESILIENCE_COHORTS) {
+      for (const cc of cohort.countryCodes) {
+        assert.match(cc, ISO2_RE, `cohort ${cohort.id} has non-ISO2 code "${cc}"`);
+      }
+    }
+  });
+
+  it('no cohort has duplicate members within itself', () => {
+    for (const cohort of RESILIENCE_COHORTS) {
+      const unique = new Set(cohort.countryCodes);
+      assert.equal(
+        unique.size,
+        cohort.countryCodes.length,
+        `cohort ${cohort.id} has duplicate members: ${cohort.countryCodes.length - unique.size} duplicates`,
+      );
+    }
+  });
+
+  it('every cohort has a documented definition and source', () => {
+    for (const cohort of RESILIENCE_COHORTS) {
+      assert.ok(cohort.definition.length > 20, `cohort ${cohort.id} definition too short`);
+      assert.ok(cohort.source.length > 10, `cohort ${cohort.id} source citation too short`);
+      assert.ok(cohort.label.length > 3, `cohort ${cohort.id} label too short`);
+    }
+  });
+
+  it('cohort union covers at least 70 unique countries', () => {
+    // PR 0 §7: the union of cohort membership must span a meaningful
+    // slice of the ranking. 70 countries is roughly a third of the
+    // scorable set — sufficient for cohort-median gates to distinguish
+    // construct-change effects from noise.
+    const union = unionMembership();
+    assert.ok(
+      union.length >= 70,
+      `cohort union has ${union.length} unique countries; expected ≥ 70 for meaningful fairness audit`,
+    );
+  });
+
+  it('cohort ids are unique', () => {
+    const ids = RESILIENCE_COHORTS.map((c) => c.id);
+    const unique = new Set(ids);
+    assert.equal(unique.size, ids.length, 'cohort ids must be unique');
+  });
+});
+
+describe('resilience matched-pair configuration', () => {
+  it('every matched pair references two distinct valid ISO-2 codes', () => {
+    for (const pair of MATCHED_PAIRS) {
+      assert.match(pair.higherExpected, ISO2_RE, `pair ${pair.id} higherExpected`);
+      assert.match(pair.lowerExpected, ISO2_RE, `pair ${pair.id} lowerExpected`);
+      assert.notEqual(
+        pair.higherExpected,
+        pair.lowerExpected,
+        `pair ${pair.id} has higher === lower (${pair.higherExpected})`,
+      );
+    }
+  });
+
+  it('every matched pair has a documented axis + rationale', () => {
+    for (const pair of MATCHED_PAIRS) {
+      assert.ok(pair.axis.length > 10, `pair ${pair.id} axis too short`);
+      // Rationale must be substantive — pins the expected-direction
+      // defensibility so a reviewer can challenge the pair on its
+      // merits rather than guessing at intent.
+      assert.ok(pair.rationale.length > 100, `pair ${pair.id} rationale too short (${pair.rationale.length} chars)`);
+    }
+  });
+
+  it('every matched pair has a non-negative minimum gap', () => {
+    for (const pair of MATCHED_PAIRS) {
+      const minGap = pair.minGap ?? 3;
+      assert.ok(
+        minGap >= 0,
+        `pair ${pair.id} minGap=${minGap} must be ≥ 0`,
+      );
+      // Guard against an accidentally-enormous minGap that would make
+      // the gate trivially fail — no pair should need more than a
+      // 10-point cushion.
+      assert.ok(
+        minGap <= 10,
+        `pair ${pair.id} minGap=${minGap} suspiciously large; pairs with gaps > 10 are probably not valid sanity-check peers`,
+      );
+    }
+  });
+
+  it('pair ids are unique', () => {
+    const ids = MATCHED_PAIRS.map((p) => p.id);
+    const unique = new Set(ids);
+    assert.equal(unique.size, ids.length, 'matched-pair ids must be unique');
+  });
+
+  it('at least 4 pairs are defined to exercise the fairness audit', () => {
+    // Acceptance gate #7 in the plan requires the matched-pair sanity
+    // panel to be exercised on every scorer-changing PR. With too few
+    // pairs, the panel provides insufficient coverage across scorer
+    // behavior axes.
+    assert.ok(MATCHED_PAIRS.length >= 4, `expected ≥ 4 matched pairs, got ${MATCHED_PAIRS.length}`);
+  });
+});
--- a/tests/resilience-dimension-monotonicity.test.mts
+++ b/tests/resilience-dimension-monotonicity.test.mts
@@ -0,0 +1,259 @@
+// Monotonicity-test harness. Pins the direction of movement for the
+// highest-leverage indicators so PR 1 + PR 2 cannot accidentally flip
+// a sign silently. See
+// docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md
+// §5 (PR 0 deliverable) and §6 (acceptance gate 8).
+//
+// Each test builds two synthetic `ResilienceSeedReader` fixtures that
+// differ only in the target indicator's value and asserts the dimension
+// score moves in the documented direction.
+//
+// Scope (minimum viable, expanded in PR 0.5 follow-ups):
+//   - scoreEnergy: dependency, gasShare, coalShare, renewShare, electricityConsumption
+//     (all five direction claims the current scorer makes — PR 1 overturns three of them)
+//   - scoreReserveAdequacy: reserveMonths
+//   - scoreFiscalSpace: govRevenuePct, fiscalBalancePct, debtToGdpPct
+//   - scoreExternalDebtCoverage: debtToReservesRatio
+//   - scoreImportConcentration: hhi
+//   - scoreFoodWater: peopleInCrisis, phase
+//   - scoreGovernanceInstitutional: WGI mean
+//
+// 14 indicators × 1 direction check each = 14 assertions. The harness
+// is written as a table so PR 1 can add/remove rows without touching
+// test logic.
+
+import assert from 'node:assert/strict';
+import { describe, it } from 'node:test';
+
+import {
+  scoreEnergy,
+  scoreReserveAdequacy,
+  scoreFiscalSpace,
+  scoreExternalDebtCoverage,
+  scoreImportConcentration,
+  scoreFoodWater,
+  scoreGovernanceInstitutional,
+  type ResilienceSeedReader,
+} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';
+
+const TEST_ISO2 = 'XX';
+
+function makeStaticReader(staticRecord: unknown, overrides: Record<string, unknown> = {}): ResilienceSeedReader {
+  return async (key: string) => {
+    if (key === `resilience:static:${TEST_ISO2}`) return staticRecord;
+    if (key in overrides) return overrides[key];
+    return null;
+  };
+}
+
+function makeRecoveryReader(keyValueMap: Record<string, unknown>): ResilienceSeedReader {
+  return async (key: string) => keyValueMap[key] ?? null;
+}
+
+describe('resilience dimension monotonicity — scoreReserveAdequacy', () => {
+  it('higher reserveMonths → higher score', async () => {
+    const low = await scoreReserveAdequacy(TEST_ISO2, makeRecoveryReader({
+      'resilience:recovery:reserve-adequacy:v1': { countries: { [TEST_ISO2]: { reserveMonths: 2 } } },
+    }));
+    const high = await scoreReserveAdequacy(TEST_ISO2, makeRecoveryReader({
+      'resilience:recovery:reserve-adequacy:v1': { countries: { [TEST_ISO2]: { reserveMonths: 12 } } },
+    }));
+    assert.ok(high.score > low.score, `reserveMonths 2→12 should raise score; got ${low.score} → ${high.score}`);
+  });
+});
+
+describe('resilience dimension monotonicity — scoreFiscalSpace', () => {
+  const baseEntry = { govRevenuePct: 25, fiscalBalancePct: 0, debtToGdpPct: 60 };
+
+  async function scoreWith(override: Partial<typeof baseEntry>) {
+    return scoreFiscalSpace(TEST_ISO2, makeRecoveryReader({
+      'resilience:recovery:fiscal-space:v1': { countries: { [TEST_ISO2]: { ...baseEntry, ...override } } },
+    }));
+  }
+
+  it('higher govRevenuePct → higher score', async () => {
+    const low = await scoreWith({ govRevenuePct: 10 });
+    const high = await scoreWith({ govRevenuePct: 40 });
+    assert.ok(high.score > low.score, `govRevenuePct 10→40 should raise score; got ${low.score} → ${high.score}`);
+  });
+
+  it('higher fiscalBalancePct → higher score', async () => {
+    const low = await scoreWith({ fiscalBalancePct: -10 });
+    const high = await scoreWith({ fiscalBalancePct: 3 });
+    assert.ok(high.score > low.score, `fiscalBalancePct -10→3 should raise score; got ${low.score} → ${high.score}`);
+  });
+
+  it('higher debtToGdpPct → lower score', async () => {
+    const low = await scoreWith({ debtToGdpPct: 40 });
+    const high = await scoreWith({ debtToGdpPct: 140 });
+    assert.ok(low.score > high.score, `debtToGdpPct 40→140 should lower score; got ${low.score} → ${high.score}`);
+  });
+});
+
+describe('resilience dimension monotonicity — scoreExternalDebtCoverage', () => {
+  async function scoreWith(ratio: number) {
+    return scoreExternalDebtCoverage(TEST_ISO2, makeRecoveryReader({
+      'resilience:recovery:external-debt:v1': { countries: { [TEST_ISO2]: { debtToReservesRatio: ratio } } },
+    }));
+  }
+
+  it('higher debtToReservesRatio → lower score', async () => {
+    // NOTE: the current scorer saturates at 100 for ratio ≤ 0 (goalpost
+    // lower-better, worst=5 best=0). Picking values inside the 0-5 band
+    // to get a meaningful gradient. PR 3 §3.6 re-goalposts this.
+    const good = await scoreWith(1);
+    const bad = await scoreWith(4);
+    assert.ok(good.score > bad.score, `debtToReservesRatio 1→4 should lower score; got ${good.score} → ${bad.score}`);
+  });
+});
+
+describe('resilience dimension monotonicity — scoreImportConcentration', () => {
+  async function scoreWith(hhi: number) {
+    return scoreImportConcentration(TEST_ISO2, makeRecoveryReader({
+      'resilience:recovery:import-hhi:v1': { countries: { [TEST_ISO2]: { hhi } } },
+    }));
+  }
+
+  it('higher hhi → lower score (more concentration = more exposure)', async () => {
+    // HHI payload is on a 0..1 scale (normalised before storage).
+    // 0.15 = diversified supplier base; 0.45 = concentrated.
+    const diversified = await scoreWith(0.15);
+    const concentrated = await scoreWith(0.45);
+    assert.ok(diversified.score > concentrated.score, `hhi 0.15→0.45 should lower score; got ${diversified.score} → ${concentrated.score}`);
+  });
+});
+
+describe('resilience dimension monotonicity — scoreGovernanceInstitutional', () => {
+  async function scoreWith(wgiMeanValue: number) {
+    // Static-record shape per `getStaticWgiValues`: `wgi.indicators.<name>.value`.
+    const staticRecord = {
+      wgi: {
+        indicators: {
+          voiceAccountability:    { value: wgiMeanValue },
+          politicalStability:     { value: wgiMeanValue },
+          governmentEffectiveness:{ value: wgiMeanValue },
+          regulatoryQuality:      { value: wgiMeanValue },
+          ruleOfLaw:              { value: wgiMeanValue },
+          controlOfCorruption:    { value: wgiMeanValue },
+        },
+      },
+    };
+    return scoreGovernanceInstitutional(TEST_ISO2, makeStaticReader(staticRecord));
+  }
+
+  it('higher WGI mean → higher score', async () => {
+    const weak = await scoreWith(-1.5);
+    const strong = await scoreWith(1.5);
+    assert.ok(strong.score > weak.score, `WGI -1.5→1.5 should raise score; got ${weak.score} → ${strong.score}`);
+  });
+});
+
+describe('resilience dimension monotonicity — scoreFoodWater', () => {
+  async function scoreWith(override: Record<string, unknown>) {
+    const fao = { peopleInCrisis: 100, phase: 'Phase 1', ...override };
+    const staticRecord = { fao, aquastat: { waterStress: { value: 40 }, waterAvailability: { value: 2000 } } };
+    return scoreFoodWater(TEST_ISO2, makeStaticReader(staticRecord));
+  }
+
+  it('higher peopleInCrisis → lower score', async () => {
+    const healthy = await scoreWith({ peopleInCrisis: 1000 });
+    const crisis = await scoreWith({ peopleInCrisis: 5_000_000 });
+    assert.ok(healthy.score > crisis.score, `peopleInCrisis 1k→5M should lower score; got ${healthy.score} → ${crisis.score}`);
+  });
+
+  it('higher IPC phase → lower score', async () => {
+    const phase2 = await scoreWith({ phase: 'Phase 2' });
+    const phase5 = await scoreWith({ phase: 'Phase 5' });
+    assert.ok(phase2.score > phase5.score, `phase 2→5 should lower score; got ${phase2.score} → ${phase5.score}`);
+  });
+});
+
+describe('resilience dimension monotonicity — scoreEnergy (current construct)', () => {
+  // NOTE: these tests pin the CURRENT scorer direction for each indicator.
+  // PR 1 §3.1-3.3 overturns three of them (electricityConsumption, gasShare,
+  // coalShare) — when PR 1 ships, those tests are REPLACED by tests for
+  // the new indicators (importedFossilDependence, lowCarbonGenerationShare).
+  // The failure of one of these tests in the meantime is a signal that a
+  // PR has accidentally altered the construct; PR 1 should update this
+  // file in the same commit that changes scoreEnergy.
+
+  function makeEnergyReader(overrides: {
+    staticRecord?: unknown;
+    mix?: unknown;
+    prices?: unknown;
+    storage?: unknown;
+  } = {}): ResilienceSeedReader {
+    const defaultStatic = {
+      iea: { energyImportDependency: { value: 30 } },
+      infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 3000 } } },
+    };
+    const defaultMix = { gasShare: 30, coalShare: 20, renewShare: 30 };
+    return async (key: string) => {
+      if (key === `resilience:static:${TEST_ISO2}`) return overrides.staticRecord ?? defaultStatic;
+      if (key === 'economic:energy:v1:all') return overrides.prices ?? null;
+      if (key === `energy:mix:v1:${TEST_ISO2}`) return overrides.mix ?? defaultMix;
+      if (key === `energy:gas-storage:v1:${TEST_ISO2}`) return overrides.storage ?? null;
+      return null;
+    };
+  }
+
+  it('higher import dependency → lower score', async () => {
+    const selfSufficient = await scoreEnergy(TEST_ISO2, makeEnergyReader({
+      staticRecord: {
+        iea: { energyImportDependency: { value: 10 } },
+        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 3000 } } },
+      },
+    }));
+    const dependent = await scoreEnergy(TEST_ISO2, makeEnergyReader({
+      staticRecord: {
+        iea: { energyImportDependency: { value: 90 } },
+        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 3000 } } },
+      },
+    }));
+    assert.ok(selfSufficient.score > dependent.score, `import dep 10→90 should lower score; got ${selfSufficient.score} → ${dependent.score}`);
+  });
+
+  it('higher renewShare → higher score', async () => {
+    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 20, renewShare: 5 } }));
+    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 20, renewShare: 70 } }));
+    assert.ok(high.score > low.score, `renewShare 5→70 should raise score; got ${low.score} → ${high.score}`);
+  });
+
+  it('CURRENT: higher gasShare → lower score (THIS CHANGES IN PR 1 — see plan §3.2)', async () => {
+    // Pins the current (v3-plan-condemned) behavior so PR 1 knows what
+    // it is replacing. When PR 1 ships the new importedFossilDependence
+    // composite, this test is REPLACED, not deleted — the replacement
+    // pins the new construct's direction.
+    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 10, coalShare: 20, renewShare: 30 } }));
+    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 70, coalShare: 20, renewShare: 30 } }));
+    assert.ok(low.score > high.score, `gasShare 10→70 should lower score under current construct; got ${low.score} → ${high.score}`);
+  });
+
+  it('CURRENT: higher coalShare → lower score (THIS CHANGES IN PR 1 — see plan §3.2)', async () => {
+    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 10, renewShare: 30 } }));
+    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 70, renewShare: 30 } }));
+    assert.ok(low.score > high.score, `coalShare 10→70 should lower score under current construct; got ${low.score} → ${high.score}`);
+  });
+
+  it('CURRENT: higher electricityConsumption → higher score (THIS FAILS THE MECHANISM TEST — see plan §3.1)', async () => {
+    // This test PASSES today because the current scorer rewards
+    // per-capita electricity consumption. The v3 plan classifies
+    // electricityConsumption as a wealth-proxy that fails the mechanism
+    // test; PR 1 removes it. When PR 1 ships, this test is DELETED (not
+    // replaced), because the indicator no longer exists. The deletion is
+    // the signal that the wealth-proxy concern is resolved.
+    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({
+      staticRecord: {
+        iea: { energyImportDependency: { value: 30 } },
+        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 500 } } },
+      },
+    }));
+    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({
+      staticRecord: {
+        iea: { energyImportDependency: { value: 30 } },
+        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 7500 } } },
+      },
+    }));
+    assert.ok(high.score > low.score, `electricityConsumption 500→7500 kWh/cap should raise score under current construct; got ${low.score} → ${high.score}`);
+  });
+});
--- a/tests/resilience-indicator-extraction-plan.test.mjs
+++ b/tests/resilience-indicator-extraction-plan.test.mjs
@@ -0,0 +1,235 @@
+// Contract test for the registry-driven per-indicator extraction plan
+// used by scripts/compare-resilience-current-vs-proposed.mjs. Pins two
+// acceptance-apparatus invariants:
+//
+//   1. Every indicator in INDICATOR_REGISTRY has a corresponding
+//      EXTRACTION_RULES row (implemented OR not-implemented with a
+//      reason). No silent omissions.
+//   2. All repair-plan construct-risk indicators (energy mix +
+//      electricity consumption + energy import dependency + WGI
+//      sub-pillars + recovery fiscal indicators) are 'implemented'
+//      in the harness, so PR 1 / PR 3 / PR 4 can measure
+//      pre-vs-post effective-influence against their baselines.
+
+import test from 'node:test';
+import assert from 'node:assert/strict';
+
+const scriptMod = await import('../scripts/compare-resilience-current-vs-proposed.mjs');
+const registryMod = await import('../server/worldmonitor/resilience/v1/_indicator-registry.ts');
+
+const { buildIndicatorExtractionPlan, applyExtractionRule, EXTRACTION_RULES } = scriptMod;
+const { INDICATOR_REGISTRY } = registryMod;
+
+test('every INDICATOR_REGISTRY entry has an EXTRACTION_RULES row', () => {
+  const missing = INDICATOR_REGISTRY.filter((spec) => !(spec.id in EXTRACTION_RULES));
+  assert.deepEqual(
+    missing.map((s) => s.id),
+    [],
+    'new indicator(s) added to INDICATOR_REGISTRY without adding an EXTRACTION_RULES entry; ' +
+      'add an extractor or an explicit { type: "not-implemented", reason }',
+  );
+});
+
+test('extraction plan row exists for every registry entry', () => {
+  const plan = buildIndicatorExtractionPlan(INDICATOR_REGISTRY);
+  assert.equal(plan.length, INDICATOR_REGISTRY.length);
+  for (const entry of plan) {
+    assert.ok(['implemented', 'not-implemented', 'unregistered-in-harness'].includes(entry.extractionStatus));
+  }
+});
+
+test('"not-implemented" rows carry a reason string', () => {
+  const plan = buildIndicatorExtractionPlan(INDICATOR_REGISTRY);
+  for (const entry of plan) {
+    if (entry.extractionStatus === 'not-implemented') {
+      assert.ok(
+        typeof entry.reason === 'string' && entry.reason.length > 0,
+        `indicator ${entry.indicator} marked not-implemented but has no reason`,
+      );
+    }
+  }
+});
+
+test('all construct-risk indicators flagged by the repair plan are implemented', () => {
+  // The repair plan §3.1–§3.2, §4.3, §4.4 specifically names these
+  // indicators as the ones whose effective influence must be
+  // measurable pre- and post-change. If any becomes 'not-implemented',
+  // the acceptance apparatus for that PR evaporates. IDs match
+  // INDICATOR_REGISTRY exactly — the registry renames macroFiscal
+  // fiscal-space sub-indicators with a `recovery*` prefix when they
+  // live in the fiscalSpace dimension.
+  const mustBeImplemented = [
+    'gasShare',
+    'coalShare',
+    'renewShare',
+    'electricityConsumption',
+    'energyImportDependency',
+    'govRevenuePct',
+    'recoveryGovRevenue',
+    'recoveryFiscalBalance',
+    'recoveryDebtToGdp',
+    'recoveryReserveMonths',
+    'recoveryDebtToReserves',
+    'recoveryImportHhi',
+  ];
+  const plan = buildIndicatorExtractionPlan(INDICATOR_REGISTRY);
+  const byId = Object.fromEntries(plan.map((p) => [p.indicator, p]));
+  for (const id of mustBeImplemented) {
+    assert.ok(byId[id], `construct-risk indicator ${id} is not in the extraction plan`);
+    assert.equal(
+      byId[id].extractionStatus,
+      'implemented',
+      `construct-risk indicator ${id} must be extractable; got "${byId[id].extractionStatus}": ${byId[id].reason ?? ''}`,
+    );
+  }
+});
+
+test('core-tier indicator coverage meets a minimum floor', () => {
+  // Drives the extractionCoverage summary in the output. Floor raised
+  // after wiring the exported scorer-aggregate helpers (summarizeCyber,
+  // summarizeOutages, summarizeGps, summarizeUcdp, summarizeUnrest,
+  // getThreatSummaryScore, getCountryDisplacement, countTradeRestrictions,
+  // countTradeBarriers). The only Core-tier indicators still unextracted
+  // are those whose scorer inputs are genuinely global scalars
+  // (shippingStress, transitDisruption, energyPriceStress) or require
+  // unexported time-series helpers (fxVolatility, fxDeviation,
+  // aquastatWaterAvailability, householdDebtService, etc.).
+  const plan = buildIndicatorExtractionPlan(INDICATOR_REGISTRY);
+  const coreTotal = plan.filter((p) => p.tier === 'core').length;
+  const coreImplemented = plan.filter((p) => p.tier === 'core' && p.extractionStatus === 'implemented').length;
+  assert.ok(
+    coreImplemented / coreTotal >= 0.80,
+    `core-tier extraction coverage fell below 80%: ${coreImplemented}/${coreTotal}`,
+  );
+});
+
+test('the three "no per-country variance" indicators stay not-implemented with correct reason', () => {
+  // shippingStress / transitDisruption / energyPriceStress are
+  // scorer-level GLOBAL scalars — Pearson(global, overall) is 0 or
+  // NaN by construction. They must NOT be marked implemented: any
+  // future implementation that appears to extract them is wrong
+  // unless it re-expresses them as per-country effective contribution.
+  const plan = buildIndicatorExtractionPlan(INDICATOR_REGISTRY);
+  const byId = Object.fromEntries(plan.map((p) => [p.indicator, p]));
+  for (const id of ['shippingStress', 'transitDisruption', 'energyPriceStress']) {
+    assert.equal(byId[id]?.extractionStatus, 'not-implemented', `${id} should stay not-implemented (no per-country variance)`);
+    assert.match(byId[id].reason, /no per-country variance|global/i);
+  }
+});
+
+test('applyExtractionRule — static-path navigates nested object fields', () => {
+  const rule = { type: 'static-path', path: ['iea', 'energyImportDependency', 'value'] };
+  const sources = { staticRecord: { iea: { energyImportDependency: { value: 42 } } } };
+  assert.equal(applyExtractionRule(rule, sources, 'AE'), 42);
+});
+
+test('applyExtractionRule — recovery-country-field uses .countries[iso2].<field>', () => {
+  const rule = { type: 'recovery-country-field', key: 'resilience:recovery:fiscal-space:v1', field: 'govRevenuePct' };
+  const sources = { fiscalSpace: { countries: { AE: { govRevenuePct: 30 } } } };
+  assert.equal(applyExtractionRule(rule, sources, 'AE'), 30);
+});
+
+test('applyExtractionRule — static-wgi reads .wgi.indicators[code].value', () => {
+  // WGI keys are World-Bank standard codes (VA.EST, PV.EST, etc.)
+  const rule = { type: 'static-wgi', code: 'RL.EST' };
+  const sources = { staticRecord: { wgi: { indicators: { 'RL.EST': { value: 1.2 } } } } };
+  assert.equal(applyExtractionRule(rule, sources, 'DE'), 1.2);
+});
+
+test('applyExtractionRule — static-wgi-mean averages all six WGI sub-pillars', () => {
+  const rule = { type: 'static-wgi-mean' };
+  const sources = { staticRecord: { wgi: { indicators: {
+    'VA.EST': { value: 1.0 },
+    'PV.EST': { value: -1.0 },
+    'GE.EST': { value: 0.5 },
+    'RQ.EST': { value: -0.5 },
+    'RL.EST': { value: 2.0 },
+    'CC.EST': { value: 0.0 },
+  } } } };
+  assert.equal(applyExtractionRule(rule, sources, 'DE'), (1.0 + -1.0 + 0.5 + -0.5 + 2.0 + 0.0) / 6);
+});
+
+test('applyExtractionRule — missing values return null (pairwise-drop contract)', () => {
+  const rule = { type: 'static-path', path: ['iea', 'energyImportDependency', 'value'] };
+  assert.equal(applyExtractionRule(rule, {}, 'AE'), null);
+  assert.equal(applyExtractionRule(rule, { staticRecord: null }, 'AE'), null);
+  assert.equal(applyExtractionRule(rule, { staticRecord: { iea: null } }, 'AE'), null);
+});
+
+test('applyExtractionRule — not-implemented rules short-circuit to null', () => {
+  const rule = { type: 'not-implemented', reason: 'test' };
+  assert.equal(applyExtractionRule(rule, {}, 'AE'), null);
+});
+
+test('applyExtractionRule — summarize-cyber wires through exported scorer helper', () => {
+  const rule = { type: 'summarize-cyber' };
+  const cyber = { threats: [{ country: 'AE', severity: 'CRITICALITY_LEVEL_CRITICAL' }] };
+  // Pass a stub helper to prove the rule dispatches through it.
+  const helpers = {
+    summarizeCyber: (raw, cc) => ({
+      weightedCount: raw.threats.filter((t) => t.country === cc).length * 3,
+    }),
+  };
+  assert.equal(applyExtractionRule(rule, { cyber }, 'AE', helpers), 3);
+  // Without the helper available, rule falls back to null.
+  assert.equal(applyExtractionRule(rule, { cyber }, 'AE', {}), null);
+});
+
+test('applyExtractionRule — summarize-outages-penalty computes 4/2/1 weighting', () => {
+  const rule = { type: 'summarize-outages-penalty' };
+  const outages = { outages: [] };
+  const helpers = {
+    summarizeOutages: () => ({ total: 1, major: 2, partial: 3 }),
+  };
+  // penalty = total*4 + major*2 + partial*1 = 1*4 + 2*2 + 3*1 = 11
+  assert.equal(applyExtractionRule(rule, { outages }, 'AE', helpers), 11);
+});
+
+test('applyExtractionRule — displacement-field reads per-country entry by field name', () => {
+  const rule = { type: 'displacement-field', field: 'totalDisplaced' };
+  const displacement = {};
+  const helpers = {
+    getCountryDisplacement: () => ({ totalDisplaced: 12345, hostTotal: 678 }),
+  };
+  assert.equal(applyExtractionRule(rule, { displacement }, 'SY', helpers), 12345);
+});
+
+test('applyExtractionRule — count-trade-restrictions uses scorer-exported counter', () => {
+  const rule = { type: 'count-trade-restrictions' };
+  const tradeRestrictions = { restrictions: [] };
+  const helpers = { countTradeRestrictions: () => 5 };
+  assert.equal(applyExtractionRule(rule, { tradeRestrictions }, 'AE', helpers), 5);
+  // Zero coerces to null (pairwise-drop contract for empty signals).
+  assert.equal(applyExtractionRule(rule, { tradeRestrictions }, 'AE', { countTradeRestrictions: () => 0 }), null);
+});
+
+test('applyExtractionRule — aquastat stress vs availability gated by indicator tag', () => {
+  // Mirror scoreAquastatValue in _dimension-scorers.ts: both indicators
+  // share .aquastat.value, but the .aquastat.indicator tag classifies
+  // which family the reading belongs to. A stress-family country must
+  // NOT contribute a reading to the availability extractor, and vice
+  // versa, otherwise the Pearson correlation mixes two different
+  // construct scales.
+  const stressRule = { type: 'static-aquastat-stress' };
+  const availabilityRule = { type: 'static-aquastat-availability' };
+
+  const stressCountry = { staticRecord: { aquastat: { value: 42, indicator: 'Water stress (withdrawal/availability)' } } };
+  const availabilityCountry = { staticRecord: { aquastat: { value: 1500, indicator: 'Renewable water availability per capita' } } };
+  const unknownCountry = { staticRecord: { aquastat: { value: 99, indicator: 'Some unrecognised tag' } } };
+  const missingCountry = { staticRecord: { aquastat: { value: null, indicator: 'stress' } } };
+
+  // Stress-tagged country: only the stress extractor returns the value.
+  assert.equal(applyExtractionRule(stressRule, stressCountry, 'AE'), 42);
+  assert.equal(applyExtractionRule(availabilityRule, stressCountry, 'AE'), null);
+
+  // Availability-tagged country: only the availability extractor returns the value.
+  assert.equal(applyExtractionRule(availabilityRule, availabilityCountry, 'DE'), 1500);
+  assert.equal(applyExtractionRule(stressRule, availabilityCountry, 'DE'), null);
+
+  // Unknown tag: neither extractor returns a value (pairwise-drop).
+  assert.equal(applyExtractionRule(stressRule, unknownCountry, 'XX'), null);
+  assert.equal(applyExtractionRule(availabilityRule, unknownCountry, 'XX'), null);
+
+  // Missing value: null regardless of tag.
+  assert.equal(applyExtractionRule(stressRule, missingCountry, 'XX'), null);
+});