feat(resilience): flag-gated pillar-combined score activation (default off) (#3267)

Wires the non-compensatory 3-pillar combined overall_score behind a RESILIENCE_PILLAR_COMBINE_ENABLED env flag. Default is false so this PR ships zero behavior change in production. When flipped true the top-level overall_score switches from the 6-domain weighted aggregate to penalizedPillarScore(pillars) with alpha 0.5 and pillar weights 0.40 / 0.35 / 0.25.

Evidence from docs/snapshots/resilience-pillar-sensitivity-2026-04-21:
- Spearman rank correlation current vs proposed 0.9935
- Mean score delta -13.44 points (every country drops, penalty is always at most 1)
- Max top-50 rank swing 6 positions (Russia)
- No ceiling or floor effects under plus/minus 20pct perturbation
- Release gate PASS 0/19

Code change in server/worldmonitor/resilience/v1/_shared.ts:
- New isPillarCombineEnabled() reads env dynamically so tests can flip state without reloading the module
- overallScore branches on (isPillarCombineEnabled() AND RESILIENCE_SCHEMA_V2_ENABLED AND pillars.length > 0); otherwise falls through to the 6-domain aggregate (unchanged default path)
- RESILIENCE_SCORE_CACHE_PREFIX bumped v9 to v10
- RESILIENCE_RANKING_CACHE_KEY bumped v9 to v10

Cache invalidation: the version bump forces both per-country score cache and ranking cache to recompute from the current code path on first read after a flag flip. Without the bump, 6-domain values cached under the flag-off path would continue to serve for up to 6-12 hours after the flip, producing a ragged mix of formulas.

Ripple of v9 to v10:
- api/health.js registry entry
- scripts/seed-resilience-scores.mjs (both keys)
- scripts/validate-resilience-correlation.mjs, scripts/backtest-resilience-outcomes.mjs, scripts/validate-resilience-backtest.mjs, scripts/benchmark-resilience-external.mjs
- tests/resilience-ranking.test.mts 24 fixture usages
- tests/resilience-handlers.test.mts
- tests/resilience-scores-seed.test.mjs explicit pin
- tests/resilience-pillar-aggregation.test.mts explicit pin
- docs/methodology/country-resilience-index.mdx

New tests/resilience-pillar-combine-activation.test.mts: 7 assertions exercising the flag-on path against the release fixtures with re-anchored bands (NO at least 60, YE/SO at most 40, NO greater than US preserved, elite greater than fragile). Regression guard verifies flipping the flag back off restores the 6-domain aggregate.

tests/resilience-ranking-snapshot.test.mts: band thresholds now resolve from a METHODOLOGY_BANDS table keyed on snapshot.methodologyFormula. Backward compatible (missing formula defaults to domain-weighted-6d bands).

Snapshots:
- docs/snapshots/resilience-ranking-2026-04-21.json tagged methodologyFormula domain-weighted-6d
- docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json new: top/bottom/major-economies tables projected from the 52-country sensitivity sample. Explicitly tagged projected (NOT a full-universe live capture). When the flag is flipped in production, run scripts/freeze-resilience-ranking.mjs to capture the authoritative full-universe snapshot.

Methodology doc: Pillar-combined score activation section rewritten to describe the flag-gated mechanism (activation is an env-var flip, no code deploy) and the rollback path.

Verification: npm run typecheck:all clean, 397/397 resilience tests pass (up from 390, +7 activation tests).

Activation plan:
1. Merge this PR with flag default false (zero behavior change)
2. Set RESILIENCE_PILLAR_COMBINE_ENABLED=true in Vercel and Railway env
3. Redeploy or wait for next cold start; v9 to v10 bump forces every country to be rescored on first read
4. Run scripts/freeze-resilience-ranking.mjs against the flag-on deployment and commit the resulting snapshot
5. Ship a v2.0 methodology-change note explaining the re-anchored scale so analysts understand the universal ~13 point score drop is a scale rebase, not a country-level regression

Rollback: set RESILIENCE_PILLAR_COMBINE_ENABLED=false, flush resilience:score:v10:* and resilience:ranking:v10 keys (or wait for TTLs). The 6-domain formula stays alongside the pillar combine in _shared.ts and needs no code change to come back.
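
Illustrative sketch (not the code in _shared.ts): the flag-gated branch and the penalty formula described in this message could look roughly like the TypeScript below. The formula follows the overallScoreFormula string in the projected snapshot: weighted pillar sum multiplied by (1 - alpha * (1 - min_pillar / 100)). The Pillar type, the explicit env parameter, the exact-match parsing of 'true', and the overallScore signature are assumptions made so the example is self-contained; the commit only says the real isPillarCombineEnabled() reads the environment on every call.

```typescript
// Hypothetical shapes for illustration; the real types live in _shared.ts.
interface Pillar {
  id: string;
  score: number;  // 0-100
  weight: number; // 0.40 / 0.35 / 0.25 per the commit message
}

type Env = Record<string, string | undefined>;

const PENALTY_ALPHA = 0.5;

// Reads the flag from the env object on every call (no module-level caching),
// so a test can flip the value without reloading the module.
// Assumption: only the exact string 'true' enables the flag.
function isPillarCombineEnabled(env: Env): boolean {
  return env.RESILIENCE_PILLAR_COMBINE_ENABLED === 'true';
}

// Weighted pillar sum scaled by a penalty driven by the weakest pillar.
function penalizedPillarScore(pillars: Pillar[]): number {
  if (pillars.length === 0) return 0;
  const weighted = pillars.reduce((sum, p) => sum + p.score * p.weight, 0);
  const minPillar = Math.min(...pillars.map((p) => p.score));
  const penalty = 1 - PENALTY_ALPHA * (1 - minPillar / 100);
  return weighted * penalty;
}

// Hypothetical signature: falls through to the 6-domain aggregate unless the
// flag is on, schema v2 is enabled, and pillars are present.
function overallScore(
  domainAggregate: number,
  pillars: Pillar[],
  env: Env,
  schemaV2Enabled: boolean = true,
): number {
  if (isPillarCombineEnabled(env) && schemaV2Enabled && pillars.length > 0) {
    return penalizedPillarScore(pillars);
  }
  return domainAggregate;
}

const round2 = (x: number): number => Math.round(x * 100) / 100;

// Yemen's pillar values from the projected snapshot in this commit.
const yemen: Pillar[] = [
  { id: 'structural-readiness', score: 39.36, weight: 0.4 },
  { id: 'live-shock-exposure', score: 38.13, weight: 0.35 },
  { id: 'recovery-capacity', score: 42.09, weight: 0.25 },
];

const flagOff = overallScore(42.51, yemen, {});
const flagOn = round2(overallScore(42.51, yemen, { RESILIENCE_PILLAR_COMBINE_ENABLED: 'true' }));
console.log(flagOff, flagOn); // 42.51 27.36
```

With the flag unset the 6-domain value (42.51) passes through unchanged; with it set, the result matches the snapshot's proposedOverallScore for Yemen (27.36).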
2026-04-25 17:14:57 +02:00 · 2026-04-22 06:52:07 +04:00
parent 502bd4472c
commit fbaf07e106
16 changed files with 1434 additions and 129 deletions
--- a/api/health.js
+++ b/api/health.js
@@ -161,7 +161,7 @@ const STANDALONE_KEYS = {
  pizzint:                  'intelligence:pizzint:seed:v1',
  resilienceStaticIndex:    'resilience:static:index:v1',
  resilienceStaticFao:      'resilience:static:fao',
-  resilienceRanking:        'resilience:ranking:v9',
+  resilienceRanking:        'resilience:ranking:v10',
  productCatalog:           'product-catalog:v2',
  energySpineCountries:     'energy:spine:v1:_countries',
  energyExposure:           'energy:exposure:v1:index',
--- a/docs/methodology/country-resilience-index.mdx
+++ b/docs/methodology/country-resilience-index.mdx
@@ -5,11 +5,16 @@ description: "Real-time resilience scoring for ~220 countries across 6 domains a

 The WorldMonitor Country Resilience Index (CRI) scores every country in the world on a 0-100 scale, combining long-run structural capacity with current operational stress to produce an actionable resilience metric. Rather than relying on static country risk ratings, the CRI updates every 6 hours from official and authoritative sources and exposes full provenance, coverage, and imputation context so analysts can see exactly *why* a score moved and how much of it is real data versus imputed.

-This document is the v1.0 reference for the live product. A planned v2.0 upgrade will rebuild the top-level shape into three pillars (structural readiness, live shock exposure, recovery capacity) with a partly non-compensatory aggregation, and ship an annual Reference Edition at citation quality. That work is tracked in a separate reference-grade upgrade plan and is not yet shipped; everything documented below describes the **current shipping behavior**.
+This document describes the **currently shipping** behavior of the index. The versioning has two independent axes:
+
+- **Response shape**: `schemaVersion: "2.0"` is the current default. Every response carries a real coverage-weighted `pillars[]` array regrouping the six domains into structural readiness / live shock exposure / recovery capacity. The legacy `schemaVersion: "1.0"` shape (pillars empty) remains available via the `RESILIENCE_SCHEMA_V2_ENABLED=false` env flag for one release cycle.
+- **Scoring formula**: the top-level `overall_score` is the six-domain weighted aggregate (the v1 compensatory formula). The v2 non-compensatory pillar-combined formula with a min-pillar penalty is defined, validated (see Pillar-combined score activation below), and wired behind the `RESILIENCE_PILLAR_COMBINE_ENABLED` flag, but its default is `false` — activation is an explicit operator action rather than a code deploy. The annual Reference Edition at citation quality is a separate Phase 3 deliverable and is not yet shipped.
+
+Everything documented below describes the **currently shipping** state: schemaVersion `"2.0"` shape, 6 domains × 19 dimensions × 3 pillars, and the 6-domain weighted `overall_score`. When an operator flips the pillar-combined flag on, the subsection on [Pillar-combined score activation](#pillar-combined-score-activation-flag-gated-default-off) documents what changes.

 ## In the dashboard

-CRI is surfaced across three places in the product, all driven from the same v1.0 score described below:
+CRI is surfaced across three places in the product, all driven from the same currently-shipping score:

 - **Resilience widget** — a standalone panel (component: `src/components/ResilienceWidget.ts`) that ranks countries by resilience score with filter and search affordances. Reach it from Cmd+K by typing *resilience*.
 - **Country Deep-Dive** — inside the per-country drill-down panel, CRI appears alongside CII (Country Instability Index) as a structural complement to the short-horizon stress signal. CII and CRI are intentionally **not interchangeable**: CII answers "how much stress is on this country right now?"; CRI answers "how well-positioned is this country to absorb and recover from shocks?"
@@ -384,9 +389,9 @@ The CRI is designed to be auditable end-to-end: given the Redis snapshot at any

 | Key | Type | TTL | Written by | Read by |
 |---|---|---|---|---|
-| `resilience:score:v9:{countryCode}` | JSON | 6 hours | `buildResilienceScore` in `server/worldmonitor/resilience/v1/_shared.ts` | `getResilienceScore` handler |
-| `resilience:ranking:v9` | JSON | 6 hours | `buildResilienceRanking`, only when all countries are scored | `getResilienceRanking` handler |
-| `resilience:history:v4:{countryCode}` | sorted set | indefinite, trimmed to 30 days | `appendHistory` during scoring | trend and `change30d` computation |
+| `resilience:score:v10:{countryCode}` | JSON | 6 hours | `buildResilienceScore` in `server/worldmonitor/resilience/v1/_shared.ts` | `getResilienceScore` handler |
+| `resilience:ranking:v10` | JSON | 6 hours | `buildResilienceRanking`, only when all countries are scored | `getResilienceRanking` handler |
+| `resilience:history:v5:{countryCode}` | sorted set | indefinite, trimmed to 30 days | `appendHistory` during scoring | trend and `change30d` computation |
 | `resilience:intervals:v1:{countryCode}` | JSON | 6 hours | `scripts/seed-resilience-intervals.mjs` | `getResilienceScore` (optional `scoreInterval` field) |
 | `seed-meta:resilience:static` | JSON | 2 hours | `scripts/seed-resilience-static.mjs` at the end of each successful seed run | scorer for `dataVersion` population, health checks |
 | `resilience:static:{countryCode}` | JSON | 400 days | `scripts/seed-resilience-static.mjs` | scorer for all baseline signals (WGI, WHO, FAO, GPI, RSF, and so on) |
@@ -394,7 +399,7 @@ The CRI is designed to be auditable end-to-end: given the Redis snapshot at any

 ### dataVersion semantics

-The `dataVersion` field on every `GetResilienceScoreResponse` is the ISO date of the `fetchedAt` timestamp stored in `seed-meta:resilience:static`. It reflects the most recent successful run of the Railway static-seed job; the widget renders it in the footer as `Data YYYY-MM-DD`.
+The `dataVersion` field on every `GetResilienceScoreResponse` is the ISO date of the `fetchedAt` timestamp stored in `seed-meta:resilience:static`. It reflects the most recent successful run of the Railway static-seed job; the widget renders it in the footer as `Seed date YYYY-MM-DD`. The label is narrower than "Data" because live inputs (conflict events, sanctions, prices) can refresh at their own cadence after the static bundle runs — per-dimension freshness is surfaced separately via the freshness badge in the confidence grid.

 ### Reproducing a score by hand

@@ -452,7 +457,7 @@ Self-assessed against the standard composite-indicator review axes on a 0-10 sca

 ### v2.0 (April 2026) — Phase 2 structural rebuild

-**Current published version.** Phase 2 of the reference-grade upgrade plan (`docs/internal/country-resilience-upgrade-plan.md`). Rebuilds the top-level shape from five flat domains into three pillars (structural readiness, live shock exposure, recovery capacity) with a partly non-compensatory aggregation, adds a recovery capacity pillar with six new dimensions, and ships a full validation suite (cross-index benchmark, outcome backtest, sensitivity analysis).
+**Current published version** (shape). Phase 2 of the reference-grade upgrade plan (`docs/internal/country-resilience-upgrade-plan.md`). The response-shape rebuild is live: every response now carries a real coverage-weighted `pillars[]` array regrouping the six domains into structural readiness, live shock exposure, and recovery capacity. The recovery domain adds six new dimensions, and a full validation suite (cross-index benchmark, outcome backtest, sensitivity analysis) gates the activation. The top-level `overall_score` is still computed by the six-domain weighted aggregate (v1 formula); the partly non-compensatory pillar-combined `overall_score` is defined, tested, and flag-gated (see [Pillar-combined score activation](#pillar-combined-score-activation-flag-gated-default-off)), but `RESILIENCE_PILLAR_COMBINE_ENABLED` defaults to `false` so operators can schedule the flip with a proper migration message.

 - **T2.1** (#2977): Three-pillar schema added to proto and OpenAPI. `schemaVersion: "2.0"` feature flag introduced with backward-compatible `"1.0"` fallback path for one release cycle. Response now carries a `pillars` array alongside existing `domains`.
 - **T2.2a** (#2979): Signal tiering registry committed. Every indicator tagged Core, Enrichment, or Experimental with per-signal coverage percentage and license audit status. Registry enforced by CI linter.
@@ -492,7 +497,17 @@ The plan's non-compensatory pillar combine is the methodologically stronger form

 **Interpretation**: Rank order is strongly preserved on the 52-country sample (Spearman 0.9863 clears the ≥0.90 bar typically required for a rank-stable methodology change). The ranking *shape* — who is top-10, who is bottom-10, Lebanon below South Africa, Norway above the US — does not materially change. However, every country's absolute score drops on average ~11 points because the penalty factor is always ≤ 1, and imbalanced countries with one very weak pillar (Syria, Afghanistan, Venezuela, Russia) drop the most (15-19 points). Balanced top-tier countries (Switzerland, Sweden, Denmark, Iceland, Norway) drop the least (5-7 points). This is the intended behavior: the penalty punishes pillar imbalance, and pillar imbalance is strongly correlated with state fragility.

-**What this means for activation**: the rank-stability evidence supports flipping the default — there is no statistical reason to keep the legacy compensatory form. The blocker is messaging, not correctness: publishing "US = 52.65" the day after publishing "US = 65.4" without a v2.0 methodology note would look like a regression instead of a rigor upgrade. Activation is therefore scheduled as a single PR that (a) flips the default behind `RESILIENCE_PILLAR_COMBINE_ENABLED`, (b) re-anchors the release-gate bands (the current 70/35 thresholds map to roughly 60/25 in the pillar-combined scale), (c) publishes a refreshed frozen ranking snapshot, and (d) ships a methodology-change note alongside the widget. Until that PR lands, the published `overall_score` is the 6-domain weighted aggregate documented above.
+**Activation sequence**: the rank-stability evidence supports flipping the default — there is no statistical reason to keep the legacy compensatory form. The blocker is messaging: publishing "US = 54.50" the day after publishing "US = 68.26" without a methodology note would look like a regression instead of a rigor upgrade. The pillar-combine activation PR wires the following so the flip is a single env-var change with no code deploy required:
+
+1. **Feature flag**: `RESILIENCE_PILLAR_COMBINE_ENABLED`, read dynamically from `process.env` per call. Default `false`. Set to `true` in Vercel env + Railway env to activate.
+2. **Cache invalidation**: per-country score cache bumped from `resilience:score:v9:` to `resilience:score:v10:`, ranking cache bumped from `resilience:ranking:v9` to `resilience:ranking:v10`, and score-history bumped from `resilience:history:v4:` to `resilience:history:v5:`. The version bumps are a clean-slate guard; the actual cross-formula isolation is the `_formula` tag written into every cached score / ranking payload and the `:d6` / `:pc` suffix on every history sorted-set member, checked at read time so a flag flip forces a rebuild without waiting for TTLs.
+3. **Methodology-aware level thresholds**: `classifyResilienceLevel` reads `isPillarCombineEnabled()` and switches the high/medium cutoffs from 70/40 (6-domain) to 60/30 (pillar-combined). Without this, scale compression alone would demote FI (75.64 → 68.60) and NZ (76.26 → 67.93) from "high" to "medium" purely because the formula changed, not because anything about the country changed. The re-anchored cutoffs preserve the qualitative label for every country whose old label was correct.
+4. **Re-anchored release-gate bands**: `tests/resilience-pillar-combine-activation.test.mts` pins high-band anchors (NO, CH, DK) at ≥ 60 (vs the 6-domain formula's ≥ 70 floor) and low-band anchors (YE, SO) at ≤ 40 (vs ≤ 45). The snapshot test reads `methodologyFormula` from each snapshot and applies the matching bands. The live sample numbers confirm the bands hold with margin: NO proposed ≈ 71.59 (≥ 60 by 11 points), YE ≈ 27.36 (≤ 40 by 13 points).
+5. **Projected snapshot**: `docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json` carries the top/bottom/major-economies tables at the proposed formula so reviewers can preview the post-activation ranking before flipping the flag. Once the flag is on in production, run `scripts/freeze-resilience-ranking.mjs` to capture the authoritative full-universe snapshot.
+
+Rollback: set `RESILIENCE_PILLAR_COMBINE_ENABLED=false`, flush the `resilience:score:v10:*`, `resilience:ranking:v10`, and `resilience:history:v5:*` keys (or wait for TTLs to expire). The 6-domain formula lives alongside the pillar combine in `_shared.ts` and needs no code change to come back.
+
+Until operators set the flag, `overall_score` remains the 6-domain weighted aggregate documented above.

 ### Scorecard (v2.0 self-assessment)

--- a/docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json
+++ b/docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json
@@ -0,0 +1,537 @@
+{
+  "capturedAt": "2026-04-21",
+  "source": "Projected from scripts/compare-resilience-current-vs-proposed.mjs against the 52-country live-Redis sensitivity sample, regenerated after the comparison script was corrected to use the production buildPillarList aggregation (coverage-weighted across member-domain average dimension coverage). This is NOT a full live-universe capture \u2014 the pillar-combined flag is off in production, so a real 217-country ranking under the new formula does not exist yet. When activation ships, run scripts/freeze-resilience-ranking.mjs against the flag-enabled deployment to produce the authoritative capture; this file is the best available preview until then.",
+  "commitSha": "048bb8bb525393dc4a9c1998b9877c1f8cc8c011",
+  "schemaVersion": "2.0",
+  "methodologyFormula": "pillar-combined-penalized-v1",
+  "methodology": {
+    "overallScoreFormula": "penalizedPillarScore(pillars): \u03a3 pillar.score \u00d7 pillar.weight multiplied by (1 \u2212 0.5 \u00d7 (1 \u2212 min_pillar/100)). Pillar weights: structural-readiness=0.40, live-shock-exposure=0.35, recovery-capacity=0.25.",
+    "penaltyAlpha": 0.5,
+    "domainCount": 6,
+    "dimensionCount": 19,
+    "pillarCount": 3,
+    "coverageLabel": "Dimension coverage (mean of 19 per-dimension coverage values).",
+    "greyOutThreshold": 0.4,
+    "notes": [
+      "Every score is lower than the 6-domain equivalent because the penalty factor is always \u2264 1. Rank order is preserved (Spearman 0.9863 on this sample).",
+      "Sample size is 52 \u2014 the true live ranking has ~217 countries. Rank numbers here are in-sample; the true global rank for each country will likely be larger.",
+      "This snapshot informs the activation PR\u2019s release-gate re-anchoring but is NOT a substitute for the post-activation live capture."
+    ]
+  },
+  "sampleSize": 52,
+  "sampleCountries": [
+    "CH",
+    "IS",
+    "DK",
+    "NO",
+    "SE",
+    "FI",
+    "NZ",
+    "JP",
+    "DE",
+    "AU",
+    "GB",
+    "FR",
+    "ES",
+    "CA",
+    "PL",
+    "IT",
+    "KR",
+    "BR",
+    "US",
+    "MY",
+    "CN",
+    "ID",
+    "TH",
+    "PH",
+    "UA",
+    "IN",
+    "RU",
+    "VN",
+    "EG",
+    "IQ",
+    "TR",
+    "MX",
+    "ZA",
+    "BD",
+    "KE",
+    "HT",
+    "AF",
+    "PK",
+    "CF",
+    "MM",
+    "NG",
+    "ET",
+    "NE",
+    "SS",
+    "ML",
+    "TD",
+    "IR",
+    "VE",
+    "SY",
+    "YE",
+    "SO",
+    "SD"
+  ],
+  "tables": {
+    "topTenInSample": [
+      {
+        "rankInSample": 1,
+        "countryCode": "CH",
+        "countryName": "Switzerland",
+        "proposedOverallScore": 73.17,
+        "currentOverallScore": 78.78,
+        "scoreDelta": -5.61,
+        "pillars": {
+          "structuralReadiness": 82.34,
+          "liveShockExposure": 78.94,
+          "recoveryCapacity": 84.86,
+          "minPillar": 78.94
+        }
+      },
+      {
+        "rankInSample": 2,
+        "countryCode": "IS",
+        "countryName": "Iceland",
+        "proposedOverallScore": 72.76,
+        "currentOverallScore": 79.49,
+        "scoreDelta": -6.73,
+        "pillars": {
+          "structuralReadiness": 86.38,
+          "liveShockExposure": 88.09,
+          "recoveryCapacity": 73.65,
+          "minPillar": 73.65
+        }
+      },
+      {
+        "rankInSample": 3,
+        "countryCode": "DK",
+        "countryName": "Denmark",
+        "proposedOverallScore": 72.59,
+        "currentOverallScore": 78.55,
+        "scoreDelta": -5.96,
+        "pillars": {
+          "structuralReadiness": 87.81,
+          "liveShockExposure": 76.9,
+          "recoveryCapacity": 80.14,
+          "minPillar": 76.9
+        }
+      },
+      {
+        "rankInSample": 4,
+        "countryCode": "NO",
+        "countryName": "Norway",
+        "proposedOverallScore": 71.59,
+        "currentOverallScore": 79.03,
+        "scoreDelta": -7.44,
+        "pillars": {
+          "structuralReadiness": 85.85,
+          "liveShockExposure": 90.02,
+          "recoveryCapacity": 71.18,
+          "minPillar": 71.18
+        }
+      },
+      {
+        "rankInSample": 5,
+        "countryCode": "SE",
+        "countryName": "Sweden",
+        "proposedOverallScore": 70.13,
+        "currentOverallScore": 75.6,
+        "scoreDelta": -5.47,
+        "pillars": {
+          "structuralReadiness": 79.2,
+          "liveShockExposure": 81.3,
+          "recoveryCapacity": 76.79,
+          "minPillar": 76.79
+        }
+      },
+      {
+        "rankInSample": 6,
+        "countryCode": "FI",
+        "countryName": "Finland",
+        "proposedOverallScore": 68.6,
+        "currentOverallScore": 75.64,
+        "scoreDelta": -7.04,
+        "pillars": {
+          "structuralReadiness": 81.97,
+          "liveShockExposure": 78.42,
+          "recoveryCapacity": 74.17,
+          "minPillar": 74.17
+        }
+      },
+      {
+        "rankInSample": 7,
+        "countryCode": "NZ",
+        "countryName": "New Zealand",
+        "proposedOverallScore": 67.93,
+        "currentOverallScore": 76.26,
+        "scoreDelta": -8.33,
+        "pillars": {
+          "structuralReadiness": 82.9,
+          "liveShockExposure": 82.91,
+          "recoveryCapacity": 70.34,
+          "minPillar": 70.34
+        }
+      },
+      {
+        "rankInSample": 8,
+        "countryCode": "JP",
+        "countryName": "Japan",
+        "proposedOverallScore": 64.45,
+        "currentOverallScore": 73.33,
+        "scoreDelta": -8.88,
+        "pillars": {
+          "structuralReadiness": 77.74,
+          "liveShockExposure": 69.7,
+          "recoveryCapacity": 81.86,
+          "minPillar": 69.7
+        }
+      },
+      {
+        "rankInSample": 9,
+        "countryCode": "DE",
+        "countryName": "Germany",
+        "proposedOverallScore": 63.6,
+        "currentOverallScore": 72.42,
+        "scoreDelta": -8.82,
+        "pillars": {
+          "structuralReadiness": 77.74,
+          "liveShockExposure": 70.33,
+          "recoveryCapacity": 75.86,
+          "minPillar": 70.33
+        }
+      },
+      {
+        "rankInSample": 10,
+        "countryCode": "AU",
+        "countryName": "Australia",
+        "proposedOverallScore": 62.48,
+        "currentOverallScore": 73.63,
+        "scoreDelta": -11.15,
+        "pillars": {
+          "structuralReadiness": 78.77,
+          "liveShockExposure": 84.73,
+          "recoveryCapacity": 62.66,
+          "minPillar": 62.66
+        }
+      }
+    ],
+    "bottomTenInSample": [
+      {
+        "rankInSample": 43,
+        "countryCode": "NE",
+        "countryName": "Niger",
+        "proposedOverallScore": 34.11,
+        "currentOverallScore": 46.6,
+        "scoreDelta": -12.49,
+        "pillars": {
+          "structuralReadiness": 56.94,
+          "liveShockExposure": 35.95,
+          "recoveryCapacity": 59.3,
+          "minPillar": 35.95
+        }
+      },
+      {
+        "rankInSample": 44,
+        "countryCode": "SS",
+        "countryName": "South Sudan",
+        "proposedOverallScore": 34.06,
+        "currentOverallScore": 45.54,
+        "scoreDelta": -11.48,
+        "pillars": {
+          "structuralReadiness": 52.61,
+          "liveShockExposure": 40.59,
+          "recoveryCapacity": 52.82,
+          "minPillar": 40.59
+        }
+      },
+      {
+        "rankInSample": 45,
+        "countryCode": "ML",
+        "countryName": "Mali",
+        "proposedOverallScore": 33.67,
+        "currentOverallScore": 44.91,
+        "scoreDelta": -11.24,
+        "pillars": {
+          "structuralReadiness": 54.6,
+          "liveShockExposure": 38.77,
+          "recoveryCapacity": 52.47,
+          "minPillar": 38.77
+        }
+      },
+      {
+        "rankInSample": 46,
+        "countryCode": "TD",
+        "countryName": "Chad",
+        "proposedOverallScore": 32.27,
+        "currentOverallScore": 43.85,
+        "scoreDelta": -11.58,
+        "pillars": {
+          "structuralReadiness": 54.34,
+          "liveShockExposure": 35.93,
+          "recoveryCapacity": 52.68,
+          "minPillar": 35.93
+        }
+      },
+      {
+        "rankInSample": 47,
+        "countryCode": "IR",
+        "countryName": "Iran",
+        "proposedOverallScore": 31.45,
+        "currentOverallScore": 46.48,
+        "scoreDelta": -15.03,
+        "pillars": {
+          "structuralReadiness": 37.08,
+          "liveShockExposure": 58.09,
+          "recoveryCapacity": 42.86,
+          "minPillar": 37.08
+        }
+      },
+      {
+        "rankInSample": 48,
+        "countryCode": "VE",
+        "countryName": "Venezuela",
+        "proposedOverallScore": 31.18,
+        "currentOverallScore": 47.7,
+        "scoreDelta": -16.52,
+        "pillars": {
+          "structuralReadiness": 37.87,
+          "liveShockExposure": 65.59,
+          "recoveryCapacity": 33.89,
+          "minPillar": 33.89
+        }
+      },
+      {
+        "rankInSample": 49,
+        "countryCode": "SY",
+        "countryName": "Syria",
+        "proposedOverallScore": 30.55,
+        "currentOverallScore": 49.64,
+        "scoreDelta": -19.09,
+        "pillars": {
+          "structuralReadiness": 32.1,
+          "liveShockExposure": 57.79,
+          "recoveryCapacity": 52.73,
+          "minPillar": 32.1
+        }
+      },
+      {
+        "rankInSample": 50,
+        "countryCode": "YE",
+        "countryName": "Yemen",
+        "proposedOverallScore": 27.36,
+        "currentOverallScore": 42.51,
+        "scoreDelta": -15.15,
+        "pillars": {
+          "structuralReadiness": 39.36,
+          "liveShockExposure": 38.13,
+          "recoveryCapacity": 42.09,
+          "minPillar": 38.13
+        }
+      },
+      {
+        "rankInSample": 51,
+        "countryCode": "SO",
+        "countryName": "Somalia",
+        "proposedOverallScore": 26.8,
+        "currentOverallScore": 36.47,
+        "scoreDelta": -9.67,
+        "pillars": {
+          "structuralReadiness": 40.25,
+          "liveShockExposure": 35.72,
+          "recoveryCapacity": 43.56,
+          "minPillar": 35.72
+        }
+      },
+      {
+        "rankInSample": 52,
+        "countryCode": "SD",
+        "countryName": "Sudan",
+        "proposedOverallScore": 19.45,
+        "currentOverallScore": 29.69,
+        "scoreDelta": -10.24,
+        "pillars": {
+          "structuralReadiness": 31.15,
+          "liveShockExposure": 32.3,
+          "recoveryCapacity": 27.24,
+          "minPillar": 27.24
+        }
+      }
+    ],
+    "majorEconomiesInSample": [
+      {
+        "rankInSample": 8,
+        "countryCode": "JP",
+        "countryName": "Japan",
+        "proposedOverallScore": 64.45,
+        "currentOverallScore": 73.33,
+        "scoreDelta": -8.88,
+        "pillars": {
+          "structuralReadiness": 77.74,
+          "liveShockExposure": 69.7,
+          "recoveryCapacity": 81.86,
+          "minPillar": 69.7
+        }
+      },
+      {
+        "rankInSample": 9,
+        "countryCode": "DE",
+        "countryName": "Germany",
+        "proposedOverallScore": 63.6,
+        "currentOverallScore": 72.42,
+        "scoreDelta": -8.82,
+        "pillars": {
+          "structuralReadiness": 77.74,
+          "liveShockExposure": 70.33,
+          "recoveryCapacity": 75.86,
+          "minPillar": 70.33
+        }
+      },
+      {
+        "rankInSample": 10,
+        "countryCode": "AU",
+        "countryName": "Australia",
+        "proposedOverallScore": 62.48,
+        "currentOverallScore": 73.63,
+        "scoreDelta": -11.15,
+        "pillars": {
+          "structuralReadiness": 78.77,
+          "liveShockExposure": 84.73,
+          "recoveryCapacity": 62.66,
+          "minPillar": 62.66
+        }
+      },
+      {
+        "rankInSample": 11,
+        "countryCode": "GB",
+        "countryName": "United Kingdom",
+        "proposedOverallScore": 62.42,
+        "currentOverallScore": 70.1,
+        "scoreDelta": -7.68,
+        "pillars": {
+          "structuralReadiness": 73.86,
+          "liveShockExposure": 71.7,
+          "recoveryCapacity": 72.28,
+          "minPillar": 71.7
+        }
+      },
+      {
+        "rankInSample": 12,
+        "countryCode": "FR",
+        "countryName": "France",
+        "proposedOverallScore": 61.45,
+        "currentOverallScore": 70.06,
+        "scoreDelta": -8.61,
+        "pillars": {
+          "structuralReadiness": 74.96,
+          "liveShockExposure": 74.85,
+          "recoveryCapacity": 67.96,
+          "minPillar": 67.96
+        }
+      },
+      {
+        "rankInSample": 17,
+        "countryCode": "KR",
+        "countryName": "South Korea",
+        "proposedOverallScore": 60.43,
+        "currentOverallScore": 69.85,
+        "scoreDelta": -9.42,
+        "pillars": {
+          "structuralReadiness": 75.8,
+          "liveShockExposure": 66.77,
+          "recoveryCapacity": 75.14,
+          "minPillar": 66.77
+        }
+      },
+      {
+        "rankInSample": 18,
+        "countryCode": "BR",
+        "countryName": "Brazil",
+        "proposedOverallScore": 58.99,
+        "currentOverallScore": 68.34,
+        "scoreDelta": -9.35,
+        "pillars": {
+          "structuralReadiness": 68.69,
+          "liveShockExposure": 76.52,
+          "recoveryCapacity": 66.47,
+          "minPillar": 66.47
+        }
+      },
+      {
+        "rankInSample": 19,
+        "countryCode": "US",
+        "countryName": "United States",
+        "proposedOverallScore": 54.5,
+        "currentOverallScore": 68.26,
+        "scoreDelta": -13.76,
+        "pillars": {
+          "structuralReadiness": 68.55,
+          "liveShockExposure": 83.83,
+          "recoveryCapacity": 54.73,
+          "minPillar": 54.73
+        }
+      },
+      {
+        "rankInSample": 21,
+        "countryCode": "CN",
+        "countryName": "China",
+        "proposedOverallScore": 52.57,
+        "currentOverallScore": 63.73,
+        "scoreDelta": -11.16,
+        "pillars": {
+          "structuralReadiness": 58.25,
+          "liveShockExposure": 74.1,
+          "recoveryCapacity": 68.82,
+          "minPillar": 58.25
+        }
+      },
+      {
+        "rankInSample": 26,
+        "countryCode": "IN",
+        "countryName": "India",
+        "proposedOverallScore": 46.82,
+        "currentOverallScore": 59.3,
+        "scoreDelta": -12.48,
+        "pillars": {
+          "structuralReadiness": 63.51,
+          "liveShockExposure": 54.34,
+          "recoveryCapacity": 64.98,
+          "minPillar": 54.34
+        }
+      },
+      {
+        "rankInSample": 27,
+        "countryCode": "RU",
+        "countryName": "Russia",
+        "proposedOverallScore": 46.28,
+        "currentOverallScore": 61.08,
+        "scoreDelta": -14.8,
+        "pillars": {
+          "structuralReadiness": 47.95,
+          "liveShockExposure": 68.43,
+          "recoveryCapacity": 77.73,
+          "minPillar": 47.95
+        }
+      },
+      {
+        "rankInSample": 31,
+        "countryCode": "TR",
+        "countryName": "Turkey",
+        "proposedOverallScore": 43.66,
+        "currentOverallScore": 56.49,
+        "scoreDelta": -12.83,
+        "pillars": {
+          "structuralReadiness": 50.94,
+          "liveShockExposure": 59.84,
+          "recoveryCapacity": 66.14,
+          "minPillar": 50.94
+        }
+      }
+    ]
+  },
+  "totals": {
+    "rankedCountriesInSample": 52,
+    "sgInSample": false
+  },
+  "comparisonArtifactRef": "docs/snapshots/resilience-pillar-sensitivity-2026-04-21.json"
+}
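Reviewer note (worked check, not part of the patch): the projected rows above are consistent with a penalty of the form `weightedMean * (1 - alpha * (1 - minPillar / 100))`, with alpha 0.5 and weights 0.40 / 0.35 / 0.25 applied to structuralReadiness / liveShockExposure / recoveryCapacity in that order. Both the formula shape and the weight-to-pillar mapping are inferred from the snapshot numbers, not copied from the `penalizedPillarScore` source, so the sketch below is an assumption; `penalizedSketch` and `round2` are hypothetical names.

```typescript
// Hypothetical re-derivation of the projected scores. The formula is
// inferred from the snapshot rows, not taken from penalizedPillarScore.
interface PillarInput {
  score: number;  // 0-100
  weight: number; // weights are assumed to sum to 1
}

const ALPHA = 0.5; // penalty alpha stated in the PR description

function penalizedSketch(pillars: PillarInput[], alpha: number = ALPHA): number {
  if (pillars.length === 0) return 0;
  const weighted = pillars.reduce((sum, p) => sum + p.score * p.weight, 0);
  const minPillar = Math.min(...pillars.map((p) => p.score));
  // The penalty factor is at most 1 and shrinks as the weakest pillar falls.
  const penalty = 1 - alpha * (1 - minPillar / 100);
  return weighted * penalty;
}

const round2 = (n: number): number => Math.round(n * 100) / 100;

// Pillar values copied from the FR and US rows of the projected snapshot.
const france: PillarInput[] = [
  { score: 74.96, weight: 0.40 },
  { score: 74.85, weight: 0.35 },
  { score: 67.96, weight: 0.25 },
];
const unitedStates: PillarInput[] = [
  { score: 68.55, weight: 0.40 },
  { score: 83.83, weight: 0.35 },
  { score: 54.73, weight: 0.25 },
];

console.log(round2(penalizedSketch(france)));       // 61.45, matches proposedOverallScore for FR
console.log(round2(penalizedSketch(unitedStates))); // 54.5, matches proposedOverallScore for US
```

The US row shows the non-compensatory effect: a strong liveShockExposure pillar (83.83) cannot offset the weak recoveryCapacity pillar (54.73), which sets the penalty factor.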
--- a/scripts/backtest-resilience-outcomes.mjs
+++ b/scripts/backtest-resilience-outcomes.mjs
@@ -27,7 +27,7 @@ loadEnvFile(import.meta.url);
 const __dirname = dirname(fileURLToPath(import.meta.url));
 const VALIDATION_DIR = join(__dirname, '..', 'docs', 'methodology', 'country-resilience-index', 'validation');

-const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
+const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
 const BACKTEST_RESULT_KEY = 'resilience:backtest:outcomes:v1';
 const BACKTEST_TTL_SECONDS = 7 * 24 * 60 * 60;

--- a/scripts/benchmark-resilience-external.mjs
+++ b/scripts/benchmark-resilience-external.mjs
@@ -365,7 +365,7 @@ function median(arr) {

 async function readWmScoresFromRedis() {
  const { url, token } = getRedisCredentials();
-  const rankingResp = await fetch(`${url}/get/${encodeURIComponent('resilience:ranking:v9')}`, {
+  const rankingResp = await fetch(`${url}/get/${encodeURIComponent('resilience:ranking:v10')}`, {
    headers: { Authorization: `Bearer ${token}` },
    signal: AbortSignal.timeout(10_000),
  });
--- a/scripts/seed-resilience-scores.mjs
+++ b/scripts/seed-resilience-scores.mjs
@@ -19,8 +19,8 @@ const WM_KEY = process.env.WORLDMONITOR_API_KEY
  || '';
 const SEED_UA = 'Mozilla/5.0 (compatible; WorldMonitor-Seed/1.0)';

-export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
-export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v9';
+export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
+export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v10';
 // Must match the server-side RESILIENCE_RANKING_CACHE_TTL_SECONDS. Extended
 // to 12h (2x the cron interval) so a missed/slow cron can't create an
 // EMPTY_ON_DEMAND gap before the next successful rebuild.
--- a/scripts/validate-resilience-backtest.mjs
+++ b/scripts/validate-resilience-backtest.mjs
@@ -27,7 +27,7 @@ import { unwrapEnvelope } from './_seed-envelope-source.mjs';
 loadEnvFile(import.meta.url);

 // Source of truth: server/worldmonitor/resilience/v1/_shared.ts
-const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
+const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';

 const MIN_SCORED_COUNTRIES = 5;

--- a/scripts/validate-resilience-correlation.mjs
+++ b/scripts/validate-resilience-correlation.mjs
@@ -3,7 +3,7 @@
 import { loadEnvFile, getRedisCredentials } from './_seed-utils.mjs';

 // Source of truth: server/worldmonitor/resilience/v1/_shared.ts → RESILIENCE_SCORE_CACHE_PREFIX
-const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
+const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';

 const REFERENCE_INDICES = {
  ndgain: {
--- a/server/worldmonitor/resilience/v1/_shared.ts
+++ b/server/worldmonitor/resilience/v1/_shared.ts
@@ -9,7 +9,7 @@ import type {

 export type { ScoreInterval };

-import { cachedFetchJson, getCachedJson, runRedisPipeline } from '../../../_shared/redis';
+import { cachedFetchJson, getCachedJson, runRedisPipeline, setCachedJson } from '../../../_shared/redis';
 import { unwrapEnvelope } from '../../../_shared/seed-envelope';
 import { detectTrend, round } from '../../../_shared/resilience-stats';
 import {
@@ -34,18 +34,37 @@ import { buildPillarList } from './_pillar-membership';
 // back to the Phase 1 shape (`schemaVersion: "1.0"`, `pillars: []`) —
 // retained as an emergency opt-out for one release cycle.
 //
-// IMPORTANT: `overallScore` is STILL computed as the 6-domain weighted
-// aggregate (Σ domain.score * domain.weight, weights sum to 1.00) in both
-// modes. A pillar-combined score with a min-pillar penalty is defined
-// below (`penalizedPillarScore`) and exercised by
-// scripts/validate-resilience-sensitivity.mjs; the activation that
-// switches `overallScore` to the pillar combine is a separate PR.
-//
 // `baselineScore`, `stressScore`, `stressFactor`, etc. remain populated
 // in both modes for widget + map layer + Country Brief consumers.
 export const RESILIENCE_SCHEMA_V2_ENABLED =
  (process.env.RESILIENCE_SCHEMA_V2_ENABLED ?? 'true').toLowerCase() === 'true';

+// Phase 2 T2.3 activation: feature flag that switches `overallScore`
+// from the 6-domain weighted aggregate (legacy compensatory form) to
+// the 3-pillar combined form with the min-pillar penalty term defined
+// by `penalizedPillarScore` below. Default is `false` so activation is
+// an explicit operator action; the sensitivity + current-vs-proposed
+// comparison in `docs/snapshots/resilience-pillar-sensitivity-*.json`
+// is the input for that decision. When flipped to `true`:
+//   - `overallScore` = penalizedPillarScore(pillars), α=0.5 (pillar
+//     weights 0.40 / 0.35 / 0.25 per the plan).
+//   - Published numbers drop ~13 points on average across the
+//     52-country sample; Spearman vs the 6-domain ranking is 0.9935.
+//
+// Read dynamically rather than captured at module load so tests can
+// flip `process.env.RESILIENCE_PILLAR_COMBINE_ENABLED` per-case without
+// re-importing the module. Under Node production the env does not
+// change mid-process so the per-call read is a couple of instructions.
+//
+// Cache invalidation: the score cache prefix is bumped at PR deploy
+// (see RESILIENCE_SCORE_CACHE_PREFIX below), but the bump alone does
+// not isolate formulas across a later flag flip. The in-payload
+// `_formula` tag (see the `CacheFormulaTag` comment block) rejects
+// stale-formula cache hits at serve time, with no 6h TTL wait.
+export function isPillarCombineEnabled(): boolean {
+  return (process.env.RESILIENCE_PILLAR_COMBINE_ENABLED ?? 'false').toLowerCase() === 'true';
+}
+
 export const RESILIENCE_SCORE_CACHE_TTL_SECONDS = 6 * 60 * 60;
 // Ranking TTL must exceed the cron interval (6h) by enough to tolerate one
 // missed/slow cron tick. With TTL==cron_interval, writing near the end of a
@@ -54,9 +73,37 @@ export const RESILIENCE_SCORE_CACHE_TTL_SECONDS = 6 * 60 * 60;
 // full cron-cycle of headroom — ensureRankingPresent() still refreshes on
 // every cron, so under normal operation the key stays well above TTL=0.
 export const RESILIENCE_RANKING_CACHE_TTL_SECONDS = 12 * 60 * 60;
-export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
-export const RESILIENCE_HISTORY_KEY_PREFIX = 'resilience:history:v4:';
-export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v9';
+// Bumped from v9 to v10 in the pillar-combined activation PR. Provides
+// a clean slate at PR deploy so no pre-PR cache entries (whose payloads
+// lack the `_formula` tag) can leak through on activation day. NOTE:
+// the version bump alone is NOT sufficient to isolate formulas — the
+// flag defaults to off, so v10 is populated with 6-domain entries long
+// before anyone flips RESILIENCE_PILLAR_COMBINE_ENABLED=true. The real
+// cross-formula guard is the in-payload `_formula` marker written by
+// `buildResilienceScore`, read by `ensureResilienceScoreCached` and
+// `getCachedResilienceScores` to reject stale-formula hits at serve
+// time. See the `CacheFormulaTag` comment block.
+export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
+// Bumped from v4 to v5 in the pillar-combined activation PR. Provides
+// a clean slate at PR deploy so pre-PR history points (which were
+// written without a formula tag) do not mix with tagged points. NOTE:
+// the version bump alone is NOT sufficient because the flag defaults
+// to off, so v5 accumulates d6-tagged entries during the default-off
+// window. The real cross-formula guard is the `:d6` / `:pc` suffix on
+// each sorted-set member written by `appendHistory` and filtered by
+// `buildResilienceScore` before change30d / trend are computed. Legacy
+// untagged members (from older deploys that happen to survive on v4
+// readers) decode as `d6` — matching the only formula that existed
+// before this PR — so the filter stays correct in either direction.
+export const RESILIENCE_HISTORY_KEY_PREFIX = 'resilience:history:v5:';
+// Bumped in lockstep with RESILIENCE_SCORE_CACHE_PREFIX (v9 → v10) for
+// a clean slate at PR deploy. As with the score prefix, the version
+// bump is a belt — the suspenders are the `_formula` tag on the
+// ranking payload itself, written via stampRankingCacheTag and read
+// via rankingCacheTagMatches in the ranking handler, which force a
+// recompute-and-publish on a cross-formula cache hit rather than
+// serving the stale ranking for up to the 12h ranking TTL.
+export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v10';
 export const RESILIENCE_STATIC_INDEX_KEY = 'resilience:static:index:v1';
 export const RESILIENCE_INTERVAL_KEY_PREFIX = 'resilience:intervals:v1:';
 const RESILIENCE_STATIC_META_KEY = 'seed-meta:resilience:static';
@@ -65,9 +112,28 @@ const RANK_STABLE_MAX_INTERVAL_WIDTH = 8;
 const LOW_CONFIDENCE_COVERAGE_THRESHOLD = 0.55;
 const LOW_CONFIDENCE_IMPUTATION_SHARE_THRESHOLD = 0.40;

+// Cache formula tag. Stored inside score + ranking JSON payloads and as
+// a suffix in history sorted-set member strings so the reader can reject
+// or filter cross-formula entries at serve time. This is the actual
+// isolation mechanism; the v9→v10 score/ranking and v4→v5 history key
+// version bumps only provide a clean-slate at PR deploy and do NOT by
+// themselves protect against the default-off-then-activate path —
+// default-off writes land in the new v10/v5 namespace tagged as 'd6',
+// and only the in-payload tag check forces a rebuild / filter on flip.
+type CacheFormulaTag = 'd6' | 'pc';
+
+function currentCacheFormula(): CacheFormulaTag {
+  // Mirrors the gating in buildResilienceScore's overallScore branch so
+  // the tag we stamp on write equals the formula actually used. If
+  // schemaV2 is off or the pillar combine flag is off, writes tag 'd6'
+  // and reads require 'd6' — matching the 6-domain aggregate code path.
+  return isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED ? 'pc' : 'd6';
+}
+
 interface ResilienceHistoryPoint {
  date: string;
  score: number;
+  formula: CacheFormulaTag;
 }

 interface ResilienceStaticIndex {
@@ -106,9 +172,31 @@ function todayIsoDate(): string {
  return new Date().toISOString().slice(0, 10);
 }

+// Level thresholds are methodology-aware. The pillar-combined formula
+// compresses the scale (~13-point mean drop across the 52-country live
+// sample), so the legacy 70/40 thresholds misclassify top-tier countries
+// as "medium" purely because the scale got compressed rather than
+// because anything changed about the country (FI 75.64 → 68.60 and NZ
+// 76.26 → 67.93 in the live sample both straddle the legacy 70 floor).
+// The pillar-combined thresholds 60/30 are re-anchored against the live
+// sample so the qualitative label stays stable for every country whose
+// old label was correct; the 52-country sensitivity capture confirms
+// all 7 high-band anchors stay ≥60 and all fragile-state anchors stay
+// ≤30. Kept narrow: only the two thresholds move; the three-label
+// taxonomy (high/medium/low) and downstream UI consumers are
+// unchanged.
+const LEVEL_THRESHOLDS_BY_FORMULA = {
+  'domain-weighted-6d':          { high: 70, medium: 40 },
+  'pillar-combined-penalized-v1': { high: 60, medium: 30 },
+} as const;
+
 function classifyResilienceLevel(score: number): string {
-  if (score >= 70) return 'high';
-  if (score >= 40) return 'medium';
+  const formula = isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED
+    ? 'pillar-combined-penalized-v1'
+    : 'domain-weighted-6d';
+  const { high, medium } = LEVEL_THRESHOLDS_BY_FORMULA[formula];
+  if (score >= high) return 'high';
+  if (score >= medium) return 'medium';
  return 'low';
 }

@@ -183,18 +271,28 @@ function buildDomainList(dimensions: ResilienceDimension[]): ResilienceDomain[]
  });
 }

+// Sorted-set member format: `YYYY-MM-DD:SCORE[:FORMULA]`. The optional
+// formula tag is either 'd6' or 'pc'. Legacy untagged members predate
+// the pillar-combined activation and are implicitly 'd6' (the only
+// formula in use before this PR). On activation, readHistory callers
+// filter by `currentCacheFormula()` so a 30-day window of d6 points is
+// not silently compared against a fresh pc point (which would
+// manufacture a ranking-wide fake-negative change30d / false "falling"
+// trend on day one).
 function parseHistoryPoints(raw: unknown): ResilienceHistoryPoint[] {
  if (!Array.isArray(raw)) return [];
  const history: ResilienceHistoryPoint[] = [];

  for (let index = 0; index < raw.length; index += 2) {
    const member = String(raw[index] || '');
-    const separatorIndex = member.indexOf(':');
-    if (separatorIndex < 0) continue;
-    const date = member.slice(0, separatorIndex);
-    const score = Number(member.slice(separatorIndex + 1));
+    const parts = member.split(':');
+    if (parts.length < 2) continue;
+    const date = parts[0]!;
+    const score = Number(parts[1]);
+    const rawFormula = parts[2];
+    const formula: CacheFormulaTag = rawFormula === 'pc' ? 'pc' : 'd6';
    if (!/^\d{4}-\d{2}-\d{2}$/.test(date) || !Number.isFinite(score)) continue;
-    history.push({ date, score });
+    history.push({ date, score, formula });
  }

  return history.sort((left, right) => left.date.localeCompare(right.date));
@@ -212,10 +310,20 @@ async function readHistory(countryCode: string): Promise<ResilienceHistoryPoint[
  return parseHistoryPoints(result[0]?.result);
 }

-async function appendHistory(countryCode: string, overallScore: number): Promise<void> {
+async function appendHistory(
+  countryCode: string,
+  overallScore: number,
+  formula: CacheFormulaTag,
+): Promise<void> {
  const dateScore = Number(todayIsoDate().replace(/-/g, ''));
+  // Member format `YYYY-MM-DD:SCORE:FORMULA` — see parseHistoryPoints
+  // above for the reader. The formula tag is required because the v4→v5
+  // history prefix bump happens at PR deploy, not at flag flip, so the
+  // v5 series accumulates d6-tagged entries during the default-off
+  // window; only the per-member tag lets the reader correctly filter
+  // those out when the pillar-combined formula later activates.
  await runRedisPipeline([
-    ['ZADD', historyKey(countryCode), dateScore, `${todayIsoDate()}:${round(overallScore)}`],
+    ['ZADD', historyKey(countryCode), dateScore, `${todayIsoDate()}:${round(overallScore)}:${formula}`],
    ['ZREMRANGEBYRANK', historyKey(countryCode), 0, -31],
  ]);
 }
@@ -249,7 +357,26 @@ async function buildResilienceScore(
  const baselineScore = round(coverageWeightedMean(baselineDims));
  const stressScore = round(coverageWeightedMean(stressDims));
  const stressFactor = round(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);
-  const overallScore = round(domains.reduce((sum, d) => sum + d.score * d.weight, 0));
+  // Phase 2 T2.3 activation: `overallScore` is either the legacy
+  // 6-domain weighted aggregate (compensatory, `Σ domain.score *
+  // domain.weight`) or the pillar-combined penalized form (non-
+  // compensatory, `penalizedPillarScore(pillars)`), controlled by
+  // `RESILIENCE_PILLAR_COMBINE_ENABLED` + `RESILIENCE_SCHEMA_V2_ENABLED`.
+  // We only activate the pillar combine when v2 is on because the
+  // pillar list is empty under v1 and `penalizedPillarScore([])` returns
+  // 0 — that would silently zero every country's score if the flags
+  // were out of sync.
+  const domainAggregate = round(domains.reduce((sum, d) => sum + d.score * d.weight, 0));
+  const pillarEligible = isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED && pillars.length > 0;
+  const overallScore = pillarEligible
+    ? round(penalizedPillarScore(pillars.map((p) => ({ score: p.score, weight: p.weight }))))
+    : domainAggregate;
+  // Tag MUST match the branch that actually computed overallScore so
+  // the reader's stale-formula check in ensureResilienceScoreCached
+  // correctly rejects cross-formula cache entries when the env flag
+  // flips later. currentCacheFormula() reads the same two flags, so
+  // the derivation is intentionally redundant-by-agreement.
+  const formula: CacheFormulaTag = pillarEligible ? 'pc' : 'd6';

  const totalImputed = dimensions.reduce((sum, d) => sum + (d.imputedWeight ?? 0), 0);
  const totalObserved = dimensions.reduce((sum, d) => sum + (d.observedWeight ?? 0), 0);
@@ -257,12 +384,18 @@ async function buildResilienceScore(
    ? round(totalImputed / (totalImputed + totalObserved), 4)
    : 0;

+  // Filter history to the CURRENT formula only. Points tagged with the
+  // other formula are excluded from change30d / trend so the first
+  // post-flip score is not diffed against a 30-day window of the other
+  // formula's values (which would emit a fake-negative change30d and
+  // a false "falling" trend across the ranking on activation day).
  const history = (await readHistory(normalizedCountryCode))
+    .filter((point) => point.formula === formula)
    .filter((point) => point.date !== todayIsoDate());
  const scoreSeries = [...history.map((point) => point.score), overallScore];
  const oldestScore = history[0]?.score;

-  await appendHistory(normalizedCountryCode, overallScore);
+  await appendHistory(normalizedCountryCode, overallScore, formula);

  return {
    countryCode: normalizedCountryCode,
@@ -282,6 +415,37 @@ async function buildResilienceScore(
  };
 }

+// The shape we actually store in Redis. Extends the public response type
+// with a `_formula` marker so the reader can reject cross-formula cache
+// entries when `RESILIENCE_PILLAR_COMBINE_ENABLED` flips later. The
+// marker is stripped before the payload crosses back to callers.
+type CachedScorePayload = GetResilienceScoreResponse & { _formula?: CacheFormulaTag };
+
+function stripCacheMeta(payload: CachedScorePayload): GetResilienceScoreResponse {
+  const { _formula: _drop, ...rest } = payload;
+  void _drop;
+  return rest;
+}
+
+// Exposed helpers so the ranking handler can apply the same
+// stale-formula invalidation to its own cache key. Kept in this module
+// alongside the score versions so the tag convention has one source of
+// truth; a diverging derivation elsewhere would re-introduce the cross-
+// formula drift this whole pattern is meant to prevent.
+export function getCurrentCacheFormula(): CacheFormulaTag {
+  return currentCacheFormula();
+}
+
+export function stampRankingCacheTag<T extends object>(payload: T): T & { _formula: CacheFormulaTag } {
+  return { ...payload, _formula: currentCacheFormula() };
+}
+
+export function rankingCacheTagMatches(payload: unknown): boolean {
+  if (!payload || typeof payload !== 'object') return false;
+  const tag = (payload as { _formula?: unknown })._formula;
+  return tag === currentCacheFormula();
+}
+
 export async function ensureResilienceScoreCached(countryCode: string, reader?: ResilienceSeedReader): Promise<GetResilienceScoreResponse> {
  const normalizedCountryCode = normalizeCountryCode(countryCode);
  if (!normalizedCountryCode) {
@@ -306,44 +470,68 @@ export async function ensureResilienceScoreCached(countryCode: string, reader?:
    };
  }

-  let cached = await cachedFetchJson<GetResilienceScoreResponse>(
-    scoreCacheKey(normalizedCountryCode),
+  const current = currentCacheFormula();
+  const cacheKey = scoreCacheKey(normalizedCountryCode);
+
+  let cached = await cachedFetchJson<CachedScorePayload>(
+    cacheKey,
    RESILIENCE_SCORE_CACHE_TTL_SECONDS,
-    () => buildResilienceScore(normalizedCountryCode, reader),
+    async () => {
+      const built = await buildResilienceScore(normalizedCountryCode, reader);
+      // Tag with the flag-derived formula (same flags buildResilienceScore
+      // branches on) so downstream readers can reject cross-formula entries.
+      return { ...built, _formula: current };
+    },
    300,
-  ) ?? {
-    countryCode: normalizedCountryCode,
-    overallScore: 0,
-    baselineScore: 0,
-    stressScore: 0,
-    stressFactor: 0.5,
-    level: 'unknown',
-    domains: [],
-    trend: 'stable',
-    change30d: 0,
-    lowConfidence: true,
-    imputationShare: 0,
-    dataVersion: '',
-    // Phase 2 T2.1: cachedFetchJson-null fallback. Stays on the v1 shape
-    // because there are no domains to wrap into pillars here.
-    pillars: [],
-    schemaVersion: '1.0',
-  };
+  );
+
+  // Stale-formula guard. On activation day (flag flip), cached entries
+  // from the previous formula are still in Redis under the same key
+  // (v10 bump happens at PR deploy, not at flip time). The `_formula`
+  // tag we wrote on the cached payload lets us detect and overwrite
+  // the stale entry at read time. Without this, a 6-hour post-flip
+  // window would keep serving legacy scores. Legacy untagged entries
+  // (pre-PR writes that happen to survive the v9→v10 bump via
+  // external writers) are treated as stale-formula and rebuilt.
+  if (cached && cached._formula !== current) {
+    const rebuilt = await buildResilienceScore(normalizedCountryCode, reader);
+    cached = { ...rebuilt, _formula: current };
+    await setCachedJson(cacheKey, cached, RESILIENCE_SCORE_CACHE_TTL_SECONDS);
+  }
+
+  let payload: GetResilienceScoreResponse = cached
+    ? stripCacheMeta(cached)
+    : {
+        countryCode: normalizedCountryCode,
+        overallScore: 0,
+        baselineScore: 0,
+        stressScore: 0,
+        stressFactor: 0.5,
+        level: 'unknown',
+        domains: [],
+        trend: 'stable',
+        change30d: 0,
+        lowConfidence: true,
+        imputationShare: 0,
+        dataVersion: '',
+        pillars: [],
+        schemaVersion: '1.0',
+      };

  const scoreInterval = await readScoreInterval(normalizedCountryCode);
  if (scoreInterval) {
-    cached = { ...cached, scoreInterval };
+    payload = { ...payload, scoreInterval };
  }

  // P1 fix: the cache always stores the v2 superset (pillars + schemaVersion='2.0').
  // When the flag is off, strip pillars and downgrade schemaVersion so consumers
  // see the v1 shape. Flag flips take effect immediately, no 6h TTL wait.
  if (!RESILIENCE_SCHEMA_V2_ENABLED) {
-    cached.pillars = [];
-    cached.schemaVersion = '1.0';
+    payload.pillars = [];
+    payload.schemaVersion = '1.0';
  }

-  return cached;
+  return payload;
 }

 export async function listScorableCountries(): Promise<string[]> {
@@ -361,6 +549,7 @@ export async function getCachedResilienceScores(countryCodes: string[]): Promise

  const results = await runRedisPipeline(normalized.map((countryCode) => ['GET', scoreCacheKey(countryCode)]));
  const scores = new Map<string, GetResilienceScoreResponse>();
+  const current = currentCacheFormula();

  for (let index = 0; index < normalized.length; index += 1) {
    const countryCode = normalized[index]!;
@@ -369,14 +558,32 @@ export async function getCachedResilienceScores(countryCodes: string[]): Promise
    try {
      // Envelope-aware: resilience score keys are written by seed-resilience-scores
      // in contract mode (PR 2). unwrapEnvelope is a no-op on legacy bare-shape.
-      const parsed = unwrapEnvelope(JSON.parse(raw)).data as GetResilienceScoreResponse;
+      const parsed = unwrapEnvelope(JSON.parse(raw)).data as CachedScorePayload;
      if (!parsed) continue;
+      // Stale-formula skip: this bulk read feeds the ranking handler,
+      // which mirrors the single-country cache miss path. Leaving the
+      // country out of `scores` causes the ranking handler's
+      // warmMissingResilienceScores step to rebuild it with the current
+      // formula, producing a coherent same-formula ranking. Without
+      // this filter, a flip would serve a mixed-formula ranking for
+      // up to the 6h score TTL.
+      //
+      // IMPORTANT: the condition intentionally matches `undefined` too
+      // (not `parsed._formula && parsed._formula !== current`). Legacy
+      // untagged entries carry no `_formula` — they were written by a
+      // pre-PR code path or by an external writer that has not been
+      // updated — and must be treated as stale so the ranking warm
+      // path rebuilds them with the current tag. The `&&` short-circuit
+      // would admit them and re-introduce the cross-formula drift the
+      // whole cache-tag strategy is meant to prevent.
+      if (parsed._formula !== current) continue;
+      const publicPayload = stripCacheMeta(parsed);
      // P1 fix: cached payload is always v2 superset. Gate on serve.
      if (!RESILIENCE_SCHEMA_V2_ENABLED) {
-        parsed.pillars = [];
-        parsed.schemaVersion = '1.0';
+        publicPayload.pillars = [];
+        publicPayload.schemaVersion = '1.0';
      }
-      scores.set(countryCode, parsed);
+      scores.set(countryCode, publicPayload);
    } catch {
      // Ignore malformed cache entries and let the caller decide whether to warm them.
    }
@@ -500,10 +707,15 @@ export async function warmMissingResilienceScores(
  // pipeline body small enough to land well under the timeout while still
  // making one round-trip per batch.
  const SET_BATCH = 30;
+  const current = currentCacheFormula();
  const allSetCommands = scores.map(({ cc, score }) => [
    'SET',
    scoreCacheKey(cc),
-    JSON.stringify(score),
+    // Stamp the formula tag on the written payload so the bulk-read
+    // path in getCachedResilienceScores can filter stale entries after
+    // a flag flip. Without this tag, warmed-then-flipped entries would
+    // be served as-is until the 6h TTL expired.
+    JSON.stringify({ ...score, _formula: current } satisfies CachedScorePayload),
    'EX',
    String(RESILIENCE_SCORE_CACHE_TTL_SECONDS),
  ]);
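Reviewer note (illustrative, not part of the patch): the history member format `YYYY-MM-DD:SCORE[:FORMULA]` and the per-formula filter described in the `_shared.ts` comments can be sketched in isolation as below. `parseMember`, `FormulaTag`, and `HistoryPoint` are hypothetical names that loosely mirror `parseHistoryPoints`; the member strings are made-up placeholders, not real history data.

```typescript
type FormulaTag = 'd6' | 'pc';

interface HistoryPoint {
  date: string;
  score: number;
  formula: FormulaTag;
}

// Parses one sorted-set member. Untagged members decode as 'd6', the only
// formula that existed before the pillar-combined activation.
function parseMember(member: string): HistoryPoint | null {
  const [date, rawScore, rawFormula] = member.split(':');
  if (date === undefined || rawScore === undefined) return null;
  const score = Number(rawScore);
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date) || !Number.isFinite(score)) return null;
  return { date, score, formula: rawFormula === 'pc' ? 'pc' : 'd6' };
}

// Placeholder members: one legacy untagged, one tagged d6, one tagged pc,
// and one malformed entry that should be dropped.
const members = ['2026-04-01:61', '2026-04-02:62:d6', '2026-04-03:50.2:pc', 'garbage'];

const points = members
  .map(parseMember)
  .filter((p): p is HistoryPoint => p !== null);

// Only same-formula points feed change30d / trend after a flag flip.
const pcOnly = points.filter((p) => p.formula === 'pc');
const d6Only = points.filter((p) => p.formula === 'd6');

console.log(points.length, pcOnly.length, d6Only.length);
```

Filtering before the diff means the first pillar-combined score is compared only against other pillar-combined points, so activation day starts with an empty window rather than a misleading negative change.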
--- a/server/worldmonitor/resilience/v1/get-resilience-ranking.ts
+++ b/server/worldmonitor/resilience/v1/get-resilience-ranking.ts
@@ -15,7 +15,9 @@ import {
  buildRankingItem,
  getCachedResilienceScores,
  listScorableCountries,
+  rankingCacheTagMatches,
  sortRankingItems,
+  stampRankingCacheTag,
  warmMissingResilienceScores,
  type ScoreInterval,
 } from './_shared';
@@ -88,8 +90,21 @@ export const getResilienceRanking: ResilienceServiceHandler['getResilienceRankin
    return true;
  })();
  if (!forceRefresh) {
-    const cached = await getCachedJson(RESILIENCE_RANKING_CACHE_KEY) as GetResilienceRankingResponse | null;
-    if (cached != null && (cached.items.length > 0 || (cached.greyedOut?.length ?? 0) > 0)) return cached;
+    const cached = await getCachedJson(RESILIENCE_RANKING_CACHE_KEY) as (GetResilienceRankingResponse & { _formula?: string }) | null;
+    // Stale-formula gate: the ranking cache key is bumped at PR deploy,
+    // but the flag flip happens later, so the v10 namespace starts out
+    // filled with 6-domain rankings. Without this check, a flip would
+    // serve the legacy ranking aggregate for up to the 12h ranking TTL
+    // even as per-country reads produced pillar-combined scores. Drop
+    // stale-formula hits so the recompute-and-publish path below runs.
+    const tagMatches = cached != null && rankingCacheTagMatches(cached);
+    if (tagMatches && (cached!.items.length > 0 || (cached!.greyedOut?.length ?? 0) > 0)) {
+      // Strip the cache-only tag before returning to callers so the
+      // wire shape matches the generated proto response type.
+      const { _formula: _drop, ...publicResponse } = cached!;
+      void _drop;
+      return publicResponse as GetResilienceRankingResponse;
+    }
  }

  const countryCodes = await listScorableCountries();
@@ -132,8 +147,12 @@ export const getResilienceRanking: ResilienceServiceHandler['getResilienceRankin
    // self-heal here ensures we at least log it, and the seeder also verifies
    // BOTH keys post-refresh. If either SET didn't return OK we log a warning
    // that ops can grep for, rather than silently succeeding.
+    // Tag the persisted ranking so the stale-formula gate above can
+    // detect a cross-formula cache hit after a flag flip. The tag is
+    // stripped on read before the response crosses back to callers.
+    const persistedRanking = stampRankingCacheTag(response);
    const pipelineResult = await runRedisPipeline([
-      ['SET', RESILIENCE_RANKING_CACHE_KEY, JSON.stringify(response), 'EX', RESILIENCE_RANKING_CACHE_TTL_SECONDS],
+      ['SET', RESILIENCE_RANKING_CACHE_KEY, JSON.stringify(persistedRanking), 'EX', RESILIENCE_RANKING_CACHE_TTL_SECONDS],
      ['SET', RESILIENCE_RANKING_META_KEY, JSON.stringify({
        fetchedAt: Date.now(),
        count: response.items.length + response.greyedOut.length,
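Reviewer note (illustrative, not part of the patch): the stamp / match / strip cycle that the ranking handler applies around its cache key can be exercised without Redis. In this sketch `activeFormula` is a hypothetical stand-in for the env-derived `currentCacheFormula()`, the helper names are shortened, and the ranking payload is a placeholder.

```typescript
type FormulaTag = 'd6' | 'pc';

// Hypothetical stand-in for the env-derived currentCacheFormula().
let activeFormula: FormulaTag = 'd6';

function stamp<T extends object>(payload: T): T & { _formula: FormulaTag } {
  return { ...payload, _formula: activeFormula };
}

// Strict equality on purpose: an untagged payload (undefined tag) is stale.
function tagMatches(payload: unknown): boolean {
  if (!payload || typeof payload !== 'object') return false;
  return (payload as { _formula?: unknown })._formula === activeFormula;
}

// Removes the cache-only marker so the wire shape stays unchanged.
function strip<T extends { _formula?: unknown }>(payload: T): Omit<T, '_formula'> {
  const { _formula: _drop, ...rest } = payload;
  void _drop;
  return rest;
}

// Placeholder ranking written while the flag is off (tagged 'd6').
const cachedRanking = stamp({ items: [{ countryCode: 'XX', overallScore: 50 }] });

const beforeFlip = tagMatches(cachedRanking); // same formula: serve from cache
activeFormula = 'pc';                         // simulate the flag flip
const afterFlip = tagMatches(cachedRanking);  // cross-formula: recompute
const untagged = tagMatches({ items: [] });   // legacy entry: recompute

const wire = strip(cachedRanking);

console.log(beforeFlip, afterFlip, untagged, Object.keys(wire));
```

Because the comparison is strict, a payload with no `_formula` field is treated the same as a cross-formula hit, which matches the treatment of legacy untagged entries described in `getCachedResilienceScores`.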
--- a/tests/resilience-handlers.test.mts
+++ b/tests/resilience-handlers.test.mts
@@ -28,7 +28,7 @@ describe('resilience handlers', () => {
    delete process.env.VERCEL_ENV;

    const { fetchImpl, redis, sortedSets } = createRedisFetch(RESILIENCE_FIXTURES);
-    sortedSets.set('resilience:history:v4:US', [
+    sortedSets.set('resilience:history:v5:US', [
      { member: '2026-04-01:20', score: 20260401 },
      { member: '2026-04-02:30', score: 20260402 },
    ]);
@@ -55,16 +55,16 @@ describe('resilience handlers', () => {
    assert.ok(response.stressFactor >= 0 && response.stressFactor <= 0.5, `stressFactor out of bounds: ${response.stressFactor}`);
    assert.equal(response.dataVersion, '2024-04-03', 'dataVersion should be the ISO date from seed-meta fetchedAt');

-    const cachedScore = redis.get('resilience:score:v9:US');
+    const cachedScore = redis.get('resilience:score:v10:US');
    assert.ok(cachedScore, 'expected score cache to be written');
    assert.equal(JSON.parse(cachedScore || '{}').countryCode, 'US');

-    const history = sortedSets.get('resilience:history:v4:US') ?? [];
+    const history = sortedSets.get('resilience:history:v5:US') ?? [];
    assert.ok(history.some((entry) => entry.member.startsWith(today + ':')), 'expected today history member to be written');

    await getResilienceScore({ request: new Request('https://example.com') } as never, {
      countryCode: 'US',
    });
-    assert.equal((sortedSets.get('resilience:history:v4:US') ?? []).length, history.length, 'cache hit must not append history');
+    assert.equal((sortedSets.get('resilience:history:v5:US') ?? []).length, history.length, 'cache hit must not append history');
  });
 });
--- a/tests/resilience-pillar-aggregation.test.mts
+++ b/tests/resilience-pillar-aggregation.test.mts
@@ -157,8 +157,8 @@ describe('pillar constants', () => {
    assert.equal(PENALTY_ALPHA, 0.50);
  });

-  it('RESILIENCE_SCORE_CACHE_PREFIX is v9', () => {
-    assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v9:');
+  it('RESILIENCE_SCORE_CACHE_PREFIX is v10', () => {
+    assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v10:');
  });

  it('PILLAR_ORDER has 3 entries', () => {
--- a/tests/resilience-pillar-combine-activation.test.mts
+++ b/tests/resilience-pillar-combine-activation.test.mts
@@ -0,0 +1,247 @@
+// Phase 2 T2.3 activation test suite.
+//
+// Exercises the `RESILIENCE_PILLAR_COMBINE_ENABLED` flag: when set,
+// `overallScore` switches from the 6-domain weighted aggregate to the
+// penalized pillar-combined form. The existing release-gate tests
+// (tests/resilience-release-gate.test.mts) cover the default (flag=off)
+// path and pin the anchors for the 6-domain formula; this file covers
+// the re-anchored bands under the pillar combine.
+//
+// Why a separate file: the existing release-gate test imports
+// `getResilienceScore` at the top of the file (captures the legacy
+// overallScore path) and runs many asserts that would become stale
+// under the pillar combine. A separate file lets us flip the env flag
+// in a per-test setup/teardown cleanly.
+
+import assert from 'node:assert/strict';
+import { afterEach, beforeEach, describe, it } from 'node:test';
+
+import { getResilienceRanking } from '../server/worldmonitor/resilience/v1/get-resilience-ranking.ts';
+import { getResilienceScore } from '../server/worldmonitor/resilience/v1/get-resilience-score.ts';
+import {
+  isPillarCombineEnabled,
+  penalizedPillarScore,
+} from '../server/worldmonitor/resilience/v1/_shared.ts';
+import { createRedisFetch } from './helpers/fake-upstash-redis.mts';
+import {
+  buildReleaseGateFixtures,
+} from './helpers/resilience-release-fixtures.mts';
+
+// Re-anchored bands for the pillar-combined formula, derived from the
+// 52-country live-Redis sensitivity capture in
+// docs/snapshots/resilience-pillar-sensitivity-2026-04-21.json.
+// Old (6-domain): NO ≥ 70, YE/SO/CD ≤ 35, NO − US ≥ 8.
+// New (pillar combine, α=0.5): every country drops ~13 points, top
+// stays ~65-72, fragile states drop to ~15-35. The re-anchored bands
+// preserve the "high" vs "low" separation without pinning numbers that
+// are only valid for the legacy formula.
+const HIGH_BAND_FLOOR = 60;
+const LOW_BAND_CEILING = 40;
+const MIN_HIGH_LOW_SEPARATION = 20;
+
+const fixtures = buildReleaseGateFixtures();
+
+const originalFetch = globalThis.fetch;
+const originalRedisUrl = process.env.UPSTASH_REDIS_REST_URL;
+const originalRedisToken = process.env.UPSTASH_REDIS_REST_TOKEN;
+const originalVercelEnv = process.env.VERCEL_ENV;
+const originalPillarFlag = process.env.RESILIENCE_PILLAR_COMBINE_ENABLED;
+
+function installRedisFixtures() {
+  process.env.UPSTASH_REDIS_REST_URL = 'https://redis.example';
+  process.env.UPSTASH_REDIS_REST_TOKEN = 'token';
+  delete process.env.VERCEL_ENV;
+  const redisState = createRedisFetch(fixtures);
+  globalThis.fetch = redisState.fetchImpl;
+  return redisState;
+}
+
+function enablePillarCombine(): void {
+  process.env.RESILIENCE_PILLAR_COMBINE_ENABLED = 'true';
+}
+
+function disablePillarCombine(): void {
+  process.env.RESILIENCE_PILLAR_COMBINE_ENABLED = 'false';
+}
+
+describe('pillar-combined score activation', () => {
+  beforeEach(() => {
+    enablePillarCombine();
+  });
+
+  afterEach(() => {
+    globalThis.fetch = originalFetch;
+    if (originalRedisUrl == null) delete process.env.UPSTASH_REDIS_REST_URL;
+    else process.env.UPSTASH_REDIS_REST_URL = originalRedisUrl;
+    if (originalRedisToken == null) delete process.env.UPSTASH_REDIS_REST_TOKEN;
+    else process.env.UPSTASH_REDIS_REST_TOKEN = originalRedisToken;
+    if (originalVercelEnv == null) delete process.env.VERCEL_ENV;
+    else process.env.VERCEL_ENV = originalVercelEnv;
+    if (originalPillarFlag == null) delete process.env.RESILIENCE_PILLAR_COMBINE_ENABLED;
+    else process.env.RESILIENCE_PILLAR_COMBINE_ENABLED = originalPillarFlag;
+  });
+
+  it('isPillarCombineEnabled reads env dynamically', () => {
+    enablePillarCombine();
+    assert.equal(isPillarCombineEnabled(), true);
+    disablePillarCombine();
+    assert.equal(isPillarCombineEnabled(), false);
+    enablePillarCombine();
+    assert.equal(isPillarCombineEnabled(), true);
+  });
+
+  it('penalizedPillarScore reduces to weighted-sum × min-pillar penalty when all pillars are equal', () => {
+    // All pillars at 80 → min=80 → penalty = 1 − 0.5*(1 − 0.8) = 0.9.
+    // Weighted sum = 80 * (0.40 + 0.35 + 0.25) = 80.
+    // Final = 80 * 0.9 = 72.
+    const result = penalizedPillarScore([
+      { score: 80, weight: 0.40 },
+      { score: 80, weight: 0.35 },
+      { score: 80, weight: 0.25 },
+    ]);
+    assert.equal(Math.round(result * 100) / 100, 72.00);
+  });
+
+  it('pillar-combined overallScore keeps NO at or above the re-anchored high-band floor and at most 90', async () => {
+    installRedisFixtures();
+
+    const response = await getResilienceScore(
+      { request: new Request('https://example.com?countryCode=NO') } as never,
+      { countryCode: 'NO' },
+    );
+
+    // Under the 6-domain formula, Norway scores ~86 with the current
+    // fixtures (pinned by the T1.1 regression test). Under the pillar
+    // combine it drops to roughly the low 70s because penalty = 1 −
+    // 0.5 × (1 − min_pillar/100) is always ≤ 1. The activated path's
+    // HIGH_BAND_FLOOR = 60 leaves plenty of headroom above mid-tier
+    // countries while accepting that elite scores no longer sit in the
+    // 85+ range.
+    assert.ok(
+      response.overallScore >= HIGH_BAND_FLOOR,
+      `NO in the pillar-combined formula must stay above the re-anchored high-band floor (${HIGH_BAND_FLOOR}), got ${response.overallScore}`,
+    );
+    assert.ok(
+      response.overallScore <= 90,
+      `NO in the pillar-combined formula should NOT exceed 90 — penalty factor is always ≤ 1, so getting close to 100 would indicate the penalty is not firing. Got ${response.overallScore}.`,
+    );
+  });
+
+  it('pillar-combined overallScore keeps fragile countries (YE, SO) below the re-anchored low-band ceiling', async () => {
+    installRedisFixtures();
+
+    for (const countryCode of ['YE', 'SO'] as const) {
+      const response = await getResilienceScore(
+        { request: new Request(`https://example.com?countryCode=${countryCode}`) } as never,
+        { countryCode },
+      );
+      assert.ok(
+        response.overallScore <= LOW_BAND_CEILING,
+        `${countryCode} in the pillar-combined formula must stay below the re-anchored low-band ceiling (${LOW_BAND_CEILING}), got ${response.overallScore}`,
+      );
+    }
+  });
+
+  it('pillar-combined preserves NO vs US separation (high-band vs mid-band)', async () => {
+    installRedisFixtures();
+
+    const [no, us] = await Promise.all([
+      getResilienceScore({ request: new Request('https://example.com?countryCode=NO') } as never, { countryCode: 'NO' }),
+      getResilienceScore({ request: new Request('https://example.com?countryCode=US') } as never, { countryCode: 'US' }),
+    ]);
+
+    // The 6-domain separation was ~14 points under fixtures. The pillar
+    // combine penalizes imbalanced pillar profiles more (US has a weaker
+    // live-shock pillar than Norway), so the gap should hold or widen.
+    // Floor: MIN_HIGH_LOW_SEPARATION − 12 = 8, the legacy NO − US anchor.
+    assert.ok(
+      no.overallScore > us.overallScore,
+      `NO (${no.overallScore}) must still outscore US (${us.overallScore}) under the pillar combine`,
+    );
+    assert.ok(
+      no.overallScore - us.overallScore >= MIN_HIGH_LOW_SEPARATION - 12,
+      `NO − US separation must stay ≥ ${MIN_HIGH_LOW_SEPARATION - 12} under pillar combine; got NO=${no.overallScore}, US=${us.overallScore}, Δ=${(no.overallScore - us.overallScore).toFixed(2)}`,
+    );
+  });
+
+  it('pillar-combined ranking preserves the elite vs fragile ordering over the release set', async () => {
+    installRedisFixtures();
+
+    const ranking = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
+    const byCountry = new Map(ranking.items.map((item) => [item.countryCode, item]));
+
+    // Every high-band anchor (if present in the ranking) must outrank
+    // every low-band anchor (if present). This is the structural
+    // invariant the pillar combine must preserve to be accepted.
+    const highAnchors = ['NO', 'CH', 'DK', 'IS', 'FI', 'SE', 'NZ'].filter((cc) => byCountry.has(cc));
+    const lowAnchors = ['YE', 'SO', 'SD', 'CD'].filter((cc) => byCountry.has(cc));
+
+    for (const high of highAnchors) {
+      for (const low of lowAnchors) {
+        const highScore = byCountry.get(high)!.overallScore;
+        const lowScore = byCountry.get(low)!.overallScore;
+        assert.ok(
+          highScore > lowScore,
+          `pillar-combined ranking must keep ${high} (${highScore}) above ${low} (${lowScore})`,
+        );
+      }
+    }
+  });
+
+  it('disabling the flag restores the 6-domain aggregate (regression guard for the default path)', async () => {
+    installRedisFixtures();
+    disablePillarCombine();
+
+    const response = await getResilienceScore(
+      { request: new Request('https://example.com?countryCode=NO') } as never,
+      { countryCode: 'NO' },
+    );
+
+    // Under the 6-domain formula + current fixtures, NO is pinned at
+    // ≥ 70 by the existing release-gate test. The flag-off code path
+    // is the same one the production default uses; we verify here that
+    // switching the flag off mid-suite really does restore it (the
+    // dynamic env read in isPillarCombineEnabled() is load-bearing).
+    assert.ok(
+      response.overallScore >= 70,
+      `with flag off, NO must still meet the 6-domain release-gate floor (70), got ${response.overallScore}`,
+    );
+  });
+
+  it('flipping the flag mid-session rebuilds the score (stale-formula cache invalidation)', async () => {
+    // This is the core guarantee for the activation story: merging this
+    // PR with flag=false populates cached scores tagged _formula='d6',
+    // and later setting RESILIENCE_PILLAR_COMBINE_ENABLED=true MUST
+    // force a rebuild on next read (rather than serving the d6-tagged
+    // entry for up to 6h until the TTL expires). We simulate the flip
+    // inside a single test by pre-computing a cache entry with the
+    // flag off, flipping the flag, then reading again — the second
+    // read must produce a different overallScore because the cache
+    // entry's _formula no longer matches the current formula.
+    disablePillarCombine();
+    installRedisFixtures();
+    const firstRead = await getResilienceScore(
+      { request: new Request('https://example.com?countryCode=NO') } as never,
+      { countryCode: 'NO' },
+    );
+    assert.ok(firstRead.overallScore >= 70, `flag-off NO should score ≥70, got ${firstRead.overallScore}`);
+
+    // Flip the flag. The cached entry in Redis still carries
+    // _formula='d6' from the first read. Without the stale-formula
+    // gate, the second read would serve that same 6-domain score.
+    enablePillarCombine();
+    const secondRead = await getResilienceScore(
+      { request: new Request('https://example.com?countryCode=NO') } as never,
+      { countryCode: 'NO' },
+    );
+
+    assert.ok(
+      secondRead.overallScore < firstRead.overallScore,
+      `flag-on rebuild must drop NO's score below the 6-domain value (penalty factor ≤ 1); got first=${firstRead.overallScore} second=${secondRead.overallScore}. If these are equal, the stale-formula cache gate is not firing and a flag flip in production would serve legacy values for up to the 6h TTL.`,
+    );
+    assert.ok(
+      secondRead.overallScore >= 60,
+      `flag-on NO should still meet the re-anchored 60 floor, got ${secondRead.overallScore}`,
+    );
+  });
+});
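Reviewer note: the arithmetic the activation tests above rely on can be sketched as follows. This is a minimal illustration built only from the formula quoted in the test comments (penalty = 1 − α × (1 − min_pillar/100), applied to the weighted pillar sum). The names `sketchPenalizedPillarScore` and `PillarInput` are hypothetical and are not the implementation in `_shared.ts`, which may differ in rounding and edge-case handling.

```typescript
// Hypothetical sketch; NOT the code in server/worldmonitor/resilience/v1/_shared.ts.
interface PillarInput {
  score: number;  // pillar score on a 0..100 scale
  weight: number; // pillar weight; weights are assumed to sum to 1
}

// Matches the PENALTY_ALPHA value pinned by the aggregation test.
const SKETCH_ALPHA = 0.5;

function sketchPenalizedPillarScore(
  pillars: PillarInput[],
  alpha: number = SKETCH_ALPHA,
): number {
  // Assumption: an empty pillar list scores 0 (the real helper may differ).
  if (pillars.length === 0) return 0;

  // Compensatory part: plain weighted sum of pillar scores.
  const weightedSum = pillars.reduce((sum, p) => sum + p.score * p.weight, 0);

  // Non-compensatory part: the weakest pillar scales the whole score down.
  const minPillar = Math.min(...pillars.map((p) => p.score));
  const penalty = 1 - alpha * (1 - minPillar / 100);

  return weightedSum * penalty;
}

// Balanced profile from the test: 80 * (1 - 0.5 * 0.2), approximately 72.
const balancedScore = sketchPenalizedPillarScore([
  { score: 80, weight: 0.40 },
  { score: 80, weight: 0.35 },
  { score: 80, weight: 0.25 },
]);

// Imbalanced profile: weighted sum 77.5, penalty 0.7, approximately 54.25.
const imbalancedScore = sketchPenalizedPillarScore([
  { score: 90, weight: 0.40 },
  { score: 90, weight: 0.35 },
  { score: 40, weight: 0.25 },
]);
```

Because the penalty factor never exceeds 1, the combined score can never exceed the weighted sum. That is why the flag-flip test expects the second read to come back strictly lower than the first.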
--- a/tests/resilience-ranking-snapshot.test.mts
+++ b/tests/resilience-ranking-snapshot.test.mts
@@ -23,13 +23,29 @@ const __filename = fileURLToPath(import.meta.url);
 const REPO_ROOT = path.resolve(path.dirname(__filename), '..');
 const SNAPSHOT_DIR = path.join(REPO_ROOT, 'docs', 'snapshots');

-// Band anchors from the release-gate tests (tests/resilience-release-gate.test.mts).
-// Countries in the high-anchor set must never drop below 70 in a published
-// snapshot; countries in the low-anchor set must never climb above 45.
+// Band anchors from the release-gate tests (tests/resilience-release-gate.test.mts
+// and tests/resilience-pillar-combine-activation.test.mts).
+// Floors/ceilings depend on the methodology formula the snapshot was
+// captured under — the pillar-combined form is non-compensatory so its
+// scale is compressed; the 6-domain legacy form is compensatory and
+// runs ~13 points hotter.
 const HIGH_BAND_ANCHORS = new Set(['NO', 'CH', 'DK', 'IS', 'FI', 'SE', 'NZ']);
 const LOW_BAND_ANCHORS = new Set(['YE', 'SO', 'SD', 'CD']);
-const HIGH_BAND_FLOOR = 70;
-const LOW_BAND_CEILING = 45;
+
+const METHODOLOGY_BANDS: Record<string, { highFloor: number; lowCeiling: number }> = {
+  'domain-weighted-6d': { highFloor: 70, lowCeiling: 45 },
+  'pillar-combined-penalized-v1': { highFloor: 60, lowCeiling: 40 },
+};
+
+function resolveBands(methodologyFormula: string | undefined): { highFloor: number; lowCeiling: number } {
+  // Unknown / unspecified formulas fall through to the 6-domain bands
+  // (the production default at the time of writing). If a future
+  // snapshot uses a new formula id, adding an entry to
+  // METHODOLOGY_BANDS above is the one-line fix; until then we assume
+  // the legacy bands rather than silently under-validating.
+  return METHODOLOGY_BANDS[methodologyFormula ?? 'domain-weighted-6d']
+    ?? METHODOLOGY_BANDS['domain-weighted-6d']!;
+}

 interface PublishedRow {
  rank: number;
@@ -53,6 +69,7 @@ interface SnapshotPublished {
  capturedAt: string;
  commitSha: string;
  schemaVersion: string;
+  methodologyFormula?: string;
  methodology: {
    domainCount: number;
    dimensionCount: number;
@@ -82,14 +99,49 @@ interface SnapshotLive {
  greyedOut: Array<{ countryCode: string; overallCoverage: number }>;
 }

-type Snapshot = SnapshotPublished | SnapshotLive;
+interface ProjectedRow {
+  rankInSample: number;
+  countryCode: string;
+  countryName: string;
+  proposedOverallScore: number;
+  currentOverallScore: number;
+  scoreDelta: number;
+}
+
+interface SnapshotProjected {
+  capturedAt: string;
+  commitSha: string;
+  schemaVersion: string;
+  methodologyFormula: string;
+  methodology: {
+    domainCount: number;
+    dimensionCount: number;
+    pillarCount: number;
+    greyOutThreshold: number;
+  };
+  sampleSize: number;
+  tables: {
+    topTenInSample: ProjectedRow[];
+    bottomTenInSample: ProjectedRow[];
+    majorEconomiesInSample: ProjectedRow[];
+  };
+  totals: { rankedCountriesInSample: number };
+}
+
+type Snapshot = SnapshotPublished | SnapshotLive | SnapshotProjected;

 function isLive(snapshot: Snapshot): snapshot is SnapshotLive {
  return Array.isArray((snapshot as SnapshotLive).items);
 }

+function isProjected(snapshot: Snapshot): snapshot is SnapshotProjected {
+  const tables = (snapshot as SnapshotProjected).tables;
+  return !!tables && Array.isArray(tables.topTenInSample);
+}
+
 function isPublished(snapshot: Snapshot): snapshot is SnapshotPublished {
-  return (snapshot as SnapshotPublished).tables != null;
+  const tables = (snapshot as SnapshotPublished).tables;
+  return !!tables && Array.isArray(tables.topTen);
 }

 function loadSnapshots(): { filename: string; snapshot: Snapshot }[] {
@@ -99,8 +151,17 @@ function loadSnapshots(): { filename: string; snapshot: Snapshot }[] {
  } catch {
    return [];
  }
+  // Matches two filename shapes:
+  //   resilience-ranking-YYYY-MM-DD.json
+  //     → published or live capture (the authoritative shape)
+  //   resilience-ranking-<slug>-YYYY-MM-DD.json
+  //     → projected / preview snapshot (e.g. pillar-combined-projected)
+  //       Auto-discovered so the projected artifact does not slip
+  //       through unvalidated. Slug must be hyphenated, start with an
+  //       alpha char, and live before the date.
+  const RANKING_SNAPSHOT_RE = /^resilience-ranking-(?:[a-z][a-z0-9-]*-)?\d{4}-\d{2}-\d{2}\.json$/;
  return entries
-    .filter((name) => /^resilience-ranking-\d{4}-\d{2}-\d{2}\.json$/.test(name))
+    .filter((name) => RANKING_SNAPSHOT_RE.test(name))
    .sort()
    .map((filename) => ({
      filename,
@@ -205,22 +266,24 @@ describe('resilience-ranking snapshots', () => {
          assert.ok(unique.size >= Math.max(snapshot.tables.topTen.length, snapshot.tables.bottomTen.length));
        });

-        it('high-band anchors appearing in topTen stay above the release-gate floor', () => {
+        it('high-band anchors appearing in topTen stay above the release-gate floor (methodology-aware)', () => {
+          const { highFloor } = resolveBands(snapshot.methodologyFormula);
          for (const row of snapshot.tables.topTen) {
            if (!HIGH_BAND_ANCHORS.has(row.countryCode)) continue;
            assert.ok(
-              row.overallScore >= HIGH_BAND_FLOOR,
-              `${row.countryCode} (${row.countryName}) is a high-band anchor and must stay ≥${HIGH_BAND_FLOOR}, got ${row.overallScore}`,
+              row.overallScore >= highFloor,
+              `${row.countryCode} (${row.countryName}) is a high-band anchor and must stay ≥${highFloor} under "${snapshot.methodologyFormula ?? 'domain-weighted-6d'}", got ${row.overallScore}`,
            );
          }
        });

-        it('low-band anchors appearing in bottomTen stay below the release-gate ceiling', () => {
+        it('low-band anchors appearing in bottomTen stay below the release-gate ceiling (methodology-aware)', () => {
+          const { lowCeiling } = resolveBands(snapshot.methodologyFormula);
          for (const row of snapshot.tables.bottomTen) {
            if (!LOW_BAND_ANCHORS.has(row.countryCode)) continue;
            assert.ok(
-              row.overallScore <= LOW_BAND_CEILING,
-              `${row.countryCode} (${row.countryName}) is a low-band anchor and must stay ≤${LOW_BAND_CEILING}, got ${row.overallScore}`,
+              row.overallScore <= lowCeiling,
+              `${row.countryCode} (${row.countryName}) is a low-band anchor and must stay ≤${lowCeiling} under "${snapshot.methodologyFormula ?? 'domain-weighted-6d'}", got ${row.overallScore}`,
            );
          }
        });
@@ -293,23 +356,130 @@ describe('resilience-ranking snapshots', () => {
          assert.equal(snapshot.totals.greyedOutCount, snapshot.greyedOut.length);
        });

-        it('live band anchors sit in their expected bands (structural sanity)', () => {
+        it('live band anchors sit in their expected bands (methodology-aware structural sanity)', () => {
+          const { highFloor, lowCeiling } = resolveBands((snapshot as SnapshotLive & { methodologyFormula?: string }).methodologyFormula);
          for (const item of snapshot.items) {
            if (HIGH_BAND_ANCHORS.has(item.countryCode)) {
              assert.ok(
-                item.overallScore >= HIGH_BAND_FLOOR,
-                `${item.countryCode} is a high-band anchor but scored ${item.overallScore} (< ${HIGH_BAND_FLOOR}) at rank ${item.rank}`,
+                item.overallScore >= highFloor,
+                `${item.countryCode} is a high-band anchor but scored ${item.overallScore} (< ${highFloor}) at rank ${item.rank}`,
              );
            }
            if (LOW_BAND_ANCHORS.has(item.countryCode)) {
              assert.ok(
-                item.overallScore <= LOW_BAND_CEILING,
-                `${item.countryCode} is a low-band anchor but scored ${item.overallScore} (> ${LOW_BAND_CEILING}) at rank ${item.rank}`,
+                item.overallScore <= lowCeiling,
+                `${item.countryCode} is a low-band anchor but scored ${item.overallScore} (> ${lowCeiling}) at rank ${item.rank}`,
              );
            }
          }
        });
      }
+
+      if (isProjected(snapshot)) {
+        // Projected snapshots are preview artifacts built from a
+        // sample (e.g. the 52-country sensitivity capture) against the
+        // proposed formula. They carry in-sample ranks, not global
+        // ranks, and use different table keys (topTenInSample rather
+        // than topTen) to avoid being mistaken for authoritative
+        // captures. Still validated here so the artifact does not ship
+        // with broken shape or out-of-band scores.
+
+        it('projected snapshot declares a known methodologyFormula', () => {
+          const known = new Set(['domain-weighted-6d', 'pillar-combined-penalized-v1']);
+          assert.ok(
+            known.has(snapshot.methodologyFormula),
+            `projected snapshot methodologyFormula="${snapshot.methodologyFormula}" must be one of [${[...known].join(', ')}]; add it to METHODOLOGY_BANDS at the top of this file when introducing a new formula id`,
+          );
+        });
+
+        it('projected topTenInSample ranks are 1..10, scores descend, every score in (0, 100)', () => {
+          const rows = snapshot.tables.topTenInSample;
+          assert.equal(rows.length, 10);
+          for (let i = 0; i < rows.length; i++) {
+            assert.equal(rows[i]!.rankInSample, i + 1, `topTenInSample[${i}].rankInSample should be ${i + 1}, got ${rows[i]!.rankInSample}`);
+            assert.ok(
+              rows[i]!.proposedOverallScore > 0 && rows[i]!.proposedOverallScore < 100,
+              `${rows[i]!.countryCode} proposedOverallScore=${rows[i]!.proposedOverallScore} must be in (0, 100)`,
+            );
+            if (i > 0) {
+              assert.ok(
+                rows[i]!.proposedOverallScore <= rows[i - 1]!.proposedOverallScore,
+                `topTenInSample must be monotonically non-increasing at in-sample rank ${rows[i]!.rankInSample}: ${rows[i - 1]!.proposedOverallScore} → ${rows[i]!.proposedOverallScore}`,
+              );
+            }
+          }
+        });
+
+        it('projected bottomTenInSample ranks are contiguous and descend in score', () => {
+          const rows = snapshot.tables.bottomTenInSample;
+          assert.equal(rows.length, 10);
+          for (let i = 1; i < rows.length; i++) {
+            assert.equal(
+              rows[i]!.rankInSample,
+              rows[i - 1]!.rankInSample + 1,
+              `bottomTenInSample ranks must be contiguous: ${rows[i - 1]!.rankInSample} then ${rows[i]!.rankInSample}`,
+            );
+            assert.ok(
+              rows[i]!.proposedOverallScore <= rows[i - 1]!.proposedOverallScore,
+              `bottomTenInSample scores must not increase with worsening rank: ${rows[i - 1]!.countryCode}=${rows[i - 1]!.proposedOverallScore} then ${rows[i]!.countryCode}=${rows[i]!.proposedOverallScore}`,
+            );
+          }
+          assert.equal(
+            rows[rows.length - 1]!.rankInSample,
+            snapshot.totals.rankedCountriesInSample,
+            `bottomTenInSample.last.rankInSample=${rows[rows.length - 1]!.rankInSample} must equal totals.rankedCountriesInSample=${snapshot.totals.rankedCountriesInSample}`,
+          );
+        });
+
+        it('projected scoreDelta equals proposed − current to within rounding', () => {
+          const all = [
+            ...snapshot.tables.topTenInSample,
+            ...snapshot.tables.bottomTenInSample,
+            ...snapshot.tables.majorEconomiesInSample,
+          ];
+          for (const row of all) {
+            const expected = Math.round((row.proposedOverallScore - row.currentOverallScore) * 100) / 100;
+            assert.ok(
+              Math.abs(row.scoreDelta - expected) < 0.02,
+              `${row.countryCode} scoreDelta=${row.scoreDelta} must equal proposed − current = ${expected}`,
+            );
+          }
+        });
+
+        it('projected band anchors sit in their expected bands under the declared methodology', () => {
+          const { highFloor, lowCeiling } = resolveBands(snapshot.methodologyFormula);
+          for (const row of snapshot.tables.topTenInSample) {
+            if (!HIGH_BAND_ANCHORS.has(row.countryCode)) continue;
+            assert.ok(
+              row.proposedOverallScore >= highFloor,
+              `${row.countryCode} is a high-band anchor in topTenInSample but scored ${row.proposedOverallScore} (< ${highFloor}) under "${snapshot.methodologyFormula}"`,
+            );
+          }
+          for (const row of snapshot.tables.bottomTenInSample) {
+            if (!LOW_BAND_ANCHORS.has(row.countryCode)) continue;
+            assert.ok(
+              row.proposedOverallScore <= lowCeiling,
+              `${row.countryCode} is a low-band anchor in bottomTenInSample but scored ${row.proposedOverallScore} (> ${lowCeiling}) under "${snapshot.methodologyFormula}"`,
+            );
+          }
+        });
+
+        it('projected snapshot does not confuse itself with a live-universe capture', () => {
+          // Two structural guards so a projected snapshot cannot
+          // silently slip into the authoritative slot: it must NOT
+          // carry the full-universe top/bottom keys, and its file
+          // slug must identify it as a preview.
+          assert.equal(
+            (snapshot as unknown as SnapshotPublished).tables?.topTen,
+            undefined,
+            'projected snapshots must not also expose tables.topTen (reserved for authoritative captures)',
+          );
+          assert.ok(
+            !/^resilience-ranking-\d{4}-\d{2}-\d{2}\.json$/.test(filename),
+            `projected snapshots must use a slug-prefixed filename, got ${filename}`,
+          );
+        });
+      }
    });
  }
 });
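Reviewer note: the snapshot discovery and band lookup in this file can be exercised in isolation with a sketch like the one below. The two regular expressions and the band values are copied from the diff above. `classifySnapshotFilename`, `SnapshotKind`, and `sketchResolveBands` are hypothetical names introduced here for illustration only.

```typescript
// Hypothetical sketch; the regexes and band values are copied from the diff.
const SLUG_OR_DATE_RE = /^resilience-ranking-(?:[a-z][a-z0-9-]*-)?\d{4}-\d{2}-\d{2}\.json$/;
const DATE_ONLY_RE = /^resilience-ranking-\d{4}-\d{2}-\d{2}\.json$/;

type SnapshotKind = 'authoritative' | 'slugged' | 'ignored';

function classifySnapshotFilename(name: string): SnapshotKind {
  // Date-only names are reserved for published or live captures.
  if (DATE_ONLY_RE.test(name)) return 'authoritative';
  // Slug-prefixed names (e.g. projected previews) are still discovered.
  if (SLUG_OR_DATE_RE.test(name)) return 'slugged';
  return 'ignored';
}

interface Bands {
  highFloor: number;
  lowCeiling: number;
}

const SKETCH_BANDS: Record<string, Bands> = {
  'domain-weighted-6d': { highFloor: 70, lowCeiling: 45 },
  'pillar-combined-penalized-v1': { highFloor: 60, lowCeiling: 40 },
};

function sketchResolveBands(formula: string | undefined): Bands {
  // Missing or unknown formula ids fall back to the legacy 6-domain bands.
  return SKETCH_BANDS[formula ?? 'domain-weighted-6d'] ?? SKETCH_BANDS['domain-weighted-6d']!;
}

const kinds = [
  'resilience-ranking-2026-04-21.json',
  'resilience-ranking-pillar-combined-projected-2026-04-21.json',
  'resilience-pillar-sensitivity-2026-04-21.json',
].map(classifySnapshotFilename);

const legacyBands = sketchResolveBands(undefined);
const pillarBands = sketchResolveBands('pillar-combined-penalized-v1');
```

The sensitivity capture is ignored because its name does not start with `resilience-ranking-`, so only ranking snapshots are validated against the bands.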
--- a/tests/resilience-ranking.test.mts
+++ b/tests/resilience-ranking.test.mts
@@ -47,44 +47,137 @@ describe('resilience ranking contracts', () => {

  it('returns the cached ranking payload unchanged when the ranking cache already exists', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
-    const cached = {
+    const cachedPublic = {
      items: [
        { countryCode: 'NO', overallScore: 82, level: 'high', lowConfidence: false, overallCoverage: 0.95 },
        { countryCode: 'US', overallScore: 61, level: 'medium', lowConfidence: false, overallCoverage: 0.88 },
      ],
      greyedOut: [],
    };
-    redis.set('resilience:ranking:v9', JSON.stringify(cached));
+    // The handler's stale-formula gate rejects untagged ranking entries,
+    // so fixtures must carry the `_formula` tag matching the current env
+    // (default flag-off ⇒ 'd6'). Writing the tagged shape here mirrors
+    // what the handler persists via stampRankingCacheTag.
+    redis.set('resilience:ranking:v10', JSON.stringify({ ...cachedPublic, _formula: 'd6' }));

    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});

-    assert.deepEqual(response, cached);
-    assert.equal(redis.has('resilience:score:v9:YE'), false, 'cache hit must not trigger score warmup');
+    // The handler strips `_formula` before returning, so response matches
+    // the public shape rather than the on-wire cache shape.
+    assert.deepEqual(response, cachedPublic);
+    assert.equal(redis.has('resilience:score:v10:YE'), false, 'cache hit must not trigger score warmup');
  });

  it('returns all-greyed-out cached payload without rewarming (items=[], greyedOut non-empty)', async () => {
    // Regression for: `cached?.items?.length` was falsy when items=[] even though
    // greyedOut had entries, causing unnecessary rewarming on every request.
    const { redis } = installRedis(RESILIENCE_FIXTURES);
-    const cached = {
+    const cachedPublic = {
      items: [],
      greyedOut: [
        { countryCode: 'SS', overallScore: 12, level: 'critical', lowConfidence: true, overallCoverage: 0.15 },
        { countryCode: 'ER', overallScore: 10, level: 'critical', lowConfidence: true, overallCoverage: 0.12 },
      ],
    };
-    redis.set('resilience:ranking:v9', JSON.stringify(cached));
+    redis.set('resilience:ranking:v10', JSON.stringify({ ...cachedPublic, _formula: 'd6' }));

    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});

-    assert.deepEqual(response, cached);
-    assert.equal(redis.has('resilience:score:v9:SS'), false, 'all-greyed-out cache hit must not trigger score warmup');
+    assert.deepEqual(response, cachedPublic);
+    assert.equal(redis.has('resilience:score:v10:SS'), false, 'all-greyed-out cache hit must not trigger score warmup');
+  });
+
+  it('bulk-read path skips untagged per-country score entries (legacy writes must rebuild on flip)', async () => {
+    // Pins the fix for a subtle bug: getCachedResilienceScores used
+    // `parsed._formula && parsed._formula !== current` which short-
+    // circuits on undefined. An untagged score entry — produced by a
+    // pre-PR code path or by an external writer that has not been
+    // updated — would therefore be ADMITTED into the ranking under the
+    // current formula instead of being treated as stale and re-warmed.
+    // On activation day that would mean a mixed-formula ranking for up
+    // to the 6h score TTL even though the single-country cache-miss
+    // path (ensureResilienceScoreCached) correctly invalidates the
+    // same entry. This test writes two per-country score keys, one
+    // tagged `_formula: 'd6'` and one untagged, and asserts the
+    // ranking warm path runs for the untagged country (meaning the
+    // bulk read skipped it).
+    const { redis } = installRedis(RESILIENCE_FIXTURES);
+    redis.set('resilience:static:index:v1', JSON.stringify({
+      countries: ['NO', 'US'],
+      recordCount: 2,
+      failedDatasets: [],
+      seedYear: 2026,
+    }));
+
+    const domain = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
+    // Tagged entry: served as-is.
+    redis.set('resilience:score:v10:NO', JSON.stringify({
+      countryCode: 'NO', overallScore: 82, level: 'high',
+      domains: domain, trend: 'stable', change30d: 1.2,
+      lowConfidence: false, imputationShare: 0.05, _formula: 'd6',
+    }));
+    // Untagged entry: must be rejected, ranking warm rebuilds US.
+    redis.set('resilience:score:v10:US', JSON.stringify({
+      countryCode: 'US', overallScore: 61, level: 'medium',
+      domains: domain, trend: 'rising', change30d: 4.3,
+      lowConfidence: false, imputationShare: 0.1,
+      // NOTE: no _formula field.
+    }));
+
+    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
+
+    // After the ranking run, the US entry in Redis must now carry
+    // `_formula: 'd6'`. If the bulk read had ADMITTED the untagged
+    // entry (the pre-fix bug), the warm path for US would not have
+    // run, and the stored value would still be untagged.
+    const rewrittenRaw = redis.get('resilience:score:v10:US');
+    assert.ok(rewrittenRaw, 'US entry must remain in Redis after the ranking run');
+    const rewritten = JSON.parse(rewrittenRaw!);
+    assert.equal(
+      rewritten._formula,
+      'd6',
+      'untagged US entry must be rejected by the bulk read so the warm path rebuilds it with the current formula tag. If `_formula` is still undefined here, getCachedResilienceScores is admitting untagged entries.',
+    );
+  });
+
+  it('rejects a stale-formula ranking cache entry and recomputes even without ?refresh=1', async () => {
+    // Pins the cross-formula isolation: when the env flag is off (default)
+    // and the ranking cache carries _formula='pc' (written during a prior
+    // flag-on deploy that has since been rolled back), the handler must
+    // NOT serve the stale-formula entry. It must recompute from the
+    // per-country scores instead. Without this behavior, a flag
+    // rollback would leave the old ranking in place for up to the 12h
+    // ranking TTL even though scores were already back on the 6-domain
+    // formula.
+    const { redis } = installRedis(RESILIENCE_FIXTURES);
+    const stale = {
+      items: [
+        { countryCode: 'NO', overallScore: 99, level: 'high', lowConfidence: false, overallCoverage: 0.95 },
+      ],
+      greyedOut: [],
+      _formula: 'pc', // mismatched — current env is flag-off ⇒ current='d6'
+    };
+    redis.set('resilience:ranking:v10', JSON.stringify(stale));
+
+    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
+
+    assert.notDeepEqual(
+      response,
+      { items: stale.items, greyedOut: stale.greyedOut },
+      'stale-formula ranking must be rejected, not served',
+    );
+    // Recompute path warms missing per-country scores, so YE (in
+    // RESILIENCE_FIXTURES) must get scored during this call.
+    assert.ok(
+      redis.has('resilience:score:v10:YE'),
+      'stale-formula reject must trigger the recompute-and-warm path',
+    );
  });

  it('warms missing scores synchronously and returns complete ranking on first call', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const domainWithCoverage = [{ name: 'political', dimensions: [{ name: 'd1', coverage: 0.9 }] }];
-    redis.set('resilience:score:v9:NO', JSON.stringify({
+    redis.set('resilience:score:v10:NO', JSON.stringify({
      countryCode: 'NO',
      overallScore: 82,
      level: 'high',
@@ -94,7 +187,7 @@ describe('resilience ranking contracts', () => {
      lowConfidence: false,
      imputationShare: 0.05,
    }));
-    redis.set('resilience:score:v9:US', JSON.stringify({
+    redis.set('resilience:score:v10:US', JSON.stringify({
      countryCode: 'US',
      overallScore: 61,
      level: 'medium',
@@ -109,20 +202,20 @@ describe('resilience ranking contracts', () => {

    const totalItems = response.items.length + (response.greyedOut?.length ?? 0);
    assert.equal(totalItems, 3, `expected 3 total items across ranked + greyedOut, got ${totalItems}`);
-    assert.ok(redis.has('resilience:score:v9:YE'), 'missing country should be warmed during first call');
+    assert.ok(redis.has('resilience:score:v10:YE'), 'missing country should be warmed during first call');
    assert.ok(response.items.every((item) => item.overallScore >= 0), 'ranked items should all have computed scores');
-    assert.ok(redis.has('resilience:ranking:v9'), 'fully scored ranking should be cached');
+    assert.ok(redis.has('resilience:ranking:v10'), 'fully scored ranking should be cached');
  });

  it('sets rankStable=true when interval data exists and width <= 8', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const domainWithCoverage = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
-    redis.set('resilience:score:v9:NO', JSON.stringify({
+    redis.set('resilience:score:v10:NO', JSON.stringify({
      countryCode: 'NO', overallScore: 82, level: 'high',
      domains: domainWithCoverage, trend: 'stable', change30d: 1.2,
      lowConfidence: false, imputationShare: 0.05,
    }));
-    redis.set('resilience:score:v9:US', JSON.stringify({
+    redis.set('resilience:score:v10:US', JSON.stringify({
      countryCode: 'US', overallScore: 61, level: 'medium',
      domains: domainWithCoverage, trend: 'rising', change30d: 4.3,
      lowConfidence: false, imputationShare: 0.1,
@@ -149,12 +242,12 @@ describe('resilience ranking contracts', () => {
      seedYear: 2025,
    }));
    const domainWithCoverage = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
-    redis.set('resilience:score:v9:NO', JSON.stringify({
+    redis.set('resilience:score:v10:NO', JSON.stringify({
      countryCode: 'NO', overallScore: 82, level: 'high',
      domains: domainWithCoverage, trend: 'stable', change30d: 1.2,
      lowConfidence: false, imputationShare: 0.05,
    }));
-    redis.set('resilience:score:v9:US', JSON.stringify({
+    redis.set('resilience:score:v10:US', JSON.stringify({
      countryCode: 'US', overallScore: 61, level: 'medium',
      domains: domainWithCoverage, trend: 'rising', change30d: 4.3,
      lowConfidence: false, imputationShare: 0.1,
@@ -164,7 +257,7 @@ describe('resilience ranking contracts', () => {

    // 3 of 4 (NO + US pre-cached, YE warmed from fixtures, ZZ can't be warmed)
    // = 75% which meets the threshold — must cache.
-    assert.ok(redis.has('resilience:ranking:v9'), 'ranking must be cached at exactly 75% coverage');
+    assert.ok(redis.has('resilience:ranking:v10'), 'ranking must be cached at exactly 75% coverage');
    assert.ok(redis.has('seed-meta:resilience:ranking'), 'seed-meta must be written alongside the ranking');
  });

@@ -195,7 +288,7 @@ describe('resilience ranking contracts', () => {
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        const commands = JSON.parse(init.body) as Array<Array<string>>;
        const allScoreReads = commands.length > 0 && commands.every(
-          (cmd) => cmd[0] === 'GET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v9:'),
+          (cmd) => cmd[0] === 'GET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v10:'),
        );
        if (allScoreReads) {
          // Simulate visibility lag: pretend no scores are cached yet.
@@ -211,7 +304,7 @@ describe('resilience ranking contracts', () => {

    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});

-    assert.ok(redis.has('resilience:ranking:v9'), 'ranking must be published despite pipeline-GET race');
+    assert.ok(redis.has('resilience:ranking:v10'), 'ranking must be published despite pipeline-GET race');
    assert.ok(redis.has('seed-meta:resilience:ranking'), 'seed-meta must be written despite pipeline-GET race');
  });

@@ -219,8 +312,8 @@ describe('resilience ranking contracts', () => {
    // Reviewer regression: passing `raw=true` to runRedisPipeline bypasses the
    // env-based key prefix (preview: / dev:) that isolates preview deploys
    // from production. The symptom is asymmetric: preview reads hit
-    // `preview:<sha>:resilience:score:v9:XX` while preview writes landed at
-    // raw `resilience:score:v9:XX`, simultaneously (a) missing the preview
+    // `preview:<sha>:resilience:score:v10:XX` while preview writes landed at
+    // raw `resilience:score:v10:XX`, simultaneously (a) missing the preview
    // cache forever and (b) poisoning production's shared cache. Simulate a
    // preview deploy and assert the pipeline SET keys carry the prefix.
    // Shared afterEach snapshots/restores VERCEL_ENV + VERCEL_GIT_COMMIT_SHA
@@ -252,7 +345,7 @@ describe('resilience ranking contracts', () => {

    const scoreSetKeys = pipelineBodies
      .flat()
-      .filter((cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v9:'))
+      .filter((cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v10:'))
      .map((cmd) => cmd[1] as string);
    assert.ok(scoreSetKeys.length >= 2, `expected at least 2 score SETs, got ${scoreSetKeys.length}`);
    for (const key of scoreSetKeys) {
@@ -280,8 +373,14 @@ describe('resilience ranking contracts', () => {
        failedDatasets: [],
        seedYear: 2026,
      }));
-      const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [] };
-      redis.set('resilience:ranking:v9', JSON.stringify(stale));
+      // Stale sentinel tagged with the current (flag-off default)
+      // formula so the cross-formula invalidation does NOT fire here —
+      // these refresh-auth tests exercise the auth gate, not the
+      // formula check. An untagged sentinel would be silently
+      // rejected by the formula gate and the refresh path would not
+      // get tested as intended.
+      const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [], _formula: 'd6' };
+      redis.set('resilience:ranking:v10', JSON.stringify(stale));

      // No X-WorldMonitor-Key → refresh must be ignored, stale cache returned.
      const unauth = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1');
@@ -328,8 +427,14 @@ describe('resilience ranking contracts', () => {
      }));
      // Seed a pre-existing ranking so the cache-hit early-return would
      // normally fire. ?refresh=1 (with valid seed key) must ignore it.
-      const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [] };
-      redis.set('resilience:ranking:v9', JSON.stringify(stale));
+      // Stale sentinel tagged with the current (flag-off default)
+      // formula so the cross-formula invalidation does NOT fire here —
+      // these refresh-auth tests exercise the auth gate, not the
+      // formula check. An untagged sentinel would be silently
+      // rejected by the formula gate and the refresh path would not
+      // get tested as intended.
+      const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [], _formula: 'd6' };
+      redis.set('resilience:ranking:v10', JSON.stringify(stale));

      const request = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1', {
        headers: { 'X-WorldMonitor-Key': 'seed-secret' },
@@ -364,7 +469,7 @@ describe('resilience ranking contracts', () => {
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        const commands = JSON.parse(init.body) as Array<Array<string>>;
        const isAllScoreSets = commands.length > 0 && commands.every(
-          (cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v9:'),
+          (cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v10:'),
        );
        if (isAllScoreSets) setPipelineSizes.push(commands.length);
      }
@@ -396,7 +501,7 @@ describe('resilience ranking contracts', () => {
      seedYear: 2026,
    }));

-    // Intercept any pipeline SET to resilience:score:v9:* and reply with
+    // Intercept any pipeline SET to resilience:score:v10:* and reply with
    // non-OK results (persisted but authoritative signal says no). /set and
    // other paths pass through normally so history/interval writes succeed.
    const blockedScoreWrites = (async (input: RequestInfo | URL, init?: RequestInit) => {
@@ -404,7 +509,7 @@ describe('resilience ranking contracts', () => {
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        const commands = JSON.parse(init.body) as Array<Array<string>>;
        const allScoreSets = commands.length > 0 && commands.every(
-          (cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v9:'),
+          (cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v10:'),
        );
        if (allScoreSets) {
          return new Response(
@@ -419,7 +524,7 @@ describe('resilience ranking contracts', () => {

    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});

-    assert.ok(!redis.has('resilience:ranking:v9'), 'ranking must NOT be published when score writes failed');
+    assert.ok(!redis.has('resilience:ranking:v10'), 'ranking must NOT be published when score writes failed');
    assert.ok(!redis.has('seed-meta:resilience:ranking'), 'seed-meta must NOT be written when score writes failed');
  });

--- a/tests/resilience-scores-seed.test.mjs
+++ b/tests/resilience-scores-seed.test.mjs
@@ -10,12 +10,12 @@ import {
 } from '../scripts/seed-resilience-scores.mjs';

 describe('exported constants', () => {
-  it('RESILIENCE_RANKING_CACHE_KEY matches server-side key (v9)', () => {
-    assert.equal(RESILIENCE_RANKING_CACHE_KEY, 'resilience:ranking:v9');
+  it('RESILIENCE_RANKING_CACHE_KEY matches server-side key (v10)', () => {
+    assert.equal(RESILIENCE_RANKING_CACHE_KEY, 'resilience:ranking:v10');
  });

-  it('RESILIENCE_SCORE_CACHE_PREFIX matches server-side prefix (v9)', () => {
-    assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v9:');
+  it('RESILIENCE_SCORE_CACHE_PREFIX matches server-side prefix (v10)', () => {
+    assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v10:');
  });

  it('RESILIENCE_RANKING_CACHE_TTL_SECONDS is 12 hours (2x cron interval)', () => {