feat(resilience): flag-gated pillar-combined score activation (default off) (#3267)

Wires the non-compensatory 3-pillar combined overall_score behind a
RESILIENCE_PILLAR_COMBINE_ENABLED env flag. Default is false so this PR
ships zero behavior change in production. When flipped to true, the
top-level overall_score switches from the 6-domain weighted aggregate
to penalizedPillarScore(pillars) with alpha 0.5 and pillar weights
0.40 / 0.35 / 0.25.
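
A minimal sketch of the combine described above, assuming the form recorded
in the projected snapshot's overallScoreFormula field (weighted pillar sum
multiplied by a penalty factor driven by the weakest pillar). The Pillar
type, the empty-input guard, and the function body are illustrative and are
not the code in _shared.ts; the inputs are Switzerland's pillar scores from
the projected snapshot.

```typescript
// Illustrative sketch only; the production implementation lives in _shared.ts.
interface Pillar {
  id: string;
  score: number;  // 0-100
  weight: number; // pillar weights sum to 1.00
}

const PENALTY_ALPHA = 0.5;

function penalizedPillarScore(pillars: Pillar[], alpha: number = PENALTY_ALPHA): number {
  // Hypothetical guard: the real caller falls back to the 6-domain aggregate
  // when there are no pillars, so this branch is only a placeholder.
  if (pillars.length === 0) return 0;
  // Compensatory part: plain weighted sum of pillar scores.
  const weighted = pillars.reduce((sum, p) => sum + p.score * p.weight, 0);
  // Non-compensatory part: the weakest pillar scales the whole score down.
  const minPillar = Math.min(...pillars.map((p) => p.score));
  const penalty = 1 - alpha * (1 - minPillar / 100);
  return weighted * penalty;
}

const switzerland: Pillar[] = [
  { id: 'structural-readiness', score: 82.34, weight: 0.40 },
  { id: 'live-shock-exposure', score: 78.94, weight: 0.35 },
  { id: 'recovery-capacity', score: 84.86, weight: 0.25 },
];

const chScore = penalizedPillarScore(switzerland);
console.log(chScore.toFixed(2)); // 73.17, matching proposedOverallScore for CH
```

Because the penalty factor only reaches 1 when the weakest pillar scores 100,
the combined score never exceeds the weighted pillar sum.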

Evidence from docs/snapshots/resilience-pillar-sensitivity-2026-04-21:
- Spearman rank correlation current vs proposed 0.9935
- Mean score delta -13.44 points (every country drops because the
  penalty factor is always at most 1)
- Max top-50 rank swing 6 positions (Russia)
- No ceiling or floor effects under plus/minus 20 percent perturbation
- Release gate PASS 0/19

Code change in server/worldmonitor/resilience/v1/_shared.ts:
- New isPillarCombineEnabled() reads env dynamically so tests can flip
  state without reloading the module
- overallScore branches on (isPillarCombineEnabled() AND
  RESILIENCE_SCHEMA_V2_ENABLED AND pillars.length > 0); otherwise falls
  through to the 6-domain aggregate (unchanged default path)
- RESILIENCE_SCORE_CACHE_PREFIX bumped v9 to v10
- RESILIENCE_RANKING_CACHE_KEY bumped v9 to v10
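
The branch described above can be sketched as follows. To keep the example
self-contained, the env is passed in as a plain object rather than read from
process.env, and the schema flag is read per call as well (in _shared.ts it
is a module-level constant). selectOverallScore, ScoreInputs, and Env are
hypothetical names used only for illustration.

```typescript
// Illustrative sketch; names other than the two env flags are hypothetical.
type Env = Record<string, string | undefined>;

// Read the flag on every call instead of caching it at module load,
// so a test can mutate the env between cases without re-importing.
function isPillarCombineEnabled(env: Env): boolean {
  return (env['RESILIENCE_PILLAR_COMBINE_ENABLED'] ?? 'false').toLowerCase() === 'true';
}

function isSchemaV2Enabled(env: Env): boolean {
  return (env['RESILIENCE_SCHEMA_V2_ENABLED'] ?? 'true').toLowerCase() === 'true';
}

interface ScoreInputs {
  domainAggregate: number; // 6-domain weighted aggregate (default path)
  pillarCombined: number;  // penalized 3-pillar score
  pillarCount: number;
}

function selectOverallScore(env: Env, inputs: ScoreInputs): number {
  const usePillars =
    isPillarCombineEnabled(env) && isSchemaV2Enabled(env) && inputs.pillarCount > 0;
  return usePillars ? inputs.pillarCombined : inputs.domainAggregate;
}

// US values from the projected snapshot.
const env: Env = {};
const us: ScoreInputs = { domainAggregate: 68.26, pillarCombined: 54.5, pillarCount: 3 };

const flagOff = selectOverallScore(env, us);
env['RESILIENCE_PILLAR_COMBINE_ENABLED'] = 'true';
const flagOn = selectOverallScore(env, us);
env['RESILIENCE_PILLAR_COMBINE_ENABLED'] = 'false';
const flagBackOff = selectOverallScore(env, us);
```

Flipping the flag back off returns the 6-domain value, which is the
regression the activation test guards against.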

Cache invalidation: the version bump forces both per-country score
cache and ranking cache to recompute from the current code path on
first read after a flag flip. Without the bump, 6-domain values cached
under the flag-off path would continue to serve for up to 6-12 hours
after the flip, producing a ragged mix of formulas.
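
A small sketch of why the prefix bump matters, using an in-memory Map in
place of Redis. scoreKey and readScore are hypothetical helpers; only the
key shape resilience:score:<version>:<countryCode> comes from this change.

```typescript
// Illustrative sketch; a Map stands in for Redis and TTLs are not modelled.
const cache = new Map<string, number>();

function scoreKey(version: string, countryCode: string): string {
  return `resilience:score:${version}:${countryCode}`;
}

// Cache-aside read: return the cached value if present, otherwise compute and store.
function readScore(version: string, countryCode: string, compute: () => number): number {
  const key = scoreKey(version, countryCode);
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = compute();
  cache.set(key, fresh);
  return fresh;
}

// A 6-domain value written before the bump.
cache.set(scoreKey('v9', 'US'), 68.26);

// Same prefix: the old value keeps being served even though the formula changed.
const staleRead = readScore('v9', 'US', () => 54.5);

// Bumped prefix: the old entry is never consulted, so the score is recomputed.
const bumpedRead = readScore('v10', 'US', () => 54.5);
```

The v9 entry is left to expire on its own TTL; nothing reads it once every
consumer uses the v10 prefix.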

Ripple of v9 to v10:
- api/health.js registry entry
- scripts/seed-resilience-scores.mjs (both keys)
- scripts/validate-resilience-correlation.mjs,
  scripts/backtest-resilience-outcomes.mjs,
  scripts/validate-resilience-backtest.mjs,
  scripts/benchmark-resilience-external.mjs
- tests/resilience-ranking.test.mts 24 fixture usages
- tests/resilience-handlers.test.mts
- tests/resilience-scores-seed.test.mjs explicit pin
- tests/resilience-pillar-aggregation.test.mts explicit pin
- docs/methodology/country-resilience-index.mdx

New tests/resilience-pillar-combine-activation.test.mts:
7 assertions exercising the flag-on path against the release fixtures
with re-anchored bands (NO at least 60, YE/SO at most 40, NO greater
than US preserved, elite greater than fragile). Regression guard
verifies flipping the flag back off restores the 6-domain aggregate.

tests/resilience-ranking-snapshot.test.mts: band thresholds now
resolve from a METHODOLOGY_BANDS table keyed on
snapshot.methodologyFormula. Backward compatible (missing formula
defaults to domain-weighted-6d bands).
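
A sketch of the lookup described above. The formula keys are the two
methodologyFormula values used by the snapshots in this change; the Bands
field names and resolveBands are hypothetical, and the threshold values are
the ones quoted in the methodology doc (high-band floor 70 vs 60, low-band
ceiling 45 vs 40), not copied from the test file.

```typescript
// Illustrative sketch; field names and the helper are assumptions.
interface Bands {
  highMin: number; // high-band anchors must score at least this
  lowMax: number;  // low-band anchors must score at most this
}

const DEFAULT_FORMULA = 'domain-weighted-6d';

const METHODOLOGY_BANDS: Record<string, Bands> = {
  'domain-weighted-6d': { highMin: 70, lowMax: 45 },
  'pillar-combined-penalized-v1': { highMin: 60, lowMax: 40 },
};

function resolveBands(snapshot: { methodologyFormula?: string }): Bands {
  // Snapshots captured before the tag existed carry no formula: treat them as 6-domain.
  const formula = snapshot.methodologyFormula ?? DEFAULT_FORMULA;
  const bands = METHODOLOGY_BANDS[formula];
  if (!bands) throw new Error(`Unknown methodologyFormula: ${formula}`);
  return bands;
}

const legacyBands = resolveBands({});
const pillarBands = resolveBands({ methodologyFormula: 'pillar-combined-penalized-v1' });
```

Failing loudly on an unknown formula is a design choice of this sketch; it
keeps a mistyped tag from silently inheriting the 6-domain bands.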

Snapshots:
- docs/snapshots/resilience-ranking-2026-04-21.json tagged
  methodologyFormula domain-weighted-6d
- docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json
  new: top/bottom/major-economies tables projected from the
  52-country sensitivity sample. Explicitly tagged projected (NOT a
  full-universe live capture). When the flag is flipped in production,
  run scripts/freeze-resilience-ranking.mjs to capture the
  authoritative full-universe snapshot.

Methodology doc: Pillar-combined score activation section rewritten to
describe the flag-gated mechanism (activation is an env-var flip, no
code deploy) and the rollback path.

Verification: npm run typecheck:all clean, 397/397 resilience tests
pass (up from 390, +7 activation tests).

Activation plan:
1. Merge this PR with flag default false (zero behavior change)
2. Set RESILIENCE_PILLAR_COMBINE_ENABLED=true in Vercel and Railway env
3. Redeploy or wait for next cold start; v9 to v10 bump forces every
   country to be rescored on first read
4. Run scripts/freeze-resilience-ranking.mjs against the flag-on
   deployment and commit the resulting snapshot
5. Ship a v2.0 methodology-change note explaining the re-anchored
   scale so analysts understand the universal ~13 point score drop is
   a scale rebase, not a country-level regression

Rollback: set RESILIENCE_PILLAR_COMBINE_ENABLED=false, flush
resilience:score:v10:* and resilience:ranking:v10 keys (or wait for
TTLs). The 6-domain formula stays alongside the pillar combine in
_shared.ts and needs no code change to come back.
This commit is contained in:
Elie Habib
2026-04-22 06:52:07 +04:00
committed by GitHub
parent 502bd4472c
commit fbaf07e106
16 changed files with 1434 additions and 129 deletions


@@ -161,7 +161,7 @@ const STANDALONE_KEYS = {
pizzint: 'intelligence:pizzint:seed:v1',
resilienceStaticIndex: 'resilience:static:index:v1',
resilienceStaticFao: 'resilience:static:fao',
-resilienceRanking: 'resilience:ranking:v9',
+resilienceRanking: 'resilience:ranking:v10',
productCatalog: 'product-catalog:v2',
energySpineCountries: 'energy:spine:v1:_countries',
energyExposure: 'energy:exposure:v1:index',


@@ -5,11 +5,16 @@ description: "Real-time resilience scoring for ~220 countries across 6 domains a
The WorldMonitor Country Resilience Index (CRI) scores every country in the world on a 0-100 scale, combining long-run structural capacity with current operational stress to produce an actionable resilience metric. Rather than relying on static country risk ratings, the CRI updates every 6 hours from official and authoritative sources and exposes full provenance, coverage, and imputation context so analysts can see exactly *why* a score moved and how much of it is real data versus imputed.
This document is the v1.0 reference for the live product. A planned v2.0 upgrade will rebuild the top-level shape into three pillars (structural readiness, live shock exposure, recovery capacity) with a partly non-compensatory aggregation, and ship an annual Reference Edition at citation quality. That work is tracked in a separate reference-grade upgrade plan and is not yet shipped; everything documented below describes the **current shipping behavior**.
This document describes the **currently shipping** behavior of the index. The versioning has two independent axes:
- **Response shape**: `schemaVersion: "2.0"` is the current default. Every response carries a real coverage-weighted `pillars[]` array regrouping the six domains into structural readiness / live shock exposure / recovery capacity. The legacy `schemaVersion: "1.0"` shape (pillars empty) remains available via the `RESILIENCE_SCHEMA_V2_ENABLED=false` env flag for one release cycle.
- **Scoring formula**: the top-level `overall_score` is the six-domain weighted aggregate (the v1 compensatory formula). The v2 non-compensatory pillar-combined formula with a min-pillar penalty is defined, validated (see Pillar-combined score activation below), and wired behind the `RESILIENCE_PILLAR_COMBINE_ENABLED` flag, but its default is `false` — activation is an explicit operator action rather than a code deploy. The annual Reference Edition at citation quality is a separate Phase 3 deliverable and is not yet shipped.
Everything documented below describes the **currently shipping** state: schemaVersion `"2.0"` shape, 6 domains × 19 dimensions × 3 pillars, and the 6-domain weighted `overall_score`. When an operator flips the pillar-combined flag on, the subsection on [Pillar-combined score activation](#pillar-combined-score-activation-flag-gated-default-off) documents what changes.
## In the dashboard
CRI is surfaced across three places in the product, all driven from the same v1.0 score described below:
CRI is surfaced across three places in the product, all driven from the same currently-shipping score:
- **Resilience widget** — a standalone panel (component: `src/components/ResilienceWidget.ts`) that ranks countries by resilience score with filter and search affordances. Reach it from Cmd+K by typing *resilience*.
- **Country Deep-Dive** — inside the per-country drill-down panel, CRI appears alongside CII (Country Instability Index) as a structural complement to the short-horizon stress signal. CII and CRI are intentionally **not interchangeable**: CII answers "how much stress is on this country right now?"; CRI answers "how well-positioned is this country to absorb and recover from shocks?"
@@ -384,9 +389,9 @@ The CRI is designed to be auditable end-to-end: given the Redis snapshot at any
| Key | Type | TTL | Written by | Read by |
|---|---|---|---|---|
-| `resilience:score:v9:{countryCode}` | JSON | 6 hours | `buildResilienceScore` in `server/worldmonitor/resilience/v1/_shared.ts` | `getResilienceScore` handler |
-| `resilience:ranking:v9` | JSON | 6 hours | `buildResilienceRanking`, only when all countries are scored | `getResilienceRanking` handler |
-| `resilience:history:v4:{countryCode}` | sorted set | indefinite, trimmed to 30 days | `appendHistory` during scoring | trend and `change30d` computation |
+| `resilience:score:v10:{countryCode}` | JSON | 6 hours | `buildResilienceScore` in `server/worldmonitor/resilience/v1/_shared.ts` | `getResilienceScore` handler |
+| `resilience:ranking:v10` | JSON | 6 hours | `buildResilienceRanking`, only when all countries are scored | `getResilienceRanking` handler |
+| `resilience:history:v5:{countryCode}` | sorted set | indefinite, trimmed to 30 days | `appendHistory` during scoring | trend and `change30d` computation |
| `resilience:intervals:v1:{countryCode}` | JSON | 6 hours | `scripts/seed-resilience-intervals.mjs` | `getResilienceScore` (optional `scoreInterval` field) |
| `seed-meta:resilience:static` | JSON | 2 hours | `scripts/seed-resilience-static.mjs` at the end of each successful seed run | scorer for `dataVersion` population, health checks |
| `resilience:static:{countryCode}` | JSON | 400 days | `scripts/seed-resilience-static.mjs` | scorer for all baseline signals (WGI, WHO, FAO, GPI, RSF, and so on) |
@@ -394,7 +399,7 @@ The CRI is designed to be auditable end-to-end: given the Redis snapshot at any
### dataVersion semantics
The `dataVersion` field on every `GetResilienceScoreResponse` is the ISO date of the `fetchedAt` timestamp stored in `seed-meta:resilience:static`. It reflects the most recent successful run of the Railway static-seed job; the widget renders it in the footer as `Data YYYY-MM-DD`.
The `dataVersion` field on every `GetResilienceScoreResponse` is the ISO date of the `fetchedAt` timestamp stored in `seed-meta:resilience:static`. It reflects the most recent successful run of the Railway static-seed job; the widget renders it in the footer as `Seed date YYYY-MM-DD`. The label is narrower than "Data" because live inputs (conflict events, sanctions, prices) can refresh at their own cadence after the static bundle runs — per-dimension freshness is surfaced separately via the freshness badge in the confidence grid.
### Reproducing a score by hand
@@ -452,7 +457,7 @@ Self-assessed against the standard composite-indicator review axes on a 0-10 sca
### v2.0 (April 2026) — Phase 2 structural rebuild
**Current published version.** Phase 2 of the reference-grade upgrade plan (`docs/internal/country-resilience-upgrade-plan.md`). Rebuilds the top-level shape from five flat domains into three pillars (structural readiness, live shock exposure, recovery capacity) with a partly non-compensatory aggregation, adds a recovery capacity pillar with six new dimensions, and ships a full validation suite (cross-index benchmark, outcome backtest, sensitivity analysis).
**Current published version** (shape). Phase 2 of the reference-grade upgrade plan (`docs/internal/country-resilience-upgrade-plan.md`). The response-shape rebuild is live: every response now carries a real coverage-weighted `pillars[]` array regrouping the six domains into structural readiness, live shock exposure, and recovery capacity. The recovery domain adds six new dimensions, and a full validation suite (cross-index benchmark, outcome backtest, sensitivity analysis) gates the activation. The top-level `overall_score` is still computed by the six-domain weighted aggregate (v1 formula); the partly non-compensatory pillar-combined `overall_score` is defined, tested, and flag-gated (see [Pillar-combined score activation](#pillar-combined-score-activation-flag-gated-default-off)), but `RESILIENCE_PILLAR_COMBINE_ENABLED` defaults to `false` so operators can schedule the flip with a proper migration message.
- **T2.1** (#2977): Three-pillar schema added to proto and OpenAPI. `schemaVersion: "2.0"` feature flag introduced with backward-compatible `"1.0"` fallback path for one release cycle. Response now carries a `pillars` array alongside existing `domains`.
- **T2.2a** (#2979): Signal tiering registry committed. Every indicator tagged Core, Enrichment, or Experimental with per-signal coverage percentage and license audit status. Registry enforced by CI linter.
@@ -492,7 +497,17 @@ The plan's non-compensatory pillar combine is the methodologically stronger form
**Interpretation**: Rank order is strongly preserved on the 52-country sample (Spearman 0.9863 clears the ≥0.90 bar typically required for a rank-stable methodology change). The ranking *shape* (who is top-10, who is bottom-10, Lebanon below South Africa, Norway above the US) does not materially change. However, every country's absolute score drops, by ~11 points on average, because the penalty factor is always ≤ 1. Imbalanced countries with one very weak pillar (Syria, Afghanistan, Venezuela, Russia) drop the most (15-19 points), while balanced top-tier countries (Switzerland, Sweden, Denmark, Iceland, Norway) drop the least (5-7 points). This is the intended behavior: the penalty punishes pillar imbalance, and pillar imbalance is strongly correlated with state fragility.
**What this means for activation**: the rank-stability evidence supports flipping the default — there is no statistical reason to keep the legacy compensatory form. The blocker is messaging, not correctness: publishing "US = 52.65" the day after publishing "US = 65.4" without a v2.0 methodology note would look like a regression instead of a rigor upgrade. Activation is therefore scheduled as a single PR that (a) flips the default behind `RESILIENCE_PILLAR_COMBINE_ENABLED`, (b) re-anchors the release-gate bands (the current 70/35 thresholds map to roughly 60/25 in the pillar-combined scale), (c) publishes a refreshed frozen ranking snapshot, and (d) ships a methodology-change note alongside the widget. Until that PR lands, the published `overall_score` is the 6-domain weighted aggregate documented above.
**Activation sequence**: the rank-stability evidence supports flipping the default — there is no statistical reason to keep the legacy compensatory form. The blocker is messaging: publishing "US = 54.50" the day after publishing "US = 68.26" without a methodology note would look like a regression instead of a rigor upgrade. The pillar-combine activation PR wires the following so the flip is a single env-var change with no code deploy required:
1. **Feature flag**: `RESILIENCE_PILLAR_COMBINE_ENABLED`, read dynamically from `process.env` per call. Default `false`. Set to `true` in Vercel env + Railway env to activate.
2. **Cache invalidation**: per-country score cache bumped from `resilience:score:v9:` to `resilience:score:v10:`, ranking cache bumped from `resilience:ranking:v9` to `resilience:ranking:v10`, and score-history bumped from `resilience:history:v4:` to `resilience:history:v5:`. The version bumps are a clean-slate guard; the actual cross-formula isolation is the `_formula` tag written into every cached score / ranking payload and the `:d6` / `:pc` suffix on every history sorted-set member, checked at read time so a flag flip forces a rebuild without waiting for TTLs.
3. **Methodology-aware level thresholds**: `classifyResilienceLevel` reads `isPillarCombineEnabled()` and switches the high/medium cutoffs from 70/40 (6-domain) to 60/30 (pillar-combined). Without this, scale compression alone would demote FI (75.64 → 68.60) and NZ (76.26 → 67.93) from "high" to "medium" purely because the formula changed, not because anything about the country changed. The re-anchored cutoffs preserve the qualitative label for every country whose old label was correct.
4. **Re-anchored release-gate bands**: `tests/resilience-pillar-combine-activation.test.mts` pins high-band anchors (NO, CH, DK) at ≥ 60 (vs the 6-domain formula's ≥ 70 floor) and low-band anchors (YE, SO) at ≤ 40 (vs ≤ 45). The snapshot test reads `methodologyFormula` from each snapshot and applies the matching bands. The live sample numbers confirm the bands hold with margin: NO proposed ≈ 71.59 (≥ 60 by 11 points), YE ≈ 27.36 (≤ 40 by 13 points).
5. **Projected snapshot**: `docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json` carries the top/bottom/major-economies tables at the proposed formula so reviewers can preview the post-activation ranking before flipping the flag. Once the flag is on in production, run `scripts/freeze-resilience-ranking.mjs` to capture the authoritative full-universe snapshot.
Rollback: set `RESILIENCE_PILLAR_COMBINE_ENABLED=false`, flush the `resilience:score:v10:*`, `resilience:ranking:v10`, and `resilience:history:v5:*` keys (or wait for TTLs to expire). The 6-domain formula lives alongside the pillar combine in `_shared.ts` and needs no code change to come back.
Until operators set the flag, `overall_score` remains the 6-domain weighted aggregate documented above.
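
As a minimal sketch of the methodology-aware thresholds in step 3 above: the cutoffs (70/40 for the 6-domain formula, 60/30 for the pillar-combined formula) and the FI scores come from this section. The `'low'` label, the inclusive `>=` comparisons, and passing the flag as a parameter are assumptions made for illustration; the real `classifyResilienceLevel` reads `isPillarCombineEnabled()` itself.

```typescript
// Illustrative sketch; boundary handling and the 'low' label are assumptions.
type Level = 'high' | 'medium' | 'low';

function classifyResilienceLevel(score: number, pillarCombineEnabled: boolean): Level {
  // Pick the cutoffs that match the formula that produced the score.
  const [highMin, mediumMin] = pillarCombineEnabled ? [60, 30] : [70, 40];
  if (score >= highMin) return 'high';
  if (score >= mediumMin) return 'medium';
  return 'low';
}

// Finland: 75.64 under the 6-domain formula, 68.60 under the pillar combine.
const fiBefore = classifyResilienceLevel(75.64, false);      // 'high'
const fiAfter = classifyResilienceLevel(68.6, true);         // 'high' with re-anchored cutoffs
const fiAfterOldCutoffs = classifyResilienceLevel(68.6, false); // 'medium' if cutoffs were left at 70/40
```

The third call shows the demotion the re-anchoring avoids: the same country would change label only because the scale compressed.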
### Scorecard (v2.0 self-assessment)


@@ -0,0 +1,537 @@
{
"capturedAt": "2026-04-21",
"source": "Projected from scripts/compare-resilience-current-vs-proposed.mjs against the 52-country live-Redis sensitivity sample, regenerated after the comparison script was corrected to use the production buildPillarList aggregation (coverage-weighted across member-domain average dimension coverage). This is NOT a full live-universe capture \u2014 the pillar-combined flag is off in production, so a real 217-country ranking under the new formula does not exist yet. When activation ships, run scripts/freeze-resilience-ranking.mjs against the flag-enabled deployment to produce the authoritative capture; this file is the best available preview until then.",
"commitSha": "048bb8bb525393dc4a9c1998b9877c1f8cc8c011",
"schemaVersion": "2.0",
"methodologyFormula": "pillar-combined-penalized-v1",
"methodology": {
"overallScoreFormula": "penalizedPillarScore(pillars): \u03a3 pillar.score \u00d7 pillar.weight multiplied by (1 \u2212 0.5 \u00d7 (1 \u2212 min_pillar/100)). Pillar weights: structural-readiness=0.40, live-shock-exposure=0.35, recovery-capacity=0.25.",
"penaltyAlpha": 0.5,
"domainCount": 6,
"dimensionCount": 19,
"pillarCount": 3,
"coverageLabel": "Dimension coverage (mean of 19 per-dimension coverage values).",
"greyOutThreshold": 0.4,
"notes": [
"Every score is lower than the 6-domain equivalent because the penalty factor is always \u2264 1. Rank order is preserved (Spearman 0.9863 on this sample).",
"Sample size is 52 \u2014 the true live ranking has ~217 countries. Rank numbers here are in-sample; the true global rank for each country will likely be larger.",
"This snapshot informs the activation PR\u2019s release-gate re-anchoring but is NOT a substitute for the post-activation live capture."
]
},
"sampleSize": 52,
"sampleCountries": [
"CH",
"IS",
"DK",
"NO",
"SE",
"FI",
"NZ",
"JP",
"DE",
"AU",
"GB",
"FR",
"ES",
"CA",
"PL",
"IT",
"KR",
"BR",
"US",
"MY",
"CN",
"ID",
"TH",
"PH",
"UA",
"IN",
"RU",
"VN",
"EG",
"IQ",
"TR",
"MX",
"ZA",
"BD",
"KE",
"HT",
"AF",
"PK",
"CF",
"MM",
"NG",
"ET",
"NE",
"SS",
"ML",
"TD",
"IR",
"VE",
"SY",
"YE",
"SO",
"SD"
],
"tables": {
"topTenInSample": [
{
"rankInSample": 1,
"countryCode": "CH",
"countryName": "Switzerland",
"proposedOverallScore": 73.17,
"currentOverallScore": 78.78,
"scoreDelta": -5.61,
"pillars": {
"structuralReadiness": 82.34,
"liveShockExposure": 78.94,
"recoveryCapacity": 84.86,
"minPillar": 78.94
}
},
{
"rankInSample": 2,
"countryCode": "IS",
"countryName": "Iceland",
"proposedOverallScore": 72.76,
"currentOverallScore": 79.49,
"scoreDelta": -6.73,
"pillars": {
"structuralReadiness": 86.38,
"liveShockExposure": 88.09,
"recoveryCapacity": 73.65,
"minPillar": 73.65
}
},
{
"rankInSample": 3,
"countryCode": "DK",
"countryName": "Denmark",
"proposedOverallScore": 72.59,
"currentOverallScore": 78.55,
"scoreDelta": -5.96,
"pillars": {
"structuralReadiness": 87.81,
"liveShockExposure": 76.9,
"recoveryCapacity": 80.14,
"minPillar": 76.9
}
},
{
"rankInSample": 4,
"countryCode": "NO",
"countryName": "Norway",
"proposedOverallScore": 71.59,
"currentOverallScore": 79.03,
"scoreDelta": -7.44,
"pillars": {
"structuralReadiness": 85.85,
"liveShockExposure": 90.02,
"recoveryCapacity": 71.18,
"minPillar": 71.18
}
},
{
"rankInSample": 5,
"countryCode": "SE",
"countryName": "Sweden",
"proposedOverallScore": 70.13,
"currentOverallScore": 75.6,
"scoreDelta": -5.47,
"pillars": {
"structuralReadiness": 79.2,
"liveShockExposure": 81.3,
"recoveryCapacity": 76.79,
"minPillar": 76.79
}
},
{
"rankInSample": 6,
"countryCode": "FI",
"countryName": "Finland",
"proposedOverallScore": 68.6,
"currentOverallScore": 75.64,
"scoreDelta": -7.04,
"pillars": {
"structuralReadiness": 81.97,
"liveShockExposure": 78.42,
"recoveryCapacity": 74.17,
"minPillar": 74.17
}
},
{
"rankInSample": 7,
"countryCode": "NZ",
"countryName": "New Zealand",
"proposedOverallScore": 67.93,
"currentOverallScore": 76.26,
"scoreDelta": -8.33,
"pillars": {
"structuralReadiness": 82.9,
"liveShockExposure": 82.91,
"recoveryCapacity": 70.34,
"minPillar": 70.34
}
},
{
"rankInSample": 8,
"countryCode": "JP",
"countryName": "Japan",
"proposedOverallScore": 64.45,
"currentOverallScore": 73.33,
"scoreDelta": -8.88,
"pillars": {
"structuralReadiness": 77.74,
"liveShockExposure": 69.7,
"recoveryCapacity": 81.86,
"minPillar": 69.7
}
},
{
"rankInSample": 9,
"countryCode": "DE",
"countryName": "Germany",
"proposedOverallScore": 63.6,
"currentOverallScore": 72.42,
"scoreDelta": -8.82,
"pillars": {
"structuralReadiness": 77.74,
"liveShockExposure": 70.33,
"recoveryCapacity": 75.86,
"minPillar": 70.33
}
},
{
"rankInSample": 10,
"countryCode": "AU",
"countryName": "Australia",
"proposedOverallScore": 62.48,
"currentOverallScore": 73.63,
"scoreDelta": -11.15,
"pillars": {
"structuralReadiness": 78.77,
"liveShockExposure": 84.73,
"recoveryCapacity": 62.66,
"minPillar": 62.66
}
}
],
"bottomTenInSample": [
{
"rankInSample": 43,
"countryCode": "NE",
"countryName": "Niger",
"proposedOverallScore": 34.11,
"currentOverallScore": 46.6,
"scoreDelta": -12.49,
"pillars": {
"structuralReadiness": 56.94,
"liveShockExposure": 35.95,
"recoveryCapacity": 59.3,
"minPillar": 35.95
}
},
{
"rankInSample": 44,
"countryCode": "SS",
"countryName": "South Sudan",
"proposedOverallScore": 34.06,
"currentOverallScore": 45.54,
"scoreDelta": -11.48,
"pillars": {
"structuralReadiness": 52.61,
"liveShockExposure": 40.59,
"recoveryCapacity": 52.82,
"minPillar": 40.59
}
},
{
"rankInSample": 45,
"countryCode": "ML",
"countryName": "Mali",
"proposedOverallScore": 33.67,
"currentOverallScore": 44.91,
"scoreDelta": -11.24,
"pillars": {
"structuralReadiness": 54.6,
"liveShockExposure": 38.77,
"recoveryCapacity": 52.47,
"minPillar": 38.77
}
},
{
"rankInSample": 46,
"countryCode": "TD",
"countryName": "Chad",
"proposedOverallScore": 32.27,
"currentOverallScore": 43.85,
"scoreDelta": -11.58,
"pillars": {
"structuralReadiness": 54.34,
"liveShockExposure": 35.93,
"recoveryCapacity": 52.68,
"minPillar": 35.93
}
},
{
"rankInSample": 47,
"countryCode": "IR",
"countryName": "Iran",
"proposedOverallScore": 31.45,
"currentOverallScore": 46.48,
"scoreDelta": -15.03,
"pillars": {
"structuralReadiness": 37.08,
"liveShockExposure": 58.09,
"recoveryCapacity": 42.86,
"minPillar": 37.08
}
},
{
"rankInSample": 48,
"countryCode": "VE",
"countryName": "Venezuela",
"proposedOverallScore": 31.18,
"currentOverallScore": 47.7,
"scoreDelta": -16.52,
"pillars": {
"structuralReadiness": 37.87,
"liveShockExposure": 65.59,
"recoveryCapacity": 33.89,
"minPillar": 33.89
}
},
{
"rankInSample": 49,
"countryCode": "SY",
"countryName": "Syria",
"proposedOverallScore": 30.55,
"currentOverallScore": 49.64,
"scoreDelta": -19.09,
"pillars": {
"structuralReadiness": 32.1,
"liveShockExposure": 57.79,
"recoveryCapacity": 52.73,
"minPillar": 32.1
}
},
{
"rankInSample": 50,
"countryCode": "YE",
"countryName": "Yemen",
"proposedOverallScore": 27.36,
"currentOverallScore": 42.51,
"scoreDelta": -15.15,
"pillars": {
"structuralReadiness": 39.36,
"liveShockExposure": 38.13,
"recoveryCapacity": 42.09,
"minPillar": 38.13
}
},
{
"rankInSample": 51,
"countryCode": "SO",
"countryName": "Somalia",
"proposedOverallScore": 26.8,
"currentOverallScore": 36.47,
"scoreDelta": -9.67,
"pillars": {
"structuralReadiness": 40.25,
"liveShockExposure": 35.72,
"recoveryCapacity": 43.56,
"minPillar": 35.72
}
},
{
"rankInSample": 52,
"countryCode": "SD",
"countryName": "Sudan",
"proposedOverallScore": 19.45,
"currentOverallScore": 29.69,
"scoreDelta": -10.24,
"pillars": {
"structuralReadiness": 31.15,
"liveShockExposure": 32.3,
"recoveryCapacity": 27.24,
"minPillar": 27.24
}
}
],
"majorEconomiesInSample": [
{
"rankInSample": 8,
"countryCode": "JP",
"countryName": "Japan",
"proposedOverallScore": 64.45,
"currentOverallScore": 73.33,
"scoreDelta": -8.88,
"pillars": {
"structuralReadiness": 77.74,
"liveShockExposure": 69.7,
"recoveryCapacity": 81.86,
"minPillar": 69.7
}
},
{
"rankInSample": 9,
"countryCode": "DE",
"countryName": "Germany",
"proposedOverallScore": 63.6,
"currentOverallScore": 72.42,
"scoreDelta": -8.82,
"pillars": {
"structuralReadiness": 77.74,
"liveShockExposure": 70.33,
"recoveryCapacity": 75.86,
"minPillar": 70.33
}
},
{
"rankInSample": 10,
"countryCode": "AU",
"countryName": "Australia",
"proposedOverallScore": 62.48,
"currentOverallScore": 73.63,
"scoreDelta": -11.15,
"pillars": {
"structuralReadiness": 78.77,
"liveShockExposure": 84.73,
"recoveryCapacity": 62.66,
"minPillar": 62.66
}
},
{
"rankInSample": 11,
"countryCode": "GB",
"countryName": "United Kingdom",
"proposedOverallScore": 62.42,
"currentOverallScore": 70.1,
"scoreDelta": -7.68,
"pillars": {
"structuralReadiness": 73.86,
"liveShockExposure": 71.7,
"recoveryCapacity": 72.28,
"minPillar": 71.7
}
},
{
"rankInSample": 12,
"countryCode": "FR",
"countryName": "France",
"proposedOverallScore": 61.45,
"currentOverallScore": 70.06,
"scoreDelta": -8.61,
"pillars": {
"structuralReadiness": 74.96,
"liveShockExposure": 74.85,
"recoveryCapacity": 67.96,
"minPillar": 67.96
}
},
{
"rankInSample": 17,
"countryCode": "KR",
"countryName": "South Korea",
"proposedOverallScore": 60.43,
"currentOverallScore": 69.85,
"scoreDelta": -9.42,
"pillars": {
"structuralReadiness": 75.8,
"liveShockExposure": 66.77,
"recoveryCapacity": 75.14,
"minPillar": 66.77
}
},
{
"rankInSample": 18,
"countryCode": "BR",
"countryName": "Brazil",
"proposedOverallScore": 58.99,
"currentOverallScore": 68.34,
"scoreDelta": -9.35,
"pillars": {
"structuralReadiness": 68.69,
"liveShockExposure": 76.52,
"recoveryCapacity": 66.47,
"minPillar": 66.47
}
},
{
"rankInSample": 19,
"countryCode": "US",
"countryName": "United States",
"proposedOverallScore": 54.5,
"currentOverallScore": 68.26,
"scoreDelta": -13.76,
"pillars": {
"structuralReadiness": 68.55,
"liveShockExposure": 83.83,
"recoveryCapacity": 54.73,
"minPillar": 54.73
}
},
{
"rankInSample": 21,
"countryCode": "CN",
"countryName": "China",
"proposedOverallScore": 52.57,
"currentOverallScore": 63.73,
"scoreDelta": -11.16,
"pillars": {
"structuralReadiness": 58.25,
"liveShockExposure": 74.1,
"recoveryCapacity": 68.82,
"minPillar": 58.25
}
},
{
"rankInSample": 26,
"countryCode": "IN",
"countryName": "India",
"proposedOverallScore": 46.82,
"currentOverallScore": 59.3,
"scoreDelta": -12.48,
"pillars": {
"structuralReadiness": 63.51,
"liveShockExposure": 54.34,
"recoveryCapacity": 64.98,
"minPillar": 54.34
}
},
{
"rankInSample": 27,
"countryCode": "RU",
"countryName": "Russia",
"proposedOverallScore": 46.28,
"currentOverallScore": 61.08,
"scoreDelta": -14.8,
"pillars": {
"structuralReadiness": 47.95,
"liveShockExposure": 68.43,
"recoveryCapacity": 77.73,
"minPillar": 47.95
}
},
{
"rankInSample": 31,
"countryCode": "TR",
"countryName": "Turkey",
"proposedOverallScore": 43.66,
"currentOverallScore": 56.49,
"scoreDelta": -12.83,
"pillars": {
"structuralReadiness": 50.94,
"liveShockExposure": 59.84,
"recoveryCapacity": 66.14,
"minPillar": 50.94
}
}
]
},
"totals": {
"rankedCountriesInSample": 52,
"sgInSample": false
},
"comparisonArtifactRef": "docs/snapshots/resilience-pillar-sensitivity-2026-04-21.json"
}


@@ -27,7 +27,7 @@ loadEnvFile(import.meta.url);
const __dirname = dirname(fileURLToPath(import.meta.url));
const VALIDATION_DIR = join(__dirname, '..', 'docs', 'methodology', 'country-resilience-index', 'validation');
-const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
+const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
const BACKTEST_RESULT_KEY = 'resilience:backtest:outcomes:v1';
const BACKTEST_TTL_SECONDS = 7 * 24 * 60 * 60;


@@ -365,7 +365,7 @@ function median(arr) {
async function readWmScoresFromRedis() {
const { url, token } = getRedisCredentials();
-const rankingResp = await fetch(`${url}/get/${encodeURIComponent('resilience:ranking:v9')}`, {
+const rankingResp = await fetch(`${url}/get/${encodeURIComponent('resilience:ranking:v10')}`, {
headers: { Authorization: `Bearer ${token}` },
signal: AbortSignal.timeout(10_000),
});


@@ -19,8 +19,8 @@ const WM_KEY = process.env.WORLDMONITOR_API_KEY
|| '';
const SEED_UA = 'Mozilla/5.0 (compatible; WorldMonitor-Seed/1.0)';
-export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
-export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v9';
+export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
+export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v10';
// Must match the server-side RESILIENCE_RANKING_CACHE_TTL_SECONDS. Extended
// to 12h (2x the cron interval) so a missed/slow cron can't create an
// EMPTY_ON_DEMAND gap before the next successful rebuild.


@@ -27,7 +27,7 @@ import { unwrapEnvelope } from './_seed-envelope-source.mjs';
loadEnvFile(import.meta.url);
// Source of truth: server/worldmonitor/resilience/v1/_shared.ts
-const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
+const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
const MIN_SCORED_COUNTRIES = 5;


@@ -3,7 +3,7 @@
import { loadEnvFile, getRedisCredentials } from './_seed-utils.mjs';
// Source of truth: server/worldmonitor/resilience/v1/_shared.ts → RESILIENCE_SCORE_CACHE_PREFIX
-const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
+const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
const REFERENCE_INDICES = {
ndgain: {


@@ -9,7 +9,7 @@ import type {
export type { ScoreInterval };
-import { cachedFetchJson, getCachedJson, runRedisPipeline } from '../../../_shared/redis';
+import { cachedFetchJson, getCachedJson, runRedisPipeline, setCachedJson } from '../../../_shared/redis';
import { unwrapEnvelope } from '../../../_shared/seed-envelope';
import { detectTrend, round } from '../../../_shared/resilience-stats';
import {
@@ -34,18 +34,37 @@ import { buildPillarList } from './_pillar-membership';
// back to the Phase 1 shape (`schemaVersion: "1.0"`, `pillars: []`) —
// retained as an emergency opt-out for one release cycle.
//
// IMPORTANT: `overallScore` is STILL computed as the 6-domain weighted
// aggregate (Σ domain.score * domain.weight, weights sum to 1.00) in both
// modes. A pillar-combined score with a min-pillar penalty is defined
// below (`penalizedPillarScore`) and exercised by
// scripts/validate-resilience-sensitivity.mjs; the activation that
// switches `overallScore` to the pillar combine is a separate PR.
//
// `baselineScore`, `stressScore`, `stressFactor`, etc. remain populated
// in both modes for widget + map layer + Country Brief consumers.
export const RESILIENCE_SCHEMA_V2_ENABLED =
(process.env.RESILIENCE_SCHEMA_V2_ENABLED ?? 'true').toLowerCase() === 'true';
// Phase 2 T2.3 activation: feature flag that switches `overallScore`
// from the 6-domain weighted aggregate (legacy compensatory form) to
// the 3-pillar combined form with the min-pillar penalty term defined
// by `penalizedPillarScore` below. Default is `false` so activation is
// an explicit operator action; the sensitivity + current-vs-proposed
// comparison in `docs/snapshots/resilience-pillar-sensitivity-*.json`
// is the input for that decision. When flipped to `true`:
// - `overallScore` = penalizedPillarScore(pillars), α=0.5 (pillar
// weights 0.40 / 0.35 / 0.25 per the plan).
// - Published numbers drop ~13 points on average across the
// 52-country sample; Spearman vs the 6-domain ranking is 0.9935.
//
// Read dynamically rather than captured at module load so tests can
// flip `process.env.RESILIENCE_PILLAR_COMBINE_ENABLED` per-case without
// re-importing the module. Under Node production the env does not
// change mid-process so the per-call read is a couple of instructions.
//
// Cache invalidation: the score cache prefix is bumped on every
// flag-visible behavior change (see RESILIENCE_SCORE_CACHE_PREFIX
// below). The prefix bump alone does not isolate formulas across a
// later flag flip; the in-payload `_formula` tag (see the
// `CacheFormulaTag` comment block) is what rejects legacy 6-domain
// entries at serve time after activation.
export function isPillarCombineEnabled(): boolean {
return (process.env.RESILIENCE_PILLAR_COMBINE_ENABLED ?? 'false').toLowerCase() === 'true';
}
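The difference between a module-load capture and a per-call read can be sketched as follows. This is an illustration only: `env` stands in for `process.env`, and `DEMO_FLAG`, `parseFlag`, and `readPerCall` are hypothetical names that do not exist in the codebase.

```typescript
// Stand-in for process.env so the sketch has no runtime dependencies.
const env: Record<string, string | undefined> = {};

// Same parsing rule as the flag helpers in this diff: default, lowercase, compare.
function parseFlag(raw: string | undefined, fallback: 'true' | 'false'): boolean {
  return (raw ?? fallback).toLowerCase() === 'true';
}

// Captured once, the way a module-level constant would be.
const capturedAtLoad = parseFlag(env['DEMO_FLAG'], 'false');

// Read on every call, the way isPillarCombineEnabled() reads its flag.
function readPerCall(): boolean {
  return parseFlag(env['DEMO_FLAG'], 'false');
}

// Simulate a test flipping the flag after the module has loaded.
env['DEMO_FLAG'] = 'TRUE';

const afterFlipCaptured = capturedAtLoad; // unchanged by the flip
const afterFlipPerCall = readPerCall(); // sees the flipped value
```

A captured constant would force tests to re-import the module to observe the flip, which is the cost the per-call read avoids.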
export const RESILIENCE_SCORE_CACHE_TTL_SECONDS = 6 * 60 * 60;
// Ranking TTL must exceed the cron interval (6h) by enough to tolerate one
// missed/slow cron tick. With TTL==cron_interval, writing near the end of a
@@ -54,9 +73,37 @@ export const RESILIENCE_SCORE_CACHE_TTL_SECONDS = 6 * 60 * 60;
// full cron-cycle of headroom — ensureRankingPresent() still refreshes on
// every cron, so under normal operation the key stays well above TTL=0.
export const RESILIENCE_RANKING_CACHE_TTL_SECONDS = 12 * 60 * 60;
export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v9:';
export const RESILIENCE_HISTORY_KEY_PREFIX = 'resilience:history:v4:';
export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v9';
// Bumped from v9 to v10 in the pillar-combined activation PR. Provides
// a clean slate at PR deploy so no pre-PR cache entries (whose payloads
// lack the `_formula` tag) can leak through on activation day. NOTE:
// the version bump alone is NOT sufficient to isolate formulas — the
// flag defaults to off, so v10 is populated with 6-domain entries long
// before anyone flips RESILIENCE_PILLAR_COMBINE_ENABLED=true. The real
// cross-formula guard is the in-payload `_formula` marker written by
// `buildResilienceScore`, read by `ensureResilienceScoreCached` and
// `getCachedResilienceScores` to reject stale-formula hits at serve
// time. See the `CacheFormulaTag` comment block.
export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v10:';
// Bumped from v4 to v5 in the pillar-combined activation PR. Provides
// a clean slate at PR deploy so pre-PR history points (which were
// written without a formula tag) do not mix with tagged points. NOTE:
// the version bump alone is NOT sufficient because the flag defaults
// to off, so v5 accumulates d6-tagged entries during the default-off
// window. The real cross-formula guard is the `:d6` / `:pc` suffix on
// each sorted-set member written by `appendHistory` and filtered by
// `buildResilienceScore` before change30d / trend are computed. Legacy
// untagged members (from older deploys that happen to survive on v4
// readers) decode as `d6` — matching the only formula that existed
// before this PR — so the filter stays correct in either direction.
export const RESILIENCE_HISTORY_KEY_PREFIX = 'resilience:history:v5:';
// Bumped in lockstep with RESILIENCE_SCORE_CACHE_PREFIX (v9 → v10) for
// a clean slate at PR deploy. As with the score prefix, the version
// bump is a belt — the suspenders are the `_formula` tag on the
// ranking payload itself, written via stampRankingCacheTag and read
// via rankingCacheTagMatches in the ranking handler, which force a
// recompute-and-publish on a cross-formula cache hit rather than
// serving the stale ranking for up to the 12h ranking TTL.
export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v10';
export const RESILIENCE_STATIC_INDEX_KEY = 'resilience:static:index:v1';
export const RESILIENCE_INTERVAL_KEY_PREFIX = 'resilience:intervals:v1:';
const RESILIENCE_STATIC_META_KEY = 'seed-meta:resilience:static';
@@ -65,9 +112,28 @@ const RANK_STABLE_MAX_INTERVAL_WIDTH = 8;
const LOW_CONFIDENCE_COVERAGE_THRESHOLD = 0.55;
const LOW_CONFIDENCE_IMPUTATION_SHARE_THRESHOLD = 0.40;
// Cache formula tag. Stored inside score + ranking JSON payloads and as
// a suffix in history sorted-set member strings so the reader can reject
// or filter cross-formula entries at serve time. This is the actual
// isolation mechanism; the v9→v10 score/ranking and v4→v5 history key
// version bumps only provide a clean-slate at PR deploy and do NOT by
// themselves protect against the default-off-then-activate path —
// default-off writes land in the new v10/v5 namespace tagged as 'd6',
// and only the in-payload tag check forces a rebuild / filter on flip.
type CacheFormulaTag = 'd6' | 'pc';
function currentCacheFormula(): CacheFormulaTag {
// Mirrors the gating in buildResilienceScore's overallScore branch so
// the tag we stamp on write equals the formula actually used. If
// schemaV2 is off or the pillar combine flag is off, writes tag 'd6'
// and reads require 'd6' — matching the 6-domain aggregate code path.
return isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED ? 'pc' : 'd6';
}
interface ResilienceHistoryPoint {
date: string;
score: number;
formula: CacheFormulaTag;
}
interface ResilienceStaticIndex {
@@ -106,9 +172,31 @@ function todayIsoDate(): string {
return new Date().toISOString().slice(0, 10);
}
// Level thresholds are methodology-aware. The pillar-combined formula
// compresses the scale (~11-point mean drop across the 52-country live
// sample), so the legacy 70/40 thresholds misclassify top-tier countries
// as "medium" purely because the scale got compressed rather than
// because anything changed about the country (FI 75.64 → 68.60 and NZ
// 76.26 → 67.93 in the live sample both straddle the legacy 70 floor).
// The pillar-combined thresholds 60/30 are re-anchored against the live
// sample so the qualitative label stays stable for every country whose
// old label was correct; the 52-country sensitivity capture confirms
// all 7 high-band anchors stay ≥60 and all fragile-state anchors stay
// ≤30. Kept narrow: only the two thresholds move; the three-label
// taxonomy (high/medium/low) and downstream UI consumers are
// unchanged.
const LEVEL_THRESHOLDS_BY_FORMULA = {
'domain-weighted-6d': { high: 70, medium: 40 },
'pillar-combined-penalized-v1': { high: 60, medium: 30 },
} as const;
function classifyResilienceLevel(score: number): string {
if (score >= 70) return 'high';
if (score >= 40) return 'medium';
const formula = isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED
? 'pillar-combined-penalized-v1'
: 'domain-weighted-6d';
const { high, medium } = LEVEL_THRESHOLDS_BY_FORMULA[formula];
if (score >= high) return 'high';
if (score >= medium) return 'medium';
return 'low';
}
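A standalone sketch of the formula-aware classification may make the re-anchoring easier to see. The formula names and cut-offs are copied from the diff; `classify` and its explicit `formula` parameter are hypothetical (the real function derives the formula from the two flags).

```typescript
type FormulaName = 'domain-weighted-6d' | 'pillar-combined-penalized-v1';
type Level = 'high' | 'medium' | 'low';

// Thresholds as given in LEVEL_THRESHOLDS_BY_FORMULA.
const THRESHOLDS: Record<FormulaName, { high: number; medium: number }> = {
  'domain-weighted-6d': { high: 70, medium: 40 },
  'pillar-combined-penalized-v1': { high: 60, medium: 30 },
};

// Hypothetical helper: takes the formula as an argument instead of reading flags.
function classify(score: number, formula: FormulaName): Level {
  const { high, medium } = THRESHOLDS[formula];
  if (score >= high) return 'high';
  if (score >= medium) return 'medium';
  return 'low';
}

// FI's pillar-combined score quoted in the comment (68.60) under each threshold set.
const fiUnderLegacyThresholds = classify(68.6, 'domain-weighted-6d');
const fiUnderPillarThresholds = classify(68.6, 'pillar-combined-penalized-v1');
```

With the legacy 70/40 cut-offs the compressed score lands in "medium"; with 60/30 it stays "high", which is the label stability the comment describes.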
@@ -183,18 +271,28 @@ function buildDomainList(dimensions: ResilienceDimension[]): ResilienceDomain[]
});
}
// Sorted-set member format: `YYYY-MM-DD:SCORE[:FORMULA]`. The optional
// formula tag is either 'd6' or 'pc'. Legacy untagged members predate
// the pillar-combined activation and are implicitly 'd6' (the only
// formula in use before this PR). On activation, readHistory callers
// filter by `currentCacheFormula()` so a 30-day window of d6 points is
// not silently compared against a fresh pc point (which would
// manufacture a ranking-wide fake-negative change30d / false "falling"
// trend on day one).
function parseHistoryPoints(raw: unknown): ResilienceHistoryPoint[] {
if (!Array.isArray(raw)) return [];
const history: ResilienceHistoryPoint[] = [];
for (let index = 0; index < raw.length; index += 2) {
const member = String(raw[index] || '');
const separatorIndex = member.indexOf(':');
if (separatorIndex < 0) continue;
const date = member.slice(0, separatorIndex);
const score = Number(member.slice(separatorIndex + 1));
const parts = member.split(':');
if (parts.length < 2) continue;
const date = parts[0]!;
const score = Number(parts[1]);
const rawFormula = parts[2];
const formula: CacheFormulaTag = rawFormula === 'pc' ? 'pc' : 'd6';
if (!/^\d{4}-\d{2}-\d{2}$/.test(date) || !Number.isFinite(score)) continue;
history.push({ date, score });
history.push({ date, score, formula });
}
return history.sort((left, right) => left.date.localeCompare(right.date));
@@ -212,10 +310,20 @@ async function readHistory(countryCode: string): Promise<ResilienceHistoryPoint[
return parseHistoryPoints(result[0]?.result);
}
async function appendHistory(countryCode: string, overallScore: number): Promise<void> {
async function appendHistory(
countryCode: string,
overallScore: number,
formula: CacheFormulaTag,
): Promise<void> {
const dateScore = Number(todayIsoDate().replace(/-/g, ''));
// Member format `YYYY-MM-DD:SCORE:FORMULA` — see parseHistoryPoints
// above for the reader. The formula tag is required because the v4→v5
// history prefix bump happens at PR deploy, not at flag flip, so the
// v5 series accumulates d6-tagged entries during the default-off
// window; only the per-member tag lets the reader correctly filter
// those out when the pillar-combined formula later activates.
await runRedisPipeline([
['ZADD', historyKey(countryCode), dateScore, `${todayIsoDate()}:${round(overallScore)}`],
['ZADD', historyKey(countryCode), dateScore, `${todayIsoDate()}:${round(overallScore)}:${formula}`],
['ZREMRANGEBYRANK', historyKey(countryCode), 0, -31],
]);
}
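The member format `YYYY-MM-DD:SCORE[:FORMULA]` and the read-side filter can be exercised in isolation. The sketch below mirrors the parsing rules shown in this diff but works on plain strings rather than a Redis reply; `formatMember`, `parseMember`, and the sample values are hypothetical.

```typescript
type FormulaTag = 'd6' | 'pc';

interface HistoryPoint {
  date: string;
  score: number;
  formula: FormulaTag;
}

// Writer side: always emits the three-part form.
function formatMember(date: string, score: number, formula: FormulaTag): string {
  return `${date}:${score}:${formula}`;
}

// Reader side: tolerates legacy two-part members and decodes them as 'd6'.
function parseMember(member: string): HistoryPoint | null {
  const parts = member.split(':');
  if (parts.length < 2) return null;
  const date = parts[0] ?? '';
  const score = Number(parts[1]);
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date) || !Number.isFinite(score)) return null;
  return { date, score, formula: parts[2] === 'pc' ? 'pc' : 'd6' };
}

// Made-up sample: one legacy untagged member, one d6-tagged, one pc-tagged.
const members = ['2026-04-01:71.2', '2026-04-20:70.9:d6', formatMember('2026-04-21', 58.3, 'pc')];
const points = members
  .map(parseMember)
  .filter((p): p is HistoryPoint => p !== null);

// After activation only same-formula points feed change30d / trend.
const pcOnly = points.filter((p) => p.formula === 'pc');
const d6Only = points.filter((p) => p.formula === 'd6');
```

Decoding untagged members as `d6` is what lets the same filter work both before and after the flag flip.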
@@ -249,7 +357,26 @@ async function buildResilienceScore(
const baselineScore = round(coverageWeightedMean(baselineDims));
const stressScore = round(coverageWeightedMean(stressDims));
const stressFactor = round(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);
const overallScore = round(domains.reduce((sum, d) => sum + d.score * d.weight, 0));
// Phase 2 T2.3 activation: `overallScore` is either the legacy
// 6-domain weighted aggregate (compensatory, `Σ domain.score *
// domain.weight`) or the pillar-combined penalized form (non-
// compensatory, `penalizedPillarScore(pillars)`), controlled by
// `RESILIENCE_PILLAR_COMBINE_ENABLED` + `RESILIENCE_SCHEMA_V2_ENABLED`.
// We only activate the pillar combine when v2 is on because the
// pillar list is empty under v1 and `penalizedPillarScore([])` returns
// 0 — that would silently zero every country's score if the flags
// were out of sync.
const domainAggregate = round(domains.reduce((sum, d) => sum + d.score * d.weight, 0));
const pillarEligible = isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED && pillars.length > 0;
const overallScore = pillarEligible
? round(penalizedPillarScore(pillars.map((p) => ({ score: p.score, weight: p.weight }))))
: domainAggregate;
// Tag MUST match the branch that actually computed overallScore so
// the reader's stale-formula check in ensureResilienceScoreCached
// correctly rejects cross-formula cache entries when the env flag
// flips later. currentCacheFormula() reads the same two flags, so
// the derivation is intentionally redundant-by-agreement.
const formula: CacheFormulaTag = pillarEligible ? 'pc' : 'd6';
const totalImputed = dimensions.reduce((sum, d) => sum + (d.imputedWeight ?? 0), 0);
const totalObserved = dimensions.reduce((sum, d) => sum + (d.observedWeight ?? 0), 0);
@@ -257,12 +384,18 @@ async function buildResilienceScore(
? round(totalImputed / (totalImputed + totalObserved), 4)
: 0;
// Filter history to the CURRENT formula only. Points tagged with the
// other formula are excluded from change30d / trend so the first
// post-flip score is not diffed against a 30-day window of the other
// formula's values (which would emit a fake-negative change30d and
// a false "falling" trend across the ranking on activation day).
const history = (await readHistory(normalizedCountryCode))
.filter((point) => point.formula === formula)
.filter((point) => point.date !== todayIsoDate());
const scoreSeries = [...history.map((point) => point.score), overallScore];
const oldestScore = history[0]?.score;
await appendHistory(normalizedCountryCode, overallScore);
await appendHistory(normalizedCountryCode, overallScore, formula);
return {
countryCode: normalizedCountryCode,
@@ -282,6 +415,37 @@ async function buildResilienceScore(
};
}
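The body of `penalizedPillarScore` is not part of this diff. A minimal sketch consistent with what the PR states (α = 0.5, penalty = 1 - α × (1 - min_pillar/100), empty input returns 0, weighted sum times penalty) could look like the following. The name `penalizedPillarScoreSketch` is hypothetical, and whether the real function normalizes weights or clamps inputs is not shown here, so treat those details as assumptions.

```typescript
interface PillarInput {
  score: number; // 0..100
  weight: number; // expected to sum to 1 across pillars
}

const ALPHA = 0.5; // matches PENALTY_ALPHA pinned in the tests

// Sketch only: weighted sum scaled by a penalty driven by the weakest pillar.
function penalizedPillarScoreSketch(pillars: PillarInput[], alpha: number = ALPHA): number {
  if (pillars.length === 0) return 0; // the diff notes that [] yields 0
  const weightedSum = pillars.reduce((sum, p) => sum + p.score * p.weight, 0);
  const minScore = Math.min(...pillars.map((p) => p.score));
  const penalty = 1 - alpha * (1 - minScore / 100);
  return weightedSum * penalty;
}

// Balanced profile: 80 across the board, penalty 0.9, result close to 72.
const balanced = penalizedPillarScoreSketch([
  { score: 80, weight: 0.4 },
  { score: 80, weight: 0.35 },
  { score: 80, weight: 0.25 },
]);

// Imbalanced profile (made-up numbers): weighted sum 75, penalty 0.65, result close to 48.75.
const imbalanced = penalizedPillarScoreSketch([
  { score: 90, weight: 0.4 },
  { score: 90, weight: 0.35 },
  { score: 30, weight: 0.25 },
]);
```

The imbalanced case shows the non-compensatory behavior: two strong pillars cannot fully offset one weak pillar, because the weakest pillar alone sets the penalty.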
// The shape we actually store in Redis. Extends the public response type
// with a `_formula` marker so the reader can reject cross-formula cache
// entries when `RESILIENCE_PILLAR_COMBINE_ENABLED` flips later. The
// marker is stripped before the payload crosses back to callers.
type CachedScorePayload = GetResilienceScoreResponse & { _formula?: CacheFormulaTag };
function stripCacheMeta(payload: CachedScorePayload): GetResilienceScoreResponse {
const { _formula: _drop, ...rest } = payload;
void _drop;
return rest;
}
// Exposed helpers so the ranking handler can apply the same
// stale-formula invalidation to its own cache key. Kept in this module
// alongside the score versions so the tag convention has one source of
// truth; a diverging derivation elsewhere would re-introduce the cross-
// formula drift this whole pattern is meant to prevent.
export function getCurrentCacheFormula(): CacheFormulaTag {
return currentCacheFormula();
}
export function stampRankingCacheTag<T extends object>(payload: T): T & { _formula: CacheFormulaTag } {
return { ...payload, _formula: currentCacheFormula() };
}
export function rankingCacheTagMatches(payload: unknown): boolean {
if (!payload || typeof payload !== 'object') return false;
const tag = (payload as { _formula?: unknown })._formula;
return tag === currentCacheFormula();
}
export async function ensureResilienceScoreCached(countryCode: string, reader?: ResilienceSeedReader): Promise<GetResilienceScoreResponse> {
const normalizedCountryCode = normalizeCountryCode(countryCode);
if (!normalizedCountryCode) {
@@ -306,44 +470,68 @@ export async function ensureResilienceScoreCached(countryCode: string, reader?:
};
}
let cached = await cachedFetchJson<GetResilienceScoreResponse>(
scoreCacheKey(normalizedCountryCode),
const current = currentCacheFormula();
const cacheKey = scoreCacheKey(normalizedCountryCode);
let cached = await cachedFetchJson<CachedScorePayload>(
cacheKey,
RESILIENCE_SCORE_CACHE_TTL_SECONDS,
() => buildResilienceScore(normalizedCountryCode, reader),
async () => {
const built = await buildResilienceScore(normalizedCountryCode, reader);
// Tag with the formula buildResilienceScore actually used so
// downstream readers can reject cross-formula entries.
return { ...built, _formula: current };
},
300,
) ?? {
countryCode: normalizedCountryCode,
overallScore: 0,
baselineScore: 0,
stressScore: 0,
stressFactor: 0.5,
level: 'unknown',
domains: [],
trend: 'stable',
change30d: 0,
lowConfidence: true,
imputationShare: 0,
dataVersion: '',
// Phase 2 T2.1: cachedFetchJson-null fallback. Stays on the v1 shape
// because there are no domains to wrap into pillars here.
pillars: [],
schemaVersion: '1.0',
};
);
// Stale-formula guard. On activation day (flag flip), cached entries
// from the previous formula are still in Redis under the same key
// (v10 bump happens at PR deploy, not at flip time). The `_formula`
// tag we wrote on the cached payload lets us detect and overwrite
// the stale entry at read time. Without this, a 6-hour post-flip
// window would keep serving legacy scores. Legacy untagged entries
// (pre-PR writes that happen to survive the v9→v10 bump via
// external writers) are treated as stale-formula and rebuilt.
if (cached && cached._formula !== current) {
const rebuilt = await buildResilienceScore(normalizedCountryCode, reader);
cached = { ...rebuilt, _formula: current };
await setCachedJson(cacheKey, cached, RESILIENCE_SCORE_CACHE_TTL_SECONDS);
}
let payload: GetResilienceScoreResponse = cached
? stripCacheMeta(cached)
: {
countryCode: normalizedCountryCode,
overallScore: 0,
baselineScore: 0,
stressScore: 0,
stressFactor: 0.5,
level: 'unknown',
domains: [],
trend: 'stable',
change30d: 0,
lowConfidence: true,
imputationShare: 0,
dataVersion: '',
pillars: [],
schemaVersion: '1.0',
};
const scoreInterval = await readScoreInterval(normalizedCountryCode);
if (scoreInterval) {
cached = { ...cached, scoreInterval };
payload = { ...payload, scoreInterval };
}
// P1 fix: the cache always stores the v2 superset (pillars + schemaVersion='2.0').
// When the flag is off, strip pillars and downgrade schemaVersion so consumers
// see the v1 shape. Flag flips take effect immediately, no 6h TTL wait.
if (!RESILIENCE_SCHEMA_V2_ENABLED) {
cached.pillars = [];
cached.schemaVersion = '1.0';
payload.pillars = [];
payload.schemaVersion = '1.0';
}
return cached;
return payload;
}
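The stale-formula guard reduces to a small pattern: tag on write, compare on read, rebuild on mismatch, strip before returning. The sketch below uses an in-memory `Map` in place of Redis and ignores TTLs; `store`, `readThrough`, and `ScorePayload` are hypothetical names, not the handler's real API.

```typescript
type FormulaTag = 'd6' | 'pc';

interface ScorePayload {
  countryCode: string;
  overallScore: number;
}

type CachedPayload = ScorePayload & { _formula?: FormulaTag };

// Stand-in for the Redis score cache.
const store = new Map<string, CachedPayload>();

// Serve from cache only when the stored tag equals the current formula.
// A missing tag (legacy entry) compares unequal and is rebuilt as well.
function readThrough(key: string, current: FormulaTag, build: () => ScorePayload): ScorePayload {
  let cached = store.get(key);
  if (!cached || cached._formula !== current) {
    cached = { ...build(), _formula: current };
    store.set(key, cached);
  }
  // Strip the cache-only marker before the payload leaves this layer.
  const { _formula: _drop, ...publicPayload } = cached;
  void _drop;
  return publicPayload;
}

let builds = 0;
const buildWith = (score: number) => () => {
  builds += 1;
  return { countryCode: 'NO', overallScore: score };
};

const first = readThrough('score:NO', 'd6', buildWith(86)); // miss: builds and tags 'd6'
const second = readThrough('score:NO', 'd6', buildWith(86)); // hit: no rebuild
const afterFlip = readThrough('score:NO', 'pc', buildWith(72)); // tag mismatch: rebuild
```

The scores 86 and 72 are placeholders. The point is that the third read rebuilds even though the key is unchanged and the entry has not expired.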
export async function listScorableCountries(): Promise<string[]> {
@@ -361,6 +549,7 @@ export async function getCachedResilienceScores(countryCodes: string[]): Promise
const results = await runRedisPipeline(normalized.map((countryCode) => ['GET', scoreCacheKey(countryCode)]));
const scores = new Map<string, GetResilienceScoreResponse>();
const current = currentCacheFormula();
for (let index = 0; index < normalized.length; index += 1) {
const countryCode = normalized[index]!;
@@ -369,14 +558,32 @@ export async function getCachedResilienceScores(countryCodes: string[]): Promise
try {
// Envelope-aware: resilience score keys are written by seed-resilience-scores
// in contract mode (PR 2). unwrapEnvelope is a no-op on legacy bare-shape.
const parsed = unwrapEnvelope(JSON.parse(raw)).data as GetResilienceScoreResponse;
const parsed = unwrapEnvelope(JSON.parse(raw)).data as CachedScorePayload;
if (!parsed) continue;
// Stale-formula skip: this bulk read feeds the ranking handler,
// which mirrors the single-country cache miss path. Leaving the
// country out of `scores` causes the ranking handler's
// warmMissingResilienceScores step to rebuild it with the current
// formula, producing a coherent same-formula ranking. Without
// this filter, a flip would serve a mixed-formula ranking for
// up to the 6h score TTL.
//
// IMPORTANT: the condition intentionally matches `undefined` too
// (not `parsed._formula && parsed._formula !== current`). Legacy
// untagged entries carry no `_formula` — they were written by a
// pre-PR code path or by an external writer that has not been
// updated — and must be treated as stale so the ranking warm
// path rebuilds them with the current tag. The `&&` short-circuit
// would admit them and re-introduce the cross-formula drift the
// whole cache-tag strategy is meant to prevent.
if (parsed._formula !== current) continue;
const publicPayload = stripCacheMeta(parsed);
// P1 fix: cached payload is always v2 superset. Gate on serve.
if (!RESILIENCE_SCHEMA_V2_ENABLED) {
parsed.pillars = [];
parsed.schemaVersion = '1.0';
publicPayload.pillars = [];
publicPayload.schemaVersion = '1.0';
}
scores.set(countryCode, parsed);
scores.set(countryCode, publicPayload);
} catch {
// Ignore malformed cache entries and let the caller decide whether to warm them.
}
@@ -500,10 +707,15 @@ export async function warmMissingResilienceScores(
// pipeline body small enough to land well under the timeout while still
// making one round-trip per batch.
const SET_BATCH = 30;
const current = currentCacheFormula();
const allSetCommands = scores.map(({ cc, score }) => [
'SET',
scoreCacheKey(cc),
JSON.stringify(score),
// Stamp the formula tag on the written payload so the bulk-read
// path in getCachedResilienceScores can filter stale entries after
// a flag flip. Without this tag, warmed-then-flipped entries would
// be served as-is until the 6h TTL expired.
JSON.stringify({ ...score, _formula: current } satisfies CachedScorePayload),
'EX',
String(RESILIENCE_SCORE_CACHE_TTL_SECONDS),
]);

@@ -15,7 +15,9 @@ import {
buildRankingItem,
getCachedResilienceScores,
listScorableCountries,
rankingCacheTagMatches,
sortRankingItems,
stampRankingCacheTag,
warmMissingResilienceScores,
type ScoreInterval,
} from './_shared';
@@ -88,8 +90,21 @@ export const getResilienceRanking: ResilienceServiceHandler['getResilienceRankin
return true;
})();
if (!forceRefresh) {
const cached = await getCachedJson(RESILIENCE_RANKING_CACHE_KEY) as GetResilienceRankingResponse | null;
if (cached != null && (cached.items.length > 0 || (cached.greyedOut?.length ?? 0) > 0)) return cached;
const cached = await getCachedJson(RESILIENCE_RANKING_CACHE_KEY) as (GetResilienceRankingResponse & { _formula?: string }) | null;
// Stale-formula gate: the ranking cache key is bumped at PR deploy,
// but the flag flip happens later, so the v10 namespace starts out
// filled with 6-domain rankings. Without this check, a flip would
// serve the legacy ranking aggregate for up to the 12h ranking TTL
// even as per-country reads produced pillar-combined scores. Drop
// stale-formula hits so the recompute-and-publish path below runs.
const tagMatches = cached != null && rankingCacheTagMatches(cached);
if (tagMatches && (cached!.items.length > 0 || (cached!.greyedOut?.length ?? 0) > 0)) {
// Strip the cache-only tag before returning to callers so the
// wire shape matches the generated proto response type.
const { _formula: _drop, ...publicResponse } = cached!;
void _drop;
return publicResponse as GetResilienceRankingResponse;
}
}
const countryCodes = await listScorableCountries();
@@ -132,8 +147,12 @@ export const getResilienceRanking: ResilienceServiceHandler['getResilienceRankin
// self-heal here ensures we at least log it, and the seeder also verifies
// BOTH keys post-refresh. If either SET didn't return OK we log a warning
// that ops can grep for, rather than silently succeeding.
// Tag the persisted ranking so the stale-formula gate above can
// detect a cross-formula cache hit after a flag flip. The tag is
// stripped on read before the response crosses back to callers.
const persistedRanking = stampRankingCacheTag(response);
const pipelineResult = await runRedisPipeline([
['SET', RESILIENCE_RANKING_CACHE_KEY, JSON.stringify(response), 'EX', RESILIENCE_RANKING_CACHE_TTL_SECONDS],
['SET', RESILIENCE_RANKING_CACHE_KEY, JSON.stringify(persistedRanking), 'EX', RESILIENCE_RANKING_CACHE_TTL_SECONDS],
['SET', RESILIENCE_RANKING_META_KEY, JSON.stringify({
fetchedAt: Date.now(),
count: response.items.length + response.greyedOut.length,

@@ -28,7 +28,7 @@ describe('resilience handlers', () => {
delete process.env.VERCEL_ENV;
const { fetchImpl, redis, sortedSets } = createRedisFetch(RESILIENCE_FIXTURES);
sortedSets.set('resilience:history:v4:US', [
sortedSets.set('resilience:history:v5:US', [
{ member: '2026-04-01:20', score: 20260401 },
{ member: '2026-04-02:30', score: 20260402 },
]);
@@ -55,16 +55,16 @@ describe('resilience handlers', () => {
assert.ok(response.stressFactor >= 0 && response.stressFactor <= 0.5, `stressFactor out of bounds: ${response.stressFactor}`);
assert.equal(response.dataVersion, '2024-04-03', 'dataVersion should be the ISO date from seed-meta fetchedAt');
const cachedScore = redis.get('resilience:score:v9:US');
const cachedScore = redis.get('resilience:score:v10:US');
assert.ok(cachedScore, 'expected score cache to be written');
assert.equal(JSON.parse(cachedScore || '{}').countryCode, 'US');
const history = sortedSets.get('resilience:history:v4:US') ?? [];
const history = sortedSets.get('resilience:history:v5:US') ?? [];
assert.ok(history.some((entry) => entry.member.startsWith(today + ':')), 'expected today history member to be written');
await getResilienceScore({ request: new Request('https://example.com') } as never, {
countryCode: 'US',
});
assert.equal((sortedSets.get('resilience:history:v4:US') ?? []).length, history.length, 'cache hit must not append history');
assert.equal((sortedSets.get('resilience:history:v5:US') ?? []).length, history.length, 'cache hit must not append history');
});
});

@@ -157,8 +157,8 @@ describe('pillar constants', () => {
assert.equal(PENALTY_ALPHA, 0.50);
});
it('RESILIENCE_SCORE_CACHE_PREFIX is v9', () => {
assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v9:');
it('RESILIENCE_SCORE_CACHE_PREFIX is v10', () => {
assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v10:');
});
it('PILLAR_ORDER has 3 entries', () => {

@@ -0,0 +1,247 @@
// Phase 2 T2.3 activation test suite.
//
// Exercises the `RESILIENCE_PILLAR_COMBINE_ENABLED` flag: when set,
// `overallScore` switches from the 6-domain weighted aggregate to the
// penalized pillar-combined form. The existing release-gate tests
// (tests/resilience-release-gate.test.mts) cover the default (flag=off)
// path and pin the anchors for the 6-domain formula; this file covers
// the re-anchored bands under the pillar combine.
//
// Why separate file: the existing release-gate test imports
// `getResilienceScore` at the top of the file (captures the legacy
// overallScore path) and runs many asserts that would become stale
// under the pillar combine. A separate file lets us flip the env flag
// in a per-test setup/teardown cleanly.
import assert from 'node:assert/strict';
import { afterEach, beforeEach, describe, it } from 'node:test';
import { getResilienceRanking } from '../server/worldmonitor/resilience/v1/get-resilience-ranking.ts';
import { getResilienceScore } from '../server/worldmonitor/resilience/v1/get-resilience-score.ts';
import {
isPillarCombineEnabled,
penalizedPillarScore,
} from '../server/worldmonitor/resilience/v1/_shared.ts';
import { createRedisFetch } from './helpers/fake-upstash-redis.mts';
import {
buildReleaseGateFixtures,
} from './helpers/resilience-release-fixtures.mts';
// Re-anchored bands for the pillar-combined formula, derived from the
// 52-country live-Redis sensitivity capture in
// docs/snapshots/resilience-pillar-sensitivity-2026-04-21.json.
// Old (6-domain): NO ≥ 70, YE/SO/CD ≤ 35, NO - US ≥ 8.
// New (pillar combine, α=0.5): every country drops ~13 points, top
// stays ~65-72, fragile states drop to ~15-35. The re-anchored bands
// preserve the "high" vs "low" separation without pinning numbers that
// are only valid for the legacy formula.
const HIGH_BAND_FLOOR = 60;
const LOW_BAND_CEILING = 40;
const MIN_HIGH_LOW_SEPARATION = 20;
const fixtures = buildReleaseGateFixtures();
const originalFetch = globalThis.fetch;
const originalRedisUrl = process.env.UPSTASH_REDIS_REST_URL;
const originalRedisToken = process.env.UPSTASH_REDIS_REST_TOKEN;
const originalVercelEnv = process.env.VERCEL_ENV;
const originalPillarFlag = process.env.RESILIENCE_PILLAR_COMBINE_ENABLED;
function installRedisFixtures() {
process.env.UPSTASH_REDIS_REST_URL = 'https://redis.example';
process.env.UPSTASH_REDIS_REST_TOKEN = 'token';
delete process.env.VERCEL_ENV;
const redisState = createRedisFetch(fixtures);
globalThis.fetch = redisState.fetchImpl;
return redisState;
}
function enablePillarCombine(): void {
process.env.RESILIENCE_PILLAR_COMBINE_ENABLED = 'true';
}
function disablePillarCombine(): void {
process.env.RESILIENCE_PILLAR_COMBINE_ENABLED = 'false';
}
describe('pillar-combined score activation', () => {
beforeEach(() => {
enablePillarCombine();
});
afterEach(() => {
globalThis.fetch = originalFetch;
if (originalRedisUrl == null) delete process.env.UPSTASH_REDIS_REST_URL;
else process.env.UPSTASH_REDIS_REST_URL = originalRedisUrl;
if (originalRedisToken == null) delete process.env.UPSTASH_REDIS_REST_TOKEN;
else process.env.UPSTASH_REDIS_REST_TOKEN = originalRedisToken;
if (originalVercelEnv == null) delete process.env.VERCEL_ENV;
else process.env.VERCEL_ENV = originalVercelEnv;
if (originalPillarFlag == null) delete process.env.RESILIENCE_PILLAR_COMBINE_ENABLED;
else process.env.RESILIENCE_PILLAR_COMBINE_ENABLED = originalPillarFlag;
});
it('isPillarCombineEnabled reads env dynamically', () => {
enablePillarCombine();
assert.equal(isPillarCombineEnabled(), true);
disablePillarCombine();
assert.equal(isPillarCombineEnabled(), false);
enablePillarCombine();
assert.equal(isPillarCombineEnabled(), true);
});
it('penalizedPillarScore collapses to weighted-sum when all pillars equal (penalty minimal)', () => {
// All pillars at 80 → min=80 → penalty = 1 - 0.5*(1 - 0.8) = 0.9.
// Weighted sum = 80 * (0.40 + 0.35 + 0.25) = 80.
// Final = 80 * 0.9 = 72.
const result = penalizedPillarScore([
{ score: 80, weight: 0.40 },
{ score: 80, weight: 0.35 },
{ score: 80, weight: 0.25 },
]);
assert.equal(Math.round(result * 100) / 100, 72.00);
});
it('pillar-combined overallScore drops NO below the 6-domain band floor (expected, re-anchored)', async () => {
installRedisFixtures();
const response = await getResilienceScore(
{ request: new Request('https://example.com?countryCode=NO') } as never,
{ countryCode: 'NO' },
);
// Norway under the 6-domain formula scores ~86 under the current
// fixtures (pinned by T1.1 regression test). Under the pillar
// combine it drops to roughly the low-70s because penalty = 1 -
// 0.5 × (1 - min_pillar/100) is always ≤ 1. The activated path's
// HIGH_BAND_FLOOR = 60 leaves plenty of headroom above mid-tier
// countries while accepting that elite scores no longer sit in the
// 85+ range.
assert.ok(
response.overallScore >= HIGH_BAND_FLOOR,
`NO in the pillar-combined formula must stay above the re-anchored high-band floor (${HIGH_BAND_FLOOR}), got ${response.overallScore}`,
);
assert.ok(
response.overallScore <= 90,
`NO in the pillar-combined formula should NOT exceed 90 — penalty factor is always ≤ 1, so getting close to 100 would indicate the penalty is not firing. Got ${response.overallScore}.`,
);
});
it('pillar-combined overallScore keeps fragile countries (YE, SO) below the re-anchored low-band ceiling', async () => {
installRedisFixtures();
for (const countryCode of ['YE', 'SO'] as const) {
const response = await getResilienceScore(
{ request: new Request(`https://example.com?countryCode=${countryCode}`) } as never,
{ countryCode },
);
assert.ok(
response.overallScore <= LOW_BAND_CEILING,
`${countryCode} in the pillar-combined formula must stay below the re-anchored low-band ceiling (${LOW_BAND_CEILING}), got ${response.overallScore}`,
);
}
});
it('pillar-combined preserves NO vs US separation (high-band vs mid-band)', async () => {
installRedisFixtures();
const [no, us] = await Promise.all([
getResilienceScore({ request: new Request('https://example.com?countryCode=NO') } as never, { countryCode: 'NO' }),
getResilienceScore({ request: new Request('https://example.com?countryCode=US') } as never, { countryCode: 'US' }),
]);
// The 6-domain separation was ~14 points under fixtures. The
// pillar combine amplifies penalty on imbalanced pillar profiles
// (US has a weaker live-shock pillar than Norway), so the
// separation is expected to hold or widen.
assert.ok(
no.overallScore > us.overallScore,
`NO (${no.overallScore}) must still outscore US (${us.overallScore}) under the pillar combine`,
);
assert.ok(
no.overallScore - us.overallScore >= MIN_HIGH_LOW_SEPARATION - 12,
`NO - US separation must stay ≥ ${MIN_HIGH_LOW_SEPARATION - 12} under pillar combine; got NO=${no.overallScore}, US=${us.overallScore}, Δ=${(no.overallScore - us.overallScore).toFixed(2)}`,
);
});
it('pillar-combined ranking preserves the elite vs fragile ordering over the release set', async () => {
installRedisFixtures();
const ranking = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
const byCountry = new Map(ranking.items.map((item) => [item.countryCode, item]));
// Every high-band anchor (if present in the ranking) must outrank
// every low-band anchor (if present). This is the structural
// invariant the pillar combine must preserve to be accepted.
const highAnchors = ['NO', 'CH', 'DK', 'IS', 'FI', 'SE', 'NZ'].filter((cc) => byCountry.has(cc));
const lowAnchors = ['YE', 'SO', 'SD', 'CD'].filter((cc) => byCountry.has(cc));
for (const high of highAnchors) {
for (const low of lowAnchors) {
const highScore = byCountry.get(high)!.overallScore;
const lowScore = byCountry.get(low)!.overallScore;
assert.ok(
highScore > lowScore,
`pillar-combined ranking must keep ${high} (${highScore}) above ${low} (${lowScore})`,
);
}
}
});
it('disabling the flag restores the 6-domain aggregate (regression guard for the default path)', async () => {
installRedisFixtures();
disablePillarCombine();
const response = await getResilienceScore(
{ request: new Request('https://example.com?countryCode=NO') } as never,
{ countryCode: 'NO' },
);
// Under the 6-domain formula + current fixtures, NO is pinned at
// ≥ 70 by the existing release-gate test. The flag-off code path
// is the same one the production default uses; we verify here that
// switching the flag off mid-suite really does restore it (the
// dynamic env read in isPillarCombineEnabled() is load-bearing).
assert.ok(
response.overallScore >= 70,
`with flag off, NO must still meet the 6-domain release-gate floor (70), got ${response.overallScore}`,
);
});
it('flipping the flag mid-session rebuilds the score (stale-formula cache invalidation)', async () => {
// This is the core guarantee for the activation story: merging this
// PR with flag=false populates cached scores tagged _formula='d6',
// and later setting RESILIENCE_PILLAR_COMBINE_ENABLED=true MUST
// force a rebuild on next read (rather than serving the d6-tagged
// entry for up to 6h until the TTL expires). We simulate the flip
// inside a single test by pre-computing a cache entry with the
// flag off, flipping the flag, then reading again — the second
// read must produce a different overallScore because the cache
// entry's _formula no longer matches the current formula.
disablePillarCombine();
installRedisFixtures();
const firstRead = await getResilienceScore(
{ request: new Request('https://example.com?countryCode=NO') } as never,
{ countryCode: 'NO' },
);
assert.ok(firstRead.overallScore >= 70, `flag-off NO should score ≥70, got ${firstRead.overallScore}`);
// Flip the flag. The cached entry in Redis still carries
// _formula='d6' from the first read. Without the stale-formula
// gate, the second read would serve that same 6-domain score.
enablePillarCombine();
const secondRead = await getResilienceScore(
{ request: new Request('https://example.com?countryCode=NO') } as never,
{ countryCode: 'NO' },
);
assert.ok(
secondRead.overallScore < firstRead.overallScore,
`flag-on rebuild must drop NO's score below the 6-domain value (penalty factor ≤ 1); got first=${firstRead.overallScore} second=${secondRead.overallScore}. If these are equal, the stale-formula cache gate is not firing and a flag flip in production would serve legacy values for up to the 6h TTL.`,
);
assert.ok(
secondRead.overallScore >= 60,
`flag-on NO should still meet the re-anchored 60 floor, got ${secondRead.overallScore}`,
);
});
});
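
The flag-off and mid-session flip tests above only work if the flag is read at call time rather than captured once at module load. A minimal sketch of that difference follows. It assumes the flag is enabled only by the literal string 'true' (the exact parsing in _shared.ts is not shown in this diff), and it passes the environment in as a hypothetical `env` parameter so the snippet stays self-contained. The function names are illustrative, not the real helpers.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical stand-in for isPillarCombineEnabled(): evaluates the
// flag on every call. Accepting only 'true' is an assumption.
function isPillarCombineEnabledSketch(env: Env): boolean {
  return env['RESILIENCE_PILLAR_COMBINE_ENABLED'] === 'true';
}

// Anti-pattern for comparison: the value is frozen when the factory
// runs, so later changes to env are never observed.
function makeFrozenFlag(env: Env): () => boolean {
  const captured = env['RESILIENCE_PILLAR_COMBINE_ENABLED'] === 'true';
  return () => captured;
}

const env: Env = {};
const frozen = makeFrozenFlag(env);

// Simulate enablePillarCombine() being called mid-suite.
env['RESILIENCE_PILLAR_COMBINE_ENABLED'] = 'true';

const dynamicAfterFlip = isPillarCombineEnabledSketch(env); // observes the flip
const frozenAfterFlip = frozen(); // still reports the load-time value
```

With the frozen variant, a test that flips the flag mid-suite would keep exercising whichever formula was active at import time.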


@@ -23,13 +23,29 @@ const __filename = fileURLToPath(import.meta.url);
const REPO_ROOT = path.resolve(path.dirname(__filename), '..');
const SNAPSHOT_DIR = path.join(REPO_ROOT, 'docs', 'snapshots');
// Band anchors from the release-gate tests (tests/resilience-release-gate.test.mts).
// Countries in the high-anchor set must never drop below 70 in a published
// snapshot; countries in the low-anchor set must never climb above 45.
// Band anchors from the release-gate tests (tests/resilience-release-gate.test.mts
// and tests/resilience-pillar-combine-activation.test.mts).
// Floors/ceilings depend on the methodology formula the snapshot was
// captured under — the pillar-combined form is non-compensatory so its
// scale is compressed; the 6-domain legacy form is compensatory and
// runs ~13 points hotter.
const HIGH_BAND_ANCHORS = new Set(['NO', 'CH', 'DK', 'IS', 'FI', 'SE', 'NZ']);
const LOW_BAND_ANCHORS = new Set(['YE', 'SO', 'SD', 'CD']);
const HIGH_BAND_FLOOR = 70;
const LOW_BAND_CEILING = 45;
const METHODOLOGY_BANDS: Record<string, { highFloor: number; lowCeiling: number }> = {
'domain-weighted-6d': { highFloor: 70, lowCeiling: 45 },
'pillar-combined-penalized-v1': { highFloor: 60, lowCeiling: 40 },
};
function resolveBands(methodologyFormula: string | undefined): { highFloor: number; lowCeiling: number } {
// Unknown / unspecified formulas fall through to the 6-domain bands
// (the production default at the time of writing). If a future
// snapshot uses a new formula id, adding an entry to
// METHODOLOGY_BANDS above is the one-line fix; until then we assume
// the legacy bands rather than silently under-validating.
return METHODOLOGY_BANDS[methodologyFormula ?? 'domain-weighted-6d']
?? METHODOLOGY_BANDS['domain-weighted-6d']!;
}
interface PublishedRow {
rank: number;
@@ -53,6 +69,7 @@ interface SnapshotPublished {
capturedAt: string;
commitSha: string;
schemaVersion: string;
methodologyFormula?: string;
methodology: {
domainCount: number;
dimensionCount: number;
@@ -82,14 +99,49 @@ interface SnapshotLive {
greyedOut: Array<{ countryCode: string; overallCoverage: number }>;
}
type Snapshot = SnapshotPublished | SnapshotLive;
interface ProjectedRow {
rankInSample: number;
countryCode: string;
countryName: string;
proposedOverallScore: number;
currentOverallScore: number;
scoreDelta: number;
}
interface SnapshotProjected {
capturedAt: string;
commitSha: string;
schemaVersion: string;
methodologyFormula: string;
methodology: {
domainCount: number;
dimensionCount: number;
pillarCount: number;
greyOutThreshold: number;
};
sampleSize: number;
tables: {
topTenInSample: ProjectedRow[];
bottomTenInSample: ProjectedRow[];
majorEconomiesInSample: ProjectedRow[];
};
totals: { rankedCountriesInSample: number };
}
type Snapshot = SnapshotPublished | SnapshotLive | SnapshotProjected;
function isLive(snapshot: Snapshot): snapshot is SnapshotLive {
return Array.isArray((snapshot as SnapshotLive).items);
}
function isProjected(snapshot: Snapshot): snapshot is SnapshotProjected {
const tables = (snapshot as SnapshotProjected).tables;
return !!tables && Array.isArray(tables.topTenInSample);
}
function isPublished(snapshot: Snapshot): snapshot is SnapshotPublished {
return (snapshot as SnapshotPublished).tables != null;
const tables = (snapshot as SnapshotPublished).tables;
return !!tables && Array.isArray(tables.topTen);
}
function loadSnapshots(): { filename: string; snapshot: Snapshot }[] {
@@ -99,8 +151,17 @@ function loadSnapshots(): { filename: string; snapshot: Snapshot }[] {
} catch {
return [];
}
// Matches two filename shapes:
// resilience-ranking-YYYY-MM-DD.json
// → published or live capture (the authoritative shape)
// resilience-ranking-<slug>-YYYY-MM-DD.json
// → projected / preview snapshot (e.g. pillar-combined-projected)
// Auto-discovered so the projected artifact does not slip
// through unvalidated. Slug must start with a lowercase letter,
// contain only lowercase letters, digits, and hyphens, and sit
// before the date.
const RANKING_SNAPSHOT_RE = /^resilience-ranking-(?:[a-z][a-z0-9-]*-)?\d{4}-\d{2}-\d{2}\.json$/;
return entries
.filter((name) => /^resilience-ranking-\d{4}-\d{2}-\d{2}\.json$/.test(name))
.filter((name) => RANKING_SNAPSHOT_RE.test(name))
.sort()
.map((filename) => ({
filename,
@@ -205,22 +266,24 @@ describe('resilience-ranking snapshots', () => {
assert.ok(unique.size >= Math.max(snapshot.tables.topTen.length, snapshot.tables.bottomTen.length));
});
it('high-band anchors appearing in topTen stay above the release-gate floor', () => {
it('high-band anchors appearing in topTen stay above the release-gate floor (methodology-aware)', () => {
const { highFloor } = resolveBands(snapshot.methodologyFormula);
for (const row of snapshot.tables.topTen) {
if (!HIGH_BAND_ANCHORS.has(row.countryCode)) continue;
assert.ok(
row.overallScore >= HIGH_BAND_FLOOR,
`${row.countryCode} (${row.countryName}) is a high-band anchor and must stay ≥${HIGH_BAND_FLOOR}, got ${row.overallScore}`,
row.overallScore >= highFloor,
`${row.countryCode} (${row.countryName}) is a high-band anchor and must stay ≥${highFloor} under "${snapshot.methodologyFormula ?? 'domain-weighted-6d'}", got ${row.overallScore}`,
);
}
});
it('low-band anchors appearing in bottomTen stay below the release-gate ceiling', () => {
it('low-band anchors appearing in bottomTen stay below the release-gate ceiling (methodology-aware)', () => {
const { lowCeiling } = resolveBands(snapshot.methodologyFormula);
for (const row of snapshot.tables.bottomTen) {
if (!LOW_BAND_ANCHORS.has(row.countryCode)) continue;
assert.ok(
row.overallScore <= LOW_BAND_CEILING,
`${row.countryCode} (${row.countryName}) is a low-band anchor and must stay ≤${LOW_BAND_CEILING}, got ${row.overallScore}`,
row.overallScore <= lowCeiling,
`${row.countryCode} (${row.countryName}) is a low-band anchor and must stay ≤${lowCeiling} under "${snapshot.methodologyFormula ?? 'domain-weighted-6d'}", got ${row.overallScore}`,
);
}
});
@@ -293,23 +356,130 @@ describe('resilience-ranking snapshots', () => {
assert.equal(snapshot.totals.greyedOutCount, snapshot.greyedOut.length);
});
it('live band anchors sit in their expected bands (structural sanity)', () => {
it('live band anchors sit in their expected bands (methodology-aware structural sanity)', () => {
const { highFloor, lowCeiling } = resolveBands((snapshot as SnapshotLive & { methodologyFormula?: string }).methodologyFormula);
for (const item of snapshot.items) {
if (HIGH_BAND_ANCHORS.has(item.countryCode)) {
assert.ok(
item.overallScore >= HIGH_BAND_FLOOR,
`${item.countryCode} is a high-band anchor but scored ${item.overallScore} (< ${HIGH_BAND_FLOOR}) at rank ${item.rank}`,
item.overallScore >= highFloor,
`${item.countryCode} is a high-band anchor but scored ${item.overallScore} (< ${highFloor}) at rank ${item.rank}`,
);
}
if (LOW_BAND_ANCHORS.has(item.countryCode)) {
assert.ok(
item.overallScore <= LOW_BAND_CEILING,
`${item.countryCode} is a low-band anchor but scored ${item.overallScore} (> ${LOW_BAND_CEILING}) at rank ${item.rank}`,
item.overallScore <= lowCeiling,
`${item.countryCode} is a low-band anchor but scored ${item.overallScore} (> ${lowCeiling}) at rank ${item.rank}`,
);
}
}
});
}
if (isProjected(snapshot)) {
// Projected snapshots are preview artifacts built from a
// sample (e.g. the 52-country sensitivity capture) against the
// proposed formula. They carry in-sample ranks, not global
// ranks, and use different table keys (topTenInSample rather
// than topTen) to avoid being mistaken for authoritative
// captures. Still validated here so the artifact does not ship
// with broken shape or out-of-band scores.
it('projected snapshot declares a known methodologyFormula', () => {
const known = new Set(Object.keys(METHODOLOGY_BANDS));
assert.ok(
known.has(snapshot.methodologyFormula),
`projected snapshot methodologyFormula="${snapshot.methodologyFormula}" must be one of [${[...known].join(', ')}]; add it to METHODOLOGY_BANDS at the top of this file when introducing a new formula id`,
);
});
it('projected topTenInSample ranks are 1..10, scores descend, every score in (0, 100)', () => {
const rows = snapshot.tables.topTenInSample;
assert.equal(rows.length, 10);
for (let i = 0; i < rows.length; i++) {
assert.equal(rows[i]!.rankInSample, i + 1, `topTenInSample[${i}].rankInSample should be ${i + 1}, got ${rows[i]!.rankInSample}`);
assert.ok(
rows[i]!.proposedOverallScore > 0 && rows[i]!.proposedOverallScore < 100,
`${rows[i]!.countryCode} proposedOverallScore=${rows[i]!.proposedOverallScore} must be in (0, 100)`,
);
if (i > 0) {
assert.ok(
rows[i]!.proposedOverallScore <= rows[i - 1]!.proposedOverallScore,
`topTenInSample must be monotonically non-increasing at in-sample rank ${rows[i]!.rankInSample}: ${rows[i - 1]!.proposedOverallScore} then ${rows[i]!.proposedOverallScore}`,
);
}
}
});
it('projected bottomTenInSample ranks are contiguous and descend in score', () => {
const rows = snapshot.tables.bottomTenInSample;
assert.equal(rows.length, 10);
for (let i = 1; i < rows.length; i++) {
assert.equal(
rows[i]!.rankInSample,
rows[i - 1]!.rankInSample + 1,
`bottomTenInSample ranks must be contiguous: ${rows[i - 1]!.rankInSample} then ${rows[i]!.rankInSample}`,
);
assert.ok(
rows[i]!.proposedOverallScore <= rows[i - 1]!.proposedOverallScore,
`bottomTenInSample scores must not increase with worsening rank: ${rows[i - 1]!.countryCode}=${rows[i - 1]!.proposedOverallScore} then ${rows[i]!.countryCode}=${rows[i]!.proposedOverallScore}`,
);
}
assert.equal(
rows[rows.length - 1]!.rankInSample,
snapshot.totals.rankedCountriesInSample,
`bottomTenInSample.last.rankInSample=${rows[rows.length - 1]!.rankInSample} must equal totals.rankedCountriesInSample=${snapshot.totals.rankedCountriesInSample}`,
);
});
it('projected scoreDelta equals proposed - current to within rounding', () => {
const all = [
...snapshot.tables.topTenInSample,
...snapshot.tables.bottomTenInSample,
...snapshot.tables.majorEconomiesInSample,
];
for (const row of all) {
const expected = Math.round((row.proposedOverallScore - row.currentOverallScore) * 100) / 100;
assert.ok(
Math.abs(row.scoreDelta - expected) < 0.02,
`${row.countryCode} scoreDelta=${row.scoreDelta} must equal proposed - current = ${expected}`,
);
}
});
it('projected band anchors sit in their expected bands under the declared methodology', () => {
const { highFloor, lowCeiling } = resolveBands(snapshot.methodologyFormula);
for (const row of snapshot.tables.topTenInSample) {
if (!HIGH_BAND_ANCHORS.has(row.countryCode)) continue;
assert.ok(
row.proposedOverallScore >= highFloor,
`${row.countryCode} is a high-band anchor in topTenInSample but scored ${row.proposedOverallScore} (< ${highFloor}) under "${snapshot.methodologyFormula}"`,
);
}
for (const row of snapshot.tables.bottomTenInSample) {
if (!LOW_BAND_ANCHORS.has(row.countryCode)) continue;
assert.ok(
row.proposedOverallScore <= lowCeiling,
`${row.countryCode} is a low-band anchor in bottomTenInSample but scored ${row.proposedOverallScore} (> ${lowCeiling}) under "${snapshot.methodologyFormula}"`,
);
}
});
it('projected snapshot does not confuse itself with a live-universe capture', () => {
// Two structural guards so a projected snapshot cannot
// silently slip into the authoritative slot: it must NOT
// carry the full-universe top/bottom keys, and its file
// slug must identify it as a preview.
assert.equal(
(snapshot as unknown as SnapshotPublished).tables?.topTen,
undefined,
'projected snapshots must not also expose tables.topTen (reserved for authoritative captures)',
);
assert.ok(
filename !== `resilience-ranking-${snapshot.capturedAt}.json`,
`projected snapshots must use a slug-prefixed filename, got ${filename}`,
);
});
}
});
}
});
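
To make the widened snapshot discovery concrete, here is a small sketch that runs the `RANKING_SNAPSHOT_RE` pattern from the hunk above against a handful of filenames. The filenames are hypothetical and chosen only for illustration; the slug `pillar-combined-projected` comes from the comment in the diff, and the rest are assumptions.

```typescript
// Pattern copied from the diff above.
const RANKING_SNAPSHOT_RE = /^resilience-ranking-(?:[a-z][a-z0-9-]*-)?\d{4}-\d{2}-\d{2}\.json$/;

// Hypothetical directory listing (illustrative names only).
const candidates: string[] = [
  'resilience-ranking-pillar-combined-projected-2026-04-21.json', // slugged preview
  'resilience-ranking-2026-04-21.json', // authoritative shape
  'resilience-ranking-Preview-2026-04-21.json', // uppercase slug: rejected
  'resilience-ranking-2026-04-21-draft.json', // suffix after the date: rejected
  'resilience-pillar-sensitivity-2026-04-21.json', // different prefix: rejected
];

// Same filter-then-sort shape that loadSnapshots() uses.
const discovered: string[] = candidates
  .filter((name) => RANKING_SNAPSHOT_RE.test(name))
  .sort();
```

Only the first two candidates survive the filter. After the sort, the undated-slug file comes first because the digit `2` orders before the letter `p`.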


@@ -47,44 +47,137 @@ describe('resilience ranking contracts', () => {
it('returns the cached ranking payload unchanged when the ranking cache already exists', async () => {
const { redis } = installRedis(RESILIENCE_FIXTURES);
const cached = {
const cachedPublic = {
items: [
{ countryCode: 'NO', overallScore: 82, level: 'high', lowConfidence: false, overallCoverage: 0.95 },
{ countryCode: 'US', overallScore: 61, level: 'medium', lowConfidence: false, overallCoverage: 0.88 },
],
greyedOut: [],
};
redis.set('resilience:ranking:v9', JSON.stringify(cached));
// The handler's stale-formula gate rejects untagged ranking entries,
// so fixtures must carry the `_formula` tag matching the current env
// (default flag-off ⇒ 'd6'). Writing the tagged shape here mirrors
// what the handler persists via stampRankingCacheTag.
redis.set('resilience:ranking:v10', JSON.stringify({ ...cachedPublic, _formula: 'd6' }));
const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
assert.deepEqual(response, cached);
assert.equal(redis.has('resilience:score:v9:YE'), false, 'cache hit must not trigger score warmup');
// The handler strips `_formula` before returning, so response matches
// the public shape rather than the on-wire cache shape.
assert.deepEqual(response, cachedPublic);
assert.equal(redis.has('resilience:score:v10:YE'), false, 'cache hit must not trigger score warmup');
});
it('returns all-greyed-out cached payload without rewarming (items=[], greyedOut non-empty)', async () => {
// Regression for: `cached?.items?.length` was falsy when items=[] even though
// greyedOut had entries, causing unnecessary rewarming on every request.
const { redis } = installRedis(RESILIENCE_FIXTURES);
const cached = {
const cachedPublic = {
items: [],
greyedOut: [
{ countryCode: 'SS', overallScore: 12, level: 'critical', lowConfidence: true, overallCoverage: 0.15 },
{ countryCode: 'ER', overallScore: 10, level: 'critical', lowConfidence: true, overallCoverage: 0.12 },
],
};
redis.set('resilience:ranking:v9', JSON.stringify(cached));
redis.set('resilience:ranking:v10', JSON.stringify({ ...cachedPublic, _formula: 'd6' }));
const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
assert.deepEqual(response, cached);
assert.equal(redis.has('resilience:score:v9:SS'), false, 'all-greyed-out cache hit must not trigger score warmup');
assert.deepEqual(response, cachedPublic);
assert.equal(redis.has('resilience:score:v10:SS'), false, 'all-greyed-out cache hit must not trigger score warmup');
});
it('bulk-read path skips untagged per-country score entries (legacy writes must rebuild on flip)', async () => {
// Pins the fix for a subtle bug: getCachedResilienceScores used
// `parsed._formula && parsed._formula !== current` which short-
// circuits on undefined. An untagged score entry — produced by a
// pre-PR code path or by an external writer that has not been
// updated — would therefore be ADMITTED into the ranking under the
// current formula instead of being treated as stale and re-warmed.
// On activation day that would mean a mixed-formula ranking for up
// to the 6h score TTL even though the single-country cache-miss
// path (ensureResilienceScoreCached) correctly invalidates the
// same entry. This test writes two per-country score keys, one
// tagged `_formula: 'd6'` and one untagged, and asserts the
// ranking warm path runs for the untagged country (meaning the
// bulk read skipped it).
const { redis } = installRedis(RESILIENCE_FIXTURES);
redis.set('resilience:static:index:v1', JSON.stringify({
countries: ['NO', 'US'],
recordCount: 2,
failedDatasets: [],
seedYear: 2026,
}));
const domain = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
// Tagged entry: served as-is.
redis.set('resilience:score:v10:NO', JSON.stringify({
countryCode: 'NO', overallScore: 82, level: 'high',
domains: domain, trend: 'stable', change30d: 1.2,
lowConfidence: false, imputationShare: 0.05, _formula: 'd6',
}));
// Untagged entry: must be rejected, ranking warm rebuilds US.
redis.set('resilience:score:v10:US', JSON.stringify({
countryCode: 'US', overallScore: 61, level: 'medium',
domains: domain, trend: 'rising', change30d: 4.3,
lowConfidence: false, imputationShare: 0.1,
// NOTE: no _formula field.
}));
await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
// After the ranking run, the US entry in Redis must now carry
// `_formula: 'd6'`. If the bulk read had ADMITTED the untagged
// entry (the pre-fix bug), the warm path for US would not have
// run, and the stored value would still be untagged.
const rewrittenRaw = redis.get('resilience:score:v10:US');
assert.ok(rewrittenRaw, 'US entry must remain in Redis after the ranking run');
const rewritten = JSON.parse(rewrittenRaw!);
assert.equal(
rewritten._formula,
'd6',
'untagged US entry must be rejected by the bulk read so the warm path rebuilds it with the current formula tag. If `_formula` is still undefined here, getCachedResilienceScores is admitting untagged entries.',
);
});
it('rejects a stale-formula ranking cache entry and recomputes even without ?refresh=1', async () => {
// Pins the cross-formula isolation: when the env flag is off (default)
// and the ranking cache carries _formula='pc' (written during a prior
// flag-on deploy that has since been rolled back), the handler must
// NOT serve the stale-formula entry. It must recompute from the
// per-country scores instead. Without this behavior, a flag
// rollback would leave the old ranking in place for up to the 12h
// ranking TTL even though scores were already back on the 6-domain
// formula.
const { redis } = installRedis(RESILIENCE_FIXTURES);
const stale = {
items: [
{ countryCode: 'NO', overallScore: 99, level: 'high', lowConfidence: false, overallCoverage: 0.95 },
],
greyedOut: [],
_formula: 'pc', // mismatched — current env is flag-off ⇒ current='d6'
};
redis.set('resilience:ranking:v10', JSON.stringify(stale));
const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
assert.notDeepEqual(
response,
{ items: stale.items, greyedOut: stale.greyedOut },
'stale-formula ranking must be rejected, not served',
);
// Recompute path warms missing per-country scores, so YE (in
// RESILIENCE_FIXTURES) must get scored during this call.
assert.ok(
redis.has('resilience:score:v10:YE'),
'stale-formula reject must trigger the recompute-and-warm path',
);
});
it('warms missing scores synchronously and returns complete ranking on first call', async () => {
const { redis } = installRedis(RESILIENCE_FIXTURES);
const domainWithCoverage = [{ name: 'political', dimensions: [{ name: 'd1', coverage: 0.9 }] }];
redis.set('resilience:score:v9:NO', JSON.stringify({
redis.set('resilience:score:v10:NO', JSON.stringify({
countryCode: 'NO',
overallScore: 82,
level: 'high',
@@ -94,7 +187,7 @@ describe('resilience ranking contracts', () => {
lowConfidence: false,
imputationShare: 0.05,
}));
redis.set('resilience:score:v9:US', JSON.stringify({
redis.set('resilience:score:v10:US', JSON.stringify({
countryCode: 'US',
overallScore: 61,
level: 'medium',
@@ -109,20 +202,20 @@ describe('resilience ranking contracts', () => {
const totalItems = response.items.length + (response.greyedOut?.length ?? 0);
assert.equal(totalItems, 3, `expected 3 total items across ranked + greyedOut, got ${totalItems}`);
assert.ok(redis.has('resilience:score:v9:YE'), 'missing country should be warmed during first call');
assert.ok(redis.has('resilience:score:v10:YE'), 'missing country should be warmed during first call');
assert.ok(response.items.every((item) => item.overallScore >= 0), 'ranked items should all have computed scores');
assert.ok(redis.has('resilience:ranking:v9'), 'fully scored ranking should be cached');
assert.ok(redis.has('resilience:ranking:v10'), 'fully scored ranking should be cached');
});
it('sets rankStable=true when interval data exists and width <= 8', async () => {
const { redis } = installRedis(RESILIENCE_FIXTURES);
const domainWithCoverage = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
redis.set('resilience:score:v9:NO', JSON.stringify({
redis.set('resilience:score:v10:NO', JSON.stringify({
countryCode: 'NO', overallScore: 82, level: 'high',
domains: domainWithCoverage, trend: 'stable', change30d: 1.2,
lowConfidence: false, imputationShare: 0.05,
}));
redis.set('resilience:score:v9:US', JSON.stringify({
redis.set('resilience:score:v10:US', JSON.stringify({
countryCode: 'US', overallScore: 61, level: 'medium',
domains: domainWithCoverage, trend: 'rising', change30d: 4.3,
lowConfidence: false, imputationShare: 0.1,
@@ -149,12 +242,12 @@ describe('resilience ranking contracts', () => {
seedYear: 2025,
}));
const domainWithCoverage = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
redis.set('resilience:score:v9:NO', JSON.stringify({
redis.set('resilience:score:v10:NO', JSON.stringify({
countryCode: 'NO', overallScore: 82, level: 'high',
domains: domainWithCoverage, trend: 'stable', change30d: 1.2,
lowConfidence: false, imputationShare: 0.05,
}));
redis.set('resilience:score:v9:US', JSON.stringify({
redis.set('resilience:score:v10:US', JSON.stringify({
countryCode: 'US', overallScore: 61, level: 'medium',
domains: domainWithCoverage, trend: 'rising', change30d: 4.3,
lowConfidence: false, imputationShare: 0.1,
@@ -164,7 +257,7 @@ describe('resilience ranking contracts', () => {
// 3 of 4 (NO + US pre-cached, YE warmed from fixtures, ZZ can't be warmed)
// = 75% which meets the threshold — must cache.
assert.ok(redis.has('resilience:ranking:v9'), 'ranking must be cached at exactly 75% coverage');
assert.ok(redis.has('resilience:ranking:v10'), 'ranking must be cached at exactly 75% coverage');
assert.ok(redis.has('seed-meta:resilience:ranking'), 'seed-meta must be written alongside the ranking');
});
@@ -195,7 +288,7 @@ describe('resilience ranking contracts', () => {
if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
const commands = JSON.parse(init.body) as Array<Array<string>>;
const allScoreReads = commands.length > 0 && commands.every(
(cmd) => cmd[0] === 'GET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v9:'),
(cmd) => cmd[0] === 'GET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v10:'),
);
if (allScoreReads) {
// Simulate visibility lag: pretend no scores are cached yet.
@@ -211,7 +304,7 @@ describe('resilience ranking contracts', () => {
await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
assert.ok(redis.has('resilience:ranking:v9'), 'ranking must be published despite pipeline-GET race');
assert.ok(redis.has('resilience:ranking:v10'), 'ranking must be published despite pipeline-GET race');
assert.ok(redis.has('seed-meta:resilience:ranking'), 'seed-meta must be written despite pipeline-GET race');
});
@@ -219,8 +312,8 @@ describe('resilience ranking contracts', () => {
// Reviewer regression: passing `raw=true` to runRedisPipeline bypasses the
// env-based key prefix (preview: / dev:) that isolates preview deploys
// from production. The symptom is asymmetric: preview reads hit
// `preview:<sha>:resilience:score:v9:XX` while preview writes landed at
// raw `resilience:score:v9:XX`, simultaneously (a) missing the preview
// `preview:<sha>:resilience:score:v10:XX` while preview writes landed at
// raw `resilience:score:v10:XX`, simultaneously (a) missing the preview
// cache forever and (b) poisoning production's shared cache. Simulate a
// preview deploy and assert the pipeline SET keys carry the prefix.
// Shared afterEach snapshots/restores VERCEL_ENV + VERCEL_GIT_COMMIT_SHA
@@ -252,7 +345,7 @@ describe('resilience ranking contracts', () => {
const scoreSetKeys = pipelineBodies
.flat()
.filter((cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v9:'))
.filter((cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v10:'))
.map((cmd) => cmd[1] as string);
assert.ok(scoreSetKeys.length >= 2, `expected at least 2 score SETs, got ${scoreSetKeys.length}`);
for (const key of scoreSetKeys) {
@@ -280,8 +373,14 @@ describe('resilience ranking contracts', () => {
failedDatasets: [],
seedYear: 2026,
}));
const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [] };
redis.set('resilience:ranking:v9', JSON.stringify(stale));
// Stale sentinel tagged with the current (flag-off default)
// formula so the cross-formula invalidation does NOT fire here —
// these refresh-auth tests exercise the auth gate, not the
// formula check. An untagged sentinel would be silently
// rejected by the formula gate and the refresh path would not
// get tested as intended.
const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [], _formula: 'd6' };
redis.set('resilience:ranking:v10', JSON.stringify(stale));
// No X-WorldMonitor-Key → refresh must be ignored, stale cache returned.
const unauth = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1');
@@ -328,8 +427,14 @@ describe('resilience ranking contracts', () => {
}));
// Seed a pre-existing ranking so the cache-hit early-return would
// normally fire. ?refresh=1 (with valid seed key) must ignore it.
const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [] };
redis.set('resilience:ranking:v9', JSON.stringify(stale));
// Stale sentinel tagged with the current (flag-off default)
// formula so the cross-formula invalidation does NOT fire here —
// these refresh-auth tests exercise the auth gate, not the
// formula check. An untagged sentinel would be silently
// rejected by the formula gate and the refresh path would not
// get tested as intended.
const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [], _formula: 'd6' };
redis.set('resilience:ranking:v10', JSON.stringify(stale));
const request = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1', {
headers: { 'X-WorldMonitor-Key': 'seed-secret' },
@@ -364,7 +469,7 @@ describe('resilience ranking contracts', () => {
if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
const commands = JSON.parse(init.body) as Array<Array<string>>;
const isAllScoreSets = commands.length > 0 && commands.every(
(cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v9:'),
(cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v10:'),
);
if (isAllScoreSets) setPipelineSizes.push(commands.length);
}
@@ -396,7 +501,7 @@ describe('resilience ranking contracts', () => {
seedYear: 2026,
}));
// Intercept any pipeline SET to resilience:score:v9:* and reply with
// Intercept any pipeline SET to resilience:score:v10:* and reply with
// non-OK results (persisted but authoritative signal says no). /set and
// other paths pass through normally so history/interval writes succeed.
const blockedScoreWrites = (async (input: RequestInfo | URL, init?: RequestInit) => {
@@ -404,7 +509,7 @@ describe('resilience ranking contracts', () => {
if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
const commands = JSON.parse(init.body) as Array<Array<string>>;
const allScoreSets = commands.length > 0 && commands.every(
(cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v9:'),
(cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v10:'),
);
if (allScoreSets) {
return new Response(
@@ -419,7 +524,7 @@ describe('resilience ranking contracts', () => {
await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
assert.ok(!redis.has('resilience:ranking:v9'), 'ranking must NOT be published when score writes failed');
assert.ok(!redis.has('resilience:ranking:v10'), 'ranking must NOT be published when score writes failed');
assert.ok(!redis.has('seed-meta:resilience:ranking'), 'seed-meta must NOT be written when score writes failed');
});
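
The ranking tests above pin two behaviours of the stale-formula gate: untagged entries count as stale, and the `_formula` tag is stripped before a payload is returned. The sketch below contrasts the pre-fix predicate quoted in the bulk-read test comment with a strict comparison. The tag values 'd6' and 'pc' come from the diff; the function and type names are hypothetical and do not reflect the handler's actual internals.

```typescript
type FormulaTag = 'd6' | 'pc';

interface TaggedEntry {
  overallScore: number;
  _formula?: string;
}

// Pre-fix shape described in the test comment: the && short-circuits on
// undefined, so an untagged entry is never considered stale.
function isStalePreFix(entry: TaggedEntry, current: FormulaTag): boolean {
  return Boolean(entry._formula && entry._formula !== current);
}

// Strict shape: anything not tagged with exactly the current formula is
// stale, which includes untagged legacy writes.
function isStale(entry: TaggedEntry, current: FormulaTag): boolean {
  return entry._formula !== current;
}

// Hypothetical helper: drop the cache-only tag before returning the
// public payload, as the handler is described as doing.
function stripFormulaTag<T extends { _formula?: string }>(entry: T): Omit<T, '_formula'> {
  const { _formula, ...publicShape } = entry;
  return publicShape;
}

const untagged: TaggedEntry = { overallScore: 61 };
const admittedPreFix = !isStalePreFix(untagged, 'd6'); // the bug: served as-is
const rejectedPostFix = isStale(untagged, 'd6'); // the fix: forces a rebuild
const publicPayload = stripFormulaTag({ overallScore: 82, _formula: 'd6' });
```

The strict comparison is also why the refresh-auth fixtures carry `_formula: 'd6'`: an untagged sentinel would be rejected by the formula gate before the auth path is exercised.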


@@ -10,12 +10,12 @@ import {
} from '../scripts/seed-resilience-scores.mjs';
describe('exported constants', () => {
it('RESILIENCE_RANKING_CACHE_KEY matches server-side key (v9)', () => {
assert.equal(RESILIENCE_RANKING_CACHE_KEY, 'resilience:ranking:v9');
it('RESILIENCE_RANKING_CACHE_KEY matches server-side key (v10)', () => {
assert.equal(RESILIENCE_RANKING_CACHE_KEY, 'resilience:ranking:v10');
});
it('RESILIENCE_SCORE_CACHE_PREFIX matches server-side prefix (v9)', () => {
assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v9:');
it('RESILIENCE_SCORE_CACHE_PREFIX matches server-side prefix (v10)', () => {
assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v10:');
});
it('RESILIENCE_RANKING_CACHE_TTL_SECONDS is 12 hours (2x cron interval)', () => {