worldmonitor/tests/resilience-ranking.test.mts
Elie Habib d3d406448a feat(resilience): PR 2 §3.4 recovery-domain weight rebalance (#3328)
* feat(resilience): PR 2 §3.4 recovery-domain weight rebalance

Dials the two PR 2 §3.4 recovery dims (liquidReserveAdequacy,
sovereignFiscalBuffer) to ~10% share each of the recovery-domain
score via a new per-dimension weight channel in the coverage-weighted
mean. Matches the plan's direction that the sovereign-wealth signal
complement — rather than dominate — the classical liquid-reserves
and fiscal-space signals.

Implementation

- RESILIENCE_DIMENSION_WEIGHTS: new Record<ResilienceDimensionId, number>
  alongside RESILIENCE_DOMAIN_WEIGHTS. Every dim has an explicit entry
  (default 1.0) so rebalance decisions stay auditable; the two new
  recovery dims carry 0.5 each.

  Share math at full coverage (6 active recovery dims):
    weight sum                  = 4 × 1.0 + 2 × 0.5 = 5.0
    each new-dim share          = 0.5 / 5.0 = 0.10  ✓
    each core-dim share         = 1.0 / 5.0 = 0.20

  Retired dims (reserveAdequacy, fuelStockDays) keep weight 1.0 in
  the map; their coverage=0 neutralizes them at the coverage channel
  regardless. Explicit entries guard against a future scorer bug
  accidentally returning coverage>0 for a retired dim and falling
  through the `?? 1.0` default — every retirement decision is now
  tied to a single explicit source of truth.
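  The weight map and the share arithmetic can be sketched as follows — the four
  core dimension ids are hypothetical placeholders (only liquidReserveAdequacy
  and sovereignFiscalBuffer are named by this change; the real map lives in
  _shared.ts):

```typescript
// Sketch only — the core recovery dim ids are invented for illustration.
const RECOVERY_DIMENSION_WEIGHTS: Record<string, number> = {
  coreDimA: 1.0, // hypothetical core dims, weight 1.0 each
  coreDimB: 1.0,
  coreDimC: 1.0,
  coreDimD: 1.0,
  liquidReserveAdequacy: 0.5, // new PR 2 §3.4 dims, capped at ~10% share
  sovereignFiscalBuffer: 0.5,
};

// Share math at full coverage (6 active recovery dims):
const weightSum = Object.values(RECOVERY_DIMENSION_WEIGHTS).reduce((a, b) => a + b, 0); // 5.0
const newDimShare = RECOVERY_DIMENSION_WEIGHTS.liquidReserveAdequacy / weightSum; // 0.10
const coreDimShare = RECOVERY_DIMENSION_WEIGHTS.coreDimA / weightSum; // 0.20
```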

- coverageWeightedMean (_shared.ts): refactored to apply
  `coverage × dimWeight` per dim instead of `coverage` alone. Backward-
  compatible when all weights default to 1.0 (reduces to the original
  mean). All three aggregation callers — buildDomainList, baseline-
  Score, stressScore — pick up the weighting transparently.
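  A minimal sketch of the refactored helper, assuming a simplified dimension
  shape (the production version in _shared.ts operates on richer per-dimension
  records):

```typescript
interface DimInput { id: string; score: number; coverage: number }

// `coverage × dimWeight` per dim; with all weights defaulting to 1.0 this
// reduces to the original coverage-weighted mean, which is what keeps the
// refactor backward-compatible for the three aggregation callers.
function coverageWeightedMean(
  dims: DimInput[],
  dimensionWeights: Record<string, number> = {},
): number {
  let numerator = 0;
  let denominator = 0;
  for (const dim of dims) {
    const w = dim.coverage * (dimensionWeights[dim.id] ?? 1.0);
    numerator += dim.score * w;
    denominator += w;
  }
  return denominator > 0 ? numerator / denominator : -1;
}
```

  A retired dim with coverage=0 contributes nothing regardless of its weight
  entry, which is the neutralization described above.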

Test coverage

1. New `tests/resilience-recovery-weight-rebalance.test.mts`:
   pins the per-dim weight values, asserts the share math
   (0.10 new / 0.20 core), verifies completeness of the weight map,
   and documents why retired dims stay in the map at 1.0.
2. New `tests/resilience-recovery-ordering.test.mts`: fixture-based
   Spearman-proxy sensitivity check. Asserts NO > US > YE ordering
   preserved on both the overall score and the recovery-domain
   subscore after the rebalance. (Live post-merge Spearman rerun
   against the PR 0 snapshot is tracked as a follow-up commit.)
3. resilience-scorers.test.mts fixture anchors updated in lockstep:
     baselineScore: 60.35 → 62.17 (low-scoring liquidReserveAdequacy
       + partial-coverage SWF now contribute ~half the weight)
     overallScore:  63.60 → 64.39 (recovery subscore lifts by ~3 pts
       from the rebalance, overall by ~0.79)
     recovery flat mean: 48.75 (unchanged — flat mean doesn't apply
       weights by design; documents the coverage-weighted diff)
   Local coverageWeightedMean helper in the test mirrors the
   production implementation (weights applied per dim).

Methodology doc

- New "Per-dimension weights in the recovery domain" subsection with
  the weight table and a sentence explaining the cap. Cross-references
  the source of truth (RESILIENCE_DIMENSION_WEIGHTS).

Deliberate non-goals

- Live post-merge Spearman ≥0.85 check against the PR 0 baseline
  snapshot. Fixture ordering is preserved (new ordering test); the
  live-data check runs after Railway cron refreshes the rankings on
  the new weights and commits docs/snapshots/resilience-ranking-live-
  post-pr2-<date>.json. Tracked as the final piece of PR 2 §3.4
  alongside the health.js / bootstrap graduation (waiting on the
  7-day Railway cron bake-in window).

Tests: 6588/6588 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on touched files. NO > US > YE
fixture ordering preserved.

* fix(resilience): PR 2 review — thread RESILIENCE_DIMENSION_WEIGHTS through the comparison harness

Greptile P2: the operator comparison harness
(scripts/compare-resilience-current-vs-proposed.mjs) claims its domain
scores "mirror the production scorer's coverage-weighted mean" and is
the artifact generator for Spearman / rank-delta acceptance decisions.
After PR 2 §3.4's weight rebalance, the production mirror diverged —
production now applies RESILIENCE_DIMENSION_WEIGHTS (liquidReserveAdequacy
= 0.5, sovereignFiscalBuffer = 0.5) inside coverageWeightedMean, but
the harness still used equal-weight aggregation.

Left unfixed, post-merge Spearman / rank-delta diagnostics would
compare live API scores (with the 0.5 recovery weights) against
harness predictions that assume equal-share dims — silently biasing
every acceptance decision until someone noticed a country's rank-
delta didn't track.
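For context, the acceptance statistic in question is ordinary Spearman rank
correlation between the current and proposed rankings. A generic, no-ties
sketch (not the harness's actual implementation):

```typescript
// Rank scores descending: rank 1 = highest score.
function ranks(values: number[]): number[] {
  const order = values
    .map((v, i) => [v, i] as const)
    .sort((a, b) => b[0] - a[0]);
  const r = new Array<number>(values.length);
  order.forEach(([, idx], pos) => { r[idx] = pos + 1; });
  return r;
}

// Spearman rho for distinct ranks: 1 - 6·Σd² / (n(n² - 1)).
function spearman(current: number[], proposed: number[]): number {
  const ra = ranks(current);
  const rb = ranks(proposed);
  const n = ra.length;
  const d2 = ra.reduce((sum, r, i) => sum + (r - rb[i]) ** 2, 0);
  return 1 - (6 * d2) / (n * (n * n - 1));
}
```

The acceptance gate described above requires this rank agreement between live
scores and harness predictions to stay ≥0.85.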

Fix

- Mirrored coverageWeightedMean now accepts dimensionWeights and
  applies `coverage × weight` per dim, matching _shared.ts exactly.
- Mirrored buildDomainList accepts + forwards dimensionWeights.
- main() imports RESILIENCE_DIMENSION_WEIGHTS from the scorer module
  and passes it through to buildDomainList at the single call site.
- Missing-entry default = 1.0 (same contract as production) — makes
  the harness forward-compatible with any future weight refactor: if
  a new dim lands without an explicit entry, the shared 1.0 fallback
  produces the same number on both sides.

Verification

- Harness syntax-check clean (node -c).
- RESILIENCE_DIMENSION_WEIGHTS import resolves correctly from the
  harness's import path.
- 509/509 resilience tests still pass (harness isn't in the test
  suite; the invariant is that production ↔ harness use the same
  math, and the production side is covered by tests/resilience-
  recovery-weight-rebalance.test.mts).

* fix(resilience): PR 2 review — bump cache prefixes v10→v11 + document coverage-vs-weight asymmetry

Greptile P1 + P2 on PR #3328.

P1 — cache prefix not bumped after formula change
--------------------------------------------------
The per-dim weight rebalance changes the score formula, but the
`_formula` tag only distinguishes 'd6' vs 'pc' (pillar-combined vs
legacy 6-domain) — it does NOT detect intra-'d6' weight changes. Left
unfixed, scores cached before deploy would be served with the old
equal-weight math for up to the full 6h TTL, and the ranking key for
up to its 12h TTL. Matches the established v9→v10 pattern for every
prior formula-changing deploy.
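A sketch of the gate the `_formula` tag feeds (helper name assumed). Strict
equality is the key detail: it rejects both cross-formula tags and untagged
legacy entries, but it cannot see a weight change that keeps the tag at 'd6' —
hence the prefix bump:

```typescript
type FormulaTag = 'd6' | 'pc';

// Assumed helper name; mirrors the staleness check the tests pin.
function isServableCacheEntry(
  parsed: { _formula?: FormulaTag },
  current: FormulaTag,
): boolean {
  // Strict equality (not `parsed._formula && parsed._formula !== current`)
  // so an untagged entry (undefined) is treated as stale and re-warmed.
  return parsed._formula === current;
}
```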

Bumped in lockstep:
 - RESILIENCE_SCORE_CACHE_PREFIX:     v10  → v11
 - RESILIENCE_RANKING_CACHE_KEY:      v10  → v11
 - RESILIENCE_HISTORY_KEY_PREFIX:      v5  → v6
 - scripts/seed-resilience-scores.mjs local mirrors
 - api/health.js resilienceRanking literal
 - 4 analysis/backtest scripts that read the cached keys directly
 - Test fixtures in resilience-{ranking, handlers, scores-seed,
   pillar-aggregation}.test.* that assert on literal key values

The v5→v6 history bump is the critical one: without it, pre-rebalance
history points would mix with post-rebalance points inside the 30-day
window, and change30d / trend math would diff values from different
formulas against each other, producing spurious "falling" trends
for every country across the deploy window.
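A sketch of why the history bump matters, with an assumed point shape and a
simplified change30d (the production trend math lives elsewhere):

```typescript
interface HistoryPoint { ts: number; overallScore: number }

// Simplified: diff the latest score against the earliest point inside the
// 30-day window. If pre- and post-rebalance points share one history key,
// this diffs values computed under different formulas.
function change30d(history: HistoryPoint[]): number {
  const sorted = [...history].sort((a, b) => a.ts - b.ts);
  const latest = sorted[sorted.length - 1];
  const cutoff = latest.ts - 30 * 24 * 60 * 60 * 1000;
  const baseline = sorted.find((p) => p.ts >= cutoff) ?? sorted[0];
  return latest.overallScore - baseline.overallScore;
}
```

With the v5→v6 bump, post-deploy history starts clean, so the first change30d
values after the deploy diff like-for-like points only.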

P2 — coverage-vs-weight asymmetry in computeLowConfidence / computeOverallCoverage
----------------------------------------------------------------------------------
Reviewer flagged that these two functions still average coverage
equally across all non-retired dims, even after the scoring aggregation
started applying RESILIENCE_DIMENSION_WEIGHTS. The asymmetry is
INTENTIONAL — these signals answer a different question from scoring:

  scoring aggregation: "how much does each dim matter to the score?"
  coverage signal:     "how much real data do we have on this country?"

A dim at weight 0.5 still has the same data-availability footprint as
a weight=1.0 dim: its coverage value reflects whether we successfully
fetched the upstream source, not whether the scorer cares about it.
Applying scoring weights to the coverage signal would let a
half-weight dim hide half its sparsity from the overallCoverage pill,
misleading users reading coverage as a data-quality indicator.
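The difference in numbers — a hypothetical two-dim example, not the production
functions:

```typescript
interface Dim { id: string; coverage: number }

// Coverage signal: equal-share mean over non-retired dims.
function overallCoverage(dims: Dim[]): number {
  return dims.reduce((sum, d) => sum + d.coverage, 0) / dims.length;
}

// The rejected alternative: applying scoring weights to coverage.
function weightedCoverage(dims: Dim[], weights: Record<string, number>): number {
  let num = 0;
  let den = 0;
  for (const d of dims) {
    const w = weights[d.id] ?? 1.0;
    num += d.coverage * w;
    den += w;
  }
  return num / den;
}

// A sparse half-weight dim (coverage 0.2) next to a full dim (coverage 1.0):
// the equal-share mean reports 0.60, while the weighted variant reports
// ~0.73 — hiding part of the sparsity from the coverage pill.
```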

Added explicit comments to both functions noting the asymmetry is
deliberate and pointing at the other site for matching rationale.
No code change — just documentation.

Tests: 6588/6588 data-tier tests pass (+511 resilience-specific
including the prefix-literal assertions). Typecheck clean on both
tsconfig configs. Biome clean on touched files.

* docs(resilience): bump methodology doc cache-prefix references to v11/v6

Greptile P2 on PR #3328: Redis keys table in the reproducibility
appendix still published `score:v10` / `ranking:v10` / `history:v5`,
and the rollback instructions told operators to flush those keys.
After the recovery-domain weight rebalance, live cache runs at
`score:v11` / `ranking:v11` / `history:v6`.

- Updated the Redis keys table (lines 490-492) to match `_shared.ts`.
- Updated the rollback block to name the current keys.
- Left the historical "Activation sequence" narrative intact (it
  accurately describes the pillar-combine PR's v9→v10 / v4→v5 bump)
  but added a parenthetical pointing at the current v11/v6 values.

No code change — doc-only correction for operator accuracy.

* fix(docs): escape MDX-unsafe `<137` pattern to unblock Mintlify deploy

Line 643 had `(<137 countries)` — MDX parses `<137` as a JSX tag
starting with digit `1`, which is illegal and breaks the deploy with
"Unexpected character \`1\` (U+0031) before name". Surfaced after the
prior cache-prefix commit forced Mintlify to re-parse this file.

Replaced with "fewer than 137 countries" for unambiguous rendering.
Other `<` occurrences in this doc (lines 34, 642) are followed by
whitespace and don't trip MDX's tag parser.
2026-04-23 10:25:18 +04:00

import assert from 'node:assert/strict';
import { afterEach, describe, it } from 'node:test';
import { getResilienceRanking } from '../server/worldmonitor/resilience/v1/get-resilience-ranking.ts';
import { buildRankingItem, sortRankingItems } from '../server/worldmonitor/resilience/v1/_shared.ts';
import { __resetKeyPrefixCacheForTests } from '../server/_shared/redis.ts';
import { installRedis } from './helpers/fake-upstash-redis.mts';
import { RESILIENCE_FIXTURES } from './helpers/resilience-fixtures.mts';
const originalFetch = globalThis.fetch;
const originalRedisUrl = process.env.UPSTASH_REDIS_REST_URL;
const originalRedisToken = process.env.UPSTASH_REDIS_REST_TOKEN;
const originalVercelEnv = process.env.VERCEL_ENV;
const originalVercelSha = process.env.VERCEL_GIT_COMMIT_SHA;
afterEach(() => {
  globalThis.fetch = originalFetch;
  if (originalRedisUrl == null) delete process.env.UPSTASH_REDIS_REST_URL;
  else process.env.UPSTASH_REDIS_REST_URL = originalRedisUrl;
  if (originalRedisToken == null) delete process.env.UPSTASH_REDIS_REST_TOKEN;
  else process.env.UPSTASH_REDIS_REST_TOKEN = originalRedisToken;
  if (originalVercelEnv == null) delete process.env.VERCEL_ENV;
  else process.env.VERCEL_ENV = originalVercelEnv;
  if (originalVercelSha == null) delete process.env.VERCEL_GIT_COMMIT_SHA;
  else process.env.VERCEL_GIT_COMMIT_SHA = originalVercelSha;
  // Any test that touched VERCEL_ENV / VERCEL_GIT_COMMIT_SHA must invalidate
  // the memoized key prefix so the next test recomputes it against the
  // restored env — otherwise preview/dev tests would leak a stale prefix.
  __resetKeyPrefixCacheForTests();
});
describe('resilience ranking contracts', () => {
  it('sorts descending by overall score and keeps unscored placeholders at the end', () => {
    const sorted = sortRankingItems([
      { countryCode: 'US', overallScore: 61, level: 'medium', lowConfidence: false },
      { countryCode: 'YE', overallScore: -1, level: 'unknown', lowConfidence: true },
      { countryCode: 'NO', overallScore: 82, level: 'high', lowConfidence: false },
      { countryCode: 'DE', overallScore: -1, level: 'unknown', lowConfidence: true },
      { countryCode: 'JP', overallScore: 61, level: 'medium', lowConfidence: false },
    ]);
    assert.deepEqual(
      sorted.map((item) => [item.countryCode, item.overallScore]),
      [['NO', 82], ['JP', 61], ['US', 61], ['DE', -1], ['YE', -1]],
    );
  });
  it('returns the cached ranking payload unchanged when the ranking cache already exists', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const cachedPublic = {
      items: [
        { countryCode: 'NO', overallScore: 82, level: 'high', lowConfidence: false, overallCoverage: 0.95 },
        { countryCode: 'US', overallScore: 61, level: 'medium', lowConfidence: false, overallCoverage: 0.88 },
      ],
      greyedOut: [],
    };
    // The handler's stale-formula gate rejects untagged ranking entries,
    // so fixtures must carry the `_formula` tag matching the current env
    // (default flag-off ⇒ 'd6'). Writing the tagged shape here mirrors
    // what the handler persists via stampRankingCacheTag.
    redis.set('resilience:ranking:v11', JSON.stringify({ ...cachedPublic, _formula: 'd6' }));
    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    // The handler strips `_formula` before returning, so response matches
    // the public shape rather than the on-wire cache shape.
    assert.deepEqual(response, cachedPublic);
    assert.equal(redis.has('resilience:score:v11:YE'), false, 'cache hit must not trigger score warmup');
  });
  it('returns all-greyed-out cached payload without rewarming (items=[], greyedOut non-empty)', async () => {
    // Regression for: `cached?.items?.length` was falsy when items=[] even though
    // greyedOut had entries, causing unnecessary rewarming on every request.
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const cachedPublic = {
      items: [],
      greyedOut: [
        { countryCode: 'SS', overallScore: 12, level: 'critical', lowConfidence: true, overallCoverage: 0.15 },
        { countryCode: 'ER', overallScore: 10, level: 'critical', lowConfidence: true, overallCoverage: 0.12 },
      ],
    };
    redis.set('resilience:ranking:v11', JSON.stringify({ ...cachedPublic, _formula: 'd6' }));
    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    assert.deepEqual(response, cachedPublic);
    assert.equal(redis.has('resilience:score:v11:SS'), false, 'all-greyed-out cache hit must not trigger score warmup');
  });
  it('bulk-read path skips untagged per-country score entries (legacy writes must rebuild on flip)', async () => {
    // Pins the fix for a subtle bug: getCachedResilienceScores used
    // `parsed._formula && parsed._formula !== current` which short-
    // circuits on undefined. An untagged score entry — produced by a
    // pre-PR code path or by an external writer that has not been
    // updated — would therefore be ADMITTED into the ranking under the
    // current formula instead of being treated as stale and re-warmed.
    // On activation day that would mean a mixed-formula ranking for up
    // to the 6h score TTL even though the single-country cache-miss
    // path (ensureResilienceScoreCached) correctly invalidates the
    // same entry. This test writes two per-country score keys, one
    // tagged `_formula: 'd6'` and one untagged, and asserts the
    // ranking warm path runs for the untagged country (meaning the
    // bulk read skipped it).
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    redis.set('resilience:static:index:v1', JSON.stringify({
      countries: ['NO', 'US'],
      recordCount: 2,
      failedDatasets: [],
      seedYear: 2026,
    }));
    const domain = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
    // Tagged entry: served as-is.
    redis.set('resilience:score:v11:NO', JSON.stringify({
      countryCode: 'NO', overallScore: 82, level: 'high',
      domains: domain, trend: 'stable', change30d: 1.2,
      lowConfidence: false, imputationShare: 0.05, _formula: 'd6',
    }));
    // Untagged entry: must be rejected, ranking warm rebuilds US.
    redis.set('resilience:score:v11:US', JSON.stringify({
      countryCode: 'US', overallScore: 61, level: 'medium',
      domains: domain, trend: 'rising', change30d: 4.3,
      lowConfidence: false, imputationShare: 0.1,
      // NOTE: no _formula field.
    }));
    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    // After the ranking run, the US entry in Redis must now carry
    // `_formula: 'd6'`. If the bulk read had ADMITTED the untagged
    // entry (the pre-fix bug), the warm path for US would not have
    // run, and the stored value would still be untagged.
    const rewrittenRaw = redis.get('resilience:score:v11:US');
    assert.ok(rewrittenRaw, 'US entry must remain in Redis after the ranking run');
    const rewritten = JSON.parse(rewrittenRaw!);
    assert.equal(
      rewritten._formula,
      'd6',
      'untagged US entry must be rejected by the bulk read so the warm path rebuilds it with the current formula tag. If `_formula` is still undefined here, getCachedResilienceScores is admitting untagged entries.',
    );
  });
  it('rejects a stale-formula ranking cache entry and recomputes even without ?refresh=1', async () => {
    // Pins the cross-formula isolation: when the env flag is off (default)
    // and the ranking cache carries _formula='pc' (written during a prior
    // flag-on deploy that has since been rolled back), the handler must
    // NOT serve the stale-formula entry. It must recompute from the
    // per-country scores instead. Without this behavior, a flag
    // rollback would leave the old ranking in place for up to the 12h
    // ranking TTL even though scores were already back on the 6-domain
    // formula.
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const stale = {
      items: [
        { countryCode: 'NO', overallScore: 99, level: 'high', lowConfidence: false, overallCoverage: 0.95 },
      ],
      greyedOut: [],
      _formula: 'pc', // mismatched — current env is flag-off ⇒ current='d6'
    };
    redis.set('resilience:ranking:v11', JSON.stringify(stale));
    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    assert.notDeepEqual(
      response,
      { items: stale.items, greyedOut: stale.greyedOut },
      'stale-formula ranking must be rejected, not served',
    );
    // Recompute path warms missing per-country scores, so YE (in
    // RESILIENCE_FIXTURES) must get scored during this call.
    assert.ok(
      redis.has('resilience:score:v11:YE'),
      'stale-formula reject must trigger the recompute-and-warm path',
    );
  });
  it('warms missing scores synchronously and returns complete ranking on first call', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const domainWithCoverage = [{ name: 'political', dimensions: [{ name: 'd1', coverage: 0.9 }] }];
    redis.set('resilience:score:v11:NO', JSON.stringify({
      countryCode: 'NO',
      overallScore: 82,
      level: 'high',
      domains: domainWithCoverage,
      trend: 'stable',
      change30d: 1.2,
      lowConfidence: false,
      imputationShare: 0.05,
    }));
    redis.set('resilience:score:v11:US', JSON.stringify({
      countryCode: 'US',
      overallScore: 61,
      level: 'medium',
      domains: domainWithCoverage,
      trend: 'rising',
      change30d: 4.3,
      lowConfidence: false,
      imputationShare: 0.1,
    }));
    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    const totalItems = response.items.length + (response.greyedOut?.length ?? 0);
    assert.equal(totalItems, 3, `expected 3 total items across ranked + greyedOut, got ${totalItems}`);
    assert.ok(redis.has('resilience:score:v11:YE'), 'missing country should be warmed during first call');
    assert.ok(response.items.every((item) => item.overallScore >= 0), 'ranked items should all have computed scores');
    assert.ok(redis.has('resilience:ranking:v11'), 'fully scored ranking should be cached');
  });
  it('sets rankStable=true when interval data exists and width <= 8', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    const domainWithCoverage = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
    redis.set('resilience:score:v11:NO', JSON.stringify({
      countryCode: 'NO', overallScore: 82, level: 'high',
      domains: domainWithCoverage, trend: 'stable', change30d: 1.2,
      lowConfidence: false, imputationShare: 0.05,
    }));
    redis.set('resilience:score:v11:US', JSON.stringify({
      countryCode: 'US', overallScore: 61, level: 'medium',
      domains: domainWithCoverage, trend: 'rising', change30d: 4.3,
      lowConfidence: false, imputationShare: 0.1,
    }));
    redis.set('resilience:intervals:v1:NO', JSON.stringify({ p05: 78, p95: 84 }));
    redis.set('resilience:intervals:v1:US', JSON.stringify({ p05: 50, p95: 72 }));
    const response = await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    const no = response.items.find((item) => item.countryCode === 'NO');
    const us = response.items.find((item) => item.countryCode === 'US');
    assert.equal(no?.rankStable, true, 'NO interval width 6 should be stable');
    assert.equal(us?.rankStable, false, 'US interval width 22 should be unstable');
  });
  it('caches the ranking when partial coverage meets the 75% threshold (4 countries, 3 scored)', async () => {
    const { redis } = installRedis(RESILIENCE_FIXTURES);
    // Override the static index so we have an un-scoreable extra country (ZZ has
    // no fixture → warm will throw and ZZ stays missing).
    redis.set('resilience:static:index:v1', JSON.stringify({
      countries: ['NO', 'US', 'YE', 'ZZ'],
      recordCount: 4,
      failedDatasets: [],
      seedYear: 2025,
    }));
    const domainWithCoverage = [{ id: 'political', score: 80, weight: 0.2, dimensions: [{ id: 'd1', score: 80, coverage: 0.9, observedWeight: 1, imputedWeight: 0 }] }];
    redis.set('resilience:score:v11:NO', JSON.stringify({
      countryCode: 'NO', overallScore: 82, level: 'high',
      domains: domainWithCoverage, trend: 'stable', change30d: 1.2,
      lowConfidence: false, imputationShare: 0.05,
    }));
    redis.set('resilience:score:v11:US', JSON.stringify({
      countryCode: 'US', overallScore: 61, level: 'medium',
      domains: domainWithCoverage, trend: 'rising', change30d: 4.3,
      lowConfidence: false, imputationShare: 0.1,
    }));
    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    // 3 of 4 (NO + US pre-cached, YE warmed from fixtures, ZZ can't be warmed)
    // = 75% which meets the threshold — must cache.
    assert.ok(redis.has('resilience:ranking:v11'), 'ranking must be cached at exactly 75% coverage');
    assert.ok(redis.has('seed-meta:resilience:ranking'), 'seed-meta must be written alongside the ranking');
  });
  it('publishes ranking via in-memory warm results even when Upstash pipeline-GET lags after /set writes (race regression)', async () => {
    // Simulates the documented Upstash REST write→re-read lag inside a single
    // Vercel invocation: /set calls succeed, but a pipeline GET immediately
    // afterwards can return null for the same keys. Pre-fix, this collapsed
    // coverage to 0 and silently dropped the ranking publish. Post-fix, the
    // handler merges warm results from memory, so coverage reflects reality.
    const { redis, fetchImpl } = installRedis({ ...RESILIENCE_FIXTURES });
    // Override the static index: 2 countries, neither pre-cached — both must
    // be warmed by the handler. Pre-fix, both pipeline-GETs post-warm would
    // return null, coverage = 0% < 75%, handler skips the write. Post-fix,
    // the in-memory merge carries both scores, coverage = 100%, write
    // proceeds.
    redis.set('resilience:static:index:v1', JSON.stringify({
      countries: ['NO', 'US'],
      recordCount: 2,
      failedDatasets: [],
      seedYear: 2026,
    }));
    // Stale pipeline-GETs for score keys: pretend Redis hasn't caught up with
    // the /set writes yet. /set calls still mutate the underlying map so the
    // final assertion on ranking presence can verify the SET happened.
    const lagged = (async (input: RequestInfo | URL, init?: RequestInit) => {
      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        const commands = JSON.parse(init.body) as Array<Array<string>>;
        const allScoreReads = commands.length > 0 && commands.every(
          (cmd) => cmd[0] === 'GET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v11:'),
        );
        if (allScoreReads) {
          // Simulate visibility lag: pretend no scores are cached yet.
          return new Response(
            JSON.stringify(commands.map(() => ({ result: null }))),
            { status: 200 },
          );
        }
      }
      return fetchImpl(input, init);
    }) as typeof fetch;
    globalThis.fetch = lagged;
    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    assert.ok(redis.has('resilience:ranking:v11'), 'ranking must be published despite pipeline-GET race');
    assert.ok(redis.has('seed-meta:resilience:ranking'), 'seed-meta must be written despite pipeline-GET race');
  });
  it('pipeline SETs apply env prefix so preview warms do not leak into production namespace', async () => {
    // Reviewer regression: passing `raw=true` to runRedisPipeline bypasses the
    // env-based key prefix (preview: / dev:) that isolates preview deploys
    // from production. The symptom is asymmetric: preview reads hit
    // `preview:<sha>:resilience:score:v11:XX` while preview writes landed at
    // raw `resilience:score:v11:XX`, simultaneously (a) missing the preview
    // cache forever and (b) poisoning production's shared cache. Simulate a
    // preview deploy and assert the pipeline SET keys carry the prefix.
    // Shared afterEach snapshots/restores VERCEL_ENV + VERCEL_GIT_COMMIT_SHA
    // and invalidates the memoized key prefix, so this test just mutates them
    // freely without a finally block.
    process.env.VERCEL_ENV = 'preview';
    process.env.VERCEL_GIT_COMMIT_SHA = 'abcdef12ffff';
    __resetKeyPrefixCacheForTests();
    const { redis, fetchImpl } = installRedis({ ...RESILIENCE_FIXTURES }, { keepVercelEnv: true });
    redis.set('resilience:static:index:v1', JSON.stringify({
      countries: ['NO', 'US'],
      recordCount: 2,
      failedDatasets: [],
      seedYear: 2026,
    }));
    const pipelineBodies: Array<Array<Array<unknown>>> = [];
    const capturing = (async (input: RequestInfo | URL, init?: RequestInit) => {
      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        pipelineBodies.push(JSON.parse(init.body) as Array<Array<unknown>>);
      }
      return fetchImpl(input, init);
    }) as typeof fetch;
    globalThis.fetch = capturing;
    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    const scoreSetKeys = pipelineBodies
      .flat()
      .filter((cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v11:'))
      .map((cmd) => cmd[1] as string);
    assert.ok(scoreSetKeys.length >= 2, `expected at least 2 score SETs, got ${scoreSetKeys.length}`);
    for (const key of scoreSetKeys) {
      assert.ok(
        key.startsWith('preview:abcdef12:'),
        `score SET key must carry preview prefix; got ${key} — writes would poison the production namespace`,
      );
    }
  });
  it('?refresh=1 is rejected without a valid X-WorldMonitor-Key (Pro bearer token is NOT enough)', async () => {
    // A full warm is expensive (~222 score computations + chunked pipeline
    // SETs). Allowing any Pro user to loop on ?refresh=1 would DoS Upstash
    // and Edge budget. refresh must be seed-service only — validated against
    // WORLDMONITOR_VALID_KEYS / WORLDMONITOR_API_KEY.
    const prevValidKeys = process.env.WORLDMONITOR_VALID_KEYS;
    const prevApiKey = process.env.WORLDMONITOR_API_KEY;
    process.env.WORLDMONITOR_VALID_KEYS = 'seed-secret';
    delete process.env.WORLDMONITOR_API_KEY;
    try {
      const { redis } = installRedis({ ...RESILIENCE_FIXTURES });
      redis.set('resilience:static:index:v1', JSON.stringify({
        countries: ['NO', 'US'],
        recordCount: 2,
        failedDatasets: [],
        seedYear: 2026,
      }));
      // Stale sentinel tagged with the current (flag-off default)
      // formula so the cross-formula invalidation does NOT fire here —
      // these refresh-auth tests exercise the auth gate, not the
      // formula check. An untagged sentinel would be silently
      // rejected by the formula gate and the refresh path would not
      // get tested as intended.
      const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [], _formula: 'd6' };
      redis.set('resilience:ranking:v11', JSON.stringify(stale));
      // No X-WorldMonitor-Key → refresh must be ignored, stale cache returned.
      const unauth = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1');
      const unauthResp = await getResilienceRanking({ request: unauth } as never, {});
      assert.equal(unauthResp.items.length, 1);
      assert.equal(unauthResp.items[0]?.countryCode, 'ZZ', 'refresh=1 without key must fall back to cached response');
      // Wrong key → same as no key.
      const wrongKey = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1', {
        headers: { 'X-WorldMonitor-Key': 'bogus' },
      });
      const wrongResp = await getResilienceRanking({ request: wrongKey } as never, {});
      assert.equal(wrongResp.items[0]?.countryCode, 'ZZ', 'refresh=1 with wrong key must fall back to cached response');
      // Valid seed key → refresh is honored, ZZ is NOT in the recomputed response.
      const authed = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1', {
        headers: { 'X-WorldMonitor-Key': 'seed-secret' },
      });
      const authedResp = await getResilienceRanking({ request: authed } as never, {});
      const codes = (authedResp.items.concat(authedResp.greyedOut ?? [])).map((i) => i.countryCode);
      assert.ok(!codes.includes('ZZ'), 'refresh=1 with valid seed key must recompute');
    } finally {
      if (prevValidKeys == null) delete process.env.WORLDMONITOR_VALID_KEYS;
      else process.env.WORLDMONITOR_VALID_KEYS = prevValidKeys;
      if (prevApiKey == null) delete process.env.WORLDMONITOR_API_KEY;
      else process.env.WORLDMONITOR_API_KEY = prevApiKey;
    }
  });
  it('?refresh=1 bypasses the cache-hit early-return and recomputes the ranking (with valid seed key)', async () => {
    // Seeder uses ?refresh=1 on the unconditional per-cron rebuild. Without
    // this bypass, the seeder would have to DEL the ranking before rebuild
    // (the old flow) — a failed rebuild would then leave the key absent
    // instead of stale-but-present.
    const prevValidKeys = process.env.WORLDMONITOR_VALID_KEYS;
    process.env.WORLDMONITOR_VALID_KEYS = 'seed-secret';
    try {
      const { redis } = installRedis({ ...RESILIENCE_FIXTURES });
      redis.set('resilience:static:index:v1', JSON.stringify({
        countries: ['NO', 'US'],
        recordCount: 2,
        failedDatasets: [],
        seedYear: 2026,
      }));
      // Seed a pre-existing ranking so the cache-hit early-return would
      // normally fire. ?refresh=1 (with valid seed key) must ignore it.
      // Stale sentinel tagged with the current (flag-off default)
      // formula so the cross-formula invalidation does NOT fire here —
      // these refresh-auth tests exercise the auth gate, not the
      // formula check. An untagged sentinel would be silently
      // rejected by the formula gate and the refresh path would not
      // get tested as intended.
      const stale = { items: [{ countryCode: 'ZZ', overallScore: 1, level: 'low', lowConfidence: true, overallCoverage: 0.5 }], greyedOut: [], _formula: 'd6' };
      redis.set('resilience:ranking:v11', JSON.stringify(stale));
      const request = new Request('https://example.com/api/resilience/v1/get-resilience-ranking?refresh=1', {
        headers: { 'X-WorldMonitor-Key': 'seed-secret' },
      });
      const response = await getResilienceRanking({ request } as never, {});
      const returnedCountries = response.items.concat(response.greyedOut ?? []).map((i) => i.countryCode);
      assert.ok(!returnedCountries.includes('ZZ'), 'refresh=1 must recompute, not return the stale cached ZZ entry');
      assert.ok(returnedCountries.includes('NO') || returnedCountries.includes('US'), 'recomputed ranking must reflect the current static index');
    } finally {
      if (prevValidKeys == null) delete process.env.WORLDMONITOR_VALID_KEYS;
      else process.env.WORLDMONITOR_VALID_KEYS = prevValidKeys;
    }
  });
  it('warms via batched pipeline SETs (avoids 600KB single-pipeline timeout)', async () => {
    // The 5s pipeline timeout would fail on a 222-SET pipeline (~600KB body)
    // and the persistence guard would correctly return empty → no ranking.
    // Splitting into smaller batches keeps each pipeline well under timeout.
    // We assert the SET path uses MULTIPLE pipelines, not one giant one.
    const { redis, fetchImpl } = installRedis({ ...RESILIENCE_FIXTURES });
    redis.set('resilience:static:index:v1', JSON.stringify({
      countries: ['NO', 'US', 'YE'],
      recordCount: 3,
      failedDatasets: [],
      seedYear: 2026,
    }));
    const setPipelineSizes: number[] = [];
    const observing = (async (input: RequestInfo | URL, init?: RequestInit) => {
      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        const commands = JSON.parse(init.body) as Array<Array<string>>;
        const isAllScoreSets = commands.length > 0 && commands.every(
          (cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && (cmd[1] as string).includes('resilience:score:v11:'),
        );
        if (isAllScoreSets) setPipelineSizes.push(commands.length);
      }
      return fetchImpl(input, init);
    }) as typeof fetch;
    globalThis.fetch = observing;
    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    // For 3 countries the batch fits in one pipeline. The contract under test
    // is that no single pipeline exceeds the SET_BATCH bound (30) — would-be
    // 222-element pipelines must be split into multiple smaller ones.
    assert.ok(setPipelineSizes.length > 0, 'warm must issue at least one score-SET pipeline');
    for (const size of setPipelineSizes) {
      assert.ok(size <= 30, `each score-SET pipeline must be ≤30 commands; saw ${size}`);
    }
  });
  it('does NOT publish ranking when score-key /set writes silently fail (persistence guard)', async () => {
    // Reviewer regression: trusting in-memory warm results without verifying
    // persistence turned a read-lag fix into a write-failure false positive.
    // With writes broken at the Upstash layer, coverage should NOT pass the
    // gate and neither the ranking nor its meta should be published.
    const { redis, fetchImpl } = installRedis({ ...RESILIENCE_FIXTURES });
    redis.set('resilience:static:index:v1', JSON.stringify({
      countries: ['NO', 'US'],
      recordCount: 2,
      failedDatasets: [],
      seedYear: 2026,
    }));
    // Intercept any pipeline SET to resilience:score:v11:* and reply with
    // non-OK results (persisted but authoritative signal says no). /set and
    // other paths pass through normally so history/interval writes succeed.
    const blockedScoreWrites = (async (input: RequestInfo | URL, init?: RequestInit) => {
      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
      if (url.endsWith('/pipeline') && typeof init?.body === 'string') {
        const commands = JSON.parse(init.body) as Array<Array<string>>;
        const allScoreSets = commands.length > 0 && commands.every(
          (cmd) => cmd[0] === 'SET' && typeof cmd[1] === 'string' && cmd[1].startsWith('resilience:score:v11:'),
        );
        if (allScoreSets) {
          return new Response(
            JSON.stringify(commands.map(() => ({ error: 'simulated write failure' }))),
            { status: 200 },
          );
        }
      }
      return fetchImpl(input, init);
    }) as typeof fetch;
    globalThis.fetch = blockedScoreWrites;
    await getResilienceRanking({ request: new Request('https://example.com') } as never, {});
    assert.ok(!redis.has('resilience:ranking:v11'), 'ranking must NOT be published when score writes failed');
    assert.ok(!redis.has('seed-meta:resilience:ranking'), 'seed-meta must NOT be written when score writes failed');
  });
  it('defaults rankStable=false when no interval data exists', () => {
    const item = buildRankingItem('ZZ', {
      countryCode: 'ZZ', overallScore: 50, level: 'medium',
      domains: [], trend: 'stable', change30d: 0,
      lowConfidence: false, imputationShare: 0,
      baselineScore: 50, stressScore: 50, stressFactor: 0.5, dataVersion: '',
    });
    assert.equal(item.rankStable, false, 'missing interval should default to unstable');
  });
  it('returns rankStable=false for null response (unscored country)', () => {
    const item = buildRankingItem('XX');
    assert.equal(item.rankStable, false);
  });
});