Files
worldmonitor/tests/resilience-scores-seed.test.mjs
Elie Habib 184e82cb40 feat(resilience): PR 3A — net-imports denominator for sovereignFiscalBuffer (#3380)
PR 3A of cohort-audit plan 2026-04-24-002. Construct correction for
re-export hubs: the SWF rawMonths denominator was gross imports, which
double-counted flow-through trade that never represents domestic
consumption. Net-imports fix:

  rawMonths = aum / (grossImports × (1 − reexportShareOfImports)) × 12

applied to any country in the re-export share manifest. Countries NOT
in the manifest get gross imports unchanged (status-quo fallback).

Plan acceptance gates — verified synthetically in this PR:

  Construct invariant. Two synthetic countries, same SWF, same gross
  imports. A re-exports 60%; B re-exports 0%. Post-fix, A's rawMonths
  is 2.5× B's (1/(1-0.6) = 2.5). Pinned in
  tests/resilience-net-imports-denominator.test.mts.

  SWF-heavy exporter invariant. Country with share ≤ 5%: rawMonths
  lift < 5% vs baseline (negligible). Pinned.

What shipped

1. Re-export share manifest infrastructure.
   - scripts/shared/reexport-share-manifest.yaml (new, empty) — schema
     committed; entries populated in follow-up PRs with UNCTAD
     Handbook citations.
   - scripts/shared/reexport-share-loader.mjs (new) — loader + strict
     validator, mirrors swf-manifest-loader.mjs.
   - scripts/seed-recovery-reexport-share.mjs (new) — publishes
     resilience:recovery:reexport-share:v1 from manifest. Empty
     manifest = valid (no countries, no adjustment).

2. SWF seeder uses net-imports denominator.
   - scripts/seed-sovereign-wealth.mjs exports computeNetImports(gross,
     share) — pure helper, unit-tested.
   - Per-country loop: reads manifest, computes denominatorImports,
     applies to rawMonths math.
   - Payload records annualImports (gross, audit), denominatorImports
     (used in math), reexportShareOfImports (provenance).
   - Summary log reports which countries had a net-imports adjustment
     applied with source year.

3. Bundle wiring.
   - Reexport-Share runs BEFORE Sovereign-Wealth in the recovery
     bundle so the SWF seeder reads fresh re-export data in the same
     cron tick.
   - tests/seed-bundle-resilience-recovery.test.mjs expected-entries
     updated (6 → 7) with ordering preservation.

4. Cache-prefix bump (per cache-prefix-bump-propagation-scope skill).
   - RESILIENCE_SCORE_CACHE_PREFIX: v11 → v12
   - RESILIENCE_RANKING_CACHE_KEY: v11 → v12
   - RESILIENCE_HISTORY_KEY_PREFIX: v6 → v7 (history rotation prevents
     30-day rolling window from mixing pre/post-fix scores and
     manufacturing false "falling" trends on deploy day).
   - Source of truth: server/worldmonitor/resilience/v1/_shared.ts
   - Mirrored in: scripts/seed-resilience-scores.mjs,
     scripts/validate-resilience-correlation.mjs,
     scripts/backtest-resilience-outcomes.mjs,
     scripts/validate-resilience-backtest.mjs,
     scripts/benchmark-resilience-external.mjs, api/health.js
   - Test literals bumped in 4 test files (26 line edits).
   - EXTENDED tests/resilience-cache-keys-health-sync.test.mts with
     a parity pass that reads every known mirror file and asserts
     both (a) canonical prefix present AND (b) no stale v<older>
     literals in non-comment code. Found one legacy log-line that
     still referenced v9 (scripts/seed-resilience-scores.mjs:342)
     and refactored it to use the RESILIENCE_RANKING_CACHE_KEY
     constant so future bumps self-update.

Explicitly NOT in this PR

- liquidReserveAdequacy denominator fix. The plan's PR 3A wording
  mentions both dims, but the RESERVES ratio (WB FI.RES.TOTL.MO) is a
  PRE-COMPUTED WB series; applying a post-hoc net-imports adjustment
  mixes WB's denominator year with our manifest-year, and the math
  change belongs in PR 3B (unified liquidity) where the α calibration
  is explicit. This PR stays scoped to sovereignFiscalBuffer.
- Live re-export share entries. The manifest ships EMPTY in this PR;
  entries with UNCTAD citations are one-per-PR follow-ups so each
  figure is individually auditable.

Verified

- tests/resilience-net-imports-denominator.test.mts — 9 pass (construct
  contract: 2.5× ratio gate, monotonicity, boundary rejections,
  backward-compat on missing manifest entry, cohort-proportionality,
  SWF-heavy-exporter-unchanged)
- tests/reexport-share-loader.test.mts — 7 pass (committed-manifest
  shape + 6 schema-violation rejections)
- tests/resilience-cache-keys-health-sync.test.mts — 5 pass (existing 3
  + 2 new parity checks across all mirror files)
- tests/seed-bundle-resilience-recovery.test.mjs — 17 pass (expected
  entries bumped to 7)
- npm run test:data — 6714 pass / 0 fail
- npm run typecheck / typecheck:api — green
- npm run lint / lint:md — clean

Deployment notes

Score + ranking + history cache prefixes all bump in the same deploy.
Per established v10→v11 precedent (and the cache-prefix-bump-
propagation-scope skill):
- Score / ranking: 6h TTL — the new prefix populates via the Railway
  resilience-scores cron within one tick.
- History: 30d ring — the v7 ring starts empty; the first 30 days
  post-deploy lack baseline points, so trend / change30d will read as
  "no change" until v7 accumulates a window.
- Legacy v11 keys can be deleted from Redis at any time post-deploy
  (no reader references them). Leaving them in place costs storage
  but does no harm.
2026-04-24 18:14:04 +04:00

274 lines
13 KiB
JavaScript

import { before, describe, it } from 'node:test';
import assert from 'node:assert/strict';
import {
RESILIENCE_RANKING_CACHE_KEY,
RESILIENCE_RANKING_CACHE_TTL_SECONDS,
RESILIENCE_SCORE_CACHE_PREFIX,
RESILIENCE_STATIC_INDEX_KEY,
computeIntervals,
} from '../scripts/seed-resilience-scores.mjs';
describe('exported constants', () => {
it('RESILIENCE_RANKING_CACHE_KEY matches server-side key (v11)', () => {
assert.equal(RESILIENCE_RANKING_CACHE_KEY, 'resilience:ranking:v12');
});
it('RESILIENCE_SCORE_CACHE_PREFIX matches server-side prefix (v11)', () => {
assert.equal(RESILIENCE_SCORE_CACHE_PREFIX, 'resilience:score:v12:');
});
it('RESILIENCE_RANKING_CACHE_TTL_SECONDS is 12 hours (2x cron interval)', () => {
// TTL must exceed cron interval (6h) so a missed/slow cron doesn't create
// an EMPTY_ON_DEMAND gap. Seeder and handler must agree on the TTL.
assert.equal(RESILIENCE_RANKING_CACHE_TTL_SECONDS, 12 * 60 * 60);
});
it('RESILIENCE_STATIC_INDEX_KEY matches expected key', () => {
assert.equal(RESILIENCE_STATIC_INDEX_KEY, 'resilience:static:index:v1');
});
});
describe('seed script does not export tsx/esm helpers', () => {
it('ensureResilienceScoreCached is not exported', async () => {
const mod = await import('../scripts/seed-resilience-scores.mjs');
assert.equal(typeof mod.ensureResilienceScoreCached, 'undefined');
});
it('createMemoizedSeedReader is not exported', async () => {
const mod = await import('../scripts/seed-resilience-scores.mjs');
assert.equal(typeof mod.createMemoizedSeedReader, 'undefined');
});
it('buildRankingItem is not exported (ranking write removed)', async () => {
const mod = await import('../scripts/seed-resilience-scores.mjs');
assert.equal(typeof mod.buildRankingItem, 'undefined');
});
it('sortRankingItems is not exported (ranking write removed)', async () => {
const mod = await import('../scripts/seed-resilience-scores.mjs');
assert.equal(typeof mod.sortRankingItems, 'undefined');
});
it('buildRankingPayload is not exported (ranking write removed)', async () => {
const mod = await import('../scripts/seed-resilience-scores.mjs');
assert.equal(typeof mod.buildRankingPayload, 'undefined');
});
});
describe('computeIntervals', () => {
it('returns p05 <= p95', () => {
const domainScores = [65, 70, 55, 80, 60];
const weights = [0.22, 0.20, 0.15, 0.25, 0.18];
const result = computeIntervals(domainScores, weights, 200);
assert.ok(result.p05 <= result.p95, `p05 (${result.p05}) should be <= p95 (${result.p95})`);
});
it('returns values within the domain score range', () => {
const domainScores = [40, 60, 50, 70, 55];
const weights = [0.22, 0.20, 0.15, 0.25, 0.18];
const result = computeIntervals(domainScores, weights, 200);
assert.ok(result.p05 >= 30, `p05 (${result.p05}) should be >= 30`);
assert.ok(result.p95 <= 80, `p95 (${result.p95}) should be <= 80`);
});
it('returns identical p05/p95 for uniform domain scores', () => {
const domainScores = [50, 50, 50, 50, 50];
const weights = [0.22, 0.20, 0.15, 0.25, 0.18];
const result = computeIntervals(domainScores, weights, 100);
assert.equal(result.p05, 50);
assert.equal(result.p95, 50);
});
it('produces wider interval for more diverse domain scores', () => {
const uniform = [50, 50, 50, 50, 50];
const diverse = [20, 90, 30, 80, 40];
const weights = [0.22, 0.20, 0.15, 0.25, 0.18];
const uResult = computeIntervals(uniform, weights, 500);
const dResult = computeIntervals(diverse, weights, 500);
const uWidth = uResult.p95 - uResult.p05;
const dWidth = dResult.p95 - dResult.p05;
assert.ok(dWidth > uWidth, `Diverse width (${dWidth}) should be > uniform width (${uWidth})`);
});
});
describe('script is self-contained .mjs', () => {
it('does not import from ../server/', async () => {
const { readFileSync } = await import('node:fs');
const { fileURLToPath } = await import('node:url');
const { dirname, join } = await import('node:path');
const dir = dirname(fileURLToPath(import.meta.url));
const src = readFileSync(join(dir, '..', 'scripts', 'seed-resilience-scores.mjs'), 'utf8');
assert.equal(src.includes('../server/'), false, 'Must not import from ../server/');
assert.equal(src.includes('tsx/esm'), false, 'Must not reference tsx/esm');
});
it('all imports are local ./ relative paths', async () => {
const { readFileSync } = await import('node:fs');
const { fileURLToPath } = await import('node:url');
const { dirname, join } = await import('node:path');
const dir = dirname(fileURLToPath(import.meta.url));
const src = readFileSync(join(dir, '..', 'scripts', 'seed-resilience-scores.mjs'), 'utf8');
const imports = [...src.matchAll(/from\s+['"]([^'"]+)['"]/g)].map((m) => m[1]);
for (const imp of imports) {
assert.ok(imp.startsWith('./'), `Import "${imp}" must be a local ./ relative path`);
}
});
});
describe('ensures ranking aggregate is present every cron, with truthful meta', () => {
// The ranking aggregate has the same 6h TTL as the per-country scores. If we
// only check + rebuild it inside the missing-scores branch, a cron tick that
// finds all scores still warm will skip the probe entirely — and the ranking
// can expire mid-cycle without anyone noticing until the NEXT cold-start
// cron. The probe + rebuild path must run on every cron, regardless of
// whether per-country warm was needed. The seed-meta write must be gated on
// post-rebuild verification so it never claims freshness over a missing key.
let src;
before(async () => {
const { readFileSync } = await import('node:fs');
const { fileURLToPath } = await import('node:url');
const { dirname, join } = await import('node:path');
const dir = dirname(fileURLToPath(import.meta.url));
src = readFileSync(join(dir, '..', 'scripts', 'seed-resilience-scores.mjs'), 'utf8');
});
it('extracts refreshRankingAggregate helper used by both warm and skip-warm branches', () => {
assert.match(src, /async function refreshRankingAggregate\b/, 'helper must be defined');
const calls = [...src.matchAll(/await\s+refreshRankingAggregate\s*\(/g)];
assert.ok(
calls.length >= 2,
`refreshRankingAggregate must be called from both branches (missing>0 and missing===0); found ${calls.length} call sites`,
);
});
it('always triggers the rebuild HTTP call — never short-circuits on "key still present"', () => {
// Skipping rebuild when the key exists recreates a timing hole: the key
// can be alive at probe time but expire a few minutes later, leaving a
// multi-hour gap until the NEXT cron where the key happens to be gone at
// probe time. Always rebuilding is one cheap HTTP per cron.
assert.doesNotMatch(
src,
/if\s*\(\s*rankingExists\s*!=\s*null[^)]*\)\s*return\s+true/,
'refreshRankingAggregate must not early-return when the ranking key is still present',
);
// The HTTP rebuild call itself must be unconditional (not gated on a probe).
assert.match(
src,
/async function refreshRankingAggregate[\s\S]*?\/api\/resilience\/v1\/get-resilience-ranking/,
'rebuild HTTP call must be in the body of refreshRankingAggregate unconditionally',
);
});
it('verifies the ranking key after the rebuild attempt for observability', () => {
assert.match(
src,
/\/strlen\/\$\{encodeURIComponent\(RESILIENCE_RANKING_CACHE_KEY\)\}/,
'STRLEN verify after rebuild surfaces when handler skipped the SET (coverage gate or partial pipeline)',
);
});
it('does NOT DEL the ranking before rebuild — uses ?refresh=1 instead', () => {
// The old flow (DEL + rebuild HTTP) created a brief absence window: if
// the rebuild request failed transiently, the ranking stayed absent
// until the next cron. We now send ?refresh=1 so the handler bypasses
// its cache-hit early-return and recomputes+SETs atomically. On failure,
// the existing (possibly stale) ranking remains.
assert.doesNotMatch(
src,
/\['DEL',\s*RESILIENCE_RANKING_CACHE_KEY\]/,
'seeder must not DEL the ranking key — ?refresh=1 is the atomic replacement path',
);
// ALL seeder-initiated calls to get-resilience-ranking must carry
// ?refresh=1. The bulk-warm path (inside `if (missing > 0)`) also needs
// it — the ranking TTL (12h) exceeds the score TTL (6h), so in the 6h-12h
// window the handler would hit its cache and skip the warm entirely,
// leaving per-country scores absent and coverage degraded.
const rankingEndpointCalls = [...src.matchAll(/\/api\/resilience\/v1\/get-resilience-ranking(\?[^\s'`"]*)?/g)];
assert.ok(rankingEndpointCalls.length >= 2, `expected at least 2 ranking-endpoint calls (bulk-warm + refresh), got ${rankingEndpointCalls.length}`);
for (const [full, query] of rankingEndpointCalls) {
assert.ok(
(query || '').includes('refresh=1'),
`ranking endpoint call must include ?refresh=1 — found: ${full}`,
);
}
});
it('seeder does NOT write seed-meta:resilience:ranking (handler is sole writer)', () => {
// A seeder-written meta can only attest to per-country score count, not
// to whether the ranking aggregate was actually published. Handler gates
// its SET on 75% coverage; if the gate trips, an older ranking survives
// and seeder meta would lie about freshness. Remove the seeder write —
// handler writes ranking + meta atomically, ensureRankingPresent()
// triggers the handler every cron so meta stays fresh during quiet Pro
// usage without the seeder needing to heartbeat.
assert.doesNotMatch(
src,
/writeRankingSeedMeta\s*\(/,
'seed-resilience-scores.mjs must NOT define or call writeRankingSeedMeta',
);
// Assert no SET command targets the meta key — comments that reference
// the key name are fine and useful for future maintainers.
assert.doesNotMatch(
src,
/\[\s*['"]SET['"]\s*,\s*['"]seed-meta:resilience:ranking['"]/,
'seeder must not issue SET seed-meta:resilience:ranking (handler is sole writer)',
);
});
});
describe('seed-bundle-resilience section interval keeps refresh alive', () => {
// The bundle runner skips a section when its seed-meta is younger than
// intervalMs * 0.8. If intervalMs is too long (e.g. 6h), most Railway cron
// fires hit the skip branch → refreshRankingAggregate() never runs →
// ranking can expire between actual runs and create EMPTY_ON_DEMAND gaps.
// 2h is the tested trade-off: frequent enough for the 12h ranking TTL to
// stay well-refreshed, cheap enough per warm-path run (~5-10s).
it('Resilience-Scores section has intervalMs ≤ 2 hours', async () => {
const { readFileSync } = await import('node:fs');
const { fileURLToPath } = await import('node:url');
const { dirname, join } = await import('node:path');
const dir = dirname(fileURLToPath(import.meta.url));
const src = readFileSync(
join(dir, '..', 'scripts', 'seed-bundle-resilience.mjs'),
'utf8',
);
// Match the label + section line, then extract the intervalMs value.
const m = src.match(/label:\s*'Resilience-Scores'[\s\S]{0,400}?intervalMs:\s*(\d+)\s*\*\s*HOUR/);
assert.ok(m, 'Resilience-Scores section must set intervalMs in HOUR units');
const hours = Number(m[1]);
assert.ok(
hours > 0 && hours <= 2,
`intervalMs must be ≤ 2 hours (found ${hours}) so refreshRankingAggregate runs frequently enough to keep the ranking key alive before its 12h TTL`,
);
});
});
describe('handler warm pipeline is chunked', () => {
// The 222-country pipeline SET payload (~600KB) exceeds the 5s pipeline
// timeout on Vercel Edge → handler reports 0 persisted, ranking skipped.
// The fix is to chunk into smaller pipelines that comfortably fit. Static
// assertion because behavioral tests can't easily synthesize 222 countries
// through the full scoring pipeline.
it('warmMissingResilienceScores splits SETs into batches', async () => {
const { readFileSync } = await import('node:fs');
const { fileURLToPath } = await import('node:url');
const { dirname, join } = await import('node:path');
const dir = dirname(fileURLToPath(import.meta.url));
const src = readFileSync(
join(dir, '..', 'server', 'worldmonitor', 'resilience', 'v1', '_shared.ts'),
'utf8',
);
assert.match(
src,
/const\s+SET_BATCH\s*=\s*\d+/,
'SET_BATCH constant must be defined',
);
assert.match(
src,
/for\s*\([^)]*i\s*\+=\s*SET_BATCH/,
'pipeline SETs must be issued in SET_BATCH-sized chunks',
);
});
});