worldmonitor/tests/resilience-scorers.test.mts
Elie Habib 7cf37c604c feat(resilience): PR 3 — dead-signal cleanup (plan §3.5, §3.6) (#3297)
* feat(resilience): PR 3 §3.5 — retire fuelStockDays from core score permanently

First commit in PR 3 of the resilience repair plan. Retires
`fuelStockDays` from the core score with no replacement.

Why permanent, not replaced:
IEA emergency-stockholding rules are defined in days of NET IMPORTS
and do not bind net exporters by design. Norway/Canada/US measured
in days-of-imports are incomparable to Germany/Japan measured the
same way — the construct is fundamentally different across the two
country classes. No globally-comparable recovery-fuel signal can
be built from this source; the pre-repair probe showed 100% imputed
at 50 for every country in the April 2026 freeze.

  scoreFuelStockDays:
    - Rewritten to return coverage=0 + observedWeight=0 +
      imputationClass='source-failure' for every country regardless
      of seed content.
    - Drops the dimension from the `recovery` domain's coverage-
      weighted mean automatically; remaining recovery dimensions
      pick up the share via re-normalisation in
      `_shared.ts#coverageWeightedMean`.
    - No explicit weight transfer needed — the coverage-weighted
      blend handles redistribution.
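
As an illustration of the re-normalisation (a minimal sketch, not the code in `_shared.ts`; the dimension scores are hypothetical), a dimension pinned at coverage=0 contributes nothing to either the numerator or the denominator, so the domain mean equals the mean over the remaining dimensions:

```typescript
// Sketch only: same shape as the coverage-weighted mean re-implemented in
// tests/resilience-scorers.test.mts; the input values below are hypothetical.
interface DimensionResult {
  id: string;
  score: number; // 0..100
  coverage: number; // 0..1
}

function coverageWeightedMean(dims: DimensionResult[]): number {
  const totalCoverage = dims.reduce((sum, d) => sum + d.coverage, 0);
  if (totalCoverage === 0) return 0;
  return dims.reduce((sum, d) => sum + d.score * d.coverage, 0) / totalCoverage;
}

const activeDims: DimensionResult[] = [
  { id: 'reserveAdequacy', score: 60, coverage: 1 },
  { id: 'externalDebtCoverage', score: 30, coverage: 1 },
];
// Retired dimension: coverage=0, so its score never enters the blend.
const retiredDim: DimensionResult = { id: 'fuelStockDays', score: 50, coverage: 0 };

const meanWithRetired = coverageWeightedMean([...activeDims, retiredDim]);
const meanActiveOnly = coverageWeightedMean(activeDims);
console.log(meanWithRetired === meanActiveOnly); // true: no explicit weight transfer needed
```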

  Registry:
    - recoveryFuelStockDays re-tagged from tier='enrichment' to
      tier='experimental' so the Core coverage gate treats it as
      out-of-score.
    - Description updated to make the retirement explicit; entry
      stays in the registry for structural continuity (the
      dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER
      for the 19-dimension tests; removing the dimension entirely is
      a PR 4 structural-audit concern).

  Housekeeping:
    - Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no
      longer read; noUnusedLocals would reject it).
    - Removed `RecoveryFuelStocksCountry` interface for the same
      reason. Comment at the removed declaration instructs future
      maintainers not to re-add the type as a reservation; when a
      new recovery-fuel concept lands, introduce a fresh interface.

Plan reference: §3.5 point 1 of
`docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.

51 resilience tests pass, typecheck + biome clean. The
`recovery` domain's published score will shift slightly for every
country because the 0.10 slot that fuelStockDays was imputing to
now redistributes; the compare-harness acceptance-gate rerun at
merge time will quantify the shift per plan §6 gates.

* feat(resilience): PR 3 §3.5 — retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves

BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35
fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60
countries max — so every non-BIS country fell through to
curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45).
Combined with reserveMarginPct already removed in PR 1, currencyExternal
was the clearest "construct absent for most of the world" carrier left
in the scorer.

Changes:

_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX
  reserves only. Coverage ladder:
    inflation + reserves → 0.85 (observed primary + secondary)
    inflation only       → 0.55
    reserves only        → 0.40
    neither              → 0.30 (IMPUTE.bisEer retained for snapshot
                                 continuity; semantics read as
                                 "no IMF + no WB reserves" now)
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved
  via comment only, flagged by noUnusedLocals), stddev() helper,
  getCountryBisExchangeRates() loader, BisExchangeRate interface,
  dateToSortableNumber() — all were exclusive callers of the retired
  BIS path.
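
The coverage ladder above can be read as a simple lookup on which inputs are present. A rough sketch follows; the function name and input shape are hypothetical, and only the four coverage values come from this commit:

```typescript
// Hypothetical input shape; only the ladder values (0.85 / 0.55 / 0.40 / 0.30)
// are taken from the commit message.
interface CurrencyExternalInputs {
  inflationPct: number | null; // IMF macro
  fxReserves: number | null; // WB FX reserves (field name assumed)
}

function currencyExternalCoverage(inputs: CurrencyExternalInputs): number {
  const hasInflation = inputs.inflationPct != null;
  const hasReserves = inputs.fxReserves != null;
  if (hasInflation && hasReserves) return 0.85; // observed primary + secondary
  if (hasInflation) return 0.55;
  if (hasReserves) return 0.4;
  return 0.3; // imputed fall-through
}
```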

_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core,
  sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability
  anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental
  (BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).

scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability →
  imf-macro-country-field field=inflationPct so the registry-parity
  test passes and the correlation harness sees the new construct.

tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266
  (BIS-outage coverage 0.35 branch) which collapsed to the inflation-
  only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55; inflation+
  reserves 0.55 → 0.85.

tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score
  shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).
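
The stressFactor figures follow directly from the stressScore. A small sketch of the arithmetic, using the same clamp and rounding as the test file (the helper name `stressFactorFor` is made up for illustration):

```typescript
function round(value: number, digits = 2): number {
  return Number(value.toFixed(digits));
}

// stressFactor = 1 - stressScore/100, clamped to [0, 0.5], rounded to 4 places.
function stressFactorFor(stressScore: number): number {
  return round(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);
}

const factorBefore = stressFactorFor(67.85); // stressScore before the rebuild
const factorAfter = stressFactorFor(67.21); // stressScore after the rebuild
console.log(factorBefore, factorAfter);
```

Both values sit below the 0.5 cap, so the clamp does not bind in either case.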

All 6324 data-tier tests pass. typecheck:api clean. No change to
seeders or Redis keys; this is a pure scorer + registry rebuild.

* feat(resilience): PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)

Plan §2.1 diagnosis table showed externalDebtCoverage saturating at
score=100 across all 9 probe countries — including stressed states.
Signal was collapsed. Root cause: (worst=5, best=0) gave every country
with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's
reserve-adequacy threshold (ratio=1.0) to score 80 — well into "no
worry" territory instead of the "mild warning" it should be.

Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score
50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure).
Ratios above 2.0 clamp to 0, consistent with "beyond this point the
country is already in crisis; exact value stops mattering."
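
To make the re-anchoring concrete, here is a sketch assuming a linear lower-is-better goalpost normaliser, which matches the figures quoted in this commit. The helper name is hypothetical and the real normaliser in `_dimension-scorers.ts` may differ in detail:

```typescript
// Assumed linear goalpost normaliser: best maps to 100, worst maps to 0,
// values outside the goalposts are clamped.
function normalizeLowerBetter(value: number, best: number, worst: number): number {
  const clamped = Math.min(Math.max(value, best), worst);
  return (100 * (worst - clamped)) / (worst - best);
}

const OLD_GOALPOSTS = { best: 0, worst: 5 };
const NEW_GOALPOSTS = { best: 0, worst: 2 };

for (const ratio of [0.2, 1.0, 1.5, 2.0, 4.0]) {
  const oldScore = normalizeLowerBetter(ratio, OLD_GOALPOSTS.best, OLD_GOALPOSTS.worst);
  const newScore = normalizeLowerBetter(ratio, NEW_GOALPOSTS.best, NEW_GOALPOSTS.worst);
  console.log(`ratio=${ratio} old=${oldScore.toFixed(0)} new=${newScore.toFixed(0)}`);
}
```

Under these assumptions ratio=1.0 moves from 80 to 50 and ratio=1.5 from 70 to 25, while anything at or above 2.0 clamps to 0.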

Files changed:

- _indicator-registry.ts: recoveryDebtToReserves goalposts
  {worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to
  cite Greenspan-Guidotti; inline comment documents anchor + rationale.

- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound
  changed from (0..5) to (0..2), with inline comment.

- docs/methodology/country-resilience-index.mdx: goalpost table row
  5-0 → 2-0, description cites Greenspan-Guidotti.

- docs/methodology/indicator-sources.yaml:
  * constructStatus: dead-signal → observed-mechanism (signal is now
    discriminating).
  * reviewNotes updated to describe the new anchor.
  * mechanismTestRationale names the Greenspan-Guidotti rule.

- tests/resilience-dimension-monotonicity.test.mts: updated the
  comment + picked values inside the (0..2) discriminating band (0.3
  and 1.5). Old values (1 vs 4) had 4 clamping to 0.

- tests/resilience-dimension-scorers.test.mts: NO score threshold
  relaxed >90 → >=85 (NO ratio=0.2 now scores 90, was 96).

- tests/resilience-scorers.test.mts: fixture drift:
  * domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
  * baselineScore 63.63 → 60.12 (extDebt is baseline type).
  * overallScore 65.52 → 63.27.
  * stressScore / stressFactor unchanged (extDebt is baseline-only).

All 6324 data-tier tests pass. typecheck:api clean.

* feat(resilience): PR 3 §3.6 — CI gate on indicator coverage and nominal weight

Plan §3.6 adds a new acceptance criterion (also §5 item 5):

> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.

This commit enforces the NOMINAL-WEIGHT half as a unit test that runs
on every CI build. The EFFECTIVE-INFLUENCE half is produced by
scripts/validate-resilience-sensitivity.mjs as a committed artifact;
the gate file only asserts that script still exists so a refactor that
removes it breaks the build loudly.

Why the gate exists (plan §3.6):

  "A dimension at 30% observed coverage carries the same effective
   weight as one at 95%. This contradicts the OECD/JRC handbook on
   uncertainty analysis."

Implementation:

tests/resilience-coverage-influence-gate.test.mts — three tests:
  1. Nominal-weight gate: for every core indicator with coverage < 137
     countries (70% of the ~195-country universe), computes its nominal
     overall weight as
       indicator.weight × (1/dimensions-in-domain) × domain-weight
     and asserts it does not exceed 5%. Equal-share-per-dimension is
     the *upper bound* on runtime weight (coverage-weighted mean gives
     a lower share when a dimension drops out), so this is a strict
     bound: if the nominal number passes, the runtime number also
     passes for every country.
  2. Effective-influence contract: asserts the sensitivity script
     exists at its expected path. Removing it (intentionally or by
     refactor) breaks the build.
  3. Audit visibility: prints the top 10 core indicators by nominal
     overall weight. No assertion beyond "ran" — the list lets
     reviewers spot outliers that pass the gate but are near the cap.

Current state (observed from audit output):

  recoveryReserveMonths:   nominal=4.17%  coverage=188
  recoveryDebtToReserves:  nominal=4.17%  coverage=185
  recoveryImportHhi:       nominal=4.17%  coverage=190
  inflationStability:      nominal=3.40%  coverage=185
  electricityConsumption:  nominal=3.30%  coverage=217
  ucdpConflict:            nominal=3.09%  coverage=193

Every core indicator has coverage ≥ 180 (already enforced by the
pre-existing indicator-tiering test), so the nominal-weight gate has
no current violators — its purpose is catching future drift, not
flagging today's state.

All 6327 data-tier tests pass. typecheck:api clean.

* docs(resilience): PR 3 methodology doc — document §3.5 dead-signal retirements + §3.6 coverage gate

Methodology-doc update capturing the three §3.5 landings and the §3.6 CI
gate. Five edits:

1. **Known construct limitations section (#5 and #6):** strike through the
   original "dead signals" and "no coverage-based weight cap" items,
   annotate them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" +
   specifics of what shipped.

2. **Currency & External H4 section:** completely rewritten. Old table
   (fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is
   replaced by the two-indicator post-PR-3 table (inflationStability at
   0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out
   (0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as
   experimental-tier drill-downs only.

3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the
   methodology-lint H4-to-dimension mapping does not break; body
   rewritten to explain that the dimension is retired from core but the
   seeder still runs for IEA-member drill-downs.

4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description
   cites Greenspan-Guidotti reserve-adequacy rule.

5. **New v2.2 changelog entry** — PR 3 dead-signal cleanup, covering
   §3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit
   updates.

No scoring or code changes in this commit. Methodology-lint test passes
(H4 mapping intact). All 6327 data-tier tests pass.

* fix(resilience): PR 3 §3.6 gate — correct share-denominator for coverage-weighted aggregation

Reviewer catch (thanks). The previous gate computed each indicator's
nominal overall weight as

  indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight

and claimed this was an upper bound ("actual runtime weight is ≤ this
when some dimensions drop out on coverage"). That is BACKWARDS for
this scorer.

The domain aggregation is coverage-weighted
(server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean),
so when a dimension pins at coverage=0 it is EXCLUDED from the
denominator and the surviving dimensions' shares go UP, not down.

PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to
coverage=0 for every country — so in the current live state the
recovery domain has 5 contributing dimensions (not 6), and each core
recovery indicator's nominal share is

  1.0 × 1/5 × 0.25 = 5.00% (was mis-reported as 4.17%)

The old gate therefore under-estimated nominal influence and could
silently pass exactly the kind of low-coverage overweight regression
it is meant to block.

Fix:

- Added `coreBearingDimensions(domainId)` helper that counts only
  dimensions that have ≥1 core indicator in the registry. A dimension
  with only experimental/enrichment entries (post-retirement
  fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count,
  not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict
  upper bound — explicitly calls out the dynamic case (source failure
  raising surviving dim shares further) as the sensitivity script's
  responsibility.
- Added a new regression test: asserts (a) at least one recovery
  dimension is all-non-core (fuelStockDays post-retirement),
  (b) fuelStockDays has zero core indicators, and (c)
  recoveryDebtToReserves nominal = 0.05 exactly (not 0.0417); any
  reversion of the retirement or regression to the N_total denominator
  will fail loudly.
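
A rough sketch of the corrected denominator, using a toy registry. The entry shape, the dimension assignments, the two `toy*` indicator ids, and the 0.25 domain weight passed in are assumptions for illustration; the real helpers live in the gate test file:

```typescript
type Tier = 'core' | 'enrichment' | 'experimental';

interface RegistryEntry {
  id: string;
  domain: string;
  dimension: string;
  tier: Tier;
  weight: number; // weight within its dimension
}

// Count only dimensions that carry at least one core indicator.
function coreBearingDimensions(registry: RegistryEntry[], domainId: string): Set<string> {
  const dims = new Set<string>();
  for (const entry of registry) {
    if (entry.domain === domainId && entry.tier === 'core') dims.add(entry.dimension);
  }
  return dims;
}

function nominalOverallWeight(registry: RegistryEntry[], entry: RegistryEntry, domainWeight: number): number {
  const coreDimCount = coreBearingDimensions(registry, entry.domain).size;
  return coreDimCount === 0 ? 0 : entry.weight * (1 / coreDimCount) * domainWeight;
}

// Toy recovery domain: five core-bearing dimensions plus the retired one.
const debtToReserves: RegistryEntry = {
  id: 'recoveryDebtToReserves', domain: 'recovery', dimension: 'externalDebtCoverage', tier: 'core', weight: 1,
};
const toyRegistry: RegistryEntry[] = [
  debtToReserves,
  { id: 'recoveryReserveMonths', domain: 'recovery', dimension: 'reserveAdequacy', tier: 'core', weight: 1 },
  { id: 'recoveryImportHhi', domain: 'recovery', dimension: 'importConcentration', tier: 'core', weight: 1 },
  { id: 'toyFiscalIndicator', domain: 'recovery', dimension: 'fiscalSpace', tier: 'core', weight: 1 },
  { id: 'toyContinuityIndicator', domain: 'recovery', dimension: 'stateContinuity', tier: 'core', weight: 1 },
  { id: 'recoveryFuelStockDays', domain: 'recovery', dimension: 'fuelStockDays', tier: 'experimental', weight: 1 },
];

const debtToReservesShare = nominalOverallWeight(toyRegistry, debtToReserves, 0.25);
console.log(debtToReservesShare.toFixed(4));
```

Dividing by the raw dimension count (6) instead would give 1/6 × 0.25, which is the under-reported 4.17% figure.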

Top-10 audit output now correctly shows:

  recoveryReserveMonths:   nominal=5%     coverage=188
  recoveryDebtToReserves:  nominal=5%     coverage=185
  recoveryImportHhi:       nominal=5%     coverage=190
  (was 4.17% each under the old math)

All 486 resilience tests pass. typecheck:api clean.

Note: the 5% figure is exactly AT the cap, not over it. "exceed" means
strictly > 5%, so it still passes. But now the reviewer / audit log
reflects reality.
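
Spelled out as a predicate (a sketch; the constant and function names are hypothetical, while the 70% / ~195-country / 5% figures come from the gate description in this PR), the boundary case looks like this:

```typescript
const COUNTRY_UNIVERSE = 195; // approximate universe quoted in the gate commit
const MIN_COVERAGE = Math.ceil(0.7 * COUNTRY_UNIVERSE); // 137 countries
const NOMINAL_WEIGHT_CAP = 0.05;

// An indicator violates the gate only when it is BOTH low-coverage and
// strictly above the cap; sitting exactly at 5% passes.
function violatesNominalWeightGate(coverageCountries: number, nominalWeight: number): boolean {
  return coverageCountries < MIN_COVERAGE && nominalWeight > NOMINAL_WEIGHT_CAP;
}
```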

* fix(resilience): PR 3 review — retired-dim confidence drag + false source-failure label

Addresses the Codex review P1 + P2 on PR #3297.

P1 — retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct),
but computeLowConfidence, computeOverallCoverage, and the widget's
formatResilienceConfidence averaged across all 19 dimensions. That
dragged every country's reported averageCoverage down — US went from
0.8556 (active dims only) to 0.8105 (all dims) — enough drift to
misclassify edge countries as lowConfidence and to shift the ranking
widget's overallCoverage pill for every country.

Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in
_dimension-scorers.ts and filter it out of all three averages. The
filter is keyed on the retired-dim REGISTRY, not on coverage === 0,
because a non-retired dim can legitimately emit coverage=0 on a
genuinely sparse-data country via weightedBlend fall-through — those
entries MUST keep dragging confidence down (that is the sparse-data
signal lowConfidence exists to surface). Verified: sparse-country
release-gate test (marks sparse WHO/FAO countries as low confidence)
still passes with the registry-keyed filter; would have failed with
a naive coverage=0 filter.
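
The difference between the two filters can be sketched as follows (function names and fixture values are hypothetical; only the retired-set idea is taken from this commit):

```typescript
interface DimensionCoverage {
  id: string;
  coverage: number; // 0..1
}

const RETIRED_DIMENSIONS: ReadonlySet<string> = new Set(['fuelStockDays']);

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Registry-keyed: drop only dimensions that were retired on purpose.
function averageCoverageRegistryKeyed(dims: DimensionCoverage[]): number {
  return mean(dims.filter((d) => !RETIRED_DIMENSIONS.has(d.id)).map((d) => d.coverage));
}

// Naive: drop anything at coverage=0, which also hides genuine data gaps.
function averageCoverageNaive(dims: DimensionCoverage[]): number {
  return mean(dims.filter((d) => d.coverage !== 0).map((d) => d.coverage));
}

const sparseCountry: DimensionCoverage[] = [
  { id: 'infrastructure', coverage: 0.8 },
  { id: 'foodWater', coverage: 0 }, // real gap, must keep dragging the average
  { id: 'fuelStockDays', coverage: 0 }, // retired, must be ignored
];

console.log(averageCoverageRegistryKeyed(sparseCountry), averageCoverageNaive(sparseCountry));
```

In this toy fixture the registry-keyed average stays low because of the real gap, whereas the naive filter would report the country as well covered.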

Server-client parity: widget-utils cannot import server code, so
RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept
in lockstep by tests/resilience-retired-dimensions-parity.test.mts
(parses the widget file as text, same pattern as existing widget-util
tests that can't import the widget module directly).

P2 — false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure',
which the widget maps to "Source down: upstream seeder failed" with
a `!` icon for every country. That is semantically wrong for an
intentional retirement. Flipped to null so the widget's absent-path
renders a neutral cell without a false outage label. null is already
a legal value of ResilienceDimensionScore.imputationClass; no type
change needed.

Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the
  registry-keyed filter semantic for computeOverallCoverage +
  computeLowConfidence. Includes a negative-control test proving
  non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new):
  lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired
  coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for
  fuelStockDays flipped to null.

All 494 resilience tests + full 6336/6336 data-tier suite pass.
Typecheck clean for both tsconfig.json and tsconfig.api.json.

* docs(resilience): align methodology + registry metadata with shipped imputationClass=null

Follow-up to the previous PR 3 review commit that flipped
scoreFuelStockDays's imputationClass from 'source-failure' to null to
avoid a false "Source down" widget label on every country. The code
changed; the doc and registry metadata did not, leaving three sites
in the methodology mdx and two comment/description sites in the
registry still claiming imputationClass='source-failure'. Any future
reviewer (or tooling that treats the registry description as
authoritative) would be misled.

This commit rewrites those sites to describe the shipped behavior:
 - imputationClass=null (not 'source-failure'), with the rationale
 - exclusion from confidence/coverage averages via the
   RESILIENCE_RETIRED_DIMENSIONS registry filter
 - the distinction between structural retirement (filtered) and
   runtime coverage=0 (kept so sparse-data countries still flag
   lowConfidence)

Touched:
 - docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
 - server/worldmonitor/resilience/v1/_indicator-registry.ts
   (recoveryFuelStockDays comment block + description field)

No code-behavior change. Docs-only.

Tests: 157 targeted resilience tests pass (incl. methodology-lint +
widget + release-gate + confidence-averaging). Typecheck clean on
both tsconfig.json and tsconfig.api.json.
2026-04-22 23:57:28 +04:00


import assert from 'node:assert/strict';
import { afterEach, describe, it } from 'node:test';

import {
  RESILIENCE_DIMENSION_DOMAINS,
  RESILIENCE_DIMENSION_ORDER,
  RESILIENCE_DIMENSION_SCORERS,
  RESILIENCE_DIMENSION_TYPES,
  RESILIENCE_DOMAIN_ORDER,
  getResilienceDomainWeight,
  scoreAllDimensions,
  scoreEnergy,
  scoreInfrastructure,
  scoreTradeSanctions,
} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';
import { installRedis } from './helpers/fake-upstash-redis.mts';
import { RESILIENCE_FIXTURES } from './helpers/resilience-fixtures.mts';

const originalFetch = globalThis.fetch;
const originalRedisUrl = process.env.UPSTASH_REDIS_REST_URL;
const originalRedisToken = process.env.UPSTASH_REDIS_REST_TOKEN;
const originalVercelEnv = process.env.VERCEL_ENV;

afterEach(() => {
  globalThis.fetch = originalFetch;
  if (originalRedisUrl == null) delete process.env.UPSTASH_REDIS_REST_URL;
  else process.env.UPSTASH_REDIS_REST_URL = originalRedisUrl;
  if (originalRedisToken == null) delete process.env.UPSTASH_REDIS_REST_TOKEN;
  else process.env.UPSTASH_REDIS_REST_TOKEN = originalRedisToken;
  if (originalVercelEnv == null) delete process.env.VERCEL_ENV;
  else process.env.VERCEL_ENV = originalVercelEnv;
});

describe('resilience scorer contracts', () => {
  it('keeps every dimension scorer within the 0..100 range for known countries', async () => {
    installRedis(RESILIENCE_FIXTURES);

    for (const countryCode of ['NO', 'US', 'YE']) {
      for (const [dimensionId, scorer] of Object.entries(RESILIENCE_DIMENSION_SCORERS)) {
        const result = await scorer(countryCode);
        assert.ok(result.score >= 0 && result.score <= 100, `${countryCode}/${dimensionId} score out of bounds: ${result.score}`);
        assert.ok(result.coverage >= 0 && result.coverage <= 1, `${countryCode}/${dimensionId} coverage out of bounds: ${result.coverage}`);
      }
    }
  });

  it('returns coverage=0 when all backing seeds are missing (source outage must not impute)', async () => {
    installRedis({});

    // Imputation only applies when the source is loaded but the country is absent.
    // A null source (seed outage) must NOT be reclassified as a "stable country" signal.
    // Exceptions:
    // - scoreFoodWater reads per-country static data; fao=null in a loaded static
    //   record is a legitimate "not in active crisis" signal.
    // - scoreCurrencyExternal (T1.7 source-failure wiring): the legacy absence
    //   branch (score=50, coverage=0, imputationClass=null) was deleted so every
    //   imputed return path carries a taxonomy tag. When IMF inflation + WB
    //   reserves are both absent (the BIS feeds were retired in PR 3 §3.5), the
    //   scorer falls through to IMPUTE.bisEer (curated_list_absent
    //   → unmonitored, coverage=0.3). The aggregation pass then re-tags to
    //   source-failure when the adapter is in seed-meta failedDatasets. This is the
    //   single source of truth for "no currency data"; null-imputationClass paths
    //   on non-real-data return branches are no longer permitted.
    // PR 3 §3.5: fuelStockDays removed from this set. scoreFuelStockDays
    // now returns coverage=0 + imputationClass=null for every country
    // (retired), so it passes the default coverage=0 assertion below
    // instead of the T1.7 fall-through assertion. The `null` tag (rather
    // than 'source-failure') reflects the intentional retirement; see
    // the widget `formatDimensionConfidence` absent-path which would
    // otherwise surface a false "Source down" label on every country.
    const coverageZeroExempt = new Set([
      'currencyExternal',
      'fiscalSpace', 'reserveAdequacy', 'externalDebtCoverage',
      'importConcentration', 'stateContinuity',
    ]);

    for (const [dimensionId, scorer] of Object.entries(RESILIENCE_DIMENSION_SCORERS)) {
      const result = await scorer('US');
      assert.ok(result.score >= 0 && result.score <= 100, `${dimensionId} fallback score out of bounds: ${result.score}`);
      if (coverageZeroExempt.has(dimensionId)) {
        // The scorer emits the curated_list_absent taxonomy entry directly;
        // coverage is the taxonomy's certaintyCoverage (0.3) rather than 0.
        assert.ok(result.imputedWeight > 0, `${dimensionId} must emit imputed weight on T1.7 fall-through`);
        assert.equal(result.imputationClass, 'unmonitored',
          `${dimensionId} fall-through must tag unmonitored, got ${result.imputationClass}`);
        continue;
      }
      assert.equal(result.coverage, 0, `${dimensionId} must have coverage=0 when all seeds missing (source outage ≠ country absence)`);
    }
  });

  it('produces the expected weighted overall score from the known fixture dimensions', async () => {
    installRedis(RESILIENCE_FIXTURES);

    const scoreMap = await scoreAllDimensions('US');
    const domainAverages = Object.fromEntries(RESILIENCE_DOMAIN_ORDER.map((domainId) => {
      const dimensionScores = RESILIENCE_DIMENSION_ORDER
        .filter((dimensionId) => RESILIENCE_DIMENSION_DOMAINS[dimensionId] === domainId)
        .map((dimensionId) => scoreMap[dimensionId].score);
      const average = Number((dimensionScores.reduce((sum, value) => sum + value, 0) / dimensionScores.length).toFixed(2));
      return [domainId, average];
    }));

    // PR 3 §3.5: economic 68.33 → 66.33 after currencyExternal rebuild.
    // Recovery 54.83 → 47.33 after externalDebtCoverage goalpost was
    // tightened from (0..5) to (0..2) per §3.5 point 3 (US ratio=1.5
    // now scores 25 instead of 70).
    assert.deepEqual(domainAverages, {
      economic: 66.33,
      infrastructure: 79,
      energy: 80,
      'social-governance': 61.75,
      'health-food': 60.5,
      recovery: 47.33,
    });

    function round(v: number, d = 2) { return Number(v.toFixed(d)); }
    function coverageWeightedMean(dims: { score: number; coverage: number }[]) {
      const totalCov = dims.reduce((s, d) => s + d.coverage, 0);
      if (!totalCov) return 0;
      return dims.reduce((s, d) => s + d.score * d.coverage, 0) / totalCov;
    }

    const dimensions = RESILIENCE_DIMENSION_ORDER.map((id) => ({
      id,
      score: round(scoreMap[id].score),
      coverage: round(scoreMap[id].coverage),
    }));

    const baselineDims = dimensions.filter((d) => {
      const t = RESILIENCE_DIMENSION_TYPES[d.id as keyof typeof RESILIENCE_DIMENSION_TYPES];
      return t === 'baseline' || t === 'mixed';
    });
    const stressDims = dimensions.filter((d) => {
      const t = RESILIENCE_DIMENSION_TYPES[d.id as keyof typeof RESILIENCE_DIMENSION_TYPES];
      return t === 'stress' || t === 'mixed';
    });

    const baselineScore = round(coverageWeightedMean(baselineDims));
    const stressScore = round(coverageWeightedMean(stressDims));
    const stressFactor = round(Math.max(0, Math.min(1 - stressScore / 100, 0.5)), 4);

    // PR 3 §3.5: 62.64 → 63.63 (fuelStockDays retirement) → 60.12
    // (externalDebtCoverage goalpost tightened; US score drops from 70
    // to 25, pulling the coverage-weighted baseline mean down).
    assert.equal(baselineScore, 60.12);
    // PR 3 §3.5: 65.84 → 67.85 (fuelStockDays retirement) → 67.21
    // (currencyExternal rebuilt on IMF inflation + WB reserves, coverage
    // shifts and US stress score moves). stressFactor updates in lockstep:
    // 1 - 67.21/100 = 0.3279, below the 0.5 cap so the clamp does not bind.
    assert.equal(stressScore, 67.21);
    assert.equal(stressFactor, 0.3279);

    const overallScore = round(
      RESILIENCE_DOMAIN_ORDER.map((domainId) => {
        const dimScores = RESILIENCE_DIMENSION_ORDER
          .filter((id) => RESILIENCE_DIMENSION_DOMAINS[id] === domainId)
          .map((id) => ({ score: round(scoreMap[id].score), coverage: round(scoreMap[id].coverage) }));
        const totalCov = dimScores.reduce((sum, d) => sum + d.coverage, 0);
        const cwMean = totalCov ? dimScores.reduce((sum, d) => sum + d.score * d.coverage, 0) / totalCov : 0;
        return round(cwMean) * getResilienceDomainWeight(domainId);
      }).reduce((sum, v) => sum + v, 0),
    );
    // PR 3 §3.5: 65.57 → 65.82 (fuelStockDays retirement) → 65.52
    // (currencyExternal rebuild) → 63.27 (externalDebtCoverage goalpost
    // tightened 0..5 → 0..2; US recovery-domain contribution drops).
    assert.equal(overallScore, 63.27);
  });

  it('baselineScore is computed from baseline + mixed dimensions only', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const scoreMap = await scoreAllDimensions('US');

    const baselineDimIds = RESILIENCE_DIMENSION_ORDER.filter((id) => {
      const t = RESILIENCE_DIMENSION_TYPES[id];
      return t === 'baseline' || t === 'mixed';
    });
    const stressOnlyDimIds = RESILIENCE_DIMENSION_ORDER.filter((id) => RESILIENCE_DIMENSION_TYPES[id] === 'stress');

    assert.ok(baselineDimIds.length > 0, 'should have baseline dims');
    for (const id of stressOnlyDimIds) {
      assert.ok(!baselineDimIds.includes(id), `stress-only dimension ${id} should not appear in baseline set`);
    }
    assert.ok(baselineDimIds.includes('macroFiscal'), 'macroFiscal should be in baseline set');
    assert.ok(baselineDimIds.includes('infrastructure'), 'infrastructure should be in baseline set');
    assert.ok(baselineDimIds.includes('logisticsSupply'), 'mixed logisticsSupply should be in baseline set');
  });

  it('stressScore is computed from stress + mixed dimensions only', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const scoreMap = await scoreAllDimensions('US');

    const stressDimIds = RESILIENCE_DIMENSION_ORDER.filter((id) => {
      const t = RESILIENCE_DIMENSION_TYPES[id];
      return t === 'stress' || t === 'mixed';
    });
    const baselineOnlyDimIds = RESILIENCE_DIMENSION_ORDER.filter((id) => RESILIENCE_DIMENSION_TYPES[id] === 'baseline');

    assert.ok(stressDimIds.length > 0, 'should have stress dims');
    for (const id of baselineOnlyDimIds) {
      assert.ok(!stressDimIds.includes(id), `baseline-only dimension ${id} should not appear in stress set`);
    }
    assert.ok(stressDimIds.includes('currencyExternal'), 'currencyExternal should be in stress set');
    assert.ok(stressDimIds.includes('borderSecurity'), 'borderSecurity should be in stress set');
    assert.ok(stressDimIds.includes('energy'), 'mixed energy should be in stress set');
  });

  it('overallScore = sum(domainScore * domainWeight)', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const scoreMap = await scoreAllDimensions('US');

    function round(v: number, d = 2) { return Number(v.toFixed(d)); }
    function coverageWeightedMean(dims: { score: number; coverage: number }[]) {
      const totalCov = dims.reduce((s, d) => s + d.coverage, 0);
      if (!totalCov) return 0;
      return dims.reduce((s, d) => s + d.score * d.coverage, 0) / totalCov;
    }

    const dimensions = RESILIENCE_DIMENSION_ORDER.map((id) => ({
      id, score: round(scoreMap[id].score), coverage: round(scoreMap[id].coverage),
    }));

    const grouped = new Map<string, typeof dimensions>();
    for (const domainId of RESILIENCE_DOMAIN_ORDER) grouped.set(domainId, []);
    for (const dim of dimensions) {
      const domainId = RESILIENCE_DIMENSION_DOMAINS[dim.id as keyof typeof RESILIENCE_DIMENSION_DOMAINS];
      grouped.get(domainId)?.push(dim);
    }

    const expected = round(
      RESILIENCE_DOMAIN_ORDER.reduce((sum, domainId) => {
        const domainDims = grouped.get(domainId) ?? [];
        const domainScore = round(coverageWeightedMean(domainDims));
        return sum + domainScore * getResilienceDomainWeight(domainId);
      }, 0),
    );

    assert.ok(expected > 0, 'overall should be positive');
    // PR 3 §3.5: 65.82 → 65.52 (currencyExternal rebuild) → 63.27 after
    // externalDebtCoverage goalpost tightened from (0..5) to (0..2).
    assert.equal(expected, 63.27, 'overallScore should match sum(domainScore * domainWeight); 65.52 → 63.27 after PR 3 §3.5 externalDebtCoverage re-goalpost');
  });

  it('stressFactor is still computed (informational) and clamped to [0, 0.5]', () => {
    function clampStressFactor(stressScore: number) {
      return Math.max(0, Math.min(1 - stressScore / 100, 0.5));
    }

    assert.equal(clampStressFactor(100), 0, 'perfect stress score = zero factor');
    assert.equal(clampStressFactor(0), 0.5, 'zero stress score = max factor 0.5');
    assert.equal(clampStressFactor(50), 0.5, 'stress 50 = clamped to 0.5');
    assert.ok(clampStressFactor(70) >= 0 && clampStressFactor(70) <= 0.5, 'stress 70 within bounds');
    assert.ok(clampStressFactor(110) >= 0, 'stress above 100 still clamped');
  });
});

const DE_BASE_FIXTURES = {
  ...RESILIENCE_FIXTURES,
  'resilience:static:DE': {
    iea: { energyImportDependency: { value: 65, year: 2024, source: 'IEA' } },
  },
  'energy:mix:v1:DE': {
    iso2: 'DE', country: 'Germany', year: 2023,
    coalShare: 30, gasShare: 15, oilShare: 1, renewShare: 46,
  },
};

describe('scoreEnergy storageBuffer metric', () => {
  it('EU country with high storage (>80% fill) contributes near-zero storageStress', async () => {
    installRedis({
      ...DE_BASE_FIXTURES,
      'energy:gas-storage:v1:DE': { iso2: 'DE', fillPct: 90, trend: 'stable' },
    });
    const result = await scoreEnergy('DE');
    assert.ok(result.score >= 0 && result.score <= 100, `score out of bounds: ${result.score}`);
    assert.ok(result.coverage > 0, 'coverage should be > 0 when static data present');
  });

  it('EU country with low storage (20% fill) scores lower than with high storage', async () => {
    installRedis({
      ...DE_BASE_FIXTURES,
      'energy:gas-storage:v1:DE': { iso2: 'DE', fillPct: 20, trend: 'withdrawing' },
    });
    const resultLow = await scoreEnergy('DE');

    installRedis({
      ...DE_BASE_FIXTURES,
      'energy:gas-storage:v1:DE': { iso2: 'DE', fillPct: 90, trend: 'stable' },
    });
    const resultHigh = await scoreEnergy('DE');

    assert.ok(resultLow.score < resultHigh.score, `low storage (${resultLow.score}) should score lower than high storage (${resultHigh.score})`);
  });

  it('non-EU country with no gas-storage key drops storageBuffer weight gracefully', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const result = await scoreEnergy('US');
    assert.ok(result.score >= 0 && result.score <= 100, `score out of bounds: ${result.score}`);
    assert.ok(result.coverage > 0, 'coverage should be > 0 when other data is present');
    assert.ok(result.coverage < 1, 'coverage < 1 when storageBuffer is missing');
  });

  it('EU country with null fillPct falls back gracefully (excludes storageBuffer from weighted avg)', async () => {
    installRedis({
      ...DE_BASE_FIXTURES,
      'energy:gas-storage:v1:DE': { iso2: 'DE', fillPct: null },
    });
    const resultNull = await scoreEnergy('DE');

    installRedis(DE_BASE_FIXTURES);
    const resultMissing = await scoreEnergy('DE');

    assert.ok(resultNull.score >= 0 && resultNull.score <= 100, `score out of bounds: ${resultNull.score}`);
    assert.equal(resultNull.score, resultMissing.score, 'null fillPct should behave identically to missing key');
  });
});

describe('scoreInfrastructure: broadband penetration', () => {
  it('pins expected numeric score and coverage for US with broadband data', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const result = await scoreInfrastructure('US');

    assert.equal(result.score, 84, 'pinned infrastructure score for US fixture');
    assert.equal(result.coverage, 1, 'full coverage when all four metrics present');
  });

  it('broadband removal lowers score and coverage', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const withBroadband = await scoreInfrastructure('US');

    const noBroadbandFixtures = structuredClone(RESILIENCE_FIXTURES);
    const usStatic = noBroadbandFixtures['resilience:static:US'] as Record<string, unknown>;
    const infra = usStatic.infrastructure as { indicators: Record<string, unknown> };
    delete infra.indicators['IT.NET.BBND.P2'];
    installRedis(noBroadbandFixtures);
    const withoutBroadband = await scoreInfrastructure('US');

    assert.equal(withoutBroadband.score, 83, 'pinned infrastructure score without broadband');
    assert.equal(withoutBroadband.coverage, 0.85, 'coverage drops to 0.85 without broadband (0.15 weight missing)');
    assert.ok(withBroadband.score > withoutBroadband.score, 'broadband presence increases infrastructure score');
    assert.ok(withBroadband.coverage > withoutBroadband.coverage, 'broadband presence increases coverage');
  });
});

describe('scoreTradeSanctions WB tariff rate', () => {
  it('WB tariff rate contributes to trade score', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const result = await scoreTradeSanctions('US');
    assert.ok(result.score >= 0 && result.score <= 100, `score out of bounds: ${result.score}`);
    assert.ok(result.coverage > 0, 'coverage should be > 0 when tariff data is present');
  });

  it('high tariff rate country scores lower than low tariff rate', async () => {
    installRedis(RESILIENCE_FIXTURES);
    const noResult = await scoreTradeSanctions('NO');
    const yeResult = await scoreTradeSanctions('YE');
    assert.ok(noResult.score > yeResult.score, `NO (${noResult.score}) should score higher than YE (${yeResult.score}) due to lower tariff rate`);
  });
});