mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(resilience): PR 3 §3.5 — retire fuelStockDays from core score permanently
First commit in PR 3 of the resilience repair plan. Retires
`fuelStockDays` from the core score with no replacement.
Why permanent, not replaced:
IEA emergency-stockholding rules are defined in days of NET IMPORTS
and do not bind net exporters by design. Norway/Canada/US measured
in days-of-imports are incomparable to Germany/Japan measured the
same way — the construct is fundamentally different across the two
country classes. No globally-comparable recovery-fuel signal can
be built from this source; the pre-repair probe showed 100% imputed
at 50 for every country in the April 2026 freeze.
scoreFuelStockDays:
- Rewritten to return coverage=0 + observedWeight=0 +
imputationClass='source-failure' for every country regardless
of seed content.
- Drops the dimension from the `recovery` domain's coverage-weighted
mean automatically; remaining recovery dimensions pick up the share
via re-normalisation in `_shared.ts#coverageWeightedMean`.
- No explicit weight transfer needed — the coverage-weighted
blend handles redistribution.
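The redistribution described above can be illustrated with a minimal sketch. It assumes each dimension's weight in the domain mean is simply its coverage; the real `coverageWeightedMean` in `_shared.ts` may differ in detail, and the `DimensionResult` shape, the function name, and the fixture values here are hypothetical.

```typescript
// Hypothetical result shape; the real scorer type may carry more fields.
interface DimensionResult {
  id: string;
  score: number; // 0..100
  coverage: number; // 0..1
}

// Assumption: each dimension contributes in proportion to its coverage,
// so a dimension at coverage=0 adds nothing to numerator or denominator.
function coverageWeightedMeanSketch(dims: DimensionResult[]): number {
  let weightedSum = 0;
  let totalCoverage = 0;
  for (const d of dims) {
    weightedSum += d.score * d.coverage;
    totalCoverage += d.coverage;
  }
  return totalCoverage > 0 ? weightedSum / totalCoverage : 0;
}

// Illustrative fixture: two live dimensions plus the retired one.
const recoveryDims: DimensionResult[] = [
  { id: 'fiscalSpace', score: 80, coverage: 1 },
  { id: 'reserveAdequacy', score: 40, coverage: 1 },
  { id: 'fuelStockDays', score: 0, coverage: 0 }, // retired: coverage pinned to 0
];

// The retired dimension drops out; the other two split the domain equally.
const recoveryMean = coverageWeightedMeanSketch(recoveryDims);
```

Under this assumption no weight table has to be edited when a dimension is retired: pinning its coverage to 0 is enough.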
Registry:
- recoveryFuelStockDays re-tagged from tier='enrichment' to
tier='experimental' so the Core coverage gate treats it as
out-of-score.
- Description updated to make the retirement explicit; entry
stays in the registry for structural continuity (the
dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER
for the 19-dimension tests; removing the dimension entirely is
a PR 4 structural-audit concern).
Housekeeping:
- Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no
longer read; noUnusedLocals would reject it).
- Removed `RecoveryFuelStocksCountry` interface for the same
reason. Comment at the removed declaration instructs future
maintainers not to re-add the type as a reservation; when a
new recovery-fuel concept lands, introduce a fresh interface.
Plan reference: §3.5 point 1 of
`docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.
51 resilience tests pass, typecheck + biome clean. The
`recovery` domain's published score will shift slightly for every
country because the 0.10 slot that fuelStockDays was imputing to
now redistributes; the compare-harness acceptance-gate rerun at
merge time will quantify the shift per plan §6 gates.
* feat(resilience): PR 3 §3.5 — retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves
BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35
fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60
countries max — so every non-BIS country fell through to
curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45).
Combined with reserveMarginPct already removed in PR 1, currencyExternal
was the clearest "construct absent for most of the world" carrier left
in the scorer.
Changes:
_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX
reserves only. Coverage ladder:
    inflation + reserves → 0.85 (observed primary + secondary)
    inflation only       → 0.55
    reserves only        → 0.40
    neither              → 0.30 (IMPUTE.bisEer retained for snapshot
                                 continuity; semantics now read as
                                 "no IMF + no WB reserves")
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved
via comment only, flagged by noUnusedLocals), stddev() helper,
getCountryBisExchangeRates() loader, BisExchangeRate interface,
dateToSortableNumber() — all were exclusive callers of the retired
BIS path.
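The coverage ladder listed above can be written as a small lookup. This is only a sketch of the four documented rungs; the function name and the boolean inputs are hypothetical, and the shipped scorer derives coverage from the seed payloads rather than from flags.

```typescript
// Sketch of the documented coverage ladder for currencyExternal.
// hasInflation: IMF macro inflationPct was observed for the country.
// hasReserves:  WB FX reserves were observed for the country.
function currencyExternalCoverageSketch(hasInflation: boolean, hasReserves: boolean): number {
  if (hasInflation && hasReserves) return 0.85; // observed primary + secondary
  if (hasInflation) return 0.55; // primary only
  if (hasReserves) return 0.40; // secondary only
  return 0.30; // neither source available (imputed)
}
```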
_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core,
sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability
anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental
(BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).
scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability →
imf-macro-country-field field=inflationPct so the registry-parity
test passes and the correlation harness sees the new construct.
tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266
(BIS-outage coverage 0.35 branch), which collapsed to the
inflation-only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55;
inflation+reserves 0.55 → 0.85.
tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score
shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).
All 6324 data-tier tests pass. typecheck:api clean. No change to
seeders or Redis keys; this is a pure scorer + registry rebuild.
* feat(resilience): PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)
Plan §2.1 diagnosis table showed externalDebtCoverage saturating at
score=100 across all 9 probe countries — including stressed states.
Signal was collapsed. Root cause: (worst=5, best=0) gave every country
with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's
reserve-adequacy threshold (ratio=1.0) to score 80 — well into "no
worry" territory instead of the "mild warning" it should be.
Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score
50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure).
Ratios above 2.0 clamp to 0, consistent with "beyond this point the
country is already in crisis; exact value stops mattering."
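The effect of the re-anchoring can be checked with a linear lower-is-better normalizer. The helper below is a sketch that assumes the scorer interpolates linearly between the goalposts and clamps to 0..100; its name and signature are hypothetical, not the scorer's own.

```typescript
// Linear lower-is-better normalizer with clamping (assumed behaviour).
// best maps to 100, worst maps to 0, values outside the band clamp.
function normalizeLowerBetterSketch(value: number, best: number, worst: number): number {
  const raw = ((worst - value) / (worst - best)) * 100;
  return Math.min(100, Math.max(0, raw));
}

// Old goalposts (worst=5, best=0): the Greenspan-Guidotti threshold scores 80.
const oldAtThreshold = normalizeLowerBetterSketch(1.0, 0, 5);
// New goalposts (worst=2, best=0): the same ratio scores 50.
const newAtThreshold = normalizeLowerBetterSketch(1.0, 0, 2);
// Ratios at or beyond the new worst goalpost clamp to 0.
const newBeyondWorst = normalizeLowerBetterSketch(3.0, 0, 2);
```

With the same helper, a ratio of 0.2 scores about 96 under the old goalposts and about 90 under the new ones, which matches the fixture drift reported for the low-ratio case.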
Files changed:
- _indicator-registry.ts: recoveryDebtToReserves goalposts
{worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to
cite Greenspan-Guidotti; inline comment documents anchor + rationale.
- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound
changed from (0..5) to (0..2), with inline comment.
- docs/methodology/country-resilience-index.mdx: goalpost table row
5-0 → 2-0, description cites Greenspan-Guidotti.
- docs/methodology/indicator-sources.yaml:
* constructStatus: dead-signal → observed-mechanism (signal is now
discriminating).
* reviewNotes updated to describe the new anchor.
* mechanismTestRationale names the Greenspan-Guidotti rule.
- tests/resilience-dimension-monotonicity.test.mts: updated the
comment + picked values inside the (0..2) discriminating band (0.3
and 1.5). Old values (1 vs 4) had 4 clamping to 0.
- tests/resilience-dimension-scorers.test.mts: Norway (NO) score threshold
relaxed from >90 to >=85 (NO at ratio=0.2 now scores 90, was 96).
- tests/resilience-scorers.test.mts: fixture drift:
* domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
* baselineScore 63.63 → 60.12 (extDebt is baseline type).
* overallScore 65.52 → 63.27.
* stressScore / stressFactor unchanged (extDebt is baseline-only).
All 6324 data-tier tests pass. typecheck:api clean.
* feat(resilience): PR 3 §3.6 — CI gate on indicator coverage and nominal weight
Plan §3.6 adds a new acceptance criterion (also §5 item 5):
> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.
This commit enforces the NOMINAL-WEIGHT half as a unit test that runs
on every CI build. The EFFECTIVE-INFLUENCE half is produced by
scripts/validate-resilience-sensitivity.mjs as a committed artifact;
the gate file only asserts that script still exists so a refactor that
removes it breaks the build loudly.
Why the gate exists (plan §3.6):
"A dimension at 30% observed coverage carries the same effective
weight as one at 95%. This contradicts the OECD/JRC handbook on
uncertainty analysis."
Implementation:
tests/resilience-coverage-influence-gate.test.mts — three tests:
1. Nominal-weight gate: for every core indicator with coverage < 137
countries (70% of the ~195-country universe), computes its nominal
overall weight as
indicator.weight × (1/dimensions-in-domain) × domain-weight
and asserts it does not exceed 5%. Equal-share-per-dimension is
the *upper bound* on runtime weight (coverage-weighted mean gives
a lower share when a dimension drops out), so this is a strict
bound: if the nominal number passes, the runtime number also
passes for every country.
2. Effective-influence contract: asserts the sensitivity script
exists at its expected path. Removing it (intentionally or by
refactor) breaks the build.
3. Audit visibility: prints the top 10 core indicators by nominal
overall weight. No assertion beyond "ran" — the list lets
reviewers spot outliers that pass the gate but are near the cap.
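The nominal-weight check in test 1 can be sketched as a filter over registry rows. The row shape, constant names, and the two hypothetical indicators are illustrative assumptions; only the formula, the 137-country floor, the 5% cap, and the recoveryReserveMonths figures come from this commit.

```typescript
// Hypothetical flattened registry row for the sketch.
interface CoreIndicatorRow {
  id: string;
  weight: number; // weight inside its dimension
  coverage: number; // number of countries with observed data
  dimensionsInDomain: number;
  domainWeight: number;
}

const COVERAGE_FLOOR_COUNTRIES = 137; // 70% of the ~195-country universe
const NOMINAL_WEIGHT_CAP = 0.05;

// indicator.weight × (1 / dimensions-in-domain) × domain-weight
function nominalWeightSketch(row: CoreIndicatorRow): number {
  return row.weight * (1 / row.dimensionsInDomain) * row.domainWeight;
}

// A row violates the gate only if it is BOTH low-coverage and over the cap.
function findGateViolators(rows: CoreIndicatorRow[]): string[] {
  return rows
    .filter((r) => r.coverage < COVERAGE_FLOOR_COUNTRIES && nominalWeightSketch(r) > NOMINAL_WEIGHT_CAP)
    .map((r) => r.id);
}

const sampleRows: CoreIndicatorRow[] = [
  // Figures reported in the audit output: 1.0 × 1/6 × 0.25 ≈ 4.17%, coverage 188.
  { id: 'recoveryReserveMonths', weight: 1.0, coverage: 188, dimensionsInDomain: 6, domainWeight: 0.25 },
  // Hypothetical: heavy but well covered, so the gate does not apply.
  { id: 'hypotheticalHeavyIndicator', weight: 1.0, coverage: 190, dimensionsInDomain: 2, domainWeight: 0.25 },
  // Hypothetical: sparse and heavy, so the gate should flag it.
  { id: 'hypotheticalSparseIndicator', weight: 1.0, coverage: 60, dimensionsInDomain: 3, domainWeight: 0.25 },
];

const gateViolators = findGateViolators(sampleRows);
```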
Current state (observed from audit output):
recoveryReserveMonths: nominal=4.17% coverage=188
recoveryDebtToReserves: nominal=4.17% coverage=185
recoveryImportHhi: nominal=4.17% coverage=190
inflationStability: nominal=3.40% coverage=185
electricityConsumption: nominal=3.30% coverage=217
ucdpConflict: nominal=3.09% coverage=193
Every core indicator has coverage ≥ 180 (already enforced by the
pre-existing indicator-tiering test), so the nominal-weight gate has
no current violators — its purpose is catching future drift, not
flagging today's state.
All 6327 data-tier tests pass. typecheck:api clean.
* docs(resilience): PR 3 methodology doc — document §3.5 dead-signal retirements + §3.6 coverage gate
Methodology-doc update capturing the three §3.5 landings and the §3.6 CI
gate. Five edits:
1. **Known construct limitations section (#5 and #6):** struck through the
original "dead signals" and "no coverage-based weight cap" items and
annotated them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" +
specifics of what shipped.
2. **Currency & External H4 section:** completely rewritten. Old table
(fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is
replaced by the two-indicator post-PR-3 table (inflationStability at
0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out
(0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as
experimental-tier drill-downs only.
3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the
methodology-lint H4-to-dimension mapping does not break; body
rewritten to explain that the dimension is retired from core but the
seeder still runs for IEA-member drill-downs.
4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description
cites Greenspan-Guidotti reserve-adequacy rule.
5. **New v2.2 changelog entry** — PR 3 dead-signal cleanup, covering
§3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit
updates.
No scoring or code changes in this commit. Methodology-lint test passes
(H4 mapping intact). All 6327 data-tier tests pass.
* fix(resilience): PR 3 §3.6 gate — correct share-denominator for coverage-weighted aggregation
Reviewer catch (thanks). The previous gate computed each indicator's
nominal overall weight as
indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight
and claimed this was an upper bound ("actual runtime weight is ≤ this
when some dimensions drop out on coverage"). That is BACKWARDS for
this scorer.
The domain aggregation is coverage-weighted
(server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean),
so when a dimension pins at coverage=0 it is EXCLUDED from the
denominator and the surviving dimensions' shares go UP, not down.
PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to
coverage=0 for every country — so in the current live state the
recovery domain has 5 contributing dimensions (not 6), and each core
recovery indicator's nominal share is
1.0 × 1/5 × 0.25 = 5.00% (was mis-reported as 4.17%)
The old gate therefore under-estimated nominal influence and could
silently pass exactly the kind of low-coverage overweight regression
it is meant to block.
Fix:
- Added `coreBearingDimensions(domainId)` helper that counts only
dimensions that have ≥1 core indicator in the registry. A dimension
with only experimental/enrichment entries (post-retirement
fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count,
not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict
upper bound — explicitly calls out the dynamic case (source failure
raising surviving dim shares further) as the sensitivity script's
responsibility.
- Added a new regression test: asserts (a) at least one recovery
dimension is all-non-core (fuelStockDays post-retirement),
(b) fuelStockDays has zero core indicators, and
(c) recoveryDebtToReserves nominal = 0.05 exactly (not 0.0417) — any
reversion of the retirement or regression to N_total-denominator
will fail loudly.
Top-10 audit output now correctly shows:
recoveryReserveMonths: nominal=5% coverage=188
recoveryDebtToReserves: nominal=5% coverage=185
recoveryImportHhi: nominal=5% coverage=190
(was 4.17% each under the old math)
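The denominator change can be reproduced with a toy registry. This is a sketch under stated assumptions: the entry shape, the function names, and the five placeholder dimension ids are hypothetical; only `fuelStockDays`, the 0.25 recovery domain weight, and the 6-versus-5 dimension counts come from the commit text.

```typescript
type TierSketch = 'core' | 'enrichment' | 'experimental';

// Hypothetical registry entry for the sketch.
interface RegistryEntrySketch {
  dimension: string;
  tier: TierSketch;
  weight: number;
}

// Counts only dimensions that have at least one core indicator.
function coreBearingCountSketch(entries: RegistryEntrySketch[]): number {
  const dims = new Set<string>();
  for (const e of entries) {
    if (e.tier === 'core') dims.add(e.dimension);
  }
  return dims.size;
}

// Counts every dimension that appears in the registry, core or not.
function totalDimensionCountSketch(entries: RegistryEntrySketch[]): number {
  return new Set(entries.map((e) => e.dimension)).size;
}

// Five placeholder core dimensions plus the retired, experimental-only one.
const recoveryRegistry: RegistryEntrySketch[] = [
  ...['dimA', 'dimB', 'dimC', 'dimD', 'dimE'].map(
    (dimension): RegistryEntrySketch => ({ dimension, tier: 'core', weight: 1.0 }),
  ),
  { dimension: 'fuelStockDays', tier: 'experimental', weight: 1.0 },
];

const RECOVERY_DOMAIN_WEIGHT = 0.25;
const oldNominal = 1.0 * (1 / totalDimensionCountSketch(recoveryRegistry)) * RECOVERY_DOMAIN_WEIGHT;
const newNominal = 1.0 * (1 / coreBearingCountSketch(recoveryRegistry)) * RECOVERY_DOMAIN_WEIGHT;
```

Here `oldNominal` is roughly 0.0417 and `newNominal` is 0.05, the two figures contrasted in the audit output.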
All 486 resilience tests pass. typecheck:api clean.
Note: the 5% figure is exactly AT the cap, not over it. "exceed" means
strictly > 5%, so it still passes. But now the reviewer / audit log
reflects reality.
* fix(resilience): PR 3 review — retired-dim confidence drag + false source-failure label
Addresses the Codex review P1 + P2 on PR #3297.
P1 — retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct),
but computeLowConfidence, computeOverallCoverage, and the widget's
formatResilienceConfidence averaged across all 19 dimensions. That
dragged every country's reported averageCoverage down — US went from
0.8556 (active dims only) to 0.8105 (all dims) — enough drift to
misclassify edge countries as lowConfidence and to shift the ranking
widget's overallCoverage pill for every country.
Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in
_dimension-scorers.ts and filter it out of all three averages. The
filter is keyed on the retired-dim REGISTRY, not on coverage === 0,
because a non-retired dim can legitimately emit coverage=0 on a
genuinely sparse-data country via weightedBlend fall-through — those
entries MUST keep dragging confidence down (that is the sparse-data
signal lowConfidence exists to surface). Verified: sparse-country
release-gate test (marks sparse WHO/FAO countries as low confidence)
still passes with the registry-keyed filter; would have failed with
a naive coverage=0 filter.
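The difference between the registry-keyed filter and a naive coverage filter can be sketched as follows. The set name, the `DimCoverage` shape, and the fixture values are hypothetical; the shipped constant is RESILIENCE_RETIRED_DIMENSIONS and the real averages run over all 19 dimensions.

```typescript
// Hypothetical stand-in for the retired-dimension registry.
const RETIRED_DIMENSIONS_SKETCH: ReadonlySet<string> = new Set(['fuelStockDays']);

interface DimCoverage {
  id: string;
  coverage: number; // 0..1
}

// Registry-keyed: drop only dimensions that are structurally retired.
function averageCoverageRegistryKeyed(dims: DimCoverage[], retired: ReadonlySet<string>): number {
  const active = dims.filter((d) => !retired.has(d.id));
  if (active.length === 0) return 0;
  return active.reduce((sum, d) => sum + d.coverage, 0) / active.length;
}

// Naive: drop anything at coverage 0, which also hides genuine data gaps.
function averageCoverageNaive(dims: DimCoverage[]): number {
  const active = dims.filter((d) => d.coverage > 0);
  if (active.length === 0) return 0;
  return active.reduce((sum, d) => sum + d.coverage, 0) / active.length;
}

// Sparse-data country: one healthy dimension, one genuine gap, one retired dimension.
const sparseCountry: DimCoverage[] = [
  { id: 'governanceInstitutional', coverage: 0.8 },
  { id: 'foodWater', coverage: 0 }, // non-retired, genuinely missing data
  { id: 'fuelStockDays', coverage: 0 }, // retired by design
];

const registryKeyedAverage = averageCoverageRegistryKeyed(sparseCountry, RETIRED_DIMENSIONS_SKETCH);
const naiveAverage = averageCoverageNaive(sparseCountry);
```

In this fixture the registry-keyed average stays at about 0.4, so the gap still pulls confidence down, while the naive filter reports 0.8 and would mask it.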
Server-client parity: widget-utils cannot import server code, so
RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept
in lockstep by tests/resilience-retired-dimensions-parity.test.mts
(parses the widget file as text, same pattern as existing widget-util
tests that can't import the widget module directly).
P2 — false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure',
which the widget maps to "Source down: upstream seeder failed" with
a `!` icon for every country. That is semantically wrong for an
intentional retirement. Flipped to null so the widget's absent-path
renders a neutral cell without a false outage label. null is already
a legal value of ResilienceDimensionScore.imputationClass; no type
change needed.
Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the
registry-keyed filter semantic for computeOverallCoverage +
computeLowConfidence. Includes a negative-control test proving
non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new):
lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired
coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for
fuelStockDays flipped to null.
All 494 resilience tests + full 6336/6336 data-tier suite pass.
Typecheck clean for both tsconfig.json and tsconfig.api.json.
* docs(resilience): align methodology + registry metadata with shipped imputationClass=null
Follow-up to the previous PR 3 review commit that flipped
scoreFuelStockDays's imputationClass from 'source-failure' to null to
avoid a false "Source down" widget label on every country. The code
changed; the doc and registry metadata did not, leaving three sites
in the methodology mdx and two comment/description sites in the
registry still claiming imputationClass='source-failure'. Any future
reviewer (or tooling that treats the registry description as
authoritative) would be misled.
This commit rewrites those sites to describe the shipped behavior:
- imputationClass=null (not 'source-failure'), with the rationale
- exclusion from confidence/coverage averages via the
RESILIENCE_RETIRED_DIMENSIONS registry filter
- the distinction between structural retirement (filtered) and
runtime coverage=0 (kept so sparse-data countries still flag
lowConfidence)
Touched:
- docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
- server/worldmonitor/resilience/v1/_indicator-registry.ts
(recoveryFuelStockDays comment block + description field)
No code-behavior change. Docs-only.
Tests: 157 targeted resilience tests pass (incl. methodology-lint +
widget + release-gate + confidence-averaging). Typecheck clean on
both tsconfig.json and tsconfig.api.json.
260 lines
12 KiB
TypeScript
// Monotonicity-test harness. Pins the direction of movement for the
// highest-leverage indicators so PR 1 + PR 2 cannot accidentally flip
// a sign silently. See
// docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md
// §5 (PR 0 deliverable) and §6 (acceptance gate 8).
//
// Each test builds two synthetic `ResilienceSeedReader` fixtures that
// differ only in the target indicator's value and asserts the dimension
// score moves in the documented direction.
//
// Scope (minimum viable, expanded in PR 0.5 follow-ups):
//   - scoreEnergy: dependency, gasShare, coalShare, renewShare, electricityConsumption
//     (all five direction claims the current scorer makes — PR 1 overturns three of them)
//   - scoreReserveAdequacy: reserveMonths
//   - scoreFiscalSpace: govRevenuePct, fiscalBalancePct, debtToGdpPct
//   - scoreExternalDebtCoverage: debtToReservesRatio
//   - scoreImportConcentration: hhi
//   - scoreFoodWater: peopleInCrisis, phase
//   - scoreGovernanceInstitutional: WGI mean
//
// 14 indicators × 1 direction check each = 14 assertions. The harness
// is written as a table so PR 1 can add/remove rows without touching
// test logic.

import assert from 'node:assert/strict';
import { describe, it } from 'node:test';

import {
  scoreEnergy,
  scoreReserveAdequacy,
  scoreFiscalSpace,
  scoreExternalDebtCoverage,
  scoreImportConcentration,
  scoreFoodWater,
  scoreGovernanceInstitutional,
  type ResilienceSeedReader,
} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';

const TEST_ISO2 = 'XX';

function makeStaticReader(staticRecord: unknown, overrides: Record<string, unknown> = {}): ResilienceSeedReader {
  return async (key: string) => {
    if (key === `resilience:static:${TEST_ISO2}`) return staticRecord;
    if (key in overrides) return overrides[key];
    return null;
  };
}

function makeRecoveryReader(keyValueMap: Record<string, unknown>): ResilienceSeedReader {
  return async (key: string) => keyValueMap[key] ?? null;
}

describe('resilience dimension monotonicity — scoreReserveAdequacy', () => {
  it('higher reserveMonths → higher score', async () => {
    const low = await scoreReserveAdequacy(TEST_ISO2, makeRecoveryReader({
      'resilience:recovery:reserve-adequacy:v1': { countries: { [TEST_ISO2]: { reserveMonths: 2 } } },
    }));
    const high = await scoreReserveAdequacy(TEST_ISO2, makeRecoveryReader({
      'resilience:recovery:reserve-adequacy:v1': { countries: { [TEST_ISO2]: { reserveMonths: 12 } } },
    }));
    assert.ok(high.score > low.score, `reserveMonths 2→12 should raise score; got ${low.score} → ${high.score}`);
  });
});

describe('resilience dimension monotonicity — scoreFiscalSpace', () => {
  const baseEntry = { govRevenuePct: 25, fiscalBalancePct: 0, debtToGdpPct: 60 };

  async function scoreWith(override: Partial<typeof baseEntry>) {
    return scoreFiscalSpace(TEST_ISO2, makeRecoveryReader({
      'resilience:recovery:fiscal-space:v1': { countries: { [TEST_ISO2]: { ...baseEntry, ...override } } },
    }));
  }

  it('higher govRevenuePct → higher score', async () => {
    const low = await scoreWith({ govRevenuePct: 10 });
    const high = await scoreWith({ govRevenuePct: 40 });
    assert.ok(high.score > low.score, `govRevenuePct 10→40 should raise score; got ${low.score} → ${high.score}`);
  });

  it('higher fiscalBalancePct → higher score', async () => {
    const low = await scoreWith({ fiscalBalancePct: -10 });
    const high = await scoreWith({ fiscalBalancePct: 3 });
    assert.ok(high.score > low.score, `fiscalBalancePct -10→3 should raise score; got ${low.score} → ${high.score}`);
  });

  it('higher debtToGdpPct → lower score', async () => {
    const low = await scoreWith({ debtToGdpPct: 40 });
    const high = await scoreWith({ debtToGdpPct: 140 });
    assert.ok(low.score > high.score, `debtToGdpPct 40→140 should lower score; got ${low.score} → ${high.score}`);
  });
});

describe('resilience dimension monotonicity — scoreExternalDebtCoverage', () => {
  async function scoreWith(ratio: number) {
    return scoreExternalDebtCoverage(TEST_ISO2, makeRecoveryReader({
      'resilience:recovery:external-debt:v1': { countries: { [TEST_ISO2]: { debtToReservesRatio: ratio } } },
    }));
  }

  it('higher debtToReservesRatio → lower score', async () => {
    // PR 3 §3.5 point 3: goalpost is now lower-better worst=2 best=0
    // (Greenspan-Guidotti anchor). Any ratio ≥ 2 clamps to 0, so pick
    // values inside the discriminating band to get a meaningful gradient.
    const good = await scoreWith(0.3);
    const bad = await scoreWith(1.5);
    assert.ok(good.score > bad.score, `debtToReservesRatio 0.3→1.5 should lower score; got ${good.score} → ${bad.score}`);
  });
});

describe('resilience dimension monotonicity — scoreImportConcentration', () => {
  async function scoreWith(hhi: number) {
    return scoreImportConcentration(TEST_ISO2, makeRecoveryReader({
      'resilience:recovery:import-hhi:v1': { countries: { [TEST_ISO2]: { hhi } } },
    }));
  }

  it('higher hhi → lower score (more concentration = more exposure)', async () => {
    // HHI payload is on a 0..1 scale (normalised before storage).
    // 0.15 = diversified supplier base; 0.45 = concentrated.
    const diversified = await scoreWith(0.15);
    const concentrated = await scoreWith(0.45);
    assert.ok(diversified.score > concentrated.score, `hhi 0.15→0.45 should lower score; got ${diversified.score} → ${concentrated.score}`);
  });
});

describe('resilience dimension monotonicity — scoreGovernanceInstitutional', () => {
  async function scoreWith(wgiMeanValue: number) {
    // Static-record shape per `getStaticWgiValues`: `wgi.indicators.<name>.value`.
    const staticRecord = {
      wgi: {
        indicators: {
          voiceAccountability: { value: wgiMeanValue },
          politicalStability: { value: wgiMeanValue },
          governmentEffectiveness: { value: wgiMeanValue },
          regulatoryQuality: { value: wgiMeanValue },
          ruleOfLaw: { value: wgiMeanValue },
          controlOfCorruption: { value: wgiMeanValue },
        },
      },
    };
    return scoreGovernanceInstitutional(TEST_ISO2, makeStaticReader(staticRecord));
  }

  it('higher WGI mean → higher score', async () => {
    const weak = await scoreWith(-1.5);
    const strong = await scoreWith(1.5);
    assert.ok(strong.score > weak.score, `WGI -1.5→1.5 should raise score; got ${weak.score} → ${strong.score}`);
  });
});

describe('resilience dimension monotonicity — scoreFoodWater', () => {
  async function scoreWith(override: Record<string, unknown>) {
    const fao = { peopleInCrisis: 100, phase: 'Phase 1', ...override };
    const staticRecord = { fao, aquastat: { waterStress: { value: 40 }, waterAvailability: { value: 2000 } } };
    return scoreFoodWater(TEST_ISO2, makeStaticReader(staticRecord));
  }

  it('higher peopleInCrisis → lower score', async () => {
    const healthy = await scoreWith({ peopleInCrisis: 1000 });
    const crisis = await scoreWith({ peopleInCrisis: 5_000_000 });
    assert.ok(healthy.score > crisis.score, `peopleInCrisis 1k→5M should lower score; got ${healthy.score} → ${crisis.score}`);
  });

  it('higher IPC phase → lower score', async () => {
    const phase2 = await scoreWith({ phase: 'Phase 2' });
    const phase5 = await scoreWith({ phase: 'Phase 5' });
    assert.ok(phase2.score > phase5.score, `phase 2→5 should lower score; got ${phase2.score} → ${phase5.score}`);
  });
});

describe('resilience dimension monotonicity — scoreEnergy (current construct)', () => {
  // NOTE: these tests pin the CURRENT scorer direction for each indicator.
  // PR 1 §3.1-3.3 overturns three of them (electricityConsumption, gasShare,
  // coalShare) — when PR 1 ships, those tests are REPLACED by tests for
  // the new indicators (importedFossilDependence, lowCarbonGenerationShare).
  // The failure of one of these tests in the meantime is a signal that a
  // PR has accidentally altered the construct; PR 1 should update this
  // file in the same commit that changes scoreEnergy.

  function makeEnergyReader(overrides: {
    staticRecord?: unknown;
    mix?: unknown;
    prices?: unknown;
    storage?: unknown;
  } = {}): ResilienceSeedReader {
    const defaultStatic = {
      iea: { energyImportDependency: { value: 30 } },
      infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 3000 } } },
    };
    const defaultMix = { gasShare: 30, coalShare: 20, renewShare: 30 };
    return async (key: string) => {
      if (key === `resilience:static:${TEST_ISO2}`) return overrides.staticRecord ?? defaultStatic;
      if (key === 'economic:energy:v1:all') return overrides.prices ?? null;
      if (key === `energy:mix:v1:${TEST_ISO2}`) return overrides.mix ?? defaultMix;
      if (key === `energy:gas-storage:v1:${TEST_ISO2}`) return overrides.storage ?? null;
      return null;
    };
  }

  it('higher import dependency → lower score', async () => {
    const selfSufficient = await scoreEnergy(TEST_ISO2, makeEnergyReader({
      staticRecord: {
        iea: { energyImportDependency: { value: 10 } },
        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 3000 } } },
      },
    }));
    const dependent = await scoreEnergy(TEST_ISO2, makeEnergyReader({
      staticRecord: {
        iea: { energyImportDependency: { value: 90 } },
        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 3000 } } },
      },
    }));
    assert.ok(selfSufficient.score > dependent.score, `import dep 10→90 should lower score; got ${selfSufficient.score} → ${dependent.score}`);
  });

  it('higher renewShare → higher score', async () => {
    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 20, renewShare: 5 } }));
    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 20, renewShare: 70 } }));
    assert.ok(high.score > low.score, `renewShare 5→70 should raise score; got ${low.score} → ${high.score}`);
  });

  it('CURRENT: higher gasShare → lower score (THIS CHANGES IN PR 1 — see plan §3.2)', async () => {
    // Pins the current (v3-plan-condemned) behavior so PR 1 knows what
    // it is replacing. When PR 1 ships the new importedFossilDependence
    // composite, this test is REPLACED, not deleted — the replacement
    // pins the new construct's direction.
    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 10, coalShare: 20, renewShare: 30 } }));
    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 70, coalShare: 20, renewShare: 30 } }));
    assert.ok(low.score > high.score, `gasShare 10→70 should lower score under current construct; got ${low.score} → ${high.score}`);
  });

  it('CURRENT: higher coalShare → lower score (THIS CHANGES IN PR 1 — see plan §3.2)', async () => {
    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 10, renewShare: 30 } }));
    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({ mix: { gasShare: 30, coalShare: 70, renewShare: 30 } }));
    assert.ok(low.score > high.score, `coalShare 10→70 should lower score under current construct; got ${low.score} → ${high.score}`);
  });

  it('CURRENT: higher electricityConsumption → higher score (THIS FAILS THE MECHANISM TEST — see plan §3.1)', async () => {
    // This test PASSES today because the current scorer rewards
    // per-capita electricity consumption. The v3 plan classifies
    // electricityConsumption as a wealth-proxy that fails the mechanism
    // test; PR 1 removes it. When PR 1 ships, this test is DELETED (not
    // replaced), because the indicator no longer exists. The delete is
    // the signal that the wealth-proxy concern is resolved.
    const low = await scoreEnergy(TEST_ISO2, makeEnergyReader({
      staticRecord: {
        iea: { energyImportDependency: { value: 30 } },
        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 500 } } },
      },
    }));
    const high = await scoreEnergy(TEST_ISO2, makeEnergyReader({
      staticRecord: {
        iea: { energyImportDependency: { value: 30 } },
        infrastructure: { indicators: { 'EG.USE.ELEC.KH.PC': { value: 7500 } } },
      },
    }));
    assert.ok(high.score > low.score, `electricityConsumption 500→7500 kWh/cap should raise score under current construct; got ${low.score} → ${high.score}`);
  });
});