32 Commits

Author SHA1 Message Date
Elie Habib
8655bd81bc feat(energy-atlas): GEM pipeline data import — gas 297, oil 334 (#3406)
* feat(energy-atlas): GEM pipeline data import — gas 75→297, oil 75→334 (parity-push closure)

Closes the ~3.6× pipeline-scale gap that PR #3397's import infrastructure
was built for. Per docs/methodology/pipelines.mdx operator runbook.

Source releases (CC-BY 4.0, attribution preserved in registry envelope):
  - GEM-GGIT-Gas-Pipelines-2025-11.xlsx
    SHA256: f56d8b14400e558f06e53a4205034d3d506fc38c5ae6bf58000252f87b1845e6
    URL:    https://globalenergymonitor.org/wp-content/uploads/2025/11/GEM-GGIT-Gas-Pipelines-2025-11.xlsx
  - GEM-GOIT-Oil-NGL-Pipelines-2025-03.xlsx
    SHA256: d1648d28aed99cfd2264047f1e944ddfccf50ce9feeac7de5db233c601dc3bb2
    URL:    https://globalenergymonitor.org/wp-content/uploads/2025/03/GEM-GOIT-Oil-NGL-Pipelines-2025-03.xlsx

Pre-conversion: GeoJSON (geometry endpoints) + XLSX (column properties) →
canonical operator-shape JSON via /tmp/gem-import/convert.py. Filter knobs:
  - status ∈ {operating, construction}
  - length ≥ 750 km (gas) / 400 km (oil) — asymmetric per-fuel trunk-class
  - capacity unit conversions: bcm/y native; MMcf/d, MMSCMD, mtpa, m3/day,
    bpd, Mb/d, kbd → bcm/y (gas) or bbl/d (oil) at canonical conversion factors.
  - Country names → ISO 3166-1 alpha-2 via pycountry + alias table.

Merge results (via scripts/import-gem-pipelines.mjs --merge):
  gas: +222 added, 15 duplicates skipped (haversine ≤ 5km AND token Jaccard ≥ 0.6)
  oil: +259 added, 16 duplicates skipped
  Final: 297 gas / 334 oil. Hand-curated 75+75 preserved with full evidence;
  GEM rows ship physicalStateSource='gem', classifierConfidence=0.4,
  operatorStatement=null, sanctionRefs=[].

Floor bump:
  scripts/_pipeline-registry.mjs MIN_PIPELINES_PER_REGISTRY 8 → 200.
  Live counts (297/334) leave ~100 rows of jitter headroom so a partial
  re-import or coverage-narrowing release fails loud rather than halving
  the registry silently.

Tests:
  - tests/pipelines-registry.test.mts: bumped synthetic-registry
    Array.from({length:8}) → length:210 to clear new floor; added 'gem' to
    the evidence-source whitelist for non-flowing badges (parity with the
    derivePipelinePublicBadge audit done in PR #3397 U1).
  - tests/import-gem-pipelines.test.mjs: bumped registry-conformance loop
    3 → 70 to clear new floor.
  - 51/51 pipeline tests pass; tsc --noEmit clean.

vs peer reference site (281 gas + 265 oil): we now match (gas 297) and
exceed (oil 334). Functional + visual + data parity for the energy variant
is closed; remaining gaps are editorial-cadence (weekly briefing) which
is intentionally out of scope per the parity-push plan.

* docs(energy-atlas): land GEM converter + expand methodology runbook for quarterly refresh

PR #3406 imported the data but didn't land the conversion script that
produced it. This commit lands the converter at scripts/_gem-geojson-to-canonical.py
so future operators can reproduce the import deterministically, and rewrites
the docs/methodology/pipelines.mdx runbook to match what actually works:

- Use GeoJSON (not XLSX) — the XLSX has properties but no lat/lon columns;
  only the GIS .zip's GeoJSON has both. The original runbook said to download
  XLSX which would fail at the lat/lon validation step.
- Cadence: quarterly refresh, with concrete signals (peer-site comparison,
  90-day calendar reminder).
- Source datasets: explicit GGIT (gas) + GOIT (oil/NGL) tracker names so
  future operators don't re-request the wrong dataset (the Extraction
  Tracker = wells/fields, NOT pipelines — ours requires the Infrastructure
  Trackers).
- Last-known-good URLs documented + URL pattern explained as fallback when
  GEM rotates per release.
- Filter knob defaults documented inline (gas ≥ 750km, oil ≥ 400km, status
  ∈ {operating, construction}, capacity unit conversion table).
- Failure-mode table mapping common errors to fixes.

Converter takes paths via env vars (GEM_GAS_GEOJSON, GEM_OIL_GEOJSON,
GEM_DOWNLOADED_AT, GEM_SOURCE_VERSION) instead of hardcoded paths so it
works for any release without code edits.

* fix(energy-atlas): close PR #3406 review findings — dedup + zero-length + test

Three Greptile findings on PR #3406:

P1 — Dedup miss (Dampier-Bunbury):
  Same physical pipeline existed in both registries — curated `dampier-bunbury`
  and GEM-imported `dampier-to-bunbury-natural-gas-pipeline-au` — because GEM
  digitized only the southern 60% of the line. The shared Bunbury terminus
  matched at 13.7 km but the average-endpoint distance was 287 km, just over
  the 5 km gate.
  Fix: scripts/_pipeline-dedup.mjs adds a name-set-identity short-circuit —
  if Jaccard == 1.0 (after stopword removal) AND any of the 4 endpoint
  pairings is ≤ 25 km, treat as duplicate. The 25 km anchor preserves the
  existing "name collision in different ocean → still added" contract.
  Added regression test: identical Dampier-Bunbury inputs → 0 added, 1
  skipped, matched against `dampier-bunbury`.

P1 — Zero-length geometry (9 rows: Trans-Alaska, Enbridge Line 3, Ichthys, etc.):
  GEM source GeoJSON occasionally has a Point geometry or single-coord
  LineString, producing pipelines where startPoint == endPoint. They render
  as map-point artifacts and skew aggregate-length stats.
  Fix (defense in depth):
    - scripts/_gem-geojson-to-canonical.py drops at conversion time
      (`zero_length` reason in drop log).
    - scripts/_pipeline-registry.mjs validateRegistry rejects defensively
      so even a hand-curated row with degenerate geometry fails loud.

P2 — Test repetition coupled to fixture row count:
  Hardcoded `for (let i = 0; i < 70; i++)` × 3 fixture rows = 210 silently
  breaks if fixture is trimmed below 3.
  Fix: `Math.ceil(REGISTRY_FLOOR / fixture.length) + 5` derives reps from
  the floor and current fixture length.

Re-run --merge with all fixes applied:
  gas: 75 → 293 (+218 added, 17 deduped — was 222/15 before; +2 catches via
       name-set-identity short-circuit; -2 zero-length never imported)
  oil: 75 → 325 (+250 added, 18 deduped — was 259/16; +2 catches; -7 zero-length)

Tests: 74/74 pipeline tests pass; tsc --noEmit clean.
2026-04-25 18:59:46 +04:00
Elie Habib
d9a1f6a0f8 feat(energy-atlas): GEM pipeline import infrastructure (parity PR 1, plan U1-U4) (#3397)
* feat(energy-atlas): GEM pipeline import infrastructure (PR 1, plan U1-U4)

Lands the parser, dedup helper, validator extensions, and operator runbook
for the Global Energy Monitor (CC-BY 4.0) pipeline-data refresh — closing
~3.6× of the Energy Atlas pipeline-scale gap once the operator runs the
import.

Per docs/plans/2026-04-25-003-feat-energy-parity-pushup-plan.md PR 1.

U1 — Validator + schema extensions:
- Add `'gem'` to VALID_SOURCES in scripts/_pipeline-registry.mjs and to the
  evidence-bearing-source whitelist in derivePipelinePublicBadge so GEM-
  sourced offline rows derive a `disputed` badge via the external-signal
  rule (parity with `press`/`satellite`/`ais-relay`).
- Export VALID_SOURCES so tests assert against the same source-of-truth
  the validator uses (matches the VALID_OIL_PRODUCT_CLASSES pattern from
  PR #3383).
- Floor bump (MIN_PIPELINES_PER_REGISTRY 8→200) intentionally DEFERRED
  to the follow-up data PR — bumping it now would gate the existing 75+75
  hand-curated rows below the new floor and break seeder publishes
  before the GEM data lands.

U2 — GEM parser (test-first):
- scripts/import-gem-pipelines.mjs reads a local JSON file (operator pre-
  converts GEM Excel externally — no `xlsx` dependency added). Schema-
  drift sentinel throws on missing columns. Status mapping covers
  Operating/Construction/Cancelled/Mothballed/Idle/Shut-in. ProductClass
  mapping covers Crude Oil / Refined Products / mixed-flow notes.
  Capacity-unit conversion handles bcm/y, bbl/d, Mbd, kbd.
- 22 tests in tests/import-gem-pipelines.test.mjs cover schema sentinel,
  fuel split, status mapping, productClass mapping, capacity conversion,
  minimum-viable-evidence shape, registry-shape conformance, and bad-
  coordinate rejection.

U3 — Deduplication (pure deterministic):
- scripts/_pipeline-dedup.mjs: dedupePipelines(existing, candidates) →
  { toAdd, skippedDuplicates }. Match rule: haversine ≤5km AND name
  Jaccard ≥0.6 (BOTH required). Reverse-direction-pair-aware.
- 19 tests cover internal helpers, match logic, id collision, determinism,
  and empty inputs.

U4 — Operator runbook (data import deferred):
- docs/methodology/pipelines.mdx: 7-step runbook for the operator to
  download GEM, pre-convert Excel→JSON, dry-run with --print-candidates,
  merge with --merge, bump the registry floor, and commit with
  provenance metadata.
- The actual data import is intentionally OUT OF SCOPE for this agent-
  authored PR because GEM downloads are registration-gated. A follow-up
  PR will commit the imported scripts/data/pipelines-{gas,oil}.json +
  bump MIN_PIPELINES_PER_REGISTRY → 200 + record the GEM release SHA256.

Tests: typecheck clean; 67 tests pass across the three test files.

Codex-approved through 8 review rounds against origin/main @ 050073354.

* fix(energy-atlas): wire --merge to dedupePipelines + within-batch dedup (PR1 review)

P1 — --merge was a TODO no-op (import-gem-pipelines.mjs:291):
- Previously exited with code 2 + a "TODO: wire dedup once U3 lands"
  message. The PR body and the methodology runbook both advertised
  --merge as the operator path.
- Add mergeIntoRegistry(filename, candidates) helper that loads the
  existing envelope, runs dedupePipelines() against the candidate
  list, sorts new entries alphabetically by id (stable diff on rerun),
  validates the merged registry via validateRegistry(), and writes
  to disk only after validation passes. CLI --merge now invokes it
  for both gas and oil + prints a per-fuel summary.
- Source attribution: the registry envelope's `source` field is
  upgraded to mention GEM (CC-BY 4.0) on first merge so the data file
  itself documents provenance.

P2 — dedup transitive-match bug (_pipeline-dedup.mjs:120):
- Pre-fix loop checked each candidate ONLY against the original
  `existing` array. Two GEM rows that match each other but not anything
  in `existing` would BOTH be added, defeating the dedup contract for
  same-batch duplicates (real example: a primary GEM entry plus a
  duplicate row from a regional supplemental sheet).
- Now compares against existing FIRST (existing wins on cross-set
  match — preserves richer hand-curated evidence), then falls back to
  the already-accepted toAdd set. Within-batch matches retain the FIRST
  accepted candidate (deterministic by candidate-list order).

Tests: 22 in tests/pipeline-dedup.test.mjs (3 new) cover the
within-batch dedup, transitive collapse, and existing-wins-over-
already-accepted scenarios. typecheck clean.

* fix(energy-atlas): cross-file-atomic --merge (PR1 review #2)

P1 — partial-import on disk if oil validation fails after gas writes
(import-gem-pipelines.mjs:329 / :350):
- Previous flow ran `mergeIntoRegistry('pipelines-gas.json', gas)` which
  wrote to disk, then `mergeIntoRegistry('pipelines-oil.json', oil)`. If
  oil validation failed, the operator was left with a half-imported
  state: gas had GEM rows committed to disk but oil didn't.
- Refactor into a two-phase API:
  1. prepareMerge(filename, candidates) — pure, no disk I/O. Builds the
     merged envelope, validates it, throws on validation failure.
  2. mergeBothRegistries(gasCandidates, oilCandidates) — calls
     prepareMerge for BOTH fuels first; only writes to disk after BOTH
     pass validation. If oil's prepareMerge throws, gas was never
     touched on disk.
- CLI --merge now invokes mergeBothRegistries. The atomicity guarantee
  is documented inline in the helper.

typecheck clean. No new tests because the existing dedup + validate
suites cover the underlying logic; the change is purely about call
ordering for atomicity.

* fix(energy-atlas): deterministic lastEvidenceUpdate + clarify test comment (PR1 review #3)

P2 — lastEvidenceUpdate was non-deterministic (Greptile P2):
- Previous code used new Date().toISOString() per parser run, so two runs
  of parseGemPipelines on the same input on different days produced
  byte-different output. Quarterly re-imports would produce noisy
  full-row diffs even when the upstream GEM data hadn't changed.
- New: resolveEvidenceTimestamp(envelope) derives the timestamp from
  envelope.downloadedAt (the operator-recorded date) or sourceVersion
  if it parses as ISO. Falls back to 1970-01-01 sentinel when neither
  is set — deliberately ugly so reviewers spot the missing field in
  the data file diff rather than getting silent today's date.
- Computed once per parse run so every emitted candidate gets the
  same timestamp.

P2 — misleading test comment (Greptile P2):
- Comment in tests/import-gem-pipelines.test.mjs:136 said "400_000 bbl/d
  ÷ 1000 = 400 Mbd" while the assertion correctly expects 0.4 (because
  the convention is millions, not thousands). Rewrote the comment to
  state the actual rule + arithmetic clearly.

3 new tests for determinism: (a) two parser runs produce identical
output, (b) timestamp derives from downloadedAt, (c) missing date
yields the epoch sentinel (loud failure mode).
2026-04-25 17:55:45 +04:00
Elie Habib
abdcdb581f feat(resilience): SWF manifest expansion + KIA split + new schema fields (#3391)
* feat(resilience): SWF manifest expansion + KIA split + new schema fields

Phase 1 of plan 2026-04-25-001 (Codex-approved round 5). Manifest-only
data correction; no construct change, no cache prefix bump.

Schema additions (loader-validated, misplacement-rejected):
- top-level: aum_usd, aum_year, aum_verified (primary-source AUM)
- under classification: aum_pct_of_audited (fraction multiplier),
  excluded_overlaps_with_reserves (boolean; documentation-only)

Manifest expansion (13 → 21 funds, 6 → 13 countries):
- UAE: +ICD ($320B verified), +ADQ ($199B verified), +EIA (unverified —
  loaded for documentation, excluded from scoring per data-integrity rule)
- KW: kia split into kia-grf (5%, access=0.9) + kia-fgf (95%,
  access=0.20). Corrects ~18× over-statement of crisis-deployable
  Kuwait sovereign wealth (audit found combined-AUM × 0.7 access
  applied $750B as "deployable" against ~$15B actual GRF stabilization
  capacity).
- CN: +CIC ($1.35T), +NSSF ($400B, statutorily-gated 0.20 tier),
  +SAFE-IC ($417B, excluded — overlaps SAFE FX reserves)
- HK: +HKMA-EF ($498B, excluded — overlaps HKMA reserves)
- KR: +KIC ($182B, IFSWF full member)
- AU: +Future Fund ($192B, pension-locked)
- OM: +OIA ($50B, IFSWF member)
- BH: +Mumtalakat ($19B)
- TL: +Petroleum Fund ($22B, GPFG-style high-transparency)

Re-audits (Phase 1E):
- ADIA access 0.3 → 0.4 (rubric flagged; ruler-discretionary deployment
  empirically demonstrated)
- Mubadala access 0.4 → 0.5 (rubric flagged); transparency 0.6 → 0.7
  (LM=10 + IFSWF full member alignment)

Rubric (docs/methodology/swf-classification-rubric.md):
- New "Statutorily-gated long-horizon" 0.20 access tier added between
  0.1 (sanctions/frozen) and 0.3 (intergenerational/ruler-discretionary).
  Anchored by KIA-FGF (Decree 106 of 1976; Council-of-Ministers + Emir
  decree gate; crossed once in extremis during COVID).

Seeder:
- Two new pure helpers: shouldSkipFundForBuffer (excluded/unverified
  decision) and applyAumPctOfAudited (sleeve fraction multiplier)
- Manifest-AUM bypass: if aum_verified=true AND aum_usd present,
  use that value directly (skip Wikipedia)
- Skip funds with excluded_overlaps_with_reserves=true (no
  double-counting against reserveAdequacy / liquidReserveAdequacy)
- Skip funds with aum_verified=false (load for documentation only)

Tests (+25 net):
- 15 schema-extension tests (misplacement rejection, value-range gates,
  rationale-pairing coherence, backward-compat with pre-PR entries)
- 10 helper tests (shouldSkipFundForBuffer + applyAumPctOfAudited
  predicates and arithmetic; KIA-GRF + KIA-FGF sum equals combined AUM)
- Existing manifest test updated for the kia → kia-grf+kia-fgf split

Full suite: 6,940 tests pass (+50 net), typecheck clean, no new lint.

Predicted ranking deltas (informational, NOT acceptance criteria per
plan §"Hard non-goals"):
- AE sovFiscBuf likely 39 → 47-49 (Phase 1A + 1E)
- KW sovFiscBuf likely 98 → 53-57 (Phase 1B)
- CN, HK (excluded), KR, AU acquire newly-defined sovFiscBuf scores
- GCC ordering shifts toward QA > KW > AE; AE-KW gap likely 6 → ~3-4

Real outcome will be measured post-deploy via cohort audit per plan
§Phase 4.

* fix(resilience): completeness denominator excludes documentation-only funds

PR-3391 review (P1 catch): the per-country `expectedFunds` denominator
counted ALL manifest entries (`funds.length`) including those skipped
from buffer scoring by design — `excluded_overlaps_with_reserves: true`
(SAFE-IC, HKMA-EF) and `aum_verified: false` (EIA). Result: countries
with mixed scorable + non-scorable rosters showed `completeness < 1.0`
even when every scorable fund matched. UAE (4 scorable + EIA) would
show 0.8; CN (CIC + NSSF + SAFE-IC excluded) would show 0.67. The
downstream scorer then derated those countries' coverage based on a
fake-partial signal.

Three call sites all carried the same bug:
- per-country `expectedFunds` in fetchSovereignWealth main loop
- `expectedFundsTotal` + `expectedCountries` in buildCoverageSummary
- `countManifestFundsForCountry` (missing-country path)

All three now filter via `shouldSkipFundForBuffer` to count only
scorable manifest entries. Documentation-only funds neither expected
nor matched — they don't appear in the ratio at all.

Tests added (+4):
- AE complete with all 4 scorable matched (EIA documented but excluded)
- CN complete with CIC + NSSF matched (SAFE-IC documented but excluded)
- Missing-country path returns scorable count not raw manifest count
- Country with ONLY documentation-only entries excluded from expectedCountries

Full suite: 6,944 tests pass (+4 net), typecheck clean.

* fix(resilience): address Greptile P2s on PR #3391 manifest

Three review findings, all in the manifest YAML:

1. **KIA-GRF access 0.9 → 0.7** (rubric alignment): GRF deployment
   requires active Council-of-Ministers authorization (2020 COVID
   precedent demonstrates this), not rule-triggered automatic
   deployment. The rubric's 0.9 tier ("Pure automatic stabilization")
   reserved for funds where political authorization is post-hoc /
   symbolic (Chile ESSF candidate). KIA-GRF correctly fits 0.7
   ("Explicit stabilization with rule") — the same tier the
   pre-split combined-KIA was assigned. Updated rationale clarifies
   the tier choice. Rubric's 0.7 precedent column already lists
   "KIA General Reserve Fund" — now consistent with the manifest.

2. **Duplicate `# ── Australia ──` header before Oman** (copy-paste
   artifact): removed the orphaned header at the Oman section;
   added proper `# ── Australia ──` header above the Future Fund
   entry where it actually belongs (after Timor-Leste).

3. **NSSF `aum_pct_of_audited: 1.0` removed** (no-op): a multiplier
   of 1.0 is identity. The schema field is OPTIONAL and only meant
   for fund-of-funds split entries (e.g. KIA-GRF/FGF). Setting it
   to 1.0 forced the loader to require an `aum_pct_of_audited`
   rationale paragraph with no computational benefit. Both the
   field and the paragraph are now removed; NSSF remains a single-
   sleeve entry that scores its full audited AUM.

Full suite: 6,944 tests pass, typecheck clean.
2026-04-25 12:02:48 +04:00
Elie Habib
1dc807e70f docs(resilience): PR 4a — SWF classification rubric (tiers + precedents, no manifest changes) (#3376)
* docs(resilience): PR 4a — SWF classification rubric (tiers + precedents, no manifest changes)

PR 4a of cohort-audit plan 2026-04-24-002. First half of the plan's PR 4
(full-manifest re-rate) split into:

- PR 4a (this): pure documentation — central rubric defining tiers
  + concrete precedents per axis. No manifest changes.
- PR 4b (deferred): apply the rubric to revise specific coefficients
  in `scripts/shared/swf-classification-manifest.yaml`. Behaviour-
  changing; belongs in a separate PR with cohort snapshots and
  methodology review.

This split addresses the plan's concern that PR 4 "may not be outcome-
predetermined" by separating the evaluative framework from its
application. PR 4a makes every current manifest value evaluable against
a benchmark; PR 4b applies the benchmark.

Shipped

- `docs/methodology/swf-classification-rubric.md` — new doc.
  Sections:
    1. Introduction + scope (rubric vs manifest boundary)
    2. Access axis: 5 named tiers (0.1, 0.3, 0.5, 0.7, 0.9) w/
       concrete precedents per tier, plus edge cases for
       fiscal-rule caps (Norway GPFG) and state holding
       companies (Temasek)
    3. Liquidity axis: 6 tiers (0.1, 0.3, 0.5, 0.7, 0.9, 1.0) w/
       precedents + listed-vs-directly-owned real-estate edge case
    4. Transparency axis: 6 tiers grounded in LM Transparency
       Index + IFSWF membership + annual-report granularity, plus
       edge cases for LM=10 w/o holdings-level disclosure and
       sealed filings (KIA)
    5. Current manifest × rubric alignment — 24 coefficients reviewed;
       6 flagged as "arguably higher/lower under the rubric" with
       directional-impact analysis marked INFORMATIONAL, not
       motivation for revision
    6. How-to-use playbook for manifest-edit PRs (add/revise/rubric-
       revise workflows)

Findings (informational only — no PR changes)

Six ratings flagged as potentially under-/over-stated against the
rubric. Per the plan's anti-pattern note (rank-targeted acceptance
criteria), the flags are INFORMATIONAL: a future manifest-edit PR
should revise only when the rubric + cited evidence support the
change, not to hit a target ranking.

Flagged (with directional impact if revised upward):

  - Mubadala access 0.4 → arguably 0.5; transparency 0.6 → 0.7
    (haircut 0.12 → 0.175, +46% access × transparency product)
  - PIF access 0.4 → arguably 0.5; liquidity 0.4 → arguably 0.3
    (net small effect — opposite directions partially cancel)
  - KIA transparency 0.4 → arguably 0.5 (haircut +25%)
  - QIA access 0.4 → arguably 0.5; transparency 0.4 → arguably 0.5
    (haircut +56%)
  - GIC access 0.6 → arguably 0.7 (haircut +17%)

Not flagged: GPFG, ADIA, Temasek (all 9 coefficients align with
their rubric tiers).

Verified

- `npm run test:data` — 6694 pass / 0 fail (unchanged — pure docs PR)
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint:md` — clean

Not in this PR

- Manifest coefficient changes (PR 4b)
- Cohort-sanity snapshot before/after (PR 4b)
- Live-data audit of IFSWF engagement + LM index current values
  (requires web fetch — not in scope for a doc PR)

* fix(resilience): PR 4a review — resolve GIC/ADIA rubric contradictions + flag-count

Addresses P1 + 2 P2 Greptile findings on #3376 (draft).

1. **P1 — GIC tier contradiction.** GIC was listed as a canonical 0.7
   ("Explicit stabilization with rule") precedent AND rated 0.6 in
   the alignment table with an "arguably 0.7" note. That inconsistency
   makes the rubric unusable as-is for PR 4b review. Removed GIC from
   the 0.7 precedent list and explicitly marked it as a 0.7 *candidate*
   (pending PR 4b evaluation), not a 0.7 *precedent*. KIA General
   Reserve Fund stays as the canonical 0.7 example; Norway GPFG
   remains the borderline case for fiscal-rule caps.

2. **P2 — ADIA liquidity midpoint inconsistency.** Methodology text
   said the rubric uses "midpoint" for ranged disclosures and cited
   ADIA 55-70% → 0.7 tier. But midpoint(55-70) = 62.5%, which sits
   in the 0.5 tier band (50-65%). Fixed the methodology to state the
   rubric uses the **upper-bound** of a disclosed range (fund's own
   statement of maximum public-market allocation), which keeps ADIA
   at 0.7 tier (70% upper bound fits 65-85% band). Added forward-
   compatibility note: if future ADIA disclosures tighten the range
   so the upper bound drops below 65%, the rubric directs the rating
   to 0.5.

3. **P2 — Flag-count header.** "(6 of 24 coefficients)" was wrong;
   the enumeration below lists 8 coefficients across 5 funds.
   Corrected to "8 coefficients across 5 funds" with the fund-by-fund
   count inline so the header math is self-verifying.

Verified

- `npm run lint:md` — clean
- `npm run typecheck` — green (pure docs PR, no behaviour change)

This PR remains in draft pending #3380 (PR 3A — net-imports denominator)
merge per the plan's PR 4 → after PR 3A sequencing.
2026-04-24 18:32:17 +04:00
Elie Habib
b4198a52c3 docs(resilience): PR 5.1 — sanctions construct audit (designated-party domicile question) (#3375)
* docs(resilience): PR 5.1 — sanctions construct audit (designated-party domicile question)

PR 5.1 of cohort-audit plan 2026-04-24-002. Stacked on PR 5.3 (#3374)
so the known-limitations.md section append is additive. Read-only
static audit of scoreTradeSanctions + the sanctions:country-counts:v1
seed — framed around the Codex-reformulated construct question:
should designated-party domicile count penalize resilience?

Findings

1. The count is "OFAC-designated-party domicile locations," NOT
   "sanctions against this country." Seeder (`scripts/seed-sanctions-
   pressure.mjs:85-93`) parses OFAC Advanced XML SDN + Consolidated,
   extracts each designated party's Locations, and increments
   `map[countryCode]` by 1 for every location country on that party.

2. The count conflates three semantically distinct categories a
   resilience construct might treat differently:

   (a) Country-level sanction target (NK SDN listings) — correct penalty
   (b) Domiciled sanctioned entity (RU bank in Moscow, post-2022) —
       debatable, country hosts the actor
   (c) Transit / shell entity (UAE trading co listed under SDGT for
       Iran evasion; CY SPV for a Russian oligarch) — country is NOT
       the target, but takes the penalty

3. Observed GCC cohort impact: AE scores 54 vs KW/QA 82. The −28 gap
   is almost entirely driven by category (c) listings — AE is a
   financial hub where sanctioned parties incorporate shells.

4. Three options documented for the construct decision (NOT decided
   in this PR):
   - Option 1: Keep flat count (status quo, defensible via secondary-
     sanctions / FATF argument)
   - Option 2: Program-weighted count — weight DPRK/IRAN/SYRIA/etc.
     at 1.0, SDGT/SDNTK/CYBER/etc. at 0.3-0.5. Recommended; seeder
     already captures `programs` per entry — data is there, scorer
     just doesn't read it.
   - Option 3: Transit-hub exclusion list (AE, SG, HK, CY, VG, KY) —
     brittle + normative, not recommended

5. Recommendation documented: Option 2. Implementation deferred to
   a separate methodology-decision PR (outside auto-mode authority).

Shipped

- `docs/methodology/known-limitations.md` — new section extending
  the file: "tradeSanctions — designated-party domicile construct
  question." Covers what the count represents, the three categories
  with examples, observed GCC impact, three options w/ trade-offs,
  recommendation, follow-up audit list (entity-sample gated on
  API-key access), and file references.
- `tests/resilience-sanctions-field-mapping.test.mts` (new) —
  10 regression-guard tests pinning CURRENT behavior:
    1-6. normalizeSanctionCount piecewise anchors:
       count=0→100, 1→90, 10→75, 50→50, 200→25, 500→≤1
    7. Monotonicity: strictly decreasing across the ramp
    8. Country absent from map defaults to count=0 → score 100
       (intentional "no designated parties here" semantics)
    9. Seed outage (raw=null) → null score slot, NOT imputed
       (protects against silent data-outage scoring)
    10. Construct anchor: count=1 is exactly 10 points below count=0
        (pins the "first listing drops 10" design choice)

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts` — 10 pass / 0 fail
- `npm run test:data` — 6721 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — clean

* fix(resilience): PR 5.1 review — tighten count=500 assertion; clarify weightedBlend weights

Addresses 2 P2 Greptile findings on #3375:

1. Tighten count=500 assertion. Was `<= 1` with a comment stating the
   exact value is 0. That loose bound silently tolerates roundScore /
   boundary drift that would be the very signal this regression guard
   exists to catch. Changed to strict equality `=== 0`.

2. Clarify the "zero weight" comment on the sanctions-only harness.
   The other slots DO contribute their declared weights (0.15 +
   0.15 + 0.25 = 0.55) to weightedBlend's `totalWeight` denominator —
   only `availableWeight` (the score-computation denominator) drops
   to 0.45 because their score is null. The previous comment elided
   this distinction and could mislead a reader into thinking the null
   slots contributed nothing at all. Expanded to state exactly how
   `coverage` and `score` each behave.

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts`
  — 10 pass / 0 fail (count=500 now pins the exact 0 floor)
2026-04-24 18:30:59 +04:00
Elie Habib
a97ba83833 docs(resilience): PR 5.3 — foodWater scorer audit (construct-deterministic GCC identity) (#3374)
* docs(resilience): PR 5.3 — foodWater scorer audit (construct-deterministic GCC identity)

PR 5.3 of cohort-audit plan 2026-04-24-002. Stacked on PR 5.2 (#3373)
so the known-limitations.md section append is additive. Read-only
static audit of scoreFoodWater.

Findings

1. The observed GCC-all-score-53 is CONSTRUCT-DETERMINISTIC, not a
   regional-default leak. Pinned mathematically:
   - IPC/HDX doesn't publish active food-crisis data for food-secure
     states → scorer's fao-null branch imputes IMPUTE.ipcFood=88
     (class='stable-absence', cov=0.7) at combined weight 0.6
   - WB indicator ER.H2O.FWST.ZS (labelled 'water stress') for GCC
     is EXTREME (KW ~3200%, BH ~3400%, UAE ~2080%, QA ~770%) — all
     clamp to sub-score 0 under the scorer's lower-better 0..100
     normaliser at weight 0.4
   - Blended with peopleInCrisis=0 (fao block present with zero):
       (100 * 0.45 + 0 * 0.4) / (0.45 + 0.4) = 45 / 0.85 ≈ 53
   Every GCC country has the same inputs → same outputs. That's
   construct math, not a regional lookup.

2. Indicator-keyword routing is code-correct. `'water stress'`,
   `'withdrawal'`, `'dependency'` route to lower-better;
   `'availability'`, `'renewable'`, `'access'` route to
   higher-better; unrecognized indicators fall through to a
   value-range heuristic with a WARN log.

3. No bug or methodology decision required. The 53-all-GCC output
   is a correct summary statement: "non-crisis food security +
   severe water-withdrawal stress." A future construct decision
   might split foodWater into separate food and water dims so one
   saturated sub-signal doesn't dominate the combined dim for
   desert economies — but that's a construct redesign, not a bug.

Shipped

- `docs/methodology/known-limitations.md` — extended with a new
  section documenting the foodWater audit findings, the exact
  blend math that yields ~53 for GCC, cohort-determinism vs
  regional-default, and a follow-up data-side spot-check list
  gated on API-key access.
- `tests/resilience-foodwater-field-mapping.test.mts` — 8 new
  regression-guard tests:
    1. indicator='water stress' routes to lower-better
    2. GCC extreme-withdrawal anchor (value=2000 → blended score 53)
    3. indicator='renewable water availability' routes to higher-better
    4. fao=null with static record → imputes 88; imputationClass=null
       because observed AQUASTAT wins (weightedBlend T1.7 rule)
    5. fully-imputed (fao=null + aquastat=null) surfaces
       imputationClass='stable-absence'
    6. static-record absent entirely → coverage=0, NOT impute
    7. Cohort determinism — identical inputs → identical scores
    8. Different water-profile inputs → different scores (rules
       out regional-default hypothesis)

Verified

- `npx tsx --test tests/resilience-foodwater-field-mapping.test.mts` — 8 pass / 0 fail
- `npm run test:data` — 6711 pass / 0 fail (PR 5.2's 9 + PR 5.3's 8 = 17 new stacked)
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — clean

* fix(resilience): PR 5.3 review — pin IMPUTE branch for GCC anchor; fix comment math

Addresses 3 P2 Greptile findings on #3374 — all variations of the same
root cause: the test fixture + doc described two different code paths
that coincidentally both produce ~53 for GCC inputs.

Changes

1. GCC anchor test now drives the IMPUTE branch (`fao: null`), matching
   what the static seeder emits for GCC in production. The else branch
   (`fao: { peopleInCrisis: 0 }`) happens to converge on ~52.94 by
   coincidence but is NOT the live code path for GCC.

2. Doc finding #4 updated to show the IMPUTE-branch math
   `(88×0.6 + 0×0.4) / 1.0 = 52.8 → 53` and explicitly notes the
   else-branch convergence as a coincidence — not the construct's
   intent.

3. Comment math off-by-one fix at line 107:
     (88×0.6 + 80×0.4) / (0.6+0.4)
     = 52.8 + 32.0
     = 84.8 → 85    (was incorrectly stated as 85.6 → 86)
   Test assertion `>= 80 && <= 90` still accepts 85 so behaviour is
   unchanged; this was a comment-only error that would have misled
   anyone reproducing the math by hand.

Verified

- `npx tsx --test tests/resilience-foodwater-field-mapping.test.mts`
  — 8 pass / 0 fail (IMPUTE-branch anchor test produces 53 as expected)
- `npm run lint:md` — clean

Also rebased onto updated #3373 (which landed a backtick-escape fix).
2026-04-24 18:25:50 +04:00
Elie Habib
6807a9c7b9 docs(resilience): PR 5.2 — displacement field-mapping audit + known-limitations (#3373)
* docs(resilience): PR 5.2 — displacement field-mapping audit + known-limitations

PR 5.2 of cohort-audit plan 2026-04-24-002. Read-only static audit of
the UNHCR displacement field mapping consumed by scoreSocialCohesion,
scoreBorderSecurity, and scoreStateContinuity.

Findings

1. Field mapping is CODE-CORRECT. The plan's concern — that
   `totalDisplaced` might inadvertently include labor migrants — is
   negative at the source. The UNHCR Population API does not publish
   labor migrant data at all; it covers only four categories
   (refugees, asylum seekers, IDPs, stateless), all of which the
   seeder sums correctly. Labor-migrant-dominated cohorts (GCC, SG)
   legitimately register as "no UNHCR footprint" — that's UNHCR
   semantics, not a bug.

2. NEW finding during audit — `scoreBorderSecurity` fallback at
   _dimension-scorers.ts:1412 is effectively dead code. The
   `hostTotal ?? totalDisplaced` fallback never fires in production
   for two compounding reasons:

   (a) `safeNum(null)` returns 0 (JS `Number(null) === 0`), so the
       `??` short-circuits on 0 — the nullish-coalescing only falls
       back on null/undefined.
   (b) `scripts/seed-displacement-summary.mjs` ALWAYS writes
       `hostTotal: 0` explicitly for origin-only countries (lines
       141-144). There's no production shape where `hostTotal` is
       undefined, so the `??` can never select the fallback path.

   Observable consequence: origin-only high-outflow countries
   (Syria, Venezuela, Ukraine, Afghanistan) score 100 on
   borderSecurity's displacement sub-component (35% of the dim
   blend). The outflow signal is effectively silenced.

3. NOT fixing this in this PR. A one-line change (`||` or an
   explicit `> 0` check) would flip the borderSecurity score for
   ~6 high-outflow origin countries by a material amount — a
   methodology change, not a pure bug-fix. Belongs in a construct-
   decision PR with before/after cohort snapshots. Opening this as
   a follow-up discussion instead of bundling into an audit-doc PR.

Shipped

- `docs/methodology/known-limitations.md` — new file. Sections:
  "Displacement field-mapping" covering source semantics (what
  UNHCR provides vs does not), the GCC labor-migrant-cohort
  implication, the `??` short-circuit finding, and the decision
  to not fix in this PR. Includes a follow-up audit list of 11
  countries (high host-pressure + high origin-outflow + labor-
  migrant cohorts) for a live-data spot-check against UNHCR
  Refugee Data Finder — gated on API-key access.
- `tests/resilience-displacement-field-mapping.test.mts` —
  9-test regression guard. Pins:
    (1) `totalDisplaced` = sum of all four UNHCR categories;
    (2) `hostTotal` = asylum-side sum (no IDPs/stateless);
    (3) stateless population flows into totalDisplaced (guards
        against a future seeder refactor that drops the term);
    (4) labor-migrant-cohort (UNHCR-empty) entry scores 100 on
        the displacement sub-component — the correct-per-UNHCR-
        semantics outcome, intentionally preserved;
    (5) CURRENT scoreBorderSecurity behaviour: hostTotal=0
        short-circuits `??` (Syria-pattern scores 100);
    (6) `??` fallback ONLY fires when hostTotal is undefined
        (academic; seeder never emits this shape today);
    (7) `safeNum(null)` returns 0 quirk pinned as a numeric-
        coercion contract;
    (8) absent-from-UNHCR country imputes `stable-absence`;
    (9) scoreStateContinuity reads `totalDisplaced` origin-side.

Verified

- `npx tsx --test tests/resilience-displacement-field-mapping.test.mts` — 9 pass / 0 fail
- `npm run test:data` — 6703 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — no warnings on new files

* fix(resilience): PR 5.2 review — escape backticks in assertion message

Addresses Greptile P2 on #3373. The unescaped backticks around the
nullish-coalescing operator in a template literal caused JavaScript to
parse the string as 'prefix' ?? 'suffix' — truncating the assertion
message to the prefix alone on failure. Escaping the backticks preserves
the full diagnostic so a future regression shows the complete context.
Semantics unchanged; test still passes.
2026-04-24 18:14:29 +04:00
Elie Habib
df392b0514 feat(resilience): PR 0 — cohort-sanity release-gate harness (#3369)
* feat(resilience): PR 0 — cohort-sanity release-gate harness

Lands the audit infrastructure for the resilience cohort-ranking
structural audit (plan 2026-04-24-002). Release gate, not merge gate:
the audit tells release review what to look at before publishing a
ranking; it does not block a PR.

What's new
- scripts/audit-resilience-cohorts.mjs — Markdown report generator.
  Fetches the live ranking + per-country scores (or reads a fixture
  in offline mode), emits per-cohort per-dimension tables, contribution
  decomposition, saturated / outlier / identical-score flags, and a
  top-N movers comparison vs a baseline snapshot.
- tests/resilience-construct-invariants.test.mts — 12 formula-level
  anchor-value assertions with synthetic inputs. Covers HHI, external
  debt (Greenspan-Guidotti anchor), and sovereign fiscal buffer
  (saturating transform). Tests the MATH, not a country's rank.
- tests/fixtures/resilience-audit-fixture.json — offline fixture that
  mirrors the 2026-04-24 GCC state (KW>QA>AE) so the audit tool can
  be smoke-tested without API-key access.
- docs/methodology/cohort-sanity-release-gate.md — operational doc
  explaining when to run, how to read the report, and the explicit
  anti-pattern note on rank-targeted acceptance criteria.

Verified
- `npx tsx --test tests/resilience-construct-invariants.test.mts` —
  12 pass (HHI, debt, SWF invariants all green against current scorer)
- `npm run test:data` — 6706 pass / 0 fail
- `FIXTURE=tests/fixtures/resilience-audit-fixture.json
   OUT=/tmp/audit.md node scripts/audit-resilience-cohorts.mjs`
  runs to completion and correctly flags:
  (a) coverage-outlier on AE.importConcentration (0.3 vs peers 1.0)
  (b) saturated-high on GCC.externalDebtCoverage (all 6 at 100)
  — the two top cohort-sanity findings from the plan.

Not in this PR
- The live-API baseline snapshot
  (docs/snapshots/resilience-ranking-live-pre-cohort-audit-2026-04-24.json)
  is deferred to a manual release-prep step: run
  `WORLDMONITOR_API_KEY=wm_xxx API_BASE=https://api.worldmonitor.app
   node scripts/freeze-resilience-ranking.mjs` before the first
  methodology PR (PR 1 HHI period widening) so its movers table has
  something to compare against.
- No scorer changes. No cache-prefix bumps. This PR is pure tooling.

* fix(resilience): fail-closed on fetch failures + pillar-combine formula mode

Addresses review P1 + P2 on PR #3369.

P1 — fetch-failure silent-drop.
Per-country score fetches that failed were logged to stderr, silently
stored as null, and then filtered out of cohort tables via
`codes.filter((cc) => scoreMap.get(cc))`. A transient 403/500 on the
very country carrying the ranking anomaly could produce a Markdown
report that looked valid — wrong failure mode for a release gate.

Fix:
- `fetchScoresConcurrent` now tracks failures in a dedicated Map and
  does NOT insert null placeholders; missing cohort members are
  computed against the requested cohort code set.
- The report has a  blocker banner at top AND an always-rendered
  "Fetch failures / missing members" section (shown even when empty,
  so an operator learns to look).
- `STRICT=1` writes the report, then exits code 3 on any fetch
  failure or missing cohort member, code 4 on formula-mode drift,
  code 0 otherwise. Automation can differentiate the two.

P2 — pillar-combine formula mode invalidates contribution rows.
`docs/methodology/cohort-sanity-release-gate.md:63` tells operators
to run this audit before activating `RESILIENCE_PILLAR_COMBINE_ENABLED`,
but the contribution decomposition is a domain-weighted roll-up that
is ONLY valid when `overallScore = sum(domain.score * domain.weight)`.
Once pillar combine is on, `overallScore = penalizedPillarScore(pillars)`
(non-linear in dim scores); decomposition rows become materially
misleading for exactly the release-gate scenario the doc prescribes.

Fix:
- Added `detectFormulaMode(scoreMap)` that takes countries with:
  (a) `sum(domain.weight)` within 0.05 of 1.0 (complete response), AND
  (b) every dim at `coverage ≥ 0.9` (stable share math)
  and compares `|Σ contributions - overallScore|` against
  `CONTRIB_TOLERANCE` (default 1.5). If > 50% of ≥ 3 eligible
  countries drift, pillar combine is flagged.
- Report emits a  blocker banner at top, a "Formula mode" line in
  the header, and a "Formula-mode diagnostic" section with the first
  three offenders. Under `STRICT=1` exits code 4.
- Methodology doc updated: new "Fail-closed semantics" section,
  "Formula mode" operator guide, ENV table entries for STRICT +
  CONTRIB_TOLERANCE.

Verified:
- `tests/audit-cohort-formula-detection.test.mts` (NEW) — 3 child-process
  smoke tests: missing-members banner + STRICT exit 3, all-clear exit 0,
  pillar-mode banner + STRICT exit 4. All pass.
- `npx tsx --test tests/resilience-construct-invariants.test.mts
   tests/audit-cohort-formula-detection.test.mts` — 15 pass / 0 fail
- `npm run test:data` — 6709 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — no warnings on new / changed files
  (refactor split buildReport complexity from 51 → under 50 by
  extracting `renderCohortSection` + `renderDimCell`)
- Fixture smoke: AE.importConcentration coverage-outlier and
  GCC.externalDebtCoverage saturated-high flags still fire correctly.

* fix(resilience): PR 0 review — fixture-mode source label, try/catch country-names, ASCII minus

Addresses 3 P2 Greptile findings on #3369:

1. **Misleading Source: line in fixture mode.** `FIXTURE_PATH` sets
   `API_BASE=''`, so the report header showed a bare "/api/..." path that
   never resolved — making a fixture run visually indistinguishable from
   a live run. Now surfaces `Source: fixture://<path>` in fixture mode.

2. **`loadCountryNameMap` crashes without useful diagnostics.** A missing
   or unparseable `shared/country-names.json` produced a raw unhandled
   rejection. Now the read and the parse are each wrapped in their own
   try/catch; on either failure the script logs a developer-friendly
   warning and falls back to ISO-2 codes (report shows "AE" instead of
   "Uae"). Keeps the audit operable in CI-offline scenarios.

3. **Unicode minus `−` (U+2212) instead of ASCII `-` in `fmtDelta`.**
   Downstream operators diff / grep / CSV-pipe the report; the Unicode
   minus breaks byte-level text tooling. Replaced with ASCII hyphen-
   minus. Left the U+2212 in the formula-mode diagnostic prose
   (`|Σ contributions − overallScore|`) where it's mathematical notation,
   not data.

Verified

- `npx tsx --test tests/audit-cohort-formula-detection.test.mts tests/resilience-construct-invariants.test.mts` — 15 pass / 0 fail
- Fixture-mode run produces `Source: fixture://tests/fixtures/...`
- Movers-table negative deltas now use ASCII `-`
2026-04-24 18:13:22 +04:00
Elie Habib
d521924253 fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs (#3363)
* fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs

PR #3289 shipped the v2 energy construct behind RESILIENCE_ENERGY_V2_ENABLED
(default false). Audit on 2026-04-24 after the user flagged "AE only moved
1.49 points — we added nuclear credit, we should see more" revealed two
safety gaps that made a future flag flip unsafe:

1. scoreEnergyV2 silently fell back to IMPUTE when any of its three
   required Redis seeds (low-carbon-generation, fossil-electricity-share,
   power-losses) was null. A future operator flipping the flag with
   seeds absent would produce fabricated-looking numbers for every
   country with zero operator signal.

2. api/health.js had those three seed labels in BOTH SEED_META (CRIT on
   missing) AND ON_DEMAND_KEYS (which demotes CRIT to WARN). The demotion
   won. Health has been reporting WARNING on a scorer dependency that has
   been 100% missing since PR #3289 merged — no paging trail existed.

Changes:

  server/worldmonitor/resilience/v1/_dimension-scorers.ts
    - Add ResilienceConfigurationError with missingKeys[] payload.
    - scoreEnergy: preflight the three v2 seeds when flag=true. Throw
      ResilienceConfigurationError listing the specific absent keys.
    - scoreAllDimensions: wrap per-dimension dispatch in try/catch so a
      thrown ResilienceConfigurationError routes to the source-failure
      shape (imputationClass='source-failure', coverage=0) for that ONE
      dimension — country keeps scoring other dims normally. Log once
      per country-dimension pair so the gap is audit-traceable.

  api/health.js
    - Remove lowCarbonGeneration / fossilElectricityShare / powerLosses
      from ON_DEMAND_KEYS. They stay in BOOTSTRAP_KEYS + SEED_META.
    - Replace the transitional comment with a hard "do NOT add these
      back" note pointing at the scorer's fail-closed gate.

  tests/resilience-energy-v2.test.mts
    - New test: flag on + ALL three seeds missing → throws
      ResilienceConfigurationError naming all three keys.
    - New test: flag on + only one seed missing → throws naming ONLY
      the missing key (operator-clarity guard).
    - New test: flag on + all seeds present → v2 runs normally.
    - Update the file-level invariant comment to reflect the new
      fail-closed contract (replacing the prior "degrade gracefully"
      wording that codified the silent-IMPUTE bug).
    - Note: fixture's `??` fallbacks coerce null-overrides into real
      data, so the preflight tests use a direct-reader helper.

  docs/methodology/country-resilience-index.mdx
    - New "Fail-closed semantics" paragraph in the v2 Energy section
      documenting the throw + source-failure + health-CRIT contract.

Non-goals (intentional):
  - This PR does NOT flip RESILIENCE_ENERGY_V2_ENABLED.
  - This PR does NOT provision seed-bundle-resilience-energy-v2 on Railway.
  - This PR does NOT touch RESILIENCE_PILLAR_COMBINE_ENABLED.

Operational effect post-merge:
  - /api/health flips from WARNING → CRITICAL on the three v2 seed-meta
    entries. That is the intended alarm; it reveals that the Railway
    bundle was never provisioned.
  - scoreEnergy behavior with flag=false is unchanged (legacy path).
  - scoreEnergy behavior with flag=true + seeds present is unchanged.
  - scoreEnergy behavior with flag=true + seeds absent changes from
    "silently IMPUTE all 217 countries" to "source-failure on the
    energy dim for every country, visible in widget + API response".

Tests: 511/511 resilience-* pass. Biome clean. Lint:md clean.

Related plan: docs/plans/2026-04-24-001-fix-resilience-v2-fail-closed-on-missing-seeds-plan.md

* docs(resilience): scrub stale ON_DEMAND_KEYS references for v2 energy seeds

Greptile P2 on PR #3363: four stale references implied the three v2
energy seeds were still gated as ON_DEMAND_KEYS (WARN-on-missing) even
though this PR's api/health.js change removed them (now strict
SEED_META = CRIT on missing). Scrubbing each:

  - api/health.js:196 (BOOTSTRAP_KEYS comment) — was "ON_DEMAND_KEYS
    until Railway cron provisions; see below." Updated to cite plan
    2026-04-24-001 and the strict-SEED_META posture.
  - api/health.js:398 (SEED_META comment) — was "Listed in ON_DEMAND_KEYS
    below until Railway cron provisions..." Updated for same reason.
  - docs/methodology/country-resilience-index.mdx:635 — v2.1 changelog
    entry said seed keys were ON_DEMAND_KEYS until graduation. Replaced
    with the fail-closed contract description.
  - docs/methodology/energy-v2-flag-flip-runbook.md:25 — step 3 said
    "ON_DEMAND_KEYS graduation" was required at flag-flip time.
    Rewrote to explain no graduation step is needed because the
    posture was removed pre-activation.

No code change. Tests still 14/14 on the energy-v2 suite, lint:md clean.

* fix(docs): escape MDX-unsafe `<=` in energy-v2 runbook to unblock Mintlify

Mintlify deploy on PR #3363 failed with
`Unexpected character '=' (U+003D) before name` at
`docs/methodology/energy-v2-flag-flip-runbook.md`. Two lines had
`<=` in plain prose, which MDX tries to parse as a JSX-tag-start.

Replaced both with `≤` (U+2264) — and promoted the two existing `>=`
on adjacent lines to `≥` for consistency. Prose is clearer and MDX
safe.

Same pattern as `mdx-unsafe-patterns-in-md` skill; also adjacent to
PR #3344's `(<137 countries)` fix.
2026-04-24 09:37:18 +04:00
Elie Habib
53c50f4ba9 fix(swf): move manifest next to its loader so Railway ships it (#3344)
PR #3336 fixed the yaml dep but the next Railway tick crashed with
`ENOENT: no such file or directory, open '/docs/methodology/swf-classification-manifest.yaml'`.

Root cause: the loader at scripts/shared/swf-manifest-loader.mjs
resolved `../../docs/methodology/swf-classification-manifest.yaml`,
which works in a full repo checkout but lands at `/docs/...` (outside
`/app`) in the Railway recovery-bundle container. That service has
rootDirectory=scripts/ in the dashboard, so NIXPACKS only copies
`scripts/` into the image — `docs/` is never shipped.

Fix: move the YAML to scripts/shared/swf-classification-manifest.yaml,
alongside its loader. MANIFEST_PATH becomes `./swf-classification-manifest.yaml`
so the file is always adjacent to the code that reads it, regardless
of rootDirectory.

Tests: 53/53 SWF tests still pass; biome clean on changed files.
2026-04-23 19:47:10 +04:00
Elie Habib
d3d406448a feat(resilience): PR 2 §3.4 recovery-domain weight rebalance (#3328)
* feat(resilience): PR 2 §3.4 recovery-domain weight rebalance

Dials the two PR 2 §3.4 recovery dims (liquidReserveAdequacy,
sovereignFiscalBuffer) to ~10% share each of the recovery-domain
score via a new per-dimension weight channel in the coverage-weighted
mean. Matches the plan's direction that the sovereign-wealth signal
complement — rather than dominate — the classical liquid-reserves
and fiscal-space signals.

Implementation

- RESILIENCE_DIMENSION_WEIGHTS: new Record<ResilienceDimensionId, number>
  alongside RESILIENCE_DOMAIN_WEIGHTS. Every dim has an explicit entry
  (default 1.0) so rebalance decisions stay auditable; the two new
  recovery dims carry 0.5 each.

  Share math at full coverage (6 active recovery dims):
    weight sum                  = 4 × 1.0 + 2 × 0.5 = 5.0
    each new-dim share          = 0.5 / 5.0 = 0.10  ✓
    each core-dim share         = 1.0 / 5.0 = 0.20

  Retired dims (reserveAdequacy, fuelStockDays) keep weight 1.0 in
  the map; their coverage=0 neutralizes them at the coverage channel
  regardless. Explicit entries guard against a future scorer bug
  accidentally returning coverage>0 for a retired dim and falling
  through the `?? 1.0` default — every retirement decision is now
  tied to a single explicit source of truth.

- coverageWeightedMean (_shared.ts): refactored to apply
  `coverage × dimWeight` per dim instead of `coverage` alone. Backward-
  compatible when all weights default to 1.0 (reduces to the original
  mean). All three aggregation callers — buildDomainList, baseline-
  Score, stressScore — pick up the weighting transparently.

Test coverage

1. New `tests/resilience-recovery-weight-rebalance.test.mts`:
   pins the per-dim weight values, asserts the share math
   (0.10 new / 0.20 core), verifies completeness of the weight map,
   and documents why retired dims stay in the map at 1.0.
2. New `tests/resilience-recovery-ordering.test.mts`: fixture-based
   Spearman-proxy sensitivity check. Asserts NO > US > YE ordering
   preserved on both the overall score and the recovery-domain
   subscore after the rebalance. (Live post-merge Spearman rerun
   against the PR 0 snapshot is tracked as a follow-up commit.)
3. resilience-scorers.test.mts fixture anchors updated in lockstep:
     baselineScore: 60.35 → 62.17 (low-scoring liquidReserveAdequacy
       + partial-coverage SWF now contribute ~half the weight)
     overallScore:  63.60 → 64.39 (recovery subscore lifts by ~3 pts
       from the rebalance, overall by ~0.79)
     recovery flat mean: 48.75 (unchanged — flat mean doesn't apply
       weights by design; documents the coverage-weighted diff)
   Local coverageWeightedMean helper in the test mirrors the
   production implementation (weights applied per dim).

Methodology doc

- New "Per-dimension weights in the recovery domain" subsection with
  the weight table and a sentence explaining the cap. Cross-references
  the source of truth (RESILIENCE_DIMENSION_WEIGHTS).

Deliberate non-goals

- Live post-merge Spearman ≥0.85 check against the PR 0 baseline
  snapshot. Fixture ordering is preserved (new ordering test); the
  live-data check runs after Railway cron refreshes the rankings on
  the new weights and commits docs/snapshots/resilience-ranking-live-
  post-pr2-<date>.json. Tracked as the final piece of PR 2 §3.4
  alongside the health.js / bootstrap graduation (waiting on the
  7-day Railway cron bake-in window).

Tests: 6588/6588 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on touched files. NO > US > YE
fixture ordering preserved.

* fix(resilience): PR 2 review — thread RESILIENCE_DIMENSION_WEIGHTS through the comparison harness

Greptile P2: the operator comparison harness
(scripts/compare-resilience-current-vs-proposed.mjs) claims its domain
scores "mirror the production scorer's coverage-weighted mean" and is
the artifact generator for Spearman / rank-delta acceptance decisions.
After PR 2 §3.4's weight rebalance, the production mirror diverged —
production now applies RESILIENCE_DIMENSION_WEIGHTS (liquidReserveAdequacy
= 0.5, sovereignFiscalBuffer = 0.5) inside coverageWeightedMean, but
the harness still used equal-weight aggregation.

Left unfixed, post-merge Spearman / rank-delta diagnostics would
compare live API scores (with the 0.5 recovery weights) against
harness predictions that assume equal-share dims — silently biasing
every acceptance decision until someone noticed a country's rank-
delta didn't track.

Fix

- Mirrored coverageWeightedMean now accepts dimensionWeights and
  applies `coverage × weight` per dim, matching _shared.ts exactly.
- Mirrored buildDomainList accepts + forwards dimensionWeights.
- main() imports RESILIENCE_DIMENSION_WEIGHTS from the scorer module
  and passes it through to buildDomainList at the single call site.
- Missing-entry default = 1.0 (same contract as production) — makes
  the harness forward-compatible with any future weight refactor
  (adds a new dim without an explicit entry, old production fallback
  path still produces the correct number).

Verification

- Harness syntax-check clean (node -c).
- RESILIENCE_DIMENSION_WEIGHTS import resolves correctly from the
  harness's import path.
- 509/509 resilience tests still pass (harness isn't in the test
  suite; the invariant is that production ↔ harness use the same
  math, and the production side is covered by tests/resilience-
  recovery-weight-rebalance.test.mts).

* fix(resilience): PR 2 review — bump cache prefixes v10→v11 + document coverage-vs-weight asymmetry

Greptile P1 + P2 on PR #3328.

P1 — cache prefix not bumped after formula change
--------------------------------------------------
The per-dim weight rebalance changes the score formula, but the
`_formula` tag only distinguishes 'd6' vs 'pc' (pillar-combined vs
legacy 6-domain) — it does NOT detect intra-'d6' weight changes. Left
unfixed, scores cached before deploy would be served with the old
equal-weight math for up to the full 6h TTL, and the ranking key for
up to its 12h TTL. Matches the established v9→v10 pattern for every
prior formula-changing deploy.

Bumped in lockstep:
 - RESILIENCE_SCORE_CACHE_PREFIX:     v10  → v11
 - RESILIENCE_RANKING_CACHE_KEY:      v10  → v11
 - RESILIENCE_HISTORY_KEY_PREFIX:      v5  → v6
 - scripts/seed-resilience-scores.mjs local mirrors
 - api/health.js resilienceRanking literal
 - 4 analysis/backtest scripts that read the cached keys directly
 - Test fixtures in resilience-{ranking, handlers, scores-seed,
   pillar-aggregation}.test.* that assert on literal key values

The v5→v6 history bump is the critical one: without it, pre-rebalance
history points would mix with post-rebalance points inside the 30-day
window, and change30d / trend math would diff values from different
formulas against each other, producing false-negative "falling" trends
for every country across the deploy window.

P2 — coverage-vs-weight asymmetry in computeLowConfidence / computeOverallCoverage
----------------------------------------------------------------------------------
Reviewer flagged that these two functions still average coverage
equally across all non-retired dims, even after the scoring aggregation
started applying RESILIENCE_DIMENSION_WEIGHTS. The asymmetry is
INTENTIONAL — these signals answer a different question from scoring:

  scoring aggregation: "how much does each dim matter to the score?"
  coverage signal:     "how much real data do we have on this country?"

A dim at weight 0.5 still has the same data-availability footprint as
a weight=1.0 dim: its coverage value reflects whether we successfully
fetched the upstream source, not whether the scorer cares about it.
Applying scoring weights to the coverage signal would let a
half-weight dim hide half its sparsity from the overallCoverage pill,
misleading users reading coverage as a data-quality indicator.

Added explicit comments to both functions noting the asymmetry is
deliberate and pointing at the other site for matching rationale.
No code change — just documentation.

Tests: 6588/6588 data-tier tests pass (+511 resilience-specific
including the prefix-literal assertions). Typecheck clean on both
tsconfig configs. Biome clean on touched files.

* docs(resilience): bump methodology doc cache-prefix references to v11/v6

Greptile P2 on PR #3328: Redis keys table in the reproducibility
appendix still published `score:v10` / `ranking:v10` / `history:v5`,
and the rollback instructions told operators to flush those keys.
After the recovery-domain weight rebalance, live cache runs at
`score:v11` / `ranking:v11` / `history:v6`.

- Updated the Redis keys table (line 490-492) to match `_shared.ts`.
- Updated the rollback block to name the current keys.
- Left the historical "Activation sequence" narrative intact (it
  accurately describes the pillar-combine PR's v9→v10 / v4→v5 bump)
  but added a parenthetical pointing at the current v11/v6 values.

No code change — doc-only correction for operator accuracy.

* fix(docs): escape MDX-unsafe `<137` pattern to unblock Mintlify deploy

Line 643 had `(<137 countries)` — MDX parses `<137` as a JSX tag
starting with digit `1`, which is illegal and breaks the deploy with
"Unexpected character \`1\` (U+0031) before name". Surfaced after the
prior cache-prefix commit forced Mintlify to re-parse this file.

Replaced with "fewer than 137 countries" for unambiguous rendering.
Other `<` occurrences in this doc (lines 34, 642) are followed by
whitespace and don't trip MDX's tag parser.
2026-04-23 10:25:18 +04:00
Elie Habib
99b536bfb4 docs(energy): /corrections revision-log page (Week 4 launch requirement) (#3323)
* docs(energy): /corrections revision-log page (Week 4 launch requirement)

All five methodology pages reference /corrections (the auto-revision-log
URL promised in the Global Energy Flow parity plan §20) but the page
didn't exist — clicks 404'd. This lands the page.

Content:
- Explains the revision-log shape: `{date, assetOrEventId, fieldChanged,
  previousValue, newValue, trigger, sourcesUsed, classifierVersion}`.
- Defines the trigger vocabulary (classifier / source / decay / override)
  so readers know what kind of change they're seeing.
- States the v1-launch truth honestly: the log is empty at launch and
  fills as the post-launch classifier pass (in proactive-intelligence.mjs)
  runs on its normal schedule. No fake entries, no placeholder rows.
- Documents the correction-submission path (operators / regulators /
  researchers with public source URLs) and the contract that
  corrections write `override`-trigger entries citing the submitted
  source — not anonymous overrides.
- Cross-links all five methodology pages.
- Explains WHY we publish this: evidence-first classification only
  works if the audit trail is public; otherwise "the classifier said
  so" has no more authority than any other opaque pipeline.

Also fixes a navigation gap: docs/docs.json was missing both
methodology/disruptions (landed in PR #3294 but never registered in
nav) and the new corrections page. Both now appear in the "Intelligence
& Analysis" group alongside the other methodology pages.

No code changes. MDX lint + docs.json JSON validation pass.

* docs(energy): reframe /corrections as planned-surface spec (P1 review fix)

Greptile P1: the prior /corrections page made live-product claims
("writes an append-only entry here", "expect the first entries within
days", "email corrections@worldmonitor.app") that the code doesn't
back. The revision-log writer ships with the post-launch classifier;
the correction-intake pipeline does not yet exist; and the related
detail handlers still return empty `revisions` arrays with code
comments explicitly marking the surface as future work.

Fix: rewrite the page as a planned-surface specification with a
prominent Status callout. Changes:

- Page title: "Revision Log" → "Revision Log (Planned)"
- Prominent <Note> callout at the top states v1 launch truth: log is
  not yet live, RPC `revisions` arrays are empty by design,
  corrections are handled manually today.
- "Current state (v1 launch)" section removed; replaced with two
  explicit sections: "What IS live today" (evidence bundles,
  methodology, versioned classifier output) and "What is NOT live
  today" (log entries, automated correction intake, override-writer).
- "Within days" timeline language removed — no false operational SLA.
- Email submission path removed (no automated intake exists). Points
  readers to GitHub issues for manual review today.
- Preserves the planned data shape, trigger vocabulary, policy
  commitment, and "why we publish this" framing — those are spec, not
  claims.

Also softens /corrections references in the four methodology pages
(pipelines, storage, shortages, disruptions) so none of them claim
the revision log is live. Each now says "planned revision-log shape
and submission policy" and points manual corrections at GitHub issues.

MDX lint 122/122 clean. docs.json JSON validation clean. No code
changes; pure reframing to match reality.

* docs(shortages): fix P1 overclaim + wrong RPC name (round-2 review)

Two findings on the same file:

P1 — `energy_asset_overrides` table documented as existing. It doesn't.
The PR's corrections.mdx explicitly lists the override-writer as NOT
live in v1; this section contradicted that. Rewrote as "Break-glass
overrides (planned)" with a clear Status callout matching the pattern
established in docs/corrections.mdx and the other methodology pages.
Points readers at GitHub issues for manual corrections today.

P2 — Wrong RPC name: `listActiveFuelShortages` doesn't exist. The
shipped RPC (in proto/worldmonitor/supply_chain/v1/
list_fuel_shortages.proto + server/worldmonitor/supply-chain/v1/
list-fuel-shortages.ts) is `ListFuelShortages`. Replaced the name +
reframed the sentence to describe what the actual RPC already exposes
(every FuelShortageEntry includes evidence.evidenceSources[]) rather
than projecting a future surface.

Also swept the other methodology pages for the same class of bug:
- grep for _overrides: only the one line above
- grep for listActive/ getActive RPC names: none found
- verified all RPC mentions in docs/methodology + docs/corrections.mdx
  match names actually in proto (ListPipelines, ListStorageFacilities,
  ListFuelShortages, ListEnergyDisruptions, GetPipelineDetail,
  GetStorageFacilityDetail, GetFuelShortageDetail)

MDX lint clean. No code changes.

* docs(methodology): round-3 sibling sweep for revision-log overclaims

Reviewer (Greptile) caught a third round of the same overclaim pattern
I've been trying to stamp out: docs/methodology/shortages.mdx line 46
said "Stale shortages never persist silently. Every demotion writes to
the public revision log." — contradicting the same PR's /corrections
page which explicitly frames the revision log as not-yet-live. Fixed
that one AND did the mechanical sibling sweep the review pattern
clearly called for.

Changes:

- `docs/methodology/shortages.mdx:46` — rewrote the auto-decay footer
  to future tense: "When the post-launch classifier ships, stale
  shortages will never persist silently — every demotion will write
  an entry to the planned public revision log." Points readers at
  /corrections for the designed shape. Notes that today the demotion
  thresholds ARE the contract; the structured audit trail is what
  lands with the classifier.

- `docs/methodology/chokepoints.mdx:64` — sibling sweep caught the
  same bug class ("Every badge transition writes to the public
  revision log"). Reworded to future tense and pointed manual
  corrections at GitHub issues, matching the pattern already applied
  to pipelines / storage / shortages in prior commits on this PR.

Final audit of remaining revision-log mentions across all 5
methodology pages + corrections.mdx — every one uses hedged tense now
("planned", "will", "when live", "designed", "not yet live", "once
the classifier ships"). The one remaining present-tense "emit" in
shortages.mdx:77 is inside the "(planned)" break-glass section with
its own Status callout, so it's correctly scoped.

Following the plan-doc-as-docs-source-overclaim skill's step-4
(sibling sweep) explicitly this time — which also retroactively
validates the skill extraction: three review rounds was the cost of
not running the sweep on round 1.

MDX lint clean. No code changes.

* docs(corrections): drop hardcoded launch date (Greptile P2)

Greptile inline P2 at docs/corrections.mdx:60: the phrase
"v1 launch (2026-04-23)" pins a specific calendar date that will read
inaccurately to visitors months later once entries start appearing.

Dropped the parenthetical date. "Status — v1 launch:" keeps the
scoping clear without tying it to a specific day. When live entries
start appearing on this page (or when the page is rewritten to show
real rows), a "last updated" marker will replace the status callout
entirely — no migration churn needed.

MDX lint 122/122 clean.
2026-04-23 09:22:48 +04:00
Elie Habib
7f83e1e0c3 chore: remove dormant proactive-intelligence agent (superseded by digest) (#3325)
* chore: remove dormant proactive-intelligence agent (superseded by digest)

PR #2889 merged a Phase 4 "Proactive Intelligence Agent" in 2026-04 with
588 lines of code and a PR body explicitly requiring a 6h Railway cron
service. That service was never provisioned — no Dockerfile, no Railway
entry, no health-registry key, all 7 test-plan checkboxes unchecked.

In the meantime the daily Intelligence Brief shipped via
scripts/seed-digest-notifications.mjs (PR #3321 and earlier), covering
the same "personalized editorial brief across all channels" use-case
at a different cadence (30m rather than 6h). The proactive agent's
landscape-diff trigger was speculative; the digest is the shipped
equivalent.

This PR retires the dormant code and scrubs the aspirational
"post-launch classifier" references that docs + comments have been
quietly carrying:

- Deleted scripts/proactive-intelligence.mjs (588 lines).
- scripts/_energy-disruption-registry.mjs, scripts/seed-fuel-shortages.mjs,
  scripts/_fuel-shortage-registry.mjs, src/shared/shortage-evidence.ts:
  dropped "proactive-intelligence.mjs will extend this registry /
  classifier output" comments. Registries are curated-only; no classifier
  exists.
- docs/methodology/disruptions.mdx: replaced "post-launch classifier"
  prose with the accurate "curated-only" description of how the event
  log is maintained.
- docs/api-notifications.mdx: envelope version is shared across **two**
  producers now (notification-relay, seed-digest-notifications), not three.
- scripts/notification-relay.cjs: one cross-producer comment updated.
- proto/worldmonitor/supply_chain/v1/list_energy_disruptions.proto +
  list_fuel_shortages.proto: same aspirational wording scrubbed.
- docs/api/SupplyChainService.openapi.{yaml,json} auto-regenerated via
  `make generate` — text-only description updates, no schema changes.

Net: -626 lines, +36 lines. No runtime behavior change. 6573/6573 unit
tests pass locally.

* fix(proto): scrub stale ListFuelShortages RPC comment (PR #3325 review)

Reviewer caught a stale "classifier-extended post-launch" comment on
the ListFuelShortages RPC method in service.proto that this PR's
initial pass missed — I fixed the message-definition comment in
list_fuel_shortages.proto but not the RPC-method comment in
service.proto, which propagates into the published OpenAPI
operation description.

- proto/worldmonitor/supply_chain/v1/service.proto: rewrite the
  ListFuelShortages RPC comment to match the curated-only framing
  used elsewhere in this PR.
- docs/api/SupplyChainService.openapi.{yaml,json}: auto-regenerated
  via `make generate`. Text-only operation-description update;
  no schema / contract changes.

No runtime impact. Other `classifier` references remaining in the
OpenAPI are legitimate schema field names (classifierVersion,
classifierConfidence) and an unrelated auto-revision-log trigger
enum value, both of which describe real on-row fields that existed
before this cleanup.
2026-04-23 09:15:57 +04:00
Elie Habib
c48ceea463 feat(resilience): PR 2 dimension wiring — split reserveAdequacy + add sovereignFiscalBuffer (#3324)
* feat(resilience): PR 2 dimension wiring — split reserveAdequacy + add sovereignFiscalBuffer

Plan §3.4 follow-up to #3305 + #3319. Lands the scorer + dimension
registration so the SWF seed from the Railway cron feeds a real score
once the bake-in window closes. No weight rebalance yet (separate
commit with Spearman sensitivity check), no health.js graduation yet
(7-day ON_DEMAND window per feedback_health_required_key_needs_
railway_cron_first.md), no bootstrap wiring yet (follow-up PR).

Shape of the change

Retirement:
- reserveAdequacy joins fuelStockDays in RESILIENCE_RETIRED_DIMENSIONS.
  The legacy scorer now mirrors scoreFuelStockDays: returns
  coverage=0 / imputationClass=null so the dimension is filtered out
  of the confidence / coverage averages via the registry filter in
  computeLowConfidence, computeOverallCoverage, and the widget's
  formatResilienceConfidence. Kept in RESILIENCE_DIMENSION_ORDER for
  structural continuity (tests, cached payload shape, registry
  membership). Indicator registry tier demoted to 'experimental'.

Two new active dimensions:
- liquidReserveAdequacy (replaces the liquid-reserves half of the
  retired reserveAdequacy). Same source (WB FI.RES.TOTL.MO, total
  reserves in months of imports) but re-anchored 1..12 months
  instead of 1..18. Twelve months ≈ IMF "full reserve adequacy"
  benchmark for a diversified emerging-market importer — the tighter
  ceiling prevents wealthy commodity-exporters from claiming outsized
  credit for on-paper reserve stocks that are not the relevant
  shock-absorption buffer.
- sovereignFiscalBuffer. Reads resilience:recovery:sovereign-wealth:v1
  (populated by scripts/seed-sovereign-wealth.mjs, landed in #3305 +
  wired into Railway cron in #3319). Computes the saturating
  transform:
    effectiveMonths = Σ [ aum/annualImports × 12 × access × liquidity × transparency ]
    score           = 100 × (1 − exp(−effectiveMonths / 12))
  Exponential saturation prevents Norway-type outliers (effective
  months in the 100s) from dominating the recovery pillar.

Three code paths in scoreSovereignFiscalBuffer:
1. Seed key absent entirely → IMPUTE.recoverySovereignFiscalBuffer
   (score 50 / coverage 0.3 / unmonitored). Covers the Railway-cron
   bake-in window before the first successful tick.
2. Seed present, country NOT in manifest → score=0 with FULL coverage.
   Substantive absence, NOT imputation — per plan §3.4 "What happens
   to no-SWF countries." 0 × weight = 0 in the numerator, so the
   country correctly scores lower than SWF-holding peers on this dim.
3. Seed present, country in payload → saturating score, coverage
   derated by the partial-seed completeness signal (so a Mubadala or
   Temasek scrape drift on a multi-fund country shows up as lower
   confidence rather than a silently-understated total).

Indicator registry:
- Demoted recoveryReserveMonths (tied to retired reserveAdequacy) to
  tier='experimental'.
- Added recoveryLiquidReserveMonths: WB FI.RES.TOTL.MO, anchors 1..12,
  tier='core', coverage=188.
- Added recoverySovereignWealthEffectiveMonths: the new SWF signal,
  tier='experimental' for now because the manifest only has 8 funds
  (below the 180-core / 137-§3.6-gate threshold). Graduating to 'core'
  requires expanding the manifest past ~137 entries — a later PR.

Tests updated

- resilience-release-gate: 19→21 dim count; RETIRED_DIMENSIONS allow-
  list now includes reserveAdequacy alongside fuelStockDays.
- resilience-dimension-scorers: scoreReserveAdequacy monotonicity +
  "high reserves score well" tests migrated to scoreLiquidReserve-
  Adequacy (same source, new 1..12 anchor). New retirement-shape test
  for scoreReserveAdequacy mirroring the PR 3 fuelStockDays retirement
  test. Four new scorer tests pin the three code paths of
  scoreSovereignFiscalBuffer (absent seed / no-SWF country / SWF
  country / partial-completeness derate).
- resilience-scorers fixture: baseline 60.12→60.35, recovery-domain
  flat mean 47.33→48.75, overall 63.27→63.6. Each number commented
  with the driver (split adds liquidReserveAdequacy 18@1.0 + sovereign
  FiscalBuffer 50@0.3 at IMPUTE; retired reserveAdequacy drops out).
- resilience-dimension-monotonicity: target scoreLiquidReserveAdequacy
  instead of scoreReserveAdequacy.
- resilience-handlers: response-shape dim count 19→21.
- resilience-indicator-registry: coverage 19→21 dimensions.
- resilience-dimension-freshness: allowlisted the new sovereign-wealth
  seed-meta key in KNOWN_SEEDS_NOT_IN_HEALTH for the ON_DEMAND window.
- resilience-methodology-lint HEADING_TO_DIMENSION: added the two new
  heading mappings. Methodology doc gets H4 sections for Liquid
  Reserve Adequacy and Sovereign Fiscal Buffer; Reserve Adequacy
  section is annotated as retired.
- resilience-retired-dimensions-parity: client-side
  RESILIENCE_RETIRED_DIMENSION_IDS gets reserveAdequacy. Parser
  upgraded to strip inline `// …` comments from the array body so a
  future reviewer can drop a rationale next to an entry without
  breaking parity.
- resilience-confidence-averaging: fixture updated to include both
  retired dims (reserveAdequacy + fuelStockDays) — confirms the
  registry filter correctly excludes BOTH from the visible coverage
  reading.

Extraction harness (scripts/compare-resilience-current-vs-proposed.mjs):
- recoveryLiquidReserveMonths: reads the same reserve-adequacy seed
  field as recoveryReserveMonths.
- recoverySovereignWealthEffectiveMonths: reads the new SWF seed key
  on field totalEffectiveMonths. Absent-payload → 0 for correlation
  math (matches the substantive-no-SWF scorer branch).

Out of scope for this commit (follow-ups)

- Recovery-domain weight rebalance + Spearman sensitivity rerun
  against the PR 0 baseline.
- health.js graduation (SEED_META entry + ON_DEMAND_KEYS removal) once
  Railway cron has ~7 days of clean runs.
- api/bootstrap.js wiring once an RPC consumer needs the SWF data.
- Manifest expansion past 137 countries so sovereignFiscalBuffer can
  graduate from tier='experimental' to tier='core'.

Tests: 6573/6573 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on all touched files.

* fix(resilience): PR 2 review — add widget labels for new dimensions

P2 review finding on PR #3324. DIMENSION_LABELS in src/components/
resilience-widget-utils.ts covered only the old 19 dimension IDs, so
the two new active dims (liquidReserveAdequacy, sovereignFiscalBuffer)
would render with their raw internal IDs in the confidence grid for
every country once the scorer started emitting them. The widget test
at getResilienceDimensionLabel also asserted only the 19-label set,
so the gap would have shipped silently.

Fix: add user-facing short labels for both new dims. "Reserves" is
already claimed by the retired reserveAdequacy, so the replacement
disambiguates with "Liquid Reserves"; sovereignFiscalBuffer →
"Sovereign Wealth" per the methodology doc H4 heading.

Also added a regression guard — new test asserts EVERY id in
RESILIENCE_DIMENSION_ORDER resolves to a non-id label. Any future
dimension that ships without a matching DIMENSION_LABELS entry now
fails CI loudly instead of leaking the ID into the UI.

Tests: 502/502 resilience tests pass (+1 new coverage check).
Typecheck clean on both configs.

* fix(resilience): PR 2 review — remove dead IMPUTE.recoveryReserveAdequacy entry

Greptile P2: the retired scoreReserveAdequacy stub no longer reads
from IMPUTE (it hardcodes coverage=0 / imputationClass=null per the
retirement pattern), making IMPUTE.recoveryReserveAdequacy dead code.
Removed the entry + added a breadcrumb comment pointing at the
replacement IMPUTE.recoveryLiquidReserveAdequacy.

The second P2 (bootstrap.js not wired) is a deliberate non-goal — the
reviewer explicitly flags "for visibility" since it's tracked in the
PR body. No action this commit; bootstrap wiring lands alongside the
SEED_META graduation after the ~7-day Railway-cron bake-in.

Tests: 502/502 resilience tests still pass. Typecheck clean.
2026-04-23 09:01:30 +04:00
Elie Habib
8032dc3a04 feat(resilience): PR 2 pre-scorer — SWF manifest + seeder (8/8 funds) (#3305)
* feat(resilience): PR 2 scaffolding — SWF classification manifest + seeder skeleton

Plan §3.4. First of multiple commits for PR 2 (fiscal-buffer split
and sovereign-wealth integration). This commit is SCAFFOLDING ONLY:
no dimension wiring, no scorer, no cache-keys entry yet. The goal is
to land the reviewer-facing metadata and the seeder's three-tier
source shape so an external SWF practitioner can critique before we
wire the scorer.

What is in:

1. docs/methodology/swf-classification-manifest.yaml — authoritative
   per-fund classification for the `sovereignFiscalBuffer` dimension.
   First-pass estimates for the 8 funds named in plan §3.4 table:
   Norway GPFG, UAE ADIA + Mubadala, Saudi PIF, Kuwait KIA,
   Qatar QIA, Singapore GIC + Temasek. Each fund carries:
     - three-component classification (access, liquidity, transparency)
       each on [0, 1], with rationale text citing the mandate / fiscal
       rule / asset-mix / transparency-index evidence
     - source URLs for audit
   Fund-candidates deferred for external-reviewer decision are listed
   in a trailing comment block (CIC, NWF, SOFAZ, NSIA, Future Fund,
   NZ Super, ESSF, etc.).

   external_review_status: PENDING — flip to REVIEWED on sign-off.

2. scripts/shared/swf-manifest-loader.mjs — YAML parser + strict schema
   validator. Fails loudly on any deviation (out-of-range scores,
   non-ISO2 countries, missing rationale, duplicate fund IDs, wrong
   manifest version). Single source of truth for the seeder, future
   scorer, and methodology-doc linter.

3. scripts/seed-sovereign-wealth.mjs — seeder shell with the three-tier
   source priority from plan §3.4:
     1. Official fund disclosures (MoF, central-bank, annual reports)
     2. IFSWF member filings
     3. SWFI public fund-rankings page (license-free fallback, scraped)
   Tiers 1-3 are all stubbed (return null) in this commit — the
   seeder publishes a well-formed empty payload so the scorer IMPUTE
   fallback can be exercised end-to-end without live data.
   emptyDataIsFailure: false is set deliberately so pre-wiring cron
   runs do not poison seed-meta (see
   feedback_strict_floor_validate_fail_poisons_seed_meta.md).

   SWFI scrape target is documented in the file header with the
   exact URL and a 2.5s inter-request interval. The scraper itself
   lands in the next commit after the external reviewer signs off
   on the manifest.

4. tests/swf-classification-manifest.test.mjs — 14 tests exercising
   both the shipped YAML (plan §3.4 required-fund presence, [0,1]
   bounds, rationale length, source citations, multi-fund country
   handling) and the validator's schema enforcement (rejects out-
   of-range scores, non-ISO2 codes, missing rationale, empty sources,
   duplicates, wrong version, invalid review status).

Out of scope for this commit (follow-ups, in order):
 - Implement SWFI scrape + IFSWF parse + per-fund official endpoints
 - Add `liquidReserveAdequacy` and `sovereignFiscalBuffer` dimensions
   to RESILIENCE_DIMENSION_ORDER, registry, and scorers
 - Retire `reserveAdequacy` via RESILIENCE_RETIRED_DIMENSIONS
 - cache-keys.ts + api/bootstrap.js + api/health.js wiring (new
   seed key needs ON_DEMAND_KEYS gating per Railway-cron bake-in rule)
 - Recovery-domain weight rebalance + Spearman sensitivity rerun
 - Methodology doc: rewrite the reserveAdequacy section

Tests: 508/508 pass (resilience suite + new manifest tests).
Typecheck clean on both tsconfig.json and tsconfig.api.json.

No external-facing behavior change — all files are new + isolated.

* feat(resilience): PR 2 commit 2 — Wikipedia SWF scraper + SWFI pivot

Implements Tier 3 of the sovereignFiscalBuffer seeder. Tier 1 (official
disclosures) and Tier 2 (IFSWF filings) remain stubbed — they require
per-fund bespoke adapters and will land incrementally.

SWFI pivot
----------
The plan's original Tier 3 target was
https://www.swfinstitute.org/fund-rankings/sovereign-wealth-fund. Live
check on 2026-04-23: the page's <tbody> is empty and AUM is gated
behind a lead-capture form (name + company + job title). SWFI per-fund
/profile/<id> pages are similarly barren. The "public fund rankings"
is effectively no longer public; scraping the lead-gated surface would
require submitting fabricated contact info (TOS violation, legally
questionable), so Tier 3 pivots to Wikipedia.

Wikipedia is legally clean (CC-BY-SA 4.0, attribution required — see
WIKIPEDIA_SOURCE_ATTRIBUTION in the seeder) and structurally scrapable.
The SWFI Linaburg-Maduell Transparency Index mentioned in manifest
rationale text is a SEPARATE SWFI publication (public index scores),
not the fund-rankings paywall — those citations stay valid.

What is in
----------

1. scripts/seed-sovereign-wealth.mjs — Wikipedia scraper implementation:
   - parseWikipediaRankingsTable(html) — exported pure function so
     the parser is unit-testable without a live fetch. Extracts the
     wikitable, parses per-fund rows (Country, Abbrev, Fund name,
     Assets USD B, Inception, Origin).
   - Strip-HTML helper strips <sup> tags to SPACES (not empty) so
     `302.0<sup>41</sup>` stays `302.0 41` — otherwise the decimal
     value and its trailing footnote ref get welded into `302.041`,
     which the Assets regex mis-parses.
   - matchWikipediaRecord(fund, cache) — abbrev + fund-name lookup
     with country disambiguation: lookup maps are now
     Map<key, Record[]> (list) rather than Map<key, Record>, and the
     matcher filters the list by manifest country before returning.
     This is the exact fix for the PIF collision:
     "PIF" resolves to BOTH Saudi Arabia's Public Investment Fund
     (~USD 925B) and Palestine's Palestine Investment Fund (~USD 900M)
     on the live article. Without country-filtering, Map.set silently
     overwrites one with the other, so Saudi PIF would return
     Palestine's AUM — three orders of magnitude wrong.
   - When the country disambiguator cannot pick, returns null rather
     than a best-guess. Seeder logs the unmatched fund; the IMPUTE
     path handles it gracefully.

2. docs/methodology/swf-classification-manifest.yaml — added
   `wikipedia` hints block to each of the 8 funds (abbrev and/or
   fund_name, matching Wikipedia's canonical naming).

3. scripts/shared/swf-manifest-loader.mjs — optional `wikipedia` field
   in the schema: `abbrev` and `fund_name` both optional strings, but
   at least one must be present if the block is provided.

4. tests/seed-sovereign-wealth.test.mjs — 12 tests exercising:
   - fixture-based parser: abbrev/name indexing, HTML + footnote
     stripping, decimal AUM, malformed rows skipped, missing-table error
   - abbrev-collision handling: both candidates retained in the list
   - country-disambiguation matcher: Saudi PIF correctly picked from
     a Saudi-vs-Palestine collision fixture (the exact live bug)
   - ambiguous lookup with unknown country returns null, not wrong record

Live verification against the shipped Wikipedia article: 7/8 funds
matched with the correct country; Saudi PIF now correctly returns
USD 925B (not Palestine's USD 0.9B) because of the country-
disambiguation fix. Temasek is the one miss — Wikipedia does not
classify it as an SWF (practitioner debate; it lists under "state
holding companies" instead). Falls through to IMPUTE in the scorer
until Tier 1/2 adapters land with an official-disclosure source.

Tests: 522/522 pass (resilience + manifest + scraper).
Typecheck clean on both tsconfig.json and tsconfig.api.json.

Still stubbed for later commits:
 - Tier 1 per-fund official-disclosure adapters (incl. Temasek)
 - Tier 2 IFSWF secretariat parser
 - Dimension wiring (liquidReserveAdequacy, sovereignFiscalBuffer)
 - reserveAdequacy retirement via RESILIENCE_RETIRED_DIMENSIONS
 - cache-keys / bootstrap / health.js wiring (ON_DEMAND_KEYS until bake-in)
 - Recovery-domain weight rebalance + Spearman sensitivity rerun

* feat(resilience): PR 2 commit 3 — Wikipedia infobox fallback + FX → 8/8 match

Closes the Temasek gap. The Wikipedia list article excludes Temasek on
editorial grounds (classified as a "state holding company" rather than
an SWF), so the Tier-3 list-only path topped out at 7/8 funds matched.
This commit adds Tier 3b — per-fund Wikipedia article infobox scrape
— and a baked-in FX table to handle non-USD infobox currencies.

Live verification on the shipped Wikipedia articles: 8/8 funds matched.
Temasek: S$ 434B → US$ 321B via infobox + SGD→USD FX.

Implementation

1. scripts/seed-sovereign-wealth.mjs
   - FX_TO_USD table (USD, SGD, NOK, EUR, GBP, AED, SAR, KWD, QAR)
     with FX_RATES_REVIEWED_AT='2026-04-23' committed into the seed
     payload so stale rates are visible at audit time.
   - CURRENCY_SYMBOL_TO_ISO ordered list — US$ tested before S$ before
     bare $, and $ / kr require a space + digit neighbor to avoid
     false-matches in rich prose.
   - detectCurrency(text) exported pure for unit testing.
   - parseWikipediaArticleInfobox(html) exported pure — scans rows
     for "Total assets" / "Assets under management" / "AUM" / "Net
     assets" / "Net portfolio value" labels, extracts "NUMBER (trillion
     | billion | million) (YEAR)" values, applies FX conversion.
   - fetchWikipediaInfobox(fund) — per-fund article fetch, gated on
     the manifest's wikipedia.article_url hint.
   - sourceMix split into {official, ifswf, wikipedia_list,
     wikipedia_infobox} counters so the seed payload shows which tier
     delivered each fund.
   - Source priority chain: official → ifswf → wikipedia_list →
     wikipedia_infobox. Infobox last because it is N network round-
     trips; amortizing over the list article cache first minimizes
     live traffic.

2. docs/methodology/swf-classification-manifest.yaml
   - Temasek entry gains wikipedia.article_url:
     https://en.wikipedia.org/wiki/Temasek_Holdings with an inline
     comment explaining why the list-article path misses.

3. scripts/shared/swf-manifest-loader.mjs
   - article_url optional field; validator rejects anything that is
     not a https://<lang>.wikipedia.org/... URL so a typo cannot
     silently wire the seeder to an off-site fetch.

4. tests/seed-sovereign-wealth.test.mjs (10 new tests, 38/38 pass)
   - detectCurrency distinguishes US$ vs S$ vs bare $.
   - parseWikipediaArticleInfobox extracts Temasek S$ 434B → US$ 321B
     with year tag from "(2025)".
   - USD-native row pass-through with fxRate=1.0.
   - NOK trillion conversion (NOK 18.7T → USD 1.74T).
   - Returns null when no AUM row / no infobox at all.
   - Documents the unknown-currency → USD fallback contract.

Tests: 532/532 pass (full resilience + manifest + scraper suite).
Typecheck clean on both tsconfig.json and tsconfig.api.json.

Still stubbed for later commits:
 - Tier 1 per-fund official-disclosure adapters
 - Tier 2 IFSWF secretariat parser
 - Dimension wiring (liquidReserveAdequacy, sovereignFiscalBuffer)
 - reserveAdequacy retirement via RESILIENCE_RETIRED_DIMENSIONS
 - cache-keys / bootstrap / health.js wiring (ON_DEMAND_KEYS)
 - Recovery-domain weight rebalance + Spearman sensitivity rerun

* refactor(resilience): reuse project-shared FX infrastructure for SWF seeder

Self-caught duplication from the previous commit (699ba832a introduced
a local FX_TO_USD table and FX_RATES_REVIEWED_AT constant). The
codebase already has the canonical path:

  scripts/_seed-utils.mjs
    SHARED_FX_FALLBACKS      (USD/SGD/NOK/EUR/GBP/AED/SAR/QAR/KWD/...)
    getSharedFxRates()       (Redis shared:fx-rates:v1 4h cache + Yahoo)
    fetchYahooFxRates()

Used by seed-grocery-basket, seed-fuel-prices, seed-bigmac. Two FX
tables would drift and the live-rate layer (Yahoo via Redis cache)
would be orphaned on the SWF path.

What changed

- Deleted local FX_TO_USD / FX_RATES_REVIEWED_AT constants.
- parseWikipediaArticleInfobox() no longer performs FX conversion.
  Returns { valueNative, currencyNative, aumYear } so the seeder
  orchestrator applies project-shared rates at call time. Parser is
  now currency-agnostic and thinner.
- Added lookupUsdRate(currency, fxRates) helper:
  * USD → 1.0 short-circuit
  * prefer the live map (getSharedFxRates output) over static fallback
  * fall back to SHARED_FX_FALLBACKS
  * return null on unknown currency (caller skips the fund — no silent
    wrong-currency misreading).
- fetchWikipediaInfobox() accepts fxRates map, converts via
  lookupUsdRate, returns enriched { aum, currencyNative, fxRate }.
- fetchSovereignWealth() fetches fxRates once at the top via
  getSharedFxRates(buildFxSymbolsForSwf(), SHARED_FX_FALLBACKS), in
  parallel with World Bank imports + Wikipedia list. Warms the shared
  Redis FX cache for other seeders at the same time.
- Seed payload drops the fxRatesReviewedAt field; the shared cache
  carries that metadata at the Redis level for all seeders.

Tests updated

- parseWikipediaArticleInfobox tests assert the native value + ISO
  code, no longer the USD-converted amount.
- New `lookupUsdRate` suite pins the project-shared FX integration:
  USD short-circuit, live-rate preference, static fallback, unknown-
  currency null, and a Temasek S$ 434B → US$ 321B end-to-end case
  via the shared fallback table.

Live re-verification still 8/8; SGD comes through SHARED_FX_FALLBACKS
at 0.74 (same number as the deleted local table), so behavior is
identical but the dedupe is real.

Tests: 536/536 pass. Typecheck clean on both tsconfig configs.

* refactor(resilience): split SWF manifest validator into sub-helpers

Biome reported validateManifest at complexity 55 vs max 50. Extracted
the per-fund validation into validateFundEntry(raw, idx, seen) and
pulled out validateClassification, validateRationale, validateSources,
validateWikipediaHints as separate helpers. Behavior and tests are
unchanged; each helper is now well under the complexity cap and the
main validator reads linearly.

Tests: 42/42 manifest + scraper tests pass. Typecheck clean.

* fix(resilience): PR 2 review — partial-seed guard + manifest REVIEWED status

Addresses two P1 findings on PR #3305.

P1.1 — partial-seed silent corruption on multi-fund countries
-------------------------------------------------------------
For multi-fund countries (AE = ADIA + Mubadala, SG = GIC + Temasek)
the previous aggregation silently published a partial
totalEffectiveMonths if a secondary fund's scraper drifted on
Wikipedia — recordCount would still look green because we counted
"any fund matched" as a successful country-seed. Downstream scorer
would under-rank those countries with no missingness signal.

Fix:
- Each country entry now carries { expectedFunds, matchedFunds,
  completeness } alongside the existing totalEffectiveMonths. The
  scorer can use completeness < 1.0 to derate (treat as degraded
  coverage) rather than accept the partial number at face value.
- declareRecords counts ONLY countries with completeness === 1.0,
  so a secondary-fund drift drops the seed-meta record_count and
  triggers the operational alarm. recordCount in runSeed opts now
  delegates to declareRecords for parity.
- A warn log fires per partial country so the Railway cron log is
  loud on drift without poisoning seed-meta.
- 4 new tests pin: all-matched counts, partial drops, empty/malformed
  payloads, and defensive handling of pre-completeness payload shape.

P1.2 — manifest external-reviewer language contradicted shipped workflow
------------------------------------------------------------------------
The YAML header said "External sovereign-wealth-practitioner review
is REQUIRED before PR 2 merges" and external_review_status=PENDING.
WorldMonitor's operating mode is fully automated (see memory
`feedback_no_external_reviewer_assumption.md`) — there is no external
practitioner gate. Reviewer correctly flagged the inconsistency
between the document and the shipped behaviour.

Fix:
- Rewrote the header to describe the actual audit discipline:
  coefficients derive from the committed rationale + cited sources
  for each fund; revisions require the same discipline in a follow-
  up PR. No external-gate language.
- Flipped external_review_status to REVIEWED, with a clarifying
  comment: REVIEWED = coefficients derive from the committed
  rationale + seeder end-to-end matches the live surfaces. PENDING
  remains reserved for future PRs that ship unresolved TBD
  coefficients.
- Rewrote the "candidates deferred from v1" trailing block. Each
  fund listed now has a concrete rationale for deferral (sanctions /
  access coefficient would pin at 0 / classification contested /
  AUM disclosure unstable) so a future PR author can argue the case
  on record. No "reviewer advice needed" placeholders.
- Tweaked two inline fund comments (UAE ADQ/ICD, Singapore Temasek)
  that still said "external reviewer" — now describe the substantive
  reason for inclusion/deferral.

Tests
-----
- 540/540 resilience + manifest + scraper tests pass.
- Typecheck clean on both tsconfig configs.
- Biome clean on all touched files.

* fix(resilience): PR 2 review — 4 P2 fixes (WB imports, null validate, nested table, aumYear)

Greptile P2 findings on PR #3305, all addressed.

(1) Silent country drop on missing WB imports
---------------------------------------------
If fetchAnnualImportsUsd() has no entry for a manifest ISO-2
(transient WB outage, new country with spotty coverage), the country
was silently skipped. Downstream scorer would then read "absent from
payload" as "no SWF" and score 0 with full coverage — substantively
wrong. Now logs a warn and adds each affected fund to the unmatched
list with a `(no WB imports)` suffix so the seed-meta observer sees
the degradation.

(2) typeof null === 'object' bypassed validate()
------------------------------------------------
Bare `typeof data?.countries === 'object'` returned true for
{ countries: null } and { countries: [] }. Downstream property
access would then crash. Strict check added: non-null plain object
only; also rejects arrays. Test pins all 5 edge cases.

(3) Nested </table> / </td> truncated wikitable parse
-----------------------------------------------------
Lazy [\s\S]*? in the outer table regex AND the inner row/cell regexes
could silently drop every row after any cell that contained a nested
mini-table (Wikipedia footnote boxes, sort helpers). Two-step fix:
  - extractFirstWikitable: depth-aware walk counts <table>/</table>
    opens and closes, returns content at balanced depth
  - stripNestedTables: iteratively removes complete inner
    <table>…</table> blocks BEFORE row parsing, so the lazy row / cell
    regexes never see a nested </tr> or </td>
Test: 5-row fixture with a nested table inside row 1's cell — ADIA
(row 2) must still parse, GPFG (row 1 with nested) must still parse.

(4) aumYear reflected scrape year, not data year
------------------------------------------------
List-article entries were stamped with `new Date().getFullYear()`
even though the Wikipedia list publishes no per-row data-year
annotation (figures are typically prior-period). Consumers using
aumYear for freshness audit would see "2026" for 2024/2025 data.
Now set to null for list entries; infobox tier 3b retains year
extraction from the "(YYYY)" tag on the individual fund article.

P1 bootstrap deferral: intentional per project memory
-----------------------------------------------------
AGENTS.md says new data sources MUST wire api/bootstrap.js. Not done
in this PR by design:
  - No RPC consumer exists yet for
    `resilience:recovery:sovereign-wealth:v1` (scorer lands in a
    follow-up PR; wiring bootstrap without a consumer would be dead
    code).
  - Local memory `feedback_health_required_key_needs_railway_cron_
    first.md` requires new seed keys to sit in ON_DEMAND_KEYS for
    ~7 days of clean Railway cron before promoting to
    BOOTSTRAP_KEYS — adding bootstrap wiring now would pre-empt
    that window and risk CRIT alarms on the health surface.
The scorer PR that follows will land the bootstrap wiring + the
dimension at the same time, which is the cohesive unit.

Tests: 547/547 resilience + manifest + scraper tests pass.
Typecheck clean on both tsconfig configs. Biome clean on touched
files.
2026-04-23 07:58:40 +04:00
Elie Habib
84ee2beb3e feat(energy): Energy Atlas end-to-end — pipelines + storage + shortages + disruptions + country drill-down (#3294)
* feat(energy): pipeline registries (gas + oil) — evidence-based schema

Day 6 of the Energy Atlas Release 1 plan (Week 2). First curated asset
registry for the atlas — the real gap vs GEF.

## Curated data (critical assets only, not global completeness)

scripts/data/pipelines-gas.json — 12 critical gas lines:
  Nord Stream 1/2 (offline; Swedish EEZ sabotage 2022; EU sanctions refs),
  TurkStream, Yamal–Europe (offline; Polish counter-sanctions),
  Brotherhood/Soyuz (offline; Ukraine transit expired 2024-12-31),
  Power of Siberia, Dolphin, Medgaz, TAP, TANAP,
  Central Asia–China, Langeled.

scripts/data/pipelines-oil.json — 12 critical oil lines:
  Druzhba North/South (N offline per EU 2022/879; S under landlocked
  derogation), CPC, ESPO (+ price-cap sanction ref), BTC, TAPS,
  Habshan–Fujairah (Hormuz bypass), Keystone, Kirkuk–Ceyhan (offline
  since 2023 ICC ruling), Baku–Supsa, Trans-Mountain (TMX expansion
  May 2024), ESPO spur to Daqing.

Scope note: 75+ each is Week 2b work via GEM bulk import. Today's cut
is curated from first-hand operator disclosures + regulator filings so
I can stand behind every evidence field.

## Evidence-based schema (not conclusion labels)

Per docs/methodology/pipelines.mdx: no bare `sanctions_blocked` field.
Every pipeline carries an evidence bundle with `physicalState`,
`physicalStateSource`, `operatorStatement`, `commercialState`,
`sanctionRefs[]`, `lastEvidenceUpdate`, `classifierVersion`,
`classifierConfidence`. The public badge (`flowing|reduced|offline|
disputed`) is derived server-side from this bundle at read time.

## Seeder

scripts/seed-pipelines.mjs — single process publishes BOTH keys
(energy:pipelines:{gas,oil}:v1) via two runSeed() calls. Tiny datasets
(<20KB each) so co-location is cheap and guarantees classifierVersion
consistency.

Conventions followed (worldmonitor-bootstrap-registration skill):
- TTL 21d = 3× weekly cadence (gold-standard per
  feedback_seeder_gold_standard.md)
- maxStaleMin 20_160 = 2× cadence (health-maxstalemin-write-cadence skill)
- sourceVersion + schemaVersion + recordCount + declareRecords wired
  (seed-contract-foundation)
- Zero-case explicitly NOT allowed — MIN_PIPELINES_PER_REGISTRY=8 floor

## Health registration (dual, per feedback_two_health_endpoints_must_match)

- api/health.js: BOOTSTRAP_KEYS adds pipelinesGas + pipelinesOil;
  SEED_META adds both with maxStaleMin=20_160.
- api/seed-health.js: mirror entries with intervalMin=10_080 (maxStaleMin/2).

## Bundle registration

scripts/seed-bundle-energy-sources.mjs adds a single Pipelines entry
(not two) because seed-pipelines.mjs publishes both keys in one run —
listing oil separately would double-execute. Monitoring of the oil key
staleness happens in api/health.js instead.

## Tests (tests/pipelines-registry.test.mts)

17 passing node:test assertions covering:
- Schema validation (both registries pass validateRegistry)
- Identity resolution (no id collisions, id matches object key)
- Country ISO2 normalization (from/to/transit all match /^[A-Z]{2}$/)
- Endpoint geometry within Earth bounds
- Evidence rigor: non-flowing badges require at least one supporting
  evidence source (operator statement / sanctionRefs / ais-relay /
  satellite / press)
- ClassifierConfidence in 0..1
- Commodity/capacity pairing (gas uses capacityBcmYr, oil uses
  capacityMbd — mixing = test fail)
- validateRegistry rejects: empty object, null, no-evidence fixtures,
  below-floor counts

Typecheck clean (both tsconfig.json and tsconfig.api.json).

Next: Day 7 will add list-pipelines / get-pipeline-detail RPCs in
supply-chain/v1. Day 8 ships PipelineStatusPanel with DeckGL PathLayer
consuming the registry.

* fix(energy): split seed-pipelines.mjs into two entry points — runSeed hard-exits

High finding from PR review. scripts/seed-pipelines.mjs called runSeed()
twice in one process and awaited Promise.all. But runSeed() in
scripts/_seed-utils.mjs hard-exits via process.exit on ~9 terminal paths
(lines 816, 820, 839, 888, 917, 989, plus fetch-retry 946, fatal 859,
skipped-lock 81). The first runSeed to reach any terminal path exits the
entire node process, so the second runSeed's resolve never fires — only
one of energy:pipelines:{gas,oil}:v1 would ever be written.

Since the bundle scheduled seed-pipelines.mjs exactly once, and both
api/health.js and api/seed-health.js expect both keys populated, the
other registry would stay permanently EMPTY/STALE after deploy.

Fix: split into two entry-point scripts around a shared utility.

- scripts/_pipeline-registry.mjs (NEW, was seed-pipelines.mjs) — shared
  helpers ONLY. Exports GAS_CANONICAL_KEY, OIL_CANONICAL_KEY,
  PIPELINES_TTL_SECONDS, MAX_STALE_MIN, buildGasPayload, buildOilPayload,
  validateRegistry, recordCount, declareRecords. Underscore prefix marks
  it as non-entry-point (matches _seed-utils.mjs / _seed-envelope-source.mjs
  convention).
- scripts/seed-pipelines-gas.mjs (NEW) — imports from the shared module,
  single runSeed('energy','pipelines-gas',…) call.
- scripts/seed-pipelines-oil.mjs (NEW) — same shape, oil.
- scripts/seed-bundle-energy-sources.mjs — register BOTH seeders (not one).
- scripts/seed-pipelines.mjs — deleted.
- tests/pipelines-registry.test.mts — update import path to the shared
  module. All 17 tests still pass.

Typecheck clean (both configs). Tests pass. No other consumers import
from the deleted script.

* fix(energy): complete pipeline bootstrap registration per 4-file checklist

High finding from PR review. My earlier PR description claimed
worldmonitor-bootstrap-registration was complete, but I only touched two
of the four registries (api/health.js + api/seed-health.js). The bootstrap
hydration payload itself (api/bootstrap.js) and the shared cache-keys
registry (server/_shared/cache-keys.ts) still had no entry for either
pipeline key, so any consumer that reads bootstrap data would see
pipelinesGas/pipelinesOil as missing on first load.

Files updated this commit:

- api/bootstrap.js — KEYS map + SLOW_KEYS set both gain pipelinesGas +
  pipelinesOil. Placed next to sprPolicies (same curated-registry cadence
  and tier). Slow tier is correct: weekly cron, not needed on first paint.
- server/_shared/cache-keys.ts — PIPELINES_GAS_KEY + PIPELINES_OIL_KEY
  exported constants (matches SPR_POLICIES_KEY pattern), BOOTSTRAP_KEYS map
  entries, and BOOTSTRAP_TIERS entries (both 'slow').

Not touched (intentional):
- server/gateway.ts — pipeline data is free-tier per the Energy Atlas
  plan; no PREMIUM_RPC_PATHS entry required. Energy Atlas monetization
  hooks (scenario runner, MCP tools, subscriptions) are Release 2.

Full 4-file checklist now complete:
   server/_shared/cache-keys.ts (this commit)
   api/bootstrap.js          (this commit)
   api/health.js             (earlier in PR)
   api/seed-health.js        (earlier in PR — dual-registry rule)

Typecheck clean (both configs).

* feat(energy): ListPipelines + GetPipelineDetail RPCs with evidence-derived badges

Day 7 of the Energy Atlas Release 1 plan (Week 2). Exposes the pipeline
registries (shipped in Day 6) via two supply-chain RPCs and ships the
evidence-to-badge derivation server-side.

## Proto

proto/worldmonitor/supply_chain/v1/list_pipelines.proto — new:
- ListPipelinesRequest { commodity_type?: 'gas' | 'oil' }
- ListPipelinesResponse { pipelines[], fetched_at, classifier_version, upstream_unavailable }
- GetPipelineDetailRequest { pipeline_id (required, query-param) }
- GetPipelineDetailResponse { pipeline?, revisions[], fetched_at, unavailable }
- PipelineEntry — wire shape mirroring scripts/data/pipelines-{gas,oil}.json
  + a server-derived public_badge field
- PipelineEvidence, OperatorStatement, SanctionRef, LatLon, PipelineRevisionEntry

service.proto adds both rpc methods with HTTP_METHOD_GET + path bindings:
  /api/supply-chain/v1/list-pipelines
  /api/supply-chain/v1/get-pipeline-detail

`make generate` regenerated src/generated/{client,server}/… + docs/api/
OpenAPI json/yaml.

## Evidence-derivation

server/worldmonitor/supply-chain/v1/_pipeline-evidence.ts — new.
derivePublicBadge(evidence) → 'flowing' | 'reduced' | 'offline' | 'disputed'
is deterministic + versioned (DERIVER_VERSION='badge-deriver-v1').

Rules (first match wins):
1. offline + sanctionRef OR expired/suspended commercial → offline
2. offline + operator statement → offline
3. offline + only press/ais/satellite → disputed (single-source negative claim)
4. reduced → reduced
5. flowing → flowing
6. unknown / malformed → disputed

Staleness guard: non-flowing badges on >14d-old evidence demote to
disputed. Flowing is the optimistic default — stale "still flowing" is
safer than stale "offline". Matches seed-pipelines-{gas,oil}.mjs maxStaleMin.

Tests (tests/pipeline-evidence-derivation.test.mts) — 15 passing cases
covering happy paths, disputed fallbacks, staleness guard, versioning.

## Handlers

server/worldmonitor/supply-chain/v1/list-pipelines.ts
- Reads energy:pipelines:{gas,oil}:v1 via getCachedJson.
- projectPipeline() narrows the Upstash `unknown` into PipelineEntry
  shape + calls derivePublicBadge.
- Honors commodity_type filter (skip the opposite registry's Redis read
  when the client pre-filters).
- Returns upstream_unavailable=true when BOTH registries miss.

server/worldmonitor/supply-chain/v1/get-pipeline-detail.ts
- Scans both registries by id (ids are globally unique per
  tests/pipelines-registry.test.mts).
- Empty revisions[] for now; auto-revision log wires up in Week 3.

handler.ts registers both into supplyChainHandler.

## Gateway

server/gateway.ts adds 'static' cache-tier for both new RPC paths
(registry is slow-moving; 'static' matches the other read-mostly
supply-chain endpoints).

## Consumer wiring

Not in this commit — PipelineStatusPanel (Day 8) is what will call
listPipelines/getPipelineDetail via the generated client. pipelinesGas
+ pipelinesOil stay in PENDING_CONSUMERS until Day 8.

Typecheck clean (both configs). 15 new tests + 17 registry tests all pass.

* feat(energy): PipelineStatusPanel — evidence-backed status table + drawer

Day 8 of the Energy Atlas Release 1 plan. First consumer of the Day 6–7
registries + RPCs.

## What this PR adds

- src/components/PipelineStatusPanel.ts — new panel (id=pipeline-status).
  * Bootstrap-hydrates from pipelinesGas + pipelinesOil for instant first
    paint; falls through to listPipelines() RPC if bootstrap misses.
    Background re-fetch runs on every render so a classifier-version bump
    between bootstrap stamp and first view produces a visible update.
  * Table rows sorted non-flowing-first (offline / reduced / disputed
    before flowing) — what an atlas reader cares about.
  * Click-to-expand drawer calls getPipelineDetail() lazily — operator
    statements, sanction refs (with clickable source URLs), commercial
    state, classifier version + confidence %, capacity + route metadata.
  * publicBadge color-chip palette matches the methodology doc.
  * Attribution footer with GEM (CC-BY 4.0) credit + classifier version.

- src/components/index.ts — barrel export.
- src/app/panel-layout.ts — import + createPanel('pipeline-status', …).
- src/config/panels.ts — ENERGY_PANELS adds 'pipeline-status' at priority 1.

## PENDING_CONSUMERS cleanup

tests/bootstrap.test.mjs — removes 'pipelinesGas' + 'pipelinesOil' from
the allowlist. The invariant "every bootstrap key has a getHydratedData
consumer" now enforces real wiring for these keys: the panel literally
calls getHydratedData('pipelinesGas') and getHydratedData('pipelinesOil').
Future regressions that remove the consumer will fail pre-push.

## Consumer contract verified

- 67 tests pass including bootstrap.test.mjs consumer coverage check.
- Typecheck clean.
- No DeckGL PathLayer in this commit — existing 'pipelines-layer' has a
  separate data source, so modifying DeckGLMap.ts to overlay evidence-
  derived badges on the map is a follow-up commit to avoid clobbering.

## Out of scope for Day 8 (next steps on same PR)

- DeckGL PathLayer integration (color pipelines on the main map by
  publicBadge, click-to-open this drawer) — Day 8b commit.
- Storage facility registry + StorageFacilityMapPanel — Days 9-10.

* fix(energy): PipelineStatusPanel bootstrap path — client-side badge derivation

High finding from PR review. The Day-8 panel crashed on first paint
whenever bootstrap hydration succeeded, because:

- Bootstrap hydrates raw scripts/data/pipelines-{gas,oil}.json verbatim.
- That JSON does NOT include publicBadge — that field is only added by
  the server handler's projectPipeline() in list-pipelines.ts.
- PipelineStatusPanel passed raw entries into badgeChip(), which called
  badgeLabel(undefined).charAt(0) → TypeError.

The background RPC refresh that would have repaired the data never ran
because the panel threw before reaching it. So the exact bootstrap path
newly wired in commit 6b01fa537 was broken for the new panel.

Fix: move the evidence→badge deriver to src/shared/pipeline-evidence.ts
so the client panel and the server handler run the identical function on
identical inputs. Panel projects raw bootstrap JSON through the shared
deriver client-side, producing the same publicBadge the RPC would have
returned. No UI flicker on hydration because pre- and post-RPC badges
match exactly (same function, same input).

## Changes

- src/shared/pipeline-evidence.ts (NEW) — pure deriver with duck-typed
  PipelineEvidenceInput (no generated-type dependency, so both client
  and server assign their proto-typed evidence bundles by structural
  subtyping). Exports derivePipelinePublicBadge + version + type.
- server/worldmonitor/supply-chain/v1/_pipeline-evidence.ts — now a thin
  re-export of the shared module under its older name so in-handler
  imports keep working without a sweep.
- src/components/PipelineStatusPanel.ts:
  * Imports derivePipelinePublicBadge from @/shared/pipeline-evidence.
  * NEW projectRawPipeline() defensively coerces every field from
    unknown → PipelineEntry shape, mirroring the server projection.
  * buildBootstrapResponse now routes every raw entry through the
    projection before returning, so the wire-format PipelineEntry[] the
    renderer receives always has publicBadge populated.
  * badgeChip() gained a null-guard fallback to 'disputed' — belt +
    braces so even if a future caller passes an undefined, the UI
    renders safely instead of throwing.
  * BootstrapRegistry renamed RawBootstrapRegistry with a comment
    explaining why the seeder ships raw JSON (not wire format).

## Regression tests

tests/pipeline-panel-bootstrap.test.mts (NEW) — 6 tests that exercise
the bootstrap-first-paint path end-to-end:

- Every gas + oil curated entry produces a valid badge.
- Raw entries never ship with pre-computed publicBadge (contract guard
  on the seed data format).
- Deriver never throws on undefined/null/{} evidence (was the crash).
- Nord Stream 1 regression check (offline + paperwork → offline).
- Druzhba-South staleness behavior (reduced when fresh, disputed after
  60 days without update).

38/38 tests now pass (17 registry + 15 deriver + 6 new bootstrap-path).
Typecheck clean on both configs.

## Invariant preserved

The server handler and the panel render identical badges because:
1. Same pure function (imported from the same module).
2. Same deterministic rules, same staleness window.
3. Same bootstrap data read by both paths (Redis → either bootstrap
   payload or RPC response).

No UI flicker on hydration.

* fix(energy): three PR-review P2s on PipelineStatusPanel + aggregators

## P2-1 — sanitizeUrl on external evidence links (XSS hardening)

Sanction-ref URLs and operator-statement URLs were interpolated with
escapeHtml only. HTML-escaping blocks tag injection but NOT javascript:
or data: URL schemes, so a bad URL in the seeded registry would execute
in-app when a reader clicked the evidence link. Every other panel in
the codebase (NewsPanel, GdeltIntelPanel, GeoHubsPanel, AirlineIntelPanel,
MonitorPanel) uses sanitizeUrl for this exact reason.

Fix: import sanitizeUrl from @/utils/sanitize and route both hrefs
through it. sanitizeUrl() drops non-http(s) schemes + returns '' on
invalid URLs. The renderer now suppresses the <a> entirely when
sanitize rejects — the date label still renders as plain text instead
of becoming an executable link.

## P2-2 — loadDetail catch path missing stale-response guard

The success path at loadDetail() checked `this.selectedId !== pipelineId`
to suppress stale responses when the user has clicked another pipeline
mid-flight. The catch path at line 219 had no such guard: if the user
clicked A, then B, and A's request failed before B resolved, A's error
handler cleared detailLoading and detail, showing "Pipeline detail
unavailable" for B's drawer even though B was still loading.

Fix: mirror the same `if (this.selectedId !== pipelineId) return` guard
in the catch path. The newer request now owns the drawer state
regardless of which path (success OR failure) the older one took.

## P2-3 — always-gas-preference aggregator for classifierVersion + fetchedAt

Three call sites (list-pipelines.ts handler, get-pipeline-detail.ts
handler, PipelineStatusPanel bootstrap projection) computed aggregate
classifier version and fetchedAt by `gas?.x || oil?.x || fallback`.
That was defensible when a single seed-pipelines.mjs wrote both keys
atomically (fix commit 29b4ac78f split this into two separate Railway
cron entry points). Now gas + oil cron independently, so mixed-version
(gas=v1, oil=v2 during classifier rollout) and mixed-timestamp (oil
refreshed 6h after gas) windows are the EXPECTED state, not the
exceptional one. The comment in list-pipelines.ts even said "pick the
newest classifier version" but the code didn't actually compare.

Fix: add two shared helpers in src/shared/pipeline-evidence.ts —

- pickNewerClassifierVersion(a,b) — parses /^v(\\d+)$/ and returns the
  higher-numbered version; falls back to lexicographic for non-v-
  prefixed values; handles single-missing inputs.
- pickNewerIsoTimestamp(a,b) — Date.parse()-compares and returns the
  later ISO; handles missing / malformed inputs gracefully.

Both server RPCs and the panel bootstrap projection now call these
helpers identically, so clients are told the truth about version +
freshness during partial rollouts.

## Tests

Extended tests/pipeline-evidence-derivation.test.mts with 8 new
assertions covering both pickers:

- Higher v-number wins regardless of order (v1 vs v2 → v2 both ways)
- Single-missing falls back to the one present
- Missing + missing → default 'v1' for version / '' for ts
- Non-v-numbered values fall back to lexicographic
- Explicit regression: "gas=v1 + oil=v2 during rollout" returns v2
- Explicit regression: "oil fresher than gas" returns the oil timestamp

38 → 46 tests. All pass. Typecheck clean on both configs.

* feat(energy): DeckGL PathLayer colored by evidence-derived badge + map↔panel link

Day 8b of the Energy Atlas plan. Pipelines now render on the main
DeckGL map of the energy variant colored by their derived publicBadge,
and clicking a pipeline on the map opens the same evidence drawer the
panel row-click opens.

## Why this commit

Day 8 shipped the PipelineStatusPanel as a table + drawer view.
Reviewer flag notwithstanding (fixed in 149d33ec3 + db52965cd), a
table-only pipeline view is a weak product compared to the map-centric
atlas it's meant to rival. The map-layer differentiation is the whole
point of the feature.

## What this adds

src/components/DeckGLMap.ts:
- New createEnergyPipelinesLayer() — reads hydrated pipeline registries
  via getHydratedData, projects raw JSON through the shared deriver
  (src/shared/pipeline-evidence.ts), renders a DeckGL PathLayer colored
  by publicBadge:
    flowing  → green (46,204,113)
    reduced  → amber (243,156,18)
    offline  → red   (231,76,60)
    disputed → purple (155,89,182)
  Offline + disputed get thicker strokes (3px vs 2px) for at-a-glance
  surfacing of disrupted assets. Geometry comes from raw startPoint +
  waypoints[] + endPoint per asset (straight line when no waypoints).
- Branching at line ~1498: SITE_VARIANT === 'energy' routes to the
  new method; other variants keep the static PIPELINES config (colored
  by oil/gas type). Existing commodity/finance/full map layers are
  untouched — no cross-variant leakage.
- onClick handler emits `energy:open-pipeline-detail` as a window
  CustomEvent with { pipelineId }. Loose coupling: the map doesn't
  import the panel, the panel doesn't import the map.
- Fallback: if bootstrap hasn't hydrated yet, createEnergyPipelinesLayer
  falls back to the static createPipelinesLayer() so the pipelines
  toggle always shows *something*.

src/components/PipelineStatusPanel.ts:
- Constructor registers a window event listener for
  'energy:open-pipeline-detail' → calls this.loadDetail(pipelineId) →
  drawer opens on the clicked asset. Map click and row click converge
  on the same drawer, same evidence view.
- destroy() removes the listener to prevent ghost handlers after panel
  unmount.

## Guarantees

- Bootstrap parity: the DeckGL layer calls the SAME derivePipelinePublicBadge
  as the panel and the server handler, so the map color, the table row
  chip, and the RPC response all agree on the badge. No flicker, no
  drift, no confused user.
- Variant isolation: only SITE_VARIANT === 'energy' triggers the new
  path. Commodity / finance / full map layers untouched.
- No cross-component import: the panel doesn't reference the map class
  and vice versa. The event contract is the only coupling — testable,
  swappable, tauri-safe (guarded with `typeof window !== 'undefined'`).

Typecheck clean. PR #3294 now has 8 commits.

Follow-up backlog:
- Add waypoints[] to the curated pipelines-{gas,oil}.json so the map
  draws real routes instead of straight lines (cosmetic; does not
  affect correctness).
- Tooltip case in the picking tooltip registry (line ~3748) so hover
  shows "Nord Stream 1 · OFFLINE" before click.

* fix(energy): three PR-review findings on Day 8b DeckGL integration

## P1 — getHydratedData single-use race between map + panel

src/services/bootstrap.ts:34 — `if (val !== undefined) hydrationCache.delete(key);`
The helper drains its slot on first read. Day 8 (PipelineStatusPanel) and
Day 8b (createEnergyPipelinesLayer) BOTH call getHydratedData('pipelinesGas')
and getHydratedData('pipelinesOil') — whoever renders first drains the cache
and forces the loser onto its fallback path (panel → RPC, map → static
PIPELINES layer). The commit's "shared bootstrap-backed data" guarantee
did not actually hold.

Fix: new src/shared/pipeline-registry-store.ts that reads once and memoizes.
Both consumers read through getCachedPipelineRegistries() — same data, same
reference, unlimited re-reads. When the panel's background RPC fetch lands,
it calls setCachedPipelineRegistries() to back-propagate fresh data into
the store so the map's next re-render sees the newer classifierVersion +
fetchedAt too (no map/panel drift during classifier rollouts).

Test-only injection hook (__setBootstrapReaderForTests) makes the drain-once
semantics observable without a real bootstrap payload.

## P2 — pipelines-layer tooltip regresses to blank label on energy variant

src/components/DeckGLMap.ts:3748 (pipelines-layer tooltip case) still assumed
the static-config shape (obj.type). The new energy layer emits objects with
commodityType + badge fields, so the tooltip's type-ternary fell through to
the generic fallback — hover rendered " pipeline" (empty leading commodity)
instead of "Nord Stream 1 · OFFLINE".

Fix: differentiate by presence of obj.badge (only the energy layer sets it).
On the energy variant, tooltip now reads name + commodity + badge. Static-
config variants (commodity / finance / full) keep their existing format
unchanged.

## P2 — createEnergyPipelinesLayer dropped highlightedAssets behavior

The static createPipelinesLayer() reads this.highlightedAssets.pipeline and
threads it into getColor / getWidth with an updateTrigger on the signature.
Any caller using flashAssets('pipeline', [...]) or highlightAssets([...])
gets a visible red-outline flash on the matching paths. My Day 8b energy
layer ignored the set entirely — those APIs silently no-op'd on the energy
variant.

Fix: createEnergyPipelinesLayer() now reads the same highlight set, applies
HIGHLIGHT_COLOR + wider stroke to matching IDs, and wires
updateTriggers: { getColor: sig, getWidth: sig } so DeckGL actually
recomputes when the set changes.

Also removed the unnecessary layerCache.set() in the energy path: the
store can update via RPC back-propagation, and a cache keyed only on
highlight-signature would serve stale data. With ~25 critical-asset
pipelines, rebuild per render is trivial.

## Tests

tests/pipeline-registry-store.test.mts (NEW) — 5 tests covering the
drain-once read-many invariant: multiple consumers get cached data
without re-draining, RPC back-propagation updates the source, partial
updates preserve the other commodity, and pure RPC-first (no bootstrap)
works without invoking the reader.

All 51 PR tests pass. Typecheck clean on both configs.

* feat(energy): Day 9 — storage facility registry (UGS + SPR + LNG + crude hubs)

Ships 21 critical strategic storage facilities as a curated registry, same
evidence-bundle pattern as the pipeline registries in Day 7/8:

- scripts/data/storage-facilities.json — 4 UGS + 4 SPR + 6 LNG export +
  3 LNG import + 4 crude tank farms. Each carries physicalState +
  sanctionRefs + classifierVersion/Confidence + fillDisclosed/fillSource.
- scripts/_storage-facility-registry.mjs — shared helpers (validator,
  builder, canonical key, MAX_STALE_MIN). Validator enforces facility-type
  × capacity-unit pairing (ugs→TWh, spr/tank-farm→Mb, LNG→Mtpa) and the
  non-operational badge ⇒ evidence invariant.
- scripts/seed-storage-facilities.mjs — single runSeed entry (only one
  key, so no split-seeder dance needed).
- Registered in the 4-file bootstrap checklist: cache-keys.ts
  (STORAGE_FACILITIES_KEY + BOOTSTRAP_CACHE_KEYS + BOOTSTRAP_TIERS),
  api/bootstrap.js (KEYS + SLOW_KEYS), api/health.js (BOOTSTRAP_KEYS +
  SEED_META, 14d threshold = 2× weekly cron), api/seed-health.js (mirror).
- tests/bootstrap.test.mjs PENDING_CONSUMERS adds storageFacilities —
  Day 10 StorageFacilityMapPanel will remove it.
- tests/storage-facilities-registry.test.mts — 20 tests covering schema,
  identity, geometry, type×capacity pairing, evidence contract, and
  negative-input validator rejection.

Registry fields are slow-moving; badge derivation happens at read-time
server-side once the RPC handler lands in Day 10 (panel + deckGL
ScatterplotLayer). Seeded data is live in Redis from this commit so the
Day 10 PR only adds display surfaces.

Tests: 56 pass (36 prior + 20 new). Typecheck + typecheck:api clean.

* feat(energy): Day 10 — storage atlas (ListStorageFacilities RPC + DeckGL ScatterplotLayer + panel)

End-to-end wiring for the strategic storage registry seeded in Day 9. Same
pattern as the pipeline shipping path (Days 7+8+8b): proto → handler →
shared evidence deriver → panel → DeckGL map layer, with a shared
read-once store keeping map + panel aligned.

Proto + generated code:
- list_storage_facilities.proto: ListStorageFacilities +
  GetStorageFacilityDetail messages with StorageFacilityEntry,
  StorageEvidence, StorageSanctionRef, StorageOperatorStatement,
  StorageLatLon, StorageFacilityRevisionEntry.
- service.proto wires both RPCs under /api/supply-chain/v1.
- make generate → regenerated client + server stubs + OpenAPI.

Server handlers:
- src/shared/storage-evidence.ts: shared pure deriver. Duck-typed input
  interface avoids generated-type deps; identical rules to the pipeline
  deriver (sanction/commercial paperwork vs external-signal-only offline,
  14d staleness window, version pin).
- _storage-evidence.ts: thin re-export for server handler import ergonomics.
- list-storage-facilities.ts: reads STORAGE_FACILITIES_KEY from Upstash,
  projects raw → wire format, attaches derived publicBadge, filters by
  optional facilityType query arg.
- get-storage-facility-detail.ts: single-asset lookup for drawer.
- handler.ts registers both new methods.
- gateway.ts: both routes → 'static' cache tier (registry is near-static).

Panel + map:
- src/shared/storage-facility-registry-store.ts: drain-once memo mirroring
  pipeline-registry-store. Both panel and DeckGL layer read through this
  so the single-use getHydratedData drain doesn't race between consumers.
  RPC back-propagation via setCachedStorageFacilityRegistry() keeps map ↔
  panel on the same classifierVersion during rollouts.
- StorageFacilityMapPanel.ts: table + evidence drawer. Bootstrap hot path
  projects raw registry through same deriver as server so first-paint
  badge matches post-RPC badge (no flicker). sanitizeUrl + stale-response
  guards (success + catch paths) carried over from PipelineStatusPanel.
- DeckGLMap.ts createEnergyStorageLayer(): ScatterplotLayer keyed on
  badge color; log-scale radius (6km–26km) keeps Rehden visible next to
  Ras Laffan. Click dispatches 'energy:open-storage-facility-detail' —
  panel listens and opens its drawer (loose coupling, no direct refs).
- Tooltip branch on storage-facilities-layer shows facility type, country,
  capacity unit, and badge.
- Added 'storageFacilities' optional field to MapLayers type (optional so
  existing variant literals across commodity/finance/tech/happy/full/etc.
  don't need touching). Wired into LAYER_REGISTRY + VARIANT_LAYER_ORDER.energy
  + ENERGY_MAP_LAYERS + ENERGY_MOBILE_MAP_LAYERS. Panel entry added to
  ENERGY_PANELS + panel-layout createPanel. PENDING_CONSUMERS entry from
  Day 9 removed — panel + map layer are now real consumers.

Tests:
- storage-evidence-derivation.test.mts (17 tests): covers every curated
  facility yields a valid badge, null/malformed input never throws,
  offline sanction/commercial/operator rules, external-signal-only offline
  → disputed, staleness demotion.
- storage-facility-registry-store.test.mts (4 tests): drain-once, no-data
  drain, RPC update, pure-RPC-first path.

All 6,426 unit tests pass. Typecheck + typecheck:api clean. Pre-existing
src-tauri/sidecar/ test failure is unrelated (no diff touches src-tauri/).

* feat(energy): Day 11 — fuel-shortage registry schema + seed + RPC (classifier post-launch)

Ships v1 of the global fuel-shortage alert registry. Severity is the
CLASSIFIER OUTPUT (confirmed/watch), not a client derivation — we ship
the evidence alongside so readers can audit the grounds. v1 is seeded
from curated JSON; post-launch the proactive-intelligence classifier
(Day 12 work) extends the same key directly.

Data:
- scripts/data/fuel-shortages.json — 15 known active shortages
  (PK, LK, NG×2, CU, VE, LB, ZW, AR, IR, BO, KE, PA, EG, BY)
  spanning petrol/diesel/jet across confirmed + watch tiers. Each entry
  carries evidenceSources[] (regulator/operator/press), firstSeen,
  lastConfirmed, resolvedAt, impactTypes[], causeChain[], classifier
  version + confidence. Confirmed severity enforces authoritative
  evidence at schema level.

Seeder:
- scripts/_fuel-shortage-registry.mjs — shared validator (enforces
  iso2 country, enum products/severities/impacts/causes, authoritative
  evidence for confirmed). MIN_SHORTAGES=10.
- scripts/seed-fuel-shortages.mjs — single runSeed entry.
- Registered in seed-bundle-energy-sources.mjs at DAY cadence (shortages
  move faster than registry assets).

Bootstrap 4-file registration:
- cache-keys.ts: FUEL_SHORTAGES_KEY + BOOTSTRAP_CACHE_KEYS + BOOTSTRAP_TIERS.
- api/bootstrap.js: KEYS + SLOW_KEYS.
- api/health.js: BOOTSTRAP_KEYS + SEED_META (2880min = 2× daily cron).
- api/seed-health.js: mirrors intervalMin=1440.

Proto + RPC:
- list_fuel_shortages.proto: ListFuelShortages (country/product/severity
  query facets) + GetFuelShortageDetail messages with FuelShortageEntry,
  FuelShortageEvidence, FuelShortageEvidenceSource.
- service.proto wires both new RPCs under /api/supply-chain/v1.
- list-fuel-shortages.ts handler projects raw → wire format, supports
  server-side country/product/severity filtering.
- get-fuel-shortage-detail.ts single-shortage lookup.
- handler.ts registers both. gateway.ts: 'medium' cache-tier (daily
  classifier updates warrant moderate freshness).

Shared evidence helper:
- src/shared/shortage-evidence.ts: deriveShortageEvidenceQuality maps
  (confidence + authoritative-source count + freshness) → 'strong' |
  'moderate' | 'thin' for client-side sort/trust indicators. Does NOT
  change severity — classifier owns that decision.
- countEvidenceSources buckets sources for the drawer's "n regulator /
  m press" line.

Tests:
- tests/fuel-shortages-registry.test.mts (19 tests): schema, identity,
  enum coverage, evidence contract (confirmed → authoritative source),
  validateRegistry negative cases.
- tests/shortage-evidence.test.mts (10 tests): quality deriver edge
  cases, source bucketing.
- tests/bootstrap.test.mjs PENDING_CONSUMERS adds fuelShortages —
  FuelShortagePanel arrives Day 12 which will remove the entry.

Typecheck + typecheck:api clean. 64 tests pass.

* feat(energy): Day 12 — FuelShortagePanel + DeckGL shortage pins

End-to-end wiring of the fuel-shortage registry shipped in Day 11: panel
on the Energy variant page, ScatterplotLayer pins on the DeckGL map,
both reading through a shared single-drain store so they don't race on
the bootstrap cache.

Panel:
- src/components/FuelShortagePanel.ts — table sorted by severity (confirmed
  first) then evidence quality (strong → thin) then most-recent lastConfirmed.
  Drawer shows short description, first-seen / last-confirmed / resolved,
  impact types, cause chain, classifier version/confidence, and a typed
  evidence-source list with regulator/operator/press chips. sanitizeUrl on
  every href so classifier-ingested URLs can't render as javascript:. Same
  stale-response guards on success + catch paths as the other detail drawers.
- Consumes deriveShortageEvidenceQuality for client-side trust indicator
  (three-dot ●●● / ●●○ / ●○○), NOT for severity — severity is classifier
  output.
- Registered in ENERGY_PANELS + panel-layout.ts + components barrel.

Shared store:
- src/shared/fuel-shortage-registry-store.ts — same drain-once memoize
  pattern as pipeline- and storage-facility-registry-store. Both the
  panel and the DeckGL shortage-pins layer read through it.

DeckGL layer:
- DeckGLMap.createEnergyShortagePinsLayer: ScatterplotLayer placing one
  pin per active shortage at the country centroid (via getCountryCentroid
  from services/country-geometry). Stacking offset (~0.8° lon) when
  multiple shortages share a country so Nigeria's petrol + diesel don't
  render as a single dot. Confirmed pins 55km radius; watch 38km. Click
  dispatches 'energy:open-fuel-shortage-detail' — panel listens.
- Tooltip branch on fuel-shortages-layer: country · product · short
  description · severity.
- Layer registered in LAYER_REGISTRY, VARIANT_LAYER_ORDER.energy,
  ENERGY_MAP_LAYERS, ENERGY_MOBILE_MAP_LAYERS. MapLayers.fuelShortages
  is optional on the type so other variants' literals remain valid.

Tests:
- tests/fuel-shortage-registry-store.test.mts (4 tests): drain-once,
  no-data, RPC back-prop, pure-RPC-first path.
- tests/bootstrap.test.mjs — fuelShortages removed from PENDING_CONSUMERS.

Typecheck + typecheck:api clean. 39 tests pass (plus full suite in pre-push).

* feat(energy): Day 13 — energy disruption event log + asset timeline drawer

Ships the energy:disruptions:v1 registry that threads together pipelines
and storage facilities: state transitions (sabotage, sanction, maintenance,
mechanical, weather, commercial, war) keyed by assetId so any asset's
drawer can render its history without a second registry lookup.

Data + seeder:
- scripts/data/energy-disruptions.json — 12 curated events spanning
  Nord Stream 1/2 sabotage, Druzhba sanctions, CPC force majeure,
  TurkStream maintenance, Yamal halt, Rehden trusteeship, Arctic LNG 2
  sanction, ESPO drone strikes, BTC fire (historical), Sabine Pass
  Hurricane Beryl, Power of Siberia ramp. Each event links back to a
  seeded asset.
- scripts/_energy-disruption-registry.mjs — validator enforces valid
  assetType/eventType/cause enums, http(s) sources, startAt ≤ endAt,
  MIN_EVENTS=8.
- scripts/seed-energy-disruptions.mjs — runSeed entry (weekly cron).
- Bundle entry at 7×DAY cadence.

Bootstrap 4-file registration (cache-keys.ts + bootstrap.js + health.js +
seed-health.js) — energyDisruptions in PENDING_CONSUMERS because panel
drawers fetch lazily via RPC on drawer-open rather than hydrating from
bootstrap directly.

Proto + handler:
- list_energy_disruptions.proto: ListEnergyDisruptions with
  assetId / assetType / ongoingOnly query facets. Returns events sorted
  newest-first.
- list-energy-disruptions.ts projects raw → wire format, supports all
  three query facets.
- Registered in handler.ts. gateway.ts: 'medium' cache tier.

Shared timeline helper:
- src/shared/disruption-timeline.ts — pure formatters (formatEventWindow,
  formatCapacityOffline, statusForEvent). No generated-type deps so
  PipelineStatusPanel + StorageFacilityMapPanel import the same helpers
  and render the timeline identically.

Panel integration:
- PipelineStatusPanel.loadDetail now fetches getPipelineDetail +
  listEnergyDisruptions({assetId, assetType:'pipeline'}) in parallel.
  Drawer gains "Disruption timeline (N)" section with event type, date
  window, capacity offline, cause chain, and short description per entry.
- StorageFacilityMapPanel gets identical treatment with assetType='storage'.
- Both reset detailEvents on closeDetail and on fresh click (stale-response
  safety).

Tests:
- tests/energy-disruptions-registry.test.mts (17 tests): schema, identity,
  enum coverage, evidence, negative inputs.
- tests/bootstrap.test.mjs — energyDisruptions added to PENDING_CONSUMERS.

Typecheck + typecheck:api clean. 51 tests pass locally (plus full suite
in pre-push).

* feat(energy): Day 14 — country drill-down Atlas exposure section

Extends CountryDeepDivePanel's existing "Energy Profile" card with a
mini Atlas-exposure section that surfaces per-country exposure to the
new registries we shipped in Days 7-13.

For each country:
- Pipelines touching this country (from, to, or transit) — clickable
  rows that dispatch 'energy:open-pipeline-detail' so the PipelineStatusPanel
  drawer opens on the energy variant; no-op on other variants.
- Storage facilities in this country — same loose-coupling pattern
  with 'energy:open-storage-facility-detail'.
- Active fuel shortages in this country — severity breakdown line
  (N confirmed · M watch) plus clickable rows emitting
  'energy:open-fuel-shortage-detail'.

Silent absence: sections render only when the country has matching
assets/events, so countries with no pipeline, storage, or shortage
touchpoints see the existing energy-profile card unchanged.

Lazy stores: reads go through the same shared drain-once stores
(getCachedPipelineRegistries, getCachedStorageFacilityRegistry,
getCachedFuelShortageRegistry) so CountryDeepDivePanel does NOT race
with Atlas panels over the single-drain bootstrap cache. Dynamic
import() keeps the three stores out of the panel's static import graph
so non-energy variants can tree-shake them.

Typecheck clean. No schema changes; purely additive UI read from
already-shipped registries.

* docs(energy): methodology page for energy disruption event log

Fills the /docs/methodology/disruptions URL referenced by
list_energy_disruptions.proto, scripts/_energy-disruption-registry.mjs,
and the panel attribution footers. Explains scope (state transitions
not daily noise), data shape, what counts as a disruption, classifier
evolution path, RPC contract, and ties into the sibling pipeline +
storage + shortage methodology pages.

No code change; pure docs completion for Week 4 launch polish.

* fix(energy): upstreamUnavailable only fires when Redis returned nothing

Two handlers (list-storage-facilities + list-pipelines) conflated "empty
filter result on a healthy registry" with "upstream unavailable". A
caller who queried one facilityType/commodityType and legitimately got
zero matches was told the upstream was down — which may push clients to
error-state rendering or suppress caching instead of showing a valid
empty list.

list-storage-facilities.ts — upstreamUnavailable now only fires when
`raw` is null (Redis miss). Zero filtered rows on a healthy registry
returns upstreamUnavailable: false + empty array. Matches the sibling
list-fuel-shortages handler and the wire contract in
list_storage_facilities.proto.

list-pipelines.ts — same bug, subtler shape. Now checks "requested at
least one side AND received nothing" rather than "zero rows after
collection". A filter that legitimately matches no gas/oil pipelines on
a healthy registry now returns upstreamUnavailable: false.

list-energy-disruptions.ts and list-fuel-shortages.ts already had the
correct shape (only flag unavailable when raw is missing) — left as-is.

Typecheck + typecheck:api clean. No tests added: the existing registry
schema tests cover the projection/filter helpers, and the handler-level
gating change is documented in code comments for future audits.

* fix(energy): three Greptile findings on PR #3294

Two P1 filter bugs (resolved shortages rendered as active) and one P2
contract inconsistency on the disruptions handler.

P1: DeckGLMap createEnergyShortagePinsLayer rendered every shortage in
the registry as an active crisis pin — including entries where the
classifier has written resolvedAt to mark the crisis over. Added a
filter so only entries with a null/empty resolvedAt become map pins.
Curated v1 data has resolvedAt=null everywhere so no visible change
today, but the moment the classifier starts writing resolutions
post-launch, resolved shortages would have appeared as ongoing.

P1: CountryDeepDivePanel renderAtlasExposure had the same bug in the
country drill-down — "N confirmed · M watch" counts included resolved
entries, inflating the active-crisis line per country. Same one-line
filter fix.

P2: list-energy-disruptions.ts gated upstreamUnavailable on
`!raw?.events` — a partial write (top-level object present but `events`
property missing) fired the "upstream down" flag, inconsistent with
the sibling handlers (list-pipelines, list-storage-facilities,
list-fuel-shortages) that only fire on `!raw`. Rewrote to match:
`!raw` → upstreamUnavailable, empty events → normal empty list. This
also aligns with the contract documented on the upstream-unavailable-
vs-empty-filter skill extracted from the earlier P2 review.

Typecheck + typecheck:api clean. All three fixes are one-liner filter
or gate changes; no test additions needed (registry tests still pass
with v1 data since resolvedAt is null throughout).
2026-04-23 07:34:07 +04:00
Elie Habib
7cf37c604c feat(resilience): PR 3 — dead-signal cleanup (plan §3.5, §3.6) (#3297)
* feat(resilience): PR 3 §3.5 — retire fuelStockDays from core score permanently

First commit in PR 3 of the resilience repair plan. Retires
`fuelStockDays` from the core score with no replacement.

Why permanent, not replaced:
IEA emergency-stockholding rules are defined in days of NET IMPORTS
and do not bind net exporters by design. Norway/Canada/US measured
in days-of-imports are incomparable to Germany/Japan measured the
same way — the construct is fundamentally different across the two
country classes. No globally-comparable recovery-fuel signal can
be built from this source; the pre-repair probe showed 100% imputed
at 50 for every country in the April 2026 freeze.

  scoreFuelStockDays:
    - Rewritten to return coverage=0 + observedWeight=0 +
      imputationClass='source-failure' for every country regardless
      of seed content.
    - Drops the dimension from the `recovery` domain's coverage-
      weighted mean automatically; remaining recovery dimensions
      pick up the share via re-normalisation in
      `_shared.ts#coverageWeightedMean`.
    - No explicit weight transfer needed — the coverage-weighted
      blend handles redistribution.

  Registry:
    - recoveryFuelStockDays re-tagged from tier='enrichment' to
      tier='experimental' so the Core coverage gate treats it as
      out-of-score.
    - Description updated to make the retirement explicit; entry
      stays in the registry for structural continuity (the
      dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER
      for the 19-dimension tests; removing the dimension entirely is
      a PR 4 structural-audit concern).

  Housekeeping:
    - Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no
      longer read; noUnusedLocals would reject it).
    - Removed `RecoveryFuelStocksCountry` interface for the same
      reason. Comment at the removed declaration instructs future
      maintainers not to re-add the type as a reservation; when a
      new recovery-fuel concept lands, introduce a fresh interface.

Plan reference: §3.5 point 1 of
`docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.

51 resilience tests pass, typecheck + biome clean. The
`recovery` domain's published score will shift slightly for every
country because the 0.10 slot that fuelStockDays was imputing to
now redistributes; the compare-harness acceptance-gate rerun at
merge time will quantify the shift per plan §6 gates.

* feat(resilience): PR 3 §3.5 — retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves

BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35
fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60
countries max — so every non-BIS country fell through to
curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45).
Combined with reserveMarginPct already removed in PR 1, currencyExternal
was the clearest "construct absent for most of the world" carrier left
in the scorer.

Changes:

_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX
  reserves only. Coverage ladder:
    inflation + reserves → 0.85 (observed primary + secondary)
    inflation only       → 0.55
    reserves only        → 0.40
    neither              → 0.30 (IMPUTE.bisEer retained for snapshot
                                 continuity; semantics read as
                                 "no IMF + no WB reserves" now)
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved
  via comment only, flagged by noUnusedLocals), stddev() helper,
  getCountryBisExchangeRates() loader, BisExchangeRate interface,
  dateToSortableNumber() — all were exclusive callers of the retired
  BIS path.

_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core,
  sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability
  anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental
  (BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).

scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability →
  imf-macro-country-field field=inflationPct so the registry-parity
  test passes and the correlation harness sees the new construct.

tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266
  (BIS-outage coverage 0.35 branch) which collapsed to the inflation-
  only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55; inflation+
  reserves 0.55 → 0.85.

tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score
  shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).

All 6324 data-tier tests pass. typecheck:api clean. No change to
seeders or Redis keys; this is a pure scorer + registry rebuild.

* feat(resilience): PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)

Plan §2.1 diagnosis table showed externalDebtCoverage saturating at
score=100 across all 9 probe countries — including stressed states.
Signal was collapsed. Root cause: (worst=5, best=0) gave every country
with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's
reserve-adequacy threshold (ratio=1.0) to score 80 — well into "no
worry" territory instead of the "mild warning" it should be.

Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score
50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure).
Ratios above 2.0 clamp to 0, consistent with "beyond this point the
country is already in crisis; exact value stops mattering."

Files changed:

- _indicator-registry.ts: recoveryDebtToReserves goalposts
  {worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to
  cite Greenspan-Guidotti; inline comment documents anchor + rationale.

- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound
  changed from (0..5) to (0..2), with inline comment.

- docs/methodology/country-resilience-index.mdx: goalpost table row
  5-0 → 2-0, description cites Greenspan-Guidotti.

- docs/methodology/indicator-sources.yaml:
  * constructStatus: dead-signal → observed-mechanism (signal is now
    discriminating).
  * reviewNotes updated to describe the new anchor.
  * mechanismTestRationale names the Greenspan-Guidotti rule.

- tests/resilience-dimension-monotonicity.test.mts: updated the
  comment + picked values inside the (0..2) discriminating band (0.3
  and 1.5). Old values (1 vs 4) had 4 clamping to 0.

- tests/resilience-dimension-scorers.test.mts: NO score threshold
  relaxed >90 → >=85 (NO ratio=0.2 now scores 90, was 96).

- tests/resilience-scorers.test.mts: fixture drift:
  * domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
  * baselineScore 63.63 → 60.12 (extDebt is baseline type).
  * overallScore 65.52 → 63.27.
  * stressScore / stressFactor unchanged (extDebt is baseline-only).

All 6324 data-tier tests pass. typecheck:api clean.

* feat(resilience): PR 3 §3.6 — CI gate on indicator coverage and nominal weight

Plan §3.6 adds a new acceptance criterion (also §5 item 5):

> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.

This commit enforces the NOMINAL-WEIGHT half as a unit test that runs
on every CI build. The EFFECTIVE-INFLUENCE half is produced by
scripts/validate-resilience-sensitivity.mjs as a committed artifact;
the gate file only asserts that script still exists so a refactor that
removes it breaks the build loudly.

Why the gate exists (plan §3.6):

  "A dimension at 30% observed coverage carries the same effective
   weight as one at 95%. This contradicts the OECD/JRC handbook on
   uncertainty analysis."

Implementation:

tests/resilience-coverage-influence-gate.test.mts — three tests:
  1. Nominal-weight gate: for every core indicator with coverage < 137
     countries (70% of the ~195-country universe), computes its nominal
     overall weight as
       indicator.weight × (1/dimensions-in-domain) × domain-weight
     and asserts it does not exceed 5%. Equal-share-per-dimension is
     the *upper bound* on runtime weight (coverage-weighted mean gives
     a lower share when a dimension drops out), so this is a strict
     bound: if the nominal number passes, the runtime number also
     passes for every country.
  2. Effective-influence contract: asserts the sensitivity script
     exists at its expected path. Removing it (intentionally or by
     refactor) breaks the build.
  3. Audit visibility: prints the top 10 core indicators by nominal
     overall weight. No assertion beyond "ran" — the list lets
     reviewers spot outliers that pass the gate but are near the cap.

Current state (observed from audit output):

  recoveryReserveMonths:   nominal=4.17%  coverage=188
  recoveryDebtToReserves:  nominal=4.17%  coverage=185
  recoveryImportHhi:       nominal=4.17%  coverage=190
  inflationStability:      nominal=3.40%  coverage=185
  electricityConsumption:  nominal=3.30%  coverage=217
  ucdpConflict:            nominal=3.09%  coverage=193

Every core indicator has coverage ≥ 180 (already enforced by the
pre-existing indicator-tiering test), so the nominal-weight gate has
no current violators — its purpose is catching future drift, not
flagging today's state.

All 6327 data-tier tests pass. typecheck:api clean.

* docs(resilience): PR 3 methodology doc — document §3.5 dead-signal retirements + §3.6 coverage gate

Methodology-doc update capturing the three §3.5 landings and the §3.6 CI
gate. Five edits:

1. **Known construct limitations section (#5 and #6):** strikethrough the
   original "dead signals" and "no coverage-based weight cap" items,
   annotate them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" +
   specifics of what shipped.

2. **Currency & External H4 section:** completely rewritten. Old table
   (fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is
   replaced by the two-indicator post-PR-3 table (inflationStability at
   0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out
   (0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as
   experimental-tier drill-downs only.

3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the
   methodology-lint H4-to-dimension mapping does not break; body
   rewritten to explain that the dimension is retired from core but the
   seeder still runs for IEA-member drill-downs.

4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description
   cites Greenspan-Guidotti reserve-adequacy rule.

5. **New v2.2 changelog entry** — PR 3 dead-signal cleanup, covering
   §3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit
   updates.

No scoring or code changes in this commit. Methodology-lint test passes
(H4 mapping intact). All 6327 data-tier tests pass.

* fix(resilience): PR 3 §3.6 gate — correct share-denominator for coverage-weighted aggregation

Reviewer catch (thanks). The previous gate computed each indicator's
nominal overall weight as

  indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight

and claimed this was an upper bound ("actual runtime weight is ≤ this
when some dimensions drop out on coverage"). That is BACKWARDS for
this scorer.

The domain aggregation is coverage-weighted
(server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean),
so when a dimension pins at coverage=0 it is EXCLUDED from the
denominator and the surviving dimensions' shares go UP, not down.

PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to
coverage=0 for every country — so in the current live state the
recovery domain has 5 contributing dimensions (not 6), and each core
recovery indicator's nominal share is

  1.0 × 1/5 × 0.25 = 5.00% (was mis-reported as 4.17%)

The old gate therefore under-estimated nominal influence and could
silently pass exactly the kind of low-coverage overweight regression
it is meant to block.

Fix:

- Added `coreBearingDimensions(domainId)` helper that counts only
  dimensions that have ≥1 core indicator in the registry. A dimension
  with only experimental/enrichment entries (post-retirement
  fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count,
  not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict
  upper bound — explicitly calls out the dynamic case (source failure
  raising surviving dim shares further) as the sensitivity script's
  responsibility.
- Added a new regression test: asserts (a) at least one recovery
  dimension is all-non-core (fuelStockDays post-retirement),
  (b) fuelStockDays has zero core indicators, and (c) recoveryDebt
  ToReserves nominal = 0.05 exactly (not 0.0417) — any reversion
  of the retirement or regression to N_total-denominator will fail
  loudly.

Top-10 audit output now correctly shows:

  recoveryReserveMonths:   nominal=5%     coverage=188
  recoveryDebtToReserves:  nominal=5%     coverage=185
  recoveryImportHhi:       nominal=5%     coverage=190
  (was 4.17% each under the old math)

All 486 resilience tests pass. typecheck:api clean.

Note: the 5% figure is exactly AT the cap, not over it. "exceed" means
strictly > 5%, so it still passes. But now the reviewer / audit log
reflects reality.

* fix(resilience): PR 3 review — retired-dim confidence drag + false source-failure label

Addresses the Codex review P1 + P2 on PR #3297.

P1 — retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct),
but computeLowConfidence, computeOverallCoverage, and the widget's
formatResilienceConfidence averaged across all 19 dimensions. That
dragged every country's reported averageCoverage down — US went from
0.8556 (active dims only) to 0.8105 (all dims) — enough drift to
misclassify edge countries as lowConfidence and to shift the ranking
widget's overallCoverage pill for every country.

Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in
_dimension-scorers.ts and filter it out of all three averages. The
filter is keyed on the retired-dim REGISTRY, not on coverage === 0,
because a non-retired dim can legitimately emit coverage=0 on a
genuinely sparse-data country via weightedBlend fall-through — those
entries MUST keep dragging confidence down (that is the sparse-data
signal lowConfidence exists to surface). Verified: sparse-country
release-gate test (marks sparse WHO/FAO countries as low confidence)
still passes with the registry-keyed filter; would have failed with
a naive coverage=0 filter.

Server-client parity: widget-utils cannot import server code, so
RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept
in lockstep by tests/resilience-retired-dimensions-parity.test.mts
(parses the widget file as text, same pattern as existing widget-util
tests that can't import the widget module directly).

P2 — false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure',
which the widget maps to "Source down: upstream seeder failed" with
a `!` icon for every country. That is semantically wrong for an
intentional retirement. Flipped to null so the widget's absent-path
renders a neutral cell without a false outage label. null is already
a legal value of ResilienceDimensionScore.imputationClass; no type
change needed.

Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the
  registry-keyed filter semantic for computeOverallCoverage +
  computeLowConfidence. Includes a negative-control test proving
  non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new):
  lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired
  coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for
  fuelStockDays flipped to null.

All 494 resilience tests + full 6336/6336 data-tier suite pass.
Typecheck clean for both tsconfig.json and tsconfig.api.json.

* docs(resilience): align methodology + registry metadata with shipped imputationClass=null

Follow-up to the previous PR 3 review commit that flipped
scoreFuelStockDays's imputationClass from 'source-failure' to null to
avoid a false "Source down" widget label on every country. The code
changed; the doc and registry metadata did not, leaving three sites
in the methodology mdx and two comment/description sites in the
registry still claiming imputationClass='source-failure'. Any future
reviewer (or tooling that treats the registry description as
authoritative) would be misled.

This commit rewrites those sites to describe the shipped behavior:
 - imputationClass=null (not 'source-failure'), with the rationale
 - exclusion from confidence/coverage averages via the
   RESILIENCE_RETIRED_DIMENSIONS registry filter
 - the distinction between structural retirement (filtered) and
   runtime coverage=0 (kept so sparse-data countries still flag
   lowConfidence)

Touched:
 - docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
 - server/worldmonitor/resilience/v1/_indicator-registry.ts
   (recoveryFuelStockDays comment block + description field)

No code-behavior change. Docs-only.

Tests: 157 targeted resilience tests pass (incl. methodology-lint +
widget + release-gate + confidence-averaging). Typecheck clean on
both tsconfig.json and tsconfig.api.json.
2026-04-22 23:57:28 +04:00
Elie Habib
c067a7dd63 fix(resilience): include hydroelectric in lowCarbonGenerationShare (PR #3289 follow-up) (#3293)
Greptile P1 review on the merged PR #3289: World Bank EG.ELC.RNEW.ZS
explicitly excludes hydroelectric. The v2 lowCarbonGenerationShare
composite was summing only nuclear + renew-ex-hydro, which would
collapse to ~0 for hydro-dominant economies the moment the
RESILIENCE_ENERGY_V2_ENABLED flag flipped:

  Norway    ~95% hydro → score near 0 on a 0.20-weight indicator
  Paraguay  ~99% hydro → same
  Brazil    ~65% hydro → same
  Canada    ~60% hydro → same

Directly contradicts the plan §3.3 intent of crediting "firm
low-carbon generation" and would produce rankings that contradict
the power-system security framing.

PR #3289 merged before the review landed. This branch applies the
fix against main.

Fix: add EG.ELC.HYRO.ZS as a third series in the composite.

  seed-low-carbon-generation.mjs:
    - INDICATORS: ['EG.ELC.NUCL.ZS', 'EG.ELC.RNEW.ZS'] + 'EG.ELC.HYRO.ZS'
    - fetchLowCarbonGeneration(): sum three series, track latest year
      across all three, same cap-at-100 guard
    - File header comment names the three-series sum with the hydro-
      exclusion rationale + the country list that would break.

  _indicator-registry.ts lowCarbonGenerationShare.description:
  rewritten to name all three WB codes + explain the hydro exclusion.

  country-resilience-index.mdx:
    - Known-limitations item 3 names all three WB codes + country list
    - Energy domain v2 table row names all three WB codes
    - v2.1 changelog Indicators-added bullet names all three WB codes
    - v2.1 changelog New-seeders bullet names all three WB codes on
      seed-low-carbon-generation

No scorer code change (composite lives in the seeder; scorer reads
the pre-summed value from resilience:low-carbon-generation:v1). No
weight change. Flag-off path remains byte-identical.

25 resilience tests pass, typecheck + typecheck:api clean.
2026-04-22 23:47:45 +04:00
Elie Habib
a17a3383d9 feat(variant): Energy Atlas — Release 1 Day 1 (variant scaffolding) (#3291)
* feat(variant): add energy variant scaffolding for energy.worldmonitor.app

Release 1 Day 1 of the Energy Atlas plan — introduces src/config/variants/energy.ts
modeled on the commodity variant. No new panels or RPCs yet; the variant reuses
existing energy-related panels (energy-complex, oil-inventories, hormuz,
energy-crisis, fuel-prices, renewable-energy) + supply-chain/sanctions context.

Map layers enable pipelines, waterways, AIS, commodityPorts, minerals, climate,
outages, natural, weather. All geopolitical/military/tech/finance/happy variant
layers explicitly disabled per variant isolation conventions.

Next PRs on feat/energy-atlas-release-1 add:
- Pipeline & storage registries (curated critical assets, ~75 gas / ~75 oil / ~125 storage)
- Global fuel-shortage registry with automated evidence-threshold promotion
- Pipeline/storage disruption event log
- Country drill-down Energy section
- Atlas landing composition at variant root

* feat(variant): wire energy variant into runtime + atlas landing composition

Day 2 of the Energy Atlas Release 1 plan. The Day-1 commit added a canonical
variants/energy.ts but discovery during Day 2 showed the app's runtime variant
resolution lives in src/config/panels.ts (ENERGY_PANELS/ENERGY_MAP_LAYERS/etc.),
not in variants/*.ts (which are orphans). This commit does the real wiring.

Changes:
- src/config/panels.ts — ENERGY_PANELS, ENERGY_MAP_LAYERS, ENERGY_MOBILE_MAP_LAYERS;
  registered in ALL_PANELS, VARIANT_DEFAULTS, VARIANT_PANEL_OVERRIDES; wired into
  DEFAULT_MAP_LAYERS + MOBILE_DEFAULT_MAP_LAYERS ternaries. Panels at launch:
  map, live-news, insights, energy-complex, oil-inventories, hormuz, energy-crisis,
  fuel-prices, renewable-energy, commodities, energy (news), macro-signals,
  supply-chain, sanctions-pressure, gulf-economies, gcc-investments, climate,
  monitors, world-clock, latest-brief.
- src/config/variant.ts — recognize 'energy' as allowed SITE_VARIANT; resolve
  energy.worldmonitor.app subdomain to 'energy'; honor localStorage override.
- src/config/variant-meta.ts — SEO entry for energy.worldmonitor.app (title,
  description, keywords targeting 'oil pipeline tracker', 'gas storage map',
  'fuel shortage tracker', 'chokepoint monitor', etc.).
- src/app/panel-layout.ts — desktop variant switcher + mobile menu both list
  energy with  icon and t('header.energy') label.
- src/App.ts + src/app/data-loader.ts — energy variant enables trade-policy and
  supply-chain data loads (chokepoint exposure is a core Atlas surface).
- src/app/data-loader.ts — daily-brief newsCategories override for energy variant
  (energy, energy-markets, oil-gas-news, pipeline-news, lng-news).
- src/locales/en.json — 'header.energy' translation key.
- src/config/variants/energy.ts — add clarifying comment that real wiring lives
  in panels.ts (same orphan pattern as commodity.ts/finance.ts/etc.).

Atlas landing composition: the variant now renders its energy panel set with
energy-specific names (Energy Atlas Map, Energy Headlines, AI Energy Insights)
when SITE_VARIANT === 'energy'. Pipeline and commodity-port map layers enabled
so Week 2's pipeline registry + storage-facility registry drop in with layers
already toggled on.

Typecheck clean; 175 pre-push tests expected to remain green.

Subsequent PRs on feat/energy-atlas-release-1:
- Week 2: pipeline registry + storage facility registry (evidence-based)
- Week 3: fuel-shortage classifier + disruption log + country drill-down
- Week 4: automated revision log, SEO polish, launch

* feat(energy): chokepoint strip at top of atlas (7 chokepoints)

Day 3 of the Energy Atlas Release 1 plan. Adds ChokepointStripPanel — a
compact horizontal strip of chip-style cards, one per chokepoint, showing
name + status color + flow-as-%-of-baseline + active-warnings badge.
Ordered by volume: Hormuz, Malacca, Suez, Bab el-Mandeb, Turkish Straits,
Danish Straits, Panama.

GEF covers 5 chokepoints (Hormuz, Malacca, Suez, Bab el-Mandeb, Panama).
We cover 7 — adds Turkish Straits + Danish Straits. One of the surpass
vectors enumerated in §5.7 of the plan doc.

Data: reuses the existing fetchChokepointStatus() RPC backed by
supply_chain:chokepoints:v4 (Portwatch DWT + AIS calibration). No new
backend work; this is pure UI composition.

Changes:
- src/components/ChokepointStripPanel.ts — new Panel subclass with
  in-line CSS for the chip strip; falls back gracefully when a chokepoint
  is missing from the response or FlowEstimate is absent.
- src/components/index.ts — barrel export.
- src/app/panel-layout.ts — import + createPanel registration near
  existing energy panels.
- src/config/panels.ts — ENERGY_PANELS adds 'chokepoint-strip' at priority 1
  (renders near top of atlas). Also fixes two panel-ID mismatches caught
  while wiring: 'hormuz' → 'hormuz-tracker' and 'renewable-energy' →
  'renewable' (matches HormuzPanel.ts and RenewableEnergyPanel registration).

Typecheck clean. No new tests required — panel renders real data.

* feat(energy): attribution footer utility + methodology page stubs

Days 4 & 5 of the Energy Atlas Release 1 plan.

## Day 4 — Attribution footer (src/utils/attribution-footer.ts)

A reusable string-builder that stamps every energy-atlas number with its
provenance. Design intent per plan §5.6 (quantitative rigour moat):
"every flow number carries {value, baseline, n_vessels, methodology,
confidence}".

Input schema:
- sourceType: operator | regulator | ais | satellite | press | classifier | derived
- method: short free-text ("AIS-DWT calibrated", "GIE AGSI+ daily")
- sampleSize + sampleLabel: observation count and unit
- updatedAt: ISO8601 / Date / number — rendered as "Xm/h/d ago"
- confidence: 0..1 — bucketed to high/medium/low
- classifierVersion: surfaced when evidence-derived badges ship in Week 2+
- creditName / creditUrl: CC-BY / dataset credit (OWID, GEM pattern)

Every field also writes data-attributes (data-attr-source, data-attr-n,
data-attr-confidence, data-attr-classifier) so MCP / scraper / analyst
agents can extract the same provenance the reader sees. Agent-native by
default.

Applied to ChokepointStripPanel — the panel now shows its evidence
footer ("AIS calibration · Portwatch DWT + AIS · N AIS disruption
signals · updated Xh ago · EIA World Oil Transit Chokepoints").
Future pipeline / storage / shortage panels drop the same helper in
and hit the same rigour bar automatically.

7 unit tests (tests/attribution-footer.test.mts, node:test via tsx):
minimal footer, method + sample size + credit, "X ago" formatting,
confidence band mapping, full data-attribute emission, credit omission,
HTML escaping.

## Day 5 — Public methodology page stubs (docs/methodology/)

Four new MDX pages surfaced in docs/docs.json navigation under
"Intelligence & Analysis":

- chokepoints.mdx — 7 chokepoints, Portwatch+AIS calibration method,
  status badge derivation, known limits, revision-log link.
- pipelines.mdx — curated critical-asset scope, GEM CC-BY attribution,
  evidence-schema (NOT conclusion labels), freshness SLA, corrections.
- storage.mdx — curated ~125 facilities scope, "published not
  synthesized" fill % policy, country-aggregate fallback, attribution.
- shortages.mdx — automated tiered evidence threshold, LLM second-pass
  gating, auto-decay cadence, evidence-source transparency, break-glass
  override policy (admin-only, off critical path).

All four explicitly document WorldMonitor's automated-data-quality
posture: no human review queues, quality via classifier rigour +
evidence transparency + auto-decay + public revision log.

Typecheck clean. attribution-footer.test.mts passes all 7 tests.

* fix(variant): close three energy-variant isolation leaks from review

Addresses three High findings from PR review:

1. Map-layer isolation (src/config/map-layer-definitions.ts)
   - Add 'energy' to the MapVariant type union.
   - Add energy entry to VARIANT_LAYER_ORDER with the curated energy
     subset (pipelines, waterways, commodityPorts, commodityHubs, ais,
     tradeRoutes, minerals, sanctions, fires, climate, weather, outages,
     natural, resilienceScore, dayNight).
   Without this, getLayersForVariant() and sanitizeLayersForVariant()
   (called from DeckGLMap and App.ts) fell back to VARIANT_LAYER_ORDER.full,
   letting the full geopolitical palette (military, nuclear, iranAttacks,
   conflicts, hotspots, bases, protests, flights, ucdpEvents,
   displacement, gpsJamming, satellites, ciiChoropleth, cables,
   datacenters, economic, cyberThreats, spaceports, irradiators,
   radiationWatch) into the desktop map tray and saved/URL layer
   sanitization — breaking the PR's stated no-geopolitical-bleed goal
   and violating multi-variant-site-data-isolation.

2. News feeds (src/config/feeds.ts + src/app/data-loader.ts)
   - Add ENERGY_FEEDS with three keys matching ENERGY_PANELS: live-news
     (broad energy headlines from OilPrice, Rigzone, Reuters/Bloomberg/FT
     energy), energy (OPEC + crude + NatGas/LNG + pipelines/chokepoints
     + crisis/shortages + refineries), supply-chain (tanker/shipping,
     chokepoints, energy sanctions, ports/terminals).
   - Add SITE_VARIANT === 'energy' branch to the FEEDS variant selector.
   - Correct newsCategories override in data-loader.ts — my earlier
     speculative values ['energy','energy-markets','oil-gas-news',
     'pipeline-news','lng-news'] included keys that did not exist in
     any feed map. Replaced with real ENERGY_FEEDS keys ['live-news',
     'energy', 'supply-chain'].
   Without this, FEEDS resolved to FULL_FEEDS for the energy variant —
   live-news + daily-brief both ingested the world/geopolitical feed set.

3. Insights / AI brief framing (src/components/InsightsPanel.ts)
   - Add SITE_VARIANT === 'energy' branch to geoContext: dedicated energy
     prompt focused on physical supply (pipelines, chokepoints, storage,
     days-of-cover, refineries, LNG, sanctions, shortages) with
     evidence-grounded attribution, no bare conclusions.
   - Add ' ENERGY BRIEF' heading branch in renderWorldBrief().
   Without this, the renamed 'AI Energy Insights' panel fell through
   to the empty default prompt and rendered 'WORLD BRIEF'.

Typecheck clean. attribution-footer tests still pass (no coupling changed).

* fix(variant): close energy-variant leak in SVG/mobile fallback map

Fifth High finding from PR review: src/components/Map.ts createLayerToggles()
(line 381-409) has no 'energy' branch in its variant ternary, so energy-variant
users whose MapContainer routes to the SVG/mobile fallback (no WebGL, mobile
with deviceMemory < 3, or DeckGL init throws) see the full geopolitical toggle
set — iranAttacks, conflicts, hotspots, bases, nuclear, irradiators, military,
protests, flights, gpsJamming, ciiChoropleth, cables, datacenters.

Clicking any toggle flips the layer via toggleLayer() which is variant-blind
(Map.ts:3383) — so users could enable military / nuclear layers on the energy
variant despite the rest of the isolation work in panels.ts,
map-layer-definitions.ts, feeds.ts, and InsightsPanel.ts.

Fix: add energyLayers array with the SVG-capable subset of ENERGY_MAP_LAYERS —
pipelines, waterways, ais, commodityHubs, minerals, sanctions, outages, natural,
weather, fires, economic. Intentionally omitted: commodityPorts, climate,
tradeRoutes, resilienceScore, dayNight — none of these have render handlers in
Map.ts's SVG path, so including them would create toggles that do nothing.
Extended the ternary with 'energy' → energyLayers between 'happy' and the
'full' fallback.

Note (preexisting, NOT fixed here): the same ternary has no 'commodity' branch
either, so commodity.worldmonitor.app also gets the full geopolitical toggle
set on the SVG fallback. Out of scope for this PR; flagged for a separate fix.

Defence-in-depth: sanitizeLayersForVariant() (now fixed in
map-layer-definitions.ts) strips saved-URL layers to the energy subset before
the SVG map sees them, so even if a user arrives with ?layers=military in the
URL, it's gone by the time initialState reaches MapComponent. The toggle-list
fix closes the UI-path leak; the sanitize fix closes the URL-path leak.

Typecheck clean.
2026-04-22 21:37:25 +04:00
Elie Habib
52659ce192 feat(resilience): PR 1 — energy construct repair (flag-gated) (#3289)
* docs(resilience): PR 1 foundation — Option B framing + v2 energy construct spec

First commit in PR 1 of the resilience repair plan. Zero scoring-behaviour
change; sets up the construct contract that the code changes will implement.

Declares the framing decision required by plan section 3.2 before any
scorer code lands: Option B (power-system security) is adopted. Electricity
grids are the dominant short-horizon shock-transmission channel, and the
choice lets the v2 energy indicator set share one denominator (percent of
electricity generation) instead of mixing primary-energy and power-system
measures in a composite.

Methodology doc changes:
  - Energy Domain section now documents both the legacy indicator set
    (still the default) and the v2 indicator set (flag-gated), under a
    single #### Energy H4 heading so the methodology-doc linter still
    asserts dimension-id parity with the registry.
  - v2 indicators: importedFossilDependence (EG.ELC.FOSL.ZS x
    max(EG.IMP.CONS.ZS, 0)), lowCarbonGenerationShare (EG.ELC.NUCL.ZS +
    EG.ELC.RNEW.ZS), powerLossesPct (EG.ELC.LOSS.ZS), reserveMarginPct
    (IEA), euGasStorageStress (renamed + scoped to EU), energyPriceStress
    (retained at 0.15 weight).
  - Retired under v2: electricityConsumption, gasShare, coalShare,
    dependency (all into importedFossilDependence), renewShare.
  - electricityAccess moves from energy to infrastructure under v2.
  - Added a v2.1 changelog section documenting the flag-gated rollout,
    acceptance gates (per plan section 6), and snapshot filenames for
    the post-flag-flip captures.
  - Known-limitations items 1-3 updated to note PR 1 lands the v2
    construct behind RESILIENCE_ENERGY_V2_ENABLED (default off).

Methodology-doc linter + mdx-lint + typecheck all clean. Indicator
registry, seeders, and scorer rewrite land in subsequent commits on
this same branch.

* feat(resilience): PR 1 — RESILIENCE_ENERGY_V2_ENABLED flag + scoreEnergy v2 + registry entries

Second commit in PR 1 of the resilience repair plan. Lands the flag,
the v2 scorer code path, and the registry entries the methodology
doc referenced. Default is flag off; published rankings are unchanged
until the flag flips in a later commit (after seeders land and the
acceptance-gate rerun produces a fresh post-flip snapshot).

Changes:

  - _shared.ts: isEnergyV2Enabled() function reader on the canonical
    RESILIENCE_ENERGY_V2_ENABLED env var. Dynamic read (like
    isPillarCombineEnabled) so tests can flip per-case.

  - _dimension-scorers.ts:
    - New Redis key constants for the three v2 seed keys plus the
      reserved reserveMargin key (seeder deferred per plan §3.1
      open-question).
    - EU_GAS_STORAGE_COUNTRIES set (EU + EFTA + UK) for the renamed
      euGasStorageStress signal per plan §3.5 point 2.
    - isEnergyV2EnabledLocal() — private duplicate of the flag reader
      to avoid a circular import (_shared.ts already imports from
      this module). Same env-var contract.
    - scoreEnergy split into scoreEnergyLegacy() + scoreEnergyV2().
      Public scoreEnergy() branches on the flag. Legacy path is
      byte-identical to the pre-commit behaviour.
    - scoreEnergyV2() reads four new bulk payloads, composes
      importedFossilDependence = fossilElectricityShare × max(netImports, 0)/100
      per plan §3.2, collapses net exporters to 0, and gates
      euGasStorageStress on EU membership so non-EU countries
      re-normalise rather than getting penalised for a regional
      signal.

  - _indicator-registry.ts: four new entries under `dimension: 'energy'`
    with `tier: 'experimental'` — importedFossilDependence (0.35),
    lowCarbonGenerationShare (0.20), powerLossesPct (0.10),
    reserveMarginPct (0.10). Experimental tier keeps them out of the
    Core coverage gate until seed coverage is confirmed.

  - compare-resilience-current-vs-proposed.mjs: new
    'bulk-v1-country-value' shape family in the extraction dispatcher.
    EXTRACTION_RULES now covers the four v2 registry indicators so
    the per-indicator influence harness tracks them from day one.
    When the seeders are absent, pairedSampleSize = 0 and Pearson = 0
    — the harness output surfaces the "no influence yet" state rather
    than silently dropping the indicators.

  - tests/resilience-energy-v2.test.mts: 11 new tests pinning:
    - flag-off = legacy behaviour preserved (v2 seed keys have no
      effect when flag is off — catches accidental cross-path reads)
    - flag-on = v2 composite behaves correctly:
      - lower fossilElectricityShare raises score
      - net exporter with 90% fossil > net importer with 90% fossil
        (max(·, 0) collapse verified)
      - higher lowCarbonGenerationShare raises score (nuclear credit)
      - higher powerLossesPct lowers score
      - euGasStorageStress is invariant for non-EU, responds for DE
      - all v2 inputs absent = graceful degradation, coverage < 1.0

106 resilience tests pass (existing + 11 new). Typecheck clean. Biome
clean. No production behaviour change with flag off (default).

Next commits on this branch: three World Bank seeders for the v2 keys,
health.js + SEED_META registration (gated ON_DEMAND_KEYS until Railway
cron provisions), acceptance-gate rerun at flag-flip time.

* feat(resilience): PR 1 — three WB seeders + health registration for v2 energy construct

Third commit in PR 1. Lands the seed scripts for the three v2 energy
indicator source keys, registered in api/health.js with ON_DEMAND_KEYS
gating until Railway cron provisions.

New seeders (weekly cron cadence, 8d maxStaleMin = 2x interval):
  - scripts/seed-low-carbon-generation.mjs
    Pulls EG.ELC.NUCL.ZS + EG.ELC.RNEW.ZS from World Bank, sums per
    country into `resilience:low-carbon-generation:v1`. Partial
    coverage (one series missing) still emits a value using the
    observed half — the scorer's 0-80 saturating goalpost tolerates
    it and the underlying construct is "firm low-carbon share".

  - scripts/seed-fossil-electricity-share.mjs
    Pulls EG.ELC.FOSL.ZS into `resilience:fossil-electricity-share:v1`.
    Feeds the importedFossilDependence composite at score time
    (composite = fossilShare × max(netImports, 0) / 100 per plan §3.2).

  - scripts/seed-power-reliability.mjs
    Pulls EG.ELC.LOSS.ZS into `resilience:power-losses:v1`. Direct
    grid-integrity signal replacing the retired electricityConsumption
    wealth proxy.

All three follow the existing seed-recovery-*.mjs template:
  - Shape: { countries: { [ISO2]: { value, year } }, seededAt }
  - runSeed() from _seed-utils.mjs with schemaVersion=1, ttl=35d
  - validateFn floor of 150 countries (WB coverage is 150-180 for
    the three indicators; below 150 = transient fetch failure)
  - ISO3 → ISO2 mapping via scripts/shared/iso3-to-iso2.json

No reserveMargin seeder is shipped in this commit per plan §3.1 open
question: IEA electricity-balance coverage is sparse outside OECD+G20,
and the indicator will likely ship as 'unmonitored' with weight 0.05
if it lands at all. The Redis key (`resilience:reserve-margin:v1`) is
reserved in _dimension-scorers.ts so the v2 scorer shape is stable.

api/health.js:
  - SEED_DOMAINS: add `lowCarbonGeneration`, `fossilElectricityShare`,
    `powerLosses` → their Redis keys.
  - SEED_META: same three, pointing at `seed-meta:resilience:*` meta
    keys with maxStaleMin=11520 (8d, per the worldmonitor
    health-maxstalemin-write-cadence pattern: 2x weekly cron).
  - ON_DEMAND_KEYS: three new entries gated as TRANSITIONAL until
    Railway cron provisions and the first clean run completes. Remove
    from this set after ~7 days of green production runs.

Typecheck clean; existing 106 resilience tests pass (seeders have no
in-repo callers yet, so nothing depends on them executing). Real-API
integration tests land when Railway cron is provisioned.

Next commit: Railway cron configuration + bundle-runner wiring.

* feat(resilience): PR 1 — bundle-runner + acceptance-gate verdict + flag-flip runbook

Final commit in the PR 1 tranche. Lands the three remaining pieces so
the flag-flip is fully operable once Railway cron provisions.

  - scripts/seed-bundle-resilience-energy-v2.mjs
    Railway cron bundle wrapping the three v2 energy seeders
    (low-carbon-generation, fossil-electricity-share, power-losses).
    Weekly cadence (7-day intervalMs); the underlying data is annual
    at source so polling more frequently just hammers the World Bank
    API. 5-minute per-script timeout. Mirrors the existing
    seed-bundle-resilience-recovery.mjs pattern.

  - scripts/compare-resilience-current-vs-proposed.mjs: acceptanceGates
    block. Programmatic evaluation of plan §6 gates using the inputs
    the harness already computes:
      gate-1-spearman              Spearman vs baseline >= 0.85
      gate-2-country-drift         Max country drift vs baseline <= 15
      gate-6-cohort-median         Cohort median shift vs baseline <= 10
      gate-7-matched-pair          Every pair holds expected direction
      gate-9-effective-influence   >= 80% Core indicators measurable
      gate-universe-integrity      No cohort/pair endpoint missing from scorable
    Thresholds are encoded in a const so they can't silently soften.
    Output verdict is PASS / CONDITIONAL / BLOCK. Emitted in
    summary.acceptanceVerdict for at-a-glance PR comment pasting, with
    full per-gate detail in acceptanceGates.results.

  - docs/methodology/energy-v2-flag-flip-runbook.md
    Operator runbook for the flag flip. Pre-flip checklist (seeders
    green, health endpoint green, ON_DEMAND_KEYS graduation, Spearman
    verification), flip procedure (pre-flip snapshot, dry-run, cache
    prefix bump, Vercel env flip, post-flip snapshot, methodology
    doc reclassification), rollback procedure, and a reference table
    for the three possible verdict states.

PR 1 is now code-complete pending:
  1. Railway cron provisioning (ops, not code)
  2. Flag flip + acceptance-gate rerun (follows runbook, not code)
  3. Reserve-margin seeder (deferred per plan §3.1 open-question)

Zero scoring-behaviour change in this commit. 121 resilience tests
pass, typecheck clean.

* fix(resilience): PR 1 — drop unseeded reserveMargin from scorer + fix composite extractor

Addresses two P1 review findings on PR #3289.

Finding 1: scoreEnergyV2 read resilience:reserve-margin:v1 at weight
0.10 but no seeder ships in this PR (indicator deferred per plan
§3.1 open-question). On flag flip that slot would be permanently
null, silently renormalizing the remaining 90% of weight and
producing a construct different from what the methodology doc
describes. Fix: remove reserve-margin from the v2 reader +
blend entirely. Redistribute its 0.10 weight to powerLossesPct
(now 0.20); both are grid-integrity signals per plan §3.1, and
the original plan split electricityConsumption's 0.30 weight
across powerLossesPct + reserveMarginPct + importedFossilDependence
— without reserveMarginPct, powerLossesPct carries the shared
grid-integrity load until the IEA seeder ships.

  v2 weights now: 0.35 + 0.20 + 0.20 + 0.10 + 0.15 = 1.00
  (importedFossilDependence + lowCarbonGenerationShare +
   powerLossesPct + euGasStorageStress + energyPriceStress)

  Reserve-margin Redis key constant stays reserved so the v2
  scorer shape is stable when a future commit lands the seeder;
  split 0.10 back out of powerLossesPct at that point.

Methodology doc, _shared.ts flag comment, and v2 test suite all
updated to the 5-indicator shape. New regression test asserts
that changing reserve-margin Redis content has zero effect on
the v2 score — guards against a future commit accidentally
wiring the reader back in without its seeder.

Finding 2: scripts/compare-resilience-current-vs-proposed.mjs
measured importedFossilDependence by reading fossilElectricityShare
alone. The scorer defines it as fossilShare × max(netImports, 0)
/ 100, so the extractor zeroed out net exporters and
under-reported net importers — making gate-9 effective-influence
wrong for the centrepiece construct change of PR 1.

Fix: new 'imported-fossil-dependence-composite' extractor type
in applyExtractionRule that recomputes the same composite from
both inputs (fossilShare bulk payload + staticRecord.iea.
energyImportDependency.value). Stays in lockstep with the
scorer — drift between the two would break gate-9's
interpretation.

New unit tests pin:
  - net importer: 80% × max(60, 0) / 100 = 48 ✓
  - net exporter: 80% × max(-40, 0) / 100 = 0 ✓
  - missing either input → null

64 resilience tests pass; typecheck clean. Flag-off path is
still byte-identical to pre-PR behaviour.

* docs(resilience): PR 1 — align methodology doc with actual shipped indicators and seeders

Addresses P1 review on docs/methodology/country-resilience-index.mdx
lines 29 and 574-575. The doc still described reserveMarginPct as a
shipped v2 indicator and listed seed-net-energy-imports.mjs in the
new-seeders list, neither of which the branch actually ships.

Doc changes to match the code in this branch:

  Known-limitations item 1: restated to describe the actual v2
  replacement footprint — powerLossesPct at 0.20 (temporarily
  absorbing reserveMarginPct's 0.10) plus accessToElectricityPct
  moved to infrastructure. reserveMarginPct is named as a deferred
  companion with the split-out instructions for when its seeder
  lands.

  v2.1 changelog (Indicators added): split into "live in PR 1" and
  "deferred in PR 1" so the reader can distinguish which entries
  match real code. importedFossilDependence's composite formula
  now written out and the net-imports source attributed to the
  existing resilience:static.iea path (not a new seeder).

  v2.1 changelog (New seeders): lists the three actual files that
  ship in this branch (seed-low-carbon-generation, seed-fossil-
  electricity-share, seed-power-reliability) and explicitly notes
  seed-net-energy-imports.mjs is NOT a new seeder — the
  EG.IMP.CONS.ZS series is already fetched by seed-resilience-
  static.mjs. Adds the bundle-runner reference.

Methodology-doc linter + mdx-lint both pass (125/125). Typecheck
clean. Doc is now the source of truth for what PR 1 actually ships.

* fix(resilience): PR 1 — sync powerLossesPct registry weight with scorer (0.10 → 0.20)

Reviewer-caught mismatch between INDICATOR_REGISTRY and scoreEnergyV2.
The previous commit redistributed the deferred reserveMarginPct's 0.10
weight into powerLossesPct in the SCORER but left the REGISTRY entry
unchanged at 0.10. Two downstream effects:

  1. scripts/compare-resilience-current-vs-proposed.mjs copies
     `spec.weight` into `nominalWeight` for gate-9 reporting, so
     powerLossesPct's nominal influence would be under-reported by
     half in every post-flip acceptance run — exactly the harness PR 1
     relies on for merge evidence.
  2. Methodology doc vs registry vs scorer drift is the pattern the
     methodology-doc linter is supposed to catch; it passes here
     because the linter only checks dimension-id parity, not weights.
     Registry is now the only remaining source of truth to keep in
     lockstep with the scorer.

Change:
  - `_indicator-registry.ts` powerLossesPct.weight: 0.1 → 0.2
  - Inline comment names the deferral and instructs: "when the IEA
    electricity-balance seeder lands, split 0.10 back out and restore
    reserveMarginPct at 0.10. Keep this field in lockstep with
    scoreEnergyV2 ... because the PR 0 compare harness copies
    spec.weight into nominalWeight for gate-9 reporting."

Experimental weights per dimension invariant still holds (0.35 + 0.20
+ 0.20 = 0.75 for energy, well under the 1.0 ceiling). 64 resilience
tests pass, typecheck clean.
2026-04-22 17:10:38 +04:00
Elie Habib
da0f26a3cf feat(resilience): PR 0 diagnostic freeze + fairness-audit harness (no scoring changes) (#3284)
* feat(resilience): PR 0 diagnostic freeze + fairness-audit harness

Lands the before-state and measurement apparatus every subsequent
resilience-scorer PR validates against. Zero scoring changes. Per the
v3 plan at docs/plans/2026-04-22-001-fix-resilience-scorer-structural-
bias-plan.md this is tranche 0 of five.

What lands:
- Construct contract published in the methodology doc: absolute
  resilience not development-adjusted, mechanism test for every
  indicator, peer-relative views published separately from the core.
- Known construct limitations section: six construct errors scheduled
  for PR 1-3 repair with explicit mapping to plan tranches.
- Indicator-source manifest at docs/methodology/indicator-sources.yaml
  with source, seriesId, seriesUrl, coveragePct, lastObservedYear,
  license, mechanismTestRationale, and a constructStatus classification.
- Pre-repair ranking snapshot at
  docs/snapshots/resilience-ranking-live-pre-repair-2026-04-22.json
  (217 items + 5 greyedOut, captured 2026-04-22 08:38 UTC at commit
  425507d15).
- Cohort configuration at tests/helpers/resilience-cohorts.mts: six
  cohorts covering 87 countries (net-fuel-exporters, net-energy-
  importers-oecd, nuclear-heavy-generation, coal-heavy-domestic,
  small-island-importers, fragile-states).
- Matched-pair sanity panel at tests/helpers/resilience-matched-pairs.mts:
  six pairs (FR/DE, NO/CA, UAE/BH, JP/KR, IN/ZA, SG/CH) with expected-
  direction rationale and minGap for acceptance gate 7.
- scripts/compare-resilience-current-vs-proposed.mjs extended to emit
  cohortSummary and matchedPairSummary alongside the existing output
  shape (backward compatible).
- tests/resilience-cohort-config.test.mts: 11 validations ensuring the
  cohort + matched-pair configs stay well-formed.

Deferred to PR 0.5 (before PR 1 lands):
- Monotonicity test harness for all 19 dimension scorers pinning the
  sign of every indicator.
- Pearson-derivative variable-influence baseline inside the sensitivity
  script producing the nominal-weight-vs-effective-influence table that
  plan acceptance gate 8 requires.

Verification: typecheck:all clean, 430/430 resilience tests pass,
11/11 new cohort-config tests pass, snapshot auto-discovered and
validated by the existing snapshot-test harness.

* feat(resilience): PR 0 follow-ups — monotonicity harness, variable-influence baseline, cross-consumer formula gate

Completes the PR 0 scope per the v3 plan §5 deliverables. Three adds:

1. Monotonicity test harness
   tests/resilience-dimension-monotonicity.test.mts pins the direction
   of movement for 14 indicators across 7 dimensions (reserve adequacy,
   fiscal space 3x, external debt coverage, import concentration,
   governance WGI, food/water 2x, energy 5x). Each test builds two
   synthetic ResilienceSeedReader fixtures differing only in the target
   indicator and asserts the dimension score moves in the documented
   direction. The scoreEnergy tests explicitly flag three indicators
   (gasShare, coalShare, electricityConsumption) that PR 1 §3.1-3.2
   overturns so future readers understand which directional claims the
   plan intentionally replaces.

2. Variable-influence baseline
   scripts/compare-resilience-current-vs-proposed.mjs now computes
   per-dimension Pearson correlation against the current overallScore
   scaled by the dimension's nominal domain weight (a Pearson-derivative
   approximation of Sobol indices). The output carries a
   variableInfluence[] array sorted by abs(effectiveInfluence) desc.
   Acceptance gate 8 from the plan compares post-change effective
   influence against assigned nominal weight; divergences flag a
   wealth-proxy or saturated-signal construct problem.

3. Cross-consumer formula gate
   Five external consumers of resilience:score:v10:* now filter stale-
   formula entries so a flag flip does not serve mixed-formula data
   downstream:
     - server/worldmonitor/supply-chain/v1/get-route-impact.ts —
       readResilienceScore() checks _formula via the new
       getCurrentCacheFormula export and returns 0 on mismatch.
     - scripts/validate-resilience-correlation.mjs,
       scripts/validate-resilience-backtest.mjs,
       scripts/backtest-resilience-outcomes.mjs,
       scripts/benchmark-resilience-external.mjs — each inlines a
       currentCacheFormulaLocal() helper that mirrors the server's
       formula derivation from env, skips parsed entries whose
       _formula disagrees, and logs the skip count so operators can
       notice a mismatch during the flip window.

A mixed-formula cohort (some countries d6-tagged, others pc-tagged)
would confound every correlation, AUC, and Spearman this repair plan
depends on for its acceptance gates. These guards close that gap.

Verification: typecheck:all clean, 444/444 resilience tests pass
(+14 from the new monotonicity harness).

* fix(resilience): PR 0 review follow-ups — sample-union + doc tense

Two review-driven fixes on top of PR 0.

1. scripts/compare-resilience-current-vs-proposed.mjs — the cohort and
   matched-pair summaries were computed against the historical
   52-country sensitivity seed, which silently excluded the
   small-island-importers cohort (zero members in the seed) and the
   sg-vs-ch matched pair (Singapore not in the seed). With the current
   script those acceptance gates are partially measured at best.

   SAMPLE now = union(historical 52 seed, every cohort member, every
   matched-pair endpoint). The imports for RESILIENCE_COHORTS and
   MATCHED_PAIRS moved from inside main() to module scope so the union
   can be computed before the script runs.

   Net sample size grows from 52 to ~95 countries. Still fast enough
   for an interactive pass; makes the acceptance gates honest.

2. docs/methodology/country-resilience-index.mdx — the construct
   contract wording read as present-tense compliance ("Every indicator
   in the scorer passes a single mechanism test"), which contradicted
   the immediately-following passage about indicators that currently
   fail the test. Reworded to "is being evaluated against" and added
   an explicit PR-0-does-not-change-scoring paragraph that names the
   known-failing indicators (electricityConsumption, gas/coal flat
   penalties, WHO per-capita health spend) and points at the repair
   plan for the replacement schedule.

Verification: typecheck:all clean, 444/444 resilience tests pass.

* fix(resilience): compare-script loads frozen baseline + emits per-indicator influence

Addresses two P1 review findings on PR #3284:

1. Script previously compared current-6d vs proposed-pillar-combined
   from the SAME checkout; never loaded the frozen pre-PR-0 baseline,
   so acceptance gates 2/6/7 ("no country moved >15pts vs baseline",
   cohort median shift vs baseline, matched-pair gap change vs
   baseline) could not be enforced for later scorer PRs.

   Now auto-discovers the most recent
   resilience-ranking-live-pre-repair-<date>.json (or post-<pr>-<date>)
   in docs/snapshots/ and emits a baselineComparison block with:
   spearmanVsBaseline, maxCountryAbsDelta, biggestDriftsVsBaseline,
   cohortShiftVsBaseline, matchedPairGapChange. If no baseline is
   found, the block is emitted with status 'unavailable' so callers
   distinguish missing-baseline from passed-baseline.

2. variableInfluence was emitted only at the dimension level, which
   hid the exact sub-indicators the repair plan targets
   (electricityConsumption, gasShare, coalShare, etc.) inside their
   parent dimension. Added extractIndicatorValues() which pulls twelve
   construct-risk indicators per country from the shared memoized
   reader, then computes per-indicator Pearson correlation against
   the current overall score. Emitted as perIndicatorInfluence[],
   sorted by absolute effective influence.

Acceptance gate 8 ("effective influence agrees in sign and rank-order
with assigned nominal weights") is now computable at the indicator
level, not only at the dimension level.

No production code touched; diagnostic-harness only.

* fix(resilience): baseline-snapshot selection by structured parse, not filename sort

Addresses P1 review on compare-resilience-current-vs-proposed.mjs:118-130.

Plain filename sort breaks the "immediate-prior state" contract two ways:

1. Lexical ordering: `pre-repair` sorts after `post-*`
   (`pr...` to 'r' > 'o'), so the PR-0 freeze would keep winning even
   after post-PR snapshots exist. Later scorer PRs would then report
   acceptance-gate deltas against the original pre-repair freeze
   instead of the immediately-prior post-PR-(N-1) snapshot — the gate
   would appear valid while measuring against the wrong baseline.

2. Lexical ordering: `pr10` < `pr9` (digit-by-digit), so PR-10 would
   lose the selection to PR-9.

Fix: parseBaselineSnapshotMeta() extracts (kind, prNumber, date) from
the filename. Sort keys are (kindRank desc, prNumber desc, date desc):
  - post always beats pre-repair (kindRank 1 vs 0)
  - among posts, prNumber compared numerically (10 beats 9)
  - date breaks ties (same-PR re-snapshots, later capture wins)
  - unlabeled post tags get prNumber 0 so they sort between
    pre-repair and any numbered PR snapshot

Surfaced in output: baselineKind / baselinePrNumber / baselineDate
alongside baselineFile so the operator can verify which snapshot was
selected without having to reopen the file.

Module now isMain-guarded per feedback_seed_isMain_guard memory so
tests can import parseBaselineSnapshotMeta without firing the
scoring run.

Added tests/resilience-baseline-snapshot-ordering.test.mjs (9 tests)
pinning the ordering contract for every known failure mode.

Diagnostic-harness change only. No production code touched.

* fix(resilience): full scorable universe + registry-driven per-indicator influence

Addresses two fresh P1 review findings on the PR 0 compare harness.

Finding 1 — acceptance math ran on a curated ~95-country sample,
so plan gate 2 could miss large regressions in excluded countries.

  - Main scoring loop now iterates the FULL scorable universe
    (listScorableCountries()), not the 52-country seed + cohort union.
  - Removed SAMPLE / HISTORICAL_SENSITIVITY_SEED constants.
  - Added scorableUniverseSize + cohortMissingFromScorable to output
    so operators see universe size and any cohort/pair endpoint that
    listScorable refuses to score (fail-loud, not silent drop).

Finding 3 — per-indicator influence was a hand-picked 12-indicator
subset, hiding most registry indicators from the baseline that
later scorer PRs need.

  - Extraction is now driven by INDICATOR_REGISTRY. Every Core +
    Enrichment indicator gets a row with explicit extractionStatus:
      implemented | not-implemented (with reason) | unregistered-in-harness
  - EXTRACTION_RULES covers 40/59 indicators across 11 shape families
    (static-path, static-wb-infrastructure, static-wgi, static-wgi-mean,
    static-who, energy-mix-field, gas-storage-field, recovery-country-
    field, imf-macro/labor-country-field, national-debt, sanctions-count).
  - Remaining 19 indicators need either a scorer trace hook (PR 0.5)
    or a safe aggregation duplicate; each carries a reason string.
  - extractionCoverage summary (totalIndicators / implemented /
    notImplemented / unregisteredInHarness / coreImplemented / coreTotal)
    exposed in output so PR 0.5 progress is measurable.

Added tests/resilience-indicator-extraction-plan.test.mjs (11 tests)
pinning: every registry entry has an extraction row; not-implemented
rows carry a reason; all 12 plan-named construct-risk indicators stay
extractable; Core-tier coverage floor of 45%; shape-family unit tests.

Diagnostic-harness change only. No production code touched.

* fix(resilience): wire event-aggregate per-indicator influence via exported scorer helpers

Addresses P1 review on PR 0 compare harness. Previous commit marked 16
Core-tier indicators as 'not-implemented' because they needed scorer
event-window/severity-weighting math; that left the gate-9 acceptance
apparatus incomplete for a large part of the shipped score.

Fix: export the scorer-internal aggregation helpers so the harness
calls them directly. Zero aggregation math duplicated in the harness,
harness and scorer cannot drift.

Exported from _dimension-scorers.ts (purely additive):
  summarizeCyber, summarizeOutages, summarizeGps,
  summarizeUcdp, summarizeUnrest, summarizeSocialVelocity,
  getCountryDisplacement, getThreatSummaryScore,
  countTradeRestrictions, countTradeBarriers.

13 extraction rules moved from not-implemented to implemented:
  cyberThreats, internetOutages, infraOutages, gpsJamming,
  ucdpConflict, unrestEvents, socialVelocity, newsThreatScore,
  displacementTotal, displacementHosted, tradeRestrictions,
  tradeBarriers, recoveryConflictPressure, recoveryDisplacementVelocity.

Coverage:
  52/59 total (88%), 46/50 Core-tier (92%).

Four Core indicators remain not-implemented for STRUCTURAL reasons,
NOT missing code. Scorer inputs are genuinely global scalars with
zero per-country variance, so Pearson(indicator, overall) is 0 or
NaN by construction:
  shippingStress, transitDisruption, energyPriceStress — scorer
  reads a global scalar applied to every country; a per-country
  effective signal would need re-expression as (global x per-country
  exposure), which is a derived signal in a different entry.
  aquastatWaterAvailability — needs a distinct sub-indicator path
  resolver; enrichment follow-up.

New test asserts the three no-per-country-variance indicators STAY
not-implemented with a matching reason, so any future extraction
that appears to cover them without fixing the underlying construct
fails.

Dispatcher split into STATIC / SIMPLE / AGGREGATE extractor tables
to stay under biome complexity limit. Core-tier floor test raised
from 45% to 80%.

89 resilience tests pass, typecheck clean, biome clean. No production
behaviour changes.

* fix(resilience): tag-gated AQUASTAT extractor closes the last fixable Core gap

Reviewer flagged aquastatWaterAvailability as the only remaining Core
indicator where the not-implemented status was structurally fixable
rather than conceptually impossible.

Both aquastatWaterStress and aquastatWaterAvailability share a single
.aquastat.value field; the scorer's scoreAquastatValue splits them
by the sibling .aquastat.indicator tag keyword (stress/withdrawal/
dependency to stress family; availability/renewable/access to
availability family). The harness now mirrors this branching:

  - classifyAquastatFamily implements the scorer's priority order
    (stress-family match wins even if the tag also contains an
    availability keyword, matching the sequential if-check at
    _dimension-scorers.ts L770-776).
  - static-aquastat-stress / static-aquastat-availability extractors
    return the value only when the family matches, so stress-family
    readings never corrupt the availability Pearson and vice versa.

Core-tier coverage: 46/50 to 47/50 (94%). The 3 remaining Core
not-implemented indicators (shippingStress, transitDisruption,
energyPriceStress) are all structural impossibilities: scorer inputs
are global scalars with zero per-country variance.

New contract test pins both directions of the tag gate plus the
priority-order edge case (a tag containing both families' keywords
routes to stress).

90 resilience tests pass, typecheck clean, biome clean.
2026-04-22 16:44:12 +04:00
Elie Habib
fbaf07e106 feat(resilience): flag-gated pillar-combined score activation (default off) (#3267)
Wires the non-compensatory 3-pillar combined overall_score behind a
RESILIENCE_PILLAR_COMBINE_ENABLED env flag. Default is false so this PR
ships zero behavior change in production. When flipped true the
top-level overall_score switches from the 6-domain weighted aggregate
to penalizedPillarScore(pillars) with alpha 0.5 and pillar weights
0.40 / 0.35 / 0.25.

Evidence from docs/snapshots/resilience-pillar-sensitivity-2026-04-21:
- Spearman rank correlation current vs proposed 0.9935
- Mean score delta -13.44 points (every country drops, penalty is
  always at most 1)
- Max top-50 rank swing 6 positions (Russia)
- No ceiling or floor effects under plus/minus 20pct perturbation
- Release gate PASS 0/19

Code change in server/worldmonitor/resilience/v1/_shared.ts:
- New isPillarCombineEnabled() reads env dynamically so tests can flip
  state without reloading the module
- overallScore branches on (isPillarCombineEnabled() AND
  RESILIENCE_SCHEMA_V2_ENABLED AND pillars.length > 0); otherwise falls
  through to the 6-domain aggregate (unchanged default path)
- RESILIENCE_SCORE_CACHE_PREFIX bumped v9 to v10
- RESILIENCE_RANKING_CACHE_KEY bumped v9 to v10

Cache invalidation: the version bump forces both per-country score
cache and ranking cache to recompute from the current code path on
first read after a flag flip. Without the bump, 6-domain values cached
under the flag-off path would continue to serve for up to 6-12 hours
after the flip, producing a ragged mix of formulas.

Ripple of v9 to v10:
- api/health.js registry entry
- scripts/seed-resilience-scores.mjs (both keys)
- scripts/validate-resilience-correlation.mjs,
  scripts/backtest-resilience-outcomes.mjs,
  scripts/validate-resilience-backtest.mjs,
  scripts/benchmark-resilience-external.mjs
- tests/resilience-ranking.test.mts 24 fixture usages
- tests/resilience-handlers.test.mts
- tests/resilience-scores-seed.test.mjs explicit pin
- tests/resilience-pillar-aggregation.test.mts explicit pin
- docs/methodology/country-resilience-index.mdx

New tests/resilience-pillar-combine-activation.test.mts:
7 assertions exercising the flag-on path against the release fixtures
with re-anchored bands (NO at least 60, YE/SO at most 40, NO greater
than US preserved, elite greater than fragile). Regression guard
verifies flipping the flag back off restores the 6-domain aggregate.

tests/resilience-ranking-snapshot.test.mts: band thresholds now
resolve from a METHODOLOGY_BANDS table keyed on
snapshot.methodologyFormula. Backward compatible (missing formula
defaults to domain-weighted-6d bands).

Snapshots:
- docs/snapshots/resilience-ranking-2026-04-21.json tagged
  methodologyFormula domain-weighted-6d
- docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json
  new: top/bottom/major-economies tables projected from the
  52-country sensitivity sample. Explicitly tagged projected (NOT a
  full-universe live capture). When the flag is flipped in production,
  run scripts/freeze-resilience-ranking.mjs to capture the
  authoritative full-universe snapshot.

Methodology doc: Pillar-combined score activation section rewritten to
describe the flag-gated mechanism (activation is an env-var flip, no
code deploy) and the rollback path.

Verification: npm run typecheck:all clean, 397/397 resilience tests
pass (up from 390, +7 activation tests).

Activation plan:
1. Merge this PR with flag default false (zero behavior change)
2. Set RESILIENCE_PILLAR_COMBINE_ENABLED=true in Vercel and Railway env
3. Redeploy or wait for next cold start; v9 to v10 bump forces every
   country to be rescored on first read
4. Run scripts/freeze-resilience-ranking.mjs against the flag-on
   deployment and commit the resulting snapshot
5. Ship a v2.0 methodology-change note explaining the re-anchored
   scale so analysts understand the universal ~13 point score drop is
   a scale rebase, not a country-level regression

Rollback: set RESILIENCE_PILLAR_COMBINE_ENABLED=false, flush
resilience:score:v10:* and resilience:ranking:v10 keys (or wait for
TTLs). The 6-domain formula stays alongside the pillar combine in
_shared.ts and needs no code change to come back.
2026-04-22 06:52:07 +04:00
Elie Habib
502bd4472c docs(resilience): sync methodology/proto/widget to 6-domain + 3-pillar reality (#3264)
Brings every user-facing surface into alignment with the live resilience
scorer. Zero behavior change: overall_score is still the 6-domain
weighted aggregate, schemaVersion is still 2.0 default, and every
existing test continues to pass.

Surfaces touched:
- proto + OpenAPI: rewrote the ResiliencePillar + schema_version
  descriptions. 2.0 is correctly documented as default; shaped-but-empty
  language removed.
- Widget: added missing recovery: 'Recovery' label (was rendering
  literal lowercase recovery before), retitled footer data-version chip
  from Data to Seed date so it is clear the value reflects the static
  seed bundle not every live input, rewrote help tooltip for 6 domains
  and 3 pillars and called out the 0.25 recovery weight.
- Methodology doc: domains-and-weights table now carries all 6 rows
  with actual code weights (0.17/0.15/0.11/0.19/0.13/0.25), Recovery
  section header weight corrected from 1.0 to 0.25, new Pillar-combined
  score activation (pending) section with the measured Spearman 0.9935,
  top-5 movers, and the activation checklist.
- documentation.mdx + features.mdx: product blurbs updated from 5
  domains and 13 dimensions to 6 domains and 19 dimensions grouped into
  3 pillars.
- Tests: recovery-label regression pin, Seed date label pin, clarified
  pillar-schema degenerate-input semantics.

New scaffolding for defensibility:
- docs/snapshots/resilience-ranking-2026-04-21.json frozen published
  tables artifact with methodology metadata and commit SHA.
- docs/snapshots/resilience-pillar-sensitivity-2026-04-21.json live
  Redis capture (52-country sample) combining sensitivity stability
  with the current-vs-proposed Spearman comparison.
- scripts/freeze-resilience-ranking.mjs refresh script.
- scripts/compare-resilience-current-vs-proposed.mjs comparison script.
- tests/resilience-ranking-snapshot.test.mts 13 assertions auto
  discovered from any resilience-ranking-YYYY-MM-DD.json in snapshots.

Verification: npm run typecheck:all clean, 390/390 resilience tests
pass.

Follow-up: pillar-combined score activation. The sensitivity artifact
shows rank-preservation Spearman 0.9935 and no ceiling effects, which
clears the methodological bar. Blocker is messaging because every
country drops ~13 points under the penalty, so activation PR ships with
re-anchored release-gate bands, refreshed frozen ranking, and a v2.0
methodology note.
2026-04-21 22:37:27 +04:00
Elie Habib
4c9888ac79 docs(mintlify): panel reference pages (PR 2) (#3213)
* docs(mintlify): add user-facing panel reference pages (PR 2)

Six new end-user pages under docs/panels/ for the shipped panels that
had no user-facing documentation in the published docs, per the plan
docs/plans/2026-04-19-001-feat-docs-user-facing-ia-refresh-plan.md.
All claims are grounded in the live component source + SEED_META +
handler dirs — no invented fields, counts, or refresh windows.

- panels/latest-brief.mdx — daily AI brief panel (ready/composing/
  locked states). Hard-gated PRO (`premium: 'locked'`).
- panels/forecast.mdx — AI Forecasts panel (internal id `forecast`,
  label "AI Forecasts"). Domain + macro-region filter pills; 10%
  probability floor. Free on web, locked on desktop.
- panels/consumer-prices.mdx — 5-tab retail-price surface (Overview
  / Categories / Movers / Spread / Health) with market, basket, and
  7/30/90-day controls. Free.
- panels/disease-outbreaks.mdx — WHO / ProMED / national health
  ministries outbreak alerts with alert/warning/watch pills. Free.
- panels/radiation-watch.mdx — EPA RadNet + Safecast observations
  with anomaly scoring and source-confidence synthesis. Free.
- panels/thermal-escalation.mdx — FIRMS/VIIRS thermal clusters with
  persistence and conflict-adjacency flags. Free.

Also:
- docs/docs.json — new Panels nav group (Latest Brief, AI Forecasts,
  Consumer Prices, Disease Outbreaks, Radiation Watch, Thermal
  Escalation).
- docs/features.mdx — cross-link every panel name in the Cmd+K
  inventory to its new page (and link Country Instability + Country
  Resilience from the same list).
- docs/methodology/country-resilience-index.mdx — short "In the
  dashboard" bridge section naming the three CRI surfaces
  (Resilience widget, Country Deep-Dive, map choropleth) so the
  methodology page doubles as the user-facing panel reference for
  CRI. No separate docs/panels/country-resilience.mdx — keeps the
  methodology page as the single source of truth.

* docs(panels): fix Latest Brief polling description

Reviewer catch: the panel does schedule a 60-second re-poll while
in the composing state. `COMPOSING_POLL_MS = 60_000` at
src/components/LatestBriefPanel.ts:78, and `scheduleComposingPoll()`
is called from `renderComposing()` at :366. The poll auto-promotes
the panel to ready without a manual refresh and is cleared when
the panel leaves composing. My earlier 'no polling timer' line was
right for the ready state but wrong as a blanket claim.

* docs(panels): fix variant-availability claims across all 6 panel pages

Reviewer catch on consumer-prices surfaced the same class of error
on 4 other panel pages: I described variant availability with loose
phrasing ('most variants', 'where X context is relevant', 'tech/
finance/happy opt-in') that didn't match the actual per-variant
panel registries in src/config/panels.ts.

Verified matrix against each *_PANELS block directly:

  Panel              | FULL | TECH | FINANCE | HAPPY | COMMODITY
  consumer-prices    | opt  |  -   |  def    |   -   |  def
  latest-brief       | def  | def  |  def    |   -   |  def   (all PRO-locked)
  disease-outbreaks  | def  |  -   |   -     |   -   |   -
  radiation-watch    | def  |  -   |   -     |   -   |   -
  thermal-escalation | def  |  -   |   -     |   -   |   -
  forecast           | def  |  -   |   -     |   -   |   -   (PRO-locked on desktop)

All 6 pages now name the exact variant blocks in src/config/panels.ts
that register them, so the claim is re-verifiable by grep rather than
drifting with future panel-registry changes.

* docs(panels): fix 5 reviewer findings — no invented controls/sources/keys

All fixes cross-checked against source.

- consumer-prices: no basket selector UI exists. The panel has a
  market bar, a range bar, and tab/category affordances; basket is
  derived from market selection (essentials-<code>, or DEFAULT_BASKET
  for the 'all' aggregate view). Per
  src/components/ConsumerPricesPanel.ts:120-123 and :216-229.
- disease-outbreaks: 'Row click opens advisory' was wrong. The only
  interactive elements in-row are the source-name <a> link
  (sanitised URL, target=_blank); clicking the row itself is a no-op
  (the only content-level listener is for [data-filter] pills and
  the search input). Per DiseaseOutbreaksPanel.ts:35-49,115-117.
- disease-outbreaks: upstream list was wrong. Actual seeder uses
  WHO DON (JSON API), CDC HAN (RSS), Outbreak News Today
  (aggregator), and ThinkGlobalHealth disease tracker
  (ProMED-sourced, 90d lookback). Noted the in-panel tooltip's
  shorter 'WHO, ProMED, health ministries' summary and gave the full
  upstream list with the 72h Redis TTL. Per seed-disease-outbreaks
  .mjs:31-38.
- radiation-watch: summary bar renders 6 cards, not 7 — Anomalies,
  Elevated, Confirmed, Low Confidence, Conflicts, Spikes. The
  CPM-derived indicator is a per-row badge (radiation-flag-converted
  at :67), not a summary card. Moved the CPM reference to the
  per-row badges list. Per RadiationWatchPanel.ts:85-112.
- latest-brief: Redis key shape corrected. The composer writes the
  envelope to brief:{userId}:{issueSlot} (where issueSlot comes from
  issueSlotInTz, not a plain date) and atomically writes a latest
  pointer at brief:latest:{userId} → {issueSlot}. Readers resolve
  via the pointer. 7-day TTL on both. Per
  seed-digest-notifications.mjs:1103-1115 and
  api/latest-brief.ts:80-89.

* docs(panels): Tier 1 — PRO/LLM panel reference pages (9)

Adds user-facing panel pages for the 9 PRO/LLM-backed surfaces flagged
in the extended audit. All claims grounded in component source +
src/config/panels.ts entries (with line cites).

- panels/chat-analyst.mdx — WM Analyst (conversational AI, 5 quick
  actions, 4 domain scopes, POSTs /api/chat-analyst via premiumFetch).
- panels/market-implications.mdx — AI Market Implications trade signals
  (LONG/SHORT/HEDGE × HIGH/MEDIUM/LOW, transmission paths, 120min
  maxStaleMin, degrade-to-warn). Carries the repo's disclaimer verbatim.
- panels/deduction.mdx — Deduct Situation (opt-in PRO; 5s cooldown;
  composes buildNewsContext + active framework).
- panels/daily-market-brief.mdx — Daily Market Brief (stanced items,
  framework selector, live vs cached source badge).
- panels/regional-intelligence.mdx — Regional Intelligence Board
  (7 BOARD_REGIONS, 6 structured blocks + narrative sections,
  request-sequence arbitrator, opt-in PRO).
- panels/strategic-posture.mdx — AI Strategic Posture (cached posture
  + live military vessels → recalcPostureWithVessels; free on web,
  enhanced on desktop).
- panels/stock-analysis.mdx — Premium Stock Analysis (per-ticker
  deep dive: signal, targets, consensus, upgrades, insiders, sparkline).
- panels/stock-backtest.mdx — Premium Backtesting (longitudinal view;
  live vs cached data badge).
- panels/wsb-ticker-scanner.mdx — WSB Ticker Scanner (retail sentiment
  + velocity score with 4-tier color bucketing).

All 9 are PRO (8 via apiKeyPanels allowlist at src/config/panels.ts:973,
strategic-posture is free-on-web/enhanced-on-desktop). Variant matrices
name the exact *_PANELS block registering each panel.

* docs(panels): Tier 2 — flagship free data panels (7)

Adds reference pages for 7 flagship free panels. Every claim grounded
in the panel component + src/config/panels.ts per-variant registration.

- panels/airline-intel.mdx — 6-tab aviation surface (ops/flights/
  airlines/tracking/news/prices), 8 aviation RPCs, user watchlist.
- panels/tech-readiness.mdx — ranked country tech-readiness index with
  6-hour in-panel refresh interval.
- panels/trade-policy.mdx — 6-tab trade-policy surface (restrictions/
  tariffs/flows/barriers/revenue/comtrade).
- panels/supply-chain.mdx — composite stress + carriers + minerals +
  Scenario Engine trigger surface (free panel, PRO scenario activation).
- panels/sanctions-pressure.mdx — OFAC SDN + Consolidated list
  pressure rollup with new/vessels/aircraft summary cards and top-8
  country rows.
- panels/hormuz-tracker.mdx — Hormuz chokepoint drill-down; status
  indicator + per-series bar charts; references Scenario Engine's
  hormuz-tanker-blockade template.
- panels/energy-crisis.mdx — IEA 2026 Energy Crisis Policy Response
  Tracker; category/sector/status filters.

All 7 are free. Variant matrices name exact *_PANELS blocks
registering each panel.

* docs(panels): Tier 3 — compact panels (5)

Adds reference pages for 5 compact user-facing panels.

- panels/world-clock.mdx — 22 global market-centre clocks with
  exchange labels + open/closed indicators (client-side only).
- panels/monitors.mdx — personal keyword alerts, localStorage-persisted;
  links to Features → Custom Monitors for longer explanation.
- panels/oref-sirens.mdx — OREF civil-defence siren feed; active +
  24h wave history; free on web, PRO-locked on desktop (_desktop &&
  premium: 'locked' pattern).
- panels/telegram-intel.mdx — topic-tabbed Telegram channel mirror
  via relay; free on web, PRO-locked on desktop.
- panels/fsi.mdx — US KCFSI + EU FSI stress composites with
  four-level colour buckets (Low/Moderate/Elevated/High).

All 5 grounded in component source + variant registrations.
oref-sirens and telegram-intel correctly describe the _desktop &&
locking pattern rather than the misleading 'PRO' shorthand used
earlier for other desktop-locked panels.

* docs(panels): Tier 4 + 5 catalogue pages, nav re-grouping, features cross-links

Closes out the comprehensive panel-reference expansion. Two catalogue
pages cover the remaining ~60 panels collectively so they're all
searchable and findable without dedicated pages per feed/tile.

- panels/news-feeds.mdx — catalogue covering all content-stream panels:
  regional news (africa/asia/europe/latam/us/middleeast/politics),
  topical news (climate/crypto/economic/markets/mining/commodity/
  commodities), tech/startup streams (startups/unicorns/accelerators/
  fintech/ipo/layoffs/producthunt/regionalStartups/thinktanks/vcblogs/
  defense-patents/ai-regulation/tech-hubs/ai/cloud/hardware/dev/
  security/github), finance streams (bonds/centralbanks/derivatives/
  forex/institutional/policy/fin-regulation/commodity-regulation/
  analysis), happy variant streams (species/breakthroughs/progress/
  spotlight/giving/digest/events/funding/counters/gov/renewable).
- panels/indicators-and-signals.mdx — catalogue covering compact
  market-indicator tiles, correlation panels, and misc signal surfaces.
  Grouped by function: sentiment, macro, calendars, market-structure,
  commodity, crypto, regional economy, correlation panels, misc signals.

docs/docs.json — split the Panels group into three for navigability:
  - Panels — AI & PRO (11 pages)
  - Panels — Data & Tracking (16 pages)
  - Panels — Catalogues (2 pages)

docs/features.mdx — Cmd+K inventory rewritten as per-family sub-lists
  with links to every panel page (or catalogue page for the ones
  that live in a catalogue). Replaces the prior run-on paragraph.

Every catalogue panel is also registered in at least one *_PANELS
block in src/config/panels.ts — the catalogue pages note this and
point readers to the config file for variant-availability details.

* docs(panels): fix airline-intel + world-clock source-of-truth errors

- airline-intel: refresh behavior section was wrong on two counts.
  (1) The panel DOES have a polling timer: a 5-minute setInterval
  in the constructor calling refresh() (which reloads ops + active
  tab). (2) The 'prices' tab does NOT re-fetch on tab switch —
  it's explicitly excluded from both tab-switch and auto-refresh
  paths, loading only on explicit search-button click. Three
  distinct refresh paths now documented with source line hints.
  Per src/components/AirlineIntelPanel.ts ~:173 (setInterval),
  :287 (prices tab-switch guard), :291 (refresh() prices skip).
- world-clock: the WORLD_CITIES list has 30 entries, not '~22'.
  Replaced the approximate count with the exact number and a
  :14-43 line-range cite so it's re-verifiable.
2026-04-20 00:53:17 +04:00
Elie Habib
5939b78f4d chore(resilience): Phase 2 scorecard + v2.0 changelog + schemaVersion flag flip (10/10) (#2993)
* chore(resilience): Phase 2 scorecard + v2.0 changelog + schemaVersion flag flip

* fix(resilience): A9 checkbox + cache version reference (#2993 review)
2026-04-12 10:24:41 +04:00
Elie Habib
676331607a feat(resilience): three-pillar aggregation with penalized weighted mean (T2.3) (#2990)
* feat(resilience): three-pillar aggregation with penalized weighted mean (Phase 2 T2.3)

Wire real three-pillar scoring: structural-readiness (0.40), live-shock-exposure
(0.35), recovery-capacity (0.25). Add penalizedPillarScore formula with alpha=0.50
penalty factor for backtest tuning. Set recovery domain weight to 0.25 and
redistribute existing domain weights proportionally to sum to 1.0. Bump cache
keys v8 to v9. The penalized formula is exported and tested but overallScore
stays as the v1 domain-weighted sum until the flag flips in PR 10.

* fix(resilience): update test description v8 to v9 (#2990 review)

Test descriptions said "(v8)" but assertions check v9 cache keys.
2026-04-12 10:18:42 +04:00
Elie Habib
17e34dfca7 feat(resilience): recovery capacity pillar — 6 new dimensions + 5 seeders (Phase 2 T2.2b) (#2987)
* feat(resilience): recovery capacity pillar — 6 new dimensions + 5 seeders (Phase 2 T2.2b)

Add the recovery-capacity pillar with 6 new dimensions:
- fiscalSpace: IMF GGR_G01_GDP_PT + GGXCNL_G01_GDP_PT + GGXWDG_NGDP_PT
- reserveAdequacy: World Bank FI.RES.TOTL.MO
- externalDebtCoverage: WB DT.DOD.DSTC.CD / FI.RES.TOTL.CD ratio
- importConcentration: UN Comtrade HHI (stub seeder)
- stateContinuity: derived from WGI + UCDP + displacement (no new fetch)
- fuelStockDays: IEA/EIA (stub seeder, Enrichment tier)

Each dimension has a scorer in _dimension-scorers.ts, registry entries in
_indicator-registry.ts, methodology doc subsections, and fixture data.

Seeders: fiscal-space (real, IMF WEO), reserve-adequacy (real, WB API),
external-debt (real, WB API), import-hhi (stub), fuel-stocks (stub).

Recovery domain weight is 0 until PR 4 (T2.3) ships the penalized weighted
mean across pillars. The domain appears in responses structurally but does
not affect the overall score.

Bootstrap: STANDALONE_KEYS + SEED_META + EMPTY_DATA_OK_KEYS + ON_DEMAND_KEYS
all updated in api/health.js. Source-failure mapping updated for
stateContinuity (WGI adapter). Widget labels and LOCKED_PREVIEW updated.

All 282 resilience tests pass, typecheck clean, methodology lint clean.

* fix(resilience): ISO3→ISO2 normalization in WB recovery seeders (#2987 P1)

Both seed-recovery-reserve-adequacy.mjs and seed-recovery-external-debt.mjs
used countryiso3code from the World Bank API response then immediately
rejected codes where length !== 2. WB returns ISO3 codes (USA, DEU, etc.),
so all real rows were silently dropped and the feed was always empty.

Fix: import scripts/shared/iso3-to-iso2.json and normalize before the
length check. Also removed from EMPTY_DATA_OK_KEYS in health.js since
empty results now indicate a real failure, not a structural absence.

* fix(resilience): remove unused import + no-op overrides (#2987 review)

* fix(test): update release-gate to expect 6 domains after recovery pillar
2026-04-12 10:10:10 +04:00
Elie Habib
fa3cca8c7e feat(resilience): cross-index benchmark — INFORM/ND-GAIN/WRI/FSI (Phase 2 T2.4) (#2985)
* feat(resilience): cross-index benchmark — INFORM/ND-GAIN/WRI/FSI (Phase 2 T2.4)

* fix(resilience): INFORM scoreCol guard + Redis check + cleanup (#2985 review)
2026-04-12 09:52:37 +04:00
Elie Habib
49a2c54819 chore(resilience): T1.9 cache-key health sync + Phase 1 scorecard close-out (#2965)
* chore(resilience): T1.9 cache-key health sync + Phase 1 scorecard close-out

Final PR of the Phase 1 reference-grade upgrade. Closes out the Phase 1
acceptance gate with two deliverables:

1. T1.9 cache-key / health-registry sync regression test

   New tests/resilience-cache-keys-health-sync.test.mts asserts that
   the RESILIENCE_RANKING_CACHE_KEY string from _shared.ts literally
   appears in api/health.js, and that the score/history prefixes match
   the expected resilience:<kind>:v<n>: shape. This guards against
   future silent version drift where a cache-key bump in _shared.ts
   could leave the health registry watching the old key indefinitely.

   No cache keys were bumped in Phase 1 because every schema addition
   (imputationClass, freshness) was additive with default fallbacks on
   the existing resilience:score:v7 / ranking:v8 / history:v4 keys.

2. Methodology changelog v1.1 + Phase 1 self-assessment scorecard

   docs/methodology/country-resilience-index.mdx:
   - v1.0 entry trimmed to only reference the actual v1.0 baseline PRs
     (#2821, #2847, #2858). Moved from "Current published version" to
     "Baseline".
   - New v1.1 entry inserted between v1.0 and v2.0, marked as "Current
     published version". Lists all Phase 1 tasks T1.1-T1.9 with the
     specific PRs that shipped each slice.
   - New "Scorecard (v1.1 self-assessment)" section at the end of the
     changelog with ratings on six standard composite-indicator review
     axes: Methodology 7.5, Explainability 7.5, Reproducibility 8.0,
     Source quality 7.0, Timeliness 6.5, Sensitivity 7.0. Every axis has
     a named rationale and a named gap tied to a Phase 2 or Phase 3
     task. Both Phase 1 required thresholds (Methodology >=7.5,
     Explainability >=7.5) are met.

   docs/internal/country-resilience-upgrade-plan.md:
   - Phase 1 acceptance checklist: 7 of 7 items flipped to [x] with
     PR references.
   - T1.9 task bullet closed out with a reference to this PR.

What is deliberately NOT in this PR

- No code changes to the scorer, response builder, or widget.
- No new proto fields or schema changes.
- No cache-key bumps.
- No external expert review (Phase 3 T3.8b, runs in parallel).
- No Phase 2 work.

Depends on nothing at code level; references PRs 2959/2961/2962/2964
which are still in the review queue. This PR can land independently.

Verified

- tests/resilience-cache-keys-health-sync.test.mts 3/3 passing
- methodology doc linter (T1.8) still passes on the updated mdx
- lint:md clean
- test:data clean (4355/4355)
- typecheck clean
- typecheck:api clean

* docs(resilience): correct Timeliness scorecard gap (#2965 P2)

Greptile P2 finding: the Timeliness row in the Phase 1 self-assessment
scorecard claimed "no real-time signals in v1.1" and described
conflict events and power outages as Phase 2 additions, which is
factually wrong. Thirteen stress-side indicators already run at
realtime or daily cadence via the cross-source stack:

  realtime: ucdpConflict, internetOutages, infraOutages,
            unrestEvents, socialVelocity
  daily:    sanctionCount, cyberThreats, gpsJamming,
            shippingStress, transitDisruption, gasStorageStress,
            energyPriceStress, newsThreatScore

The real cadence gap is that structural sources (WGI, GPI, RSF, WHO,
IMF macro) are annual and still carry the majority of index weight,
while the live-shock pillar is already rolling. Phase 2 T2.2 adds
FX volatility at daily cadence to narrow the gap on the
currency-external dimension.

Score itself unchanged (6.5), it was a defensible number. Only
the gap rationale is corrected to match reality.
2026-04-12 00:43:23 +04:00
Elie Habib
dca2e1ca3c feat(resilience): expose imputationClass on ResilienceDimension (T1.7 schema pass) (#2959)
* feat(resilience): expose imputationClass on ResilienceDimension (T1.7 schema pass)

Ships the Phase 1 T1.7 schema pass of the country-resilience reference
grade upgrade plan. PR #2944 shipped the classifier table foundation
(ImputationClass type, ImputationEntry interface, IMPUTATION/IMPUTE
tagged with four semantic classes) and explicitly deferred the schema
propagation. This PR lands that propagation so downstream consumers can
distinguish "country is stable" from "country is unmonitored" from
"upstream is down" from "structurally not applicable" on a per-dimension
basis.

What this PR commits

- Proto: new imputation_class string field on ResilienceDimension
  (empty string = dimension has any observed data; otherwise one of
  stable-absence, unmonitored, source-failure, not-applicable).
- Generated TS types: regenerated service_server.ts and service_client.ts
  via make generate.
- Scorer: ResilienceDimensionScore carries ImputationClass | null.
  WeightedMetric carries an optional imputationClass that imputation
  paths populate. weightedBlend aggregates the dominant class by
  weight when the dimension is fully imputed, returns null otherwise.
- All IMPUTE.* early-return paths propagate the class from the table
  (IMPUTE.bisEer, IMPUTE.wtoData, IMPUTE.ipcFood, IMPUTE.unhcrDisplacement).
- Response builder: _shared.ts buildDimensionList passes the class
  through to the ResilienceDimension proto field.
- Tests: weightedBlend aggregation semantics (5 cases), dimension-level
  propagation from IMPUTE tables, serialized response includes the field.

What is deliberately NOT in this PR

- No widget icon rendering (T1.6 full grid, PR 3 of 5)
- No source-failure seed-meta consultation (PR 4 of 5)
- No freshness field (T1.5 propagation, PR 2 of 5)
- No cache key bump: the new field is empty-string default, existing
  cached responses continue to deserialize cleanly

Verified

- make generate clean
- npm run typecheck + typecheck:api clean
- tests/resilience-dimension-scorers.test.mts all passing (existing + new)
- tests/resilience-*.test.mts + test:data suite passing (4361 tests)
- npm run lint exits 0

* fix(resilience): normalize cached score responses on read (#2959 P2)

Greptile P2 finding on PR #2959: cachedFetchJson and
getCachedResilienceScores return pre-change payloads verbatim, so a
resilience:score:v7 entry written before this PR lands lacks the
imputationClass field. Downstream consumers that read
dim.imputationClass get undefined for up to 6 hours until the cache
TTL expires.

Fix: add normalizeResilienceScoreResponse helper that defaults
missing optional fields in place and apply it at both read sites.
Defaults imputationClass to empty string, matching the proto3 default
for the new imputation_class field.

- ensureResilienceScoreCached applies the normalizer after
  cachedFetchJson returns.
- getCachedResilienceScores applies it after each successful
  JSON.parse on the pipeline result.
- Two new test cases: stale payload without imputationClass gets
  defaulted, present values are preserved.
- Not bumping the cache key: stale-read defaults are safe, the key
  bump would invalidate every cached score for a 6-hour cold-start
  cycle. The normalizer is extensible when PR #2961 adds freshness
  to the same payload.

P3 finding (broken docs reference) verified invalid: the proto
comment points to docs/methodology/country-resilience-index.mdx,
which IS the current file. The .md predecessor was renamed in
PR #2945 (T1.3 methodology doc promotion to CII parity). No change
needed to the comment.

* fix(resilience): bump score cache key v7 to v8, drop normalizer (#2959 P2)

Second fixup for the Greptile P2 finding on #2959. The previous fixup
(40ea22009) added normalizeResilienceScoreResponse to default missing
imputationClass fields on cached payloads to empty string. The
reviewer correctly pushed back: defaulting to empty string is the
proto3 default for "dimension has observed data", which silently
misreports pre-rollout imputed dimensions as observed until the 6h
TTL expires.

Correct fix: bump RESILIENCE_SCORE_CACHE_PREFIX from resilience:score:v7:
to resilience:score:v8:. Invalidates every pre-change cache entry, so
the next request per country repopulates with the correct
imputationClass written by the scorer. Cost: a 6h warmup cycle where
first-request-per-country recomputes the score, ~100ms per country
across hundreds of requests.

Also deletes the normalizeResilienceScoreResponse helper and its two
call sites. It was misleading defense-in-depth that can hide future
schema drift bugs. Future additive field additions should bump the
key, not silently default fields.

- server/worldmonitor/resilience/v1/_shared.ts: prefix v7 to v8,
  delete normalizer function and both call sites.
- scripts/seed-resilience-scores.mjs, validate-resilience-correlation.mjs,
  validate-resilience-backtest.mjs: mirror constants bumped.
- tests/resilience-scores-seed.test.mjs: pin literal v7 to v8.
- tests/resilience-ranking.test.mts: 7 hardcoded cache keys bumped.
- tests/resilience-handlers.test.mts: stray v7 cache key bumped.
- tests/resilience-release-gate.test.mts: the two normalizer test
  cases from 40ea22009 deleted along with the helper.
- docs/methodology/country-resilience-index.mdx: Redis keys table
  updated from v7 to v8 to match the canonical constant.

P3 (broken docs reference) confirmed invalid a second time.
docs/methodology/country-resilience-index.mdx exists on origin/main
AND on the PR branch with the same blob hash
d2ab1ebad3. docs/methodology/resilience-index.md
does not exist on either. No proto comment change.
2026-04-11 23:50:27 +04:00
Elie Habib
5e4d6c87ed docs(resilience): promote methodology to .mdx at CII parity (T1.3) (#2945)
* docs(resilience): promote methodology to .mdx at CII parity (T1.3)

Why this PR?

Ships Phase 1 T1.3 of the country-resilience reference-grade upgrade
plan: promote the v1.0 methodology document from a scratch markdown
file to a published MDX page at parity with the Country Instability
Index (docs/country-instability-index.mdx).

Before this PR there was no publicly documented resilience methodology
page, only an internal markdown draft. The reference-grade upgrade
plan's origin review explicitly called out methodological opacity as
the biggest gap keeping the product below OECD/JRC standards, and
committed to documenting every dimension, formula, goalpost, cadence,
weight rationale, and imputation rule before any Phase 2 schema work
begins. This PR is that documentation.

What this PR commits:

- git mv docs/methodology/resilience-index.md to
  docs/methodology/country-resilience-index.mdx so git history tracks
  the rename; edits land as modifications on the new path.
- Adds MDX frontmatter (title + description) matching CII shape so the
  page renders correctly in the docs site and shows up in search.
- Prepends a short v1.0 scope note that distinguishes the currently
  shipping behavior (this doc) from the planned v2.0 three-pillar
  rebuild (tracked in the reference-grade upgrade plan). Readers who
  land on the page know exactly what is live and what is coming.
- Fixes an orphan table row (`| Food & Water | Mixed |`) that had
  drifted into the Overall Score section from an earlier edit and
  was breaking rendering around the score-formula block.
- Rewrites the Imputation Taxonomy section to use the formal
  four-class naming from T1.7 (stable-absence, unmonitored,
  source-failure, not-applicable) and adds a mapping table from each
  concrete IMPUTATION / IMPUTE entry to its class. This is the
  public-facing surface of the T1.7 foundation PR.
- Adds a Reproducibility Appendix listing every Redis key used by the
  scorer (score cache, ranking cache, history sorted set, intervals,
  seed-meta, static record, static index), the meaning of the
  `dataVersion` field, and a step-by-step procedure for reproducing
  any published country score by hand from a Redis snapshot.
- Adds a Changelog section with a v1.0 entry that references the
  prerequisite PRs (#2821, #2847, #2858) and the Phase 1 work landing
  so far (T1.1 regression test, T1.4 dataVersion widget wire,
  T1.7 imputation taxonomy foundation), plus a v2.0 placeholder that
  summarizes the reference-grade upgrade plan for readers.
- Adds editorial notes pointing at the Phase 1 T1.8 methodology doc
  linter that will enforce parity between this document and the
  indicator registry in _dimension-scorers.ts.

What is deliberately NOT in this PR:

- No v2.0 content presented as shipping. The three-pillar rebuild,
  recovery capacity pillar, penalized weighted mean aggregation,
  cross-index benchmark, and annual Reference Edition are all tracked
  in the reference-grade upgrade plan but not yet implemented. The
  changelog v2.0 section names them as planned work, not current
  behavior.
- No methodology doc linter. That is T1.8 and ships separately so it
  can be reviewed as a tooling change.
- No deletion of the old .md path. git mv preserves history; the old
  location renders a 404 which is expected.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Testing:
- Markdown-only change, no code.
- npm run typecheck: clean
- Pre-push hook (typecheck + build:full + version:check) passes.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(resilience): add v1.0 content to methodology mdx (T1.3 followup)

Companion commit to 4e7f64aaa (which was a rename-only because git
add after git mv re-staged the original file content). This commit
adds the actual v1.0 content the T1.3 task requires:

- MDX frontmatter (title + description) matching the Country
  Instability Index shape so the page renders in the docs site.
- Short v1.0 scope note that distinguishes the currently shipping
  behavior (this doc) from the planned v2.0 three-pillar rebuild
  (tracked in the reference-grade upgrade plan).
- Removes an orphan table row (`| Food & Water | Mixed |`) from the
  Overall Score section that had drifted from an earlier edit.
- Rewrites the Imputation Taxonomy section to use the formal
  four-class naming from T1.7 (stable-absence, unmonitored,
  source-failure, not-applicable) with a mapping table from each
  concrete IMPUTATION / IMPUTE entry to its class.
- Reproducibility Appendix listing every Redis key used by the
  scorer, the dataVersion semantics, and a step-by-step reproduction
  procedure.
- Changelog section with v1.0 (prerequisite PRs + T1.1 / T1.4 / T1.7
  landing) and v2.0 placeholder summarizing the reference-grade
  upgrade plan.
- Editorial notes pointing at the T1.8 methodology doc linter that
  will enforce parity between this document and the indicator
  registry.

Net change: 89 insertions, 13 deletions, file grows from 300 to 376
lines.

Docs-only, no code, no runtime impact.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(resilience): address PR #2945 review (dead link + roads reuse note)

Addresses two review comments on PR #2945:

1. **P2 Medium: broken reference in the v1.0 scope note at line 8.**
   The intro linked to `../internal/country-resilience-upgrade-plan.md`,
   which resolves to `docs/internal/country-resilience-upgrade-plan.md`.
   That file lives on PR #2938 and is not in this PR's tree or on
   origin/main, so until #2938 merges the link would render as a 404
   on the docs site. Dropped the markdown link and kept the text
   reference to "a separate reference-grade upgrade plan" so readers
   still know context exists elsewhere. The existing prose reference
   to the same file at line 365 is already inside backticks and is
   therefore a plain-text citation, not a link, so it stays.

2. **Additional suggestion: roads series shared between two
   dimensions.** `roadsPavedLogistics` (Logistics & Supply, weight
   0.50) and `roadsPavedInfra` (Infrastructure, weight 0.35) both
   read from World Bank `IS.ROD.PAVE.ZS`. Added an explicit note
   under the Infrastructure dimension table clarifying that this is
   deliberate source reuse (Logistics uses it as a transit-viability
   proxy, Infrastructure uses it as a baseline-public-capital proxy)
   and pointing forward to the v2.0 plan's consolidation of shared
   upstream signals into a single indicator registry. No change to
   the indicator tables themselves; this is a clarifying paragraph.

No content semantics change beyond the clarifying paragraph and the
dead-link removal. Docs-only, no code.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 19:44:04 +04:00
Elie Habib
7ddf6707b9 docs(resilience): add methodology page for the Country Resilience Index (#2864)
* docs(resilience): add methodology page for the Country Resilience Index

Documents domains, weights, dimensions, scoring formula, normalization,
missing data handling, data sources, and supplementary fields.

* fix: add blank line before list to pass markdownlint MD032

* fix(docs): correct formula to domain-weighted sum + remove stale version numbers
2026-04-09 12:22:22 +04:00