Commit Graph

213 Commits

Author SHA1 Message Date
Elie Habib
8cca8d19e3 feat(resilience): Comtrade-backed re-export-share seeder + SWF Redis read (#3385)
* feat(seed): BUNDLE_RUN_STARTED_AT_MS env + runSeed SIGTERM cleanup

Prereq for the re-export-share Comtrade seeder (plan 2026-04-24-003),
usable by any cohort seeder whose consumer needs bundle-level freshness.

Two coupled changes:

1. `_bundle-runner.mjs` injects `BUNDLE_RUN_STARTED_AT_MS` into every
   spawned child. All siblings in a single bundle run share one value
   (captured at `runBundle` start, not spawn time). Consumers use this
   to detect stale peer keys — if a peer's seed-meta predates the
   current bundle run, fall back to a hard default rather than read
   a cohort-peer's last-week output.

2. `_seed-utils.mjs::runSeed` registers a `process.once('SIGTERM')`
   handler that releases the acquired lock and extends existing-data
   TTL before exiting 143. `_bundle-runner.mjs` sends SIGTERM on
   section timeout, then SIGKILL after KILL_GRACE_MS (5s). Without
   this handler the `finally` path never runs on SIGKILL, leaving
   the 30-min acquireLock reservation in place until its own TTL
   expires — the next cron tick silently skips the resource.
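
A minimal sketch of the cleanup path in item 2, for illustration only.
`installSigtermCleanup` is a hypothetical helper, `releaseLock` and
`extendExistingTtl` are assumed async callbacks, and `proc` is injected in
place of `process` so the flow can be exercised without real signals. The
real `runSeed` internals are not shown in this log.

```javascript
// Sketch only: hypothetical helper; callbacks are assumed to be async.
function installSigtermCleanup(proc, { releaseLock, extendExistingTtl }) {
  const handler = async () => {
    // allSettled: a failure in one cleanup step must not skip the other.
    await Promise.allSettled([releaseLock(), extendExistingTtl()]);
    proc.exit(143); // 128 + 15 (SIGTERM)
  };
  // `once` means a second SIGTERM does not re-enter the handler.
  proc.once('SIGTERM', handler);
  // Returns an uninstaller so the caller can disarm the handler later.
  return () => proc.removeListener('SIGTERM', handler);
}
```

In a real seeder the first argument would be `process`; the returned
uninstaller lets the caller limit the handler to the window where cleanup
is safe.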

Regression guard memory: `bundle-runner-sigkill-leaks-child-lock` (PR
#3128 root cause).

Tests added:
- bundle-runner env injection (value within run bounds)
- sibling sections share the same timestamp (critical for the
  consumer freshness guard)
- runSeed SIGTERM path: exit 143 + cleanup log
- process.once contract: second SIGTERM does not re-enter handler

* fix(seed): address P1/P2 review findings on SIGTERM + bundle contracts

Addresses PR #3384 review findings (todos 256, 257, 259, 260):

#256 (P1) — SIGTERM handler narrowed to fetch phase only. Was installed
at runSeed entry and armed through every `process.exit` path; could
race `emptyDataIsFailure: true` strict-floor exits (IMF-External,
WB-bulk) and extend seed-meta TTL when the contract forbids it —
silently re-masking 30-day outages. Now the handler is attached
immediately before `withRetry(fetchFn)` and removed in a try/finally
that covers all fetch-phase exit branches.

#257 (P1) — `BUNDLE_RUN_STARTED_AT_MS` now has a first-class helper.
Exported `getBundleRunStartedAtMs()` from `_seed-utils.mjs` with JSDoc
describing the bundle-freshness contract. Fleet-wide helper so the
next consumer seeder imports instead of rediscovering the idiom.

#259 (P2) — SIGTERM cleanup runs `Promise.allSettled` on disjoint-key
ops (`releaseLock` + `extendExistingTtl`). Serialising compounded
Upstash latency during the exact failure mode (Redis degraded) this
handler exists to handle, risking breach of the 5s SIGKILL grace.

#260 (P2) — `_bundle-runner.mjs` asserts topological order on
optional `dependsOn` section field. Throws on unknown-label refs and
on deps appearing at a later index. Fleet-wide contract replacing
the previous prose-comment ordering guarantee.
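
The `dependsOn` check in #260 could look roughly like the sketch below. The
section shape (`{ label, dependsOn }`) and the function name
`assertTopologicalOrder` are assumptions; only the two failure modes
(unknown label, dependency at a later index) come from the text above.

```javascript
// Sketch only: the section shape ({ label, dependsOn }) is assumed.
function assertTopologicalOrder(sections) {
  const indexByLabel = new Map(sections.map((s, i) => [s.label, i]));
  sections.forEach((section, i) => {
    for (const dep of section.dependsOn ?? []) {
      if (!indexByLabel.has(dep)) {
        throw new Error(`${section.label}: unknown dependsOn label "${dep}"`);
      }
      // A dependency at the same or a later index has not run yet.
      if (indexByLabel.get(dep) >= i) {
        throw new Error(`${section.label}: dependency "${dep}" must run earlier`);
      }
    }
  });
}
```

Checking once at load time means a misordered bundle fails before any
child is spawned, rather than mid-run.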

Tests added/updated:
- New: SIGTERM handler removed after fetchFn completes (narrowed-scope
  contract — post-fetch SIGTERM must NOT trigger TTL extension)
- New: dependsOn unknown-label + out-of-order + happy-path (3 tests)

Full test suite: 6,866 tests pass (+4 net).

* fix(seed): getBundleRunStartedAtMs returns null outside a bundle run

Review follow-up: the earlier `Math.floor(Date.now()/1000)*1000` fallback
regressed standalone (non-bundle) runs. A consumer seeder invoked
manually just after its peer wrote `fetchedAt = (now - 5s)` would see
`bundleStartMs = Date.now()`, reject the perfectly-fresh peer envelope
as "stale", and fall back to defaults — defeating the point of the
peer-read path outside the bundle.

Returning null when `BUNDLE_RUN_STARTED_AT_MS` is unset/invalid keeps
the freshness gate scoped to its real purpose (across-bundle-tick
staleness) and lets standalone runs skip the gate entirely. Consumers
check `bundleStartMs != null` before applying the comparison; see the
companion `seed-sovereign-wealth.mjs` change on the stacked PR.
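
A minimal sketch of the null-returning helper, under stated assumptions:
the real function lives in `_seed-utils.mjs` and is not shown here, and
the exact definition of "invalid" (non-integer or non-positive) is a guess
made for this example. The `env` parameter is added only to make the
sketch testable.

```javascript
// Sketch only: validity rules below are assumptions, not the shipped code.
function getBundleRunStartedAtMs(env = process.env) {
  const raw = env.BUNDLE_RUN_STARTED_AT_MS;
  if (raw === undefined || raw === '') return null; // standalone run
  const ms = Number(raw);
  return Number.isSafeInteger(ms) && ms > 0 ? ms : null;
}
```

Returning `null` rather than a synthesized timestamp forces callers to
decide explicitly what a standalone run means.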

* test(seed): SIGTERM cleanup test now verifies Redis DEL + EXPIRE calls

Greptile review P2 on PR #3384: the existing test only asserted exit
code + log line, not that the Redis ops were actually issued. The
log claim was ahead of the test.

Fixture now logs every Upstash fetch call's shape (EVAL / pipeline-
EXPIRE / other) to stderr. Test asserts:

- >=1 EVAL op was issued during SIGTERM cleanup (releaseLock Lua
  script on the lock key)
- >=1 pipeline-EXPIRE op was issued (extendExistingTtl on canonical
  + seed-meta keys)
- The EVAL body carries the runSeed-generated runId (proves it's
  THIS run's release, not a phantom op)
- The EXPIRE pipeline touches both the canonicalKey AND the
  seed-meta key (proves the keys[] array was built correctly
  including the extraKeys merge path)

Full test suite: 6,866 tests pass, typecheck clean.

* feat(resilience): Comtrade-backed re-export-share seeder + SWF Redis read

Plan ref: docs/plans/2026-04-24-003-feat-reexport-share-comtrade-seeder-plan.md

Motivating case. Before this PR, the SWF `rawMonths` denominator for
the `sovereignFiscalBuffer` dimension used GROSS annual imports for
every country. For re-export hubs (goods transiting without domestic
settlement), this structurally under-reports resilience: UAE's 2023
$941B of imports include $334B of transit flow that never represents
domestic consumption. Net imports = gross × (1 − reexport_share).

The previous (PR 3A) design flattened a hand-curated YAML into Redis;
the YAML shipped empty and never populated, so the correction never
applied and the cohort audit showed no movement.

Gap #2 (this PR). Two coupled changes to make the correction actually
apply:

1. Comtrade-backed seeder (`scripts/seed-recovery-reexport-share.mjs`).
   Rewritten to fetch UN Comtrade `flowCode=RX` (re-exports) and
   `flowCode=M` (imports) per cohort member, compute share = RX/M at
   the latest co-populated year, clamp to [0.05, 0.95], publish the
   envelope. Header auth (`Ocp-Apim-Subscription-Key`) — subscription
   key never reaches URL/logs/Redis. `maxRecords=250000` cap with
   truncation detection. Sequential + retry-on-429 with backoff.

   Hub cohort resolved by Phase 0 empirical probe (plan §Phase 0):
   ['AE', 'PA']. Six candidates (SG/HK/NL/BE/MY/LT) return HTTP 200
   with zero RX rows — Comtrade doesn't expose RX for those reporters.

2. SWF seeder reads from Redis (`scripts/seed-sovereign-wealth.mjs`).
   Swaps `loadReexportShareByCountry()` (YAML) for
   `loadReexportShareFromRedis()` (Redis key written by #1). Guarded
   by bundle-run freshness: if the sibling Reexport-Share seeder's
   `seed-meta` predates `BUNDLE_RUN_STARTED_AT_MS` (set by the
   prereq PR's `_bundle-runner.mjs` env-injection), HARD fallback
   to gross imports rather than apply last-month's stale share.
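
The share and net-imports arithmetic can be sketched as follows. The
function names echo the test list in this commit, but the signatures and
bodies here are assumptions; the UAE figures are the ones quoted in the
motivating case.

```javascript
// Sketch only: signatures and bodies are illustrative.
const clampShare = (share) => Math.min(0.95, Math.max(0.05, share));

function computeShareFromFlows(reexports, imports) {
  if (!(imports > 0) || !(reexports >= 0)) return null; // no usable year
  return clampShare(reexports / imports);
}

const computeNetImports = (gross, share) => gross * (1 - share);

// UAE 2023 figures quoted above, in $B.
const share = computeShareFromFlows(334, 941);
const net = computeNetImports(941, share);
console.log(share.toFixed(3), net.toFixed(0)); // prints 0.355 607
```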

Health registries. Both new keys are registered in BOTH `api/health.js`
SEED_META (60-day alert threshold) and `api/seed-health.js`
SEED_DOMAINS (43200min interval), per
feedback_two_health_endpoints_must_match.

Bundle wiring. `seed-bundle-resilience-recovery` Reexport-Share
timeout bumped 60s → 300s (Comtrade + retry can take 2-3 min
worst-case). Ordering preserved: Reexport-Share before Sovereign-
Wealth so the SWF seeder reads a freshly-written key in the same
cron tick.

Deletions. YAML + loader + 7 obsolete loader tests removed; single
source of truth is now Comtrade → Redis.

Prereq. Stacks on PR #3384 (feat/bundle-runner-env-sigterm)
which adds BUNDLE_RUN_STARTED_AT_MS env injection + runSeed
SIGTERM cleanup. This PR's bundle-freshness guard depends on
that env variable.

Tests (19 new, 7 deleted, +12 net):
- Pure math: parseComtradeFlowResponse, computeShareFromFlows,
  clampShare, declareRecords + credential-leak source scan (15)
- Integration (Gap #2 regression guards): SWF seeder
  loadReexportShareFromRedis, covering fresh/absent/malformed/stale-meta/
  missing-meta (5)
- Health registry dual-registry drift guard — scoped to this PR's
  keys, respecting pre-existing asymmetry (4)
- Bundle-ordering + timeout assertions (2)

Phase 0 cohort validation committed to plan. Full test suite
passes: 6,881 tests.

* fix(resilience): address P1/P2 review findings — adopt shared helpers, pin freshness boundary

Addresses PR #3385 review findings:

#257 (P1) consumer — `seed-sovereign-wealth.mjs` imports the shared
`getBundleRunStartedAtMs` helper from `_seed-utils.mjs` (added in the
prereq commit) instead of its own `getBundleStartMs`. Single source of
truth for the bundle-freshness contract.

#258 (P2) — `seed-recovery-reexport-share.mjs` isMain guard uses the
canonical `pathToFileURL(process.argv[1]).href === import.meta.url`
form instead of basename-suffix matching. Handles symlinks, case-
different paths on macOS HFS+, and Windows path separators without
string munging.

#260 (P2) consumer — Sovereign-Wealth declares `dependsOn:
['Reexport-Share']` in the bundle spec. `_bundle-runner.mjs` (prereq
commit) now enforces topological order on load and throws on
violation — replaces the previous prose-comment ordering contract.

#261 (P2) — added a test to `tests/seed-sovereign-wealth-reads-redis-
reexport-share.test.mts` pinning the inclusive-boundary semantic:
`fetchedAtMs === bundleStartMs` must be treated as FRESH. Guards
against a future refactor to `<=` that would silently reject peers
writing at the very first millisecond of the bundle run.

Rebased onto updated prereq. Full test suite: 6,886 tests pass (+5 net).

* fix(resilience): freshness gate skipped in standalone mode; meta still required

Review catch: the previous `bundleStartMs = Date.now()` fallback made
standalone/manual `seed-sovereign-wealth.mjs` runs ALWAYS reject any
previously-seeded re-export-share meta as "stale" — even when the
operator ran the Reexport seeder milliseconds beforehand. Defeated
the point of the peer-read path outside the bundle.

With `getBundleRunStartedAtMs()` now returning null outside a bundle
(companion commit on the prereq branch), the consumer only applies
the freshness gate when `bundleStartMs != null`. Standalone runs
accept any `fetchedAt` — the operator is responsible for ordering.

Two guards survive the change:
- Meta MUST exist (absence = peer-outage fail-safe, both modes)
- In-bundle: meta MUST be at or after `BUNDLE_RUN_STARTED_AT_MS`
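
The two surviving guards can be sketched as a single predicate.
`isPeerMetaUsable` is a hypothetical name and the meta shape is assumed;
the real check lives in `seed-sovereign-wealth.mjs` and is not shown in
this log.

```javascript
// Sketch only: hypothetical helper; `meta.fetchedAt` is epoch milliseconds.
function isPeerMetaUsable(meta, bundleStartMs) {
  // Guard 1 (both modes): missing meta means the peer never wrote.
  if (!meta || !Number.isFinite(meta.fetchedAt)) return false;
  // Standalone run: no bundle start, so the freshness gate is skipped.
  if (bundleStartMs == null) return true;
  // Guard 2 (in-bundle): inclusive boundary, equal timestamps are fresh.
  return meta.fetchedAt >= bundleStartMs;
}
```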

Two new tests pin both modes:
- standalone: accepts meta written 10 min before this process started
- standalone: still rejects missing meta (peer-outage fail-safe
  survives gate bypass)

Rebased onto updated prereq. Full test suite: 6,888 tests (+2 net).

* fix(resilience): filter world-aggregate Comtrade rows + skip final-retry sleep

Greptile review of PR #3385 flagged two P2s in the Comtrade seeder.

Finding #3 (parseComtradeFlowResponse double-count risk):
`cmdCode=TOTAL` without a partner filter currently returns only
world-aggregate rows in practice — but `parseComtradeFlowResponse`
summed every row unconditionally. A future refactor adding per-
partner querying would silently double-count (world-aggregate row +
partner-level rows for the same year), cutting the derived share in
half with no test signal.

Fix: explicit `partnerCode ∈ {'0', 0, null/undefined}` filter. Matches
current empirical behavior (aggregate-only responses) and makes the
construct robust to a future partner-level query.
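
A sketch of the world-aggregate filter, assuming response rows carry
`period` and `primaryValue` fields (those two names are assumptions about
the row shape; only `partnerCode` is named in this commit).
`sumWorldAggregateByYear` is a hypothetical stand-in for the relevant part
of `parseComtradeFlowResponse`.

```javascript
// Sketch only: `period` and `primaryValue` are assumed field names.
function sumWorldAggregateByYear(rows) {
  const isWorld = (code) => code === '0' || code === 0 || code == null;
  const totals = new Map();
  for (const row of rows) {
    if (!isWorld(row.partnerCode)) continue; // skip per-partner rows
    const value = Number(row.primaryValue);
    if (!Number.isFinite(value)) continue;
    totals.set(row.period, (totals.get(row.period) ?? 0) + value);
  }
  return totals;
}
```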

Finding #4 (wasted backoff on final retry):
429 and 5xx branches slept `backoffMs` before `continue`, but on
`attempt === RETRY_MAX_ATTEMPTS` the loop condition fails immediately
after — the sleep was pure waste. Added early-return (parallel to the
existing pattern in the network-error catch branch) so the final
attempt exits the retry loop at the first non-success response
without extra latency.
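
The retry shape after the fix could look roughly like this. `fetchOnce`
and `sleep` are injected, hypothetical callbacks; the constant name mirrors
the text above, but its value and the linear backoff are assumptions.

```javascript
// Sketch only: value of RETRY_MAX_ATTEMPTS and the backoff curve are assumed.
const RETRY_MAX_ATTEMPTS = 3;

async function fetchWithRetry(fetchOnce, sleep, backoffMs = 1000) {
  for (let attempt = 1; attempt <= RETRY_MAX_ATTEMPTS; attempt++) {
    const res = await fetchOnce(attempt);
    if (res.ok) return res;
    const retryable = res.status === 429 || res.status >= 500;
    // Final attempt (or non-retryable status): return now, no wasted sleep.
    if (!retryable || attempt === RETRY_MAX_ATTEMPTS) return res;
    await sleep(backoffMs * attempt);
  }
}
```

With three attempts that all return 429, this sleeps twice rather than
three times.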

Tests:
- 3 new `parseComtradeFlowResponse` variants: world-only filter,
  numeric-0 partnerCode shape, rows without partnerCode field
- Existing tests updated: the double-count assertion replaced with
  a "per-partner rows must NOT sum into the world-aggregate total"
  assertion that pins the new contract

Rebased onto updated prereq. Full test suite: 6,890 tests (+2 net).
2026-04-25 00:14:17 +04:00
Elie Habib
184e82cb40 feat(resilience): PR 3A — net-imports denominator for sovereignFiscalBuffer (#3380)
PR 3A of cohort-audit plan 2026-04-24-002. Construct correction for
re-export hubs: the SWF rawMonths denominator was gross imports, which
double-counted flow-through trade that never represents domestic
consumption. Net-imports fix:

  rawMonths = aum / (grossImports × (1 − reexportShareOfImports)) × 12

applied to any country in the re-export share manifest. Countries NOT
in the manifest get gross imports unchanged (status-quo fallback).

Plan acceptance gates — verified synthetically in this PR:

  Construct invariant. Two synthetic countries, same SWF, same gross
  imports. A re-exports 60%; B re-exports 0%. Post-fix, A's rawMonths
  is 2.5× B's (1/(1-0.6) = 2.5). Pinned in
  tests/resilience-net-imports-denominator.test.mts.

  SWF-heavy exporter invariant. Country with share ≤ 5%: rawMonths
  lift < 5% vs baseline (negligible). Pinned.
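
The construct invariant can be checked with a few lines of arithmetic. The
function name and argument order below are illustrative, not the seeder's
actual code; the AUM and import figures are arbitrary.

```javascript
// Sketch only: illustrative helper mirroring the rawMonths formula above.
function rawMonths(aum, grossImports, reexportShareOfImports = 0) {
  const denominatorImports = grossImports * (1 - reexportShareOfImports);
  return (aum / denominatorImports) * 12;
}

// Same SWF and gross imports; A re-exports 60%, B re-exports 0%.
const a = rawMonths(100, 400, 0.6);
const b = rawMonths(100, 400, 0);
console.log((a / b).toFixed(2)); // prints 2.50
```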

What shipped

1. Re-export share manifest infrastructure.
   - scripts/shared/reexport-share-manifest.yaml (new, empty) — schema
     committed; entries populated in follow-up PRs with UNCTAD
     Handbook citations.
   - scripts/shared/reexport-share-loader.mjs (new) — loader + strict
     validator, mirrors swf-manifest-loader.mjs.
   - scripts/seed-recovery-reexport-share.mjs (new) — publishes
     resilience:recovery:reexport-share:v1 from manifest. Empty
     manifest = valid (no countries, no adjustment).

2. SWF seeder uses net-imports denominator.
   - scripts/seed-sovereign-wealth.mjs exports computeNetImports(gross,
     share) — pure helper, unit-tested.
   - Per-country loop: reads manifest, computes denominatorImports,
     applies to rawMonths math.
   - Payload records annualImports (gross, audit), denominatorImports
     (used in math), reexportShareOfImports (provenance).
   - Summary log reports which countries had a net-imports adjustment
     applied with source year.

3. Bundle wiring.
   - Reexport-Share runs BEFORE Sovereign-Wealth in the recovery
     bundle so the SWF seeder reads fresh re-export data in the same
     cron tick.
   - tests/seed-bundle-resilience-recovery.test.mjs expected-entries
     updated (6 → 7) with ordering preservation.

4. Cache-prefix bump (per cache-prefix-bump-propagation-scope skill).
   - RESILIENCE_SCORE_CACHE_PREFIX: v11 → v12
   - RESILIENCE_RANKING_CACHE_KEY: v11 → v12
   - RESILIENCE_HISTORY_KEY_PREFIX: v6 → v7 (history rotation prevents
     30-day rolling window from mixing pre/post-fix scores and
     manufacturing false "falling" trends on deploy day).
   - Source of truth: server/worldmonitor/resilience/v1/_shared.ts
   - Mirrored in: scripts/seed-resilience-scores.mjs,
     scripts/validate-resilience-correlation.mjs,
     scripts/backtest-resilience-outcomes.mjs,
     scripts/validate-resilience-backtest.mjs,
     scripts/benchmark-resilience-external.mjs, api/health.js
   - Test literals bumped in 4 test files (26 line edits).
   - EXTENDED tests/resilience-cache-keys-health-sync.test.mts with
     a parity pass that reads every known mirror file and asserts
     both (a) canonical prefix present AND (b) no stale v<older>
     literals in non-comment code. Found one legacy log-line that
     still referenced v9 (scripts/seed-resilience-scores.mjs:342)
     and refactored it to use the RESILIENCE_RANKING_CACHE_KEY
     constant so future bumps self-update.
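
A rough sketch of the stale-literal half of the parity pass in item 4. The
key stem, the regex, and the comment handling are illustrative and are not
taken from the real test; `keyStem` is assumed to contain no regex
metacharacters.

```javascript
// Sketch only: not the real parity test; handles `//` comments naively.
function findStaleVersionLiterals(source, keyStem, currentVersion) {
  const stale = [];
  const pattern = new RegExp(`${keyStem}:v(\\d+)`, 'g');
  source.split('\n').forEach((line, i) => {
    const code = line.replace(/\/\/.*$/, ''); // ignore trailing // comments
    for (const match of code.matchAll(pattern)) {
      if (Number(match[1]) < currentVersion) {
        stale.push({ line: i + 1, literal: match[0] });
      }
    }
  });
  return stale;
}
```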

Explicitly NOT in this PR

- liquidReserveAdequacy denominator fix. The plan's PR 3A wording
  mentions both dims, but the RESERVES ratio (WB FI.RES.TOTL.MO) is a
  PRE-COMPUTED WB series; applying a post-hoc net-imports adjustment
  mixes WB's denominator year with our manifest-year, and the math
  change belongs in PR 3B (unified liquidity) where the α calibration
  is explicit. This PR stays scoped to sovereignFiscalBuffer.
- Live re-export share entries. The manifest ships EMPTY in this PR;
  entries with UNCTAD citations are one-per-PR follow-ups so each
  figure is individually auditable.

Verified

- tests/resilience-net-imports-denominator.test.mts — 9 pass (construct
  contract: 2.5× ratio gate, monotonicity, boundary rejections,
  backward-compat on missing manifest entry, cohort-proportionality,
  SWF-heavy-exporter-unchanged)
- tests/reexport-share-loader.test.mts — 7 pass (committed-manifest
  shape + 6 schema-violation rejections)
- tests/resilience-cache-keys-health-sync.test.mts — 5 pass (existing 3
  + 2 new parity checks across all mirror files)
- tests/seed-bundle-resilience-recovery.test.mjs — 17 pass (expected
  entries bumped to 7)
- npm run test:data — 6714 pass / 0 fail
- npm run typecheck / typecheck:api — green
- npm run lint / lint:md — clean

Deployment notes

Score + ranking + history cache prefixes all bump in the same deploy.
Per established v10→v11 precedent (and the cache-prefix-bump-
propagation-scope skill):
- Score / ranking: 6h TTL — the new prefix populates via the Railway
  resilience-scores cron within one tick.
- History: 30d ring — the v7 ring starts empty; the first 30 days
  post-deploy lack baseline points, so trend / change30d will read as
  "no change" until v7 accumulates a window.
- Legacy v11 keys can be deleted from Redis at any time post-deploy
  (no reader references them). Leaving them in place costs storage
  but does no harm.
2026-04-24 18:14:04 +04:00
Elie Habib
d521924253 fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs (#3363)
* fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs

PR #3289 shipped the v2 energy construct behind RESILIENCE_ENERGY_V2_ENABLED
(default false). Audit on 2026-04-24 after the user flagged "AE only moved
1.49 points — we added nuclear credit, we should see more" revealed two
safety gaps that made a future flag flip unsafe:

1. scoreEnergyV2 silently fell back to IMPUTE when any of its three
   required Redis seeds (low-carbon-generation, fossil-electricity-share,
   power-losses) was null. A future operator flipping the flag with
   seeds absent would produce fabricated-looking numbers for every
   country with zero operator signal.

2. api/health.js had those three seed labels in BOTH SEED_META (CRIT on
   missing) AND ON_DEMAND_KEYS (which demotes CRIT to WARN). The demotion
   won. Health has been reporting WARNING on a scorer dependency that has
   been 100% missing since PR #3289 merged — no paging trail existed.

Changes:

  server/worldmonitor/resilience/v1/_dimension-scorers.ts
    - Add ResilienceConfigurationError with missingKeys[] payload.
    - scoreEnergy: preflight the three v2 seeds when flag=true. Throw
      ResilienceConfigurationError listing the specific absent keys.
    - scoreAllDimensions: wrap per-dimension dispatch in try/catch so a
      thrown ResilienceConfigurationError routes to the source-failure
      shape (imputationClass='source-failure', coverage=0) for that ONE
      dimension — country keeps scoring other dims normally. Log once
      per country-dimension pair so the gap is audit-traceable.

  api/health.js
    - Remove lowCarbonGeneration / fossilElectricityShare / powerLosses
      from ON_DEMAND_KEYS. They stay in BOOTSTRAP_KEYS + SEED_META.
    - Replace the transitional comment with a hard "do NOT add these
      back" note pointing at the scorer's fail-closed gate.

  tests/resilience-energy-v2.test.mts
    - New test: flag on + ALL three seeds missing → throws
      ResilienceConfigurationError naming all three keys.
    - New test: flag on + only one seed missing → throws naming ONLY
      the missing key (operator-clarity guard).
    - New test: flag on + all seeds present → v2 runs normally.
    - Update the file-level invariant comment to reflect the new
      fail-closed contract (replacing the prior "degrade gracefully"
      wording that codified the silent-IMPUTE bug).
    - Note: fixture's `??` fallbacks coerce null-overrides into real
      data, so the preflight tests use a direct-reader helper.

  docs/methodology/country-resilience-index.mdx
    - New "Fail-closed semantics" paragraph in the v2 Energy section
      documenting the throw + source-failure + health-CRIT contract.
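
The fail-closed flow can be sketched as below. The class name, the
`missingKeys` payload, and the `imputationClass` / `coverage` values follow
the text above; `preflightSeeds`, the scorer signatures, and the
`score: null` field are assumptions made for this example.

```javascript
// Sketch only: scorer signatures and result shape are illustrative.
class ResilienceConfigurationError extends Error {
  constructor(missingKeys) {
    super(`missing required seeds: ${missingKeys.join(', ')}`);
    this.name = 'ResilienceConfigurationError';
    this.missingKeys = missingKeys;
  }
}

function preflightSeeds(seeds) {
  const missingKeys = Object.keys(seeds).filter((k) => seeds[k] == null);
  if (missingKeys.length > 0) throw new ResilienceConfigurationError(missingKeys);
}

function scoreAllDimensions(scorers) {
  const results = {};
  for (const [dim, score] of Object.entries(scorers)) {
    try {
      results[dim] = score();
    } catch (err) {
      if (!(err instanceof ResilienceConfigurationError)) throw err;
      // Fail closed for this one dimension; the others keep scoring.
      results[dim] = { score: null, coverage: 0, imputationClass: 'source-failure' };
    }
  }
  return results;
}
```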

Non-goals (intentional):
  - This PR does NOT flip RESILIENCE_ENERGY_V2_ENABLED.
  - This PR does NOT provision seed-bundle-resilience-energy-v2 on Railway.
  - This PR does NOT touch RESILIENCE_PILLAR_COMBINE_ENABLED.

Operational effect post-merge:
  - /api/health flips from WARNING → CRITICAL on the three v2 seed-meta
    entries. That is the intended alarm; it reveals that the Railway
    bundle was never provisioned.
  - scoreEnergy behavior with flag=false is unchanged (legacy path).
  - scoreEnergy behavior with flag=true + seeds present is unchanged.
  - scoreEnergy behavior with flag=true + seeds absent changes from
    "silently IMPUTE all 217 countries" to "source-failure on the
    energy dim for every country, visible in widget + API response".

Tests: 511/511 resilience-* pass. Biome clean. Lint:md clean.

Related plan: docs/plans/2026-04-24-001-fix-resilience-v2-fail-closed-on-missing-seeds-plan.md

* docs(resilience): scrub stale ON_DEMAND_KEYS references for v2 energy seeds

Greptile P2 on PR #3363: four stale references implied the three v2
energy seeds were still gated as ON_DEMAND_KEYS (WARN-on-missing) even
though this PR's api/health.js change removed them (now strict
SEED_META = CRIT on missing). Scrubbing each:

  - api/health.js:196 (BOOTSTRAP_KEYS comment) — was "ON_DEMAND_KEYS
    until Railway cron provisions; see below." Updated to cite plan
    2026-04-24-001 and the strict-SEED_META posture.
  - api/health.js:398 (SEED_META comment) — was "Listed in ON_DEMAND_KEYS
    below until Railway cron provisions..." Updated for same reason.
  - docs/methodology/country-resilience-index.mdx:635 — v2.1 changelog
    entry said seed keys were ON_DEMAND_KEYS until graduation. Replaced
    with the fail-closed contract description.
  - docs/methodology/energy-v2-flag-flip-runbook.md:25 — step 3 said
    "ON_DEMAND_KEYS graduation" was required at flag-flip time.
    Rewrote to explain no graduation step is needed because the
    posture was removed pre-activation.

No code change. Tests still 14/14 on the energy-v2 suite, lint:md clean.

* fix(docs): escape MDX-unsafe `<=` in energy-v2 runbook to unblock Mintlify

Mintlify deploy on PR #3363 failed with
`Unexpected character '=' (U+003D) before name` at
`docs/methodology/energy-v2-flag-flip-runbook.md`. Two lines had
`<=` in plain prose, which MDX tries to parse as a JSX-tag-start.

Replaced both with `≤` (U+2264) — and promoted the two existing `>=`
on adjacent lines to `≥` for consistency. Prose is clearer and MDX
safe.

Same pattern as `mdx-unsafe-patterns-in-md` skill; also adjacent to
PR #3344's `(<137 countries)` fix.
2026-04-24 09:37:18 +04:00
Elie Habib
6d4c717e75 fix(health): treat empty intlDelays as OK, matching faaDelays (#3360)
intlDelays was alarming EMPTY_DATA during calm windows (seedAge 25m,
records 0) while its faaDelays sibling — written by the same aviation
seeder — was in EMPTY_DATA_OK_KEYS. The seeder itself declares
zeroIsValid: true (scripts/seed-aviation.mjs:1171) because 0 airport
disruptions is a real steady state, so the health classifier should
agree. Stale-seed degradation still kicks in once seedAge > 90min.
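
A minimal sketch of the classification rule, assuming a simplified
classifier. The key names, `EMPTY_DATA`, and the 90-minute limit come from
the message above; the function shape and the other status strings are
assumptions, not the actual `api/health.js` code.

```javascript
// Sketch only: 'STALE_SEED' and 'OK' are assumed status names.
const EMPTY_DATA_OK_KEYS = new Set(['faaDelays', 'intlDelays']);

function classifySeed({ key, records, seedAgeMin, maxStaleMin = 90 }) {
  if (seedAgeMin > maxStaleMin) return 'STALE_SEED';
  if (records === 0 && !EMPTY_DATA_OK_KEYS.has(key)) return 'EMPTY_DATA';
  return 'OK';
}
```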
2026-04-24 07:11:56 +04:00
Elie Habib
d3d406448a feat(resilience): PR 2 §3.4 recovery-domain weight rebalance (#3328)
* feat(resilience): PR 2 §3.4 recovery-domain weight rebalance

Dials the two PR 2 §3.4 recovery dims (liquidReserveAdequacy,
sovereignFiscalBuffer) to ~10% share each of the recovery-domain
score via a new per-dimension weight channel in the coverage-weighted
mean. Matches the plan's direction that the sovereign-wealth signal
complement — rather than dominate — the classical liquid-reserves
and fiscal-space signals.

Implementation

- RESILIENCE_DIMENSION_WEIGHTS: new Record<ResilienceDimensionId, number>
  alongside RESILIENCE_DOMAIN_WEIGHTS. Every dim has an explicit entry
  (default 1.0) so rebalance decisions stay auditable; the two new
  recovery dims carry 0.5 each.

  Share math at full coverage (6 active recovery dims):
    weight sum                  = 4 × 1.0 + 2 × 0.5 = 5.0
    each new-dim share          = 0.5 / 5.0 = 0.10  ✓
    each core-dim share         = 1.0 / 5.0 = 0.20

  Retired dims (reserveAdequacy, fuelStockDays) keep weight 1.0 in
  the map; their coverage=0 neutralizes them at the coverage channel
  regardless. Explicit entries guard against a future scorer bug
  accidentally returning coverage>0 for a retired dim and falling
  through the `?? 1.0` default — every retirement decision is now
  tied to a single explicit source of truth.

- coverageWeightedMean (_shared.ts): refactored to apply
  `coverage × dimWeight` per dim instead of `coverage` alone. Backward-
  compatible when all weights default to 1.0 (reduces to the original
  mean). All three aggregation callers — buildDomainList, baseline-
  Score, stressScore — pick up the weighting transparently.
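
A sketch of the weighted mean, assuming each dim is a
`{ id, score, coverage }` record; the real `coverageWeightedMean` in
`_shared.ts` is not shown in this log, and returning 0 for an empty input
is a choice made for this example.

```javascript
// Sketch only: the dim record shape and the empty-input result are assumed.
function coverageWeightedMean(dims, dimensionWeights = {}) {
  let weighted = 0;
  let total = 0;
  for (const { id, score, coverage } of dims) {
    const w = coverage * (dimensionWeights[id] ?? 1.0);
    weighted += score * w;
    total += w;
  }
  return total > 0 ? weighted / total : 0;
}
```

With four core dims at weight 1.0 and two new dims at 0.5, all at full
coverage, a new dim scoring 100 while the rest score 0 yields a mean of 10,
matching the 0.10 share above.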

Test coverage

1. New `tests/resilience-recovery-weight-rebalance.test.mts`:
   pins the per-dim weight values, asserts the share math
   (0.10 new / 0.20 core), verifies completeness of the weight map,
   and documents why retired dims stay in the map at 1.0.
2. New `tests/resilience-recovery-ordering.test.mts`: fixture-based
   Spearman-proxy sensitivity check. Asserts NO > US > YE ordering
   preserved on both the overall score and the recovery-domain
   subscore after the rebalance. (Live post-merge Spearman rerun
   against the PR 0 snapshot is tracked as a follow-up commit.)
3. resilience-scorers.test.mts fixture anchors updated in lockstep:
     baselineScore: 60.35 → 62.17 (low-scoring liquidReserveAdequacy
       + partial-coverage SWF now contribute ~half the weight)
     overallScore:  63.60 → 64.39 (recovery subscore lifts by ~3 pts
       from the rebalance, overall by ~0.79)
     recovery flat mean: 48.75 (unchanged — flat mean doesn't apply
       weights by design; documents the coverage-weighted diff)
   Local coverageWeightedMean helper in the test mirrors the
   production implementation (weights applied per dim).

Methodology doc

- New "Per-dimension weights in the recovery domain" subsection with
  the weight table and a sentence explaining the cap. Cross-references
  the source of truth (RESILIENCE_DIMENSION_WEIGHTS).

Deliberate non-goals

- Live post-merge Spearman ≥0.85 check against the PR 0 baseline
  snapshot. Fixture ordering is preserved (new ordering test); the
  live-data check runs after Railway cron refreshes the rankings on
  the new weights and commits docs/snapshots/resilience-ranking-live-
  post-pr2-<date>.json. Tracked as the final piece of PR 2 §3.4
  alongside the health.js / bootstrap graduation (waiting on the
  7-day Railway cron bake-in window).

Tests: 6588/6588 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on touched files. NO > US > YE
fixture ordering preserved.

* fix(resilience): PR 2 review — thread RESILIENCE_DIMENSION_WEIGHTS through the comparison harness

Greptile P2: the operator comparison harness
(scripts/compare-resilience-current-vs-proposed.mjs) claims its domain
scores "mirror the production scorer's coverage-weighted mean" and is
the artifact generator for Spearman / rank-delta acceptance decisions.
After PR 2 §3.4's weight rebalance, the production mirror diverged —
production now applies RESILIENCE_DIMENSION_WEIGHTS (liquidReserveAdequacy
= 0.5, sovereignFiscalBuffer = 0.5) inside coverageWeightedMean, but
the harness still used equal-weight aggregation.

Left unfixed, post-merge Spearman / rank-delta diagnostics would
compare live API scores (with the 0.5 recovery weights) against
harness predictions that assume equal-share dims — silently biasing
every acceptance decision until someone noticed a country's rank-
delta didn't track.

Fix

- Mirrored coverageWeightedMean now accepts dimensionWeights and
  applies `coverage × weight` per dim, matching _shared.ts exactly.
- Mirrored buildDomainList accepts + forwards dimensionWeights.
- main() imports RESILIENCE_DIMENSION_WEIGHTS from the scorer module
  and passes it through to buildDomainList at the single call site.
- Missing-entry default = 1.0 (same contract as production). This keeps
  the harness forward-compatible with any future weight refactor: if a
  new dim is added without an explicit entry, the harness falls back to
  1.0 just as production does and still produces the correct number.

Verification

- Harness syntax-check clean (node -c).
- RESILIENCE_DIMENSION_WEIGHTS import resolves correctly from the
  harness's import path.
- 509/509 resilience tests still pass (harness isn't in the test
  suite; the invariant is that production ↔ harness use the same
  math, and the production side is covered by tests/resilience-
  recovery-weight-rebalance.test.mts).

* fix(resilience): PR 2 review — bump cache prefixes v10→v11 + document coverage-vs-weight asymmetry

Greptile P1 + P2 on PR #3328.

P1 — cache prefix not bumped after formula change
--------------------------------------------------
The per-dim weight rebalance changes the score formula, but the
`_formula` tag only distinguishes 'd6' vs 'pc' (pillar-combined vs
legacy 6-domain) — it does NOT detect intra-'d6' weight changes. Left
unfixed, scores cached before deploy would be served with the old
equal-weight math for up to the full 6h TTL, and the ranking key for
up to its 12h TTL. Matches the established v9→v10 pattern for every
prior formula-changing deploy.

Bumped in lockstep:
 - RESILIENCE_SCORE_CACHE_PREFIX:     v10  → v11
 - RESILIENCE_RANKING_CACHE_KEY:      v10  → v11
 - RESILIENCE_HISTORY_KEY_PREFIX:      v5  → v6
 - scripts/seed-resilience-scores.mjs local mirrors
 - api/health.js resilienceRanking literal
 - 4 analysis/backtest scripts that read the cached keys directly
 - Test fixtures in resilience-{ranking, handlers, scores-seed,
   pillar-aggregation}.test.* that assert on literal key values

The v5→v6 history bump is the critical one: without it, pre-rebalance
history points would mix with post-rebalance points inside the 30-day
window, and change30d / trend math would diff values from different
formulas against each other, producing spurious "falling" trends
for every country across the deploy window.

P2 — coverage-vs-weight asymmetry in computeLowConfidence / computeOverallCoverage
----------------------------------------------------------------------------------
Reviewer flagged that these two functions still average coverage
equally across all non-retired dims, even after the scoring aggregation
started applying RESILIENCE_DIMENSION_WEIGHTS. The asymmetry is
INTENTIONAL — these signals answer a different question from scoring:

  scoring aggregation: "how much does each dim matter to the score?"
  coverage signal:     "how much real data do we have on this country?"

A dim at weight 0.5 still has the same data-availability footprint as
a weight=1.0 dim: its coverage value reflects whether we successfully
fetched the upstream source, not whether the scorer cares about it.
Applying scoring weights to the coverage signal would let a
half-weight dim hide half its sparsity from the overallCoverage pill,
misleading users reading coverage as a data-quality indicator.

Added explicit comments to both functions noting the asymmetry is
deliberate and pointing at the other site for matching rationale.
No code change — just documentation.

Tests: 6588/6588 data-tier tests pass (+511 resilience-specific
including the prefix-literal assertions). Typecheck clean on both
tsconfig configs. Biome clean on touched files.

* docs(resilience): bump methodology doc cache-prefix references to v11/v6

Greptile P2 on PR #3328: Redis keys table in the reproducibility
appendix still published `score:v10` / `ranking:v10` / `history:v5`,
and the rollback instructions told operators to flush those keys.
After the recovery-domain weight rebalance, live cache runs at
`score:v11` / `ranking:v11` / `history:v6`.

- Updated the Redis keys table (lines 490-492) to match `_shared.ts`.
- Updated the rollback block to name the current keys.
- Left the historical "Activation sequence" narrative intact (it
  accurately describes the pillar-combine PR's v9→v10 / v4→v5 bump)
  but added a parenthetical pointing at the current v11/v6 values.

No code change — doc-only correction for operator accuracy.

* fix(docs): escape MDX-unsafe `<137` pattern to unblock Mintlify deploy

Line 643 had `(<137 countries)` — MDX parses `<137` as a JSX tag
starting with digit `1`, which is illegal and breaks the deploy with
"Unexpected character `1` (U+0031) before name". Surfaced after the
prior cache-prefix commit forced Mintlify to re-parse this file.

Replaced with "fewer than 137 countries" for unambiguous rendering.
Other `<` occurrences in this doc (lines 34, 642) are followed by
whitespace and don't trip MDX's tag parser.
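
A naive pre-deploy check for this class of breakage could look like the
sketch below. It is an assumption, not part of this commit:
`findDigitAfterAngleSketch` is a hypothetical helper, and it does not
exempt `<` inside inline code or fenced blocks.

```typescript
// Returns 1-based line numbers where `<` is immediately followed by a digit,
// the pattern MDX tries (and fails) to read as a JSX tag name.
function findDigitAfterAngleSketch(mdx: string): number[] {
  const hits: number[] = [];
  mdx.split('\n').forEach((line, index) => {
    if (/<\d/.test(line)) hits.push(index + 1);
  });
  return hits;
}

const sampleDoc = [
  'threshold < 5 is fine (whitespace after the bracket)',
  'small cohort (<137 countries)',
  'fewer than 137 countries',
].join('\n');

console.log(findDigitAfterAngleSketch(sampleDoc));
```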
2026-04-23 10:25:18 +04:00
Elie Habib
84ee2beb3e feat(energy): Energy Atlas end-to-end — pipelines + storage + shortages + disruptions + country drill-down (#3294)
* feat(energy): pipeline registries (gas + oil) — evidence-based schema

Day 6 of the Energy Atlas Release 1 plan (Week 2). First curated asset
registry for the atlas — the real gap vs GEF.

## Curated data (critical assets only, not global completeness)

scripts/data/pipelines-gas.json — 12 critical gas lines:
  Nord Stream 1/2 (offline; Swedish EEZ sabotage 2022; EU sanctions refs),
  TurkStream, Yamal–Europe (offline; Polish counter-sanctions),
  Brotherhood/Soyuz (offline; Ukraine transit expired 2024-12-31),
  Power of Siberia, Dolphin, Medgaz, TAP, TANAP,
  Central Asia–China, Langeled.

scripts/data/pipelines-oil.json — 12 critical oil lines:
  Druzhba North/South (N offline per EU 2022/879; S under landlocked
  derogation), CPC, ESPO (+ price-cap sanction ref), BTC, TAPS,
  Habshan–Fujairah (Hormuz bypass), Keystone, Kirkuk–Ceyhan (offline
  since 2023 ICC ruling), Baku–Supsa, Trans-Mountain (TMX expansion
  May 2024), ESPO spur to Daqing.

Scope note: 75+ each is Week 2b work via GEM bulk import. Today's cut
is curated from first-hand operator disclosures + regulator filings so
I can stand behind every evidence field.

## Evidence-based schema (not conclusion labels)

Per docs/methodology/pipelines.mdx: no bare `sanctions_blocked` field.
Every pipeline carries an evidence bundle with `physicalState`,
`physicalStateSource`, `operatorStatement`, `commercialState`,
`sanctionRefs[]`, `lastEvidenceUpdate`, `classifierVersion`,
`classifierConfidence`. The public badge (`flowing|reduced|offline|
disputed`) is derived server-side from this bundle at read time.

## Seeder

scripts/seed-pipelines.mjs — single process publishes BOTH keys
(energy:pipelines:{gas,oil}:v1) via two runSeed() calls. Tiny datasets
(<20KB each) so co-location is cheap and guarantees classifierVersion
consistency.

Conventions followed (worldmonitor-bootstrap-registration skill):
- TTL 21d = 3× weekly cadence (gold-standard per
  feedback_seeder_gold_standard.md)
- maxStaleMin 20_160 = 2× cadence (health-maxstalemin-write-cadence skill)
- sourceVersion + schemaVersion + recordCount + declareRecords wired
  (seed-contract-foundation)
- Zero-case explicitly NOT allowed — MIN_PIPELINES_PER_REGISTRY=8 floor
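
The TTL and maxStaleMin multipliers above reduce to simple arithmetic on
the cron cadence. A sketch, assuming the cadence is expressed in minutes;
`cadenceBudgetSketch` and its return shape are hypothetical names, not
the seeder's actual exports.

```typescript
interface CadenceBudgetSketch {
  ttlSeconds: number;  // Redis TTL: 3x cadence
  maxStaleMin: number; // health threshold: 2x cadence
}

// Derives both budgets from one cadence so they cannot drift apart.
function cadenceBudgetSketch(cadenceMin: number): CadenceBudgetSketch {
  return {
    ttlSeconds: 3 * cadenceMin * 60,
    maxStaleMin: 2 * cadenceMin,
  };
}

const WEEKLY_CADENCE_MIN = 7 * 24 * 60; // 10_080
const weeklyBudget = cadenceBudgetSketch(WEEKLY_CADENCE_MIN);

console.log(weeklyBudget.ttlSeconds / 86_400, weeklyBudget.maxStaleMin);
```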

## Health registration (dual, per feedback_two_health_endpoints_must_match)

- api/health.js: BOOTSTRAP_KEYS adds pipelinesGas + pipelinesOil;
  SEED_META adds both with maxStaleMin=20_160.
- api/seed-health.js: mirror entries with intervalMin=10_080 (maxStaleMin/2).

## Bundle registration

scripts/seed-bundle-energy-sources.mjs adds a single Pipelines entry
(not two) because seed-pipelines.mjs publishes both keys in one run —
listing oil separately would double-execute. Monitoring of the oil key
staleness happens in api/health.js instead.

## Tests (tests/pipelines-registry.test.mts)

17 passing node:test assertions covering:
- Schema validation (both registries pass validateRegistry)
- Identity resolution (no id collisions, id matches object key)
- Country ISO2 normalization (from/to/transit all match /^[A-Z]{2}$/)
- Endpoint geometry within Earth bounds
- Evidence rigor: non-flowing badges require at least one supporting
  evidence source (operator statement / sanctionRefs / ais-relay /
  satellite / press)
- ClassifierConfidence in 0..1
- Commodity/capacity pairing (gas uses capacityBcmYr, oil uses
  capacityMbd — mixing = test fail)
- validateRegistry rejects: empty object, null, no-evidence fixtures,
  below-floor counts
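
A few of the checks above can be sketched as follows. This is an
illustration under assumptions, not the shipped validateRegistry: the
field names `fromCountry` / `toCountry`, the `PipelineSketch` shape, and
the error-list return type are all hypothetical. Only the floor of 8 and
the ISO2 regex come from this commit message.

```typescript
const MIN_PIPELINES_PER_REGISTRY = 8; // floor stated in the conventions
const ISO2_RE = /^[A-Z]{2}$/;

interface PipelineSketch {
  id: string;
  commodityType: 'gas' | 'oil';
  fromCountry: string;    // hypothetical field name
  toCountry: string;      // hypothetical field name
  capacityBcmYr?: number; // gas only
  capacityMbd?: number;   // oil only
  classifierConfidence: number;
}

function isIso2(value: unknown): boolean {
  return typeof value === 'string' && ISO2_RE.test(value);
}

// Returns a list of problems; an empty list means the registry passed.
function validateRegistrySketch(registry: unknown): string[] {
  if (registry === null || typeof registry !== 'object') {
    return ['registry must be a non-null object'];
  }
  const entries = Object.entries(
    registry as Record<string, Partial<PipelineSketch> | null>,
  );
  const errors: string[] = [];
  if (entries.length < MIN_PIPELINES_PER_REGISTRY) {
    errors.push(`below floor: ${entries.length} < ${MIN_PIPELINES_PER_REGISTRY}`);
  }
  for (const [key, p] of entries) {
    if (p === null || typeof p !== 'object') {
      errors.push(`${key}: not an object`);
      continue;
    }
    if (p.id !== key) errors.push(`${key}: id does not match object key`);
    if (!isIso2(p.fromCountry) || !isIso2(p.toCountry)) {
      errors.push(`${key}: country is not ISO2`);
    }
    const confidence = p.classifierConfidence;
    if (typeof confidence !== 'number' || confidence < 0 || confidence > 1) {
      errors.push(`${key}: classifierConfidence outside 0..1`);
    }
    // Gas must carry capacityBcmYr only; oil must carry capacityMbd only.
    const hasBcm = typeof p.capacityBcmYr === 'number';
    const hasMbd = typeof p.capacityMbd === 'number';
    const paired =
      p.commodityType === 'gas' ? hasBcm && !hasMbd
      : p.commodityType === 'oil' ? hasMbd && !hasBcm
      : false;
    if (!paired) errors.push(`${key}: commodity/capacity mismatch`);
  }
  return errors;
}
```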

Typecheck clean (both tsconfig.json and tsconfig.api.json).

Next: Day 7 will add list-pipelines / get-pipeline-detail RPCs in
supply-chain/v1. Day 8 ships PipelineStatusPanel with DeckGL PathLayer
consuming the registry.

* fix(energy): split seed-pipelines.mjs into two entry points — runSeed hard-exits

High finding from PR review. scripts/seed-pipelines.mjs called runSeed()
twice in one process and awaited Promise.all. But runSeed() in
scripts/_seed-utils.mjs hard-exits via process.exit on ~9 terminal paths
(lines 816, 820, 839, 888, 917, 989, plus fetch-retry 946, fatal 859,
skipped-lock 81). The first runSeed to reach any terminal path exits the
entire node process, so the second runSeed's resolve never fires — only
one of energy:pipelines:{gas,oil}:v1 would ever be written.

Since the bundle scheduled seed-pipelines.mjs exactly once, and both
api/health.js and api/seed-health.js expect both keys populated, the
other registry would stay permanently EMPTY/STALE after deploy.

Fix: split into two entry-point scripts around a shared utility.

- scripts/_pipeline-registry.mjs (NEW, was seed-pipelines.mjs) — shared
  helpers ONLY. Exports GAS_CANONICAL_KEY, OIL_CANONICAL_KEY,
  PIPELINES_TTL_SECONDS, MAX_STALE_MIN, buildGasPayload, buildOilPayload,
  validateRegistry, recordCount, declareRecords. Underscore prefix marks
  it as non-entry-point (matches _seed-utils.mjs / _seed-envelope-source.mjs
  convention).
- scripts/seed-pipelines-gas.mjs (NEW) — imports from the shared module,
  single runSeed('energy','pipelines-gas',…) call.
- scripts/seed-pipelines-oil.mjs (NEW) — same shape, oil.
- scripts/seed-bundle-energy-sources.mjs — register BOTH seeders (not one).
- scripts/seed-pipelines.mjs — deleted.
- tests/pipelines-registry.test.mts — update import path to the shared
  module. All 17 tests still pass.

Typecheck clean (both configs). Tests pass. No other consumers import
from the deleted script.

* fix(energy): complete pipeline bootstrap registration per 4-file checklist

High finding from PR review. My earlier PR description claimed
worldmonitor-bootstrap-registration was complete, but I only touched two
of the four registries (api/health.js + api/seed-health.js). The bootstrap
hydration payload itself (api/bootstrap.js) and the shared cache-keys
registry (server/_shared/cache-keys.ts) still had no entry for either
pipeline key, so any consumer that reads bootstrap data would see
pipelinesGas/pipelinesOil as missing on first load.

Files updated this commit:

- api/bootstrap.js — KEYS map + SLOW_KEYS set both gain pipelinesGas +
  pipelinesOil. Placed next to sprPolicies (same curated-registry cadence
  and tier). Slow tier is correct: weekly cron, not needed on first paint.
- server/_shared/cache-keys.ts — PIPELINES_GAS_KEY + PIPELINES_OIL_KEY
  exported constants (matches SPR_POLICIES_KEY pattern), BOOTSTRAP_KEYS map
  entries, and BOOTSTRAP_TIERS entries (both 'slow').

Not touched (intentional):
- server/gateway.ts — pipeline data is free-tier per the Energy Atlas
  plan; no PREMIUM_RPC_PATHS entry required. Energy Atlas monetization
  hooks (scenario runner, MCP tools, subscriptions) are Release 2.

Full 4-file checklist now complete:
- server/_shared/cache-keys.ts (this commit)
- api/bootstrap.js             (this commit)
- api/health.js                (earlier in PR)
- api/seed-health.js           (earlier in PR; dual-registry rule)

Typecheck clean (both configs).

* feat(energy): ListPipelines + GetPipelineDetail RPCs with evidence-derived badges

Day 7 of the Energy Atlas Release 1 plan (Week 2). Exposes the pipeline
registries (shipped in Day 6) via two supply-chain RPCs and ships the
evidence-to-badge derivation server-side.

## Proto

proto/worldmonitor/supply_chain/v1/list_pipelines.proto — new:
- ListPipelinesRequest { commodity_type?: 'gas' | 'oil' }
- ListPipelinesResponse { pipelines[], fetched_at, classifier_version, upstream_unavailable }
- GetPipelineDetailRequest { pipeline_id (required, query-param) }
- GetPipelineDetailResponse { pipeline?, revisions[], fetched_at, unavailable }
- PipelineEntry — wire shape mirroring scripts/data/pipelines-{gas,oil}.json
  + a server-derived public_badge field
- PipelineEvidence, OperatorStatement, SanctionRef, LatLon, PipelineRevisionEntry

service.proto adds both rpc methods with HTTP_METHOD_GET + path bindings:
  /api/supply-chain/v1/list-pipelines
  /api/supply-chain/v1/get-pipeline-detail

`make generate` regenerated src/generated/{client,server}/… + docs/api/
OpenAPI json/yaml.

## Evidence-derivation

server/worldmonitor/supply-chain/v1/_pipeline-evidence.ts — new.
derivePublicBadge(evidence) → 'flowing' | 'reduced' | 'offline' | 'disputed'
is deterministic + versioned (DERIVER_VERSION='badge-deriver-v1').

Rules (first match wins):
1. offline + (sanctionRef OR expired/suspended commercialState) → offline
2. offline + operator statement → offline
3. offline + only press/ais/satellite → disputed (single-source negative claim)
4. reduced → reduced
5. flowing → flowing
6. unknown / malformed → disputed

Staleness guard: non-flowing badges on >14d-old evidence demote to
disputed. Flowing is the optimistic default — stale "still flowing" is
safer than stale "offline". Matches seed-pipelines-{gas,oil}.mjs maxStaleMin.
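
The rules and the staleness guard can be read as the sketch below. It is
not the shipped derivePublicBadge: the `EvidenceSketch` shape, the
operator-statement type, and the choice to treat a missing or unparsable
lastEvidenceUpdate as stale are assumptions made for illustration.

```typescript
type BadgeSketch = 'flowing' | 'reduced' | 'offline' | 'disputed';

// Assumed, duck-typed subset of the evidence bundle.
interface EvidenceSketch {
  physicalState?: string;
  operatorStatement?: { text: string } | null; // shape is an assumption
  commercialState?: string;
  sanctionRefs?: unknown[];
  lastEvidenceUpdate?: string; // ISO timestamp
}

const STALE_AFTER_MS = 14 * 24 * 60 * 60 * 1000; // 14d window

function deriveBadgeSketch(
  evidence: EvidenceSketch | null | undefined,
  nowMs: number,
): BadgeSketch {
  if (!evidence) return 'disputed'; // rule 6: unknown / malformed

  let badge: BadgeSketch;
  switch (evidence.physicalState) {
    case 'offline': {
      // Rules 1-2: paperwork or an operator statement confirms "offline".
      const hasPaperwork =
        (evidence.sanctionRefs?.length ?? 0) > 0 ||
        evidence.commercialState === 'expired' ||
        evidence.commercialState === 'suspended';
      const hasOperator = Boolean(evidence.operatorStatement);
      // Rule 3: press / AIS / satellite alone is a single-source claim.
      badge = hasPaperwork || hasOperator ? 'offline' : 'disputed';
      break;
    }
    case 'reduced':
      badge = 'reduced';
      break;
    case 'flowing':
      badge = 'flowing';
      break;
    default:
      badge = 'disputed';
  }

  // Staleness guard: only negative claims are demoted; flowing is kept.
  if (badge === 'offline' || badge === 'reduced') {
    const updatedMs = Date.parse(evidence.lastEvidenceUpdate ?? '');
    // Assumption: a missing or unparsable timestamp counts as stale.
    if (Number.isNaN(updatedMs) || nowMs - updatedMs > STALE_AFTER_MS) {
      return 'disputed';
    }
  }
  return badge;
}
```

Passing `nowMs` in, rather than calling Date.now() inside, keeps the
function pure, which is what makes the staleness cases testable.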

Tests (tests/pipeline-evidence-derivation.test.mts) — 15 passing cases
covering happy paths, disputed fallbacks, staleness guard, versioning.

## Handlers

server/worldmonitor/supply-chain/v1/list-pipelines.ts
- Reads energy:pipelines:{gas,oil}:v1 via getCachedJson.
- projectPipeline() narrows the Upstash `unknown` into PipelineEntry
  shape + calls derivePublicBadge.
- Honors commodity_type filter (skip the opposite registry's Redis read
  when the client pre-filters).
- Returns upstream_unavailable=true when BOTH registries miss.

server/worldmonitor/supply-chain/v1/get-pipeline-detail.ts
- Scans both registries by id (ids are globally unique per
  tests/pipelines-registry.test.mts).
- Empty revisions[] for now; auto-revision log wires up in Week 3.

handler.ts registers both into supplyChainHandler.

## Gateway

server/gateway.ts adds 'static' cache-tier for both new RPC paths
(registry is slow-moving; 'static' matches the other read-mostly
supply-chain endpoints).

## Consumer wiring

Not in this commit — PipelineStatusPanel (Day 8) is what will call
listPipelines/getPipelineDetail via the generated client. pipelinesGas
+ pipelinesOil stay in PENDING_CONSUMERS until Day 8.

Typecheck clean (both configs). 15 new tests + 17 registry tests all pass.

* feat(energy): PipelineStatusPanel — evidence-backed status table + drawer

Day 8 of the Energy Atlas Release 1 plan. First consumer of the Day 6–7
registries + RPCs.

## What this PR adds

- src/components/PipelineStatusPanel.ts — new panel (id=pipeline-status).
  * Bootstrap-hydrates from pipelinesGas + pipelinesOil for instant first
    paint; falls through to listPipelines() RPC if bootstrap misses.
    Background re-fetch runs on every render so a classifier-version bump
    between bootstrap stamp and first view produces a visible update.
  * Table rows sorted non-flowing-first (offline / reduced / disputed
    before flowing) — what an atlas reader cares about.
  * Click-to-expand drawer calls getPipelineDetail() lazily — operator
    statements, sanction refs (with clickable source URLs), commercial
    state, classifier version + confidence %, capacity + route metadata.
  * publicBadge color-chip palette matches the methodology doc.
  * Attribution footer with GEM (CC-BY 4.0) credit + classifier version.

- src/components/index.ts — barrel export.
- src/app/panel-layout.ts — import + createPanel('pipeline-status', …).
- src/config/panels.ts — ENERGY_PANELS adds 'pipeline-status' at priority 1.

## PENDING_CONSUMERS cleanup

tests/bootstrap.test.mjs — removes 'pipelinesGas' + 'pipelinesOil' from
the allowlist. The invariant "every bootstrap key has a getHydratedData
consumer" now enforces real wiring for these keys: the panel literally
calls getHydratedData('pipelinesGas') and getHydratedData('pipelinesOil').
Future regressions that remove the consumer will fail pre-push.

## Consumer contract verified

- 67 tests pass including bootstrap.test.mjs consumer coverage check.
- Typecheck clean.
- No DeckGL PathLayer in this commit — existing 'pipelines-layer' has a
  separate data source, so modifying DeckGLMap.ts to overlay evidence-
  derived badges on the map is a follow-up commit to avoid clobbering.

## Out of scope for Day 8 (next steps on same PR)

- DeckGL PathLayer integration (color pipelines on the main map by
  publicBadge, click-to-open this drawer) — Day 8b commit.
- Storage facility registry + StorageFacilityMapPanel — Days 9-10.

* fix(energy): PipelineStatusPanel bootstrap path — client-side badge derivation

High finding from PR review. The Day-8 panel crashed on first paint
whenever bootstrap hydration succeeded, because:

- Bootstrap hydrates raw scripts/data/pipelines-{gas,oil}.json verbatim.
- That JSON does NOT include publicBadge — that field is only added by
  the server handler's projectPipeline() in list-pipelines.ts.
- PipelineStatusPanel passed raw entries into badgeChip(), which called
  badgeLabel(undefined).charAt(0) → TypeError.

The background RPC refresh that would have repaired the data never ran
because the panel threw before reaching it. So the exact bootstrap path
newly wired in commit 6b01fa537 was broken for the new panel.

Fix: move the evidence→badge deriver to src/shared/pipeline-evidence.ts
so the client panel and the server handler run the identical function on
identical inputs. Panel projects raw bootstrap JSON through the shared
deriver client-side, producing the same publicBadge the RPC would have
returned. No UI flicker on hydration because pre- and post-RPC badges
match exactly (same function, same input).

## Changes

- src/shared/pipeline-evidence.ts (NEW) — pure deriver with duck-typed
  PipelineEvidenceInput (no generated-type dependency, so both client
  and server assign their proto-typed evidence bundles by structural
  subtyping). Exports derivePipelinePublicBadge + version + type.
- server/worldmonitor/supply-chain/v1/_pipeline-evidence.ts — now a thin
  re-export of the shared module under its older name so in-handler
  imports keep working without a sweep.
- src/components/PipelineStatusPanel.ts:
  * Imports derivePipelinePublicBadge from @/shared/pipeline-evidence.
  * NEW projectRawPipeline() defensively coerces every field from
    unknown → PipelineEntry shape, mirroring the server projection.
  * buildBootstrapResponse now routes every raw entry through the
    projection before returning, so the wire-format PipelineEntry[] the
    renderer receives always has publicBadge populated.
  * badgeChip() gained a null-guard fallback to 'disputed' — belt +
    braces so even if a future caller passes an undefined, the UI
    renders safely instead of throwing.
  * BootstrapRegistry renamed RawBootstrapRegistry with a comment
    explaining why the seeder ships raw JSON (not wire format).
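
The projection and the null-guard can be sketched like this, assuming a
trimmed-down entry with only id, name, and publicBadge. All names ending
in `Sketch` are hypothetical, and the deriver is injected so the example
stays self-contained.

```typescript
type ChipBadge = 'flowing' | 'reduced' | 'offline' | 'disputed';
const CHIP_BADGES: readonly ChipBadge[] = ['flowing', 'reduced', 'offline', 'disputed'];

// Anything that is not a known badge collapses to 'disputed'.
function toChipBadge(value: unknown): ChipBadge {
  return CHIP_BADGES.find((b) => b === value) ?? 'disputed';
}

// Guarded label: never calls charAt on undefined, which was the crash.
function badgeLabelSketch(badge: unknown): string {
  const safe = toChipBadge(badge);
  return safe.charAt(0).toUpperCase() + safe.slice(1);
}

interface ProjectedPipelineSketch {
  id: string;
  name: string;
  publicBadge: ChipBadge;
}

// Coerces a raw bootstrap entry (unknown) into a render-safe shape and
// attaches the badge the raw JSON does not carry.
function projectRawPipelineSketch(
  raw: unknown,
  derive: (evidence: unknown) => ChipBadge,
): ProjectedPipelineSketch {
  const record = (raw !== null && typeof raw === 'object' ? raw : {}) as Record<string, unknown>;
  const id = record['id'];
  const name = record['name'];
  return {
    id: typeof id === 'string' ? id : '',
    name: typeof name === 'string' ? name : '',
    publicBadge: toChipBadge(derive(record['evidence'])),
  };
}
```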

## Regression tests

tests/pipeline-panel-bootstrap.test.mts (NEW) — 6 tests that exercise
the bootstrap-first-paint path end-to-end:

- Every gas + oil curated entry produces a valid badge.
- Raw entries never ship with pre-computed publicBadge (contract guard
  on the seed data format).
- Deriver never throws on undefined/null/{} evidence (was the crash).
- Nord Stream 1 regression check (offline + paperwork → offline).
- Druzhba-South staleness behavior (reduced when fresh, disputed after
  60 days without update).

38/38 tests now pass (17 registry + 15 deriver + 6 new bootstrap-path).
Typecheck clean on both configs.

## Invariant preserved

The server handler and the panel render identical badges because:
1. Same pure function (imported from the same module).
2. Same deterministic rules, same staleness window.
3. Same bootstrap data read by both paths (Redis → either bootstrap
   payload or RPC response).

No UI flicker on hydration.

* fix(energy): three PR-review P2s on PipelineStatusPanel + aggregators

## P2-1 — sanitizeUrl on external evidence links (XSS hardening)

Sanction-ref URLs and operator-statement URLs were interpolated with
escapeHtml only. HTML-escaping blocks tag injection but NOT javascript:
or data: URL schemes, so a bad URL in the seeded registry would execute
in-app when a reader clicked the evidence link. Every other panel in
the codebase (NewsPanel, GdeltIntelPanel, GeoHubsPanel, AirlineIntelPanel,
MonitorPanel) uses sanitizeUrl for this exact reason.

Fix: import sanitizeUrl from @/utils/sanitize and route both hrefs
through it. sanitizeUrl() drops non-http(s) schemes + returns '' on
invalid URLs. The renderer now suppresses the <a> entirely when
sanitize rejects — the date label still renders as plain text instead
of becoming an executable link.
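
For illustration, the behavior described here can be approximated with
the standard URL parser. This is a sketch, not the sanitizeUrl exported
from @/utils/sanitize: `sanitizeUrlSketch`, `escapeHtmlSketch`, and
`renderEvidenceLinkSketch` are hypothetical stand-ins.

```typescript
// Accepts only http(s) URLs; returns '' for anything else or unparsable input.
function sanitizeUrlSketch(raw: unknown): string {
  if (typeof raw !== 'string') return '';
  try {
    const url = new URL(raw.trim());
    return url.protocol === 'http:' || url.protocol === 'https:' ? url.href : '';
  } catch {
    return '';
  }
}

function escapeHtmlSketch(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Emits an anchor only when the URL survives sanitizing; otherwise the
// label renders as plain text.
function renderEvidenceLinkSketch(label: string, rawUrl: unknown): string {
  const href = sanitizeUrlSketch(rawUrl);
  const text = escapeHtmlSketch(label);
  if (!href) return text;
  return `<a href="${escapeHtmlSketch(href)}" target="_blank" rel="noopener noreferrer">${text}</a>`;
}
```

Escaping alone leaves `javascript:alert(1)` untouched because the string
contains no HTML metacharacters, which is why the scheme check is needed.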

## P2-2 — loadDetail catch path missing stale-response guard

The success path at loadDetail() checked `this.selectedId !== pipelineId`
to suppress stale responses when the user has clicked another pipeline
mid-flight. The catch path at line 219 had no such guard: if the user
clicked A, then B, and A's request failed before B resolved, A's error
handler cleared detailLoading and detail, showing "Pipeline detail
unavailable" for B's drawer even though B was still loading.

Fix: mirror the same `if (this.selectedId !== pipelineId) return` guard
in the catch path. The newer request now owns the drawer state
regardless of which path (success OR failure) the older one took.
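
A reduced model of the guard, assuming the drawer state is just four
fields. `DrawerStateSketch` and `DetailFetcher` are hypothetical; the
real panel holds more state and re-renders after each transition.

```typescript
type DetailFetcher = (pipelineId: string) => Promise<string>;

class DrawerStateSketch {
  selectedId: string | null = null;
  detail: string | null = null;
  detailLoading = false;
  error: string | null = null;
  private readonly fetchDetail: DetailFetcher;

  constructor(fetchDetail: DetailFetcher) {
    this.fetchDetail = fetchDetail;
  }

  async loadDetail(pipelineId: string): Promise<void> {
    this.selectedId = pipelineId;
    this.detail = null;
    this.error = null;
    this.detailLoading = true;
    try {
      const detail = await this.fetchDetail(pipelineId);
      // Stale success: a newer click owns the drawer now.
      if (this.selectedId !== pipelineId) return;
      this.detail = detail;
      this.detailLoading = false;
    } catch {
      // Stale failure: same guard, so A's error cannot clobber B's loading state.
      if (this.selectedId !== pipelineId) return;
      this.error = 'Pipeline detail unavailable';
      this.detailLoading = false;
    }
  }
}
```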

## P2-3 — always-gas-preference aggregator for classifierVersion + fetchedAt

Three call sites (list-pipelines.ts handler, get-pipeline-detail.ts
handler, PipelineStatusPanel bootstrap projection) computed aggregate
classifier version and fetchedAt by `gas?.x || oil?.x || fallback`.
That was defensible when a single seed-pipelines.mjs wrote both keys
atomically (fix commit 29b4ac78f split this into two separate Railway
cron entry points). Now gas + oil cron independently, so mixed-version
(gas=v1, oil=v2 during classifier rollout) and mixed-timestamp (oil
refreshed 6h after gas) windows are the EXPECTED state, not the
exceptional one. The comment in list-pipelines.ts even said "pick the
newest classifier version" but the code didn't actually compare.

Fix: add two shared helpers in src/shared/pipeline-evidence.ts —

- pickNewerClassifierVersion(a,b) — parses /^v(\d+)$/ and returns the
  higher-numbered version; falls back to lexicographic for non-v-
  prefixed values; handles single-missing inputs.
- pickNewerIsoTimestamp(a,b) — Date.parse()-compares and returns the
  later ISO; handles missing / malformed inputs gracefully.

Both server RPCs and the panel bootstrap projection now call these
helpers identically, so clients are told the truth about version +
freshness during partial rollouts.
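
A sketch of the two pickers as described, with `Sketch`-suffixed names to
mark them as illustrations rather than the shipped helpers. Tie-breaking
toward the first argument and the lexicographic comparison for mixed
inputs (one v-numbered, one not) are assumptions.

```typescript
const V_NUMBER_RE = /^v(\d+)$/;

function pickNewerClassifierVersionSketch(
  a?: string | null,
  b?: string | null,
): string {
  if (!a && !b) return 'v1'; // default when both are missing
  if (!a) return b ?? 'v1';
  if (!b) return a;
  const parsedA = V_NUMBER_RE.exec(a);
  const parsedB = V_NUMBER_RE.exec(b);
  if (parsedA && parsedB) {
    // Numeric compare, so v10 beats v9.
    return Number(parsedA[1]) >= Number(parsedB[1]) ? a : b;
  }
  return a >= b ? a : b; // lexicographic fallback
}

function pickNewerIsoTimestampSketch(
  a?: string | null,
  b?: string | null,
): string {
  const timeA = a ? Date.parse(a) : Number.NaN;
  const timeB = b ? Date.parse(b) : Number.NaN;
  const validA = !Number.isNaN(timeA);
  const validB = !Number.isNaN(timeB);
  if (!validA && !validB) return '';
  if (!validA) return b ?? '';
  if (!validB) return a ?? '';
  return timeA >= timeB ? (a ?? '') : (b ?? '');
}
```

Numeric comparison matters once versions reach two digits: a plain string
compare would rank 'v9' above 'v10'.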

## Tests

Extended tests/pipeline-evidence-derivation.test.mts with 8 new
assertions covering both pickers:

- Higher v-number wins regardless of order (v1 vs v2 → v2 both ways)
- Single-missing falls back to the one present
- Missing + missing → default 'v1' for version / '' for ts
- Non-v-numbered values fall back to lexicographic
- Explicit regression: "gas=v1 + oil=v2 during rollout" returns v2
- Explicit regression: "oil fresher than gas" returns the oil timestamp

38 → 46 tests. All pass. Typecheck clean on both configs.

* feat(energy): DeckGL PathLayer colored by evidence-derived badge + map↔panel link

Day 8b of the Energy Atlas plan. Pipelines now render on the main
DeckGL map of the energy variant colored by their derived publicBadge,
and clicking a pipeline on the map opens the same evidence drawer the
panel row-click opens.

## Why this commit

Day 8 shipped the PipelineStatusPanel as a table + drawer view.
Review findings aside (fixed in 149d33ec3 + db52965cd), a
table-only pipeline view is a weak product compared to the map-centric
atlas it's meant to rival. The map-layer differentiation is the whole
point of the feature.

## What this adds

src/components/DeckGLMap.ts:
- New createEnergyPipelinesLayer() — reads hydrated pipeline registries
  via getHydratedData, projects raw JSON through the shared deriver
  (src/shared/pipeline-evidence.ts), renders a DeckGL PathLayer colored
  by publicBadge:
    flowing  → green (46,204,113)
    reduced  → amber (243,156,18)
    offline  → red   (231,76,60)
    disputed → purple (155,89,182)
  Offline + disputed get thicker strokes (3px vs 2px) for at-a-glance
  surfacing of disrupted assets. Geometry comes from raw startPoint +
  waypoints[] + endPoint per asset (straight line when no waypoints).
- Branching at line ~1498: SITE_VARIANT === 'energy' routes to the
  new method; other variants keep the static PIPELINES config (colored
  by oil/gas type). Existing commodity/finance/full map layers are
  untouched — no cross-variant leakage.
- onClick handler emits `energy:open-pipeline-detail` as a window
  CustomEvent with { pipelineId }. Loose coupling: the map doesn't
  import the panel, the panel doesn't import the map.
- Fallback: if bootstrap hasn't hydrated yet, createEnergyPipelinesLayer
  falls back to the static createPipelinesLayer() so the pipelines
  toggle always shows *something*.

src/components/PipelineStatusPanel.ts:
- Constructor registers a window event listener for
  'energy:open-pipeline-detail' → calls this.loadDetail(pipelineId) →
  drawer opens on the clicked asset. Map click and row click converge
  on the same drawer, same evidence view.
- destroy() removes the listener to prevent ghost handlers after panel
  unmount.
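
The badge styling and geometry rules for createEnergyPipelinesLayer() can
be sketched without deck.gl itself. The RGB values and stroke widths are
the ones listed above; `PointSketch` with lat/lon fields and the function
names are assumptions. The [lon, lat] output order follows deck.gl's
position convention.

```typescript
type LayerBadge = 'flowing' | 'reduced' | 'offline' | 'disputed';
type Rgb = [number, number, number];
type LonLat = [number, number];

const BADGE_COLORS: Record<LayerBadge, Rgb> = {
  flowing: [46, 204, 113],  // green
  reduced: [243, 156, 18],  // amber
  offline: [231, 76, 60],   // red
  disputed: [155, 89, 182], // purple
};

// Disrupted assets get the thicker stroke.
function strokeWidthPxSketch(badge: LayerBadge): number {
  return badge === 'offline' || badge === 'disputed' ? 3 : 2;
}

interface PointSketch {
  lat: number;
  lon: number;
}

// startPoint + waypoints[] + endPoint; a straight line when no waypoints exist.
function buildPathSketch(
  startPoint: PointSketch,
  endPoint: PointSketch,
  waypoints: PointSketch[] = [],
): LonLat[] {
  return [startPoint, ...waypoints, endPoint].map((p): LonLat => [p.lon, p.lat]);
}
```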

## Guarantees

- Bootstrap parity: the DeckGL layer calls the SAME derivePipelinePublicBadge
  as the panel and the server handler, so the map color, the table row
  chip, and the RPC response all agree on the badge. No flicker, no
  drift, no confused user.
- Variant isolation: only SITE_VARIANT === 'energy' triggers the new
  path. Commodity / finance / full map layers untouched.
- No cross-component import: the panel doesn't reference the map class
  and vice versa. The event contract is the only coupling — testable,
  swappable, Tauri-safe (guarded with `typeof window !== 'undefined'`).

Typecheck clean. PR #3294 now has 8 commits.

Follow-up backlog:
- Add waypoints[] to the curated pipelines-{gas,oil}.json so the map
  draws real routes instead of straight lines (cosmetic; does not
  affect correctness).
- Tooltip case in the picking tooltip registry (line ~3748) so hover
  shows "Nord Stream 1 · OFFLINE" before click.

* fix(energy): three PR-review findings on Day 8b DeckGL integration

## P1 — getHydratedData single-use race between map + panel

src/services/bootstrap.ts:34 — `if (val !== undefined) hydrationCache.delete(key);`
The helper drains its slot on first read. Day 8 (PipelineStatusPanel) and
Day 8b (createEnergyPipelinesLayer) BOTH call getHydratedData('pipelinesGas')
and getHydratedData('pipelinesOil') — whoever renders first drains the cache
and forces the loser onto its fallback path (panel → RPC, map → static
PIPELINES layer). The commit's "shared bootstrap-backed data" guarantee
did not actually hold.

Fix: new src/shared/pipeline-registry-store.ts that reads once and memoizes.
Both consumers read through getCachedPipelineRegistries() — same data, same
reference, unlimited re-reads. When the panel's background RPC fetch lands,
it calls setCachedPipelineRegistries() to back-propagate fresh data into
the store so the map's next re-render sees the newer classifierVersion +
fetchedAt too (no map/panel drift during classifier rollouts).

Test-only injection hook (__setBootstrapReaderForTests) makes the drain-once
semantics observable without a real bootstrap payload.
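
A sketch of the drain-once, read-many idea, assuming the bootstrap reader
is injected. `createPipelineRegistryStoreSketch` and its get/set methods
are hypothetical; the shipped store exposes getCachedPipelineRegistries()
and setCachedPipelineRegistries() and may be structured differently.

```typescript
type RegistryKey = 'pipelinesGas' | 'pipelinesOil';
type BootstrapReaderSketch = (key: RegistryKey) => unknown;

interface RegistriesSketch {
  gas?: unknown;
  oil?: unknown;
}

function createPipelineRegistryStoreSketch(read: BootstrapReaderSketch) {
  let cache: RegistriesSketch = {};
  let drained = false;

  return {
    // First call drains the single-use bootstrap slots; later calls reuse the memo.
    get(): RegistriesSketch {
      if (!drained) {
        drained = true;
        // Slots already filled by an RPC response are not overwritten.
        if (cache.gas === undefined) cache.gas = read('pipelinesGas');
        if (cache.oil === undefined) cache.oil = read('pipelinesOil');
      }
      return cache;
    },
    // RPC back-propagation: a partial update keeps the other commodity.
    set(update: RegistriesSketch): void {
      cache = {
        gas: update.gas ?? cache.gas,
        oil: update.oil ?? cache.oil,
      };
    },
  };
}
```

Map and panel would share one store instance, so whichever renders first
pays the drain and the other still sees the same registries.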

## P2 — pipelines-layer tooltip regresses to blank label on energy variant

src/components/DeckGLMap.ts:3748 (pipelines-layer tooltip case) still assumed
the static-config shape (obj.type). The new energy layer emits objects with
commodityType + badge fields, so the tooltip's type-ternary fell through to
the generic fallback — hover rendered " pipeline" (empty leading commodity)
instead of "Nord Stream 1 · OFFLINE".

Fix: differentiate by presence of obj.badge (only the energy layer sets it).
On the energy variant, tooltip now reads name + commodity + badge. Static-
config variants (commodity / finance / full) keep their existing format
unchanged.

## P2 — createEnergyPipelinesLayer dropped highlightedAssets behavior

The static createPipelinesLayer() reads this.highlightedAssets.pipeline and
threads it into getColor / getWidth with an updateTrigger on the signature.
Any caller using flashAssets('pipeline', [...]) or highlightAssets([...])
gets a visible red-outline flash on the matching paths. My Day 8b energy
layer ignored the set entirely — those APIs silently no-op'd on the energy
variant.

Fix: createEnergyPipelinesLayer() now reads the same highlight set, applies
HIGHLIGHT_COLOR + wider stroke to matching IDs, and wires
updateTriggers: { getColor: sig, getWidth: sig } so DeckGL actually
recomputes when the set changes.

Also removed the unnecessary layerCache.set() in the energy path: the
store can update via RPC back-propagation, and a cache keyed only on
highlight-signature would serve stale data. With ~25 critical-asset
pipelines, rebuild per render is trivial.

## Tests

tests/pipeline-registry-store.test.mts (NEW) — 5 tests covering the
drain-once read-many invariant: multiple consumers get cached data
without re-draining, RPC back-propagation updates the source, partial
updates preserve the other commodity, and pure RPC-first (no bootstrap)
works without invoking the reader.

All 51 PR tests pass. Typecheck clean on both configs.

* feat(energy): Day 9 — storage facility registry (UGS + SPR + LNG + crude hubs)

Ships 21 critical strategic storage facilities as a curated registry, same
evidence-bundle pattern as the pipeline registries from Day 6:

- scripts/data/storage-facilities.json — 4 UGS + 4 SPR + 6 LNG export +
  3 LNG import + 4 crude tank farms. Each carries physicalState +
  sanctionRefs + classifierVersion/Confidence + fillDisclosed/fillSource.
- scripts/_storage-facility-registry.mjs — shared helpers (validator,
  builder, canonical key, MAX_STALE_MIN). Validator enforces facility-type
  × capacity-unit pairing (ugs→TWh, spr/tank-farm→Mb, LNG→Mtpa) and the
  non-operational badge ⇒ evidence invariant.
- scripts/seed-storage-facilities.mjs — single runSeed entry (only one
  key, so no split-seeder dance needed).
- Registered in the 4-file bootstrap checklist: cache-keys.ts
  (STORAGE_FACILITIES_KEY + BOOTSTRAP_CACHE_KEYS + BOOTSTRAP_TIERS),
  api/bootstrap.js (KEYS + SLOW_KEYS), api/health.js (BOOTSTRAP_KEYS +
  SEED_META, 14d threshold = 2× weekly cron), api/seed-health.js (mirror).
- tests/bootstrap.test.mjs PENDING_CONSUMERS adds storageFacilities —
  Day 10 StorageFacilityMapPanel will remove it.
- tests/storage-facilities-registry.test.mts — 20 tests covering schema,
  identity, geometry, type×capacity pairing, evidence contract, and
  negative-input validator rejection.

Registry fields are slow-moving; badge derivation happens at read-time
server-side once the RPC handler lands in Day 10 (panel + deckGL
ScatterplotLayer). Seeded data is live in Redis from this commit so the
Day 10 PR only adds display surfaces.

Tests: 56 pass (36 prior + 20 new). Typecheck + typecheck:api clean.

* feat(energy): Day 10 — storage atlas (ListStorageFacilities RPC + DeckGL ScatterplotLayer + panel)

End-to-end wiring for the strategic storage registry seeded in Day 9. Same
pattern as the pipeline shipping path (Days 7+8+8b): proto → handler →
shared evidence deriver → panel → DeckGL map layer, with a shared
read-once store keeping map + panel aligned.

Proto + generated code:
- list_storage_facilities.proto: ListStorageFacilities +
  GetStorageFacilityDetail messages with StorageFacilityEntry,
  StorageEvidence, StorageSanctionRef, StorageOperatorStatement,
  StorageLatLon, StorageFacilityRevisionEntry.
- service.proto wires both RPCs under /api/supply-chain/v1.
- make generate → regenerated client + server stubs + OpenAPI.

Server handlers:
- src/shared/storage-evidence.ts: shared pure deriver. Duck-typed input
  interface avoids generated-type deps; identical rules to the pipeline
  deriver (sanction/commercial paperwork vs external-signal-only offline,
  14d staleness window, version pin).
- _storage-evidence.ts: thin re-export for server handler import ergonomics.
- list-storage-facilities.ts: reads STORAGE_FACILITIES_KEY from Upstash,
  projects raw → wire format, attaches derived publicBadge, filters by
  optional facilityType query arg.
- get-storage-facility-detail.ts: single-asset lookup for drawer.
- handler.ts registers both new methods.
- gateway.ts: both routes → 'static' cache tier (registry is near-static).

Panel + map:
- src/shared/storage-facility-registry-store.ts: drain-once memo mirroring
  pipeline-registry-store. Both panel and DeckGL layer read through this
  so the single-use getHydratedData drain doesn't race between consumers.
  RPC back-propagation via setCachedStorageFacilityRegistry() keeps map ↔
  panel on the same classifierVersion during rollouts.
- StorageFacilityMapPanel.ts: table + evidence drawer. Bootstrap hot path
  projects raw registry through same deriver as server so first-paint
  badge matches post-RPC badge (no flicker). sanitizeUrl + stale-response
  guards (success + catch paths) carried over from PipelineStatusPanel.
- DeckGLMap.ts createEnergyStorageLayer(): ScatterplotLayer keyed on
  badge color; log-scale radius (6km–26km) keeps Rehden visible next to
  Ras Laffan. Click dispatches 'energy:open-storage-facility-detail' —
  panel listens and opens its drawer (loose coupling, no direct refs).
- Tooltip branch on storage-facilities-layer shows facility type, country,
  capacity unit, and badge.
- Added 'storageFacilities' optional field to MapLayers type (optional so
  existing variant literals across commodity/finance/tech/happy/full/etc.
  don't need touching). Wired into LAYER_REGISTRY + VARIANT_LAYER_ORDER.energy
  + ENERGY_MAP_LAYERS + ENERGY_MOBILE_MAP_LAYERS. Panel entry added to
  ENERGY_PANELS + panel-layout createPanel. PENDING_CONSUMERS entry from
  Day 9 removed — panel + map layer are now real consumers.

Tests:
- storage-evidence-derivation.test.mts (17 tests): covers every curated
  facility yields a valid badge, null/malformed input never throws,
  offline sanction/commercial/operator rules, external-signal-only offline
  → disputed, staleness demotion.
- storage-facility-registry-store.test.mts (4 tests): drain-once, no-data
  drain, RPC update, pure-RPC-first path.

All 6,426 unit tests pass. Typecheck + typecheck:api clean. Pre-existing
src-tauri/sidecar/ test failure is unrelated (no diff touches src-tauri/).

* feat(energy): Day 11 — fuel-shortage registry schema + seed + RPC (classifier post-launch)

Ships v1 of the global fuel-shortage alert registry. Severity is the
CLASSIFIER OUTPUT (confirmed/watch), not a client derivation — we ship
the evidence alongside so readers can audit the grounds. v1 is seeded
from curated JSON; post-launch the proactive-intelligence classifier
(Day 12 work) extends the same key directly.

Data:
- scripts/data/fuel-shortages.json — 15 known active shortages
  (PK, LK, NG×2, CU, VE, LB, ZW, AR, IR, BO, KE, PA, EG, BY)
  spanning petrol/diesel/jet across confirmed + watch tiers. Each entry
  carries evidenceSources[] (regulator/operator/press), firstSeen,
  lastConfirmed, resolvedAt, impactTypes[], causeChain[], classifier
  version + confidence. Confirmed severity enforces authoritative
  evidence at schema level.

Seeder:
- scripts/_fuel-shortage-registry.mjs — shared validator (enforces
  iso2 country, enum products/severities/impacts/causes, authoritative
  evidence for confirmed). MIN_SHORTAGES=10.
- scripts/seed-fuel-shortages.mjs — single runSeed entry.
- Registered in seed-bundle-energy-sources.mjs at DAY cadence (shortages
  move faster than registry assets).
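
The validator contract above could look roughly like the sketch below.
MIN_SHORTAGES=10 and the "confirmed needs authoritative evidence" rule
come from this commit; the entry shape, the `sourceType` field name, and
the choice of which source types count as authoritative are assumptions.

```typescript
// Hypothetical entry shape; only the rules themselves are from the commit text.
type Severity = "confirmed" | "watch";
type SourceType = "regulator" | "operator" | "press";

interface ShortageEntry {
  country: string;
  severity: Severity;
  evidenceSources: { sourceType: SourceType; url: string }[];
}

const MIN_SHORTAGES = 10; // floor stated in the commit text
// Assumption: regulator and operator sources are authoritative, press is not.
const AUTHORITATIVE: ReadonlySet<SourceType> = new Set<SourceType>([
  "regulator",
  "operator",
]);

function validateShortages(entries: ShortageEntry[]): string[] {
  const errors: string[] = [];
  if (entries.length < MIN_SHORTAGES) {
    errors.push(`expected at least ${MIN_SHORTAGES} entries, got ${entries.length}`);
  }
  entries.forEach((entry, i) => {
    if (!/^[A-Z]{2}$/.test(entry.country)) {
      errors.push(`entry ${i}: country must be an ISO2 code`);
    }
    const hasAuthoritative = entry.evidenceSources.some((s) =>
      AUTHORITATIVE.has(s.sourceType),
    );
    if (entry.severity === "confirmed" && !hasAuthoritative) {
      errors.push(`entry ${i}: confirmed severity needs an authoritative source`);
    }
  });
  return errors;
}
```

Returning a list of messages rather than throwing on the first failure
lets a seeder log every problem in the curated JSON in one run.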

Bootstrap 4-file registration:
- cache-keys.ts: FUEL_SHORTAGES_KEY + BOOTSTRAP_CACHE_KEYS + BOOTSTRAP_TIERS.
- api/bootstrap.js: KEYS + SLOW_KEYS.
- api/health.js: BOOTSTRAP_KEYS + SEED_META (2880min = 2× daily cron).
- api/seed-health.js: mirrors intervalMin=1440.

Proto + RPC:
- list_fuel_shortages.proto: ListFuelShortages (country/product/severity
  query facets) + GetFuelShortageDetail messages with FuelShortageEntry,
  FuelShortageEvidence, FuelShortageEvidenceSource.
- service.proto wires both new RPCs under /api/supply-chain/v1.
- list-fuel-shortages.ts handler projects raw → wire format, supports
  server-side country/product/severity filtering.
- get-fuel-shortage-detail.ts single-shortage lookup.
- handler.ts registers both. gateway.ts: 'medium' cache-tier (daily
  classifier updates warrant moderate freshness).

Shared evidence helper:
- src/shared/shortage-evidence.ts: deriveShortageEvidenceQuality maps
  (confidence + authoritative-source count + freshness) → 'strong' |
  'moderate' | 'thin' for client-side sort/trust indicators. Does NOT
  change severity — classifier owns that decision.
- countEvidenceSources buckets sources for the drawer's "n regulator /
  m press" line.
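
As an illustration of the source bucketing behind the "n regulator /
m press" line, a sketch under assumed names: `bucketEvidenceSources`,
`formatBucketLine`, and the `sourceType` field are hypothetical and are
not the helper's actual signature.

```typescript
type SourceType = "regulator" | "operator" | "press";

// Counts evidence sources per type; every bucket starts at zero so the
// result always has all three keys.
function bucketEvidenceSources(
  sources: { sourceType: SourceType }[],
): Record<SourceType, number> {
  const counts: Record<SourceType, number> = { regulator: 0, operator: 0, press: 0 };
  for (const source of sources) counts[source.sourceType] += 1;
  return counts;
}

// Renders only the non-empty buckets, in regulator / operator / press order.
function formatBucketLine(counts: Record<SourceType, number>): string {
  return (Object.keys(counts) as SourceType[])
    .filter((key) => counts[key] > 0)
    .map((key) => `${counts[key]} ${key}`)
    .join(" / ");
}
```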

Tests:
- tests/fuel-shortages-registry.test.mts (19 tests): schema, identity,
  enum coverage, evidence contract (confirmed → authoritative source),
  validateRegistry negative cases.
- tests/shortage-evidence.test.mts (10 tests): quality deriver edge
  cases, source bucketing.
- tests/bootstrap.test.mjs: fuelShortages added to PENDING_CONSUMERS;
  the entry is removed when FuelShortagePanel arrives on Day 12.

Typecheck + typecheck:api clean. 64 tests pass.

* feat(energy): Day 12 — FuelShortagePanel + DeckGL shortage pins

End-to-end wiring of the fuel-shortage registry shipped in Day 11: panel
on the Energy variant page, ScatterplotLayer pins on the DeckGL map,
both reading through a shared single-drain store so they don't race on
the bootstrap cache.

Panel:
- src/components/FuelShortagePanel.ts — table sorted by severity (confirmed
  first) then evidence quality (strong → thin) then most-recent lastConfirmed.
  Drawer shows short description, first-seen / last-confirmed / resolved,
  impact types, cause chain, classifier version/confidence, and a typed
  evidence-source list with regulator/operator/press chips. sanitizeUrl on
  every href so classifier-ingested URLs can't render as javascript:. Same
  stale-response guards on success + catch paths as the other detail drawers.
- Consumes deriveShortageEvidenceQuality for client-side trust indicator
  (three-dot ●●● / ●●○ / ●○○), NOT for severity — severity is classifier
  output.
- Registered in ENERGY_PANELS + panel-layout.ts + components barrel.
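
The three-level table ordering described above can be expressed as a
single comparator. This is a sketch only: the `Row` shape and the rank
tables are assumptions, and `lastConfirmed` is assumed to be an
ISO-8601 date string.

```typescript
type Severity = "confirmed" | "watch";
type Quality = "strong" | "moderate" | "thin";

interface Row {
  id: string;
  severity: Severity;
  quality: Quality;
  lastConfirmed: string; // assumed ISO-8601 date
}

const SEVERITY_RANK: Record<Severity, number> = { confirmed: 0, watch: 1 };
const QUALITY_RANK: Record<Quality, number> = { strong: 0, moderate: 1, thin: 2 };

// Severity first, then evidence quality, then newest lastConfirmed.
// Each `||` falls through to the next key only when the previous one ties (0).
function compareRows(a: Row, b: Row): number {
  return (
    SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
    QUALITY_RANK[a.quality] - QUALITY_RANK[b.quality] ||
    Date.parse(b.lastConfirmed) - Date.parse(a.lastConfirmed)
  );
}
```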

Shared store:
- src/shared/fuel-shortage-registry-store.ts — same drain-once memoize
  pattern as pipeline- and storage-facility-registry-store. Both the
  panel and the DeckGL shortage-pins layer read through it.
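
A minimal sketch of the drain-once memoize pattern, assuming the
bootstrap cache hands out its value exactly once. `createDrainOnceStore`
and its method names are hypothetical, not the store's real API.

```typescript
// Hypothetical drain-once store: the first caller drains the one-shot
// bootstrap cache and every later caller gets the memoized value.
function createDrainOnceStore<T>(drain: () => T | null) {
  let drained = false;
  let cached: T | null = null;
  return {
    get(): T | null {
      if (!drained) {
        drained = true; // mark first, so a null result is not re-drained
        cached = drain();
      }
      return cached;
    },
    // An RPC response replaces whatever the bootstrap drain produced and
    // also suppresses any later drain attempt.
    set(value: T): void {
      drained = true;
      cached = value;
    },
  };
}
```

With one store shared by the panel and the map layer, whichever
consumer reads first performs the drain and the other sees the same
value instead of an empty cache.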

DeckGL layer:
- DeckGLMap.createEnergyShortagePinsLayer: ScatterplotLayer placing one
  pin per active shortage at the country centroid (via getCountryCentroid
  from services/country-geometry). Stacking offset (~0.8° lon) when
  multiple shortages share a country so Nigeria's petrol + diesel don't
  render as a single dot. Confirmed pins 55km radius; watch 38km. Click
  dispatches 'energy:open-fuel-shortage-detail' — panel listens.
- Tooltip branch on fuel-shortages-layer: country · product · short
  description · severity.
- Layer registered in LAYER_REGISTRY, VARIANT_LAYER_ORDER.energy,
  ENERGY_MAP_LAYERS, ENERGY_MOBILE_MAP_LAYERS. MapLayers.fuelShortages
  is optional on the type so other variants' literals remain valid.
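
The stacking offset can be sketched as below, assuming `[lon, lat]`
positions and a fixed 0.8° step per additional pin in the same country.
`placePins` and the plain centroid lookup table are hypothetical
stand-ins for the layer code and getCountryCentroid.

```typescript
const STACK_OFFSET_LON_DEG = 0.8; // approximate step from the commit text

interface Pin {
  id: string;
  country: string;
}

interface PlacedPin {
  id: string;
  position: [number, number]; // assumed [lon, lat] order
}

function placePins(
  pins: Pin[],
  centroids: Record<string, [number, number]>,
): PlacedPin[] {
  const seenPerCountry = new Map<string, number>();
  const placed: PlacedPin[] = [];
  for (const pin of pins) {
    const centroid = centroids[pin.country];
    if (!centroid) continue; // no known centroid: skip rather than guess
    const index = seenPerCountry.get(pin.country) ?? 0;
    seenPerCountry.set(pin.country, index + 1);
    // The n-th pin in a country is shifted east by n steps.
    placed.push({
      id: pin.id,
      position: [centroid[0] + index * STACK_OFFSET_LON_DEG, centroid[1]],
    });
  }
  return placed;
}
```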

Tests:
- tests/fuel-shortage-registry-store.test.mts (4 tests): drain-once,
  no-data, RPC back-prop, pure-RPC-first path.
- tests/bootstrap.test.mjs — fuelShortages removed from PENDING_CONSUMERS.

Typecheck + typecheck:api clean. 39 tests pass (plus full suite in pre-push).

* feat(energy): Day 13 — energy disruption event log + asset timeline drawer

Ships the energy:disruptions:v1 registry that threads together pipelines
and storage facilities: state transitions (sabotage, sanction, maintenance,
mechanical, weather, commercial, war) keyed by assetId so any asset's
drawer can render its history without a second registry lookup.

Data + seeder:
- scripts/data/energy-disruptions.json — 12 curated events spanning
  Nord Stream 1/2 sabotage, Druzhba sanctions, CPC force majeure,
  TurkStream maintenance, Yamal halt, Rehden trusteeship, Arctic LNG 2
  sanction, ESPO drone strikes, BTC fire (historical), Sabine Pass
  Hurricane Beryl, Power of Siberia ramp. Each event links back to a
  seeded asset.
- scripts/_energy-disruption-registry.mjs — validator enforces valid
  assetType/eventType/cause enums, http(s) sources, startAt ≤ endAt,
  MIN_EVENTS=8.
- scripts/seed-energy-disruptions.mjs — runSeed entry (weekly cron).
- Bundle entry at 7×DAY cadence.

Bootstrap 4-file registration (cache-keys.ts + bootstrap.js + health.js +
seed-health.js) — energyDisruptions in PENDING_CONSUMERS because panel
drawers fetch lazily via RPC on drawer-open rather than hydrating from
bootstrap directly.

Proto + handler:
- list_energy_disruptions.proto: ListEnergyDisruptions with
  assetId / assetType / ongoingOnly query facets. Returns events sorted
  newest-first.
- list-energy-disruptions.ts projects raw → wire format, supports all
  three query facets.
- Registered in handler.ts. gateway.ts: 'medium' cache tier.
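
A sketch of the three query facets plus the newest-first ordering. The
event shape is hypothetical, and two points are assumptions rather than
facts from this commit: "ongoing" is taken to mean a null `endAt`, and
"newest" is taken to mean latest `startAt`.

```typescript
type AssetType = "pipeline" | "storage";

interface DisruptionEvent {
  id: string;
  assetId: string;
  assetType: AssetType;
  startAt: string; // assumed ISO-8601
  endAt: string | null;
}

interface DisruptionQuery {
  assetId?: string;
  assetType?: AssetType;
  ongoingOnly?: boolean;
}

function listDisruptions(
  events: DisruptionEvent[],
  query: DisruptionQuery,
): DisruptionEvent[] {
  return events
    .filter((e) => !query.assetId || e.assetId === query.assetId)
    .filter((e) => !query.assetType || e.assetType === query.assetType)
    // Assumption: an event without an end date is still ongoing.
    .filter((e) => !query.ongoingOnly || e.endAt === null)
    // filter() returned a fresh array, so sorting does not mutate the input.
    .sort((a, b) => Date.parse(b.startAt) - Date.parse(a.startAt));
}
```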

Shared timeline helper:
- src/shared/disruption-timeline.ts — pure formatters (formatEventWindow,
  formatCapacityOffline, statusForEvent). No generated-type deps so
  PipelineStatusPanel + StorageFacilityMapPanel import the same helpers
  and render the timeline identically.

Panel integration:
- PipelineStatusPanel.loadDetail now fetches getPipelineDetail +
  listEnergyDisruptions({assetId, assetType:'pipeline'}) in parallel.
  Drawer gains "Disruption timeline (N)" section with event type, date
  window, capacity offline, cause chain, and short description per entry.
- StorageFacilityMapPanel gets identical treatment with assetType='storage'.
- Both reset detailEvents on closeDetail and on fresh click (stale-response
  safety).

Tests:
- tests/energy-disruptions-registry.test.mts (17 tests): schema, identity,
  enum coverage, evidence, negative inputs.
- tests/bootstrap.test.mjs — energyDisruptions added to PENDING_CONSUMERS.

Typecheck + typecheck:api clean. 51 tests pass locally (plus full suite
in pre-push).

* feat(energy): Day 14 — country drill-down Atlas exposure section

Extends CountryDeepDivePanel's existing "Energy Profile" card with a
mini Atlas-exposure section that surfaces per-country exposure to the
new registries we shipped in Days 7-13.

For each country:
- Pipelines touching this country (from, to, or transit) — clickable
  rows that dispatch 'energy:open-pipeline-detail' so the PipelineStatusPanel
  drawer opens on the energy variant; no-op on other variants.
- Storage facilities in this country — same loose-coupling pattern
  with 'energy:open-storage-facility-detail'.
- Active fuel shortages in this country — severity breakdown line
  (N confirmed · M watch) plus clickable rows emitting
  'energy:open-fuel-shortage-detail'.

Silent absence: sections render only when the country has matching
assets/events, so countries with no pipeline, storage, or shortage
touchpoints see the existing energy-profile card unchanged.

Lazy stores: reads go through the same shared drain-once stores
(getCachedPipelineRegistries, getCachedStorageFacilityRegistry,
getCachedFuelShortageRegistry) so CountryDeepDivePanel does NOT race
with Atlas panels over the single-drain bootstrap cache. Dynamic
import() keeps the three stores out of the panel's static import graph
so non-energy variants can tree-shake them.

Typecheck clean. No schema changes; purely additive UI read from
already-shipped registries.

* docs(energy): methodology page for energy disruption event log

Fills the /docs/methodology/disruptions URL referenced by
list_energy_disruptions.proto, scripts/_energy-disruption-registry.mjs,
and the panel attribution footers. Explains scope (state transitions
not daily noise), data shape, what counts as a disruption, classifier
evolution path, RPC contract, and ties into the sibling pipeline +
storage + shortage methodology pages.

No code change; pure docs completion for Week 4 launch polish.

* fix(energy): upstreamUnavailable only fires when Redis returned nothing

Two handlers (list-storage-facilities + list-pipelines) conflated "empty
filter result on a healthy registry" with "upstream unavailable". A
caller who queried one facilityType/commodityType and legitimately got
zero matches was told the upstream was down — which may push clients to
error-state rendering or suppress caching instead of showing a valid
empty list.

list-storage-facilities.ts — upstreamUnavailable now only fires when
`raw` is null (Redis miss). Zero filtered rows on a healthy registry
returns upstreamUnavailable: false + empty array. Matches the sibling
list-fuel-shortages handler and the wire contract in
list_storage_facilities.proto.

list-pipelines.ts — same bug, subtler shape. Now checks "requested at
least one side AND received nothing" rather than "zero rows after
collection". A filter that legitimately matches no gas/oil pipelines on
a healthy registry now returns upstreamUnavailable: false.

list-energy-disruptions.ts and list-fuel-shortages.ts already had the
correct shape (only flag unavailable when raw is missing) — left as-is.

Typecheck + typecheck:api clean. No tests added: the existing registry
schema tests cover the projection/filter helpers, and the handler-level
gating change is documented in code comments for future audits.
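
The gating rule reads roughly as follows in a simplified handler. This
is a sketch with a hypothetical `listFacilities` signature and payload
shape, not the actual handler code.

```typescript
interface Facility {
  id: string;
  facilityType: string;
}

interface ListResponse {
  facilities: Facility[];
  upstreamUnavailable: boolean;
}

function listFacilities(
  raw: { facilities?: Facility[] } | null,
  facilityType?: string,
): ListResponse {
  // Only a missing payload (Redis miss) means the upstream is unavailable.
  if (!raw) return { facilities: [], upstreamUnavailable: true };
  const all = raw.facilities ?? [];
  const facilities = facilityType
    ? all.filter((f) => f.facilityType === facilityType)
    : all;
  // Zero matches on a healthy registry is a valid empty list, not an outage.
  return { facilities, upstreamUnavailable: false };
}
```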

* fix(energy): three Greptile findings on PR #3294

Two P1 filter bugs (resolved shortages rendered as active) and one P2
contract inconsistency on the disruptions handler.

P1: DeckGLMap createEnergyShortagePinsLayer rendered every shortage in
the registry as an active crisis pin — including entries where the
classifier has written resolvedAt to mark the crisis over. Added a
filter so only entries with a null/empty resolvedAt become map pins.
Curated v1 data has resolvedAt=null everywhere so no visible change
today, but the moment the classifier starts writing resolutions
post-launch, resolved shortages would have appeared as ongoing.

P1: CountryDeepDivePanel renderAtlasExposure had the same bug in the
country drill-down — "N confirmed · M watch" counts included resolved
entries, inflating the active-crisis line per country. Same one-line
filter fix.
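
Both P1 fixes reduce to the same predicate. A sketch, assuming
`resolvedAt` is a string, null, or absent; `isActive` and
`severityBreakdown` are hypothetical names.

```typescript
interface Shortage {
  id: string;
  severity: "confirmed" | "watch";
  resolvedAt?: string | null;
}

// null, undefined and the empty string all count as "not resolved".
const isActive = (s: Shortage): boolean => !s.resolvedAt;

// Builds the per-country "N confirmed · M watch" line from active entries only.
function severityBreakdown(shortages: Shortage[]): string {
  const active = shortages.filter(isActive);
  const confirmed = active.filter((s) => s.severity === "confirmed").length;
  const watch = active.length - confirmed;
  return `${confirmed} confirmed · ${watch} watch`;
}
```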

P2: list-energy-disruptions.ts gated upstreamUnavailable on
`!raw?.events` — a partial write (top-level object present but `events`
property missing) fired the "upstream down" flag, inconsistent with
the sibling handlers (list-pipelines, list-storage-facilities,
list-fuel-shortages) that only fire on `!raw`. Rewrote to match:
`!raw` → upstreamUnavailable, empty events → normal empty list. This
also aligns with the contract documented on the upstream-unavailable-
vs-empty-filter skill extracted from the earlier P2 review.

Typecheck + typecheck:api clean. All three fixes are one-liner filter
or gate changes; no test additions needed (registry tests still pass
with v1 data since resolvedAt is null throughout).
2026-04-23 07:34:07 +04:00
Elie Habib
52659ce192 feat(resilience): PR 1 — energy construct repair (flag-gated) (#3289)
* docs(resilience): PR 1 foundation — Option B framing + v2 energy construct spec

First commit in PR 1 of the resilience repair plan. Zero scoring-behaviour
change; sets up the construct contract that the code changes will implement.

Declares the framing decision required by plan section 3.2 before any
scorer code lands: Option B (power-system security) is adopted. Electricity
grids are the dominant short-horizon shock-transmission channel, and the
choice lets the v2 energy indicator set share one denominator (percent of
electricity generation) instead of mixing primary-energy and power-system
measures in a composite.

Methodology doc changes:
  - Energy Domain section now documents both the legacy indicator set
    (still the default) and the v2 indicator set (flag-gated), under a
    single #### Energy H4 heading so the methodology-doc linter still
    asserts dimension-id parity with the registry.
  - v2 indicators: importedFossilDependence (EG.ELC.FOSL.ZS x
    max(EG.IMP.CONS.ZS, 0)), lowCarbonGenerationShare (EG.ELC.NUCL.ZS +
    EG.ELC.RNEW.ZS), powerLossesPct (EG.ELC.LOSS.ZS), reserveMarginPct
    (IEA), euGasStorageStress (renamed + scoped to EU), energyPriceStress
    (retained at 0.15 weight).
  - Retired under v2: electricityConsumption, gasShare, coalShare,
    dependency (all into importedFossilDependence), renewShare.
  - electricityAccess moves from energy to infrastructure under v2.
  - Added a v2.1 changelog section documenting the flag-gated rollout,
    acceptance gates (per plan section 6), and snapshot filenames for
    the post-flag-flip captures.
  - Known-limitations items 1-3 updated to note PR 1 lands the v2
    construct behind RESILIENCE_ENERGY_V2_ENABLED (default off).

Methodology-doc linter + mdx-lint + typecheck all clean. Indicator
registry, seeders, and scorer rewrite land in subsequent commits on
this same branch.

* feat(resilience): PR 1 — RESILIENCE_ENERGY_V2_ENABLED flag + scoreEnergy v2 + registry entries

Second commit in PR 1 of the resilience repair plan. Lands the flag,
the v2 scorer code path, and the registry entries the methodology
doc referenced. Default is flag off; published rankings are unchanged
until the flag flips in a later commit (after seeders land and the
acceptance-gate rerun produces a fresh post-flip snapshot).

Changes:

  - _shared.ts: isEnergyV2Enabled() function reader on the canonical
    RESILIENCE_ENERGY_V2_ENABLED env var. Dynamic read (like
    isPillarCombineEnabled) so tests can flip per-case.

  - _dimension-scorers.ts:
    - New Redis key constants for the three v2 seed keys plus the
      reserved reserveMargin key (seeder deferred per plan §3.1
      open-question).
    - EU_GAS_STORAGE_COUNTRIES set (EU + EFTA + UK) for the renamed
      euGasStorageStress signal per plan §3.5 point 2.
    - isEnergyV2EnabledLocal() — private duplicate of the flag reader
      to avoid a circular import (_shared.ts already imports from
      this module). Same env-var contract.
    - scoreEnergy split into scoreEnergyLegacy() + scoreEnergyV2().
      Public scoreEnergy() branches on the flag. Legacy path is
      byte-identical to the pre-commit behaviour.
    - scoreEnergyV2() reads four new bulk payloads, composes
      importedFossilDependence = fossilElectricityShare × max(netImports, 0)/100
      per plan §3.2, collapses net exporters to 0, and gates
      euGasStorageStress on EU membership so non-EU countries
      re-normalise rather than getting penalised for a regional
      signal.

  - _indicator-registry.ts: four new entries under `dimension: 'energy'`
    with `tier: 'experimental'` — importedFossilDependence (0.35),
    lowCarbonGenerationShare (0.20), powerLossesPct (0.10),
    reserveMarginPct (0.10). Experimental tier keeps them out of the
    Core coverage gate until seed coverage is confirmed.

  - compare-resilience-current-vs-proposed.mjs: new
    'bulk-v1-country-value' shape family in the extraction dispatcher.
    EXTRACTION_RULES now covers the four v2 registry indicators so
    the per-indicator influence harness tracks them from day one.
    When the seeders are absent, pairedSampleSize = 0 and Pearson = 0
    — the harness output surfaces the "no influence yet" state rather
    than silently dropping the indicators.

  - tests/resilience-energy-v2.test.mts: 11 new tests pinning:
    - flag-off = legacy behaviour preserved (v2 seed keys have no
      effect when flag is off — catches accidental cross-path reads)
    - flag-on = v2 composite behaves correctly:
      - lower fossilElectricityShare raises score
      - net exporter with 90% fossil > net importer with 90% fossil
        (max(·, 0) collapse verified)
      - higher lowCarbonGenerationShare raises score (nuclear credit)
      - higher powerLossesPct lowers score
      - euGasStorageStress is invariant for non-EU, responds for DE
      - all v2 inputs absent = graceful degradation, coverage < 1.0

106 resilience tests pass (existing + 11 new). Typecheck clean. Biome
clean. No production behaviour change with flag off (default).
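
A sketch of the dynamic flag read and the public branch. Assumptions:
the environment is passed in as a plain object to keep the example
self-contained, only the string "true" enables the flag, and
`readEnergyV2Flag` / `scoreEnergySketch` are hypothetical names.

```typescript
type Env = Record<string, string | undefined>;

// Reads the flag on every call instead of caching it in a module-level
// constant, so a test can flip it between cases.
function readEnergyV2Flag(env: Env): boolean {
  // Assumption: only "true" (case-insensitive) enables the flag.
  return (env["RESILIENCE_ENERGY_V2_ENABLED"] ?? "").trim().toLowerCase() === "true";
}

// The public scorer only chooses a path; each path stays independent, so
// the flag-off result does not depend on any v2 input.
function scoreEnergySketch(
  env: Env,
  legacy: () => number,
  v2: () => number,
): number {
  return readEnergyV2Flag(env) ? v2() : legacy();
}
```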

Next commits on this branch: three World Bank seeders for the v2 keys,
health.js + SEED_META registration (gated ON_DEMAND_KEYS until Railway
cron provisions), acceptance-gate rerun at flag-flip time.

* feat(resilience): PR 1 — three WB seeders + health registration for v2 energy construct

Third commit in PR 1. Lands the seed scripts for the three v2 energy
indicator source keys, registered in api/health.js with ON_DEMAND_KEYS
gating until Railway cron provisions.

New seeders (weekly cron cadence, 8d maxStaleMin = 2x interval):
  - scripts/seed-low-carbon-generation.mjs
    Pulls EG.ELC.NUCL.ZS + EG.ELC.RNEW.ZS from World Bank, sums per
    country into `resilience:low-carbon-generation:v1`. Partial
    coverage (one series missing) still emits a value using the
    observed half — the scorer's 0-80 saturating goalpost tolerates
    it and the underlying construct is "firm low-carbon share".

  - scripts/seed-fossil-electricity-share.mjs
    Pulls EG.ELC.FOSL.ZS into `resilience:fossil-electricity-share:v1`.
    Feeds the importedFossilDependence composite at score time
    (composite = fossilShare × max(netImports, 0) / 100 per plan §3.2).

  - scripts/seed-power-reliability.mjs
    Pulls EG.ELC.LOSS.ZS into `resilience:power-losses:v1`. Direct
    grid-integrity signal replacing the retired electricityConsumption
    wealth proxy.

All three follow the existing seed-recovery-*.mjs template:
  - Shape: { countries: { [ISO2]: { value, year } }, seededAt }
  - runSeed() from _seed-utils.mjs with schemaVersion=1, ttl=35d
  - validateFn floor of 150 countries (WB coverage is 150-180 for
    the three indicators; below 150 = transient fetch failure)
  - ISO3 → ISO2 mapping via scripts/shared/iso3-to-iso2.json
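
The shared template can be sketched as a pure payload builder plus a
coverage floor. The payload shape and the floor of 150 are from this
commit; the row shape, the "keep the latest year" rule, and
`seededAt` as an ISO string are assumptions.

```typescript
interface SeedPayload {
  countries: Record<string, { value: number; year: number }>;
  seededAt: string; // assumed ISO-8601 timestamp
}

interface SourceRow {
  iso3: string;
  value: number | null;
  year: number;
}

const MIN_COUNTRIES = 150; // floor from the commit text

function buildPayload(
  rows: SourceRow[],
  iso3ToIso2: Record<string, string>,
  now: Date,
): SeedPayload {
  const countries: SeedPayload["countries"] = {};
  for (const row of rows) {
    const iso2 = iso3ToIso2[row.iso3];
    // Unmapped codes (regional aggregates) and empty observations are dropped.
    if (!iso2 || row.value === null) continue;
    const previous = countries[iso2];
    // Assumption: keep the most recent observation per country.
    if (!previous || row.year > previous.year) {
      countries[iso2] = { value: row.value, year: row.year };
    }
  }
  return { countries, seededAt: now.toISOString() };
}

// Below the floor, the run is treated as a transient fetch failure.
const meetsCoverageFloor = (payload: SeedPayload): boolean =>
  Object.keys(payload.countries).length >= MIN_COUNTRIES;
```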

No reserveMargin seeder is shipped in this commit per plan §3.1 open
question: IEA electricity-balance coverage is sparse outside OECD+G20,
and the indicator will likely ship as 'unmonitored' with weight 0.05
if it lands at all. The Redis key (`resilience:reserve-margin:v1`) is
reserved in _dimension-scorers.ts so the v2 scorer shape is stable.

api/health.js:
  - SEED_DOMAINS: add `lowCarbonGeneration`, `fossilElectricityShare`,
    `powerLosses` → their Redis keys.
  - SEED_META: same three, pointing at `seed-meta:resilience:*` meta
    keys with maxStaleMin=11520 (8d, per the worldmonitor
    health-maxstalemin-write-cadence pattern: 2x weekly cron).
  - ON_DEMAND_KEYS: three new entries gated as TRANSITIONAL until
    Railway cron provisions and the first clean run completes. Remove
    from this set after ~7 days of green production runs.

Typecheck clean; existing 106 resilience tests pass (seeders have no
in-repo callers yet, so nothing depends on them executing). Real-API
integration tests land when Railway cron is provisioned.

Next commit: Railway cron configuration + bundle-runner wiring.

* feat(resilience): PR 1 — bundle-runner + acceptance-gate verdict + flag-flip runbook

Final commit in the PR 1 tranche. Lands the three remaining pieces so
the flag-flip is fully operable once Railway cron provisions.

  - scripts/seed-bundle-resilience-energy-v2.mjs
    Railway cron bundle wrapping the three v2 energy seeders
    (low-carbon-generation, fossil-electricity-share, power-losses).
    Weekly cadence (7-day intervalMs); the underlying data is annual
    at source so polling more frequently just hammers the World Bank
    API. 5-minute per-script timeout. Mirrors the existing
    seed-bundle-resilience-recovery.mjs pattern.

  - scripts/compare-resilience-current-vs-proposed.mjs: acceptanceGates
    block. Programmatic evaluation of plan §6 gates using the inputs
    the harness already computes:
      gate-1-spearman              Spearman vs baseline >= 0.85
      gate-2-country-drift         Max country drift vs baseline <= 15
      gate-6-cohort-median         Cohort median shift vs baseline <= 10
      gate-7-matched-pair          Every pair holds expected direction
      gate-9-effective-influence   >= 80% Core indicators measurable
      gate-universe-integrity      No cohort/pair endpoint missing from scorable
    Thresholds are encoded in a const so they can't silently soften.
    Output verdict is PASS / CONDITIONAL / BLOCK. Emitted in
    summary.acceptanceVerdict for at-a-glance PR comment pasting, with
    full per-gate detail in acceptanceGates.results.

  - docs/methodology/energy-v2-flag-flip-runbook.md
    Operator runbook for the flag flip. Pre-flip checklist (seeders
    green, health endpoint green, ON_DEMAND_KEYS graduation, Spearman
    verification), flip procedure (pre-flip snapshot, dry-run, cache
    prefix bump, Vercel env flip, post-flip snapshot, methodology
    doc reclassification), rollback procedure, and a reference table
    for the three possible verdict states.

PR 1 is now code-complete pending:
  1. Railway cron provisioning (ops, not code)
  2. Flag flip + acceptance-gate rerun (follows runbook, not code)
  3. Reserve-margin seeder (deferred per plan §3.1 open-question)

Zero scoring-behaviour change in this commit. 121 resilience tests
pass, typecheck clean.

* fix(resilience): PR 1 — drop unseeded reserveMargin from scorer + fix composite extractor

Addresses two P1 review findings on PR #3289.

Finding 1: scoreEnergyV2 read resilience:reserve-margin:v1 at weight
0.10 but no seeder ships in this PR (indicator deferred per plan
§3.1 open-question). On flag flip that slot would be permanently
null, silently renormalizing the remaining 90% of weight and
producing a construct different from what the methodology doc
describes. Fix: remove reserve-margin from the v2 reader +
blend entirely. Redistribute its 0.10 weight to powerLossesPct
(now 0.20); both are grid-integrity signals per plan §3.1, and
the original plan split electricityConsumption's 0.30 weight
across powerLossesPct + reserveMarginPct + importedFossilDependence
— without reserveMarginPct, powerLossesPct carries the shared
grid-integrity load until the IEA seeder ships.

  v2 weights now: 0.35 + 0.20 + 0.20 + 0.10 + 0.15 = 1.00
  (importedFossilDependence + lowCarbonGenerationShare +
   powerLossesPct + euGasStorageStress + energyPriceStress)

  Reserve-margin Redis key constant stays reserved so the v2
  scorer shape is stable when a future commit lands the seeder;
  split 0.10 back out of powerLossesPct at that point.
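
To illustrate why a permanently null slot silently renormalizes the
rest, a sketch of a weighted blend over the five v2 weights listed
above. The weights are from this commit; `blend`, the 0-100 input
scale, and coverage as "observed weight share" are assumptions about
how the scorer combines them.

```typescript
const V2_WEIGHTS = {
  importedFossilDependence: 0.35,
  lowCarbonGenerationShare: 0.2,
  powerLossesPct: 0.2,
  euGasStorageStress: 0.1,
  energyPriceStress: 0.15,
} as const;

type Indicator = keyof typeof V2_WEIGHTS;

// Inputs are assumed to be already normalised to a 0-100 score.
function blend(
  scores: Partial<Record<Indicator, number | null>>,
): { score: number | null; coverage: number } {
  let weighted = 0;
  let observedWeight = 0;
  for (const key of Object.keys(V2_WEIGHTS) as Indicator[]) {
    const value = scores[key];
    if (value === null || value === undefined) continue; // missing input
    weighted += value * V2_WEIGHTS[key];
    observedWeight += V2_WEIGHTS[key];
  }
  // Dividing by the observed weight re-normalises over what is present.
  return {
    score: observedWeight > 0 ? weighted / observedWeight : null,
    coverage: observedWeight,
  };
}
```

Under this scheme a non-EU country with no euGasStorageStress value is
scored over the remaining 0.90 of weight rather than being penalised
for the missing regional signal.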

Methodology doc, _shared.ts flag comment, and v2 test suite all
updated to the 5-indicator shape. New regression test asserts
that changing reserve-margin Redis content has zero effect on
the v2 score — guards against a future commit accidentally
wiring the reader back in without its seeder.

Finding 2: scripts/compare-resilience-current-vs-proposed.mjs
measured importedFossilDependence by reading fossilElectricityShare
alone. The scorer defines it as fossilShare × max(netImports, 0)
/ 100, so the extractor neither zeroed out net exporters nor scaled
net importers by their import share, which made gate-9
effective-influence wrong for the centrepiece construct change of PR 1.

Fix: new 'imported-fossil-dependence-composite' extractor type
in applyExtractionRule that recomputes the same composite from
both inputs (fossilShare bulk payload + staticRecord.iea.
energyImportDependency.value). Stays in lockstep with the
scorer — drift between the two would break gate-9's
interpretation.

New unit tests pin:
  - net importer: 80% × max(60, 0) / 100 = 48 ✓
  - net exporter: 80% × max(-40, 0) / 100 = 0 ✓
  - missing either input → null
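
The composite and the three pinned cases can be written out as a small
function. The formula is the one stated in this commit; the function
signature and the use of null for a missing input are assumptions.

```typescript
// importedFossilDependence = fossilShare × max(netImports, 0) / 100
// Both inputs are percentages; a negative netImports marks a net exporter.
function importedFossilDependence(
  fossilSharePct: number | null,
  netImportsPct: number | null,
): number | null {
  if (fossilSharePct === null || netImportsPct === null) return null;
  // max(·, 0) collapses net exporters to zero dependence.
  return (fossilSharePct * Math.max(netImportsPct, 0)) / 100;
}
```

For example, `importedFossilDependence(80, 60)` gives 48 and
`importedFossilDependence(80, -40)` gives 0, matching the cases above.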

64 resilience tests pass; typecheck clean. Flag-off path is
still byte-identical to pre-PR behaviour.

* docs(resilience): PR 1 — align methodology doc with actual shipped indicators and seeders

Addresses P1 review on docs/methodology/country-resilience-index.mdx
lines 29 and 574-575. The doc still described reserveMarginPct as a
shipped v2 indicator and listed seed-net-energy-imports.mjs in the
new-seeders list, neither of which the branch actually ships.

Doc changes to match the code in this branch:

  Known-limitations item 1: restated to describe the actual v2
  replacement footprint — powerLossesPct at 0.20 (temporarily
  absorbing reserveMarginPct's 0.10) plus accessToElectricityPct
  moved to infrastructure. reserveMarginPct is named as a deferred
  companion with the split-out instructions for when its seeder
  lands.

  v2.1 changelog (Indicators added): split into "live in PR 1" and
  "deferred in PR 1" so the reader can distinguish which entries
  match real code. importedFossilDependence's composite formula
  now written out and the net-imports source attributed to the
  existing resilience:static.iea path (not a new seeder).

  v2.1 changelog (New seeders): lists the three actual files that
  ship in this branch (seed-low-carbon-generation, seed-fossil-
  electricity-share, seed-power-reliability) and explicitly notes
  seed-net-energy-imports.mjs is NOT a new seeder — the
  EG.IMP.CONS.ZS series is already fetched by seed-resilience-
  static.mjs. Adds the bundle-runner reference.

Methodology-doc linter + mdx-lint both pass (125/125). Typecheck
clean. Doc is now the source of truth for what PR 1 actually ships.

* fix(resilience): PR 1 — sync powerLossesPct registry weight with scorer (0.10 → 0.20)

Reviewer-caught mismatch between INDICATOR_REGISTRY and scoreEnergyV2.
The previous commit redistributed the deferred reserveMarginPct's 0.10
weight into powerLossesPct in the SCORER but left the REGISTRY entry
unchanged at 0.10. Two downstream effects:

  1. scripts/compare-resilience-current-vs-proposed.mjs copies
     `spec.weight` into `nominalWeight` for gate-9 reporting, so
     powerLossesPct's nominal influence would be under-reported by
     half in every post-flip acceptance run — exactly the harness PR 1
     relies on for merge evidence.
  2. Methodology doc vs registry vs scorer drift is the pattern the
     methodology-doc linter is supposed to catch; it passes here
     because the linter only checks dimension-id parity, not weights.
     Registry is now the only remaining source of truth to keep in
     lockstep with the scorer.

Change:
  - `_indicator-registry.ts` powerLossesPct.weight: 0.1 → 0.2
  - Inline comment names the deferral and instructs: "when the IEA
    electricity-balance seeder lands, split 0.10 back out and restore
    reserveMarginPct at 0.10. Keep this field in lockstep with
    scoreEnergyV2 ... because the PR 0 compare harness copies
    spec.weight into nominalWeight for gate-9 reporting."

Experimental weights per dimension invariant still holds (0.35 + 0.20
+ 0.20 = 0.75 for energy, well under the 1.0 ceiling). 64 resilience
tests pass, typecheck clean.
2026-04-22 17:10:38 +04:00
Elie Habib
fbaf07e106 feat(resilience): flag-gated pillar-combined score activation (default off) (#3267)
Wires the non-compensatory 3-pillar combined overall_score behind a
RESILIENCE_PILLAR_COMBINE_ENABLED env flag. Default is false so this PR
ships zero behavior change in production. When flipped true the
top-level overall_score switches from the 6-domain weighted aggregate
to penalizedPillarScore(pillars) with alpha 0.5 and pillar weights
0.40 / 0.35 / 0.25.

Evidence from docs/snapshots/resilience-pillar-sensitivity-2026-04-21:
- Spearman rank correlation current vs proposed 0.9935
- Mean score delta -13.44 points (every country drops because the
  penalty factor is always at most 1)
- Max top-50 rank swing 6 positions (Russia)
- No ceiling or floor effects under plus/minus 20pct perturbation
- Release gate PASS 0/19

Code change in server/worldmonitor/resilience/v1/_shared.ts:
- New isPillarCombineEnabled() reads env dynamically so tests can flip
  state without reloading the module
- overallScore branches on (isPillarCombineEnabled() AND
  RESILIENCE_SCHEMA_V2_ENABLED AND pillars.length > 0); otherwise falls
  through to the 6-domain aggregate (unchanged default path)
- RESILIENCE_SCORE_CACHE_PREFIX bumped v9 to v10
- RESILIENCE_RANKING_CACHE_KEY bumped v9 to v10

Cache invalidation: the version bump forces both per-country score
cache and ranking cache to recompute from the current code path on
first read after a flag flip. Without the bump, 6-domain values cached
under the flag-off path would continue to serve for up to 6-12 hours
after the flip, producing a ragged mix of formulas.

Ripple of v9 to v10:
- api/health.js registry entry
- scripts/seed-resilience-scores.mjs (both keys)
- scripts/validate-resilience-correlation.mjs,
  scripts/backtest-resilience-outcomes.mjs,
  scripts/validate-resilience-backtest.mjs,
  scripts/benchmark-resilience-external.mjs
- tests/resilience-ranking.test.mts 24 fixture usages
- tests/resilience-handlers.test.mts
- tests/resilience-scores-seed.test.mjs explicit pin
- tests/resilience-pillar-aggregation.test.mts explicit pin
- docs/methodology/country-resilience-index.mdx

New tests/resilience-pillar-combine-activation.test.mts:
7 assertions exercising the flag-on path against the release fixtures
with re-anchored bands (NO at least 60, YE/SO at most 40, NO greater
than US preserved, elite greater than fragile). Regression guard
verifies flipping the flag back off restores the 6-domain aggregate.

tests/resilience-ranking-snapshot.test.mts: band thresholds now
resolve from a METHODOLOGY_BANDS table keyed on
snapshot.methodologyFormula. Backward compatible (missing formula
defaults to domain-weighted-6d bands).

Snapshots:
- docs/snapshots/resilience-ranking-2026-04-21.json tagged
  methodologyFormula domain-weighted-6d
- docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json
  new: top/bottom/major-economies tables projected from the
  52-country sensitivity sample. Explicitly tagged projected (NOT a
  full-universe live capture). When the flag is flipped in production,
  run scripts/freeze-resilience-ranking.mjs to capture the
  authoritative full-universe snapshot.

Methodology doc: Pillar-combined score activation section rewritten to
describe the flag-gated mechanism (activation is an env-var flip, no
code deploy) and the rollback path.

Verification: npm run typecheck:all clean, 397/397 resilience tests
pass (up from 390, +7 activation tests).

Activation plan:
1. Merge this PR with flag default false (zero behavior change)
2. Set RESILIENCE_PILLAR_COMBINE_ENABLED=true in Vercel and Railway env
3. Redeploy or wait for next cold start; v9 to v10 bump forces every
   country to be rescored on first read
4. Run scripts/freeze-resilience-ranking.mjs against the flag-on
   deployment and commit the resulting snapshot
5. Ship a v2.0 methodology-change note explaining the re-anchored
   scale so analysts understand the universal ~13 point score drop is
   a scale rebase, not a country-level regression

Rollback: set RESILIENCE_PILLAR_COMBINE_ENABLED=false, flush
resilience:score:v10:* and resilience:ranking:v10 keys (or wait for
TTLs). The 6-domain formula stays alongside the pillar combine in
_shared.ts and needs no code change to come back.
2026-04-22 06:52:07 +04:00
Elie Habib
661bbe8f09 fix(health): nationalDebt threshold 7d → 60d — match monthly cron interval (#3237)
* fix(health): nationalDebt threshold 7d → 60d to match monthly cron cadence

User reported health showing:
  "nationalDebt": { status: "STALE_SEED", records: 187, seedAgeMin: 10469, maxStaleMin: 10080 }

Root cause: api/health.js had `maxStaleMin: 10080` (7 days) on a seeder
that runs every 30 days via seed-bundle-macro.mjs:
  { label: 'National-Debt', intervalMs: 30 * DAY, ... }

The threshold was narrower than the cron interval, so every month
between days 8–30 it guaranteed STALE_SEED. Original comment
"7 days — monthly seed" even spelled the mismatch out loud.

Data source cadence:
- US Treasury debt_to_penny API: updates daily but we only snapshot latest
- IMF WEO: quarterly/semi-annual release — no value in checking daily
- 30-day cron is appropriate; stale threshold should be ≥ 2× interval

Fix: bump maxStaleMin to 86400 (60 days). Matches the 2× pattern used
by faoFoodPriceIndex + recovery pillar (recoveryFiscalSpace, etc.)
which also run monthly.

Also fixes the same mismatch in scripts/regional-snapshot/freshness.mjs —
the 10080 ceiling there would exclude national-debt from capital_stress
axis scoring 23 days out of every 30 between seeds.

* fix(seed-national-debt): raise CACHE_TTL to 65d so health.js stale window is actually reachable

PR #3237 review was correct: my earlier fix set api/health.js
SEED_META.nationalDebt.maxStaleMin to 60d (86400min), but the seeder's
CACHE_TTL was still 35d. After a missed monthly cron, the canonical key
expired at day 35 — long before the 60d "stale" threshold. Result path:
  hasData=false → api/health.js:545-549 → status = EMPTY (crit)
Not STALE_SEED (warn) as my commit message claimed.

writeFreshnessMetadata() in scripts/_seed-utils.mjs:222 sets meta TTL to
max(7d, ttlSeconds), so bumping ttlSeconds alone propagates to both the
canonical payload AND the meta key.

Fix:
- CACHE_TTL 35d → 65d (5d past the 60d stale window so we get a clean
  STALE_SEED → EMPTY transition without keys vanishing mid-warn).
- runSeed opts.maxStaleMin 10080 (7d) → 86400 (60d) so the in-seeder
  declaration matches api/health.js. Field is only validated for
  presence by runSeed (scripts/_seed-utils.mjs:798), but the drift was
  what hid the TTL invariant in the first place.

Invariant this restores: for any SEED_META entry,
  seeder CACHE_TTL ≥ maxStaleMin + buffer
so the "warn before crit" gradient actually exists.
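
The invariant can be sketched as a small check. This is a minimal illustration, not code from the repo: `ttlCoversStaleWindow`, its argument shape, and the buffer parameter are assumptions; only the 35d / 65d / 60d / 5d numbers come from this message.

```javascript
// Hypothetical helper; not part of api/health.js or scripts/_seed-utils.mjs.
const MIN_PER_DAY = 24 * 60;
const SEC_PER_DAY = 24 * 60 * 60;

// True when the cached key outlives the stale window (plus an optional buffer),
// so health can report STALE_SEED before the key disappears and reads as EMPTY.
function ttlCoversStaleWindow({ cacheTtlSeconds, maxStaleMin, bufferMin = 0 }) {
  return cacheTtlSeconds / 60 >= maxStaleMin + bufferMin;
}

// Before this commit: 35d TTL against a 60d stale window.
const beforeFix = ttlCoversStaleWindow({
  cacheTtlSeconds: 35 * SEC_PER_DAY,
  maxStaleMin: 60 * MIN_PER_DAY,
});

// After this commit: 65d TTL, 60d stale window, 5d buffer.
const afterFix = ttlCoversStaleWindow({
  cacheTtlSeconds: 65 * SEC_PER_DAY,
  maxStaleMin: 60 * MIN_PER_DAY,
  bufferMin: 5 * MIN_PER_DAY,
});
```

A check along these lines could run in a test over every SEED_META entry, assuming each seeder's TTL is available to the test.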

* fix(freshness): wire national-debt to seed-meta + teach extractTimestamp about seededAt

Reviewer P2 on PR #3237: my earlier freshness.mjs bump to 86400 was a
no-op. classifyInputs() (scripts/regional-snapshot/freshness.mjs:100-108,
122-132) uses the entry's metaKey or extractTimestamp()'s known field
list. national-debt had neither — payload carries only `seededAt`, and
extractTimestamp didn't know that field, so the "present but undated"
branch treated every call as fresh. The age window never mattered.

Two complementary fixes:

1. Add metaKey: 'seed-meta:economic:national-debt' to the freshness
   entry. Primary, authoritative source — seed-meta.fetchedAt is
   written by writeFreshnessMetadata() on every successful run, which is
   also what api/health.js reads, keeping both surfaces consistent.

2. Add `seededAt` to extractTimestamp()'s field list. Defense-in-depth:
   many other runSeed-based scripts (seed-iea-oil-stocks,
   seed-eurostat-country-data, etc.) wrap output as { ..., seededAt: ISO }
   with no metaKey in the freshness registry. Without this, they were
   also silently always-fresh. ISO strings parse via Date.parse.

Note: `economic:eu-gas-storage:v1` uses `seededAt: String(Date.now())` —
a stringified epoch number, which Date.parse does NOT handle. That seed's
freshness classification is still broken by this entry's lack of metaKey,
but it's a separate shape issue out of scope here. Flagged in PR body.
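
A rough sketch of the lookup described in fix 2, under stated assumptions: the field list below is illustrative (the real list lives in scripts/regional-snapshot/freshness.mjs), and `extractTimestampSketch` is a hypothetical stand-in for extractTimestamp(). It covers only numeric epochs and ISO strings; the stringified-epoch shape noted above is deliberately not handled here.

```javascript
// Illustrative field list; only `seededAt` and `fetchedAt` are named in this message.
const TIMESTAMP_FIELDS_SKETCH = ['fetchedAt', 'seededAt'];

// Returns epoch milliseconds, or null when the payload is "present but undated".
function extractTimestampSketch(payload) {
  for (const field of TIMESTAMP_FIELDS_SKETCH) {
    const value = payload?.[field];
    if (typeof value === 'number' && Number.isFinite(value)) return value;
    if (typeof value === 'string') {
      const parsed = Date.parse(value); // ISO 8601 strings parse here
      if (Number.isFinite(parsed)) return parsed;
    }
  }
  return null;
}

const dated = extractTimestampSketch({ rows: [], seededAt: '2026-04-20T00:00:00.000Z' });
const undated = extractTimestampSketch({ rows: [] });
```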
2026-04-20 19:03:47 +04:00
Elie Habib
9e022f23bb fix(cable-health): stop EMPTY alarm during NGA outages — writeback fallback + mark zero-events healthy (#3230)
User reported health endpoint showing:
  "cableHealth": { status: "EMPTY", records: 0, seedAgeMin: 0, maxStaleMin: 90 }

despite the 30-min warm-ping loop running. Two bugs stacked:

1. get-cable-health.ts null-upstream path didn't write Redis.
   cachedFetchJson with a returning-null fetcher stores NEG_SENTINEL
   (10 bytes) in cable-health-v1 for 2 min. Handler then returned
   `fallbackCache || { cables: {} }` to the client WITHOUT writing to
   cable-health-v1 or refreshing seed-meta. api/health.js saw strlen=10
   → strlenIsData=false → hasData=false → records=0 → EMPTY (CRIT).

   Fix: on null result, write the fallback response back to CACHE_KEY
   (short TTL matching NEG_SENTINEL so a recovered NGA fetch can
   overwrite immediately) AND refresh seed-meta with the real count.
   Health now sees hasData=true during an outage.

2. Zero-cables was treated as EMPTY_DATA (CRIT), but `cables: {}` is
   the valid healthy state — NGA had no active subsea cable warnings.
   The old `Math.max(count, 1)` on recordCount was an intentional lie
   to sidestep this; now honest.

   Fix: add `cableHealth` to EMPTY_DATA_OK_KEYS. Matches the existing
   pattern for notamClosures, gpsjam, weatherAlerts — "zero events is
   valid, not critical". recordCount now reports actual cables.length.

Combined: NGA outage → fallback cached locally + written back → health
reads hasData=true, records=N, no false alarm. NGA healthy with zero
active warnings → cables={}, records=0, EMPTY_DATA_OK → OK. NGA healthy
with warnings → cables={...}, records>0 → OK.
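
The three outcomes above can be sketched as a simplified classifier. This is an illustration only: `classifyCableSketch` and its input shape are hypothetical, and the real classifyKey() in api/health.js handles more statuses than shown.

```javascript
// Keys for which "zero events" is a valid healthy state (names from this message).
const EMPTY_DATA_OK_KEYS_SKETCH = new Set([
  'notamClosures', 'gpsjam', 'weatherAlerts', 'cableHealth',
]);

// Hypothetical simplification: hasData = key present with a real payload,
// records = number of entries in that payload.
function classifyCableSketch(name, { hasData, records }) {
  if (!hasData) return 'EMPTY';
  if (records === 0) {
    return EMPTY_DATA_OK_KEYS_SKETCH.has(name) ? 'OK' : 'EMPTY_DATA';
  }
  return 'OK';
}
```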

Regression guard to keep in mind: if anyone later removes cableHealth
from EMPTY_DATA_OK_KEYS and wants strict zero-events to alarm, they'd
also need to revisit `Math.max(count, 1)` or an equivalent floor so
the "legitimately empty but healthy" state doesn't CRIT.
2026-04-20 15:21:04 +04:00
Elie Habib
84eec7f09f fix(health): align breadthHistory maxStaleMin with actual Tue-Sat cron schedule (#3219)
Production alarm: `breadthHistory` went STALE_SEED every Monday morning
despite the seeder running correctly. Root cause was a threshold /
schedule mismatch:

- Schedule (Railway): 02:00 UTC, Tuesday through Saturday. Five ticks
  per week, capturing Mon-Fri market close → following-day 02:00 UTC.
- Threshold: maxStaleMin=2880 (48h), assuming daily cadence.
- Max real gap: Sat 02:00 UTC → Tue 02:00 UTC = 72h. The existing 48h
  alarm fired every Monday at ~02:00 UTC when the Sun/Mon cron ticks
  are intentionally absent, until the Tue 02:00 UTC run restored
  fetchedAt.

Fix: bump maxStaleMin to 5760 (96h). 72h covers the weekend gap;
extra 24h tolerates one missed Tue run without alarming. Comment now
records the actual schedule + reasoning.
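
The 72h figure can be derived mechanically. A minimal sketch, assuming a cron that fires at the same UTC time on a fixed set of weekdays; `maxGapHours` is a hypothetical helper, not something in the repo.

```javascript
// Longest gap, in hours, between consecutive weekly ticks that fire at the same
// time of day on the given UTC weekdays (0 = Sunday ... 6 = Saturday).
function maxGapHours(weekdays) {
  const days = [...new Set(weekdays)].sort((a, b) => a - b);
  let maxGapDays = 0;
  for (let i = 0; i < days.length; i++) {
    const next = days[(i + 1) % days.length];
    // Wrap across the week boundary; a single-day schedule has a 7-day gap.
    const gapDays = (next - days[i] + 7) % 7 || 7;
    maxGapDays = Math.max(maxGapDays, gapDays);
  }
  return maxGapDays * 24;
}

const tueToSatGap = maxGapHours([2, 3, 4, 5, 6]); // Tuesday through Saturday
```

With the Tue-Sat schedule this gives 72, which is why a 48h threshold could not hold and 96h leaves one spare day.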

No seeder change needed — logs confirm the service fires and completes
correctly on its schedule (Apr 16/17/18 02:00 UTC runs all "Done" with
3/3 readings, `Stopping Container` is normal Railway cron teardown).

Diagnostic memo: this is the class of bug where the schedule comment
lies. Original comment said "daily cron at 21:00 ET". True start time
is 22:00 EDT / 21:00 EST Mon-Fri (02:00 UTC next day) AND only Mon-Fri,
so "daily" is wrong by two days every week.
2026-04-20 07:56:54 +04:00
Elie Habib
c7aacfd651 fix(health): persist WARNING events + add failure-log timeline (#3197)
* fix(health): persist WARNING events + add failure-log timeline

WARNING status (stale seeds) was excluded from the health:last-failure
Redis write (line 680 checked `!== 'WARNING'`). When UptimeRobot keyword-
checks for "HEALTHY" and gets a WARNING response, it flags DOWN, but no
forensic trail was left in Redis. This made stale-seed incidents invisible
to post-mortem investigation.

Changes:
- Write health:last-failure for ANY non-HEALTHY status (including WARNING)
- Add health:failure-log (LPUSH list, last 50 entries, 7-day TTL) so
  multiple incidents are preserved as a timeline, not just the latest
- Include warnCount alongside critCount in the snapshot
- Broaden the problems filter to capture all non-OK statuses

* fix(health): dedupe failure-log entries by incident signature

Repeated polls during one long WARNING window would LPUSH near-identical
snapshots, filling the 50-entry log and evicting older distinct incidents.

Now compares a signature (status + sorted problem set) against the previous
entry via health:failure-log-sig. Only appends when the incident changes.
The last-failure key is still updated every poll (latest timestamp matters).

* fix(health): add 4s timeout to persist pipelines + consistent arg types

Addresses greptile review on PR #3197:
- Both persist redisPipeline calls now pass 4_000ms timeout (main data
  pipeline uses 8_000ms; persist is less critical so shorter is fine)
- LTRIM/EXPIRE args use numbers consistently (was mixing number/string)

* fix(health): atomic sig swap via SET ... GET to eliminate dedupe race

Two concurrent /api/health requests could both read the old signature
before either write lands, appending duplicate entries. Now uses
SET key val EX ttl GET (Redis 6.2+) to atomically swap the sig and
return the previous value in one pipeline command. The LPUSH only
fires if the returned previous sig differs from the new one.

Also skips the second redisPipeline call entirely when sig matches
(no logCmds to send).

* fix(health): exclude seedAgeMin from dedupe sig + clear sig on recovery

Two issues with the failure-log dedupe:

1. seedAgeMin changes on every poll (e.g. 31min, 32min, 33min), so
   the signature changed every time and LPUSH still fired on every
   probe during a STALE_SEED window. Now uses a separate sigKeys
   array with only key:status (no age) for the signature, while
   problemKeys still includes ages for the snapshot payload.

2. The sig was never cleared on recovery. If the same problem set
   recurred after a healthy gap, the old sig (within its 24h TTL)
   would match and the recurrence would be silently skipped. Now
   DELs health:failure-log-sig when overall === 'HEALTHY'.
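
A minimal sketch of an age-free signature, assuming a hypothetical `problems` map of health key to `{ status, seedAgeMin }`; the real sigKeys/problemKeys arrays in api/health.js may be shaped differently.

```javascript
// Builds a dedupe signature from overall status plus the sorted key:status pairs.
// seedAgeMin is deliberately left out so the signature is stable across polls.
function incidentSignature(overall, problems) {
  const sigKeys = Object.entries(problems)
    .map(([key, { status }]) => `${key}:${status}`)
    .sort();
  return `${overall}|${sigKeys.join(',')}`;
}

const pollAt31 = incidentSignature('WARNING', {
  nationalDebt: { status: 'STALE_SEED', seedAgeMin: 31 },
});
const pollAt32 = incidentSignature('WARNING', {
  nationalDebt: { status: 'STALE_SEED', seedAgeMin: 32 },
});
```

Here `pollAt31` and `pollAt32` are equal, so only the first poll of the incident would append to the log.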

* fix(health): move sig write after LPUSH in same pipeline

The sig was written eagerly in the first pipeline (SET ... GET), but the
LPUSH happened in a separate background pipeline. If that second write
failed, the sig was already advanced, permanently deduping the incident
out of the timeline.

Now: GET sig first (read-only), then write last-failure + LPUSH + sig
all in one pipeline. The sig only advances if the entire pipeline
succeeds. Failure leaves the old sig in place so the next poll retries.

Reintroduces a small read-then-write race window (two concurrent probes
can both read the old sig), but the worst case is a single duplicate
entry, which is strictly better than a permanently dropped incident.
2026-04-19 10:14:19 +04:00
Elie Habib
96fca1dc2b fix(supply-chain): popup-keyed history re-query + dataAvailable flag (#3187)
* fix(supply-chain): popup-keyed history re-query + dataAvailable flag for partial coverage

Two P1 findings on #3185 post-merge review:

1. MapPopup cross-chokepoint history contamination
   Popup's async history resolve re-queried [data-transit-chart] without a
   cpId key. User opens popup A → fetch starts for cpA; user opens popup B
   before it resolves → cpA's history mounts into cpB's chart container.
   Fix: add data-transit-chart-id keyed by cpId; re-query by it on resolve.
   Mirrors SupplyChainPanel's existing data-chart-cp-id pattern.

2. Partial portwatch coverage still looked healthy
   Previous fix emits all 13 canonical summaries (zero-state fill for
   missing IDs) and records pwCovered in seed-meta, but:
   - get-chokepoint-status still zero-filled missing chokepoints and cached
     the response as healthy — panel rendered silent empty rows.
   - api/health.js only degrades on recordCount=0, so 10/13 partial read
     as OK despite the UI hiding entire chokepoints.
   Fix:
   - proto: TransitSummary.data_available (field 12). Writer tags with
     Boolean(cpData). Status RPC passes through; defaults true for pre-fix
     payloads (absence = covered).
   - Status RPC writes seed-meta recordCount as covered count (not shape
     size), and flips response-level upstreamUnavailable on partial.
   - api/health.js: new minRecordCount field on SEED_META entries + new
     COVERAGE_PARTIAL status (warn rollup). chokepoints entry declares
     minRecordCount: 13. recordCount < 13 → COVERAGE_PARTIAL.
   - Client (panel + popup): skip stats/chart rendering when
     !dataAvailable; show "Transit data unavailable (upstream partial)"
     microcopy so users understand the gap.
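
The coverage rule can be sketched as follows. This is illustrative only: `coverageStatusSketch` and `isDataAvailable` are hypothetical names, and the status returned for recordCount=0 is an assumption (the message only says health degrades in that case).

```javascript
// Hypothetical simplification of the new minRecordCount check.
function coverageStatusSketch({ recordCount, minRecordCount }) {
  if (recordCount === 0) return 'EMPTY_DATA'; // assumed name for the existing degrade path
  if (minRecordCount !== undefined && recordCount < minRecordCount) {
    return 'COVERAGE_PARTIAL';
  }
  return 'OK';
}

// Pre-fix payloads have no flag; absence is treated as covered.
function isDataAvailable(summary) {
  return summary.dataAvailable ?? true;
}
```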

5759/5759 data tests pass. Typecheck + typecheck:api clean.

* fix(supply-chain): guarantee Simulate Closure button exits Computing state

User reports "Simulate Closure does nothing beyond write Computing…" — the
button sticks at Computing forever. Two causes:

1. Scenario worker appears down (0 scenario-result:* keys in Redis; the
   keys carry a 24h TTL, so none were written in the last 24h). This is
   Railway-side; a separate intervention is needed to redeploy
   scripts/scenario-worker.mjs.

2. Client leaked the "Computing…" state on multiple exit paths:
   - signal.aborted early-return inside the poll loop never reset the
     button. Second click fired abort on first → first returned without
     resetting → button stayed "Computing…" until next render.
   - !this.content.isConnected early-return also skipped reset (less
     user-visible but same class of bug).
   - catch block swallowed AbortError without resetting.
   - POST /run had no hard timeout — a hanging edge function left the
     button in Computing indefinitely.

Fix:
- resetButton(text) helper touches the btn only if still connected;
  applied in every exit path (abort, timeout, post-success, catch).
- AbortSignal.any([caller, AbortSignal.timeout(20_000)]) on POST /run.
- console.error on failure so Simulate Closure errors surface in ops.
- Error message includes "scenario worker may be down" on loop timeout
  so operators see the right suspect.

Backend observations (for follow-up):
- Hormuz backend is healthy (/api/health chokepoints OK, 13 records,
  1 min old; live RPC has hormuz_strait.riskLevel=critical, wow=-22,
  flowEstimate present; GetChokepointHistory returns 174 entries).
  User-reported "Hormuz empty" is likely browser/CDN stale cache from
  before PR #3185; hard refresh should resolve.
- scenario-worker.mjs has zero result keys in 24h. Railway service
  needs verification/redeployment.

* fix(scenario): wrong Upstash RPUSH format silently broke every Simulate Closure

Railway scenario-worker log shows every job failing field validation since
at least 03:06Z today:

  [scenario-worker] Job failed field validation, discarding:
    ["{\"jobId\":\"scenario:1776535792087:cynxx5v4\",...

The leading [" in the payload is the smoking gun. api/scenario/v1/run.ts
was POSTing to /rpush/{key} with body `[payload]`, expecting Upstash to
unpack the array and push one string value. Upstash does NOT parse that
form — it stored the literal `["{...}"]` string as a single list value.

Worker BLMOVEs the literal string → JSON.parse → array → destructure
`{jobId, scenarioId, iso2}` on an array returns undefined for all three
→ every job discarded without writing a result. Client poll returns
`pending` for the full 60s timeout, then (on the prior client code path)
leaked the stuck "Computing…" button state indefinitely.

Fix: use the standard Upstash REST command format — POST to the base URL
with body `["RPUSH", key, value]`. Matches scripts/ais-relay.cjs upstashLpush.
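
The difference between the two request forms can be shown without touching the network. A minimal sketch: the job values are illustrative, `buildRpushBody` is a hypothetical helper, and the "stored" strings model what this message says ended up in the list.

```javascript
// Illustrative job; field names match the worker's destructure above.
const job = { jobId: 'scenario:123:abc', scenarioId: 'example', iso2: 'US' };
const payload = JSON.stringify(job);

// Broken form: body `[payload]` was stored verbatim as a single list value.
const storedByBrokenForm = JSON.stringify([payload]);
const brokenParsed = JSON.parse(storedByBrokenForm); // an array, not the job
const { jobId: brokenJobId } = brokenParsed;         // undefined

// Fixed form: the whole command is a JSON array POSTed to the base URL.
function buildRpushBody(key, value) {
  return JSON.stringify(['RPUSH', key, value]);
}

const fixedBody = buildRpushBody('scenario-queue:pending', payload);
const storedByFixedForm = JSON.parse(fixedBody)[2];  // the raw payload string
const { jobId: fixedJobId } = JSON.parse(storedByFixedForm);
```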

After this, the scenario-queue:pending list stores the raw payload string,
BLMOVE returns the payload, JSON.parse gives the object, validation passes,
computeScenario runs, result key gets written, client poll sees `done`.

Zero result keys existed in prod Redis in the last 24h (24h TTL on
scenario-result:*) — confirms the fix addresses the production outage.
2026-04-18 23:38:33 +04:00
Elie Habib
388995b1a4 fix(health): macroSignals maxStaleMin 20 → 150 to match seed-economy cron cadence (#3179)
macroSignals is a secondary key written by seed-economy.mjs, whose
primary key energy-prices has maxStaleMin=150 in its runSeed config.
A 20-min threshold guaranteed STALE_SEED between every cron run.
2026-04-18 20:50:48 +04:00
Elie Habib
64c906a406 feat(eia): gold-standard /api/eia/petroleum (Railway seed → Redis → Vercel reads only) (#3161)
* feat(eia): move /api/eia/petroleum to gold-standard (Railway seed → Redis → Vercel reads only)

Live api.eia.gov fetches from the Vercel edge function were causing
FUNCTION_INVOCATION_TIMEOUT 504s on /api/eia/petroleum (Sydney edge →
US origin with no timeout, no cache, no stale fallback — one EIA blip
blew the 25s budget).

- New seeder scripts/seed-eia-petroleum.mjs — fetches WTI/Brent/
  production/inventory from api.eia.gov with per-fetch 15s timeouts,
  writes energy:eia-petroleum:v1 with the {_seed, data} envelope.
  Accepts 1-of-4 series; 0-of-4 routes to contract-mode RETRY so
  seed-meta stays stale and the bundle retries on next cron.
- Bundled into seed-bundle-energy-sources.mjs (daily, 90s timeout) —
  no new Railway service needed.
- Rewrote api/eia/[[...path]].js as a Redis-only reader via
  readJsonFromUpstash. Same response shape for backward compat with
  widgets/MCP/external callers. 503 + Retry-After on miss (never 504).
- Registered eiaPetroleum in api/health.js STANDALONE_KEYS + gated as
  ON_DEMAND_KEYS for the deploy window; promote to SEED_META
  (maxStaleMin: 4320) in a follow-up after ~7 days of clean cron.
- Tests: 14 seeder unit tests + 9 edge handler tests.

Audit result: /api/eia/petroleum was the only Vercel route fetching
dashboard data live. Every other fetch(https://…) in api/ is
auth/payments/notifications/user-initiated enrichment.

* fix(eia): close silent-stale window — add SEED_META + seed-health registration

Review finding on PR #3161: without a SEED_META entry, readSeedMeta
returns seedStale: null and classifyKey never reaches STALE_SEED.
That meant a broken Railway cron or missing EIA_API_KEY after the first
successful seed would keep /api/eia/petroleum serving stale data for
up to 7 days (TTL) while /api/health reported OK.

- api/health.js: add SEED_META.eiaPetroleum with maxStaleMin=4320
  (72h = 3× daily bundle cadence). Keep eiaPetroleum in ON_DEMAND_KEYS
  so the Vercel-instant / Railway-delayed deploy window doesn't CRIT
  on first seed, but stale-after-seed now properly fires STALE_SEED.
- api/seed-health.js: register energy:eia-petroleum in SEED_DOMAINS
  (intervalMin=1440) so the secondary health endpoint reports it too.
- Updated ON_DEMAND_KEYS comment to reflect freshness is now enforced.
2026-04-18 14:40:00 +04:00
Elie Habib
f44b3260f4 fix(relay): harden Telegram session lifecycle + add health monitoring (#3152)
* fix(relay): harden Telegram session lifecycle + add health monitoring

Three fixes for the AUTH_KEY_DUPLICATED outage that silently emptied
the Telegram Intel panel with no health signal:

1. Increase disconnect timeout from 3s to 10s in gracefulShutdown,
   and default TELEGRAM_STARTUP_DELAY_MS from 60s to 120s. The 3s
   timeout was too aggressive for the MTProto disconnect handshake,
   allowing the old session to linger past the new container's
   startup delay window, causing AUTH_KEY_DUPLICATED.

2. Register telegramFeed in health.js (STANDALONE_KEYS + SEED_META
   with maxStaleMin=10). The relay now writes both a data key and
   seed-meta on each successful poll cycle. When the poll stops
   (session invalidated, package missing, FLOOD_WAIT), the key goes
   stale within 10 minutes and surfaces as STALE_SEED in the global
   health dashboard instead of silently showing "No messages available"
   in the panel indefinitely.

3. Add destroyTelegramClient() that nulls the client reference AND
   tears down the MTProto sender's internal reconnect loop and
   underlying socket. The library's autonomous reconnect mechanism
   continued running after AUTH_KEY_DUPLICATED, spamming recv loop
   crash/reconnect log lines every 90s even though
   telegramPermanentlyDisabled was true and no polls were running.

* fix(relay): stagger telegram Redis TTLs so STALE_SEED fires before EMPTY

Data key 1800s (30min), seed-meta 900s (15min). With maxStaleMin=10,
the health timeline is:
  0-10min after last poll: OK (meta fresh, data present)
  10-15min: STALE_SEED (meta older than maxStaleMin, data still present)
  15-30min: EMPTY (meta expired, data still present but records=0)
  30min+:   EMPTY (both expired)

Previously both keys had 600s TTL, so they expired together and
health jumped straight from OK to EMPTY with no stale window.
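
The timeline can be modelled in a few lines. This is a hypothetical model, not the health.js logic: it assumes EMPTY whenever either key has expired and STALE_SEED once the meta is older than maxStaleMin.

```javascript
// All values are minutes since the last successful poll.
function telegramHealthAt(ageMin, { maxStaleMin = 10, metaTtlMin = 15, dataTtlMin = 30 } = {}) {
  const metaPresent = ageMin < metaTtlMin;
  const dataPresent = ageMin < dataTtlMin;
  if (!metaPresent || !dataPresent) return 'EMPTY';
  return ageMin > maxStaleMin ? 'STALE_SEED' : 'OK';
}

// Previous configuration: both keys expired together at 10 minutes (600s).
const oldTtls = { maxStaleMin: 10, metaTtlMin: 10, dataTtlMin: 10 };
```

Under `oldTtls` the function never returns STALE_SEED, which matches the jump from OK to EMPTY described above.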

* fix(relay): destroy locally-created client on init failure

connect() throws before telegramState.client is assigned, so
destroyTelegramClient() saw null and the leaked client's MTProto
sender kept its autonomous reconnect loop running. Now: hoist
the client variable, assign it to telegramState.client before
calling destroyTelegramClient() on any connect failure (not just
AUTH_KEY_DUPLICATED) so the socket and sender are always torn down.
2026-04-18 00:29:11 +04:00
Elie Habib
935417e390 chore(relay): socialVelocity + wsbTickers to hourly fetch (6x Reddit traffic reduction) (#3135)
* chore(relay): socialVelocity + wsbTickers to hourly fetch (was 10min)

Reduce Reddit rate-limiting blast radius. Both seeders fetch 5 subreddits
combined (2 for SV: worldnews, geopolitics; 3 for WSB: wallstreetbets,
stocks, investing) with no proxy or OAuth. Reddit's behavioral heuristic
for datacenter IPs consistently flags the Railway IP after ~50min of
10-min polling and returns HTTP 403 on every subsequent cycle until the
container restarts with a new IP.

Evidence (2026-04-16 ais-relay log):
  13:32-14:22 UTC: 6 successful 10-min cycles for both seeders
  16:06-16:16 UTC: 2 more successful cycles after a restart
  16:26 UTC:       BOTH subs flip to HTTP 403 simultaneously
  16:36, 16:46, 16:56: every cycle, all 5 subreddits return 403

Dropping success-path frequency from 6/hour to 1/hour cuts the traffic
Reddit's heuristic sees by 6x. On failure path the 20-min retry is kept
as-is — during a block we've already been flagged, so extra retries don't
make it worse.

Changes:
- SOCIAL_VELOCITY_INTERVAL_MS:   10min → 60min
- SOCIAL_VELOCITY_TTL:           30min → 3h   (3× new interval)
- WSB_TICKERS_INTERVAL_MS:       10min → 60min
- WSB_TICKERS_TTL:               30min → 3h   (3× new interval)
- api/health.js maxStaleMin:     30min → 180min for both (3× interval)
- api/seed-health.js intervalMin: 15 → 90 for wsb-tickers (maxStaleMin / 2)

Proper fix (proxy fallback or Reddit OAuth) deferred.

* fix(seed-health): add socialVelocity parity entry — greptile P2

Review finding on PR #3135: wsbTickers was bumped from intervalMin=15 to 90
but socialVelocity had no seed-health.js entry at all. Both Reddit seeders
now share the same 60-min cadence; adding the missing entry gives parity.

P2-1 (malformed comment lines 5682-5683) is a false positive — verified
the lines do start with '//' in the file.
2026-04-16 22:17:58 +04:00
Elie Habib
7d27cec21c feat(relay): seeder-loop heartbeats for chokepoint-flows + climate-news (#3133)
* feat(relay): seeder-loop heartbeats for chokepoint-flows + climate-news

Detect silent relay-loop failures (ERR_MODULE_NOT_FOUND at import, event-loop
blocked, container restart loop) up to 4 hours earlier than the data-level
seed-meta staleness window.

The chokepoint-flows bug that motivated this PR was invisible in health for
32 hours because each 6h cron tick fired, execFile'd the child, child died
at import, and NO ONE updated seed-meta:energy:chokepoint-flows. Since the
last successful write was still within its 3-day TTL, the data key was
present and the old seed-meta was still there — STALE_SEED triggered only
at +12h, and even then was a warn (not crit) that could easily be missed.

Fix:
- In scripts/ais-relay.cjs, write a success-only heartbeat via upstashSet
  after each execFile-spawned seeder exits cleanly. TTL = 3x the loop
  interval (18h for chokepoint-flows, 90min for climate-news) so a single
  missed cycle doesn't flap but two consecutive misses alarm.
  Payload shape matches seed-meta for drop-in compatibility with the
  existing health-check reader: { fetchedAt, recordCount, durMs }.

- In api/health.js, register two new STANDALONE_KEYS entries pointing at
  the heartbeat keys, plus SEED_META entries with tighter maxStaleMin:
    chokepointFlowsRelayHeartbeat: 480min (8h vs 720min existing)
    climateNewsRelayHeartbeat:     60min  (vs 90min existing)
  When the relay loop fails for >2 intervals, the heartbeat goes stale
  first and surfaces as STALE_SEED in /api/health, giving 4h more notice
  than waiting for seed-meta:energy:chokepoint-flows.

This is orthogonal to PR #3132 (fixes the actual ERR_MODULE_NOT_FOUND root
cause). Heartbeat is defensive observability for the NEXT failure mode we
can't predict.

* fix(health): gate new relay heartbeat keys as ON_DEMAND during deploy window — greptile review

Review finding on PR #3133: new heartbeat keys (relay:heartbeat:chokepoint-flows,
relay:heartbeat:climate-news) are written by ais-relay.cjs AFTER the first
successful post-deploy loop. Vercel deploys api/health.js instantly, so the
window between 'merge' and 'first heartbeat written' is:
  - chokepoint-flows: up to 6h (initial loop tick)
  - climate-news:     up to 30min

During that window the heartbeat keys don't exist in Redis. classifyKey()
would return EMPTY (crit), which counts toward critCount and can flip overall
/api/health to DEGRADED even though climateNews and chokepointFlows data
themselves are fine.

Matches existing rule in project memory
(feedback_health_required_key_needs_railway_cron_first.md) — new seeder +
health.js registration in same PR needs ON_DEMAND gating until the Railway
side catches up, then harden after ~7 days.

Fix: add both keys to ON_DEMAND_KEYS with TRANSITIONAL comments, matching
the fxYoy / hyperliquidFlow pattern already used for the same issue.
2026-04-16 18:21:51 +04:00
Elie Habib
da1fa3367b fix(resilience-ranking): chunked warm SET, always-on rebuild, truthful meta (Slice B) (#3124)
* fix(resilience-ranking): chunked warm SET, always-on rebuild, truthful meta

Slice B follow-up to PR #3121. Three coupled production failures observed:

1. Per-country score persistence works (Slice A), but the single 222-SET
   pipeline (~600KB body) takes longer than REDIS_PIPELINE_TIMEOUT_MS (5s)
   on Vercel Edge. runRedisPipeline returns []; persistence guard correctly returns
   empty; coverage = 0/222 < 75%; ranking publish silently dropped. Live
   Railway log: "Ranking: 0 ranked, 222 greyed out" → "Rebuilt … with 222
   countries (bulk-call race left ranking:v9 null)" — second call only
   succeeded because Upstash had finally caught up between attempts.

2. The seeder's probe + rebuild block lives inside `if (missing > 0)`. When
   per-country scores survive a cron tick (TTL 6h, cron every 6h), missing=0
   and the rebuild path is skipped. Ranking aggregate then expires alone and
   is never refreshed until scores also expire — multi-hour gaps where
   `resilience:ranking:v9` is gone while seed-meta still claims freshness.

3. `writeRankingSeedMeta` fires whenever finalWarmed > 0, regardless of
   whether the ranking key is actually present. Health endpoint sees fresh
   meta + missing data → EMPTY_ON_DEMAND with a misleading seedAge.

Fixes:
- _shared.ts: split the warm pipeline SET into SET_BATCH=30-command chunks
  so each pipeline body fits well under timeout. Pad missing-batch results
  with empty entries so the per-command alignment stays correct (failed
  batches stay excluded from `warmed`, no proof = no claim).
- seed-resilience-scores.mjs: extract `ensureRankingPresent` helper, call
  it from BOTH the missing>0 and missing===0 branches so the ranking gets
  refreshed every cron. Add a post-rebuild STRLEN verification — rebuild
  HTTP can return 200 with a payload but still skip the SET (coverage gate,
  pipeline failure).
- main(): only writeRankingSeedMeta when result.rankingPresent === true.
  Otherwise log and let the next cron retry.
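
The chunk-and-pad step can be sketched as below. A minimal illustration, assuming a failed batch comes back as an empty array and using `null` as the placeholder; `chunkCommands` and `alignResults` are hypothetical names, and only SET_BATCH=30 comes from this message.

```javascript
const SET_BATCH = 30;

// Split a flat command list into pipeline-sized chunks.
function chunkCommands(commands, size = SET_BATCH) {
  const chunks = [];
  for (let i = 0; i < commands.length; i += size) {
    chunks.push(commands.slice(i, i + size));
  }
  return chunks;
}

// Flatten per-batch results back into one array aligned with the original
// commands. A failed batch contributes one placeholder per command, so later
// indexes do not shift and failed commands are never counted as warmed.
function alignResults(chunks, batchResults) {
  return chunks.flatMap((chunk, i) => {
    const results = batchResults[i] ?? [];
    return chunk.map((_, j) => results[j] ?? null);
  });
}
```

With 222 commands this yields 8 chunks (seven of 30 and one of 12).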

Tests:
- resilience-ranking.test.mts: assert pipelines stay ≤30 commands.
- resilience-scores-seed.test.mjs: structural checks that the rebuild is
  hoisted (≥2 callsites of ensureRankingPresent), STRLEN verification is
  present, and meta write is gated on rankingPresent.

Full resilience suite: 373/373 pass (was 370 — 3 new tests).

* fix(resilience-ranking): seeder no longer writes seed-meta (handler is sole writer)

Reviewer P1: ensureRankingPresent() returning true only means the ranking
key exists in Redis — not that THIS cron actually wrote it. The handler
skips both the ranking SET and the meta SET when coverage < 75%, so an
older ranking from a prior cron can linger while this cron's data didn't
land. Under that scenario, the previous commit still wrote a fresh
seed-meta:resilience:ranking, recreating the stale-meta-over-stale-data
failure this PR is meant to eliminate.

Fix: remove seeder-side seed-meta writes entirely. The ranking handler
already writes ranking + meta atomically in the same pipeline when (and
only when) coverage passes the gate. ensureRankingPresent() triggers the
handler every cron, which addresses the original rationale for the seeder
heartbeat (meta going stale during quiet Pro usage) without the seeder
needing to lie.

Consequence on failure:
- Coverage gate trips → handler writes neither ranking nor meta.
- seed-meta stays at its previous timestamp; api/health reports accurate
  staleness (STALE_SEED after maxStaleMin, then CRIT) instead of a fresh
  meta over stale/empty data.

Tests updated: the "meta gated on rankingPresent" assertion is replaced
with "seeder must not SET seed-meta:resilience:ranking" + "no
writeRankingSeedMeta". Comments may still reference the key name for
maintainer clarity — the assertion targets actual SET commands.

Full resilience suite: 373/373 pass.

* fix(resilience-ranking): always refresh + 12h TTL (close timing hole)

Reviewer P1+P2:

- P1: ranking TTL == cron interval (both 6h) left a timing hole. If a cron
  wrote the key near the end of its run and the next cron fired near the
  start of its interval, the key was still alive at probe time →
  ensureRankingPresent() returned early → no rebuild → key expired a short
  while later and stayed absent until a cron eventually ran while the key
  was missing. Multi-hour EMPTY_ON_DEMAND gaps.

- P2: probing only the ranking data key (not seed-meta) meant a partial
  handler pipeline (ranking SET ok, meta SET missed) would self-heal only
  when the ranking itself expired — never during its TTL window.

Fix:

1. Bump RESILIENCE_RANKING_CACHE_TTL_SECONDS from 6h to 12h (2x cron
   interval). A single missed or slow cron no longer causes a gap.
   Server-side and seeder-side constants kept in sync.

2. Replace ensureRankingPresent() with refreshRankingAggregate(): drop the
   'if key present, skip' short-circuit. Rebuild every cron, unconditionally.
   One cheap HTTP call keeps ranking + seed-meta rolling forward together
   and self-heals the partial-pipeline case — handler retries the atomic
   pair every 6h regardless of whether the keys are currently live.

3. Update health.js comment to reflect the new TTL and refresh cadence
   (12h data TTL, 6h refresh, 12h staleness threshold = 2 missed ticks).

Tests:
- RESILIENCE_RANKING_CACHE_TTL_SECONDS asserts 12h (was 6h).
- New assertion: refreshRankingAggregate must NOT early-return on probe-
  hit, and the rebuild HTTP call must be unconditional in its body.
- DEL-guard test relaxed to allow comments between '{' and the DEL line
  (structural property preserved).

Full resilience suite: 375/375.

* fix(resilience-ranking): parallelize warm batches + atomic rebuild via ?refresh=1

Reviewer P2s:

- Warm path serialized the 8 batch pipelines with `await` in a for-loop,
  adding ~7 extra Upstash round-trips (100-500ms each on Edge) to the warm
  wall-clock. Batches are independent; Promise.all collapses them into one
  slowest-batch window.

- DEL+rebuild created a brief absence window: if the rebuild request failed
  transiently, the ranking stayed absent until the next cron. Now seeder
  calls `/api/resilience/v1/get-resilience-ranking?refresh=1` and the
  handler bypasses its cache-hit early-return, recomputing and SETting
  atomically. On rebuild failure, the existing (possibly stale-but-present)
  ranking is preserved instead of being nuked.
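
A minimal sketch of the parallel warm path, assuming a hypothetical injected `runPipeline(batch)` helper that sends one pipeline request per batch. Only `SET_BATCH = 30` comes from this commit; the other names are illustrative.

```javascript
const SET_BATCH = 30;

// Split a flat command list into fixed-size batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical: runPipeline is injected and returns a promise per batch.
async function warmInParallel(commands, runPipeline) {
  const batches = chunk(commands, SET_BATCH);
  // Batches are independent, so wall-clock is roughly the slowest batch
  // rather than the sum of all round-trips.
  const results = await Promise.all(batches.map((batch) => runPipeline(batch)));
  return results.flat();
}
```

With ~222 commands this yields 8 batches, matching the count above. Note that `Promise.all` rejects as soon as any batch rejects, so a real caller would need to decide how to treat a partial warm.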

Handler: read ctx.request.url for the refresh query param; guard the URL
parse with try/catch so an unparseable url falls back to the cached-first
behavior.
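
A sketch of the guarded query-param read, assuming the handler receives an absolute URL string. The function name `wantsRefresh` is hypothetical.

```javascript
// Returns true only when the URL parses and carries refresh=1.
function wantsRefresh(requestUrl) {
  try {
    return new URL(requestUrl).searchParams.get('refresh') === '1';
  } catch {
    // Unparseable URL: fall back to the cached-first behavior.
    return false;
  }
}
```

A relative URL also throws in `new URL(...)` without a base, so in this sketch it falls back to the cached path as well.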

Tests:
- New: ?refresh=1 must bypass the cache-hit early-return (fails on old code,
  passes now).
- DEL-guard test replaced with 'does NOT DEL' + 'uses ?refresh=1'.
- Batch chunking still asserted at SET_BATCH=30.

Full resilience suite: 376/376.

* fix(resilience-ranking): bulk-warm call also needs ?refresh=1 (asymmetric TTL hazard)

Reviewer P1: in the 6h-12h window, per-country score keys have expired
(TTL 6h) but the ranking aggregate is still alive (TTL 12h). The seeder's
bulk-warm call was hitting get-resilience-ranking without ?refresh=1, so
the handler's cache-hit early-return fired and the entire warm path was
skipped. Scores stayed missing; coverage degraded; the only recovery was
the per-country laggard loop (5-request batches) — which silently no-ops
when WM_KEY is absent. This defeated the whole point of the chunked bulk
warm introduced in this PR.

Fix: the bulk-warm fetch at scripts/seed-resilience-scores.mjs:167 now
appends ?refresh=1, matching the rebuild call. Every seeder-initiated hit
on the ranking endpoint forces the handler to route through
warmMissingResilienceScores and its chunked pipeline SET, regardless of
whether the aggregate is still cached.

Test extended: structural assertion now scans ALL occurrences of
get-resilience-ranking in the seeder and requires every one of them to
carry ?refresh=1. Fails the moment a future change adds a bare call.

Full resilience suite: 376/376.

* fix(resilience-ranking): gate ?refresh=1 on seed key + detect partial pipeline publish

Reviewer P1: ?refresh=1 was honored for any caller — including valid Pro
bearer tokens. A full warm is ~222 score computations + chunked pipeline
SETs; a Pro user looping on refresh=1 (or an automated client) could DoS
Upstash quota and Edge budget. Gate refresh behind
WORLDMONITOR_VALID_KEYS / WORLDMONITOR_API_KEY (X-WorldMonitor-Key
header) — the same allowlist the cron uses. Pro bearer tokens get the
standard cache-first path; refresh requires the seed service key.
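
A rough sketch of the gate, under two assumptions not stated in this log: that WORLDMONITOR_VALID_KEYS is a comma-separated list, and that the handler has already read the X-WorldMonitor-Key header value. Function names are hypothetical.

```javascript
// Assumption: WORLDMONITOR_VALID_KEYS is comma-separated.
function isSeedKeyValid(headerValue, env) {
  if (!headerValue) return false;
  const allowed = new Set(
    [env.WORLDMONITOR_API_KEY, ...(env.WORLDMONITOR_VALID_KEYS || '').split(',')]
      .map((k) => (k || '').trim())
      .filter(Boolean),
  );
  return allowed.has(headerValue);
}

// refresh=1 is honored only together with a valid seed key;
// everything else stays on the cache-first path.
function shouldForceRefresh(refreshRequested, headerValue, env) {
  return Boolean(refreshRequested) && isSeedKeyValid(headerValue, env);
}
```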

Reviewer P2: the handler's atomic runRedisPipeline SET of ranking + meta
is non-transactional on Upstash REST — either SET can fail independently.
If the ranking landed but meta missed, the seeder's STRLEN verify would
pass (ranking present) while /api/health stays stuck on stale meta.

Two-part fix:
- Handler inspects pipelineResult[0] and [1] and logs a warning when
  either SET didn't return OK. Ops-greppable signal.
- Seeder's verify now checks BOTH keys in parallel: STRLEN on ranking
  data, and GET + fetchedAt freshness (<5min) on seed-meta. Partial
  publish logs a warning; next cron retries (SET is idempotent).
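
A sketch of both checks as pure functions, assuming each pipeline entry has the shape `{ result: 'OK' }` or `{ error: '...' }` and that seed-meta carries a numeric `fetchedAt` in milliseconds. Function names are hypothetical; the 5-minute window is from the text above.

```javascript
const META_MAX_AGE_MS = 5 * 60 * 1000;

// Handler side: report which SETs in the pipeline did not return OK.
function failedSetIndexes(pipelineResult) {
  const failed = [];
  pipelineResult.forEach((entry, i) => {
    if (!entry || entry.result !== 'OK') failed.push(i);
  });
  return failed;
}

// Seeder side: ranking must be non-empty AND meta must be recent.
function verifyPublish(rankingStrlen, meta, nowMs) {
  const rankingOk = Number(rankingStrlen) > 0;
  const metaOk =
    meta != null &&
    Number.isFinite(meta.fetchedAt) &&
    nowMs - meta.fetchedAt < META_MAX_AGE_MS;
  // partial = exactly one of the two keys landed.
  return { rankingOk, metaOk, partial: rankingOk !== metaOk };
}
```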

Tests:
- New: ?refresh=1 without/with-wrong X-WorldMonitor-Key must NOT trigger
  recompute (falls back to cached response). Existing bypass test updated
  to carry a valid seed key header.

Full resilience suite: 376/376 + 1 new = 377/377.
2026-04-16 12:48:41 +04:00
Elie Habib
dc10e47197 feat(seed-contract): PR 1 foundation — envelope + contract + conformance test (#3095)
* feat(seed-contract): PR 1 foundation — envelope helpers + contract validators + static conformance test

Adds the foundational pieces for the unified seed contract rollout described in
docs/plans/2026-04-14-002-fix-runseed-zero-record-lockout-plan.md. Behavior-
preserving by construction: legacy-shape Redis values unwrap as { _seed: null,
data: raw } and pass through every helper unchanged.
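
A simplified sketch of the legacy passthrough, not the real helper in scripts/_seed-envelope-source.mjs. The null handling and the envelope detection rule here are assumptions.

```javascript
// Sketch only: unwrap a Redis value into { _seed, data }.
function unwrapEnvelopeSketch(raw) {
  if (raw == null) return { _seed: null, data: null };
  let value = raw;
  if (typeof value === 'string') {
    try {
      value = JSON.parse(value);
    } catch {
      return { _seed: null, data: raw }; // not JSON: treat as legacy
    }
  }
  const isEnvelope =
    value !== null &&
    typeof value === 'object' &&
    !Array.isArray(value) &&
    value._seed !== null &&
    typeof value._seed === 'object' &&
    'data' in value;
  // Legacy shapes carry no _seed wrapper and pass through as data.
  return isEnvelope
    ? { _seed: value._seed, data: value.data }
    : { _seed: null, data: value };
}
```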

New files:
- scripts/_seed-envelope-source.mjs — single source of truth for unwrapEnvelope,
  stripSeedEnvelope, buildEnvelope.
- api/_seed-envelope.js — edge-safe mirror (AGENTS.md:80 forbids api/* importing
  from server/).
- server/_shared/seed-envelope.ts — TS mirror with SeedMeta, SeedEnvelope,
  UnwrapResult types.
- scripts/_seed-contract.mjs — SeedContractError + validateDescriptor (10
  required fields, 10 optional, unknown-field rejection) + resolveRecordCount
  (non-negative integer or throw).
- scripts/verify-seed-envelope-parity.mjs — diffs function bodies between the
  two JS copies; TS copy guarded by tsc.
- tests/seed-envelope.test.mjs — 14 tests for the three helpers (null,
  legacy-passthrough, stringified JSON, round-trip).
- tests/seed-contract.test.mjs — 25 tests for validateDescriptor/
  resolveRecordCount + a soft-warn conformance scan that STATICALLY parses
  scripts/seed-*.mjs (never dynamic import — several seeders process.exit() at
  module load). Currently logs 91 seeders awaiting declareRecords migration.

Wiring (minimal, behavior-preserving):
- api/health.js: imports unwrapEnvelope; routes readSeedMeta's parsed value
  through it. Legacy meta has no _seed wrapper → passes through unchanged.
- scripts/_bundle-runner.mjs: readSectionFreshness prefers envelope at
  section.canonicalKey when present, falls back to the existing
  seed-meta:<key> read via section.seedMetaKey (unchanged path today since no
  bundle defines canonicalKey yet).

No seeder modified. No writes changed. All 5279 existing data tests still
green; both typechecks clean; parity verifier green; 39 new tests pass.

PR 2 will migrate seeders, bundles, and readers to envelope semantics. PR 3
removes the legacy path and hard-fails the conformance test.

* fix(seed-contract): address PR #3095 review — metaTtlSeconds opt, bundle fallback, strict conformance mode

Review findings applied:

P1 — metaTtlSeconds missing from OPTIONAL_FIELDS whitelist.
scripts/seed-jodi-gas.mjs:250 passes metaTtlSeconds to runSeed(); field is
consumed by _seed-utils writeSeedMeta. Without it in the whitelist, PR 2's
validateDescriptor wiring would throw 'unknown field' the moment jodi-gas
migrates. Added with a 'removed in PR 3' note.

P2 — Bundle canonicalKey short-circuit over-runs during migration.
readSectionFreshness previously returned null if canonicalKey had no envelope
yet, even when a legacy seed-meta key was also declared — making every cron
re-run the section. Fixed to fall through to seedMetaKey on null envelope so
the transition state is safe.

P3 — Conformance soft-warn signal was invisible in CI.
tests/seed-contract.test.mjs now emits a t.diagnostic summary line
('N/M seeders export declareRecords') visible on every run and gates hard-fail
behind SEED_CONTRACT_STRICT=1 so PR 3 can flip to strict without more code.

Nitpick — parity regex missed 'export async function'.
Added '(?:async\s+)?' to scripts/verify-seed-envelope-parity.mjs function
extraction regex.

Verified: 39 tests green, parity verifier green, strict mode correctly
hard-fails with 91 seeders missing (expected during PR 1).

* fix(seed-contract): address review round 2 — NaN/empty-string validation, Error cause, parity CI wiring

P2 — Non-finite ttlSeconds/maxStaleMin bypassed validation.
`typeof NaN === 'number'` and `NaN <= 0 === false` meant a NaN duration
passed the old typeof+<=0 checks and would have poisoned TTLs once
validateDescriptor is wired into runSeed. Now gated by Number.isFinite,
which rejects NaN and ±Infinity. Tests added for NaN/Infinity on both
fields.
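
To illustrate why the old check let NaN through, a small sketch with hypothetical function names. The old check is reconstructed from the description above, not copied from the source.

```javascript
// Reconstructed old check: typeof + "reject when <= 0".
function oldCheckPasses(value) {
  return typeof value === 'number' && !(value <= 0);
}

// New check: also require a finite number.
function isValidDuration(value) {
  return typeof value === 'number' && Number.isFinite(value) && value > 0;
}
```

Every comparison against NaN is false, so `NaN <= 0` never triggers the rejection; `Number.isFinite` closes that gap and also rejects ±Infinity.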

P2 — Empty/whitespace-only strings for domain/resource/canonicalKey/sourceVersion
bypassed validation. Added .trim() === '' rejection + tests per field.
This mattered because canonicalKey='' would have landed writes at the
empty key and seed-meta under a blank resource namespace.

P3 — SeedContractError silently dropped the Error v2 cause option.
Constructor now forwards { cause } through super() so err.cause works
with standard tooling (Node's default stack printer, Sentry chained-cause
serialization). resolveRecordCount's manual err.cause = err assignment
was replaced with the options-bag form. Test added for both constructor
direct-use and the resolveRecordCount wrap path.
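
A minimal sketch of the options-bag form, using `Sketch`-suffixed names to mark it as illustrative rather than the real class. The error messages are invented.

```javascript
class SeedContractErrorSketch extends Error {
  constructor(message, options = {}) {
    // Forward the options bag so the runtime sets err.cause itself.
    super(message, options.cause !== undefined ? { cause: options.cause } : undefined);
    this.name = 'SeedContractError';
  }
}

// Wrap path: a throwing declareRecords is reported with its cause attached.
function resolveRecordCountSketch(declareRecords, data) {
  let count;
  try {
    count = declareRecords(data);
  } catch (err) {
    throw new SeedContractErrorSketch('declareRecords threw', { cause: err });
  }
  if (!Number.isInteger(count) || count < 0) {
    throw new SeedContractErrorSketch(`invalid record count: ${count}`);
  }
  return count;
}
```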

P3 — Parity verifier was not on an automated path. Added
tests/seed-envelope-parity.test.mjs which spawns scripts/verify-seed-envelope-parity.mjs
via execFile; non-zero exit (drift) → test fails. Now runs as part of
`npm run test:data` (tsx --test tests/*.test.mjs). Drift injection
confirmed: sed -i modifying api/_seed-envelope.js makes the test fail
with 'Command failed' from execFile.

51 tests total (was 39). All green on clean tree.

* fix(seed-contract): conformance test checks full descriptor, not just declareRecords

Previous conformance check green-lit any seeder that exported
declareRecords, even if the runSeed(...) call-site omitted other
validateDescriptor-required opts (validateFn, ttlSeconds, sourceVersion,
schemaVersion, maxStaleMin). That would have produced a false readiness
signal for PR 3's strict flip: test goes green, but wiring
validateDescriptor() into runSeed in PR 2 would still throw at runtime
across the fleet.

Examples verified on the PR head:
- scripts/seed-cot.mjs:188-192 — no sourceVersion/schemaVersion/maxStaleMin
- scripts/seed-market-breadth.mjs:121-124 — same
- scripts/seed-jodi-gas.mjs:248-253 — no schemaVersion/maxStaleMin

Now the conformance test:
1. AST-lite extracts the runSeed(...) call site with balanced parens,
   tolerating strings and comments.
2. Checks every REQUIRED_OPTS_FIELDS entry (validateFn, declareRecords,
   ttlSeconds, sourceVersion, schemaVersion, maxStaleMin) is present as
   an object key in that call-site.
3. Emits a per-file diagnostic listing missing fields.
4. Migration signal is now accurate: 0/91 seeders fully satisfy the
   descriptor (was claiming 0/91 missing just declareRecords). Matches
   the underlying validateDescriptor behavior.

Verified: strict mode (SEED_CONTRACT_STRICT=1) surfaces 'opt:schemaVersion,
opt:maxStaleMin' as missing fields per seeder — actionable for PR 2
migration work. 51 tests total (unchanged count; behavior change is in
which seeders the one conformance test considers migrated).

* fix(seed-contract): strip comments/strings before parsing runSeed() call site

The conformance scanner located the first 'runSeed(' substring in the raw
source, which caught commented-out mentions upstream of the real call.
Offending files where this produced false 'incomplete' diagnoses:
- scripts/seed-bis-data.mjs:209 // runSeed() calls process.exit(0)…
  real call at :220
- scripts/seed-economy.mjs:788 header comment mentioning runSeed()
  real call at :891

Three files had the same pattern. Under strict mode these would have been
false hard failures in PR 3 even when the real descriptor was migrated.

Fix:
- stripCommentsAndStrings(src) produces a view where block comments, line
  comments, and string/template literals are replaced with spaces (line
  feeds preserved). Indices stay aligned with the original source so
  extractRunSeedCall can match against the stripped view and then slice
  the original source for the real call body.
- descriptorFieldsPresent() also runs its field-presence regex against
  the stripped call body so '// TODO: validateFn' inside the call doesn't
  fool the check.
- hasRunSeedCall() uses the stripped view too, which correctly excludes
  5 seeders that only mentioned runSeed in comments. Count dropped
  91→86 real callers.
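
A simplified sketch of the index-preserving strip. It is an approximation, not the test file's implementation: it treats template literals as plain strings (no `${}` recursion) and does not recognize regex literals.

```javascript
// Replace comments and string/template literals with spaces, keeping
// newlines, so indices in the result line up with the original source.
function stripCommentsAndStrings(src) {
  const out = src.split('');
  const blank = (from, to) => {
    for (let k = from; k < to; k++) {
      if (out[k] !== '\n') out[k] = ' ';
    }
  };
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    if (ch === '/' && next === '/') {
      let end = src.indexOf('\n', i);
      if (end === -1) end = src.length;
      blank(i, end);
      i = end;
    } else if (ch === '/' && next === '*') {
      let end = src.indexOf('*/', i + 2);
      end = end === -1 ? src.length : end + 2;
      blank(i, end);
      i = end;
    } else if (ch === '"' || ch === "'" || ch === '`') {
      let j = i + 1;
      while (j < src.length && src[j] !== ch) {
        if (src[j] === '\\') j++; // skip the escaped character
        j++;
      }
      const end = Math.min(j + 1, src.length);
      blank(i, end);
      i = end;
    } else {
      i++;
    }
  }
  return out.join('');
}
```

Because the lengths match, a caller can search the stripped view for `runSeed(` and slice the original source at the same offsets.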

Added 4 targeted tests covering:
- runSeed() inside a line comment ahead of the real call
- runSeed() inside a block comment
- runSeed() inside a string literal ("don't call runSeed() directly")
- descriptor field names inside an inline comment don't count as present

Verified on the actual files: seed-bis-data.mjs first real runSeed( in
stripped source is at line 220 (was line 209 before fix).

55 tests total (was 51), all green.

* fix(seed-contract): parity verifier survives unbalanced braces in string/template literals

Addresses Greptile P2 on PR #3095: the body extractor in
scripts/verify-seed-envelope-parity.mjs counted raw { and } on every
character. A future helper body that legitimately contains
`const marker = '{'` would have pushed depth past zero at the literal
brace and truncated the body — silently masking drift in the rest of
the function.

Extracted the scan into scanBalanced(source, start, open, close) which
skips characters inside line comments, block comments, and string /
template literals (with escape handling and template-literal ${} recursion
for interpolation). Call sites in extractFunctions updated to use the new
scanner for both the arg-list parens and the function body braces.

Made extractFunctions and scanBalanced exported so the new test file
can exercise them directly. Gated main() behind an isMain check so
importing the module from tests doesn't trigger process.exit.

New tests in tests/seed-envelope-parity.test.mjs:
- extractFunctions tolerates unbalanced braces in string literals
- same for template literals
- same for braces inside block comments
- same for braces inside line comments
- scanBalanced respects backslash-escapes inside strings
- scanBalanced recurses into template-literal ${} interpolation

Also addresses the other two Greptile P2s which were already fixed in
earlier commits on this branch:
- Empty-string gap (99646dd9a): .trim()==='' rejection added
- SeedContractError cause drop (99646dd9a): constructor forwards cause
  through super's options bag per Error v2 spec

61 tests green. Both typechecks clean.
2026-04-14 22:11:56 +04:00
Elie Habib
e32d9b631c feat(market): Hyperliquid perp positioning flow as leading indicator (#3074)
* feat(market): Hyperliquid perp positioning flow as leading indicator

Adds a 4-component composite (funding × volume × OI × basis) "positioning
stress" score for ~14 perps spanning crypto (BTC/ETH/SOL), tokenized gold
(PAXG), commodity perps (WTI, Brent, Gold, Silver, Pt, Pd, Cu, NatGas), and
FX perps (EUR, JPY). Polls Hyperliquid /info every 5min via Railway cron;
publishes a single self-contained snapshot with embedded sparkline arrays
(60 samples = 5h history). Surfaces as a new "Perp Flow" tab in
CommoditiesPanel with separate Commodities / FX sections.

Why: existing CFTC COT is weekly + US-centric; market quotes are price-only.
Hyperliquid xyz: perps give 24/7 global positioning data that has been shown
to lead spot moves on commodities and FX by minutes-to-hours.

Implementation:
- scripts/seed-hyperliquid-flow.mjs — pure scoring math, symbol whitelist,
  content-type + schema validation, prior-state read via readSeedSnapshot(),
  warmup contract (first run / post-outage zeroes vol/OI deltas),
  missing-symbol carry-forward, $500k/24h min-notional guard to suppress
  thin xyz: noise. TTL 2700s (9× cadence).
- proto/worldmonitor/market/v1/get_hyperliquid_flow.proto + service.proto
  registration; make generate regenerated client/server bindings.
- server/worldmonitor/market/v1/get-hyperliquid-flow.ts — getCachedJson
  reader matching get-cot-positioning.ts seeded-handler pattern.
- server/gateway.ts cache-tier entry (medium).
- api/health.js: hyperliquidFlow registered with maxStaleMin:15 (3× cadence)
  + transitional ON_DEMAND_KEYS gate for the first ~7 days of bake-in.
- api/seed-health.js mirror with intervalMin:5.
- scripts/seed-bundle-market-backup.mjs entry (NIXPACKS auto-redeploy on
  scripts/** watch).
- src/components/MarketPanel.ts: CommoditiesPanel grows a Perp Flow tab
  + fetchHyperliquidFlow() RPC method; OI Δ1h derived from sparkOi tail.
- src/App.ts: prime via primeVisiblePanelData() + recurring refresh via
  refreshScheduler.scheduleRefresh() at 5min cadence (panel does NOT own
  setInterval; matches the App.ts:1251 lifecycle convention).
- 28 unit tests covering scoring parity, warmup flag, min-notional guard,
  schema rejection, missing-symbol carry-forward, post-outage cold start,
  sparkline cap, alert threshold.

Tests: test:data 5169/5169, hyperliquid-flow-seed 28/28, route-cache-tier
5/5, typecheck + typecheck:api green. One pre-existing test:sidecar failure
(cloud-fallback origin headers) is unrelated and reproduces on origin/main.

* fix(hyperliquid-flow): address review feedback — volume baseline window, warmup lifecycle, error logging

Two real correctness bugs and four review nits from PR #3074 review pass.

Correctness fixes:

1. Volume baseline was anchored to the OLDEST 12 samples, not the newest.
   sparkVol is newest-at-tail (shiftAndAppend), so slice(0, 12) pinned the
   rolling mean to the first hour of data forever once len >= 12. Volume
   scoring would drift further from current conditions each poll. Switched
   to slice(-VOLUME_BASELINE_MIN_SAMPLES) so the baseline tracks the most
   recent window. Regression test added.

2. Warmup flag flipped to false on the second successful poll while volume
   scoring still needed 12+ samples to activate. UI told users warmup
   lasted ~1h but the badge disappeared after 5 min. Tied per-asset warmup
   to real baseline readiness (coldStart OR vol samples < 12 OR prior OI
   missing). Snapshot-level warmup = any asset still warming. Three new
   tests cover the persist-through-baseline-build, clear-once-ready, and
   missing-OI paths.
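
A minimal sketch of both fixes. `VOLUME_BASELINE_MIN_SAMPLES` and the three warmup conditions come from the text above; the function names and the input shape are hypothetical.

```javascript
const VOLUME_BASELINE_MIN_SAMPLES = 12;

// sparkVol is newest-at-tail, so the baseline must read from the end.
function volumeBaseline(sparkVol) {
  if (sparkVol.length < VOLUME_BASELINE_MIN_SAMPLES) return null;
  const recent = sparkVol.slice(-VOLUME_BASELINE_MIN_SAMPLES);
  return recent.reduce((sum, v) => sum + v, 0) / recent.length;
}

// Per-asset warmup: cold start, too few volume samples, or no prior OI.
function isWarmingUp({ coldStart, sparkVol, priorOi }) {
  return (
    Boolean(coldStart) ||
    sparkVol.length < VOLUME_BASELINE_MIN_SAMPLES ||
    priorOi == null
  );
}
```

With `slice(0, 12)` the same function would keep returning the mean of the first hour no matter how many samples followed.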

Review nits:

- Handler: bare catch swallowed Redis/parse errors; now logs err.message.
- Panel: bare catch in fetchHyperliquidFlow hid RPC 500s; now logs.
- MarketPanel.ts: deleted hand-rolled RawHyperliquidAsset; mapHyperliquidFlowResponse
  now takes GetHyperliquidFlowResponse from the generated client so proto
  drift fails compilation instead of silently.
- Seeder: added @ts-check + JSDoc on computeAsset per type-safety rule.
- validateUpstream: MAX_UPSTREAM_UNIVERSE=2000 cap bounds memory.
- buildSnapshot: logs unknown xyz: perps upstream (once per run) so ops
  sees when Hyperliquid adds markets we could whitelist.

Tests: 37/37 green; typecheck + typecheck:api clean.

* fix(hyperliquid-flow): wire bootstrap hydration per AGENTS.md mandate

Greptile review caught that AGENTS.md:187 mandates new data sources be wired
into bootstrap hydration. Plan had deferred this on "lazy deep-dive signal"
grounds, but the project convention is binding.

- server/_shared/cache-keys.ts: add hyperliquidFlow to BOOTSTRAP_CACHE_KEYS
  + BOOTSTRAP_TIERS ('slow' — non-blocking, page-load-parallel).
- api/bootstrap.js: add to inlined BOOTSTRAP_CACHE_KEYS + SLOW_KEYS so
  bootstrap.test.mjs canonical-mirror assertions pass.
- src/components/MarketPanel.ts:
  - Import getHydratedData from @/services/bootstrap.
  - New mapHyperliquidFlowSeed() normalizes the raw seed-JSON shape
    (numeric fields) into HyperliquidFlowView. The RPC mapper handles the
    proto shape (string-encoded numbers); bootstrap emits the raw blob.
  - fetchHyperliquidFlow now reads hydrated data first, renders
    immediately, then refreshes from RPC — mirrors FearGreedPanel pattern.

Tests: 72/72 green (bootstrap + cache-tier + hyperliquid-flow-seed).
2026-04-14 08:05:40 +04:00
Elie Habib
46d17efe55 fix(resilience): wider FX YoY upstream + sanctions absolute threshold (#3071)
* fix(resilience): wider FX YoY upstream + sanctions absolute threshold

Two backtest families consistently failed Outcome-Backtest gates because
the detectors were reading the wrong shape of upstream data, not because
the upstream seeders were missing.

FX Stress (was AUC=0.500):
- BIS WS_EER (`economic:bis:eer:v1`) only covers 12 G10/major-EM countries
  — Argentina, Egypt, Turkey, Pakistan, Nigeria etc. are absent, so the
  detector had no positive events to score against
- Add `seed-fx-yoy.mjs` fetching Yahoo Finance 2-year monthly history
  across 45 single-country currencies, computing YoY % and 24-month
  peak-to-trough drawdown
- Switch detector to read drawdown24m with -15% threshold (matches
  methodology spec); falls back to yoyChange/realChange for back-compat
- Why drawdown not just YoY: rolling 12-month windows slice through
  historic crises (Egypt's March 2024 devaluation falls outside an
  April→April window by 2026); drawdown captures actual stress magnitude
  regardless of crisis timing
- Verified locally: flags AR (-38%), TR (-28%), NG (-21%), MX (-18%)

Sanctions Shocks (was AUC=0.624):
- Detector previously used top-quartile (Q3) of country-counts which
  conflated genuine comprehensive-sanctions targets (RU, IR, KP, CU,
  SY, VE, BY, MM) with financial hubs (UK, CH, DE, US) hosting many
  sanctioned entities
- Replace with absolute threshold of 100 entities — the OFAC
  distribution is heavy-tailed enough that this cleanly separates
  targets from hubs

Both fixes use existing seeded data (or new seeded data via
seed-fx-yoy.mjs) — no hardcoded country curation.

api/health.js: register `economic:fx:yoy:v1` in STANDALONE_KEYS +
SEED_META so the validation cron monitors freshness.

Railway: deploy `seed-fx-yoy` as a daily cron service (NIXPACKS
builder, startCommand `node scripts/seed-fx-yoy.mjs`, schedule
`30 6 * * *`).

* fix(seed-fx-yoy): use running-peak max-drawdown instead of global peak

PR #3071 review (P1): the original drawdown calculation found the global
maximum across the entire window, then the lowest point AFTER that peak.
This silently erased earlier crashes when the currency later recovered to
a new high — exactly the class of events the FX Stress family is trying
to capture.

Example series [5, 10, 7, 9, 6, 11, 10]: true worst drawdown is 10 → 6 =
-40%, but the broken implementation picked the later global peak 11 and
reported only 11 → 10 = -9.1%.

Fix: sweep forward tracking the running peak; for each subsequent bar
compute the drop from that running peak; keep the largest such drop.
This is the standard max-drawdown computation and correctly handles
recover-then-fall-again sequences.
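
A minimal sketch of the running-peak sweep, returning the drawdown as a negative fraction. The function name and the return convention are assumptions; the seeder's real output shape also tracks peak and trough dates.

```javascript
// Standard max drawdown: largest drop from any running peak.
function maxDrawdown(series) {
  let peak = -Infinity;
  let worst = 0;
  for (const price of series) {
    if (price > peak) {
      peak = price; // new running peak
    } else if (peak > 0) {
      const drop = (price - peak) / peak;
      if (drop < worst) worst = drop;
    }
  }
  return worst; // 0 for a pure appreciation series
}
```

On the reviewer's series `[5, 10, 7, 9, 6, 11, 10]` this returns -0.4 (the 10 → 6 fall), where the global-peak version saw only 11 → 10.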

Live data verification:
- BR (Brazilian real) was missing from the flagged set under the broken
  algorithm because BRL recovered above its 2024-04 peak. With the fix it
  correctly surfaces at drawdown=-15.8% (peak 2024-04, trough 2024-12).
- KR / CO peaks now resolve to mid-series dates instead of end-of-window,
  proving the running-peak scan is finding intermediate peaks.

Tests added covering: reviewer's regression case, peak-at-start (NGN
style), pure appreciation, multi-trough series, yoyChange anchor.

* fix(health): gate fxYoy as on-demand to avoid post-merge CRIT alarm

PR #3071 review (P1): registering `fxYoy` as a required standalone
seeded key creates an operational hazard during the deploy gap. After
merge, Vercel auto-deploys `api/health.js` immediately, but the
`seed-fx-yoy` Railway cron lives in a separate deployment surface that
must be triggered manually. Any gap (deploy-order race, first-cron
failure, env var typo) flips health to DEGRADED/UNHEALTHY because
`classifyKey()` marks the check as `EMPTY` without an on-demand or
empty-data-OK exemption.

Add `fxYoy` to ON_DEMAND_KEYS as a transitional safety net (matches the
pattern recovery* uses for "stub seeders not yet deployed"). The key is
still monitored — freshness via seed-meta — but missing data downgrades
from CRIT to WARN, which won't page anyone. Once the Railway cron has
fired cleanly for ~7 days in production we can remove this entry and
let it be a hard-required key like the rest of the FRED family.

Note: the Railway service IS already provisioned (cron `30 6 * * *`,
0.5 vCPU / 0.5 GB, NIXPACKS, watchPatterns scoped to the seeder + utils)
and the `economic:fx:yoy:v1` key is currently fresh in production from
local test runs. The gating here is defense-in-depth against the
operational coupling, not against a known absent key.
2026-04-13 21:57:11 +04:00
Elie Habib
4741689d26 fix(health): EMPTY+records contradiction, year rollover, edge waitUntil, per-command errors (#3056)
* fix(health): resolve EMPTY+records contradiction, displacement-year rollover, edge waitUntil, per-command errors

Code review of api/health.js surfaced six bugs / design issues:

1. EMPTY status with records>0 (contradiction): when a data key was missing
   but seed-meta still cached the prior count, status reported EMPTY while
   records displayed the stale count (e.g. energyMixAll: EMPTY records=209).
   classifyKey() now forces records=0 whenever hasData is false so the two
   fields cannot disagree.

2. strlen heuristic misclassified small payloads: hasData = strlen > 10
   treated any payload <11 bytes as missing, including legitimate {}, [],
   or single-digit numbers. Replaced with strlenIsData(strlen): >0 and not
   exactly the negative-cache sentinel length.

3. Displacement key rolls over on UTC Jan 1 and went CRIT for hours every
   year. Added displacementPrev sibling and CASCADE_GROUPS entry so health
   stays OK_CASCADE during the cutover.

4. Summary 'warn' included on-demand-empty keys but 'overall' did not,
   so HEALTHY responses showed warn>0. Summary now reports realWarnCount
   and surfaces onDemandWarn separately.

5. Background last-failure write used void Promise.catch(() => {}). Edge
   isolates can terminate before it resolves; now uses ctx.waitUntil when
   available.

6. Per-command Redis errors silently became 'EMPTY'. Pipeline result
   errors are now collected into keyErrors and surface as REDIS_PARTIAL.
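
Items 1 and 2 can be sketched as follows. The sentinel length is a hypothetical placeholder, and `classifySketch` reduces the real classifyKey() to the two fields in question.

```javascript
// Hypothetical placeholder; the real sentinel length is not shown in this log.
const NEG_SENTINEL_LENGTH = 10;

// Any non-empty payload counts as data unless it is exactly sentinel-sized.
function strlenIsData(strlen) {
  return strlen > 0 && strlen !== NEG_SENTINEL_LENGTH;
}

// records is forced to 0 whenever hasData is false, so status and
// records cannot disagree.
function classifySketch(strlen, metaRecordCount) {
  const hasData = strlenIsData(strlen);
  return {
    status: hasData ? 'OK' : 'EMPTY',
    records: hasData ? metaRecordCount : 0,
  };
}
```

One trade-off of a length-based rule like this: a legitimate payload that happens to be exactly sentinel-sized would still be classified as missing.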

Also collapsed the 150-line BOOTSTRAP/STANDALONE duplicated loops into a
shared classifyKey() helper. DEGRADED threshold is now 3% of total checks
instead of a hardcoded 3 (so adding keys does not silently raise the bar).

* fix(health): also track per-command errors on seed-meta GET half

Review on PR #3056: error tracking only covered the STRLEN data half of the
pipeline. A failed GET seed-meta:* command was returned as result=null and
silently fell through to STALE_SEED instead of surfacing as REDIS_PARTIAL.
Collect keyMetaErrors alongside keyErrors and route either failure to the
REDIS_PARTIAL branch in classifyKey.

* fix(health): drop dead cascadeCovered check in records===0 branch

Greptile review on PR #3056: that branch only runs when hasData=true, and
isCascadeCovered() short-circuits to false in that case, so the cascade
check was structurally unreachable. Replaced with a comment so a future
maintainer doesn't add cascade siblings expecting empty-data shielding.
2026-04-13 16:24:22 +04:00
Elie Habib
cd5ed0d183 feat(seeds): BIS DSR + property prices (2 of 7) (#3048)
* feat(seeds): BIS DSR + property prices (2 of 7)

Ships 2 of 7 BIS dataflows flagged as genuinely new signals in #3026 —
the rest are redundant with IMF/WB or are low-fit global aggregates.

New seeder: scripts/seed-bis-extended.mjs
  - WS_DSR   household debt service ratio (% income, quarterly)
  - WS_SPP   residential property prices (real index, quarterly)
  - WS_CPP   commercial property prices (real index, quarterly)

Gold-standard pattern: atomic publish + writeExtraKey for extras, retry
on missing startPeriod, TTL = 3 days (6× 12h cron), runSeed drives
seed-meta:economic:bis-extended. Series selection scores dimension
matches (PP_VALUATION=R / UNIT_MEASURE=628 for property, DSR_BORROWERS=P
/ DSR_ADJUST=A for DSR), then falls back to observation count.

Wired into:
  - bootstrap (slow tier) + cache-keys.ts
  - api/health.js (STANDALONE_KEYS + SEED_META, maxStaleMin = 24h)
  - api/mcp.ts get_economic_data tool (_cacheKeys + _freshnessChecks)
  - resilience macroFiscal: new householdDebtService sub-metric
    (weight 0.05, currentAccountPct rebalanced 0.3 → 0.25)
  - Housing Cycle tile on CountryDeepDivePanel (Economic Indicators card)
    with euro-area (XM) fallback for EU member states
  - seed-bundle-macro Railway cron (BIS-Extended, 12h interval)

Tests: tests/bis-extended-seed.test.mjs covers CSV parsing, series
selection, quarter math + YoY. Updated resilience golden-value tests
for the macroFiscal weight rebalance.

Closes #3026

https://claude.ai/code/session_01DDo39mPD9N2fNHtUntHDqN

* fix(resilience): unblock PR #3048 on #3046 stack

- rebase onto #3046; final macroFiscal weights: govRevenue 0.40, currentAccount 0.20, debtGrowth 0.20, unemployment 0.15, householdDebtService 0.05 (=1.00)
- add updateHousingCycle? stub to CountryBriefPanel interface so country-intel dispatch typechecks
- add HR to EURO_AREA fallback set (joined euro 2023-01-01)
- seed-bis-extended: extend SPP/CPP TTLs when DSR fetch returns empty so the rejected publish does not silently expire the still-good property keys
- update resilience goldens for the 5-sub-metric macroFiscal blend

* fix(country-brief): housing tile renders em-dash for null change values

The new Housing cycle tile used `?? 0` to default qoqChange/yoyChange/change
when missing, fabricating a flat "0.0%" label (with positive-trend styling)
for countries with no prior comparable period. Fetch path and builders
correctly return null; the panel was coercing it.

formatPctTrend now accepts null|undefined and returns an em-dash, matching
how other cards surface unavailable metrics. Drop the `?? 0` fallbacks at
the three housing call sites.
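
A sketch of the null-safe formatter. The one-decimal precision and the explicit `+` sign are assumptions about the panel's label style; only the null-to-em-dash behavior is taken from this commit.

```javascript
const EM_DASH = '\u2014';

// null/undefined (no comparable prior period) renders as an em-dash
// instead of a fabricated "0.0%".
function formatPctTrendSketch(value) {
  if (value == null || !Number.isFinite(value)) return EM_DASH;
  const sign = value > 0 ? '+' : '';
  return `${sign}${value.toFixed(1)}%`;
}
```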

* fix(seed-health): register economic:bis-extended seed-meta monitoring

12h Railway cron writes seed-meta:economic:bis-extended but it was
missing from SEED_DOMAINS, so /api/seed-health never reported its
freshness. intervalMin=720 matches maxStaleMin/2 (1440/2) from
api/health.js.

* fix(seed-bis-extended): decouple DSR/SPP/CPP so one fetch failure doesn't block the others

Previously validate() required data.entries.length > 0 on the DSR slice
after publishTransform pulled it out of the aggregate payload. If WS_DSR
fetch failed but WS_SPP / WS_CPP succeeded, validate() rejected the
publish → afterPublish() never ran → fresh SPP/CPP data was silently
discarded and only the old snapshots got a TTL bump.

This treats the three datasets as independent:

- SPP and CPP are now published (or have their existing TTLs extended)
  as side-effects of fetchAll(), per-dataset. A failure in one never
  affects the others.
- DSR continues to flow through runSeed's canonical-key path. When DSR
  is empty, publishTransform yields { entries: [] } so atomicPublish
  skips the canonical write (preserving the old DSR snapshot); runSeed's
  skipped branch extends its TTL and refreshes seed-meta.
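
A rough sketch of the per-dataset independence, with a hypothetical injected `io` object (`write`, `extendTtl`) standing in for the Redis helpers. It does not model runSeed's canonical-key path for DSR.

```javascript
// Publish fresh entries, or extend the existing key's TTL when empty.
async function publishOrExtend(name, entries, io) {
  if (Array.isArray(entries) && entries.length > 0) {
    await io.write(name, entries);
    return 'written';
  }
  await io.extendTtl(name);
  return 'ttl-extended';
}

// Each dataset is handled on its own; one failure never blocks the others.
async function publishAll(datasets, io) {
  const outcomes = {};
  for (const [name, entries] of Object.entries(datasets)) {
    try {
      outcomes[name] = await publishOrExtend(name, entries, io);
    } catch (err) {
      outcomes[name] = `failed: ${err.message}`;
    }
  }
  return outcomes;
}
```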

Shape B (one runSeed call, semantics changed) chosen over Shape A (three
sequential runSeed calls) because runSeed owns the lock + process.exit
lifecycle and can't be safely called three times in a row, and Shape B
keeps the single aggregate seed-meta:economic:bis-extended key that
health.js already monitors.

Tests cover both failure modes:
- DSR empty + SPP/CPP healthy → SPP/CPP written, DSR TTL extended
- DSR healthy + SPP/CPP empty → DSR written, SPP/CPP TTLs extended

* fix(health): per-dataset seed-meta for BIS DSR/SPP/CPP

Health was pointing bisDsr / bisPropertyResidential / bisPropertyCommercial
at the shared seed-meta:economic:bis-extended key, which runSeed refreshes
on every run (including its validation-failed "skipped" branch). A DSR-only
outage therefore left bisDsr reporting fresh in api/health.js while the
resilience scorer consumed stale/missing economic:bis:dsr:v1 data.

Write a dedicated seed-meta key per dataset ONLY when that dataset actually
published fresh entries. The aggregate bis-extended key stays as a
"seeder ran" signal in api/seed-health.js.

* fix(seed-bis-extended): write DSR seed-meta only after atomicPublish succeeds

Previously fetchAll() wrote seed-meta:economic:bis-dsr inline before
runSeed/atomicPublish ran. If atomicPublish then failed (Redis hiccup,
validate rejection, etc.), seed-meta was already bumped — health would
report DSR fresh while the canonical key was stale.

Move the DSR seed-meta write into a dsrAfterPublish callback passed to
runSeed via the existing afterPublish hook, which fires only after a
successful canonical publish. SPP/CPP paths already used this ordering
inside publishDatasetIndependently; this brings DSR in line.

Adds a regression test exercising dsrAfterPublish with mocked Upstash:
populated DSR -> single SET on seed-meta key; null/empty DSR -> zero
Redis calls.
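
The ordering contract above can be sketched roughly as follows. This is an
illustration only: `makeDsrAfterPublish`, the injected `redisSet` helper, the
`data.dsr.entries` shape, and the meta payload fields are assumptions, not
the seeder's actual code. Only the hook name `dsrAfterPublish` and the
`seed-meta:economic:bis-dsr` key come from this commit.

```javascript
// Hypothetical factory: returns an afterPublish-style hook that writes the
// DSR seed-meta key only when fresh DSR entries were actually published.
// `redisSet` is an injected stand-in for the real Upstash write helper.
function makeDsrAfterPublish(redisSet, now = Date.now) {
  return async function dsrAfterPublish(data) {
    // Assumed payload shape: { dsr: { entries: [...] } }
    const entries = data?.dsr?.entries ?? [];
    if (entries.length === 0) return false; // null/empty DSR: zero Redis calls
    const meta = { fetchedAt: now(), recordCount: entries.length };
    await redisSet('seed-meta:economic:bis-dsr', JSON.stringify(meta));
    return true; // exactly one SET was issued
  };
}
```

Because the hook is only invoked after a successful canonical publish, a
failed publish can never leave a fresh-looking seed-meta behind.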

* fix(resilience): per-dataset BIS seed-meta keys in freshness overrides

SOURCE_KEY_META_OVERRIDES previously collapsed economic:bis:dsr:v1 and
both property-* sourceKeys onto the aggregate seed-meta:economic:bis-extended
key. api/health.js (SEED_META) writes per-dataset keys
(seed-meta:economic:bis-dsr / bis-property-residential / bis-property-commercial),
so a DSR-only outage showed stale in /api/health but the resilience
dimension freshness code still reported macroFiscal inputs as fresh.

Map each BIS sourceKey to its dedicated seed-meta key to match health.js.
The aggregate bis-extended key is still written by the seeder and read by
api/seed-health.js as a "seeder ran" signal, so it is retained upstream.

* fix(bis): prefer households in DSR + per-dataset freshness in MCP

Greptile review catches on #3048:

1. buildDsr() was selecting DSR_BORROWERS='P' (private non-financial) while
   the UI labels it "Household DSR" and resilience scoring uses it as
   `householdDebtService`. Changed to 'H' (households). Countries without
   an H series now get dropped rather than silently mislabeled.
2. api/mcp.ts get_economic_data still read only the aggregate
   seed-meta:economic:bis-extended for freshness. If DSR goes stale while
   SPP/CPP keep publishing, MCP would report the BIS block as fresh even
   though one of its returned keys is stale. Swapped to the three
   per-dataset seed-meta keys (bis-dsr, bis-property-residential,
   bis-property-commercial), matching the fix already applied to
   /api/health and the resilience dimension-freshness pipeline.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-13 15:05:44 +04:00
Elie Habib
e7ef14aa02 fix(health): heal portwatch-disruptions + three stale-registry false alarms (#3051)
* fix(health): heal portwatch-disruptions + three stale-registry false alarms

* fix(resilience): log Upstash non-2xx when writing ranking seed-meta

fetch() doesn't throw on HTTP errors, so a 401/429/500 from Upstash would
be treated as success — the new meta write would fail silently and
/api/health would keep alerting with no diagnostic log. Check resp.ok
explicitly and log status + body snippet on failure.

Greptile review catch on #3051.
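
A minimal sketch of the explicit resp.ok check described above. The function
name `writeRankingMeta`, its parameters, and the log prefix are hypothetical;
the only facts relied on are that fetch() resolves (rather than rejects) on
HTTP error statuses and that Response exposes `ok`, `status`, and `text()`.

```javascript
// Hypothetical helper: POST a meta payload and treat non-2xx as a failure.
// `fetchImpl` is injectable so the behavior can be exercised without a network.
async function writeRankingMeta(url, token, payload, fetchImpl = fetch) {
  const resp = await fetchImpl(url, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (!resp.ok) {
    // fetch() did not throw, so the status must be inspected explicitly.
    const snippet = (await resp.text()).slice(0, 200);
    console.warn(`[ranking-meta] write failed: HTTP ${resp.status} ${snippet}`);
    return false;
  }
  return true;
}
```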

* fix(health): sync seed-health.js portwatch cadence with api/health.js (WEEK)

Companion fix for the same logical bug in api/health.js: api/seed-health.js
still treated 'portwatch:chokepoints-ref' as a daily cron (intervalMin 1440),
so its stale threshold (intervalMin*2 = 48h) would still raise a false
stale alert even though api/health.js was updated to 14d. Both endpoints
now agree at 14d for a WEEK-cadence seeder.

Greptile review catch on #3051.
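
The threshold arithmetic can be checked with a small sketch. The helper
names are hypothetical; the only rules taken from the commit messages are
that seed-health flags stale at intervalMin*2 and that intervalMin is set to
health.js maxStaleMin / 2.

```javascript
const MIN_PER_DAY = 1440;

// seed-health convention: a domain is stale after twice its interval.
function staleAfterMin(intervalMin) {
  return intervalMin * 2;
}

// Inverse: derive the seed-health interval from health.js maxStaleMin.
function intervalFromMaxStale(maxStaleMin) {
  return maxStaleMin / 2;
}

const dailyStaleHours = staleAfterMin(MIN_PER_DAY) / 60;              // daily cron
const weeklyStaleDays = staleAfterMin(7 * MIN_PER_DAY) / MIN_PER_DAY; // WEEK cadence
```

With a daily interval the stale window is 48 hours; with a weekly interval
it is 14 days, which matches the value both endpoints now agree on.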
2026-04-13 14:13:18 +04:00
Elie Habib
8089fd9d53 feat(resilience): publish resilience:static:fao aggregate from static seed (#3050)
* feat(resilience): publish resilience:static:fao aggregate from static seed

Weekly validation cron Outcome-Backtest reads resilience:static:fao for
the Food Crisis Escalation family, but nothing wrote that key — dangling
reference, Food Crisis stuck at AUC=0.5.

IPC Phase 3+ data is already fetched by fetchFsinDataset (HDX global IPC
CSV) and stored per-country. This PR reshapes the same in-memory map into
an aggregate view and writes it in the existing Redis pipeline — no extra
fetch, no new cron service.

Output shape matches what detectFoodCrisis already walks:
  { countries: { [iso2]: { ipcPhase, phase, peopleInCrisis, year, source } },
    count, fetchedAt, seedYear, source: 'hdx-ipc' }

Only Phase 3+ countries are included, matching IPC's own publish rule.
Absence = not-monitored-crisis, consistent with scoreFoodWater()'s
stable-absence semantics.

Tests: 5 unit tests for buildFaoAggregate (incl. contract test against
detectFoodCrisis) + 1 health.js registration test. No cron/Railway
changes needed — seed-bundle-static-ref picks it up on its next October
window; restart to backfill sooner.

FX Stress / Power Outages / Refugees / Conflict also fail today but for
different reasons (detector shape mismatches) — out of scope here.

* fix(resilience): wire resilienceStaticFao into SEED_META to unmask empty-state

Reviewer catch on #3050: adding resilienceStaticFao to STANDALONE_KEYS
and EMPTY_DATA_OK_KEYS without a matching SEED_META entry leaves
seedStale=null in the standalone-key health branch, so an empty or
missing resilience:static:fao key resolves to plain OK instead of
STALE_SEED — silently masking the exact bug this PR is meant to
surface.

Adds SEED_META.resilienceStaticFao pointing at seed-meta:resilience:static
(same heartbeat as resilienceStaticIndex, since the aggregate is written
in the same Redis pipeline by the same seeder). Now: missing data with
stale heartbeat -> STALE_SEED (warn); with fresh heartbeat and no
countries in Phase 3+ -> OK (still valid per EMPTY_DATA_OK_KEYS).

Same trap documented in feedback_empty_data_ok_keys_bootstrap_blind_spot.md
but in the STANDALONE_KEYS path, not BOOTSTRAP_KEYS.

Test locks it in with a source-string regex assertion.
2026-04-13 13:00:58 +04:00
Elie Habib
f5d8ff9458 feat(seeds): Eurostat house prices + quarterly debt + industrial production (#3047)
* feat(seeds): Eurostat house prices + quarterly debt + industrial production

Adds three new Eurostat overlay seeders covering all 27 EU members plus
EA20 and EU27_2020 aggregates (issue #3028):

- prc_hpi_a  (annual house price index, 10y sparkline, TTL 35d)
  key: economic:eurostat:house-prices:v1
  complements BIS WS_SPP (#3026) for the Housing cycle tile
- gov_10q_ggdebt (quarterly gov debt %GDP, 8q sparkline, TTL 14d)
  key: economic:eurostat:gov-debt-q:v1
  upgrades National Debt card cadence from annual IMF to quarterly for EU
- sts_inpr_m (monthly industrial production, 12m sparkline, TTL 5d)
  key: economic:eurostat:industrial-production:v1
  feeds "Real economy pulse" sparkline on Economic Indicators card

Shared JSON-stat parser in scripts/_eurostat-utils.mjs handles the EL/GR
and EA20 geo quirks and returns full time series for sparklines.

Wires each seeder into bootstrap (SLOW_KEYS), health registries (keys +
seed-meta thresholds matched to cadence), macro seed bundle, cache-keys
shared module, and the MCP tool registry (get_eu_housing_cycle,
get_eu_quarterly_gov_debt, get_eu_industrial_production). MCP tool count
updated to 31.

Tests cover JSON-stat parsing, sparkline ordering, EU-only coverage
gating (non-EU geos return null so panels never render blank tiles),
validator thresholds, and registry wiring across all surfaces.

https://claude.ai/code/session_01Tgm6gG5yUMRoc2LRAKvmza

* fix(bootstrap): register new Eurostat keys in tiers, defer consumers

Adds eurostatHousePrices/GovDebtQ/IndProd to BOOTSTRAP_TIERS ('slow') to
match SLOW_KEYS in api/bootstrap.js, and lists them as PENDING_CONSUMERS
in the hydration coverage test (panel wiring lands in follow-up).

* fix(eurostat): raise seeder coverage thresholds to catch partial publishes

The three Eurostat overlay seeders (house prices, quarterly gov debt,
monthly industrial production) all validated with makeValidator(10)
against a fixed 29-geo universe (EU27 + EA20 + EU27_2020). A bad run
returning only 10-15 geos would pass validation and silently publish
a snapshot missing most of the EU.

Raise thresholds to near-complete coverage, with a small margin for
geos with patchy reporting:
  - house prices (annual):      10 -> 24
  - gov debt (quarterly):       10 -> 24
  - industrial prod (monthly):  10 -> 22 (monthly is slightly patchier)

Add a guard test that asserts every overlay seeder keeps its threshold
>=22 so this regression can't reappear.
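
To illustrate why a threshold of 10 let partial snapshots through, here is a
hedged sketch of a coverage validator. `makeValidatorSketch` and the
`{ countries: {...} }` payload shape are assumptions for illustration; the
real makeValidator may count coverage differently.

```javascript
// Hypothetical coverage gate: accept a payload only if it carries at least
// `minGeos` geo entries. Payload shape { countries: { [geo]: ... } } is assumed.
function makeValidatorSketch(minGeos) {
  return function validate(data) {
    const geos = data && data.countries ? Object.keys(data.countries) : [];
    return geos.length >= minGeos;
  };
}

// A bad run returning 15 of the 29 expected geos.
const partial = { countries: {} };
for (let i = 0; i < 15; i++) partial.countries[`G${i}`] = { value: i };

const passesOld = makeValidatorSketch(10)(partial); // old threshold
const passesNew = makeValidatorSketch(24)(partial); // raised threshold
```

Under the old threshold the 15-geo payload validates and would be published;
under the raised threshold it is rejected.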

* fix(seed-health): register 3 Eurostat seed-meta entries

house-prices, gov-debt-q, industrial-production were wired in
api/health.js SEED_META but missing from api/seed-health.js
SEED_DOMAINS, so /api/seed-health would not surface their
freshness. intervalMin = health.js maxStaleMin / 2 per convention.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-13 13:00:14 +04:00
Elie Habib
71a6309503 feat(seeds): expand IMF WEO coverage — growth, labor, external themes (#3027) (#3046)
* feat(seeds): expand IMF WEO coverage — growth, labor, external themes (#3027)

Adds three new SDMX-3.0 seeders alongside the existing imf-macro seeder
to surface 15+ additional WEO indicators across ~210 countries at zero
incremental API cost. Bundled into seed-bundle-imf-extended.mjs on the
same monthly Railway cron cadence.

Seeders + Redis keys:
- seed-imf-growth.mjs    → economic:imf:growth:v1
  NGDP_RPCH, NGDPDPC, NGDP_R, PPPPC, PPPGDP, NID_NGDP, NGSD_NGDP
- seed-imf-labor.mjs     → economic:imf:labor:v1
  LUR (unemployment), LP (population)
- seed-imf-external.mjs  → economic:imf:external:v1
  BX, BM, BCA, TM_RPCH, TX_RPCH (+ derived trade balance)
- seed-imf-macro.mjs extended with PCPI, PCPIEPCH, GGX_NGDP, GGXONLB_NGDP

All four seeders share the 35-day TTL (monthly WEO release) and ~210
country coverage via the same imfSdmxFetchIndicator helper.

Wiring:
- api/bootstrap.js, api/health.js, server/_shared/cache-keys.ts —
  register new keys, mark them slow-tier, add SEED_META freshness
  thresholds matching the imfMacro entry (70d = 2× monthly cadence)
- server/worldmonitor/resilience/v1/_dimension-freshness.ts —
  override entries for the dash-vs-colon seed-meta key shape
- _indicator-registry.ts — add LUR as a 4th macroFiscal sub-metric
  (enrichment tier, weight 0.15); rebalance govRevenuePct (0.5→0.4)
  and currentAccountPct (0.3→0.25) so weights still sum to 1.0
- _dimension-scorers.ts — read economic:imf:labor:v1 in scoreMacroFiscal,
  normalize LUR with goalposts 3% (best) → 25% (worst); null-tolerant so
  weightedBlend redistributes when labor data is unavailable
- api/mcp.ts — new get_country_macro tool bundling all four IMF keys
  with a single freshness check; describes per-country fields including
  growth/inflation/labor/BOP for LLM-driven country screening
- src/services/imf-country-data.ts — bootstrap-cached client + pure
  buildImfEconomicIndicators helper
- src/app/country-intel.ts — async-fetch the IMF bundle on country
  selection and merge real GDP growth, CPI inflation, unemployment, and
  GDP/capita rows into the Economic Indicators card; bumps card cap
  from 3 → 6 rows to fit live signals + IMF context
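
The LUR normalization and weight redistribution described in the wiring
notes can be sketched as below. Function names, the 0-100 score scale, and
the name of the unchanged third sub-metric are assumptions; the weights
(0.4, 0.25, 0.15) and the 3% to 25% goalposts come from this commit, and the
remaining 0.2 is inferred from the statement that weights sum to 1.0.

```javascript
// Lower-is-better goalpost normalization: `best` maps to 100, `worst` to 0.
// Returns null for missing input so the blend can skip it.
function normalizeLowerBetter(value, best, worst) {
  if (value == null || Number.isNaN(value)) return null;
  const t = (worst - value) / (worst - best);
  return Math.max(0, Math.min(1, t)) * 100;
}

// Null-tolerant weighted mean: weights of missing parts are redistributed
// proportionally across the parts that do have a score.
function blendSketch(parts) {
  const live = parts.filter((p) => p.score != null);
  const totalWeight = live.reduce((sum, p) => sum + p.weight, 0);
  if (totalWeight === 0) return null;
  return live.reduce((sum, p) => sum + p.score * p.weight, 0) / totalWeight;
}

const macroFiscalParts = (lur) => [
  { name: 'govRevenuePct', weight: 0.4, score: 80 },
  { name: 'currentAccountPct', weight: 0.25, score: 60 },
  { name: 'thirdMetric (hypothetical name)', weight: 0.2, score: 40 },
  { name: 'unemployment (LUR)', weight: 0.15, score: normalizeLowerBetter(lur, 3, 25) },
];
```

When labor data is unavailable, `blendSketch(macroFiscalParts(null))` divides
by 0.85 instead of 1.0, so the other three sub-metrics keep their relative
proportions.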

Tests:
- tests/seed-imf-extended.test.mjs — 13 unit tests across the three new
  seeders' pure helpers (canonical keys, ISO3→ISO2 mapping, aggregate
  filtering, derived savings-investment gap & trade balance, validate
  thresholds)
- tests/imf-country-data.test.mts — 6 tests for the panel rendering
  helper, including stagflation flag and high-unemployment trend
- tests/resilience-dimension-scorers.test.mts — new LUR sub-metric test
  (tight vs slack labor); existing scoreMacroFiscal coverage assertions
  updated for the new 4-metric weight split
- tests/helpers/resilience-fixtures.mts — labor fixture for NO/US/YE so
  the existing macroFiscal ordering test still resolves the LUR weight
- tests/bootstrap.test.mjs — register imfGrowth/imfLabor/imfExternal as
  pending consumers (matching imfMacro)
- tests/mcp.test.mjs — bump tools/list count 28 → 29

https://claude.ai/code/session_018enRzZuRqaMudKsLD5RLZv

* fix(resilience): update macroFiscal goldens for LUR weight rebalance

Recompute pinned fixture values after adding labor-unemployment as
4th macroFiscal sub-metric (weight rebalance in _indicator-registry).
Also align seed-imf-external tradeBalance to a single reference year
to avoid mixing ex/im values from different WEO vintages.

* fix(seeds): tighten IMF coverage gates to reject partial snapshots

IMF WEO growth/labor/external indicators report ~210 countries for healthy
runs. Previous thresholds (150/100/150) let a bad IMF run overwrite a good
snapshot with dozens of missing countries and still pass validation.

Raise all three to >=190, matching the pattern of sibling seeders and
leaving a ~20-country margin for indicators with slightly narrower
reporting. Labor validator unions LUR + population (LP), so healthy
coverage tracks LP (~210), not LUR (~100) — the old 100 threshold was
based on a misread of the union logic.

* fix(seed-health): register imf-growth/labor/external seed-meta keys

Missing SEED_DOMAINS entries meant the 3 new IMF WEO seeders (growth, labor,
external) had no /api/seed-health visibility. intervalMin=50400 matches
health.js maxStaleMin/2 (100800/2) — same monthly WEO cadence as imf-macro.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-13 12:51:35 +04:00
Elie Habib
0cdfddc885 feat(gold): central-bank reserves via IMF IFS (PR C) (#3038)
* feat(gold): central-bank gold reserves via IMF IFS (PR C)

* fix(gold): prefer ounces indicator over USD in IMF IFS candidate list

* fix(gold): align seed-health interval with monthly IMF cadence + drop ALG dup

Review findings on PR #3038:
- api/seed-health.js: intervalMin was 1440 (1 day), which flags stale at
  2880min (48h) — contradicted health.js maxStaleMin=44640 (~31 days) and
  would false-alarm within 2 days on a monthly data source. Bumped to
  22320 so both endpoints agree at ~31 days.
- seed-gold-cb-reserves ISO3_NAMES: dropped duplicate ALG entry (World Bank
  variant); DZA is canonical ISO 3166-1 alpha-3 and stays.
2026-04-13 08:19:53 +04:00
Elie Habib
a8b85e52c8 feat(gold): SPDR GLD physical holdings flows (PR B) (#3037)
* feat(gold): SPDR GLD physical holdings flows (PR B)

* fix(gold): strip UTF-8 BOM from SPDR CSV header (greptile P2 #3037)
2026-04-13 08:04:22 +04:00
Elie Habib
ee66b6b5c2 feat(gold): Gold Intelligence v2 — positioning depth, returns, drivers (#3034)
* feat(gold): richer Gold Intelligence panel with positioning, returns, drivers

* fix(gold): restore leveragedFunds fields and derive P/S netPct in legacy fallback

Review catch on PR #3034:

1. seed-cot.mjs stopped emitting leveragedFundsLong/Short during the v2
   refactor, which would zero out the Leveraged Funds bars in the existing
   CotPositioningPanel on the next seed run. Re-read lev_money_* from the
   TFF rows and keep the fields on the output (commodity rows don't have
   this breakdown, stay at 0).
2. get-gold-intelligence legacy fallback hardcoded producerSwap.netPct to 0,
   meaning a pre-v2 COT snapshot rendered a neutral 0% Producer/Swap bar
   on deploy until seed-cot reran. Derive netPct from dealerLong/dealerShort
   (same formula as the v2 seeder). OI share stays 0 because open_interest
   wasn't captured pre-migration; clearly documented now.

Tests: added two regression guards (leveragedFunds preserved for TFF,
commodity rows emit 0 for those fields).

* fix(gold): make enrichment layer monitored and honest about freshness

Review catch on PR #3034:

- seed-commodity-quotes now writes seed-meta:market:gold-extended via
  writeExtraKeyWithMeta on every successful run. Partial / failed fetches
  skip BOTH the data write and the meta bump, so health correctly reports
  STALE_SEED instead of masking a broken Yahoo fetch with a green check.
- Require both gold (core) AND at least one driver/silver before writing,
  so a half-successful run doesn't overwrite healthy prior data with a
  degraded payload.
- Handler no longer stamps updatedAt with new Date() when the enrichment
  key is missing. It emits an empty string instead, so the panel's
  freshness indicator shows "Updated —" with a dim dot, matching reality:
  the enrichment is missing, not fresh.
- Health: goldExtended entry in STANDALONE_KEYS + SEED_META (maxStaleMin
  30, matching commodity quotes), and seed-health.js advertises the
  domain so upstream monitors pick it up.

The panel already gates session/returns/drivers sections on presence, so
legacy panels without the enrichment layer stay fully functional.
2026-04-12 22:53:32 +04:00
Elie Habib
3696aba2d1 fix(infra): sync health/bootstrap/cache-keys parity (4 keys + 6 DATA_KEYS + 5 SEED_META) (#3015)
* fix(infra): sync health/bootstrap/cache-keys parity (4 BOOTSTRAP_CACHE_KEYS + 6 DATA_KEYS + 5 SEED_META)

Audit found 4 bootstrap.js keys (consumerPrices*) missing from
cache-keys.ts BOOTSTRAP_CACHE_KEYS, 6 bootstrapped keys invisible
to health DATA_KEYS monitoring (cryptoSectors, ddosAttacks,
economicStress, insights, predictions, trafficAnomalies), and 5
bootstrapped keys with no SEED_META staleness detection
(cryptoSectors, ddosAttacks, economicStress, marketImplications,
trafficAnomalies). Keys without seed-meta writers (bisCredit,
bisExchange, giving, minerals, serviceStatuses, temporalAnomalies)
were verified as on-demand/derived and correctly skipped.

* fix(health): write seed-meta on empty DDoS/anomalies data

Prevents false STALE_SEED alerts when Cloudflare returns zero events.
Extracts writeSeedMeta() helper from writeExtraKeyWithMeta().

* fix(health): remove duplicate insights/predictions aliases, fix test regex

P1: insights/predictions duplicate newsInsights/predictionMarkets
P2: keyRe now captures non-versioned consumer-prices keys

* fix(health): add ddosAttacks/trafficAnomalies to EMPTY_DATA_OK_KEYS

Zero DDoS events or traffic anomalies is a valid quiet-period state,
not a critical failure.
2026-04-12 20:38:08 +04:00
Elie Habib
da01def264 fix(health): add DATA_KEYS entry for energyCrisisPolicies (#3014)
health.js had the SEED_META entry (line 304) but was missing the DATA_KEYS
entry, so the health endpoint never reported on the energy:crisis-policies:v1
key. Without this, empty data goes undetected.
2026-04-12 19:45:36 +04:00
Elie Habib
793d7df9dc feat(energy-crisis): add IEA 2026 Energy Crisis Policy Response Tracker panel and seeder (#3008) 2026-04-12 15:09:54 +04:00
Elie Habib
676331607a feat(resilience): three-pillar aggregation with penalized weighted mean (T2.3) (#2990)
* feat(resilience): three-pillar aggregation with penalized weighted mean (Phase 2 T2.3)

Wire real three-pillar scoring: structural-readiness (0.40), live-shock-exposure
(0.35), recovery-capacity (0.25). Add penalizedPillarScore formula with alpha=0.50
penalty factor for backtest tuning. Set recovery domain weight to 0.25 and
redistribute existing domain weights proportionally to sum to 1.0. Bump cache
keys v8 to v9. The penalized formula is exported and tested but overallScore
stays as the v1 domain-weighted sum until the flag flips in PR 10.

* fix(resilience): update test description v8 to v9 (#2990 review)

Test descriptions said "(v8)" but assertions check v9 cache keys.
2026-04-12 10:18:42 +04:00
Elie Habib
17e34dfca7 feat(resilience): recovery capacity pillar — 6 new dimensions + 5 seeders (Phase 2 T2.2b) (#2987)
* feat(resilience): recovery capacity pillar — 6 new dimensions + 5 seeders (Phase 2 T2.2b)

Add the recovery-capacity pillar with 6 new dimensions:
- fiscalSpace: IMF GGR_G01_GDP_PT + GGXCNL_G01_GDP_PT + GGXWDG_NGDP_PT
- reserveAdequacy: World Bank FI.RES.TOTL.MO
- externalDebtCoverage: WB DT.DOD.DSTC.CD / FI.RES.TOTL.CD ratio
- importConcentration: UN Comtrade HHI (stub seeder)
- stateContinuity: derived from WGI + UCDP + displacement (no new fetch)
- fuelStockDays: IEA/EIA (stub seeder, Enrichment tier)

Each dimension has a scorer in _dimension-scorers.ts, registry entries in
_indicator-registry.ts, methodology doc subsections, and fixture data.

Seeders: fiscal-space (real, IMF WEO), reserve-adequacy (real, WB API),
external-debt (real, WB API), import-hhi (stub), fuel-stocks (stub).

Recovery domain weight is 0 until PR 4 (T2.3) ships the penalized weighted
mean across pillars. The domain appears in responses structurally but does
not affect the overall score.

Bootstrap: STANDALONE_KEYS + SEED_META + EMPTY_DATA_OK_KEYS + ON_DEMAND_KEYS
all updated in api/health.js. Source-failure mapping updated for
stateContinuity (WGI adapter). Widget labels and LOCKED_PREVIEW updated.

All 282 resilience tests pass, typecheck clean, methodology lint clean.

* fix(resilience): ISO3→ISO2 normalization in WB recovery seeders (#2987 P1)

Both seed-recovery-reserve-adequacy.mjs and seed-recovery-external-debt.mjs
used countryiso3code from the World Bank API response then immediately
rejected codes where length !== 2. WB returns ISO3 codes (USA, DEU, etc.),
so all real rows were silently dropped and the feed was always empty.

Fix: import scripts/shared/iso3-to-iso2.json and normalize before the
length check. Also removed from EMPTY_DATA_OK_KEYS in health.js since
empty results now indicate a real failure, not a structural absence.
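
A minimal sketch of the normalize-then-check ordering. The two-entry map is
a stand-in for scripts/shared/iso3-to-iso2.json, and `normalizeWbRows` plus
the output row shape are hypothetical; `countryiso3code` is the World Bank
field named in this commit.

```javascript
// Tiny stand-in for the shared ISO3 -> ISO2 lookup table.
const ISO3_TO_ISO2 = { USA: 'US', DEU: 'DE' };

// Hypothetical row normalizer: map ISO3 to ISO2 first, then apply the
// length check. Checking length before mapping drops every real row.
function normalizeWbRows(rows, map = ISO3_TO_ISO2) {
  const out = [];
  for (const row of rows) {
    const raw = String(row.countryiso3code ?? '').toUpperCase();
    const iso2 = raw.length === 3 ? map[raw] : raw;
    if (!iso2 || iso2.length !== 2) continue; // unknown or aggregate code
    out.push({ iso2, value: row.value });
  }
  return out;
}
```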

* fix(resilience): remove unused import + no-op overrides (#2987 review)

* fix(test): update release-gate to expect 6 domains after recovery pillar
2026-04-12 10:10:10 +04:00
Elie Habib
e070a97c3d Phase 3 PR2: Weekly regional briefs (LLM seeder + RPC) (#2989)
* feat(intelligence): weekly regional briefs (Phase 3 PR2)

Phase 3 PR2 of the Regional Intelligence Model. Adds LLM-powered
weekly intelligence briefs per region, completing the core feature set.

## New seeder: scripts/seed-regional-briefs.mjs

Standalone weekly cron script (not part of the 6h derived-signals bundle).
For each non-global region:
  1. Read the latest snapshot via two-hop Redis read
  2. Read recent regime transitions from the history log (#2981)
  3. Call the LLM once per region with regime trajectory + balance +
     triggers + narrative context
  4. Write structured brief to intelligence:regional-briefs:v1:weekly:{region}
     with 8-day TTL (survives one missed weekly run)

Reuses the same injectable-callLlm + parse-validation + provider-chain
pattern from narrative.mjs and weekly-brief.mjs.

## New module: scripts/regional-snapshot/weekly-brief.mjs

  generateWeeklyBrief(region, snapshot, transitions, opts?)
    -> { region_id, generated_at, period_start, period_end,
         situation_recap, regime_trajectory, key_developments[],
         risk_outlook, provider, model }

  buildBriefPrompt()    — pure prompt builder
  parseBriefJson()      — JSON parser with prose-extraction fallback
  emptyBrief()          — canonical empty shape

Global region is skipped. Provider chain: Groq -> OpenRouter. Validate
callback ensures only parseable responses pass (narrative.mjs PR #2960
review fix pattern).

## Proto + RPC: GetRegionalBrief

  proto/worldmonitor/intelligence/v1/get_regional_brief.proto

  - GetRegionalBriefRequest { region_id }
  - GetRegionalBriefResponse { brief: RegionalBrief }
  - RegionalBrief { region_id, generated_at, period_start, period_end,
                    situation_recap, regime_trajectory,
                    key_developments[], risk_outlook, provider, model }

## Server handler

  server/worldmonitor/intelligence/v1/get-regional-brief.ts

Simple getCachedJson read + adaptBrief snake->camel adapter.
Returns upstreamUnavailable: true on Redis failure so the gateway
skips caching (matching the get-regime-history pattern from #2981).

## Premium gating + cache tier

  src/shared/premium-paths.ts + server/gateway.ts RPC_CACHE_TIER

## Tests — 27 new unit tests

  buildBriefPrompt (5): region/balance/transitions/narrative rendered,
                        empty transitions handled, missing fields tolerated
  parseBriefJson (5): valid JSON, garbage, all-empty, cap at 5, prose extraction
  generateWeeklyBrief (6): success, global skip, LLM fail, garbage, exception,
                           period_start/end delta
  emptyBrief (2): region_id + empty fields
  handler (4): key prefix, adapter export, upstreamUnavailable, registration
  security (2): premium path + cache tier
  proto (3): RPC declared, import wired, RegionalBrief fields

## Verification

- npm run test:data: 4651/4651 pass
- npm run typecheck + typecheck:api: clean
- biome lint: clean

* fix(intelligence): address 3 review findings on #2989

P2 #1 — no consumer surface for GetRegionalBrief

Acknowledged. The consumer is the RegionalIntelligenceBoard panel,
which will call GetRegionalBrief and render a weekly brief block.
This wiring is Phase 3 PR3 (UI) scope — the RPC + Redis key are the
delivery mechanism, not the end surface. No code change in this commit;
the RPC is ready for the panel to consume.

P2 #2 — readRecentTransitions collapses failure to []

readRecentTransitions returned [] on Redis/network failure, which is
indistinguishable from a genuinely quiet week. The LLM then generates
a brief claiming "no regime transitions" when in reality the upstream
is down — fabricating false input.

Fix: return null on failure. The seeder skips the region with a clear
log message when transitions is null, so the brief is never written
with unreliable input. Empty array [] now only means genuinely no
transitions in the 7-day window.

P2 #3 — parseBriefJson accepts briefs the seeder rejects

parseBriefJson treated non-empty key_developments as valid even if
situation_recap was empty. The seeder gate only writes when
brief.situation_recap is truthy. That mismatch means the validator
pass + provider-fallback logic could accept a response that the seeder
then silently drops.

Fix: require situation_recap in parseBriefJson for valid=true, matching
the seeder gate. Now both checks agree on what constitutes a usable
brief, and the provider-fallback chain correctly falls through when
a provider returns a brief with developments but no recap.

* fix(intelligence): TTL path-segment fix + seed-meta always-write (Greptile P1+P2 on #2989)

P1 — TTL silently not applied (briefs never expire)

Upstash REST ignores query-string SET options (?EX=N). The correct
form is path-segment: /set/{key}/{value}/EX/{seconds}. Without this
fix every brief persists indefinitely and Redis storage grows
unboundedly across weekly runs.
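
The path-segment form can be sketched as a small URL builder. The function
name and the example host are hypothetical; the `/set/{key}/{value}/EX/{seconds}`
layout is the one given in this commit message.

```javascript
// Hypothetical builder for an Upstash REST SET with a TTL expressed as
// path segments. Key and value are percent-encoded so ':' and JSON
// punctuation survive as single path segments.
function buildSetWithTtlUrl(baseUrl, key, value, ttlSeconds) {
  const k = encodeURIComponent(key);
  const v = encodeURIComponent(value);
  return `${baseUrl}/set/${k}/${v}/EX/${ttlSeconds}`;
}

const EIGHT_DAYS_S = 8 * 24 * 60 * 60; // the 8-day TTL mentioned earlier in this PR
const exampleUrl = buildSetWithTtlUrl('https://example.upstash.io', 'a:b', '{"x":1}', EIGHT_DAYS_S);
```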

P2 — seed-meta not written when all regions skipped

writeExtraKeyWithMeta was gated on generated > 0. If every region
was skipped (no snapshot yet, or LLM failed), seed-meta was never
written, making the seeder indistinguishable from "never ran" in
health tooling. Now writes seed-meta whenever failed === 0,
carrying regionsSkipped count.

P2 #3 (validate gate) — already fixed in previous commit (parseBriefJson
now requires situation_recap for valid=true).

* fix(intelligence): register regional-briefs in health.js SEED_META + STANDALONE_KEYS (review P2 on #2989)

* fix(intelligence): register regional-briefs in api/seed-health.js (review P2 on #2989)

* fix(intelligence): raise brief TTL to 15 days to cover missed weekly cycle (review P2 on #2989)

* fix(intelligence): distinguish missing-key from Redis-error + coverage-gated health (review P2s on #2989)

P2 #1 — false upstreamUnavailable before first seed

getCachedJson returns null for both "key missing" and "Redis failed",
so the handler was advertising an outage for every region before the
first weekly seed ran. Switched to getRawJson (throws on Redis errors)
so null = genuinely missing key → clean empty 200, and thrown error =
upstream failure → upstreamUnavailable: true for gateway no-store.
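
A rough sketch of the null-versus-throw distinction, assuming a reader that
returns null for a missing key and throws on Redis errors (the behavior this
commit attributes to getRawJson). `getBriefSketch` and the injected
`readRaw` parameter are hypothetical; the key prefix is the one used by the
seeder.

```javascript
// Hypothetical handler core: `readRaw` must return null when the key is
// absent and throw when Redis itself is unreachable.
async function getBriefSketch(regionId, readRaw) {
  try {
    const brief = await readRaw(`intelligence:regional-briefs:v1:weekly:${regionId}`);
    // null here means "not seeded yet": a clean empty response, cacheable.
    return { brief: brief ?? null, upstreamUnavailable: false };
  } catch (err) {
    // A thrown error means the upstream failed: tell the gateway not to cache.
    return { brief: null, upstreamUnavailable: true };
  }
}
```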

P2 #2 — partial run hides coverage loss in health

seed-meta was written with the generated count even if only 1 of 7
regions produced a brief. /api/health treats any positive recordCount
as healthy, so broad regional failure was invisible to operators.

Fix: recordCount is set to 0 when generated < ceil(expectedRegions/2).
This makes /api/health report EMPTY_DATA for severely partial runs
while still writing seed-meta (so the seeder is confirmed to have run).
coverageOk flag in the summary payload lets operators drill into the
exact coverage state.
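
The coverage gate can be sketched as a pure function. The name
`gatedRecordCount` and the `minRequired` parameter are hypothetical; the
default of ceil(expectedRegions/2) is the rule stated in this commit, and
the follow-up commit tightens it to expectedRegions-1.

```javascript
// Hypothetical gate: report recordCount 0 when too few regions produced a
// brief, so health tooling sees EMPTY_DATA instead of a healthy count.
function gatedRecordCount(generated, expectedRegions, minRequired = Math.ceil(expectedRegions / 2)) {
  const coverageOk = generated >= minRequired;
  return { recordCount: coverageOk ? generated : 0, coverageOk };
}
```

For 7 expected regions the default gate requires 4 briefs; passing
`expectedRegions - 1` as `minRequired` raises that to 6.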

* fix(intelligence): tighten coverage gate to expectedRegions-1 (review P2 on #2989)
2026-04-12 09:56:35 +04:00
Elie Habib
7dfdc819a9 Phase 0: Regional Intelligence snapshot writer foundation (#2940) 2026-04-11 17:55:39 +04:00
Elie Habib
46c35e6073 feat(breadth): add market breadth history chart (#2932) 2026-04-11 17:54:26 +04:00
Elie Habib
d3836ba49b feat(sentiment): add AAII investor sentiment survey (#2930)
* feat(sentiment): add AAII investor sentiment survey

Weekly bull/bear/neutral sentiment from AAII (1987-present). Shows
current reading, bull-bear spread, and 52-week historical chart.
Seeder fetches from AAII CSV, stores last 52 weeks in Redis.

* fix(aaii): wire panel loading + mark fallback data explicitly

* fix(aaii): keep panel live across refreshes + surface in health monitoring

- fetchData now falls back to /api/bootstrap?keys=aaiiSentiment on
  refresh (getHydratedData is one-shot and returns undefined after
  the first read, causing a permanent spinner on hourly refresh)
- Shows an error state with auto-retry when both hydrated and
  bootstrap-fetch miss, matching the WsbTickerScannerPanel pattern
- Registered aaiiSentiment in api/health.js BOOTSTRAP_KEYS and
  api/seed-health.js SEED_DOMAINS so rollout failures and
  fallback-only operation are observable in the monitoring dashboards

* fix(sentiment): handle BIFF8 SST trailing bytes and use UTC for AAII Thursday calc

Two P2 greptile fixes from PR #2930 review:

1. BIFF8 SST parser was reading the rich-text run count (cRun, flags & 0x08)
   and extended-string size (cbExtRst, flags & 0x04) to advance past those
   header fields, but never skipped the trailing bytes AFTER the char data:
   4 * cRun formatting-run bytes and cbExtRst ext-rst bytes. If any string
   before the column header was rich-text formatted, every subsequent SST
   entry parsed from the wrong offset, silently breaking XLS extraction and
   falling back to HTML scraping.

2. parseHtmlSentiment() computed last-Thursday via today.getDay() +
   setDate(today.getDate() - daysToThursday), both local-TZ-dependent. On
   Railway (non-UTC TZ) the inferred Thursday could drift by a day, causing
   the HTML-derived row to mismatch the XLS historical rows. Switched to
   getUTCDay() + Date.UTC() for TZ-stable arithmetic.
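
A minimal sketch of the TZ-stable arithmetic, assuming that a run on a
Thursday should resolve to that same Thursday (the commit does not say which
way this edge case goes). `lastThursdayUtc` is a hypothetical name; only
standard Date methods are used.

```javascript
// Most recent Thursday at 00:00 UTC, computed without local-time getters.
function lastThursdayUtc(now = new Date()) {
  const day = now.getUTCDay();        // 0 = Sunday ... 4 = Thursday
  const daysBack = (day - 4 + 7) % 7; // 0 when today is already Thursday
  // Date.UTC normalizes zero or negative day-of-month into the prior month.
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - daysBack));
}
```

Because every getter and the constructor are UTC-based, the result does not
depend on the host's configured time zone.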
2026-04-11 17:05:39 +04:00
Elie Habib
d1cb0e3c10 feat(sectors): add P/E valuation benchmarking to sector heatmap (#2929)
* feat(sectors): add P/E valuation benchmarking to sector heatmap

Trailing/forward P/E, beta, and returns for 12 sector ETFs from Yahoo
Finance. Horizontal bar chart color-coded by valuation level plus
sortable table. Extends existing sector data pipeline.

* fix(sectors): clear stale valuations on empty refresh + document cache behavior

* fix(sectors): force valuation rollout for cached + breaker-persisted bootstraps

- Bumped market:sectors bootstrap key v1 -> v2 so stale 24h slow-tier
  payloads without the new valuations field are invisible to returning
  users on next page load
- Versioned the fetchSectors circuit-breaker (name -> "Sector Summary v2")
  so old localStorage/IndexedDB entries predating this PR cannot be
  returned as stale via the SWR path
- shouldCache now requires the valuations field to be present on the
  cached response, not just a non-empty sectors array
- loadMarkets no longer clears the valuations tab when a hydrated or
  fresh payload lacks the field; prior render is left intact, matching
  the finding's requirement
- Defensive check: hydrated payloads without valuations fall through to
  a live fetch instead of rendering an empty valuations tab

* fix(stocks): correct beta3Year source and null YTD color in sector P/E view

- scripts/ais-relay.cjs: beta3Year lives on defaultKeyStatistics (ks),
  not summaryDetail (sd); the previous fallback was a silent no-op.
- src/components/MarketPanel.ts: null ytdReturn now renders with
  var(--text-dim) instead of var(--red); the '--' placeholder no
  longer looks like a loss.

Addresses greptile review on PR #2929.
2026-04-11 16:51:35 +04:00
Elie Habib
2decda6508 feat(wsb): add Reddit/WSB ticker scanner seeder + bootstrap registration (#2916)
* feat(wsb): add Reddit/WSB ticker scanner seeder + bootstrap registration

Seeder in ais-relay.cjs fetches from r/wallstreetbets, r/stocks,
r/investing every 10min. Extracts ticker mentions, validates against
known ticker set, aggregates by frequency and engagement, writes top 50
to intelligence:wsb-tickers:v1.

4-file bootstrap registration: cache-keys.ts, bootstrap.js, health.js
with SEED_META maxStaleMin=30.

* fix(wsb): remove duplicate CEO + fix avgUpvoteRatio divisor

* fix(wsb): require ticker validation set + condition seed-meta on write + add seed-health

1. Skip seed when ticker validation set is empty (cold start/bootstrap miss)
2. Only write seed-meta after successful canonical write
3. Register in api/seed-health.js for dedicated monitoring

* fix(wsb): case-insensitive $ticker matching + BRK.B dotted symbol support

* fix(wsb): split $-prefixed vs bare ticker extraction + BRK.B→BRK-B normalization

1. $-prefixed tickers ($nvda, $BRK.B) skip whitelist validation (strong
   signal) — catches GME, AMC, PLTR etc. not in the narrow market watchlist
2. Bare uppercase tokens still validated against known set (high false-positive rate)
3. BRK.B normalized to BRK-B before validation (dot→dash)
4. Empty known set no longer skips seed — $-prefixed tickers still extracted

* fix(wsb): skip bare-uppercase branch entirely when ticker set unavailable
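
The extraction rules from the wsb fixes above can be sketched roughly as follows. The function name `extractTickers`, the regular expressions, and the 1-5 letter symbol length are assumptions for illustration; the actual patterns in ais-relay.cjs may differ.

```javascript
// Sketch under stated assumptions (hypothetical name and regexes).
function extractTickers(text, knownTickers) {
  const found = new Set();
  // Upper-case and convert a dotted class suffix to a dash (BRK.B -> BRK-B).
  const normalize = (sym) => sym.toUpperCase().replace('.', '-');

  // $-prefixed symbols: matched case-insensitively and accepted without
  // whitelist validation, since the $ prefix is a strong signal.
  for (const m of text.matchAll(/\$([A-Za-z]{1,5}(?:\.[A-Za-z])?)\b/g)) {
    found.add(normalize(m[1]));
  }

  // Bare uppercase tokens: only considered when a validation set is available,
  // and only kept when the normalized symbol is in that set.
  if (knownTickers && knownTickers.size > 0) {
    for (const m of text.matchAll(/(?<![$\w.])([A-Z]{1,5}(?:\.[A-Z])?)\b/g)) {
      const sym = normalize(m[1]);
      if (knownTickers.has(sym)) found.add(sym);
    }
  }
  return [...found];
}
```

With an empty or missing validation set, the bare-token branch is skipped entirely while $-prefixed symbols are still returned.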
2026-04-11 07:07:11 +04:00
Elie Habib
ce30a48664 feat(resilience): add rankStable flag to ranking items (#2879)
* feat(resilience): add rankStable flag to ranking items

Countries with score interval width <= 8 (p95-p05) are flagged as
rankStable=true, indicating robust ranking under weight perturbation.
Read from batch-computed intervals in Redis.

* fix(resilience): guard inverted intervals + scope fetch to scored countries

1. isRankStable rejects negative width (malformed p05 > p95)
2. fetchIntervals scoped to cachedScores.keys() instead of all countries
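
A minimal sketch of the stability check with the inverted-interval guard. `isRankStable` and the width threshold of 8 come from the messages above; the argument shape `{ p05, p95 }` and the handling of missing intervals are assumptions.

```javascript
// Sketch only: argument shape and missing-interval behavior are assumed.
const MAX_STABLE_WIDTH = 8; // threshold from the commit message (p95 - p05 <= 8)

function isRankStable(interval) {
  if (!interval) return false;
  const { p05, p95 } = interval;
  if (!Number.isFinite(p05) || !Number.isFinite(p95)) return false;
  const width = p95 - p05;
  // A negative width means p05 > p95, i.e. a malformed interval: never stable.
  return width >= 0 && width <= MAX_STABLE_WIDTH;
}
```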

* fix(resilience): raw key read for intervals + bump ranking cache to v8

* fix(resilience): remove duplicate ScoreInterval interface after rebase

ScoreInterval is now generated in service_server.ts (from PR #2877).
Remove the local duplicate and re-export the generated type.
2026-04-09 22:34:36 +04:00
Elie Habib
1af73975b9 feat(energy): SPR policy classification layer (#2881)
* feat(energy): add SPR policy classification layer with 66-country registry

Static JSON registry classifying strategic petroleum reserve regimes for
66 countries (all IEA members + major producers/consumers). Integrates
into energy profile handler, shock model limitations, analyst context,
spine seeder, and CDP UI.

- scripts/data/spr-policies.json: 66-entry registry with regime, source, asOf
- scripts/seed-spr-policies.mjs: seeder following chokepoint-baselines pattern
- Proto fields 51-59 on GetCountryEnergyProfileResponse
- Handler reads SPR registry from Redis, populates proto fields
- Shock model adds fuel-mode-gated SPR limitations for non-IEA gov SPR
- Analyst context refactored to accumulator pattern (IEA + SPR parts)
- CDP UI: SPR badge for non-IEA government_spr, muted text for spare_capacity
- Spine integration: SPR fields in shockInputs + hasSprPolicy coverage flag
- Cache keys, health, bootstrap, seed-health registrations
- Tests: registry shape, ISO2, regime enum, required entries, no estimatedFillPct

* fix(energy): remove SPR from bootstrap (server-only); narrow SPR hasAny gate to renderable regimes

* feat(energy): render "no known SPR" risk note for countries with regime=none

* fix(energy): human-readable SPR regime labels; parallelize spine+registry reads in analyst
2026-04-09 22:16:24 +04:00
Elie Habib
0a1b74a9b2 feat(resilience): add score confidence intervals via batch Monte Carlo (#2877)
* feat(resilience): add score confidence intervals via batch Monte Carlo

Weekly cron perturbs domain weights ±10% across 100 draws per country,
stores p05/p95 in Redis. Score handler reads intervals and includes
them in the API response as ScoreInterval { p05, p95 }.

Proto field 14 (score_interval) added to GetResilienceScoreResponse.

* chore: regenerate proto types and OpenAPI docs for ScoreInterval

* fix(resilience): add seed-meta + lock + fix interval cache + percentile formula

1. Write seed-meta:resilience:intervals for health monitoring
2. Add distributed lock to prevent concurrent cron overlap
3. Move scoreInterval read outside 6h score cache boundary
4. Fix percentile index from floor to ceil-1 (nearest-rank)
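
A sketch of the interval computation and the nearest-rank percentile fix, under stated assumptions: the function names, the uniform ±10% perturbation, and the renormalization of perturbed weights are illustrative and not confirmed by the commit.

```javascript
// Nearest-rank percentile: the rank is ceil(p * n), so the 0-based index is ceil(p * n) - 1.
function percentileNearestRank(sortedAsc, p) {
  const idx = Math.max(0, Math.ceil(p * sortedAsc.length) - 1);
  return sortedAsc[idx];
}

// Hypothetical sketch of one country's interval: perturb each domain weight by
// up to +/-10%, recompute the weighted score, and take p05/p95 over the draws.
// Dividing by the perturbed weight total is an assumption made here.
function scoreInterval(domainScores, domainWeights, draws = 100, rng = Math.random) {
  const samples = [];
  for (let i = 0; i < draws; i++) {
    let weighted = 0;
    let total = 0;
    for (const [domain, score] of Object.entries(domainScores)) {
      const w = domainWeights[domain] * (0.9 + 0.2 * rng()); // factor in [0.9, 1.1)
      weighted += score * w;
      total += w;
    }
    samples.push(weighted / total);
  }
  samples.sort((a, b) => a - b);
  return {
    p05: percentileNearestRank(samples, 0.05),
    p95: percentileNearestRank(samples, 0.95),
  };
}
```

With a floor-based index, the 5th percentile of 100 sorted draws would read index 5 (the 6th value); nearest-rank reads index 4 (the 5th value).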

* fix(health): add resilience:intervals to health + seed-health registries

* fix(seed): skip seed-meta on no-op runs + register intervals in health check
2026-04-09 22:06:54 +04:00
Elie Habib
6e401ad02f feat(supply-chain): Global Shipping Intelligence — Sprint 0 + Sprint 1 (#2870)
* feat(supply-chain): Sprint 0 — chokepoint registry, HS2 sectors, war_risk_tier

- src/config/chokepoint-registry.ts: single source of truth for all 13
  canonical chokepoints with displayName, relayName, portwatchName,
  corridorRiskName, baselineId, shockModelSupported, routeIds, lat/lon
- src/config/hs2-sectors.ts: static dictionary for all 99 HS2 chapters
  with category, shockModelSupported (true only for HS27), cargoType
- server/worldmonitor/supply-chain/v1/_chokepoint-ids.ts: migrated to
  derive CANONICAL_CHOKEPOINTS from chokepoint-registry; no data duplication
- src/config/geo.ts + src/types/index.ts: added chokepointId field to
  StrategicWaterway interface and all 13 STRATEGIC_WATERWAYS entries
- src/components/MapPopup.ts: switched chokepoint matching from fragile
  name.toLowerCase() to direct chokepointId === id comparison
- server/worldmonitor/intelligence/v1/_shock-compute.ts: migrated from old
  IDs (hormuz/malacca/babelm) to canonical IDs (hormuz_strait/malacca_strait/
  bab_el_mandeb); same for CHOKEPOINT_LNG_EXPOSURE
- proto/worldmonitor/supply_chain/v1/supply_chain_data.proto: added
  WarRiskTier enum + war_risk_tier field (field 16) on ChokepointInfo
- get-chokepoint-status.ts: populates warRiskTier from ChokepointConfig.threatLevel
  via new threatLevelToWarRiskTier() helper (FREE field, no PRO gate)

* feat(supply-chain): Sprint 1 — country chokepoint exposure index + sector ring

S1.1: scripts/shared/country-port-clusters.json
  ~130 country → {nearestRouteIds, coastSide} mappings derived from trade route
  waypoints; covers all 6 seeded Comtrade reporters plus major trading nations.

S1.2: scripts/seed-hs2-chokepoint-exposure.mjs
  Daily cron seeder. Pure computation — reads country-port-clusters.json,
  scores each country against CHOKEPOINT_REGISTRY route overlap, writes
  supply-chain:exposure:{iso2}:{hs2}:v1 keys + seed-meta (24h TTL).

S1.3: RPC get-country-chokepoint-index (PRO-gated, request-varying)
  - proto: GetCountryChokepointIndexRequest/Response + ChokepointExposureEntry
  - handler: isCallerPremium gate; cachedFetchJson 24h; on-demand for any iso2
  - cache-keys.ts: CHOKEPOINT_EXPOSURE_KEY(iso2, hs2) constant
  - health.js: chokepointExposure SEED_META entry (48h threshold)
  - gateway.ts: slow-browser cache tier
  - service client: fetchCountryChokepointIndex() exported

S1.4: Chokepoint popup HS2 sector ring chart (PRO-gated)
  Static trade-sector breakdown (IEA/UNCTAD estimates) for 9 major chokepoints.
  SVG donut ring + legend shown for PRO users; blurred lockout + gate-hit
  analytics for free users. Wired into renderWaterwayPopup().

🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.com/claude-code) + Compound Engineering v2.49.0

Co-Authored-By: Claude Sonnet 4.6 (200K context) <noreply@anthropic.com>

* fix(tests): update energy-shock-v2 tests to use canonical chokepoint IDs

CHOKEPOINT_EXPOSURE and CHOKEPOINT_LNG_EXPOSURE keys were migrated from
short IDs (hormuz, malacca, babelm) to canonical registry IDs
(hormuz_strait, malacca_strait, bab_el_mandeb) in Sprint 0.
Test fixtures were not updated at the time; fix them now.

* fix(tests): update energy-shock-seed chokepoint ID to canonical form

VALID_CHOKEPOINTS changed to canonical IDs in Sprint 0; the seed test
that checks valid IDs was not updated alongside it.

* fix(cache-keys): reword JSDoc comment to avoid confusing bootstrap test regex

The comment "NOT in BOOTSTRAP_CACHE_KEYS" caused the bootstrap.test.mjs
regex to match the comment rather than the actual export declaration,
resulting in 0 entries found. Rephrase to "excluded from bootstrap".

* fix(supply-chain): address P1 review findings for chokepoint exposure index

- Add get-country-chokepoint-index to PREMIUM_RPC_PATHS (CDN bypass)
- Validate iso2/hs2 params before Redis key construction (cache injection)
- Fix seeder TTL to 172800s (2× interval) and extend TTL on skipped lock
- Fix CHOKEPOINT_EXPOSURE_SEED_META_KEY to match seeder write key
- Render placeholder sectors behind blur gate (DOM data leakage)
- Document get-country-chokepoint-index in widget agent system prompts
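
A sketch of the parameter validation mentioned in the list above. The key template `supply-chain:exposure:{iso2}:{hs2}:v1` appears in the Sprint 1 notes; the function name, the accepted formats (two uppercase letters for iso2, a two-digit chapter 01-99 for hs2), and the null-on-invalid behavior are assumptions.

```javascript
// Hypothetical validator + key builder (formats are assumed, not confirmed).
// Rejecting anything that is not a strict iso2/hs2 value keeps caller input
// such as ':' or '*' out of the Redis key.
function chokepointExposureKey(iso2, hs2) {
  if (typeof iso2 !== 'string' || !/^[A-Z]{2}$/.test(iso2)) return null;
  if (typeof hs2 !== 'string' || !/^(?:0[1-9]|[1-9][0-9])$/.test(hs2)) return null;
  return `supply-chain:exposure:${iso2}:${hs2}:v1`;
}
```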

* fix(lint): resolve Biome CI failures

- Add biome.json overrides to silence noVar in HTML inline scripts,
  disable linting for public/ vendor/build artifacts and pro-test/
- Remove duplicate NG and MW keys from country-port-clusters.json
- Use import attributes (with) instead of deprecated assert syntax

* fix(build): drop JSON import attribute — esbuild rejects `with` syntax

---------

Co-authored-by: Claude Sonnet 4.6 (200K context) <noreply@anthropic.com>
2026-04-09 17:06:03 +04:00
Elie Habib
75e9c22dd3 feat(resilience): populate dataVersion field from seed-meta timestamp (#2865)
* feat(resilience): populate dataVersion field from seed-meta timestamp

Sets dataVersion to the ISO date of the most recent static bundle
seed, making the data vintage visible to API consumers.

* fix(resilience): bump score cache to v7 for dataVersion field addition
2026-04-09 12:22:46 +04:00
Elie Habib
09ed68db09 fix(resilience): revert overall score to domain-weighted average + fix RSF direction (#2847)
* fix(resilience): revert overall score to domain-weighted average + fix RSF direction

1. overallScore reverted from baseline*(1-stressFactor) to
   sum(domainScore * domainWeight) — the multiplicative formula
   crushed all scores by 30-50%
2. RSF press freedom: normalizeHigherBetter → normalizeLowerBetter
   (RSF 0=best, 100=worst; Norway 6.52 was scoring 7 instead of 93)
3. Seed script ranking write removed (handler owns greyedOut split)
4. Widget Impact row removed (stressFactor no longer drives headline)
5. Cache keys bumped: score v6, ranking v6, history v3
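
A sketch of the two corrections in items 1 and 2, under stated assumptions: the signature of `normalizeLowerBetter`, the 0-100 output scale, and the division by total weight are illustrative (with weights summing to 1 the result equals sum(domainScore * domainWeight)).

```javascript
// Lower-is-better normalization onto 0-100 (signature is an assumption).
// `best` is the raw value that should score 100, `worst` the one that scores 0.
function normalizeLowerBetter(value, best, worst) {
  const clamped = Math.min(Math.max(value, best), worst);
  return (100 * (worst - clamped)) / (worst - best);
}

// Domain-weighted average of domain scores.
function overallScore(domains) {
  const totalWeight = domains.reduce((sum, d) => sum + d.weight, 0);
  const weighted = domains.reduce((sum, d) => sum + d.score * d.weight, 0);
  return weighted / totalWeight;
}
```

For the RSF example in the message, `normalizeLowerBetter(6.52, 0, 100)` gives roughly 93.48, which rounds to the expected 93 rather than 7.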

* fix(resilience): update validation scripts to v6 + remove lock from read-only seed

1. Validation scripts (backtest, correlation, sensitivity) updated from
   v5 to v6 cache keys. Sensitivity formula updated to domain-weighted.
2. Seed script lock removed — read-only health check needs no lock.

* chore: add clarifying comment on orphaned ranking TTL export
2026-04-09 08:49:54 +04:00
Elie Habib
0a64b308a7 fix(health): rename misleading predictions/insights health entries (#2835)
Renamed health check entries to match what they actually monitor:
- predictions -> predictionMarkets (Polymarket/Metaculus prediction
  markets seeder, NOT the AI forecast output)
- insights -> newsInsights (AI news insights seeder, NOT the forecast
  pipeline insights)

The actual forecast output is already monitored as 'forecasts' (OK,
14 records). The old names caused confusion when predictionMarkets
showed EMPTY_DATA, making it look like the forecast pipeline was broken.
2026-04-08 21:33:27 +04:00
Elie Habib
3c10106630 feat(energy): energy key bootstrap registration + health ops (V5-7) (#2831)
* feat(energy): register energy keys in bootstrap + health ops (V5-7)

* fix(energy): remove premature bootstrap keys (no hydration consumers yet)
2026-04-08 19:42:27 +04:00