3613 Commits

Author SHA1 Message Date
Sebastien Melki
1db05e6caa feat(usage): per-request Axiom telemetry pipeline (gateway + upstream attribution) (#3403)
* feat(gateway): thread Vercel Edge ctx through createDomainGateway (#3381)

PR-0 of the Axiom usage-telemetry stack. Pure infra change: no telemetry
emission yet, only the signature plumbing required for ctx.waitUntil to
exist on the hot path.

- createDomainGateway returns (req, ctx) instead of (req)
- rewriteToSebuf propagates ctx to its target gateway
- 5 alias callsites updated to pass ctx through
- ~30 [rpc].ts callsites unchanged (export default createDomainGateway(...))

Pattern reference: api/notification-channels.ts:166.
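The (req, ctx) plumbing above can be sketched as follows — a minimal illustration with hypothetical `Handler`/`EdgeContext` types, not the repo's actual signatures:

```typescript
// Hypothetical types standing in for Vercel Edge's context and the
// repo's gateway handler shape.
type EdgeContext = { waitUntil(p: Promise<unknown>): void };
type Handler = (req: Request, ctx?: EdgeContext) => Promise<Response>;

// createDomainGateway now returns a (req, ctx) handler so ctx.waitUntil
// exists on the hot path; ctx stays optional for direct callers.
function createDomainGateway(routes: Record<string, Handler>): Handler {
  return async (req, ctx) => {
    const route = new URL(req.url).pathname;
    const handler = routes[route];
    if (!handler) return new Response("not found", { status: 404 });
    return handler(req, ctx); // ctx threaded through; unused until telemetry lands
  };
}
```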

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(usage): pure UsageIdentity resolver + Axiom emit primitives (#3381)

server/_shared/usage-identity.ts
- buildUsageIdentity: pure function, consumes already-resolved gateway state.
- Static ENTERPRISE_KEY_TO_CUSTOMER map (explicit, reviewable in code).
- Does not re-verify JWTs or re-validate API keys.

server/_shared/usage.ts
- buildRequestEvent / buildUpstreamEvent: allowlisted-primitive builders only.
  Never accept Request/Response — additive field leaks become structurally
  impossible.
- emitUsageEvents → ctx.waitUntil(sendToAxiom). Direct fetch, 1.5s timeout,
  no retry, gated by USAGE_TELEMETRY=1 and AXIOM_API_TOKEN.
- Sliding-window circuit breaker (5% over 5min, min 20 samples). Trips with
  one structured console.error; subsequent drops are 1%-sampled console.warn.
- Header derivers reuse Vercel/CF headers for request_id, region, country,
  reqBytes; ua_hash null unless USAGE_UA_PEPPER is set (no stable
  fingerprinting).
- Dev-only x-usage-telemetry response header for 2-second debugging.
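The sliding-window breaker described above can be sketched like this — the thresholds (5% over 5 min, min 20 samples) come from the commit message, while the data layout and function names are illustrative:

```typescript
// Illustrative sliding-window circuit breaker; not the repo's exact code.
const WINDOW_MS = 5 * 60_000;   // 5-minute window
const MIN_SAMPLES = 20;         // don't trip on thin data
const MAX_FAILURE_RATE = 0.05;  // 5%

const samples: Array<{ at: number; ok: boolean }> = [];

function record(ok: boolean, now = Date.now()): void {
  samples.push({ at: now, ok });
  // prune samples that fell out of the window
  while (samples.length && samples[0].at < now - WINDOW_MS) samples.shift();
}

function tripped(now = Date.now()): boolean {
  const live = samples.filter((s) => s.at >= now - WINDOW_MS);
  if (live.length < MIN_SAMPLES) return false; // below sample floor: stay closed
  const failures = live.filter((s) => !s.ok).length;
  return failures / live.length > MAX_FAILURE_RATE;
}
```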

server/_shared/auth-session.ts
- New resolveClerkSession returning { userId, orgId } in one JWT verify so
  customer_id can be Clerk org id without a second pass. resolveSessionUserId
  kept as back-compat wrapper.
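A hypothetical shape of the single-verify resolver and its back-compat wrapper; `verifyJwt` here is a stub standing in for the real Clerk JWT verification:

```typescript
interface Claims { sub?: string; org_id?: string }
interface ClerkSession { userId: string | null; orgId: string | null }

// Stand-in: production code cryptographically verifies a Clerk JWT.
async function verifyJwt(token: string): Promise<Claims> {
  return JSON.parse(token) as Claims;
}

// One verification yields both ids, so customer_id can be the Clerk
// org id without a second pass.
async function resolveClerkSession(token: string): Promise<ClerkSession> {
  const claims = await verifyJwt(token);
  return { userId: claims.sub ?? null, orgId: claims.org_id ?? null };
}

// Back-compat wrapper so existing callers don't churn.
async function resolveSessionUserId(token: string): Promise<string | null> {
  return (await resolveClerkSession(token)).userId;
}
```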

No emission wiring yet — that lands in the next commit (gateway request
event + 403 + 429).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(gateway): emit Axiom request events on every return path (#3381)

Wires the request-event side of the Axiom usage-telemetry stack. Behind
USAGE_TELEMETRY=1 — no-op when the env var is unset.

Emit points (each builds identity from accumulated gateway state):
- disallowed origin 403 → reason=origin_403
- disallowed origin 403 → reason=origin_403
- API access subscription required (403)
- legacy bearer 401 / 403 / 401-without-bearer
- entitlement check fail-through
- endpoint rate-limit 429 → reason=rate_limit_429
- global rate-limit 429 → reason=rate_limit_429
- 405 method not allowed
- 404 not found
- 304 etag match (resolved cache tier)
- 200 GET with body (resolved cache tier, real res_bytes)
- streaming / non-GET-200 final return (res_bytes best-effort)

Identity inputs (UsageIdentityInput):
- sessionUserId / clerkOrgId from new resolveClerkSession (one JWT verify)
- isUserApiKey + userApiKeyCustomerRef from validateUserApiKey result
- enterpriseApiKey when keyCheck.valid + non-wm_ wmKey present
- widgetKey from x-widget-key header (best-effort)
- tier captured opportunistically from existing getEntitlements calls

Header derivers reuse Vercel/CF metadata (x-vercel-id, x-vercel-ip-country,
cf-ipcountry, content-length, sentry-trace) — no new geo lookup, no new
crypto on the hot path. ua_hash null unless USAGE_UA_PEPPER is set.

Dev-only x-usage-telemetry response header (ok | degraded | off) attached
on the response paths for 2-second debugging in non-production.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(usage): upstream events via implicit request scope (#3381)

Closes the upstream-attribution side of the Axiom usage-telemetry stack
without requiring leaf-handler changes (per koala's review).

server/_shared/usage.ts
- AsyncLocalStorage-backed UsageScope: gateway sets it once per request,
  fetch helpers read from it lazily. Defensive import — if the runtime
  rejects node:async_hooks, scope helpers degrade to no-ops and the
  request event is unaffected.
- runWithUsageScope(scope, fn) / getUsageScope() exports.
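The scope mechanism above can be reduced to a short sketch — this version imports node:async_hooks statically for brevity (the real module lazy-imports it) but keeps the degrade-to-no-op fallback:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface UsageScope { requestId: string; customerId: string | null }

// Defensive construction: if the runtime rejects async_hooks, the scope
// helpers become no-ops and the request event is unaffected.
let als: AsyncLocalStorage<UsageScope> | null = null;
try {
  als = new AsyncLocalStorage<UsageScope>();
} catch {
  als = null;
}

function runWithUsageScope<T>(scope: UsageScope, fn: () => T): T {
  return als ? als.run(scope, fn) : fn(); // fallback still runs fn
}

function getUsageScope(): UsageScope | undefined {
  return als?.getStore();
}
```

Deep fetch helpers call `getUsageScope()` lazily, so no handler signature has to thread the scope through.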

server/gateway.ts
- Wraps matchedHandler in runWithUsageScope({ ctx, requestId, customerId,
  route, tier }) so deep fetchers can attribute upstream calls without
  threading state through every handler signature.

server/_shared/redis.ts
- cachedFetchJsonWithMeta accepts opts.usage = { provider, operation? }.
  Only the provider label is required to opt in — request_id / customer_id
  / route / tier flow implicitly from UsageScope.
- Emits on the fresh path only (cache hits don't emit; the inbound
  request event already records cache_status).
- cache_status correctly distinguishes 'miss' vs 'neg-sentinel' by
  construction, matching NEG_SENTINEL handling.
- Telemetry never throws — failures are swallowed in the lazy-import
  catch, and the sink itself short-circuits unless USAGE_TELEMETRY=1.

server/_shared/fetch-json.ts
- New optional { provider, operation } in FetchJsonOptions. Same
  opt-in-by-provider model as cachedFetchJsonWithMeta. Auto-derives host
  from URL. Reads body via .text() so response_bytes is recorded
  (best-effort; chunked responses still report 0).

Net result: any handler that uses fetchJson or cachedFetchJsonWithMeta
gets full per-customer upstream attribution by adding two fields to the
options bag. No signature changes anywhere else.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(gateway): address round-1 codex feedback on usage telemetry

- ctx is now optional on the createDomainGateway handler signature so
  direct callers (tests, non-Vercel paths) no longer crash on emit
- legacy premium bearer-token routes (resilience, shipping-v2) propagate
  session.userId into the usage accumulator so successful requests are
  attributed instead of emitting as anon
- after checkEntitlement allows a tier-gated route, re-read entitlements
  (Redis-cached + in-flight coalesced) to populate usage.tier so
  analyze-stock & co. emit the correct tier rather than 0
- domain extraction now skips a leading vN segment, so /api/v2/shipping/*
  records domain="shipping" instead of "v2"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(usage): assert telemetry payload + identity resolver + operator guide

- tests/usage-telemetry-emission.test.mts stubs globalThis.fetch to
  capture the Axiom ingest POST body and asserts the four review-flagged
  fields end-to-end through the gateway: domain on /api/v2/<svc>/* (was
  "v2"), customer_id on legacy premium bearer success (was null/anon),
  tier on entitlement-gated success via the Convex fallback path (was 0),
  plus a ctx-optional regression guard
- server/__tests__/usage-identity.test.ts unit-tests the pure
  buildUsageIdentity() resolver across every auth_kind branch, tier
  coercion, and the secret-handling invariant (raw enterprise key never
  lands in any output field)
- docs/architecture/usage-telemetry.md is the operator + dev guide:
  field reference, architecture, configuration, failure modes, local
  workflow, eight Axiom APL recipes, and runbooks for adding fields /
  new gateway return paths

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(usage): make recorder.settled robust to nested waitUntil

Promise.all(pending) snapshotted the array at call time, missing the
inner ctx.waitUntil(sendToAxiom(...)) that emitUsageEvents pushes after
the outer drain begins. Tests passed only because the fetch spy resolved
in an earlier microtask tick. Replace with a quiescence loop so the
helper survives any future async in the emit path.
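The quiescence loop can be sketched as follows, against a hypothetical test-recorder shape:

```typescript
// Illustrative recorder: waitUntil collects promises; settled() drains
// until no new ones appear, instead of Promise.all over a snapshot that
// would miss promises pushed by nested waitUntil calls.
const pending: Promise<unknown>[] = [];

function waitUntil(p: Promise<unknown>): void {
  pending.push(p);
}

async function settled(): Promise<void> {
  while (pending.length > 0) {
    const batch = pending.splice(0, pending.length); // take current batch
    await Promise.all(batch);                        // new pushes land in pending
  }
}
```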

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: trigger preview

* fix(usage): address koala #3403 review — collapse nested waitUntil, widget-key validation, neg-sentinel status, auth_* reasons

P1
- Collapse nested ctx.waitUntil at all 3 emit sites (gateway.ts emitRequest,
  fetch-json.ts, redis.ts emitUpstreamFromHook). Export sendToAxiom and call
  it directly inside the outer waitUntil so Edge runtimes don't drop the
  delivery promise after the response phase.
- Validate X-Widget-Key against WIDGET_AGENT_KEY before populating usage.widgetKey
  so unauthenticated callers can't spoof per-customer attribution.

P2
- Emit on OPTIONS preflight (new 'preflight' RequestReason).
- Gate cachedFetchJsonWithMeta upstreamStatus=200 on result != null so the
  neg-sentinel branch no longer reports as a successful upstream call.
- Extend RequestReason with auth_401/auth_403/tier_403 and replace
  reason:'ok' on every auth/tier-rejection emit path.
- Replace 32-bit FNV-1a with a two-round XOR-folded 64-bit variant in
  hashKeySync (collision space matters once widget-key adoption grows).
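One plausible reading of "two-round XOR-folded 64-bit variant" — an assumption about the construction, not the repo's exact code — is two 64-bit FNV-1a passes with distinct seeds, each XOR-folded down to 32 bits and concatenated:

```typescript
// Standard 64-bit FNV-1a constants; the two-round folding scheme below
// is illustrative, not the repo's hashKeySync implementation.
const FNV64_PRIME = 0x100000001b3n;
const FNV64_OFFSET = 0xcbf29ce484222325n;
const MASK64 = (1n << 64n) - 1n;

function fnv1a64(input: string, seed = FNV64_OFFSET): bigint {
  let h = seed;
  for (let i = 0; i < input.length; i++) {
    h ^= BigInt(input.charCodeAt(i));
    h = (h * FNV64_PRIME) & MASK64;
  }
  return h;
}

function hashKeySync(key: string): string {
  // two rounds with different seeds, each folded 64→32 bits via XOR
  const rounds = [fnv1a64(key), fnv1a64(key, FNV64_OFFSET ^ 0xdeadbeefn)];
  return rounds
    .map((h) => Number((h >> 32n) ^ (h & 0xffffffffn)).toString(16).padStart(8, "0"))
    .join("");
}
```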

Verification
- tests/usage-telemetry-emission.test.mts — 6/6
- tests/premium-stock-gateway.test.mts + tests/gateway-cdn-origin-policy.test.mts — 15/15
- npx vitest run server/__tests__/usage-identity.test.ts — 13/13
- npx tsc --noEmit clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: trigger preview rebuild for AXIOM_API_TOKEN

* chore(usage): note Axiom region in ingest URL comment

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* debug(usage): unconditional logs in sendToAxiom for preview troubleshooting

Temporary — to be reverted once Axiom delivery is confirmed working in preview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(usage): add 'live' cache tier + revert preview debug logs

- Sync UsageCacheTier with the local CacheTier in gateway.ts (main added
  'live' in PR #3402 — synthetic merge with main was failing typecheck:api).
- Revert temporary unconditional debug logs in sendToAxiom now that Axiom
  delivery is verified end-to-end on preview (event landed with all fields
  populated, including the new auth_401 reason from the koala #3403 fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:10:51 +03:00
Elie Habib
8655bd81bc feat(energy-atlas): GEM pipeline data import — gas 297, oil 334 (#3406)
* feat(energy-atlas): GEM pipeline data import — gas 75→297, oil 75→334 (parity-push closure)

Closes the ~3.6× pipeline-scale gap that PR #3397's import infrastructure
was built for. Per docs/methodology/pipelines.mdx operator runbook.

Source releases (CC-BY 4.0, attribution preserved in registry envelope):
  - GEM-GGIT-Gas-Pipelines-2025-11.xlsx
    SHA256: f56d8b14400e558f06e53a4205034d3d506fc38c5ae6bf58000252f87b1845e6
    URL:    https://globalenergymonitor.org/wp-content/uploads/2025/11/GEM-GGIT-Gas-Pipelines-2025-11.xlsx
  - GEM-GOIT-Oil-NGL-Pipelines-2025-03.xlsx
    SHA256: d1648d28aed99cfd2264047f1e944ddfccf50ce9feeac7de5db233c601dc3bb2
    URL:    https://globalenergymonitor.org/wp-content/uploads/2025/03/GEM-GOIT-Oil-NGL-Pipelines-2025-03.xlsx

Pre-conversion: GeoJSON (geometry endpoints) + XLSX (column properties) →
canonical operator-shape JSON via /tmp/gem-import/convert.py. Filter knobs:
  - status ∈ {operating, construction}
  - length ≥ 750 km (gas) / 400 km (oil) — asymmetric per-fuel trunk-class
  - capacity unit conversions: bcm/y native; MMcf/d, MMSCMD, mtpa, m3/day,
    bpd, Mb/d, kbd → bcm/y (gas) or bbl/d (oil) at canonical conversion factors.
  - Country names → ISO 3166-1 alpha-2 via pycountry + alias table.

Merge results (via scripts/import-gem-pipelines.mjs --merge):
  gas: +222 added, 15 duplicates skipped (haversine ≤ 5km AND token Jaccard ≥ 0.6)
  oil: +259 added, 16 duplicates skipped
  Final: 297 gas / 334 oil. Hand-curated 75+75 preserved with full evidence;
  GEM rows ship physicalStateSource='gem', classifierConfidence=0.4,
  operatorStatement=null, sanctionRefs=[].
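The duplicate gate quoted above (haversine ≤ 5 km AND token Jaccard ≥ 0.6) can be sketched like this; the real merge script layers stopword removal and further heuristics on top, so treat this as a simplified illustration:

```typescript
interface PipelineEndpoint { name: string; lat: number; lon: number }

// Great-circle distance in km (mean Earth radius 6371 km).
function haversineKm(aLat: number, aLon: number, bLat: number, bLon: number): number {
  const R = 6371;
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(bLat - aLat);
  const dLon = toRad(bLon - aLon);
  const s =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(aLat)) * Math.cos(toRad(bLat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(s));
}

// Jaccard similarity over lowercase name tokens.
function tokenJaccard(a: string, b: string): number {
  const tokens = (s: string) => new Set(s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
  const ta = tokens(a), tb = tokens(b);
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}

function isDuplicate(a: PipelineEndpoint, b: PipelineEndpoint): boolean {
  return haversineKm(a.lat, a.lon, b.lat, b.lon) <= 5 && tokenJaccard(a.name, b.name) >= 0.6;
}
```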

Floor bump:
  scripts/_pipeline-registry.mjs MIN_PIPELINES_PER_REGISTRY 8 → 200.
  Live counts (297/334) leave ~100 rows of jitter headroom so a partial
  re-import or coverage-narrowing release fails loud rather than halving
  the registry silently.

Tests:
  - tests/pipelines-registry.test.mts: bumped synthetic-registry
    Array.from({length:8}) → length:210 to clear new floor; added 'gem' to
    the evidence-source whitelist for non-flowing badges (parity with the
    derivePipelinePublicBadge audit done in PR #3397 U1).
  - tests/import-gem-pipelines.test.mjs: bumped registry-conformance loop
    3 → 70 to clear new floor.
  - 51/51 pipeline tests pass; tsc --noEmit clean.

vs the peer reference site (281 gas + 265 oil): we now exceed on both gas
(297) and oil (334). Functional + visual + data parity for the energy variant
is closed; remaining gaps are editorial-cadence (weekly briefing) which
is intentionally out of scope per the parity-push plan.

* docs(energy-atlas): land GEM converter + expand methodology runbook for quarterly refresh

PR #3406 imported the data but didn't land the conversion script that
produced it. This commit lands the converter at scripts/_gem-geojson-to-canonical.py
so future operators can reproduce the import deterministically, and rewrites
the docs/methodology/pipelines.mdx runbook to match what actually works:

- Use GeoJSON (not XLSX) — the XLSX has properties but no lat/lon columns;
  only the GIS .zip's GeoJSON has both. The original runbook said to download
  XLSX which would fail at the lat/lon validation step.
- Cadence: quarterly refresh, with concrete signals (peer-site comparison,
  90-day calendar reminder).
- Source datasets: explicit GGIT (gas) + GOIT (oil/NGL) tracker names so
  future operators don't re-request the wrong dataset (the Extraction
  Tracker = wells/fields, NOT pipelines — ours requires the Infrastructure
  Trackers).
- Last-known-good URLs documented + URL pattern explained as fallback when
  GEM rotates per release.
- Filter knob defaults documented inline (gas ≥ 750km, oil ≥ 400km, status
  ∈ {operating, construction}, capacity unit conversion table).
- Failure-mode table mapping common errors to fixes.

Converter takes paths via env vars (GEM_GAS_GEOJSON, GEM_OIL_GEOJSON,
GEM_DOWNLOADED_AT, GEM_SOURCE_VERSION) instead of hardcoded paths so it
works for any release without code edits.

* fix(energy-atlas): close PR #3406 review findings — dedup + zero-length + test

Three Greptile findings on PR #3406:

P1 — Dedup miss (Dampier-Bunbury):
  Same physical pipeline existed in both registries — curated `dampier-bunbury`
  and GEM-imported `dampier-to-bunbury-natural-gas-pipeline-au` — because GEM
  digitized only the southern 60% of the line. The shared Bunbury terminus
  pair sat 13.7 km apart and the average endpoint distance was 287 km — both
  over the 5 km gate, so the pair was never flagged as a duplicate.
  Fix: scripts/_pipeline-dedup.mjs adds a name-set-identity short-circuit —
  if Jaccard == 1.0 (after stopword removal) AND any of the 4 endpoint
  pairings is ≤ 25 km, treat as duplicate. The 25 km anchor preserves the
  existing "name collision in different ocean → still added" contract.
  Added regression test: identical Dampier-Bunbury inputs → 0 added, 1
  skipped, matched against `dampier-bunbury`.

P1 — Zero-length geometry (9 rows: Trans-Alaska, Enbridge Line 3, Ichthys, etc.):
  GEM source GeoJSON occasionally has a Point geometry or single-coord
  LineString, producing pipelines where startPoint == endPoint. They render
  as map-point artifacts and skew aggregate-length stats.
  Fix (defense in depth):
    - scripts/_gem-geojson-to-canonical.py drops at conversion time
      (`zero_length` reason in drop log).
    - scripts/_pipeline-registry.mjs validateRegistry rejects defensively
      so even a hand-curated row with degenerate geometry fails loud.

P2 — Test repetition coupled to fixture row count:
  Hardcoded `for (let i = 0; i < 70; i++)` over a 3-row fixture (= 210 rows)
  silently breaks if the fixture is ever trimmed below 3 rows.
  Fix: `Math.ceil(REGISTRY_FLOOR / fixture.length) + 5` derives reps from
  the floor and current fixture length.

Re-run --merge with all fixes applied:
  gas: 75 → 293 (+218 added, 17 deduped — was 222/15 before; +2 catches via
       name-set-identity short-circuit; -2 zero-length never imported)
  oil: 75 → 325 (+250 added, 18 deduped — was 259/16; +2 catches; -7 zero-length)

Tests: 74/74 pipeline tests pass; tsc --noEmit clean.
2026-04-25 18:59:46 +04:00
Elie Habib
5c955691a9 feat(energy-atlas): live tanker map layer + contract (parity PR 3, plan U7-U8) (#3402)
* feat(energy-atlas): live tanker map layer + contract (PR 3, plan U7-U8)

Lands the third and final parity-push surface — per-vessel tanker positions
inside chokepoint bounding boxes, refreshed every 60s. Closes the visual
gap with peer reference energy-intel sites for the live AIS tanker view.

Per docs/plans/2026-04-25-003-feat-energy-parity-pushup-plan.md PR 3.
Codex-approved through 8 review rounds against origin/main @ 050073354.

U7 — Contract changes (relay + handler + proto + gateway + rate-limit + test):

- scripts/ais-relay.cjs: parallel `tankerReports` Map populated for AIS
  ship type 80-89 (tanker class) per ITU-R M.1371. SEPARATE from the
  existing `candidateReports` Map (military-only) so the existing
  military-detection consumer's contract stays unchanged. Snapshot
  endpoint extended to accept `bbox=swLat,swLon,neLat,neLon` + `tankers=true`
  query params, with bbox-filtering applied server-side. Tanker reports
  cleaned up on the same retention window as candidate reports; capped
  at 200 per response (10× headroom for global storage).
- proto/worldmonitor/maritime/v1/{get_,}vessel_snapshot.proto:
  - new `bool include_tankers = 6` request field
  - new `repeated SnapshotCandidateReport tanker_reports = 7` response
    field (reuses existing message shape; parallel to candidate_reports)
- server/worldmonitor/maritime/v1/get-vessel-snapshot.ts: REPLACES the
  prior 5-minute `with|without` cache with a request-keyed cache —
  (includeCandidates, includeTankers, quantizedBbox) — at 60s TTL for
  the live-tanker path and 5min TTL for the existing density/disruption
  consumers. Also adds 1° bbox quantization for cache-key reuse and a
  10° max-bbox guard (BboxTooLargeError) to prevent malicious clients
  from pulling all tankers through one query.
- server/gateway.ts: NEW `'live'` cache tier. CacheTier union extended;
  TIER_HEADERS + TIER_CDN_CACHE both gain entries with `s-maxage=60,
  stale-while-revalidate=60`. RPC_CACHE_TIER maps the maritime endpoint
  from `'no-store'` to `'live'` so the CDN absorbs concurrent identical
  requests across all viewers (without this, N viewers × 6 chokepoints
  hit AISStream upstream linearly).
- server/_shared/rate-limit.ts: ENDPOINT_RATE_POLICIES entry for the
  maritime endpoint at 60 req/min/IP — enough headroom for one user's
  6-chokepoint tab plus refreshes; flags only true scrape-class traffic.
- tests/route-cache-tier.test.mjs: regex extended to include `live` so
  the every-route-has-an-explicit-tier check still recognises the new
  mapping. Without this, the new tier would silently drop the maritime
  route from the validator's route map.
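The request-keyed cache key with 1° bbox quantization and the 10° max-span guard might look like this — an illustrative simplification of the handler logic:

```typescript
// Name taken from the commit message; the real handler's shape differs.
class BboxTooLargeError extends Error {}

type Bbox = { swLat: number; swLon: number; neLat: number; neLon: number };

function snapshotCacheKey(
  includeCandidates: boolean,
  includeTankers: boolean,
  bbox?: Bbox,
): string {
  if (!bbox) return `${includeCandidates}|${includeTankers}|global`;
  // 10° guard: one query must not pull every tanker worldwide
  if (bbox.neLat - bbox.swLat > 10 || bbox.neLon - bbox.swLon > 10) {
    throw new BboxTooLargeError("bbox span exceeds 10°");
  }
  // 1° quantization: nearby viewports collapse onto one cache slot
  const q = (v: number) => Math.floor(v);
  return [
    includeCandidates, includeTankers,
    q(bbox.swLat), q(bbox.swLon), q(bbox.neLat), q(bbox.neLon),
  ].join("|");
}
```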

U8 — LiveTankersLayer consumer:

- src/services/live-tankers.ts: per-chokepoint fetcher with 60s in-memory
  cache. Promise.allSettled — never .all — so one chokepoint failing
  doesn't blank the whole layer (failed zones serve last-known data).
  Sources bbox centroids from src/config/chokepoint-registry.ts
  (CORRECT location — server/.../_chokepoint-ids.ts strips lat/lon).
  Default chokepoint set: hormuz_strait, suez, bab_el_mandeb,
  malacca_strait, panama, bosphorus.
- src/components/DeckGLMap.ts: new `createLiveTankersLayer()` ScatterplotLayer
  styled by speed (anchored amber when speed < 0.5 kn, underway cyan,
  unknown gray); new `loadLiveTankers()` async loader with abort-controller
  cancellation. Layer instantiated when `mapLayers.liveTankers && this.liveTankers.length > 0`.
- src/config/map-layer-definitions.ts: `LayerDefinition` for `liveTankers`
  with `renderers: ['flat'], deckGLOnly: true` (matches existing
  storageFacilities/fuelShortages pattern). Added to `VARIANT_LAYER_ORDER.energy`
  near `ais` so getLayersForVariant() and sanitizeLayersForVariant()
  include it on the energy variant — without this addition the layer
  would be silently stripped even when toggled on.
- src/types/index.ts: `liveTankers?: boolean` on the MapLayers union.
- src/config/panels.ts: ENERGY_MAP_LAYERS + ENERGY_MOBILE_MAP_LAYERS
  both gain `liveTankers: true`. Default `false` everywhere else.
- src/services/maritime/index.ts: existing snapshot consumer pinned to
  `includeTankers: false` to satisfy the proto's new required field;
  preserves identical behavior for the AIS-density / military-detection
  surfaces.

Tests:
- npm run typecheck clean.
- 5 unit tests in tests/live-tankers-service.test.mjs cover the default
  chokepoint set (rejects ids that aren't in CHOKEPOINT_REGISTRY), the
  60s cache TTL pin (must match gateway 'live' tier s-maxage), and bbox
  derivation (±2° padding, total span under the 10° handler guard).
- tests/route-cache-tier.test.mjs continues to pass after the regex
  extension; the new maritime tier is correctly extracted.

Defense in depth:
- THREE-layer cache (CDN 'live' tier → handler bbox-keyed 60s → service
  in-memory 60s) means concurrent users hit the relay sub-linearly.
- Server-side 200-vessel cap on tanker_reports + client-side cap;
  protects layer render perf even on a runaway relay payload.
- Bbox-size guard (10° max) prevents a single global-bbox query from
  exfiltrating every tanker.
- Per-IP rate limit at 60/min covers normal use; flags scrape-class only.
- Existing military-detection contract preserved: `candidate_reports`
  field semantics unchanged; consumers self-select via include_tankers
  vs include_candidates rather than the response field changing meaning.

* fix(energy-atlas): wire LiveTankers loop + 400 bbox-range guard (PR3 review)

Three findings from review of #3402:

P1 — loadLiveTankers() was never called (DeckGLMap.ts:2999):
- Add ensureLiveTankersLoop() / stopLiveTankersLoop() helpers paired with
  the layer-enabled / layer-disabled branches in updateLayers(). The
  ensure helper kicks an immediate load + a 60s setInterval; idempotent
  so calling it on every layers update is safe.
- Wire stopLiveTankersLoop() into destroy() and into the layer-disabled
  branch so we don't hammer the relay when the layer is off.
- Layer factory now runs only when liveTankers.length > 0; ensureLoop
  fires on every observed-enabled tick so first-paint kicks the load
  even before the first tanker arrives.

P1 — bbox lat/lon range guard (get-vessel-snapshot.ts:253):
- Out-of-range bboxes (e.g. ne_lat=200) previously passed the size
  guard (200-195=5° < 10°) but failed at the relay, which silently
  drops the bbox param and returns a global capped subset — making
  the layer appear to "work" with stale phantom data.
- Add isValidLatLon() check inside extractAndValidateBbox(): every
  corner must satisfy [-90, 90] / [-180, 180] before the size guard
  runs. Failure throws BboxValidationError.
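The corner check can be sketched as follows — the ne_lat=200 case from above passes a pure span check (200−195 = 5° < 10°) but fails here before the size guard ever runs:

```typescript
// Illustrative range guard; the real handler applies this to all four
// bbox corners inside extractAndValidateBbox before the size guard.
function isValidLatLon(lat: number, lon: number): boolean {
  return (
    Number.isFinite(lat) && Number.isFinite(lon) &&
    lat >= -90 && lat <= 90 &&
    lon >= -180 && lon <= 180
  );
}
```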

P2 — BboxTooLargeError surfaced as 500 instead of 400:
- server/error-mapper.ts maps errors to HTTP status by checking
  `'statusCode' in error`. The previous BboxTooLargeError extended
  Error without that property, so the mapper fell through to
  "unhandled error" → 500.
- Rename to BboxValidationError, add `readonly statusCode = 400`.
  Mapper now surfaces it as HTTP 400 with a descriptive reason.
- Keep BboxTooLargeError as a backwards-compat alias so existing
  imports / tests don't break.

Tests:
- Updated tests/server-handlers.test.mjs structural test to pin the
  new class name + statusCode + lat/lon range checks. 24 tests pass.
- typecheck (src + api) clean.

* fix(energy-atlas): thread AbortSignal through fetchLiveTankers (PR3 review #2)

P2 — AbortController was created + aborted but signal was never passed
into the actual fetch path (DeckGLMap.ts:3048 / live-tankers.ts:100):
- Toggling the layer off, destroying the map, or starting a new refresh
  did not actually cancel in-flight network work. A slow older refresh
  could complete after a newer one and overwrite this.liveTankers with
  stale data.

Threading:
- fetchLiveTankers() now accepts `options.signal: AbortSignal`. Signal
  is passed through to client.getVesselSnapshot() per chokepoint via
  the Connect-RPC client's standard `{ signal }` option.
- Per-zone abort handling: bail early if signal is already aborted
  before the fetch starts (saves a wasted RPC + cache write); re-check
  after the fetch resolves so a slow resolver can't clobber cache
  after the caller cancelled.

Stale-result race guard in DeckGLMap.loadLiveTankers:
- Capture controller in a local before storing on this.liveTankersAbort.
- After fetchLiveTankers resolves, drop the result if EITHER:
  - controller.signal is now aborted (newer load cancelled this one)
  - this.liveTankersAbort points to a different controller (a newer
    load already started + replaced us in the field)
- Without these guards, an older fetch that completed despite
  signal.aborted could still write to this.liveTankers and call
  updateLayers, racing with the newer load.
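The stale-result guard can be reduced to this sketch — `TankerLoader` and its fields are hypothetical names for the DeckGLMap logic described above:

```typescript
class TankerLoader {
  private abortCtrl: AbortController | null = null;
  latest: string[] = [];

  async load(fetcher: (signal: AbortSignal) => Promise<string[]>): Promise<void> {
    this.abortCtrl?.abort();              // cancel any older in-flight load
    const controller = new AbortController(); // captured in a local first
    this.abortCtrl = controller;
    const result = await fetcher(controller.signal);
    // Drop stale results if EITHER: we were aborted, or a newer load
    // already replaced our controller in the field.
    if (controller.signal.aborted || this.abortCtrl !== controller) return;
    this.latest = result;
  }
}
```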

Tests: 1 new signature-pin test in tests/live-tankers-service.test.mts
verifies fetchLiveTankers accepts options.signal — guards against future
edits silently dropping the parameter and re-introducing the race.
6 tests pass. typecheck clean.

* fix(energy-atlas): bound vessel-snapshot cache via LRU eviction (PR3 review)

Greptile P2 finding: the in-process cache Map grows unbounded across the
serverless instance lifetime. Each distinct (includeCandidates,
includeTankers, quantizedBbox) triple creates a slot that's never evicted.
With 1° quantization and a misbehaving client the keyspace is ~64,000
entries — realistic load is ~12, so a 128-slot cap leaves 10x headroom
while making OOM impossible.

Implementation:
- SNAPSHOT_CACHE_MAX_SLOTS = 128.
- evictIfNeeded() walks insertion order and evicts the first slot whose
  inFlight is null. Slots with active fetches are skipped to avoid
  orphaning awaiting callers; we accept brief over-cap growth until
  in-flight settles.
- touchSlot() re-inserts a slot at the end of Map insertion order on
  hit / in-flight join / fresh write so it counts as most-recently-used.
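The eviction scheme leans on JavaScript `Map`'s insertion-order iteration; a minimal sketch (cap shrunk from 128 to 3 for illustration):

```typescript
const MAX_SLOTS = 3; // 128 in the real handler
type Slot = { value: unknown; inFlight: Promise<unknown> | null };
const cache = new Map<string, Slot>();

// Re-inserting a key moves it to the end of Map iteration order,
// so insertion order approximates recency.
function touchSlot(key: string, slot: Slot): void {
  cache.delete(key);
  cache.set(key, slot);
}

function evictIfNeeded(): void {
  if (cache.size <= MAX_SLOTS) return;
  for (const [key, slot] of cache) {   // oldest-first walk
    if (slot.inFlight === null) {      // skip slots with active fetches
      cache.delete(key);
      if (cache.size <= MAX_SLOTS) return;
    }
  }
  // if every slot is in-flight we accept brief over-cap growth
}
```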
2026-04-25 17:56:23 +04:00
Elie Habib
0bca368a7d feat(energy-atlas): EnergyRiskOverviewPanel — executive overview tile (parity PR 2, plan U5-U6) (#3398)
* feat(energy-atlas): EnergyRiskOverviewPanel — executive overview tile (PR 2, plan U5-U6)

Lands the consolidated "first fold" surface that peer reference energy-intel
sites use as their executive overview. Six tiles in a single panel:
1. Strait of Hormuz status (closed/disrupted/restricted/open)
2. EU gas storage fill % (red <30, amber 30-49, green ≥50)
3. Brent crude price + 1-day change (importer-leaning: up=red, down=green)
4. Active disruption count (filtered to endAt === null)
5. Data freshness ("X min ago" from youngest fetchedAt)
6. Hormuz crisis day counter (default 2026-02-23 start, env-overridable)

Per docs/plans/2026-04-25-003-feat-energy-parity-pushup-plan.md PR 2.

U5 — Component (src/components/EnergyRiskOverviewPanel.ts):
- Composes 5 existing services via Promise.allSettled — never .all. One slow
  or failing source CANNOT freeze the panel; failed tiles render '—' and
  carry data-degraded="true" for QA inspection. Single most important
  behavior — guards against the recurrence of the #3386 panel-stuck bug.
- Uses the actual Hormuz status enum 'closed'|'disrupted'|'restricted'|'open'
  (NOT 'normal'/'reduced'/'critical' — that triplet was a misread in earlier
  drafts). Test suite explicitly rejects the wrong triplet via the gray
  sentinel fallback.
- Brent color inverted from a default market panel: oil price UP = red
  (bad for energy importers, the dominant Atlas reader); DOWN = green.
- Crisis-day counter sourced from VITE_HORMUZ_CRISIS_START_DATE env
  (default 2026-02-23). NaN/future-dated values handled with explicit
  '—' / 'pending' sentinels so the tile never renders 'Day NaN'.
- 60s setInterval re-renders the freshness tile only — no new RPCs fire
  on tick. setInterval cleared in destroy() so panel teardown is clean.
- Tests: 24 in tests/energy-risk-overview-panel.test.mts cover Hormuz
  color enum (including the wrong-triplet rejection), EU gas thresholds,
  Brent inversion, active-disruption color bands, freshness label
  formatting, crisis-day counter (today/5-days/NaN/future), and the
  degraded-mode contract (all-fail still renders 6 tiles with 4 marked
  data-degraded).
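The degraded-mode contract above can be sketched like this — `buildOverviewTiles` is an illustrative reduction of the panel's composition, not its actual state builder:

```typescript
type Tile = { text: string; degraded: boolean };

// allSettled — never .all — so one rejected source marks only its own
// tile degraded and cannot freeze or blank the rest of the panel.
async function buildOverviewTiles(sources: Array<Promise<string>>): Promise<Tile[]> {
  const settled = await Promise.allSettled(sources);
  return settled.map((s) =>
    s.status === "fulfilled"
      ? { text: s.value, degraded: false }
      : { text: "—", degraded: true }, // failed tile renders '—'
  );
}
```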

U6 — Wiring (5 sites per skill panel-stuck-loading-means-missing-primetask):
- src/components/index.ts: barrel export
- src/app/panel-layout.ts: import + createPanel('energy-risk-overview', ...)
- src/config/panels.ts: priority-1 entry in ENERGY_PANELS (top-of-grid),
  priority-2 entry in FULL_PANELS (CMD+K-discoverable, default disabled),
  panelKey added to PANEL_CATEGORY_MAP marketsFinance category
- src/App.ts: import type + primeTask kickoff (between energy-disruptions
  and climate-news in the existing ordering convention)
- src/config/commands.ts: panel:energy-risk-overview command with keywords
  for 'risk overview', 'executive overview', 'hormuz status', 'crisis day'

No new RPCs (preserves agent-native parity — every metric the panel shows
is already exposed via existing Connect-RPC handlers and bootstrap-cache
keys; agents can answer the same questions through the same surface).

Tests: typecheck clean; 24 unit tests pass on the panel's pure helpers.
Manual visual QA pending PR merge + deploy.

Plan section §M effort estimate: ~1.5d. Codex-approved through 8 review
rounds against origin/main @ 050073354.

* fix(energy-atlas): extract Risk Overview state-builder + real component test (PR2 review)

P2 — tests duplicated helper logic instead of testing the real panel
(energy-risk-overview-panel.test.mts:10):
- The original tests pinned color/threshold helpers but didn't import
  the panel's actual state-building logic, so the panel could ship
  with a broken Promise.allSettled wiring while the tests stayed green.

Refactor:
- Extract the state-building logic into a NEW Vite-free module:
  src/components/_energy-risk-overview-state.ts. Exports
  buildOverviewState(hormuz, euGas, brent, disruptions, now) and a
  countDegradedTiles() helper for tests.
- The panel now imports and calls buildOverviewState() directly inside
  fetchData(); no logic duplication. The Hormuz tile renderer narrows
  status with an explicit cast at use site.
- Why a new module: the panel transitively imports `import.meta.glob`
  via the i18n service, which doesn't resolve under node:test even
  with tsx loader. Extracting the testable logic into a
  Vite-dependency-free module is the cleanest way to exercise the
  production code from tests, per skill panel-stuck-loading-means-
  missing-primetask's emphasis on "test the actual production logic,
  not a copy-paste of it".

Tests added (11 real-component cases via the new module):
- All four sources fulfilled → 0 degraded.
- All four sources rejected → 4 degraded, no throw, no cascade.
- Mixed (1 fulfilled, 3 rejected) → only one tile populated.
- euGas with `unavailable: true` sentinel → degraded.
- euGas with fillPct=0 → degraded (treats as no-data, not "0% red").
- brent empty data array → degraded.
- brent first-quote price=null → degraded.
- disruptions upstreamUnavailable=true → degraded.
- disruptions ongoing filter: counts only endAt-falsy events.
- Malformed hormuz response (missing status field) → degraded sentinel.
- One rejected source MUST NOT cascade to fulfilled siblings (the
  core degraded-mode contract — pinned explicitly).

Total: 35 tests in this file (was 24; +11 real-component cases).
typecheck clean.

* fix(energy-atlas): server-side disruptions filter + once-only style + panel name parity (PR2 review)

Three Greptile P2 findings on PR #3398:

- listEnergyDisruptions called with ongoingOnly:true so the server filters
  the historical 52-event payload server-side. The state builder still
  re-filters as defense-in-depth.
- RISK_OVERVIEW_CSS injected once into <head> via injectRiskOverviewStylesOnce
  instead of being emitted into setContent on every render. The 60s freshness
  setInterval was tearing out and re-inserting the style tag every minute.
- FULL_PANELS entry renamed from "Energy Risk Overview" to
  "Global Energy Risk Overview" to match ENERGY_PANELS and the CMD+K command.
2026-04-25 17:56:02 +04:00
Elie Habib
d9a1f6a0f8 feat(energy-atlas): GEM pipeline import infrastructure (parity PR 1, plan U1-U4) (#3397)
* feat(energy-atlas): GEM pipeline import infrastructure (PR 1, plan U1-U4)

Lands the parser, dedup helper, validator extensions, and operator runbook
for the Global Energy Monitor (CC-BY 4.0) pipeline-data refresh — closing
the ~3.6× Energy Atlas pipeline-scale gap once the operator runs the
import.

Per docs/plans/2026-04-25-003-feat-energy-parity-pushup-plan.md PR 1.

U1 — Validator + schema extensions:
- Add `'gem'` to VALID_SOURCES in scripts/_pipeline-registry.mjs and to the
  evidence-bearing-source whitelist in derivePipelinePublicBadge so GEM-
  sourced offline rows derive a `disputed` badge via the external-signal
  rule (parity with `press`/`satellite`/`ais-relay`).
- Export VALID_SOURCES so tests assert against the same source-of-truth
  the validator uses (matches the VALID_OIL_PRODUCT_CLASSES pattern from
  PR #3383).
- Floor bump (MIN_PIPELINES_PER_REGISTRY 8→200) intentionally DEFERRED
  to the follow-up data PR — bumping it now would gate the existing 75+75
  hand-curated rows below the new floor and break seeder publishes
  before the GEM data lands.

U2 — GEM parser (test-first):
- scripts/import-gem-pipelines.mjs reads a local JSON file (operator pre-
  converts GEM Excel externally — no `xlsx` dependency added). Schema-
  drift sentinel throws on missing columns. Status mapping covers
  Operating/Construction/Cancelled/Mothballed/Idle/Shut-in. ProductClass
  mapping covers Crude Oil / Refined Products / mixed-flow notes.
  Capacity-unit conversion handles bcm/y, bbl/d, Mbd, kbd.
- 22 tests in tests/import-gem-pipelines.test.mjs cover schema sentinel,
  fuel split, status mapping, productClass mapping, capacity conversion,
  minimum-viable-evidence shape, registry-shape conformance, and bad-
  coordinate rejection.
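
The capacity conversion can be sketched as a small normaliser. Unit names come from the commit; the Mbd convention is millions of bbl/d (so 400_000 bbl/d → 0.4 Mbd). The function shape is illustrative, not the parser's actual code:

```javascript
// Illustrative sketch: normalise oil capacities to Mbd (MILLIONS of bbl/d)
// and leave gas in bcm/y. The real parser's API may differ.
function normalizeCapacity(value, unit) {
  switch (unit) {
    case 'bcm/y': return { value, unit: 'bcm/y' };                    // gas: already canonical
    case 'Mbd':   return { value, unit: 'Mbd' };                      // oil: already canonical
    case 'kbd':   return { value: value / 1_000, unit: 'Mbd' };       // thousands → millions
    case 'bbl/d': return { value: value / 1_000_000, unit: 'Mbd' };   // barrels → millions
    default: throw new Error(`unknown capacity unit: ${unit}`);       // schema-drift sentinel
  }
}
```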

U3 — Deduplication (pure deterministic):
- scripts/_pipeline-dedup.mjs: dedupePipelines(existing, candidates) →
  { toAdd, skippedDuplicates }. Match rule: haversine ≤5km AND name
  Jaccard ≥0.6 (BOTH required). Reverse-direction-pair-aware.
- 19 tests cover internal helpers, match logic, id collision, determinism,
  and empty inputs.
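
The match rule reads as: geometric proximity AND name similarity, both mandatory. A sketch with illustrative helper names (the real internals of scripts/_pipeline-dedup.mjs differ):

```javascript
// Thresholds from the commit: haversine ≤ 5 km AND token Jaccard ≥ 0.6.
// Helper names are illustrative, not the real module's internals.
function haversineKm(a, b) {
  const R = 6371, rad = (d) => (d * Math.PI) / 180;
  const dLat = rad(b.lat - a.lat), dLon = rad(b.lon - a.lon);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(a.lat)) * Math.cos(rad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

function nameJaccard(a, b) {
  const tok = (s) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const A = tok(a), B = tok(b);
  const inter = [...A].filter((t) => B.has(t)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 0 : inter / union;
}

function isDuplicate(existing, candidate) {
  // BOTH conditions required — name similarity alone never matches.
  return haversineKm(existing, candidate) <= 5 &&
         nameJaccard(existing.name, candidate.name) >= 0.6;
}
```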

U4 — Operator runbook (data import deferred):
- docs/methodology/pipelines.mdx: 7-step runbook for the operator to
  download GEM, pre-convert Excel→JSON, dry-run with --print-candidates,
  merge with --merge, bump the registry floor, and commit with
  provenance metadata.
- The actual data import is intentionally OUT OF SCOPE for this agent-
  authored PR because GEM downloads are registration-gated. A follow-up
  PR will commit the imported scripts/data/pipelines-{gas,oil}.json +
  bump MIN_PIPELINES_PER_REGISTRY → 200 + record the GEM release SHA256.

Tests: typecheck clean; 67 tests pass across the three test files.

Codex-approved through 8 review rounds against origin/main @ 050073354.

* fix(energy-atlas): wire --merge to dedupePipelines + within-batch dedup (PR1 review)

P1 — --merge was a TODO no-op (import-gem-pipelines.mjs:291):
- Previously exited with code 2 + a "TODO: wire dedup once U3 lands"
  message. The PR body and the methodology runbook both advertised
  --merge as the operator path.
- Add mergeIntoRegistry(filename, candidates) helper that loads the
  existing envelope, runs dedupePipelines() against the candidate
  list, sorts new entries alphabetically by id (stable diff on rerun),
  validates the merged registry via validateRegistry(), and writes
  to disk only after validation passes. CLI --merge now invokes it
  for both gas and oil + prints a per-fuel summary.
- Source attribution: the registry envelope's `source` field is
  upgraded to mention GEM (CC-BY 4.0) on first merge so the data file
  itself documents provenance.

P2 — dedup transitive-match bug (_pipeline-dedup.mjs:120):
- Pre-fix loop checked each candidate ONLY against the original
  `existing` array. Two GEM rows that match each other but not anything
  in `existing` would BOTH be added, defeating the dedup contract for
  same-batch duplicates (real example: a primary GEM entry plus a
  duplicate row from a regional supplemental sheet).
- Now compares against existing FIRST (existing wins on cross-set
  match — preserves richer hand-curated evidence), then falls back to
  the already-accepted toAdd set. Within-batch matches retain the FIRST
  accepted candidate (deterministic by candidate-list order).
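
The post-fix comparison order can be sketched like this — isDuplicate stands in for the ≤5 km + Jaccard ≥0.6 rule and is injected purely for illustration (the real dedupePipelines takes only two arguments):

```javascript
// Sketch of the loop ordering only: existing wins on cross-set match,
// then already-accepted candidates win within the batch (first accepted stays).
function dedupePipelines(existing, candidates, isDuplicate) {
  const toAdd = [], skippedDuplicates = [];
  for (const cand of candidates) {
    const dup = existing.find((e) => isDuplicate(e, cand)) ||  // 1. hand-curated rows win
                toAdd.find((a) => isDuplicate(a, cand));       // 2. within-batch: first wins
    if (dup) skippedDuplicates.push({ candidate: cand, matched: dup });
    else toAdd.push(cand);
  }
  return { toAdd, skippedDuplicates };
}
```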

Tests: 22 in tests/pipeline-dedup.test.mjs (3 new) cover the
within-batch dedup, transitive collapse, and existing-wins-over-
already-accepted scenarios. typecheck clean.

* fix(energy-atlas): cross-file-atomic --merge (PR1 review #2)

P1 — partial-import on disk if oil validation fails after gas writes
(import-gem-pipelines.mjs:329 / :350):
- Previous flow ran `mergeIntoRegistry('pipelines-gas.json', gas)` which
  wrote to disk, then `mergeIntoRegistry('pipelines-oil.json', oil)`. If
  oil validation failed, the operator was left with a half-imported
  state: gas had GEM rows committed to disk but oil didn't.
- Refactor into a two-phase API:
  1. prepareMerge(filename, candidates) — pure, no disk I/O. Builds the
     merged envelope, validates it, throws on validation failure.
  2. mergeBothRegistries(gasCandidates, oilCandidates) — calls
     prepareMerge for BOTH fuels first; only writes to disk after BOTH
     pass validation. If oil's prepareMerge throws, gas was never
     touched on disk.
- CLI --merge now invokes mergeBothRegistries. The atomicity guarantee
  is documented inline in the helper.
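
The two-phase shape, sketched with prepareMerge and the writer injected (illustrative only — the real helpers resolve files and validation themselves):

```javascript
// Phase 1 validates BOTH merged envelopes with no disk I/O; phase 2 writes.
// If oil's prepareMerge throws, gas is never touched on disk.
function mergeBothRegistries(gasCandidates, oilCandidates, { prepareMerge, writeRegistry }) {
  const gas = prepareMerge('pipelines-gas.json', gasCandidates); // throws on invalid merge
  const oil = prepareMerge('pipelines-oil.json', oilCandidates); // throws before any write
  writeRegistry('pipelines-gas.json', gas.envelope);             // both files update,
  writeRegistry('pipelines-oil.json', oil.envelope);             // or neither does
  return { gas: gas.summary, oil: oil.summary };
}
```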

typecheck clean. No new tests because the existing dedup + validate
suites cover the underlying logic; the change is purely about call
ordering for atomicity.

* fix(energy-atlas): deterministic lastEvidenceUpdate + clarify test comment (PR1 review #3)

P2 — lastEvidenceUpdate was non-deterministic (Greptile P2):
- Previous code used new Date().toISOString() per parser run, so two runs
  of parseGemPipelines on the same input on different days produced
  byte-different output. Quarterly re-imports would produce noisy
  full-row diffs even when the upstream GEM data hadn't changed.
- New: resolveEvidenceTimestamp(envelope) derives the timestamp from
  envelope.downloadedAt (the operator-recorded date) or sourceVersion
  if it parses as ISO. Falls back to 1970-01-01 sentinel when neither
  is set — deliberately ugly so reviewers spot the missing field in
  the data file diff rather than getting silent today's date.
- Computed once per parse run so every emitted candidate gets the
  same timestamp.
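
A sketch of the derivation order — here Date.parse stands in for the stricter parses-as-ISO check the commit describes, so treat this as illustrative:

```javascript
// Derive the evidence timestamp from operator-recorded fields, in priority
// order; fall back to the deliberately ugly epoch sentinel so a missing date
// is loud in the data-file diff rather than silently becoming "today".
function resolveEvidenceTimestamp(envelope) {
  for (const raw of [envelope.downloadedAt, envelope.sourceVersion]) {
    const ms = Date.parse(raw ?? '');
    if (!Number.isNaN(ms)) return new Date(ms).toISOString();
  }
  return '1970-01-01T00:00:00.000Z';
}
```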

P2 — misleading test comment (Greptile P2):
- Comment in tests/import-gem-pipelines.test.mjs:136 said "400_000 bbl/d
  ÷ 1000 = 400 Mbd" while the assertion correctly expects 0.4 (because
  the convention is millions, not thousands). Rewrote the comment to
  state the actual rule + arithmetic clearly.

3 new tests for determinism: (a) two parser runs produce identical
output, (b) timestamp derives from downloadedAt, (c) missing date
yields the epoch sentinel (loud failure mode).
2026-04-25 17:55:45 +04:00
Elie Habib
eeffac31bf fix(vercel): drop fragile VERCEL_GIT_PULL_REQUEST_ID guard in ignore step (#3404)
scripts/vercel-ignore.sh skipped any preview deploy where
VERCEL_GIT_PULL_REQUEST_ID was empty. Vercel only populates that var
on fresh PR-aware webhook events; manual "Redeploy" / "Redeploy
without cache" from the dashboard, and some integration edge cases,
leave it empty even on commits attached to an open PR. The merge-base
diff against origin/main below it is already the authoritative
"touched anything web-relevant" check and is strictly stronger.

Repro: PR #3403 commit 24d511e29 on feat/usage-telemetry — five api/
+ server/ files clearly modified, build canceled at line 18 before
the path diff ran. Local replay with PR_ID unset now exits 1 (build).
2026-04-25 17:53:47 +04:00
Elie Habib
92dd046820 fix(brief): address Greptile P1 + P4 review on merged PR #3396 (#3401)
P1 — False-positive PARITY REGRESSION for AI-digest opt-out users
  (scripts/seed-digest-notifications.mjs)

When rule.aiDigestEnabled === false, briefLead is intentionally
null (no summary in channel bodies), but envLead still reads the
envelope's stub lead. The string comparison null !== '<stub lead>'
fired channels_equal=false on every tick for every opted-out user
— flooding the parity log with noise and risking the PARITY
REGRESSION alert becoming useless.

The WARN was already gated by `briefLead && envLead` (so no Sentry
flood), but the LOG line still misled operators counting
channels_equal=false. Gate the entire parity-log block on the same
condition that governs briefLead population:

  if (AI_DIGEST_ENABLED && rule.aiDigestEnabled !== false) {
    // … parity log + warn …
  }

Opt-out users now produce no parity-log line at all (correct —
there's no canonical synthesis to compare against).

P4 — greetingBucket '' fallback semantics
  (scripts/lib/brief-llm.mjs)

Doc-only — Greptile flagged that unrecognised greetings collapse
to '' (a single bucket). Added a comment clarifying this is
intentional: '' is a stable fourth bucket, not a sentinel for
"missing data". A user whose greeting flips between recognised
and unrecognised values gets different cache keys, which is
correct (those produce visibly different leads).

Other Greptile findings (no code change — replied via PR comment):
- P2 (double-fetch in [...sortedDue, ...sortedAll]): already
  addressed in helper extraction commit df3563080 of PR #3396 —
  see `seen` Set dedupe at scripts/lib/digest-orchestration-helpers.mjs:103.
- P2 (parity check no-op for opted-in): outdated as written —
  after 5d10cee86's per-rule synthesis, briefLead is per-rule and
  envLead is winner-rule's envelope.lead. They diverge for
  non-winner rules (legitimate); agree for winner rules (cache-
  shared via generateDigestProse). The check still serves its
  documented purpose for cache-drift detection.

Stacked on the merged PR #3396; opens as a follow-up since the
parent branch is now closed.

Test results: 7012/7012 (was 7006 pre-rebase onto post-merge main).
2026-04-25 16:43:50 +04:00
Elie Habib
7e68b30eb8 chore(sentry): filter PlayerControlsInterface + extension-wrapped fetch noise (#3400)
* chore(sentry): filter PlayerControlsInterface + extension-wrapped fetch noise

Triaged unresolved Sentry issues — added two targeted noise filters:

- WORLDMONITOR-P2: ignoreErrors entry for /PlayerControlsInterface\.\w+ is
  not a function/ (Android Chrome WebView native bridge injection, like
  the existing _pcmBridgeCallbackHandler / hybridExecute patterns).
- WORLDMONITOR-P5: beforeSend guard suppressing TypeError 'Failed to fetch
  (<host>)' when any frame is a browser-extension URL. Some AdBlock-class
  extensions wrap window.fetch and their replacement can fail for reasons
  unrelated to our backend; the existing maplibre host-allowlist doesn't
  cover our own hosts (abacus.worldmonitor.app, api.worldmonitor.app), and
  gating on the extension frame keeps signal for genuine first-party
  fetch failures from users without such extensions.

P1 (Dodo declined) and P4 (FRED 520) are intentional ops captures — left
unfiltered and resolved in next release. P3 (dyn-import) follows the
established policy that first-party stack frames must surface and is
mitigated by installChunkReloadGuard; resolved without a filter.

* fix(sentry): gate P5 extension-fetch filter on !hasFirstParty + add regression tests

Per PR review: the new `Failed to fetch (<host>)` extension-frame filter
needs the same `!hasFirstParty` gate the broader extension rule already
uses. Without it, a real first-party regression like a panels-*.js fetch
to api.worldmonitor.app would be silenced for users who happen to run
fetch-wrapping extensions.

Added two regression tests that lock in the safety property: extension-
only stacks suppress (the original P5 case); first-party + extension
mixed stacks reach Sentry (api.worldmonitor.app outage scenario).
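
The gated filter can be sketched as a predicate a Sentry beforeSend would consult. The protocol list and first-party detection heuristic below are assumptions for illustration, not the repo's actual rule:

```javascript
// Illustrative safety property: suppress only when an extension frame is
// present AND no first-party frame is — so a real panels-*.js regression
// still reaches Sentry even for users running fetch-wrapping extensions.
const EXTENSION_PROTOCOLS = ['chrome-extension://', 'moz-extension://', 'safari-web-extension://'];

function shouldDropFetchError(event) {
  const frames = event.exception?.values?.flatMap((v) => v.stacktrace?.frames ?? []) ?? [];
  const hasExtension = frames.some((f) =>
    EXTENSION_PROTOCOLS.some((p) => f.filename?.startsWith(p)));
  const hasFirstParty = frames.some((f) => f.filename?.includes('worldmonitor.app'));
  return hasExtension && !hasFirstParty;
}
```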

* refactor(sentry): drop redundant P5 filter, retain regression tests for existing extension rule
2026-04-25 16:43:17 +04:00
Elie Habib
2f5445284b fix(brief): single canonical synthesis brain — eliminate email/brief lead divergence (#3396)
* feat(brief-llm): canonical synthesis prompt + v3 cache key

Extends generateDigestProse to be the single source of truth for
brief executive-summary synthesis (canonicalises what was previously
split between brief-llm's generateDigestProse and seed-digest-
notifications.mjs's generateAISummary). Ports Brain B's prompt
features into buildDigestPrompt:

- ctx={profile, greeting, isPublic} parameter (back-compat: 4-arg
  callers behave like today)
- per-story severity uppercased + short-hash prefix [h:XXXX] so the
  model can emit rankedStoryHashes for stable re-ranking
- profile lines + greeting opener appear only when ctx.isPublic !== true

validateDigestProseShape gains optional rankedStoryHashes (≥4-char
strings, capped to MAX_STORIES_PER_USER × 2). v2-shaped rows still
pass — field defaults to [].

hashDigestInput v3:
- material includes profile-SHA, greeting bucket, isPublic flag,
  per-story hash
- isPublic=true substitutes literal 'public' for userId in the cache
  key so all share-URL readers of the same (date, sensitivity, pool)
  hit ONE cache row (no PII in public cache key)

Adds generateDigestProsePublic(stories, sensitivity, deps) wrapper —
no userId param by design — for the share-URL surface.

Cache prefix bumped brief:llm:digest:v2 → v3. v2 rows expire on TTL.
Per the v1→v2 precedent (see hashDigestInput comment), one-tick cost
on rollout is acceptable for cache-key correctness.
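
The no-PII-in-public-key property can be sketched like this — field order is illustrative and the stand-in hash replaces the real code's cryptographic digest:

```javascript
// Stand-in 32-bit FNV-1a hash, purely for the sketch; the real hashDigestInput
// would use a cryptographic digest.
const hash = (s) => {
  let h = 0x811c9dc5;
  for (const ch of s) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
};

function digestCacheKey({ userId, isPublic, sensitivity, profile, greetingBucket, storyHashes }) {
  const identity = isPublic === true ? 'public' : userId; // no PII in the public key
  const material = [
    identity,
    sensitivity,
    hash(profile ?? ''),     // profile-SHA in the real key material
    greetingBucket ?? '',
    String(isPublic === true),
    ...storyHashes,          // per-story hash
  ].join('|');
  return `brief:llm:digest:v3:${hash(material)}`;
}
```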

Tests: 72/72 passing in tests/brief-llm.test.mjs (8 new for the v3
behaviors), full data suite 6952/6952.

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 1, Codex-approved (5 rounds).

* feat(brief): envelope v3 — adds digest.publicLead for share-URL surface

Bumps BRIEF_ENVELOPE_VERSION 2 → 3. Adds optional
BriefDigest.publicLead — non-personalised executive lead generated
by generateDigestProsePublic (already in this branch from the
previous commit) for the public share-URL surface. Personalised
`lead` is the canonical synthesis for authenticated channels;
publicLead is its profile-stripped sibling so api/brief/public/*
never serves user-specific content (watched assets/regions).

SUPPORTED_ENVELOPE_VERSIONS = [1, 2, 3] keeps v1 + v2 envelopes
in the 7-day TTL window readable through the rollout — the
composer only ever writes the current version, but readers must
tolerate older shapes that haven't expired yet. Same rollout
pattern used at the v1 → v2 bump.

Renderer changes (server/_shared/brief-render.js):
- ALLOWED_DIGEST_KEYS gains 'publicLead' (closed-key-set still
  enforced; v2 envelopes pass because publicLead === undefined is
  the v2 shape).
- assertBriefEnvelope: new isNonEmptyString check on publicLead
  when present. Type contract enforced; absence is OK.

Tests (tests/brief-magazine-render.test.mjs):
- New describe block "v3 publicLead field": v3 envelope renders;
  malformed publicLead rejected; v2 envelope still passes; ad-hoc
  digest keys (e.g. synthesisLevel) still rejected — confirming
  the closed-key-set defense holds for the cron-local-only fields
  the orchestrator must NOT persist.
- BRIEF_ENVELOPE_VERSION pin updated 2 → 3 with rollout-rationale
  comment.

Test results: 182 brief-related tests pass; full data suite
6956/6956.

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 2, Codex Round-3 Medium #2.

* feat(brief): synthesis splice + rankedStoryHashes pre-cap re-order

Plumbs the canonical synthesis output (lead, threads, signals,
publicLead, rankedStoryHashes from generateDigestProse) through the
pure composer so the orchestration layer can hand pre-resolved data
into envelope.digest. Composer stays sync / no I/O — Codex Round-2
High #2 honored.

Changes:

scripts/lib/brief-compose.mjs:
- digestStoryToUpstreamTopStory now emits `hash` (the digest story's
  stable identifier, falls back to titleHash when absent). Without
  this, rankedStoryHashes from the LLM has nothing to match against.
- composeBriefFromDigestStories accepts opts.synthesis = {lead,
  threads, signals, rankedStoryHashes?, publicLead?}. When passed,
  splices into envelope.digest after the stub is built. Partial
  synthesis (e.g. only `lead` populated) keeps stub defaults for the
  other fields — graceful degradation when L2 fallback fires.

shared/brief-filter.js:
- filterTopStories accepts optional rankedStoryHashes. New helper
  applyRankedOrder re-orders stories by short-hash prefix match
  BEFORE the cap is applied, so the model's editorial judgment of
  importance survives MAX_STORIES_PER_USER. Stable for ties; stories
  not in the ranking come after in original order. Empty/missing
  ranking is a no-op (legacy callers unchanged).

shared/brief-filter.d.ts:
- filterTopStories signature gains rankedStoryHashes?: string[].
- UpstreamTopStory gains hash?: unknown (carried through from
  digestStoryToUpstreamTopStory).
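
A sketch of the pre-cap re-rank. The real applyRankedOrder may differ in detail, but the contract is: ranked stories first, in model order, matched by short-hash prefix; unranked stories after, keeping their original relative order; empty ranking is a no-op:

```javascript
// Re-order stories by the model's rankedStoryHashes BEFORE any cap is applied,
// so editorial importance survives MAX_STORIES_PER_USER. Illustrative sketch.
function applyRankedOrder(stories, rankedStoryHashes = []) {
  if (!rankedStoryHashes.length) return stories; // legacy callers: no-op
  const rankOf = (story) => rankedStoryHashes.findIndex(
    (h) => typeof story.hash === 'string' && story.hash.startsWith(h));
  return stories
    .map((story, i) => ({ story, i, rank: rankOf(story) }))
    .sort((a, b) => {
      if (a.rank === -1 && b.rank === -1) return a.i - b.i; // both unranked: stable
      if (a.rank === -1) return 1;                          // unranked after ranked
      if (b.rank === -1) return -1;
      return a.rank - b.rank;                               // model order wins
    })
    .map((e) => e.story);
}
```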

Tests added (tests/brief-from-digest-stories.test.mjs):
- synthesis substitutes lead/threads/signals/publicLead.
- legacy 4-arg callers (no synthesis) keep stub lead.
- partial synthesis (only lead) keeps stub threads/signals.
- rankedStoryHashes re-orders pool before cap.
- short-hash prefix match (model emits 8 chars; story carries full).
- unranked stories go after in original order.

Test results: 33/33 in brief-from-digest-stories; 182/182 across all
brief tests; full data suite 6956/6956.

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 3, Codex Round-2 Low + Round-2 High #2.

* feat(brief): single canonical synthesis per user; rewire all channels

Restructures the digest cron's per-user compose + send loops to
produce ONE canonical synthesis per user per issueSlot — the lead
text every channel (email HTML, plain-text, Telegram, Slack,
Discord, webhook) and the magazine show is byte-identical. This
eliminates the "two-brain" divergence that was producing different
exec summaries on different surfaces (observed 2026-04-25 0802).

Architecture:

composeBriefsForRun (orchestration):
- Pre-annotates every eligible rule with lastSentAt + isDue once,
  before the per-user pass. Same getLastSentAt helper the send loop
  uses so compose + send agree on lastSentAt for every rule.

composeAndStoreBriefForUser (per-user):
- Two-pass winner walk: try DUE rules first (sortedDue), fall back
  to ALL eligible rules (sortedAll) for compose-only ticks.
  Preserves today's dashboard refresh contract for weekly /
  twice_daily users on non-due ticks (Codex Round-4 High #1).
- Within each pass, walk by compareRules priority and pick the
  FIRST candidate with a non-empty pool — mirrors today's behavior
  at scripts/seed-digest-notifications.mjs:1044 and prevents the
  "highest-priority but empty pool" edge case (Codex Round-4
  Medium #2).
- Three-level synthesis fallback chain:
    L1: generateDigestProse(fullPool, ctx={profile,greeting,!public})
    L2: generateDigestProse(envelope-sized slice, ctx={})
    L3: stub from assembleStubbedBriefEnvelope
  Distinct log lines per fallback level so ops can quantify
  failure-mode distribution.
- Generates publicLead in parallel via generateDigestProsePublic
  (no userId param; cache-shared across all share-URL readers).
- Splices synthesis into envelope via composer's optional
  `synthesis` arg (Step 3); rankedStoryHashes re-orders the pool
  BEFORE the cap so editorial importance survives MAX_STORIES.
- synthesisLevel stored in the cron-local briefByUser entry — NOT
  persisted in the envelope (renderer's assertNoExtraKeys would
  reject; Codex Round-2 Medium #5).
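
The three-level fallback chain can be sketched with everything injected (illustrative — real call sites pass concrete pools, a profile/greeting ctx, and the stub from assembleStubbedBriefEnvelope):

```javascript
// Distinct log lines per fallback level so ops can quantify the failure-mode
// distribution; L3 always succeeds because the stub is pre-built.
async function runSynthesisWithFallback({
  fullPool, envelopeSlice, ctx, generateDigestProse, stubSynthesis, log,
}) {
  try {
    return { level: 1, prose: await generateDigestProse(fullPool, ctx) };
  } catch (err) {
    log(`[digest] synthesis L1 failed, trying L2: ${err.message}`);
  }
  try {
    return { level: 2, prose: await generateDigestProse(envelopeSlice, {}) };
  } catch (err) {
    log(`[digest] synthesis L2 failed, falling back to stub: ${err.message}`);
  }
  return { level: 3, prose: stubSynthesis };
}
```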

Send loop:
- Reads lastSentAt via shared getLastSentAt helper (single source
  of truth with compose flow).
- briefLead = brief?.envelope?.data?.digest?.lead — the canonical
  lead. Passed to buildChannelBodies (text/Telegram/Slack/Discord),
  injectEmailSummary (HTML email), and sendWebhook (webhook
  payload's `summary` field). All-channel parity (Codex Round-1
  Medium #6).
- Subject ternary reads cron-local synthesisLevel: 1 or 2 →
  "Intelligence Brief", 3 → "Digest" (preserves today's UX for
  fallback paths; Codex Round-1 Missing #5).

Removed:
- generateAISummary() — the second LLM call that produced the
  divergent email lead. ~85 lines.
- AI_SUMMARY_CACHE_TTL constant — no longer referenced. The
  digest:ai-summary:v1:* cache rows expire on their existing 1h
  TTL (no cleanup pass).

Helpers added:
- getLastSentAt(rule) — extracted Upstash GET for digest:last-sent
  so compose + send both call one source of truth.
- buildSynthesisCtx(rule, nowMs) — formats profile + greeting for
  the canonical synthesis call. Preserves all today's prefs-fetch
  failure-mode behavior.

Composer:
- compareRules now exported from scripts/lib/brief-compose.mjs so
  the cron can sort each pass identically to groupEligibleRulesByUser.

Test results: full data suite 6962/6962 (was 6956 pre-Step 4; +6
new compose-synthesis tests from Step 3).

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Steps 4 + 4b. Codex-approved (5 rounds).

* fix(brief-render): public-share lead fail-safe — never leak personalised lead

Public-share render path (api/brief/public/[hash].ts → renderer
publicMode=true) MUST NEVER serve the personalised digest.lead
because that string can carry profile context — watched assets,
saved-region names, etc. — written by generateDigestProse with
ctx.profile populated.

Previously: redactForPublic redacted user.name and stories.whyMatters
but passed digest.lead through unchanged. Codex Round-2 High
(security finding).

Now (v3 envelope contract):
- redactForPublic substitutes digest.lead = digest.publicLead when
  the v3 envelope carries one (generated by generateDigestProsePublic
  with profile=null, cache-shared across all public readers).
- When publicLead is absent (v2 envelope still in TTL window OR v3
  envelope where publicLead generation failed), redactForPublic sets
  digest.lead to empty string.
- renderDigestGreeting: when lead is empty, OMIT the <blockquote>
  pull-quote entirely. Page still renders complete (greeting +
  horizontal rule), just without the italic lead block.
- NEVER falls back to the original personalised lead.

assertBriefEnvelope still validates publicLead's contract (when
present, must be a non-empty string) BEFORE redactForPublic runs,
so a malformed publicLead throws before any leak risk.

Tests added (tests/brief-magazine-render.test.mjs):
- v3 envelope renders publicLead in pull-quote, personalised lead
  text never appears.
- v2 envelope (no publicLead) omits pull-quote; rest of page
  intact.
- empty-string publicLead rejected by validator (defensive).
- private render still uses personalised lead.

Test results: 68 brief-magazine-render tests pass; full data suite
remains green from prior commit.

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 5, Codex Round-2 High (security).

* feat(digest): brief lead parity log + extra acceptance tests

Adds the parity-contract observability line and supplementary
acceptance tests for the canonical synthesis path.

Parity log (per send, after successful delivery):
  [digest] brief lead parity user=<id> rule=<v>:<s>:<lang>
    synthesis_level=<1|2|3> exec_len=<n> brief_lead_len=<n>
    channels_equal=<bool> public_lead_len=<n>

When channels_equal=false an extra WARN line fires —
"PARITY REGRESSION user=… — email lead != envelope lead." Sentry's
existing console-breadcrumb hook lifts this without an explicit
captureMessage call. Plan acceptance criterion A5.

Tests added (tests/brief-llm.test.mjs, +9):
- generateDigestProsePublic: two distinct callers with identical
  (sensitivity, story-pool) hit the SAME cache row (per Codex
  Round-2 Medium #4 — "no PII in public cache key").
- public + private writes never collide on cache key (defensive).
- greeting bucket change re-keys the personalised cache (Brain B
  parity).
- profile change re-keys the personalised cache.
- v3 cache prefix used (no v2 writes).

Test results: 77/77 in brief-llm; full data suite 6971/6971
(was 6962 pre-Step-7; +9 new public-cache tests).

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Steps 6 (partial) + 7. Acceptance A5, A6.g, A6.f.

* test(digest): backfill A6.h/i/l/m acceptance tests via helper extraction

* fix(brief): close two correctness regressions on multi-rule + public surface

Two findings from human review of the canonical-synthesis PR:

1. Public-share redaction leaked personalised signals + threads.
   The new prompt explicitly personalises both `lead` and `signals`
   ("personalise lead and signals"), but redactForPublic only
   substituted `lead` — leaving `signals` and `threads` intact.
   Public renderer's hasSignals gate would emit the signals page
   whenever `digest.signals.length > 0`, exposing watched-asset /
   region phrasing to anonymous readers. Same privacy bug class
   the original PR was meant to close, just on different fields.

2. Multi-rule users got cross-pool lead/storyList mismatch.
   composeAndStoreBriefForUser picks ONE winning rule for the
   canonical envelope. The send loop then injected that ONE
   `briefLead` into every due rule's channel body — even though
   each rule's storyList came from its own (per-rule) digest pool.
   Multi-rule users (e.g. `full` + `finance`) ended up with email
   bodies leading on geopolitics while listing finance stories.
   Cross-rule editorial mismatch reintroduced after the cross-
   surface fix.

Fix 1 — public signals + threads:
- Envelope shape: BriefDigest gains `publicSignals?: string[]` +
  `publicThreads?: BriefThread[]` (sibling fields to publicLead).
  Renderer's ALLOWED_DIGEST_KEYS extended; assertBriefEnvelope
  validates them when present.
- generateDigestProsePublic already returned a full prose object
  (lead + signals + threads) — orchestration now captures all
  three instead of just `.lead`. Composer splices each into its
  envelope slot.
- redactForPublic substitutes:
    digest.lead    ← publicLead (or empty → omits pull-quote)
    digest.signals ← publicSignals (or empty → omits signals page)
    digest.threads ← publicThreads (or category-derived stub via
                     new derivePublicThreadsStub helper — never
                     falls back to the personalised threads)
- New tests cover all three substitutions + their fail-safes.
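
The substitution table can be sketched as one pure transform — envelope shape abbreviated to data.digest, and derivePublicThreadsStub injected here for brevity (both are simplifications of the real renderer code):

```javascript
// Fail-safe substitution: every personalised field is replaced or emptied;
// nothing ever falls back to the personalised original.
function redactForPublic(envelope, derivePublicThreadsStub) {
  const d = envelope.data.digest;
  return {
    ...envelope,
    data: {
      ...envelope.data,
      digest: {
        ...d,
        lead: d.publicLead ?? '',                               // empty → omit pull-quote
        signals: d.publicSignals ?? [],                         // empty → omit signals page
        threads: d.publicThreads ?? derivePublicThreadsStub(d), // NEVER personalised threads
        publicLead: undefined,
        publicSignals: undefined,
        publicThreads: undefined,
      },
    },
  };
}
```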

Fix 2 — per-rule synthesis in send loop:
- Each due rule independently calls runSynthesisWithFallback over
  ITS OWN pool + ctx. Channel body lead is internally consistent
  with the storyList (both from the same pool).
- Cache absorbs the cost: when this is the winner rule, the
  synthesis hits the cache row written during the compose pass
  (same userId/sensitivity/pool/ctx) — no extra LLM call. Only
  multi-rule users with non-overlapping pools incur additional
  LLM calls.
- magazineUrl still points at the winner's envelope (single brief
  per user per slot — `(userId, issueSlot)` URL contract). Channel
  lead vs magazine lead may differ for non-winner rule sends;
  documented as acceptable trade-off (URL/key shape change to
  support per-rule magazines is out of scope for this PR).
- Parity log refined: adds `winner_match=<bool>` field. The
  PARITY REGRESSION warning now fires only when winner_match=true
  AND the channel lead differs from the envelope lead (the actual
  contract regression). Non-winner sends with legitimately
  different leads no longer spam the alert.

Test results:
- tests/brief-magazine-render.test.mjs: 75/75 (+7 new for public
  signals/threads + validator + private-mode-ignores-public-fields)
- Full data suite: 6995/6995 (was 6988; +7 net)
- typecheck + typecheck:api: clean

Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Addresses 2 review findings on PR #3396 not anticipated in the
5-round Codex review.

* fix(brief): unify compose+send window, fall through filter-rejection

Address two residual risks in PR #3396 (single-canonical-brain refactor):

Risk 1 — canonical lead synthesized from a fixed 24h pool while the
send loop ships stories from `lastSentAt ?? 24h`. For weekly users
that meant a 24h-pool lead bolted onto a 7d email body — the same
cross-surface divergence the refactor was meant to eliminate, just in
a different shape. Twice-daily users hit a 12h-vs-24h variant.

Fix: extract the window formula to `digestWindowStartMs(lastSentAt,
nowMs, defaultLookbackMs)` in digest-orchestration-helpers.mjs and
call it from BOTH the compose path's digestFor closure AND the send
loop. The compose path now derives windowStart per-candidate from
`cand.lastSentAt`, identical to what the send loop will use for that
rule. Removed the now-unused BRIEF_STORY_WINDOW_MS constant.
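
The extracted formula is small enough to show whole — a sketch matching the `??` semantics the new tests pin (an epoch-zero lastSentAt is honored, not discarded):

```javascript
// ?? (not ||) so lastSentAt === 0 (epoch) is used as-is; only null/undefined
// fall back to the default lookback window.
function digestWindowStartMs(lastSentAt, nowMs, defaultLookbackMs) {
  return lastSentAt ?? (nowMs - defaultLookbackMs);
}
```

Because both the compose path and the send loop call one formula, a weekly user's lead and email body now derive from the same `lastSentAt ?? 24h` window.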

Side-effect: digestFor now receives the full annotated candidate
(`cand`) instead of just the rule, so it can reach `cand.lastSentAt`.
Backwards-compatible at the helper level — pickWinningCandidateWithPool
forwards `cand` instead of `cand.rule`.

Cache memo hit rate drops since lastSentAt varies per-rule, but
correctness > a few extra Upstash GETs.

Risk 2 — pickWinningCandidateWithPool returned the first candidate
with a non-empty raw pool as winner. If composeBriefFromDigestStories
then dropped every story (URL/headline/shape filters), the caller
bailed without trying lower-priority candidates. Pre-PR behaviour was
to keep walking. This regressed multi-rule users whose top-priority
rule's pool happens to be entirely filter-rejected.

Fix: optional `tryCompose(cand, stories)` callback on
pickWinningCandidateWithPool. When provided, the helper calls it after
the non-empty pool check; falsy return → log filter-rejected and walk
to the next candidate; truthy → returns `{winner, stories,
composeResult}` so the caller can reuse the result. Without the
callback, legacy semantics preserved (existing tests + callers
unaffected).

Caller composeAndStoreBriefForUser passes a no-synthesis compose call
as tryCompose — cheap pure-JS, no I/O. Synthesis only runs once after
the winner is locked in, so the perf cost is one extra compose per
filter-rejected candidate, no extra LLM round-trips.

Tests:
- 10 new cases in tests/digest-orchestration-helpers.test.mjs
  covering: digestFor receiving full candidate; tryCompose
  fall-through to lower-priority; all-rejected returns null;
  composeResult forwarded; legacy semantics without tryCompose;
  digestWindowStartMs lastSentAt-vs-default branches; weekly +
  twice-daily window parity assertions; epoch-zero ?? guard.
- Updated tests/digest-cache-key-sensitivity.test.mjs static-shape
  regex to match the new `cand.rule.sensitivity` cache-key shape
  (intent unchanged: cache key MUST include sensitivity).

Stacked on PR #3396 — targets feat/brief-two-brain-divergence.
2026-04-25 16:22:31 +04:00
Elie Habib
dec7b64b17 fix(unrest): proxy-only fetch + 3-attempt retry for GDELT (#3395)
* fix(unrest): proxy-only fetch + 3-attempt retry for GDELT

Production logs showed PR #3362's 45s proxy timeout solved one failure mode
(CONNECT-tunnel timeouts) but ~80% of ticks now fail in 3-14 seconds with
either "Proxy CONNECT: HTTP/1.1 522 Server Error" (Cloudflare can't reach
GDELT origin) or "Client network socket disconnected before secure TLS
connection" (Decodo RSTs the handshake). These are fast-fails, not timeouts —
no amount of timeout bumping helps.

Two changes:

1. Drop the direct fetch entirely. Every direct attempt in 14h of logs
   errored with UND_ERR_CONNECT_TIMEOUT or ECONNRESET — 0% success since
   PR #3256 added the proxy fallback. The direct call costs ~8-30s per tick
   for nothing.

2. Wrap the proxy call in a 3-attempt retry with 1.5-3s jitter. Single-attempt
   per-tick success rate measured at ~18%; with 3 attempts that lifts to ~75%+
   under the same Decodo↔Cloudflare flake rate, comfortably keeping seedAge
   under the 120m STALE_SEED threshold.
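The retry shape described above can be sketched like this. Names are hypothetical (the real script exposes DI seams such as `_sleep` and `_maxAttempts`, per the test commit below); the SyntaxError short-circuit mirrors the behavior the test suite locks in:

```javascript
// 3-attempt proxy fetch with 1.5-3s jitter between attempts.
async function fetchViaProxyWithRetry(proxyFetch, {
  maxAttempts = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  jitter = () => 1500 + Math.random() * 1500, // 1.5-3s
} = {}) {
  let lastErr;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await proxyFetch();
    } catch (err) {
      // Deterministic parse failures shouldn't burn attempts.
      if (err instanceof SyntaxError) throw err;
      lastErr = err;
      if (attempt < maxAttempts) await sleep(jitter());
    }
  }
  throw lastErr; // the LAST error, after exactly maxAttempts tries
}
```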

Deeper structural fix (out of scope here): wire ACLED credentials on the
Railway unrest service so GDELT isn't the single upstream.

* test(unrest): cover GDELT proxy retry path + no-proxy hard-fail

Address PR #3395 reviewer concerns:

(1) "no automated coverage for the new retry path or the no-proxy path"

Add scripts/seed-unrest-events.mjs DI seams (_proxyFetcher, _sleep,
_jitter, _maxAttempts, _resolveProxyForConnect) and a 6-test suite at
tests/seed-unrest-gdelt-fetch.test.mjs covering:

  1. Single-attempt success — no retries fire.
  2. 2 transient failures + 3rd-attempt success — recovers, returns JSON.
  3. All attempts fail — throws LAST error, exact attempt count.
  4. Malformed proxy body — SyntaxError short-circuits retry (deterministic
     parse failures shouldn't burn attempts).
  5. Missing CONNECT proxy creds — fetchGdeltEvents throws clear
     "PROXY_URL env var is not set" pointer for ops, asserts NO proxy
     fetcher invocation (no wasted network).
  6. End-to-end with retry — fetchGdeltEvents with one transient 522
     recovers and aggregates events normally.

Gate runSeed() entry-point with `import.meta.url === file://argv[1]` so
tests can `import` the module without triggering a real seed run.
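The gate can be sketched as below. The commit's `file://argv[1]` is shorthand; `pathToFileURL` (a real `node:url` export) builds the comparable href and handles platform path quirks. `isDirectRun` and the commented `runSeed()` call are illustrative names, not the script's actual shape:

```javascript
import { pathToFileURL } from 'node:url';

// True only when this module is the process entry point
// (node scripts/seed-unrest-events.mjs), false when a test imports it.
function isDirectRun(moduleUrl, argv1) {
  return argv1 != null && moduleUrl === pathToFileURL(argv1).href;
}

if (isDirectRun(import.meta.url, process.argv[1])) {
  // runSeed(); // only fires on a direct invocation
}
```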

(2) "review assumes Railway has Decodo creds; without them, fails immediately"

Yes — that's intentional. Direct fetch had 0% success in production for
weeks (every Railway tick errored UND_ERR_CONNECT_TIMEOUT or ECONNRESET)
since PR #3256 added the proxy fallback. Reintroducing it as "soft"
fallback would just add ~30s of latency + log noise per tick.

What's improved here: the no-proxy error message now names the missing
env var (PROXY_URL) so an operator who hits this in Railway logs has a
direct pointer instead of a generic "GDELT requires proxy" string.
2026-04-25 15:27:43 +04:00
Elie Habib
0500733541 feat(variants): wire energy.worldmonitor.app subdomain (gaps #9-11) (#3394)
DNS (Cloudflare) and the Vercel domain are already provisioned by the
operator; this lands the matching code-side wiring so the variant
actually resolves and renders correctly.

Changes:

middleware.ts
- Add `'energy.worldmonitor.app': 'energy'` to VARIANT_HOST_MAP. This
  also auto-includes the host in ALLOWED_HOSTS via the spread on
  line 87.
- Add `energy` entry to VARIANT_OG with the Energy-Atlas-specific
  title + description from `src/config/variant-meta.ts:130-152`. OG
  image points at `https://energy.worldmonitor.app/favico/energy/og-image.png`,
  matching the per-variant convention used by tech / finance /
  commodity / happy.

vercel.json
- Add `https://energy.worldmonitor.app` to BOTH `frame-src` and
  `frame-ancestors` in the global Content-Security-Policy header.
  Without this, the variant subdomain would render but be blocked
  from being framed back into worldmonitor.app for any embedded
  flow (Slack/LinkedIn previews, future iframe widgets, etc.).
  This supersedes the CSP-only portion of PR #3359 (which mixed
  CSP with unrelated relay/military changes).

convex/payments/checkout.ts:108-117
- Add `https://energy.worldmonitor.app` to the checkout returnUrl
  allowlist. Without this, a PRO upgrade flow initiated from the
  energy subdomain would fail with "Invalid returnUrl" on Convex.

src-tauri/tauri.conf.json:32
- Add `https://energy.worldmonitor.app` to the Tauri desktop CSP
  frame-src so the desktop app can embed the variant the same way
  it embeds the other 4.

public/favico/energy/* (NEW, 7 files)
- Stub the per-variant favicon directory by copying the root-level
  WorldMonitor brand assets (android-chrome 192/512, apple-touch,
  favicon 16/32/ico, og-image). This keeps the launch unblocked
  on design assets — every referenced URL resolves with valid
  bytes from day one. Replace with energy-themed designs in a
  follow-up PR; the file paths are stable.

Other variant subdomains already on main (tech / finance / commodity /
happy) are unchanged. APP_HOSTS in src/services/runtime.ts already
admits any `*.worldmonitor.app` via `host.endsWith('.worldmonitor.app')`
on line 226, so no edit needed there.

Closes gaps §L #9, #10, #11 in
docs/internal/energy-atlas-registry-expansion.md.
2026-04-25 14:19:28 +04:00
Elie Habib
8f8213605f docs(brief-quality): correct help text — cap_truncation has no fallback estimate (#3393)
Greptile P3 follow-up on PR #3390 (already merged): the help comment
described cap_truncation_rate as "computed from production drop logs
if supplied via stdin, else estimated as max(0, in - 16)/in from
replay record counts" — but:

1. The "16" was stale (post PR #3389, cap default is 12).
2. The fallback estimate was never implemented. cap_truncation only
   appears when --drop-lines-stdin is passed.

Updated the comment to match what the code actually does: cap
metric is omitted entirely without stdin input. No fallback estimate
because replay records don't capture the post-cap output count, so
any derived value would be misleading.
2026-04-25 13:47:53 +04:00
Elie Habib
621ac8d300 feat(brief): topic-threshold sweep + quality dashboard + labeled pairs (#3390)
* feat(brief): topic-threshold sweep + daily quality dashboard + labeled pairs

Adds the "are we getting better" measurement infrastructure for the
brief topic-grouping pipeline. Three artifacts:

1. scripts/data/brief-adjacency-pairs.json — labeled "should-cluster"
   and "should-separate" pairs from real production briefs (12 pairs,
   7 cluster + 5 separate). Append-only labeled corpus.

2. scripts/sweep-topic-thresholds.mjs — pulls the per-tick replay log
   captured by writeReplayLog, reconstructs each tick's reps + cached
   embeddings, re-runs single-link clustering at multiple cosine
   thresholds, and outputs a markdown table with pair_recall,
   false_adjacency, topic_count, multi-member share, and a composite
   quality_score per threshold. Picks the highest-scoring as the
   recommendation.

3. scripts/brief-quality-report.mjs — daily quality dashboard. Pulls
   the latest tick, computes metrics at the active threshold, prints
   which labeled pairs were violated. Run before each config change;
   compare deltas; revert if quality_score drops.

Both scripts mirror the production slice (score floor + top-N) before
clustering so metrics reflect what users actually receive.
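A minimal single-link clustering over cosine similarity, the partition the sweep re-runs per threshold, can be sketched as below. This is an illustrative union-find version under the assumption that any pair at or above the threshold merges; the production `singleLinkCluster` signature is not shown here:

```javascript
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Single-link: one sufficiently-similar pair chains two clusters together.
function singleLinkCluster(embeddings, threshold) {
  const parent = embeddings.map((_, i) => i);
  const find = (i) => (parent[i] === i ? i : (parent[i] = find(parent[i])));
  for (let i = 0; i < embeddings.length; i++) {
    for (let j = i + 1; j < embeddings.length; j++) {
      if (cosine(embeddings[i], embeddings[j]) >= threshold) {
        parent[find(i)] = find(j);
      }
    }
  }
  const groups = new Map();
  embeddings.forEach((_, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(i);
  });
  return [...groups.values()];
}
```

Lowering the threshold can only merge clusters, never split them, which is why pair_recall rises monotonically as the threshold drops while false_adjacency rises with it.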

First sweep result against 2026-04-24 production replay records:

  threshold | quality | recall | false-adj
     0.30   |  0.649  | 100.0% | 100.0%
     0.32   |  0.705  | 100.0% |  75.0%
     0.35   |  0.825  | 100.0% |  33.3%
     0.38   |  0.815  | 100.0% |  33.3%
     0.40   |  0.815  | 100.0% |  33.3%
     0.42   |  0.895  | 100.0% |   8.3%  ★
     0.45   |  0.535  |  36.4% |   0.0%  ← current production

Recommended env flip: DIGEST_DEDUP_TOPIC_THRESHOLD=0.42 — lifts
pair_recall from 36% to 100% while introducing only one false-adjacency
case (1 of 12 separate pairs).

* fix(brief-quality): reviewer feedback — cap-aware metrics + env-readable + missing-embed survival

Addresses 6 of 8 review comments on PR #3390:

B. Drop redundant groupTopicsPostDedup call. singleLinkCluster IS the
   partition algorithm production uses internally; the second pass was
   paying cosine work per threshold per tick to read only .error.

C. Score floor + topN + cap now read from production env
   (DIGEST_SCORE_MIN, DIGEST_MAX_ITEMS, DIGEST_MAX_STORIES_PER_USER)
   with documented defaults. CLI flags --score-floor / --top-n /
   --cap (--max-stories) override.

D. Filter reps with missing embeddings instead of returning null on
   the whole tick. Skip only if fewer than 5 survive. Drop count
   reported in Coverage.

E. Removed dead local cosine() in both files.

F. JSON metadata moved from underscore-prefixed top-level keys into a
   nested `meta: {}` object.

H. Recommendation output now names the Railway service explicitly
   so copy-paste can't go to the wrong service.

Adds visible-window pair-recall: scores cluster correctness on what
the user actually sees post-MAX_STORIES_PER_USER truncation, in
addition to partition correctness on the full 30-rep sliced set.

Visible-window finding (against 2026-04-24 production replay):

  threshold=0.45 cap=12 → visible_quality 0.916
  threshold=0.45 cap=16 → visible_quality 0.716  ← cap bump HURTS
  threshold=0.42 cap=12 → visible_quality 0.845
  threshold=0.42 cap=16 → visible_quality 0.845

PR #3389's cap bump 12 → 16 is NOT evidence-justified at the current
0.45 threshold. Positions 13-16 dilute without helping adjacency.
PR #3389 will be revised separately to keep cap=12 default but add
env-tunability.

Skipping G (helper extraction) per reviewer guidance — defer until a
third tool justifies the abstraction.

* fix(brief-quality): reviewer round 2 — single-star, cap=12 default, error path surfaced

Three Greptile review comments on PR #3390:

P1 — sweep ★ marker tagged every running-best row instead of only
the global best. Compute the global best in a first pass, render
in a second; only the single best row is starred.

P2 — sweep MAX_STORIES_DEFAULT was 16 (assumed PR #3389 would land
the bump). PR #3389 was revised after evidence to keep cap at 12;
default reverted here too. Local runs without DIGEST_MAX_STORIES_PER_USER
now evaluate the correct production-equivalent visible window.

P2 — brief-quality-report's main() gated `scoreReplay` on
`embeddingByHash.size === reps.length`, defeating the missing-embed
survival logic inside scoreReplay (which already filters and falls
back to MIN_SURVIVING_REPS). Removed the outer gate; renderReport's
existing ⚠️ error path now surfaces the diagnostic when too few
embeddings survive instead of silently omitting the section.

Re-running the sweep with the corrected cap=12 default produces a
substantially different recommendation than the original commit
message claimed:

  threshold | visible_quality (cap=12)
     0.30   |   0.649
     0.35   |   0.625
     0.40   |   0.615
     0.42   |   0.845
     0.45   |   0.916    ← current production IS the local optimum

The original commit's "lower threshold to 0.42" recommendation was
an artifact of the cap=16 default. At the actual production cap (12),
the labeled corpus says the current 0.45 threshold is best. PR
description will be updated separately.

The 'shadowed `items`' Greptile mention refers to two `items`
declarations in DIFFERENT function scopes (`redisLrangeAll` and
`scoreOneTick`); not a real shadowing — skipped.
2026-04-25 12:08:15 +04:00
Elie Habib
3373b542e9 feat(brief): make MAX_STORIES_PER_USER env-tunable (default 12, evidence kept it at 12) (#3389)
* fix(brief): bump MAX_STORIES_PER_USER 12 → 16

Production telemetry from PR #3387 surfaced cap-truncation as the
dominant filter loss: 73% of `sensitivity=all` users had `dropped_cap=18`
per tick (30 qualified stories truncated to 12). Multi-member topics
straddling the position-12 boundary lost members.

Bumping the cap to 16 lets larger leading topics fit fully without
affecting `sensitivity=critical` users (their pools cap at 7-10 stories
— well below either threshold). Reduces dropped_cap from ~18 to ~14
per tick.

Validation signal: watch the `[digest] brief filter drops` log line on
Railway after deploy — `dropped_cap=` should drop by ~4 per tick.

Side effect: this addresses the dominant production signal that
Solution 3 (post-filter regroup, originally planned in
docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md)
was supposed to handle. Production evidence killed Sol-3's premise
(0 non-cap drops in 70 samples), so this is a simpler, evidence-backed
alternative.

* revise(brief): keep MAX_STORIES_PER_USER default at 12, add env-tunability

Reviewer asked "why 16?" and the honest answer turned out to be: the
data doesn't support it. After landing PR #3390's sweep harness with
visible-window metrics, re-ran against 2026-04-24 production replay:

  threshold=0.45 cap=12 -> visible_quality 0.916 (best at this cap)
  threshold=0.45 cap=16 -> visible_quality 0.716 (cap bump HURTS)
  threshold=0.42 cap=12 -> visible_quality 0.845
  threshold=0.42 cap=16 -> visible_quality 0.845 (neutral)

At the current 0.45 threshold, positions 13-16 are mostly singletons
or members of "should-separate" clusters — they dilute the brief
without helping topic adjacency. Bumping the cap default to 16 was a
wrong inference from the dropped_cap=18 signal alone.

Revised approach:

- Default MAX_STORIES_PER_USER stays at 12 (matches historical prod).
- Constant becomes env-tunable via DIGEST_MAX_STORIES_PER_USER so any
  future sweep result can be acted on with a Railway env flip without
  a redeploy.

The actual evidence-backed adjacency fix from the sweep is to lower
DIGEST_DEDUP_TOPIC_THRESHOLD from 0.45 -> 0.42 (env flip; see PR #3390).

* fix(brief-llm): tie buildDigestPrompt + hashDigestInput slice to MAX_STORIES_PER_USER

Greptile P1 on PR #3389: with MAX_STORIES_PER_USER now env-tunable,
hard-coded stories.slice(0, 12) in buildDigestPrompt and hashDigestInput
would mean the LLM prose only references the first 12 stories when
the brief carries more. Stories 13+ would appear as visible cards
but be invisible to the AI summary — a quiet mismatch between reader
narrative and brief content.

Cache key MUST stay aligned with the prompt slice or it drifts from
the prompt content; same constant fixes both sites.

Exports MAX_STORIES_PER_USER from brief-compose.mjs (single source
of truth) and imports it in brief-llm.mjs. No behaviour change at
the default cap of 12.
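The single-source-of-truth wiring can be sketched like this. The constant name and env var come from the commit; the slice helpers and the stringified cache-key input are assumed stand-ins for the real `buildDigestPrompt` / `hashDigestInput` internals:

```javascript
// brief-compose.mjs (sketch): env-tunable cap, default 12.
const MAX_STORIES_PER_USER =
  Number(process.env.DIGEST_MAX_STORIES_PER_USER) || 12;

// brief-llm.mjs (sketch): prompt and cache key slice with the SAME constant,
// so the LLM prose and the cache entry can never drift from each other.
const promptStories = (stories) => stories.slice(0, MAX_STORIES_PER_USER);
const hashDigestInput = (stories) =>
  JSON.stringify(promptStories(stories).map((s) => s.id));
```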
2026-04-25 12:07:48 +04:00
Elie Habib
abdcdb581f feat(resilience): SWF manifest expansion + KIA split + new schema fields (#3391)
* feat(resilience): SWF manifest expansion + KIA split + new schema fields

Phase 1 of plan 2026-04-25-001 (Codex-approved round 5). Manifest-only
data correction; no construct change, no cache prefix bump.

Schema additions (loader-validated, misplacement-rejected):
- top-level: aum_usd, aum_year, aum_verified (primary-source AUM)
- under classification: aum_pct_of_audited (fraction multiplier),
  excluded_overlaps_with_reserves (boolean; documentation-only)

Manifest expansion (13 → 21 funds, 6 → 13 countries):
- UAE: +ICD ($320B verified), +ADQ ($199B verified), +EIA (unverified —
  loaded for documentation, excluded from scoring per data-integrity rule)
- KW: kia split into kia-grf (5%, access=0.9) + kia-fgf (95%,
  access=0.20). Corrects ~18× over-statement of crisis-deployable
  Kuwait sovereign wealth (audit found combined-AUM × 0.7 access
  applied $750B as "deployable" against ~$15B actual GRF stabilization
  capacity).
- CN: +CIC ($1.35T), +NSSF ($400B, statutorily-gated 0.20 tier),
  +SAFE-IC ($417B, excluded — overlaps SAFE FX reserves)
- HK: +HKMA-EF ($498B, excluded — overlaps HKMA reserves)
- KR: +KIC ($182B, IFSWF full member)
- AU: +Future Fund ($192B, pension-locked)
- OM: +OIA ($50B, IFSWF member)
- BH: +Mumtalakat ($19B)
- TL: +Petroleum Fund ($22B, GPFG-style high-transparency)

Re-audits (Phase 1E):
- ADIA access 0.3 → 0.4 (rubric flagged; ruler-discretionary deployment
  empirically demonstrated)
- Mubadala access 0.4 → 0.5 (rubric flagged); transparency 0.6 → 0.7
  (LM=10 + IFSWF full member alignment)

Rubric (docs/methodology/swf-classification-rubric.md):
- New "Statutorily-gated long-horizon" 0.20 access tier added between
  0.1 (sanctions/frozen) and 0.3 (intergenerational/ruler-discretionary).
  Anchored by KIA-FGF (Decree 106 of 1976; Council-of-Ministers + Emir
  decree gate; crossed once in extremis during COVID).

Seeder:
- Two new pure helpers: shouldSkipFundForBuffer (excluded/unverified
  decision) and applyAumPctOfAudited (sleeve fraction multiplier)
- Manifest-AUM bypass: if aum_verified=true AND aum_usd present,
  use that value directly (skip Wikipedia)
- Skip funds with excluded_overlaps_with_reserves=true (no
  double-counting against reserveAdequacy / liquidReserveAdequacy)
- Skip funds with aum_verified=false (load for documentation only)
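A hypothetical reconstruction of the two pure helpers named above (field names follow the schema additions; the real predicate bodies are assumed):

```javascript
// Documentation-only funds never enter the buffer denominator.
function shouldSkipFundForBuffer(fund) {
  if (fund.aum_verified === false) return true; // loaded for documentation only
  if (fund.classification?.excluded_overlaps_with_reserves) return true; // no double-count
  return false;
}

// Sleeve fraction multiplier for fund-of-funds splits (e.g. KIA-GRF/FGF).
function applyAumPctOfAudited(aumUsd, fund) {
  const pct = fund.classification?.aum_pct_of_audited;
  return pct == null ? aumUsd : aumUsd * pct;
}
```

The KIA split invariant the tests lock in: the GRF (5%) and FGF (95%) sleeves sum back to the combined audited AUM.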

Tests (+25 net):
- 15 schema-extension tests (misplacement rejection, value-range gates,
  rationale-pairing coherence, backward-compat with pre-PR entries)
- 10 helper tests (shouldSkipFundForBuffer + applyAumPctOfAudited
  predicates and arithmetic; KIA-GRF + KIA-FGF sum equals combined AUM)
- Existing manifest test updated for the kia → kia-grf+kia-fgf split

Full suite: 6,940 tests pass (+50 net), typecheck clean, no new lint.

Predicted ranking deltas (informational, NOT acceptance criteria per
plan §"Hard non-goals"):
- AE sovFiscBuf likely 39 → 47-49 (Phase 1A + 1E)
- KW sovFiscBuf likely 98 → 53-57 (Phase 1B)
- CN, HK (excluded), KR, AU acquire newly-defined sovFiscBuf scores
- GCC ordering shifts toward QA > KW > AE; AE-KW gap likely 6 → ~3-4

Real outcome will be measured post-deploy via cohort audit per plan
§Phase 4.

* fix(resilience): completeness denominator excludes documentation-only funds

PR-3391 review (P1 catch): the per-country `expectedFunds` denominator
counted ALL manifest entries (`funds.length`) including those skipped
from buffer scoring by design — `excluded_overlaps_with_reserves: true`
(SAFE-IC, HKMA-EF) and `aum_verified: false` (EIA). Result: countries
with mixed scorable + non-scorable rosters showed `completeness < 1.0`
even when every scorable fund matched. UAE (4 scorable + EIA) would
show 0.8; CN (CIC + NSSF + SAFE-IC excluded) would show 0.67. The
downstream scorer then derated those countries' coverage based on a
fake-partial signal.

Three call sites all carried the same bug:
- per-country `expectedFunds` in fetchSovereignWealth main loop
- `expectedFundsTotal` + `expectedCountries` in buildCoverageSummary
- `countManifestFundsForCountry` (missing-country path)

All three now filter via `shouldSkipFundForBuffer` to count only
scorable manifest entries. Documentation-only funds neither expected
nor matched — they don't appear in the ratio at all.

Tests added (+4):
- AE complete with all 4 scorable matched (EIA documented but excluded)
- CN complete with CIC + NSSF matched (SAFE-IC documented but excluded)
- Missing-country path returns scorable count not raw manifest count
- Country with ONLY documentation-only entries excluded from expectedCountries

Full suite: 6,944 tests pass (+4 net), typecheck clean.

* fix(resilience): address Greptile P2s on PR #3391 manifest

Three review findings, all in the manifest YAML:

1. **KIA-GRF access 0.9 → 0.7** (rubric alignment): GRF deployment
   requires active Council-of-Ministers authorization (2020 COVID
   precedent demonstrates this), not rule-triggered automatic
   deployment. The rubric's 0.9 tier ("Pure automatic stabilization")
   reserved for funds where political authorization is post-hoc /
   symbolic (Chile ESSF candidate). KIA-GRF correctly fits 0.7
   ("Explicit stabilization with rule") — the same tier the
   pre-split combined-KIA was assigned. Updated rationale clarifies
   the tier choice. Rubric's 0.7 precedent column already lists
   "KIA General Reserve Fund" — now consistent with the manifest.

2. **Duplicate `# ── Australia ──` header before Oman** (copy-paste
   artifact): removed the orphaned header at the Oman section;
   added proper `# ── Australia ──` header above the Future Fund
   entry where it actually belongs (after Timor-Leste).

3. **NSSF `aum_pct_of_audited: 1.0` removed** (no-op): a multiplier
   of 1.0 is identity. The schema field is OPTIONAL and only meant
   for fund-of-funds split entries (e.g. KIA-GRF/FGF). Setting it
   to 1.0 forced the loader to require an `aum_pct_of_audited`
   rationale paragraph with no computational benefit. Both the
   field and the paragraph are now removed; NSSF remains a single-
   sleeve entry that scores its full audited AUM.

Full suite: 6,944 tests pass, typecheck clean.
2026-04-25 12:02:48 +04:00
Elie Habib
9c14820c69 fix(digest): brief filter-drop instrumentation + cache-key correctness (#3387)
* fix(digest): include sensitivity in digestFor cache key

buildDigest filters by rule.sensitivity BEFORE dedup, but digestFor
memoized only on (variant, lang, windowStart). Stricter-sensitivity
users in a shared bucket inherited the looser populator's pool,
producing the wrong story set and defeating downstream topic-grouping
adjacency once filterTopStories re-applied sensitivity.

Solution 1 from docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md.
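The fix's shape, reduced to a sketch — the real key is a template literal inside digestFor, so the function below is hypothetical; the point is only the added `sensitivity` dimension and its `?? 'high'` default:

```javascript
// Memo key after the fix: stricter-sensitivity users in a shared bucket
// no longer inherit a looser populator's pool.
const digestCacheKey = (variant, lang, windowStart, sensitivity) =>
  `${variant}|${lang}|${windowStart}|${sensitivity ?? 'high'}`;
```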

* feat(digest): instrument per-user filterTopStories drops

Adds an optional onDrop metrics callback to filterTopStories and threads
it through composeBriefFromDigestStories. The seeder aggregates counts
per composed brief and emits one structured log line per user per tick:

  [digest] brief filter drops user=<id> sensitivity=<s> in=<count>
    dropped_severity=<n> dropped_url=<n> dropped_headline=<n>
    dropped_shape=<n> out=<count>

Decides whether the conditional Solution 3 (post-filter regroup) is
warranted by quantifying how often post-group filter drops puncture
multi-member topics in production. No behaviour change for callers
that omit onDrop.

Solution 0 from docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md.

* fix(digest): close two Sol-0 instrumentation gaps from code review

Review surfaced two P2 gaps in the filter-drop telemetry that weakened
its diagnostic purpose for Sol-3 gating:

1. Cap-truncation silent drop: filterTopStories broke on
   `out.length >= maxStories` BEFORE the onDrop emit sites, so up to
   (DIGEST_MAX_ITEMS - MAX_STORIES_PER_USER) stories per user were
   invisible. Added a 'cap' reason to DropMetricsFn and emit one event
   per skipped story so `in - out - sum(dropped_*) == 0` reconciles.

2. Wipeout invisibility: composeAndStoreBriefForUser only logged drop
   stats for the WINNING candidate. When every candidate composed to
   null, the log line never fired — exactly the wipeout case Sol-0
   was meant to surface. Now tracks per-candidate drops and emits an
   aggregate `outcome=wipeout` line covering all attempts.
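The instrumented filter after gap 1 is closed can be sketched as below. Field names, the severity mapping, and the options shape are assumptions; only the drop-reason accounting contract comes from the commit:

```javascript
// filterTopStories sketch: every excluded story emits exactly one onDrop
// event, so in - out - sum(dropped_*) === 0 always reconciles.
function filterTopStories(stories, { sensitivity = 'high', maxStories = 12, onDrop } = {}) {
  const admitted = sensitivity === 'all'
    ? new Set(['critical', 'high', 'medium', 'low'])
    : new Set(['critical', 'high']);
  const out = [];
  for (const story of stories) {
    if (!admitted.has(story.severity)) { onDrop?.('severity', story); continue; }
    // One 'cap' event per skipped story — no early break, so cap
    // truncation is visible in the per-tick log line.
    if (out.length >= maxStories) { onDrop?.('cap', story); continue; }
    out.push(story);
  }
  return out;
}
```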

Also tightens the digest-cache-key sensitivity regex test to anchor
inside the cache-key template literal (it would otherwise match the
unrelated `chosenCandidate.sensitivity ?? 'high'` in the new log line).

PR review residuals from
docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md
ce-code-review run 20260424-232911-37a2d5df.

* chore: ignore .context/ ce-code-review run artifacts

The ce-code-review skill writes per-run artifacts (reviewer JSON,
synthesis.md, metadata.json) under .context/compound-engineering/.
These are local-only — neither tracked nor linted.

* fix(digest): emit per-attempt filter-drop rows, not per-user

Addresses two PR #3387 review findings:

- P2: Earlier candidates that composed to null (wiped out by post-group
  filtering) had their dropStats silently discarded when a later
  candidate shipped — exactly the signal Sol-0 was meant to surface.
- P3: outcome=wipeout row was labeled with allCandidateDrops[0]
  .sensitivity, misleading when candidates within one user have
  different sensitivities.

Fix: emit one structured row per attempted candidate, tagged with that
candidate's own sensitivity and variant. Outcome is shipped|rejected.
A wipeout is now detectable as "all rows for this user are rejected
within the tick" — no aggregate-row ambiguity. Removes the
allCandidateDrops accumulator entirely.

* fix(digest): align composeBriefFromDigestStories sensitivity default to 'high'

Addresses PR #3387 review (P2): composeBriefFromDigestStories defaulted
to `?? 'all'` while buildDigest, the digestFor cache key, and the new
per-attempt log line all default to `?? 'high'`. The mismatch is
harmless in production (the live cron path pre-filters the pool) but:

- A non-prefiltered caller with undefined sensitivity would silently
  ship medium/low stories.
- Per-attempt telemetry labels the attempt as `sensitivity=high` while
  compose actually applied 'all' — operators are misled.

Aligning compose to 'high' makes the four sites agree and the telemetry
honest. Production output is byte-identical (input pool was already
'high'-filtered upstream).

Adds 3 regression tests asserting the new default: critical/high admitted,
medium/low dropped, and onDrop fires reason=severity for the dropped
levels (locks in alignment with per-attempt telemetry).

* fix(digest): align remaining sensitivity defaults to 'high'

Addresses PR #3387 review (P2 + P3): three more sites still defaulted
missing sensitivity to 'all' while compose/buildDigest/cache/log now
treat it as 'high'.

P2 — compareRules (scripts/lib/brief-compose.mjs:35-36): the rank
function used to default to 'all', placing legacy undefined-sensitivity
rules FIRST in the candidate order. Compose then applied a 'high'
filter to them, shipping a narrow brief while an explicit 'all' rule
for the same user was never tried. Aligned to 'high' so the rank
matches what compose actually applies.

P3 — enrichBriefEnvelopeWithLLM (scripts/lib/brief-llm.mjs:526):
the digest prompt and cache key still used 'all' for legacy rules,
misleading personalization ("Reader sensitivity level: all" while the
brief contains only critical/high stories) and busting the cache for
legacy vs explicit-'all' rows that should share entries.

Also aligns the @deprecated composeBriefForRule (line 164) for
consistency, since tests still import it.

3 new regression tests in tests/brief-composer-rule-dedup.test.mjs
lock in the new ranking: explicit 'all' beats undefined-sensitivity,
undefined-sensitivity ties with explicit 'high' (decided by updatedAt),
and groupEligibleRulesByUser candidate order respects the rank.

6853/6853 tests pass (was 6850 → +3).
2026-04-25 00:23:29 +04:00
Elie Habib
8cca8d19e3 feat(resilience): Comtrade-backed re-export-share seeder + SWF Redis read (#3385)
* feat(seed): BUNDLE_RUN_STARTED_AT_MS env + runSeed SIGTERM cleanup

Prereq for the re-export-share Comtrade seeder (plan 2026-04-24-003),
usable by any cohort seeder whose consumer needs bundle-level freshness.

Two coupled changes:

1. `_bundle-runner.mjs` injects `BUNDLE_RUN_STARTED_AT_MS` into every
   spawned child. All siblings in a single bundle run share one value
   (captured at `runBundle` start, not spawn time). Consumers use this
   to detect stale peer keys — if a peer's seed-meta predates the
   current bundle run, fall back to a hard default rather than read
   a cohort-peer's last-week output.

2. `_seed-utils.mjs::runSeed` registers a `process.once('SIGTERM')`
   handler that releases the acquired lock and extends existing-data
   TTL before exiting 143. `_bundle-runner.mjs` sends SIGTERM on
   section timeout, then SIGKILL after KILL_GRACE_MS (5s). Without
   this handler the `finally` path never runs on SIGKILL, leaving
   the 30-min acquireLock reservation in place until its own TTL
   expires — the next cron tick silently skips the resource.

Regression guard memory: `bundle-runner-sigkill-leaks-child-lock` (PR
#3128 root cause).

Tests added:
- bundle-runner env injection (value within run bounds)
- sibling sections share the same timestamp (critical for the
  consumer freshness guard)
- runSeed SIGTERM path: exit 143 + cleanup log
- process.once contract: second SIGTERM does not re-enter handler

* fix(seed): address P1/P2 review findings on SIGTERM + bundle contracts

Addresses PR #3384 review findings (todos 256, 257, 259, 260):

#256 (P1) — SIGTERM handler narrowed to fetch phase only. Was installed
at runSeed entry and armed through every `process.exit` path; could
race `emptyDataIsFailure: true` strict-floor exits (IMF-External,
WB-bulk) and extend seed-meta TTL when the contract forbids it —
silently re-masking 30-day outages. Now the handler is attached
immediately before `withRetry(fetchFn)` and removed in a try/finally
that covers all fetch-phase exit branches.

#257 (P1) — `BUNDLE_RUN_STARTED_AT_MS` now has a first-class helper.
Exported `getBundleRunStartedAtMs()` from `_seed-utils.mjs` with JSDoc
describing the bundle-freshness contract. Fleet-wide helper so the
next consumer seeder imports instead of rediscovering the idiom.

#259 (P2) — SIGTERM cleanup runs `Promise.allSettled` on disjoint-key
ops (`releaseLock` + `extendExistingTtl`). Serialising compounded
Upstash latency during the exact failure mode (Redis degraded) this
handler exists to handle, risking breach of the 5s SIGKILL grace.

#260 (P2) — `_bundle-runner.mjs` asserts topological order on
optional `dependsOn` section field. Throws on unknown-label refs and
on deps appearing at a later index. Fleet-wide contract replacing
the previous prose-comment ordering guarantee.
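Findings #256 and #259 combine into the following sketch. All names are hypothetical (the real handler lives in `_seed-utils.mjs::runSeed`); what matters is the try/finally scoping and the parallel cleanup:

```javascript
// Narrowed fetch-phase SIGTERM guard.
async function runFetchPhase(fetchFn, { releaseLock, extendExistingTtl }) {
  const onSigterm = () => {
    // Disjoint-key ops in parallel (#259): a degraded Redis must not
    // serially eat the 5s SIGKILL grace window.
    Promise.allSettled([releaseLock(), extendExistingTtl()])
      .then(() => process.exit(143));
  };
  process.once('SIGTERM', onSigterm); // armed ONLY around the fetch (#256)
  try {
    return await fetchFn();
  } finally {
    // Covers every fetch-phase exit branch; strict-floor exits later in
    // runSeed can no longer race a TTL extension.
    process.removeListener('SIGTERM', onSigterm);
  }
}
```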

Tests added/updated:
- New: SIGTERM handler removed after fetchFn completes (narrowed-scope
  contract — post-fetch SIGTERM must NOT trigger TTL extension)
- New: dependsOn unknown-label + out-of-order + happy-path (3 tests)

Full test suite: 6,866 tests pass (+4 net).

* fix(seed): getBundleRunStartedAtMs returns null outside a bundle run

Review follow-up: the earlier `Math.floor(Date.now()/1000)*1000` fallback
regressed standalone (non-bundle) runs. A consumer seeder invoked
manually just after its peer wrote `fetchedAt = (now - 5s)` would see
`bundleStartMs = Date.now()`, reject the perfectly-fresh peer envelope
as "stale", and fall back to defaults — defeating the point of the
peer-read path outside the bundle.

Returning null when `BUNDLE_RUN_STARTED_AT_MS` is unset/invalid keeps
the freshness gate scoped to its real purpose (across-bundle-tick
staleness) and lets standalone runs skip the gate entirely. Consumers
check `bundleStartMs != null` before applying the comparison; see the
companion `seed-sovereign-wealth.mjs` change on the stacked PR.
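The contract can be sketched like this. `getBundleRunStartedAtMs` is the exported helper named above; the consumer-side `peerIsFresh` gate is a hypothetical shape of what the companion seeder does:

```javascript
// Returns the shared bundle-run timestamp, or null outside a bundle run
// (unset or invalid env), so standalone runs skip the freshness gate.
function getBundleRunStartedAtMs() {
  const ms = Number(process.env.BUNDLE_RUN_STARTED_AT_MS);
  return Number.isFinite(ms) && ms > 0 ? ms : null;
}

// Consumer-side gate: only reject a peer envelope as stale when its
// seed-meta predates the CURRENT bundle run.
function peerIsFresh(peerSeedMetaFetchedAtMs) {
  const bundleStartMs = getBundleRunStartedAtMs();
  if (bundleStartMs == null) return true; // standalone run: no gate
  return peerSeedMetaFetchedAtMs >= bundleStartMs;
}
```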

* test(seed): SIGTERM cleanup test now verifies Redis DEL + EXPIRE calls

Greptile review P2 on PR #3384: the existing test only asserted exit
code + log line, not that the Redis ops were actually issued. The
log claim was ahead of the test.

Fixture now logs every Upstash fetch call's shape (EVAL / pipeline-
EXPIRE / other) to stderr. Test asserts:

- >=1 EVAL op was issued during SIGTERM cleanup (releaseLock Lua
  script on the lock key)
- >=1 pipeline-EXPIRE op was issued (extendExistingTtl on canonical
  + seed-meta keys)
- The EVAL body carries the runSeed-generated runId (proves it's
  THIS run's release, not a phantom op)
- The EXPIRE pipeline touches both the canonicalKey AND the
  seed-meta key (proves the keys[] array was built correctly
  including the extraKeys merge path)

Full test suite: 6,866 tests pass, typecheck clean.

* feat(resilience): Comtrade-backed re-export-share seeder + SWF Redis read

Plan ref: docs/plans/2026-04-24-003-feat-reexport-share-comtrade-seeder-plan.md

Motivating case. Before this PR, the SWF `rawMonths` denominator for
the `sovereignFiscalBuffer` dimension used GROSS annual imports for
every country. For re-export hubs (goods transiting without domestic
settlement), this structurally under-reports resilience: UAE's 2023
$941B of imports include $334B of transit flow that never represents
domestic consumption. Net imports = gross × (1 − reexport_share).

The previous (PR 3A) design flattened a hand-curated YAML into Redis;
the YAML shipped empty and never populated, so the correction never
applied and the cohort audit showed no movement.

Gap #2 (this PR). Two coupled changes to make the correction actually
apply:

1. Comtrade-backed seeder (`scripts/seed-recovery-reexport-share.mjs`).
   Rewritten to fetch UN Comtrade `flowCode=RX` (re-exports) and
   `flowCode=M` (imports) per cohort member, compute share = RX/M at
   the latest co-populated year, clamp to [0.05, 0.95], publish the
   envelope. Header auth (`Ocp-Apim-Subscription-Key`) — subscription
   key never reaches URL/logs/Redis. `maxRecords=250000` cap with
   truncation detection. Sequential + retry-on-429 with backoff.

   Hub cohort resolved by Phase 0 empirical probe (plan §Phase 0):
   ['AE', 'PA']. Six candidates (SG/HK/NL/BE/MY/LT) return HTTP 200
   with zero RX rows — Comtrade doesn't expose RX for those reporters.

2. SWF seeder reads from Redis (`scripts/seed-sovereign-wealth.mjs`).
   Swaps `loadReexportShareByCountry()` (YAML) for
   `loadReexportShareFromRedis()` (Redis key written by #1). Guarded
   by bundle-run freshness: if the sibling Reexport-Share seeder's
   `seed-meta` predates `BUNDLE_RUN_STARTED_AT_MS` (set by the
   prereq PR's `_bundle-runner.mjs` env-injection), HARD fallback
   to gross imports rather than apply last-month's stale share.
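A minimal sketch of the share math in item 1. The names `computeShareFromFlows` and `clampShare` come from this commit's test list; the bodies here are assumptions, not the seeder's actual code:

```javascript
// Sketch: share = RX/M at the latest year present in BOTH flows,
// clamped to [0.05, 0.95]. Signatures are assumptions.
function clampShare(share, lo = 0.05, hi = 0.95) {
  return Math.min(hi, Math.max(lo, share));
}

function computeShareFromFlows(rxByYear, importsByYear) {
  const coPopulated = Object.keys(rxByYear)
    .filter((year) => year in importsByYear)
    .map(Number)
    .sort((a, b) => b - a); // newest first
  if (coPopulated.length === 0) return null; // no co-populated year
  const year = coPopulated[0];
  if (!importsByYear[year]) return null;     // avoid divide-by-zero
  return clampShare(rxByYear[year] / importsByYear[year]);
}
```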

Health registries. Both new keys registered in BOTH `api/health.js`
SEED_META (60-day alert threshold) and `api/seed-health.js`
SEED_DOMAINS (43200min interval). feedback_two_health_endpoints_must_match.

Bundle wiring. `seed-bundle-resilience-recovery` Reexport-Share
timeout bumped 60s → 300s (Comtrade + retry can take 2-3 min
worst-case). Ordering preserved: Reexport-Share before Sovereign-
Wealth so the SWF seeder reads a freshly-written key in the same
cron tick.

Deletions. YAML + loader + 7 obsolete loader tests removed; single
source of truth is now Comtrade → Redis.

Prereq. Stacks on PR #3384 (feat/bundle-runner-env-sigterm)
which adds BUNDLE_RUN_STARTED_AT_MS env injection + runSeed
SIGTERM cleanup. This PR's bundle-freshness guard depends on
that env variable.

Tests (19 new, 7 deleted, +12 net):
- Pure math: parseComtradeFlowResponse, computeShareFromFlows,
  clampShare, declareRecords + credential-leak source scan (15)
- Integration (Gap #2 regression guards): SWF seeder loadReexport
  ShareFromRedis — fresh/absent/malformed/stale-meta/missing-meta (5)
- Health registry dual-registry drift guard — scoped to this PR's
  keys, respecting pre-existing asymmetry (4)
- Bundle-ordering + timeout assertions (2)

Phase 0 cohort validation committed to plan. Full test suite
passes: 6,881 tests.

* fix(resilience): address P1/P2 review findings — adopt shared helpers, pin freshness boundary

Addresses PR #3385 review findings:

#257 (P1) consumer — `seed-sovereign-wealth.mjs` imports the shared
`getBundleRunStartedAtMs` helper from `_seed-utils.mjs` (added in the
prereq commit) instead of its own `getBundleStartMs`. Single source of
truth for the bundle-freshness contract.

#258 (P2) — `seed-recovery-reexport-share.mjs` isMain guard uses the
canonical `pathToFileURL(process.argv[1]).href === import.meta.url`
form instead of basename-suffix matching. Handles symlinks, case-
different paths on macOS HFS+, and Windows path separators without
string munging.
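The canonical form, factored into a helper for illustration (the helper name is ours; the commit describes the inline comparison):

```javascript
import { pathToFileURL } from 'node:url';

// True iff the module at `metaUrl` is the file Node was launched with.
// Comparing resolved file:// URLs sidesteps symlinks, macOS case folding,
// and Windows backslashes, unlike basename-suffix matching.
function isMainModule(argv1, metaUrl) {
  if (!argv1) return false; // e.g. node --eval, REPL
  return pathToFileURL(argv1).href === metaUrl;
}

// In a seeder: if (isMainModule(process.argv[1], import.meta.url)) main();
```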

#260 (P2) consumer — Sovereign-Wealth declares `dependsOn:
['Reexport-Share']` in the bundle spec. `_bundle-runner.mjs` (prereq
commit) now enforces topological order on load and throws on
violation — replaces the previous prose-comment ordering contract.

#261 (P2) — added a test to `tests/seed-sovereign-wealth-reads-redis-
reexport-share.test.mts` pinning the inclusive-boundary semantic:
`fetchedAtMs === bundleStartMs` must be treated as FRESH. Guards
against a future refactor to `<=` that would silently reject peers
writing at the very first millisecond of the bundle run.

Rebased onto updated prereq. Full test suite: 6,886 tests pass (+5 net).

* fix(resilience): freshness gate skipped in standalone mode; meta still required

Review catch: the previous `bundleStartMs = Date.now()` fallback made
standalone/manual `seed-sovereign-wealth.mjs` runs ALWAYS reject any
previously-seeded re-export-share meta as "stale" — even when the
operator ran the Reexport seeder milliseconds beforehand. Defeated
the point of the peer-read path outside the bundle.

With `getBundleRunStartedAtMs()` now returning null outside a bundle
(companion commit on the prereq branch), the consumer only applies
the freshness gate when `bundleStartMs != null`. Standalone runs
accept any `fetchedAt` — the operator is responsible for ordering.

Two guards survive the change:
- Meta MUST exist (absence = peer-outage fail-safe, both modes)
- In-bundle: meta MUST be at or after `BUNDLE_RUN_STARTED_AT_MS`

Two new tests pin both modes:
- standalone: accepts meta written 10 min before this process started
- standalone: still rejects missing meta (peer-outage fail-safe
  survives gate bypass)

Rebased onto updated prereq. Full test suite: 6,888 tests (+2 net).

* fix(resilience): filter world-aggregate Comtrade rows + skip final-retry sleep

Greptile review of PR #3385 flagged two P2s in the Comtrade seeder.

Finding #3 (parseComtradeFlowResponse double-count risk):
`cmdCode=TOTAL` without a partner filter currently returns only
world-aggregate rows in practice — but `parseComtradeFlowResponse`
summed every row unconditionally. A future refactor adding per-
partner querying would silently double-count (world-aggregate row +
partner-level rows for the same year), cutting the derived share in
half with no test signal.

Fix: explicit `partnerCode ∈ {'0', 0, null/undefined}` filter. Matches
current empirical behavior (aggregate-only responses) and makes the
construct robust to a future partner-level query.
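A sketch of the filter contract. Field names follow the Comtrade row shapes the commit describes; `primaryValue` as the flow-value field is an assumption:

```javascript
// World-aggregate rows carry partnerCode '0' / 0, or omit the field entirely
// in the aggregate-only response shape. Everything else is a per-partner row
// and must NOT be summed into the total.
function isWorldAggregateRow(row) {
  const partner = row.partnerCode;
  return partner === '0' || partner === 0 || partner == null;
}

function sumWorldAggregateValue(rows) {
  return rows
    .filter(isWorldAggregateRow)
    .reduce((total, row) => total + (row.primaryValue ?? 0), 0);
}
```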

Finding #4 (wasted backoff on final retry):
429 and 5xx branches slept `backoffMs` before `continue`, but on
`attempt === RETRY_MAX_ATTEMPTS` the loop condition fails immediately
after — the sleep was pure waste. Added early-return (parallel to the
existing pattern in the network-error catch branch) so the final
attempt exits the retry loop at the first non-success response
without extra latency.
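The resulting loop shape, sketched (attempt count and backoff values are illustrative, not the seeder's constants):

```javascript
const RETRY_MAX_ATTEMPTS = 3;

// Sketch of the fixed retry loop: retryable failures back off and retry,
// except on the final attempt, which returns the failure immediately;
// no sleep whose only successor is loop exit.
async function fetchWithRetry(doFetch, sleep) {
  let response;
  for (let attempt = 1; attempt <= RETRY_MAX_ATTEMPTS; attempt++) {
    response = await doFetch(attempt);
    if (response.ok) return response;
    if (attempt === RETRY_MAX_ATTEMPTS) return response; // skip final backoff
    await sleep(1000 * 2 ** (attempt - 1));
  }
  return response;
}
```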

Tests:
- 3 new `parseComtradeFlowResponse` variants: world-only filter,
  numeric-0 partnerCode shape, rows without partnerCode field
- Existing tests updated: the double-count assertion replaced with
  a "per-partner rows must NOT sum into the world-aggregate total"
  assertion that pins the new contract

Rebased onto updated prereq. Full test suite: 6,890 tests (+2 net).
2026-04-25 00:14:17 +04:00
Elie Habib
5f40f8a13a feat(seed): BUNDLE_RUN_STARTED_AT_MS env + runSeed SIGTERM cleanup (#3384)
* feat(seed): BUNDLE_RUN_STARTED_AT_MS env + runSeed SIGTERM cleanup

Prereq for the re-export-share Comtrade seeder (plan 2026-04-24-003),
usable by any cohort seeder whose consumer needs bundle-level freshness.

Two coupled changes:

1. `_bundle-runner.mjs` injects `BUNDLE_RUN_STARTED_AT_MS` into every
   spawned child. All siblings in a single bundle run share one value
   (captured at `runBundle` start, not spawn time). Consumers use this
   to detect stale peer keys — if a peer's seed-meta predates the
   current bundle run, fall back to a hard default rather than read
   a cohort-peer's last-week output.

2. `_seed-utils.mjs::runSeed` registers a `process.once('SIGTERM')`
   handler that releases the acquired lock and extends existing-data
   TTL before exiting 143. `_bundle-runner.mjs` sends SIGTERM on
   section timeout, then SIGKILL after KILL_GRACE_MS (5s). Without
   this handler the `finally` path never runs on SIGKILL, leaving
   the 30-min acquireLock reservation in place until its own TTL
   expires — the next cron tick silently skips the resource.

Regression guard memory: `bundle-runner-sigkill-leaks-child-lock` (PR
#3128 root cause).

Tests added:
- bundle-runner env injection (value within run bounds)
- sibling sections share the same timestamp (critical for the
  consumer freshness guard)
- runSeed SIGTERM path: exit 143 + cleanup log
- process.once contract: second SIGTERM does not re-enter handler

* fix(seed): address P1/P2 review findings on SIGTERM + bundle contracts

Addresses PR #3384 review findings (todos 256, 257, 259, 260):

#256 (P1) — SIGTERM handler narrowed to fetch phase only. Was installed
at runSeed entry and armed through every `process.exit` path; could
race `emptyDataIsFailure: true` strict-floor exits (IMF-External,
WB-bulk) and extend seed-meta TTL when the contract forbids it —
silently re-masking 30-day outages. Now the handler is attached
immediately before `withRetry(fetchFn)` and removed in a try/finally
that covers all fetch-phase exit branches.
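The attach/remove shape that narrowing implies, as a sketch (`withRetry`/`fetchFn` are the commit's names; the process object is injected here purely so the contract is testable):

```javascript
// SIGTERM cleanup is armed only while the fetch phase runs; the finally
// covers success, throw, and strict-floor exit branches alike, so a
// post-fetch SIGTERM can no longer extend seed-meta TTL.
async function runFetchPhase(fetchFn, cleanupOnSigterm, { withRetry, proc }) {
  const handler = () => cleanupOnSigterm();
  proc.once('SIGTERM', handler);
  try {
    return await withRetry(fetchFn);
  } finally {
    proc.removeListener('SIGTERM', handler);
  }
}
```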

#257 (P1) — `BUNDLE_RUN_STARTED_AT_MS` now has a first-class helper.
Exported `getBundleRunStartedAtMs()` from `_seed-utils.mjs` with JSDoc
describing the bundle-freshness contract. Fleet-wide helper so the
next consumer seeder imports instead of rediscovering the idiom.

#259 (P2) — SIGTERM cleanup runs `Promise.allSettled` on disjoint-key
ops (`releaseLock` + `extendExistingTtl`). Serialising them compounded
Upstash latency during the exact failure mode (degraded Redis) this
handler exists to mitigate, risking a breach of the 5s SIGKILL grace.
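The parallel shape, sketched (helper names from the commit; a rejected op is reported without blocking its sibling):

```javascript
// Both cleanup ops touch disjoint Redis keys, so they can run concurrently;
// Promise.allSettled guarantees neither failure aborts the other, which
// matters most when Redis is already degraded and each call is slow.
async function sigtermCleanup(releaseLock, extendExistingTtl) {
  const results = await Promise.allSettled([releaseLock(), extendExistingTtl()]);
  for (const result of results) {
    if (result.status === 'rejected') {
      console.error('[sigterm-cleanup] op failed:', result.reason?.message);
    }
  }
  return results;
}
```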

#260 (P2) — `_bundle-runner.mjs` asserts topological order on
optional `dependsOn` section field. Throws on unknown-label refs and
on deps appearing at a later index. Fleet-wide contract replacing
the previous prose-comment ordering guarantee.
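A sketch of that load-time assertion (section shape reduced to `label` plus optional `dependsOn`; the real bundle spec carries more fields):

```javascript
// Throws on unknown dependsOn labels and on deps that appear at a later
// index than their dependent, enforcing topological order at load time.
function assertTopologicalOrder(sections) {
  const indexByLabel = new Map(sections.map((s, i) => [s.label, i]));
  for (const [i, section] of sections.entries()) {
    for (const dep of section.dependsOn ?? []) {
      const depIndex = indexByLabel.get(dep);
      if (depIndex === undefined) {
        throw new Error(`${section.label}: unknown dependsOn label "${dep}"`);
      }
      if (depIndex >= i) {
        throw new Error(`${section.label}: dependency "${dep}" must appear earlier`);
      }
    }
  }
}
```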

Tests added/updated:
- New: SIGTERM handler removed after fetchFn completes (narrowed-scope
  contract — post-fetch SIGTERM must NOT trigger TTL extension)
- New: dependsOn unknown-label + out-of-order + happy-path (3 tests)

Full test suite: 6,866 tests pass (+4 net).

* fix(seed): getBundleRunStartedAtMs returns null outside a bundle run

Review follow-up: the earlier `Math.floor(Date.now()/1000)*1000` fallback
regressed standalone (non-bundle) runs. A consumer seeder invoked
manually just after its peer wrote `fetchedAt = (now - 5s)` would see
`bundleStartMs = Date.now()`, reject the perfectly-fresh peer envelope
as "stale", and fall back to defaults — defeating the point of the
peer-read path outside the bundle.

Returning null when `BUNDLE_RUN_STARTED_AT_MS` is unset/invalid keeps
the freshness gate scoped to its real purpose (across-bundle-tick
staleness) and lets standalone runs skip the gate entirely. Consumers
check `bundleStartMs != null` before applying the comparison; see the
companion `seed-sovereign-wealth.mjs` change on the stacked PR.
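The contract end to end, as a sketch (helper name from this commit; the consumer-side check mirrors the stacked-PR description, and both bodies are assumptions):

```javascript
// Returns the shared bundle-run timestamp, or null for standalone runs
// (env var unset or not a positive finite number); null disables the gate.
function getBundleRunStartedAtMs(env = {}) {
  const ms = Number(env.BUNDLE_RUN_STARTED_AT_MS);
  return Number.isFinite(ms) && ms > 0 ? ms : null;
}

// Consumer side: apply the freshness comparison only inside a bundle run.
// Inclusive boundary: meta written at the bundle's first millisecond is FRESH.
function isPeerMetaFresh(fetchedAtMs, bundleStartMs) {
  if (bundleStartMs == null) return true; // standalone: operator orders runs
  return fetchedAtMs >= bundleStartMs;
}
```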

* test(seed): SIGTERM cleanup test now verifies Redis DEL + EXPIRE calls

Greptile review P2 on PR #3384: the existing test only asserted exit
code + log line, not that the Redis ops were actually issued. The
log claim was ahead of the test.

Fixture now logs every Upstash fetch call's shape (EVAL / pipeline-
EXPIRE / other) to stderr. Test asserts:

- >=1 EVAL op was issued during SIGTERM cleanup (releaseLock Lua
  script on the lock key)
- >=1 pipeline-EXPIRE op was issued (extendExistingTtl on canonical
  + seed-meta keys)
- The EVAL body carries the runSeed-generated runId (proves it's
  THIS run's release, not a phantom op)
- The EXPIRE pipeline touches both the canonicalKey AND the
  seed-meta key (proves the keys[] array was built correctly
  including the extraKeys merge path)

Full test suite: 6,866 tests pass, typecheck clean.
2026-04-25 00:14:04 +04:00
Elie Habib
4efd286638 fix(energy-atlas): wire 4 panels into App.ts primeTask so they actually fetch (#3386)
Root cause: `App.ts::primeDataForVisiblePanels` is the sole near-viewport
kickoff path for panel `fetchData()` calls. Panel's constructor only
calls `showLoading()`; nothing else in the app triggers fetchData on
attach. Every panel that does real work has a corresponding
`if (shouldPrime('<key>')) primeTask(...)` entry in that table.

All 4 Energy Atlas panels were missing their entries:
- pipeline-status        → PipelineStatusPanel
- storage-facility-map   → StorageFacilityMapPanel
- fuel-shortages         → FuelShortagePanel
- energy-disruptions     → EnergyDisruptionsPanel

User-visible symptom: the 4 panels shipped as part of #3366 / #3378 /
the Energy Atlas PR chain rendered their headers + spinner but never
left "Loading…". Verified all four upstream RPCs return data:
- /api/supply-chain/v1/list-pipelines?commodityType=    → 100 KB
- /api/supply-chain/v1/list-storage-facilities          → 115 KB
- /api/supply-chain/v1/list-fuel-shortages              → 21 KB
- /api/supply-chain/v1/list-energy-disruptions          → 36 KB
and /api/health reports all 5 backing Redis keys as OK with the
expected record counts (75 gas + 75 oil + 200 storage + 29 shortages
+ 52 disruptions).

Fix: 4 primeTask entries mirroring the pattern already used for
energy-crisis, hormuz-tracker, oil-inventories, etc. Each panel
already implements the full fetch path (bootstrap-cache-first,
RPC fallback, setCached... back-propagation, error states); the
entries just give App.ts a reason to call it.
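The pattern the four entries follow, sketched with injected dependencies (`shouldPrime`/`primeTask` are App.ts's names per this commit; the panel accessors here are illustrative):

```javascript
// One guard-and-prime entry per panel, mirroring the existing table rows
// for energy-crisis / hormuz-tracker / oil-inventories.
function registerEnergyAtlasPrimeEntries(shouldPrime, primeTask, panels) {
  if (shouldPrime('pipeline-status')) primeTask(() => panels.pipelineStatus.fetchData());
  if (shouldPrime('storage-facility-map')) primeTask(() => panels.storageFacilityMap.fetchData());
  if (shouldPrime('fuel-shortages')) primeTask(() => panels.fuelShortage.fetchData());
  if (shouldPrime('energy-disruptions')) primeTask(() => panels.energyDisruptions.fetchData());
}
```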

Placement: immediately after oil-inventories, grouping the energy
surface together.
2026-04-24 23:58:40 +04:00
Elie Habib
ce797da3a4 chore(energy-atlas): backfill productClass on oil pipelines + enforce enum (#3383)
* chore(energy-atlas): backfill productClass on all oil pipelines + enforce enum

Prior state: 12/75 oil pipelines carried a `productClass: "crude"` tag;
63/75 did not. The field had zero consumers anywhere in the codebase
(no validator, no server handler, no frontend reader) — orphan metadata
from partial curation. Inconsistency spotted during the energy-data
audit after the Energy Atlas PR chain landed.

Changes:

1. Backfill all 63 missing entries with one of three values based on
   the pipeline's name/operator/route:
   - `crude` (70 total): crude-oil trunks, gathering lines, export
     systems. Covers Druzhba, Enbridge Mainline, Keystone-XL, CPC,
     BTC, ESPO, Sumed, Forties, Brent, OCP, OCENSA, EACOP, LAPSSET,
     etc.
   - `products` (4 total): explicit refined-product pipelines —
     Abqaiq-Yanbu Products Line, Vadinar-Kandla, Yangzi-Hefei-Hangzhou,
     Tuxpan-Mexico City.
   - `mixed` (1 total): Salina Cruz-Minatitlán, the only dual-use
     crude/products bridge in the set.

2. Promote productClass from orphan metadata to a schema invariant:
   - Oil pipelines MUST declare one of {crude, products, mixed}.
   - Gas pipelines MUST NOT carry the field (commodity IS its own
     class there).
   - Enforced in scripts/_pipeline-registry.mjs::validateRegistry.

3. Five new test assertions in tests/pipelines-registry.test.mts
   cover both the data invariant (every oil entry has a valid value;
   no gas entry has one) and the validator behavior (rejects missing,
   rejects unknown enum value, rejects gas-with-productClass).
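A sketch of the invariant from item 2. The real enforcement lives in `scripts/_pipeline-registry.mjs::validateRegistry`; this reduced validator just pins the three rejection cases the tests cover:

```javascript
const VALID_OIL_PRODUCT_CLASSES = new Set(['crude', 'products', 'mixed']);

// Oil pipelines MUST declare a valid productClass; gas pipelines MUST NOT
// carry the field at all (commodity is its own class there).
function validateProductClass(entry) {
  if (entry.commodityType === 'oil') {
    if (!VALID_OIL_PRODUCT_CLASSES.has(entry.productClass)) {
      throw new Error(`${entry.id}: oil pipeline needs productClass in {crude, products, mixed}`);
    }
  } else if ('productClass' in entry) {
    throw new Error(`${entry.id}: gas pipeline must not carry productClass`);
  }
}
```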

File formatting: the oil registry mixes two styles — multi-line (each
field on its own line) and compact (several fields packed onto one
line). The insertion preserves the local style for each entry by
reusing the whitespace that follows `"commodityType": "oil",`.

No runtime consumers yet; this lands the data hygiene so future
downstream work (crude-vs-products split on the map, refined-product
shock calcs) can rely on the field being present and valid.

* fix(energy-atlas): import VALID_OIL_PRODUCT_CLASSES in tests instead of redefining

Greptile P2 on #3383: the test file defined its own inline
`const VALID = new Set(['crude', 'products', 'mixed'])`, mirroring the
registry's `VALID_OIL_PRODUCT_CLASSES`. If a future PR adds a new class
(e.g. `condensate`) to the registry, the inline copy wouldn't update —
the data test would start reporting valid pipelines as failing before
the validator rejects them, creating a confusing diagnostic gap.

Export `VALID_OIL_PRODUCT_CLASSES` from `scripts/_pipeline-registry.mjs`
and import it in the test. Single source of truth; no drift possible.
2026-04-24 23:36:51 +04:00
Elie Habib
20ad5f5be0 chore(tests): add chokepoint-baselines fixture + parity guard (§L #8) (#3379)
* chore(tests): add chokepoint-baselines fixture + parity guard (§L #8)

Closes gap #8 from docs/internal/energy-atlas-registry-expansion.md §L —
the last missing V5-7 golden fixture.

Adds `tests/fixtures/chokepoint-baselines-sample.json` as a snapshot of
the expected buildPayload output shape (7 EIA chokepoints, top-level
source/referenceYear/chokepoints). Extends the existing
`tests/chokepoint-baselines-seed.test.mjs` with a 3-test fixture-parity
describe block that catches any silent drift in:

- Top-level key set (source, referenceYear, chokepoints array length)
- Position-by-position chokepoint entries (id, relayId, mbd)
- updatedAt ISO-parseability (format only — value is volatile)

Doesn't snapshot updatedAt byte-for-byte because it's a per-run
timestamp; parity is scoped to the schema-stable fields downstream
consumers depend on (CountryDeepDivePanel transit-chokepoint scoring,
shock RPC CHOKEPOINT_EXPOSURE lookup, chokepoint-flows calibration).

If a future change adds/removes an entry or renames a field, this
suite fails until the fixture is updated alongside — making schema
drift a deliberate reviewed action rather than a silent shift.

Test plan:
- [x] `npx tsx --test tests/chokepoint-baselines-seed.test.mjs` — 17/17 pass
- [x] `npm run typecheck` — clean
- [x] `npm run test:data` — 6697/6697 pass (+3 new fixture-parity cases)

* fix(tests): validate fixture against buildPayload + assert all fields (review P2)

Codex P2: the parity check validated entries against the CHOKEPOINTS
constant, not buildPayload().chokepoints — so it guarded the source
array rather than the seeded wire contract the fixture claims to
snapshot. If buildPayload ever transforms entries (coerce, reorder,
normalize), the check would miss it.

Also P2: the fixture contained richer fields (name, lat, lon) but the
old assert only checked id/relayId/mbd — most of the fixture realism
was unused and produced false confidence.

Fix:
- Parity loop now iterates payload.chokepoints (seeded output, not
  the raw source array) and asserts id, relayId, name, mbd, lat, lon
  per entry.
- Added an entry-key-set assertion that catches added/removed fields
  between seed and fixture — forces deliberate evolution rather than
  silent drift.

18 tests pass (was 17), typecheck clean.
2026-04-24 19:09:36 +04:00
Elie Habib
3d2dce3be1 feat(energy-atlas): promote Atlas map layers to FULL variant (§R #3 = B) (#3366)
* feat(energy-atlas): promote Atlas map layers to FULL variant (§R #3 = B)

Per plan §R/#3 decision B: the Redis-backed evidence registries
(75 gas + 75 oil pipelines, 200 storage facilities, 29 fuel shortages)
are now toggleable on the main worldmonitor.app map. Previously they
were hardcoded energy-variant-only, and FULL users who toggled
`pipelines: true` got the ~20-entry legacy static PIPELINES list.

Changes:
- `src/components/DeckGLMap.ts`: drop the `SITE_VARIANT === 'energy'`
  gates at :1511-1541. The pipelines layer now always uses
  `createEnergyPipelinesLayer()` (Redis-backed evidence registry);
  `createPipelinesLayer` (legacy static) is left in the file as dead
  code pending a separate cleanup PR that also retires
  `src/config/pipelines.ts`. Storage and fuel-shortage layers are
  now gated only on the variant's `mapLayers.storageFacilities` /
  `mapLayers.fuelShortages` booleans.
- `src/config/panels.ts`: add `storageFacilities: false` +
  `fuelShortages: false` to FULL_MAP_LAYERS (desktop + mobile) so
  the keys exist for toggle dispatch; default off so users opt in.
- `src/config/map-layer-definitions.ts`: extend the `full` variant's
  VARIANT_LAYER_ORDER to include `storageFacilities` and
  `fuelShortages`, so `getAllowedLayerKeys('full')` admits them and
  the layer picker surfaces them.
- `src/config/commands.ts`: add CMD+K toggles
  `layer:storageFacilities` and `layer:fuelShortages` next to the
  existing `layer:pipelines`.

Finance + commodity variants already had `pipelines: true`; they
now render the more comprehensive Redis-backed 150-entry dataset
instead of the ~20-entry legacy list. A variant that doesn't want
this can set `pipelines: false` in its MAP_LAYERS config.

Part of docs/internal/energy-atlas-registry-expansion.md §R.

* fix(energy-atlas): restrict storageFacilities + fuelShortages to flat renderer

Reviewer (Codex) found two gaps in PR #3366:

1. GlobeMap 3D toggles did nothing. LAYER_REGISTRY declared both new
   layers with the default ['flat', 'globe'] renderers, so the toggle
   showed up in globe mode. But GlobeMap.ts has no rendering support:
   ensureStaticDataForLayer (:2160) only handles cables/pipelines/etc.,
   and the layer-channel map (:2484) has no entries for either. Users
   in globe mode saw the toggle and got silent no-ops.

2. SVG/mobile fallback (Map.ts fullLayers at :381) also has no render
   path for these data types. The existing cyberThreats precedent at
   :387 documents this as an intentional DeckGL-only pattern.

Fix:
- Restrict both LAYER_REGISTRY entries to ['flat'] explicitly. The
  layer picker hides the toggle in globe mode instead of exposing a
  no-op. Comment points to the GlobeMap gap so a future globe-rendering
  PR knows what to undo.
- Extend the existing cyberThreats note in Map.ts:387 to cover
  storageFacilities + fuelShortages too, noting they're already
  hidden from globe mode via the LAYER_REGISTRY restriction.

This is the smallest possible fix consistent with the pre-existing
pattern. Full globe-mode rendering for these layers is out of scope —
tracked separately as a follow-up.

* fix(energy-atlas): gate layer:* CMD+K by current renderer + DeckGL state

Reviewer follow-up on PR #3366: the previous fix restricted
LAYER_REGISTRY renderers to ['flat'] so the globe-mode layer picker
hides storageFacilities / fuelShortages toggles. But CMD+K was still
callable — SearchModal.matchCommands didn't filter `layer:*` commands
by renderer, so a user could CMD+K "storage layer" in globe or SVG
mode and trigger a silent no-op.

Fix — centralize "can this layer render right now?" in one helper:

- Add `deckGLOnly?: boolean` to LayerDefinition. `renderers: ['flat']`
  is not enough because `'flat'` covers both DeckGL-flat and SVG-flat,
  and the SVG/mobile fallback has no render path for either layer.
  Mark both as `deckGLOnly: true`.
- New `isLayerExecutable(key, renderer, isDeckGLActive)` helper in
  map-layer-definitions.ts. Returns true iff renderers include the
  current renderer AND (if deckGLOnly) DeckGL is active.
- `SearchModal.setLayerExecutableFn(fn)`: caller-supplied predicate
  used in both `matchCommands` (search results) and
  `renderAllCommandsList` (full picker).
- `search-manager` wires the predicate using `ctx.map.isGlobeMode()`
  + `ctx.map.isDeckGLActive()`, and also adds a symmetric guard in
  the `layer:` dispatch case so direct activations (keyboard
  accelerator, programmatic invocation) bail the same way.
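The helper's contract, sketched (LayerDefinition reduced to the two fields the predicate reads; real definitions carry more):

```javascript
// True iff the layer can actually render right now: the current renderer
// must be declared, and deckGLOnly layers additionally need DeckGL active
// ('flat' alone covers both DeckGL-flat and SVG-flat).
function isLayerExecutable(def, currentRenderer, isDeckGLActive) {
  if (!def) return false; // unknown layer key
  if (!def.renderers.includes(currentRenderer)) return false;
  if (def.deckGLOnly && !isDeckGLActive) return false; // SVG fallback
  return true;
}
```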

Pre-existing resilienceScore DeckGL gate at search-manager:494 kept as
a belt-and-suspenders — the new isLayerExecutable check already
covers it since resilienceScore has `renderers: ['flat']` (though it
lacks deckGLOnly). Left the specific check in place to avoid scope
creep on a working guard.

Typecheck clean, 6694/6694 tests pass.

* fix(energy-atlas): filter CMD+K layer commands by variant too

Greptile P2 on commit 3f7a40036: `layer:storageFacilities` and
`layer:fuelShortages` still surface in CMD+K on tech / finance /
commodity / happy variants (where they're not in VARIANT_LAYER_ORDER).
Renderer + DeckGL filter was passing because those variants run flat
DeckGL. Dispatch silently failed at the `variantAllowed` guard in
handleCommand (:491), producing an invisible no-op from the user's
POV.

Fix: extend `setLayerExecutableFn` predicate to also check
`getAllowedLayerKeys(SITE_VARIANT).has(key)` before the renderer
checks. SearchModal now hides these commands on non-full/non-energy
variants where they can't execute.

This also tidies the pre-existing pattern for other variant-specific
layer commands (e.g. layer:nuclear on the tech variant), which
Greptile flagged as "consistent with how other variant-specific layer
commands already behave today" — they now all route through the same
predicate.

* fix(energy-atlas): gate layers:* presets + add isLayerExecutable tests (review P2)

Two Codex P2 findings on this PR:

1. `layers:*` presets bypassed the renderer/DeckGL gate.
   `search-manager.ts:481` checked only `allowed.has(layer)` before
   flipping a preset layer on. A user in globe mode or on SVG
   fallback who ran `layers:all` or `layers:infra` would silently
   set `deckGLOnly` layers (storageFacilities, fuelShortages) to
   true — toggles with no rendered output, and since the picker
   hides those layers under the current renderer the user had no
   way to toggle them back off without switching modes.

   Fix: funnel presets through the same `isLayerExecutable`
   predicate per-layer CMD+K already uses. `executable(k)` combines
   the existing `allowed.has` variant check with the renderer + DeckGL
   gate, so presets now match the per-layer dispatch behavior exactly.

2. No regression tests for the `deckGLOnly` / `isLayerExecutable`
   contract, despite it being behavior-critical renderer gating.

   Fix: added `tests/map-layer-executable.test.mts` — 16 cases:
   - Flag assertions: storageFacilities + fuelShortages carry
     `deckGLOnly: true` and renderers: ['flat']. Layers without the
     flag (pipelines, conflicts, cables) have it `undefined`, not
     accidentally `false`.
   - Renderer-gate cases: deckGLOnly layers pass only on flat + DeckGL
     active, not on SVG fallback, not on globe. Flat-only non-deckGLOnly
     layers (ciiChoropleth) pass on flat regardless of DeckGL status.
     Dual-renderer layers (pipelines) pass on both flat and globe.
     Unknown layer keys return false.
   - Exhaustive 2×2×2 matrix across (renderer, isDeckGL, deckGLOnly)
     using representative layer keys for each shape.

All 16 new tests pass. Full test:data suite still green. Typecheck clean.

* fix(energy-atlas): add pipeline-status to finance + commodity panel sets (review P1)

Codex P1: FINANCE_MAP_LAYERS and COMMODITY_MAP_LAYERS both carry
`pipelines: true`, and PR #3366 unified all variants on
`createEnergyPipelinesLayer` which dispatches
`energy:open-pipeline-detail` on row click. The listener for that
event lives in PipelineStatusPanel.

`PanelLayoutManager.createPanel()` only instantiates panels whose keys
are present in `panelSettings`, which derives from FULL_PANELS /
FINANCE_PANELS / etc. — so on finance and commodity variants the
listener never existed, and pipeline clicks were a silent no-op.

Fix: add `pipeline-status` to both FINANCE_PANELS and COMMODITY_PANELS
with `enabled: false` (panel slot not auto-opened; users invoke it by
clicking a pipeline on the map or via CMD+K). The panel now
instantiates on both variants and the click-through works end to end.

FULL_PANELS + ENERGY_PANELS already had the key from earlier PRs;
no change there.

Typecheck clean, test:data 6696/6696 pass.
2026-04-24 19:09:21 +04:00
Elie Habib
73cd8a9c92 feat(energy-atlas): EnergyDisruptionsPanel standalone timeline (§L #4) (#3378)
* feat(energy-atlas): EnergyDisruptionsPanel standalone timeline (§L #4)

Closes gap #4 from docs/internal/energy-atlas-registry-expansion.md §L.
Before this PR, the 52 disruption events in `energy:disruptions:v1`
were only reachable by drilling into a specific pipeline or storage
facility — PipelineStatusPanel and StorageFacilityMapPanel each render
an asset-scoped slice of the log inside their drawers, but no surface
listed the global event log. This panel makes the full log
first-class.

Shape:
- Reverse-chronological table (newest first) of every event.
- Filter chips: event type (sabotage, sanction, maintenance, mechanical,
  weather, war, commercial, other) + "ongoing only" toggle.
- Row click dispatches the existing `energy:open-pipeline-detail` or
  `energy:open-storage-facility-detail` CustomEvent with `{assetId,
  highlightEventId}` — no new open-panel protocol introduced. Mirrors
  the CountryDeepDivePanel disruption row contract from PR #3377.
- Uses `src/shared/disruption-timeline.ts` formatters
  (formatEventWindow, formatCapacityOffline, statusForEvent) that
  PipelineStatus/StorageFacilityMap already use — consistent UI across
  all three disruption surfaces.

Wiring:
- `src/components/EnergyDisruptionsPanel.ts` — new (~230 lines).
- `src/components/index.ts` — export.
- `src/app/panel-layout.ts` — `this.createPanel('energy-disruptions',
  () => new EnergyDisruptionsPanel())` alongside the other three
  atlas panels at :892.
- `src/config/panels.ts` — add to `FULL_PANELS` (priority 2, next to
  fuel-shortages) + `ENERGY_PANELS` (priority 1, top tier) +
  `PANEL_CATEGORY_MAP.marketsFinance` list alongside the other
  atlas panels.
- `src/config/commands.ts` — CMD+K entry `panel:energy-disruptions`
  with keywords matching the user vocabulary (sabotage, sanctions
  events, force majeure, drone strike, nord stream sabotage).

Not done in this PR:
- No new map pin layer — per plan §Q (Codex approved), disruptions
  stay a tabular/timeline surface; map assets (pipelines + storage)
  already show disruption markers on click.
- No direct globe-mode or SVG-fallback rendering needs — panel is
  pure DOM, not a map layer.

Test plan:
- [x] npm run typecheck (clean)
- [x] npm run test:data (6694/6694 pass)
- [ ] Manual: CMD+K "disruption log" → panel opens with 52 events,
      newest first. Click "Sabotage" chip → narrows to sabotage events
      only. Click a Nord Stream row → PipelineStatusPanel opens with
      that event highlighted.

* fix(energy-atlas): drop highlightEventId emission + respect empty-state (review P2)

Two Codex P2 findings on this PR:

1. Row click dispatched `highlightEventId` but neither
   PipelineStatusPanel nor StorageFacilityMapPanel consumes it. The
   UI's implicit promise (event-specific highlighting) wasn't
   delivered — clickthrough was asset-generic, and the extra field
   on the wire was a misleading API surface.

   Fix: drop `highlightEventId` from the dispatched detail. Row click
   now opens the asset drawer with just {pipelineId, facilityId}, the
   fields the receivers actually consume. User sees the full
   disruption timeline for that asset and locates the event visually.

   A future PR can add real highlight support by:
     - drawers accept `highlightEventId` in their openDetailHandler
     - loadDetail stores it and renderDisruptionTimeline scrolls +
       emphasises the matching event
     - re-add `highlightEventId` to the dispatch here, symmetrically
       in CountryDeepDivePanel (which has the same wire emission)

   The internal `_eventId` parameter is kept as a plumb-through so
   that future work is a drawer-side change, not a re-plumb.

2. `events.length === 0` was conflated with `upstreamUnavailable` and
   triggered the error UI. The server contract (list-energy-disruptions
   handler) returns `upstreamUnavailable: false` with an empty events
   array when Redis is up but has no entries matching the filter — a
   legitimate empty state, not a fetch failure.

   Fix: gate `showError` on `upstreamUnavailable` alone. Empty results
   fall through to the normal render, where the table's
   `No events match the current filter` row already handles the case.
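The corrected gate, as a one-line sketch:

```javascript
// Only an upstream failure shows the error UI; an empty events array is a
// legitimate state and falls through to the table's own empty-state row.
function shouldShowError(response) {
  return response.upstreamUnavailable === true; // NOT events.length === 0
}
```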

Typecheck clean, test:data 6694/6694 pass.

* fix(energy-atlas): event delegation on persistent content (review P1)

Codex P1: Panel.setContent() debounces the DOM write by 150ms (see
Panel.ts:1025), so attaching listeners in render() via
`this.element.querySelector(...)` targets the STALE DOM — chips,
rows, and the ongoing-toggle button are silently non-interactive.
Visually the panel renders correctly after the debounce fires, but
every click is permanently dead.

Fix: register a single delegated click handler on `this.content`
(persistent element) in the constructor. The handler uses
`closest('[data-filter-type]')`, `closest('[data-toggle-ongoing]')`,
and `closest('tr.ed-row')` to route by data-attribute. Works
regardless of when setContent flushes or how many times render()
rewrites the inner HTML.
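The delegation idea, sketched with a simplified element shape instead of the real DOM (the actual handler uses Element.closest on this.content; the data-attribute keys here are illustrative):

```typescript
// Minimal stand-in for a DOM node so the routing logic is testable in isolation.
type El = { dataset: Record<string, string>; parent?: El };

// Walk up from the click target looking for an ancestor carrying a data
// attribute, mimicking closest('[data-filter-type]') and friends.
function closestWithData(el: El | undefined, key: string): El | undefined {
  for (let cur = el; cur; cur = cur.parent) {
    if (key in cur.dataset) return cur;
  }
  return undefined;
}

// One delegated handler on the persistent container routes every click,
// no matter how often the debounced setContent() rewrote the inner HTML.
function routeClick(target: El): "filter" | "toggle-ongoing" | "open-row" | "none" {
  if (closestWithData(target, "filterType")) return "filter";
  if (closestWithData(target, "toggleOngoing")) return "toggle-ongoing";
  if (closestWithData(target, "row")) return "open-row";
  return "none";
}
```

Because the handler lives on the persistent parent, listeners never have to be re-attached after a debounced DOM write.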

Also fixes Codex P2 on the same PR: filterEvents() was called twice
per render (once for row HTML, again for filteredCount). Now computed
once, reused. Trivial for 52 events but eliminates the redundant sort.

Typecheck clean.

* fix(energy-atlas): remap orphan disruption assetIds to real pipelines

Two events referenced pipeline ids that do not exist in
scripts/data/pipelines-oil.json:

- cpc-force-majeure-2022: assetId "cpc-pipeline" → "cpc"
- pdvsa-designation-2019: assetId "ve-petrol-2026-q1"
  → "venezuela-anzoategui-puerto-la-cruz"

Without this, clicking those rows in EnergyDisruptionsPanel
dead-ends at "Pipeline detail unavailable", so the panel
shipped with broken navigation on real data.

Mirrors the same fix on PR #3377 (gap #5a registry); applying
it on this branch as well so PR #3378 is independently
correct regardless of merge order. The two changes will dedupe
cleanly on rebase since the edits are byte-identical.
2026-04-24 19:09:05 +04:00
Elie Habib
7c0c08ad89 feat(energy-atlas): seed-side countries[] denorm on disruptions + CountryDeepDive row (§R #5 = B) (#3377)
* feat(energy-atlas): seed-side countries[] denorm + CountryDeepDive row (§R #5 = B)

Per plan §R/#5 decision B: denormalise countries[] at seed time on each
disruption event so CountryDeepDivePanel can filter events per country
without an asset-registry round trip. Schema join (pipeline/storage
→ event.assetId) happens once in the weekly cron, not on every panel
render. The alternative (client-side join) was rejected because it
couples UI logic to asset-registry internals and duplicates the join
for every surface that wants a per-country filter.

Changes:
- `proto/.../list_energy_disruptions.proto`: add `repeated string
  countries = 15` to EnergyDisruptionEntry with doc comment tying it
  to the plan decision and the always-non-empty invariant.
- `scripts/_energy-disruption-registry.mjs`:
    • Load pipeline-gas + pipeline-oil + storage-facilities registries
      once per seed cycle; index by id.
    • `deriveCountriesForEvent()` resolves assetId to {fromCountry,
      toCountry, transitCountries} (pipeline) or {country} (storage),
      deduped + alpha-sorted so byte-diff stability holds.
    • `buildPayload()` attaches the computed countries[] to every
      event before writing.
    • `validateRegistry()` now requires non-empty countries[] of
      ISO2 codes. Combined with the seeder's `emptyDataIsFailure:
      true`, this surfaces orphaned assetIds loudly — the next cron
      tick fails validation and seed-meta stays stale, tripping
      health alarms.
- `scripts/data/energy-disruptions.json`: fix two orphaned assetIds
  that the new join caught:
    • `cpc-force-majeure-2022`: `cpc-pipeline` → `cpc` (matches the
      entry in pipelines-oil.json).
    • `pdvsa-designation-2019`: `ve-petrol-2026-q1` (non-existent) →
      `venezuela-anzoategui-puerto-la-cruz`.
- `server/.../list-energy-disruptions.ts`: project countries[] into
  the RPC response via coerceStringArray. Legacy pre-denorm rows
  surface as empty array (always present on wire, length 0 => old).
- `src/components/CountryDeepDivePanel.ts`: add 4th Atlas row —
  "Energy disruptions in {iso2}" — filtered by `iso2 ∈ countries[]`.
  Failure is silent; EnergyDisruptionsPanel (upcoming) is the
  primary disruption surface.
- `tests/energy-disruptions-registry.test.mts`: switch to validating
  the buildPayload output (post-denorm), add §R #5 B invariant
  tests, plus a raw-JSON invariant ensuring curators don't hand-edit
  countries[] (it's derived, not declared).
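A sketch of the derivation step (registry shapes are assumed from the description above; the real seeder loads the JSON registries once per cycle and indexes by id):

```typescript
// Hypothetical registry entry shapes approximating pipelines-gas/oil and
// storage-facilities records.
type Pipeline = { id: string; fromCountry: string; toCountry: string; transitCountries?: string[] };
type Storage = { id: string; country: string };

function deriveCountriesForEvent(
  assetId: string,
  pipelines: Map<string, Pipeline>,
  storage: Map<string, Storage>,
): string[] {
  const p = pipelines.get(assetId);
  const s = storage.get(assetId);
  const raw = p
    ? [p.fromCountry, p.toCountry, ...(p.transitCountries ?? [])]
    : s
      ? [s.country]
      : [];
  // Orphaned assetIds fail loudly so the cron tick trips validation.
  if (raw.length === 0) throw new Error(`orphaned assetId: ${assetId}`);
  // Dedupe + alpha-sort so the seeded payload stays byte-diff stable.
  return [...new Set(raw)].sort();
}
```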

Proto regen note: `make generate` currently fails with a duplicate
openapi plugin collision in buf.gen.yaml (unrelated bug — 3 plugin
entries emit to the same out dir). Worked around by temporarily
trimming buf.gen.yaml to just the TS plugins for this regen. Added
only the `countries: string[]` wire field to both service_client and
service_server; no other generated-file drift in this PR.

* chore(proto): regenerate openapi specs for countries[] field

Runs `make generate` with the sebuf v0.11.1 plugin now correctly
resolved via the PATH fix (cherry-picked from fix/makefile-generate-path-prefix).
The new `countries` field on EnergyDisruptionEntry propagates into:

- docs/api/SupplyChainService.openapi.yaml (primary per-service spec)
- docs/api/SupplyChainService.openapi.json (machine-readable variant)
- docs/api/worldmonitor.openapi.yaml (consolidated bundle)

No TypeScript drift beyond the already-committed service_client.ts /
service_server.ts updates in 80797e7cc.

* fix(energy-atlas): drop highlightEventId emission (review P2)

Codex P2: loadDisruptionsForCountry dispatched `highlightEventId` but
neither PipelineStatusPanel nor StorageFacilityMapPanel consumes it
(the openDetailHandler reads only pipelineId / facilityId). The UI's
implicit promise (event-specific highlighting) wasn't delivered —
clickthrough was asset-generic, and the extra wire field was a
misleading API surface.

Fix: emit only {pipelineId, facilityId} in the dispatched detail.
Row click opens the asset drawer; user sees the full per-asset
disruption timeline and locates the event visually.

Symmetric fix for PR #3378's EnergyDisruptionsPanel — both emitters
now match the drawer contract exactly. Re-add `highlightEventId`
here when the drawer panels ship matching consumer code
(openDetailHandler accepts it, loadDetail stores it,
renderDisruptionTimeline scrolls + emphasises the matching event).

Typecheck clean, test:data 6698/6698 pass.

* fix(energy-atlas): collision detection + abort signal + label clamp (review P2)

Three Codex P2 findings on PR #3377:

1. `loadAssetRegistries()` spread-merged gas + oil pipelines, silently
   overwriting entries on id collision. No collision today, but a
   curator adding a pipeline under the same id to both files would
   cause `deriveCountriesForEvent` to return wrong-commodity country
   data with no test flagging it.

   Fix: explicit merge loop that throws on duplicate id. The next
   cron tick fails validation, seed-meta stays stale, health alarms
   fire — same loud-failure pattern the rest of the seeder uses.

2. `loadDisruptionsForCountry` didn't thread `this.signal` through
   the RPC fetch shim. The stale-closure guard (`currentCode !== iso2`)
   discarded stale RESULTS, but the in-flight request couldn't be
   cancelled when the user switched countries or closed the panel.

   Fix: wrap globalThis.fetch with { signal: this.signal } in the
   client factory, matching the signal lifecycle the rest of the
   panel already uses.

3. `shortDescription` values up to 200 chars rendered without
   ellipsis in the compact Atlas row, overflowing the row layout.

   Fix: new `truncateDisruptionLabel` helper clamps to 80 chars with
   ellipsis. Full text still accessible via click-through to the
   asset drawer.
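The throw-on-duplicate merge from item 1 might look like this (a sketch; everything except the duplicate-id failure behavior is hypothetical):

```typescript
// A spread-merge would silently let the later registry win on id collision;
// this explicit loop fails loudly instead, so validation trips on the next
// cron tick rather than shipping wrong-commodity country data.
function mergeById<T extends { id: string }>(...registries: T[][]): Map<string, T> {
  const merged = new Map<string, T>();
  for (const registry of registries) {
    for (const entry of registry) {
      if (merged.has(entry.id)) {
        throw new Error(`duplicate asset id across registries: ${entry.id}`);
      }
      merged.set(entry.id, entry);
    }
  }
  return merged;
}
```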
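The clamp from item 3 can be sketched as follows (the helper name comes from the commit; the exact truncation arithmetic is an assumption):

```typescript
// Clamp long shortDescription values for the compact Atlas row. Full text
// remains reachable via click-through to the asset drawer.
const MAX_LABEL = 80;

function truncateDisruptionLabel(label: string, max = MAX_LABEL): string {
  if (label.length <= max) return label;
  // Reserve one character for the ellipsis so the result never exceeds max.
  return `${label.slice(0, max - 1)}…`;
}
```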

Typecheck clean, test:data 6698/6698 pass.
2026-04-24 19:08:07 +04:00
Elie Habib
a04c53fe26 fix(build): pin sebuf plugin via PATH in make generate (#3371)
* fix(build): pin sebuf plugin via PATH in `make generate`

Without this, developers who have an older sebuf protoc-gen-openapiv3
binary installed via a package manager (Homebrew ships v0.7.0 at
/opt/homebrew/bin) hit this failure on a fresh `make generate`:

    Failure: file ".../AviationService.openapi.yaml" was generated
    multiple times: once by plugin "protoc-gen-openapiv3" and again
    by plugin "protoc-gen-openapiv3"

Root cause: `buf.gen.yaml` declares three `protoc-gen-openapiv3`
invocations — the default yaml, a `format=json` variant, and a
`bundle_only=true, strategy: all` unified bundle. Sebuf v0.11.x
honors both `format=json` (emits .json extension) and `bundle_only=true`
(skips per-service emission), so the three invocations write distinct
files. Sebuf v0.7.x does NOT honor either option — it silently emits
the same per-service .yaml filenames from all three invocations and buf
rejects the collision.

`Makefile: install-plugins` installs v0.11.1 (SEBUF_VERSION) to
$HOME/go/bin. But the `generate` target doesn't prepend that to PATH,
so `which protoc-gen-openapiv3` resolves to the stale Homebrew binary
for anyone with both installed.

Verified by `go version -m`:
    /opt/homebrew/bin/protoc-gen-openapiv3 — mod sebuf v0.7.0
    /Users/eliehabib/go/bin/protoc-gen-openapiv3 — mod sebuf v0.11.1

Fix: prepend $$HOME/go/bin to PATH in the `generate` recipe. Matches
what .husky/pre-push:151-153 already does before invoking this target,
so CI and local behavior converge. No sebuf upstream bug.

* fix(build): follow GOBIN-then-GOPATH/bin for plugin PATH prefix

Reviewer (Codex) on PR #3371: the previous patch hardcoded
$HOME/go/bin, which is only the default fallback when GOBIN is unset
AND GOPATH defaults to $HOME/go. On machines with a custom GOBIN or
a non-default GOPATH, `go install` targets a different directory — so
hardcoding $HOME/go/bin can force a stale binary from there to win
over the freshly-installed SEBUF_VERSION sitting at the actual install
location.

Fix: resolve the install dir the same way `go install` does:
  GOBIN first, then GOPATH/bin.

Shell expression: `go env GOBIN` returns an empty string (exit 0) when
unset, so `||` alone doesn't cascade. Using explicit `[ -n "$gobin" ]`
instead.

Also dropped the misleading comment that claimed the pre-push hook
used the same rule — it still hardcodes $HOME/go/bin. Called that out
in a note, but left the hook alone because its PATH prepend is
belt-and-suspenders (only matters for locating `buf` itself; the
Makefile's own recipe-level prepend decides plugin resolution).

Verified on a machine with empty GOBIN:
    resolved → /Users/eliehabib/go/bin
And `make generate` succeeds without manual PATH overrides.

* fix(build): use first GOPATH entry for plugin PATH prefix

Reviewer (Codex) on commit 6db0b53c2: the GOBIN-empty fallback used
`$(go env GOPATH)/bin`, which silently breaks on setups where GOPATH
is a colon-separated list. Example:

    GOPATH=/p1:/p2
    previous code → "/p1:/p2/bin"
       ^ two PATH entries; neither is the actual install target /p1/bin

`go install` writes binaries only into the first GOPATH entry's bin,
so the stale-plugin case this PR is trying to fix can still bite.

Fix: extract the first entry via `cut -d: -f1`. Matches Go's own
behavior in cmd/go/internal/modload/init.go:gobin(), which uses
filepath.SplitList + [0].
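The shell resolution rule can be modeled as a pure function (a sketch for clarity; the actual recipe does this with `go env` and `cut`):

```typescript
// GOBIN wins when set; otherwise take the FIRST entry of a possibly
// colon-separated GOPATH and append /bin — matching where `go install`
// actually writes binaries.
function resolveGoInstallDir(gobin: string, gopath: string): string {
  if (gobin !== "") return gobin;
  const first = gopath.split(":")[0]; // shell equivalent: cut -d: -f1
  if (first === "") throw new Error("cannot resolve Go install dir");
  return `${first}/bin`;
}
```

Note this encodes the Unix colon delimiter; Windows uses semicolons, which the PR explicitly leaves out of scope.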

Verified:
- Single-entry GOPATH (this machine) → /Users/eliehabib/go/bin ✓
- Simulated GOPATH=/fake/path1:/fake/path2 → /fake/path1/bin ✓
- make generate succeeds in both cases.

* fix(build): resolve buf via caller PATH; prepend plugin dir only (review P1/P3)

Codex P1: the previous recipe prepended the Go install dir to PATH
before invoking `buf generate`, which also changed which `buf` binary
ran. On a machine with a stale buf in GOBIN/$HOME/go/bin, the recipe
would silently downgrade buf itself and reintroduce version-skew
failures — the exact class of bug this PR was trying to fix.

Fix: two-stage resolution.

  1. `BUF_BIN=$(command -v buf)` resolves buf using the CALLER's PATH
     (Homebrew, Go install, distro package — whichever the developer
     actually runs day-to-day).
  2. Invoke the resolved buf via absolute path ("$BUF_BIN"), with a
     PATH whose first entry is the Go install dir. That affects ONLY
     plugin lookup inside `buf generate` (protoc-gen-ts-*,
     protoc-gen-openapiv3) — not buf itself, which was already resolved.

Adds a loud failure when `buf` is not on PATH:
    buf not found on PATH — run: make install-buf
Previously a missing buf would cascade into a confusing error deeper
in the pipeline.

Codex P3: added tests/makefile-generate-plugin-path.test.mjs — scrapes
the generate recipe text and asserts:
  - `command -v buf` captures buf before the PATH override
  - Missing-buf case fails loudly
  - buf is invoked via "$BUF_BIN" (absolute path)
  - GOBIN + GOPATH/bin resolution is present
  - Install-dir prepend precedes $$PATH (order matters)
  - The subshell expression resolves on the current machine

Codex P2 (Windows GOPATH semicolon delimiter) is acknowledged but
not fixed here — this repo does not support Windows dev per CLAUDE.md,
the pre-push hook and CI are Unix-only, and a cross-platform
implementation would require a separate Make detection or a
platform-selected helper script. Documented inline as a known
Unix assumption.

Verified:
- `make generate` clean
- `command -v buf` → /opt/homebrew/bin/buf
- protoc-gen-openapiv3 via plugin-PATH → ~/go/bin/protoc-gen-openapiv3
- New test suite 6/6 pass
- npm run typecheck clean

* fix(build): silence recipe comments with @# (review P2)

Codex P2: recipe lines starting with `#` are passed to the shell
(which ignores them) but Make still echoes them to stdout before
execution. Running `make generate` printed all 34 comment lines
verbatim. Noise for developers, and dilutes the signal when the
actual error output matters.

Fix: prefix every in-recipe `#` comment with `@` so Make suppresses
the echo. No semantic change — all comments still read identically
in the source.

Verified: `make generate` now prints only "Clean complete!", the
buf invocation line (silencing it with @ would hide the invocation,
which helps debugging, so it stays audible), and "Code
generation complete!".

* fix(build): fail closed when go missing; verify plugin is present (review High)

Codex High: previous recipe computed PLUGIN_DIR from `go env GOBIN` /
`go env GOPATH` without checking that `go` itself was on PATH. When
go is missing:
  - `go env GOBIN`  fails silently, gobin=""
  - `go env GOPATH` fails silently, cut returns ""
  - printf '%s/bin' "" yields "/bin"
  - PATH becomes "/bin:$PATH" — doesn't override anything
  - `buf generate` falls back to whatever stale sebuf plugin is on
    PATH, reintroducing the exact duplicate-output failure this PR
    was supposed to fix.

Fix (chained in a single shell line so any guard failure aborts):
  1. `command -v go`  — fail with clear "install Go" message.
  2. `command -v buf` — fail with clear "run: make install-buf".
  3. Resolve PLUGIN_DIR via GOBIN / GOPATH[0]/bin.
  4. `[ -n "$$PLUGIN_DIR" ]` — fail if resolution returned empty
     (shouldn't happen after the go-guard, but belt-and-suspenders
     against future shell weirdness).
  5. `[ -x "$$PLUGIN_DIR/protoc-gen-ts-client" ]` — fail if the
     plugin isn't installed, telling the user to run
     `make install-plugins`. Catches the case where `go` exists but
     the user has never installed sebuf locally.
  6. `PATH="$$PLUGIN_DIR:$$PATH" "$$BUF_BIN" generate`.

Verified failure modes:
  - go missing        → "go not found on PATH — run: ... install-plugins"
  - buf missing       → "buf not found on PATH — run: make install-buf"
  - happy path        → clean `Code generation complete!`

Extended tests/makefile-generate-plugin-path.test.mjs with:
  - `fails loudly when go is not on PATH`
  - `verifies the sebuf plugin binary is actually present before invoking buf`
  - Rewrote `PATH override order` to target the new PLUGIN_DIR form.

All 8 tests pass. Typecheck clean.

* fix(makefile): guard all sebuf plugin binaries, not just ts-client

proto/buf.gen.yaml invokes THREE sebuf binaries:

- protoc-gen-ts-client
- protoc-gen-ts-server
- protoc-gen-openapiv3 (× 3 plugin entries)

The previous guard only verified protoc-gen-ts-client was present in
the pinned Go install dir. If the other two were missing from that
dir (or only partially installed by a prior failed `make install-plugins`),
`PATH="$PLUGIN_DIR:$PATH" buf generate` would fall through to whatever
stale copy happened to be earlier on the caller's normal PATH —
exactly the mixed-sebuf-version failure this PR is meant to eliminate.

Fix: iterate every plugin name in a shell `for` loop. Any missing
binary aborts with the same `Run: make install-plugins` remediation
the previous guard showed.

Tests:
- Update the existing plugin-presence assertion to require all three
  binaries by name AND the `[ -x "$PLUGIN_DIR/$p" ]` loop pattern.
- Add a cross-reference test that parses proto/buf.gen.yaml, extracts
  every `local:` plugin, and fails if any declared plugin is missing
  from the Makefile guard list. This catches future drift without
  relying on a human to remember the dependency.
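The drift check reduces to a set difference once the two lists are extracted (a sketch; extraction from buf.gen.yaml and the Makefile is assumed to have happened already):

```typescript
// Every plugin declared `local:` in buf.gen.yaml must appear in the
// Makefile guard list; anything missing re-enables the stale-plugin bug.
function missingFromGuard(declared: string[], guarded: string[]): string[] {
  const guard = new Set(guarded);
  return [...new Set(declared)].filter((p) => !guard.has(p));
}
```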

Closes the PR #3371 High finding that a `ts-server` or `openapiv3`
missing from $PLUGIN_DIR would silently re-enable the stale-plugin bug.

* fix(pre-push): don't shadow caller's buf with stale ~/go/bin/buf

The proto-freshness hook's unconditional
`export PATH="$HOME/go/bin:$PATH"` defeated the Makefile-side
caller-PATH-first invariant: on machines with both a preferred buf
(Homebrew, /usr/local, etc.) AND an older `go install buf@<old>`
leftover at `~/go/bin/buf`, the prepend placed the stale copy first.
`make generate` then resolved buf via `command -v buf` and picked up
the shadowed stale binary — recreating the mixed-version failure
this PR is meant to eliminate.

Fix:

1. Only prepend `$HOME/go/bin` when buf is NOT already on the caller's
   PATH. Now buf's Homebrew/system copy always wins; `~/go/bin/buf` is
   a pure fallback.

2. Widen the plugin-presence check to accept either a PATH-resolvable
   `protoc-gen-ts-client` OR the default go-install location
   `$HOME/go/bin/protoc-gen-ts-client`. `make generate` now resolves
   plugins via its own PLUGIN_DIR (GOBIN, then first-entry GOPATH/bin),
   so requiring them on PATH is too strict.

3. Drop the redundant plugin-only PATH prepend — the Makefile's own
   plugin-path resolution handles it authoritatively.

Tests: add a regression guard that reads the hook, verifies the
prepend is gated on `! command -v buf`, and explicitly asserts the
OLD buggy pattern is not present.

Closes the PR #3371 High finding about the hook's unconditional
prepend defeating the Makefile-side caller-PATH-first invariant.
2026-04-24 19:01:47 +04:00
Sebastien Melki
e68a7147dd chore(api): sebuf migration follow-ups (post-#3242) (#3287)
* chore(api-manifest): rewrite brief-why-matters reason as proper internal-helper justification

Carried in from #3248 merge as a band-aid (called out in #3242 review followup
checklist item 7). The endpoint genuinely belongs in internal-helper —
RELAY_SHARED_SECRET-bearer auth, cron-only caller, never reached by dashboards
or partners. Same shape constraint as api/notify.ts.

Replaces the apologetic "filed here to keep the lint green" framing with a
proper structural justification: modeling it as a generated service would
publish internal cron plumbing as user-facing API surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lint): premium-fetch parity check for ServiceClients (closes #3279)

Adds scripts/enforce-premium-fetch.mjs — AST-walks src/, finds every
`new <ServiceClient>(...)` (variable decl OR `this.foo =` assignment),
tracks which methods each instance actually calls, and fails if any
called method targets a path in src/shared/premium-paths.ts
PREMIUM_RPC_PATHS without `{ fetch: premiumFetch }` on the constructor.

Per-call-site analysis (not class-level) keeps the trade/index.ts pattern
clean — publicClient with globalThis.fetch + premiumClient with
premiumFetch on the same TradeServiceClient class — since publicClient
never calls a premium method.
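The per-call-site rule might be sketched like this (shapes and paths are illustrative; the real lint derives them from AST analysis of the generated clients):

```typescript
// An instance is flagged only if it was built WITHOUT premiumFetch AND
// actually calls a method whose RPC path is premium — so a public client
// on a class that also has premium methods stays clean.
interface ClientInstance {
  usesPremiumFetch: boolean;
  calledMethods: string[];
}

function violations(
  inst: ClientInstance,
  methodToPath: Map<string, string>,
  premiumPaths: Set<string>,
): string[] {
  if (inst.usesPremiumFetch) return [];
  return inst.calledMethods.filter((m) => {
    const path = methodToPath.get(m);
    return path !== undefined && premiumPaths.has(path);
  });
}
```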

Wired into:
- npm run lint:premium-fetch
- .husky/pre-push (right after lint:rate-limit-policies)
- .github/workflows/lint-code.yml (right after lint:api-contract)

Found and fixed three latent instances of the HIGH(new) #1 class from
#3242 review (silent 401 → empty fallback for signed-in browser pros):

- src/services/correlation-engine/engine.ts — IntelligenceServiceClient
  built with no fetch option called deductSituation. LLM-assessment overlay
  on convergence cards never landed for browser pros without a WM key.
- src/services/economic/index.ts — EconomicServiceClient with
  globalThis.fetch called getNationalDebt. National-debt panel rendered
  empty for browser pros.
- src/services/sanctions-pressure.ts — SanctionsServiceClient with
  globalThis.fetch called listSanctionsPressure. Sanctions-pressure panel
  rendered empty for browser pros.

All three swap to premiumFetch (single shared client, mirrors the
supply-chain/index.ts justification — premiumFetch no-ops safely on
public methods, so the public methods on those clients keep working).

Verification:
- lint:premium-fetch clean (34 ServiceClient classes, 28 premium paths,
  466 src/ files analyzed)
- Negative test: revert any of the three to globalThis.fetch → exit 1
  with file:line and called-premium-method names
- typecheck + typecheck:api clean
- lint:api-contract / lint:rate-limit-policies / lint:boundaries clean
- tests/sanctions-pressure.test.mjs + premium-fetch.test.mts: 16/16 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(military): fetchStaleFallback NEG_TTL=30s parity (closes #3277)

The legacy /api/military-flights handler had NEG_TTL = 30_000ms — a short
suppression window after a failed live + stale read so we don't Redis-hammer
the stale key during sustained relay+seed outages.

Carried into the sebuf list-military-flights handler:
- Module-scoped `staleNegUntil` timestamp (per-isolate on Vercel Edge,
  which is fine — each warm isolate gets its own 30s suppression window).
- Set whenever fetchStaleFallback returns null (key missing, parse fail,
  empty array after staleToProto filter, or thrown error).
- Checked at the entry of fetchStaleFallback before doing the Redis read.
- Test seam `_resetStaleNegativeCacheForTests()` exposed for unit tests.
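The negative-cache mechanics, sketched with an injected clock for testability (the real handler uses Date.now() and a module-scoped timestamp; names approximate the description above):

```typescript
// 30s suppression window after a failed live + stale read, per warm isolate.
const NEG_TTL_MS = 30_000;
let staleNegUntil = 0;

// Checked at the entry of the stale-fallback path, before the Redis read.
function shouldAttemptStaleRead(now: number): boolean {
  return now >= staleNegUntil;
}

// Called whenever the stale fallback comes back null: missing key, parse
// failure, empty after filtering, or a thrown error.
function noteStaleMiss(now: number): void {
  staleNegUntil = now + NEG_TTL_MS;
}

// Test seam, mirroring _resetStaleNegativeCacheForTests().
function resetStaleNegativeCacheForTests(): void {
  staleNegUntil = 0;
}
```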

Test pinned in tests/redis-caching.test.mjs: drives a stale-empty cycle
three times — first read hits Redis, second within window doesn't, after
test-only reset it does again.

Verified: 18/18 redis-caching tests pass, typecheck:api clean,
lint:premium-fetch clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lint): rate-limit-policies regex → import() (closes #3278)

The previous lint regex-parsed ENDPOINT_RATE_POLICIES from the source
file. That worked because the literal happens to fit a single line per
key today, but a future reformat (multi-line key wrap, formatter swap,
etc.) would silently break the lint without breaking the build —
exactly the failure mode that's worse than no lint at all.

Fix:
- Export ENDPOINT_RATE_POLICIES from server/_shared/rate-limit.ts.
- Convert scripts/enforce-rate-limit-policies.mjs to async + dynamic
  import() of the policy object directly. Same TS module that the
  gateway uses at runtime → no source-of-truth drift possible.
- Run via tsx (already a dev dep, used by test:data) so the .mjs
  shebang can resolve a .ts import.
- npm script swapped to `tsx scripts/...`. .husky/pre-push uses
  `npm run lint:rate-limit-policies` so no hook change needed.

Verified:
- Clean: 6 policies / 182 gateway routes.
- Negative test (rename a key to the original sanctions typo
  /api/sanctions/v1/lookup-entity): exit 1 with the same incident-
  attributed remedy message as before.
- Reformat test (split a single-line entry across multiple lines):
  still passes — the property is what's read, not the source layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(shipping/v2): alertThreshold: 0 preserved; drop dead validation branch (#3242 followup)

Before: alert_threshold was a plain int32. proto3 scalar default is 0, so
the handler couldn't distinguish "partner explicitly sent 0 (deliver every
disruption)" from "partner omitted the field (apply legacy default 50)" —
both arrived as 0 and got coerced to 50 by `> 0 ? : 50`. Silent intent-drop
for any partner who wanted every alert. The subsequent `alertThreshold < 0`
branch was also unreachable after that coercion.

After:
- Proto field is `optional int32 alert_threshold` — TS type becomes
  `alertThreshold?: number`, so omitted = undefined and explicit 0 stays 0.
- Handler uses `req.alertThreshold ?? 50` — undefined → 50, any number
  passes through unchanged.
- Dead `< 0 || > 100` runtime check removed; buf.validate `int32.gte = 0,
  int32.lte = 100` already enforces the range at the wire layer.

Partner wire contract: identical for the omit-field and 1..100 cases.
Only behavioural change is explicit 0 — previously impossible to request,
now honored per proto3 optional semantics.
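The before/after distinction comes down to one expression (a minimal sketch of the handler-side default, assuming the generated `alertThreshold?: number` type):

```typescript
// With `optional int32`, omitted arrives as undefined and explicit 0 stays 0,
// so `??` applies the legacy default only when the field was truly absent.
// The old `> 0 ? x : 50` coerced an explicit 0 to 50.
function resolveAlertThreshold(alertThreshold?: number): number {
  return alertThreshold ?? 50;
}
```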

Scoped `buf generate --path worldmonitor/shipping/v2` to avoid the full-
regen `@ts-nocheck` drift Seb documented in the #3242 PR comments.
Re-applied `@ts-nocheck` on the two regenerated files manually.

Tests:
- `alertThreshold 0 coerces to 50` flipped to `alertThreshold 0 preserved`.
- New test: `alertThreshold omitted (undefined) applies legacy default 50`.
- `rejects > 100` test removed — proto/wire validation handles it; direct
  handler calls intentionally bypass wire and the handler no longer carries
  a redundant runtime range check.

Verified: 18/18 shipping-v2-handler tests pass, typecheck + typecheck:api
clean, all 4 custom lints clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(shipping/v2): document missing webhook delivery worker + DNS-rebinding contract (#3242 followup)

#3242 followup checklist item 6 from @koala73 — sanity-check that the
delivery worker honors the re-resolve-and-re-check contract that
isBlockedCallbackUrl explicitly delegates to it.

Audit finding: no delivery worker for shipping/v2 webhooks exists in this
repo. Grep across the entire tree (excluding generated/dist) shows the
only readers of webhook:sub:* records are the registration / inspection /
rotate-secret handlers themselves. No code reads them and POSTs to the
stored callbackUrl. The delivery worker is presumed to live in Railway
(separate repo) or hasn't been built yet — neither is auditable from
this repo.

Refreshes the comment block at the top of webhook-shared.ts to:
- explicitly state DNS rebinding is NOT mitigated at registration
- spell out the four-step contract the delivery worker MUST follow
  (re-validate URL, dns.lookup, re-check resolved IP against patterns,
   fetch with resolved IP + Host header preserved)
- flag the in-repo gap so anyone landing delivery code can't miss it

Tracking the gap as #3288 — acceptance there is "delivery worker imports
the patterns + helpers from webhook-shared.ts and applies the four steps
before each send." Action moves to wherever the delivery worker actually
lives (Railway likely).

No code change. Tests + lints unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(lint): add rate-limit-policies step (greptile P1 #3287)

Pre-push hook ran lint:rate-limit-policies but the CI workflow did not,
so fork PRs and --no-verify pushes bypassed the exact drift check the
lint was added to enforce (closes #3278). Adding it right after
lint:api-contract so it runs in the same context the lint was designed
for.

* refactor(lint): premium-fetch regex → import() + loop classRe (greptile P2 #3287)

Two fragilities greptile flagged on enforce-premium-fetch.mjs:

1. loadPremiumPaths regex-parsed src/shared/premium-paths.ts with
   /'(\/api\/[^']+)'/g — same class of silent drift we just removed
   from enforce-rate-limit-policies in #3278. Reformatting the source
   Set (double quotes, spread, helper-computed entries) would drop
   paths from the lint while leaving the runtime untouched. Fix: flip
   the shebang to `#!/usr/bin/env -S npx tsx` and dynamic-import
   PREMIUM_RPC_PATHS directly, mirroring the rate-limit pattern.
   package.json lint:premium-fetch now invokes via tsx too so the
   npm-script path matches direct execution.

2. loadClientClassMap ran classRe.exec once, silently dropping every
   ServiceClient after the first if a file ever contained more than
   one. Current codegen emits one class per file so this was latent,
   but a template change would ship un-linted classes. Fix: collect
   every class-open match with matchAll, slice each class body with
   the next class's start as the boundary, and scan methods per-body
   so method-to-class binding stays correct even with multiple
   classes per file.

Verification:
- lint:premium-fetch clean (34 classes / 28 premium paths / 466 files
  — identical counts to pre-refactor, so no coverage regression).
- Negative test: revert src/services/economic/index.ts to
  globalThis.fetch → exit 1 with file:line, bound var name, and
  premium method list (getNationalDebt). Restore → clean.
- lint:rate-limit-policies still clean.

* fix(shipping/v2): re-add alertThreshold handler range guard (greptile nit 1 #3287)

Wire-layer buf.validate enforces 0..100, but direct handler invocation
(internal jobs, test harnesses, future transports) bypasses it. Cheap
invariant-at-the-boundary — rejects < 0 or > 100 with ValidationError
before the record is stored.

Tests: restored the rejects-out-of-range cases that were dropped when the
branch was (correctly) deleted as dead code on the previous commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lint): premium-fetch method-regex → TS AST (greptile nits 2+5 #3287)

loadClientClassMap:
  The method regex `async (\w+)\s*\([^)]*\)\s*:\s*Promise<[^>]+>\s*\{\s*let
  path = "..."` assumed (a) no nested `)` in arg types, (b) no nested `>`
  in the return type, (c) `let path = "..."` as the literal first statement.
  Any codegen template shift would silently drop methods with the lint still
  passing clean — the same silent-drift class #3287 just closed on the
  premium-paths side.

  Now walks the service_client.ts AST, matches `export class *ServiceClient`,
  iterates `MethodDeclaration` members, and reads the first
  `let path: string = '...'` variable statement as a StringLiteral. Tolerant
  to any reformatting of arg/return types or method shape.

findCalls scope-blindness:
  Added limitation comment — the walker matches `<varName>.<method>()`
  anywhere in the file without respecting scope. Two constructions in
  different function scopes sharing a var name merge their called-method
  sets. No current src/ file hits this; the lint errs cautiously (flags
  both instances). Keeping the walker simple until scope-aware binding
  is needed.

webhook-shared.ts:
  Inlined issue reference (#3288) so the breadcrumb resolves without
  bouncing through an MDX that isn't in the diff.

Verification:
- lint:premium-fetch clean — 34 classes / 28 premium paths / 489 files.
  Pre-refactor: 34 / 28 / 466. Class + path counts identical; file bump
  is from the main-branch rebase, not the refactor.
- Negative test: revert src/services/economic/index.ts premiumFetch →
  globalThis.fetch. Lint exits 1 at `src/services/economic/index.ts:64:7`
  with `premium method(s) called: getNationalDebt`. Restore → clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lint): rate-limit OpenAPI regex → yaml parser (greptile nit 3 #3287)

Input side (ENDPOINT_RATE_POLICIES) was flipped to live `import()` in
4e79d029. Output side (OpenAPI routes) still regex-scraped top-level
`paths:` keys with `/^\s{4}(\/api\/[^\s:]+):/gm` — hard-coded 4-space
indent. Any YAML formatter change (2-space indent, flow style, line
folding) would silently drop routes and let policy-drift slip through
— same silent-drift class the input-side fix closed.

Now uses the `yaml` package (already a dep) to parse each
.openapi.yaml and reads `doc.paths` directly.

Verification:
- Clean: 6 policies / 189 routes (was 182 — yaml parser picks up a
  handful the regex missed, closing a silent coverage gap).
- Negative test: rename policy key back to /api/sanctions/v1/lookup-entity
  → exits 1 with the same incident-attributed remedy. Restore → clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(codegen): regenerate unified OpenAPI bundle for alert_threshold proto change

The shipping/v2 webhook alert_threshold field was flipped from `int32` to
`optional int32` with an expanded doc comment in f3339464. That comment
now surfaces in the unified docs/api/worldmonitor.openapi.yaml bundle
(introduced by #3341). Regenerated with sebuf v0.11.1 to pick it up.

No behaviour change — bundle-only documentation drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 18:00:41 +03:00
Elie Habib
1dc807e70f docs(resilience): PR 4a — SWF classification rubric (tiers + precedents, no manifest changes) (#3376)
* docs(resilience): PR 4a — SWF classification rubric (tiers + precedents, no manifest changes)

PR 4a of cohort-audit plan 2026-04-24-002. First half of the plan's PR 4
(full-manifest re-rate) split into:

- PR 4a (this): pure documentation — central rubric defining tiers
  + concrete precedents per axis. No manifest changes.
- PR 4b (deferred): apply the rubric to revise specific coefficients
  in `scripts/shared/swf-classification-manifest.yaml`. Behaviour-
  changing; belongs in a separate PR with cohort snapshots and
  methodology review.

This split addresses the plan's concern that PR 4 "may not be outcome-
predetermined" by separating the evaluative framework from its
application. PR 4a makes every current manifest value evaluable against
a benchmark; PR 4b applies the benchmark.

Shipped

- `docs/methodology/swf-classification-rubric.md` — new doc.
  Sections:
    1. Introduction + scope (rubric vs manifest boundary)
    2. Access axis: 5 named tiers (0.1, 0.3, 0.5, 0.7, 0.9) w/
       concrete precedents per tier, plus edge cases for
       fiscal-rule caps (Norway GPFG) and state holding
       companies (Temasek)
    3. Liquidity axis: 6 tiers (0.1, 0.3, 0.5, 0.7, 0.9, 1.0) w/
       precedents + listed-vs-directly-owned real-estate edge case
    4. Transparency axis: 6 tiers grounded in LM Transparency
       Index + IFSWF membership + annual-report granularity, plus
       edge cases for LM=10 w/o holdings-level disclosure and
       sealed filings (KIA)
    5. Current manifest × rubric alignment — 24 coefficients reviewed;
       6 flagged as "arguably higher/lower under the rubric" with
       directional-impact analysis marked INFORMATIONAL, not
       motivation for revision
    6. How-to-use playbook for manifest-edit PRs (add/revise/rubric-
       revise workflows)

Findings (informational only — no PR changes)

Six ratings flagged as potentially under-/over-stated against the
rubric. Per the plan's anti-pattern note (rank-targeted acceptance
criteria), the flags are INFORMATIONAL: a future manifest-edit PR
should revise only when the rubric + cited evidence support the
change, not to hit a target ranking.

Flagged (with directional impact if revised upward):

  - Mubadala access 0.4 → arguably 0.5; transparency 0.6 → 0.7
    (haircut 0.12 → 0.175, +46% access × transparency product)
  - PIF access 0.4 → arguably 0.5; liquidity 0.4 → arguably 0.3
    (net small effect — opposite directions partially cancel)
  - KIA transparency 0.4 → arguably 0.5 (haircut +25%)
  - QIA access 0.4 → arguably 0.5; transparency 0.4 → arguably 0.5
    (haircut +56%)
  - GIC access 0.6 → arguably 0.7 (haircut +17%)

Not flagged: GPFG, ADIA, Temasek (all 9 coefficients align with
their rubric tiers).

Verified

- `npm run test:data` — 6694 pass / 0 fail (unchanged — pure docs PR)
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint:md` — clean

Not in this PR

- Manifest coefficient changes (PR 4b)
- Cohort-sanity snapshot before/after (PR 4b)
- Live-data audit of IFSWF engagement + LM index current values
  (requires web fetch — not in scope for a doc PR)

* fix(resilience): PR 4a review — resolve GIC/ADIA rubric contradictions + flag-count

Addresses P1 + 2 P2 Greptile findings on #3376 (draft).

1. **P1 — GIC tier contradiction.** GIC was listed as a canonical 0.7
   ("Explicit stabilization with rule") precedent AND rated 0.6 in
   the alignment table with an "arguably 0.7" note. That inconsistency
   makes the rubric unusable as-is for PR 4b review. Removed GIC from
   the 0.7 precedent list and explicitly marked it as a 0.7 *candidate*
   (pending PR 4b evaluation), not a 0.7 *precedent*. KIA General
   Reserve Fund stays as the canonical 0.7 example; Norway GPFG
   remains the borderline case for fiscal-rule caps.

2. **P2 — ADIA liquidity midpoint inconsistency.** Methodology text
   said the rubric uses "midpoint" for ranged disclosures and cited
   ADIA 55-70% → 0.7 tier. But midpoint(55-70) = 62.5%, which sits
   in the 0.5 tier band (50-65%). Fixed the methodology to state the
   rubric uses the **upper-bound** of a disclosed range (fund's own
   statement of maximum public-market allocation), which keeps ADIA
   at 0.7 tier (70% upper bound fits 65-85% band). Added forward-
   compatibility note: if future ADIA disclosures tighten the range
   so the upper bound drops below 65%, the rubric directs the rating
   to 0.5.

3. **P2 — Flag-count header.** "(6 of 24 coefficients)" was wrong;
   the enumeration below lists 8 coefficients across 5 funds.
   Corrected to "8 coefficients across 5 funds" with the fund-by-fund
   count inline so the header math is self-verifying.

Verified

- `npm run lint:md` — clean
- `npm run typecheck` — green (pure docs PR, no behaviour change)

This PR remains in draft pending #3380 (PR 3A — net-imports denominator)
merge per the plan's PR 4 → after PR 3A sequencing.
2026-04-24 18:32:17 +04:00
Elie Habib
b4198a52c3 docs(resilience): PR 5.1 — sanctions construct audit (designated-party domicile question) (#3375)
* docs(resilience): PR 5.1 — sanctions construct audit (designated-party domicile question)

PR 5.1 of cohort-audit plan 2026-04-24-002. Stacked on PR 5.3 (#3374)
so the known-limitations.md section append is additive. Read-only
static audit of scoreTradeSanctions + the sanctions:country-counts:v1
seed — framed around the Codex-reformulated construct question:
should designated-party domicile count penalize resilience?

Findings

1. The count is "OFAC-designated-party domicile locations," NOT
   "sanctions against this country." Seeder (`scripts/seed-sanctions-
   pressure.mjs:85-93`) parses OFAC Advanced XML SDN + Consolidated,
   extracts each designated party's Locations, and increments
   `map[countryCode]` by 1 for every location country on that party.

2. The count conflates three semantically distinct categories a
   resilience construct might treat differently:

   (a) Country-level sanction target (NK SDN listings) — correct penalty
   (b) Domiciled sanctioned entity (RU bank in Moscow, post-2022) —
       debatable, country hosts the actor
   (c) Transit / shell entity (UAE trading co listed under SDGT for
       Iran evasion; CY SPV for a Russian oligarch) — country is NOT
       the target, but takes the penalty

3. Observed GCC cohort impact: AE scores 54 vs KW/QA 82. The −28 gap
   is almost entirely driven by category (c) listings — AE is a
   financial hub where sanctioned parties incorporate shells.

4. Three options documented for the construct decision (NOT decided
   in this PR):
   - Option 1: Keep flat count (status quo, defensible via secondary-
     sanctions / FATF argument)
   - Option 2: Program-weighted count — weight DPRK/IRAN/SYRIA/etc.
     at 1.0, SDGT/SDNTK/CYBER/etc. at 0.3-0.5. Recommended; seeder
     already captures `programs` per entry — data is there, scorer
     just doesn't read it.
   - Option 3: Transit-hub exclusion list (AE, SG, HK, CY, VG, KY) —
     brittle + normative, not recommended

5. Recommendation documented: Option 2. Implementation deferred to
   a separate methodology-decision PR (outside auto-mode authority).
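
Option 2 can be sketched roughly as below. This is illustrative only — the weight values, the max-over-programs choice, and the entry shape are all assumptions for the deferred methodology PR, not shipped code:

```typescript
// Illustrative sketch of Option 2 (program-weighted count): country-level
// programs (DPRK, IRAN, SYRIA) count fully, while conduct-based programs
// (SDGT, SDNTK) that often tag transit/shell domiciles count fractionally.
type SanctionEntry = { countryCode: string; programs: string[] };

const PROGRAM_WEIGHTS: Record<string, number> = {
  DPRK: 1.0, IRAN: 1.0, SYRIA: 1.0,   // country-level sanction targets
  SDGT: 0.3, SDNTK: 0.3, CYBER: 0.5,  // conduct-based programs
};

function weightedCount(entries: SanctionEntry[], country: string): number {
  return entries
    .filter((e) => e.countryCode === country)
    .reduce((sum, e) => {
      // Assumption: an entry counts at its heaviest program (max, not sum);
      // unknown programs get a middling 0.5 default.
      const w = Math.max(...e.programs.map((p) => PROGRAM_WEIGHTS[p] ?? 0.5));
      return sum + w;
    }, 0);
}

const entries: SanctionEntry[] = [
  { countryCode: 'AE', programs: ['SDGT'] }, // transit shell -> 0.3
  { countryCode: 'AE', programs: ['SDGT'] }, // transit shell -> 0.3
  { countryCode: 'KP', programs: ['DPRK'] }, // country target -> 1.0
];
```

Under this shape two AE shell listings weigh less than one DPRK-program listing, which is the directional effect Option 2 targets for the GCC gap.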

Shipped

- `docs/methodology/known-limitations.md` — new section extending
  the file: "tradeSanctions — designated-party domicile construct
  question." Covers what the count represents, the three categories
  with examples, observed GCC impact, three options w/ trade-offs,
  recommendation, follow-up audit list (entity-sample gated on
  API-key access), and file references.
- `tests/resilience-sanctions-field-mapping.test.mts` (new) —
  10 regression-guard tests pinning CURRENT behavior:
    1-6. normalizeSanctionCount piecewise anchors:
       count=0→100, 1→90, 10→75, 50→50, 200→25, 500→≤1
    7. Monotonicity: strictly decreasing across the ramp
    8. Country absent from map defaults to count=0 → score 100
       (intentional "no designated parties here" semantics)
    9. Seed outage (raw=null) → null score slot, NOT imputed
       (protects against silent data-outage scoring)
    10. Construct anchor: count=1 is exactly 10 points below count=0
        (pins the "first listing drops 10" design choice)
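
The pinned anchors are consistent with a piecewise-linear ramp; a minimal sketch under that assumption (the real scorer's curve between anchors is not pinned by these tests and may differ):

```typescript
// Piecewise-linear sketch consistent with the pinned anchors:
// count=0 -> 100, 1 -> 90, 10 -> 75, 50 -> 50, 200 -> 25, 500 -> 0.
// Assumption: straight-line interpolation between adjacent anchors.
const ANCHORS: Array<[count: number, score: number]> = [
  [0, 100], [1, 90], [10, 75], [50, 50], [200, 25], [500, 0],
];

function normalizeSanctionCount(count: number): number {
  if (count <= 0) return 100;
  if (count >= 500) return 0;
  for (let i = 1; i < ANCHORS.length; i++) {
    const [c1, s1] = ANCHORS[i];
    if (count <= c1) {
      const [c0, s0] = ANCHORS[i - 1];
      return s0 + ((count - c0) / (c1 - c0)) * (s1 - s0);
    }
  }
  return 0;
}
```

Note how the first segment alone encodes test 10's design choice: moving from count=0 to count=1 drops exactly 10 points.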

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts` — 10 pass / 0 fail
- `npm run test:data` — 6721 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — clean

* fix(resilience): PR 5.1 review — tighten count=500 assertion; clarify weightedBlend weights

Addresses 2 P2 Greptile findings on #3375:

1. Tighten count=500 assertion. Was `<= 1` with a comment stating the
   exact value is 0. That loose bound silently tolerates roundScore /
   boundary drift that would be the very signal this regression guard
   exists to catch. Changed to strict equality `=== 0`.

2. Clarify the "zero weight" comment on the sanctions-only harness.
   The other slots DO contribute their declared weights (0.15 +
   0.15 + 0.25 = 0.55) to weightedBlend's `totalWeight` denominator —
   only `availableWeight` (the score-computation denominator) drops
   to 0.45 because their score is null. The previous comment elided
   this distinction and could mislead a reader into thinking the null
   slots contributed nothing at all. Expanded to state exactly how
   `coverage` and `score` each behave.
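
The distinction can be sketched as follows (an assumed minimal shape for illustration, not the real `weightedBlend`):

```typescript
// Sketch of the totalWeight / availableWeight distinction: null-score
// slots still contribute their declared weight to the coverage
// denominator (totalWeight) but drop out of the score denominator
// (availableWeight).
type Slot = { score: number | null; weight: number };

function weightedBlend(slots: Slot[]): { score: number | null; coverage: number } {
  const totalWeight = slots.reduce((s, x) => s + x.weight, 0);
  const live = slots.filter((x) => x.score !== null);
  const availableWeight = live.reduce((s, x) => s + x.weight, 0);
  if (availableWeight === 0) return { score: null, coverage: 0 };
  const score =
    live.reduce((s, x) => s + (x.score as number) * x.weight, 0) / availableWeight;
  return { score, coverage: availableWeight / totalWeight };
}

// Sanctions-only harness shape: three null slots (0.15 + 0.15 + 0.25)
// plus one live sanctions slot at 0.45.
const { score, coverage } = weightedBlend([
  { score: null, weight: 0.15 },
  { score: null, weight: 0.15 },
  { score: null, weight: 0.25 },
  { score: 80, weight: 0.45 },
]);
// score is driven entirely by the live slot; coverage is 0.45 / 1.0.
```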

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts`
  — 10 pass / 0 fail (count=500 now pins the exact 0 floor)
2026-04-24 18:30:59 +04:00
Elie Habib
a97ba83833 docs(resilience): PR 5.3 — foodWater scorer audit (construct-deterministic GCC identity) (#3374)
* docs(resilience): PR 5.3 — foodWater scorer audit (construct-deterministic GCC identity)

PR 5.3 of cohort-audit plan 2026-04-24-002. Stacked on PR 5.2 (#3373)
so the known-limitations.md section append is additive. Read-only
static audit of scoreFoodWater.

Findings

1. The observed GCC-all-score-53 is CONSTRUCT-DETERMINISTIC, not a
   regional-default leak. Pinned mathematically:
   - IPC/HDX doesn't publish active food-crisis data for food-secure
     states → scorer's fao-null branch imputes IMPUTE.ipcFood=88
     (class='stable-absence', cov=0.7) at combined weight 0.6
   - WB indicator ER.H2O.FWST.ZS (labelled 'water stress') for GCC
     is EXTREME (KW ~3200%, BH ~3400%, UAE ~2080%, QA ~770%) — all
     clamp to sub-score 0 under the scorer's lower-better 0..100
     normaliser at weight 0.4
   - Blended with peopleInCrisis=0 (fao block present with zero):
       (100 * 0.45 + 0 * 0.4) / (0.45 + 0.4) = 45 / 0.85 ≈ 53
   Every GCC country has the same inputs → same outputs. That's
   construct math, not a regional lookup.

2. Indicator-keyword routing is code-correct. `'water stress'`,
   `'withdrawal'`, `'dependency'` route to lower-better;
   `'availability'`, `'renewable'`, `'access'` route to
   higher-better; unrecognized indicators fall through to a
   value-range heuristic with a WARN log.

3. No bug or methodology decision required. The 53-all-GCC output
   is a correct summary statement: "non-crisis food security +
   severe water-withdrawal stress." A future construct decision
   might split foodWater into separate food and water dims so one
   saturated sub-signal doesn't dominate the combined dim for
   desert economies — but that's a construct redesign, not a bug.

Shipped

- `docs/methodology/known-limitations.md` — extended with a new
  section documenting the foodWater audit findings, the exact
  blend math that yields ~53 for GCC, cohort-determinism vs
  regional-default, and a follow-up data-side spot-check list
  gated on API-key access.
- `tests/resilience-foodwater-field-mapping.test.mts` — 8 new
  regression-guard tests:
    1. indicator='water stress' routes to lower-better
    2. GCC extreme-withdrawal anchor (value=2000 → blended score 53)
    3. indicator='renewable water availability' routes to higher-better
    4. fao=null with static record → imputes 88; imputationClass=null
       because observed AQUASTAT wins (weightedBlend T1.7 rule)
    5. fully-imputed (fao=null + aquastat=null) surfaces
       imputationClass='stable-absence'
    6. static-record absent entirely → coverage=0, NOT impute
    7. Cohort determinism — identical inputs → identical scores
    8. Different water-profile inputs → different scores (rules
       out regional-default hypothesis)

Verified

- `npx tsx --test tests/resilience-foodwater-field-mapping.test.mts` — 8 pass / 0 fail
- `npm run test:data` — 6711 pass / 0 fail (PR 5.2's 9 + PR 5.3's 8 = 17 new stacked)
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — clean

* fix(resilience): PR 5.3 review — pin IMPUTE branch for GCC anchor; fix comment math

Addresses 3 P2 Greptile findings on #3374 — all variations of the same
root cause: the test fixture + doc described two different code paths
that coincidentally both produce ~53 for GCC inputs.

Changes

1. GCC anchor test now drives the IMPUTE branch (`fao: null`), matching
   what the static seeder emits for GCC in production. The else branch
   (`fao: { peopleInCrisis: 0 }`) happens to converge on ~52.94 by
   coincidence but is NOT the live code path for GCC.

2. Doc finding #4 updated to show the IMPUTE-branch math
   `(88×0.6 + 0×0.4) / 1.0 = 52.8 → 53` and explicitly notes the
   else-branch convergence as a coincidence — not the construct's
   intent.

3. Comment math off-by-one fix at line 107:
     (88×0.6 + 80×0.4) / (0.6+0.4)
     = 52.8 + 32.0
     = 84.8 → 85    (was incorrectly stated as 85.6 → 86)
   Test assertion `>= 80 && <= 90` still accepts 85 so behaviour is
   unchanged; this was a comment-only error that would have misled
   anyone reproducing the math by hand.

Verified

- `npx tsx --test tests/resilience-foodwater-field-mapping.test.mts`
  — 8 pass / 0 fail (IMPUTE-branch anchor test produces 53 as expected)
- `npm run lint:md` — clean

Also rebased onto updated #3373 (which landed a backtick-escape fix).
2026-04-24 18:25:50 +04:00
Elie Habib
6807a9c7b9 docs(resilience): PR 5.2 — displacement field-mapping audit + known-limitations (#3373)
* docs(resilience): PR 5.2 — displacement field-mapping audit + known-limitations

PR 5.2 of cohort-audit plan 2026-04-24-002. Read-only static audit of
the UNHCR displacement field mapping consumed by scoreSocialCohesion,
scoreBorderSecurity, and scoreStateContinuity.

Findings

1. Field mapping is CODE-CORRECT. The plan's concern — that
   `totalDisplaced` might inadvertently include labor migrants — is
   negative at the source. The UNHCR Population API does not publish
   labor migrant data at all; it covers only four categories
   (refugees, asylum seekers, IDPs, stateless), all of which the
   seeder sums correctly. Labor-migrant-dominated cohorts (GCC, SG)
   legitimately register as "no UNHCR footprint" — that's UNHCR
   semantics, not a bug.

2. NEW finding during audit — `scoreBorderSecurity` fallback at
   _dimension-scorers.ts:1412 is effectively dead code. The
   `hostTotal ?? totalDisplaced` fallback never fires in production
   for two compounding reasons:

   (a) `safeNum(null)` returns 0 (JS `Number(null) === 0`), so the
       `??` short-circuits on 0 — the nullish-coalescing only falls
       back on null/undefined.
   (b) `scripts/seed-displacement-summary.mjs` ALWAYS writes
       `hostTotal: 0` explicitly for origin-only countries (lines
       141-144). There's no production shape where `hostTotal` is
       undefined, so the `??` can never select the fallback path.

   Observable consequence: origin-only high-outflow countries
   (Syria, Venezuela, Ukraine, Afghanistan) score 100 on
   borderSecurity's displacement sub-component (35% of the dim
   blend). The outflow signal is effectively silenced.

3. NOT fixing this in this PR. A one-line change (`||` or an
   explicit `> 0` check) would flip the borderSecurity score for
   ~6 high-outflow origin countries by a material amount — a
   methodology change, not a pure bug-fix. Belongs in a construct-
   decision PR with before/after cohort snapshots. Opening this as
   a follow-up discussion instead of bundling into an audit-doc PR.
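
The `??` short-circuit in finding 2 reduces to a few lines (`safeNum` below is an assumed minimal stand-in for the real helper):

```typescript
// Minimal stand-in for the quirk in finding 2: Number(null) === 0, so by
// the time `??` sees the value it is already 0 — a perfectly non-nullish
// number — and the fallback operand can never be selected.
const safeNum = (v: unknown): number => Number(v);

const hostTotal = safeNum(null);   // 0, not null
const totalDisplaced = 6_800_000;  // Syria-pattern origin outflow

const nullish = hostTotal ?? totalDisplaced; // 0 — fallback dead
const truthy  = hostTotal || totalDisplaced; // 6_800_000 — would fire
```

This is why the one-line `||` (or explicit `> 0`) change is behaviour-altering for ~6 origin-only countries, and why it is deferred to a construct-decision PR.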

Shipped

- `docs/methodology/known-limitations.md` — new file. Sections:
  "Displacement field-mapping" covering source semantics (what
  UNHCR provides vs does not), the GCC labor-migrant-cohort
  implication, the `??` short-circuit finding, and the decision
  to not fix in this PR. Includes a follow-up audit list of 11
  countries (high host-pressure + high origin-outflow + labor-
  migrant cohorts) for a live-data spot-check against UNHCR
  Refugee Data Finder — gated on API-key access.
- `tests/resilience-displacement-field-mapping.test.mts` —
  9-test regression guard. Pins:
    (1) `totalDisplaced` = sum of all four UNHCR categories;
    (2) `hostTotal` = asylum-side sum (no IDPs/stateless);
    (3) stateless population flows into totalDisplaced (guards
        against a future seeder refactor that drops the term);
    (4) labor-migrant-cohort (UNHCR-empty) entry scores 100 on
        the displacement sub-component — the correct-per-UNHCR-
        semantics outcome, intentionally preserved;
    (5) CURRENT scoreBorderSecurity behaviour: hostTotal=0
        short-circuits `??` (Syria-pattern scores 100);
    (6) `??` fallback ONLY fires when hostTotal is undefined
        (academic; seeder never emits this shape today);
    (7) `safeNum(null)` returns 0 quirk pinned as a numeric-
        coercion contract;
    (8) absent-from-UNHCR country imputes `stable-absence`;
    (9) scoreStateContinuity reads `totalDisplaced` origin-side.

Verified

- `npx tsx --test tests/resilience-displacement-field-mapping.test.mts` — 9 pass / 0 fail
- `npm run test:data` — 6703 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — no warnings on new files

* fix(resilience): PR 5.2 review — escape backticks in assertion message

Addresses Greptile P2 on #3373. The unescaped backticks around the
nullish-coalescing operator in a template literal caused JavaScript to
parse the string as 'prefix' ?? 'suffix' — truncating the assertion
message to the prefix alone on failure. Escaping the backticks preserves
the full diagnostic so a future regression shows the complete context.
Semantics unchanged; test still passes.
2026-04-24 18:14:29 +04:00
Elie Habib
184e82cb40 feat(resilience): PR 3A — net-imports denominator for sovereignFiscalBuffer (#3380)
PR 3A of cohort-audit plan 2026-04-24-002. Construct correction for
re-export hubs: the SWF rawMonths denominator was gross imports, which
double-counted flow-through trade that never represents domestic
consumption. Net-imports fix:

  rawMonths = aum / (grossImports × (1 − reexportShareOfImports)) × 12

applied to any country in the re-export share manifest. Countries NOT
in the manifest get gross imports unchanged (status-quo fallback).
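
The denominator change can be sketched as below (`computeNetImports` is named in this commit; its exact signature here is an assumption, and the numbers are synthetic):

```typescript
// Sketch of the net-imports denominator. Countries absent from the
// manifest pass share=0 and reproduce the status-quo gross denominator.
function computeNetImports(grossImports: number, reexportShare: number): number {
  return grossImports * (1 - reexportShare);
}

function rawMonths(aum: number, grossImports: number, reexportShare = 0): number {
  return (aum / computeNetImports(grossImports, reexportShare)) * 12;
}

// Construct invariant from the acceptance gate: same AUM, same gross
// imports; A re-exports 60%, B re-exports 0%.
const a = rawMonths(300, 120, 0.6); // denominator 120 * 0.4 = 48
const b = rawMonths(300, 120, 0);   // denominator 120 (status quo)
// a / b === 1 / (1 - 0.6) === 2.5
```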

Plan acceptance gates — verified synthetically in this PR:

  Construct invariant. Two synthetic countries, same SWF, same gross
  imports. A re-exports 60%; B re-exports 0%. Post-fix, A's rawMonths
  is 2.5× B's (1/(1-0.6) = 2.5). Pinned in
  tests/resilience-net-imports-denominator.test.mts.

  SWF-heavy exporter invariant. Country with share ≤ 5%: rawMonths
  lift < 5% vs baseline (negligible). Pinned.

What shipped

1. Re-export share manifest infrastructure.
   - scripts/shared/reexport-share-manifest.yaml (new, empty) — schema
     committed; entries populated in follow-up PRs with UNCTAD
     Handbook citations.
   - scripts/shared/reexport-share-loader.mjs (new) — loader + strict
     validator, mirrors swf-manifest-loader.mjs.
   - scripts/seed-recovery-reexport-share.mjs (new) — publishes
     resilience:recovery:reexport-share:v1 from manifest. Empty
     manifest = valid (no countries, no adjustment).

2. SWF seeder uses net-imports denominator.
   - scripts/seed-sovereign-wealth.mjs exports computeNetImports(gross,
     share) — pure helper, unit-tested.
   - Per-country loop: reads manifest, computes denominatorImports,
     applies to rawMonths math.
   - Payload records annualImports (gross, audit), denominatorImports
     (used in math), reexportShareOfImports (provenance).
   - Summary log reports which countries had a net-imports adjustment
     applied with source year.

3. Bundle wiring.
   - Reexport-Share runs BEFORE Sovereign-Wealth in the recovery
     bundle so the SWF seeder reads fresh re-export data in the same
     cron tick.
   - tests/seed-bundle-resilience-recovery.test.mjs expected-entries
     updated (6 → 7) with ordering preservation.

4. Cache-prefix bump (per cache-prefix-bump-propagation-scope skill).
   - RESILIENCE_SCORE_CACHE_PREFIX: v11 → v12
   - RESILIENCE_RANKING_CACHE_KEY: v11 → v12
   - RESILIENCE_HISTORY_KEY_PREFIX: v6 → v7 (history rotation prevents
     30-day rolling window from mixing pre/post-fix scores and
     manufacturing false "falling" trends on deploy day).
   - Source of truth: server/worldmonitor/resilience/v1/_shared.ts
   - Mirrored in: scripts/seed-resilience-scores.mjs,
     scripts/validate-resilience-correlation.mjs,
     scripts/backtest-resilience-outcomes.mjs,
     scripts/validate-resilience-backtest.mjs,
     scripts/benchmark-resilience-external.mjs, api/health.js
   - Test literals bumped in 4 test files (26 line edits).
   - EXTENDED tests/resilience-cache-keys-health-sync.test.mts with
     a parity pass that reads every known mirror file and asserts
     both (a) canonical prefix present AND (b) no stale v<older>
     literals in non-comment code. Found one legacy log-line that
     still referenced v9 (scripts/seed-resilience-scores.mjs:342)
     and refactored it to use the RESILIENCE_RANKING_CACHE_KEY
     constant so future bumps self-update.

Explicitly NOT in this PR

- liquidReserveAdequacy denominator fix. The plan's PR 3A wording
  mentions both dims, but the RESERVES ratio (WB FI.RES.TOTL.MO) is a
  PRE-COMPUTED WB series; applying a post-hoc net-imports adjustment
  mixes WB's denominator year with our manifest-year, and the math
  change belongs in PR 3B (unified liquidity) where the α calibration
  is explicit. This PR stays scoped to sovereignFiscalBuffer.
- Live re-export share entries. The manifest ships EMPTY in this PR;
  entries with UNCTAD citations are one-per-PR follow-ups so each
  figure is individually auditable.

Verified

- tests/resilience-net-imports-denominator.test.mts — 9 pass (construct
  contract: 2.5× ratio gate, monotonicity, boundary rejections,
  backward-compat on missing manifest entry, cohort-proportionality,
  SWF-heavy-exporter-unchanged)
- tests/reexport-share-loader.test.mts — 7 pass (committed-manifest
  shape + 6 schema-violation rejections)
- tests/resilience-cache-keys-health-sync.test.mts — 5 pass (existing 3
  + 2 new parity checks across all mirror files)
- tests/seed-bundle-resilience-recovery.test.mjs — 17 pass (expected
  entries bumped to 7)
- npm run test:data — 6714 pass / 0 fail
- npm run typecheck / typecheck:api — green
- npm run lint / lint:md — clean

Deployment notes

Score + ranking + history cache prefixes all bump in the same deploy.
Per established v10→v11 precedent (and the cache-prefix-bump-
propagation-scope skill):
- Score / ranking: 6h TTL — the new prefix populates via the Railway
  resilience-scores cron within one tick.
- History: 30d ring — the v7 ring starts empty; the first 30 days
  post-deploy lack baseline points, so trend / change30d will read as
  "no change" until v7 accumulates a window.
- Legacy v11 keys can be deleted from Redis at any time post-deploy
  (no reader references them). Leaving them in place costs storage
  but does no harm.
2026-04-24 18:14:04 +04:00
Elie Habib
0081da4148 fix(resilience): widen Comtrade period to 4y + surface picked year (#3372)
PR 1 of cohort-audit plan 2026-04-24-002. Unblocks UAE, Oman, Bahrain
(and any other late-reporter) on the importConcentration dimension.

Problem
- seed-recovery-import-hhi.mjs queries Comtrade with `period=Y-1,Y-2`
  (currently "2025,2024"). Several reporters publish Comtrade 1-2y
  behind — their 2024/2025 rows are empty while 2023 is populated.
- With no data in the queried window, parseRecords() returned [] for
  the reporter, the seeder counted a "skip", the scorer fell through
  to IMPUTE (score=50, coverage=0.3, imputationClass="unmonitored"),
  and the cohort-sanity audit flagged AE as a coverage-outlier inside
  the GCC — exactly the class of silent gap the audit is designed to
  catch.

Fix
1. Widen the Comtrade period parameter to a 4-year window Y-1..Y-4
   via a new `buildPeriodParam(now)` helper. On-time reporters still
   pick their latest year via the existing completeness tiebreak in
   parseRecords(); late reporters now pick up whatever year they
   actually published in (2023 for UAE, etc.).
2. parseRecords() now returns { rows, year } — the year surfaces in
   the per-country payload as `year: number | null` for operator
   freshness audit. The scorer already expects this shape
   (_dimension-scorers.ts:1524 RecoveryImportHhiCountry.year); this
   PR actually populates it.
3. `buildPeriodParam` + `parseRecords` are exported so their unit
   tests can pin year-selection behaviour without hitting Comtrade.

Note on PR 2 of the same plan
The plan calls out "PR 2 — externalDebtCoverage re-goalpost to
Greenspan-Guidotti" as unshipped. It IS shipped: commit 7f78a7561
"PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)"
landed under the prior workstream 2026-04-22-001. The new construct
invariants in tests/resilience-construct-invariants.test.mts
(shipped in PR 0 / #3369) confirm score(ratio=0)=100, score(1)=50,
score(2)=0 against current main. PR 2 of the cohort-audit plan is a
no-op; I'll flag this on the plan review thread rather than bundle
a plan edit into this PR.
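
The three pinned invariants are consistent with a simple linear ramp over the re-goalposted 0..2 range — a sketch under that assumption, not the scorer's literal code:

```typescript
// score(ratio=0)=100, score(1)=50, score(2)=0: a linear ramp on the
// clamped 0..2 short-term-debt-to-reserves ratio (Greenspan-Guidotti
// anchor at ratio=1 sitting exactly at the midpoint).
function scoreExternalDebtCoverage(ratio: number): number {
  const clamped = Math.min(Math.max(ratio, 0), 2);
  return 100 * (1 - clamped / 2);
}
```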

Verified
- `npx tsx --test tests/seed-recovery-import-hhi.test.mjs` — 19 pass
  (10 existing + 9 new: buildPeriodParam shape; parseRecords picks
  completeness-tiebreak, newer-year-on-ties, late-reporter fallback;
  empty/negative/world-aggregate handling)
- `npx tsx --test tests/seed-comtrade-5xx-retry.test.mjs` — green
  (the `{ records, status }` destructure pattern at the caller still
  works; the new third field `year` is additive)
- `npm run test:data` — 6703 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — no new warnings
- No cache-prefix bump: the payload shape only ADDS an optional
  field; old snapshots remain valid readers.

Acceptance per plan
- Construct invariant: score(HHI=0.05) > score(HHI=0.20) — already
  covered in tests/resilience-construct-invariants.test.mts (PR #3369)
- Monotonicity pin: score(hhi=0.15) > score(hhi=0.45) — already
  covered in tests/resilience-dimension-monotonicity.test.mts

Post-deploy verification
After the next Railway seed-bundle-resilience-recovery cron tick,
confirm UAE/OM/BH appear in `resilience:recovery:import-hhi:v1`
with non-null hhi and `year` = 2023 (or their actual latest year).
Then re-run the cohort audit — the GCC coverage-outlier flag on
AE.importConcentration should disappear.
2026-04-24 18:13:41 +04:00
Elie Habib
df392b0514 feat(resilience): PR 0 — cohort-sanity release-gate harness (#3369)
* feat(resilience): PR 0 — cohort-sanity release-gate harness

Lands the audit infrastructure for the resilience cohort-ranking
structural audit (plan 2026-04-24-002). Release gate, not merge gate:
the audit tells release review what to look at before publishing a
ranking; it does not block a PR.

What's new
- scripts/audit-resilience-cohorts.mjs — Markdown report generator.
  Fetches the live ranking + per-country scores (or reads a fixture
  in offline mode), emits per-cohort per-dimension tables, contribution
  decomposition, saturated / outlier / identical-score flags, and a
  top-N movers comparison vs a baseline snapshot.
- tests/resilience-construct-invariants.test.mts — 12 formula-level
  anchor-value assertions with synthetic inputs. Covers HHI, external
  debt (Greenspan-Guidotti anchor), and sovereign fiscal buffer
  (saturating transform). Tests the MATH, not a country's rank.
- tests/fixtures/resilience-audit-fixture.json — offline fixture that
  mirrors the 2026-04-24 GCC state (KW>QA>AE) so the audit tool can
  be smoke-tested without API-key access.
- docs/methodology/cohort-sanity-release-gate.md — operational doc
  explaining when to run, how to read the report, and the explicit
  anti-pattern note on rank-targeted acceptance criteria.

Verified
- `npx tsx --test tests/resilience-construct-invariants.test.mts` —
  12 pass (HHI, debt, SWF invariants all green against current scorer)
- `npm run test:data` — 6706 pass / 0 fail
- `FIXTURE=tests/fixtures/resilience-audit-fixture.json
   OUT=/tmp/audit.md node scripts/audit-resilience-cohorts.mjs`
  runs to completion and correctly flags:
  (a) coverage-outlier on AE.importConcentration (0.3 vs peers 1.0)
  (b) saturated-high on GCC.externalDebtCoverage (all 6 at 100)
  — the two top cohort-sanity findings from the plan.

Not in this PR
- The live-API baseline snapshot
  (docs/snapshots/resilience-ranking-live-pre-cohort-audit-2026-04-24.json)
  is deferred to a manual release-prep step: run
  `WORLDMONITOR_API_KEY=wm_xxx API_BASE=https://api.worldmonitor.app
   node scripts/freeze-resilience-ranking.mjs` before the first
  methodology PR (PR 1 HHI period widening) so its movers table has
  something to compare against.
- No scorer changes. No cache-prefix bumps. This PR is pure tooling.

* fix(resilience): fail-closed on fetch failures + pillar-combine formula mode

Addresses review P1 + P2 on PR #3369.

P1 — fetch-failure silent-drop.
Per-country score fetches that failed were logged to stderr, silently
stored as null, and then filtered out of cohort tables via
`codes.filter((cc) => scoreMap.get(cc))`. A transient 403/500 on the
very country carrying the ranking anomaly could produce a Markdown
report that looked valid — wrong failure mode for a release gate.

Fix:
- `fetchScoresConcurrent` now tracks failures in a dedicated Map and
  does NOT insert null placeholders; missing cohort members are
  computed against the requested cohort code set.
- The report has a blocker banner at top AND an always-rendered
  "Fetch failures / missing members" section (shown even when empty,
  so an operator learns to look).
- `STRICT=1` writes the report, then exits code 3 on any fetch
  failure or missing cohort member, code 4 on formula-mode drift,
  code 0 otherwise. Automation can differentiate the two.

P2 — pillar-combine formula mode invalidates contribution rows.
`docs/methodology/cohort-sanity-release-gate.md:63` tells operators
to run this audit before activating `RESILIENCE_PILLAR_COMBINE_ENABLED`,
but the contribution decomposition is a domain-weighted roll-up that
is ONLY valid when `overallScore = sum(domain.score * domain.weight)`.
Once pillar combine is on, `overallScore = penalizedPillarScore(pillars)`
(non-linear in dim scores); decomposition rows become materially
misleading for exactly the release-gate scenario the doc prescribes.

Fix:
- Added `detectFormulaMode(scoreMap)` that takes countries with:
  (a) `sum(domain.weight)` within 0.05 of 1.0 (complete response), AND
  (b) every dim at `coverage ≥ 0.9` (stable share math)
  and compares `|Σ contributions - overallScore|` against
  `CONTRIB_TOLERANCE` (default 1.5). If > 50% of ≥ 3 eligible
  countries drift, pillar combine is flagged.
- Report emits a blocker banner at top, a "Formula mode" line in
  the header, and a "Formula-mode diagnostic" section with the first
  three offenders. Under `STRICT=1` exits code 4.
- Methodology doc updated: new "Fail-closed semantics" section,
  "Formula mode" operator guide, ENV table entries for STRICT +
  CONTRIB_TOLERANCE.
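
Under the stated eligibility rules the detector reduces to the following (a sketch; the real score-payload field names may differ):

```typescript
// Sketch of detectFormulaMode: eligible countries have complete weights
// and stable coverage; if most of them drift from the linear
// decomposition, the non-linear pillar-combine formula is flagged.
type Dim = { score: number; weight: number; coverage: number };
type Country = { overallScore: number; dims: Dim[] };

const CONTRIB_TOLERANCE = 1.5;

function detectFormulaMode(
  scoreMap: Map<string, Country>,
): 'linear' | 'pillar-combine' | 'unknown' {
  const eligible = [...scoreMap.values()].filter((c) => {
    const weightSum = c.dims.reduce((s, d) => s + d.weight, 0);
    // (a) complete response, (b) stable share math
    return Math.abs(weightSum - 1) <= 0.05 && c.dims.every((d) => d.coverage >= 0.9);
  });
  if (eligible.length < 3) return 'unknown';
  const drifted = eligible.filter((c) => {
    const linear = c.dims.reduce((s, d) => s + d.score * d.weight, 0);
    return Math.abs(linear - c.overallScore) > CONTRIB_TOLERANCE;
  });
  return drifted.length / eligible.length > 0.5 ? 'pillar-combine' : 'linear';
}
```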

Verified:
- `tests/audit-cohort-formula-detection.test.mts` (NEW) — 3 child-process
  smoke tests: missing-members banner + STRICT exit 3, all-clear exit 0,
  pillar-mode banner + STRICT exit 4. All pass.
- `npx tsx --test tests/resilience-construct-invariants.test.mts
   tests/audit-cohort-formula-detection.test.mts` — 15 pass / 0 fail
- `npm run test:data` — 6709 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — no warnings on new / changed files
  (a refactor reduced buildReport complexity from 51 → under 50 by
  extracting `renderCohortSection` + `renderDimCell`)
- Fixture smoke: AE.importConcentration coverage-outlier and
  GCC.externalDebtCoverage saturated-high flags still fire correctly.

* fix(resilience): PR 0 review — fixture-mode source label, try/catch country-names, ASCII minus

Addresses 3 P2 Greptile findings on #3369:

1. **Misleading Source: line in fixture mode.** `FIXTURE_PATH` sets
   `API_BASE=''`, so the report header showed a bare "/api/..." path that
   never resolved — making a fixture run visually indistinguishable from
   a live run. Now surfaces `Source: fixture://<path>` in fixture mode.

2. **`loadCountryNameMap` crashes without useful diagnostics.** A missing
   or unparseable `shared/country-names.json` produced a raw unhandled
   rejection. Now the read and the parse are each wrapped in their own
   try/catch; on either failure the script logs a developer-friendly
   warning and falls back to ISO-2 codes (report shows "AE" instead of
   "Uae"). Keeps the audit operable in CI-offline scenarios.

3. **Unicode minus `−` (U+2212) instead of ASCII `-` in `fmtDelta`.**
   Downstream operators diff / grep / CSV-pipe the report; the Unicode
   minus breaks byte-level text tooling. Replaced with ASCII hyphen-
   minus. Left the U+2212 in the formula-mode diagnostic prose
   (`|Σ contributions − overallScore|`) where it's mathematical notation,
   not data.
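
The data-vs-notation split reduces to a formatter like this (hypothetical shape of `fmtDelta`; the real formatter's precision may differ):

```typescript
// Sketch of an ASCII-safe delta formatter: negative values use ASCII
// hyphen-minus so grep/diff/CSV tooling keeps working byte-for-byte.
// (Hypothetical: the real fmtDelta's format may differ.)
function fmtDelta(d: number): string {
  const sign = d > 0 ? '+' : d < 0 ? '-' : '';
  return `${sign}${Math.abs(d).toFixed(2)}`;
}
```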

Verified:

- `npx tsx --test tests/audit-cohort-formula-detection.test.mts tests/resilience-construct-invariants.test.mts` — 15 pass / 0 fail
- Fixture-mode run produces `Source: fixture://tests/fixtures/...`
- Movers-table negative deltas now use ASCII `-`
2026-04-24 18:13:22 +04:00
Elie Habib
34dfc9a451 fix(news): ground LLM surfaces on real RSS description end-to-end (#3370)
* feat(news/parser): extract RSS/Atom description for LLM grounding (U1)

Add description field to ParsedItem, extract from the first non-empty of
description/content:encoded (RSS) or summary/content (Atom), picking the
longest after HTML-strip + entity-decode + whitespace-normalize. Clip to
400 chars. Reject empty, <40 chars after strip, or normalize-equal to the
headline — downstream consumers fall back to the cleaned headline on '',
preserving current behavior for feeds without a description.
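
The acceptance gate can be sketched as follows (hypothetical helper; the real parser also entity-decodes and handles CDATA):

```typescript
// Sketch of the U1 gate: strip HTML, normalize whitespace, pick the
// longest candidate, reject too-short or headline-duplicate results,
// clip to 400 chars. Empty string means "fall back to the headline."
const MAX_DESC = 400;

const normalize = (s: string) =>
  s.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();

function extractDescription(candidates: string[], headline: string): string {
  const cleaned = candidates.map(normalize).filter(Boolean);
  if (cleaned.length === 0) return '';
  const longest = cleaned.reduce((a, b) => (b.length > a.length ? b : a));
  if (longest.length < 40) return ''; // too short to ground an LLM
  if (longest.toLowerCase() === normalize(headline).toLowerCase()) return '';
  return longest.slice(0, MAX_DESC);
}
```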

CDATA end is anchored to the closing tag so internal ]]> sequences do not
truncate the match. Preserves cached rss:feed:v1 row compatibility during
the 1h TTL bleed since the field is additive.

Part of fix: pipe RSS description end-to-end so LLM surfaces stop
hallucinating named actors (docs/plans/2026-04-24-001-...).

Covers R1, R7.

* feat(news/story-track): persist description on story:track:v1 HSET (U2)

Append description to the story:track:v1 HSET only when non-empty. Additive
— no key version bump. Old rows and rows from feeds without a description
return undefined on HGETALL, letting downstream readers fall back to the
cleaned headline (R6).

Extract buildStoryTrackHsetFields as a pure helper so the inclusion gate is
unit-testable without Redis.

Update the contract comment in cache-keys.ts so the next reader of the
schema sees description as an optional field.

Covers R2, R6.

* feat(proto): NewsItem.snippet + SummarizeArticleRequest.bodies (U3)

Add two additive proto fields so the article description can ride to every
LLM-adjacent consumer without a breaking change:

- NewsItem.snippet (field 12): RSS/Atom description, HTML-stripped,
  ≤400 chars, empty when unavailable. Wired on toProtoItem.
- SummarizeArticleRequest.bodies (field 8): optional article bodies
  paired 1:1 with headlines for prompt grounding. Empty array is today's
  headline-only behavior.

Regenerated TS client/server stubs and OpenAPI YAML/JSON via sebuf v0.11.1
(PATH=~/go/bin required — Homebrew's protoc-gen-openapiv3 is an older
pre-bundle-mode build that collides on duplicate emission).

Pre-emptive bodies:[] placeholders at the two existing SummarizeArticle
call sites in src/services/summarization.ts; U6 replaces them with real
article bodies once SummarizeArticle handler reads the field.

Covers R3, R5.

* feat(brief/digest): forward RSS description end-to-end through brief envelope (U4)

Digest accumulator reader (seed-digest-notifications.mjs::buildDigest) now
plumbs the optional `description` field off each story:track:v1 HGETALL into
the digest story object. The brief adapter (brief-compose.mjs::
digestStoryToUpstreamTopStory) prefers the real RSS description over the
cleaned headline; when the upstream row has no description (old rows in the
48h bleed, feeds that don't carry one), we fall back to the cleaned headline
so today's behavior is preserved (R6).

This is the upstream half of the description cache path. U5 lands the LLM-
side grounding + cache-prefix bump so Gemini actually sees the article body
instead of hallucinating a named actor from the headline.

Covers R4 (upstream half), R6.

* feat(brief/llm): RSS grounding + sanitisation + 4 cache prefix bumps (U5)

The actual fix for the headline-only named-actor hallucination class:
Gemini 2.5 Flash now receives the real article body as grounding context,
so it paraphrases what the article says instead of filling role-label
headlines from parametric priors ("Iran's new supreme leader" → "Ali
Khamenei" was the 2026-04-24 reproduction; with grounding, it becomes
the actual article-named actor).

Changes:

- buildStoryDescriptionPrompt interpolates a `Context: <body>` line
  between the metadata block and the "One editorial sentence" instruction
  when description is non-empty AND not normalise-equal to the headline.
  Clips to 400 chars as a second belt-and-braces after the U1 parser cap.
  No Context line → identical prompt to pre-fix (R6 preserved).

- sanitizeStoryForPrompt extended to cover `description`. Closes the
  asymmetry where whyMatters was sanitised and description wasn't —
  untrusted RSS bodies now flow through the same injection-marker
  neutraliser before prompt interpolation. generateStoryDescription wraps
  the story in sanitizeStoryForPrompt before calling the builder,
  matching generateWhyMatters.

- Four cache prefixes bumped atomically to evict pre-grounding rows:
    scripts/lib/brief-llm.mjs:
      brief:llm:description:v1 → v2  (Railway, description path)
      brief:llm:whymatters:v2 → v3   (Railway, whyMatters fallback)
    api/internal/brief-why-matters.ts:
      brief:llm:whymatters:v6 → v7                (edge, primary)
      brief:llm:whymatters:shadow:v4 → shadow:v5  (edge, shadow)
  hashBriefStory already includes description in the 6-field material
  (v5 contract) so identity naturally drifts; the prefix bump is the
  belt-and-braces that guarantees a clean cold-start on first tick.

- Tests: 8 new + 2 prefix-match updates on tests/brief-llm.test.mjs.
  Covers Context-line injection, empty/dup-of-headline rejection,
  400-char clip, sanitisation of adversarial descriptions, v2 write,
  and legacy-v1 row dark (forced cold-start).

Covers R4 + new sanitisation requirement.
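
The Context-line gate reduces to something like this (a sketch with hypothetical prompt text; the real builder's wording differs):

```typescript
// Sketch of the grounding gate: only a real, non-headline-duplicate
// description changes the prompt. No Context line → prompt identical
// to pre-fix (R6 preserved).
const norm = (s: string) => s.replace(/\s+/g, ' ').trim().toLowerCase();

function contextLine(headline: string, description: string): string {
  const d = description.trim();
  if (!d || norm(d) === norm(headline)) return '';
  // Belt-and-braces 400-char clip on top of the parser cap.
  return `Context: ${d.slice(0, 400)}\n`;
}

function buildStoryDescriptionPrompt(headline: string, description: string): string {
  return `Story: ${headline}\n${contextLine(headline, description)}One editorial sentence:`;
}
```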

* feat(news/summarize): accept bodies + bump summary cache v5→v6 (U6)

SummarizeArticle now grounds on per-headline article bodies when callers
supply them, so the dashboard "News summary" path stops hallucinating
across unrelated headlines when the upstream RSS carried context.

Three coordinated changes:

1. SummarizeArticleRequest handler reads req.bodies, sanitises each entry
   through sanitizeForPrompt (same trust treatment as geoContext — bodies
   are untrusted RSS text), clips to 400 chars, and pads to the headlines
   length so pair-wise identity is stable.

2. buildArticlePrompts accepts optional bodies and interleaves a
   `    Context: <body>` line under each numbered headline that has a
   non-empty body. Skipped in translate mode (headline[0]-only) and when
   all bodies are empty — yielding a byte-identical prompt to pre-U6
   for every current caller (R6 preserved).

3. summary-cache-key bumps CACHE_VERSION v5→v6 so the pre-grounding rows
   (produced from headline-only prompts) cold-start cleanly. Extends
   canonicalizeSummaryInputs + buildSummaryCacheKey with a pair-wise
   bodies segment `:bd<hash>`; the prefix is `:bd` rather than `:b` to
   avoid colliding with `:brief:` when pattern-matching keys. Translate
   mode is headline[0]-only and intentionally does not shift on bodies.

Dedup reorder preserved: the handler re-pairs bodies to the deduplicated
top-5 via findIndex, so layout matches without breaking cache identity.
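
The key shape can be sketched as follows (hypothetical hashing details; the real canonicalizer covers more inputs):

```typescript
// Sketch of the v6 key: a pair-wise bodies segment (:bd<hash>) appended
// only when any body is non-empty, so headline-only calls keep their
// existing identity. ':bd', not ':b', avoids colliding with ':brief:'.
import { createHash } from 'node:crypto';

const CACHE_VERSION = 'v6';

const h12 = (s: string) =>
  createHash('sha256').update(s).digest('hex').slice(0, 12);

function buildSummaryCacheKey(headlines: string[], bodies: string[] = []): string {
  const pairs = headlines
    .map((h, i) => ({ h, b: (bodies[i] ?? '').slice(0, 400) }))
    // Pair-wise sort, tie-breaking on body for stable order.
    .sort((a, b) => a.h.localeCompare(b.h) || a.b.localeCompare(b.b));
  let key = `summary:${CACHE_VERSION}:${h12(pairs.map((p) => p.h).join('\n'))}`;
  if (pairs.some((p) => p.b)) {
    key += `:bd${h12(pairs.map((p) => p.b).join('\n'))}`;
  }
  return key;
}
```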

New tests: 7 on buildArticlePrompts (bodies interleave, partial fill,
translate-mode skip, clip, short-array tolerance), 8 on
buildSummaryCacheKey (pair-wise sort, cache-bust on body drift, translate
skip). Existing summary-cache-key assertions updated v5→v6.

Covers R3, R4.

* feat(consumers): surface RSS snippet across dashboard, email, relay, MCP + audit (U7)

Thread the RSS description from the ingestion path (U1-U5) into every
user-facing LLM-adjacent surface. Audit the notification producers so
RSS-origin and domain-origin events stay on distinct contracts.

Dashboard (proto snippet → client → panel):
- src/types/index.ts NewsItem.snippet?:string (client-side field).
- src/app/data-loader.ts proto→client mapper propagates p.snippet.
- src/components/NewsPanel.ts renders snippet as a truncated (~200 chars,
  word-boundary ellipsis) `.item-snippet` line under each headline.
- NewsPanel.currentBodies tracks per-headline bodies paired 1:1 with
  currentHeadlines; passed as options.bodies to generateSummary so the
  server-side SummarizeArticle LLM grounds on the article body.

Summary plumbing:
- src/services/summarization.ts threads bodies through SummarizeOptions
  → generateSummary → runApiChain → tryApiProvider; cache key now includes
  bodies (via U6's buildSummaryCacheKey signature).

MCP world-brief:
- api/mcp.ts pairs headlines with their RSS snippets and POSTs `bodies`
  to /api/news/v1/summarize-article so the MCP tool surface is no longer
  starved of grounding context.

Email digest:
- scripts/seed-digest-notifications.mjs plain-text formatDigest appends
  a ~200-char truncated snippet line under each story; HTML formatDigestHtml
  renders a dim-grey description div between title and meta. Both gated
  on non-empty description (R6 — empty → today's behavior).

Real-time alerts:
- src/services/breaking-news-alerts.ts BreakingAlert gains optional
  description; checkBatchForBreakingAlerts reads item.snippet; dispatchAlert
  includes `description` in the /api/notify payload when present.

Notification relay:
- scripts/notification-relay.cjs formatMessage gated on
  NOTIFY_RELAY_INCLUDE_SNIPPET=1 (default off). When on, RSS-origin
  payloads render a `> <snippet>` context line under the title. When off
  or payload.description absent, output is byte-identical to pre-U7.

Audit (RSS vs domain):
- tests/notification-relay-payload-audit.test.mjs enforces file-level
  @notification-source tags on every producer, rejects `description:` in
  domain-origin payload blocks, and verifies the relay codepath gates
  snippet rendering under the flag.
- Tag added to ais-relay.cjs (domain), seed-aviation.mjs (domain),
  alert-emitter.mjs (domain), breaking-news-alerts.ts (rss).

Deferred (plan explicitly flags): InsightsPanel + cluster-producer
plumbing (bodies default to [] — will unlock gradually once news:insights:v1
producer also carries primarySnippet).

Covers R5, R6.

* docs+test: grounding-path note + bump pinned CACHE_VERSION v5→v6 (U8)

Final verification for the RSS-description-end-to-end fix:

- docs/architecture.mdx — one-paragraph "News Grounding Pipeline"
  subsection tracing parser → story:track:v1.description → NewsItem.snippet
  → brief / SummarizeArticle / dashboard / email / relay / MCP, with the
  empty-description R6 fallback rule called out explicitly.
- tests/summarize-reasoning.test.mjs — Fix-4 static-analysis pin updated
  to match the v6 bump from U6. Without this the summary cache bump silently
  regressed CI's pinned-version assertion.

Final sweep (2026-04-24):
- grep -rn 'brief:llm:description:v1' → only in the U5 legacy-row test
  simulation (by design: proves the v2 bump forces cold-start).
- grep -rn 'brief:llm:whymatters:v2/v6/shadow:v4' → no live references.
- grep -rn 'summary:v5' → no references.
- CACHE_VERSION = 'v6' in src/utils/summary-cache-key.ts.
- Full tsx --test sweep across all tests/*.test.{mjs,mts}: 6747/6747 pass.
- npm run typecheck + typecheck:api: both clean.

Covers R4, R6, R7.

* fix(rss-description): address /ce:review findings before merge

14 fixes from structured code review across 13 reviewer personas.

Correctness-critical (P1 — fixes that prevent R6/U7 contract violations):
- NewsPanel signature covers currentBodies so view-mode toggles that leave
  headlines identical but bodies different now invalidate in-flight summaries.
  Without this, switching renderItems → renderClusters mid-summary let a
  grounded response arrive under a stale (now-orphaned) cache key.
- summarize-article.ts re-pairs bodies with headlines BEFORE dedup via a
  single zip-sanitize-filter-dedup pass. Previously bodies[] was indexed by
  position in light-sanitized headlines while findIndex looked up the
  full-sanitized array — any headline that sanitizeHeadlines emptied
  mispaired every subsequent body, grounding the LLM on the wrong story.
- Client skips the pre-chain cache lookup when bodies are present, since
  client builds keys from RAW bodies while server sanitizes first. The
  keys diverge on injection content, which would silently miss the
  server's authoritative cache every call.
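
The single zip-sanitize-filter-dedup pass above can be sketched as (hypothetical helper name):

```typescript
// Sketch of the re-pairing fix: bodies stay zipped to their headline
// through every drop, so a sanitized-away headline can no longer shift
// every subsequent body onto the wrong story.
function pairAndDedup(
  headlines: string[],
  bodies: string[],
  sanitize: (s: string) => string,
): Array<{ headline: string; body: string }> {
  const seen = new Set<string>();
  const out: Array<{ headline: string; body: string }> = [];
  for (let i = 0; i < headlines.length; i++) {
    const headline = sanitize(headlines[i] ?? '');
    if (!headline) continue; // filter: the paired body drops with it
    const key = headline.toLowerCase();
    if (seen.has(key)) continue; // dedup on the sanitized headline
    seen.add(key);
    out.push({ headline, body: bodies[i] ?? '' });
  }
  return out;
}
```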

Test + audit hardening:
- Legacy v1 eviction test now uses the real hashBriefStory(story()) suffix
  instead of a literal "somehash", so a bug where the reader still queried
  the v1 prefix at the real key would actually be caught.
- tests/summary-cache-key.test.mts adds 400-char clip identity coverage so
  the canonicalizer's clip and any downstream clip can't silently drift.
- tests/news-rss-description-extract.test.mts renames the well-formed
  CDATA test and adds a new test documenting the malformed-]]> fallback
  behavior (plain regex captures, article content survives).

Safe_auto cleanups:
- Deleted dead SNIPPET_PUSH_MAX constant in notification-relay.cjs.
- BETA-mode groq warm call now passes bodies, warming the right cache slot.
- seed-digest shares a local normalize-equality helper for description !=
  headline comparison, matching the parser's contract.
- Pair-wise sort in summary-cache-key tie-breaks on body so duplicate
  headlines produce stable order across runs.
- buildSummaryCacheKey gained JSDoc documenting the client/server contract
  and the bodies parameter semantics.
- MCP get_world_brief tool description now mentions RSS article-body
  grounding so calling agents see the current contract.
- _shared.ts `opts.bodies![i]!` double-bang replaced with `?? ''`.
- extractRawTagBody regexes cached in module-level Map, mirroring the
  existing TAG_REGEX_CACHE pattern.

Deferred to follow-up (tracked for PR description / separate issue):
- Promote shared MAX_BODY constant across the 5 clip sites
- Promote shared truncateForDisplay helper across 4 render sites
- Collapse NewsPanel.{currentHeadlines, currentBodies} → Array<{title, snippet}>
- Promote sanitizeStoryForPrompt to shared/brief-llm-core.js
- Split list-feed-digest.ts parser helpers into sibling -utils.ts
- Strengthen audit test: forward-sweep + behavioral gate test

Tests: 6749/6749 pass. Typecheck clean on both configs.

* fix(summarization): thread bodies through browser T5 path (Codex #2)

Addresses the second of two Codex-raised findings on PR #3370:

The PR threaded bodies through the server-side API provider chain
(Ollama → Groq → OpenRouter → /api/news/v1/summarize-article) but the
local browser T5 path at tryBrowserT5 was still summarising from
headlines alone. In BETA_MODE that ungrounded path runs BEFORE the
grounded server providers; in normal mode it remains the last
fallback. Whenever T5-small won, the dashboard summary surface
regressed to the headline-only path — the exact hallucination class
this PR exists to eliminate.

Fix: tryBrowserT5 accepts an optional `bodies` parameter and
interleaves each body with its paired headline via a `headline —
body` separator in the combined text (clipped to 200 chars per body
to stay within T5-small's ~512-token context window). All three call
sites (BETA warm, BETA cold, normal-mode fallback) now pass the
bodies threaded down from generateSummary options.bodies.

When bodies is empty/omitted, the combined text is byte-identical to
pre-fix (R6 preserved).
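
The interleave reduces to something like this (a sketch; the real joiner and clipping site may differ):

```typescript
// Sketch of the T5 combined-text interleave: each body rides under its
// paired headline, clipped to 200 chars for T5-small's context window.
// With no bodies the output is byte-identical to the headline-only join.
function buildT5Input(headlines: string[], bodies: string[] = []): string {
  return headlines
    .map((h, i) => {
      const b = (bodies[i] ?? '').slice(0, 200);
      // The 'headline — body' separator matches the commit's description.
      return b ? `${h} — ${b}` : h;
    })
    .join('. ');
}
```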

On Codex finding #1 (story:track:v1 additive-only HSET keeps a body
from an earlier mention of the same normalized title), declining to
change. The current rule — "if this mention has a body, overwrite;
otherwise leave the prior body alone" — is defensible: a body from
mention A is not falsified by mention B being body-less (a wire
reprint doesn't invalidate the original source's body). A feed that
publishes a corrected headline creates a new normalized-title hash,
so no stale body carries forward. The failure window is narrow (live
story evolving while keeping the same title through hours of
body-less wire reprints) and the 7-day STORY_TTL is the backstop.
Opening a follow-up issue to revisit semantics if real-world evidence
surfaces a stale-grounding case.

* fix(story-track): description always-written to overwrite stale bodies (Codex #1)

Revisiting Codex finding #1 on PR #3370 after re-review. The previous
response declined the fix with reasoning; on reflection the argument
was over-defending the current behavior.

Problem: buildStoryTrackHsetFields previously wrote `description` only
when non-empty. Because story:track:v1 rows are collapsed by
normalized-title hash, an earlier mention's body would persist for up
to STORY_TTL (7 days) on subsequent body-less mentions of the same
story. Consumers reading `track.description` via HGETALL could not
distinguish "this mention's body" from "some mention's body from the
last week," silently grounding brief / whyMatters / SummarizeArticle
LLMs on text the current mention never supplied. That violates the
grounding contract advertised to every downstream surface in this PR.

Fix: HSET `description` unconditionally on every mention — empty
string when the current item has no body, real body when it does. An
empty value overwrites any prior mention's body so the row is always
authoritative for the current cycle. Consumers continue to treat
empty description as "fall back to cleaned headline" (R6 preserved).
The 7-day STORY_TTL and normalized-title hash semantics are unchanged.

Trade-off accepted: a valid body from Feed A (NYT) is wiped when Feed
B (AP body-less wire reprint) arrives for the same normalized title,
even though Feed A's body is factually correct. Rationale: the
alternative — keeping Feed A's body indefinitely — means the user
sees Feed A's body attributed (by proximity) to an AP mention at a
later timestamp, which is at minimum misleading and at worst carries
retracted/corrected details. Honest absence beats unlabeled presence.

Tests: new stale-body overwrite sequence test (T0 body → T1 empty →
T2 new body), existing "writes description when non-empty" preserved,
existing "omits when empty" inverted to "writes empty, overwriting."
cache-keys.ts contract comment updated to mark description as
always-written rather than optional.
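
The always-write rule can be sketched as (hypothetical field set; the real HSET carries more fields):

```typescript
// Sketch of the always-write contract: description is present on every
// HSET, so an empty value overwrites any prior mention's stale body.
type Mention = { title: string; url: string; description?: string };

function buildStoryTrackHsetFields(m: Mention): Record<string, string> {
  return {
    title: m.title,
    url: m.url,
    // '' means "this mention carried no body" and is authoritative for
    // the current cycle; consumers fall back to the cleaned headline.
    description: m.description ?? '',
  };
}
```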
2026-04-24 16:25:14 +04:00
Elie Habib
959086fd45 fix(panels): address Greptile P2 review on #3364 (icons + category map) (#3365)
Two P2 findings on the now-merged #3364:

1. Icon encoding inconsistency. The two new pipeline/storage entries
   mixed '\u{1F6E2}' + raw VS16 (U+FE0F) while every other panel in
   the file uses '\u{1F6E2}️'. Same runtime glyph, but mixed
   encoding is lint-noisy. Normalize to the escaped form.

2. PANEL_CATEGORY_MAP gap. pipeline-status, storage-facility-map and
   fuel-shortages were registered in FULL_PANELS + CMD+K but absent
   from PANEL_CATEGORY_MAP, so users browsing the settings category
   picker didn't see them. Add to marketsFinance alongside
   energy-complex. While here, close the same pre-existing gap for
   hormuz-tracker and energy-crisis — reviewer explicitly called
   these out as worth addressing together.

The third P2 (spr keyword collision with oil-inventories) was fixed
in commit 83de09fe1 before the review finalised.
2026-04-24 09:42:40 +04:00
Elie Habib
d521924253 fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs (#3363)
* fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs

PR #3289 shipped the v2 energy construct behind RESILIENCE_ENERGY_V2_ENABLED
(default false). Audit on 2026-04-24 after the user flagged "AE only moved
1.49 points — we added nuclear credit, we should see more" revealed two
safety gaps that made a future flag flip unsafe:

1. scoreEnergyV2 silently fell back to IMPUTE when any of its three
   required Redis seeds (low-carbon-generation, fossil-electricity-share,
   power-losses) was null. A future operator flipping the flag with
   seeds absent would produce fabricated-looking numbers for every
   country with zero operator signal.

2. api/health.js had those three seed labels in BOTH SEED_META (CRIT on
   missing) AND ON_DEMAND_KEYS (which demotes CRIT to WARN). The demotion
   won. Health has been reporting WARNING on a scorer dependency that has
   been 100% missing since PR #3289 merged — no paging trail existed.

Changes:

  server/worldmonitor/resilience/v1/_dimension-scorers.ts
    - Add ResilienceConfigurationError with missingKeys[] payload.
    - scoreEnergy: preflight the three v2 seeds when flag=true. Throw
      ResilienceConfigurationError listing the specific absent keys.
    - scoreAllDimensions: wrap per-dimension dispatch in try/catch so a
      thrown ResilienceConfigurationError routes to the source-failure
      shape (imputationClass='source-failure', coverage=0) for that ONE
      dimension — country keeps scoring other dims normally. Log once
      per country-dimension pair so the gap is audit-traceable.

  api/health.js
    - Remove lowCarbonGeneration / fossilElectricityShare / powerLosses
      from ON_DEMAND_KEYS. They stay in BOOTSTRAP_KEYS + SEED_META.
    - Replace the transitional comment with a hard "do NOT add these
      back" note pointing at the scorer's fail-closed gate.

  tests/resilience-energy-v2.test.mts
    - New test: flag on + ALL three seeds missing → throws
      ResilienceConfigurationError naming all three keys.
    - New test: flag on + only one seed missing → throws naming ONLY
      the missing key (operator-clarity guard).
    - New test: flag on + all seeds present → v2 runs normally.
    - Update the file-level invariant comment to reflect the new
      fail-closed contract (replacing the prior "degrade gracefully"
      wording that codified the silent-IMPUTE bug).
    - Note: fixture's `??` fallbacks coerce null-overrides into real
      data, so the preflight tests use a direct-reader helper.

  docs/methodology/country-resilience-index.mdx
    - New "Fail-closed semantics" paragraph in the v2 Energy section
      documenting the throw + source-failure + health-CRIT contract.
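
The fail-closed gate can be sketched as follows (hypothetical seed-reader shape; the real scorer returns full dimension payloads):

```typescript
// Sketch of the preflight + routing: missing required seeds throw a
// typed error naming exactly the absent keys, and the dispatcher routes
// that ONE dimension to the source-failure shape.
class ResilienceConfigurationError extends Error {
  constructor(public missingKeys: string[]) {
    super(`resilience v2 energy seeds missing: ${missingKeys.join(', ')}`);
  }
}

const REQUIRED_V2_SEEDS = [
  'low-carbon-generation',
  'fossil-electricity-share',
  'power-losses',
];

type DimResult = {
  imputationClass: 'observed' | 'source-failure';
  coverage: number;
  missing: string[];
};

function preflightEnergyV2(seeds: Map<string, unknown>): void {
  const missing = REQUIRED_V2_SEEDS.filter((k) => seeds.get(k) == null);
  if (missing.length > 0) throw new ResilienceConfigurationError(missing);
}

function scoreDimensionSafe(seeds: Map<string, unknown>): DimResult {
  try {
    preflightEnergyV2(seeds);
    return { imputationClass: 'observed', coverage: 1, missing: [] };
  } catch (err) {
    if (err instanceof ResilienceConfigurationError) {
      return { imputationClass: 'source-failure', coverage: 0, missing: err.missingKeys };
    }
    throw err;
  }
}
```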

Non-goals (intentional):
  - This PR does NOT flip RESILIENCE_ENERGY_V2_ENABLED.
  - This PR does NOT provision seed-bundle-resilience-energy-v2 on Railway.
  - This PR does NOT touch RESILIENCE_PILLAR_COMBINE_ENABLED.

Operational effect post-merge:
  - /api/health flips from WARNING → CRITICAL on the three v2 seed-meta
    entries. That is the intended alarm; it reveals that the Railway
    bundle was never provisioned.
  - scoreEnergy behavior with flag=false is unchanged (legacy path).
  - scoreEnergy behavior with flag=true + seeds present is unchanged.
  - scoreEnergy behavior with flag=true + seeds absent changes from
    "silently IMPUTE all 217 countries" to "source-failure on the
    energy dim for every country, visible in widget + API response".

Tests: 511/511 resilience-* pass. Biome clean. Lint:md clean.

Related plan: docs/plans/2026-04-24-001-fix-resilience-v2-fail-closed-on-missing-seeds-plan.md

* docs(resilience): scrub stale ON_DEMAND_KEYS references for v2 energy seeds

Greptile P2 on PR #3363: four stale references implied the three v2
energy seeds were still gated as ON_DEMAND_KEYS (WARN-on-missing) even
though this PR's api/health.js change removed them (now strict
SEED_META = CRIT on missing). Scrubbing each:

  - api/health.js:196 (BOOTSTRAP_KEYS comment) — was "ON_DEMAND_KEYS
    until Railway cron provisions; see below." Updated to cite plan
    2026-04-24-001 and the strict-SEED_META posture.
  - api/health.js:398 (SEED_META comment) — was "Listed in ON_DEMAND_KEYS
    below until Railway cron provisions..." Updated for same reason.
  - docs/methodology/country-resilience-index.mdx:635 — v2.1 changelog
    entry said seed keys were ON_DEMAND_KEYS until graduation. Replaced
    with the fail-closed contract description.
  - docs/methodology/energy-v2-flag-flip-runbook.md:25 — step 3 said
    "ON_DEMAND_KEYS graduation" was required at flag-flip time.
    Rewrote to explain no graduation step is needed because the
    posture was removed pre-activation.

No code change. Tests still 14/14 on the energy-v2 suite, lint:md clean.

* fix(docs): escape MDX-unsafe `<=` in energy-v2 runbook to unblock Mintlify

Mintlify deploy on PR #3363 failed with
`Unexpected character '=' (U+003D) before name` at
`docs/methodology/energy-v2-flag-flip-runbook.md`. Two lines had
`<=` in plain prose, which MDX tries to parse as a JSX-tag-start.

Replaced both with `≤` (U+2264) — and promoted the two existing `>=`
on adjacent lines to `≥` for consistency. Prose is clearer and MDX
safe.

Same pattern as `mdx-unsafe-patterns-in-md` skill; also adjacent to
PR #3344's `(<137 countries)` fix.
2026-04-24 09:37:18 +04:00
Elie Habib
c517b2fb17 feat(energy-atlas): expose Atlas panels on FULL variant + CMD+K (#3364)
* feat(energy-atlas): expose Atlas panels on FULL variant + CMD+K

Three Atlas panels (PipelineStatusPanel, StorageFacilityMapPanel,
FuelShortagePanel) shipped in PR #3294 but were registered only in
ENERGY_PANELS — invisible on worldmonitor.app because the energy
variant subdomain is not yet wired. Additionally, no CMD+K entries
existed for them, so command-palette search for "pipeline" or
"storage" returned nothing.

Changes:
- src/config/commands.ts: add panel:pipeline-status,
  panel:storage-facility-map, panel:fuel-shortages with relevant
  keywords (oil/gas/nord stream/druzhba/spr/lng/ugs/rationing/…).
- src/config/panels.ts: add the 3 panel keys to FULL_PANELS with
  priority 2 so they appear in the main worldmonitor.app drawer
  under the existing energy-crisis block. ENERGY_PANELS keeps its
  own priority-1 copies so the future energy.worldmonitor.app
  subdomain still surfaces them top of list.

Unblocks the plan's announcement gate item "UI: at least one Atlas
panel renders registry data on worldmonitor.app in a browser."

Part of docs/internal/energy-atlas-registry-expansion.md follow-up.

* fix(cmd-k): resolve spr/lng keyword collisions so Atlas panel wins

Reviewer found that the original PR wiring let CMD+K "spr" and "lng"
route to the wrong panel because matchCommands() (SearchModal.ts:273)
ranks by exact/prefix/substring then keeps array-insertion order on
ties. Storage-atlas was declared AFTER the colliding entries.

Collisions:
- "spr": panel:oil-inventories (line 105) had exact 'spr' → tied
  with the new storage-facility-map (line 108) → insertion order
  kept oil-inventories winning.
- "lng": panel:hormuz-tracker (line 135) has exact 'lng' →
  storage-facility-map only had substring 'lng terminals' (score 1)
  → hormuz won outright.

Fix:
- Remove 'spr' from oil-inventories keywords. The SPR as a *site
  list* semantically belongs to Strategic Storage Atlas. Stock-level
  queries still route to oil-inventories via 'strategic petroleum'
  (the word 'spr' is not a substring of 'strategic petroleum', so
  no fallback score leaks).
- Add exact 'lng' to storage-facility-map. Both it and hormuz-tracker
  now score 3 on 'lng'; stable sort preserves declaration order,
  so storage (line 108) outranks hormuz (line 135). Hormuz still
  matches via 'hormuz', 'strait of hormuz', 'tanker', 'shipping'.
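
The ranking described above reduces to (a sketch of matchCommands' tie-break behavior, not its real implementation):

```typescript
// Sketch of the CMD+K ranking: exact > prefix > substring, with ties
// resolved by declaration order (stable sort / explicit index).
type Command = { id: string; keywords: string[] };

function scoreCommand(cmd: Command, q: string): number {
  if (cmd.keywords.includes(q)) return 3; // exact
  if (cmd.keywords.some((k) => k.startsWith(q))) return 2; // prefix
  if (cmd.keywords.some((k) => k.includes(q))) return 1; // substring
  return 0;
}

function matchCommands(commands: Command[], q: string): Command[] {
  return commands
    .map((cmd, i) => ({ cmd, score: scoreCommand(cmd, q), i }))
    .filter((e) => e.score > 0)
    .sort((a, b) => b.score - a.score || a.i - b.i) // ties keep insertion order
    .map((e) => e.cmd);
}
```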
2026-04-24 09:34:57 +04:00
Elie Habib
b68d98972a fix(unrest): bump GDELT proxy timeout 20s → 45s (#3362)
GDELT's v1 gkg_geojson endpoint is currently responding in ~19s (direct
curl test: HTTP 200 at t=19.4s). With the old 20s proxy timeout the
Decodo leg hits Cloudflare origin timeout and returns HTTP 522 on nearly
every tick, so fetchGdeltEvents throws "both paths failed — proxy:
HTTP/1.1 522 Server Error" and runSeed freezes seed-meta fetchedAt.

Result: the unrest:events seed-meta stops advancing while Redis still
holds the last-good payload — health.js reports STALE_SEED even though
the seeder is running on schedule every 45 min. 4.5+ hours of
consecutive failures observed in production logs overnight.

Direct path has been chronically broken (UND_ERR_CONNECT_TIMEOUT in
every tick since PR #3256 added the proxy fallback), so the proxy is
the real fetch path. Giving it 45s absorbs GDELT's current degraded
response time with headroom, without changing any other behavior.

ACLED credentials remain unconfigured in this environment, so GDELT is
effectively the single upstream — separate ops task to wire ACLED as a
real second source.
2026-04-24 08:52:08 +04:00
Elie Habib
a409d5f79d fix(agent-readiness): WebMCP uses registerTool + static import (#3316) (#3361)
* fix(agent-readiness): WebMCP uses registerTool + static import (#3316)

isitagentready.com reported "No WebMCP tools detected on page load"
on prod. Two compounding bugs in PR #3356:

1) API shape mismatch. Deployed code calls
   navigator.modelContext.provideContext({ tools }), but the scanner
   SKILL and shipping Chrome implementation use
   navigator.modelContext.registerTool(tool, { signal }) per tool with
   AbortController-driven teardown. The older provideContext form is
   kept as a fallback.

2) Dynamic-import timing. The webmcp module was lazy-loaded from a
   deep init phase, so the chunk resolved after the scanner probe
   window elapsed.

Fix:

- Rewrite registerWebMcpTools to prefer registerTool with an
  AbortController. provideContext becomes a legacy fallback. Returns
  the AbortController so teardown paths exist.
- Static-import webmcp in App.ts and call registerWebMcpTools
  synchronously at the start of init, before any await. Bindings
  close over lazy refs so throw-on-null guards still fire correctly
  when a tool is invoked later.

Test additions lock in registerTool-precedes-provideContext ordering,
AbortController pattern, static import, and call-before-first-await.
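
The preference order can be sketched as (hypothetical types; the real module builds full tool descriptors and bindings):

```typescript
// Sketch of the registration order: prefer per-tool
// registerTool(tool, { signal }), fall back to the legacy
// provideContext({ tools }), and return the AbortController so
// teardown paths exist.
type Tool = { name: string };
interface ModelContext {
  registerTool?: (tool: Tool, opts: { signal: AbortSignal }) => void;
  provideContext?: (ctx: { tools: Tool[] }) => void;
}

function registerWebMcpTools(mc: ModelContext | undefined, tools: Tool[]): AbortController | null {
  if (!mc) return null;
  const controller = new AbortController();
  if (typeof mc.registerTool === 'function') {
    for (const tool of tools) mc.registerTool(tool, { signal: controller.signal });
  } else if (typeof mc.provideContext === 'function') {
    mc.provideContext({ tools }); // legacy fallback
  }
  return controller;
}
```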

* fix(agent-readiness): WebMCP readiness wait + teardown on destroy (#3316)

Addresses three findings on PR #3361.

P1 — startup race. Early registration is required for scanner probes,
but a tool invoked during the window between register and Phase-4 UI
init threw "Search modal is not initialised yet." Both scanners and
agents that probe-and-invoke hit this. Bindings now await a uiReady
promise that resolves after searchManager.init and countryIntel.init.
A 10s timeout keeps a broken init from hanging the caller. After
readiness, a still-null target is a real failure and still throws.

Mechanics: App constructor builds uiReady as a Promise with its
resolve stored on the instance; Phase-4 end calls resolveUiReady;
waitForUiReady races uiReady against a timeout; both bindings await it.
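
The mechanics above can be sketched as follows (hypothetical standalone shape; the real App class has far more phases, and the timer cleanup shown here is one way to keep the losing race branch from leaking):

```javascript
class App {
  constructor() {
    // Promise built up front; the resolver is stored on the instance so
    // the end of Phase-4 init can flip it.
    this.uiReady = new Promise((resolve) => {
      this.resolveUiReady = resolve;
    });
  }

  // Called at the end of Phase-4 init (after searchManager/countryIntel init).
  finishInit() {
    this.resolveUiReady();
  }

  // Bindings await this before touching UI state; a broken init cannot
  // hang the caller past timeoutMs.
  waitForUiReady(timeoutMs = 10_000) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('UI init timed out')), timeoutMs);
    });
    return Promise.race([this.uiReady, timeout]).finally(() => clearTimeout(timer));
  }
}
```

After readiness resolves, a still-null target remains a real failure, so the bindings' throw-on-null guards stay meaningful.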

P2 — AbortController was returned and dropped. registerWebMcpTools
returns a controller so callers can unregister on teardown, but App
discarded it. Stored on App now and aborted in destroy, so test
harnesses and SPA re-inits don't accumulate stale registrations.

P2 — test coverage. Added assertions for: bindings await
waitForUiReady before accessing state; resolveUiReady fires after
countryIntel.init; waitForUiReady uses Promise.race with a timeout;
destroy aborts the stored controller. Kept silent-success guard
assertions so bindings still throw when state is absent post-readiness.

Tests: 16 webmcp, 6682 full suite, all green.

* test(webmcp): tighten init()/destroy() regex anchoring (#3316)

Addresses P2 from PR #3361 review. The init() and destroy() body
captures used lazy `[\s\S]+?\n  }` which stops at the first
2-space-indent close brace. An intermediate `}` inside init (e.g.
some exotic scope block) would truncate the slice; the downstream
`.split(/\n\s+await\s/)` would then operate on a smaller string and
could let a refactor slip by without tripping the assertion.

Both regexes now end with a lookahead for the next class member
(`\n\n  (?:public|private) `), so the capture spans the whole method
body regardless of internal braces. If the next-member anchor ever
breaks, the match returns null and the `assert.ok` guard fails
loudly instead of silently accepting a short capture.
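
A toy illustration of the anchoring change (regexes simplified; the real test file's patterns and class source differ):

```javascript
// Class source containing a 2-space-indented "}" inside init()'s body
// (here via a template literal) -- exactly the case that truncates a
// lazy capture.
const source = `class App {
  public init() {
    const css = \`
  }
\`;
    await this.load();
  }

  public destroy() {
    this.controller.abort();
  }

  private helper() {}
}`;

// Lazy capture: stops at the first "\n  }", truncating init's body.
const lazy = source.match(/public init\(\) \{([\s\S]+?)\n  \}/);

// Lookahead-anchored: the close brace must be followed by the next class
// member, so internal braces cannot truncate the capture.
const anchored = source.match(
  /public init\(\) \{([\s\S]+?)\n  \}(?=\n\n  (?:public|private) )/
);
```

With the lazy form the captured body ends before `await this.load()`; with the lookahead anchor the capture spans the whole method.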

P1 (AbortController silently dropped) was already addressed in
f3bbd2170 — `this.webMcpController` is stored and destroy() aborts
it. Greptile reviewed the first push.
2026-04-24 08:21:07 +04:00
Elie Habib
38f7002f19 fix(checkout): entitlement watchdog unblocks Dodo wallet-return deadlock (#3357)
* fix(checkout): entitlement watchdog unblocks Dodo wallet-return deadlock

Buyers completing a Dodo checkout on the subscription-trial flow get
stranded on Dodo's "Payment successful" page indefinitely. HAR evidence
(session cks_0NdL9xlzrFFNivgTeGFU9 / pay_0NdLA3yIfX3BVDoXrFltx, live):
after 3DS succeeds, Dodo's iframe navigates to
/status/{id}/wallet-return?status=succeeded and then emits nothing --
no checkout.status, no checkout.redirect_requested postMessage. Our
onEvent handler never runs, so onSuccess / banner / redirect never
fire. Prior PRs #3298, #3346, #3354 all depended on Dodo emitting a
terminal event; this path emits none.

Fix: merchant-side entitlement watchdog in both the /pro bundle
(pro-test/src/services/checkout.ts) and the dashboard bundle
(src/services/checkout.ts). When the overlay is open, poll
/api/me/entitlement every 3s with a 10min cap. When the webhook flips
the user to pro, close the stuck overlay and run the post-checkout
side effects -- independent of whatever Dodo's iframe does. Existing
event-driven paths are preserved unchanged (they remain the fast path
for non-wallet-return checkouts); the watchdog is the floor.

Idempotency via a successFired closure flag; both the event handler
and the watchdog route through the same runTerminalSuccessSideEffects
function, making double-fires impossible. checkout.closed stops the
watchdog cleanly on cancel.

Observability: Sentry breadcrumb with reason tag on every terminal
success, plus captureMessage at info level when the watchdog resolves
it -- countable signal for prevalence tracking while Dodo investigates.

Rebuilt public/pro/ bundle (index-CiMZEtgt.js to index-QpSvSkuY.js).

Plan: docs/plans/2026-04-23-002-fix-dodo-checkout-entitlement-watchdog-plan.md
Skill: .claude/skills/dodo-wallet-return-skips-postmessage/SKILL.md

* fix(checkout): stop watchdog on destroyCheckoutOverlay to prevent orphan side effects

Greptile P1 on #3357. destroyCheckoutOverlay cleared initialized and
onSuccessCallback but never called _resetOverlaySession, so if the
dashboard layout unmounted mid-checkout the watchdog setInterval kept
running inside the closed-over scope. On entitlement flip, the orphaned
watchdog would fire clearCheckoutAttempt / clearPendingCheckoutIntent /
markPostCheckout / safeCloseOverlay against whatever session was active
by then -- stepping on a new checkout's state or silently closing a
fresh overlay.

Fix: call _resetOverlaySession before dropping references, and null it
out after. _resetOverlaySession is the only accessor for the closure's
stopWatchdog so it must run before the module-scoped slot is cleared.

* test(checkout): extract testable entitlement watchdog + state-machine tests

Greptile residual risk on #3357: the watchdog state machine had no
targeted automated coverage, especially the wallet-return path where
no terminal Dodo event arrives and success is detected only via
entitlement polling.

Extract the watchdog into src/services/entitlement-watchdog.ts as a
pure DI module (fetch / setInterval / clock / token source / onPro
all injected). Mirror the file at
pro-test/src/services/entitlement-watchdog.ts since the two bundles
have no cross-root imports (pro-test alias '@' resolves to pro-test
root only). Both src/services/checkout.ts and
pro-test/src/services/checkout.ts now consume
createEntitlementWatchdog instead of inlining setInterval.


Tests cover the wallet-return scenario explicitly plus the full state
matrix:
- wallet-return path: isPro flips to true -> onPro fires exactly once
- timeout cap: isPro stays false past timeoutMs -> self-terminate
  WITHOUT firing onPro
- missing token: tick no-ops, poller keeps trying
- non-2xx response (401/5xx): tick swallows, poller continues
- fetch rejection: tick swallows, poller continues
- idempotence: onPro never fires twice across consecutive pro ticks
- stop(): clears interval immediately, onPro never called
- double-start while active: second start is a no-op
- start after prior onPro: no-op (post-success reuse guard)
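
The state matrix above can be sketched as a DI factory (hedged shape; the real entitlement-watchdog.ts may differ in field names and response handling):

```javascript
function createEntitlementWatchdog({
  fetchFn,             // injected fetch
  setIntervalFn,       // injected timers, fakeable in tests
  clearIntervalFn,
  now,                 // injected clock, ms
  getToken,            // token source; null token => tick no-ops
  onPro,               // success callback, fired at most once
  intervalMs = 3_000,
  timeoutMs = 600_000, // 10min cap
}) {
  let timer = null;
  let fired = false;

  async function tick(startedAt) {
    if (now() - startedAt > timeoutMs) return stop(); // cap: self-terminate, no onPro
    const token = getToken();
    if (!token) return; // keep polling until a token exists
    try {
      const res = await fetchFn('/api/me/entitlement', {
        headers: { Authorization: `Bearer ${token}` },
      });
      if (!res.ok) return; // 401/5xx: swallow, poller continues
      const body = await res.json();
      if (body.isPro && !fired) {
        fired = true;
        stop();
        onPro(); // exactly once
      }
    } catch {
      // fetch rejection: swallow, poller continues
    }
  }

  function start() {
    if (timer !== null || fired) return; // double-start / post-success: no-op
    const startedAt = now();
    timer = setIntervalFn(() => tick(startedAt), intervalMs);
  }

  function stop() {
    if (timer !== null) {
      clearIntervalFn(timer);
      timer = null;
    }
  }

  return { start, stop };
}
```

Because every side effect is injected, the wallet-return scenario (no terminal Dodo event, success only via polling) is testable with a fake clock and a captured interval callback.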

Parity test (tests/entitlement-watchdog-parity.test.mts) asserts the
two mirror files are byte-identical so drift alarms at CI time.

Rebuilt public/pro/ bundle (index-QpSvSkuY.js -> index-C-qy2Yt9.js).
2026-04-24 07:53:51 +04:00
Elie Habib
5cec1b8c4c fix(insights): trust cluster rank, stop LLM from re-picking top story (#3358)
* fix(insights): trust cluster rank, stop LLM from re-picking top story

WORLD BRIEF panel published "Iran's new supreme leader was seriously
wounded, leading him to delegate power to the Revolutionary Guards. This
development comes amid an ongoing war with Israel." to every visitor for
3h. Payload: openrouter / gemini-2.5-flash.

Root cause: callLLM sent all 10 clustered headlines with "pick the ONE
most significant and summarize ONLY that story". Clustering ranked
Lebanon journalist killing #1 (2 corroborating sources); News24 Iran
rumor ranked #3 (1 source). Gemini overrode the rank, picked #3, and
embellished with war framing from story #4. Objective rank (sourceCount,
velocity, isAlert) lost to model vibe.

Shrink the LLM's job to phrasing. Clustering already ranks — pass only
topStories[0].primaryTitle and instruct the model to rewrite it using
ONLY facts from the headline. No name/place/context invention.

Also:
- temperature 0.3 -> 0.1 (factual summary, not creative)
- CACHE_TTL 3h -> 30m so a bad brief ages out in one cron cycle
- Drop dead MAX_HEADLINES const

Payload shape unchanged; frontend untouched.

* fix(insights): corroboration gate + revert TTL + drop unconditional WHERE

Follow-up to review feedback on the ranking contract, TTL, and prompt:

1. Corroboration gate (P1a). scoreImportance() in scripts/_clustering.mjs
   is keyword-heavy (violence +125 on a single word, flashpoint +75, ^1.5
   multiplier when both hit), so a single-source sensational rumor can
   outrank a 2-source lead purely on lexical signals. Blindly trusting
   topStories[0] would let the ranker's keyword bias still pick bad
   stories. Walk topStories for sourceCount >= 2 instead — corroboration
   becomes a hard requirement, not a tiebreaker. If no cluster qualifies,
   publish status=degraded with no brief (frontend already handles this).

2. CACHE_TTL back to 10800 (P1b). 30m TTL == one cron cadence means the
   key expires on any missed or delayed run and /api/bootstrap loses
   insights entirely (api/bootstrap.js reads news:insights:v1 directly,
   no LKG across TTL-gap). The short TTL was defense-in-depth for bad
   content; the real safety is now upstream (corroboration gate + grounded
   prompt), so the LKG window doesn't need to be sacrificed for it.

3. Prompt: location conditional (P2). "Use ONLY facts present" + "Lead
   with WHAT happened and WHERE" conflicted for headlines without an
   explicit location and pushed the model toward inferred-place
   hallucination. Replaced with "Include a location, person, or
   organization ONLY if it appears in the headline."

* test(insights): lock corroboration gate + grounded-prompt invariants

Review P2: the corroboration gate and the prompt's no-invention rules
had no tests, so future edits to selectTopStories() ordering or prompt
text could silently reintroduce the original hallucination.

Extract the brief-selection helper and prompt builders into a pure
module (scripts/_insights-brief.mjs) so tests can import them without
triggering seed-insights.mjs's top-level runSeed() call:

- pickBriefCluster(topStories) returns first sourceCount>=2 cluster
- briefSystemPrompt(dateISO) returns the system prompt
- briefUserPrompt(headline) returns the user prompt
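
A minimal sketch of the corroboration gate (assumed internals; the real scripts/_insights-brief.mjs may differ):

```javascript
function pickBriefCluster(topStories) {
  if (!Array.isArray(topStories)) return null;
  for (const story of topStories) {
    // Hard requirement: at least two corroborating sources. A single-source
    // rumor never reaches the LLM, regardless of its keyword-driven rank.
    if (story && story.sourceCount >= 2) return story;
  }
  return null; // caller publishes status=degraded with no brief
}
```

Walking the ranked list in order preserves the clustering rank among qualifying clusters; corroboration is a filter, not a re-sort.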

Regression tests (tests/seed-insights-brief.test.mjs, 12 cases) lock:
- pickBriefCluster skips single-source rumors even when ranked above a
  multi-sourced lead (explicit regression: News24 Iran supreme leader
  2026-04-23 scenario with realistic scores)
- pickBriefCluster tolerates missing/null entries
- briefSystemPrompt forbids invented facts and proper nouns
- briefSystemPrompt's "location" rule is conditional (no unconditional
  "Lead with WHAT and WHERE" directive that would push the model toward
  place-inference when the headline has no location)
- briefSystemPrompt does not contain "pick the most important" style
  language (ranking is done by pickBriefCluster upstream)
- briefUserPrompt passes the headline verbatim and instructs
  "only facts from this headline"

Also fix a misleading comment on CACHE_TTL: corroboration is gated at
brief-selection time, not on the topStories payload itself (which still
includes single-source clusters rendered as the headline list).

test:data: 6657/6657 pass (was 6645; +12).
2026-04-24 07:21:13 +04:00
Elie Habib
efb6037fcc feat(agent-readiness): WebMCP in-page tool surface (#3316) (#3356)
* feat(agent-readiness): WebMCP in-page tool surface (#3316)

Closes #3316. Exposes two UI tools to in-browser agents via the draft
WebMCP spec (webmachinelearning.github.io/webmcp), mirroring the static
Agent Skills index (#3310) for consistency:

- openCountryBrief({ iso2 }): opens the country deep-dive panel.
- openSearch(): opens the global command palette.

No bypass: both tools route through the exact methods a click would
hit (countryIntel.openCountryBriefByCode, searchModal.open), so auth
and Pro-tier gates apply to agent invocations unchanged.

Feature-detected: no-ops in Firefox, Safari, and older Chrome without
navigator.modelContext. No behavioural change outside WebMCP browsers.
Lazy-imported from App.ts so the module only enters the bundle if the
dynamic import resolves; keeps the hot-path init synchronous.

Each execute is wrapped in a logging shim that emits a typed
webmcp-tool-invoked analytics event per call; webmcp-registered fires
once at setup so we can distinguish capable-browser share from actual
tool usage.

v1 tools do not branch on auth state, so a single registration at
init is correct. Source-level comment flags that any future Pro-only
tool must re-register on sign-in/sign-out per the symmetric-listener
rule documented in the memory system.

tests/webmcp.test.mjs asserts the contract: feature-detect gate runs
before provideContext, two-or-more tools ship, ISO-2 validation lives
in the tool execute, every execute is wrapped in logging, and the
AppBindings surface stays narrow.

* fix(agent-readiness): WebMCP bindings surface missing-target as errors (#3316)

Addresses PR #3356 review.

P1 — silent-success via optional-chain no-op:
The App.ts bindings used this.state.searchModal?.open() and an
unchecked call to countryIntel.openCountryBriefByCode(). When the
underlying UI state was absent (pre-init, or in a variant that
skips the panel), the optional chain and the method's own null
guard both returned quietly, but the tool still reported "Opened"
with ok:true. Agents relying on that result would be misled.

Bindings now throw when the required UI target is missing. The
existing withInvocationLogging shim catches the throw, emits
ok:false in analytics, and returns isError:true, so agents get an
honest failure instead of a fake success. Fixed both bindings.

P2: dropped unused beforeEach import in tests/webmcp.test.mjs.

Added source-level assertions that both bindings throw when the
UI target is absent, so a future refactor that drops the check
fails loudly at CI time.
2026-04-24 07:14:04 +04:00
Elie Habib
6d4c717e75 fix(health): treat empty intlDelays as OK, matching faaDelays (#3360)
intlDelays was alarming EMPTY_DATA during calm windows (seedAge 25m,
records 0) while its faaDelays sibling — written by the same aviation
seeder — was in EMPTY_DATA_OK_KEYS. The seeder itself declares
zeroIsValid: true (scripts/seed-aviation.mjs:1171) because 0 airport
disruptions is a real steady state, so the health classifier should
agree. Stale-seed degradation still kicks in once seedAge > 90min.
2026-04-24 07:11:56 +04:00
Elie Habib
def94733a8 feat(agent-readiness): Agent Skills discovery index (#3310) (#3355)
* feat(agent-readiness): Agent Skills discovery index (#3310)

Closes #3310. Ships the Agent Skills Discovery v0.2.0 manifest at
/.well-known/agent-skills/index.json plus two real, useful skills.

Skills are grounded in real sebuf proto RPCs:
- fetch-country-brief → GetCountryIntelBrief (public).
- fetch-resilience-score → GetResilienceScore (Pro / API key).

Each SKILL.md documents endpoint, auth, parameters, response shape,
worked curl, errors, and when not to use the skill.

scripts/build-agent-skills-index.mjs walks every
public/.well-known/agent-skills/<name>/SKILL.md, sha256s the bytes,
and emits index.json. Wired into prebuild + every variant build so a
deploy can never ship an index whose digests disagree with served files.

tests/agent-skills-index.test.mjs asserts the index is up-to-date
via the script's --check mode and recomputes every sha256 against
the on-disk SKILL.md bytes.

Discovery wiring:
- public/.well-known/api-catalog: new anchor entry with the
  agent-skills-index rel per RFC 9727 linkset shape.
- vercel.json: adds agent-skills-index rel to the homepage +
  /index.html Link headers; deploy-config required-rels list updated.

Canonical URLs use the apex (worldmonitor.app) since #3322 fixed
the apex redirect that previously hid .well-known paths.

* fix(agent-readiness): correct auth header + harden frontmatter parser (#3310)

Addresses review findings on #3310.

## P1 — auth header was wrong in both SKILL.md files

The published skills documented `Authorization: Bearer wm_live_...`,
but WorldMonitor API keys must be sent in `X-WorldMonitor-Key`.
`Authorization: Bearer` is for MCP/OAuth or Clerk JWTs — not raw
`wm_live_...` keys. Agents that followed the SKILL.md verbatim would
have gotten 401s despite holding valid keys.

fetch-country-brief also incorrectly claimed the endpoint was
"public"; server-to-server callers without a trusted browser origin
are rejected by `validateApiKey`, so agents do need a key there too.
Fixed both SKILL.md files to document `X-WorldMonitor-Key` and
cross-link docs/usage-auth as the canonical auth matrix.

## P2 — frontmatter parser brittleness

The hand-rolled parser used `indexOf('\n---', 4)` as the closing
fence, which matched any body line that happened to start with `---`.
Swapped for a regex that anchors the fence to its own line, and
delegated value parsing to js-yaml (already a project dep) so future
catalog growth (quoted colons, typed values, arrays) does not trip
new edge cases.
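
The fence-anchoring part of the fix can be sketched like this (the real script hands the captured block to js-yaml for value parsing; only the regex semantics are shown, and the function name is hypothetical):

```javascript
// Closing fence must sit on its own line: a body line that merely starts
// with "---" cannot terminate the block.
function extractFrontmatter(md) {
  const m = md.match(/^---\n([\s\S]*?)\n---(?:\n|$)/);
  return m ? m[1] : null; // null when there is no frontmatter block
}
```

Compare with `indexOf('\n---', 4)`, which would treat any line beginning with `---` (including longer dash runs) as the closing fence and truncate the block.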

Added parser-contract tests that lock in the new semantics:
body `---` does not terminate the block, values with colons survive
intact, non-mapping frontmatter throws, and no-frontmatter files
return an empty mapping.

Index.json rebuilt against the updated SKILL.md bytes.
2026-04-23 22:21:25 +04:00
Elie Habib
7cf0c32eaa fix(checkout): merchant-side escape hatch for Dodo overlay deadlock (#3354)
Dodo's hosted overlay can deadlock: X-button click fires
GET /api/checkout/sessions/{id}/payment-link, the 404 goes unhandled
inside their React, and the resulting Maximum-update-depth render
loop prevents the checkout.closed postMessage from ever escaping the
iframe. Our onEvent handler never runs, the user is stuck.

Add a merchant-side safety net: Escape-key listener on window that
calls DodoPayments.Checkout.close() (works via the merchant-mounted
iframe node, independent of the frozen inner UI), plus an auto-close
from the checkout.error branch so any surfaced error doesn't leave a
zombie overlay behind. Cleanup is wired into destroyCheckoutOverlay.
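
A hedged sketch of the escape-hatch wiring (hypothetical shape; `close` stands in for DodoPayments.Checkout.close and the window is injected so the listener is testable without the SDK):

```javascript
function installOverlayEscapeHatch(win, close) {
  const onKeyDown = (e) => {
    // Merchant-side close works via the merchant-mounted iframe node,
    // independent of the frozen inner UI.
    if (e.key === 'Escape') close();
  };
  win.addEventListener('keydown', onKeyDown);
  // Returned cleanup is the piece wired into destroyCheckoutOverlay.
  return () => win.removeEventListener('keydown', onKeyDown);
}
```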

SDK 1.8.0 has no onCancel/cancel_url/dismissBehavior option —
close() is the only escape hatch Dodo exposes.

Observed 2026-04-23 session cks_0NdL3CalSpBDR6vrMFIS3 from the
?embed=pro-preview iframe-in-iframe landing flow.
2026-04-23 21:53:01 +04:00
Elie Habib
26d426369f feat(agent-readiness): RFC 8288 Link headers on homepage (#3353)
* feat(agent-readiness): RFC 8288 Link headers on homepage

Closes #3308, part of epic #3306.

Emit Link response headers on / and /index.html advertising every
live agent-discoverable target. All rels use IANA-registered values
(api-catalog, service-desc, service-doc, status) or the full IANA
URI form for OAuth metadata rels (per RFC 9728).

The mcp-server-card rel carries anchor="/mcp" to scope it to the
MCP endpoint rather than the homepage, since the server card
describes /mcp specifically.

New guardrail block in tests/deploy-config.test.mjs asserts every
required rel is present, targets are root-relative, and the MCP
anchor remains in place.

* test(agent-readiness): lockstep / + /index.html Link + exact target count

Adds two test-only guards on the homepage Link-headers suite:

- exact-count assertion on link targets (was `>= requiredRels.length`),
  catches accidental duplicate rels in vercel.json
- equality guard between `/` and `/index.html` Link headers, catches
  silent drift when one entry gets edited and the other doesn't

No production behavior change.
2026-04-23 21:50:25 +04:00
Elie Habib
e9146516a5 fix(swf): restore 8/8 fund coverage + explicit per-country observability (#3352)
* fix(swf): restore 8/8 fund coverage — WB bulk mrv=1 silently dropped Gulf countries

The 2026-04-23 post-#3344 Railway run seeded 4/8 funds (NO, SA, SG) and
silently dropped AE/KW/QA. Root cause: WB's `country/all/indicator/…?mrv=1`
returns the SAME year across every country (the most recent year that any
country publishes). KW/QA/AE report NE.IMP.GNFS.CD a year or two behind
NO/SA/SG, so mrv=1 gave them `value: null` and the seeder skipped them
because the rawMonths denominator was missing.

Fix: bump to `mrv=5` and pick the most recent non-null value per country
via a new pure helper `pickLatestPerCountry(records)`. Verified via
6 back-to-back live dry-runs (all 8/8, byte-identical numbers):
  NO: GPFG          1/1  effMo=93.05   (2024 imports)
  AE: ADIA+Mubadala 2/2  effMo=3.85    (2023 imports)
  SA: PIF           1/1  effMo=1.68    (2024 imports)
  KW: KIA           1/1  effMo=45.43   (2023 imports)
  QA: QIA           1/1  effMo=8.61    (2022 imports)
  SG: GIC+Temasek   2/2  effMo=7.11    (2024 imports; Temasek via infobox)
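
The per-country pick can be sketched as below, assuming the WB row shape `{ countryiso3code, date, value }` (the iso2 fallback field and the exact filters are assumptions; the production helper may differ):

```javascript
function pickLatestPerCountry(records) {
  const best = new Map();
  for (const row of records ?? []) {
    const code = row?.countryiso3code || row?.country?.id; // iso3, iso2 fallback
    if (!code || row.value == null || row.value <= 0) continue; // drop null/non-positive
    const year = Number(row.date);
    const prev = best.get(code);
    // Latest non-null year wins regardless of the array order WB returns.
    if (!prev || year > prev.year) best.set(code, { year, value: row.value });
  }
  return best;
}
```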

Second fix (observability): every manifest country is now enumerated in
a `summary` block in the payload + logged with an explicit status and
reason. Prod 14:59Z run had logs for KW/QA ("missing WB imports") but AE
was dropped with no log line — the operator has to cross-reference the
manifest to notice. New `buildCoverageSummary(manifest, imports, countries)`
is exported and always emits one row per manifest country: `complete`,
`partial`, or `missing` with `reason ∈ {'missing WB imports', 'no fund
AUM matched'}`. Summary is also embedded in the published payload so
downstream consumers can detect degraded runs without parsing logs.
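
A sketch of the coverage-summary contract (internals hypothetical; the input shapes here are assumptions, the invariant is the one described above: one row per manifest country, explicit status and reason):

```javascript
function buildCoverageSummary(manifest, imports, countries) {
  return manifest.map(({ iso2, funds }) => {
    if (!imports.has(iso2)) {
      return { iso2, status: 'missing', reason: 'missing WB imports' };
    }
    const matched = countries.get(iso2)?.fundsMatched ?? 0;
    if (matched === 0) {
      return { iso2, status: 'missing', reason: 'no fund AUM matched' };
    }
    const status = matched === funds ? 'complete' : 'partial';
    return { iso2, status, reason: null }; // uniform reason key on every row
  });
}
```

Enumerating from the manifest (not from whatever data survived) is what makes a silent drop structurally impossible: a country can only be absent from the summary if it is absent from the manifest.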

Tests (48/48 pass, 9 new):
- `pickLatestPerCountry` — 7 cases including the exact prod scenario
  (AE-2024-null + AE-2023-non-null → resolves to 2023 row). Guards
  against upstream re-order (asserts latest-year wins regardless of
  array order), rejects null-only countries, rejects non-positive
  values, handles both iso3 and iso2 codes.
- `buildCoverageSummary` — 2 cases covering the regression
  (silent-drop of AE) and the reason-string disambiguation (operator
  should know whether to investigate WB or Wikipedia).

Validated: 6 live end-to-end dry-runs (all 8/8), full test suite
569/569 pass, biome + lint:md clean.

* fix(swf): address Greptile P2 — uniform reason field + meaningful null-filter test

Two P2 findings on PR #3352:

1. `complete` and `partial` entries in countryStatuses were pushed
   without a `reason` key, while `missing` always carried one. The log
   path tolerated this (`row.reason ? ... : ''`), but the summary is
   now persisted in Redis — any downstream consumer iterating
   countryStatuses and reading `.reason` on a `partial` would see
   undefined. Added `reason: null` to complete + partial for uniform
   persisted shape. Test now asserts the `reason` key is present on
   every row regardless of status.

2. The null-only pickLatestPerCountry test used `'XYZ'` as the ISO-3
   code, which is filtered at the iso3→iso2 lookup stage BEFORE ever
   reaching the null-value guard — a regression that removed null
   filtering entirely would leave the test green. Swapped to `'NOR'`
   (real ISO-3 with a valid iso2 mapping) so the null-filter is the
   actual gate under test. Verified via sanity probe: `NOR + null`
   still drops, `NOR + value` still lands.

Tests 48/48 pass; live dry-run still 8/8 byte-identical; biome clean.
2026-04-23 21:35:25 +04:00
Elie Habib
d75bde4e03 fix(agent-readiness): host-aware oauth-protected-resource endpoint (#3351)
* fix(agent-readiness): host-aware oauth-protected-resource endpoint

isitagentready.com enforces that `authorization_servers[*]` share
origin with `resource` (same-origin rule, matches Cloudflare's
mcp.cloudflare.com reference — RFC 9728 §3 permits split origins
but the scanner is stricter).

A single static file served from 3 hosts (apex/www/api) can only
satisfy one origin at a time. Replacing with an edge function that
derives both `resource` and `authorization_servers` from the
request `Host` header gives each origin self-consistent metadata.
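
A hedged sketch of the host-derived metadata (handler shape and field set are assumptions; RFC 9728 fields shown minimally):

```javascript
function buildProtectedResourceMetadata(host) {
  const origin = `https://${host}`;
  return {
    resource: origin,
    // Same origin as `resource`, satisfying the scanner's strict rule.
    authorization_servers: [origin],
  };
}

// Usage inside a fetch-style edge handler:
function handler(request) {
  const meta = buildProtectedResourceMetadata(request.headers.get('host'));
  return new Response(JSON.stringify(meta), {
    headers: {
      'Content-Type': 'application/json',
      Vary: 'Host', // intermediate caches key by hostname
    },
  });
}
```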

No server-side behavior changes: api/oauth/*.js token issuer
doesn't bind tokens to a specific resource value (verified in
the previous PR's review).

* fix(agent-readiness): host-derive resource_metadata + runtime guardrails

Addresses P1/P2 review on this PR:

- api/mcp.ts (P1): WWW-Authenticate resource_metadata was still
  hardcoded to apex even when the client hit api.worldmonitor.app.
  Derive from request.headers.get('host') so each client gets a
  pointer matching their own origin — consistent with the host-
  aware edge function this PR introduces.
- api/oauth-protected-resource.ts (P2): add Vary: Host so any
  intermediate cache keys by hostname (belt + suspenders on top of
  Vercel's routing).
- tests/deploy-config.test.mjs (P2): replace regex-on-source with
  a runtime handler invocation asserting origin-matching metadata
  for apex/www/api hosts, and tighten the api/mcp.ts assertion to
  require host-derived resource_metadata construction.

---------

Co-authored-by: Elie Habib <elie@worldmonitor.app>
2026-04-23 21:17:32 +04:00
Elie Habib
fc94829ce7 fix(settings): prevent paying users hitting 409 on stale Upgrade CTA (#3349)
* fix(settings): stop paying users hitting 409 on stale Upgrade CTA

UnifiedSettings.renderUpgradeSection branched on `isEntitled()`, which
returns false while the Convex entitlement snapshot is still null during
a cold WebSocket load. The modal's `onEntitlementChange` listener only
re-rendered the `api-keys` tab, so the stale "Upgrade to Pro" button in
the Settings tab was never replaced once the snapshot arrived.

Paying users who opened settings in that window saw "Upgrade to Pro",
clicked it, hit `/api/create-checkout`, got 409 `duplicate_subscription`,
and cascaded into the billing-portal fallback path (which itself can
dead-end on a generic Dodo login). Same class of bug as the 2026-04-17/18
panel-overlay duplicate-subscription incident called out in
panel-gating.ts:20-22 -- different surface, same race.

Three-part fix in src/components/UnifiedSettings.ts:

- renderUpgradeSection() returns a hidden wrapper for non-Dodo premium
  (API key / tester key / Clerk pro role) so those users don't get
  stuck on the loading placeholder indefinitely.
- A signed-in user whose Convex entitlement snapshot is still null gets
  a neutral loading placeholder, not "Upgrade to Pro". An early click
  can no longer submit before Convex hydrates.
- open()'s onEntitlementChange handler now swaps the .upgrade-pro-section
  element in place when the snapshot arrives. Click handlers are
  delegated at overlay level, so replacing the node needs no rebind.

Observed signal behind this fix:
- WORLDMONITOR-NY (2026-04-23): Checkout error: duplicate_subscription
- WORLDMONITOR-NZ (2026-04-23): getCustomerPortalUrl Server Error, same
  session, 4s later, triggered by the duplicate-subscription dialog.

* fix(settings): hide loading placeholder to avoid empty bordered card

The base .upgrade-pro-section CSS (main.css:22833) applies margin, padding,
border, and surface background. Without 'hidden', the empty loading
placeholder paints a visibly blank bordered box during the Convex
cold-load window — swapping one bad state (stale 'Upgrade to Pro') for
another (confusing empty card).

Adding 'hidden' lets the browser's default [hidden] { display: none }
suppress the card entirely. Element stays queryable for the replaceWith
swap in open(), so the onEntitlementChange listener still finds it.

* fix(settings): bounded readiness window + click-time isEntitled guard

Addresses P1 review on PR #3349: the initial fix treated
getEntitlementState() === null as "still loading", but null is ALSO a
terminal state when Convex is disabled (no VITE_CONVEX_URL), auth times
out at waitForConvexAuth (10s), or initEntitlementSubscription throws
(entitlements.ts:41,47,58,78). In those cases a signed-in free user
would have seen a permanently empty placeholder instead of the Upgrade
to Pro CTA — a real regression on the main conversion surface.

Changes in src/components/UnifiedSettings.ts:

- Add `entitlementReady` class flag + `entitlementReadyTimer`. The flag
  flips true on first snapshot OR after a 12s fallback timer kicks in.
  12s > the 10s waitForConvexAuth timeout so healthy-but-slow paths land
  on the real state before the fallback fires.
- Seed `entitlementReady = getEntitlementState() !== null` BEFORE the
  first render() so the initial paint branches on the current snapshot,
  not the stale value from a prior open/close cycle.
- renderUpgradeSection() now gates the loading placeholder on
  `!this.entitlementReady` so the signed-in-free branch eventually
  renders the Upgrade CTA even when Convex never hydrates.
- handleUpgradeClick() defensively re-checks isEntitled() at click time:
  if the snapshot arrives AFTER the 12s timer but BEFORE the user's
  click, route to the billing portal instead of triggering
  /api/create-checkout against an active subscription (which would 409
  and re-enter the exact duplicate_subscription → getCustomerPortalUrl
  cascade this PR is trying to eliminate).
- Extract replaceUpgradeSection() helper so both the listener and the
  fallback timer share the same in-place swap path.
- close() clears the timer.
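
The bounded readiness window can be sketched as a small standalone unit (hypothetical extraction; the real logic lives inline in UnifiedSettings, and the click-time isEntitled() re-check is omitted here):

```javascript
function createEntitlementReadiness({
  getEntitlementState,
  setTimeoutFn,
  clearTimeoutFn,
  onReady,               // re-render hook
  fallbackMs = 12_000,   // > the 10s waitForConvexAuth timeout
}) {
  // Seed from the current snapshot so a prior open/close cycle's state
  // cannot leak into the first paint.
  let ready = getEntitlementState() !== null;
  let timer = ready
    ? null
    : setTimeoutFn(() => { timer = null; ready = true; onReady(); }, fallbackMs);
  return {
    isReady: () => ready,
    onSnapshot() {
      // First snapshot wins the race: cancel the fallback, render once.
      if (ready) return;
      ready = true;
      if (timer !== null) { clearTimeoutFn(timer); timer = null; }
      onReady();
    },
    dispose() {
      // close()/destroy(): the fallback must never fire after teardown.
      if (timer !== null) { clearTimeoutFn(timer); timer = null; }
    },
  };
}
```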

* fix(settings): clear entitlementReadyTimer in destroy()

Mirror the close() cleanup. Without this, if destroy() is called during
the 12s fallback window the timer fires after teardown and invokes
replaceUpgradeSection() against a detached overlay. The early-return
inside replaceUpgradeSection (querySelector returns null) makes the
callback a no-op, but the stray async callback + DOM reference stay
alive until fire — tidy them up at destroy time.
2026-04-23 21:00:55 +04:00
Elie Habib
38218db7cd fix(energy): strict validation — emptyDataIsFailure on Atlas seeders (#3350)
Adds `emptyDataIsFailure: true` to all 5 curated-registry seeders in the
`seed-bundle-energy-sources` Railway service. File-read-and-validate
seeders whose validateFn returns false (stale container, missing data
file, shape regression, etc.) MUST leave seed-meta stale rather than
stamping fresh `recordCount: 0` via the default `publishResult.skipped`
branch in `_seed-utils.mjs:906-917`.

Why this matters — observed production incident on 2026-04-23 (post
PR #3337 merge):

- Subset of Atlas seeders hit the validation-skip path (for reasons
  involving a Railway container stale vs the merged code + a local
  Option A run during an intermediate-file-state window).
- `_seed-utils.mjs:910` `writeFreshnessMetadata(..., 0, ...)` stamped
  `seed-meta:energy:pipelines-oil` and `seed-meta:energy:storage-facilities`
  with fresh `fetchedAt + recordCount: 0`.
- Bundle runner's interval gate at `_bundle-runner.mjs:210` reads
  `fetchedAt` only, not `recordCount`. With `elapsed < 0.8 × 10080min =
  8064min`, the gate skipped these 2 sections for ~5.5 days. No
  canonical data was written; health reported EMPTY; bundle never
  self-healed.

With `emptyDataIsFailure: true`, the strict branch at
`_seed-utils.mjs:897-905` fires instead:

  FAILURE: validation failed (empty data) — seed-meta NOT refreshed;
  bundle will retry next cycle

Seed-meta stays stale, bundle counts it as `failed++`, next cron tick
retries. Health flips STALE_SEED within max-stale-min. Operator sees
it. Loud-failure instead of silent-skip-with-meta-refresh.
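
The two branches can be sketched as follows (hypothetical shape; the cited _seed-utils.mjs line numbers refer to the real file, whose implementation differs):

```javascript
// Called when validateFn returned false for a seeder's output.
function handleValidationFailure(seeder, meta, nowMs) {
  if (seeder.emptyDataIsFailure) {
    // Strict branch: leave seed-meta stale so the bundle's interval gate
    // (which reads fetchedAt only) retries on the next cron tick.
    return { status: 'failed', metaRefreshed: false };
  }
  // Default skip branch: stamping fresh fetchedAt + recordCount 0 is
  // exactly the silent-skip that hid the empty sections for ~5.5 days.
  meta.fetchedAt = nowMs;
  meta.recordCount = 0;
  return { status: 'skipped', metaRefreshed: true };
}
```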

Pattern previously documented for strict-floor validators
(IMF/WEO 180+ country seeders in
`feedback_strict_floor_validate_fail_poisons_seed_meta.md`) — now
applied to all 5 Energy Atlas curated registries for the same reasons.

No functional change in the healthy path — validation-passing runs
still publish canonical + fresh seed-meta as before.

Verification: typecheck clean, 6618/6618 data tests pass.
2026-04-23 20:43:27 +04:00