Mirror of https://github.com/koala73/worldmonitor.git
Synced 2026-04-25 17:14:57 +02:00
Branch: main (507 commits)
commit 1db05e6caa
feat(usage): per-request Axiom telemetry pipeline (gateway + upstream attribution) (#3403)
* feat(gateway): thread Vercel Edge ctx through createDomainGateway (#3381)
PR-0 of the Axiom usage-telemetry stack. Pure infra change: no telemetry
emission yet, only the signature plumbing required for ctx.waitUntil to
exist on the hot path.
- createDomainGateway returns (req, ctx) instead of (req)
- rewriteToSebuf propagates ctx to its target gateway
- 5 alias callsites updated to pass ctx through
- ~30 [rpc].ts callsites unchanged (export default createDomainGateway(...))
Pattern reference: api/notification-channels.ts:166.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(usage): pure UsageIdentity resolver + Axiom emit primitives (#3381)
server/_shared/usage-identity.ts
- buildUsageIdentity: pure function, consumes already-resolved gateway state.
- Static ENTERPRISE_KEY_TO_CUSTOMER map (explicit, reviewable in code).
- Does not re-verify JWTs or re-validate API keys.
server/_shared/usage.ts
- buildRequestEvent / buildUpstreamEvent: allowlisted-primitive builders
only. Never accept Request/Response — additive field leaks become
structurally impossible.
- emitUsageEvents → ctx.waitUntil(sendToAxiom). Direct fetch, 1.5s timeout,
no retry, gated by USAGE_TELEMETRY=1 and AXIOM_API_TOKEN.
- Sliding-window circuit breaker (5% over 5min, min 20 samples). Trips with
one structured console.error; subsequent drops are 1%-sampled console.warn.
- Header derivers reuse Vercel/CF headers for request_id, region, country,
reqBytes; ua_hash null unless USAGE_UA_PEPPER is set (no stable
fingerprinting).
- Dev-only x-usage-telemetry response header for 2-second debugging.
server/_shared/auth-session.ts
- New resolveClerkSession returning { userId, orgId } in one JWT verify so
customer_id can be Clerk org id without a second pass. resolveSessionUserId
kept as back-compat wrapper.
No emission wiring yet — that lands in the next commit (gateway request
event + 403 + 429).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(gateway): emit Axiom request events on every return path (#3381)
Wires the request-event side of the Axiom usage-telemetry stack. Behind
USAGE_TELEMETRY=1 — no-op when the env var is unset.
Emit points (each builds identity from accumulated gateway state):
- origin_403 disallowed origin → reason=origin_403
- API access subscription required (403)
- legacy bearer 401 / 403 / 401-without-bearer
- entitlement check fail-through
- endpoint rate-limit 429 → reason=rate_limit_429
- global rate-limit 429 → reason=rate_limit_429
- 405 method not allowed
- 404 not found
- 304 etag match (resolved cache tier)
- 200 GET with body (resolved cache tier, real res_bytes)
- streaming / non-GET-200 final return (res_bytes best-effort)
Identity inputs (UsageIdentityInput):
- sessionUserId / clerkOrgId from new resolveClerkSession (one JWT verify)
- isUserApiKey + userApiKeyCustomerRef from validateUserApiKey result
- enterpriseApiKey when keyCheck.valid + non-wm_ wmKey present
- widgetKey from x-widget-key header (best-effort)
- tier captured opportunistically from existing getEntitlements calls
Header derivers reuse Vercel/CF metadata (x-vercel-id, x-vercel-ip-country,
cf-ipcountry, content-length, sentry-trace) — no new geo lookup, no new
crypto on the hot path. ua_hash null unless USAGE_UA_PEPPER is set.
Dev-only x-usage-telemetry response header (ok | degraded | off) attached
on the response paths for 2-second debugging in non-production.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
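For orientation, a minimal sketch of the emit shape described above. Only the env gating, the ctx.waitUntil hand-off, the 1.5s timeout, and the 5%-over-5min / 20-sample breaker come from the commit text; the dataset name, event shape, and the omitted drop-logging details are assumptions, not the repo's code.

```ts
// Hedged sketch of emitUsageEvents gating plus a sliding-window breaker.
// Structured error / 1%-sampled warn on drop is omitted for brevity.
type EdgeCtx = { waitUntil(p: Promise<unknown>): void };

const WINDOW_MS = 5 * 60_000;
const samples: { at: number; ok: boolean }[] = [];

function breakerTripped(): boolean {
  const cutoff = Date.now() - WINDOW_MS;
  while (samples.length && samples[0].at < cutoff) samples.shift();
  if (samples.length < 20) return false; // too few samples to judge
  const failures = samples.filter((s) => !s.ok).length;
  return failures / samples.length > 0.05; // 5% over the 5-minute window
}

async function sendToAxiom(events: object[]): Promise<void> {
  // Dataset name "usage" is an assumption; region note lives in a comment
  // on the real ingest URL per a later sub-commit.
  const res = await fetch("https://api.axiom.co/v1/datasets/usage/ingest", {
    method: "POST",
    headers: { authorization: `Bearer ${process.env.AXIOM_API_TOKEN}` },
    body: JSON.stringify(events),
    signal: AbortSignal.timeout(1500), // 1.5s, no retry
  });
  samples.push({ at: Date.now(), ok: res.ok });
}

export function emitUsageEvents(ctx: EdgeCtx | undefined, events: object[]): void {
  if (process.env.USAGE_TELEMETRY !== "1" || !process.env.AXIOM_API_TOKEN) return;
  if (!ctx || breakerTripped()) return; // drop off the hot path
  ctx.waitUntil(
    sendToAxiom(events).catch(() => samples.push({ at: Date.now(), ok: false })),
  );
}
```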
* feat(usage): upstream events via implicit request scope (#3381)
Closes the upstream-attribution side of the Axiom usage-telemetry stack
without requiring leaf-handler changes (per koala's review).
server/_shared/usage.ts
- AsyncLocalStorage-backed UsageScope: gateway sets it once per request,
fetch helpers read from it lazily. Defensive import — if the runtime
rejects node:async_hooks, scope helpers degrade to no-ops and the request
event is unaffected.
- runWithUsageScope(scope, fn) / getUsageScope() exports.
server/gateway.ts
- Wraps matchedHandler in runWithUsageScope({ ctx, requestId, customerId,
route, tier }) so deep fetchers can attribute upstream calls without
threading state through every handler signature.
server/_shared/redis.ts
- cachedFetchJsonWithMeta accepts opts.usage = { provider, operation? }.
Only the provider label is required to opt in — request_id / customer_id /
route / tier flow implicitly from UsageScope.
- Emits on the fresh path only (cache hits don't emit; the inbound request
event already records cache_status).
- cache_status correctly distinguishes 'miss' vs 'neg-sentinel' by
construction, matching NEG_SENTINEL handling.
- Telemetry never throws — failures are swallowed in the lazy-import catch,
sink itself short-circuits on USAGE_TELEMETRY=0.
server/_shared/fetch-json.ts
- New optional { provider, operation } in FetchJsonOptions. Same
opt-in-by-provider model as cachedFetchJsonWithMeta. Auto-derives host from
URL. Reads body via .text() so response_bytes is recorded (best-effort;
chunked responses still report 0).
Net result: any handler that uses fetchJson or cachedFetchJsonWithMeta gets
full per-customer upstream attribution by adding two fields to the options
bag. No signature changes anywhere else.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
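A sketch of the implicit-scope mechanism under the same defensive-import constraint. runWithUsageScope / getUsageScope and the UsageScope fields follow the commit text; the module shape is an assumption.

```ts
// Hedged sketch: UsageScope via node:async_hooks, degrading to no-ops
// when the runtime rejects the import (per the commit text).
import type { AsyncLocalStorage as ALS } from "node:async_hooks";

interface UsageScope {
  ctx?: { waitUntil(p: Promise<unknown>): void };
  requestId: string;
  customerId: string | null;
  route: string;
  tier: number;
}

let als: ALS<UsageScope> | null = null;
try {
  const mod = await import("node:async_hooks");
  als = new mod.AsyncLocalStorage<UsageScope>();
} catch {
  als = null; // runtime without async_hooks: scope helpers become no-ops
}

// Gateway calls this once per request around the matched handler.
export function runWithUsageScope<T>(scope: UsageScope, fn: () => T): T {
  return als ? als.run(scope, fn) : fn();
}

// Fetch helpers read the scope lazily; undefined means "no attribution".
export function getUsageScope(): UsageScope | undefined {
  return als?.getStore();
}
```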
* fix(gateway): address round-1 codex feedback on usage telemetry
- ctx is now optional on the createDomainGateway handler signature so
direct callers (tests, non-Vercel paths) no longer crash on emit
- legacy premium bearer-token routes (resilience, shipping-v2) propagate
session.userId into the usage accumulator so successful requests are
attributed instead of emitting as anon
- after checkEntitlement allows a tier-gated route, re-read entitlements
(Redis-cached + in-flight coalesced) to populate usage.tier so
analyze-stock & co. emit the correct tier rather than 0
- domain extraction now skips a leading vN segment, so /api/v2/shipping/*
records domain="shipping" instead of "v2"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(usage): assert telemetry payload + identity resolver + operator guide
- tests/usage-telemetry-emission.test.mts stubs globalThis.fetch to capture
the Axiom ingest POST body and asserts the four review-flagged fields
end-to-end through the gateway: domain on /api/v2/<svc>/* (was "v2"),
customer_id on legacy premium bearer success (was null/anon), tier on
entitlement-gated success via the Convex fallback path (was 0), plus a
ctx-optional regression guard
- server/__tests__/usage-identity.test.ts unit-tests the pure
buildUsageIdentity() resolver across every auth_kind branch, tier coercion,
and the secret-handling invariant (raw enterprise key never lands in any
output field)
- docs/architecture/usage-telemetry.md is the operator + dev guide: field
reference, architecture, configuration, failure modes, local workflow,
eight Axiom APL recipes, and runbooks for adding fields / new gateway
return paths
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(usage): make recorder.settled robust to nested waitUntil
Promise.all(pending) snapshotted the array at call time, missing the inner
ctx.waitUntil(sendToAxiom(...)) that emitUsageEvents pushes after the outer
drain begins. Tests passed only because the fetch spy resolved in an
earlier microtask tick. Replace with a quiescence loop so the helper
survives any future async in the emit path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: trigger preview

* fix(usage): address koala #3403 review — collapse nested waitUntil,
widget-key validation, neg-sentinel status, auth_* reasons
P1
- Collapse nested ctx.waitUntil at all 3 emit sites (gateway.ts
emitRequest, fetch-json.ts, redis.ts emitUpstreamFromHook). Export
sendToAxiom and call it directly inside the outer waitUntil so Edge
runtimes don't drop the delivery promise after the response phase.
- Validate X-Widget-Key against WIDGET_AGENT_KEY before populating
usage.widgetKey so unauthenticated callers can't spoof per-customer
attribution.
P2
- Emit on OPTIONS preflight (new 'preflight' RequestReason).
- Gate cachedFetchJsonWithMeta upstreamStatus=200 on result != null so the
neg-sentinel branch no longer reports as a successful upstream call.
- Extend RequestReason with auth_401/auth_403/tier_403 and replace
reason:'ok' on every auth/tier-rejection emit path.
- Replace 32-bit FNV-1a with a two-round XOR-folded 64-bit variant in
hashKeySync (collision space matters once widget-key adoption grows).
Verification
- tests/usage-telemetry-emission.test.mts — 6/6
- tests/premium-stock-gateway.test.mts +
tests/gateway-cdn-origin-policy.test.mts — 15/15
- npx vitest run server/__tests__/usage-identity.test.ts — 13/13
- npx tsc --noEmit clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: trigger preview rebuild for AXIOM_API_TOKEN

* chore(usage): note Axiom region in ingest URL comment
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* debug(usage): unconditional logs in sendToAxiom for preview troubleshooting
Temporary — to be reverted once Axiom delivery is confirmed working in
preview.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
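On the hashKeySync change in the koala-review fix above: "two-round XOR-folded 64-bit variant" is ambiguous from the message alone, so the sketch below is one plausible reading (two FNV-1a-64 rounds XOR-folded together), not the repo's actual construction.

```ts
// Hedged sketch only; the repo's exact two-round/fold layout may differ.
const FNV64_OFFSET = 0xcbf29ce484222325n;
const FNV64_PRIME = 0x100000001b3n;
const MASK64 = 0xffffffffffffffffn;

function fnv1a64(data: string): bigint {
  let h = FNV64_OFFSET;
  for (let i = 0; i < data.length; i++) {
    h ^= BigInt(data.charCodeAt(i) & 0xff); // byte-level hashing assumed
    h = (h * FNV64_PRIME) & MASK64;
  }
  return h;
}

function hashKeySync(key: string): string {
  // Round 1 over the key, round 2 over round 1's hex, XOR-folded together:
  // a much larger collision space than the previous 32-bit FNV-1a.
  const r1 = fnv1a64(key);
  const r2 = fnv1a64(r1.toString(16));
  return ((r1 ^ r2) & MASK64).toString(16).padStart(16, "0");
}
```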
* fix(usage): add 'live' cache tier + revert preview debug logs
- Sync UsageCacheTier with the local CacheTier in gateway.ts (main added
'live' in PR #3402 — synthetic merge with main was failing typecheck:api).
- Revert temporary unconditional debug logs in sendToAxiom now that Axiom
delivery is verified end-to-end on preview (event landed with all fields
populated, including the new auth_401 reason from the koala #3403 fix).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

commit 5c955691a9
feat(energy-atlas): live tanker map layer + contract (parity PR 3, plan U7-U8) (#3402)
* feat(energy-atlas): live tanker map layer + contract (PR 3, plan U7-U8)
Lands the third and final parity-push surface — per-vessel tanker positions
inside chokepoint bounding boxes, refreshed every 60s. Closes the visual
gap with peer reference energy-intel sites for the live AIS tanker view.
Per docs/plans/2026-04-25-003-feat-energy-parity-pushup-plan.md PR 3.
Codex-approved through 8 review rounds against origin/main @

commit 2f5445284b
fix(brief): single canonical synthesis brain — eliminate email/brief lead divergence (#3396)
* feat(brief-llm): canonical synthesis prompt + v3 cache key
Extends generateDigestProse to be the single source of truth for
brief executive-summary synthesis (canonicalises what was previously
split between brief-llm's generateDigestProse and
seed-digest-notifications.mjs's generateAISummary). Ports Brain B's prompt
features into buildDigestPrompt:
- ctx={profile, greeting, isPublic} parameter (back-compat: 4-arg
callers behave like today)
- per-story severity uppercased + short-hash prefix [h:XXXX] so the
model can emit rankedStoryHashes for stable re-ranking
- profile lines + greeting opener appear only when ctx.isPublic !== true
validateDigestProseShape gains optional rankedStoryHashes (≥4-char
strings, capped to MAX_STORIES_PER_USER × 2). v2-shaped rows still
pass — field defaults to [].
hashDigestInput v3:
- material includes profile-SHA, greeting bucket, isPublic flag,
per-story hash
- isPublic=true substitutes literal 'public' for userId in the cache
key so all share-URL readers of the same (date, sensitivity, pool)
hit ONE cache row (no PII in public cache key)
Adds generateDigestProsePublic(stories, sensitivity, deps) wrapper —
no userId param by design — for the share-URL surface.
Cache prefix bumped brief:llm:digest:v2 → v3. v2 rows expire on TTL.
Per the v1→v2 precedent (see hashDigestInput comment), one-tick cost
on rollout is acceptable for cache-key correctness.
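A sketch of the v3 key behavior described above, with hypothetical field names and hashing layout; only the 'public'-for-userId substitution, the profile-SHA material, and the v3 prefix come from the text.

```ts
// Hedged sketch of the v3 cache-key material; not the repo's real field order.
import { createHash } from "node:crypto";

const sha256 = (s: string): string =>
  createHash("sha256").update(s).digest("hex");

interface DigestHashInput {
  userId: string;
  isPublic: boolean;
  profile: string | null; // hashed below so raw profile text never keys a row
  greetingBucket: string;
  sensitivity: string;
  storyHashes: string[]; // per-story hash material
}

function hashDigestInputV3(input: DigestHashInput): string {
  // isPublic=true substitutes the literal 'public' for userId, so every
  // share-URL reader of the same (date, sensitivity, pool) hits ONE row
  // and no PII lands in the public cache key.
  const identity = input.isPublic ? "public" : input.userId;
  const material = [
    identity,
    input.profile ? sha256(input.profile) : "none", // profile-SHA
    input.greetingBucket,
    String(input.isPublic),
    input.sensitivity,
    ...input.storyHashes,
  ].join("|");
  return `brief:llm:digest:v3:${sha256(material)}`;
}
```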
Tests: 72/72 passing in tests/brief-llm.test.mjs (8 new for the v3
behaviors), full data suite 6952/6952.
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 1, Codex-approved (5 rounds).
* feat(brief): envelope v3 — adds digest.publicLead for share-URL surface
Bumps BRIEF_ENVELOPE_VERSION 2 → 3. Adds optional
BriefDigest.publicLead — non-personalised executive lead generated
by generateDigestProsePublic (already in this branch from the
previous commit) for the public share-URL surface. Personalised
`lead` is the canonical synthesis for authenticated channels;
publicLead is its profile-stripped sibling so api/brief/public/*
never serves user-specific content (watched assets/regions).
SUPPORTED_ENVELOPE_VERSIONS = [1, 2, 3] keeps v1 + v2 envelopes
in the 7-day TTL window readable through the rollout — the
composer only ever writes the current version, but readers must
tolerate older shapes that haven't expired yet. Same rollout
pattern used at the v1 → v2 bump.
Renderer changes (server/_shared/brief-render.js):
- ALLOWED_DIGEST_KEYS gains 'publicLead' (closed-key-set still
enforced; v2 envelopes pass because publicLead === undefined is
the v2 shape).
- assertBriefEnvelope: new isNonEmptyString check on publicLead
when present. Type contract enforced; absence is OK.
Tests (tests/brief-magazine-render.test.mjs):
- New describe block "v3 publicLead field": v3 envelope renders;
malformed publicLead rejected; v2 envelope still passes; ad-hoc
digest keys (e.g. synthesisLevel) still rejected — confirming
the closed-key-set defense holds for the cron-local-only fields
the orchestrator must NOT persist.
- BRIEF_ENVELOPE_VERSION pin updated 2 → 3 with rollout-rationale
comment.
Test results: 182 brief-related tests pass; full data suite
6956/6956.
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 2, Codex Round-3 Medium #2.
* feat(brief): synthesis splice + rankedStoryHashes pre-cap re-order
Plumbs the canonical synthesis output (lead, threads, signals,
publicLead, rankedStoryHashes from generateDigestProse) through the
pure composer so the orchestration layer can hand pre-resolved data
into envelope.digest. Composer stays sync / no I/O — Codex Round-2
High #2 honored.
Changes:
scripts/lib/brief-compose.mjs:
- digestStoryToUpstreamTopStory now emits `hash` (the digest story's
stable identifier, falls back to titleHash when absent). Without
this, rankedStoryHashes from the LLM has nothing to match against.
- composeBriefFromDigestStories accepts opts.synthesis = {lead,
threads, signals, rankedStoryHashes?, publicLead?}. When passed,
splices into envelope.digest after the stub is built. Partial
synthesis (e.g. only `lead` populated) keeps stub defaults for the
other fields — graceful degradation when L2 fallback fires.
shared/brief-filter.js:
- filterTopStories accepts optional rankedStoryHashes. New helper
applyRankedOrder re-orders stories by short-hash prefix match
BEFORE the cap is applied, so the model's editorial judgment of
importance survives MAX_STORIES_PER_USER. Stable for ties; stories
not in the ranking come after in original order. Empty/missing
ranking is a no-op (legacy callers unchanged); a sketch follows this
change list.
shared/brief-filter.d.ts:
- filterTopStories signature gains rankedStoryHashes?: string[].
- UpstreamTopStory gains hash?: unknown (carried through from
digestStoryToUpstreamTopStory).
Tests added (tests/brief-from-digest-stories.test.mjs):
- synthesis substitutes lead/threads/signals/publicLead.
- legacy 4-arg callers (no synthesis) keep stub lead.
- partial synthesis (only lead) keeps stub threads/signals.
- rankedStoryHashes re-orders pool before cap.
- short-hash prefix match (model emits 8 chars; story carries full).
- unranked stories go after in original order.
Test results: 33/33 in brief-from-digest-stories; 182/182 across all
brief tests; full data suite 6956/6956.
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 3, Codex Round-2 Low + Round-2 High #2.
* feat(brief): single canonical synthesis per user; rewire all channels
Restructures the digest cron's per-user compose + send loops to
produce ONE canonical synthesis per user per issueSlot — the lead
text every channel (email HTML, plain-text, Telegram, Slack,
Discord, webhook) and the magazine show is byte-identical. This
eliminates the "two-brain" divergence that was producing different
exec summaries on different surfaces (observed 2026-04-25 0802).
Architecture:
composeBriefsForRun (orchestration):
- Pre-annotates every eligible rule with lastSentAt + isDue once,
before the per-user pass. Same getLastSentAt helper the send loop
uses so compose + send agree on lastSentAt for every rule.
composeAndStoreBriefForUser (per-user):
- Two-pass winner walk: try DUE rules first (sortedDue), fall back
to ALL eligible rules (sortedAll) for compose-only ticks.
Preserves today's dashboard refresh contract for weekly /
twice_daily users on non-due ticks (Codex Round-4 High #1).
- Within each pass, walk by compareRules priority and pick the
FIRST candidate with a non-empty pool — mirrors today's behavior
at scripts/seed-digest-notifications.mjs:1044 and prevents the
"highest-priority but empty pool" edge case (Codex Round-4
Medium #2).
- Three-level synthesis fallback chain:
L1: generateDigestProse(fullPool, ctx={profile,greeting,!public})
L2: generateDigestProse(envelope-sized slice, ctx={})
L3: stub from assembleStubbedBriefEnvelope
Distinct log lines per fallback level so ops can quantify
failure-mode distribution.
- Generates publicLead in parallel via generateDigestProsePublic
(no userId param; cache-shared across all share-URL readers).
- Splices synthesis into envelope via composer's optional
`synthesis` arg (Step 3); rankedStoryHashes re-orders the pool
BEFORE the cap so editorial importance survives MAX_STORIES.
- synthesisLevel stored in the cron-local briefByUser entry — NOT
persisted in the envelope (renderer's assertNoExtraKeys would
reject; Codex Round-2 Medium #5).
Send loop:
- Reads lastSentAt via shared getLastSentAt helper (single source
of truth with compose flow).
- briefLead = brief?.envelope?.data?.digest?.lead — the canonical
lead. Passed to buildChannelBodies (text/Telegram/Slack/Discord),
injectEmailSummary (HTML email), and sendWebhook (webhook
payload's `summary` field). All-channel parity (Codex Round-1
Medium #6).
- Subject ternary reads cron-local synthesisLevel: 1 or 2 →
"Intelligence Brief", 3 → "Digest" (preserves today's UX for
fallback paths; Codex Round-1 Missing #5).
Removed:
- generateAISummary() — the second LLM call that produced the
divergent email lead. ~85 lines.
- AI_SUMMARY_CACHE_TTL constant — no longer referenced. The
digest:ai-summary:v1:* cache rows expire on their existing 1h
TTL (no cleanup pass).
Helpers added:
- getLastSentAt(rule) — extracted Upstash GET for digest:last-sent
so compose + send both call one source of truth.
- buildSynthesisCtx(rule, nowMs) — formats profile + greeting for
the canonical synthesis call. Preserves all today's prefs-fetch
failure-mode behavior.
Composer:
- compareRules now exported from scripts/lib/brief-compose.mjs so
the cron can sort each pass identically to groupEligibleRulesByUser.
Test results: full data suite 6962/6962 (was 6956 pre-Step 4; +6
new compose-synthesis tests from Step 3).
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Steps 4 + 4b. Codex-approved (5 rounds).
* fix(brief-render): public-share lead fail-safe — never leak personalised lead
Public-share render path (api/brief/public/[hash].ts → renderer
publicMode=true) MUST NEVER serve the personalised digest.lead
because that string can carry profile context — watched assets,
saved-region names, etc. — written by generateDigestProse with
ctx.profile populated.
Previously: redactForPublic redacted user.name and stories.whyMatters
but passed digest.lead through unchanged. Codex Round-2 High
(security finding).
Now (v3 envelope contract):
- redactForPublic substitutes digest.lead = digest.publicLead when
the v3 envelope carries one (generated by generateDigestProsePublic
with profile=null, cache-shared across all public readers).
- When publicLead is absent (v2 envelope still in TTL window OR v3
envelope where publicLead generation failed), redactForPublic sets
digest.lead to empty string.
- renderDigestGreeting: when lead is empty, OMIT the <blockquote>
pull-quote entirely. Page still renders complete (greeting +
horizontal rule), just without the italic lead block.
- NEVER falls back to the original personalised lead.
assertBriefEnvelope still validates publicLead's contract (when
present, must be a non-empty string) BEFORE redactForPublic runs,
so a malformed publicLead throws before any leak risk.
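The fail-safe reduces to a substitution that never touches the personalised value; a minimal sketch, assuming a simplified digest shape:

```ts
// Hedged sketch of the lead substitution in redactForPublic.
interface Digest {
  lead: string; // personalised: may carry watched assets / regions
  publicLead?: string; // profile-stripped sibling (v3 envelopes)
}

function redactLeadForPublic(digest: Digest): Digest {
  // Substitute publicLead; an absent publicLead yields "" so the renderer
  // omits the pull-quote. NEVER fall back to the personalised lead.
  return { ...digest, lead: digest.publicLead ?? "" };
}
```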
Tests added (tests/brief-magazine-render.test.mjs):
- v3 envelope renders publicLead in pull-quote, personalised lead
text never appears.
- v2 envelope (no publicLead) omits pull-quote; rest of page
intact.
- empty-string publicLead rejected by validator (defensive).
- private render still uses personalised lead.
Test results: 68 brief-magazine-render tests pass; full data suite
remains green from prior commit.
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Step 5, Codex Round-2 High (security).
* feat(digest): brief lead parity log + extra acceptance tests
Adds the parity-contract observability line and supplementary
acceptance tests for the canonical synthesis path.
Parity log (per send, after successful delivery):
[digest] brief lead parity user=<id> rule=<v>:<s>:<lang>
synthesis_level=<1|2|3> exec_len=<n> brief_lead_len=<n>
channels_equal=<bool> public_lead_len=<n>
When channels_equal=false an extra WARN line fires —
"PARITY REGRESSION user=… — email lead != envelope lead." Sentry's
existing console-breadcrumb hook lifts this without an explicit
captureMessage call. Plan acceptance criterion A5.
Tests added (tests/brief-llm.test.mjs, +9):
- generateDigestProsePublic: two distinct callers with identical
(sensitivity, story-pool) hit the SAME cache row (per Codex
Round-2 Medium #4 — "no PII in public cache key").
- public + private writes never collide on cache key (defensive).
- greeting bucket change re-keys the personalised cache (Brain B
parity).
- profile change re-keys the personalised cache.
- v3 cache prefix used (no v2 writes).
Test results: 77/77 in brief-llm; full data suite 6971/6971
(was 6962 pre-Step-7; +9 new public-cache tests).
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Steps 6 (partial) + 7. Acceptance A5, A6.g, A6.f.
* test(digest): backfill A6.h/i/l/m acceptance tests via helper extraction
* fix(brief): close two correctness regressions on multi-rule + public surface
Two findings from human review of the canonical-synthesis PR:
1. Public-share redaction leaked personalised signals + threads.
The new prompt explicitly personalises both `lead` and `signals`
("personalise lead and signals"), but redactForPublic only
substituted `lead` — leaving `signals` and `threads` intact.
Public renderer's hasSignals gate would emit the signals page
whenever `digest.signals.length > 0`, exposing watched-asset /
region phrasing to anonymous readers. Same privacy bug class
the original PR was meant to close, just on different fields.
2. Multi-rule users got cross-pool lead/storyList mismatch.
composeAndStoreBriefForUser picks ONE winning rule for the
canonical envelope. The send loop then injected that ONE
`briefLead` into every due rule's channel body — even though
each rule's storyList came from its own (per-rule) digest pool.
Multi-rule users (e.g. `full` + `finance`) ended up with email
bodies leading on geopolitics while listing finance stories.
Cross-rule editorial mismatch reintroduced after the
cross-surface fix.
Fix 1 — public signals + threads:
- Envelope shape: BriefDigest gains `publicSignals?: string[]` +
`publicThreads?: BriefThread[]` (sibling fields to publicLead).
Renderer's ALLOWED_DIGEST_KEYS extended; assertBriefEnvelope
validates them when present.
- generateDigestProsePublic already returned a full prose object
(lead + signals + threads) — orchestration now captures all
three instead of just `.lead`. Composer splices each into its
envelope slot.
- redactForPublic substitutes:
digest.lead ← publicLead (or empty → omits pull-quote)
digest.signals ← publicSignals (or empty → omits signals page)
digest.threads ← publicThreads (or category-derived stub via
new derivePublicThreadsStub helper — never
falls back to the personalised threads)
- New tests cover all three substitutions + their fail-safes.
Fix 2 — per-rule synthesis in send loop:
- Each due rule independently calls runSynthesisWithFallback over
ITS OWN pool + ctx. Channel body lead is internally consistent
with the storyList (both from the same pool).
- Cache absorbs the cost: when this is the winner rule, the
synthesis hits the cache row written during the compose pass
(same userId/sensitivity/pool/ctx) — no extra LLM call. Only
multi-rule users with non-overlapping pools incur additional
LLM calls.
- magazineUrl still points at the winner's envelope (single brief
per user per slot — `(userId, issueSlot)` URL contract). Channel
lead vs magazine lead may differ for non-winner rule sends;
documented as acceptable trade-off (URL/key shape change to
support per-rule magazines is out of scope for this PR).
- Parity log refined: adds `winner_match=<bool>` field. The
PARITY REGRESSION warning now fires only when winner_match=true
AND the channel lead differs from the envelope lead (the actual
contract regression). Non-winner sends with legitimately
different leads no longer spam the alert.
Test results:
- tests/brief-magazine-render.test.mjs: 75/75 (+7 new for public
signals/threads + validator + private-mode-ignores-public-fields)
- Full data suite: 6995/6995 (was 6988; +7 net)
- typecheck + typecheck:api: clean
Plan: docs/plans/2026-04-25-002-fix-brief-email-two-brain-divergence-plan.md
Addresses 2 review findings on PR #3396 not anticipated in the
5-round Codex review.
* fix(brief): unify compose+send window, fall through filter-rejection
Address two residual risks in PR #3396 (single-canonical-brain refactor):
Risk 1 — canonical lead synthesized from a fixed 24h pool while the
send loop ships stories from `lastSentAt ?? 24h`. For weekly users
that meant a 24h-pool lead bolted onto a 7d email body — the same
cross-surface divergence the refactor was meant to eliminate, just in
a different shape. Twice-daily users hit a 12h-vs-24h variant.
Fix: extract the window formula to `digestWindowStartMs(lastSentAt,
nowMs, defaultLookbackMs)` in digest-orchestration-helpers.mjs and
call it from BOTH the compose path's digestFor closure AND the send
loop. The compose path now derives windowStart per-candidate from
`cand.lastSentAt`, identical to what the send loop will use for that
rule. Removed the now-unused BRIEF_STORY_WINDOW_MS constant.
Side-effect: digestFor now receives the full annotated candidate
(`cand`) instead of just the rule, so it can reach `cand.lastSentAt`.
Backwards-compatible at the helper level — pickWinningCandidateWithPool
forwards `cand` instead of `cand.rule`.
Cache memo hit rate drops since lastSentAt varies per-rule, but
correctness > a few extra Upstash GETs.
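The shared formula is small enough to state exactly; a sketch, with the epoch-zero nuance called out:

```ts
// Sketch of the extracted window formula used by BOTH compose and send.
function digestWindowStartMs(
  lastSentAt: number | null | undefined,
  nowMs: number,
  defaultLookbackMs: number,
): number {
  // `??` rather than `||`: an epoch-zero lastSentAt is a real timestamp
  // and must not silently fall back to the default lookback window.
  return lastSentAt ?? nowMs - defaultLookbackMs;
}
```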
Risk 2 — pickWinningCandidateWithPool returned the first candidate
with a non-empty raw pool as winner. If composeBriefFromDigestStories
then dropped every story (URL/headline/shape filters), the caller
bailed without trying lower-priority candidates. Pre-PR behaviour was
to keep walking. This regressed multi-rule users whose top-priority
rule's pool happens to be entirely filter-rejected.
Fix: optional `tryCompose(cand, stories)` callback on
pickWinningCandidateWithPool. When provided, the helper calls it after
the non-empty pool check; falsy return → log filter-rejected and walk
to the next candidate; truthy → returns `{winner, stories,
composeResult}` so the caller can reuse the result. Without the
callback, legacy semantics preserved (existing tests + callers
unaffected).
Caller composeAndStoreBriefForUser passes a no-synthesis compose call
as tryCompose — cheap pure-JS, no I/O. Synthesis only runs once after
the winner is locked in, so the perf cost is one extra compose per
filter-rejected candidate, no extra LLM round-trips.
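A sketch of the tryCompose fall-through contract, with simplified candidate and pool types (assumptions); the legacy path with no callback returns the first non-empty-pool candidate, as before:

```ts
// Hedged sketch of the optional tryCompose callback.
interface Candidate {
  rule: unknown;
  lastSentAt: number | null; // forwarded whole so digestFor can reach it
}

function pickWinningCandidateWithPool<R>(
  candidates: Candidate[],
  poolFor: (cand: Candidate) => unknown[],
  tryCompose?: (cand: Candidate, stories: unknown[]) => R | null,
): { winner: Candidate; stories: unknown[]; composeResult?: R } | null {
  for (const cand of candidates) {
    const stories = poolFor(cand);
    if (stories.length === 0) continue; // empty pool: keep walking
    if (!tryCompose) return { winner: cand, stories }; // legacy semantics
    const composeResult = tryCompose(cand, stories);
    if (!composeResult) continue; // filter-rejected: walk to next candidate
    return { winner: cand, stories, composeResult }; // caller reuses result
  }
  return null; // all candidates rejected
}
```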
Tests:
- 10 new cases in tests/digest-orchestration-helpers.test.mjs
covering: digestFor receiving full candidate; tryCompose
fall-through to lower-priority; all-rejected returns null;
composeResult forwarded; legacy semantics without tryCompose;
digestWindowStartMs lastSentAt-vs-default branches; weekly +
twice-daily window parity assertions; epoch-zero ?? guard.
- Updated tests/digest-cache-key-sensitivity.test.mjs static-shape
regex to match the new `cand.rule.sensitivity` cache-key shape
(intent unchanged: cache key MUST include sensitivity).
Stacked on PR #3396 — targets feat/brief-two-brain-divergence.

commit 7c0c08ad89
feat(energy-atlas): seed-side countries[] denorm on disruptions + CountryDeepDive row (§R #5 = B) (#3377)
* feat(energy-atlas): seed-side countries[] denorm + CountryDeepDive row (§R #5 = B)
Per plan §R/#5 decision B: denormalise countries[] at seed time on each
disruption event so CountryDeepDivePanel can filter events per country
without an asset-registry round trip. Schema join (pipeline/storage
→ event.assetId) happens once in the weekly cron, not on every panel
render. The alternative (client-side join) was rejected because it
couples UI logic to asset-registry internals and duplicates the join
for every surface that wants a per-country filter.
Changes:
- `proto/.../list_energy_disruptions.proto`: add `repeated string
countries = 15` to EnergyDisruptionEntry with doc comment tying it
to the plan decision and the always-non-empty invariant.
- `scripts/_energy-disruption-registry.mjs`:
• Load pipeline-gas + pipeline-oil + storage-facilities registries
once per seed cycle; index by id.
• `deriveCountriesForEvent()` resolves assetId to {fromCountry,
toCountry, transitCountries} (pipeline) or {country} (storage),
deduped + alpha-sorted so byte-diff stability holds (sketched after
this change list).
• `buildPayload()` attaches the computed countries[] to every
event before writing.
• `validateRegistry()` now requires non-empty countries[] of
ISO2 codes. Combined with the seeder's `emptyDataIsFailure:
true`, this surfaces orphaned assetIds loudly — the next cron
tick fails validation and seed-meta stays stale, tripping
health alarms.
- `scripts/data/energy-disruptions.json`: fix two orphaned assetIds
that the new join caught:
• `cpc-force-majeure-2022`: `cpc-pipeline` → `cpc` (matches the
entry in pipelines-oil.json).
• `pdvsa-designation-2019`: `ve-petrol-2026-q1` (non-existent) →
`venezuela-anzoategui-puerto-la-cruz`.
- `server/.../list-energy-disruptions.ts`: project countries[] into
the RPC response via coerceStringArray. Legacy pre-denorm rows
surface as empty array (always present on wire, length 0 => old).
- `src/components/CountryDeepDivePanel.ts`: add 4th Atlas row —
"Energy disruptions in {iso2}" — filtered by `iso2 ∈ countries[]`.
Failure is silent; EnergyDisruptionsPanel (upcoming) is the
primary disruption surface.
- `tests/energy-disruptions-registry.test.mts`: switch to validating
the buildPayload output (post-denorm), add §R #5 B invariant
tests, plus a raw-JSON invariant ensuring curators don't hand-edit
countries[] (it's derived, not declared).
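A sketch of the seed-time join referenced above; the registry shapes are simplified assumptions based on the field names in the bullet:

```ts
// Hedged sketch of deriveCountriesForEvent.
interface PipelineAsset {
  id: string;
  fromCountry: string;
  toCountry: string;
  transitCountries?: string[];
}
interface StorageAsset {
  id: string;
  country: string;
}

function deriveCountriesForEvent(
  assetId: string,
  pipelines: Map<string, PipelineAsset>, // gas + oil registries, indexed once
  storage: Map<string, StorageAsset>,
): string[] {
  const raw: string[] = [];
  const pipe = pipelines.get(assetId);
  if (pipe) raw.push(pipe.fromCountry, pipe.toCountry, ...(pipe.transitCountries ?? []));
  const fac = storage.get(assetId);
  if (fac) raw.push(fac.country);
  // Dedupe + alpha-sort so repeated seed runs stay byte-diff stable.
  // An orphaned assetId yields [], which validateRegistry rejects loudly.
  return [...new Set(raw)].sort();
}
```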
Proto regen note: `make generate` currently fails with a duplicate
openapi plugin collision in buf.gen.yaml (unrelated bug — 3 plugin
entries emit to the same out dir). Worked around by temporarily
trimming buf.gen.yaml to just the TS plugins for this regen. Added
only the `countries: string[]` wire field to both service_client and
service_server; no other generated-file drift in this PR.
* chore(proto): regenerate openapi specs for countries[] field
Runs `make generate` with the sebuf v0.11.1 plugin now correctly
resolved via the PATH fix (cherry-picked from fix/makefile-generate-path-prefix).
The new `countries` field on EnergyDisruptionEntry propagates into:
- docs/api/SupplyChainService.openapi.yaml (primary per-service spec)
- docs/api/SupplyChainService.openapi.json (machine-readable variant)
- docs/api/worldmonitor.openapi.yaml (consolidated bundle)
No TypeScript drift beyond the already-committed service_client.ts /
service_server.ts updates in

commit e68a7147dd
chore(api): sebuf migration follow-ups (post-#3242) (#3287)
* chore(api-manifest): rewrite brief-why-matters reason as proper internal-helper justification
Carried in from #3248 merge as a band-aid (called out in #3242 review
followup checklist item 7). The endpoint genuinely belongs in
internal-helper — RELAY_SHARED_SECRET-bearer auth, cron-only caller, never
reached by dashboards or partners. Same shape constraint as api/notify.ts.
Replaces the apologetic "filed here to keep the lint green" framing with a
proper structural justification: modeling it as a generated service would
publish internal cron plumbing as user-facing API surface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(lint): premium-fetch parity check for ServiceClients (closes #3279)
Adds scripts/enforce-premium-fetch.mjs — AST-walks src/, finds every
`new <ServiceClient>(...)` (variable decl OR `this.foo =` assignment),
tracks which methods each instance actually calls, and fails if any called
method targets a path in src/shared/premium-paths.ts PREMIUM_RPC_PATHS
without `{ fetch: premiumFetch }` on the constructor. Per-call-site
analysis (not class-level) keeps the trade/index.ts pattern clean —
publicClient with globalThis.fetch + premiumClient with premiumFetch on the
same TradeServiceClient class — since publicClient never calls a premium
method.
Wired into:
- npm run lint:premium-fetch
- .husky/pre-push (right after lint:rate-limit-policies)
- .github/workflows/lint-code.yml (right after lint:api-contract)
Found and fixed three latent instances of the HIGH(new) #1 class from
#3242 review (silent 401 → empty fallback for signed-in browser pros):
- src/services/correlation-engine/engine.ts — IntelligenceServiceClient
built with no fetch option called deductSituation. LLM-assessment overlay
on convergence cards never landed for browser pros without a WM key.
- src/services/economic/index.ts — EconomicServiceClient with
globalThis.fetch called getNationalDebt. National-debt panel rendered
empty for browser pros.
- src/services/sanctions-pressure.ts — SanctionsServiceClient with
globalThis.fetch called listSanctionsPressure. Sanctions-pressure panel
rendered empty for browser pros.
All three swap to premiumFetch (single shared client, mirrors the
supply-chain/index.ts justification — premiumFetch no-ops safely on public
methods, so the public methods on those clients keep working).
Verification:
- lint:premium-fetch clean (34 ServiceClient classes, 28 premium paths,
466 src/ files analyzed)
- Negative test: revert any of the three to globalThis.fetch → exit 1 with
file:line and called-premium-method names
- typecheck + typecheck:api clean
- lint:api-contract / lint:rate-limit-policies / lint:boundaries clean
- tests/sanctions-pressure.test.mjs + premium-fetch.test.mts: 16/16 pass
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(military): fetchStaleFallback NEG_TTL=30s parity (closes #3277)
The legacy /api/military-flights handler had NEG_TTL = 30_000ms — a short
suppression window after a failed live + stale read so we don't
Redis-hammer the stale key during sustained relay+seed outages. Carried
into the sebuf list-military-flights handler:
- Module-scoped `staleNegUntil` timestamp (per-isolate on Vercel Edge,
which is fine — each warm isolate gets its own 30s suppression window).
- Set whenever fetchStaleFallback returns null (key missing, parse fail,
empty array after staleToProto filter, or thrown error).
- Checked at the entry of fetchStaleFallback before doing the Redis read.
- Test seam `_resetStaleNegativeCacheForTests()` exposed for unit tests.
Test pinned in tests/redis-caching.test.mjs: drives a stale-empty cycle
three times — first read hits Redis, second within window doesn't, after
test-only reset it does again.
Verified: 18/18 redis-caching tests pass, typecheck:api clean,
lint:premium-fetch clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
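A sketch of the 30s negative-suppression window from the NEG_TTL sub-commit above, assuming a simplified read function; the module-scoped timestamp and test seam follow the commit text.

```ts
// Hedged sketch; per-isolate state is intentional on Vercel Edge.
const NEG_TTL_MS = 30_000;
let staleNegUntil = 0;

export function _resetStaleNegativeCacheForTests(): void {
  staleNegUntil = 0;
}

async function fetchStaleFallback(
  readStale: () => Promise<unknown[] | null>, // stand-in for the Redis read
): Promise<unknown[] | null> {
  if (Date.now() < staleNegUntil) return null; // window active: skip Redis
  let rows: unknown[] | null = null;
  try {
    rows = await readStale();
  } catch {
    rows = null; // thrown error counts as a null result
  }
  if (!rows || rows.length === 0) {
    staleNegUntil = Date.now() + NEG_TTL_MS; // arm the 30s window on any miss
    return null;
  }
  return rows;
}
```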
* refactor(lint): rate-limit-policies regex → import() (closes #3278)
The previous lint regex-parsed ENDPOINT_RATE_POLICIES from the source file.
That worked because the literal happens to fit a single line per key today,
but a future reformat (multi-line key wrap, formatter swap, etc.) would
silently break the lint without breaking the build — exactly the failure
mode that's worse than no lint at all.
Fix:
- Export ENDPOINT_RATE_POLICIES from server/_shared/rate-limit.ts.
- Convert scripts/enforce-rate-limit-policies.mjs to async + dynamic
import() of the policy object directly. Same TS module that the gateway
uses at runtime → no source-of-truth drift possible.
- Run via tsx (already a dev dep, used by test:data) so the .mjs shebang
can resolve a .ts import.
- npm script swapped to `tsx scripts/...`. .husky/pre-push uses
`npm run lint:rate-limit-policies` so no hook change needed.
Verified:
- Clean: 6 policies / 182 gateway routes.
- Negative test (rename a key to the original sanctions typo
/api/sanctions/v1/lookup-entity): exit 1 with the same
incident-attributed remedy message as before.
- Reformat test (split a single-line entry across multiple lines): still
passes — the property is what's read, not the source layout.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(shipping/v2): alertThreshold: 0 preserved; drop dead validation branch (#3242 followup)
Before: alert_threshold was a plain int32. proto3 scalar default is 0, so
the handler couldn't distinguish "partner explicitly sent 0 (deliver every
disruption)" from "partner omitted the field (apply legacy default 50)" —
both arrived as 0 and got coerced to 50 by `> 0 ? : 50`. Silent
intent-drop for any partner who wanted every alert. The subsequent
`alertThreshold < 0` branch was also unreachable after that coercion.
After:
- Proto field is `optional int32 alert_threshold` — TS type becomes
`alertThreshold?: number`, so omitted = undefined and explicit 0 stays 0.
- Handler uses `req.alertThreshold ?? 50` — undefined → 50, any number
passes through unchanged.
- Dead `< 0 || > 100` runtime check removed; buf.validate
`int32.gte = 0, int32.lte = 100` already enforces the range at the wire
layer.
Partner wire contract: identical for the omit-field and 1..100 cases. Only
behavioural change is explicit 0 — previously impossible to request, now
honored per proto3 optional semantics.
Scoped `buf generate --path worldmonitor/shipping/v2` to avoid the
full-regen `@ts-nocheck` drift Seb documented in the #3242 PR comments.
Re-applied `@ts-nocheck` on the two regenerated files manually.
Tests:
- `alertThreshold 0 coerces to 50` flipped to `alertThreshold 0 preserved`.
- New test: `alertThreshold omitted (undefined) applies legacy default 50`.
- `rejects > 100` test removed — proto/wire validation handles it; direct
handler calls intentionally bypass wire and the handler no longer carries
a redundant runtime range check.
Verified: 18/18 shipping-v2-handler tests pass, typecheck + typecheck:api
clean, all 4 custom lints clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
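The proto3-optional fix reduces to one nullish-coalescing line; a sketch with an illustrative request type:

```ts
// Sketch: with `optional int32 alert_threshold`, the TS field becomes
// `alertThreshold?: number`, so omitted arrives as undefined while an
// explicit 0 survives the wire. Request type is illustrative.
interface RegisterWebhookRequest {
  alertThreshold?: number;
}

function resolveAlertThreshold(req: RegisterWebhookRequest): number {
  // undefined → legacy default 50; explicit 0 ("deliver every disruption")
  // and 1..100 pass through unchanged.
  return req.alertThreshold ?? 50;
}
```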
* docs(shipping/v2): document missing webhook delivery worker + DNS-rebinding contract (#3242 followup)
#3242 followup checklist item 6 from @koala73 — sanity-check that the
delivery worker honors the re-resolve-and-re-check contract that
isBlockedCallbackUrl explicitly delegates to it.
Audit finding: no delivery worker for shipping/v2 webhooks exists in this
repo. Grep across the entire tree (excluding generated/dist) shows the
only readers of webhook:sub:* records are the registration / inspection /
rotate-secret handlers themselves. No code reads them and POSTs to the
stored callbackUrl. The delivery worker is presumed to live in Railway
(separate repo) or hasn't been built yet — neither is auditable from this
repo.
Refreshes the comment block at the top of webhook-shared.ts to:
- explicitly state DNS rebinding is NOT mitigated at registration
- spell out the four-step contract the delivery worker MUST follow
(re-validate URL, dns.lookup, re-check resolved IP against patterns,
fetch with resolved IP + Host header preserved)
- flag the in-repo gap so anyone landing delivery code can't miss it
Tracking the gap as #3288 — acceptance there is "delivery worker imports
the patterns + helpers from webhook-shared.ts and applies the four steps
before each send." Action moves to wherever the delivery worker actually
lives (Railway likely).
No code change. Tests + lints unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(lint): add rate-limit-policies step (greptile P1 #3287)
Pre-push hook ran lint:rate-limit-policies but the CI workflow did not, so
fork PRs and --no-verify pushes bypassed the exact drift check the lint
was added to enforce (closes #3278). Adding it right after
lint:api-contract so it runs in the same context the lint was designed
for.

* refactor(lint): premium-fetch regex → import() + loop classRe (greptile P2 #3287)
Two fragilities greptile flagged on enforce-premium-fetch.mjs:
1. loadPremiumPaths regex-parsed src/shared/premium-paths.ts with
/'(\/api\/[^']+)'/g — same class of silent drift we just removed from
enforce-rate-limit-policies in #3278. Reformatting the source Set (double
quotes, spread, helper-computed entries) would drop paths from the lint
while leaving the runtime untouched.
Fix: flip the shebang to `#!/usr/bin/env -S npx tsx` and dynamic-import
PREMIUM_RPC_PATHS directly, mirroring the rate-limit pattern.
package.json lint:premium-fetch now invokes via tsx too so the npm-script
path matches direct execution.
2. loadClientClassMap ran classRe.exec once, silently dropping every
ServiceClient after the first if a file ever contained more than one.
Current codegen emits one class per file so this was latent, but a
template change would ship un-linted classes.
Fix: collect every class-open match with matchAll, slice each class body
with the next class's start as the boundary, and scan methods per-body so
method-to-class binding stays correct even with multiple classes per file.
Verification:
- lint:premium-fetch clean (34 classes / 28 premium paths / 466 files —
identical counts to pre-refactor, so no coverage regression).
- Negative test: revert src/services/economic/index.ts to
globalThis.fetch → exit 1 with file:line, bound var name, and premium
method list (getNationalDebt). Restore → clean.
- lint:rate-limit-policies still clean.
* fix(shipping/v2): re-add alertThreshold handler range guard (greptile nit 1 #3287)
Wire-layer buf.validate enforces 0..100, but direct handler invocation
(internal jobs, test harnesses, future transports) bypasses it. Cheap
invariant-at-the-boundary — rejects < 0 or > 100 with ValidationError
before the record is stored.
Tests: restored the rejects-out-of-range cases that were dropped when the
branch was (correctly) deleted as dead code on the previous commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lint): premium-fetch method-regex → TS AST (greptile nits 2+5 #3287)
loadClientClassMap: The method regex
`async (\w+)\s*\([^)]*\)\s*:\s*Promise<[^>]+>\s*\{\s*let path = "..."`
assumed (a) no nested `)` in arg types, (b) no nested `>` in the return
type, (c) `let path = "..."` as the literal first statement. Any codegen
template shift would silently drop methods with the lint still passing
clean — the same silent-drift class #3287 just closed on the
premium-paths side.
Now walks the service_client.ts AST, matches `export class
*ServiceClient`, iterates `MethodDeclaration` members, and reads the first
`let path: string = '...'` variable statement as a StringLiteral. Tolerant
to any reformatting of arg/return types or method shape.
findCalls scope-blindness: Added limitation comment — the walker matches
`<varName>.<method>()` anywhere in the file without respecting scope. Two
constructions in different function scopes sharing a var name merge their
called-method sets. No current src/ file hits this; the lint errs
cautiously (flags both instances). Keeping the walker simple until
scope-aware binding is needed.
webhook-shared.ts: Inlined issue reference (#3288) so the breadcrumb
resolves without bouncing through an MDX that isn't in the diff.
Verification:
- lint:premium-fetch clean — 34 classes / 28 premium paths / 489 files.
Pre-refactor: 34 / 28 / 466. Class + path counts identical; file bump is
from the main-branch rebase, not the refactor.
- Negative test: revert src/services/economic/index.ts premiumFetch →
globalThis.fetch. Lint exits 1 at `src/services/economic/index.ts:64:7`
with `premium method(s) called: getNationalDebt`. Restore → clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(lint): rate-limit OpenAPI regex → yaml parser (greptile nit 3 #3287)
Input side (ENDPOINT_RATE_POLICIES) was flipped to live `import()` in

commit 184e82cb40
feat(resilience): PR 3A — net-imports denominator for sovereignFiscalBuffer (#3380)
PR 3A of cohort-audit plan 2026-04-24-002. Construct correction for
re-export hubs: the SWF rawMonths denominator was gross imports, which
double-counted flow-through trade that never represents domestic
consumption. Net-imports fix:
rawMonths = aum / (grossImports × (1 − reexportShareOfImports)) × 12
applied to any country in the re-export share manifest. Countries NOT
in the manifest get gross imports unchanged (status-quo fallback).
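A sketch of the two helpers this implies (computeNetImports is named in the "What shipped" list below; its validation behavior here is an assumption):

```ts
// Sketch of the net-imports denominator math described above.
function computeNetImports(grossImports: number, reexportShareOfImports: number): number {
  if (reexportShareOfImports < 0 || reexportShareOfImports >= 1) {
    throw new RangeError("reexportShareOfImports must be in [0, 1)");
  }
  return grossImports * (1 - reexportShareOfImports);
}

// rawMonths = aum / (grossImports x (1 - reexportShareOfImports)) x 12
function rawMonths(aum: number, grossImports: number, share: number): number {
  return (aum / computeNetImports(grossImports, share)) * 12;
}
```

With share = 0.6 the denominator shrinks to 0.4 x gross, so rawMonths rises by 1/(1 - 0.6) = 2.5x, matching the construct invariant pinned below.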
Plan acceptance gates — verified synthetically in this PR:
Construct invariant. Two synthetic countries, same SWF, same gross
imports. A re-exports 60%; B re-exports 0%. Post-fix, A's rawMonths
is 2.5× B's (1/(1-0.6) = 2.5). Pinned in
tests/resilience-net-imports-denominator.test.mts.
SWF-heavy exporter invariant. Country with share ≤ 5%: rawMonths
lift < 5% vs baseline (negligible). Pinned.
What shipped
1. Re-export share manifest infrastructure.
- scripts/shared/reexport-share-manifest.yaml (new, empty) — schema
committed; entries populated in follow-up PRs with UNCTAD
Handbook citations.
- scripts/shared/reexport-share-loader.mjs (new) — loader + strict
validator, mirrors swf-manifest-loader.mjs.
- scripts/seed-recovery-reexport-share.mjs (new) — publishes
resilience:recovery:reexport-share:v1 from manifest. Empty
manifest = valid (no countries, no adjustment).
2. SWF seeder uses net-imports denominator.
- scripts/seed-sovereign-wealth.mjs exports computeNetImports(gross,
share) — pure helper, unit-tested.
- Per-country loop: reads manifest, computes denominatorImports,
applies to rawMonths math.
- Payload records annualImports (gross, audit), denominatorImports
(used in math), reexportShareOfImports (provenance).
- Summary log reports which countries had a net-imports adjustment
applied with source year.
3. Bundle wiring.
- Reexport-Share runs BEFORE Sovereign-Wealth in the recovery
bundle so the SWF seeder reads fresh re-export data in the same
cron tick.
- tests/seed-bundle-resilience-recovery.test.mjs expected-entries
updated (6 → 7) with ordering preservation.
4. Cache-prefix bump (per cache-prefix-bump-propagation-scope skill).
- RESILIENCE_SCORE_CACHE_PREFIX: v11 → v12
- RESILIENCE_RANKING_CACHE_KEY: v11 → v12
- RESILIENCE_HISTORY_KEY_PREFIX: v6 → v7 (history rotation prevents
30-day rolling window from mixing pre/post-fix scores and
manufacturing false "falling" trends on deploy day).
- Source of truth: server/worldmonitor/resilience/v1/_shared.ts
- Mirrored in: scripts/seed-resilience-scores.mjs,
scripts/validate-resilience-correlation.mjs,
scripts/backtest-resilience-outcomes.mjs,
scripts/validate-resilience-backtest.mjs,
scripts/benchmark-resilience-external.mjs, api/health.js
- Test literals bumped in 4 test files (26 line edits).
- EXTENDED tests/resilience-cache-keys-health-sync.test.mts with
a parity pass that reads every known mirror file and asserts
both (a) canonical prefix present AND (b) no stale v<older>
literals in non-comment code. Found one legacy log-line that
still referenced v9 (scripts/seed-resilience-scores.mjs:342)
and refactored it to use the RESILIENCE_RANKING_CACHE_KEY
constant so future bumps self-update.
Explicitly NOT in this PR
- liquidReserveAdequacy denominator fix. The plan's PR 3A wording
mentions both dims, but the RESERVES ratio (WB FI.RES.TOTL.MO) is a
PRE-COMPUTED WB series; applying a post-hoc net-imports adjustment
mixes WB's denominator year with our manifest-year, and the math
change belongs in PR 3B (unified liquidity) where the α calibration
is explicit. This PR stays scoped to sovereignFiscalBuffer.
- Live re-export share entries. The manifest ships EMPTY in this PR;
entries with UNCTAD citations are one-per-PR follow-ups so each
figure is individually auditable.
Verified
- tests/resilience-net-imports-denominator.test.mts — 9 pass (construct
contract: 2.5× ratio gate, monotonicity, boundary rejections,
backward-compat on missing manifest entry, cohort-proportionality,
SWF-heavy-exporter-unchanged)
- tests/reexport-share-loader.test.mts — 7 pass (committed-manifest
shape + 6 schema-violation rejections)
- tests/resilience-cache-keys-health-sync.test.mts — 5 pass (existing 3
+ 2 new parity checks across all mirror files)
- tests/seed-bundle-resilience-recovery.test.mjs — 17 pass (expected
entries bumped to 7)
- npm run test:data — 6714 pass / 0 fail
- npm run typecheck / typecheck:api — green
- npm run lint / lint:md — clean
Deployment notes
Score + ranking + history cache prefixes all bump in the same deploy.
Per established v10→v11 precedent (and the
cache-prefix-bump-propagation-scope skill):
- Score / ranking: 6h TTL — the new prefix populates via the Railway
resilience-scores cron within one tick.
- History: 30d ring — the v7 ring starts empty; the first 30 days
post-deploy lack baseline points, so trend / change30d will read as
"no change" until v7 accumulates a window.
- Legacy v11 keys can be deleted from Redis at any time post-deploy
(no reader references them). Leaving them in place costs storage
but does no harm.

commit 34dfc9a451
fix(news): ground LLM surfaces on real RSS description end-to-end (#3370)
* feat(news/parser): extract RSS/Atom description for LLM grounding (U1)
Add description field to ParsedItem, extract from the first non-empty of
description/content:encoded (RSS) or summary/content (Atom), picking the
longest after HTML-strip + entity-decode + whitespace-normalize. Clip to
400 chars. Reject empty, <40 chars after strip, or normalize-equal to the
headline — downstream consumers fall back to the cleaned headline on '',
preserving current behavior for feeds without a description.
CDATA end is anchored to the closing tag so internal ]]> sequences do not
truncate the match. Preserves cached rss:feed:v1 row compatibility during
the 1h TTL bleed since the field is additive.
Part of fix: pipe RSS description end-to-end so LLM surfaces stop
hallucinating named actors (docs/plans/2026-04-24-001-...).
Covers R1, R7.
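A condensed sketch of the U1 extraction rules (longest candidate, clip, minimum length, headline-dup rejection); the real parser additionally entity-decodes and anchors CDATA ends, which this sketch omits.

```ts
// Hedged sketch; helper names are assumptions, not the repo's parser API.
const stripHtml = (s: string): string => s.replace(/<[^>]*>/g, " ");
const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

function extractDescription(candidates: string[], headline: string): string {
  // First non-empty of description/content:encoded (RSS) or
  // summary/content (Atom); pick the longest after strip + normalize.
  const cleaned = candidates
    .map((c) => normalize(stripHtml(c)))
    .filter((c) => c.length > 0)
    .sort((a, b) => b.length - a.length);
  const best = cleaned[0] ?? "";
  // Reject empty, < 40 chars after strip, or normalize-equal to the
  // headline; "" tells downstream readers to fall back to the headline.
  if (best.length < 40 || best === normalize(headline)) return "";
  return best.slice(0, 400); // clip to 400 chars
}
```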
* feat(news/story-track): persist description on story:track:v1 HSET (U2)
Append description to the story:track:v1 HSET only when non-empty. Additive
— no key version bump. Old rows and rows from feeds without a description
return undefined on HGETALL, letting downstream readers fall back to the
cleaned headline (R6).
Extract buildStoryTrackHsetFields as a pure helper so the inclusion gate is
unit-testable without Redis.
Update the contract comment in cache-keys.ts so the next reader of the
schema sees description as an optional field.
Covers R2, R6.
* feat(proto): NewsItem.snippet + SummarizeArticleRequest.bodies (U3)
Add two additive proto fields so the article description can ride to every
LLM-adjacent consumer without a breaking change:
- NewsItem.snippet (field 12): RSS/Atom description, HTML-stripped,
≤400 chars, empty when unavailable. Wired on toProtoItem.
- SummarizeArticleRequest.bodies (field 8): optional article bodies
paired 1:1 with headlines for prompt grounding. Empty array is today's
headline-only behavior.
Regenerated TS client/server stubs and OpenAPI YAML/JSON via sebuf v0.11.1
(PATH=~/go/bin required — Homebrew's protoc-gen-openapiv3 is an older
pre-bundle-mode build that collides on duplicate emission).
Pre-emptive bodies:[] placeholders at the two existing SummarizeArticle
call sites in src/services/summarization.ts; U6 replaces them with real
article bodies once SummarizeArticle handler reads the field.
Covers R3, R5.
* feat(brief/digest): forward RSS description end-to-end through brief envelope (U4)
Digest accumulator reader (seed-digest-notifications.mjs::buildDigest) now
plumbs the optional `description` field off each story:track:v1 HGETALL into
the digest story object. The brief adapter (brief-compose.mjs::
digestStoryToUpstreamTopStory) prefers the real RSS description over the
cleaned headline; when the upstream row has no description (old rows in the
48h bleed, feeds that don't carry one), we fall back to the cleaned headline
so today behavior is preserved (R6).
This is the upstream half of the description cache path. U5 lands the LLM-
side grounding + cache-prefix bump so Gemini actually sees the article body
instead of hallucinating a named actor from the headline.
Covers R4 (upstream half), R6.
* feat(brief/llm): RSS grounding + sanitisation + 4 cache prefix bumps (U5)
The actual fix for the headline-only named-actor hallucination class:
Gemini 2.5 Flash now receives the real article body as grounding context,
so it paraphrases what the article says instead of filling role-label
headlines from parametric priors ("Iran's new supreme leader" → "Ali
Khamenei" was the 2026-04-24 reproduction; with grounding, it becomes
the actual article-named actor).
Changes:
- buildStoryDescriptionPrompt interpolates a `Context: <body>` line
between the metadata block and the "One editorial sentence" instruction
when description is non-empty AND not normalise-equal to the headline.
Clips to 400 chars as a second belt-and-braces after the U1 parser cap.
No Context line → identical prompt to pre-fix (R6 preserved); the gate
is sketched at the end of this entry.
- sanitizeStoryForPrompt extended to cover `description`. Closes the
asymmetry where whyMatters was sanitised and description wasn't —
untrusted RSS bodies now flow through the same injection-marker
neutraliser before prompt interpolation. generateStoryDescription wraps
the story in sanitizeStoryForPrompt before calling the builder,
matching generateWhyMatters.
- Four cache prefixes bumped atomically to evict pre-grounding rows:
scripts/lib/brief-llm.mjs:
brief:llm:description:v1 → v2 (Railway, description path)
brief:llm:whymatters:v2 → v3 (Railway, whyMatters fallback)
api/internal/brief-why-matters.ts:
brief:llm:whymatters:v6 → v7 (edge, primary)
brief:llm:whymatters:shadow:v4 → shadow:v5 (edge, shadow)
hashBriefStory already includes description in the 6-field material
(v5 contract) so identity naturally drifts; the prefix bump is the
belt-and-braces that guarantees a clean cold-start on first tick.
- Tests: 8 new + 2 prefix-match updates on tests/brief-llm.test.mjs.
Covers Context-line injection, empty/dup-of-headline rejection,
400-char clip, sanitisation of adversarial descriptions, v2 write,
and legacy-v1 row dark (forced cold-start).
Covers R4 + new sanitisation requirement.
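The Context-line gate from the first bullet above, sketched with illustrative prompt wording (the real template and interpolation point differ):

```ts
// Hedged sketch of the Context-line gate in buildStoryDescriptionPrompt.
function buildContextLine(headline: string, description: string): string {
  const norm = (s: string) => s.replace(/\s+/g, " ").trim().toLowerCase();
  // Empty or headline-duplicate descriptions emit no Context line, so the
  // prompt stays byte-identical to the pre-fix prompt (R6 preserved).
  if (!description || norm(description) === norm(headline)) return "";
  // 400-char clip as belt-and-braces after the U1 parser cap.
  return `Context: ${description.slice(0, 400)}\n`;
}
```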
* feat(news/summarize): accept bodies + bump summary cache v5→v6 (U6)
SummarizeArticle now grounds on per-headline article bodies when callers
supply them, so the dashboard "News summary" path stops hallucinating
across unrelated headlines when the upstream RSS carried context.
Three coordinated changes:
1. SummarizeArticleRequest handler reads req.bodies, sanitises each entry
through sanitizeForPrompt (same trust treatment as geoContext — bodies
are untrusted RSS text), clips to 400 chars, and pads to the headlines
length so pair-wise identity is stable.
2. buildArticlePrompts accepts optional bodies and interleaves a
` Context: <body>` line under each numbered headline that has a
non-empty body. Skipped in translate mode (headline[0]-only) and when
all bodies are empty — yielding a byte-identical prompt to pre-U6
for every current caller (R6 preserved).
3. summary-cache-key bumps CACHE_VERSION v5→v6 so the pre-grounding rows
(produced from headline-only prompts) cold-start cleanly. Extends
canonicalizeSummaryInputs + buildSummaryCacheKey with a pair-wise
bodies segment `:bd<hash>`; the prefix is `:bd` rather than `:b` to
avoid colliding with `:brief:` when pattern-matching keys. Translate
mode is headline[0]-only and intentionally does not shift on bodies.
Dedup reorder preserved: the handler re-pairs bodies to the deduplicated
top-5 via findIndex, so layout matches without breaking cache identity.
New tests: 7 on buildArticlePrompts (bodies interleave, partial fill,
translate-mode skip, clip, short-array tolerance), 8 on
buildSummaryCacheKey (pair-wise sort, cache-bust on body drift, translate
skip). Existing summary-cache-key assertions updated v5→v6.
Covers R3, R4.
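The `:bd<hash>` segment described above, sketched with a stand-in hash (the repo's real buildSummaryCacheKey and its hash function are not reproduced here):
```ts
// FNV-1a style stand-in hash: an assumption, not the production hash.
function tinyHash(input: string): string {
  let h = 2166136261;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0).toString(36);
}

function buildSummaryCacheKey(headlines: string[], bodies: string[] = []): string {
  const base = `summary:v6:${tinyHash(headlines.join('\u0000'))}`;
  // All-empty bodies: key identical to the pre-U6 shape.
  if (bodies.every((b) => !b)) return base;
  // `:bd` rather than `:b` so key pattern-matching never collides with `:brief:`.
  return `${base}:bd${tinyHash(bodies.join('\u0000'))}`;
}
```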
* feat(consumers): surface RSS snippet across dashboard, email, relay, MCP + audit (U7)
Thread the RSS description from the ingestion path (U1-U5) into every
user-facing LLM-adjacent surface. Audit the notification producers so
RSS-origin and domain-origin events stay on distinct contracts.
Dashboard (proto snippet → client → panel):
- src/types/index.ts NewsItem.snippet?:string (client-side field).
- src/app/data-loader.ts proto→client mapper propagates p.snippet.
- src/components/NewsPanel.ts renders snippet as a truncated (~200 chars,
word-boundary ellipsis) `.item-snippet` line under each headline.
- NewsPanel.currentBodies tracks per-headline bodies paired 1:1 with
currentHeadlines; passed as options.bodies to generateSummary so the
server-side SummarizeArticle LLM grounds on the article body.
Summary plumbing:
- src/services/summarization.ts threads bodies through SummarizeOptions
→ generateSummary → runApiChain → tryApiProvider; cache key now includes
bodies (via U6's buildSummaryCacheKey signature).
MCP world-brief:
- api/mcp.ts pairs headlines with their RSS snippets and POSTs `bodies`
to /api/news/v1/summarize-article so the MCP tool surface is no longer
starved of article context.
Email digest:
- scripts/seed-digest-notifications.mjs plain-text formatDigest appends
a ~200-char truncated snippet line under each story; HTML formatDigestHtml
renders a dim-grey description div between title and meta. Both gated
on non-empty description (R6 — empty → today's behavior).
Real-time alerts:
- src/services/breaking-news-alerts.ts BreakingAlert gains optional
description; checkBatchForBreakingAlerts reads item.snippet; dispatchAlert
includes `description` in the /api/notify payload when present.
Notification relay:
- scripts/notification-relay.cjs formatMessage gated on
NOTIFY_RELAY_INCLUDE_SNIPPET=1 (default off). When on, RSS-origin
payloads render a `> <snippet>` context line under the title. When off
or payload.description absent, output is byte-identical to pre-U7.
Audit (RSS vs domain):
- tests/notification-relay-payload-audit.test.mjs enforces file-level
@notification-source tags on every producer, rejects `description:` in
domain-origin payload blocks, and verifies the relay codepath gates
snippet rendering under the flag.
- Tag added to ais-relay.cjs (domain), seed-aviation.mjs (domain),
alert-emitter.mjs (domain), breaking-news-alerts.ts (rss).
Deferred (plan explicitly flags): InsightsPanel + cluster-producer
plumbing (bodies default to [] — will unlock gradually once news:insights:v1
producer also carries primarySnippet).
Covers R5, R6.
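The ~200-char word-boundary truncation used by the panel and email renderers, written as one hypothetical shared helper (the deferred list in the next commit notes that no shared truncateForDisplay exists yet; this is a sketch of what it could look like):
```ts
function truncateForDisplay(text: string, max = 200): string {
  const t = text.trim();
  if (t.length <= max) return t;
  const cut = t.slice(0, max);
  const lastSpace = cut.lastIndexOf(' ');
  // Prefer breaking at the last word boundary inside the budget.
  return `${(lastSpace > 0 ? cut.slice(0, lastSpace) : cut).trimEnd()}…`;
}
```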
* docs+test: grounding-path note + bump pinned CACHE_VERSION v5→v6 (U8)
Final verification for the RSS-description-end-to-end fix:
- docs/architecture.mdx — one-paragraph "News Grounding Pipeline"
subsection tracing parser → story:track:v1.description → NewsItem.snippet
→ brief / SummarizeArticle / dashboard / email / relay / MCP, with the
empty-description R6 fallback rule called out explicitly.
- tests/summarize-reasoning.test.mjs — Fix-4 static-analysis pin updated
to match the v6 bump from U6. Without this update, the v6 summary-cache
bump would have broken CI's pinned-version assertion.
Final sweep (2026-04-24):
- grep -rn 'brief:llm:description:v1' → only in the U5 legacy-row test
simulation (by design: proves the v2 bump forces cold-start).
- grep -rn 'brief:llm:whymatters:v2/v6/shadow:v4' → no live references.
- grep -rn 'summary:v5' → no references.
- CACHE_VERSION = 'v6' in src/utils/summary-cache-key.ts.
- Full tsx --test sweep across all tests/*.test.{mjs,mts}: 6747/6747 pass.
- npm run typecheck + typecheck:api: both clean.
Covers R4, R6, R7.
* fix(rss-description): address /ce:review findings before merge
14 fixes from structured code review across 13 reviewer personas.
Correctness-critical (P1 — fixes that prevent R6/U7 contract violations):
- NewsPanel signature covers currentBodies so view-mode toggles that leave
headlines identical but bodies different now invalidate in-flight summaries.
Without this, switching renderItems → renderClusters mid-summary let a
grounded response arrive under a stale (now-orphaned) cache key.
- summarize-article.ts re-pairs bodies with headlines BEFORE dedup via a
single zip-sanitize-filter-dedup pass (see the sketch after this list).
Previously bodies[] was indexed by position in the light-sanitized
headlines while findIndex looked up the full-sanitized array — any
headline that sanitizeHeadlines emptied mispaired every subsequent body,
grounding the LLM on the wrong story.
- Client skips the pre-chain cache lookup when bodies are present, since
the client builds keys from RAW bodies while the server sanitizes first.
The keys diverge on injection content, which would otherwise silently
miss the server's authoritative cache on every call.
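A sketch of that single zip-sanitize-filter-dedup pass (helper names assumed); pairing before any filtering is what makes a dropped headline unable to shift the bodies that follow it:
```ts
function pairSanitizeDedup(
  headlines: string[],
  bodies: string[],
  sanitize: (s: string) => string,
): Array<{ headline: string; body: string }> {
  const seen = new Set<string>();
  const out: Array<{ headline: string; body: string }> = [];
  for (let i = 0; i < headlines.length; i++) {
    const headline = sanitize(headlines[i] ?? '');
    if (!headline) continue; // sanitizer emptied it, so its body drops with it
    if (seen.has(headline)) continue; // dedup keeps the first occurrence's body
    seen.add(headline);
    out.push({ headline, body: sanitize(bodies[i] ?? '') });
  }
  return out;
}
```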
Test + audit hardening:
- Legacy v1 eviction test now uses the real hashBriefStory(story()) suffix
instead of a literal "somehash", so a bug where the reader still queried
the v1 prefix at the real key would actually be caught.
- tests/summary-cache-key.test.mts adds 400-char clip identity coverage so
the canonicalizer's clip and any downstream clip can't silently drift.
- tests/news-rss-description-extract.test.mts renames the well-formed
CDATA test and adds a new test documenting the malformed-]]> fallback
behavior (plain regex captures, article content survives).
Safe_auto cleanups:
- Deleted dead SNIPPET_PUSH_MAX constant in notification-relay.cjs.
- BETA-mode groq warm call now passes bodies, warming the right cache slot.
- seed-digest shares a local normalize-equality helper for description !=
headline comparison, matching the parser's contract.
- Pair-wise sort in summary-cache-key tie-breaks on body so duplicate
headlines produce stable order across runs.
- buildSummaryCacheKey gained JSDoc documenting the client/server contract
and the bodies parameter semantics.
- MCP get_world_brief tool description now mentions RSS article-body
grounding so calling agents see the current contract.
- _shared.ts `opts.bodies![i]!` double-bang replaced with `?? ''`.
- extractRawTagBody regexes cached in module-level Map, mirroring the
existing TAG_REGEX_CACHE pattern.
Deferred to follow-up (tracked for PR description / separate issue):
- Promote shared MAX_BODY constant across the 5 clip sites
- Promote shared truncateForDisplay helper across 4 render sites
- Collapse NewsPanel.{currentHeadlines, currentBodies} → Array<{title, snippet}>
- Promote sanitizeStoryForPrompt to shared/brief-llm-core.js
- Split list-feed-digest.ts parser helpers into sibling -utils.ts
- Strengthen audit test: forward-sweep + behavioral gate test
Tests: 6749/6749 pass. Typecheck clean on both configs.
* fix(summarization): thread bodies through browser T5 path (Codex #2)
Addresses the second of two Codex-raised findings on PR #3370:
The PR threaded bodies through the server-side API provider chain
(Ollama → Groq → OpenRouter → /api/news/v1/summarize-article) but the
local browser T5 path at tryBrowserT5 was still summarising from
headlines alone. In BETA_MODE that ungrounded path runs BEFORE the
grounded server providers; in normal mode it remains the last
fallback. Whenever T5-small won, the dashboard summary surface
regressed to the headline-only path — the exact hallucination class
this PR exists to eliminate.
Fix: tryBrowserT5 accepts an optional `bodies` parameter and
interleaves each body with its paired headline via a `headline —
body` separator in the combined text (clipped to 200 chars per body
to stay within T5-small's ~512-token context window). All three call
sites (BETA warm, BETA cold, normal-mode fallback) now pass the
bodies threaded down from generateSummary options.bodies.
When bodies is empty/omitted, the combined text is byte-identical to
pre-fix (R6 preserved).
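A minimal sketch of that interleaving (function and constant names assumed; the `headline — body` separator is the one named in the commit):
```ts
const T5_BODY_CLIP = 200; // per-body clip keeps T5-small inside its ~512-token window

function buildT5CombinedText(headlines: string[], bodies: string[] = []): string {
  return headlines
    .map((h, i) => {
      const body = (bodies[i] ?? '').trim().slice(0, T5_BODY_CLIP);
      // Empty/omitted body: the line is just the headline, so the combined
      // text is byte-identical to pre-fix (R6 preserved).
      return body ? `${h} — ${body}` : h;
    })
    .join('\n');
}
```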
On Codex finding #1 (story:track:v1 additive-only HSET keeps a body
from an earlier mention of the same normalized title), declining to
change. The current rule — "if this mention has a body, overwrite;
otherwise leave the prior body alone" — is defensible: a body from
mention A is not falsified by mention B being body-less (a wire
reprint doesn't invalidate the original source's body). A feed that
publishes a corrected headline creates a new normalized-title hash,
so no stale body carries forward. The failure window is narrow (live
story evolving while keeping the same title through hours of
body-less wire reprints) and the 7-day STORY_TTL is the backstop.
Opening a follow-up issue to revisit semantics if real-world evidence
surfaces a stale-grounding case.
* fix(story-track): description always-written to overwrite stale bodies (Codex #1)
Revisiting Codex finding #1 on PR #3370 after re-review. The previous
response declined the fix with reasoning; on reflection the argument
was over-defending the current behavior.
Problem: buildStoryTrackHsetFields previously wrote `description` only
when non-empty. Because story:track:v1 rows are collapsed by
normalized-title hash, an earlier mention's body would persist for up
to STORY_TTL (7 days) on subsequent body-less mentions of the same
story. Consumers reading `track.description` via HGETALL could not
distinguish "this mention's body" from "some mention's body from the
last week," silently grounding brief / whyMatters / SummarizeArticle
LLMs on text the current mention never supplied. That violates the
grounding contract advertised to every downstream surface in this PR.
Fix: HSET `description` unconditionally on every mention — empty
string when the current item has no body, real body when it does. An
empty value overwrites any prior mention's body so the row is always
authoritative for the current cycle. Consumers continue to treat
empty description as "fall back to cleaned headline" (R6 preserved).
The 7-day STORY_TTL and normalized-title hash semantics are unchanged.
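The always-written rule as a field-builder sketch (the function name comes from the commit; the surrounding item shape is an assumption):
```ts
function buildStoryTrackHsetFields(item: { title: string; description?: string }) {
  return {
    title: item.title,
    // Unconditional write: '' when this mention has no body. The empty value
    // overwrites any prior mention's body, so the row is authoritative for
    // the current cycle; consumers read '' as "fall back to cleaned headline" (R6).
    description: item.description ?? '',
  };
}
```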
Trade-off accepted: a valid body from Feed A (NYT) is wiped when Feed
B (AP body-less wire reprint) arrives for the same normalized title,
even though Feed A's body is factually correct. Rationale: the
alternative — keeping Feed A's body indefinitely — means the user
sees Feed A's body attributed (by proximity) to an AP mention at a
later timestamp, which is at minimum misleading and at worst carries
retracted/corrected details. Honest absence beats unlabeled presence.
Tests: new stale-body overwrite sequence test (T0 body → T1 empty →
T2 new body), existing "writes description when non-empty" preserved,
existing "omits when empty" inverted to "writes empty, overwriting."
cache-keys.ts contract comment updated to mark description as
always-written rather than optional.
|
||
|
|
d521924253 |
fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs (#3363)
* fix(resilience): fail closed on missing v2 energy seeds + health CRIT on absent inputs
PR #3289 shipped the v2 energy construct behind RESILIENCE_ENERGY_V2_ENABLED
(default false). Audit on 2026-04-24 after the user flagged "AE only moved
1.49 points — we added nuclear credit, we should see more" revealed two
safety gaps that made a future flag flip unsafe:
1. scoreEnergyV2 silently fell back to IMPUTE when any of its three required
   Redis seeds (low-carbon-generation, fossil-electricity-share,
   power-losses) was null. A future operator flipping the flag with seeds
   absent would produce fabricated-looking numbers for every country with
   zero operator signal.
2. api/health.js had those three seed labels in BOTH SEED_META (CRIT on
   missing) AND ON_DEMAND_KEYS (which demotes CRIT to WARN). The demotion
   won. Health has been reporting WARNING on a scorer dependency that has
   been 100% missing since PR #3289 merged — no paging trail existed.
Changes:
server/worldmonitor/resilience/v1/_dimension-scorers.ts
- Add ResilienceConfigurationError with missingKeys[] payload.
- scoreEnergy: preflight the three v2 seeds when flag=true. Throw
  ResilienceConfigurationError listing the specific absent keys.
- scoreAllDimensions: wrap per-dimension dispatch in try/catch so a thrown
  ResilienceConfigurationError routes to the source-failure shape
  (imputationClass='source-failure', coverage=0) for that ONE dimension —
  country keeps scoring other dims normally. Log once per country-dimension
  pair so the gap is audit-traceable.
api/health.js
- Remove lowCarbonGeneration / fossilElectricityShare / powerLosses from
  ON_DEMAND_KEYS. They stay in BOOTSTRAP_KEYS + SEED_META.
- Replace the transitional comment with a hard "do NOT add these back" note
  pointing at the scorer's fail-closed gate.
tests/resilience-energy-v2.test.mts
- New test: flag on + ALL three seeds missing → throws
  ResilienceConfigurationError naming all three keys.
- New test: flag on + only one seed missing → throws naming ONLY the
  missing key (operator-clarity guard).
- New test: flag on + all seeds present → v2 runs normally.
- Update the file-level invariant comment to reflect the new fail-closed
  contract (replacing the prior "degrade gracefully" wording that codified
  the silent-IMPUTE bug).
- Note: fixture's `??` fallbacks coerce null-overrides into real data, so
  the preflight tests use a direct-reader helper.
docs/methodology/country-resilience-index.mdx
- New "Fail-closed semantics" paragraph in the v2 Energy section documenting
  the throw + source-failure + health-CRIT contract.
Non-goals (intentional):
- This PR does NOT flip RESILIENCE_ENERGY_V2_ENABLED.
- This PR does NOT provision seed-bundle-resilience-energy-v2 on Railway.
- This PR does NOT touch RESILIENCE_PILLAR_COMBINE_ENABLED.
Operational effect post-merge:
- /api/health flips from WARNING → CRITICAL on the three v2 seed-meta
  entries. That is the intended alarm; it reveals that the Railway bundle
  was never provisioned.
- scoreEnergy behavior with flag=false is unchanged (legacy path).
- scoreEnergy behavior with flag=true + seeds present is unchanged.
- scoreEnergy behavior with flag=true + seeds absent changes from "silently
  IMPUTE all 217 countries" to "source-failure on the energy dim for every
  country, visible in widget + API response".
Tests: 511/511 resilience-* pass. Biome clean. Lint:md clean.
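A minimal sketch of the fail-closed preflight (the error class name comes from the commit; the seed-map shape is an assumption):
```ts
class ResilienceConfigurationError extends Error {
  constructor(public missingKeys: string[]) {
    super(`energy v2 seeds missing: ${missingKeys.join(', ')}`);
  }
}

function preflightEnergyV2Seeds(seeds: Record<string, unknown>): void {
  const missing = Object.entries(seeds)
    .filter(([, value]) => value == null)
    .map(([key]) => key);
  // Fail closed: name every absent key so the operator gets one actionable
  // error instead of 217 countries of silently imputed scores.
  if (missing.length > 0) throw new ResilienceConfigurationError(missing);
}
```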
Related plan: docs/plans/2026-04-24-001-fix-resilience-v2-fail-closed-on-missing-seeds-plan.md
* docs(resilience): scrub stale ON_DEMAND_KEYS references for v2 energy seeds
Greptile P2 on PR #3363: four stale references implied the three v2 energy
seeds were still gated as ON_DEMAND_KEYS (WARN-on-missing) even though this
PR's api/health.js change removed them (now strict SEED_META = CRIT on
missing). Scrubbing each:
- api/health.js:196 (BOOTSTRAP_KEYS comment) — was "ON_DEMAND_KEYS until
  Railway cron provisions; see below." Updated to cite plan 2026-04-24-001
  and the strict-SEED_META posture.
- api/health.js:398 (SEED_META comment) — was "Listed in ON_DEMAND_KEYS
  below until Railway cron provisions..." Updated for the same reason.
- docs/methodology/country-resilience-index.mdx:635 — v2.1 changelog entry
  said seed keys were ON_DEMAND_KEYS until graduation. Replaced with the
  fail-closed contract description.
- docs/methodology/energy-v2-flag-flip-runbook.md:25 — step 3 said
  "ON_DEMAND_KEYS graduation" was required at flag-flip time. Rewrote to
  explain that no graduation step is needed because the posture was removed
  pre-activation.
No code change. Tests still 14/14 on the energy-v2 suite, lint:md clean.
* fix(docs): escape MDX-unsafe `<=` in energy-v2 runbook to unblock Mintlify
Mintlify deploy on PR #3363 failed with `Unexpected character '=' (U+003D)
before name` at `docs/methodology/energy-v2-flag-flip-runbook.md`. Two lines
had `<=` in plain prose, which MDX tries to parse as a JSX tag start.
Replaced both with `≤` (U+2264) — and promoted the two existing `>=` on
adjacent lines to `≥` for consistency. Prose is clearer and MDX-safe. Same
pattern as the `mdx-unsafe-patterns-in-md` skill; also adjacent to PR
#3344's `(<137 countries)` fix.
|
||
|
|
53c50f4ba9 |
fix(swf): move manifest next to its loader so Railway ships it (#3344)
PR #3336 fixed the yaml dep but the next Railway tick crashed with `ENOENT:
no such file or directory, open
'/docs/methodology/swf-classification-manifest.yaml'`.
Root cause: the loader at scripts/shared/swf-manifest-loader.mjs resolved
`../../docs/methodology/swf-classification-manifest.yaml`, which works in a
full repo checkout but lands at `/docs/...` (outside `/app`) in the Railway
recovery-bundle container. That service has rootDirectory=scripts/ in the
dashboard, so NIXPACKS only copies `scripts/` into the image — `docs/` is
never shipped.
Fix: move the YAML to scripts/shared/swf-classification-manifest.yaml,
alongside its loader. MANIFEST_PATH becomes
`./swf-classification-manifest.yaml` so the file is always adjacent to the
code that reads it, regardless of rootDirectory.
Tests: 53/53 SWF tests still pass; biome clean on changed files.
|
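One way the keep-the-file-next-to-its-loader contract stays robust is module-relative resolution; a hedged sketch (the actual loader may simply use the `./` relative path named above):
```ts
import { dirname, join } from 'node:path';
import { fileURLToPath } from 'node:url';

// Resolve relative to this module's own location rather than the process
// cwd or a repo-root guess, so the YAML ships wherever the loader ships,
// regardless of the service's rootDirectory setting.
const MANIFEST_PATH = join(
  dirname(fileURLToPath(import.meta.url)),
  'swf-classification-manifest.yaml',
);
```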
||
|
|
8b12ecdf43 |
fix(aviation): seeder writes delays-bootstrap aggregate (close EMPTY-on-quiet-traffic alarm) (#3334)
* fix(aviation): seeder writes delays-bootstrap aggregate (close EMPTY-on-quiet-traffic alarm)
api/health.js BOOTSTRAP_KEYS.flightDelays points at
aviation:delays-bootstrap:v1, but no seeder ever produced it — the key was
only written as a 1800s side-effect inside list-airport-delays.ts. Quiet
user-traffic windows >30 min let the bootstrap expire, tripping EMPTY (CRIT)
even with healthy upstream FAA + intl + NOTAM seeds. PR #3073 (Apr 13)
doubled the cron cadence to 30 min, putting the bootstrap TTL right at the
failure edge.
Make seed-aviation.mjs the canonical writer:
- New writeDelaysBootstrap() reads FAA + intl + NOTAM from Redis, applies
  the same NOTAM merge + Normal-operations filler the RPC builds, writes
  aviation:delays-bootstrap:v1 with TTL=7200 (~4 missed cron ticks of
  cushion).
- Called pre-runSeed (last-good intl, covers intl-fail tick) AND inside
  afterPublishIntl (this-tick intl, happy-path overwrite).
- Bump RPC's incidental write TTL 1800 → 7200 so a user-triggered RPC
  doesn't shorten the seeder's expiry and re-create the failure mode.
NOTAM merge logic + filler shape are now mirrored in two files (seeder +
RPC's _shared.ts). Both carry comments pointing at the other to surface
drift risk.
Verified: typecheck (both tsconfigs) clean; node --test
tests/aviation-*.test.mjs green; full test:data 6590/6590 green.
* fix(aviation): seeder writes restrictedIcaos + bootstrap unwraps intl envelope
PR #3334 review (P1 + P2):
P1 — bootstrap silently dropped NOTAM restrictions
seedNotamClosures() only tracked NOTAM_CLOSURE_QCODES; the live RPC's
classifier in server/worldmonitor/aviation/v1/_shared.ts also derives
restrictions via NOTAM_RESTRICTION_QCODES (RA, RO) + restriction code45s +
restriction-text regex. Seeded NOTAM payload only had `closedIcaos`, so
restrictedIcaos was always empty in Redis — both the new bootstrap aggregate
AND the RPC's seed-read path silently dropped every NOTAM restriction.
Mirror the full classifier from _shared.ts:438-452; side-car write now
includes restrictedIcaos and seed-meta count reflects closures +
restrictions.
P2 — pre-runSeed bootstrap built with no intl alerts on intl-fail tick
runSeed wraps the canonical INTL_KEY in {_seed, data} when declareRecords
is enabled. writeDelaysBootstrap()'s upstashGet only JSON.parsed — no
envelope unwrap — so intlPayload.alerts was undefined on the pre-runSeed
bootstrap-build path, and an intl-fail tick would publish a bootstrap with
all intl alerts dropped instead of preserving the last-good snapshot. Add
upstashGetUnwrapped() (delegates to unwrapEnvelope from
_seed-envelope-source.mjs); use it for all three reads (FAA/NOTAM bare
values pass through unchanged via unwrapEnvelope's permissive path).
Verified: typecheck (both tsconfigs) clean; aviation + edge-functions tests
green; full test:data 6590/6590 green.
* fix(aviation): bootstrap iterates union of seeder + RPC airport registries
PR #3334 review (P2 ×2):
P2 — AIRPORTS vs MONITORED_AIRPORTS registry drift
Today the two diverge by ~45 iata codes (29 RPC-only, 16 seeder-only).
Pre-fix the bootstrap iterated the seeder's local AIRPORTS list for
Normal-operations filler and NOTAM airport lookup, so 29 monitored airports
never appeared in the bootstrap aggregate even though the live RPC included
them. Fix: parse src/config/airports.ts as text at startup (regex over the
static const), memoise the parse, build a by-iata Map union (seeder wins on
conflict for canonical meta), and iterate that for both NOTAM lookup and
filler.
First-run divergence summary logged to surface future drift in cron logs
without blocking writes. Degrades to seeder AIRPORTS only with a warning if
parse fails.
P2 — afterPublishIntl receives raw pre-transform data
runSeed forwards the RAW fetchIntl() result to afterPublish, NOT the
publishTransform()'d shape. Today publishTransform is a pass-through
wrapper so data.alerts is correct, but coupling is subtle — added an inline
CONTRACT comment so a future publishTransform mutation doesn't silently
drift bootstrap from INTL_KEY.
Verified: typecheck (both tsconfigs) clean; aviation + edge-functions tests
green; full test:data 6590/6590 green; standalone parse harness recovers
all 111 MONITORED_AIRPORTS rows.
|
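A sketch of the envelope-tolerant read from the second commit above (shapes assumed; the real unwrapEnvelope lives in _seed-envelope-source.mjs):
```ts
// Seeded keys may arrive wrapped as {_seed, data} when declareRecords is on.
function unwrapEnvelope(value: unknown): unknown {
  if (value !== null && typeof value === 'object' && '_seed' in value && 'data' in value) {
    return (value as { data: unknown }).data;
  }
  return value; // permissive path: bare FAA/NOTAM payloads pass through unchanged
}

async function upstashGetUnwrapped(
  get: (key: string) => Promise<string | null>,
  key: string,
): Promise<unknown> {
  const raw = await get(key);
  return raw == null ? null : unwrapEnvelope(JSON.parse(raw));
}
```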
||
|
|
d3d406448a |
feat(resilience): PR 2 §3.4 recovery-domain weight rebalance (#3328)
* feat(resilience): PR 2 §3.4 recovery-domain weight rebalance
Dials the two PR 2 §3.4 recovery dims (liquidReserveAdequacy,
sovereignFiscalBuffer) to ~10% share each of the recovery-domain
score via a new per-dimension weight channel in the coverage-weighted
mean. Matches the plan's direction that the sovereign-wealth signal
complement — rather than dominate — the classical liquid-reserves
and fiscal-space signals.
Implementation
- RESILIENCE_DIMENSION_WEIGHTS: new Record<ResilienceDimensionId, number>
alongside RESILIENCE_DOMAIN_WEIGHTS. Every dim has an explicit entry
(default 1.0) so rebalance decisions stay auditable; the two new
recovery dims carry 0.5 each.
Share math at full coverage (6 active recovery dims):
weight sum = 4 × 1.0 + 2 × 0.5 = 5.0
each new-dim share = 0.5 / 5.0 = 0.10 ✓
each core-dim share = 1.0 / 5.0 = 0.20
Retired dims (reserveAdequacy, fuelStockDays) keep weight 1.0 in
the map; their coverage=0 neutralizes them at the coverage channel
regardless. Explicit entries guard against a future scorer bug
accidentally returning coverage>0 for a retired dim and falling
through the `?? 1.0` default — every retirement decision is now
tied to a single explicit source of truth.
- coverageWeightedMean (_shared.ts): refactored to apply
`coverage × dimWeight` per dim instead of `coverage` alone. Backward-
compatible when all weights default to 1.0 (reduces to the original
mean). All three aggregation callers — buildDomainList, baseline-
Score, stressScore — pick up the weighting transparently.
Test coverage
1. New `tests/resilience-recovery-weight-rebalance.test.mts`:
pins the per-dim weight values, asserts the share math
(0.10 new / 0.20 core), verifies completeness of the weight map,
and documents why retired dims stay in the map at 1.0.
2. New `tests/resilience-recovery-ordering.test.mts`: fixture-based
Spearman-proxy sensitivity check. Asserts NO > US > YE ordering
preserved on both the overall score and the recovery-domain
subscore after the rebalance. (Live post-merge Spearman rerun
against the PR 0 snapshot is tracked as a follow-up commit.)
3. resilience-scorers.test.mts fixture anchors updated in lockstep:
baselineScore: 60.35 → 62.17 (low-scoring liquidReserveAdequacy
+ partial-coverage SWF now contribute ~half the weight)
overallScore: 63.60 → 64.39 (recovery subscore lifts by ~3 pts
from the rebalance, overall by ~0.79)
recovery flat mean: 48.75 (unchanged — flat mean doesn't apply
weights by design; documents the coverage-weighted diff)
Local coverageWeightedMean helper in the test mirrors the
production implementation (weights applied per dim).
Methodology doc
- New "Per-dimension weights in the recovery domain" subsection with
the weight table and a sentence explaining the cap. Cross-references
the source of truth (RESILIENCE_DIMENSION_WEIGHTS).
Deliberate non-goals
- Live post-merge Spearman ≥0.85 check against the PR 0 baseline
snapshot. Fixture ordering is preserved (new ordering test); the
live-data check runs after Railway cron refreshes the rankings on
the new weights and commits docs/snapshots/resilience-ranking-live-
post-pr2-<date>.json. Tracked as the final piece of PR 2 §3.4
alongside the health.js / bootstrap graduation (waiting on the
7-day Railway cron bake-in window).
Tests: 6588/6588 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on touched files. NO > US > YE
fixture ordering preserved.
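The `coverage × dimWeight` channel, sketched (the production signature may differ); with every weight at the 1.0 default this reduces to the original coverage-weighted mean:
```ts
interface DimScore {
  id: string;
  score: number;
  coverage: number; // 0..1
}

function coverageWeightedMean(
  dims: DimScore[],
  weights: Record<string, number> = {},
): number {
  let num = 0;
  let den = 0;
  for (const d of dims) {
    const w = d.coverage * (weights[d.id] ?? 1.0); // missing entry → 1.0
    num += d.score * w;
    den += w;
  }
  return den > 0 ? num / den : 0;
}
// Share math from the commit, at full coverage with 6 active recovery dims:
// weight sum = 4 × 1.0 + 2 × 0.5 = 5.0; each 0.5-weight dim = 0.5 / 5.0 = 10%.
```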
* fix(resilience): PR 2 review — thread RESILIENCE_DIMENSION_WEIGHTS through the comparison harness
Greptile P2: the operator comparison harness
(scripts/compare-resilience-current-vs-proposed.mjs) claims its domain
scores "mirror the production scorer's coverage-weighted mean" and is
the artifact generator for Spearman / rank-delta acceptance decisions.
After PR 2 §3.4's weight rebalance, the production mirror diverged —
production now applies RESILIENCE_DIMENSION_WEIGHTS (liquidReserveAdequacy
= 0.5, sovereignFiscalBuffer = 0.5) inside coverageWeightedMean, but
the harness still used equal-weight aggregation.
Left unfixed, post-merge Spearman / rank-delta diagnostics would
compare live API scores (with the 0.5 recovery weights) against
harness predictions that assume equal-share dims — silently biasing
every acceptance decision until someone noticed a country's rank-
delta didn't track.
Fix
- Mirrored coverageWeightedMean now accepts dimensionWeights and
applies `coverage × weight` per dim, matching _shared.ts exactly.
- Mirrored buildDomainList accepts + forwards dimensionWeights.
- main() imports RESILIENCE_DIMENSION_WEIGHTS from the scorer module
and passes it through to buildDomainList at the single call site.
- Missing-entry default = 1.0 (same contract as production) — makes
the harness forward-compatible with any future weight refactor: if a
new dim lands without an explicit entry, the production fallback path
still produces the correct number.
Verification
- Harness syntax-check clean (node -c).
- RESILIENCE_DIMENSION_WEIGHTS import resolves correctly from the
harness's import path.
- 509/509 resilience tests still pass (harness isn't in the test
suite; the invariant is that production ↔ harness use the same
math, and the production side is covered by tests/resilience-
recovery-weight-rebalance.test.mts).
* fix(resilience): PR 2 review — bump cache prefixes v10→v11 + document coverage-vs-weight asymmetry
Greptile P1 + P2 on PR #3328.
P1 — cache prefix not bumped after formula change
--------------------------------------------------
The per-dim weight rebalance changes the score formula, but the
`_formula` tag only distinguishes 'd6' vs 'pc' (pillar-combined vs
legacy 6-domain) — it does NOT detect intra-'d6' weight changes. Left
unfixed, scores cached before deploy would be served with the old
equal-weight math for up to the full 6h TTL, and the ranking key for
up to its 12h TTL. Matches the established v9→v10 pattern for every
prior formula-changing deploy.
Bumped in lockstep:
- RESILIENCE_SCORE_CACHE_PREFIX: v10 → v11
- RESILIENCE_RANKING_CACHE_KEY: v10 → v11
- RESILIENCE_HISTORY_KEY_PREFIX: v5 → v6
- scripts/seed-resilience-scores.mjs local mirrors
- api/health.js resilienceRanking literal
- 4 analysis/backtest scripts that read the cached keys directly
- Test fixtures in resilience-{ranking, handlers, scores-seed,
pillar-aggregation}.test.* that assert on literal key values
The v5→v6 history bump is the critical one: without it, pre-rebalance
history points would mix with post-rebalance points inside the 30-day
window, and change30d / trend math would diff values from different
formulas against each other, producing false-negative "falling" trends
for every country across the deploy window.
P2 — coverage-vs-weight asymmetry in computeLowConfidence / computeOverallCoverage
----------------------------------------------------------------------------------
Reviewer flagged that these two functions still average coverage
equally across all non-retired dims, even after the scoring aggregation
started applying RESILIENCE_DIMENSION_WEIGHTS. The asymmetry is
INTENTIONAL — these signals answer a different question from scoring:
scoring aggregation: "how much does each dim matter to the score?"
coverage signal: "how much real data do we have on this country?"
A dim at weight 0.5 still has the same data-availability footprint as
a weight=1.0 dim: its coverage value reflects whether we successfully
fetched the upstream source, not whether the scorer cares about it.
Applying scoring weights to the coverage signal would let a
half-weight dim hide half its sparsity from the overallCoverage pill,
misleading users reading coverage as a data-quality indicator.
Added explicit comments to both functions noting the asymmetry is
deliberate and pointing at the other site for matching rationale.
No code change — just documentation.
Tests: 6588/6588 data-tier tests pass (+511 resilience-specific
including the prefix-literal assertions). Typecheck clean on both
tsconfig configs. Biome clean on touched files.
* docs(resilience): bump methodology doc cache-prefix references to v11/v6
Greptile P2 on PR #3328: Redis keys table in the reproducibility
appendix still published `score:v10` / `ranking:v10` / `history:v5`,
and the rollback instructions told operators to flush those keys.
After the recovery-domain weight rebalance, live cache runs at
`score:v11` / `ranking:v11` / `history:v6`.
- Updated the Redis keys table (lines 490-492) to match `_shared.ts`.
- Updated the rollback block to name the current keys.
- Left the historical "Activation sequence" narrative intact (it
accurately describes the pillar-combine PR's v9→v10 / v4→v5 bump)
but added a parenthetical pointing at the current v11/v6 values.
No code change — doc-only correction for operator accuracy.
* fix(docs): escape MDX-unsafe `<137` pattern to unblock Mintlify deploy
Line 643 had `(<137 countries)` — MDX parses `<137` as a JSX tag
starting with digit `1`, which is illegal and breaks the deploy with
"Unexpected character \`1\` (U+0031) before name". Surfaced after the
prior cache-prefix commit forced Mintlify to re-parse this file.
Replaced with "fewer than 137 countries" for unambiguous rendering.
Other `<` occurrences in this doc (lines 34, 642) are followed by
whitespace and don't trip MDX's tag parser.
|
||
|
|
c48ceea463 |
feat(resilience): PR 2 dimension wiring — split reserveAdequacy + add sovereignFiscalBuffer (#3324)
* feat(resilience): PR 2 dimension wiring — split reserveAdequacy + add sovereignFiscalBuffer
Plan §3.4 follow-up to #3305 + #3319. Lands the scorer + dimension
registration so the SWF seed from the Railway cron feeds a real score once
the bake-in window closes. No weight rebalance yet (separate commit with
Spearman sensitivity check), no health.js graduation yet (7-day ON_DEMAND
window per feedback_health_required_key_needs_railway_cron_first.md), no
bootstrap wiring yet (follow-up PR).
Shape of the change
Retirement:
- reserveAdequacy joins fuelStockDays in RESILIENCE_RETIRED_DIMENSIONS. The
  legacy scorer now mirrors scoreFuelStockDays: returns coverage=0 /
  imputationClass=null so the dimension is filtered out of the confidence /
  coverage averages via the registry filter in computeLowConfidence,
  computeOverallCoverage, and the widget's formatResilienceConfidence. Kept
  in RESILIENCE_DIMENSION_ORDER for structural continuity (tests, cached
  payload shape, registry membership). Indicator registry tier demoted to
  'experimental'.
Two new active dimensions:
- liquidReserveAdequacy (replaces the liquid-reserves half of the retired
  reserveAdequacy). Same source (WB FI.RES.TOTL.MO, total reserves in
  months of imports) but re-anchored 1..12 months instead of 1..18. Twelve
  months ≈ IMF "full reserve adequacy" benchmark for a diversified
  emerging-market importer — the tighter ceiling prevents wealthy
  commodity-exporters from claiming outsized credit for on-paper reserve
  stocks that are not the relevant shock-absorption buffer.
- sovereignFiscalBuffer. Reads resilience:recovery:sovereign-wealth:v1
  (populated by scripts/seed-sovereign-wealth.mjs, landed in #3305 + wired
  into Railway cron in #3319). Computes the saturating transform:
    effectiveMonths = Σ [ aum/annualImports × 12 × access × liquidity × transparency ]
    score = 100 × (1 − exp(−effectiveMonths / 12))
  Exponential saturation prevents Norway-type outliers (effective months in
  the 100s) from dominating the recovery pillar.
Three code paths in scoreSovereignFiscalBuffer:
1. Seed key absent entirely → IMPUTE.recoverySovereignFiscalBuffer (score
   50 / coverage 0.3 / unmonitored). Covers the Railway-cron bake-in window
   before the first successful tick.
2. Seed present, country NOT in manifest → score=0 with FULL coverage.
   Substantive absence, NOT imputation — per plan §3.4 "What happens to
   no-SWF countries." 0 × weight = 0 in the numerator, so the country
   correctly scores lower than SWF-holding peers on this dim.
3. Seed present, country in payload → saturating score, coverage derated by
   the partial-seed completeness signal (so a Mubadala or Temasek scrape
   drift on a multi-fund country shows up as lower confidence rather than a
   silently-understated total).
Indicator registry:
- Demoted recoveryReserveMonths (tied to retired reserveAdequacy) to
  tier='experimental'.
- Added recoveryLiquidReserveMonths: WB FI.RES.TOTL.MO, anchors 1..12,
  tier='core', coverage=188.
- Added recoverySovereignWealthEffectiveMonths: the new SWF signal,
  tier='experimental' for now because the manifest only has 8 funds (below
  the 180-core / 137-§3.6-gate threshold). Graduating to 'core' requires
  expanding the manifest past ~137 entries — a later PR.
Tests updated
- resilience-release-gate: 19→21 dim count; RETIRED_DIMENSIONS allowlist
  now includes reserveAdequacy alongside fuelStockDays.
- resilience-dimension-scorers: scoreReserveAdequacy monotonicity + "high
  reserves score well" tests migrated to scoreLiquidReserveAdequacy (same
  source, new 1..12 anchor). New retirement-shape test for
  scoreReserveAdequacy mirroring the PR 3 fuelStockDays retirement test.
  Four new scorer tests pin the three code paths of
  scoreSovereignFiscalBuffer (absent seed / no-SWF country / SWF country /
  partial-completeness derate).
- resilience-scorers fixture: baseline 60.12→60.35, recovery-domain flat
  mean 47.33→48.75, overall 63.27→63.6. Each number commented with the
  driver (split adds liquidReserveAdequacy 18@1.0 + sovereignFiscalBuffer
  50@0.3 at IMPUTE; retired reserveAdequacy drops out).
- resilience-dimension-monotonicity: target scoreLiquidReserveAdequacy
  instead of scoreReserveAdequacy.
- resilience-handlers: response-shape dim count 19→21.
- resilience-indicator-registry: coverage 19→21 dimensions.
- resilience-dimension-freshness: allowlisted the new sovereign-wealth
  seed-meta key in KNOWN_SEEDS_NOT_IN_HEALTH for the ON_DEMAND window.
- resilience-methodology-lint HEADING_TO_DIMENSION: added the two new
  heading mappings. Methodology doc gets H4 sections for Liquid Reserve
  Adequacy and Sovereign Fiscal Buffer; Reserve Adequacy section is
  annotated as retired.
- resilience-retired-dimensions-parity: client-side
  RESILIENCE_RETIRED_DIMENSION_IDS gets reserveAdequacy. Parser upgraded to
  strip inline `// …` comments from the array body so a future reviewer can
  drop a rationale next to an entry without breaking parity.
- resilience-confidence-averaging: fixture updated to include both retired
  dims (reserveAdequacy + fuelStockDays) — confirms the registry filter
  correctly excludes BOTH from the visible coverage reading.
Extraction harness (scripts/compare-resilience-current-vs-proposed.mjs):
- recoveryLiquidReserveMonths: reads the same reserve-adequacy seed field
  as recoveryReserveMonths.
- recoverySovereignWealthEffectiveMonths: reads the new SWF seed key on
  field totalEffectiveMonths. Absent-payload → 0 for correlation math
  (matches the substantive-no-SWF scorer branch).
Out of scope for this commit (follow-ups)
- Recovery-domain weight rebalance + Spearman sensitivity rerun against the
  PR 0 baseline.
- health.js graduation (SEED_META entry + ON_DEMAND_KEYS removal) once
  Railway cron has ~7 days of clean runs.
- api/bootstrap.js wiring once an RPC consumer needs the SWF data.
- Manifest expansion past 137 countries so sovereignFiscalBuffer can
  graduate from tier='experimental' to tier='core'.
Tests: 6573/6573 data-tier tests pass. Typecheck clean on both tsconfig
configs. Biome clean on all touched files.
* fix(resilience): PR 2 review — add widget labels for new dimensions
P2 review finding on PR #3324. DIMENSION_LABELS in
src/components/resilience-widget-utils.ts covered only the old 19 dimension
IDs, so the two new active dims (liquidReserveAdequacy,
sovereignFiscalBuffer) would render with their raw internal IDs in the
confidence grid for every country once the scorer started emitting them.
The widget test at getResilienceDimensionLabel also asserted only the
19-label set, so the gap would have shipped silently.
Fix: add user-facing short labels for both new dims. "Reserves" is already
claimed by the retired reserveAdequacy, so the replacement disambiguates
with "Liquid Reserves"; sovereignFiscalBuffer → "Sovereign Wealth" per the
methodology doc H4 heading. Also added a regression guard — new test
asserts EVERY id in RESILIENCE_DIMENSION_ORDER resolves to a non-id label.
Any future dimension that ships without a matching DIMENSION_LABELS entry
now fails CI loudly instead of leaking the ID into the UI.
Tests: 502/502 resilience tests pass (+1 new coverage check). Typecheck
clean on both configs.
* fix(resilience): PR 2 review — remove dead IMPUTE.recoveryReserveAdequacy entry
Greptile P2: the retired scoreReserveAdequacy stub no longer reads from
IMPUTE (it hardcodes coverage=0 / imputationClass=null per the retirement
pattern), making IMPUTE.recoveryReserveAdequacy dead code. Removed the
entry + added a breadcrumb comment pointing at the replacement
IMPUTE.recoveryLiquidReserveAdequacy.
The second P2 (bootstrap.js not wired) is a deliberate non-goal — the
reviewer explicitly flags "for visibility" since it's tracked in the PR
body. No action this commit; bootstrap wiring lands alongside the SEED_META
graduation after the ~7-day Railway-cron bake-in.
Tests: 502/502 resilience tests still pass. Typecheck clean.
|
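The saturating transform quoted above, as runnable arithmetic (the fund-row shape is an assumption; only the formula comes from the commit):
```ts
interface FundRow {
  aumUsd: number;       // assets under management
  access: number;       // 0..1
  liquidity: number;    // 0..1
  transparency: number; // 0..1
}

function sovereignFiscalBufferScore(funds: FundRow[], annualImportsUsd: number): number {
  const effectiveMonths = funds.reduce(
    (sum, f) =>
      sum + (f.aumUsd / annualImportsUsd) * 12 * f.access * f.liquidity * f.transparency,
    0,
  );
  // Exponential saturation: 12 effective months scores ~63; Norway-type
  // outliers with hundreds of months only asymptote toward 100.
  return 100 * (1 - Math.exp(-effectiveMonths / 12));
}
```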
||
|
|
84ee2beb3e |
feat(energy): Energy Atlas end-to-end — pipelines + storage + shortages + disruptions + country drill-down (#3294)
* feat(energy): pipeline registries (gas + oil) — evidence-based schema
Day 6 of the Energy Atlas Release 1 plan (Week 2). First curated asset
registry for the atlas — the real gap vs GEF.
## Curated data (critical assets only, not global completeness)
scripts/data/pipelines-gas.json — 12 critical gas lines:
Nord Stream 1/2 (offline; Swedish EEZ sabotage 2022; EU sanctions refs),
TurkStream, Yamal–Europe (offline; Polish counter-sanctions),
Brotherhood/Soyuz (offline; Ukraine transit expired 2024-12-31),
Power of Siberia, Dolphin, Medgaz, TAP, TANAP,
Central Asia–China, Langeled.
scripts/data/pipelines-oil.json — 12 critical oil lines:
Druzhba North/South (N offline per EU 2022/879; S under landlocked
derogation), CPC, ESPO (+ price-cap sanction ref), BTC, TAPS,
Habshan–Fujairah (Hormuz bypass), Keystone, Kirkuk–Ceyhan (offline
since 2023 ICC ruling), Baku–Supsa, Trans-Mountain (TMX expansion
May 2024), ESPO spur to Daqing.
Scope note: 75+ each is Week 2b work via GEM bulk import. Today's cut
is curated from first-hand operator disclosures + regulator filings so
I can stand behind every evidence field.
## Evidence-based schema (not conclusion labels)
Per docs/methodology/pipelines.mdx: no bare `sanctions_blocked` field.
Every pipeline carries an evidence bundle with `physicalState`,
`physicalStateSource`, `operatorStatement`, `commercialState`,
`sanctionRefs[]`, `lastEvidenceUpdate`, `classifierVersion`,
`classifierConfidence`. The public badge (`flowing|reduced|offline|
disputed`) is derived server-side from this bundle at read time.
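The evidence bundle rendered as a TypeScript shape (field names from the commit; the exact types are guesses):
```ts
interface PipelineEvidence {
  physicalState: 'flowing' | 'reduced' | 'offline' | 'unknown';
  physicalStateSource: string; // e.g. operator disclosure, regulator filing
  operatorStatement?: { quote: string; url: string; date: string };
  commercialState: 'active' | 'suspended' | 'expired' | 'unknown';
  sanctionRefs: string[];       // legal-instrument references, never conclusions
  lastEvidenceUpdate: string;   // ISO date
  classifierVersion: string;
  classifierConfidence: number; // 0..1
}
// The public badge is derived from this bundle server-side at read time;
// the registry itself never stores a conclusion label.
```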
## Seeder
scripts/seed-pipelines.mjs — single process publishes BOTH keys
(energy:pipelines:{gas,oil}:v1) via two runSeed() calls. Tiny datasets
(<20KB each) so co-location is cheap and guarantees classifierVersion
consistency.
Conventions followed (worldmonitor-bootstrap-registration skill):
- TTL 21d = 3× weekly cadence (gold-standard per
feedback_seeder_gold_standard.md)
- maxStaleMin 20_160 = 2× cadence (health-maxstalemin-write-cadence skill)
- sourceVersion + schemaVersion + recordCount + declareRecords wired
(seed-contract-foundation)
- Zero-case explicitly NOT allowed — MIN_PIPELINES_PER_REGISTRY=8 floor
## Health registration (dual, per feedback_two_health_endpoints_must_match)
- api/health.js: BOOTSTRAP_KEYS adds pipelinesGas + pipelinesOil;
SEED_META adds both with maxStaleMin=20_160.
- api/seed-health.js: mirror entries with intervalMin=10_080 (maxStaleMin/2).
## Bundle registration
scripts/seed-bundle-energy-sources.mjs adds a single Pipelines entry
(not two) because seed-pipelines.mjs publishes both keys in one run —
listing oil separately would double-execute. Monitoring of the oil key
staleness happens in api/health.js instead.
## Tests (tests/pipelines-registry.test.mts)
17 passing node:test assertions covering:
- Schema validation (both registries pass validateRegistry)
- Identity resolution (no id collisions, id matches object key)
- Country ISO2 normalization (from/to/transit all match /^[A-Z]{2}$/)
- Endpoint geometry within Earth bounds
- Evidence rigor: non-flowing badges require at least one supporting
evidence source (operator statement / sanctionRefs / ais-relay /
satellite / press)
- ClassifierConfidence in 0..1
- Commodity/capacity pairing (gas uses capacityBcmYr, oil uses
capacityMbd — mixing = test fail)
- validateRegistry rejects: empty object, null, no-evidence fixtures,
below-floor counts
Typecheck clean (both tsconfig.json and tsconfig.api.json).
Next: Day 7 will add list-pipelines / get-pipeline-detail RPCs in
supply-chain/v1. Day 8 ships PipelineStatusPanel with DeckGL PathLayer
consuming the registry.
* fix(energy): split seed-pipelines.mjs into two entry points — runSeed hard-exits
High finding from PR review. scripts/seed-pipelines.mjs called runSeed()
twice in one process and awaited Promise.all. But runSeed() in
scripts/_seed-utils.mjs hard-exits via process.exit on ~9 terminal paths
(lines 816, 820, 839, 888, 917, 989, plus fetch-retry 946, fatal 859,
skipped-lock 81). The first runSeed to reach any terminal path exits the
entire node process, so the second runSeed's resolve never fires — only
one of energy:pipelines:{gas,oil}:v1 would ever be written.
Since the bundle scheduled seed-pipelines.mjs exactly once, and both
api/health.js and api/seed-health.js expect both keys populated, the
other registry would stay permanently EMPTY/STALE after deploy.
Fix: split into two entry-point scripts around a shared utility.
- scripts/_pipeline-registry.mjs (NEW, was seed-pipelines.mjs) — shared
helpers ONLY. Exports GAS_CANONICAL_KEY, OIL_CANONICAL_KEY,
PIPELINES_TTL_SECONDS, MAX_STALE_MIN, buildGasPayload, buildOilPayload,
validateRegistry, recordCount, declareRecords. Underscore prefix marks
it as non-entry-point (matches _seed-utils.mjs / _seed-envelope-source.mjs
convention).
- scripts/seed-pipelines-gas.mjs (NEW) — imports from the shared module,
single runSeed('energy','pipelines-gas',…) call.
- scripts/seed-pipelines-oil.mjs (NEW) — same shape, oil.
- scripts/seed-bundle-energy-sources.mjs — register BOTH seeders (not one).
- scripts/seed-pipelines.mjs — deleted.
- tests/pipelines-registry.test.mts — update import path to the shared
module. All 17 tests still pass.
Typecheck clean (both configs). Tests pass. No other consumers import
from the deleted script.
* fix(energy): complete pipeline bootstrap registration per 4-file checklist
High finding from PR review. My earlier PR description claimed
worldmonitor-bootstrap-registration was complete, but I only touched two
of the four registries (api/health.js + api/seed-health.js). The bootstrap
hydration payload itself (api/bootstrap.js) and the shared cache-keys
registry (server/_shared/cache-keys.ts) still had no entry for either
pipeline key, so any consumer that reads bootstrap data would see
pipelinesGas/pipelinesOil as missing on first load.
Files updated this commit:
- api/bootstrap.js — KEYS map + SLOW_KEYS set both gain pipelinesGas +
pipelinesOil. Placed next to sprPolicies (same curated-registry cadence
and tier). Slow tier is correct: weekly cron, not needed on first paint.
- server/_shared/cache-keys.ts — PIPELINES_GAS_KEY + PIPELINES_OIL_KEY
exported constants (matches SPR_POLICIES_KEY pattern), BOOTSTRAP_KEYS map
entries, and BOOTSTRAP_TIERS entries (both 'slow').
Not touched (intentional):
- server/gateway.ts — pipeline data is free-tier per the Energy Atlas
plan; no PREMIUM_RPC_PATHS entry required. Energy Atlas monetization
hooks (scenario runner, MCP tools, subscriptions) are Release 2.
Full 4-file checklist now complete:
✅ server/_shared/cache-keys.ts (this commit)
✅ api/bootstrap.js (this commit)
✅ api/health.js (earlier in PR)
✅ api/seed-health.js (earlier in PR — dual-registry rule)
Typecheck clean (both configs).
* feat(energy): ListPipelines + GetPipelineDetail RPCs with evidence-derived badges
Day 7 of the Energy Atlas Release 1 plan (Week 2). Exposes the pipeline
registries (shipped in Day 6) via two supply-chain RPCs and ships the
evidence-to-badge derivation server-side.
## Proto
proto/worldmonitor/supply_chain/v1/list_pipelines.proto — new:
- ListPipelinesRequest { commodity_type?: 'gas' | 'oil' }
- ListPipelinesResponse { pipelines[], fetched_at, classifier_version, upstream_unavailable }
- GetPipelineDetailRequest { pipeline_id (required, query-param) }
- GetPipelineDetailResponse { pipeline?, revisions[], fetched_at, unavailable }
- PipelineEntry — wire shape mirroring scripts/data/pipelines-{gas,oil}.json
+ a server-derived public_badge field
- PipelineEvidence, OperatorStatement, SanctionRef, LatLon, PipelineRevisionEntry
service.proto adds both rpc methods with HTTP_METHOD_GET + path bindings:
/api/supply-chain/v1/list-pipelines
/api/supply-chain/v1/get-pipeline-detail
`make generate` regenerated src/generated/{client,server}/… + docs/api/
OpenAPI json/yaml.
## Evidence-derivation
server/worldmonitor/supply-chain/v1/_pipeline-evidence.ts — new.
derivePublicBadge(evidence) → 'flowing' | 'reduced' | 'offline' | 'disputed'
is deterministic + versioned (DERIVER_VERSION='badge-deriver-v1').
Rules (first match wins):
1. offline + sanctionRef OR expired/suspended commercial → offline
2. offline + operator statement → offline
3. offline + only press/ais/satellite → disputed (single-source negative claim)
4. reduced → reduced
5. flowing → flowing
6. unknown / malformed → disputed
Staleness guard: non-flowing badges on >14d-old evidence demote to
disputed. Flowing is the optimistic default — stale "still flowing" is
safer than stale "offline". Matches seed-pipelines-{gas,oil}.mjs maxStaleMin.
Tests (tests/pipeline-evidence-derivation.test.mts) — 15 passing cases
covering happy paths, disputed fallbacks, staleness guard, versioning.
## Handlers
server/worldmonitor/supply-chain/v1/list-pipelines.ts
- Reads energy:pipelines:{gas,oil}:v1 via getCachedJson.
- projectPipeline() narrows the Upstash `unknown` into PipelineEntry
shape + calls derivePublicBadge.
- Honors commodity_type filter (skip the opposite registry's Redis read
when the client pre-filters).
- Returns upstream_unavailable=true when BOTH registries miss.
server/worldmonitor/supply-chain/v1/get-pipeline-detail.ts
- Scans both registries by id (ids are globally unique per
tests/pipelines-registry.test.mts).
- Empty revisions[] for now; auto-revision log wires up in Week 3.
handler.ts registers both into supplyChainHandler.
## Gateway
server/gateway.ts adds 'static' cache-tier for both new RPC paths
(registry is slow-moving; 'static' matches the other read-mostly
supply-chain endpoints).
## Consumer wiring
Not in this commit — PipelineStatusPanel (Day 8) is what will call
listPipelines/getPipelineDetail via the generated client. pipelinesGas
+ pipelinesOil stay in PENDING_CONSUMERS until Day 8.
Typecheck clean (both configs). 15 new tests + 17 registry tests all pass.
* feat(energy): PipelineStatusPanel — evidence-backed status table + drawer
Day 8 of the Energy Atlas Release 1 plan. First consumer of the Day 6–7
registries + RPCs.
## What this PR adds
- src/components/PipelineStatusPanel.ts — new panel (id=pipeline-status).
* Bootstrap-hydrates from pipelinesGas + pipelinesOil for instant first
paint; falls through to listPipelines() RPC if bootstrap misses.
Background re-fetch runs on every render so a classifier-version bump
between bootstrap stamp and first view produces a visible update.
* Table rows sorted non-flowing-first (offline / reduced / disputed
before flowing) — what an atlas reader cares about.
* Click-to-expand drawer calls getPipelineDetail() lazily — operator
statements, sanction refs (with clickable source URLs), commercial
state, classifier version + confidence %, capacity + route metadata.
* publicBadge color-chip palette matches the methodology doc.
* Attribution footer with GEM (CC-BY 4.0) credit + classifier version.
- src/components/index.ts — barrel export.
- src/app/panel-layout.ts — import + createPanel('pipeline-status', …).
- src/config/panels.ts — ENERGY_PANELS adds 'pipeline-status' at priority 1.
## PENDING_CONSUMERS cleanup
tests/bootstrap.test.mjs — removes 'pipelinesGas' + 'pipelinesOil' from
the allowlist. The invariant "every bootstrap key has a getHydratedData
consumer" now enforces real wiring for these keys: the panel literally
calls getHydratedData('pipelinesGas') and getHydratedData('pipelinesOil').
Future regressions that remove the consumer will fail pre-push.
## Consumer contract verified
- 67 tests pass including bootstrap.test.mjs consumer coverage check.
- Typecheck clean.
- No DeckGL PathLayer in this commit — existing 'pipelines-layer' has a
separate data source, so modifying DeckGLMap.ts to overlay evidence-
derived badges on the map is a follow-up commit to avoid clobbering.
## Out of scope for Day 8 (next steps on same PR)
- DeckGL PathLayer integration (color pipelines on the main map by
publicBadge, click-to-open this drawer) — Day 8b commit.
- Storage facility registry + StorageFacilityMapPanel — Days 9-10.
* fix(energy): PipelineStatusPanel bootstrap path — client-side badge derivation
High finding from PR review. The Day-8 panel crashed on first paint
whenever bootstrap hydration succeeded, because:
- Bootstrap hydrates raw scripts/data/pipelines-{gas,oil}.json verbatim.
- That JSON does NOT include publicBadge — that field is only added by
the server handler's projectPipeline() in list-pipelines.ts.
- PipelineStatusPanel passed raw entries into badgeChip(), which called
badgeLabel(undefined).charAt(0) → TypeError.
The background RPC refresh that would have repaired the data never ran
because the panel threw before reaching it. So the exact bootstrap path
newly wired in commit
|
||
|
|
7cf37c604c |
feat(resilience): PR 3 — dead-signal cleanup (plan §3.5, §3.6) (#3297)
* feat(resilience): PR 3 §3.5 — retire fuelStockDays from core score permanently
First commit in PR 3 of the resilience repair plan. Retires
`fuelStockDays` from the core score with no replacement.
Why permanent, not replaced:
IEA emergency-stockholding rules are defined in days of NET IMPORTS
and do not bind net exporters by design. Norway/Canada/US measured
in days-of-imports are incomparable to Germany/Japan measured the
same way — the construct is fundamentally different across the two
country classes. No globally-comparable recovery-fuel signal can
be built from this source; the pre-repair probe showed 100% imputed
at 50 for every country in the April 2026 freeze.
scoreFuelStockDays:
- Rewritten to return coverage=0 + observedWeight=0 +
imputationClass='source-failure' for every country regardless
of seed content.
- Drops the dimension from the `recovery` domain's coverage-
weighted mean automatically; remaining recovery dimensions
pick up the share via re-normalisation in
`_shared.ts#coverageWeightedMean`.
- No explicit weight transfer needed — the coverage-weighted
blend handles redistribution.
Registry:
- recoveryFuelStockDays re-tagged from tier='enrichment' to
tier='experimental' so the Core coverage gate treats it as
out-of-score.
- Description updated to make the retirement explicit; entry
stays in the registry for structural continuity (the
dimension `fuelStockDays` remains in RESILIENCE_DIMENSION_ORDER
for the 19-dimension tests; removing the dimension entirely is
a PR 4 structural-audit concern).
Housekeeping:
- Removed `RESILIENCE_RECOVERY_FUEL_STOCKS_KEY` constant (no
longer read; noUnusedLocals would reject it).
- Removed `RecoveryFuelStocksCountry` interface for the same
reason. Comment at the removed declaration instructs future
maintainers not to re-add the type as a reservation; when a
new recovery-fuel concept lands, introduce a fresh interface.
Plan reference: §3.5 point 1 of
`docs/plans/2026-04-22-001-fix-resilience-scorer-structural-bias-plan.md`.
51 resilience tests pass, typecheck + biome clean. The
`recovery` domain's published score will shift slightly for every
country because the 0.10 slot that fuelStockDays was imputing to
now redistributes; the compare-harness acceptance-gate rerun at
merge time will quantify the shift per plan §6 gates.
* feat(resilience): PR 3 §3.5 — retire BIS-backed currencyExternal; rebuild on IMF inflation + WB reserves
BIS REER/DSR feeds were load-bearing in currencyExternal (weights 0.35
fxVolatility + 0.35 fxDeviation, ~70% of dimension). They cover ~60
countries max — so every non-BIS country fell through to
curated_list_absent (coverage 0.3) or a thin IMF proxy (coverage 0.45).
Combined with reserveMarginPct already removed in PR 1, currencyExternal
was the clearest "construct absent for most of the world" carrier left
in the scorer.
Changes:
_dimension-scorers.ts
- scoreCurrencyExternal now reads IMF macro (inflationPct) + WB FX
reserves only. Coverage ladder:
inflation + reserves → 0.85 (observed primary + secondary)
inflation only → 0.55
reserves only → 0.40
neither → 0.30 (IMPUTE.bisEer retained for snapshot
continuity; semantics read as
"no IMF + no WB reserves" now)
- Removed dead symbols: RESILIENCE_BIS_EXCHANGE_KEY constant (reserved
via comment only, flagged by noUnusedLocals), stddev() helper,
getCountryBisExchangeRates() loader, BisExchangeRate interface,
dateToSortableNumber() — all were exclusive callers of the retired
BIS path.
_indicator-registry.ts
- New core entry inflationStability (weight 0.60, tier=core,
sourceKey=economic:imf:macro:v2).
- fxReservesAdequacy weight 0.15 → 0.40 (secondary reliability
anchor).
- fxVolatility + fxDeviation demoted tier=enrichment → tier=experimental
(BIS ~60-country coverage; off the core weight sum).
- Non-experimental weights now sum to 1.0 (0.60 + 0.40).
scripts/compare-resilience-current-vs-proposed.mjs
- EXTRACTION_RULES: added inflationStability →
imf-macro-country-field field=inflationPct so the registry-parity
test passes and the correlation harness sees the new construct.
tests/resilience-dimension-scorers.test.mts
- Dropped BIS-era wording ("non-BIS country") and test 266
(BIS-outage coverage 0.35 branch) which collapsed to the inflation-
only path post-retirement.
- Updated coverage assertions: inflation-only 0.45 → 0.55; inflation+
reserves 0.55 → 0.85.
tests/resilience-scorers.test.mts
- domainAverages.economic 68.33 → 66.33 (US currencyExternal score
shifts slightly under IMF+reserves vs old BIS composite).
- stressScore 67.85 → 67.21; stressFactor 0.3215 → 0.3279.
- overallScore 65.82 → 65.52.
- baselineScore unchanged (currencyExternal is stress-only).
All 6324 data-tier tests pass. typecheck:api clean. No change to
seeders or Redis keys; this is a pure scorer + registry rebuild.
* feat(resilience): PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)
Plan §2.1 diagnosis table showed externalDebtCoverage saturating at
score=100 across all 9 probe countries — including stressed states.
Signal was collapsed. Root cause: (worst=5, best=0) gave every country
with ratio < 0.5 a score above 90, and mapped Greenspan-Guidotti's
reserve-adequacy threshold (ratio=1.0) to score 80 — well into "no
worry" territory instead of the "mild warning" it should be.
Re-anchored on Greenspan-Guidotti directly: ratio=1.0 now maps to score
50 (mild warning), ratio=2.0 to score 0 (acute rollover-shock exposure).
Ratios above 2.0 clamp to 0, consistent with "beyond this point the
country is already in crisis; exact value stops mattering."
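A minimal sketch of the re-anchored normalizer (the function name is simplified; the real scorer shares its goalpost plumbing across indicators). The expected values below match the fixtures cited in this commit:

```ts
// Re-anchored goalposts: was { worst: 5, best: 0 }.
const GOALPOSTS = { worst: 2, best: 0 };

function scoreExternalDebtCoverage(ratio: number): number {
  const { worst, best } = GOALPOSTS;
  const raw = ((worst - ratio) / (worst - best)) * 100;
  return Math.max(0, Math.min(100, raw)); // ratios above 2.0 clamp to 0
}

scoreExternalDebtCoverage(1.0); // 50: Greenspan-Guidotti threshold, mild warning
scoreExternalDebtCoverage(2.0); // 0: acute rollover-shock exposure
scoreExternalDebtCoverage(0.2); // 90: matches the updated NO fixture (was 96)
```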
Files changed:
- _indicator-registry.ts: recoveryDebtToReserves goalposts
{worst: 5, best: 0} → {worst: 2, best: 0}. Description updated to
cite Greenspan-Guidotti; inline comment documents anchor + rationale.
- _dimension-scorers.ts: scoreExternalDebtCoverage normalizer bound
changed from (0..5) to (0..2), with inline comment.
- docs/methodology/country-resilience-index.mdx: goalpost table row
5-0 → 2-0, description cites Greenspan-Guidotti.
- docs/methodology/indicator-sources.yaml:
* constructStatus: dead-signal → observed-mechanism (signal is now
discriminating).
* reviewNotes updated to describe the new anchor.
* mechanismTestRationale names the Greenspan-Guidotti rule.
- tests/resilience-dimension-monotonicity.test.mts: updated the
comment + picked values inside the (0..2) discriminating band (0.3
and 1.5). Old values (1 vs 4) had 4 clamping to 0.
- tests/resilience-dimension-scorers.test.mts: NO score threshold
relaxed >90 → >=85 (NO ratio=0.2 now scores 90, was 96).
- tests/resilience-scorers.test.mts: fixture drift:
* domainAverages.recovery 54.83 → 47.33 (US extDebt 70 → 25).
* baselineScore 63.63 → 60.12 (extDebt is baseline type).
* overallScore 65.52 → 63.27.
* stressScore / stressFactor unchanged (extDebt is baseline-only).
All 6324 data-tier tests pass. typecheck:api clean.
* feat(resilience): PR 3 §3.6 — CI gate on indicator coverage and nominal weight
Plan §3.6 adds a new acceptance criterion (also §5 item 5):
> No indicator with observed coverage below 70% may exceed 5% nominal
> weight OR 5% effective influence in the post-change sensitivity run.
This commit enforces the NOMINAL-WEIGHT half as a unit test that runs
on every CI build. The EFFECTIVE-INFLUENCE half is produced by
scripts/validate-resilience-sensitivity.mjs as a committed artifact;
the gate file only asserts that the script still exists, so a refactor
that removes it breaks the build loudly.
Why the gate exists (plan §3.6):
"A dimension at 30% observed coverage carries the same effective
weight as one at 95%. This contradicts the OECD/JRC handbook on
uncertainty analysis."
Implementation:
tests/resilience-coverage-influence-gate.test.mts — three tests:
1. Nominal-weight gate: for every core indicator with coverage < 137
countries (70% of the ~195-country universe), computes its nominal
overall weight as
indicator.weight × (1/dimensions-in-domain) × domain-weight
and asserts it does not exceed 5%. Equal-share-per-dimension is
the *upper bound* on runtime weight (coverage-weighted mean gives
a lower share when a dimension drops out), so this is a strict
bound: if the nominal number passes, the runtime number also
passes for every country.
2. Effective-influence contract: asserts the sensitivity script
exists at its expected path. Removing it (intentionally or by
refactor) breaks the build.
3. Audit visibility: prints the top 10 core indicators by nominal
overall weight. No assertion beyond "ran" — the list lets
reviewers spot outliers that pass the gate but are near the cap.
Current state (observed from audit output):
recoveryReserveMonths: nominal=4.17% coverage=188
recoveryDebtToReserves: nominal=4.17% coverage=185
recoveryImportHhi: nominal=4.17% coverage=190
inflationStability: nominal=3.40% coverage=185
electricityConsumption: nominal=3.30% coverage=217
ucdpConflict: nominal=3.09% coverage=193
Every core indicator has coverage ≥ 180 (already enforced by the
pre-existing indicator-tiering test), so the nominal-weight gate has
no current violators — its purpose is catching future drift, not
flagging today's state.
All 6327 data-tier tests pass. typecheck:api clean.
* docs(resilience): PR 3 methodology doc — document §3.5 dead-signal retirements + §3.6 coverage gate
Methodology-doc update capturing the three §3.5 landings and the §3.6 CI
gate. Five edits:
1. **Known construct limitations section (#5 and #6):** strikethrough the
original "dead signals" and "no coverage-based weight cap" items,
annotate them with "Landed in PR 3 §3.5"/"Landed in PR 3 §3.6" +
specifics of what shipped.
2. **Currency & External H4 section:** completely rewritten. Old table
(fxVolatility / fxDeviation / fxReservesAdequacy on BIS primary) is
replaced by the two-indicator post-PR-3 table (inflationStability at
0.60 + fxReservesAdequacy at 0.40). Coverage ladder spelled out
(0.85 / 0.55 / 0.40 / 0.30). Legacy BIS indicators named as
experimental-tier drill-downs only.
3. **Fuel Stock Days H4 section:** H4 heading text kept verbatim so the
methodology-lint H4-to-dimension mapping does not break; body
rewritten to explain that the dimension is retired from core but the
seeder still runs for IEA-member drill-downs.
4. **External Debt Coverage table row:** goalpost 5-0 → 2-0, description
cites Greenspan-Guidotti reserve-adequacy rule.
5. **New v2.2 changelog entry** — PR 3 dead-signal cleanup, covering
§3.5 points 1/2/3 + §3.6 + acceptance gates + construct-audit
updates.
No scoring or code changes in this commit. Methodology-lint test passes
(H4 mapping intact). All 6327 data-tier tests pass.
* fix(resilience): PR 3 §3.6 gate — correct share-denominator for coverage-weighted aggregation
Reviewer catch (thanks). The previous gate computed each indicator's
nominal overall weight as
indicator.weight × (1 / N_total_dimensions_in_domain) × domain_weight
and claimed this was an upper bound ("actual runtime weight is ≤ this
when some dimensions drop out on coverage"). That is BACKWARDS for
this scorer.
The domain aggregation is coverage-weighted
(server/worldmonitor/resilience/v1/_shared.ts coverageWeightedMean),
so when a dimension pins at coverage=0 it is EXCLUDED from the
denominator and the surviving dimensions' shares go UP, not down.
PR 3 commit 1 retires fuelStockDays by hard-coding its scorer to
coverage=0 for every country — so in the current live state the
recovery domain has 5 contributing dimensions (not 6), and each core
recovery indicator's nominal share is
1.0 × 1/5 × 0.25 = 5.00% (was mis-reported as 4.17%)
The old gate therefore under-estimated nominal influence and could
silently pass exactly the kind of low-coverage overweight regression
it is meant to block.
Fix:
- Added `coreBearingDimensions(domainId)` helper that counts only
dimensions that have ≥1 core indicator in the registry. A dimension
with only experimental/enrichment entries (post-retirement
fuelStockDays) has no core contribution → does not dilute shares.
- Updated `nominalOverallWeight` to divide by the core-bearing count,
not the raw dimension count.
- Rewrote the helper's doc comment to stop claiming this is a strict
upper bound — explicitly calls out the dynamic case (source failure
raising surviving dim shares further) as the sensitivity script's
responsibility.
- Added a new regression test: asserts (a) at least one recovery
  dimension is all-non-core (fuelStockDays post-retirement),
  (b) fuelStockDays has zero core indicators, and (c)
  recoveryDebtToReserves nominal = 0.05 exactly (not 0.0417) — any
  reversion of the retirement, or regression to the N_total
  denominator, will fail loudly. (The corrected math is sketched
  below.)
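A self-contained sketch of the corrected denominator, using a hypothetical registry shape (the real types live in _indicator-registry.ts):

```ts
type Tier = 'core' | 'enrichment' | 'experimental';
interface Indicator { dimension: string; weight: number; tier: Tier }

function coreBearingDimensions(registry: Indicator[], domainDims: string[]): number {
  // Count only dimensions with at least one core indicator: a retired,
  // all-experimental dimension (fuelStockDays) no longer dilutes shares.
  return domainDims.filter((dim) =>
    registry.some((ind) => ind.dimension === dim && ind.tier === 'core'),
  ).length;
}

function nominalOverallWeight(
  ind: Indicator, registry: Indicator[], domainDims: string[], domainWeight: number,
): number {
  return ind.weight * (1 / coreBearingDimensions(registry, domainDims)) * domainWeight;
}
// recovery example: 1.0 × (1/5) × 0.25 = 0.05, exactly at the 5% cap.
```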
Top-10 audit output now correctly shows:
recoveryReserveMonths: nominal=5% coverage=188
recoveryDebtToReserves: nominal=5% coverage=185
recoveryImportHhi: nominal=5% coverage=190
(was 4.17% each under the old math)
All 486 resilience tests pass. typecheck:api clean.
Note: the 5% figure is exactly AT the cap, not over it. "exceed" means
strictly > 5%, so it still passes. But now the reviewer / audit log
reflects reality.
* fix(resilience): PR 3 review — retired-dim confidence drag + false source-failure label
Addresses the Codex review P1 + P2 on PR #3297.
P1 — retired-dim drag on confidence averages
--------------------------------------------
scoreFuelStockDays returns coverage=0 by design (retired construct),
but computeLowConfidence, computeOverallCoverage, and the widget's
formatResilienceConfidence averaged across all 19 dimensions. That
dragged every country's reported averageCoverage down — US went from
0.8556 (active dims only) to 0.8105 (all dims) — enough drift to
misclassify edge countries as lowConfidence and to shift the ranking
widget's overallCoverage pill for every country.
Fix: introduce an authoritative RESILIENCE_RETIRED_DIMENSIONS set in
_dimension-scorers.ts and filter it out of all three averages. The
filter is keyed on the retired-dim REGISTRY, not on coverage === 0,
because a non-retired dim can legitimately emit coverage=0 on a
genuinely sparse-data country via weightedBlend fall-through — those
entries MUST keep dragging confidence down (that is the sparse-data
signal lowConfidence exists to surface). Verified: sparse-country
release-gate test (marks sparse WHO/FAO countries as low confidence)
still passes with the registry-keyed filter; would have failed with
a naive coverage=0 filter.
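A minimal sketch of the registry-keyed filter semantic (the set contents come from this commit; the averaging shown is simplified relative to the shipped coverage math):

```ts
// Sketch; the real set and averages live in _dimension-scorers.ts.
const RESILIENCE_RETIRED_DIMENSIONS = new Set(['fuelStockDays']);

interface DimScore { id: string; coverage: number }

function computeOverallCoverage(dims: DimScore[]): number {
  // Filter on the retirement registry, NOT on coverage === 0: a live
  // dimension at coverage 0 is the sparse-data signal and must keep
  // dragging the average down.
  const active = dims.filter((d) => !RESILIENCE_RETIRED_DIMENSIONS.has(d.id));
  if (active.length === 0) return 0;
  return active.reduce((sum, d) => sum + d.coverage, 0) / active.length;
}
```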
Server-client parity: widget-utils cannot import server code, so
RESILIENCE_RETIRED_DIMENSION_IDS is a hand-mirrored constant, kept
in lockstep by tests/resilience-retired-dimensions-parity.test.mts
(parses the widget file as text, same pattern as existing widget-util
tests that can't import the widget module directly).
P2 — false "Source down" label on retired dim
---------------------------------------------
scoreFuelStockDays hard-coded imputationClass: 'source-failure',
which the widget maps to "Source down: upstream seeder failed" with
a `!` icon for every country. That is semantically wrong for an
intentional retirement. Flipped to null so the widget's absent-path
renders a neutral cell without a false outage label. null is already
a legal value of ResilienceDimensionScore.imputationClass; no type
change needed.
Tests
-----
- tests/resilience-confidence-averaging.test.mts (new): pins the
registry-keyed filter semantic for computeOverallCoverage +
computeLowConfidence. Includes a negative-control test proving
non-retired coverage=0 dims still flip lowConfidence.
- tests/resilience-retired-dimensions-parity.test.mts (new):
lockstep gate between server and client retired-dim lists.
- Widget test adds a registry-keyed exclusion test with a non-retired
coverage=0 dim in the fixture to lock in the correct semantic.
- Existing tests asserting imputationClass: 'source-failure' for
fuelStockDays flipped to null.
All 494 resilience tests + full 6336/6336 data-tier suite pass.
Typecheck clean for both tsconfig.json and tsconfig.api.json.
* docs(resilience): align methodology + registry metadata with shipped imputationClass=null
Follow-up to the previous PR 3 review commit that flipped
scoreFuelStockDays's imputationClass from 'source-failure' to null to
avoid a false "Source down" widget label on every country. The code
changed; the doc and registry metadata did not, leaving three sites
in the methodology mdx and two comment/description sites in the
registry still claiming imputationClass='source-failure'. Any future
reviewer (or tooling that treats the registry description as
authoritative) would be misled.
This commit rewrites those sites to describe the shipped behavior:
- imputationClass=null (not 'source-failure'), with the rationale
- exclusion from confidence/coverage averages via the
RESILIENCE_RETIRED_DIMENSIONS registry filter
- the distinction between structural retirement (filtered) and
runtime coverage=0 (kept so sparse-data countries still flag
lowConfidence)
Touched:
- docs/methodology/country-resilience-index.mdx (lines ~33, ~268, ~590)
- server/worldmonitor/resilience/v1/_indicator-registry.ts
(recoveryFuelStockDays comment block + description field)
No code-behavior change. Docs-only.
Tests: 157 targeted resilience tests pass (incl. methodology-lint +
widget + release-gate + confidence-averaging). Typecheck clean on
both tsconfig.json and tsconfig.api.json.
c067a7dd63
fix(resilience): include hydroelectric in lowCarbonGenerationShare (PR #3289 follow-up) (#3293)
Greptile P1 review on the merged PR #3289: World Bank EG.ELC.RNEW.ZS
explicitly excludes hydroelectric. The v2 lowCarbonGenerationShare
composite was summing only nuclear + renew-ex-hydro, which would
collapse to ~0 for hydro-dominant economies the moment the
RESILIENCE_ENERGY_V2_ENABLED flag flipped:
  Norway ~95% hydro → score near 0 on a 0.20-weight indicator
  Paraguay ~99% hydro → same
  Brazil ~65% hydro → same
  Canada ~60% hydro → same
Directly contradicts the plan §3.3 intent of crediting "firm
low-carbon generation" and would produce rankings that contradict the
power-system security framing. PR #3289 merged before the review
landed. This branch applies the fix against main.
Fix: add EG.ELC.HYRO.ZS as a third series in the composite.
seed-low-carbon-generation.mjs:
- INDICATORS: ['EG.ELC.NUCL.ZS', 'EG.ELC.RNEW.ZS'] + 'EG.ELC.HYRO.ZS'
- fetchLowCarbonGeneration(): sum three series, track latest year
  across all three, same cap-at-100 guard
- File header comment names the three-series sum with the
  hydro-exclusion rationale + the country list that would break.
_indicator-registry.ts lowCarbonGenerationShare.description: rewritten
to name all three WB codes + explain the hydro exclusion.
country-resilience-index.mdx:
- Known-limitations item 3 names all three WB codes + country list
- Energy domain v2 table row names all three WB codes
- v2.1 changelog Indicators-added bullet names all three WB codes
- v2.1 changelog New-seeders bullet names all three WB codes on
  seed-low-carbon-generation
No scorer code change (the composite lives in the seeder; the scorer
reads the pre-summed value from resilience:low-carbon-generation:v1).
No weight change. Flag-off path remains byte-identical.
25 resilience tests pass, typecheck + typecheck:api clean.
b8615dd106
fix(intelligence): read regional-snapshot latestKey as raw string (#3302)
* fix(intelligence): read regional-snapshot latestKey as raw string
Regional Intelligence panel rendered "No snapshot available yet" for every
region despite the 6h cron writing per-region snapshots successfully. Root
cause: writer/reader encoding mismatch.
Writer (scripts/regional-snapshot/persist-snapshot.mjs:60) stores the
snapshot_id pointer via `['SET', latestKey, snapshotId]` — a BARE string,
not JSON.stringify'd. The seeder's own reader (line 97) reads it as-is
and works.
Vercel RPC handler used `getCachedJson(latestKey, true)`, which internally
does `JSON.parse(data.result)`. `JSON.parse('mena-20260421T000000-steady')`
throws; the try/catch silently returns null; handler returns {}; panel
renders empty.
Fix: new `getCachedRawString()` helper in server/_shared/redis.ts that
reads a Redis key as-is with no JSON.parse. Handler uses it for the
latestKey read (while still using getCachedJson for the snapshot-by-id
payload, which IS JSON.stringify'd). No writer or backfill change needed.
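A minimal sketch of the raw-string read, assuming an Upstash-style REST interface; the env var names, timeout, and error handling here are illustrative, and the real helper shares its fetch and logging plumbing with getCachedJson:

```ts
// Sketch only: assumes an Upstash-style REST GET /get/<key> returning { result }.
const REDIS_URL = process.env.UPSTASH_REDIS_REST_URL ?? '';
const REDIS_TOKEN = process.env.UPSTASH_REDIS_REST_TOKEN ?? '';

async function getCachedRawString(key: string): Promise<string | null> {
  try {
    const res = await fetch(`${REDIS_URL}/get/${encodeURIComponent(key)}`, {
      headers: { Authorization: `Bearer ${REDIS_TOKEN}` },
      signal: AbortSignal.timeout(2000), // illustrative timeout
    });
    if (!res.ok) return null;
    const data = (await res.json()) as { result: string | null };
    // The crucial difference from getCachedJson: no JSON.parse, so a bare
    // pointer like 'mena-20260421T000000-steady' comes back intact.
    return data.result;
  } catch {
    return null;
  }
}
```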
Regression guard: new structural test asserts the handler reads latestKey
specifically via getCachedRawString so a future refactor can't silently
revert to getCachedJson and re-break every region.
Health.js monitors the summary key (intelligence:regional-snapshots:
summary:v1), which stays green because summary writes succeed. Per-region
health probes would be worth adding as a follow-up.
* fix(redis): detect AbortSignal.timeout() as TimeoutError too
Greptile P2 on PR #3302: AbortSignal.timeout() throws a DOMException with
name='TimeoutError' on V8 runtimes (Vercel Edge included). The existing
isTimeout check only matched name==='AbortError' — what you'd get from a
manual controller.abort() — so the [REDIS-TIMEOUT] structured log never
fired. Every redis fetch timeout silently fell through to the generic
console.warn branch, eroding the Sentry drain we added specifically to
catch these per docs/plans/chokepoint-rpc-payload-split.md.
Fix both getCachedJson (pre-existing) and the new getCachedRawString by
matching TimeoutError OR AbortError — covers both the current
AbortSignal.timeout() path and any future switch to manual AbortController.
Pre-existing copy in getCachedJson fixed in the same edit since it's the
same file and the same observability hole.
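A sketch of the widened check described above (the real code lives in server/_shared/redis.ts; the name-probing shape here is illustrative):

```ts
function isTimeout(err: unknown): boolean {
  // AbortSignal.timeout() rejects with a DOMException named 'TimeoutError'
  // on V8 runtimes; a manual controller.abort() produces 'AbortError'.
  // Match both so the [REDIS-TIMEOUT] structured log fires either way.
  const name = (err as { name?: string } | null)?.name;
  return name === 'TimeoutError' || name === 'AbortError';
}
```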
* test(redis): update isTimeout regex to match new TimeoutError|AbortError check
Pre-push hook caught the brittle static-analysis test in
tests/get-chokepoint-history.test.mjs:83 that asserted the exact old
single-name pattern. Update the regex (and description) to cover both
TimeoutError and AbortError, matching the observability fix in the
previous commit.
52659ce192
feat(resilience): PR 1 — energy construct repair (flag-gated) (#3289)
* docs(resilience): PR 1 foundation — Option B framing + v2 energy construct spec
First commit in PR 1 of the resilience repair plan. Zero scoring-behaviour
change; sets up the construct contract that the code changes will implement.
Declares the framing decision required by plan section 3.2 before any
scorer code lands: Option B (power-system security) is adopted. Electricity
grids are the dominant short-horizon shock-transmission channel, and the
choice lets the v2 energy indicator set share one denominator (percent of
electricity generation) instead of mixing primary-energy and power-system
measures in a composite.
Methodology doc changes:
- Energy Domain section now documents both the legacy indicator set
(still the default) and the v2 indicator set (flag-gated), under a
single #### Energy H4 heading so the methodology-doc linter still
asserts dimension-id parity with the registry.
- v2 indicators: importedFossilDependence (EG.ELC.FOSL.ZS x
max(EG.IMP.CONS.ZS, 0)), lowCarbonGenerationShare (EG.ELC.NUCL.ZS +
EG.ELC.RNEW.ZS), powerLossesPct (EG.ELC.LOSS.ZS), reserveMarginPct
(IEA), euGasStorageStress (renamed + scoped to EU), energyPriceStress
(retained at 0.15 weight).
- Retired under v2: electricityConsumption, gasShare, coalShare,
dependency (all into importedFossilDependence), renewShare.
- electricityAccess moves from energy to infrastructure under v2.
- Added a v2.1 changelog section documenting the flag-gated rollout,
acceptance gates (per plan section 6), and snapshot filenames for
the post-flag-flip captures.
- Known-limitations items 1-3 updated to note PR 1 lands the v2
construct behind RESILIENCE_ENERGY_V2_ENABLED (default off).
Methodology-doc linter + mdx-lint + typecheck all clean. Indicator
registry, seeders, and scorer rewrite land in subsequent commits on
this same branch.
* feat(resilience): PR 1 — RESILIENCE_ENERGY_V2_ENABLED flag + scoreEnergy v2 + registry entries
Second commit in PR 1 of the resilience repair plan. Lands the flag,
the v2 scorer code path, and the registry entries the methodology
doc referenced. Default is flag off; published rankings are unchanged
until the flag flips in a later commit (after seeders land and the
acceptance-gate rerun produces a fresh post-flip snapshot).
Changes:
- _shared.ts: isEnergyV2Enabled() function reader on the canonical
RESILIENCE_ENERGY_V2_ENABLED env var. Dynamic read (like
isPillarCombineEnabled) so tests can flip per-case.
- _dimension-scorers.ts:
- New Redis key constants for the three v2 seed keys plus the
reserved reserveMargin key (seeder deferred per plan §3.1
open-question).
- EU_GAS_STORAGE_COUNTRIES set (EU + EFTA + UK) for the renamed
euGasStorageStress signal per plan §3.5 point 2.
- isEnergyV2EnabledLocal() — private duplicate of the flag reader
to avoid a circular import (_shared.ts already imports from
this module). Same env-var contract.
- scoreEnergy split into scoreEnergyLegacy() + scoreEnergyV2().
Public scoreEnergy() branches on the flag. Legacy path is
byte-identical to the pre-commit behaviour.
- scoreEnergyV2() reads four new bulk payloads, composes
importedFossilDependence = fossilElectricityShare × max(netImports, 0)/100
per plan §3.2, collapses net exporters to 0, and gates
euGasStorageStress on EU membership so non-EU countries
re-normalise rather than getting penalised for a regional
signal.
- _indicator-registry.ts: four new entries under `dimension: 'energy'`
with `tier: 'experimental'` — importedFossilDependence (0.35),
lowCarbonGenerationShare (0.20), powerLossesPct (0.10),
reserveMarginPct (0.10). Experimental tier keeps them out of the
Core coverage gate until seed coverage is confirmed.
- compare-resilience-current-vs-proposed.mjs: new
'bulk-v1-country-value' shape family in the extraction dispatcher.
EXTRACTION_RULES now covers the four v2 registry indicators so
the per-indicator influence harness tracks them from day one.
When the seeders are absent, pairedSampleSize = 0 and Pearson = 0
— the harness output surfaces the "no influence yet" state rather
than silently dropping the indicators.
- tests/resilience-energy-v2.test.mts: 11 new tests pinning:
- flag-off = legacy behaviour preserved (v2 seed keys have no
effect when flag is off — catches accidental cross-path reads)
- flag-on = v2 composite behaves correctly:
- lower fossilElectricityShare raises score
- net exporter with 90% fossil > net importer with 90% fossil
(max(·, 0) collapse verified)
- higher lowCarbonGenerationShare raises score (nuclear credit)
- higher powerLossesPct lowers score
- euGasStorageStress is invariant for non-EU, responds for DE
- all v2 inputs absent = graceful degradation, coverage < 1.0
106 resilience tests pass (existing + 11 new). Typecheck clean. Biome
clean. No production behaviour change with flag off (default).
Next commits on this branch: three World Bank seeders for the v2 keys,
health.js + SEED_META registration (gated ON_DEMAND_KEYS until Railway
cron provisions), acceptance-gate rerun at flag-flip time.
* feat(resilience): PR 1 — three WB seeders + health registration for v2 energy construct
Third commit in PR 1. Lands the seed scripts for the three v2 energy
indicator source keys, registered in api/health.js with ON_DEMAND_KEYS
gating until Railway cron provisions.
New seeders (weekly cron cadence, 8d maxStaleMin = 2x interval):
- scripts/seed-low-carbon-generation.mjs
Pulls EG.ELC.NUCL.ZS + EG.ELC.RNEW.ZS from World Bank, sums per
country into `resilience:low-carbon-generation:v1`. Partial
coverage (one series missing) still emits a value using the
observed half — the scorer's 0-80 saturating goalpost tolerates
it and the underlying construct is "firm low-carbon share".
- scripts/seed-fossil-electricity-share.mjs
Pulls EG.ELC.FOSL.ZS into `resilience:fossil-electricity-share:v1`.
Feeds the importedFossilDependence composite at score time
(composite = fossilShare × max(netImports, 0) / 100 per plan §3.2).
- scripts/seed-power-reliability.mjs
Pulls EG.ELC.LOSS.ZS into `resilience:power-losses:v1`. Direct
grid-integrity signal replacing the retired electricityConsumption
wealth proxy.
All three follow the existing seed-recovery-*.mjs template:
- Shape: { countries: { [ISO2]: { value, year } }, seededAt }
- runSeed() from _seed-utils.mjs with schemaVersion=1, ttl=35d
- validateFn floor of 150 countries (WB coverage is 150-180 for
the three indicators; below 150 = transient fetch failure)
- ISO3 → ISO2 mapping via scripts/shared/iso3-to-iso2.json
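A sketch of the shared payload shape the template above describes; the reshaping helper is hypothetical (the real seeders run through runSeed() in _seed-utils.mjs with schemaVersion=1):

```ts
// Payload shape per the seeder template: { countries, seededAt }.
interface SeedPayload {
  countries: Record<string, { value: number; year: number }>; // keyed by ISO2
  seededAt: string;
}

// Hypothetical reshaping of already-fetched World Bank rows.
function toSeedPayload(rows: { iso2: string; value: number; year: number }[]): SeedPayload {
  const countries: SeedPayload['countries'] = {};
  for (const r of rows) countries[r.iso2] = { value: r.value, year: r.year };
  return { countries, seededAt: new Date().toISOString() };
}
```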
No reserveMargin seeder is shipped in this commit per plan §3.1 open
question: IEA electricity-balance coverage is sparse outside OECD+G20,
and the indicator will likely ship as 'unmonitored' with weight 0.05
if it lands at all. The Redis key (`resilience:reserve-margin:v1`) is
reserved in _dimension-scorers.ts so the v2 scorer shape is stable.
api/health.js:
- SEED_DOMAINS: add `lowCarbonGeneration`, `fossilElectricityShare`,
`powerLosses` → their Redis keys.
- SEED_META: same three, pointing at `seed-meta:resilience:*` meta
keys with maxStaleMin=11520 (8d, per the worldmonitor
health-maxstalemin-write-cadence pattern: 2x weekly cron).
- ON_DEMAND_KEYS: three new entries gated as TRANSITIONAL until
Railway cron provisions and the first clean run completes. Remove
from this set after ~7 days of green production runs.
Typecheck clean; existing 106 resilience tests pass (seeders have no
in-repo callers yet, so nothing depends on them executing). Real-API
integration tests land when Railway cron is provisioned.
Next commit: Railway cron configuration + bundle-runner wiring.
* feat(resilience): PR 1 — bundle-runner + acceptance-gate verdict + flag-flip runbook
Final commit in the PR 1 tranche. Lands the three remaining pieces so
the flag-flip is fully operable once Railway cron provisions.
- scripts/seed-bundle-resilience-energy-v2.mjs
Railway cron bundle wrapping the three v2 energy seeders
(low-carbon-generation, fossil-electricity-share, power-losses).
Weekly cadence (7-day intervalMs); the underlying data is annual
at source so polling more frequently just hammers the World Bank
API. 5-minute per-script timeout. Mirrors the existing
seed-bundle-resilience-recovery.mjs pattern.
- scripts/compare-resilience-current-vs-proposed.mjs: acceptanceGates
block. Programmatic evaluation of plan §6 gates using the inputs
the harness already computes:
gate-1-spearman Spearman vs baseline >= 0.85
gate-2-country-drift Max country drift vs baseline <= 15
gate-6-cohort-median Cohort median shift vs baseline <= 10
gate-7-matched-pair Every pair holds expected direction
gate-9-effective-influence >= 80% Core indicators measurable
gate-universe-integrity No cohort/pair endpoint missing from scorable
Thresholds are encoded in a const so they can't silently soften
(sketched after this list). Output verdict is PASS / CONDITIONAL /
BLOCK. Emitted in summary.acceptanceVerdict for at-a-glance PR comment
pasting, with full per-gate detail in acceptanceGates.results.
- docs/methodology/energy-v2-flag-flip-runbook.md
Operator runbook for the flag flip. Pre-flip checklist (seeders
green, health endpoint green, ON_DEMAND_KEYS graduation, Spearman
verification), flip procedure (pre-flip snapshot, dry-run, cache
prefix bump, Vercel env flip, post-flip snapshot, methodology
doc reclassification), rollback procedure, and a reference table
for the three possible verdict states.
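A sketch of the frozen thresholds and verdict fold; which gates count as hard blockers versus merely conditional is an assumption of this sketch, not something the commit spells out:

```ts
const GATE_THRESHOLDS = Object.freeze({
  spearmanMin: 0.85,      // gate-1-spearman
  countryDriftMax: 15,    // gate-2-country-drift
  cohortMedianMax: 10,    // gate-6-cohort-median
  coreMeasurableMin: 0.8, // gate-9-effective-influence
});

type Verdict = 'PASS' | 'CONDITIONAL' | 'BLOCK';
interface GateResult { id: string; pass: boolean; hard: boolean }

function acceptanceVerdict(results: GateResult[]): Verdict {
  if (results.every((r) => r.pass)) return 'PASS';
  // Assumption: a failed hard gate blocks; soft-gate failures downgrade only.
  return results.some((r) => r.hard && !r.pass) ? 'BLOCK' : 'CONDITIONAL';
}
```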
PR 1 is now code-complete pending:
1. Railway cron provisioning (ops, not code)
2. Flag flip + acceptance-gate rerun (follows runbook, not code)
3. Reserve-margin seeder (deferred per plan §3.1 open-question)
Zero scoring-behaviour change in this commit. 121 resilience tests
pass, typecheck clean.
* fix(resilience): PR 1 — drop unseeded reserveMargin from scorer + fix composite extractor
Addresses two P1 review findings on PR #3289.
Finding 1: scoreEnergyV2 read resilience:reserve-margin:v1 at weight
0.10 but no seeder ships in this PR (indicator deferred per plan
§3.1 open-question). On flag flip that slot would be permanently
null, silently renormalizing the remaining 90% of weight and
producing a construct different from what the methodology doc
describes. Fix: remove reserve-margin from the v2 reader +
blend entirely. Redistribute its 0.10 weight to powerLossesPct
(now 0.20); both are grid-integrity signals per plan §3.1, and
the original plan split electricityConsumption's 0.30 weight
across powerLossesPct + reserveMarginPct + importedFossilDependence
— without reserveMarginPct, powerLossesPct carries the shared
grid-integrity load until the IEA seeder ships.
v2 weights now: 0.35 + 0.20 + 0.20 + 0.10 + 0.15 = 1.00
(importedFossilDependence + lowCarbonGenerationShare +
powerLossesPct + euGasStorageStress + energyPriceStress)
Reserve-margin Redis key constant stays reserved so the v2
scorer shape is stable when a future commit lands the seeder;
split 0.10 back out of powerLossesPct at that point.
Methodology doc, _shared.ts flag comment, and v2 test suite all
updated to the 5-indicator shape. New regression test asserts
that changing reserve-margin Redis content has zero effect on
the v2 score — guards against a future commit accidentally
wiring the reader back in without its seeder.
Finding 2: scripts/compare-resilience-current-vs-proposed.mjs
measured importedFossilDependence by reading fossilElectricityShare
alone. The scorer defines it as fossilShare × max(netImports, 0)
/ 100, so the extractor zeroed out net exporters and
under-reported net importers — making gate-9 effective-influence
wrong for the centrepiece construct change of PR 1.
Fix: new 'imported-fossil-dependence-composite' extractor type
in applyExtractionRule that recomputes the same composite from
both inputs (fossilShare bulk payload + staticRecord.iea.
energyImportDependency.value). Stays in lockstep with the
scorer — drift between the two would break gate-9's
interpretation.
New unit tests pin:
- net importer: 80% × max(60, 0) / 100 = 48 ✓
- net exporter: 80% × max(-40, 0) / 100 = 0 ✓
- missing either input → null
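The composite both the scorer and the extractor must compute identically, as a sketch (the pinned values are the ones in the unit tests above):

```ts
function importedFossilDependence(
  fossilSharePct: number | null,
  netImportsPct: number | null,
): number | null {
  if (fossilSharePct === null || netImportsPct === null) return null;
  // max(·, 0) collapses net exporters (negative net imports) to 0.
  return (fossilSharePct * Math.max(netImportsPct, 0)) / 100;
}

importedFossilDependence(80, 60);   // 48 (net importer)
importedFossilDependence(80, -40);  // 0 (net exporter)
importedFossilDependence(80, null); // null (missing input)
```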
64 resilience tests pass; typecheck clean. Flag-off path is
still byte-identical to pre-PR behaviour.
* docs(resilience): PR 1 — align methodology doc with actual shipped indicators and seeders
Addresses P1 review on docs/methodology/country-resilience-index.mdx
lines 29 and 574-575. The doc still described reserveMarginPct as a
shipped v2 indicator and listed seed-net-energy-imports.mjs in the
new-seeders list, neither of which the branch actually ships.
Doc changes to match the code in this branch:
Known-limitations item 1: restated to describe the actual v2
replacement footprint — powerLossesPct at 0.20 (temporarily
absorbing reserveMarginPct's 0.10) plus accessToElectricityPct
moved to infrastructure. reserveMarginPct is named as a deferred
companion with the split-out instructions for when its seeder
lands.
v2.1 changelog (Indicators added): split into "live in PR 1" and
"deferred in PR 1" so the reader can distinguish which entries
match real code. importedFossilDependence's composite formula
now written out and the net-imports source attributed to the
existing resilience:static.iea path (not a new seeder).
v2.1 changelog (New seeders): lists the three actual files that
ship in this branch (seed-low-carbon-generation, seed-fossil-
electricity-share, seed-power-reliability) and explicitly notes
seed-net-energy-imports.mjs is NOT a new seeder — the
EG.IMP.CONS.ZS series is already fetched by seed-resilience-
static.mjs. Adds the bundle-runner reference.
Methodology-doc linter + mdx-lint both pass (125/125). Typecheck
clean. Doc is now the source of truth for what PR 1 actually ships.
* fix(resilience): PR 1 — sync powerLossesPct registry weight with scorer (0.10 → 0.20)
Reviewer-caught mismatch between INDICATOR_REGISTRY and scoreEnergyV2.
The previous commit redistributed the deferred reserveMarginPct's 0.10
weight into powerLossesPct in the SCORER but left the REGISTRY entry
unchanged at 0.10. Two downstream effects:
1. scripts/compare-resilience-current-vs-proposed.mjs copies
`spec.weight` into `nominalWeight` for gate-9 reporting, so
powerLossesPct's nominal influence would be under-reported by
half in every post-flip acceptance run — exactly the harness PR 1
relies on for merge evidence.
2. Methodology doc vs registry vs scorer drift is the pattern the
methodology-doc linter is supposed to catch; it passes here
because the linter only checks dimension-id parity, not weights.
Registry is now the only remaining source of truth to keep in
lockstep with the scorer.
Change:
- `_indicator-registry.ts` powerLossesPct.weight: 0.1 → 0.2
- Inline comment names the deferral and instructs: "when the IEA
electricity-balance seeder lands, split 0.10 back out and restore
reserveMarginPct at 0.10. Keep this field in lockstep with
scoreEnergyV2 ... because the PR 0 compare harness copies
spec.weight into nominalWeight for gate-9 reporting."
Experimental weights per dimension invariant still holds (0.35 + 0.20
+ 0.20 = 0.75 for energy, well under the 1.0 ceiling). 64 resilience
tests pass, typecheck clean.
da0f26a3cf
feat(resilience): PR 0 diagnostic freeze + fairness-audit harness (no scoring changes) (#3284)
* feat(resilience): PR 0 diagnostic freeze + fairness-audit harness
Lands the before-state and measurement apparatus every subsequent
resilience-scorer PR validates against. Zero scoring changes. Per the
v3 plan at docs/plans/2026-04-22-001-fix-resilience-scorer-structural-
bias-plan.md this is tranche 0 of five.
What lands:
- Construct contract published in the methodology doc: absolute
resilience not development-adjusted, mechanism test for every
indicator, peer-relative views published separately from the core.
- Known construct limitations section: six construct errors scheduled
for PR 1-3 repair with explicit mapping to plan tranches.
- Indicator-source manifest at docs/methodology/indicator-sources.yaml
with source, seriesId, seriesUrl, coveragePct, lastObservedYear,
license, mechanismTestRationale, and a constructStatus classification.
- Pre-repair ranking snapshot at
docs/snapshots/resilience-ranking-live-pre-repair-2026-04-22.json
(217 items + 5 greyedOut, captured 2026-04-22 08:38 UTC at commit
58e42aadf9
chore(api): enforce sebuf contract + migrate drifting endpoints (#3207) (#3242)
* chore(api): enforce sebuf contract via exceptions manifest (#3207)
Adds api/api-route-exceptions.json as the single source of truth for
non-proto /api/ endpoints, with scripts/enforce-sebuf-api-contract.mjs
gating every PR via npm run lint:api-contract. Fixes the root-only
blind spot in the prior allowlist (tests/edge-functions.test.mjs),
which only scanned top-level *.js files and missed nested paths and
.ts endpoints — the gap that let api/supply-chain/v1/country-products.ts
and friends drift under proto domain URL prefixes unchallenged.
Checks both directions: every api/<domain>/v<N>/[rpc].ts must pair
with a generated service_server.ts (so a deleted proto fails CI), and
every generated service must have an HTTP gateway (no orphaned
generated code). Manifest entries require category + reason + owner,
with removal_issue mandatory for temporary categories (deferred,
migration-pending) and forbidden for permanent ones.
.github/CODEOWNERS pins the manifest to @SebastienMelki so new
exceptions don't slip through review. The manifest only shrinks:
migration-pending entries (19 today) will be removed as subsequent
commits in this PR land each migration.
* refactor(maritime): migrate /api/ais-snapshot → maritime/v1.GetVesselSnapshot (#3207)
The proto VesselSnapshot was carrying density + disruptions but the
frontend also needed sequence, relay status, and candidate_reports to
drive the position-callback system. Those only lived on the raw relay
passthrough, so the client had to keep hitting /api/ais-snapshot
whenever callbacks were registered and fall back to the proto RPC
only when the relay URL was gone. This commit pushes all three
missing fields through the proto contract and collapses the
dual-fetch-path into one proto client call.
Proto changes (proto/worldmonitor/maritime/v1/):
- VesselSnapshot gains sequence, status, candidate_reports.
- GetVesselSnapshotRequest gains include_candidates
  (query: include_candidates).
Handler (server/worldmonitor/maritime/v1/get-vessel-snapshot.ts):
- Forwards include_candidates to ?candidates=... on the relay.
- Separate 5-min in-memory caches for the candidates=on and
  candidates=off variants; they have very different payload sizes and
  should not share a slot.
- Per-request in-flight dedup preserved per-variant.
Frontend (src/services/maritime/index.ts):
- fetchSnapshotPayload now calls MaritimeServiceClient.getVesselSnapshot
  directly with includeCandidates threaded through. The raw-relay
  path, SNAPSHOT_PROXY_URL, DIRECT_RAILWAY_SNAPSHOT_URL and
  LOCAL_SNAPSHOT_FALLBACK are gone — production already routed via
  Vercel, the "direct" branch only ever fired on localhost, and the
  proto gateway covers both.
- New toLegacyCandidateReport helper mirrors
  toDensityZone/toDisruptionEvent.
api/ais-snapshot.js deleted; manifest entry removed.
Only reduced the codegen scope to worldmonitor.maritime.v1
(buf generate --path) — regenerating the full tree drops // @ts-nocheck
from every client/server file and surfaces pre-existing type errors
across 30+ unrelated services, which is not in scope for this PR.
Shape-diff vs legacy payload:
- disruptions / density: proto carries the same fields, just with the
  GeoCoordinates wrapper and enum strings (remapped client-side via
  existing toDisruptionEvent / toDensityZone helpers).
- sequence, status.{connected,vessels,messages}: now populated from
  the proto response — was hardcoded to 0/false in the prior proto
  fallback.
- candidateReports: same shape; optional numeric fields come through
  as 0 instead of undefined, which the legacy consumer already
  handled.
* refactor(sanctions): migrate /api/sanctions-entity-search → LookupSanctionEntity (#3207)
The proto docstring already claimed "OFAC + OpenSanctions" coverage
but the handler only fuzzy-matched a local OFAC Redis index — narrower
than the legacy /api/sanctions-entity-search, which proxied
OpenSanctions live (the source advertised in docs/api-proxies.mdx).
Deleting the legacy without expanding the handler would have been a
silent coverage regression for external consumers.
Handler changes (server/worldmonitor/sanctions/v1/lookup-entity.ts):
- Primary path: live search against api.opensanctions.org/search/default
  with an 8s timeout and the same User-Agent the legacy edge fn used.
- Fallback path: the existing OFAC local fuzzy match, kept intact for
  when OpenSanctions is unreachable / rate-limiting.
- Response source field flips between 'opensanctions' (happy path)
  and 'ofac' (fallback) so clients can tell which index answered.
- Query validation tightened: rejects q > 200 chars (matches legacy
  cap).
Rate limiting:
- Added /api/sanctions/v1/lookup-entity to ENDPOINT_RATE_POLICIES at
  30/min per IP — matches the legacy createIpRateLimiter budget. The
  gateway already enforces per-endpoint policies via
  checkEndpointRateLimit.
Docs:
- docs/api-proxies.mdx — dropped the /api/sanctions-entity-search row
  (plus the orphaned /api/ais-snapshot row left over from the previous
  commit in this PR).
- docs/panels/sanctions-pressure.mdx — points at the new RPC URL and
  describes the OpenSanctions-primary / OFAC-fallback semantics.
api/sanctions-entity-search.js deleted; manifest entry removed.
* refactor(military): migrate /api/military-flights → ListMilitaryFlights (#3207)
Legacy /api/military-flights read a pre-baked Redis blob written by
the seed-military-flights cron and returned flights in a flat
app-friendly shape (lat/lon, lowercase enums, lastSeenMs). The proto
RPC takes a bbox, fetches OpenSky live, classifies server-side, and
returns nested GeoCoordinates + MILITARY_*_TYPE_* enum strings +
lastSeenAt — same data, different contract. fetchFromRedis in
src/services/military-flights.ts was doing nothing sebuf-aware.
Renamed it to fetchViaProto and rewrote to:
- Instantiate MilitaryServiceClient against getRpcBaseUrl().
- Iterate MILITARY_QUERY_REGIONS (PACIFIC + WESTERN) in parallel —
  same regions the desktop OpenSky path and the seed cron already
  use, so dashboard coverage tracks the analytic pipeline.
- Dedup by hexCode across regions.
- Map proto → app shape via new mapProtoFlight helper plus three
  reverse enum maps (AIRCRAFT_TYPE_REVERSE, OPERATOR_REVERSE,
  CONFIDENCE_REVERSE).
The seed cron (scripts/seed-military-flights.mjs) stays put: it feeds
regional-snapshot mobility, cross-source signals, correlation, and
the health freshness check (api/health.js: 'military:flights:v1').
None of those read the legacy HTTP endpoint; they read the Redis key
directly. The proto handler uses its own per-bbox cache keys under
the same prefix, so dashboard traffic no longer races the seed cron's
blob — the two paths diverge by a small refresh lag, which is
acceptable.
Docs: dropped the /api/military-flights row from docs/api-proxies.mdx.
api/military-flights.js deleted; manifest entry removed.
Shape-diff vs legacy:
- f.location.{latitude,longitude} → f.lat, f.lon
- f.aircraftType: MILITARY_AIRCRAFT_TYPE_TANKER → 'tanker' via
  reverse map
- f.operator: MILITARY_OPERATOR_USAF → 'usaf' via reverse map
- f.confidence: MILITARY_CONFIDENCE_LOW → 'low' via reverse map
- f.lastSeenAt (number) → f.lastSeen (Date)
- f.enrichment → f.enriched (with field renames)
- Extra fields registration / aircraftModel / origin / destination /
  firstSeenAt now flow through where proto populates them.
* fix(supply-chain): thread includeCandidates through chokepoint status (#3207)
Caught by tsconfig.api.json typecheck in the pre-push hook (not
covered by the plain tsc --noEmit run that ran before I pushed the
ais-snapshot commit). The chokepoint status handler calls
getVesselSnapshot internally with a static no-auth request — now
required to include the new includeCandidates bool from the proto
extension. Passing false: server-internal callers don't need
per-vessel reports.
* test(maritime): update getVesselSnapshot cache assertions (#3207)
The ais-snapshot migration replaced the single
cachedSnapshot/cacheTimestamp pair with a per-variant cache so
candidates-on and candidates-off payloads don't evict each other.
Pre-push hook surfaced that tests/server-handlers still asserted the
old variable names. Rewriting the assertions to match the new shape
while preserving the invariants they actually guard:
- Freshness check against slot TTL.
- Cache read before relay call.
- Per-slot in-flight dedup.
- Stale-serve on relay failure (result ?? slot.snapshot).
* chore(proto): restore // @ts-nocheck on regenerated maritime files (#3207)
I ran 'buf generate --path worldmonitor/maritime/v1' to scope the
proto regen to the one service I was changing (to avoid the toolchain
drift that drops @ts-nocheck from 60+ unrelated files — separate
issue). But the repo convention is the 'make generate' target, which
runs buf and then sed-prepends '// @ts-nocheck' to every generated
.ts file. My scoped command skipped the sed step. The proto-check CI
enforces the sed output, so the two maritime files need the directive
restored.
* refactor(enrichment): decomm /api/enrichment/{company,signals} legacy edge fns (#3207)
Both endpoints were already ported to IntelligenceService:
- getCompanyEnrichment (/api/intelligence/v1/get-company-enrichment)
- listCompanySignals (/api/intelligence/v1/list-company-signals)
No frontend callers of the legacy /api/enrichment/* paths exist.
Removes:
- api/enrichment/company.js, signals.js, _domain.js
- api-route-exceptions.json migration-pending entries (58 remain)
- docs/api-proxies.mdx rows for /api/enrichment/{company,signals}
- docs/architecture.mdx reference updated to the IntelligenceService
  RPCs
Verified: typecheck, typecheck:api, lint:api-contract (89 files / 58
entries), lint:boundaries, tests/edge-functions.test.mjs (136 pass),
tests/enrichment-caching.test.mjs (14 pass — still guards the
intelligence/v1 handlers), make generate is zero-diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(leads): migrate /api/{contact,register-interest} → LeadsService (#3207)
New leads/v1 sebuf service with two POST RPCs:
- SubmitContact → /api/leads/v1/submit-contact
- RegisterInterest → /api/leads/v1/register-interest
Handler logic ported 1:1 from api/contact.js + api/register-interest.js:
- Turnstile verification (desktop sources bypass, preserved)
- Honeypot (website field) silently accepts without upstream calls
- Free-email-domain gate on SubmitContact (422 ApiError)
- validateEmail (disposable/offensive/typo-TLD/MX) on RegisterInterest
- Convex writes via ConvexHttpClient (contactMessages:submit,
  registerInterest:register)
- Resend notification + confirmation emails (HTML templates unchanged)
Shared helpers moved to server/_shared/:
- turnstile.ts (getClientIp + verifyTurnstile)
- email-validation.ts (disposable/offensive/MX checks)
Rate limits preserved via ENDPOINT_RATE_POLICIES:
- submit-contact: 3/hour per IP (was in-memory 3/hr)
- register-interest: 5/hour per IP (was in-memory 5/hr; desktop
  sources previously capped at 2/hr via shared in-memory map — now
  5/hr like everyone else, accepting the small regression in exchange
  for Upstash-backed global limiting)
Callers updated:
- pro-test/src/App.tsx contact form → new submit-contact path
- src-tauri/sidecar/local-api-server.mjs cloud-fallback rewrites
  /api/register-interest → /api/leads/v1/register-interest when
  proxying; keeps local path for older desktop builds
- src/services/runtime.ts isKeyFreeApiTarget allows both old and new
  paths through the WORLDMONITOR_API_KEY-optional gate
Tests:
- tests/contact-handler.test.mjs rewritten to call submitContact
  handler directly; asserts on ValidationError / ApiError
- tests/email-validation.test.mjs + tests/turnstile.test.mjs point at
  the new server/_shared/ modules
Deleted: api/contact.js, api/register-interest.js,
api/_ip-rate-limit.js, api/_turnstile.js, api/_email-validation.js,
api/_turnstile.test.mjs. Manifest entries removed (58 → 56). Docs
updated (api-platform, api-commerce, usage-rate-limits).
Verified: npm run typecheck + typecheck:api + lint:api-contract (88
files / 56 entries) + lint:boundaries pass; full test:data (5852
tests) passes; make generate is zero-diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(pro-test): rebuild bundle for leads/v1 contact form (#3207)
Updates the enterprise contact form to POST to
/api/leads/v1/submit-contact (old path /api/contact removed in the
previous commit). Bundle is rebuilt from the pro-test/src/App.tsx
source change in
425507d15a
fix(brief): category-gated context + RELEVANCE RULE to stop formulaic grounding (#3281)
* fix(brief): category-gated context + RELEVANCE RULE to stop formulaic grounding
Shadow-diff of 15 v2 pairs (2026-04-22) showed the analyst pattern-
matching the loudest context numbers — VIX 19.50, top forecast
probability, MidEast FX stress 77 — into every story regardless of
editorial fit. A Rwanda humanitarian story about refugees cited VIX;
an aviation story cited a forecast probability.
Root cause: every story got the same 6-bundle context block, so the
LLM had markets / forecasts / macro in-hand and the "cite a specific
fact" instruction did the rest.
Two-layer fix:
1. STRUCTURAL — sectionsForCategory() maps the story's category to
an editorially-relevant subset of bundles. Humanitarian stories
don't see marketData / forecasts / macroSignals; diplomacy gets
riskScores only; market/energy gets markets+forecasts but drops
riskScores. The model physically cannot cite what it wasn't
given. Unknown categories fall back to all six (backcompat).
2. PROMPT — WHY_MATTERS_ANALYST_SYSTEM_V2 adds a RELEVANCE RULE
that explicitly permits grounding in headline/description
actors when no context fact is a clean fit, and bans dragging
off-topic market metrics into humanitarian/aviation/diplomacy
stories. The prompt footer (inline, per-call) restates the
same guardrail — models follow inline instructions more
reliably than system-prompt constraints on longer outputs.
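A minimal sketch of the category-to-bundle mapping; the regex patterns and the exact bundle subsets per policy are illustrative (the humanitarian three-bundle set is the one named in the later review-fix commit), not the shipped table:

```ts
const ALL_SECTIONS = [
  'worldBrief', 'countryBrief', 'riskScores',
  'forecasts', 'macroSignals', 'marketData',
] as const;
type Section = (typeof ALL_SECTIONS)[number];

interface SectionPolicy { pattern: RegExp; label: string; sections: readonly Section[] }

const CATEGORY_SECTION_POLICY: SectionPolicy[] = [
  { pattern: /humanitarian|refugee/i, label: 'humanitarian',
    sections: ['worldBrief', 'countryBrief', 'riskScores'] },
  { pattern: /diplomacy/i, label: 'diplomacy',
    sections: ['riskScores'] },
  { pattern: /market|energy/i, label: 'market',
    sections: ['worldBrief', 'countryBrief', 'marketData', 'forecasts'] },
];

function sectionsForCategory(category: string): { label: string; sections: readonly Section[] } {
  // Unknown categories fall back to all six bundles (backcompat).
  return CATEGORY_SECTION_POLICY.find((p) => p.pattern.test(category))
    ?? { label: 'default', sections: ALL_SECTIONS };
}
```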
Cache keys bumped to invalidate the formulaic v5 output: endpoint
v5 to v6, shadow v3 to v4. Adds 11 unit tests pinning the 5
policies + default fallback + humanitarian structural guarantee +
market policy does-see-markets + guardrail footer presence.
Observability: endpoint now logs policyLabel per call so operators
can confirm in Vercel logs that humanitarian/aviation stories are
NOT seeing marketData without dumping the full prompt.
* test(brief): address greptile P2 — sync MAX_BODY_BYTES + add parseWhyMattersV2 coverage
Greptile PR #3281 review raised two P2 test-quality issues:
1. Test-side MAX_BODY_BYTES mirror was still 4096 — the endpoint
was bumped to 8192 in PR #3269 (v2 output + description). With
the stale constant, a payload in the 4097–8192 range was
accepted by the real endpoint but looked oversize in the test
mirror, letting the body-cap invariant silently drift. Fixed
by syncing to 8192 + bumping the bloated fixture to 10_000
bytes so a future endpoint-cap bump doesn't silently
re-invalidate the assertion.
2. parseWhyMattersV2 (the only output-validation gate on the
analyst path) had no dedicated unit tests. Adds 11 targeted
cases covering: valid 2 and 3 sentence output, 100/500 char
bounds (incl. boundary assertions), all 6 banned preamble
phrases, section-label leaks (SITUATION/ANALYSIS/Watch),
markdown leakage (#, -, *, 1.), stub echo rejection, smart/
plain quote stripping, non-string defensive branch, and
whitespace-only strings.
Suite size: 50 to 61 tests, all green.
* fix(brief): add aviation policy to sectionsForCategory (PR #3281 review P1)
Reviewer caught that aviation was named in WHY_MATTERS_ANALYST_SYSTEM_V2's
RELEVANCE RULE as a category banned from off-topic market metrics, but
had no matching regex entry in CATEGORY_SECTION_POLICY. So 'Aviation
Incident' / 'Airspace Closure' / 'Plane Crash' / 'Drone Incursion' all
fell through to DEFAULT_SECTIONS and still got all 6 bundles including
marketData, forecasts, and macroSignals — exactly the VIX / forecast
probability pattern the PR claimed to structurally prevent.
Reproduced on HEAD before fix:
Aviation Incident -> default
Airspace Closure -> default
Plane Crash -> default
...etc.
Fix:
1. Adds aviation policy (same 3 bundles as humanitarian/diplomacy/
tech: worldBrief, countryBrief, riskScores).
2. Adds dedicated aviation-gating test with 6 category variants.
3. Adds meta-invariant test: every category named in the system
prompt's RELEVANCE RULE MUST have a structural policy entry,
asserting policyLabel !== 'default'. If someone adds a new
category name to the prompt in the future, this test fires
until they wire up a regex — prevents soft-guard drift.
4. Removes 'Aviation Incident' from the default-fall-through test
list (it now correctly matches aviation).
No cache bump needed — v6 was published to the feature branch only a
few minutes ago, no production entries have been written yet.
fbaf07e106
feat(resilience): flag-gated pillar-combined score activation (default off) (#3267)
Wires the non-compensatory 3-pillar combined overall_score behind a
RESILIENCE_PILLAR_COMBINE_ENABLED env flag. Default is false so this
PR ships zero behavior change in production. When flipped true, the
top-level overall_score switches from the 6-domain weighted aggregate
to penalizedPillarScore(pillars) with alpha 0.5 and pillar weights
0.40 / 0.35 / 0.25.
Evidence from docs/snapshots/resilience-pillar-sensitivity-2026-04-21:
- Spearman rank correlation current vs proposed 0.9935
- Mean score delta -13.44 points (every country drops; the penalty is
  always at most 1)
- Max top-50 rank swing 6 positions (Russia)
- No ceiling or floor effects under plus/minus 20pct perturbation
- Release gate PASS 0/19
Code change in server/worldmonitor/resilience/v1/_shared.ts:
- New isPillarCombineEnabled() reads env dynamically so tests can
  flip state without reloading the module
- overallScore branches on (isPillarCombineEnabled() AND
  RESILIENCE_SCHEMA_V2_ENABLED AND pillars.length > 0); otherwise
  falls through to the 6-domain aggregate (unchanged default path;
  the branch shape is sketched below)
- RESILIENCE_SCORE_CACHE_PREFIX bumped v9 to v10
- RESILIENCE_RANKING_CACHE_KEY bumped v9 to v10
Cache invalidation: the version bump forces both the per-country
score cache and the ranking cache to recompute from the current code
path on first read after a flag flip. Without the bump, 6-domain
values cached under the flag-off path would continue to serve for up
to 6-12 hours after the flip, producing a ragged mix of formulas.
Ripple of v9 to v10:
- api/health.js registry entry
- scripts/seed-resilience-scores.mjs (both keys)
- scripts/validate-resilience-correlation.mjs,
  scripts/backtest-resilience-outcomes.mjs,
  scripts/validate-resilience-backtest.mjs,
  scripts/benchmark-resilience-external.mjs
- tests/resilience-ranking.test.mts 24 fixture usages
- tests/resilience-handlers.test.mts
- tests/resilience-scores-seed.test.mjs explicit pin
- tests/resilience-pillar-aggregation.test.mts explicit pin
- docs/methodology/country-resilience-index.mdx
New tests/resilience-pillar-combine-activation.test.mts: 7 assertions
exercising the flag-on path against the release fixtures with
re-anchored bands (NO at least 60, YE/SO at most 40, NO greater than
US preserved, elite greater than fragile). A regression guard
verifies that flipping the flag back off restores the 6-domain
aggregate.
tests/resilience-ranking-snapshot.test.mts: band thresholds now
resolve from a METHODOLOGY_BANDS table keyed on
snapshot.methodologyFormula. Backward compatible (missing formula
defaults to domain-weighted-6d bands).
Snapshots:
- docs/snapshots/resilience-ranking-2026-04-21.json tagged
  methodologyFormula domain-weighted-6d
- docs/snapshots/resilience-ranking-pillar-combined-projected-2026-04-21.json
  new: top/bottom/major-economies tables projected from the
  52-country sensitivity sample. Explicitly tagged projected (NOT a
  full-universe live capture). When the flag is flipped in
  production, run scripts/freeze-resilience-ranking.mjs to capture
  the authoritative full-universe snapshot.
Methodology doc: the Pillar-combined score activation section is
rewritten to describe the flag-gated mechanism (activation is an
env-var flip, no code deploy) and the rollback path.
Verification: npm run typecheck:all clean, 397/397 resilience tests
pass (up from 390; +7 activation tests).
Activation plan:
1. Merge this PR with flag default false (zero behavior change)
2. Set RESILIENCE_PILLAR_COMBINE_ENABLED=true in Vercel and Railway
   env
3. Redeploy or wait for next cold start; the v9 to v10 bump forces
   every country to be rescored on first read
4. Run scripts/freeze-resilience-ranking.mjs against the flag-on
   deployment and commit the resulting snapshot
5. Ship a v2.0 methodology-change note explaining the re-anchored
   scale so analysts understand the universal ~13-point score drop is
   a scale rebase, not a country-level regression
Rollback: set RESILIENCE_PILLAR_COMBINE_ENABLED=false, flush
resilience:score:v10:* and resilience:ranking:v10 keys (or wait for
TTLs). The 6-domain formula stays alongside the pillar combine in
_shared.ts and needs no code change to come back.
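A sketch of the flag-gated branch, using hypothetical local names; penalizedPillarScore's internals and the schema-flag plumbing are declared but not reproduced here:

```ts
type Pillar = { id: string; score: number };

declare const RESILIENCE_SCHEMA_V2_ENABLED: boolean;
// alpha 0.5, pillar weights 0.40 / 0.35 / 0.25 per the commit; body not shown.
declare function penalizedPillarScore(pillars: Pillar[]): number;

// Dynamic env read so tests can flip state without reloading the module.
const isPillarCombineEnabled = (): boolean =>
  process.env.RESILIENCE_PILLAR_COMBINE_ENABLED === 'true';

function overallScore(sixDomainAggregate: number, pillars: Pillar[]): number {
  if (isPillarCombineEnabled() && RESILIENCE_SCHEMA_V2_ENABLED && pillars.length > 0) {
    return penalizedPillarScore(pillars);
  }
  return sixDomainAggregate; // unchanged default path
}
```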
502bd4472c
docs(resilience): sync methodology/proto/widget to 6-domain + 3-pillar reality (#3264)
Brings every user-facing surface into alignment with the live
resilience scorer. Zero behavior change: overall_score is still the
6-domain weighted aggregate, schemaVersion is still 2.0 default, and
every existing test continues to pass.
Surfaces touched:
- proto + OpenAPI: rewrote the ResiliencePillar + schema_version
  descriptions. 2.0 is correctly documented as default;
  shaped-but-empty language removed.
- Widget: added the missing recovery: 'Recovery' label (was rendering
  literal lowercase recovery before), retitled the footer
  data-version chip from Data to Seed date so it is clear the value
  reflects the static seed bundle, not every live input, and rewrote
  the help tooltip for 6 domains and 3 pillars, calling out the 0.25
  recovery weight.
- Methodology doc: domains-and-weights table now carries all 6 rows
  with actual code weights (0.17/0.15/0.11/0.19/0.13/0.25), Recovery
  section header weight corrected from 1.0 to 0.25, new
  Pillar-combined score activation (pending) section with the
  measured Spearman 0.9935, top-5 movers, and the activation
  checklist.
- documentation.mdx + features.mdx: product blurbs updated from 5
  domains and 13 dimensions to 6 domains and 19 dimensions grouped
  into 3 pillars.
- Tests: recovery-label regression pin, Seed date label pin,
  clarified pillar-schema degenerate-input semantics.
New scaffolding for defensibility:
- docs/snapshots/resilience-ranking-2026-04-21.json frozen published
  tables artifact with methodology metadata and commit SHA.
- docs/snapshots/resilience-pillar-sensitivity-2026-04-21.json live
  Redis capture (52-country sample) combining sensitivity stability
  with the current-vs-proposed Spearman comparison.
- scripts/freeze-resilience-ranking.mjs refresh script.
- scripts/compare-resilience-current-vs-proposed.mjs comparison
  script.
- tests/resilience-ranking-snapshot.test.mts 13 assertions
  auto-discovered from any resilience-ranking-YYYY-MM-DD.json in
  snapshots.
Verification: npm run typecheck:all clean, 390/390 resilience tests
pass.
Follow-up: pillar-combined score activation. The sensitivity artifact
shows rank-preservation Spearman 0.9935 and no ceiling effects, which
clears the methodological bar. The blocker is messaging: every
country drops ~13 points under the penalty, so the activation PR
ships with re-anchored release-gate bands, a refreshed frozen
ranking, and a v2.0 methodology note.
ec35cf4158
feat(brief): analyst prompt v2 — multi-sentence, grounded, story description (#3269)
* feat(brief): analyst prompt v2 — multi-sentence, grounded, includes story description
Shadow-diff of 12 prod stories on 2026-04-21 showed v1 analyst output
indistinguishable from legacy Gemini: identical single-sentence
abstraction ("destabilize / systemic / sovereign risk repricing") with
no named actors, metrics, or dates — in several cases Gemini was MORE
specific.
Root cause: 18–30 word cap compressed context specifics out.
v2 loosens three dials at once so we can settle the A/B:
1. New system prompt WHY_MATTERS_ANALYST_SYSTEM_V2 — 2–3 sentences,
40–70 words, implicit SITUATION→ANALYSIS→(optional) WATCH arc,
MUST cite one specific named actor / metric / date / place from
the context. Analyst path only; gemini path stays on v1.
2. New parser parseWhyMattersV2 — accepts 100–500 chars, rejects
preamble boilerplate + leaked section labels + markdown (contract
sketched after this list).
3. Story description plumbed through — endpoint body accepts optional
story.description (≤ 1000 chars, body cap bumped 4 KB → 8 KB).
Cron forwards it when upstream has one (skipped when it equals the
headline — no new signal).
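A minimal sketch of the v2 parser contract, assuming (hedged) the rejection
heuristics named above; the real parseWhyMattersV2 may use different patterns:

  // Illustrative TypeScript; the 100–500 window is per the commit, the
  // regexes are assumptions.
  function parseWhyMattersV2(raw: string): string | null {
    const text = raw.trim();
    if (text.length < 100 || text.length > 500) return null;        // length window
    if (/^(sure|certainly|here('s| is)|why (it|this) matters:)/i.test(text)) return null; // preamble
    if (/\b(SITUATION|ANALYSIS|WATCH)\s*:/.test(text)) return null;  // leaked section labels
    if (/[*_`#]|\[.+\]\(.+\)/.test(text)) return null;               // markdown
    return text;
  }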
Cache + shadow bumped v3 → v4 / v1 → v2 so fresh output lands on the
first post-deploy cron tick. maxTokens 180 → 260 for ~3× output length.
If shadow-diff 24h after deploy still shows no delta vs gemini, kill
is BRIEF_WHY_MATTERS_PRIMARY=gemini on Vercel (instant, no redeploy).
Tests: 6059 pass (was 6022 + 37 new). typecheck × 2 clean.
* fix(brief): stop truncating v2 multi-sentence output + description in cache hash
Two P1s caught in PR #3269 review.
P1a — cron reparsed endpoint output with v1 single-sentence parser,
silently dropping sentences 2+3 of v2 analyst output. The endpoint had
ALREADY validated the string (parseWhyMattersV2 for analyst path;
parseWhyMatters for gemini). Re-parsing with v1 took only the first
sentence — exact regression #3269 was meant to fix.
Fix: trust the endpoint. Replace re-parse with bounds check (30–500
chars) + stub-echo reject. Added regression test asserting multi-
sentence output reaches the envelope unchanged.
P1b — `story.description` flowed into the analyst prompt but NOT into
the cache hash. Two requests with identical core fields but different
descriptions collided on one cache slot → second caller got prose
grounded in the FIRST caller's description.
Fix: add `description` as the 6th field of `hashBriefStory`. Bump
endpoint cache v4→v5 and shadow v2→v3 so buggy 5-field entries are
dropped. Updated the parity sentinel in brief-llm-core.test.mjs to
match 6-field semantics. Added regression tests covering different-
descriptions-differ and present-vs-absent-differ.
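A minimal sketch of the 6-field hash, assuming (hedged) this field order and
a NUL joiner; the real hashBriefStory may differ in joiner and digest length:

  // Illustrative TypeScript; core fields per this PR, description added as the 6th.
  import { createHash } from 'node:crypto';
  interface BriefStoryCore {
    headline: string; source: string; threatLevel: string;
    category: string; country: string; description?: string;
  }
  function hashBriefStory(s: BriefStoryCore): string {
    // description is the 6th field; an absent description hashes as ''.
    const parts = [s.headline, s.source, s.threatLevel, s.category, s.country, s.description ?? ''];
    return createHash('sha256').update(parts.join('\u0000')).digest('hex').slice(0, 16);
  }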
Tests: 6083 pass. typecheck × 2 clean.
|
||
|
|
2f19d96357 |
feat(brief): route whyMatters through internal analyst-context endpoint (#3248)
* feat(brief): route whyMatters through internal analyst-context endpoint
The brief's "why this is important" callout currently calls Gemini on
only {headline, source, threatLevel, category, country} with no live
state. The LLM can't know whether a ceasefire is on day 2 or day 50,
that IMF flagged >90% gas dependency in UAE/Qatar/Bahrain, or what
today's forecasts look like. Output is generic prose instead of the
situational analysis WMAnalyst produces when given live context.
This PR adds an internal Vercel edge endpoint that reuses a trimmed
variant of the analyst context (country-brief, risk scores, top-3
forecasts, macro signals, market data — no GDELT, no digest-search)
and ships it through a one-sentence LLM call with the existing
WHY_MATTERS_SYSTEM prompt. The endpoint owns its own Upstash cache
(v3 prefix, 6h TTL), supports a shadow mode that runs both paths in
parallel for offline diffing, and is auth'd via RELAY_SHARED_SECRET.
Three-layer graceful degradation (endpoint → legacy Gemini-direct →
stub) keeps the brief shipping on any failure.
Env knobs:
- BRIEF_WHY_MATTERS_PRIMARY=analyst|gemini (default: analyst; an unrecognized value falls back to gemini)
- BRIEF_WHY_MATTERS_SHADOW=0|1 (default: 1; only '0' disables)
- BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT=0..100 (default: 100)
- BRIEF_WHY_MATTERS_ENDPOINT_URL (Railway, optional override)
Cache keys:
- brief:llm:whymatters:v3:{hash16} — envelope {whyMatters, producedBy,
at}, 6h TTL. Endpoint-owned.
- brief:llm:whymatters:shadow:v1:{hash16} — {analyst, gemini, chosen,
at}, 7d TTL. Fire-and-forget.
- brief:llm:whymatters:v2:{hash16} — legacy. Cron's fallback path
still reads/writes during the rollout window; expires in ≤24h.
Tests: 6022 pass (existing 5915 + 12 core + 36 endpoint + misc).
typecheck + typecheck:api + biome on changed files clean.
Plan (Codex-approved after 4 rounds):
docs/plans/2026-04-21-001-feat-brief-why-matters-analyst-endpoint-plan.md
* fix(brief): address /ce:review round 1 findings on PR #3248
Fixes 5 findings from multi-agent review, 2 of them P1:
- #241 P1: `.gitignore !api/internal/**` was too broad — it re-included
`.env`, `.env.local`, and any future secret file dropped into that
directory. Narrowed to explicit source extensions (`*.ts`, `*.js`,
`*.mjs`) so parent `.env` / secrets rules stay in effect inside
api/internal/.
- #242 P1: `Dockerfile.digest-notifications` did not COPY
`shared/brief-llm-core.js` + `.d.ts`. Cron would have crashed at
container start with ERR_MODULE_NOT_FOUND. Added alongside
brief-envelope + brief-filter COPY lines.
- #243 P2: Cron dropped the endpoint's source/producedBy ground-truth
signal, violating PR #3247's own round-3 memory
(feedback_gate_on_ground_truth_not_configured_state.md). Added
structured log at the call site: `[brief-llm] whyMatters source=<src>
producedBy=<pb> hash=<h>`. Endpoint response now includes `hash` so
log + shadow-record pairs can be cross-referenced.
- #244 P2: Defense-in-depth prompt-injection hardening. Story fields
flowed verbatim into both LLM prompts, bypassing the repo's
sanitizeForPrompt convention. Added sanitizeStoryFields helper and
applied in both analyst and gemini paths.
- #245 P2: Removed redundant `validate` option from callLlmReasoning.
With only openrouter configured in prod, a parse-reject walked the
provider chain, then fell through to the other path (same provider),
then the cron's own fallback (same model) — 3x billing on one reject.
Post-call parseWhyMatters check already handles rejection cleanly.
Deferred to P3 follow-ups (todos 246-248): singleflight, v2 sunset,
misc polish (country-normalize LOC, JSDoc pruning, shadow waitUntil,
auto-sync mirror, context-assembly caching).
Tests: 6022 pass. typecheck + typecheck:api clean.
* fix(brief-why-matters): ctx.waitUntil for shadow write + sanitize legacy fallback
Two P2 findings on PR #3248:
1. Shadow record was fire-and-forget without ctx.waitUntil on an Edge
function. Vercel can terminate the isolate after response return,
so the background redisPipeline write completes unreliably — i.e.
the rollout-validation signal the shadow keys were supposed to
provide was flaky in production.
Fix: accept an optional EdgeContext 2nd arg. Build the shadow
promise up front (so it starts executing immediately) then register
it with ctx.waitUntil when present. Falls back to plain unawaited
execution when ctx is absent (local harness / tests); sketched below.
2. scripts/lib/brief-llm.mjs legacy fallback path called
buildWhyMattersPrompt(story) on raw fields with no sanitization.
The analyst endpoint sanitizes before its own prompt build, but
the fallback is exactly what runs when the endpoint misses /
errors — so hostile headlines / sources reached the LLM verbatim
on that path.
Fix: local sanitizeStoryForPrompt wrapper imports sanitizeForPrompt
from server/_shared/llm-sanitize.js (existing pattern — see
scripts/seed-digest-notifications.mjs:41). Wraps story fields
before buildWhyMattersPrompt. Cache key unchanged (hash is over raw
story), so cache parity with the analyst endpoint's v3 entries is
preserved.
Regression guard: new test asserts the fallback prompt strips
"ignore previous instructions", "### Assistant:" line prefixes, and
`<|im_start|>` tokens when injection-crafted fields arrive.
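A minimal sketch of the finding-1 pattern, assuming (hedged) an EdgeContext
shape with waitUntil; names are illustrative, not the repo's exact API:

  // Illustrative TypeScript: start the shadow write immediately, then keep
  // the isolate alive when an Edge context is available.
  type EdgeContext = { waitUntil(promise: Promise<unknown>): void };
  function recordShadow(write: () => Promise<void>, ctx?: EdgeContext): void {
    const pending = write().catch(() => {}); // begins executing now; never throws into the handler
    if (ctx) ctx.waitUntil(pending);         // Vercel waits for settlement after the response returns
    // No ctx (local harness / tests): plain unawaited execution, as before.
  }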
Typecheck + typecheck:api clean. 6023 / 6023 data tests pass.
* fix(digest-cron): COPY server/_shared/llm-sanitize into digest-notifications image
Reviewer P1 on PR #3248: my previous commit (
|
||
|
|
89c179e412 |
fix(brief): cover greeting was hardcoded "Good evening" regardless of issue time (#3254)
* fix(brief): cover greeting was hardcoded "Good evening" regardless of issue time
Reported: a brief viewed at 13:02 local time showed "Good evening" on the
cover (slide 1) but "Good afternoon." on the digest greeting page (slide 2).
Cause: `server/_shared/brief-render.js:renderCover` had the string
`'Good evening'` hardcoded in the cover's mono-cased salutation slot. The
digest greeting page (slide 2) renders the time-of-day-correct value from
`envelope.data.digest.greeting`, which is computed by
`shared/brief-filter.js:174-179` from `localHour` in the user's TZ (< 12 →
morning, < 18 → afternoon, else → evening). So any brief viewed outside the
literal evening showed an inconsistent pair.
Fix: thread `digest.greeting` into `renderCover`; a small `coverGreeting()`
helper strips the trailing period so the cover's no-punctuation mono style is
preserved. On unexpected/missing values it falls back to a generic "Hello"
rather than silently re-hardcoding a specific time of day.
Tests: 5 regression cases in `tests/brief-magazine-render.test.mjs` cover
afternoon/morning/evening parity, period stripping, and HTML escape
(defense-in-depth). 60 total in that file pass. Full test:data 5921 pass.
typecheck + typecheck:api + biome clean.
* chore(brief): fix orphaned JSDoc on coverGreeting / renderCover
Greptile flagged: the original `renderCover` JSDoc block stayed above
`coverGreeting` when the helper was inserted, so the @param shape was
misattributed to the wrong function and `renderCover` was left undocumented
(plus the new `greeting` field was unlisted). Moved the opts-shape JSDoc to
immediately above `renderCover` and added `greeting: string` to the param
type. `coverGreeting` keeps its own prose comment. No runtime change.
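A minimal sketch of the helper (the allowlist check is an assumption;
strip-the-period with a "Hello" fallback is per the fix above):

  // Illustrative TypeScript mirroring coverGreeting's contract.
  function coverGreeting(greeting: unknown): string {
    const known = ['Good morning.', 'Good afternoon.', 'Good evening.'];
    if (typeof greeting === 'string' && known.includes(greeting)) {
      return greeting.replace(/\.$/, ''); // cover style is no-punctuation mono
    }
    return 'Hello'; // unexpected/missing: generic, never re-hardcode a time of day
  }
|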
||
|
|
6977e9d0fe |
fix(gateway): accept Dodo entitlement as pro, not just Clerk role — unblocks paying users (#3249)
* fix(gateway): accept Dodo entitlement as pro, not just Clerk role
The gateway's legacy premium-paths gate (lines 388-401) was rejecting
authenticated Bearer users with 403 "Pro subscription required"
whenever session.role !== 'pro' — which is EVERY paying Dodo
subscriber, because the Dodo webhook pipeline writes Convex
entitlements and does NOT sync Clerk publicMetadata.role.
So the flow was:
- User pays, Dodo webhook fires, Convex entitlement tier=1 written
- User loads the dashboard, Clerk token includes Bearer but role='free'
- Gateway sees role!=='pro' → 403 on every intelligence/trade/
economic/sanctions premium endpoint
- User sees a blank dashboard despite having paid
This is the exact split-brain documented at the frontend layer
(src/services/panel-gating.ts:11-27): "The Convex entitlement check
is the authoritative signal for paying customers — Clerk
`publicMetadata.plan` is NOT written by our webhook pipeline". The
frontend was fixed by having hasPremiumAccess() fall through to
isEntitled() from Convex. The backend gateway still had the
Clerk-role-only gate, so paying users got rejected even though
their Convex entitlement was active.
Align the gateway gate with the logic already in
server/_shared/premium-check.ts::isCallerPremium (line 44-49), sketched below:
1. If Clerk role === 'pro' → allow (fast path, no Redis/Convex I/O)
2. Else if session.userId → look up Convex entitlement; allow if
tier >= 1 AND validUntil >= Date.now() (covers lapsed subs)
3. Else → 403
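A minimal sketch of that two-signal gate (getEntitlements is the repo's
existing helper; the session shape here is an assumption):

  // Illustrative TypeScript mirroring steps 1-3 above.
  declare function getEntitlements(userId: string): Promise<{ tier: number; validUntil: number } | null>;
  async function isCallerPremium(session: { role?: string; userId?: string }): Promise<boolean> {
    if (session.role === 'pro') return true;           // 1. fast path, no Redis/Convex I/O
    if (!session.userId) return false;                 // 3. no session user: fail closed
    const ent = await getEntitlements(session.userId); // 2. Convex-backed entitlement lookup
    return ent !== null && ent.tier >= 1 && ent.validUntil >= Date.now(); // lapsed subs rejected
  }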
Same two-signal semantics as the per-handler isCallerPremium, so
the gateway and handlers can't disagree on who is premium. Uses
the already-imported getEntitlements function (line 345 already
imports it dynamically; promoting to top-level import since the new
site is in a hotter path).
Impact: unblocks all Dodo subscribers whose Clerk role is still
'free' — the common case after any fresh Pro purchase and for
every user since webhook-based role sync was never wired up.
Reported 2026-04-21 post-purchase flow: user completed Dodo payment,
landed back on dashboard, saw 403s on get-regional-snapshot,
get-tariff-trends, list-comtrade-flows, get-national-debt,
deduct-situation — all 5 are in PREMIUM_RPC_PATHS but not in
ENDPOINT_ENTITLEMENTS, so they hit this legacy gate.
* fix(gateway): move entitlement fallback to the gate that actually fires
Reviewer caught that the previous iteration of this fix put the
entitlement fallback at line ~400, inside an `if (sessionUserId &&
!keyCheck.valid && needsLegacyProBearerGate)` branch that's
unreachable for the case the PR was supposed to fix:
- sessionUserId is only resolved when isTierGated is true (line 292)
— JWKS lookup is intentionally skipped for non-tier-gated paths.
- needsLegacyProBearerGate IS the non-tier-gated set
(PREMIUM_RPC_PATHS && !isTierGated).
- So sessionUserId is null, the branch never enters, and the actual
legacy-Bearer rejection still happens earlier at line 367 inside
the `keyCheck.required && !keyCheck.valid` branch.
Move the entitlement fallback INTO the line-367 check, where the
Bearer is already being validated and `session.userId` is already
exposed on the validateBearerToken() result. No extra JWKS round-trip
needed (validateBearerToken already verified the JWT). The previously-
added line-400 block is removed since it never ran.
Now for a paying Dodo subscriber whose Clerk role is still 'free':
- Bearer validates → role !== 'pro'
- Fall through: getEntitlements(session.userId) → tier=1, validUntil future
- allowed = true, request proceeds to handler
Same fail-closed semantics as before for the negative cases:
- Anonymous → no Bearer → 401
- Bearer with invalid JWT → 401
- Free user with no Dodo entitlement → 403
- Pro user whose Dodo subscription lapsed (validUntil < now) → 403
* chore(gateway): drop redundant dynamic getEntitlements import
Greptile spotted that the previous commit promoted getEntitlements to
a top-level import for the new line-385 fallback site, but the older
dynamic import at line 345 (in the user-API-key entitlement check
branch) was left in place. Same module, same symbol, so the dynamic
import is now dead weight that just adds a microtask boundary to the
hot path.
Drop it; line 345's `getEntitlements(sessionUserId)` call now resolves
through the top-level import like the line-385 site already does.
|
||
|
|
9e022f23bb |
fix(cable-health): stop EMPTY alarm during NGA outages — writeback fallback + mark zero-events healthy (#3230)
User reported health endpoint showing:
"cableHealth": { status: "EMPTY", records: 0, seedAgeMin: 0, maxStaleMin: 90 }
despite the 30-min warm-ping loop running. Two bugs stacked:
1. get-cable-health.ts null-upstream path didn't write Redis.
cachedFetchJson with a returning-null fetcher stores NEG_SENTINEL
(10 bytes) in cable-health-v1 for 2 min. Handler then returned
`fallbackCache || { cables: {} }` to the client WITHOUT writing to
cable-health-v1 or refreshing seed-meta. api/health.js saw strlen=10
→ strlenIsData=false → hasData=false → records=0 → EMPTY (CRIT).
Fix: on null result, write the fallback response back to CACHE_KEY
(short TTL matching NEG_SENTINEL so a recovered NGA fetch can
overwrite immediately) AND refresh seed-meta with the real count.
Health now sees hasData=true during an outage (sketched after item 2).
2. Zero-cables was treated as EMPTY_DATA (CRIT), but `cables: {}` is
the valid healthy state — NGA had no active subsea cable warnings.
The old `Math.max(count, 1)` on recordCount was an intentional lie
to sidestep this; now honest.
Fix: add `cableHealth` to EMPTY_DATA_OK_KEYS. Matches the existing
pattern for notamClosures, gpsjam, weatherAlerts — "zero events is
valid, not critical". recordCount now reports actual cables.length.
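A minimal sketch of the item-1 writeback (redisSet, the key names, and the
seed-meta shape are stand-ins for the repo's actual helpers):

  // Illustrative TypeScript: on a null NGA fetch, persist the fallback so
  // api/health.js sees real data instead of the 10-byte NEG_SENTINEL.
  declare function redisSet(key: string, value: string, exSeconds: number): Promise<void>;
  async function writeBackFallback(fallback: { cables: Record<string, unknown> }) {
    const ttlSeconds = 120; // matches NEG_SENTINEL so a recovered fetch overwrites immediately
    await redisSet('cable-health-v1', JSON.stringify(fallback), ttlSeconds);
    await redisSet('cable-health:seed-meta', JSON.stringify({
      recordCount: Object.keys(fallback.cables).length, // the real count, not a floor
      updatedAt: Date.now(),
    }), ttlSeconds);
    return fallback;
  }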
Combined: NGA outage → fallback cached locally + written back → health
reads hasData=true, records=N, no false alarm. NGA healthy with zero
active warnings → cables={}, records=0, EMPTY_DATA_OK → OK. NGA healthy
with warnings → cables={...}, records>0 → OK.
Regression guard to keep in mind: if anyone later removes cableHealth
from EMPTY_DATA_OK_KEYS and wants strict zero-events to alarm, they'd
also need to revisit `Math.max(count, 1)` or an equivalent floor so
the "legitimately empty but healthy" state doesn't CRIT.
|
||
|
|
14c1314629 |
fix(scoring): scope "critical" to geopolitical events, not domestic tragedies (#3221)
The weight rebalance (PR #3144) amplified a prompt gap: domestic mass
shootings (e.g. "8 children killed in Louisiana") scored 88 because the LLM
classified them as "critical" (mass-casualty, 10+ killed) and the 55%
severity weight pushed them into the critical gate. But WorldMonitor is a
geopolitical monitor — domestic tragedies are terrible but not geopolitically
destabilizing.
Prompt change (both ais-relay.cjs + classify-event.ts):
- "critical" now explicitly requires GEOPOLITICAL scope: "events that
  destabilize international order, threaten cross-border security, or
  disrupt global systems"
- Domestic mass-casualty events (mass shootings, industrial accidents) moved
  to "high" — still important, but not critical-sensitivity alerts
- Added counterexamples: "8 children killed in mass shooting in Louisiana →
  domestic mass-casualty → high" and "23 killed in fireworks factory
  explosion → industrial accident → high"
- Retained: "700 killed in Sudan drone strikes → geopolitical mass-casualty
  in active civil war → critical"
Classify cache: v2→v3 (bust stale entries that lack geopolitical scope).
Shadow-log: v4→v5 (clean dataset for recalibration under the scoped prompt).
🤖 Generated with Claude Opus 4.6 via Claude Code
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
1581b2dd70 |
fix(scoring): upgrade LLM classifier prompt + keywords + cache v2 (#3218)
* fix(scoring): upgrade LLM classifier prompt + keywords + cache v2
PR B of the scoring recalibration plan (docs/plans/2026-04-17-002). Builds on
PR A (weight rebalance, PR #3144), which achieved Pearson 0.669. This PR
targets the remaining noise in the 50-69 band, where editorials, tutorials,
and domestic crime score alongside real news.
LLM prompt upgrade (both writers):
- scripts/ais-relay.cjs CLASSIFY_SYSTEM_PROMPT: added per-level guidelines, a
  content-type distinction (editorial/opinion/tutorial → info, domestic
  crime → info, mass-casualty → critical), and concrete counterexamples.
- server/worldmonitor/intelligence/v1/classify-event.ts: the same guidelines
  added to align the second cache writer.
Classify cache bump:
- classify:sebuf:v1: → classify:sebuf:v2: in all three locations
  (ais-relay.cjs classifyCacheKey, list-feed-digest.ts enrichWithAiCache,
  _shared.ts CLASSIFY_CACHE_PREFIX). Old v1 entries expire naturally (24h
  TTL). All items reclassified within 15 min of the Railway deploy.
Keyword additions (_classifier.ts):
- HIGH: 'sanctions imposed', 'sanctions package', 'new sanctions' (phrase
  patterns — no false positives on 'sanctioned 10 individuals')
- MEDIUM: promoted 'ceasefire' from LOW to reduce prompt/keyword misalignment
  during the cold-cache window
Shadow-log v4:
- Clean dataset for post-prompt-change recalibration. v3 rolls off via the
  7-day TTL.
Deploy order: Railway first (seedClassify prewarms the v2 cache immediately),
then Vercel. The first ~15 min of v4 may carry stale digest-cached scores.
🤖 Generated with Claude Opus 4.6 via Claude Code + Compound Engineering v2.49.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(scoring): align classify-event prompt + remove dead keywords + update v4 docs
Review findings on PR #3218:
P1: the classify-event.ts prompt was missing 2 counterexamples and the
"Focus" line present in the relay prompt. Both writers share the
classify:sebuf:v2 cache, so differing prompts mean nondeterministic
classification depending on which path writes first. Now both prompts have
identical level guidelines and counterexamples (the format differs: array vs
single object, but the classification logic is aligned).
P2: Removed 3 dead phrase-pattern keywords (sanctions imposed/package/new) —
the existing 'sanctions' entry already substring-matches all of them and maps
to the same (high, economic). Just dead code that misled readers into
thinking they added coverage.
P2: Updated stale v3 references in cache-keys.ts (doc block + exported
constant) and shadow-score-report.mjs header to v4.
--------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
4853645d53 |
fix(brief): switch carousel to @vercel/og on edge runtime (#3210)
* fix(brief): switch carousel to @vercel/og on edge runtime
Every attempt to ship the Phase 8 Telegram carousel on Vercel's Node
serverless runtime has failed at cold start:
- PR #3174 direct satori + @resvg/resvg-wasm: the Vercel edge bundler refused
  the `?url` asset import required by resvg-wasm.
- PR #3174 (fix) direct satori + @resvg/resvg-js native binding: the Node
  runtime accepted it, but Vercel's nft tracer does not follow
  @resvg/resvg-js/js-binding.js's conditional
  `require('@resvg/resvg-js-<platform>-<arch>-<libc>')` pattern, so the
  linux-x64-gnu peer package was never bundled. Cold start threw
  MODULE_NOT_FOUND, the isolate crashed, FUNCTION_INVOCATION_FAILED on every
  request including OPTIONS, and Telegram reported WEBPAGE_CURL_FAILED with
  no other signal.
- PR #3204 added `vercel.json` `functions.includeFiles` to force the binding
  in, but (a) the initial key was a literal path that Vercel micromatch read
  as a character class (PR #3206 fixed), and (b) even with the corrected
  `api/brief/carousel/**` wildcard, the function still 500'd across the
  board. The `functions.includeFiles` path appears honored in the deployment
  manifest but not at runtime for this particular native-binding pattern.
Fix: swap the renderer to @vercel/og's ImageResponse, which is Vercel's
first-party wrapper around satori + resvg-wasm with Vercel-native bundling.
Runs on Edge runtime — matches every other API route in the project. No
native binding, no includeFiles, no nft tracing surprises. Cold start
~300ms, warm ~30ms.
Changes:
- server/_shared/brief-carousel-render.ts: replace renderCarouselPng
  (Uint8Array) with renderCarouselImageResponse (ImageResponse). Drop the
  ensureLibs + satori + @resvg/resvg-js dynamic-import dance. Keep the
  layout builders (buildCover/buildThreads/buildStory) and font loading
  unchanged — the Satori object trees are wire-compatible with
  ImageResponse.
- api/brief/carousel/[userId]/[issueDate]/[page].ts: flip
  `runtime: 'nodejs'` -> `runtime: 'edge'`. Delegate rendering to the
  renderer's ImageResponse and return it directly; the error path is still
  503 no-store so CDN + Telegram don't pin a bad render.
- vercel.json: drop the now-useless `functions.includeFiles` block.
- package.json: drop the direct `@resvg/resvg-js` and `satori` deps (both
  now bundled inside @vercel/og).
- tests/deploy-config.test.mjs: replace the native-binding regression guards
  with an assertion that no `functions` block exists (with a comment
  pointing at the skill documenting the micromatch gotcha for future
  routes).
- tests/brief-carousel.test.mjs: updated comment references.
Verified:
- typecheck + typecheck:api clean
- test:data 5814/5814 pass
- node -e test: @vercel/og imports cleanly in Node (tests that reach through
  the renderer file no longer depend on native bindings)
Post-deploy validation:
  curl -I -H "User-Agent: TelegramBot (like TwitterBot)" \
    "https://www.worldmonitor.app/api/brief/carousel/<uid>/<slot>/0"
  # Expect: HTTP/2 403 (no token) or 200 (valid token)
  # NOT: HTTP/2 500 FUNCTION_INVOCATION_FAILED
Then tail the Railway digest logs on the next tick; the `[digest] Telegram
carousel 400 ... WEBPAGE_CURL_FAILED` line should stop appearing, and the
3-image preview should actually land on Telegram.
* Add renderer smoke test + fix Cache-Control duplication
Reviewer flagged residual risk: no dedicated carousel-route smoke test for
the @vercel/og path. Adds one, and catches a real bug in the process.
Findings during test-writing:
1. @vercel/og's ImageResponse runs CLEANLY in Node via tsx — the comment in
   brief-carousel.test.mjs saying "we can't test the render in Node" was
   true for direct satori + @resvg/resvg-wasm but no longer holds after PR
   #3210. Pure Node render works end-to-end: satori tree-parse, jsdelivr
   font fetch, resvg-wasm init, PNG output. ~850ms first call, ~20ms warm.
2. ImageResponse sets its own default `Cache-Control: public, immutable,
   no-transform, max-age=31536000`. Passing Cache-Control via the
   constructor's headers option APPENDS rather than overrides, producing a
   duplicated comma-joined value like `public, immutable, no-transform,
   max-age=31536000, public, max-age=60` on the Response. The route handler
   was doing exactly this via extraHeaders.
   Fix: drop our Cache-Control override and rely on @vercel/og's 1-year
   immutable default — the envelope is only immutable for its 7d Redis TTL,
   so the effective ceiling is 7d anyway (after that the route 404s before
   render).
Changes:
- tests/brief-carousel.test.mjs: 6 new assertions under
  `renderCarouselImageResponse`:
  * renders cover / threads / story pages, each returning a valid PNG
    (magic bytes + size range)
  * rejects a structurally empty envelope
  * threads non-cache extraHeaders onto the Response
  * pins @vercel/og's Cache-Control default so it survives caller-supplied
    Cache-Control overrides (regression guard for the bug fixed in this
    commit)
- api/brief/carousel/[userId]/[issueDate]/[page].ts: remove the stacked
  Cache-Control; lean on the @vercel/og default. Drop the now-unused
  `PAGE_CACHE_TTL` constant. A comment explains why.
Verified:
- test:data 5820/5820 pass (was 5814, +6 smoke)
- typecheck + typecheck:api clean
- Render smoke: cover 825ms / threads 23ms / story 16ms first run (wasm init
  dominates first render)
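A minimal sketch of the edge route shape after the swap (the layout builder,
dimensions, and envelope plumbing are assumptions; only ImageResponse and
runtime: 'edge' are confirmed by the commit):

  // Illustrative TypeScript for a Vercel Edge route returning @vercel/og output.
  import { ImageResponse } from '@vercel/og';
  export const config = { runtime: 'edge' };
  declare function buildCover(envelope: unknown): any;           // existing Satori-tree builder (assumed shape)
  declare function loadEnvelope(req: Request): Promise<unknown>; // assumed helper
  export default async function handler(req: Request): Promise<Response> {
    try {
      const envelope = await loadEnvelope(req);
      return new ImageResponse(buildCover(envelope), { width: 1080, height: 1350 });
    } catch {
      // 503 no-store so CDN + Telegram don't pin a bad render
      return new Response('render unavailable', {
        status: 503,
        headers: { 'Cache-Control': 'no-store' },
      });
    }
  }
|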
||
|
|
38e6892995 |
fix(brief): per-run slot URL so same-day digests link to distinct briefs (#3205)
* fix(brief): per-run slot URL so same-day digests link to distinct briefs
Digest emails at 8am and 1pm on the same day pointed to byte-identical
magazine URLs because the URL was keyed on YYYY-MM-DD in the user tz.
Each compose run overwrote the single daily envelope in place, and the
composer rolling 24h story window meant afternoon output often looked
identical to morning. Readers clicking an older email got whatever the
latest cron happened to write.
Slot format is now YYYY-MM-DD-HHMM (local tz, per compose run). The
magazine URL, carousel URLs, and Redis key all carry the slot, and each
digest dispatch gets its own frozen envelope that lives out the 7d TTL.
envelope.data.date stays YYYY-MM-DD for rendering "19 April 2026".
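A minimal sketch of the slot derivation (the helper name and formatting
approach are assumptions; the YYYY-MM-DD-HHMM local-tz shape is per the
commit):

  // Illustrative TypeScript: derive the per-run slot in the user's timezone.
  function buildIssueSlot(now: Date, timeZone: string): string {
    const parts = new Intl.DateTimeFormat('en-CA', {
      timeZone, year: 'numeric', month: '2-digit', day: '2-digit',
      hour: '2-digit', minute: '2-digit', hourCycle: 'h23',
    }).formatToParts(now);
    const get = (type: string) => parts.find(p => p.type === type)?.value ?? '00';
    // e.g. 2026-04-19-1300 for a 1pm compose run
    return `${get('year')}-${get('month')}-${get('day')}-${get('hour')}${get('minute')}`;
  }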
The digest cron also writes a brief:latest:{userId} pointer (7d TTL,
overwritten each compose) so the dashboard panel and share-url endpoint
can locate the most recent brief without knowing the slot. The
previous date-probing strategy does not work once keys carry HHMM.
No back-compat for the old YYYY-MM-DD format: the verifier rejects it,
the composer only ever writes the new shape, and any in-flight
notifications signed under the old format will 403 on click. Acceptable
at the rollout boundary per product decision.
* fix(brief): carve middleware bot allowlist to accept slot-format carousel path
BRIEF_CAROUSEL_PATH_RE in middleware.ts was still matching only the
pre-slot YYYY-MM-DD segment, so every slot-based carousel URL emitted
by the digest cron (YYYY-MM-DD-HHMM) would miss the social allowlist
and fall into the generic bot gate. Telegram/Slack/Discord/LinkedIn
image fetchers would 403 on sendMediaGroup, breaking previews for the
new digest links.
CI missed this because tests/middleware-bot-gate.test.mts still
exercised the old /YYYY-MM-DD/ path shape. Swap the fixture to the
slot format and add a regression asserting the pre-slot shape is now
rejected, so legacy links cannot silently leak the allowlist after
the rollout.
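A minimal sketch of the tightened matcher (the real BRIEF_CAROUSEL_PATH_RE
may differ in anchors and segment rules):

  // Illustrative TypeScript: the slot segment is YYYY-MM-DD-HHMM; the
  // pre-slot YYYY-MM-DD shape must no longer match (regression-guarded in CI).
  const BRIEF_CAROUSEL_PATH_RE =
    /^\/api\/brief\/carousel\/[^/]+\/\d{4}-\d{2}-\d{2}-\d{4}\/\d+$/;
  BRIEF_CAROUSEL_PATH_RE.test('/api/brief/carousel/u1/2026-04-19-1300/0'); // true
  BRIEF_CAROUSEL_PATH_RE.test('/api/brief/carousel/u1/2026-04-19/0');      // false (legacy shape)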
* fix(brief): preserve caller-requested slot + correct no-brief share-url error
Two contract bugs in the slot rollout that silently misled callers:
1. GET /api/latest-brief?slot=X where X has no envelope was returning
{ status: 'composing', issueDate: <today UTC> } — which reads as
"today's brief is composing" instead of "the specific slot you
asked about doesn't exist". A caller probing a known historical
slot would get a completely unrelated "today" signal. Now we echo
the requested slot back (issueSlot + issueDate derived from its
date portion) when the caller supplied ?slot=, and keep the
UTC-today placeholder only for the no-param path.
2. POST /api/brief/share-url with no slot and no latest-pointer was
falling into the generic invalid_slot_shape 400 branch. That is
not an input-shape problem; it is "no brief exists yet for this
user". Return 404 brief_not_found — the same code the
existing-envelope check returns — so callers get one coherent
contract: either the brief exists and is shareable, or it doesn't
and you get 404.
|
||
|
|
63464775a5 |
feat(supply-chain): scenario UX — rich banner + projected score + faster poll (#3193)
* feat(supply-chain): rich scenario banner + projected score per chokepoint + faster poll
User reported Simulate Closure adds only a thin banner with no context —
"not clear what value user is getting, takes many many seconds". Four
targeted UX improvements in one PR:
A. Rich banner (scenario params + tagline)
Banner now reads:
⚠ Hormuz Tanker Blockade · 14d · +110% cost
CN 100% · IN 84% · TW 82% · IR 80% · US 39%
Simulating 14d / 100% closure / +110% cost on 1 chokepoint.
Chokepoint card below shows projected score; map highlights…
Surfaces the scenario template fields (durationDays, disruptionPct,
costShockMultiplier) + a one-line explainer so a first-time user
understands what "CN 100%" actually means.
B. Projected score on each affected chokepoint card
Card header now shows: `[current]/100 → [projected]/100` with a red
trailing badge + red left border on the card body.
Body prepends: "⚠ Projected under scenario: X% closure for N days
(+Y% cost)".
Projected = max(current, template.disruptionPct) — conservative
floor since the real scoring mixes threat + warnings + anomaly.
C. Faster polling
Status poll interval 2s → 1s. Max iterations 30→60 (unchanged 60s
budget). Worker processes in <1s; perceived latency drops from
2–3s to <2s in the common case. First poll still immediate.
D. ScenarioResult interface widened
Added optional `template` and `currentDisruptionScores` fields in
scenario-templates.ts to match what the scenario-worker already
emits. Optional = backward-compat with map-only consumers.
Dependent on PR #3192 (already merged) which fixed the 10000% banner
% inflation.
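A minimal sketch of the section-B projection floor (field names follow the
description; the real card code does more):

  // Illustrative TypeScript: conservative projected score for an affected card.
  interface Chokepoint { disruptionScore: number }
  interface ScenarioTemplate { disruptionPct: number; durationDays: number; costShockMultiplier: number }
  function projectedScore(cp: Chokepoint, template: ScenarioTemplate): number {
    // A floor, not a model: the real scoring mixes threat + warnings + anomaly.
    return Math.max(cp.disruptionScore, template.disruptionPct);
  }
  // Later refinement in this PR: render the "current → projected" arrow only
  // when template.disruptionPct > cp.disruptionScore, so an already-elevated
  // chokepoint doesn't display a misleading zero-effect arrow.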
* fix(supply-chain): trigger render() on scenario activate/dismiss — cards must re-render
PR review caught a real bug in the new scenario UX: showScenarioSummary
and hideScenarioSummary were mutating the banner DOM directly without
triggering render(). renderChokepoints() reads activeScenarioState to
paint the projected score + red border + callout, but those only run
during render() — so the cards stayed stale on activate AND on dismiss
until some unrelated re-render happened.
Refactor to split public API from internal rendering:
- showScenarioSummary(scenarioId, result) — now just sets state + calls
render(). Was: set state + inline DOM mutation (bypassing card render).
- renderScenarioBanner() — new private helper that builds the banner
DOM from activeScenarioState. Called from render()'s postlude
(replacing the old self-recursive showScenarioSummary() call — which
only worked because it had a side-effectful early-exit path that
happened to terminate, but was a latent recursion risk).
- hideScenarioSummary() — now just sets state=null + calls render().
Was: clear state + manual banner removal + manual button-text reset
loop. The button loop is redundant now — the freshly-rendered card
template produces buttons with default "Simulate Closure" text by
construction.
Net effect: activating a scenario paints the banner AND the affected
chokepoint cards in a single render tick. Dismissing strips both in
the same tick.
* fix(supply-chain): derive scenario button state from activeScenarioState, not imperative mutation
PR review caught: the earlier re-render fix (showScenarioSummary → render())
correctly repaints cards on activate, but the button-state logic in
runScenario() is now wrong. render() detaches the old btn reference, so
the post-onScenarioActivate `resetButton('Active') + btn.disabled = true`
touches a detached node and no-ops (resetButton() explicitly skips
!btn.isConnected). The fresh button painted by render() uses the default
template text — visible button reads "Simulate Closure" enabled, and users
can queue duplicate runs of an already-active scenario.
Fix: make button state a function of panel state.
- renderChokepoints() scenario section: check
activeScenarioState.scenarioId === template.id and, when matched, emit
the button with class `sc-scenario-btn--active`, text "Active", and
`disabled` attribute. On dismiss, the next render strips those
automatically — same pattern as the card projection styling.
- runScenario(): drop the dead `resetButton('Active')` + `btn.disabled`
lines after onScenarioActivate. That path is now template-driven;
touching the detached btn was the defect.
Catch-path resets ('Simulate Closure' on abort, 'Error — retry' on real
error) are unchanged — those fire BEFORE any render could detach the btn,
so the imperative path is still correct there.
* fix(supply-chain): hide scenario projection arrow when current already ≥ template
Greptile P1: projected badge was rendered as `N/100 → N/100` whenever
current disruptionScore already met or exceeded template.disruptionPct.
Visible for Suez (80%) or Panama (50%) scenarios when a chokepoint is
already elevated — read as "scenario has zero effect", which is misleading.
The two values live on different scales — cp.disruptionScore is a
computed risk score (threat + warnings + anomaly) while
template.disruptionPct is "% of capacity blocked" — but they share the
0–100 axis so directional comparison is still meaningful for the
"does this scenario escalate things?" signal.
Fix: arrow only renders when template.disruptionPct > cp.disruptionScore.
When current already equals or exceeds the scenario level, show the
single current badge. The card's red left border + "⚠ Projected under
scenario" callout still indicate the card is the scenario target —
only the escalation arrow is suppressed.
|
||
|
|
d8e479188a |
fix(supply-chain): don't CDN-cache empty chokepoint-history responses (#3189)
User reported "Transit history unavailable" persisting for Hormuz after PR
#3187 deployed. A direct Redis probe confirms
supply_chain:transit-summaries:history:v1:hormuz_strait has 174 entries. A
direct server-side curl to /api/supply-chain/v1/get-chokepoint-history also
returns 174. But the user's browser kept receiving
`{"history":[],"fetchedAt":"0"}`.
Root cause: gateway cache tier `slow` pins 200 responses for 30 min at the
Cloudflare edge (s-maxage=1800). During the gap between the Vercel instant
deploy and the Railway ais-relay redeploy + first transit-summary cron tick
(~20 min), the per-id history keys were absent, so the handler returned
empty. Those empty bodies got CF-cached and kept serving for 30 min AFTER
the keys were populated in Redis. Bab-el-Mandeb (which DOES render for the
user) got a fresh non-empty cache entry; Hormuz got stuck with the empty
one.
Fix: when returning empty (missing key, invalid id, error), call
markNoCacheResponse(ctx.request) so the gateway sets Cache-Control: no-store
instead of the 30-min tier cache. Every call on an empty state re-checks
Redis. Once data is present, the normal tier cache applies on the non-empty
response.
Mechanism: the gateway at server/gateway.ts:488 honors the X-No-Cache header
via the same side-channel (response-headers.ts). The pattern is already used
by other handlers for upstream-unavailable bodies.
Cost: per-id history keys are ~35KB, edge→Upstash round-trip <1.5s. A slight
Redis-traffic bump for as long as keys stay empty; negligible in practice
(only the deploy window). Also no-caches invalid chokepoint IDs so
scanners/junk IDs don't pin 30-min empties either.
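A minimal sketch of the empty-path guard (markNoCacheResponse and the ctx
side-channel are the repo's existing mechanism; the handler skeleton and
helpers here are assumptions):

  // Illustrative TypeScript: never let an empty body inherit the 30-min tier cache.
  declare function markNoCacheResponse(req: Request): void;
  declare function readHistoryFromRedis(id: string): Promise<unknown[] | null>;
  async function getChokepointHistory(ctx: { request: Request }, id: string) {
    const history = await readHistoryFromRedis(id).catch(() => null);
    if (!history || !history.length) {
      markNoCacheResponse(ctx.request); // gateway emits Cache-Control: no-store
      return { history: [], fetchedAt: '0' };
    }
    return { history, fetchedAt: String(Date.now()) }; // non-empty → normal tier cache
  }
|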
||
|
|
96fca1dc2b |
fix(supply-chain): popup-keyed history re-query + dataAvailable flag (#3187)
* fix(supply-chain): popup-keyed history re-query + dataAvailable flag for partial coverage
Two P1 findings on #3185 post-merge review:
1. MapPopup cross-chokepoint history contamination
   The popup's async history resolve re-queried [data-transit-chart] without
   a cpId key. User opens popup A → fetch starts for cpA; user opens popup B
   before it resolves → cpA's history mounts into cpB's chart container.
   Fix: add data-transit-chart-id keyed by cpId; re-query by it on resolve.
   Mirrors SupplyChainPanel's existing data-chart-cp-id pattern.
2. Partial portwatch coverage still looked healthy
   The previous fix emits all 13 canonical summaries (zero-state fill for
   missing IDs) and records pwCovered in seed-meta, but:
   - get-chokepoint-status still zero-filled missing chokepoints and cached
     the response as healthy — the panel rendered silent empty rows.
   - api/health.js only degrades on recordCount=0, so 10/13 partial read as
     OK despite the UI hiding entire chokepoints.
   Fix:
   - proto: TransitSummary.data_available (field 12). Writer tags with
     Boolean(cpData). The status RPC passes it through; defaults true for
     pre-fix payloads (absence = covered).
   - The status RPC writes seed-meta recordCount as the covered count (not
     shape size), and flips response-level upstreamUnavailable on partial.
   - api/health.js: new minRecordCount field on SEED_META entries + new
     COVERAGE_PARTIAL status (warn rollup). The chokepoints entry declares
     minRecordCount: 13. recordCount < 13 → COVERAGE_PARTIAL.
   - Client (panel + popup): skip stats/chart rendering when !dataAvailable;
     show "Transit data unavailable (upstream partial)" microcopy so users
     understand the gap.
5759/5759 data tests pass. Typecheck + typecheck:api clean.
* fix(supply-chain): guarantee Simulate Closure button exits Computing state
User reports "Simulate Closure does nothing beyond write Computing…" — the
button sticks at Computing forever. Two causes:
1. The scenario worker appears down (0 scenario-result:* keys in Redis in
   the last 24h of 24h TTL). Railway-side — separate intervention needed to
   redeploy scripts/scenario-worker.mjs.
2. The client leaked the "Computing…" state on multiple exit paths:
   - The signal.aborted early-return inside the poll loop never reset the
     button. A second click fired abort on the first → the first returned
     without resetting → the button stayed "Computing…" until the next
     render.
   - The !this.content.isConnected early-return also skipped the reset
     (less user-visible but the same class of bug).
   - The catch block swallowed AbortError without resetting.
   - POST /run had no hard timeout — a hanging edge function left the button
     in Computing indefinitely.
Fix:
- resetButton(text) helper touches the btn only if still connected; applied
  in every exit path (abort, timeout, post-success, catch).
- AbortSignal.any([caller, AbortSignal.timeout(20_000)]) on POST /run.
- console.error on failure so Simulate Closure errors surface in ops.
- The error message includes "scenario worker may be down" on loop timeout
  so operators see the right suspect.
Backend observations (for follow-up):
- The Hormuz backend is healthy (/api/health chokepoints OK, 13 records,
  1 min old; the live RPC has hormuz_strait.riskLevel=critical, wow=-22,
  flowEstimate present; GetChokepointHistory returns 174 entries). The
  user-reported "Hormuz empty" is likely browser/CDN stale cache from before
  PR #3185; a hard refresh should resolve it.
- scenario-worker.mjs has zero result keys in 24h. The Railway service needs
  verification/redeployment.
* fix(scenario): wrong Upstash RPUSH format silently broke every Simulate Closure
The Railway scenario-worker log shows every job failing field validation
since at least 03:06Z today:
  [scenario-worker] Job failed field validation, discarding:
  ["{\"jobId\":\"scenario:1776535792087:cynxx5v4\",...
The leading [" in the payload is the smoking gun. api/scenario/v1/run.ts was
POSTing to /rpush/{key} with body `[payload]`, expecting Upstash to unpack
the array and push one string value. Upstash does NOT parse that form — it
stored the literal `["{...}"]` string as a single list value. The worker
BLMOVEs the literal string → JSON.parse → array → destructuring `{jobId,
scenarioId, iso2}` on an array returns undefined for all three → every job
is discarded without writing a result. The client poll returns `pending` for
the full 60s timeout, then (on the prior client code path) leaked the stuck
"Computing…" button state indefinitely.
Fix: use the standard Upstash REST command format — POST to the base URL
with body `["RPUSH", key, value]`. Matches scripts/ais-relay.cjs
upstashLpush. After this, the scenario-queue:pending list stores the raw
payload string, BLMOVE returns the payload, JSON.parse gives the object,
validation passes, computeScenario runs, the result key gets written, and
the client poll sees `done`. Zero result keys existed in prod Redis in the
last 24h (24h TTL on scenario-result:*) — confirming the fix addresses the
production outage.
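A minimal sketch of the wire-format fix (env var names and the queue key are
assumptions; the command-array form is the one the commit names as standard):

  // Illustrative TypeScript: push one raw payload string onto the pending list.
  async function enqueueScenarioJob(payload: string): Promise<void> {
    // Wrong (the bug): POST `${url}/rpush/scenario-queue:pending` with body
    // JSON.stringify([payload]); Upstash stores the literal '["{...}"]'.
    // Right: command-array form against the base URL.
    const res = await fetch(process.env.UPSTASH_REDIS_REST_URL!, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.UPSTASH_REDIS_REST_TOKEN!}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(['RPUSH', 'scenario-queue:pending', payload]),
    });
    if (!res.ok) throw new Error(`RPUSH failed: ${res.status}`);
  }
|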
||
|
|
3c47c1b222 |
fix(supply-chain): split chokepoint transit data + close silent zero-state cache (#3185)
* fix(supply-chain): split chokepoint transit data + close silent zero-state cache
Production supply-chain panel was rendering 13 empty chokepoints because
the getChokepointStatus RPC silently cached zero-state for 5 minutes:
1. supply_chain:transit-summaries:v1 grew to ~500 KB (180d × 13 × 14 fields
of history per chokepoint).
2. REDIS_OP_TIMEOUT_MS is 1.5 s. Vercel Sydney edge → Upstash for a 500 KB
GET consistently exceeded the budget; getCachedJson caught the AbortError
and returned null.
3. The 500 KB portwatch fallback read hit the same timeout.
4. summaries = {} → every summaries[cp.id] was undefined → 13 chokepoints
got the zero-state default → cached as a non-null success response for
REDIS_CACHE_TTL (5 min) instead of NEG_SENTINEL (120 s).
Fix (one PR, per docs/plans/chokepoint-rpc-payload-split.md):
- ais-relay.cjs: split seedTransitSummaries output.
- supply_chain:transit-summaries:v1 — compact (~30 KB, no history).
- supply_chain:transit-summaries:history:v1:{id} — per chokepoint
(~35 KB each, 13 keys). Both under the 1.5 s Redis read budget.
- New RPC GetChokepointHistory: lazy-loaded on card expand.
- get-chokepoint-status.ts: drop the 500 KB portwatch/corridorrisk/
chokepoint_transits fallback reads. Treat a null transit-summaries
read as upstreamUnavailable=true so cachedFetchJson writes NEG_SENTINEL
(2 min) instead of a 5-min zero-state pin. Omit history from the
response (proto field stays declared; empty array).
- server/_shared/redis.ts: tag AbortError timeouts with [REDIS-TIMEOUT]
key=… timeoutMs=… so log drains / Sentry-Vercel integration pick up
large-payload timeouts instead of them being silently swallowed.
- SupplyChainPanel.ts + MapPopup.ts: lazy-fetch history on card expand
via fetchChokepointHistory; session-scoped cache; graceful "History
unavailable" on empty/error. PRO gating on the map popup unchanged.
- Gateway: cache-tier entry for /get-chokepoint-history (slow).
- Tests: regression guards for upstreamUnavailable gate + per-id key
shape + handler wiring + proto query annotations.
Audit included in plan: no other RPC consumer read stacks >200 KB
besides displacement:summary:v1:2026 (724 KB, same risk, flagged for
follow-up PR). wildfire:fires:v1 at 1.7 MB loads via bootstrap (3 s
timeout, different path) — monitor but out of scope.
Expected impact:
- supply_chain:chokepoints:v4 payload drops from ~508 KB to <100 KB.
- supply_chain:transit-summaries:v1 drops from ~502 KB to <50 KB.
- RPC Redis reads stay well under 1.5 s in the hot path.
- Silent zero-state pinning is now impossible: null reads → 2-min neg
cache → self-heal on next relay tick.
* fix(supply-chain): address PR #3185 review — stop caching empty/error + fix partial coverage
Two P1 regressions caught in review:
1. Client cache poisoning on empty/error (MapPopup.ts, SupplyChainPanel.ts)
Empty-array is truthy in JS, so MapPopup's `!cached && !inflight` branch
never fired once we cached []. Neither `cached && cached.length` fired
either — popup stuck on "Loading transit history..." for the session.
SupplyChainPanel had the explicit `cached && !cached.length` branch but
still never retried, so the same transient became session-sticky there too.
Fix: cache ONLY non-empty successful responses. Empty/error show the
"History unavailable" placeholder but leave the cache untouched, so the
next re-expand retries. The /get-chokepoint-history gateway tier is
"slow" (5-min CF edge cache) → retries stay cheap.
2. Partial portwatch coverage treated as healthy (ais-relay.cjs)
seedTransitSummaries iterated Object.entries(pw), so if seed-portwatch
dropped N of 13 chokepoints (ArcGIS reject/empty), summaries had <13 keys.
get-chokepoint-status upstreamUnavailable fires only on fully-empty
summaries, so the N missing chokepoints fell through to zero-state rows
that got pinned in cache for 5 minutes.
Fix: iterate CANONICAL_IDS (Object.keys(CHOKEPOINT_THREAT_LEVELS)) and
fill zero-state for any ID missing from pw. Shape is consistently 13
keys. Track pwCovered → envelope + seed-meta recordCount reflect real
upstream coverage (not shape size), so health.js can distinguish 13/13
healthy from 10/13 partial. Warn-log on shortfall.
Tests: new regression guards
- panel must NOT cache empty arrays (historyCache.set with []).
- writer must iterate CANONICAL_IDS, not Object.entries(pw).
- seed-meta recordCount binds to pwCovered.
5718/5718 data tests pass. typecheck + typecheck:api clean.
|
||
|
|
55ac431c3f |
feat(brief): public share mirror + in-magazine Share button (#3183)
* feat(brief): public share mirror + in-magazine Share button
Adds the growth-vector piece listed under Future Considerations in the
original brief plan (line 399): a shareable public URL and a one-click
Share button on the reader's magazine.
Problem: the per-user magazine at /api/brief/{userId}/{issueDate} is
HMAC-signed to a specific reader. You cannot share the URL you are
looking at, because the recipient either 403s (bad token) or reads
your personalised issue against your userId. Result: no way to share
the daily brief, no way for readers to drive discovery. Opening a
growth loop requires a separate public surface.
Approach: deterministic HMAC-derived short hash per {userId,
issueDate} backed by a pointer key in Redis.
New files
- server/_shared/brief-share-url.ts
Web Crypto HMAC helper. deriveShareHash returns 12 base64url chars
(72 bits) from (userId, issueDate) using BRIEF_SHARE_SECRET (sketched after this list).
Pointer encode/decode helpers and a shape check. Distinct from the
per-user BRIEF_URL_SIGNING_SECRET so a leak of one does not
automatically unmask the other.
- api/brief/share-url.ts (edge, Clerk auth, Pro gated)
POST /api/brief/share-url?date=YYYY-MM-DD
Idempotently writes brief:public:{hash} pointer with the same 7 day
TTL as the underlying brief, then returns {shareUrl, hash,
issueDate}. 404 if the per-user brief is missing. 503 on Upstash
failure. Accepts an optional refCode in the JSON body for referral
attribution.
- api/brief/public/[hash].ts (edge, unauth)
GET /api/brief/public/{hash}?ref={code}
Reads pointer, reads the real brief envelope, renders with
publicMode=true. Emits X-Robots-Tag: noindex,nofollow so shared
briefs never get enumerated by search engines. 404 on any missing
part (bad hash shape, missing pointer, missing envelope) with a
neutral error page. 503 on Upstash failure.
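A minimal sketch of the hash derivation (the message layout and truncation
order are assumptions; Web Crypto HMAC to 12 base64url chars is per the
description):

  // Illustrative TypeScript for an Edge runtime (Web Crypto, no Node imports).
  async function deriveShareHash(userId: string, issueDate: string, secret: string): Promise<string> {
    const enc = new TextEncoder();
    const key = await crypto.subtle.importKey(
      'raw', enc.encode(secret), { name: 'HMAC', hash: 'SHA-256' }, false, ['sign'],
    );
    const mac = new Uint8Array(await crypto.subtle.sign('HMAC', key, enc.encode(`${userId}:${issueDate}`)));
    const b64 = btoa(String.fromCharCode(...mac));
    // base64url, truncated to 12 chars = 72 bits of the MAC
    return b64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '').slice(0, 12);
  }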
Renderer changes (server/_shared/brief-render.js)
- Signature extended: renderBriefMagazine(envelope, options?)
- options.publicMode: redacts user.name and whyMatters before any
HTML emission; swaps the back cover to a Subscribe CTA; prepends
a Subscribe strip across the top of the deck; omits the Share
button + share script; adds a noindex meta tag.
- options.refCode: appended as ?ref= to /pro links on public views.
- Non-public views gain a sticky .wm-share pill in the top-right
chrome. Inline SHARE_SCRIPT handles the click flow: POST /api/
brief/share-url then navigator.share with clipboard fallback and a
prompt() ancient-browser fallback. User-visible feedback via
data-state on the button (sharing / copied / error). No change to
the envelope contract, no LLM calls, no composer-side work
required.
- Validation runs on the full unredacted envelope first, so the
public path can never accept a shape the private path would reject.
Tests
- tests/brief-share-url.test.mts (18 assertions): determinism,
secret sensitivity, userId/date sensitivity, shape validation, URL
composition with/without refCode, trailing-slash handling on
baseUrl, pointer encode/decode round-trip.
- tests/brief-magazine-render.test.mjs (+13 assertions): Share
button carries the issue date; share script emitted once;
share-url endpoint wired; publicMode strips the button+script,
replaces whyMatters, emits noindex meta, prepends Subscribe strip,
passes refCode through with escaping, swaps the back cover, does
not leak the user name, preserves story headlines, options-less
call matches the empty-options call byte for byte.
- Full typecheck/lint/edge-bundle/test:data/edge-functions suite all
green: 5704/5704 data tests, 171/171 edge-function tests, 0 lint
errors.
Env vars (new)
- BRIEF_SHARE_SECRET: 64+ random hex chars, Vercel (edge) only. NOT
needed by the Railway composer because pointer writes are lazy
(on share, not on compose).
* fix(brief): public share round-trip + magazine Share button without auth
Two P1 findings on #3183 review.
1) Pointer wire format: share-url.ts wrote the pointer as a raw colon-delimited string via SET. The public route reads via readRawJsonFromUpstash which ALWAYS JSON.parses. A bare non-JSON string throws at parse, the route returned 503 instead of resolving. Fix: JSON.stringify on both write sites. Regression test locks the wire format.
2) Share button auth unreachable from a standalone magazine tab: inline script needed window.WM_CLERK_JWT which is never set, endpoint hard-requires Bearer, fallback to credentials:include fails. Fix: derive share URL server-side in the per-user route (same inputs share-url uses), embed as data-share-url, click handler now reads dataset and invokes navigator.share directly. No network, no auth, works in any tab.
The /api/brief/share-url endpoint stays in place for other callers (dashboard panel) with its Clerk auth intact and its pointer write now in the correct format.
QA: typecheck clean, 5708/5708 data tests, 45/45 magazine, 20/20 share-url, edge bundle OK, lint exit 0.
* fix(brief): address remaining review findings on #3183
P0-2 (comment-only): public/[hash].ts inline comment incorrectly described readRawJsonFromUpstash parse-failure behaviour. The helper rethrows on JSON.parse failure, it does not return null. Rewrote the comment to match reality (JSON-encoded wire format, parse-to-string round-trip, intentional 503-on-bug-value as the loud failure mode). The actual wire-format fix was in prior commit
|
||
|
|
81536cb395 |
feat(brief): source links, LLM descriptions, strip suffix (envelope v2) (#3181)
* feat(brief): source links, LLM descriptions, strip publisher suffix (envelope v2)
Three coordinated fixes to the magazine content pipeline.
1. Headlines were ending with " - AP News" / " | Reuters" etc. because the
   composer passed RSS titles through verbatim. Added stripHeadlineSuffix()
   in brief-compose.mjs, a conservative case-insensitive match only when the
   trailing token equals primarySource, so a real subtitle that happens to
   contain a dash still survives.
2. Story descriptions were the headline verbatim. Added
   generateStoryDescription to brief-llm.mjs, plumbed into
   enrichBriefEnvelopeWithLLM: one additional LLM call per story, cached 24h
   on a v1 key covering headline, source, severity, category, country. Cache
   hits are revalidated via parseStoryDescription so a bad row cannot flow
   to the envelope. Falls through to the cleaned headline on any failure.
3. Source attribution was plain text with no outgoing link. Bumped
   BRIEF_ENVELOPE_VERSION to 2 and added BriefStory.sourceUrl. The composer
   now plumbs story:track:v1.link through digestStoryToUpstreamTopStory,
   UpstreamTopStory.primaryLink, filterTopStories, BriefStory.sourceUrl. The
   renderer wraps the Source line in an anchor with target=_blank,
   rel=noopener noreferrer, and UTM params (utm_source=worldmonitor,
   utm_medium=brief, utm_campaign=<issueDate>, utm_content=story-<rank>).
   UTM appending is idempotent; publisher-attributed URLs keep their own
   utm_source. Envelope validation gains a validateSourceUrl step
   (https/http only, no userinfo credentials, parseable absolute URL).
   Stories without a valid upstream link are dropped by filterTopStories
   rather than shipping with an unlinked source.
Tests: 30 renderer tests to 38; new assertions cover UTM presence on every
anchor, HTML-escaping of ampersands in hrefs, pre-existing UTM preservation,
and all four validator rejection modes. New composer tests cover suffix
stripping, link plumb-through, and v2 drop-on-no-link behaviour. New LLM
tests for generateStoryDescription cover cache hit/miss, revalidation of bad
rows, 24h TTL, and null-on-failure.
* fix(brief): v1 back-compat window on renderer + consolidate story hash helper
Two P1/P2 review findings on #3181.
P1 (v1 back-compat). Bumping BRIEF_ENVELOPE_VERSION from 1 to 2 made every
v1 envelope still resident in Redis under the 7-day TTL fail
assertBriefEnvelope. The hosted /api/brief route would 404 "expired" and the
/api/latest-brief preview would downgrade to "composing", breaking
already-issued links from the preceding week.
Fix: the renderer now accepts SUPPORTED_ENVELOPE_VERSIONS = Set([1, 2]) on
READ. BRIEF_ENVELOPE_VERSION stays at 2 and is the only version the composer
ever writes. BriefStory.sourceUrl is required when version === 2 and absent
on v1; when rendering a v1 story the source line degrades to plain text (no
anchor), matching pre-v2 appearance. When the TTL window passes, the set can
shrink to [2] in a follow-up.
P2 (hash dedup). hashStoryDescription was byte-identical to hashStory,
inviting silent drift if one prompt gains a field the other forgets.
Consolidated into hashBriefStory. Cache-key separation remains via the
distinct prefixes (brief:llm:whymatters:v2: / brief:llm:description:v1:).
Tests: adds 3 v1 back-compat assertions (plain source line, field validation
still runs, defensive sourceUrl check), updates the version-mismatch
assertion to match the new supported-set message. 161/161 pass (was 158).
Full test:data 5706/5706.
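A minimal sketch of the validateSourceUrl contract (the four rejection modes
are per the description; exact placement in validation is assumed):

  // Illustrative TypeScript mirroring the validator described above.
  function validateSourceUrl(raw: unknown): boolean {
    if (typeof raw !== 'string') return false;                               // 1. not a string
    let url: URL;
    try { url = new URL(raw); } catch { return false; }                      // 2. not a parseable absolute URL
    if (url.protocol !== 'https:' && url.protocol !== 'http:') return false; // 3. https/http only
    if (url.username || url.password) return false;                          // 4. no userinfo credentials
    return true;
  }
|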
||
|
|
8fc302abd9 |
fix(brief): mobile layout — stack story callout, floor digest typography (#3180)
* fix(brief): mobile layout — stack story callout, floor digest typography
On viewports <=640px the 55/45 story grid cramped both the headline and the
"Why this is important" callout to ~45% width each, and several digest rules
used raw vw units (blockquote 2vw, threads 1.55vw) that collapsed to ~7-8px
on a 393px iPhone frame before the browser min-clamped them to barely
readable.
Appends a single @media (max-width: 640px) block to the renderer's
STYLE_BLOCK:
- .story becomes a flex column — the callout stacks under the headline, no
  column squeeze. The headline goes full-width at 9.5vw.
- Digest blockquote, threads, signals, and stat rows get max(Npx, Nvw)
  floors so they never render below ~15-17px regardless of viewport.
- The running-head stacks on digest pages and the absolute page-number gets
  right-hand clearance so they stop overlapping.
- Tags and source labels are pinned to 11px (they were scaling down with
  vw).
CSS-only; no envelope, no HTML structure, no new classes. All 30
renderBriefMagazine tests still pass.
* fix(brief): raise mobile digest px floors and running-head clearance
Two P2 findings from PR review on #3180:
1. .digest .running-head padding-right: 18vw left essentially zero clearance
   from the absolute .page-number block on iPhone SE (375px) and common
   Android (360px). Bumped to 22vw (~79px at 360px), which accommodates
   "09 / 12" in IBM Plex Mono at the right:5vw offset with a one-vw safety
   margin.
2. The mobile overrides were lowering base-rule px floors (thread 17px to
   15px, signal 18px to 15px). On viewports <375px this rendered digest body
   text smaller than on desktop. Kept the px floors at or above the base
   rules so the effective size only ever goes up on mobile. |
||
|
|
6f6102e5a7 |
feat(brief): swap sienna rust for two-strength WM mint (Option B palette) (#3178)
* feat(brief): swap sienna rust for two-strength WM mint (Option B palette)
The only off-brand color in the product was the brief's sienna rust
(#8b3a1f) accent. Every other surface — /pro landing, dashboard,
dashboard panels — uses the WM mint green (#4ade80). Swapping the
brief's accent to the brand mint makes the magazine read as a sibling
of /pro rather than a separate editorial product, while keeping the
magazine-grade serif typography and even/odd page inversion intact.
Implementation (user picked Option B from brief-palette-playground.html):
--sienna : #8b3a1f -> #3ab567 muted mint for LIGHT pages (readable
on #fafafa without the bright-mint
glare of a pure-brand swap)
--mint : + #4ade80 bright WM mint for DARK pages
(matches /pro exactly)
--cream : #f1e9d8 -> #fafafa unified with --paper; one crisp white
--cream-ink: #1a1612 -> #0a0a0a crisper contrast on the new paper
Accent placement (unchanged structurally — only colors swapped):
- Digest running heads, labels, blockquote rule, stats dividers,
end-marker rule, signal/thread tags: all muted mint on light
- Story source line: newly mint (was unstyled bone/ink at 0.6 opacity);
two-strength — muted on light stories, bright on dark
- Logo ekg dot: mint on every page so the brand 'signal' pulse
threads through the whole magazine
No layout changes. No HTML structure changes. Only color constants +
a ~20-line CSS addition for story-source + ekg-dot accents.
165/165 brief tests pass (renderer contract unchanged — envelope shape
identical, only computed styles differ). Both tsconfigs typecheck clean.
* fix(brief): darken light-page mint to pass WCAG AA + fix digest ekg-dot
Two P2 findings on PR #3178 review.
1. Digest ekg-dot used bright #4ade80 on a #fafafa background,
contradicting the code comment that said 'light pages use the
muted mint'. The rule was grouped with .cover and .story.dark
(both ink backgrounds) when it should have been grouped with
.story.light (paper background). Regrouped.
2. #3ab567 on #fafafa tests at ~2.31:1 — fails WCAG AA 4.5:1 for
every text size and fails the 3:1 large-text floor. The PR called
this a rollback trigger; contrast math says it would fail every
meaningful text usage (mono running heads, source lines, labels,
footer captions). Swapped --sienna from #3ab567 to #1f7a3f —
tested at ~4.90:1 on #fafafa, passes AA for normal text.
Kept the variable name '--sienna' for backwards compat (every
.digest rule references it). The hue stays recognisably mint-
family (green dominant) so the brand relationship with #4ade80
on dark pages is still clear to a reader. Dark-page mint is
unchanged — #4ade80 on #0a0a0a is ~11.4:1, passes AAA.
Playground (brief-palette-playground.html) updated to match so
future iterations work against the accessible value.
165/165 brief tests pass. Both tsconfigs typecheck clean.
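The contrast figures above follow the standard WCAG 2.x relative-luminance formula; a small sketch (helper names assumed):

```ts
// Hedged sketch of the WCAG 2.x contrast check behind the numbers above.
function relativeLuminance(hex: string): number {
  // hex is "#rrggbb"; linearize each sRGB channel, then weight.
  const [r, g, b] = [1, 3, 5].map((i) => {
    const c = parseInt(hex.slice(i, i + 2), 16) / 255;
    return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(fg: string, bg: string): number {
  const [hi, lo] = [relativeLuminance(fg), relativeLuminance(bg)]
    .sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// contrastRatio("#1f7a3f", "#fafafa") clears the 4.5:1 AA threshold
// for normal text; contrastRatio("#3ab567", "#fafafa") fails it.
```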

b5824d0512
feat(brief): Phase 9 / Todo #223 — share button + referral attribution (#3175)
* feat(brief): Phase 9 / Todo #223 — share button + referral attribution
Adds a Share button to the dashboard Brief panel so PRO users can
spread WorldMonitor virally. Built on the existing referral plumbing
(registrations.referralCode + referredBy fields; api/register-interest
already passes referredBy through) — this PR fills in the last mile: a
stable referral code for signed-in Clerk users, a share URL, and a
client-side share sheet.
Files:
server/_shared/referral-code.ts (new)
Deterministic 8-char hex code: HMAC(BRIEF_URL_SIGNING_SECRET,
'referral:v1:' + userId). Same Clerk userId always produces the same
code. No DB write on login, no schema migration, stable for the life
of the account.
api/referral/me.ts (new)
GET -> { code, shareUrl, invitedCount }. Bearer-auth via Clerk.
Reuses BRIEF_URL_SIGNING_SECRET to avoid another Railway env var.
Stats fail gracefully to 0 on Convex outage.
convex/registerInterest.ts + convex/http.ts
New internal query getReferralStatsByCode({referralCode}) counts
registrations rows that named this code as their referredBy. Exposed
via POST /relay/referral-stats (RELAY_SHARED_SECRET auth).
src/services/referral.ts (new)
getReferralProfile: 5-min cache, profile is effectively immutable.
shareReferral: Web Share API primary (mobile native sheet), clipboard
fallback on desktop. Returns 'shared'/'copied'/'blocked'/'error'.
AbortError is treated as 'blocked', not failure. clearReferralCache
for account-switch hygiene.
src/components/LatestBriefPanel.ts + src/styles/panels.css
New share row below the brief cover card. Button disabled until
/api/referral/me resolves; if fetch fails the row removes itself.
invitedCount > 0 renders as 'N invited' next to the button. Referral
cache invalidated alongside Clerk token cache on account switch
(otherwise user B would see user A's share link for 5 min).
Tests: 10 new cases in tests/brief-referral-code.test.mjs
- getReferralCodeForUser: hex shape, determinism, uniqueness,
secret-rotation invalidates, input guards
- buildShareUrl: path shape, trailing-slash trim, URL-encoding
153/153 brief + deploy tests pass. Both tsconfigs typecheck clean.
Attribution flow (already working end-to-end):
1. Share button -> worldmonitor.app/pro?ref={code}
2. /pro landing page already reads ?ref= and passes to
/api/register-interest as referredBy
3. convex registerInterest:register increments the referrer's
referralCount and stores referredBy on the new row
4. /api/referral/me reads the count back via the relay query
5. 'N invited' updates on next 5-min cache refresh
Scope boundaries (deferred):
- Convex conversion tracking (invited -> PRO subscribed). Needs a
join from registrations.referredBy to subscriptions.userId via email.
Surface 'N converted' in a follow-up.
- Referral-credit / reward system: viral loop works today, reward
logic is a separate product decision.
* fix(brief): address three P2 review findings on #3175
- api/referral/me.ts JSDoc said '503 if REFERRAL_SIGNING_SECRET is
not configured' but the handler actually reads
BRIEF_URL_SIGNING_SECRET. Updated the docstring so an operator
chasing a 503 doesn't look for an env var that doesn't exist.
- server/_shared/referral-code.ts carried a RESERVED_CODES Set to
avoid collisions with URL-path keywords ('index', 'robots', 'admin').
The guard is dead code: the code alphabet is [0-9a-f] (hex output of
the HMAC) so none of those non-hex keywords can ever appear. Removed
the Set + the while loop; left a comment explaining why it was
unnecessary so nobody re-adds it.
- src/components/LatestBriefPanel.ts passed disabled: 'true' (string)
to the h() helper. DOM-utils' h() calls setAttribute for unknown
props, which does disable the button — but it's inconsistent with the
later .disabled = false property write. Fixed to the boolean
disabled: true so the attribute and the IDL property agree.
10/10 referral-code tests pass. Both tsconfigs typecheck clean.
* fix(brief): address two review findings on #3175 — drop misleading count + fix user-agnostic cache
P1: invitedCount wired to the wrong attribution store. The share URL
is /pro?ref=<code>. On /pro the 'ref' feeds Dodopayments checkout
metadata (affonso_referral), NOT registrations.referredBy.
/api/referral/me counted only the waitlist path, so the panel would
show 0 invited for anyone who converted direct-to-checkout —
misleading. Rather than ship a count that measures only one of two
attribution paths (and the less-common one at that), the count is
removed entirely. The share button itself still works. A proper
metric requires unifying the waitlist + Dodo-metadata paths into a
single attribution store, which is a follow-up.
Changes:
- api/referral/me.ts: response shape is { code, shareUrl } — no
invitedCount / convertedCount
- convex/registerInterest.ts: removed getReferralStatsByCode internal
query
- convex/http.ts: removed /relay/referral-stats route
- src/services/referral.ts: ReferralProfile interface no longer has
invitedCount; fetch call unchanged in behaviour
- src/components/LatestBriefPanel.ts: dropped the 'N invited' render
branch
P2: referral cache was user-agnostic. Module-global _cached had no
userId key, so a stale cache primed by user A would hand user B user
A's share link for up to 5 min after an account switch — if no panel
is mounted at the transition moment to call clearReferralCache().
Per the reviewer's point, this is a real race. Fix: two-part.
(a) Cache entry carries the userId it was computed for; reads check
the current Clerk userId and only accept hits when they match.
Mismatch → drop + re-fetch.
(b) src/services/referral.ts self-subscribes to auth-state at module
load (ensureAuthSubscription). On any id transition _cached is
dropped. Module-level subscription means the invalidation works even
when no panel is currently mounted.
(c) Belt-and-suspenders: post-fetch, re-check the current user before
caching. Protects against account switches that happen mid-flight
between 'read cache → ask network → write cache'.
Panel's local clearReferralCache() call removed — module now
self-invalidates.
10/10 referral-code tests pass. Both tsconfigs typecheck clean.
* fix(referral): address P1 review finding — share codes now actually credit the sharer
The earlier head generated 8-char Clerk-derived HMAC codes for the
share button, but the waitlist register mutation only looked up
registrations.by_referral_code (6-char email-generated codes). Codes
issued by the share button NEVER resolved to a sharer — the 'referral
attribution' half of the feature was non-functional.
Fix (schema-level, honest attribution path):
convex/schema.ts
- userReferralCodes { userId, code, createdAt } + by_code, by_user
- userReferralCredits { referrerUserId, refereeEmail, createdAt }
+ by_referrer, by_referrer_email
convex/registerInterest.ts
- register mutation: after the existing registrations.by_referral_code
lookup, falls through to userReferralCodes.by_code. On match, inserts
a userReferralCredits row (the Clerk user has no registrations row to
increment, so credit needs its own table). Dedupes by
(referrer, refereeEmail) so returning visitors can't double-credit.
- new internalMutation registerUserReferralCode({userId, code}):
idempotent binding of a code to a userId. Collisions logged and
ignored (keeps first writer).
convex/http.ts
- new POST /relay/register-referral-code (RELAY_SHARED_SECRET auth)
that runs the mutation above.
api/referral/me.ts
- signature gains a ctx.waitUntil handle
- after generating the user's code, fire-and-forget POSTs to
/relay/register-referral-code so the binding is live by the time
anyone clicks a shared link. Idempotent — a failure just means the
NEXT call re-registers.
Still deferred: display of 'N credited' / 'N converted' in the
LatestBriefPanel. The waitlist side now resolves correctly, but the
Dodopayments checkout path (/pro?ref=<code> → affonso_referral) is
tracked in Dodo, not Convex. Surfacing a unified count requires a
separate follow-up to pull Dodo metadata into Convex.
Regression tests (3 new cases in tests/brief-referral-code.test.mjs):
- register mutation extends to userReferralCodes + inserts credits
- schema declares both tables with the right indexes
- /api/referral/me registers the binding via waitUntil
13/13 referral tests pass. Both tsconfigs typecheck clean.
* fix(referral): address two P1 review findings — checkout attribution + dead-link prevention
P1: share URL didn't credit on the /pro?ref= checkout path. The
earlier PR wired Clerk codes into the waitlist path
(/api/register-interest -> userReferralCodes -> userReferralCredits)
but a visitor landing on /pro?ref=<code> and going straight to Dodo
checkout forwarded the code only into Dodo metadata
(affonso_referral). Nothing on our side credited the sharer.
Fix: convex/payments/subscriptionHelpers.ts handleSubscriptionActive
now reads data.metadata.affonso_referral when inserting a NEW
subscription row. If the code resolves in userReferralCodes, a
userReferralCredits row crediting the sharer is inserted (deduped by
(referrer, refereeEmail) so webhook replays don't double-credit). The
credit only lands on first-activation — the else-branch of the
existing/new split guards against replays.
P1: /api/referral/me returned 200 + share link even when the
(code, userId) binding failed.
ctx.waitUntil(registerReferralCodeInConvex(...)) ran the binding
asynchronously, swallowing missing env + non-2xx + network errors.
Users got a share URL that the waitlist lookup could never resolve —
dead link.
Fix: registerReferralCodeInConvex is now BLOCKING (throws on any
failure) and the handler awaits it before returning. On failure the
endpoint responds 503 service_unavailable rather than a 200 with a
non-functional URL. Mutation is idempotent so client retries are safe.
Regression tests (2 updated/new in tests/brief-referral-code.test.mjs):
- asserts the binding is awaited, not ctx.waitUntil'd; asserts the
failure path returns 503
- asserts subscriptionHelpers reads affonso_referral, resolves via
userReferralCodes.by_code, inserts a userReferralCredits row, and
dedupes by (referrer, refereeEmail)
14/14 referral tests pass. Both tsconfigs typecheck clean.
Net effect: /pro?ref=<code> visitors who convert (direct checkout)
now credit the sharer on webhook receipt, same as waitlist signups.
The share button is no longer a dead-end UI.
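A minimal sketch of the deterministic code derivation described above, using Web Crypto as the commits do; the function name is an assumption, while the HMAC input and the 8-char hex shape are from the commit:

```ts
// Hedged sketch; not the shipped referral-code.ts.
async function referralCodeForUser(userId: string, secret: string): Promise<string> {
  const enc = new TextEncoder();
  const key = await crypto.subtle.importKey(
    "raw", enc.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false, ["sign"],
  );
  const mac = await crypto.subtle.sign(
    "HMAC", key, enc.encode(`referral:v1:${userId}`),
  );
  // Hex-encode and keep the first 8 chars; same userId + secret
  // always yields the same code, so no DB write is needed on login.
  return [...new Uint8Array(mac)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("")
    .slice(0, 8);
}
```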

122204f691
feat(brief): Phase 8 — Telegram carousel images via Satori + resvg-wasm (#3174)
* feat(brief): Phase 8 — Telegram carousel images via Satori + resvg-wasm
Implements the Phase 8 carousel renderer (Option B): server-side PNG
generation in a Vercel edge function using Satori (JSX to SVG) +
@resvg/resvg-wasm (SVG to PNG). Zero new Railway infra, zero
Chromium, same edge runtime that already serves the magazine HTML.
Files:
server/_shared/brief-carousel-render.ts (new)
Pure renderer: (BriefEnvelope, CarouselPage) -> Uint8Array PNG.
Three layouts (cover/threads/story), 1200x630 OG size.
Satori + resvg + WASM are lazy-imported so Node tests don't trip
over '?url' asset imports and the 800KB wasm doesn't ship in
every bundle. Font: Noto Serif regular, fetched once from Google
Fonts and memoised on the edge isolate.
api/brief/carousel/[userId]/[issueDate]/[page].ts (new)
Public edge function reusing the magazine route's HMAC token —
same signer, same (userId, issueDate) binding, so one token
unlocks magazine HTML AND all three carousel images. Returns
image/png with 7d immutable cache headers. 404 on invalid page
index, 403 on bad token, 404 on Redis miss, 503 on missing
signing secret. Render failure falls back to a 1x1 transparent
PNG so Telegram's sendMediaGroup doesn't 500 the brief.
scripts/seed-digest-notifications.mjs
carouselUrlsFrom(magazineUrl) derives the 3 signed carousel
URLs from the already-signed magazine URL. sendTelegramBriefCarousel
calls Telegram's sendMediaGroup with those URLs + short caption.
Runs before the existing sendTelegram(text) so the carousel is
the header and the text the body — long-form stories remain
forwardable as text. Best-effort: carousel failure doesn't
block text delivery.
package.json + package-lock.json
satori ^0.10.14 + @resvg/resvg-wasm ^2.6.2.
Tests (tests/brief-carousel.test.mjs, 9 cases):
- pageFromIndex mapping + out-of-range
- carouselUrlsFrom: valid URL, localhost origin preserved, missing
token, wrong path, invalid issueDate, garbage input
- Drift guard: cron must still declare the same helper + template
string. If it drifts, test fails with a pointer to move the impl
into a shared module.
PNG render itself isn't unit-tested — Satori + WASM need a
browser/edge runtime. Covered by smoke validation step in the
deploy monitoring plan.
Both tsconfigs typecheck clean. 152/152 brief tests pass.
Scope boundaries (deferred):
- Slack + Discord image attachments (different payload shapes)
- notification-relay.cjs brief_ready dispatch (real-time route)
- Redis caching of rendered PNG (edge Cache-Control is enough for
MVP)
* fix(brief): address two P1 review findings on Phase 8 carousel
P1-A: 200 placeholder PNG cached 7d on render failure.
Route config said runtime: 'edge' but a comment contradicted it
claiming Node semantics. More importantly, any render/init failure
(WASM load, Satori, Google Fonts) was converted to a 1x1 transparent
PNG returned with Cache-Control: public, max-age=7d, immutable.
Telegram's media fetcher and Vercel's CDN would cache that blank
for the full brief TTL per chat message — one cold-start mismatch
= every reader of that brief sees blank carousel previews for a
week.
Fix: deleted errorPng(). Render failure now returns 503 with
Cache-Control: no-store. sendMediaGroup fails cleanly for that
carousel (the digest cron already treats it as best-effort and
still sends the long-form text message), next cron tick re-renders
from a fresh isolate. Self-healing across ticks. Contradictory
comment about Node runtime removed.
P1-B: Google Fonts as silent hard dependency.
The renderer claimed 'safe embedded/fallback path' in comments but
no fallback existed. loadFont() fetches Noto Serif from gstatic.com
and rethrows on any failure. Combined with P1-A's old 200-cache-7d
path, a transient CDN blip would lock in a blank carousel for a
week.
Fix: updated comments to honestly declare the CDN dependency plus
document the self-healing semantics now that P1-A's fix no longer
caches the failure. If Google Fonts reliability becomes a problem,
swap the fetch for a bundled base64 TTF — noted as the escape hatch.
Tests (tests/brief-carousel.test.mjs): 2 new regression cases.
11/11 carousel tests pass. Both tsconfigs typecheck clean locally.
Note on currently-red CI: failures are NOT typecheck errors — npm
ci dies fetching libvips for sharp (504 Gateway Time-out from
GitHub releases). sharp is a transitive dep via @xenova/transformers,
pre-existing, not touched by this PR. Transient infra flake.
* fix(brief): switch carousel to Node + @resvg/resvg-js (fixes deploy block)
Vercel edge bundler fails the carousel deploy with:
'Edge Function is referencing unsupported modules:
@resvg/resvg-wasm/index_bg.wasm?url'
The ?url asset-import syntax is a Vite-ism that Vercel's edge
bundler doesn't resolve. Two ways out: find a Vercel-blessed edge
WASM import incantation, or switch to Node runtime with the native
@resvg/resvg-js binding. The second is simpler, faster per request,
and avoids the whole WASM-in-edge-bundler rabbit hole.
Changes:
- package.json: @resvg/resvg-wasm -> @resvg/resvg-js ^2.6.2
- api/brief/carousel/.../[page].ts: runtime 'edge' -> 'nodejs20.x'
- server/_shared/brief-carousel-render.ts: drop initWasm path,
dynamic-import resvg-js in ensureLibs(). Satori and resvg load
in parallel via Promise.all, shaving ~30ms off cold start.
Also addresses the P2 finding from review: the old ensureLibsAndWasm
had a concurrent-cold-start race where two callers could reach
'await initWasm()' simultaneously. Replaced the boolean flag with a
shared _libsLoadPromise so concurrent callers await the same load.
On failure the promise resets so the NEXT request retries rather
than poisoning the isolate for its lifetime.
Cold start ~700ms (Satori + resvg-js native init), warm ~40ms.
Carousel images are not latency-critical — fetched by Telegram's
media service, CDN-cached 7d.
Both tsconfigs typecheck clean. 11/11 carousel tests pass.
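The shared load-promise pattern from this commit, sketched (names are assumptions):

```ts
// Hedged sketch: concurrent cold-start callers await one shared load;
// a failure resets the promise so the NEXT request retries instead of
// poisoning the warm isolate for its lifetime.
let _libsLoadPromise: Promise<{ satori: unknown; resvg: unknown }> | null = null;

function ensureLibs() {
  if (!_libsLoadPromise) {
    _libsLoadPromise = Promise.all([
      import("satori"),            // lazy: keeps Node tests off the asset path
      import("@resvg/resvg-js"),   // native binding, loads in parallel
    ])
      .then(([satori, resvg]) => ({ satori, resvg }))
      .catch((err) => {
        _libsLoadPromise = null;   // reset so the next request retries
        throw err;
      });
  }
  return _libsLoadPromise;
}
```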
* fix(brief): carousel runtime = 'nodejs' (was 'nodejs20.x', rejected by Vercel)
Vercel's functions config validator rejects 'nodejs20.x' at deploy
time:
unsupported "runtime" value in config: "nodejs20.x"
(must be one of: ["edge","experimental-edge","nodejs"])
The Node version comes from the project's default (currently Node 20
via package.json engines + Vercel project settings), not from the
runtime string. Use 'nodejs' — unversioned — and let the platform
resolve it.
11/11 carousel tests pass.
* fix(brief): swap carousel font from woff2 to woff (Satori can't parse woff2)
Review on PR #3174: the FONT_URL was pointing at a gstatic.com woff2
file. Satori parses ttf / otf / woff v1 — NOT woff2. Every render
was about to throw on font decode, the route would return 503, and
the carousel would never deliver a single image.
Fix: point FONT_URL at @fontsource's Noto Serif Latin 400 WOFF v1
via jsdelivr. WOFF v1 is a TrueType wrapper that Satori parses
natively (verified: file says 'Web Open Font Format, TrueType,
version 1.1'). Same cold-start semantics as before — one fetch per
warm isolate, memoised.
Regression test: asserts FONT_URL ends in ttf/otf/woff and explicitly
rejects any .woff2 suffix. A future swap that silently reintroduces
woff2 now fails CI loudly instead of shipping a permanently-broken
renderer.
12/12 carousel tests pass. Both tsconfigs typecheck clean.

de769ce8e1
fix(api): unblock Pro API clients at edge + accept x-api-key alias (#3155)
* fix(api): unblock Pro API clients at edge + accept x-api-key alias
Fixes #3146: Pro API subscriber getting 403 when calling from Railway.
Two independent layers were blocking server-side callers:
1. Vercel Edge Middleware (middleware.ts) blocks any UA matching
/bot|curl\/|python-requests|go-http|java\//, which killed every
legitimate server-to-server API client before the gateway even saw
the request.
Add bypass: requests carrying an `x-worldmonitor-key` or `x-api-key`
header that starts with `wm_` skip the UA gate. The prefix is a cheap
client-side signal, not auth — downstream server/gateway.ts still
hashes the key and validates against the Convex `userApiKeys` table +
entitlement check.
2. Header name mismatch. Docs/gateway only accepted
`X-WorldMonitor-Key`, but most API clients default to `x-api-key`.
Accept both header names in:
- api/_api-key.js (legacy static-key allowlist)
- server/gateway.ts (user-issued Convex-backed keys)
- server/_shared/premium-check.ts (isCallerPremium)
Add `X-Api-Key` to CORS Allow-Headers in server/cors.ts and
api/_cors.js so browser preflights succeed.
Follow-up outside this PR (Cloudflare dashboard, not in repo):
- Extend the "Allow api access with WM" custom WAF rule to also match
`starts_with(http.request.headers["x-api-key"][0], "wm_")`, so CF
Managed Rules don't block requests using the x-api-key header name.
- Update the api-cors-preflight CF Worker's corsHeaders to include
`X-Api-Key` (memory: cors-cloudflare-worker.md — Worker overrides
repo CORS on api.worldmonitor.app).
* fix(api): tighten middleware bypass shape + finish x-api-key alias coverage
Addresses review findings on #3155:
1. middleware.ts bypass was too loose. "Starts with wm_" let any
caller send X-Api-Key: wm_fake and skip the UA gate, shifting
unauthenticated scraper load onto the gateway's Convex lookup.
Tighten to the exact key format emitted by
src/services/api-keys.ts:generateKey — `^wm_[a-f0-9]{40}$` (wm_ + 20
random bytes as hex). Still a cheap edge heuristic (no hash lookup in
middleware), but raises spoofing from trivial prefix match to a
specific 43-char shape.
2. Alias was incomplete on bespoke endpoints outside the shared
gateway:
- api/v2/shipping/route-intelligence.ts: async wm_ user-key fallback
now reads X-Api-Key as well
- api/v2/shipping/webhooks.ts: webhook ownership fingerprint now
reads X-Api-Key as well (same key value → same SHA-256 → same
ownerTag, so a user registering with either header can manage their
webhook from the other)
- api/widget-agent.ts: accept X-Api-Key in the auth read AND in the
OPTIONS Allow-Headers list
- api/chat-analyst.ts: add X-Api-Key to the OPTIONS Allow-Headers
list (auth path goes through shared helpers already aliased)
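A sketch of the tightened edge bypass; the regex shape is from the commit, while the function name and header plumbing are assumptions:

```ts
// Hedged sketch of the middleware check, not the shipped middleware.ts.
const WM_KEY_SHAPE = /^wm_[a-f0-9]{40}$/; // wm_ + 20 random bytes as hex

function shouldBypassUaGate(req: Request): boolean {
  const key =
    req.headers.get("x-worldmonitor-key") ?? req.headers.get("x-api-key");
  // Cheap shape check only; no hash lookup at the edge. Real auth
  // still happens downstream against the Convex userApiKeys table.
  return key !== null && WM_KEY_SHAPE.test(key);
}
```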

d46c170012
feat(brief): hosted magazine edge route + latest-brief preview RPC (Phase 2) (#3153)
* feat(brief): hosted magazine edge route + latest-brief preview RPC (Phase 2)
Phase 2 of the WorldMonitor Brief plan (docs/plans/2026-04-17-003).
Adds the read path that every downstream delivery channel binds to.
Phase 1 shipped the renderer + envelope contract; Phase 3 (composer)
will write brief:{userId}:{issueDate} to Redis. Until then the route
returns a neutral "expired" page for every request — intentional, safe
to deploy ahead of the composer.
Files:
- server/_shared/brief-url.ts
HMAC-SHA256 sign + verify helpers using Web Crypto (edge-compatible,
no node:crypto). Rotation supported via optional prevSecret in
verifyBriefToken. Constant-time comparison. Strict userId and
YYYY-MM-DD shape validation before any crypto operation. Typed
BriefUrlError with code enum for caller branches.
- api/brief/[userId]/[issueDate].ts (Vercel edge)
Auth model: the HMAC-signed ?t= token IS the credential. No Clerk
session required — URLs are delivered over already-authenticated
channels (push, email, dashboard panel). Flow: verify token →
Redis GET brief:{userId}:{issueDate} → renderBriefMagazine → HTML.
403 on bad token, 404 on Redis miss, 503 on missing secret. Never
echoes the userId in error pages.
- api/latest-brief.ts (Clerk JWT + PRO gated, Vercel edge)
Returns { issueDate, dateLong, greeting, threadCount, magazineUrl }
or { status: 'composing', issueDate } when Redis is cold. Mirrors
the auth + entitlement pattern in api/notify.ts. magazineUrl is
freshly signed per request against the current
BRIEF_URL_SIGNING_SECRET so rotation takes effect immediately.
- tests/brief-url.test.mjs
20 tests: round-trip, tampering, wrong user/date/secret, malformed
token/inputs, rotation (accept prev secret + reject after removal),
URL composition (trailing-slash trim, path encoding).
Quality gates: typecheck + typecheck:api clean, biome lint clean,
render tests still 30/30, new HMAC tests 20/20.
PRE-MERGE: BRIEF_URL_SIGNING_SECRET must be set in Vercel before this
deploys. The route returns 503 (not 500) with a server-side log when
the secret is missing, so a mis-configured deploy is safe to roll.
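A minimal sketch of the token scheme under the constraints the commit lists (Web Crypto HMAC-SHA256, strict shape validation before any crypto, constant-time comparison); the names and the exact signed payload layout are assumptions:

```ts
// Hedged sketch of brief-url.ts, not the shipped helpers.
const enc = new TextEncoder();

async function hmacHex(secret: string, payload: string): Promise<string> {
  const key = await crypto.subtle.importKey(
    "raw", enc.encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false, ["sign"],
  );
  const mac = await crypto.subtle.sign("HMAC", key, enc.encode(payload));
  return [...new Uint8Array(mac)]
    .map((b) => b.toString(16).padStart(2, "0")).join("");
}

export async function signBriefToken(
  userId: string, issueDate: string, secret: string,
): Promise<string> {
  // Validate shape before touching crypto, per the commit.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(issueDate)) throw new Error("bad issueDate shape");
  return hmacHex(secret, `brief:v1:${userId}:${issueDate}`);
}

export async function verifyBriefToken(
  token: string, userId: string, issueDate: string, secret: string,
): Promise<boolean> {
  const expected = await signBriefToken(userId, issueDate, secret);
  if (token.length !== expected.length) return false;
  // Constant-time comparison: XOR every char, never early-return.
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= token.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}
```

Rotation support would wrap verifyBriefToken in a second pass against the optional prevSecret the commit mentions.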
* fix(brief): address Phase 2 review — broken import + HEAD body + host reflection
Addresses todos 212–217 from the /ce-review on PR #3153.
P0 / P1 blockers:
- todo 212 (P0): renderer import path was broken — pointed at
shared/render-brief-magazine.js which no longer exists (Phase 1
review moved it to server/_shared/brief-render.js). Route would
have 500'd on every successful token verification; @ts-expect-error
silenced the error at compile time. Fixed the path and removed the
now-unnecessary @ts-expect-error since brief-render.d.ts exists.
New tests/brief-edge-route-smoke.test.mjs imports the handler so a
future broken path cannot pass CI.
- todo 213 (P1): HEAD returned the full HTML body (RFC 7231
violation). htmlResponse() now takes the request and emits null
body when method is HEAD.
- todo 214 (P1): publicBaseUrl reflected x-forwarded-host, allowing a
signed magazineUrl to point at a non-canonical origin (preview
deploy, forwarded-host spoof). Now pinned to
WORLDMONITOR_PUBLIC_BASE_URL env var, falling back to the request
origin only in dev. This env var joins BRIEF_URL_SIGNING_SECRET on
the pre-merge Vercel checklist.
P2 cleanups:
- todo 215: dropped dead 'invalid_token_shape' enum member from
BriefUrlError — never thrown (verifyBriefToken returns false on
shape failure by design).
- todo 216: consolidated EXPIRED_PAGE + FORBIDDEN_PAGE (and added
UNAVAILABLE_PAGE) behind a single renderErrorPage(heading, body)
helper with shared styles.
- todo 217: extracted readRawJsonFromUpstash() into api/_upstash-json.js
so api/brief and api/latest-brief share one 3-second-timeout raw-GET
implementation. Dodges the readJsonFromUpstash unwrap that strips
seed envelopes (our brief envelope is flat, not seed-framed).
Also:
- brief-url.ts header now documents the rotation runbook + the
emergency kill-switch path (rotate SECRET without setting PREV).
- distinguishable log tag for malformed-envelope vs Redis-miss
(composer-bug grep vs expired-key grep).
Deferred (todo 218, P2): token-in-URL log leak is a design-level
issue that wants a cookie-based first-visit exchange or iat/exp in
the signed payload. Out of scope for this fix commit.
Tests: 55/55 (20 HMAC + 30 render + 5 edge-route smoke). Typecheck
clean, biome lint clean.
* fix(brief): distinguish infra failure from miss + walk tz boundary
Addresses two additional P1 review findings on PR #3153.
1. readRawJsonFromUpstash collapsed every failure mode (missing creds,
HTTP non-2xx, timeout, JSON parse failure, genuine miss) into a
single null return. Both edge routes then treated null as empty
state: api/brief → 404 "expired", api/latest-brief → 200
"composing". During any Upstash outage a user with a valid brief
would see misleading empty-state UX.
Helper now throws on every infrastructure/parse failure and
returns null ONLY on a genuine miss (Upstash replied 200 with
result: null). Callers:
- api/brief catches → 503 UNAVAILABLE_PAGE
- api/latest-brief catches → 503 { error: service_unavailable }
2. api/latest-brief probed today UTC only. For users ahead of or
behind UTC, a ready brief could be invisible for 12-24h around
the date boundary.
Route now accepts an optional ?date=YYYY-MM-DD query param so the
dashboard panel (or any client) can send its local date directly.
When ?date= is absent, we walk today UTC → yesterday UTC and
return the most recent hit; composing is only reported when both
slots miss. Malformed ?date= returns 400.
Tests added:
- helper-level: missing creds → throws, HTTP error → throws,
genuine miss → null
- route-level: api/brief returns 503 (not 404) on Upstash HTTP error
Final suite: 60/60 (20 HMAC + 30 render + 10 smoke). Typecheck +
lint clean.
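A sketch of the miss-vs-failure contract described above; the helper name is from the commit, while the URL shape and error strings are assumptions:

```ts
// Hedged sketch: returns null ONLY on a genuine miss (Upstash replied
// 200 with result: null); throws on every infrastructure or parse
// failure so callers can answer 503 instead of faking an empty state.
export async function readRawJsonFromUpstash(key: string): Promise<unknown> {
  const base = process.env.UPSTASH_REDIS_REST_URL;
  const token = process.env.UPSTASH_REDIS_REST_TOKEN;
  if (!base || !token) throw new Error("upstash credentials missing");

  const res = await fetch(`${base}/get/${encodeURIComponent(key)}`, {
    headers: { Authorization: `Bearer ${token}` },
    signal: AbortSignal.timeout(3000), // 3-second budget per the commit
  });
  if (!res.ok) throw new Error(`upstash http ${res.status}`);

  const body = (await res.json()) as { result: string | null };
  if (body.result === null) return null;  // genuine miss, the only null path
  return JSON.parse(body.result);         // parse failure throws = infra signal
}
```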
* fix(brief): tomorrow-UTC probe + unified envelope validation
Addresses two third-round review findings on PR #3153.
1. Walk-back missed the tomorrow-UTC slot. Users at positive tz
offsets (UTC+1..UTC+14) store today's brief under tomorrow UTC;
the previous [today, yesterday] probe never saw them. Fixed to
walk [tomorrow, today, yesterday] UTC — covers the full tz range
and naturally prefers the most recently composed slot. Also
correctly echoes the caller's ?date= in composing responses
instead of always echoing today UTC.
2. readBriefPreview validated only dateLong + digest.greeting +
stories (array). The renderer's assertBriefEnvelope is much
stricter (closed-key contract, version guard, cross-field
surfaced === stories.length, per-story 7 required fields, etc.).
A partial envelope could be reported "ready" by the preview and
404-expired on click. Exported assertBriefEnvelope from the
renderer module, called it in the preview reader. An envelope
that would render successfully is exactly an envelope that the
preview reports as ready; any divergence is logged as a composer
bug and surfaced as "composing".
Tests: 62/62 (20 HMAC + 30 render + 12 smoke). New cases cover the
shared-validator export and catch the "partial envelope slips past
weak preview" regression.
Typecheck + lint clean.
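The probe order, sketched (function names assumed):

```ts
// Hedged sketch of the [tomorrow, today, yesterday] UTC walk.
// Positive-offset users store today's brief under tomorrow UTC, so
// the walk starts there and returns the most recently composed slot.
function utcDateString(offsetDays: number): string {
  const d = new Date(Date.now() + offsetDays * 86_400_000);
  return d.toISOString().slice(0, 10); // YYYY-MM-DD
}

async function findLatestBrief(
  read: (issueDate: string) => Promise<unknown>,
): Promise<{ issueDate: string; envelope: unknown } | null> {
  for (const offset of [1, 0, -1]) { // tomorrow, today, yesterday UTC
    const issueDate = utcDateString(offset);
    const envelope = await read(issueDate);
    if (envelope != null) return { issueDate, envelope };
  }
  return null; // every slot missed: report "composing"
}
```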

66ca645571
feat(brief): WorldMonitor Brief magazine renderer + envelope contract (Phase 1) (#3150)
* feat(brief): add WorldMonitor Brief magazine renderer + envelope contract
Phase 1 of the WorldMonitor Brief plan (docs/plans/2026-04-17-003-feat-
worldmonitor-brief-magazine-plan.md). Establishes the integration
boundary between the future per-user composer and every consumer
surface (hosted edge route, dashboard panel, email teaser, carousel,
Tauri reader).
- shared/brief-envelope.{d.ts,js}: BriefEnvelope + BRIEF_ENVELOPE_VERSION
- shared/render-brief-magazine.{d.ts,js}: pure (envelope) -> HTML
- tests/brief-magazine-render.test.mjs: 28 shape + chrome + leak tests
Page sequence is derived from data (no hardcoded counts). Threads
split into 03a/03b when more than six; Signals page is omitted when
signals is empty; story palette alternates light/dark by index
parity. Forbidden-field guard asserts importanceScore / primaryLink /
pubDate / model + provider names never appear in rendered HTML.
No runtime impact: purely additive, no consumers yet.
* fix(brief): address code review findings on PR #3150
Addresses all seven review items from todos/205..211.
P1 (merge blockers):
- todo 205: forbidden-field test no longer matches free-text content
like 'openai' / 'claude' / 'gemini' (false-positives on legitimate
stories). Narrowed to JSON-key structural tokens
('"importanceScore":', '"_seed":', etc.) and added a sentinel-
poisoning test that injects values into non-data envelope fields and
asserts they never appear in output.
- todo 206: drop 'moderate' from BriefThreatLevel union (synonym of
'medium'). Four-value ladder: critical | high | medium | low. Added
THREAT_LABELS map + HIGHLIGHTED_LEVELS set so display label and
highlight rule live together instead of inline char-case + hardcoded
comparison.
- todo 207: replace the minimal validator with assertBriefEnvelope()
that walks every required field (user.name/tz, date YYYY-MM-DD shape,
digest.greeting/lead/numbers.{clusters,multiSource,surfaced}, threads
array + per-element shape, signals array of strings, per-story
required fields + threatLevel enum). Throws with field-path message
on first miss. Adds nine negative tests covering common omissions.
- todo 208: envelope carries version guard. assertBriefEnvelope throws
when envelope.version !== BRIEF_ENVELOPE_VERSION with a message
naming both observed and expected versions.
P2 (should-fix, now included):
- todo 209: drop the _seed wrapper and make BriefEnvelope a flat
{ version, issuedAt, data } shape. A per-user brief is not a global
seed; reusing _seed invited mis-application of seed invariants
(SEED_META pairing, TTL rotation). Locked down before Phase 3
composer bakes the shape in.
- todo 210: move renderer from shared/render-brief-magazine.{js,d.ts}
to server/_shared/brief-render.{js,d.ts}. The 740-line template
doesn't belong on the shared/ mirror hot path (Vercel+Railway). Keep
only the envelope contract in shared/. Import path updated in tests.
- todo 211: logo SVG now defined once per document as a <symbol> and
referenced via <use> at each placement (~8 refs/doc). Drops ~7KB
from rendered output (~26% total size reduction on small inputs).
Tests pass (26/26), typecheck clean, lint clean, mirror check (169/169)
unaffected.
* fix(brief): enforce closed-key contract + surfaced-count invariant
Addresses two additional P1 review findings on PR #3150.
1. Validator rejects extra keys at every level (envelope root, data,
user, digest, numbers, each thread, each story). Previously the
forbidden-field rule was documented in the .d.ts and proven only at
render time via a sentinel test — a producer could still persist
importanceScore, primaryLink, pubDate, briefModel, _seed, fetchedAt,
etc. into the Redis-resident envelope and every downstream consumer
(edge route, dashboard panel preview, email teaser, carousel) would
accept them. The validator is the only place the invariant can live.
2. Validator now asserts digest.numbers.surfaced === stories.length.
The renderer uses both values — surfaced on the at-a-glance page,
stories.length for the cover blurb and page count — so allowing
them to disagree produced a self-contradictory brief that the old
validator silently passed.
Tests: 30/30 (was 26). New negative tests cover the strict-keys
rejection at root / data / numbers / story[i] levels and the surfaced
mismatch. The earlier sentinel-poisoning test is superseded — the
strict-keys check catches the same class of bug earlier and harder.
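A minimal sketch of the closed-key check (names assumed):

```ts
// Hedged sketch of the strict-keys validation applied at every level
// of the envelope, so a producer cannot persist extra fields (e.g.
// importanceScore, primaryLink) into the Redis-resident envelope.
function assertOnlyKeys(
  obj: Record<string, unknown>, allowed: string[], path: string,
): void {
  for (const key of Object.keys(obj)) {
    if (!allowed.includes(key)) {
      throw new Error(`brief envelope: unexpected key ${path}.${key}`);
    }
  }
}

// At the root: version + issuedAt + data and nothing else, plus the
// cross-field invariant on the surfaced count.
function checkRootInvariants(envelope: {
  version: number;
  issuedAt: string;
  data: { digest: { numbers: { surfaced: number } }; stories: unknown[] };
}): void {
  assertOnlyKeys(envelope, ["version", "issuedAt", "data"], "envelope");
  if (envelope.data.digest.numbers.surfaced !== envelope.data.stories.length) {
    throw new Error("brief envelope: numbers.surfaced !== stories.length");
  }
}
```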

dcf73385ca
fix(scoring): rebalance formula weights severity 55%, corroboration 15% (#3144)
* fix(scoring): rebalance formula weights severity 55%, corroboration 15%
PR A of the scoring recalibration plan (docs/plans/2026-04-17-002).
The v2 shadow-log recalibration (690 items, Pearson 0.413) showed the
formula compresses scores into a narrow 30-70 range, making the 85
critical gate unreachable and the 65 high gate marginal. Root cause:
corroboration at 30% weight penalizes breaking single-source news
(the most important alerts) while severity at 40% doesn't separate
critical from high enough.
Weight change:
BEFORE: severity 0.40 + sourceTier 0.20 + corroboration 0.30 + recency 0.10
AFTER: severity 0.55 + sourceTier 0.20 + corroboration 0.15 + recency 0.10
Expected effect: critical/tier1/fresh rises from 76 to 88 (clears 85
gate). critical/tier2/fresh rises from 71 to 83 (recommend lowering
critical gate to 80 at activation time). high/tier2/fresh rises from
61 to 69 (clears 65 gate). The HIGH-CRITICAL gap widens from 10 to
14 points for same-tier items.
Also:
- Bumps shadow-log key from v2 to v3 for a clean recalibration dataset
(v2 had old-weight scores that would contaminate the 48h soak)
- Updates proto/news_item.proto formula comment to reflect new weights
- Updates cache-keys.ts documentation
No cache migration needed: the classify cache stores {level, category},
not scores. Scores are computed at read time from the stored level +
the formula, so new digest requests immediately produce new scores.
Gates remain OFF. After 48h of v3 data, re-run:
node scripts/shadow-score-report.mjs
node scripts/shadow-score-rank.mjs sample 25
🤖 Generated with Claude Opus 4.6 via Claude Code + Compound Engineering v2.49.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
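A worked check of the projections above; the weights are from the commit, while the component values are inferred here (severity critical=1.0, tier1=1.0, single-source corroboration=0.2, fresh recency=1.0), not stated in it:

```ts
// Hedged sketch reproducing the commit's expected effect.
const W = { severity: 0.55, sourceTier: 0.20, corroboration: 0.15, recency: 0.10 };

function compositeScore(c: {
  severity: number; sourceTier: number; corroboration: number; recency: number;
}): number {
  return Math.round(100 * (
    W.severity * c.severity +
    W.sourceTier * c.sourceTier +
    W.corroboration * c.corroboration +
    W.recency * c.recency
  ));
}

// critical / tier1 / single-source / fresh:
//   old 0.40/0.20/0.30/0.10 → 40 + 20 + 6 + 10 = 76
//   new 0.55/0.20/0.15/0.10 → 55 + 20 + 3 + 10 = 88 (clears the 85 gate)
compositeScore({ severity: 1, sourceTier: 1, corroboration: 0.2, recency: 1 }); // 88
```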
* chore: regenerate proto OpenAPI docs for weight rebalance
* fix(scoring): bump SHADOW_SCORE_LOG_KEY export to v3
The exported constant in cache-keys.ts was left at v2 while the relay's
local constant was bumped to v3. Anyone importing the export (or grep-
discovering it) would get a stale key. Architecture review flagged this.
* fix(scoring): update test + stale comments for shadow-log v3
Review found the regression test still asserted v2 key, causing CI
failure. Also fixed stale v1/v2 references in report script header,
default-key comment, report title render, and shouldNotify docstring.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

a4d9b0a5fa
feat(auth): user-facing API key management (create / list / revoke) (#3125)
* feat(auth): user-facing API key management (create / list / revoke)
Adds full-stack API key management so authenticated users can create,
list, and revoke their own API keys from the Settings UI.
Backend:
- Convex `userApiKeys` table with SHA-256 key hash storage
- Mutations: createApiKey, listApiKeys, revokeApiKey
- Internal query validateKeyByHash + touchKeyLastUsed for gateway
- HTTP endpoints: /api/api-keys (CRUD) + /api/internal-validate-api-key
- Gateway middleware validates user-owned keys via Convex + Redis cache
Frontend:
- New "API Keys" tab in UnifiedSettings (visible when signed in)
- Create form with copy-on-creation banner (key shown once)
- List with prefix display, timestamps, and revoke action
- Client-side key generation + hashing (plaintext never sent to DB)
Closes #3116
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(api-keys): address PR review — cache invalidation, prefix validation, revoked-key guard
- Invalidate Redis cache on key revocation so the gateway rejects
revoked keys immediately instead of waiting for 5-min TTL expiry (P1)
- Enforce `wm_` prefix format with regex instead of loose length
check (P2)
- Skip `touchKeyLastUsed` for revoked keys to preserve a clean audit
trail (P2)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(api-keys): address consolidated PR review (P0–P3)
P0: gate createApiKey on pro entitlement (tier >= 1); isCallerPremium
now verifies key-owner tier instead of treating existence as premium.
P1: wire wm_ user keys into the domain gateway auth path with async
Convex-backed validation; user keys go through entitlement checks
(only admin keys bypass). Lower cache TTL 300s → 60s and await
revocation cache-bust instead of fire-and-forget.
P2: remove dead HTTP create/list/revoke path from convex/http.ts;
switch to cachedFetchJson (stampede protection, env-prefixed keys,
standard NEG_SENTINEL); add tenancy check on cache-invalidation
endpoint via new /api/internal-get-key-owner route; add 22 Convex
tests covering tier gate, per-user limit, duplicate hash, ownership
revoke guard, getKeyOwner, and touchKeyLastUsed debounce.
P3: tighten keyPrefix regex to exactly 5 hex chars; debounce
touchKeyLastUsed (5 min); surface PRO_REQUIRED in UI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(api-keys): gate on apiAccess (not tier), wire wm_ keys through edge routes, harden error paths
- Gate API key creation/validation on features.apiAccess instead of
tier >= 1. Pro (tier 1, apiAccess=false) can no longer mint keys —
only API_STARTER+.
- Wire wm_ user keys through standalone edge routes
(shipping/route-intelligence, shipping/webhooks) that were
short-circuiting on validateApiKey before async Convex validation
could run.
- Restore fail-soft behavior in validateUserApiKey: transient
Convex/network errors degrade to unauthorized instead of bubbling a
500.
- Fail-closed on cache invalidation endpoint: ownership check errors
now return 503 instead of silently proceeding (surfaces Convex
outages in logs).
- Tests updated: positive paths use api_starter (apiAccess=true), new
test locks Pro-without-API-access rejection. 23 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(webhooks): remove wm_ user key fallback from shipping webhooks
Webhook ownership is keyed to SHA-256(apiKey) via callerFingerprint(),
not to the user. With user-owned keys (up to 5 per user), this causes
cross-key blindness (webhooks invisible when calling with a different
key) and revoke-orphaning (revoking the creating key makes the
webhook permanently unmanageable). User keys remain supported on the
read-only route-intelligence endpoint. Webhook ownership migration to
userId will follow in a separate PR.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Elie Habib <elie.habib@gmail.com>
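A sketch of the client-side mint-and-hash flow described above; the function name is an assumption, and the wm_ + 40-hex shape is taken from the middleware commit elsewhere in this log:

```ts
// Hedged sketch: only the SHA-256 hash ever leaves the browser.
async function mintApiKey(): Promise<{ plaintext: string; hashHex: string }> {
  const bytes = crypto.getRandomValues(new Uint8Array(20)); // 20 random bytes
  const hex = [...bytes].map((b) => b.toString(16).padStart(2, "0")).join("");
  const plaintext = `wm_${hex}`; // shown to the user exactly once

  const digest = await crypto.subtle.digest(
    "SHA-256", new TextEncoder().encode(plaintext),
  );
  const hashHex = [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0")).join("");
  return { plaintext, hashHex }; // only hashHex is persisted in Convex
}
```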

da1fa3367b
fix(resilience-ranking): chunked warm SET, always-on rebuild, truthful meta (Slice B) (#3124)
* fix(resilience-ranking): chunked warm SET, always-on rebuild, truthful meta
Slice B follow-up to PR #3121. Three coupled production failures
observed:
1. Per-country score persistence works (Slice A), but the 222-SET
single pipeline body (~600KB) exceeds REDIS_PIPELINE_TIMEOUT_MS (5s)
on Vercel Edge. runRedisPipeline returns []; the persistence guard
correctly returns empty; coverage = 0/222 < 75%; the ranking publish
is silently dropped. Live Railway log: "Ranking: 0 ranked, 222 greyed
out" → "Rebuilt … with 222 countries (bulk-call race left ranking:v9
null)" — the second call only succeeded because Upstash had finally
caught up between attempts.
2. The seeder's probe + rebuild block lives inside `if (missing > 0)`.
When per-country scores survive a cron tick (TTL 6h, cron every 6h),
missing=0 and the rebuild path is skipped. The ranking aggregate then
expires alone and is never refreshed until scores also expire —
multi-hour gaps where `resilience:ranking:v9` is gone while seed-meta
still claims freshness.
3. `writeRankingSeedMeta` fires whenever finalWarmed > 0, regardless
of whether the ranking key is actually present. The health endpoint
sees fresh meta + missing data → EMPTY_ON_DEMAND with a misleading
seedAge.
Fixes:
- _shared.ts: split the warm pipeline SET into SET_BATCH=30-command
chunks so each pipeline body fits well under the timeout. Pad
missing-batch results with empty entries so the per-command alignment
stays correct (failed batches stay excluded from `warmed`; no proof =
no claim).
- seed-resilience-scores.mjs: extract an `ensureRankingPresent`
helper, call it from BOTH the missing>0 and missing===0 branches so
the ranking gets refreshed every cron. Add a post-rebuild STRLEN
verification — rebuild HTTP can return 200 with a payload but still
skip the SET (coverage gate, pipeline failure).
- main(): only writeRankingSeedMeta when result.rankingPresent ===
true. Otherwise log and let the next cron retry.
Tests:
- resilience-ranking.test.mts: assert pipelines stay ≤30 commands.
- resilience-scores-seed.test.mjs: structural checks that the rebuild
is hoisted (≥2 callsites of ensureRankingPresent), the STRLEN
verification is present, and the meta write is gated on
rankingPresent.
Full resilience suite: 373/373 pass (was 370 — 3 new tests).
* fix(resilience-ranking): seeder no longer writes seed-meta (handler is sole writer)
Reviewer P1: ensureRankingPresent() returning true only means the
ranking key exists in Redis — not that THIS cron actually wrote it.
The handler skips both the ranking SET and the meta SET when coverage
< 75%, so an older ranking from a prior cron can linger while this
cron's data didn't land. Under that scenario, the previous commit
still wrote a fresh seed-meta:resilience:ranking, recreating the
stale-meta-over-stale-data failure this PR is meant to eliminate.
Fix: remove seeder-side seed-meta writes entirely. The ranking
handler already writes ranking + meta atomically in the same pipeline
when (and only when) coverage passes the gate. ensureRankingPresent()
triggers the handler every cron, which addresses the original
rationale for the seeder heartbeat (meta going stale during quiet Pro
usage) without the seeder needing to lie.
Consequence on failure:
- Coverage gate trips → handler writes neither ranking nor meta.
- seed-meta stays at its previous timestamp; api/health reports
accurate staleness (STALE_SEED after maxStaleMin, then CRIT) instead
of a fresh meta over stale/empty data.
Tests updated: the "meta gated on rankingPresent" assertion is
replaced with "seeder must not SET seed-meta:resilience:ranking" +
"no writeRankingSeedMeta". Comments may still reference the key name
for maintainer clarity — the assertion targets actual SET commands.
Full resilience suite: 373/373 pass.
* fix(resilience-ranking): always refresh + 12h TTL (close timing hole)
Reviewer P1+P2:
- P1: ranking TTL == cron interval (both 6h) left a timing hole. If a
cron wrote the key near the end of its run and the next cron fired
near the start of its interval, the key was still alive at probe time
→ ensureRankingPresent() returned early → no rebuild → the key
expired a short while later and stayed absent until a cron eventually
ran while the key was missing. Multi-hour EMPTY_ON_DEMAND gaps.
- P2: probing only the ranking data key (not seed-meta) meant a
partial handler pipeline (ranking SET ok, meta SET missed) would
self-heal only when the ranking itself expired — never during its TTL
window.
Fix:
1. Bump RESILIENCE_RANKING_CACHE_TTL_SECONDS from 6h to 12h (2x cron
interval). A single missed or slow cron no longer causes a gap.
Server-side and seeder-side constants kept in sync.
2. Replace ensureRankingPresent() with refreshRankingAggregate():
drop the 'if key present, skip' short-circuit. Rebuild every cron,
unconditionally. One cheap HTTP call keeps ranking + seed-meta
rolling forward together and self-heals the partial-pipeline case —
the handler retries the atomic pair every 6h regardless of whether
the keys are currently live.
3. Update the health.js comment to reflect the new TTL and refresh
cadence (12h data TTL, 6h refresh, 12h staleness threshold = 2
missed ticks).
Tests:
- RESILIENCE_RANKING_CACHE_TTL_SECONDS asserts 12h (was 6h).
- New assertion: refreshRankingAggregate must NOT early-return on
probe-hit, and the rebuild HTTP call must be unconditional in its
body.
- DEL-guard test relaxed to allow comments between '{' and the DEL
line (structural property preserved).
Full resilience suite: 375/375.
* fix(resilience-ranking): parallelize warm batches + atomic rebuild via ?refresh=1
Reviewer P2s:
- The warm path serialized the 8 batch pipelines with `await` in a
for-loop, adding ~7 extra Upstash round-trips (100-500ms each on
Edge) to the warm wall-clock. Batches are independent; Promise.all
collapses them into one slowest-batch window.
- DEL+rebuild created a brief absence window: if the rebuild request
failed transiently, the ranking stayed absent until the next cron.
Now the seeder calls
`/api/resilience/v1/get-resilience-ranking?refresh=1` and the handler
bypasses its cache-hit early-return, recomputing and SETting
atomically. On rebuild failure, the existing (possibly
stale-but-present) ranking is preserved instead of being nuked.
Handler: read ctx.request.url for the refresh query param; guard the
URL parse with try/catch so an unparseable url falls back to the
cached-first behavior.
Tests:
- New: ?refresh=1 must bypass the cache-hit early-return (fails on
old code, passes now).
- DEL-guard test replaced with 'does NOT DEL' + 'uses ?refresh=1'.
- Batch chunking still asserted at SET_BATCH=30.
Full resilience suite: 376/376.
* fix(resilience-ranking): bulk-warm call also needs ?refresh=1 (asymmetric TTL hazard)
Reviewer P1: in the 6h-12h window, per-country score keys have
expired (TTL 6h) but the ranking aggregate is still alive (TTL 12h).
The seeder's bulk-warm call was hitting get-resilience-ranking
without ?refresh=1, so the handler's cache-hit early-return fired and
the entire warm path was skipped. Scores stayed missing; coverage
degraded; the only recovery was the per-country laggard loop
(5-request batches) — which silently no-ops when WM_KEY is absent.
This defeated the whole point of the chunked bulk warm introduced in
this PR.
Fix: the bulk-warm fetch at scripts/seed-resilience-scores.mjs:167
now appends ?refresh=1, matching the rebuild call. Every
seeder-initiated hit on the ranking endpoint forces the handler to
route through warmMissingResilienceScores and its chunked pipeline
SET, regardless of whether the aggregate is still cached.
Test extended: the structural assertion now scans ALL occurrences of
get-resilience-ranking in the seeder and requires every one of them
to carry ?refresh=1. Fails the moment a future change adds a bare
call.
Full resilience suite: 376/376.
* fix(resilience-ranking): gate ?refresh=1 on seed key + detect partial pipeline publish
Reviewer P1: ?refresh=1 was honored for any caller — including valid
Pro bearer tokens. A full warm is ~222 score computations + chunked
pipeline SETs; a Pro user looping on refresh=1 (or an automated
client) could DoS Upstash quota and Edge budget. Gate refresh behind
WORLDMONITOR_VALID_KEYS / WORLDMONITOR_API_KEY (X-WorldMonitor-Key
header) — the same allowlist the cron uses. Pro bearer tokens get the
standard cache-first path; refresh requires the seed service key.
Reviewer P2: the handler's atomic runRedisPipeline SET of ranking +
meta is non-transactional on Upstash REST — either SET can fail
independently. If the ranking landed but meta missed, the seeder's
STRLEN verify would pass (ranking present) while /api/health stays
stuck on stale meta. Two-part fix:
- Handler inspects pipelineResult[0] and [1] and logs a warning when
either SET didn't return OK. Ops-greppable signal.
- Seeder's verify now checks BOTH keys in parallel: STRLEN on the
ranking data, and GET + fetchedAt freshness (<5min) on seed-meta.
Partial publish logs a warning; the next cron retries (SET is
idempotent).
Tests:
- New: ?refresh=1 without / with a wrong X-WorldMonitor-Key must NOT
trigger recompute (falls back to the cached response). The existing
bypass test updated to carry a valid seed key header.
Full resilience suite: 377/377 (376 + 1 new).
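A sketch of the chunked, proof-gated warm SET; SET_BATCH is from the commit, while the pipeline helper signature is an assumption:

```ts
// Hedged sketch: batches run in parallel, and only keys whose SET
// acked "OK" count as warmed. No proof of persistence, no claim.
const SET_BATCH = 30; // keeps each pipeline body well under the 5s timeout

async function persistWarmScores(
  entries: Array<[key: string, json: string]>,
  runRedisPipeline: (cmds: string[][]) => Promise<Array<{ result?: string }>>,
): Promise<Set<string>> {
  const batches: Array<Array<[string, string]>> = [];
  for (let i = 0; i < entries.length; i += SET_BATCH) {
    batches.push(entries.slice(i, i + SET_BATCH));
  }
  const results = await Promise.all(
    batches.map((batch) =>
      runRedisPipeline(batch.map(([k, v]) => ["SET", k, v]))
        // A failed batch yields no acks and stays excluded from warmed.
        .catch(() => [] as Array<{ result?: string }>),
    ),
  );
  const warmed = new Set<string>();
  results.forEach((res, bi) => {
    batches[bi].forEach(([key], ci) => {
      if (res[ci]?.result === "OK") warmed.add(key); // SET ack is the proof
    });
  });
  return warmed;
}
```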

bdfb415f8f
fix(resilience-ranking): return warmed scores from memory, skip lossy re-read (#3121)
* fix(resilience-ranking): return warmed scores from memory, skip lossy re-read
Upstash REST writes via /set aren't always visible to an
immediately-following /pipeline GET in the same Vercel invocation
(documented in PR #3057 /
feedback_upstash_write_reread_race_in_handler.md). The ranking
handler was warming 222 countries then re-reading them from Redis to
compute a coverage ratio; that re-read could return 0 despite every
SET succeeding, collapsing coverage to 0% < 75% and silently dropping
the ranking publish. Consequence: `resilience:ranking:v9` missing,
per-country score keys absent, health reports EMPTY_ON_DEMAND even
while the seeder keeps writing a "fresh" meta.
Fix: warmMissingResilienceScores now returns
Map<cc, GetResilienceScoreResponse> with every successfully computed
score. The handler merges those into cachedScores directly and drops
the post-warm re-read. Coverage now reflects what was actually
computed in-memory this invocation, not what Redis happens to surface
after write lag.
Adds a regression test that simulates pipeline-GET returning null for
freshly-SET score keys; it fails against the old code (coverage=0, no
ranking written) and passes with the fix (coverage=100%, ranking
written).
Slice A of the resilience-ranking recovery plan; Slice B (seeder meta
truthfulness) follows.
* fix(resilience-ranking): verify score-key persistence via pipeline SET response
PR review P1: trusting every fulfilled ensureResilienceScoreCached()
result as "cached" turned the read-lag fix into a write-failure false
positive. cachedFetchJson's underlying setCachedJson only logs and
swallows write errors, so a transient /set failure on
resilience:score:v9:* would leave per-country scores absent while the
ranking aggregate and its seed-meta got published on top of them —
worse than the bug this PR was meant to fix.
Fix: use the pipeline SET response as the authoritative persistence
signal.
- Extract the score builder into a pure `buildResilienceScore()` with
no caching side-effects (appendHistory stays — it's part of the score
semantics).
- `ensureResilienceScoreCached()` still wraps it in cachedFetchJson
so single-country RPC callers keep their log-and-return-anyway
behavior.
- `warmMissingResilienceScores()` now computes in-memory, persists
all scores in one pipeline SET, and only returns countries whose
command reported `result: OK`. Pipeline SET's response is synchronous
with the write, so OK means actually stored — no ambiguity with
read-after-write lag.
- When runRedisPipeline returns fewer responses than commands
(transport failure), return an empty map: no proof of persistence →
the coverage gate can't false-positive.
Adds a regression test that blocks pipeline SETs to score keys and
asserts the ranking + meta are NOT published. The existing
race-regression test still passes.
* fix(resilience-ranking): preserve env key prefix on warm pipeline SET
PR review P1: the pipeline SET added to verify score-key persistence
was called with raw=true, bypassing the preview/dev key prefix
(preview:<sha>:). Two coupled regressions:
1. Preview/dev deploys write unprefixed `resilience:score:v9:*` keys,
but all reads (getCachedResilienceScores, ensureResilienceScoreCached
via setCachedJson/cachedFetchJson) look in the prefixed namespace.
Warmed scores become invisible to the same preview on the next read.
2. Because production uses the empty prefix, preview writes land
directly in the production-visible namespace, defeating the
environment isolation guard in server/_shared/redis.ts.
Fix: drop the raw=true flag so runRedisPipeline applies prefixKey on
each command, symmetric with the reads.
Adds __resetKeyPrefixCacheForTests in redis.ts so tests can exercise
a non-empty prefix without relying on process-startup memoization
order. Adds a regression test that simulates VERCEL_ENV=preview + a
commit SHA and asserts every score-key SET in the pipeline carries
the preview:<sha>: prefix. Fails on old code (raw writes), passes
now. installRedis gains an opt-in `keepVercelEnv` so the test can run
under a forced env without being clobbered by the helper's default
reset.
* test(resilience-ranking): snapshot + restore VERCEL_GIT_COMMIT_SHA in afterEach
PR review P2: the preview-prefix test mutates
process.env.VERCEL_GIT_COMMIT_SHA but the file's afterEach only
restored VERCEL_ENV. A process started with a real preview SHA (e.g.
CI) would have that value unconditionally deleted after the test ran,
leaking changed state into later tests and producing different prefix
behavior locally vs. CI.
Fix: capture originalVercelSha at module load, restore it in
afterEach, and invalidate the memoized key prefix after each test so
the next one recomputes against the restored env. The preview-prefix
test's finally block is no longer needed — the shared teardown
handles it.
Verified: suite still passes 11/11 under both `VERCEL_ENV=production`
(unset) and `VERCEL_ENV=preview VERCEL_GIT_COMMIT_SHA=ci-original-sha`
process environments.
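A sketch of the prefix symmetry at stake; the preview:<sha>: shape is from the commit, while the helper names are assumptions:

```ts
// Hedged sketch: writes must go through the same prefixing as reads,
// or preview deploys write into the production-visible namespace and
// their own reads (which look in the prefixed namespace) go blind.
function keyPrefix(): string {
  if (process.env.VERCEL_ENV === "preview") {
    return `preview:${process.env.VERCEL_GIT_COMMIT_SHA ?? "local"}:`;
  }
  return ""; // production uses the bare namespace
}

const prefixKey = (key: string): string => `${keyPrefix()}${key}`;

// Symmetric use: ["SET", prefixKey(k), v] on write; GET prefixKey(k) on read.
```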

fffc5d9607
fix(analyze-stock): classify dividend frequency by median gap (#3102)
* fix(analyze-stock): classify dividend frequency by median gap
recentDivs.length within a hard 365.25-day window misclassifies
quarterly payers whose last-year Q1 payment falls just outside the
cutoff — common after mid-April each year, when Date.now() - 365.25d
lands after Jan's payment timestamp. The test 'non-zero CAGR for a
quarterly payer' flaked calendar-dependently for this reason.
Prefer the median inter-payment interval: quarterly = ~91d median
gap, regardless of where the trailing-12-month window happens to
bisect the payment series. Falls back to the old count when <2
entries exist. Also documents the CAGR filter invariant in the test
helper.
* fix(analyze-stock): suppress frequency when no recent divs + detect regime slowdowns
Addresses PR #3102 review:
1. Suspended programs no longer leak a frequency badge. When
recentDivs is empty, dividendYield and trailingAnnualDividendRate are
both 0; emitting 'Quarterly' derived from the historical median would
contradict those zeros in the UI. paymentsPerYear now short-circuits
to 0 before the interval classifier runs.
2. Whole-history median-gap no longer masks cadence regime changes.
The reconciliation now depends on the trailing-year count:
recent >= 3 → interval classifier (robust to calendar drift)
recent 1..2 → inspect the most-recent inter-payment gap:
> 180d = real slowdown, trust count (Annual)
<= 180d = calendar drift, trust interval (Quarterly)
recent 0 → empty frequency (suspended)
The interval classifier itself is now scoped to the last 2 years so
it responds to regime changes instead of averaging over 5y of
history.
Regression tests:
- 'emits empty frequency when the dividend program has been
suspended' — 3y of quarterly history + 18mo silence must report ''
not 'Quarterly'.
- 'detects a recent quarterly → annual cadence change' — 12
historical quarterly payments + 1 recent annual payment must report
'Annual'.
* fix(analyze-stock): scope interval median to trailing year when recent>=3
Addresses PR #3102 review round 2: the reconciler's recent>=3 branch
called paymentsPerYearFromInterval(entries), which scopes to the last
2 years. A monthly→quarterly shift (12 monthly payments in year
-2..-1 plus 4 quarterly in year -1..0) produced a 2-year median of
~30d and misclassified as Monthly even though the current
trailing-year cadence is clearly quarterly.
Pass recentDivs directly to the interval classifier when recent>=3.
Two payments in the trailing year = 1 gap, which suffices for the
median (gap count >=1, median well-defined). The historical-window 2y
scoping still applies for the recent 1..2 branch, where we actively
need history to distinguish drift from slowdown.
Regression test: 12 monthly payments from -13..-24 months ago + 4
quarterly payments inside the trailing year must classify as
Quarterly.
* fix(analyze-stock): use true median (avg of two middles) for even gap counts
PR #3102 P2: gaps[floor(length/2)] returns the upper-middle value for
even-length arrays, biasing toward slower cadence at classifier
thresholds when the trailing-year sample is small. Use the average of
the two middles for even lengths. Harmless on 5-year histories with
50+ gaps where values cluster, but correct at sparse sample sizes
where the trailing-year branch can have only 2–3 gaps.
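A sketch of the median-gap classifier with the even-length fix; the ~91d quarterly anchor and the true-median behaviour are from the commits, while the exact threshold bands are assumptions:

```ts
// Hedged sketch of the cadence classifier, not the shipped code.
function medianGapDays(timestamps: number[]): number | null {
  if (timestamps.length < 2) return null; // fall back to the count-based path
  const sorted = [...timestamps].sort((a, b) => a - b);
  const gaps = sorted.slice(1)
    .map((t, i) => (t - sorted[i]) / 86_400_000) // ms -> days between payments
    .sort((a, b) => a - b);
  const mid = Math.floor(gaps.length / 2);
  // True median: average the two middles for even-length arrays so a
  // small sample isn't biased toward the slower cadence.
  return gaps.length % 2 === 0 ? (gaps[mid - 1] + gaps[mid]) / 2 : gaps[mid];
}

function paymentsPerYearFromInterval(timestamps: number[]): number {
  const gap = medianGapDays(timestamps);
  if (gap === null) return 0;
  if (gap <= 45) return 12;  // ~30d median -> Monthly
  if (gap <= 135) return 4;  // ~91d median -> Quarterly
  if (gap <= 270) return 2;  // ~182d -> Semi-annual
  return 1;                  // ~365d -> Annual
}
```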
||
|
|
044598346e |
feat(seed-contract): PR 2a — runSeed envelope dual-write + 91 seeders migrated (#3097)
* feat(seed-contract): PR 2a — runSeed envelope dual-write + 91 seeders migrated
Opt-in contract path in runSeed: when opts.declareRecords is provided, write
a {_seed, data} envelope to the canonical key alongside the legacy seed-meta:*
key (dual-write). State machine: OK / OK_ZERO / RETRY, with a zeroIsValid
opt-in. declareRecords throwing or returning a non-integer → hard fail
(contract violation). extraKeys[*] supports a per-key declareRecords; each
extra key writes its own envelope. Legacy seeders (no declareRecords) remain
entirely unchanged.
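A minimal sketch of the state resolution under these rules (names mirror the
description above; the real runSeed also handles validateFn, transforms, and
the extraKeys loop):

    // Hypothetical sketch of the contract state machine described above.
    type SeedState = 'OK' | 'OK_ZERO' | 'RETRY';

    function resolveSeedState(
      declareRecords: (data: unknown) => number,
      data: unknown,
      opts: { zeroIsValid?: boolean } = {},
    ): { state: SeedState; recordCount: number } {
      const recordCount = declareRecords(data); // a throw here is a hard fail
      if (!Number.isInteger(recordCount) || recordCount < 0) {
        // Contract violation: non-integer record counts fail hard.
        throw new Error(`declareRecords returned non-integer: ${recordCount}`);
      }
      if (recordCount > 0) return { state: 'OK', recordCount };
      // Zero records is legitimate only when the seeder opted in.
      return opts.zeroIsValid
        ? { state: 'OK_ZERO', recordCount }
        : { state: 'RETRY', recordCount };
    }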
Migrated all 91 scripts/seed-*.mjs to contract mode. Each exports
declareRecords returning the canonical record count, and passes
schemaVersion: 1 + maxStaleMin (matched to api/health.js SEED_META, or 2.5x
interval where no registry entry exists). Contract conformance reports 84/86
seeders with full descriptor (2 pre-existing warnings).
Legacy seed-meta keys still written so unmigrated readers keep working;
follow-up slices flip health.js + readers to envelope-first.
Tests: 61/61 PR 1 tests still pass.
Next slices for PR 2:
- api/health.js registry collapse + 15 seed-bundle-*.mjs canonicalKey wiring
- reader migration (mcp, resilience, aviation, displacement, regional-snapshot)
- direct writers — ais-relay.cjs, consumer-prices-core publish.ts
- public-boundary stripSeedEnvelope + test migration
Plan: docs/plans/2026-04-14-002-fix-runseed-zero-record-lockout-plan.md
* fix(seed-contract): unwrap envelopes in internal cross-seed readers
After PR 2a enveloped 91 canonical keys as {_seed, data}, every script-side
reader that returned the raw parsed JSON started silently handing callers the
envelope instead of the bare payload. WoW baselines (bigmac, grocery-basket,
fear-greed) saw undefined .countries / .composite; seed-climate-anomalies saw
undefined .normals from climate:zone-normals:v1; seed-thermal-escalation saw
undefined .fireDetections from wildfire:fires:v1; seed-forecasts' ~40-key
pipeline batch returned envelopes for every input.
Fix: route every script-side reader through unwrapEnvelope(...).data. Legacy
bare-shape values pass through unchanged (unwrapEnvelope returns
{_seed: null, data: raw} for any non-envelope shape).
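The passthrough contract in sketch form (assumed shape; the real
unwrapEnvelope lives in the envelope source module):

    // Hypothetical sketch of the legacy-tolerant unwrap.
    interface SeedEnvelope<T> {
      _seed: Record<string, unknown> | null;
      data: T;
    }

    function unwrapEnvelope<T>(raw: unknown): SeedEnvelope<T> {
      if (raw !== null && typeof raw === 'object' && '_seed' in raw && 'data' in raw) {
        const { _seed, data } = raw as { _seed: Record<string, unknown>; data: T };
        return { _seed, data };
      }
      // Legacy bare shape: hand the value back unchanged under a null _seed.
      return { _seed: null, data: raw as T };
    }

Every reader can then take unwrapEnvelope(parsed).data unconditionally.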
Changed:
- scripts/_seed-utils.mjs: import unwrapEnvelope; redisGet, readSeedSnapshot,
verifySeedKey all unwrap. Exported new readCanonicalValue() helper for
cross-seed consumers.
- 18 seed-*.mjs scripts with local redisGet-style helpers or inline fetch
patched to unwrap via the envelope source module (subagent sweep).
- scripts/seed-forecasts.mjs pipeline batch: parse() unwraps each result.
- scripts/seed-energy-spine.mjs redisMget: unwraps each result.
Tests:
- tests/seed-utils-envelope-reads.test.mjs: 7 new cases covering envelope
+ legacy + null paths for readSeedSnapshot and verifySeedKey.
- Full seed suite: 67/67 pass (was 61, +6 new).
Addresses both of user's P1 findings on PR #3097.
* feat(seed-contract): envelope-aware reads in server + api helpers
Every RPC and public-boundary reader now automatically strips _seed from
contract-mode canonical keys. Legacy bare-shape values pass through unchanged
(unwrapEnvelope no-ops on non-envelope shapes).
Changed helpers (one-place fix — unblocks ~60 call sites):
- server/_shared/redis.ts: getRawJson, getCachedJson, getCachedJsonBatch
unwrap by default. cachedFetchJson inherits via getCachedJson.
- api/_upstash-json.js: readJsonFromUpstash unwraps (covers api/mcp.ts
tool responses + all its canonical-key reads).
- api/bootstrap.js: getCachedJsonBatch unwraps (public-boundary —
clients never see envelope metadata).
Left intentionally unchanged:
- api/health.js / api/seed-health.js: read only seed-meta:* keys which
remain bare-shape during dual-write. unwrapEnvelope already imported at
the meta-read boundary (PR 1) as a defensive no-op.
Tests: 67/67 seed tests pass. typecheck + typecheck:api clean.
This is the blast-radius fix the PR #3097 review called out — external
readers that would otherwise see {_seed, data} after the writer side
migrated.
* fix(test): strip export keyword in vm.runInContext'd seed source
cross-source-signals-regulatory.test.mjs loads scripts/seed-cross-source-signals.mjs
via vm.runInContext, which cannot parse ESM `export` syntax. PR 2a added
`export function declareRecords` to every seeder, which broke this test's
static-analysis approach.
Fix: strip the `export` keyword from the declareRecords line in the
preprocessed source string so the function body still evaluates as a plain
declaration.
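Roughly (a self-contained sketch of the technique, not the test's exact
preprocessing or regex):

    import { createContext, runInContext } from 'node:vm';

    // vm contexts evaluate classic scripts, not ES modules, so ESM `export`
    // syntax must be stripped before the source can be parsed.
    const seedSource = 'export function declareRecords(rows) { return rows.length; }';
    const preprocessed = seedSource.replace(/^export\s+/m, '');
    const context = createContext({});
    runInContext(preprocessed, context);
    // declareRecords is now a plain declaration inside the context.
    console.log(runInContext('declareRecords([1, 2, 3])', context)); // 3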
Full test:data suite: 5307/5307 pass. typecheck + typecheck:api clean.
* feat(seed-contract): consumer-prices publish.ts writes envelopes
Wrap the 5 canonical keys written by consumer-prices-core/src/jobs/publish.ts
(overview, movers:7d/30d, freshness, categories:7d/30d/90d, retailer-spread,
basket-series) in {_seed, data} envelopes. Legacy seed-meta:<key> writes
preserved for dual-write.
Inlined a buildEnvelope helper (10 lines) rather than taking a cross-package
dependency — consumer-prices-core is a standalone npm package. Documented the
four-file parity contract (mjs source, ts mirror, js edge mirror, this copy).
Contract fields: sourceVersion='consumer-prices-core-publish-v1', schemaVersion=1,
state='OK' (recordCount>0) or 'OK_ZERO' (legitimate zero).
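Approximately (a sketch of the inlined helper; field values per the contract
above, the fetchedAt timestamp format is an assumption):

    // Hypothetical sketch of the inlined ~10-line helper.
    function buildEnvelope<T>(data: T, recordCount: number) {
      return {
        _seed: {
          sourceVersion: 'consumer-prices-core-publish-v1',
          schemaVersion: 1,
          state: recordCount > 0 ? 'OK' : 'OK_ZERO', // zero is legitimate here
          recordCount,
          fetchedAt: new Date().toISOString(), // assumed ISO-8601
        },
        data,
      };
    }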
Typecheck: no new errors in publish.ts.
* fix(seed-contract): 3 more server-side readers unwrap envelopes
Found during final audit:
- server/worldmonitor/resilience/v1/_shared.ts: resilience score reader
parsed cached GetResilienceScoreResponse raw. Contract-mode seed-resilience-scores
now envelopes those keys.
- server/worldmonitor/resilience/v1/get-resilience-ranking.ts: p05/p95
interval lookup parsed raw from seed-resilience-scores' extra-key path.
- server/worldmonitor/infrastructure/v1/_shared.ts: mgetJson() used for
count-source keys (wildfire:fires:v1, news:insights:v1) which are both
contract-mode now.
All three now unwrap via server/_shared/seed-envelope. Legacy shapes pass
through unchanged.
Typecheck clean.
* feat(seed-contract): ais-relay.cjs direct writes produce envelopes
32 canonical-key write sites in scripts/ais-relay.cjs now produce {_seed, data}
envelopes. Inlined buildEnvelope() (CJS module can't require ESM source) +
envelopeWrite(key, data, ttlSeconds, meta) wrapper. Enveloped keys span market
bootstrap, aviation, cyber-threats, theater-posture, weather-alerts, economic
spending/fred/worldbank, tech-events, corridor-risk, usni-fleet, shipping-stress,
social:reddit, wsb-tickers, pizzint, product-catalog, chokepoint transits,
ucdp-events, satellites, oref.
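In outline (TypeScript sketch for consistency with the other sketches; the
real file is CJS, and upstashSet is stubbed here so the sketch stands alone):

    // Stub for the relay's existing Upstash setter.
    async function upstashSet(key: string, value: string, ttlSeconds: number): Promise<void> {
      console.log(`SET ${key} EX ${ttlSeconds} (${value.length} bytes)`);
    }

    // Hypothetical sketch of the envelopeWrite(key, data, ttlSeconds, meta) wrapper.
    async function envelopeWrite(
      key: string,
      data: unknown,
      ttlSeconds: number,
      meta: { sourceVersion: string; recordCount: number },
    ): Promise<void> {
      const envelope = {
        _seed: { ...meta, schemaVersion: 1, fetchedAt: new Date().toISOString() },
        data,
      };
      await upstashSet(key, JSON.stringify(envelope), ttlSeconds);
    }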
Left bare (not seeded data keys): seed-meta:* (dual-write legacy),
classifyCacheKey LLM cache, notam:prev-closed-state internal state,
wm:notif:scan-dedup flags.
Updated tests/ucdp-seed-resilience.test.mjs regex to accept both upstashSet
(pre-contract) and envelopeWrite (post-contract) call patterns.
* feat(seed-contract): 15 bundle files add canonicalKey for envelope gate
54 bundle sections across 12 files now declare canonicalKey alongside the
existing seedMetaKey. _bundle-runner.mjs (from PR 1) prefers canonicalKey
when both are present — section runs are gated on envelope._seed.fetchedAt
read directly from the data key, eliminating the meta-outlives-data class
of bugs.
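The gate in sketch form (maxStaleMin semantics assumed; names illustrative):

    // Hypothetical sketch of the envelope-first gate: decide whether a bundle
    // section needs to run by reading _seed.fetchedAt off the data key itself.
    function sectionIsFresh(
      envelope: { _seed?: { fetchedAt?: string } | null } | null,
      maxStaleMin: number,
      now: number = Date.now(),
    ): boolean {
      const fetchedAt = envelope?._seed?.fetchedAt;
      if (!fetchedAt) return false; // no envelope: caller falls back to seedMetaKey
      return now - Date.parse(fetchedAt) < maxStaleMin * 60_000;
    }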
Files touched:
- climate (5), derived-signals (2), ecb-eu (3), energy-sources (6),
health (2), imf-extended (4), macro (10), market-backup (9),
portwatch (4), relay-backup (2), resilience-recovery (5), static-ref (2)
Skipped (14 sections, 3 whole bundles): multi-key writers, dynamic
templated keys (displacement year-scoped), or non-runSeed orchestrators
(regional brief cron, resilience-scores' 222-country publish, validation/
benchmark scripts). These continue to use seedMetaKey or their own gate.
seedMetaKey preserved everywhere — dual-write. _bundle-runner.mjs falls
back to legacy when canonicalKey is absent.
All 15 bundles pass node --check. test:data: 5307/5307. typecheck:all: clean.
* fix(seed-contract): 4 PR #3097 review P1s — transform/declareRecords mismatches + envelope leaks
Addresses both P1 findings and the extra-key seed-meta leak surfaced in review:
1. runSeed helper-level invariant: seed-meta:* keys NEVER envelope.
scripts/_seed-utils.mjs exports shouldEnvelopeKey(key) — returns false for
any key starting with 'seed-meta:'. Both atomicPublish (canonical) and
writeExtraKey (extras) gate the envelope wrap through this helper. Fixes
seed-iea-oil-stocks' ANALYSIS_META_EXTRA_KEY silently getting enveloped,
which broke health.js parsing the value as bare {fetchedAt, recordCount}.
Also defends against any future manual writeExtraKey(..., envelopeMeta)
call that happens to target a seed-meta:* key.
2. seed-token-panels canonical + extras fixed.
publishTransform returns data.defi (the defi panel itself, shape {tokens}).
Old declareRecords counted data.defi.tokens + data.ai.tokens + data.other.tokens
on the transformed payload → 0 → RETRY path → canonical market:defi-tokens:v1
never wrote, and because runSeed returned before the extraKeys loop,
market:ai-tokens:v1 + market:other-tokens:v1 stayed stale too.
New: declareRecords counts data.tokens on the transformed shape. AI_KEY +
OTHER_KEY extras reuse the same function (transforms return structurally
identical panels). Added isMain guard so test imports don't fire runSeed.
3. api/product-catalog.js cached reader unwraps envelope.
ais-relay.cjs now envelopes product-catalog:v2 via envelopeWrite(). The
edge reader did raw JSON.parse(result) and returned {_seed, data} to
clients, breaking the cached path. Fix: import unwrapEnvelope from
./_seed-envelope.js, apply after JSON.parse. One site — :238-241 is
downstream of getFromCache(), so the single reader fix covers both.
4. Regression lock tests/seed-contract-transform-regressions.test.mjs (11 cases):
- shouldEnvelopeKey invariant: seed-meta:* false, canonical true
- Token-panels declareRecords works on transformed shape (canonical + both extras)
- Explicit repro of pre-fix buggy signature returning 0 — guards against revert
- resolveRecordCount accepts 0, rejects non-integer
- Product-catalog envelope unwrap returns bare shape; legacy passes through
Verification:
- npm run test:data → 5318/5318 pass (was 5307 — 11 new regressions)
- npm run typecheck:all → clean
- node --check on every modified script
iea-oil-stocks canonical declareRecords was NOT broken (user confirmed during
review — buildIndex preserves .members); only its ANALYSIS_META_EXTRA_KEY
was affected, now covered generically by commit 1's helper invariant.
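For reference, that shouldEnvelopeKey invariant is deliberately tiny
(sketched; the real helper lives in scripts/_seed-utils.mjs):

    // Hypothetical sketch: seed-meta:* keys stay bare-shape; everything else
    // is eligible for the {_seed, data} envelope wrap.
    function shouldEnvelopeKey(key: string): boolean {
      return !key.startsWith('seed-meta:');
    }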
* fix(seed-contract): seed-token-panels validateFn also runs on post-transform shape
Review finding: fixing declareRecords wasn't sufficient — atomicPublish() runs
validateFn(publishData) on the transformed payload too. seed-token-panels'
validate() checked data.defi/.ai/.other on the transformed {tokens} shape,
returned false, and runSeed took the early skipped-write branch (before even
reaching the declareRecords RETRY logic). Net effect: same as before the
declareRecords fix — canonical + both extras stayed stale.
Fix: validate() now checks the canonical defi panel directly
(Array.isArray(data?.tokens) && has at least one t.price > 0). AI/OTHER
panels are validated implicitly by their own extraKey declareRecords on write.
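In sketch form (shapes assumed from the description above):

    // Hypothetical sketch of the post-transform validate().
    interface TokenPanel {
      tokens?: Array<{ price?: number }>;
    }

    function validate(data: TokenPanel | null | undefined): boolean {
      const tokens = data?.tokens;
      // Transformed payload is a single panel: {tokens}, at least one priced.
      return Array.isArray(tokens) && tokens.some((t) => (t.price ?? 0) > 0);
    }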
Audited the other 9 seeders with publishTransform (bls-series, bis-extended,
bis-data, gdelt-intel, trade-flows, iea-oil-stocks, jodi-gas, sanctions-pressure,
forecasts): all validateFn's correctly target the post-transform shape. Only
token-panels regressed.
Added 4 regression tests (tests/seed-contract-transform-regressions.test.mjs):
- validate accepts transformed panel with priced tokens
- validate rejects all-zero-price tokens
- validate rejects empty/missing tokens
- Explicit pre-fix repro (buggy old signature fails on transformed shape)
Verification:
- npm run test:data → 5322/5322 pass (was 5318; +4 new)
- npm run typecheck:all → clean
- node --check clean
* feat(seed-contract): add /api/seed-contract-probe validation endpoint
Single machine-readable gate for 'is PR #3097 working in production'.
Replaces the curl/jq ritual with one authenticated edge call that returns
HTTP 200 ok:true or 503 + failing check list.
What it validates:
- 8 canonical keys have {_seed, data} envelopes with required data fields
and minRecords floors (fsi-eu, zone-normals, 3 token panels + minRecords
guard against token-panels RETRY regression, product-catalog, wildfire,
earthquakes).
- 2 seed-meta:* keys remain BARE (shouldEnvelopeKey invariant; guards
against iea-oil-stocks ANALYSIS_META_EXTRA_KEY-class regressions).
- /api/product-catalog + /api/bootstrap responses contain no '_seed' leak.
Auth: x-probe-secret header must match RELAY_SHARED_SECRET (reuses existing
Vercel↔Railway internal trust boundary).
Probe logic is exported (checkProbe, checkPublicBoundary, DEFAULT_PROBES) for
hermetic testing. tests/seed-contract-probe.test.mjs covers every branch:
envelope pass/fail on field/records/shape, bare pass/fail on shape/field,
missing/malformed JSON, Redis non-2xx, boundary seed-leak detection,
DEFAULT_PROBES sanity (seed-meta invariant present, token-panels minRecords
guard present).
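A single envelope/bare probe evaluation, sketched (descriptor fields per the
checks above; the exact checkProbe signature is an assumption):

    // Hypothetical sketch of a single probe evaluation.
    interface Probe {
      key: string;
      shape: 'envelope' | 'bare';
      dataHas?: string[];
      minRecords?: number;
    }

    function checkProbe(probe: Probe, raw: unknown): { ok: boolean; reason?: string } {
      if (raw === null || typeof raw !== 'object') {
        return { ok: false, reason: 'missing-or-malformed' };
      }
      const value = raw as Record<string, unknown>;
      const isEnvelope = '_seed' in value && 'data' in value;
      if (probe.shape === 'bare') {
        // shouldEnvelopeKey invariant: seed-meta:* must stay bare.
        return isEnvelope ? { ok: false, reason: 'unexpected-envelope' } : { ok: true };
      }
      if (!isEnvelope) return { ok: false, reason: 'not-enveloped' };
      const data = value.data;
      if (data === null || typeof data !== 'object') {
        return { ok: false, reason: 'data-not-object' };
      }
      for (const field of probe.dataHas ?? []) {
        if (!(field in data)) return { ok: false, reason: `missing:${field}` };
      }
      const recordCount = Number((value._seed as { recordCount?: unknown })?.recordCount ?? 0);
      if (probe.minRecords != null && recordCount < probe.minRecords) {
        return { ok: false, reason: `records:${recordCount}<${probe.minRecords}` };
      }
      return { ok: true };
    }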
Usage:
curl -H "x-probe-secret: $RELAY_SHARED_SECRET" \
https://api.worldmonitor.app/api/seed-contract-probe
PR 3 will extend the probe with a stricter mode that asserts seed-meta:*
keys are GONE (not just bare) once legacy dual-write is removed.
Verification:
- tests/seed-contract-probe.test.mjs → 15/15 pass
- npm run test:data → 5338/5338 (was 5322; +16 new incl. conformance)
- npm run typecheck:all → clean
* fix(seed-contract): tighten probe — minRecords on AI/OTHER + cache-path source header
Review P2 findings: the probe's stated guards were weaker than advertised.
1. market:ai-tokens:v1 + market:other-tokens:v1 probes claimed to guard the
token-panels extra-key RETRY regression but only checked shape='envelope'
+ dataHas:['tokens']. If an extra-key declareRecords regressed to 0, both
probes would still pass because checkProbe() only inspects _seed.recordCount
when minRecords is set. Now both enforce minRecords: 1.
2. /api/product-catalog boundary check only asserted no '_seed' leak — which
is also true for the static fallback path. A broken cached reader
(getFromCache returning null or throwing) could serve fallback silently
and still pass this probe. Now:
- api/product-catalog.js emits X-Product-Catalog-Source: cache|dodo|fallback
on the response (the json() helper gained an optional source param wired
to each of the three branches).
- checkPublicBoundary declaratively requires that header's value match
'cache' for /api/product-catalog, so a fallback-serve fails the probe
with reason 'source:fallback!=cache' or 'source:missing!=cache'.
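The header assertion in sketch form (checkSourceHeader is a hypothetical
name standing in for the relevant branch of checkPublicBoundary; the
BOUNDARY_CHECKS entry shape is assumed):

    // Hypothetical sketch of the requireSourceHeader branch.
    function checkSourceHeader(
      res: Response,
      required: string | undefined,
      headerName = 'X-Product-Catalog-Source',
    ): { ok: boolean; reason?: string } {
      if (!required) return { ok: true };
      const actual = res.headers.get(headerName);
      if (actual === required) return { ok: true };
      // Produces 'source:fallback!=cache' or 'source:missing!=cache'.
      return { ok: false, reason: `source:${actual ?? 'missing'}!=${required}` };
    }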
Test updates (tests/seed-contract-probe.test.mjs):
- Boundary check reworked to use a BOUNDARY_CHECKS config with optional
requireSourceHeader per endpoint.
- New cases: served-from-cache passes, served-from-fallback fails with source
mismatch, missing header fails, seed-leak still takes precedence, bad
status fails.
- Token-panels sanity test now asserts minRecords≥1 on all 3 panels.
Verification:
- tests/seed-contract-probe.test.mjs → 17/17 pass (was 15, +2 net)
- npm run test:data → 5340/5340
- npm run typecheck:all → clean
|