Commit Graph

3471 Commits

Elie Habib
63ef0dd0f1 docs(mintlify): finish PR 1 — landing rewrite, features refresh, maritime link-out
Completes the PR 1 items from
docs/plans/2026-04-19-001-feat-docs-user-facing-ia-refresh-plan.md
that were deferred after the checkpoint on
Route Explorer + Scenario Engine + CRI nav. No new pages — only edits
to existing pages to point at and cohere with the new workflow pages.

- documentation.mdx: landing rewrite. Dropped brittle counts (344
  news sources, 49 layers, 24 CII countries, 31+ sources, 24 typed
  services) in favor of durable product framing. Surfaced the
  shipped differentiators that were invisible on the landing
  previously: Country Resilience Index (222 countries, linked to
  its methodology page), AI daily brief, Route Explorer,
  Scenario Engine, MCP server. Kept CII and CRI as two distinct
  country-risk surfaces — do not conflate.
- features.mdx: replaced the 'all 55 panels' Cmd+K claim and the
  stale inventory list with family-grouped descriptions that
  include the panels this audit surfaced as missing
  (disease-outbreaks, radiation-watch, thermal-escalation, consumer-prices,
  latest-brief, forecast, country-resilience). Added a Workflows
  section linking to Route Explorer and Scenario Engine, and a
  Country-level risk section linking CII + CRI. Untouched
  sections (map, marker clustering, data layers, export, monitors,
  activity tracking) left as-is.
- maritime-intelligence.mdx: collapsed the embedded Route Explorer
  subsection to a one-paragraph pointer at /route-explorer so the
  standalone page is the canonical home.

The Panels nav group is intentionally left out; it waits on PR 2
content to avoid rendering an empty group in Mintlify.
2026-04-19 17:09:05 +04:00
Elie Habib
233835e206 docs(mintlify): fix stale line cite (MapContainer.activateScenario at :1010)
Greptile review P2: prose cited MapContainer.ts:1004 but activateScenario
is declared at :1010. Line 1004 landed inside the JSDoc block.
2026-04-19 17:03:21 +04:00
Elie Habib
bffa1f4498 docs(mintlify): fix PRO auth contract (trusted origin ≠ PRO)
- api-scenarios: 'X-WorldMonitor-Key (or trusted browser origin)
  + PRO' was wrong — isCallerPremium() explicitly skips
  trusted-origin short-circuits (keyCheck.required === false) and
  only counts (a) an env-valid or user-owned wm_-prefixed API key
  with apiAccess entitlement, or (b) a Clerk bearer with role=pro
  or Dodo tier ≥ 1. Browser calls work because premiumFetch()
  injects one of those credentials per request, not because Origin
  alone authenticates. Per server/_shared/premium-check.ts:34 and
  src/services/premium-fetch.ts:66.
- usage-auth: strengthened the 'Entitlement / tier gating' section
  to state outright that authentication and PRO entitlement are
  orthogonal, and that trusted Origin is NOT accepted as PRO even
  though it is accepted for public endpoints. Listed the two real
  credential forms that pass the gate.
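The "authentication and PRO entitlement are orthogonal" contract reduces to a small predicate. A minimal sketch under assumed parameter names (the real logic is isCallerPremium() in server/_shared/premium-check.ts:34):

```javascript
// Sketch of the documented entitlement contract (names are illustrative;
// the real check is isCallerPremium() in server/_shared/premium-check.ts).
function isCallerPremiumSketch({ apiKey, hasApiAccess, clerkRole, dodoTier }) {
  // (a) an env-valid or user-owned wm_-prefixed key with apiAccess
  if (apiKey && apiKey.startsWith('wm_') && hasApiAccess) return true;
  // (b) a Clerk bearer with role=pro, or a Dodo tier >= 1
  if (clerkRole === 'pro' || (dodoTier ?? 0) >= 1) return true;
  // A trusted browser Origin never reaches this gate as a credential:
  // it satisfies public-endpoint auth, not the PRO entitlement.
  return false;
}
```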
2026-04-19 17:01:45 +04:00
Elie Habib
80ede691cf docs(mintlify): fix fourth-round findings (banner DOM, webhook TTL refresh)
- scenario-engine: accurate description of the rendered scenario
  banner. Always-present elements are the ⚠ icon, scenario name,
  top-5 impacted countries with impact %, and dismiss ×. Params
  chip (e.g. '14d · +110% cost') and 'Simulating …' tagline are
  conditional on the worker result carrying template parameters
  (durationDays, disruptionPct, costShockMultiplier). The banner
  never lists affected chokepoints by name — the map and the
  chokepoint cards surface those. Per renderScenarioBanner at
  src/components/SupplyChainPanel.ts:750.
- api-shipping-v2 (webhook TTL): register extends both the record
  and the owner-index set's 30-day TTL via atomic pipeline
  (SET + SADD + EXPIRE). rotate-secret and reactivate only
  extend the record's TTL — neither touches the owner-index set,
  so the owner index can expire independently if a caller only
  rotates/reactivates within a 30-day window. Re-register to keep
  both alive. Per api/v2/shipping/webhooks.ts:230 (register
  pipeline) and :325 (rotate setCachedJson on record only).
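The register-vs-rotate TTL asymmetry can be sketched as two pipeline builders (Upstash-style command arrays; key names, record shape, and the 30-day constant are assumptions based only on the summary above):

```javascript
// Sketch of the two TTL behaviours (illustrative; the real pipeline
// lives in api/v2/shipping/webhooks.ts).
const THIRTY_DAYS = 30 * 24 * 60 * 60; // seconds

function registerCommands(id, ownerId, record) {
  // register refreshes BOTH the record and the owner-index set atomically
  return [
    ['SET', `webhook:${id}`, JSON.stringify(record), 'EX', THIRTY_DAYS],
    ['SADD', `webhook:owner:${ownerId}`, id],
    ['EXPIRE', `webhook:owner:${ownerId}`, THIRTY_DAYS],
  ];
}

function rotateCommands(id, record) {
  // rotate-secret / reactivate only re-SET the record; the owner-index
  // set keeps its old expiry and can lapse independently
  return [['SET', `webhook:${id}`, JSON.stringify(record), 'EX', THIRTY_DAYS]];
}
```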
2026-04-19 16:57:46 +04:00
Elie Habib
fc7d46829f docs(mintlify): fix third-round review findings (real IDs + 4-state lifecycle)
- api-scenarios (template example): replaced invented
  hormuz-closure-30d / ["hormuz"] with the actually-shipped
  hormuz-tanker-blockade / ["hormuz_strait"] from
  scenario-templates.ts:80. Listed the other 5 shipped template IDs so
  scripted users aren't dependent on a single example.
- api-scenarios (status lifecycle): worker writes FOUR states,
  not three. Added the intermediate "processing" state with
  startedAt, written by the worker at job pickup
  (scenario-worker.mjs:411). Lifecycle now: pending → processing →
  done|failed. Both pending and processing are non-terminal.
- scenario-engine (scripted use blurb): mirror the 4-state
  language and link into the lifecycle table.
- scenario-engine (UI dismiss): replaced "Click Deactivate"
  with the actual × dismiss control on the scenario banner
  (aria-label: "Dismiss scenario") per
  src/components/SupplyChainPanel.ts:790. Also described the
  banner contents (name, chokepoints, countries, tagline).
- api-shipping-v2: while fixing chokepoint IDs, also corrected
  "hormuz" → "hormuz_strait" and "bab-el-mandeb" → "bab_el_mandeb"
  across all four occurrences in the shipping v2 page (from
  PR #3209). Real IDs come from
  server/_shared/chokepoint-registry.ts (snake_case, not kebab-case,
  not bare "hormuz").
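The corrected four-state lifecycle pins down as a tiny transition table — a sketch of the documented contract, not the worker's code:

```javascript
// pending → processing → done|failed, with pending and processing
// non-terminal (sketch of the contract described above).
const TRANSITIONS = {
  pending: ['processing'],        // worker picks the job up
  processing: ['done', 'failed'], // worker finishes or errors
  done: [],                       // terminal
  failed: [],                     // terminal
};

const isTerminal = (s) => TRANSITIONS[s].length === 0;
const canTransition = (from, to) => TRANSITIONS[from].includes(to);
```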
2026-04-19 16:49:43 +04:00
Elie Habib
6380245f21 docs(mintlify): fix Route Explorer + Scenario Engine review findings
Reviewer caught 4 cases where I described behavior I hadn't read
carefully. All fixes cross-checked against source.

- route-explorer (free-tier): the workflow does NOT blur a numeric
  payload behind a public demo route. On free tier, fetchLane()
  short-circuits to renderFreeGate() which blurs the left rail,
  replaces the tab area with an Upgrade-to-PRO card, and applies a
  generic public-route highlight on the map. No lane data is
  rendered in any tab. See src/components/RouteExplorer/
  RouteExplorer.ts:212 + :342.
- route-explorer (keyboard): Tab / Shift+Tab moves focus between the
  panel and the map. Direct field jumps are F (From), T (To), P
  (Product/HS2), not Tab-cycling. Also added the full KeyboardHelp
  binding list (S swap, ↑/↓ list nav, Enter commit, Cmd+, copy URL,
  Esc close, ? help, 1-4 tabs). See src/components/RouteExplorer/
  KeyboardHelp.ts:9 and RouteExplorer.ts:623.
- scenario-engine: the SCENARIO_TEMPLATES array only ships templates
  of 4 types today (conflict, weather, sanctions, tariff_shock).
  The ScenarioType union includes infrastructure and pandemic but
  no templates of those types ship. Dropped them from the shipped
  table and noted the type union leaves room for future additions.
- scenario-engine + api-scenarios: the worker writes
  status: 'done' (not 'completed') on success, 'failed' on error;
  pending is synthesised by the status endpoint when no worker
  record exists. Fixed both the new workflow page and the merged
  api-scenarios.mdx completed-response example + polling language.
  See scripts/scenario-worker.mjs:421 and
  src/components/SupplyChainPanel.ts:870.
2026-04-19 15:56:23 +04:00
Elie Habib
44bc40ee34 docs(mintlify): add Route Explorer + Scenario Engine workflow pages
Checkpoint for review on the IA refresh (per plan
docs/plans/2026-04-19-001-feat-docs-user-facing-ia-refresh-plan.md).

- docs/docs.json: link Country Resilience Index methodology under
  Intelligence & Analysis so the flagship 222-country feature is
  reachable from the main nav (previously orphaned). Add a new
  Workflows group containing route-explorer and scenario-engine.
- docs/route-explorer.mdx: standalone workflow page. Who it is for,
  Cmd+K entry, four tabs (Current / Alternatives / Land / Impact),
  inputs, keyboard bindings, map-state integration, PRO gating
  with free-tier blur + public-route highlight, data sources.
- docs/scenario-engine.mdx: standalone workflow page. Template
  categories (conflict / weather / sanctions / tariff_shock /
  infrastructure / pandemic), how a scenario activates on the map,
  PRO gating, pointers to the async job API.

Deferred to follow-up commits in the same PR:
  - documentation.mdx landing rewrite
  - features.mdx refresh
  - maritime-intelligence.mdx link-out to Route Explorer
  - Panels nav group (waits for PR 2 content)

All content grounded in live source files cited inline.
2026-04-19 15:27:27 +04:00
Elie Habib
e4c95ad9be docs(mintlify): cover MCP, OAuth, non-RPC endpoints, and usage (#3209)
* docs(mintlify): cover MCP, OAuth, non-RPC endpoints, and usage

Audit against api/ + proto/ revealed 9 OpenAPI specs missing from nav,
the scenario/v1 service undocumented, and MCP (32 tools + OAuth 2.1 flow)
with no user-facing docs. The stale Docs_To_Review/API_REFERENCE.md still
pointed at pre-migration endpoints that no longer exist.

- Wire 9 orphaned specs into docs.json: ConsumerPrices, Forecast, Health,
  Imagery, Radiation, Resilience, Sanctions, Thermal, Webcam
- Hand-write ScenarioService.openapi.yaml (3 RPCs) until it's proto-backed
  (tracked in issue #3207)
- New MCP page with tool catalog + client setup (Claude Desktop/web, Cursor)
- New MDX for OAuth, Platform, Brief, Commerce, Notifications, Shipping v2,
  Proxies
- New Usage group: quickstart, auth matrix, rate limits, errors
- Remove docs/Docs_To_Review/API_REFERENCE.md and EXTERNAL_APIS.md
  (referenced dead endpoints); add README flagging dir as archival

* docs(mintlify): move scenario docs out of generated docs/api/ tree

The pre-push hook enforces that docs/api/ is proto-generated only.
Replace the hand-written ScenarioService.openapi.yaml with a plain
MDX page (docs/api-scenarios.mdx) until the proto migration lands
(tracked in issue #3207).

* docs(mintlify): fix factual errors flagged in PR review

Reviewer caught 5 endpoints where I speculated on shape/method/limits
instead of reading the code. All fixes cross-checked against the
source:

- api-shipping-v2: route-intelligence is GET with query params
  (fromIso2, toIso2, cargoType, hs2), not POST with a JSON body.
  Response shape is {primaryRouteId, chokepointExposures[],
  bypassOptions[], warRiskTier, disruptionScore, ...}.
- api-commerce: /api/product-catalog returns {tiers, fetchedAt,
  cachedUntil, priceSource} with tier groups free|pro|api_starter|
  enterprise, not the invented {currency, plans}. Document the
  DELETE purge path too.
- api-notifications: Slack/Discord /oauth/start are POST + Clerk
  JWT + PRO (returning {oauthUrl}), not GET redirects. Callbacks
  remain GET.
- api-platform: /api/version returns the latest GitHub Release
  ({version, tag, url, prerelease}), not deployed commit/build
  metadata.
- api-oauth + mcp: /api/oauth/register limit is 5/60s/IP (match
  code), not 10/hour.

Also caught while double-checking: /api/register-interest and
/api/contact are 5/60min and 3/60min respectively (1-hour window,
not 1-minute). Both require Turnstile. Removed the fabricated
limits for share-url, notification-channels, create-checkout
(they fall back to the default per-IP limit).

* docs(mintlify): second-round fixes — verify every claim against source

Reviewer caught 7 more cases where I described API behavior I hadn't
read. Each fix below cross-checked against the handler.

- api-commerce (product-catalog): tiers are flat objects with
  monthlyPrice/annualPrice/monthlyProductId/annualProductId on paid
  tiers, price+period for free, price:null for enterprise. There is
  no nested plans[] array.
- api-commerce (referral/me): returns {code, shareUrl}, not counts.
  Code is a deterministic 8-char HMAC of the Clerk userId; binding
  into Convex is fire-and-forget via ctx.waitUntil.
- api-notifications (notification-channels): actual action set is
  create-pairing-token, set-channel, set-web-push, delete-channel,
  set-alert-rules, set-quiet-hours, set-digest-settings. Replaced
  the made-up list.
- api-shipping-v2 (webhooks): alertThreshold is numeric 0-100
  (default 50), not a severity string. Subscriber IDs are wh_+24hex;
  secret is raw 64-char hex (no whsec_ prefix). POST registration
  returns 201. Added the management routes: GET /{id},
  POST /{id}/rotate-secret, POST /{id}/reactivate.
- api-platform (cache-purge): auth is Authorization: Bearer
  RELAY_SHARED_SECRET, not an admin-key header. Body takes keys[]
  and/or patterns[] (not {key} or {tag}), with explicit per-request
  caps and prefix-blocklist behavior.
- api-platform (download): platform+variant query params, not
  file=<id>. Response is a 302 to a GitHub release asset; documented
  the full platform/variant tables.
- mcp: server also accepts direct X-WorldMonitor-Key in addition to
  OAuth bearer. Fixed the curl example which was incorrectly sending
  a wm_live_ API key as a bearer token.
- api-notifications (youtube/live): handler reads channel or videoId,
  not channelId.
- usage-auth: corrected the auth-matrix row for /api/mcp to reflect
  that OAuth is one of two accepted modes.
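The deterministic referral code described above can be sketched with Node's HMAC primitive (hash choice, secret name, and hex truncation are assumptions — only "deterministic 8-char HMAC of the Clerk userId" comes from the source):

```javascript
import { createHmac } from 'node:crypto';

// Illustrative: same userId + secret always yields the same 8-char code.
function referralCode(userId, secret) {
  return createHmac('sha256', secret).update(userId).digest('hex').slice(0, 8);
}
```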

* docs(mintlify): fix Greptile review findings

- mcp.mdx: 'Five' slow tools → 'Six' (list contains 6 tools)
- api-scenarios.mdx: replace invalid JSON numeric separator
  (8_400_000_000) with plain integer (8400000000)
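Numeric separators like 8_400_000_000 are a JavaScript literal feature, not part of JSON — a quick check of why the example had to change:

```javascript
// JSON.parse rejects underscores in numbers; the plain integer parses.
let threw = false;
try {
  JSON.parse('{"impactUsd": 8_400_000_000}');
} catch (e) {
  threw = true; // SyntaxError
}
console.log(threw); // true
console.log(JSON.parse('{"impactUsd": 8400000000}').impactUsd); // 8400000000
```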

Greptile's third finding — /api/oauth/register rate-limit contradiction
across api-oauth.mdx / mcp.mdx / usage-rate-limits.mdx — was already
resolved in commit 4f2600b2a (reviewed commit was eb5654647).
2026-04-19 15:03:16 +04:00
Elie Habib
38e6892995 fix(brief): per-run slot URL so same-day digests link to distinct briefs (#3205)
* fix(brief): per-run slot URL so same-day digests link to distinct briefs

Digest emails at 8am and 1pm on the same day pointed to byte-identical
magazine URLs because the URL was keyed on YYYY-MM-DD in the user tz.
Each compose run overwrote the single daily envelope in place, and the
composer's rolling 24h story window meant afternoon output often looked
identical to morning. Readers clicking an older email got whatever the
latest cron happened to write.

Slot format is now YYYY-MM-DD-HHMM (local tz, per compose run). The
magazine URL, carousel URLs, and Redis key all carry the slot, and each
digest dispatch gets its own frozen envelope that lives out the 7d TTL.
envelope.data.date stays YYYY-MM-DD for rendering "19 April 2026".

The digest cron also writes a brief:latest:{userId} pointer (7d TTL,
overwritten each compose) so the dashboard panel and share-url endpoint
can locate the most recent brief without knowing the slot. The
previous date-probing strategy does not work once keys carry HHMM.

No back-compat for the old YYYY-MM-DD format: the verifier rejects it,
the composer only ever writes the new shape, and any in-flight
notifications signed under the old format will 403 on click. Acceptable
at the rollout boundary per product decision.
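A slot of the YYYY-MM-DD-HHMM shape can be derived per compose run roughly like this (function name and the Intl usage are illustrative, not the composer's code):

```javascript
// Sketch: per-run slot in the user's tz (only the slot shape comes
// from the change above).
function briefSlot(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-CA', {
    timeZone,
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', hourCycle: 'h23',
  }).formatToParts(date);
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}-${get('hour')}${get('minute')}`;
}

briefSlot(new Date(Date.UTC(2026, 3, 19, 4, 0)), 'Asia/Dubai'); // '2026-04-19-0800'
```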

* fix(brief): carve middleware bot allowlist to accept slot-format carousel path

BRIEF_CAROUSEL_PATH_RE in middleware.ts was still matching only the
pre-slot YYYY-MM-DD segment, so every slot-based carousel URL emitted
by the digest cron (YYYY-MM-DD-HHMM) would miss the social allowlist
and fall into the generic bot gate. Telegram/Slack/Discord/LinkedIn
image fetchers would 403 on sendMediaGroup, breaking previews for the
new digest links.

CI missed this because tests/middleware-bot-gate.test.mts still
exercised the old /YYYY-MM-DD/ path shape. Swap the fixture to the
slot format and add a regression asserting the pre-slot shape is now
rejected, so legacy links cannot silently leak the allowlist after
the rollout.
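A slot-aware allowlist pattern of the shape described would look roughly like this (the real BRIEF_CAROUSEL_PATH_RE in middleware.ts may differ in detail):

```javascript
// Illustrative: accept slot-format carousel paths, reject the pre-slot
// YYYY-MM-DD shape so legacy links cannot leak the allowlist.
const SLOT_CAROUSEL_RE = /^\/api\/brief\/carousel\/[^/]+\/\d{4}-\d{2}-\d{2}-\d{4}\/\d+$/;

console.log(SLOT_CAROUSEL_RE.test('/api/brief/carousel/user_123/2026-04-19-0800/1')); // true
console.log(SLOT_CAROUSEL_RE.test('/api/brief/carousel/user_123/2026-04-19/1'));      // false
```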

* fix(brief): preserve caller-requested slot + correct no-brief share-url error

Two contract bugs in the slot rollout that silently misled callers:

1. GET /api/latest-brief?slot=X where X has no envelope was returning
   { status: 'composing', issueDate: <today UTC> } — which reads as
   "today's brief is composing" instead of "the specific slot you
   asked about doesn't exist". A caller probing a known historical
   slot would get a completely unrelated "today" signal. Now we echo
   the requested slot back (issueSlot + issueDate derived from its
   date portion) when the caller supplied ?slot=, and keep the
   UTC-today placeholder only for the no-param path.

2. POST /api/brief/share-url with no slot and no latest-pointer was
   falling into the generic invalid_slot_shape 400 branch. That is
   not an input-shape problem; it is "no brief exists yet for this
   user". Return 404 brief_not_found — the same code the
   existing-envelope check returns — so callers get one coherent
   contract: either the brief exists and is shareable, or it doesn't
   and you get 404.
2026-04-19 14:15:59 +04:00
Elie Habib
56054bfbc1 fix(brief): use wildcard glob in vercel.json functions key (PR #3204 follow-up) (#3206)
* fix(brief): use wildcard glob in vercel.json functions key

PR #3204 shipped the right `includeFiles` value but the WRONG key:

  "api/brief/carousel/[userId]/[issueDate]/[page].ts"

Vercel's `functions` config keys are micromatch globs, not literal
paths. Bracketed segments like `[userId]` are parsed as character
classes (match any ONE character from {u,s,e,r,I,d}), so my rule
matched zero files and `includeFiles` was silently ignored.
Post-merge probe still returned HTTP 500 FUNCTION_INVOCATION_FAILED on
every request. Build log shows zero mentions of `carousel` or
`resvg` — corroborates the key never applied.

Fix: wildcard path segments.

  "api/brief/carousel/**"

Matches any file under the carousel route dir. Since the only
deployed file there is the dynamic-segment handler, the effective
scope is identical to what I originally intended.

Added a second regression test that sweeps every functions key and
fails loudly if any bracketed segment slips back in. Guards against
future reverts AND against anyone copy-pasting the literal route
path without realising Vercel reads it as a glob.

23/23 deploy-config tests pass (was 22, +1 new guard).
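Glob character classes follow the same semantics as regex character classes, so the failure reproduces in isolation without Vercel at all:

```javascript
// [userId] as a character class matches exactly ONE character drawn
// from {u, s, e, r, I, d} — never the literal segment name.
const seg = /^[userId]$/;

console.log(seg.test('u'));      // true  — one char from the class
console.log(seg.test('d'));      // true
console.log(seg.test('userId')); // false — six chars, not one
console.log(seg.test('x'));      // false — not in the class
```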

* Address Greptile P2: widen bracket-literal guard regex

Greptile spotted that `/\[[A-Za-z]+\]/` only matches purely-alphabetic
segment names. Real-world Next.js routes often use `[user_id]`,
`[issue_date]`, `[page1]`, `[slug2024]` — none flagged by the old
regex, so the guard would silently pass on the exact kind of
regression it was written to catch.

Widened to `/\[[A-Za-z][A-Za-z0-9_]*\]/`:
  - requires a leading letter (so legit char classes like `[0-9]`
    and `[!abc]` don't false-positive)
  - allows letters, digits, underscores after the first char
  - covers every Next.js-style dynamic-segment name convention

Also added a self-test that pins positive cases (userId, user_id,
issue_date, page1, slug2024) and negative cases (the actual `**`
glob, `[0-9]`, `[!abc]`) so any future narrowing of the regex
breaks CI immediately instead of silently re-opening PR #3206.

24/24 deploy-config tests pass (was 23, +1 new self-test).
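The widened guard with its pinned cases, runnable standalone:

```javascript
// Leading letter required, then letters/digits/underscores — flags every
// Next.js-style dynamic segment while sparing legit glob char classes.
const BRACKET_SEGMENT = /\[[A-Za-z][A-Za-z0-9_]*\]/;

const flagged = ['[userId]', '[user_id]', '[issue_date]', '[page1]', '[slug2024]']
  .every((s) => BRACKET_SEGMENT.test(s));
const clean = ['**', '[0-9]', '[!abc]']
  .some((s) => BRACKET_SEGMENT.test(s));

console.log(flagged); // true  — every dynamic-segment spelling is caught
console.log(clean);   // false — legit glob syntax passes untouched
```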
2026-04-19 14:02:30 +04:00
Elie Habib
305dc5ef36 feat(digest-dedup): Phase A — embedding-based dedup scaffolding (no-op) (#3200)
* feat(digest-dedup): Phase A — embedding-based dedup scaffolding (no-op)

Replaces the inline Jaccard story-dedup in seed-digest-notifications
with an orchestrator that can run Jaccard, shadow, or full embedding
modes. Ships with DIGEST_DEDUP_MODE=jaccard as the default so
production behaviour is unchanged until Phase C shadow + Phase D flip.

New modules (scripts/lib/):
- brief-dedup-consts.mjs       tunables + cache prefix + __constants bag
- brief-dedup-jaccard.mjs      verbatim 0.55-threshold extract (fallback)
- entity-gazetteer.mjs         cities/regions gazetteer + common-caps
- brief-embedding.mjs          OpenRouter /embeddings client with Upstash
                               cache, all-or-nothing timeout, cosineSimilarity
- brief-dedup-embed.mjs        complete-link clustering + entity veto (pure)
- brief-dedup.mjs              orchestrator, env read at call entry,
                               shadow archive, structured log line
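For reference, cosineSimilarity is the standard dot-product-over-norms form; a minimal version matching the exported name (the module's actual implementation may differ in detail):

```javascript
// Cosine similarity of two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity([1, 0], [1, 0]); // 1 — identical direction
cosineSimilarity([1, 0], [0, 1]); // 0 — orthogonal
```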

Operator tools (scripts/tools/):
- calibrate-dedup-threshold.mjs  offline calibration runner + histogram
- golden-pair-validator.mjs      live-embedder drift detector (nightly CI)
- shadow-sample.mjs              Sample A/B CSV emitter over SCAN archive

Tests:
- brief-dedup-jaccard.test.mjs    migrated from regex-harness to direct
                                   import plus orchestrator parity tests (22)
- brief-dedup-embedding.test.mjs  9 plan scenarios incl. 10-permutation
                                   property test, complete-link non-chain (21)
- brief-dedup-golden.test.mjs     20-pair mocked canary (21)

Workflows:
- .github/workflows/dedup-golden-pairs.yml  nightly live-embedder canary
                                             (07:17 UTC), opens issue on drift

Deviation from plan: the shouldVeto("Iran closes Hormuz", "Tehran
shuts Hormuz") case can't return true under a single coherent
classification (country-in-A vs capital-in-B sit on different sides
of the actor/location boundary). Gazetteer follows the plan's
"countries are actors" intent; the test is updated to assert false
with a comment pointing at the irreducible capital-country
coreference limitation.

Verification:
- npm run test:data          5825/5825 pass
- tests/edge-functions        171/171 pass
- typecheck + typecheck:api  clean
- biome check on new files    clean
- lint:md                     0 errors

Phase B (calibration), Phase C (shadow), and Phase D (flip) are
subsequent PRs.

* refactor(digest-dedup): address review findings 193-199

Fresh-eyes review found 3 P1s, 3 P2s, and a P3 bundle across
kieran-typescript, security-sentinel, performance-oracle,
architecture-strategist, and code-simplicity reviewers. Fixes below;
all 64 dedup tests + 5825 data tests + 171 edge-function tests still
green.

P1 #193 - dedup regex + redis pipeline duplication
- Extract defaultRedisPipeline into scripts/lib/_upstash-pipeline.mjs;
  both orchestrator and embedding client import from there.
- normalizeForEmbedding now delegates to stripSourceSuffix from the
  Jaccard module so the outlet allow-list is single-sourced.

P1 #194 - embedding timeout floor + negative-budget path
- callEmbeddingsApi throws EmbeddingTimeoutError when timeoutMs<=0
  instead of opening a doomed 250ms fetch.
- Removed Math.max(250, ...) floor that let wall-clock cap overshoot.

P1 #195 - dead env getters
- Deleted getMode / isRemoteEmbedEnabled / isEntityVetoEnabled /
  getCosineThreshold / getWallClockMs from brief-dedup-consts.mjs
  (zero callers; orchestrator reimplements inline).

P2 #196 - orchestrator cleanup bundle
- Removed re-exports at bottom of brief-dedup.mjs.
- Extracted materializeCluster into brief-dedup-jaccard.mjs; both
  the fallback and orchestrator use the shared helper.
- Deleted clusterWithEntityVeto wrapper; orchestrator inlines the
  vetoFn wiring at the single call site.
- Shadow mode now runs Jaccard exactly once per tick (was twice).
- Fallback warn line carries reason=ErrorName so operators can
  filter timeout vs provider vs shape errors.
- Invalid DIGEST_DEDUP_MODE values emit a warn once per run (vs
  silently falling to jaccard).

P2 #197 - workflow + shadow-sample hardening
- dedup-golden-pairs.yml body composition no longer relies on a
  heredoc that would command-substitute validator stdout. Switched
  to printf with sanitised LOG_TAIL (printable ASCII only) and
  --body-file so crafted fixture text cannot escape into the runner.
- shadow-sample.mjs Upstash helper enforces a hardcoded command
  allowlist (SCAN | GET | EXISTS).

P2 #198 - test + observability polish
- Scenarios 2 and 3 deep-equal returned clusters against the Jaccard
  expected shape, not just length. Also assert the reason= field.

P3 #199 - nits
- Removed __constants test-bag; jaccard tests use named imports.
- Renamed deps.apiKey to deps._apiKey in embedding client.
- Added @pre JSDoc on diffClustersByHash about unique-hash contract.
- Deferred: mocked golden-pair test removal, gazetteer JSON migration,
  scripts/tools AGENTS.md doc note.

Todos 193-199 moved from pending to complete.

Verification:
- npm run test:data            5825/5825 pass
- tests/edge-functions          171/171 pass
- typecheck + typecheck:api    clean
- biome check on changed files clean

* fix(digest-dedup): address Greptile P2 findings on PR #3200

1. brief-embedding.mjs: wrap fetch lookup as
   `(...args) => globalThis.fetch(...args)` instead of aliasing bare
   `fetch`. Aliasing captures the binding at module-load time, so
   later instrumentation / Edge-runtime shims don't see the wrapper —
   same class of bug as the banned `fetch.bind(globalThis)` pattern
   flagged in AGENTS.md.

2. dedup-golden-pairs.yml: `gh issue create --label "..." || true`
   silently swallowed the failure when any of dedup/canary/p1 labels
   didn't pre-exist, breaking the drift alert channel while leaving
   the job red in the Actions UI. Switched to repeated `--label`
   flags + `--create-label` so any missing label is auto-created on
   first drift, and dropped the `|| true` so a legitimate failure
   (network / auth) surfaces instead of hiding.
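The binding-capture difference behind fix 1 can be demonstrated without fetch itself:

```javascript
// Why aliasing freezes a binding while a wrapper stays live — the same
// class of bug as the banned fetch.bind(globalThis) pattern.
const api = { fetch: () => 'v1' };

const aliased = api.fetch;                       // captures v1 at module load
const wrapped = (...args) => api.fetch(...args); // looks the binding up per call

api.fetch = () => 'v2';                          // later instrumentation / shim

console.log(aliased()); // 'v1' — the shim never sees this call path
console.log(wrapped()); // 'v2' — the wrapper routes through the shim
```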

Both fixes are P2-style per Greptile (confidence 5/5, no P0/P1);
applied pre-merge so the nightly canary is usable from day one.

* fix(digest-dedup): two P1s found on PR #3200

P1 — canary classifier must match production
Nightly golden-pair validator was checking a hardcoded threshold
(default 0.60) and always applied the entity veto, while the actual
dedup path at runtime reads DIGEST_DEDUP_COSINE_THRESHOLD and
DIGEST_DEDUP_ENTITY_VETO_ENABLED from env at every call. A Phase
C/D env flip could make the canary green while prod was wrong or
red while prod was healthy, defeating the whole point of a drift
detector.

Fix:
- golden-pair-validator.mjs now calls readOrchestratorConfig(process.env)
  — the same helper the orchestrator uses — so any classifier knob
  added later is picked up automatically. The threshold and veto-
  enabled flags are sourced from env by default; a --threshold CLI
  flag still overrides for manual calibration sweeps.
- dedup-golden-pairs.yml sources DIGEST_DEDUP_COSINE_THRESHOLD and
  DIGEST_DEDUP_ENTITY_VETO_ENABLED from GitHub repo variables (vars.*),
  which operators must keep in lockstep with Railway. The
  workflow_dispatch threshold input now defaults to empty; the
  scheduled canary always uses the production-parity config.
- Validator log line prints the effective config + source so nightly
  output makes the classifier visible.

P1 — shadow archive writes were fail-open
`defaultRedisPipeline()` returns null on timeout / auth / HTTP
failure. `writeShadowArchive()` only had a try/catch, so the null
result was silently treated as success. A Phase C rollout could
log clean "mode=shadow … disagreements=X" lines every tick while
the Upstash archive received zero writes — and Sample B labelling
would then find no batches, silently killing calibration.

Fix:
- writeShadowArchive now inspects the pipeline return. null result,
  non-array response, per-command {error}, or a cell without
  {result: "OK"} all return {ok: false, reason}.
- Orchestrator emits a warn line with the failure reason, and the
  structured log line carries archive_write=ok|failed so operators
  can grep for failed ticks.
- Regression test in brief-dedup-embedding.test.mjs simulates the
  null-pipeline contract and asserts both the warn and the structured
  field land.

Verification:
- test:data           5825/5825 pass
- dedup suites         65/65   pass (new: archive-fail regression)
- typecheck + api     clean
- biome check         clean on changed files

* fix(digest-dedup): two more P1s found on PR #3200

P1 — canary must also honour DIGEST_DEDUP_MODE + REMOTE_EMBED_ENABLED
The prior round fixed the threshold/veto knobs but left the canary
running embeddings regardless of whether production could actually
reach the embed path. If Railway has DIGEST_DEDUP_MODE=jaccard or
DIGEST_DEDUP_REMOTE_EMBED_ENABLED=0, production never calls the
classifier, so a drift signal is meaningless — or worse, a live
OpenRouter issue flags the canary while prod is obliviously fine.

Fix:
- golden-pair-validator.mjs reads mode + remoteEmbedEnabled from the
  same readOrchestratorConfig() helper the orchestrator uses. When
  either says "embed path inactive in prod", the validator logs an
  explicit skip line and exits 0. The nightly workflow then shows
  green, which is the correct signal ("nothing to drift against").
- A --force CLI flag remains for manual dispatch during staged
  rollouts.
- dedup-golden-pairs.yml sources DIGEST_DEDUP_MODE and
  DIGEST_DEDUP_REMOTE_EMBED_ENABLED from GitHub repo variables
  alongside the threshold and veto-enabled knobs, so all four
  classifier gates stay in lockstep with Railway.
- Validator log line now prints mode + remoteEmbedEnabled so the
  canary output surfaces which classifier it validated.

P1 — shadow-sample Sample A was biased by SCAN order
enumerate-and-dedup added every seen pair to a dedup key BEFORE
filtering by agreement. If the same pair appeared in an agreeing
batch first and a disagreeing batch later, the disagreeing
occurrence was silently dropped. SCAN order is unspecified, so
Sample A could omit real disagreement pairs.

Fix:
- Extracted the enumeration into a pure `enumeratePairs(archives, mode)`
  export so the logic is testable. Mode filter runs BEFORE the dedup
  check: agreeing pairs are skipped entirely under
  --mode disagreements, so any later disagreeing occurrence can
  still claim the dedup slot.
- Added tests/brief-dedup-shadow-sample.test.mjs with 5 regression
  cases: agreement-then-disagreement, reversed order (symmetry),
  always-agreed omission, population enumeration, cross-batch dedup.
- isMain guard added so importing the module for tests does not
  kick off the CLI scan path.
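The filter-before-dedup ordering can be sketched as a pure function (record shape and field names are illustrative, not the real export):

```javascript
// Mode filter runs BEFORE the dedup check, so an agreeing occurrence of
// a pair never claims the dedup slot away from a later disagreeing one.
function enumeratePairs(archives, mode) {
  const seen = new Set();
  const out = [];
  for (const batch of archives) {
    for (const pair of batch) {
      if (mode === 'disagreements' && pair.agree) continue; // filter first
      const key = [pair.a, pair.b].sort().join('|');
      if (seen.has(key)) continue;                          // dedup second
      seen.add(key);
      out.push(pair);
    }
  }
  return out;
}
```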

Verification:
- test:data           5825/5825 pass
- dedup suites         70/70   pass (5 new shadow-sample regressions)
- typecheck + api     clean
- biome check         clean on changed files

Operator follow-up before Phase C:
Set all FOUR dedup repo variables in GitHub alongside Railway:
  DIGEST_DEDUP_MODE, DIGEST_DEDUP_REMOTE_EMBED_ENABLED,
  DIGEST_DEDUP_COSINE_THRESHOLD, DIGEST_DEDUP_ENTITY_VETO_ENABLED

* refactor(digest-dedup): Railway is the single source of truth for dedup config

Fair user pushback: asking operators to set four DIGEST_DEDUP_*
values in BOTH Railway (where the cron runs) AND GitHub repo
variables (where the canary runs) is architectural debt. Two
copies of the same truth will always drift.

Solution: the digest cron publishes its resolved config to Upstash
on every tick under brief:dedup:config:v1 (2h TTL). The nightly
golden-pair canary reads that key instead of env vars. Railway
stays the sole source of truth; no parallel repo variables to
maintain. A missing/expired key signals "cron hasn't run" and
the canary skips with exit 0 — better than validating against
hardcoded defaults that might diverge from prod.

Changes:
- brief-dedup-consts.mjs: new ACTIVE_CONFIG_KEY + TTL constants.
- brief-dedup.mjs: new publishActiveConfig() fires at the start of
  every deduplicateStories() call (before the mode short-circuit,
  so jaccard ticks also publish a "mode=jaccard" signal the canary
  can read). Fire-and-forget; archive-write error semantics still
  apply if the operator wants stricter tracking.
- golden-pair-validator.mjs: removed readOrchestratorConfig(env)
  path. Now calls fetchActiveConfigFromUpstash() and either
  validates against that config, skips when the embed path is
  inactive, or skips when the key is missing (with --force
  override for manual dispatch).
- dedup-golden-pairs.yml: dropped the four DIGEST_DEDUP_* env lines
  and the corresponding repo-variable dependency. Only the three
  Upstash + OpenRouter secrets remain.
- tests: two new regressions assert config is published on every
  tick (shadow AND jaccard modes) with the right shape + TTL.

Operator onboarding now takes one action: set the four
DIGEST_DEDUP_* variables on the Railway seed-digest-notifications
service. Nothing to set in GitHub beyond the existing
OPENROUTER_API_KEY / UPSTASH_* secrets.

Verification:
- test:data           5825/5825 pass
- dedup suites         72/72   pass (2 new config-publish regressions)
- typecheck + api     clean
- biome check         clean on changed files

* refactor(digest-dedup): ship embed directly, drop phases/canary/shadow

User feedback: "i dont need multiple phases and shit, we go directly
to embed". Fair. Ripping out the overengineering I accumulated:

DELETED
- .github/workflows/dedup-golden-pairs.yml (nightly canary)
- scripts/tools/golden-pair-validator.mjs
- scripts/tools/shadow-sample.mjs
- scripts/tools/calibrate-dedup-threshold.mjs
- tests/fixtures/brief-dedup-golden-pairs.json
- tests/brief-dedup-golden.test.mjs
- tests/brief-dedup-shadow-sample.test.mjs

SIMPLIFIED
- brief-dedup.mjs: removed shadow mode, publishActiveConfig,
  writeShadowArchive, diffClustersByHash, jaccardRepsToClusterHashes,
  and the DIGEST_DEDUP_REMOTE_EMBED_ENABLED knob. MODE is now
  binary: `embed` (default) or `jaccard` (instant kill switch).
- brief-dedup-consts.mjs: dropped SHADOW_ARCHIVE_*, ACTIVE_CONFIG_*.
- Default flipped: DIGEST_DEDUP_MODE unset = embed (prod path).
  Railway deploy with OPENROUTER_API_KEY set = embeddings live on
  next cron tick. Set MODE=jaccard on Railway to revert instantly.

Orchestrator still falls back to Jaccard on any embed-path failure
(timeout, provider outage, missing API key, bad response). Fallback
warn carries reason=<ErrorName>. The cron never fails because
embeddings flaked. All 64 dedup tests + 5825 data tests still green.

Net diff: -1,407 lines.

Operator single action: set OPENROUTER_API_KEY on Railway's
seed-digest-notifications service (already present) and ship. No
GH Actions, no shadow archives, no labelling sprints. If the 0.60
threshold turns out wrong, tune DIGEST_DEDUP_COSINE_THRESHOLD on
Railway — takes effect on next tick, no redeploy.

* fix(digest-dedup): multi-word location phrases in the entity veto

The extractor was whitespace-tokenising and matching only single
tokens against LOCATION_GAZETTEER, silently making every multi-word entry
unreachable:

  extractEntities("Houthis strike ship in Red Sea")
    → { locations: [], actors: ['houthis','red','sea'] }   ✗
  shouldVeto("Houthis strike ship in Red Sea",
             "US escorts convoy in Red Sea")  → false       ✗

With MODE=embed as the default, that turned off the main
anti-overmerge safety rail for bodies of water, regions, and
compound city names — exactly the P07-Hormuz / Houthis-Red-Sea
headlines the veto was designed to cover.

Fix: a greedy longest-phrase scan with a sliding window. At each
token position, try the longest multi-word phrase first (down to
2 tokens) and require the first AND last tokens to be capitalised
(so lowercase prose like "the middle east" doesn't falsely match
while the headline "Middle East" does); lowercase connectors in
between are fine ("Strait of Hormuz" → phrase "strait of hormuz" ✓).
Falls back to single-token lookup when no multi-word phrase fits.

Now:
  extractEntities("Houthis strike ship in Red Sea")
    → { locations: ['red sea'], actors: ['houthis'] }       ✓
  shouldVeto(Red-Sea-Houthis, Red-Sea-US) → true             ✓

Complexity still O(N · MAX_PHRASE_LEN) — MAX_PHRASE_LEN is 4
(longest gazetteer entry: "ho chi minh city"), so this is
effectively O(N).
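The scan above can be sketched as follows — a hedged reconstruction from this commit's description, with a toy gazetteer; the real extractor also tracks actors and uses the full LOCATION_GAZETTEER:

```javascript
// Toy gazetteer; the real one lives in the dedup module.
const GAZETTEER = new Set([
  'red sea', 'south china sea', 'strait of hormuz',
  'abu dhabi', 'ho chi minh city', 'middle east',
]);
const MAX_PHRASE_LEN = 4; // longest entry: "ho chi minh city"
const isCapitalised = (w) => /^[A-Z]/.test(w);

function extractLocations(text) {
  const tokens = text.split(/\s+/).filter(Boolean);
  const found = [];
  let i = 0;
  while (i < tokens.length) {
    let matched = false;
    // Try the longest multi-word phrase first, down to 2 tokens.
    for (let len = Math.min(MAX_PHRASE_LEN, tokens.length - i); len >= 2; len--) {
      const slice = tokens.slice(i, i + len);
      // First AND last tokens must be capitalised; lowercase
      // connectors in between ("of") are fine.
      if (!isCapitalised(slice[0]) || !isCapitalised(slice[len - 1])) continue;
      const phrase = slice.join(' ').toLowerCase();
      if (GAZETTEER.has(phrase)) {
        found.push(phrase);
        i += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      // Fall back to single-token lookup.
      const single = tokens[i].toLowerCase();
      if (GAZETTEER.has(single)) found.push(single);
      i += 1;
    }
  }
  return found;
}
```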

Added 5 regression tests covering Red Sea, South China Sea,
Strait of Hormuz (lowercase-connector case), Abu Dhabi, and
New York, plus the Houthis-vs-US veto reproducer from the P1.
All 5825 data tests + 45 dedup tests green; lint + typecheck clean.
2026-04-19 13:49:48 +04:00
Elie Habib
27849fee1e fix(brief): bundle resvg linux-x64-gnu native binding with carousel fn (#3204)
* fix(brief): bundle resvg linux-x64-gnu native binding with carousel fn

Real root cause of every Telegram carousel WEBPAGE_CURL_FAILED
since PR #3174 merged. Not middleware (the last PR fixed that
theoretical path but not the observed failure). The Vercel
function itself crashes with HTTP 500 FUNCTION_INVOCATION_FAILED on
every request, including OPTIONS: the isolate can't initialise.

The handler imports brief-carousel-render which lazy-imports
@resvg/resvg-js. That package's js-binding.js does runtime
require(@resvg/resvg-js-<platform>-<arch>-<libc>). On Vercel
Lambda (Amazon Linux 2 glibc) that resolves to
@resvg/resvg-js-linux-x64-gnu. Vercel's nft tracing does NOT
follow this conditional require, so the optional peer package
isn't bundled. Cold start throws MODULE_NOT_FOUND, the isolate
crashes, Vercel returns FUNCTION_INVOCATION_FAILED, Telegram
reports WEBPAGE_CURL_FAILED.

Fix: vercel.json functions.includeFiles forces the linux-x64-gnu
binding into the carousel function's bundle. Only this route
needs it; every other api route is unaffected.
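The shape of the fix, as a hedged sketch — the route path matches the carousel handler named elsewhere in this log, but the exact includeFiles glob is an assumption (vercel.json is strict JSON, so the rationale can't live inline):

```json
{
  "functions": {
    "api/brief/carousel/[userId]/[issueDate]/[page].ts": {
      "includeFiles": "node_modules/@resvg/resvg-js-linux-x64-gnu/**"
    }
  }
}
```

This assumes Vercel's x86_64 Amazon Linux 2 glibc runtime; a Graviton/arm64 migration would need the arm64 binding package instead.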

Verified:
- deploy-config tests 21/21 pass
- JSON valid
- Reproduced 500 via curl on all methods and UAs
- resvg-js/js-binding.js confirms linux-x64-gnu is the runtime
  binary on Amazon Linux 2 glibc

Post-merge: curl with TelegramBot UA should return 200 image/png
instead of 500; next cron tick should clear the Railway
[digest] Telegram carousel 400 line.

* Address Greptile P2s: regression guard + arch-assumption reasoning

Two P2 findings on PR #3204:

P2 #1 (inline on vercel.json:6): Platform architecture assumption
undocumented. If Vercel migrates to Graviton/arm64 Lambda the
cold-start crash silently returns. vercel.json is strict JSON so
comments aren't possible inline.

P2 #2 (tests/deploy-config.test.mjs:17): No regression guard for
the carousel includeFiles rule. A future vercel.json tidy-up
could silently revert the fix with no CI signal.

Fixed both in a single block:

- New describe() in deploy-config.test.mjs asserts the carousel
  route's functions entry exists AND its includeFiles points at
  @resvg/resvg-js-linux-x64-gnu. Any drift fails the build.
- The block comment above it documents the Amazon Linux 2 x86_64
  glibc assumption that would have lived next to the includeFiles
  entry if JSON supported comments. Includes the Graviton/arm64
  migration pointer.

tests 22/22 pass (was 21, +1 new).
2026-04-19 13:36:17 +04:00
Elie Habib
45f02fed00 fix(sentry): filter Three.js OrbitControls setPointerCapture NotFoundError (#3201)
* fix(sentry): suppress Three.js OrbitControls setPointerCapture NotFoundError

OrbitControls' pointerdown handler calls setPointerCapture after the
browser has already released the pointer (focus change, rapid re-tap),
leaking as an unhandled NotFoundError. OrbitControls is bundled into
main-*.js so hasFirstParty=true; matched by the unique setPointerCapture
message (grep confirms no first-party setPointerCapture usage).

Resolves WORLDMONITOR-NC.

* fix(sentry): gate OrbitControls setPointerCapture filter on bundle-only stack

Review feedback: suppressing by message alone would hide a future first-party
setPointerCapture regression. Mirror the existing OrbitControls filter's
provenance check — require absence of any source-mapped .ts/.tsx frame so the
filter only matches stacks whose only non-infra frame is the bundled main chunk.

Adds positive + negative regression tests for the pair.

* fix(sentry): gate OrbitControls filter on positive three.js context signature

Review feedback: absence of .ts/.tsx frames is not proof of third-party origin
because production stacks are often unsymbolicated. Replace the negative-only
gate with a positive OrbitControls signature — require a frame whose context
slice contains the literal `_pointers … setPointerCapture` adjacency unique to
three.js OrbitControls. Update tests to cover the production-realistic case
(unsymbolicated first-party bundle frame calling setPointerCapture must still
reach Sentry) plus a defensive no-context fallthrough.
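The gate can be sketched as follows — a hedged reconstruction over Sentry's event shape; the exact matcher is an assumption, simplified to string containment over each frame's context slice:

```javascript
// Hypothetical sketch of the positive-signature filter. The event
// field names (exception.values, stacktrace.frames, context_line,
// pre_context, post_context) follow Sentry's event schema.
function isOrbitControlsSetPointerCapture(event) {
  const top = event?.exception?.values?.[0];
  const msg = `${top?.type ?? ''}: ${top?.value ?? ''}`;
  if (!/NotFoundError/.test(msg) || !/setPointerCapture/.test(msg)) return false;
  const frames = top?.stacktrace?.frames ?? [];
  // Require the `_pointers … setPointerCapture` adjacency unique to
  // three.js OrbitControls somewhere in a frame's context slice. An
  // unsymbolicated first-party frame carries no such context, so
  // that event still reaches Sentry (the defensive fallthrough).
  return frames.some((f) => {
    const ctx = [...(f.pre_context ?? []), f.context_line ?? '', ...(f.post_context ?? [])].join('\n');
    return ctx.includes('_pointers') && ctx.includes('setPointerCapture');
  });
}
```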
2026-04-19 13:15:31 +04:00
Elie Habib
d7f87754f0 fix(emails): update transactional email copy — 22 → 30+ services (#3203)
Follow-up to #3202. Greptile flagged that two transactional email templates still claimed '22 services' while /pro now advertises '30+':

- api/register-interest.js:90 — interest-registration confirmation email ('22 Services, 1 Key')
- convex/payments/subscriptionEmails.ts:57 — API subscription confirmation email ('22 services, one API key')

A user signing up via /pro would read '30+ services' on the page, then receive an email saying '22'. Both updated to '30+', matching the /pro page and the actual server domain count (31 in server/worldmonitor/*, plus api/scenario/v1/ = 32, and growing).
2026-04-19 13:15:17 +04:00
Elie Habib
135082d84f fix(pro): correct service-domain count — 22 → 30+ (server has 31) (#3202)
* fix(pro): correct service-domain count — 22 → 30+ (server has 31, growing)

The /pro page advertised '22 services' / '22 service domains' but server/worldmonitor/, proto/worldmonitor/, and src/generated/server/worldmonitor/ all have 31 domain dirs (aviation, climate, conflict, consumer-prices, cyber, displacement, economic, forecast, giving, health, imagery, infrastructure, intelligence, maritime, market, military, natural, news, positive-events, prediction, radiation, research, resilience, sanctions, seismology, supply-chain, thermal, trade, unrest, webcam, wildfire). api/scenario/v1/ adds a 32nd recently shipped surface.

Used '30+' rather than the literal '31' so the page doesn't drift again every time a new domain ships (the '22' was probably accurate at one point too).

168 string substitutions across all 21 locale JSON files (8 keys each: twoPath.proDesc, twoPath.proF1, whyUpgrade.fasterDesc, pillars.askItDesc, dataCoverage.subtitle, proShowcase.oneKey, apiSection.restApi, faq.a8). Plus 10 in pro-test/index.html (meta description, og:description, twitter:description, SoftwareApplication ld+json description + Pro Monthly offer, FAQ ld+json a8, noscript fallback). Bundle rebuilt.

* fix(pro): Bulgarian grammar — drop definite-article suffix after 30+
2026-04-19 13:07:07 +04:00
Elie Habib
cce46a1767 fix(pro): API tier is launched — drop 'Coming Soon' label (#3198)
The /pro comparison-table column header still read 'API (Coming Soon)' across all 21 locales (and locale-translated variants), but convex/config/productCatalog.ts has api_starter at currentForCheckout=true, publicVisible=true, priceCents=9999 — $99.99/month, with api_starter_annual at $999/year. The API tier is shipped and self-serve.

Updated pricingTable.apiHeader → 'API ($99.99)' for every locale, matching the same '<Tier> ($<price>)' pattern as 'Free ($0)' and 'Pro ($39.99)'. Bundle rebuilt.
2026-04-19 11:44:35 +04:00
Elie Habib
c7aacfd651 fix(health): persist WARNING events + add failure-log timeline (#3197)
* fix(health): persist WARNING events + add failure-log timeline

WARNING status (stale seeds) was excluded from the health:last-failure
Redis write (line 680 checked `!== 'WARNING'`). When UptimeRobot keyword-
checks for "HEALTHY" and gets a WARNING response, it flags DOWN, but no
forensic trail was left in Redis. This made stale-seed incidents invisible
to post-mortem investigation.

Changes:
- Write health:last-failure for ANY non-HEALTHY status (including WARNING)
- Add health:failure-log (LPUSH list, last 50 entries, 7-day TTL) so
  multiple incidents are preserved as a timeline, not just the latest
- Include warnCount alongside critCount in the snapshot
- Broaden the problems filter to capture all non-OK statuses

* fix(health): dedupe failure-log entries by incident signature

Repeated polls during one long WARNING window would LPUSH near-identical
snapshots, filling the 50-entry log and evicting older distinct incidents.

Now compares a signature (status + sorted problem set) against the previous
entry via health:failure-log-sig. Only appends when the incident changes.
The last-failure key is still updated every poll (latest timestamp matters).

* fix(health): add 4s timeout to persist pipelines + consistent arg types

Addresses greptile review on PR #3197:
- Both persist redisPipeline calls now pass 4_000ms timeout (main data
  pipeline uses 8_000ms; persist is less critical so shorter is fine)
- LTRIM/EXPIRE args use numbers consistently (was mixing number/string)

* fix(health): atomic sig swap via SET ... GET to eliminate dedupe race

Two concurrent /api/health requests could both read the old signature
before either write lands, appending duplicate entries. Now uses
SET key val EX ttl GET (Redis 6.2+) to atomically swap the sig and
return the previous value in one pipeline command. The LPUSH only
fires if the returned previous sig differs from the new one.

Also skips the second redisPipeline call entirely when sig matches
(no logCmds to send).

* fix(health): exclude seedAgeMin from dedupe sig + clear sig on recovery

Two issues with the failure-log dedupe:

1. seedAgeMin changes on every poll (e.g. 31min, 32min, 33min), so
   the signature changed every time and LPUSH still fired on every
   probe during a STALE_SEED window. Now uses a separate sigKeys
   array with only key:status (no age) for the signature, while
   problemKeys still includes ages for the snapshot payload.

2. The sig was never cleared on recovery. If the same problem set
   recurred after a healthy gap, the old sig (within its 24h TTL)
   would match and the recurrence would be silently skipped. Now
   DELs health:failure-log-sig when overall === 'HEALTHY'.

* fix(health): move sig write after LPUSH in same pipeline

The sig was written eagerly in the first pipeline (SET ... GET), but the
LPUSH happened in a separate background pipeline. If that second write
failed, the sig was already advanced, permanently deduping the incident
out of the timeline.

Now: GET sig first (read-only), then write last-failure + LPUSH + sig
all in one pipeline. The sig only advances if the entire pipeline
succeeds. Failure leaves the old sig in place so the next poll retries.

Reintroduces a small read-then-write race window (two concurrent probes
can both read the old sig), but the worst case is a single duplicate
entry, which is strictly better than a permanently dropped incident.
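The final persist flow across these commits can be sketched as follows — key names come from the commit text; `pipeline` is a hypothetical helper that sends an array of Redis commands (with a timeout in the real code) and resolves to their replies:

```javascript
// Signature is status + sorted problem KEYS only — no seed ages,
// which change every poll and would defeat the dedupe.
function incidentSig(status, problemKeys) {
  return `${status}|${[...problemKeys].sort().join(',')}`;
}

// Hedged sketch of the persist path for a non-HEALTHY poll. (The
// HEALTHY path, not shown, DELs health:failure-log-sig so a
// recurrence after recovery is logged again.)
async function persistFailure(pipeline, snapshot, problemKeys) {
  const sig = incidentSig(snapshot.status, problemKeys);
  // Read-only probe first: the sig must only advance if the
  // forensic writes below actually land.
  const [prevSig] = await pipeline([['GET', 'health:failure-log-sig']]);
  // last-failure is updated on EVERY poll (latest timestamp matters).
  const cmds = [['SET', 'health:last-failure', JSON.stringify(snapshot), 'EX', 604_800]];
  if (prevSig !== sig) {
    // New incident — append to the timeline and advance the sig in
    // the SAME pipeline, so a failed LPUSH leaves the old sig in
    // place and the next poll retries.
    cmds.push(
      ['LPUSH', 'health:failure-log', JSON.stringify(snapshot)],
      ['LTRIM', 'health:failure-log', 0, 49],
      ['EXPIRE', 'health:failure-log', 604_800],
      ['SET', 'health:failure-log-sig', sig, 'EX', 86_400],
    );
  }
  await pipeline(cmds);
}
```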
2026-04-19 10:14:19 +04:00
Elie Habib
63464775a5 feat(supply-chain): scenario UX — rich banner + projected score + faster poll (#3193)
* feat(supply-chain): rich scenario banner + projected score per chokepoint + faster poll

User reported Simulate Closure adds only a thin banner with no context —
"not clear what value user is getting, takes many many seconds". Four
targeted UX improvements in one PR:

A. Rich banner (scenario params + tagline)
   Banner now reads:
     ⚠ Hormuz Tanker Blockade · 14d · +110% cost
        CN 100% · IN 84% · TW 82% · IR 80% · US 39%
        Simulating 14d / 100% closure / +110% cost on 1 chokepoint.
        Chokepoint card below shows projected score; map highlights…
   Surfaces the scenario template fields (durationDays, disruptionPct,
   costShockMultiplier) + a one-line explainer so a first-time user
   understands what "CN 100%" actually means.

B. Projected score on each affected chokepoint card
   Card header now shows: `[current]/100 → [projected]/100` with a red
   trailing badge + red left border on the card body.
   Body prepends: "⚠ Projected under scenario: X% closure for N days
   (+Y% cost)".
   Projected = max(current, template.disruptionPct) — conservative
   floor since the real scoring mixes threat + warnings + anomaly.

C. Faster polling
   Status poll interval 2s → 1s. Max iterations 30→60 (unchanged 60s
   budget). Worker processes in <1s; perceived latency drops from
   2–3s to <2s in the common case. First poll still immediate.

D. ScenarioResult interface widened
   Added optional `template` and `currentDisruptionScores` fields in
   scenario-templates.ts to match what the scenario-worker already
   emits. Optional = backward-compat with map-only consumers.

Dependent on PR #3192 (already merged) which fixed the 10000% banner
% inflation.

* fix(supply-chain): trigger render() on scenario activate/dismiss — cards must re-render

PR review caught a real bug in the new scenario UX: showScenarioSummary
and hideScenarioSummary were mutating the banner DOM directly without
triggering render(). renderChokepoints() reads activeScenarioState to
paint the projected score + red border + callout, but those only run
during render() — so the cards stayed stale on activate AND on dismiss
until some unrelated re-render happened.

Refactor to split public API from internal rendering:

- showScenarioSummary(scenarioId, result) — now just sets state + calls
  render(). Was: set state + inline DOM mutation (bypassing card render).
- renderScenarioBanner() — new private helper that builds the banner
  DOM from activeScenarioState. Called from render()'s postlude
  (replacing the old self-recursive showScenarioSummary() call — which
  only worked because it had a side-effectful early-exit path that
  happened to terminate, but was a latent recursion risk).
- hideScenarioSummary() — now just sets state=null + calls render().
  Was: clear state + manual banner removal + manual button-text reset
  loop. The button loop is redundant now — the freshly-rendered card
  template produces buttons with default "Simulate Closure" text by
  construction.

Net effect: activating a scenario paints the banner AND the affected
chokepoint cards in a single render tick. Dismissing strips both in
the same tick.

* fix(supply-chain): derive scenario button state from activeScenarioState, not imperative mutation

PR review caught: the earlier re-render fix (showScenarioSummary → render())
correctly repaints cards on activate, but the button-state logic in
runScenario() is now wrong. render() detaches the old btn reference, so
the post-onScenarioActivate `resetButton('Active') + btn.disabled = true`
touches a detached node and no-ops (resetButton() explicitly skips
!btn.isConnected). The fresh button painted by render() uses the default
template text — the visible button reads "Simulate Closure", enabled, and users
can queue duplicate runs of an already-active scenario.

Fix: make button state a function of panel state.

- renderChokepoints() scenario section: check
  activeScenarioState.scenarioId === template.id and, when matched, emit
  the button with class `sc-scenario-btn--active`, text "Active", and
  `disabled` attribute. On dismiss, the next render strips those
  automatically — same pattern as the card projection styling.
- runScenario(): drop the dead `resetButton('Active')` + `btn.disabled`
  lines after onScenarioActivate. That path is now template-driven;
  touching the detached btn was the defect.

Catch-path resets ('Simulate Closure' on abort, 'Error — retry' on real
error) are unchanged — those fire BEFORE any render could detach the btn,
so the imperative path is still correct there.

* fix(supply-chain): hide scenario projection arrow when current already ≥ template

Greptile P1: projected badge was rendered as `N/100 → N/100` whenever
current disruptionScore already met or exceeded template.disruptionPct.
Visible for Suez (80%) or Panama (50%) scenarios when a chokepoint is
already elevated — read as "scenario has zero effect", which is misleading.

The two values live on different scales — cp.disruptionScore is a
computed risk score (threat + warnings + anomaly) while
template.disruptionPct is "% of capacity blocked" — but they share the
0–100 axis so directional comparison is still meaningful for the
"does this scenario escalate things?" signal.

Fix: arrow only renders when template.disruptionPct > cp.disruptionScore.
When current already equals or exceeds the scenario level, show the
single current badge. The card's red left border + "⚠ Projected under
scenario" callout still indicate the card is the scenario target —
only the escalation arrow is suppressed.
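The badge rule across these two commits reduces to a small pure function — a minimal sketch with illustrative names, not the real render path:

```javascript
// cp.disruptionScore (computed risk) and template.disruptionPct
// (% of capacity blocked) are different quantities but share the
// 0–100 axis, so directional comparison is meaningful.
function projectionBadge(currentScore, scenarioPct) {
  // Conservative floor: projected never drops below current.
  const projected = Math.max(currentScore, scenarioPct);
  // Arrow only when the scenario actually escalates; otherwise a
  // `N/100 → N/100` badge would read as "scenario has zero effect".
  return scenarioPct > currentScore
    ? `${currentScore}/100 → ${projected}/100`
    : `${currentScore}/100`;
}
```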
2026-04-19 09:25:55 +04:00
Elie Habib
85d6308ed0 fix(brief): unblock Telegram carousel fetch in middleware bot gate (#3196)
* fix(brief): allow Telegram/social UAs to fetch carousel images

middleware.ts BOT_UA regex (/bot/i) returned 403 on Telegram's sendMediaGroup
fetch of /api/brief/carousel/<u>/<d>/<p>. The SOCIAL_IMAGE_UA allowlist
(includes telegrambot) was scoped to /favico/* and .png suffix only;
carousel returns image/png but the URL has no extension.

Symptom: Railway log [digest] Telegram carousel 400 ... WEBPAGE_CURL_FAILED
and zero images above the Telegram brief.

Fix: extend the UA-bypass guard to cover the /api/brief/carousel/ prefix.
The HMAC token on the URL is the real auth; the UA allowlist is defence-in-depth.

* Address P2 + P3: regression test + route-shape regex

P2: Add tests/middleware-bot-gate.test.mts — 13 cases pinning the
contract:
  - TelegramBot/Slackbot/Discordbot/LinkedInBot pass on carousel
  - curl, generic bot UAs, missing UA still 403 on carousel
  - TelegramBot 403s on non-carousel API routes (scoped, not global)
  - Malformed carousel paths (admin/dashboard, page >= 3, non-ISO
    date) all still 403 via the regex
  - Normal browsers pass everywhere

P3: Replace startsWith('/api/brief/carousel/') prefix with
BRIEF_CAROUSEL_PATH_RE matching the exact shape enforced by
api/brief/carousel/[userId]/[issueDate]/[page].ts
(userId / YYYY-MM-DD / page 0|1|2). A future
/api/brief/carousel/admin or similar sibling cannot inherit the
bypass. Comment now lists every social-image UA this protects.

typecheck + typecheck:api clean. test:data 5772/5772.
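A plausible reconstruction of that route-shape gate — the constant name and the userId / YYYY-MM-DD / page 0|1|2 shape come from the commit text; the exact pattern and UA list are assumptions:

```javascript
// Only the exact carousel route shape may inherit the bypass; a
// future /api/brief/carousel/admin sibling fails this regex.
const BRIEF_CAROUSEL_PATH_RE = /^\/api\/brief\/carousel\/[^/]+\/\d{4}-\d{2}-\d{2}\/[0-2]$/;
// Illustrative subset of the social-image UA allowlist.
const SOCIAL_IMAGE_UA = /telegrambot|slackbot|discordbot|linkedinbot/i;

function isCarouselBypass(pathname, userAgent) {
  return BRIEF_CAROUSEL_PATH_RE.test(pathname) && SOCIAL_IMAGE_UA.test(userAgent ?? '');
}
```

The HMAC token on the URL remains the real auth; this gate is defence-in-depth scoped to one route shape.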
2026-04-19 09:16:14 +04:00
Elie Habib
6025b0ce47 chore(sentry): add Chrome/Firefox variant of UTItemActionController filter (#3194)
The Safari variant (Can't find variable: UTItemActionController) was
already in ignoreErrors at line 53. Chrome/Firefox uses the "X is not
defined" format instead (WORLDMONITOR-NB). Added to the existing
"is not defined" group at line 119.
2026-04-19 08:58:07 +04:00
Elie Habib
434a2e0628 feat(settings): API Keys tab visible to all users with PRO upgrade CTA (#3190)
* feat(settings): show API Keys tab to all users with PRO upgrade CTA

Free users who clicked the API Keys tab triggered a server-side
ConvexError: API_ACCESS_REQUIRED (WORLDMONITOR-NA). Now the tab is
always visible with a PRO badge, and the content is gated client-side:

- Anonymous: lock icon + "Sign In" CTA (opens Clerk sign-in)
- Free: upgrade icon + "Upgrade to Pro" CTA (opens Dodo checkout)
- PRO: full key management UI (unchanged)

The Convex query is never called for non-PRO users, eliminating the
server error at the source while creating a natural upgrade funnel.
Reuses existing panel-locked-state CSS (gold accent, gradient button).

* fix(settings): gate API Keys on apiAccess feature, not isProUser

Addresses review findings on PR #3190:

1. Gate changed from isProUser() to hasFeature('apiAccess') — matches
   the server contract in convex/apiKeys.ts which requires apiAccess
   (tier 2+), not just PRO (tier 1). PRO users without apiAccess now
   correctly see the upgrade CTA instead of the full UI.

2. CTA button now launches API_STARTER_MONTHLY checkout instead of
   DEFAULT_UPGRADE_PRODUCT (PRO_MONTHLY) — users buy the correct
   product that actually includes API key access.

3. loadApiKeys() guard now checks both getAuthState().user AND
   hasFeature('apiAccess') — prevents anonymous keyed sessions
   (widget/pro keys without Clerk auth) from hitting the Convex
   query that requires authentication.
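The three-way gate in points 1–3 reduces to a small state function — a minimal sketch with illustrative names; the real panel renders full CTA markup per state:

```javascript
// Decide which API Keys panel state to render. authState/hasFeature
// mirror the getAuthState()/hasFeature() helpers named above.
function apiKeysPanelState(authState, hasFeature) {
  // Anonymous (including keyed widget/pro sessions without Clerk
  // auth): never hit the Convex query that requires authentication.
  if (!authState?.user) return 'sign-in';
  // apiAccess is the server contract (tier 2+), not just PRO.
  if (!hasFeature('apiAccess')) return 'upgrade';
  return 'manage'; // full key-management UI
}
```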

* fix(settings): re-render API Keys panel when entitlements arrive

On cold load, hasFeature('apiAccess') returns false until the Convex
entitlement subscription delivers data. A paid API Starter user who
opens settings before that snapshot arrives would see the upgrade CTA
and loadApiKeys() would be skipped.

Subscribes to onEntitlementChange() while the modal is open and
re-renders the api-keys panel content + re-attaches handlers when
entitlements change. Cleans up in close() and destroy().

Also extracts handler attachment into attachApiKeysHandlers() to
avoid duplicating the CTA click + input keydown wiring between
render() and the entitlement callback.
2026-04-19 08:24:10 +04:00
Elie Habib
7a99c3406e fix(supply-chain, news): scenario % double-multiply + scoreByEntities null-type TypeError (#3192)
Two unrelated issues reported from a live session (browser screenshot + console):

1. Scenario banner showed "CN 10000% · IN 8400% · TW 8200%"
   showScenarioSummary did (c.impactPct * 100).toFixed(0) but the
   scenario-worker already sends impactPct as a 0-100 integer:
     scripts/scenario-worker.mjs:295 — Math.min(Math.round((total / max) * 100), 100)
   Multiplying by 100 again inflated every percentage 100x.
   Fix: drop the extra * 100. 100 renders as "100%", 84 as "84%".

2. Sentry/console TypeError at parallel-analysis.ts scoreByEntities:
     [ParallelAnalysis] Error: TypeError: Cannot read properties of
     undefined (reading 'includes')
   The ML worker occasionally returns entities with undefined `type`
   or `text`. scoreByEntities did entities.filter(e => e.type.includes('LOC'))
   — NPE when e.type missing. Fix: narrow to a well-formed subset via
   a type guard on e?.type and e?.text strings before any string access.
   Apply the safe array everywhere downstream (locations/people/orgs +
   density + confidence) so the guard is the single source of truth.
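The guard for issue 2 can be sketched as follows — entity shape inferred from the commit; the real scoreByEntities computes far more than locations:

```javascript
// Narrow to well-formed entities before ANY string access; the ML
// worker occasionally emits entries with undefined `type` or `text`.
function wellFormedEntities(entities) {
  return (entities ?? []).filter(
    (e) => typeof e?.type === 'string' && typeof e?.text === 'string',
  );
}

function scoreByEntities(entities) {
  // The safe array is the single source of truth downstream.
  const safe = wellFormedEntities(entities);
  const locations = safe.filter((e) => e.type.includes('LOC')).map((e) => e.text);
  return { locations };
}
```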
2026-04-19 08:22:44 +04:00
Elie Habib
d8e479188a fix(supply-chain): don't CDN-cache empty chokepoint-history responses (#3189)
User reported "Transit history unavailable" persisting for Hormuz after
PR #3187 deployed. Direct Redis probe confirms
supply_chain:transit-summaries:history:v1:hormuz_strait has 174 entries.
Direct server-side curl to /api/supply-chain/v1/get-chokepoint-history
also returns 174. But the user's browser kept receiving
`{"history":[],"fetchedAt":"0"}`.

Root cause: gateway cache tier `slow` pins 200 responses for 30 min at
Cloudflare edge (s-maxage=1800). During the gap between Vercel instant
deploy and Railway ais-relay redeploy + first transit-summary cron tick
(~20 min), per-id history keys were absent, so the handler returned
empty. Those empty bodies got CF-cached and kept serving for 30 min
AFTER the keys were populated in Redis. Bab-el-Mandeb (which DOES render
for the user) got a fresh non-empty cache entry; Hormuz got stuck with
the empty one.

Fix: when returning empty (missing key, invalid id, error), call
markNoCacheResponse(ctx.request) so the gateway sets Cache-Control:
no-store instead of the 30-min tier cache. Every call on an empty
state re-checks Redis. Once data is present, the normal tier cache
applies on the non-empty response.

Mechanism: the gateway at server/gateway.ts:488 honors X-No-Cache
header via the same side-channel (response-headers.ts). Pattern already
used by other handlers for upstream-unavailable bodies.

Cost: per-id history keys are ~35KB, edge→Upstash round-trip <1.5s.
Slight Redis-traffic bump for as long as keys stay empty; negligible
in practice (only the deploy window).

Also no-caches invalid chokepoint IDs so scanners/junk IDs don't pin
30-min empties either.
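The empty-means-no-store rule can be sketched as follows — a hedged sketch where `markNoCache` stands in for `markNoCacheResponse(ctx.request)` in the real handler:

```javascript
function respondHistory(history, markNoCache) {
  if (!Array.isArray(history) || history.length === 0) {
    // Don't let the `slow` tier (s-maxage=1800) pin an empty body at
    // the Cloudflare edge: force Cache-Control: no-store so every
    // call on an empty state re-checks Redis.
    markNoCache();
    return { history: [], fetchedAt: '0' };
  }
  // Non-empty responses keep the normal 30-min tier cache.
  return { history, fetchedAt: String(Date.now()) };
}
```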
2026-04-19 07:33:34 +04:00
Elie Habib
d7e40bc4e5 chore(sentry): filter NS_ERROR_UNEXPECTED + ConvexError API_ACCESS_REQUIRED (#3188)
NS_ERROR_UNEXPECTED (WORLDMONITOR-N6/N7/N8/N9): Firefox 149/Ubuntu XPCOM
Worker init failure. Same family as already-filtered NS_ERROR_ABORT and
NS_ERROR_OUT_OF_MEMORY. 0 repo matches. Worker fallback confirmed working
via breadcrumbs ("keeping flat list").

ConvexError: API_ACCESS_REQUIRED (WORLDMONITOR-NA, 15 events/5 users):
expected business error from PR #3125 (API key management). A free user opens
the API Keys tab, the server correctly denies, and the client try/catch at
UnifiedSettings.ts:731 handles it gracefully. The Convex WS transport leaks the
rejection to Sentry before the client Promise chain catches it.
2026-04-18 23:51:12 +04:00
Elie Habib
96fca1dc2b fix(supply-chain): popup-keyed history re-query + dataAvailable flag (#3187)
* fix(supply-chain): popup-keyed history re-query + dataAvailable flag for partial coverage

Two P1 findings on #3185 post-merge review:

1. MapPopup cross-chokepoint history contamination
   Popup's async history resolve re-queried [data-transit-chart] without a
   cpId key. User opens popup A → fetch starts for cpA; user opens popup B
   before it resolves → cpA's history mounts into cpB's chart container.
   Fix: add data-transit-chart-id keyed by cpId; re-query by it on resolve.
   Mirrors SupplyChainPanel's existing data-chart-cp-id pattern.

2. Partial portwatch coverage still looked healthy
   Previous fix emits all 13 canonical summaries (zero-state fill for
   missing IDs) and records pwCovered in seed-meta, but:
   - get-chokepoint-status still zero-filled missing chokepoints and cached
     the response as healthy — panel rendered silent empty rows.
   - api/health.js only degrades on recordCount=0, so 10/13 partial read
     as OK despite the UI hiding entire chokepoints.
   Fix:
   - proto: TransitSummary.data_available (field 12). Writer tags with
     Boolean(cpData). Status RPC passes through; defaults true for pre-fix
     payloads (absence = covered).
   - Status RPC writes seed-meta recordCount as covered count (not shape
     size), and flips response-level upstreamUnavailable on partial.
   - api/health.js: new minRecordCount field on SEED_META entries + new
     COVERAGE_PARTIAL status (warn rollup). chokepoints entry declares
     minRecordCount: 13. recordCount < 13 → COVERAGE_PARTIAL.
   - Client (panel + popup): skip stats/chart rendering when
     !dataAvailable; show "Transit data unavailable (upstream partial)"
     microcopy so users understand the gap.

5759/5759 data tests pass. Typecheck + typecheck:api clean.

* fix(supply-chain): guarantee Simulate Closure button exits Computing state

User reports "Simulate Closure does nothing beyond write Computing…" — the
button sticks at Computing forever. Two causes:

1. Scenario worker appears down (0 scenario-result:* keys in Redis in the
   last 24h of 24h-TTL). Railway-side — separate intervention needed to
   redeploy scripts/scenario-worker.mjs.

2. Client leaked the "Computing…" state on multiple exit paths:
   - signal.aborted early-return inside the poll loop never reset the
     button. Second click fired abort on first → first returned without
     resetting → button stayed "Computing…" until next render.
   - !this.content.isConnected early-return also skipped reset (less
     user-visible but same class of bug).
   - catch block swallowed AbortError without resetting.
   - POST /run had no hard timeout — a hanging edge function left the
     button in Computing indefinitely.

Fix:
- resetButton(text) helper touches the btn only if still connected;
  applied in every exit path (abort, timeout, post-success, catch).
- AbortSignal.any([caller, AbortSignal.timeout(20_000)]) on POST /run.
- console.error on failure so Simulate Closure errors surface in ops.
- Error message includes "scenario worker may be down" on loop timeout
  so operators see the right suspect.

Backend observations (for follow-up):
- Hormuz backend is healthy (/api/health chokepoints OK, 13 records,
  1 min old; live RPC has hormuz_strait.riskLevel=critical, wow=-22,
  flowEstimate present; GetChokepointHistory returns 174 entries).
  User-reported "Hormuz empty" is likely browser/CDN stale cache from
  before PR #3185; hard refresh should resolve.
- scenario-worker.mjs has zero result keys in 24h. Railway service
  needs verification/redeployment.

* fix(scenario): wrong Upstash RPUSH format silently broke every Simulate Closure

Railway scenario-worker log shows every job failing field validation since
at least 03:06Z today:

  [scenario-worker] Job failed field validation, discarding:
    ["{\"jobId\":\"scenario:1776535792087:cynxx5v4\",...

The leading [" in the payload is the smoking gun. api/scenario/v1/run.ts
was POSTing to /rpush/{key} with body `[payload]`, expecting Upstash to
unpack the array and push one string value. Upstash does NOT parse that
form — it stored the literal `["{...}"]` string as a single list value.

Worker BLMOVEs the literal string → JSON.parse → array → destructuring
`{jobId, scenarioId, iso2}` from an array yields undefined for all three
→ every job discarded without writing a result. Client poll returns
`pending` for the full 60s timeout, then (on the prior client code path)
leaked the stuck "Computing…" button state indefinitely.

Fix: use the standard Upstash REST command format — POST to the base URL
with body `["RPUSH", key, value]`. Matches scripts/ais-relay.cjs upstashLpush.

After this, the scenario-queue:pending list stores the raw payload string,
BLMOVE returns the payload, JSON.parse gives the object, validation passes,
computeScenario runs, result key gets written, client poll sees `done`.

Zero result keys existed in prod Redis in the last 24h (24h TTL on
scenario-result:*) — confirms the fix addresses the production outage.
2026-04-18 23:38:33 +04:00
Elie Habib
d37ffb375e fix(referral): stop /api/referral/me 503s on prod homepage (#3186)
* fix(referral): make /api/referral/me non-blocking to stop prod 503s

Reported in prod: every PRO homepage load was logging 'GET /api/referral/me 503' to Sentry. Root cause: a prior review required the Convex binding to block the response (rationale: don't hand users a dead share link). That turned any flaky relay call into a homepage-wide 503 for the 5-minute client cache window — every PRO user, every page reload.

Fix: dispatch registerReferralCodeInConvex via ctx.waitUntil. Response returns 200 + code + shareUrl unconditionally. Binding failures log a warning but never surface as 503. The mutation is idempotent; the next /api/referral/me fetch retries. The /pro?ref=<code> signup side reads userReferralCodes at conversion time, so a missed binding degrades to missed attribution (partial), never to blocked homepage (total).

The BRIEF_URL_SIGNING_SECRET-missing 503 path is unchanged — that's a genuine misconfig, not a flake.

Handler signature now takes ctx with waitUntil, matching api/notification-channels.ts and api/discord/oauth/callback.ts.

Regression test flipped: brief-referral-code.test.mjs previously enforced the blocking shape; now enforces the non-blocking shape + handler signature + explicit does-not-503-on-binding-failure assertion. 14/14 referral tests pass. Typecheck clean, 5706/5706 test:data, lint exit 0.

* fix(referral): narrow err in non-blocking catch instead of unsafe cast

Greptile P2 on #3186. The (err as Error).message cast was safe today (registerReferralCodeInConvex only throws Error instances) but would silently log 'undefined' if a future path ever threw a non-Error value. Swapped to instanceof narrow + String(err) fallback.
2026-04-18 23:32:48 +04:00
Elie Habib
3c47c1b222 fix(supply-chain): split chokepoint transit data + close silent zero-state cache (#3185)
* fix(supply-chain): split chokepoint transit data + close silent zero-state cache

Production supply-chain panel was rendering 13 empty chokepoints because
the getChokepointStatus RPC silently cached zero-state for 5 minutes:

1. supply_chain:transit-summaries:v1 grew to ~500 KB (180d × 13 × 14 fields
   of history per chokepoint).
2. REDIS_OP_TIMEOUT_MS is 1.5 s. Vercel Sydney edge → Upstash for a 500 KB
   GET consistently exceeded the budget; getCachedJson caught the AbortError
   and returned null.
3. The 500 KB portwatch fallback read hit the same timeout.
4. summaries = {} → every summaries[cp.id] was undefined → 13 chokepoints
   got the zero-state default → cached as a non-null success response for
   REDIS_CACHE_TTL (5 min) instead of NEG_SENTINEL (120 s).

Fix (one PR, per docs/plans/chokepoint-rpc-payload-split.md):

- ais-relay.cjs: split seedTransitSummaries output.
  - supply_chain:transit-summaries:v1 — compact (~30 KB, no history).
  - supply_chain:transit-summaries:history:v1:{id} — per chokepoint
    (~35 KB each, 13 keys). Both under the 1.5 s Redis read budget.
- New RPC GetChokepointHistory: lazy-loaded on card expand.
- get-chokepoint-status.ts: drop the 500 KB portwatch/corridorrisk/
  chokepoint_transits fallback reads. Treat a null transit-summaries
  read as upstreamUnavailable=true so cachedFetchJson writes NEG_SENTINEL
  (2 min) instead of a 5-min zero-state pin. Omit history from the
  response (proto field stays declared; empty array).
- server/_shared/redis.ts: tag AbortError timeouts with [REDIS-TIMEOUT]
  key=… timeoutMs=… so log drains / Sentry-Vercel integration pick up
  large-payload timeouts instead of them being silently swallowed.
- SupplyChainPanel.ts + MapPopup.ts: lazy-fetch history on card expand
  via fetchChokepointHistory; session-scoped cache; graceful "History
  unavailable" on empty/error. PRO gating on the map popup unchanged.
- Gateway: cache-tier entry for /get-chokepoint-history (slow).
- Tests: regression guards for upstreamUnavailable gate + per-id key
  shape + handler wiring + proto query annotations.
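The null-read → NEG_SENTINEL rule at the heart of the fix can be sketched as follows (hypothetical helper name and shapes; the TTL values mirror the commit, the real logic lives in get-chokepoint-status.ts/cachedFetchJson):

```javascript
// Hypothetical sketch of the cache decision the fix enforces.
const REDIS_CACHE_TTL = 300;  // seconds, successful responses (5 min)
const NEG_SENTINEL_TTL = 120; // seconds, upstream-unavailable (2 min)

function chooseCacheWrite(summaries) {
  // A null read (Redis timeout) must never be cached as a success:
  // before the fix, `summaries = {}` produced 13 zero-state rows that
  // were pinned as a non-null response for the full 5 minutes.
  const upstreamUnavailable =
    summaries == null || Object.keys(summaries).length === 0;
  return upstreamUnavailable
    ? { kind: 'neg-sentinel', ttl: NEG_SENTINEL_TTL }
    : { kind: 'success', ttl: REDIS_CACHE_TTL };
}
```

The 2-min negative cache is what makes the zero-state self-heal on the next relay tick instead of sticking.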

Audit included in plan: no other RPC consumer read stacks >200 KB
besides displacement:summary:v1:2026 (724 KB, same risk, flagged for
follow-up PR). wildfire:fires:v1 at 1.7 MB loads via bootstrap (3 s
timeout, different path) — monitor but out of scope.

Expected impact:
- supply_chain:chokepoints:v4 payload drops from ~508 KB to <100 KB.
- supply_chain:transit-summaries:v1 drops from ~502 KB to <50 KB.
- RPC Redis reads stay well under 1.5 s in the hot path.
- Silent zero-state pinning is now impossible: null reads → 2-min neg
  cache → self-heal on next relay tick.

* fix(supply-chain): address PR #3185 review — stop caching empty/error + fix partial coverage

Two P1 regressions caught in review:

1. Client cache poisoning on empty/error (MapPopup.ts, SupplyChainPanel.ts)
   Empty-array is truthy in JS, so MapPopup's `!cached && !inflight` branch
   never fired once we cached []. Neither `cached && cached.length` fired
   either — popup stuck on "Loading transit history..." for the session.
   SupplyChainPanel had the explicit `cached && !cached.length` branch but
   still never retried, so the same transient became session-sticky there too.

   Fix: cache ONLY non-empty successful responses. Empty/error show the
   "History unavailable" placeholder but leave the cache untouched, so the
   next re-expand retries. The /get-chokepoint-history gateway tier is
   "slow" (5-min CF edge cache) → retries stay cheap.

2. Partial portwatch coverage treated as healthy (ais-relay.cjs)
   seedTransitSummaries iterated Object.entries(pw), so if seed-portwatch
   dropped N of 13 chokepoints (ArcGIS reject/empty), summaries had <13 keys.
   get-chokepoint-status upstreamUnavailable fires only on fully-empty
   summaries, so the N missing chokepoints fell through to zero-state rows
   that got pinned in cache for 5 minutes.

   Fix: iterate CANONICAL_IDS (Object.keys(CHOKEPOINT_THREAT_LEVELS)) and
   fill zero-state for any ID missing from pw. Shape is consistently 13
   keys. Track pwCovered → envelope + seed-meta recordCount reflect real
   upstream coverage (not shape size), so health.js can distinguish 13/13
   healthy from 10/13 partial. Warn-log on shortfall.
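The cache-only-non-empty rule from finding 1 can be sketched as below (illustrative names; the real panels key a session-scoped cache per chokepoint id):

```javascript
// Sketch: cache ONLY non-empty successful responses, so a transient
// empty/error result is retried on the next expand instead of being
// session-sticky.
const historyCache = new Map();

async function getHistory(id, fetchHistory) {
  if (historyCache.has(id)) return historyCache.get(id);
  let rows = [];
  try {
    rows = await fetchHistory(id);
  } catch {
    rows = []; // renders the "History unavailable" placeholder
  }
  // [] is truthy in JS, so an `if (cached)` guard cannot distinguish
  // "cached empty" from "cached data"; gate the write on length.
  if (rows.length > 0) historyCache.set(id, rows);
  return rows;
}
```

Retries stay cheap because the gateway tier for /get-chokepoint-history is "slow" (5-min CF edge cache).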

Tests: new regression guards
- panel must NOT cache empty arrays (historyCache.set with []).
- writer must iterate CANONICAL_IDS, not Object.entries(pw).
- seed-meta recordCount binds to pwCovered.

5718/5718 data tests pass. typecheck + typecheck:api clean.
2026-04-18 23:14:00 +04:00
Elie Habib
d1e084061d fix(sw): preserve open modals when tab-hide auto-reload would fire (#3184)
* fix(sw): preserve open modals when tab-hide auto-reload would fire

Scenario: a Pro user opens the Clerk sign-in modal, enters their email,
and switches to their mail app to fetch the code. If a deploy happens
while they wait and the SW update toast's 5 s dwell window has elapsed,
`visibilitychange: hidden` triggers `window.location.reload()` — which
wipes the Clerk flow, so the code in the inbox is for a now-dead attempt
and the user has to re-request. Same failure applies to UnifiedSettings,
the ⌘K search modal, story/signal popups, and anything else with modal
semantics: leaving the tab = lose your place.

Fix: in `sw-update.ts`, the hidden-tab auto-reload now checks for any
open modal/dialog via a compound selector (`[aria-modal="true"],
[role="dialog"], .modal, .cl-modalBackdrop, dialog[open]`) and
suppresses the reload when one matches. Covers Clerk's `.cl-modalBackdrop`,
the site-wide `.modal` convention (UnifiedSettings, WidgetChatModal),
and any well-authored dialog. The reload stays armed — next tab-hide
after the modal closes fires it. Manual "Reload" button click is
unaffected (explicit user intent).

Over-matching is safe (worst case: user clicks Reload manually).
Under-matching keeps the bug, so the selector errs generous.

Tests: three new cases cover modal-open suppression, re-arming after
modal close, and manual-click bypass. 25/25 sw-update tests pass.

Follow-up ticket worth filing: add `aria-modal="true"` + `role="dialog"`
to the modals that are missing them (SearchModal, StoryModal, SignalModal,
WidgetChatModal, McpConnectModal, MobileWarningModal, CountryIntelModal,
UnifiedSettings). That's the proper long-term a11y fix and would let us
narrow the selector once coverage is complete.

* fix(sw): filter modal guard by actual visibility, not just DOM presence

Addresses review feedback on #3184:

The previous selector (`[role="dialog"]` etc.) matched the UnifiedSettings
overlay, which is created in its constructor at app startup
(App.ts:977 → UnifiedSettings.ts:68-71 sets role="dialog") and stays in
the DOM for the whole session. That meant auto-reload was effectively
disabled for every user, not just those with an actually-open modal.

Fix: don't just check for selector matches — check whether the matched
element is actually rendered. Persistent modal overlays hide themselves
via `display: none` (main.css:6744: `.modal-overlay { display: none }`)
and reveal via an `.active` class (main.css:6750: `.active { display: flex }`),
so `offsetParent === null` cleanly distinguishes closed from open. We
prefer `checkVisibility()` where available (Chrome 105+, Safari 17.4+,
Firefox 125+, which covers virtually all current WM users) and fall back
to `offsetParent` otherwise.

This also handles future modals automatically, without needing us to
enumerate every `.xxx-modal-overlay.active` class the site might
introduce.

New tests:
- Modal mounted AND visible → reload suppressed (original Clerk case)
- Modal mounted but hidden → reload fires (reviewer's regression case)
- Modal visible, then hidden on return → reload fires on next tab-hide
- Manual Reload click unaffected in all cases

26/26 sw-update tests pass.

* fix(sw): replace offsetParent fallback with getClientRects for fixed overlays

Addresses second review finding on #3184:

The previous fallback `el.offsetParent !== null` silently failed on every
`position: fixed` overlay — which is every modal in this app:

- `.modal-overlay` (main.css:6737) — UnifiedSettings, WidgetChatModal
- `.story-modal-overlay` (main.css:3442)
- `.country-intel-modal-overlay` active state (main.css:18415)

MDN: `offsetParent` is specified to return null for any `position: fixed`
element, regardless of visibility. So on Firefox <125 or Safari <17.4
(where `Element.checkVisibility()` is unavailable), `isModalOpen` would
return false for actually-open modals → auto-reload fires → Clerk sign-in
and every other fixed-position flow gets wiped exactly as PR #3184 was
meant to prevent.

Fix: fall back to `getClientRects().length > 0`. This returns 0 for
`display: none` elements (how `.modal-overlay` hides when `.active` is
absent) and non-zero for rendered elements, including position:fixed.
It's universally supported and matches the semantics we want.

New tests exercise the fallback path explicitly with a `supportsCheckVisibility`
toggle on the fake env:

- visible position:fixed modal + no checkVisibility → reload suppressed
- hidden mounted modal + no checkVisibility → reload fires

28/28 sw-update tests pass.
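The final visibility check can be sketched as follows (a sketch over element-like objects; the real code queries the compound modal selector first):

```javascript
// Prefer Element.checkVisibility() (Chrome 105+, Safari 17.4+,
// Firefox 125+); fall back to getClientRects(), which, unlike
// offsetParent, works for position:fixed overlays.

function isElementRendered(el) {
  if (typeof el.checkVisibility === 'function') return el.checkVisibility();
  // offsetParent is specified to be null for any position:fixed
  // element, so it cannot distinguish a hidden fixed modal from a
  // visible one. getClientRects() returns an empty list for
  // display:none elements and a non-empty one for anything rendered.
  return el.getClientRects().length > 0;
}
```

With this predicate, a mounted-but-hidden `.modal-overlay` no longer suppresses the auto-reload, and an open one does, on both the modern and fallback paths.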

* fix(a11y): add role=dialog + aria-modal=true to five missing modals

Addresses third review finding on #3184.

SW auto-reload guard uses a `[role="dialog"]` selector but five modals
were missing the attribute, so `isModalOpen()` returned false and the
page could still auto-reload mid-flow on those screens. Broadening the
selector to enumerate specific class names was rejected because the app
has many non-modal `-overlay` classes (`#deckgl-overlay`,
`.conflict-label-overlay`, `.layer-warn-overlay`, `.mobile-menu-overlay`)
that would cause false positives and permanently disable auto-reload.

Instead, standardize on the existing convention used by UnifiedSettings:
every modal overlay sets `role="dialog"` + `aria-modal="true"` at
creation. This makes the SW selector work AND improves screen-reader
behavior (focus trap, background element suppression).

Modals updated:
- SearchModal (⌘K search) — both mobile sheet and desktop variants use
  the same element, single set-attributes call at create time
- StoryModal (news story detail)
- SignalModal (instability spike detail)
- CountryIntelModal (country deep-dive overlay)
- MobileWarningModal (mobile device warning)

No change to sw-update.ts — the existing selector already covers the
newly-attributed elements. All 28 sw-update tests still pass; typecheck
clean.
2026-04-18 22:54:58 +04:00
Elie Habib
55ac431c3f feat(brief): public share mirror + in-magazine Share button (#3183)
* feat(brief): public share mirror + in-magazine Share button

Adds the growth-vector piece listed under Future Considerations in the
original brief plan (line 399): a shareable public URL and a one-click
Share button on the reader's magazine.

Problem: the per-user magazine at /api/brief/{userId}/{issueDate} is
HMAC-signed to a specific reader. You cannot share the URL you are
looking at, because the recipient either 403s (bad token) or reads
your personalised issue against your userId. Result: no way to share
the daily brief, no way for readers to drive discovery. Opening a
growth loop requires a separate public surface.

Approach: deterministic HMAC-derived short hash per {userId,
issueDate} backed by a pointer key in Redis.

New files

- server/_shared/brief-share-url.ts
  Web Crypto HMAC helper. deriveShareHash returns 12 base64url chars
  (72 bits) from (userId, issueDate) using BRIEF_SHARE_SECRET.
  Pointer encode/decode helpers and a shape check. Distinct from the
  per-user BRIEF_URL_SIGNING_SECRET so a leak of one does not
  automatically unmask the other.

- api/brief/share-url.ts (edge, Clerk auth, Pro gated)
  POST /api/brief/share-url?date=YYYY-MM-DD
  Idempotently writes brief:public:{hash} pointer with the same 7 day
  TTL as the underlying brief, then returns {shareUrl, hash,
  issueDate}. 404 if the per-user brief is missing. 503 on Upstash
  failure. Accepts an optional refCode in the JSON body for referral
  attribution.

- api/brief/public/[hash].ts (edge, unauth)
  GET /api/brief/public/{hash}?ref={code}
  Reads pointer, reads the real brief envelope, renders with
  publicMode=true. Emits X-Robots-Tag: noindex,nofollow so shared
  briefs never get enumerated by search engines. 404 on any missing
  part (bad hash shape, missing pointer, missing envelope) with a
  neutral error page. 503 on Upstash failure.

Renderer changes (server/_shared/brief-render.js)

- Signature extended: renderBriefMagazine(envelope, options?)
  - options.publicMode: redacts user.name and whyMatters before any
    HTML emission; swaps the back cover to a Subscribe CTA; prepends
    a Subscribe strip across the top of the deck; omits the Share
    button + share script; adds a noindex meta tag.
  - options.refCode: appended as ?ref= to /pro links on public views.
- Non-public views gain a sticky .wm-share pill in the top-right
  chrome. Inline SHARE_SCRIPT handles the click flow: POST /api/
  brief/share-url then navigator.share with clipboard fallback and a
  prompt() ancient-browser fallback. User-visible feedback via
  data-state on the button (sharing / copied / error). No change to
  the envelope contract, no LLM calls, no composer-side work
  required.
- Validation runs on the full unredacted envelope first, so the
  public path can never accept a shape the private path would reject.

Tests

- tests/brief-share-url.test.mts (18 assertions): determinism,
  secret sensitivity, userId/date sensitivity, shape validation, URL
  composition with/without refCode, trailing-slash handling on
  baseUrl, pointer encode/decode round-trip.
- tests/brief-magazine-render.test.mjs (+13 assertions): Share
  button carries the issue date; share script emitted once;
  share-url endpoint wired; publicMode strips the button+script,
  replaces whyMatters, emits noindex meta, prepends Subscribe strip,
  passes refCode through with escaping, swaps the back cover, does
  not leak the user name, preserves story headlines, options-less
  call matches the empty-options call byte for byte.
- Full typecheck/lint/edge-bundle/test:data/edge-functions suite all
  green: 5704/5704 data tests, 171/171 edge-function tests, 0 lint
  errors.

Env vars (new)

- BRIEF_SHARE_SECRET: 64+ random hex chars, Vercel (edge) only. NOT
  needed by the Railway composer because pointer writes are lazy
  (on share, not on compose).

* fix(brief): public share round-trip + magazine Share button without auth

Two P1 findings on #3183 review.

1) Pointer wire format: share-url.ts wrote the pointer as a raw colon-delimited string via SET. The public route reads via readRawJsonFromUpstash, which ALWAYS JSON.parses. A bare non-JSON string throws at parse, so the route returned 503 instead of resolving. Fix: JSON.stringify on both write sites. Regression test locks the wire format.

2) Share button auth unreachable from a standalone magazine tab: inline script needed window.WM_CLERK_JWT which is never set, endpoint hard-requires Bearer, fallback to credentials:include fails. Fix: derive share URL server-side in the per-user route (same inputs share-url uses), embed as data-share-url, click handler now reads dataset and invokes navigator.share directly. No network, no auth, works in any tab.

The /api/brief/share-url endpoint stays in place for other callers (dashboard panel) with its Clerk auth intact and its pointer write now in the correct format.

QA: typecheck clean, 5708/5708 data tests, 45/45 magazine, 20/20 share-url, edge bundle OK, lint exit 0.
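The wire-format contract from finding 1 can be sketched as follows (the in-memory Map stands in for Upstash; key and pointer shapes follow the commit):

```javascript
// The reader always JSON.parses what it gets back, so the writer must
// JSON.stringify, even though the pointer itself is just a string.
const store = new Map();

function writePointer(hash, userId, issueDate) {
  const pointer = `${userId}:${issueDate}`; // colon-delimited pointer
  store.set(`brief:public:${hash}`, JSON.stringify(pointer));
}

function readPointer(hash) {
  const raw = store.get(`brief:public:${hash}`);
  if (raw === undefined) return null;
  // readRawJsonFromUpstash ALWAYS parses; a bare string here throws
  // and surfaced as the 503 the fix removed.
  return JSON.parse(raw);
}
```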

* fix(brief): address remaining review findings on #3183

P0-2 (comment-only): public/[hash].ts inline comment incorrectly described readRawJsonFromUpstash parse-failure behaviour. The helper rethrows on JSON.parse failure, it does not return null. Rewrote the comment to match reality (JSON-encoded wire format, parse-to-string round-trip, intentional 503-on-bug-value as the loud failure mode). The actual wire-format fix was in prior commit 045771d55.

P2 (consistency): publicStripHtml href was built via template literal + encodeURIComponent without the final escapeHtml wrap that renderBackCover uses. Safe in practice (encodeURIComponent handles all HTML-special chars + route boundary restricts refCode to [A-Za-z0-9_-]) but inconsistent. Unified by extracting publicStripHref and escaping on interpolation, matching the sibling function.

QA: typecheck clean, 45/45 magazine tests pass, lint exit 0.
2026-04-18 22:46:22 +04:00
Elie Habib
81536cb395 feat(brief): source links, LLM descriptions, strip suffix (envelope v2) (#3181)
* feat(brief): source links, LLM descriptions, strip publisher suffix (envelope v2)

Three coordinated fixes to the magazine content pipeline.

1. Headlines were ending with " - AP News" / " | Reuters" etc. because
   the composer passed RSS titles through verbatim. Added
   stripHeadlineSuffix() in brief-compose.mjs: a conservative,
   case-insensitive match only when the trailing token equals primarySource,
   so a real subtitle that happens to contain a dash still survives.

2. Story descriptions were the headline verbatim. Added
   generateStoryDescription to brief-llm.mjs, plumbed into
   enrichBriefEnvelopeWithLLM: one additional LLM call per story,
   cached 24h on a v1 key covering headline, source, severity,
   category, country. Cache hits are revalidated via
   parseStoryDescription so a bad row cannot flow to the envelope.
   Falls through to the cleaned headline on any failure.

3. Source attribution was plain text, no outgoing link. Bumped
   BRIEF_ENVELOPE_VERSION to 2, added BriefStory.sourceUrl. The
   composer now plumbs story:track:v1.link through
   digestStoryToUpstreamTopStory, UpstreamTopStory.primaryLink,
   filterTopStories, BriefStory.sourceUrl. The renderer wraps the
   Source line in an anchor with target=_blank, rel=noopener
   noreferrer, and UTM params (utm_source=worldmonitor,
   utm_medium=brief, utm_campaign=<issueDate>, utm_content=story-
   <rank>). UTM appending is idempotent, publisher-attributed URLs
   keep their own utm_source.
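The conservative suffix strip from item 1 can be sketched as below (an illustrative implementation, not the one in brief-compose.mjs):

```javascript
// Strip a trailing " - Source" / " | Source" only when the text after
// the final separator equals the story's primarySource,
// case-insensitively, so a real subtitle containing a dash survives.
function stripHeadlineSuffix(headline, primarySource) {
  const m = headline.match(/^(.*\S)\s+[-|–]\s+([^-|–]+)$/);
  if (!m) return headline;
  return m[2].trim().toLowerCase() === primarySource.trim().toLowerCase()
    ? m[1]
    : headline;
}
```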

Envelope validation gains a validateSourceUrl step (https/http only,
no userinfo credentials, parseable absolute URL). Stories without a
valid upstream link are dropped by filterTopStories rather than
shipping with an unlinked source.
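The idempotent UTM step from item 3 can be sketched as follows (illustrative function name; the param values mirror the commit):

```javascript
// Append tracking params to a story's source link unless a utm_source
// is already present (publisher-attributed, or already tagged by us),
// which also makes the operation idempotent.
function appendUtm(href, issueDate, rank) {
  const url = new URL(href);
  if (url.searchParams.has('utm_source')) return href;
  url.searchParams.set('utm_source', 'worldmonitor');
  url.searchParams.set('utm_medium', 'brief');
  url.searchParams.set('utm_campaign', issueDate);
  url.searchParams.set('utm_content', `story-${rank}`);
  return url.toString();
}
```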

Tests: 30 renderer tests to 38; new assertions cover UTM presence on
every anchor, HTML-escaping of ampersands in hrefs, pre-existing UTM
preservation, and all four validator rejection modes. New composer
tests cover suffix stripping, link plumb-through, and v2 drop-on-no-
link behaviour. New LLM tests for generateStoryDescription cover
cache hit/miss, revalidation of bad rows, 24h TTL, and null-on-
failure.

* fix(brief): v1 back-compat window on renderer + consolidate story hash helper

Two P1/P2 review findings on #3181.

P1 (v1 back-compat). Bumping BRIEF_ENVELOPE_VERSION 1 to 2 made every
v1 envelope still resident in Redis under the 7-day TTL fail
assertBriefEnvelope. The hosted /api/brief route would 404 "expired"
and the /api/latest-brief preview would downgrade to "composing",
breaking already-issued links from the preceding week.

Fix: renderer now accepts SUPPORTED_ENVELOPE_VERSIONS = Set([1, 2])
on READ. BRIEF_ENVELOPE_VERSION stays at 2 and is the only version
the composer ever writes. BriefStory.sourceUrl is required when
version === 2 and absent on v1; when rendering a v1 story the source
line degrades to plain text (no anchor), matching pre-v2 appearance.
When the TTL window passes the set can shrink to [2] in a follow-up.

P2 (hash dedup). hashStoryDescription was byte-identical to hashStory,
inviting silent drift if one prompt gains a field the other forgets.
Consolidated into hashBriefStory. Cache key separation remains via
the distinct prefixes (brief:llm:whymatters:v2:/brief:llm:description:v1:).

Tests: adds 3 v1 back-compat assertions (plain source line, field
validation still runs, defensive sourceUrl check), updates the
version-mismatch assertion to match the new supported-set message.
161/161 pass (was 158). Full test:data 5706/5706.
2026-04-18 21:49:17 +04:00
Elie Habib
8fc302abd9 fix(brief): mobile layout — stack story callout, floor digest typography (#3180)
* fix(brief): mobile layout — stack story callout, floor digest typography

On viewports <=640px the 55/45 story grid cramped both the headline and
the "Why this is important" callout to ~45% width each, and several
digest rules used raw vw units (blockquote 2vw, threads 1.55vw) that
collapsed to ~7-8px on a 393px iPhone frame before the browser min
clamped them to barely-readable.

Appends a single @media (max-width: 640px) block to the renderer's
STYLE_BLOCK:

- .story becomes a flex column — callout stacks under the headline,
  no column squeeze. Headline goes full-width at 9.5vw.
- Digest blockquote, threads, signals, and stat rows get max(Npx, Nvw)
  floors so they never render below ~15-17px regardless of viewport.
- Running-head stacks on digest and the absolute page-number gets
  right-hand clearance so they stop overlapping.
- Tags and source labels pinned to 11px (were scaling down with vw).

CSS-only; no envelope, no HTML structure, no new classes. All 30
renderBriefMagazine tests still pass.

* fix(brief): raise mobile digest px floors and running-head clearance

Two P2 findings from PR review on #3180:

1. .digest .running-head padding-right: 18vw left essentially zero
   clearance from the absolute .page-number block on iPhone SE (375px)
   and common Android (360px). Bumped to 22vw (~79px at 360px) which
   accommodates "09 / 12" in IBM Plex Mono at the right:5vw offset
   with a one-vw safety margin.

2. Mobile overrides were lowering base-rule px floors (thread 17px to
   15px, signal 18px to 15px). On viewports <375px this rendered
   digest body text smaller than desktop. Kept the px floors at or
   above the base rules so effective size only ever goes up on mobile.
2026-04-18 21:37:40 +04:00
Elie Habib
388995b1a4 fix(health): macroSignals maxStaleMin 20 → 150 to match seed-economy cron cadence (#3179)
macroSignals is a secondary key written by seed-economy.mjs, whose
primary key energy-prices has maxStaleMin=150 in its runSeed config.
A 20-min threshold guaranteed STALE_SEED between every cron run.
2026-04-18 20:50:48 +04:00
Elie Habib
6f6102e5a7 feat(brief): swap sienna rust for two-strength WM mint (Option B palette) (#3178)
* feat(brief): swap sienna rust for two-strength WM mint (Option B palette)

The only off-brand color in the product was the brief's sienna rust
(#8b3a1f) accent. Every other surface — /pro landing, dashboard,
dashboard panels — uses the WM mint green (#4ade80). Swapping the
brief's accent to the brand mint makes the magazine read as a sibling
of /pro rather than a separate editorial product, while keeping the
magazine-grade serif typography and even/odd page inversion intact.

Implementation (user picked Option B from brief-palette-playground.html):
  --sienna  : #8b3a1f -> #3ab567   muted mint for LIGHT pages (readable
                                   on #fafafa without the bright-mint
                                   glare of a pure-brand swap)
  --mint    :         + #4ade80   bright WM mint for DARK pages
                                   (matches /pro exactly)
  --cream   : #f1e9d8 -> #fafafa   unified with --paper; one crisp white
  --cream-ink: #1a1612 -> #0a0a0a  crisper contrast on the new paper

Accent placement (unchanged structurally — only colors swapped):
  - Digest running heads, labels, blockquote rule, stats dividers,
    end-marker rule, signal/thread tags: all muted mint on light
  - Story source line: newly mint (was unstyled bone/ink at 0.6 opacity);
    two-strength — muted on light stories, bright on dark
  - Logo ekg dot: mint on every page so the brand 'signal' pulse
    threads through the whole magazine

No layout changes. No HTML structure changes. Only color constants +
a ~20-line CSS addition for story-source + ekg-dot accents.

165/165 brief tests pass (renderer contract unchanged — envelope shape
identical, only computed styles differ). Both tsconfigs typecheck clean.

* fix(brief): darken light-page mint to pass WCAG AA + fix digest ekg-dot

Two P2 findings on PR #3178 review.

1. Digest ekg-dot used bright #4ade80 on a #fafafa background,
   contradicting the code comment that said 'light pages use the
   muted mint'. The rule was grouped with .cover and .story.dark
   (both ink backgrounds) when it should have been grouped with
   .story.light (paper background). Regrouped.

2. #3ab567 on #fafafa tests at ~2.31:1 — fails WCAG AA 4.5:1 for
   every text size and fails the 3:1 large-text floor. The PR called
   this a rollback trigger; contrast math says it would fail every
   meaningful text usage (mono running heads, source lines, labels,
   footer captions). Swapped --sienna from #3ab567 to #1f7a3f —
   tested at ~4.90:1 on #fafafa, passes AA for normal text.

   Kept the variable name '--sienna' for backwards compat (every
   .digest rule references it). The hue stays recognisably mint-
   family (green dominant) so the brand relationship with #4ade80
   on dark pages is still clear to a reader. Dark-page mint is
   unchanged — #4ade80 on #0a0a0a is ~11.4:1, passes AAA.

Playground (brief-palette-playground.html) updated to match so
future iterations work against the accessible value.

165/165 brief tests pass. Both tsconfigs typecheck clean.
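The contrast math behind the swap follows the WCAG 2.x relative-luminance and contrast-ratio definitions; exact ratios vary slightly between tools depending on rounding:

```javascript
// WCAG relative luminance for a #rrggbb color, then contrast ratio
// (L_lighter + 0.05) / (L_darker + 0.05).
function luminance(hex) {
  const c = [1, 3, 5].map((i) => {
    const v = parseInt(hex.slice(i, i + 2), 16) / 255;
    return v <= 0.03928 ? v / 12.92 : ((v + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * c[0] + 0.7152 * c[1] + 0.0722 * c[2];
}

function contrast(fg, bg) {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}
```

Running these confirms the shape of the argument: #3ab567 on #fafafa lands below even the 3:1 large-text floor, #1f7a3f clears 4.5:1 (AA normal text), and #4ade80 on #0a0a0a clears 7:1 (AAA).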
2026-04-18 20:50:16 +04:00
Elie Habib
048e5486ac fix(brief): Latest Brief panel locks out Pro users — gate reads Clerk metadata, not entitlement (#3177)
* fix(brief): Latest Brief panel locks out Pro users — gate reads wrong field

Reported: PR #3166 shipped a WEB_CLERK_PRO_ONLY_PANELS gate that
downgrades the Latest Brief panel to FREE_TIER/ANONYMOUS when the
user isn't Clerk-Pro. The downgrade condition was:

  state.user?.role !== 'pro'

state.user.role is derived from Clerk's publicMetadata.plan via
getCurrentClerkUser(). That field is NOT kept in sync with the
real entitlement for many users — the source of truth is the
Convex entitlements table, not Clerk metadata. Result: a confirmed
Pro user (Convex entitlement.features.tier = 1+) sees every other
premium panel unlock (hasPremiumAccess consults isEntitled() per
PR #3167) but the Latest Brief panel shows 'Upgrade to Pro'.

Fix: swap the condition from 'state.user?.role !== \'pro\'' to
'!hasTier(1)' — the same Convex-backed entitlement check
hasPremiumAccess uses. The panel's own auth subscription keeps the
separate role-based guard inside refresh() as defence in depth
(belt-and-suspenders), but the top-level gating no longer
over-fires on the wrong field.

No new behaviour for users without an entitlement. Typecheck clean.

* fix(brief): panel-side role gate also reads Convex entitlement (not Clerk metadata)

Reviewer caught that the prior PR (#3177) fixed the layout-level gate
but left the panel's own refresh() guard reading authState.user.role
— same stale-publicMetadata bug. A user whose Convex entitlement
says tier=1 but whose Clerk publicMetadata.plan is unset would
unlock past the layout gate (now correct) and then still hit the
panel's local renderUpgradeRequired() path.

Fix: swap the local role check to hasTier(1) — the same Convex
snapshot the layout now consults. Now BOTH gates agree on the
source of truth.

* fix(brief): defer Pro gate when entitlement snapshot hasn't arrived yet

Review flagged a transient 'Upgrade to Pro' flash for Pro users on
initial load. The auth-state subscription can fire before the Convex
entitlement snapshot arrives; hasTier(1) returns false by default
when currentState is null, so a Pro user briefly sees the upgrade
overlay until onEntitlementChange re-runs the gate with the real
snapshot.

Fix: treat 'entitlement not yet loaded' as distinct from 'free user'.
Both panel-layout.ts gate AND LatestBriefPanel.refresh() now check
getEntitlementState() !== null before applying the Clerk-Pro-only
downgrade. During the unknown window the panel stays in its loading
state; the onEntitlementChange listener re-runs updatePanelGating
once the snapshot lands and either unlocks or gates correctly.

No behaviour change for free users (entitlement snapshot arrives
with tier=0, still correctly gates). No behaviour change for the
steady-state Pro case. Only the cold-start window differs: flash
of upgrade-overlay → clean loading state.

* fix(brief): drop client entitlement gate from panel refresh — let server decide

Reviewer's sharper read on PR #3177: the prior 'defer-if-unknown' fix
still blocks Pro users whenever the Convex entitlement subscription
is late, skipped, or failed to establish. getEntitlementState() can
stay null indefinitely if the Convex client auth never connects;
hasTier(1) would stay false; the panel would stay on renderLoading()
forever and the server-side /api/latest-brief check would never even
fire.

The correct architecture: the server is authoritative. /api/latest-
brief already does its own entitlement check against the Clerk JWT.
Client-side entitlement is a fast-path optimisation, never a gate.

Fix: switch both call sites to AFFIRMATIVE DENIAL ONLY.

  LatestBriefPanel.refresh()
    Before: if snapshot null -> renderLoading (fetch never fires);
            if snapshot + free -> renderUpgradeRequired.
    After:  if snapshot != null AND !hasTier(1) -> renderUpgradeRequired.
            Otherwise fall through and FIRE THE FETCH. The 403 path
            (BriefAccessError 'upgrade_required') already renders the
            upgrade CTA when the server says free.

  panel-layout.ts updatePanelGating
    Already shaped as affirmative-denial (snapshot != null AND !hasTier).
    Updated the comment to make the invariant explicit so a future
    refactor doesn't flip it back to positive-gating.

Consequence: an API-key-only user with a free Clerk account will
fire one doomed fetch per refresh and see renderUpgradeRequired a
beat later than before. Accepted — the alternative locked legitimate
Pro users out whenever Convex was anything other than perfectly
healthy, which is a materially worse failure mode.
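
The affirmative-denial shape can be sketched as below — a minimal illustration, not the repo's actual code; getEntitlementState/hasTier mirror the names in this message but the bodies are stand-ins, and the two gate functions exist only to contrast the buggy and fixed shapes:

```typescript
// Hypothetical sketch of positive gating vs affirmative denial.
type EntitlementSnapshot = { tier: number } | null;

let currentState: EntitlementSnapshot = null;

function getEntitlementState(): EntitlementSnapshot {
  return currentState;
}

function hasTier(min: number): boolean {
  return currentState !== null && currentState.tier >= min;
}

// Positive gating (buggy): "only proceed if provably entitled".
// A null snapshot blocks forever when the Convex subscription never lands.
function positiveGate(): "fetch" | "blocked" {
  return hasTier(1) ? "fetch" : "blocked";
}

// Affirmative denial (fixed): only gate when we affirmatively KNOW the
// user is free; otherwise fall through and let the server's 403 decide.
function affirmativeDenialGate(): "fetch" | "upgrade" {
  const snapshot = getEntitlementState();
  if (snapshot !== null && !hasTier(1)) return "upgrade";
  return "fetch";
}
```

The only behavioural difference is the null-snapshot row: positive gating blocks, affirmative denial fetches and defers to the server.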

Both tsconfigs typecheck clean. No test changes needed — the
BriefAccessError path was already covered by existing tests.
2026-04-18 20:49:39 +04:00
Elie Habib
b5824d0512 feat(brief): Phase 9 / Todo #223 — share button + referral attribution (#3175)
* feat(brief): Phase 9 / Todo #223 — share button + referral attribution

Adds a Share button to the dashboard Brief panel so PRO users can
spread WorldMonitor virally. Built on the existing referral plumbing
(registrations.referralCode + referredBy fields; api/register-interest
already passes referredBy through) — this PR fills in the last mile:
a stable referral code for signed-in Clerk users, a share URL, and
a client-side share sheet.

Files:

  server/_shared/referral-code.ts (new)
    Deterministic 8-char hex code: HMAC(BRIEF_URL_SIGNING_SECRET,
    'referral:v1:' + userId). Same Clerk userId always produces the
    same code. No DB write on login, no schema migration, stable for
    the life of the account.

  api/referral/me.ts (new)
    GET -> { code, shareUrl, invitedCount }. Bearer-auth via Clerk.
    Reuses BRIEF_URL_SIGNING_SECRET to avoid another Railway env
    var. Stats fail gracefully to 0 on Convex outage.

  convex/registerInterest.ts + convex/http.ts
    New internal query getReferralStatsByCode({referralCode}) counts
    registrations rows that named this code as their referredBy.
    Exposed via POST /relay/referral-stats (RELAY_SHARED_SECRET auth).

  src/services/referral.ts (new)
    getReferralProfile: 5-min cache, profile is effectively immutable
    shareReferral: Web Share API primary (mobile native sheet),
    clipboard fallback on desktop. Returns 'shared'/'copied'/'blocked'
    /'error'. AbortError is treated as 'blocked', not failure.
    clearReferralCache for account-switch hygiene.

  src/components/LatestBriefPanel.ts + src/styles/panels.css
    New share row below the brief cover card. Button disabled until
    /api/referral/me resolves; if fetch fails the row removes itself.
    invitedCount > 0 renders as 'N invited' next to the button.
    Referral cache invalidated alongside Clerk token cache on account
    switch (otherwise user B would see user A's share link for 5 min).
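
The deterministic-code scheme in server/_shared/referral-code.ts can be sketched as follows — a minimal version assuming HMAC-SHA256 over the versioned message and an 8-hex-char truncation; referralCodeForUser is an illustrative name, not the module's actual export:

```typescript
import { createHmac } from "node:crypto";

// Deterministic 8-char hex code: same (secret, userId) always yields the
// same code, so no DB write is needed; rotating the secret invalidates all.
function referralCodeForUser(secret: string, userId: string): string {
  if (!secret || !userId) throw new Error("secret and userId are required");
  return createHmac("sha256", secret)
    .update(`referral:v1:${userId}`)
    .digest("hex")
    .slice(0, 8);
}
```

The truncated-HMAC output alphabet is [0-9a-f], which is also why a reserved-word collision guard would be dead code (see the later P2 fix).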

Tests: 10 new cases in tests/brief-referral-code.test.mjs
  - getReferralCodeForUser: hex shape, determinism, uniqueness,
    secret-rotation invalidates, input guards
  - buildShareUrl: path shape, trailing-slash trim, URL-encoding

153/153 brief + deploy tests pass. Both tsconfigs typecheck clean.

Attribution flow (already working end-to-end):
  1. Share button -> worldmonitor.app/pro?ref={code}
  2. /pro landing page already reads ?ref= and passes to
     /api/register-interest as referredBy
  3. convex registerInterest:register increments the referrer's
     referralCount and stores referredBy on the new row
  4. /api/referral/me reads the count back via the relay query
  5. 'N invited' updates on next 5-min cache refresh

Scope boundaries (deferred):
  - Convex conversion tracking (invited -> PRO subscribed). Needs a
    join from registrations.referredBy to subscriptions.userId via
    email. Surface 'N converted' in a follow-up.
  - Referral-credit / reward system: viral loop works today, reward
    logic is a separate product decision.

* fix(brief): address three P2 review findings on #3175

- api/referral/me.ts JSDoc said '503 if REFERRAL_SIGNING_SECRET is
  not configured' but the handler actually reads BRIEF_URL_SIGNING_SECRET.
  Updated the docstring so an operator chasing a 503 doesn't look for
  an env var that doesn't exist.

- server/_shared/referral-code.ts carried a RESERVED_CODES Set to
  avoid collisions with URL-path keywords ('index', 'robots', 'admin').
  The guard is dead code: the code alphabet is [0-9a-f] (hex output
  of the HMAC) so none of those non-hex keywords can ever appear.
  Removed the Set + the while loop; left a comment explaining why it
  was unnecessary so nobody re-adds it.

- src/components/LatestBriefPanel.ts passed disabled: 'true' (string)
  to the h() helper. DOM-utils' h() calls setAttribute for unknown
  props, which does disable the button — but it's inconsistent with
  the later .disabled = false property write. Fixed to the boolean
  disabled: true so the attribute and the IDL property agree.

10/10 referral-code tests pass. Both tsconfigs typecheck clean.

* fix(brief): address two review findings on #3175 — drop misleading count + fix user-agnostic cache

P1: invitedCount wired to the wrong attribution store.
The share URL is /pro?ref=<code>. On /pro the 'ref' feeds
Dodopayments checkout metadata (affonso_referral), NOT
registrations.referredBy. /api/referral/me counted only the
waitlist path, so the panel would show 0 invited for anyone who
converted direct-to-checkout — misleading.

Rather than ship a count that measures only one of two attribution
paths (and the less-common one at that), the count is removed
entirely. The share button itself still works. A proper metric
requires unifying the waitlist + Dodo-metadata paths into a single
attribution store, which is a follow-up.

Changes:
  - api/referral/me.ts: response shape is { code, shareUrl } — no
    invitedCount / convertedCount
  - convex/registerInterest.ts: removed getReferralStatsByCode
    internal query
  - convex/http.ts: removed /relay/referral-stats route
  - src/services/referral.ts: ReferralProfile interface no longer
    has invitedCount; fetch call unchanged in behaviour
  - src/components/LatestBriefPanel.ts: dropped the 'N invited'
    render branch

P2: referral cache was user-agnostic.
Module-global _cached had no userId key, so a stale cache primed by
user A would hand user B user A's share link for up to 5 min after
an account switch — if no panel is mounted at the transition moment
to call clearReferralCache(). Per the reviewer's point, this is a
real race.

Fix: three-part.
  (a) Cache entry carries the userId it was computed for; reads
      check the current Clerk userId and only accept hits when they
      match. Mismatch → drop + re-fetch.
  (b) src/services/referral.ts self-subscribes to auth-state at
      module load (ensureAuthSubscription). On any id transition
      _cached is dropped. Module-level subscription means the
      invalidation works even when no panel is currently mounted.
  (c) Belt-and-suspenders: post-fetch, re-check the current user
      before caching. Protects against account switches that happen
      mid-flight between 'read cache → ask network → write cache'.

Panel's local clearReferralCache() call removed — module now
self-invalidates.

10/10 referral-code tests pass. Both tsconfigs typecheck clean.

* fix(referral): address P1 review finding — share codes now actually credit the sharer

The earlier head generated 8-char Clerk-derived HMAC codes for the
share button, but the waitlist register mutation only looked up
registrations.by_referral_code (6-char email-generated codes).
Codes issued by the share button NEVER resolved to a sharer — the
'referral attribution' half of the feature was non-functional.

Fix (schema-level, honest attribution path):

  convex/schema.ts
    - userReferralCodes   { userId, code, createdAt }  + by_code, by_user
    - userReferralCredits { referrerUserId, refereeEmail, createdAt }
                          + by_referrer, by_referrer_email

  convex/registerInterest.ts
    - register mutation: after the existing registrations.by_referral_code
      lookup, falls through to userReferralCodes.by_code. On match,
      inserts a userReferralCredits row (the Clerk user has no
      registrations row to increment, so credit needs its own
      table). Dedupes by (referrer, refereeEmail) so returning
      visitors can't double-credit.
    - new internalMutation registerUserReferralCode({userId, code})
      idempotent binding of a code to a userId. Collisions logged
      and ignored (keeps first writer).

  convex/http.ts
    - new POST /relay/register-referral-code (RELAY_SHARED_SECRET auth)
      that runs the mutation above.

  api/referral/me.ts
    - signature gains a ctx.waitUntil handle
    - after generating the user's code, fire-and-forget POSTs to
      /relay/register-referral-code so the binding is live by the
      time anyone clicks a shared link. Idempotent — a failure just
      means the NEXT call re-registers.

Still deferred: display of 'N credited' / 'N converted' in the
LatestBriefPanel. The waitlist side now resolves correctly, but the
Dodopayments checkout path (/pro?ref=<code> → affonso_referral) is
tracked in Dodo, not Convex. Surfacing a unified count requires a
separate follow-up to pull Dodo metadata into Convex.

Regression tests (3 new cases in tests/brief-referral-code.test.mjs):
  - register mutation extends to userReferralCodes + inserts credits
  - schema declares both tables with the right indexes
  - /api/referral/me registers the binding via waitUntil

13/13 referral tests pass. Both tsconfigs typecheck clean.

* fix(referral): address two P1 review findings — checkout attribution + dead-link prevention

P1: share URL didn't credit on the /pro?ref= checkout path.
The earlier PR wired Clerk codes into the waitlist path
(/api/register-interest -> userReferralCodes -> userReferralCredits)
but a visitor landing on /pro?ref=<code> and going straight to
Dodo checkout forwarded the code only into Dodo metadata
(affonso_referral). Nothing on our side credited the sharer.

Fix: convex/payments/subscriptionHelpers.ts handleSubscriptionActive
now reads data.metadata.affonso_referral when inserting a NEW
subscription row. If the code resolves in userReferralCodes, a
userReferralCredits row crediting the sharer is inserted (deduped
by (referrer, refereeEmail) so webhook replays don't double-credit).
The credit only lands on first-activation — the else-branch of the
existing/new split guards against replays.

P1: /api/referral/me returned 200 + share link even when the
(code, userId) binding failed.
ctx.waitUntil(registerReferralCodeInConvex(...)) ran the binding
asynchronously, swallowing missing env + non-2xx + network errors.
Users got a share URL that the waitlist lookup could never
resolve — dead link.

Fix: registerReferralCodeInConvex is now BLOCKING (throws on any
failure) and the handler awaits it before returning. On failure
the endpoint responds 503 service_unavailable rather than a 200
with a non-functional URL. Mutation is idempotent so client
retries are safe.

Regression tests (2 updated/new in tests/brief-referral-code.test.mjs):
  - asserts the binding is awaited, not ctx.waitUntil'd; asserts
    the failure path returns 503
  - asserts subscriptionHelpers reads affonso_referral, resolves via
    userReferralCodes.by_code, inserts a userReferralCredits row,
    and dedupes by (referrer, refereeEmail)

14/14 referral tests pass. Both tsconfigs typecheck clean.

Net effect: /pro?ref=<code> visitors who convert (direct checkout)
now credit the sharer on webhook receipt, same as waitlist
signups. The share button is no longer a dead-end UI.
2026-04-18 20:39:55 +04:00
Elie Habib
122204f691 feat(brief): Phase 8 — Telegram carousel images via Satori + resvg-wasm (#3174)
* feat(brief): Phase 8 — Telegram carousel images via Satori + resvg-wasm

Implements the Phase 8 carousel renderer (Option B): server-side PNG
generation in a Vercel edge function using Satori (JSX to SVG) +
@resvg/resvg-wasm (SVG to PNG). Zero new Railway infra, zero
Chromium, same edge runtime that already serves the magazine HTML.

Files:

  server/_shared/brief-carousel-render.ts (new)
    Pure renderer: (BriefEnvelope, CarouselPage) -> Uint8Array PNG.
    Three layouts (cover/threads/story), 1200x630 OG size.
    Satori + resvg + WASM are lazy-imported so Node tests don't trip
    over '?url' asset imports and the 800KB wasm doesn't ship in
    every bundle. Font: Noto Serif regular, fetched once from Google
    Fonts and memoised on the edge isolate.

  api/brief/carousel/[userId]/[issueDate]/[page].ts (new)
    Public edge function reusing the magazine route's HMAC token —
    same signer, same (userId, issueDate) binding, so one token
    unlocks magazine HTML AND all three carousel images. Returns
    image/png with 7d immutable cache headers. 404 on invalid page
    index, 403 on bad token, 404 on Redis miss, 503 on missing
    signing secret. Render failure falls back to a 1x1 transparent
    PNG so Telegram's sendMediaGroup doesn't 500 the brief.

  scripts/seed-digest-notifications.mjs
    carouselUrlsFrom(magazineUrl) derives the 3 signed carousel
    URLs from the already-signed magazine URL. sendTelegramBriefCarousel
    calls Telegram's sendMediaGroup with those URLs + short caption.
    Runs before the existing sendTelegram(text) so the carousel is
    the header and the text the body — long-form stories remain
    forwardable as text. Best-effort: carousel failure doesn't
    block text delivery.

  package.json + package-lock.json
    satori ^0.10.14 + @resvg/resvg-wasm ^2.6.2.

Tests (tests/brief-carousel.test.mjs, 9 cases):
  - pageFromIndex mapping + out-of-range
  - carouselUrlsFrom: valid URL, localhost origin preserved, missing
    token, wrong path, invalid issueDate, garbage input
  - Drift guard: cron must still declare the same helper + template
    string. If it drifts, test fails with a pointer to move the impl
    into a shared module.

PNG render itself isn't unit-tested — Satori + WASM need a
browser/edge runtime. Covered by smoke validation step in the
deploy monitoring plan.

Both tsconfigs typecheck clean. 152/152 brief tests pass.

Scope boundaries (deferred):
  - Slack + Discord image attachments (different payload shapes)
  - notification-relay.cjs brief_ready dispatch (real-time route)
  - Redis caching of rendered PNG (edge Cache-Control is enough for
    MVP)

* fix(brief): address two P1 review findings on Phase 8 carousel

P1-A: 200 placeholder PNG cached 7d on render failure.
Route config said runtime: 'edge' but a comment contradicted it
claiming Node semantics. More importantly, any render/init failure
(WASM load, Satori, Google Fonts) was converted to a 1x1 transparent
PNG returned with Cache-Control: public, max-age=7d, immutable.
Telegram's media fetcher and Vercel's CDN would cache that blank
for the full brief TTL per chat message — one cold-start mismatch
= every reader of that brief sees blank carousel previews for a
week.

Fix: deleted errorPng(). Render failure now returns 503 with
Cache-Control: no-store. sendMediaGroup fails cleanly for that
carousel (the digest cron already treats it as best-effort and
still sends the long-form text message), next cron tick re-renders
from a fresh isolate. Self-healing across ticks. Contradictory
comment about Node runtime removed.

P1-B: Google Fonts as silent hard dependency.
The renderer claimed 'safe embedded/fallback path' in comments but
no fallback existed. loadFont() fetches Noto Serif from gstatic.com
and rethrows on any failure. Combined with P1-A's old 200-cache-7d
path, a transient CDN blip would lock in a blank carousel for a
week.

Fix: updated comments to honestly declare the CDN dependency plus
document the self-healing semantics now that P1-A's fix no longer
caches the failure. If Google Fonts reliability becomes a problem,
swap the fetch for a bundled base64 TTF — noted as the escape hatch.

Tests (tests/brief-carousel.test.mjs): 2 new regression cases.
11/11 carousel tests pass. Both tsconfigs typecheck clean locally.

Note on currently-red CI: failures are NOT typecheck errors — npm
ci dies fetching libvips for sharp (504 Gateway Time-out from
GitHub releases). sharp is a transitive dep via @xenova/transformers,
pre-existing, not touched by this PR. Transient infra flake.

* fix(brief): switch carousel to Node + @resvg/resvg-js (fixes deploy block)

Vercel edge bundler fails the carousel deploy with:
  'Edge Function is referencing unsupported modules:
   @resvg/resvg-wasm/index_bg.wasm?url'

The ?url asset-import syntax is a Vite-ism that Vercel's edge
bundler doesn't resolve. Two ways out: find a Vercel-blessed edge
WASM import incantation, or switch to Node runtime with the native
@resvg/resvg-js binding. The second is simpler, faster per request,
and avoids the whole WASM-in-edge-bundler rabbit hole.

Changes:
  - package.json: @resvg/resvg-wasm -> @resvg/resvg-js ^2.6.2
  - api/brief/carousel/.../[page].ts: runtime 'edge' -> 'nodejs20.x'
  - server/_shared/brief-carousel-render.ts: drop initWasm path,
    dynamic-import resvg-js in ensureLibs(). Satori and resvg load
    in parallel via Promise.all, shaving ~30ms off cold start.

Also addresses the P2 finding from review: the old ensureLibsAndWasm
had a concurrent-cold-start race where two callers could reach
'await initWasm()' simultaneously. Replaced the boolean flag with a
shared _libsLoadPromise so concurrent callers await the same load.
On failure the promise resets so the NEXT request retries rather
than poisoning the isolate for its lifetime.

Cold start ~700ms (Satori + resvg-js native init), warm ~40ms.
Carousel images are not latency-critical — fetched by Telegram's
media service, CDN-cached 7d.

Both tsconfigs typecheck clean. 11/11 carousel tests pass.

* fix(brief): carousel runtime = 'nodejs' (was 'nodejs20.x', rejected by Vercel)

Vercel's functions config validator rejects 'nodejs20.x' at deploy
time:

  unsupported "runtime" value in config: "nodejs20.x"
  (must be one of: ["edge","experimental-edge","nodejs"])

The Node version comes from the project's default (currently Node 20
via package.json engines + Vercel project settings), not from the
runtime string. Use 'nodejs' — unversioned — and let the platform
resolve it.

11/11 carousel tests pass.

* fix(brief): swap carousel font from woff2 to woff (Satori can't parse woff2)

Review on PR #3174: the FONT_URL was pointing at a gstatic.com woff2
file. Satori parses ttf / otf / woff v1 — NOT woff2. Every render
was about to throw on font decode, the route would return 503, and
the carousel would never deliver a single image.

Fix: point FONT_URL at @fontsource's Noto Serif Latin 400 WOFF v1
via jsdelivr. WOFF v1 is a TrueType wrapper that Satori parses
natively (verified: file says 'Web Open Font Format, TrueType,
version 1.1'). Same cold-start semantics as before — one fetch per
warm isolate, memoised.

Regression test: asserts FONT_URL ends in ttf/otf/woff and explicitly
rejects any .woff2 suffix. A future swap that silently reintroduces
woff2 now fails CI loudly instead of shipping a permanently-broken
renderer.

12/12 carousel tests pass. Both tsconfigs typecheck clean.
2026-04-18 20:27:41 +04:00
Elie Habib
e1c3b28180 feat(notifications): Phase 6 — web-push channel for PWA notifications (#3173)
* feat(notifications): Phase 6 — web-push channel for PWA notifications

Adds a web_push notification channel so PWA users receive native
notifications when this tab is closed. Deep-links click to the
brief magazine URL for brief_ready events, to the event link for
everything else.

Schema / API:
- channelTypeValidator gains 'web_push' literal
- notificationChannels union adds { endpoint, p256dh, auth,
  userAgent? } variant (standard PushSubscription identity triple +
  cosmetic UA for the settings UI)
- new setWebPushChannelForUser internal mutation upserts the row
- /relay/deactivate allow-list extended to accept 'web_push'
- api/notification-channels: 'set-web-push' action validates the
  triple, rejects non-https, truncates UA to 200 chars

Client (src/services/push-notifications.ts + src/config/push.ts):
- isWebPushSupported guards Tauri webview + iOS Safari
- subscribeToPush: permission + pushManager.subscribe + POST triple
- unsubscribeFromPush: pushManager.unsubscribe + DELETE row
- VAPID_PUBLIC_KEY constant (with VITE_VAPID_PUBLIC_KEY env override)
- base64 <-> Uint8Array helpers (VAPID key encoding)

Service worker (public/push-handler.js):
- Imported into VitePWA's generated sw.js via workbox.importScripts
- push event: renders notification; requireInteraction=true for
  brief_ready so a lock-screen swipe does not dismiss the CTA
- notificationclick: focuses+navigates existing same-origin client
  when present, otherwise opens a new window
- Malformed JSON falls back to raw text body, missing data falls
  back to a minimal WorldMonitor default

Relay (scripts/notification-relay.cjs):
- sendWebPush() with lazy-loaded web-push dep. 404/410 triggers
  deactivateChannel('web_push'). Missing VAPID env vars logs once
  and skips — other channels keep delivering.
- processEvent dispatch loop + drainHeldForUser both gain web_push
  branches

Settings UI (src/services/notifications-settings.ts):
- New 'Browser Push' tile with bell icon
- Enable button lazy-imports push-notifications, calls subscribe,
  renders 'Not supported' on Tauri/in-app webviews
- Remove button routes web_push specifically through
  unsubscribeFromPush so the browser side is cleaned up too

Env vars required on Railway services:
  VAPID_PUBLIC_KEY   public key
  VAPID_PRIVATE_KEY  private key
  VAPID_SUBJECT      mailto:support@worldmonitor.app (optional)

Public key is also committed as the default in src/config/push.ts
so the client bundle works without a build-time override.

Tests: 11 new cases in tests/brief-web-push.test.mjs
- base64 <-> Uint8Array round-trip + null guards
- VAPID default fallback when env absent
- SW push event rendering, requireInteraction gating, malformed JSON
  + no-data fallbacks
- SW notificationclick: openWindow vs focus+navigate, default url

154/154 tests pass. Both tsconfigs typecheck clean.

* fix(brief): address PR #3173 review findings + drop hardcoded VAPID

P1 (security): VAPID private key leaked in PR description.
Rotated the keypair. Old pair permanently invalidated. Structural fix:

  Removed DEFAULT_VAPID_PUBLIC_KEY entirely. Hardcoding the public
  key in src/config/push.ts gave rotations two sources of truth
  (code vs env) — exactly the friction that caused me to paste the
  private key in a PR description in the first place. VAPID_PUBLIC_KEY
  now comes SOLELY from VITE_VAPID_PUBLIC_KEY at build time.
  isWebPushConfigured() gates the subscribe flow so builds without
  the env var surface as 'Not supported' rather than crashing
  pushManager.subscribe.

  Operator setup (one-time):
    Vercel build:      VITE_VAPID_PUBLIC_KEY=<public>
    Railway services:  VAPID_PUBLIC_KEY=<public>
                       VAPID_PRIVATE_KEY=<private>
                       VAPID_SUBJECT=mailto:support@worldmonitor.app

  Rotation: update env on both sides, redeploy. No code change, no
  PR body — no chance of leaking a key in a commit.

P2: single-device fan-out — setWebPushChannelForUser replaces the
previous subscription silently. Per-device fan-out is a schema change
deferred to follow-up. Fix for now: surface the replacement in
settings UI copy ('Enabling here replaces any previously registered
browser.') so users who expect multi-device see the warning.

P2: a flat 24h push TTL floods offline devices on reconnect. Fix: TTL
is now event-type-aware:
  brief_ready:       12h  (daily editorial — still interesting)
  quiet_hours_batch:  6h  (by definition queued-on-wake)
  everything else:   30m  (transient alerts: noise after 30min)
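
The tier mapping reduces to a small switch — the helper name is an assumption, the tiers match the values above:

```typescript
// Event-type-aware push TTL, in seconds.
function pushTtlSeconds(eventType: string): number {
  switch (eventType) {
    case "brief_ready":
      return 12 * 3600; // daily editorial — still interesting half a day later
    case "quiet_hours_batch":
      return 6 * 3600; // by definition queued-on-wake
    default:
      return 30 * 60; // transient alerts: noise after 30 minutes
  }
}
```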

REGRESSION test: VAPID_PUBLIC_KEY must be '' when env var is unset.
If a committed default is reintroduced, the test fails loudly.

11/11 web-push tests pass. Both tsconfigs typecheck clean.

* fix(notifications): deliver channel_welcome push for web_push connects (#3173 P2)

The settings UI queues a channel_welcome event on first web_push
subscribe (api/notification-channels.ts:240 via publishWelcome), but
processWelcome() in the relay only branched on slack/discord/email —
no web_push arm. The welcome event was consumed off the queue and
then silently dropped, leaving first-time subscribers with no
'connection confirmed' signal.

Fix: add a web_push branch to processWelcome. Calls sendWebPush with
eventType='channel_welcome' which maps to the 30-minute TTL tier in
the push-delivery switch — a welcome that arrives >30 min after
subscribe is noise, not confirmation.

Short body (under 80 chars) so Chrome/Firefox/Safari notification
shelves don't clip past ellipsis.

11/11 web-push tests pass.

* fix(notifications): address two P1 review findings on #3173

P1-A: SSRF via user-supplied web_push endpoint.
The set-web-push edge handler accepted any https:// URL and wrote
it to Convex. The relay's sendWebPush() later POSTs to whatever
endpoint sits in that row, giving any Pro user a server-side-request
primitive bounded only by the relay's network egress.

Fix: isAllowedPushEndpointHost() allow-list in api/notification-
channels.ts. Only the four known browser push-service hosts pass:

  fcm.googleapis.com                (Chrome / Edge / Brave)
  updates.push.services.mozilla.com (Firefox)
  web.push.apple.com                (Safari, macOS 13+)
  *.notify.windows.com              (Windows Notification Service)

Fail-closed: unknown hosts rejected with 400 before the row ever
reaches Convex. If a future browser ships a new push service we'll
need to widen this list (guarded by the SSRF regression tests).
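
A minimal sketch of the fail-closed check, assuming exact matching for the three fixed hosts and a suffix match for *.notify.windows.com (the function name mirrors the message above; the body is illustrative):

```typescript
const EXACT_HOSTS = new Set([
  "fcm.googleapis.com",                 // Chrome / Edge / Brave
  "updates.push.services.mozilla.com",  // Firefox
  "web.push.apple.com",                 // Safari, macOS 13+
]);

function isAllowedPushEndpointHost(endpoint: string): boolean {
  let url: URL;
  try {
    url = new URL(endpoint);
  } catch {
    return false; // unparsable input fails closed
  }
  if (url.protocol !== "https:") return false;
  if (EXACT_HOSTS.has(url.hostname)) return true;
  // Wildcard arm: requires a dot before the suffix, so
  // "fcm.googleapis.com.evil.com"-style lookalikes don't pass.
  return url.hostname.endsWith(".notify.windows.com");
}
```

Note the check runs on url.hostname, not on a substring of the raw string, which is what makes the lookalike-host rejection hold.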

P1-B: cross-account endpoint reuse on shared devices.
The browser's PushSubscription is bound to the origin, NOT to the
Clerk session. User A subscribes on device X, signs out, user B
signs in on X and subscribes — the browser hands out the SAME
endpoint/p256dh/auth triple. The previous setWebPushChannelForUser
upsert keyed only by (userId, channelType), so BOTH rows now carry
the same endpoint. Every push the relay fans out for user A also
lands on device X which is now showing user B's session.

Fix: setWebPushChannelForUser scans all web_push rows and deletes
any that match the new endpoint BEFORE upserting. Effectively
transfers ownership of the subscription to the current caller.
The previous user will need to re-subscribe on that device if they
sign in again.

No endpoint-based index on notificationChannels — the scan happens
at <10k rows and is well-bounded to the one write path per user
per connect. If volume grows, add an index + a migration.

Regression tests (tests/brief-web-push.test.mjs, 3 new cases):
  - allow-list defines all four browser hosts + fail-closed return
  - allow-list is invoked BEFORE convexRelay() in the handler
  - setWebPushChannelForUser compares + deletes rows by endpoint

14/14 web-push tests pass. Both tsconfigs typecheck clean.
2026-04-18 20:27:08 +04:00
Elie Habib
c2356890da feat(brief): Phase 3b — LLM whyMatters + editorial digest prose via Gemini (#3172)
* feat(brief): Phase 3b — LLM whyMatters + editorial digest prose via Gemini

Replaces the Phase 3a stubs with editorial output from Gemini 2.5
Flash via the existing OpenRouter-backed callLLM chain. Two LLM
pathways, different caching semantics:

  whyMatters (per story): 1 editorial sentence, 18-30 words, global
  stakes. Cache brief:llm:whymatters:v1:{sha256(headline|source|severity)}
  with 24h TTL shared ACROSS users (whyMatters is not personalised).
  Bounded concurrency 5 so a 12-story brief doesn't open 12 parallel
  sockets to OpenRouter.

  digest prose (per user): JSON { lead, threads[], signals[] }
  replacing the stubs. Cache brief:llm:digest:v1:{userId}:{sensitivity}
  :{poolHash} with 4h TTL per-user. Pool hash is order-insensitive
  so rank shuffling doesn't invalidate.

Provider pinned to OpenRouter (google/gemini-2.5-flash) via
skipProviders: ['ollama', 'groq'] per explicit user direction.

Null-safe all the way down. If the LLM is unreachable, parse fails,
or cache throws, enrichBriefEnvelopeWithLLM returns the baseline
envelope with its stubs intact. The brief always ships. Kill switch
BRIEF_LLM_ENABLED is distinct from AI_DIGEST_ENABLED so the brief's
editorial prose and the email's AI summary can be toggled
independently during provider outages.

Files:
  scripts/lib/brief-llm.mjs (new) — pure prompt/parse helpers + IO
    generateWhyMatters/generateDigestProse + envelope enrichment
  scripts/seed-digest-notifications.mjs — BRIEF_LLM_ENABLED flag,
    briefLlmDeps closure, enrichment inserted between compose + SETEX
  tests/brief-llm.test.mjs (new, 34 cases)

End-to-end verification: the enriched envelope passes
assertBriefEnvelope() — the renderer's strict validator is the gate
between composer and api/brief, so we prove the enriched envelope
still validates.

156/156 brief tests pass. Both tsconfigs typecheck clean.

* fix(brief): address three P1 review findings on Phase 3b

All three findings are about cache-key correctness + envelope safety.

P1-A — whyMatters cache key under-specifies the prompt.
  hashStory keyed on headline|source|threatLevel, but the prompt also
  carries category + country. Upstream classification or geocoding
  corrections that leave those three fields unchanged would return
  pre-correction prose for a materially different prompt. Bumped to
  v2 key space (pre-fix rows ignored, re-LLM once on rollout). Added
  regression tests for category + country busting the cache.

P1-B — digest prose cache key under-specifies the prompt.
  hashDigestInput sorted stories and hashed headline|threatLevel only.
  The actual prompt includes ranked order + category + country + source.
  v2 hash now canonicalises to JSON of the fields in the prompt's
  ranked order. Test inverted to lock the corrected behaviour
  (reordering MUST miss the cache). Added a test for category change
  invalidating.

P1-C — malformed cached digest poisons the envelope at SETEX time.
  On cache hit generateDigestProse accepted any object with a string
  lead, skipping the full shape check. enrichBriefEnvelopeWithLLM then
  wrote prose.threads/.signals into the envelope, and the cron SETEXed
  unvalidated. A bad cache row would 404 /api/brief at render time.

  Two-layer fix:
    1. Extracted validateDigestProseShape(obj) — same strictness
       parseDigestProse ran on fresh output. generateDigestProse now
       runs it on cache hits too, and returns a normalised copy.
    2. Cron now re-runs assertBriefEnvelope on the ENRICHED envelope
       before SETEX. On assertion failure it falls back to the
       unenriched baseline (already passed assertion on construction).

  Regression test: malformed cached row is rejected on hit and the
  LLM is called again to overwrite.

Tests: 8 new regression cases locking all three findings. Total brief
test suite now 185/185 green. Both tsconfigs typecheck clean.

Cache-key version bumps (v1 -> v2) trigger one-off cache miss on
deploy. Editorial prose re-LLM'd on the next cron tick per user.

* fix(brief): address two P2 review findings on #3172

P2-A: misleading test name 'different users share the cache' asserted
the opposite (per-user isolation). Renamed to 'different users do NOT
share the digest cache even when the story pool is identical' so a
future reader can't refactor away the per-user key on a misreading.

P2-B: the signal length validator capped only character length (< 220
chars), so a 30-word signal could pass even though the prompt says
'<=14 words'.
Added a word-count filter with an 18-word ceiling (14 + 4 margin for
model drift / hyphenated compounds). Regression test locks the
behaviour: signals with >14-word drift are dropped, short imperatives
pass.
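
A sketch of the filter under those two ceilings (constant names are
illustrative; the caps match the values described above):

```typescript
const MAX_SIGNAL_CHARS = 220; // original character cap
const MAX_SIGNAL_WORDS = 18;  // 14-word prompt limit + 4-word drift margin

function isValidSignal(signal: string): boolean {
  const trimmed = signal.trim();
  if (trimmed.length === 0 || trimmed.length >= MAX_SIGNAL_CHARS) return false;
  // Whitespace-delimited count; hyphenated compounds count as one word.
  const words = trimmed.split(/\s+/).length;
  return words <= MAX_SIGNAL_WORDS;
}
```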

43/43 brief-llm tests pass. Both tsconfigs typecheck clean.
2026-04-18 19:37:33 +04:00
Elie Habib
0fd8cd7d5f feat(pro): complimentary entitlement tooling + subscription.expired guard (#3169)
Adds support / goodwill tooling for granting free-tier credits that
survive Dodo subscription cancellations. Triggered by the 2026-04-17/18
duplicate-subscription incident: the customer was granted a manual
extension to validUntil, but that extension is naked — our existing
handleSubscriptionExpired handler unconditionally downgrades to free
when Dodo fires the expired event, which would wipe the credit.

Three coordinated changes:

1. convex/schema.ts — add optional compUntil: number to entitlements.
   Acts as a "don't downgrade me before this" floor independent of the
   subscription billing cycle. Optional, so existing rows are untouched.

2. convex/payments/billing.ts::grantComplimentaryEntitlement —
   new internalMutation callable via `npx convex run`. Upserts the
   entitlement, sets both validUntil and compUntil to max(existing, now +
   days). Never shrinks (calling twice is idempotent upward), validates
   planKey against PRODUCT_CATALOG, and syncs the Redis cache so edge
   gateway sees the comp without waiting for TTL.

3. convex/payments/subscriptionHelpers.ts::handleSubscriptionExpired —
   before the unconditional downgrade, read the current entitlement and
   skip revocation if compUntil > eventTimestamp. This protects comp
   grants from Dodo-originated revocations; normal subscription.expired
   revocation is unchanged when there's no comp or the comp has lapsed.
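
The two pieces of logic, sketched as pure functions (names and the row
shape are illustrative; the real mutation also validates planKey against
PRODUCT_CATALOG and syncs Redis, both elided here):

```typescript
// Only the fields the sketch needs from the entitlements row.
interface Entitlement {
  planKey: string;
  validUntil: number;
  compUntil?: number;
}

// Grant never shrinks: calling twice is idempotent upward.
function applyComplimentaryGrant(
  existing: Entitlement | null,
  planKey: string,
  days: number,
  now: number,
): Entitlement {
  if (days <= 0) throw new Error("days must be positive");
  const floor = now + days * 24 * 60 * 60 * 1000;
  return {
    planKey,
    validUntil: Math.max(existing?.validUntil ?? 0, floor),
    compUntil: Math.max(existing?.compUntil ?? 0, floor),
  };
}

// subscription.expired guard: skip revocation while a comp is running.
function shouldRevokeOnExpired(
  row: Entitlement | null,
  eventTimestamp: number,
): boolean {
  return !(row?.compUntil !== undefined && row.compUntil > eventTimestamp);
}
```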

Tests (convex/__tests__/comp-entitlement.test.ts, 9 new):

  grantComplimentaryEntitlement
    creates row with validUntil == compUntil
    never shrinks an existing longer comp
    upgrades planKey on existing free-tier row
    rejects unknown planKey and non-positive days

  handleSubscriptionExpired comp guard
    revokes to free when no comp set (unchanged)
    revokes to free when comp is already expired
    preserves entitlement when comp is still valid
    end-to-end: grant + expired webhook = entitlement survives

CLI usage (requires npx convex deploy after merge, then):

  npx convex run 'payments/billing:grantComplimentaryEntitlement' \
    '{"userId":"user_XXX","planKey":"pro_monthly","days":90,"reason":"support"}'

Full check suite green: typecheck x2, biome, 5590 data tests,
171 edge tests, 9 convex tests, md lint, version sync.
2026-04-18 18:06:20 +04:00
Elie Habib
c90d40dfc5 fix(pro): consult Convex entitlement in hasPremiumAccess + isProUser (#3167)
* fix(pro): consult Convex entitlement in hasPremiumAccess and isProUser

Customer cus_0NcmwcAWw0jhVBHVOK58C still saw "Upgrade to Pro" overlays on
premium panels this morning despite PR #3163 landing, because the reload
the transition detector triggers only helps if hasPremiumAccess returns
true on the next load — and it doesn't. Our Dodo webhook writes to Convex
`entitlements` but never to Clerk publicMetadata.plan, so
hasPremiumAccess (which only checked Clerk role / API keys / tester keys)
stayed false for every paying Dodo customer. isPanelEntitled honours the
Convex entitlement and unlocks the panel content, but getPanelGateReason
-> hasPremiumAccess still returns FREE_TIER and covers it with the
upgrade overlay. Kevin's screenshot shows exactly that: PRO badges next
to panel titles, "Upgrade to Pro" body beneath.

Fix:
  - src/services/panel-gating.ts — hasPremiumAccess now checks isEntitled()
    after the Clerk role check. Convex entitlement is now authoritative for
    paying customers.
  - src/services/widget-store.ts — same fix in isProUser so widget, search,
    and event-handler gates agree with panel gating.
  - src/app/panel-layout.ts — after the existing free->pro reload branch,
    re-run updatePanelGating(getAuthState()) on every entitlement snapshot.
    Without this a legacy-pro user's null->true initial snapshot (NOT
    reloaded to avoid a loop) would leave the paywall overlay in place
    until the next auth event; likewise on WS reconnect or revocation the
    lock state must follow the current snapshot synchronously.

Import graph stays acyclic: panel-gating -> entitlements -> convex-client
-> clerk; widget-store -> entitlements likewise. Typecheck clean.

Pairs with PR #3163 (already merged). Earlier iterations of the activation
race fix were cargo-culted around the transition detector; this is the
actual UI-visible fix.

* refactor(pro): drop redundant isEntitled() from hasPremiumAccess

Greptile P2: the isEntitled() check at the end of hasPremiumAccess can
never flip the result, because isProUser() — called two lines earlier in
the same function — now also checks isEntitled() after this PR's
widget-store change. If isProUser() returns false, we already know
isEntitled() was false inside it.

Remove the redundant call and expand the docstring to say explicitly that
isProUser carries the Convex entitlement signal. Keeps hasPremiumAccess
as a thin union of signals that aren't already covered by isProUser
(WORLDMONITOR_API_KEY desktop secret + the passed-in authState.role which
could in principle differ from getAuthState()).
2026-04-18 15:49:16 +04:00
Elie Habib
01c607c27c fix(brief): compose magazine stories from digest accumulator, not news:insights (#3168)
Root cause of the "email digest lists 30 critical events, brief shows
2 random Bellingcat stories" mismatch reported today: the email and
the brief read from two unrelated Redis keys.

  email digest -> digest:accumulator:v1:{variant}:{lang}
                  live per-variant ZSET of 30+ ingested stories,
                  hydrated from story:track:v1:{hash} + sources.
                  written by list-feed-digest on every ingest cycle.

  brief        -> news:insights:v1
                  global 8-story summary written by seed-insights.
                  After sensitivity=critical filter only 2 survive.
                  A completely different pool on a different cadence.

The brief was shipping from the wrong source, so a user who had just
read "UNICEF / Hormuz / Rohingya" in their email would open their
brief and see unrelated Bellingcat pieces.

Fix: brief now composes from the same digest accumulator the email
reads. scripts/lib/brief-compose.mjs exposes a new
composeBriefFromDigestStories(rule, digestStories, insightsNumbers,
{nowMs}) that maps the digest story shape ({hash, title, severity,
sources[], ...}) through a small adapter into the upstream brief-
filter shape, applies the user's sensitivity gate, and assembles the
envelope. news:insights:v1 is still read — but only for the
clusters / multi-source counters on the stats page. A failed
insights fetch now returns zeroed stats instead of aborting brief
composition, because the stories (not the numbers) are what matter.

seed-digest-notifications:
- composeBriefsForRun now calls buildDigest(candidate, windowStart)
  per rule instead of using a single global pool
- memoizes buildDigest by (variant, lang, windowStart) to keep the
  per-user loop from issuing N identical ZRANGE+HGETALL round-trips
- BRIEF_STORY_WINDOW_MS = 24h — a weekly-cadence user still expects
  a fresh brief in the dashboard every day, independent of email
  cadence
- composeBriefForRule kept as @deprecated so tests that stub
  news:insights directly don't break; all live traffic uses the
  digest path
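
The memoisation is shaped like this (a generic sketch — the real
buildDigest signature is assumed, not quoted):

```typescript
// Memoise an async builder by (variant, lang, windowStart) so the
// per-user loop issues one ZRANGE+HGETALL round-trip per distinct pool,
// not one per user. Caching the Promise (not the resolved value) also
// dedupes concurrent callers racing on the same key.
function memoizeDigestBuilder<T>(
  build: (variant: string, lang: string, windowStart: number) => Promise<T>,
) {
  const cache = new Map<string, Promise<T>>();
  return (variant: string, lang: string, windowStart: number): Promise<T> => {
    const key = `${variant}|${lang}|${windowStart}`;
    let hit = cache.get(key);
    if (!hit) {
      hit = build(variant, lang, windowStart);
      cache.set(key, hit);
    }
    return hit;
  };
}
```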

Tests: new tests/brief-from-digest-stories.test.mjs (12 cases) locks
the mapping — empty input, source selection, sensitivity pass/drop,
12-story cap, moderate→medium severity aliasing, category/country
defaults, stats-number passthrough, determinism.

122/122 brief tests pass; both tsconfigs typecheck clean.

Operator note: today's wrong brief at brief:user_...:2026-04-18 was
already DELed manually. The next cron tick under this code composes
a correct one from the same pool the email used.
2026-04-18 15:47:08 +04:00
Elie Habib
e98df6f694 fix(brief): clerk-pro-only gate + generation-guarded token cache (#3166)
Two P1 race/gate findings from post-merge review of #3160.

Finding 1 — mixed-auth path still rendered as unlocked.
hasPremiumAccess() returns true for desktop API key / browser tester
keys even when the signed-in Clerk account is FREE. The brief is
stored at brief:{clerkUserId}:{date} in Redis — without a Clerk PRO
user there is nothing to fetch, and /api/latest-brief returns 403.
Relying on the panel's inline role check only inverts the UX: the
panel "unlocks", then paints an upgrade CTA inside an unlocked body.

Fix: WEB_CLERK_PRO_ONLY_PANELS in panel-layout.ts. When a panel is
in this set AND the Clerk role is not 'pro', the layout downgrades
the gate reason from NONE to FREE_TIER (or ANONYMOUS when no Clerk
user at all). The panel now shows the same locked overlay an actual
free user sees — consistent, and no doomed fetch.

Finding 2 — clearClerkTokenCache() didn't invalidate mid-flight fetch.
Nulling _cachedToken and _tokenInflight does not cancel a promise
that is already awaiting session.getToken(). When that promise
resolves it unconditionally writes _cachedToken = tokenA and returns
tokenA to its already-awaiting callers, re-poisoning the cache for
50s and silently shipping A's JWT into B's session. The panel's
post-response UI-user check catches the direct A-during-fetch race,
but a panel that starts a fresh refresh AFTER the switch can still
get A's token back from the stale inflight.

Fix: monotonic _tokenGen counter. Bumped by clearClerkTokenCache()
and signOut(). Each getClerkToken() captures myGen on entry; if the
generation has advanced by the time the JWT arrives, the result is
dropped on the floor (no cache write, no return value). The finally
block also guards the _tokenInflight null-out so a newer generation's
inflight isn't clobbered.
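
The guard, sketched outside the real module (function names shortened;
the 50s TTL and Clerk specifics are elided — fetchToken stands in for
session.getToken):

```typescript
type TokenFetcher = () => Promise<string>;

let _cachedToken: string | null = null;
let _tokenInflight: Promise<string | null> | null = null;
let _tokenGen = 0;

function clearTokenCache(): void {
  _tokenGen++; // advance generation: any in-flight result is now stale
  _cachedToken = null;
  _tokenInflight = null;
}

function getToken(fetchToken: TokenFetcher): Promise<string | null> {
  if (_cachedToken) return Promise.resolve(_cachedToken);
  if (_tokenInflight) return _tokenInflight;
  const myGen = _tokenGen; // captured on entry
  _tokenInflight = fetchToken()
    .then((jwt) => {
      if (myGen !== _tokenGen) return null; // stale: drop on the floor
      _cachedToken = jwt;
      return jwt;
    })
    .finally(() => {
      // Guard the null-out so a newer generation's inflight survives.
      if (myGen === _tokenGen) _tokenInflight = null;
    });
  return _tokenInflight;
}
```

  The stale branch neither writes the cache nor returns the JWT, which
  is the whole point: user A's late-resolving token can no longer
  re-poison the cache after a switch to B.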

Typecheck clean on both tsconfigs. 94/94 brief + deploy tests green.
2026-04-18 15:46:56 +04:00
Elie Habib
8684e5a398 fix(brief): per-route CSP override so magazine swipe/arrow nav runs (#3165)
* fix(brief): per-route CSP override so magazine swipe/arrow nav runs

The global CSP at /((?!docs).*) allow-lists only four SHA-256 hashes
for inline scripts (the app's own index.html scripts). brief-render.js
emits its swipe/arrow/wheel/touch nav as a deterministic inline IIFE
with a different hash, so the browser silently blocked it. The deck
rendered, pages were present, dots were drawn — but nothing advanced.

Fix mirrors the existing /api/slack/oauth/callback and
/api/discord/oauth/callback precedent: a per-route Content-Security-
Policy header for /api/brief/(.*) that relaxes script-src to
'unsafe-inline'. Everything else is tight:
- default-src 'self'
- connect-src 'self' (no outbound network)
- object-src 'none', form-action 'none'
- frame-ancestors pinned to worldmonitor domains
- style-src keeps Google Fonts; font-src keeps gstatic
- script-src keeps Cloudflare Insights beacon (auto-injected)

'unsafe-inline' is safe here because server/_shared/brief-render.js
HTML-escapes all Redis-sourced content via escapeHtml over [&<>"'].
No user-controlled string reaches the DOM unescaped.
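
For reference, such a per-route override takes roughly this shape in a
vercel.json-style headers block. The value below is condensed and
illustrative — it omits the style-src/font-src/frame-ancestors sources
described above and is not the shipped policy:

```json
{
  "headers": [
    {
      "source": "/api/brief/(.*)",
      "headers": [
        {
          "key": "Content-Security-Policy",
          "value": "default-src 'self'; script-src 'unsafe-inline' https://static.cloudflareinsights.com; connect-src 'self'; object-src 'none'; form-action 'none'; base-uri 'self'"
        }
      ]
    }
  ]
}
```

A more specific route pattern wins over the catch-all, which is what
lets the brief pages relax script-src without touching the app's
hash-pinned global policy.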

Verified: all 17 tests/deploy-config.test.mjs security-header
assertions still pass (they target the catch-all route, untouched).

* fix(brief): un-block Cloudflare Insights beacon + add CSP test coverage

Two P2 follow-ups from Greptile review on #3165.

1. connect-src was 'self' only — the Cloudflare Insights beacon script
   loaded (script-src allowed static.cloudflareinsights.com) but its
   outbound POST to https://cloudflareinsights.com/cdn-cgi/rum was
   silently blocked. Analytics for brief-page traffic was dropped with
   no console error. Added https://cloudflareinsights.com to
   connect-src so the beacon can ship its payload.

2. tests/deploy-config.test.mjs had 17 assertions for the catch-all
   CSP but nothing for the new /api/brief/(.*) override. Any future
   edit — or accidental deletion — of the rule would land without a
   red test. Added a 4-test suite covering:
   - rule exists with a CSP header
   - script-src allows 'unsafe-inline' (the whole point)
   - connect-src allows cloudflareinsights.com (this fix)
   - tight non-script defaults still present (default-src 'self',
     object-src 'none', form-action 'none', base-uri 'self')

21/21 deploy-config assertions pass locally.
2026-04-18 15:20:01 +04:00
Sebastien Melki
bc91c61a87 [codex] guard duplicate subscription checkout (#3162)
* guard duplicate subscription checkout

* address checkout guard review feedback
2026-04-18 15:19:34 +04:00
Elie Habib
c49c2f80f6 fix(pro): reliable post-payment activation (#3163)
* fix(pro): reliable post-payment activation (transition reload + auth wait + overlay-success reload)

Fixes a silent race where paying users saw locked panels after a successful
Dodo checkout and concluded PRO hadn't activated. Incident 2026-04-17/18:
one customer purchased Pro Monthly twice within 32 min (Google Pay,
then Credit Card) because the first charge showed no UI change; the
duplicate was refunded by Dodo with reason "Duplicate transaction".

Server path (webhook -> Convex entitlements row) was verified correct end
to end: all 9 webhook events processed, entitlement row written within
seconds, planKey=pro_monthly. The bug was client-side in three places.

1. panel-layout.ts replaced skipInitialSnapshot with a free->pro
   transition detector (shouldReloadOnEntitlementChange). The prior guard
   swallowed the first pro snapshot unconditionally, which collapsed two
   distinct cases: (a) legacy-pro user on normal page load (correctly no
   reload) and (b) free user whose post-payment pro snapshot arrives after
   panels rendered against free-tier gating (should reload). The
   transition detector distinguishes them by remembering the last observed
   entitlement.

2. entitlements.ts awaits waitForConvexAuth(10_000) before calling
   client.onUpdate. Mirrors the pattern already used in api-keys.ts and
   App.ts claimSubscription path. Eliminates the spurious FREE_TIER_DEFAULTS
   first snapshot from unauthenticated cold sessions that the transition
   detector would otherwise treat as the baseline.

3. checkout.ts on Dodo overlay checkout.status=succeeded schedules a
   window.location.reload() after 3s (median webhook latency <5s observed
   in prod). Belt-and-braces: guarantees the post-payment state is fresh
   even if the WS subscription is slow or the transition detector misses
   the edge for any reason.
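
The detector itself is tiny — the behaviour described above reduces to
one predicate (sketch; the real one lives in panel-layout.ts):

```typescript
// lastEntitled: last observed entitlement. null = "no snapshot seen
// yet", so a legacy-pro user's first snapshot is a baseline, not a
// transition. Reload fires only on an observed false -> true edge
// (post-payment activation).
function shouldReloadOnEntitlementChange(
  lastEntitled: boolean | null,
  nextEntitled: boolean,
): boolean {
  return lastEntitled === false && nextEntitled === true;
}
```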

Unit tests in tests/entitlement-transition.test.mts cover all six
(last, next) combinations plus the full incident-simulation sequence
(null -> false -> true -> true => exactly one reload) and the legacy-pro
reconnect sequence (null -> true -> true -> true => zero reloads).

Out of scope (tracked separately): server-side duplicate-subscription
guard in _createCheckoutSession.

* fix(pro): seed lastEntitled=false on redirect-return from checkout

Addresses a gap in the original PR: the transition detector still swallowed
the first pro snapshot when the user came back via Dodo's full-page
redirect flow (/pro page -> Dodo checkout -> return to worldmonitor.app
with ?subscription_id=...&status=active URL params handled by
handleCheckoutReturn).

On that path a fast webhook can land before the browser finishes the
return navigation. When the dashboard boots, Convex's first entitlement
snapshot already carries pro_monthly — which the detector treats as the
"legacy-pro on normal page load" case and does not reload. Panels rendered
against free-tier gating stay locked until manual refresh.

Fix: when handleCheckoutReturn() returns true, seed lastEntitled=false
instead of null. This biases the detector to treat the first pro snapshot
as the true free->pro transition that it is, not a legacy-pro baseline.

Adds two new unit tests covering both redirect-return timings (webhook
already landed; webhook still pending). Full transition suite is now
10/10 passing.

* fix(pro): seed lastEntitled=false across the overlay reload too

Prior amendment covered the full-page Dodo redirect return (URL carries
subscription_id params consumed by handleCheckoutReturn). But the overlay
success path does its own setTimeout(() => window.location.reload(), 3_000)
and the overlay uses manualRedirect:true, so the reload lands at the
original URL with no params. handleCheckoutReturn returns false there,
returnedFromCheckout stays false, lastEntitled seeds to null, and a fast
webhook's first-snapshot pro entitlement gets swallowed as legacy-pro
baseline — same class of bug that caused the 2026-04-17/18 incident, now
reproducible on the overlay path instead of the redirect path.

Fix: before the scheduled reload, set a session flag (wm-post-checkout).
On the reloaded page, panel-layout consumes the flag and treats it as a
post-checkout return, which makes the transition detector seed
lastEntitled=false and correctly route the first pro snapshot through the
reload.

Session storage is used (not local) so the flag is scoped to the tab and
doesn't leak across sessions. Silent try/catch keeps private-browsing
environments working — in that case we fall back to the pre-flag behavior
(risk bounded by the 3s reload + Convex WS catching up, same as before).
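
The flag handling, sketched with an injected storage so it's testable
outside a browser (in the app this would be window.sessionStorage;
function names are illustrative, consume-once semantics as described
above):

```typescript
const POST_CHECKOUT_FLAG = "wm-post-checkout";

interface FlagStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Set before the scheduled reload. Silent catch keeps private-browsing
// environments (where storage throws) on the pre-flag fallback path.
function markPostCheckout(storage: FlagStorage): void {
  try {
    storage.setItem(POST_CHECKOUT_FLAG, "1");
  } catch {
    /* fall back to pre-flag behaviour */
  }
}

// Consumed on the reloaded page: reading clears the flag so a later
// plain reload doesn't re-trigger the post-checkout path.
function consumePostCheckout(storage: FlagStorage): boolean {
  try {
    const set = storage.getItem(POST_CHECKOUT_FLAG) === "1";
    if (set) storage.removeItem(POST_CHECKOUT_FLAG);
    return set;
  } catch {
    return false;
  }
}
```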
2026-04-18 15:19:12 +04:00
Elie Habib
5673bc6c16 fix(brief): sign-in-required state + composing auto-refresh (#3160)
* fix(brief): sign-in-required state + composing poll + visibility refresh

Addresses two P1 review findings after PR #3159 merged.

1. hasPremiumAccess unlocks for desktop WORLDMONITOR_API_KEY and
   browser tester keys, but /api/latest-brief is Clerk-userId
   scoped — it returns 401 for those paths. Previously the panel
   unlocked then showed a retry-loop error for API-key-only users.
   Now: if authState.user is missing, render a dedicated "Sign in
   to view your brief" CTA inline (no endpoint hit). Subscribes to
   subscribeAuthState; when the user signs in mid-session, refresh()
   fires automatically.

2. composing state never auto-refreshed. Fix: every renderComposing
   schedules a 60s setTimeout re-poll; cleared on ready/error/lock/
   destroy. Also added a document visibilitychange listener that
   triggers refresh when the tab comes back into focus. Covers the
   "brief composed while tab backgrounded" case without reload.

Added matching destroy() override to clean up timeout + auth
subscription + visibility listener.

Typecheck + biome lint clean.

* fix(brief): direct Clerk fetch + sign-out clear + account-switch guard

Addresses three P1 review findings on PR #3160.

1. Desktop API key + Clerk Pro couldn't load the panel.
   premiumFetch hard-stops on WORLDMONITOR_API_KEY and never sends
   Clerk Bearer. /api/latest-brief is Bearer-only so every desktop
   request 401'd even for a signed-in Pro user. Fix: fetchLatest
   bypasses premiumFetch entirely — imports getClerkToken() and
   builds the request with Authorization: Bearer directly. The
   user-scoped pre-check already guaranteed we have a Clerk user.

2. Sign-out left the previous user's brief on screen. The auth
   subscription only triggered refresh on truthy nextId, and
   hasPremiumAccess stays true on desktop/tester keys so the
   layout-level updatePanelGating doesn't re-lock us. Fix: the
   subscription now handles ALL three transitions explicitly:
     null → id   → refresh (sign-in)
     idA → idB   → abort in-flight + refresh (account switch)
     id → null   → abort in-flight + renderSignInRequired (sign-out)
   Clears composing poll and inflight fetch on every transition.

3. Clerk account switch could paint user A's brief in user B's
   session. getClerkToken caches the JWT for 50s, so a fast A→B
   switch during an in-flight fetch would hit the server with A's
   Bearer, return A's brief, and the post-response guard (which
   only checked "still premium") would let it paint in B's UI.
   Fix: refresh() captures requestUserId at fetch-start and the
   post-response + error branches re-verify that the current
   authState.user.id still equals requestUserId before any
   this.content mutation. A transient account switch silently
   drops the stale response.
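
   The capture-and-recheck pattern, reduced to a sketch (getCurrentUserId,
   fetchBrief, and paint are injected stand-ins for the panel's authState
   read, Bearer fetch, and this.content mutation):

```typescript
async function guardedRefresh(
  getCurrentUserId: () => string | null,
  fetchBrief: (userId: string) => Promise<string>,
  paint: (brief: string) => void,
): Promise<void> {
  const requestUserId = getCurrentUserId(); // captured at fetch-start
  if (!requestUserId) return;
  try {
    const brief = await fetchBrief(requestUserId);
    // Re-verify before any content mutation: a mid-fetch account
    // switch silently drops the stale response instead of painting it.
    if (getCurrentUserId() !== requestUserId) return;
    paint(brief);
  } catch {
    if (getCurrentUserId() !== requestUserId) return;
    // error UI would render here; elided in this sketch
  }
}
```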

Typecheck + biome lint clean.

* fix(brief): clear Clerk token cache on user-id transition

Closes the remaining account-switch race the previous commit
couldn't cover from inside the panel. Detail:

My previous post-response userId check compared the CURRENT
authState.user.id to the requestUserId captured at refresh start.
For a fast A→B switch:
  - Subscription handler fires, lastUserId = B
  - refresh() captures requestUserId = B
  - fetchLatest calls getClerkToken() → returns A's cached JWT
    (50s TTL, keyed by time not user)
  - Server /api/latest-brief decodes A's sub → returns A's brief
  - Post-check: currentUserId (B) === requestUserId (B) → paint
  - Result: user A's greeting + signed magazineUrl painted in
    user B's session

Fix: clear the Clerk token cache via the existing
clearClerkTokenCache() export from clerk.ts on every observed
user-id transition. Next getClerkToken() re-fetches from
Clerk.session — bound to the CURRENT session, not the previous
one. Server now receives B's token, returns B's brief.

Covers sign-in (A=null→B), account switch (A→B), and sign-out
(B→null) symmetrically. The (B→null) branch also short-circuits
to renderSignInRequired which was already in place.

Findings 1 (sign-out clear) and 3 (desktop API key) from the
latest review were already resolved in commit 8b0690b9a —
reviewer was evaluating pre-commit state. This commit closes
the residual token-cache hole that wasn't visible from the
panel alone.

Typecheck + biome lint clean.

* fix(brief): upgrade CTA for mixed-auth + clear inflight promise too

Addresses two P1 review findings on PR #3160.

1. Mixed auth: tester/API key + FREE Clerk session. hasPremiumAccess
   unlocks the panel; the Bearer-only fetch now sends the free
   Clerk's JWT; /api/latest-brief validates entitlement from the
   JWT userId and returns 403. Before: user saw a retry-loop
   error banner. Now: panel checks authState.user.role !== 'pro'
   BEFORE fetching and renders a dedicated "Pro required" CTA
   inline. A stale client cache that says role='pro' while Convex
   says free is also covered — 403 from the server now surfaces
   as the same upgrade CTA, not a retry error. Introduces a typed
   BriefAccessError so the refresh loop can branch terminal vs
   transient failures cleanly.

2. clearClerkTokenCache only nulled _cachedToken, not
   _tokenInflight. If a token fetch for user A was in flight when
   the app switched to B, the next getClerkToken() reused A's
   promise and sent A's JWT. Server returned A's brief; post-
   response guard (currentUserId B === requestUserId B) let it
   paint. Fix: clearClerkTokenCache now nulls _tokenInflight too.
   The old promise still resolves to its closure, but no caller
   holds a reference, so the fresh getClerkToken() call starts a
   new request bound to the current Clerk session.

Typecheck + biome lint clean.
2026-04-18 14:51:32 +04:00
Elie Habib
64c906a406 feat(eia): gold-standard /api/eia/petroleum (Railway seed → Redis → Vercel reads only) (#3161)
* feat(eia): move /api/eia/petroleum to gold-standard (Railway seed → Redis → Vercel reads only)

Live api.eia.gov fetches from the Vercel edge function were causing
FUNCTION_INVOCATION_TIMEOUT 504s on /api/eia/petroleum (Sydney edge →
US origin with no timeout, no cache, no stale fallback — one EIA blip
blew the 25s budget).

- New seeder scripts/seed-eia-petroleum.mjs — fetches WTI/Brent/
  production/inventory from api.eia.gov with per-fetch 15s timeouts,
  writes energy:eia-petroleum:v1 with the {_seed, data} envelope.
  Accepts 1-of-4 series; 0-of-4 routes to contract-mode RETRY so
  seed-meta stays stale and the bundle retries on next cron.
- Bundled into seed-bundle-energy-sources.mjs (daily, 90s timeout) —
  no new Railway service needed.
- Rewrote api/eia/[[...path]].js as a Redis-only reader via
  readJsonFromUpstash. Same response shape for backward compat with
  widgets/MCP/external callers. 503 + Retry-After on miss (never 504).
- Registered eiaPetroleum in api/health.js STANDALONE_KEYS + gated as
  ON_DEMAND_KEYS for the deploy window; promote to SEED_META
  (maxStaleMin: 4320) in a follow-up after ~7 days of clean cron.
- Tests: 14 seeder unit tests + 9 edge handler tests.
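
The read path reduces to this shape (sketch only: readCached stands in
for readJsonFromUpstash, the _seed payload and the Retry-After value
are assumptions, and the real handler returns the legacy response
shape for backward compat):

```typescript
interface SeedEnvelope<T> {
  _seed: { at: number };
  data: T;
}

// Redis-only: never fetch api.eia.gov live from the edge. A cache miss
// is a fast 503 + Retry-After, never a FUNCTION_INVOCATION_TIMEOUT 504.
async function handlePetroleumRead<T>(
  readCached: (key: string) => Promise<SeedEnvelope<T> | null>,
): Promise<{
  status: number;
  headers: Record<string, string>;
  body: T | { error: string };
}> {
  const envelope = await readCached("energy:eia-petroleum:v1");
  if (!envelope) {
    return {
      status: 503,
      headers: { "Retry-After": "300" }, // value illustrative
      body: { error: "seed not available yet" },
    };
  }
  return { status: 200, headers: {}, body: envelope.data };
}
```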

Audit result: /api/eia/petroleum was the only Vercel route fetching
dashboard data live. Every other fetch(https://…) in api/ is
auth/payments/notifications/user-initiated enrichment.

* fix(eia): close silent-stale window — add SEED_META + seed-health registration

Review finding on PR #3161: without a SEED_META entry, readSeedMeta
returns seedStale: null and classifyKey never reaches STALE_SEED.
That meant a broken Railway cron or missing EIA_API_KEY after the first
successful seed would keep /api/eia/petroleum serving stale data for
up to 7 days (TTL) while /api/health reported OK.

- api/health.js: add SEED_META.eiaPetroleum with maxStaleMin=4320
  (72h = 3× daily bundle cadence). Keep eiaPetroleum in ON_DEMAND_KEYS
  so the Vercel-instant / Railway-delayed deploy window doesn't CRIT
  on first seed, but stale-after-seed now properly fires STALE_SEED.
- api/seed-health.js: register energy:eia-petroleum in SEED_DOMAINS
  (intervalMin=1440) so the secondary health endpoint reports it too.
- Updated ON_DEMAND_KEYS comment to reflect freshness is now enforced.
2026-04-18 14:40:00 +04:00
Elie Habib
fd419bcfae feat(brief): dashboard Latest Brief panel (Phase 4) (#3159)
* feat(brief): dashboard "Latest Brief" panel (Phase 4)

New PRO-gated panel that reads /api/latest-brief and renders today's
brief with a cover-style thumbnail + greeting + thread count + CTA.
Clicking opens the signed magazine URL in a new tab. Base Panel
class handles the PRO overlay (ANONYMOUS/FREE_TIER) via
premium: 'locked' — no story content, headline, or greeting leaks
through DOM on the locked state.

Three render states:
- ready     → cover card + "Read brief →" CTA linking to magazineUrl
- composing → neutral empty state ("Your brief is composing.")
- error     → base showError() with retry

Files:
- src/components/LatestBriefPanel.ts — new Panel subclass, self-
  fetching via premiumFetch (handles Clerk Bearer + X-WorldMonitor-
  Key tester keys + api key fallback)
- src/components/index.ts — export the new panel
- src/app/panel-layout.ts — createPanel('latest-brief', ...)
- src/config/panels.ts — registry entry (priority 1 so it sorts up
  front across all variant registries)
- src/styles/panels.css — cover-card + meta-strip styles using the
  same e-ink palette as the magazine (sienna kicker, bone text on
  ink cover, serif greeting)

Self-contained: no Convex migration, no new env vars, no backend
changes. Reads the /api/latest-brief endpoint already shipped in
Phase 2 (#3153 merged). Lands independently of Phase 3b / 5 / 6 / 8.

Follow-ups (not in this PR):
- CMD+K entry for "Open Latest Brief" — locale strings + commands
  registry, trivial.
- Localisation of panel title + copy strings.
- Share button (todo 223).

Typecheck clean, lint clean on the new file.

* fix(brief): register latest-brief in both premium gate registries

Addresses the review finding that the Panel base class's
`premium: 'locked'` flag is NOT what actually enforces PRO gating in
the app. Two separate registries do:

1. WEB_PREMIUM_PANELS in src/app/panel-layout.ts — the set
   updatePanelGating() iterates on every auth-state change to decide
   which panels to lock with a CTA overlay. Panels not in this set
   get `reason === NONE` and are always unlocked for whoever's
   viewing them, regardless of the Panel constructor flag.

2. The `premium:` property on each entry in src/config/panels.ts —
   which isPanelEntitled() checks to decide whether a panel is
   premium at all.

`latest-brief` was missing from both. Result for anonymous/free
users: the panel mounted, self-fetched /api/latest-brief, got 401
or 403, and showed raw error UI instead of the intended "Upgrade to
Pro" overlay. Also: a PRO user who downgraded mid-session would
retain the rendered brief because updatePanelGating() wouldn't
re-lock them.

Fixes:

- src/app/panel-layout.ts — add 'latest-brief' to WEB_PREMIUM_PANELS
  so updatePanelGating() locks the panel correctly for non-PRO users
  and RE-locks it on a mid-session downgrade.

- src/config/panels.ts — add `premium: 'locked' as const` to all
  four registry entries (full, finance, tech, happy variants) so
  isPanelEntitled() treats it as premium everywhere.

- src/components/LatestBriefPanel.ts — guard refresh() against
  running without premium access. Belt-and-suspenders against race
  conditions where the panel mounts before updatePanelGating()
  completes, and against mid-session downgrade where the panel
  stays mounted but should stop hitting the endpoint. Uses the
  same hasPremiumAccess(getAuthState()) check as the gating
  infrastructure itself.

Typecheck + biome lint clean.

* fix(brief): SVG logo now actually renders + queue concurrent refresh

Addresses two P1 + one P2 from Greptile on PR #3159.

1. P1 (line 147 + line 167): `h('div', { innerHTML: ... })` silently
   did nothing. src/utils/dom-utils.ts applyProps has no special
   case for `innerHTML` — it falls through to
   `el.setAttribute('innerHTML', svgString)` which just sets a
   literal DOM attribute. Both logo containers rendered empty.
   Switched to:
     const logo = h('div', { className: '...' });
     logo.appendChild(rawHtml(WM_LOGO_SVG));
   rawHtml() exists in dom-utils for exactly this case; returns a
   parsed DocumentFragment.

2. P2: Concurrent refresh() was silently dropped. Added a
   refreshQueued flag so a second refresh during an in-flight one
   queues a single follow-up pass instead of disappearing. Now a
   retry-after-error or a downstream caller that triggers refresh
   while another is mid-fetch always sees its intent applied.

Typecheck + biome lint clean.

* fix(brief): close downgrade-leak + blank-on-upgrade races on panel

Addresses two P1 findings on PR #3159.

1. In-flight fetch leaked premium content after downgrade.
   refresh() checked entitlement only BEFORE await premiumFetch.
   If auth flipped during the fetch, updatePanelGating had already
   replaced this.content with the locked CTA, but renderReady/
   renderComposing then overwrote it with brief content. Fixed with
   a three-gate sequence + fetch abort:

   (a) Pre-fetch check: gate + hasPremiumAccess — unchanged.
   (b) In-flight abort: override showGatedCta() to abort()
       the AbortController before super() paints the locked CTA.
       renderReady/renderComposing never even runs.
   (c) Post-response re-check: re-verify this.gateLocked +
       hasPremiumAccess before any this.content mutation. Catches
       the tight window where abort() lost the race or where an
       error-handler path could still paint brief-ish UI.

   All three are needed — a user can sign out between any two of
   them; removing any one leaves a real leakage window.

2. Upgrade → blank panel.
   unlockPanel() base-class behaviour clears the locked content and
   leaves the content element empty. No refresh was triggered on
   the free/anon → PRO transition, so the panel stayed blank until
   page reload. Overrode unlockPanel() to detect the wasLocked
   transition and call refresh() after re-rendering the loading
   state.

Also tracks gateLocked as a local mirror of the base's private
_locked, since Panel doesn't expose a getter. Synced via the two
override sites above; used in the three-gate checks.

Typecheck + biome lint clean.
2026-04-18 13:28:23 +04:00
Elie Habib
711636c7b6 feat(brief): consolidate composer into digest cron (retire standalone service) (#3157)
* feat(brief): consolidate composer into digest cron (retire standalone service)

Merges the Phase 3a standalone Railway composer into the existing
digest cron. End state: one cron (seed-digest-notifications.mjs)
writes brief:{userId}:{issueDate} for every eligible user AND
dispatches the digest to their configured channels with a signed
magazine URL appended. Net -1 Railway service.

User's architectural note: "there is no reason to have 1 digest
preparing all and sending, then another doing a duplicate". This
delivers that — infrastructure consolidation, same send cadence,
single source of truth for brief envelopes.

File moves / deletes:

- scripts/seed-brief-composer.mjs → scripts/lib/brief-compose.mjs
  Pure-helpers library: no main(), no env guards, no cron. Exports
  composeBriefForRule + groupEligibleRulesByUser + dedupeRulesByUser
  (shim) + shouldExitNonZero + date helpers + extractInsights.
- Dockerfile.seed-brief-composer → deleted.
- The seed-brief-composer Railway service is retired (user confirmed
  they would delete it manually).

New files:

- scripts/lib/brief-url-sign.mjs — plain .mjs port of the sign path
  in server/_shared/brief-url.ts (Web Crypto only, no node:crypto).
- tests/brief-url-sign.test.mjs — parity tests that confirm tokens
  minted by the scripts-side signer verify via the edge-side verifier
  and produce byte-identical output for identical input.

Digest cron (scripts/seed-digest-notifications.mjs):

- Reads news:insights:v1 once per run, composes per-user brief
  envelopes, SETEX brief:{userId}:{issueDate} via body-POST pipeline.
- Signs magazine URL per user (BRIEF_URL_SIGNING_SECRET +
  WORLDMONITOR_PUBLIC_BASE_URL new env requirements, see pre-merge).
- Injects magazineUrl into buildChannelBodies for every channel
  (email, telegram, slack, discord) as a "📖 Open your WorldMonitor
  Brief magazine" footer CTA.
- Email HTML gets a dedicated data-brief-cta-slot near the top of
  the body with a styled button.
- Compose failures NEVER block the digest send — the digest cron's
  existing behaviour is preserved when the brief pipeline has issues.
- Brief compose extracted to its own functions (composeBriefsForRun
  + composeAndStoreBriefForUser) to keep main's biome complexity at
  baseline (64 — was 63 before; inline would have pushed to 117).

Tests: 98/98 across the brief suite. New parity tests confirm cross-
module signer agreement.

PRE-MERGE: add BRIEF_URL_SIGNING_SECRET and WORLDMONITOR_PUBLIC_BASE_URL
to the digest-notifications Railway service env (same values already
set on Vercel for Phase 2). Without them, brief compose is auto-
disabled and the digest falls back to its current behaviour — safe to
deploy before env is set.

* fix(brief): digest Dockerfile + propagate compose failure to exit code

Addresses two seventh-round review findings on PR #3157.

1. Cross-directory imports + current Railway build root (todo 230).
   The consolidated digest cron imports from ../api, ../shared, and
   (transitively via scripts/lib/brief-compose.mjs) ../server/_shared.
   The running digest-notifications Railway service builds from the
   scripts/ root — those parent paths are outside the deploy tree
   and would 500 on next rebuild with ERR_MODULE_NOT_FOUND.

   New Dockerfile.digest-notifications (repo-root build context)
   COPYs exactly the modules the cron needs: scripts/ contents,
   scripts/lib/, shared/brief-envelope.*, shared/brief-filter.*,
   server/_shared/brief-render.*, api/_upstash-json.js,
   api/_seed-envelope.js. Tight list to keep the watch surface small.
   Pattern matches the retired Dockerfile.seed-brief-composer + the
   existing Dockerfile.relay.

2. Silent compose failures (todo 231). composeBriefsForRun logged
   counters but never exited non-zero. An Upstash outage or missing
   signing secret silently dropped every brief write while Railway
   showed the cron green. The retired standalone composer exited 1
   on structural failures; that observability was lost in the
   consolidation.

   Changed the compose fn to return {briefByUser, composeSuccess,
   composeFailed}. Main captures the counters, runs the full digest
   send loop first (compose-layer breakage must NEVER block user-
   visible digest delivery), then calls shouldExitNonZero at the
   very end. Exit-on-failure gives ops the Railway-red signal
   without touching send behaviour.

   Also: a total read failure of news:insights:v1 (catch branch)
   now counts as 1 compose failure so the gate trips on insights-
   key infra breakage, not just per-user write failures.

Tests unchanged (98/98). Typecheck + node --check clean. Biome
complexity ticks 63→65 — same pre-existing bucket, already tolerated
by CI; no new blocker.

PRE-MERGE Railway work still pending: set BRIEF_URL_SIGNING_SECRET
+ WORLDMONITOR_PUBLIC_BASE_URL on the digest-notifications service,
AND switch its dockerfilePath to /Dockerfile.digest-notifications
before merging. Without the dockerfilePath switch, the next rebuild
fails.

* fix(brief): Dockerfile type:module + explicit missing-secret tripwire

Addresses two eighth-round review findings on PR #3157.

1. ESM .js files parse as CommonJS in the container (todo 232).
   Dockerfile.digest-notifications COPYs shared/*.js,
   server/_shared/*.js, api/*.js — all ESM because the repo-root
   package.json has "type":"module". But the image never copies the
   root package.json, so Node's nearest-pjson walk inside /app/
   reaches / without finding one and defaults to CommonJS. First
   `export` statement throws `SyntaxError: Unexpected token 'export'`
   at startup.

   Fix: write a minimal /app/package.json with {"type":"module"}
   early in the build. Avoids dragging the full root package.json
   into the image while still giving Node the ESM hint it needs for
   repo-owned .js files.

2. Missing BRIEF_URL_SIGNING_SECRET silently tolerated (todo 233).
   The old gate folded "operator-disabled" (BRIEF_COMPOSE_ENABLED=0)
   and "required secret missing in rollout" into the same boolean
   via AND. A production deploy that forgot the env var would skip
   brief compose without any failure signal — Railway green, no
   briefs, no CTA in digests, nobody notices.

   Split the two states: BRIEF_COMPOSE_DISABLED_BY_OPERATOR (explicit
   kill switch, silent) and BRIEF_SIGNING_SECRET_MISSING (the misconfig
   we care about). When the secret is missing without the operator
   flag, composeBriefsForRun returns composeFailed=1 on first call
   so the end-of-run exit gate trips and Railway flags the run red.
   Digest send still proceeds — compose-layer issues never block
   notifications.

Tests: 98/98. Syntax + node --check clean.

* fix(brief): address 2 remaining P2 review comments on PR #3157

Greptile review (2026-04-18T05:04Z) flagged three P2 items. The
first (shouldExitNonZero never wired into cron) was already fixed in
commit 35a46aa34. This commit addresses the other two.

1. composeBriefForRule: issuedAt used Date.now() instead of the
   caller-supplied nowMs. Under the digest cron the delta is
   milliseconds and harmless, but it broke the function's
   determinism contract — same input must produce same output for
   tests + retries. Now uses the passed nowMs.

2. buildChannelBodies: magazineUrl embedded raw inside Telegram HTML
   <a href="..."> and Slack <URL|text> syntax. The URL is HMAC-
   signed and shape-validated upstream (userId regex + YYYY-MM-DD
   date), so injection is practically impossible — but the email
   CTA (injectBriefCta) escapes per-target and channel footers
   should match that discipline. Added:
     - Telegram: escape &, <, >, " to HTML entities
     - Slack: strip <, >, | (mrkdwn metacharacters)
   Discord and plain-text paths unchanged — Discord links tolerate
   raw URLs, plain text has no metacharacters to escape.
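
The two escapes can be sketched as below. Helper names are assumed,
not the real buildChannelBodies internals; only the character sets
come from this commit:

```javascript
// Telegram HTML: escape &, <, >, " — ampersand first to avoid
// double-escaping the entities produced by the later passes.
function escapeTelegramHtml(url) {
  return url
    .replaceAll('&', '&amp;')
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .replaceAll('"', '&quot;');
}
// Slack: <, >, | are mrkdwn metacharacters inside <URL|text>; strip them.
function stripSlackMrkdwn(url) {
  return url.replace(/[<>|]/g, '');
}
```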

Tests: 98/98 still pass (deterministic issuedAt change was
transparent to existing assertions because tests already pass nowMs
explicitly via the issuedAt fixture field).
2026-04-18 12:30:08 +04:00
Elie Habib
45da551d17 feat(brief): per-user composer writing brief:{userId}:{issueDate} (Phase 3a) (#3154)
* feat(brief): per-user composer writing brief:{userId}:{issueDate} (Phase 3a)

Phase 3a of docs/plans/2026-04-17-003. Produces the Redis-resident
envelopes that Phases 1 (renderer) and 2 (edge routes) already know
how to serve, so after this ships the end-to-end read path works
with real data.

Files:

- shared/brief-filter.{js,d.ts}: pure helpers. normaliseThreatLevel
  maps upstream 'moderate' -> 'medium' (contract pinned the union in
  Phase 1). filterTopStories applies sensitivity thresholds and caps
  at maxStories. assembleStubbedBriefEnvelope builds a full envelope
  with stubbed greeting/lead/threads/signals and runs it through the
  renderer's assertBriefEnvelope so no malformed envelope is ever
  persisted. issueDateInTz computes per-user local date via Intl
  with UTC fallback.

- scripts/seed-brief-composer.mjs: Railway cron. Reads
  news:insights:v1 once, fetches enabled alert rules via the
  existing /relay/digest-rules endpoint (same set
  seed-digest-notifications uses), then for each rule computes the
  user's local issue date, filters stories, assembles an envelope,
  and SETEX brief:{userId}:{issueDate} with 7-day TTL. Respects
  aiDigestEnabled opt-in. Honours SIGTERM. Exits non-zero when >5%
  of rules fail so Railway surfaces structural breakage.

- Dockerfile.seed-brief-composer: standalone container. Copies the
  minimum set (composer + shared/ contract + renderer validator +
  Upstash helper + seed-envelope unwrapper).

- tests/brief-filter.test.mjs: 22 pure-function tests covering
  severity normalisation (including 'moderate' alias), sensitivity
  thresholds, story cap, empty-title drop, envelope assembly passes
  the strict renderer validator, tz-aware date math across +UTC/-UTC
  offsets with a bad-timezone fallback.
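
The tz-aware date helper can be sketched as follows. This is an
assumed shape, not the real shared/brief-filter.js implementation; it
leans on the en-CA locale formatting dates as YYYY-MM-DD:

```javascript
// Per-user local issue date via Intl, with a UTC fallback for
// unknown/garbage timezones (Intl throws RangeError at construction).
function issueDateInTz(nowMs, timeZone) {
  try {
    return new Intl.DateTimeFormat('en-CA', {
      timeZone,
      year: 'numeric',
      month: '2-digit',
      day: '2-digit',
    }).format(new Date(nowMs));
  } catch {
    return new Date(nowMs).toISOString().slice(0, 10); // UTC calendar date
  }
}
```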

Out of scope for this PR:
- LLM-generated whyMatters / lead / signals (Phase 3b).
- brief_ready event fan-out to notification-relay (Phase 3c).
- Dashboard panel that consumes /api/latest-brief (Phase 4).

Pre-merge runbook:
1. Create a new Railway service from Dockerfile.seed-brief-composer.
2. Set env vars (UPSTASH_*, CONVEX_URL, RELAY_SHARED_SECRET) — reuse
   the values already in the digest service.
3. Add a cron schedule (suggested: hourly at :05 so it lands between
   the insights-seeder tick and the digest cron).
4. Verify first run: check service logs for
   "[brief-composer] Done: success=X ..." and a reader's
   /api/latest-brief should stop returning 'composing' within one
   cron cycle.

Tests: 72/72 (22 brief-filter + 30 render + 20 HMAC). Typecheck +
lint clean. Composer script parses with node --check.

* fix(brief): aiDigestEnabled default + per-user rule dedupe

Addresses two fourth-round review findings on PR #3154.

1. aiDigestEnabled default parity (todo 224). Composer was checking
   `!rule.aiDigestEnabled`, which skips legacy rules that predate the
   optional field. The rest of the codebase defaults it to true
   (seed-digest-notifications.mjs:914 uses `!== false`;
   notifications-settings.ts:228 uses `?? true`; the Convex setter
   defaults to true). Flipped the composer to `=== false` so only an
   explicit opt-out skips the brief.

2. Multi-variant last-write-wins (todo 225). alertRules are
   (userId, variant)-scoped but the brief key is user-scoped
   (brief:{userId}:{issueDate}). Users with the full+finance+tech
   variants all enabled would produce three competing writes with a
   nondeterministic survivor. Added dedupeRulesByUser() that picks
   one rule per user: prefers 'full' variant, then most permissive
   sensitivity (all > high > critical), tie-breaking on earliest
   updatedAt for stability across input reordering. Logs the
   occurrence so we can see how often users have multi-variant
   configs.
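
The selection order reads naturally as a three-key sort. A minimal
sketch, with names and rank values assumed from the description above:

```javascript
// Lower rank = more permissive sensitivity (all > high > critical).
const SENSITIVITY_RANK = { all: 0, high: 1, critical: 2 };
function pickRuleForUser(rules) {
  return [...rules].sort(
    (a, b) =>
      (a.variant === 'full' ? 0 : 1) - (b.variant === 'full' ? 0 : 1) ||
      SENSITIVITY_RANK[a.sensitivity] - SENSITIVITY_RANK[b.sensitivity] ||
      a.updatedAt - b.updatedAt, // earliest wins: stable across reordering
  )[0];
}
```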

Also hardened against future regressions:

- Moved env-var guards + main() call behind an isMain() check
  (feedback_seed_isMain_guard). Previously, importing the script
  from a test would fire process.exit(0) on the
  BRIEF_COMPOSER_ENABLED=0 branch and kill the test runner. Tests
  now load the file cleanly.

- Exported dedupeRulesByUser so the tests can exercise the selection
  logic directly.

- The new tests/brief-composer-rule-dedup.test.mjs includes a
  cross-module assertion that seed-digest-notifications.mjs still
  reads `rule.aiDigestEnabled !== false`. If the digest cron ever
  drifts, this test fails loud — the brief and digest must agree on
  who is eligible.

Tests: 83/83 (was 72; +6 dedupe cases + 5 aiDigestEnabled parity
cases). Typecheck + lint clean.

* fix(brief): dedupe order + failure-rate denominator

Addresses two fifth-round review findings on PR #3154.

1. Dedupe was picking a preferred variant BEFORE checking whether it
   could actually emit a brief (todo 226). A user with
   aiDigestEnabled=false on 'full' but true on 'finance' got skipped
   entirely; same for a user with sensitivity='critical' on 'full'
   that filters to zero stories while 'finance' has matching content.

   Replaced dedupeRulesByUser with groupEligibleRulesByUser: pre-
   filters opted-out rules, then returns ALL eligible variants per
   user in preference order. The main loop walks candidates and
   takes the first one whose story filter produces non-empty content.
   Fallback is cheap (story filter is pure) and preserves the 'full'-
   first + most-permissive-sensitivity tie-breakers from before.

   dedupeRulesByUser is kept as a thin wrapper for the existing tests;
   new tests exercise the group+fallback path directly (opt-out +
   opt-in sibling, all-opted-out drop, ordering stability).

2. Failure gate denominator drifted from numerator (todo 227). After
   dedupe, `failed` counts per-user but the gate still compared to
   pre-dedupe rules.length. 60 rules → 10 users → 2 failed writes =
   20% real failure hidden behind a 60-rule denominator.

   Fix: denominator is now eligibleUserCount (Map size after
   group-and-filter). Log line reports rules + eligible_users +
   success + skipped_empty + failed + duration so ops can see the
   full shape.

Tests: 86/86 (was 83; +3 new: opt-out+sibling, all-opted-out drop,
candidate-ordering). Typecheck clean, node --check clean, biome clean.
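
The group-then-fallback walk reduces to a short loop. Function and
field names below are illustrative; only the "first candidate with
non-empty filtered content wins" rule comes from this commit:

```javascript
// Candidates arrive already in preference order; the story filter is
// pure, so trying each variant in turn is cheap.
function composeForUser(candidates, filterStories) {
  for (const rule of candidates) {
    const stories = filterStories(rule);
    if (stories.length > 0) return { rule, stories };
  }
  return null; // every eligible variant filtered to zero stories
}
```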

* fix(brief): body-POST SETEX + attempted-only failure denominator

Addresses two sixth-round review findings on PR #3154.

1. Upstash SETEX (todo 228). The previous write path URL-encoded the
   full envelope into /setex/{key}/{ttl}/{payload} which can blow
   past proxy/edge/Node HTTP request-target limits for realistic
   12-story briefs (5-20 KB JSON). Switched to body-POST via the
   existing `redisPipeline` helper — same transport every other
   write in the repo uses. Per-command error surface is preserved:
   the wrapper throws on null pipeline response or on a {error}
   entry in the result array.

2. Failure-rate denominator (todo 229). Earlier round switched
   denominator from pre-dedupe rules.length to eligibleUserCount,
   but the numerator only counts users that actually reached a
   write attempt. skipped_empty users inflate eligibleUserCount
   without being able to fail, so 4/4 failed writes against 100
   eligible (96 skipped_empty) reads as 4% and silently passes.
   Denominator is now `success + failed` (attempted writes only).
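
A sketch of the body-POST write path's pure pieces. The command array
shape follows the Upstash REST pipeline convention; the helper names
are assumptions, not the repo's actual redisPipeline internals:

```javascript
// SETEX as a pipeline command: the payload rides in the POST body, so
// envelope size never hits request-target (URL-length) limits.
function buildSetexCommand(key, ttlSeconds, envelope) {
  return ['SETEX', key, String(ttlSeconds), JSON.stringify(envelope)];
}
// Preserve the per-command error surface of the old write path.
function assertPipelineOk(results) {
  if (!results) throw new Error('null pipeline response');
  for (const entry of results) {
    if (entry && entry.error) throw new Error(entry.error);
  }
  return results;
}
```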

Extracted shouldExitNonZero({success, failed}) so the denominator
contract lives in a pure function with 7 test cases:
- 0 failures → no exit
- 100% failure on small volume → exits
- 1/20 at exact 5% threshold → exits (documented boundary)
- 1/50 below threshold → no exit
- 2/10 above Math.max(1) floor → exits
- 1/1 single isolated failure → exits
- 0 attempted (no signal) → no exit
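
Reconstructed from the seven cases above, the gate fits in a few
lines (a sketch; the real shouldExitNonZero may differ in detail):

```javascript
// Attempted-only denominator: skipped_empty users never dilute the rate.
function shouldExitNonZero({ success, failed }) {
  const attempted = success + failed;
  if (attempted === 0) return false; // nothing attempted: no signal
  if (failed < 1) return false;      // the Math.max(1, ...) floor
  return failed / attempted >= 0.05; // 5% threshold, boundary inclusive
}
```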

Tests: 93/93 (was 86; +7 threshold cases). Typecheck + lint clean.
2026-04-18 08:45:02 +04:00