* fix(brief): unblock whyMatters analyst endpoint + add DIGEST_ONLY_USER filter
Three changes, all operational for PR #3248's brief-why-matters feature.
1. middleware.ts PUBLIC_API_PATHS allowlist
Railway logs post-#3248 merge showed every cron call to
/api/internal/brief-why-matters returning 403 — middleware's "short
UA" guard (~L183) rejects Node undici's default UA before the
endpoint's own Bearer-auth runs. The feature never executed in prod;
three-layer fallback silently shipped legacy Gemini output. Same
class as /api/seed-contract-probe (2026-04-15). Endpoint still
carries its own subtle-crypto HMAC auth, so bypassing the UA gate
is safe.
2. Explicit UA on callAnalystWhyMatters fetch
Defense-in-depth. Explicit 'worldmonitor-digest-notifications/1.0'
keeps the endpoint reachable if PUBLIC_API_PATHS is ever refactored,
and makes cron traffic distinguishable from ops curl in logs.
3. DIGEST_ONLY_USER=user_xxx filter
Operator single-user test flag. Set on Railway to run compose + send
for one user on the next tick (then unset) — validates new features
end-to-end without fanning out. Empty/unset = normal fan-out. Applied
right after rule fetch so both compose and dispatch paths respect it.
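
A minimal sketch of the item 3 filter, for illustration only. `DigestRule` and `filterRulesForOperatorTest` are hypothetical names, and trimming the value is an assumption; only the `DIGEST_ONLY_USER` variable and the empty/unset semantics come from this commit.

```typescript
// Hypothetical rule shape; the real type lives in the digest cron.
interface DigestRule {
  userId: string;
  channel: string;
}

// Empty or unset flag keeps the normal fan-out; otherwise keep one user's rules.
function filterRulesForOperatorTest(
  rules: DigestRule[],
  onlyUser: string | undefined,
): DigestRule[] {
  const target = (onlyUser ?? '').trim(); // trimming is an assumption
  if (target === '') return rules;
  return rules.filter((rule) => rule.userId === target);
}

// Assumed call site, right after the rule fetch:
// const rules = filterRulesForOperatorTest(allRules, process.env.DIGEST_ONLY_USER);
```

Filtering the rule list once, before compose, is what lets both the compose and dispatch paths see the same reduced set.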
Regression tests: 15 new cases in tests/middleware-bot-gate.test.mts
pin every PUBLIC_API_PATHS entry against 3 triggers (empty/short/curl
UA) plus a negative sibling-path suite so a future prefix-match
refactor can't silently unblock /api/internal/.
Tests: 6043 pass. typecheck + typecheck:api clean. biome: pre-existing
main() complexity warning bumped 74→78 by the filter block (unchanged
in character from pre-PR).
* test(middleware): expand sibling-path negatives to cover all 3 trigger UAs
Greptile flagged: `SIBLING_PATHS` was only tested with `EMPTY_UA`. Under
the current middleware chain this is sufficient (sibling paths hit the
short-UA OR BOT_UA 403 regardless), but it doesn't pin *which* guard
fires. A future refactor that moves `PUBLIC_API_PATHS.has(path)` later
in the chain could let a curl or undici UA pass on a sibling path
without this suite failing.
Fix: iterate the 3 sibling paths against all 3 trigger UAs (empty,
short/undici, curl). Every combination must still 403 regardless of
which guard catches it. 6 new test cases.
Tests: 35 pass in the middleware-bot-gate suite (was 29).
* fix(brief): per-run slot URL so same-day digests link to distinct briefs
Digest emails at 8am and 1pm on the same day pointed to byte-identical
magazine URLs because the URL was keyed on YYYY-MM-DD in the user tz.
Each compose run overwrote the single daily envelope in place, and the
composer's rolling 24h story window meant afternoon output often looked
identical to morning. Readers clicking an older email got whatever the
latest cron happened to write.
Slot format is now YYYY-MM-DD-HHMM (local tz, per compose run). The
magazine URL, carousel URLs, and Redis key all carry the slot, and each
digest dispatch gets its own frozen envelope that lives out the 7d TTL.
envelope.data.date stays YYYY-MM-DD for rendering "19 April 2026".
The digest cron also writes a brief:latest:{userId} pointer (7d TTL,
overwritten each compose) so the dashboard panel and share-url endpoint
can locate the most recent brief without knowing the slot. The
previous date-probing strategy does not work once keys carry HHMM.
No back-compat for the old YYYY-MM-DD format: the verifier rejects it,
the composer only ever writes the new shape, and any in-flight
notifications signed under the old format will 403 on click. Acceptable
at the rollout boundary per product decision.
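
As a rough illustration of the slot shape, the sketch below derives a YYYY-MM-DD-HHMM slot in a given time zone and recovers the date portion. `formatSlot` and `slotToIssueDate` are hypothetical names, not the composer's actual helpers.

```typescript
// Sketch: build a YYYY-MM-DD-HHMM slot for one compose run in the user's tz.
function formatSlot(now: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-CA', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
    hour: '2-digit',
    minute: '2-digit',
    hour12: false,
  }).formatToParts(now);
  const get = (type: string): string =>
    (parts.find((p) => p.type === type)?.value ?? '').padStart(2, '0');
  // Some engines render midnight as "24" when hour12 is false; normalise it.
  const hour = get('hour') === '24' ? '00' : get('hour');
  return `${get('year')}-${get('month')}-${get('day')}-${hour}${get('minute')}`;
}

// The date portion is what envelope.data.date keeps for rendering.
function slotToIssueDate(slot: string): string | null {
  const m = /^(\d{4}-\d{2}-\d{2})-\d{4}$/.exec(slot);
  return m?.[1] ?? null; // null for the pre-slot YYYY-MM-DD shape
}
```

Two runs on the same local day differ only in the HHMM suffix, which is what gives each dispatch its own Redis key and URL.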
* fix(brief): carve middleware bot allowlist to accept slot-format carousel path
BRIEF_CAROUSEL_PATH_RE in middleware.ts was still matching only the
pre-slot YYYY-MM-DD segment, so every slot-based carousel URL emitted
by the digest cron (YYYY-MM-DD-HHMM) would miss the social allowlist
and fall into the generic bot gate. Telegram/Slack/Discord/LinkedIn
image fetchers would 403 on sendMediaGroup, breaking previews for the
new digest links.
CI missed this because tests/middleware-bot-gate.test.mts still
exercised the old /YYYY-MM-DD/ path shape. Swap the fixture to the
slot format and add a regression asserting the pre-slot shape is now
rejected, so legacy links cannot silently slip through the allowlist after
the rollout.
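
The exact pattern in middleware.ts is not reproduced in this message. The sketch below only illustrates the shape change, assuming the path layout /api/brief/carousel/<userId>/<slot>/<page> with pages 0 to 2; the regex body itself is an assumption.

```typescript
// Assumed slot-format pattern: userId / YYYY-MM-DD-HHMM / page 0|1|2.
const SLOT_CAROUSEL_PATH_RE =
  /^\/api\/brief\/carousel\/[^/]+\/\d{4}-\d{2}-\d{2}-\d{4}\/[0-2]$/;

function isSlotCarouselPath(path: string): boolean {
  return SLOT_CAROUSEL_PATH_RE.test(path);
}

// New slot-format URL is accepted; the pre-slot date-only URL is not.
const slotPathOk = isSlotCarouselPath('/api/brief/carousel/user_1/2026-04-19-0800/0');
const legacyPathOk = isSlotCarouselPath('/api/brief/carousel/user_1/2026-04-19/0');
```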
* fix(brief): preserve caller-requested slot + correct no-brief share-url error
Two contract bugs in the slot rollout that silently misled callers:
1. GET /api/latest-brief?slot=X where X has no envelope was returning
{ status: 'composing', issueDate: <today UTC> } — which reads as
"today's brief is composing" instead of "the specific slot you
asked about doesn't exist". A caller probing a known historical
slot would get a completely unrelated "today" signal. Now we echo
the requested slot back (issueSlot + issueDate derived from its
date portion) when the caller supplied ?slot=, and keep the
UTC-today placeholder only for the no-param path.
2. POST /api/brief/share-url with no slot and no latest-pointer was
falling into the generic invalid_slot_shape 400 branch. That is
not an input-shape problem; it is "no brief exists yet for this
user". Return 404 brief_not_found — the same code the
existing-envelope check returns — so callers get one coherent
contract: either the brief exists and is shareable, or it doesn't
and you get 404.
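
The two contracts above could be sketched roughly as follows. All function and type names are hypothetical, and keeping `status: 'composing'` on the echoed-slot path is an assumption; the commit only states which fields are echoed and which error code is returned.

```typescript
type MissResponse = { status: 'composing'; issueDate: string; issueSlot?: string };

// GET /api/latest-brief miss: echo the caller's slot, else the UTC-today placeholder.
function latestBriefMiss(slotParam: string | null, now: Date): MissResponse {
  if (slotParam !== null) {
    return { status: 'composing', issueSlot: slotParam, issueDate: slotParam.slice(0, 10) };
  }
  return { status: 'composing', issueDate: now.toISOString().slice(0, 10) };
}

type ShareSlotResult =
  | { ok: true; slot: string }
  | { ok: false; status: 400 | 404; error: 'invalid_slot_shape' | 'brief_not_found' };

const SLOT_RE = /^\d{4}-\d{2}-\d{2}-\d{4}$/;

// POST /api/brief/share-url: a malformed slot is a 400, a missing brief is a 404.
function resolveShareSlot(
  requested: string | undefined,
  latestPointer: string | null,
): ShareSlotResult {
  if (requested !== undefined) {
    return SLOT_RE.test(requested)
      ? { ok: true, slot: requested }
      : { ok: false, status: 400, error: 'invalid_slot_shape' };
  }
  if (latestPointer === null) {
    return { ok: false, status: 404, error: 'brief_not_found' };
  }
  return { ok: true, slot: latestPointer };
}
```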
* fix(brief): allow Telegram/social UAs to fetch carousel images
middleware.ts BOT_UA regex (/bot/i) was returning 403 on Telegram's sendMediaGroup
fetch of /api/brief/carousel/<u>/<d>/<p>. SOCIAL_IMAGE_UA allowlist
(includes telegrambot) was scoped to /favico/* and .png suffix only;
carousel returns image/png but the URL has no extension.
Symptom: Railway log [digest] Telegram carousel 400 ... WEBPAGE_CURL_FAILED
and zero images above the Telegram brief.
Fix: extend UA-bypass guard to cover /api/brief/carousel/ prefix.
HMAC token on the URL is the real auth; UA allowlist is defence-in-depth.
* Address P2 + P3: regression test + route-shape regex
P2: Add tests/middleware-bot-gate.test.mts — 13 cases pinning the
contract:
- TelegramBot/Slackbot/Discordbot/LinkedInBot pass on carousel
- curl, generic bot UAs, missing UA still 403 on carousel
- TelegramBot 403s on non-carousel API routes (scoped, not global)
- Malformed carousel paths (admin/dashboard, page >= 3, non-ISO
date) all still 403 via the regex
- Normal browsers pass everywhere
P3: Replace startsWith('/api/brief/carousel/') prefix with
BRIEF_CAROUSEL_PATH_RE matching the exact shape enforced by
api/brief/carousel/[userId]/[issueDate]/[page].ts
(userId / YYYY-MM-DD / page 0|1|2). A future
/api/brief/carousel/admin or similar sibling cannot inherit the
bypass. Comment now lists every social-image UA this protects.
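
A simplified sketch of the scoped bypass that the P2 cases pin. The regex bodies, the UA subset, and the length threshold are assumptions for illustration; the real middleware chain has more steps.

```typescript
const BOT_UA = /bot|curl\//i; // simplified
const SOCIAL_IMAGE_UA = /telegrambot|slackbot|discordbot|linkedinbot/i; // assumed subset
// Assumed shape: userId / YYYY-MM-DD / page 0|1|2.
const BRIEF_CAROUSEL_PATH_RE =
  /^\/api\/brief\/carousel\/[^/]+\/\d{4}-\d{2}-\d{2}\/[0-2]$/;
const MIN_UA_LENGTH = 10; // hypothetical threshold

function isAllowed(ua: string, path: string): boolean {
  if (ua.length < MIN_UA_LENGTH) return false; // missing or short UA
  // Social image fetchers pass only on the exact carousel route shape.
  if (SOCIAL_IMAGE_UA.test(ua) && BRIEF_CAROUSEL_PATH_RE.test(path)) return true;
  if (BOT_UA.test(ua)) return false;
  return true;
}
```

Matching the full route shape, rather than a prefix, is what keeps a path such as /api/brief/carousel/admin behind the generic bot gate even for an allowlisted UA.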
typecheck + typecheck:api clean. test:data 5772/5772.
* fix(api): unblock Pro API clients at edge + accept x-api-key alias
Fixes #3146: Pro API subscriber getting 403 when calling from Railway.
Two independent layers were blocking server-side callers:
1. Vercel Edge Middleware (middleware.ts) blocks any UA matching
/bot|curl\/|python-requests|go-http|java\//, which killed every
legitimate server-to-server API client before the gateway even saw
the request. Add bypass: requests carrying an `x-worldmonitor-key`
or `x-api-key` header that starts with `wm_` skip the UA gate.
The prefix is a cheap client-side signal, not auth — downstream
server/gateway.ts still hashes the key and validates against the
Convex `userApiKeys` table + entitlement check.
2. Header name mismatch. Docs/gateway only accepted
`X-WorldMonitor-Key`, but most API clients default to `x-api-key`.
Accept both header names in:
- api/_api-key.js (legacy static-key allowlist)
- server/gateway.ts (user-issued Convex-backed keys)
- server/_shared/premium-check.ts (isCallerPremium)
Add `X-Api-Key` to CORS Allow-Headers in server/cors.ts and
api/_cors.js so browser preflights succeed.
Follow-up outside this PR (Cloudflare dashboard, not in repo):
- Extend the "Allow api access with WM" custom WAF rule to also match
`starts_with(http.request.headers["x-api-key"][0], "wm_")`, so CF
Managed Rules don't block requests using the x-api-key header name.
- Update the api-cors-preflight CF Worker's corsHeaders to include
`X-Api-Key` (memory: cors-cloudflare-worker.md — Worker overrides
repo CORS on api.worldmonitor.app).
* fix(api): tighten middleware bypass shape + finish x-api-key alias coverage
Addresses review findings on #3155:
1. middleware.ts bypass was too loose. "Starts with wm_" let any caller
send X-Api-Key: wm_fake and skip the UA gate, shifting unauthenticated
scraper load onto the gateway's Convex lookup. Tighten to the exact
key format emitted by src/services/api-keys.ts:generateKey —
`^wm_[a-f0-9]{40}$` (wm_ + 20 random bytes as hex). Still a cheap
edge heuristic (no hash lookup in middleware), but raises spoofing
from trivial prefix match to a specific 43-char shape.
2. Alias was incomplete on bespoke endpoints outside the shared gateway:
- api/v2/shipping/route-intelligence.ts: async wm_ user-key fallback
now reads X-Api-Key as well
- api/v2/shipping/webhooks.ts: webhook ownership fingerprint now
reads X-Api-Key as well (same key value → same SHA-256 → same
ownerTag, so a user registering with either header can manage
their webhook from the other)
- api/widget-agent.ts: accept X-Api-Key in the auth read AND in the
OPTIONS Allow-Headers list
- api/chat-analyst.ts: add X-Api-Key to the OPTIONS Allow-Headers
list (auth path goes through shared helpers already aliased)
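
To make item 1 concrete, here is a small sketch of the edge-side shape check combined with the header alias. `readApiKey` and `mayBypassUaGate` are hypothetical names, and the precedence between the two headers is an assumption; the pattern is the one quoted above.

```typescript
// wm_ + 20 random bytes as hex = 43 characters total.
const USER_KEY_SHAPE = /^wm_[a-f0-9]{40}$/;

// Headers are modelled as a lower-cased plain object for this sketch.
function readApiKey(headers: Record<string, string | undefined>): string | undefined {
  return headers['x-worldmonitor-key'] ?? headers['x-api-key'];
}

// Cheap heuristic only: real validation still happens in the gateway.
function mayBypassUaGate(headers: Record<string, string | undefined>): boolean {
  const key = readApiKey(headers);
  return key !== undefined && USER_KEY_SHAPE.test(key);
}
```

A value such as `wm_fake` no longer skips the UA gate, while a well-formed key sent under either header name does.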
Vercel log showed 'Middleware 403 Forbidden' on /api/seed-contract-probe
for both curl-from-ops and UptimeRobot requests. middleware.ts's BOT_UA
regex matches 'curl/' and 'bot', so any monitoring/probe UA was blocked
before reaching the handler — even though the probe has its own
RELAY_SHARED_SECRET auth that makes the UA check redundant.
Added /api/seed-contract-probe to PUBLIC_API_PATHS (joining /api/version
and /api/health). Safe: the endpoint enforces x-probe-secret matching
RELAY_SHARED_SECRET internally; bypassing the generic UA gate does not
reduce security.
Commented the allowlist to spell out the invariant: entries must carry
their own auth, because this list disables the middleware's generic bot
gate.
Verified via Vercel Inspector log trace:
Firewall: bypass → OK
Middleware: 403 Forbidden ← this commit fixes it
Handler: (unreachable before fix)
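
The allowlist invariant can be pictured with the sketch below: an exact-match set checked before the generic UA guards. The guard order, the UA length threshold, and the function name are assumptions; the three paths are the ones named in this commit.

```typescript
// Every entry must carry its own auth, because this list skips the bot gate.
const PUBLIC_API_PATHS = new Set<string>([
  '/api/version',
  '/api/health',
  '/api/seed-contract-probe',
]);
const BOT_UA = /bot|curl\//i; // simplified
const MIN_UA_LENGTH = 10; // hypothetical threshold

function gate(path: string, ua: string | null): 'pass' | 'forbidden' {
  if (PUBLIC_API_PATHS.has(path)) return 'pass'; // exact match, never a prefix
  if (ua === null || ua.length < MIN_UA_LENGTH) return 'forbidden';
  if (BOT_UA.test(ua)) return 'forbidden';
  return 'pass';
}
```

Because the lookup is `Set.has`, a sibling such as /api/seed-contract-probe-admin does not inherit the bypass.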
* feat(seo): BlogPosting schema, FAQPage JSON-LD, author system, AI crawler welcome
Blog structured data:
- Change @type Article to BlogPosting for all blog posts
- Author: Organization to Person with extensible default (Elie Habib)
- Add per-post author/authorUrl/authorBio/modifiedDate frontmatter fields
- Auto-extract FAQPage JSON-LD from FAQ sections in all 17 posts
- Show Updated date when modifiedDate differs from pubDate
- Add author bio section with GitHub avatar and fallback
Main app:
- Add commodity variant to middleware VARIANT_HOST_MAP and VARIANT_OG
- Add commodity.worldmonitor.app to sitemap.xml
- Shorten index.html meta description to 136 chars (was 161)
- Remove worksFor block from index.html author JSON-LD
- Welcome all bots in robots.txt (removed per-bot blocks, global allows)
- Update llms.txt: five variants listed, all 17 blog post URLs added
* fix(seo): scope FAQ regex to section boundary, use author-aware avatar
- extractFaqLd now slices only to the next ## heading (was: to end of body)
preventing bold text in post-FAQ sections from being mistakenly extracted
- Avatar src now derived from DEFAULT_AUTHOR_GITHUB constant (koala73)
only when using the default author; custom authors fall back to favicon
so multi-author posts show a correct image instead of the wrong profile
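
A rough sketch of the section-boundary fix, assuming FAQ questions are written as bold lines under a `## FAQ` heading. The heading text, the bold-question convention, and both function names are assumptions; extractFaqLd itself is not shown in this message.

```typescript
// Returns only the text between the FAQ heading and the next ## heading.
function sliceFaqSection(body: string): string | null {
  const start = body.search(/^##[ \t]+FAQ[ \t]*$/im);
  if (start === -1) return null;
  const lineEnd = body.indexOf('\n', start);
  if (lineEnd === -1) return '';
  const rest = body.slice(lineEnd + 1);
  const next = rest.search(/^##\s/m); // "### ..." subheadings do not end the section
  return next === -1 ? rest : rest.slice(0, next);
}

// Collects bold runs, which this sketch treats as the questions.
function extractBoldQuestions(section: string): string[] {
  const found: string[] = [];
  const bold = /\*\*(.+?)\*\*/g;
  let m: RegExpExecArray | null;
  while ((m = bold.exec(section)) !== null) {
    found.push(m[1] ?? '');
  }
  return found;
}
```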
UptimeRobot UA contains "bot" (UptimeRo*bot*) which triggers the
BOT_UA regex, causing 403 on health checks. Add /api/health to
PUBLIC_API_PATHS alongside /api/version.
* feat: enhance support for HLS streams and update font styles
* chore: add .vercelignore to exclude large local build artifacts from Vercel deploys
* chore: include node types in tsconfig to fix server type errors on Vercel build
* fix(middleware): guard optional variant OG lookup to satisfy strict TS
* fix: desktop build and live channels handle null safety
- scripts/build-sidecar-sebuf.mjs: Skip building removed [domain]/v1/[rpc].ts (removed in #785)
- src/live-channels-window.ts: Add optional chaining for handle property to prevent null errors
- src-tauri/Cargo.lock: Bump version to 2.5.24
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* fix: address review issues on PR #1020
- Remove AGENTS.md (project guidelines belong to repo owner)
- Restore tracking script in index.html (accidentally removed)
- Revert tsconfig.json "node" types (leaks Node globals to frontend)
- Add protocol validation to isHlsUrl() (security: block non-http URIs)
- Revert Cargo.lock version bump (release management concern)
* fix: address P2/P3 review findings
- Preserve hlsUrl for HLS-only channels in refreshChannelInfo (was
incorrectly clearing the stream URL on every refresh cycle)
- Replace deprecated .substr() with .substring()
- Extract duplicated HLS display name logic into getChannelDisplayName()
---------
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Elie Habib <elie.habib@gmail.com>
* perf(rss): route RSS direct to Railway, skip Vercel middleman
Vercel /api/rss-proxy has 65% error rate (207K failed invocations/12h).
Route browser RSS requests directly to Railway (proxy.worldmonitor.app)
via Cloudflare CDN, eliminating Vercel as middleman.
- Add VITE_RSS_DIRECT_TO_RELAY feature flag (default off) for staged rollout
- Centralize RSS proxy URL in rssProxyUrl() with desktop/dev/prod routing
- Make Railway /rss public (skip auth, keep rate limiting with CF-Connecting-IP)
- Add wildcard *.worldmonitor.app CORS + always emit Vary: Origin on /rss
- Extract ~290 RSS domains to shared/rss-allowed-domains.cjs (single source of truth)
- Convert Railway domain check to Set for O(1) lookups
- Remove rss-proxy from KEYED_CLOUD_API_PATTERN (no longer needs API key header)
- Add edge function test for shared domain list import
* fix(edge): replace node:module with JSON import for edge-compatible RSS domains
api/_rss-allowed-domains.js used createRequire from node:module which is
unsupported in Vercel Edge Runtime, breaking all edge functions (including
api/gpsjam). Replaced with JSON import attribute syntax that works in both
esbuild (Vercel build) and Node.js 22+ (tests).
Also fixed middleware.ts TS18048 error where VARIANT_OG[variant] could be
undefined.
* test(edge): add guard against node: built-in imports in api/ files
Scans ALL api/*.js files (including _ helpers) for node: module imports
which are unsupported in Vercel Edge Runtime. This would have caught the
createRequire(node:module) bug before it reached Vercel.
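
The guard boils down to a text scan along these lines. This is a sketch, not the test in the repo: `findNodeBuiltinImports` is a hypothetical name, and the pattern is a plain-text heuristic that would also flag a matching string inside a comment.

```typescript
// Returns every node: specifier imported or required in the given source text.
function findNodeBuiltinImports(source: string): string[] {
  const pattern =
    /(?:\bfrom\s*|\bimport\s*\(?\s*|\brequire\s*\(\s*)['"](node:[^'"]+)['"]/g;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    found.push(m[1] ?? '');
  }
  return found;
}

// A test would read each api/*.js file and assert the result is empty.
const offending = findNodeBuiltinImports(
  "import { createRequire } from 'node:module';\nconst x = 1;\n",
);
```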
* fix(edge): inline domain array and remove NextResponse reference
- Replace `import ... with { type: 'json' }` in _rss-allowed-domains.js
with inline array — Vercel esbuild doesn't support import attributes
- Replace `NextResponse.next()` with bare `return` in middleware.ts —
NextResponse was never imported
* ci(pre-push): add esbuild bundle check and edge function tests
The pre-push hook now catches Vercel build failures locally:
- esbuild bundles each api/*.js entrypoint (catches import attribute
syntax, missing modules, and other bundler errors)
- runs edge function test suite (node: imports, module isolation)
Replace build-time VITE_VARIANT resolution with hostname-based detection
for web deployments. A single build now serves all 4 variants (full, tech,
finance, happy) via subdomain routing, eliminating 3 redundant Vercel
projects and their build minutes.
- Extract VARIANT_META into shared src/config/variant-meta.ts
- Detect variant from hostname (tech./finance./happy. subdomains)
- Preserve VITE_VARIANT env var for desktop builds and localhost dev
- Add social bot OG responses in middleware for variant subdomains
- Swap favicons and meta tags at runtime per resolved variant
- Restrict localStorage variant reads to localhost/Tauri only
- Remove PostHog analytics runtime and configuration
- Add API rate limiting (api/_rate-limit.js)
- Harden traffic controls across edge functions
- Add runtime fallback controls and data-loader improvements
- Add military base data scripts (fetch-mirta-bases, fetch-osm-bases)
- Gitignore large raw data files
- Settings playground prototypes
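
Hostname-based detection could look roughly like the sketch below. `resolveVariant` and its env fallback rule are assumptions, and the desktop (Tauri) path is left out; only the subdomain names come from this commit.

```typescript
type Variant = 'full' | 'tech' | 'finance' | 'happy';

const SUBDOMAIN_VARIANTS: Record<string, Variant | undefined> = {
  tech: 'tech',
  finance: 'finance',
  happy: 'happy',
};

function isVariant(value: string): value is Variant {
  return value === 'full' || value === 'tech' || value === 'finance' || value === 'happy';
}

// Web: the hostname decides. Localhost: the build-time env var still applies.
function resolveVariant(hostname: string, envVariant?: string): Variant {
  if (hostname === 'localhost' || hostname === '127.0.0.1') {
    return envVariant !== undefined && isVariant(envVariant) ? envVariant : 'full';
  }
  if (!hostname.endsWith('.worldmonitor.app')) return 'full';
  const sub = hostname.split('.')[0] ?? '';
  return SUBDOMAIN_VARIANTS[sub] ?? 'full';
}
```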
Route web production RPC traffic through api.worldmonitor.app via fetch
interceptor (installWebApiRedirect). Add default Cache-Control headers
(s-maxage=300, stale-while-revalidate=60) on GET 200 responses, with
no-store override for real-time endpoints (vessel snapshot). Update CORS
to allow GET method. Skip Vercel bot middleware for API subdomain using
hostname check (non-spoofable, replacing CF-Ray header approach). Update
desktop cloud fallback to route through api.worldmonitor.app.
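
The default caching rule described above might be expressed as follows. The function name and the example real-time path are hypothetical; the header values are the ones stated in this commit.

```typescript
const DEFAULT_CACHE_CONTROL = 's-maxage=300, stale-while-revalidate=60';

// Returns the Cache-Control value to apply, or null to leave the response alone.
function defaultCacheControl(
  method: string,
  status: number,
  path: string,
  realtimePaths: ReadonlySet<string>,
): string | null {
  if (method !== 'GET' || status !== 200) return null;
  if (realtimePaths.has(path)) return 'no-store'; // real-time endpoints opt out
  return DEFAULT_CACHE_CONTROL;
}

// Hypothetical path standing in for the vessel snapshot endpoint.
const REALTIME = new Set<string>(['/api/example/vessel-snapshot']);
```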
* fix: restrict SW route patterns to same-origin only
The broad regex /^https?:\/\/.*\/api\/.*/i matched ANY URL with /api/
in the path, including external APIs like NASA EONET
(eonet.gsfc.nasa.gov/api/v3/events). Workbox intercepted these
cross-origin requests with NetworkOnly, causing no-response errors
when CORS failed.
Changed all /api/, /ingest/, and /rss/ SW route patterns to use
sameOrigin callback check so only our Vercel routes get NetworkOnly
handling. External APIs now pass through without SW interference.
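
A sketch of the same-origin matcher, assuming the match callback receives `url` and `sameOrigin` as Workbox route callbacks do. The local `MatchOptions` type is a structural stand-in so the example has no dependencies, and `isOwnNetworkOnlyRoute` is a hypothetical name.

```typescript
// Structural stand-in for the options a route match callback receives.
interface MatchOptions {
  url: { pathname: string };
  sameOrigin: boolean;
}

const NETWORK_ONLY_PREFIXES = ['/api/', '/ingest/', '/rss/'];

// Only our own routes get NetworkOnly; cross-origin URLs are left to the browser.
function isOwnNetworkOnlyRoute({ url, sameOrigin }: MatchOptions): boolean {
  return sameOrigin && NETWORK_ONLY_PREFIXES.some((prefix) => url.pathname.startsWith(prefix));
}
```

With this check, eonet.gsfc.nasa.gov/api/v3/events no longer matches even though its path contains /api/.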
* fix: whitelist social preview bots on OG image assets
Slack-ImgProxy (distinct from Slackbot) was blocked from fetching
/favico/og-image.png by both our bot filter and Vercel Attack Challenge.
Extend middleware matcher to /favico/* and allow all social preview/image
bots through on static asset paths.
- fred-data: batch mode (comma-separated series_id) reduces 7 edge
function invocations to 1; cap at 15 series; propagate upstream
502s instead of masking as empty 200; add X-Data-Status header
- ucdp-events: parallelize page fetches; track failed pages and use
short cache TTL for partial results instead of caching at full 6h
- ucdp: add OPTIONS/method guard matching ucdp-events pattern
- middleware: exact-match social bot paths instead of startsWith
- vercel.json: use VERCEL_GIT_PREVIOUS_SHA for multi-commit diffs;
add middleware.ts, settings.html, vercel.json to watch list
- Panel.ts: use safeHtml() allowlist sanitizer for tooltip content
- dom-utils: add safeHtml() with tag/attribute allowlist and
javascript: URI blocking
Block crawlers/scrapers from /api/* routes via Edge Middleware (403 for
bot user-agents and missing/short UAs). Social preview bots (Twitter,
Facebook, LinkedIn, Slack, Discord) are allowed on /api/story and
/api/og-story for OG previews. robots.txt reinforces the same policy.