5 Commits

Author SHA1 Message Date
Elie Habib
de769ce8e1 fix(api): unblock Pro API clients at edge + accept x-api-key alias (#3155)
* fix(api): unblock Pro API clients at edge + accept x-api-key alias

Fixes #3146: Pro API subscriber getting 403 when calling from Railway.
Two independent layers were blocking server-side callers:

1. Vercel Edge Middleware (middleware.ts) blocks any UA matching
   /bot|curl\/|python-requests|go-http|java\//, which killed every
   legitimate server-to-server API client before the gateway even saw
   the request. Add bypass: requests carrying an `x-worldmonitor-key`
   or `x-api-key` header that starts with `wm_` skip the UA gate.
   The prefix is a cheap client-side signal, not auth — downstream
   server/gateway.ts still hashes the key and validates against the
   Convex `userApiKeys` table + entitlement check.

2. Header name mismatch. Docs/gateway only accepted
   `X-WorldMonitor-Key`, but most API clients default to `x-api-key`.
   Accept both header names in:
     - api/_api-key.js (legacy static-key allowlist)
     - server/gateway.ts (user-issued Convex-backed keys)
     - server/_shared/premium-check.ts (isCallerPremium)
   Add `X-Api-Key` to CORS Allow-Headers in server/cors.ts and
   api/_cors.js so browser preflights succeed.
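A minimal sketch of the dual-header read described above (the helper name and Headers-based signature are illustrative, not the repo's actual code):

```typescript
// Hedged sketch: accept both header spellings, canonical name first.
// `readApiKey` is an illustrative helper, not the actual shared code.
function readApiKey(headers: Headers): string | null {
  // Header names are case-insensitive; Headers.get normalizes them.
  return headers.get("x-worldmonitor-key") ?? headers.get("x-api-key");
}
```

The canonical `X-WorldMonitor-Key` wins when both are present, so existing documented clients see no behavior change.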

Follow-up outside this PR (Cloudflare dashboard, not in repo):
- Extend the "Allow api access with WM" custom WAF rule to also match
  `starts_with(http.request.headers["x-api-key"][0], "wm_")`, so CF
  Managed Rules don't block requests using the x-api-key header name.
- Update the api-cors-preflight CF Worker's corsHeaders to include
  `X-Api-Key` (memory: cors-cloudflare-worker.md — Worker overrides
  repo CORS on api.worldmonitor.app).

* fix(api): tighten middleware bypass shape + finish x-api-key alias coverage

Addresses review findings on #3155:

1. middleware.ts bypass was too loose. "Starts with wm_" let any caller
   send X-Api-Key: wm_fake and skip the UA gate, shifting unauthenticated
   scraper load onto the gateway's Convex lookup. Tighten to the exact
   key format emitted by src/services/api-keys.ts:generateKey —
   `^wm_[a-f0-9]{40}$` (wm_ + 20 random bytes as hex). Still a cheap
   edge heuristic (no hash lookup in middleware), but raises spoofing
   from trivial prefix match to a specific 43-char shape.

2. Alias was incomplete on bespoke endpoints outside the shared gateway:
   - api/v2/shipping/route-intelligence.ts: async wm_ user-key fallback
     now reads X-Api-Key as well
   - api/v2/shipping/webhooks.ts: webhook ownership fingerprint now
     reads X-Api-Key as well (same key value → same SHA-256 → same
     ownerTag, so a user registering with either header can manage
     their webhook from the other)
   - api/widget-agent.ts: accept X-Api-Key in the auth read AND in the
     OPTIONS Allow-Headers list
   - api/chat-analyst.ts: add X-Api-Key to the OPTIONS Allow-Headers
     list (auth path goes through shared helpers already aliased)
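The tightened edge check could be sketched like this (the regex is the one from the commit; the function name is illustrative):

```typescript
// wm_ + 20 random bytes hex-encoded = wm_ + 40 lowercase hex chars (43 total).
const WM_KEY_SHAPE = /^wm_[a-f0-9]{40}$/;

// Cheap edge heuristic only — real auth (hash + Convex lookup) stays downstream.
function bypassesUaGate(apiKeyHeader: string | null): boolean {
  return apiKeyHeader !== null && WM_KEY_SHAPE.test(apiKeyHeader);
}
```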
2026-04-18 08:18:49 +04:00
Elie Habib
bcdd508b69 feat(analyst): topic-aware digest search fixes hallucination on niche queries (#2677)
* feat(analyst): topic-aware digest search for WM Analyst

The analyst was answering topic-specific questions from forecast probabilities and model knowledge instead of from actual ingested news articles.

Root cause: 200 RSS feed articles flow into news:digest:v1:full:en but only 8 top-scored stories make it to news:insights:v1. Topic-relevant articles were silently discarded at the clustering step.

Three gaps fixed:
- GDELT live headlines now append up to 3 user query keywords to surface topic-relevant live articles
- Full digest corpus is now keyword-searched per query; top 8 matched articles injected first in context as 'Matched News Articles'
- Fallback prompt no longer invites model speculation; replaced with an explicit prohibition

Keyword extraction runs once in assembleAnalystContext and is shared by both GDELT and digest search (zero added latency).

TODO: fan out digest search to multi-language keys when available.
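The per-query digest search might look like this sketch (article shape, names, and scoring are assumptions, not the actual implementation):

```typescript
interface Article { title: string; summary: string }

// Score each digest article by how many query keywords it contains, using
// word boundaries so short terms like "us" don't match inside "business".
// Keywords are assumed to be plain alphanumeric tokens (no regex escaping shown).
function searchDigest(articles: Article[], keywords: string[], limit = 8): Article[] {
  return articles
    .map((a) => {
      const text = `${a.title} ${a.summary}`.toLowerCase();
      const score = keywords.reduce(
        (n, k) => n + (new RegExp(`\\b${k}\\b`, "i").test(text) ? 1 : 0),
        0,
      );
      return { a, score };
    })
    .filter((s) => s.score > 0)         // keep only keyword-relevant articles
    .sort((x, y) => y.score - x.score)  // best matches first
    .slice(0, limit)                    // top 8 into the prompt context
    .map((s) => s.a);
}
```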

* fix(analyst): multi-turn retrieval continuity and word-boundary keyword matching

P1: prepend last user turn to retrieval query for follow-up topic continuity
P2: preserve 2-char acronyms (US/UK/EU); use tokenizeForMatch+findMatchingKeywords for word-boundary-safe scoring instead of String.includes

* fix(analyst): prioritize current-turn keywords in retrieval query

extractKeywords processes tokens left-to-right and caps at 8 distinct
terms. Building the retrieval string as prevTurn + currentQuery let a
long prior question fill the cap before the pivot term in the follow-up
(e.g. 'germany' in 'What about Germany?') was ever seen.

Swapped to currentQuery + prevTurn so current-turn keywords always win
the available slots; prior-turn terms backfill what remains for topic
continuity.
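A minimal sketch of why ordering matters under a left-to-right cap (the tokenizer here is illustrative and skips the real stopword handling):

```typescript
const KEYWORD_CAP = 8;

// Collect distinct tokens left-to-right until the cap is hit.
function extractTokensLtr(text: string, cap = KEYWORD_CAP): string[] {
  const seen: string[] = [];
  for (const t of text.toLowerCase().split(/\W+/)) {
    if (t.length >= 3 && !seen.includes(t)) seen.push(t);
    if (seen.length >= cap) break;
  }
  return seen;
}

const prevTurn = "how exposed are european energy markets to russian pipeline cuts this winter";
const currentQuery = "what about germany";

// currentQuery first: the pivot term always wins a slot.
const good = extractTokensLtr(`${currentQuery} ${prevTurn}`);
// prevTurn first: a long prior question fills the cap before "germany" is seen.
const bad = extractTokensLtr(`${prevTurn} ${currentQuery}`);
```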

* fix(analyst): preserve 2-char acronyms case-insensitively in keyword extraction

Previous guard (/^[A-Z]{2}$/) only matched uppercase input, so common
lowercase queries like 'us sanctions', 'uk energy', 'ai chip exports'
still dropped the key term before retrieval.

Added KNOWN_2CHAR_ACRONYMS set (us, uk, eu, un, ai) checked against the
lowercased token, so the preservation path triggers regardless of how the
user typed the query.
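The case-insensitive guard could be sketched as (function name illustrative):

```typescript
const KNOWN_2CHAR_ACRONYMS = new Set(["us", "uk", "eu", "un", "ai"]);

// Check the lowercased token against the set, so "US", "us", and "Us" all
// survive extraction; other 2-char tokens are still dropped as noise.
function keepToken(token: string): boolean {
  if (token.length >= 3) return true;
  return KNOWN_2CHAR_ACRONYMS.has(token.toLowerCase());
}
```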

* test(analyst): cover extractKeywords edge cases and retrieval priority ordering

Fills the coverage gap noted in review: existing tests only exercised
prompt text, leaving keyword extraction and retrieval assembly untested.

- Export extractKeywords() to make it unit-testable
- Fix emptyCtx/fullCtx fixtures to include relevantArticles field
- extractKeywords suite: stopword filtering, deduplication, 8-keyword cap,
  known 2-char acronyms (us/uk/eu/un/ai) case-insensitive, non-acronym
  2-char drop, empty-result path
- Retrieval priority suite: verifies current-turn pivot appears first in
  keyword list when query+prevTurn is combined, prior-turn backfills
  remaining slots, long prior turns cannot crowd out current-turn pivot
2026-04-04 15:11:50 +04:00
Elie Habib
46d69547ad feat(analyst): suggest Widget Creator for visual/chart queries (#2546)
* feat(analyst): suggest Widget Creator for visual/chart queries

When a user asks the Chat Analyst for something that could be visualised
(charts, price comparisons, dashboards), the backend detects that intent
via a keyword regex on the query and emits an SSE action event before
the first token. The frontend renders a "Create chart widget →" button
in the analyst bubble; clicking it opens the Widget Creator pre-filled
with the original query.

- api/chat-analyst.ts: VISUAL_INTENT_RE + buildActionEvents() +
  prependSseEvents() replaces prependSseEvent; action event emitted
  between meta and the first LLM delta
- ChatAnalystPanel.ts: ActionEvent interface, renderActionChip() method,
  action event handling in readStream()
- WidgetChatModal.ts: initialMessage? option pre-fills textarea
- panel-layout.ts: wm:open-widget-creator listener calls
  openWidgetChatModal with onComplete → addCustomWidget
- main.css: .chat-action-chip accent-coloured button style

🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.ai/claude-code) + Compound Engineering v2.49.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(analyst): chip reusable + extract action logic + add tests

P1: Remove { once: true } from action chip click handler so the
    Create chart widget button remains clickable after the modal
    is closed or fails preflight.

P2: Extract VISUAL_INTENT_RE and buildActionEvents into
    server/worldmonitor/intelligence/v1/chat-analyst-actions.ts
    (exported) and add 14 assertions to chat-analyst.test.mts:
    - visual queries emit suggest-widget action with correct fields
    - all five Quick Actions produce no chip
    - "chart" without a charting object (UN Charter, "chart a course") is
      not matched
    - regex is case-insensitive

* fix(chat-analyst): extend VISUAL_INTENT_RE to match intermediate subject nouns

Add optional one-word gap in chart/graph/plot verb arms so natural queries
like 'chart oil prices vs gold' and 'graph interest rates' are detected.
Add rates? and performance to graph/plot arms for consistency.
Add 3 new test cases covering the intermediate-noun pattern.
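The extended detection and action emission could be sketched like this (the regex and event fields are illustrative only — the real VISUAL_INTENT_RE lives in chat-analyst-actions.ts):

```typescript
// Illustrative pattern: a chart/graph/plot verb, an optional one-word gap
// (the intermediate subject noun), then a visualisable object.
const VISUAL_INTENT_RE = /\b(chart|graph|plot)\s+(?:\w+\s+)?(prices?|trends?|rates?)\b/i;

interface ActionEvent { type: "suggest-widget"; query: string }

function buildActionEvents(query: string): ActionEvent[] {
  return VISUAL_INTENT_RE.test(query) ? [{ type: "suggest-widget", query }] : [];
}

// Serialized as a named SSE event, emitted between `meta` and the first delta:
const frames = buildActionEvents("chart oil prices vs gold").map(
  (e) => `event: action\ndata: ${JSON.stringify(e)}\n\n`,
);
```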

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 17:01:21 +04:00
Elie Habib
cf5328f2d4 fix(llm): pin LLM edge functions to US/EU regions to prevent geo-block 403s (#2541)
OpenRouter returns 403 'This model is not available in your region' when
Vercel routes through edge nodes in regions where Google Gemini is blocked.
Pin chat-analyst, news, and intelligence edge functions to iad1/lhr1/fra1/sfo1.

Also improves error logging in callLlmReasoningStream to include model name
and full response body on non-2xx for easier future diagnosis.
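The pinning uses Vercel's per-function region config; a sketch (the region list is from the commit, the rest is the standard edge-function config shape):

```typescript
// In each affected edge function (e.g. api/chat-analyst.ts):
export const config = {
  runtime: "edge",
  // US/EU nodes only, so the upstream provider never sees a geo-blocked egress region.
  regions: ["iad1", "lhr1", "fra1", "sfo1"],
};
```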
2026-03-30 11:08:14 +04:00
Elie Habib
4435d43436 feat(analyst): WM Analyst — pro chat panel with streaming SSE (#2459)
* feat(analyst): add WM Analyst pro chat panel with streaming SSE

Pro-gated conversational AI panel that assembles live context from 9
Redis sources (news insights, risk scores, market implications, forecasts,
macro signals, prediction markets, stock/commodity quotes, country brief)
and streams LLM responses token-by-token via SSE.

- api/chat-analyst.ts: standalone Vercel edge function with isCallerPremium
  auth gate, history trimming, and text/event-stream SSE response
- server/.../chat-analyst-context.ts: parallel Redis assembly via
  Promise.allSettled with graceful degradation on partial failure
- server/.../chat-analyst-prompt.ts: domain-focused system prompt builder
  with geo/market/military/economic emphasis modes
- server/_shared/llm.ts: callLlmReasoningStream() streaming variant
  returning ReadableStream emitting delta/done/error SSE events
- src/components/ChatAnalystPanel.ts: Panel subclass with domain chips,
  quick actions, streaming indicator, export, and AbortController cleanup
- Wired into WEB_PREMIUM_PANELS + lazyPanel in panel-layout.ts
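The parallel-assembly pattern can be sketched as follows; source names, the return shape, and the degradation threshold here are illustrative, not the actual chat-analyst-context.ts API:

```typescript
type AnalystContext = { sections: Record<string, unknown>; degraded: boolean };

async function assembleAnalystContext(
  fetchers: Record<string, () => Promise<unknown>>,
): Promise<AnalystContext> {
  const names = Object.keys(fetchers);
  // Run every fetch in parallel; one rejection never fails the whole batch.
  const results = await Promise.allSettled(names.map((n) => fetchers[n]()));
  const sections: Record<string, unknown> = {};
  let failed = 0;
  results.forEach((r, i) => {
    if (r.status === "fulfilled" && r.value != null) sections[names[i]] = r.value;
    else failed++;
  });
  // Degrade gracefully instead of erroring when part of the context is missing.
  return { sections, degraded: failed > 4 };
}
```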

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: gitignore docs/plans and docs/brainstorms (local planning artifacts)

* refactor(analyst): rename panel from SIGINT Analyst to WM Analyst

* fix(chat-analyst): address security and correctness review findings

- Add VALID_DOMAINS allowlist for domainFocus to prevent prompt injection
- Sanitize history message content with sanitizeForPrompt()
- Reduce LLM timeout to 25s to stay within Vercel Edge wall-clock limit
- Surface degraded context as SSE event when >4 Redis sources fail
- Remove unused _domainFocus param from assembleAnalystContext
- Remove redundant .catch(() => null) inside Promise.allSettled calls
- Fix callLlmReasoningStream return type (always returns stream, not null)
- Remove dead sseError function (unreachable null-stream path)
- Add chat-analyst to apiKeyPanels so API key holders can access the panel
- Use requestAnimationFrame for scroll-to-bottom to prevent layout thrash
- Handle degraded SSE event in client with inline notice
- Fix orphaned history.push in !res.ok error path
- Use pushHistory() in incomplete stream path to avoid duplication

* feat(chat-analyst): GDELT headlines, domain filtering, source chips, preset UX

- Add GDELT live headlines as 10th context source (domain-aware topic,
  8s timeout, graceful fallback, parallel with Redis via Promise.allSettled)
- Add activeSources[] to AnalystContext — labels for non-empty data fields
- Restore domainFocus param in assembleAnalystContext for GDELT topic mapping
- Add DOMAIN_SECTIONS map: only inject relevant context sections per domain
  (market skips risk/world brief, military skips market sections, etc.)
- Replace always-inject-all approach — smaller prompts for focused queries
- Raise word limit 250→350 words; add bold headers + cite figures instruction
- Replace conditional degraded SSE event with always-present meta event:
  { meta: { sources: string[], degraded: bool } } as first SSE event
- Upgrade QUICK_ACTIONS to { label, icon, query } objects; chips show
  icon + short label instead of raw query text
- Add renderSourceChips(): source badge row above each analyst response,
  rendered from meta.sources before first token arrives

* fix(chat-analyst): address P1/P2 review findings

- Probability threshold: use prob > 1 (not prob * 100 > 1) so fractions
  like 0.75 render as 75% not 1%
- CHROME_UA: import from server/_shared/constants instead of local Chrome/124
- GDELT timeout: reduce 8s to 2.5s to avoid breaching Vercel edge stream limit
- GDELT headlines: use sanitizeForPrompt instead of sanitizeHeadline
  (GDELT is an external unauthenticated source, deserves full sanitization)
- Rate limiting: add checkRateLimit after isPremium gate to bound LLM cost
- LlmStreamOptions: omit provider field (silently ignored by callLlmReasoningStream)
- ChatAnalystPanel: append text nodes during streaming, render markdown only
  on finalize to eliminate per-token layout recalculations
- geoContext: document as Phase 2 / agent-accessible pending map wiring
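The threshold fix boils down to deciding on the raw value before scaling; a minimal sketch (function name illustrative):

```typescript
// Inputs may arrive as fractions (0.75) or already as percentages (75).
// Values above 1 are assumed to be percentages; fractions are scaled up.
function toPercent(prob: number): number {
  return prob > 1 ? prob : prob * 100;
}
```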

* test(chat-analyst): add 18 tests for buildAnalystSystemPrompt domain filtering and config alignment

* fix(llm-stream): abort propagation and mid-stream provider splice

P1 — callLlmReasoningStream now has a cancel() handler that sets a
streamClosed flag and aborts the active provider fetch via AbortController.
Accepts an optional signal (req.signal from the edge handler) wired to
the per-fetch controller so client disconnect cancels upstream work.
prependSseEvent propagates cancel() to its inner reader.

P2 — hasContent is now declared outside the try block so the catch path
can see it. If a provider has already emitted delta chunks and then throws
(mid-stream network error), the stream closes with {done:true} rather than
falling through to the next provider and splicing a second answer.
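The cancel-propagation pattern looks roughly like this sketch (the real prependSseEvent differs; only the shape of the fix is shown):

```typescript
// Wrap an inner SSE stream, emitting one synthetic event before its chunks.
function prependSseEvent(
  first: string,
  inner: ReadableStream<Uint8Array>,
): ReadableStream<Uint8Array> {
  const reader = inner.getReader();
  const enc = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(enc.encode(`data: ${first}\n\n`));
    },
    async pull(controller) {
      const { value, done } = await reader.read();
      if (done) controller.close();
      else controller.enqueue(value);
    },
    // P1: without this, a client disconnect never reaches the inner reader,
    // so the upstream provider fetch keeps running.
    cancel(reason) {
      return reader.cancel(reason);
    },
  });
}
```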

* fix(llm-stream): timeout scope and truncated-response signaling

Timeout regression: clearTimeout was firing immediately after fetch()
returned headers, so timeoutMs no longer bounded the streaming body read.
Moved it to after the while loop (normal completion) and to the !resp.ok
early-exit path; catch still clears it on throw.

Partial-success misreport: when a provider threw mid-stream after emitting
deltas, the catch emitted {done:true} which the client treated as clean
completion and committed the truncated answer to history. Now the stream
closes without a done event so readStream() returns 'incomplete'. The
panel appends a visible truncation note and skips pushHistory to avoid
poisoning the conversation context with a partial answer.

* fix(chat-analyst): source chip CSS and italic markdown rendering

Add .chat-source-chips / .chat-source-chip styles that were missing —
source labels (Brief, Risk, Forecasts, etc.) were rendering as unstyled
run-on text. Added .chat-source-chip--warn for the degraded indicator.

Add italic support to basicMarkdownToHtml() so *...* renders as <em>
instead of literal asterisks. Bold is processed first to avoid partial
matches on **...**. Fixes the "Response may be incomplete" and
"Response cut off" truncation messages which used italic markers.
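The ordering rule can be sketched in a few lines (the repo's basicMarkdownToHtml handles more than this):

```typescript
function basicMarkdownToHtml(text: string): string {
  return text
    .replace(/\*\*([^*]+)\*\*/g, "<strong>$1</strong>") // bold first…
    .replace(/\*([^*]+)\*/g, "<em>$1</em>"); // …so a lone * never splits a **pair**
}
```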

* fix(llm-stream): [DONE] sentinel now exits the outer read loop

break inside the inner for-lines loop only exited that loop; the outer
while continued reading until the TCP connection closed naturally. Added
providerDone flag so [DONE] breaks both loops immediately.
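The flag-based exit can be sketched as (reader/decoder handling is illustrative, not the repo's loop verbatim):

```typescript
// Drain an SSE body, stopping at the [DONE] sentinel instead of waiting
// for the connection to close naturally.
async function drainSse(
  reader: ReadableStreamDefaultReader<Uint8Array>,
): Promise<string[]> {
  const dec = new TextDecoder();
  const deltas: string[] = [];
  let providerDone = false;
  while (!providerDone) {
    const { value, done } = await reader.read();
    if (done) break;
    for (const line of dec.decode(value, { stream: true }).split("\n")) {
      const data = line.startsWith("data: ") ? line.slice(6) : null;
      if (data === "[DONE]") { providerDone = true; break; } // exits BOTH loops
      if (data) deltas.push(data);
    }
  }
  return deltas;
}
```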

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 10:44:29 +04:00