Commit Graph

25 Commits

Author SHA1 Message Date
Elie Habib
0e1714e559 fix(seed): write seed-meta when validateFn rejects empty data (#2273)
* feat(seed): switch economic calendar from Finnhub to FRED API

Finnhub /calendar/economic requires a $3500/mo premium subscription.
FRED (St. Louis Fed) provides official government-scheduled release
dates for free using the existing FRED_API_KEY already in Railway env.

Sources:
- Release 10: CPI (BLS)
- Release 50: Nonfarm Payrolls (BLS)
- Release 53: GDP (BEA)
- Release 54: PCE / Personal Income (BEA)
- Release 9:  Retail Sales (Census Bureau)
- Hardcoded:  FOMC rate decision dates (Fed, published annually)

FRED tracks the full year schedule in advance via
include_release_dates_with_no_data=true. No new API key needed.

* fix(panels): remove description blob from AI Market Implications; refresh every 3h

* fix(seed): write seed-meta even when validateFn rejects empty data

When a seed runs but finds no publishable data (e.g. no earnings in the
next 14-day window, no econ events scheduled), runSeed calls extendExistingTtl
which only extends keys that already exist. If seed-meta was never written
(first run or expired), health sees seedStale=true → STALE_SEED warn even
though the seeder is healthy.

Fix: call writeFreshnessMetadata(count=0) in the skipped path so health can
distinguish 'seeder ran, nothing to publish' from 'seeder stopped running'.

* fix(seed): add User-Agent to FRED fetch; make FOMC dates year-keyed not hardcoded

Greptile P2s from PR #2273:
- Missing User-Agent: CHROME_UA added to fetchFredReleaseDates per AGENTS.md
- FOMC_DATES_2026 constant would silently return empty FOMC list from Jan 2027;
  restructured as FOMC_DATES_BY_YEAR map, buildFomcEvents merges current + next
  year so there is always a lookahead window until next year's dates are added
2026-03-26 10:04:55 +04:00
Elie Habib
9b4a7f793f fix(fear-greed): route CBOE+CNN through residential proxy, update Chrome UA to 134 (#2191)
CBOE CDN (cdn.cboe.com) returns 403 and CNN dataviz returns 418 to Railway
datacenter IPs — both block non-residential server traffic. Route fetchCBOE()
and fetchCNN() through the OREF_PROXY_AUTH residential proxy (froxy.com) using
undici ProxyAgent. Falls back to native fetch when OREF_PROXY_AUTH is unset
(local dev). Also adds Referer headers and explicit HTTP status logging so
failures are visible in seed logs. Updates CHROME_UA from Chrome/120 to 134.
2026-03-24 12:52:31 +04:00
Elie Habib
e6bae4d7a8 feat(seeds): shared FX rate cache + BigMac WoW data quality guards (#2003)
* feat(seeds): shared FX rate cache + BigMac WoW guards

- Extract SHARED_FX_FALLBACKS to _seed-utils.mjs as single source of truth,
  eliminating duplicated FX fallback tables across seed-bigmac, seed-grocery-basket,
  and seed-fx-rates
- Add getSharedFxRates() / fetchYahooFxRates() to _seed-utils.mjs so all seeds
  share one Redis-cached rate set (shared:fx-rates:v1, 4h TTL) instead of each
  making ~46 independent Yahoo Finance calls per run
- Add seed-fx-rates.mjs: dedicated daily Railway cron that pre-warms the shared
  FX cache, saving ~90 Yahoo calls per weekly bigmac+grocery-basket cycle
- Add WoW minimum-age guard (6 days): prevents week-on-week display when previous
  snapshot is less than 6 days old (fixes -98.5% France WoW on first seed run)
- Add per-country WoW anomaly filter (+-20%): nulls suspicious entries and logs
  admin alert with country name and delta for Railway log monitoring
- Fix global WoW anomaly check to use unfiltered raw average so it can actually
  exceed +-20% (filtered average was mathematically bounded and never triggered)
- Add USD price sanity range guard ($1.50-$12.00): drops prices from bad scrapes
  before they reach Redis (would have caught the $470 France value)
- Move WOW_ANOMALY_THRESHOLD, MIN_WOW_AGE_MS, USD_MIN, USD_MAX to module scope

* fix(seed-fx): address PR review — TTL mismatch and partial write-back risk

- Extend shared:fx-rates:v1 TTL from 4h to 25h so cache stays warm
  between daily cron runs (with 1h drift buffer)
- Make getSharedFxRates() read-only: remove write-back on partial cache
  hit and on cache miss; only seed-fx-rates.mjs owns writes to this key,
  preventing a subset consumer from silently overwriting a fuller cache
2026-03-21 18:41:04 +04:00
Elie Habib
99a7793e99 feat(seed): learned routes cache for grocery basket — skip EXA on known-good URLs (#1981)
* feat(seed): add learned routes cache to grocery basket seed

Persists successful EXA/Firecrawl URL discoveries in Redis so subsequent
runs skip the expensive EXA search for known-good (country, item) pairs.

Strategy per item:
1. Direct fetch + matchPrice on learned URL (free)
2. Firecrawl on learned URL if step 1 fails (handles JS SPAs)
3. Full EXA search only when learned route fails or is absent
4. Saves newly discovered URL as learned route for next run

Safety guarantees matching the Codex review:
- isAllowedRouteHost() validates hostname against country.sites allowlist
  before both saving and replaying (prevents stored-SSRF)
- tryDirectFetch() applies CURRENCY_MIN + ITEM_USD_MAX bulk-price guards
  identical to the existing EXA and Firecrawl paths
- failsSinceSuccess >= 2 triggers true DEL (not TTL wait)
- SET/DEL conflict resolved: effectiveDeletes filters keys in updates;
  DELs sent before SETs in pipeline
- All operations non-fatal: pipeline failures log warnings, seed continues

New exports in _seed-utils.mjs: isAllowedRouteHost, bulkReadLearnedRoutes,
bulkWriteLearnedRoutes (1 pipeline read + 1 pipeline write per run).

BigMac deferred to Phase 2 (uses EXA summaries from aggregator pages).

Estimated savings: ~63 of 90 EXA calls skipped per run at 70% hit rate.

* test(seed): extract processItemRoute for testability; add 5 integration tests

- Move item-level decision tree into processItemRoute() in _seed-utils.mjs
  so it can be imported and unit-tested without triggering runSeed()
- seed-grocery-basket.mjs delegates to processItemRoute() with fetchViaExa
  callback containing the existing EXA+Firecrawl block
- 5 integration tests cover: learned-hit success (EXA skipped), learned fail
  + EXA replacement, fail x2 eviction, SSRF guard (bad host blocks direct
  fetch), EXA success with unlisted host (route not saved)
- Fix: move allowedHosts computation outside Promise.all (once per country)
- Fix: add [EXA->learned] log tag when new route is saved from EXA discovery
- All 21 seed-learned-routes tests pass

* fix(seed): strip path from allowedHosts entries before hostname comparison

grocery-basket.json contains "noon.com/saudi-en" for Saudi Arabia.
allowedHosts was built with only www. stripped, so the comparison
  hostname === 'noon.com/saudi-en'
was always false — noon.com routes for SA were rejected or evicted
every run, preventing the cache from ever stabilizing there.

Fix: split('/')[0] after stripping www., giving bare hostname.
Add regression test: path-bearing allowlist entry matches noon.com URL.
2026-03-21 12:43:56 +04:00
Elie Habib
2e16159bb6 feat(economic): WoW price tracking + weekly cadence for BigMac & Grocery panels (#1974)
* feat(economic): add WoW tracking and fix plumbing for bigmac/grocery-basket panels

Phase 1 — Fix Plumbing:
- Adjust CACHE_TTL to 10 days (864000s) for bigmac and grocery-basket seeds
- Align health.js SEED_META maxStaleMin to 10080 (7 days) for both
- Add grocery-basket and bigmac to seed-health.js SEED_DOMAINS with intervalMin: 5040
- Refactor publish.ts writeSnapshot to accept advanceSeedMeta param; only
  advance seed-meta when fresh data exists (overallFreshnessMin < 120)
- Add manual-fallback-only comment to seed-consumer-prices.mjs

Phase 2 — Week-over-Week Tracking:
- Add wow_pct field to BigMacCountryPrice and CountryBasket proto messages
- Add wow_avg_pct, wow_available, prev_fetched_at to both response protos
- Regenerate client/server TypeScript from updated protos
- Add readCurrentSnapshot() helper + WoW computation to seed-bigmac.mjs
  and seed-grocery-basket.mjs; write :prev key via extraKeys
- Update BigMacPanel.ts to show per-country WoW column and global avg summary
- Update GroceryBasketPanel.ts to show WoW badge on total row and basket avg summary
- Add .bm-wow-up, .bm-wow-down, .bm-wow-summary, .gb-wow CSS classes
- Fix server handlers to include new WoW fields in fallback responses

* fix(economic): guard :prev extraKey against null on first seed run; eliminate double freshness query in publish.ts

* refactor(economic): address code review findings from PR #1974

- Extract readSeedSnapshot() into _seed-utils.mjs (DRY: was duplicated
  verbatim in seed-bigmac and seed-grocery-basket)
- Add FRESH_DATA_THRESHOLD_MIN constant in publish.ts (replace magic 120)
- Fix seed-consumer-prices.mjs contradictory JSDoc (remove stale
  "Deployed as: Railway cron service" line that contradicted manual-only warning)
- Add i18n keys panels.bigmacWow / panels.bigmacCountry to en.json
- Replace hardcoded "WoW" / "Country" with t() calls in BigMacPanel
- Replace IIFE-in-ternary pattern with plain if blocks in BigMacPanel
  and GroceryBasketPanel (P2/P3 from code review)

* fix(publish): gate advanceSeedMeta on any-retailer freshness, not average

overallFreshnessMin is the arithmetic mean across all retailers, so with
1 fresh + 2 stale retailers the average can exceed 120 min and suppress
seed-meta advancement even while fresh data is being published.

Use retailers.some(r => r.freshnessMin < 120) to correctly implement
"at least one retailer scraped within the last 2 hours."
2026-03-21 10:56:48 +04:00
Elie Habib
3670716daa feat(forecast): add market transmission state (#1971) 2026-03-21 09:48:38 +04:00
Elie Habib
b52916b7e3 fix(health): adjust gdeltIntel maxStaleMin for 6h cron; warn on expired-key EXPIRE no-op (#1853)
* fix(health): adjust gdeltIntel maxStaleMin for 6h cron; fix silent EXPIRE no-op on expired keys

- gdeltIntel maxStaleMin: 150 → 420 (6h cron + 1h grace). The 150 threshold was
  calibrated for the old 2h cron — with 6h intervals it fires STALE throughout
  most of each cycle, masking the signal entirely.

- _seed-utils extendExistingTtl: EXPIRE returns 0 (no-op) on expired/missing keys,
  but the log always said "Extended TTL on N key(s)" regardless. Added per-result
  checking: keys that returned 0 now emit a WARNING so the death-spiral condition
  (validate fails + key expired + EXPIRE is silently a no-op) is visible in logs
  rather than silently passing as if TTL was extended.

* fix(seed-health): align gdelt-intel intervalMin to 210 (420min maxStaleMin / 2)

Codex flagged mismatch: health.js allows 420min before flagging gdelt-intel
stale, but seed-health.js still used intervalMin: 150 (flags after 300min).
Ops tooling monitoring seed-health would generate spurious alerts for most
of each 6h cron cycle. Align to 210min per the maxStaleMin/2 convention.
2026-03-19 08:33:14 +04:00
Elie Habib
9e58365587 fix(seeds): extend seed-meta TTL alongside data keys on fetch failure (#1724)
When upstream APIs fail and seeds extend existing data key TTLs, the
seed-meta key was left untouched. Health checks use seed-meta fetchedAt
to determine staleness, so preserved data still triggered STALE_SEED
warnings even though the data was valid.

Now all TTL extension paths include the corresponding seed-meta key:
- _seed-utils.mjs runSeed() (fetch failure + validation skip)
- fetch-gpsjam.mjs (Wingbits 500 fallback)
- seed-airport-delays.mjs (FAA fetch failure)
- seed-military-flights.mjs (OpenSky fetch failure)
- seed-service-statuses.mjs (RPC fetch failure)
2026-03-17 06:35:12 +04:00
Elie Habib
fbb8f15943 fix(seeds): skip transient redis lock timeouts (#1714)
* fix(seeds): skip transient redis lock timeouts

* docs(seeds): clarify transient redis error matching

* test: expand transient redis error coverage

Add tests for ECONNRESET, DNS failure (EAI_AGAIN), ETIMEDOUT, and
negative cases (HTTP 403, payload size) to confirm isTransientRedisError
only matches network-level failures, not app-level Redis errors.
2026-03-16 11:57:52 +04:00
Elie Habib
63fe04d78f fix(seeds): extend existing cache TTL on validation failure (#1705)
When a seed fetches data but validation rejects it (e.g. FIRMS API
returns 0 fires due to timeout), extend the existing key's TTL
instead of letting it expire. Old data survives until the next
successful fetch. Applies to all seeds using runSeed().
2026-03-16 08:10:42 +04:00
Elie Habib
39931456a1 feat(forecast): add structured scenario pipeline and trace export (#1646)
* feat(forecast): add AI Forecasts prediction module (Pro-tier)

MiroFish-inspired prediction engine that generates structured forecasts
across 6 domains (conflict, market, supply chain, political, military,
infrastructure) using existing WorldMonitor data streams.

- Proto definitions for ForecastService with GetForecasts RPC
- Dedicated seed script (seed-forecasts.mjs) with 6 domain detectors,
  cross-domain cascade resolver, prediction market calibration, and
  trend detection via prior snapshot comparison
- Premium-gated RPC handler (PREMIUM_RPC_PATHS enforcement)
- Lazy-loaded ForecastPanel with domain filters, probability bars,
  trend arrows, signal evidence, and cascade links
- Health monitoring integration (seed-meta freshness tracking)
- Refresh scheduler with API key guard

* test(forecast): add 47 unit tests for forecast detectors and utilities

Covers forecastId, normalize, resolveCascades, calibrateWithMarkets,
computeTrends, and smoke tests for all 6 domain detectors. Exports
testable functions from seed script with direct-run guard.

* fix(forecast): domain mismatch 'infra' vs 'infrastructure', add panel category

- Seed script used 'infra' but ForecastPanel filtered on 'infrastructure',
  causing Infra tab to show zero results
- Added 'forecast' to intelligence category in PANEL_CATEGORY_MAP

* fix(forecast): move CSS to one-time injection, improve type safety

- P2: Move style block from setContent to one-time document.head injection
  to prevent CSS accumulation on repeated renders
- P3: Replace +toFixed(3) with Math.round for readability in seed script
- P3: Use Forecast type instead of any[] in RPC handler filter

* fix(forecast): handle sebuf proto data shapes from Redis

Detectors now normalize CII scores from server-side proto format
(combinedScore, TREND_DIRECTION_RISING, region) to uniform shape.
Outage severity handles proto enum format (SEVERITY_LEVEL_HIGH).
Added confidence floor of 0.3 for single-source predictions.

Verified against live Redis: 2 predictions generated (Iran infra
shutdown, IL political instability).

* feat(forecast): unlock AI Forecasts on web, lock desktop only (trial)

- Remove forecast RPC from PREMIUM_RPC_PATHS (web access is free)
- Panel locked on desktop only (same as oref-sirens/telegram-intel)
- Remove API key guards from data-loader and refresh scheduler
- Web users get full access during trial period

* chore: regenerate proto types with make generate

Re-ran make generate after rebasing on main. Plugin v0.7.0 dropped
@ts-nocheck from output, added it back to all 50 generated files.
Fixed 4 type errors from proto codegen changes:
- MarketSource enum -> string union type
- TemporalAnomalyProto -> TemporalAnomaly rename
- webcam lastUpdated number -> string

* chore: add proto freshness check to pre-push hook

Runs make generate before push and compares checksums of generated files.
If proto types are stale, blocks push with instructions to regenerate.
Skips gracefully if buf CLI is not installed.

* fix(forecast): use chokepoints v4 key, include ciiContribution in unrest

- P1: Switch chokepoints input from stale v2 to active v4 Redis key,
  matching bootstrap.js and cache-keys.ts
- P2: Add ciiContribution to unrest component fallback chain in
  normalizeCiiEntry so political detector reads the correct sebuf field

* feat(forecast): Phase 2 LLM scenario enrichment + confidence model

MiroFish-inspired enhancements:
- LLM scenario narratives via Groq/OpenRouter (narrative-only, no numeric
  adjustment). Evidence-grounded prompts with mandatory signal citation
  and few-shot examples from MiroFish's SECTION_SYSTEM_PROMPT_TEMPLATE.
- Top-4 predictions batched into single LLM call for cost efficiency.
- News context from newsInsights attached to all predictions for LLM
  prompt grounding (NOT in signals, cannot affect confidence).
- Deterministic confidence model: source diversity via SIGNAL_TO_SOURCE
  mapping (deduplicates cii+cii_delta, theater+indicators) + calibration
  agreement from prediction market drift. Floor 0.2, ceiling 1.0.
- Output validation: rejects scenarios without signal references.
- Truncated JSON repair for small model output.
- Structured JSON logging for LLM calls.
- Redis cache for LLM scenarios (1h TTL).
- 23 new tests (70 total), all passing.
- Live-tested: OpenRouter gemini-2.5-flash produces evidence-grounded
  scenario narratives from real WorldMonitor data.

* feat(forecast): Phase 3 multi-perspective scenarios, projections, data-driven cascades

MiroFish-inspired enhancements:
- Multi-perspective LLM analysis: top-2 predictions get strategic,
  regional, and contrarian viewpoints via combined LLM call
- Probability projections: domain-specific decay curves (h24/d7/d30)
  anchored to timeHorizon so probability equals projections[timeHorizon]
- Data-driven cascade rules: moved from hardcoded array to JSON config
  (scripts/data/cascade-rules.json) with schema validation, named
  predicate evaluators, unknown key rejection, and fallback to defaults
- 4 new cascade paths: infrastructure->supply_chain, infrastructure->market
  (both requiresSeverity:total), conflict->political, political->market
- Proto: added Perspectives and Projections messages to Forecast
- ForecastPanel: renders projections row and conditional perspectives toggle
- 89 tests (19 new), all passing
- Live-tested: OpenRouter produces perspectives from real data

* feat(forecast): Phase 4 data utilization + entity graph

Fixes data gaps that prevented 4 of 6 detectors from firing:
- Input normalizers: chokepoint v4 shape + GPS hexes-to-zones mapping
- Chokepoint warm-ping (production-only, requires WM_API_BASE_URL)
- Lowered CII conflict threshold from 70 to 60, gated on level=high|critical

4 new standalone detectors:
- UCDP conflict zones (10+ events per country)
- Cyber threat concentration (5+ threats per country)
- GPS jamming in maritime shipping zones (5 regions)
- Prediction markets as signals (60-90% probability markets)

Entity-relationship graph (file-based, 38 nodes):
- Countries, theaters, commodities, chokepoints, alliances
- Alias table resolves both ISO codes and display names
- Graph cascade discovery links predictions across entities

Result: 51 predictions (up from 1-2), spanning conflict, infrastructure,
and supply chain domains. 112 tests, all passing.

* fix(forecast): redis cache format, signal source mapping, type safety

Fresh-eyes audit fixes:
- BUG: redisSet used wrong Upstash API format (POST body with {value,ex}
  instead of command array ['SET',key,value,'EX',ttl]). LLM cache writes
  were silently failing, causing fresh LLM calls every run.
- BUG: prediction_market signal type missing from SIGNAL_TO_SOURCE,
  inflating confidence for market-derived predictions.
- CLEANUP: Remove unnecessary (f as any) casts in ForecastPanel since
  generated Forecast type already has projections/perspectives fields.
- CLEANUP: Bump health maxStaleMin from 60 to 90 to avoid false STALE
  alerts when LLM calls add latency to seed runs.

* feat(forecast): headline-entity matching with news corroboration signals

Uses entity graph aliases to match headlines to predictions by
country/theater (excludes commodity/infrastructure nodes to prevent
false positives). Predictions with matching headlines get a
news_corroboration signal visible in the panel.

Also fixes buildUserPrompt to merge unique headlines from ALL
predictions in the LLM batch (was only reading preds[0].newsContext).

Live-tested: 13 of 51 predictions now have corroborating headlines
(Iran, Israel, Syria, Ukraine, etc). 116 tests, all passing.

* feat(forecast): add country-codes.json for headline-entity matching

56 countries with ISO codes, full names, and scoring keywords (extracted
from src/config/countries.ts + UCDP-relevant additions). Used by
attachNewsContext for richer headline matching via getSearchTermsForRegion
which combines country-codes + entity graph + keyword aliases.

14/57 predictions now have news corroboration (limited by headline
coverage, not matching quality: only 8 headlines currently available).

* feat(forecast): read 300 headlines from news digest instead of 8

Read news:digest:v1:full:en (300 headlines across 16 categories) instead
of just news:insights:v1 topStories (8 headlines). Fallback to topStories
if digest is unavailable.

Result: news corroboration jumped from 25% to 64% (38/59 predictions).

* fix(forecast): handle parenthetical country names in headline matching

Strip suffixes like '(Zaire)', '(Burma)', '(Soviet Union)' from UCDP
region names before matching against country-codes.json. Also use
includes() for reverse name lookup to catch partial matches.

Corroboration: 64% -> 69% (41/59). Remaining 18 unmatched are countries
with no current English-language news coverage.

* fix(forecast): cache validated LLM output, add digest test, log cache errors

Fresh-eyes audit fixes:
- Combined LLM cache now stores only validated items (was caching raw
  unvalidated output, serving potentially invalid scenarios on cache hit)
- redisSet logs warnings on failure (was silently swallowing all errors)
- Added digest-based test for attachNewsContext (primary path was untested)
- Fixed test arity: attachNewsContext(preds, news, digest) with 3 params

* fix(forecast): remove dead confidenceFromSources, reduce warm-ping timeout

- P2: Remove confidenceFromSources (dead code, computeConfidence overwrites
  all initial confidence values). Inline the formula in original detectors.
- P3: Reduce warm-ping timeout from 30s to 15s (non-critical step)
- P3: Add trial status comment on forecast panel config

* fix(forecast): resolve ISO codes to country names, fix market detector, safe pre-push

P1 fixes from code review:
- CII ISO codes (IL, IR) now resolved to full country names (Israel, Iran)
  via country-codes.json. Prevents substring false positives (IL matching
  Chile) in event correlation. Uses word-boundary regex for matching.
- Market detector CII-to-theater mapping now uses entity graph traversal
  instead of broken theater-name substring matching. Iran correctly maps
  to Middle East theater via graph links.
- Pre-push hook no longer runs destructive git checkout on proto freshness
  failure. Reports mismatch and exits without modifying worktree.

* feat(forecast): add structured scenario pipeline and trace export

* fix(forecast): hydrate bootstrap and trim generated drift

* fix(forecast): keep required supply-chain contract updates

* fix(ci): add forecasts to cache-keys registry and regenerate proto

Add forecasts entry to BOOTSTRAP_CACHE_KEYS and BOOTSTRAP_TIERS in
cache-keys.ts to match api/bootstrap.js. Regenerate SupplyChain proto
to fix duplicate TransitDayCount and add riskSummary/riskReportAction.
2026-03-15 15:57:22 +04:00
Elie Habib
ac9e3c8af2 refactor(llm): consolidate provider chain to single source of truth (#1640)
* fix(relay): add LLM fallback chain to ais-relay classify

Replace single Groq-only LLM call with provider fallback chain
(Groq → OpenRouter → Ollama) matching seed-insights.mjs pattern.
If Groq fails or is unavailable, classify falls through to the
next configured provider automatically.

* refactor(llm): consolidate provider chain to single source of truth

- Fix OpenRouter model: openrouter/free → google/gemini-2.5-flash in canonical llm.ts
- Migrate 4 intelligence handlers (classify-event, batch-classify, deduct-situation,
  get-country-intel-brief) from hardcoded Groq-only to callLlm() with full
  ollama → groq → openrouter fallback chain
- Remove duplicate getProviderCredentials from news/v1/_shared.ts, re-export canonical
- Remove orphaned GROQ_API_URL/GROQ_MODEL from intelligence/v1/_shared.ts
- Reorder script provider chains (ais-relay.cjs, seed-insights.mjs) to canonical
  ollama → groq → openrouter order
- Net -161 lines: eliminated duplicated provider logic across 9 files

* fix: eliminate double JSON parse in classify-event, throw on runSeed verification failure

* fix(tests): add llm module alias to country-intel-brief test fixture

* fix: preserve generic LLM_API_* fallback, add retry to seed verification

- Add 'generic' provider to callLlm() chain for LLM_API_URL/LLM_API_KEY/LLM_MODEL
  (preserves existing OpenAI-compatible endpoint contract)
- Change seed verification to warn-only with 1 retry instead of fatal throw
  (write already succeeded, transient read failure shouldn't fail the job)
- Update docs to reflect new provider fallback chain
2026-03-15 11:44:42 +04:00
Elie Habib
4008f56254 fix: log fetch error cause in seed retry/FATAL handlers (#1638)
* test: rewrite transit chart test as structural contract verification

Replace fragile source-string extraction + new Function() compilation
with structural pattern checks on the source code. Tests verify:
- render() clears chart before content change
- clearTransitChart() cancels timer, disconnects observer, destroys chart
- MutationObserver setup for DOM readiness detection
- Fallback timer for no-op renders (100-500ms range)
- Both callbacks (observer + timer) clean up each other
- Tab switch and collapse clear chart state
- Mount function guards against missing element/data

Replaces PR #1634's approach which was brittle (method body extraction,
TypeScript cast stripping, sandboxed execution).

* fix: log fetch error cause in seed retry and FATAL handlers

Node 20 fetch() throws TypeError('fetch failed') with the real error
hidden in err.cause (DNS, TLS, timeout). The current logging only shows
'fetch failed' which is useless for diagnosis. Now logs err.cause.message
in both withRetry() retries and FATAL catch blocks.
2026-03-15 11:09:34 +04:00
Elie Habib
f209c11713 fix(seeds): rethrow non-fetch failures, separate publish errors (#1606)
* fix(seeds): rethrow non-fetch failures in runSeed()

Split runSeed() into two phases so only upstream fetch errors get
the graceful TTL-extension path. Redis publish, seed-meta, and
verification failures now rethrow (exit 1) so monitoring catches them.

* fix(seeds): separate fetch from publish errors in standalone scripts

Split seed-airport-delays, seed-military-flights, and
seed-service-statuses into two phases matching runSeed() pattern:
- Phase 1: upstream fetch errors are graceful (extend TTL, exit 0)
- Phase 2: Redis publish/verify errors propagate (exit 1)

* fix(seeds): make Redis SET throw on failure so publish errors propagate

Local redisSet() returned false instead of throwing, silently masking
Redis write failures. writeExtraKey() also warned instead of throwing.
Both now throw on non-OK responses, ensuring Phase 2 catch fires.

* fix(seed): treat empty Redis key after successful RPC as publish failure

When cachedFetchJson() silently swallows a Redis write failure, the
warm-ping script now throws instead of warning, reaching the outer
catch handler (exit 1) so monitoring detects the issue.
2026-03-15 01:30:54 +04:00
Elie Habib
485d416065 feat(seeds): Railway seed scripts for all unseeded Vercel RPC endpoints (#1599)
* feat(seeds): add Railway seed scripts for economic and trade endpoints

Two new seed scripts to eliminate Vercel edge external API calls:

seed-economy.mjs:
- EIA energy prices (WTI, Brent) -> economic:energy:v1:all
- EIA energy capacity (Solar, Wind, Coal) -> economic:capacity:v1:COL,SUN,WND:20
- FRED series (10 series) -> economic:fred:v1:<id>:120
- Macro signals (Yahoo, Alternative.me, Mempool) -> economic:macro-signals:v1

seed-supply-chain-trade.mjs:
- Shipping rates (FRED) -> supply_chain:shipping:v2
- Trade barriers (WTO tariff gap) -> trade:barriers:v1:tariff-gap:50
- Trade restrictions (WTO MFN overview) -> trade:restrictions:v1:tariff-overview:50
- Trade flows (WTO, 15 major reporters) -> trade:flows:v1:<reporter>:000:10
- Tariff trends (WTO, 15 major reporters) -> trade:tariffs:v1:<reporter>:all:10

Cache keys match handler patterns exactly so cachedFetchJson finds
pre-seeded data and avoids live external API calls from Vercel edge.

* feat(seeds): add seed-aviation.mjs for airport ops and aviation news

Seeds 2 aviation endpoints with predictable default params:
- getAirportOpsSummary (AviationStack + NOTAM) -> aviation:ops-summary:v1:CDG,ESB,FRA,IST,LHR,SAW
- listAviationNews (9 RSS feeds, 24h window) -> aviation:news::24:v1

NOT seeded (inherently on-demand, user-specific inputs):
- getFlightStatus: specific flight number lookup
- trackAircraft: bounding-box or icao24 queries
- listAirportFlights: arbitrary airport+direction+limit combos
- getCarrierOps: depends on listAirportFlights with variable params

* feat(seeds): add seed-conflict-intel.mjs for ACLED, HAPI, and PizzINT

Seeds 3 conflict/intelligence endpoints with predictable default params:
- listAcledEvents (all countries, last 30 days) -> conflict:acled:v1:all:0:0
- getHumanitarianSummary (20 top conflict countries) -> conflict:humanitarian:v1:<CC>
- getPizzintStatus (base + GDELT variants) -> intel:pizzint:v1:base, intel:pizzint:v1:gdelt

NOT seeded (inherently on-demand, LLM or user-specific inputs):
- classifyEvent: per-headline LLM classification
- deductSituation: per-query LLM deduction
- getCountryIntelBrief: per-country LLM brief with context hash
- getCountryFacts: per-country REST Countries + Wikidata + Wikipedia
- searchGdeltDocuments: per-query GDELT search

Requires: ACLED_EMAIL, ACLED_KEY, UPSTASH_REDIS_REST_URL/TOKEN

* feat(seeds): add seed-research.mjs for arXiv, HN, tech events, trending repos

Seeds 4 research endpoints:
- listArxivPapers (cs.AI, cs.CL, cs.CR) -> research:arxiv:v1:<cat>::50
- listHackernewsItems (top, best feeds) -> research:hackernews:v1:<feed>:30
- listTechEvents (Techmeme ICS + dev.events RSS) -> research:tech-events:v1
- listTrendingRepos (python, javascript, typescript) -> research:trending:v1:<lang>:daily:50

Tech events key is also seeded by the relay, this script provides backup
hydration and ensures the key is warm even if relay hasn't run yet.

Requires: UPSTASH_REDIS_REST_URL/TOKEN

* feat(seeds): add seed-military-maritime-news.mjs for USNI and nav warnings

Seeds 2 endpoints with predictable default params:
- USNI Fleet Report (WordPress JSON API) -> usni-fleet:sebuf:v1 + stale backup
- Navigational Warnings (NGA broadcast, all areas) -> maritime:navwarnings:v1:all

NOT seeded (inherently on-demand):
- getAircraftDetails/batch: per-icao24 Wingbits lookup
- listMilitaryFlights: bounding-box query (quantized 1-degree grid)
- getVesselSnapshot: in-memory cache, reads from relay /ais-snapshot
- listFeedDigest: per-feed-URL RSS caching (hundreds of feeds, relay proxied)
- summarizeArticle: per-article LLM summarization

Requires: UPSTASH_REDIS_REST_URL/TOKEN

* feat(seeds): add seed-infra.mjs warm-ping for service statuses and cable health

Uses warm-ping pattern (calls Vercel RPC from Railway) because:
- list-service-statuses: 30 status page parsers with 8 custom formats
- get-cable-health: NGA text analysis with cable name matching + proximity
Replicating this logic in a standalone script is fragile and duplicative.

NOT seeded (on-demand):
- search-imagery: per-bbox/datetime STAC query
- get-giving-summary: hardcoded baselines, no external fetches
- get-webcam-image: per-webcamId Windy API lookup

* fix(seeds): move secondary key writes before process.exit, fix data shapes

Critical bugs found in code review:

1. runSeed() calls process.exit(0) after primary key write, so .then()
   callbacks were dead code. All secondary keys (FRED, macro signals,
   trade data, HAPI summaries, pizzint, HN, trending, etc.) were NEVER
   written. Fix: move writeExtraKey calls inside fetchAll() before return.

2. FRED cache key used :120 suffix but handler default is :0 (req.limit||0).
   Fixed to :0 so seed matches handler cache key for default requests.

3. USNI and nav warnings seed parsers produced wrong data shapes vs handler
   (different field names, missing fields). Converted to warm-ping pattern
   (like seed-infra.mjs) to avoid shape divergence.

* fix(seeds): reduce GDELT 429 rate limiting in seed-gdelt-intel

Problems from logs: every topic fetch hits 429, runs take 3-5min,
4th run failed fatally after 12min of cascading retries.

Fixes:
- Increase inter-topic delay: 12s -> 20s (GDELT needs longer cooldown)
- Increase initial backoff: 10s -> 20s, with 15s increments per retry
- Graceful degradation: exhausted retries return empty topic instead of
  throwing (prevents withRetry from restarting ALL topics from scratch)
- Align TTL with health.js: 3600s -> 7200s (matches maxStaleMin:120)
- Validation allows partial success (3/6 topics minimum)

Cron interval should also be increased from 30min to 2h on Railway
to match the new 2h TTL.

* fix(seeds): 4 bugs from review - ACLED auth, NOTAM key, infra precedence, curated events

P1: ACLED auth used wrong endpoint (api/acled/token) and env vars (ACLED_KEY).
Fixed to match server/acled-auth.ts: ACLED_EMAIL+ACLED_PASSWORD via /oauth/token,
with ACLED_ACCESS_TOKEN static fallback.

P1: Aviation NOTAM key was aviation:notam-closures:v1, handler reads
aviation:notam:closures:v2. Fixed key to match _shared.ts.

P2: Infra warm-ping had operator precedence bug in nullish coalescing:
(a ?? b) ? c : d instead of a ?? (b ? c : d). Added parens.

P2: Research seed missed curated conferences that the handler appends
(CURATED_EVENTS in list-tech-events.ts). Added same curated events so
seeded data matches what the handler would produce.

* fix(seeds): add seed-meta freshness metadata for all secondary keys

Added writeExtraKeyWithMeta() to _seed-utils.mjs that writes both the
data key and a seed-meta:<key> freshness metadata entry. All secondary
key writes in seed scripts now use this helper so health.js can track
freshness for: energy capacity, FRED series, macro signals, trade
barriers/restrictions/flows/tariffs, aviation news, HAPI summaries,
PizzINT, arXiv categories, HN feeds, tech events, trending repos.

Previously only the primary key per script got seed-meta (via runSeed),
leaving secondary keys operationally invisible to health monitoring.

* fix(seeds): align seed-meta keys with health.js conventions

P1: writeExtraKeyWithMeta wrote seed-meta:<full-cache-key> (e.g.,
seed-meta:economic:macro-signals:v1), but health.js expects normalized
names without version suffixes (seed-meta:economic:macro-signals).
Fixed by stripping trailing :v\d+ from key. Added metaKeyOverride
param for cases needing explicit control.

P1: shipping seed used runSeed('supply-chain', 'shipping-trade', ...)
producing seed-meta:supply-chain:shipping-trade, but health.js expects
seed-meta:supply_chain:shipping. Fixed domain/resource to match.

* fix(seeds): only write seed-meta after successful data key write

writeExtraKey() now returns false on failure. writeExtraKeyWithMeta()
skips seed-meta write when the data write fails, preventing false-positive
health reports for keys like macro-signals and tech-events.
2026-03-15 00:37:31 +04:00
Elie Habib
19ee1f38e4 fix(seeds): extend TTL on stale data instead of crashing on fetch errors (#1600)
* fix(seeds): extend TTL on stale data instead of crashing on fetch errors

Seed scripts crashed with process.exit(1) when upstream APIs returned
errors (e.g., Wingbits 401), causing Redis keys to expire and panels
to lose data. Now all seeds gracefully extend TTL on existing keys and
exit 0, keeping stale data alive until the API recovers.

- Add shared extendExistingTtl() helper to _seed-utils.mjs
- Update runSeed() catch block (fixes 24 scripts using it)
- Fix fetch-gpsjam.mjs, seed-airport-delays.mjs,
  seed-military-flights.mjs, seed-service-statuses.mjs

* fix(seeds): preserve per-key TTLs when extending stale military data

THEATER_POSTURE_BACKUP_KEY has a 7-day TTL (604800s) but was being
extended with STALE_TTL (86400s), shortening it from 7 days to 1 day
during upstream outages. Now each key group gets its original TTL.
2026-03-14 23:42:30 +04:00
Elie Habib
fe67111dc9 feat: harness engineering P0 - linting, testing, architecture docs (#1587)
* feat: harness engineering P0 - linting, testing, architecture docs

Add foundational infrastructure for agent-first development:

- AGENTS.md: agent entry point with progressive disclosure to deeper docs
- ARCHITECTURE.md: 12-section system reference with source-file refs and ownership rule
- Biome 2.4.7 linter with project-tuned rules, CI workflow (lint-code.yml)
- Architectural boundary lint enforcing forward-only dependency direction (lint-boundaries.mjs)
- Unit test CI workflow (test.yml), all 1083 tests passing
- Fixed 9 pre-existing test failures (bootstrap sync, deploy-config headers, globe parity, redis mocks, geometry URL, import.meta.env null safety)
- Fixed 12 architectural boundary violations (types moved to proper layers)
- Added 3 missing cache tier entries in gateway.ts
- Synced cache-keys.ts with bootstrap.js
- Renamed docs/architecture.mdx to "Design Philosophy" with cross-references
- Deprecated legacy docs/Docs_To_Review/ARCHITECTURE.md
- Harness engineering roadmap tracking doc

* fix: address PR review feedback on harness-engineering-p0

- countries-geojson.test.mjs: skip gracefully when CDN unreachable
  instead of failing CI on network issues
- country-geometry-overrides.test.mts: relax timing assertion
  (250ms -> 2000ms) for constrained CI environments
- lint-boundaries.mjs: implement the documented api/ boundary check
  (was documented but missing, causing false green)

* fix(lint): scan api/ .ts files in boundary check

The api/ boundary check only scanned .js/.mjs files, missing the 25
sebuf RPC .ts edge functions. Now scans .ts files with correct rules:
- Legacy .js: fully self-contained (no server/ or src/ imports)
- RPC .ts: may import server/ and src/generated/ (bundled at deploy),
  but blocks imports from src/ application code

* fix(lint): detect import() type expressions in boundary lint

- Move AppContext back to app/app-context.ts (aggregate type that
  references components/services/utils belongs at the top, not types/)
- Move HappyContentCategory and TechHQ to types/ (simple enums/interfaces)
- Boundary lint now catches import('@/layer') expressions, not just
  from '@/layer' imports
- correlation-engine imports of AppContext marked boundary-ignore
  (type-only imports of top-level aggregate)
2026-03-14 21:29:21 +04:00
Elie Habib
db6a4a2763 feat(correlation): server-side correlation engine seed + bootstrap hydration (#1571)
* feat(correlation): server-side correlation engine seed + bootstrap hydration

Move correlation card computation from client-side (per-browser, 10-30s delay)
to server-side (Railway cron, instant via bootstrap). Seed script reads 8 Redis
keys, runs 4 adapter signal collectors (military, escalation, economic, disaster),
clusters/scores/generates cards, writes to Redis with 10min TTL.

- New: scripts/seed-correlation.mjs (pure JS port of correlation engine)
- bootstrap.js: add correlationCards to FAST_KEYS tier
- health.js + seed-health.js: register for monitoring (maxStaleMin: 15)
- CorrelationPanel: consume bootstrap on construction, show "Analyzing..." only
  after live engine has run (not for bootstrap-only cards)
- _seed-utils.mjs: support opts.recordCount override (function or number)

* fix(correlation): stale timestamp fallback + coordinate-based country resolution

P1: news stories lacked per-story pubDate, causing Date.now() fallback on
every seed run. Now _clustering.mjs propagates pubDate through to
enrichedStories, and seed-correlation reads s.pubDate then generatedAt.

P2: normalizeToCode dropped signals with unparseable country names.
Added centroid-based coordinate fallback (haversine nearest-match within
800km) matching the live engine's getCountryAtCoordinates behavior.

* fix(correlation): add 11 missing country centroids to coordinate fallback

CI, CR, CV, CY, GA, IS, LA, SZ, TL, TT, XK were in the normalization
maps but missing from COUNTRY_CENTROIDS, causing coordinate-only signals
in those countries to be misclassified or dropped during bootstrap.

* fix(correlation): align protest/outage field names with actual Redis schema

Codex review P1 findings: seed-correlation read wrong field names from
Redis data.

Protests (unrest:events:v1): p.time -> p.occurredAt, p.lat/lon ->
p.location.latitude/longitude, severity enum SEVERITY_LEVEL_* mapping.

Outages (infra:outages:v1): o.pubDate -> o.detectedAt, o.lat/lon ->
o.location.latitude/longitude, severity enum OUTAGE_SEVERITY_* mapping.

Both escalation and disaster adapters updated. Old field names kept as
fallbacks for data shape compatibility.
2026-03-14 15:07:30 +04:00
Elie Habib
760c129c71 fix(seed): SyntaxError from mixing || and ?? operators without parens (#1558)
Mixing || and ?? in the same expression without explicit grouping is
a JS syntax error. This broke ALL Railway seed scripts after #1556.

Refactored to use ?? throughout with explicit Array.isArray guard so
non-topic seeds correctly fall through to their own length checks.
2026-03-14 10:16:59 +04:00
Elie Habib
e0bf4f9bd2 feat: seed GDELT intelligence topics to Redis (#1556)
* feat: seed GDELT intelligence topics to Redis with bootstrap hydration

Add standalone seed script that pre-populates all 6 Live Intelligence
topics (military, cyber, nuclear, sanctions, intelligence, maritime)
from the GDELT Doc API into Redis. Frontend consumes bootstrap data
lazily via the service layer, falling back to RPC if unavailable.

- scripts/seed-gdelt-intel.mjs: new seed script with per-topic 429 retry
- api/bootstrap.js: register gdeltIntel in FAST_KEYS
- api/health.js: register in BOOTSTRAP_KEYS + SEED_META + dataSize
- api/seed-health.js: register in SEED_DOMAINS
- scripts/_seed-utils.mjs: add topics to recordCount detection
- src/services/gdelt-intel.ts: lazy bootstrap consumption in service layer

* fix(seed): align staleness thresholds and strengthen GDELT validation

- seed-health intervalMin 30→60 so staleness (120min) matches health.js maxStaleMin
- validate requires ≥3/6 topics populated (not just military)
- recordCount sums articles across topics instead of reporting topic count
2026-03-14 10:07:28 +04:00
Elie Habib
364e497bd1 fix(scripts): resolve shared JSON configs for Railway rootDirectory (#1231)
Railway deploys seed services with rootDirectory=scripts/, placing files
at /app/ without the parent shared/ directory. The createRequire +
require('../shared/X.json') pattern resolves to /shared/ which doesn't
exist in the container.

- Add loadSharedConfig() to _seed-utils.mjs: tries ../shared/ (local)
  then ./shared/ (Railway) with clear error on miss
- Add requireShared() to ais-relay.cjs with same dual-path fallback
- Add postinstall to scripts/package.json that copies ../shared/ into
  ./shared/ during Railway build
- Update all 6 seed scripts to use loadSharedConfig instead of
  createRequire + require
- Add scripts/shared/ to .gitignore

Fixes crash introduced by #1212 (shared JSON consolidation).
2026-03-08 00:09:24 +04:00
Elie Habib
cad6b9c4e0 feat(infrastructure): expand submarine cables to 86 via TeleGeography API (#1224)
* feat(infrastructure): expand submarine cables to 86 via TeleGeography API seed

- Add `seed-submarine-cables.mjs` Railway cron script fetching 86 strategic
  cables from TeleGeography API (was 19 hand-curated)
- Update `geo.ts` static baseline with full cable data (routes, landing points,
  owners, RFS year, regions)
- Update `get-cable-health.ts` cable name/landing mappings for new slug-based IDs
- Add `data?.cables?.length` to `_seed-utils.mjs` record count heuristic
- Update `map-harness.ts` cable ID references
- Remove GitHub Actions workflows for UCDP and WB indicators (Railway cron only)

* fix(infrastructure): cable route matching, name false positives, validation threshold

- Fix route geometry: only strip numeric suffix when result matches a known
  cable slug, preventing seamewe-6→seamewe, farice-1→farice, etc.
- Fix name matching: use word-boundary regex instead of substring includes;
  disambiguate short names (ACE→ACE CABLE, SAFE→SAFE CABLE, PEACE→PEACE CABLE,
  TEAMS→TEAMS CABLE) to prevent false matches on common NGA words
- Raise validation threshold from 50 to 75 (88% success required) to prevent
  heavily partial upstream results from overwriting good cached data

* fix(infrastructure): tie validation threshold to 90% of configured cable count

Dynamic threshold based on CABLE_REGIONS length instead of a hardcoded number.
Currently requires >= 78 of 86 cables (90%).
2026-03-07 22:24:58 +04:00
Elie Habib
314d341563 fix: gracefully skip seed write when validation fails (empty data) (#1089)
At midnight UTC, FIRMS API returns 0 fire detections due to date
rollover. The validateFn correctly rejects empty data, but previously
this threw a FATAL error and crashed. Now it exits cleanly (code 0),
preserving existing cached data in Redis for the next successful run.
2026-03-06 08:03:13 +04:00
Elie Habib
124085edd6 fix: add process.exit(0) to seed scripts for Railway cron compatibility (#999)
Railway marks cron jobs as "failed" when the Node.js process doesn't
exit cleanly. The seed scripts relied on natural event loop drain,
but undici's connection pool keeps handles alive, causing Railway to
kill the process and mark it as failed.

Changes:
- Add process.exit(0) on success and lock-skip paths in runSeed()
- Fix recordCount for crypto (.quotes) and stablecoin (.stablecoins)
- Add writeExtraKey, sleep, parseYahooChart shared utilities
- Add extraKeys option to runSeed for bootstrap hydration keys
2026-03-04 20:43:16 +04:00
Elie Habib
78a14306d9 feat: add seed-first pattern to 15 RPC handlers with Railway seed scripts (#989)
Migrate handlers from direct external API calls to seed-first pattern:
Railway cron seeds Redis → handlers read from Redis → fallback to live
fetch if seed stale and SEED_FALLBACK_* env enabled.

Handlers updated: earthquakes, fire-detections, internet-outages,
climate-anomalies, unrest-events, cyber-threats, market-quotes,
commodity-quotes, crypto-quotes, etf-flows, gulf-quotes,
stablecoin-markets, natural-events, displacement-summary, risk-scores.

Also adds:
- scripts/_seed-utils.mjs (shared seed framework with atomic publish,
  distributed locks, retry, freshness metadata)
- 13 seed scripts for Railway cron
- api/seed-health.js monitoring endpoint
- scripts/validate-seed-migration.mjs post-deploy validation
- Restored multi-source CII in get-risk-scores (8 sources: ACLED,
  UCDP, outages, climate, cyber, fires, GPS, Iran)
2026-03-04 17:37:15 +04:00