12 Commits

Author SHA1 Message Date
Elie Habib
f209c11713 fix(seeds): rethrow non-fetch failures, separate publish errors (#1606)
* fix(seeds): rethrow non-fetch failures in runSeed()

Split runSeed() into two phases so only upstream fetch errors get
the graceful TTL-extension path. Redis publish, seed-meta, and
verification failures now rethrow (exit 1) so monitoring catches them.
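
A minimal sketch of the two-phase split described above, assuming injected callbacks so it runs without Redis. The names `runTwoPhase`, `fetchData`, `publish`, and `extendTtl` are hypothetical and are not taken from `_seed-utils.mjs`; the real `runSeed()` may be structured differently.

```javascript
// Hypothetical two-phase seed runner. It returns an exit code instead of
// calling process.exit so the behavior can be observed in isolation.
async function runTwoPhase({ fetchData, publish, extendTtl }) {
  let data;

  // Phase 1: an upstream fetch failure is tolerated. Stale data is kept
  // alive by extending the TTL, and the run reports success (exit 0).
  try {
    data = await fetchData();
  } catch (err) {
    await extendTtl();
    return { exitCode: 0, reason: `fetch failed: ${err.message}` };
  }

  // Phase 2: publish/verify failures are deliberately not caught here,
  // so they reach the caller, which would log and exit 1.
  await publish(data);
  return { exitCode: 0, reason: 'published' };
}
```

The point of the split is that the `try` block wraps only the fetch, so a Redis write error can never be mistaken for an upstream outage.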

* fix(seeds): separate fetch from publish errors in standalone scripts

Split seed-airport-delays, seed-military-flights, and
seed-service-statuses into two phases matching the runSeed() pattern:
- Phase 1: upstream fetch errors are graceful (extend TTL, exit 0)
- Phase 2: Redis publish/verify errors propagate (exit 1)

* fix(seeds): make Redis SET throw on failure so publish errors propagate

Local redisSet() returned false instead of throwing, silently masking
Redis write failures. writeExtraKey() also warned instead of throwing.
Both now throw on non-OK responses, ensuring Phase 2 catch fires.

* fix(seed): treat empty Redis key after successful RPC as publish failure

When cachedFetchJson() silently swallows a Redis write failure, the
warm-ping script now throws instead of warning, reaching the outer
catch handler (exit 1) so monitoring detects the issue.
2026-03-15 01:30:54 +04:00
Elie Habib
485d416065 feat(seeds): Railway seed scripts for all unseeded Vercel RPC endpoints (#1599)
* feat(seeds): add Railway seed scripts for economic and trade endpoints

Two new seed scripts to eliminate Vercel edge external API calls:

seed-economy.mjs:
- EIA energy prices (WTI, Brent) -> economic:energy:v1:all
- EIA energy capacity (Solar, Wind, Coal) -> economic:capacity:v1:COL,SUN,WND:20
- FRED series (10 series) -> economic:fred:v1:<id>:120
- Macro signals (Yahoo, Alternative.me, Mempool) -> economic:macro-signals:v1

seed-supply-chain-trade.mjs:
- Shipping rates (FRED) -> supply_chain:shipping:v2
- Trade barriers (WTO tariff gap) -> trade:barriers:v1:tariff-gap:50
- Trade restrictions (WTO MFN overview) -> trade:restrictions:v1:tariff-overview:50
- Trade flows (WTO, 15 major reporters) -> trade:flows:v1:<reporter>:000:10
- Tariff trends (WTO, 15 major reporters) -> trade:tariffs:v1:<reporter>:all:10

Cache keys match handler patterns exactly so cachedFetchJson finds
pre-seeded data and avoids live external API calls from Vercel edge.

* feat(seeds): add seed-aviation.mjs for airport ops and aviation news

Seeds 2 aviation endpoints with predictable default params:
- getAirportOpsSummary (AviationStack + NOTAM) -> aviation:ops-summary:v1:CDG,ESB,FRA,IST,LHR,SAW
- listAviationNews (9 RSS feeds, 24h window) -> aviation:news::24:v1

NOT seeded (inherently on-demand, user-specific inputs):
- getFlightStatus: specific flight number lookup
- trackAircraft: bounding-box or icao24 queries
- listAirportFlights: arbitrary airport+direction+limit combos
- getCarrierOps: depends on listAirportFlights with variable params

* feat(seeds): add seed-conflict-intel.mjs for ACLED, HAPI, and PizzINT

Seeds 3 conflict/intelligence endpoints with predictable default params:
- listAcledEvents (all countries, last 30 days) -> conflict:acled:v1:all:0:0
- getHumanitarianSummary (20 top conflict countries) -> conflict:humanitarian:v1:<CC>
- getPizzintStatus (base + GDELT variants) -> intel:pizzint:v1:base, intel:pizzint:v1:gdelt

NOT seeded (inherently on-demand, LLM or user-specific inputs):
- classifyEvent: per-headline LLM classification
- deductSituation: per-query LLM deduction
- getCountryIntelBrief: per-country LLM brief with context hash
- getCountryFacts: per-country REST Countries + Wikidata + Wikipedia
- searchGdeltDocuments: per-query GDELT search

Requires: ACLED_EMAIL, ACLED_KEY, UPSTASH_REDIS_REST_URL/TOKEN

* feat(seeds): add seed-research.mjs for arXiv, HN, tech events, trending repos

Seeds 4 research endpoints:
- listArxivPapers (cs.AI, cs.CL, cs.CR) -> research:arxiv:v1:<cat>::50
- listHackernewsItems (top, best feeds) -> research:hackernews:v1:<feed>:30
- listTechEvents (Techmeme ICS + dev.events RSS) -> research:tech-events:v1
- listTrendingRepos (python, javascript, typescript) -> research:trending:v1:<lang>:daily:50

The tech events key is also seeded by the relay; this script provides backup
hydration and ensures the key is warm even if the relay hasn't run yet.

Requires: UPSTASH_REDIS_REST_URL/TOKEN

* feat(seeds): add seed-military-maritime-news.mjs for USNI and nav warnings

Seeds 2 endpoints with predictable default params:
- USNI Fleet Report (WordPress JSON API) -> usni-fleet:sebuf:v1 + stale backup
- Navigational Warnings (NGA broadcast, all areas) -> maritime:navwarnings:v1:all

NOT seeded (inherently on-demand):
- getAircraftDetails/batch: per-icao24 Wingbits lookup
- listMilitaryFlights: bounding-box query (quantized 1-degree grid)
- getVesselSnapshot: in-memory cache, reads from relay /ais-snapshot
- listFeedDigest: per-feed-URL RSS caching (hundreds of feeds, relay proxied)
- summarizeArticle: per-article LLM summarization

Requires: UPSTASH_REDIS_REST_URL/TOKEN

* feat(seeds): add seed-infra.mjs warm-ping for service statuses and cable health

Uses warm-ping pattern (calls Vercel RPC from Railway) because:
- list-service-statuses: 30 status page parsers with 8 custom formats
- get-cable-health: NGA text analysis with cable name matching + proximity
Replicating this logic in a standalone script is fragile and duplicative.

NOT seeded (on-demand):
- search-imagery: per-bbox/datetime STAC query
- get-giving-summary: hardcoded baselines, no external fetches
- get-webcam-image: per-webcamId Windy API lookup

* fix(seeds): move secondary key writes before process.exit, fix data shapes

Critical bugs found in code review:

1. runSeed() calls process.exit(0) after primary key write, so .then()
   callbacks were dead code. All secondary keys (FRED, macro signals,
   trade data, HAPI summaries, pizzint, HN, trending, etc.) were NEVER
   written. Fix: move writeExtraKey calls inside fetchAll() before return.

2. FRED cache key used :120 suffix but handler default is :0 (req.limit||0).
   Fixed to :0 so seed matches handler cache key for default requests.

3. USNI and nav warnings seed parsers produced wrong data shapes vs handler
   (different field names, missing fields). Converted to warm-ping pattern
   (like seed-infra.mjs) to avoid shape divergence.

* fix(seeds): reduce GDELT 429 rate limiting in seed-gdelt-intel

Problems from logs: every topic fetch hits 429, runs take 3-5min,
4th run failed fatally after 12min of cascading retries.

Fixes:
- Increase inter-topic delay: 12s -> 20s (GDELT needs longer cooldown)
- Increase initial backoff: 10s -> 20s, with 15s increments per retry
- Graceful degradation: exhausted retries return empty topic instead of
  throwing (prevents withRetry from restarting ALL topics from scratch)
- Align TTL with health.js: 3600s -> 7200s (matches maxStaleMin:120)
- Validation allows partial success (3/6 topics minimum)
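
The backoff and partial-success rules in the list above can be sketched as follows. The constants come from the commit message; the function names are hypothetical, and reading "15s increments per retry" as a linear schedule is an assumption.

```javascript
// Values from the commit message; names are illustrative only.
const INITIAL_BACKOFF_MS = 20_000;
const BACKOFF_STEP_MS = 15_000;
const MIN_POPULATED_TOPICS = 3;

// attempt is 0-based. Assumed linear growth: 20s, 35s, 50s, ...
function backoffDelayMs(attempt) {
  return INITIAL_BACKOFF_MS + attempt * BACKOFF_STEP_MS;
}

// Partial success: accept the run when at least 3 topics carry articles.
// A topic whose retries were exhausted is represented by an empty array.
function hasEnoughTopics(topics) {
  const populated = topics.filter(
    (t) => Array.isArray(t.articles) && t.articles.length > 0,
  );
  return populated.length >= MIN_POPULATED_TOPICS;
}
```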

Cron interval should also be increased from 30min to 2h on Railway
to match the new 2h TTL.

* fix(seeds): 4 bugs from review - ACLED auth, NOTAM key, infra precedence, curated events

P1: ACLED auth used wrong endpoint (api/acled/token) and env vars (ACLED_KEY).
Fixed to match server/acled-auth.ts: ACLED_EMAIL+ACLED_PASSWORD via /oauth/token,
with ACLED_ACCESS_TOKEN static fallback.

P1: Aviation NOTAM key was aviation:notam-closures:v1, handler reads
aviation:notam:closures:v2. Fixed key to match _shared.ts.

P2: Infra warm-ping had an operator precedence bug in nullish coalescing:
(a ?? b) ? c : d instead of a ?? (b ? c : d). Added parens.
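
For illustration, the precedence issue can be reproduced with placeholder values (these are not the values used in seed-infra.mjs). Because `??` binds tighter than the conditional operator, the ungrouped form tests the result of `a ?? b`:

```javascript
const a = 'explicit'; // a non-nullish value that should win outright
const b = false;

// Without parentheses this parses as (a ?? b) ? 'c' : 'd'.
const buggy = a ?? b ? 'c' : 'd';

// With parentheses, a is returned whenever it is not null/undefined.
const fixed = a ?? (b ? 'c' : 'd');

console.log(buggy); // 'c'
console.log(fixed); // 'explicit'
```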

P2: Research seed missed curated conferences that the handler appends
(CURATED_EVENTS in list-tech-events.ts). Added same curated events so
seeded data matches what the handler would produce.

* fix(seeds): add seed-meta freshness metadata for all secondary keys

Added writeExtraKeyWithMeta() to _seed-utils.mjs that writes both the
data key and a seed-meta:<key> freshness metadata entry. All secondary
key writes in seed scripts now use this helper so health.js can track
freshness for: energy capacity, FRED series, macro signals, trade
barriers/restrictions/flows/tariffs, aviation news, HAPI summaries,
PizzINT, arXiv categories, HN feeds, tech events, trending repos.

Previously only the primary key per script got seed-meta (via runSeed),
leaving secondary keys operationally invisible to health monitoring.

* fix(seeds): align seed-meta keys with health.js conventions

P1: writeExtraKeyWithMeta wrote seed-meta:<full-cache-key> (e.g.,
seed-meta:economic:macro-signals:v1), but health.js expects normalized
names without version suffixes (seed-meta:economic:macro-signals).
Fixed by stripping trailing :v\d+ from key. Added metaKeyOverride
param for cases needing explicit control.
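
A minimal sketch of the normalization described above. The function name `metaKeyFor` is hypothetical, and treating `metaKeyOverride` as a complete replacement key is an assumption; the actual helper in `_seed-utils.mjs` may differ.

```javascript
// Derive the seed-meta key from a cache key by dropping a trailing
// version suffix such as ":v1". An explicit override wins when given.
function metaKeyFor(cacheKey, metaKeyOverride) {
  if (metaKeyOverride) return metaKeyOverride;
  return `seed-meta:${cacheKey.replace(/:v\d+$/, '')}`;
}

console.log(metaKeyFor('economic:macro-signals:v1'));
// seed-meta:economic:macro-signals
```

Because the pattern is anchored at the end of the string, a key with the version in the middle (for example `economic:fred:v1:<id>:0`) is left unchanged, which is presumably the kind of case the override parameter exists for.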

P1: shipping seed used runSeed('supply-chain', 'shipping-trade', ...)
producing seed-meta:supply-chain:shipping-trade, but health.js expects
seed-meta:supply_chain:shipping. Fixed domain/resource to match.

* fix(seeds): only write seed-meta after successful data key write

writeExtraKey() now returns false on failure. writeExtraKeyWithMeta()
skips seed-meta write when the data write fails, preventing false-positive
health reports for keys like macro-signals and tech-events.
2026-03-15 00:37:31 +04:00
Elie Habib
19ee1f38e4 fix(seeds): extend TTL on stale data instead of crashing on fetch errors (#1600)
* fix(seeds): extend TTL on stale data instead of crashing on fetch errors

Seed scripts crashed with process.exit(1) when upstream APIs returned
errors (e.g., Wingbits 401), causing Redis keys to expire and panels
to lose data. Now all seeds gracefully extend TTL on existing keys and
exit 0, keeping stale data alive until the API recovers.

- Add shared extendExistingTtl() helper to _seed-utils.mjs
- Update runSeed() catch block (fixes 24 scripts using it)
- Fix fetch-gpsjam.mjs, seed-airport-delays.mjs,
  seed-military-flights.mjs, seed-service-statuses.mjs

* fix(seeds): preserve per-key TTLs when extending stale military data

THEATER_POSTURE_BACKUP_KEY has a 7-day TTL (604800s) but was being
extended with STALE_TTL (86400s), shortening it from 7 days to 1 day
during upstream outages. Now each key group gets its original TTL.
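
One way to express "each key group gets its original TTL" is sketched below. The two TTL values are the ones quoted above; the key names, the `keyGroups` shape, and the injected `expire` callback are hypothetical stand-ins for the real Redis calls.

```javascript
const STALE_TTL = 86400; // 1 day, in seconds
const THEATER_POSTURE_BACKUP_TTL = 604800; // 7 days, in seconds

// Hypothetical key names, grouped by the TTL each was originally written with.
const keyGroups = [
  { keys: ['example:military:flights'], ttl: STALE_TTL },
  { keys: ['example:theater-posture:backup'], ttl: THEATER_POSTURE_BACKUP_TTL },
];

// Extend every key with its own group's TTL instead of one shared value,
// so a long-lived backup key is never shortened during an outage.
function extendAll(groups, expire) {
  for (const { keys, ttl } of groups) {
    for (const key of keys) expire(key, ttl);
  }
}
```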
2026-03-14 23:42:30 +04:00
Elie Habib
fe67111dc9 feat: harness engineering P0 - linting, testing, architecture docs (#1587)
* feat: harness engineering P0 - linting, testing, architecture docs

Add foundational infrastructure for agent-first development:

- AGENTS.md: agent entry point with progressive disclosure to deeper docs
- ARCHITECTURE.md: 12-section system reference with source-file refs and ownership rule
- Biome 2.4.7 linter with project-tuned rules, CI workflow (lint-code.yml)
- Architectural boundary lint enforcing forward-only dependency direction (lint-boundaries.mjs)
- Unit test CI workflow (test.yml), all 1083 tests passing
- Fixed 9 pre-existing test failures (bootstrap sync, deploy-config headers, globe parity, redis mocks, geometry URL, import.meta.env null safety)
- Fixed 12 architectural boundary violations (types moved to proper layers)
- Added 3 missing cache tier entries in gateway.ts
- Synced cache-keys.ts with bootstrap.js
- Renamed docs/architecture.mdx to "Design Philosophy" with cross-references
- Deprecated legacy docs/Docs_To_Review/ARCHITECTURE.md
- Harness engineering roadmap tracking doc

* fix: address PR review feedback on harness-engineering-p0

- countries-geojson.test.mjs: skip gracefully when CDN unreachable
  instead of failing CI on network issues
- country-geometry-overrides.test.mts: relax timing assertion
  (250ms -> 2000ms) for constrained CI environments
- lint-boundaries.mjs: implement the documented api/ boundary check
  (was documented but missing, causing false green)

* fix(lint): scan api/ .ts files in boundary check

The api/ boundary check only scanned .js/.mjs files, missing the 25
sebuf RPC .ts edge functions. Now scans .ts files with correct rules:
- Legacy .js: fully self-contained (no server/ or src/ imports)
- RPC .ts: may import server/ and src/generated/ (bundled at deploy),
  but blocks imports from src/ application code

* fix(lint): detect import() type expressions in boundary lint

- Move AppContext back to app/app-context.ts (aggregate type that
  references components/services/utils belongs at the top, not types/)
- Move HappyContentCategory and TechHQ to types/ (simple enums/interfaces)
- Boundary lint now catches import('@/layer') expressions, not just
  from '@/layer' imports
- correlation-engine imports of AppContext marked boundary-ignore
  (type-only imports of top-level aggregate)
2026-03-14 21:29:21 +04:00
Elie Habib
db6a4a2763 feat(correlation): server-side correlation engine seed + bootstrap hydration (#1571)
* feat(correlation): server-side correlation engine seed + bootstrap hydration

Move correlation card computation from client-side (per-browser, 10-30s delay)
to server-side (Railway cron, instant via bootstrap). Seed script reads 8 Redis
keys, runs 4 adapter signal collectors (military, escalation, economic, disaster),
clusters/scores/generates cards, writes to Redis with 10min TTL.

- New: scripts/seed-correlation.mjs (pure JS port of correlation engine)
- bootstrap.js: add correlationCards to FAST_KEYS tier
- health.js + seed-health.js: register for monitoring (maxStaleMin: 15)
- CorrelationPanel: consume bootstrap on construction, show "Analyzing..." only
  after live engine has run (not for bootstrap-only cards)
- _seed-utils.mjs: support opts.recordCount override (function or number)

* fix(correlation): stale timestamp fallback + coordinate-based country resolution

P1: news stories lacked per-story pubDate, causing Date.now() fallback on
every seed run. Now _clustering.mjs propagates pubDate through to
enrichedStories, and seed-correlation reads s.pubDate then generatedAt.

P2: normalizeToCode dropped signals with unparseable country names.
Added centroid-based coordinate fallback (haversine nearest-match within
800km) matching the live engine's getCountryAtCoordinates behavior.
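
The coordinate fallback can be sketched as a haversine nearest-centroid search with an 800 km cutoff. This is an illustration under assumptions: the centroid coordinates below are rough sample values, and `nearestCountry` is a hypothetical name rather than the function used in seed-correlation.mjs.

```javascript
const EARTH_RADIUS_KM = 6371;
const toRad = (deg) => (deg * Math.PI) / 180;

// Great-circle distance between two lat/lon points, in kilometres.
function haversineKm(lat1, lon1, lat2, lon2) {
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Return the code of the closest centroid, or null if none is within maxKm.
function nearestCountry(lat, lon, centroids, maxKm = 800) {
  let best = null;
  let bestKm = Infinity;
  for (const [code, [cLat, cLon]] of Object.entries(centroids)) {
    const km = haversineKm(lat, lon, cLat, cLon);
    if (km < bestKm) {
      best = code;
      bestKm = km;
    }
  }
  return bestKm <= maxKm ? best : null;
}

// Approximate sample centroids as [lat, lon]; not the project's table.
const SAMPLE_CENTROIDS = { FR: [46.2, 2.2], DE: [51.2, 10.4], IS: [64.9, -19.0] };
```

A country missing from the centroid table can never be returned by this search, which is why absent entries lead to signals being dropped or assigned to a neighbour.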

* fix(correlation): add 11 missing country centroids to coordinate fallback

CI, CR, CV, CY, GA, IS, LA, SZ, TL, TT, XK were in the normalization
maps but missing from COUNTRY_CENTROIDS, causing coordinate-only signals
in those countries to be misclassified or dropped during bootstrap.

* fix(correlation): align protest/outage field names with actual Redis schema

Codex review P1 findings: seed-correlation read wrong field names from
Redis data.

Protests (unrest:events:v1): p.time -> p.occurredAt, p.lat/lon ->
p.location.latitude/longitude, severity enum SEVERITY_LEVEL_* mapping.

Outages (infra:outages:v1): o.pubDate -> o.detectedAt, o.lat/lon ->
o.location.latitude/longitude, severity enum OUTAGE_SEVERITY_* mapping.

Both escalation and disaster adapters updated. Old field names kept as
fallbacks for data shape compatibility.
2026-03-14 15:07:30 +04:00
Elie Habib
760c129c71 fix(seed): SyntaxError from mixing || and ?? operators without parens (#1558)
Mixing || and ?? in the same expression without explicit grouping is
a JS syntax error. This broke ALL Railway seed scripts after #1556.

Refactored to use ?? throughout with explicit Array.isArray guard so
non-topic seeds correctly fall through to their own length checks.
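
The syntax rule can be checked without breaking the file by compiling the expressions through `new Function`. The `recordCount` function that follows is a hypothetical shape for the refactor (the `items` field is an assumption), not the code from `_seed-utils.mjs`.

```javascript
// Returns true if the expression compiles, false on a SyntaxError.
function parses(source) {
  try {
    new Function('a', 'b', 'c', `return ${source};`);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(parses('a || b ?? c')); // false
console.log(parses('(a || b) ?? c')); // true

// Using ?? throughout, with an explicit Array.isArray guard so that data
// without a topics array falls through to the next length check.
function recordCount(data) {
  const topicCount = Array.isArray(data?.topics) ? data.topics.length : undefined;
  return topicCount ?? data?.items?.length ?? 0;
}
```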
2026-03-14 10:16:59 +04:00
Elie Habib
e0bf4f9bd2 feat: seed GDELT intelligence topics to Redis (#1556)
* feat: seed GDELT intelligence topics to Redis with bootstrap hydration

Add standalone seed script that pre-populates all 6 Live Intelligence
topics (military, cyber, nuclear, sanctions, intelligence, maritime)
from the GDELT Doc API into Redis. Frontend consumes bootstrap data
lazily via the service layer, falling back to RPC if unavailable.

- scripts/seed-gdelt-intel.mjs: new seed script with per-topic 429 retry
- api/bootstrap.js: register gdeltIntel in FAST_KEYS
- api/health.js: register in BOOTSTRAP_KEYS + SEED_META + dataSize
- api/seed-health.js: register in SEED_DOMAINS
- scripts/_seed-utils.mjs: add topics to recordCount detection
- src/services/gdelt-intel.ts: lazy bootstrap consumption in service layer

* fix(seed): align staleness thresholds and strengthen GDELT validation

- seed-health intervalMin 30→60 so staleness (120min) matches health.js maxStaleMin
- validate requires ≥3/6 topics populated (not just military)
- recordCount sums articles across topics instead of reporting topic count
2026-03-14 10:07:28 +04:00
Elie Habib
364e497bd1 fix(scripts): resolve shared JSON configs for Railway rootDirectory (#1231)
Railway deploys seed services with rootDirectory=scripts/, placing files
at /app/ without the parent shared/ directory. The createRequire +
require('../shared/X.json') pattern resolves to /shared/ which doesn't
exist in the container.

- Add loadSharedConfig() to _seed-utils.mjs: tries ../shared/ (local)
  then ./shared/ (Railway) with clear error on miss
- Add requireShared() to ais-relay.cjs with same dual-path fallback
- Add postinstall to scripts/package.json that copies ../shared/ into
  ./shared/ during Railway build
- Update all 6 seed scripts to use loadSharedConfig instead of
  createRequire + require
- Add scripts/shared/ to .gitignore
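
The dual-path lookup can be illustrated with a small resolver. This is a sketch under assumptions: `resolveSharedPath` is a hypothetical name, and the existence check is injected so the example does not touch the filesystem; the real `loadSharedConfig()` presumably reads and parses the JSON as well.

```javascript
// Try ../shared/ first (local checkout), then ./shared/ (Railway container
// after the postinstall copy). Fail with a message listing what was tried.
function resolveSharedPath(fileName, fileExists, baseDirs = ['../shared', './shared']) {
  for (const dir of baseDirs) {
    const candidate = `${dir}/${fileName}`;
    if (fileExists(candidate)) return candidate;
  }
  throw new Error(`Shared config ${fileName} not found in: ${baseDirs.join(', ')}`);
}
```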

Fixes crash introduced by #1212 (shared JSON consolidation).
2026-03-08 00:09:24 +04:00
Elie Habib
cad6b9c4e0 feat(infrastructure): expand submarine cables to 86 via TeleGeography API (#1224)
* feat(infrastructure): expand submarine cables to 86 via TeleGeography API seed

- Add `seed-submarine-cables.mjs` Railway cron script fetching 86 strategic
  cables from TeleGeography API (was 19 hand-curated)
- Update `geo.ts` static baseline with full cable data (routes, landing points,
  owners, RFS year, regions)
- Update `get-cable-health.ts` cable name/landing mappings for new slug-based IDs
- Add `data?.cables?.length` to `_seed-utils.mjs` record count heuristic
- Update `map-harness.ts` cable ID references
- Remove GitHub Actions workflows for UCDP and WB indicators (Railway cron only)

* fix(infrastructure): cable route matching, name false positives, validation threshold

- Fix route geometry: only strip numeric suffix when result matches a known
  cable slug, preventing seamewe-6→seamewe, farice-1→farice, etc.
- Fix name matching: use word-boundary regex instead of substring includes;
  disambiguate short names (ACE→ACE CABLE, SAFE→SAFE CABLE, PEACE→PEACE CABLE,
  TEAMS→TEAMS CABLE) to prevent false matches on common NGA words
- Raise validation threshold from 50 to 75 (88% success required) to prevent
  heavily partial upstream results from overwriting good cached data
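
The difference between substring and word-boundary matching can be shown with a short example. The helper names and sample strings are illustrative, not taken from `get-cable-health.ts`.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Match the cable name only as a whole word or phrase, case-insensitively.
function mentionsCable(text, cableName) {
  return new RegExp(`\\b${escapeRegExp(cableName)}\\b`, 'i').test(text);
}

console.log('PEACEFUL TRANSIT ADVISED'.includes('PEACE')); // true (false positive)
console.log(mentionsCable('PEACEFUL TRANSIT ADVISED', 'PEACE')); // false
console.log(mentionsCable('REPAIRS ON PEACE CABLE', 'PEACE CABLE')); // true
```

Word boundaries alone still match a standalone common word such as SAFE, which is why the short names are additionally disambiguated with a CABLE suffix.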

* fix(infrastructure): tie validation threshold to 90% of configured cable count

Dynamic threshold based on CABLE_REGIONS length instead of a hardcoded number.
Currently requires >= 78 of 86 cables (90%).
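
The arithmetic behind the figure above, as a sketch: rounding 90% of the configured count up gives 78 for 86 cables. Rounding up (rather than down) is an assumption that happens to agree with the stated numbers; `minValidCables` is a hypothetical name.

```javascript
// Minimum number of cables a fetch must return to be accepted.
function minValidCables(configuredCount, ratio = 0.9) {
  return Math.ceil(configuredCount * ratio);
}

console.log(minValidCables(86)); // 78
```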
2026-03-07 22:24:58 +04:00
Elie Habib
314d341563 fix: gracefully skip seed write when validation fails (empty data) (#1089)
At midnight UTC, FIRMS API returns 0 fire detections due to date
rollover. The validateFn correctly rejects empty data, but previously
this threw a FATAL error and crashed. Now it exits cleanly (code 0),
preserving existing cached data in Redis for the next successful run.
2026-03-06 08:03:13 +04:00
Elie Habib
124085edd6 fix: add process.exit(0) to seed scripts for Railway cron compatibility (#999)
Railway marks cron jobs as "failed" when the Node.js process doesn't
exit cleanly. The seed scripts relied on natural event loop drain,
but undici's connection pool keeps handles alive, causing Railway to
kill the process and mark it as failed.

Changes:
- Add process.exit(0) on success and lock-skip paths in runSeed()
- Fix recordCount for crypto (.quotes) and stablecoin (.stablecoins)
- Add writeExtraKey, sleep, parseYahooChart shared utilities
- Add extraKeys option to runSeed for bootstrap hydration keys
2026-03-04 20:43:16 +04:00
Elie Habib
78a14306d9 feat: add seed-first pattern to 15 RPC handlers with Railway seed scripts (#989)
Migrate handlers from direct external API calls to a seed-first pattern:
Railway cron seeds Redis → handlers read from Redis → fall back to live
fetch if the seed is stale and SEED_FALLBACK_* env is enabled.
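
A rough sketch of the read path in the seed-first pattern, with the Redis read and the live fetch injected as callbacks. Everything here is an assumption for illustration: the function name, the `{ data, fetchedAt }` seed shape, and in particular the choice to serve stale data when fallback is disabled, which the commit message does not specify.

```javascript
// Hypothetical handler core: prefer seeded data, optionally fall back live.
async function seedFirst({ readSeed, liveFetch, fallbackEnabled, maxStaleMs, now = Date.now() }) {
  const seeded = await readSeed(); // assumed shape: { data, fetchedAt } or null
  const fresh = seeded !== null && now - seeded.fetchedAt <= maxStaleMs;

  if (fresh) return { source: 'seed', data: seeded.data };

  // Seed missing or stale: only hit the external API when explicitly enabled
  // (standing in for a SEED_FALLBACK_* environment flag).
  if (fallbackEnabled) return { source: 'live', data: await liveFetch() };

  // Assumed behavior: serve whatever is cached rather than fail.
  return { source: seeded ? 'stale-seed' : 'none', data: seeded ? seeded.data : null };
}
```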

Handlers updated: earthquakes, fire-detections, internet-outages,
climate-anomalies, unrest-events, cyber-threats, market-quotes,
commodity-quotes, crypto-quotes, etf-flows, gulf-quotes,
stablecoin-markets, natural-events, displacement-summary, risk-scores.

Also adds:
- scripts/_seed-utils.mjs (shared seed framework with atomic publish,
  distributed locks, retry, freshness metadata)
- 13 seed scripts for Railway cron
- api/seed-health.js monitoring endpoint
- scripts/validate-seed-migration.mjs post-deploy validation
- Restored multi-source CII in get-risk-scores (8 sources: ACLED,
  UCDP, outages, climate, cyber, fires, GPS, Iran)
2026-03-04 17:37:15 +04:00