feat(forecast): AI Forecasts prediction module (#1579)

* feat(forecast): add AI Forecasts prediction module (Pro-tier)

MiroFish-inspired prediction engine that generates structured forecasts
across 6 domains (conflict, market, supply chain, political, military,
infrastructure) using existing WorldMonitor data streams.

- Proto definitions for ForecastService with GetForecasts RPC
- Dedicated seed script (seed-forecasts.mjs) with 6 domain detectors,
  cross-domain cascade resolver, prediction market calibration, and
  trend detection via prior snapshot comparison
- Premium-gated RPC handler (PREMIUM_RPC_PATHS enforcement)
- Lazy-loaded ForecastPanel with domain filters, probability bars,
  trend arrows, signal evidence, and cascade links
- Health monitoring integration (seed-meta freshness tracking)
- Refresh scheduler with API key guard

* test(forecast): add 47 unit tests for forecast detectors and utilities

Covers forecastId, normalize, resolveCascades, calibrateWithMarkets,
computeTrends, and smoke tests for all 6 domain detectors. Exports
testable functions from seed script with direct-run guard.

* fix(forecast): domain mismatch 'infra' vs 'infrastructure', add panel category

- Seed script used 'infra' but ForecastPanel filtered on 'infrastructure',
  causing Infra tab to show zero results
- Added 'forecast' to intelligence category in PANEL_CATEGORY_MAP

* fix(forecast): move CSS to one-time injection, improve type safety

- P2: Move style block from setContent to one-time document.head injection
  to prevent CSS accumulation on repeated renders
- P3: Replace +toFixed(3) with Math.round for readability in seed script
- P3: Use Forecast type instead of any[] in RPC handler filter

* fix(forecast): handle sebuf proto data shapes from Redis

Detectors now normalize CII scores from server-side proto format
(combinedScore, TREND_DIRECTION_RISING, region) to uniform shape.
Outage severity handles proto enum format (SEVERITY_LEVEL_HIGH).
Added confidence floor of 0.3 for single-source predictions.

Verified against live Redis: 2 predictions generated (Iran infra
shutdown, IL political instability).

* feat(forecast): unlock AI Forecasts on web, lock desktop only (trial)

- Remove forecast RPC from PREMIUM_RPC_PATHS (web access is free)
- Panel locked on desktop only (same as oref-sirens/telegram-intel)
- Remove API key guards from data-loader and refresh scheduler
- Web users get full access during trial period

* chore: regenerate proto types with make generate

Re-ran make generate after rebasing on main. Plugin v0.7.0 dropped
@ts-nocheck from output, added it back to all 50 generated files.
Fixed 4 type errors from proto codegen changes:
- MarketSource enum -> string union type
- TemporalAnomalyProto -> TemporalAnomaly rename
- webcam lastUpdated number -> string

* fix(forecast): use chokepoints v4 key, include ciiContribution in unrest

- P1: Switch chokepoints input from stale v2 to active v4 Redis key,
  matching bootstrap.js and cache-keys.ts
- P2: Add ciiContribution to unrest component fallback chain in
  normalizeCiiEntry so political detector reads the correct sebuf field

* feat(forecast): Phase 2 LLM scenario enrichment + confidence model

MiroFish-inspired enhancements:
- LLM scenario narratives via Groq/OpenRouter (narrative-only, no numeric
  adjustment). Evidence-grounded prompts with mandatory signal citation
  and few-shot examples from MiroFish's SECTION_SYSTEM_PROMPT_TEMPLATE.
- Top-4 predictions batched into single LLM call for cost efficiency.
- News context from newsInsights attached to all predictions for LLM
  prompt grounding (NOT in signals, cannot affect confidence).
- Deterministic confidence model: source diversity via SIGNAL_TO_SOURCE
  mapping (deduplicates cii+cii_delta, theater+indicators) + calibration
  agreement from prediction market drift. Floor 0.2, ceiling 1.0.
- Output validation: rejects scenarios without signal references.
- Truncated JSON repair for small model output.
- Structured JSON logging for LLM calls.
- Redis cache for LLM scenarios (1h TTL).
- 23 new tests (70 total), all passing.
- Live-tested: OpenRouter gemini-2.5-flash produces evidence-grounded
  scenario narratives from real WorldMonitor data.

* feat(forecast): Phase 3 multi-perspective scenarios, projections, data-driven cascades

MiroFish-inspired enhancements:
- Multi-perspective LLM analysis: top-2 predictions get strategic,
  regional, and contrarian viewpoints via combined LLM call
- Probability projections: domain-specific decay curves (h24/d7/d30)
  anchored to timeHorizon so probability equals projections[timeHorizon]
- Data-driven cascade rules: moved from hardcoded array to JSON config
  (scripts/data/cascade-rules.json) with schema validation, named
  predicate evaluators, unknown key rejection, and fallback to defaults
- 4 new cascade paths: infrastructure->supply_chain, infrastructure->market
  (both requiresSeverity:total), conflict->political, political->market
- Proto: added Perspectives and Projections messages to Forecast
- ForecastPanel: renders projections row and conditional perspectives toggle
- 89 tests (19 new), all passing
- Live-tested: OpenRouter produces perspectives from real data

* feat(forecast): Phase 4 data utilization + entity graph

Fixes data gaps that prevented 4 of 6 detectors from firing:
- Input normalizers: chokepoint v4 shape + GPS hexes-to-zones mapping
- Chokepoint warm-ping (production-only, requires WM_API_BASE_URL)
- Lowered CII conflict threshold from 70 to 60, gated on level=high|critical

4 new standalone detectors:
- UCDP conflict zones (10+ events per country)
- Cyber threat concentration (5+ threats per country)
- GPS jamming in maritime shipping zones (5 regions)
- Prediction markets as signals (60-90% probability markets)

Entity-relationship graph (file-based, 38 nodes):
- Countries, theaters, commodities, chokepoints, alliances
- Alias table resolves both ISO codes and display names
- Graph cascade discovery links predictions across entities

Result: 51 predictions (up from 1-2), spanning conflict, infrastructure,
and supply chain domains. 112 tests, all passing.

* fix(forecast): redis cache format, signal source mapping, type safety

Fresh-eyes audit fixes:
- BUG: redisSet used wrong Upstash API format (POST body with {value,ex}
  instead of command array ['SET',key,value,'EX',ttl]). LLM cache writes
  were silently failing, causing fresh LLM calls every run.
- BUG: prediction_market signal type missing from SIGNAL_TO_SOURCE,
  inflating confidence for market-derived predictions.
- CLEANUP: Remove unnecessary (f as any) casts in ForecastPanel since
  generated Forecast type already has projections/perspectives fields.
- CLEANUP: Bump health maxStaleMin from 60 to 90 to avoid false STALE
  alerts when LLM calls add latency to seed runs.

* feat(forecast): headline-entity matching with news corroboration signals

Uses entity graph aliases to match headlines to predictions by
country/theater (excludes commodity/infrastructure nodes to prevent
false positives). Predictions with matching headlines get a
news_corroboration signal visible in the panel.

Also fixes buildUserPrompt to merge unique headlines from ALL
predictions in the LLM batch (was only reading preds[0].newsContext).

Live-tested: 13 of 51 predictions now have corroborating headlines
(Iran, Israel, Syria, Ukraine, etc). 116 tests, all passing.

* feat(forecast): add country-codes.json for headline-entity matching

56 countries with ISO codes, full names, and scoring keywords (extracted
from src/config/countries.ts + UCDP-relevant additions). Used by
attachNewsContext for richer headline matching via getSearchTermsForRegion
which combines country-codes + entity graph + keyword aliases.

14/57 predictions now have news corroboration (limited by headline
coverage, not matching quality: only 8 headlines currently available).

* feat(forecast): read 300 headlines from news digest instead of 8

Read news:digest:v1:full:en (300 headlines across 16 categories) instead
of just news:insights:v1 topStories (8 headlines). Fallback to topStories
if digest is unavailable.

Result: news corroboration jumped from 25% to 64% (38/59 predictions).

* fix(forecast): handle parenthetical country names in headline matching

Strip suffixes like '(Zaire)', '(Burma)', '(Soviet Union)' from UCDP
region names before matching against country-codes.json. Also use
includes() for reverse name lookup to catch partial matches.

Corroboration: 64% -> 69% (41/59). Remaining 18 unmatched are countries
with no current English-language news coverage.

* fix(forecast): cache validated LLM output, add digest test, log cache errors

Fresh-eyes audit fixes:
- Combined LLM cache now stores only validated items (was caching raw
  unvalidated output, serving potentially invalid scenarios on cache hit)
- redisSet logs warnings on failure (was silently swallowing all errors)
- Added digest-based test for attachNewsContext (primary path was untested)
- Fixed test arity: attachNewsContext(preds, news, digest) with 3 params

* fix(forecast): remove dead confidenceFromSources, reduce warm-ping timeout

- P2: Remove confidenceFromSources (dead code, computeConfidence overwrites
  all initial confidence values). Inline the formula in original detectors.
- P3: Reduce warm-ping timeout from 30s to 15s (non-critical step)
- P3: Add trial status comment on forecast panel config

* fix(forecast): resolve ISO codes to country names, fix market detector, safe pre-push

P1 fixes from code review:
- CII ISO codes (IL, IR) now resolved to full country names (Israel, Iran)
  via country-codes.json. Prevents substring false positives (IL matching
  Chile) in event correlation. Uses word-boundary regex for matching.
- Market detector CII-to-theater mapping now uses entity graph traversal
  instead of broken theater-name substring matching. Iran correctly maps
  to Middle East theater via graph links.
- Pre-push hook no longer runs destructive git checkout on proto freshness
  failure. Reports mismatch and exits without modifying worktree.
This commit is contained in:
Elie Habib
2026-03-15 01:42:04 +04:00
committed by GitHub
parent 3f24169a05
commit 45f5e5a457
67 changed files with 7477 additions and 822 deletions

View File

@@ -0,0 +1,194 @@
openapi: 3.1.0
info:
title: ForecastService API
version: 1.0.0
paths:
/api/forecast/v1/get-forecasts:
get:
tags:
- ForecastService
summary: GetForecasts
operationId: GetForecasts
parameters:
- name: domain
in: query
required: false
schema:
type: string
- name: region
in: query
required: false
schema:
type: string
responses:
"200":
description: Successful response
content:
application/json:
schema:
$ref: '#/components/schemas/GetForecastsResponse'
"400":
description: Validation error
content:
application/json:
schema:
$ref: '#/components/schemas/ValidationError'
default:
description: Error response
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
components:
schemas:
Error:
type: object
properties:
message:
type: string
description: Error message (e.g., 'user not found', 'database connection failed')
description: Error is returned when a handler encounters an error. It contains a simple error message that the developer can customize.
FieldViolation:
type: object
properties:
field:
type: string
description: The field path that failed validation (e.g., 'user.email' for nested fields). For header validation, this will be the header name (e.g., 'X-API-Key')
description:
type: string
description: Human-readable description of the validation violation (e.g., 'must be a valid email address', 'required field missing')
required:
- field
- description
description: FieldViolation describes a single validation error for a specific field.
ValidationError:
type: object
properties:
violations:
type: array
items:
$ref: '#/components/schemas/FieldViolation'
description: List of validation violations
required:
- violations
description: ValidationError is returned when request validation fails. It contains a list of field violations describing what went wrong.
GetForecastsRequest:
type: object
properties:
domain:
type: string
region:
type: string
GetForecastsResponse:
type: object
properties:
forecasts:
type: array
items:
$ref: '#/components/schemas/Forecast'
generatedAt:
type: integer
format: int64
description: 'Warning: Values > 2^53 may lose precision in JavaScript'
Forecast:
type: object
properties:
id:
type: string
domain:
type: string
region:
type: string
title:
type: string
scenario:
type: string
probability:
type: number
format: double
confidence:
type: number
format: double
timeHorizon:
type: string
signals:
type: array
items:
$ref: '#/components/schemas/ForecastSignal'
cascades:
type: array
items:
$ref: '#/components/schemas/CascadeEffect'
trend:
type: string
priorProbability:
type: number
format: double
calibration:
$ref: '#/components/schemas/CalibrationInfo'
createdAt:
type: integer
format: int64
description: 'Warning: Values > 2^53 may lose precision in JavaScript'
updatedAt:
type: integer
format: int64
description: 'Warning: Values > 2^53 may lose precision in JavaScript'
perspectives:
$ref: '#/components/schemas/Perspectives'
projections:
$ref: '#/components/schemas/Projections'
ForecastSignal:
type: object
properties:
type:
type: string
value:
type: string
weight:
type: number
format: double
CascadeEffect:
type: object
properties:
domain:
type: string
effect:
type: string
probability:
type: number
format: double
CalibrationInfo:
type: object
properties:
marketTitle:
type: string
marketPrice:
type: number
format: double
drift:
type: number
format: double
source:
type: string
Perspectives:
type: object
properties:
strategic:
type: string
regional:
type: string
contrarian:
type: string
Projections:
type: object
properties:
h24:
type: number
format: double
d7:
type: number
format: double
d30:
type: number
format: double