worldmonitor/scripts/seed-regional-snapshots.mjs
Elie Habib 1367c36abc Phase 1 PR2: LLM narrative generator for regional snapshots (#2960)
* feat(intelligence): LLM narrative generator for regional snapshots (Phase 1 PR2)

Phase 1 PR2 of the Regional Intelligence Model. Fills in the `narrative`
field that Phase 0 left as empty stubs, populating all 6 sections
(situation, balance_assessment, outlook 24h/7d/30d, watch_items) and
writing narrative_provider / narrative_model onto SnapshotMeta.

## New module: scripts/regional-snapshot/narrative.mjs

Single public entry point:

  generateRegionalNarrative(region, snapshot, evidence, opts?)
    -> { narrative, provider, model }

Design:
- One structured-JSON call per region (cheaper + more coherent than 6
  per-section calls). ~32 calls/day across 7 regions on the 6h cadence.
- 'global' region is skipped entirely — the catch-all is too broad for
  a meaningful narrative.
- Ship-empty on any failure. The generator never throws; network errors,
  JSON parse failures, and all-empty LLM responses all resolve to the
  canonical emptyNarrative() shape. The snapshot remains valuable without
  the narrative layer and the diff engine still surfaces state changes.
- Evidence-grounded: each section's evidence_ids must be a subset of
  the IDs collectEvidence() produced. Hallucinated IDs are silently
  filtered by parseNarrativeJson().
- Provider chain: Groq (llama-3.3-70b-versatile) → OpenRouter
  (google/gemini-2.5-flash). Mirrors seed-insights.mjs's inline-provider
  pattern; Ollama skipped because Railway has no local model.
- `callLlm` is dependency-injected via opts.callLlm so unit tests can
  exercise the full prompt + parser + evidence-filter chain without
  touching the network.
- response_format: { type: "json_object" } constrains compatible
  providers; parseNarrativeJson() tolerates prose-wrapped output via
  extractFirstJsonObject() for providers that don't enforce JSON mode.
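The parse-and-filter path described above can be sketched as follows. This is an illustrative reconstruction, not the module's actual code — `extractFirstJsonObject` and the filtering helper here are simplified stand-ins for what `parseNarrativeJson()` does internally:

```javascript
// Pull the first balanced {...} object out of prose-wrapped LLM output.
// Note: this sketch does not account for braces inside JSON string values.
function extractFirstJsonObject(text) {
  const start = text.indexOf('{');
  if (start === -1) return null;
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    if (text[i] === '{') depth += 1;
    else if (text[i] === '}') {
      depth -= 1;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null;
}

// Silently drop evidence_ids the model hallucinated (IDs not in the valid set).
function filterEvidenceIds(section, validIds) {
  return {
    ...section,
    evidence_ids: (section.evidence_ids ?? []).filter((id) => validIds.has(id)),
  };
}

const raw = 'Sure! Here is the JSON: {"situation": {"text": "calm", "evidence_ids": ["ev0", "ev99"]}}';
const parsed = JSON.parse(extractFirstJsonObject(raw));
const clean = filterEvidenceIds(parsed.situation, new Set(['ev0', 'ev1']));
// clean.evidence_ids keeps "ev0" and drops the hallucinated "ev99".
```

The brace-counting extractor is what lets providers without JSON mode still succeed: the prose wrapper is discarded and only the first complete object is parsed.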

## seed-regional-snapshots.mjs compute order

Step 10 of the Phase 0 pipeline (an empty-stub placeholder) becomes a real
LLM call, now at step 13 after reordering: regime derivation must happen
BEFORE narrative generation because the prompt consumes the regime label.
Final step list:

    1. read sources (caller)
    2. pre_meta
    3. balance vector
    4. actors
    5. triggers
    6. scenarios
    7. transmissions
    8. mobility (still empty; Phase 2)
    9. evidence
   10. snapshot_id
   11. read previous + derive regime
   12. build snapshot-for-prompt
   13. generateRegionalNarrative (ship-empty on failure)
   14. splice narrative into tentative snapshot
   15. diff → trigger_reason
   16. final_meta with narrative_provider / narrative_model

buildFinalMeta already accepted these fields from Phase 0 — only the
seed writer needed to pass them through.

## Tests: 23 new unit tests

- buildNarrativePrompt (7): balance/actors/regime/evidence are rendered,
  empty-evidence fallback, missing optional fields don't throw.
- parseNarrativeJson (7): clean JSON, hallucinated-ID filtering, prose
  extraction, all-empty invalid, garbage invalid, null/empty input,
  watch_items cap.
- generateRegionalNarrative (7): success path, global-region skip (asserts
  callLlm is never called), null result, garbage text, thrown error
  doesn't propagate, end-to-end evidence filtering, provider name captured.
- emptyNarrative (2): shape and no-shared-state.

Full suite passes 4375/4375; typecheck + typecheck:api + biome lint clean.

* fix(intelligence): address 3 review findings on #2960

Three review findings from PR #2960, all in scripts/regional-snapshot/narrative.mjs.

## P2 - provider fallback on malformed responses

callLlmDefault() previously returned on the first non-empty response
regardless of whether it parsed. Groq returning prose, truncated JSON,
or an all-empty object would short-circuit the whole chain so OpenRouter
never got a chance — the most common LLM failure mode slipped past the
backup.

Fix: callLlmDefault now accepts an optional `validate(text)` callback
and continues to the next provider when a response fails validation.
generateRegionalNarrative wires up a validator that runs
parseNarrativeJson against the prompt-visible evidence set — so any
provider returning prose, truncated JSON, or all-empty sections falls
through to the next provider.
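The fallback-with-validation loop can be sketched like this. Provider objects and the completion call are simplified stand-ins for the real module's inline-provider pattern, and `looksLikeNarrative` only approximates the real validator (which runs `parseNarrativeJson` against the prompt-visible evidence set):

```javascript
// Minimal sketch of the P2 fix: a provider chain that treats a response
// failing validation the same as an empty or errored response.
async function callLlmChain(providers, prompt, validate) {
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt); // one network call per provider
      if (!text) continue;                          // empty response -> next provider
      if (validate && !validate(text)) continue;    // malformed response -> next provider
      return { text, provider: provider.name };
    } catch {
      // network/HTTP error -> next provider
    }
  }
  return null; // every provider failed or was malformed -> caller ships empty
}

// A validator in the spirit of the real one: accept only responses that
// parse as JSON with at least one non-empty section.
const looksLikeNarrative = (text) => {
  try {
    const obj = JSON.parse(text);
    return Object.values(obj).some((v) => v && Object.keys(v).length > 0);
  } catch {
    return false;
  }
};
```

With this shape, Groq returning prose no longer short-circuits the chain; OpenRouter gets its turn, and only when every provider is malformed does the caller fall back to `emptyNarrative()`.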

## P2 - evidence validator scoped to prompt-visible slice

The prompt rendered only the first 15 evidence items, but the parser's
valid-evidence-ID whitelist was built from the full evidence array.
That weakened the grounding guarantee: the LLM could cite IDs it never
actually saw and the parser would accept them.

Fix: added selectPromptEvidence(evidence) helper that slices once.
generateRegionalNarrative calls it, then passes the SAME slice into
both buildNarrativePrompt and the validEvidenceIds set used by the
parser. Removed the duplicate slice that used to live inside
buildNarrativePrompt — callers are now responsible for pre-slicing.
This also simplified buildNarrativePrompt's contract (no hidden cap).
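The single-slice contract can be sketched as follows — the helper name and cap come from the diff, but the body here is illustrative:

```javascript
// Sketch of selectPromptEvidence: slice ONCE, then feed the same slice to
// both the prompt builder and the validator's whitelist.
const MAX_EVIDENCE_IN_PROMPT = 15;

function selectPromptEvidence(evidence) {
  if (!Array.isArray(evidence)) return [];
  return evidence.slice(0, MAX_EVIDENCE_IN_PROMPT);
}

// Because both consumers derive from the same slice, the LLM can only cite
// IDs it was actually shown in the prompt.
const evidence = Array.from({ length: 20 }, (_, i) => ({ id: `ev${i}` }));
const promptEvidence = selectPromptEvidence(evidence);
const validEvidenceIds = new Set(promptEvidence.map((e) => e.id));
// validEvidenceIds covers ev0..ev14; a citation to ev16 would be stripped.
```

The design choice is that the cap lives in exactly one place: callers pre-slice, so the prompt builder and the parser cannot drift apart again.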

## P3 - narrative_model records the actual model

callLlmDefault returned the requested provider.model regardless of
what the API resolved it to. Some providers (OpenRouter routing,
model aliases) return a different concrete model via json.model, and
the persisted narrative_model metadata was misreporting reality.

Fix: return `json?.model || provider.model` so the SnapshotMeta field
reflects what actually ran.
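The one-line fix, sketched — `json` and `provider` are simplified stand-ins for the parsed chat-completion body and the provider config:

```javascript
// P3 fix: prefer the concrete model the API response reports (json.model)
// over the model that was requested (provider.model).
function resolveModel(json, provider) {
  return json?.model || provider.model;
}

// A router that resolved an alias to a concrete model:
resolveModel({ model: 'vendor/concrete-model' }, { model: 'vendor/alias' });
// -> 'vendor/concrete-model'

// A response with no model field falls back to the requested one:
resolveModel({}, { model: 'vendor/alias' });
// -> 'vendor/alias'
```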

## Tests — 10 new regression tests

selectPromptEvidence (3):
  - caps at MAX_EVIDENCE_IN_PROMPT (15)
  - handles non-array input
  - returns full array when under cap

provider fallback on malformed response (4):
  - prose from first provider -> falls through to second
  - truncated JSON from first provider -> falls through to second
  - all-empty JSON object from first provider -> falls through
  - all providers malformed -> empty narrative

evidence validator scope (2):
  - hallucinated citation to ev16 (beyond 15-item window) stripped
  - citations to ev0 and ev14 (at window edges) preserved

narrative_model actual output (1):
  - json.model value flows through to result.model

## Verification

- npm run test:data: 4385/4385 pass (was 4375; +10 regression tests)
- npm run typecheck: clean
- npm run typecheck:api: clean
- biome lint on touched files: clean
2026-04-11 23:39:20 +04:00


#!/usr/bin/env node
// @ts-check
/**
 * Regional Intelligence snapshot seeder.
 *
 * Computes a RegionalSnapshot per region using deterministic scoring across
 * seven balance axes, derives a regime label, scores actors, evaluates
 * structured trigger thresholds, builds normalized scenario sets, resolves
 * pre-built transmission templates, and persists to Redis with idempotency.
 *
 * Phase 1 (PR2): LLM narrative layer added. One structured-JSON call per
 * region via generateRegionalNarrative(), ship-empty on any failure. The
 * 'global' region is skipped inside the generator. Provider + model flow
 * through SnapshotMeta.narrative_provider / narrative_model.
 *
 * Architecture: docs/internal/pro-regional-intelligence-upgrade.md
 * Engineering: docs/internal/pro-regional-intelligence-appendix-engineering.md
 * Scoring: docs/internal/pro-regional-intelligence-appendix-scoring.md
 *
 * Run via the seed bundle (recommended) or directly:
 *   node scripts/seed-regional-snapshots.mjs
 */
import { pathToFileURL } from 'node:url';
import { loadEnvFile, getRedisCredentials, writeExtraKeyWithMeta } from './_seed-utils.mjs';
// Use scripts/shared mirror rather than the repo-root shared/ folder: the
// Railway bundle service sets rootDirectory=scripts, so `../shared/` resolves
// to filesystem / on deploy and the import fails with ERR_MODULE_NOT_FOUND.
// scripts/shared/* is kept in sync with shared/* via tests.
import { REGIONS, GEOGRAPHY_VERSION } from './shared/geography.js';
import { computeBalanceVector, SCORING_VERSION } from './regional-snapshot/balance-vector.mjs';
import { buildRegimeState } from './regional-snapshot/regime-derivation.mjs';
import { scoreActors } from './regional-snapshot/actor-scoring.mjs';
import { evaluateTriggers } from './regional-snapshot/trigger-evaluator.mjs';
import { buildScenarioSets } from './regional-snapshot/scenario-builder.mjs';
import { resolveTransmissions } from './regional-snapshot/transmission-templates.mjs';
import { collectEvidence } from './regional-snapshot/evidence-collector.mjs';
import { buildPreMeta, buildFinalMeta } from './regional-snapshot/snapshot-meta.mjs';
import { diffRegionalSnapshot, inferTriggerReason } from './regional-snapshot/diff-snapshot.mjs';
import { persistSnapshot, readLatestSnapshot } from './regional-snapshot/persist-snapshot.mjs';
import { ALL_INPUT_KEYS } from './regional-snapshot/freshness.mjs';
import { generateSnapshotId } from './regional-snapshot/_helpers.mjs';
import { generateRegionalNarrative, emptyNarrative } from './regional-snapshot/narrative.mjs';
loadEnvFile(import.meta.url);
const SEED_META_KEY = 'intelligence:regional-snapshots';
/** @returns {Promise<Record<string, any>>} */
async function readAllInputs() {
  const { url, token } = getRedisCredentials();
  const pipeline = ALL_INPUT_KEYS.map((k) => ['GET', k]);
  const resp = await fetch(`${url}/pipeline`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(pipeline),
    signal: AbortSignal.timeout(15_000),
  });
  if (!resp.ok) throw new Error(`Redis pipeline read: HTTP ${resp.status}`);
  const results = await resp.json();
  /** @type {Record<string, any>} */
  const data = {};
  for (let i = 0; i < ALL_INPUT_KEYS.length; i++) {
    const key = ALL_INPUT_KEYS[i];
    const raw = results[i]?.result;
    if (raw === null || raw === undefined) {
      data[key] = null;
      continue;
    }
    try {
      data[key] = JSON.parse(raw);
    } catch {
      data[key] = null;
    }
  }
  return data;
}
/**
 * Run the full compute pipeline for one region in the canonical order.
 *
 *  1. (sources already read by caller)
 *  2. pre_meta
 *  3. balance vector
 *  4. actors
 *  5. triggers (BEFORE scenarios)
 *  6. scenarios (normalized)
 *  7. transmissions
 *  8. mobility (empty in Phase 0)
 *  9. evidence
 * 10. snapshot_id
 * 11. read previous + derive regime
 * 12. build snapshot-for-prompt (no narrative yet)
 * 13. LLM narrative call (ship-empty on failure; skipped for 'global')
 * 14. splice narrative into tentative snapshot
 * 15. diff → trigger_reason
 * 16. final_meta with narrative_provider/narrative_model
 */
async function computeSnapshot(regionId, sources) {
  // Step 2: pre-meta
  const { pre } = buildPreMeta(sources, SCORING_VERSION, GEOGRAPHY_VERSION);
  // Step 3: balance vector
  const { vector: balance } = computeBalanceVector(regionId, sources);
  // Step 4: actors
  const { actors, edges } = scoreActors(regionId, sources);
  // Step 5: triggers (before scenarios)
  const triggers = evaluateTriggers(regionId, sources, balance);
  // Step 6: scenarios (normalized to 1.0 per horizon)
  const scenarioSets = buildScenarioSets(regionId, sources, triggers);
  // Step 7: transmissions (matched to active triggers)
  const transmissionPaths = resolveTransmissions(regionId, triggers);
  // Step 8: mobility (empty in Phase 0 - see appendix Mobility Input Keys)
  const mobility = {
    airspace: [],
    flight_corridors: [],
    airports: [],
    reroute_intensity: 0,
    notam_closures: [],
  };
  // Step 9: evidence chain
  const evidence = collectEvidence(regionId, sources);
  // Step 10: snapshot_id
  const snapshotId = generateSnapshotId();
  // Step 11: read previous + derive regime. Must happen before narrative
  // generation because the prompt consumes the regime label.
  const previous = await readLatestSnapshot(regionId).catch(() => null);
  const previousLabel = previous?.regime?.label ?? '';
  const regime = buildRegimeState(balance, previousLabel, '');
  // Step 12: snapshot-shaped input for the narrative prompt. The narrative
  // generator reads regime/balance/actors/scenarios/triggers/evidence from
  // this object and does NOT inspect `meta` or the placeholder narrative.
  // Meta here is a throwaway — the real meta is built after diff so
  // trigger_reason and narrative_* can flow in together.
  const snapshotForPrompt = {
    region_id: regionId,
    generated_at: Date.now(),
    meta: buildFinalMeta(pre, { snapshot_id: snapshotId, trigger_reason: 'scheduled_6h' }),
    regime,
    balance,
    actors,
    leverage_edges: edges,
    scenario_sets: scenarioSets,
    transmission_paths: transmissionPaths,
    triggers,
    mobility,
    evidence,
    narrative: emptyNarrative(),
  };
  // Step 13: LLM narrative. Ship-empty on any failure — the snapshot remains
  // valuable without the narrative, and the narrative generator itself
  // never throws. 'global' is skipped inside the generator.
  const region = REGIONS.find((r) => r.id === regionId);
  const narrativeResult = region
    ? await generateRegionalNarrative(region, snapshotForPrompt, evidence)
    : { narrative: emptyNarrative(), provider: '', model: '' };
  // Step 14: tentative snapshot with the real narrative spliced in.
  const tentativeSnapshot = {
    ...snapshotForPrompt,
    narrative: narrativeResult.narrative,
  };
  // Step 15: diff against previous for trigger_reason inference
  const diff = diffRegionalSnapshot(previous, tentativeSnapshot);
  const triggerReason = inferTriggerReason(diff);
  // Step 16: final_meta with diff-derived trigger_reason and narrative metadata
  const finalMeta = buildFinalMeta(pre, {
    snapshot_id: snapshotId,
    trigger_reason: triggerReason,
    narrative_provider: narrativeResult.provider,
    narrative_model: narrativeResult.model,
  });
  // Return the snapshot WITHOUT the diff. The diff is a runtime artifact for
  // alert emission; persisting it would leak a non-RegionalSnapshot field into
  // Redis and break Phase 1 proto codegen consumers.
  /** @type {import('../shared/regions.types.js').RegionalSnapshot} */
  const snapshot = { ...tentativeSnapshot, meta: finalMeta };
  return { snapshot, diff };
}
async function main() {
  const t0 = Date.now();
  console.log(`[regional-snapshots] Starting compute for ${REGIONS.length} regions`);
  // Step 1: read all inputs once (shared across regions)
  const sources = await readAllInputs();
  const presentKeys = Object.entries(sources).filter(([, v]) => v !== null).length;
  console.log(`[regional-snapshots] Read inputs: ${presentKeys}/${ALL_INPUT_KEYS.length} keys present`);
  let persisted = 0;
  let skipped = 0;
  let failed = 0;
  const summary = [];
  const failedRegions = [];
  for (const region of REGIONS) {
    try {
      const { snapshot } = await computeSnapshot(region.id, sources);
      const result = await persistSnapshot(snapshot);
      if (result.persisted) {
        persisted += 1;
        summary.push({
          region: region.id,
          regime: snapshot.regime.label,
          confidence: snapshot.meta.snapshot_confidence,
          active_triggers: snapshot.triggers.active.length,
          trigger_reason: snapshot.meta.trigger_reason,
        });
        console.log(`[${region.id}] persisted regime=${snapshot.regime.label} confidence=${snapshot.meta.snapshot_confidence} triggers=${snapshot.triggers.active.length} reason=${snapshot.meta.trigger_reason}`);
      } else {
        skipped += 1;
        console.log(`[${region.id}] skipped: ${result.reason}`);
      }
    } catch (err) {
      failed += 1;
      failedRegions.push({ region: region.id, error: String(/** @type {any} */ (err)?.message ?? err) });
      console.error(`[${region.id}] FAILED: ${/** @type {any} */ (err)?.message ?? err}`);
    }
  }
  // Health policy:
  // 1. persisted > 0 && failed === 0: write the fresh summary + seed-meta.
  // 2. persisted === 0 && failed === 0: all regions dedup-skipped (e.g., a
  //    retry within the 15min idempotency bucket). Preserve the prior good
  //    summary by skipping the write entirely. api/health.js classifies an
  //    empty `regions: []` + `recordCount: 0` as EMPTY_DATA which flips the
  //    overall health to red, so overwriting on a no-op retry is actively
  //    harmful. The 12h maxStaleMin budget lets the next full run refresh
  //    the payload naturally.
  // 3. failed > 0: skip the meta write so /api/health flips to STALE after
  //    the maxStaleMin budget on persistent degradation instead of silently
  //    reporting OK. The bundle runner's freshness gate retries next cycle.
  const elapsed = ((Date.now() - t0) / 1000).toFixed(1);
  if (failed === 0 && persisted > 0) {
    const ttlSec = 12 * 60 * 60; // 12h, 2x the 6h cron cadence
    await writeExtraKeyWithMeta(
      'intelligence:regional-snapshots:summary:v1',
      { regions: summary, generatedAt: Date.now() },
      ttlSec,
      persisted,
      `seed-meta:${SEED_META_KEY}`,
      ttlSec,
    );
    console.log(`[regional-snapshots] Done in ${elapsed}s: persisted=${persisted} skipped=${skipped} failed=0`);
    return;
  }
  if (failed === 0) {
    // All regions dedup-skipped. Preserve the prior summary and return cleanly.
    console.log(`[regional-snapshots] Done in ${elapsed}s: persisted=0 skipped=${skipped} failed=0 (all dedup-skipped, prior summary preserved)`);
    return;
  }
  console.error(`[regional-snapshots] Done in ${elapsed}s: persisted=${persisted} skipped=${skipped} failed=${failed}`);
  for (const f of failedRegions) {
    console.error(`  [${f.region}] ${f.error}`);
  }
  console.error('[regional-snapshots] Skipping seed-meta write due to partial failure. /api/health will reflect degradation after 12h.');
  process.exit(1);
}
const isDirectRun = process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href;
if (isDirectRun) {
  main().catch((err) => {
    console.error(`PUBLISH FAILED: ${err?.message || err}`);
    process.exit(1);
  });
}
export { main, computeSnapshot, readAllInputs };