feat(simulation): MiroFish Phase 2 — theater-limited simulation runner (#2220)

* feat(simulation): MiroFish Phase 2 — theater-limited simulation runner

Adds the simulation execution layer that consumes simulation-package.json
and produces simulation-outcome.json for maritime chokepoint + energy/logistics
theaters, closing the WorldMonitor → MiroFish handoff loop.

Changes:
- scripts/seed-forecasts.mjs: 2-round LLM simulation runner (prompt builders,
  JSON extractor, runTheaterSimulation, writeSimulationOutcome, task queue
  with NX dedup lock, runSimulationWorker poll loop)
- scripts/process-simulation-tasks.mjs: standalone worker entry point
- proto: GetSimulationOutcome RPC + make generate
- server/worldmonitor/forecast/v1/get-simulation-outcome.ts: RPC handler
- server/gateway.ts: slow tier for get-simulation-outcome
- api/health.js: simulationOutcomeLatest in STANDALONE + ON_DEMAND keys
- tests: 14 new tests for simulation runner functions

* fix(simulation): address P1/P2 code review findings from PR #2220

Security (P1 #018):
- sanitizeForPrompt() applied to all entity/seed fields interpolated into
  Round 1 prompt (entityId, class, stance, seedId, type, timing)
- sanitizeForPrompt() applied to actorId and entityIds in Round 2 prompt
- sanitizeForPrompt() + length caps applied to all LLM array fields written
  to R2 (dominantReactions, stabilizers, invalidators, keyActors, timingMarkers)

Validation (P1 #019):
- Added validateRunId() regex guard
- Applied in enqueueSimulationTask() and processNextSimulationTask() loop

Type safety (P1 #020):
- Added isOutcomePointer() and isPackagePointer() type guards in TS handlers
- Replaced unsafe as-casts with runtime-validated guards in both handlers

Correctness (P2 #022):
- Log warning when pkgPointer.runId does not match task runId

Architecture (P2 #024):
- isMaritimeChokeEnergyCandidate() accepts both flat and nested topBucketId
- Call site simplified to pass theater directly

Performance (P2 #025):
- SIMULATION_ROUND1_MAX_TOKENS raised 1800 to 2200
- Added max 3 initialReactions instruction to Round 1 prompt

Maintainability (P2 #026):
- Simulation pointer keys exported from server/_shared/cache-keys.ts
- Both TS handlers import from shared location

Documentation (P2 #027):
- Strengthened runId no-op description in proto and OpenAPI spec

* fix(todos): add blank lines around lists in markdown todo files

* style(api): reformat openapi yaml to match linter output

* test(simulation): add flat-shape filter test + getSimulationOutcome handler coverage

Two tests identified as missing during PR #2220 review:

1. isMaritimeChokeEnergyCandidate flat-shape tests — covers the || candidate.topBucketId
   normalization added in the P1/P2 review pass. The existing tests only used the nested
   marketContext.topBucketId shape; this adds the flat root-field shape that arrives from
   the simulation-package.json JSON (selectedTheaters entries have topBucketId at root).

2. getSimulationOutcome handler structural tests — verifies the isOutcomePointer guard,
   found:false NOT_FOUND return, found:true success path, note population on runId mismatch,
   and redis_unavailable error string. Follows the readSrc static-analysis pattern used
   elsewhere in server-handlers.test.mjs (handler imports Redis so full integration test
   would require a test Redis instance).
This commit is contained in:
Elie Habib
2026-03-25 13:55:59 +04:00
committed by GitHub
parent 8f27a871f5
commit 01f6057389
28 changed files with 1811 additions and 7 deletions

View File

@@ -97,6 +97,7 @@ const STANDALONE_KEYS = {
marketImplications: 'intelligence:market-implications:v1',
hormuzTracker: 'supply_chain:hormuz_tracker:v1',
simulationPackageLatest: 'forecast:simulation-package:latest',
simulationOutcomeLatest: 'forecast:simulation-outcome:latest',
};
const SEED_META = {
@@ -194,6 +195,7 @@ const ON_DEMAND_KEYS = new Set([
'militaryForecastInputs', // intermediate seed-to-seed pipeline key; only populated after seed-military-flights runs
'marketImplications', // LLM-generated inside forecast cron; can fail silently on LLM errors — degrade to WARN not CRIT
'simulationPackageLatest', // written by writeSimulationPackage after deep forecast runs; only present after first successful deep run
'simulationOutcomeLatest', // written by writeSimulationOutcome after simulation runs; only present after first successful simulation
]);
// Keys where 0 records is a valid healthy state (e.g. no airports closed).

File diff suppressed because one or more lines are too long

View File

@@ -71,6 +71,41 @@ paths:
application/json:
schema:
$ref: '#/components/schemas/Error'
/api/forecast/v1/get-simulation-outcome:
get:
tags:
- ForecastService
summary: GetSimulationOutcome
operationId: GetSimulationOutcome
parameters:
- name: runId
in: query
description: |-
IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied
and you need to detect a mismatch between requested and returned run.
required: false
schema:
type: string
responses:
"200":
description: Successful response
content:
application/json:
schema:
$ref: '#/components/schemas/GetSimulationOutcomeResponse'
"400":
description: Validation error
content:
application/json:
schema:
$ref: '#/components/schemas/ValidationError'
default:
description: Error response
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
components:
schemas:
Error:
@@ -393,3 +428,40 @@ components:
description: |-
Populated when the Redis lookup failed. Distinguish from healthy not-found (found=false, error="").
Value: "redis_unavailable" on Redis errors.
GetSimulationOutcomeRequest:
type: object
properties:
runId:
type: string
description: |-
IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied
and you need to detect a mismatch between requested and returned run.
GetSimulationOutcomeResponse:
type: object
properties:
found:
type: boolean
runId:
type: string
outcomeKey:
type: string
schemaVersion:
type: string
theaterCount:
type: integer
format: int32
generatedAt:
type: integer
format: int64
description: 'Unix timestamp in milliseconds (from Date.now()). Warning: Values > 2^53 may lose precision in JavaScript.. Warning: Values > 2^53 may lose precision in JavaScript'
note:
type: string
description: |-
Populated when req.runId was supplied but does not match the returned outcome's runId.
Indicates that per-run filtering is not yet active and the latest outcome was returned instead.
error:
type: string
description: |-
Populated when the Redis lookup failed. Distinguish from healthy not-found (found=false, error="").
Value: "redis_unavailable" on Redis errors.

View File

@@ -0,0 +1,28 @@
syntax = "proto3";
package worldmonitor.forecast.v1;
import "sebuf/http/annotations.proto";
message GetSimulationOutcomeRequest {
// IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
// Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied
// and you need to detect a mismatch between requested and returned run.
string run_id = 1 [(sebuf.http.query) = { name: "runId" }];
}
message GetSimulationOutcomeResponse {
bool found = 1;
string run_id = 2;
string outcome_key = 3;
string schema_version = 4;
int32 theater_count = 5;
// Unix timestamp in milliseconds (from Date.now()). Warning: Values > 2^53 may lose precision in JavaScript.
int64 generated_at = 6 [(sebuf.http.int64_encoding) = INT64_ENCODING_NUMBER];
// Populated when req.runId was supplied but does not match the returned outcome's runId.
// Indicates that per-run filtering is not yet active and the latest outcome was returned instead.
string note = 7;
// Populated when the Redis lookup failed. Distinguish from healthy not-found (found=false, error="").
// Value: "redis_unavailable" on Redis errors.
string error = 8;
}

View File

@@ -5,6 +5,7 @@ package worldmonitor.forecast.v1;
import "sebuf/http/annotations.proto";
import "worldmonitor/forecast/v1/get_forecasts.proto";
import "worldmonitor/forecast/v1/get_simulation_package.proto";
import "worldmonitor/forecast/v1/get_simulation_outcome.proto";
service ForecastService {
option (sebuf.http.service_config) = {base_path: "/api/forecast/v1"};
@@ -16,4 +17,8 @@ service ForecastService {
rpc GetSimulationPackage(GetSimulationPackageRequest) returns (GetSimulationPackageResponse) {
option (sebuf.http.config) = {path: "/get-simulation-package", method: HTTP_METHOD_GET};
}
rpc GetSimulationOutcome(GetSimulationOutcomeRequest) returns (GetSimulationOutcomeResponse) {
option (sebuf.http.config) = {path: "/get-simulation-outcome", method: HTTP_METHOD_GET};
}
}

View File

@@ -0,0 +1,14 @@
#!/usr/bin/env node
import { loadEnvFile } from './_seed-utils.mjs';
import { runSimulationWorker } from './seed-forecasts.mjs';
loadEnvFile(import.meta.url);
const once = process.argv.includes('--once');
const runId = process.argv.find((arg) => arg.startsWith('--run-id='))?.split('=')[1] || '';
const result = await runSimulationWorker({ once, runId });
if (once && result?.status && result.status !== 'idle') {
console.log(` [Simulation] ${result.status}`);
}

View File

@@ -32,6 +32,17 @@ const FORECAST_DEEP_MAX_CANDIDATES = 3;
const FORECAST_DEEP_RUN_PREFIX = 'seed-data/forecast-traces';
const SIMULATION_PACKAGE_SCHEMA_VERSION = 'v1';
const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
const SIMULATION_OUTCOME_SCHEMA_VERSION = 'v1';
const SIMULATION_RUNNER_VERSION = 'v1';
const SIMULATION_TASK_KEY_PREFIX = 'forecast:simulation-task:v1';
const SIMULATION_TASK_QUEUE_KEY = 'forecast:simulation-task-queue:v1';
const SIMULATION_LOCK_KEY_PREFIX = 'forecast:simulation-lock:v1';
const SIMULATION_ROUND1_MAX_TOKENS = 2200;
const SIMULATION_ROUND2_MAX_TOKENS = 2500;
const SIMULATION_LOCK_TTL_SECONDS = 20 * 60;
const SIMULATION_TASK_TTL_SECONDS = 30 * 60;
const SIMULATION_POLL_INTERVAL_MS = 30 * 1000;
const PUBLISH_MIN_PROBABILITY = 0;
const PANEL_MIN_PROBABILITY = 0.1;
const CANONICAL_PAYLOAD_SOFT_LIMIT_BYTES = 4 * 1024 * 1024;
@@ -11861,7 +11872,8 @@ function isMaritimeChokeEnergyCandidate(candidate) {
const routeKey = candidate.routeFacilityKey || '';
if (!routeKey || !Object.prototype.hasOwnProperty.call(CHOKEPOINT_MARKET_REGIONS, routeKey)) return false;
const bucketArr = candidate.marketBucketIds || [];
const topBucket = candidate.marketContext?.topBucketId || '';
// Accept both nested (marketContext.topBucketId) and flat (topBucketId) shapes
const topBucket = candidate.marketContext?.topBucketId || candidate.topBucketId || '';
return bucketArr.includes('energy') || bucketArr.includes('freight') || topBucket === 'energy' || topBucket === 'freight'
|| SIMULATION_ENERGY_COMMODITY_KEYS.has(candidate.commodityKey || '');
}
@@ -15379,6 +15391,428 @@ if (_isDirectRun) {
});
}
// ---------------------------------------------------------------------------
// MiroFish Phase 2 — Theater-Limited Simulation Runner
// ---------------------------------------------------------------------------
function buildSimulationRound1SystemPrompt(theater, pkg) {
const theaterEntities = (pkg.entities || []).filter(
(e) => !e.relevanceToTheater || e.relevanceToTheater === theater.theaterId,
);
const entityList = theaterEntities.slice(0, 10).map(
(e) => `- ${sanitizeForPrompt(e.entityId)} | ${sanitizeForPrompt(e.name)} | class=${sanitizeForPrompt(e.class)} | stance=${sanitizeForPrompt(e.stance || 'unknown')}`,
).join('\n');
const theaterSeeds = (pkg.eventSeeds || []).filter((s) => s.theaterId === theater.theaterId);
const seedList = theaterSeeds.slice(0, 8).map(
(s) => `- ${sanitizeForPrompt(s.seedId)} [${sanitizeForPrompt(s.type)}] ${sanitizeForPrompt(s.summary)} (${sanitizeForPrompt(s.timing)})`,
).join('\n');
const constraints = (pkg.constraints?.[theater.theaterId] || pkg.constraints?.theater || [])
.map((c) => `- ${sanitizeForPrompt(c)}`).join('\n') || '- No explicit constraints';
const evalTargets = (pkg.evaluationTargets?.[theater.theaterId] || pkg.evaluationTargets?.theater || [])
.map((t) => `- ${sanitizeForPrompt(t)}`).join('\n') || '- General market and security dynamics';
const requirement = sanitizeForPrompt(
pkg.simulationRequirement?.[theater.theaterId] || theater.theaterLabel || theater.theaterId,
);
return `You are a geopolitical simulation engine. Simulate actor behavior for a theater-level disruption scenario.
SIMULATION CONTEXT:
${requirement}
THEATER: ${sanitizeForPrompt(theater.theaterLabel || theater.theaterId)} | Region: ${sanitizeForPrompt(theater.theaterRegion || theater.dominantRegion || '')}
ACTORS (use exact entityId when citing actors):
${entityList || '- (none specified)'}
EVENT SEEDS (cite seedId in reactions where applicable):
${seedList || '- (none specified)'}
CONSTRAINTS:
${constraints}
EVALUATION TARGETS:
${evalTargets}
INSTRUCTIONS:
Generate EXACTLY 3 divergent paths named "escalation", "containment", and "spillover". For each path, model the initial actor reactions in the first 24 hours.
- Actors MUST be from the list above (use their exact entityId)
- Cite event seeds (seedId) in reactions where applicable
- Do NOT invent actors, routes, or commodities not present above
- timing format: "T+0h", "T+6h", "T+12h", "T+24h"
- Maximum 3 initialReactions per path
- note: A brief (≤200 char) meta-observation on the divergence logic
Return ONLY a JSON object with no markdown fences:
{
"paths": [
{
"pathId": "escalation",
"label": "<short label>",
"summary": "<≤200 char summary>",
"initialReactions": [
{ "actorId": "<entityId>", "actorName": "<name>", "action": "<≤120 char>", "timing": "T+0h" }
]
},
{ "pathId": "containment", "label": "...", "summary": "...", "initialReactions": [] },
{ "pathId": "spillover", "label": "...", "summary": "...", "initialReactions": [] }
],
"dominantReactions": ["<actor name>: <action summary>"],
"note": "<meta-observation>"
}`;
}
function buildSimulationRound2SystemPrompt(theater, pkg, round1) {
const r1Paths = (round1?.paths || []).slice(0, 3);
const pathSummaries = r1Paths.map(
(p) => `- ${p.pathId}: ${sanitizeForPrompt(p.summary || '')} — actors: ${(p.initialReactions || []).slice(0, 3).map((r) => sanitizeForPrompt(r.actorId || '')).join(', ')}`,
).join('\n') || '- (no round 1 paths available)';
const theaterEntities = (pkg.entities || []).filter(
(e) => !e.relevanceToTheater || e.relevanceToTheater === theater.theaterId,
);
const entityIds = theaterEntities.slice(0, 10).map((e) => sanitizeForPrompt(e.entityId || '')).join(', ');
const evalTargets = (pkg.evaluationTargets?.[theater.theaterId] || pkg.evaluationTargets?.theater || [])
.map((t) => `- ${sanitizeForPrompt(t)}`).join('\n') || '- General market and security dynamics';
return `You are a geopolitical simulation engine. This is ROUND 2 of a 2-round theater simulation.
THEATER: ${sanitizeForPrompt(theater.theaterLabel || theater.theaterId)} | Region: ${sanitizeForPrompt(theater.theaterRegion || theater.dominantRegion || '')}
ROUND 1 PATH SUMMARIES:
${pathSummaries}
VALID ACTOR IDs: ${entityIds || '(see round 1)'}
EVALUATION TARGETS:
${evalTargets}
INSTRUCTIONS:
For each of the 3 paths from Round 1 (escalation, containment, spillover), generate the EVOLVED outcome after 72 hours.
- keyActors: 2-4 actor IDs that drive this path
- roundByRoundEvolution: 2 entries (round 1 summary, round 2 evolution)
- timingMarkers: 2-4 key events with timing (T+Nh format)
- stabilizers: 2-4 factors that could prevent the worst outcome
- invalidators: 2-4 conditions that would invalidate this path
- confidence: 0.0-1.0 based on evidence strength
Return ONLY a JSON object with no markdown fences:
{
"paths": [
{
"pathId": "escalation",
"label": "<short label>",
"summary": "<≤200 char evolved summary>",
"keyActors": ["<entityId>"],
"roundByRoundEvolution": [
{ "round": 1, "summary": "<≤160 char>" },
{ "round": 2, "summary": "<≤160 char>" }
],
"confidence": 0.0,
"timingMarkers": [{ "event": "<≤80 char>", "timing": "T+Nh" }]
},
{ "pathId": "containment", "label": "...", "summary": "...", "keyActors": [], "roundByRoundEvolution": [], "confidence": 0.0, "timingMarkers": [] },
{ "pathId": "spillover", "label": "...", "summary": "...", "keyActors": [], "roundByRoundEvolution": [], "confidence": 0.0, "timingMarkers": [] }
],
"stabilizers": ["<≤100 char>"],
"invalidators": ["<≤100 char>"],
"globalObservations": "<≤300 char>",
"confidenceNotes": "<≤200 char>"
}`;
}
function tryParseSimulationRoundPayload(text, round) {
try {
const parsed = JSON.parse(text);
if (!Array.isArray(parsed?.paths)) return { paths: null };
const expectedIds = new Set(['escalation', 'containment', 'spillover']);
const paths = parsed.paths.filter((p) => p && expectedIds.has(p.pathId));
if (paths.length === 0) return { paths: null };
if (round === 2) {
return {
paths,
stabilizers: Array.isArray(parsed.stabilizers) ? parsed.stabilizers.map(String).slice(0, 6) : [],
invalidators: Array.isArray(parsed.invalidators) ? parsed.invalidators.map(String).slice(0, 6) : [],
globalObservations: String(parsed.globalObservations || '').slice(0, 300),
confidenceNotes: String(parsed.confidenceNotes || '').slice(0, 200),
};
}
return {
paths,
dominantReactions: Array.isArray(parsed.dominantReactions) ? parsed.dominantReactions.map(String).slice(0, 6) : [],
note: String(parsed.note || '').slice(0, 200),
};
} catch {
return { paths: null };
}
}
function extractSimulationRoundPayload(text, round) {
const cleaned = text
.replace(/<think>[\s\S]*?<\/think>/gi, '')
.replace(/<\|thinking\|>[\s\S]*?<\|\/thinking\|>/gi, '')
.replace(/```json\s*/gi, '```')
.trim();
const candidates = [];
const fencedBlocks = [...cleaned.matchAll(/```([\s\S]*?)```/g)].map((m) => m[1].trim());
candidates.push(...fencedBlocks);
candidates.push(cleaned);
for (const candidate of candidates) {
const trimmed = candidate.trim();
if (!trimmed) continue;
const direct = tryParseSimulationRoundPayload(trimmed, round);
if (direct.paths) return { ...direct, diagnostics: { stage: 'direct', preview: sanitizeForPrompt(trimmed).slice(0, 160) } };
const firstObject = extractFirstJsonObject(trimmed);
if (firstObject) {
const parsed = tryParseSimulationRoundPayload(firstObject, round);
if (parsed.paths) return { ...parsed, diagnostics: { stage: 'extracted', preview: sanitizeForPrompt(firstObject).slice(0, 160) } };
}
}
return { paths: null, diagnostics: { stage: 'no_json', preview: sanitizeForPrompt(cleaned).slice(0, 160) } };
}
async function runTheaterSimulation(theater, pkg) {
const theaterLabel = sanitizeForPrompt(theater.theaterLabel || theater.theaterId);
const userPrompt1 = `Theater: ${theaterLabel}\nRun ID: ${pkg.runId}\nGenerate Round 1 actor reactions for the 3 divergent paths.`;
const r1Raw = await callForecastLLM(
buildSimulationRound1SystemPrompt(theater, pkg),
userPrompt1,
{ ...getForecastLlmCallOptions('simulation_round_1'), stage: 'simulation_round_1', maxTokens: SIMULATION_ROUND1_MAX_TOKENS, temperature: 0 },
);
if (!r1Raw) return { failed: true, reason: 'round1_llm_failed' };
const r1 = extractSimulationRoundPayload(r1Raw.text, 1);
if (!r1.paths) return { failed: true, reason: 'round1_parse_failed', diagnostics: r1.diagnostics };
const userPrompt2 = `Theater: ${theaterLabel}\nRun ID: ${pkg.runId}\nGenerate Round 2 path evolution (72h) based on the Round 1 paths.`;
const r2Raw = await callForecastLLM(
buildSimulationRound2SystemPrompt(theater, pkg, r1),
userPrompt2,
{ ...getForecastLlmCallOptions('simulation_round_2'), stage: 'simulation_round_2', maxTokens: SIMULATION_ROUND2_MAX_TOKENS, temperature: 0 },
);
if (!r2Raw) return { round1: r1, round2: null, failed: false };
const r2 = extractSimulationRoundPayload(r2Raw.text, 2);
return { round1: r1, round2: r2.paths ? r2 : null, failed: false };
}
function buildSimulationOutcomeKey(runId, generatedAt) {
const prefix = buildTraceRunPrefix(runId, generatedAt, FORECAST_DEEP_RUN_PREFIX);
return `${prefix}/simulation-outcome.json`;
}
async function writeSimulationOutcome(pkg, outcome, { storageConfig } = {}) {
const config = storageConfig ?? resolveR2StorageConfig();
if (!config || !pkg?.runId) return null;
const { runId, generatedAt } = pkg;
const outcomeKey = buildSimulationOutcomeKey(runId, generatedAt || Date.now());
await putR2JsonObject(config, outcomeKey, outcome, {
runid: String(runId),
kind: 'simulation_outcome',
schema_version: SIMULATION_OUTCOME_SCHEMA_VERSION,
});
const { url, token } = getRedisCredentials();
await redisCommand(url, token, [
'SET',
SIMULATION_OUTCOME_LATEST_KEY,
JSON.stringify({
runId,
outcomeKey,
schemaVersion: SIMULATION_OUTCOME_SCHEMA_VERSION,
theaterCount: (outcome.theaterResults || []).length,
generatedAt: generatedAt || Date.now(),
}),
'EX',
String(TRACE_REDIS_TTL_SECONDS),
]);
return { outcomeKey };
}
const VALID_RUN_ID_RE = /^\d{13,}-[a-z0-9-]{1,64}$/i;
function validateRunId(runId) { return typeof runId === 'string' && VALID_RUN_ID_RE.test(runId); }
function buildSimulationTaskKey(runId) { return `${SIMULATION_TASK_KEY_PREFIX}:${runId}`; }
function buildSimulationLockKey(runId) { return `${SIMULATION_LOCK_KEY_PREFIX}:${runId}`; }
async function enqueueSimulationTask(runId) {
if (!runId) return { queued: false, reason: 'missing_run_id' };
if (!validateRunId(runId)) return { queued: false, reason: 'invalid_run_id_format' };
const { url, token } = getRedisCredentials();
const queued = await redisCommand(url, token, [
'SET', buildSimulationTaskKey(runId),
JSON.stringify({ runId, createdAt: Date.now() }),
'EX', String(SIMULATION_TASK_TTL_SECONDS), 'NX',
]);
if (queued?.result !== 'OK') return { queued: false, reason: 'duplicate' };
await redisCommand(url, token, ['ZADD', SIMULATION_TASK_QUEUE_KEY, String(Date.now()), runId]);
await redisCommand(url, token, ['EXPIRE', SIMULATION_TASK_QUEUE_KEY, String(TRACE_REDIS_TTL_SECONDS)]);
return { queued: true, reason: '' };
}
async function claimSimulationTask(runId, workerId) {
if (!runId) return null;
const { url, token } = getRedisCredentials();
const lockKey = buildSimulationLockKey(runId);
const claim = await redisCommand(url, token, [
'SET', lockKey, workerId, 'EX', String(SIMULATION_LOCK_TTL_SECONDS), 'NX',
]);
if (claim?.result !== 'OK') return null;
const taskRaw = await redisGet(url, token, buildSimulationTaskKey(runId));
if (!taskRaw?.runId) {
await redisDel(url, token, lockKey);
return null;
}
return taskRaw;
}
async function completeSimulationTask(runId) {
if (!runId) return;
const { url, token } = getRedisCredentials();
await redisCommand(url, token, ['ZREM', SIMULATION_TASK_QUEUE_KEY, runId]);
await redisDel(url, token, buildSimulationTaskKey(runId));
await redisDel(url, token, buildSimulationLockKey(runId));
}
async function listQueuedSimulationTasks(limit = 10) {
const { url, token } = getRedisCredentials();
const response = await redisCommand(url, token, [
'ZRANGE', SIMULATION_TASK_QUEUE_KEY, '0', String(Math.max(0, limit - 1)),
]);
return Array.isArray(response?.result) ? response.result : [];
}
async function processNextSimulationTask(options = {}) {
const workerId = options.workerId || `sim-worker-${process.pid}-${Date.now()}`;
const queuedRunIds = options.runId ? [options.runId] : await listQueuedSimulationTasks(10);
for (const runId of queuedRunIds) {
if (!validateRunId(runId)) {
console.warn(` [Simulation] Skipping invalid runId format: ${String(runId).slice(0, 80)}`);
continue;
}
const task = await claimSimulationTask(runId, workerId);
if (!task) continue;
try {
const { url, token } = getRedisCredentials();
// Idempotency: skip if already processed for this runId
const existing = await redisGet(url, token, SIMULATION_OUTCOME_LATEST_KEY);
if (existing?.runId === runId) {
console.log(` [Simulation] Skipping ${runId} — outcome already written`);
await completeSimulationTask(runId);
return { status: 'skipped', reason: 'already_processed', runId };
}
// Read package pointer from Redis
const pkgPointer = await redisGet(url, token, SIMULATION_PACKAGE_LATEST_KEY);
if (!pkgPointer?.pkgKey) {
console.warn(` [Simulation] No package pointer for ${runId}`);
await completeSimulationTask(runId);
return { status: 'failed', reason: 'no_package_pointer', runId };
}
if (pkgPointer.runId && pkgPointer.runId !== runId) {
console.warn(` [Simulation] Package runId mismatch: task=${runId} pkg=${pkgPointer.runId} — using latest package (Phase 2 behaviour)`);
}
const storageConfig = resolveR2StorageConfig();
if (!storageConfig) {
await completeSimulationTask(runId);
return { status: 'failed', reason: 'no_storage_config', runId };
}
const pkgData = await getR2JsonObject(storageConfig, pkgPointer.pkgKey);
if (!pkgData?.selectedTheaters) {
await completeSimulationTask(runId);
return { status: 'failed', reason: 'package_read_failed', runId };
}
// Phase 2 scope: maritime chokepoint + energy/logistics theaters only
const eligibleTheaters = (pkgData.selectedTheaters || []).filter((t) =>
isMaritimeChokeEnergyCandidate(t),
);
console.log(` [Simulation] ${runId}: ${eligibleTheaters.length}/${pkgData.selectedTheaters.length} theaters eligible`);
const theaterResults = [];
const failedTheaters = [];
for (const theater of eligibleTheaters) {
console.log(` [Simulation] Running theater: ${theater.theaterId}`);
const result = await runTheaterSimulation(theater, pkgData);
if (result.failed) {
console.warn(` [Simulation] Theater ${theater.theaterId} failed: ${result.reason}`);
failedTheaters.push({ theaterId: theater.theaterId, reason: result.reason });
continue;
}
const r2Paths = result.round2?.paths || [];
const r1Paths = result.round1?.paths || [];
const mergedPaths = (r2Paths.length ? r2Paths : r1Paths).map((p) => {
const r1Path = r1Paths.find((r) => r.pathId === p.pathId);
return {
pathId: p.pathId,
label: sanitizeForPrompt(p.label || p.pathId).slice(0, 80),
summary: sanitizeForPrompt(p.summary || '').slice(0, 200),
keyActors: Array.isArray(p.keyActors) ? p.keyActors.map((s) => sanitizeForPrompt(String(s)).slice(0, 80)).slice(0, 6) : [],
roundByRoundEvolution: Array.isArray(p.roundByRoundEvolution)
? p.roundByRoundEvolution.map((r) => ({ round: r.round, summary: sanitizeForPrompt(r.summary || '').slice(0, 160) }))
: [{ round: 1, summary: sanitizeForPrompt((r1Path?.summary || p.summary || '')).slice(0, 160) }],
confidence: typeof p.confidence === 'number' ? Math.max(0, Math.min(1, p.confidence)) : 0.5,
timingMarkers: Array.isArray(p.timingMarkers)
? p.timingMarkers.slice(0, 6).map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) }))
: [],
};
});
theaterResults.push({
theaterId: theater.theaterId,
topPaths: mergedPaths,
dominantReactions: (result.round1?.dominantReactions || []).map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
stabilizers: (result.round2?.stabilizers || []).map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
invalidators: (result.round2?.invalidators || []).map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4).map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) })),
});
}
const outcome = {
runId,
schemaVersion: SIMULATION_OUTCOME_SCHEMA_VERSION,
runnerVersion: SIMULATION_RUNNER_VERSION,
sourceSimulationPackageKey: pkgPointer.pkgKey,
theaterResults,
failedTheaters,
globalObservations: eligibleTheaters.length === 0
? 'No maritime chokepoint/energy theaters in package'
: theaterResults.length === 0 ? 'All theaters failed simulation' : '',
confidenceNotes: `${theaterResults.length}/${eligibleTheaters.length} theaters completed`,
generatedAt: pkgData.generatedAt || Date.now(),
};
const writeResult = await writeSimulationOutcome(pkgData, outcome, { storageConfig });
await completeSimulationTask(runId);
console.log(` [Simulation] Completed ${runId}: ${theaterResults.length} theaters → ${writeResult?.outcomeKey}`);
return { status: 'completed', runId, theaterCount: theaterResults.length, outcomeKey: writeResult?.outcomeKey };
} catch (err) {
console.warn(` [Simulation] Task failed for ${runId}: ${err.message}`);
await completeSimulationTask(runId);
return { status: 'failed', reason: err.message, runId };
}
}
return { status: 'idle' };
}
async function runSimulationWorker({ once = false, runId = '' } = {}) {
for (;;) {
const result = await processNextSimulationTask({ runId });
if (once) return result;
if (result?.status === 'idle') await sleep(SIMULATION_POLL_INTERVAL_MS);
}
}
export {
CANONICAL_KEY,
PRIOR_KEY,
@@ -15535,6 +15969,17 @@ export {
enqueueDeepForecastTask,
processNextDeepForecastTask,
runDeepForecastWorker,
SIMULATION_OUTCOME_LATEST_KEY,
SIMULATION_OUTCOME_SCHEMA_VERSION,
buildSimulationOutcomeKey,
writeSimulationOutcome,
buildSimulationRound1SystemPrompt,
buildSimulationRound2SystemPrompt,
extractSimulationRoundPayload,
runTheaterSimulation,
enqueueSimulationTask,
processNextSimulationTask,
runSimulationWorker,
scoreImpactExpansionQuality,
buildImpactExpansionDebugPayload,
runImpactExpansionPromptRefinement,

View File

@@ -1,3 +1,11 @@
/**
* Shared Redis pointer keys for simulation artifacts.
* Defined here so TypeScript handlers and seed scripts agree on the exact string.
* The MJS seed script keeps its own copy (cannot import TS source directly).
*/
export const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
export const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
/**
* Static cache keys for the bootstrap endpoint.
* Only keys with NO request-varying suffixes are included.

View File

@@ -152,6 +152,7 @@ const RPC_CACHE_TIER: Record<string, CacheTier> = {
'/api/prediction/v1/list-prediction-markets': 'medium',
'/api/forecast/v1/get-forecasts': 'medium',
'/api/forecast/v1/get-simulation-package': 'slow',
'/api/forecast/v1/get-simulation-outcome': 'slow',
'/api/supply-chain/v1/get-chokepoint-status': 'medium',
'/api/news/v1/list-feed-digest': 'slow',
'/api/intelligence/v1/get-country-facts': 'daily',

View File

@@ -0,0 +1,45 @@
import type {
ForecastServiceHandler,
ServerContext,
GetSimulationOutcomeRequest,
GetSimulationOutcomeResponse,
} from '../../../../src/generated/server/worldmonitor/forecast/v1/service_server';
import { getRawJson } from '../../../_shared/redis';
import { markNoCacheResponse } from '../../../_shared/response-headers';
import { SIMULATION_OUTCOME_LATEST_KEY } from '../../../_shared/cache-keys';
type OutcomePointer = { runId: string; outcomeKey: string; schemaVersion: string; theaterCount: number; generatedAt: number };
function isOutcomePointer(v: unknown): v is OutcomePointer {
if (!v || typeof v !== 'object') return false;
const o = v as Record<string, unknown>;
return typeof o['runId'] === 'string' && typeof o['outcomeKey'] === 'string'
&& typeof o['schemaVersion'] === 'string' && typeof o['theaterCount'] === 'number'
&& typeof o['generatedAt'] === 'number';
}
const NOT_FOUND: GetSimulationOutcomeResponse = {
found: false, runId: '', outcomeKey: '', schemaVersion: '', theaterCount: 0, generatedAt: 0, note: '', error: '',
};
export const getSimulationOutcome: ForecastServiceHandler['getSimulationOutcome'] = async (
ctx: ServerContext,
req: GetSimulationOutcomeRequest,
): Promise<GetSimulationOutcomeResponse> => {
try {
const raw = await getRawJson(SIMULATION_OUTCOME_LATEST_KEY);
const pointer = isOutcomePointer(raw) ? raw : null;
if (!pointer?.outcomeKey) {
markNoCacheResponse(ctx.request); // don't cache not-found — outcome may appear soon after a simulation run
return NOT_FOUND;
}
const note = req.runId && req.runId !== pointer.runId
? 'runId filter not yet active; returned outcome may differ from requested run'
: '';
return { found: true, runId: pointer.runId, outcomeKey: pointer.outcomeKey, schemaVersion: pointer.schemaVersion, theaterCount: pointer.theaterCount, generatedAt: pointer.generatedAt, note, error: '' };
} catch (err) {
console.warn('[getSimulationOutcome] Redis error:', err instanceof Error ? err.message : String(err));
markNoCacheResponse(ctx.request); // don't cache error state
return { ...NOT_FOUND, error: 'redis_unavailable' };
}
};

View File

@@ -6,8 +6,17 @@ import type {
} from '../../../../src/generated/server/worldmonitor/forecast/v1/service_server';
import { getRawJson } from '../../../_shared/redis';
import { markNoCacheResponse } from '../../../_shared/response-headers';
import { SIMULATION_PACKAGE_LATEST_KEY } from '../../../_shared/cache-keys';
const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
type PackagePointer = { runId: string; pkgKey: string; schemaVersion: string; theaterCount: number; generatedAt: number };
function isPackagePointer(v: unknown): v is PackagePointer {
if (!v || typeof v !== 'object') return false;
const o = v as Record<string, unknown>;
return typeof o['runId'] === 'string' && typeof o['pkgKey'] === 'string'
&& typeof o['schemaVersion'] === 'string' && typeof o['theaterCount'] === 'number'
&& typeof o['generatedAt'] === 'number';
}
const NOT_FOUND: GetSimulationPackageResponse = {
found: false, runId: '', pkgKey: '', schemaVersion: '', theaterCount: 0, generatedAt: 0, note: '', error: '',
@@ -18,9 +27,8 @@ export const getSimulationPackage: ForecastServiceHandler['getSimulationPackage'
req: GetSimulationPackageRequest,
): Promise<GetSimulationPackageResponse> => {
try {
const pointer = await getRawJson(SIMULATION_PACKAGE_LATEST_KEY) as {
runId: string; pkgKey: string; schemaVersion: string; theaterCount: number; generatedAt: number;
} | null;
const raw = await getRawJson(SIMULATION_PACKAGE_LATEST_KEY);
const pointer = isPackagePointer(raw) ? raw : null;
if (!pointer?.pkgKey) {
markNoCacheResponse(ctx.request); // don't cache not-found — package may appear soon after a deep run
return NOT_FOUND;

View File

@@ -1,5 +1,6 @@
import type { ForecastServiceHandler } from '../../../../src/generated/server/worldmonitor/forecast/v1/service_server';
import { getForecasts } from './get-forecasts';
import { getSimulationPackage } from './get-simulation-package';
import { getSimulationOutcome } from './get-simulation-outcome';
export const forecastHandler: ForecastServiceHandler = { getForecasts, getSimulationPackage };
export const forecastHandler: ForecastServiceHandler = { getForecasts, getSimulationPackage, getSimulationOutcome };

View File

@@ -136,6 +136,21 @@ export interface GetSimulationPackageResponse {
error: string;
}
export interface GetSimulationOutcomeRequest {
runId: string;
}
export interface GetSimulationOutcomeResponse {
found: boolean;
runId: string;
outcomeKey: string;
schemaVersion: string;
theaterCount: number;
generatedAt: number;
note: string;
error: string;
}
export interface FieldViolation {
field: string;
description: string;
@@ -235,6 +250,31 @@ export class ForecastServiceClient {
return await resp.json() as GetSimulationPackageResponse;
}
async getSimulationOutcome(req: GetSimulationOutcomeRequest, options?: ForecastServiceCallOptions): Promise<GetSimulationOutcomeResponse> {
let path = "/api/forecast/v1/get-simulation-outcome";
const params = new URLSearchParams();
if (req.runId != null && req.runId !== "") params.set("runId", String(req.runId));
const url = this.baseURL + path + (params.toString() ? "?" + params.toString() : "");
const headers: Record<string, string> = {
"Content-Type": "application/json",
...this.defaultHeaders,
...options?.headers,
};
const resp = await this.fetchFn(url, {
method: "GET",
headers,
signal: options?.signal,
});
if (!resp.ok) {
return this.handleError(resp);
}
return await resp.json() as GetSimulationOutcomeResponse;
}
private async handleError(resp: Response): Promise<never> {
const body = await resp.text();
if (resp.status === 400) {

View File

@@ -136,6 +136,21 @@ export interface GetSimulationPackageResponse {
error: string;
}
export interface GetSimulationOutcomeRequest {
runId: string;
}
export interface GetSimulationOutcomeResponse {
found: boolean;
runId: string;
outcomeKey: string;
schemaVersion: string;
theaterCount: number;
generatedAt: number;
note: string;
error: string;
}
export interface FieldViolation {
field: string;
description: string;
@@ -183,6 +198,7 @@ export interface RouteDescriptor {
export interface ForecastServiceHandler {
getForecasts(ctx: ServerContext, req: GetForecastsRequest): Promise<GetForecastsResponse>;
getSimulationPackage(ctx: ServerContext, req: GetSimulationPackageRequest): Promise<GetSimulationPackageResponse>;
getSimulationOutcome(ctx: ServerContext, req: GetSimulationOutcomeRequest): Promise<GetSimulationOutcomeResponse>;
}
export function createForecastServiceRoutes(
@@ -285,6 +301,53 @@ export function createForecastServiceRoutes(
}
},
},
{
method: "GET",
path: "/api/forecast/v1/get-simulation-outcome",
handler: async (req: Request): Promise<Response> => {
try {
const pathParams: Record<string, string> = {};
const url = new URL(req.url, "http://localhost");
const params = url.searchParams;
const body: GetSimulationOutcomeRequest = {
runId: params.get("runId") ?? "",
};
if (options?.validateRequest) {
const bodyViolations = options.validateRequest("getSimulationOutcome", body);
if (bodyViolations) {
throw new ValidationError(bodyViolations);
}
}
const ctx: ServerContext = {
request: req,
pathParams,
headers: Object.fromEntries(req.headers.entries()),
};
const result = await handler.getSimulationOutcome(ctx, body);
return new Response(JSON.stringify(result as GetSimulationOutcomeResponse), {
status: 200,
headers: { "Content-Type": "application/json" },
});
} catch (err: unknown) {
if (err instanceof ValidationError) {
return new Response(JSON.stringify({ violations: err.violations }), {
status: 400,
headers: { "Content-Type": "application/json" },
});
}
if (options?.onError) {
return options.onError(err, req);
}
const message = err instanceof Error ? err.message : String(err);
return new Response(JSON.stringify({ message }), {
status: 500,
headers: { "Content-Type": "application/json" },
});
}
},
},
];
}

View File

@@ -55,6 +55,13 @@ import {
SIMULATION_PACKAGE_SCHEMA_VERSION,
SIMULATION_PACKAGE_LATEST_KEY,
writeSimulationPackage,
SIMULATION_OUTCOME_LATEST_KEY,
SIMULATION_OUTCOME_SCHEMA_VERSION,
buildSimulationOutcomeKey,
writeSimulationOutcome,
buildSimulationRound1SystemPrompt,
buildSimulationRound2SystemPrompt,
extractSimulationRoundPayload,
} from '../scripts/seed-forecasts.mjs';
import {
@@ -5528,6 +5535,24 @@ describe('simulation package export', () => {
})), true);
});
it('isMaritimeChokeEnergyCandidate accepts candidate with energy bucket on root (flat shape, no marketContext)', () => {
// Flat shape: topBucketId is on the candidate root, no marketContext object.
// This is the package JSON shape written by buildSimulationPackageFromDeepSnapshot.
assert.equal(isMaritimeChokeEnergyCandidate(makeCandidate({
marketContext: undefined,
topBucketId: 'energy',
})), true);
});
it('isMaritimeChokeEnergyCandidate rejects flat shape with non-energy bucket and no energy commodity', () => {
assert.equal(isMaritimeChokeEnergyCandidate(makeCandidate({
marketContext: undefined,
topBucketId: 'semis',
commodityKey: '',
marketBucketIds: ['semis'],
})), false);
});
it('buildSimulationPackageFromDeepSnapshot returns null when no qualifying candidates', () => {
const pkg = buildSimulationPackageFromDeepSnapshot(makeSnapshot([
makeCandidate({ routeFacilityKey: '' }),
@@ -5705,3 +5730,197 @@ describe('simulation package export', () => {
assert.equal(result, null);
});
});
// ---------------------------------------------------------------------------
// MiroFish Phase 2 — Simulation Runner
// ---------------------------------------------------------------------------
const minimalTheater = {
theaterId: 'test-theater-1',
theaterRegion: 'Red Sea',
theaterLabel: 'Red Sea / Bab-el-Mandeb',
candidateStateId: 'state-001',
routeFacilityKey: 'Red Sea',
dominantRegion: 'Middle East',
macroRegions: ['MENA'],
topBucketId: 'energy',
topChannel: 'price_spike',
marketBucketIds: ['energy', 'freight'],
};
const minimalPkg = {
runId: 'run-001',
generatedAt: 1711234567000,
selectedTheaters: [minimalTheater],
entities: [
{ entityId: 'houthi-forces', name: 'Houthi Forces', class: 'military_or_security_actor', region: 'Yemen', stance: 'active', objectives: [], constraints: [], relevanceToTheater: 'test-theater-1' },
{ entityId: 'aramco-exports', name: 'Saudi Aramco', class: 'exporter_or_importer', region: 'Saudi Arabia', stance: 'stressed', objectives: [], constraints: [], relevanceToTheater: 'test-theater-1' },
],
eventSeeds: [
{ seedId: 'seed-1', theaterId: 'test-theater-1', type: 'live_news', summary: 'Houthi missile attack on Red Sea shipping', evidenceRefs: ['E1'], timing: 'T+0h' },
{ seedId: 'seed-2', theaterId: 'test-theater-1', type: 'state_signal', summary: 'Oil tanker rerouting Cape of Good Hope', evidenceRefs: ['E2'], timing: 'T+12h' },
],
constraints: { 'test-theater-1': ['No actor may unilaterally close the Strait of Bab-el-Mandeb'] },
evaluationTargets: { 'test-theater-1': ['Oil price trajectory over 72h', 'Shipping diversion extent'] },
simulationRequirement: { 'test-theater-1': 'Simulate how a Red Sea disruption propagates through energy and logistics markets' },
};
describe('simulation runner — prompt builders', () => {
it('Round 1 prompt contains theater label and region', () => {
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
assert.ok(prompt.includes('Red Sea / Bab-el-Mandeb'), 'should include theater label');
assert.ok(prompt.includes('Red Sea'), 'should include theater region');
});
it('Round 1 prompt contains all 3 required path IDs', () => {
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
assert.ok(prompt.includes('"escalation"'), 'should mention escalation path');
assert.ok(prompt.includes('"containment"'), 'should mention containment path');
assert.ok(prompt.includes('"spillover"'), 'should mention spillover path');
});
it('Round 1 prompt lists entity IDs', () => {
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
assert.ok(prompt.includes('houthi-forces'), 'should include entity entityId');
assert.ok(prompt.includes('aramco-exports'), 'should include entity entityId');
});
it('Round 1 prompt lists event seed IDs', () => {
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
assert.ok(prompt.includes('seed-1'), 'should include seed-1');
assert.ok(prompt.includes('seed-2'), 'should include seed-2');
});
it('Round 1 prompt includes simulation requirement', () => {
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
assert.ok(prompt.includes('Red Sea disruption'), 'should include simulationRequirement text');
});
it('Round 2 prompt contains Round 1 path summaries', () => {
const round1 = {
paths: [
{ pathId: 'escalation', summary: 'Escalation path summary', initialReactions: [{ actorId: 'houthi-forces' }] },
{ pathId: 'containment', summary: 'Containment path summary', initialReactions: [] },
{ pathId: 'spillover', summary: 'Spillover path summary', initialReactions: [] },
],
};
const prompt = buildSimulationRound2SystemPrompt(minimalTheater, minimalPkg, round1);
assert.ok(prompt.includes('Escalation path summary'), 'should include round 1 escalation summary');
assert.ok(prompt.includes('Containment path summary'), 'should include round 1 containment summary');
assert.ok(prompt.includes('ROUND 2'), 'should indicate this is round 2');
});
it('Round 2 prompt includes valid actor IDs list', () => {
const round1 = { paths: [] };
const prompt = buildSimulationRound2SystemPrompt(minimalTheater, minimalPkg, round1);
assert.ok(prompt.includes('houthi-forces'), 'should include valid actor IDs');
});
});
describe('simulation runner — extractSimulationRoundPayload', () => {
const r1Payload = JSON.stringify({
paths: [
{ pathId: 'escalation', label: 'Escalate', summary: 'Forces escalate', initialReactions: [] },
{ pathId: 'containment', label: 'Contain', summary: 'Forces contained', initialReactions: [] },
{ pathId: 'spillover', label: 'Spill', summary: 'Spillover effect', initialReactions: [] },
],
dominantReactions: ['Actor A: escalates'],
note: 'Three divergent paths',
});
const r2Payload = JSON.stringify({
paths: [
{ pathId: 'escalation', label: 'Full Escalation', summary: 'Escalated 72h', keyActors: ['houthi-forces'], roundByRoundEvolution: [{ round: 1, summary: 'Round 1' }, { round: 2, summary: 'Round 2' }], confidence: 0.75, timingMarkers: [{ event: 'First strike', timing: 'T+6h' }] },
{ pathId: 'containment', label: 'Contained', summary: 'Contained 72h', keyActors: [], roundByRoundEvolution: [], confidence: 0.6, timingMarkers: [] },
{ pathId: 'spillover', label: 'Spilled', summary: 'Spillover 72h', keyActors: [], roundByRoundEvolution: [], confidence: 0.4, timingMarkers: [] },
],
stabilizers: ['International pressure'],
invalidators: ['New attack'],
globalObservations: 'Cross-theater ripple effects expected',
confidenceNotes: 'Moderate confidence overall',
});
it('parses valid Round 1 JSON directly', () => {
const result = extractSimulationRoundPayload(r1Payload, 1);
assert.ok(Array.isArray(result.paths), 'should return paths array');
assert.equal(result.paths.length, 3, 'should have 3 paths');
assert.equal(result.paths[0].pathId, 'escalation');
assert.ok(Array.isArray(result.dominantReactions), 'should include dominantReactions');
assert.equal(result.diagnostics.stage, 'direct');
});
it('parses valid Round 2 JSON directly', () => {
const result = extractSimulationRoundPayload(r2Payload, 2);
assert.ok(Array.isArray(result.paths), 'should return paths array');
assert.equal(result.paths.length, 3);
assert.ok(Array.isArray(result.stabilizers), 'should include stabilizers');
assert.ok(Array.isArray(result.invalidators), 'should include invalidators');
assert.ok(typeof result.globalObservations === 'string');
});
it('strips fenced code blocks and parses Round 1', () => {
const fenced = `\`\`\`json\n${r1Payload}\n\`\`\``;
const result = extractSimulationRoundPayload(fenced, 1);
assert.ok(Array.isArray(result.paths), 'should parse fenced JSON');
assert.equal(result.paths.length, 3);
});
it('strips <think> tags before parsing', () => {
const withThink = `<think>internal reasoning here</think>\n${r1Payload}`;
const result = extractSimulationRoundPayload(withThink, 1);
assert.ok(Array.isArray(result.paths), 'should parse after stripping think tags');
});
it('returns null paths on invalid JSON', () => {
const result = extractSimulationRoundPayload('not valid json', 1);
assert.equal(result.paths, null);
assert.equal(result.diagnostics.stage, 'no_json');
});
it('returns null paths when paths array is missing', () => {
const result = extractSimulationRoundPayload('{"no_paths": true}', 1);
assert.equal(result.paths, null);
});
it('returns null paths when no valid pathId present', () => {
const badPaths = JSON.stringify({ paths: [{ pathId: 'unknown', summary: 'x' }] });
const result = extractSimulationRoundPayload(badPaths, 1);
assert.equal(result.paths, null);
});
it('uses extractFirstJsonObject fallback for prefix text', () => {
const withPrefix = `Here is the result:\n${r1Payload}\nEnd.`;
const result = extractSimulationRoundPayload(withPrefix, 1);
assert.ok(Array.isArray(result.paths), 'should parse via extractFirstJsonObject fallback');
});
});
describe('simulation runner — outcome key builder', () => {
it('buildSimulationOutcomeKey produces a key ending in simulation-outcome.json', () => {
const key = buildSimulationOutcomeKey('run-123', 1711234567000);
assert.ok(key.endsWith('/simulation-outcome.json'), `unexpected key: ${key}`);
assert.ok(key.includes('run-123'), 'should include runId');
});
it('SIMULATION_OUTCOME_LATEST_KEY is the canonical Redis pointer key', () => {
assert.equal(SIMULATION_OUTCOME_LATEST_KEY, 'forecast:simulation-outcome:latest');
});
it('SIMULATION_OUTCOME_SCHEMA_VERSION is v1', () => {
assert.equal(SIMULATION_OUTCOME_SCHEMA_VERSION, 'v1');
});
});
describe('simulation runner — writeSimulationOutcome', () => {
it('returns null when R2 storage is not configured', async () => {
const outcome = { theaterResults: [], failedTheaters: [], runId: 'run-001', generatedAt: Date.now() };
const result = await writeSimulationOutcome(minimalPkg, outcome, { storageConfig: null });
assert.equal(result, null);
});
it('returns null when pkg has no runId', async () => {
const outcome = { theaterResults: [], failedTheaters: [] };
const result = await writeSimulationOutcome({ generatedAt: Date.now() }, outcome, { storageConfig: null });
assert.equal(result, null);
});
});

View File

@@ -230,3 +230,54 @@ describe('getVesselSnapshot caching (HIGH-1)', () => {
// NOTE: Full integration test (mocking fetch, verifying cache hits) requires
// a TypeScript-capable test runner. This structural test verifies the pattern.
});
// ========================================================================
// getSimulationOutcome handler — structural tests
// ========================================================================
describe('getSimulationOutcome handler', () => {
const src = readSrc('server/worldmonitor/forecast/v1/get-simulation-outcome.ts');
it('returns found:false (NOT_FOUND) when pointer is absent', () => {
// The handler must define a NOT_FOUND sentinel with found: false
assert.match(src, /found:\s*false/,
'NOT_FOUND constant should set found: false');
// And return it when the pointer is missing
assert.match(src, /return\s+NOT_FOUND/,
'Should return NOT_FOUND when key is absent');
});
it('uses isOutcomePointer type guard before accessing pointer fields', () => {
assert.match(src, /isOutcomePointer\(raw\)/,
'Should use isOutcomePointer type guard on getRawJson result');
// Guard must check string and number fields — not just truthy
assert.match(src, /typeof\s+o\[.runId.\]\s*===\s*'string'/,
'Type guard should verify runId is a string');
assert.match(src, /typeof\s+o\[.theaterCount.\]\s*===\s*'number'/,
'Type guard should verify theaterCount is a number');
});
it('returns found:true with all pointer fields on success', () => {
assert.match(src, /found:\s*true/,
'Success path should return found: true');
// Must propagate all pointer fields
assert.match(src, /outcomeKey:\s*pointer\.outcomeKey/,
'Success path should include outcomeKey from pointer');
assert.match(src, /theaterCount:\s*pointer\.theaterCount/,
'Success path should include theaterCount from pointer');
});
it('populates note when runId supplied but does not match pointer runId', () => {
assert.match(src, /req\.runId.*pointer\.runId/,
'Should compare req.runId with pointer.runId for note');
assert.match(src, /runId filter not yet active/,
'Note text should explain the Phase 3 deferral');
});
it('returns redis_unavailable error string on Redis failure', () => {
assert.match(src, /redis_unavailable/,
'Should return redis_unavailable on catch');
assert.match(src, /markNoCacheResponse.*catch|catch[\s\S]*?markNoCacheResponse/,
'Should mark no-cache on error to avoid caching error state');
});
});

View File

@@ -0,0 +1,79 @@
---
status: complete
priority: p1
issue_id: "018"
tags: [code-review, security, simulation-runner, prompt-injection]
---
# Unsanitized entity/seed fields injected into LLM simulation prompts
## Problem Statement
`buildSimulationRound1SystemPrompt` and `buildSimulationRound2SystemPrompt` interpolate multiple fields directly into LLM system prompts without calling `sanitizeForPrompt`. The fields `e.entityId`, `e.class`, `e.stance`, `s.seedId`, `s.type`, `s.timing`, and Round 1 `r.actorId` all bypass sanitization entirely. These fields originate from external news data processed by the package builder, where `entityId` is derived from actor names extracted from live headlines via regex. A crafted headline can produce an `entityId` that, when embedded in the system prompt with the instruction "use exact entityId when citing actors", forms a valid prompt injection payload.
## Findings
**F-1 (HIGH):** `e.entityId` injected raw with explicit directive to LLM to use it verbatim:
```javascript
// scripts/seed-forecasts.mjs ~line 15402
`- ${e.entityId} | ${sanitizeForPrompt(e.name)} | class=${e.class} | stance=${e.stance || 'unknown'}`
// e.entityId, e.class, e.stance — none sanitized
```
**F-2 (HIGH):** Event seed fields `s.seedId`, `s.type`, `s.timing` injected raw:
```javascript
`- ${s.seedId} [${s.type}] ${sanitizeForPrompt(s.summary)} (${s.timing})`
```
**F-3 (HIGH):** Round 2 prompt uses `r.actorId` from Round 1 LLM output (chaining injection risk):
```javascript
// scripts/seed-forecasts.mjs ~line 15468
actors: ${(p.initialReactions || []).slice(0, 3).map((r) => r.actorId).join(', ')}
// r.actorId comes from LLM JSON output — not sanitized before round 2 injection
```
`sanitizeProposedLlmAddition` exists in the same file and provides keyword-pattern blocking ("ignore", "override", "you must") but is never called on simulation fields.
## Proposed Solutions
### Option A: Apply `sanitizeForPrompt` to all bypassed fields (Recommended)
```javascript
// In buildSimulationRound1SystemPrompt:
const entityList = theaterEntities.slice(0, 10).map(
(e) => `- ${sanitizeForPrompt(e.entityId)} | ${sanitizeForPrompt(e.name)} | class=${sanitizeForPrompt(e.class)} | stance=${sanitizeForPrompt(e.stance || 'unknown')}`,
).join('\n');
const seedList = theaterSeeds.slice(0, 8).map(
(s) => `- ${sanitizeForPrompt(s.seedId)} [${sanitizeForPrompt(s.type)}] ${sanitizeForPrompt(s.summary)} (${sanitizeForPrompt(s.timing)})`,
).join('\n');
// In buildSimulationRound2SystemPrompt:
actors: ${(p.initialReactions || []).slice(0, 3).map((r) => sanitizeForPrompt(r.actorId || '')).join(', ')}
```
Effort: Small | Risk: Low
### Option B: Enforce allowlist regex on `entityId` at package-build time
Add `/^[a-z0-9_\-]{1,80}$/` validation in `buildSimulationPackageEntities` at the point where `entityId` is generated. Reject any ID not matching the pattern. This is defense-in-depth upstream.
Effort: Small | Risk: Low
## Acceptance Criteria
- [ ] All fields interpolated into simulation system prompts are wrapped in `sanitizeForPrompt()`
- [ ] `e.entityId`, `e.class`, `e.stance` sanitized in `buildSimulationRound1SystemPrompt`
- [ ] `s.seedId`, `s.type`, `s.timing` sanitized in `buildSimulationRound1SystemPrompt`
- [ ] `r.actorId` sanitized in `buildSimulationRound2SystemPrompt`
- [ ] Test: entity with `entityId` containing newline + directive text produces sanitized prompt
## Technical Details
- File: `scripts/seed-forecasts.mjs``buildSimulationRound1SystemPrompt` (~line 15397), `buildSimulationRound2SystemPrompt` (~line 15430), `buildSimulationRound2SystemPrompt` (~line 15468)
- Existing function: `sanitizeForPrompt(text)` at line ~13481 — strips newlines, `<>{}`, control chars, truncates at 200 chars
- Related: todo #013 (package-builder sanitization) — this is the downstream consumer gap
## Work Log
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review

View File

@@ -0,0 +1,92 @@
---
status: complete
priority: p1
issue_id: "019"
tags: [code-review, security, simulation-runner, path-traversal]
---
# `runId` flows unvalidated into Redis key construction and R2 path
## Problem Statement
`buildSimulationTaskKey(runId)` and `buildSimulationLockKey(runId)` construct Redis keys via string concatenation using `runId` with no format validation. More critically, `runId` flows into `buildSimulationOutcomeKey``buildTraceRunPrefix` which constructs an R2 key of the form `seed-data/forecast-traces/{year}/{month}/{day}/{runId}/simulation-outcome.json`. A `runId` containing `/../` path traversal sequences could produce an R2 key escaping the intended namespace.
## Findings
**F-1 (HIGH):** R2 path uses `runId` directly in `buildTraceRunPrefix`:
```javascript
// scripts/seed-forecasts.mjs — buildTraceRunPrefix (~line 4407)
`${basePrefix}/${year}/${month}/${day}/${runId}`
// runId containing '/../' produces: seed-data/forecast-traces/2026/03/24/../../../evil
```
**F-2 (MEDIUM):** Redis key construction via simple concatenation:
```javascript
function buildSimulationTaskKey(runId) { return `${SIMULATION_TASK_KEY_PREFIX}:${runId}`; }
function buildSimulationLockKey(runId) { return `${SIMULATION_LOCK_KEY_PREFIX}:${runId}`; }
// No format guard — runId from CLI argv or queue member
```
**F-3 (MEDIUM):** ZADD member in task queue uses raw `runId`:
```javascript
await redisCommand(url, token, ['ZADD', SIMULATION_TASK_QUEUE_KEY, String(Date.now()), runId]);
// If queue is poisoned, `listQueuedSimulationTasks` returns the malformed runId
// which then flows into all downstream key construction
```
Entry points: `process.argv` in `process-simulation-tasks.mjs` (operator-controlled, lower risk) and `listQueuedSimulationTasks` (queue member, higher risk if queue is ever written from an untrusted path).
## Proposed Solutions
### Option A: Validate `runId` format before any key operation (Recommended)
The existing `parseForecastRunGeneratedAt` (~line 4414) matches `/^(\d{10,})/`, suggesting `runId` values are timestamp-prefixed. Enforce this:
```javascript
const VALID_RUN_ID = /^\d{13,}-[a-z0-9\-]{1,64}$/i;
function validateRunId(runId) {
if (!runId || !VALID_RUN_ID.test(runId)) return null;
return runId;
}
// In enqueueSimulationTask:
const safeRunId = validateRunId(runId);
if (!safeRunId) return { queued: false, reason: 'invalid_run_id_format' };
// In processNextSimulationTask, validate each queuedRunId before processing:
for (const rawId of queuedRunIds) {
const runId = validateRunId(rawId);
if (!runId) { console.warn('[Simulation] Skipping malformed runId:', rawId); continue; }
...
}
```
Effort: Small | Risk: Low
### Option B: Sanitize R2 path components
Apply `path.normalize` and prefix-check on the constructed R2 key before write:
```javascript
const key = buildSimulationOutcomeKey(runId, generatedAt);
if (!key.startsWith('seed-data/forecast-traces/')) throw new Error('R2 key escaped namespace');
```
Effort: Small | Risk: Low — defense-in-depth after Option A
## Acceptance Criteria
- [ ] `enqueueSimulationTask` validates `runId` matches expected format before Redis write
- [ ] `processNextSimulationTask` validates each `runId` from queue before key construction
- [ ] R2 key is prefix-checked before write in `writeSimulationOutcome`
- [ ] Invalid `runId` produces `{ queued: false, reason: 'invalid_run_id_format' }` not a silent key operation
- [ ] Test: `runId` of `"../../../evil"` is rejected before Redis/R2 operations
## Technical Details
- Files: `scripts/seed-forecasts.mjs``enqueueSimulationTask` (~line 15636), `buildSimulationTaskKey` (~line 15633), `processNextSimulationTask` (~line 15682), `writeSimulationOutcome` (~line 15613)
- Related: `buildTraceRunPrefix` (~line 4407) — used by all trace artifact key builders
## Work Log
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review

View File

@@ -0,0 +1,78 @@
---
status: complete
priority: p1
issue_id: "020"
tags: [code-review, typescript, simulation-runner, type-safety]
---
# Unvalidated `as` cast on `getRawJson` result in `get-simulation-outcome.ts`
## Problem Statement
`getRawJson` returns `Promise<unknown | null>`. The handler casts the result with `as { runId: string; outcomeKey: string; ... } | null` — a TypeScript compile-time assertion with no runtime enforcement. If Redis contains a malformed value (wrong shape, missing fields, renamed keys from a schema migration), `pointer.runId`, `pointer.outcomeKey`, etc. would be `undefined`, and the handler returns a partially-populated `GetSimulationOutcomeResponse` with `undefined` values spread into proto fields. The same pattern exists in `get-simulation-package.ts` and should be fixed in both files simultaneously.
## Findings
**F-1 (P1):** TypeScript `as` cast provides zero runtime protection:
```typescript
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts line 21
const pointer = await getRawJson(SIMULATION_OUTCOME_LATEST_KEY) as {
runId: string; outcomeKey: string; schemaVersion: string; theaterCount: number; generatedAt: number;
} | null;
// If Redis has { run_id: 'x', outcome_key: 'y' } (snake_case), pointer.runId === undefined
// Handler returns { found: true, runId: undefined, ... } — malformed response
```
**F-2 (P2):** Same pattern in `get-simulation-package.ts` line ~21 — fix both together.
## Proposed Solutions
### Option A: Add a type guard function (Recommended)
```typescript
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts
function isOutcomePointer(v: unknown): v is {
runId: string; outcomeKey: string; schemaVersion: string; theaterCount: number; generatedAt: number;
} {
if (typeof v !== 'object' || v === null) return false;
const p = v as Record<string, unknown>;
return typeof p['runId'] === 'string'
&& typeof p['outcomeKey'] === 'string'
&& typeof p['schemaVersion'] === 'string'
&& typeof p['theaterCount'] === 'number'
&& typeof p['generatedAt'] === 'number';
}
// In handler:
const raw = await getRawJson(SIMULATION_OUTCOME_LATEST_KEY);
if (!isOutcomePointer(raw)) {
markNoCacheResponse(ctx.request);
return NOT_FOUND; // treat malformed as not-found
}
const pointer = raw; // fully typed, no cast
```
Effort: Small | Risk: Low — safe degradation to NOT_FOUND on invalid data
### Option B: Use zod schema validation (heavier but more maintainable)
Add a `z.object({...}).safeParse()` call. Only viable if zod is already in the project dependencies.
## Acceptance Criteria
- [ ] `get-simulation-outcome.ts` uses a type guard instead of `as` cast
- [ ] Malformed Redis value returns `NOT_FOUND` response (not a partially-populated response)
- [ ] `get-simulation-package.ts` receives the same fix simultaneously
- [ ] TypeScript strict mode still passes after the change (no `any` introduced)
- [ ] Test: mocked `getRawJson` returning `{ run_id: 'x' }` (wrong key names) → handler returns `found: false`
## Technical Details
- File: `server/worldmonitor/forecast/v1/get-simulation-outcome.ts` lines 21-23
- File: `server/worldmonitor/forecast/v1/get-simulation-package.ts` lines ~21-23 (same pattern)
- `getRawJson` return type: `Promise<unknown | null>` — correct to return unknown
## Work Log
- 2026-03-24: Found by compound-engineering:review:kieran-typescript-reviewer in PR #2220 review

View File

@@ -0,0 +1,68 @@
---
status: pending
priority: p1
issue_id: "021"
tags: [code-review, agent-native, simulation-runner, api]
---
# No HTTP endpoint to trigger a simulation run — agents cannot initiate simulations
## Problem Statement
Simulation runs can only be triggered by a human operator running `node scripts/process-simulation-tasks.mjs --once` in the Railway environment. `enqueueSimulationTask(runId)` and `runSimulationWorker` are exported from `scripts/seed-forecasts.mjs` but are only callable from worker processes, not via HTTP. Agents operating through the HTTP API (AI Market Implications panel, future orchestration agents, LLM tool calls) have read-only access to the system — they can discover the latest simulation outcome pointer but cannot trigger a new simulation. For a feature described as AI-driven forecasting, agents being permanently blocked from initiating analysis is a design gap.
## Findings
**F-1 (P1):** No `POST /api/forecast/v1/trigger-simulation` or equivalent endpoint exists.
**F-2 (P1):** `enqueueSimulationTask(runId)` is exported and callable, but only from Node.js processes — no HTTP surface.
**F-3 (P2):** Compounded by `runId` filter being a no-op in `getSimulationOutcome` — even if an agent knew its trigger succeeded, it cannot verify its specific run completed vs. a concurrent run superseding it.
**Capability map:**
| Action | Human | Agent (HTTP) |
|---|---|---|
| Check outcome exists | ✅ | ✅ |
| Read outcome pointer | ✅ | ✅ |
| Trigger simulation run | ✅ (Railway CLI) | ❌ |
| Check if run in progress | ✅ (logs) | ❌ |
| Verify specific run completed | ✅ | ❌ (runId filter no-op) |
## Proposed Solutions
### Option A: Add `POST /api/forecast/v1/trigger-simulation` (Recommended)
A thin Vercel handler following the same proto pattern:
1. New proto message: `TriggerSimulationRequest { string run_id = 1; }`, `TriggerSimulationResponse { bool queued = 1; string run_id = 2; string reason = 3; }`
2. New handler: reads `SIMULATION_PACKAGE_LATEST_KEY` from Redis to derive `runId` if not supplied, calls `enqueueSimulationTask(runId)`, returns `{ queued, runId, reason }`
3. The actual execution remains Railway-side (existing poll loop picks it up) — the endpoint only enqueues
4. Rate-limit to 1 trigger per 5 minutes to prevent spam (can reuse existing rate-limit pattern)
Estimated effort: 1 proto file + 1 handler file + 1 service.proto entry + `make generate` — same scope as `get-simulation-outcome.ts`.
### Option B: Webhook trigger from deep forecast completion
When `processNextDeepForecastTask` completes and writes a simulation package, automatically call `enqueueSimulationTask`. This makes simulation trigger automatic rather than agent-driven. Simpler but removes on-demand triggering flexibility.
Effort: Small | Risk: Low — no new HTTP surface, but agents still can't trigger ad-hoc
## Acceptance Criteria
- [ ] `POST /api/forecast/v1/trigger-simulation` returns `{ queued: true, runId }` when package is available
- [ ] Returns `{ queued: false, reason: 'no_package' }` when no simulation package exists
- [ ] Returns `{ queued: false, reason: 'duplicate' }` when the same runId is already queued
- [ ] Rate limited to prevent spam
- [ ] Agent-native: an agent calling the trigger endpoint then polling `getSimulationOutcome` can complete a trigger-and-verify workflow
## Technical Details
- Would-be handler: `server/worldmonitor/forecast/v1/trigger-simulation.ts`
- Entry point: `enqueueSimulationTask(runId)` in `scripts/seed-forecasts.mjs` (already exported)
- Pattern reference: `get-simulation-outcome.ts` for handler structure, `service.proto` for RPC addition
- Related: todo #029 (runId filter no-op) — fix both for complete trigger-and-verify loop
## Work Log
- 2026-03-24: Found by compound-engineering:review:agent-native-reviewer in PR #2220 review

View File

@@ -0,0 +1,63 @@
---
status: complete
priority: p2
issue_id: "022"
tags: [code-review, architecture, simulation-runner, correctness]
---
# `pkgPointer.runId` never compared to task `runId` — can silently simulate wrong package
## Problem Statement
In `processNextSimulationTask`, after claiming a task for `runId=A`, the code reads `SIMULATION_PACKAGE_LATEST_KEY` which returns the *latest* package pointer — not necessarily the one for run A. If a new simulation package for run B is written to Redis while task A is still queued, the worker picks up task A but processes run B's package data. The outcome is written under run A's `runId` but contains run B content. No warning is logged, no error is returned. This is especially relevant in Phase 3 when per-run lookup becomes active.
## Findings
**F-1 (HIGH):** `pkgPointer.runId` is read but never compared to the task's `runId`:
```javascript
// scripts/seed-forecasts.mjs ~line 15697
const pkgPointer = await redisGet(url, token, SIMULATION_PACKAGE_LATEST_KEY);
if (!pkgPointer?.pkgKey) { ... return { status: 'failed', reason: 'no_package_pointer' }; }
// Missing: if (pkgPointer.runId && pkgPointer.runId !== runId) { ... abort ... }
const pkgData = await getR2JsonObject(storageConfig, pkgPointer.pkgKey);
// pkgData.runId !== runId — proceeds to simulate and write outcome under wrong runId
```
## Proposed Solutions
### Option A: Add explicit runId mismatch guard (Recommended)
```javascript
const pkgPointer = await redisGet(url, token, SIMULATION_PACKAGE_LATEST_KEY);
if (!pkgPointer?.pkgKey) { ... return failed; }
// Guard: skip if package is for a different run
if (pkgPointer.runId && pkgPointer.runId !== runId) {
console.warn(` [Simulation] Package mismatch: task=${runId} pkg=${pkgPointer.runId} — skipping`);
await completeSimulationTask(runId);
return { status: 'skipped', reason: 'package_run_mismatch', runId };
}
```
This is non-breaking: if `pkgPointer.runId` is absent (old format), the guard is skipped and behavior is unchanged.
Effort: Small | Risk: Low
### Option B: Accept current behavior, document explicitly
Add a comment explaining that "latest wins" is intentional and document the Phase 3 migration path. Safe for Phase 2 where only one run stream exists.
## Acceptance Criteria
- [ ] Guard added: if `pkgPointer.runId !== runId`, task is completed and `{ status: 'skipped', reason: 'package_run_mismatch' }` returned
- [ ] Log line emitted on mismatch for operational visibility
- [ ] Test: enqueue task for runId A, set package pointer to runId B — processNextSimulationTask returns `skipped/package_run_mismatch`
## Technical Details
- File: `scripts/seed-forecasts.mjs``processNextSimulationTask` (~line 15697)
## Work Log
- 2026-03-24: Found by compound-engineering:review:architecture-strategist in PR #2220 review

View File

@@ -0,0 +1,67 @@
---
status: complete
priority: p2
issue_id: "023"
tags: [code-review, security, simulation-runner, data-integrity]
---
# LLM output arrays written to R2 without per-element sanitization or length limits
## Problem Statement
In `processNextSimulationTask`, several LLM output arrays are written to R2 using only `map(String).slice(0, N)` — which ensures items are strings but applies no length cap per element and no sanitization. A single oversized or injection-containing LLM output item (e.g., a `stabilizers` entry of 50,000 characters) is written directly to R2 and later served to clients without truncation. Additionally, the `timingMarkers` sourced from `result.round2?.paths?.[0]` use a different (non-sanitized) path compared to the per-path `timingMarkers` processing that correctly applies `sanitizeForPrompt`.
## Findings
**F-1 (MEDIUM):**
```javascript
// scripts/seed-forecasts.mjs ~line 15766
dominantReactions: (result.round1?.dominantReactions || []).map(String).slice(0, 6),
stabilizers: (result.round2?.stabilizers || []).map(String).slice(0, 6),
invalidators: (result.round2?.invalidators || []).map(String).slice(0, 6),
keyActors: Array.isArray(p.keyActors) ? p.keyActors.map(String).slice(0, 6) : [],
// No per-element length limit or sanitization — each string can be arbitrarily long
```
**F-2 (MEDIUM):**
```javascript
// ~line 15769 — timingMarkers from round2 paths[0] (different code path than per-path markers)
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4),
// Individual marker objects NOT sanitized — but per-path timingMarkers at ~15757 DO sanitize
```
## Proposed Solutions
### Option A: Apply `sanitizeForPrompt` + length cap to all LLM array elements (Recommended)
```javascript
dominantReactions: (result.round1?.dominantReactions || [])
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
stabilizers: (result.round2?.stabilizers || [])
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
invalidators: (result.round2?.invalidators || [])
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
keyActors: Array.isArray(p.keyActors)
? p.keyActors.map((s) => sanitizeForPrompt(String(s)).slice(0, 80)).slice(0, 6)
: [],
// For timingMarkers at ~15769 — apply same sanitization as per-path version:
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4)
.map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) })),
```
Effort: Small | Risk: Low
## Acceptance Criteria
- [ ] `dominantReactions`, `stabilizers`, `invalidators` elements capped at 120 chars each with `sanitizeForPrompt`
- [ ] `keyActors` elements capped at 80 chars each with `sanitizeForPrompt`
- [ ] `timingMarkers` at the theater-result level uses the same sanitization as per-path version
- [ ] Test: LLM output with a 10,000-char `stabilizers[0]` is truncated to ≤120 chars in R2 artifact
## Technical Details
- File: `scripts/seed-forecasts.mjs``processNextSimulationTask` (~lines 15752, 15766-15769)
## Work Log
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review

View File

@@ -0,0 +1,69 @@
---
status: complete
priority: p2
issue_id: "024"
tags: [code-review, architecture, simulation-runner, schema-drift]
---
# `isMaritimeChokeEnergyCandidate` hand-rolled adapter creates schema drift risk
## Problem Statement
`processNextSimulationTask` calls `isMaritimeChokeEnergyCandidate` with a manually-constructed adapter object mapping fields from `selectedTheaters` items individually. The function expects `{ routeFacilityKey, marketBucketIds, marketContext: { topBucketId }, commodityKey }` but `selectedTheaters` stores `topBucketId` flat (not under `marketContext`). If `selectedTheaters` items ever gain a `marketContext` field directly (as the upstream data model already uses), the manual mapping shadows the real `marketContext.topBucketId` with an empty string. If the function's logic ever expands to use additional fields, the call site silently fails to pass them.
## Findings
**F-1 (MEDIUM):**
```javascript
// scripts/seed-forecasts.mjs ~line 15719
const eligibleTheaters = (pkgData.selectedTheaters || []).filter((t) =>
isMaritimeChokeEnergyCandidate({
routeFacilityKey: t.routeFacilityKey || '',
marketBucketIds: t.marketBucketIds || [],
marketContext: { topBucketId: t.topBucketId || '' }, // t.topBucketId is flat; marketContext is reconstructed
commodityKey: t.commodityKey || '',
}),
);
// If t gains a real marketContext field, the reconstructed one shadows it
// isMaritimeChokeEnergyCandidate called at line 12190 uses the full candidate object directly
```
Two call sites for the same function with different input shapes is a maintenance hazard.
## Proposed Solutions
### Option A: Pass theater object directly, normalize inside the function (Recommended)
Update `isMaritimeChokeEnergyCandidate` to accept both flat and nested shapes:
```javascript
function isMaritimeChokeEnergyCandidate(candidate) {
const topBucket = candidate.marketContext?.topBucketId || candidate.topBucketId || '';
// ...rest of logic unchanged, just reads topBucket instead of candidate.marketContext.topBucketId
}
// In processNextSimulationTask — just pass t directly:
const eligibleTheaters = (pkgData.selectedTheaters || []).filter((t) =>
isMaritimeChokeEnergyCandidate(t)
);
```
Effort: Small | Risk: Low — backwards compatible, no behavior change for existing call site at line 12190
### Option B: Verify that `selectedTheaters` schema already includes all needed fields
Check `buildSimulationPackageFromDeepSnapshot` to confirm it writes `routeFacilityKey`, `marketBucketIds`, `topBucketId`, `commodityKey` to theater items. If confirmed, document the flat-vs-nested convention with a comment at the call site.
## Acceptance Criteria
- [ ] `isMaritimeChokeEnergyCandidate` accepts both flat (`topBucketId`) and nested (`marketContext.topBucketId`) input
- [ ] `processNextSimulationTask` passes `t` directly without hand-rolling the adapter
- [ ] Both call sites (line 12190 and new line 15719) produce identical classification results
- [ ] Existing tests for `isMaritimeChokeEnergyCandidate` still pass
## Technical Details
- File: `scripts/seed-forecasts.mjs``isMaritimeChokeEnergyCandidate` (~line 11871), `processNextSimulationTask` (~line 15719), `buildSimulationPackageFromDeepSnapshot` (~line 12190)
## Work Log
- 2026-03-24: Found by compound-engineering:review:architecture-strategist in PR #2220 review

View File

@@ -0,0 +1,60 @@
---
status: complete
priority: p2
issue_id: "025"
tags: [code-review, performance, simulation-runner, llm]
---
# Round 1 token budget (1800) may be too tight for fully-populated theaters
## Problem Statement
`SIMULATION_ROUND1_MAX_TOKENS = 1800` is the output token cap for Round 1 LLM calls. With a fully-populated theater (10 entities, 8 seeds, constraints, eval targets, simulation requirement, plus the ~350-token JSON response template), the system prompt alone consumes ~1,030 tokens. This leaves ~770 tokens for the response. A minimal valid Round 1 response (3 paths with labels, summaries, and 3 `initialReactions` each) costs ~700-900 tokens. At the high end of entity/seed density, the model will truncate its JSON mid-object, causing `round1_parse_failed` and marking the theater as failed — silently, with no token-exhaustion signal in the diagnostic.
## Findings
**F-1 (HIGH):** Token budget vs. prompt size analysis:
- Static template text: ~350 tokens
- 10 entities at ~20 tokens each: ~200 tokens
- 8 event seeds at ~25 tokens each: ~200 tokens
- simulationRequirement + constraints + evalTargets: ~255 tokens
- **Total input: ~1,005 tokens**
- **Output budget remaining: 795 tokens**
- Minimal valid Round 1 response (3 paths, 3 reactions each): **~700-900 tokens**
- Margin: **-105 to +95 tokens** — essentially zero at max density
`SIMULATION_ROUND2_MAX_TOKENS = 2500` is adequate for Round 2 (shorter input, richer output).
## Proposed Solutions
### Option A: Raise `SIMULATION_ROUND1_MAX_TOKENS` to 2200 + cap `initialReactions` in prompt (Recommended)
```javascript
const SIMULATION_ROUND1_MAX_TOKENS = 2200; // was 1800
// In buildSimulationRound1SystemPrompt INSTRUCTIONS section, add:
// - Maximum 3 initialReactions per path
```
This provides a 1,195-token output margin (2200 - 1005) which comfortably fits 3 paths × 3 reactions. The `initialReactions` cap aligns with existing behavior (only 3 are used in Round 2 path summaries).
Effort: Trivial | Risk: Very Low — increases LLM output budget, no structural change
### Option B: Dynamic token calculation based on entity/seed count
Calculate prompt token estimate and adjust `maxTokens` accordingly. More precise but adds complexity with no meaningful benefit given the fixed slice limits.
## Acceptance Criteria
- [ ] `SIMULATION_ROUND1_MAX_TOKENS` raised from 1800 to 2200
- [ ] INSTRUCTIONS block in `buildSimulationRound1SystemPrompt` includes "- Maximum 3 initialReactions per path"
- [ ] Existing tests pass (prompt builder tests check content, not token count)
## Technical Details
- File: `scripts/seed-forecasts.mjs``SIMULATION_ROUND1_MAX_TOKENS` (~line 38), `buildSimulationRound1SystemPrompt` INSTRUCTIONS section (~line 15445)
## Work Log
- 2026-03-24: Found by compound-engineering:review:performance-oracle in PR #2220 review

View File

@@ -0,0 +1,64 @@
---
status: complete
priority: p2
issue_id: "026"
tags: [code-review, typescript, simulation-runner, maintainability]
---
# Redis key strings duplicated between TS handler and MJS seed script
## Problem Statement
`SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest'` is defined independently in both `server/worldmonitor/forecast/v1/get-simulation-outcome.ts` and `scripts/seed-forecasts.mjs`. The same duplication exists for `SIMULATION_PACKAGE_LATEST_KEY`. `server/_shared/cache-keys.ts` (referenced in the worldmonitor-bootstrap-registration pattern) exists for exactly this purpose: shared Redis key constants that TypeScript handlers and seed scripts need to agree on. A future rename in one file without the other produces a silent miss where the handler reads an empty key forever.
## Findings
**F-1:**
```typescript
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts line 10
const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
// scripts/seed-forecasts.mjs line 35
const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
// Two independent definitions with no enforcement of consistency
```
**F-2:** Same pattern for `SIMULATION_PACKAGE_LATEST_KEY` between `get-simulation-package.ts` and `seed-forecasts.mjs`.
## Proposed Solutions
### Option A: Move keys to `server/_shared/cache-keys.ts`, import in handler (Recommended)
```typescript
// server/_shared/cache-keys.ts — add:
export const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
export const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts — replace local const:
import { SIMULATION_OUTCOME_LATEST_KEY } from '../../../_shared/cache-keys';
```
The seed script (`scripts/seed-forecasts.mjs`) keeps its own definition since it's a standalone MJS module that cannot import from TypeScript source. But the TypeScript handler becomes the downstream consumer of a canonical definition, making renames TypeScript-checked.
Effort: Small | Risk: Low
### Option B: Add a comment cross-referencing both locations
Not a fix, but documents the relationship so a human renaming one knows to update the other. Use as a stopgap if Option A causes import complexity.
## Acceptance Criteria
- [ ] `SIMULATION_OUTCOME_LATEST_KEY` exported from `server/_shared/cache-keys.ts`
- [ ] `get-simulation-outcome.ts` imports from `cache-keys.ts` instead of local const
- [ ] `SIMULATION_PACKAGE_LATEST_KEY` moved simultaneously
- [ ] `get-simulation-package.ts` updated to import from `cache-keys.ts`
- [ ] TypeScript compilation clean after change
## Technical Details
- Files: `server/worldmonitor/forecast/v1/get-simulation-outcome.ts:10`, `server/worldmonitor/forecast/v1/get-simulation-package.ts:~10`, `server/_shared/cache-keys.ts`
- Scripts keep their own definitions (they're standalone MJS — can't import from TS source)
## Work Log
- 2026-03-24: Found by compound-engineering:review:kieran-typescript-reviewer in PR #2220 review

View File

@@ -0,0 +1,64 @@
---
status: complete
priority: p2
issue_id: "027"
tags: [code-review, agent-native, simulation-runner, api]
---
# `runId` filter in `getSimulationOutcome` is a no-op with no OpenAPI documentation
## Problem Statement
`GetSimulationOutcomeRequest.runId` is accepted as a query parameter but explicitly ignored — the handler always returns the latest outcome. The proto file has a comment explaining this ("Currently ignored; always returns the latest outcome. Reserved for Phase 3"), but this comment does not surface in the generated OpenAPI spec's `description` field. Agents and API consumers relying on the OpenAPI spec see a `runId` parameter with no description and no indication that it is non-functional. An agent that triggers a simulation run, notes the `runId`, and passes it to `getSimulationOutcome` will silently receive a different run's outcome with no way to detect the mismatch (except the `note` field, which is easy to overlook).
## Findings
**F-1:** Proto comment exists but does not reach OpenAPI:
```proto
// proto/worldmonitor/forecast/v1/get_simulation_outcome.proto line 9
message GetSimulationOutcomeRequest {
// Currently ignored; always returns the latest outcome. Reserved for Phase 3 per-run lookup.
string run_id = 1 [(sebuf.http.query) = { name: "runId" }];
}
```
Generated `docs/api/ForecastService.openapi.yaml` has the `runId` parameter with no `description` field.
**F-2:** Agent trigger-and-verify workflow is unreliable without per-run lookup:
1. Agent calls `POST /api/forecast/v1/trigger-simulation` (when it exists) → gets `runId=A`
2. Agent polls `GET /api/forecast/v1/get-simulation-outcome?runId=A`
3. Run B completes first, writes `found: true, runId: B` to Redis
4. Handler returns run B's outcome with `note: "runId filter not yet active; returned outcome may differ"`
5. Agent receives `note` but may not check it; proceeds to act on wrong run's data
## Proposed Solutions
### Option A: Add description annotation to proto field so it propagates to OpenAPI (Recommended)
Check if sebuf's proto generator picks up leading comments or if it requires a `description` annotation extension. If the generator supports field descriptions, add:
```proto
// IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
// Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied.
string run_id = 1 [(sebuf.http.query) = { name: "runId" }];
```
If the generator does not propagate comments, manually update the generated OpenAPI yaml as a post-generation step.
### Option B: Document in the handler's response `note` more prominently
Current `note` text: "runId filter not yet active; returned outcome may differ from requested run". This is already a reasonable signal. Ensure the proto `note` field also has a description in OpenAPI explaining its purpose.
## Acceptance Criteria
- [ ] OpenAPI `description` for the `runId` parameter in `GetSimulationOutcome` explains it is currently a no-op
- [ ] OpenAPI `description` for the `note` response field explains it is populated when `runId` mismatch occurs
- [ ] Combined with todo #021 (trigger endpoint), a full trigger-and-verify loop is documented
## Technical Details
- File: `proto/worldmonitor/forecast/v1/get_simulation_outcome.proto`
- File: `docs/api/ForecastService.openapi.yaml` (auto-generated — check if manual edits survive `make generate`)
## Work Log
- 2026-03-24: Found by compound-engineering:review:agent-native-reviewer in PR #2220 review

View File

@@ -0,0 +1,60 @@
---
status: pending
priority: p3
issue_id: "028"
tags: [code-review, architecture, simulation-runner, schema]
---
# No structured `completionStatus` field in simulation outcome — callers must parse strings
## Problem Statement
The simulation outcome has no machine-readable `completionStatus` field. Callers must re-derive completion state from `theaterResults.length`, `failedTheaters.length`, and the string-encoded `globalObservations` field. This works for Phase 2 but will block Phase 3 callers (UI panels, downstream agents) that need to branch on `partial` vs `all_failed` vs `no_eligible_theaters`.
## Findings
**F-1:**
```javascript
const outcome = {
globalObservations: eligibleTheaters.length === 0
? 'No maritime chokepoint/energy theaters in package'
: theaterResults.length === 0 ? 'All theaters failed simulation' : '',
confidenceNotes: `${theaterResults.length}/${eligibleTheaters.length} theaters completed`,
// No structured completionStatus or eligibleTheaterCount
};
```
Callers deriving status: `theaterResults.length === 0 && failedTheaters.length === 0` could mean "no eligible theaters" or "eligibleTheaters array was somehow empty". No way to distinguish without `eligibleTheaterCount`.
## Proposed Solution
```javascript
const completionStatus =
eligibleTheaters.length === 0 ? 'no_eligible_theaters'
: theaterResults.length === 0 ? 'all_failed'
: failedTheaters.length > 0 ? 'partial'
: 'complete';
const outcome = {
...existingFields,
completionStatus,
eligibleTheaterCount: eligibleTheaters.length,
};
```
Also add `theaterCount` to `GetSimulationOutcomeResponse` proto (currently only `theaterCount` for successful results) — or add `eligibleTheaterCount` field in Phase 3.
## Acceptance Criteria
- [ ] `completionStatus: 'no_eligible_theaters' | 'all_failed' | 'partial' | 'complete'` added to outcome schema
- [ ] `eligibleTheaterCount` added to outcome schema
- [ ] `getSimulationOutcome` RPC response includes `completionStatus` (or proto updated in Phase 3)
## Technical Details
- File: `scripts/seed-forecasts.mjs``processNextSimulationTask` (~line 15774) outcome construction
- Phase 3 concern: add `completionStatus` to `GetSimulationOutcomeResponse` proto
## Work Log
- 2026-03-24: Found by compound-engineering:review:architecture-strategist in PR #2220 review

View File

@@ -0,0 +1,38 @@
---
status: pending
priority: p3
issue_id: "029"
tags: [code-review, performance, simulation-runner, llm]
---
# `getForecastLlmCallOptions` has no cases for `simulation_round_1` / `simulation_round_2`
## Problem Statement
`getForecastLlmCallOptions(stage)` maps stage names to provider order and model configuration. Every other pipeline stage (`combined`, `critical_signals`, `impact_expansion`, `market_implications`, etc.) has its own env override so operators can route that stage to a different model. The simulation runner uses `'simulation_round_1'` and `'simulation_round_2'` as stage names, but both fall through to the `else` branch (default provider order). This means simulation stages cannot be independently routed to a more capable reasoning model in Phase 3 without a code change.
## Proposed Solution
```javascript
// In getForecastLlmCallOptions, add cases:
: stage === 'simulation_round_1' || stage === 'simulation_round_2'
? (process.env.FORECAST_LLM_SIMULATION_PROVIDER_ORDER
? parseForecastProviderOrder(process.env.FORECAST_LLM_SIMULATION_PROVIDER_ORDER)
: globalProviderOrder || defaultProviderOrder)
```
This follows the exact pattern of every other named stage. No behavior change until `FORECAST_LLM_SIMULATION_PROVIDER_ORDER` is set.
## Acceptance Criteria
- [ ] `simulation_round_1` and `simulation_round_2` have explicit cases in `getForecastLlmCallOptions`
- [ ] `FORECAST_LLM_SIMULATION_PROVIDER_ORDER` env var controls simulation provider order when set
- [ ] Existing tests pass; no behavior change when env var is unset
## Technical Details
- File: `scripts/seed-forecasts.mjs``getForecastLlmCallOptions` (~line 3920)
## Work Log
- 2026-03-24: Found by compound-engineering:review:performance-oracle in PR #2220 review