mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
feat(simulation): MiroFish Phase 2 — theater-limited simulation runner (#2220)
* feat(simulation): MiroFish Phase 2 — theater-limited simulation runner Adds the simulation execution layer that consumes simulation-package.json and produces simulation-outcome.json for maritime chokepoint + energy/logistics theaters, closing the WorldMonitor → MiroFish handoff loop. Changes: - scripts/seed-forecasts.mjs: 2-round LLM simulation runner (prompt builders, JSON extractor, runTheaterSimulation, writeSimulationOutcome, task queue with NX dedup lock, runSimulationWorker poll loop) - scripts/process-simulation-tasks.mjs: standalone worker entry point - proto: GetSimulationOutcome RPC + make generate - server/worldmonitor/forecast/v1/get-simulation-outcome.ts: RPC handler - server/gateway.ts: slow tier for get-simulation-outcome - api/health.js: simulationOutcomeLatest in STANDALONE + ON_DEMAND keys - tests: 14 new tests for simulation runner functions * fix(simulation): address P1/P2 code review findings from PR #2220 Security (P1 #018): - sanitizeForPrompt() applied to all entity/seed fields interpolated into Round 1 prompt (entityId, class, stance, seedId, type, timing) - sanitizeForPrompt() applied to actorId and entityIds in Round 2 prompt - sanitizeForPrompt() + length caps applied to all LLM array fields written to R2 (dominantReactions, stabilizers, invalidators, keyActors, timingMarkers) Validation (P1 #019): - Added validateRunId() regex guard - Applied in enqueueSimulationTask() and processNextSimulationTask() loop Type safety (P1 #020): - Added isOutcomePointer() and isPackagePointer() type guards in TS handlers - Replaced unsafe as-casts with runtime-validated guards in both handlers Correctness (P2 #022): - Log warning when pkgPointer.runId does not match task runId Architecture (P2 #024): - isMaritimeChokeEnergyCandidate() accepts both flat and nested topBucketId - Call site simplified to pass theater directly Performance (P2 #025): - SIMULATION_ROUND1_MAX_TOKENS raised 1800 to 2200 - Added max 3 initialReactions instruction to Round 1 prompt Maintainability (P2 #026): - Simulation pointer keys exported from server/_shared/cache-keys.ts - Both TS handlers import from shared location Documentation (P2 #027): - Strengthened runId no-op description in proto and OpenAPI spec * fix(todos): add blank lines around lists in markdown todo files * style(api): reformat openapi yaml to match linter output * test(simulation): add flat-shape filter test + getSimulationOutcome handler coverage Two tests identified as missing during PR #2220 review: 1. isMaritimeChokeEnergyCandidate flat-shape tests — covers the || candidate.topBucketId normalization added in the P1/P2 review pass. The existing tests only used the nested marketContext.topBucketId shape; this adds the flat root-field shape that arrives from the simulation-package.json JSON (selectedTheaters entries have topBucketId at root). 2. getSimulationOutcome handler structural tests — verifies the isOutcomePointer guard, found:false NOT_FOUND return, found:true success path, note population on runId mismatch, and redis_unavailable error string. Follows the readSrc static-analysis pattern used elsewhere in server-handlers.test.mjs (handler imports Redis so full integration test would require a test Redis instance).
This commit is contained in:
@@ -97,6 +97,7 @@ const STANDALONE_KEYS = {
|
||||
marketImplications: 'intelligence:market-implications:v1',
|
||||
hormuzTracker: 'supply_chain:hormuz_tracker:v1',
|
||||
simulationPackageLatest: 'forecast:simulation-package:latest',
|
||||
simulationOutcomeLatest: 'forecast:simulation-outcome:latest',
|
||||
};
|
||||
|
||||
const SEED_META = {
|
||||
@@ -194,6 +195,7 @@ const ON_DEMAND_KEYS = new Set([
|
||||
'militaryForecastInputs', // intermediate seed-to-seed pipeline key; only populated after seed-military-flights runs
|
||||
'marketImplications', // LLM-generated inside forecast cron; can fail silently on LLM errors — degrade to WARN not CRIT
|
||||
'simulationPackageLatest', // written by writeSimulationPackage after deep forecast runs; only present after first successful deep run
|
||||
'simulationOutcomeLatest', // written by writeSimulationOutcome after simulation runs; only present after first successful simulation
|
||||
]);
|
||||
|
||||
// Keys where 0 records is a valid healthy state (e.g. no airports closed).
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -71,6 +71,41 @@ paths:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/Error'
|
||||
/api/forecast/v1/get-simulation-outcome:
|
||||
get:
|
||||
tags:
|
||||
- ForecastService
|
||||
summary: GetSimulationOutcome
|
||||
operationId: GetSimulationOutcome
|
||||
parameters:
|
||||
- name: runId
|
||||
in: query
|
||||
description: |-
|
||||
IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
|
||||
Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied
|
||||
and you need to detect a mismatch between requested and returned run.
|
||||
required: false
|
||||
schema:
|
||||
type: string
|
||||
responses:
|
||||
"200":
|
||||
description: Successful response
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/GetSimulationOutcomeResponse'
|
||||
"400":
|
||||
description: Validation error
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ValidationError'
|
||||
default:
|
||||
description: Error response
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/Error'
|
||||
components:
|
||||
schemas:
|
||||
Error:
|
||||
@@ -393,3 +428,40 @@ components:
|
||||
description: |-
|
||||
Populated when the Redis lookup failed. Distinguish from healthy not-found (found=false, error="").
|
||||
Value: "redis_unavailable" on Redis errors.
|
||||
GetSimulationOutcomeRequest:
|
||||
type: object
|
||||
properties:
|
||||
runId:
|
||||
type: string
|
||||
description: |-
|
||||
IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
|
||||
Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied
|
||||
and you need to detect a mismatch between requested and returned run.
|
||||
GetSimulationOutcomeResponse:
|
||||
type: object
|
||||
properties:
|
||||
found:
|
||||
type: boolean
|
||||
runId:
|
||||
type: string
|
||||
outcomeKey:
|
||||
type: string
|
||||
schemaVersion:
|
||||
type: string
|
||||
theaterCount:
|
||||
type: integer
|
||||
format: int32
|
||||
generatedAt:
|
||||
type: integer
|
||||
format: int64
|
||||
description: 'Unix timestamp in milliseconds (from Date.now()). Warning: Values > 2^53 may lose precision in JavaScript.. Warning: Values > 2^53 may lose precision in JavaScript'
|
||||
note:
|
||||
type: string
|
||||
description: |-
|
||||
Populated when req.runId was supplied but does not match the returned outcome's runId.
|
||||
Indicates that per-run filtering is not yet active and the latest outcome was returned instead.
|
||||
error:
|
||||
type: string
|
||||
description: |-
|
||||
Populated when the Redis lookup failed. Distinguish from healthy not-found (found=false, error="").
|
||||
Value: "redis_unavailable" on Redis errors.
|
||||
|
||||
28
proto/worldmonitor/forecast/v1/get_simulation_outcome.proto
Normal file
28
proto/worldmonitor/forecast/v1/get_simulation_outcome.proto
Normal file
@@ -0,0 +1,28 @@
|
||||
syntax = "proto3";
|
||||
|
||||
package worldmonitor.forecast.v1;
|
||||
|
||||
import "sebuf/http/annotations.proto";
|
||||
|
||||
message GetSimulationOutcomeRequest {
|
||||
// IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
|
||||
// Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied
|
||||
// and you need to detect a mismatch between requested and returned run.
|
||||
string run_id = 1 [(sebuf.http.query) = { name: "runId" }];
|
||||
}
|
||||
|
||||
message GetSimulationOutcomeResponse {
|
||||
bool found = 1;
|
||||
string run_id = 2;
|
||||
string outcome_key = 3;
|
||||
string schema_version = 4;
|
||||
int32 theater_count = 5;
|
||||
// Unix timestamp in milliseconds (from Date.now()). Warning: Values > 2^53 may lose precision in JavaScript.
|
||||
int64 generated_at = 6 [(sebuf.http.int64_encoding) = INT64_ENCODING_NUMBER];
|
||||
// Populated when req.runId was supplied but does not match the returned outcome's runId.
|
||||
// Indicates that per-run filtering is not yet active and the latest outcome was returned instead.
|
||||
string note = 7;
|
||||
// Populated when the Redis lookup failed. Distinguish from healthy not-found (found=false, error="").
|
||||
// Value: "redis_unavailable" on Redis errors.
|
||||
string error = 8;
|
||||
}
|
||||
@@ -5,6 +5,7 @@ package worldmonitor.forecast.v1;
|
||||
import "sebuf/http/annotations.proto";
|
||||
import "worldmonitor/forecast/v1/get_forecasts.proto";
|
||||
import "worldmonitor/forecast/v1/get_simulation_package.proto";
|
||||
import "worldmonitor/forecast/v1/get_simulation_outcome.proto";
|
||||
|
||||
service ForecastService {
|
||||
option (sebuf.http.service_config) = {base_path: "/api/forecast/v1"};
|
||||
@@ -16,4 +17,8 @@ service ForecastService {
|
||||
rpc GetSimulationPackage(GetSimulationPackageRequest) returns (GetSimulationPackageResponse) {
|
||||
option (sebuf.http.config) = {path: "/get-simulation-package", method: HTTP_METHOD_GET};
|
||||
}
|
||||
|
||||
rpc GetSimulationOutcome(GetSimulationOutcomeRequest) returns (GetSimulationOutcomeResponse) {
|
||||
option (sebuf.http.config) = {path: "/get-simulation-outcome", method: HTTP_METHOD_GET};
|
||||
}
|
||||
}
|
||||
|
||||
14
scripts/process-simulation-tasks.mjs
Normal file
14
scripts/process-simulation-tasks.mjs
Normal file
@@ -0,0 +1,14 @@
|
||||
#!/usr/bin/env node
|
||||
|
||||
import { loadEnvFile } from './_seed-utils.mjs';
|
||||
import { runSimulationWorker } from './seed-forecasts.mjs';
|
||||
|
||||
loadEnvFile(import.meta.url);
|
||||
|
||||
const once = process.argv.includes('--once');
|
||||
const runId = process.argv.find((arg) => arg.startsWith('--run-id='))?.split('=')[1] || '';
|
||||
|
||||
const result = await runSimulationWorker({ once, runId });
|
||||
if (once && result?.status && result.status !== 'idle') {
|
||||
console.log(` [Simulation] ${result.status}`);
|
||||
}
|
||||
@@ -32,6 +32,17 @@ const FORECAST_DEEP_MAX_CANDIDATES = 3;
|
||||
const FORECAST_DEEP_RUN_PREFIX = 'seed-data/forecast-traces';
|
||||
const SIMULATION_PACKAGE_SCHEMA_VERSION = 'v1';
|
||||
const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
|
||||
const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
|
||||
const SIMULATION_OUTCOME_SCHEMA_VERSION = 'v1';
|
||||
const SIMULATION_RUNNER_VERSION = 'v1';
|
||||
const SIMULATION_TASK_KEY_PREFIX = 'forecast:simulation-task:v1';
|
||||
const SIMULATION_TASK_QUEUE_KEY = 'forecast:simulation-task-queue:v1';
|
||||
const SIMULATION_LOCK_KEY_PREFIX = 'forecast:simulation-lock:v1';
|
||||
const SIMULATION_ROUND1_MAX_TOKENS = 2200;
|
||||
const SIMULATION_ROUND2_MAX_TOKENS = 2500;
|
||||
const SIMULATION_LOCK_TTL_SECONDS = 20 * 60;
|
||||
const SIMULATION_TASK_TTL_SECONDS = 30 * 60;
|
||||
const SIMULATION_POLL_INTERVAL_MS = 30 * 1000;
|
||||
const PUBLISH_MIN_PROBABILITY = 0;
|
||||
const PANEL_MIN_PROBABILITY = 0.1;
|
||||
const CANONICAL_PAYLOAD_SOFT_LIMIT_BYTES = 4 * 1024 * 1024;
|
||||
@@ -11861,7 +11872,8 @@ function isMaritimeChokeEnergyCandidate(candidate) {
|
||||
const routeKey = candidate.routeFacilityKey || '';
|
||||
if (!routeKey || !Object.prototype.hasOwnProperty.call(CHOKEPOINT_MARKET_REGIONS, routeKey)) return false;
|
||||
const bucketArr = candidate.marketBucketIds || [];
|
||||
const topBucket = candidate.marketContext?.topBucketId || '';
|
||||
// Accept both nested (marketContext.topBucketId) and flat (topBucketId) shapes
|
||||
const topBucket = candidate.marketContext?.topBucketId || candidate.topBucketId || '';
|
||||
return bucketArr.includes('energy') || bucketArr.includes('freight') || topBucket === 'energy' || topBucket === 'freight'
|
||||
|| SIMULATION_ENERGY_COMMODITY_KEYS.has(candidate.commodityKey || '');
|
||||
}
|
||||
@@ -15379,6 +15391,428 @@ if (_isDirectRun) {
|
||||
});
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// MiroFish Phase 2 — Theater-Limited Simulation Runner
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
function buildSimulationRound1SystemPrompt(theater, pkg) {
|
||||
const theaterEntities = (pkg.entities || []).filter(
|
||||
(e) => !e.relevanceToTheater || e.relevanceToTheater === theater.theaterId,
|
||||
);
|
||||
const entityList = theaterEntities.slice(0, 10).map(
|
||||
(e) => `- ${sanitizeForPrompt(e.entityId)} | ${sanitizeForPrompt(e.name)} | class=${sanitizeForPrompt(e.class)} | stance=${sanitizeForPrompt(e.stance || 'unknown')}`,
|
||||
).join('\n');
|
||||
|
||||
const theaterSeeds = (pkg.eventSeeds || []).filter((s) => s.theaterId === theater.theaterId);
|
||||
const seedList = theaterSeeds.slice(0, 8).map(
|
||||
(s) => `- ${sanitizeForPrompt(s.seedId)} [${sanitizeForPrompt(s.type)}] ${sanitizeForPrompt(s.summary)} (${sanitizeForPrompt(s.timing)})`,
|
||||
).join('\n');
|
||||
|
||||
const constraints = (pkg.constraints?.[theater.theaterId] || pkg.constraints?.theater || [])
|
||||
.map((c) => `- ${sanitizeForPrompt(c)}`).join('\n') || '- No explicit constraints';
|
||||
const evalTargets = (pkg.evaluationTargets?.[theater.theaterId] || pkg.evaluationTargets?.theater || [])
|
||||
.map((t) => `- ${sanitizeForPrompt(t)}`).join('\n') || '- General market and security dynamics';
|
||||
const requirement = sanitizeForPrompt(
|
||||
pkg.simulationRequirement?.[theater.theaterId] || theater.theaterLabel || theater.theaterId,
|
||||
);
|
||||
|
||||
return `You are a geopolitical simulation engine. Simulate actor behavior for a theater-level disruption scenario.
|
||||
|
||||
SIMULATION CONTEXT:
|
||||
${requirement}
|
||||
|
||||
THEATER: ${sanitizeForPrompt(theater.theaterLabel || theater.theaterId)} | Region: ${sanitizeForPrompt(theater.theaterRegion || theater.dominantRegion || '')}
|
||||
|
||||
ACTORS (use exact entityId when citing actors):
|
||||
${entityList || '- (none specified)'}
|
||||
|
||||
EVENT SEEDS (cite seedId in reactions where applicable):
|
||||
${seedList || '- (none specified)'}
|
||||
|
||||
CONSTRAINTS:
|
||||
${constraints}
|
||||
|
||||
EVALUATION TARGETS:
|
||||
${evalTargets}
|
||||
|
||||
INSTRUCTIONS:
|
||||
Generate EXACTLY 3 divergent paths named "escalation", "containment", and "spillover". For each path, model the initial actor reactions in the first 24 hours.
|
||||
|
||||
- Actors MUST be from the list above (use their exact entityId)
|
||||
- Cite event seeds (seedId) in reactions where applicable
|
||||
- Do NOT invent actors, routes, or commodities not present above
|
||||
- timing format: "T+0h", "T+6h", "T+12h", "T+24h"
|
||||
- Maximum 3 initialReactions per path
|
||||
- note: A brief (≤200 char) meta-observation on the divergence logic
|
||||
|
||||
Return ONLY a JSON object with no markdown fences:
|
||||
{
|
||||
"paths": [
|
||||
{
|
||||
"pathId": "escalation",
|
||||
"label": "<short label>",
|
||||
"summary": "<≤200 char summary>",
|
||||
"initialReactions": [
|
||||
{ "actorId": "<entityId>", "actorName": "<name>", "action": "<≤120 char>", "timing": "T+0h" }
|
||||
]
|
||||
},
|
||||
{ "pathId": "containment", "label": "...", "summary": "...", "initialReactions": [] },
|
||||
{ "pathId": "spillover", "label": "...", "summary": "...", "initialReactions": [] }
|
||||
],
|
||||
"dominantReactions": ["<actor name>: <action summary>"],
|
||||
"note": "<meta-observation>"
|
||||
}`;
|
||||
}
|
||||
|
||||
function buildSimulationRound2SystemPrompt(theater, pkg, round1) {
|
||||
const r1Paths = (round1?.paths || []).slice(0, 3);
|
||||
const pathSummaries = r1Paths.map(
|
||||
(p) => `- ${p.pathId}: ${sanitizeForPrompt(p.summary || '')} — actors: ${(p.initialReactions || []).slice(0, 3).map((r) => sanitizeForPrompt(r.actorId || '')).join(', ')}`,
|
||||
).join('\n') || '- (no round 1 paths available)';
|
||||
|
||||
const theaterEntities = (pkg.entities || []).filter(
|
||||
(e) => !e.relevanceToTheater || e.relevanceToTheater === theater.theaterId,
|
||||
);
|
||||
const entityIds = theaterEntities.slice(0, 10).map((e) => sanitizeForPrompt(e.entityId || '')).join(', ');
|
||||
|
||||
const evalTargets = (pkg.evaluationTargets?.[theater.theaterId] || pkg.evaluationTargets?.theater || [])
|
||||
.map((t) => `- ${sanitizeForPrompt(t)}`).join('\n') || '- General market and security dynamics';
|
||||
|
||||
return `You are a geopolitical simulation engine. This is ROUND 2 of a 2-round theater simulation.
|
||||
|
||||
THEATER: ${sanitizeForPrompt(theater.theaterLabel || theater.theaterId)} | Region: ${sanitizeForPrompt(theater.theaterRegion || theater.dominantRegion || '')}
|
||||
|
||||
ROUND 1 PATH SUMMARIES:
|
||||
${pathSummaries}
|
||||
|
||||
VALID ACTOR IDs: ${entityIds || '(see round 1)'}
|
||||
|
||||
EVALUATION TARGETS:
|
||||
${evalTargets}
|
||||
|
||||
INSTRUCTIONS:
|
||||
For each of the 3 paths from Round 1 (escalation, containment, spillover), generate the EVOLVED outcome after 72 hours.
|
||||
|
||||
- keyActors: 2-4 actor IDs that drive this path
|
||||
- roundByRoundEvolution: 2 entries (round 1 summary, round 2 evolution)
|
||||
- timingMarkers: 2-4 key events with timing (T+Nh format)
|
||||
- stabilizers: 2-4 factors that could prevent the worst outcome
|
||||
- invalidators: 2-4 conditions that would invalidate this path
|
||||
- confidence: 0.0-1.0 based on evidence strength
|
||||
|
||||
Return ONLY a JSON object with no markdown fences:
|
||||
{
|
||||
"paths": [
|
||||
{
|
||||
"pathId": "escalation",
|
||||
"label": "<short label>",
|
||||
"summary": "<≤200 char evolved summary>",
|
||||
"keyActors": ["<entityId>"],
|
||||
"roundByRoundEvolution": [
|
||||
{ "round": 1, "summary": "<≤160 char>" },
|
||||
{ "round": 2, "summary": "<≤160 char>" }
|
||||
],
|
||||
"confidence": 0.0,
|
||||
"timingMarkers": [{ "event": "<≤80 char>", "timing": "T+Nh" }]
|
||||
},
|
||||
{ "pathId": "containment", "label": "...", "summary": "...", "keyActors": [], "roundByRoundEvolution": [], "confidence": 0.0, "timingMarkers": [] },
|
||||
{ "pathId": "spillover", "label": "...", "summary": "...", "keyActors": [], "roundByRoundEvolution": [], "confidence": 0.0, "timingMarkers": [] }
|
||||
],
|
||||
"stabilizers": ["<≤100 char>"],
|
||||
"invalidators": ["<≤100 char>"],
|
||||
"globalObservations": "<≤300 char>",
|
||||
"confidenceNotes": "<≤200 char>"
|
||||
}`;
|
||||
}
|
||||
|
||||
function tryParseSimulationRoundPayload(text, round) {
|
||||
try {
|
||||
const parsed = JSON.parse(text);
|
||||
if (!Array.isArray(parsed?.paths)) return { paths: null };
|
||||
const expectedIds = new Set(['escalation', 'containment', 'spillover']);
|
||||
const paths = parsed.paths.filter((p) => p && expectedIds.has(p.pathId));
|
||||
if (paths.length === 0) return { paths: null };
|
||||
if (round === 2) {
|
||||
return {
|
||||
paths,
|
||||
stabilizers: Array.isArray(parsed.stabilizers) ? parsed.stabilizers.map(String).slice(0, 6) : [],
|
||||
invalidators: Array.isArray(parsed.invalidators) ? parsed.invalidators.map(String).slice(0, 6) : [],
|
||||
globalObservations: String(parsed.globalObservations || '').slice(0, 300),
|
||||
confidenceNotes: String(parsed.confidenceNotes || '').slice(0, 200),
|
||||
};
|
||||
}
|
||||
return {
|
||||
paths,
|
||||
dominantReactions: Array.isArray(parsed.dominantReactions) ? parsed.dominantReactions.map(String).slice(0, 6) : [],
|
||||
note: String(parsed.note || '').slice(0, 200),
|
||||
};
|
||||
} catch {
|
||||
return { paths: null };
|
||||
}
|
||||
}
|
||||
|
||||
function extractSimulationRoundPayload(text, round) {
|
||||
const cleaned = text
|
||||
.replace(/<think>[\s\S]*?<\/think>/gi, '')
|
||||
.replace(/<\|thinking\|>[\s\S]*?<\|\/thinking\|>/gi, '')
|
||||
.replace(/```json\s*/gi, '```')
|
||||
.trim();
|
||||
const candidates = [];
|
||||
const fencedBlocks = [...cleaned.matchAll(/```([\s\S]*?)```/g)].map((m) => m[1].trim());
|
||||
candidates.push(...fencedBlocks);
|
||||
candidates.push(cleaned);
|
||||
|
||||
for (const candidate of candidates) {
|
||||
const trimmed = candidate.trim();
|
||||
if (!trimmed) continue;
|
||||
const direct = tryParseSimulationRoundPayload(trimmed, round);
|
||||
if (direct.paths) return { ...direct, diagnostics: { stage: 'direct', preview: sanitizeForPrompt(trimmed).slice(0, 160) } };
|
||||
const firstObject = extractFirstJsonObject(trimmed);
|
||||
if (firstObject) {
|
||||
const parsed = tryParseSimulationRoundPayload(firstObject, round);
|
||||
if (parsed.paths) return { ...parsed, diagnostics: { stage: 'extracted', preview: sanitizeForPrompt(firstObject).slice(0, 160) } };
|
||||
}
|
||||
}
|
||||
return { paths: null, diagnostics: { stage: 'no_json', preview: sanitizeForPrompt(cleaned).slice(0, 160) } };
|
||||
}
|
||||
|
||||
async function runTheaterSimulation(theater, pkg) {
|
||||
const theaterLabel = sanitizeForPrompt(theater.theaterLabel || theater.theaterId);
|
||||
const userPrompt1 = `Theater: ${theaterLabel}\nRun ID: ${pkg.runId}\nGenerate Round 1 actor reactions for the 3 divergent paths.`;
|
||||
|
||||
const r1Raw = await callForecastLLM(
|
||||
buildSimulationRound1SystemPrompt(theater, pkg),
|
||||
userPrompt1,
|
||||
{ ...getForecastLlmCallOptions('simulation_round_1'), stage: 'simulation_round_1', maxTokens: SIMULATION_ROUND1_MAX_TOKENS, temperature: 0 },
|
||||
);
|
||||
if (!r1Raw) return { failed: true, reason: 'round1_llm_failed' };
|
||||
const r1 = extractSimulationRoundPayload(r1Raw.text, 1);
|
||||
if (!r1.paths) return { failed: true, reason: 'round1_parse_failed', diagnostics: r1.diagnostics };
|
||||
|
||||
const userPrompt2 = `Theater: ${theaterLabel}\nRun ID: ${pkg.runId}\nGenerate Round 2 path evolution (72h) based on the Round 1 paths.`;
|
||||
const r2Raw = await callForecastLLM(
|
||||
buildSimulationRound2SystemPrompt(theater, pkg, r1),
|
||||
userPrompt2,
|
||||
{ ...getForecastLlmCallOptions('simulation_round_2'), stage: 'simulation_round_2', maxTokens: SIMULATION_ROUND2_MAX_TOKENS, temperature: 0 },
|
||||
);
|
||||
if (!r2Raw) return { round1: r1, round2: null, failed: false };
|
||||
const r2 = extractSimulationRoundPayload(r2Raw.text, 2);
|
||||
return { round1: r1, round2: r2.paths ? r2 : null, failed: false };
|
||||
}
|
||||
|
||||
function buildSimulationOutcomeKey(runId, generatedAt) {
|
||||
const prefix = buildTraceRunPrefix(runId, generatedAt, FORECAST_DEEP_RUN_PREFIX);
|
||||
return `${prefix}/simulation-outcome.json`;
|
||||
}
|
||||
|
||||
async function writeSimulationOutcome(pkg, outcome, { storageConfig } = {}) {
|
||||
const config = storageConfig ?? resolveR2StorageConfig();
|
||||
if (!config || !pkg?.runId) return null;
|
||||
const { runId, generatedAt } = pkg;
|
||||
const outcomeKey = buildSimulationOutcomeKey(runId, generatedAt || Date.now());
|
||||
await putR2JsonObject(config, outcomeKey, outcome, {
|
||||
runid: String(runId),
|
||||
kind: 'simulation_outcome',
|
||||
schema_version: SIMULATION_OUTCOME_SCHEMA_VERSION,
|
||||
});
|
||||
const { url, token } = getRedisCredentials();
|
||||
await redisCommand(url, token, [
|
||||
'SET',
|
||||
SIMULATION_OUTCOME_LATEST_KEY,
|
||||
JSON.stringify({
|
||||
runId,
|
||||
outcomeKey,
|
||||
schemaVersion: SIMULATION_OUTCOME_SCHEMA_VERSION,
|
||||
theaterCount: (outcome.theaterResults || []).length,
|
||||
generatedAt: generatedAt || Date.now(),
|
||||
}),
|
||||
'EX',
|
||||
String(TRACE_REDIS_TTL_SECONDS),
|
||||
]);
|
||||
return { outcomeKey };
|
||||
}
|
||||
|
||||
const VALID_RUN_ID_RE = /^\d{13,}-[a-z0-9-]{1,64}$/i;
|
||||
function validateRunId(runId) { return typeof runId === 'string' && VALID_RUN_ID_RE.test(runId); }
|
||||
|
||||
function buildSimulationTaskKey(runId) { return `${SIMULATION_TASK_KEY_PREFIX}:${runId}`; }
|
||||
function buildSimulationLockKey(runId) { return `${SIMULATION_LOCK_KEY_PREFIX}:${runId}`; }
|
||||
|
||||
async function enqueueSimulationTask(runId) {
|
||||
if (!runId) return { queued: false, reason: 'missing_run_id' };
|
||||
if (!validateRunId(runId)) return { queued: false, reason: 'invalid_run_id_format' };
|
||||
const { url, token } = getRedisCredentials();
|
||||
const queued = await redisCommand(url, token, [
|
||||
'SET', buildSimulationTaskKey(runId),
|
||||
JSON.stringify({ runId, createdAt: Date.now() }),
|
||||
'EX', String(SIMULATION_TASK_TTL_SECONDS), 'NX',
|
||||
]);
|
||||
if (queued?.result !== 'OK') return { queued: false, reason: 'duplicate' };
|
||||
await redisCommand(url, token, ['ZADD', SIMULATION_TASK_QUEUE_KEY, String(Date.now()), runId]);
|
||||
await redisCommand(url, token, ['EXPIRE', SIMULATION_TASK_QUEUE_KEY, String(TRACE_REDIS_TTL_SECONDS)]);
|
||||
return { queued: true, reason: '' };
|
||||
}
|
||||
|
||||
async function claimSimulationTask(runId, workerId) {
|
||||
if (!runId) return null;
|
||||
const { url, token } = getRedisCredentials();
|
||||
const lockKey = buildSimulationLockKey(runId);
|
||||
const claim = await redisCommand(url, token, [
|
||||
'SET', lockKey, workerId, 'EX', String(SIMULATION_LOCK_TTL_SECONDS), 'NX',
|
||||
]);
|
||||
if (claim?.result !== 'OK') return null;
|
||||
const taskRaw = await redisGet(url, token, buildSimulationTaskKey(runId));
|
||||
if (!taskRaw?.runId) {
|
||||
await redisDel(url, token, lockKey);
|
||||
return null;
|
||||
}
|
||||
return taskRaw;
|
||||
}
|
||||
|
||||
async function completeSimulationTask(runId) {
|
||||
if (!runId) return;
|
||||
const { url, token } = getRedisCredentials();
|
||||
await redisCommand(url, token, ['ZREM', SIMULATION_TASK_QUEUE_KEY, runId]);
|
||||
await redisDel(url, token, buildSimulationTaskKey(runId));
|
||||
await redisDel(url, token, buildSimulationLockKey(runId));
|
||||
}
|
||||
|
||||
async function listQueuedSimulationTasks(limit = 10) {
|
||||
const { url, token } = getRedisCredentials();
|
||||
const response = await redisCommand(url, token, [
|
||||
'ZRANGE', SIMULATION_TASK_QUEUE_KEY, '0', String(Math.max(0, limit - 1)),
|
||||
]);
|
||||
return Array.isArray(response?.result) ? response.result : [];
|
||||
}
|
||||
|
||||
async function processNextSimulationTask(options = {}) {
|
||||
const workerId = options.workerId || `sim-worker-${process.pid}-${Date.now()}`;
|
||||
const queuedRunIds = options.runId ? [options.runId] : await listQueuedSimulationTasks(10);
|
||||
|
||||
for (const runId of queuedRunIds) {
|
||||
if (!validateRunId(runId)) {
|
||||
console.warn(` [Simulation] Skipping invalid runId format: ${String(runId).slice(0, 80)}`);
|
||||
continue;
|
||||
}
|
||||
const task = await claimSimulationTask(runId, workerId);
|
||||
if (!task) continue;
|
||||
|
||||
try {
|
||||
const { url, token } = getRedisCredentials();
|
||||
|
||||
// Idempotency: skip if already processed for this runId
|
||||
const existing = await redisGet(url, token, SIMULATION_OUTCOME_LATEST_KEY);
|
||||
if (existing?.runId === runId) {
|
||||
console.log(` [Simulation] Skipping ${runId} — outcome already written`);
|
||||
await completeSimulationTask(runId);
|
||||
return { status: 'skipped', reason: 'already_processed', runId };
|
||||
}
|
||||
|
||||
// Read package pointer from Redis
|
||||
const pkgPointer = await redisGet(url, token, SIMULATION_PACKAGE_LATEST_KEY);
|
||||
if (!pkgPointer?.pkgKey) {
|
||||
console.warn(` [Simulation] No package pointer for ${runId}`);
|
||||
await completeSimulationTask(runId);
|
||||
return { status: 'failed', reason: 'no_package_pointer', runId };
|
||||
}
|
||||
if (pkgPointer.runId && pkgPointer.runId !== runId) {
|
||||
console.warn(` [Simulation] Package runId mismatch: task=${runId} pkg=${pkgPointer.runId} — using latest package (Phase 2 behaviour)`);
|
||||
}
|
||||
|
||||
const storageConfig = resolveR2StorageConfig();
|
||||
if (!storageConfig) {
|
||||
await completeSimulationTask(runId);
|
||||
return { status: 'failed', reason: 'no_storage_config', runId };
|
||||
}
|
||||
|
||||
const pkgData = await getR2JsonObject(storageConfig, pkgPointer.pkgKey);
|
||||
if (!pkgData?.selectedTheaters) {
|
||||
await completeSimulationTask(runId);
|
||||
return { status: 'failed', reason: 'package_read_failed', runId };
|
||||
}
|
||||
|
||||
// Phase 2 scope: maritime chokepoint + energy/logistics theaters only
|
||||
const eligibleTheaters = (pkgData.selectedTheaters || []).filter((t) =>
|
||||
isMaritimeChokeEnergyCandidate(t),
|
||||
);
|
||||
console.log(` [Simulation] ${runId}: ${eligibleTheaters.length}/${pkgData.selectedTheaters.length} theaters eligible`);
|
||||
|
||||
const theaterResults = [];
|
||||
const failedTheaters = [];
|
||||
|
||||
for (const theater of eligibleTheaters) {
|
||||
console.log(` [Simulation] Running theater: ${theater.theaterId}`);
|
||||
const result = await runTheaterSimulation(theater, pkgData);
|
||||
if (result.failed) {
|
||||
console.warn(` [Simulation] Theater ${theater.theaterId} failed: ${result.reason}`);
|
||||
failedTheaters.push({ theaterId: theater.theaterId, reason: result.reason });
|
||||
continue;
|
||||
}
|
||||
|
||||
const r2Paths = result.round2?.paths || [];
|
||||
const r1Paths = result.round1?.paths || [];
|
||||
const mergedPaths = (r2Paths.length ? r2Paths : r1Paths).map((p) => {
|
||||
const r1Path = r1Paths.find((r) => r.pathId === p.pathId);
|
||||
return {
|
||||
pathId: p.pathId,
|
||||
label: sanitizeForPrompt(p.label || p.pathId).slice(0, 80),
|
||||
summary: sanitizeForPrompt(p.summary || '').slice(0, 200),
|
||||
keyActors: Array.isArray(p.keyActors) ? p.keyActors.map((s) => sanitizeForPrompt(String(s)).slice(0, 80)).slice(0, 6) : [],
|
||||
roundByRoundEvolution: Array.isArray(p.roundByRoundEvolution)
|
||||
? p.roundByRoundEvolution.map((r) => ({ round: r.round, summary: sanitizeForPrompt(r.summary || '').slice(0, 160) }))
|
||||
: [{ round: 1, summary: sanitizeForPrompt((r1Path?.summary || p.summary || '')).slice(0, 160) }],
|
||||
confidence: typeof p.confidence === 'number' ? Math.max(0, Math.min(1, p.confidence)) : 0.5,
|
||||
timingMarkers: Array.isArray(p.timingMarkers)
|
||||
? p.timingMarkers.slice(0, 6).map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) }))
|
||||
: [],
|
||||
};
|
||||
});
|
||||
|
||||
theaterResults.push({
|
||||
theaterId: theater.theaterId,
|
||||
topPaths: mergedPaths,
|
||||
dominantReactions: (result.round1?.dominantReactions || []).map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
|
||||
stabilizers: (result.round2?.stabilizers || []).map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
|
||||
invalidators: (result.round2?.invalidators || []).map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
|
||||
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4).map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) })),
|
||||
});
|
||||
}
|
||||
|
||||
const outcome = {
|
||||
runId,
|
||||
schemaVersion: SIMULATION_OUTCOME_SCHEMA_VERSION,
|
||||
runnerVersion: SIMULATION_RUNNER_VERSION,
|
||||
sourceSimulationPackageKey: pkgPointer.pkgKey,
|
||||
theaterResults,
|
||||
failedTheaters,
|
||||
globalObservations: eligibleTheaters.length === 0
|
||||
? 'No maritime chokepoint/energy theaters in package'
|
||||
: theaterResults.length === 0 ? 'All theaters failed simulation' : '',
|
||||
confidenceNotes: `${theaterResults.length}/${eligibleTheaters.length} theaters completed`,
|
||||
generatedAt: pkgData.generatedAt || Date.now(),
|
||||
};
|
||||
|
||||
const writeResult = await writeSimulationOutcome(pkgData, outcome, { storageConfig });
|
||||
await completeSimulationTask(runId);
|
||||
console.log(` [Simulation] Completed ${runId}: ${theaterResults.length} theaters → ${writeResult?.outcomeKey}`);
|
||||
return { status: 'completed', runId, theaterCount: theaterResults.length, outcomeKey: writeResult?.outcomeKey };
|
||||
} catch (err) {
|
||||
console.warn(` [Simulation] Task failed for ${runId}: ${err.message}`);
|
||||
await completeSimulationTask(runId);
|
||||
return { status: 'failed', reason: err.message, runId };
|
||||
}
|
||||
}
|
||||
return { status: 'idle' };
|
||||
}
|
||||
|
||||
async function runSimulationWorker({ once = false, runId = '' } = {}) {
|
||||
for (;;) {
|
||||
const result = await processNextSimulationTask({ runId });
|
||||
if (once) return result;
|
||||
if (result?.status === 'idle') await sleep(SIMULATION_POLL_INTERVAL_MS);
|
||||
}
|
||||
}
|
||||
|
||||
export {
|
||||
CANONICAL_KEY,
|
||||
PRIOR_KEY,
|
||||
@@ -15535,6 +15969,17 @@ export {
|
||||
enqueueDeepForecastTask,
|
||||
processNextDeepForecastTask,
|
||||
runDeepForecastWorker,
|
||||
SIMULATION_OUTCOME_LATEST_KEY,
|
||||
SIMULATION_OUTCOME_SCHEMA_VERSION,
|
||||
buildSimulationOutcomeKey,
|
||||
writeSimulationOutcome,
|
||||
buildSimulationRound1SystemPrompt,
|
||||
buildSimulationRound2SystemPrompt,
|
||||
extractSimulationRoundPayload,
|
||||
runTheaterSimulation,
|
||||
enqueueSimulationTask,
|
||||
processNextSimulationTask,
|
||||
runSimulationWorker,
|
||||
scoreImpactExpansionQuality,
|
||||
buildImpactExpansionDebugPayload,
|
||||
runImpactExpansionPromptRefinement,
|
||||
|
||||
@@ -1,3 +1,11 @@
|
||||
/**
|
||||
* Shared Redis pointer keys for simulation artifacts.
|
||||
* Defined here so TypeScript handlers and seed scripts agree on the exact string.
|
||||
* The MJS seed script keeps its own copy (cannot import TS source directly).
|
||||
*/
|
||||
export const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
|
||||
export const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
|
||||
|
||||
/**
|
||||
* Static cache keys for the bootstrap endpoint.
|
||||
* Only keys with NO request-varying suffixes are included.
|
||||
|
||||
@@ -152,6 +152,7 @@ const RPC_CACHE_TIER: Record<string, CacheTier> = {
|
||||
'/api/prediction/v1/list-prediction-markets': 'medium',
|
||||
'/api/forecast/v1/get-forecasts': 'medium',
|
||||
'/api/forecast/v1/get-simulation-package': 'slow',
|
||||
'/api/forecast/v1/get-simulation-outcome': 'slow',
|
||||
'/api/supply-chain/v1/get-chokepoint-status': 'medium',
|
||||
'/api/news/v1/list-feed-digest': 'slow',
|
||||
'/api/intelligence/v1/get-country-facts': 'daily',
|
||||
|
||||
45
server/worldmonitor/forecast/v1/get-simulation-outcome.ts
Normal file
45
server/worldmonitor/forecast/v1/get-simulation-outcome.ts
Normal file
@@ -0,0 +1,45 @@
|
||||
import type {
|
||||
ForecastServiceHandler,
|
||||
ServerContext,
|
||||
GetSimulationOutcomeRequest,
|
||||
GetSimulationOutcomeResponse,
|
||||
} from '../../../../src/generated/server/worldmonitor/forecast/v1/service_server';
|
||||
import { getRawJson } from '../../../_shared/redis';
|
||||
import { markNoCacheResponse } from '../../../_shared/response-headers';
|
||||
import { SIMULATION_OUTCOME_LATEST_KEY } from '../../../_shared/cache-keys';
|
||||
|
||||
type OutcomePointer = { runId: string; outcomeKey: string; schemaVersion: string; theaterCount: number; generatedAt: number };
|
||||
|
||||
function isOutcomePointer(v: unknown): v is OutcomePointer {
|
||||
if (!v || typeof v !== 'object') return false;
|
||||
const o = v as Record<string, unknown>;
|
||||
return typeof o['runId'] === 'string' && typeof o['outcomeKey'] === 'string'
|
||||
&& typeof o['schemaVersion'] === 'string' && typeof o['theaterCount'] === 'number'
|
||||
&& typeof o['generatedAt'] === 'number';
|
||||
}
|
||||
|
||||
const NOT_FOUND: GetSimulationOutcomeResponse = {
|
||||
found: false, runId: '', outcomeKey: '', schemaVersion: '', theaterCount: 0, generatedAt: 0, note: '', error: '',
|
||||
};
|
||||
|
||||
export const getSimulationOutcome: ForecastServiceHandler['getSimulationOutcome'] = async (
|
||||
ctx: ServerContext,
|
||||
req: GetSimulationOutcomeRequest,
|
||||
): Promise<GetSimulationOutcomeResponse> => {
|
||||
try {
|
||||
const raw = await getRawJson(SIMULATION_OUTCOME_LATEST_KEY);
|
||||
const pointer = isOutcomePointer(raw) ? raw : null;
|
||||
if (!pointer?.outcomeKey) {
|
||||
markNoCacheResponse(ctx.request); // don't cache not-found — outcome may appear soon after a simulation run
|
||||
return NOT_FOUND;
|
||||
}
|
||||
const note = req.runId && req.runId !== pointer.runId
|
||||
? 'runId filter not yet active; returned outcome may differ from requested run'
|
||||
: '';
|
||||
return { found: true, runId: pointer.runId, outcomeKey: pointer.outcomeKey, schemaVersion: pointer.schemaVersion, theaterCount: pointer.theaterCount, generatedAt: pointer.generatedAt, note, error: '' };
|
||||
} catch (err) {
|
||||
console.warn('[getSimulationOutcome] Redis error:', err instanceof Error ? err.message : String(err));
|
||||
markNoCacheResponse(ctx.request); // don't cache error state
|
||||
return { ...NOT_FOUND, error: 'redis_unavailable' };
|
||||
}
|
||||
};
|
||||
@@ -6,8 +6,17 @@ import type {
|
||||
} from '../../../../src/generated/server/worldmonitor/forecast/v1/service_server';
|
||||
import { getRawJson } from '../../../_shared/redis';
|
||||
import { markNoCacheResponse } from '../../../_shared/response-headers';
|
||||
import { SIMULATION_PACKAGE_LATEST_KEY } from '../../../_shared/cache-keys';
|
||||
|
||||
const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
|
||||
type PackagePointer = { runId: string; pkgKey: string; schemaVersion: string; theaterCount: number; generatedAt: number };
|
||||
|
||||
function isPackagePointer(v: unknown): v is PackagePointer {
|
||||
if (!v || typeof v !== 'object') return false;
|
||||
const o = v as Record<string, unknown>;
|
||||
return typeof o['runId'] === 'string' && typeof o['pkgKey'] === 'string'
|
||||
&& typeof o['schemaVersion'] === 'string' && typeof o['theaterCount'] === 'number'
|
||||
&& typeof o['generatedAt'] === 'number';
|
||||
}
|
||||
|
||||
const NOT_FOUND: GetSimulationPackageResponse = {
|
||||
found: false, runId: '', pkgKey: '', schemaVersion: '', theaterCount: 0, generatedAt: 0, note: '', error: '',
|
||||
@@ -18,9 +27,8 @@ export const getSimulationPackage: ForecastServiceHandler['getSimulationPackage'
|
||||
req: GetSimulationPackageRequest,
|
||||
): Promise<GetSimulationPackageResponse> => {
|
||||
try {
|
||||
const pointer = await getRawJson(SIMULATION_PACKAGE_LATEST_KEY) as {
|
||||
runId: string; pkgKey: string; schemaVersion: string; theaterCount: number; generatedAt: number;
|
||||
} | null;
|
||||
const raw = await getRawJson(SIMULATION_PACKAGE_LATEST_KEY);
|
||||
const pointer = isPackagePointer(raw) ? raw : null;
|
||||
if (!pointer?.pkgKey) {
|
||||
markNoCacheResponse(ctx.request); // don't cache not-found — package may appear soon after a deep run
|
||||
return NOT_FOUND;
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
import type { ForecastServiceHandler } from '../../../../src/generated/server/worldmonitor/forecast/v1/service_server';
|
||||
import { getForecasts } from './get-forecasts';
|
||||
import { getSimulationPackage } from './get-simulation-package';
|
||||
import { getSimulationOutcome } from './get-simulation-outcome';
|
||||
|
||||
export const forecastHandler: ForecastServiceHandler = { getForecasts, getSimulationPackage };
|
||||
export const forecastHandler: ForecastServiceHandler = { getForecasts, getSimulationPackage, getSimulationOutcome };
|
||||
|
||||
@@ -136,6 +136,21 @@ export interface GetSimulationPackageResponse {
|
||||
error: string;
|
||||
}
|
||||
|
||||
export interface GetSimulationOutcomeRequest {
|
||||
runId: string;
|
||||
}
|
||||
|
||||
export interface GetSimulationOutcomeResponse {
|
||||
found: boolean;
|
||||
runId: string;
|
||||
outcomeKey: string;
|
||||
schemaVersion: string;
|
||||
theaterCount: number;
|
||||
generatedAt: number;
|
||||
note: string;
|
||||
error: string;
|
||||
}
|
||||
|
||||
export interface FieldViolation {
|
||||
field: string;
|
||||
description: string;
|
||||
@@ -235,6 +250,31 @@ export class ForecastServiceClient {
|
||||
return await resp.json() as GetSimulationPackageResponse;
|
||||
}
|
||||
|
||||
async getSimulationOutcome(req: GetSimulationOutcomeRequest, options?: ForecastServiceCallOptions): Promise<GetSimulationOutcomeResponse> {
|
||||
let path = "/api/forecast/v1/get-simulation-outcome";
|
||||
const params = new URLSearchParams();
|
||||
if (req.runId != null && req.runId !== "") params.set("runId", String(req.runId));
|
||||
const url = this.baseURL + path + (params.toString() ? "?" + params.toString() : "");
|
||||
|
||||
const headers: Record<string, string> = {
|
||||
"Content-Type": "application/json",
|
||||
...this.defaultHeaders,
|
||||
...options?.headers,
|
||||
};
|
||||
|
||||
const resp = await this.fetchFn(url, {
|
||||
method: "GET",
|
||||
headers,
|
||||
signal: options?.signal,
|
||||
});
|
||||
|
||||
if (!resp.ok) {
|
||||
return this.handleError(resp);
|
||||
}
|
||||
|
||||
return await resp.json() as GetSimulationOutcomeResponse;
|
||||
}
|
||||
|
||||
private async handleError(resp: Response): Promise<never> {
|
||||
const body = await resp.text();
|
||||
if (resp.status === 400) {
|
||||
|
||||
@@ -136,6 +136,21 @@ export interface GetSimulationPackageResponse {
|
||||
error: string;
|
||||
}
|
||||
|
||||
export interface GetSimulationOutcomeRequest {
|
||||
runId: string;
|
||||
}
|
||||
|
||||
export interface GetSimulationOutcomeResponse {
|
||||
found: boolean;
|
||||
runId: string;
|
||||
outcomeKey: string;
|
||||
schemaVersion: string;
|
||||
theaterCount: number;
|
||||
generatedAt: number;
|
||||
note: string;
|
||||
error: string;
|
||||
}
|
||||
|
||||
export interface FieldViolation {
|
||||
field: string;
|
||||
description: string;
|
||||
@@ -183,6 +198,7 @@ export interface RouteDescriptor {
|
||||
export interface ForecastServiceHandler {
|
||||
getForecasts(ctx: ServerContext, req: GetForecastsRequest): Promise<GetForecastsResponse>;
|
||||
getSimulationPackage(ctx: ServerContext, req: GetSimulationPackageRequest): Promise<GetSimulationPackageResponse>;
|
||||
getSimulationOutcome(ctx: ServerContext, req: GetSimulationOutcomeRequest): Promise<GetSimulationOutcomeResponse>;
|
||||
}
|
||||
|
||||
export function createForecastServiceRoutes(
|
||||
@@ -285,6 +301,53 @@ export function createForecastServiceRoutes(
|
||||
}
|
||||
},
|
||||
},
|
||||
{
|
||||
method: "GET",
|
||||
path: "/api/forecast/v1/get-simulation-outcome",
|
||||
handler: async (req: Request): Promise<Response> => {
|
||||
try {
|
||||
const pathParams: Record<string, string> = {};
|
||||
const url = new URL(req.url, "http://localhost");
|
||||
const params = url.searchParams;
|
||||
const body: GetSimulationOutcomeRequest = {
|
||||
runId: params.get("runId") ?? "",
|
||||
};
|
||||
if (options?.validateRequest) {
|
||||
const bodyViolations = options.validateRequest("getSimulationOutcome", body);
|
||||
if (bodyViolations) {
|
||||
throw new ValidationError(bodyViolations);
|
||||
}
|
||||
}
|
||||
|
||||
const ctx: ServerContext = {
|
||||
request: req,
|
||||
pathParams,
|
||||
headers: Object.fromEntries(req.headers.entries()),
|
||||
};
|
||||
|
||||
const result = await handler.getSimulationOutcome(ctx, body);
|
||||
return new Response(JSON.stringify(result as GetSimulationOutcomeResponse), {
|
||||
status: 200,
|
||||
headers: { "Content-Type": "application/json" },
|
||||
});
|
||||
} catch (err: unknown) {
|
||||
if (err instanceof ValidationError) {
|
||||
return new Response(JSON.stringify({ violations: err.violations }), {
|
||||
status: 400,
|
||||
headers: { "Content-Type": "application/json" },
|
||||
});
|
||||
}
|
||||
if (options?.onError) {
|
||||
return options.onError(err, req);
|
||||
}
|
||||
const message = err instanceof Error ? err.message : String(err);
|
||||
return new Response(JSON.stringify({ message }), {
|
||||
status: 500,
|
||||
headers: { "Content-Type": "application/json" },
|
||||
});
|
||||
}
|
||||
},
|
||||
},
|
||||
];
|
||||
}
|
||||
|
||||
|
||||
@@ -55,6 +55,13 @@ import {
|
||||
SIMULATION_PACKAGE_SCHEMA_VERSION,
|
||||
SIMULATION_PACKAGE_LATEST_KEY,
|
||||
writeSimulationPackage,
|
||||
SIMULATION_OUTCOME_LATEST_KEY,
|
||||
SIMULATION_OUTCOME_SCHEMA_VERSION,
|
||||
buildSimulationOutcomeKey,
|
||||
writeSimulationOutcome,
|
||||
buildSimulationRound1SystemPrompt,
|
||||
buildSimulationRound2SystemPrompt,
|
||||
extractSimulationRoundPayload,
|
||||
} from '../scripts/seed-forecasts.mjs';
|
||||
|
||||
import {
|
||||
@@ -5528,6 +5535,24 @@ describe('simulation package export', () => {
|
||||
})), true);
|
||||
});
|
||||
|
||||
it('isMaritimeChokeEnergyCandidate accepts candidate with energy bucket on root (flat shape, no marketContext)', () => {
|
||||
// Flat shape: topBucketId is on the candidate root, no marketContext object.
|
||||
// This is the package JSON shape written by buildSimulationPackageFromDeepSnapshot.
|
||||
assert.equal(isMaritimeChokeEnergyCandidate(makeCandidate({
|
||||
marketContext: undefined,
|
||||
topBucketId: 'energy',
|
||||
})), true);
|
||||
});
|
||||
|
||||
it('isMaritimeChokeEnergyCandidate rejects flat shape with non-energy bucket and no energy commodity', () => {
|
||||
assert.equal(isMaritimeChokeEnergyCandidate(makeCandidate({
|
||||
marketContext: undefined,
|
||||
topBucketId: 'semis',
|
||||
commodityKey: '',
|
||||
marketBucketIds: ['semis'],
|
||||
})), false);
|
||||
});
|
||||
|
||||
it('buildSimulationPackageFromDeepSnapshot returns null when no qualifying candidates', () => {
|
||||
const pkg = buildSimulationPackageFromDeepSnapshot(makeSnapshot([
|
||||
makeCandidate({ routeFacilityKey: '' }),
|
||||
@@ -5705,3 +5730,197 @@ describe('simulation package export', () => {
|
||||
assert.equal(result, null);
|
||||
});
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// MiroFish Phase 2 — Simulation Runner
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
const minimalTheater = {
|
||||
theaterId: 'test-theater-1',
|
||||
theaterRegion: 'Red Sea',
|
||||
theaterLabel: 'Red Sea / Bab-el-Mandeb',
|
||||
candidateStateId: 'state-001',
|
||||
routeFacilityKey: 'Red Sea',
|
||||
dominantRegion: 'Middle East',
|
||||
macroRegions: ['MENA'],
|
||||
topBucketId: 'energy',
|
||||
topChannel: 'price_spike',
|
||||
marketBucketIds: ['energy', 'freight'],
|
||||
};
|
||||
|
||||
const minimalPkg = {
|
||||
runId: 'run-001',
|
||||
generatedAt: 1711234567000,
|
||||
selectedTheaters: [minimalTheater],
|
||||
entities: [
|
||||
{ entityId: 'houthi-forces', name: 'Houthi Forces', class: 'military_or_security_actor', region: 'Yemen', stance: 'active', objectives: [], constraints: [], relevanceToTheater: 'test-theater-1' },
|
||||
{ entityId: 'aramco-exports', name: 'Saudi Aramco', class: 'exporter_or_importer', region: 'Saudi Arabia', stance: 'stressed', objectives: [], constraints: [], relevanceToTheater: 'test-theater-1' },
|
||||
],
|
||||
eventSeeds: [
|
||||
{ seedId: 'seed-1', theaterId: 'test-theater-1', type: 'live_news', summary: 'Houthi missile attack on Red Sea shipping', evidenceRefs: ['E1'], timing: 'T+0h' },
|
||||
{ seedId: 'seed-2', theaterId: 'test-theater-1', type: 'state_signal', summary: 'Oil tanker rerouting Cape of Good Hope', evidenceRefs: ['E2'], timing: 'T+12h' },
|
||||
],
|
||||
constraints: { 'test-theater-1': ['No actor may unilaterally close the Strait of Bab-el-Mandeb'] },
|
||||
evaluationTargets: { 'test-theater-1': ['Oil price trajectory over 72h', 'Shipping diversion extent'] },
|
||||
simulationRequirement: { 'test-theater-1': 'Simulate how a Red Sea disruption propagates through energy and logistics markets' },
|
||||
};
|
||||
|
||||
describe('simulation runner — prompt builders', () => {
|
||||
it('Round 1 prompt contains theater label and region', () => {
|
||||
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
|
||||
assert.ok(prompt.includes('Red Sea / Bab-el-Mandeb'), 'should include theater label');
|
||||
assert.ok(prompt.includes('Red Sea'), 'should include theater region');
|
||||
});
|
||||
|
||||
it('Round 1 prompt contains all 3 required path IDs', () => {
|
||||
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
|
||||
assert.ok(prompt.includes('"escalation"'), 'should mention escalation path');
|
||||
assert.ok(prompt.includes('"containment"'), 'should mention containment path');
|
||||
assert.ok(prompt.includes('"spillover"'), 'should mention spillover path');
|
||||
});
|
||||
|
||||
it('Round 1 prompt lists entity IDs', () => {
|
||||
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
|
||||
assert.ok(prompt.includes('houthi-forces'), 'should include entity entityId');
|
||||
assert.ok(prompt.includes('aramco-exports'), 'should include entity entityId');
|
||||
});
|
||||
|
||||
it('Round 1 prompt lists event seed IDs', () => {
|
||||
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
|
||||
assert.ok(prompt.includes('seed-1'), 'should include seed-1');
|
||||
assert.ok(prompt.includes('seed-2'), 'should include seed-2');
|
||||
});
|
||||
|
||||
it('Round 1 prompt includes simulation requirement', () => {
|
||||
const prompt = buildSimulationRound1SystemPrompt(minimalTheater, minimalPkg);
|
||||
assert.ok(prompt.includes('Red Sea disruption'), 'should include simulationRequirement text');
|
||||
});
|
||||
|
||||
it('Round 2 prompt contains Round 1 path summaries', () => {
|
||||
const round1 = {
|
||||
paths: [
|
||||
{ pathId: 'escalation', summary: 'Escalation path summary', initialReactions: [{ actorId: 'houthi-forces' }] },
|
||||
{ pathId: 'containment', summary: 'Containment path summary', initialReactions: [] },
|
||||
{ pathId: 'spillover', summary: 'Spillover path summary', initialReactions: [] },
|
||||
],
|
||||
};
|
||||
const prompt = buildSimulationRound2SystemPrompt(minimalTheater, minimalPkg, round1);
|
||||
assert.ok(prompt.includes('Escalation path summary'), 'should include round 1 escalation summary');
|
||||
assert.ok(prompt.includes('Containment path summary'), 'should include round 1 containment summary');
|
||||
assert.ok(prompt.includes('ROUND 2'), 'should indicate this is round 2');
|
||||
});
|
||||
|
||||
it('Round 2 prompt includes valid actor IDs list', () => {
|
||||
const round1 = { paths: [] };
|
||||
const prompt = buildSimulationRound2SystemPrompt(minimalTheater, minimalPkg, round1);
|
||||
assert.ok(prompt.includes('houthi-forces'), 'should include valid actor IDs');
|
||||
});
|
||||
});
|
||||
|
||||
describe('simulation runner — extractSimulationRoundPayload', () => {
|
||||
const r1Payload = JSON.stringify({
|
||||
paths: [
|
||||
{ pathId: 'escalation', label: 'Escalate', summary: 'Forces escalate', initialReactions: [] },
|
||||
{ pathId: 'containment', label: 'Contain', summary: 'Forces contained', initialReactions: [] },
|
||||
{ pathId: 'spillover', label: 'Spill', summary: 'Spillover effect', initialReactions: [] },
|
||||
],
|
||||
dominantReactions: ['Actor A: escalates'],
|
||||
note: 'Three divergent paths',
|
||||
});
|
||||
|
||||
const r2Payload = JSON.stringify({
|
||||
paths: [
|
||||
{ pathId: 'escalation', label: 'Full Escalation', summary: 'Escalated 72h', keyActors: ['houthi-forces'], roundByRoundEvolution: [{ round: 1, summary: 'Round 1' }, { round: 2, summary: 'Round 2' }], confidence: 0.75, timingMarkers: [{ event: 'First strike', timing: 'T+6h' }] },
|
||||
{ pathId: 'containment', label: 'Contained', summary: 'Contained 72h', keyActors: [], roundByRoundEvolution: [], confidence: 0.6, timingMarkers: [] },
|
||||
{ pathId: 'spillover', label: 'Spilled', summary: 'Spillover 72h', keyActors: [], roundByRoundEvolution: [], confidence: 0.4, timingMarkers: [] },
|
||||
],
|
||||
stabilizers: ['International pressure'],
|
||||
invalidators: ['New attack'],
|
||||
globalObservations: 'Cross-theater ripple effects expected',
|
||||
confidenceNotes: 'Moderate confidence overall',
|
||||
});
|
||||
|
||||
it('parses valid Round 1 JSON directly', () => {
|
||||
const result = extractSimulationRoundPayload(r1Payload, 1);
|
||||
assert.ok(Array.isArray(result.paths), 'should return paths array');
|
||||
assert.equal(result.paths.length, 3, 'should have 3 paths');
|
||||
assert.equal(result.paths[0].pathId, 'escalation');
|
||||
assert.ok(Array.isArray(result.dominantReactions), 'should include dominantReactions');
|
||||
assert.equal(result.diagnostics.stage, 'direct');
|
||||
});
|
||||
|
||||
it('parses valid Round 2 JSON directly', () => {
|
||||
const result = extractSimulationRoundPayload(r2Payload, 2);
|
||||
assert.ok(Array.isArray(result.paths), 'should return paths array');
|
||||
assert.equal(result.paths.length, 3);
|
||||
assert.ok(Array.isArray(result.stabilizers), 'should include stabilizers');
|
||||
assert.ok(Array.isArray(result.invalidators), 'should include invalidators');
|
||||
assert.ok(typeof result.globalObservations === 'string');
|
||||
});
|
||||
|
||||
it('strips fenced code blocks and parses Round 1', () => {
|
||||
const fenced = `\`\`\`json\n${r1Payload}\n\`\`\``;
|
||||
const result = extractSimulationRoundPayload(fenced, 1);
|
||||
assert.ok(Array.isArray(result.paths), 'should parse fenced JSON');
|
||||
assert.equal(result.paths.length, 3);
|
||||
});
|
||||
|
||||
it('strips <think> tags before parsing', () => {
|
||||
const withThink = `<think>internal reasoning here</think>\n${r1Payload}`;
|
||||
const result = extractSimulationRoundPayload(withThink, 1);
|
||||
assert.ok(Array.isArray(result.paths), 'should parse after stripping think tags');
|
||||
});
|
||||
|
||||
it('returns null paths on invalid JSON', () => {
|
||||
const result = extractSimulationRoundPayload('not valid json', 1);
|
||||
assert.equal(result.paths, null);
|
||||
assert.equal(result.diagnostics.stage, 'no_json');
|
||||
});
|
||||
|
||||
it('returns null paths when paths array is missing', () => {
|
||||
const result = extractSimulationRoundPayload('{"no_paths": true}', 1);
|
||||
assert.equal(result.paths, null);
|
||||
});
|
||||
|
||||
it('returns null paths when no valid pathId present', () => {
|
||||
const badPaths = JSON.stringify({ paths: [{ pathId: 'unknown', summary: 'x' }] });
|
||||
const result = extractSimulationRoundPayload(badPaths, 1);
|
||||
assert.equal(result.paths, null);
|
||||
});
|
||||
|
||||
it('uses extractFirstJsonObject fallback for prefix text', () => {
|
||||
const withPrefix = `Here is the result:\n${r1Payload}\nEnd.`;
|
||||
const result = extractSimulationRoundPayload(withPrefix, 1);
|
||||
assert.ok(Array.isArray(result.paths), 'should parse via extractFirstJsonObject fallback');
|
||||
});
|
||||
});
|
||||
|
||||
describe('simulation runner — outcome key builder', () => {
|
||||
it('buildSimulationOutcomeKey produces a key ending in simulation-outcome.json', () => {
|
||||
const key = buildSimulationOutcomeKey('run-123', 1711234567000);
|
||||
assert.ok(key.endsWith('/simulation-outcome.json'), `unexpected key: ${key}`);
|
||||
assert.ok(key.includes('run-123'), 'should include runId');
|
||||
});
|
||||
|
||||
it('SIMULATION_OUTCOME_LATEST_KEY is the canonical Redis pointer key', () => {
|
||||
assert.equal(SIMULATION_OUTCOME_LATEST_KEY, 'forecast:simulation-outcome:latest');
|
||||
});
|
||||
|
||||
it('SIMULATION_OUTCOME_SCHEMA_VERSION is v1', () => {
|
||||
assert.equal(SIMULATION_OUTCOME_SCHEMA_VERSION, 'v1');
|
||||
});
|
||||
});
|
||||
|
||||
describe('simulation runner — writeSimulationOutcome', () => {
|
||||
it('returns null when R2 storage is not configured', async () => {
|
||||
const outcome = { theaterResults: [], failedTheaters: [], runId: 'run-001', generatedAt: Date.now() };
|
||||
const result = await writeSimulationOutcome(minimalPkg, outcome, { storageConfig: null });
|
||||
assert.equal(result, null);
|
||||
});
|
||||
|
||||
it('returns null when pkg has no runId', async () => {
|
||||
const outcome = { theaterResults: [], failedTheaters: [] };
|
||||
const result = await writeSimulationOutcome({ generatedAt: Date.now() }, outcome, { storageConfig: null });
|
||||
assert.equal(result, null);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -230,3 +230,54 @@ describe('getVesselSnapshot caching (HIGH-1)', () => {
|
||||
// NOTE: Full integration test (mocking fetch, verifying cache hits) requires
|
||||
// a TypeScript-capable test runner. This structural test verifies the pattern.
|
||||
});
|
||||
|
||||
// ========================================================================
|
||||
// getSimulationOutcome handler — structural tests
|
||||
// ========================================================================
|
||||
|
||||
describe('getSimulationOutcome handler', () => {
|
||||
const src = readSrc('server/worldmonitor/forecast/v1/get-simulation-outcome.ts');
|
||||
|
||||
it('returns found:false (NOT_FOUND) when pointer is absent', () => {
|
||||
// The handler must define a NOT_FOUND sentinel with found: false
|
||||
assert.match(src, /found:\s*false/,
|
||||
'NOT_FOUND constant should set found: false');
|
||||
// And return it when the pointer is missing
|
||||
assert.match(src, /return\s+NOT_FOUND/,
|
||||
'Should return NOT_FOUND when key is absent');
|
||||
});
|
||||
|
||||
it('uses isOutcomePointer type guard before accessing pointer fields', () => {
|
||||
assert.match(src, /isOutcomePointer\(raw\)/,
|
||||
'Should use isOutcomePointer type guard on getRawJson result');
|
||||
// Guard must check string and number fields — not just truthy
|
||||
assert.match(src, /typeof\s+o\[.runId.\]\s*===\s*'string'/,
|
||||
'Type guard should verify runId is a string');
|
||||
assert.match(src, /typeof\s+o\[.theaterCount.\]\s*===\s*'number'/,
|
||||
'Type guard should verify theaterCount is a number');
|
||||
});
|
||||
|
||||
it('returns found:true with all pointer fields on success', () => {
|
||||
assert.match(src, /found:\s*true/,
|
||||
'Success path should return found: true');
|
||||
// Must propagate all pointer fields
|
||||
assert.match(src, /outcomeKey:\s*pointer\.outcomeKey/,
|
||||
'Success path should include outcomeKey from pointer');
|
||||
assert.match(src, /theaterCount:\s*pointer\.theaterCount/,
|
||||
'Success path should include theaterCount from pointer');
|
||||
});
|
||||
|
||||
it('populates note when runId supplied but does not match pointer runId', () => {
|
||||
assert.match(src, /req\.runId.*pointer\.runId/,
|
||||
'Should compare req.runId with pointer.runId for note');
|
||||
assert.match(src, /runId filter not yet active/,
|
||||
'Note text should explain the Phase 3 deferral');
|
||||
});
|
||||
|
||||
it('returns redis_unavailable error string on Redis failure', () => {
|
||||
assert.match(src, /redis_unavailable/,
|
||||
'Should return redis_unavailable on catch');
|
||||
assert.match(src, /markNoCacheResponse.*catch|catch[\s\S]*?markNoCacheResponse/,
|
||||
'Should mark no-cache on error to avoid caching error state');
|
||||
});
|
||||
});
|
||||
|
||||
@@ -0,0 +1,79 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p1
|
||||
issue_id: "018"
|
||||
tags: [code-review, security, simulation-runner, prompt-injection]
|
||||
---
|
||||
|
||||
# Unsanitized entity/seed fields injected into LLM simulation prompts
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`buildSimulationRound1SystemPrompt` and `buildSimulationRound2SystemPrompt` interpolate multiple fields directly into LLM system prompts without calling `sanitizeForPrompt`. The fields `e.entityId`, `e.class`, `e.stance`, `s.seedId`, `s.type`, `s.timing`, and Round 1 `r.actorId` all bypass sanitization entirely. These fields originate from external news data processed by the package builder, where `entityId` is derived from actor names extracted from live headlines via regex. A crafted headline can produce an `entityId` that, when embedded in the system prompt with the instruction "use exact entityId when citing actors", forms a valid prompt injection payload.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (HIGH):** `e.entityId` injected raw with explicit directive to LLM to use it verbatim:
|
||||
```javascript
|
||||
// scripts/seed-forecasts.mjs ~line 15402
|
||||
`- ${e.entityId} | ${sanitizeForPrompt(e.name)} | class=${e.class} | stance=${e.stance || 'unknown'}`
|
||||
// e.entityId, e.class, e.stance — none sanitized
|
||||
```
|
||||
|
||||
**F-2 (HIGH):** Event seed fields `s.seedId`, `s.type`, `s.timing` injected raw:
|
||||
```javascript
|
||||
`- ${s.seedId} [${s.type}] ${sanitizeForPrompt(s.summary)} (${s.timing})`
|
||||
```
|
||||
|
||||
**F-3 (HIGH):** Round 2 prompt uses `r.actorId` from Round 1 LLM output (chaining injection risk):
|
||||
```javascript
|
||||
// scripts/seed-forecasts.mjs ~line 15468
|
||||
actors: ${(p.initialReactions || []).slice(0, 3).map((r) => r.actorId).join(', ')}
|
||||
// r.actorId comes from LLM JSON output — not sanitized before round 2 injection
|
||||
```
|
||||
|
||||
`sanitizeProposedLlmAddition` exists in the same file and provides keyword-pattern blocking ("ignore", "override", "you must") but is never called on simulation fields.
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Apply `sanitizeForPrompt` to all bypassed fields (Recommended)
|
||||
|
||||
```javascript
|
||||
// In buildSimulationRound1SystemPrompt:
|
||||
const entityList = theaterEntities.slice(0, 10).map(
|
||||
(e) => `- ${sanitizeForPrompt(e.entityId)} | ${sanitizeForPrompt(e.name)} | class=${sanitizeForPrompt(e.class)} | stance=${sanitizeForPrompt(e.stance || 'unknown')}`,
|
||||
).join('\n');
|
||||
|
||||
const seedList = theaterSeeds.slice(0, 8).map(
|
||||
(s) => `- ${sanitizeForPrompt(s.seedId)} [${sanitizeForPrompt(s.type)}] ${sanitizeForPrompt(s.summary)} (${sanitizeForPrompt(s.timing)})`,
|
||||
).join('\n');
|
||||
|
||||
// In buildSimulationRound2SystemPrompt:
|
||||
actors: ${(p.initialReactions || []).slice(0, 3).map((r) => sanitizeForPrompt(r.actorId || '')).join(', ')}
|
||||
```
|
||||
|
||||
Effort: Small | Risk: Low
|
||||
|
||||
### Option B: Enforce allowlist regex on `entityId` at package-build time
|
||||
|
||||
Add `/^[a-z0-9_\-]{1,80}$/` validation in `buildSimulationPackageEntities` at the point where `entityId` is generated. Reject any ID not matching the pattern. This is defense-in-depth upstream.
|
||||
|
||||
Effort: Small | Risk: Low
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] All fields interpolated into simulation system prompts are wrapped in `sanitizeForPrompt()`
|
||||
- [ ] `e.entityId`, `e.class`, `e.stance` sanitized in `buildSimulationRound1SystemPrompt`
|
||||
- [ ] `s.seedId`, `s.type`, `s.timing` sanitized in `buildSimulationRound1SystemPrompt`
|
||||
- [ ] `r.actorId` sanitized in `buildSimulationRound2SystemPrompt`
|
||||
- [ ] Test: entity with `entityId` containing newline + directive text produces sanitized prompt
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `buildSimulationRound1SystemPrompt` (~line 15397), `buildSimulationRound2SystemPrompt` (~line 15430), `buildSimulationRound2SystemPrompt` (~line 15468)
|
||||
- Existing function: `sanitizeForPrompt(text)` at line ~13481 — strips newlines, `<>{}`, control chars, truncates at 200 chars
|
||||
- Related: todo #013 (package-builder sanitization) — this is the downstream consumer gap
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review
|
||||
@@ -0,0 +1,92 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p1
|
||||
issue_id: "019"
|
||||
tags: [code-review, security, simulation-runner, path-traversal]
|
||||
---
|
||||
|
||||
# `runId` flows unvalidated into Redis key construction and R2 path
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`buildSimulationTaskKey(runId)` and `buildSimulationLockKey(runId)` construct Redis keys via string concatenation using `runId` with no format validation. More critically, `runId` flows into `buildSimulationOutcomeKey` → `buildTraceRunPrefix` which constructs an R2 key of the form `seed-data/forecast-traces/{year}/{month}/{day}/{runId}/simulation-outcome.json`. A `runId` containing `/../` path traversal sequences could produce an R2 key escaping the intended namespace.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (HIGH):** R2 path uses `runId` directly in `buildTraceRunPrefix`:
|
||||
```javascript
|
||||
// scripts/seed-forecasts.mjs — buildTraceRunPrefix (~line 4407)
|
||||
`${basePrefix}/${year}/${month}/${day}/${runId}`
|
||||
// runId containing '/../' produces: seed-data/forecast-traces/2026/03/24/../../../evil
|
||||
```
|
||||
|
||||
**F-2 (MEDIUM):** Redis key construction via simple concatenation:
|
||||
```javascript
|
||||
function buildSimulationTaskKey(runId) { return `${SIMULATION_TASK_KEY_PREFIX}:${runId}`; }
|
||||
function buildSimulationLockKey(runId) { return `${SIMULATION_LOCK_KEY_PREFIX}:${runId}`; }
|
||||
// No format guard — runId from CLI argv or queue member
|
||||
```
|
||||
|
||||
**F-3 (MEDIUM):** ZADD member in task queue uses raw `runId`:
|
||||
```javascript
|
||||
await redisCommand(url, token, ['ZADD', SIMULATION_TASK_QUEUE_KEY, String(Date.now()), runId]);
|
||||
// If queue is poisoned, `listQueuedSimulationTasks` returns the malformed runId
|
||||
// which then flows into all downstream key construction
|
||||
```
|
||||
|
||||
Entry points: `process.argv` in `process-simulation-tasks.mjs` (operator-controlled, lower risk) and `listQueuedSimulationTasks` (queue member, higher risk if queue is ever written from an untrusted path).
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Validate `runId` format before any key operation (Recommended)
|
||||
|
||||
The existing `parseForecastRunGeneratedAt` (~line 4414) matches `/^(\d{10,})/`, suggesting `runId` values are timestamp-prefixed. Enforce this:
|
||||
|
||||
```javascript
|
||||
const VALID_RUN_ID = /^\d{13,}-[a-z0-9\-]{1,64}$/i;
|
||||
|
||||
function validateRunId(runId) {
|
||||
if (!runId || !VALID_RUN_ID.test(runId)) return null;
|
||||
return runId;
|
||||
}
|
||||
|
||||
// In enqueueSimulationTask:
|
||||
const safeRunId = validateRunId(runId);
|
||||
if (!safeRunId) return { queued: false, reason: 'invalid_run_id_format' };
|
||||
|
||||
// In processNextSimulationTask, validate each queuedRunId before processing:
|
||||
for (const rawId of queuedRunIds) {
|
||||
const runId = validateRunId(rawId);
|
||||
if (!runId) { console.warn('[Simulation] Skipping malformed runId:', rawId); continue; }
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
Effort: Small | Risk: Low
|
||||
|
||||
### Option B: Sanitize R2 path components
|
||||
|
||||
Apply `path.normalize` and prefix-check on the constructed R2 key before write:
|
||||
```javascript
|
||||
const key = buildSimulationOutcomeKey(runId, generatedAt);
|
||||
if (!key.startsWith('seed-data/forecast-traces/')) throw new Error('R2 key escaped namespace');
|
||||
```
|
||||
|
||||
Effort: Small | Risk: Low — defense-in-depth after Option A
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `enqueueSimulationTask` validates `runId` matches expected format before Redis write
|
||||
- [ ] `processNextSimulationTask` validates each `runId` from queue before key construction
|
||||
- [ ] R2 key is prefix-checked before write in `writeSimulationOutcome`
|
||||
- [ ] Invalid `runId` produces `{ queued: false, reason: 'invalid_run_id_format' }` not a silent key operation
|
||||
- [ ] Test: `runId` of `"../../../evil"` is rejected before Redis/R2 operations
|
||||
|
||||
## Technical Details
|
||||
|
||||
- Files: `scripts/seed-forecasts.mjs` — `enqueueSimulationTask` (~line 15636), `buildSimulationTaskKey` (~line 15633), `processNextSimulationTask` (~line 15682), `writeSimulationOutcome` (~line 15613)
|
||||
- Related: `buildTraceRunPrefix` (~line 4407) — used by all trace artifact key builders
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review
|
||||
@@ -0,0 +1,78 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p1
|
||||
issue_id: "020"
|
||||
tags: [code-review, typescript, simulation-runner, type-safety]
|
||||
---
|
||||
|
||||
# Unvalidated `as` cast on `getRawJson` result in `get-simulation-outcome.ts`
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`getRawJson` returns `Promise<unknown | null>`. The handler casts the result with `as { runId: string; outcomeKey: string; ... } | null` — a TypeScript compile-time assertion with no runtime enforcement. If Redis contains a malformed value (wrong shape, missing fields, renamed keys from a schema migration), `pointer.runId`, `pointer.outcomeKey`, etc. would be `undefined`, and the handler returns a partially-populated `GetSimulationOutcomeResponse` with `undefined` values spread into proto fields. The same pattern exists in `get-simulation-package.ts` and should be fixed in both files simultaneously.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (P1):** TypeScript `as` cast provides zero runtime protection:
|
||||
```typescript
|
||||
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts line 21
|
||||
const pointer = await getRawJson(SIMULATION_OUTCOME_LATEST_KEY) as {
|
||||
runId: string; outcomeKey: string; schemaVersion: string; theaterCount: number; generatedAt: number;
|
||||
} | null;
|
||||
// If Redis has { run_id: 'x', outcome_key: 'y' } (snake_case), pointer.runId === undefined
|
||||
// Handler returns { found: true, runId: undefined, ... } — malformed response
|
||||
```
|
||||
|
||||
**F-2 (P2):** Same pattern in `get-simulation-package.ts` line ~21 — fix both together.
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Add a type guard function (Recommended)
|
||||
|
||||
```typescript
|
||||
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts
|
||||
|
||||
function isOutcomePointer(v: unknown): v is {
|
||||
runId: string; outcomeKey: string; schemaVersion: string; theaterCount: number; generatedAt: number;
|
||||
} {
|
||||
if (typeof v !== 'object' || v === null) return false;
|
||||
const p = v as Record<string, unknown>;
|
||||
return typeof p['runId'] === 'string'
|
||||
&& typeof p['outcomeKey'] === 'string'
|
||||
&& typeof p['schemaVersion'] === 'string'
|
||||
&& typeof p['theaterCount'] === 'number'
|
||||
&& typeof p['generatedAt'] === 'number';
|
||||
}
|
||||
|
||||
// In handler:
|
||||
const raw = await getRawJson(SIMULATION_OUTCOME_LATEST_KEY);
|
||||
if (!isOutcomePointer(raw)) {
|
||||
markNoCacheResponse(ctx.request);
|
||||
return NOT_FOUND; // treat malformed as not-found
|
||||
}
|
||||
const pointer = raw; // fully typed, no cast
|
||||
```
|
||||
|
||||
Effort: Small | Risk: Low — safe degradation to NOT_FOUND on invalid data
|
||||
|
||||
### Option B: Use zod schema validation (heavier but more maintainable)
|
||||
|
||||
Add a `z.object({...}).safeParse()` call. Only viable if zod is already in the project dependencies.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `get-simulation-outcome.ts` uses a type guard instead of `as` cast
|
||||
- [ ] Malformed Redis value returns `NOT_FOUND` response (not a partially-populated response)
|
||||
- [ ] `get-simulation-package.ts` receives the same fix simultaneously
|
||||
- [ ] TypeScript strict mode still passes after the change (no `any` introduced)
|
||||
- [ ] Test: mocked `getRawJson` returning `{ run_id: 'x' }` (wrong key names) → handler returns `found: false`
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `server/worldmonitor/forecast/v1/get-simulation-outcome.ts` lines 21-23
|
||||
- File: `server/worldmonitor/forecast/v1/get-simulation-package.ts` lines ~21-23 (same pattern)
|
||||
- `getRawJson` return type: `Promise<unknown | null>` — correct to return unknown
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:kieran-typescript-reviewer in PR #2220 review
|
||||
68
todos/021-pending-p1-simulation-no-http-trigger-endpoint.md
Normal file
68
todos/021-pending-p1-simulation-no-http-trigger-endpoint.md
Normal file
@@ -0,0 +1,68 @@
|
||||
---
|
||||
status: pending
|
||||
priority: p1
|
||||
issue_id: "021"
|
||||
tags: [code-review, agent-native, simulation-runner, api]
|
||||
---
|
||||
|
||||
# No HTTP endpoint to trigger a simulation run — agents cannot initiate simulations
|
||||
|
||||
## Problem Statement
|
||||
|
||||
Simulation runs can only be triggered by a human operator running `node scripts/process-simulation-tasks.mjs --once` in the Railway environment. `enqueueSimulationTask(runId)` and `runSimulationWorker` are exported from `scripts/seed-forecasts.mjs` but are only callable from worker processes, not via HTTP. Agents operating through the HTTP API (AI Market Implications panel, future orchestration agents, LLM tool calls) have read-only access to the system — they can discover the latest simulation outcome pointer but cannot trigger a new simulation. For a feature described as AI-driven forecasting, agents being permanently blocked from initiating analysis is a design gap.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (P1):** No `POST /api/forecast/v1/trigger-simulation` or equivalent endpoint exists.
|
||||
|
||||
**F-2 (P1):** `enqueueSimulationTask(runId)` is exported and callable, but only from Node.js processes — no HTTP surface.
|
||||
|
||||
**F-3 (P2):** Compounded by `runId` filter being a no-op in `getSimulationOutcome` — even if an agent knew its trigger succeeded, it cannot verify its specific run completed vs. a concurrent run superseding it.
|
||||
|
||||
**Capability map:**
|
||||
|
||||
| Action | Human | Agent (HTTP) |
|
||||
|---|---|---|
|
||||
| Check outcome exists | ✅ | ✅ |
|
||||
| Read outcome pointer | ✅ | ✅ |
|
||||
| Trigger simulation run | ✅ (Railway CLI) | ❌ |
|
||||
| Check if run in progress | ✅ (logs) | ❌ |
|
||||
| Verify specific run completed | ✅ | ❌ (runId filter no-op) |
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Add `POST /api/forecast/v1/trigger-simulation` (Recommended)
|
||||
|
||||
A thin Vercel handler following the same proto pattern:
|
||||
|
||||
1. New proto message: `TriggerSimulationRequest { string run_id = 1; }`, `TriggerSimulationResponse { bool queued = 1; string run_id = 2; string reason = 3; }`
|
||||
2. New handler: reads `SIMULATION_PACKAGE_LATEST_KEY` from Redis to derive `runId` if not supplied, calls `enqueueSimulationTask(runId)`, returns `{ queued, runId, reason }`
|
||||
3. The actual execution remains Railway-side (existing poll loop picks it up) — the endpoint only enqueues
|
||||
4. Rate-limit to 1 trigger per 5 minutes to prevent spam (can reuse existing rate-limit pattern)
|
||||
|
||||
Estimated effort: 1 proto file + 1 handler file + 1 service.proto entry + `make generate` — same scope as `get-simulation-outcome.ts`.
|
||||
|
||||
### Option B: Webhook trigger from deep forecast completion
|
||||
|
||||
When `processNextDeepForecastTask` completes and writes a simulation package, automatically call `enqueueSimulationTask`. This makes simulation trigger automatic rather than agent-driven. Simpler but removes on-demand triggering flexibility.
|
||||
|
||||
Effort: Small | Risk: Low — no new HTTP surface, but agents still can't trigger ad-hoc
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `POST /api/forecast/v1/trigger-simulation` returns `{ queued: true, runId }` when package is available
|
||||
- [ ] Returns `{ queued: false, reason: 'no_package' }` when no simulation package exists
|
||||
- [ ] Returns `{ queued: false, reason: 'duplicate' }` when the same runId is already queued
|
||||
- [ ] Rate limited to prevent spam
|
||||
- [ ] Agent-native: an agent calling the trigger endpoint then polling `getSimulationOutcome` can complete a trigger-and-verify workflow
|
||||
|
||||
## Technical Details
|
||||
|
||||
- Would-be handler: `server/worldmonitor/forecast/v1/trigger-simulation.ts`
|
||||
- Entry point: `enqueueSimulationTask(runId)` in `scripts/seed-forecasts.mjs` (already exported)
|
||||
- Pattern reference: `get-simulation-outcome.ts` for handler structure, `service.proto` for RPC addition
|
||||
- Related: todo #029 (runId filter no-op) — fix both for complete trigger-and-verify loop
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:agent-native-reviewer in PR #2220 review
|
||||
@@ -0,0 +1,63 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p2
|
||||
issue_id: "022"
|
||||
tags: [code-review, architecture, simulation-runner, correctness]
|
||||
---
|
||||
|
||||
# `pkgPointer.runId` never compared to task `runId` — can silently simulate wrong package
|
||||
|
||||
## Problem Statement
|
||||
|
||||
In `processNextSimulationTask`, after claiming a task for `runId=A`, the code reads `SIMULATION_PACKAGE_LATEST_KEY` which returns the *latest* package pointer — not necessarily the one for run A. If a new simulation package for run B is written to Redis while task A is still queued, the worker picks up task A but processes run B's package data. The outcome is written under run A's `runId` but contains run B content. No warning is logged, no error is returned. This is especially relevant in Phase 3 when per-run lookup becomes active.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (HIGH):** `pkgPointer.runId` is read but never compared to the task's `runId`:
|
||||
```javascript
|
||||
// scripts/seed-forecasts.mjs ~line 15697
|
||||
const pkgPointer = await redisGet(url, token, SIMULATION_PACKAGE_LATEST_KEY);
|
||||
if (!pkgPointer?.pkgKey) { ... return { status: 'failed', reason: 'no_package_pointer' }; }
|
||||
// Missing: if (pkgPointer.runId && pkgPointer.runId !== runId) { ... abort ... }
|
||||
|
||||
const pkgData = await getR2JsonObject(storageConfig, pkgPointer.pkgKey);
|
||||
// pkgData.runId !== runId — proceeds to simulate and write outcome under wrong runId
|
||||
```
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Add explicit runId mismatch guard (Recommended)
|
||||
|
||||
```javascript
|
||||
const pkgPointer = await redisGet(url, token, SIMULATION_PACKAGE_LATEST_KEY);
|
||||
if (!pkgPointer?.pkgKey) { ... return failed; }
|
||||
|
||||
// Guard: skip if package is for a different run
|
||||
if (pkgPointer.runId && pkgPointer.runId !== runId) {
|
||||
console.warn(` [Simulation] Package mismatch: task=${runId} pkg=${pkgPointer.runId} — skipping`);
|
||||
await completeSimulationTask(runId);
|
||||
return { status: 'skipped', reason: 'package_run_mismatch', runId };
|
||||
}
|
||||
```
|
||||
|
||||
This is non-breaking: if `pkgPointer.runId` is absent (old format), the guard is skipped and behavior is unchanged.
|
||||
|
||||
Effort: Small | Risk: Low
|
||||
|
||||
### Option B: Accept current behavior, document explicitly
|
||||
|
||||
Add a comment explaining that "latest wins" is intentional and document the Phase 3 migration path. Safe for Phase 2 where only one run stream exists.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Guard added: if `pkgPointer.runId !== runId`, task is completed and `{ status: 'skipped', reason: 'package_run_mismatch' }` returned
|
||||
- [ ] Log line emitted on mismatch for operational visibility
|
||||
- [ ] Test: enqueue task for runId A, set package pointer to runId B — processNextSimulationTask returns `skipped/package_run_mismatch`
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `processNextSimulationTask` (~line 15697)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:architecture-strategist in PR #2220 review
|
||||
@@ -0,0 +1,67 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p2
|
||||
issue_id: "023"
|
||||
tags: [code-review, security, simulation-runner, data-integrity]
|
||||
---
|
||||
|
||||
# LLM output arrays written to R2 without per-element sanitization or length limits
|
||||
|
||||
## Problem Statement
|
||||
|
||||
In `processNextSimulationTask`, several LLM output arrays are written to R2 using only `map(String).slice(0, N)` — which ensures items are strings but applies no length cap per element and no sanitization. A single oversized or injection-containing LLM output item (e.g., a `stabilizers` entry of 50,000 characters) is written directly to R2 and later served to clients without truncation. Additionally, the `timingMarkers` sourced from `result.round2?.paths?.[0]` use a different (non-sanitized) path compared to the per-path `timingMarkers` processing that correctly applies `sanitizeForPrompt`.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (MEDIUM):**
|
||||
```javascript
|
||||
// scripts/seed-forecasts.mjs ~line 15766
|
||||
dominantReactions: (result.round1?.dominantReactions || []).map(String).slice(0, 6),
|
||||
stabilizers: (result.round2?.stabilizers || []).map(String).slice(0, 6),
|
||||
invalidators: (result.round2?.invalidators || []).map(String).slice(0, 6),
|
||||
keyActors: Array.isArray(p.keyActors) ? p.keyActors.map(String).slice(0, 6) : [],
|
||||
// No per-element length limit or sanitization — each string can be arbitrarily long
|
||||
```
|
||||
|
||||
**F-2 (MEDIUM):**
|
||||
```javascript
|
||||
// ~line 15769 — timingMarkers from round2 paths[0] (different code path than per-path markers)
|
||||
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4),
|
||||
// Individual marker objects NOT sanitized — but per-path timingMarkers at ~15757 DO sanitize
|
||||
```
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Apply `sanitizeForPrompt` + length cap to all LLM array elements (Recommended)
|
||||
|
||||
```javascript
|
||||
dominantReactions: (result.round1?.dominantReactions || [])
|
||||
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
|
||||
stabilizers: (result.round2?.stabilizers || [])
|
||||
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
|
||||
invalidators: (result.round2?.invalidators || [])
|
||||
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
|
||||
keyActors: Array.isArray(p.keyActors)
|
||||
? p.keyActors.map((s) => sanitizeForPrompt(String(s)).slice(0, 80)).slice(0, 6)
|
||||
: [],
|
||||
// For timingMarkers at ~15769 — apply same sanitization as per-path version:
|
||||
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4)
|
||||
.map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) })),
|
||||
```
|
||||
|
||||
Effort: Small | Risk: Low
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `dominantReactions`, `stabilizers`, `invalidators` elements capped at 120 chars each with `sanitizeForPrompt`
|
||||
- [ ] `keyActors` elements capped at 80 chars each with `sanitizeForPrompt`
|
||||
- [ ] `timingMarkers` at the theater-result level uses the same sanitization as per-path version
|
||||
- [ ] Test: LLM output with a 10,000-char `stabilizers[0]` is truncated to ≤120 chars in R2 artifact
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `processNextSimulationTask` (~lines 15752, 15766-15769)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review
|
||||
@@ -0,0 +1,69 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p2
|
||||
issue_id: "024"
|
||||
tags: [code-review, architecture, simulation-runner, schema-drift]
|
||||
---
|
||||
|
||||
# `isMaritimeChokeEnergyCandidate` hand-rolled adapter creates schema drift risk
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`processNextSimulationTask` calls `isMaritimeChokeEnergyCandidate` with a manually-constructed adapter object mapping fields from `selectedTheaters` items individually. The function expects `{ routeFacilityKey, marketBucketIds, marketContext: { topBucketId }, commodityKey }` but `selectedTheaters` stores `topBucketId` flat (not under `marketContext`). If `selectedTheaters` items ever gain a `marketContext` field directly (as the upstream data model already uses), the manual mapping shadows the real `marketContext.topBucketId` with an empty string. If the function's logic ever expands to use additional fields, the call site silently fails to pass them.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (MEDIUM):**
|
||||
```javascript
|
||||
// scripts/seed-forecasts.mjs ~line 15719
|
||||
const eligibleTheaters = (pkgData.selectedTheaters || []).filter((t) =>
|
||||
isMaritimeChokeEnergyCandidate({
|
||||
routeFacilityKey: t.routeFacilityKey || '',
|
||||
marketBucketIds: t.marketBucketIds || [],
|
||||
marketContext: { topBucketId: t.topBucketId || '' }, // t.topBucketId is flat; marketContext is reconstructed
|
||||
commodityKey: t.commodityKey || '',
|
||||
}),
|
||||
);
|
||||
// If t gains a real marketContext field, the reconstructed one shadows it
|
||||
// isMaritimeChokeEnergyCandidate called at line 12190 uses the full candidate object directly
|
||||
```
|
||||
|
||||
Two call sites for the same function with different input shapes is a maintenance hazard.
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Pass theater object directly, normalize inside the function (Recommended)
|
||||
|
||||
Update `isMaritimeChokeEnergyCandidate` to accept both flat and nested shapes:
|
||||
```javascript
|
||||
function isMaritimeChokeEnergyCandidate(candidate) {
|
||||
const topBucket = candidate.marketContext?.topBucketId || candidate.topBucketId || '';
|
||||
// ...rest of logic unchanged, just reads topBucket instead of candidate.marketContext.topBucketId
|
||||
}
|
||||
|
||||
// In processNextSimulationTask — just pass t directly:
|
||||
const eligibleTheaters = (pkgData.selectedTheaters || []).filter((t) =>
|
||||
isMaritimeChokeEnergyCandidate(t)
|
||||
);
|
||||
```
|
||||
|
||||
Effort: Small | Risk: Low — backwards compatible, no behavior change for existing call site at line 12190
|
||||
|
||||
### Option B: Verify that `selectedTheaters` schema already includes all needed fields
|
||||
|
||||
Check `buildSimulationPackageFromDeepSnapshot` to confirm it writes `routeFacilityKey`, `marketBucketIds`, `topBucketId`, `commodityKey` to theater items. If confirmed, document the flat-vs-nested convention with a comment at the call site.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `isMaritimeChokeEnergyCandidate` accepts both flat (`topBucketId`) and nested (`marketContext.topBucketId`) input
|
||||
- [ ] `processNextSimulationTask` passes `t` directly without hand-rolling the adapter
|
||||
- [ ] Both call sites (line 12190 and new line 15719) produce identical classification results
|
||||
- [ ] Existing tests for `isMaritimeChokeEnergyCandidate` still pass
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `isMaritimeChokeEnergyCandidate` (~line 11871), `processNextSimulationTask` (~line 15719), `buildSimulationPackageFromDeepSnapshot` (~line 12190)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:architecture-strategist in PR #2220 review
|
||||
@@ -0,0 +1,60 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p2
|
||||
issue_id: "025"
|
||||
tags: [code-review, performance, simulation-runner, llm]
|
||||
---
|
||||
|
||||
# Round 1 token budget (1800) may be too tight for fully-populated theaters
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`SIMULATION_ROUND1_MAX_TOKENS = 1800` is the output token cap for Round 1 LLM calls. With a fully-populated theater (10 entities, 8 seeds, constraints, eval targets, simulation requirement, plus the ~350-token JSON response template), the system prompt alone consumes ~1,030 tokens. This leaves ~770 tokens for the response. A minimal valid Round 1 response (3 paths with labels, summaries, and 3 `initialReactions` each) costs ~700-900 tokens. At the high end of entity/seed density, the model will truncate its JSON mid-object, causing `round1_parse_failed` and marking the theater as failed — silently, with no token-exhaustion signal in the diagnostic.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1 (HIGH):** Token budget vs. prompt size analysis:
|
||||
|
||||
- Static template text: ~350 tokens
|
||||
- 10 entities at ~20 tokens each: ~200 tokens
|
||||
- 8 event seeds at ~25 tokens each: ~200 tokens
|
||||
- simulationRequirement + constraints + evalTargets: ~255 tokens
|
||||
- **Total input: ~1,005 tokens**
|
||||
- **Output budget remaining: 795 tokens**
|
||||
- Minimal valid Round 1 response (3 paths, 3 reactions each): **~700-900 tokens**
|
||||
- Margin: **-105 to +95 tokens** — essentially zero at max density
|
||||
|
||||
`SIMULATION_ROUND2_MAX_TOKENS = 2500` is adequate for Round 2 (shorter input, richer output).
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Raise `SIMULATION_ROUND1_MAX_TOKENS` to 2200 + cap `initialReactions` in prompt (Recommended)
|
||||
|
||||
```javascript
|
||||
const SIMULATION_ROUND1_MAX_TOKENS = 2200; // was 1800
|
||||
|
||||
// In buildSimulationRound1SystemPrompt INSTRUCTIONS section, add:
|
||||
// - Maximum 3 initialReactions per path
|
||||
```
|
||||
|
||||
This provides a 1,195-token output margin (2200 - 1005) which comfortably fits 3 paths × 3 reactions. The `initialReactions` cap aligns with existing behavior (only 3 are used in Round 2 path summaries).
|
||||
|
||||
Effort: Trivial | Risk: Very Low — increases LLM output budget, no structural change
|
||||
|
||||
### Option B: Dynamic token calculation based on entity/seed count
|
||||
|
||||
Calculate prompt token estimate and adjust `maxTokens` accordingly. More precise but adds complexity with no meaningful benefit given the fixed slice limits.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `SIMULATION_ROUND1_MAX_TOKENS` raised from 1800 to 2200
|
||||
- [ ] INSTRUCTIONS block in `buildSimulationRound1SystemPrompt` includes "- Maximum 3 initialReactions per path"
|
||||
- [ ] Existing tests pass (prompt builder tests check content, not token count)
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `SIMULATION_ROUND1_MAX_TOKENS` (~line 38), `buildSimulationRound1SystemPrompt` INSTRUCTIONS section (~line 15445)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:performance-oracle in PR #2220 review
|
||||
@@ -0,0 +1,64 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p2
|
||||
issue_id: "026"
|
||||
tags: [code-review, typescript, simulation-runner, maintainability]
|
||||
---
|
||||
|
||||
# Redis key strings duplicated between TS handler and MJS seed script
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest'` is defined independently in both `server/worldmonitor/forecast/v1/get-simulation-outcome.ts` and `scripts/seed-forecasts.mjs`. The same duplication exists for `SIMULATION_PACKAGE_LATEST_KEY`. `server/_shared/cache-keys.ts` (referenced in the worldmonitor-bootstrap-registration pattern) exists for exactly this purpose: shared Redis key constants that TypeScript handlers and seed scripts need to agree on. A future rename in one file without the other produces a silent miss where the handler reads an empty key forever.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1:**
|
||||
```typescript
|
||||
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts line 10
|
||||
const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
|
||||
|
||||
// scripts/seed-forecasts.mjs line 35
|
||||
const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
|
||||
// Two independent definitions with no enforcement of consistency
|
||||
```
|
||||
|
||||
**F-2:** Same pattern for `SIMULATION_PACKAGE_LATEST_KEY` between `get-simulation-package.ts` and `seed-forecasts.mjs`.
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Move keys to `server/_shared/cache-keys.ts`, import in handler (Recommended)
|
||||
|
||||
```typescript
|
||||
// server/_shared/cache-keys.ts — add:
|
||||
export const SIMULATION_OUTCOME_LATEST_KEY = 'forecast:simulation-outcome:latest';
|
||||
export const SIMULATION_PACKAGE_LATEST_KEY = 'forecast:simulation-package:latest';
|
||||
|
||||
// server/worldmonitor/forecast/v1/get-simulation-outcome.ts — replace local const:
|
||||
import { SIMULATION_OUTCOME_LATEST_KEY } from '../../../_shared/cache-keys';
|
||||
```
|
||||
|
||||
The seed script (`scripts/seed-forecasts.mjs`) keeps its own definition since it's a standalone MJS module that cannot import from TypeScript source. But the TypeScript handler becomes the downstream consumer of a canonical definition, making renames TypeScript-checked.
|
||||
|
||||
Effort: Small | Risk: Low
|
||||
|
||||
### Option B: Add a comment cross-referencing both locations
|
||||
|
||||
Not a fix, but documents the relationship so a human renaming one knows to update the other. Use as a stopgap if Option A causes import complexity.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `SIMULATION_OUTCOME_LATEST_KEY` exported from `server/_shared/cache-keys.ts`
|
||||
- [ ] `get-simulation-outcome.ts` imports from `cache-keys.ts` instead of local const
|
||||
- [ ] `SIMULATION_PACKAGE_LATEST_KEY` moved simultaneously
|
||||
- [ ] `get-simulation-package.ts` updated to import from `cache-keys.ts`
|
||||
- [ ] TypeScript compilation clean after change
|
||||
|
||||
## Technical Details
|
||||
|
||||
- Files: `server/worldmonitor/forecast/v1/get-simulation-outcome.ts:10`, `server/worldmonitor/forecast/v1/get-simulation-package.ts:~10`, `server/_shared/cache-keys.ts`
|
||||
- Scripts keep their own definitions (they're standalone MJS — can't import from TS source)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:kieran-typescript-reviewer in PR #2220 review
|
||||
@@ -0,0 +1,64 @@
|
||||
---
|
||||
status: complete
|
||||
priority: p2
|
||||
issue_id: "027"
|
||||
tags: [code-review, agent-native, simulation-runner, api]
|
||||
---
|
||||
|
||||
# `runId` filter in `getSimulationOutcome` is a no-op with no OpenAPI documentation
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`GetSimulationOutcomeRequest.runId` is accepted as a query parameter but explicitly ignored — the handler always returns the latest outcome. The proto file has a comment explaining this ("Currently ignored; always returns the latest outcome. Reserved for Phase 3"), but this comment does not surface in the generated OpenAPI spec's `description` field. Agents and API consumers relying on the OpenAPI spec see a `runId` parameter with no description and no indication that it is non-functional. An agent that triggers a simulation run, notes the `runId`, and passes it to `getSimulationOutcome` will silently receive a different run's outcome with no way to detect the mismatch (except the `note` field, which is easy to overlook).
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1:** Proto comment exists but does not reach OpenAPI:
|
||||
```proto
|
||||
// proto/worldmonitor/forecast/v1/get_simulation_outcome.proto line 9
|
||||
message GetSimulationOutcomeRequest {
|
||||
// Currently ignored; always returns the latest outcome. Reserved for Phase 3 per-run lookup.
|
||||
string run_id = 1 [(sebuf.http.query) = { name: "runId" }];
|
||||
}
|
||||
```
|
||||
Generated `docs/api/ForecastService.openapi.yaml` has the `runId` parameter with no `description` field.
|
||||
|
||||
**F-2:** Agent trigger-and-verify workflow is unreliable without per-run lookup:
|
||||
|
||||
1. Agent calls `POST /api/forecast/v1/trigger-simulation` (when it exists) → gets `runId=A`
|
||||
2. Agent polls `GET /api/forecast/v1/get-simulation-outcome?runId=A`
|
||||
3. Run B completes first, writes `found: true, runId: B` to Redis
|
||||
4. Handler returns run B's outcome with `note: "runId filter not yet active; returned outcome may differ"`
|
||||
5. Agent receives `note` but may not check it; proceeds to act on wrong run's data
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option A: Add description annotation to proto field so it propagates to OpenAPI (Recommended)
|
||||
|
||||
Check if sebuf's proto generator picks up leading comments or if it requires a `description` annotation extension. If the generator supports field descriptions, add:
|
||||
```proto
|
||||
// IMPORTANT: Currently a no-op. Always returns the latest available outcome regardless of runId.
|
||||
// Per-run lookup is reserved for Phase 3. Check the response 'note' field when runId is supplied.
|
||||
string run_id = 1 [(sebuf.http.query) = { name: "runId" }];
|
||||
```
|
||||
|
||||
If the generator does not propagate comments, manually update the generated OpenAPI yaml as a post-generation step.
|
||||
|
||||
### Option B: Document in the handler's response `note` more prominently
|
||||
|
||||
Current `note` text: "runId filter not yet active; returned outcome may differ from requested run". This is already a reasonable signal. Ensure the proto `note` field also has a description in OpenAPI explaining its purpose.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] OpenAPI `description` for the `runId` parameter in `GetSimulationOutcome` explains it is currently a no-op
|
||||
- [ ] OpenAPI `description` for the `note` response field explains it is populated when `runId` mismatch occurs
|
||||
- [ ] Combined with todo #021 (trigger endpoint), a full trigger-and-verify loop is documented
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `proto/worldmonitor/forecast/v1/get_simulation_outcome.proto`
|
||||
- File: `docs/api/ForecastService.openapi.yaml` (auto-generated — check if manual edits survive `make generate`)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:agent-native-reviewer in PR #2220 review
|
||||
@@ -0,0 +1,60 @@
|
||||
---
|
||||
status: pending
|
||||
priority: p3
|
||||
issue_id: "028"
|
||||
tags: [code-review, architecture, simulation-runner, schema]
|
||||
---
|
||||
|
||||
# No structured `completionStatus` field in simulation outcome — callers must parse strings
|
||||
|
||||
## Problem Statement
|
||||
|
||||
The simulation outcome has no machine-readable `completionStatus` field. Callers must re-derive completion state from `theaterResults.length`, `failedTheaters.length`, and the string-encoded `globalObservations` field. This works for Phase 2 but will block Phase 3 callers (UI panels, downstream agents) that need to branch on `partial` vs `all_failed` vs `no_eligible_theaters`.
|
||||
|
||||
## Findings
|
||||
|
||||
**F-1:**
|
||||
```javascript
|
||||
const outcome = {
|
||||
globalObservations: eligibleTheaters.length === 0
|
||||
? 'No maritime chokepoint/energy theaters in package'
|
||||
: theaterResults.length === 0 ? 'All theaters failed simulation' : '',
|
||||
confidenceNotes: `${theaterResults.length}/${eligibleTheaters.length} theaters completed`,
|
||||
// No structured completionStatus or eligibleTheaterCount
|
||||
};
|
||||
```
|
||||
|
||||
Callers deriving status: `theaterResults.length === 0 && failedTheaters.length === 0` could mean "no eligible theaters" or "eligibleTheaters array was somehow empty". No way to distinguish without `eligibleTheaterCount`.
|
||||
|
||||
## Proposed Solution
|
||||
|
||||
```javascript
|
||||
const completionStatus =
|
||||
eligibleTheaters.length === 0 ? 'no_eligible_theaters'
|
||||
: theaterResults.length === 0 ? 'all_failed'
|
||||
: failedTheaters.length > 0 ? 'partial'
|
||||
: 'complete';
|
||||
|
||||
const outcome = {
|
||||
...existingFields,
|
||||
completionStatus,
|
||||
eligibleTheaterCount: eligibleTheaters.length,
|
||||
};
|
||||
```
|
||||
|
||||
Also add `theaterCount` to `GetSimulationOutcomeResponse` proto (currently only `theaterCount` for successful results) — or add `eligibleTheaterCount` field in Phase 3.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `completionStatus: 'no_eligible_theaters' | 'all_failed' | 'partial' | 'complete'` added to outcome schema
|
||||
- [ ] `eligibleTheaterCount` added to outcome schema
|
||||
- [ ] `getSimulationOutcome` RPC response includes `completionStatus` (or proto updated in Phase 3)
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `processNextSimulationTask` (~line 15774) outcome construction
|
||||
- Phase 3 concern: add `completionStatus` to `GetSimulationOutcomeResponse` proto
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:architecture-strategist in PR #2220 review
|
||||
@@ -0,0 +1,38 @@
|
||||
---
|
||||
status: pending
|
||||
priority: p3
|
||||
issue_id: "029"
|
||||
tags: [code-review, performance, simulation-runner, llm]
|
||||
---
|
||||
|
||||
# `getForecastLlmCallOptions` has no cases for `simulation_round_1` / `simulation_round_2`
|
||||
|
||||
## Problem Statement
|
||||
|
||||
`getForecastLlmCallOptions(stage)` maps stage names to provider order and model configuration. Every other pipeline stage (`combined`, `critical_signals`, `impact_expansion`, `market_implications`, etc.) has its own env override so operators can route that stage to a different model. The simulation runner uses `'simulation_round_1'` and `'simulation_round_2'` as stage names, but both fall through to the `else` branch (default provider order). This means simulation stages cannot be independently routed to a more capable reasoning model in Phase 3 without a code change.
|
||||
|
||||
## Proposed Solution
|
||||
|
||||
```javascript
|
||||
// In getForecastLlmCallOptions, add cases:
|
||||
: stage === 'simulation_round_1' || stage === 'simulation_round_2'
|
||||
? (process.env.FORECAST_LLM_SIMULATION_PROVIDER_ORDER
|
||||
? parseForecastProviderOrder(process.env.FORECAST_LLM_SIMULATION_PROVIDER_ORDER)
|
||||
: globalProviderOrder || defaultProviderOrder)
|
||||
```
|
||||
|
||||
This follows the exact pattern of every other named stage. No behavior change until `FORECAST_LLM_SIMULATION_PROVIDER_ORDER` is set.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `simulation_round_1` and `simulation_round_2` have explicit cases in `getForecastLlmCallOptions`
|
||||
- [ ] `FORECAST_LLM_SIMULATION_PROVIDER_ORDER` env var controls simulation provider order when set
|
||||
- [ ] Existing tests pass; no behavior change when env var is unset
|
||||
|
||||
## Technical Details
|
||||
|
||||
- File: `scripts/seed-forecasts.mjs` — `getForecastLlmCallOptions` (~line 3920)
|
||||
|
||||
## Work Log
|
||||
|
||||
- 2026-03-24: Found by compound-engineering:review:performance-oracle in PR #2220 review
|
||||
Reference in New Issue
Block a user