mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(simulation): MiroFish Phase 2 — theater-limited simulation runner Adds the simulation execution layer that consumes simulation-package.json and produces simulation-outcome.json for maritime chokepoint + energy/logistics theaters, closing the WorldMonitor → MiroFish handoff loop. Changes: - scripts/seed-forecasts.mjs: 2-round LLM simulation runner (prompt builders, JSON extractor, runTheaterSimulation, writeSimulationOutcome, task queue with NX dedup lock, runSimulationWorker poll loop) - scripts/process-simulation-tasks.mjs: standalone worker entry point - proto: GetSimulationOutcome RPC + make generate - server/worldmonitor/forecast/v1/get-simulation-outcome.ts: RPC handler - server/gateway.ts: slow tier for get-simulation-outcome - api/health.js: simulationOutcomeLatest in STANDALONE + ON_DEMAND keys - tests: 14 new tests for simulation runner functions * fix(simulation): address P1/P2 code review findings from PR #2220 Security (P1 #018): - sanitizeForPrompt() applied to all entity/seed fields interpolated into Round 1 prompt (entityId, class, stance, seedId, type, timing) - sanitizeForPrompt() applied to actorId and entityIds in Round 2 prompt - sanitizeForPrompt() + length caps applied to all LLM array fields written to R2 (dominantReactions, stabilizers, invalidators, keyActors, timingMarkers) Validation (P1 #019): - Added validateRunId() regex guard - Applied in enqueueSimulationTask() and processNextSimulationTask() loop Type safety (P1 #020): - Added isOutcomePointer() and isPackagePointer() type guards in TS handlers - Replaced unsafe as-casts with runtime-validated guards in both handlers Correctness (P2 #022): - Log warning when pkgPointer.runId does not match task runId Architecture (P2 #024): - isMaritimeChokeEnergyCandidate() accepts both flat and nested topBucketId - Call site simplified to pass theater directly Performance (P2 #025): - SIMULATION_ROUND1_MAX_TOKENS raised 1800 to 2200 - Added max 3 initialReactions instruction to Round 1 prompt Maintainability (P2 #026): - Simulation pointer keys exported from server/_shared/cache-keys.ts - Both TS handlers import from shared location Documentation (P2 #027): - Strengthened runId no-op description in proto and OpenAPI spec * fix(todos): add blank lines around lists in markdown todo files * style(api): reformat openapi yaml to match linter output * test(simulation): add flat-shape filter test + getSimulationOutcome handler coverage Two tests identified as missing during PR #2220 review: 1. isMaritimeChokeEnergyCandidate flat-shape tests — covers the || candidate.topBucketId normalization added in the P1/P2 review pass. The existing tests only used the nested marketContext.topBucketId shape; this adds the flat root-field shape that arrives from the simulation-package.json JSON (selectedTheaters entries have topBucketId at root). 2. getSimulationOutcome handler structural tests — verifies the isOutcomePointer guard, found:false NOT_FOUND return, found:true success path, note population on runId mismatch, and redis_unavailable error string. Follows the readSrc static-analysis pattern used elsewhere in server-handlers.test.mjs (handler imports Redis so full integration test would require a test Redis instance).
3.1 KiB
3.1 KiB
status, priority, issue_id, tags
| status | priority | issue_id | tags | ||||
|---|---|---|---|---|---|---|---|
| complete | p2 | 023 |
|
LLM output arrays written to R2 without per-element sanitization or length limits
Problem Statement
In processNextSimulationTask, several LLM output arrays are written to R2 using only map(String).slice(0, N) — which ensures items are strings but applies no length cap per element and no sanitization. A single oversized or injection-containing LLM output item (e.g., a stabilizers entry of 50,000 characters) is written directly to R2 and later served to clients without truncation. Additionally, the timingMarkers sourced from result.round2?.paths?.[0] use a different (non-sanitized) path compared to the per-path timingMarkers processing that correctly applies sanitizeForPrompt.
Findings
F-1 (MEDIUM):
// scripts/seed-forecasts.mjs ~line 15766
dominantReactions: (result.round1?.dominantReactions || []).map(String).slice(0, 6),
stabilizers: (result.round2?.stabilizers || []).map(String).slice(0, 6),
invalidators: (result.round2?.invalidators || []).map(String).slice(0, 6),
keyActors: Array.isArray(p.keyActors) ? p.keyActors.map(String).slice(0, 6) : [],
// No per-element length limit or sanitization — each string can be arbitrarily long
F-2 (MEDIUM):
// ~line 15769 — timingMarkers from round2 paths[0] (different code path than per-path markers)
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4),
// Individual marker objects NOT sanitized — but per-path timingMarkers at ~15757 DO sanitize
Proposed Solutions
Option A: Apply sanitizeForPrompt + length cap to all LLM array elements (Recommended)
dominantReactions: (result.round1?.dominantReactions || [])
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
stabilizers: (result.round2?.stabilizers || [])
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
invalidators: (result.round2?.invalidators || [])
.map((s) => sanitizeForPrompt(String(s)).slice(0, 120)).slice(0, 6),
keyActors: Array.isArray(p.keyActors)
? p.keyActors.map((s) => sanitizeForPrompt(String(s)).slice(0, 80)).slice(0, 6)
: [],
// For timingMarkers at ~15769 — apply same sanitization as per-path version:
timingMarkers: (result.round2?.paths?.[0]?.timingMarkers || []).slice(0, 4)
.map((m) => ({ event: sanitizeForPrompt(m.event || '').slice(0, 80), timing: String(m.timing || 'T+0h').slice(0, 10) })),
Effort: Small | Risk: Low
Acceptance Criteria
dominantReactions,stabilizers,invalidatorselements capped at 120 chars each withsanitizeForPromptkeyActorselements capped at 80 chars each withsanitizeForPrompttimingMarkersat the theater-result level uses the same sanitization as per-path version- Test: LLM output with a 10,000-char
stabilizers[0]is truncated to ≤120 chars in R2 artifact
Technical Details
- File:
scripts/seed-forecasts.mjs—processNextSimulationTask(~lines 15752, 15766-15769)
Work Log
- 2026-03-24: Found by compound-engineering:review:security-sentinel in PR #2220 review