fix(scoring): scope "critical" to geopolitical events, not domestic tragedies (#3221)

The weight rebalance (PR #3144) amplified a prompt gap: domestic mass shootings (e.g. "8 children killed in Louisiana") scored 88 because the LLM classified them as "critical" (mass-casualty 10+ killed) and the 55% severity weight pushed them into the critical gate. But WorldMonitor is a geopolitical monitor: domestic tragedies are terrible but not geopolitically destabilizing.

Prompt change (both ais-relay.cjs + classify-event.ts):
- "critical" now explicitly requires GEOPOLITICAL scope: "events that destabilize international order, threaten cross-border security, or disrupt global systems"
- Domestic mass-casualty events (mass shootings, industrial accidents) moved to "high": still important, but not critical-sensitivity alerts
- Added counterexamples: "8 children killed in mass shooting in Louisiana → domestic mass-casualty → high" and "23 killed in fireworks factory explosion → industrial accident → high"
- Retained: "700 killed in Sudan drone strikes → geopolitical mass-casualty in active civil war → critical"

Classify cache: v2→v3 (bust stale entries that lack geopolitical scope).
Shadow-log: v4→v5 (clean dataset for recalibration under the scoped prompt).

🤖 Generated with Claude Opus 4.6 via Claude Code

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-25 17:14:57 +02:00 · 2026-04-20 08:40:29 +04:00
parent e2255840f6
commit 14c1314629
8 changed files with 31 additions and 27 deletions
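The cache bust in the commit message relies on the version being part of the key prefix. The sketch below is a minimal illustration of that, not code from the repo: `classifyCacheKeyFor` is a hypothetical variant of `classifyCacheKey` that takes the version as a parameter, the `Map` stands in for the real cache, and the cached value shape is made up.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: same hashing as classifyCacheKey in this commit, but
// with the version passed in so old and new keys can be compared.
function classifyCacheKeyFor(version, title) {
  const hash = crypto
    .createHash('sha256')
    .update(title.toLowerCase())
    .digest('hex')
    .slice(0, 16);
  return `classify:sebuf:${version}:${hash}`;
}

// Stand-in for the real cache: one entry written under the old v2 prefix.
// The value shape here is illustrative only.
const title = '8 children killed in mass shooting in Louisiana';
const cache = new Map();
cache.set(classifyCacheKeyFor('v2', title), { level: 'critical' });

// After the bump, readers build v3 keys, so the v2 entry is never looked up.
const staleHit = cache.get(classifyCacheKeyFor('v2', title));
const freshHit = cache.get(classifyCacheKeyFor('v3', title));

console.log(staleHit); // the old entry is still stored
console.log(freshHit); // undefined: a v3 lookup misses, forcing reclassification
```

Nothing is deleted by the bump itself. Old entries simply stop being read, which is why every reader and writer of the prefix has to change in the same commit.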
--- a/scripts/ais-relay.cjs
+++ b/scripts/ais-relay.cjs
@@ -3491,19 +3491,22 @@ Return ONLY a JSON array, no other text.
 Levels: critical, high, medium, low, info
 Categories: conflict, protest, disaster, diplomatic, economic, terrorism, cyber, health, environmental, military, crime, infrastructure, tech, general

-Guidelines for LEVEL assignment:
- critical: Active military strikes, mass-casualty events (10+ killed), ceasefire agreements/collapses, nuclear incidents, pandemic declarations, coups, strait/waterway closures
- high: Armed conflict updates, major diplomatic actions, sanctions packages, significant natural disasters, blockades, terrorist attacks
+Guidelines for LEVEL assignment (geopolitical scope required for critical):
+- critical: Active military strikes with international implications, geopolitical mass-casualty events (10+ killed in conflict/terrorism/state action), ceasefire agreements/collapses, nuclear incidents, pandemic declarations, coups, strait/waterway closures
+- high: Armed conflict updates, major diplomatic actions, sanctions packages, significant natural disasters, blockades, terrorist attacks, domestic mass-casualty events (mass shootings, industrial disasters)
 - medium: Ongoing conflict analysis, economic impact reports, protest movements, regional policy changes, military exercises
 - low: Diplomatic meetings, trade discussions, humanitarian aid, election updates, peacekeeping deployments
 - info: Opinion/editorial pieces, analysis/explainer articles, historical retrospectives, lifestyle, entertainment, routine local news, tutorials

-Key distinction: classify by THE EVENT, not the headline's emotional tone.
- "Guardian view on ceasefire: need real peace" → editorial, not a ceasefire → info
+Key distinction: "critical" requires GEOPOLITICAL scope — events that destabilize international order, threaten cross-border security, or disrupt global systems. Domestic tragedies are "high" unless they trigger international diplomatic responses.
+- "8 children killed in mass shooting in Louisiana" → domestic mass-casualty, not geopolitical → high
+- "23 killed in fireworks factory explosion in India" → industrial accident → high
+- "700 killed in Sudan drone strikes" → geopolitical mass-casualty in active civil war → critical
+- "Iran closes Strait of Hormuz" → global trade disruption → critical
+- "Guardian view on ceasefire: need real peace" → editorial → info
 - "Trump's obsession with energy" → opinion/analysis → info
- "Man killed his estranged wife" → domestic crime, not geopolitical → info
+- "Man killed his estranged wife" → domestic crime → info
 - "How to Crack the SAM Database in Kali Linux" → tutorial → info
- "700 killed in Sudan drone strikes" → mass-casualty event → critical

 Input: numbered lines "index|Title"
 Output: [{"i":0,"l":"high","c":"conflict"}, ...]
@@ -3590,7 +3593,7 @@ function matchCountryNamesInText(text) {

 function classifyCacheKey(title) {
  const hash = crypto.createHash('sha256').update(title.toLowerCase()).digest('hex').slice(0, 16);
-  return `classify:sebuf:v2:${hash}`;
+  return `classify:sebuf:v3:${hash}`;
 }

 // LLM provider fallback chain — mirrors seed-insights.mjs LLM_PROVIDERS
--- a/scripts/notification-relay.cjs
+++ b/scripts/notification-relay.cjs
@@ -691,7 +691,7 @@ const IMPORTANCE_SCORE_MIN = Number(process.env.IMPORTANCE_SCORE_MIN ?? 40);
 // The old v1 key (compact string format) is retained by consumers for
 // backward-compat reading but is no longer written. See
 // docs/internal/scoringDiagnostic.md §5 and §9 Step 4.
-const SHADOW_SCORE_LOG_KEY = 'shadow:score-log:v4';
+const SHADOW_SCORE_LOG_KEY = 'shadow:score-log:v5';
 const SHADOW_LOG_TTL = 7 * 24 * 3600; // 7 days

 async function shadowLogScore(event) {
--- a/scripts/shadow-score-report.mjs
+++ b/scripts/shadow-score-report.mjs
@@ -12,7 +12,7 @@ import { resolve } from 'node:path';

 // v2 is the post-fix key (JSON members). v1 is the legacy key (compact strings).
 // Override with SHADOW_SCORE_KEY=shadow:score-log:v2 (pre-weight-rebalance) or v1 (pre-PR #3069).
-const KEY = process.env.SHADOW_SCORE_KEY || 'shadow:score-log:v4';
+const KEY = process.env.SHADOW_SCORE_KEY || 'shadow:score-log:v5';
 const OUT = resolve(process.cwd(), 'shadow-score-report');
 const GATE_MIN = 40;     // current IMPORTANCE_SCORE_MIN default
 const HIGH = 65;         // current shouldNotify "high" sensitivity threshold
--- a/server/_shared/cache-keys.ts
+++ b/server/_shared/cache-keys.ts
@@ -22,7 +22,7 @@ export const STORY_TRACKING_TTL_S = 172800;
 * TTL for all: 172800s (48h), refreshed each digest cycle.
 * Shadow scoring key (written by notification-relay.cjs, which owns the live
 * value — the constant here is documentation only, not imported):
- *   shadow:score-log:v4            ZSet   score=epoch_ms, member=JSON{ts,importanceScore,severity,eventType,title,source,publishedAt,corroborationCount,variant}
+ *   shadow:score-log:v5            ZSet   score=epoch_ms, member=JSON{ts,importanceScore,severity,eventType,title,source,publishedAt,corroborationCount,variant}
 *   shadow:score-log:v3            ZSet   legacy (weight rebalance) — self-prunes via 7d ZREMRANGEBYSCORE
 *   shadow:score-log:v2            ZSet   legacy (stale-score fix) — self-prunes
 *   shadow:score-log:v1            ZSet   legacy (pre-PR #3069) — self-prunes
@@ -32,10 +32,10 @@ export const STORY_SOURCES_KEY = (titleHash: string) => `story:sources:v1:${titl
 export const STORY_PEAK_KEY = (titleHash: string) => `story:peak:v1:${titleHash}`;
 export const DIGEST_ACCUMULATOR_KEY = (variant: string, lang = 'en') => `digest:accumulator:v1:${variant}:${lang}`;
 export const DIGEST_LAST_SENT_KEY = (userId: string, variant: string) => `digest:last-sent:v1:${userId}:${variant}`;
-// NOTE: notification-relay.cjs owns the live value (shadow:score-log:v4 since prompt upgrade).
+// NOTE: notification-relay.cjs owns the live value (shadow:score-log:v5 since prompt upgrade).
 // This export is documentation/discoverability; changing it here does NOT affect the relay.
 // If you modify the key, also update scripts/notification-relay.cjs SHADOW_SCORE_LOG_KEY.
-export const SHADOW_SCORE_LOG_KEY = 'shadow:score-log:v4';
+export const SHADOW_SCORE_LOG_KEY = 'shadow:score-log:v5';
 export const STORY_TTL = 604800;           // 7 days — enough for sustained multi-day stories
 export const DIGEST_ACCUMULATOR_TTL = 172800; // 48h — lookback window for digest content

--- a/server/worldmonitor/intelligence/v1/_shared.ts
+++ b/server/worldmonitor/intelligence/v1/_shared.ts
@@ -9,7 +9,7 @@ import { hashString, sha256Hex } from '../../../_shared/hash';
 // ========================================================================

 export const UPSTREAM_TIMEOUT_MS = 25_000;
-const CLASSIFY_CACHE_PREFIX = 'classify:sebuf:v2:';
+const CLASSIFY_CACHE_PREFIX = 'classify:sebuf:v3:';

 // ========================================================================
 // Tier-1 country definitions (used by risk-scores + country-intel-brief)
--- a/server/worldmonitor/intelligence/v1/classify-event.ts
+++ b/server/worldmonitor/intelligence/v1/classify-event.ts
@@ -52,19 +52,20 @@ export async function classifyEvent(
 Levels: critical, high, medium, low, info
 Categories: conflict, protest, disaster, diplomatic, economic, terrorism, cyber, health, environmental, military, crime, infrastructure, tech, general

-Guidelines for LEVEL assignment:
- critical: Active military strikes, mass-casualty events (10+ killed), ceasefire agreements/collapses, nuclear incidents, pandemic declarations, coups, strait/waterway closures
- high: Armed conflict updates, major diplomatic actions, sanctions packages, significant natural disasters, blockades, terrorist attacks
+Guidelines for LEVEL assignment (geopolitical scope required for critical):
+- critical: Active military strikes with international implications, geopolitical mass-casualty events (10+ killed in conflict/terrorism/state action), ceasefire agreements/collapses, nuclear incidents, pandemic declarations, coups, strait/waterway closures
+- high: Armed conflict updates, major diplomatic actions, sanctions packages, significant natural disasters, blockades, terrorist attacks, domestic mass-casualty events (mass shootings, industrial disasters)
 - medium: Ongoing conflict analysis, economic impact reports, protest movements, regional policy changes, military exercises
 - low: Diplomatic meetings, trade discussions, humanitarian aid, election updates, peacekeeping deployments
 - info: Opinion/editorial pieces, analysis/explainer articles, historical retrospectives, lifestyle, entertainment, routine local news, tutorials

-Key distinction: classify by THE EVENT, not the headline's emotional tone.
- "Guardian view on ceasefire: need real peace" → editorial, not a ceasefire → info
- "Trump's obsession with energy" → opinion/analysis → info
- "Man killed his estranged wife" → domestic crime, not geopolitical → info
+Key distinction: "critical" requires GEOPOLITICAL scope — events that destabilize international order, threaten cross-border security, or disrupt global systems. Domestic tragedies are "high" unless they trigger international diplomatic responses.
+- "8 children killed in mass shooting in Louisiana" → domestic mass-casualty → high
+- "23 killed in fireworks factory explosion" → industrial accident → high
+- "700 killed in Sudan drone strikes" → geopolitical mass-casualty → critical
+- "Iran closes Strait of Hormuz" → global trade disruption → critical
+- "Man killed his estranged wife" → domestic crime → info
 - "How to Crack the SAM Database" → tutorial → info
- "700 killed in Sudan drone strikes" → mass-casualty → critical

 Focus: geopolitical events, conflicts, disasters, diplomacy.
 Classify by real-world event severity, not headline sentiment.
--- a/server/worldmonitor/news/v1/list-feed-digest.ts
+++ b/server/worldmonitor/news/v1/list-feed-digest.ts
@@ -275,7 +275,7 @@ async function enrichWithAiCache(items: ParsedItem[]): Promise<void> {
  const keyMap = new Map<string, ParsedItem[]>();
  for (const item of candidates) {
    const hash = (await sha256Hex(item.title.toLowerCase())).slice(0, 16);
-    const key = `classify:sebuf:v2:${hash}`;
+    const key = `classify:sebuf:v3:${hash}`;
    const existing = keyMap.get(key) ?? [];
    existing.push(item);
    keyMap.set(key, existing);
--- a/tests/notification-relay-shadow-log.test.mjs
+++ b/tests/notification-relay-shadow-log.test.mjs
@@ -63,12 +63,12 @@ describe('shadow-log key version', () => {
  it('uses the v4 JSON-member key (prompt upgrade clean dataset)', () => {
    assert.match(
      relaySrc,
-      /SHADOW_SCORE_LOG_KEY\s*=\s*['"]shadow:score-log:v4['"]/,
-      'notification-relay must write to shadow:score-log:v4 after the prompt upgrade',
+      /SHADOW_SCORE_LOG_KEY\s*=\s*['"]shadow:score-log:v5['"]/,
+      'notification-relay must write to shadow:score-log:v5 after the prompt upgrade',
    );
    assert.ok(
-      !/SHADOW_SCORE_LOG_KEY\s*=\s*['"]shadow:score-log:v[123]['"]/.test(relaySrc),
-      'legacy v1/v2/v3 keys must not be active',
+      !/SHADOW_SCORE_LOG_KEY\s*=\s*['"]shadow:score-log:v[1234]['"]/.test(relaySrc),
+      'legacy v1/v2/v3/v4 keys must not be active',
    );
  });
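The two assertions in the shadow-log test can be exercised on their own to see what they accept and reject. This is a small sketch under stated assumptions: the regular expressions are copied from the updated test, while `usesCurrentShadowKey` and the two sample source strings are hypothetical and exist only for illustration.

```javascript
// Patterns copied from the updated shadow-log key test.
const ACTIVE_KEY = /SHADOW_SCORE_LOG_KEY\s*=\s*['"]shadow:score-log:v5['"]/;
const LEGACY_KEY = /SHADOW_SCORE_LOG_KEY\s*=\s*['"]shadow:score-log:v[1234]['"]/;

// Hypothetical helper (not in the repo): true when the source declares the
// v5 key and contains no assignment to a legacy v1 to v4 key.
function usesCurrentShadowKey(src) {
  return ACTIVE_KEY.test(src) && !LEGACY_KEY.test(src);
}

// Sample one-line sources mirroring the removed and added lines in the relay.
const before = "const SHADOW_SCORE_LOG_KEY = 'shadow:score-log:v4';";
const after = "const SHADOW_SCORE_LOG_KEY = 'shadow:score-log:v5';";

console.log(usesCurrentShadowKey(before)); // false
console.log(usesCurrentShadowKey(after));  // true
```

Because the legacy pattern uses the character class `[1234]`, each future version bump needs the class extended by one digest of history, as this commit does when it moves from `[123]` to `[1234]`.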