Elie Habib 5d68f0ae6b fix(intelligence): land news:threat:summary:v1 CII work missed from PR #2096 (#2356)
* feat(intelligence): emit news:threat:summary:v1 from relay classify loop for CII

During seedClassifyForVariant(), attribute each title to ISO2 countries
while both title and classification result are in scope. At the end of
seedClassify(), merge per-country threat counts across all variants and
write news:threat:summary:v1 (20min TTL) with { byCountry: { [iso2]: {
critical, high, medium, low, info } }, generatedAt }.
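
A minimal sketch of the merge step described above, assuming each variant yields a plain map of ISO2 code to per-level counts. The helper names (emptyCounts, buildThreatSummary) are hypothetical and the Redis write with its 20min TTL is omitted. Only the payload shape is taken from this message.

```javascript
// Threat levels named in the payload shape above.
const LEVELS = ['critical', 'high', 'medium', 'low', 'info'];

// Hypothetical helper: a zeroed counter object for one country.
function emptyCounts() {
  return { critical: 0, high: 0, medium: 0, low: 0, info: 0 };
}

// Hypothetical helper: merge per-variant { [iso2]: counts } maps into
// { byCountry, generatedAt }. Missing levels are treated as zero.
function buildThreatSummary(variantCounts, now = Date.now()) {
  const byCountry = {};
  for (const counts of variantCounts) {
    for (const [iso2, levels] of Object.entries(counts)) {
      const target = (byCountry[iso2] ??= emptyCounts());
      for (const level of LEVELS) {
        target[level] += levels[level] ?? 0;
      }
    }
  }
  return { byCountry, generatedAt: now };
}

// Two variants, both reporting on YE.
const summary = buildThreatSummary(
  [{ YE: { critical: 1, high: 2 } }, { YE: { high: 1 }, UA: { medium: 3 } }],
  0,
);
console.log(JSON.stringify(summary.byCountry.YE));
// {"critical":1,"high":3,"medium":0,"low":0,"info":0}
```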

get-risk-scores.ts reads the new key via fetchAuxiliarySources() and
applies weighted scores (critical→4, high→2, medium→1, low→0.5, info→0,
capped at 20) per country into the information component of CII eventScore.
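
The weighting above can be sketched as follows. The weights and the cap come from this message. The function signature is an assumption, and how the result is folded into the information component of eventScore is not shown here.

```javascript
// Weights and cap as listed in the commit message.
const WEIGHTS = { critical: 4, high: 2, medium: 1, low: 0.5, info: 0 };
const SCORE_CAP = 20;

// Assumed signature: takes one country's counts (or null/undefined when the
// country is absent from the summary) and returns the capped weighted score.
function threatSummaryScore(counts) {
  if (!counts) return 0; // unknown country or missing summary contributes nothing
  let score = 0;
  for (const [level, weight] of Object.entries(WEIGHTS)) {
    score += (counts[level] ?? 0) * weight;
  }
  return Math.min(score, SCORE_CAP);
}

// 1*4 + 2*2 + 1*1 + 2*0.5 + 5*0 = 10
console.log(threatSummaryScore({ critical: 1, high: 2, medium: 1, low: 2, info: 5 })); // 10
// 10*4 = 40, capped at 20
console.log(threatSummaryScore({ critical: 10 })); // 20
```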

Closes #2053

* fix(intelligence): register news:threat-summary in health.js and expand tests

- Add newsThreatSummary to BOOTSTRAP_KEYS (seed-meta:news:threat-summary,
  maxStaleMin: 60) so relay classify outages surface in health dashboard
- Add 4 tests: boost verification, cap-at-20, unknown-country safety,
  null-threatSummary zero baseline
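
For illustration, a staleness check of the kind a maxStaleMin setting implies might look like the sketch below. The registry entry shape, the fetchedAt field, and the isStale helper are all assumptions. Only the key name and the 60-minute threshold come from this message.

```javascript
// Hypothetical registry entry; the real BOOTSTRAP_KEYS shape in health.js may differ.
const BOOTSTRAP_KEYS = {
  newsThreatSummary: { metaKey: 'seed-meta:news:threat-summary', maxStaleMin: 60 },
};

// Hypothetical helper: a seed-meta record is stale when it is missing or when
// its (assumed) fetchedAt timestamp is older than maxStaleMin minutes.
function isStale(meta, maxStaleMin, now = Date.now()) {
  if (!meta || typeof meta.fetchedAt !== 'number') return true;
  return now - meta.fetchedAt > maxStaleMin * 60_000;
}

const { maxStaleMin } = BOOTSTRAP_KEYS.newsThreatSummary;
const now = 100 * 60_000; // minute 100 on an arbitrary clock
console.log(isStale({ fetchedAt: 50 * 60_000 }, maxStaleMin, now)); // false (50 min old)
console.log(isStale({ fetchedAt: 30 * 60_000 }, maxStaleMin, now)); // true (70 min old)
```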

* fix(classify): de-dup cross-variant titles and attribute to last-mentioned country

P1-A: seedClassify() was summing byCountry across all 5 variants (full/tech/
finance/happy/commodity) without de-duplicating. Shared feeds (CNBC, Yahoo
Finance, FT, HN, Ars) let a single headline count up to 4x before reaching
CII, saturating threatSummaryScore on one story.
Fix: pass seenTitles Set into seedClassifyForVariant; skip attribution for
titles already counted by an earlier variant.
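
A sketch of the de-duplication, assuming a single Set is shared across variants. The attributeVariant helper and the trim/lowercase normalization are hypothetical, and the actual seedClassifyForVariant signature may differ.

```javascript
// Hypothetical helper: call `attribute` once per headline, skipping titles
// that an earlier variant already counted.
function attributeVariant(titles, seenTitles, attribute) {
  for (const title of titles) {
    const key = title.trim().toLowerCase(); // normalization is an assumption
    if (seenTitles.has(key)) continue;
    seenTitles.add(key);
    attribute(title);
  }
}

// One Set shared by all variants, so a headline carried by several feeds
// is attributed only once.
const seenTitles = new Set();
const counted = [];
attributeVariant(['Strikes on Yemen', 'Markets rally'], seenTitles, (t) => counted.push(t));
attributeVariant(['Strikes on Yemen', 'Chip exports curbed'], seenTitles, (t) => counted.push(t));
console.log(counted.length); // 3
```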

P1-B: matchCountryNamesInText() was attributing every country mentioned in a
headline equally. "UK and US launch strikes on Yemen" raised GB, US, and YE
with identical weight, inflating actor-country CII.
Fix: return only the last country in document order — the grammatical object
of the headline, which is the primary affected country in SVO structure.

* fix(classify): replace last-position heuristic with preposition-pattern attribution

The previous "last-mentioned country" fix still failed for:
- "Yemen says UK and US strikes hit Hodeidah" → returned US (wrong)
- "US strikes on Yemen condemned by Iran" → returned IR (wrong)

Both failures stem from position not conveying grammatical role. Switch to a
preposition/verb-pattern approach: only attribute to a country that immediately
follows a locative preposition (in/on/against/at/into/targeting/toward) or an
attack verb (invades/attacks/bombs/hits/strikes). No pattern match → return []
(skip attribution rather than attribute to the wrong country).
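
The pattern approach can be sketched as below. The trigger words are those listed above. The tiny COUNTRY_NAMES table, the regex construction, and the attributeCountries name are illustrative assumptions, not the code in matchCountryNamesInText().

```javascript
// Illustrative name-to-ISO2 table; the real mapping is far larger.
const COUNTRY_NAMES = { yemen: 'YE', iran: 'IR', uk: 'GB', us: 'US', ukraine: 'UA' };

// Locative prepositions and attack verbs from the commit message.
const TRIGGERS = 'in|on|against|at|into|targeting|toward|invades|attacks|bombs|hits|strikes';

// Longest names first so "ukraine" is tried before "uk".
const NAMES = Object.keys(COUNTRY_NAMES).sort((a, b) => b.length - a.length).join('|');
const PATTERN = new RegExp(`\\b(?:${TRIGGERS})\\s+(${NAMES})\\b`, 'gi');

// Return ISO2 codes only for countries that directly follow a trigger word.
// No match yields [], so the headline is skipped instead of misattributed.
function attributeCountries(title) {
  const found = new Set();
  for (const match of title.matchAll(PATTERN)) {
    found.add(COUNTRY_NAMES[match[1].toLowerCase()]);
  }
  return [...found];
}

console.log(attributeCountries('UK and US launch strikes on Yemen'));         // [ 'YE' ]
console.log(attributeCountries('US strikes on Yemen condemned by Iran'));     // [ 'YE' ]
console.log(attributeCountries('Yemen says UK and US strikes hit Hodeidah')); // []
```

Because the sketch matches case-insensitively, a short name such as "us" would also match the pronoun after a trigger word, so a production table would need case-aware handling for such entries.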

* fix(classify): fix regex hitting, gaza/hamas geo mapping, seed-meta always written

- hitt?(?:ing|s)? instead of hit(?:s|ting)? so "hitting" is matched
- gaza → PS (Palestinian Territories), hamas → PS (was IL)
- seed-meta:news:threat-summary written unconditionally so health check
  does not fire false alerts during no-attribution runs
2026-03-27 12:21:23 +04:00