worldmonitor

eliott/worldmonitor

mirror of https://github.com/koala73/worldmonitor.git synced 2026-05-11 17:46:20 +02:00

Elie Habib 93304c5c25 feat(intelligence): add countryCode geo-attribution to topStories (#2051) (#2094)

* feat(intelligence): add countryCode geo-attribution to topStories (#2051)

* fix(geo-extract): filter EU as supranational, add unigram stopwords, type countryCode in ServerInsightStory

- Map 'eu'/'europe' to 'XX' (supranational marker, returns null) instead of 'EU', which is not a valid ISO2 code and would be silently ignored by the downstream CII scorer
- Add a UNIGRAM_STOPWORDS set for high-false-positive single-word entries in country-names.json: chad/jordan/georgia/niger/guinea/mali/peru. These match too frequently as person names and US state names in English headlines; their country meanings are covered by unambiguous aliases (nigerian, georgian context via bigrams, etc.)
- Add countryCode: string | null and pubDate: string to the ServerInsightStory TypeScript interface to match what seed-insights.mjs now writes to Redis
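
The three changes above could look roughly like the following. This is a minimal sketch, not the repository's code: only `UNIGRAM_STOPWORDS`, `ServerInsightStory`, `countryCode`, `pubDate`, and the 'XX' marker come from the commit message. The `title` field, the `SUPRANATIONAL` constant, the `UNIGRAM_ALIASES` map, its sample entries, and `resolveUnigram` are hypothetical names introduced for illustration.

```typescript
// Only countryCode and pubDate are confirmed by the commit message;
// `title` is an assumed field added so the example is self-contained.
interface ServerInsightStory {
  title: string;
  countryCode: string | null;
  pubDate: string;
}

// Marker for supranational entities; lookups that hit it resolve to null.
const SUPRANATIONAL = "XX";

// Single words that collide with person names or US state names.
const UNIGRAM_STOPWORDS = new Set<string>([
  "chad", "jordan", "georgia", "niger", "guinea", "mali", "peru",
]);

// Hypothetical slice of the alias data (the real entries live in country-names.json).
const UNIGRAM_ALIASES = new Map<string, string>([
  ["eu", SUPRANATIONAL],
  ["europe", SUPRANATIONAL],
  ["france", "FR"],
  ["chad", "TD"],
]);

// Resolve one word to an ISO2 code, or null if it is a stopword,
// a supranational marker, or unknown.
function resolveUnigram(word: string): string | null {
  const key = word.toLowerCase();
  if (UNIGRAM_STOPWORDS.has(key)) return null;
  const code = UNIGRAM_ALIASES.get(key);
  if (code === undefined || code === SUPRANATIONAL) return null;
  return code;
}

// Example record shaped like the interface above (values are made up).
const exampleStory: ServerInsightStory = {
  title: "France announces new measures",
  countryCode: resolveUnigram("France"),
  pubDate: "2026-03-23T12:00:00Z",
};
```

Returning null for 'XX' rather than storing 'EU' keeps the field either a valid ISO2 code or explicitly empty, which is the behavior the first bullet describes.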

* fix(geo-extract): add 'us' to UNIGRAM_STOPWORDS to prevent pronoun false positives

'us' as a bare word matches almost every English headline ("give us",
"tells us", etc.). US coverage is preserved via the 'washington' and
'american' aliases in ALIAS_MAP.

* fix(geo-extract): fix US abbreviation, bigram punctuation, and scan ordering

Three issues:
1. 'us' stopword suppressed uppercase US (country). Fix: pre-process
\bUS\b → 'United States' before lowercasing; remove 'us' from stopwords.

2. Bigram matching used raw tokens so 'West Bank,' and 'Tel Aviv:' missed
their alias entries. Fix: strip punctuation from each token before
forming the bigram key.

3. Two-pass scan (all bigrams then all unigrams) meant 'United States'
bigram fired before earlier unigrams like 'Iran' in 'Iran blames US'.
Fix: single left-to-right scan with local longest-match (bigram at i
before unigram at i), preserving first-mention document order.
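
The three fixes above can be sketched together as a single function. This is an illustrative sketch under stated assumptions, not the code in the repository: `ALIAS_MAP` is named in the commit message, but its entries here, the ISO2 codes chosen for 'west bank' and 'tel aviv', and the function name `extractCountryCode` are hypothetical. Punctuation stripping is ASCII-only for brevity.

```typescript
// Hypothetical alias entries, lowercased; the real map is built elsewhere.
const ALIAS_MAP = new Map<string, string>([
  ["united states", "US"],
  ["west bank", "PS"],
  ["tel aviv", "IL"],
  ["iran", "IR"],
  ["washington", "US"],
  ["american", "US"],
]);

function extractCountryCode(headline: string): string | null {
  // Fix 1: expand uppercase "US" before lowercasing, so the pronoun
  // "us" never needs a stopword entry and is simply not matched.
  const expanded = headline.replace(/\bUS\b/g, "United States");

  // Fix 2: strip punctuation from each token before forming keys,
  // so "Bank," and "Aviv:" still produce clean bigrams.
  const tokens = expanded
    .split(/\s+/)
    .map((t) => t.toLowerCase().replace(/[^a-z0-9]/g, ""))
    .filter((t) => t.length > 0);

  // Fix 3: one left-to-right pass; at each position try the bigram
  // first (local longest match), then the unigram. The first hit wins,
  // which preserves first-mention order.
  for (let i = 0; i < tokens.length; i++) {
    if (i + 1 < tokens.length) {
      const bigramCode = ALIAS_MAP.get(`${tokens[i]} ${tokens[i + 1]}`);
      if (bigramCode !== undefined) return bigramCode;
    }
    const unigramCode = ALIAS_MAP.get(tokens[i]);
    if (unigramCode !== undefined) return unigramCode;
  }
  return null;
}
```

With this ordering, 'Iran blames US' attributes to Iran because the unigram at position 0 is checked before any later bigram, while 'US sanctions Iran' attributes to the United States.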

2026-03-23 16:32:34 +04:00

acled-oauth.mjs

fix(acled): add OAuth token manager with automatic refresh (#1437)

2026-03-12 22:24:40 +04:00

ai-tokens.json

feat(finance): crypto sectors heatmap + DeFi/AI/Alt token panels + expanded crypto news (#1900)

2026-03-20 10:34:20 +04:00

commodities.json

feat(commodities): expand tracking to 23 symbols: agriculture and coal (#2135)

2026-03-23 14:19:20 +04:00

country-names.json

feat(advisories): gold standard migration for security advisories (#1637)