* refactor(country-maps): consolidate country name/ISO maps
Expand shared/country-names.json from 265 to 309 entries by merging
geojson names, COUNTRY_ALIAS_MAP, upstream API variants (World Bank,
WHO, UN, FAO), and seed-correlation extras.
Add ISO3 map generator (generate-iso3-maps.cjs) producing
iso3-to-iso2.json (239 entries) and iso2-to-iso3.json (239 entries)
with TWN and XKX supplements.
Add build-country-names.cjs for reproducible expansion from all sources.
Sync scripts/shared/ copies for edge-function test compatibility.
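Sketch of the generation step, for reviewers. This is not the contents of generate-iso3-maps.cjs: the geojson property names (ISO_A3, ISO_A2) and the helper names buildIso3ToIso2/invert are assumptions made for illustration. Only the TWN and XKX supplements come from the commit.

```javascript
// Hypothetical stand-in for the features in countries.geojson.
// The property names ISO_A3 / ISO_A2 are assumed, not taken from the repo.
const features = [
  { properties: { ISO_A3: 'FRA', ISO_A2: 'FR' } },
  { properties: { ISO_A3: 'JPN', ISO_A2: 'JP' } },
  { properties: { ISO_A3: '-99', ISO_A2: '-99' } }, // placeholder codes, skipped
];

// Supplements named in the commit: Taiwan and Kosovo.
const SUPPLEMENTS = { TWN: 'TW', XKX: 'XK' };

// Build ISO3 -> ISO2 from well-formed codes, then layer the supplements on top.
function buildIso3ToIso2(featureList, supplements) {
  const map = {};
  for (const { properties } of featureList) {
    const iso3 = properties.ISO_A3;
    const iso2 = properties.ISO_A2;
    if (/^[A-Z]{3}$/.test(iso3) && /^[A-Z]{2}$/.test(iso2)) {
      map[iso3] = iso2;
    }
  }
  return { ...map, ...supplements };
}

// Derive the reverse map so both files always have the same entry count.
function invert(map) {
  return Object.fromEntries(Object.entries(map).map(([k, v]) => [v, k]));
}

const iso3ToIso2 = buildIso3ToIso2(features, SUPPLEMENTS);
const iso2ToIso3 = invert(iso3ToIso2);
```

Deriving iso2-to-iso3.json by inverting iso3-to-iso2.json is one way to keep the two files at the same size (239 entries each here).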
* refactor: consolidate country name/code mappings into single canonical sources
Eliminates fragmented country mapping across the repo. Every feature
(resilience, conflict, correlation, intelligence) was maintaining its
own partial alias map.
Data consolidation:
- Expand shared/country-names.json from 265 to 302 entries covering
World Bank, WHO, UN, FAO, and correlation script naming variants
- Generate shared/iso3-to-iso2.json (239 entries) and
shared/iso2-to-iso3.json from countries.geojson + supplements
(Taiwan TWN, Kosovo XKX)
Consumer migrations:
- _country-resolver.mjs: delete COUNTRY_ALIAS_MAP (37 entries),
replace 2MB geojson parse with 5KB iso3-to-iso2.json
- conflict/_shared.ts: replace 33-entry ISO2_TO_ISO3 literal
- seed-conflict-intel.mjs: replace 20-entry ISO2_TO_ISO3 literal
- _dimension-scorers.ts: replace geojson-based ISO3 construction
- get-risk-scores.ts: replace 31-entry ISO3_TO_ISO2 literal
- seed-correlation.mjs: replace 102-entry COUNTRY_NAME_TO_ISO2
and 90-entry ISO3_TO_ISO2; use resolveIso2() from the canonical
resolver; lower the short-alias threshold to 2 chars with
word-boundary matching; export matchCountryNamesInText(); add
isMain guard
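Sketch of why the 2-char threshold needs word-boundary matching. The function name matchCountryNamesInText comes from this commit, but its signature, the sample name map, and escapeRegExp below are assumptions, not the code in seed-correlation.mjs.

```javascript
// Hypothetical subset of shared/country-names.json (normalized name -> ISO2).
const COUNTRY_NAMES = {
  'united kingdom': 'GB',
  uk: 'GB',
  france: 'FR',
  chad: 'TD',
};

// Escape regex metacharacters so names are matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the sorted, de-duplicated ISO2 codes whose names occur as whole
// words in the text. Names shorter than minLength are ignored.
function matchCountryNamesInText(text, names = COUNTRY_NAMES, minLength = 2) {
  const haystack = text.toLowerCase();
  const found = new Set();
  for (const [name, iso2] of Object.entries(names)) {
    if (name.length < minLength) continue;
    const pattern = new RegExp(`\\b${escapeRegExp(name)}\\b`);
    if (pattern.test(haystack)) found.add(iso2);
  }
  return [...found].sort();
}

const hits = matchCountryNamesInText('UK and France sign deal');   // ['FR', 'GB']
const noHits = matchCountryNamesInText('Duke Chadwick signs deal'); // []
```

Without the \b anchors, "uk" would match inside "Duke" and "chad" inside "Chadwick", which is presumably why short aliases were excluded before this change.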
Tests:
- New tests/country-resolver.test.mjs with structural validation,
parity regression for all 37 old aliases, ISO3 bidirectional
consistency, and Taiwan/Kosovo assertions
- Updated resilience seed test for new resolver signature
Net: -190 lines, 0 hardcoded country maps remaining
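Sketch of the bidirectional consistency check described under Tests. The map excerpts and the helper findRoundTripFailures are hypothetical; the real assertions live in tests/country-resolver.test.mjs and may be structured differently.

```javascript
// Hypothetical excerpts of shared/iso3-to-iso2.json and shared/iso2-to-iso3.json.
const forwardMap = { FRA: 'FR', TWN: 'TW', XKX: 'XK' };
const reverseMap = { FR: 'FRA', TW: 'TWN', XK: 'XKX' };

// Collect every entry that does not survive a round trip in either direction.
function findRoundTripFailures(iso3ToIso2, iso2ToIso3) {
  const failures = [];
  for (const [iso3, iso2] of Object.entries(iso3ToIso2)) {
    if (iso2ToIso3[iso2] !== iso3) failures.push(`${iso3} -> ${iso2}`);
  }
  for (const [iso2, iso3] of Object.entries(iso2ToIso3)) {
    if (iso3ToIso2[iso3] !== iso2) failures.push(`${iso2} -> ${iso3}`);
  }
  return failures;
}

const failures = findRoundTripFailures(forwardMap, reverseMap);
```

An empty failures array means every ISO3 code maps to an ISO2 code that maps back to it, and vice versa.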
* fix: normalize raw text before country name matching
Text matchers (geo-extract, seed-security-advisories, seed-correlation)
were matching normalized keys against raw text containing diacritics
and punctuation. "Curaçao", "Timor-Leste", "Hong Kong S.A.R." all
failed to resolve after country-names.json keys were normalized.
Fix: apply NFKD + diacritic stripping + punctuation normalization to
input text before matching, the same transform already applied to the
keys.
Also add "hong kong" and "sao tome" as short-form keys for bigram
headline matching in geo-extract.
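Sketch of the transform. The name normalizeForMatching and the exact punctuation rule (any run of non-alphanumerics becomes one space) are assumptions, inferred from key shapes such as 'u s virgin islands'. The shared helper in the repo may differ.

```javascript
// Normalize free text the same way the lookup keys are assumed to be normalized.
function normalizeForMatching(text) {
  return text
    .normalize('NFKD')                // split letters from their combining marks
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining diacritical marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ' ')      // punctuation and whitespace runs -> one space
    .trim();
}

normalizeForMatching('Curaçao');          // 'curacao'
normalizeForMatching('Timor-Leste');      // 'timor leste'
normalizeForMatching('Hong Kong S.A.R.'); // 'hong kong s a r'
```

Applying the same function to both keys and input text means a key can only fail to match because the name is absent, not because of spelling marks or punctuation.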
* fix: remove 'u s' alias that caused US/VI misattribution
'u s' in country-names.json matched before 'u s virgin islands' in
geo-extract's bigram scanner, attributing Virgin Islands headlines
to the US. Removed it, since 'usa', 'united states', and the uppercase
US expansion already cover the United States.
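Sketch of the failure mode. The commit does not show geo-extract's scanner, so firstCountryHit below is hypothetical and assumes shorter candidates are tried first at each position. Only the keys 'u s', 'u s virgin islands', and 'usa' come from the commit.

```javascript
// Hypothetical scanner: at each token position, try 1-word, then 2-word, ...
// candidates and return the first key that matches.
function firstCountryHit(normalizedHeadline, names, maxWords = 4) {
  const tokens = normalizedHeadline.split(' ');
  for (let start = 0; start < tokens.length; start++) {
    for (let n = 1; n <= maxWords && start + n <= tokens.length; n++) {
      const candidate = tokens.slice(start, start + n).join(' ');
      if (names.has(candidate)) return names.get(candidate);
    }
  }
  return null;
}

const withAlias = new Map([
  ['u s', 'US'],
  ['u s virgin islands', 'VI'],
  ['usa', 'US'],
]);
const withoutAlias = new Map([
  ['u s virgin islands', 'VI'],
  ['usa', 'US'],
]);

const headline = 'u s virgin islands brace for storm';
const beforeFix = firstCountryHit(headline, withAlias);    // 'US'
const afterFix = firstCountryHit(headline, withoutAlias);  // 'VI'
```

Under this assumed ordering the short alias shadows the longer key, and removing it lets the longer key match.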