worldmonitor/shared/iso3-to-iso2.json
Elie Habib 02555671f2 refactor: consolidate country name/code mappings into single canonical sources (#2676)
* refactor(country-maps): consolidate country name/ISO maps

Expand shared/country-names.json from 265 to 309 entries by merging
geojson names, COUNTRY_ALIAS_MAP, upstream API variants (World Bank,
WHO, UN, FAO), and seed-correlation extras.

Add ISO3 map generator (generate-iso3-maps.cjs) producing
iso3-to-iso2.json (239 entries) and iso2-to-iso3.json (239 entries)
with TWN and XKX supplements.
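
A minimal sketch of how such a generator could work, not the actual generate-iso3-maps.cjs. The feature property names `ISO_A3` and `ISO_A2` and the function name `buildIso3Maps` are assumptions for illustration; only the TWN and XKX supplements come from the commit message.

```javascript
// Supplements named in the commit message.
const SUPPLEMENTS = { TWN: "TW", XKX: "XK" };

// Hypothetical: ISO_A3 / ISO_A2 are assumed property names, not confirmed
// against the repository's countries.geojson.
function buildIso3Maps(features, supplements = SUPPLEMENTS) {
  const collected = {};
  for (const feature of features) {
    const iso3 = feature.properties?.ISO_A3 ?? "";
    const iso2 = feature.properties?.ISO_A2 ?? "";
    // Keep only well-formed codes; placeholder values are skipped.
    if (/^[A-Z]{3}$/.test(iso3) && /^[A-Z]{2}$/.test(iso2)) {
      collected[iso3] = iso2;
    }
  }
  Object.assign(collected, supplements);

  // Sort by ISO3 key so the generated JSON is stable across runs.
  const iso3ToIso2 = Object.fromEntries(
    Object.entries(collected).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
  );
  // The reverse map is derived from the forward map, so both stay in sync.
  const iso2ToIso3 = Object.fromEntries(
    Object.entries(iso3ToIso2).map(([iso3, iso2]) => [iso2, iso3])
  );
  return { iso3ToIso2, iso2ToIso3 };
}

const sampleFeatures = [
  { properties: { ISO_A3: "FRA", ISO_A2: "FR" } },
  { properties: { ISO_A3: "-99", ISO_A2: "-99" } }, // malformed, skipped
  { properties: { ISO_A3: "ABW", ISO_A2: "AW" } },
];
const maps = buildIso3Maps(sampleFeatures);
console.log(Object.keys(maps.iso3ToIso2)); // [ 'ABW', 'FRA', 'TWN', 'XKX' ]
```

Deriving iso2-to-iso3 from the forward map is one way to keep both files at the same entry count.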

Add build-country-names.cjs for reproducible expansion from all sources.
Sync scripts/shared/ copies for edge-function test compatibility.

* refactor: consolidate country name/code mappings into single canonical sources

Eliminates fragmented country mapping across the repo. Every feature
(resilience, conflict, correlation, intelligence) was maintaining its
own partial alias map.

Data consolidation:
- Expand shared/country-names.json from 265 to 302 entries covering
  World Bank, WHO, UN, FAO, and correlation script naming variants
- Generate shared/iso3-to-iso2.json (239 entries) and
  shared/iso2-to-iso3.json from countries.geojson + supplements
  (Taiwan TWN, Kosovo XKX)

Consumer migrations:
- _country-resolver.mjs: delete COUNTRY_ALIAS_MAP (37 entries),
  replace 2MB geojson parse with 5KB iso3-to-iso2.json
- conflict/_shared.ts: replace 33-entry ISO2_TO_ISO3 literal
- seed-conflict-intel.mjs: replace 20-entry ISO2_TO_ISO3 literal
- _dimension-scorers.ts: replace geojson-based ISO3 construction
- get-risk-scores.ts: replace 31-entry ISO3_TO_ISO2 literal
- seed-correlation.mjs: replace 102-entry COUNTRY_NAME_TO_ISO2
  and 90-entry ISO3_TO_ISO2, use resolveIso2() from canonical
  resolver, lower short-alias threshold to 2 chars with word
  boundary matching, export matchCountryNamesInText(), add isMain
  guard
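
The word-boundary matching for short aliases can be sketched as below. This is an illustration under assumptions, not the code in seed-correlation.mjs: the alias table, `matchAliases`, and `escapeRegExp` are hypothetical names, and only the 2-character threshold comes from the commit message.

```javascript
// Hypothetical alias table for illustration only.
const ALIASES = { uk: "GB", uae: "AE", "south korea": "KR", oman: "OM" };

// Escape regex metacharacters so alias text is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function matchAliases(text, aliases = ALIASES, minLength = 2) {
  const haystack = text.toLowerCase();
  const found = new Set();
  for (const [alias, iso2] of Object.entries(aliases)) {
    if (alias.length < minLength) continue;
    // \b on both sides keeps "oman" from matching inside "woman"
    // and "uk" from matching inside "ukulele".
    const pattern = new RegExp(`\\b${escapeRegExp(alias)}\\b`);
    if (pattern.test(haystack)) found.add(iso2);
  }
  return [...found];
}

console.log(matchAliases("UK and UAE sign trade deal")); // [ 'GB', 'AE' ]
console.log(matchAliases("A woman played the ukulele")); // []
```

Without the boundary check, a 2-character threshold would produce many substring false positives, which is presumably why the two changes were made together.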

Tests:
- New tests/country-resolver.test.mjs with structural validation,
  parity regression for all 37 old aliases, ISO3 bidirectional
  consistency, and Taiwan/Kosovo assertions
- Updated resilience seed test for new resolver signature

Net: -190 lines, 0 hardcoded country maps remaining

* fix: normalize raw text before country name matching

Text matchers (geo-extract, seed-security-advisories, seed-correlation)
were matching normalized keys against raw text containing diacritics
and punctuation. "Curaçao", "Timor-Leste", "Hong Kong S.A.R." all
failed to resolve after country-names.json keys were normalized.

Fix: apply NFKD + diacritic stripping + punctuation normalization to
input text before matching, same transform used on the keys.
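
A minimal sketch of that transform, assuming punctuation is collapsed to single spaces and text is lowercased; the exact rules in the repository may differ, and `normalizeForMatching` is a hypothetical name.

```javascript
function normalizeForMatching(text) {
  return text
    .normalize("NFKD")               // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, "") // strip the combining diacritical marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")     // punctuation and whitespace runs become one space
    .trim();
}

console.log(normalizeForMatching("Curaçao"));          // curacao
console.log(normalizeForMatching("Timor-Leste"));      // timor leste
console.log(normalizeForMatching("Hong Kong S.A.R.")); // hong kong s a r
```

Applying the same function to both the keys and the input text is what makes the lookup symmetric.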

Also add "hong kong" and "sao tome" as short-form keys for bigram
headline matching in geo-extract.

* fix: remove 'u s' alias that caused US/VI misattribution

'u s' in country-names.json matched before 'u s virgin islands' in
geo-extract's bigram scanner, attributing Virgin Islands headlines
to US. Removed since 'usa', 'united states', and the uppercase US
expansion already cover the United States.
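
The misattribution can be reproduced with a toy scanner. This is not geo-extract's implementation: `firstMatch` and its scan order (shorter n-grams tried before longer ones, first hit wins) are assumptions chosen to show how a short alias can shadow a longer key.

```javascript
// Hypothetical scanner: tries 2-grams, then 3-grams, then 4-grams,
// and returns the first alias hit.
function firstMatch(normalizedText, aliases) {
  const tokens = normalizedText.split(" ");
  for (let n = 2; n <= 4; n++) {
    for (let i = 0; i + n <= tokens.length; i++) {
      const gram = tokens.slice(i, i + n).join(" ");
      if (Object.prototype.hasOwnProperty.call(aliases, gram)) {
        return aliases[gram];
      }
    }
  }
  return null;
}

const headline = "storm hits u s virgin islands";
const withShortAlias = { "u s": "US", "u s virgin islands": "VI" };
const withoutShortAlias = { "u s virgin islands": "VI" };

console.log(firstMatch(headline, withShortAlias));    // US (misattributed)
console.log(firstMatch(headline, withoutShortAlias)); // VI
```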
2026-04-04 15:38:02 +04:00

{
"ABW": "AW",
"AFG": "AF",
"AGO": "AO",
"AIA": "AI",
"ALA": "AX",
"ALB": "AL",
"AND": "AD",
"ARE": "AE",
"ARG": "AR",
"ARM": "AM",
"ASM": "AS",
"ATA": "AQ",
"ATF": "TF",
"ATG": "AG",
"AUS": "AU",
"AUT": "AT",
"AZE": "AZ",
"BDI": "BI",
"BEL": "BE",
"BEN": "BJ",
"BFA": "BF",
"BGD": "BD",
"BGR": "BG",
"BHR": "BH",
"BHS": "BS",
"BIH": "BA",
"BLM": "BL",
"BLR": "BY",
"BLZ": "BZ",
"BMU": "BM",
"BOL": "BO",
"BRA": "BR",
"BRB": "BB",
"BRN": "BN",
"BTN": "BT",
"BWA": "BW",
"CAF": "CF",
"CAN": "CA",
"CHE": "CH",
"CHL": "CL",
"CHN": "CN",
"CIV": "CI",
"CMR": "CM",
"COD": "CD",
"COG": "CG",
"COK": "CK",
"COL": "CO",
"COM": "KM",
"CPV": "CV",
"CRI": "CR",
"CUB": "CU",
"CUW": "CW",
"CYM": "KY",
"CYP": "CY",
"CZE": "CZ",
"DEU": "DE",
"DJI": "DJ",
"DMA": "DM",
"DNK": "DK",
"DOM": "DO",
"DZA": "DZ",
"ECU": "EC",
"EGY": "EG",
"ERI": "ER",
"ESH": "EH",
"ESP": "ES",
"EST": "EE",
"ETH": "ET",
"FIN": "FI",
"FJI": "FJ",
"FLK": "FK",
"FRA": "FR",
"FRO": "FO",
"FSM": "FM",
"GAB": "GA",
"GBR": "GB",
"GEO": "GE",
"GGY": "GG",
"GHA": "GH",
"GIB": "GI",
"GIN": "GN",
"GMB": "GM",
"GNB": "GW",
"GNQ": "GQ",
"GRC": "GR",
"GRD": "GD",
"GRL": "GL",
"GTM": "GT",
"GUM": "GU",
"GUY": "GY",
"HKG": "HK",
"HMD": "HM",
"HND": "HN",
"HRV": "HR",
"HTI": "HT",
"HUN": "HU",
"IDN": "ID",
"IMN": "IM",
"IND": "IN",
"IOT": "IO",
"IRL": "IE",
"IRN": "IR",
"IRQ": "IQ",
"ISL": "IS",
"ISR": "IL",
"ITA": "IT",
"JAM": "JM",
"JEY": "JE",
"JOR": "JO",
"JPN": "JP",
"KAZ": "KZ",
"KEN": "KE",
"KGZ": "KG",
"KHM": "KH",
"KIR": "KI",
"KNA": "KN",
"KOR": "KR",
"KWT": "KW",
"LAO": "LA",
"LBN": "LB",
"LBR": "LR",
"LBY": "LY",
"LCA": "LC",
"LIE": "LI",
"LKA": "LK",
"LSO": "LS",
"LTU": "LT",
"LUX": "LU",
"LVA": "LV",
"MAC": "MO",
"MAF": "MF",
"MAR": "MA",
"MCO": "MC",
"MDA": "MD",
"MDG": "MG",
"MDV": "MV",
"MEX": "MX",
"MHL": "MH",
"MKD": "MK",
"MLI": "ML",
"MLT": "MT",
"MMR": "MM",
"MNE": "ME",
"MNG": "MN",
"MNP": "MP",
"MOZ": "MZ",
"MRT": "MR",
"MSR": "MS",
"MUS": "MU",
"MWI": "MW",
"MYS": "MY",
"NAM": "NA",
"NCL": "NC",
"NER": "NE",
"NFK": "NF",
"NGA": "NG",
"NIC": "NI",
"NIU": "NU",
"NLD": "NL",
"NOR": "NO",
"NPL": "NP",
"NRU": "NR",
"NZL": "NZ",
"OMN": "OM",
"PAK": "PK",
"PAN": "PA",
"PCN": "PN",
"PER": "PE",
"PHL": "PH",
"PLW": "PW",
"PNG": "PG",
"POL": "PL",
"PRI": "PR",
"PRK": "KP",
"PRT": "PT",
"PRY": "PY",
"PSE": "PS",
"PYF": "PF",
"QAT": "QA",
"ROU": "RO",
"RUS": "RU",
"RWA": "RW",
"SAU": "SA",
"SDN": "SD",
"SEN": "SN",
"SGP": "SG",
"SGS": "GS",
"SHN": "SH",
"SLB": "SB",
"SLE": "SL",
"SLV": "SV",
"SMR": "SM",
"SOM": "SO",
"SPM": "PM",
"SRB": "RS",
"SSD": "SS",
"STP": "ST",
"SUR": "SR",
"SVK": "SK",
"SVN": "SI",
"SWE": "SE",
"SWZ": "SZ",
"SXM": "SX",
"SYC": "SC",
"SYR": "SY",
"TCA": "TC",
"TCD": "TD",
"TGO": "TG",
"THA": "TH",
"TJK": "TJ",
"TKM": "TM",
"TLS": "TL",
"TON": "TO",
"TTO": "TT",
"TUN": "TN",
"TUR": "TR",
"TUV": "TV",
"TWN": "TW",
"TZA": "TZ",
"UGA": "UG",
"UKR": "UA",
"UMI": "UM",
"URY": "UY",
"USA": "US",
"UZB": "UZ",
"VAT": "VA",
"VCT": "VC",
"VEN": "VE",
"VGB": "VG",
"VIR": "VI",
"VNM": "VN",
"VUT": "VU",
"WLF": "WF",
"WSM": "WS",
"XKX": "XK",
"YEM": "YE",
"ZAF": "ZA",
"ZMB": "ZM",
"ZWE": "ZW"
}
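
A consumer could resolve codes against this map roughly as follows. The sketch inlines a small excerpt instead of loading the file, and `toIso2` is a hypothetical helper, not the repository's resolveIso2().

```javascript
// Excerpt of the map above, inlined so the example is self-contained.
const ISO3_TO_ISO2 = { FRA: "FR", TWN: "TW", USA: "US", XKX: "XK" };

// Hypothetical helper: returns the ISO2 code, or null for unknown input.
function toIso2(iso3, table = ISO3_TO_ISO2) {
  if (typeof iso3 !== "string") return null;
  const key = iso3.trim().toUpperCase();
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : null;
}

console.log(toIso2("fra"));   // FR
console.log(toIso2(" XKX ")); // XK
console.log(toIso2("ZZZ"));   // null
```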