eliott/worldmonitor - worldmonitor - lab48

mirror of https://github.com/koala73/worldmonitor.git synced 2026-04-25 17:14:57 +02:00

Author	SHA1	Message	Date
Elie Habib	8089fd9d53	feat(resilience): publish resilience:static:fao aggregate from static seed (#3050) * feat(resilience): publish resilience:static:fao aggregate from static seed Weekly validation cron Outcome-Backtest reads resilience:static:fao for the Food Crisis Escalation family, but nothing wrote that key — dangling reference, Food Crisis stuck at AUC=0.5. IPC Phase 3+ data is already fetched by fetchFsinDataset (HDX global IPC CSV) and stored per-country. This PR reshapes the same in-memory map into an aggregate view and writes it in the existing Redis pipeline — no extra fetch, no new cron service. Output shape matches what detectFoodCrisis already walks: { countries: { [iso2]: { ipcPhase, phase, peopleInCrisis, year, source } }, count, fetchedAt, seedYear, source: 'hdx-ipc' } Only Phase 3+ countries are included, matching IPC's own publish rule. Absence = not-monitored-crisis, consistent with scoreFoodWater()'s stable-absence semantics. Tests: 5 unit tests for buildFaoAggregate (incl. contract test against detectFoodCrisis) + 1 health.js registration test. No cron/Railway changes needed — seed-bundle-static-ref picks it up on its next October window; restart to backfill sooner. FX Stress / Power Outages / Refugees / Conflict also fail today but for different reasons (detector shape mismatches) — out of scope here. * fix(resilience): wire resilienceStaticFao into SEED_META to unmask empty-state Reviewer catch on #3050: adding resilienceStaticFao to STANDALONE_KEYS and EMPTY_DATA_OK_KEYS without a matching SEED_META entry leaves seedStale=null in the standalone-key health branch, so an empty or missing resilience:static:fao key resolves to plain OK instead of STALE_SEED — silently masking the exact bug this PR is meant to surface. Adds SEED_META.resilienceStaticFao pointing at seed-meta:resilience:static (same heartbeat as resilienceStaticIndex, since the aggregate is written in the same Redis pipeline by the same seeder). Now: missing data with stale heartbeat -> STALE_SEED (warn); with fresh heartbeat and no countries in Phase 3+ -> OK (still valid per EMPTY_DATA_OK_KEYS). Same trap documented in feedback_empty_data_ok_keys_bootstrap_blind_spot.md but in the STANDALONE_KEYS path, not BOOTSTRAP_KEYS. Test locks it in with a source-string regex assertion.	2026-04-13 13:00:58 +04:00
Elie Habib	968244e2ec	fix(seed): revert WHO $top to 1000 + add $skip pagination (#2827) WHO GHO API rejects $top > 1000 with HTTP 400, breaking all WHO indicators in production. Revert to $top=1000 and paginate via $skip. UHC indicator has 4680 rows requiring 5 pages. Adds per-indicator row count logging.	2026-04-08 17:24:50 +04:00
Elie Habib	83cecb5aef	feat(resilience): add WB mean applied tariff rate to tradeSanctions (#2811 ) * feat(resilience): add WB mean applied tariff rate to tradeSanctions World Bank TM.TAX.MRCH.WM.AR.ZS covers 180+ countries, supplementing the WTO top-50 metrics that only cover major reporters. Reduces reporter-set bias by providing a global trade openness signal. Reweights: sanctions 0.45, WTO restrictions 0.15, WTO barriers 0.15, WB tariff rate 0.25. * fix: update pinned test assertions for WB tariff rate reweighting Adjusts scoreTradeSanctions test assertions for the new 4-metric blend (sanctions 0.45, restrictions 0.15, barriers 0.15, tariff 0.25) and bumps TOTAL_DATASET_SLOTS from 9 to 10 in payload assembly tests. * fix(seed): bump static source version to v5 + sync indicator registry for trade Version bump ensures appliedTariffRate backfills to existing 2026 snapshots. Registry updated from 3-metric to 4-metric trade-sanctions weights. * fix(resilience): correct appliedTariffRate sourceKey to resilience:static:{ISO2} * fix(resilience): bump score cache to v4 + add tariff rate to release-gate fixtures Score/ranking cache keys bumped to v4 to invalidate stale pre-tariff cached responses. Release-gate fixtures now include appliedTariffRate so the gate exercises the full 4-metric trade-sanctions path. * fix(test): update pinned scorer assertions after rebase onto main With all Phase 2+3 PRs merged (FX reserves, broadband, WHO metrics, zero-event guards), the combined fixture data produces economic=66.33, infrastructure=79, overallScore=68.72.	2026-04-08 10:56:12 +04:00
Elie Habib	6aa822e9f9	feat(resilience): FX reserves adequacy in currencyExternal (#2812 ) * feat(resilience): add FX reserves adequacy to currencyExternal dimension World Bank FI.RES.TOTL.MO (total reserves in months of imports) covers ~160 countries, filling the BIS EER coverage gap (~40 economies). For BIS countries: reserves supplement volatility + deviation (weight 0.15). For non-BIS countries: reserves combine with IMF inflation proxy (0.4/0.6 blend) for much better currency stability coverage than inflation alone. Normalization: 1 month (near crisis) = 0, 12+ months (very safe) = 100. * fix(seed): bump static source version to v4 for fxReservesMonths backfill Without version bump, existing 2026 snapshots won't be republished and fxReservesMonths field will never backfill until next annual cycle. * fix(resilience): bump score cache to v3 for FX reserves scorer change scoreCurrencyExternal now includes FX reserves adequacy, changing scores for all countries. Bump cache key to invalidate stale pre-reserves cached responses on deploy. * fix(seed): retry static seed when previous run had failed datasets shouldSkipSeedYear() now returns false when seed-meta records non-empty failedDatasets, allowing backfill of datasets that failed on the first run (e.g., fxReservesMonths upstream outage during v4 rollout). Previously, partial success with status:'ok' caused all future same-year runs to skip permanently.	2026-04-08 10:45:25 +04:00
Elie Habib	bea26b3175	feat(resilience): add WB broadband penetration to infrastructure dimension (#2813 ) * feat(resilience): add WB broadband penetration to infrastructure dimension World Bank IT.NET.BBND.P2 (fixed broadband subscriptions per 100 people) added as new sub-metric in scoreInfrastructure. normalizeHigherBetter(0, 40). Reweights: electricity 0.30, roads 0.30, outages 0.25, broadband 0.15. * fix(resilience): add explicit outagesRaw null guard in scoreInfrastructure Matches the established pattern in scoreCyberDigital where both source presence and penalty > 0 are checked before scoring. * test(resilience): pin expected broadband numeric contribution in infrastructure scorer Strengthens the broadband test from directional-only to pinned numeric assertion, catching regressions in normalization goalposts or weight changes.	2026-04-08 10:32:33 +04:00
Elie Habib	78c8381547	feat(resilience): add WHO physician density + health expenditure to healthPublicService (#2808 ) * feat(resilience): add WHO physician density + health expenditure to healthPublicService Two new sub-metrics from WHO GHO OData: - Physician density per 1k (HWF_0001): normalizeHigherBetter(0, 5) - Health expenditure per capita PPP (GHED_CHE_pc_PPP_SHA2011): normalizeHigherBetter(50, 5000) Reweights existing metrics: UHC 0.35, measles 0.25, beds 0.10, physicians 0.15, expenditure 0.15. Bumps static source version to v3 for backfill. * fix(resilience): replace dead WHO health expenditure indicator with working alternative GHED_CHE_pc_PPP_SHA2011 returns empty dataset from WHO GHO API, causing the 15% healthExp weight to silently drop from production scoring. Replaced with GHED_CHE_pc_US_SHA2011 (per capita current USD), which has 4849 records across all countries. Renamed field healthExpPerCapitaPpp to healthExpPerCapitaUsd and adjusted normalization goalposts from (50, 5000) to (20, 3000) to reflect current-USD scale. Bumped source version to v4. * fix(seed): increase WHO $top to 10000 to prevent pagination truncation + add transform test WHO GHO API returns exactly 1000 rows with no @odata.nextLink for physician density and health expenditure indicators, silently truncating country coverage. Increasing $top to 10000 fetches all rows in one page (typical WHO indicators have 2000-5000 rows). Also adds seed-level test for the HWF_0001 per-10k to per-1k division transform.	2026-04-08 10:21:59 +04:00
Elie Habib	023e2e60e7	feat(resilience): add tradeToGdpPct to resilience static bundle (#2794 ) * feat(seed): add tradeToGdpPct to resilience static bundle from WB NE.TRD.GNFS.ZS Trade as % of GDP is needed for Phase 2 exposure-weighting of shipping stress. Small open economies (Singapore ~300%, Belgium ~170%) will feel shipping disruption more than large autarkies (US ~25%). * fix(seed): bump static source version to v2 for tradeToGdp backfill The static seeder skips re-runs within the same seed year if a snapshot exists. Without a version bump, the new tradeToGdpPct field would never backfill for countries already seeded in 2026. Also added sourceVersion check to shouldSkipSeedYear() so future version bumps automatically force a re-seed without needing to clear Redis.	2026-04-07 22:26:09 +04:00
Elie Habib	6bff5c6e40	fix(resilience): robust GPI 404 fallback — structured error status, no proxy on 404, tested (#2759)	2026-04-06 09:07:45 +04:00
Elie Habib	d170cbd56e	fix(resilience): GPI 404 — skip proxy retry, log info instead of err (#2758)	2026-04-06 09:00:40 +04:00
Elie Habib	1f45bc326c	fix(resilience): replace broken FAO Aquastat BigQuery URL with World Bank water stress API; fix scorer/seeder shape mismatch (#2755)	2026-04-06 08:34:39 +04:00
Elie Habib	0b3f04531a	fix(resilience): export+test GPI/FSIN/Aquastat CSV parsers; fix FSIN field mismatch (#2751 ) * fix(resilience): export+test GPI/FSIN/Aquastat CSV parsers; fix FSIN field mismatch Three CSV parsers in seed-resilience-static.mjs were private and untested. The GPI year-fallback logic (currentYear 404 in April, falls back to currentYear-1) was invisible to CI. Exports added: gpiUrlForYear(yr) -- URL builder, makes the year-fallback testable parseGpiRows(csv, year) -- GPI CSV parser extracted from fetchGpiDataset parseFsinRows(csv) -- FSIN/IPC parser extracted from fetchFsinDataset parseAquastatRows(csv) -- Aquastat parser extracted from fetchAquastatDataset Bug fixed: parseFsinRows was writing { phase3plus, phase4, phase5 } but scoreFoodWater() reads staticRecord.fao.peopleInCrisis and .phase, fields that the seeder never wrote. In production the crisis sub-metric was always null. Fixed by mapping phase3plus to peopleInCrisis and deriving .phase from the highest active IPC phase level (Phase 5 present => IPC Phase 5 etc). Also fixed the skip guard: safeNum('') returns 0 (not null), so the old == null check let zero-phase rows through. Changed to falsy (!phase3plus) which correctly skips both null and zero. Tests: 12 new cases covering column-name schema variations (old/new HDX schema), GPI min-country guard, Aquastat latest-year preference, Variable_Id fallback column, and the FSIN zero-phase skip behavior. * test(resilience): update recovery fixtures to new fao schema; assert scorer-compatible fields	2026-04-06 07:43:19 +04:00
Elie Habib	7db6ec60c4	fix(resilience-static): preserve existing Redis fao data when FSIN fetch fails (#2748) * fix(resilience-static): preserve existing Redis fao data when FSIN fetch fails * fix(resilience-static): block publish when Redis recovery reads fail; add recoverFailedDatasets tests	2026-04-05 23:53:10 +04:00
Elie Habib	02d5272469	feat(resilience): phase 1 scoring corrections — electricity consumption, certainty imputation, coverage weighting (#2743 ) Root causes diagnosed from production screenshots: - Lebanon energy=89: Eurostat is EU-only so IEA dependency=null; OWID showed low fossil use during crisis which appeared "clean", inflating the score. - USA "Low confidence": crisis monitoring databases (IPC, UNHCR, UCDP) only track countries IN crisis; absence was treated as missing data instead of positive signal. Fixes: - Add EG.USE.ELEC.KH.PC (electricity consumption kWh/cap) to scoreEnergy. Very low per-capita consumption captures grid collapse regardless of Eurostat coverage. Weight 0.30 (dominant when IEA dependency is null). Lebanon ~1200 kWh/cap now scores ~13 on this sub-metric vs USA ~12000 = 100. - Certainty imputation for tradeSanctions: when global sanctions list loaded but country absent, score = 100 (0 pressure). Prevents stable economies losing 55% coverage weight for being unsanctioned. - Certainty imputation for foodWater: when country has governance data (WGI present) but no FSIN/IPC entry, score = 87. "Not in food crisis database" is a positive signal, not a data gap. - Coverage-weighted domain aggregation in buildDomainList: dimensions with zero coverage no longer drag the domain average down. Low-data dimensions contribute proportionally to their coverage weight. - Add EG.USE.ELEC.KH.PC to seed-resilience-static.mjs infrastructure pull. Tests: - Lebanon energy < 50 with null IEA + 1200 kWh/cap (certifies the fix) - Sanctions certainty imputation: FI absent from list scores 100, coverage 1 - LB (fragile) < ZA (stressed) in release gate - US.lowConfidence === false in release gate - Update scorers snapshot: energy 63->78, overall 66.70->68.95 - Update dimension scorer coverage assertion: 7-metric blend vs old 5-metric	2026-04-05 23:18:53 +04:00
Elie Habib	bfe5cf25ef	fix(resilience): adapt FAO/IPC fetcher to new HDX CSV schema (#2698) * fix(resilience): adapt FAO/IPC fetcher to new HDX CSV schema HDX changed column names: "Country (ISO3)" → "Country" (now contains ISO3 codes), "Phase 3+ #" → "Phase 3+ number current", etc. The old column names caused every row to fail iso2 resolution, resulting in fao: null for all 222 countries. Fix: fall back to new column names for both country code and phase number fields. Pass the value as both iso3 and name to resolveIso2 so it works regardless of format. * fix: restore missing year variable in FAO/IPC fetcher	2026-04-05 07:18:18 +04:00
Elie Habib	f4f772ab83	fix(resilience): proxy fallback for HDX/FSIN fetch on Railway (#2695 ) * fix(resilience): proxy fallback for fetchText when datacenter IP is blocked HDX (Humdata) returns 404 from Railway's datacenter IP, causing the FAO/FSIN dataset to fail every run. fetchText now tries direct first, then falls back to PROXY_URL via HTTP CONNECT tunnel (same pattern as fredFetchJson in _seed-utils.mjs). * refactor: consolidate proxy tunnel into shared httpsProxyFetchRaw Extract the HTTP CONNECT proxy tunnel logic into a single exported httpsProxyFetchRaw() in _seed-utils.mjs. Both httpsProxyFetchJson (FRED/proxy) and seed-resilience-static's fetchText proxy fallback now use the shared helper instead of maintaining duplicate ~65-line implementations. Removes dynamic node:https/tls/net/util/zlib imports from seed-resilience-static.mjs (now uses static imports via _seed-utils).	2026-04-05 00:24:56 +04:00
Elie Habib	02555671f2	refactor: consolidate country name/code mappings into single canonical sources (#2676) * refactor(country-maps): consolidate country name/ISO maps Expand shared/country-names.json from 265 to 309 entries by merging geojson names, COUNTRY_ALIAS_MAP, upstream API variants (World Bank, WHO, UN, FAO), and seed-correlation extras. Add ISO3 map generator (generate-iso3-maps.cjs) producing iso3-to-iso2.json (239 entries) and iso2-to-iso3.json (239 entries) with TWN and XKX supplements. Add build-country-names.cjs for reproducible expansion from all sources. Sync scripts/shared/ copies for edge-function test compatibility. * refactor: consolidate country name/code mappings into single canonical sources Eliminates fragmented country mapping across the repo. Every feature (resilience, conflict, correlation, intelligence) was maintaining its own partial alias map. Data consolidation: - Expand shared/country-names.json from 265 to 302 entries covering World Bank, WHO, UN, FAO, and correlation script naming variants - Generate shared/iso3-to-iso2.json (239 entries) and shared/iso2-to-iso3.json from countries.geojson + supplements (Taiwan TWN, Kosovo XKX) Consumer migrations: - _country-resolver.mjs: delete COUNTRY_ALIAS_MAP (37 entries), replace 2MB geojson parse with 5KB iso3-to-iso2.json - conflict/_shared.ts: replace 33-entry ISO2_TO_ISO3 literal - seed-conflict-intel.mjs: replace 20-entry ISO2_TO_ISO3 literal - _dimension-scorers.ts: replace geojson-based ISO3 construction - get-risk-scores.ts: replace 31-entry ISO3_TO_ISO2 literal - seed-correlation.mjs: replace 102-entry COUNTRY_NAME_TO_ISO2 and 90-entry ISO3_TO_ISO2, use resolveIso2() from canonical resolver, lower short-alias threshold to 2 chars with word boundary matching, export matchCountryNamesInText(), add isMain guard Tests: - New tests/country-resolver.test.mjs with structural validation, parity regression for all 37 old aliases, ISO3 bidirectional consistency, and Taiwan/Kosovo assertions - Updated resilience seed test for new resolver signature Net: -190 lines, 0 hardcoded country maps remaining * fix: normalize raw text before country name matching Text matchers (geo-extract, seed-security-advisories, seed-correlation) were matching normalized keys against raw text containing diacritics and punctuation. "Curaçao", "Timor-Leste", "Hong Kong S.A.R." all failed to resolve after country-names.json keys were normalized. Fix: apply NFKD + diacritic stripping + punctuation normalization to input text before matching, same transform used on the keys. Also add "hong kong" and "sao tome" as short-form keys for bigram headline matching in geo-extract. * fix: remove 'u s' alias that caused US/VI misattribution 'u s' in country-names.json matched before 'u s virgin islands' in geo-extract's bigram scanner, attributing Virgin Islands headlines to US. Removed since 'usa', 'united states', and the uppercase US expansion already cover the United States.	2026-04-04 15:38:02 +04:00
Lucas Passos	f36e337692	feat(resilience): add static country seeder (#2658) * feat(resilience): add static country seeder Root cause: the resilience work needed a canonical per-country snapshot with health visibility and failure-safe Redis behavior, but the repo had no annual seed for multi-source country attributes. Changes: - add scripts/seed-resilience-static.mjs with per-country keys, manifest/meta writes, partial dataset failure handling, and prior-snapshot preservation on total failure - register the manifest/meta in api/health.js and api/seed-health.js without expanding bootstrap scope - extend scripts/railway-set-watch-paths.mjs with a dedicated seed-resilience-static service config and cron support - add focused tests for parser/shape contracts and Railway config wiring Validation: - node --test tests/resilience-static-seed.test.mjs tests/railway-set-watch-paths.test.mjs tests/bootstrap.test.mjs tests/edge-functions.test.mjs - npm run typecheck:api (fails on upstream baseline: missing vitest in server/__tests__/entitlement-check.test.ts) - smoke checks for fetchWhoDataset/fetchEnergyDependencyDataset/fetchRsfDataset against live sources * refactor(resilience): extract country resolver, wire real data sources - Extract country resolver (COUNTRY_ALIAS_MAP, normalizeCountryToken, isIso2, isIso3, createCountryResolvers, resolveIso2) into reusable scripts/_country-resolver.mjs for sharing with scoring layer - Replace env-gated GPI/FSIN/AQUASTAT stubs with real endpoints: - GPI: Vision of Humanity CSV (dynamic year URL with fallback) - FSIN: HDX IPC wide-format CSV (stable download URL) - AQUASTAT: FAO BigQuery API CSV (water stress + dependency + per capita) - Remove dead code: fetchBinary, parseTabularPayload, pickField, fetchOptionalTabularRows (no longer needed with known CSV formats) - Harden RSF parser: reject if < 100 countries (was === 0) 993 → 829 lines in seed script + 113 lines in shared resolver * fix(resilience): add _country-resolver to watch paths, catch Eurostat parse errors - Add scripts/_country-resolver.mjs to Railway watch patterns so resolver changes trigger a redeploy - Wrap parseEurostatEnergyDataset in try-catch so a malformed 200 response falls through to World Bank fallback instead of aborting * fix(resilience): cap pagination loops, check pipeline results - World Bank: cap at 100 pages to prevent runaway from malformed totalPages response - WHO GHO: cap at 50 pages and throw if pagination link persists (prevents infinite loop from cyclic nextLink) - publishSuccess: inspect per-command pipeline results and throw on partial failures to prevent status:ok with missing country keys (which would lock out same-year retries via shouldSkipSeedYear) --------- Co-authored-by: Elie Habib <elie.habib@gmail.com>	2026-04-04 11:47:16 +04:00
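
Several of the commits above describe sub-metrics normalized between two goalposts (FX reserves: 1 month = 0, 12+ months = 100; broadband: normalizeHigherBetter(0, 40)) and then blended with fixed weights (trade sanctions: 0.45 / 0.15 / 0.15 / 0.25). The following is a minimal sketch of that arithmetic, not the repository's code. Only the name normalizeHigherBetter and the goalpost and weight numbers come from the log. The three-argument signature, the linear interpolation with clamping, the weightedBlend helper, and the rule that missing sub-metrics are skipped with the remaining weights renormalized are all assumptions made for illustration.

```javascript
// Hypothetical signature: the log only shows the goalposts, e.g. normalizeHigherBetter(0, 40).
// Assumed behaviour: linear interpolation between the goalposts, clamped to [0, 100].
function normalizeHigherBetter(value, worst, best) {
  if (value == null || Number.isNaN(value)) return null; // no data, no score
  const t = (value - worst) / (best - worst);
  return Math.min(100, Math.max(0, t * 100));
}

// Hypothetical helper. Assumption: sub-metrics with a null score are skipped and
// the remaining weights are renormalized; coverage is the sum of weights used.
function weightedBlend(parts) {
  let weightSum = 0;
  let total = 0;
  for (const { score, weight } of parts) {
    if (score == null) continue;
    total += score * weight;
    weightSum += weight;
  }
  if (weightSum === 0) return { score: null, coverage: 0 };
  return { score: total / weightSum, coverage: weightSum };
}

// Goalposts taken from the commit messages; the input values are made up.
const fxReservesScore = normalizeHigherBetter(6.5, 1, 12); // 6.5 months of imports
const broadbandScore = normalizeHigherBetter(10, 0, 40);   // 10 subscriptions per 100 people

// Trade-sanctions weights from the log; the four sub-scores are made up.
const tradeSanctions = weightedBlend([
  { score: 100, weight: 0.45 }, // sanctions
  { score: 80, weight: 0.15 },  // WTO restrictions
  { score: 60, weight: 0.15 },  // WTO barriers
  { score: 90, weight: 0.25 },  // WB mean applied tariff rate
]);

console.log(fxReservesScore, broadbandScore, tradeSanctions.score);
```

Under these assumptions a country with no value for one sub-metric keeps a score from the others while its coverage drops by that sub-metric's weight, which is one way to read the "weight silently drops" behaviour mentioned in the health-expenditure fix.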