17 Commits

Author SHA1 Message Date
Elie Habib
8089fd9d53 feat(resilience): publish resilience:static:fao aggregate from static seed (#3050)
* feat(resilience): publish resilience:static:fao aggregate from static seed

The weekly validation cron Outcome-Backtest reads resilience:static:fao for
the Food Crisis Escalation family, but nothing wrote that key. The reference
dangled, leaving Food Crisis stuck at AUC=0.5.

IPC Phase 3+ data is already fetched by fetchFsinDataset (HDX global IPC
CSV) and stored per-country. This PR reshapes the same in-memory map into
an aggregate view and writes it in the existing Redis pipeline — no extra
fetch, no new cron service.

Output shape matches what detectFoodCrisis already walks:
  { countries: { [iso2]: { ipcPhase, phase, peopleInCrisis, year, source } },
    count, fetchedAt, seedYear, source: 'hdx-ipc' }

Only Phase 3+ countries are included, matching IPC's own publish rule.
Absence = not-monitored-crisis, consistent with scoreFoodWater()'s
stable-absence semantics.
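
A minimal sketch of the reshaping step described above. The per-country input shape (`record.fao` carrying a numeric `ipcPhase`) and the `fetchedAt` parameter are assumptions for illustration; the real buildFaoAggregate may derive these fields differently.

```javascript
// Hypothetical sketch, not the shipped buildFaoAggregate.
// Assumes each per-country record looks like:
//   { fao: { ipcPhase, phase, peopleInCrisis, year, source } }
function buildFaoAggregate(perCountry, seedYear, fetchedAt = Date.now()) {
  const countries = {};
  for (const [iso2, record] of Object.entries(perCountry)) {
    const fao = record && record.fao;
    // Only Phase 3+ entries are published; absence means "not in monitored crisis".
    if (!fao || !(fao.ipcPhase >= 3)) continue;
    countries[iso2] = {
      ipcPhase: fao.ipcPhase,
      phase: fao.phase,
      peopleInCrisis: fao.peopleInCrisis,
      year: fao.year,
      source: fao.source,
    };
  }
  return {
    countries,
    count: Object.keys(countries).length,
    fetchedAt,
    seedYear,
    source: 'hdx-ipc',
  };
}
```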

Tests: 5 unit tests for buildFaoAggregate (incl. contract test against
detectFoodCrisis) + 1 health.js registration test. No cron/Railway
changes needed — seed-bundle-static-ref picks it up on its next October
window; restart to backfill sooner.

FX Stress / Power Outages / Refugees / Conflict also fail today but for
different reasons (detector shape mismatches) — out of scope here.

* fix(resilience): wire resilienceStaticFao into SEED_META to unmask empty-state

Reviewer catch on #3050: adding resilienceStaticFao to STANDALONE_KEYS
and EMPTY_DATA_OK_KEYS without a matching SEED_META entry leaves
seedStale=null in the standalone-key health branch, so an empty or
missing resilience:static:fao key resolves to plain OK instead of
STALE_SEED — silently masking the exact bug this PR is meant to
surface.

Adds SEED_META.resilienceStaticFao pointing at seed-meta:resilience:static
(same heartbeat as resilienceStaticIndex, since the aggregate is written
in the same Redis pipeline by the same seeder). Now: missing data with
stale heartbeat -> STALE_SEED (warn); with fresh heartbeat and no
countries in Phase 3+ -> OK (still valid per EMPTY_DATA_OK_KEYS).
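
The masking behaviour can be pictured with a small sketch of the empty-key branch. The function name, its parameters, and the 'EMPTY' label are placeholders, not the actual api/health.js code; only the STALE_SEED and OK outcomes come from the description above.

```javascript
// Hypothetical sketch of how an empty or missing standalone key might resolve.
// seedStale: true (heartbeat stale), false (fresh), or null (no SEED_META entry).
function resolveEmptyKeyStatus({ emptyDataOk, seedStale }) {
  if (seedStale === true) return 'STALE_SEED'; // missing data + stale heartbeat
  if (emptyDataOk) return 'OK';                // empty is a valid state for this key
  return 'EMPTY';                              // placeholder label for "unexpectedly empty"
}
```

With no SEED_META entry, `seedStale` stays null and the key falls through to OK, which is the blind spot this fix closes.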

Same trap documented in feedback_empty_data_ok_keys_bootstrap_blind_spot.md
but in the STANDALONE_KEYS path, not BOOTSTRAP_KEYS.

Test locks it in with a source-string regex assertion.
2026-04-13 13:00:58 +04:00
Elie Habib
968244e2ec fix(seed): revert WHO $top to 1000 + add $skip pagination (#2827)
WHO GHO API rejects $top > 1000 with HTTP 400, breaking all WHO
indicators in production. Revert to $top=1000 and paginate via $skip.
UHC indicator has 4680 rows requiring 5 pages. Adds per-indicator
row count logging.
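
A rough sketch of the $skip loop, written synchronously so it can be read and tested in isolation. `getPage` is a hypothetical stand-in for the HTTP call (a real seeder would await a request built with `$top` and `$skip`), and the 50-page cap is an assumed safety limit.

```javascript
const WHO_PAGE_SIZE = 1000; // per the commit, larger $top values are rejected with HTTP 400

// Hypothetical sketch: keep requesting pages until one comes back short.
function fetchAllRows(getPage, pageSize = WHO_PAGE_SIZE, maxPages = 50) {
  const rows = [];
  for (let page = 0; page < maxPages; page++) {
    const batch = getPage({ top: pageSize, skip: page * pageSize });
    rows.push(...batch);
    // A short (or empty) page means the indicator is exhausted.
    if (batch.length < pageSize) return { rows, pages: page + 1 };
  }
  throw new Error(`pagination exceeded ${maxPages} pages`);
}
```

For a 4680-row indicator this makes five requests: four full pages and a final page of 680 rows.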
2026-04-08 17:24:50 +04:00
Elie Habib
83cecb5aef feat(resilience): add WB mean applied tariff rate to tradeSanctions (#2811)
* feat(resilience): add WB mean applied tariff rate to tradeSanctions

World Bank TM.TAX.MRCH.WM.AR.ZS covers 180+ countries, supplementing
the WTO top-50 metrics that only cover major reporters. Reduces
reporter-set bias by providing a global trade openness signal.

Reweights: sanctions 0.45, WTO restrictions 0.15, WTO barriers 0.15,
WB tariff rate 0.25.
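
As an illustration of the new weights, here is a sketch of a weighted blend that renormalizes over whichever metrics are present. The metric key names and the renormalization rule are assumptions; the actual scoreTradeSanctions may treat missing inputs differently.

```javascript
// Weights from the commit message; key names are hypothetical.
const TRADE_WEIGHTS = {
  sanctions: 0.45,
  wtoRestrictions: 0.15,
  wtoBarriers: 0.15,
  tariffRate: 0.25,
};

// Blend 0-100 sub-scores, skipping metrics that are null or undefined.
function blendScores(scores, weights = TRADE_WEIGHTS) {
  let weighted = 0;
  let available = 0;
  for (const [key, weight] of Object.entries(weights)) {
    const score = scores[key];
    if (score == null) continue;
    weighted += score * weight;
    available += weight;
  }
  if (available === 0) return { score: null, coverage: 0 };
  return { score: weighted / available, coverage: available };
}
```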

* fix: update pinned test assertions for WB tariff rate reweighting

Adjusts scoreTradeSanctions test assertions for the new 4-metric blend
(sanctions 0.45, restrictions 0.15, barriers 0.15, tariff 0.25) and
bumps TOTAL_DATASET_SLOTS from 9 to 10 in payload assembly tests.

* fix(seed): bump static source version to v5 + sync indicator registry for trade

Version bump ensures appliedTariffRate backfills to existing 2026
snapshots. Registry updated from 3-metric to 4-metric trade-sanctions
weights.

* fix(resilience): correct appliedTariffRate sourceKey to resilience:static:{ISO2}

* fix(resilience): bump score cache to v4 + add tariff rate to release-gate fixtures

Score/ranking cache keys bumped to v4 to invalidate stale pre-tariff
cached responses. Release-gate fixtures now include appliedTariffRate
so the gate exercises the full 4-metric trade-sanctions path.

* fix(test): update pinned scorer assertions after rebase onto main

With all Phase 2+3 PRs merged (FX reserves, broadband, WHO metrics,
zero-event guards), the combined fixture data produces economic=66.33,
infrastructure=79, overallScore=68.72.
2026-04-08 10:56:12 +04:00
Elie Habib
6aa822e9f9 feat(resilience): FX reserves adequacy in currencyExternal (#2812)
* feat(resilience): add FX reserves adequacy to currencyExternal dimension

World Bank FI.RES.TOTL.MO (total reserves in months of imports) covers
~160 countries, filling the BIS EER coverage gap (~40 economies).

For BIS countries: reserves supplement volatility + deviation (weight 0.15).
For non-BIS countries: reserves combine with IMF inflation proxy (0.4/0.6
blend) for much better currency stability coverage than inflation alone.

Normalization: 1 month (near crisis) = 0, 12+ months (very safe) = 100.
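
The goalposts above can be read as a clamped linear rescale. This is a sketch assuming normalizeHigherBetter is linear between its two goalposts; the real helper may differ in rounding or null handling.

```javascript
// Assumed behaviour: linear map from [worst, best] to [0, 100], clamped at both ends.
function normalizeHigherBetter(value, worst, best) {
  if (value == null || Number.isNaN(value)) return null;
  const fraction = (value - worst) / (best - worst);
  return Math.min(100, Math.max(0, fraction * 100));
}

// Reserves in months of imports: 1 month maps to 0, 12 or more months to 100.
const scoreReserves = (months) => normalizeHigherBetter(months, 1, 12);
```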

* fix(seed): bump static source version to v4 for fxReservesMonths backfill

Without version bump, existing 2026 snapshots won't be republished and
fxReservesMonths field will never backfill until next annual cycle.

* fix(resilience): bump score cache to v3 for FX reserves scorer change

scoreCurrencyExternal now includes FX reserves adequacy, changing scores
for all countries. Bump cache key to invalidate stale pre-reserves
cached responses on deploy.

* fix(seed): retry static seed when previous run had failed datasets

shouldSkipSeedYear() now returns false when seed-meta records non-empty
failedDatasets, allowing backfill of datasets that failed on the first
run (e.g., fxReservesMonths upstream outage during v4 rollout).
Previously, partial success with status:'ok' caused all future same-year
runs to skip permanently.
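
The retry rule can be sketched as a guard over the seed-meta record. The signature and the meta field names (`seedYear`, `sourceVersion`) are assumptions for illustration; only `status` and `failedDatasets` are named in the commit.

```javascript
// Hypothetical sketch of the skip decision, not the shipped shouldSkipSeedYear.
function shouldSkipSeedYear(meta, seedYear, sourceVersion) {
  if (!meta || meta.status !== 'ok') return false;        // nothing usable yet
  if (meta.seedYear !== seedYear) return false;           // new annual cycle
  if (meta.sourceVersion !== sourceVersion) return false; // version bump forces a re-seed
  // Partial success: retry so datasets that failed earlier can backfill.
  if (Array.isArray(meta.failedDatasets) && meta.failedDatasets.length > 0) return false;
  return true;
}
```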
2026-04-08 10:45:25 +04:00
Elie Habib
bea26b3175 feat(resilience): add WB broadband penetration to infrastructure dimension (#2813)
* feat(resilience): add WB broadband penetration to infrastructure dimension

World Bank IT.NET.BBND.P2 (fixed broadband subscriptions per 100
people) added as new sub-metric in scoreInfrastructure.
normalizeHigherBetter(0, 40). Reweights: electricity 0.30, roads 0.30,
outages 0.25, broadband 0.15.

* fix(resilience): add explicit outagesRaw null guard in scoreInfrastructure

Matches the established pattern in scoreCyberDigital where both source
presence and penalty > 0 are checked before scoring.

* test(resilience): pin expected broadband numeric contribution in infrastructure scorer

Strengthens the broadband test from directional-only to pinned numeric
assertion, catching regressions in normalization goalposts or weight
changes.
2026-04-08 10:32:33 +04:00
Elie Habib
78c8381547 feat(resilience): add WHO physician density + health expenditure to healthPublicService (#2808)
* feat(resilience): add WHO physician density + health expenditure to healthPublicService

Two new sub-metrics from WHO GHO OData:
- Physician density per 1k (HWF_0001): normalizeHigherBetter(0, 5)
- Health expenditure per capita PPP (GHED_CHE_pc_PPP_SHA2011): normalizeHigherBetter(50, 5000)

Reweights existing metrics: UHC 0.35, measles 0.25, beds 0.10,
physicians 0.15, expenditure 0.15.

Bumps static source version to v3 for backfill.

* fix(resilience): replace dead WHO health expenditure indicator with working alternative

GHED_CHE_pc_PPP_SHA2011 returns empty dataset from WHO GHO API, causing
the 15% healthExp weight to silently drop from production scoring.

Replaced with GHED_CHE_pc_US_SHA2011 (per capita current USD), which has
4849 records across all countries. Renamed field healthExpPerCapitaPpp to
healthExpPerCapitaUsd and adjusted normalization goalposts from (50, 5000)
to (20, 3000) to reflect current-USD scale. Bumped source version to v4.

* fix(seed): increase WHO $top to 10000 to prevent pagination truncation + add transform test

WHO GHO API returns exactly 1000 rows with no @odata.nextLink for
physician density and health expenditure indicators, silently truncating
country coverage. Increasing $top to 10000 fetches all rows in one page
(typical WHO indicators have 2000-5000 rows).

Also adds seed-level test for the HWF_0001 per-10k to per-1k division
transform.
2026-04-08 10:21:59 +04:00
Elie Habib
023e2e60e7 feat(resilience): add tradeToGdpPct to resilience static bundle (#2794)
* feat(seed): add tradeToGdpPct to resilience static bundle from WB NE.TRD.GNFS.ZS

Trade as % of GDP is needed for Phase 2 exposure-weighting of shipping
stress. Small open economies (Singapore ~300%, Belgium ~170%) will feel
shipping disruption more than large autarkies (US ~25%).

* fix(seed): bump static source version to v2 for tradeToGdp backfill

The static seeder skips re-runs within the same seed year if a snapshot
exists. Without a version bump, the new tradeToGdpPct field would never
backfill for countries already seeded in 2026.

Also added sourceVersion check to shouldSkipSeedYear() so future version
bumps automatically force a re-seed without needing to clear Redis.
2026-04-07 22:26:09 +04:00
Elie Habib
6bff5c6e40 fix(resilience): robust GPI 404 fallback — structured error status, no proxy on 404, tested (#2759)
2026-04-06 09:07:45 +04:00
Elie Habib
d170cbd56e fix(resilience): GPI 404 — skip proxy retry, log info instead of err (#2758)
2026-04-06 09:00:40 +04:00
Elie Habib
1f45bc326c fix(resilience): replace broken FAO Aquastat BigQuery URL with World Bank water stress API; fix scorer/seeder shape mismatch (#2755)
2026-04-06 08:34:39 +04:00
Elie Habib
0b3f04531a fix(resilience): export+test GPI/FSIN/Aquastat CSV parsers; fix FSIN field mismatch (#2751)
* fix(resilience): export+test GPI/FSIN/Aquastat CSV parsers; fix FSIN field mismatch

Three CSV parsers in seed-resilience-static.mjs were private and untested.
The GPI year-fallback logic (currentYear 404 in April, falls back to
currentYear-1) was invisible to CI. Exports added:

  gpiUrlForYear(yr)       -- URL builder, makes the year-fallback testable
  parseGpiRows(csv, year) -- GPI CSV parser extracted from fetchGpiDataset
  parseFsinRows(csv)      -- FSIN/IPC parser extracted from fetchFsinDataset
  parseAquastatRows(csv)  -- Aquastat parser extracted from fetchAquastatDataset

Bug fixed: parseFsinRows was writing { phase3plus, phase4, phase5 } but
scoreFoodWater() reads staticRecord.fao.peopleInCrisis and .phase, fields
that the seeder never wrote. In production the crisis sub-metric was always
null. Fixed by mapping phase3plus to peopleInCrisis and deriving .phase from
the highest active IPC phase level (Phase 5 present => IPC Phase 5 etc).

Also fixed the skip guard: safeNum('') returns 0 (not null), so the old
== null check let zero-phase rows through. Changed to falsy (!phase3plus)
which correctly skips both null and zero.
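
A sketch of why the falsy guard matters, assuming safeNum is a thin wrapper over `Number()` (which turns an empty string into 0). The row field names and `toFaoRecord` are hypothetical; the output fields follow what scoreFoodWater() is said to read.

```javascript
// Assumed implementation: Number('') is 0, so empty cells come back as 0, not null.
function safeNum(value) {
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

// Hypothetical row mapper: skip rows with no Phase 3+ population,
// then label the record with the highest IPC phase that has people in it.
function toFaoRecord(row) {
  const phase3plus = safeNum(row.phase3plus);
  if (!phase3plus) return null; // falsy check covers both null and 0
  const level = safeNum(row.phase5) ? 5 : safeNum(row.phase4) ? 4 : 3;
  return { peopleInCrisis: phase3plus, phase: `IPC Phase ${level}` };
}
```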

Tests: 12 new cases covering column-name schema variations (old/new HDX
schema), GPI min-country guard, Aquastat latest-year preference,
Variable_Id fallback column, and the FSIN zero-phase skip behavior.

* test(resilience): update recovery fixtures to new fao schema; assert scorer-compatible fields
2026-04-06 07:43:19 +04:00
Elie Habib
7db6ec60c4 fix(resilience-static): preserve existing Redis fao data when FSIN fetch fails (#2748)
* fix(resilience-static): preserve existing Redis fao data when FSIN fetch fails

* fix(resilience-static): block publish when Redis recovery reads fail; add recoverFailedDatasets tests
2026-04-05 23:53:10 +04:00
Elie Habib
02d5272469 feat(resilience): phase 1 scoring corrections — electricity consumption, certainty imputation, coverage weighting (#2743)
Root causes diagnosed from production screenshots:
- Lebanon energy=89: Eurostat is EU-only so IEA dependency=null; OWID showed
  low fossil use during crisis which appeared "clean", inflating the score.
- USA "Low confidence": crisis monitoring databases (IPC, UNHCR, UCDP) only
  track countries IN crisis; absence was treated as missing data instead of
  positive signal.

Fixes:
- Add EG.USE.ELEC.KH.PC (electricity consumption kWh/cap) to scoreEnergy.
  Very low per-capita consumption captures grid collapse regardless of
  Eurostat coverage. Weight 0.30 (dominant when IEA dependency is null).
  Lebanon ~1200 kWh/cap now scores ~13 on this sub-metric vs USA ~12000 = 100.
- Certainty imputation for tradeSanctions: when global sanctions list loaded
  but country absent, score = 100 (0 pressure). Prevents stable economies
  losing 55% coverage weight for being unsanctioned.
- Certainty imputation for foodWater: when country has governance data (WGI
  present) but no FSIN/IPC entry, score = 87. "Not in food crisis database"
  is a positive signal, not a data gap.
- Coverage-weighted domain aggregation in buildDomainList: dimensions with
  zero coverage no longer drag the domain average down. Low-data dimensions
  contribute proportionally to their coverage weight.
- Add EG.USE.ELEC.KH.PC to seed-resilience-static.mjs infrastructure pull.
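
The coverage-weighting fix can be illustrated with a short sketch. The `{ score, coverage }` dimension shape and the function name are assumptions; buildDomainList itself may carry more fields.

```javascript
// Hypothetical sketch: average dimension scores weighted by their coverage,
// so a dimension with zero coverage contributes nothing instead of a zero.
function aggregateDomain(dimensions) {
  let weighted = 0;
  let totalCoverage = 0;
  for (const { score, coverage } of dimensions) {
    if (score == null || !(coverage > 0)) continue;
    weighted += score * coverage;
    totalCoverage += coverage;
  }
  return totalCoverage > 0 ? weighted / totalCoverage : null;
}
```

Under a plain mean, a domain with one fully covered dimension at 80 and one uncovered dimension at 0 would average 40; with coverage weighting it stays at 80.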

Tests:
- Lebanon energy < 50 with null IEA + 1200 kWh/cap (verifies the fix)
- Sanctions certainty imputation: FI absent from list scores 100, coverage 1
- LB (fragile) < ZA (stressed) in release gate
- US.lowConfidence === false in release gate
- Update scorers snapshot: energy 63->78, overall 66.70->68.95
- Update dimension scorer coverage assertion: 7-metric blend vs old 5-metric
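
The sanctions imputation case in the tests can be sketched as follows. The Map-based lookup and the pressure-to-score mapping are placeholders; only the "list loaded, country absent, score 100 with coverage 1" rule comes from the commit.

```javascript
// Hypothetical sketch of certainty imputation for the sanctions sub-metric.
// sanctionsByIso2: Map of ISO2 -> pressure (0-100), or null if the list failed to load.
function scoreSanctions(sanctionsByIso2, iso2) {
  if (sanctionsByIso2 == null) {
    return { score: null, coverage: 0, imputed: false }; // genuine data gap
  }
  if (!sanctionsByIso2.has(iso2)) {
    // List loaded but country absent: treat as zero pressure, not missing data.
    return { score: 100, coverage: 1, imputed: true };
  }
  const pressure = sanctionsByIso2.get(iso2);
  return { score: Math.max(0, 100 - pressure), coverage: 1, imputed: false };
}
```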
2026-04-05 23:18:53 +04:00
Elie Habib
bfe5cf25ef fix(resilience): adapt FAO/IPC fetcher to new HDX CSV schema (#2698)
* fix(resilience): adapt FAO/IPC fetcher to new HDX CSV schema

HDX changed column names: "Country (ISO3)" → "Country" (now contains
ISO3 codes), "Phase 3+ #" → "Phase 3+ number current", etc. The old
column names caused every row to fail iso2 resolution, resulting in
fao: null for all 222 countries.

Fix: fall back to new column names for both country code and phase
number fields. Pass the value as both iso3 and name to resolveIso2
so it works regardless of format.
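
A small sketch of the column fallback, using the old and new header names quoted above. The helper name and the row-as-object representation are assumptions about how the parser holds a CSV row.

```javascript
// Old header first, then the renamed HDX header.
const COUNTRY_COLUMNS = ['Country (ISO3)', 'Country'];
const PHASE3_COLUMNS = ['Phase 3+ #', 'Phase 3+ number current'];

// Hypothetical helper: return the first non-empty value among candidate columns.
function pickColumn(row, candidates) {
  for (const name of candidates) {
    const value = row[name];
    if (value !== undefined && value !== '') return value;
  }
  return undefined;
}
```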

* fix: restore missing year variable in FAO/IPC fetcher
2026-04-05 07:18:18 +04:00
Elie Habib
f4f772ab83 fix(resilience): proxy fallback for HDX/FSIN fetch on Railway (#2695)
* fix(resilience): proxy fallback for fetchText when datacenter IP is blocked

HDX (Humdata) returns 404 from Railway's datacenter IP, causing the
FAO/FSIN dataset to fail every run. fetchText now tries direct first,
then falls back to PROXY_URL via HTTP CONNECT tunnel (same pattern
as fredFetchJson in _seed-utils.mjs).

* refactor: consolidate proxy tunnel into shared httpsProxyFetchRaw

Extract the HTTP CONNECT proxy tunnel logic into a single exported
httpsProxyFetchRaw() in _seed-utils.mjs. Both httpsProxyFetchJson
(FRED/proxy) and seed-resilience-static's fetchText proxy fallback
now use the shared helper instead of maintaining duplicate ~65-line
implementations.

Removes dynamic node:https/tls/net/util/zlib imports from
seed-resilience-static.mjs (now uses static imports via _seed-utils).
2026-04-05 00:24:56 +04:00
Elie Habib
02555671f2 refactor: consolidate country name/code mappings into single canonical sources (#2676)
* refactor(country-maps): consolidate country name/ISO maps

Expand shared/country-names.json from 265 to 309 entries by merging
geojson names, COUNTRY_ALIAS_MAP, upstream API variants (World Bank,
WHO, UN, FAO), and seed-correlation extras.

Add ISO3 map generator (generate-iso3-maps.cjs) producing
iso3-to-iso2.json (239 entries) and iso2-to-iso3.json (239 entries)
with TWN and XKX supplements.

Add build-country-names.cjs for reproducible expansion from all sources.
Sync scripts/shared/ copies for edge-function test compatibility.

* refactor: consolidate country name/code mappings into single canonical sources

Eliminates fragmented country mapping across the repo. Every feature
(resilience, conflict, correlation, intelligence) was maintaining its
own partial alias map.

Data consolidation:
- Expand shared/country-names.json from 265 to 302 entries covering
  World Bank, WHO, UN, FAO, and correlation script naming variants
- Generate shared/iso3-to-iso2.json (239 entries) and
  shared/iso2-to-iso3.json from countries.geojson + supplements
  (Taiwan TWN, Kosovo XKX)

Consumer migrations:
- _country-resolver.mjs: delete COUNTRY_ALIAS_MAP (37 entries),
  replace 2MB geojson parse with 5KB iso3-to-iso2.json
- conflict/_shared.ts: replace 33-entry ISO2_TO_ISO3 literal
- seed-conflict-intel.mjs: replace 20-entry ISO2_TO_ISO3 literal
- _dimension-scorers.ts: replace geojson-based ISO3 construction
- get-risk-scores.ts: replace 31-entry ISO3_TO_ISO2 literal
- seed-correlation.mjs: replace 102-entry COUNTRY_NAME_TO_ISO2
  and 90-entry ISO3_TO_ISO2, use resolveIso2() from canonical
  resolver, lower short-alias threshold to 2 chars with word
  boundary matching, export matchCountryNamesInText(), add isMain
  guard

Tests:
- New tests/country-resolver.test.mjs with structural validation,
  parity regression for all 37 old aliases, ISO3 bidirectional
  consistency, and Taiwan/Kosovo assertions
- Updated resilience seed test for new resolver signature

Net: -190 lines, 0 hardcoded country maps remaining

* fix: normalize raw text before country name matching

Text matchers (geo-extract, seed-security-advisories, seed-correlation)
were matching normalized keys against raw text containing diacritics
and punctuation. "Curaçao", "Timor-Leste", "Hong Kong S.A.R." all
failed to resolve after country-names.json keys were normalized.

Fix: apply NFKD + diacritic stripping + punctuation normalization to
input text before matching, same transform used on the keys.
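
One way to picture that transform is the sketch below. The exact rules (lowercasing, collapsing every non-alphanumeric run to a single space) are assumptions about how the keys were normalized, not a copy of the repo's helper.

```javascript
// Hypothetical sketch: apply the same normalization to input text as to the keys.
function normalizeForMatch(text) {
  return text
    .normalize('NFKD')                 // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, '')   // drop the combining diacritics
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ' ')       // punctuation and whitespace runs become one space
    .trim();
}
```

Under these assumed rules, "Curaçao" becomes "curacao", "Timor-Leste" becomes "timor leste", and "Hong Kong S.A.R." becomes "hong kong s a r".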

Also add "hong kong" and "sao tome" as short-form keys for bigram
headline matching in geo-extract.

* fix: remove 'u s' alias that caused US/VI misattribution

'u s' in country-names.json matched before 'u s virgin islands' in
geo-extract's bigram scanner, attributing Virgin Islands headlines
to US. Removed since 'usa', 'united states', and the uppercase US
expansion already cover the United States.
2026-04-04 15:38:02 +04:00
Lucas Passos
f36e337692 feat(resilience): add static country seeder (#2658)
* feat(resilience): add static country seeder

Root cause: the resilience work needed a canonical per-country snapshot with health visibility and failure-safe Redis behavior, but the repo had no annual seed for multi-source country attributes.

Changes:
- add scripts/seed-resilience-static.mjs with per-country keys, manifest/meta writes, partial dataset failure handling, and prior-snapshot preservation on total failure
- register the manifest/meta in api/health.js and api/seed-health.js without expanding bootstrap scope
- extend scripts/railway-set-watch-paths.mjs with a dedicated seed-resilience-static service config and cron support
- add focused tests for parser/shape contracts and Railway config wiring

Validation:
- node --test tests/resilience-static-seed.test.mjs tests/railway-set-watch-paths.test.mjs tests/bootstrap.test.mjs tests/edge-functions.test.mjs
- npm run typecheck:api (fails on upstream baseline: missing vitest in server/__tests__/entitlement-check.test.ts)
- smoke checks for fetchWhoDataset/fetchEnergyDependencyDataset/fetchRsfDataset against live sources

* refactor(resilience): extract country resolver, wire real data sources

- Extract country resolver (COUNTRY_ALIAS_MAP, normalizeCountryToken,
  isIso2, isIso3, createCountryResolvers, resolveIso2) into reusable
  scripts/_country-resolver.mjs for sharing with scoring layer

- Replace env-gated GPI/FSIN/AQUASTAT stubs with real endpoints:
  - GPI: Vision of Humanity CSV (dynamic year URL with fallback)
  - FSIN: HDX IPC wide-format CSV (stable download URL)
  - AQUASTAT: FAO BigQuery API CSV (water stress + dependency + per capita)

- Remove dead code: fetchBinary, parseTabularPayload, pickField,
  fetchOptionalTabularRows (no longer needed with known CSV formats)

- Harden RSF parser: reject if < 100 countries (was === 0)

993 → 829 lines in seed script + 113 lines in shared resolver

* fix(resilience): add _country-resolver to watch paths, catch Eurostat parse errors

- Add scripts/_country-resolver.mjs to Railway watch patterns so
  resolver changes trigger a redeploy
- Wrap parseEurostatEnergyDataset in try-catch so a malformed 200
  response falls through to World Bank fallback instead of aborting

* fix(resilience): cap pagination loops, check pipeline results

- World Bank: cap at 100 pages to prevent runaway from malformed
  totalPages response
- WHO GHO: cap at 50 pages and throw if pagination link persists
  (prevents infinite loop from cyclic nextLink)
- publishSuccess: inspect per-command pipeline results and throw on
  partial failures to prevent status:ok with missing country keys
  (which would lock out same-year retries via shouldSkipSeedYear)
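
The pipeline check can be sketched like this, assuming the Redis REST client returns one entry per command shaped as `{ result }` on success or `{ error }` on failure. The function name and that response shape are assumptions for illustration.

```javascript
// Hypothetical sketch: refuse to report success unless every pipeline command succeeded.
function assertPipelineOk(results, expectedCount) {
  if (!Array.isArray(results) || results.length !== expectedCount) {
    throw new Error(`expected ${expectedCount} pipeline results`);
  }
  const failed = results.filter((entry) => !entry || entry.error != null);
  if (failed.length > 0) {
    throw new Error(`${failed.length} of ${expectedCount} pipeline commands failed`);
  }
}
```

Throwing here keeps the seeder from writing status:ok over a partially written snapshot.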

---------

Co-authored-by: Elie Habib <elie.habib@gmail.com>
2026-04-04 11:47:16 +04:00