* docs(resilience): PR 5.1: sanctions construct audit (designated-party domicile question)

PR 5.1 of cohort-audit plan 2026-04-24-002. Stacked on PR 5.3 (#3374) so the known-limitations.md section append is additive.

Read-only static audit of scoreTradeSanctions + the sanctions:country-counts:v1 seed, framed around the Codex-reformulated construct question: should designated-party domicile count penalize resilience?

Findings

1. The count is "OFAC-designated-party domicile locations," NOT "sanctions against this country." Seeder (`scripts/seed-sanctions-pressure.mjs:85-93`) parses OFAC Advanced XML SDN + Consolidated, extracts each designated party's Locations, and increments `map[countryCode]` by 1 for every location country on that party.
2. The count conflates three semantically distinct categories a resilience construct might treat differently:
   (a) Country-level sanction target (NK SDN listings): correct penalty
   (b) Domiciled sanctioned entity (RU bank in Moscow, post-2022): debatable, country hosts the actor
   (c) Transit / shell entity (UAE trading co listed under SDGT for Iran evasion; CY SPV for a Russian oligarch): country is NOT the target, but takes the penalty
3. Observed GCC cohort impact: AE scores 54 vs KW/QA 82. The −28 gap is almost entirely driven by category (c) listings; AE is a financial hub where sanctioned parties incorporate shells.
4. Three options documented for the construct decision (NOT decided in this PR):
   - Option 1: Keep flat count (status quo, defensible via secondary-sanctions / FATF argument)
   - Option 2: Program-weighted count: weight DPRK/IRAN/SYRIA/etc. at 1.0, SDGT/SDNTK/CYBER/etc. at 0.3-0.5. Recommended; seeder already captures `programs` per entry, so the data is there and the scorer just doesn't read it.
   - Option 3: Transit-hub exclusion list (AE, SG, HK, CY, VG, KY): brittle + normative, not recommended
5. Recommendation documented: Option 2. Implementation deferred to a separate methodology-decision PR (outside auto-mode authority).

Shipped

- `docs/methodology/known-limitations.md`: new section extending the file: "tradeSanctions — designated-party domicile construct question." Covers what the count represents, the three categories with examples, observed GCC impact, three options w/ trade-offs, recommendation, follow-up audit list (entity-sample gated on API-key access), and file references.
- `tests/resilience-sanctions-field-mapping.test.mts` (new): 10 regression-guard tests pinning CURRENT behavior:
  1-6. normalizeSanctionCount piecewise anchors: count=0→100, 1→90, 10→75, 50→50, 200→25, 500→≤1
  7. Monotonicity: strictly decreasing across the ramp
  8. Country absent from map defaults to count=0 → score 100 (intentional "no designated parties here" semantics)
  9. Seed outage (raw=null) → null score slot, NOT imputed (protects against silent data-outage scoring)
  10. Construct anchor: count=1 is exactly 10 points below count=0 (pins the "first listing drops 10" design choice)

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts`: 10 pass / 0 fail
- `npm run test:data`: 6721 pass / 0 fail
- `npm run typecheck` / `typecheck:api`: green
- `npm run lint` / `lint:md`: clean

* fix(resilience): PR 5.1 review: tighten count=500 assertion; clarify weightedBlend weights

Addresses 2 P2 Greptile findings on #3375:

1. Tighten count=500 assertion. Was `<= 1` with a comment stating the exact value is 0. That loose bound silently tolerates roundScore / boundary drift that would be the very signal this regression guard exists to catch. Changed to strict equality `=== 0`.
2. Clarify the "zero weight" comment on the sanctions-only harness. The other slots DO contribute their declared weights (0.15 + 0.15 + 0.25 = 0.55) to weightedBlend's `totalWeight` denominator; only `availableWeight` (the score-computation denominator) drops to 0.45 because their score is null. The previous comment elided this distinction and could mislead a reader into thinking the null slots contributed nothing at all. Expanded to state exactly how `coverage` and `score` each behave.

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts`: 10 pass / 0 fail (count=500 now pins the exact 0 floor)
Known limitations — resilience scorer
Documented construct limitations, data-source edge cases, and modeling-choice notes that aren't bugs but reviewers should know before interpreting individual countries' scores.
Each entry names: the dimension(s) affected, the root cause, the observable signature, and either the fix path or the reason it is NOT being fixed.
Displacement field-mapping (scoreSocialCohesion / scoreBorderSecurity / scoreStateContinuity)
Dimensions. socialCohesion (weight 0.25 of the blend),
borderSecurity (weight 0.35 of the blend), stateContinuity
(weight 0.20 of the blend).
Source. UNHCR Population API
(https://api.unhcr.org/population/v1/population/), written via
scripts/seed-displacement-summary.mjs into the Redis key
displacement:summary:v1:<year>.
What UNHCR covers, and what it does not. The UNHCR Population registry tracks four displacement categories:
- refugees: people forced to flee and recognized under the 1951 Convention / UNHCR mandate
- asylum_seekers: people whose claim is not yet determined
- idps: internally displaced persons (inside their own country)
- stateless: people without recognized nationality
It does NOT include:
- Labor migrants (covered by UN DESA International Migrant Stock / IOM's World Migration Report — a separate dataset)
- Student / tourist flows
- Naturalised citizens or long-settled foreign-born populations
Field mapping audit (static, code-side — no live-data access used for this audit):
| Scorer field read | Seeder source | Seeder formula | Semantics |
|---|---|---|---|
| displacement.totalDisplaced | UNHCR refugees + asylum_seekers + idps + stateless summed on the origin side (coo_iso) | Line 140 of seed-displacement-summary.mjs | How many people from THIS country are currently displaced (origin outflow + internal) |
| displacement.hostTotal | UNHCR refugees + asylum_seekers summed on the asylum side (coa_iso) | Lines 148-150 of seed-displacement-summary.mjs | How many UNHCR-registered people THIS country is currently hosting |
| displacement.refugees / asylumSeekers / idps / stateless | Direct per-category copy from UNHCR rows (origin side) | Lines 136-139 | As UNHCR reports them |
| displacement.hostRefugees / hostAsylumSeekers | Direct per-category copy (asylum side) | Lines 148-149 | As UNHCR reports them |
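The two aggregate fields in the table above can be sketched as follows. This is a minimal illustration, not the seeder's code: the `UnhcrRow` shape, the `summarize` name, and the sample numbers are all hypothetical, and only the origin-side / asylum-side split is taken from the table.

```typescript
// Hypothetical row shape: field names follow the table above, but the real
// UNHCR payload and seeder types are not shown in this document.
interface UnhcrRow {
  coo_iso: string; // country of origin
  coa_iso: string; // country of asylum
  refugees: number;
  asylum_seekers: number;
  idps: number;
  stateless: number;
}

interface DisplacementSummary {
  totalDisplaced: number; // origin side: refugees + asylum_seekers + idps + stateless
  hostTotal: number; // asylum side: refugees + asylum_seekers
}

function summarize(rows: UnhcrRow[]): Map<string, DisplacementSummary> {
  const out = new Map<string, DisplacementSummary>();
  const entry = (iso: string): DisplacementSummary => {
    let e = out.get(iso);
    if (!e) {
      // Both fields start at 0, so hostTotal is always a number in the output.
      e = { totalDisplaced: 0, hostTotal: 0 };
      out.set(iso, e);
    }
    return e;
  };
  for (const r of rows) {
    entry(r.coo_iso).totalDisplaced += r.refugees + r.asylum_seekers + r.idps + r.stateless;
    entry(r.coa_iso).hostTotal += r.refugees + r.asylum_seekers;
  }
  return out;
}

// Made-up numbers, for illustration only.
const summary = summarize([
  { coo_iso: 'SYR', coa_iso: 'TUR', refugees: 300, asylum_seekers: 20, idps: 0, stateless: 0 },
  { coo_iso: 'SYR', coa_iso: 'SYR', refugees: 0, asylum_seekers: 0, idps: 700, stateless: 0 },
]);
console.log(summary.get('SYR'), summary.get('TUR'));
```

In this sketch the origin country accumulates all four categories while the host country only accumulates refugees and asylum seekers, which is why a labor-migrant-heavy country with no UNHCR rows ends up with both fields near zero.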
Finding. The field mapping is code-correct. Labor migrants
are not in the UNHCR endpoint at all, so the plan's hypothesis —
"does totalDisplaced inadvertently include labor migrants?" — is
negative at the seeder level. Countries whose foreign-born
populations are dominated by labor migrants (GCC states, Singapore,
Malaysia) will have small totalDisplaced AND small hostTotal
under UNHCR's definition. That is the UNHCR-semantic output, not
a bug.
Implication for the GCC cohort-audit question. GCC countries
score high on socialCohesion's displacement sub-component
(log10(0) → 0 → normalizes to 100) because UNHCR records them as
having small refugee inflows/outflows — correct per UNHCR
semantics, regardless of labor migrant stock. If the resilience
construct wants "demographic pressure from foreign-born
populations" as an indicator, that would require a SEPARATE data
source (UN DESA migrant stock) and a separate dimension — not a
change to this one.
Modeling note — scoreBorderSecurity fallback chain is
effectively dead code. The scorer reads
hostTotal ?? totalDisplaced at line 1412 of
_dimension-scorers.ts. Intent (from the surrounding comments):
- Primary (hostTotal): how many UNHCR-registered people this country hosts → direct border-security signal.
- Fallback (totalDisplaced): how many of this country's people are displaced → indirect border-security signal for origin-dominated countries.
Discovered during this audit: the fallback does not fire in production, for two compounding reasons.
1. safeNum(null) returns 0, not null. JavaScript's Number(null) is 0 (while Number(undefined) is NaN), so the scorer's safeNum helper classifies null as a finite zero. The ?? operator only falls back on null/undefined, so safeNum(null) ?? safeNum(totalDisplaced) evaluates to 0.
2. scripts/seed-displacement-summary.mjs ALWAYS writes hostTotal: 0 explicitly for origin-only countries (lines 141-144 of the seeder). There is no production shape where hostTotal is undefined, which is the only case in which ?? would actually fall back.
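The two reasons above can be reproduced in isolation. The sketch below assumes a `safeNum` that returns finite numbers and maps everything else to null; that body, the `Displacement` type, and the `borderSignal` name are hypothetical stand-ins, not the scorer's actual code.

```typescript
// Hypothetical stand-in for the scorer's safeNum helper: finite numbers pass
// through, everything else becomes null. Number(null) is 0, so null is "finite".
function safeNum(value: unknown): number | null {
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

interface Displacement {
  hostTotal?: number | null;
  totalDisplaced?: number | null;
}

// Mirrors the hostTotal ?? totalDisplaced read described above.
function borderSignal(d: Displacement): number | null {
  return safeNum(d.hostTotal) ?? safeNum(d.totalDisplaced);
}

const originOnly = borderSignal({ hostTotal: 0, totalDisplaced: 7000000 }); // seeder shape
const nullHost = borderSignal({ hostTotal: null, totalDisplaced: 7000000 });
const missingHost = borderSignal({ totalDisplaced: 7000000 }); // never produced by the seeder
console.log(originOnly, nullHost, missingHost); // prints: 0 0 7000000
```

Only the third shape, where hostTotal is undefined, reaches the fallback, and per the second reason above the seeder never emits it.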
Observable consequence. Origin-only countries with large
outflows but no asylum inflow — Syria (~7M displaced), Venezuela
(~6M), Afghanistan (~5M), Ukraine during peak — score 100 on
scoreBorderSecurity's displacement sub-component (35% of the
dim). The actual signal is never picked up. Turkey-pattern
(large host, small origin) works correctly.
Why this is not being fixed today. A one-line change (|| instead of
??, or hostTotal > 0 ? hostTotal : totalDisplaced) would
flip the borderSecurity score for ~6 high-outflow origin
countries by a material amount — a methodology change, not a
pure bug-fix. That belongs in a construct-decision PR with a
cohort-audit snapshot before/after, not bundled into an audit
doc PR. Opening a follow-up to decide: should borderSecurity
reflect origin-outflow pressure, host-inflow pressure, or both?
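For reference when that follow-up is opened, the difference between the current read and the ternary variant mentioned above can be sketched as below. The function names and sample figures are hypothetical, and the candidate is shown only to make the behavioral flip concrete; it is not an adopted change.

```typescript
interface Displacement {
  hostTotal: number | null;
  totalDisplaced: number;
}

// Current read: ?? only falls back on null/undefined, so an explicit 0 wins.
function currentRead(d: Displacement): number {
  return d.hostTotal ?? d.totalDisplaced;
}

// Candidate read (NOT adopted): treat a zero host count as "no host signal".
function candidateRead(d: Displacement): number {
  const host = d.hostTotal ?? 0;
  return host > 0 ? host : d.totalDisplaced;
}

// Illustrative magnitudes only.
const originDominated: Displacement = { hostTotal: 0, totalDisplaced: 7000000 };
const hostDominated: Displacement = { hostTotal: 3000000, totalDisplaced: 100000 };

console.log(currentRead(originDominated), candidateRead(originDominated)); // prints: 0 7000000
console.log(currentRead(hostDominated), candidateRead(hostDominated)); // prints: 3000000 3000000
```

Host-dominated countries are unaffected by the candidate; only origin-dominated countries change, which is the methodology shift the paragraph above defers.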
Test pin. tests/resilience-displacement-field-mapping.test.mts
pins the CURRENT behavior (Syria-pattern scores 100 on this
sub-component). A future construct decision that flips the
semantics must update that test in the same commit.
What WOULD be a bug, if observed (not observed today). If a
future UNHCR schema change renamed refugees/idps/etc.
without the seeder catching it, totalDisplaced would silently
drop to 0 across the board — presenting as "every country is a
perfect-cohesion utopia" in the rankings. Mitigation: the
existing seed-health gate in /api/health fails on
displacement:summary:v1:<year> record count < threshold, which
would trip before scores propagate. Verified by reading
validate() at line 216-223 of seed-displacement-summary.mjs.
Follow-up audit (requires API-key access, not in scope of this
PR). Spot-check 10 countries' raw displacement:summary:v1:<year>
payloads against UNHCR Refugee Data Finder
(https://www.unhcr.org/refugee-statistics/) to verify the seeder's
sum reproduces UNHCR's published figures:
- High host-pressure states: DE, TR, PK, UG, BD, CO, LB
- High origin-outflow states: SY, UA, AF, VE
- Labor-migrant-dominated states (should show small UNHCR numbers regardless of labor migrant stock): AE, QA, KW, SG
Write the comparison into this file as a subsection when the spot-check runs.
References.
- Seeder: scripts/seed-displacement-summary.mjs
- Scorer reads: server/worldmonitor/resilience/v1/_dimension-scorers.ts lines 843 (getCountryDisplacement), 1383, 1412, 1765
- UNHCR Population API schema: https://api.unhcr.org/docs/population.html
- Plan reference: docs/plans/2026-04-24-002-fix-resilience-cohort-ranking-structural-audit-plan.md §PR 5.2
foodWater scorer — construct-deterministic cohort identity (scoreFoodWater)
Dimension. foodWater (weight 1.0 in the health-food domain
aggregate). Reads from resilience:static:<ISO2> via
readStaticCountry. Three weighted slots:
| Slot | Source | Weight | Mapping |
|---|---|---|---|
| People in food crisis (log10) | fao.peopleInCrisis (HDX IPC/FSIN) | 0.45 | normalizeLowerBetter(log10(max(1, n)), 0, 7) |
| IPC phase number | fao.phase → digit extracted | 0.15 | normalizeLowerBetter(phase, 1, 5) |
| AQUASTAT water indicator | aquastat.value + aquastat.indicator (WB ER.H2O.FWST.ZS, labelled 'water stress') | 0.40 | normalizeLowerBetter(value, 0, 100) when indicator contains stress/withdrawal/dependency; normalizeHigherBetter when availability/renewable/access |
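The mappings in the table can be sketched as below. This assumes normalizeLowerBetter is a clamped linear rescale to 0-100, which this document does not confirm, and `indicatorDirection` is a hypothetical name for the keyword routing; the higher-better range is not given in the table, so it is not sketched.

```typescript
const clamp = (x: number, lo: number, hi: number): number => Math.min(hi, Math.max(lo, x));

// Assumed shape: linear rescale where `best` maps to 100 and `worst` maps to 0.
function normalizeLowerBetter(value: number, best: number, worst: number): number {
  return clamp(((worst - value) / (worst - best)) * 100, 0, 100);
}

type Direction = 'lower-better' | 'higher-better';

// Hypothetical routing helper based on the keyword lists in the table above.
function indicatorDirection(indicator: string): Direction | null {
  const label = indicator.toLowerCase();
  if (/stress|withdrawal|dependency/.test(label)) return 'lower-better';
  if (/availability|renewable|access/.test(label)) return 'higher-better';
  return null;
}

// No people in crisis: log10(max(1, 0)) = 0, the best end of the (0, 7) range.
const noCrisis = normalizeLowerBetter(Math.log10(Math.max(1, 0)), 0, 7);
// IPC phase 3 sits halfway between 1 and 5.
const phaseThree = normalizeLowerBetter(3, 1, 5);
// A withdrawal ratio far above 100 clamps to the floor.
const extremeWithdrawal = normalizeLowerBetter(2080, 0, 100);

console.log(noCrisis, phaseThree, extremeWithdrawal); // prints: 100 50 0
console.log(indicatorDirection('water stress'), indicatorDirection('renewable water availability'));
```

The clamp is the relevant detail for desert economies: any withdrawal ratio above 100 lands on the same sub-score of 0, however far above 100 it is.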
What the plan's predecessor concern was. The cohort-audit plan
observed that GCC countries all score ~53 on foodWater and
asked whether this was a "mystery regional default" or genuine
construct output.
Finding — it is genuine construct output.
1. IPC/HDX doesn't publish active food-crisis data for food-secure states like the GCC. scripts/seed-resilience-static.mjs writes fao: null (or omits the block) for those countries.
2. The scorer's fao == null branch imputes IMPUTE.ipcFood = { score: 88, certaintyCoverage: 0.7, imputationClass: 'stable-absence' } (see _dimension-scorers.ts line 135) at weight 0.6 for the combined peopleInCrisis+phase slot.
3. AQUASTAT for the GCC is EXTREME. WB indicator ER.H2O.FWST.ZS measures freshwater withdrawal as a % of internal renewable resources. Desert economies with desalination routinely exceed 100% (Kuwait ~3200%, Bahrain ~3400%, UAE ~2080%, Qatar ~770%). Values > 100 clamp the sub-score to 0 under the lower-better normaliser against (0, 100).
4. Under the fao: null branch (which is what the static seeder emits for GCC in production) plus clamped AQUASTAT=0 at weight 0.4, the weighted blend is:

   weightedScore = (IMPUTE.ipcFood × 0.6 + 0 × 0.4) / (0.6 + 0.4) = (88 × 0.6) / 1.0 = 52.8 → 53

   Pinned as an anchor test in tests/resilience-foodwater-field-mapping.test.mts. Note that an alternative scenario (fao present with peopleInCrisis: 0 and phase: null) converges on a near-identical 52.94 via the else-branch formula (100×0.45 + 0×0.4) / 0.85. That convergence is a coincidence of the specific zero-peopleInCrisis input, NOT the construct's intent: the test fixture is intentionally shaped to exercise the IMPUTE path that matches production.
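The arithmetic for both scenarios can be checked with a small null-aware blend. This is a simplified sketch, assuming null slots are dropped from the score denominator; `blendScore` is a hypothetical name and it ignores the coverage and imputation-class bookkeeping that the real weightedBlend performs.

```typescript
interface Slot {
  score: number | null;
  weight: number;
}

// Simplified blend: slots with a null score are excluded from the denominator.
function blendScore(slots: Slot[]): number | null {
  let availableWeight = 0;
  let weightedSum = 0;
  for (const s of slots) {
    if (s.score !== null) {
      availableWeight += s.weight;
      weightedSum += s.score * s.weight;
    }
  }
  return availableWeight > 0 ? weightedSum / availableWeight : null;
}

// Production GCC shape: fao is null, so the imputed 88 takes the combined 0.6 weight.
const imputePath = blendScore([
  { score: 88, weight: 0.6 },
  { score: 0, weight: 0.4 },
]);

// Alternative shape: fao present with peopleInCrisis = 0 and phase = null.
const observedPath = blendScore([
  { score: 100, weight: 0.45 },
  { score: null, weight: 0.15 },
  { score: 0, weight: 0.4 },
]);

console.log(imputePath, observedPath); // roughly 52.8 and 52.94; both round to 53
```

The two paths reach nearly the same number through different denominators (1.0 versus 0.85), which is why the convergence should be read as a coincidence of these inputs rather than a property of the construct.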
Why GCC scores are identical across the cohort. GCC countries share:
- Same IPC status (not monitored → same impute constant)
- Same AQUASTAT indicator ('water stress', WB's standard label)
- Extreme and similarly-clamped withdrawal ratios (all > 100 → all clamp to 0 on the AQUASTAT sub-score)
Identical inputs → identical outputs. That is construct determinism, not a regional-default lookup. Pinned with a synthetic two-country test: identical input shapes produce identical scores; different water profiles produce different scores.
Regression-guard tests in tests/resilience-foodwater-field-mapping.test.mts:
- Indicator routing: 'water stress' → lower-better; 'renewable water availability' → higher-better.
- GCC extreme-withdrawal anchor: AQUASTAT value=2000 + fao: null (IMPUTE branch, matching production) blends to exactly 53 via (88×0.6 + 0×0.4) / 1.0 = 52.8 → 53.
- IPC-absent with static record present: imputes ipcFood=88; observed AQUASTAT wins → imputationClass=null per weightedBlend's T1.7 rule.
- Fully-imputed (FAO missing AND AQUASTAT missing): surfaces imputationClass='stable-absence'.
- Fully-absent static record (seeder never ran): returns coverage=0, NOT an impute.
- Cohort determinism: identical inputs → identical scores; different water-profile inputs → different scores.
Implication — no fix required. The scorer is producing the
construct it's specified to produce. The observed GCC identity
is a correct summary statement: "non-crisis food security +
severe water-withdrawal stress." A future construct decision
might split foodWater into food and water sub-dims so the
water-stress signal doesn't saturate the combined dim across
desert economies — but that is a construct redesign, not a
bug fix.
Follow-up data-side spot-check (requires API key / Redis
access; not in scope of this PR). Pull raw AQUASTAT + FAO
inputs for GCC + IL + JO (similar water-stressed region) and
verify the seeder-written values against WB's live API
response. If a GCC country's WB value differs substantially
from the figures above, the seeder may have a stale-year
picker bug — unlikely given seed-resilience-static.mjs uses
mrv=15 + selectLatestWorldBankByCountry, but worth
verifying.
References.
- Seeder: scripts/seed-resilience-static.mjs lines 658-680 (WB_WATER_STRESS_INDICATOR, fetchAquastatDataset, buildAquastatWbMap)
- Scorer reads: server/worldmonitor/resilience/v1/_dimension-scorers.ts lines 895 (scoreAquastatValue), 1471 (scoreFoodWater), 135 (IMPUTE.ipcFood constant)
- WB indicator docs: https://data.worldbank.org/indicator/ER.H2O.FWST.ZS
- Plan reference: docs/plans/2026-04-24-002-fix-resilience-cohort-ranking-structural-audit-plan.md §PR 5.3
- Test regression guards: tests/resilience-foodwater-field-mapping.test.mts
tradeSanctions — "designated-party domicile" construct question (scoreTradeSanctions)
Dimension. tradeSanctions (scoreTradeSanctions, weight 0.45
of the blend for the sanctions sub-component; 1.0 for the dim in
the economic domain).
Source. sanctions:country-counts:v1, a flat ISO2 → count
map written by scripts/seed-sanctions-pressure.mjs. The seeder
parses OFAC's Advanced XML (SDN + Consolidated lists), extracts
each designated party's Locations, and increments
map[countryCode] by 1 for every country listed in that
party's locations.
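The counting rule can be sketched as follows. The `DesignatedParty` shape, the function name, and the sample parties are hypothetical (the real seeder works on OFAC Advanced XML), and counting each country once per party is an assumption; this document does not say how repeated locations in the same country are handled.

```typescript
// Hypothetical parsed shape; the real seeder reads OFAC Advanced XML.
interface DesignatedParty {
  name: string;
  programs: string[];
  locationCountries: string[]; // ISO2 codes taken from the party's Locations
}

// Assumption: each party adds 1 per distinct country it lists.
function buildCountryCountsSketch(parties: DesignatedParty[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const party of parties) {
    for (const iso2 of Array.from(new Set(party.locationCountries))) {
      counts[iso2] = (counts[iso2] ?? 0) + 1;
    }
  }
  return counts;
}

// Fictional parties, for illustration only.
const counts = buildCountryCountsSketch([
  { name: 'Example Bank', programs: ['RUSSIA-EO'], locationCountries: ['RU'] },
  { name: 'Example Trading Co', programs: ['SDGT'], locationCountries: ['AE', 'IR'] },
  { name: 'Example Holdings', programs: ['SDGT'], locationCountries: ['AE'] },
]);
console.log(counts); // prints: { RU: 1, AE: 2, IR: 1 }
```

Note that the sketch carries `programs` on each party but never reads it when counting, which matches the flat-count behavior described in this section.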
What the count ACTUALLY represents
The count is "how many OFAC-designated parties list this country as a location" — not "how many sanctions this country is under." A single designated entity's primary country gets +1; a shell that's domiciled in country X but operates via country Y will typically list both and increment both counts.
Consequence: the count conflates three semantically distinct categories that a resilience construct might want to treat differently.
| Category | Example | Current scorer impact | Construct question |
|---|---|---|---|
| (a) Country-level sanction target | North Korea SDN listings | +1 per designated entity/person inside the sanctioned state | Penalizing the state is the INTENDED signal — resilience is genuinely degraded by comprehensive sanctions |
| (b) Domiciled sanctioned entity | Russian bank HQ'd in Moscow, designated post-2022 invasion | +1 per listing | The country's resilience is indirectly penalized for hosting the sanctioned actor — debatable |
| (c) Transit / shell entity listing | UAE-based trading company designated under SDGT for Iran oil smuggling; Cyprus-registered SPV facilitating a Russian oligarch's asset transfer | +1 per listing even when the country itself is NOT the sanctions target | The country is penalized because it's a financial hub that shell entities incorporate in — construct-debatable |
Observed effect in the 2026-04-24 cohort audit
| Country | tradeSanctions dim score | Interpretation under current construct |
|---|---|---|
| KW | 82 | Low designated-party count (mostly a clean jurisdiction) |
| QA | 82 | Low count |
| AE | 54 | High count — dominated by category (c): Iran-evasion shell entities, Russian-asset SPVs |
| SA | (similar) | Low count |
AE's gap of −28 vs KW/QA is almost entirely driven by category (c) listings. Under the CURRENT scorer, AE's resilience is penalized for being a financial hub where sanctioned parties incorporate shells — regardless of whether the UAE state is complicit or targeted by the listing.
Construct options (not decided here)
This PR deliberately does NOT pick an option; the scoring implication is large enough that the decision belongs to a separate construct-discussion PR with cohort snapshots.
Option 1 — Keep the current flat count (status quo).
- Rationale: financial-sanctions exposure IS a real resilience risk even for transit-hub jurisdictions. A country that functions as a shell-entity jurisdiction ends up correlated with secondary-sanctions enforcement actions, correspondent-banking isolation, and FATF grey-listing pressure.
- Cost: countries whose domestic policy is NOT what earned them the count (UAE-on-Iran, Cyprus-on-Russia) carry a score penalty for the behavior of entities that happen to have listed addresses there.
Option 2 — Weight by OFAC program category.
- Rationale: programs encode the nature of the designation. DPRK, IRAN, SYRIA, VENEZUELA, CUBA are country-comprehensive; SDGT, SDNTK, CYBER, RUSSIA-EO, GLOMAG are typically entity-specific.
- Approach: weight category-(a) programs at 1.0 and category-(c)-ish programs at 0.3–0.5 based on a named mapping.
- Cost: requires maintaining a program→category manifest, and program codes change over time. Mitigating factor: the seeder already captures programs per entry (see scripts/seed-sanctions-pressure.mjs lines 95-108), so the data is there; the scorer just doesn't read it.
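To make Option 2 concrete, a program-weighted count could look roughly like the sketch below. Everything here is hypothetical: the 0.4 weight is an arbitrary pick from the 0.3–0.5 range, the default for unlisted programs and the max-weight rule for multi-program parties are assumptions, and none of it reflects a decided methodology.

```typescript
// Hypothetical manifest: program codes come from the text above; the 0.4
// weight is an illustrative pick from the 0.3-0.5 range, not a decided value.
const PROGRAM_WEIGHTS: Record<string, number> = {
  DPRK: 1.0, IRAN: 1.0, SYRIA: 1.0, VENEZUELA: 1.0, CUBA: 1.0,
  SDGT: 0.4, SDNTK: 0.4, CYBER: 0.4, 'RUSSIA-EO': 0.4, GLOMAG: 0.4,
};
const DEFAULT_WEIGHT = 1.0; // assumption: unlisted programs keep the status-quo weight

interface Listing {
  programs: string[];
  locationCountries: string[];
}

// Assumption: a party listed under several programs takes its heaviest weight.
function partyWeight(programs: string[]): number {
  if (programs.length === 0) return DEFAULT_WEIGHT;
  return Math.max(...programs.map((p) => PROGRAM_WEIGHTS[p] ?? DEFAULT_WEIGHT));
}

function weightedCountryCounts(listings: Listing[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const listing of listings) {
    const w = partyWeight(listing.programs);
    for (const iso2 of Array.from(new Set(listing.locationCountries))) {
      counts[iso2] = (counts[iso2] ?? 0) + w;
    }
  }
  return counts;
}

// Fictional listings, for illustration only.
const weighted = weightedCountryCounts([
  { programs: ['DPRK'], locationCountries: ['KP'] },
  { programs: ['SDGT'], locationCountries: ['AE'] },
  { programs: ['SDGT'], locationCountries: ['AE'] },
]);
console.log(weighted);
```

Under these illustrative weights, two entity-specific listings in AE contribute less than one country-comprehensive listing in KP, which is the rebalancing Option 2 is after.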
Option 3 — Exclude transit-hub jurisdictions from the domicile-count signal.
- Rationale: a small number of jurisdictions (AE, SG, HK, CY, VG, KY) account for a disproportionate share of shell-entity listings. A hardcoded exclusion list would remove the category-(c) bias for those jurisdictions specifically.
- Cost: hardcoded list is brittle + normative — who gets on it decides who "wins" the scoring change.
Recommendation
Option 2 is the most defensible methodology change, and it
needs no new data collection: the seeder already captures
programs per entry. A scorer update would read
sanctions:program-pressure:v1 or an extended
country-counts:v2 with per-program breakdowns and apply a
rubric-mapped weight to each program.
This PR does NOT implement Option 2. It:
- Documents the three categories explicitly (above)
- Pins the CURRENT normalizeSanctionCount piecewise scale with regression tests so a future scorer refactor cannot silently flip the behavior
- Flags the construct question for a methodology-decision PR
Follow-up audit (requires API key / Redis access)
Per the plan's §PR 5.1 task list, an entity-level sample audit of the raw OFAC data would classify 10 entries per country for AE, HK, SG, CY, TR, RU, IR, US into categories (a)/(b)/(c) and produce a calibration point for an Option-2 program-weight mapping. Out of scope for this doc-only PR.
Regression-guard tests
Pinned in tests/resilience-sanctions-field-mapping.test.mts:
- normalizeSanctionCount piecewise anchors: count=0 → 100, count=1 → 90, count=10 → 75, count=50 → 50, count=200 → 25, count=500 → 0 (exact, asserted with strict equality).
- Monotonicity: more designated parties → lower score.
- Scorer reads sanctions:country-counts:v1[ISO2] and defaults to 0 (score=100) when the country is absent from the map. This is intentional, since absence means "no designated parties located here," not "data missing."
- sanctionsRaw == null (seed outage) → null score slot, NOT imputed, which protects against silent data-outage scoring.
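The pinned anchors and the two null-handling rules can be sketched together. The anchor values are the ones listed above; linear interpolation between them is an assumption, and the function names are hypothetical rather than the scorer's own.

```typescript
// Anchor points pinned by the regression tests: [count, score].
const ANCHORS: ReadonlyArray<readonly [number, number]> = [
  [0, 100], [1, 90], [10, 75], [50, 50], [200, 25], [500, 0],
];

// Assumption: linear interpolation between anchors; the real curve between
// anchor points is not specified in this document.
function normalizeSanctionCountSketch(count: number): number {
  if (count <= 0) return 100;
  for (let i = 1; i < ANCHORS.length; i++) {
    const [x0, y0] = ANCHORS[i - 1];
    const [x1, y1] = ANCHORS[i];
    if (count <= x1) return y0 + ((y1 - y0) * (count - x0)) / (x1 - x0);
  }
  return 0; // floor beyond the last anchor
}

// Absent country -> count 0 -> 100; missing seed (null) -> null, never imputed.
function sanctionsSlot(raw: Record<string, number> | null, iso2: string): number | null {
  if (raw === null) return null;
  return normalizeSanctionCountSketch(raw[iso2] ?? 0);
}

console.log([0, 1, 10, 50, 200, 500].map(normalizeSanctionCountSketch)); // prints: [ 100, 90, 75, 50, 25, 0 ]
console.log(sanctionsSlot({ AE: 50 }, 'KW'), sanctionsSlot(null, 'AE')); // prints: 100 null
```

Keeping "absent from the map" and "map missing entirely" on separate code paths is what lets a seed outage surface as a null slot instead of a perfect score.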
References.
- Seeder: scripts/seed-sanctions-pressure.mjs lines 83-93 (buildCountryCounts)
- Scorer: server/worldmonitor/resilience/v1/_dimension-scorers.ts lines 263 (RESILIENCE_SANCTIONS_KEY), 535 (normalizeSanctionCount), 1057 (scoreTradeSanctions)
- OFAC SDN docs: https://ofac.treasury.gov/specially-designated-nationals-and-blocked-persons-list-sdn-human-readable-lists
- Plan reference: docs/plans/2026-04-24-002-fix-resilience-cohort-ranking-structural-audit-plan.md §PR 5.1
- Test regression guards: tests/resilience-sanctions-field-mapping.test.mts