mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
docs(resilience): PR 5.1 — sanctions construct audit (designated-party domicile question) (#3375)
* docs(resilience): PR 5.1 — sanctions construct audit (designated-party domicile question)

PR 5.1 of cohort-audit plan 2026-04-24-002. Stacked on PR 5.3 (#3374) so the known-limitations.md section append is additive.

Read-only static audit of scoreTradeSanctions + the sanctions:country-counts:v1 seed — framed around the Codex-reformulated construct question: should designated-party domicile count penalize resilience?

Findings

1. The count is "OFAC-designated-party domicile locations," NOT "sanctions against this country." Seeder (`scripts/seed-sanctions-pressure.mjs:85-93`) parses OFAC Advanced XML SDN + Consolidated, extracts each designated party's Locations, and increments `map[countryCode]` by 1 for every location country on that party.

2. The count conflates three semantically distinct categories a resilience construct might treat differently:
   (a) Country-level sanction target (NK SDN listings) — correct penalty
   (b) Domiciled sanctioned entity (RU bank in Moscow, post-2022) — debatable, country hosts the actor
   (c) Transit / shell entity (UAE trading co listed under SDGT for Iran evasion; CY SPV for a Russian oligarch) — country is NOT the target, but takes the penalty

3. Observed GCC cohort impact: AE scores 54 vs KW/QA 82. The −28 gap is almost entirely driven by category (c) listings — AE is a financial hub where sanctioned parties incorporate shells.

4. Three options documented for the construct decision (NOT decided in this PR):
   - Option 1: Keep flat count (status quo, defensible via secondary-sanctions / FATF argument)
   - Option 2: Program-weighted count — weight DPRK/IRAN/SYRIA/etc. at 1.0, SDGT/SDNTK/CYBER/etc. at 0.3-0.5. Recommended; seeder already captures `programs` per entry — data is there, scorer just doesn't read it.
   - Option 3: Transit-hub exclusion list (AE, SG, HK, CY, VG, KY) — brittle + normative, not recommended

5. Recommendation documented: Option 2. Implementation deferred to a separate methodology-decision PR (outside auto-mode authority).

Shipped

- `docs/methodology/known-limitations.md` — new section extending the file: "tradeSanctions — designated-party domicile construct question." Covers what the count represents, the three categories with examples, observed GCC impact, three options w/ trade-offs, recommendation, follow-up audit list (entity-sample gated on API-key access), and file references.
- `tests/resilience-sanctions-field-mapping.test.mts` (new) — 10 regression-guard tests pinning CURRENT behavior:
  1-6. normalizeSanctionCount piecewise anchors: count=0→100, 1→90, 10→75, 50→50, 200→25, 500→≤1
  7. Monotonicity: strictly decreasing across the ramp
  8. Country absent from map defaults to count=0 → score 100 (intentional "no designated parties here" semantics)
  9. Seed outage (raw=null) → null score slot, NOT imputed (protects against silent data-outage scoring)
  10. Construct anchor: count=1 is exactly 10 points below count=0 (pins the "first listing drops 10" design choice)

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts` — 10 pass / 0 fail
- `npm run test:data` — 6721 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — clean

* fix(resilience): PR 5.1 review — tighten count=500 assertion; clarify weightedBlend weights

Addresses 2 P2 Greptile findings on #3375:

1. Tighten count=500 assertion. Was `<= 1` with a comment stating the exact value is 0. That loose bound silently tolerates roundScore / boundary drift that would be the very signal this regression guard exists to catch. Changed to strict equality `=== 0`.

2. Clarify the "zero weight" comment on the sanctions-only harness. The other slots DO contribute their declared weights (0.15 + 0.15 + 0.25 = 0.55) to weightedBlend's `totalWeight` denominator — only `availableWeight` (the score-computation denominator) drops to 0.45 because their score is null. The previous comment elided this distinction and could mislead a reader into thinking the null slots contributed nothing at all. Expanded to state exactly how `coverage` and `score` each behave.

Verified

- `npx tsx --test tests/resilience-sanctions-field-mapping.test.mts` — 10 pass / 0 fail (count=500 now pins the exact 0 floor)
@@ -267,3 +267,148 @@ verifying.
  §PR 5.3
- Test regression guards:
  `tests/resilience-foodwater-field-mapping.test.mts`

---

## tradeSanctions — "designated-party domicile" construct question (scoreTradeSanctions)

**Dimension.** `tradeSanctions` (`scoreTradeSanctions`, weight 0.45
of the blend for the sanctions sub-component; 1.0 for the dim in
the `economic` domain).
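
To make "weight 0.45 of the blend" concrete, here is a minimal sketch of a weighted blend that skips null slots. It is not the repository's `weightedBlend`: the `Slot` type, the `blend` function, and the choice to return `score: null` when no slot has data are assumptions made for illustration. The slot weights (0.45 sanctions, 0.15 restrictions, 0.15 barriers, 0.25 tariff) are the ones quoted in this PR's regression-test comments.

```typescript
// Hypothetical slot shape: a sub-component score (null when its seed is
// missing) plus its declared weight.
interface Slot {
  score: number | null;
  weight: number;
}

interface BlendResult {
  score: number | null; // weighted mean over the non-null slots only
  coverage: number;     // share of declared weight that actually had data
}

function blend(slots: Slot[]): BlendResult {
  let totalWeight = 0;     // every declared weight, null slot or not
  let availableWeight = 0; // weights of slots that produced a score
  let weightedSum = 0;
  for (const slot of slots) {
    totalWeight += slot.weight;
    if (slot.score !== null) {
      availableWeight += slot.weight;
      weightedSum += slot.score * slot.weight;
    }
  }
  return {
    // Assumption: nothing available yields null rather than an imputed score.
    score: availableWeight > 0 ? weightedSum / availableWeight : null,
    coverage: totalWeight > 0 ? availableWeight / totalWeight : 0,
  };
}

// Only the sanctions slot has data; the other three are null.
const sanctionsOnly = blend([
  { score: 90, weight: 0.45 },   // sanctions
  { score: null, weight: 0.15 }, // restrictions
  { score: null, weight: 0.15 }, // barriers
  { score: null, weight: 0.25 }, // tariff
]);

// Full outage: every slot is null.
const fullOutage = blend([
  { score: null, weight: 0.45 },
  { score: null, weight: 0.15 },
  { score: null, weight: 0.15 },
  { score: null, weight: 0.25 },
]);
```

Under these assumptions `sanctionsOnly` carries the sanctions score alone with a coverage of roughly 0.45, while `fullOutage` has a coverage of 0, which is the zero-signal state the outage test in this PR pins.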

**Source.** `sanctions:country-counts:v1`, a flat `ISO2 → count`
map written by `scripts/seed-sanctions-pressure.mjs`. The seeder
parses OFAC's Advanced XML (SDN + Consolidated lists), extracts
each designated party's `Locations`, and increments
`map[countryCode]` by 1 for every country listed in that
party's locations.
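
The counting rule can be sketched as follows. This is an illustration of the behavior described above, not the seeder's `buildCountryCounts`: the `DesignatedParty` shape, the `countByCountry` name, and the sample entries are all hypothetical, and whether the real seeder de-duplicates repeated countries within one party is not stated here, so the sketch counts every listed location.

```typescript
// Hypothetical, simplified shape of one parsed OFAC entry.
interface DesignatedParty {
  name: string;
  programs: string[];
  locationCountries: string[]; // ISO2 codes, one per listed location
}

// Flat ISO2 -> count map: +1 for every location country on every party.
function countByCountry(parties: DesignatedParty[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const party of parties) {
    for (const iso2 of party.locationCountries) {
      counts[iso2] = (counts[iso2] ?? 0) + 1;
    }
  }
  return counts;
}

// Fictional entries, for illustration only.
const sampleParties: DesignatedParty[] = [
  { name: 'Example Trading Co', programs: ['SDGT'], locationCountries: ['AE', 'IR'] },
  { name: 'Example Bank', programs: ['RUSSIA-EO'], locationCountries: ['RU'] },
  { name: 'Example SPV', programs: ['RUSSIA-EO'], locationCountries: ['CY', 'RU'] },
];

const sampleCounts = countByCountry(sampleParties);
// sampleCounts is { AE: 1, IR: 1, RU: 2, CY: 1 }
```

Note that the fictional trading company raises both AE and IR, even though only one of them is the target of the program it is listed under.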

### What the count ACTUALLY represents

The count is **"how many OFAC-designated parties list this
country as a location"** — not "how many sanctions this country
is under." A single designated entity's primary country gets +1;
a shell that's domiciled in country X but operates via country Y
will typically list both and increment both counts.

Consequence: the count conflates three semantically distinct
categories that a resilience construct might want to treat
differently.

| Category | Example | Current scorer impact | Construct question |
|---|---|---|---|
| (a) **Country-level sanction target** | North Korea SDN listings | +1 per designated entity/person inside the sanctioned state | Penalizing the state is the INTENDED signal — resilience is genuinely degraded by comprehensive sanctions |
| (b) **Domiciled sanctioned entity** | Russian bank HQ'd in Moscow, designated post-2022 invasion | +1 per listing | The country's resilience is indirectly penalized for hosting the sanctioned actor — debatable |
| (c) **Transit / shell entity listing** | UAE-based trading company designated under SDGT for Iran oil smuggling; Cyprus-registered SPV facilitating a Russian oligarch's asset transfer | +1 per listing even when the country itself is NOT the sanctions target | The country is penalized because it's a financial hub that shell entities incorporate in — construct-debatable |

### Observed effect in the 2026-04-24 cohort audit

| Country | `tradeSanctions` dim score | Interpretation under current construct |
|---|---|---|
| KW | 82 | Low designated-party count (mostly a clean jurisdiction) |
| QA | 82 | Low count |
| AE | 54 | High count — dominated by category (c): Iran-evasion shell entities, Russian-asset SPVs |
| SA | (similar) | Low count |

AE's gap of −28 vs KW/QA is almost entirely driven by category
(c) listings. Under the CURRENT scorer, AE's resilience is
penalized for being a financial hub where sanctioned parties
incorporate shells — regardless of whether the UAE state is
complicit or targeted by the listing.

### Construct options (not decided here)

This PR deliberately does NOT pick an option; the scoring
implication is large enough that the decision belongs to a
separate construct-discussion PR with cohort snapshots.

**Option 1 — Keep the current flat count (status quo).**

- Rationale: financial-sanctions exposure IS a real resilience
  risk even for transit-hub jurisdictions. A country that
  functions as a shell-entity jurisdiction ends up correlated
  with secondary-sanctions enforcement actions,
  correspondent-banking isolation, and FATF grey-listing pressure.
- Cost: countries whose domestic policy is NOT what earned them
  the count (UAE-on-Iran, Cyprus-on-Russia) carry a score
  penalty for the behavior of entities that happen to have
  listed addresses there.

**Option 2 — Weight by OFAC program category.**

- Rationale: programs encode the nature of the designation.
  `DPRK`, `IRAN`, `SYRIA`, `VENEZUELA`, `CUBA` are
  country-comprehensive; `SDGT`, `SDNTK`, `CYBER`, `RUSSIA-EO`,
  `GLOMAG` are typically entity-specific.
- Approach: weight category-(a) programs at 1.0 and
  category-(c)-ish programs at 0.3–0.5 based on a named mapping.
- Cost: requires maintaining a program→category manifest;
  program codes change over time; currently the seeder already
  captures `programs` per entry (see
  `scripts/seed-sanctions-pressure.mjs` lines 95-108) — the
  data is there, the scorer just doesn't read it.
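
A program-weighted count along the lines of Option 2 might look like the sketch below. Everything numeric here is a placeholder: the 0.4 weight is simply a value inside the 0.3–0.5 range mentioned above, and the `PROGRAM_WEIGHTS` manifest, the `ListedEntry` shape, the rule of taking the highest weight when an entry carries several programs, and the full-weight default for unmapped programs are all assumptions, not decisions recorded in this PR.

```typescript
// Hypothetical manifest. The program codes are the ones named above;
// the weights are illustrative placeholders.
const PROGRAM_WEIGHTS: Record<string, number> = {
  DPRK: 1.0, IRAN: 1.0, SYRIA: 1.0, VENEZUELA: 1.0, CUBA: 1.0,
  SDGT: 0.4, SDNTK: 0.4, CYBER: 0.4, 'RUSSIA-EO': 0.4, GLOMAG: 0.4,
};

// Assumption: a program missing from the manifest keeps full weight,
// so new program codes never silently soften the penalty.
const DEFAULT_WEIGHT = 1.0;

interface ListedEntry {
  programs: string[];
  locationCountries: string[]; // ISO2 codes
}

// Assumption: an entry listed under several programs takes the highest weight.
function entryWeight(programs: string[]): number {
  if (programs.length === 0) return DEFAULT_WEIGHT;
  return Math.max(...programs.map((p) => PROGRAM_WEIGHTS[p] ?? DEFAULT_WEIGHT));
}

function weightedCountByCountry(entries: ListedEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of entries) {
    const weight = entryWeight(entry.programs);
    for (const iso2 of entry.locationCountries) {
      counts[iso2] = (counts[iso2] ?? 0) + weight;
    }
  }
  return counts;
}

// Fictional entries: two entity-specific listings in AE, one comprehensive in KP.
const weighted = weightedCountByCountry([
  { programs: ['SDGT'], locationCountries: ['AE'] },
  { programs: ['SDGT'], locationCountries: ['AE'] },
  { programs: ['DPRK'], locationCountries: ['KP'] },
]);
```

With these placeholder weights the two AE listings together contribute less than the single KP listing, which is the direction of change Option 2 is meant to produce. The resulting fractional counts would still need a normalization step, and any real weights would come out of the methodology-decision PR.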

**Option 3 — Exclude transit-hub jurisdictions from the
domicile-count signal.**

- Rationale: a small number of jurisdictions (AE, SG, HK, CY,
  VG, KY) account for a disproportionate share of shell-entity
  listings. A hardcoded exclusion list would remove the
  category-(c) bias for those jurisdictions specifically.
- Cost: hardcoded list is brittle + normative — who gets on it
  decides who "wins" the scoring change.

### Recommendation

**Option 2** is the most defensible methodology change and is
also the only one that requires data already being collected.
The seeder captures `programs` per entry; a scorer update
would read `sanctions:program-pressure:v1` or an extended
`country-counts:v2` with per-program breakdowns and apply a
rubric-mapped weight to each program.

**This PR does NOT implement Option 2.** It:

1. Documents the three categories explicitly (above)
2. Pins the CURRENT `normalizeSanctionCount` piecewise scale
   with regression tests so a future scorer refactor cannot
   silently flip the behavior
3. Flags the construct question for a methodology-decision PR

### Follow-up audit (requires API key / Redis access)

Per the plan's §PR 5.1 task list, an entity-level sample audit
of the raw OFAC data would classify 10 entries per country
for AE, HK, SG, CY, TR, RU, IR, US into categories (a)/(b)/(c)
and produce a calibration point for an Option-2 program-weight
mapping. Out of scope for this doc-only PR.

### Regression-guard tests

Pinned in
`tests/resilience-sanctions-field-mapping.test.mts`:

- `normalizeSanctionCount` piecewise anchors:
  `count=0 → 100`, `count=1 → 90`, `count=10 → 75`,
  `count=50 → 50`, `count=200 → 25`, `count=500 → 0`.
- Monotonicity: more designated parties → lower score.
- Scorer reads `sanctions:country-counts:v1[ISO2]` and defaults
  to 0 (score=100) when the country is absent from the map —
  intentional, since absence means "no designated parties
  located here," not "data missing."
- `sanctionsRaw == null` (seed outage) → null score slot,
  NOT imputed — protects against silent data-outage scoring.
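
The pinned anchors are consistent with a piecewise-linear scale such as the one sketched below. This is a reconstruction from the anchors and the 0.1-per-listing tail noted in the test file, not a copy of `normalizeSanctionCount`: the interpolation inside each segment, the helper names, and the use of `Math.max` in place of the scorer's `roundScore` clamp are assumptions.

```typescript
// Linear interpolation between (x0, y0) and (x1, y1).
function lerp(x: number, x0: number, y0: number, x1: number, y1: number): number {
  return y0 + ((x - x0) / (x1 - x0)) * (y1 - y0);
}

// Hypothetical reconstruction of the scale from its pinned anchors.
function sketchNormalizeCount(count: number): number {
  if (count <= 0) return 100;
  if (count <= 10) return lerp(count, 1, 90, 10, 75);     // 1..10   -> 90..75
  if (count <= 50) return lerp(count, 10, 75, 50, 50);    // 10..50  -> 75..50
  if (count <= 200) return lerp(count, 50, 50, 200, 25);  // 50..200 -> 50..25
  return Math.max(0, 25 - (count - 200) * 0.1);           // tail, floored at 0
}

// Absent country vs. seed outage, as described in the list above:
// a missing map is an outage (null slot), a missing key is a count of 0.
function sketchScoreFromSeed(
  raw: Record<string, number> | null,
  iso2: string,
): number | null {
  if (raw == null) return null;
  return sketchNormalizeCount(raw[iso2] ?? 0);
}

const anchorScores = [0, 1, 10, 50, 200, 500].map(sketchNormalizeCount);
// anchorScores is [100, 90, 75, 50, 25, 0]
```

For example, `sketchScoreFromSeed({ US: 100 }, 'XX')` returns 100 because `XX` is absent from the map, while `sketchScoreFromSeed(null, 'XX')` returns `null` rather than an imputed score.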

**References.**

- Seeder: `scripts/seed-sanctions-pressure.mjs` lines 83-93
  (`buildCountryCounts`)
- Scorer: `server/worldmonitor/resilience/v1/_dimension-scorers.ts`
  lines 263 (`RESILIENCE_SANCTIONS_KEY`),
  535 (`normalizeSanctionCount`), 1057 (`scoreTradeSanctions`)
- OFAC SDN docs: https://ofac.treasury.gov/specially-designated-nationals-and-blocked-persons-list-sdn-human-readable-lists
- Plan reference:
  `docs/plans/2026-04-24-002-fix-resilience-cohort-ranking-structural-audit-plan.md`
  §PR 5.1
- Test regression guards:
  `tests/resilience-sanctions-field-mapping.test.mts`
149
tests/resilience-sanctions-field-mapping.test.mts
Normal file
@@ -0,0 +1,149 @@
// Regression guard for scoreTradeSanctions's normalizeSanctionCount
// piecewise anchors and field-mapping contract.
//
// Context. PR 5.1 of plan 2026-04-24-002 (see
// `docs/methodology/known-limitations.md#tradesanctions-designated-party-domicile-construct-question`)
// documents the construct-ambiguity of counting OFAC-designated-party
// domicile locations as a resilience signal. The audit proposes three
// options for handling the transit-hub-shell-entity case but
// intentionally does NOT implement a scoring change. This test file
// pins the CURRENT scorer behavior so that a future methodology
// decision (Option 2 = program-weighted count; Option 3 = transit-hub
// exclusion; or status quo) updates these tests explicitly.
//
// Pinning protects against silent scorer refactors: if someone swaps
// the piecewise scale, flips the imputation path, or changes how the
// seed-outage null branch interacts with weightedBlend, this file
// fails before the scoring change propagates to a live publication.

import assert from 'node:assert/strict';
import { describe, it } from 'node:test';

import {
  scoreTradeSanctions,
  type ResilienceSeedReader,
} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';

const TEST_ISO2 = 'XX';

// Minimal synthetic reader: only the sanctions key is populated, so the
// scorer's other slots (restrictions, barriers, tariff) drop to null
// and contribute nothing to the score (their declared weights still
// count toward coverage). Isolates the sanctions slot math.
function sanctionsOnlyReader(sanctionsCount: number | null): ResilienceSeedReader {
  return async (key: string) => {
    if (key === 'sanctions:country-counts:v1') {
      return sanctionsCount == null ? null : { [TEST_ISO2]: sanctionsCount };
    }
    return null;
  };
}

describe('normalizeSanctionCount — piecewise anchors pinned', () => {
  // The scorer's piecewise scale (see _dimension-scorers.ts line 535):
  //   count=0       → 100
  //   count=1-10    → 90..75 (linear)
  //   count=11-50   → 75..50 (linear)
  //   count=51-200  → 50..25 (linear)
  //   count=201+    → 25..0 (linear at 0.1/step, clamped 0)
  //
  // The tests drive scoreTradeSanctions end-to-end with an otherwise-
  // empty reader so the sanctions slot is the only one contributing a
  // non-null score to the weightedBlend. Note the OTHER slots still
  // contribute their declared weights (restrictions 0.15, barriers
  // 0.15, tariff 0.25) to weightedBlend's `totalWeight` denominator —
  // they just don't contribute to `availableWeight` (the score-
  // computation denominator) because their score is null. So the
  // surfaced `coverage` value reflects the 0.45 sanctions weight over
  // the full 1.0 totalWeight; the surfaced `score` reflects the
  // sanctions-slot score alone (since it's the only non-null input).

  it('count=0 anchors at score 100 (no designated parties)', async () => {
    const result = await scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(0));
    assert.equal(result.score, 100, `expected 100 at count=0, got ${result.score}`);
  });

  it('count=1 anchors at score 90 (first listing drops 10 points)', async () => {
    const result = await scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(1));
    assert.equal(result.score, 90, `expected 90 at count=1, got ${result.score}`);
  });

  it('count=10 anchors at score 75 (end of the 1-10 ramp)', async () => {
    const result = await scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(10));
    assert.equal(result.score, 75, `expected 75 at count=10, got ${result.score}`);
  });

  it('count=50 anchors at score 50 (end of the 11-50 ramp)', async () => {
    const result = await scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(50));
    assert.equal(result.score, 50, `expected 50 at count=50, got ${result.score}`);
  });

  it('count=200 anchors at score 25 (end of the 51-200 ramp)', async () => {
    const result = await scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(200));
    assert.equal(result.score, 25, `expected 25 at count=200, got ${result.score}`);
  });

  it('count=500 anchors at score 0 (high-count tail clamped to floor)', async () => {
    const result = await scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(500));
    // At count=500: 25 - (500-200)*0.1 = 25 - 30 = -5 → clamped to 0
    // via `roundScore` which clamps to [0, 100]. Equality assertion
    // (not <= 1) so a future roundScore / boundary change that nudges
    // the result off 0 breaks the test loudly instead of silently.
    assert.equal(result.score, 0,
      `expected exactly 0 at count=500 (heavily-sanctioned state; clamped from -5); got ${result.score}`);
  });

  it('monotonic: more designated parties → strictly lower score', async () => {
    const scores = await Promise.all([0, 1, 10, 50, 200, 500].map(
      (n) => scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(n)),
    ));
    for (let i = 1; i < scores.length; i++) {
      assert.ok(scores[i].score < scores[i - 1].score,
        `score must strictly decrease with count; got [${scores.map((s) => s.score).join(', ')}]`);
    }
  });
});

describe('scoreTradeSanctions — field-mapping + outage semantics', () => {
  it('country absent from sanctions map defaults to count=0 (score 100)', async () => {
    // The map is ISO2 → count. A country NOT in the map is semantically
    // "no designated parties located here" — NOT "data missing". The
    // scorer reads `sanctionsCounts[countryCode] ?? 0` (line 1070).
    const reader: ResilienceSeedReader = async (key) => {
      if (key === 'sanctions:country-counts:v1') {
        return { US: 100, RU: 800 }; // our test country XX is NOT in this map
      }
      return null;
    };
    const result = await scoreTradeSanctions(TEST_ISO2, reader);
    assert.equal(result.score, 100,
      `absent-from-map must score 100 (count=0 semantics); got ${result.score}`);
  });

  it('sanctions seed outage (raw=null) contributes null score slot — NOT imputed', async () => {
    // When the seed key is entirely absent (not just the country key),
    // `sanctionsRaw == null` and the slot goes to { score: null, weight: 0.45 }
    // (line 1082-1083 of _dimension-scorers.ts). This is an intentional
    // fail-null behavior: we must NOT impute a score on seed outage,
    // because imputing would mask the outage. The other slots also drop
    // to null (nothing in our synthetic reader), so weightedBlend returns
    // coverage=0 — a clean zero-signal state that propagates as low
    // confidence at the dim level.
    const reader: ResilienceSeedReader = async () => null;
    const result = await scoreTradeSanctions(TEST_ISO2, reader);
    assert.equal(result.coverage, 0,
      `full-outage must produce coverage=0 (no impute-as-if-clean); got ${result.coverage}`);
  });

  it('construct-document anchor: count=1 differs from count=0 by exactly 10 points', async () => {
    // Pins the "first designated party drops the score by 10" design
    // choice. A future methodology PR that decides Option 2 (program-
    // weighted) or Option 3 (transit-hub exclusion) will necessarily
    // update this anchor if the weight-1 semantics change.
    const [zero, one] = await Promise.all([
      scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(0)),
      scoreTradeSanctions(TEST_ISO2, sanctionsOnlyReader(1)),
    ]);
    assert.equal(zero.score - one.score, 10,
      `count=1 must be exactly 10 points below count=0; got ${zero.score - one.score}`);
  });
});