feat(resilience): PR 0 — cohort-sanity release-gate harness (#3369)

* feat(resilience): PR 0 — cohort-sanity release-gate harness

  Lands the audit infrastructure for the resilience cohort-ranking structural audit (plan 2026-04-24-002). Release gate, not merge gate: the audit tells release review what to look at before publishing a ranking; it does not block a PR.

  What's new

  - scripts/audit-resilience-cohorts.mjs — Markdown report generator. Fetches the live ranking + per-country scores (or reads a fixture in offline mode), emits per-cohort per-dimension tables, contribution decomposition, saturated / outlier / identical-score flags, and a top-N movers comparison vs a baseline snapshot.
  - tests/resilience-construct-invariants.test.mts — 12 formula-level anchor-value assertions with synthetic inputs. Covers HHI, external debt (Greenspan-Guidotti anchor), and sovereign fiscal buffer (saturating transform). Tests the MATH, not a country's rank.
  - tests/fixtures/resilience-audit-fixture.json — offline fixture that mirrors the 2026-04-24 GCC state (KW>QA>AE) so the audit tool can be smoke-tested without API-key access.
  - docs/methodology/cohort-sanity-release-gate.md — operational doc explaining when to run, how to read the report, and the explicit anti-pattern note on rank-targeted acceptance criteria.

  Verified

  - `npx tsx --test tests/resilience-construct-invariants.test.mts` — 12 pass (HHI, debt, SWF invariants all green against current scorer)
  - `npm run test:data` — 6706 pass / 0 fail
  - `FIXTURE=tests/fixtures/resilience-audit-fixture.json OUT=/tmp/audit.md node scripts/audit-resilience-cohorts.mjs` runs to completion and correctly flags:
    (a) coverage-outlier on AE.importConcentration (0.3 vs peers 1.0)
    (b) saturated-high on GCC.externalDebtCoverage (all 6 at 100)
    — the two top cohort-sanity findings from the plan.

  Not in this PR

  - The live-API baseline snapshot (docs/snapshots/resilience-ranking-live-pre-cohort-audit-2026-04-24.json) is deferred to a manual release-prep step: run `WORLDMONITOR_API_KEY=wm_xxx API_BASE=https://api.worldmonitor.app node scripts/freeze-resilience-ranking.mjs` before the first methodology PR (PR 1 HHI period widening) so its movers table has something to compare against.
  - No scorer changes. No cache-prefix bumps. This PR is pure tooling.

* fix(resilience): fail-closed on fetch failures + pillar-combine formula mode

  Addresses review P1 + P2 on PR #3369.

  P1 — fetch-failure silent-drop. Per-country score fetches that failed were logged to stderr, silently stored as null, and then filtered out of cohort tables via `codes.filter((cc) => scoreMap.get(cc))`. A transient 403/500 on the very country carrying the ranking anomaly could produce a Markdown report that looked valid — wrong failure mode for a release gate.

  Fix:

  - `fetchScoresConcurrent` now tracks failures in a dedicated Map and does NOT insert null placeholders; missing cohort members are computed against the requested cohort code set.
  - The report has a ⛔ blocker banner at top AND an always-rendered "Fetch failures / missing members" section (shown even when empty, so an operator learns to look).
  - `STRICT=1` writes the report, then exits code 3 on any fetch failure or missing cohort member, code 4 on formula-mode drift, code 0 otherwise. Automation can differentiate the two.

  P2 — pillar-combine formula mode invalidates contribution rows. `docs/methodology/cohort-sanity-release-gate.md:63` tells operators to run this audit before activating `RESILIENCE_PILLAR_COMBINE_ENABLED`, but the contribution decomposition is a domain-weighted roll-up that is ONLY valid when `overallScore = sum(domain.score * domain.weight)`. Once pillar combine is on, `overallScore = penalizedPillarScore(pillars)` (non-linear in dim scores); decomposition rows become materially misleading for exactly the release-gate scenario the doc prescribes.

  Fix:

  - Added `detectFormulaMode(scoreMap)` that takes countries with:
    (a) `sum(domain.weight)` within 0.05 of 1.0 (complete response), AND
    (b) every dim at `coverage ≥ 0.9` (stable share math)
    and compares `|Σ contributions - overallScore|` against `CONTRIB_TOLERANCE` (default 1.5). If > 50% of ≥ 3 eligible countries drift, pillar combine is flagged.
  - Report emits a ⛔ blocker banner at top, a "Formula mode" line in the header, and a "Formula-mode diagnostic" section with the first three offenders. Under `STRICT=1` exits code 4.
  - Methodology doc updated: new "Fail-closed semantics" section, "Formula mode" operator guide, ENV table entries for STRICT + CONTRIB_TOLERANCE.

  Verified:

  - `tests/audit-cohort-formula-detection.test.mts` (NEW) — 3 child-process smoke tests: missing-members banner + STRICT exit 3, all-clear exit 0, pillar-mode banner + STRICT exit 4. All pass.
  - `npx tsx --test tests/resilience-construct-invariants.test.mts tests/audit-cohort-formula-detection.test.mts` — 15 pass / 0 fail
  - `npm run test:data` — 6709 pass / 0 fail
  - `npm run typecheck` / `typecheck:api` — green
  - `npm run lint` / `lint:md` — no warnings on new / changed files (refactor split buildReport complexity from 51 → under 50 by extracting `renderCohortSection` + `renderDimCell`)
  - Fixture smoke: AE.importConcentration coverage-outlier and GCC.externalDebtCoverage saturated-high flags still fire correctly.

* fix(resilience): PR 0 review — fixture-mode source label, try/catch country-names, ASCII minus

  Addresses 3 P2 Greptile findings on #3369:

  1. **Misleading Source: line in fixture mode.** `FIXTURE_PATH` sets `API_BASE=''`, so the report header showed a bare "/api/..." path that never resolved — making a fixture run visually indistinguishable from a live run. Now surfaces `Source: fixture://<path>` in fixture mode.
  2. **`loadCountryNameMap` crashes without useful diagnostics.** A missing or unparseable `shared/country-names.json` produced a raw unhandled rejection. Now the read and the parse are each wrapped in their own try/catch; on either failure the script logs a developer-friendly warning and falls back to ISO-2 codes (report shows "AE" instead of "Uae"). Keeps the audit operable in CI-offline scenarios.
  3. **Unicode minus `−` (U+2212) instead of ASCII `-` in `fmtDelta`.** Downstream operators diff / grep / CSV-pipe the report; the Unicode minus breaks byte-level text tooling. Replaced with ASCII hyphen-minus. Left the U+2212 in the formula-mode diagnostic prose (`|Σ contributions − overallScore|`) where it's mathematical notation, not data.

  Verified

  - `npx tsx --test tests/audit-cohort-formula-detection.test.mts tests/resilience-construct-invariants.test.mts` — 15 pass / 0 fail
  - Fixture-mode run produces `Source: fixture://tests/fixtures/...`
  - Movers-table negative deltas now use ASCII `-`
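
For readers who want the P2 detection rule and the `STRICT=1` exit codes in one place, here is a minimal sketch of the logic as the message above describes it. It is not the code from `scripts/audit-resilience-cohorts.mjs`: the helper names (`sumContributions`, `isEligible`, `detectFormulaModeSketch`, `strictExitCode`) and the return shape are hypothetical, the response fields (`overallScore`, `domains[].weight`, `domains[].dimensions[].score/coverage`) are assumed from the decomposition code in this diff, and giving exit code 3 precedence over 4 when both conditions hold is an assumption.

```javascript
// Illustrative sketch only; helper names and return shape are hypothetical.
const CONTRIB_TOLERANCE = 1.5; // default named in the commit message

// Sum of per-dimension contributions under the legacy linear formula:
// score × (coverage × dimWeight / Σ coverage × dimWeight) × domainWeight.
function sumContributions(scoreDoc, dimWeights = {}) {
  let total = 0;
  for (const domain of scoreDoc.domains ?? []) {
    const dims = domain.dimensions ?? [];
    const denom = dims.reduce(
      (acc, d) => acc + (d.coverage ?? 0) * (dimWeights[d.id] ?? 1),
      0,
    );
    if (denom <= 0) continue;
    for (const d of dims) {
      const share = ((d.coverage ?? 0) * (dimWeights[d.id] ?? 1)) / denom;
      total += (d.score ?? 0) * share * (domain.weight ?? 0);
    }
  }
  return total;
}

// Eligible = complete response (domain weights sum to ~1.0) and every
// dimension at coverage >= 0.9, so the share math is stable.
function isEligible(scoreDoc) {
  const domains = scoreDoc.domains ?? [];
  const weightSum = domains.reduce((acc, d) => acc + (d.weight ?? 0), 0);
  if (Math.abs(weightSum - 1) > 0.05) return false;
  return domains.every((d) =>
    (d.dimensions ?? []).every((dim) => (dim.coverage ?? 0) >= 0.9),
  );
}

// Flags pillar combine when more than half of at least three eligible
// countries drift beyond the tolerance.
function detectFormulaModeSketch(scoreMap, tolerance = CONTRIB_TOLERANCE) {
  const eligible = [...scoreMap.entries()].filter(([, doc]) => isEligible(doc));
  const offenders = eligible
    .map(([cc, doc]) => ({ cc, sum: sumContributions(doc), overall: doc.overallScore }))
    .filter((row) => Math.abs(row.sum - row.overall) > tolerance);
  const pillarCombineSuspected =
    eligible.length >= 3 && offenders.length / eligible.length > 0.5;
  return { eligibleCount: eligible.length, offenders, pillarCombineSuspected };
}

// STRICT=1 exit-code mapping. Precedence of 3 over 4 is an assumption.
function strictExitCode({ fetchFailures, missingMembers, pillarCombineSuspected }) {
  if (fetchFailures > 0 || missingMembers > 0) return 3;
  if (pillarCombineSuspected) return 4;
  return 0;
}
```

Requiring at least three eligible countries keeps a single incomplete response from tripping the banner, and the distinct exit codes let automation tell an infrastructure failure from a formula change.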
2026-04-25 17:14:57 +02:00 · 2026-04-24 18:13:22 +04:00
parent 34dfc9a451
commit df392b0514
5 changed files with 1372 additions and 0 deletions
--- a/docs/methodology/cohort-sanity-release-gate.md
+++ b/docs/methodology/cohort-sanity-release-gate.md
@@ -0,0 +1,240 @@
+# Cohort-sanity release gate
+
+Operational procedure for the resilience cohort-sanity audit. This is a
+**release gate**, not a merge gate. The audit tells release review what
+to look at before publishing a ranking; it does not block a PR from
+merging.
+
+## What this exists to catch
+
+A composite resilience score can be mathematically correct yet produce
+rankings that contradict first-principles domain judgment — usually
+because ONE input has a coverage gap, a saturated goalpost, or a
+denominator that's structurally wrong for one sub-class of entities
+(re-export hubs, single-sector states, SWF-parked-reserve designs).
+
+Cohort-sanity is the test the codebase can't run on its own. It says:
+"given these cohorts, does the ranking match the construct each
+cohort is defined to probe?" Not "does country A rank above country
+B" — see the anti-pattern section below.
+
+Relevant background in the repository:
+
+- `docs/plans/2026-04-24-002-fix-resilience-cohort-ranking-structural-audit-plan.md` —
+  the audit plan that motivates this gate.
+- Skill `cohort-ranking-sanity-surfaces-hidden-data-gaps` — the general
+  diagnostic protocol (data bug / methodology bug / construct limitation
+  / value judgment), including the anti-pattern note on rank-targeted
+  acceptance criteria.
+- `tests/resilience-construct-invariants.test.mts` — formula-level
+  invariants with synthetic inputs. These test the SCORING MATH; they
+  don't flip to fail on a live-ranking change.
+
+## Artifacts
+
+1. **`scripts/audit-resilience-cohorts.mjs`** — emits a structured
+   Markdown report with:
+   - Full top-N ranking table
+   - Per-cohort per-dimension breakdown (GCC, OECD-nuclear, ASEAN trade
+     hubs, LatAm-petro, African-fragile, post-Soviet, stressed-debt,
+     re-export hubs, SWF-heavy exporters, fragile-floor)
+   - Contribution decomposition: for each country, each dim's
+     `score × withinDomainShare × domainWeight` contribution to overall,
+     with `withinDomainShare = (coverage × dimWeight) / Σ(coverage × dimWeight)`
+   - Flagged patterns: saturated dims, low-coverage outliers, identical
+     scores across cohort members
+   - Top-N movers vs a baseline snapshot
+
+2. **`tests/resilience-construct-invariants.test.mts`** — formula-level
+   anchor-value assertions. Part of `npm run test:data`. Failing means
+   the scorer formula drifted; investigate before editing the test.
+
+3. **`docs/snapshots/resilience-ranking-live-pre-cohort-audit-YYYY-MM-DD.json`** —
+   the baseline snapshot for movers comparison. Refresh before each
+   methodology change.
+
+## When to run
+
+- **Pre-publication**: any time the published ranking is about to
+  change externally (site, API consumers, newsletter, partner feed).
+- **Every merge touching a scorer file** (`server/worldmonitor/resilience/v1/_dimension-scorers.ts`,
+  `server/worldmonitor/resilience/v1/_shared.ts`) or a scorer-feeding
+  seeder (`scripts/seed-recovery-*.mjs`, `scripts/seed-bundle-resilience-*.mjs`).
+- **Before activating a feature flag** that alters the scorer
+  (`RESILIENCE_ENERGY_V2_ENABLED`, `RESILIENCE_PILLAR_COMBINE_ENABLED`,
+  `RESILIENCE_SCHEMA_V2_ENABLED`).
+- **After a cache-prefix bump** (`resilience:score:vN`,
+  `resilience:ranking:vN`, `resilience:history:vN`) — once the new
+  prefix has warmed up, rerun the audit so the movers table reflects
+  the new values and nothing else.
+
+## How to run
+
+```bash
+# Online (hits the live API; requires WORLDMONITOR_API_KEY)
+WORLDMONITOR_API_KEY=wm_xxx \
+API_BASE=https://api.worldmonitor.app \
+BASELINE=docs/snapshots/resilience-ranking-live-pre-cohort-audit-2026-04-24.json \
+OUT=/tmp/cohort-audit-$(date +%Y-%m-%d).md \
+node scripts/audit-resilience-cohorts.mjs
+
+# Offline (fixture mode — for CI / dry-run / regression comparison)
+FIXTURE=tests/fixtures/resilience-audit-fixture.json \
+OUT=/tmp/cohort-audit-fixture.md \
+node scripts/audit-resilience-cohorts.mjs
+```
+
+Environment variables:
+
+| Var | Default | Notes |
+|---|---|---|
+| `API_BASE` | (required unless FIXTURE set) | e.g. `https://api.worldmonitor.app` |
+| `WORLDMONITOR_API_KEY` | (required unless FIXTURE set) | resilience RPCs are in `PREMIUM_RPC_PATHS` |
+| `FIXTURE` | (empty) | JSON fixture with `{ ranking, scores }` shape — skips all network calls |
+| `BASELINE` | (empty) | Path to a frozen ranking JSON for movers comparison |
+| `OUT` | (stdout) | Path for the Markdown report |
+| `TOP_N` | 60 | Rows to render in the full-ranking table |
+| `MOVERS_N` | 30 | Rows to render in the movers table |
+| `CONCURRENCY` | 6 | Parallel score-endpoint fetches |
+| `STRICT` | unset | `1` = fail-closed. Report still writes, then exit 3 on fetch failures/missing members, exit 4 on formula-mode drift, exit 0 otherwise. Recommended for release-gate automation. |
+| `CONTRIB_TOLERANCE` | 1.5 | Points of drift tolerated between `Σ contributions` and `overallScore` before formula-mode drift is declared. |
+
+### Fail-closed semantics
+
+The audit is fail-closed on two axes. Both are implemented in
+`scripts/audit-resilience-cohorts.mjs` and documented here so that a
+release-gate operator cannot shortcut them by reading only the
+rendered tables.
+
+1. **Fetch failures / missing cohort members.** When a per-country score
+   fetch fails (HTTP 4xx/5xx, timeout, DNS), the country is NOT silently
+   dropped. The failure is recorded in the run's `failures` map, shown
+   in a ⛔ blocker banner at the top of the report, and rendered in a dedicated
+   "Fetch failures / missing members" section that is ALWAYS present
+   (even when empty, so an operator learns to look for it). Fixture mode
+   uses the same mechanism for cohort members absent from the fixture.
+
+2. **Formula-mode mismatch (`RESILIENCE_PILLAR_COMBINE_ENABLED`).** The
+   contribution decomposition is a domain-weighted roll-up that is ONLY
+   mathematically valid when `overallScore` is computed via the legacy
+   `sum(domain.score * domain.weight)` path. Once pillar combine is on,
+   `overallScore = penalizedPillarScore(pillars)` — a non-linear
+   function of the dim scores — and the decomposition rows no longer
+   sum to overall. The harness detects this by taking any country with:
+
+   - `sum(domain.weight)` within 0.05 of 1.0 (complete response)
+   - every dim at `coverage ≥ 0.9` (stable share math)
+
+   and checking `|Σ contributions - overallScore| ≤ CONTRIB_TOLERANCE`.
+   If more than 50% of ≥ 3 eligible countries drift beyond the
+   tolerance, a ⛔ blocker banner fires at report top AND a
+   "Formula-mode diagnostic" section prints the first three offenders
+   with their Σ vs overall numbers. Until the harness grows a
+   pillar-aware decomposition, the contribution tables under pillar
+   mode must be treated as *"legacy-formula reference only"*.
+
+### Formula mode
+
+The operator guide for what to do when the formula-mode banner fires:
+
+- **If the banner is a false positive** (e.g. scorer changed a dim
+  weight and the audit mirror in `scripts/audit-resilience-cohorts.mjs`
+  `DIM_WEIGHTS` is stale): update the mirror, re-run. This is the
+  `production-logic-mirror-silent-divergence` pattern — the mirror
+  must move with the scorer.
+- **If pillar combine actually activated:** stop using the
+  contribution-decomposition tables for this release gate. Fall back
+  to the per-dimension score table + the construct invariants test +
+  movers review. File a follow-up to grow the harness a pillar-aware
+  decomposition before the next methodology PR under pillar mode.
+- **Exit codes under `STRICT=1`:** `3` = fetch/missing, `4` = formula
+  mode, `0` = all clear. These are distinct so automation can
+  differentiate "the infra is broken" from "the code path is no
+  longer decomposable."
+
+## How to read the report
+
+The report surfaces five categories of signal. **Treat each as a
+prompt for investigation, not a merge gate.**
+
+### 1. Per-cohort per-dimension table
+
+Read across rows. If one country has `IMPUTED` / `unmonitored` /
+`coverage < 0.5` where peers have full coverage, that's a seed-level
+gap — probably a late-reporter window or a missing manifest entry.
+Fix the seed, not the score.
+
+### 2. Contribution decomposition
+
+Each cell shows how many overall-score points that dimension
+contributes to that country. If the row sum doesn't match overall
+score (not within ~0.5 points), the scorer is using a composition
+formula the audit script doesn't understand — investigate
+`_shared.ts`'s `coverageWeightedMean` + `penalizedPillarScore`
+branches and update the decomposition accordingly.
+
+### 3. Flagged patterns
+
+- **Saturated-high**: every cohort member scores > 95 on a dim. The
+  dim contributes zero discrimination within that cohort — either the
+  construct genuinely doesn't apply (acceptable; document in
+  `known-limitations.md`), or the goalpost is too generous (re-anchor).
+- **Saturated-low**: every member scores < 5. Same question in reverse;
+  often a seed failure rather than a construct issue.
+- **Identical scores**: all ≥ 3 cohort members hit the same non-trivial
+  value. Usually a regional-default leak or a missing-data imputation
+  class returning the same number.
+- **Coverage outlier**: one country is `coverage < 0.5` while peers
+  are ≥ 0.9. This is almost always the ranking-inversion smoking gun.
+
+### 4. Top-N movers vs baseline
+
+Expected movers post-methodology-PR are construct-consistent: a
+re-export-hub PR should move re-export hubs, not SWF-heavy exporters.
+Surprise movers trigger investigation before publication.
+
+### 5. Anchor invariants
+
+Run `npx tsx --test tests/resilience-construct-invariants.test.mts`.
+An anchor drift > 1 point on `score(ratio=1.0)=50` or
+`score(em=12)≈63` means someone silently re-goalposted or rewrote a
+saturating transform. This is a bug until proven otherwise.
+
+## Anti-pattern: rank-targeted acceptance criteria
+
+**Never put "ENTITY A > ENTITY B" as a merge gate in this workflow.**
+Once a review commits to producing a specific ranking, every construct
+/ manifest / goalpost knob becomes a lever to tune toward that
+outcome — even subconsciously — and the methodology loses its
+construct integrity.
+
+Use instead:
+
+- **Construct monotonicity tests** — synthetic inputs, not country
+  identity: `score(HHI=0.05) > score(HHI=0.20)`,
+  `score(ratio=1.0) = 50 ± 1`. These fail when the MATH breaks, not
+  when the RANKING changes.
+- **Out-of-sample cohort behaviour** — define a cohort the fix is
+  SUPPOSED to move proportionally (re-export hubs, SWF-heavy
+  exporters, stressed states). Acceptance: cohort behaviour matches
+  the construct change, not a target position.
+- **Top-N movers review** — movers should be cohort members the
+  construct predicts; surprises trigger investigation.
+- **Honest "outcome may not resolve"** — if the original
+  sanity-failure (the ranking inversion that triggered the audit) is not
+  guaranteed to resolve under the in-scope fixes, say so explicitly.
+  A plan that acknowledges "the inversion may persist after all
+  fixes, because the dominant driver is out of scope" is stronger
+  than one that over-promises.
+
+If a release reviewer asks "will this make A rank above B", the
+correct answer is: *"A will move by the amount the construct
+predicts. Where it ends up relative to B is an outcome."*
+
+## Follow-ups
+
+- Every novel gap identified by the audit should land as a section in
+  `docs/methodology/known-limitations.md` so future reviewers see the
+  diagnosis trail.
+- If a gap is fixed in a PR, the audit report from that PR's
+  post-merge run should be attached to the PR as an artifact.
--- a/scripts/audit-resilience-cohorts.mjs
+++ b/scripts/audit-resilience-cohorts.mjs
@@ -0,0 +1,680 @@
+#!/usr/bin/env node
+// Release-gate audit harness for the resilience scorer. Emits a Markdown
+// report that surfaces cohort-level ranking sanity issues BEFORE they reach
+// publication. Designed as a release gate, not a commit gate — see
+// docs/methodology/cohort-sanity-release-gate.md for the interpretation
+// contract and the explicit anti-pattern note on rank-targeted acceptance
+// criteria.
+//
+// What this does:
+//   1. Fetch the live ranking via GET /api/resilience/v1/get-resilience-ranking.
+//   2. For every country in the named cohorts (all ten entries of COHORTS
+//      below, GCC through Fragile-floor), fetch the full per-dimension
+//      score via GET
+//      /api/resilience/v1/get-resilience-score?countryCode=XX.
+//   3. Emit a Markdown report with:
+//        - Full ranking table (top N + grey-outs summary)
+//        - Per-cohort per-dimension breakdown (score / coverage / imputation)
+//        - Contribution decomposition: per country, per dim,
+//          (score × withinDomainShare × domainWeight) toward overall
+//        - Flagged patterns: saturated dimensions (>95 across cohort),
+//          low-coverage outliers (coverage < 0.5 where peers are 1.0),
+//          identical-score clusters (same score across all cohort members)
+//        - Top-N movers vs a baseline snapshot (optional)
+//
+// What this does NOT do:
+//   - Assert country rank orderings ("AE > KW"). That would couple the gate
+//     to outcome-seeking; the audit is intentionally descriptive.
+//   - Fail the build by default (STRICT=1 opts in; see below). Release
+//     review reads the report and decides whether to hold publication.
+//
+// Usage:
+//   WORLDMONITOR_API_KEY=wm_xxx API_BASE=https://api.worldmonitor.app \
+//     node scripts/audit-resilience-cohorts.mjs
+//   WORLDMONITOR_API_KEY=wm_xxx API_BASE=... \
+//     BASELINE=docs/snapshots/resilience-ranking-live-pre-cohort-audit-2026-04-24.json \
+//     OUT=/tmp/audit.md node scripts/audit-resilience-cohorts.mjs
+//   FIXTURE=tests/fixtures/resilience-audit-fixture.json node scripts/audit-resilience-cohorts.mjs
+//
+// Auth: the resilience ranking + score endpoints are in PREMIUM_RPC_PATHS
+// (see src/shared/premium-paths.ts). A valid WORLDMONITOR_API_KEY is
+// required whether running from a trusted browser origin or not — the
+// premium gate forces the key.
+//
+// Fixture mode (FIXTURE env): reads a JSON file with shape
+//   { ranking: GetResilienceRankingResponse, scores: { [cc]: GetResilienceScoreResponse } }
+// and builds the report without any network calls. Useful for offline runs
+// and for regression-comparing the audit output itself across scorer
+// changes (diff the Markdown).
+//
+// Failure modes the script explicitly surfaces (NOT silent-drops):
+//   1. Per-country fetch failure (HTTP 4xx/5xx, timeout). Tracked in a
+//      `failures` map, rendered as a top-of-report blocker banner and a
+//      dedicated "Fetch failures / missing members" section, so a
+//      reviewer skimming the artifact cannot miss that the cohort was
+//      only partially audited.
+//   2. Formula-mode mismatch. When `RESILIENCE_PILLAR_COMBINE_ENABLED`
+//      is active, `overallScore = penalizedPillarScore(pillars)` — a
+//      non-linear function of the dim scores — and the contribution
+//      decomposition (domain-weighted) no longer sums to overall. The
+//      harness detects this via Σ-contribution vs overall drift and
+//      flags it at report top so the operator knows the decomposition
+//      rows are reference-only.
+// STRICT=1 exits non-zero (code 3 for fetch failures, 4 for formula
+// mismatch) AFTER writing the report, so release-gate automation can't
+// treat a partial/stale audit as green.
+
+import fs from 'node:fs/promises';
+import path from 'node:path';
+import { fileURLToPath } from 'node:url';
+import { execSync } from 'node:child_process';
+
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = path.dirname(__filename);
+const REPO_ROOT = path.resolve(__dirname, '..');
+
+const FIXTURE_PATH = process.env.FIXTURE || '';
+const API_BASE = (process.env.API_BASE || '').replace(/\/$/, '');
+if (!FIXTURE_PATH) {
+  if (!API_BASE) {
+    console.error('[audit-resilience-cohorts] API_BASE env var required (e.g. https://api.worldmonitor.app), or FIXTURE=path.json for offline mode');
+    process.exit(2);
+  }
+  if (!process.env.WORLDMONITOR_API_KEY) {
+    console.error('[audit-resilience-cohorts] WORLDMONITOR_API_KEY env var required; resilience RPC paths are in PREMIUM_RPC_PATHS.');
+    process.exit(2);
+  }
+}
+
+const RANKING_URL = `${API_BASE}/api/resilience/v1/get-resilience-ranking`;
+const SCORE_URL = (cc) => `${API_BASE}/api/resilience/v1/get-resilience-score?countryCode=${encodeURIComponent(cc)}`;
+const BASELINE_PATH = process.env.BASELINE || '';
+const OUT_PATH = process.env.OUT || '';
+const TOP_N_FULL_RANKING = Number(process.env.TOP_N || 60);
+const MOVERS_N = Number(process.env.MOVERS_N || 30);
+const CONCURRENCY = Number(process.env.CONCURRENCY || 6);
+// STRICT=1 makes the audit fail-closed: any per-country fetch failure OR any
+// detected formula-mode change (pillar-combine on, contribution rows
+// invalid) exits non-zero so the release-gate operator cannot accidentally
+// ship a partial / misleading report. Default (STRICT unset) still renders
+// but banners the issue prominently at report top.
+const STRICT = process.env.STRICT === '1' || process.env.STRICT === 'true';
+// Tolerance for "sum(contributions) vs overallScore" equality check used
+// to detect pillar-combine formula mode (see decomposeContributions).
+const CONTRIBUTION_SUM_TOLERANCE = Number(process.env.CONTRIB_TOLERANCE || 1.5);
+
+// Named cohorts. Membership reflects the construct question each cohort
+// answers — not "who should rank where." See release-gate doc for rationale.
+const COHORTS = {
+  GCC: ['AE', 'SA', 'KW', 'QA', 'OM', 'BH'],
+  'OECD-nuclear': ['FR', 'US', 'GB', 'JP', 'KR', 'DE', 'CA', 'FI', 'SE', 'BE'],
+  'ASEAN-trade-hub': ['SG', 'MY', 'TH', 'VN', 'ID', 'PH'],
+  'LatAm-petro': ['BR', 'MX', 'CO', 'VE', 'AR', 'EC'],
+  'African-fragile': ['NG', 'ZA', 'ET', 'KE', 'GH', 'CD', 'SD'],
+  'Post-Soviet': ['RU', 'KZ', 'AZ', 'UA', 'UZ', 'GE', 'AM'],
+  'Stressed-debt': ['LK', 'PK', 'AR', 'LB', 'TR', 'EG', 'TN'],
+  'Re-export-hub': ['SG', 'HK', 'NL', 'BE', 'PA', 'AE', 'MY', 'LT'],
+  'SWF-heavy-exporter': ['NO', 'QA', 'KW', 'SA', 'KZ', 'AZ'],
+  'Fragile-floor': ['YE', 'SY', 'SO', 'AF'],
+};
+
+// Coarse domain weights mirrored from _dimension-scorers.ts for contribution
+// decomposition. The live API already returns domain.weight per country,
+// so we READ that from the API rather than hardcoding — this table is only
+// used for sanity-cross-check in the header.
+const EXPECTED_DOMAIN_WEIGHTS = {
+  economic: 0.17,
+  infrastructure: 0.15,
+  energy: 0.11,
+  'social-governance': 0.19,
+  'health-food': 0.13,
+  recovery: 0.25,
+};
+
+function commitSha() {
+  try {
+    return execSync('git rev-parse HEAD', { cwd: REPO_ROOT, stdio: ['ignore', 'pipe', 'ignore'] })
+      .toString()
+      .trim();
+  } catch {
+    return 'unknown';
+  }
+}
+
+async function loadCountryNameMap() {
+  const filePath = path.join(REPO_ROOT, 'shared', 'country-names.json');
+  let raw;
+  try {
+    raw = await fs.readFile(filePath, 'utf8');
+  } catch (err) {
+    console.error(`[audit] shared/country-names.json read failed (${err.code || err.name}): ${err.message}. Falling back to ISO-2 codes in the report (country names will appear as CC).`);
+    return {};
+  }
+  let forward;
+  try {
+    forward = JSON.parse(raw);
+  } catch (err) {
+    console.error(`[audit] shared/country-names.json parse failed: ${err.message}. Falling back to ISO-2 codes.`);
+    return {};
+  }
+  const reverse = {};
+  for (const [name, iso2] of Object.entries(forward)) {
+    const code = String(iso2 || '').toUpperCase();
+    if (!/^[A-Z]{2}$/.test(code)) continue;
+    if (reverse[code]) continue;
+    reverse[code] = name.replace(/\b([a-z])/g, (_, c) => c.toUpperCase());
+  }
+  return reverse;
+}
+
+function apiHeaders() {
+  const h = {
+    accept: 'application/json',
+    // Full UA (not the 10-char Node default) avoids middleware.ts's short-UA
+    // bot guard that 403s bare `node` fetches on the edge path.
+    'user-agent': 'audit-resilience-cohorts/1.0 (+scripts/audit-resilience-cohorts.mjs)',
+  };
+  if (process.env.WORLDMONITOR_API_KEY) {
+    h['X-WorldMonitor-Key'] = process.env.WORLDMONITOR_API_KEY;
+  }
+  return h;
+}
+
+async function fetchRanking() {
+  const response = await fetch(RANKING_URL, { headers: apiHeaders() });
+  if (!response.ok) {
+    throw new Error(`HTTP ${response.status} from ${RANKING_URL}: ${await response.text().catch(() => '')}`);
+  }
+  return response.json();
+}
+
+async function fetchScore(countryCode) {
+  const response = await fetch(SCORE_URL(countryCode), { headers: apiHeaders() });
+  if (!response.ok) {
+    throw new Error(`HTTP ${response.status} for ${countryCode}`);
+  }
+  return response.json();
+}
+
+async function fetchScoresConcurrent(countryCodes) {
+  const scores = new Map();
+  const failures = new Map(); // cc → error message
+  const queue = [...countryCodes];
+  async function worker() {
+    while (queue.length) {
+      const cc = queue.shift();
+      if (!cc) return;
+      try {
+        const data = await fetchScore(cc);
+        scores.set(cc, data);
+      } catch (err) {
+        console.error(`[audit] ${cc} failed: ${err.message}`);
+        failures.set(cc, err.message || 'unknown fetch error');
+        // Do NOT insert null into scores — silent-drop was the P1 bug.
+        // Failures are tracked distinctly so the report can banner them
+        // and STRICT mode can exit non-zero.
+      }
+    }
+  }
+  const workers = Array.from({ length: Math.min(CONCURRENCY, queue.length) }, worker);
+  await Promise.all(workers);
+  return { scores, failures };
+}
+
+function round1(n) {
+  return Math.round(n * 10) / 10;
+}
+
+function round2(n) {
+  return Math.round(n * 100) / 100;
+}
+
+// Given a score document, compute the contribution of every dimension to the
+// overall score. The overall is (by construct) a domain-weighted roll-up of
+// coverage-weighted dimension means. For contribution reporting we use the
+// "effective share" each dim has toward overall:
+//   domainShare = domainWeight
+//   withinDomainShare = (dim.coverage × dimWeight) / Σ(coverage × dimWeight) for that domain
+//   overallContribution = dim.score × withinDomainShare × domainShare
+// The sum of overallContribution across all dims ≈ overallScore (modulo
+// pillar-combine path when enabled, which isn't contribution-decomposable
+// by a clean formula).
+function decomposeContributions(scoreDoc, dimWeights) {
+  const rows = [];
+  for (const domain of scoreDoc.domains ?? []) {
+    const dims = domain.dimensions ?? [];
+    let denom = 0;
+    for (const d of dims) {
+      const w = dimWeights[d.id] ?? 1.0;
+      denom += (d.coverage ?? 0) * w;
+    }
+    for (const d of dims) {
+      const w = dimWeights[d.id] ?? 1.0;
+      const withinDomainShare = denom > 0 ? ((d.coverage ?? 0) * w) / denom : 0;
+      const contribution = (d.score ?? 0) * withinDomainShare * (domain.weight ?? 0);
+      rows.push({
+        domainId: domain.id,
+        domainWeight: domain.weight,
+        dimensionId: d.id,
+        score: d.score,
+        coverage: d.coverage,
+        imputationClass: d.imputationClass || '',
+        dimWeight: w,
+        withinDomainShare,
+        contribution,
+      });
+    }
+  }
+  return rows;
+}
+
+// Weight multipliers mirrored from _dimension-scorers.ts. Mirror is acceptable
+// here because the audit script is a diagnostic — if dim weights drift we'll
+// see contribution rows that don't sum to overallScore and investigate.
+const DIM_WEIGHTS = {
+  macroFiscal: 1.0,
+  currencyExternal: 1.0,
+  tradeSanctions: 1.0,
+  cyberDigital: 1.0,
+  logisticsSupply: 1.0,
+  infrastructure: 1.0,
+  energy: 1.0,
+  governanceInstitutional: 1.0,
+  socialCohesion: 1.0,
+  borderSecurity: 1.0,
+  informationCognitive: 1.0,
+  healthPublicService: 1.0,
+  foodWater: 1.0,
+  fiscalSpace: 1.0,
+  reserveAdequacy: 1.0,
+  externalDebtCoverage: 1.0,
+  importConcentration: 1.0,
+  stateContinuity: 1.0,
+  fuelStockDays: 1.0,
+  liquidReserveAdequacy: 0.5,
+  sovereignFiscalBuffer: 0.5,
+};
+
+function flagDimensionPatterns(cohortName, cohortCodes, scoreMap) {
+  const flags = [];
+  // Collect per-dimension values across the cohort.
+  const byDim = new Map();
+  for (const cc of cohortCodes) {
+    const doc = scoreMap.get(cc);
+    if (!doc) continue;
+    for (const domain of doc.domains ?? []) {
+      for (const dim of domain.dimensions ?? []) {
+        if (!byDim.has(dim.id)) byDim.set(dim.id, []);
+        byDim.get(dim.id).push({ cc, score: dim.score, coverage: dim.coverage, imputationClass: dim.imputationClass });
+      }
+    }
+  }
+  for (const [dimId, entries] of byDim.entries()) {
+    // Saturated dim: every member scores > 95
+    if (entries.length >= 3 && entries.every((e) => e.score > 95)) {
+      flags.push({
+        cohort: cohortName,
+        kind: 'saturated-high',
+        dimension: dimId,
+        message: `Every cohort member scores > 95 on ${dimId}; dim contributes zero discrimination within the cohort.`,
+      });
+    }
+    // Saturated low: every member scores < 5
+    if (entries.length >= 3 && entries.every((e) => e.score < 5)) {
+      flags.push({
+        cohort: cohortName,
+        kind: 'saturated-low',
+        dimension: dimId,
+        message: `Every cohort member scores < 5 on ${dimId}; construct may not apply or seed is missing.`,
+      });
+    }
+    // Identical score across cohort (≥ 3 entries, all equal; the 0 and 100 bounds are left to the saturated checks)
+    if (entries.length >= 3) {
+      const first = entries[0].score;
+      if (entries.every((e) => e.score === first) && first > 0 && first < 100) {
+        flags.push({
+          cohort: cohortName,
+          kind: 'identical-scores',
+          dimension: dimId,
+          message: `All ${entries.length} cohort members have identical ${dimId} = ${first}; possible imputed-default or region-default leak.`,
+        });
+      }
+    }
+    // Low-coverage outlier: some entries have coverage < 0.5 while at least twice as many peers have coverage ≥ 0.9
+    const lowCov = entries.filter((e) => (e.coverage ?? 0) < 0.5);
+    const highCov = entries.filter((e) => (e.coverage ?? 0) >= 0.9);
+    if (lowCov.length && highCov.length >= lowCov.length * 2) {
+      flags.push({
+        cohort: cohortName,
+        kind: 'coverage-outlier',
+        dimension: dimId,
+        message: `Low coverage on ${dimId}: ${lowCov.map((e) => `${e.cc}(${round2(e.coverage)})`).join(', ')}; peers have full coverage.`,
+      });
+    }
+  }
+  return flags;
+}
+
+function computeMovers(currentItems, baselineItems, n) {
+  if (!baselineItems) return [];
+  const baselineByCc = new Map(baselineItems.map((x) => [x.countryCode, x]));
+  const currentByCc = new Map(currentItems.map((x) => [x.countryCode, x]));
+  const deltas = [];
+  for (const [cc, cur] of currentByCc.entries()) {
+    const prev = baselineByCc.get(cc);
+    if (!prev) continue;
+    const curScore = typeof cur.overallScore === 'number' ? cur.overallScore : null;
+    const prevScore = typeof prev.overallScoreRaw === 'number' ? prev.overallScoreRaw : (typeof prev.overallScore === 'number' ? prev.overallScore : null);
+    if (curScore == null || prevScore == null) continue;
+    deltas.push({
+      countryCode: cc,
+      scoreDelta: curScore - prevScore,
+      curScore,
+      prevScore,
+      curRank: cur.__rank,
+      prevRank: prev.rank ?? null,
+    });
+  }
+  deltas.sort((a, b) => Math.abs(b.scoreDelta) - Math.abs(a.scoreDelta));
+  return deltas.slice(0, n);
+}
+
+function fmtDelta(delta) {
+  if (delta === 0) return '·';
+  // ASCII hyphen-minus, not U+2212 MINUS. Downstream operators diff
+  // audit reports with `grep`/`awk`/CSV pipelines that treat the two
+  // characters differently; keeping ASCII preserves byte-level
+  // greppability of negative deltas.
+  const sign = delta > 0 ? '+' : '-';
+  return `${sign}${Math.abs(delta).toFixed(2)}`;
+}
+
+function section(label, body) {
+  return `\n## ${label}\n\n${body}\n`;
+}
+
+// Detect whether overall is computed via the legacy domain-weighted
+// formula (contribution decomposition is valid) or the pillar-combine
+// formula (penalizedPillarScore — decomposition is NOT valid and the
+// operator MUST know). Signal: |Σ contributions - overallScore| across
+// countries with COMPLETE domain coverage exceeds
+// CONTRIBUTION_SUM_TOLERANCE. "Complete" requires:
+//   (a) sum(domain.weight) within 0.05 of 1.0 (all 6 domains present)
+//   (b) every dim has coverage ≥ 0.9 (so the dim-share math is stable)
+// Both gates prevent false positives from small/partial fixtures or
+// live-API responses where the call happened to land mid-backfill.
+function detectFormulaMode(scoreMap) {
+  let diffsExceeded = 0;
+  let checked = 0;
+  const examples = [];
+  for (const [cc, doc] of scoreMap.entries()) {
+    if (!doc) continue;
+    const domains = doc.domains ?? [];
+    const domainWeightSum = domains.reduce((a, d) => a + (d.weight ?? 0), 0);
+    if (Math.abs(domainWeightSum - 1.0) > 0.05) continue; // incomplete response
+    const hasFullCoverage = domains.every((dom) =>
+      (dom.dimensions ?? []).every((dim) => (dim.coverage ?? 0) >= 0.9),
+    );
+    if (!hasFullCoverage) continue;
+    const rows = decomposeContributions(doc, DIM_WEIGHTS);
+    const sum = rows.reduce((a, r) => a + r.contribution, 0);
+    const overall = doc.overallScore ?? 0;
+    const diff = Math.abs(sum - overall);
+    checked += 1;
+    if (diff > CONTRIBUTION_SUM_TOLERANCE) {
+      diffsExceeded += 1;
+      if (examples.length < 3) examples.push({ cc, sum, overall, diff });
+    }
+  }
+  // Heuristic: if > 50% of eligible countries drift AND at least 3 were
+  // checked, pillar-combine is probably active. Below 3 checked we skip
+  // the flag entirely — the signal is too noisy to banner-block on.
+  const pillarModeLikely = checked >= 3 && diffsExceeded / checked > 0.5;
+  return { pillarModeLikely, checked, diffsExceeded, examples };
+}
+
+function renderCohortSection(cohortName, codes, scoreMap, nameMap) {
+  const present = codes.filter((cc) => scoreMap.get(cc));
+  if (!present.length) return '';
+
+  // Collect all dims seen in this cohort.
+  const dimIds = new Set();
+  for (const cc of present) {
+    const doc = scoreMap.get(cc);
+    for (const dom of doc.domains ?? []) for (const dim of dom.dimensions ?? []) dimIds.add(dim.id);
+  }
+  const orderedDims = [...dimIds].sort();
+
+  let body = `Members: ${present.join(', ')}\n\n`;
+
+  // Overall table
+  body += `**Overall**\n\n| CC | Country | Overall | Baseline | Stress | Level |\n|---|---|---:|---:|---:|---|\n`;
+  for (const cc of present) {
+    const doc = scoreMap.get(cc);
+    body += `| ${cc} | ${nameMap[cc] ?? cc} | ${round1(doc.overallScore)} | ${round1(doc.baselineScore)} | ${round1(doc.stressScore)} | ${doc.level} |\n`;
+  }
+
+  // Per-dim scores
+  body += `\n**Per-dimension score** (score · coverage · imputationClass if set)\n\n`;
+  body += `| Dim | ${present.join(' | ')} |\n|---| ${present.map(() => '---:').join(' | ')} |\n`;
+  for (const dimId of orderedDims) {
+    const cells = present.map((cc) => renderDimCell(scoreMap.get(cc), dimId));
+    body += `| ${dimId} | ${cells.join(' | ')} |\n`;
+  }
+
+  // Contribution decomposition (sums to overall per country under legacy formula).
+  body += `\n**Contribution decomposition** (points toward overall score)\n\n`;
+  body += `| Dim | ${present.join(' | ')} |\n|---| ${present.map(() => '---:').join(' | ')} |\n`;
+  const contribByCc = new Map(
+    present.map((cc) => [cc, decomposeContributions(scoreMap.get(cc), DIM_WEIGHTS)]),
+  );
+  for (const dimId of orderedDims) {
+    const cells = present.map((cc) => {
+      const row = (contribByCc.get(cc) ?? []).find((r) => r.dimensionId === dimId);
+      return row ? row.contribution.toFixed(2) : '—';
+    });
+    body += `| ${dimId} | ${cells.join(' | ')} |\n`;
+  }
+  const sums = present.map((cc) => (contribByCc.get(cc) ?? []).reduce((a, r) => a + r.contribution, 0));
+  body += `| **sum contrib** | ${sums.map((s) => s.toFixed(2)).join(' | ')} |\n`;
+  const overalls = present.map((cc) => scoreMap.get(cc).overallScore);
+  body += `| **overallScore** | ${overalls.map((s) => round1(s)).join(' | ')} |\n`;
+
+  return section(`Cohort: ${cohortName}`, body);
+}
+
+function renderDimCell(doc, dimId) {
+  for (const dom of doc.domains ?? []) {
+    for (const dim of dom.dimensions ?? []) {
+      if (dim.id === dimId) {
+        const cov = round2(dim.coverage ?? 0);
+        const imp = dim.imputationClass ? ` · *${dim.imputationClass}*` : '';
+        return `${Math.round(dim.score ?? 0)} · ${cov}${imp}`;
+      }
+    }
+  }
+  return '—';
+}
+
+function buildReport({ ranking, scoreMap, nameMap, movers, capturedAt, sha, failures, requestedCohortCodes }) {
+  const items = ranking.items ?? [];
+  const greyedOut = ranking.greyedOut ?? [];
+  const failureList = [...(failures?.entries?.() ?? [])];
+  const missingCohortMembers = (requestedCohortCodes ?? []).filter((cc) => !scoreMap.get(cc));
+  const formulaMode = detectFormulaMode(scoreMap);
+
+  let md = `# Resilience cohort-sanity audit report\n\n`;
+
+  // Blocking banners at the very top. Operator MUST see these before the
+  // tables below. STRICT mode will exit non-zero after writing the report
+  // so an operator can inspect the diagnostics and then re-run.
+  if (failureList.length || missingCohortMembers.length) {
+    md += `> ⛔ **Fetch failures / missing cohort members.** ${failureList.length} per-country fetch(es) failed; `;
+    md += `${missingCohortMembers.length} cohort member(s) are missing from the score map. `;
+    md += `Tables below only reflect the members that DID load. `;
+    md += `Re-run the audit (STRICT=1 recommended) before treating this report as release-gate evidence.\n\n`;
+  }
+  if (formulaMode.pillarModeLikely) {
+    md += `> ⛔ **Formula mode not supported.** ${formulaMode.diffsExceeded}/${formulaMode.checked} full-coverage countries show `;
+    md += `|Σ contributions − overallScore| > ${CONTRIBUTION_SUM_TOLERANCE}. This almost certainly means \`RESILIENCE_PILLAR_COMBINE_ENABLED\` `;
+    md += `is active (penalizedPillarScore), and the **contribution decomposition tables below are NOT valid**. `;
+    md += `Treat them as "legacy-formula reference only." `;
+    md += `See \`docs/methodology/cohort-sanity-release-gate.md#formula-mode\`.\n\n`;
+  }
+
+  // In FIXTURE mode `API_BASE` is empty → `RANKING_URL` would render as
+  // a bare "/api/resilience/v1/get-resilience-ranking" path that never
+  // resolved. Surface "fixture://<path>" instead so a diff against a
+  // live-run report is visibly distinguishable.
+  const sourceLabel = FIXTURE_PATH ? `fixture://${FIXTURE_PATH}` : RANKING_URL;
+  md += `- Captured: ${capturedAt}\n- Commit: ${sha}\n- Source: ${sourceLabel}\n- Ranked: ${items.length} · Grey-out: ${greyedOut.length}\n`;
+  md += `- Generated by: \`scripts/audit-resilience-cohorts.mjs\`\n`;
+  md += `- Expected domain weights: ${Object.entries(EXPECTED_DOMAIN_WEIGHTS).map(([k, v]) => `${k}=${v}`).join(', ')}\n`;
+  md += `- Formula mode: ${formulaMode.pillarModeLikely ? '**PILLAR-COMBINE (decomposition invalid)**' : 'legacy domain-weighted (decomposition valid)'}\n`;
+  md += `- Fetch failures: ${failureList.length} · Missing cohort members: ${missingCohortMembers.length}\n`;
+  if (BASELINE_PATH) md += `- Baseline snapshot: \`${BASELINE_PATH}\`\n`;
+
+  // Dedicated "what failed" section, rendered even when empty so operators
+  // always know to check for it.
+  {
+    let failBody = '';
+    if (failureList.length) {
+      failBody += `| CC | Country | Error |\n|---|---|---|\n`;
+      for (const [cc, msg] of failureList) {
+        failBody += `| ${cc} | ${nameMap[cc] ?? cc} | ${String(msg).replace(/\s+/g, ' ').slice(0, 200).replace(/\|/g, '\\|')} |\n`;
+      }
+    }
+    if (missingCohortMembers.length) {
+      failBody += `\n**Cohort members with no score data:** ${missingCohortMembers.join(', ')}\n`;
+      failBody += `\nThe cohorts below were rendered using only members that loaded successfully. `;
+      failBody += `An operator comparing to a prior audit should assume the missing members may carry the very anomaly under review.\n`;
+    }
+    if (!failBody) failBody = '_No fetch failures and all cohort members present._';
+    md += section('Fetch failures / missing members', failBody);
+  }
+
+  if (formulaMode.pillarModeLikely && formulaMode.examples.length) {
+    let fmBody = `| CC | Σ contrib | overallScore | |diff| |\n|---|---:|---:|---:|\n`;
+    for (const ex of formulaMode.examples) {
+      fmBody += `| ${ex.cc} | ${ex.sum.toFixed(2)} | ${ex.overall.toFixed(2)} | ${ex.diff.toFixed(2)} |\n`;
+    }
+    fmBody += `\n**Diagnosis.** Under the legacy domain-weighted formula, Σ contributions ≈ overallScore (within ~${CONTRIBUTION_SUM_TOLERANCE} pts of drift for rounding). When \`RESILIENCE_PILLAR_COMBINE_ENABLED\` is active, \`overallScore\` is computed by \`penalizedPillarScore(pillars)\` which is non-linear in the dimension scores; contribution decomposition by domain-weight no longer sums to overall. The audit script does not yet implement a pillar-aware decomposition — fix that before relying on this report under pillar-combine mode.\n`;
+    md += section('Formula-mode diagnostic', fmBody);
+  }
+
+  // Ranking table
+  let body = '| # | CC | Country | Overall | Coverage | Level | Low-conf |\n|---:|---|---|---:|---:|---|---|\n';
+  items.slice(0, TOP_N_FULL_RANKING).forEach((x, i) => {
+    body += `| ${i + 1} | ${x.countryCode} | ${nameMap[x.countryCode] ?? x.countryCode} | ${round1(x.overallScore)} | ${round2(x.overallCoverage)} | ${x.level} | ${x.lowConfidence ? '⚠' : ''} |\n`;
+  });
+  md += section(`Top ${TOP_N_FULL_RANKING} ranking`, body);
+
+  // Per-cohort per-dimension breakdown
+  for (const [cohortName, codes] of Object.entries(COHORTS)) {
+    md += renderCohortSection(cohortName, codes, scoreMap, nameMap);
+  }
+
+  // Flagged patterns
+  const allFlags = [];
+  for (const [cohortName, codes] of Object.entries(COHORTS)) {
+    allFlags.push(...flagDimensionPatterns(cohortName, codes, scoreMap));
+  }
+  if (allFlags.length) {
+    let flagBody = `| Cohort | Kind | Dimension | Message |\n|---|---|---|---|\n`;
+    for (const f of allFlags) {
+      flagBody += `| ${f.cohort} | ${f.kind} | ${f.dimension} | ${f.message} |\n`;
+    }
+    md += section('Flagged patterns', flagBody);
+  } else {
+    md += section('Flagged patterns', '_No cohort-sanity patterns tripped heuristic thresholds._');
+  }
+
+  // Movers
+  if (movers?.length) {
+    let mvBody = `Baseline: \`${BASELINE_PATH}\`\n\n`;
+    mvBody += `| CC | Country | Prev | Current | Δ | Prev rank | Current rank |\n|---|---|---:|---:|---:|---:|---:|\n`;
+    for (const m of movers) {
+      mvBody += `| ${m.countryCode} | ${nameMap[m.countryCode] ?? m.countryCode} | ${round1(m.prevScore)} | ${round1(m.curScore)} | ${fmtDelta(round2(m.scoreDelta))} | ${m.prevRank ?? '—'} | ${m.curRank ?? '—'} |\n`;
+    }
+    md += section(`Top-${MOVERS_N} movers vs baseline`, mvBody);
+  }
+
+  md += `\n---\n\n*This audit is a release-gate diagnostic, not a merge-blocker. Rank-targeted acceptance criteria are an explicit anti-pattern — see \`docs/methodology/cohort-sanity-release-gate.md\`.*\n`;
+  return { md, failureList, missingCohortMembers, formulaMode };
+}
+
+async function main() {
+  const nameMap = await loadCountryNameMap();
+  const cohortCodeSet = new Set();
+  for (const codes of Object.values(COHORTS)) for (const cc of codes) cohortCodeSet.add(cc);
+  const requestedCohortCodes = [...cohortCodeSet].sort();
+
+  let ranking;
+  let scoreMap;
+  let failures = new Map();
+  if (FIXTURE_PATH) {
+    const raw = await fs.readFile(path.resolve(REPO_ROOT, FIXTURE_PATH), 'utf8');
+    const fixture = JSON.parse(raw);
+    ranking = fixture.ranking ?? { items: [], greyedOut: [] };
+    scoreMap = new Map(Object.entries(fixture.scores ?? {}));
+    // Fixture mode has no network calls, but a fixture may legitimately
+    // omit cohort members (for small smoke-test fixtures). Rather than
+    // silently dropping them, buildReport() derives the missing set from
+    // requestedCohortCodes, so the report banners them like live-mode failures.
+    console.error(`[audit] FIXTURE mode: ${path.resolve(REPO_ROOT, FIXTURE_PATH)} (ranked=${(ranking.items || []).length}, scores=${scoreMap.size})`);
+  } else {
+    ranking = await fetchRanking();
+    console.error(`[audit] fetching per-country scores for ${requestedCohortCodes.length} cohort members at concurrency=${CONCURRENCY}`);
+    const result = await fetchScoresConcurrent(requestedCohortCodes);
+    scoreMap = result.scores;
+    failures = result.failures;
+  }
+  const items = ranking.items ?? [];
+  items.forEach((x, i) => { x.__rank = i + 1; });
+
+  let movers = [];
+  if (BASELINE_PATH) {
+    try {
+      const raw = await fs.readFile(path.resolve(REPO_ROOT, BASELINE_PATH), 'utf8');
+      const baseline = JSON.parse(raw);
+      movers = computeMovers(items, baseline.items, MOVERS_N);
+    } catch (err) {
+      console.error(`[audit] baseline read failed: ${err.message}`);
+    }
+  }
+
+  const capturedAt = new Date().toISOString();
+  const sha = commitSha();
+  const { md, failureList, missingCohortMembers, formulaMode } = buildReport({
+    ranking, scoreMap, nameMap, movers, capturedAt, sha, failures, requestedCohortCodes,
+  });
+
+  if (OUT_PATH) {
+    await fs.mkdir(path.dirname(path.resolve(REPO_ROOT, OUT_PATH)), { recursive: true });
+    await fs.writeFile(path.resolve(REPO_ROOT, OUT_PATH), md, 'utf8');
+    console.error(`[audit] wrote ${OUT_PATH}`);
+  } else {
+    process.stdout.write(md);
+  }
+
+  // STRICT mode fails the run AFTER writing the report so operators still
+  // have the diagnostic artifact on disk. Exit codes:
+  //   3 — fetch failures or missing cohort members
+  //   4 — formula-mode change detected (pillar-combine active, decomposition invalid)
+  //   0 — all clear
+  if (STRICT) {
+    if (failureList.length || missingCohortMembers.length) {
+      console.error(`[audit] STRICT: ${failureList.length} fetch failure(s), ${missingCohortMembers.length} missing cohort member(s); exiting 3`);
+      process.exit(3);
+    }
+    if (formulaMode.pillarModeLikely) {
+      console.error(`[audit] STRICT: formula-mode mismatch detected (pillar-combine likely); contribution decomposition invalid; exiting 4`);
+      process.exit(4);
+    }
+  }
+}
+
+main().catch((err) => {
+  console.error('[audit-resilience-cohorts] failed:', err);
+  process.exit(1);
+});
--- a/tests/audit-cohort-formula-detection.test.mts
+++ b/tests/audit-cohort-formula-detection.test.mts
@@ -0,0 +1,166 @@
+// Smoke-tests for the fail-closed behaviour of
+// `scripts/audit-resilience-cohorts.mjs`. Verifies:
+//   (1) Missing cohort members produce a ⛔ banner at report top
+//       and a dedicated "Fetch failures / missing members" section.
+//   (2) STRICT=1 exits non-zero (code 3) when members are missing.
+//   (3) Formula-mode detection correctly banners (and STRICT=1 exits 4)
+//       when pillar-combine is active (Σ contributions ≠ overallScore for
+//       complete responses) and does NOT banner when contributions sum.
+//
+// The tests drive the script as a child process against synthetic
+// fixtures so they exercise the full `main()` flow (report shape,
+// exit codes, stderr logging) rather than just the pure helpers.
+
+import assert from 'node:assert/strict';
+import { describe, it } from 'node:test';
+import { spawnSync } from 'node:child_process';
+import path from 'node:path';
+import fs from 'node:fs';
+import os from 'node:os';
+import { fileURLToPath } from 'node:url';
+
+const __filename = fileURLToPath(import.meta.url);
+const REPO_ROOT = path.resolve(path.dirname(__filename), '..');
+const SCRIPT = path.join(REPO_ROOT, 'scripts', 'audit-resilience-cohorts.mjs');
+
+function writeFixture(name: string, fixture: unknown): string {
+  const tmpFile = path.join(os.tmpdir(), `audit-fixture-${name}-${process.pid}.json`);
+  fs.writeFileSync(tmpFile, JSON.stringify(fixture));
+  return tmpFile;
+}
+
+function runAudit(env: Record<string, string>): { status: number | null; stdout: string; stderr: string; report: string } {
+  const outFile = path.join(os.tmpdir(), `audit-out-${Date.now()}-${Math.random().toString(36).slice(2)}.md`);
+  const result = spawnSync('node', [SCRIPT], {
+    env: { ...process.env, OUT: outFile, ...env },
+    encoding: 'utf8',
+  });
+  let report = '';
+  try { report = fs.readFileSync(outFile, 'utf8'); } catch { /* no report written */ } finally { fs.rmSync(outFile, { force: true }); }
+  return {
+    status: result.status,
+    stdout: result.stdout ?? '',
+    stderr: result.stderr ?? '',
+    report,
+  };
+}
+
+// Complete fixture: 57 cohort members so missing-member banner does NOT fire.
+// Domain weights sum to 1.0 and coverage is 1.0 throughout.
+// Σ contributions per country should land within CONTRIBUTION_SUM_TOLERANCE of overall.
+function buildCompleteFixture(options: { pillarMode?: boolean } = {}): unknown {
+  const allCohortCodes = Array.from(new Set([
+    'AE', 'SA', 'KW', 'QA', 'OM', 'BH',
+    'FR', 'US', 'GB', 'JP', 'KR', 'DE', 'CA', 'FI', 'SE', 'BE',
+    'SG', 'MY', 'TH', 'VN', 'ID', 'PH',
+    'BR', 'MX', 'CO', 'VE', 'AR', 'EC',
+    'NG', 'ZA', 'ET', 'KE', 'GH', 'CD', 'SD',
+    'RU', 'KZ', 'AZ', 'UA', 'UZ', 'GE', 'AM',
+    'LK', 'PK', 'LB', 'TR', 'EG', 'TN',
+    'HK', 'NL', 'PA', 'LT',
+    'NO',
+    'YE', 'SY', 'SO', 'AF',
+  ]));
+
+  const buildDoc = (overallScore: number) => {
+    const dimScore = overallScore;
+    return {
+      countryCode: 'XX',
+      overallScore: options.pillarMode ? 10 : overallScore,
+      // When pillarMode=true we deliberately set overallScore to a value
+      // that won't match Σ contributions (penalizedPillarScore semantics)
+      // so the detector fires. coverage=1.0 across all dims keeps the
+      // eligibility gate satisfied.
+      level: 'moderate',
+      baselineScore: overallScore,
+      stressScore: overallScore,
+      stressFactor: 0.2,
+      domains: [
+        { id: 'economic', weight: 0.17, score: dimScore, dimensions: [
+          { id: 'macroFiscal', score: dimScore, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+        ]},
+        { id: 'infrastructure', weight: 0.15, score: dimScore, dimensions: [
+          { id: 'infrastructure', score: dimScore, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+        ]},
+        { id: 'energy', weight: 0.11, score: dimScore, dimensions: [
+          { id: 'energy', score: dimScore, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+        ]},
+        { id: 'social-governance', weight: 0.19, score: dimScore, dimensions: [
+          { id: 'governanceInstitutional', score: dimScore, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+        ]},
+        { id: 'health-food', weight: 0.13, score: dimScore, dimensions: [
+          { id: 'healthPublicService', score: dimScore, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+        ]},
+        { id: 'recovery', weight: 0.25, score: dimScore, dimensions: [
+          { id: 'externalDebtCoverage', score: dimScore, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+        ]},
+      ],
+    };
+  };
+
+  const scores: Record<string, unknown> = {};
+  for (const cc of allCohortCodes) {
+    scores[cc] = { ...(buildDoc(70) as Record<string, unknown>), countryCode: cc };
+  }
+  const items = allCohortCodes.slice(0, 6).map((cc) => ({
+    countryCode: cc, overallScore: 70, level: 'moderate', lowConfidence: false, overallCoverage: 1.0, rankStable: true,
+  }));
+  return { ranking: { items, greyedOut: [] }, scores };
+}
+
+describe('audit-resilience-cohorts fail-closed — missing cohort members', () => {
+  it('banners the report when fixture omits cohort members AND exits 3 under STRICT=1', () => {
+    // Minimal fixture intentionally omits almost every cohort member.
+    const fixture = {
+      ranking: { items: [
+        { countryCode: 'AE', overallScore: 72.72, level: 'high', lowConfidence: false, overallCoverage: 0.88, rankStable: true },
+      ], greyedOut: [] },
+      scores: {
+        AE: { countryCode: 'AE', overallScore: 72.72, level: 'high', baselineScore: 72, stressScore: 70, stressFactor: 0.15, domains: [
+          { id: 'recovery', weight: 0.25, score: 50, dimensions: [
+            { id: 'externalDebtCoverage', score: 100, coverage: 1.0, observedWeight: 1, imputedWeight: 0, imputationClass: '' },
+          ]},
+        ]},
+      },
+    };
+    const fixturePath = writeFixture('missing-members', fixture);
+    try {
+      const result = runAudit({ FIXTURE: fixturePath, STRICT: '1' });
+      assert.equal(result.status, 3, `expected STRICT exit code 3 for missing members; got ${result.status}; stderr=${result.stderr}`);
+      assert.match(result.report, /⛔ \*\*Fetch failures \/ missing cohort members/, 'expected missing-members banner at report top');
+      assert.match(result.report, /## Fetch failures \/ missing members/, 'expected dedicated Fetch-failures section');
+      assert.match(result.report, /Cohort members with no score data:/, 'expected missing-members list');
+    } finally {
+      fs.unlinkSync(fixturePath);
+    }
+  });
+
+  it('exits 0 under STRICT=1 when all cohort members present + formula matches', () => {
+    const fixture = buildCompleteFixture({ pillarMode: false });
+    const fixturePath = writeFixture('complete', fixture);
+    try {
+      const result = runAudit({ FIXTURE: fixturePath, STRICT: '1' });
+      assert.equal(result.status, 0, `expected STRICT exit 0; got ${result.status}; stderr=${result.stderr}`);
+      assert.doesNotMatch(result.report, /⛔ \*\*Fetch failures/, 'missing-members banner should NOT fire');
+      assert.doesNotMatch(result.report, /⛔ \*\*Formula mode not supported/, 'formula-mode banner should NOT fire on legacy-formula response');
+    } finally {
+      fs.unlinkSync(fixturePath);
+    }
+  });
+});
+
+describe('audit-resilience-cohorts fail-closed — formula mode', () => {
+  it('banners the report when Σ contributions diverges from overallScore AND exits 4 under STRICT=1', () => {
+    const fixture = buildCompleteFixture({ pillarMode: true });
+    const fixturePath = writeFixture('pillar-mode', fixture);
+    try {
+      const result = runAudit({ FIXTURE: fixturePath, STRICT: '1' });
+      assert.equal(result.status, 4, `expected STRICT exit code 4 for formula mismatch; got ${result.status}; stderr=${result.stderr}`);
+      assert.match(result.report, /⛔ \*\*Formula mode not supported/, 'expected formula-mode banner at report top');
+      assert.match(result.report, /PILLAR-COMBINE \(decomposition invalid\)/, 'expected formula-mode line in header');
+      assert.match(result.report, /## Formula-mode diagnostic/, 'expected dedicated formula-mode diagnostic section');
+    } finally {
+      fs.unlinkSync(fixturePath);
+    }
+  });
+});
--- a/tests/fixtures/resilience-audit-fixture.json
+++ b/tests/fixtures/resilience-audit-fixture.json
@@ -0,0 +1,125 @@
+{
+  "_comment": "Minimal synthetic fixture for scripts/audit-resilience-cohorts.mjs end-to-end dry-run. Mirrors the 2026-04-24 GCC snapshot (KW>QA>AE) so the fixture-mode run produces the observed-real-world deltas the audit is designed to surface. Values are approximate; used only for structural verification.",
+  "ranking": {
+    "items": [
+      { "countryCode": "KW", "overallScore": 79.08, "level": "high", "lowConfidence": false, "overallCoverage": 0.92, "rankStable": true },
+      { "countryCode": "QA", "overallScore": 77.06, "level": "high", "lowConfidence": false, "overallCoverage": 0.95, "rankStable": true },
+      { "countryCode": "AE", "overallScore": 72.72, "level": "high", "lowConfidence": false, "overallCoverage": 0.88, "rankStable": true },
+      { "countryCode": "SA", "overallScore": 68.04, "level": "high", "lowConfidence": false, "overallCoverage": 0.90, "rankStable": true },
+      { "countryCode": "OM", "overallScore": 65.74, "level": "moderate", "lowConfidence": false, "overallCoverage": 0.82, "rankStable": true },
+      { "countryCode": "BH", "overallScore": 61.69, "level": "moderate", "lowConfidence": false, "overallCoverage": 0.85, "rankStable": true }
+    ],
+    "greyedOut": []
+  },
+  "scores": {
+    "AE": {
+      "countryCode": "AE",
+      "overallScore": 72.72,
+      "level": "high",
+      "baselineScore": 72.0,
+      "stressScore": 70.0,
+      "stressFactor": 0.15,
+      "domains": [
+        { "id": "recovery", "score": 62, "weight": 0.25, "dimensions": [
+          { "id": "sovereignFiscalBuffer", "score": 27, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "liquidReserveAdequacy", "score": 38, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "importConcentration", "score": 50, "coverage": 0.3, "observedWeight": 0, "imputedWeight": 1, "imputationClass": "unmonitored" },
+          { "id": "externalDebtCoverage", "score": 100, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "fiscalSpace", "score": 76, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "stateContinuity", "score": 85, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "economic", "score": 74, "weight": 0.17, "dimensions": [
+          { "id": "tradeSanctions", "score": 54, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "currencyExternal", "score": 73, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "macroFiscal", "score": 80, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "energy", "score": 79, "weight": 0.11, "dimensions": [
+          { "id": "energy", "score": 79, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "health-food", "score": 62, "weight": 0.13, "dimensions": [
+          { "id": "foodWater", "score": 53, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "healthPublicService", "score": 75, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "social-governance", "score": 74, "weight": 0.19, "dimensions": [
+          { "id": "socialCohesion", "score": 70, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "governanceInstitutional", "score": 78, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "infrastructure", "score": 78, "weight": 0.15, "dimensions": [
+          { "id": "infrastructure", "score": 80, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]}
+      ]
+    },
+    "KW": {
+      "countryCode": "KW",
+      "overallScore": 79.08,
+      "level": "high",
+      "baselineScore": 79.0,
+      "stressScore": 78.0,
+      "stressFactor": 0.11,
+      "domains": [
+        { "id": "recovery", "score": 90, "weight": 0.25, "dimensions": [
+          { "id": "sovereignFiscalBuffer", "score": 98, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "liquidReserveAdequacy", "score": 72, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "importConcentration", "score": 85, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "externalDebtCoverage", "score": 100, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "fiscalSpace", "score": 98, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "stateContinuity", "score": 80, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "economic", "score": 78, "weight": 0.17, "dimensions": [
+          { "id": "tradeSanctions", "score": 82, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "currencyExternal", "score": 86, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "macroFiscal", "score": 70, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "energy", "score": 55, "weight": 0.11, "dimensions": [
+          { "id": "energy", "score": 55, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "health-food", "score": 60, "weight": 0.13, "dimensions": [
+          { "id": "foodWater", "score": 53, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "healthPublicService", "score": 72, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "social-governance", "score": 72, "weight": 0.19, "dimensions": [
+          { "id": "socialCohesion", "score": 68, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "governanceInstitutional", "score": 76, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "infrastructure", "score": 76, "weight": 0.15, "dimensions": [
+          { "id": "infrastructure", "score": 76, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]}
+      ]
+    },
+    "QA": {
+      "countryCode": "QA",
+      "overallScore": 77.06,
+      "level": "high",
+      "baselineScore": 77.0,
+      "stressScore": 76.0,
+      "stressFactor": 0.12,
+      "domains": [
+        { "id": "recovery", "score": 85, "weight": 0.25, "dimensions": [
+          { "id": "sovereignFiscalBuffer", "score": 95, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "liquidReserveAdequacy", "score": 68, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "importConcentration", "score": 70, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+          { "id": "externalDebtCoverage", "score": 100, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]},
+        { "id": "economic", "score": 78, "weight": 0.17, "dimensions": [
+          { "id": "tradeSanctions", "score": 82, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+        ]}
+      ]
+    },
+    "SA": { "countryCode": "SA", "overallScore": 68.04, "level": "high", "baselineScore": 68.0, "stressScore": 67.0, "stressFactor": 0.14, "domains": [
+      { "id": "recovery", "score": 70, "weight": 0.25, "dimensions": [
+        { "id": "sovereignFiscalBuffer", "score": 72, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" },
+        { "id": "externalDebtCoverage", "score": 100, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+      ]}
+    ]},
+    "OM": { "countryCode": "OM", "overallScore": 65.74, "level": "moderate", "baselineScore": 66.0, "stressScore": 64.0, "stressFactor": 0.16, "domains": [
+      { "id": "recovery", "score": 60, "weight": 0.25, "dimensions": [
+        { "id": "externalDebtCoverage", "score": 100, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+      ]}
+    ]},
+    "BH": { "countryCode": "BH", "overallScore": 61.69, "level": "moderate", "baselineScore": 62.0, "stressScore": 60.0, "stressFactor": 0.18, "domains": [
+      { "id": "recovery", "score": 55, "weight": 0.25, "dimensions": [
+        { "id": "externalDebtCoverage", "score": 100, "coverage": 1.0, "observedWeight": 1, "imputedWeight": 0, "imputationClass": "" }
+      ]}
+    ]}
+  }
+}
--- a/tests/resilience-construct-invariants.test.mts
+++ b/tests/resilience-construct-invariants.test.mts
@@ -0,0 +1,161 @@
+// Construct invariants — formula-level assertions with synthetic inputs.
+//
+// Purpose. Complement `resilience-dimension-monotonicity.test.mts` (which
+// pins direction) with precise ANCHOR-VALUE checks. These tests fail when
+// the scoring FORMULA breaks, not when a country's RANK changes. They are
+// deliberately country-identity-free so the audit gate (see
+// `docs/methodology/cohort-sanity-release-gate.md`) does not collapse into
+// an outcome-seeking "ENTITY A must > ENTITY B" assertion — that is the
+// anti-pattern the cohort-sanity skill explicitly warns against.
+//
+// Plan reference. PR 0 from
+// `docs/plans/2026-04-24-002-fix-resilience-cohort-ranking-structural-audit-plan.md`
+// (§"PR 0 — Release-gate audit harness"):
+//   > `score(HHI=0.05) > score(HHI=0.20)`
+//   > `score(debtToReservesRatio=0) > score(ratio=1) > score(ratio=2)`
+//   > `score(effMo=12) > score(effMo=3)`
+//   > `score(lowCarbonShare=80, fossilImportDep=0) > score(lowCarbonShare=0, fossilImportDep=100)`
+//
+// The tests are organised by scorer and pair each monotonicity claim with
+// the anchor value where the construct fixes one (Greenspan-Guidotti:
+// ratio=1 → 50; saturating transform: effMo=12 → ~63). An anchor drift of
+// 1 point or more is an invariant break: investigate before editing the
+// test. The lowCarbonShare invariant quoted above is not asserted here.
+
+import assert from 'node:assert/strict';
+import { describe, it } from 'node:test';
+
+import {
+  scoreImportConcentration,
+  scoreExternalDebtCoverage,
+  scoreSovereignFiscalBuffer,
+  type ResilienceSeedReader,
+} from '../server/worldmonitor/resilience/v1/_dimension-scorers.ts';
+
+const TEST_ISO2 = 'XX';
+
+function makeReader(keyValueMap: Record<string, unknown>): ResilienceSeedReader {
+  return async (key: string) => keyValueMap[key] ?? null;
+}
+
+describe('construct invariants — importConcentration', () => {
+  async function scoreWith(hhi: number) {
+    return scoreImportConcentration(TEST_ISO2, makeReader({
+      'resilience:recovery:import-hhi:v1': { countries: { [TEST_ISO2]: { hhi } } },
+    }));
+  }
+
+  it('score(HHI=0.05) > score(HHI=0.20)', async () => {
+    const diversified = await scoreWith(0.05);
+    const concentrated = await scoreWith(0.20);
+    assert.ok(
+      diversified.score > concentrated.score,
+      `HHI 0.05→0.20 should lower score; got ${diversified.score} → ${concentrated.score}`,
+    );
+  });
+
+  it('HHI=0 anchors at score 100 (no-concentration pole)', async () => {
+    const r = await scoreWith(0);
+    assert.ok(Math.abs(r.score - 100) < 1, `expected ~100 at HHI=0, got ${r.score}`);
+  });
+
+  it('HHI=0.5 (worst-case pole of the current 0..5000 goalpost) anchors at score 0', async () => {
+    // Current scorer: hhi×10000 normalised against (0, 5000). 0.5×10000 = 5000 → 0.
+    const r = await scoreWith(0.5);
+    assert.ok(Math.abs(r.score - 0) < 1, `expected ~0 at HHI=0.5 under current goalpost, got ${r.score}`);
+  });
+});
+
+describe('construct invariants — externalDebtCoverage (Greenspan-Guidotti anchor)', () => {
+  async function scoreWith(debtToReservesRatio: number) {
+    return scoreExternalDebtCoverage(TEST_ISO2, makeReader({
+      'resilience:recovery:external-debt:v1': {
+        countries: { [TEST_ISO2]: { debtToReservesRatio } },
+      },
+    }));
+  }
+
+  it('ratio=0 → score 100 (zero-rollover-exposure pole)', async () => {
+    const r = await scoreWith(0);
+    assert.ok(Math.abs(r.score - 100) < 1, `expected ~100 at ratio=0, got ${r.score}`);
+  });
+
+  it('ratio=1.0 → score 50 (Greenspan-Guidotti threshold)', async () => {
+    const r = await scoreWith(1.0);
+    assert.ok(
+      Math.abs(r.score - 50) < 1,
+      `expected ~50 at ratio=1.0 under Greenspan-Guidotti anchor (worst=2), got ${r.score}`,
+    );
+  });
+
+  it('ratio=2.0 → score 0 (acute rollover-shock pole)', async () => {
+    const r = await scoreWith(2.0);
+    assert.ok(Math.abs(r.score - 0) < 1, `expected ~0 at ratio=2.0, got ${r.score}`);
+  });
+
+  it('monotonic: score(ratio=0) > score(ratio=1) > score(ratio=2)', async () => {
+    const [r0, r1, r2] = await Promise.all([scoreWith(0), scoreWith(1), scoreWith(2)]);
+    assert.ok(r0.score > r1.score && r1.score > r2.score,
+      `expected strictly decreasing; got ${r0.score}, ${r1.score}, ${r2.score}`);
+  });
+});
+
+describe('construct invariants — sovereignFiscalBuffer (saturating transform)', () => {
+  // Saturating transform per scoreSovereignFiscalBuffer (scorer source, line ~1687):
+  //   score = 100 * (1 - exp(-em / 12))
+  // Reference values (not tuning points — these are what the formula SHOULD
+  // produce if no one has silently redefined it):
+  //   em=0  → 0
+  //   em=3  → 100*(1-e^-0.25) ≈ 22.1
+  //   em=12 → 100*(1-e^-1)    ≈ 63.2
+  //   em=24 → 100*(1-e^-2)    ≈ 86.5
+  //   em→∞  → 100
+
+  async function scoreWithEm(em: number) {
+    return scoreSovereignFiscalBuffer(TEST_ISO2, makeReader({
+      'resilience:recovery:sovereign-wealth:v1': {
+        countries: { [TEST_ISO2]: { totalEffectiveMonths: em, completeness: 1.0 } },
+      },
+    }));
+  }
+
+  it('em=0 → score 0 (no SWF buffer)', async () => {
+    const r = await scoreWithEm(0);
+    assert.ok(Math.abs(r.score - 0) < 1, `expected ~0 at em=0, got ${r.score}`);
+  });
+
+  it('em=12 → score ≈ 63 (one-year saturating anchor)', async () => {
+    const r = await scoreWithEm(12);
+    const expected = 100 * (1 - Math.exp(-1));
+    assert.ok(
+      Math.abs(r.score - expected) < 1,
+      `expected ~${expected.toFixed(1)} at em=12, got ${r.score}`,
+    );
+  });
+
+  it('em=24 → score ≈ 86 (two-year saturating anchor)', async () => {
+    const r = await scoreWithEm(24);
+    const expected = 100 * (1 - Math.exp(-2));
+    assert.ok(
+      Math.abs(r.score - expected) < 1,
+      `expected ~${expected.toFixed(1)} at em=24, got ${r.score}`,
+    );
+  });
+
+  it('monotonic: score(em=3) < score(em=12) < score(em=24)', async () => {
+    const [r3, r12, r24] = await Promise.all([scoreWithEm(3), scoreWithEm(12), scoreWithEm(24)]);
+    assert.ok(r3.score < r12.score && r12.score < r24.score,
+      `expected strictly increasing; got em=3:${r3.score}, em=12:${r12.score}, em=24:${r24.score}`);
+  });
+
+  it('country not in manifest → score 0, coverage 1.0 (legitimate zero, not imputed)', async () => {
+    // Seed present but country absent = "no SWF" (legitimate structural zero).
+    // This is distinct from "seed missing entirely" which returns IMPUTE.
+    const r = await scoreSovereignFiscalBuffer(TEST_ISO2, makeReader({
+      'resilience:recovery:sovereign-wealth:v1': { countries: {} },
+    }));
+    assert.equal(r.score, 0, `expected 0 when country has no manifest entry, got ${r.score}`);
+    assert.equal(r.coverage, 1.0, `expected coverage=1.0 (legitimate observation), got ${r.coverage}`);
+    assert.equal(r.imputationClass, null, `expected null imputation (not imputed), got ${r.imputationClass}`);
+  });
+});
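
For review purposes, the anchor values asserted in the test file above can be reproduced with a few closed-form reference functions. This is a minimal sketch, not the production scorers: the names `clampScore`, `lowerIsBetter`, `refImportConcentration`, `refExternalDebtCoverage`, and `refSovereignFiscalBuffer` are hypothetical, and the linear interpolation between goalposts is an assumption inferred from the three anchors (0 → 100, midpoint → 50, worst → 0). Only the saturating formula and the (0, 5000) and worst=2 goalposts come from the test comments.

```typescript
// Reference formulas only: a sketch of what the anchor values imply,
// NOT the production scorers in _dimension-scorers.ts.

// Clamp a raw value into the 0..100 score range.
function clampScore(x: number): number {
  return Math.min(100, Math.max(0, x));
}

// ASSUMPTION: a linear "lower is better" goalpost (best -> 100, worst -> 0).
function lowerIsBetter(value: number, best: number, worst: number): number {
  return clampScore((100 * (worst - value)) / (worst - best));
}

// HHI in 0..1 share form, scaled by 10000 against the (0, 5000) goalpost.
function refImportConcentration(hhi: number): number {
  return lowerIsBetter(hhi * 10000, 0, 5000);
}

// Debt-to-reserves ratio against (0, 2); ratio=1 lands on 50.
function refExternalDebtCoverage(ratio: number): number {
  return lowerIsBetter(ratio, 0, 2);
}

// Saturating transform quoted in the sovereignFiscalBuffer test comment.
function refSovereignFiscalBuffer(effectiveMonths: number): number {
  return 100 * (1 - Math.exp(-effectiveMonths / 12));
}

for (const em of [0, 3, 12, 24]) {
  console.log(`em=${em} -> ${refSovereignFiscalBuffer(em).toFixed(1)}`);
}
// prints: em=0 -> 0.0, em=3 -> 22.1, em=12 -> 63.2, em=24 -> 86.5 (one per line)
```

If a scorer is deliberately re-anchored (for example, a different HHI goalpost), the matching reference function and the test anchor would need to move together. A mismatch between the two is the signal the invariants are meant to raise.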