fix(resilience): widen Comtrade period to 4y + surface picked year (#3372)

PR 1 of cohort-audit plan 2026-04-24-002. Unblocks UAE, Oman, Bahrain
(and any other late-reporter) on the importConcentration dimension.

Problem
- seed-recovery-import-hhi.mjs queries Comtrade with `period=Y-1,Y-2`
  (currently "2025,2024"). Several reporters publish Comtrade 1-2y
  behind — their 2024/2025 rows are empty while 2023 is populated.
- With no data in the queried window, parseRecords() returned [] for
  the reporter, the seeder counted a "skip", the scorer fell through
  to IMPUTE (score=50, coverage=0.3, imputationClass="unmonitored"),
  and the cohort-sanity audit flagged AE as a coverage-outlier inside
  the GCC — exactly the class of silent gap the audit is designed to
  catch.

Fix
1. Widen the Comtrade period parameter to a 4-year window Y-1..Y-4
   via a new `buildPeriodParam(nowYear)` helper. parseRecords() keeps
   its existing rule (most partner rows wins; ties go to the newer
   year), so on-time reporters still resolve to their latest year
   whenever it is at least as complete as the older ones; late
   reporters now pick up whatever year they actually published in
   (2023 for UAE, etc.).
2. parseRecords() now returns { rows, year } — the year surfaces in
   the per-country payload as `year: number | null` for operator
   freshness audit. The scorer already expects this shape
   (_dimension-scorers.ts:1524 RecoveryImportHhiCountry.year); this
   PR actually populates it.
3. `buildPeriodParam` + `parseRecords` are exported so their unit
   tests can pin year-selection behaviour without hitting Comtrade.
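
The year-selection rule described in the Fix list can be sketched as
follows. This is a minimal illustration, not the shipped code: the
helper name `pickBestPeriod` is hypothetical, the real loop lives
inside parseRecords() and is only partly visible in the diff, and the
world-aggregate filtering exercised by the tests is left out here.

```javascript
// Hypothetical helper illustrating "most partner rows wins; ties go to
// the newer year". Not the shipped implementation.
function pickBestPeriod(records) {
  const byPeriod = new Map();
  for (const r of records ?? []) {
    // Drop missing rows and rows with a zero or negative trade value.
    if (!r || !(Number(r.primaryValue ?? 0) > 0)) continue;
    const p = String(r.period ?? '0');
    if (!byPeriod.has(p)) byPeriod.set(p, []);
    byPeriod.get(p).push(r);
  }

  let bestPeriod = null;
  let bestCount = -1;
  for (const [p, rows] of byPeriod) {
    const newer = bestPeriod !== null && Number(p) > Number(bestPeriod);
    // Completeness first; recency only breaks exact ties.
    if (rows.length > bestCount || (rows.length === bestCount && newer)) {
      bestPeriod = p;
      bestCount = rows.length;
    }
  }

  if (bestPeriod === null) return { rows: [], year: null };
  const yearNum = Number(bestPeriod);
  return {
    rows: byPeriod.get(bestPeriod),
    year: Number.isFinite(yearNum) ? yearNum : null,
  };
}
```

For a late reporter whose only populated year inside the window is
2023, the sketch returns `year: 2023` instead of an empty result.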

Note on PR 2 of the same plan
The plan calls out "PR 2 — externalDebtCoverage re-goalpost to
Greenspan-Guidotti" as unshipped. It IS shipped: commit 7f78a7561
"PR 3 §3.5 point 3 — re-goalpost externalDebtCoverage (0..5 → 0..2)"
landed under the prior workstream 2026-04-22-001. The new construct
invariants in tests/resilience-construct-invariants.test.mts
(shipped in PR 0 / #3369) confirm score(ratio=0)=100, score(1)=50,
score(2)=0 against current main. PR 2 of the cohort-audit plan is a
no-op; I'll flag this on the plan review thread rather than bundle
a plan edit into this PR.
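
For reference, the three invariant points quoted above are consistent
with a clamped linear goalpost over 0..2. The sketch below assumes
linear interpolation; the function name `scoreLowerBetter` is
hypothetical and the real scorer in _dimension-scorers.ts may be
implemented differently.

```javascript
// Hypothetical "lower is better" goalpost: `best` maps to 100 and
// `worst` maps to 0, with linear interpolation and clamping in between.
function scoreLowerBetter(value, best, worst) {
  const t = (value - best) / (worst - best);
  const clamped = Math.min(1, Math.max(0, t));
  return 100 * (1 - clamped);
}

// The re-goalposted externalDebtCoverage range is 0..2.
const atZero = scoreLowerBetter(0, 0, 2);
const atOne = scoreLowerBetter(1, 0, 2);
const atTwo = scoreLowerBetter(2, 0, 2);
```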

Verified
- `npx tsx --test tests/seed-recovery-import-hhi.test.mjs` — 19 pass
  (10 existing + 9 new: buildPeriodParam shape; parseRecords picks
  completeness-tiebreak, newer-year-on-ties, late-reporter fallback;
  empty/negative/world-aggregate handling)
- `npx tsx --test tests/seed-comtrade-5xx-retry.test.mjs` — green
  (the `{ records, status }` destructure pattern at the caller still
  works; the new third field `year` is additive)
- `npm run test:data` — 6703 pass / 0 fail
- `npm run typecheck` / `typecheck:api` — green
- `npm run lint` / `lint:md` — no new warnings
- No cache-prefix bump: the payload shape only ADDS an optional
  field, so existing readers stay valid against old snapshots.

Acceptance per plan
- Construct invariant: score(HHI=0.05) > score(HHI=0.20) — already
  covered in tests/resilience-construct-invariants.test.mts (PR #3369)
- Monotonicity pin: score(hhi=0.15) > score(hhi=0.45) — already
  covered in tests/resilience-dimension-monotonicity.test.mts
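
To make the HHI values in these invariants concrete: HHI is the sum of
squared partner shares, so 20 equal partners give roughly 0.05 and 5
equal partners give roughly 0.20. The helper below is a hypothetical
sketch for illustration only; it is not the `computeHhi` exported by
the seeder, whose body is not shown in this diff.

```javascript
// Hypothetical helper: Herfindahl-Hirschman index over positive
// partner import values. Returns null when there is nothing to sum.
function hhiFromValues(values) {
  const positive = values.filter((v) => v > 0);
  const total = positive.reduce((sum, v) => sum + v, 0);
  if (total === 0) return null;
  return positive.reduce((sum, v) => sum + (v / total) ** 2, 0);
}

const diversified = hhiFromValues(Array(20).fill(100)); // about 0.05
const concentrated = hhiFromValues(Array(5).fill(100)); // about 0.20
const singlePartner = hhiFromValues([1000]);            // exactly 1
```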

Post-deploy verification
After the next Railway seed-bundle-resilience-recovery cron tick,
confirm AE/OM/BH appear in `resilience:recovery:import-hhi:v1`
with non-null hhi and `year` = 2023 (or their actual latest year).
Then re-run the cohort audit — the GCC coverage-outlier flag on
AE.importConcentration should disappear.
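
The check above could be scripted against a fetched snapshot along
these lines. This is a sketch under assumptions: the top-level
`countries` map keyed by ISO2 is a guess at the payload shape, the
function name `checkLateReporters` is hypothetical, and fetching the
snapshot from the cache is left out.

```javascript
// Hypothetical post-deploy check. Assumes the payload looks like
// { countries: { AE: { hhi, year, ... }, ... } }.
function checkLateReporters(payload, iso2List, nowYear) {
  return iso2List.map((iso2) => {
    const entry = payload?.countries?.[iso2];
    // A reporter passes when it has a numeric hhi and an integer year.
    const ok = entry != null
      && Number.isFinite(entry.hhi)
      && Number.isInteger(entry.year);
    return {
      iso2,
      ok,
      year: ok ? entry.year : null,
      lagYears: ok ? nowYear - entry.year : null,
    };
  });
}

// Illustrative snapshot with made-up values, not real Comtrade data.
const sample = { countries: { AE: { hhi: 0.08, year: 2023 } } };
const report = checkLateReporters(sample, ['AE', 'OM', 'BH'], 2026);
```

Any entry with `ok: false` is still missing from the seed; `lagYears`
shows how far behind each reporter's latest published year is.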
Elie Habib
2026-04-24 18:13:41 +04:00
committed by GitHub
parent df392b0514
commit 0081da4148
2 changed files with 141 additions and 9 deletions

@@ -52,11 +52,23 @@ for (const [iso2, code] of Object.entries(COMTRADE_REPORTER_OVERRIDES)) {
 const ALL_REPORTERS = Object.values(UN_TO_ISO2).filter(c => c.length === 2);
-function parseRecords(data) {
+// Parse Comtrade imports into partner-value rows for HHI. Picks the
+// "best" year per reporter using a freshness-weighted rule:
+// (a) prefer years with more partner rows (proxy for data completeness);
+// (b) on ties, prefer the most recent year (newer data wins).
+//
+// PR 1 of plan 2026-04-24-002: period window is 4y (Y-1..Y-4). Late-
+// reporters like UAE, Oman, Bahrain publish Comtrade 1-2y behind; with
+// the original Y-1..Y-2 window their per-reporter query returned an
+// empty set and they fell through to IMPUTED on importConcentration.
+// The 4y window gives us a chance to pick a reporter's latest
+// non-empty year without degrading the result for on-time reporters
+// (they still get their newest year on the completeness tiebreak).
+export function parseRecords(data) {
   const records = data?.data ?? [];
-  if (!Array.isArray(records)) return [];
+  if (!Array.isArray(records)) return { rows: [], year: null };
   const valid = records.filter(r => r && Number(r.primaryValue ?? 0) > 0);
-  if (valid.length === 0) return [];
+  if (valid.length === 0) return { rows: [], year: null };
   const byPeriod = new Map();
   for (const r of valid) {
     const p = String(r.period ?? r.refPeriodId ?? '0');
@@ -75,10 +87,12 @@ function parseRecords(data) {
       bestPeriod = p;
     }
   }
-  return byPeriod.get(bestPeriod).map(r => ({
+  const rows = byPeriod.get(bestPeriod).map(r => ({
     partnerCode: String(r.partnerCode ?? r.partner2Code ?? '000'),
     primaryValue: Number(r.primaryValue ?? 0),
   }));
+  const yearNum = Number(bestPeriod);
+  return { rows, year: Number.isFinite(yearNum) ? yearNum : null };
 }
 // Comtrade transient 5xx (500/502/503/504) must be retried or the reporter
@@ -94,12 +108,24 @@ export function isTransientComtrade(status) {
 let _retrySleep = sleep;
 export function __setSleepForTests(fn) { _retrySleep = typeof fn === 'function' ? fn : sleep; }
+// 4-year period window. Plan 2026-04-24-002 §PR 1: late-reporters
+// (UAE, Oman, Bahrain and others) publish Comtrade 1-2y behind G7, so
+// a Y-1..Y-2 window silently drops them. Y-1..Y-4 keeps on-time
+// reporters' latest-year data AND picks up late reporters' most
+// recent published year.
+const PERIOD_WINDOW_YEARS = 4;
+export function buildPeriodParam(nowYear = new Date().getFullYear()) {
+  const years = [];
+  for (let i = 1; i <= PERIOD_WINDOW_YEARS; i++) years.push(nowYear - i);
+  return years.join(',');
+}
 export async function fetchImportsForReporter(reporterCode, apiKey) {
   const url = new URL(COMTRADE_URL);
   url.searchParams.set('reporterCode', reporterCode);
   url.searchParams.set('flowCode', 'M');
   url.searchParams.set('cmdCode', 'TOTAL');
-  url.searchParams.set('period', `${new Date().getFullYear() - 1},${new Date().getFullYear() - 2}`);
+  url.searchParams.set('period', buildPeriodParam());
   url.searchParams.set('subscription-key', apiKey);
   async function once() {
@@ -135,8 +161,9 @@ export async function fetchImportsForReporter(reporterCode, apiKey) {
     break;
   }
-  if (!resp.ok) return { records: [], status: resp.status };
-  return { records: parseRecords(await resp.json()), status: resp.status };
+  if (!resp.ok) return { records: [], year: null, status: resp.status };
+  const { rows, year } = parseRecords(await resp.json());
+  return { records: rows, year, status: resp.status };
 }
 export function computeHhi(records) {
@@ -184,7 +211,7 @@ async function runWorker(apiKey, queue, countries, progressRef) {
     if (!unCode) { progressRef.skipped++; continue; }
     try {
-      const { records, status } = await fetchImportsForReporter(unCode, apiKey);
+      const { records, year, status } = await fetchImportsForReporter(unCode, apiKey);
       if (records.length === 0) {
         if (status && status !== 200) progressRef.errors++;
         progressRef.skipped++;
@@ -197,6 +224,11 @@ async function runWorker(apiKey, queue, countries, progressRef) {
         hhi: result.hhi,
         concentrated: result.hhi > 0.25,
         partnerCount: result.partnerCount,
+        // `year` is the reporter's latest non-empty Comtrade year inside
+        // the 4y window. Publication-lag auditors (operators + the
+        // cohort-sanity audit at scripts/audit-resilience-cohorts.mjs)
+        // read this to see which reporters are 2-3y stale vs current.
+        year,
         fetchedAt: new Date().toISOString(),
       };
       progressRef.fetched++;

@@ -1,7 +1,7 @@
 import { describe, it } from 'node:test';
 import assert from 'node:assert/strict';
-import { computeHhi } from '../scripts/seed-recovery-import-hhi.mjs';
+import { computeHhi, buildPeriodParam, parseRecords } from '../scripts/seed-recovery-import-hhi.mjs';
 describe('seed-recovery-import-hhi', () => {
   it('computes HHI=1 for single-partner imports', () => {
@@ -107,3 +107,103 @@ describe('seed-recovery-import-hhi', () => {
     assert.equal(result.partnerCount, 2);
   });
 });
+// PR 1 of plan 2026-04-24-002: 4-year period window + pick-latest-per-reporter
+// to unblock late-reporters (UAE, OM, BH) who publish Comtrade 1-2y behind.
+describe('seed-recovery-import-hhi — period window + pick-latest', () => {
+  describe('buildPeriodParam', () => {
+    it('emits a 4-year window descending from Y-1 to Y-4', () => {
+      assert.equal(buildPeriodParam(2026), '2025,2024,2023,2022');
+    });
+    it('defaults to the current system year when no arg passed', () => {
+      const nowYear = new Date().getFullYear();
+      const produced = buildPeriodParam();
+      const parts = produced.split(',').map(Number);
+      assert.equal(parts.length, 4, 'must always produce exactly 4 years');
+      assert.equal(parts[0], nowYear - 1, 'first year is Y-1 relative to system clock');
+      assert.equal(parts[3], nowYear - 4, 'last year is Y-4');
+    });
+    it('never emits the current year (Comtrade is always behind by at least 1y)', () => {
+      const produced = buildPeriodParam(2026).split(',').map(Number);
+      assert.ok(!produced.includes(2026), `${produced} must not include the current year`);
+    });
+  });
+  describe('parseRecords — picks year with most partners', () => {
+    it('picks the year with the most partner rows (completeness tiebreak)', () => {
+      const data = { data: [
+        // 2023 has 3 partners → fewer than 2024
+        { period: 2023, partnerCode: '156', primaryValue: 100 },
+        { period: 2023, partnerCode: '842', primaryValue: 100 },
+        { period: 2023, partnerCode: '276', primaryValue: 100 },
+        // 2024 has 5 partners → winner on completeness
+        { period: 2024, partnerCode: '156', primaryValue: 100 },
+        { period: 2024, partnerCode: '842', primaryValue: 100 },
+        { period: 2024, partnerCode: '276', primaryValue: 100 },
+        { period: 2024, partnerCode: '392', primaryValue: 100 },
+        { period: 2024, partnerCode: '410', primaryValue: 100 },
+      ]};
+      const { rows, year } = parseRecords(data);
+      assert.equal(year, 2024, 'should pick 2024 (more partners)');
+      assert.equal(rows.length, 5, 'should return the 2024 rows only');
+    });
+    it('picks the most recent year when partner counts tie', () => {
+      const data = { data: [
+        { period: 2022, partnerCode: '156', primaryValue: 100 },
+        { period: 2022, partnerCode: '842', primaryValue: 100 },
+        { period: 2023, partnerCode: '156', primaryValue: 100 },
+        { period: 2023, partnerCode: '842', primaryValue: 100 },
+      ]};
+      const { rows, year } = parseRecords(data);
+      assert.equal(year, 2023, 'should pick the newer year on ties');
+      assert.equal(rows.length, 2);
+    });
+    it('picks the only populated year for late-reporters (the UAE/OM/BH scenario)', () => {
+      // UAE pattern: Comtrade has 2023 data but 2024/2025 rows are empty.
+      const data = { data: [
+        { period: 2023, partnerCode: '156', primaryValue: 500 },
+        { period: 2023, partnerCode: '842', primaryValue: 500 },
+        { period: 2023, partnerCode: '276', primaryValue: 500 },
+        // No 2024/2025 rows — this is what the API returns for a late reporter.
+      ]};
+      const { rows, year } = parseRecords(data);
+      assert.equal(year, 2023, 'must surface 2023 as the latest non-empty year');
+      assert.equal(rows.length, 3, 'must return all 2023 rows intact');
+    });
+    it('returns { rows: [], year: null } on empty input (no IMPUTE surface)', () => {
+      assert.deepEqual(parseRecords({ data: [] }), { rows: [], year: null });
+      assert.deepEqual(parseRecords({}), { rows: [], year: null });
+      assert.deepEqual(parseRecords(null), { rows: [], year: null });
+    });
+    it('ignores rows with primaryValue <= 0', () => {
+      const data = { data: [
+        { period: 2024, partnerCode: '156', primaryValue: 0 },
+        { period: 2024, partnerCode: '842', primaryValue: -100 },
+        { period: 2023, partnerCode: '156', primaryValue: 500 },
+      ]};
+      const { rows, year } = parseRecords(data);
+      assert.equal(year, 2023, 'only 2023 has a positive-value row');
+      assert.equal(rows.length, 1);
+    });
+    it('ignores world-aggregate partner codes (0, 000) in the completeness count', () => {
+      // 2024 has one real partner + two world-aggregate rows (3 total rows,
+      // but only 1 "usable"); 2023 has two real partners (2 usable). 2023 wins.
+      const data = { data: [
+        { period: 2024, partnerCode: '0', primaryValue: 1000 },
+        { period: 2024, partnerCode: '000', primaryValue: 1000 },
+        { period: 2024, partnerCode: '156', primaryValue: 500 },
+        { period: 2023, partnerCode: '156', primaryValue: 500 },
+        { period: 2023, partnerCode: '842', primaryValue: 500 },
+      ]};
+      const { year } = parseRecords(data);
+      assert.equal(year, 2023, 'completeness count must exclude world-aggregates');
+    });
+  });
+});