mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(seed): BUNDLE_RUN_STARTED_AT_MS env + runSeed SIGTERM cleanup
Prereq for the re-export-share Comtrade seeder (plan 2026-04-24-003),
usable by any cohort seeder whose consumer needs bundle-level freshness.
Two coupled changes:
1. `_bundle-runner.mjs` injects `BUNDLE_RUN_STARTED_AT_MS` into every
spawned child. All siblings in a single bundle run share one value
(captured at `runBundle` start, not spawn time). Consumers use this
to detect stale peer keys — if a peer's seed-meta predates the
current bundle run, fall back to a hard default rather than read
a cohort-peer's last-week output.
2. `_seed-utils.mjs::runSeed` registers a `process.once('SIGTERM')`
handler that releases the acquired lock and extends existing-data
TTL before exiting 143. `_bundle-runner.mjs` sends SIGTERM on
section timeout, then SIGKILL after KILL_GRACE_MS (5s). Without
this handler the `finally` path never runs on SIGKILL, leaving
the 30-min acquireLock reservation in place until its own TTL
expires — the next cron tick silently skips the resource.
Regression guard memory: `bundle-runner-sigkill-leaks-child-lock` (PR
#3128 root cause).
Tests added:
- bundle-runner env injection (value within run bounds)
- sibling sections share the same timestamp (critical for the
consumer freshness guard)
- runSeed SIGTERM path: exit 143 + cleanup log
- process.once contract: second SIGTERM does not re-enter handler
* fix(seed): address P1/P2 review findings on SIGTERM + bundle contracts
Addresses PR #3384 review findings (todos 256, 257, 259, 260):
#256 (P1) — SIGTERM handler narrowed to fetch phase only. Was installed
at runSeed entry and armed through every `process.exit` path; could
race `emptyDataIsFailure: true` strict-floor exits (IMF-External,
WB-bulk) and extend seed-meta TTL when the contract forbids it —
silently re-masking 30-day outages. Now the handler is attached
immediately before `withRetry(fetchFn)` and removed in a try/finally
that covers all fetch-phase exit branches.
#257 (P1) — `BUNDLE_RUN_STARTED_AT_MS` now has a first-class helper.
Exported `getBundleRunStartedAtMs()` from `_seed-utils.mjs` with JSDoc
describing the bundle-freshness contract. Fleet-wide helper so the
next consumer seeder imports instead of rediscovering the idiom.
#259 (P2) — SIGTERM cleanup runs `Promise.allSettled` on disjoint-key
ops (`releaseLock` + `extendExistingTtl`). Serialising compounded
Upstash latency during the exact failure mode (Redis degraded) this
handler exists to handle, risking breach of the 5s SIGKILL grace.
#260 (P2) — `_bundle-runner.mjs` asserts topological order on
optional `dependsOn` section field. Throws on unknown-label refs and
on deps appearing at a later index. Fleet-wide contract replacing
the previous prose-comment ordering guarantee.
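The ordering assertion in #260 can be sketched as follows. This is a minimal illustration, not the actual `_bundle-runner.mjs` code: the `BundleSection` shape and the function name `assertTopologicalOrder` are assumptions, and only the `label` / `dependsOn` fields and the two failure modes (unknown label, dependency at a later index) come from the description above.

```typescript
// Hypothetical section shape: only `label` and the optional `dependsOn`
// field are taken from the commit message.
interface BundleSection {
  label: string;
  dependsOn?: string[];
}

// Throws if a dependency names an unknown label, or if the dependency is
// listed at the same or a later index than the section that needs it.
function assertTopologicalOrder(sections: BundleSection[]): void {
  const indexByLabel = new Map<string, number>();
  sections.forEach((section, index) => indexByLabel.set(section.label, index));

  sections.forEach((section, index) => {
    for (const dep of section.dependsOn ?? []) {
      const depIndex = indexByLabel.get(dep);
      if (depIndex === undefined) {
        throw new Error(`${section.label}: dependsOn references unknown label "${dep}"`);
      }
      if (depIndex >= index) {
        throw new Error(`${section.label}: dependency "${dep}" must appear earlier in the bundle`);
      }
    }
  });
}
```

Checking at load time, before any child is spawned, means a mis-ordered bundle spec fails fast instead of producing a stale peer read later in the run.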
Tests added/updated:
- New: SIGTERM handler removed after fetchFn completes (narrowed-scope
contract — post-fetch SIGTERM must NOT trigger TTL extension)
- New: dependsOn unknown-label + out-of-order + happy-path (3 tests)
Full test suite: 6,866 tests pass (+4 net).
* fix(seed): getBundleRunStartedAtMs returns null outside a bundle run
Review follow-up: the earlier `Math.floor(Date.now()/1000)*1000` fallback
regressed standalone (non-bundle) runs. A consumer seeder invoked
manually just after its peer wrote `fetchedAt = (now - 5s)` would see
`bundleStartMs = Date.now()`, reject the perfectly-fresh peer envelope
as "stale", and fall back to defaults — defeating the point of the
peer-read path outside the bundle.
Returning null when `BUNDLE_RUN_STARTED_AT_MS` is unset/invalid keeps
the freshness gate scoped to its real purpose (across-bundle-tick
staleness) and lets standalone runs skip the gate entirely. Consumers
check `bundleStartMs != null` before applying the comparison; see the
companion `seed-sovereign-wealth.mjs` change on the stacked PR.
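A minimal sketch of the null-returning behavior described above. The real helper is exported from `_seed-utils.mjs` and its exact signature is not shown here; passing the environment in as a parameter is an assumption made so the example is self-contained.

```typescript
// Sketch only: taking `env` as a parameter is an assumption for testability.
function getBundleRunStartedAtMs(env: Record<string, string | undefined>): number | null {
  const raw = env.BUNDLE_RUN_STARTED_AT_MS;
  // Unset or blank: this is a standalone run, so there is no bundle start to compare to.
  if (raw === undefined || raw.trim() === '') return null;
  const parsed = Number(raw);
  // Reject NaN, Infinity, zero and negative values as invalid.
  return isFinite(parsed) && parsed > 0 ? parsed : null;
}

// Inside a bundle run the injected value is parsed as epoch milliseconds.
const exampleBundleStartMs = getBundleRunStartedAtMs({ BUNDLE_RUN_STARTED_AT_MS: '1745500000000' });
// Outside a bundle run the helper yields null and the consumer skips the gate.
const exampleStandaloneMs = getBundleRunStartedAtMs({});
```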
* test(seed): SIGTERM cleanup test now verifies Redis DEL + EXPIRE calls
Greptile review P2 on PR #3384: the existing test only asserted exit
code + log line, not that the Redis ops were actually issued. The
log claim was ahead of the test.
Fixture now logs every Upstash fetch call's shape (EVAL / pipeline-
EXPIRE / other) to stderr. Test asserts:
- >=1 EVAL op was issued during SIGTERM cleanup (releaseLock Lua
script on the lock key)
- >=1 pipeline-EXPIRE op was issued (extendExistingTtl on canonical
+ seed-meta keys)
- The EVAL body carries the runSeed-generated runId (proves it's
THIS run's release, not a phantom op)
- The EXPIRE pipeline touches both the canonicalKey AND the
seed-meta key (proves the keys[] array was built correctly
including the extraKeys merge path)
Full test suite: 6,866 tests pass, typecheck clean.
* feat(resilience): Comtrade-backed re-export-share seeder + SWF Redis read
Plan ref: docs/plans/2026-04-24-003-feat-reexport-share-comtrade-seeder-plan.md
Motivating case. Before this PR, the SWF `rawMonths` denominator for
the `sovereignFiscalBuffer` dimension used GROSS annual imports for
every country. For re-export hubs (goods transiting without domestic
settlement), this structurally under-reports resilience: UAE's 2023
$941B of imports include $334B of transit flow that never represents
domestic consumption. Net imports = gross × (1 − reexport_share).
The previous (PR 3A) design flattened a hand-curated YAML into Redis;
the YAML shipped empty and never populated, so the correction never
applied and the cohort audit showed no movement.
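The UAE figures above work out as follows. The helper name echoes the `computeNetImports` mentioned in the test comments, but its signature here is an assumption for illustration.

```typescript
// Net imports = gross × (1 − reexport_share). Signature is assumed.
function computeNetImports(grossImportsUsd: number, reexportShare: number): number {
  return grossImportsUsd * (1 - reexportShare);
}

const uaeGrossUsd = 941e9;   // 2023 gross imports, from the text above
const uaeTransitUsd = 334e9; // transit (re-export) flow, from the text above
const uaeShare = uaeTransitUsd / uaeGrossUsd;                  // roughly 0.355
const uaeNetUsd = computeNetImports(uaeGrossUsd, uaeShare);    // roughly 607e9
```

With the share at about 0.355, roughly $607B of the $941B remains as the denominator, so the gross figure overstates the import base by about half of the net value.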
Gap #2 (this PR). Two coupled changes to make the correction actually
apply:
1. Comtrade-backed seeder (`scripts/seed-recovery-reexport-share.mjs`).
Rewritten to fetch UN Comtrade `flowCode=RX` (re-exports) and
`flowCode=M` (imports) per cohort member, compute share = RX/M at
the latest co-populated year, bound it to [0.05, 0.95] (shares below the floor are dropped, shares above the ceiling are capped), publish the
envelope. Header auth (`Ocp-Apim-Subscription-Key`) — subscription
key never reaches URL/logs/Redis. `maxRecords=250000` cap with
truncation detection. Sequential + retry-on-429 with backoff.
Hub cohort resolved by Phase 0 empirical probe (plan §Phase 0):
['AE', 'PA']. Six candidates (SG/HK/NL/BE/MY/LT) return HTTP 200
with zero RX rows — Comtrade doesn't expose RX for those reporters.
2. SWF seeder reads from Redis (`scripts/seed-sovereign-wealth.mjs`).
Swaps `loadReexportShareByCountry()` (YAML) for
`loadReexportShareFromRedis()` (Redis key written by #1). Guarded
by bundle-run freshness: if the sibling Reexport-Share seeder's
`seed-meta` predates `BUNDLE_RUN_STARTED_AT_MS` (set by the
prereq PR's `_bundle-runner.mjs` env-injection), HARD fallback
to gross imports rather than apply last-month's stale share.
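The share derivation in step 1 might look roughly like the sketch below. It is written to agree with the assertions in the accompanying test file (latest co-populated year, null on zero imports, sub-floor shares dropped, ceiling at 0.95); the `PickedShare` type and the function bodies are assumptions, not the seeder's actual source.

```typescript
interface PickedShare {
  year: number;
  share: number;
  reexportsUsd: number;
  importsUsd: number;
}

// Pick the newest year present in BOTH maps and return RX / M for it.
function computeShareFromFlows(
  rx: Map<number, number>,
  m: Map<number, number>,
): PickedShare | null {
  const years = Array.from(rx.keys()).filter((year) => m.has(year));
  if (years.length === 0) return null;
  const year = Math.max(...years);
  const reexportsUsd = rx.get(year) as number;
  const importsUsd = m.get(year) as number;
  if (!(importsUsd > 0)) return null; // guards the division
  return { year, share: reexportsUsd / importsUsd, reexportsUsd, importsUsd };
}

// Below the 0.05 floor (or non-finite): no usable share. Above 0.95: capped.
function clampShare(share: number): number | null {
  if (!isFinite(share) || share < 0.05) return null;
  return Math.min(share, 0.95);
}
```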
Health registries. Both new keys registered in BOTH `api/health.js`
SEED_META (60-day alert threshold) and `api/seed-health.js`
SEED_DOMAINS (43200min interval), per `feedback_two_health_endpoints_must_match`.
Bundle wiring. `seed-bundle-resilience-recovery` Reexport-Share
timeout bumped 60s → 300s (Comtrade + retry can take 2-3 min
worst-case). Ordering preserved: Reexport-Share before Sovereign-
Wealth so the SWF seeder reads a freshly-written key in the same
cron tick.
Deletions. YAML + loader + 7 obsolete loader tests removed; single
source of truth is now Comtrade → Redis.
Prereq. Stacks on PR #3384 (feat/bundle-runner-env-sigterm)
which adds BUNDLE_RUN_STARTED_AT_MS env injection + runSeed
SIGTERM cleanup. This PR's bundle-freshness guard depends on
that env variable.
Tests (19 new, 7 deleted, +12 net):
- Pure math: parseComtradeFlowResponse, computeShareFromFlows,
clampShare, declareRecords + credential-leak source scan (15)
- Integration (Gap #2 regression guards): SWF seeder
`loadReexportShareFromRedis` — fresh/absent/malformed/stale-meta/missing-meta (5)
- Health registry dual-registry drift guard — scoped to this PR's
keys, respecting pre-existing asymmetry (4)
- Bundle-ordering + timeout assertions (2)
Phase 0 cohort validation committed to plan. Full test suite
passes: 6,881 tests.
* fix(resilience): address P1/P2 review findings — adopt shared helpers, pin freshness boundary
Addresses PR #3385 review findings:
#257 (P1) consumer — `seed-sovereign-wealth.mjs` imports the shared
`getBundleRunStartedAtMs` helper from `_seed-utils.mjs` (added in the
prereq commit) instead of its own `getBundleStartMs`. Single source of
truth for the bundle-freshness contract.
#258 (P2) — `seed-recovery-reexport-share.mjs` isMain guard uses the
canonical `pathToFileURL(process.argv[1]).href === import.meta.url`
form instead of basename-suffix matching. Handles symlinks, case-
different paths on macOS HFS+, and Windows path separators without
string munging.
#260 (P2) consumer — Sovereign-Wealth declares `dependsOn:
['Reexport-Share']` in the bundle spec. `_bundle-runner.mjs` (prereq
commit) now enforces topological order on load and throws on
violation — replaces the previous prose-comment ordering contract.
#261 (P2) — added a test to
`tests/seed-sovereign-wealth-reads-redis-reexport-share.test.mts`
pinning the inclusive-boundary semantic:
`fetchedAtMs === bundleStartMs` must be treated as FRESH. Guards
against a future refactor of the staleness check to `<=`, which would silently reject peers
writing at the very first millisecond of the bundle run.
Rebased onto updated prereq. Full test suite: 6,886 tests pass (+5 net).
* fix(resilience): freshness gate skipped in standalone mode; meta still required
Review catch: the previous `bundleStartMs = Date.now()` fallback made
standalone/manual `seed-sovereign-wealth.mjs` runs ALWAYS reject any
previously-seeded re-export-share meta as "stale" — even when the
operator ran the Reexport seeder milliseconds beforehand. Defeated
the point of the peer-read path outside the bundle.
With `getBundleRunStartedAtMs()` now returning null outside a bundle
(companion commit on the prereq branch), the consumer only applies
the freshness gate when `bundleStartMs != null`. Standalone runs
accept any `fetchedAt` — the operator is responsible for ordering.
Two guards survive the change:
- Meta MUST exist (absence = peer-outage fail-safe, both modes)
- In-bundle: meta MUST be at or after `BUNDLE_RUN_STARTED_AT_MS`
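The two surviving guards can be expressed as a small decision function. This is an illustrative sketch only: `decidePeerFreshness` and the `PeerDecision` labels are hypothetical names, not part of `seed-sovereign-wealth.mjs`.

```typescript
// Hypothetical decision helper; names are illustrative, not the seeder's API.
type PeerDecision = 'use-peer-share' | 'fallback-to-gross-imports';

function decidePeerFreshness(
  metaFetchedAtMs: number | null, // null = seed-meta missing
  bundleStartMs: number | null,   // null = standalone run (gate skipped)
): PeerDecision {
  // Guard 1 (both modes): missing meta means the peer is treated as down.
  if (metaFetchedAtMs === null) return 'fallback-to-gross-imports';
  // Guard 2 (in-bundle only): meta strictly older than the bundle start is stale.
  // Equality counts as fresh (inclusive boundary).
  if (bundleStartMs !== null && metaFetchedAtMs < bundleStartMs) {
    return 'fallback-to-gross-imports';
  }
  return 'use-peer-share';
}
```

In standalone mode the second guard never fires, so meta written ten minutes earlier is accepted, while missing meta still forces the fallback.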
Two new tests pin both modes:
- standalone: accepts meta written 10 min before this process started
- standalone: still rejects missing meta (peer-outage fail-safe
survives gate bypass)
Rebased onto updated prereq. Full test suite: 6,888 tests (+2 net).
* fix(resilience): filter world-aggregate Comtrade rows + skip final-retry sleep
Greptile review of PR #3385 flagged two P2s in the Comtrade seeder.
Finding #3 (parseComtradeFlowResponse double-count risk):
`cmdCode=TOTAL` without a partner filter currently returns only
world-aggregate rows in practice — but `parseComtradeFlowResponse`
summed every row unconditionally. A future refactor adding per-
partner querying would silently double-count (world-aggregate row +
partner-level rows for the same year), cutting the derived share in
half with no test signal.
Fix: explicit `partnerCode ∈ {'0', 0, null/undefined}` filter. Matches
current empirical behavior (aggregate-only responses) and makes the
construct robust to a future partner-level query.
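A sketch of what the filtered parser could look like, consistent with the test expectations later in this file. The `ComtradeRow` type and the exact validation of `primaryValue` are assumptions; only the `partnerCode` filter, the `refPeriodId` fallback, and the per-year summation come from the descriptions here.

```typescript
// Assumed row shape; real Comtrade rows carry many more fields.
interface ComtradeRow {
  period?: number;
  refPeriodId?: number;
  partnerCode?: string | number | null;
  primaryValue?: unknown;
}

function parseComtradeFlowResponse(rows: ComtradeRow[]): Map<number, number> {
  const totals = new Map<number, number>();
  for (const row of rows) {
    const partner = row.partnerCode;
    // Keep only world-aggregate rows: '0', 0, or no partnerCode at all.
    const isWorld = partner === undefined || partner === null || partner === 0 || partner === '0';
    if (!isWorld) continue;

    const year = Number(row.period ?? row.refPeriodId);
    if (!isFinite(year)) continue;

    const value = row.primaryValue;
    // Skip zero, negative and non-numeric values.
    if (typeof value !== 'number' || !isFinite(value) || value <= 0) continue;

    totals.set(year, (totals.get(year) ?? 0) + value);
  }
  return totals;
}
```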
Finding #4 (wasted backoff on final retry):
429 and 5xx branches slept `backoffMs` before `continue`, but on
`attempt === RETRY_MAX_ATTEMPTS` the loop condition fails immediately
after — the sleep was pure waste. Added early-return (parallel to the
existing pattern in the network-error catch branch) so the final
attempt exits the retry loop at the first non-success response
without extra latency.
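The control flow of the fix can be illustrated with a simplified loop. This sketch is synchronous and takes the request and sleep functions as injected callbacks so the behavior is easy to check; the real seeder is presumably asynchronous, and the value of `RETRY_MAX_ATTEMPTS` and the backoff formula here are assumptions.

```typescript
const RETRY_MAX_ATTEMPTS = 3; // assumed value for illustration

function fetchWithRetry(
  attemptOnce: (attempt: number) => number, // returns an HTTP status code
  sleep: (ms: number) => void,
  backoffMs: number,
): number | null {
  for (let attempt = 1; attempt <= RETRY_MAX_ATTEMPTS; attempt++) {
    const status = attemptOnce(attempt);
    if (status >= 200 && status < 300) return status;

    const retryable = status === 429 || status >= 500;
    // Early return on the final attempt: sleeping here would be pure waste,
    // because the loop would exit immediately afterwards.
    if (!retryable || attempt === RETRY_MAX_ATTEMPTS) return null;

    sleep(backoffMs * attempt);
  }
  return null;
}
```

With three attempts that all return 429, the loop sleeps twice rather than three times.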
Tests:
- 3 new `parseComtradeFlowResponse` variants: world-only filter,
numeric-0 partnerCode shape, rows without partnerCode field
- Existing tests updated: the double-count assertion replaced with
a "per-partner rows must NOT sum into the world-aggregate total"
assertion that pins the new contract
Rebased onto updated prereq. Full test suite: 6,890 tests (+2 net).
184 lines
7.6 KiB
TypeScript
// Pure-math tests for the Comtrade-backed re-export-share seeder.
// Verifies the three extracted helpers (`parseComtradeFlowResponse`,
// `computeShareFromFlows`, `clampShare`) behave correctly in isolation,
// and that no subscription-key query param ever appears in the
// serialized envelope (belt-and-suspenders even with header auth).
//
// Context: plan 2026-04-24-003 §Phase 3 tests 1-6. These replace the
// 7 obsolete reexport-share-loader tests (YAML flattener deleted in
// this same PR).

import assert from 'node:assert/strict';
import { describe, it } from 'node:test';

import {
  clampShare,
  computeShareFromFlows,
  declareRecords,
  parseComtradeFlowResponse,
} from '../scripts/seed-recovery-reexport-share.mjs';

describe('parseComtradeFlowResponse', () => {
  it('sums primaryValue per year, skipping zero/negative/non-numeric', () => {
    const rows = [
      { period: 2023, primaryValue: 100_000 },
      { period: 2023, primaryValue: 50_000 },
      { period: 2022, primaryValue: 30_000 },
      { period: 2021, primaryValue: 0 }, // skipped (zero)
      { period: 2021, primaryValue: -5 }, // skipped (negative)
      { period: 2021, primaryValue: 'x' }, // skipped (non-numeric)
    ];
    const out = parseComtradeFlowResponse(rows);
    assert.equal(out.get(2023), 150_000);
    assert.equal(out.get(2022), 30_000);
    assert.equal(out.has(2021), false);
  });

  it('sums ONLY world-aggregate rows (partnerCode=0), excludes partner-level rows', () => {
    // Defensive filter: if Comtrade returns BOTH a world-aggregate
    // row (partner=0) AND per-partner breakdown rows for the same
    // year, summing all would silently double-count and cut any
    // derived share in half. We sum only partnerCode='0' / 0 / null.
    const rows = [
      { period: 2023, partnerCode: '0', primaryValue: 1_000_000 }, // world aggregate — include
      { period: 2023, partnerCode: '842', primaryValue: 200_000 }, // per-partner (US) — EXCLUDE
      { period: 2023, partnerCode: '826', primaryValue: 150_000 }, // per-partner (UK) — EXCLUDE
    ];
    const out = parseComtradeFlowResponse(rows);
    assert.equal(out.get(2023), 1_000_000,
      'per-partner rows must not add to the world-aggregate total');
  });

  it('accepts numeric 0 partnerCode (shape variant)', () => {
    // Comtrade has occasionally emitted numeric 0 vs string '0' depending
    // on response shape; both must be treated as world-aggregate.
    const rows = [
      { period: 2023, partnerCode: 0, primaryValue: 500 },
      { period: 2023, partnerCode: '0', primaryValue: 500 },
    ];
    const out = parseComtradeFlowResponse(rows);
    assert.equal(out.get(2023), 1_000);
  });

  it('accepts rows with no partnerCode field (older response shape)', () => {
    // Defensive: if a response shape omits partnerCode entirely,
    // treat the row as world-aggregate rather than silently dropping it.
    const rows = [{ period: 2024, primaryValue: 42 }];
    const out = parseComtradeFlowResponse(rows);
    assert.equal(out.get(2024), 42);
  });

  it('handles refPeriodId fallback when period is absent', () => {
    const rows = [{ refPeriodId: 2024, primaryValue: 42 }];
    const out = parseComtradeFlowResponse(rows);
    assert.equal(out.get(2024), 42);
  });

  it('returns empty map on empty input', () => {
    assert.equal(parseComtradeFlowResponse([]).size, 0);
  });
});

describe('computeShareFromFlows', () => {
  it('picks the latest co-populated year and returns share = RX / M', () => {
    const rx = new Map([[2023, 300], [2022, 200], [2021, 100]]);
    const m = new Map([[2023, 1000], [2022, 500], [2021, 400]]);
    const picked = computeShareFromFlows(rx, m);
    assert.equal(picked?.year, 2023);
    assert.equal(picked?.share, 0.3);
    assert.equal(picked?.reexportsUsd, 300);
    assert.equal(picked?.importsUsd, 1000);
  });

  it('ignores years where RX or M is missing', () => {
    const rx = new Map([[2024, 500], [2022, 200]]); // 2024 is RX-only
    const m = new Map([[2023, 1000], [2022, 500]]); // 2023 is M-only
    const picked = computeShareFromFlows(rx, m);
    // Only 2022 is co-populated; even though 2024 is newer, it's not in M.
    assert.equal(picked?.year, 2022);
    assert.equal(picked?.share, 0.4);
  });

  it('returns null when no year is co-populated', () => {
    const rx = new Map([[2024, 500]]);
    const m = new Map([[2022, 500]]);
    assert.equal(computeShareFromFlows(rx, m), null);
  });

  it('returns null when imports at picked year is zero (guards division)', () => {
    // This can only happen if parseComtradeFlowResponse changes behavior;
    // test the branch anyway since computeShareFromFlows is exported for
    // tests and could be called with hand-crafted maps.
    const rx = new Map([[2023, 300]]);
    const m = new Map([[2023, 0]]);
    assert.equal(computeShareFromFlows(rx, m), null);
  });
});

describe('clampShare', () => {
  it('returns null for sub-floor shares (< 0.05)', () => {
    assert.equal(clampShare(0.03), null);
    assert.equal(clampShare(0.049999), null);
    assert.equal(clampShare(0), null);
  });

  it('caps above-ceiling shares at 0.95 (< 1 guard for computeNetImports)', () => {
    assert.equal(clampShare(1.2), 0.95);
    assert.equal(clampShare(0.99), 0.95);
    assert.equal(clampShare(0.951), 0.95);
  });

  it('passes through in-range shares unchanged', () => {
    assert.equal(clampShare(0.05), 0.05);
    assert.equal(clampShare(0.355), 0.355);
    assert.equal(clampShare(0.5), 0.5);
    assert.equal(clampShare(0.95), 0.95);
  });

  it('returns null for NaN, Infinity, and negative', () => {
    assert.equal(clampShare(NaN), null);
    assert.equal(clampShare(Infinity), null);
    assert.equal(clampShare(-0.1), null);
  });
});

describe('declareRecords', () => {
  it('counts material entries in the published payload', () => {
    const payload = { countries: { AE: {}, PA: {} } };
    assert.equal(declareRecords(payload), 2);
  });

  it('returns 0 for empty countries map (valid zero state)', () => {
    assert.equal(declareRecords({ countries: {} }), 0);
    assert.equal(declareRecords(null), 0);
    assert.equal(declareRecords({}), 0);
  });
});

describe('credential-leak regression guard', () => {
  it('module source must not embed subscription-key in any URL literal', async () => {
    // Read the seeder source file and assert no literal `subscription-key=`
    // appears anywhere. Belt-and-suspenders even though fetchComtradeFlow
    // uses header auth — if any future refactor adds `subscription-key=`
    // to a URL builder, this test fails before it leaks to prod Redis.
    const { readFile } = await import('node:fs/promises');
    const { fileURLToPath } = await import('node:url');
    const here = fileURLToPath(import.meta.url);
    const seederPath = here.replace(/\/tests\/.*$/, '/scripts/seed-recovery-reexport-share.mjs');
    const src = await readFile(seederPath, 'utf8');
    // Flag only string-literal embeddings inside '...', "...", or `...`;
    // regex literals (/subscription-key=/i used by the defensive serialize
    // check) are intentional safeguards, not leaks.
    // [^'\n] variant prevents the regex from spanning across multiple
    // lines, which would falsely match any two unrelated quotes that
    // happen to sandwich a `subscription-key=` reference elsewhere.
    const stringLitMatches = [
      ...src.matchAll(/'[^'\n]*subscription-key=[^'\n]*'/g),
      ...src.matchAll(/"[^"\n]*subscription-key=[^"\n]*"/g),
      ...src.matchAll(/`[^`\n]*subscription-key=[^`\n]*`/g),
    ];
    assert.equal(stringLitMatches.length, 0,
      `found hardcoded subscription-key in string literal: ${stringLitMatches.map(m => m[0]).join(', ')}`);
  });
});