fix(intelligence): preserve prior summary when all regions are dedup-skipped

Addresses 1 P1 finding on PR #2940: the seed script was overwriting a
good summary with an empty {regions: [], recordCount: 0} payload
whenever a cron invocation landed inside the 15-minute idempotency
bucket and every region returned duplicate-bucket. api/health.js
classifies the empty array as EMPTY_DATA, flipping /api/health to red
on a completely harmless retry.
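
To make the failure concrete, here is a minimal sketch of how a health
check could end up classifying the overwritten payload. api/health.js is
not part of this commit, so the function name classifySummary, the
MISSING and OK status strings, and the order of the checks are all
assumptions made for illustration; only EMPTY_DATA and STALE come from
the description above.

```javascript
// Hypothetical sketch only: the real api/health.js is not shown in this commit.
// Assumption: an empty regions array with recordCount 0 is reported as EMPTY_DATA,
// and seed-meta older than maxStaleMin is reported as STALE.
function classifySummary(payload, metaAgeMin, maxStaleMin) {
  if (!payload) return 'MISSING'; // hypothetical status name
  const regions = Array.isArray(payload.regions) ? payload.regions : [];
  const recordCount = payload.recordCount ?? 0;
  if (regions.length === 0 && recordCount === 0) return 'EMPTY_DATA';
  if (metaAgeMin > maxStaleMin) return 'STALE';
  return 'OK'; // hypothetical status name
}

const maxStaleMin = 12 * 60; // the 12h budget mentioned in this commit

// The payload the old code wrote on an all-dedup retry:
console.log(classifySummary({ regions: [], recordCount: 0 }, 1, maxStaleMin)); // EMPTY_DATA

// A prior good summary that is left in place ('example-region' is a made-up name):
console.log(classifySummary({ regions: ['example-region'], recordCount: 1 }, 1, maxStaleMin)); // OK
```

Under these assumptions the fresh seed-meta does not help: the empty
payload is flagged before staleness is even considered.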

New health policy in the seed main():
  1. persisted > 0 && failed === 0: write fresh summary + seed-meta
  2. persisted === 0 && failed === 0: all regions dedup-skipped. Do
     NOT write (preserve the prior good summary). Return cleanly.
  3. failed > 0: skip meta write and exit 1. The 12h maxStaleMin
     budget flips /api/health to STALE on sustained degradation.
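
The three branches above can be sketched as a pure decision function.
decideSeedOutcome and its return shape are hypothetical names used here
for illustration; the actual change keeps the logic inline in main().

```javascript
// Hypothetical helper mirroring the three-branch policy; not the code in main().
function decideSeedOutcome({ persisted, failed }) {
  if (failed > 0) {
    // Branch 3: at least one region failed. Skip the write and signal failure.
    return { writeSummary: false, exitCode: 1 };
  }
  if (persisted > 0) {
    // Branch 1: fresh data and no failures. Write summary + seed-meta.
    return { writeSummary: true, exitCode: 0 };
  }
  // Branch 2: nothing persisted and nothing failed, so every region was
  // dedup-skipped. Leave the prior summary untouched and return cleanly.
  return { writeSummary: false, exitCode: 0 };
}

console.log(decideSeedOutcome({ persisted: 4, failed: 0 })); // branch 1
console.log(decideSeedOutcome({ persisted: 0, failed: 0 })); // branch 2
console.log(decideSeedOutcome({ persisted: 3, failed: 1 })); // branch 3
```

Only one branch sets writeSummary to true, which is the property the
static check under Verification relies on.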

The bundle runner's freshness gate (intervalMs * 0.8) means a full run
still lands roughly every 6h regardless. In steady state, the 15min
dedup bucket only catches bundle cron retries that follow a prior
success within the same bucket, so path 2 is rare but real: the first
cron tick after a pod restart inside the dedup window is the failure
mode this fixes.
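
The timing can be illustrated with a little arithmetic. This is a
sketch under stated assumptions: that the gate skips a run while the
last success is younger than intervalMs * 0.8, that intervalMs is 6h,
and that the dedup bucket is the timestamp floored to a 15-minute
boundary. shouldRun and dedupBucket are hypothetical names; neither the
bundle runner nor the bucket derivation appears in this diff.

```javascript
const FIFTEEN_MIN_MS = 15 * 60 * 1000;
const INTERVAL_MS = 6 * 60 * 60 * 1000; // assumed 6h cron cadence

// Assumption: the dedup bucket is the timestamp floored to 15 minutes.
function dedupBucket(tsMs) {
  return Math.floor(tsMs / FIFTEEN_MIN_MS);
}

// Assumption: the freshness gate allows a run once 80% of the interval has elapsed.
function shouldRun(nowMs, lastSuccessMs, intervalMs) {
  return nowMs - lastSuccessMs >= intervalMs * 0.8;
}

const lastSuccess = Date.UTC(2026, 3, 11, 12, 0, 0); // 12:00:00 UTC
const retry = lastSuccess + 5 * 60 * 1000;           // 12:05:00 UTC
const nextCycle = lastSuccess + 5 * 60 * 60 * 1000;  // 17:00:00 UTC

// With its state intact, the gate blocks a retry five minutes after a success.
console.log(shouldRun(retry, lastSuccess, INTERVAL_MS)); // false

// If the seed runs anyway, the retry lands in the same bucket and is skipped as a duplicate.
console.log(dedupBucket(retry) === dedupBucket(lastSuccess)); // true

// Five hours later the gate opens (5h >= 4.8h) and the bucket has moved on.
console.log(shouldRun(nextCycle, lastSuccess, INTERVAL_MS)); // true
console.log(dedupBucket(nextCycle) === dedupBucket(lastSuccess)); // false
```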

Verification
  - tests/regional-snapshot.test.mjs: 50/50 pass
  - npm run typecheck:all: clean
  - tsc -p scripts/jsconfig.json: 0 errors in regional-snapshot
  - biome lint: clean on touched file
  - Static analysis confirms exactly 1 writeExtraKeyWithMeta call
    (only in branch 1) so branches 2 and 3 cannot accidentally
    overwrite the summary
Elie Habib
2026-04-11 16:48:43 +04:00
parent c92eb12b2a
commit bef3225ebf

@@ -214,12 +214,20 @@ async function main() {
     }
   }
-  // Health: only write seed-meta when ALL regions succeeded. If any region
-  // failed, skip the meta write so /api/health flips to STALE after 12h of
-  // persistent degradation instead of silently reporting OK. The bundle
-  // runner's freshness gate will retry this seed on the next cycle.
+  // Health policy:
+  // 1. persisted > 0 && failed === 0: write the fresh summary + seed-meta.
+  // 2. persisted === 0 && failed === 0: all regions dedup-skipped (e.g., a
+  //    retry within the 15min idempotency bucket). Preserve the prior good
+  //    summary by skipping the write entirely. api/health.js classifies an
+  //    empty `regions: []` + `recordCount: 0` as EMPTY_DATA which flips the
+  //    overall health to red, so overwriting on a no-op retry is actively
+  //    harmful. The 12h maxStaleMin budget lets the next full run refresh
+  //    the payload naturally.
+  // 3. failed > 0: skip the meta write so /api/health flips to STALE after
+  //    the maxStaleMin budget on persistent degradation instead of silently
+  //    reporting OK. The bundle runner's freshness gate retries next cycle.
   const elapsed = ((Date.now() - t0) / 1000).toFixed(1);
-  if (failed === 0) {
+  if (failed === 0 && persisted > 0) {
     const ttlSec = 12 * 60 * 60; // 12h, 2x the 6h cron cadence
     await writeExtraKeyWithMeta(
       `intelligence:regional-snapshots:summary:v1`,
@@ -233,6 +241,12 @@ async function main() {
     return;
   }
+  if (failed === 0) {
+    // All regions dedup-skipped. Preserve the prior summary and return cleanly.
+    console.log(`[regional-snapshots] Done in ${elapsed}s: persisted=0 skipped=${skipped} failed=0 (all dedup-skipped, prior summary preserved)`);
+    return;
+  }
   console.error(`[regional-snapshots] Done in ${elapsed}s: persisted=${persisted} skipped=${skipped} failed=${failed}`);
   for (const f of failedRegions) {
     console.error(` [${f.region}] ${f.error}`);