mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(resilience): PR 2 §3.4 recovery-domain weight rebalance
Dials the two PR 2 §3.4 recovery dims (liquidReserveAdequacy,
sovereignFiscalBuffer) to ~10% share each of the recovery-domain
score via a new per-dimension weight channel in the coverage-weighted
mean. Matches the plan's direction that the sovereign-wealth signal
complement — rather than dominate — the classical liquid-reserves
and fiscal-space signals.
Implementation
- RESILIENCE_DIMENSION_WEIGHTS: new Record<ResilienceDimensionId, number>
alongside RESILIENCE_DOMAIN_WEIGHTS. Every dim has an explicit entry
(default 1.0) so rebalance decisions stay auditable; the two new
recovery dims carry 0.5 each.
Share math at full coverage (6 active recovery dims):
weight sum = 4 × 1.0 + 2 × 0.5 = 5.0
each new-dim share = 0.5 / 5.0 = 0.10 ✓
each core-dim share = 1.0 / 5.0 = 0.20
Retired dims (reserveAdequacy, fuelStockDays) keep weight 1.0 in
the map; their coverage=0 neutralizes them at the coverage channel
regardless. Explicit entries guard against a future scorer bug
accidentally returning coverage>0 for a retired dim and falling
through the `?? 1.0` default — every retirement decision is now
tied to a single explicit source of truth.
- coverageWeightedMean (_shared.ts): refactored to apply
`coverage × dimWeight` per dim instead of `coverage` alone. Backward-
compatible when all weights default to 1.0 (reduces to the original
mean). All three aggregation callers — buildDomainList, baseline-
Score, stressScore — pick up the weighting transparently.
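The per-dimension channel described in these bullets can be sketched roughly as follows. This is a minimal illustration, not the production `_shared.ts` code: the `{ id, score, coverage }` dim shape and the helper's exact signature are assumptions; only the dimension ids, the 0.5 weights, and the `?? 1.0` default come from the commit.

```javascript
// Minimal sketch of a coverage-weighted mean with a per-dimension weight
// channel (illustrative; not the production _shared.ts implementation).
const DIMENSION_WEIGHTS = {
  liquidReserveAdequacy: 0.5,
  sovereignFiscalBuffer: 0.5,
  // every other dimension falls through to the 1.0 default below
};

function coverageWeightedMean(dims, dimWeights = DIMENSION_WEIGHTS) {
  // dims: [{ id, score, coverage }] with coverage in [0, 1]
  let weightSum = 0;
  let scoreSum = 0;
  for (const { id, score, coverage } of dims) {
    const w = coverage * (dimWeights[id] ?? 1.0); // coverage × dimWeight per dim
    weightSum += w;
    scoreSum += w * score;
  }
  return weightSum > 0 ? scoreSum / weightSum : 0;
}
```

At full coverage over six recovery dims the weight sum is 4 × 1.0 + 2 × 0.5 = 5.0, so each 0.5 dim holds a 0.10 share and each core dim a 0.20 share; with an all-1.0 weight map the function reduces to the plain coverage-weighted mean.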
Test coverage
1. New `tests/resilience-recovery-weight-rebalance.test.mts`:
pins the per-dim weight values, asserts the share math
(0.10 new / 0.20 core), verifies completeness of the weight map,
and documents why retired dims stay in the map at 1.0.
2. New `tests/resilience-recovery-ordering.test.mts`: fixture-based
Spearman-proxy sensitivity check. Asserts NO > US > YE ordering
preserved on both the overall score and the recovery-domain
subscore after the rebalance. (Live post-merge Spearman rerun
against the PR 0 snapshot is tracked as a follow-up commit.)
3. resilience-scorers.test.mts fixture anchors updated in lockstep:
baselineScore: 60.35 → 62.17 (low-scoring liquidReserveAdequacy
+ partial-coverage SWF now contribute ~half the weight)
overallScore: 63.60 → 64.39 (recovery subscore lifts by ~3 pts
from the rebalance, overall by ~0.79)
recovery flat mean: 48.75 (unchanged — flat mean doesn't apply
weights by design; documents the coverage-weighted diff)
Local coverageWeightedMean helper in the test mirrors the
production implementation (weights applied per dim).
Methodology doc
- New "Per-dimension weights in the recovery domain" subsection with
the weight table and a sentence explaining the cap. Cross-references
the source of truth (RESILIENCE_DIMENSION_WEIGHTS).
Deliberate non-goals
- Live post-merge Spearman ≥0.85 check against the PR 0 baseline
snapshot. Fixture ordering is preserved (new ordering test); the
live-data check runs after Railway cron refreshes the rankings on
the new weights and commits
docs/snapshots/resilience-ranking-live-post-pr2-<date>.json. Tracked
as the final piece of PR 2 §3.4
alongside the health.js / bootstrap graduation (waiting on the
7-day Railway cron bake-in window).
Tests: 6588/6588 data-tier tests pass. Typecheck clean on both
tsconfig configs. Biome clean on touched files. NO > US > YE
fixture ordering preserved.
* fix(resilience): PR 2 review — thread RESILIENCE_DIMENSION_WEIGHTS through the comparison harness
Greptile P2: the operator comparison harness
(scripts/compare-resilience-current-vs-proposed.mjs) claims its domain
scores "mirror the production scorer's coverage-weighted mean" and is
the artifact generator for Spearman / rank-delta acceptance decisions.
After PR 2 §3.4's weight rebalance, the production mirror diverged —
production now applies RESILIENCE_DIMENSION_WEIGHTS (liquidReserveAdequacy
= 0.5, sovereignFiscalBuffer = 0.5) inside coverageWeightedMean, but
the harness still used equal-weight aggregation.
Left unfixed, post-merge Spearman / rank-delta diagnostics would
compare live API scores (with the 0.5 recovery weights) against
harness predictions that assume equal-share dims — silently biasing
every acceptance decision until someone noticed a country's
rank-delta didn't track.
Fix
- Mirrored coverageWeightedMean now accepts dimensionWeights and
applies `coverage × weight` per dim, matching _shared.ts exactly.
- Mirrored buildDomainList accepts + forwards dimensionWeights.
- main() imports RESILIENCE_DIMENSION_WEIGHTS from the scorer module
and passes it through to buildDomainList at the single call site.
- Missing-entry default = 1.0 (same contract as production) — makes
the harness forward-compatible with any future weight refactor
(adds a new dim without an explicit entry, old production fallback
path still produces the correct number).
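The divergence the fix closes looks roughly like this. The two-dim fixture and the `coreRecoveryDim` id are invented for illustration; only the 0.5 weight and the `?? 1.0` default come from the commit, and the real harness aggregates full domain lists rather than a pair.

```javascript
// Invented fixture: one full-weight dim, one production-half-weight dim.
const dims = [
  { id: 'coreRecoveryDim', score: 70, coverage: 1 },       // hypothetical id, weight 1.0
  { id: 'liquidReserveAdequacy', score: 20, coverage: 1 }, // weight 0.5 in production
];
const WEIGHTS = { liquidReserveAdequacy: 0.5 };

// Pre-fix harness: equal-weight, coverage-only aggregation.
const equalWeight =
  dims.reduce((s, d) => s + d.score * d.coverage, 0) /
  dims.reduce((s, d) => s + d.coverage, 0); // (70 + 20) / 2 = 45

// Post-fix harness, matching production: coverage × weight, `?? 1.0` default.
const weighted =
  dims.reduce((s, d) => s + d.score * d.coverage * (WEIGHTS[d.id] ?? 1.0), 0) /
  dims.reduce((s, d) => s + d.coverage * (WEIGHTS[d.id] ?? 1.0), 0); // 80 / 1.5 ≈ 53.3
```

A per-domain gap of this size, compounded across countries, is what would have silently skewed the Spearman and rank-delta comparisons.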
Verification
- Harness syntax-check clean (node -c).
- RESILIENCE_DIMENSION_WEIGHTS import resolves correctly from the
harness's import path.
- 509/509 resilience tests still pass (harness isn't in the test
suite; the invariant is that production ↔ harness use the same
math, and the production side is covered by
tests/resilience-recovery-weight-rebalance.test.mts).
* fix(resilience): PR 2 review — bump cache prefixes v10→v11 + document coverage-vs-weight asymmetry
Greptile P1 + P2 on PR #3328.
P1 — cache prefix not bumped after formula change
--------------------------------------------------
The per-dim weight rebalance changes the score formula, but the
`_formula` tag only distinguishes 'd6' vs 'pc' (legacy 6-domain vs
pillar-combined) — it does NOT detect intra-'d6' weight changes. Left
unfixed, scores cached before deploy would be served with the old
equal-weight math for up to the full 6h TTL, and the ranking key for
up to its 12h TTL. Matches the established v9→v10 pattern for every
prior formula-changing deploy.
Bumped in lockstep:
- RESILIENCE_SCORE_CACHE_PREFIX: v10 → v11
- RESILIENCE_RANKING_CACHE_KEY: v10 → v11
- RESILIENCE_HISTORY_KEY_PREFIX: v5 → v6
- scripts/seed-resilience-scores.mjs local mirrors
- api/health.js resilienceRanking literal
- 4 analysis/backtest scripts that read the cached keys directly
- Test fixtures in resilience-{ranking, handlers, scores-seed,
pillar-aggregation}.test.* that assert on literal key values
The v5→v6 history bump is the critical one: without it, pre-rebalance
history points would mix with post-rebalance points inside the 30-day
window, and change30d / trend math would diff values from different
formulas against each other, producing false-negative "falling" trends
for every country across the deploy window.
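As a back-of-envelope illustration of that failure mode, using the fixture anchors from the rebalance commit (the history record shape and dates are hypothetical):

```javascript
// Two history points straddling the deploy. Without the v5→v6 history-key
// bump they would share one key space and get diffed against each other.
const history = [
  { fetchedAt: '2026-03-26', overallScore: 63.60 }, // pre-rebalance formula
  { fetchedAt: '2026-04-25', overallScore: 64.39 }, // post-rebalance formula
];
// change30d reads as +0.79, but nearly all of it is the formula change,
// not the country; a country the rebalance shifted down by a similar
// amount would read as a false "falling" trend with no real change behind it.
const change30d = history.at(-1).overallScore - history[0].overallScore;
```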
P2 — coverage-vs-weight asymmetry in computeLowConfidence / computeOverallCoverage
----------------------------------------------------------------------------------
Reviewer flagged that these two functions still average coverage
equally across all non-retired dims, even after the scoring aggregation
started applying RESILIENCE_DIMENSION_WEIGHTS. The asymmetry is
INTENTIONAL — these signals answer a different question from scoring:
scoring aggregation: "how much does each dim matter to the score?"
coverage signal: "how much real data do we have on this country?"
A dim at weight 0.5 still has the same data-availability footprint as
a weight=1.0 dim: its coverage value reflects whether we successfully
fetched the upstream source, not whether the scorer cares about it.
Applying scoring weights to the coverage signal would let a
half-weight dim hide half its sparsity from the overallCoverage pill,
misleading users reading coverage as a data-quality indicator.
Added explicit comments to both functions noting the asymmetry is
deliberate and pointing at the other site for matching rationale.
No code change — just documentation.
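The two questions can be put side by side in code. Dim ids, scores, and coverages here are illustrative; only the 0.5 weight, the `?? 1.0` default, and the flat coverage mean come from the commit.

```javascript
const dims = [
  { id: 'coreDim', score: 60, coverage: 1.0 },               // hypothetical id, weight 1.0
  { id: 'liquidReserveAdequacy', score: 40, coverage: 0.4 }, // weight 0.5, sparse upstream data
];
const WEIGHTS = { liquidReserveAdequacy: 0.5 };

// Scoring aggregation: "how much does each dim matter to the score?"
const wSum = dims.reduce((s, d) => s + d.coverage * (WEIGHTS[d.id] ?? 1.0), 0);
const score = dims.reduce((s, d) => s + d.score * d.coverage * (WEIGHTS[d.id] ?? 1.0), 0) / wSum;

// Coverage signal: "how much real data do we have?" Flat mean, no weights;
// applying scoring weights here would let the half-weight dim hide half
// its sparsity from the overallCoverage pill.
const overallCoverage = dims.reduce((s, d) => s + d.coverage, 0) / dims.length;
```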
Tests: 6588/6588 data-tier tests pass (+511 resilience-specific
including the prefix-literal assertions). Typecheck clean on both
tsconfig configs. Biome clean on touched files.
* docs(resilience): bump methodology doc cache-prefix references to v11/v6
Greptile P2 on PR #3328: Redis keys table in the reproducibility
appendix still published `score:v10` / `ranking:v10` / `history:v5`,
and the rollback instructions told operators to flush those keys.
After the recovery-domain weight rebalance, live cache runs at
`score:v11` / `ranking:v11` / `history:v6`.
- Updated the Redis keys table (lines 490-492) to match `_shared.ts`.
- Updated the rollback block to name the current keys.
- Left the historical "Activation sequence" narrative intact (it
accurately describes the pillar-combine PR's v9→v10 / v4→v5 bump)
but added a parenthetical pointing at the current v11/v6 values.
No code change — doc-only correction for operator accuracy.
* fix(docs): escape MDX-unsafe `<137` pattern to unblock Mintlify deploy
Line 643 had `(<137 countries)` — MDX parses `<137` as a JSX tag
starting with digit `1`, which is illegal and breaks the deploy with
"Unexpected character \`1\` (U+0031) before name". Surfaced after the
prior cache-prefix commit forced Mintlify to re-parse this file.
Replaced with "fewer than 137 countries" for unambiguous rendering.
Other `<` occurrences in this doc (lines 34, 642) are followed by
whitespace and don't trip MDX's tag parser.
353 lines
16 KiB
JavaScript
#!/usr/bin/env node
import {
  getRedisCredentials,
  loadEnvFile,
  logSeedResult,
  writeFreshnessMetadata,
} from './_seed-utils.mjs';
import { unwrapEnvelope } from './_seed-envelope-source.mjs';

loadEnvFile(import.meta.url);

const API_BASE = process.env.API_BASE_URL || 'https://api.worldmonitor.app';
// Reuse WORLDMONITOR_VALID_KEYS when a dedicated WORLDMONITOR_API_KEY isn't set —
// any entry in that comma-separated list is accepted by the API (same
// validation list that server/_shared/premium-check.ts and validateApiKey read).
// Avoids duplicating the same secret under a second env-var name per service.
const WM_KEY = process.env.WORLDMONITOR_API_KEY
  || (process.env.WORLDMONITOR_VALID_KEYS ?? '').split(',').map((k) => k.trim()).filter(Boolean)[0]
  || '';
const SEED_UA = 'Mozilla/5.0 (compatible; WorldMonitor-Seed/1.0)';

// Bumped v10 → v11 in lockstep with server/worldmonitor/resilience/v1/
// _shared.ts for the PR 2 §3.4 recovery-domain weight rebalance.
// Seeder and server MUST agree on the prefix or the seeder writes
// scores the handler will never read.
export const RESILIENCE_SCORE_CACHE_PREFIX = 'resilience:score:v11:';
export const RESILIENCE_RANKING_CACHE_KEY = 'resilience:ranking:v11';
// Must match the server-side RESILIENCE_RANKING_CACHE_TTL_SECONDS. Extended
// to 12h (2x the cron interval) so a missed/slow cron can't create an
// EMPTY_ON_DEMAND gap before the next successful rebuild.
export const RESILIENCE_RANKING_CACHE_TTL_SECONDS = 12 * 60 * 60;
export const RESILIENCE_STATIC_INDEX_KEY = 'resilience:static:index:v1';

const INTERVAL_KEY_PREFIX = 'resilience:intervals:v1:';
const INTERVAL_TTL_SECONDS = 7 * 24 * 60 * 60;
const DRAWS = 100;

const DOMAIN_WEIGHTS = {
  economic: 0.22,
  infrastructure: 0.20,
  energy: 0.15,
  'social-governance': 0.25,
  'health-food': 0.18,
};

const DOMAIN_ORDER = [
  'economic',
  'infrastructure',
  'energy',
  'social-governance',
  'health-food',
];
export function computeIntervals(domainScores, domainWeights, draws = DRAWS) {
  const samples = [];
  for (let i = 0; i < draws; i++) {
    const jittered = domainWeights.map((w) => w * (0.9 + Math.random() * 0.2));
    const sum = jittered.reduce((s, w) => s + w, 0);
    const normalized = jittered.map((w) => w / sum);
    const score = domainScores.reduce((s, d, idx) => s + d * normalized[idx], 0);
    samples.push(score);
  }
  samples.sort((a, b) => a - b);
  return {
    p05: Math.round(samples[Math.max(0, Math.ceil(draws * 0.05) - 1)] * 10) / 10,
    p95: Math.round(samples[Math.min(draws - 1, Math.ceil(draws * 0.95) - 1)] * 10) / 10,
  };
}

async function redisGetJson(url, token, key) {
  const resp = await fetch(`${url}/get/${encodeURIComponent(key)}`, {
    headers: { Authorization: `Bearer ${token}` },
    signal: AbortSignal.timeout(5_000),
  });
  if (!resp.ok) return null;
  const data = await resp.json();
  if (!data?.result) return null;
  try { return unwrapEnvelope(JSON.parse(data.result)).data; } catch { return null; }
}

async function redisPipeline(url, token, commands) {
  const resp = await fetch(`${url}/pipeline`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(commands),
    signal: AbortSignal.timeout(30_000),
  });
  if (!resp.ok) {
    const text = await resp.text().catch(() => '');
    throw new Error(`Redis pipeline HTTP ${resp.status} — ${text.slice(0, 200)}`);
  }
  return resp.json();
}

function countCachedFromPipeline(results) {
  let count = 0;
  for (const entry of results) {
    if (typeof entry?.result === 'string') {
      try { JSON.parse(entry.result); count++; } catch { /* malformed */ }
    }
  }
  return count;
}
async function computeAndWriteIntervals(url, token, countryCodes, pipelineResults) {
  const weights = DOMAIN_ORDER.map((id) => DOMAIN_WEIGHTS[id]);
  const commands = [];

  for (let i = 0; i < countryCodes.length; i++) {
    const raw = pipelineResults[i]?.result ?? null;
    if (!raw || raw === 'null') continue;
    try {
      const score = JSON.parse(raw);
      if (!score.domains?.length) continue;

      const domainScores = DOMAIN_ORDER.map((id) => {
        const d = score.domains.find((dom) => dom.id === id);
        return d?.score ?? 0;
      });

      const interval = computeIntervals(domainScores, weights, DRAWS);
      const payload = {
        p05: interval.p05,
        p95: interval.p95,
        draws: DRAWS,
        computedAt: new Date().toISOString(),
      };
      commands.push(['SET', `${INTERVAL_KEY_PREFIX}${countryCodes[i]}`, JSON.stringify(payload), 'EX', INTERVAL_TTL_SECONDS]);
    } catch { /* skip malformed */ }
  }

  if (commands.length === 0) {
    console.log('[resilience-scores] No domain data available for intervals');
    return 0;
  }

  const PIPE_BATCH = 50;
  for (let i = 0; i < commands.length; i += PIPE_BATCH) {
    await redisPipeline(url, token, commands.slice(i, i + PIPE_BATCH));
  }
  console.log(`[resilience-scores] Wrote ${commands.length} interval keys`);

  await writeFreshnessMetadata('resilience', 'intervals', commands.length, '', INTERVAL_TTL_SECONDS);
  return commands.length;
}
async function seedResilienceScores() {
  const { url, token } = getRedisCredentials();

  const index = await redisGetJson(url, token, RESILIENCE_STATIC_INDEX_KEY);
  const countryCodes = (index?.countries ?? [])
    .map((c) => String(c || '').trim().toUpperCase())
    .filter((c) => /^[A-Z]{2}$/.test(c));

  if (countryCodes.length === 0) {
    console.warn('[resilience-scores] Static index is empty — has seed-resilience-static run this year?');
    return { skipped: true, reason: 'no_index' };
  }

  console.log(`[resilience-scores] Reading cached scores for ${countryCodes.length} countries...`);

  const getCommands = countryCodes.map((c) => ['GET', `${RESILIENCE_SCORE_CACHE_PREFIX}${c}`]);
  const preResults = await redisPipeline(url, token, getCommands);
  const preWarmed = countCachedFromPipeline(preResults);

  console.log(`[resilience-scores] ${preWarmed}/${countryCodes.length} scores pre-warmed`);

  const missing = countryCodes.length - preWarmed;
  if (missing > 0) {
    console.log(`[resilience-scores] Warming ${missing} missing via ranking endpoint...`);
    try {
      // ?refresh=1 MUST be set here. The ranking aggregate (12h TTL) routinely
      // outlives the per-country score keys (6h TTL), so in the post-6h /
      // pre-12h window the handler's cache-hit early-return would fire and
      // skip the whole warm path — scores would stay missing, coverage would
      // degrade, and only the per-country laggard fallback (or nothing, if
      // WM_KEY is absent) would recover. Forcing a recompute routes the call
      // through warmMissingResilienceScores and its chunked pipeline SET.
      const headers = { 'User-Agent': SEED_UA, 'Accept': 'application/json' };
      if (WM_KEY) headers['X-WorldMonitor-Key'] = WM_KEY;
      const resp = await fetch(`${API_BASE}/api/resilience/v1/get-resilience-ranking?refresh=1`, {
        headers,
        signal: AbortSignal.timeout(60_000),
      });
      if (resp.ok) {
        const data = await resp.json();
        const ranked = data.items?.length ?? 0;
        const greyed = data.greyedOut?.length ?? 0;
        console.log(`[resilience-scores] Ranking: ${ranked} ranked, ${greyed} greyed out`);
      } else {
        console.warn(`[resilience-scores] Ranking endpoint returned ${resp.status}`);
      }
    } catch (err) {
      console.warn(`[resilience-scores] Ranking warmup failed (best-effort): ${err.message}`);
    }

    // Re-check which countries are still missing after bulk warmup
    const postResults = await redisPipeline(url, token, getCommands);
    const stillMissing = [];
    for (let i = 0; i < countryCodes.length; i++) {
      const raw = postResults[i]?.result ?? null;
      if (!raw || raw === 'null') { stillMissing.push(countryCodes[i]); continue; }
      try {
        const parsed = JSON.parse(raw);
        if (parsed.overallScore <= 0) stillMissing.push(countryCodes[i]);
      } catch { stillMissing.push(countryCodes[i]); }
    }

    // Warm laggards individually (countries the bulk ranking timed out on)
    if (stillMissing.length > 0 && !WM_KEY) {
      console.warn(`[resilience-scores] ${stillMissing.length} laggards found but neither WORLDMONITOR_API_KEY nor WORLDMONITOR_VALID_KEYS is set — skipping individual warmup`);
    }
    let laggardsWarmed = 0;
    if (stillMissing.length > 0 && WM_KEY) {
      console.log(`[resilience-scores] Warming ${stillMissing.length} laggards individually...`);
      const BATCH = 5;
      for (let i = 0; i < stillMissing.length; i += BATCH) {
        const batch = stillMissing.slice(i, i + BATCH);
        const results = await Promise.allSettled(batch.map(async (cc) => {
          const scoreUrl = `${API_BASE}/api/resilience/v1/get-resilience-score?countryCode=${cc}`;
          const resp = await fetch(scoreUrl, {
            headers: { 'User-Agent': SEED_UA, 'Accept': 'application/json', 'X-WorldMonitor-Key': WM_KEY },
            signal: AbortSignal.timeout(30_000),
          });
          if (!resp.ok) throw new Error(`${cc}: HTTP ${resp.status}`);
          return cc;
        }));
        laggardsWarmed += results.filter((r) => r.status === 'fulfilled').length;
      }
      console.log(`[resilience-scores] Laggards warmed: ${laggardsWarmed}/${stillMissing.length}`);
    }

    const finalResults = await redisPipeline(url, token, getCommands);
    const finalWarmed = countCachedFromPipeline(finalResults);
    console.log(`[resilience-scores] Final: ${finalWarmed}/${countryCodes.length} cached`);

    const intervalsWritten = await computeAndWriteIntervals(url, token, countryCodes, finalResults);
    const rankingPresent = await refreshRankingAggregate({ url, token, laggardsWarmed });
    return { skipped: false, recordCount: finalWarmed, total: countryCodes.length, intervalsWritten, rankingPresent };
  }

  const intervalsWritten = await computeAndWriteIntervals(url, token, countryCodes, preResults);
  // Refresh the ranking aggregate on every cron, even when per-country
  // scores are still warm from the previous tick. Ranking has a 12h TTL vs
  // a 6h cron cadence — skipping the refresh when the key is still alive
  // would let it drift toward expiry without a rebuild, and a single missed
  // cron would then produce an EMPTY_ON_DEMAND gap before the next one runs.
  const rankingPresent = await refreshRankingAggregate({ url, token, laggardsWarmed: 0 });
  return { skipped: false, recordCount: preWarmed, total: countryCodes.length, intervalsWritten, rankingPresent };
}
// Trigger a ranking rebuild via the public endpoint EVERY cron, regardless of
// whether the ranking key (RESILIENCE_RANKING_CACHE_KEY) is still live at
// probe time. Short-circuiting on "key present" left a timing hole: if the
// key was written late in a prior run and the next cron fires early, the key
// is still alive at probe time → rebuild skipped → key expires a short while
// later and stays absent until a cron eventually runs when it's missing. One
// cheap HTTP per cron keeps both the ranking AND its sibling seed-meta
// rolling forward, and self-heals the partial-pipeline case where ranking
// was written but meta wasn't — handler retries the atomic pair on every
// cron.
//
// Returns whether the ranking key is present in Redis after the rebuild
// attempt (observability only — no caller gates on this).
async function refreshRankingAggregate({ url, token, laggardsWarmed }) {
  const reason = laggardsWarmed > 0 ? `${laggardsWarmed} laggard warms` : 'scheduled cron refresh';
  try {
    // ?refresh=1 tells the handler to skip its cache-hit early-return and
    // recompute-then-SET atomically. Avoids the earlier "DEL then rebuild"
    // flow where a failed rebuild would leave the ranking absent instead of
    // stale-but-present.
    const rebuildHeaders = { 'User-Agent': SEED_UA, 'Accept': 'application/json' };
    if (WM_KEY) rebuildHeaders['X-WorldMonitor-Key'] = WM_KEY;
    const rebuildResp = await fetch(`${API_BASE}/api/resilience/v1/get-resilience-ranking?refresh=1`, {
      headers: rebuildHeaders,
      signal: AbortSignal.timeout(60_000),
    });
    if (rebuildResp.ok) {
      const rebuilt = await rebuildResp.json();
      const total = (rebuilt.items?.length ?? 0) + (rebuilt.greyedOut?.length ?? 0);
      console.log(`[resilience-scores] Refreshed ${RESILIENCE_RANKING_CACHE_KEY} with ${total} countries (${reason})`);
    } else {
      console.warn(`[resilience-scores] Refresh ranking HTTP ${rebuildResp.status} — ranking cache stays at its prior state until next cron`);
    }
  } catch (err) {
    console.warn(`[resilience-scores] Failed to refresh ranking cache: ${err.message}`);
  }

  // Verify BOTH the ranking data key AND the seed-meta key. Upstash REST
  // pipeline is non-transactional: the handler's atomic SET could land the
  // ranking but miss the meta, leaving /api/health reading stale meta over a
  // fresh ranking. If the meta didn't land within ~5 minutes, log a warning
  // so ops can grep for it — next cron will retry (ranking SET is
  // idempotent).
  const [rankingLen, metaFresh] = await Promise.all([
    fetch(`${url}/strlen/${encodeURIComponent(RESILIENCE_RANKING_CACHE_KEY)}`, {
      headers: { Authorization: `Bearer ${token}` },
      signal: AbortSignal.timeout(5_000),
    }).then((r) => r.ok ? r.json() : null).then((d) => Number(d?.result || 0)).catch(() => 0),
    fetch(`${url}/get/seed-meta:resilience:ranking`, {
      headers: { Authorization: `Bearer ${token}` },
      signal: AbortSignal.timeout(5_000),
    }).then((r) => r.ok ? r.json() : null).then((d) => {
      if (!d?.result) return false;
      try {
        const meta = JSON.parse(d.result);
        return typeof meta?.fetchedAt === 'number' && (Date.now() - meta.fetchedAt) < 5 * 60 * 1000;
      } catch { return false; }
    }).catch(() => false),
  ]);
  const rankingPresent = rankingLen > 0;
  if (rankingPresent && !metaFresh) {
    console.warn(`[resilience-scores] Partial publish: ${RESILIENCE_RANKING_CACHE_KEY} present but seed-meta not fresh — next cron will retry (handler SET is idempotent)`);
  }
  return rankingPresent;
}
// The seeder does NOT write seed-meta:resilience:ranking. Previously it did,
// as a "heartbeat" when Pro traffic was quiet — but it could only attest to
// "recordCount of per-country scores", not to whether the ranking key
// (RESILIENCE_RANKING_CACHE_KEY) was actually published this cron. The
// ranking handler gates its SET on a 75% coverage threshold and skips both
// the ranking and its meta when the gate fails; a stale-but-present ranking
// key combined with a fresh seeder meta write was exactly the "meta says
// fresh, data is stale" failure mode this PR exists to eliminate. The
// handler is now the sole writer of meta, and it writes both keys atomically
// via the same pipeline only when coverage passes. refreshRankingAggregate()
// triggers the handler every cron so meta never goes silently stale during
// quiet Pro usage — which was the original reason the seeder meta write
// existed.

async function main() {
  const startedAt = Date.now();
  const result = await seedResilienceScores();
  logSeedResult('resilience:scores', result.recordCount ?? 0, Date.now() - startedAt, {
    skipped: Boolean(result.skipped),
    ...(result.total != null && { total: result.total }),
    ...(result.reason != null && { reason: result.reason }),
    ...(result.intervalsWritten != null && { intervalsWritten: result.intervalsWritten }),
  });
  if (!result.skipped && (result.recordCount ?? 0) > 0 && !result.rankingPresent) {
    // Observability only — seeder never writes seed-meta. Health will flag the
    // stale meta on its own if this persists across multiple cron ticks.
    console.warn(`[resilience-scores] ${RESILIENCE_RANKING_CACHE_KEY} absent after rebuild attempt; handler-side coverage gate likely tripped. Next cron will retry.`);
  }
}

if (process.argv[1]?.endsWith('seed-resilience-scores.mjs')) {
  main().catch((err) => {
    const message = err instanceof Error ? err.message : String(err);
    console.error(`FATAL: ${message}`);
    process.exit(1);
  });
}