worldmonitor/scripts/seed-climate-ocean-ice.mjs
Elie Habib 044598346e feat(seed-contract): PR 2a — runSeed envelope dual-write + 91 seeders migrated (#3097)
* feat(seed-contract): PR 2a — runSeed envelope dual-write + 91 seeders migrated

Opt-in contract path in runSeed: when opts.declareRecords is provided, write
{_seed, data} envelope to the canonical key alongside legacy seed-meta:*
(dual-write). State machine: OK / OK_ZERO / RETRY with zeroIsValid opt.
declareRecords throws or returns non-integer → hard fail (contract violation).
extraKeys[*] support per-key declareRecords; each extra key writes its own
envelope. Legacy seeders (no declareRecords) entirely unchanged.
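A minimal sketch of the dual-write state decision described above; apart from `_seed`/`data` and the OK / OK_ZERO / RETRY states, the helper and field names here are illustrative, not the actual runSeed internals:

```javascript
// Sketch of the contract-path decision in runSeed (illustrative names).
// declareRecords output must be an integer; anything else is a hard fail.
function buildEnvelopeSketch(data, { recordCount, schemaVersion, maxStaleMin, zeroIsValid = false }) {
  if (!Number.isInteger(recordCount)) {
    throw new Error('declareRecords must return an integer (contract violation)');
  }
  const state = recordCount > 0 ? 'OK' : (zeroIsValid ? 'OK_ZERO' : 'RETRY');
  return {
    state,
    // RETRY means: do not overwrite the canonical key with an empty payload.
    envelope: state === 'RETRY' ? null : {
      _seed: { recordCount, schemaVersion, maxStaleMin, fetchedAt: Date.now(), state },
      data,
    },
  };
}
```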

Migrated all 91 scripts/seed-*.mjs to contract mode. Each exports
declareRecords returning the canonical record count, and passes
schemaVersion: 1 + maxStaleMin (matched to api/health.js SEED_META, or 2.5x
interval where no registry entry exists). Contract conformance reports 84/86
seeders with full descriptor (2 pre-existing warnings).

Legacy seed-meta keys still written so unmigrated readers keep working;
follow-up slices flip health.js + readers to envelope-first.

Tests: 61/61 PR 1 tests still pass.

Next slices for PR 2:
- api/health.js registry collapse + 15 seed-bundle-*.mjs canonicalKey wiring
- reader migration (mcp, resilience, aviation, displacement, regional-snapshot)
- direct writers — ais-relay.cjs, consumer-prices-core publish.ts
- public-boundary stripSeedEnvelope + test migration

Plan: docs/plans/2026-04-14-002-fix-runseed-zero-record-lockout-plan.md

* fix(seed-contract): unwrap envelopes in internal cross-seed readers

After PR 2a enveloped 91 canonical keys as {_seed, data}, every script-side
reader that returned the raw parsed JSON started silently handing callers the
envelope instead of the bare payload. WoW baselines (bigmac, grocery-basket,
fear-greed) saw undefined .countries / .composite; seed-climate-anomalies saw
undefined .normals from climate:zone-normals:v1; seed-thermal-escalation saw
undefined .fireDetections from wildfire:fires:v1; seed-forecasts' ~40-key
pipeline batch returned envelopes for every input.

Fix: route every script-side reader through unwrapEnvelope(...).data. Legacy
bare-shape values pass through unchanged (unwrapEnvelope returns
{_seed: null, data: raw} for any non-envelope shape).
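The pass-through contract above can be sketched as follows (the real helper lives in the envelope source module; this only mirrors the described behavior):

```javascript
// Sketch of unwrapEnvelope's described contract: envelopes yield their
// payload, any non-envelope shape passes through as {_seed: null, data: raw}.
function unwrapEnvelopeSketch(raw) {
  const isEnvelope = raw != null
    && typeof raw === 'object'
    && !Array.isArray(raw)
    && '_seed' in raw
    && 'data' in raw;
  return isEnvelope ? { _seed: raw._seed, data: raw.data } : { _seed: null, data: raw };
}
```

Readers then use `unwrapEnvelopeSketch(parsed).data` everywhere, which is exactly why the fix is safe on both enveloped and legacy bare-shape values.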

Changed:
- scripts/_seed-utils.mjs: import unwrapEnvelope; redisGet, readSeedSnapshot,
  verifySeedKey all unwrap. Exported new readCanonicalValue() helper for
  cross-seed consumers.
- 18 seed-*.mjs scripts with local redisGet-style helpers or inline fetch
  patched to unwrap via the envelope source module (subagent sweep).
- scripts/seed-forecasts.mjs pipeline batch: parse() unwraps each result.
- scripts/seed-energy-spine.mjs redisMget: unwraps each result.

Tests:
- tests/seed-utils-envelope-reads.test.mjs: 7 new cases covering envelope
  + legacy + null paths for readSeedSnapshot and verifySeedKey.
- Full seed suite: 67/67 pass (was 61, +6 new).

Addresses both of the user's P1 findings on PR #3097.

* feat(seed-contract): envelope-aware reads in server + api helpers

Every RPC and public-boundary reader now automatically strips _seed from
contract-mode canonical keys. Legacy bare-shape values pass through unchanged
(unwrapEnvelope no-ops on non-envelope shapes).

Changed helpers (one-place fix — unblocks ~60 call sites):
- server/_shared/redis.ts: getRawJson, getCachedJson, getCachedJsonBatch
  unwrap by default. cachedFetchJson inherits via getCachedJson.
- api/_upstash-json.js: readJsonFromUpstash unwraps (covers api/mcp.ts
  tool responses + all its canonical-key reads).
- api/bootstrap.js: getCachedJsonBatch unwraps (public-boundary —
  clients never see envelope metadata).

Left intentionally unchanged:
- api/health.js / api/seed-health.js: read only seed-meta:* keys which
  remain bare-shape during dual-write. unwrapEnvelope already imported at
  the meta-read boundary (PR 1) as a defensive no-op.

Tests: 67/67 seed tests pass. typecheck + typecheck:api clean.

This is the blast-radius fix the PR #3097 review called out — external
readers that would otherwise see {_seed, data} after the writer side
migrated.

* fix(test): strip export keyword in vm.runInContext'd seed source

cross-source-signals-regulatory.test.mjs loads scripts/seed-cross-source-signals.mjs
via vm.runInContext, which cannot parse ESM `export` syntax. PR 2a added
`export function declareRecords` to every seeder, which broke this test's
static-analysis approach.

Fix: strip the `export` keyword from the declareRecords line in the
preprocessed source string so the function body still evaluates as a plain
declaration.
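A sketch of that preprocessing step, assuming a single regex pass over the loaded source string (the actual test helper may differ):

```javascript
// Strip the leading `export` keyword from the declareRecords line so the
// source evaluates in vm.runInContext as a plain function declaration.
function stripExportForVm(source) {
  return source.replace(/^export\s+(function\s+declareRecords\b)/m, '$1');
}
```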

Full test:data suite: 5307/5307 pass. typecheck + typecheck:api clean.

* feat(seed-contract): consumer-prices publish.ts writes envelopes

Wrap the 5 canonical keys written by consumer-prices-core/src/jobs/publish.ts
(overview, movers:7d/30d, freshness, categories:7d/30d/90d, retailer-spread,
basket-series) in {_seed, data} envelopes. Legacy seed-meta:<key> writes
preserved for dual-write.

Inlined a buildEnvelope helper (10 lines) rather than taking a cross-package
dependency — consumer-prices-core is a standalone npm package. Documented the
four-file parity contract (mjs source, ts mirror, js edge mirror, this copy).

Contract fields: sourceVersion='consumer-prices-core-publish-v1', schemaVersion=1,
state='OK' (recordCount>0) or 'OK_ZERO' (legitimate zero).
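A sketch of the inlined helper, assuming the contract fields listed above; the exact `_seed` layout in consumer-prices-core may differ:

```javascript
// Inlined-helper sketch: wrap a publish payload in the {_seed, data} envelope
// with the contract fields named in this commit (illustrative field order).
function buildEnvelope(data, recordCount) {
  return {
    _seed: {
      sourceVersion: 'consumer-prices-core-publish-v1',
      schemaVersion: 1,
      state: recordCount > 0 ? 'OK' : 'OK_ZERO',
      recordCount,
      fetchedAt: new Date().toISOString(),
    },
    data,
  };
}
```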

Typecheck: no new errors in publish.ts.

* fix(seed-contract): 3 more server-side readers unwrap envelopes

Found during final audit:

- server/worldmonitor/resilience/v1/_shared.ts: resilience score reader
  parsed cached GetResilienceScoreResponse raw. Contract-mode seed-resilience-scores
  now envelopes those keys.
- server/worldmonitor/resilience/v1/get-resilience-ranking.ts: p05/p95
  interval lookup parsed raw from seed-resilience-scores' extra-key path.
- server/worldmonitor/infrastructure/v1/_shared.ts: mgetJson() used for
  count-source keys (wildfire:fires:v1, news:insights:v1) which are both
  contract-mode now.

All three now unwrap via server/_shared/seed-envelope. Legacy shapes pass
through unchanged.

Typecheck clean.

* feat(seed-contract): ais-relay.cjs direct writes produce envelopes

32 canonical-key write sites in scripts/ais-relay.cjs now produce {_seed, data}
envelopes. Inlined buildEnvelope() (CJS module can't require ESM source) +
envelopeWrite(key, data, ttlSeconds, meta) wrapper. Enveloped keys span market
bootstrap, aviation, cyber-threats, theater-posture, weather-alerts, economic
spending/fred/worldbank, tech-events, corridor-risk, usni-fleet, shipping-stress,
social:reddit, wsb-tickers, pizzint, product-catalog, chokepoint transits,
ucdp-events, satellites, oref.

Left bare (not seeded data keys): seed-meta:* (dual-write legacy),
classifyCacheKey LLM cache, notam:prev-closed-state internal state,
wm:notif:scan-dedup flags.

Updated tests/ucdp-seed-resilience.test.mjs regex to accept both upstashSet
(pre-contract) and envelopeWrite (post-contract) call patterns.

* feat(seed-contract): 15 bundle files add canonicalKey for envelope gate

54 bundle sections across 12 files now declare canonicalKey alongside the
existing seedMetaKey. _bundle-runner.mjs (from PR 1) prefers canonicalKey
when both are present — gates section runs on envelope._seed.fetchedAt
read directly from the data key, eliminating the meta-outlives-data class
of bugs.

Files touched:
- climate (5), derived-signals (2), ecb-eu (3), energy-sources (6),
  health (2), imf-extended (4), macro (10), market-backup (9),
  portwatch (4), relay-backup (2), resilience-recovery (5), static-ref (2)

Skipped (14 sections, 3 whole bundles): multi-key writers, dynamic
templated keys (displacement year-scoped), or non-runSeed orchestrators
(regional brief cron, resilience-scores' 222-country publish, validation/
benchmark scripts). These continue to use seedMetaKey or their own gate.

seedMetaKey preserved everywhere — dual-write. _bundle-runner.mjs falls
back to legacy when canonicalKey is absent.

All 15 bundles pass node --check. test:data: 5307/5307. typecheck:all: clean.

* fix(seed-contract): 4 PR #3097 review P1s — transform/declareRecords mismatches + envelope leaks

Addresses both P1 findings and the extra-key seed-meta leak surfaced in review:

1. runSeed helper-level invariant: seed-meta:* keys NEVER envelope.
   scripts/_seed-utils.mjs exports shouldEnvelopeKey(key) — returns false for
   any key starting with 'seed-meta:'. Both atomicPublish (canonical) and
   writeExtraKey (extras) gate the envelope wrap through this helper. Fixes
   seed-iea-oil-stocks' ANALYSIS_META_EXTRA_KEY silently getting enveloped,
   which broke health.js parsing the value as bare {fetchedAt, recordCount}.
   Also defends against any future manual writeExtraKey(..., envelopeMeta)
   call that happens to target a seed-meta:* key.

2. seed-token-panels canonical + extras fixed.
   publishTransform returns data.defi (the defi panel itself, shape {tokens}).
   Old declareRecords counted data.defi.tokens + data.ai.tokens + data.other.tokens
   on the transformed payload → 0 → RETRY path → canonical market:defi-tokens:v1
   never wrote, and because runSeed returned before the extraKeys loop,
   market:ai-tokens:v1 + market:other-tokens:v1 stayed stale too.
   New: declareRecords counts data.tokens on the transformed shape. AI_KEY +
   OTHER_KEY extras reuse the same function (transforms return structurally
   identical panels). Added isMain guard so test imports don't fire runSeed.

3. api/product-catalog.js cached reader unwraps envelope.
   ais-relay.cjs now envelopes product-catalog:v2 via envelopeWrite(). The
   edge reader did raw JSON.parse(result) and returned {_seed, data} to
   clients, breaking the cached path. Fix: import unwrapEnvelope from
   ./_seed-envelope.js, apply after JSON.parse. One site — :238-241 is
   downstream of getFromCache(), so the single reader fix covers both.

4. Regression lock tests/seed-contract-transform-regressions.test.mjs (11 cases):
   - shouldEnvelopeKey invariant: seed-meta:* false, canonical true
   - Token-panels declareRecords works on transformed shape (canonical + both extras)
   - Explicit repro of pre-fix buggy signature returning 0 — guards against revert
   - resolveRecordCount accepts 0, rejects non-integer
   - Product-catalog envelope unwrap returns bare shape; legacy passes through
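The helper-level invariant above (seed-meta:* keys never envelope) reduces to a one-line predicate; sketched here with the name the commit gives it:

```javascript
// shouldEnvelopeKey gate: canonical data keys get the {_seed, data} wrap,
// seed-meta:* keys stay bare-shape during dual-write.
function shouldEnvelopeKey(key) {
  return typeof key === 'string' && !key.startsWith('seed-meta:');
}
```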

Verification:
- npm run test:data → 5318/5318 pass (was 5307 — 11 new regressions)
- npm run typecheck:all → clean
- node --check on every modified script

iea-oil-stocks canonical declareRecords was NOT broken (user confirmed during
review — buildIndex preserves .members); only its ANALYSIS_META_EXTRA_KEY
was affected, now covered generically by commit 1's helper invariant.

* fix(seed-contract): seed-token-panels validateFn also runs on post-transform shape

Review finding: fixing declareRecords wasn't sufficient — atomicPublish() runs
validateFn(publishData) on the transformed payload too. seed-token-panels'
validate() checked data.defi/.ai/.other on the transformed {tokens} shape,
returned false, and runSeed took the early skipped-write branch (before even
reaching the declareRecords RETRY logic). Net effect: same as before the
declareRecords fix — canonical + both extras stayed stale.

Fix: validate() now checks the canonical defi panel directly (Array.isArray
(data?.tokens) && has at least one t.price > 0). AI/OTHER panels are validated
implicitly by their own extraKey declareRecords on write.

Audited the other 9 seeders with publishTransform (bls-series, bis-extended,
bis-data, gdelt-intel, trade-flows, iea-oil-stocks, jodi-gas, sanctions-pressure,
forecasts): all validateFns correctly target the post-transform shape. Only
token-panels regressed.

Added 4 regression tests (tests/seed-contract-transform-regressions.test.mjs):
- validate accepts transformed panel with priced tokens
- validate rejects all-zero-price tokens
- validate rejects empty/missing tokens
- Explicit pre-fix repro (buggy old signature fails on transformed shape)

Verification:
- npm run test:data → 5322/5322 pass (was 5318; +4 new)
- npm run typecheck:all → clean
- node --check clean

* feat(seed-contract): add /api/seed-contract-probe validation endpoint

Single machine-readable gate for 'is PR #3097 working in production'.
Replaces the curl/jq ritual with one authenticated edge call that returns
HTTP 200 ok:true or 503 + failing check list.

What it validates:
- 8 canonical keys have {_seed, data} envelopes with required data fields
  and minRecords floors (fsi-eu, zone-normals, 3 token panels + minRecords
  guard against token-panels RETRY regression, product-catalog, wildfire,
  earthquakes).
- 2 seed-meta:* keys remain BARE (shouldEnvelopeKey invariant; guards
  against iea-oil-stocks ANALYSIS_META_EXTRA_KEY-class regressions).
- /api/product-catalog + /api/bootstrap responses contain no '_seed' leak.

Auth: x-probe-secret header must match RELAY_SHARED_SECRET (reuses existing
Vercel↔Railway internal trust boundary).

Probe logic is exported (checkProbe, checkPublicBoundary, DEFAULT_PROBES) for
hermetic testing. tests/seed-contract-probe.test.mjs covers every branch:
envelope pass/fail on field/records/shape, bare pass/fail on shape/field,
missing/malformed JSON, Redis non-2xx, boundary seed-leak detection,
DEFAULT_PROBES sanity (seed-meta invariant present, token-panels minRecords
guard present).

Usage:
  curl -H "x-probe-secret: $RELAY_SHARED_SECRET" \
       https://api.worldmonitor.app/api/seed-contract-probe

PR 3 will extend the probe with a stricter mode that asserts seed-meta:*
keys are GONE (not just bare) once legacy dual-write is removed.

Verification:
- tests/seed-contract-probe.test.mjs → 15/15 pass
- npm run test:data → 5338/5338 (was 5322; +16 new incl. conformance)
- npm run typecheck:all → clean

* fix(seed-contract): tighten probe — minRecords on AI/OTHER + cache-path source header

Review P2 findings: the probe's stated guards were weaker than advertised.

1. market:ai-tokens:v1 + market:other-tokens:v1 probes claimed to guard the
   token-panels extra-key RETRY regression but only checked shape='envelope'
   + dataHas:['tokens']. If an extra-key declareRecords regressed to 0, both
   probes would still pass because checkProbe() only inspects _seed.recordCount
   when minRecords is set. Now both enforce minRecords: 1.

2. /api/product-catalog boundary check only asserted no '_seed' leak — which
   is also true for the static fallback path. A broken cached reader
   (getFromCache returning null or throwing) could serve fallback silently
   and still pass this probe. Now:
   - api/product-catalog.js emits X-Product-Catalog-Source: cache|dodo|fallback
     on the response (the json() helper gained an optional source param wired
     to each of the three branches).
   - checkPublicBoundary declaratively requires that header's value match
     'cache' for /api/product-catalog, so a fallback-serve fails the probe
     with reason 'source:fallback!=cache' or 'source:missing!=cache'.
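The required-header comparison can be sketched as below; the config shape and reason strings mirror this description but are illustrative, not the probe's actual code:

```javascript
// Sketch of the declarative source-header check: when a boundary endpoint
// declares requireSourceHeader, the response header must match exactly.
function checkSourceHeader(headers, requireSourceHeader) {
  if (!requireSourceHeader) return { ok: true };
  const { name, value } = requireSourceHeader; // e.g. X-Product-Catalog-Source / 'cache'
  const actual = headers[name.toLowerCase()];
  if (actual === value) return { ok: true };
  return { ok: false, reason: `source:${actual ?? 'missing'}!=${value}` };
}
```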

Test updates (tests/seed-contract-probe.test.mjs):
- Boundary check reworked to use a BOUNDARY_CHECKS config with optional
  requireSourceHeader per endpoint.
- New cases: served-from-cache passes, served-from-fallback fails with source
  mismatch, missing header fails, seed-leak still takes precedence, bad
  status fails.
- Token-panels sanity test now asserts minRecords≥1 on all 3 panels.

Verification:
- tests/seed-contract-probe.test.mjs → 17/17 pass (was 15, +2 net)
- npm run test:data → 5340/5340
- npm run typecheck:all → clean
2026-04-15 09:16:27 +04:00


#!/usr/bin/env node
import { loadEnvFile, CHROME_UA, runSeed, getRedisCredentials } from './_seed-utils.mjs';
import { unwrapEnvelope } from './_seed-envelope-source.mjs';
loadEnvFile(import.meta.url);
// Cron: daily 08:00 UTC (0 8 * * *)
export const CLIMATE_OCEAN_ICE_KEY = 'climate:ocean-ice:v1';
export const CACHE_TTL = 86400; // 24h — daily satellite/climate indicator refresh
const NSIDC_DAILY_URL = 'https://noaadata.apps.nsidc.org/NOAA/G02135/north/daily/data/N_seaice_extent_daily_v4.0.csv';
const NSIDC_CLIMATOLOGY_URL = 'https://noaadata.apps.nsidc.org/NOAA/G02135/north/daily/data/N_seaice_extent_climatology_1981-2010_v4.0.csv';
const SEA_LEVEL_URL = 'https://sealevel.nasa.gov/overlay-global-mean-sea-level';
const OHC_700M_URL = 'https://www.ncei.noaa.gov/data/oceans/woa/DATA_ANALYSIS/3M_HEAT_CONTENT/DATA/basin/yearly/h22-w0-700m.dat';
const NOAA_GLOBAL_OCEAN_V6_INDEX_URL = 'https://www.ncei.noaa.gov/data/noaa-global-surface-temperature/v6/access/timeseries/';
const NOAA_GLOBAL_OCEAN_V51_URL = 'https://www.ncei.noaa.gov/data/noaa-global-surface-temperature/v5.1/access/timeseries/aravg.mon.ocean.90S.90N.v5.1.0.202312.asc';
function round(value, decimals = 2) {
  if (!Number.isFinite(value)) return NaN;
  const scale = 10 ** decimals;
  return Math.round(value * scale) / scale;
}
function median(values) {
  if (!values.length) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  if (sorted.length % 2 === 1) return sorted[mid];
  return (sorted[mid - 1] + sorted[mid]) / 2;
}
function toMonthKey(year, month) {
  return `${year}-${String(month).padStart(2, '0')}`;
}
function monthStartMs(year, month) {
  return Date.UTC(year, month - 1, 1);
}
function dayOfYear(year, month, day) {
  return Math.floor((Date.UTC(year, month - 1, day) - Date.UTC(year, 0, 0)) / (24 * 60 * 60 * 1000));
}
function midYearMs(yearWithFraction) {
  const wholeYear = Math.floor(yearWithFraction);
  const fraction = yearWithFraction - wholeYear;
  const yearStart = Date.UTC(wholeYear, 0, 1);
  return Math.round(yearStart + fraction * 365.2425 * 24 * 60 * 60 * 1000);
}
async function fetchText(url, label, { timeoutMs = 20_000 } = {}) {
  const resp = await fetch(url, {
    headers: {
      Accept: 'text/plain,text/csv,application/json,text/html;q=0.9,*/*;q=0.8',
      'User-Agent': CHROME_UA,
    },
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!resp.ok) throw new Error(`${label} HTTP ${resp.status}`);
  return resp.text();
}
export function parseSeaIceDailyRows(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && /^\d{4}\s*,\s*\d{1,2}\s*,\s*\d{1,2}\s*,/.test(line))
    .map((line) => line.split(',').map((part) => Number(part.trim())))
    .map((cols) => ({
      year: cols[0],
      month: cols[1],
      day: cols[2],
      extent: cols[3],
      area: cols[4],
      measuredAt: Date.UTC(cols[0], cols[1] - 1, cols[2]),
    }))
    .filter((row) => Number.isInteger(row.year)
      && Number.isInteger(row.month)
      && Number.isInteger(row.day)
      && Number.isFinite(row.extent)
      && row.extent > 0)
    .sort((a, b) => a.measuredAt - b.measuredAt);
}
export function parseSeaIceMonthlyRows(text, month) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && /^\d/.test(line))
    .map((line) => {
      const cols = line.split(',').map((part) => part.trim());
      const primaryExtent = Number(cols[4]);
      const fallbackExtent = Number(cols.at(-2));
      return {
        year: Number(cols[0]),
        month,
        // NSIDC v4 monthly files are: year, mo, source_dataset, region, extent, area.
        extent: Number.isFinite(primaryExtent) && primaryExtent > 0 ? primaryExtent : fallbackExtent,
      };
    })
    .filter((row) => Number.isInteger(row.year) && Number.isFinite(row.extent) && row.extent > 0);
}
export function parseSeaIceClimatologyRows(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && /^\d{3}\s*,/.test(line))
    .map((line) => line.split(',').map((part) => Number(part.trim())))
    .map((cols) => ({
      doy: cols[0],
      medianExtent: cols[5],
    }))
    .filter((row) => Number.isInteger(row.doy) && row.doy >= 1 && row.doy <= 366
      && Number.isFinite(row.medianExtent) && row.medianExtent > 0);
}
export function computeSeaIceMonthlyMedians(rowsByMonth) {
  const medians = new Map();
  for (const [month, rows] of rowsByMonth.entries()) {
    const baseline = rows
      .filter((row) => row.year >= 1981 && row.year <= 2010)
      .map((row) => row.extent)
      .filter((value) => Number.isFinite(value));
    if (baseline.length) {
      medians.set(month, round(median(baseline), 2));
    }
  }
  return medians;
}
function enumerateRecentMonths(year, month, count = 12) {
  const result = [];
  for (let offset = count - 1; offset >= 0; offset--) {
    const date = new Date(Date.UTC(year, month - 1 - offset, 1));
    result.push({
      year: date.getUTCFullYear(),
      month: date.getUTCMonth() + 1,
      key: toMonthKey(date.getUTCFullYear(), date.getUTCMonth() + 1),
    });
  }
  return result;
}
export function buildIceTrend12m(dailyRows, monthlyMedians) {
  const latest = dailyRows.at(-1);
  if (!latest) return [];
  const latestByMonth = new Map();
  for (const row of dailyRows) latestByMonth.set(toMonthKey(row.year, row.month), row);
  return enumerateRecentMonths(latest.year, latest.month, 12)
    .map(({ key, month }) => {
      const row = latestByMonth.get(key);
      const medianExtent = monthlyMedians.get(month);
      if (!row || !Number.isFinite(medianExtent)) return null;
      return {
        month: key,
        extentMkm2: round(row.extent, 2),
        anomalyMkm2: round(row.extent - medianExtent, 2),
      };
    })
    .filter((row) => row != null);
}
export function buildIceTrend12mFromClimatology(dailyRows, dailyMedianByDoy) {
  const latest = dailyRows.at(-1);
  if (!latest) return [];
  const latestByMonth = new Map();
  for (const row of dailyRows) latestByMonth.set(toMonthKey(row.year, row.month), row);
  return enumerateRecentMonths(latest.year, latest.month, 12)
    .map(({ key }) => {
      const row = latestByMonth.get(key);
      if (!row) return null;
      const climatologyMedian = dailyMedianByDoy.get(dayOfYear(row.year, row.month, row.day));
      if (!Number.isFinite(climatologyMedian)) return null;
      return {
        month: key,
        extentMkm2: round(row.extent, 2),
        anomalyMkm2: round(row.extent - climatologyMedian, 2),
      };
    })
    .filter((row) => row != null);
}
function classifyArcticTrend(current, monthlyMedian, dailyRows) {
  const sameDayHistory = dailyRows.filter((row) => row.month === current.month && row.day === current.day);
  const minSameDayExtent = sameDayHistory.length
    ? Math.min(...sameDayHistory.map((row) => row.extent))
    : Number.POSITIVE_INFINITY;
  if (sameDayHistory.length >= 2 && current.extent <= minSameDayExtent + 1e-9) {
    return 'record_low';
  }
  if (!Number.isFinite(monthlyMedian)) return null;
  const anomaly = current.extent - monthlyMedian;
  if (anomaly <= -0.5) return 'below_average';
  if (anomaly >= 0.5) return 'above_average';
  return 'average';
}
async function fetchSeaIceSection() {
  const [dailyText, climatologyResult] = await Promise.all([
    fetchText(NSIDC_DAILY_URL, 'NSIDC daily sea ice', { timeoutMs: 90_000 }),
    fetchText(NSIDC_CLIMATOLOGY_URL, 'NSIDC sea ice climatology').then((text) => ({ text })).catch((err) => {
      console.warn(`[OceanIce] NSIDC sea ice climatology unavailable: ${err?.message || err}`);
      return null;
    }),
  ]);
  const dailyRows = parseSeaIceDailyRows(dailyText);
  if (!dailyRows.length) {
    throw new Error('NSIDC daily sea ice rows missing');
  }
  const latest = dailyRows.at(-1);
  const dailyMedianByDoy = new Map(
    climatologyResult
      ? parseSeaIceClimatologyRows(climatologyResult.text).map((row) => [row.doy, row.medianExtent])
      : [],
  );
  const currentMedian = dailyMedianByDoy.get(dayOfYear(latest.year, latest.month, latest.day));
  const trend12m = buildIceTrend12mFromClimatology(dailyRows, dailyMedianByDoy);
  const arcticTrend = classifyArcticTrend(latest, currentMedian, dailyRows);
  return {
    data: {
      arctic_extent_mkm2: round(latest.extent, 2),
      ...(Number.isFinite(currentMedian) ? { arctic_extent_anomaly_mkm2: round(latest.extent - currentMedian, 2) } : {}),
      ...(arcticTrend != null ? { arctic_trend: arcticTrend } : {}),
      ...(trend12m.length
        ? {
          ice_trend_12m: trend12m.map((point) => ({
            month: point.month,
            extent_mkm2: point.extentMkm2,
            anomaly_mkm2: point.anomalyMkm2,
          })),
        }
        : {}),
    },
    measuredAt: latest.measuredAt,
  };
}
export function parseSeaLevelOverlay(html) {
  const normalized = html.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
  const riseMatch = normalized.match(/RISE SINCE 1993\s+([0-9]+(?:\.[0-9]+)?)\s+millimeters/i)
    ?? normalized.match(/since 1993[^0-9]*([0-9]+(?:\.[0-9]+)?)\s*(?:mm|millimeters)/i);
  const rateMatch = normalized.match(/current yearly rate of\s+[0-9.]+\s+inches\/year\s+\(([0-9.]+)\s+centimeters\/year\)/i)
    ?? normalized.match(/current.*?([0-9]+(?:\.[0-9]+)?)\s+centimeters\/year/i);
  return {
    seaLevelMmAbove1993: riseMatch ? round(Number(riseMatch[1]), 1) : NaN,
    seaLevelAnnualRiseMm: rateMatch ? round(Number(rateMatch[1]) * 10, 1) : NaN,
  };
}
async function fetchSeaLevelSection() {
  const html = await fetchText(SEA_LEVEL_URL, 'NASA global mean sea level');
  const parsed = parseSeaLevelOverlay(html);
  if (!Number.isFinite(parsed.seaLevelMmAbove1993) && !Number.isFinite(parsed.seaLevelAnnualRiseMm)) {
    throw new Error('Sea level page missing rise/rate values');
  }
  return {
    data: {
      ...(Number.isFinite(parsed.seaLevelMmAbove1993) ? { sea_level_mm_above_1993: parsed.seaLevelMmAbove1993 } : {}),
      ...(Number.isFinite(parsed.seaLevelAnnualRiseMm) ? { sea_level_annual_rise_mm: parsed.seaLevelAnnualRiseMm } : {}),
    },
  };
}
export function parseOhcYearlyRows(text) {
  const rows = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && !/^YEAR\b/i.test(line))
    .map((line) => line.split(/\s+/).map((part) => Number(part)))
    .map((cols) => ({
      yearMid: cols[0],
      world: cols[1],
    }))
    .filter((row) => Number.isFinite(row.yearMid) && Number.isFinite(row.world));
  if (rows.length) return rows;
  const numbers = Array.from(text.matchAll(/-?\d+(?:\.\d+)?/g), (match) => Number(match[0]));
  const fallback = [];
  for (let index = 0; index + 6 < numbers.length; index += 7) {
    fallback.push({
      yearMid: numbers[index],
      world: numbers[index + 1],
    });
  }
  return fallback.filter((row) => Number.isFinite(row.yearMid) && Number.isFinite(row.world));
}
async function fetchOhcSection() {
  const text = await fetchText(OHC_700M_URL, 'NOAA ocean heat content');
  const rows = parseOhcYearlyRows(text);
  const latest = rows.at(-1);
  if (!latest) throw new Error('OHC yearly rows missing');
  return {
    data: {
      ohc_0_700m_zj: round(latest.world * 10, 2),
    },
    measuredAt: midYearMs(latest.yearMid),
  };
}
export function parseOceanTemperatureRows(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && /^\d{4}\s+\d{1,2}\s+[-+]?\d/.test(line))
    .map((line) => line.split(/\s+/))
    .map((cols) => ({
      year: Number(cols[0]),
      month: Number(cols[1]),
      anomaly: Number(cols[2]),
    }))
    .filter((row) => Number.isInteger(row.year)
      && Number.isInteger(row.month)
      && row.month >= 1
      && row.month <= 12
      && Number.isFinite(row.anomaly))
    .sort((a, b) => (a.year - b.year) || (a.month - b.month));
}
export function computeOceanBaselineOffsets(rows, startYear = 1991, endYear = 2020) {
  const totals = new Map();
  const counts = new Map();
  for (const row of rows) {
    if (row.year < startYear || row.year > endYear) continue;
    totals.set(row.month, (totals.get(row.month) ?? 0) + row.anomaly);
    counts.set(row.month, (counts.get(row.month) ?? 0) + 1);
  }
  const offsets = new Map();
  for (let month = 1; month <= 12; month++) {
    const total = totals.get(month);
    const count = counts.get(month);
    if (!Number.isFinite(total) || !count) continue;
    offsets.set(month, round(total / count, 3));
  }
  return offsets;
}
export function extractLatestOceanSeriesPath(indexHtml) {
  const matches = Array.from(
    indexHtml.matchAll(/href="(aravg\.mon\.ocean\.90S\.90N\.(v6\.[0-9.]+?)\.(\d{6})\.asc)"/g),
    (match) => ({
      path: match[1],
      version: match[2],
      period: Number(match[3]),
    }),
  );
  if (!matches.length) return null;
  matches.sort((left, right) => {
    if (left.period !== right.period) return left.period - right.period;
    return left.version.localeCompare(right.version, undefined, { numeric: true });
  });
  return matches.at(-1)?.path ?? null;
}
async function fetchSstSection() {
  const [indexHtml, baselineText] = await Promise.all([
    fetchText(NOAA_GLOBAL_OCEAN_V6_INDEX_URL, 'NOAA global ocean temperature index'),
    fetchText(NOAA_GLOBAL_OCEAN_V51_URL, 'NOAA global ocean temperature baseline'),
  ]);
  const latestPath = extractLatestOceanSeriesPath(indexHtml);
  if (!latestPath) {
    throw new Error('NOAA global ocean temperature index missing latest series path');
  }
  const currentText = await fetchText(new URL(latestPath, NOAA_GLOBAL_OCEAN_V6_INDEX_URL).toString(), 'NOAA global ocean temperature series');
  const currentRows = parseOceanTemperatureRows(currentText);
  const baselineRows = parseOceanTemperatureRows(baselineText);
  const latest = currentRows.at(-1);
  if (!latest) throw new Error('NOAA global ocean temperature rows missing');
  const offsets = computeOceanBaselineOffsets(baselineRows);
  const baselineOffset = offsets.get(latest.month);
  if (!Number.isFinite(baselineOffset)) {
    throw new Error(`Missing NOAA ocean baseline offset for month ${latest.month}`);
  }
  // NOAA v6 ocean-only anomalies are relative to 1991-2020. Convert them back
  // to the requested 1971-2000 reference period using the NOAA v5.1 ocean-only
  // 1991-2020 monthly mean anomalies, which are already expressed against
  // 1971-2000. This keeps the output on the requested baseline while staying
  // on current NOAA data for the latest month.
  return {
    data: {
      sst_anomaly_c: round(latest.anomaly + baselineOffset, 2),
    },
    measuredAt: monthStartMs(latest.year, latest.month),
  };
}
const SOURCE_FIELD_GROUPS = [
  ['arctic_extent_mkm2', 'arctic_extent_anomaly_mkm2', 'arctic_trend', 'ice_trend_12m'],
  ['sea_level_mm_above_1993', 'sea_level_annual_rise_mm'],
  ['ohc_0_700m_zj'],
  ['sst_anomaly_c'],
];
export function buildOceanIcePayload(settled, priorCache) {
  const payload = {};
  const measuredAts = [];
  for (let i = 0; i < settled.length; i++) {
    const result = settled[i];
    if (result?.data) {
      Object.assign(payload, result.data);
      if (Number.isFinite(result.measuredAt) && result.measuredAt > 0) {
        measuredAts.push(result.measuredAt);
      }
    } else if (priorCache && typeof priorCache === 'object' && i < SOURCE_FIELD_GROUPS.length) {
      for (const field of SOURCE_FIELD_GROUPS[i]) {
        if (priorCache[field] != null) payload[field] = priorCache[field];
      }
    }
  }
  if (!Object.keys(payload).length) {
    throw new Error('All ocean/ice upstreams failed');
  }
  if (measuredAts.length) payload.measured_at = Math.max(...measuredAts);
  return payload;
}
async function readPriorCache() {
  try {
    const { url, token } = getRedisCredentials();
    const resp = await fetch(`${url}/get/${encodeURIComponent(CLIMATE_OCEAN_ICE_KEY)}`, {
      headers: { Authorization: `Bearer ${token}` },
      signal: AbortSignal.timeout(5_000),
    });
    if (!resp.ok) return null;
    const data = await resp.json();
    return data.result ? unwrapEnvelope(JSON.parse(data.result)).data : null;
  } catch {
    return null;
  }
}
export async function fetchOceanIceData() {
  const [allSettled, prior] = await Promise.all([
    Promise.allSettled([
      fetchSeaIceSection(),
      fetchSeaLevelSection(),
      fetchOhcSection(),
      fetchSstSection(),
    ]),
    readPriorCache(),
  ]);
  const resolved = allSettled.map((result, i) => {
    if (result.status === 'fulfilled') return result.value;
    console.warn(`[OceanIce] Source ${i} failed: ${result.reason?.message || result.reason}`);
    return null;
  });
  const hadFailures = resolved.some((r) => r == null);
  if (hadFailures && prior) {
    console.log('[OceanIce] Merging failed source groups with prior cache');
  }
  return buildOceanIcePayload(resolved, hadFailures ? prior : undefined);
}
export function countIndicators(data) {
  const payload = data ?? {};
  return [
    payload.arctic_extent_mkm2,
    payload.arctic_extent_anomaly_mkm2,
    payload.sea_level_mm_above_1993,
    payload.sea_level_annual_rise_mm,
    payload.ohc_0_700m_zj,
    payload.sst_anomaly_c,
    Array.isArray(payload.ice_trend_12m) && payload.ice_trend_12m.length ? 1 : NaN,
  ].filter((value) => Number.isFinite(value)).length;
}
function validate(data) {
  return countIndicators(data) > 0;
}
const isMain = process.argv[1] && import.meta.url.endsWith(process.argv[1].replace(/^file:\/\//, ''));
export function declareRecords(data) {
  // countIndicators is a function declaration in this module, so delegate
  // to it directly; no runtime typeof guard is needed.
  return countIndicators(data);
}
if (isMain) {
  runSeed('climate', 'ocean-ice', CLIMATE_OCEAN_ICE_KEY, fetchOceanIceData, {
    validateFn: validate,
    ttlSeconds: CACHE_TTL,
    recordCount: countIndicators,
    sourceVersion: 'nsidc-sea-ice_v4-climatology-noaa-ohc-nasa-gmsl-noaa-global-ocean-v6-v51-baseline-v3',
    declareRecords,
    schemaVersion: 1,
    maxStaleMin: 2880,
  }).catch((err) => {
    const cause = err.cause ? ` (cause: ${err.cause.message || err.cause.code || err.cause})` : '';
    console.error('FATAL:', (err.message || err) + cause);
    process.exit(1);
  });
}