mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
Context: PR #3225 globalised EP3 because the per-country shape was missing the section budget. Post-merge production log (2026-04-20) proved the globalisation itself was worse: 42s/page full-table scans (ArcGIS has no `date` index, confirmed via service metadata probe) AND intermittent "Invalid query parameters" on the global WHERE.

Probes of outStatistics as an alternative showed it works for small countries (BRA: 19s, 103 ports) but times out server-side for heavy ones (USA: 313k historic rows, 30s+ server-compute, multiple retries returned HTTP_STATUS 000). Not a reliable path.

The only shape ArcGIS reliably handles is per-country WHERE ISO3='X' AND date > Y (uses the ISO3 index). Its problem was fitting 174 countries in the 420s portwatch bundle budget; solve that by giving it its own container.

Changes:
- scripts/seed-portwatch-port-activity.mjs: restore per-country paginated EP3 with the accumulator shape from PR #3225 folded into the per-country loop (memory stays O(ports-per-country), not O(all-rows)). Keep every stabiliser: AbortSignal.any through fetchWithTimeout, SIGTERM handler with stage/batch/errors flush, per-country Promise.race with AbortController that actually cancels the work, eager p.catch for mid-batch error flush.
- Add fetchWithRetryOnInvalidParams: a single retry on the specific "Invalid query parameters" error class ArcGIS has returned intermittently in prod. Does not retry other error classes.
- Bump LOCK_TTL_MS from 30 to 60 min to match the wider wall-time budget of the standalone cron.
- scripts/seed-bundle-portwatch.mjs: remove PW-Port-Activity from the main portwatch bundle. Keeps PW-Disruptions (hourly), PW-Main (6h), PW-Chokepoints-Ref (weekly).
- scripts/seed-bundle-portwatch-port-activity.mjs: new 1-section bundle. 540s section timeout, 570s bundle budget. Includes the full Railway service provisioning checklist in the header.
- Dockerfile.seed-bundle-portwatch-port-activity: mirrors the resilience-validation pattern with node:22-alpine, full scripts/ tree copy (avoids the add-an-import-forget-to-COPY class that has bitten us 3+ times), shared/ for _country-resolver.
- tests/portwatch-port-activity-seed.test.mjs: rewrite assertions for the per-country shape.

54 tests pass (was 50, +4 for new assertions on the standalone bundle + Dockerfile + retry wrapper + ISO3 shape). Full test:data: 5883 pass. Typecheck + lint clean.

Post-merge Railway provisioning: see header of seed-bundle-portwatch-port-activity.mjs for the 7-step checklist.
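
The single-retry behaviour described for fetchWithRetryOnInvalidParams can be sketched roughly as follows. This is only an illustration, not the code in scripts/seed-portwatch-port-activity.mjs: the signature, the injected `fetchJson` helper, the regex match on the error message, and the assumption that ArcGIS reports the failure as an `{ error: { message } }` body are all hypothetical.

```javascript
// Hypothetical sketch: the function name comes from the commit message, but the
// signature, the injected fetchJson helper and the matching rule are assumptions.
const INVALID_PARAMS_RE = /Invalid query parameters/i;

async function fetchWithRetryOnInvalidParams(fetchJson, url, { signal } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      const body = await fetchJson(url, { signal });
      // Assumption: the service can report a failure inside the response body.
      if (body?.error) throw new Error(body.error.message ?? 'ArcGIS error');
      return body;
    } catch (err) {
      const retryable = INVALID_PARAMS_RE.test(String(err?.message));
      // Retry at most once, only for the one error class, never after an abort.
      if (!retryable || attempt >= 1 || signal?.aborted) throw err;
    }
  }
}
```

Passing the fetcher in as an argument keeps the sketch testable without network access; any other error class is rethrown on the first failure, matching the "does not retry other error classes" note above.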
52 lines
2.5 KiB
JavaScript
#!/usr/bin/env node
// Standalone Railway cron service for supply_chain:portwatch-ports.
//
// Split out of seed-bundle-portwatch.mjs on 2026-04-20 because ArcGIS
// Daily_Ports_Data queries scale poorly at the N-countries level: even
// with per-country ISO3-indexed WHERE clauses + concurrency 12, wall
// time exceeded the bundle's 540s budget. Globalising the fetch (PR
// #3225) traded timeouts for a different failure mode (42s full-table
// scans + intermittent "Invalid query parameters"). Giving this seeder
// its own container decouples its worst-case runtime from the main
// portwatch bundle and lets it run on an interval appropriate to the
// ~10-day upstream dataset lag.
//
// Railway service provisioning checklist (after merge):
//   1. Create new service: portwatch-port-activity-seed
//   2. Builder: DOCKERFILE, dockerfilePath: Dockerfile.seed-bundle-portwatch-port-activity
//   3. Root directory: "" (empty), which avoids NIXPACKS auto-detection (see
//      feedback_railway_dockerfile_autodetect_overrides_builder.md)
//   4. Cron schedule: "0 */24 * * *" (daily, UTC). Dataset lag means
//      12h cadence is overkill; 24h keeps us inside the freshness
//      expectations downstream
//   5. Env vars (copy from existing seed services):
//      UPSTASH_REDIS_REST_URL, UPSTASH_REDIS_REST_TOKEN,
//      PROXY_URL (for 429 fallback)
//   6. Watch paths (in service settings):
//      scripts/seed-portwatch-port-activity.mjs,
//      scripts/seed-bundle-portwatch-port-activity.mjs,
//      scripts/_seed-utils.mjs,
//      scripts/_proxy-utils.cjs,
//      scripts/_country-resolver.mjs,
//      scripts/_bundle-runner.mjs,
//      Dockerfile.seed-bundle-portwatch-port-activity
//   7. Monitor first run for STALE_SEED recovery on portwatch-ports.
import { runBundle, HOUR } from './_bundle-runner.mjs';

await runBundle('portwatch-port-activity', [
  {
    label: 'PW-Port-Activity',
    script: 'seed-portwatch-port-activity.mjs',
    seedMetaKey: 'supply_chain:portwatch-ports',
    canonicalKey: 'supply_chain:portwatch-ports:v1:_countries',
    // 12h interval gate: matches the historical cadence. Actual Railway
    // cron should trigger at 24h; the interval gate prevents rapid-fire
    // re-runs if someone manually retriggers mid-day.
    intervalMs: 12 * HOUR,
    // 540s section timeout: full budget for the one section. Bundle
    // runner still SIGTERMs if the child hangs, and the seeder's
    // SIGTERM handler releases the lock + extends TTLs.
    timeoutMs: 540_000,
  },
], { maxBundleMs: 570_000 });
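
The interplay between the 24h cron and the 12h `intervalMs` gate configured above can be illustrated with a small sketch. It does not reproduce _bundle-runner.mjs: `shouldRunSection` and `lastRunAtMs` are hypothetical names, and `HOUR` is assumed to be the number of milliseconds in an hour.

```javascript
// Hypothetical sketch of an interval gate; not the real _bundle-runner.mjs logic.
const HOUR = 60 * 60 * 1000; // assumption: HOUR is expressed in milliseconds

function shouldRunSection({ lastRunAtMs, intervalMs, nowMs = Date.now() }) {
  // No recorded previous run: let the section run.
  if (!Number.isFinite(lastRunAtMs)) return true;
  // Otherwise run only once the configured interval has elapsed.
  return nowMs - lastRunAtMs >= intervalMs;
}

// A scheduled run 24h after the previous one passes the 12h gate...
const now = Date.now();
const scheduled = shouldRunSection({ lastRunAtMs: now - 24 * HOUR, intervalMs: 12 * HOUR, nowMs: now });
// ...while a manual retrigger 3h after a successful run is skipped.
const manual = shouldRunSection({ lastRunAtMs: now - 3 * HOUR, intervalMs: 12 * HOUR, nowMs: now });
```

Under these assumptions the gate never blocks the daily cron, since 24h is always at least 12h, and only suppresses retriggers that arrive within 12h of the last run.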