mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* fix(portwatch): per-country timeout + SIGTERM progress flush
Diagnosed from Railway log 2026-04-20T04:00-04:07: Port-Activity section hit
the 420s section cap with only batch 1/15 logged. Gap between batch 1 (67.3s)
and SIGTERM was 352s of silence — batch 2 stalled because Promise.allSettled
waits for the slowest country and processCountry had no per-country budget.
One slow country (USA/CHN with many ports × many pages under ArcGIS EP3
throttling) blocked the whole batch and cascaded to the section timeout,
leaving batches 2..15 unattempted.
Two changes, both stabilisers ahead of the proper fix (globalising EP3):
1. Wrap processCountry in Promise.race against a 90s PER_COUNTRY_TIMEOUT_MS.
Bounds worst-case batch time at ~90s regardless of ArcGIS behaviour.
Orphan fetches keep running until their own AbortSignal.timeout(45s)
fires — acceptable since the process exits soon after either way.
2. Share a `progress` object between fetchAll() and the SIGTERM handler so
the kill path flushes batch index, seeded count, and the first 10 error
messages. Previously, a timeout kill discarded the errors array entirely,
which made every regression undiagnosable.
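The shared-progress idea in change 2 can be sketched roughly as follows. This is a minimal illustration, not the seeder's code: the field names on `progress` and the `formatProgress` helper are assumptions made for the example. Only the idea (one object mutated by the fetch loop and read by the SIGTERM handler, capped at the first 10 errors) comes from the description above.

```javascript
// Hypothetical shape: field names are illustrative, not taken from the seeder.
const progress = { batchIdx: 0, totalBatches: 0, seeded: 0, errors: [] };

// Render the batch index, the seeded count, and at most `maxErrors` messages.
function formatProgress(p, maxErrors = 10) {
  const shown = p.errors.slice(0, maxErrors);
  return [
    `batch ${p.batchIdx}/${p.totalBatches}, seeded=${p.seeded}, errors=${p.errors.length}`,
    ...shown.map((msg, i) => `  [${i + 1}] ${msg}`),
  ].join('\n');
}

// The kill path only reads the shared object, so it reports whatever the
// fetch loop had recorded at the moment the signal arrived.
process.once('SIGTERM', () => {
  console.error(`SIGTERM received\n${formatProgress(progress)}`);
  process.exit(1);
});
```

In this sketch the fetch loop would update `progress.batchIdx`, `progress.seeded`, and `progress.errors` in place as it goes, so the handler needs no coordination with it.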
* fix(portwatch): address PR #3222 P1+P2 (propagate abort, eager error flush)
Review feedback on #3222:
P1 — The 90s per-country timeout did not actually stop the timed-out
country's work; Promise.race rejected but processCountry kept paginating
with fresh 45s fetch timeouts per page, violating the CONCURRENCY=12 cap
and amplifying ArcGIS throttling instead of containing it.
Fix: thread an AbortController signal from withPerCountryTimeout through
processCountry → fetchActivityRows → fetchWithTimeout. fetchWithTimeout
combines the caller signal with AbortSignal.timeout(FETCH_TIMEOUT) via
AbortSignal.any so the per-country abort propagates into the in-flight
fetch. fetchActivityRows also checks signal.aborted between pages so a
cancel lands on the next iteration boundary even if the current page
has already resolved. The Node 24 runtime supports AbortSignal.any.
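A rough sketch of the P1 fix, under stated assumptions: the function names match the ones mentioned above, but their signatures, the error message text, and the `fetchAllPages` paging helper are guesses made for illustration. The three pieces are the timer that aborts a controller, the combined signal handed to `fetch`, and the abort check between pages.

```javascript
// Assumed signature: `work` receives the AbortSignal and returns a promise.
function withPerCountryTimeout(work, iso3, timeoutMs = 90_000) {
  const controller = new AbortController();
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`per-country timeout after ${timeoutMs}ms (${iso3})`);
      controller.abort(err); // cancels the country's in-flight work
      reject(err);
    }, timeoutMs);
  });
  return Promise.race([work(controller.signal), timeout])
    .finally(() => clearTimeout(timer));
}

// Combine the caller's signal with a per-request timeout so that either
// one aborts the in-flight fetch. AbortSignal.any needs a recent Node.
function fetchWithTimeout(url, { signal, timeoutMs = 45_000 } = {}) {
  const perRequest = AbortSignal.timeout(timeoutMs);
  const combined = signal ? AbortSignal.any([signal, perRequest]) : perRequest;
  return fetch(url, { signal: combined });
}

// Hypothetical paging loop: checks the signal at each iteration boundary,
// so a cancel lands even if the current page has already resolved.
async function fetchAllPages(fetchPage, signal) {
  const rows = [];
  for (let page = 0; ; page++) {
    signal?.throwIfAborted();
    const batch = await fetchPage(page, signal);
    if (batch.length === 0) return rows;
    rows.push(...batch);
  }
}
```

The timer both aborts the controller and rejects the race, so the caller sees the per-country error while the underlying work is told to stop rather than left to run.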
P2 — SIGTERM diagnostics missed failures from the currently-stuck batch
because progress.errors was only populated after Promise.allSettled
returned. A kill during the pending await left progress.errors empty.
Fix: attach p.catch(err => errors.push(...)) to each wrapped promise
before Promise.allSettled. Rejections land in the shared errors array
at the moment they fire, so a SIGTERM mid-batch sees every rejection
that has already occurred (including per-country timeouts that have
already aborted their controllers). The settled loop skips rejected
outcomes to avoid double-counting.
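The eager flush described above might look something like this. It is a sketch only: `runBatch`, the task shape, and the error string format are assumptions, while the pattern itself (a `.catch` attached before `Promise.allSettled`, and a settled loop that ignores rejected outcomes) follows the description.

```javascript
// Hypothetical batch runner: `tasks` is [{ iso3, run }], `errors` is the
// array shared with the SIGTERM handler.
async function runBatch(tasks, errors) {
  const wrapped = tasks.map(({ iso3, run }) => {
    const p = run();
    // Record the failure the moment it happens, not after the batch settles.
    p.catch((err) => errors.push(`${iso3}: ${err.message}`));
    return p;
  });
  const settled = await Promise.allSettled(wrapped);
  // Rejections were already recorded by the eager handler above, so only
  // fulfilled outcomes are collected here to avoid double-counting.
  return settled
    .filter((outcome) => outcome.status === 'fulfilled')
    .map((outcome) => outcome.value);
}
```

Because `p.catch(...)` returns a separate promise, attaching it does not change what `Promise.allSettled` observes for `p` itself.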
Also exports withPerCountryTimeout with an injectable timeoutMs so the
new runtime tests can exercise the abort path at 40ms. Runtime tests
verify: (a) timer fires → underlying signal aborted + work rejects with
the per-country message, (b) work-resolves-first returns the value,
(c) work-rejects-first surfaces the real error, (d) eager .catch flush
populates a shared errors array before allSettled resolves.
Tests: 45 pass (was 38, +7 — 4 runtime + 3 source-regex).
Full test:data: 5867 pass. Typecheck + lint clean.
* fix(portwatch): abort also cancels 429 proxy fallback (PR #3222 P1 follow-up)
Second review iteration on #3222: the per-country AbortController fix
from b2f4a2626 stopped at the direct fetch() and did not reach the 429
proxy fallback. httpsProxyFetchRaw only accepted timeoutMs, so a
timed-out country could keep a CONNECT tunnel + request alive for up
to another FETCH_TIMEOUT (45s) after the batch moved on — the exact
throttling scenario the PR is meant to contain. The concurrency cap
was still violated on the slow path.
Threads `signal` all the way through:
- scripts/_proxy-utils.cjs: proxyConnectTunnel + proxyFetch accept an
optional signal option. Early-reject if `signal.aborted` before
opening the socket. Otherwise addEventListener('abort') destroys the
in-flight proxy socket + TLS tunnel and rejects with signal.reason.
Listener removed in cleanup() on all terminal paths. Refactored both
functions around resolveOnce/rejectOnce guards so the abort path
races cleanly with timeout and network errors without double-settle.
- scripts/_seed-utils.mjs: httpsProxyFetchRaw accepts + forwards
`signal` to proxyFetch.
- scripts/seed-portwatch-port-activity.mjs: fetchWithTimeout's 429
branch passes its caller signal to httpsProxyFetchRaw.
Backward compatible: signal is optional in every layer, so the many
other callers of proxyFetch / httpsProxyFetchRaw across the repo are
unaffected.
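The once-guard pattern described for the proxy layer can be illustrated with a generic helper. This is a hypothetical sketch, not the code in scripts/_proxy-utils.cjs: `runAbortable`, the `start` callback, and the `destroy()` method on the returned resource are all invented here to show how abort, timeout, and normal completion can race without settling twice.

```javascript
// Hypothetical helper. `start(resolveOnce, rejectOnce)` opens a resource
// (e.g. a socket) and returns an object with a destroy() method.
function runAbortable(start, { signal, timeoutMs = 45_000 } = {}) {
  return new Promise((resolve, reject) => {
    // Early reject: never open the resource for an already-aborted signal.
    if (signal?.aborted) {
      reject(signal.reason);
      return;
    }
    let settled = false;
    let resource = null;
    let timer = null;

    const onAbort = () => rejectOnce(signal.reason);
    function cleanup() {
      clearTimeout(timer);
      signal?.removeEventListener('abort', onAbort);
    }
    function resolveOnce(value) {
      if (settled) return;
      settled = true;
      cleanup();
      resolve(value);
    }
    function rejectOnce(err) {
      if (settled) return;
      settled = true;
      cleanup();
      resource?.destroy(); // tear down the in-flight resource
      reject(err);
    }

    signal?.addEventListener('abort', onAbort, { once: true });
    timer = setTimeout(
      () => rejectOnce(new Error(`timeout after ${timeoutMs}ms`)),
      timeoutMs,
    );
    try {
      resource = start(resolveOnce, rejectOnce);
    } catch (err) {
      rejectOnce(err);
    }
  });
}
```

Since `signal` is optional and every access goes through `?.`, callers that pass only a timeout behave as before, which mirrors the backward-compatibility point above.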
Tests: 49 pass (was 45, +4). New runtime test proves pre-aborted
signals reject proxyFetch synchronously without touching the network.
Source-regex tests assert signal threading at each layer. Full
test:data 5871 pass. Typecheck + lint clean.