Mirror of https://github.com/koala73/worldmonitor.git
Synced 2026-04-25 17:14:57 +02:00
Branch: main · 2 commits
Commit 57414e4762
fix(open-meteo): curl proxy as second-choice when CONNECT proxy fails (#3119)
* fix(open-meteo): curl proxy as second-choice when CONNECT proxy fails

  Decodo's CONNECT egress and curl egress reach DIFFERENT IP pools (per scripts/_proxy-utils.cjs:67). Probed 2026-04-16 against Yahoo Finance:

  - Yahoo via CONNECT (httpsProxyFetchRaw): HTTP 404
  - Yahoo via curl (curlFetch): HTTP 200

  For Open-Meteo both paths happen to work today, but pinning the helper to one path is a single point of failure if Decodo rebalances pools, or if Open-Meteo starts behaving like Yahoo. PR #3118 wired only the CONNECT path (`httpsProxyFetchRaw`); this commit adds curl as a second-choice attempt that runs only when CONNECT also fails.

  Cascade: direct retries (3) → CONNECT proxy (1) → curl proxy (1) → throw

  Steady-state cost: zero. The curl exec only runs when CONNECT has also failed. The final exhausted-throw now appends the LAST proxy error too, so on-call sees both upstream signals (direct + proxy) instead of just direct.

  Tests: added 4 cases locking the cascade behavior:

  - CONNECT fails → curl succeeds: returns curl data, neither throws
  - CONNECT succeeds: curl never invoked (cost gate)
  - CONNECT fails AND curl fails: throws exhausted with both errors visible in the message (HTTP 429 from direct + curl 502 from proxy)
  - curl returns malformed JSON: caught, warns, and throws exhausted

  Updated 2 existing tests to also stub _proxyCurlFetcher so they don't shell out to real curl when CONNECT is mocked-failed (which would have run real curl against proxy.test:8000, an 8s timeout per test).

  Verification:
  - tests/open-meteo-proxy-fallback.test.mjs → 12/12 pass (was 8, +4 new)
  - npm run test:data → 5367/5367 (+4)
  - npm run typecheck:all → clean

  Followup to PR #3118.

* fix: CONNECT leg uses resolveProxyForConnect; lock production defaults

  P1 from the PR #3119 review: the cascade was logged as 'CONNECT proxy → curl proxy', but BOTH legs were resolving via resolveProxy(), which rewrites gate.decodo.com → us.decodo.com for curl egress. So the 'two-leg cascade' was actually one Decodo egress pool wearing two transport mechanisms, defeating the redundancy this PR is supposed to provide.

  Fix: import resolveProxyForConnect (preserves gate.decodo.com, the host Decodo routes via its CONNECT egress pool, distinct from the curl-egress pool reached by us.decodo.com via resolveProxy). The CONNECT leg uses resolveProxyForConnect; the curl leg uses resolveProxy. This matches the established pattern in scripts/seed-portwatch-chokepoints-ref.mjs:33-37 and scripts/seed-recovery-external-debt.mjs:31-35.

  Refactored test seams: split the single _proxyResolver into _connectProxyResolver + _curlProxyResolver. Test files inject both.

  P2 fix: every cascade test injected _proxyResolver, so the suite stayed green even when production defaults were misconfigured. Exported the _PROXY_DEFAULTS object and added 2 lock-tests:

  1. The CONNECT leg defaults to resolveProxyForConnect and the curl leg to resolveProxy (reference equality on each of the 4 default fields).
  2. The connect and curl resolvers are different functions, guarding against the 'collapsed cascade' regression class generally, not just this specific instance.

  Updated the 8 existing cascade tests to inject BOTH resolvers. The docstring at the top of the file now spells out the wiring invariant and points to the lock-tests.

  Verification:
  - tests/open-meteo-proxy-fallback.test.mjs: 14/14 pass (+2)
  - npm run test:data: 5369/5369 (+2)
  - npm run typecheck:all: clean

  Followup commit on PR #3119.

* fix(open-meteo): future-proof sync curlFetch call with Promise.resolve+await

  Greptile P2: _proxyCurlFetcher (curlFetch / execFileSync) is sync today; the adjacent CONNECT path is async (await _proxyFetcher(...)). A future refactor of curlFetch to async would silently break this line: JSON.parse would receive a Promise<string> instead of a string and explode at parse time, not at the obvious call site. Wrapping with await Promise.resolve(...) is a no-op for the current sync implementation but auto-handles a future async refactor. A comment spells out the contract so the wrap doesn't read as cargo-cult.

  Tests still 14/14.
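The three-leg cascade and the two-resolver wiring described above can be sketched as follows. This is a hypothetical illustration, not the code in scripts/_open-meteo-archive.mjs: the function and option names (`fetchWithCascade`, `connectProxyResolver`, `curlProxyResolver`, `connectFetch`) are invented for the sketch; only `httpsProxyFetchRaw`, `curlFetch`, `resolveProxy`, and `resolveProxyForConnect` come from the commit message.

```javascript
// Hypothetical sketch of the cascade: direct retries (3) -> CONNECT proxy (1)
// -> curl proxy (1) -> throw. Real names and wiring differ; this only
// illustrates the control flow and the cost gate on the curl leg.
async function fetchWithCascade(url, {
  directFetch,            // e.g. global fetch
  connectProxyResolver,   // production would default to resolveProxyForConnect
  curlProxyResolver,      // production would default to resolveProxy
  connectFetch,           // production would default to httpsProxyFetchRaw
  curlFetch,              // sync today (execFileSync under the hood)
  directRetries = 3,
} = {}) {
  let lastDirectError;

  // Leg 1: direct, with retries. Record errors; never rethrow here.
  for (let attempt = 0; attempt < directRetries; attempt++) {
    try {
      const res = await directFetch(url);
      if (res.ok) return res.json();
      lastDirectError = new Error(`direct HTTP ${res.status}`);
    } catch (err) {
      lastDirectError = err;
    }
  }

  // Leg 2: CONNECT proxy. Uses the CONNECT-specific resolver so the two
  // proxy legs reach *different* egress pools (the P1 fix above).
  let lastProxyError;
  const connectProxy = connectProxyResolver();
  if (connectProxy) {
    try {
      return JSON.parse(await connectFetch(url, connectProxy));
    } catch (err) {
      lastProxyError = err;
    }
  }

  // Leg 3: curl proxy. Only reached when CONNECT also failed (cost gate).
  const curlProxy = curlProxyResolver();
  if (curlProxy) {
    try {
      // Promise.resolve is a no-op while curlFetch is sync, but keeps this
      // line correct if curlFetch is ever refactored to async.
      return JSON.parse(await Promise.resolve(curlFetch(url, curlProxy)));
    } catch (err) {
      lastProxyError = err;
    }
  }

  // Exhausted: surface BOTH upstream signals for on-call.
  throw new Error(
    `retries exhausted: direct=${lastDirectError?.message} proxy=${lastProxyError?.message}`,
    { cause: lastDirectError },
  );
}
```

Because both resolvers and both fetchers are injectable, the cascade's cost gate (curl runs only after CONNECT fails) can be asserted directly in tests without shelling out to real curl.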
Commit 5d1c8625e9
fix(seed-climate-zone-normals): proxy fallback when Open-Meteo 429s on Railway IP (#3118)
* fix(seed-climate-zone-normals): proxy fallback when Open-Meteo 429s on Railway IP

  Railway logs.1776312819911.log showed seed-climate-zone-normals failing every batch with HTTP 429 from Open-Meteo's free-tier per-IP throttle (2026-04-16). The seeder retried with 2/4/8/16s backoff but exhausted without ever falling back to the project's Decodo proxy infrastructure that other rate-limited sources (FRED, IMF) already use.

  Open-Meteo throttles by source IP. Railway containers share IP pools and get 429 storms whenever zone-normals fires (monthly cron, so high churn when it runs). Result: PR #3097's bake clock for climate:zone-normals:v1 couldn't start, because the seeder couldn't write the contract envelope even when manually triggered.

  Fix: after direct retries exhaust, _open-meteo-archive.mjs falls back to httpsProxyFetchRaw (Decodo), the same pattern as fredFetchJson and imfFetchJson in _seed-utils.mjs. Skips silently if no proxy is configured (preserving existing behavior in non-Railway envs).

  Added tests/open-meteo-proxy-fallback.test.mjs (4 cases):

  - 429 with no proxy → throws after exhausting retries (pre-fix behavior preserved)
  - 200 OK → returns parsed batch without touching the proxy path
  - batch size mismatch → throws even on 200
  - non-retryable 500 → breaks out, attempts proxy, throws exhausted (no extra direct retry, matching the new control flow)

  Verification: npm run test:data → 5359/5359, +4 new. node --check clean.

  The same pattern can be applied to any other helper that fetches Open-Meteo (grep 'open-meteo' scripts/) if more 429s show up.

* fix: proxy fallback runs on thrown direct errors + actually-exercised tests

  Addresses two PR #3118 review findings.

  P1: the catch block did 'throw err' on the final direct attempt, silently bypassing the proxy fallback for thrown-error cases (timeout, ECONNRESET, DNS failures). Only non-OK HTTP responses reached the proxy path. Fix: record the error in lastDirectError and 'break' so control falls through to the proxy fallback regardless of whether the direct path failed via a thrown error or a non-OK status. Also: include lastDirectError context in the final 'retries exhausted' message and Error.cause so on-call can see what triggered the fallback attempt (previously an opaque 'retries exhausted').

  P2: tests didn't exercise the actual proxy path. Refactored the helper to accept _proxyResolver and _proxyFetcher opt overrides (production defaults to the real resolveProxy/httpsProxyFetchRaw from _seed-utils.mjs; tests inject mocks). Added 4 new cases:

  - 429 + proxy succeeds → returns proxy data
  - thrown fetch error on final retry → proxy fallback runs (P1 regression guard with explicit assertion: directCalls=2, proxyCalls=1)
  - 429 + proxy ALSO fails → throws exhausted, original HTTP 429 in message + cause chain
  - proxy returns wrong batch size → caught, warns, and throws exhausted

  Verification:
  - tests/open-meteo-proxy-fallback.test.mjs: 8/8 pass (4 added)
  - npm run test:data: 5363/5363 pass (+4 from prior 5359)
  - node --check clean