2 Commits

Elie Habib
57414e4762 fix(open-meteo): curl proxy as second-choice when CONNECT proxy fails (#3119)
* fix(open-meteo): curl proxy as second-choice when CONNECT proxy fails

Decodo's CONNECT egress and curl egress reach DIFFERENT IP pools (per
scripts/_proxy-utils.cjs:67). Probed 2026-04-16 against Yahoo Finance:

  Yahoo via CONNECT (httpsProxyFetchRaw): HTTP 404
  Yahoo via curl    (curlFetch):          HTTP 200

For Open-Meteo both paths happen to work today, but pinning the helper
to one path is a single point of failure if Decodo rebalances pools, or
if Open-Meteo starts behaving like Yahoo. PR #3118 wired only the
CONNECT path (`httpsProxyFetchRaw`); this commit adds curl as a
second-choice attempt that runs only when CONNECT also fails.

Cascade:
  direct retries (3) → CONNECT proxy (1) → curl proxy (1) → throw

Steady-state cost: zero. Curl exec only runs when CONNECT also failed.

Final exhausted-throw now appends the LAST proxy error too, so on-call
sees both upstream signals (direct + proxy) instead of just direct.
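The cascade above can be sketched as follows (a minimal illustration with hypothetical names and injected fetchers; the real helper lives in _open-meteo-archive.mjs and includes backoff between direct retries):

```javascript
// Sketch of: direct retries (3) -> CONNECT proxy (1) -> curl proxy (1) -> throw.
// Fetchers are injected so the control flow can run without network access.
async function fetchWithCascade(url, opts) {
  const { directFetch, connectFetch, curlFetch, directRetries = 3 } = opts;
  let lastDirectError = null;
  for (let attempt = 1; attempt <= directRetries; attempt++) {
    try {
      return await directFetch(url);          // first choice: direct
    } catch (err) {
      lastDirectError = err;                  // kept for the exhausted message
    }
  }
  let lastProxyError = null;
  try {
    return await connectFetch(url);           // second choice: CONNECT proxy
  } catch (err) {
    lastProxyError = err;
  }
  try {
    return await curlFetch(url);              // third choice: curl, only after
  } catch (err) {                             // CONNECT also failed (cost gate)
    lastProxyError = err;
  }
  // Exhausted throw carries BOTH upstream signals for on-call.
  throw new Error(
    `retries exhausted; direct: ${lastDirectError && lastDirectError.message}; ` +
      `proxy: ${lastProxyError && lastProxyError.message}`,
    { cause: lastDirectError },
  );
}
```

Because the curl leg sits inside the path reached only when the CONNECT leg throws, its steady-state cost is zero, matching the commit's cost gate.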

Tests: added 4 cases locking the cascade behavior:

- CONNECT fails → curl succeeds: returns curl data, neither throws
- CONNECT succeeds: curl never invoked (cost gate)
- CONNECT fails AND curl fails: throws exhausted with both errors
  visible in the message (HTTP 429 from direct + curl 502 from proxy)
- curl returns malformed JSON: caught + warns + throws exhausted

Updated 2 existing tests to also stub _proxyCurlFetcher so they don't
shell out to real curl when CONNECT is mocked to fail (previously they
would have run real curl against proxy.test:8000, an 8s timeout per test).

Verification:
- tests/open-meteo-proxy-fallback.test.mjs → 12/12 pass (was 8, +4 new)
- npm run test:data → 5367/5367 (+4)
- npm run typecheck:all → clean

Followup to PR #3118.

* fix: CONNECT leg uses resolveProxyForConnect; lock production defaults

P1 from PR #3119 review: the cascade was logged as 'CONNECT proxy → curl
proxy' but BOTH legs were resolving via resolveProxy() — which rewrites
gate.decodo.com → us.decodo.com for curl egress. So the 'two-leg
cascade' was actually one Decodo egress pool wearing two transport
mechanisms, defeating the redundancy this PR is supposed to provide.

Fix: import resolveProxyForConnect (preserves gate.decodo.com — the
host Decodo routes via its CONNECT egress pool, distinct from the
curl-egress pool reached by us.decodo.com via resolveProxy). CONNECT
leg uses resolveProxyForConnect; curl leg uses resolveProxy. Matches
the established pattern in scripts/seed-portwatch-chokepoints-ref.mjs:33-37
and scripts/seed-recovery-external-debt.mjs:31-35.
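The wiring invariant can be illustrated like this (hypothetical stand-ins, not the real implementations in _seed-utils.mjs / scripts/_proxy-utils.cjs; only the host-rewrite behavior described above is taken from the source):

```javascript
// Each leg must resolve through a DIFFERENT function so the two legs
// reach different Decodo egress pools.
function resolveProxyForConnect(proxyUrl) {
  // CONNECT leg: preserve gate.decodo.com (CONNECT egress pool)
  return proxyUrl;
}

function resolveProxy(proxyUrl) {
  // curl leg: rewrite to the curl-egress pool host
  return proxyUrl.replace('gate.decodo.com', 'us.decodo.com');
}

// Exported defaults that lock-tests can check by reference equality.
const _PROXY_DEFAULTS = {
  connectResolver: resolveProxyForConnect,
  curlResolver: resolveProxy,
};
```

Checking the two resolvers are different functions (not just differently named) is what guards against the 'collapsed cascade' regression class described below.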

Refactored test seams: split single _proxyResolver into
_connectProxyResolver + _curlProxyResolver. Test files inject both.

P2 fix: every cascade test injected _proxyResolver, so the suite stayed
green even when production defaults were misconfigured. Exported
_PROXY_DEFAULTS object and added 2 lock-tests:

  1. CONNECT leg uses resolveProxyForConnect, curl leg uses resolveProxy
     (reference equality on each of 4 default fields).
  2. connect/curl resolvers are different functions — guards against the
     'collapsed cascade' regression class generally, not just this
     specific instance.

Updated the 8 existing cascade tests to inject BOTH resolvers. The
docstring at the top of the file now spells out the wiring invariant
and points to the lock-tests.

Verification:
- tests/open-meteo-proxy-fallback.test.mjs: 14/14 pass (+2)
- npm run test:data: 5369/5369 (+2)
- npm run typecheck:all: clean

Followup commit on PR #3119.

* fix(open-meteo): future-proof sync curlFetch call with Promise.resolve+await

Greptile P2: _proxyCurlFetcher (curlFetch / execFileSync) is sync today,
while the adjacent CONNECT path is async (await _proxyFetcher(...)). A
future refactor of curlFetch to async would silently break this line:
JSON.parse would receive a Promise<string> instead of a string and explode
at parse time, not at the obvious call site.

Wrapping with await Promise.resolve(...) is a no-op for the current sync
implementation but auto-handles a future async refactor. Comment spells
out the contract so the wrap doesn't read as cargo-cult.
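A minimal sketch of the wrap (curlFetchSync is a stand-in for the real execFileSync-based curlFetch): `await Promise.resolve(x)` is a no-op when x is already a string, and unwraps the Promise if a future refactor makes the fetcher async, so JSON.parse always receives a string.

```javascript
// Stand-in for the sync, execFileSync-based curl fetcher.
function curlFetchSync(url) {
  return '{"temperature": 21.5}';
}

async function parseCurlResponse(url, curlFetcher = curlFetchSync) {
  // Contract: tolerate curlFetcher being sync (string) or async
  // (Promise<string>) without changing this call site.
  const body = await Promise.resolve(curlFetcher(url));
  return JSON.parse(body);
}
```

The same call site works unchanged whether the injected fetcher returns a string or a Promise<string>.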

Tests still 14/14.
2026-04-16 09:24:12 +04:00
Elie Habib
5d1c8625e9 fix(seed-climate-zone-normals): proxy fallback when Open-Meteo 429s on Railway IP (#3118)
* fix(seed-climate-zone-normals): proxy fallback when Open-Meteo 429s on Railway IP

Railway logs.1776312819911.log showed seed-climate-zone-normals failing
every batch with HTTP 429 from Open-Meteo's free-tier per-IP throttle
(2026-04-16). The seeder retried with 2/4/8/16s backoff but exhausted
without ever falling back to the project's Decodo proxy infrastructure
that other rate-limited sources (FRED, IMF) already use.

Open-Meteo throttles by source IP. Railway containers share IP pools and
get 429 storms whenever zone-normals fires (monthly cron — high churn
when it runs). Result: PR #3097's bake clock for climate:zone-normals:v1
couldn't start, because the seeder couldn't write the contract envelope
even when manually triggered.

Fix: after direct retries exhaust, _open-meteo-archive.mjs falls back to
httpsProxyFetchRaw (Decodo) — same pattern as fredFetchJson and
imfFetchJson in _seed-utils.mjs. Skips silently if no proxy is configured
(preserves existing behavior in non-Railway envs).
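The fallback shape described above can be sketched as follows (hypothetical names; the real code is in _open-meteo-archive.mjs, and the direct retry/backoff loop is elided here for brevity):

```javascript
// If resolveProxy() returns nothing, the fallback is skipped silently and
// the direct failure is rethrown, preserving pre-fix behavior in
// non-Railway environments.
async function fetchWithProxyFallback(url, { directFetch, proxyFetch, resolveProxy }) {
  let directError = null;
  try {
    return await directFetch(url);     // direct retries/backoff elided
  } catch (err) {
    directError = err;
  }
  const proxyUrl = resolveProxy();
  if (!proxyUrl) throw directError;    // no proxy configured: skip silently
  return proxyFetch(url, proxyUrl);    // Decodo fallback, same shape as
}                                      // fredFetchJson / imfFetchJson
```

This keeps the proxy leg entirely optional: environments without Decodo credentials behave exactly as before the fix.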

Added tests/open-meteo-proxy-fallback.test.mjs (4 cases):
- 429 with no proxy → throws after exhausting retries (pre-fix behavior preserved)
- 200 OK → returns parsed batch without touching proxy path
- batch size mismatch → throws even on 200
- Non-retryable 500 → break out, attempt proxy, throw exhausted (no extra
  direct retry — matches new control flow)

Verification: npm run test:data → 5359/5359, +4 new. node --check clean.

The same pattern can be applied to any other helper that fetches
Open-Meteo (grep 'open-meteo' in scripts/) if more 429s show up.

* fix: proxy fallback runs on thrown direct errors + actually-exercised tests

Addresses two PR #3118 review findings.

P1: catch block did 'throw err' on the final direct attempt, silently
bypassing the proxy fallback for thrown-error cases (timeout, ECONNRESET,
DNS failures). Only non-OK HTTP responses reached the proxy path. Fix:
record the error in lastDirectError and 'break' so control falls through
to the proxy fallback regardless of whether the direct path failed via
thrown error or non-OK status.

Also: include lastDirectError context in the final 'retries exhausted'
message + Error.cause so on-call can see what triggered the fallback
attempt (was: opaque 'retries exhausted').
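A control-flow sketch of the P1 fix (hypothetical names, retry backoff elided): before, the catch block did `throw err` on the final direct attempt, so thrown errors (timeouts, ECONNRESET, DNS) never reached the proxy fallback; after, the error is recorded and control breaks out, so both failure modes fall through.

```javascript
async function fetchDirectThenProxy(url, { directFetch, proxyFetch, retries = 2 }) {
  let lastDirectError = null;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const res = await directFetch(url);
      if (res.ok) return res.body;
      lastDirectError = new Error(`HTTP ${res.status}`); // non-OK: retry
    } catch (err) {
      lastDirectError = err;            // was: `throw err` on final attempt,
      if (attempt === retries) break;   // which bypassed the proxy fallback
    }
  }
  try {
    return await proxyFetch(url);
  } catch (proxyErr) {
    // exhausted message carries the direct error so on-call sees the trigger
    throw new Error(`retries exhausted (direct: ${lastDirectError.message})`, {
      cause: lastDirectError,
    });
  }
}
```

With retries=2 and a fetcher that always throws, this yields directCalls=2 and proxyCalls=1, the explicit assertion called out in the regression-guard test below.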

P2: tests didn't exercise the actual proxy path. Refactored the helper to
accept _proxyResolver and _proxyFetcher as optional overrides (production
defaults to the real resolveProxy/httpsProxyFetchRaw from _seed-utils.mjs;
tests inject mocks). Added 4 new cases:

- 429 + proxy succeeds → returns proxy data
- thrown fetch error on final retry → proxy fallback runs (P1 regression
  guard with explicit assertion: directCalls=2, proxyCalls=1)
- 429 + proxy ALSO fails → throws exhausted, original HTTP 429 in
  message + cause chain
- Proxy returns wrong batch size → caught + warns + throws exhausted

Verification:
- tests/open-meteo-proxy-fallback.test.mjs: 8/8 pass (4 added)
- npm run test:data: 5363/5363 pass (+4 from prior 5359)
- node --check clean
2026-04-16 08:28:05 +04:00