Two P2 comments from cubic on 9a09c4d7:
1. TimeoutWrappedCDPClient.__init__ trusted its cdp_request_timeout_s arg
blindly. nan / inf / <=0 would either make every CDP call time out
immediately (nan) or disable the guard (inf / <=0) — same defensive
gap we already fixed for the env-var path. Extracted
_coerce_valid_timeout(), which mirrors _parse_env_cdp_timeout's
validation; the constructor now routes through it, so both entry points
are equally safe.
2. test_send_raw_times_out_on_silent_server used an inline copy of the
wrapper logic rather than the real TimeoutWrappedCDPClient.send_raw.
A regression in the production method — e.g. accidentally removing
the asyncio.wait_for — would not fail the test. Rewrote to:
- Construct via __new__ (skips CDPClient.__init__'s WebSocket setup)
- unittest.mock.patch the parent CDPClient.send_raw with a hanging
  coroutine
- Call the real TimeoutWrappedCDPClient.send_raw, which does
  super().send_raw(...) → our patched stub
- Assert it raises TimeoutError within the cap
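The steps above can be sketched roughly as follows. This is an illustration, not the actual test file: `FakeCDPClient` is a hypothetical stand-in for cdp_use.CDPClient, the wrapper is a simplified copy of the production class, and the attribute name `_cdp_request_timeout_s` is an assumption.

```python
import asyncio
import time
from unittest.mock import patch


class FakeCDPClient:
    """Hypothetical stand-in for cdp_use.CDPClient (not the real class)."""

    def __init__(self, url):
        # The real constructor prepares WebSocket state; the test bypasses it.
        raise RuntimeError("tests should bypass __init__ via __new__")

    async def send_raw(self, method, params=None):
        return {"method": method}


class TimeoutWrappedCDPClient(FakeCDPClient):
    """Simplified stand-in for the production wrapper."""

    async def send_raw(self, method, params=None):
        return await asyncio.wait_for(
            super().send_raw(method, params),
            timeout=self._cdp_request_timeout_s,  # assumed attribute name
        )


async def _hang_forever(self, method, params=None):
    # Mimics a browser that never sends a CDP response.
    await asyncio.sleep(3600)


async def check_times_out_on_silent_server():
    # Skip __init__ so no WebSocket is needed, then set the cap by hand.
    client = TimeoutWrappedCDPClient.__new__(TimeoutWrappedCDPClient)
    client._cdp_request_timeout_s = 0.05
    # Patch the parent method only; the subclass method under test stays real.
    with patch.object(FakeCDPClient, "send_raw", _hang_forever):
        start = time.monotonic()
        try:
            await client.send_raw("Target.getTargets")
        except asyncio.TimeoutError:  # same class as TimeoutError on 3.11+
            return time.monotonic() - start
    raise AssertionError("send_raw returned instead of timing out")
```

Because only the parent method is patched, removing the asyncio.wait_for from the subclass would make this check hang or fail rather than pass silently.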
Also added test_send_raw_passes_through_when_fast (fast-path regression
guard) and test_constructor_rejects_invalid_timeout (validation for
fix #1). All 14 tests in the timeout suite pass locally.

cdp_use.CDPClient.send_raw awaits a future that only resolves when the
browser sends a response with a matching message id. There is no timeout
on that await. Against the cloud browser service, the failure mode we
observed is: WebSocket stays alive at the TCP/keepalive layer (proxy
keeps pong-ing our pings), but the browser upstream is dead / unhealthy
and never sends any CDP response. send_raw's future never resolves, and
every higher-level timeout in browser-use (session.start's 15s connect
guard, agent.step_timeout, tools.act's action timeout) relies on
eventually getting a response — so they all wait forever too.
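The hang can be modeled in miniature. The sketch below is not cdp_use code: `MiniClient` and its `_pending` map are hypothetical and reproduce only the pattern described above, where a request registers a future under its message id and awaits it without a timeout.

```python
import asyncio


class MiniClient:
    """Hypothetical model of a request/response client keyed by message id."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}  # message id -> future awaiting the response

    async def send_raw(self, method):
        self._next_id += 1
        fut = asyncio.get_running_loop().create_future()
        self._pending[self._next_id] = fut
        # A real client would write the request to the WebSocket here.
        return await fut  # no timeout: waits until a matching response arrives

    def on_message(self, msg_id, result):
        # Called by the receive loop when a response arrives.
        fut = self._pending.pop(msg_id, None)
        if fut is not None and not fut.done():
            fut.set_result(result)


async def demo():
    client = MiniClient()

    # Healthy browser: a response with the matching id resolves the future.
    task = asyncio.create_task(client.send_raw("Browser.getVersion"))
    await asyncio.sleep(0)  # let send_raw register its future
    client.on_message(1, {"product": "demo"})
    answered = await task

    # Silent browser: nothing ever calls on_message, so only an explicit
    # timeout around the await ends the wait.
    try:
        await asyncio.wait_for(client.send_raw("Page.enable"), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return answered, timed_out
```

In the second call the connection itself is never reported as broken, which matches the observed case where pings keep succeeding while no CDP response arrives.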

Evidence from a 170k-task collector run: 1,090 empty-history traces,
all of which hit the 240s outer watchdog (median duration 582s, max
2214s), while the cloud HTTP layer stayed clean throughout (all
200/201). One sample showed /json/version returning 200 OK, followed by
5 minutes of total silence on the WebSocket before the forced stop: a
classic silent hang.

Fix: add TimeoutWrappedCDPClient, a thin subclass of cdp_use.CDPClient
that wraps send_raw in asyncio.wait_for(timeout=cdp_request_timeout_s).
Any CDP method that doesn't respond within the cap raises plain
TimeoutError, which propagates through existing `except TimeoutError`
handlers in session.py / tools/service.py. The env override is parsed
with the same defensive pattern as BROWSER_USE_ACTION_TIMEOUT_S: empty,
non-numeric, nan, inf, and non-positive values are rejected with a
warning and fall back to the default.
Default is 60s: generous for slow operations like Page.captureScreenshot
or Page.printToPDF on heavy pages, but well below the 180s step timeout
and any typical outer watchdog. Override via BROWSER_USE_CDP_TIMEOUT_S.

Wired into both CDPClient construction sites in session.py (initial
connect + reconnect path). All 17 existing real-browser tests
(test_action_blank_page, test_multi_act_guards) still pass.