browser-use/tests/ci/test_cdp_timeout.py
Saurav Panda 9a09c4d7dc fix(cdp): timeout-wrap CDPClient.send_raw to break silent WebSocket hangs
cdp_use.CDPClient.send_raw awaits a future that only resolves when the
browser sends a response with a matching message id. There is no timeout
on that await. Against the cloud browser service, the failure mode we
observed is: WebSocket stays alive at the TCP/keepalive layer (proxy
keeps pong-ing our pings), but the browser upstream is dead / unhealthy
and never sends any CDP response. send_raw's future never resolves, and
every higher-level timeout in browser-use (session.start's 15s connect
guard, agent.step_timeout, tools.act's action timeout) relies on
eventually getting a response — so they all wait forever too.

Evidence from a 170k-task collector run: 1,090 empty-history traces,
100% hit the 240s outer watchdog, median duration 582s, max 2214s, with
cloud HTTP layer clean throughout (all 200/201). One sample showed
/json/version returning 200 OK and then 5 minutes of total silence on
the WebSocket before forced stop — classic silent-hang.

Fix: add TimeoutWrappedCDPClient, a thin subclass of cdp_use.CDPClient
that wraps send_raw in asyncio.wait_for(timeout=cdp_request_timeout_s).
Any CDP method that doesn't respond within the cap raises plain
TimeoutError, which propagates through existing `except TimeoutError`
handlers in session.py / tools/service.py. Uses the same defensive env
parse pattern as BROWSER_USE_ACTION_TIMEOUT_S — rejects empty /
non-numeric / nan / inf / non-positive values with a warning fallback.
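
The wrapping pattern described above can be sketched without a real WebSocket. This is a minimal illustration, not the actual TimeoutWrappedCDPClient: `SilentClient` and `TimeoutWrappedClient` are hypothetical stand-ins, and the only assumption taken from the source is that send_raw accepts a method, params, and session_id.

```python
import asyncio


class SilentClient:
    """Hypothetical stand-in for a CDP client whose server never replies."""

    async def send_raw(self, method, params=None, session_id=None):
        # The event is never set, so this await never completes on its own.
        await asyncio.Event().wait()


class TimeoutWrappedClient(SilentClient):
    """Caps every send_raw call with asyncio.wait_for (illustrative only)."""

    def __init__(self, timeout_s: float = 60.0) -> None:
        self.timeout_s = timeout_s

    async def send_raw(self, method, params=None, session_id=None):
        try:
            return await asyncio.wait_for(
                super().send_raw(method, params, session_id),
                timeout=self.timeout_s,
            )
        except asyncio.TimeoutError as e:
            # Re-raise as the builtin TimeoutError with the method name attached.
            raise TimeoutError(
                f'CDP method {method!r} did not respond within {self.timeout_s:.0f}s.'
            ) from e


async def demo() -> str:
    client = TimeoutWrappedClient(timeout_s=0.1)
    try:
        await client.send_raw('Target.getTargets')
    except TimeoutError as e:
        return str(e)
    return 'no timeout'


if __name__ == '__main__':
    print(asyncio.run(demo()))
```

Catching asyncio.TimeoutError and re-raising the builtin TimeoutError keeps the sketch working on Python versions where the two are distinct classes; from Python 3.11 onward they are the same class.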

Default is 60s: generous for slow operations like Page.captureScreenshot
or Page.printToPDF on heavy pages, but well below the 180s step timeout
and any typical outer watchdog. Override via BROWSER_USE_CDP_TIMEOUT_S.
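
A defensive parse along the lines described above might look like the following. This is a sketch under assumptions, not the repository's _parse_env_cdp_timeout: the function name `parse_timeout` and the logger are hypothetical, and only the 60s default, the env var name, and the rejected value classes come from the commit message.

```python
import logging
import math
import os
from typing import Optional

logger = logging.getLogger(__name__)

DEFAULT_TIMEOUT_S = 60.0  # default stated in the commit message


def parse_timeout(raw: Optional[str], default: float = DEFAULT_TIMEOUT_S) -> float:
    """Return a finite positive timeout, falling back to `default` otherwise."""
    if raw is None:
        # Env var not set: use the default silently.
        return default
    try:
        value = float(raw)
    except ValueError:
        # Covers empty strings and non-numeric input such as 'abc'.
        logger.warning('Invalid CDP timeout %r; using %.0fs', raw, default)
        return default
    if not math.isfinite(value) or value <= 0:
        # float() accepts 'nan' and 'inf', so reject them explicitly.
        logger.warning('Invalid CDP timeout %r; using %.0fs', raw, default)
        return default
    return value


if __name__ == '__main__':
    print(parse_timeout(os.environ.get('BROWSER_USE_CDP_TIMEOUT_S')))
```

The explicit math.isfinite check matters because float('nan') and float('inf') parse without raising, and a NaN or infinite timeout would effectively disable the cap.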

Wired into both CDPClient construction sites in session.py (initial
connect + reconnect path). All 17 existing real-browser tests
(test_action_blank_page, test_multi_act_guards) still pass.
2026-04-20 17:40:32 -07:00

85 lines
3.0 KiB
Python
"""Regression tests for TimeoutWrappedCDPClient.

cdp_use.CDPClient.send_raw awaits a future that only resolves when the browser
sends a matching response. When the server goes silent (observed against cloud
browsers whose WebSocket stays connected at TCP/keepalive layer but never
replies), send_raw hangs forever. The wrapper turns that hang into a fast
TimeoutError.
"""

from __future__ import annotations

import asyncio
import time

import pytest

from browser_use.browser._cdp_timeout import (
    DEFAULT_CDP_REQUEST_TIMEOUT_S,
    TimeoutWrappedCDPClient,
    _parse_env_cdp_timeout,
)


class _HangingClient(TimeoutWrappedCDPClient):
    """Wrapper whose parent send_raw never returns; simulates a silent server."""

    def __init__(self, cdp_request_timeout_s: float) -> None:
        # Skip real CDPClient.__init__: we only exercise the timeout wrapper.
        self._cdp_request_timeout_s = cdp_request_timeout_s
        self.call_count = 0

    async def _parent_send_raw(self, *_args, **_kwargs):
        self.call_count += 1
        await asyncio.sleep(30)  # Way longer than any test timeout.
        return {}

    async def send_raw(self, method, params=None, session_id=None):
        # Inline version of TimeoutWrappedCDPClient.send_raw using our _parent_send_raw
        # (avoids needing a real WebSocket).
        try:
            return await asyncio.wait_for(
                self._parent_send_raw(method=method, params=params, session_id=session_id),
                timeout=self._cdp_request_timeout_s,
            )
        except TimeoutError as e:
            raise TimeoutError(f'CDP method {method!r} did not respond within {self._cdp_request_timeout_s:.0f}s.') from e


@pytest.mark.asyncio
async def test_send_raw_times_out_on_silent_server():
    """A CDP method that gets no response must raise TimeoutError within the cap."""
    client = _HangingClient(cdp_request_timeout_s=0.5)
    start = time.monotonic()
    with pytest.raises(TimeoutError) as exc:
        await client.send_raw('Target.getTargets')
    elapsed = time.monotonic() - start
    assert client.call_count == 1
    # Returned within the cap (plus a small scheduling margin), not after the
    # full 30s sleep.
    assert elapsed < 2.0, f'wrapper did not enforce timeout; took {elapsed:.2f}s'
    assert 'Target.getTargets' in str(exc.value)
    assert '0s' in str(exc.value) or '1s' in str(exc.value)


def test_default_cdp_timeout_is_reasonable():
    """Default must give headroom above typical slow CDP calls but stay below
    the 180s agent step_timeout so hangs surface before step-level kills."""
    assert 10.0 <= DEFAULT_CDP_REQUEST_TIMEOUT_S <= 120.0, (
        f'Default CDP timeout ({DEFAULT_CDP_REQUEST_TIMEOUT_S}s) is outside the sensible 10-120s range'
    )


def test_parse_env_rejects_malformed_values():
    """Mirrors the defensive parse used for BROWSER_USE_ACTION_TIMEOUT_S."""
    for bad in ('', 'nan', 'NaN', 'inf', '-inf', '0', '-5', 'abc'):
        assert _parse_env_cdp_timeout(bad) == 60.0, f'Expected fallback for {bad!r}'
    # Finite positive values take effect.
    assert _parse_env_cdp_timeout('30') == 30.0
    assert _parse_env_cdp_timeout('15.5') == 15.5
    # None (env var not set) also falls back.
    assert _parse_env_cdp_timeout(None) == 60.0