P2 codex comment on 9a09c4d7: the public `action_timeout` parameter on
tools.act() skipped the same defensive validation that the env-var path
already had. Passing nan made every action time out instantly; inf /
<=0 disabled the guard entirely. Either mode silently defeats the safety
this module exists to provide, especially for callers sourcing timeouts
from runtime config.
Extracted _coerce_valid_action_timeout() (pairs with _parse_env_action_
timeout) and routed the override through it. None / nan / inf /
non-positive all fall back to the env-derived default with a warning.
New test_act_rejects_invalid_action_timeout_override asserts the
fallback by passing bad values and verifying the fast handler actually
executes to completion (which wouldn't happen if nan → immediate
timeout or if inf → hang would leak through).
Two more issues from automated review on #4711:
1. (P2, Codex) float() accepts 'nan' and 'inf' — both parse successfully
and bypass the fallback path. 'nan' makes asyncio.wait_for time out
immediately for every action; 'inf' effectively disables the hang
guard. Extracted the parse into _parse_env_action_timeout() which
rejects non-finite and non-positive values (including 0 and negatives)
with a warning + fallback.
2. (P2, Cubic) The previous reload test left browser_use.tools.service
pinned at _DEFAULT_ACTION_TIMEOUT_S=45.0 (the last monkeypatch value),
which would leak into any later test in the same worker. Added a
_restore_service_module fixture that pops the env var and reloads
cleanly on teardown.
Expanded test coverage to include 'nan', 'NaN', 'inf', '-inf', '0', '-5'
alongside the existing '' / 'abc' cases — all fall back to 180s.
Two issues flagged by automated review on #4711:
1. (P1, Codex) The 90s default was *below* the extract action's intentional
120s page_extraction_llm.ainvoke timeout (tools/service.py:1096,1172).
Slow-but-valid extractions against large pages would be truncated into
timeout errors — a regression. Raised default to 180s, which sits above
that 120s inner cap with grace.
2. (P2, Cubic + Codex) float(os.getenv('BROWSER_USE_ACTION_TIMEOUT_S', '90'))
ran at import time. An empty or non-numeric value (common with env
templating) raised ValueError and prevented browser_use.tools.service
from importing at all — turning a config typo into a process-wide
startup failure. Wrapped in try/except with a warning and fallback to
the hardcoded 180s default.
Tests:
- test_default_action_timeout_accommodates_extract_action — pins the
default >= 150s so future edits can't silently regress extract.
- test_malformed_env_timeout_does_not_break_import — reloads the module
with empty / non-numeric env values and asserts it falls back cleanly,
plus verifies a valid numeric env value still takes effect.
Individual CDP calls like Page.navigate() have their own 20s timeouts, but
the surrounding event-bus plumbing (await event, event_result()) does not.
When a cloud browser's CDP WebSocket goes silent mid-session, agent handlers
hang indefinitely — agents never emit a step, any outer watchdog eventually
fires, and the run returns with zero history.
Observed in practice: a 170k-task collector run produced 1,090 empty-history
traces (21% of output). 100% hit the 240s outer watchdog; median 582s, max
2214s. Cloud HTTP layer was clean (all 200/201) — hang was entirely in CDP.
Wrap registry.execute_action in asyncio.wait_for with a configurable per-
action cap (default 90s, BROWSER_USE_ACTION_TIMEOUT_S env var or
tools.act(action_timeout=...)). On timeout, the action returns
ActionResult(error=...) so the agent can record the step and recover.
New tests/ci/test_action_timeout.py covers both hung and fast handlers.
Existing tools.act tests (test_multi_act_guards, test_action_blank_page)
still pass.