4 Commits

Author SHA1 Message Date
Saurav Panda
32416bb48c address codex review: validate tools.act(action_timeout=...) override
P2 codex comment on 9a09c4d7: the public `action_timeout` parameter on
tools.act() skipped the same defensive validation that the env-var path
already had. Passing nan made every action time out instantly; inf /
<=0 disabled the guard entirely. Either mode silently defeats the safety
this module exists to provide, especially for callers sourcing timeouts
from runtime config.

Extracted _coerce_valid_action_timeout() (pairs with
_parse_env_action_timeout) and routed the override through it. None /
nan / inf / non-positive all fall back to the env-derived default with
a warning.

New test_act_rejects_invalid_action_timeout_override asserts the
fallback by passing bad values and verifying that the fast handler
actually executes to completion (which wouldn't happen if nan leaked
through as an immediate timeout, or if inf leaked through as a hang).
2026-04-20 17:55:59 -07:00
Saurav Panda
d2985dcab9 review: reject non-finite timeouts + restore module after reload tests
Two more issues from automated review on #4711:

1. (P2, Codex) float() accepts 'nan' and 'inf' — both parse successfully
   and bypass the fallback path. 'nan' makes asyncio.wait_for time out
   immediately for every action; 'inf' effectively disables the hang
   guard. Extracted the parse into _parse_env_action_timeout() which
   rejects non-finite and non-positive values (including 0 and negatives)
   with a warning + fallback.

2. (P2, Cubic) The previous reload test left browser_use.tools.service
   pinned at _DEFAULT_ACTION_TIMEOUT_S=45.0 (the last monkeypatch value),
   which would leak into any later test in the same worker. Added a
   _restore_service_module fixture that pops the env var and reloads
   cleanly on teardown.

Expanded test coverage to include 'nan', 'NaN', 'inf', '-inf', '0', '-5'
alongside the existing '' / 'abc' cases — all fall back to 180s.
2026-04-20 15:45:36 -07:00
Saurav Panda
1488a39b7f address PR review: raise default cap + tolerate bad env values
Two issues flagged by automated review on #4711:

1. (P1, Codex) The 90s default was *below* the extract action's intentional
   120s page_extraction_llm.ainvoke timeout (tools/service.py:1096,1172).
   Slow-but-valid extractions against large pages would be truncated into
   timeout errors — a regression. Raised default to 180s, which sits above
   that 120s inner cap with grace.

2. (P2, Cubic + Codex) float(os.getenv('BROWSER_USE_ACTION_TIMEOUT_S', '90'))
   ran at import time. An empty or non-numeric value (common with env
   templating) raised ValueError and prevented browser_use.tools.service
   from importing at all — turning a config typo into a process-wide
   startup failure. Wrapped in try/except with a warning and fallback to
   the hardcoded 180s default.
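
Why the outer cap has to sit above the inner one can be seen in a small sketch. Both functions are hypothetical stand-ins, and the durations are shrunk so the example runs quickly; only their ordering matters (work < inner cap, with the outer cap either below or above the work time).

```python
import asyncio


async def slow_but_valid_extract(work_s, inner_cap_s):
    # Stand-in for the extract action: its own inner cap wraps the slow call.
    await asyncio.wait_for(asyncio.sleep(work_s), timeout=inner_cap_s)
    return "extracted"


async def run_action(outer_cap_s, work_s, inner_cap_s):
    # Stand-in for the per-action guard wrapped around every action.
    try:
        return await asyncio.wait_for(
            slow_but_valid_extract(work_s, inner_cap_s), timeout=outer_cap_s
        )
    except asyncio.TimeoutError:
        return "timeout"


# Outer cap below the work time: a valid extraction is cut off (the regression).
too_low = asyncio.run(run_action(outer_cap_s=0.01, work_s=0.05, inner_cap_s=0.5))
# Outer cap above the inner cap: the same extraction completes.
high_enough = asyncio.run(run_action(outer_cap_s=1.0, work_s=0.05, inner_cap_s=0.5))
```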

Tests:
- test_default_action_timeout_accommodates_extract_action — pins the
  default >= 150s so future edits can't silently regress extract.
- test_malformed_env_timeout_does_not_break_import — reloads the module
  with empty / non-numeric env values and asserts it falls back cleanly,
  plus verifies a valid numeric env value still takes effect.
2026-04-20 15:36:15 -07:00
Saurav Panda
ce81ada89a fix(tools): enforce per-action timeout to prevent hung event handlers
Individual CDP calls like Page.navigate() have their own 20s timeouts, but
the surrounding event-bus plumbing (await event, event_result()) does not.
When a cloud browser's CDP WebSocket goes silent mid-session, agent handlers
hang indefinitely — agents never emit a step, any outer watchdog eventually
fires, and the run returns with zero history.

Observed in practice: a 170k-task collector run produced 1,090 empty-history
traces (21% of output). 100% hit the 240s outer watchdog; median 582s, max
2214s. Cloud HTTP layer was clean (all 200/201) — hang was entirely in CDP.

Wrap registry.execute_action in asyncio.wait_for with a configurable per-
action cap (default 90s, BROWSER_USE_ACTION_TIMEOUT_S env var or
tools.act(action_timeout=...)). On timeout, the action returns
ActionResult(error=...) so the agent can record the step and recover.
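
The wrapping pattern described above, as a self-contained sketch. The ActionResult dataclass is a simplified stand-in for the library's type, and run_with_action_timeout plus the two handlers are hypothetical names; the real change wraps registry.execute_action, which is not reproduced here.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionResult:
    # Simplified stand-in; only the fields this sketch needs.
    extracted_content: Optional[str] = None
    error: Optional[str] = None


async def run_with_action_timeout(handler, action_name, timeout_s):
    """Await handler() under a per-action cap; turn a timeout into an error result."""
    try:
        return await asyncio.wait_for(handler(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Return an error instead of raising so the caller can record the step.
        return ActionResult(error=f"Action {action_name!r} timed out after {timeout_s}s")


async def _hung_handler():
    await asyncio.sleep(3600)  # simulates a handler waiting on a silent CDP socket
    return ActionResult(extracted_content="never reached")


async def _fast_handler():
    return ActionResult(extracted_content="done")


hung = asyncio.run(run_with_action_timeout(_hung_handler, "navigate", 0.05))
fast = asyncio.run(run_with_action_timeout(_fast_handler, "click", 5.0))
```

Here the hung handler comes back as an ActionResult with error set, while the fast handler's result passes through unchanged.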

New tests/ci/test_action_timeout.py covers both hung and fast handlers.
Existing tools.act tests (test_multi_act_guards, test_action_blank_page)
still pass.
2026-04-20 15:22:30 -07:00