P2 codex comment on 9a09c4d7: the public `action_timeout` parameter on
tools.act() skipped the same defensive validation that the env-var path
already had. Passing nan made every action time out instantly; inf /
<=0 disabled the guard entirely. Either mode silently defeats the safety
this module exists to provide, especially for callers sourcing timeouts
from runtime config.
Extracted _coerce_valid_action_timeout() (pairs with
_parse_env_action_timeout) and routed the override through it. None /
nan / inf / non-positive all fall back to the env-derived default with
a warning.
New test_act_rejects_invalid_action_timeout_override asserts the
fallback by passing bad values and verifying the fast handler actually
runs to completion — which it wouldn't if a nan → immediate-timeout or
inf → hang regression leaked through.
Two P2 comments from cubic on 9a09c4d7:
1. TimeoutWrappedCDPClient.__init__ trusted its cdp_request_timeout_s arg
blindly. nan / inf / <=0 would either make every CDP call time out
immediately (nan) or disable the guard (inf / <=0) — same defensive
gap we already fixed for the env-var path. Extracted
_coerce_valid_timeout(), which mirrors _parse_env_cdp_timeout's
validation; the constructor now routes through it, so both entry points
are equally safe.
2. test_send_raw_times_out_on_silent_server used an inline copy of the
wrapper logic rather than the real TimeoutWrappedCDPClient.send_raw.
A regression in the production method — e.g. accidentally removing
the asyncio.wait_for — would not fail the test. Rewrote to:
- Construct via __new__ (skip CDPClient.__init__'s WebSocket setup)
- unittest.mock.patch the parent CDPClient.send_raw with a hanging
coroutine
- Call the real TimeoutWrappedCDPClient.send_raw, which does
super().send_raw(...) → our patched stub
- Assert it raises TimeoutError within the cap
Also added test_send_raw_passes_through_when_fast (fast-path regression
guard) and test_constructor_rejects_invalid_timeout (validation for
fix#1). All 14 tests in the timeout suite pass locally.
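A distilled version of the test pattern described above, with stand-in classes (the real `cdp_use.CDPClient` manages a WebSocket; these stubs only reproduce the inheritance shape):

```python
import asyncio
import unittest.mock


class CDPClient:  # stand-in for cdp_use.CDPClient
	async def send_raw(self, *args, **kwargs):
		raise NotImplementedError


class TimeoutWrappedCDPClient(CDPClient):
	cdp_request_timeout_s = 0.05

	async def send_raw(self, *args, **kwargs):
		return await asyncio.wait_for(
			super().send_raw(*args, **kwargs), timeout=self.cdp_request_timeout_s
		)


async def _hang(*args, **kwargs):
	await asyncio.sleep(3600)  # simulate a silent server: never responds


def test_send_raw_times_out_on_silent_server():
	# __new__ skips CDPClient.__init__'s WebSocket setup entirely.
	client = TimeoutWrappedCDPClient.__new__(TimeoutWrappedCDPClient)
	# Patch the PARENT's send_raw so the real subclass method is exercised;
	# a regression that drops asyncio.wait_for would make this test hang/fail.
	with unittest.mock.patch.object(CDPClient, 'send_raw', _hang):
		try:
			asyncio.run(client.send_raw('Page.navigate'))
		except asyncio.TimeoutError:  # builtin TimeoutError on Python 3.11+
			return
	raise AssertionError('expected TimeoutError')
```

Because the patched coroutine lives on the parent class, `super().send_raw(...)` resolves to the stub while the production wrapper logic still runs.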
cdp_use.CDPClient.send_raw awaits a future that only resolves when the
browser sends a response with a matching message id. There is no timeout
on that await. Against the cloud browser service, the failure mode we
observed is: WebSocket stays alive at the TCP/keepalive layer (proxy
keeps pong-ing our pings), but the browser upstream is dead / unhealthy
and never sends any CDP response. send_raw's future never resolves, and
every higher-level timeout in browser-use (session.start's 15s connect
guard, agent.step_timeout, tools.act's action timeout) relies on
eventually getting a response — so they all wait forever too.
Evidence from a 170k-task collector run: 1,090 empty-history traces,
100% hit the 240s outer watchdog, median duration 582s, max 2214s, with
cloud HTTP layer clean throughout (all 200/201). One sample showed
/json/version returning 200 OK and then 5 minutes of total silence on
the WebSocket before forced stop — classic silent-hang.
Fix: add TimeoutWrappedCDPClient, a thin subclass of cdp_use.CDPClient
that wraps send_raw in asyncio.wait_for(timeout=cdp_request_timeout_s).
Any CDP method that doesn't respond within the cap raises plain
TimeoutError, which propagates through existing `except TimeoutError`
handlers in session.py / tools/service.py. Uses the same defensive env
parse pattern as BROWSER_USE_ACTION_TIMEOUT_S — rejects empty /
non-numeric / nan / inf / non-positive values with a warning fallback.
Default is 60s: generous for slow operations like Page.captureScreenshot
or Page.printToPDF on heavy pages, but well below the 180s step timeout
and any typical outer watchdog. Override via BROWSER_USE_CDP_TIMEOUT_S.
Wired into both CDPClient construction sites in session.py (initial
connect + reconnect path). All 17 existing real-browser tests
(test_action_blank_page, test_multi_act_guards) still pass.
The main execution loop already wraps _execute_step with asyncio.wait_for
using settings.step_timeout (default 180s). But _execute_initial_actions,
which runs before the main loop, is unwrapped — if it hangs (e.g. the
first navigate stalls on a silent CDP WebSocket before the per-action
timeout can catch it), the agent blocks indefinitely without ever
entering the main loop. No step gets recorded, history stays empty, and
any outer watchdog eventually kills the run with zero diagnostic data.
Wrap _execute_initial_actions with the same step_timeout. On timeout,
record the failure in state.last_result / consecutive_failures and fall
through to the main execution loop so the agent can still attempt to
recover. InterruptedError (from an interrupting callback) is still
swallowed silently — same contract as before.
Paired with the per-action asyncio.wait_for added in tools/service.py,
this closes the last unprotected path in the pre-main-loop flow.
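The shape of the fix, as a sketch — `step_timeout`, `_execute_initial_actions`, and the state fields mirror the description, but this wrapper function and its failure record are illustrative, not the agent's actual code:

```python
import asyncio


async def run_initial_actions(agent) -> None:
	try:
		await asyncio.wait_for(
			agent._execute_initial_actions(), timeout=agent.settings.step_timeout
		)
	except asyncio.TimeoutError:
		# Record the failure so the run has diagnostic data, then fall through
		# to the main loop so the agent can still attempt to recover.
		agent.state.last_result = [{'error': 'Initial actions timed out'}]
		agent.state.consecutive_failures += 1
	except InterruptedError:
		pass  # same contract as before: interrupting callbacks are swallowed
```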
Two more issues from automated review on #4711:
1. (P2, Codex) float() accepts 'nan' and 'inf' — both parse successfully
and bypass the fallback path. 'nan' makes asyncio.wait_for time out
immediately for every action; 'inf' effectively disables the hang
guard. Extracted the parse into _parse_env_action_timeout() which
rejects non-finite and non-positive values (including 0 and negatives)
with a warning + fallback.
2. (P2, Cubic) The previous reload test left browser_use.tools.service
pinned at _DEFAULT_ACTION_TIMEOUT_S=45.0 (the last monkeypatch value),
which would leak into any later test in the same worker. Added a
_restore_service_module fixture that pops the env var and reloads
cleanly on teardown.
Expanded test coverage to include 'nan', 'NaN', 'inf', '-inf', '0', '-5'
alongside the existing '' / 'abc' cases — all fall back to 180s.
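The gotcha driving fix #1 is that `float('nan')` and `float('inf')` parse without error, so a bare try/except around `float()` is not enough. A hypothetical parse (not the real `_parse_env_action_timeout`) exercised over the expanded test matrix:

```python
import math


def parse_timeout(raw: str, default: float = 180.0) -> float:
	"""float() happily accepts 'nan' and 'inf', so finiteness and positivity
	must be checked explicitly after parsing."""
	try:
		value = float(raw)
	except ValueError:
		return default
	return value if math.isfinite(value) and value > 0 else default
```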
`asyncio.wait_for(stop_recording(), timeout=5.0)` could expire while the
ffmpeg encoder was still flushing, letting the daemon's subsequent
`os._exit(0)` kill the executor thread mid-write and leave exactly the
truncated MP4 this hook was meant to prevent. `stop_recording()` already
offloads the blocking close to an executor, so awaiting it directly is
safe — and if it genuinely hangs, a stuck daemon is a clearer failure
signal than silent video corruption.
Verified end-to-end: start recording → `open` → `close` (no explicit
`record stop`) now produces a decodable MP4 with the captured frames.
Two issues flagged by automated review on #4711:
1. (P1, Codex) The 90s default was *below* the extract action's intentional
120s page_extraction_llm.ainvoke timeout (tools/service.py:1096,1172).
Slow-but-valid extractions against large pages would be truncated into
timeout errors — a regression. Raised default to 180s, which sits above
that 120s inner cap with grace.
2. (P2, Cubic + Codex) float(os.getenv('BROWSER_USE_ACTION_TIMEOUT_S', '90'))
ran at import time. An empty or non-numeric value (common with env
templating) raised ValueError and prevented browser_use.tools.service
from importing at all — turning a config typo into a process-wide
startup failure. Wrapped in try/except with a warning and fallback to
the hardcoded 180s default.
Tests:
- test_default_action_timeout_accommodates_extract_action — pins the
default >= 150s so future edits can't silently regress extract.
- test_malformed_env_timeout_does_not_break_import — reloads the module
with empty / non-numeric env values and asserts it falls back cleanly,
plus verifies a valid numeric env value still takes effect.
Individual CDP calls like Page.navigate() have their own 20s timeouts, but
the surrounding event-bus plumbing (await event, event_result()) does not.
When a cloud browser's CDP WebSocket goes silent mid-session, agent handlers
hang indefinitely — agents never emit a step, any outer watchdog eventually
fires, and the run returns with zero history.
Observed in practice: a 170k-task collector run produced 1,090 empty-history
traces (21% of output). 100% hit the 240s outer watchdog; median 582s, max
2214s. Cloud HTTP layer was clean (all 200/201) — hang was entirely in CDP.
Wrap registry.execute_action in asyncio.wait_for with a configurable
per-action cap (default 90s, BROWSER_USE_ACTION_TIMEOUT_S env var or
tools.act(action_timeout=...)). On timeout, the action returns
ActionResult(error=...) so the agent can record the step and recover.
New tests/ci/test_action_timeout.py covers both hung and fast handlers.
Existing tools.act tests (test_multi_act_guards, test_action_blank_page)
still pass.
- `on_BrowserConnectedEvent` now catches `RuntimeError` from
`start_recording()` so sessions with `record_video_dir` configured but
missing `[video]` extras (or a viewport that can't be sized) keep
starting — prior graceful-degradation behavior is restored.
- Lazy `RecordingWatchdog` in the CLI handler now calls
`attach_to_session()`, so `AgentFocusChangedEvent` / `BrowserStopEvent`
handlers are wired correctly if the session dispatches them.
- Daemon shutdown finalizes any in-progress recording before tearing the
browser down, preventing truncated MP4s on `close`, idle timeout, or
signal-driven exit.
- Added regression test that monkeypatches `start_recording` to raise and
asserts `on_BrowserConnectedEvent` swallows it without breaking startup.
Closes #4533.
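The graceful-degradation shape of the first bullet, as a sketch (the class and handler names mirror the description; the bodies are illustrative):

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class RecordingWatchdogSketch:
	def __init__(self, record_video_dir=None):
		self.record_video_dir = record_video_dir

	def start_recording(self):
		# e.g. missing `[video]` extras, or a viewport that can't be sized
		raise RuntimeError('video recording dependencies unavailable')

	async def on_BrowserConnectedEvent(self, event=None):
		if not self.record_video_dir:
			return
		try:
			self.start_recording()
		except RuntimeError as exc:
			# Degrade gracefully: session startup must not fail just because
			# recording cannot start.
			logger.warning('Video recording unavailable: %s', exc)
```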
- `RecordingWatchdog` gains public `start_recording(path, size?, framerate?)`,
`stop_recording() -> Path`, and `is_recording`; the existing
`BrowserConnectedEvent`/`BrowserStopEvent` path is refactored to use them,
so profile-driven recording behavior is unchanged.
- `browser-use record start <path>` / `record stop` / `record status`
subcommands wired through argparse, daemon dispatch, and the browser
command handler. `record stop` prints the saved file path so it can be
captured programmatically, matching the issue's requested UX. Works with
`--session NAME` via the existing named-daemon infrastructure.
- The CLI's `CLIBrowserSession` intentionally skips watchdogs; the handler
lazily instantiates `RecordingWatchdog` on first `record start` so CLI
recording doesn't pay the watchdog-setup cost for non-recording sessions.
- Output format is `.mp4` (libx264) since that's what the existing
`VideoRecorderService` encodes; optional dependency gate is unchanged
(`pip install "browser-use[video]"`).
- New `tests/ci/test_action_record.py` exercises the full stack against a
real headless browser + `pytest-httpserver`, verifying decodable MP4
output, double-start rejection, stop-without-start no-op, that the
existing `profile.record_video_dir` flow still works, and the argparse /
dispatch wiring.
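A hedged sketch of the argparse wiring for the `record` subcommands; the subcommand shape follows the bullets above, but exact flag names and defaults in the shipped CLI may differ:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
	parser = argparse.ArgumentParser(prog='browser-use')
	sub = parser.add_subparsers(dest='command')
	record = sub.add_parser('record', help='control session video recording')
	record_sub = record.add_subparsers(dest='record_command', required=True)

	start = record_sub.add_parser('start')
	start.add_argument('path', help='output .mp4 path')
	start.add_argument('--session', default=None)

	stop = record_sub.add_parser('stop')  # prints the saved file path
	stop.add_argument('--session', default=None)

	status = record_sub.add_parser('status')
	status.add_argument('--session', default=None)
	return parser
```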
On Windows with a non-UTF-8 default locale (e.g. Chinese GBK/CP936),
open() without an explicit encoding uses the system code page. Chrome's
Local State file is always UTF-8, so profile names containing non-ASCII
characters (e.g. Chinese '用户1') are decoded as mojibake.
Fixes #4673
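The fix boils down to pinning the encoding on `open()`; a sketch (`read_local_state` is a hypothetical helper, not the project's function name):

```python
from pathlib import Path


def read_local_state(profile_dir: Path) -> str:
	# Chrome writes Local State as UTF-8 regardless of the OS locale, so the
	# encoding must be pinned; a bare open() would use the ANSI code page on
	# Windows (e.g. CP936) and mojibake non-ASCII profile names like '用户1'.
	with open(profile_dir / 'Local State', encoding='utf-8') as f:
		return f.read()
```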
When `browser-use connect` fails to discover a running Chrome, the error
now points to the correct `chrome://inspect/#remote-debugging` URL. The
SKILL.md also guides agents to prompt users with two options: enable
remote debugging or use managed Chromium with a Chrome profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add UTM params to all cloud-bound links across README, CLI, and error messages
- Rewrite README Open Source vs Cloud section: position cloud browsers as
recommended pairing for OSS users, remove separate Use Both section
- Rewrite error messages for use_cloud=True and ChatBrowserUse() to clearly
state what is wrong and what to do next
- Add missing URLs: invalid API key now links to key page, insufficient
credits now links to billing page
- Add cloud browser nudge on captcha detection (logger.warning)
- Add cloud browser nudge on local browser launch failure
- Use os.open() with mode 0o600 instead of write-then-chmod to eliminate
the permission race window where the temp file is briefly world-readable.
- Raise instead of warn when token file write fails: a daemon that cannot
persist its auth token is permanently unauthorized for all clients, so
failing fast is correct (identified by cubic).
Generate a secrets.token_hex(32) on daemon startup, write it atomically
to ~/.browser-use/{session}.token (chmod 0o600), and validate it on every
incoming request via hmac.compare_digest. The client reads the token file
and includes it in each send_command() call.
This closes the arbitrary-code-execution vector where any local process
could connect to the deterministic Windows TCP port (or a world-readable
Unix socket) and dispatch the 'python' action to run eval()/exec() as the
daemon owner.
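A sketch of the generate/persist/validate flow under the assumptions above (token filename and helper names are illustrative):

```python
import hmac
import os
import secrets
from pathlib import Path


def write_token(token_path: Path) -> str:
	"""Generate the daemon token and persist it readable only by the owner.
	os.open with mode=0o600 avoids the write-then-chmod race where the file
	is briefly world-readable."""
	token = secrets.token_hex(32)
	fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
	with os.fdopen(fd, 'w') as f:
		f.write(token)
	return token


def validate(expected: str, presented: str) -> bool:
	# Constant-time comparison guards against timing side channels.
	return hmac.compare_digest(expected, presented)
```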
Bumps aiohttp from 3.13.3 to 3.13.4 in requirements-cli.txt.
Fixes uncapped memory usage from insufficient trailer header restrictions
(aio-libs/aiohttp@0c2e9da).
- cloud.py: remove BROWSER_USE_API_KEY env var fallback (violates CLI
policy of config.json as single source of truth); instead detect the
env var in the error path and print a targeted migration hint
- setup.py: replace Path.rename() with shutil.move() so the temp file
can be moved across filesystems (e.g. /tmp -> /usr/local/bin)
A SIGKILL mid-write truncates config.json; read_config() catches
json.JSONDecodeError and returns {}, silently wiping the API key and
all other settings. Mirror the pattern already used by _write_state():
write to a sibling temp file, fsync, chmod 600, then os.replace() into
place — which is atomic on POSIX and effectively atomic on Windows.
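The pattern described above, sketched as a standalone helper (the function name is illustrative; `_write_state`'s actual body may differ in detail):

```python
import json
import os
from pathlib import Path


def write_config_atomic(path: Path, config: dict) -> None:
	"""Crash-safe write: sibling temp file, fsync, 0o600 perms, then
	os.replace — atomic on POSIX, effectively atomic on Windows."""
	tmp = path.with_suffix('.tmp')
	fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
	with os.fdopen(fd, 'w') as f:
		json.dump(config, f)
		f.flush()
		os.fsync(f.fileno())  # data is on disk before the rename
	# Readers see either the old file or the new one, never a torn write.
	os.replace(tmp, path)
```

A SIGKILL at any point leaves either the previous config.json intact or the fully-written replacement, so read_config() never sees truncated JSON.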
Downloads to a temp file, fetches the .sha256sum file Cloudflare publishes
alongside each release, and verifies before moving to the install destination.
Protects against MITM/CDN tampering. Temp file is cleaned up on failure.
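A sketch of the verify-then-move step (helper name hypothetical; fetching the `.sha256sum` file is assumed to have happened already):

```python
import hashlib
import shutil
from pathlib import Path


def install_verified(download_tmp: Path, expected_sha256: str, dest: Path) -> None:
	"""Verify the downloaded binary against the published checksum before
	moving it into place; clean up the temp file on mismatch."""
	digest = hashlib.sha256(download_tmp.read_bytes()).hexdigest()
	if digest != expected_sha256:
		download_tmp.unlink(missing_ok=True)
		raise ValueError(f'checksum mismatch: {digest} != {expected_sha256}')
	# shutil.move works across filesystems (e.g. /tmp -> /usr/local/bin),
	# unlike Path.rename.
	shutil.move(str(download_tmp), dest)
```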
The CLI previously accepted the env var as a fallback; this PR dropped it
without a migration path, breaking CI/CD pipelines that set it as a secret.
Restore backwards-compat by checking the env var after config.json and
printing a deprecation warning with the migration command.
- Remove BROWSER_USE_API_KEY env var as a read source from CLI code; config.json is the only source of truth
- Split _create_cloud_profile into daemon-safe _inner (raises) and CLI wrapper (sys.exit)
- Daemon auto-heal no longer kills process on profile creation API errors
_get_or_create_cloud_profile reads config instantly instead of
validating via GET /profiles/{id} on every connect. If the profile
is invalid, _provision_cloud_browser auto-heals by creating a new
one and retrying. Saves ~500ms-1s on every cloud connect.
Library keeps recording off by default. CLI reads cloud_connect_recording
from config (defaults True). Users can disable with:
browser-use config set cloud_connect_recording false
Page.enable fails on browser-level CDP targets. Wrap in try/except
like the library's PopupsWatchdog does. Dialog handler still
registers regardless — events may fire on some CDP implementations.
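A sketch of tolerating browser-level targets that reject Page.enable, in the spirit of the PopupsWatchdog pattern (`cdp_session` and the helper name are hypothetical):

```python
import asyncio


async def enable_page_domain(cdp_session) -> bool:
	try:
		await cdp_session.send('Page.enable')
		return True
	except Exception:
		# Browser-level CDP targets have no page domain. The dialog handler
		# is registered regardless, since events may still fire on some
		# CDP implementations.
		return False
```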