P2 codex comment on 9a09c4d7: the public `action_timeout` parameter on
tools.act() skipped the same defensive validation that the env-var path
already had. Passing nan made every action time out instantly; inf /
<=0 disabled the guard entirely. Either mode silently defeats the safety
this module exists to provide, especially for callers sourcing timeouts
from runtime config.
Extracted _coerce_valid_action_timeout() (pairs with
_parse_env_action_timeout) and routed the override through it. None /
nan / inf / non-positive all fall back to the env-derived default with
a warning.
New test_act_rejects_invalid_action_timeout_override asserts the
fallback by passing bad values and verifying the fast handler actually
runs to completion — which it wouldn't if a nan → immediate-timeout or
inf → hang regression leaked through.
Two P2 comments from cubic on 9a09c4d7:
1. TimeoutWrappedCDPClient.__init__ trusted its cdp_request_timeout_s arg
blindly. nan / inf / <=0 would either make every CDP call time out
immediately (nan) or disable the guard (inf / <=0) — same defensive
gap we already fixed for the env-var path. Extracted
_coerce_valid_timeout(), which mirrors _parse_env_cdp_timeout's
validation; the constructor now routes through it, so both entry points
are equally safe.
2. test_send_raw_times_out_on_silent_server used an inline copy of the
wrapper logic rather than the real TimeoutWrappedCDPClient.send_raw.
A regression in the production method — e.g. accidentally removing
the asyncio.wait_for — would not fail the test. Rewrote to:
- Construct via __new__ (skip CDPClient.__init__'s WebSocket setup)
- unittest.mock.patch the parent CDPClient.send_raw with a hanging
coroutine
- Call the real TimeoutWrappedCDPClient.send_raw, which does
super().send_raw(...) → our patched stub
- Assert it raises TimeoutError within the cap
Also added test_send_raw_passes_through_when_fast (fast-path regression
guard) and test_constructor_rejects_invalid_timeout (validation for
fix#1). All 14 tests in the timeout suite pass locally.
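A distilled version of the test pattern described above, with stand-in classes (the real `cdp_use.CDPClient` manages a WebSocket; these stubs only reproduce the inheritance shape):

```python
import asyncio
import unittest.mock


class CDPClient:  # stand-in for cdp_use.CDPClient
	async def send_raw(self, *args, **kwargs):
		raise NotImplementedError


class TimeoutWrappedCDPClient(CDPClient):
	cdp_request_timeout_s = 0.05

	async def send_raw(self, *args, **kwargs):
		return await asyncio.wait_for(
			super().send_raw(*args, **kwargs), timeout=self.cdp_request_timeout_s
		)


async def _hang(*args, **kwargs):
	await asyncio.sleep(3600)  # simulate a silent server: never responds


def test_send_raw_times_out_on_silent_server():
	# __new__ skips CDPClient.__init__'s WebSocket setup entirely.
	client = TimeoutWrappedCDPClient.__new__(TimeoutWrappedCDPClient)
	# Patch the PARENT's send_raw so the real subclass method is exercised;
	# a regression that drops asyncio.wait_for would make this test hang/fail.
	with unittest.mock.patch.object(CDPClient, 'send_raw', _hang):
		try:
			asyncio.run(client.send_raw('Page.navigate'))
		except asyncio.TimeoutError:  # builtin TimeoutError on Python 3.11+
			return
	raise AssertionError('expected TimeoutError')
```

Because the patched coroutine lives on the parent class, `super().send_raw(...)` resolves to the stub while the production wrapper logic still runs.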
cdp_use.CDPClient.send_raw awaits a future that only resolves when the
browser sends a response with a matching message id. There is no timeout
on that await. Against the cloud browser service, the failure mode we
observed is: WebSocket stays alive at the TCP/keepalive layer (proxy
keeps pong-ing our pings), but the browser upstream is dead / unhealthy
and never sends any CDP response. send_raw's future never resolves, and
every higher-level timeout in browser-use (session.start's 15s connect
guard, agent.step_timeout, tools.act's action timeout) relies on
eventually getting a response — so they all wait forever too.
Evidence from a 170k-task collector run: 1,090 empty-history traces,
100% hit the 240s outer watchdog, median duration 582s, max 2214s, with
cloud HTTP layer clean throughout (all 200/201). One sample showed
/json/version returning 200 OK and then 5 minutes of total silence on
the WebSocket before forced stop — classic silent-hang.
Fix: add TimeoutWrappedCDPClient, a thin subclass of cdp_use.CDPClient
that wraps send_raw in asyncio.wait_for(timeout=cdp_request_timeout_s).
Any CDP method that doesn't respond within the cap raises plain
TimeoutError, which propagates through existing `except TimeoutError`
handlers in session.py / tools/service.py. Uses the same defensive env
parse pattern as BROWSER_USE_ACTION_TIMEOUT_S — rejects empty /
non-numeric / nan / inf / non-positive values with a warning fallback.
Default is 60s: generous for slow operations like Page.captureScreenshot
or Page.printToPDF on heavy pages, but well below the 180s step timeout
and any typical outer watchdog. Override via BROWSER_USE_CDP_TIMEOUT_S.
Wired into both CDPClient construction sites in session.py (initial
connect + reconnect path). All 17 existing real-browser tests
(test_action_blank_page, test_multi_act_guards) still pass.
The main execution loop already wraps _execute_step with asyncio.wait_for
using settings.step_timeout (default 180s). But _execute_initial_actions,
which runs before the main loop, is unwrapped — if it hangs (e.g. the
first navigate stalls on a silent CDP WebSocket before the per-action
timeout can catch it), the agent blocks indefinitely without ever
entering the main loop. No step gets recorded, history stays empty, and
any outer watchdog eventually kills the run with zero diagnostic data.
Wrap _execute_initial_actions with the same step_timeout. On timeout,
record the failure in state.last_result / consecutive_failures and fall
through to the main execution loop so the agent can still attempt to
recover. InterruptedError (from an interrupting callback) is still
swallowed silently — same contract as before.
Paired with the per-action asyncio.wait_for added in tools/service.py,
this closes the last unprotected path in the pre-main-loop flow.
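The shape of the fix, as a sketch — `step_timeout`, `_execute_initial_actions`, and the state fields mirror the description, but this wrapper function and its failure record are illustrative, not the agent's actual code:

```python
import asyncio


async def run_initial_actions(agent) -> None:
	try:
		await asyncio.wait_for(
			agent._execute_initial_actions(), timeout=agent.settings.step_timeout
		)
	except asyncio.TimeoutError:
		# Record the failure so the run has diagnostic data, then fall through
		# to the main loop so the agent can still attempt to recover.
		agent.state.last_result = [{'error': 'Initial actions timed out'}]
		agent.state.consecutive_failures += 1
	except InterruptedError:
		pass  # same contract as before: interrupting callbacks are swallowed
```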
Two more issues from automated review on #4711:
1. (P2, Codex) float() accepts 'nan' and 'inf' — both parse successfully
and bypass the fallback path. 'nan' makes asyncio.wait_for time out
immediately for every action; 'inf' effectively disables the hang
guard. Extracted the parse into _parse_env_action_timeout() which
rejects non-finite and non-positive values (including 0 and negatives)
with a warning + fallback.
2. (P2, Cubic) The previous reload test left browser_use.tools.service
pinned at _DEFAULT_ACTION_TIMEOUT_S=45.0 (the last monkeypatch value),
which would leak into any later test in the same worker. Added a
_restore_service_module fixture that pops the env var and reloads
cleanly on teardown.
Expanded test coverage to include 'nan', 'NaN', 'inf', '-inf', '0', '-5'
alongside the existing '' / 'abc' cases — all fall back to 180s.
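The gotcha driving fix #1 is that `float('nan')` and `float('inf')` parse without error, so a bare try/except around `float()` is not enough. A hypothetical parse (not the real `_parse_env_action_timeout`) exercised over the expanded test matrix:

```python
import math


def parse_timeout(raw: str, default: float = 180.0) -> float:
	"""float() happily accepts 'nan' and 'inf', so finiteness and positivity
	must be checked explicitly after parsing."""
	try:
		value = float(raw)
	except ValueError:
		return default
	return value if math.isfinite(value) and value > 0 else default
```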
`asyncio.wait_for(stop_recording(), timeout=5.0)` could expire while the
ffmpeg encoder was still flushing, letting the daemon's subsequent
`os._exit(0)` kill the executor thread mid-write and leave exactly the
truncated MP4 this hook was meant to prevent. `stop_recording()` already
offloads the blocking close to an executor, so awaiting it directly is
safe — and if it genuinely hangs, a stuck daemon is a clearer failure
signal than silent video corruption.
Verified end-to-end: start recording → `open` → `close` (no explicit
`record stop`) now produces a decodable MP4 with the captured frames.
Two issues flagged by automated review on #4711:
1. (P1, Codex) The 90s default was *below* the extract action's intentional
120s page_extraction_llm.ainvoke timeout (tools/service.py:1096,1172).
Slow-but-valid extractions against large pages would be truncated into
timeout errors — a regression. Raised default to 180s, which sits above
that 120s inner cap with grace.
2. (P2, Cubic + Codex) float(os.getenv('BROWSER_USE_ACTION_TIMEOUT_S', '90'))
ran at import time. An empty or non-numeric value (common with env
templating) raised ValueError and prevented browser_use.tools.service
from importing at all — turning a config typo into a process-wide
startup failure. Wrapped in try/except with a warning and fallback to
the hardcoded 180s default.
Tests:
- test_default_action_timeout_accommodates_extract_action — pins the
default >= 150s so future edits can't silently regress extract.
- test_malformed_env_timeout_does_not_break_import — reloads the module
with empty / non-numeric env values and asserts it falls back cleanly,
plus verifies a valid numeric env value still takes effect.
Individual CDP calls like Page.navigate() have their own 20s timeouts, but
the surrounding event-bus plumbing (await event, event_result()) does not.
When a cloud browser's CDP WebSocket goes silent mid-session, agent handlers
hang indefinitely — agents never emit a step, any outer watchdog eventually
fires, and the run returns with zero history.
Observed in practice: a 170k-task collector run produced 1,090 empty-history
traces (21% of output). 100% hit the 240s outer watchdog; median 582s, max
2214s. Cloud HTTP layer was clean (all 200/201) — hang was entirely in CDP.
Wrap registry.execute_action in asyncio.wait_for with a configurable
per-action cap (default 90s, BROWSER_USE_ACTION_TIMEOUT_S env var or
tools.act(action_timeout=...)). On timeout, the action returns
ActionResult(error=...) so the agent can record the step and recover.
New tests/ci/test_action_timeout.py covers both hung and fast handlers.
Existing tools.act tests (test_multi_act_guards, test_action_blank_page)
still pass.
- `on_BrowserConnectedEvent` now catches `RuntimeError` from
`start_recording()` so sessions with `record_video_dir` configured but
missing `[video]` extras (or a viewport that can't be sized) keep
starting — prior graceful-degradation behavior is restored.
- Lazy `RecordingWatchdog` in the CLI handler now calls
`attach_to_session()`, so `AgentFocusChangedEvent` / `BrowserStopEvent`
handlers are wired correctly if the session dispatches them.
- Daemon shutdown finalizes any in-progress recording before tearing the
browser down, preventing truncated MP4s on `close`, idle timeout, or
signal-driven exit.
- Added regression test that monkeypatches `start_recording` to raise and
asserts `on_BrowserConnectedEvent` swallows it without breaking startup.
Closes #4533.
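The graceful-degradation shape of the first bullet, as a sketch (the class and handler names mirror the description; the bodies are illustrative):

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class RecordingWatchdogSketch:
	def __init__(self, record_video_dir=None):
		self.record_video_dir = record_video_dir

	def start_recording(self):
		# e.g. missing `[video]` extras, or a viewport that can't be sized
		raise RuntimeError('video recording dependencies unavailable')

	async def on_BrowserConnectedEvent(self, event=None):
		if not self.record_video_dir:
			return
		try:
			self.start_recording()
		except RuntimeError as exc:
			# Degrade gracefully: session startup must not fail just because
			# recording cannot start.
			logger.warning('Video recording unavailable: %s', exc)
```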
- `RecordingWatchdog` gains public `start_recording(path, size?, framerate?)`,
`stop_recording() -> Path`, and `is_recording`; the existing
`BrowserConnectedEvent`/`BrowserStopEvent` path is refactored to use them,
so profile-driven recording behavior is unchanged.
- `browser-use record start <path>` / `record stop` / `record status`
subcommands wired through argparse, daemon dispatch, and the browser
command handler. `record stop` prints the saved file path so it can be
captured programmatically, matching the issue's requested UX. Works with
`--session NAME` via the existing named-daemon infrastructure.
- The CLI's `CLIBrowserSession` intentionally skips watchdogs; the handler
lazily instantiates `RecordingWatchdog` on first `record start` so CLI
recording doesn't pay the watchdog-setup cost for non-recording sessions.
- Output format is `.mp4` (libx264) since that's what the existing
`VideoRecorderService` encodes; optional dependency gate is unchanged
(`pip install "browser-use[video]"`).
- New `tests/ci/test_action_record.py` exercises the full stack against a
real headless browser + `pytest-httpserver`, verifying decodable MP4
output, double-start rejection, stop-without-start no-op, that the
existing `profile.record_video_dir` flow still works, and the argparse /
dispatch wiring.
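A hedged sketch of the argparse wiring for the `record` subcommands; the subcommand shape follows the bullets above, but exact flag names and defaults in the shipped CLI may differ:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
	parser = argparse.ArgumentParser(prog='browser-use')
	sub = parser.add_subparsers(dest='command')
	record = sub.add_parser('record', help='control session video recording')
	record_sub = record.add_subparsers(dest='record_command', required=True)

	start = record_sub.add_parser('start')
	start.add_argument('path', help='output .mp4 path')
	start.add_argument('--session', default=None)

	stop = record_sub.add_parser('stop')  # prints the saved file path
	stop.add_argument('--session', default=None)

	status = record_sub.add_parser('status')
	status.add_argument('--session', default=None)
	return parser
```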
On Windows with a non-UTF-8 default locale (e.g. Chinese GBK/CP936),
open() without an explicit encoding uses the system code page. Chrome's
Local State file is always UTF-8, so profile names containing non-ASCII
characters (e.g. Chinese '用户1') are decoded as mojibake.
Fixes #4673
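The fix boils down to pinning the encoding on `open()`; a sketch (`read_local_state` is a hypothetical helper, not the project's function name):

```python
from pathlib import Path


def read_local_state(profile_dir: Path) -> str:
	# Chrome writes Local State as UTF-8 regardless of the OS locale, so the
	# encoding must be pinned; a bare open() would use the ANSI code page on
	# Windows (e.g. CP936) and mojibake non-ASCII profile names like '用户1'.
	with open(profile_dir / 'Local State', encoding='utf-8') as f:
		return f.read()
```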
When `browser-use connect` fails to discover a running Chrome, the error
now points to the correct `chrome://inspect/#remote-debugging` URL. The
SKILL.md also guides agents to prompt users with two options: enable
remote debugging or use managed Chromium with a Chrome profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add UTM params to all cloud-bound links across README, CLI, and error messages
- Rewrite README Open Source vs Cloud section: position cloud browsers as
recommended pairing for OSS users, remove separate Use Both section
- Rewrite error messages for use_cloud=True and ChatBrowserUse() to clearly
state what is wrong and what to do next
- Add missing URLs: invalid API key now links to key page, insufficient
credits now links to billing page
- Add cloud browser nudge on captcha detection (logger.warning)
- Add cloud browser nudge on local browser launch failure
- Use os.open() with mode 0o600 instead of write-then-chmod to eliminate
the permission race window where the temp file is briefly world-readable.
- Raise instead of warn when token file write fails: a daemon that cannot
persist its auth token is permanently unauthorized for all clients, so
failing fast is correct (identified by cubic).
Generate a secrets.token_hex(32) on daemon startup, write it atomically
to ~/.browser-use/{session}.token (chmod 0o600), and validate it on every
incoming request via hmac.compare_digest. The client reads the token file
and includes it in each send_command() call.
This closes the arbitrary-code-execution vector where any local process
could connect to the deterministic Windows TCP port (or a world-readable
Unix socket) and dispatch the 'python' action to run eval()/exec() as the
daemon owner.
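A sketch of the generate/persist/validate flow under the assumptions above (token filename and helper names are illustrative):

```python
import hmac
import os
import secrets
from pathlib import Path


def write_token(token_path: Path) -> str:
	"""Generate the daemon token and persist it readable only by the owner.
	os.open with mode=0o600 avoids the write-then-chmod race where the file
	is briefly world-readable."""
	token = secrets.token_hex(32)
	fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
	with os.fdopen(fd, 'w') as f:
		f.write(token)
	return token


def validate(expected: str, presented: str) -> bool:
	# Constant-time comparison guards against timing side channels.
	return hmac.compare_digest(expected, presented)
```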
Bumps aiohttp from 3.13.3 to 3.13.4 in requirements-cli.txt.
Fixes uncapped memory usage from insufficient trailer header restrictions
(aio-libs/aiohttp@0c2e9da).
- cloud.py: remove BROWSER_USE_API_KEY env var fallback (violates CLI
policy of config.json as single source of truth); instead detect the
env var in the error path and print a targeted migration hint
- setup.py: replace Path.rename() with shutil.move() so the temp file
can be moved across filesystems (e.g. /tmp -> /usr/local/bin)
A SIGKILL mid-write truncates config.json; read_config() catches
json.JSONDecodeError and returns {}, silently wiping the API key and
all other settings. Mirror the pattern already used by _write_state():
write to a sibling temp file, fsync, chmod 600, then os.replace() into
place — which is atomic on POSIX and effectively atomic on Windows.
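The pattern described above, sketched as a standalone helper (the function name is illustrative; `_write_state`'s actual body may differ in detail):

```python
import json
import os
from pathlib import Path


def write_config_atomic(path: Path, config: dict) -> None:
	"""Crash-safe write: sibling temp file, fsync, 0o600 perms, then
	os.replace — atomic on POSIX, effectively atomic on Windows."""
	tmp = path.with_suffix('.tmp')
	fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
	with os.fdopen(fd, 'w') as f:
		json.dump(config, f)
		f.flush()
		os.fsync(f.fileno())  # data is on disk before the rename
	# Readers see either the old file or the new one, never a torn write.
	os.replace(tmp, path)
```

A SIGKILL at any point leaves either the previous config.json intact or the fully-written replacement, so read_config() never sees truncated JSON.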
Downloads to a temp file, fetches the .sha256sum file Cloudflare publishes
alongside each release, and verifies before moving to the install destination.
Protects against MITM/CDN tampering. Temp file is cleaned up on failure.
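A sketch of the verify-then-move step (helper name hypothetical; fetching the `.sha256sum` file is assumed to have happened already):

```python
import hashlib
import shutil
from pathlib import Path


def install_verified(download_tmp: Path, expected_sha256: str, dest: Path) -> None:
	"""Verify the downloaded binary against the published checksum before
	moving it into place; clean up the temp file on mismatch."""
	digest = hashlib.sha256(download_tmp.read_bytes()).hexdigest()
	if digest != expected_sha256:
		download_tmp.unlink(missing_ok=True)
		raise ValueError(f'checksum mismatch: {digest} != {expected_sha256}')
	# shutil.move works across filesystems (e.g. /tmp -> /usr/local/bin),
	# unlike Path.rename.
	shutil.move(str(download_tmp), dest)
```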
The CLI previously accepted the env var as a fallback; this PR dropped it
without a migration path, breaking CI/CD pipelines that set it as a secret.
Restore backwards-compat by checking the env var after config.json and
printing a deprecation warning with the migration command.
- Remove BROWSER_USE_API_KEY env var as a read source from CLI code; config.json is the only source of truth
- Split _create_cloud_profile into daemon-safe _inner (raises) and CLI wrapper (sys.exit)
- Daemon auto-heal no longer kills process on profile creation API errors
_get_or_create_cloud_profile reads config instantly instead of
validating via GET /profiles/{id} on every connect. If the profile
is invalid, _provision_cloud_browser auto-heals by creating a new
one and retrying. Saves ~500ms-1s on every cloud connect.
Library keeps recording off by default. CLI reads cloud_connect_recording
from config (defaults True). Users can disable with:
browser-use config set cloud_connect_recording false
Page.enable fails on browser-level CDP targets. Wrap in try/except
like the library's PopupsWatchdog does. Dialog handler still
registers regardless — events may fire on some CDP implementations.
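A sketch of tolerating browser-level targets that reject Page.enable, in the spirit of the PopupsWatchdog pattern (`cdp_session` and the helper name are hypothetical):

```python
import asyncio


async def enable_page_domain(cdp_session) -> bool:
	try:
		await cdp_session.send('Page.enable')
		return True
	except Exception:
		# Browser-level CDP targets have no page domain. The dialog handler
		# is registered regardless, since events may still fire on some
		# CDP implementations.
		return False
```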