browser-use

mirror of https://github.com/browser-use/browser-use synced 2026-04-22 17:45:09 +02:00

Author	SHA1	Message	Date
Saurav Panda	44f7ead5cd	Drop timeout on recording finalize in daemon shutdown `asyncio.wait_for(stop_recording(), timeout=5.0)` could expire while the ffmpeg encoder was still flushing, leading the daemon's subsequent `os._exit(0)` to kill the executor thread mid-write and leave the exact truncated MP4 this hook was meant to prevent. `stop_recording()` already offloads the blocking close to an executor, so awaiting it directly is safe — and if it genuinely hangs, a stuck daemon is a clearer failure signal than silent video corruption. Verified end-to-end: start recording → `open` → `close` (no explicit `record stop`) now produces a decodable MP4 with the captured frames.	2026-04-20 15:38:34 -07:00
Saurav Panda	132756dabb	Address PR review feedback for record start/stop - `on_BrowserConnectedEvent` now catches `RuntimeError` from `start_recording()` so sessions with `record_video_dir` configured but missing `[video]` extras (or a viewport that can't be sized) keep starting — prior graceful-degradation behavior is restored. - Lazy `RecordingWatchdog` in the CLI handler now calls `attach_to_session()`, so `AgentFocusChangedEvent` / `BrowserStopEvent` handlers are wired correctly if the session dispatches them. - Daemon shutdown finalizes any in-progress recording before tearing the browser down, preventing truncated MP4s on `close`, idle timeout, or signal-driven exit. - Added regression test that monkeypatches `start_recording` to raise and asserts `on_BrowserConnectedEvent` swallows it without breaking startup.	2026-04-20 15:11:49 -07:00
sauravpanda	ca2185ba61	fix: create token temp file with 0o600 at open() time; raise on failure - Use os.open() with mode 0o600 instead of write-then-chmod to eliminate the permission race window where the temp file is briefly world-readable. - Raise instead of warn when token file write fails: a daemon that cannot persist its auth token is permanently unauthorized for all clients, so failing fast is correct (identified by cubic).	2026-04-02 17:58:12 -07:00
sauravpanda	a05a053da6	fix: add per-session auth token to daemon socket to prevent unauthorized code execution Generate a secrets.token_hex(32) on daemon startup, write it atomically to ~/.browser-use/{session}.token (chmod 0o600), and validate it on every incoming request via hmac.compare_digest. The client reads the token file and includes it in each send_command() call. This closes the arbitrary-code-execution vector where any local process could connect to the deterministic Windows TCP port (or a world-readable Unix socket) and dispatch the 'python' action to run eval()/exec() as the daemon owner.	2026-04-02 17:41:15 -07:00
ShawnPana	b1522b5e23	fix: sessions shows CDP URL for cloud sessions too Ping response now returns live CDP URL from the browser session (not just the constructor arg). Cloud sessions show their provisioned CDP URL.	2026-04-01 22:37:23 -07:00
ShawnPana	0bf1f02d97	fix: CI failures — ruff formatting, type errors, test_setup_command - Ruff format all skill_cli and test files - Fix type: get_config_value returns str|int|None, callers cast properly - Fix type: BrowserWrapper.actions is non-optional (always provided) - Fix type: config comparison uses 'is' not '==' - Rewrite test_setup_command for new setup.handle(yes=True) API - Add None guard in test_cli_lifecycle for state file	2026-04-01 19:58:33 -07:00
ShawnPana	ca05f46352	refactor: remove --agent/register/tab-ownership, sessions-as-agents model Multi-agent isolation is now achieved through separate sessions (--session NAME), each with its own browser. Removed: - register command and agents.json - --agent flag and agent_id plumbing - TabOwnershipManager and all tab locking logic - dispatch lock and focus swapping between agents - tab_ownership.py (deleted) - test_tab_ownership.py (deleted) Simplified tab commands: no lock checks, no _tab_list injection, no _resolved_target_id params. agent_focus_target_id stays for single-agent tab tracking. Tested: 3 concurrent subagents on separate cloud sessions, 3 concurrent subagents on separate headless Chromium sessions.	2026-04-01 17:34:46 -07:00
ShawnPana	73a926caa6	feat: add connect command + --agent flag, decouple multi-agent from Chrome - browser-use connect: one-time command to discover and connect to local Chrome (like cloud connect but for local) - --agent INDEX: per-command flag for multi-agent tab isolation, works with any browser mode (cloud, profile, cdp-url, headless) - register is now per-session ({session}.agents.json) - --connect deprecated with migration message - SKILL.md updated for new connect/--agent workflow - Tested: 3 concurrent agents on shared cloud browser session	2026-04-01 16:52:56 -07:00
ShawnPana	032e73eec7	feat: config-driven proxy/timeout for cloud connect, enable recording cloud connect reads cloud_connect_proxy and cloud_connect_timeout from ~/.browser-use/config.json. Recording always enabled via enableRecording default on CreateBrowserRequest. No CLI flags — edit config for custom settings, use cloud v2 REST for full control.	2026-03-31 13:44:27 -07:00
ShawnPana	b4b2a4c18d	feat: zero-config cloud connect with auto-managed profile cloud connect now works with no flags. On first use, creates a "Browser Use CLI" profile via the Cloud API and saves the ID to config.json. Subsequent connects reuse it (validates on each call, recreates if deleted). Removed --timeout, --proxy-country, --profile-id flags and their plumbing through daemon/sessions. Power users who need custom browser settings use cloud v2 POST /browsers directly.	2026-03-31 12:28:42 -07:00
ShawnPana	526bde4d0f	refactor: remove standalone switch and close-tab command aliases These were duplicates of tab switch and tab close with separate lock-checking and focus-resolution code paths. Having one way to do each thing reduces maintenance surface and avoids isolation bugs.	2026-03-30 20:09:50 -07:00
ShawnPana	8ade5748e2	feat: add daemon lifecycle state file and shutdown re-entrancy guard Daemon now writes <session>.state.json at each lifecycle transition (initializing → ready → starting → running → shutting_down → stopped/failed). All shutdown triggers funneled through _request_shutdown() to prevent double cleanup. Startup rollback on failure cleans up browser resources. Also fixes cloud browser leak: CLIBrowserSession.stop() now explicitly stops the remote browser via API instead of just disconnecting the websocket. Daemon ping response now includes PID for split-brain resolution.	2026-03-30 18:43:21 -07:00
ShawnPana	ac48ebecaf	fix: SIGTERM fallback for orphaned daemons in close command When the daemon's socket is unreachable but the PID file references a live process, close now sends SIGTERM directly instead of printing "No active browser session" and leaving the daemon running forever.	2026-03-30 18:39:06 -07:00
ShawnPana	18c2199e22	fix: serialize multi-agent focus swaps with dispatch lock Concurrent agents could interleave focus swaps on the shared BrowserSession, corrupting each other's state. Wrap the entire swap-execute-restore cycle in _dispatch_lock and separate the tab-ownership path from single-agent mode.	2026-03-28 15:58:29 -07:00
ShawnPana	6bb73ab274	merge upstream/main, resolve install_lite.sh conflict (take upstream)	2026-03-26 11:40:33 -07:00
ShawnPana	464bd167c3	fix: daemon zombie on close, add idle timeout, clean up PID file before exit	2026-03-26 11:31:15 -07:00
ShawnPana	e09ba11ef1	feat: remove event bus dependency, add dialog handling - Remove event bus tab listeners from daemon — track tabs directly - Remove dead event bus fallback branches from commands/browser.py - Replace SwitchTabEvent/CloseTabEvent dispatches with direct CDP calls - Update python_session.py to use ActionHandler instead of event bus - Add JS dialog handler (alert/confirm/prompt) to CLIBrowserSession - Surface auto-dismissed popup messages in state output - Only dummy EventBus() for watchdog constructors remains (unavoidable)	2026-03-25 16:37:26 -07:00
Laith Weinberger	a31809b83d	preserve error exit code	2026-03-25 17:15:34 -04:00
Laith Weinberger	c9ea2fb1cf	fix daemon orphaning when external browser dies on --connect/--cdp-url	2026-03-25 17:07:03 -04:00
ShawnPana	7bb2741292	feat: lightweight CLIBrowserSession — no watchdogs, no event bus Subclass BrowserSession as CLIBrowserSession that calls connect() directly instead of start(). Skips all 13 watchdogs and event bus handler registration. Actions execute via ActionHandler which calls DefaultActionWatchdog methods directly and DomService for DOM snapshots. - CLIBrowserSession.start() → connect() only (CDP + SessionManager) - CLIBrowserSession.stop() → close websocket directly (no BrowserStopEvent) - CLIBrowserSession.kill() → Browser.close + disconnect - ActionHandler wraps DefaultActionWatchdog for click/type/scroll/keys/etc - DomService called directly for state (no DOMWatchdog) - Monkey-patches _enable_page_monitoring to no-op after initial connect - Disables auto-reconnect (_intentional_stop = True) - Falls back to event bus path if ActionHandler is not available	2026-03-25 13:28:12 -07:00
ShawnPana	f819d3e3a6	feat: add `tab` command (list, new, switch, close) and agent registration - `browser-use register` assigns numeric agent index for --connect mode - `--connect <index>` requires explicit agent index (no more bare --connect) - `tab list` shows all tabs with lock status per agent - `tab new [url]` creates a new tab without visually switching - `tab switch <index>` changes agent focus without activating Chrome tab - `tab close <index> [index...]` closes multiple tabs in one command - Agent registry in ~/.browser-use/agents.json with 5min expiry - Improved error messages guide agents to register or use their own tab - Session lock prevents double BrowserSession creation on simultaneous connect - Updated SKILL.md with register workflow and tab commands	2026-03-24 17:27:38 -07:00
ShawnPana	31694df283	feat: multi-agent tab isolation for --connect mode Multiple agents can share one browser via --connect without interfering with each other. Each agent registers with `browser-use register` to get a numeric index, then passes it with `--connect <index>` on every command. - Tab locking: mutating commands (click, type, open) lock the tab to the agent. Other agents get an error if they try to mutate the same tab. Read-only commands (state, screenshot) work on any tab. - Agent registry: agents.json tracks registered agents with timestamps. Expired agents (5min inactive) get cleaned up automatically. - Session lock: prevents double BrowserSession creation when two agents connect simultaneously. - Focus swap: daemon swaps agent_focus_target_id and cached_selector_map per-agent before each command, so element indices are isolated.	2026-03-24 16:38:53 -07:00
ShawnPana	54f3febdfa	address PR review comments: fix cloud v2 --help, guard double signal, error to stderr, stale port - Narrow cloud --help intercept to only fire when --help is immediately after 'cloud', so 'cloud v2 --help' still shows OpenAPI endpoints - Guard signal handler against concurrent shutdown tasks on repeated signals - Route error response bodies to stderr in cloud REST commands - Replace stale port 49200 in README Windows troubleshooting	2026-03-19 22:04:34 -07:00
ShawnPana	c5885e9e06	store shutdown task reference to prevent browser orphaning on signal exit await the shutdown task in run() so browser cleanup (kill/stop) completes before the event loop tears down. Previously, asyncio.run() could cancel the shutdown mid-cleanup, leaving Chrome processes orphaned.	2026-03-19 21:55:14 -07:00
ShawnPana	9980a9ea4f	remove redundant string quotes on type annotations in daemon.py	2026-03-19 21:21:18 -07:00
ShawnPana	bff2918558	add --connect flag for Chrome auto-discovery and fix daemon shutdown for external browsers Adds `--connect` to auto-discover running Chrome instances via DevToolsActivePort files and well-known port probing, eliminating manual CDP URL construction. Fixes daemon process hanging on `close` when connected to external browsers (--connect, --cdp-url, cloud) by calling stop() (disconnect) instead of kill() (terminate).	2026-03-18 09:41:58 -07:00
ShawnPana	3318f56318	simplify CLI infrastructure: single-session daemon, remove install modes, streamline setup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-15 10:12:41 -07:00
ShawnPana	0dc2fc5a6f	remove `run` command and agent infrastructure from CLI The `run` command pulled in heavy SDK dependencies (openai, anthropic, google), had a bug (await on sync get_llm), and is superseded by `browser-use cloud` for agent execution. CLI is now purely a browser automation interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 11:44:13 -07:00
ShawnPana	4be221386b	changes	2026-03-11 12:57:50 -07:00
ShawnPana	859cb97063	simplify daemon architecture: single session, socket-as-liveness, no PID/lock files Replace the multi-session server (server.py, SessionRegistry, portalocker locking, PID files, orphan detection) with a minimal daemon (daemon.py) that holds one BrowserSession in memory. Socket file existence = alive. Auto-exits when browser dies via CDP watchdog. -2277 lines, +142 lines across 20 files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 19:05:44 -08:00
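
The token handling described in a05a053da6 and ca2185ba61 can be illustrated with a minimal sketch. This is not the daemon's actual code: the helper names `write_token` and `is_authorized` and the `.tmp` suffix are assumptions made for illustration. Only `secrets.token_hex(32)`, `os.open()` with mode `0o600`, and `hmac.compare_digest` come from the commit messages.

```python
import hmac
import os
import secrets
from pathlib import Path


def write_token(token_path: Path) -> str:
    """Generate a token and persist it with owner-only permissions (hypothetical helper)."""
    token = secrets.token_hex(32)  # 32 random bytes, rendered as 64 hex characters
    tmp_path = token_path.with_name(token_path.name + ".tmp")  # assumed temp-file naming
    # The mode is applied when the file is created, so there is no window in which
    # the file exists with wider permissions (unlike write-then-chmod).
    # O_EXCL makes the call fail if the temp file already exists.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(token)
        os.replace(tmp_path, token_path)  # atomic rename onto the final path
    except BaseException:
        # Fail fast: remove the partial temp file and re-raise instead of warning.
        tmp_path.unlink(missing_ok=True)
        raise
    return token


def is_authorized(expected: str, supplied: str) -> bool:
    """Compare tokens in constant time to avoid leaking a prefix match via timing."""
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

A client in this sketch would read the token file and pass its contents as `supplied` with each request. On Windows the `0o600` mode has little effect, so the permission guarantee applies to POSIX systems only.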

30 Commits
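
Several of the commits above (8ade5748e2, c5885e9e06, 54f3febdfa) concern shutdown: funneling every trigger through `_request_shutdown()`, guarding against repeated signals, and awaiting a stored shutdown task so cleanup finishes before the event loop tears down. The following is a rough sketch of that pattern under stated assumptions. The `Daemon` class, its attributes, and `main()` are hypothetical; only the `_request_shutdown` name appears in the log.

```python
import asyncio
from typing import Optional


class Daemon:
    """Hypothetical stand-in for a daemon with a re-entrancy-guarded shutdown."""

    def __init__(self) -> None:
        self._shutdown_task: Optional[asyncio.Task] = None
        self._stopped = asyncio.Event()
        self.cleanup_calls = 0  # counts how often cleanup actually ran

    def _request_shutdown(self) -> None:
        # Re-entrancy guard: a second signal or command must not start a second cleanup.
        if self._shutdown_task is not None:
            return
        # Keep a reference so the task can be awaited later and is not garbage-collected.
        self._shutdown_task = asyncio.create_task(self._shutdown())

    async def _shutdown(self) -> None:
        self.cleanup_calls += 1
        await asyncio.sleep(0)  # placeholder for browser cleanup (kill or stop)
        self._stopped.set()

    async def run(self) -> None:
        await self._stopped.wait()
        # Await the stored task so cleanup completes before the loop is torn down.
        if self._shutdown_task is not None:
            await self._shutdown_task


async def main() -> int:
    daemon = Daemon()
    runner = asyncio.create_task(daemon.run())
    daemon._request_shutdown()
    daemon._request_shutdown()  # repeated trigger is ignored by the guard
    await runner
    return daemon.cleanup_calls


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Storing the task rather than firing and forgetting it is what lets `run()` wait for cleanup. Without the reference, `asyncio.run()` may cancel the still-pending task when the main coroutine returns.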