Two additional performance fixes for heavy pages:
1. Hoist get_or_create_cdp_session() outside _construct_enhanced_node
Previously called once PER DOM NODE inside the recursive tree
construction. On a 100k-element page, this was 100k+ async
operations. Now resolved once before recursion starts.
2. Add _MAX_RECTS=5000 safety cap to RectUnionPure
The paint order rect union can fragment exponentially with many
overlapping translucent layers (each add() splits up to 4 rects).
Cap prevents memory/CPU explosion on complex pages.
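
The fragmentation and the cap can be illustrated with a minimal sketch. The internals of `RectUnionPure` are not shown here, so `Rect`, `split_diff`, `CappedRectUnion`, and the cap policy (refuse further adds once the limit is reached) are assumptions for illustration only:

```python
from dataclasses import dataclass

MAX_RECTS = 5000  # mirrors the _MAX_RECTS value named above


@dataclass(frozen=True)
class Rect:
    x1: float
    y1: float
    x2: float
    y2: float


def split_diff(a: Rect, b: Rect) -> list:
    """Return a minus b as at most four disjoint rectangles."""
    # No overlap: a survives unchanged.
    if b.x2 <= a.x1 or b.x1 >= a.x2 or b.y2 <= a.y1 or b.y1 >= a.y2:
        return [a]
    parts = []
    if a.y1 < b.y1:  # full-width strip on the low-y side of b
        parts.append(Rect(a.x1, a.y1, a.x2, b.y1))
    if b.y2 < a.y2:  # full-width strip on the high-y side of b
        parts.append(Rect(a.x1, b.y2, a.x2, a.y2))
    y_lo, y_hi = max(a.y1, b.y1), min(a.y2, b.y2)
    if a.x1 < b.x1:  # strip on the low-x side, within b's y range
        parts.append(Rect(a.x1, y_lo, b.x1, y_hi))
    if b.x2 < a.x2:  # strip on the high-x side, within b's y range
        parts.append(Rect(b.x2, y_lo, a.x2, y_hi))
    return parts


class CappedRectUnion:
    """Disjoint rectangle set that stops growing at max_rects (assumed policy)."""

    def __init__(self, max_rects: int = MAX_RECTS) -> None:
        self.rects = []
        self.max_rects = max_rects

    def add(self, r: Rect) -> bool:
        """Add the uncovered part of r; return False if nothing was added."""
        if len(self.rects) >= self.max_rects:
            return False  # cap reached: refuse further fragmentation
        pending = [r]
        for s in self.rects:
            pending = [piece for p in pending for piece in split_diff(p, s)]
            if not pending:
                return False  # r is already fully covered
        self.rects.extend(pending)
        return True
```

In this sketch the cap is checked before each add, so a single add can still overshoot the limit by the number of pieces it produces.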
Also: expanded stress test suite to 15 pages (up to 132k elements)
including shadow DOM + iframe combos, overlapping layers, cross-origin
iframes, and a 100k flat element test. All 15 pass.
10 progressively heavier test pages (1k to 50k+ elements):
- Flat divs, nested tables, shadow DOM, iframes, deep nesting
- Mega forms, SVG heavy, event listeners, cross-origin iframes
- Ultimate stress test combining everything
Two test modes:
- --dom-only: tests DOM capture without LLM (fast, no API key needed)
- --agent: tests full agent loop with real LLM on subset of pages
Pages with very large DOMs (e.g. Stimulsoft designer with 20,000+
elements) cause the browser state capture to time out, making the
agent unable to interact with the browser.
Three targeted fixes:
1. Skip JS listener detection on heavy pages (>10k elements)
The querySelectorAll('*') + getEventListeners() loop, followed by one
DOM.describeNode CDP call per listener element, scales linearly with
element count and can take 10s+ on its own on heavy pages.
2. Batch DOM.describeNode calls (chunks of 50)
Previously all calls fired at once via asyncio.gather, flooding the
CDP WebSocket and causing timeouts on concurrent operations.
3. Adaptive CDP timeouts based on page complexity
- >15k elements: 25s initial / 10s retry (was 10s/2s)
- >5k elements: 15s initial / 5s retry
- Normal pages: unchanged 10s/2s
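
Fixes 2 and 3 can be sketched as follows. The helper names `pick_timeouts` and `gather_in_chunks` are hypothetical and the worker stands in for a `DOM.describeNode` call. Only the thresholds, timeout values, and chunk size come from the description above:

```python
import asyncio


def pick_timeouts(element_count: int) -> tuple:
    """Return (initial, retry) CDP timeouts in seconds for a page of this size."""
    if element_count > 15_000:
        return 25.0, 10.0
    if element_count > 5_000:
        return 15.0, 5.0
    return 10.0, 2.0


async def gather_in_chunks(items, worker, chunk_size: int = 50) -> list:
    """Run worker(item) for every item, at most chunk_size calls at a time."""
    results = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        # Each chunk finishes before the next starts, bounding in-flight calls.
        results.extend(await asyncio.gather(*(worker(item) for item in chunk)))
    return results
```

Results come back in input order, because `asyncio.gather` preserves the order of its arguments and the chunks are processed sequentially.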
cloud connect reads cloud_connect_proxy and cloud_connect_timeout from
~/.browser-use/config.json. Recording is always enabled via the
enableRecording default on CreateBrowserRequest. There are no CLI flags:
edit the config for custom settings, or use cloud v2 REST for full control.
CLIBrowserSession._provision_cloud_browser now reads the env var and
overrides CloudBrowserClient.api_base_url before provisioning. This
keeps the change CLI-only without modifying library code.
Single env var to override the cloud API base URL for all versions.
Per-version overrides (BROWSER_USE_CLOUD_BASE_URL_V2/V3) still take
precedence for backward compatibility.
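
The precedence described above could look roughly like this. The name of the single override variable is not stated here, so `GENERIC_VAR` is a placeholder, and `resolve_base_url` is a hypothetical helper rather than the library's API:

```python
import os
from typing import Mapping, Optional

# Placeholder: the real name of the single override variable is not given above.
GENERIC_VAR = "EXAMPLE_CLOUD_BASE_URL"


def resolve_base_url(version: str, default: str,
                     env: Optional[Mapping[str, str]] = None) -> str:
    """Per-version override first, then the generic override, then the default."""
    env = os.environ if env is None else env
    per_version = env.get(f"BROWSER_USE_CLOUD_BASE_URL_{version.upper()}")
    if per_version:
        return per_version
    return env.get(GENERIC_VAR) or default
```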
cloud connect now works with no flags. On first use, creates a
"Browser Use CLI" profile via the Cloud API and saves the ID to
config.json. Subsequent connects reuse it (validates on each call,
recreates if deleted).
Removed --timeout, --proxy-country, --profile-id flags and their
plumbing through daemon/sessions. Power users who need custom browser
settings use cloud v2 POST /browsers directly.
## Summary
- Close `CloudBrowserClient`'s httpx connection pool in
`on_BrowserStopEvent` after `stop_browser()` completes
- Make `CloudBrowserClient.close()` idempotent (safe to call multiple
times)
## Problem
`BrowserSession.on_BrowserStopEvent` calls `stop_browser()` to end the
cloud session but never calls `_cloud_browser_client.close()`. The
underlying `httpx.AsyncClient` connection pool stays alive in memory.
On AWS Lambda with provisioned concurrency, the same container handles
many invocations sequentially. Each invocation creates a new
`BrowserSession` → new `CloudBrowserClient` → new `httpx.AsyncClient`,
but the old ones are never cleaned up. Memory climbs from ~1.3GB to the
3GB ceiling over hours, triggering OOM kills.
**Observed in production**: 21 `Runtime.ExitError` crashes in 6 hours,
`aws.lambda.enhanced.max_memory_used` saturated at 3,009 MB (the Lambda
limit), 3 confirmed `out_of_memory` events.
## Test plan
- [ ] Run `tests/ci` suite to verify no regressions
- [ ] Verify cloud browser sessions still clean up properly
(`stop_browser` + `close`)
- [ ] Verify calling `close()` twice doesn't raise
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Close the `httpx` client pool when a cloud browser session stops to
prevent memory leaks and OOM on AWS Lambda. Also makes
`CloudBrowserClient.close()` safe to call multiple times.
- **Bug Fixes**
- Always call `_cloud_browser_client.close()` in
`BrowserSession.on_BrowserStopEvent` after `stop_browser()` (in finally)
to free the `httpx.AsyncClient` pool.
- Make `CloudBrowserClient.close()` idempotent by checking
`client.is_closed` before `aclose()`.
<sup>Written for commit 5e644981e8.</sup>
<!-- End of auto-generated description by cubic. -->
BrowserSession.on_BrowserStopEvent calls stop_browser() but never calls
_cloud_browser_client.close(), leaving the httpx connection pool alive.
On Lambda provisioned concurrency, these pools accumulate across
invocations — memory climbs from ~1.3GB to the 3GB ceiling over hours,
triggering OOM kills (21 Runtime.ExitError crashes in 6 hours observed
in production).
Changes:
- Call _cloud_browser_client.close() in on_BrowserStopEvent after
stop_browser completes (in a finally block so it runs even if
stop_browser fails)
- Make CloudBrowserClient.close() idempotent (check is_closed before
calling aclose) so it's safe to call multiple times
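
A minimal sketch of the two changes, using a stand-in for `httpx.AsyncClient` so it runs without network access. `FakeAsyncClient`, `CloudClientSketch`, and `on_browser_stop` are illustrative names, and whether the real handler lets the `stop_browser` error propagate is not specified here:

```python
import asyncio


class FakeAsyncClient:
    """Stand-in exposing the two httpx.AsyncClient members used below."""

    def __init__(self) -> None:
        self.is_closed = False
        self.aclose_calls = 0

    async def aclose(self) -> None:
        self.aclose_calls += 1
        self.is_closed = True


class CloudClientSketch:
    def __init__(self, client) -> None:
        self.client = client

    async def stop_browser(self) -> None:
        raise RuntimeError("simulated API failure")

    async def close(self) -> None:
        # Idempotent: only close a pool that is still open.
        if not self.client.is_closed:
            await self.client.aclose()


async def on_browser_stop(cloud: CloudClientSketch) -> None:
    try:
        await cloud.stop_browser()
    finally:
        # Runs even when stop_browser raises, so the pool is always released.
        await cloud.close()
```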
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New references/cdp-python.md with tested recipes for raw CDP access
via the Python session: activating tabs, listing targets, running JS,
device emulation, cookies. SKILL.md updated with pointer to the
reference file explaining when to reach for CDP vs CLI commands.
These were duplicates of tab switch and tab close with separate
lock-checking and focus-resolution code paths. Having one way to
do each thing reduces maintenance surface and avoids isolation bugs.
Default close-tab and tab-close paths used session_manager.get_focused_target()
which returns Chrome's globally focused tab. In multi-agent mode this lets
one agent close another's tab. Now uses agent_focus_target_id first, falling
back to global focus only in single-agent mode.
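
The resolution order described above can be sketched as a pure function. `resolve_target_to_close` and its parameters are hypothetical, and returning `None` in multi-agent mode when the agent has no focus is an assumption about how the caller signals "nothing to close":

```python
from typing import Optional


def resolve_target_to_close(agent_focus_target_id: Optional[str],
                            global_focus_target_id: Optional[str],
                            multi_agent: bool) -> Optional[str]:
    """Pick the tab an agent may close without touching another agent's tab."""
    if agent_focus_target_id is not None:
        return agent_focus_target_id
    if multi_agent:
        # Chrome's globally focused tab may belong to a different agent.
        return None
    return global_focus_target_id
```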
Also: fresh daemon spawn now uses phase-aware state file waiting (15s) instead
of fixed 5s socket polling, consistent with ensure_daemon's probe logic.
- ensure_daemon now phase-aware: waits for initializing/starting/
shutting_down with staleness timeouts, errors on unhealthy sessions
- _close_session only cleans files after confirmed PID death
- _handle_sessions won't delete live daemon's files on stale terminal state
- _probe_session uses socket_pid for split-brain PID resolution
- _is_daemon_process works on Windows (wmic)
- Updated + added robustness tests (codex-authored)
15 tests covering state file transitions, _probe_session branches (live
daemon, dead PID, no files, corrupt state), close via socket, close
orphaned daemon, close --all with mixed sessions, sessions listing with
phase column, and stale/terminal state cleanup.
Uses real daemon subprocesses with BROWSER_USE_HOME overridden to temp
dirs. No mocking.
Replace scattered PID/socket/process checks with _probe_session() that
reads the state file, reconciles PIDs, checks liveness, and probes the
socket without deleting anything. Callers decide cleanup policy.
Adds _is_pid_alive, _is_daemon_process, _terminate_pid (cross-platform
with SIGKILL escalation), _close_session (shared by close and close-all).
sessions command now shows phase column. close and close-all both handle
orphaned daemons via SIGTERM fallback. close polls for PID disappearance
up to 15s before giving up.
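
A POSIX-only sketch of a liveness check and SIGTERM-then-SIGKILL escalation. These are not the actual `_is_pid_alive` and `_terminate_pid` implementations: the 5 s grace period is an assumed default, and the Windows branch is omitted:

```python
import os
import signal
import time


def is_pid_alive(pid: int) -> bool:
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_pid(pid: int, grace: float = 5.0, poll: float = 0.1) -> bool:
    """Send SIGTERM, wait up to `grace` seconds, then escalate to SIGKILL."""
    for sig in (signal.SIGTERM, signal.SIGKILL):
        if not is_pid_alive(pid):
            return True
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return True  # exited between the check and the signal
        deadline = time.monotonic() + grace
        while time.monotonic() < deadline:
            if not is_pid_alive(pid):
                return True
            time.sleep(poll)
    return not is_pid_alive(pid)
```

Note that a child process that has exited but has not been reaped by its parent (a zombie) still answers signal 0, so this check reports it as alive.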
Daemon now writes <session>.state.json at each lifecycle transition
(initializing → ready → starting → running → shutting_down → stopped/failed).
All shutdown triggers funneled through _request_shutdown() to prevent
double cleanup. Startup rollback on failure cleans up browser resources.
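
One way to write such a state file so that readers never see a half-written document is a temp-file-plus-rename. This is a sketch, not the daemon's code: `write_state`, `read_phase`, and the `pid` and `updated_at` fields are assumptions, and only the file name pattern and phase names come from the description above:

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

PHASES = ("initializing", "ready", "starting", "running",
          "shutting_down", "stopped", "failed")


def write_state(home: Path, session: str, phase: str) -> Path:
    """Atomically replace <session>.state.json with the new phase."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase}")
    path = home / f"{session}.state.json"
    payload = {"phase": phase, "pid": os.getpid(), "updated_at": time.time()}
    fd, tmp = tempfile.mkstemp(dir=home, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    os.replace(tmp, path)  # atomic rename within the same directory
    return path


def read_phase(home: Path, session: str) -> Optional[str]:
    """Return the recorded phase, or None if the file is missing or corrupt."""
    try:
        return json.loads((home / f"{session}.state.json").read_text())["phase"]
    except (FileNotFoundError, json.JSONDecodeError, KeyError, TypeError):
        return None
```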
Also fixes cloud browser leak: CLIBrowserSession.stop() now explicitly
stops the remote browser via API instead of just disconnecting the
websocket.
Daemon ping response now includes PID for split-brain resolution.
When the daemon's socket is unreachable but the PID file references a
live process, close now sends SIGTERM directly instead of printing
"No active browser session" and leaving the daemon running forever.
Sockets without a corresponding live PID file were never cleaned up
because cleanup was only triggered per-session on next use. Add a
glob pass in _handle_sessions to remove any .sock file that doesn't
match a live session.
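
The glob pass might look like the sketch below. `remove_orphan_sockets` is a hypothetical helper, and it assumes each socket is named `<session>.sock` and that the caller has already determined which sessions are live:

```python
from pathlib import Path


def remove_orphan_sockets(home: Path, live_sessions: set) -> list:
    """Delete every *.sock in home whose session name is not live."""
    removed = []
    for sock in sorted(home.glob("*.sock")):
        if sock.stem not in live_sessions:
            sock.unlink(missing_ok=True)  # tolerate a concurrent cleanup
            removed.append(sock.name)
    return removed
```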
Concurrent agents could interleave focus swaps on the shared
BrowserSession, corrupting each other's state. Wrap the entire
swap-execute-restore cycle in _dispatch_lock and separate the
tab-ownership path from single-agent mode.
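
The locking pattern can be sketched as follows. `SharedSession` and `run_for_agent` are illustrative names. Only the idea of holding `_dispatch_lock` across the whole swap-execute-restore cycle comes from the description above:

```python
import asyncio


class SharedSession:
    """Toy session with a single focus slot shared by several agents."""

    def __init__(self) -> None:
        self.focus = "tab-0"
        self._dispatch_lock = asyncio.Lock()

    async def run_for_agent(self, agent_tab: str, action):
        # Hold the lock across swap, execute, and restore so another agent
        # cannot change the focus while this action is awaiting.
        async with self._dispatch_lock:
            previous = self.focus
            self.focus = agent_tab
            try:
                return await action(self)
            finally:
                self.focus = previous
```

Without the lock, an action that awaits between reading and using `focus` could observe a tab swapped in by another agent.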
CDP input gesture simulation doesn't work in --connect mode (external
Chrome). Switch to window.scrollBy via Runtime.evaluate which works
regardless of connection mode.
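
For illustration, the CDP message for such a scroll could be built like this. `build_scroll_command` is a hypothetical helper, and the session routing and transport that a real client needs are left out:

```python
import json


def build_scroll_command(message_id: int, dx: int, dy: int) -> str:
    """Serialize a Runtime.evaluate request that scrolls the page by (dx, dy)."""
    expression = f"window.scrollBy({int(dx)}, {int(dy)})"
    return json.dumps({
        "id": message_id,
        "method": "Runtime.evaluate",
        "params": {"expression": expression},
    })
```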
cloud connect was silently reading credentials from the library's
~/.config/browseruse/cloud_auth.json, bypassing cloud login/logout
entirely. Now ensure_daemon injects the CLI config's API key into the
daemon subprocess env, so all cloud commands share a single auth source.
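
A sketch of the injection step. The environment variable name is not stated above, so it is passed in as a parameter, and `build_daemon_env` is a hypothetical helper:

```python
import os
from typing import Mapping, Optional


def build_daemon_env(api_key: Optional[str], env_var: str,
                     base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Copy the parent environment and overlay the CLI config's API key."""
    env = dict(os.environ if base_env is None else base_env)
    if api_key:
        env[env_var] = api_key  # the CLI config wins over any inherited value
    return env
```

The resulting dict can be passed as `env=` to `subprocess.Popen` when spawning the daemon.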
<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Adds a lightweight installer (`install_lite.sh`) that sets up the
`browser-use` CLI with minimal deps and Chromium, avoiding the full
library. Also extracts CLI deps into `requirements-cli.txt`, adds a CI
test, and hardens the installer with a `curl` pre-check and non-fatal
validate.
- New Features
- `install_lite.sh` installs Python 3.11+, `uv`, the CLI (`--no-deps`)
with minimal deps from `requirements-cli.txt`, and Chromium via `uvx
playwright` in `~/.browser-use-env`, and configures PATH (Linux, macOS,
Windows).
- Installs the `profile-use` helper into `~/.browser-use/bin` for
profile management.
- CI test (`tests/ci/test_cli_lite_deps.py`) boots the CLI in a clean
venv with only minimal deps and checks key imports.
- Minimal CLI deps pinned in
`browser_use/skill_cli/requirements-cli.txt`.
- Bug Fixes
- Validation is non-fatal; next steps are shown even if checks warn.
- Checks for `curl` before installing `uv` to prevent confusing
failures.
<sup>Written for commit 7e92031593.</sup>
<!-- End of auto-generated description by cubic. -->
## Problem
Closes #4211
`browser_click` in the MCP server uses `oneOf` at the top level of
`inputSchema` to express "provide either `index` OR
`coordinate_x`+`coordinate_y`":
```json
{
"type": "object",
"properties": { ... },
"oneOf": [
{"required": ["index"]},
{"required": ["coordinate_x", "coordinate_y"]}
]
}
```
Claude's API (and other strict MCP clients) rejects
`oneOf`/`allOf`/`anyOf` at the top level of a tool input schema:
```
400 tools.N.custom.input_schema: input_schema does not support oneOf, allOf, or anyOf at the top level
```
This cascades — it breaks **all** MCP tools in the session, not just
`browser_click`.
## Fix
Remove the `oneOf` block. The constraint is now expressed in the
property descriptions instead. No behaviour change: the runtime guard in
`_click()` already returns a clear error string when neither `index` nor
coordinates are provided.
## Diff summary
- `-4 lines`: remove the `oneOf: [...]` block entirely
- `+3 lines`: update the three property descriptions to say "Provide
this OR ..."
The `_click()` handler is untouched.
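
The shape of the fix can be sketched in Python. The property types and description wording below are illustrative assumptions, and `check_click_args` only mimics the kind of runtime guard that `_click()` is said to perform:

```python
from typing import Optional

# Schema without a top-level oneOf; the constraint lives in the descriptions.
CLICK_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "index": {
            "type": "integer",
            "description": "Element index. Provide this OR coordinate_x + coordinate_y.",
        },
        "coordinate_x": {
            "type": "integer",
            "description": "X coordinate. Provide this with coordinate_y OR use index.",
        },
        "coordinate_y": {
            "type": "integer",
            "description": "Y coordinate. Provide this with coordinate_x OR use index.",
        },
    },
}


def check_click_args(args: dict) -> Optional[str]:
    """Return an error string when neither index nor both coordinates are given."""
    if args.get("index") is not None:
        return None
    if args.get("coordinate_x") is not None and args.get("coordinate_y") is not None:
        return None
    return "Provide either index or both coordinate_x and coordinate_y"
```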
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Remove the top-level oneOf from the `browser_click` tool input schema so
Claude and other strict MCP clients accept it and the session doesn’t
break. No behavior change; `_click()` still validates, and field
descriptions now clarify using `index` or `coordinate_x`+`coordinate_y`.
<sup>Written for commit 6094cddd90.</sup>
<!-- End of auto-generated description by cubic. -->
## Summary
- derive cloud browser session UUID from Browser Use CDP host (e.g.
`<uuid>.cdp0.browser-use.com`)
- use that UUID as fallback in `BrowserSession.on_BrowserStopEvent` when
`_cloud_browser_client.current_session_id` is empty
- ensure reconnected sessions created via `BrowserSession(cdp_url=...)`
are properly stopped in cloud, not just reset locally
## Why
Reconnected cloud sessions (common MFA resume pattern) can leak browser
instances because stop logic only used `current_session_id`, which is
populated only when the same process created the cloud browser. In
reconnect flows, that field is often unset.
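
A minimal sketch of deriving the UUID from the host, assuming hosts follow the `<uuid>.cdpN.browser-use.com` pattern mentioned above. `session_id_from_cdp_url` is a hypothetical name, not the helper added in this PR:

```python
import re
from typing import Optional
from urllib.parse import urlparse

_CDP_HOST = re.compile(
    r"^([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"\.cdp\d+\.browser-use\.com$"
)


def session_id_from_cdp_url(cdp_url: str) -> Optional[str]:
    """Return the session UUID embedded in the host, or None if absent."""
    host = urlparse(cdp_url).hostname or ""  # hostname is already lowercased
    match = _CDP_HOST.match(host)
    return match.group(1) if match else None
```

Returning `None` for any other host keeps local or third-party CDP URLs from being mistaken for cloud sessions.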
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Ensure cloud browser sessions are stopped when reconnecting via cdp_url,
preventing leaked instances during MFA/resume flows. We derive the
session UUID from the CDP URL host and pass it to `stop_browser` as an
explicit ID.
- **Bug Fixes**
- Added a helper to extract the session UUID from the CDP host (e.g.,
`<uuid>.cdpN.browser-use.com`).
- Updated stop logic to use `current_session_id` or the derived UUID and
call `stop_browser(id)`; logs now include the session ID.
<sup>Written for commit 4c883feb7c.</sup>
<!-- End of auto-generated description by cubic. -->
- Remove event bus tab listeners from daemon — track tabs directly
- Remove dead event bus fallback branches from commands/browser.py
- Replace SwitchTabEvent/CloseTabEvent dispatches with direct CDP calls
- Update python_session.py to use ActionHandler instead of event bus
- Add JS dialog handler (alert/confirm/prompt) to CLIBrowserSession
- Surface auto-dismissed popup messages in state output
- Only dummy EventBus() for watchdog constructors remains (unavoidable)
Resolves #4352
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes Windows tunnel lifecycle bugs by spawning `cloudflared` as a
detached daemon and reliably terminating it to prevent orphaned
processes. Resolves #4352.
- **Bug Fixes**
- Windows spawn uses `CREATE_NEW_PROCESS_GROUP | CREATE_NO_WINDOW`; Unix
uses `start_new_session=True`.
- Windows kill uses Win32 `OpenProcess/TerminateProcess` with brief
polling; Unix uses `SIGTERM` with `SIGKILL` fallback.
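
The platform split for spawning can be sketched with the standard library alone. `detached_popen_kwargs` and `spawn_detached` are illustrative names, and the Win32 termination path is not shown:

```python
import subprocess
import sys


def detached_popen_kwargs() -> dict:
    """Popen keyword arguments that detach the child from the parent's session."""
    if sys.platform == "win32":
        # These constants only exist on Windows, so access them in this branch.
        flags = subprocess.CREATE_NEW_PROCESS_GROUP | subprocess.CREATE_NO_WINDOW
        return {"creationflags": flags}
    return {"start_new_session": True}


def spawn_detached(argv: list) -> subprocess.Popen:
    """Start argv with no inherited stdio and in its own session or group."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        **detached_popen_kwargs(),
    )
```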
<sup>Written for commit 4950c3baa1.</sup>
<!-- End of auto-generated description by cubic. -->
## Summary
- Clear `last_model_output` and `last_result` at the start of `step()`
to prevent stale data from previous steps being recorded in history on
timeout
- Increment `n_steps` in `_execute_step`'s timeout handler to prevent
the main loop from retrying the same step number
Fixes #4480
## What changed
**`step()` — clear stale state at entry:**
```diff
self.step_start_time = time.time()
+
+ # Clear previous step state to prevent stale data from being recorded
+ self.state.last_model_output = None
+ self.state.last_result = None
+
browser_state_summary = None
```
**`_execute_step()` — ensure counter advances on timeout:**
```diff
self.state.last_result = [ActionResult(error=error_msg)]
+ # Ensure step counter advances on timeout
+ if self.state.n_steps == step + 1:
+ self.state.n_steps += 1
```
The guard `if self.state.n_steps == step + 1` prevents double-increment
when `_finalize()` has already incremented on the normal path.
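
The effect of the guard can be shown with a toy model. `StateSketch`, `finalize`, and `handle_timeout` are stand-ins, and the assumption that `n_steps` starts at 1 while the loop index `step` starts at 0 is inferred from the `step + 1` comparison in the diff:

```python
from dataclasses import dataclass


@dataclass
class StateSketch:
    n_steps: int = 1  # assumed starting value


def finalize(state: StateSketch) -> None:
    """Normal path: the step completed, so advance the counter."""
    state.n_steps += 1


def handle_timeout(state: StateSketch, step: int) -> None:
    """Timeout path: advance only if finalize() has not already done so."""
    if state.n_steps == step + 1:
        state.n_steps += 1
```

If the timeout fires before `finalize()`, the counter moves from 1 to 2. If it fires after, the guard sees 2 and leaves it alone.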
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes stale history entries and a stuck step counter when a step times
out. Clears per-step state after `_prepare_context` and ensures
`n_steps` advances on timeout.
- **Bug Fixes**
- Clear `last_model_output` and `last_result` right after
`_prepare_context` (before LLM/action calls) so prompts keep previous
output and timeouts don't record stale data.
- In `_execute_step()`, increment `n_steps` after a timeout (with a
guard) so the loop moves forward.
<sup>Written for commit c0a11dc61e.</sup>
<!-- End of auto-generated description by cubic. -->
Resolves #4510
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes double-serialized action fields in Anthropic tool responses to
prevent schema validation errors and failed tool calls. Preserves
original tracebacks in the fallback path.
- **Bug Fixes**
- Parse nested JSON strings in tool inputs; normalize newline/carriage
return/tab escapes when needed.
- Applied in `browser_use/llm/anthropic/chat.py` and
`browser_use/llm/aws/chat_anthropic.py`.
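
The nested-JSON part of this fix can be sketched as a recursive unwrapping pass. `unwrap_json_strings` is a hypothetical helper, not the code in the two files above, and the escape normalization step is omitted:

```python
import json
from typing import Any


def unwrap_json_strings(value: Any) -> Any:
    """Recursively parse string values that hold a serialized object or array."""
    if isinstance(value, dict):
        return {k: unwrap_json_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_json_strings(v) for v in value]
    if isinstance(value, str):
        text = value.strip()
        if text[:1] in ("{", "["):
            try:
                return unwrap_json_strings(json.loads(text))
            except json.JSONDecodeError:
                return value  # looks like JSON but is not; keep the string
    return value
```

A string field whose legitimate content is itself valid JSON would also be unwrapped by this sketch, so a real implementation would likely restrict it to fields the schema expects to be objects or arrays.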
<sup>Written for commit f74a4435b1.</sup>
<!-- End of auto-generated description by cubic. -->