<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Guard `mcp.server.stdio` startup against missing stdin. When stdin is
absent, raise a clear RuntimeError so the process fails fast instead of
producing ambiguous startup errors.
<sup>Written for commit 83317179cb.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes runtime crashes when patching LLM ainvoke on Pydantic-backed
models by using `object.__setattr__` to assign `tracked_ainvoke`. This
bypasses Pydantic attribute validation and keeps the tracking wrapper
working.
<sup>Written for commit 3242d9dbc2.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
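A minimal sketch of why `object.__setattr__` helps here. It assumes a stand-in class (`StrictModel`, hypothetical) whose `__setattr__` rejects undeclared names, roughly as a validating model would; it is not the actual Pydantic or browser-use code, and `patch_ainvoke` is an illustrative helper name.

```python
import asyncio


class StrictModel:
    """Hypothetical stand-in for a validating model: rejects undeclared attributes."""

    _fields = {"name"}

    def __init__(self, name: str) -> None:
        object.__setattr__(self, "name", name)

    def __setattr__(self, key: str, value: object) -> None:
        if key not in self._fields:
            raise ValueError(f'object has no field "{key}"')
        object.__setattr__(self, key, value)

    async def ainvoke(self, prompt: str) -> str:
        return f"echo: {prompt}"


def patch_ainvoke(llm: StrictModel, calls: list) -> None:
    """Wrap llm.ainvoke so every prompt is recorded in `calls`."""
    original = llm.ainvoke  # bound method captured before patching

    async def tracked_ainvoke(prompt: str) -> str:
        calls.append(prompt)
        return await original(prompt)

    # A plain `llm.ainvoke = tracked_ainvoke` would go through the
    # overridden __setattr__ and raise; this writes to the instance directly.
    object.__setattr__(llm, "ainvoke", tracked_ainvoke)


def demo() -> str:
    llm = StrictModel("demo")
    calls: list = []
    patch_ainvoke(llm, calls)
    return asyncio.run(llm.ainvoke("hi"))
```

The wrapper lands in the instance dictionary, so it shadows the class-level method for that one object only.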
## Fixes #4463
### Problems
**1. `AgentHistoryList.load_from_dict` — KeyError on missing keys**
The current implementation uses direct dict subscript access that
crashes when loading history saved with an older browser-use version or
when a step entry is missing expected keys:
```python
for h in data['history']: # KeyError if 'history' key absent
if h['model_output']: # KeyError if 'model_output' absent
if 'interacted_element' not in h['state']: # KeyError if 'state' absent
```
**2. `AgentHistoryList.final_result()` — IndexError on empty result
list**
```python
if self.history and self.history[-1].result[-1].extracted_content:
# ^^^^^^^^^^^^^^^^ IndexError when result == []
```
Every sibling method (`is_done`, `is_successful`, `judgement`,
`is_judged`, `is_validated`) already guards with
`len(self.history[-1].result) > 0`. `final_result` was the only one
missing this check.
### Fix
- **`load_from_dict`**: use `data.get('history', [])` and `h.get(key)`
instead of direct subscript access, matching standard defensive Python
dict usage.
- **`final_result`**: add `len(self.history[-1].result) > 0` guard
before accessing `result[-1]`, consistent with all other methods in the
class.
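The two guards can be sketched on plain dicts as follows. This is an illustration under simplifying assumptions, not the code in `browser_use/agent/views.py`: `load_steps` and the free function `final_result` are hypothetical names, and history entries are kept as dicts rather than model objects.

```python
from typing import Any, Optional


def load_steps(data: dict) -> list:
    """Normalize raw history entries without assuming any key is present."""
    steps = []
    for raw in data.get("history", []):  # missing 'history' -> no steps
        step = dict(raw)
        state = dict(step.get("state") or {})  # missing 'state' -> {}
        state.setdefault("interacted_element", None)
        step["state"] = state
        if not isinstance(step.get("model_output"), dict):
            step["model_output"] = None  # only dicts get validated later
        step.setdefault("result", [])
        steps.append(step)
    return steps


def final_result(steps: list) -> Optional[Any]:
    """Return the last extracted content, or None when there is nothing to read."""
    if steps and len(steps[-1]["result"]) > 0:
        return steps[-1]["result"][-1].get("extracted_content") or None
    return None
```

With these guards, an empty payload (`{}`) and a step whose `result` is `[]` both yield `None` instead of raising.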
### Changes
Only `browser_use/agent/views.py` is modified (14 additions, 8
deletions).
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Prevent crashes when loading agent history with missing fields and when
reading the final result on empty lists. This makes `AgentHistoryList`
safer and backward-compatible with legacy or partial histories.
- **Bug Fixes**
- `load_from_dict`: use `data.get('history', [])` and `h.get()` for safe
access; initialize missing `state` and default
`state.interacted_element` to `None`; validate `model_output` only when
it’s a dict.
- `final_result`: check `len(self.history[-1].result) > 0` before
indexing; return `None` when empty or no extracted content, consistent
with other methods.
<sup>Written for commit 6129ceb0b0.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
In response to #4214
<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Revamped the LLM model docs with a quick-reference table, use‑case
recommendations, and clearer provider setup. Adds native coverage for
more providers, updates examples, and improves env/installation
guidance.
- **Refactors**
- Added Quick Reference (provider → class → env) and use‑case picks
based on benchmarks.
- Renamed to “Browser Use Cloud,” updated pricing/models, and listed
`browser-use/bu-30b-a3b-preview`.
- Expanded provider docs: DeepSeek, Mistral, Cerebras, OpenRouter,
LiteLLM; improved OpenAI/Anthropic/Gemini/Azure/Bedrock/Vercel/OCI
guidance with links.
- Standardized examples to `Agent` + `Chat*` classes and noted key env
vars.
- Moved OpenRouter out of OpenAI-compatible section; trimmed/reorganized
OpenAI-compatible examples.
- Minor fixes: `GOOGLE_API_KEY` deprecation note, Anthropic coordinate
clicking note, “LangChain” capitalization.
<sup>Written for commit c1151715d3.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
## Summary
- **Root cause**: The daemon accepted any connection on its socket
without authentication. On Windows the port is deterministic
(`adler32(session)`-derived), so any local process could connect and
dispatch the `python` action, executing arbitrary code via
`eval()`/`exec()` as the daemon owner. On Unix the socket may be
world-writable under a permissive umask.
- **Fix**: On `Daemon.run()`, generate a `secrets.token_hex(32)` token,
write it atomically to `~/.browser-use/{session}.token` with `chmod
0o600`. Validate every incoming request with `hmac.compare_digest`
before dispatching. Delete the token file on shutdown.
- **Client**: `send_command()` in `main.py` now reads the token file and
attaches it to every request. Falls back to `''` for old daemons
(no-op).
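A rough sketch of the request check described above, assuming requests are dicts that carry the token under a `token` key. The helper names (`new_session_token`, `is_authorized`, `handle`) are hypothetical, and the real dispatch logic in the daemon is not shown.

```python
import hmac
import secrets


def new_session_token() -> str:
    """Generate a fresh per-session token (64 hex characters)."""
    return secrets.token_hex(32)


def is_authorized(request: dict, expected_token: str) -> bool:
    """Constant-time comparison of the supplied token against the expected one."""
    supplied = request.get("token", "")
    if not isinstance(supplied, str):
        return False
    # Compare as bytes so non-ASCII input cannot raise TypeError.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


def handle(request: dict, expected_token: str) -> dict:
    """Reject unauthenticated requests before any action is dispatched."""
    if not is_authorized(request, expected_token):
        return {"success": False, "error": "Unauthorized"}
    # Placeholder for the real dispatch step.
    return {"success": True, "action": request.get("action")}
```

Checking the token before dispatch means an unauthenticated `python` or `shutdown` request never reaches its handler.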
## Test plan
- [ ] Start daemon, run `browser-use python "1+1"` — should work
normally
- [ ] Send a raw request without token to socket — should get
`{"success": false, "error": "Unauthorized"}`
- [ ] Unauthenticated `shutdown` action should be ignored (daemon stays
up)
- [ ] After daemon stops, `~/.browser-use/default.token` should be
deleted
- [ ] All CI tests pass (`uv run pytest -vxs tests/ci`)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Secure the daemon socket with a per-session auth token to block
unauthorized local connections and arbitrary code execution via the
`python` action. The CLI now reads the token and includes it with each
command.
- **Bug Fixes**
- Generate a session token in `Daemon.run()` (`secrets.token_hex(32)`);
write it atomically to `~/.browser-use/{session}.token` via a `*.tmp`
file created with `0o600`, then replace; raise if write fails; delete on
shutdown.
- Validate every request using `hmac.compare_digest`; unauthorized calls
return `{"success": false, "error": "Unauthorized"}` and `shutdown` is
honored only for authorized, successful requests.
- Client `send_command()` reads the token and sends it as `token`; falls
back to `''` for older daemons.
<sup>Written for commit ca2185ba61.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
- Use os.open() with mode 0o600 instead of write-then-chmod to eliminate
the permission race window where the temp file is briefly world-readable.
- Raise instead of warn when token file write fails: a daemon that cannot
persist its auth token is permanently unauthorized for all clients, so
failing fast is correct (identified by cubic).
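A minimal sketch of the create-with-mode approach, assuming a sibling `*.tmp` file that is replaced into place. `write_token` is a hypothetical helper name, and the daemon's actual implementation may differ in detail.

```python
import os
from pathlib import Path


def write_token(path: Path, token: str) -> None:
    """Write `token` to `path` so it is never readable by other users."""
    tmp = path.with_name(path.name + ".tmp")
    # Remove any stale temp file so O_EXCL creation applies our mode.
    tmp.unlink(missing_ok=True)
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    # The file is created with 0o600 from the start: no chmod race window.
    fd = os.open(tmp, flags, 0o600)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(token)
        os.replace(tmp, path)
    except BaseException:
        tmp.unlink(missing_ok=True)
        raise  # fail fast: a daemon without a token file is unusable
```

Any `OSError` propagates to the caller rather than being logged and ignored, which matches the raise-instead-of-warn decision.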
Generate a secrets.token_hex(32) on daemon startup, write it atomically
to ~/.browser-use/{session}.token (chmod 0o600), and validate it on every
incoming request via hmac.compare_digest. The client reads the token file
and includes it in each send_command() call.
This closes the arbitrary-code-execution vector where any local process
could connect to the deterministic Windows TCP port (or a world-writable
Unix socket) and dispatch the 'python' action to run eval()/exec() as the
daemon owner.
## Summary
- Bumps `requests` from `2.32.5` to `2.33.0` in `pyproject.toml`
- Fixes a local privilege escalation / file substitution vulnerability
in `requests.utils.extract_zipped_paths()`: prior versions extracted to
a predictable temp filename with no validation, allowing a local
attacker with write access to the temp directory to pre-create a
malicious file that the library would load instead
- 2.33.0 now extracts to a non-deterministic location, eliminating the
race condition
> **Note:** Standard usage of the Requests library is not affected. Only
applications calling `extract_zipped_paths()` directly are impacted.
## Test plan
- [ ] No functional code changes — version pin bump only
- [ ] `uv sync` resolves cleanly with the new pin
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Upgrade `requests` from `2.32.5` to `2.33.0` to fix a temp-file path
traversal/race in `requests.utils.extract_zipped_paths()` that could
allow local file substitution. Standard `requests` usage isn’t affected;
only direct callers of that helper were exposed, and no app changes are
needed.
<sup>Written for commit a7fb3884c4.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
## Summary
- Bumps `aiohttp` from `3.13.3` to `3.13.4` in
`browser_use/skill_cli/requirements-cli.txt`
- Fixes a security vulnerability where insufficient restrictions on
trailer header handling could allow uncapped memory usage, potentially
causing memory exhaustion from attacker-controlled requests/responses
Upstream patch: aio-libs/aiohttp@0c2e9da

Closes #26
## Test plan
- [ ] No functional code changes — version pin bump only
- [ ] Verify `install_lite.sh` installs the patched version
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Upgrade `aiohttp` to 3.13.4 in the CLI to patch a memory exhaustion
vulnerability in trailer header handling. Ensures the lite installer
pulls the fixed version.
- **Dependencies**
- Bump `aiohttp` from `3.13.3` to `3.13.4` in
`browser_use/skill_cli/requirements-cli.txt` to prevent uncapped trailer
header processing that could lead to memory exhaustion.
<sup>Written for commit 14ada65183.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
Bumps requests from 2.32.5 to 2.33.0.
extract_zipped_paths() previously wrote to a predictable temp path with no
validation, allowing a local attacker to pre-create a malicious file that
would be loaded in its place. 2.33.0 extracts to a non-deterministic
location, eliminating the race condition.
## Summary
Three security and correctness issues found during review of #4514,
which has since been merged. All three affect the new `skill_cli` layer
introduced by that PR.
- **`BROWSER_USE_API_KEY` env var silently ignored**
(`commands/cloud.py`): #4514 dropped the env var fallback in
`_get_api_key()` without a migration path, breaking CI/CD pipelines that
set it as a secret. Restored with a deprecation warning directing users
to `browser-use config set api_key`.
- **`_install_cloudflared()` downloads binary without integrity check**
(`commands/setup.py`, Linux only): Raw `urllib.request.urlretrieve`
wrote directly to the install destination with no verification. Now
downloads to a temp file, fetches the `.sha256sum` Cloudflare publishes
alongside each release, verifies SHA256 before installing, and cleans up
on failure. macOS (`brew`) and Windows (`winget`) were already safe —
they verify internally.
- **`write_config()` not atomic** (`config.py`): Direct
`path.write_text()` truncates `config.json` on `SIGKILL` mid-write;
`read_config()` catches `json.JSONDecodeError` and returns `{}`,
silently wiping the API key and all settings. Now uses
`tempfile.mkstemp(dir=same_dir)` + `fsync` + `os.replace()` — the same
pattern `_write_state()` in `daemon.py` already uses correctly.
## Test plan
- [ ] `BROWSER_USE_API_KEY=sk-xxx browser-use cloud connect` prints
deprecation warning and still authenticates
- [ ] After `browser-use config set api_key sk-xxx`, commands work
without the env var set
- [ ] On Linux: cloudflared install rejects a tampered binary with a
clear SHA256 mismatch error
- [ ] `SIGKILL` during `browser-use config set` leaves `config.json`
intact or absent, never truncated
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Bumps aiohttp from 3.13.3 to 3.13.4 in requirements-cli.txt.
Fixes uncapped memory usage from insufficient trailer header restrictions
(aio-libs/aiohttp@0c2e9da).
- cloud.py: remove BROWSER_USE_API_KEY env var fallback (violates CLI
policy of config.json as single source of truth); instead detect the
env var in the error path and print a targeted migration hint
- setup.py: replace Path.rename() with shutil.move() so the temp file
can be moved across filesystems (e.g. /tmp -> /usr/local/bin)
A SIGKILL mid-write truncates config.json; read_config() catches
json.JSONDecodeError and returns {}, silently wiping the API key and
all other settings. Mirror the pattern already used by _write_state():
write to a sibling temp file, fsync, chmod 600, then os.replace() into
place — which is atomic on POSIX and effectively atomic on Windows.
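The write pattern described above, sketched under the assumption that the config is a JSON-serializable dict. `write_config_atomic` is a hypothetical name, and this is not the code in `config.py` or `daemon.py`.

```python
import json
import os
import tempfile
from pathlib import Path


def write_config_atomic(path: Path, data: dict) -> None:
    """Replace `path` with `data` so readers see the old or the new file, never a partial one."""
    # The temp file must live in the same directory for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        os.chmod(tmp_name, 0o600)
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

If the process is killed before `os.replace`, only the temp file is affected and the existing `config.json` stays intact.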
Downloads to a temp file, fetches the .sha256sum file Cloudflare publishes
alongside each release, and verifies before moving to the install destination.
Protects against MITM/CDN tampering. Temp file is cleaned up on failure.
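A sketch of the verification step only, with the download itself left out. It assumes the checksum file follows the usual `sha256sum` layout (hex digest first, optionally followed by a filename); that layout and the helper names are assumptions, not confirmed details of the published `.sha256sum` file.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large binaries are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, checksum_text: str) -> None:
    """Raise and delete `path` unless its SHA256 matches the published digest."""
    parts = checksum_text.split()
    if not parts:
        path.unlink(missing_ok=True)
        raise RuntimeError("checksum file is empty")
    expected = parts[0].lower()
    actual = sha256_of(path)
    if actual != expected:
        path.unlink(missing_ok=True)  # clean up the temp file on failure
        raise RuntimeError(f"SHA256 mismatch: expected {expected}, got {actual}")
```

Only after `verify_download` returns would the temp file be moved to the install destination.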
The CLI previously accepted the env var as a fallback; this PR dropped it
without a migration path, breaking CI/CD pipelines that set it as a secret.
Restore backwards-compat by checking the env var after config.json and
printing a deprecation warning with the migration command.
Multiple agents can share one browser via --connect without interfering
with each other. Each agent registers with `browser-use register` to get
a numeric index, then passes it with `--connect <index>` on every
command.
- Tab locking: mutating commands (click, type, open) lock the tab to the
agent. Other agents get an error if they try to mutate the same tab.
Read-only commands (state, screenshot) work on any tab.
- Agent registry: agents.json tracks registered agents with timestamps.
Expired agents (5min inactive) get cleaned up automatically.
- Session lock: prevents double BrowserSession creation when two agents
connect simultaneously.
- Focus swap: daemon swaps agent_focus_target_id and cached_selector_map
per-agent before each command, so element indices are isolated.
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Replaces multi‑session isolation with per‑session browsers via
`--session`, adds a lightweight direct‑action runtime, tab controls, and
a phase‑aware daemon. Deprecates `--connect`, enables zero‑config `cloud
connect` with config‑driven recording and config‑only cloud auth, adds a
`config` command, strengthens lifecycle probes/close behavior, speeds up
cloud connect by skipping per‑call profile validation, and shows
“Connecting…”/“Closing…” during slow operations.
- **New Features**
- Session isolation: each `--session` has its own daemon, socket,
PID/state files, and browser; sessions run in parallel; `sessions` shows
phase and includes CDP URL only in `--json` (including cloud).
- Tabs: `tab list`, `tab new [url]`, `tab switch <index>`, `tab close
[index...]`; clearer errors that suggest `new`; `close` uses logical
focus; JS dialogs auto‑dismiss and surface in `state`; `state` uses the
focused tab’s title and preserves empty titles.
- Lightweight runtime: `CLIBrowserSession` with direct actions via
`ActionHandler` (no event bus/watchdogs); faster, more reliable scroll
(JS `scrollBy` with left/right fix); `Page.enable` wrapped to avoid
root‑target failures.
- Connect & cloud: one‑time `connect` for local Chrome; `cloud connect`
reads `cloud_connect_proxy`, `cloud_connect_timeout`, and
`cloud_connect_recording` from `browser-use config` (CLI default
recording on; library default remains off); unified base via
`BROWSER_USE_CLOUD_BASE_URL` host (CLI appends `/api/{version}`); `cloud
signup` to get/save API key; API key is read only from
`~/.browser-use/config.json` (env var ignored) and forwarded to
`profile-use`; cloud profile ID is read from config without per‑call GET
validation and auto‑heals on invalid/missing IDs (created only when
needed), with daemon‑safe error handling; empty API key is treated as
unset.
- Config, setup & lifecycle: `browser-use config` (set/get/list/unset)
with secure `config.json`; `doctor` shows config state with docs link;
installers create `config.json` and lite requires `curl`; interactive
`setup` (`--yes` for CI); daemon writes a phase‑aware state file,
includes PID in pings, has idle timeout, cleans up orphaned daemons,
supports cross‑platform shutdown with SIGTERM fallback, and avoids
deleting sockets for live daemons with stale PID files.
- Tests & fixes: real subprocess lifecycle tests; cloud stop timeout;
cloud connect faster via skipped profile validation; tests read API keys
from `config.json`; tolerate shutdown race after SIGTERM.
- **Migration**
- Use `--session <NAME>` for isolation. Removed `--agent`, `browser-use
register`, and tab‑ownership locking.
- Use `tab` subcommands; standalone `switch` and `close-tab` are
removed.
- Run `browser-use connect` once for local Chrome; the `--connect` flag
is deprecated and now errors with guidance.
- For cloud, use `cloud connect` with zero flags; customize via
`browser-use config` (`cloud_connect_proxy`, `cloud_connect_timeout`,
`cloud_connect_recording`); set `BROWSER_USE_CLOUD_BASE_URL` to the host
(CLI appends `/api/{version}`); use `cloud signup` to obtain and save an
API key; set the API key via `browser-use config set api_key <KEY>` —
`BROWSER_USE_API_KEY` is ignored by the CLI.
<sup>Written for commit a7b476ee46.
Summary will update on new commits.</sup>
<!-- End of auto-generated description by cubic. -->
- Remove BROWSER_USE_API_KEY env var as a read source from CLI code; config.json is the only source of truth
- Split _create_cloud_profile into daemon-safe _inner (raises) and CLI wrapper (sys.exit)
- Daemon auto-heal no longer kills process on profile creation API errors
_get_or_create_cloud_profile reads config instantly instead of
validating via GET /profiles/{id} on every connect. If the profile
is invalid, _provision_cloud_browser auto-heals by creating a new
one and retrying. Saves ~500ms-1s on every cloud connect.
Four small fixes for DOM capture performance. **4 files changed, +51
lines, -7 lines.**
## Fixes
### 1. `enhanced_snapshot.py` — Convert `isClickable` list to set before
loop
`_parse_rare_boolean_data` used `index in list` (O(n) per call), called
per node = O(n²).
```python
# Before: O(n) list scan per node
return index in rare_data['index'] # List[int]
# After: O(1) set lookup
is_clickable_set = set(nodes['isClickable']['index']) # once
return index in rare_data_set # per node
```
**20k elements: 14,160ms → 2,973ms. 100k elements: 356s → 9s.**
### 2. `paint_order.py` — Cap `RectUnionPure` at 5,000 rects
Each `add()` can split existing rects into up to 4 pieces. With
overlapping translucent layers this grows exponentially. `_MAX_RECTS =
5000` stops it. Once the cap is hit, `contains()` returns `False`, so
filtering is less aggressive but correctness is unchanged.
**20k elements: 372s → 4.7s.**
### 3. `service.py` — Skip JS listener detection on pages with >10k
elements
Early bail-out inside the existing JS expression. Interactive elements
still detected via accessibility tree + `ClickableElementDetector`.
### 4. `dom_watchdog.py` — Add 2s timeout to pending network request
check
`_get_pending_network_requests()` had no timeout. On slow CI machines or
pages mid-navigation, it hangs for 15s+, silently eating into the 30s
`BrowserStateRequestEvent` budget. This causes the DOM capture to time
out, producing 5 consecutive failures → agent termination.
Found by tracing eval failures on eBay search results where the
DOMWatchdog produced zero log output for 15 seconds between "STARTING"
and the timeout.
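The timeout wrapper can be sketched like this. `with_budget` and `slow_check` are hypothetical names standing in for the watchdog's precheck, and the real code also logs the timeout; the 2s value comes from the fix description.

```python
import asyncio


async def with_budget(coro, timeout: float, default):
    """Await `coro`, but give up after `timeout` seconds and return `default`."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # The check is best-effort: skip it rather than stall DOM capture.
        return default


async def slow_check() -> list:
    """Stand-in for a pending-network-request check that hangs."""
    await asyncio.sleep(5)
    return ["https://example.com/pending"]


async def main() -> list:
    # 2s would match the fix; a short budget keeps the demo quick.
    return await with_budget(slow_check(), timeout=0.05, default=[])
```

`asyncio.wait_for` cancels the inner coroutine on timeout, so the slow check does not keep running in the background.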
## Benchmarks
Profiled every pipeline stage on `main` vs this branch:
| Stage | main (20k) | This PR (20k) |
|---|---:|---:|
| `build_snapshot_lookup` | 14,160ms | **2,973ms** |
| JS listener detection | 2,326ms | **0ms** |
| `calculate_paint_order` | 372,303ms | **4,683ms** |
| Everything else | ~12s | ~8s |
| **Total** | **~400s** | **~16s** |
## Eval results
Ran WebBench_READ_v5 (198 tasks) on both branches, 4 runs total.
Task-by-task comparison on 35 common tasks:
- Always pass (both branches): 6
- Always fail (both branches): 13
- **Main always better: 0**
- **Ours always better: 1**
- Flaky (random): 15
No regressions. Normal pages (<2k nodes) unaffected — same timings, same
behavior.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Medium risk because it changes DOM capture heuristics (timeouts and
early-bail conditions) that can affect which elements are marked
clickable/visible or filtered by paint order on very large pages.
>
> **Overview**
> Speeds up DOM capture on very large pages by removing a key O(n²)
hotspot and adding protective caps/timeouts to avoid long stalls.
>
> `enhanced_snapshot.build_snapshot_lookup` now converts CDP
`isClickable.index` to a `set` once and uses O(1) membership checks per
node. Paint-order filtering adds a 5k-rectangle safety cap in
`RectUnionPure.add()` to prevent runaway rect fragmentation, and JS
click-listener detection skips entirely when the page has >10k elements.
>
> `DOMWatchdog` also wraps the pending-network-request precheck in a 2s
`asyncio.wait_for` and logs a timeout, preventing this step from
consuming most of the `BrowserStateRequestEvent` budget.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3aa68384ad. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Library keeps recording off by default. CLI reads cloud_connect_recording
from config (defaults True). Users can disable with:

    browser-use config set cloud_connect_recording false

Page.enable fails on browser-level CDP targets. Wrap in try/except
like the library's PopupsWatchdog does. Dialog handler still
registers regardless — events may fire on some CDP implementations.
test_cli_upload and test_cli_coordinate_click create SessionInfo
without actions, hitting the type-narrowing guard. Now provide
ActionHandler in all test cases.
On slow CI machines, _get_pending_network_requests() can hang for 15s+
when Chrome is busy loading/rendering after a navigation. This silently
eats into the 30s BrowserStateRequestEvent budget, leaving insufficient
time for the actual DOM capture — causing 5 consecutive timeouts and
agent termination.
Observed on eBay search results in eval runs: DOMWatchdog started but
produced zero log output for 15 seconds before the timeout killed it.
The pending network check was the first await after the URL log.
- type: ignore on each param line in sessions.py (pyright per-line)
- Remove ActionHandler assert in browser.py (breaks pre-existing tests)
- Ruff format