When `browser-use connect` fails to discover a running Chrome, the error
now points to the correct `chrome://inspect/#remote-debugging` URL. The
SKILL.md also guides agents to prompt users with two options: enable
remote debugging or use managed Chromium with a Chrome profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The core workflow now starts with a bare headless browser (no setup step).
`connect` and `cloud connect` are mentioned prominently but are not required.
Python/CDP and multi-session material moved to reference files. Added a
session recovery tip. 219 lines (down from 272).
Multi-agent isolation is now achieved through separate sessions
(--session NAME), each with its own browser. Removed:
- register command and agents.json
- --agent flag and agent_id plumbing
- TabOwnershipManager and all tab locking logic
- dispatch lock and focus swapping between agents
- tab_ownership.py (deleted)
- test_tab_ownership.py (deleted)
Simplified tab commands: no lock checks, no _tab_list injection,
no _resolved_target_id params. agent_focus_target_id stays for
single-agent tab tracking.
Tested: 3 concurrent subagents on separate cloud sessions,
3 concurrent subagents on separate headless Chromium sessions.
- browser-use connect: one-time command to discover and connect to local
Chrome (like cloud connect but for local)
- --agent INDEX: per-command flag for multi-agent tab isolation, works
with any browser mode (cloud, profile, cdp-url, headless)
- register is now per-session ({session}.agents.json)
- --connect deprecated with migration message
- SKILL.md updated for new connect/--agent workflow
- Tested: 3 concurrent agents on shared cloud browser session
cloud connect now works with no flags. On first use, creates a
"Browser Use CLI" profile via the Cloud API and saves the ID to
config.json. Subsequent connects reuse it (validates on each call,
recreates if deleted).
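The reuse/validate/recreate flow described above can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: the `cloud_profile_id` key and the `profile_exists` / `create_profile` callables are hypothetical stand-ins for the config schema and the Cloud API calls.

```python
import itertools
import json
import tempfile
from pathlib import Path
from typing import Callable

PROFILE_KEY = "cloud_profile_id"  # hypothetical key name, not the real config schema


def get_or_create_profile_id(
    config_path: Path,
    profile_exists: Callable[[str], bool],
    create_profile: Callable[[str], str],
    name: str = "Browser Use CLI",
) -> str:
    """Return the cached profile ID, recreating it if validation fails."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    profile_id = config.get(PROFILE_KEY)
    # Validate on each call: a cached ID is only reused if it still exists.
    if profile_id and profile_exists(profile_id):
        return profile_id
    # First use, or the cached profile was deleted: create and persist a new one.
    profile_id = create_profile(name)
    config[PROFILE_KEY] = profile_id
    config_path.write_text(json.dumps(config, indent=2))
    return profile_id


# In-memory stand-ins for the Cloud API (assumptions for illustration only).
_ids = itertools.count(1)
live_profiles = set()


def fake_create(name: str) -> str:
    new_id = f"profile-{next(_ids)}"
    live_profiles.add(new_id)
    return new_id


def fake_exists(profile_id: str) -> bool:
    return profile_id in live_profiles


config_file = Path(tempfile.mkdtemp()) / "config.json"
first = get_or_create_profile_id(config_file, fake_exists, fake_create)
second = get_or_create_profile_id(config_file, fake_exists, fake_create)
```

Passing the API calls in as callables keeps the caching logic testable without network access.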
Removed the --timeout, --proxy-country, and --profile-id flags and their
plumbing through daemon/sessions. Power users who need custom browser
settings can call cloud v2 POST /browsers directly.
New references/cdp-python.md with tested recipes for raw CDP access
via the Python session: activating tabs, listing targets, running JS,
device emulation, cookies. SKILL.md updated with pointer to the
reference file explaining when to reach for CDP vs CLI commands.
- `browser-use register` assigns numeric agent index for --connect mode
- `--connect <index>` requires explicit agent index (no more bare --connect)
- `tab list` shows all tabs with lock status per agent
- `tab new [url]` creates a new tab without visually switching
- `tab switch <index>` changes agent focus without activating Chrome tab
- `tab close <index> [index...]` closes multiple tabs in one command
- Agent registry in ~/.browser-use/agents.json with 5min expiry
- Improved error messages guide agents to register or use their own tab
- Session lock prevents double BrowserSession creation on simultaneous connect
- Updated SKILL.md with register workflow and tab commands
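The 5-minute registry expiry could look something like the sketch below. It assumes the registry maps an agent index to a last-seen timestamp; the real agents.json layout is not shown here, so treat the data shape and the function name as hypothetical.

```python
import time
from typing import Dict, Optional

EXPIRY_SECONDS = 5 * 60  # 5 minutes, as stated in the list above


def prune_expired(agents: Dict[str, float], now: Optional[float] = None) -> Dict[str, float]:
    """Drop agents whose last-seen timestamp is at least EXPIRY_SECONDS old."""
    now = time.time() if now is None else now
    return {index: seen for index, seen in agents.items() if now - seen < EXPIRY_SECONDS}


# Agent "1" was last seen 301s ago and is dropped; agent "2" is still fresh.
remaining = prune_expired({"1": 0.0, "2": 250.0}, now=301.0)
```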
Multiple agents can share one browser via --connect without interfering
with each other. Each agent registers with `browser-use register` to get
a numeric index, then passes it with `--connect <index>` on every command.
- Tab locking: mutating commands (click, type, open) lock the tab to the
agent. Other agents get an error if they try to mutate the same tab.
Read-only commands (state, screenshot) work on any tab.
- Agent registry: agents.json tracks registered agents with timestamps.
Expired agents (5min inactive) get cleaned up automatically.
- Session lock: prevents double BrowserSession creation when two agents
connect simultaneously.
- Focus swap: daemon swaps agent_focus_target_id and cached_selector_map
per-agent before each command, so element indices are isolated.
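As a rough illustration of the tab locking rule (mutating commands claim a tab, read-only commands pass through), here is a minimal sketch. The `TabLocks` class and its method names are invented for this example and are not the TabOwnershipManager API.

```python
MUTATING_COMMANDS = {"click", "type", "open"}  # examples named in the list above


class TabLocks:
    """Hypothetical in-memory tab ownership table: target_id -> agent index."""

    def __init__(self):
        self._owner = {}

    def check(self, agent: int, target_id: str, command: str) -> None:
        if command not in MUTATING_COMMANDS:
            return  # read-only commands (state, screenshot) work on any tab
        # The first mutating command claims the tab for that agent.
        owner = self._owner.setdefault(target_id, agent)
        if owner != agent:
            raise PermissionError(f"tab {target_id} is locked by agent {owner}")


locks = TabLocks()
locks.check(1, "tab-a", "click")       # agent 1 claims tab-a
locks.check(2, "tab-a", "screenshot")  # read-only access is still allowed
```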
- subagent.md: fix TOC anchor to match renamed heading (#python-agents-cloud-sdk)
- subagent.md: fix relative link to ../features.md from guides/ subdir
- SKILL.md: use full v3 base URL instead of .../api/v3
- agent.md: fix llm_timeout docs to match auto-detection logic (Groq 30s,
Gemini 75s, o3/Claude/DeepSeek 90s, default 75s)
- examples.md: guard pw.stop() with None check to prevent UnboundLocalError
- api-v2.md: show all S3 presigned form fields in upload example
- models.md: fix ChatVercel provider_options and env var (AI_GATEWAY_API_KEY)
- examples.md: add try/finally for browser cleanup in parallel and playwright examples
- agent.md: fix defaults to match source (max_actions_per_step=5, max_failures=5,
use_vision=True, llm_timeout=60, step_timeout=180)
- features.md: fix undefined session_id in workspace example
- monitoring.md: add missing await on get_usage_summary()
- api-v2.md: fix upload flow to use multipart POST with fields (S3-style)
- SKILL.md: use uv pip for cloud SDK install
- integrations.md: use distinct MCP server name for docs endpoint
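For context on the pw.stop() guard and the try/finally cleanup: if the handle is first assigned inside `try` and startup raises, the `finally` block would hit UnboundLocalError and mask the real failure. A minimal sketch of the guarded pattern, using a stand-in `FakePlaywright` class rather than the real Playwright API:

```python
class FakePlaywright:
    """Stand-in for a Playwright handle; only stop() matters here."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


def run_with_cleanup(start, work):
    pw = None  # bind before try so the finally block can always read it
    try:
        pw = start()
        return work(pw)
    finally:
        # Without the None check, a failure in start() would surface as
        # UnboundLocalError (or AttributeError) instead of the real error.
        if pw is not None:
            pw.stop()


fake = FakePlaywright()
result = run_with_cleanup(lambda: fake, lambda pw: "done")
```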
New skill providing documentation reference for writing Python code against
the browser-use library and making Cloud REST API calls. Condensed from
CLOUD.md (2700 lines of OpenAPI YAML → 530 lines of tables) and AGENTS.md
(1021 lines → 762 lines with markup stripped and sections deduplicated).
Merge duplicate Essential Commands / Commands sections into one consolidated
block. Collapse verbose flag permutations and workflow details into compact
one-liners. Down from 653 total lines to 362.
- Document coordinate clicking (click <x> <y>) in all interaction sections
- Annotate extract command as not yet implemented in README and SKILL.md
- Remove browser.extract() from Python wrapper docs (raises NotImplementedError)
- Add --session flag to global options tables in both SKILL.md files
- Add `browser-use upload <index> <path>` command for uploading files to
file input elements via the CLI
- Extract find_file_input_near_element from nested closures in tools/service.py
to a reusable method on BrowserSession, deduplicating two copies
- Add BrowserWrapper.upload() for the Python REPL
- Resolve file paths to absolute on the client side before sending to daemon
- Update SKILL.md files and README with upload command docs
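Resolving the path on the client matters because the daemon may run with a different working directory than the shell that invoked the CLI. A minimal sketch of that step; the function name is hypothetical:

```python
from pathlib import Path


def resolve_upload_path(path: str) -> str:
    """Expand ~ and make the path absolute relative to the client's cwd."""
    return str(Path(path).expanduser().resolve())


absolute = resolve_upload_path("report.pdf")
```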
Document profile subcommand as passthrough to profile-use Go binary,
add file layout section showing ~/.browser-use/ structure, update
authenticated browsing workflow examples for new profile list output.
Unify all CLI-managed files under ~/.browser-use/ (config, sockets, PIDs,
binaries, tunnels) instead of scattering across ~/.config/browser-use/ and
~/.browser-use/run/. Add profile-use Go binary as managed subcommand via
browser-use profile, with auto-download fallback and install.sh integration.
Wire cloudflared and profile-use availability checks into browser-use doctor.
Adds `--connect` to auto-discover running Chrome instances via DevToolsActivePort
files and well-known port probing, eliminating manual CDP URL construction. Fixes
daemon process hanging on `close` when connected to external browsers (--connect,
--cdp-url, cloud) by calling stop() (disconnect) instead of kill() (terminate).
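Discovery via DevToolsActivePort relies on the file Chrome writes into its user data directory: the first line is the debugging port and the second is the browser WebSocket path. A minimal parsing sketch, with the function name and default host chosen for illustration only:

```python
def parse_devtools_active_port(text: str, host: str = "127.0.0.1") -> str:
    """Build a CDP WebSocket URL from the contents of a DevToolsActivePort file."""
    lines = text.strip().splitlines()
    port = int(lines[0])  # first line: the remote debugging port
    ws_path = lines[1].strip() if len(lines) > 1 else ""  # second line: /devtools/browser/<id>
    return f"ws://{host}:{port}{ws_path}"


cdp_url = parse_devtools_active_port("9222\n/devtools/browser/abc-123\n")
```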
The `run` command pulled in heavy SDK dependencies (openai, anthropic,
google), had a bug (await on sync get_llm), and is superseded by
`browser-use cloud` for agent execution. The CLI is now purely a browser
automation interface.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Login/logout with API key persistence, versioned REST calls (v2/v3),
task polling, and OpenAPI-driven help. Stdlib only, no daemon needed.
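Task polling with the standard library only can be sketched as a simple loop with a deadline. The status strings and the `fetch_status` callable below are assumptions for illustration; the actual REST response fields are not shown here.

```python
import time

TERMINAL_STATUSES = {"finished", "stopped", "failed"}  # hypothetical status names


def poll_until_done(fetch_status, interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout}s")
        sleep(interval)


# Usage with a canned sequence instead of a real HTTP call.
statuses = iter(["started", "started", "finished"])
final = poll_until_done(lambda: next(statuses), sleep=lambda seconds: None)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.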
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the multi-session server (server.py, SessionRegistry, portalocker locking,
PID files, orphan detection) with a minimal daemon (daemon.py) that holds one
BrowserSession in memory. The daemon is considered alive if its socket file
exists. It exits automatically when the browser dies, detected via a CDP watchdog.
-2277 lines, +142 lines across 20 files.
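The "socket file means alive" convention can be sketched as below. A plain file stands in for the real Unix socket so the example stays portable, and both helper names are hypothetical.

```python
import contextlib
import tempfile
from pathlib import Path


@contextlib.contextmanager
def daemon_socket(path: Path):
    """Create the marker on startup and always remove it on exit."""
    path.touch()  # stand-in for binding a Unix socket at this path
    try:
        yield path
    finally:
        if path.exists():
            path.unlink()


def is_daemon_alive(path: Path) -> bool:
    # No PID files or lock files: existence of the socket path is the signal.
    return path.exists()


sock = Path(tempfile.mkdtemp()) / "default.sock"
before = is_daemon_alive(sock)
with daemon_socket(sock):
    during = is_daemon_alive(sock)
after = is_daemon_alive(sock)
```

The trade-off is that a crash that skips cleanup leaves a stale file behind, so the exit path has to remove it reliably.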
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove all cloud API paths from the CLI while leaving core library
cloud support (BrowserSession(use_cloud=True), browser/cloud/) untouched.
Deleted: api_key.py, cloud_task.py, cloud_session.py
Removed: --browser remote, cloud-only run flags, task/session subcommands,
cloud profile ops (create/update/delete/sync), remote mode validation
Kept: tunnel (just Cloudflare), all local commands, install_config.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update all references from browser-use.com/install.sh to
browser-use.com/cli/install.sh across documentation and code.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
browser-use/SKILL.md should be the superset containing everything.
Added the following sections that were only in remote-browser:
- Setup section with install.sh one-liner and modes table
- Exposing Local Dev Servers section with tunnel commands
- browser-use doctor in troubleshooting
- Enhanced Monitoring Subagents section with token-efficient table
- Common Patterns section (dev server + screenshot loop)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- install.sh: Change default BROWSER_USE_REPO from ShawnPana/browser-use
to browser-use/browser-use
- install.sh: Generalize branch name in dev testing comment
- SKILL.md: Remove dev testing section with fork URLs, keep clean
pip install instruction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Default task status shows only latest step (token efficient for agents)
- Add --compact/-c flag to show all steps with reasoning
- Add --verbose/-v flag to show all steps with URLs + actions
- Add --last/-n, --reverse/-r, --step/-s flags for navigating long tasks
- Display duration for tasks (e.g., "18s", "5m 13s") and sessions
- Format proxy cost to 2 decimal places in session output
- Update SKILL.md docs with recommended monitoring workflow
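The duration and cost formatting above can be illustrated with a short sketch. The helper names are hypothetical, and the behavior beyond the two quoted examples (for instance, whether hours get their own unit) is an assumption: here minutes are simply allowed to exceed 59.

```python
def format_duration(seconds: float) -> str:
    """Format whole seconds as '18s' or '5m 13s'."""
    minutes, secs = divmod(int(seconds), 60)
    if minutes == 0:
        return f"{secs}s"
    return f"{minutes}m {secs}s"


def format_cost(cost: float) -> str:
    """Format a cost value to 2 decimal places."""
    return f"{cost:.2f}"


short = format_duration(18)
long = format_duration(313)
```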
Previously, `browser-use tunnel 3000` would start a session server with
the default browser mode (chromium), which poisoned subsequent
`--browser remote` commands with a mode mismatch error.
Now tunnel commands are handled directly in main.py without starting a
browser session:
    npm run dev &
    browser-use tunnel 3000                         # No session created
    browser-use --browser remote open <tunnel-url>  # Works!
Changes:
- Move tunnel logic to standalone functions in tunnel_manager.py
- Handle tunnel commands in main.py before ensure_server()
- Tunnels persist across browser-use close (independent lifecycle)
- Add `browser-use tunnel stop --all` command
Breaking behavior change:
- `browser-use close` no longer stops tunnels
- Use `browser-use tunnel stop <port>` or `tunnel stop --all` explicitly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a session is started without --browser remote and a subsequent command
uses --browser remote, the CLI now errors with guidance instead of silently
using the local browser (which loses cloud features like live_url).
Code changes:
- Add session metadata file to track browser_mode per session
- Error only when remote requested but session is local
- Allow local-on-remote (user gets more features than requested)
- Clean up metadata on session close
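The asymmetric check described above (error on remote-requested-but-local, allow local-on-remote) amounts to something like the following sketch. The function name, mode strings other than "remote", and the error wording are illustrative assumptions, not the CLI's actual code.

```python
def check_browser_mode(session_mode: str, requested_mode: str) -> None:
    """Raise only when remote is requested but the running session is local."""
    if requested_mode == "remote" and session_mode != "remote":
        raise RuntimeError(
            f"Session is running in '{session_mode}' mode but --browser remote was "
            "requested. Close the session and start it again with --browser remote."
        )
    # Any other combination is allowed, including a local request against a
    # remote session (the user gets more features than requested).


check_browser_mode("remote", "chromium")  # local-on-remote: allowed
check_browser_mode("remote", "remote")    # matching modes: allowed
```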
SKILL.md improvements:
- Clarify that --browser remote only needed on FIRST command
- Add table showing session modes
- Add "What happens if you forget" section with error example
- Add troubleshooting entry for mode mismatch error
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>