mirror of https://github.com/browser-use/browser-use synced 2026-04-22 17:45:09 +02:00


laithrw 86d33635c5 fix(agent): prevent stale history and stuck step counter on timeout (#4481)

## Summary

- Clear `last_model_output` and `last_result` at the start of `step()`
to prevent stale data from previous steps being recorded in history on
timeout
- Increment `n_steps` in `_execute_step`'s timeout handler to prevent
the main loop from retrying the same step number

Fixes #4480

## What changed

**`step()` — clear stale state at entry:**
```diff
  self.step_start_time = time.time()
+
+ # Clear previous step state to prevent stale data from being recorded
+ self.state.last_model_output = None
+ self.state.last_result = None
+
  browser_state_summary = None
```

**`_execute_step()` — ensure counter advances on timeout:**
```diff
  self.state.last_result = [ActionResult(error=error_msg)]
+ # Ensure step counter advances on timeout
+ if self.state.n_steps == step + 1:
+     self.state.n_steps += 1
```

The guard `if self.state.n_steps == step + 1` prevents double-increment
when `_finalize()` has already incremented on the normal path.

<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes stale history entries and a stuck step counter when a step times
out. Clears per-step state after `_prepare_context` and ensures
`n_steps` advances on timeout.

- **Bug Fixes**
  - Clear `last_model_output` and `last_result` right after
    `_prepare_context` (before LLM/action calls) so prompts keep previous
    output and timeouts don't record stale data.
  - In `_execute_step()`, increment `n_steps` after a timeout (with a
    guard) so the loop moves forward.

<sup>Written for commit c0a11dc61e.
Summary will update on new commits.</sup>

<!-- End of auto-generated description by cubic. -->

2026-03-25 19:28:20 -04:00

| Name | Last commit | Date |
|---|---|---|
| .github | drop unnecessary contents:read from cloud_evals workflow | 2026-03-22 17:14:42 -04:00 |
| bin | refactor: improve type checking and linting script | 2025-11-11 12:00:20 -08:00 |
| browser_use | Merge branch 'main' into fix/step-timeout-counter-and-stale-history | 2026-03-25 19:22:19 -04:00 |
| docker | fix: correct docker build context path in build-base-images.sh | 2026-02-03 11:23:20 +00:00 |
| examples | rm code agent | 2026-03-21 02:05:42 -04:00 |
| skills | fix Selenium wss:// handling and webhook JSON parse guard | 2026-03-21 19:04:16 -07:00 |
| static | update benchmark plot with bu-ultra result (78%) | 2026-03-24 15:51:44 -07:00 |
| tests | fix pricing url cache cleanup and remove phantom code_use module | 2026-03-25 18:16:53 -04:00 |
| .dockerignore | ignore .bak files in docker and git | 2025-06-25 23:21:45 -07:00 |
| .env.example | env var to disable version check | 2025-11-26 13:06:03 -08:00 |
| .gitattributes | Remove mp4 files and update gitattributes | 2024-11-18 20:49:06 +01:00 |
| .gitignore | docs: Add cleanup instructions to SKILL.md | 2026-01-21 21:12:34 -08:00 |
| .pre-commit-config.yaml | fixed styling issues | 2026-02-23 15:24:40 -08:00 |
| .python-version | bump dependency versions | 2025-06-03 16:03:08 -07:00 |
| AGENTS.md | Remove $10 free credit mentions from documentation | 2026-03-10 22:06:46 +00:00 |
| CLAUDE.md | improve extract data | 2025-08-31 12:01:39 -07:00 |
| CLOUD.md | docs: update CLOUD.md to reflect 5 free tasks offer | 2026-03-19 12:31:54 +01:00 |
| Dockerfile | Refactor Dockerfile and update tests for improved functionality | 2025-08-26 23:42:11 -07:00 |
| Dockerfile.fast | speedup docker build to 20s | 2025-06-27 05:36:23 -07:00 |
| LICENSE | Added MIT license | 2024-11-05 19:07:17 +01:00 |
| pyproject.toml | Remove litellm from dependencies (supply chain attack CVE) | 2026-03-24 20:29:44 -07:00 |
| README.md | Clarify Cloud option benefits in README | 2026-03-24 16:05:12 -07:00 |

README.md

🌤️ Want to skip the setup? Use our cloud for faster, scalable, stealth-enabled browser automation!

🤖 LLM Quickstart

1. Direct your favorite coding agent (Cursor, Claude Code, etc.) to AGENTS.md
2. Prompt away!

👋 Human Quickstart

1. Create environment and install Browser-Use with uv (Python>=3.11):

```bash
uv init && uv add browser-use && uv sync
# uvx browser-use install  # Run if you don't have Chromium installed
```

2. [Optional] Get your API key from Browser Use Cloud:

```bash
# .env
BROWSER_USE_API_KEY=your-key
# GOOGLE_API_KEY=your-key
# ANTHROPIC_API_KEY=your-key
```

3. Run your first agent:

```python
from browser_use import Agent, Browser, ChatBrowserUse
# from browser_use import ChatGoogle  # ChatGoogle(model='gemini-3-flash-preview')
# from browser_use import ChatAnthropic  # ChatAnthropic(model='claude-sonnet-4-6')
import asyncio

async def main():
    browser = Browser(
        # use_cloud=True,  # Use a stealth browser on Browser Use Cloud
    )

    agent = Agent(
        task="Find the number of stars of the browser-use repo",
        llm=ChatBrowserUse(),
        # llm=ChatGoogle(model='gemini-3-flash-preview'),
        # llm=ChatAnthropic(model='claude-sonnet-4-6'),
        browser=browser,
    )
    await agent.run()

if __name__ == "__main__":
    asyncio.run(main())
```

Check out the library docs and the cloud docs for more!

Open Source vs Cloud

We benchmark Browser Use across 100 real-world browser tasks. Full benchmark is open source: browser-use/benchmark.

Use Open Source

- You need custom tools or deep code-level integration
- You want to self-host and deploy browser agents on your own machines

Use Cloud (recommended)

- Much better agent for complex tasks (see plot above)
- Easiest way to start and scale
- Best stealth with proxy rotation and captcha solving
- 1000+ integrations (Gmail, Slack, Notion, and more)
- Persistent filesystem and memory

Use Both

Use the open-source library with your custom tools while running our cloud browsers and ChatBrowserUse model

Demos

📋 Form-Filling

Task = "Fill in this job application with my resume and information."

Example code ↗

🍎 Grocery-Shopping

Task = "Put this list of items into my instacart."

https://github.com/user-attachments/assets/a6813fa7-4a7c-40a6-b4aa-382bf88b1850

Example code ↗

💻 Personal-Assistant

Task = "Help me find parts for a custom PC."

https://github.com/user-attachments/assets/ac34f75c-057a-43ef-ad06-5b2c9d42bf06

Example code ↗

💡See more examples here ↗ and give us a star!

🚀 Template Quickstart

Want to get started even faster? Generate a ready-to-run template:

```bash
uvx browser-use init --template default
```

This creates a browser_use_default.py file with a working example. Available templates:

- `default` - Minimal setup to get started quickly
- `advanced` - All configuration options with detailed comments
- `tools` - Examples of custom tools and extending the agent

You can also specify a custom output path:

```bash
uvx browser-use init --template default --output my_agent.py
```

💻 CLI

Fast, persistent browser automation from the command line:

```bash
browser-use open https://example.com    # Navigate to URL
browser-use state                       # See clickable elements
browser-use click 5                     # Click element by index
browser-use type "Hello"                # Type text
browser-use screenshot page.png         # Take screenshot
browser-use close                       # Close browser
```

The CLI keeps the browser running between commands for fast iteration. See CLI docs for all commands.

Claude Code Skill

For Claude Code, install the skill to enable AI-assisted browser automation:

```bash
mkdir -p ~/.claude/skills/browser-use
curl -o ~/.claude/skills/browser-use/SKILL.md \
  https://raw.githubusercontent.com/browser-use/browser-use/main/skills/browser-use/SKILL.md
```

Integrations, hosting, custom tools, MCP, and more on our Docs ↗

FAQ

What's the best model to use?

We optimized `ChatBrowserUse()` specifically for browser automation tasks. On average, it completes tasks 3-5x faster than other models with SOTA accuracy.

Pricing (per 1M tokens):

- Input tokens: $0.20
- Cached input tokens: $0.02
- Output tokens: $2.00

For other LLM providers, see our supported models documentation.
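
As a rough illustration of how the per-1M-token prices listed above turn into a per-run cost, here is a minimal sketch. The `estimate_cost` helper and the token counts are hypothetical, and it assumes cached input tokens are counted separately from regular input tokens; check your actual usage report for how tokens are categorized.

```python
# Prices in USD per 1M tokens, copied from the pricing list above.
PRICE_PER_MILLION = {"input": 0.20, "cached_input": 0.02, "output": 2.00}


def estimate_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one run (hypothetical helper, not a library API).

    Assumes the three token counts are disjoint categories.
    """
    total = (
        input_tokens * PRICE_PER_MILLION["input"]
        + cached_input_tokens * PRICE_PER_MILLION["cached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Made-up token counts for illustration only.
    cost = estimate_cost(input_tokens=400_000, cached_input_tokens=600_000, output_tokens=50_000)
    print(f"${cost:.3f}")
```

With these made-up counts the estimate comes to roughly $0.19, and just over half of it comes from the 50,000 output tokens because output is priced 10x higher than input.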

Should I use the Browser Use system prompt with the open-source preview model?

Yes. If you use `ChatBrowserUse(model='browser-use/bu-30b-a3b-preview')` with a normal `Agent(...)`, Browser Use still sends its default agent system prompt for you.

You do not need to add a separate custom "Browser Use system message" just because you switched to the open-source preview model. Only use `extend_system_message` or `override_system_message` when you intentionally want to customize the default behavior for your task.

If you want the best default speed/accuracy, we still recommend the newer hosted `bu-*` models. If you want the open-source preview model, the setup stays the same apart from the `model=` value.

Can I use custom tools with the agent?

Yes! You can add custom tools to extend the agent's capabilities:

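```python
from browser_use import Agent, Browser, ChatBrowserUse, Tools

tools = Tools()

# Register a custom action; the description tells the LLM when to call it.
@tools.action(description='Description of what this tool does.')
def custom_tool(param: str) -> str:
    return f"Result: {param}"

# Same llm and browser setup as in the quickstart above.
llm = ChatBrowserUse()
browser = Browser()

agent = Agent(
    task="Your task",
    llm=llm,
    browser=browser,
    tools=tools,
)
```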

Can I use this for free?

Yes! Browser-Use is open source and free to use. You only need to choose an LLM provider (like OpenAI, Google, ChatBrowserUse, or run local models with Ollama).

Terms of Service

This open-source library is licensed under the MIT License. For Browser Use services & data policy, see our Terms of Service and Privacy Policy.

How do I handle authentication?

Check out our authentication examples:

- Using real browser profiles - Reuse your existing Chrome profile with saved logins
- If you want to use temporary accounts with inbox, choose AgentMail
- To sync your auth profile with the remote browser, run `curl -fsSL https://browser-use.com/profile.sh | BROWSER_USE_API_KEY=XXXX sh` (replace XXXX with your API key)

These examples show how to maintain sessions and handle authentication seamlessly.

How do I solve CAPTCHAs?

For CAPTCHA handling, you need better browser fingerprinting and proxies. Use Browser Use Cloud which provides stealth browsers designed to avoid detection and CAPTCHA challenges.

How do I go into production?

Chrome can consume a lot of memory, and running many agents in parallel can be tricky to manage.

For production use cases, use our Browser Use Cloud API which handles:

- Scalable browser infrastructure
- Memory management
- Proxy rotation
- Stealth browser fingerprinting
- High-performance parallel execution
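
If you do run several agents on your own machine, one generic way to keep memory in check is to cap how many run at once. The sketch below uses a plain `asyncio.Semaphore`; it is not part of the Browser Use Cloud API, and `run_agent_task` is a hypothetical placeholder where you would build an `Agent` and await `agent.run()`.

```python
import asyncio


async def run_agent_task(task: str) -> str:
    """Hypothetical placeholder: build an Agent for `task` and await agent.run() here."""
    await asyncio.sleep(0.01)  # stands in for the real browser work
    return f"done: {task}"


async def run_bounded(tasks: list[str], max_parallel: int = 2) -> list[str]:
    """Run all tasks, but never more than `max_parallel` at the same time."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(task: str) -> str:
        # Each task waits here until one of the max_parallel slots is free.
        async with semaphore:
            return await run_agent_task(task)

    # gather() returns results in the same order as the input tasks.
    return await asyncio.gather(*(guarded(t) for t in tasks))


if __name__ == "__main__":
    print(asyncio.run(run_bounded(["task-a", "task-b", "task-c"])))
```

A sensible value for `max_parallel` depends on how much memory each Chrome instance uses on your machine, so treat the default here as a placeholder.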

Tell your computer what to do, and it gets it done.

Made with ❤️ in Zurich and San Francisco
