- Removed redundant description from action field in AgentOutput and its subclasses.
- Updated action extraction documentation in Tools to clarify usage and limitations.
- Enhanced search_engine field description in SearchAction for better clarity on default behavior.
Auto-generated PR for: bump-anthropic-version-for-linter
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Bumps `anthropic` dependency to `>=0.68.1,<1.0.0` in `pyproject.toml`.
>
> - **Dependencies**:
>   - Update `anthropic` version constraint in `pyproject.toml` from `>=0.58.2,<1.0.0` to `>=0.68.1,<1.0.0`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3bbdcb1e97. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Auto-generated PR for branch: screenshot-tool
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Adds an auto vision mode that includes screenshots only when requested via a new `take_screenshot` action, updating agent logic, prompts, telemetry, and docs.
>
> - **Vision behavior**:
>   - Add `use_vision` mode `"auto"` (default) across `Agent`, `MessageManager`, `AgentSettings`, and telemetry; supports `bool | Literal['auto']`.
>   - Conditionally include `browser_state_summary.screenshot` only when requested via action metadata `{"include_screenshot": true}` or when `use_vision=True`; pass `effective_use_vision` to `.get_user_message()`.
>   - Exclude `take_screenshot` tool from `Tools` when `use_vision=False`.
> - **New tool**:
>   - Add `take_screenshot` action in `browser_use/tools/service.py` that returns metadata to request a screenshot in the next observation.
> - **Prompts**:
>   - Update system prompts to note `browser_vision` screenshot is present only after `take_screenshot`; instruct to use `take_screenshot` when unsure.
> - **Telemetry**:
>   - Update `AgentTelemetryEvent.use_vision` type to `bool | Literal['auto']`.
> - **Docs**:
>   - Update `use_vision` parameter docs (default `"auto"` and behavior) and add `take_screenshot` to available tools.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
8ee687fd2d. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
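
The screenshot decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `resolve_effective_use_vision` and its signature are hypothetical, and it assumes that `use_vision=False` always suppresses screenshots (the summary only states that the `take_screenshot` tool is excluded in that mode).

```python
from typing import Any, Literal, Mapping, Optional, Union

# Mirrors the `bool | Literal['auto']` type named in the summary.
UseVision = Union[bool, Literal["auto"]]


def resolve_effective_use_vision(
    use_vision: UseVision,
    action_metadata: Optional[Mapping[str, Any]] = None,
) -> bool:
    """Hypothetical helper: should the next observation carry a screenshot?"""
    if use_vision is True:
        # Vision always on: every observation includes the screenshot.
        return True
    if use_vision is False:
        # Assumption: vision off means no screenshot, whatever the metadata says.
        return False
    # "auto": include a screenshot only if the previous action asked for one.
    return bool(action_metadata and action_metadata.get("include_screenshot"))
```

Under these assumptions, `resolve_effective_use_vision("auto", {"include_screenshot": True})` evaluates to `True`, while `"auto"` with no metadata evaluates to `False`.
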
### 🐛 Summary
Fixes [#3238](https://github.com/browser-use/browser-use/issues/3238)
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Added role="option" to the interactive roles in the DOM serializer so
these nodes are kept instead of excluded. This fixes missing bounding
boxes for option-like items in listboxes/selects.
<!-- End of auto-generated description by cubic. -->
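
As a rough illustration of the change, a role-based interactivity check might look like the sketch below. The set contents other than `"option"` and the function name `is_interactive_role` are assumptions made for this example; the actual serializer may track a different list and use a different structure.

```python
from typing import Mapping

# Hypothetical subset of ARIA roles treated as interactive.
# Only "option" is confirmed by the summary above; the rest are illustrative.
INTERACTIVE_ROLES = frozenset({"button", "link", "checkbox", "menuitem", "option"})


def is_interactive_role(attributes: Mapping[str, str]) -> bool:
    """Return True if the node's `role` attribute marks it as interactive."""
    role = attributes.get("role", "").strip().lower()
    return role in INTERACTIVE_ROLES
```

With `"option"` in the set, a node such as `{"role": "option"}` is kept by the check instead of being filtered out.
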
Auto-generated PR for: less-quality-image
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Replaces PNG with JPEG for screenshots and image data URLs across capture, messaging, serializers, tests, and examples to reduce payload size.
>
> - **Images & Screenshots**:
>   - Capture screenshots as JPEG (`quality=60`) via `ScreenshotWatchdog` and embed as `data:image/jpeg;base64,...` in `CreateAgentStepEvent`.
>   - Agent prompts now attach screenshots using JPEG URLs and media types.
> - **LLM Serializers**:
>   - Google: decode data URLs and send images as `image/jpeg` bytes.
>   - Anthropic: parse base64 URLs; default unrecognized media type to `image/jpeg`.
>   - Ollama: parse base64 data URLs labeled as JPEG.
> - **Message Types**:
>   - `ImageURL.media_type` default changed from `image/png` to `image/jpeg`.
> - **Tests & Examples**:
>   - Gemini image test and `add_image_context` example updated to generate/use JPEG data URLs and media types.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
4a78c64dcf. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
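
The data-URL handling described above could be sketched with the standard library alone. The helper names (`to_jpeg_data_url`, `parse_data_url`) and the list of recognized media types are assumptions for illustration; they are not taken from the serializers in this PR.

```python
import base64
from typing import Tuple

DEFAULT_MEDIA_TYPE = "image/jpeg"
# Assumed set of recognized media types; anything else falls back to JPEG.
KNOWN_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def to_jpeg_data_url(jpeg_bytes: bytes) -> str:
    """Embed already-encoded JPEG bytes in a base64 data URL."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Split a base64 data URL into (media_type, raw_bytes)."""
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    media_type = header[len("data:"):-len(";base64")]
    if media_type not in KNOWN_MEDIA_TYPES:
        # Unrecognized media type: default to JPEG, as the summary describes.
        media_type = DEFAULT_MEDIA_TYPE
    return media_type, base64.b64decode(payload)
```

Round-tripping bytes through `to_jpeg_data_url` and `parse_data_url` returns the original bytes with the `image/jpeg` media type.
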
<!-- This is an auto-generated description by cubic. -->
## Summary by cubic
Simplified tool action descriptions and naming to make actions shorter
and easier to use. Renamed a few actions/params and set a default for
scrolling to reduce boilerplate without changing behavior.
- **Refactors**
  - Shorter, clearer descriptions across actions (search, go_to_url, click, input_text, upload_file, switch_tab, scroll, scroll_to_text, execute_js).
  - Renamed actions: click_element_by_index → click, upload_file_to_element → upload_file.
  - Renamed ClickElementAction param: while_holding_ctrl → ctrl.
  - Set ScrollAction.num_pages default to 1.0.
- **Migration**
  - Update ClickElementAction payloads to use ctrl instead of while_holding_ctrl.
  - If actions are referenced by function name, switch to click and upload_file.
  - Scroll now defaults to one page when num_pages is omitted.
<!-- End of auto-generated description by cubic. -->
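
For callers holding stored action payloads, the migration steps above might be applied with a small shim like the one below. It assumes payloads shaped as `{action_name: params}`; that shape, the helper name `migrate_action`, and the `index`/`path` parameters in the usage note are illustrative assumptions rather than confirmed API details.

```python
from typing import Any, Dict

# Renames listed in the summary above.
ACTION_RENAMES = {
    "click_element_by_index": "click",
    "upload_file_to_element": "upload_file",
}
# Parameter renames, keyed by the new action name.
PARAM_RENAMES = {"click": {"while_holding_ctrl": "ctrl"}}


def migrate_action(payload: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Rewrite old action and parameter names to the new ones."""
    migrated: Dict[str, Dict[str, Any]] = {}
    for name, params in payload.items():
        new_name = ACTION_RENAMES.get(name, name)
        renames = PARAM_RENAMES.get(new_name, {})
        migrated[new_name] = {renames.get(key, key): value for key, value in params.items()}
    # Scroll payloads that omit num_pages need no rewrite: the default is now 1.0.
    return migrated
```

For example, `migrate_action({"click_element_by_index": {"index": 3, "while_holding_ctrl": True}})` yields `{"click": {"index": 3, "ctrl": True}}`.
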