50 Commits

Author SHA1 Message Date
mertunsall
92da75520b fixes based on evals 2025-07-11 18:36:58 +02:00
mertunsall
d1f532b45d - remove append_file and instead merge it into write_file
- new replace_file_str
- add trailing-newline and leading_newline parameters to file writing for robustness
- adapt system prompts accordingly
2025-07-11 17:58:01 +02:00
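The file-tool change described in this commit can be illustrated with a minimal sketch. It assumes a single write_file that takes an append flag in place of the removed append_file, plus a replace_file_str helper. The signatures, the `append` and `trailing_newline` parameter names, and the error behavior are assumptions for illustration, not the project's actual code.

```python
from pathlib import Path


def write_file(path, content, append=False, leading_newline=False, trailing_newline=True):
    """Write or append text. Hypothetical signature, not the project's API."""
    text = content
    # Optionally separate the new text from whatever precedes it.
    if leading_newline and not text.startswith("\n"):
        text = "\n" + text
    # Optionally make sure the written text ends with a newline.
    if trailing_newline and not text.endswith("\n"):
        text = text + "\n"
    mode = "a" if append else "w"
    with open(path, mode, encoding="utf-8") as fh:
        fh.write(text)
    return text


def replace_file_str(path, old, new):
    """Replace every occurrence of old with new and return how many were replaced."""
    if not old:
        raise ValueError("old must be a non-empty string")
    target = Path(path)
    data = target.read_text(encoding="utf-8")
    count = data.count(old)
    if count == 0:
        raise ValueError(f"{old!r} not found in {path}")
    target.write_text(data.replace(old, new), encoding="utf-8")
    return count
```

Folding append into write_file leaves one code path for newline handling, which is presumably the robustness the commit message refers to.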
Nick Sweeting
848fdf8aad clearer exception handling 2025-07-10 18:26:08 -07:00
Cursor Agent
168719561e Add element-specific scrolling with optional index parameter
Co-authored-by: mamagnus00 <mamagnus00@gmail.com>
2025-07-08 07:56:57 +00:00
Daniel T.
c5a3c0cf2e Refines dynamic page state calculations
Replaces static page statistics with dynamic calculations for
viewport and scroll position metrics. Simplifies code logic in
scroll handling by removing unnecessary defaults for page
scrolling. Improves readability and reliability of page
scroll operations by requiring explicit num_pages parameter.

Enhances page-view consistency and user interaction handling.
2025-07-07 21:22:12 +02:00
Daniel T.
539274a7d4 Updates scroll functionality to use page units
Replaces pixel-based scrolling with page unit determination, allowing for more intuitive scroll actions by specifying the number of pages. Adjusts related documentation, examples, and tests to reflect this change for improved code consistency and user experience.
2025-07-07 18:21:37 +02:00
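As a rough illustration of page-unit scrolling, the sketch below converts a page count into a pixel offset. Only num_pages is named in these commit messages; the function name, the viewport_height argument, and the `down` flag are assumptions made for the example.

```python
def scroll_delta_pixels(num_pages: float, viewport_height: int, down: bool = True) -> int:
    """Convert a number of pages into a signed pixel offset (hypothetical helper)."""
    if num_pages <= 0:
        raise ValueError("num_pages must be positive")
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    # One page is taken to be one viewport height; fractions scroll part of a page.
    pixels = round(num_pages * viewport_height)
    # Positive values scroll down, negative values scroll up.
    return pixels if down else -pixels
```

With an 800-pixel viewport, half a page corresponds to 400 pixels, so the caller never has to know the window size in advance.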
Daniel T.
eb7c7fa2bc Enhances scroll action testing
Removes unnecessary Field description in ScrollAction to streamline code.

Implements additional test cases for varying scroll amounts, improving test coverage and validation of scrolling behavior in the integration tests.
2025-07-07 17:34:57 +02:00
Daniel T.
610419c9d7 Enhances scroll functionality with pixel control
Introduces ability to scroll by a specific number of pixels
via a new 'amount' parameter, defaulting to one page height
if unspecified. Updates relevant documentation and examples
to reflect these changes, enhancing navigation precision.

Improves user experience by allowing finer control over page
scrolling actions.

Relates to user feedback on scroll improvements.
2025-07-07 17:19:59 +02:00
Cursor Agent
ee2c1d2ad0 Remove optional xpath from ClickElementAction model
Co-authored-by: mailmertunsal <mailmertunsal@gmail.com>
2025-07-02 07:43:21 +00:00
Cursor Agent
c24d5b4320 Remove optional xpath parameter from InputTextAction model
Co-authored-by: mamagnus00 <mamagnus00@gmail.com>
2025-07-01 21:51:05 +00:00
mertunsall
649b2d5277 Refactor URL navigation and scrolling actions
- Updated GoToUrlAction to include a new_tab parameter for opening URLs in a new tab or navigating in the current tab.
- Removed the OpenTabAction as its functionality is now integrated into GoToUrlAction.
- Enhanced scroll action to allow scrolling up or down based on a boolean parameter.
- Cleaned up commented-out code related to accessibility tree retrieval for better readability.
2025-06-29 22:48:40 +02:00
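The merge of OpenTabAction into GoToUrlAction can be sketched as below. GoToUrlAction and new_tab come from the commit message; the default value, the TabSet stand-in for a browser session, and the go_to_url function are hypothetical and use plain dataclasses rather than the project's own models.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoToUrlAction:
    url: str
    new_tab: bool = False  # default is an assumption


@dataclass
class TabSet:
    """Hypothetical stand-in for a browser session: one URL per tab."""
    tabs: List[str] = field(default_factory=lambda: ["about:blank"])
    current: int = 0


def go_to_url(session: TabSet, action: GoToUrlAction) -> None:
    if action.new_tab:
        # Behaves like the removed OpenTabAction: open a tab and focus it.
        session.tabs.append(action.url)
        session.current = len(session.tabs) - 1
    else:
        # Navigate in place in the currently focused tab.
        session.tabs[session.current] = action.url
```

A single action with a flag gives the model one fewer tool to choose between, at the cost of one more parameter to fill in.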
mertunsall
640d1a15fe Add upload file action 2025-06-29 21:36:15 +02:00
Gregor Žunič
0691500c6a wip 2025-06-29 16:22:19 +02:00
Gregor Žunič
7a10ae0c96 Squashed commit langchain to native 2025-06-24 12:26:55 +02:00
mertunsall
1e549d72f3 now the model can display files to the users 2025-06-03 18:07:47 +02:00
Nick Sweeting
23252192a0 remove non-functional group tabs code 2025-05-04 19:40:27 +08:00
Nick Sweeting
2be4ba4f70 more pyupgrade changes 2025-05-02 20:50:21 +08:00
Gregor Žunič
922ad08fd7 removed random unnecessary functions from controller 2025-04-14 11:22:55 +02:00
Gregor Žunič
27b22a3986 Update extraction branch (#1296)
* Add http_credentials to browser context.

* Detect index change caused by page change in `multi_act`

* Fix: Add sameSite cookie validation and auto-correction

* added drag drop action

* added drag drop action

* pre-commit run -a (again)

* added active_tab state management

* add anthropic RateLimitError to list of defined rate limit errors

* Add Retry handling to bedrock example

* fix pydantic issue causing config to be overwritten

* fix config setup

* Add botocore, organize imports

* ignore langchain beta warnings

* silence pydantic deprecation warnings

* Update bedrock_claude for ruff

* ruff checks

* Update pyproject.toml and bedrock_claude.py

* ignore signal registration errors in edge case environments

* ruff fixes

* add example of getting 2FA code from 1password

* add rebrowser and creepjs to stealth tests

* Fix typos discovered by codespell

* uv add --dev codespell

* Update browser_use/browser/context.py

* lint

* Add explanations override_system_message and extend_system_message

* Update custom-functions.mdx

.get_current_page() is an async function which has to be awaited.

* Fix: Remove unnecessary await from _verified_api_keys

* Fix: Correct indentation for cookie loading log message in BrowserContext

* Fix: resolve undefined response variable in Deepseek model raw tool calling mode

* fix: increment consecutive_failure on every step error

* Fix azure example not working due to agent memory changes

* don't expose debug port on all addresses and keep security enabled by default

* custom browser addition

---------

Co-authored-by: Bartlomiej Wietrak <bartekwietrak@gmail.com>
Co-authored-by: Alin Jiang <alineveryday@outlook.com>
Co-authored-by: dipfocus <dipfocus@gmail.com>
Co-authored-by: PaperBoardOfficial <hiremohitforsoftwarerole@gmail.com>
Co-authored-by: Christian Clauss <cclauss@me.com>
Co-authored-by: Alezander9 <alexander.j.yue@gmail.com>
Co-authored-by: dheerajoruganty <db2winfb@gmail.com>
Co-authored-by: Nick Sweeting <git@sweeting.me>
Co-authored-by: Dheeraj Oruganty <53569374+dheerajoruganty@users.noreply.github.com>
Co-authored-by: Bart <46058081+b0rgcube@users.noreply.github.com>
Co-authored-by: Nick Sweeting <github@sweeting.me>
Co-authored-by: pppp606 <ppppp303@gmail.com>
Co-authored-by: Oswy <74738120+oswy-cpu@users.noreply.github.com>
Co-authored-by: BurnyCoder <happymancz@email.cz>
Co-authored-by: zhushijie <mr.zhushijie@gmail.com>
Co-authored-by: john-rtr <jonathan.ratier@gmail.com>
Co-authored-by: mathisarends-viadee <mathis.arends@viadee.de>
Co-authored-by: lorenss-m <saeclmusic@gmail.com>
Co-authored-by: Magnus Müller <67061560+MagMueller@users.noreply.github.com>
2025-04-05 19:08:50 +02:00
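One item in the squashed commit above mentions sameSite cookie validation and auto-correction. The commit message does not say how the correction works, so the following is only a sketch under stated assumptions: the accepted values are taken to be "Strict", "Lax", and "None", casing is normalized, and anything unrecognized falls back to "Lax". The function name and the fallback choice are hypothetical.

```python
VALID_SAME_SITE = ("Strict", "Lax", "None")


def normalize_same_site(cookie: dict, default: str = "Lax") -> dict:
    """Return a copy of cookie with a valid sameSite value (hypothetical helper)."""
    fixed = dict(cookie)
    if "sameSite" not in fixed:
        # Nothing to validate; leave the cookie as it was.
        return fixed
    value = fixed["sameSite"]
    raw = value.strip().lower() if isinstance(value, str) else ""
    by_lower = {v.lower(): v for v in VALID_SAME_SITE}
    # Fix casing when the value is recognizable, otherwise use the fallback.
    fixed["sameSite"] = by_lower.get(raw, default)
    return fixed
```

Cookies exported by browser extensions often carry values such as "no_restriction" or lowercase "lax", which is the kind of input this sketch is meant to absorb.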
Nick Sweeting
fb6fa259a8 apply ruff safe fixes 2025-03-28 18:11:36 -07:00
Nick Sweeting
56977be7d1 Merge branch 'main' into Improve-click-by-text-handling 2025-03-27 19:43:34 -07:00
Nick Sweeting
d2d8a7e3c6 Merge branch 'main' into Add-wait-for-element-action 2025-03-27 17:45:08 -07:00
jersobh
f6960ad80e improving the element click to fallback to js evaluate; use optional element type 2025-03-27 16:21:09 +00:00
Nick Sweeting
b47c20c780 Merge branch 'main' into Click-on-non-indexed-elements 2025-03-26 16:50:35 -07:00
Nick Sweeting
85fc7c7e90 add support for close_tab action 2025-03-26 14:07:03 -07:00
Jeff Andrade
0e5e108e10 Merge branch 'browser-use:main' into Click-on-non-indexed-elements 2025-03-26 09:37:09 +00:00
jersobh
4a201f4c69 handling multiple elements with the same text 2025-03-26 02:53:14 +00:00
Nick Sweeting
d95527ab27 Merge branch 'main' into enhancement/add_tabGroups_chrome 2025-03-25 16:34:08 -07:00
jersobh
63f54cc100 Clicking by xpath, css selector or text 2025-03-24 12:34:01 +00:00
Magnus Müller
c07b5c9a94 Add last step warning and enhance action result tracking
- Introduce warning message for the last step in the agent
- Add `success` parameter to `ActionResult` and `DoneAction`
- Modify `AgentStepInfo` to check for last step
- Update `AgentHistoryList` methods to handle optional values and success status
2025-02-22 23:43:41 -08:00
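A minimal sketch of the last-step check and the `success` flag follows. The class names come from the commit message; the fields, the zero-based step numbering, and the warning text are assumptions, and plain dataclasses stand in for whatever model classes the project uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentStepInfo:
    step_number: int  # assumed to be zero-based
    max_steps: int

    def is_last_step(self) -> bool:
        return self.step_number >= self.max_steps - 1


@dataclass
class ActionResult:
    is_done: bool = False
    success: Optional[bool] = None  # None means the action did not report an outcome
    extracted_content: Optional[str] = None


def last_step_warning(info: AgentStepInfo) -> Optional[str]:
    """Return a warning to add to the prompt on the final step, else None."""
    if info.is_last_step():
        return "This is your last step. Finish with the done action and report success honestly."
    return None
```

Keeping `success` optional lets history methods distinguish "finished and failed" from "never finished", which matches the commit's note about handling optional values.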
Henry
e174508b96 Fix inconsistent indentation in controller/views.py 2025-02-18 17:35:50 -08:00
SwapnilSonker
92fe9212a9 #544 issue resolved, used chrometabs api 2025-02-08 22:48:46 +05:30
Morris Lee
defec5b9f6 to fix agent test by adding ExtractPageContentAction into controller views 2025-02-08 18:28:46 +08:00
jersobh
8db1230b9d turn selector mandatory and timeout optional with a default wait time of 10 seconds 2025-02-02 01:23:05 +00:00
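The signature change in this commit (required selector, optional timeout defaulting to 10 seconds) might look like the sketch below. The polling loop, the is_present callback, and poll_interval are assumptions added so the example runs without a browser; they are not taken from the project.

```python
import time
from typing import Callable


def wait_for_element(
    selector: str,
    is_present: Callable[[str], bool],
    timeout: float = 10.0,
    poll_interval: float = 0.1,
) -> bool:
    """Poll until is_present(selector) is true or timeout seconds have passed."""
    if not selector:
        raise ValueError("selector is required")
    deadline = time.monotonic() + timeout
    while True:
        if is_present(selector):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Making the selector mandatory turns a missing argument into an immediate error instead of a silent ten-second wait.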
jersobh
ac88cf46ee Added initial_actions and service action on registry for wait_for_element with timeout 2025-02-02 01:20:02 +00:00
magmueller
23a4481175 Enabled extraction with llm 2025-01-31 23:38:23 -08:00
j0yk1ll
7c93f6c0e8 fix: could not parse response on go_back action 2025-01-26 15:49:27 +01:00
magmueller
a9d095e365 No default value try 2025-01-13 07:18:13 -08:00
magmueller
d24e157106 Fix default is not permitted 2025-01-13 07:13:30 -08:00
magmueller
18a42db882 Only option for markdown or text depending if links are needed 2025-01-12 18:16:53 -08:00
magmueller
cb654d3ab8 Scrolling to text works 2024-12-03 11:09:43 +01:00
magmueller
90ef74aceb Removed num_clicks, fixed type valuation -> evaluation, and multiaction with output list 2024-12-01 21:49:57 +01:00
magmueller
3696a3b7ed Include xpath 2024-11-28 06:50:44 +01:00
Gregor Žunič
0019105a49 fixed merge errors 2024-11-22 15:08:58 +01:00
Gregor Žunič
5941dd2752 Merge branch 'staging' into gregorzunic/bu-56-switch-from-selenium-to-playwright 2024-11-22 15:08:30 +01:00
Gregor Žunič
b0c390f2c0 fixed multi tab management, clicking timeouts, general bugfixes 2024-11-20 17:33:49 +01:00
Gregor Žunič
f7148e3542 untested version of playwright (kinda works) 2024-11-19 18:32:31 +01:00
magmueller
1e9dee081b Core function to scroll up and down on page 2024-11-17 16:43:54 +01:00
Gregor Žunič
89c63fdd63 Added custom actions registry and fixed extraction layer (#20)
* Validator

* Test mind2web

* Cleaned up logger

* Pytest logger

* Cleaned up logger

* Disable flag for human input

* Multiple clicks per button

* Multiple clicks per button

* More structured system prompt

* Fields with description

* System prompt example

* One logger

* Cleaner logging

* Log step in step function

* Fix critical clicking error - wrong argument used

* Improved thought process of agent

* Improve system prompt

* Remove human input message

* Custom action registration

* Pydantic model for custom actions

* Pydantic model for custom output

* Runs through, model outputs functions, but not called yet

* Work in progress - description for custom actions

* Description works, but schema not yet

* Model can call the right action - but is not executed

* Separate is_controller_action and is_custom_action

* Works! Model can call custom function

* Use registry for action, but result is not fed back to model

* Include result in messages

* Works with custom function - but typing is not correct

* Renamed registry

* First test cases

* Captcha tests

* Pydantic for tests

* Improve prompts for multi-step

* System prompt structure

* Handle errors like validation error

* Refactor error handling in agent

* Refactor error handling in agent

* Improved logging

* Update view

* Fix click parameter to index

* Simplify dynamic actions

* Use run instead of step

* Rename history

* Rename AgentService to Agent

* Rename ControllerService to Controller

* Pytest file

* Rename get state

* Rename BrowserService

* reversed dom extraction recursion to while

* Rename use_vision

* Rename use_vision

* reversed dom tree items and made browser less annoying

* Renaming and fixing type errors

* Renamed class names for agent

* updated requirements

* Update prompt

* Action registration works for user and controller

* Fix done call by returning ActionResult

* Fix if result is none

* Rename AgentOutput and ActionModel

* Improved prompt; passes 6/8 tests from test_agent_actions

* Calculate token cost

* Improve display

* Simplified logger

* Test function calling

* created super simple xpath extraction algo

* Tests logging

* tiny fixes to dom extraction

* Remove test

* Don't log number of clicks

* Pytest file

* merged per element js checks

* Check if driver is still open

* super fast processing

* fixed agent planning and stuff

* Fix example

* Fix example

* Improve error

* Improved error correction

* New line for step

* small type error fixes

* Test for pydantic

* Fix line

* Removed sample

* fixed readme and examples

---------

Co-authored-by: magmueller <mamagnus00@gmail.com>
2024-11-15 21:42:02 +01:00
Gregor Žunič
68201df624 src -> browser_use 2024-11-06 18:18:00 +01:00