browser-use

mirror of https://github.com/browser-use/browser-use synced 2026-05-06 17:52:15 +02:00

Author	SHA1	Message	Date
Mert Unsal	25a2eecbfd	Enhance system prompt reasoning (#2022) - Improve clarity on using extract_structured_data and user request processing. - Added guidance on file system to avoid overwriting existing content. - Included additional reasoning patterns for better task management and progress tracking.	2025-06-21 09:56:34 +02:00
Mert Unsal	59c05a4f59	Merge branch 'main' into mert/improve_system_prompt	2025-06-21 09:55:56 +02:00
Magnus Müller	8194ecbc3e	Pass laminar_eval_id from frontend (#2024)	2025-06-21 09:37:51 +02:00
Magnus Müller	f1d5dc5a17	Pass laminar_eval_id from frontend	2025-06-21 09:31:14 +02:00
Magnus Müller	726dd30c82	Monitor eval cpu (#2023) ## Summary by cubic Added detailed CPU and memory monitoring to the evaluation workflow and service to help track resource usage and catch issues like high memory or CPU load during eval runs. - New Features - Logs system resource stats before, during, and after evaluation runs. - Starts a background resource monitor that checks CPU, memory, and process counts every 30 seconds. - Adds signal handling and heartbeat logging for better debugging and graceful shutdowns. - Collects and uploads resource logs and debug info as workflow artifacts.	2025-06-21 00:33:26 +02:00
Magnus Müller	adfc553692	Merge branch 'main' into monitor-eval	2025-06-21 00:32:20 +02:00
Magnus Müller	83d92513a4	Monitor eval cpu	2025-06-20 23:35:56 +02:00
Mert Unsal	64eeac9a17	Update browser_use/agent/system_prompt.md Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2025-06-20 22:45:44 +02:00
mertunsall	16708e916d	Enhance system prompt reasoning - Improve clarity on using extract_structured_data and user request processing. - Added guidance on file system to avoid overwriting existing content. - Included additional reasoning patterns for better task management and progress tracking.	2025-06-20 22:42:21 +02:00
Mert Unsal	6b9263d9f6	Quick Fix to History (#2021) Fixed a bug where a "No model output" message was added to the agent history even when step_number was zero, where there is no model output.	2025-06-20 21:38:28 +02:00
Mert Unsal	97bd9fb3d2	Update browser_use/agent/message_manager/service.py Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2025-06-20 21:31:51 +02:00
mertunsall	f3b406f42a	Quick Fix to History	2025-06-20 21:28:06 +02:00
Magnus Müller	1c54954a0d	Improve AgentOutput format and reasoning style (#2020) - Change reasoning rules in system prompt - Flatten AgentOutput to get rid of current_state - Add agent initialization in history. - Update the example tool call. - Add a current_state property to AgentOutput for compatibility. --- ## Summary by cubic Flattened the AgentOutput model by removing the current_state field, updated the system prompt reasoning rules, and added agent initialization to the history for better clarity and compatibility. - Refactors - AgentOutput now uses top-level fields instead of a nested current_state. - Added a current_state property for backward compatibility. - Simplified and clarified reasoning rules in the system prompt. - Updated example tool call and included agent initialization in the history.	2025-06-20 21:24:01 +02:00
mertunsall	0c656cb631	fix more	2025-06-20 19:33:14 +02:00
mertunsall	8bc6c3d2b7	hotfix	2025-06-20 19:32:51 +02:00
mertunsall	2ec840ed4a	fix tests for new output	2025-06-20 19:13:58 +02:00
mertunsall	26fdce7357	fix test case with new output structure	2025-06-20 19:02:07 +02:00
mertunsall	9ae6354b76	- Change reasoning rules in system prompt - Flatten AgentOutput to get rid of current_state - Add agent initialization in history. - Update the example tool call. - Add a current_state property to AgentOutput for compatibility.	2025-06-20 18:55:04 +02:00
Magnus Müller	179f2917f5	Change Agent State, System Prompt, Add FileSystem, etc. (#1874 ) ## Agent State and System Prompt This PR aims to improve the agent by rewriting the way the state is constructed, as described in `system_prompt.md`, new state consists of: 1. Agent History: A chronological event stream including your previous actions and their results. This may be truncated or partially omitted. 2. Agent State: Includes the ultimate goal provided by the user, current progress, and relevant contextual memory. 3. Browser State: Contains current URL, open tabs, interactive elements indexed for actions, visible page content, and (sometimes) any visual context provided by screenshots or page snapshots. 4. Read State: If your previous action involved reading a file or extracting content (e.g., from a webpage), the full result will be included here. This data is only shown in the current step and will not appear in future Agent History. You are responsible for saving or interpreting the information appropriately during this step. Please refer to `system_prompt.md` for explanation of what the model gets in its context. The PR rewrites the `MessageManager` to achieve the displaying of this state. This helps maintaining a better state as the model always sees a constant number of messages: system prompt, example tool calls, and current state. ## File System The model now has access to a File System it can interact with so that it can make a plan, write intermediate results, etc. 1. Agent is initialized with two files `todo.md` and `results.md`. First one is used so that the model can plan and the contents of this document is always displayed in agent's state. Second one is so that model can accumulate results for long tasks. File system is always displayed in Agent State in format: file_name — num_lines lines 2. Model is allowed to use 3 functionalities, `write_file`, `append_file`, and `read_file`. We should improve the descriptions of these functions. 
Finally, there should be an option to "edit a string" (Replace line `"{content}"` to `"{new_content}"`). 3. Currently, for safety reasons, agent can ONLY create files in format `{file_name}.{md|txt}` and cannot create subdirectories or go back. This makes sure that agent's 4. The file system is currently launched in a temporary directory with a random uuid - this can be changed so that we just use `tempfile`'s temporary file and temporary directory functionality, stored in memory and to be deleted when program terminates. 6. Optionally, the user might want to keep the results saved in the directory so there should be an option to set a directory which will not be deleted at the end. 7. Finally, the agent is not allowed to be initialized in an existing directory. This is to make sure that files written in a previous do not impact a new agent's behavior accidentally. However, this is annoying as it requires the user to delete the directory between multiple runs. ## AgentBrain AgentBrain object has a new thinking field - need to make sure that this is taken care of in the entire codebase, saved/displayed when necessary. ## AgentResult AgentResult object has 2 new fields, `memory` which is what will be added to Action History that the agent will see and `update_read_state`, a boolean that determines whether the action should result in updating the read state. This is used for actions such as `extract_content` and `read_file`, where the model should see the content of the file once but file contents shouldn't remain in history forever. ## Details and Next Steps 1. Currently browser state is capped at 10k characters - this is very primitive and should be changed with a smarter semantic processing layer. 2. I think we want to permanently include file system in the Agent - this feature seems very important. So the controller functions for using a file system should be moved from agent into controller. 3. 
We should add a function call for sending user a message with the results before calling `done` - this is simply more clean and more intuitive for the model. 4. Save the current page as PDF file should directly use the file system. 5. Currently, there seems to be a bug with `append_file`, should be fixed. 6. Not all language models seem to work with current version - for example, maybe we should get rid of `current_state` field. 7. Get rid of langchain and do our own LLM calls for better tractability and more control over what goes into the LLM. I think there are a lot more things to be addressed regarding this PR, but this is all I could gather.	2025-06-20 15:40:10 +02:00
Magnus Müller	4a8cf30dac	Merge branch 'main' into mert/new_everything	2025-06-20 12:27:19 +02:00
Magnus Müller	902a1dfb66	Add gemini-2.5-flash (#2018)	2025-06-20 12:20:53 +02:00
Magnus Müller	0e5a8942f3	Add gemini-2.5-flash	2025-06-20 12:19:47 +02:00
Nick Sweeting	1cc94b6688	improve cloud sync logging	2025-06-20 03:00:35 -07:00
Nick Sweeting	ca9588d6d6	bump version 0.3.1	2025-06-20 02:38:33 -07:00
Nick Sweeting	3177118aa7	fix lint errors	2025-06-20 02:38:22 -07:00
Nick Sweeting	14e420b74b	Revert "Fix cross-origin iframe DOM retrieval" (#2017)	2025-06-20 02:37:52 -07:00
Nick Sweeting	e9b2fea57f	Revert "Fix cross-origin iframe DOM retrieval"	2025-06-20 05:37:22 -04:00
Nick Sweeting	51626ef42d	Eventbus fixes and cleanup (#2016) 0.3.0	2025-06-20 02:30:13 -07:00
Nick Sweeting	2baf4e7907	Merge branch 'main' into eventbus	2025-06-20 02:26:44 -07:00
Nick Sweeting	933bddc02f	improve error logging and bump bubus version	2025-06-20 02:24:01 -07:00
Nick Sweeting	7766b8d630	cleanup event wrapping and unwrapping	2025-06-20 02:06:23 -07:00
Magnus Müller	4c2952d640	Squashed commit of the following: commit `a9cf53a1b1` Merge: `5aa62c11` `0f9ffa10` Author: Magnus Müller <67061560+MagMueller@users.noreply.github.com> Date: Fri Jun 20 10:41:19 2025 +0200 Set user_data_dir to None (#2015) <!-- This is an auto-generated description by cubic. --> Changed browser session setup to use incognito mode by setting user_data_dir to None, preventing persistent state between evaluation runs. <!-- End of auto-generated description by cubic. --> commit `0f9ffa1072` Author: Magnus Müller <67061560+MagMueller@users.noreply.github.com> Date: Fri Jun 20 10:38:01 2025 +0200 Set user_data_dir to None commit `5aa62c1113` Merge: `d8a9d21b` `e559ff5e` Author: Nick Sweeting <git@sweeting.me> Date: Thu Jun 19 23:01:49 2025 -0700 Fix cross-origin iframe DOM retrieval (#1965) commit `d8a9d21b00` Merge: `3e5f3049` `b6be1583` Author: Nick Sweeting <git@sweeting.me> Date: Thu Jun 19 23:01:21 2025 -0700 Fix critical domain restriction bypass vulnerability (#2006) commit `b6be158319` Author: Sahar <saharhashai@gmail.com> Date: Thu Jun 19 02:28:34 2025 -0700 Delete tests/ci/test_security_url_validation.py commit `aca4b57329` Author: Sahar <saharhashai@gmail.com> Date: Thu Jun 19 02:27:57 2025 -0700 Delete SECURITY_FIX_REPORT.md commit `45872c1e45` Author: Your Name <your.email@example.com> Date: Thu Jun 19 11:24:50 2025 +0200 fix(security): prevent domain restriction bypass in controller actions - Add domain validation to controller.click() and controller.type() methods - Implement comprehensive security checks before executing actions - Prevent potential prompt injection and unauthorized data access - Add extensive test coverage for domain validation scenarios - Update documentation with security considerations This critical fix prevents complete bypass of domain restrictions that could enable attackers to perform unauthorized actions on any domain. 
commit `e559ff5eaa` Merge: `19ae8a11` `f348e0c5` Author: Nick Sweeting <git@sweeting.me> Date: Sat Jun 14 01:56:09 2025 -0700 Merge branch 'main' into main commit `19ae8a1146` Merge: `e1b3ff9e` `08ed0be3` Author: Nick Sweeting <git@sweeting.me> Date: Sat Jun 14 00:31:30 2025 -0700 Merge branch 'main' into main commit `e1b3ff9e9d` Author: Ilya Biryukov <ilbiryuk@microsoft.com> Date: Thu Jun 12 17:40:40 2025 -0700 Revert changes to examples/features/multiple_agents_same_browser.py commit `d20a3b55d6` Author: Ilya Biryukov <ilbiryuk@microsoft.com> Date: Thu Jun 12 17:30:59 2025 -0700 Fix pre-commit lint issues and compile error in multiple_agents_same_browser commit `13d5468aa2` Author: Ilya Biryukov <ilbiryuk@microsoft.com> Date: Thu Jun 12 14:07:21 2025 -0700 Fix cross-origin iframe DOM retrieval	2025-06-20 10:51:06 +02:00
Magnus Müller	a9cf53a1b1	Set user_data_dir to None (#2015) ## Summary by cubic Changed browser session setup to use incognito mode by setting user_data_dir to None, preventing persistent state between evaluation runs.	2025-06-20 10:41:19 +02:00
Magnus Müller	0f9ffa1072	Set user_data_dir to None	2025-06-20 10:38:01 +02:00
Magnus Müller	907867f976	Refactor agent history update logic to handle None model_output case, ensuring proper logging and description formatting for failed parsing scenarios.	2025-06-20 09:28:23 +02:00
Magnus Müller	eda5140363	Improve error logging in message manager by appending ellipsis to truncated error messages for better clarity in action results.	2025-06-20 08:27:46 +02:00
Magnus Müller	1a891202ea	Enhance system prompt documentation by clarifying the structure of the JSON response. Added important notes regarding the top-level elements "current_state" and "action" to improve understanding of the expected format.	2025-06-20 08:25:48 +02:00
Nick Sweeting	5aa62c1113	Fix cross-origin iframe DOM retrieval (#1965)	2025-06-19 23:01:49 -07:00
Nick Sweeting	d8a9d21b00	Fix critical domain restriction bypass vulnerability (#2006)	2025-06-19 23:01:21 -07:00
Magnus Müller	3af0a37e71	Rename step information	2025-06-20 07:47:35 +02:00
Magnus Müller	8b93dc626f	Only show task once	2025-06-20 07:35:02 +02:00
Magnus Müller	c4f1b5f935	Fix parsing error handling - the entire tool call is in tool_call_args. Before it parsed it wrong with no error	2025-06-20 07:32:24 +02:00
mertunsall	4e55d7c886	Update test case for max string length validation to use MAX_TASK_LENGTH constant, improving maintainability and clarity in error messages.	2025-06-20 01:21:49 +02:00
mertunsall	0c2de169f8	Update maximum length constants in cloud_events.py to accommodate larger data sizes: MAX_STRING_LENGTH increased to 100K, MAX_URL_LENGTH and MAX_TASK_LENGTH adjusted to 10K.	2025-06-20 01:13:23 +02:00
mertunsall	7cdcbc1385	Remove the test_extract_content_action method from TestControllerIntegration, streamlining the test suite by eliminating outdated or redundant tests.	2025-06-20 01:06:28 +02:00
Mert Unsal	b77393917d	Apply suggestions from code review Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2025-06-20 01:00:39 +02:00
Mert Unsal	198dd161b8	Update browser_use/agent/system_prompt.md Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2025-06-20 00:56:37 +02:00
Mert Unsal	832c59c9c3	Update browser_use/agent/system_prompt.md Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>	2025-06-20 00:56:17 +02:00
Magnus Müller	2ee381d283	Refactor message_manager tests to use a temporary file system path, enhancing isolation and reliability of test cases.	2025-06-20 00:18:48 +02:00
Magnus Müller	831682e2a3	Add FileSystem dependency to message_manager tests	2025-06-20 00:13:06 +02:00


3270 Commits