- Move 4px placeholder screenshot constant to browser_use.browser.views
- Update all references to use the single definition
- Fix all lint errors and formatting issues

Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com>

Replaces static page statistics with dynamic calculations for
viewport and scroll position metrics. Simplifies code logic in
scroll handling by removing unnecessary defaults for page
scrolling. Improves readability and reliability of page
scroll operations by requiring an explicit num_pages parameter.
Enhances page-view consistency and user interaction handling.

Replaces double quotes with single quotes for consistency in
string declarations. Removes unnecessary trailing whitespace
across various files to improve code readability and maintain
a uniform style.

Adds comprehensive page information including viewport and total
page dimensions, detailed scroll position, and page statistics.
Updates message prompts to utilize this data, improving content
presentation and potential user actions.
Relates to enhanced scrolling feature.

* Validator
* Test mind2web
* Cleaned up logger
* Pytest logger
* Cleaned up logger
* Disable flag for human input
* Multiple clicks per button
* Multiple clicks per button
* More structured system prompt
* Fields with description
* System prompt example
* One logger
* Cleaner logging
* Log step in step function
* Fix critical clicking error - wrong argument used
* Improved thought process of agent
* Improve system prompt
* Remove human input message
* Custom action registration
* Pydantic model for custom actions
* Pydantic model for custom output
* Runs through, model outputs functions, but not called yet
* Work in progress - description for custom actions
* Description works, but schema not yet
* Model can call the right action - but it is not executed
* Separate is_controller_action and is_custom_action
* Works! Model can call custom function
* Use registry for action, but result is not fed back to model
* Include result in messages
* Works with custom function - but typing is not correct
* Renamed registry
* First test cases
* Captcha tests
* Pydantic for tests
* Improve prompts for multi-step
* System prompt structure
* Handle errors like validation error
* Refactor error handling in agent
* Refactor error handling in agent
* Improved logging
* Update view
* Fix click parameter to index
* Simplify dynamic actions
* Use run instead of step
* Rename history
* Rename AgentService to Agent
* Rename ControllerService to Controller
* Pytest file
* Rename get state
* Rename BrowserService
* changed dom extraction from recursion to a while loop
* Rename use_vision
* Rename use_vision
* reversed dom tree items and made browser less annoying
* Renaming and fixing type errors
* Renamed class names for agent
* updated requirements
* Update prompt
* Action registration works for user and controller
* Fix done call by returning ActionResult
* Fix if result is none
* Rename AgentOutput and ActionModel
* Improved prompt; passes 6/8 tests from test_agent_actions
* Calculate token cost
* Improve display
* Simplified logger
* Test function calling
* created super simple xpath extraction algo
* Tests logging
* tiny fixes to dom extraction
* Remove test
* Don't log number of clicks
* Pytest file
* merged per element js checks
* Check if driver is still open
* super fast processing
* fixed agent planning and stuff
* Fix example
* Fix example
* Improve error
* Improved error correction
* New line for step
* small type error fixes
* Test for pydantic
* Fix line
* Removed sample
* fixed readme and examples
---------
Co-authored-by: magmueller <mamagnus00@gmail.com>