---
title: "Add Tools"
description: ""
icon: "plus"
mode: "wide"
---

Examples of what custom tools can do:
- deterministic clicks
- file handling
- calling APIs
- human-in-the-loop
- browser interactions
- calling LLMs
- getting 2FA codes
- sending emails
- Playwright integration (see [GitHub example](https://github.com/browser-use/browser-use/blob/main/examples/browser/playwright_integration.py))
- ...

To register a function as a tool, add the `@tools.action(...)` decorator to it.

```python
from browser_use import Tools, Agent, ActionResult

tools = Tools()

@tools.action(description='Ask human for help with a question')
def ask_human(question: str) -> ActionResult:
    # Pause and let a person answer on the console
    answer = input(f'{question} > ')
    return ActionResult(extracted_content=f'The human responded with: {answer}')
```

```python
agent = Agent(task='...', llm=llm, tools=tools)
```

- **`description`** *(required)* - What the tool does. The LLM uses this to decide when to call it.
- **`allowed_domains`** - List of domains where the tool can run (e.g. `['*.example.com']`). Defaults to all domains.

The Agent fills your function's parameters based on their names, type hints, and defaults.

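To make this concrete, the sketch below uses Python's standard `inspect` module to list what a function signature exposes: parameter names, type hints, and defaults. It only illustrates the idea. `describe_parameters` and `search_flights` are hypothetical names, and Browser-Use's own handling of signatures may differ.

```python
import inspect


def describe_parameters(func):
    """Return {name: (type hint, default or 'required')} for a function."""
    described = {}
    for name, param in inspect.signature(func).parameters.items():
        # A missing annotation or default is reported as inspect.Parameter.empty
        hint = param.annotation if param.annotation is not inspect.Parameter.empty else None
        default = 'required' if param.default is inspect.Parameter.empty else param.default
        described[name] = (hint, default)
    return described


# Hypothetical tool function, used only for illustration.
def search_flights(destination: str, max_price: int = 500):
    return f'Searching flights to {destination} under {max_price} USD'


print(describe_parameters(search_flights))
```

Descriptive parameter names and precise type hints are the only information about each argument that the signature carries, so they are worth choosing carefully.
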
## Available Objects

Your function can access these objects by declaring them as parameters:

- **`browser_session: BrowserSession`** - Current browser session for CDP access
- **`cdp_client`** - Direct Chrome DevTools Protocol client
- **`page_extraction_llm: BaseChatModel`** - The LLM you pass into the agent. You can use it to make custom LLM calls inside your tool.
- **`file_system: FileSystem`** - File system access
- **`available_file_paths: list[str]`** - Available files for upload/processing
- **`has_sensitive_data: bool`** - Whether the action contains sensitive data

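One way to picture how such objects could reach your function is injection by parameter name. The sketch below is not Browser-Use's implementation. `call_tool`, `count_files`, and the `context` dictionary are hypothetical, and the code only shows the general pattern of passing a special object when the function declares a parameter with a matching name.

```python
import inspect


def call_tool(func, llm_args, context):
    """Call func with LLM-supplied arguments plus any context objects it declares."""
    kwargs = dict(llm_args)
    for name in inspect.signature(func).parameters:
        # Inject a context object only if the function asks for it by name.
        if name in context and name not in kwargs:
            kwargs[name] = context[name]
    return func(**kwargs)


# Hypothetical tool: `folder` comes from the LLM, `available_file_paths` is injected.
def count_files(folder: str, available_file_paths: list[str]):
    matches = [p for p in available_file_paths if p.startswith(folder)]
    return f'{len(matches)} file(s) available in {folder}'


context = {
    'available_file_paths': ['docs/a.pdf', 'docs/b.pdf', 'img/c.png'],
    'has_sensitive_data': False,
}
print(call_tool(count_files, {'folder': 'docs/'}, context))
```

In this pattern a tool only receives the objects it names, so functions that need nothing beyond their LLM-supplied arguments stay simple.
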
## Browser Interaction Examples

You can use `browser_session` to interact directly with page elements using CSS selectors:

```python
from browser_use import Tools, Agent, ActionResult, BrowserSession

tools = Tools()

@tools.action(description='Click the submit button using CSS selector')
async def click_submit_button(browser_session: BrowserSession):
    # Get the current page
    page = await browser_session.must_get_current_page()

    # Get element(s) by CSS selector
    elements = await page.get_elements_by_css_selector('button[type="submit"]')

    if not elements:
        return ActionResult(extracted_content='No submit button found')

    # Click the first matching element
    await elements[0].click()

    return ActionResult(extracted_content='Submit button clicked!')
```

Available methods on `Page`:
- `get_elements_by_css_selector(selector: str)` - Returns a list of matching elements
- `get_element_by_prompt(prompt: str, llm)` - Returns an element, or `None`, using the LLM
- `must_get_element_by_prompt(prompt: str, llm)` - Returns an element or raises an error

Available methods on `Element`:
- `click()` - Click the element
- `type(text: str)` - Type text into the element
- `get_text()` - Get the element's text content
- See `browser_use/actor/element.py` for more methods

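The methods listed above can be combined into small helpers that a tool calls. The following is a minimal sketch, assuming `type()` is awaited the same way `click()` is in the example above. `type_into_first_match`, `FakePage`, and `FakeElement` are hypothetical names. The fake classes are stand-ins so the snippet runs without a browser, and a real tool would pass the page returned by `browser_session.must_get_current_page()` instead.

```python
import asyncio


async def type_into_first_match(page, selector: str, text: str) -> str:
    """Type text into the first element matching selector and report the result."""
    elements = await page.get_elements_by_css_selector(selector)
    if not elements:
        return f'No element found for {selector}'
    await elements[0].type(text)
    return f'Typed {len(text)} characters into {selector}'


# Stand-in objects (not part of Browser-Use) that mimic the methods used above.
class FakeElement:
    def __init__(self):
        self.value = ''

    async def type(self, text: str):
        self.value += text


class FakePage:
    def __init__(self, elements_by_selector):
        self.elements_by_selector = elements_by_selector

    async def get_elements_by_css_selector(self, selector: str):
        return self.elements_by_selector.get(selector, [])


search_box = FakeElement()
page = FakePage({'input[name="q"]': [search_box]})
print(asyncio.run(type_into_first_match(page, 'input[name="q"]', 'browser automation')))
```

Returning a short message for the "no match" case, rather than raising, mirrors the `click_submit_button` example and gives the agent something to react to.
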
## Pydantic Input

You can use Pydantic models for tool parameters:

```python
import json

from pydantic import BaseModel, Field

class Cars(BaseModel):
    name: str = Field(description='The name of the car, e.g. "Toyota Camry"')
    price: int = Field(description='The price of the car as int in USD, e.g. 25000')

@tools.action(description='Save cars to file')
def save_cars(cars: list[Cars]) -> str:
    with open('cars.json', 'w') as f:
        # Convert the models to plain dicts so json can serialize them
        json.dump([car.model_dump() for car in cars], f)
    return f'Saved {len(cars)} cars to file'

task = "find cars and save them to file"
```

## Domain Restrictions

Limit tools to specific domains:

```python
@tools.action(
    description='Fill out banking forms',
    allowed_domains=['https://mybank.com']
)
def fill_bank_form(account_number: str) -> str:
    # Only works on mybank.com
    return f'Filled form for account {account_number}'
```

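Patterns such as `*.example.com` read like shell-style globs. As a rough illustration of what matching a URL against such a list can look like, the sketch below uses Python's standard `fnmatch` and `urllib.parse` modules. It is an assumption made for illustration, not Browser-Use's actual matching logic, and `is_domain_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def is_domain_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if the URL's hostname matches any allowed pattern."""
    hostname = urlparse(url).hostname or ''
    for pattern in allowed_domains:
        # Accept patterns written with or without a scheme.
        pattern_host = urlparse(pattern).hostname if '://' in pattern else pattern
        if pattern_host and fnmatch(hostname, pattern_host):
            return True
    return False


print(is_domain_allowed('https://app.example.com/login', ['*.example.com']))
print(is_domain_allowed('https://mybank.com/transfer', ['https://mybank.com']))
print(is_domain_allowed('https://evil.com', ['https://mybank.com']))
```

Under this sketch, `*.example.com` matches subdomains but not the bare `example.com`, so both forms would need to be listed if both should be allowed.
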
## Advanced Example

For a comprehensive example of custom tools with Playwright integration, see:
**[Playwright Integration Example](https://github.com/browser-use/browser-use/blob/main/examples/browser/playwright_integration.py)**

This shows how to create custom actions that use Playwright's precise browser automation alongside Browser-Use.