--- title: "Custom Functions" description: "Extend default agent and write custom action functions to do certain tasks" icon: "function" --- Custom actions are functions *you* provide, that are added to our [default actions](https://github.com/browser-use/browser-use/blob/main/browser_use/controller/service.py) the agent can use to accomplish tasks. Action functions can request [arbitrary parameters](#action-parameters-via-pydantic-model) that the LLM has to come up with + a fixed set of [framework-provided arguments](#framework-provided-parameters) for browser APIs / `Agent(context=...)` / etc. Our default set of actions is already quite powerful, the built-in `Controller` provides basics like `open_tab`, `scroll_down`, `extract_content`, [and more](https://github.com/browser-use/browser-use/blob/main/browser_use/controller/service.py). It's easy to add your own actions to implement additional custom behaviors, integrations with other apps, or performance optimizations. For examples of custom actions (e.g. uploading files, asking a human-in-the-loop for help, drawing a polygon with the mouse, and more), see [examples/custom-functions](https://github.com/browser-use/browser-use/tree/main/examples/custom-functions). ## Action Function Registration To register your own custom functions (which can be `sync` or `async`), decorate them with the `@controller.action(...)` decorator. This saves them into the `controller.registry`. ```python from browser_use import Controller, ActionResult controller = Controller() @controller.action('Ask human for help with a question', domains=['example.com']) # pass allowed_domains= or page_filter= to limit actions to certain pages def ask_human(question: str) -> ActionResult: answer = input(f'{question} > ') return ActionResult(extracted_content=f'The human responded with: {answer}', include_in_memory=True) ``` ```python # Then pass your controller to the agent to use it agent = Agent( task='...', llm=llm, controller=controller, ) ``` Keep your action function names and descriptions short and concise: - The LLM chooses between actions to run solely based on the function name and description - The LLM decides how to fill action params based on their names, type hints, & defaults --- ## Action Parameters Browser Use supports two patterns for defining action parameters: normal function arguments, or a Pydantic model. ### Function Arguments For simple actions that don't need default values, you can define the action parameters directly as arguments to the function. This one takes a single string argument, `css_selector`. When the LLM calls an action, it sees its argument names & types, and will provide values that fit. ```python @controller.action('Click element') def click_element(css_selector: str, page: Page) -> ActionResult: # css_selector is an action param the LLM must provide when calling # page is a special framework-provided param to access the browser APIs (see below) await page.locator(css_selector).click() return ActionResult(extracted_content=f"Clicked element {css_selector}") ``` ### Pydantic Model You can define a pydantic model for the parameters your action expects by setting a `@controller.action(..., param_model=MyParams)`. This allows you to use optional parameters, default values, `Annotated[...]` types with custom validation, field descriptions, and other features offered by pydantic. When the agent calls calls your agent function, an instance of your model with the values filled by the LLM will be passed as the argument named `params` to your action function. Using a pydantic model is helpful because it allows more flexibility and power to enforce the schema of the values the LLM should provide. The LLM gets the entire pydantic JSON schema for your `param_model`, it will see the function name & description + individual field names, types, descriptions, and default values. ```python from typing import Annotated from pydantic import BaseModel, AfterValidator from browser_use import ActionResult class MyParams(BaseModel): field1: int field2: str = 'default value' field3: Annotated[str, AfterValidator(lambda s: s.lower())] # example: enforce always lowercase field4: str = Field(default='abc', description='Detailed description for the LLM') @controller.action('My action', param_model=MyParams) def my_action(params: MyParams, page: Page) -> ActionResult: await page.keyboard.type(params.field2) return ActionResult(extracted_content=f"Inputted {params} on {page.url}") ``` Any special framework-provided arguments (e.g. `page`) will be passed as separate positional arguments after `params`. To use a `BaseModel` the arg *must* be called `params`. Action function args are matched and filled like named arguments; arg order doesn't matter but names and types do. ### Framework-Provided Parameters These special action parameters are injected by the `Controller` and are passed as extra args to any actions that expect them. For example, actions that need to run playwright code to interact with the browser should take the argument `page` or `browser_session`. - `page: Page` - The current Playwright page (shortcut for `browser_session.get_current_page()`) - `browser_session: BrowserSession` - The current browser session (and playwright context via `browser_session.browser_context`) - `context: AgentContext` - Any optional top-level context object passed to the Agent, e.g. `Agent(context=user_provided_obj)` - `page_extraction_llm: BaseChatModel` - LLM instance used for page content extraction - `available_file_paths: list[str]` - List of available file paths for upload / processing - `has_sensitive_data: bool` - Whether the action content contains sensitive data markers (check this to avoid logging sensitive data to terminal by accident) #### Example: Action uses the current `page` ```python from playwright.async_api import Page from browser_use import Controller, ActionResult controller = Controller() @controller.action('Type keyboard input into a page') async def input_text_into_page(text: str, page: Page) -> ActionResult: await page.keyboard.type(text) return ActionResult(extracted_content='Website opened') ``` #### Example: Action uses the `browser_context` ```python from browser_use import BrowserSession, Controller, ActionResult controller = Controller() @controller.action('Open website') async def open_website(url: str, browser_session: BrowserSession) -> ActionResult: # find matching existing tab by looking through all pages in playwright browser_context all_tabs = await browser_session.browser_context.pages for tab in all_tabs: if tab.url == url: await tab.bring_to_foreground() return ActionResult(extracted_content=f'Switched to tab with url {url}') # otherwise, create a new tab new_tab = await browser_session.browser_context.new_page() await new_tab.goto(url) return ActionResult(extracted_content=f'Opened new tab with url {url}') ``` --- ## Important Rules 1. **Return an [`ActionResult`](https://github.com/search?q=repo%3Abrowser-use%2Fbrowser-use+%22class+ActionResult%28BaseModel%29%22&type=code)**: All actions should return an `ActionResult | str | None`. The stringified version of the result is passed back to the LLM, and optionally persisted in the long-term memory when `ActionResult(..., include_in_memory=True)`. 2. **Type hints on arguments are required**: They are used to verify that action params don't conflict with special arguments injected by the controller (e.g. `page`) 3. **Actions functions called directly must be passed kwargs**: When calling actions from other actions or python code, you must **pass all parameters as kwargs only**, even though the actions are usually defined using positional args (for the same reasons as [pluggy](https://pluggy.readthedocs.io/en/stable/index.html#calling-hooks)). Action arguments are always matched by name and type, **not** positional order, so this helps prevent ambiguity / reordering issues while keeping action signatures short. ```python @controller.action('Fill in the country form field') def input_country_field(country: str, page: Page) -> ActionResult: await some_action(123, page=page) # ❌ not allowed: positional args, use kwarg syntax when calling await some_action(abc=123, page=page) # ✅ allowed: action params & special kwargs await some_other_action(params=OtherAction(abc=123), page=page) # ✅ allowed: params=model & special kwargs ``` ```python # Using Pydantic Model to define action params (recommended) class PinCodeParams(BaseModel): code: int retries: int = 3 # ✅ supports optional/defaults @controller.action('...', param_model=PinCodeParams) async def input_pin_code(params: PinCodeParams, page: Page): ... # ✅ special params at the end # Using function arguments to define action params async def input_pin_code(code: int, retries: int, page: Page): ... # ✅ params first, special params second, no defaults async def input_pin_code(code: int, retries: int=3): ... # ✅ defaults ok only if no special params needed async def input_pin_code(code: int, retries: int=3, page: Page): ... # ❌ Python SyntaxError! not allowed ``` --- ## Reusing Custom Actions Across Agents You can use the same controller for multiple agents. ```python controller = Controller() # ... register actions to the controller agent = Agent( task="Go to website X and find the latest news", llm=llm, controller=controller ) # Run the agent await agent.run() agent2 = Agent( task="Go to website Y and find the latest news", llm=llm, controller=controller ) await agent2.run() ``` The controller is stateless and can be used to register multiple actions and multiple agents. ## Exclude functions If you want to exclude some registered actions and make them unavailable to the agent, you can do: ```python controller = Controller(exclude_actions=['open_tab', 'search_google']) agent = Agent(controller=controller, ...) ``` If you want actions to only be available on certain pages, and to not tell the LLM about them on other pages, you can use the `allowed_domains` and `page_filter`: ```python from pydantic import BaseModel from browser_use import Controller, ActionResult controller = Controller() async def is_ai_allowed(page: Page): if api.some_service.check_url(page.url): logger.warning('Allowing AI agent to visit url:', page.url) return True return False @controller.action('Fill out secret_form', allowed_domains=['https://*.example.com'], page_filter=is_ai_allowed) def fill_out_form(...) -> ActionResult: ... will only be runnable by LLM on pages that match https://*.example.com *AND* where is_ai_allowed(page) returns True ```