mirror of
https://github.com/browser-use/browser-use
synced 2026-05-06 17:52:15 +02:00
---
title: "Basics"
description: "Low-level Playwright-like browser automation with direct and full CDP control and precise element interactions"
icon: "code"
mode: "wide"
---

## Core Architecture

```mermaid
graph TD
    A[Browser] --> B[Page]
    B --> C[Element]
    B --> D[Mouse]
    B --> E[AI Features]
    C --> F[DOM Interactions]
    D --> G[Coordinate Operations]
    E --> H[LLM Integration]
```

### Core Classes

- **Browser** (alias: **BrowserSession**): Main session manager
- **Page**: Represents a browser tab/iframe
- **Element**: Individual DOM element operations
- **Mouse**: Coordinate-based mouse operations

## Basic Usage

```python
import asyncio

from browser_use import Browser, Agent
from browser_use.llm.openai import ChatOpenAI

async def main():
    llm = ChatOpenAI(api_key="your-api-key")
    browser = Browser()
    await browser.start()

    # 1. Actor: Precise navigation and element interactions
    page = await browser.new_page("https://github.com/login")
    email_input = await page.must_get_element_by_prompt("username field", llm=llm)
    await email_input.fill("your-username")

    # 2. Agent: AI-driven complex tasks (the task is passed to the constructor)
    agent = Agent(task="Complete login and navigate to my repositories", browser=browser, llm=llm)
    await agent.run()

    await browser.stop()

# Run the coroutine from a regular script
asyncio.run(main())
```

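The example above calls `browser.stop()` only if every earlier step succeeds. One way to guarantee cleanup is a small async context manager, sketched below. The names `running`, `FakeSession`, and `demo` are hypothetical and not part of browser-use; the sketch only assumes an object with awaitable `start()` and `stop()` methods, as `Browser` has in the example above.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def running(session):
    """Start `session`, yield it, and stop it even if the body raises.

    `session` is any object with awaitable start() and stop() methods.
    """
    await session.start()
    try:
        yield session
    finally:
        await session.stop()


class FakeSession:
    """Stand-in for a browser session so the sketch runs without a browser."""

    def __init__(self):
        self.events = []

    async def start(self):
        self.events.append("start")

    async def stop(self):
        self.events.append("stop")


async def demo():
    session = FakeSession()
    try:
        async with running(session):
            # Simulate a failure in the middle of an automation task
            raise RuntimeError("simulated failure mid-task")
    except RuntimeError:
        pass
    # stop() was still called despite the error
    return session.events
```

With a real session, the same pattern would read `async with running(Browser()) as browser: ...`, assuming `Browser` can be constructed and started as shown in the example above.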
## Important Notes

- **Not Playwright**: Actor is built on CDP, not Playwright. The API mirrors Playwright's where possible to ease migration, but it covers only a subset of it.
- **Immediate Returns**: `get_elements_by_css_selector()` returns immediately and doesn't wait for elements to become visible
- **Manual Timing**: You are responsible for navigation timing and waiting
- **JavaScript Format**: `evaluate()` requires arrow function format: `() => {}`
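
Because element queries return immediately and timing is left to the caller, a small polling helper can fill the gap. The following is a minimal sketch, not part of the library: `poll_until_found` is a hypothetical name, and it only assumes that the query is an async callable returning a (possibly empty) sequence.

```python
import asyncio
import time
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")


async def poll_until_found(
    query: Callable[[], Awaitable[Sequence[T]]],
    timeout: float = 5.0,
    interval: float = 0.1,
) -> Sequence[T]:
    """Call `query` repeatedly until it returns a non-empty result.

    Raises TimeoutError if nothing is found within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = await query()
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing matched within {timeout} seconds")
        # Yield to the event loop before the next attempt
        await asyncio.sleep(interval)
```

Assuming `get_elements_by_css_selector()` is awaited on a page object, a call might look like `await poll_until_found(lambda: page.get_elements_by_css_selector("button"))`. Note that this waits only for a non-empty match, not for visibility.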