Magnus Müller | a19b583430 | remove 2 longer tests because this test mainly shows if browseruse works end to end | 2025-09-08 08:35:18 -07:00
Philipp Wiederkehr | de1e3e06de | Update tests/agent_tasks/google_maps_3d.yaml (Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>) | 2025-07-01 12:20:40 +02:00
Philipp Wiederkehr | 4b753f9a6e | Create google_maps_3d.yaml. Added a new task to test the agent's capability with sophisticated Google Maps queries. | 2025-07-01 12:17:57 +02:00
Nick Sweeting | fe78e45a65 | add waits into cloudflare eval test | 2025-06-09 20:37:25 -07:00
Nick Sweeting | 2540693482 | tweak agent_tasks for evals slightly | 2025-06-09 20:20:41 -07:00
Magnus Müller | 38cfa86738 | Update browser_use_pip.yaml to simplify output requirements and refactor run_task in evaluate_tasks.py to remove shared profile parameter, enhancing browser session management with a dedicated profile. | 2025-06-07 11:50:30 +02:00
Magnus Müller | 25c96bcaca | Add README for contributing agent tasks with guidelines and testing instructions | 2025-06-07 00:51:32 +02:00
Magnus Müller | 2c355ce6d9 | Add new agent task tests for Amazon laptop search, browser-use pip installation, and Cloudflare captcha solving | 2025-06-07 00:49:39 +02:00