8 Commits

Author SHA1 Message Date
Magnus Müller
a19b583430 remove 2 longer tests because this test mainly shows if browseruse works end to end 2025-09-08 08:35:18 -07:00
Philipp Wiederkehr
de1e3e06de Update tests/agent_tasks/google_maps_3d.yaml
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2025-07-01 12:20:40 +02:00
Philipp Wiederkehr
4b753f9a6e Create google_maps_3d.yaml
Added a new task to test the agent's capability with sophisticated Google Maps queries.
2025-07-01 12:17:57 +02:00
Nick Sweeting
fe78e45a65 add waits into cloudflare eval test 2025-06-09 20:37:25 -07:00
Nick Sweeting
2540693482 tweak agent_tasks for evals slightly 2025-06-09 20:20:41 -07:00
Magnus Müller
38cfa86738 Update browser_use_pip.yaml to simplify output requirements and refactor run_task in evaluate_tasks.py to remove shared profile parameter, enhancing browser session management with a dedicated profile. 2025-06-07 11:50:30 +02:00
Magnus Müller
25c96bcaca Add README for contributing agent tasks with guidelines and testing instructions 2025-06-07 00:51:32 +02:00
Magnus Müller
2c355ce6d9 Add new agent task tests for Amazon laptop search, browser-use pip installation, and Cloudflare captcha solving 2025-06-07 00:49:39 +02:00