Commit Graph

315 Commits

Author SHA1 Message Date
Magnus Müller
99dad56141 Run v8 2025-10-29 23:07:09 -07:00
Magnus Müller
04e987df99 Add stacked DOM elements test for complex scenarios
- Introduced a new HTML template for testing stacked DOM elements, including open and closed shadow DOMs, same-origin and cross-origin iframes, and a final button.
- Implemented a test case to validate click functionality through these stacked elements, ensuring that all interactive elements are clickable and that the click counter reflects the expected number of clicks.
- Enhanced the browser session configuration to accommodate a taller window size for better visibility of stacked elements.

This update aims to improve the robustness of DOM interaction tests and ensure comprehensive coverage of complex scenarios in the browser automation framework.
2025-10-26 23:16:06 -07:00
Magnus Müller
8ff9dbe090 Remove unnecessary blank lines in the eval-on-pr workflow file to improve readability and maintainability. 2025-10-26 16:40:57 -07:00
Magnus Müller
67a0919fe1 remove comment 2025-10-26 16:10:47 -07:00
Magnus Müller
c23137f21e New logs 2025-10-26 16:07:03 -07:00
Magnus Müller
45448477d6 New logs 2025-10-26 15:51:33 -07:00
Magnus Müller
3b09df2b01 New logs 2025-10-26 15:43:31 -07:00
Magnus Müller
bf41b0ef23 Pass PR branch name to evaluation platform
- Add branchName parameter to API call
- This allows the UI to show the actual PR branch instead of 'main'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 15:34:41 -07:00
Magnus Müller
d3fdc3bbfd New logs 2025-10-26 15:28:20 -07:00
Magnus Müller
0a896a0eaf New logs 2025-10-26 15:19:19 -07:00
Magnus Müller
162a8d3a2d New logs 2025-10-26 15:15:04 -07:00
Magnus Müller
83ccb630cc New logs 2025-10-26 15:03:04 -07:00
Magnus Müller
9915af7e1b New logs 2025-10-26 15:00:07 -07:00
Magnus Müller
2e083f050d Remove commented options from eval-on-pr.yml to streamline the workflow configuration 2025-10-26 14:55:48 -07:00
Magnus Müller
b09af25e86 Remove commented options from eval-on-pr.yml to streamline the workflow configuration 2025-10-26 14:54:14 -07:00
Magnus Müller
a0bb331c7c Update error message in eval-on-pr.yml for clarity 2025-10-26 13:27:32 -07:00
Magnus Müller
7b962095e7 Update error message in eval-on-pr.yml for clarity 2025-10-26 13:24:59 -07:00
Magnus Müller
659c0d5652 Update error message in eval-on-pr.yml for clarity 2025-10-26 13:17:36 -07:00
Magnus Müller
c1e86bc46d Update error message in eval-on-pr.yml for clarity 2025-10-26 13:13:24 -07:00
Magnus Müller
9e05f10b34 New logs 2025-10-26 13:08:46 -07:00
Magnus Müller
37df2bdc69 New logs 2025-10-26 13:04:03 -07:00
Magnus Müller
e7b628ff95 New logs 2025-10-26 12:53:47 -07:00
Magnus Müller
394484d9ec New logs 2025-10-26 12:49:29 -07:00
Magnus Müller
89fbccd5c3 New logs 2025-10-26 12:37:07 -07:00
Magnus Müller
81edf51b44 New logs 2025-10-26 12:24:53 -07:00
Magnus Müller
82300c7ca5 New logs 2025-10-26 11:58:40 -07:00
Magnus Müller
0a81e787ce New logs 2025-10-26 11:07:58 -07:00
Magnus Müller
bdc9f024a8 Less errors 2025-10-26 11:04:51 -07:00
Magnus Müller
a9d3c4dd59 Less errors 2025-10-26 10:46:40 -07:00
Magnus Müller
d64004a039 Less errors 2025-10-26 10:43:11 -07:00
Magnus Müller
1e4ea53616 Update 2025-10-26 10:40:45 -07:00
Magnus Müller
974069e9dd Trigger eval 2025-10-26 10:36:02 -07:00
Magnus Müller
a46deaec61 eval on pr 2025-10-26 10:27:33 -07:00
Magnus Müller
3c7cdcc385 Remove frozen 2025-10-25 10:44:06 -07:00
Magnus Müller
863aaff487 Less retry 2025-10-25 10:42:13 -07:00
Magnus Müller
d6a0193d96 Api keys 2025-10-25 10:38:05 -07:00
Magnus Müller
ad6dac41c8 Cache uv 2025-10-25 10:33:47 -07:00
Magnus Müller
ecf1b9950b Linter 2025-10-25 10:31:31 -07:00
Magnus Müller
16d8833048 Api key rename 2025-10-25 10:24:40 -07:00
Magnus Müller
df4babf9a1 Env keys 2025-10-25 10:21:59 -07:00
Magnus Müller
6e31a05acd Split model tests 2025-10-25 10:16:46 -07:00
Magnus Müller
aef11dc6f1 Parallel tests 2025-10-25 09:56:33 -07:00
Magnus Müller
4db51f9e24 Ci cd test update 2025-10-25 09:12:15 -07:00
Magnus Müller
9d2e379af7 Api key 2025-10-24 02:40:35 -07:00
Magnus Müller
c1abe88048 Higher timeout + weekly chrome cache 2025-10-24 02:30:29 -07:00
Magnus Müller
9f711b4281 Reduce timeout 2025-10-24 01:41:22 -07:00
Magnus Müller
46e727f7bd Enhance CI workflow by adding conditional caching for Chromium installation
This update introduces a caching mechanism for Chromium binaries in the GitHub Actions workflow. The installation of Chromium will now only occur if it is not already cached, reducing unnecessary downloads and speeding up the CI process. This change aims to optimize the workflow efficiency, particularly during parallel test runs.
2025-10-24 01:28:39 -07:00
Magnus Müller
e6cb8e7587 Add explicit uv package caching to speed up CI
The built-in astral-sh/setup-uv cache wasn't working properly,
causing 2m+ downloads of Python packages on every test run.

Added explicit ~/.cache/uv caching keyed on uv.lock hash.

Before: 2m 9s downloading packages (numpy, oci, imageio-ffmpeg, etc)
After: ~5s restoring from cache

Saves ~2 minutes per test job × 40 parallel jobs = 80 runner-minutes saved
2025-10-24 01:25:03 -07:00
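
Note on e6cb8e7587: "keyed on uv.lock hash" means the cache key is derived from the lock file's contents, so the cache is reused until a dependency changes. The workflow file itself is not shown in this log, so the following is only a minimal Python sketch of the idea. The function name `uv_cache_key` and the `uv-cache` prefix are hypothetical, and SHA-256 is an assumed choice of digest.

```python
import hashlib
import tempfile
from pathlib import Path


def uv_cache_key(lock_file: Path, prefix: str = "uv-cache") -> str:
    """Return a cache key that changes whenever the lock file's bytes change.

    Hypothetical helper: the name and prefix are illustrative and are not
    taken from the repository's workflow.
    """
    digest = hashlib.sha256(lock_file.read_bytes()).hexdigest()
    return f"{prefix}-{digest}"


if __name__ == "__main__":
    # Demonstrate on a throwaway file standing in for uv.lock.
    with tempfile.TemporaryDirectory() as tmp:
        lock = Path(tmp) / "uv.lock"

        lock.write_text("numpy==1.0\n")
        first = uv_cache_key(lock)

        # Same contents give the same key, so the cache would be restored.
        assert uv_cache_key(lock) == first

        # Changed contents give a new key, so the cache would be rebuilt.
        lock.write_text("numpy==2.0\n")
        assert uv_cache_key(lock) != first
```

Keying on the lock file rather than on a date or run number means unrelated commits share one cache entry, which fits the restore time of about 5s reported in the message.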
Magnus Müller
b31c3ab0b4 no sync pip 2025-10-24 01:24:41 -07:00
Magnus Müller
7b0b92991c Optimize lint workflow by skipping redundant uv sync checks
Add --no-sync flag to uv run commands that come after uv sync.

Before:
- uv sync (50s)
- uv run pre-commit (50s sync check + 8s run = 58s)
- Total: 108s

After:
- uv sync (50s)
- uv run --no-sync pre-commit (8s run)
- Total: 58s

Saves 50 seconds per lint job (46% faster)
2025-10-24 01:23:46 -07:00
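
Note on 7b0b92991c: the before/after figures in the message can be reproduced with a few lines of arithmetic. This is a sketch using only the timings quoted above (50s sync, 50s redundant sync check, 8s pre-commit run); the function name `lint_job_seconds` is hypothetical.

```python
def lint_job_seconds(sync_s: int, run_s: int, skip_sync_check: bool) -> int:
    """Total lint job time: one explicit `uv sync`, then one `uv run`.

    Without --no-sync, `uv run` repeats the sync check first; the commit
    message reports that check as taking as long as the sync itself.
    """
    recheck_s = 0 if skip_sync_check else sync_s
    return sync_s + recheck_s + run_s


before = lint_job_seconds(sync_s=50, run_s=8, skip_sync_check=False)
after = lint_job_seconds(sync_s=50, run_s=8, skip_sync_check=True)
saved = before - after
percent_faster = round(100 * saved / before)

print(before, after, saved, percent_faster)  # prints: 108 58 50 46
```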