Nick Sweeting
|
e3d21d33a1
|
fix evaluate_tasks.py errors in CI
|
2025-06-16 17:20:27 -07:00 |
|
Nick Sweeting
|
06488e11ba
|
fix clickaction error handling test
|
2025-06-11 00:05:58 -07:00 |
|
Nick Sweeting
|
1fd8e0ec92
|
try statuses-write option
|
2025-06-10 23:57:41 -07:00 |
|
Nick Sweeting
|
ffd36eb5da
|
tweak env vars for CI
|
2025-06-10 03:58:37 -07:00 |
|
Magnus Müller
|
eaab9f04d7
|
Enhance GitHub Actions workflow and evaluate_tasks.py to include detailed task evaluation results. The workflow now captures and displays detailed results in a structured format, while the Python script outputs detailed results as JSON for better integration with GitHub Actions. This improves visibility and understanding of task outcomes in the evaluation process.
|
2025-06-07 13:39:16 +02:00 |
|
Magnus Müller
|
3ecee462a2
|
Update GitHub Actions workflow permissions to allow writing comments on pull requests and issues, enhancing interaction capabilities for automated testing processes.
|
2025-06-07 13:22:43 +02:00 |
|
Magnus Müller
|
bdf29c34fb
|
Add PR comment functionality to GitHub Actions workflow for agent task evaluation results. This includes a summary of passed tasks, percentage score, and status emoji based on task outcomes, enhancing visibility of evaluation results directly in pull requests.
|
2025-06-07 13:16:30 +02:00 |
|
Magnus Müller
|
8d9b24b03a
|
Add summary output for agent tasks evaluation in CI workflow
|
2025-06-07 11:27:06 +02:00 |
|
Magnus Müller
|
52c60f8367
|
Remove dependency on tests job in evaluate-tasks step of CI workflow
|
2025-06-07 11:11:23 +02:00 |
|
Magnus Müller
|
576519ee40
|
Enhance CI workflow by adding agent tasks evaluation step and updating evaluate_tasks.py to output evaluation results
|
2025-06-07 10:59:24 +02:00 |
|
Magnus Müller
|
3666d2b077
|
Add agent tasks evaluation script and update CI workflow to include it
|
2025-06-07 10:49:03 +02:00 |
|
Magnus Müller
|
074169f80f
|
Merge branch 'main' into tests/eval
|
2025-06-07 09:15:41 +02:00 |
|
Nick Sweeting
|
4358924964
|
only install chromium in tests
|
2025-06-06 19:32:20 -07:00 |
|
Magnus Müller
|
42dac3dce1
|
Add API key environment variables to GitHub Actions workflow for enhanced test capabilities
|
2025-06-07 01:11:48 +02:00 |
|
Nick Sweeting
|
8504bc4c7b
|
parallelize playwright tests using loop scope=session and pytest-xdist
|
2025-06-06 02:03:25 -07:00 |
|
Nick Sweeting
|
3d10260543
|
fix missing link between find_tests and test job in CI
|
2025-05-23 19:22:55 -07:00 |
|
Nick Sweeting
|
37a36dbd28
|
catch failure case up-front
|
2025-05-23 19:12:46 -07:00 |
|
Nick Sweeting
|
6a1ed628e3
|
properly split filenames out of ls results in test discovery
|
2025-05-23 19:10:25 -07:00 |
|
Nick Sweeting
|
063f103efd
|
more warning on failure to list tests
|
2025-05-23 19:05:52 -07:00 |
|
Nick Sweeting
|
06ee004a88
|
add assertion to tests discovery
|
2025-05-23 18:33:18 -07:00 |
|
Nick Sweeting
|
9fcd5cd7b2
|
debugging tests discovery
|
2025-05-23 18:32:28 -07:00 |
|
Nick Sweeting
|
e19e1c5dfc
|
fix ci tests
|
2025-05-23 18:29:59 -07:00 |
|
Nick Sweeting
|
3940462d8d
|
Potential fix for code scanning alert no. 28: Workflow does not contain permissions
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
|
2025-05-23 03:50:15 -04:00 |
|
Nick Sweeting
|
6b8360c475
|
better logging
|
2025-05-22 23:17:21 -07:00 |
|
Nick Sweeting
|
18554e2834
|
autodetect tests for ci by looking in folder
|
2025-05-22 05:34:04 -07:00 |
|
Nick Sweeting
|
bdb9bc81a3
|
install both chrome and chromium channels
|
2025-05-20 03:55:57 -07:00 |
|
Nick Sweeting
|
312a738ce9
|
better browser info logging at startup and tests
|
2025-05-20 03:50:04 -07:00 |
|
Nick Sweeting
|
836e1ddbf0
|
rename test
|
2025-05-20 02:33:33 -07:00 |
|
Nick Sweeting
|
7e26eb14b1
|
add glob support to allowed_domains
|
2025-05-13 18:25:28 -07:00 |
|
Nick Sweeting
|
3f4c918acf
|
fix tests to use playwright too
|
2025-05-09 18:22:29 -07:00 |
|
Nick Sweeting
|
4f625fd762
|
nevermind we still need uv run
|
2025-05-06 19:03:22 +08:00 |
|
Nick Sweeting
|
38dfb8e36e
|
rely on already activated venv
|
2025-05-06 19:01:25 +08:00 |
|
Nick Sweeting
|
005a1310bb
|
see if tests work without fonts for speed
|
2025-05-06 18:58:42 +08:00 |
|
Nick Sweeting
|
de7b4e1c82
|
install pkg deps separately from playwright
|
2025-05-06 18:13:46 +08:00 |
|
Nick Sweeting
|
aa26ac1850
|
jk
|
2025-05-06 17:57:42 +08:00 |
|
Nick Sweeting
|
0164f8d9e7
|
only install CI-ready version of chromium
|
2025-05-06 17:57:06 +08:00 |
|
Nick Sweeting
|
062654e532
|
fix github actions CI tests
|
2025-05-06 17:54:37 +08:00 |
|
Nick Sweeting
|
28906fd5d6
|
print a warning if any required sensitive data keys are not defined
|
2025-05-04 20:18:52 +08:00 |
|
Nick Sweeting
|
84c43e6a1e
|
add tests for current tab detection
|
2025-05-04 17:57:23 +08:00 |
|
Nick Sweeting
|
084414f910
|
fix ruff issues
|
2025-05-02 20:40:55 +08:00 |
|
Nick Sweeting
|
37f211d81d
|
Update .github/workflows/test.yaml
|
2025-05-02 02:42:31 -04:00 |
|
Nick Sweeting
|
345d49ab3d
|
Apply suggestions from code review
|
2025-05-02 02:41:02 -04:00 |
|
Anirudha619
|
d523434d00
|
add test_controller test to github workflows
|
2025-05-01 23:57:53 +05:30 |
|
Nick Sweeting
|
12d0240a2d
|
tweak styling
|
2025-04-28 05:28:21 +08:00 |
|
Nick Sweeting
|
3ac1a2b773
|
tweak skip rules
|
2025-04-28 05:19:20 +08:00 |
|
Nick Sweeting
|
405bf63c9b
|
prevent github actions from running twice for each commit
|
2025-04-28 05:09:28 +08:00 |
|
dha-aa
|
b070506563
|
restructure and rename workflows for improved GitHub PR UI
|
2025-04-27 08:44:30 +00:00 |
|
dha-aa
|
d70d3d1450
|
re-structure test matrix with expanded test categories for browser and models
|
2025-04-27 08:29:33 +00:00 |
|
dha-aa
|
1b975c3ce1
|
ci: display job names in GitHub PR UI
|
2025-04-24 09:15:27 +00:00 |
|
dha-aa
|
09d12c8ddf
|
ci: split tests into parallel browser & model jobs
|
2025-04-24 09:08:11 +00:00 |
|