Clearing the callback opens a window for the WebContent process to crash
while we do not have a callback set. In practice, we see this with ASAN
crashes, where the crash occurs after we've already signaled that the
test has completed.
We now set the crash handler once during init. This required moving the
clearing of other callbacks to the test completion handler (we were
clearing these callbacks in several different ways anyways, so now we
will at least be consistent).
This mode allows us to test the HTTP disk cache with two mechanisms:
1. If RequestServer is launched with --http-disk-cache-mode=testing, it
will cache requests with a X-Ladybird-Enable-Disk-Cache header.
2. In test mode, RS will include a X-Ladybird-Disk-Cache-Status response
header indicating how the response was handled by the cache. There is
no standard way for a web request to know what happened with respect
to the disk cache, so this fills that hole for testing.
This mode is not exposed to users.
The function currently has 2 purposes: (1) To copy dependent dlls for
executables to output binary directory. This ensures that these helper
processes can be ran after a build given not all DLLs from vcpkg libs
get implicitly copied to the bin folder. (2) Allow fully background
and/or GUI processes to use the Windows Subsystem. This prevents
unnecessarily launching a console for the process, as we either require
no user interaction or the user interaction is all handled in the GUI.
This is used by tests to set the default time zone to UTC.
This is because certain tests create JavaScript Date objects, which are
in the current timezone.
We only supported headless clipboard management in test-web. So when WPT
tests the clipboard APIs, we would blindly try to access the Qt app,
which does not exist.
Note that the AppKit UI has no such restriction, as the NSPasteboard is
accessible even without a GUI.
When trying to repro a failed CI test, it is handy to know the order in
which test-web ran its tests. We've seen in the past that the exact
order can be important to debug flakey tests.
This adds the index of the WebView running the test to verbose log
statements (which is enabled in CI).
Clipboard handling largely has nothing to do with the individual web
views. Rather, we interact with the system clipboard at the application
level. So let's move these implementations to the Application.
For every ref tests actual and expected screenshots are taken. These
screenshots are only needed while the individual test executes. However,
they are never freed during the run, leading to a continuous increase in
memory usage of the test runner while executing ref tests.
With the number of ref tests growing, this currently amounts to nearly 3
GB of uncompressed bitmap data being held in memory. Lets avoid that by
clearing the screenshot data at the end of each test. With this change
applied, the memory usage of test-web stays stable and below 100 MB for
the entire test run.
When a test is active in a test-web view, show the relative path to the
test instead of the view's URL. This gives a better starting point for
debugging than whatever the last loaded URL happened to be.
If no test is active, we still show the view's URL.
The BUILD_RPATH/INSTALL_RPATH CMake infrastructure is not supported
on Windows, but we want to ensure Ladybird executables are runnable
after the build phase so there can be an efficient dev loop.
lagom_copy_runtime_dlls() can be used by executable targets so all
their dependent dlls are copied to their output directory in their
post build step.
We occasionally (frequently) time out in CI. If ctest triggers this
timeout, it sends a SIGSTOP followed by a SIGKILL. Since we want to
gracefully exit the test runner, ask ctest to send a SIGTERM instead.
This should cause active test status reports to show up in CI.
If you interrupt the test runner (Ctrl+C, SIGINT) or if the test runner
is gracefully being terminated (SIGTERM), it now reports the current
status of all the spawned views with their URL and, if an active test is
still being run, the time since the start of the test.
Hopefully this will help gain insight into which tests are hanging.
The test logs tend to get a bit mixed together, so this makes it
possible to identify which values correspond to which test when multiple
are failing at once.
With the newly supported fuzzy matching in our test-web runner, we can
now define the expected maximum color channel and pixel count errors per
failing test and set a baseline they should not exceed.
The figures I added to these tests all come from my macOS M4 machine.
Most discrepancies seem to come from color calculations being slightly
off.
WPT reference tests can add metadata to tests to instruct the test
runner how to interpret the results. Because of this, it is not enough
to have an action that starts loading the (mis)match reference: we need
the test runner to receive the metadata so it can act accordingly.
This sets our test runner up for potentially supporting multiple
(mis)match references, and fuzzy rendering matches - the latter will be
implemented in the following commit.
This produces more granural information on the actual pixel errors
present between two bitmaps. It now also asserts that both bitmaps are
the same size, since we should never compare two differently sized
bitmaps anyway.
Of the available 71 screenshot tests that we have, 42 fail on macOS
arm64. Let's make it possible to skip those and run the remaining
succeeding screenshot tests on arm64 anyway.
When a test has a `reftest-wait` class, we now wait for fonts to load
and then for 2 `requestAnimationFrame` callbacks. This matches the
behavior of the WPT runner and allows more imported WPT tests to work
correctly.
We have a test for localStorage that repeatedly adds new items until the
quota is exceeded:
`webstorage/storage_local_setitem_quotaexceedederr.window.html`.
This test becomes significantly slower when localStorage is backed by
SQL database. Let's disable database usage in test mode for now, as this
test is likely to be flaky on CI due to timeouts.
You would have to just know that you need to define the constructor with
this declaration. Let's allow subclasses to define constructors as they
see fit.
This is causing errors on the WPT runner, which does not have a display
output. To do this requires shuffling around the Main::Arguments struct,
as we now need access to it from overridden WebView::Application methods
after construction.
Now that headless mode is built into the main Ladybird executable, the
headless-browser's only purpose is to run tests. So let's move it to the
testing directory and rename it to test-web (a la test-js / test-wasm).