mirror of
https://github.com/LadybirdBrowser/ladybird
synced 2026-04-27 18:17:22 +02:00
LibWeb: Replace spin_until in HTMLParser::the_end() with state machine
HTMLParser::the_end() had three spin_until calls that blocked the event loop: step 5 (deferred scripts), step 7 (ASAP scripts), and step 8 (load event delay). This replaces them with an HTMLParserEndState state machine that progresses asynchronously via callbacks. The state machine has three phases matching the three spin_until calls: - WaitingForDeferredScripts: loops executing ready deferred scripts - WaitingForASAPScripts: waits for ASAP script lists to empty - WaitingForLoadEventDelay: waits for nothing to delay the load event Notification triggers re-evaluate the state machine when conditions change: HTMLScriptElement::mark_as_ready, stylesheet unblocking in StyleElementBase/HTMLLinkElement, did_stop_being_active_document, and DocumentLoadEventDelayer decrements. NavigableContainer state changes (session history readiness, content navigable cleared, lazy load flag) also trigger re-evaluation of the load event delay check. Key design decisions and why: 1. Microtask checkpoint in schedule_progress_check(): The old spin_until called perform_a_microtask_checkpoint() before checking conditions. This is critical because HTMLImageElement::update_the_image_data step 8 queues a microtask that creates the DocumentLoadEventDelayer. Without the checkpoint, check_progress() would see zero delayers and complete before images start delaying the load event. 2. deferred_invoke in schedule_progress_check(): I tried Core::Timer (0ms), queue_global_task, and synchronous calls. Timers caused non-deterministic ordering with the HTML event loop's task processing timer, leading to image layout tests failing (wrong subtest pass/fail patterns). Synchronous calls fired too early during image load processing before dimensions were set, causing 0-height images in layout tests. queue_global_task had task ordering issues with the session history traversal queue. deferred_invoke runs after the current callback returns but within the same event loop pump, giving the right balance. 3. Navigation load event guard (m_navigation_load_event_guard): During cross-document navigation, finalize_a_cross_document_navigation step 2 calls set_delaying_load_events(false) before the session history traversal activates the new document. This creates a transient state where the parent's load event delay check sees the about:blank (which has ready_for_post_load_tasks=true) as the active document and completes prematurely.
This commit is contained in:
committed by
Alexander Kalenik
parent
b542617e09
commit
df96b69e7a
Notes:
github-actions[bot]
2026-03-28 22:15:52 +00:00
Author: https://github.com/kalenikaliaksandr Commit: https://github.com/LadybirdBrowser/ladybird/commit/df96b69e7aa Pull-request: https://github.com/LadybirdBrowser/ladybird/pull/8660
@@ -49,12 +49,14 @@
|
||||
#include <LibWeb/Infra/Strings.h>
|
||||
#include <LibWeb/MathML/TagNames.h>
|
||||
#include <LibWeb/Namespace.h>
|
||||
#include <LibWeb/Platform/EventLoopPlugin.h>
|
||||
#include <LibWeb/SVG/SVGScriptElement.h>
|
||||
#include <LibWeb/SVG/TagNames.h>
|
||||
|
||||
namespace Web::HTML {
|
||||
|
||||
GC_DEFINE_ALLOCATOR(HTMLParser);
|
||||
GC_DEFINE_ALLOCATOR(HTMLParserEndState);
|
||||
|
||||
static inline void log_parse_error(SourceLocation const& location = SourceLocation::current())
|
||||
{
|
||||
@@ -275,8 +277,6 @@ void HTMLParser::run(URL::URL const& url, HTMLTokenizer::StopAtInsertionPoint st
|
||||
// https://html.spec.whatwg.org/multipage/parsing.html#the-end
|
||||
void HTMLParser::the_end(GC::Ref<DOM::Document> document, GC::Ptr<HTMLParser> parser)
|
||||
{
|
||||
auto& heap = document->heap();
|
||||
|
||||
// Once the user agent stops parsing the document, the user agent must run the following steps:
|
||||
|
||||
// NOTE: This is a static method because the spec sometimes wants us to "act as if the user agent had stopped
|
||||
@@ -332,33 +332,128 @@ void HTMLParser::the_end(GC::Ref<DOM::Document> document, GC::Ptr<HTMLParser> pa
|
||||
return;
|
||||
}
|
||||
|
||||
// 5. While the list of scripts that will execute when the document has finished parsing is not empty:
|
||||
while (!document->scripts_to_execute_when_parsing_has_finished().is_empty()) {
|
||||
// 1. Spin the event loop until the first script in the list of scripts that will execute when the document has finished parsing
|
||||
// has its "ready to be parser-executed" flag set and the parser's Document has no style sheet that is blocking scripts.
|
||||
main_thread_event_loop().spin_until(GC::create_function(heap, [document] {
|
||||
return document->scripts_to_execute_when_parsing_has_finished().first()->is_ready_to_be_parser_executed()
|
||||
&& !document->has_a_style_sheet_that_is_blocking_scripts();
|
||||
}));
|
||||
// Steps 5-11 are handled by the HTMLParserEndState state machine.
|
||||
auto state = HTMLParserEndState::create(document, parser);
|
||||
document->set_html_parser_end_state(state);
|
||||
state->schedule_progress_check();
|
||||
}
|
||||
|
||||
// 2. Execute the first script in the list of scripts that will execute when the document has finished parsing.
|
||||
document->scripts_to_execute_when_parsing_has_finished().first()->execute_script();
|
||||
static constexpr int THE_END_TIMEOUT_MS = 15000;
|
||||
|
||||
// 3. Remove the first script element from the list of scripts that will execute when the document has finished parsing (i.e. shift out the first entry in the list).
|
||||
(void)document->scripts_to_execute_when_parsing_has_finished().take_first();
|
||||
GC::Ref<HTMLParserEndState> HTMLParserEndState::create(GC::Ref<DOM::Document> document, GC::Ptr<HTMLParser> parser)
|
||||
{
|
||||
return document->heap().allocate<HTMLParserEndState>(document, parser);
|
||||
}
|
||||
|
||||
HTMLParserEndState::HTMLParserEndState(GC::Ref<DOM::Document> document, GC::Ptr<HTMLParser> parser)
|
||||
: m_document(document)
|
||||
, m_parser(parser)
|
||||
, m_timeout(Platform::Timer::create_single_shot(heap(), THE_END_TIMEOUT_MS, GC::create_function(heap(), [this] {
|
||||
if (m_phase != Phase::Completed)
|
||||
dbgln("HTMLParserEndState: timed out in phase {}", to_underlying(m_phase));
|
||||
})))
|
||||
{
|
||||
m_timeout->start();
|
||||
}
|
||||
|
||||
void HTMLParserEndState::visit_edges(Cell::Visitor& visitor)
|
||||
{
|
||||
Base::visit_edges(visitor);
|
||||
visitor.visit(m_document);
|
||||
visitor.visit(m_parser);
|
||||
visitor.visit(m_timeout);
|
||||
}
|
||||
|
||||
void HTMLParserEndState::schedule_progress_check()
|
||||
{
|
||||
if (m_phase == Phase::Completed)
|
||||
return;
|
||||
if (m_check_pending)
|
||||
return;
|
||||
m_check_pending = true;
|
||||
Platform::EventLoopPlugin::the().deferred_invoke(GC::create_function(heap(), [this] {
|
||||
// NOTE: Pending microtasks (e.g. image load event delayer creation from update_the_image_data
|
||||
// step 8) must be processed before we check conditions, matching spin_until's behavior.
|
||||
// Skip the checkpoint when the microtask queue is empty to avoid unnecessary work
|
||||
// (save/restore execution context stack, notify_about_rejected_promises, etc.).
|
||||
if (!main_thread_event_loop().microtask_queue_empty()) {
|
||||
auto& vm = main_thread_event_loop().vm();
|
||||
vm.save_execution_context_stack();
|
||||
vm.clear_execution_context_stack();
|
||||
main_thread_event_loop().perform_a_microtask_checkpoint();
|
||||
vm.restore_execution_context_stack();
|
||||
}
|
||||
check_progress();
|
||||
m_check_pending = false;
|
||||
}));
|
||||
}
|
||||
|
||||
void HTMLParserEndState::check_progress()
|
||||
{
|
||||
// AD-HOC: Bail out if the document is no longer fully active (e.g. navigated away from).
|
||||
if (!m_document->is_fully_active()) {
|
||||
complete();
|
||||
return;
|
||||
}
|
||||
|
||||
switch (m_phase) {
|
||||
case Phase::WaitingForDeferredScripts:
|
||||
// 5. While the list of scripts that will execute when the document has finished parsing is not empty:
|
||||
while (!m_document->scripts_to_execute_when_parsing_has_finished().is_empty()) {
|
||||
auto& first_script = *m_document->scripts_to_execute_when_parsing_has_finished().first();
|
||||
|
||||
// 1. Spin the event loop until the first script in the list of scripts that will execute when the document has finished parsing
|
||||
// has its "ready to be parser-executed" flag set and the parser's Document has no style sheet that is blocking scripts.
|
||||
if (!first_script.is_ready_to_be_parser_executed() || m_document->has_a_style_sheet_that_is_blocking_scripts())
|
||||
return;
|
||||
|
||||
// 2. Execute the first script in the list of scripts that will execute when the document has finished parsing.
|
||||
first_script.execute_script();
|
||||
|
||||
// 3. Remove the first script element from the list of scripts that will execute when the document has finished parsing (i.e. shift out the first entry in the list).
|
||||
(void)m_document->scripts_to_execute_when_parsing_has_finished().take_first();
|
||||
}
|
||||
|
||||
advance_to_asap_scripts_phase();
|
||||
[[fallthrough]];
|
||||
|
||||
case Phase::WaitingForASAPScripts:
|
||||
// 7. Spin the event loop until the set of scripts that will execute as soon as possible and the list of scripts
|
||||
// that will execute in order as soon as possible are empty.
|
||||
if (!m_document->scripts_to_execute_as_soon_as_possible().is_empty()
|
||||
|| !m_document->scripts_to_execute_in_order_as_soon_as_possible().is_empty())
|
||||
return;
|
||||
|
||||
m_phase = Phase::WaitingForLoadEventDelay;
|
||||
[[fallthrough]];
|
||||
|
||||
case Phase::WaitingForLoadEventDelay:
|
||||
// 8. Spin the event loop until there is nothing that delays the load event in the Document.
|
||||
if (m_document->anything_is_delaying_the_load_event())
|
||||
return;
|
||||
|
||||
m_phase = Phase::Completed;
|
||||
[[fallthrough]];
|
||||
|
||||
case Phase::Completed:
|
||||
complete();
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
void HTMLParserEndState::advance_to_asap_scripts_phase()
|
||||
{
|
||||
// AD-HOC: We need to scroll to the fragment on page load somewhere.
|
||||
// But a script that ran in step 5 above may have scrolled the page already,
|
||||
// so only do this if there is an actual fragment to avoid resetting the scroll position unexpectedly.
|
||||
// Spec bug: https://github.com/whatwg/html/issues/10914
|
||||
auto indicated_part = document->determine_the_indicated_part();
|
||||
auto indicated_part = m_document->determine_the_indicated_part();
|
||||
if (indicated_part.has<DOM::Element*>() && indicated_part.get<DOM::Element*>() != nullptr) {
|
||||
document->scroll_to_the_fragment();
|
||||
m_document->scroll_to_the_fragment();
|
||||
}
|
||||
|
||||
// 6. Queue a global task on the DOM manipulation task source given the Document's relevant global object to run the following substeps:
|
||||
queue_global_task(HTML::Task::Source::DOMManipulation, *document, GC::create_function(heap, [document] {
|
||||
queue_global_task(HTML::Task::Source::DOMManipulation, *m_document, GC::create_function(m_document->heap(), [document = m_document] {
|
||||
// 1. Set the Document's load timing info's DOM content loaded event start time to the current high resolution time given the Document's relevant global object.
|
||||
document->load_timing_info().dom_content_loaded_event_start_time = HighResolutionTime::current_high_resolution_time(relevant_global_object(*document));
|
||||
|
||||
@@ -375,32 +470,23 @@ void HTMLParser::the_end(GC::Ref<DOM::Document> document, GC::Ptr<HTMLParser> pa
|
||||
// FIXME: 5. Invoke WebDriver BiDi DOM content loaded with the Document's browsing context, and a new WebDriver BiDi navigation status whose id is the Document object's navigation id, status is "pending", and url is the Document object's URL.
|
||||
}));
|
||||
|
||||
// 7. Spin the event loop until the set of scripts that will execute as soon as possible and the list of scripts that will execute in order as soon as possible are empty.
|
||||
main_thread_event_loop().spin_until(GC::create_function(heap, [document] {
|
||||
// AD-HOC: Also bail out if the document is no longer fully active (e.g. navigated away from).
|
||||
// Otherwise this spin_until stays on the call stack indefinitely, and all subsequent
|
||||
// event processing on the same event loop happens in nested spin_until pumping.
|
||||
if (!document->is_fully_active())
|
||||
return true;
|
||||
return document->scripts_to_execute_as_soon_as_possible().is_empty();
|
||||
}));
|
||||
m_phase = Phase::WaitingForASAPScripts;
|
||||
}
|
||||
|
||||
// 8. Spin the event loop until there is nothing that delays the load event in the Document.
|
||||
main_thread_event_loop().spin_until(GC::create_function(heap, [document] {
|
||||
// AD-HOC: Bail out if the document is no longer fully active.
|
||||
if (!document->is_fully_active())
|
||||
return true;
|
||||
return !document->anything_is_delaying_the_load_event();
|
||||
}));
|
||||
void HTMLParserEndState::complete()
|
||||
{
|
||||
m_phase = Phase::Completed;
|
||||
m_timeout->stop();
|
||||
m_document->set_html_parser_end_state(nullptr);
|
||||
|
||||
// 9. Queue a global task on the DOM manipulation task source given the Document's relevant global object to run the following steps:
|
||||
queue_global_task(HTML::Task::Source::DOMManipulation, *document, GC::create_function(document->heap(), [document, parser] {
|
||||
queue_global_task(HTML::Task::Source::DOMManipulation, *m_document, GC::create_function(m_document->heap(), [document = m_document, parser = m_parser] {
|
||||
// 1. Update the current document readiness to "complete".
|
||||
document->update_readiness(HTML::DocumentReadyState::Complete);
|
||||
|
||||
// AD-HOC: We need to wait until the document ready state is complete before detaching the parser, otherwise the DOM complete time will not be set correctly.
|
||||
if (parser)
|
||||
document->detach_parser({});
|
||||
document->detach_parser();
|
||||
|
||||
// 2. If the Document object's browsing context is null, then abort these steps.
|
||||
if (!document->browsing_context())
|
||||
@@ -442,7 +528,7 @@ void HTMLParser::the_end(GC::Ref<DOM::Document> document, GC::Ptr<HTMLParser> pa
|
||||
// FIXME: 10. If the Document's print when loaded flag is set, then run the printing steps.
|
||||
|
||||
// 11. The Document is now ready for post-load tasks.
|
||||
document->set_ready_for_post_load_tasks(true);
|
||||
m_document->set_ready_for_post_load_tasks(true);
|
||||
}
|
||||
|
||||
void HTMLParser::process_using_the_rules_for(InsertionMode mode, HTMLToken& token)
|
||||
|
||||
Reference in New Issue
Block a user