LibXML+LibWeb: Use existing HTML entities table for XML parsing too

For XHTML documents, resolve named character entities (e.g., &nbsp;)
using the HTML entity table via a getEntity SAX callback. This avoids
parsing a large embedded DTD on every document and matches the approach
used by Blink and WebKit.
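
The lookup behind such a callback can be sketched as follows. This is a minimal illustration, not the code from this commit: the table is a hypothetical five-entry stand-in for the full HTML entity table, and the names NamedEntity, example_entities and resolve_named_entity are assumptions made for the example. A real getEntity callback would additionally wrap the result in whatever entity object the XML parser expects.

```cpp
#include <algorithm>
#include <array>
#include <optional>
#include <string_view>

// Hypothetical, heavily truncated stand-in for the HTML named character
// reference table. Names are stored without the leading '&' and trailing ';'.
struct NamedEntity {
    std::string_view name;
    char32_t code_point;
};

// Entries are kept sorted by name (byte order) so binary search works.
constexpr std::array<NamedEntity, 5> example_entities { {
    { "Aacute", 0x00C1 },
    { "copy", 0x00A9 },
    { "eacute", 0x00E9 },
    { "hellip", 0x2026 },
    { "nbsp", 0x00A0 },
} };

// Returns the code point for a named entity, or an empty optional when the
// name is not in the table. The match is case-sensitive, as entity names are.
std::optional<char32_t> resolve_named_entity(std::string_view name)
{
    auto it = std::lower_bound(
        example_entities.begin(), example_entities.end(), name,
        [](NamedEntity const& entry, std::string_view key) { return entry.name < key; });
    if (it == example_entities.end() || it->name != name)
        return std::nullopt;
    return it->code_point;
}
```

With this sketch, resolve_named_entity("nbsp") yields U+00A0, while a name missing from the table yields an empty optional, which a parser would then report as an undefined entity. No DTD has to be loaded or parsed for either outcome.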

This also removes the now-unused DTD infrastructure:

- Remove resolve_external_resource callback from Parser::Options
- Remove resolve_xml_resource() function and its ~60KB embedded DTD
- Remove all call sites passing the unused callback

Author: sideshowbarker
Date: 2026-01-09 04:32:12 +09:00
Committed by: Tim Ledbetter
Parent: 35bb1e20ee
Commit: 1b41659efd
Notes: github-actions[bot] 2026-01-09 19:14:36 +00:00
43 changed files with 28321 additions and 55 deletions

@@ -0,0 +1,19 @@
<!doctype html>
<title>Appending from the parser after adopting in an XML document doesn't miss notifications</title>
<link rel="match" href="../../../../../expected/wpt-import/html/the-xhtml-syntax/parsing-xhtml-documents/adopt-while-parsing-001-ref.html">
<link rel="help" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1511329">
<link rel="author" title="Emilio Cobos Álvarez" href="mailto:emilio@crisal.io">
<link rel="author" title="Mozilla" href="https://mozilla.org">
<style>
html, body { margin: 0 }
</style>
<script>
// If we don't get notified of the <div> insertion, the PASS text will never appear.
function parsingInterrupted() {
let frameDoc = document.querySelector("iframe").contentDocument;
let root = frameDoc.documentElement;
document.documentElement.appendChild(root);
root.offsetTop;
}
</script>
<iframe src="support/adopt-while-parsing.xhtml"></iframe>

@@ -0,0 +1,27 @@
<!DOCTYPE html>
<!--
Any copyright is dedicated to the Public Domain.
http://creativecommons.org/publicdomain/zero/1.0/
-->
<html>
<head>
<title>
Test that an XHTML document with a data: URL still handles the XHTML DTD
properly even if the DTD URL is given as a relative URL.
</title>
<link rel="author" title="Boris Zbarsky" href="bzbarsky@mit.edu">
<link rel="match" href="../../../../../expected/wpt-import/html/the-xhtml-syntax/parsing-xhtml-documents/data-xhtml-with-dtd-ref.html">
</head>
<body>
Test passes if it correctly shows &Aacute; in the subframe.
<hr>
<!-- Document in the subframe is:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
&Aacute;
</body>
</html>
-->
<iframe src='data:application/xml,%3C%3Fxml%20version%3D%221.0%22%3F%3E%0A%3C!DOCTYPE%20html%20PUBLIC%20%22-%2F%2FW3C%2F%2FDTD%20XHTML%201.0%20Strict%2F%2FEN%22%20%22DTD%2Fxhtml1-strict.dtd%22%3E%0A%3Chtml%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2Fxhtml%22%3E%0A%20%20%3Cbody%3E%0A%20%20%20%20%26Aacute%3B%0A%20%20%3C%2Fbody%3E%0A%3C%2Fhtml%3E%0A'></iframe>

@@ -0,0 +1,11 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<script>
window.parent.parsingInterrupted();
</script>
<div>
PASS
</div>
</body>
</html>