130 Commits

Author SHA1 Message Date
breakgimme
4f74ced414 LibWeb: Make User-Agent updates apply to HTTP requests on reload 2025-12-28 09:11:13 -05:00
Aliaksandr Kalenik
f6a7df78e7 LibWeb: Add missing GC visits for XHR::FormDataEntry
Fix-up for 3a6782689 that changes `Vector<XHR::FormDataEntry>` to
`GC::ConservativeVector<XHR::FormDataEntry>`.
2025-12-26 19:48:46 +01:00
Andreas Kling
3a6782689f LibWeb: Don't use GC::Root in FormDataEntryValue variant
This was causing reference cycles and leaking entire realms.
2025-12-26 11:57:00 +01:00
Timothy Flynn
696935d8ce LibWeb: Remove outdated note from LoadRequest creation in fetch
The method being referred to here was removed in commit
556364fd76.
2025-12-21 09:24:51 -06:00
Timothy Flynn
bf7b812d0b LibHTTP+LibWeb: Store the in-memory HTTP cache without JS realms
The in-memory HTTP Fetch cache currently keeps the realm which created
each cache entry alive indefinitely. This patch migrates this cache to
LibHTTP, to ensure it is completely unaware of any JS objects.

Now that we are not interacting with Fetch response objects, we can no
longer use Streams infrastructure to pipe the response body into the
Fetch response. Fetch also ultimately creates the cache response once
the HTTP response headers have arrived. So the LibHTTP cache will hold
entries in a pending list until we have received the entire response
body. Then it is moved to a completed list and may be used thereafter.
2025-12-21 08:59:31 -06:00
Timothy Flynn
d08bd14928 LibWeb: Accumulate all network bytes in FetchedDataReceiver
This will allow us to hand off the bytes to the HTTP memory cache.
2025-12-21 08:59:31 -06:00
Timothy Flynn
46b3218241 LibHTTP+LibWeb: Use LibHTTP to calculate stale-while-revalidate values
No need to duplicate this in LibWeb.

In doing so, this also fixes an apparent bug for SWR handling in LibWeb.
We were previously deciding if we were in the SWR lifetime with:

    stale_while_revalidate > current_age

However, the SWR lifetime is meant to be an additional time on top of
the freshness lifetime:

    freshness_lifetime + stale_while_revalidate > current_age
2025-12-14 11:33:02 -05:00
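The difference between the two checks in the commit above can be sketched as follows. This is a minimal illustration, not the LibHTTP code: the function names and the use of `std::chrono::seconds` are assumptions made for the example.

```cpp
#include <chrono>

using Seconds = std::chrono::seconds;

// Hypothetical name: the check as it was previously written in LibWeb.
// It ignores the freshness lifetime, so the SWR window ends too early.
constexpr bool in_swr_window_old(Seconds stale_while_revalidate, Seconds current_age)
{
    return stale_while_revalidate > current_age;
}

// Hypothetical name: the corrected check. The SWR window is additional time
// on top of the freshness lifetime.
constexpr bool in_swr_window_fixed(Seconds freshness_lifetime, Seconds stale_while_revalidate, Seconds current_age)
{
    return freshness_lifetime + stale_while_revalidate > current_age;
}

// With a 60s freshness lifetime and a 30s SWR value, a 75s-old response is
// stale but still inside the SWR window. Only the fixed check agrees.
static_assert(!in_swr_window_old(Seconds { 30 }, Seconds { 75 }));
static_assert(in_swr_window_fixed(Seconds { 60 }, Seconds { 30 }, Seconds { 75 }));
```

Once the age passes 90s in this example, both checks report that the window has closed.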
Timothy Flynn
854981714f LibWeb: Remove errant "non-standard" comment
The method this comment was attached to was removed in commit
a5df972055.
2025-12-14 11:33:02 -05:00
Aliaksandr Kalenik
f29212703e LibWeb/Fetch: Use GC::Function for fetch_main_content() callback
...instead of creating GC roots for captured values.
2025-12-09 08:51:48 -05:00
Timothy Flynn
adcf5462af LibWeb+WebContent: Rename the http-cache flag to http-memory-cache
Rather than having http-cache and http-disk-cache, let's rename the
former to http-memory-cache to be extra clear what we are talking about.
2025-12-02 12:19:42 +01:00
Timothy Flynn
2453f0bc04 LibHTTP+LibWeb: Use LibHTTP's cache implementation in LibWeb
There are a couple of remaining RFC 9111 methods in LibWeb's Fetch, but
these are currently directly tied to the way we store GC-allocated HTTP
response objects. So de-coupling that is left as a future exercise.
2025-11-29 08:35:02 -05:00
Timothy Flynn
9375660b64 LibHTTP+LibWeb+RequestServer: Move Fetch's HTTP header infra to LibHTTP
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.

The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using our home-grown header infra from LibHTTP.

So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.

This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
2025-11-27 14:57:29 +01:00
Timothy Flynn
3dce6766a3 LibWeb: Extract some CORS and MIME Fetch helpers to their own files
An upcoming commit will migrate the contents of Headers.h/cpp to LibHTTP
for use outside of LibWeb. These CORS and MIME helpers depend on other
LibWeb facilities, however, so they cannot be moved.
2025-11-27 14:57:29 +01:00
Timothy Flynn
0fd80a8f99 LibTextCodec+LibWeb: Move isomorphic coders to LibTextCodec
This will be used outside of LibWeb.
2025-11-27 14:57:29 +01:00
Timothy Flynn
7b0dfa61b1 LibWeb: Do not move header name/values that are re-used
Fixes a regression from commit:
f675cfe90f
2025-11-26 21:22:35 -05:00
Timothy Flynn
cbfae97101 LibWeb: Include empty header values when joining duplicated headers
Fixes a regression from commit:
f675cfe90f

It is not sufficient to only check if the builder is empty, as we will
then drop empty header values (when the first found value is empty).

This is tested in WPT by /cors/origin.htm, but that requires an HTTP
server.
2025-11-26 21:22:35 -05:00
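The pitfall described in the commit above can be sketched with standard library types. This is a simplified illustration, not the LibWeb code: `Header`, `join_header_values`, and the exact (case-sensitive) name comparison are assumptions made for the example.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a header: a (name, value) pair.
using Header = std::pair<std::string, std::string>;

// Joins all values for `name` with ", ". A separate flag tracks whether a
// value has been seen; testing `joined.empty()` instead would silently drop
// an empty first value.
std::optional<std::string> join_header_values(std::vector<Header> const& headers, std::string const& name)
{
    std::string joined;
    bool found_any = false;

    for (auto const& [header_name, value] : headers) {
        if (header_name != name)
            continue;
        if (found_any)
            joined += ", ";
        joined += value;
        found_any = true;
    }

    if (!found_any)
        return std::nullopt;
    return joined;
}
```

For the headers `("origin", "")` followed by `("origin", "x")`, this yields `", x"`, whereas a check on the builder being empty would yield `"x"`.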
Timothy Flynn
f675cfe90f LibWeb: Store HTTP methods and headers as ByteString
The spec declares these as a byte sequence, which we then implemented as
a ByteBuffer. This has become pretty awkward to deal with, as evidenced
by the plethora of `MUST(ByteBuffer::copy(...))` and `.bytes()` calls
everywhere inside Fetch. We would then treat the bytes as a string
anyways by wrapping them in StringView everywhere.

We now store these as a ByteString. This is more comfortable to deal
with, and we no longer need to continually copy underlying storage (as
ByteString is ref-counted).

This work is largely preparatory for an upcoming HTTP header refactor.
2025-11-26 09:15:06 -05:00
Timothy Flynn
ed27eea091 LibWeb: Do not copy the result of HeaderList::extract_header_list_values
There's no need to copy the Vector out of this result every time we call
it. We can move it out or access it directly.
2025-11-26 09:15:06 -05:00
Timothy Flynn
44fbf6451e LibWeb: Simplify Fetch's build-content-range implementation
* Don't pass u64 by reference
* Don't double-format the range numbers
2025-11-26 09:15:06 -05:00
Timothy Flynn
d70224ad2e LibWeb: Organize Fetch Headers.h/Headers.cpp a bit
Generally just define things in the order they are declared (will make a
change to use ByteString in this file a bit easier to follow). Also make
a couple of free functions be class methods on Header / HeaderList.
2025-11-26 09:15:06 -05:00
Prajjwal
1f5ffe04c8 LibWeb: Fix race condition between read_all_bytes and stream population
There can be a race between read_all_bytes and stream population.
If the document load reads the stream before it is populated, the stream
will be empty. This can hang SessionHistoryTraversalQueue, which expects
a promise to be resolved on document load.

This race can occur when stream population and document source are set
very close to each other. For example, when a newly generated blob is
set as the source of an iframe.
- navigation/multiple-navigable-cross-document-navigation.html has been
modified to trigger this race.
2025-11-26 12:27:12 +01:00
Aliaksandr Kalenik
69cede4a0f AK+LibWeb: Make StringBase::bytes() lvalue-only
Disallow calling `StringBase::bytes()` on temporaries to avoid returning
`ReadonlyBytes` that outlive the underlying string.

With this change, we catch a real UAF:
`load_result.data = maybe_response.release_value().bytes();`
All other updated call sites were already safe, they just needed to use
an intermediate named variable to satisfy the new lvalue-only
requirement.
2025-11-25 13:02:20 -05:00
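An lvalue-only accessor of this kind can be expressed in standard C++ with ref-qualifiers. The sketch below is an assumption about the general technique, not the actual AK implementation; `OwnedText`, `make_text`, and `safe_length` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

class OwnedText {
public:
    explicit OwnedText(std::string data)
        : m_data(std::move(data))
    {
    }

    // Callable on lvalues: the object outlives the returned view.
    std::string_view bytes() const& { return m_data; }

    // Deleted for temporaries: the view would dangle as soon as the
    // full expression ends.
    std::string_view bytes() const&& = delete;

private:
    std::string m_data;
};

OwnedText make_text() { return OwnedText { "hello" }; }

std::size_t safe_length()
{
    // An intermediate named variable keeps the storage alive.
    auto text = make_text();
    auto view = text.bytes();
    return view.size();
}

// This would no longer compile, which is how such a bug gets caught:
// auto dangling = make_text().bytes();
```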
Aliaksandr Kalenik
0eb28a1a54 LibWeb: Delete unused BufferPolicy from fetch Request
This is no longer needed since all requests are unbuffered.
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
16b0f1e6c2 LibWeb: Delete unused ResourceLoader::load()
...and rename `load_unbuffered()` to `load()`.
2025-11-20 06:29:13 -05:00
Aliaksandr Kalenik
3058274386 LibWeb: Use unbuffered network requests for all Fetch requests
Previously, unbuffered requests were only available as a special mode
for EventSource. With this change, they are enabled by default, which
means chunks can be read from the stream as soon as they arrive.

This unlocks some interesting possibilities, such as starting to parse
HTML documents before the entire response has been received (that, in
turn, allows us to initiate subresource fetches earlier or begin
executing scripts sooner), or start rendering videos before they are
fully downloaded.

Co-authored-by: Timothy Flynn <trflynn89@pm.me>
2025-11-20 06:29:13 -05:00
Timothy Flynn
813986237e LibWeb: Add some tests that exercise the HTTP disk cache
Our HTTP disk cache is currently manually tested against various sites.
This patch adds some tests to cover various scenarios, including non-
cacheable responses, expired responses, and revalidation.

In order to ensure we hit the disk cache in RequestServer, we must
disable the in-memory cache in WebContent.
2025-11-20 09:33:49 +01:00
Psychpsyo
100f37995f Everywhere: Clean up AD-HOC and FIXME comments without colons 2025-11-13 15:56:04 +01:00
Luke Wilde
167de08c81 LibWeb: Remove exception throwing from Fetch
These were only here to manage OOMs, but there's not really any way to
recover from small OOMs in Fetch especially with its async nature.
2025-11-07 04:08:30 +01:00
Timothy Flynn
e0a8eb3767 LibWeb+WebContent: Hook Fetch's HTTP cache into the clear-cache action
And fix a typo in an invocation to clear the cache.
2025-11-05 18:27:36 +01:00
Timothy Flynn
6057719f63 LibWeb: Use fetch to retrieve all HTMLLinkElement resources
HTMLLinkElement is the final user of Resource/ResourceClient (used for
preloads and icons). This ports these link types to use fetch according
to the spec.

Preloads were particularly goofy because they would be stored in the
ResourceLoader's ad-hoc cache. But this cache was never consulted for
organic loads, so the preloads were never used. There is more work to be done to
use these preloads within fetch, but for now they at least are stored
in fetch's HTTP cache for re-use.
2025-11-05 18:27:36 +01:00
Timothy Flynn
5b40398c39 LibWeb: Invoke process_response_consume_body with null in error cases
We were previously invoking it with an empty ByteBuffer, which will be
interpreted as a successful load by HTMLLinkElement in a future commit.
2025-11-05 18:27:36 +01:00
Luke Wilde
eeb5446c1b LibWeb: Avoid including Navigable.h in headers
This greatly reduces how much is recompiled when changing Navigable.h,
from >1000 to 82.
2025-10-20 10:16:55 +01:00
Julian Dominguez-Schatz
4e3387778e LibWeb: Respect IncludeCredentials for Set-Cookie during fetch
Per https://fetch.spec.whatwg.org/#http-network-fetch, Set-Cookie should
only store a cookie if IncludeCredentials::Yes is set. Fixes 1 web
platform test.
2025-09-24 10:12:56 +01:00
Timothy Flynn
2df4835025 LibWeb: Place HTTP cache logging behind a debug flag
It's quite verbose to have logging on by default here.
2025-09-19 13:52:07 +02:00
Timothy Flynn
b4df857a57 LibWeb+LibWebView+WebContent: Replace DNT with GPC
Global Privacy Control aims to be a replacement for Do Not Track. DNT
ended up not being a great solution, as it wasn't enforced by law. This
actually resulted in the DNT header serving as an extra fingerprinting
data point.

GPC is becoming legally enforced in US states such as California and
Colorado. CA is further working on a bill which requires that browsers
implement such an opt-out preference signal (OOPS):

https://cppa.ca.gov/announcements/2025/20250911.html

This patch replaces DNT with GPC and hooks up the associated settings.
2025-09-16 10:38:20 +02:00
Timothy Flynn
7b3465ab55 LibWeb: Do not require multipart form data to end with CRLF
According to RFC 2046, the BNF of the form data body is:

    multipart-body := [preamble CRLF]
                      dash-boundary transport-padding CRLF
                      body-part *encapsulation
                      close-delimiter transport-padding
                      [CRLF epilogue]

Where "epilogue" is any text that "may be ignored or discarded". So we
should stop parsing the body once we encounter the terminating delimiter
("--").

Note that our parsing function is from an attempt to standardize the
grammar in the spec: https://andreubotella.github.io/multipart-form-data
This proposal hasn't been updated in ~4 years, and the fetch spec still
does not have a formal definition of the body string.
2025-09-15 18:28:48 +02:00
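The idea of stopping at the close delimiter can be sketched as below. This is a deliberately simplified illustration, not the LibWeb parser: `end_of_multipart_body` is a hypothetical helper, and it does not check that the delimiter starts a line or handle transport padding.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Returns the offset one past the close delimiter ("--" boundary "--"), or
// nullopt if the body has no close delimiter. Anything after that offset is
// epilogue and can be ignored, so no trailing CRLF is required.
std::optional<std::size_t> end_of_multipart_body(std::string_view body, std::string_view boundary)
{
    std::string close_delimiter = "--";
    close_delimiter += boundary;
    close_delimiter += "--";

    auto position = body.find(close_delimiter);
    if (position == std::string_view::npos)
        return std::nullopt;
    return position + close_delimiter.size();
}
```

A body that ends right after the close delimiter and a body that continues with `CRLF` plus epilogue text both produce the same end offset.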
Luke Wilde
4772e1b0c9 LibWeb/Fetch: Add missing spec step for checking for tuple origin
Fixes https://github.com/LadybirdBrowser/ladybird/issues/6188
2025-09-15 09:58:33 +02:00
Pavel Shliak
bbb9159883 LibWeb: Fix Request() TypeError message typo for mode='navigate'
The Request constructor’s mode validation threw
  "Mode must not be 'navigate"
missing the closing quote. Add the trailing quote so the error reads:
  "Mode must not be 'navigate'".
2025-09-15 08:19:34 +01:00
Luke Wilde
05438e70f1 LibWeb: Receive cookies through principal_host_defined_page
Previously we depended on an associated document on the ESO to get to
the page, but Workers do not have documents. However, we can simply get
to the page with `principal_host_defined_page`, removing the issue.
2025-09-09 15:28:38 +02:00
Ali Mohammad Pur
4462348916 Everywhere: Slap some [[clang::lifetimebound]] where appropriate
This first pass only applies to the following two cases:
- Public functions returning a view type into an object they own
- Public ctors storing a view type

This catches a grand total of one (1) issue, which is fixed in
the previous commit.
2025-09-01 11:11:38 +02:00
Luke Wilde
e2c935475f LibWeb/Fetch: Enable callbacks in the abort signal algorithm callback
If the request has a body, the abort will interact with promises, which
requires callbacks to be enabled.

Fixes crashing on Atlassian products.
2025-08-26 16:29:35 +02:00
ayeteadoe
3df8e00d91 LibWeb: Enable EXPLICIT_SYMBOL_EXPORT 2025-08-23 16:04:36 -06:00
ayeteadoe
0a699132f3 WebContent: Enable in Windows CI 2025-08-23 16:04:36 -06:00
Tete17
658477620a LibWeb/LibURL/LibIPC: Extend createObjectURL to also accept MediaSources
This required some changes in LibURL & LibIPC since it has its own
definition of a BlobURLEntry. For now, we don't have a concrete usage
of MediaSource in LibURL so it is defined as an empty struct.

This removes one FIXME in an idl file.
2025-08-19 23:50:38 +02:00
Kenneth Myhra
1228063a85 LibWeb: Enforce Integrity Policy on Fetch requests 2025-08-14 13:37:38 +01:00
Timothy Flynn
70db474cf0 LibJS+LibWeb: Port interned bytecode strings to UTF-16
This was almost a no-op, except we intern JS exception messages. So the
bulk of this patch is porting exception messages to UTF-16.
2025-08-14 10:27:08 +02:00
Kenneth Myhra
0dc2fb3781 LibWeb: Update Fetch's compute the redirect-taint concept 2025-08-12 07:08:33 -04:00
Kenneth Myhra
e9246c15d9 LibWeb: Pass top-level navigation initiator origin to Fetch's Request 2025-08-12 07:08:33 -04:00
Kenneth Myhra
1b350596fb LibWeb: Align Fetching chapter's "To fetch" with latest spec changes 2025-08-08 11:12:53 +01:00
Kenneth Myhra
593ee1ae0a LibWeb: Implement AO populate request from client 2025-08-08 11:12:53 +01:00