157 Commits

Author SHA1 Message Date
Shannon Booth
3a9a8e38f8 LibWeb/Fetch: Prevent file:// URLs from calling fetch() API
Thanks to Oscar Uribe for the report.

Co-Authored-By: Jelle Raaijmakers <jelle@ladybird.org>
2026-02-21 23:00:57 +01:00
Shannon Booth
006445d7bd LibWeb/Fetch: Write file scheme origin checking in a more readable way
Put the origin into a variable and make use of early return
to error out. No functional impact intended.
2026-02-21 23:00:57 +01:00
Zaggy1024
c1c51079a8 LibWeb: Don't stop fetches in FetchController if it was aborted already
If we've already fired off an error, calling stop_fetch() should
make no difference, other than stopping the Requests::Request.
Eventually, we'll probably want abort() and terminate() to
stop the Requests::Request in an unobservable way.
2026-02-20 19:11:31 -06:00
Zaggy1024
7c0802bd4f LibWeb: Make FetchController's Requests::Request reference weak
This allows the Request to be cleaned up when it becomes inactive,
which in turn allows the GC to clean up the FetchController which is
indirectly captured by a root in the Request callbacks.
2026-02-18 13:13:32 -06:00
Sam Atkins
c5f117f1a2 LibWeb/Fetch: Store Body Blob as Ref not Root
To avoid some churn, Body::source() returns the SourceTypeInternal
object, as it's not exposed to JS. The constructor is also adjusted so
that Body::clone() doesn't have to convert to a SourceType and then
immediately back again.
2026-02-17 07:40:03 -05:00
Praise-Garfield
7b635fc734 LibWeb: Check WWW-Authenticate on response, not request
The AD-HOC guard for HTTP 401 retry checks whether the request
contains a WWW-Authenticate header, but WWW-Authenticate is a
response header sent by the server to indicate which
authentication schemes it accepts. The check uses
request->header_list() where response->header_list() is
intended. Since requests never carry this header, the guard
never matches and the entire 401 retry path is dead code.
2026-02-14 14:35:41 -05:00
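The fix described in 7b635fc734 can be illustrated with a minimal sketch. The `HeaderList`, `Request`, and `Response` types and the function `should_attempt_authentication_retry` below are hypothetical stand-ins, not LibWeb's classes, and the exact shape of the guard is an assumption; the point is only that the header lookup goes through the response.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a header list; not LibWeb's HeaderList.
struct HeaderList {
    std::vector<std::pair<std::string, std::string>> headers;

    // Header names are compared case-insensitively.
    bool contains(std::string const& name) const
    {
        auto equals_ignoring_case = [](std::string const& a, std::string const& b) {
            return a.size() == b.size()
                && std::equal(a.begin(), a.end(), b.begin(), [](unsigned char x, unsigned char y) {
                       return std::tolower(x) == std::tolower(y);
                   });
        };
        return std::any_of(headers.begin(), headers.end(), [&](auto const& header) {
            return equals_ignoring_case(header.first, name);
        });
    }
};

struct Request {
    HeaderList header_list;
};

struct Response {
    int status { 0 };
    HeaderList header_list;
};

// Assumed guard: WWW-Authenticate is a response header, so only the response
// is consulted. The request parameter is deliberately unused.
bool should_attempt_authentication_retry(Request const&, Response const& response)
{
    return response.status == 401 && response.header_list.contains("WWW-Authenticate");
}
```

With this shape, a 401 response carrying `WWW-Authenticate` enables the retry path regardless of what the request headers contain.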
Shannon Booth
4d64f21fa5 LibWeb: Give IDL exposed PlatformObjects an InterfaceName
By making use of the WEB_PLATFORM_OBJECT macro we can remove
the boilerplate of needing to add this override for every
serializable platform object so that we can check whether they
are exposed or not.
2026-02-14 20:22:40 +01:00
Niccolo Antonelli Dziri
bed56c676d LibWeb: Use enum instead of bool for CanUseCrossOriginIsolatedAPIs
Change the parameter types of the functions `coarsen_time` and
`coarsened_shared_current_time` from `bool` to
`CanUseCrossOriginIsolatedAPIs` for more coherence with the surrounding
code.
2026-02-13 16:47:42 +00:00
Timothy Flynn
e6c008a269 LibWeb+RequestServer: Attach HTTP cookie headers from RequestServer
We currently attach HTTP cookie headers from LibWeb within Fetch. This
has the downside that the cookie IPC, and the infrastructure around it,
are all synchronous. This blocks the WebContent process entirely while
the cookie is being retrieved, for every request on a page.

We now attach cookie headers from RequestServer. The state machine in
RequestServer::Request allows us to easily do this work asynchronously.
We can also skip this work entirely when the response is served from
disk cache.

Note that we will continue to parse cookies in the WebContent process.
If something goes awry during parsing, we limit the damage to that
process, instead of the UI or RequestServer.

Also note that WebSocket requests still have cookie headers attached
from LibWeb. This will be handled in a future patch.

In the future, we may want to introduce a memory cache for cookies in
RequestServer to avoid IPC altogether where possible.
2026-02-10 12:21:20 +01:00
Timothy Flynn
d75aee2a56 LibHTTP+LibWeb: Move the IncludeCredentials enum to LibHTTP
This will be sent over IPC to RequestServer in an upcoming patch.
2026-02-10 12:21:20 +01:00
Timothy Flynn
8d97389038 LibHTTP+Everywhere: Move the cookie implementation to LibHTTP
This will allow parsing cookies outside of LibWeb.

LibHTTP is basically becoming the home of HTTP WG specs.
2026-02-10 12:21:20 +01:00
Timothy Flynn
0482b6bb57 LibWeb+LibWebView+WebContent: Implement versioning for document cookies
This patch introduces a cookie cache in the WebContent process to reduce
blocking IPC calls when JS accesses document.cookie. The UI process now
maintains a cookie version counter per-domain in shared memory. When JS
reads document.cookie, we check whether we have a valid cached cookie by
comparing the current shared version to the last used version. If they
match, the cached cookie is returned without IPC.

This optimization is based on Chromium's shared versioning, in which it
was observed that 87% of document.cookie accesses were redundant. See:
https://blog.chromium.org/2024/06/introducing-shared-memory-versioning-to.html

Note that this cache only supports document.cookie, not HTTP Cookie
headers. HTTP cookies are attached to requests with varying URLs and
paths. The cookies that match the document URL might not match the
request URL, which we wouldn't know from WebContent. So attaching the
cached document cookie would be incorrect.

On https://twinings.co.uk, we see approximately 600 document.cookie
requests while the page loads. This patch reduces the time spent in
the document.cookie getter from ~45ms to 2-3ms.
2026-02-05 07:28:07 -05:00
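The version check described in 0482b6bb57 can be sketched roughly as follows. `SharedCookieVersion`, `DocumentCookieCache`, and the `fetch_over_ipc` callback are hypothetical names for illustration; the real counter lives in shared memory owned by the UI process, which a plain atomic only approximates here.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-domain version counter kept in shared memory.
struct SharedCookieVersion {
    std::atomic<std::uint64_t> value { 0 };
};

class DocumentCookieCache {
public:
    DocumentCookieCache(SharedCookieVersion const& shared_version, std::function<std::string()> fetch_over_ipc)
        : m_shared_version(shared_version)
        , m_fetch_over_ipc(std::move(fetch_over_ipc))
    {
    }

    std::string cookie()
    {
        // Read the shared version first. If the cookie changes while we fetch,
        // the recorded version is older than the data, so the next read simply
        // refetches instead of serving a stale value.
        auto current_version = m_shared_version.value.load(std::memory_order_acquire);
        if (m_cached_cookie.has_value() && m_cached_version == current_version)
            return *m_cached_cookie;

        m_cached_cookie = m_fetch_over_ipc();
        m_cached_version = current_version;
        return *m_cached_cookie;
    }

private:
    SharedCookieVersion const& m_shared_version;
    std::function<std::string()> m_fetch_over_ipc;
    std::optional<std::string> m_cached_cookie;
    std::uint64_t m_cached_version { 0 };
};
```

Repeated reads with an unchanged version never invoke the callback, which is where the saving over a blocking IPC call per access comes from.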
Zaggy1024
32441d3a46 LibWeb: Make FetchController::stop_fetch() cancel the request
2026-01-29 05:22:27 -06:00
Zaggy1024
e99bbad29b LibWeb: Copy fetchParams http_network_or_cache_fetch according to spec
By creating a new FetchController here, we prevent the original
FetchController from affecting the actual ongoing request. This is
necessary to allow FetchController::stop_fetch() to stop an ongoing
fetch in the subsequent commit.

The purpose of the copy appears to be only to change the
httpFetchParams' request over to the newly-cloned httpRequest, so all
existing references should refer back to the originals rather than new
instances.
2026-01-29 05:22:27 -06:00
Andreas Kling
37bdcc3488 LibWeb: Support MIME type sniffing for streaming HTTP responses
Previously, when loading a document, we would try to sniff the MIME
type by reading from the response body's source. However, for streaming
HTTP responses, the body source is Empty (the data comes through the
stream instead), so we had no bytes to sniff.

This caused pages like hypr.land (which sends no Content-Type header)
to be misidentified as plain text instead of HTML, since the MIME
sniffing algorithm would receive zero bytes and fall back to the
default type.

The fix captures the first bytes of the response body during fetch,
storing them on the Body object. These bytes are the "resource header"
defined by the MIME Sniffing spec - up to 1445 bytes, which is enough
to identify any MIME type the spec can detect.

Since bytes may arrive asynchronously during streaming, we use a
callback mechanism: if bytes aren't ready yet when load_document()
needs them, it registers a callback that fires once enough bytes have
been captured (or the stream ends).

The flow is:
1. FetchedDataReceiver receives network bytes, buffers them
2. When Body is created, buffered bytes are flushed to Body's sniff
   buffer, and subsequent bytes are appended as they arrive
3. Before calling load_document(), Navigable waits for sniff bytes
4. load_document() passes the bytes to MimeSniff::Resource::sniff()
2026-01-24 15:21:26 +01:00
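The callback mechanism described in 37bdcc3488 can be sketched as a small buffer that keeps at most 1445 bytes and notifies a waiter once enough bytes have arrived or the stream has ended. The class `SniffBuffer` and its method names are hypothetical; the actual storage lives on the Body object and is not shown in the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical sketch of a sniff buffer; not the LibWeb implementation.
class SniffBuffer {
public:
    // Size of the "resource header" given in the commit message.
    static constexpr std::size_t resource_header_size = 1445;

    using Callback = std::function<void(std::string const&)>;

    // Called as network bytes arrive; bytes beyond the limit are ignored.
    void append(std::string const& chunk)
    {
        auto remaining = resource_header_size - m_bytes.size();
        m_bytes.append(chunk, 0, std::min(remaining, chunk.size()));
        notify_if_ready();
    }

    void mark_stream_ended()
    {
        m_stream_ended = true;
        notify_if_ready();
    }

    // Fires immediately if the bytes are already available, otherwise later.
    void when_ready(Callback callback)
    {
        m_callback = std::move(callback);
        notify_if_ready();
    }

private:
    bool is_ready() const { return m_stream_ended || m_bytes.size() >= resource_header_size; }

    void notify_if_ready()
    {
        if (!m_callback || !is_ready())
            return;
        // Clear the stored callback before invoking it so it runs at most once.
        auto callback = std::exchange(m_callback, nullptr);
        callback(m_bytes);
    }

    std::string m_bytes;
    bool m_stream_ended { false };
    Callback m_callback;
};
```

A short response triggers the callback only when the stream ends, while a long one triggers it as soon as the buffer is full.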
Timothy Flynn
d3041dc054 LibHTTP+LibWeb: Support the HTTP Vary response header
We now partition the HTTP disk cache based on the Vary response header.
If a cached response contains a Vary header, we look for each of the
header names in the outgoing HTTP request. The outgoing request must
match every header value in the original request for the cache entry
to be used; otherwise, a new request will be issued, and a separate
cache entry will be created.

Note that we must now defer creating the disk cache file itself until we
have received the response headers. The Vary key is computed from these
headers, and affects the partitioned disk cache file name.

There are further optimizations we can make here. If we have a Vary
mismatch, we could find the best candidate cached response and issue a
conditional HTTP request. The content server may then respond with an
HTTP 304 if the mismatched request headers are actually okay. But for
now, if we have a Vary mismatch, we issue an unconditional request as
a purely correctness-oriented patch.
2026-01-22 08:54:49 -05:00
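The Vary matching rule described in d3041dc054 can be sketched as below. The `Headers` alias (a map keyed by lower-cased header name) and the functions `parse_vary` and `vary_matches` are hypothetical simplifications; the real cache derives a key from these headers rather than comparing maps.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical request headers, keyed by lower-cased header name.
using Headers = std::map<std::string, std::string>;

// Split a Vary header value into trimmed, lower-cased header names.
std::vector<std::string> parse_vary(std::string const& vary)
{
    std::vector<std::string> names;
    std::stringstream stream(vary);
    std::string name;
    while (std::getline(stream, name, ',')) {
        auto first = name.find_first_not_of(" \t");
        if (first == std::string::npos)
            continue;
        auto last = name.find_last_not_of(" \t");
        name = name.substr(first, last - first + 1);
        std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        names.push_back(name);
    }
    return names;
}

// A cached response is usable only if every header named by Vary has the same
// value (or is absent in both) in the original and the outgoing request.
bool vary_matches(std::string const& vary, Headers const& original_request, Headers const& outgoing_request)
{
    for (auto const& name : parse_vary(vary)) {
        // Per RFC 9111, "Vary: *" never matches.
        if (name == "*")
            return false;

        auto original = original_request.find(name);
        auto outgoing = outgoing_request.find(name);
        bool original_present = original != original_request.end();
        bool outgoing_present = outgoing != outgoing_request.end();

        if (original_present != outgoing_present)
            return false;
        if (original_present && original->second != outgoing->second)
            return false;
    }
    return true;
}
```

On a mismatch, the behavior described above is to issue an unconditional request and store the result as a separate cache entry.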
Timothy Flynn
aa1517b727 LibHTTP+LibWeb+RequestServer: Handle the Fetch API's cache mode
If the cache mode is no-store, we must not interact with the cache at
all.

If the cache mode is reload, we must not use any cached response.

If the cache mode is only-if-cached or force-cache, we are permitted
to respond with stale cache responses.

Note that we currently cannot test only-if-cached in test-web. Setting
this mode also requires setting the cors mode to same-origin, but our
http-test-server infra requires setting the cors mode to cors.
2026-01-22 07:05:06 -05:00
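The three rules from aa1517b727 can be written down as small predicates. The reduced `CacheMode` enum and the function names below are hypothetical; only the modes named in the commit message (plus a default) are modeled, and how the remaining Fetch cache modes behave is deliberately left out.

```cpp
// Hypothetical, reduced version of the cache mode enum; the real enum lives in LibHTTP.
enum class CacheMode {
    Default,
    NoStore,
    Reload,
    ForceCache,
    OnlyIfCached,
};

// no-store: the cache must not be read or written at all.
constexpr bool may_interact_with_cache(CacheMode mode)
{
    return mode != CacheMode::NoStore;
}

// reload: no cached response may be used to satisfy the request.
constexpr bool may_use_cached_response(CacheMode mode)
{
    return mode != CacheMode::NoStore && mode != CacheMode::Reload;
}

// only-if-cached and force-cache: stale cached responses are acceptable.
constexpr bool may_use_stale_response(CacheMode mode)
{
    return mode == CacheMode::ForceCache || mode == CacheMode::OnlyIfCached;
}
```

Keeping the rules as separate predicates makes it easy to consult them at the distinct points where a cache is read, written, or checked for freshness.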
Timothy Flynn
6b91199253 LibHTTP+LibWeb: Move Infrastructure::Request::CacheMode to LibHTTP
We will need to send this enum over IPC to RequestServer to affect the
disk cache's behavior.
2026-01-22 07:05:06 -05:00
Timothy Flynn
4dda144ce0 LibWeb: Return an HTTP cache partition even if the cache is disabled
Returning null here results in the fetch cache mode becoming hard-set to
no-store. This means the HTTP cache cannot be consulted nor updated. A
future commit will make our disk cache respect this flag, as this is a
valid client-provided value. So instead of setting this flag when the
memory cache is disabled, let's move the check to where the cache later
becomes consulted/updated.
2026-01-22 07:05:06 -05:00
Timothy Flynn
1a5cd6b05f LibWeb: Do not perform any cache revalidation from WebContent
We currently will perform some revalidation from both WebContent and
RequestServer. For simplicity's sake, now that the memory cache only
holds fresh responses, let's remove revalidation handling from the
WebContent process. If a memory-cached response is stale, it's fine
to just forward that request to RequestServer. It will then either
be served by disk cache, or revalidated at that point.
2026-01-19 08:02:14 -05:00
Timothy Flynn
2ac219405f LibHTTP+LibWeb: Purge non-fresh entries from the memory cache
Once a cache entry is not fresh, we now remove it from the memory cache.
We will avoid handling revalidation from within WebContent. Instead, we
will just forward the request to RequestServer, where the disk cache
will handle revalidation for itself if needed.
2026-01-19 08:02:14 -05:00
Timothy Flynn
928522c48e LibWeb: Define the memory cache flag as static
This isn't externally referenced.
2026-01-19 08:02:14 -05:00
Andreas Kling
a39f3c383b LibWeb: Set up report timing steps on fetch body read errors
When fully reading a response body fails, the transform stream's flush
algorithm (which calls processResponseEndOfBody) never runs since flush
only executes on successful stream completion.

This left report_timing_steps null, causing a crash when
process_response_consume_body tried to call report_timing.

The fetch spec doesn't explicitly handle this case. We work around it
by extracting the report timing setup (steps 1-3 of
processResponseEndOfBody) into a helper that we call from both the
success path (via flush) and the error path (via processBodyError).

This is an ad-hoc fix that deviates slightly from the spec, but it
ensures failed fetches still produce useful timing data and don't crash.
2026-01-18 00:30:55 +01:00
Andreas Kling
681d00c218 LibDevTools: Pass request initiator type to network panel
Propagate the request initiator type (e.g., "xmlhttprequest", "fetch",
"script", "stylesheet") from LibWeb through the IPC layer to DevTools.

This enables Firefox DevTools to correctly identify XHR/fetch requests
and display appropriate cause types in the Network panel's "Initiator"
column.
2026-01-15 20:10:19 +01:00
Timothy Flynn
b35645523c LibHTTP+LibWeb: Make memory cache debug logs consistent with disk cache
Let's also not yell.
2026-01-10 09:02:41 -05:00
Timothy Flynn
0d99d54c46 LibHTTP+LibWeb: Do not cache range requests (for now)
We currently do not handle responses for range requests at all in our
HTTP caches. This means if we issue a request for a range of bytes=1-10,
that response will be served to a subsequent request for a range of
bytes=10-20. This is obviously invalid - so until we handle these
requests, just don't cache them for now.
2026-01-08 11:59:12 +01:00
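The interim rule from 0d99d54c46 amounts to a single check before storing a response. The `Headers` alias and both function names below are hypothetical, and the sketch assumes header names have already been lower-cased.

```cpp
#include <map>
#include <string>

// Hypothetical request headers, keyed by lower-cased header name.
using Headers = std::map<std::string, std::string>;

bool is_range_request(Headers const& request_headers)
{
    return request_headers.count("range") > 0;
}

// Until partial responses are handled properly, a response to a range request
// is never stored, so it cannot be served for a different byte range later.
bool may_cache_response_for(Headers const& request_headers)
{
    return !is_range_request(request_headers);
}
```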
Andreas Kling
2ac363dcba LibGC: Only call finalize() on types that override finalize()
This dramatically cuts down on time spent in the GC's finalizer pass,
since most types don't override finalize().
2026-01-07 20:51:17 +01:00
breakgimme
4f74ced414 LibWeb: Make User-Agent updates apply to HTTP requests on reload
2025-12-28 09:11:13 -05:00
Aliaksandr Kalenik
f6a7df78e7 LibWeb: Add missing GC visits for XHR::FormDataEntry
Fix-up for 3a6782689 that changes `Vector<XHR::FormDataEntry>` to
`GC::ConservativeVector<XHR::FormDataEntry>`.
2025-12-26 19:48:46 +01:00
Andreas Kling
3a6782689f LibWeb: Don't use GC::Root in FormDataEntryValue variant
This was causing reference cycles and leaking entire realms.
2025-12-26 11:57:00 +01:00
Timothy Flynn
696935d8ce LibWeb: Remove outdated note from LoadRequest creation in fetch
The method being referred to here was removed in commit
556364fd76.
2025-12-21 09:24:51 -06:00
Timothy Flynn
bf7b812d0b LibHTTP+LibWeb: Store the in-memory HTTP cache without JS realms
The in-memory HTTP Fetch cache currently keeps the realm which created
each cache entry alive indefinitely. This patch migrates this cache to
LibHTTP, to ensure it is completely unaware of any JS objects.

Now that we are not interacting with Fetch response objects, we can no
longer use Streams infrastructure to pipe the response body into the
Fetch response. Fetch also ultimately creates the cache response once
the HTTP response headers have arrived. So the LibHTTP cache will hold
entries in a pending list until we have received the entire response
body. Then it is moved to a completed list and may be used thereafter.
2025-12-21 08:59:31 -06:00
Timothy Flynn
d08bd14928 LibWeb: Accumulate all network bytes in FetchedDataReceiver
This will allow us to hand off the bytes to the HTTP memory cache.
2025-12-21 08:59:31 -06:00
Timothy Flynn
46b3218241 LibHTTP+LibWeb: Use LibHTTP to calculate stale-while-revalidate values
No need to duplicate this in LibWeb.

In doing so, this also fixes an apparent bug for SWR handling in LibWeb.
We were previously deciding if we were in the SWR lifetime with:

    stale_while_revalidate > current_age

However, the SWR lifetime is meant to be an additional time on top of
the freshness lifetime:

    freshness_lifetime + stale_while_revalidate > current_age
2025-12-14 11:33:02 -05:00
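The corrected stale-while-revalidate comparison from 46b3218241 can be checked with a short sketch. The function names and the use of `std::chrono::seconds` are assumptions for illustration, not LibHTTP's signatures.

```cpp
#include <chrono>

using Seconds = std::chrono::seconds;

// A response is fresh while its age is below the freshness lifetime.
constexpr bool is_fresh(Seconds freshness_lifetime, Seconds current_age)
{
    return freshness_lifetime > current_age;
}

// The stale-while-revalidate window extends the freshness lifetime rather
// than replacing it, so the two durations are added before comparing.
constexpr bool is_within_stale_while_revalidate_window(
    Seconds freshness_lifetime, Seconds stale_while_revalidate, Seconds current_age)
{
    return freshness_lifetime + stale_while_revalidate > current_age;
}
```

For example, with a freshness lifetime of 60 seconds, a stale-while-revalidate value of 30 seconds, and a current age of 75 seconds, the response is stale but still inside the window, since 60 + 30 > 75. The earlier comparison, 30 > 75, would have rejected it.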
Timothy Flynn
854981714f LibWeb: Remove errant "non-standard" comment
The method this comment was attached to was removed in commit
a5df972055.
2025-12-14 11:33:02 -05:00
Aliaksandr Kalenik
f29212703e LibWeb/Fetch: Use GC::Function for fetch_main_content() callback
...instead of creating GC roots for captured values.
2025-12-09 08:51:48 -05:00
Timothy Flynn
adcf5462af LibWeb+WebContent: Rename the http-cache flag to http-memory-cache
Rather than having http-cache and http-disk-cache, let's rename the
former to http-memory-cache to be extra clear what we are talking about.
2025-12-02 12:19:42 +01:00
Timothy Flynn
2453f0bc04 LibHTTP+LibWeb: Use LibHTTP's cache implementation in LibWeb
There are a couple of remaining RFC 9111 methods in LibWeb's Fetch, but
these are currently directly tied to the way we store GC-allocated HTTP
response objects. So de-coupling that is left as a future exercise.
2025-11-29 08:35:02 -05:00
Timothy Flynn
9375660b64 LibHTTP+LibWeb+RequestServer: Move Fetch's HTTP header infra to LibHTTP
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.

The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using our home-grown header infra from LibHTTP.

So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.

This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
2025-11-27 14:57:29 +01:00
Timothy Flynn
3dce6766a3 LibWeb: Extract some CORS and MIME Fetch helpers to their own files
An upcoming commit will migrate the contents of Headers.h/cpp to LibHTTP
for use outside of LibWeb. These CORS and MIME helpers depend on other
LibWeb facilities, however, so they cannot be moved.
2025-11-27 14:57:29 +01:00
Timothy Flynn
0fd80a8f99 LibTextCodec+LibWeb: Move isomorphic coders to LibTextCodec
This will be used outside of LibWeb.
2025-11-27 14:57:29 +01:00
Timothy Flynn
7b0dfa61b1 LibWeb: Do not move header name/values that are re-used
Fixes a regression from commit:
f675cfe90f
2025-11-26 21:22:35 -05:00
Timothy Flynn
cbfae97101 LibWeb: Include empty header values when joining duplicated headers
Fixes a regression from commit:
f675cfe90f

It is not sufficient to only check if the builder is empty, as we will
then drop empty header values (when the first found value is empty).

This is tested in WPT by /cors/origin.htm, but that requires an HTTP
server.
2025-11-26 21:22:35 -05:00
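The pitfall described in cbfae97101 can be shown with a small sketch of joining duplicated header values. The `Header` alias and `join_header_values` are hypothetical, and names are compared exactly here for brevity, whereas real header names match case-insensitively. The sketch tracks whether a value has been seen with an optional instead of testing whether the output is empty.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical header representation: (name, value).
using Header = std::pair<std::string, std::string>;

// Joins all values for `name` with ", ". Returns an empty optional only when
// no header with that name exists, so an empty first value is still kept.
std::optional<std::string> join_header_values(std::vector<Header> const& headers, std::string const& name)
{
    std::optional<std::string> joined;
    for (auto const& [header_name, value] : headers) {
        if (header_name != name)
            continue;
        if (joined.has_value()) {
            // Checking joined->empty() here instead would drop the separator
            // whenever the first value found was empty.
            *joined += ", ";
            *joined += value;
        } else {
            joined = value;
        }
    }
    return joined;
}
```

With two headers of the same name whose values are an empty string and `a`, the result is `, a` rather than `a`.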
Timothy Flynn
f675cfe90f LibWeb: Store HTTP methods and headers as ByteString
The spec declares these as a byte sequence, which we then implemented as
a ByteBuffer. This has become pretty awkward to deal with, as evidenced
by the plethora of `MUST(ByteBuffer::copy(...))` and `.bytes()` calls
everywhere inside Fetch. We would then treat the bytes as a string
anyways by wrapping them in StringView everywhere.

We now store these as a ByteString. This is more comfortable to deal
with, and we no longer need to continually copy underlying storage (as
ByteString is ref-counted).

This work is largely preparatory for an upcoming HTTP header refactor.
2025-11-26 09:15:06 -05:00
Timothy Flynn
ed27eea091 LibWeb: Do not copy the result of HeaderList::extract_header_list_values
There's no need to copy the Vector out of this result every time we call
it. We can move it out or access it directly.
2025-11-26 09:15:06 -05:00
Timothy Flynn
44fbf6451e LibWeb: Simplify Fetch's build-content-range implementation
* Don't pass u64 by reference
* Don't double-format the range numbers
2025-11-26 09:15:06 -05:00
Timothy Flynn
d70224ad2e LibWeb: Organize Fetch Headers.h/Headers.cpp a bit
Generally just define things in the order they are declared (will make a
change to use ByteString in this file a bit easier to follow). Also make
a couple of free functions be class methods on Header / HeaderList.
2025-11-26 09:15:06 -05:00
Prajjwal
1f5ffe04c8 LibWeb: Fix race condition between read_all_bytes and stream population
There might be a race between read_all_bytes and stream population.
If the document load reads the stream before it is populated, the stream
will be empty, which might lead to a hang in SessionHistoryTraversalQueue,
which is expecting a promise to be resolved on document load.

This race can occur when stream population and document source are set
very close to each other. For example, when a newly generated blob is
set as the source of an iframe.
- navigation/multiple-navigable-cross-document-navigation.html has been
modified to trigger this race.
2025-11-26 12:27:12 +01:00
Aliaksandr Kalenik
69cede4a0f AK+LibWeb: Make StringBase::bytes() lvalue-only
Disallow calling `StringBase::bytes()` on temporaries to avoid returning
`ReadonlyBytes` that outlive the underlying string.

With this change, we catch a real UAF:
`load_result.data = maybe_response.release_value().bytes();`
All other updated call sites were already safe, they just needed to use
an intermediate named variable to satisfy the new lvalue-only
requirement.
2025-11-25 13:02:20 -05:00
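The lvalue-only restriction from 69cede4a0f relies on ref-qualified member functions. The sketch below uses a hypothetical `OwnedString` class with `std::string_view` in place of AK's `StringBase` and `ReadonlyBytes`; whether AK spells the overloads exactly this way is an assumption.

```cpp
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical string type; stands in for AK's StringBase.
class OwnedString {
public:
    explicit OwnedString(std::string value)
        : m_value(std::move(value))
    {
    }

    // Callable on lvalues: the view cannot outlive a named object by accident.
    std::string_view bytes() const& { return m_value; }

    // Deleted for rvalues: a view into a temporary would dangle immediately.
    std::string_view bytes() const&& = delete;

private:
    std::string m_value;
};

// Detection helper to show at compile time which calls are allowed.
template<typename T, typename = void>
struct can_call_bytes : std::false_type { };

template<typename T>
struct can_call_bytes<T, std::void_t<decltype(std::declval<T>().bytes())>> : std::true_type { };

static_assert(can_call_bytes<OwnedString&>::value);
static_assert(can_call_bytes<OwnedString const&>::value);
static_assert(!can_call_bytes<OwnedString&&>::value);
```

Under this scheme, an expression such as `make_string().bytes()` fails to compile, and the call site has to bind the temporary to a named variable first, which is the pattern the commit applies to the affected call sites.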
Aliaksandr Kalenik
0eb28a1a54 LibWeb: Delete unused BufferPolicy from fetch Request
This is no longer needed since all requests are unbuffered.
2025-11-20 06:29:13 -05:00