handle_connect_state (used by <link rel=preconnect>) attached easy
handles to the multi without setting CURLOPT_RESOLVE, so libcurl spawned
its own threaded resolver for them. When the system stub resolver was
slow, the resolver thread got stuck, and the next ~Request would
pthread_join() it on the main thread, blocking for many seconds.
Route Connect-mode requests through the same DNS + CURLOPT_RESOLVE path
that Fetch uses, so libcurl never spawns the thread.
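A minimal sketch of the idea, assuming our resolver has already produced an
address (the host and address literals here are illustrative):

#include <curl/curl.h>

// Pre-populate curl's DNS cache so it never consults a resolver itself.
// The "host:port:address" entry must outlive the transfer.
void apply_resolved_address(CURL* handle, curl_slist*& resolve_list)
{
    resolve_list = curl_slist_append(resolve_list, "example.com:443:93.184.216.34");
    curl_easy_setopt(handle, CURLOPT_RESOLVE, resolve_list);
}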
Per-request and per-connection logging that surfaces enough detail
to diagnose where time goes when a page load misbehaves. Gated by a
new REQUESTSERVER_WIRE_DEBUG cmakedefine.
Documentation/RequestServerWireLogging.md describes each label
(wire/wire+/wire++/wire^, wire-batch, wire-stall, wire-burst,
wire-pipe-pressure, LibDNS wire-dns, UI wire-cookie) and how to read
them.
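As a sketch, the compile-time gate looks something like this (the macro name
is illustrative, not the actual logging helper):

// REQUESTSERVER_WIRE_DEBUG comes from a cmakedefine, so in normal builds the
// logging compiles away entirely.
#ifdef REQUESTSERVER_WIRE_DEBUG
#    define WIRE_DBG(...) dbgln("wire: " __VA_ARGS__)
#else
#    define WIRE_DBG(...)
#endif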
write_queued_bytes_without_blocking() used to allocate a Vector sized
to the entire queued response buffer and memcpy every queued byte into
it, on every curl on_data_received callback. When the client pipe was
slower than the network, the buffer grew, and each arriving chunk
triggered a full copy of everything still queued. With enough pending
data this added up to tens of gigabytes of memcpy per request and
stalled RequestServer for tens of seconds.
Drain the stream in a loop using peek_some_contiguous() + send()
directly from the underlying chunk, and discard exactly what the
socket accepted. No intermediate buffer, no copy. The loop exits on
EAGAIN and enables the writer notifier, matching the previous
back-pressure behavior.
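A minimal sketch of the new drain loop, as a member function using AK's
ErrorOr/ReadonlyBytes; m_queue, m_fd, and m_write_notifier are stand-ins, and
the queue is assumed to expose peek_some_contiguous() and discard():

ErrorOr<void> write_queued_bytes_without_blocking()
{
    while (!m_queue.is_empty()) {
        // View the largest contiguous span at the head of the queue; no copy.
        ReadonlyBytes chunk = m_queue.peek_some_contiguous();
        ssize_t nwritten = ::send(m_fd, chunk.data(), chunk.size(), MSG_NOSIGNAL);
        if (nwritten < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                // Socket is full; re-arm the writer notifier to resume later.
                m_write_notifier->set_enabled(true);
                return {};
            }
            return Error::from_errno(errno);
        }
        // Discard exactly what the socket accepted; the rest stays queued.
        m_queue.discard(static_cast<size_t>(nwritten));
    }
    m_write_notifier->set_enabled(false);
    return {};
}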
When we request the HTTP cookie for a SWR request, we were providing the
cookie to the standard request whose ID matched the SWR request's ID.
This had two effects:
1. The SWR request would never finish.
2. If the corresponding standard request happened to be a connect-only
request, this would result in a crash as we were expecting it to have
gone through the normal fetch process.
This was seen on some articles on news.google.com.
The Fetch spec defines HTTP whitespace as tab, LF, CR, and space.
Previously, trim_whitespace was also stripping vertical tab (U+000B)
and form feed (U+000C), which are not HTTP whitespace characters.
Switch to HTTP::normalize_header_value, which matches the Fetch
definition.
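An illustrative sketch of the distinction (not the actual LibHTTP code):

#include <string_view>

// Fetch's HTTP whitespace: tab, LF, CR, and space only. Note the absence of
// U+000B (vertical tab) and U+000C (form feed).
static bool is_http_whitespace(char c)
{
    return c == '\t' || c == '\n' || c == '\r' || c == ' ';
}

static std::string_view trim_http_whitespace(std::string_view value)
{
    while (!value.empty() && is_http_whitespace(value.front()))
        value.remove_prefix(1);
    while (!value.empty() && is_http_whitespace(value.back()))
        value.remove_suffix(1);
    return value;
}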
Fixes 4 subtests for WPT test:
https://wpt.live/cors/origin.htm
It's possible for the client (WebContent) to stop a request while the
request is waiting for an async callback to be invoked. So we cannot
assume the request itself will still be alive once these callbacks are
finally invoked.
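A sketch of the guard, with std::weak_ptr standing in for however the request
is actually referenced:

#include <functional>
#include <memory>

struct Request {
    void continue_fetch() { /* resume the state machine */ }
};

std::function<void()> make_safe_callback(std::shared_ptr<Request> const& request)
{
    return [weak = std::weak_ptr<Request>(request)] {
        auto request = weak.lock();
        if (!request)
            return; // The client stopped the request before this callback ran.
        request->continue_fetch();
    };
}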
Substituted responses were missing the Access-Control-Allow-Origin
header, causing the CORS check to fail and the script to be treated as
a network error.
Fix this by adding `Access-Control-Allow-Origin: *` to all substituted
responses.
Also include the script's src attribute in the error message when a
script element's result is null, to make debugging easier.
The caching RFC is quite strict about the format of date strings. If we
received a revalidation attribute with an invalid date string, we would
previously fail a runtime assertion. This was because to start a
revalidation request, we would simply check for the presence of any
revalidation header; but then when we issued the request, we would fail
to parse the header, and end up with all attributes being null.
We no longer parse the revalidation attributes at all. Whatever we
receive in the Last-Modified response header is what we will send in the
If-Modified-Since request header, verbatim. For better or worse, this is
how other browsers behave. So if the server sends us an invalid date
string, it can receive its own date format for revalidation.
We currently attach HTTP cookie headers from LibWeb within Fetch. This
has the downside that the cookie IPC, and the infrastructure around it,
are all synchronous. This blocks the WebContent process entirely while
the cookie is being retrieved, for every request on a page.
We now attach cookie headers from RequestServer. The state machine in
RequestServer::Request allows us to easily do this work asynchronously.
We can also skip this work entirely when the response is served from
disk cache.
Note that we will continue to parse cookies in the WebContent process.
If something goes awry during parsing, we limit the damage to that
process, instead of the UI or RequestServer.
Also note that WebSocket requests still have cookie headers attached
from LibWeb. This will be handled in a future patch.
In the future, we may want to introduce a memory cache for cookies in
RequestServer to avoid IPC altogether where possible.
We don't want to be terminated when we write to a pipe that's closed,
so set SIGPIPE to be ignored in main. We already pass MSG_NOSIGNAL when
sending over our socket in RequestPipe, so this is a secondary measure.
However, cURL assumes by default that SIGPIPE is unhandled, so before
any operation that interacts with a pipe, it installs its own handler,
interacts with the pipe, then restores the original handler. Since we now
ignore the signal, we can just tell cURL not to do this extra work.
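Roughly (CURLOPT_NOSIGNAL is the relevant option; the rest is a sketch):

#include <csignal>
#include <curl/curl.h>

int main()
{
    // Ignore SIGPIPE process-wide; writes to a closed pipe now fail with
    // EPIPE instead of terminating the process.
    signal(SIGPIPE, SIG_IGN);

    CURL* handle = curl_easy_init();
    // We handle SIGPIPE ourselves, so cURL can skip saving and restoring
    // its own handler around pipe operations.
    curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L);
    // ... configure and perform transfers ...
    curl_easy_cleanup(handle);
}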
The only-if-cached directive currently behaves differently in the HTTP
Caching RFC compared to the Fetch spec. In the former, we must return
an HTTP 504 response if we do not find a cache entry. In the latter, we
must return a network error.
Note that similar to commit aa1517b727, we
cannot test the only-if-cached directive. We implement a same-origin
restriction aligned with the Fetch API that prevents our test infra from
exercising this directive.
If a request failed, or was stopped, do not attempt to write the cache
entry footer to disk. Note that at this point, the cache index will not
have been created, thus this entry will not be used in the future. We do
still delete any partial file on disk.
This serves as a more general fix for the issue addressed in commit
9f2ac14521.
We now partition the HTTP disk cache based on the Vary response header.
If a cached response contains a Vary header, we look for each of the
header names in the outgoing HTTP request. The outgoing request must
match every header value in the original request for the cache entry
to be used; otherwise, a new request will be issued, and a separate
cache entry will be created.
Note that we must now defer creating the disk cache file itself until we
have received the response headers. The Vary key is computed from these
headers, and affects the partitioned disk cache file name.
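A sketch of how such a key might be derived; the names and the hashing scheme
are assumptions, not the actual implementation:

#include <cstdint>
#include <functional>
#include <map>
#include <sstream>
#include <string>

using Headers = std::map<std::string, std::string>;

// Combine each header named by Vary with the value the outgoing request
// carries for it; the digest becomes part of the cache entry's file name.
uint64_t compute_vary_key(std::string const& vary, Headers const& request_headers)
{
    std::ostringstream combined;
    std::istringstream names(vary);
    std::string name;
    while (std::getline(names, name, ',')) {
        // (Whitespace trimming and case normalization elided for brevity.)
        auto it = request_headers.find(name);
        combined << name << ':' << (it != request_headers.end() ? it->second : "") << '\n';
    }
    return std::hash<std::string> {}(combined.str());
}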
There are further optimizations we can make here. If we have a Vary
mismatch, we could find the best candidate cached response and issue a
conditional HTTP request. The content server may then respond with an
HTTP 304 if the mismatched request headers turn out to be acceptable.
But for now, if we have a Vary mismatch, we issue an unconditional
request; this patch is purely correctness-oriented.
We need to store request headers in order to handle Vary mismatches.
(Note we should also be using BLOB for header storage in SQLite, as
headers are not necessarily UTF-8.)
If the cache mode is no-store, we must not interact with the cache at
all.
If the cache mode is reload, we must not use any cached response.
If the cache mode is only-if-cached or force-cache, we are permitted
to respond with stale cache responses.
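In sketch form (the helper names are illustrative; the real checks live in
the fetch algorithm):

enum class CacheMode { Default, NoStore, Reload, NoCache, ForceCache, OnlyIfCached };

bool may_read_from_cache(CacheMode mode)
{
    return mode != CacheMode::NoStore && mode != CacheMode::Reload;
}

bool may_write_to_cache(CacheMode mode)
{
    return mode != CacheMode::NoStore;
}

bool may_serve_stale(CacheMode mode)
{
    return mode == CacheMode::ForceCache || mode == CacheMode::OnlyIfCached;
}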
Note that we currently cannot test only-if-cached in test-web. Setting
this mode also requires setting the cors mode to same-origin, but our
http-test-server infra requires setting the cors mode to cors.
This adds support for intercepting network requests and serving local
file content instead. When a URL matches an entry in the substitution
map, the local file is served while preserving the original URL's
origin for cross-origin checks.
Usage:
Ladybird --resource-map=/path/to/map.json
The JSON file format is:
{
    "substitutions": [
        {
            "url": "https://example.com/script.js",
            "file": "/path/to/local/script.js",
            "content_type": "application/javascript",
            "status_code": 200
        }
    ]
}
Fields:
- url (required): Exact URL to intercept (query string and fragment
are stripped before matching)
- file (required): Absolute path to local file to serve
- content_type (optional): Override Content-Type header (defaults to
guessing from filename)
- status_code (optional): HTTP status code (defaults to 200)
This is incredibly useful for debugging production websites: you can
intercept any script, stylesheet, or other resource and replace it with
a local copy containing your own debug instrumentation, console.log
statements, or experimental fixes - all without modifying the actual
site or setting up a local dev server.
If the cURL request completes with anything other than CURLE_OK, we must
not keep the cache entry. For example, if the server's connection closes
while transferring data, we receive CURLE_PARTIAL_FILE. We don't want
this cache entry to be treated as valid in a subsequent request.
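A sketch of the completion check (the writer type is a stand-in):

#include <curl/curl.h>

template<typename CacheEntryWriter>
void on_transfer_complete(CURLcode result, CacheEntryWriter& writer)
{
    if (result != CURLE_OK) {
        // e.g. CURLE_PARTIAL_FILE when the connection closed mid-transfer:
        // drop the partial entry so it can never be served from the cache.
        writer.discard();
        return;
    }
    writer.commit();
}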
This directive allows our disk cache to serve stale responses for a time
indicated by the directive itself, while we revalidate the response in
the background.
Issuing requests that weren't initiated by a client is a new thing for
RequestServer. In this implementation, we associate the background request
with the client whose request hit the stale cache entry. This adds a
"background request" mode to the Request object, to prevent us from trying
to send any of the revalidation response over IPC.
Not super important, but this will match an upcoming ID for stale-while-
revalidate requests. It would also technically be UB if this ID had
ever overflowed. Let's make it 64-bit while we're here to avoid the
possibility that it ever will.
We currently have two ongoing implementations of RFC 9111, HTTP caching.
In order to consolidate these, this patch moves the implementation from
RequestServer to LibHTTP for re-use within LibWeb.
The end goal here is for LibHTTP to be the home of our RFC 9111 (HTTP
caching) implementation. We currently have one implementation in LibWeb
for our in-memory cache and another in RequestServer for our disk cache.
The implementations both largely revolve around interacting with HTTP
headers. But in LibWeb, we are using Fetch's header infra, and in RS we
are using our home-grown header infra from LibHTTP.
So to give these a common denominator, this patch replaces the LibHTTP
implementation with Fetch's infra. Our existing LibHTTP implementation
was not particularly compliant with any spec, so this at least gives us
a standards-based common implementation.
This migration also required moving a handful of other Fetch AOs over
to LibHTTP. (It turns out these AOs were all from the Fetch/Infra/HTTP
folder, so perhaps it makes sense for LibHTTP to be the implementation
of that entire set of facilities.)
This effectively reverts 9b8f6b8108.
I misunderstood what Chrome was doing here - they will issue a range
request only for what they call "sparse" cache entries. These entries
are basically used to cache a partial large file, e.g. a multi-gigabyte
video. If they hit a legitimate read error, they will fail the request
with an ERR_CACHE_READ_FAILURE status.
We will now (again) fail with a network error when a cache read fails.
For example, we will want to be able to test that a cached object was
expired after N seconds. Rather than waiting that time during testing,
this adds a testing-only request header to internally advance the clock
for a single HTTP request.
This mode allows us to test the HTTP disk cache with two mechanisms:
1. If RequestServer is launched with --http-disk-cache-mode=testing, it
will cache requests with an X-Ladybird-Enable-Disk-Cache header.
2. In test mode, RS will include an X-Ladybird-Disk-Cache-Status response
header indicating how the response was handled by the cache. There is
no standard way for a web request to know what happened with respect
to the disk cache, so this fills that hole for testing.
This mode is not exposed to users.
The Windows RequestPipe implementation uses a non-blocking local socket
pair, which means the non-fatal "resource is temporarily unavailable"
error that can occur in the non-blocking HTTP response data writes can
be retried. This was seen often when loading https://ladybird.org.
While the EAGAIN errno is defined on Windows, WSAEWOULDBLOCK is the
error code returned in this scenario, so we were not detecting that we
could retry and treated the failed write attempt as a proper error.
We now detect WSAEWOULDBLOCK and convert it into the errno equivalent
EWOULDBLOCK. There is precedent for doing a similar conversion in the
Windows PosixSocketHelper::read() implementation.
Finally, we retry when we receive either the EAGAIN or EWOULDBLOCK error
code on all platforms. While POSIX allows these two error codes to have
the same value, which they do on Linux according to
https://www.man7.org/linux/man-pages/man3/errno.3.html, it is not
guaranteed. So we now ensure platforms that return EWOULDBLOCK with a
value different from EAGAIN also perform write retries.
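In sketch form (the helper names are illustrative):

#ifdef _WIN32
#    include <winsock2.h>
#else
#    include <sys/socket.h>
#endif
#include <cerrno>

// After a failed send(), normalize the Winsock error to its errno equivalent
// so the retry check works uniformly on all platforms.
void normalize_write_error()
{
#ifdef _WIN32
    if (WSAGetLastError() == WSAEWOULDBLOCK)
        errno = EWOULDBLOCK;
#endif
}

// POSIX permits EAGAIN == EWOULDBLOCK (true on Linux) but does not require
// it, so check both.
bool should_retry_write()
{
    return errno == EAGAIN || errno == EWOULDBLOCK;
}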
When a request becomes stale, we will now issue a revalidation request
(if the response indicates it may be revalidated). We do this by issuing
a normal fetch request, with If-None-Match and/or If-Modified-Since
request headers.
If the server replies with an HTTP 304 status, we update the stored
response headers to match the 304's headers, and serve the response to
the client from the cache.
If the server replies with any other code, we remove the cache entry.
We will open a new cache entry to cache the new response, if possible.
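A sketch of building the conditional request; the header storage type is a
stand-in, and the validators are echoed verbatim, as described in the
date-string commit above:

#include <map>
#include <string>

using Headers = std::map<std::string, std::string>;

// Copy the stored validators into the outgoing revalidation request.
void add_revalidation_headers(Headers const& stored_response, Headers& request)
{
    if (auto it = stored_response.find("ETag"); it != stored_response.end())
        request["If-None-Match"] = it->second;
    if (auto it = stored_response.find("Last-Modified"); it != stored_response.end())
        request["If-Modified-Since"] = it->second;
}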
We currently store response headers in the cache entry file, before the
response body. When we implement cache revalidation, we will need to
update the stored response headers with whatever headers are received
in a 304 response. It's not unlikely that those headers will have a size
that differs from the stored headers. We would then have to rewrite the
entire response body after the new headers.
Instead of dealing with those inefficiencies, let's instead store the
response headers in the cache index. This will allow us to update the
headers with a simple SQL query.
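For example, with an assumed index schema holding (cache_key,
response_headers) columns, the refresh becomes a single statement:

// Hypothetical schema: refreshing headers after a 304 is one row update
// instead of rewriting the on-disk body file.
char const* update_headers_sql =
    "UPDATE cache_entries SET response_headers = ? WHERE cache_key = ?;";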
The Win32 API equivalent to pipe2() is CreatePipe(), which creates read
and write anonymous pipe handles that we can set to non-blocking via
SetNamedPipeHandleState(); however, this initial approach caused issues
as our Windows infrastructure assumes socket-based handles/fds and that
we don't use Windows pipes at all, see Core::System::is_socket() in
SystemWindows.cpp. So we use socketpair() to keep our current
assumptions true.
Given that Windows uses socketpair() and Unix uses pipe2(), this
RequestPipe abstraction avoids ifdef soup by hiding the details about
how the read/write fds pair is created and how response data is written
to the client.
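A sketch of the split the abstraction hides (the Windows socketpair() here is
Ladybird's shim, and error handling is elided):

#ifndef _WIN32
#    include <fcntl.h>
#    include <unistd.h>
#endif

struct RequestPipe {
    int read_fd { -1 };
    int write_fd { -1 };
};

RequestPipe create_request_pipe()
{
    int fds[2];
#ifdef _WIN32
    // A non-blocking AF_UNIX socket pair keeps Core::System::is_socket()
    // true for every fd our Windows code touches.
    socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
#else
    pipe2(fds, O_CLOEXEC | O_NONBLOCK);
#endif
    return { fds[0], fds[1] };
}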
We previously had no protection against the same URL being requested
multiple times at the same time. For example, if a URL did not have any
cache entry and became requested twice, we would open two cache writers
concurrently. This would result in both writers piping the response to
disk, and we'd have a corrupt cache file.
We now hold back requests under certain scenarios until existing cache
entries have completed:
* If we are opening a cache entry for reading:
- If there is an existing reader entry, carry on as normal. We can
have multiple readers.
- If there is an existing writer entry, defer the request until it is
complete.
* If we are opening a cache entry for writing:
- If there is an existing reader or writer entry, defer the request
until it is complete.
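The decision table in sketch form (names are illustrative):

#include <cstddef>

enum class OpenMode { Read, Write };
enum class OpenResult { Proceed, Defer };

// Multiple readers may coexist; a writer excludes everyone else.
OpenResult try_open_cache_entry(OpenMode mode, size_t active_readers, bool active_writer)
{
    if (mode == OpenMode::Read)
        return active_writer ? OpenResult::Defer : OpenResult::Proceed;
    return (active_readers > 0 || active_writer) ? OpenResult::Defer : OpenResult::Proceed;
}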
This object will be needed in a future commit to store requests awaiting
other requests to finish. Doing this in a separate commit just to make
that commit less noisy.
We previously waited until we received all response headers before we
would create the cache entry. We now create one immediately, and handle
writing the headers in its own function. This will allow us to know if
a cache entry writer already exists for a given cache key, and thus
prevent creating a second writer at the same time.
We currently manage request lifetime as both an ActiveRequest structure
and a series of lambda callbacks. In an upcoming patch, we will want to
"pause" a request to de-duplicate equivalent requests, such that only
one request goes over the network and saves its response to the disk
cache.
To make that easier to reason about, this adds a Request class to manage
the lifetime of a request via a state machine. We will now be able to
add a "waiting for disk cache" state to stop the request.