17 Commits

Author SHA1 Message Date
Shannon Booth
65f4aae1d8 LibURL+Everywhere: Only percent decode URL paths when actually needed
Web specs never expose percent-decoded URL path components to
JavaScript, but we were doing so in a number of places due to the
default behaviour of URL::serialize_path.

Since a percent-encoded URL path may decode to bytes that are not valid
UTF-8, this was causing us to crash in these places.

For example - on an HTMLAnchorElement when retrieving the pathname for
the URL of:

http://ladybird.org/foo%C2%91%91

To fix this, make the URL class only return the percent-encoded
serialized path, matching the URL spec. When the decoded path is
required instead, explicitly call URL::percent_decode.
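
To illustrate why the example URL above is a problem, here is a minimal
standalone sketch in standard C++. The helpers percent_decode_bytes and
is_valid_utf8 are hypothetical names invented for this illustration and
are not LibURL's API. The sketch only shows that decoding
/foo%C2%91%91 yields a stray continuation byte, so the decoded path
cannot be stored in a string type that requires valid UTF-8.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Returns the numeric value of a hex digit, or -1 if c is not one.
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

// Hypothetical helper (not LibURL's API): turns %XX escapes into raw bytes.
// Malformed escapes are copied through unchanged.
std::string percent_decode_bytes(std::string_view input)
{
    std::string output;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '%' && i + 2 < input.size()) {
            int high = hex_value(input[i + 1]);
            int low = hex_value(input[i + 2]);
            if (high >= 0 && low >= 0) {
                output.push_back(static_cast<char>(high * 16 + low));
                i += 2;
                continue;
            }
        }
        output.push_back(input[i]);
    }
    return output;
}

// Hypothetical helper: checks that bytes form well-formed UTF-8.
bool is_valid_utf8(std::string_view bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        auto lead = static_cast<unsigned char>(bytes[i]);
        std::size_t length = 0;
        // Allowed range for the second byte; tightened for some lead bytes
        // to reject overlong forms, surrogates, and values above U+10FFFF.
        unsigned char min_second = 0x80;
        unsigned char max_second = 0xBF;
        if (lead <= 0x7F) {
            length = 1;
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            length = 2;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            length = 3;
            if (lead == 0xE0)
                min_second = 0xA0;
            if (lead == 0xED)
                max_second = 0x9F;
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            length = 4;
            if (lead == 0xF0)
                min_second = 0x90;
            if (lead == 0xF4)
                max_second = 0x8F;
        } else {
            return false; // Stray continuation byte or invalid lead byte.
        }
        if (i + length > bytes.size())
            return false; // Sequence is truncated.
        for (std::size_t k = 1; k < length; ++k) {
            auto c = static_cast<unsigned char>(bytes[i + k]);
            unsigned char low = (k == 1) ? min_second : 0x80;
            unsigned char high = (k == 1) ? max_second : 0xBF;
            if (c < low || c > high)
                return false;
        }
        i += length;
    }
    return true;
}
```

With these helpers, the percent-encoded path is plain ASCII and passes
is_valid_utf8, while the decoded bytes end in a lone 0x91 and do not.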

This fixes a crash running WPT URL tests for the anchor element on:

https://wpt.live/url/a-element.html
(cherry picked from commit cc557323326ba55514ef2a8a6e0efd7f09330f06;
amended heavily to call `URL::percent_decode()` on all results of
`url.serialize_path()` in the rest of serenity -- except in
LibGemini, where it looked incorrect, and in LibHTTP, where
LadybirdBrowser/ladybird#983 will add it.)
2024-11-21 17:47:14 -05:00
Shannon Booth
1afda6b82a LibURL+LibWeb: Pass a mutable reference URL to URL parser
If given, the spec expects the input URL to be manipulated on the fly
as it is being parsed, and may ignore any errors thrown by the URL
parser.

Previously, we were not exactly following the spec's assumption here,
which resulted in us needing to make awkward copies of the URL in these
situations.

For most cases this is not an issue. But it does cause problems for
situations where URL parsing would result in a failure (which is
ignored by the caller), and the URL is _partially_ updated
while parsing.

Such a situation can occur when setting the host of an href alongside a
port number which is not valid. It is expected that this situation will
result in the host being updated, but not the port number.

Adjust the URL parser API so that it mutates the URL given (if any), and
adjust the callers accordingly.
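
The partial-update behaviour described above can be sketched in
standalone C++. The Url struct and parse_host_and_port function below
are hypothetical and heavily simplified; they are not the LibURL parser
API. They only show how a parser that mutates the URL it is given can
fail while still leaving the host updated.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical, simplified URL record for illustration only.
struct Url {
    std::string host;
    std::optional<std::uint16_t> port;
};

// Hypothetical parser for "host" or "host:port" input. It mutates `url`
// as it goes, so a failure while parsing the port leaves the already
// written host in place. The caller may ignore the returned failure.
bool parse_host_and_port(std::string_view input, Url& url)
{
    auto colon = input.find(':');
    auto host = input.substr(0, colon);
    if (host.empty())
        return false;
    url.host = std::string(host); // The host is updated first.

    if (colon == std::string_view::npos)
        return true;

    auto port_text = input.substr(colon + 1);
    if (port_text.empty())
        return true;

    std::uint16_t port = 0;
    auto const* begin = port_text.data();
    auto const* end = begin + port_text.size();
    auto result = std::from_chars(begin, end, port);
    if (result.ec != std::errc() || result.ptr != end)
        return false; // Invalid port: the host stays updated, the port is untouched.

    url.port = port;
    return true;
}
```

Calling parse_host_and_port("ladybird.org:99999", url) on a URL that
previously had port 8080 reports failure, changes the host to
ladybird.org, and keeps the port at 8080.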

Fixes two tests on https://wpt.live/url/url-setters-a-area.window.html

(cherry picked from commit ff71d8f2c97441bff5975c117a7e7c8820e33e44)
2024-11-09 07:30:40 -05:00
Shannon Booth
3b2bf6534d LibWeb: Don't propagate small OOMs from URLSearchParams
Made easier now that URL percent encode after encoding is also not
throwing any errors. This simplifies a bunch of error handling.

(cherry picked from commit df4739d7ced4159deb2b3e40ba6a1a08b7e7dd5b)
2024-10-19 15:26:29 -04:00
Shannon Booth
db4bab8041 LibURL: Make percent_encode return a String
This simplifies a bunch of places which were needing to error check and
convert from a ByteString to String.

(cherry picked from commit 84a7fead0eefd967d4319f4d71c0a0ca3095d2d1)
2024-10-16 23:56:40 -04:00
BenJilks
26facdeecc LibWeb: Use text encoding from DOM when parsing URLs
This passes the DOM encoding down to the URL parser, so the correct
encoder can be used.

(cherry picked from commit c1958437f983bb9761661534da34934c8dddcf6f)
2024-10-15 22:54:51 -04:00
Shannon Booth
8467171260 LibWeb: Don't strip leading '?' in query initializing a URL
In our implementation of url-initialize, we were invoking the
constructor of URLSearchParams:

https://url.spec.whatwg.org/#dom-urlsearchparams-urlsearchparams

Instead of the 'initialize' AO:

https://url.spec.whatwg.org/#urlsearchparams-initialize

This has the small difference of stripping any leading '?' from the
query (which we are not meant to be doing!).
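
The difference between the two entry points can be sketched in
standalone C++. The function names initialize_from_query and
construct_from_string are hypothetical stand-ins for the spec's
'initialize' AO and the URLSearchParams string constructor, and the
splitting logic is a simplified approximation that skips percent
decoding.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using Pairs = std::vector<std::pair<std::string, std::string>>;

// Hypothetical stand-in for the 'initialize' AO: splits the query on '&'
// and '=' exactly as given. No percent decoding is performed here.
Pairs initialize_from_query(std::string_view query)
{
    Pairs pairs;
    while (!query.empty()) {
        auto ampersand = query.find('&');
        auto piece = query.substr(0, ampersand);
        query = (ampersand == std::string_view::npos)
            ? std::string_view {}
            : query.substr(ampersand + 1);
        if (piece.empty())
            continue;
        auto equals = piece.find('=');
        std::string name(piece.substr(0, equals));
        std::string value = (equals == std::string_view::npos)
            ? std::string()
            : std::string(piece.substr(equals + 1));
        pairs.emplace_back(std::move(name), std::move(value));
    }
    return pairs;
}

// Hypothetical stand-in for the URLSearchParams string constructor:
// strips one leading '?' before initializing.
Pairs construct_from_string(std::string_view init)
{
    if (!init.empty() && init.front() == '?')
        init.remove_prefix(1);
    return initialize_from_query(init);
}
```

For a URL query of "?a=b" (for instance from a URL ending in "??a=b"),
the initialize-style path keeps the parameter name "?a", while the
constructor-style path would wrongly turn it into "a".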

(cherry picked from commit fd4e943e12aa077f11a537164166fdfb82e29e8d)
2024-10-15 12:08:50 -04:00
Shannon Booth
1308cab372 LibURL+LibWeb: Do not percent decode in password/username getters
Doing it is not part of the spec. Whenever needed, the spec will
explicitly percent decode the username and password.

This fixes some URL WPT tests.

(cherry picked from commit f511c0b441a591bc85f409242229c7b295e118e4)
2024-10-15 12:08:50 -04:00
Kemal Zebari
ddefb5a822 LibWeb: Implement Blob::bytes()
Implements https://w3c.github.io/FileAPI/#dom-blob-bytes.

(cherry picked from commit c5f1e478838092dcf6e4ad8ee0bfef32a47e2d68)
2024-07-28 07:29:31 -04:00
circl
8287790913 LibWeb: Consider resource: URLs to be trustworthy and non-opaque
This makes icons once again load in the directory listings.

(cherry picked from commit d14888f31a8378f319efa18028083ff605105101)
2024-06-26 23:11:35 +02:00
Shannon Booth
9b6a1de777 LibWeb: Implement URL.parse
This was an addition to the URL spec, see:

https://github.com/whatwg/url/commit/58acb0
2024-05-13 09:21:12 +02:00
Shannon Booth
67ea56da59 LibWeb: Factor out an 'initialize a URL' AO
This is a small refactor in the URL spec to avoid duplication as part of
the introduction of URL.parse, see:

https://github.com/whatwg/url/commit/58acb0
2024-05-13 09:21:12 +02:00
Shannon Booth
2457fdc7a4 LibWeb: Attach blob to URL on DOMURL::parse
On a non-basic URL parse, we are meant to perform a lookup on the blob
URL registry and attach any blob to that URL if present. Now - we do
that :^)
2024-05-12 15:46:29 -06:00
Shannon Booth
bad44f8fc9 LibWeb: Remove Bindings/Forward.h from LibWeb/Forward.h
This was resulting in a whole lot of rebuilding whenever a new IDL
interface was added.

Instead, just directly include the prototype in every C++ file which
needs it. We only really need a forward declaration in each cpp file,
but including the full prototype header (which itself only includes
LibJS/Object.h, which is already transitively brought in by
PlatformObject) seems like a small price to pay compared to what
feels like a full rebuild of LibWeb whenever a new IDL file is added.

Given all of these includes are only needed for the ::initialize
method, there is probably a smart way of avoiding this problem
altogether. I've considered either using some macro trickery or
generating these functions somehow instead.
2024-04-27 18:29:35 -04:00
Shannon Booth
e800605ad3 AK+LibURL: Move AK::URL into a new URL library
This URL library ends up being a relatively fundamental base library of
the system, as LibCore depends on LibURL.

This change has two main benefits:
 * Moving AK back more towards being an agnostic library that can
   be used between the kernel and userspace. URL has never really fit
   that description - and is not used in the kernel.
 * URL _should_ depend on LibUnicode, as it needs punycode support.
   However, it's not really possible to do this inside of AK as it can't
   depend on any external library. This change brings us a little closer
   to being able to do that, but unfortunately we aren't there quite
   yet, as the code generators depend on LibCore.
2024-03-18 14:06:28 -04:00
Andreas Kling
c0d7f748ed LibWeb: Avoid FlyString lookups when setting IDL interface prototypes
This commit introduces a WEB_SET_PROTOTYPE_FOR_INTERFACE macro that
caches the interface name in a local static FlyString. This means that
we only pay for FlyString-from-literal lookup once per browser lifetime
instead of every time the interface is instantiated.
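
The caching idea can be sketched in standalone C++. Everything below is
a hypothetical analogue: intern() stands in for a FlyString-from-literal
lookup, CACHED_INTERFACE_NAME stands in for the caching part of the
macro, and g_intern_lookups exists only to make the saved lookups
observable. It is not the actual WEB_SET_PROTOTYPE_FOR_INTERFACE
definition.

```cpp
#include <string>
#include <unordered_set>

// Counts how often the (comparatively expensive) intern lookup runs.
static int g_intern_lookups = 0;

// Hypothetical stand-in for a FlyString-from-literal lookup: returns a
// reference to a single shared copy of the string.
const std::string& intern(const char* literal)
{
    static std::unordered_set<std::string> table;
    ++g_intern_lookups;
    // References to unordered_set elements stay valid across rehashing.
    return *table.emplace(literal).first;
}

// Caches the interned name in a function-local static, so each use site
// performs the lookup only the first time it is reached.
#define CACHED_INTERFACE_NAME(name)                           \
    ([]() -> const std::string& {                             \
        static const std::string& cached = intern(#name);     \
        return cached;                                        \
    }())

// Example use site: imagine this runs every time a Blob is instantiated.
const std::string& blob_interface_name()
{
    return CACHED_INTERFACE_NAME(Blob);
}
```

Calling blob_interface_name() repeatedly performs the interning lookup
once; later calls only read the cached reference.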
2024-03-16 16:35:54 +01:00
Shannon Booth
9ce8189f21 Everywhere: Use unqualified AK::URL
This is possible in LibWeb now that there is no longer a Web::URL.
2024-02-25 08:54:31 +01:00
Shannon Booth
f9e5b43b7a LibWeb: Rename URL platform object to DOMURL
Along with putting functions in the URL namespace into a DOMURL
namespace.

This is done as LibWeb is in an awkward situation where it needs
two URL classes. AK::URL is the general purpose URL class which
is all that is needed in 95% of cases. URL in the Web namespace
is needed predominantly for interfacing with the JavaScript
interfaces.

Because of two URLs in the same namespace, AK::URL has had to be
used throughout LibWeb. If we move AK::URL into a URL namespace,
this becomes more painful - where ::URL::URL is required to
specify the constructor (and something like
::URL::create_with_url_or_path in other places).

To fix this problem - rename the class in LibWeb implementing the
URL IDL interface to DOMURL, along with moving the other Web URL
related classes into this DOMURL folder.

One could argue that this name also makes it a little clearer
in LibWeb why these two URL classes need to be used in the
first place.
2024-02-25 08:54:31 +01:00