Commit Graph

27 Commits

Author SHA1 Message Date
Shannon Booth
4c181bdf7f LibWeb: Implement value argument of URLSearchParams.has
(cherry picked from commit 264b5160c2099f1aab6f06ece720de984ea994b2)
2024-10-19 15:26:29 -04:00
Shannon Booth
6f129a8e1f LibWeb: Implement value argument of URLSearchParams.delete
(cherry picked from commit 5637dc43b2d6faf0576e7844581d0ced882d007f)
2024-10-19 15:26:29 -04:00
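The two commits above add an optional value argument to `URLSearchParams.has` and `URLSearchParams.delete`. A minimal sketch of that matching rule follows, assuming the list is stored as name/value pairs; `Params`, `has_param` and `delete_param` are hypothetical names for illustration and are not LibWeb's API.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a URLSearchParams list.
using Params = std::vector<std::pair<std::string, std::string>>;

// A tuple matches when the name is equal and, if a value was supplied,
// the value is equal too.
bool tuple_matches(std::pair<std::string, std::string> const& tuple,
    std::string const& name, std::optional<std::string> const& value)
{
    return tuple.first == name && (!value.has_value() || tuple.second == *value);
}

bool has_param(Params const& list, std::string const& name,
    std::optional<std::string> const& value = std::nullopt)
{
    return std::any_of(list.begin(), list.end(),
        [&](auto const& tuple) { return tuple_matches(tuple, name, value); });
}

// Returns a copy of the list with every matching tuple removed.
Params delete_param(Params list, std::string const& name,
    std::optional<std::string> const& value = std::nullopt)
{
    list.erase(std::remove_if(list.begin(), list.end(),
                   [&](auto const& tuple) { return tuple_matches(tuple, name, value); }),
        list.end());
    return list;
}
```

Leaving the value out keeps the older name-only behaviour, which is why it is modelled as an optional here.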
Shannon Booth
8c62701f82 LibWeb/Tests: Relocate search params test into URL folder
This is where I have been trying to put all of the URL tests.

(cherry picked from commit e2fb24c9b8e973542e8313ddb9e0f9f85a71c0e8)
2024-10-19 15:26:29 -04:00
Shannon Booth
fe2f85c5a0 LibWeb: Implement USVString scalar value handling
USVString is defined in the IDL spec as:

> The USVString type corresponds to scalar value strings. Depending on
> the context, these can be treated as sequences of either 16-bit
> unsigned integer code units or scalar values.

This means we need to account for surrogate code points by using the
replacement character.

This fixes the last test in https://wpt.live/url/url-constructor.any.html

(cherry picked from commit aa32bfa4481f6298c99846025394b7bc415ca621)
2024-10-17 20:28:06 -04:00
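The surrogate handling described in fe2f85c5a0 can be sketched as a conversion over UTF-16 code units. This is an illustration under assumptions, not the LibWeb code: `to_scalar_value_string` is a hypothetical helper, and the input is assumed to be a `std::u16string`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace every unpaired surrogate with U+FFFD
// REPLACEMENT CHARACTER and keep well-formed surrogate pairs intact.
std::u16string to_scalar_value_string(std::u16string const& input)
{
    auto is_high = [](char16_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; };
    auto is_low = [](char16_t unit) { return unit >= 0xDC00 && unit <= 0xDFFF; };

    std::u16string output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        char16_t unit = input[i];
        if (is_high(unit) && i + 1 < input.size() && is_low(input[i + 1])) {
            // A valid pair encodes one scalar value; copy both units.
            output.push_back(unit);
            output.push_back(input[++i]);
        } else if (is_high(unit) || is_low(unit)) {
            // A lone surrogate is not a scalar value.
            output.push_back(u'\uFFFD');
        } else {
            output.push_back(unit);
        }
    }
    return output;
}
```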
Alisson Lauffer
73bbe2b801 LibWeb: Replace "+" in value with a space while decoding search params
(cherry picked from commit d38b28b57b88e865b580f0d5e33310631f13e62c)
2024-10-16 23:56:40 -04:00
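The decoding order behind 73bbe2b801 matters: "+" is turned into a space before percent decoding, so an escaped `%2B` still comes out as a literal "+". A rough sketch, with hypothetical helper names and operating on bytes only:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: decode %XX escapes and leave malformed ones untouched.
std::string percent_decode(std::string const& input)
{
    std::string output;
    for (std::size_t i = 0; i < input.size(); ++i) {
        bool is_escape = input[i] == '%' && i + 2 < input.size()
            && std::isxdigit(static_cast<unsigned char>(input[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(input[i + 2]));
        if (!is_escape) {
            output.push_back(input[i]);
            continue;
        }
        output.push_back(static_cast<char>(std::stoi(input.substr(i + 1, 2), nullptr, 16)));
        i += 2;
    }
    return output;
}

// Hypothetical helper for one name or value of a urlencoded pair.
std::string decode_form_component(std::string component)
{
    // Replace '+' first, then percent decode.
    std::replace(component.begin(), component.end(), '+', ' ');
    return percent_decode(component);
}
```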
Shannon Booth
8f243cc632 LibWeb: Actually run UTF-8 decode without BOM
This fixes a crash using URLSearchParams when provided a percent encoded
string which does not percent decode to valid UTF-8.

Fixes a crash running https://wpt.live/url/urlencoded-parser.any.html

(cherry picked from commit 9c72fc9642266ac92dedbccac7d8c0bd238450cd)
2024-10-16 23:56:40 -04:00
Shannon Booth
decc664458 LibURL: Fail parsing IPV4 URLs starting with 0x that overflow
Parsing last as an IPv4 number was not returning true in "ends with a
number" as the parsing of that part was overflowing. This meant that the
URL was not considered to be an IPv4 address, and was treated as a valid
domain.

Helpfully, the spec also points out in a note that this step is
equivalent to simply checking that the last part ends with 0x followed
by only hex digits - which doesn't suffer from any overflow problem!

Arguably this is an editorial issue in the spec where this should be
clarified a little bit. But for now, fixing this fixes 3 sub tests in
WPT for:

https://wpt.live/url/url-constructor.any.html
(cherry picked from commit 6cac2981fb45498f7e5b84ded2669fb62111da17)
2024-10-15 12:08:50 -04:00
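The overflow-free check that decc664458 describes can be sketched purely on characters, without parsing any number. This is an assumption-laden illustration: `ends_in_a_number` is a hypothetical free function, and the "0x followed by only hex digits" equivalence is taken from the commit message above.

```cpp
#include <algorithm>
#include <string_view>

bool is_ascii_digit(char c) { return c >= '0' && c <= '9'; }

bool is_ascii_hex_digit(char c)
{
    return is_ascii_digit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// Hypothetical sketch: does the last dot-separated part look like a number?
bool ends_in_a_number(std::string_view host)
{
    // A single trailing dot is ignored.
    if (!host.empty() && host.back() == '.')
        host.remove_suffix(1);
    if (host.empty())
        return false;

    auto dot = host.rfind('.');
    std::string_view last = dot == std::string_view::npos ? host : host.substr(dot + 1);

    // Non-empty and all decimal digits.
    if (!last.empty() && std::all_of(last.begin(), last.end(), is_ascii_digit))
        return true;

    // "0x" or "0X" followed by only hex digits; no value is computed,
    // so nothing can overflow.
    if (last.size() < 2 || last[0] != '0' || (last[1] != 'x' && last[1] != 'X'))
        return false;
    last.remove_prefix(2);
    return std::all_of(last.begin(), last.end(), is_ascii_hex_digit);
}
```

With a check like this, a host whose last part is a very long hex literal is still recognised as ending in a number, so the later IPv4 parsing step can reject it.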
Shannon Booth
e01e49e00d LibURL: Fix heuristic for URL domain parsing IDNA fast path
Our heuristic was a bit too simplistic and would not run through the
ToASCII Unicode algorithm, which performs some extra validation. This
would cause invalid URLs that should fail to parse to be mistakenly
accepted.

This fixes 8 tests in: https://wpt.live/url/url-constructor.any.html

(cherry picked from commit db3f1180464eefc841d2eccc5a6b441398dd164d)
2024-10-15 12:08:50 -04:00
Shannon Booth
8467171260 LibWeb: Don't strip leading '?' in query initializing a URL
In our implementation of url-initialize, we were invoking the
constructor of URLSearchParams:

https://url.spec.whatwg.org/#dom-urlsearchparams-urlsearchparams

Instead of the 'initialize' AO:

https://url.spec.whatwg.org/#urlsearchparams-initialize

This has the small difference of stripping any leading '?' from the
query (which we are not meant to be doing!).

(cherry picked from commit fd4e943e12aa077f11a537164166fdfb82e29e8d)
2024-10-15 12:08:50 -04:00
Shannon Booth
c9798dc7b3 LibWeb/Tests: Also verify URL search params in constructor tests
(cherry picked from commit d7b19b32787863f05fc97e551b4172a02f6c8563)
2024-10-15 12:08:50 -04:00
Shannon Booth
b40275ce92 LibURL: Don't return early parsing a URL with an empty input
We can't simply use the base URL as it may need to be modified in some
form. For example - for the included test, the fragment was previously
being included in the resulting URL.

This fixes 1 test on https://wpt.live/url/url-constructor.any.html

(cherry picked from commit 1dc4959e915e5f994fb84bf43870a31554ac5d81)
2024-10-15 12:08:50 -04:00
Shannon Booth
8fbf3327eb LibURL: Remove unspecified steps in URL file slash parsing state
There were some extra steps in there which produced wrong results for
relative file URLs.

Fixes 7 test cases in: https://wpt.live/url/url-constructor.any.html

We also need to adjust the test results in TestURL. The behaviour tested
does not match how URL is specified to work as an absolute relative is
given.

(cherry picked from commit 8723f72f0f2ddabbdd2335afca6124f41ba5f267)
2024-10-15 12:08:50 -04:00
Shannon Booth
6c66935420 LibURL: Allow inputs containing only whitespace
The check:

```
    if (start_index >= end_index)
        return {};
```

which was meant to prevent an out-of-bounds access when trimming
whitespace from the start and end of the input, was also preventing
valid URLs (inputs containing only whitespace) from being parsed.

Instead, prevent start_index from ever getting above end_index in the
first place, and don't treat empty inputs as an error.

Fixes one WPT test on:

https://wpt.live/url/url-constructor.any.html
(cherry picked from commit d6af5bf5eb814fcb30959d2bf9fc666cdba716c7)
2024-10-15 12:08:50 -04:00
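The approach in 6c66935420, keeping `start_index` from ever passing `end_index` rather than bailing out, might look roughly like this. The function name and the use of `std::string_view` are assumptions for illustration; the characters trimmed are assumed to be C0 controls and space (U+0000 to U+0020).

```cpp
#include <cstddef>
#include <string_view>

bool is_c0_control_or_space(char c) { return static_cast<unsigned char>(c) <= 0x20; }

// Hypothetical sketch: trim both ends; an all-whitespace input yields an
// empty view instead of an error.
std::string_view trim_c0_control_or_space(std::string_view input)
{
    std::size_t start_index = 0;
    std::size_t end_index = input.size();

    // Both loops are bounded by the other index, so they can meet but never cross.
    while (start_index < end_index && is_c0_control_or_space(input[start_index]))
        ++start_index;
    while (end_index > start_index && is_c0_control_or_space(input[end_index - 1]))
        --end_index;

    return input.substr(start_index, end_index - start_index);
}
```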
Shannon Booth
e1417b2da1 LibURL: Also remove carriage returns from URL input
The definition of an "ASCII tab or newline" also includes U+000D CR.

This fixes 3 subtests in:

https://wpt.live/url/url-constructor.any.html
(cherry picked from commit 41cf9f6fe3f6a36094a8db3ace0733e1e3666092)
2024-10-15 12:08:50 -04:00
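As a small illustration of e1417b2da1, the "ASCII tab or newline" set covers U+0009 TAB, U+000A LF and U+000D CR, and all three are removed from the input wherever they occur. The helper names below are hypothetical.

```cpp
#include <algorithm>
#include <string>

// U+0009 TAB, U+000A LF, U+000D CR.
bool is_ascii_tab_or_newline(char c) { return c == '\t' || c == '\n' || c == '\r'; }

// Hypothetical helper: drop every ASCII tab or newline from the input.
std::string remove_ascii_tab_or_newline(std::string input)
{
    input.erase(std::remove_if(input.begin(), input.end(), is_ascii_tab_or_newline),
        input.end());
    return input;
}
```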
Shannon Booth
1308cab372 LibURL+LibWeb: Do not percent decode in password/username getters
Doing it is not part of the spec. Whenever needed, the spec will
explicitly percent decode the username and password.

This fixes some URL WPT tests.

(cherry picked from commit f511c0b441a591bc85f409242229c7b295e118e4)
2024-10-15 12:08:50 -04:00
Shannon Booth
2cf67c664d LibURL: Don't consider file:// URL hosts as always opaque
Treating them as always opaque was resulting in file URL hosts not being
correctly percent decoded.

(cherry picked from commit a661daea71be0151e4a155db817ff13be9926582)
2024-10-15 12:08:50 -04:00
Shannon Booth
9b6a1de777 LibWeb: Implement URL.parse
This was an addition to the URL spec, see:

https://github.com/whatwg/url/commit/58acb0
2024-05-13 09:21:12 +02:00
Shannon Booth
453dd0cf44 AK: Properly implement steps for shortening a URL's path
Instead of implementing this inline, put it into a function. Use this
new function to correctly implement shortening paths for some places
where this logic was previously missing.

Before these changes, the pathname for the included test was incorrectly
being set to '/' as we were not considering the windows drive letter.
2023-10-26 11:11:41 +02:00
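The drive-letter case that 453dd0cf44 mentions can be sketched as a standalone function. This is a simplified illustration, assuming the path is a list of segments and returning the shortened copy; the names are hypothetical and do not mirror the AK implementation.

```cpp
#include <string>
#include <string_view>
#include <vector>

// A normalized Windows drive letter is an ASCII letter followed by ':'.
bool is_normalized_windows_drive_letter(std::string_view segment)
{
    if (segment.size() != 2 || segment[1] != ':')
        return false;
    char c = segment[0];
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Hypothetical sketch: drop the last path segment, except when a file URL's
// path consists solely of a drive letter.
std::vector<std::string> shorten_path(std::string_view scheme, std::vector<std::string> path)
{
    if (scheme == "file" && path.size() == 1 && is_normalized_windows_drive_letter(path[0]))
        return path;
    if (!path.empty())
        path.pop_back();
    return path;
}
```

Putting the rule in one function means every caller gets the drive-letter exception, which is the point the commit makes about the previously inlined logic.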
Shannon Booth
bfdf7779ce AK: Correctly set host when parsing URL with a base file:// host
We were completely missing this spec step here. Also leave a FIXME for
the pre-existing implementation of this step, as this doesn't match the
spec.
2023-10-26 11:11:41 +02:00
Shannon Booth
791ad12031 LibWeb/Tests: Support URL tests with an input base
2023-10-26 11:11:41 +02:00
Kemal Zebari
824c54acaf LibWeb/URL: Add strip_trailing_spaces_from_an_opaque_path()
Also remove 2 FIXMEs by including this function.
2023-09-15 11:15:43 -06:00
Shannon Booth
cb4c279e90 AK: Percent encode URL fragments when parsed
This fixes URL fragments containing characters in the fragment encoding
set that were not being correctly percent encoded.
2023-08-31 11:02:18 +02:00
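For cb4c279e90, a byte-level sketch of fragment percent encoding is shown below. It assumes the fragment percent-encode set is the C0 controls, everything above U+007E, and space, `"`, `<`, `>` and the backtick; the function names are hypothetical and the input is treated as UTF-8 bytes.

```cpp
#include <string>
#include <string_view>

// Assumed membership test for the fragment percent-encode set.
bool in_fragment_percent_encode_set(unsigned char byte)
{
    if (byte <= 0x1F || byte > 0x7E)
        return true;
    return byte == ' ' || byte == '"' || byte == '<' || byte == '>' || byte == '`';
}

// Hypothetical helper: percent encode every byte in the set, copy the rest.
std::string percent_encode_fragment(std::string_view fragment)
{
    static constexpr char hex_digits[] = "0123456789ABCDEF";
    std::string output;
    for (unsigned char byte : fragment) {
        if (!in_fragment_percent_encode_set(byte)) {
            output.push_back(static_cast<char>(byte));
            continue;
        }
        output.push_back('%');
        output.push_back(hex_digits[byte >> 4]);
        output.push_back(hex_digits[byte & 0x0F]);
    }
    return output;
}
```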
Shannon Booth
4eab37f391 LibWeb/Tests: Also include URL hash in test results
None of the existing tests contain a URL which has a fragment in them,
but this does verify that the URL parser does not actually find any!
Also, this should let us verify the correctness of URLs which actually
do contain fragments.
2023-08-31 11:02:18 +02:00
Shannon Booth
23e82114b4 AK: Do not consider port of 0 as a null port
This fixes an issue where, if a port number of 0 was given for a
non-special scheme, the port number was being dropped.
2023-08-31 11:02:18 +02:00
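One way to keep "port 0" and "no port" apart, as 23e82114b4 requires, is to model the port as an optional and test for presence rather than truthiness. The serializer below is a hypothetical sketch, not the AK code.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: append ":port" only when a port is present.
std::string serialize_host_and_port(std::string const& host, std::optional<std::uint16_t> port)
{
    // has_value() distinguishes a null port from port 0; a test such as
    // `port.value_or(0) != 0` would wrongly drop an explicit 0.
    if (!port.has_value())
        return host;
    return host + ":" + std::to_string(*port);
}
```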
Shannon Booth
faf9d08371 AK: Fix IPv6 serialization on multiple '0' parts ending in a '0' part
This could happen if a sequence of '0' parts was followed by a longer
sequence of '0' parts at the end of the host. The first sequence was
being used for the compression, and not the second.

For example, [1:1:0:0:1:0:0:0] was being serialized as: [1:1::1:0:0:0]
instead of [1:1:0:0:1::].

Fix this by checking at the end of the loop if we are in the middle of a
sequence of '0' parts that is longer than the current longest.
2023-08-06 10:53:32 +02:00
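The end-of-loop check from faf9d08371 can be sketched as follows. This is an illustration under assumptions: the address is taken to be eight 16-bit pieces, the function names are hypothetical, and a run is only compressed when it is longer than one piece, which is my reading of the URL Standard rather than something stated in the commit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Index of the first piece of the first longest run of zero pieces, or
// nullopt when no run is longer than one piece.
std::optional<std::size_t> find_compress_index(std::array<std::uint16_t, 8> const& pieces)
{
    std::optional<std::size_t> best_start;
    std::size_t best_length = 1; // a run must beat this to qualify
    std::optional<std::size_t> current_start;
    std::size_t current_length = 0;

    for (std::size_t i = 0; i < pieces.size(); ++i) {
        if (pieces[i] == 0) {
            if (!current_start)
                current_start = i;
            ++current_length;
            continue;
        }
        if (current_length > best_length) {
            best_start = current_start;
            best_length = current_length;
        }
        current_start.reset();
        current_length = 0;
    }

    // The run may reach the final piece, so compare once more after the loop.
    if (current_length > best_length)
        best_start = current_start;
    return best_start;
}

std::string serialize_ipv6(std::array<std::uint16_t, 8> const& pieces)
{
    auto compress = find_compress_index(pieces);
    std::ostringstream out;
    out << std::hex;
    bool ignore_zeros = false;
    for (std::size_t i = 0; i < pieces.size(); ++i) {
        if (ignore_zeros && pieces[i] == 0)
            continue;
        ignore_zeros = false;
        if (compress && *compress == i) {
            out << (i == 0 ? "::" : ":");
            ignore_zeros = true;
            continue;
        }
        out << pieces[i];
        if (i != 7)
            out << ':';
    }
    return "[" + out.str() + "]";
}
```

Because the comparison is strict, an earlier run wins a tie, while a longer trailing run replaces it, matching the `[1:1:0:0:1::]` example in the commit message.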
Shannon Booth
aa7ca80d7c AK: Fix missing step for serialization of IPv6 hosts
This was resulting in the incorrect host serialization of:

http://[0:1:0:1:0:1:0:1] to [::1:0:1:0:1:0:1]

and:

http://[1:0:1:0:1:0:1:0] to [1::1:0:1:0:1:0]
2023-07-31 14:48:24 +02:00
Shannon Booth
4fdd4dd979 AK: Add missing default port definitions for FTP scheme URLs
This is defined in the spec, but was missing in our table. Fix this, and
add a spec comment for what is missing. Also begin a basic text based
test for URL, so we can get some coverage of LibWeb's usage of URL too.
2023-07-31 14:48:24 +02:00
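The default port table that 4fdd4dd979 refers to can be sketched as a small lookup. The function name is hypothetical; the values are the special-scheme defaults as I understand them from the URL Standard, with `file` and non-special schemes having no default port.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical lookup of a scheme's default port.
std::optional<std::uint16_t> default_port_for_scheme(std::string_view scheme)
{
    if (scheme == "ftp")
        return std::uint16_t{21};
    if (scheme == "http" || scheme == "ws")
        return std::uint16_t{80};
    if (scheme == "https" || scheme == "wss")
        return std::uint16_t{443};
    // "file" and every non-special scheme have no default port.
    return std::nullopt;
}
```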