This page renders the bookmarks as a tree and hooks context menu events
up to the UI's bookmarks bar context menus to allow editing bookmarks.
Users can also drag-and-drop bookmark items around.
Let's not have to know off-hand that we need to update Environments.cpp
when adding a new WebUI. It's more obvious just below where we define
the URLs.
Add `ECMAScriptRegex`, LibRegex's C++ facade for ECMAScript regexes.
The facade owns compilation, execution, captures, named groups, and
error translation for the Rust backend, which lets callers stop
depending on the legacy parser and matcher types directly. Use it in the
remaining non-LibJS callers: URLPattern, HTML input pattern handling,
and the places in LibHTTP that only needed token validation.
Where a full regex engine was unnecessary, replace those call sites with
direct character checks. Also update focused LibURL, LibHTTP, and WPT
coverage for the migrated callers and corrected surrogate handling.
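As an illustration of what a direct character check can look like in place of a regex, here is a minimal sketch of token validation. It assumes the token grammar in question is the `token` / `tchar` production from RFC 9110; the function names are hypothetical and are not taken from LibHTTP.

```cpp
#include <string_view>

// Hypothetical helper: true if `c` is a "tchar" per RFC 9110, section 5.6.2.
bool is_http_token_code_point(char c)
{
    if (c >= '0' && c <= '9')
        return true;
    if (c >= 'a' && c <= 'z')
        return true;
    if (c >= 'A' && c <= 'Z')
        return true;
    std::string_view const symbols = "!#$%&'*+-.^_`|~";
    return symbols.find(c) != std::string_view::npos;
}

// Hypothetical helper: a token is one or more tchars, so a single pass over
// the input is enough and no regex engine is involved.
bool is_http_token(std::string_view input)
{
    if (input.empty())
        return false;
    for (char c : input) {
        if (!is_http_token_code_point(c))
            return false;
    }
    return true;
}
```

With this sketch, `is_http_token("Content-Type")` is accepted, while an empty string or a string containing a space or '/' is rejected.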
execResult may contain additional captures from nested groups
in user-provided regexp parts, exceeding the number of
URLPattern groups.
Fixes a crash in the updated WPT test.
See: https://github.com/whatwg/urlpattern/commit/203d435c32
There have been a few times I have wanted to add a debug log in
one of these functions, but currently that causes a massive rebuild.
Let's just move these out of line.
This aligns our behaviour closer to other browsers, which
_mostly_ consider file scheme URLs as opaque. For test
purposes, allow overriding this behaviour with a commandline
flag.
This was previously a dependency due to the use of
Crypto::get_secure_random for the nonce used for an opaque origin.
Now that this has been moved to AK, we no longer have any dependency
on LibCrypto.
Previously the authority state was parsed character-by-character in a
loop, appending each character to a buffer. When an '@' character was
encountered, the entire buffer would be re-processed to extract and
percent-encode the username and password portions.
This created O(n^2) behavior for URLs with multiple '@' characters, as
each '@' would trigger reprocessing of all previously buffered content.
This commit changes the authority state parser to process the authority
section in chunks rather than character-by-character:
1. Find the next delimiter ('@', '/', '?', '#', or '\' for special URLs)
2. Process the entire chunk up to that delimiter at once
3. Directly extract username/password from the chunk without buffering
Combined with a switch to iterator_at_byte_offset_without_validation
(which does not contain any loops), this reduces the time complexity to
O(n) and lets the included test, found by fuzzing, actually complete
parsing :^)
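The chunked approach can be sketched roughly as follows. This is a simplified illustration, not the LibURL parser: `AuthoritySplit` and `split_authority` are hypothetical names, and percent-encoding and the username/password split at ':' are left out. It only shows how jumping from delimiter to delimiter visits each byte once, with every '@' except the last one ending up inside the credentials as "%40".

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch.
struct AuthoritySplit {
    std::string credentials;        // text before the last '@'; earlier '@' become "%40"
    std::string_view host_and_port; // text after the last '@', up to the end of the authority
    bool has_credentials { false };
};

// `input` starts right after "scheme://". `is_special` adds '\' as a delimiter.
AuthoritySplit split_authority(std::string_view input, bool is_special)
{
    std::string_view const delimiters = is_special ? "@/?#\\" : "@/?#";
    AuthoritySplit result;
    std::size_t chunk_start = 0;

    while (true) {
        // 1. Find the next delimiter.
        std::size_t delimiter = input.find_first_of(delimiters, chunk_start);
        std::size_t length = delimiter == std::string_view::npos
            ? std::string_view::npos
            : delimiter - chunk_start;
        // 2. Take the whole chunk up to that delimiter at once.
        std::string_view chunk = input.substr(chunk_start, length);

        if (delimiter == std::string_view::npos || input[delimiter] != '@') {
            // The authority ends here, so this chunk is the host (and port).
            result.host_and_port = chunk;
            return result;
        }

        // 3. The chunk is followed by '@', so it belongs to the credentials.
        //    It is appended exactly once and never re-scanned.
        if (result.has_credentials)
            result.credentials += "%40";
        result.credentials.append(chunk.data(), chunk.size());
        result.has_credentials = true;
        chunk_start = delimiter + 1;
    }
}
```

For example, `split_authority("user:pa@ss@example.com:8080/path", true)` yields the credentials `user:pa%40ss` and the host and port `example.com:8080`.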
LibURL previously assigned a default port to the IRC schemes,
a carryover from SerenityOS where IRC is supported.
This behavior deviates from the URL Standard and affects URL parsing by
eliding an explicitly specified port when it matches the default (this
is considered a legacy behaviour of the web URL schemes). Remove the IRC
default port to restore spec-compliant behavior.
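To make the elision behaviour concrete, here is a small sketch. The function names are hypothetical and do not mirror LibURL's API; the port table follows the special schemes listed in the URL Standard, and 6667 is only an example port number.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Default ports of the URL Standard's special schemes.
// "file" and every non-special scheme (including "irc") have none.
std::optional<uint16_t> default_port_for_scheme(std::string_view scheme)
{
    if (scheme == "ftp")
        return 21;
    if (scheme == "http" || scheme == "ws")
        return 80;
    if (scheme == "https" || scheme == "wss")
        return 443;
    return std::nullopt;
}

// An explicitly specified port is dropped only when it equals the scheme's default.
std::optional<uint16_t> port_after_elision(std::string_view scheme, uint16_t explicit_port)
{
    if (default_port_for_scheme(scheme) == explicit_port)
        return std::nullopt;
    return explicit_port;
}
```

Under this sketch, `port_after_elision("http", 80)` is empty, while `port_after_elision("irc", 6667)` keeps the port because "irc" has no default.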
AK/Random is already the same as SecureRandom. See PR for more details.
ProcessPrng is used on Windows for compatibility with sandboxing measures.
See e.g. https://crbug.com/40277768
The same-origin domain check always returned true if only the scheme
matched. This was because of missing steps to check for the origin's
domain, which didn't exist. Add this concept to URL::Origin, even
though we do not use it at this stage in document.domain setter.
Co-Authored-By: Luke Wilde <luke@ladybird.org>
Instead of '*', which is a nonsensical value. Similar to what we do
for determining the public suffix, if no match could be made via the
PSL algorithm, then take everything after the second dot as
the registrable domain.
This prevents us from considering e.g. b.b.example and
a.example as the same site.
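A rough sketch of such a fallback is shown below. It assumes "the second dot" is counted from the end of the host, so that the last two labels are kept; that reading and the function name are assumptions made for this illustration, not a copy of the LibURL code.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical fallback used when the PSL algorithm finds no match:
// keep the last two labels of the host.
std::string_view fallback_registrable_domain(std::string_view host)
{
    std::size_t last_dot = host.rfind('.');
    if (last_dot == std::string_view::npos || last_dot == 0)
        return host;
    std::size_t second_last_dot = host.rfind('.', last_dot - 1);
    if (second_last_dot == std::string_view::npos)
        return host;
    return host.substr(second_last_dot + 1);
}
```

With this reading, b.b.example maps to b.example and a.example maps to itself, so the two hosts no longer share the single value '*'.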
This required some changes in LibURL & LibIPC since it has its own
definition of a BlobURLEntry. For now, we don't have a concrete usage
of MediaSource in LibURL so it is defined as an empty struct.
This removes one FIXME in an idl file.
Our floating point number parser was based on the fast_float library:
https://github.com/fastfloat/fast_float
However, our implementation only supports 8-bit characters. To support
UTF-16, we will need to be able to convert char16_t-based strings to
numbers as well. This works out-of-the-box with fast_float.
We can also use fast_float for integer parsing.
For the AO defined in the URL specification, in the case the
domain does not match against the PSL, we should be returning
the TLD. This fixes a crash for a bunch of WPT tests using the
Document.domain setter when the test is being served by WPT
locally.
We should be doing similar logic in registrable_domain, but that
unfortunately runs into some other issues, so just leave a FIXME
for now.
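The TLD fallback for the public suffix can be sketched as follows. The function name is hypothetical, and the PSL lookup itself is assumed to have already failed before this is called.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical fallback: when the host matches nothing in the PSL,
// return its last label (the TLD).
std::string_view fallback_public_suffix(std::string_view host)
{
    std::size_t last_dot = host.rfind('.');
    if (last_dot == std::string_view::npos)
        return host;
    return host.substr(last_dot + 1);
}
```

For a locally served WPT host such as web-platform.test, this returns "test".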
It is confusing to have both URL::Host::public_suffix and
URL::get_public_suffix, both with slightly different semantics.
Instead, use PublicSuffixData for cases that just want a direct
match against the list, and URL::Host::public_suffix in LibWeb
land as the URL spec defined AO.
I believe this is in the specification because the spec technically
requires passing through a valid Unicode string. However, our
implementation already handles an invalid Unicode string, and will
do the replacement character substitution.
Opaque origins are meant to be unique in terms of equality from
one another. Since this uniqueness needs to be across processes,
use a nonce to implement the uniqueness check.
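A minimal sketch of nonce-based equality is shown below. `OpaqueOrigin` is a hypothetical stand-in rather than the real URL::Origin, the 128-bit nonce size is an assumption, and std::random_device is used only as a placeholder for a secure random source.

```cpp
#include <array>
#include <cstdint>
#include <random>
#include <string_view>

class OpaqueOrigin {
public:
    // Every call produces an origin with a fresh random nonce.
    static OpaqueOrigin create()
    {
        // Placeholder entropy source; a real implementation would use a
        // cryptographically secure generator.
        static std::random_device entropy;
        OpaqueOrigin origin;
        for (auto& word : origin.m_nonce)
            word = entropy();
        return origin;
    }

    // Two opaque origins are equal only if they carry the same nonce, which
    // still holds after the nonce has been copied to another process.
    bool operator==(OpaqueOrigin const& other) const { return m_nonce == other.m_nonce; }
    bool operator!=(OpaqueOrigin const& other) const { return !(*this == other); }

    // Opaque origins always serialize to "null", regardless of the nonce.
    std::string_view serialize() const { return "null"; }

private:
    OpaqueOrigin() = default;

    std::array<uint32_t, 4> m_nonce {};
};
```

A copy of an origin compares equal to the original, while two separately created origins compare unequal even though both serialize to "null".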
Instead, port all users over to the newly created
Origin::create_opaque factory function. This also requires porting
over some users of Origin to avoid default construction.
The spec seems to indicate in its wording that while opaque
origins only serialize to 'null', they can still be tested
for equality with one another. Probably we will need to
generate some unique ID which is unique across processes.
No code now relies on using URL's valid state.
A URL can still be _technically_ invalid through use of the URL
constructor or by directly changing URL fields.
However, all URLs should be constructed through the URL parser,
and we should ideally be getting rid of the default constructor
at some stage.
Also, any code which manually sets URL fields needs to be
aware that this is full of pitfalls, since there are many different
forms of canonicalization which are bypassed by not going through
the URL parser.
Creating a URL should almost always go through the URLParser to
handle all of the small edge cases involved. This reduces the
need for URL valid state.
There's a bit of a UTF-8 assumption with this change. But nearly every
caller of these methods was immediately creating a String from the
resulting ByteString anyway.
We were not properly handling the case where the prefix code point was
the empty string (which we represent as an OptionalNone). While this
still resulted in the correct pattern string being generated, an
incorrect regular expression was generated, causing matching
to fail.
This has no functional difference as far as I can tell, but for
clarity explicitly do not attempt to do this, which has the nice
side effect of not checking for whitespace known to not exist.
It turns out that the problem here was simply that we were trimming
trailing whitespace when we did not need to, which meant that
the port number of '80 ' was being converted to the empty string
per URLPattern elision, as the port matches the http scheme.