110 Commits

Author SHA1 Message Date
Timothy Flynn
06796f5f7f LibURL+LibWeb+LibWebView: Convert about:version to a proper WebUI
Passing the browser command line and executable path to every WebContent
process just in case we load about:version always felt a bit weird. We
now use the WebUI framework to load this information on demand.
2026-04-21 06:59:11 -04:00
Timothy Flynn
b544e42809 LibWebView+UI: Add an about:bookmarks page to manage bookmarks
This page renders the bookmarks as a tree and hooks context menu events
up to the UI's bookmarks bar context menus to allow editing bookmarks.
Users can also drag-and-drop bookmark items around.
2026-04-09 10:08:06 -04:00
Timothy Flynn
2f3199adbf LibURL+LibWeb: Add a helper to check if a URL is a WebUI URL
Let's not have to know off-hand that we need to update Environments.cpp
when adding a new WebUI. It's more obvious just below where we define
the URLs.
2026-04-09 10:08:06 -04:00
Andreas Kling
34d954e2d7 LibRegex: Add ECMAScriptRegex and migrate callers
Add `ECMAScriptRegex`, LibRegex's C++ facade for ECMAScript regexes.

The facade owns compilation, execution, captures, named groups, and
error translation for the Rust backend, which lets callers stop
depending on the legacy parser and matcher types directly. Use it in the
remaining non-LibJS callers: URLPattern, HTML input pattern handling,
and the places in LibHTTP that only needed token validation.

Where a full regex engine was unnecessary, replace those call sites with
direct character checks. Also update focused LibURL, LibHTTP, and WPT
coverage for the migrated callers and corrected surrogate handling.
2026-03-27 17:32:19 +01:00
Shannon Booth
f4f6aefe32 LibURL/Pattern: Ignore extra RegExp captures in match result
execResult may contain additional captures from nested groups
in user-provided regexp parts, exceeding the number of
URLPattern groups.

Fixes a crash in the updated WPT test.

See: https://github.com/whatwg/urlpattern/commit/203d435c32
2026-03-20 11:29:15 +01:00
Shannon Booth
8124a5028c LibURL: Move some Origin helpers out of line
There have been a few times I have wanted to add a debug log in
one of these functions, but currently that causes a massive rebuild.
Let's just move these out of line.
2026-03-01 01:04:10 +01:00
Shannon Booth
b6e5b960fd LibURL: Add a Origin::is_opaque_file_origin helper
We will likely have some of these dotted around the place AD-HOC,
so let's make this a bit more readable when used.
2026-03-01 01:04:10 +01:00
R-Goc
ae5f28fb40 LibTextEncoder/LibURL: Cleanup includes
Cleans up LibURL/Parser.h to use the forwarding header from
LibTextEncoder.
2026-02-26 18:31:57 +01:00
Tim Ledbetter
5893bcf269 LibURL: Strip trailing dot before running the PSL algorithm
2026-02-22 12:07:43 +01:00
Shannon Booth
1be69479a6 LibURL+Elsewhere: Consider file:// origins opaque by default
This aligns our behaviour closer to other browsers, which
_mostly_ consider file scheme URLs as opaque. For test
purposes, allow overriding this behaviour with a commandline
flag.
2026-02-21 23:00:57 +01:00
Shannon Booth
64532bcfa0 LibURL: Add ability to store whether an origin is a file scheme origin
2026-02-21 23:00:57 +01:00
Timothy Flynn
ea32502947 Everywhere: Run clang-format
The following command was used to clang-format these files:

    clang-format-21 -i $(find . \
        -not \( -path "./\.*" -prune \) \
        -not \( -path "./Build/*" -prune \) \
        -not \( -path "./Toolchain/*" -prune \) \
        -type f -name "*.cpp" -o -name "*.mm" -o -name "*.h")
2026-02-18 08:02:45 -05:00
Shannon Booth
1c3d503146 LibURL: Remove LibCrypto as a dependency
This was previously a dependency due to the use of
Crypto::get_secure_random for the nonce used for an opaque origin.
Now that this has been moved to AK, we no longer have any dependency
on LibCrypto.
2026-01-26 18:46:59 +01:00
Shannon Booth
c8dc5ea27c LibURL: Optimize parsing of URLs in authority state
Previously the authority state was parsed character-by-character in a
loop, appending each character to a buffer. When an '@' character was
encountered, the entire buffer would be re-processed to extract and
percent-encode the username and password portions.

This created O(n^2) behavior for URLs with multiple '@' characters, as
each '@' would trigger reprocessing of all previously buffered content.

This commit changes the authority state parser to process the authority
section in chunks rather than character-by-character:

1. Find the next delimiter ('@', '/', '?', '#', or '\' for special URLs)
2. Process the entire chunk up to that delimiter at once
3. Directly extract username/password from the chunk without buffering

Together with a switch to iterator_at_byte_offset_without_validation
(which does not have any loops), this reduces the time complexity to
O(n) and allows the included test, found by fuzzing, to actually
complete parsing :^)
2026-01-26 18:46:59 +01:00
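The chunked scan described in c8dc5ea27c can be sketched roughly as follows. This is a minimal illustration and not LibURL's code: `AuthoritySplit` and `split_authority` are hypothetical names, it works on `std::string_view` rather than AK types, and percent-encoding of the credentials is left out. An extra '@' is carried over as "%40", as the URL Standard's authority state does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch; not LibURL's actual structure.
struct AuthoritySplit {
    std::string username;
    std::string password;
    std::size_t host_start { 0 }; // byte offset where the host begins
};

// Walks the authority one chunk at a time instead of one character at a
// time. Percent-encoding of the credentials is omitted for brevity.
AuthoritySplit split_authority(std::string_view input, bool is_special)
{
    std::string_view const delimiters = is_special ? "@/?#\\" : "@/?#";
    AuthoritySplit result;
    bool seen_at = false;
    bool seen_colon = false;
    std::size_t pos = 0;

    while (true) {
        // 1. Find the next delimiter.
        std::size_t const end = input.find_first_of(delimiters, pos);
        if (end == std::string_view::npos || input[end] != '@')
            break; // anything other than '@' ends the credentials

        // An earlier '@' turned out to be part of the credentials.
        if (seen_at)
            (seen_colon ? result.password : result.username) += "%40";
        seen_at = true;

        // 2. and 3. Consume the whole chunk; each byte is visited once.
        for (char c : input.substr(pos, end - pos)) {
            if (c == ':' && !seen_colon) {
                seen_colon = true;
                continue;
            }
            (seen_colon ? result.password : result.username) += c;
        }
        pos = end + 1;
        assert(pos <= input.size());
    }

    result.host_start = pos;
    return result;
}
```

Because `pos` only moves forward, no previously scanned bytes are reprocessed when another '@' shows up, which is where the quadratic behaviour came from.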
Shannon Booth
add381fe60 LibURL: Remove redundant radix validation in parse_ipv4_number
AK::parse_number already validates that all characters are valid for
the specified radix.
2026-01-26 18:46:59 +01:00
Shannon Booth
fd01e80286 LibURL: Remove a stray newline
2026-01-26 18:46:59 +01:00
Shannon Booth
b6969fb82d LibURL: Remove default port handling for the IRC schemes
LibURL previously assigned a default port to the IRC schemes,
a carryover from SerenityOS where IRC is supported.

This behavior deviates from the URL Standard and affects URL parsing by
eliding an explicitly specified port when it matches the default (this
is considered a legacy behaviour of the web URL schemes). Remove the IRC
default port to restore spec-compliant behavior.
2026-01-26 18:46:59 +01:00
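The effect of b6969fb82d can be illustrated with a small sketch of default-port elision. The function names are hypothetical and the code uses the standard library rather than AK. The port table lists the special schemes from the URL Standard, and 6667 in the usage example is simply the conventional IRC port, since the commit message does not state the removed value.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Default ports of the special schemes listed in the URL Standard.
std::optional<std::uint16_t> default_port_for_scheme(std::string_view scheme)
{
    if (scheme == "ftp")
        return std::uint16_t { 21 };
    if (scheme == "http" || scheme == "ws")
        return std::uint16_t { 80 };
    if (scheme == "https" || scheme == "wss")
        return std::uint16_t { 443 };
    return std::nullopt; // "file" and non-special schemes such as "irc"
}

// Hypothetical helper: the port that ends up stored on the parsed URL.
std::optional<std::uint16_t> port_to_store(std::string_view scheme, std::uint16_t parsed_port)
{
    if (default_port_for_scheme(scheme) == parsed_port)
        return std::nullopt; // elided because it matches the scheme's default
    return parsed_port;
}

// Usage: only schemes with a default port ever lose an explicit port.
void usage_example()
{
    assert(!port_to_store("http", 80).has_value());
    assert(port_to_store("irc", 6667) == std::uint16_t { 6667 });
}
```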
Shannon Booth
5af269913e LibURL: Implement serialization closer to spec steps
No functional change, just reads a little nicer.
2026-01-26 18:46:59 +01:00
Colleirose
bf7fd80140 LibCrypto+AK: Merge LibCrypto/SecureRandom into AK/Random
AK/Random is already the same as SecureRandom. See PR for more details.

ProcessPrng is used on Windows for compatibility w/ sandboxing measures
See e.g. https://crbug.com/40277768
2026-01-23 15:53:27 +01:00
Arran Ireland
bd82dfa048 AK+LibURL: Use AK::IPv4/6 in Host
This resolves two FIXME comments.
2025-12-31 10:24:56 +01:00
Shannon Booth
89dbdd3411 LibURL: Add domain concept to URL::Origin to fix same-origin-domain
The same-origin domain check always returned true if only the scheme
matched. This was because of missing steps to check for the origin's
domain, which didn't exist. Add this concept to URL::Origin, even
though we do not use it at this stage in document.domain setter.

Co-Authored-By: Luke Wilde <luke@ladybird.org>
2025-12-30 13:02:10 +01:00
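As a rough illustration of the check fixed in 89dbdd3411, the same origin-domain comparison for tuple origins in the HTML Standard can be sketched like this. `TupleOrigin` is a simplified, hypothetical stand-in for URL::Origin, and opaque origins are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Simplified tuple origin; opaque origins are not modelled in this sketch.
struct TupleOrigin {
    std::string scheme;
    std::string host;
    std::optional<std::uint16_t> port;
    std::optional<std::string> domain; // null unless document.domain was set
};

bool is_same_origin(TupleOrigin const& a, TupleOrigin const& b)
{
    return a.scheme == b.scheme && a.host == b.host && a.port == b.port;
}

bool is_same_origin_domain(TupleOrigin const& a, TupleOrigin const& b)
{
    // Identical schemes, and domains that are identical and non-null.
    if (a.scheme == b.scheme && a.domain.has_value() && a.domain == b.domain)
        return true;
    // Otherwise both domains must be null and the origins must be same origin.
    return is_same_origin(a, b) && !a.domain.has_value() && !b.domain.has_value();
}

// Usage: a scheme match on its own is no longer enough.
void usage_example()
{
    TupleOrigin const plain { "https", "example.org", std::nullopt, std::nullopt };
    TupleOrigin const with_domain { "https", "example.org", std::nullopt, "example.org" };
    assert(is_same_origin_domain(plain, plain));
    assert(!is_same_origin_domain(plain, with_domain));
}
```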
Shannon Booth
2c11e03582 LibURL: Return a proper registrable domain if it's missing in the PSL
Instead of '*', which is a nonsensical value. Similar to what we do
for determining the public suffix, if no match could be made via the
PSL algorithm, then take everything after the second dot as
the registrable domain.

This prevents us from considering e.g. b.b.example and
a.example as the same site.
2025-12-30 12:40:27 +01:00
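One possible reading of the fallback in 2c11e03582 is sketched below. It assumes that "after the second dot" counts from the right, so the registrable domain is the last two labels of the host. That interpretation, the function name, and the use of `std::string_view` are assumptions made for illustration and are not taken from LibURL.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical fallback for hosts with no PSL match: keep the last two labels.
std::optional<std::string_view> fallback_registrable_domain(std::string_view host)
{
    auto const last_dot = host.rfind('.');
    if (last_dot == std::string_view::npos || last_dot == 0)
        return std::nullopt; // a single label has no registrable domain

    auto const previous_dot = host.rfind('.', last_dot - 1);
    if (previous_dot == std::string_view::npos)
        return host; // already exactly two labels
    return host.substr(previous_dot + 1);
}

// Usage: the two hosts from the commit message no longer compare as same site.
void usage_example()
{
    assert(fallback_registrable_domain("b.b.example") == std::string_view { "b.example" });
    assert(fallback_registrable_domain("a.example") == std::string_view { "a.example" });
}
```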
Timothy Flynn
c2365653a5 LibURL: Add a few missing internal page factories
2025-09-18 07:27:24 -04:00
Tete17
658477620a LibWeb/LibURL/LibIPC: Extend createObjectURL to also accept MediaSources
This required some changes in LibURL & LibIPC since it has its own
definition of a BlobURLEntry. For now, we don't have a concrete usage
of MediaSource in LibURL so it is defined as an empty struct.

This removes one FIXME in an idl file.
2025-08-19 23:50:38 +02:00
Idan Horowitz
1f1adb6d7e LibWeb+LibURL: Default empty string paths to URL's path in CookieStore
2025-08-17 22:17:36 +02:00
Shannon Booth
e6235210ff LibURL: Convert to scalar string before URL parsing
URL parsing is expected to take place on well formed unicode
strings.
2025-07-07 06:50:57 -04:00
ayeteadoe
25f5936dee CMake: Rename serenity_* helper functions/macros to ladybird_*
2025-07-03 23:19:41 +02:00
Timothy Flynn
62d9a84b8d AK+Everywhere: Replace custom number parsers with fast_float
Our floating point number parser was based on the fast_float library:
https://github.com/fastfloat/fast_float

However, our implementation only supports 8-bit characters. To support
UTF-16, we will need to be able to convert char16_t-based strings to
numbers as well. This works out-of-the-box with fast_float.

We can also use fast_float for integer parsing.
2025-07-03 09:51:56 -04:00
Shannon Booth
bd67a5afaa LibURL: Differentiate cross site opaque origins
Previously if we had two opaque origins both URLs were
being treated as same site.
2025-06-30 08:06:37 +01:00
Shannon Booth
b49b1b35e4 LibURL: Correct logic for domains not matched by PSL in public_suffix
For the AO defined in the URL specification, in the case the
domain does not match against the PSL, we should be returning
the TLD. This fixes a crash for a bunch of WPT tests using the
Document.domain setter when the test is being served by WPT
locally.

We should be doing similar logic in registrable_domain, but that
unfortunately runs into some other issues, so just leave a FIXME
for now.
2025-06-29 12:47:57 +01:00
Shannon Booth
a2b523eeb8 LibURL: Replace use of URL::get_public_suffix
It is confusing to have both URL::Host::public_suffix and
URL::get_public_suffix, both with slightly different semantics.

Instead, use PublicSuffixData for cases that just want a direct
match against the list, and URL::Host::public_suffix in LibWeb
land as the URL spec defined AO.
2025-06-29 12:47:57 +01:00
Shannon Booth
e6ecafea84 LibURL: Remove ErrorOr from get_public_suffix
The caller only expects ASCII and let's ignore any OOM.
2025-06-29 12:47:57 +01:00
Shannon Booth
c3618b891f Meta+LibURL: Always enable public suffix data
We should not encourage no public suffix data as a supported
configuration.
2025-06-29 12:47:57 +01:00
Shannon Booth
68b57daf84 LibURL: Remove unneeded FIXME for UTF-8 decode in URL parsing
I believe this is in the specification since the spec technically
requires passing through a valid unicode string. However, our
implementation already handles a non valid unicode string, and will
do the replacement character substitution.
2025-06-27 18:45:48 +12:00
Shannon Booth
1f4bbc2bfb LibURL: Publicly expose ability to parse a host
This is used by the HTML specification.
2025-06-27 18:45:48 +12:00
Shannon Booth
38765fd617 LibURL: Use a nonce to distinguish opaque origins
Opaque origins are meant to be unique in terms of equality from
one another. Since this uniqueness needs to be across processes,
use a nonce to implement the uniqueness check.
2025-06-25 16:47:09 +01:00
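The idea in 38765fd617 can be sketched as follows. This is a minimal illustration under stated assumptions: `OpaqueOrigin`, `create_opaque_origin`, and the 16-byte nonce size are hypothetical, and `std::random_device` merely stands in for a cryptographically secure random source.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>
#include <string>

using Nonce = std::array<std::uint8_t, 16>; // the size is an assumption

// Hypothetical opaque origin that is identified only by its nonce.
struct OpaqueOrigin {
    Nonce nonce {};

    bool operator==(OpaqueOrigin const& other) const { return nonce == other.nonce; }
};

OpaqueOrigin create_opaque_origin()
{
    // std::random_device stands in for a secure random number source.
    std::random_device device;
    OpaqueOrigin origin;
    for (auto& byte : origin.nonce)
        byte = static_cast<std::uint8_t>(device());
    return origin;
}

// Every opaque origin serializes to "null", so the string cannot tell them apart.
std::string serialize(OpaqueOrigin const&)
{
    return "null";
}
```

Since the nonce is plain data, a copy sent to another process still compares equal to the original, while two independently created opaque origins do not.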
Shannon Booth
e0d7278820 LibURL+LibWeb: Make URL::Origin default constructor private
Instead, porting over all users to use the newly created
Origin::create_opaque factory function. This also requires porting
over some users of Origin to avoid default construction.
2025-06-17 20:54:03 +02:00
Shannon Booth
40d21e343f LibURL: Add FIXMEs for testing equality of opaque origins
The spec seems to indicate in its wording that while opaque
origins only serialize to 'null', they can still be tested
for equality with one another. Probably we will need to
generate some unique ID which is unique across processes.
2025-05-24 09:51:44 -04:00
Timothy Flynn
7280ed6312 Meta: Enforce newlines around namespaces
This has come up several times during code review, so let's just enforce
it using a new clang-format 20 option.
2025-05-14 02:01:59 -06:00
Shannon Booth
8e37cd2f71 LibURL: Remove URL's valid state
No code now relies on using URL's valid state.

A URL can still be _technically_ invalid through use of the URL
constructor or by directly changing URL fields.

However, all URLs should be constructed through the URL parser,
and we should ideally be getting rid of the default constructor
at some stage.

Also, any code which is manually setting URL fields needs to be
aware that this is full of pitfalls, since there are many different
forms of canonicalization which are bypassed by not going through
the URL parser.
2025-04-19 07:18:43 -04:00
Shannon Booth
00bbb2105b LibURL: Port create_with_file_scheme to Optional
Removing one of the main remaining users of URL valid state.
2025-04-19 07:18:43 -04:00
Shannon Booth
2072eee83d LibURL: Implement create_with_file_scheme using URL Parser
Creating a URL should almost always go through the URLParser to
handle all of the small edge cases involved. This reduces the
need for URL valid state.
2025-04-19 07:18:43 -04:00
Ali Mohammad Pur
76f5dce3db LibRegex: Flatten capture group list in MatchState
This makes copying the capture group COWVector significantly cheaper,
as we no longer have to run any constructors for it - just memcpy.
2025-04-18 17:09:27 +02:00
stasoid
32ddeb82d6 LibURL+LibWeb: Remove leading slash when converting url to path
...on Windows
2025-04-10 19:04:21 -06:00
Timothy Flynn
f070264800 Everywhere: Remove sv suffix from format string literals
The sv suffix prevents the compile-time checks that would catch errors
in the format invocation (which would usually lead to a runtime crash).
2025-04-08 20:00:18 -04:00
Timothy Flynn
0a256b0a9a AK+Everywhere: Change StringView case conversions to return String
There's a bit of a UTF-8 assumption with this change. But nearly every
caller of these methods was immediately creating a String from the
resulting ByteString anyways.
2025-04-07 17:44:38 +02:00
Shannon Booth
0a58497ab9 LibURL/Pattern: Fix PatternParser logic for prefix codepoint comparison
We were not properly handling the case that prefix code point was the
empty string (which we represent as an OptionalNone). While this
still resulted in the correct pattern string being generated, an
incorrect regular expression was being generated causing matching
to fail.
2025-04-07 10:29:09 -04:00
Shannon Booth
4b6f0ee24a LibURL: Do not trim whitespace parsing port in URL parser
This has no functional difference as far as I can tell, but for
clarity explicitly do not attempt to do this, which has the nice
side effect of not checking for whitespace known to not exist.
2025-04-07 10:29:09 -04:00
Shannon Booth
565ccc04a9 LibURL/Pattern: Do not trim whitespace interpreting port
It turns out that the problem here was simply that we were trimming
trailing whitespace when we did not need to, which meant that the
port number of '80 ' was being converted to the empty string per
URLPattern elision, as the port matches the http scheme's default.
2025-04-07 10:29:09 -04:00
Timothy Flynn
ee6b2db009 AK+LibURL+LibWeb: Use simdutf to validate ASCII strings
simdutf provides a vectorized ASCII validator, so let's use that instead
of looping over strings manually.
2025-04-06 11:05:58 -04:00