7 Commits

Author SHA1 Message Date
Andreas Kling
e0de4ef33e LibRegex: Reject negated /v classes that contain strings
Negated unicode-set classes are only valid when every member is
single-code-point. We already rejected direct string-valued members
such as `\q{ab}` and `\p{RGI_Emoji_Flag_Sequence}` inside `[^...]`,
but nested class-set operands could still smuggle them through, so
patterns like `[^[[\p{Emoji_Keycap_Sequence}]]]` and the reported
fuzzed literal compiled instead of throwing.

Validate nested class-set expressions after parsing and reject only the
negated `/v` classes whose resulting multi-code-point strings are still
non-empty. Track the exact string members contributed by string
literals, string properties, and nested classes so intersections and
subtractions can eliminate them before the negated-class check runs.

Add constructor and literal coverage for the reduced nested-string
cases, the original regression, and valid negated set operations that
remove every string member.
2026-03-31 15:59:04 +02:00
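
The string-member tracking described in e0de4ef33e can be illustrated with a small model. This is only a sketch under assumptions: `ClassSet`, `literal`, and `is_valid_negated` are hypothetical names, not LibRegex identifiers, and the model follows the commit message's description (strings that an intersection or subtraction removes no longer block negation) rather than the actual parser code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassSet:
    """Hypothetical model of a /v class operand."""
    chars: frozenset = frozenset()    # single-code-point members
    strings: frozenset = frozenset()  # members that are not exactly one code point

    def union(self, other):
        return ClassSet(self.chars | other.chars, self.strings | other.strings)

    def intersection(self, other):
        return ClassSet(self.chars & other.chars, self.strings & other.strings)

    def subtraction(self, other):
        return ClassSet(self.chars - other.chars, self.strings - other.strings)


def literal(*members):
    # Models a string literal such as \q{a|ab}: members of length 1 are
    # plain characters, everything else is tracked as a string member.
    return ClassSet(
        frozenset(m for m in members if len(m) == 1),
        frozenset(m for m in members if len(m) != 1),
    )


def is_valid_negated(operand):
    # A negated class is accepted only if no string member survives.
    return not operand.strings


# A nested class keeps the strings of its operand, so negation is rejected.
keycap = literal("1\ufe0f\u20e3")  # stand-in for one keycap sequence
nested = ClassSet().union(keycap)
print(is_valid_negated(nested))                      # False
# Subtracting the same strings removes them, so negation becomes valid.
print(is_valid_negated(nested.subtraction(keycap)))  # True
```

Tracking the exact string members, rather than a single "may contain strings" flag, is what lets the subtraction case above pass while the plain nested case still fails.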
Andreas Kling
c12647fc37 LibRegex: Clamp braced quantifier bounds to 2^31 - 1
Browsers clamp braced quantifier bounds above 2^31 - 1 before
checking whether {min,max} is in order. The parser still kept values
up to u32::MAX, so patterns like {2147483648,2147483647} were
rejected even though both bounds should collapse to the same limit.

Clamp parsed braced quantifier bounds to 2^31 - 1 as they are read.
This keeps the existing acceptance of huge exact and open-ended
quantifiers and makes the constructor and regex literal paths agree
with other engines on the out-of-order edge cases.

The RegExp runtime and syntax tests now cover accepted huge
quantifiers, clamped order validation, and huge literal forms. The
reported constructor and literal cases also match other engines.
2026-03-31 15:59:04 +02:00
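
The clamping order described in c12647fc37 (clamp each bound first, then check the order) can be sketched as follows. `parse_braced_quantifier` and `MAX_BOUND` are hypothetical names chosen for illustration; the real parser is not shown in this log.

```python
import re

MAX_BOUND = 2**31 - 1  # 2147483647

_BRACED = re.compile(r"\{([0-9]+)(?:(,)([0-9]*))?\}")


def parse_braced_quantifier(text):
    """Return (min, max) for '{n}', '{n,}' or '{n,m}'; max is None if open-ended."""
    match = _BRACED.fullmatch(text)
    if match is None:
        raise ValueError(f"not a braced quantifier: {text!r}")
    # Clamp while reading, before any ordering check.
    lower = min(int(match.group(1)), MAX_BOUND)
    if match.group(2) is None:        # {n}
        upper = lower
    elif match.group(3) == "":        # {n,}
        upper = None
    else:                             # {n,m}
        upper = min(int(match.group(3)), MAX_BOUND)
    if upper is not None and lower > upper:
        raise ValueError("quantifier bounds out of order")
    return lower, upper


# Both bounds collapse to the same limit, so the pair is in order.
print(parse_braced_quantifier("{2147483648,2147483647}"))  # (2147483647, 2147483647)
```

Checking the order before clamping would reject that last input, which is the disagreement with other engines that the commit describes.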
Andreas Kling
50b137f527 LibJS: Reject mixed surrogate forms in RegExp names
Reject surrogate pairs in named group names unless both halves come
from the same raw form. A literal surrogate half was being
normalized into \uXXXX before LibRegex parsed the pattern, which let
mixed literal and escaped forms sneak through.

Validate surrogate handling on the UTF-16 pattern before
normalization, but only treat \k<...> as a named backreference when
the parser would do that too. Legacy regexes without named groups
still use \k as an identity escape, so their literal text must not be
rejected by the pre-scan.

Add runtime and syntax tests for the mixed forms, the valid literal,
fixed-width, and braced escape cases, and the legacy \k literals.
2026-03-31 15:59:04 +02:00
Andreas Kling
3e1145ef07 LibJS: Synchronous await fast path when microtask queue is empty
When an async function is resumed from a microtask and hits another
await with a non-thenable value (primitive or already-settled native
promise), and the microtask queue is empty, we can resolve the await
synchronously without suspending. No other microtask can observe the
difference in execution order, making this optimization safe.

This avoids the overhead of creating a GC::Function for the microtask
job, enqueuing/dequeuing from the microtask queue, and the execution
context push/pop that comes with it.

A new VM host hook, host_promise_job_queue_is_empty, is added so both
the standalone js binary and LibWeb can provide the appropriate check
for their respective job queue implementations.
2026-03-16 19:15:03 -05:00
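
The condition in 3e1145ef07 can be illustrated with a toy microtask queue. Everything here is a hypothetical model: `MicrotaskQueue` and `await_non_thenable` are illustrative names, and the sketch assumes the awaited value has already been classified as non-thenable.

```python
from collections import deque


class MicrotaskQueue:
    """Toy stand-in for a host's promise job queue."""

    def __init__(self):
        self.jobs = deque()

    def is_empty(self):
        return not self.jobs

    def enqueue(self, job):
        self.jobs.append(job)

    def drain(self):
        while self.jobs:
            self.jobs.popleft()()


def await_non_thenable(queue, running_in_microtask, continuation, value):
    """Resume the continuation; return True if it ran synchronously."""
    if running_in_microtask and queue.is_empty():
        continuation(value)  # no other job exists that could observe the order
        return True
    queue.enqueue(lambda: continuation(value))
    return False


def run(with_other_job):
    log = []
    queue = MicrotaskQueue()

    def first_resume():
        log.append("resume 1")
        await_non_thenable(queue, True, lambda v: log.append(f"resume 2 ({v})"), 42)

    queue.enqueue(first_resume)
    if with_other_job:
        queue.enqueue(lambda: log.append("other job"))
    queue.drain()
    return log


print(run(with_other_job=False))  # ['resume 1', 'resume 2 (42)']
print(run(with_other_job=True))   # ['resume 1', 'other job', 'resume 2 (42)']
```

In the second run a job is already pending, so the continuation is enqueued behind it and the usual ordering is preserved; the synchronous path is taken only when nothing else is waiting.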
Andreas Kling
3a2f2f3926 LibJS: Add fast path in async function await for non-thenable values
Per spec, every `await` goes through PromiseResolve (which wraps the
value in a new Promise via NewPromiseCapability) and then
PerformPromiseThen (which creates PromiseReaction and JobCallback
objects). This results in 13-16 GC cell allocations per await.

Add a fast path that detects two common cases:

1. Primitive values: These can never have a "then" property, so we
   can skip all promise wrapping and directly schedule the async
   function's continuation as a microtask.

2. Already-settled native Promises: If the promise has no own
   properties and its prototype is the intrinsic %Promise.prototype%,
   we can extract the result directly and schedule continuation.

For these cases, we bypass promise_resolve(), new_promise_capability(),
create_resolving_functions(), perform_then(), PromiseReaction creation,
and JobCallback creation -- replacing ~13 GC allocations with 1
(the GC::Function for the microtask job).
2026-03-16 12:01:49 -05:00
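
The two fast-path cases listed in 3a2f2f3926 amount to a classification step before any promise machinery runs. The sketch below models that decision only; `ModelPromise`, `PROMISE_PROTOTYPE`, and `await_fast_path` are hypothetical stand-ins, with Python's `None`, `bool`, `int`, `float`, and `str` playing the role of JavaScript primitives.

```python
from dataclasses import dataclass, field

PROMISE_PROTOTYPE = object()  # stand-in for the intrinsic %Promise.prototype%


@dataclass
class ModelPromise:
    state: str = "pending"  # "pending", "fulfilled" or "rejected"
    result: object = None
    prototype: object = PROMISE_PROTOTYPE
    own_properties: dict = field(default_factory=dict)


def await_fast_path(value):
    """Return (state, result) when the fast path applies, else None."""
    if isinstance(value, ModelPromise):
        # Case 2: settled native promise with nothing user-visible changed.
        untouched = value.prototype is PROMISE_PROTOTYPE and not value.own_properties
        if untouched and value.state != "pending":
            return value.state, value.result
        return None
    if isinstance(value, (type(None), bool, int, float, str)):
        # Case 1: primitives cannot carry a "then" property.
        return "fulfilled", value
    return None  # any other object may be thenable: take the full spec path


print(await_fast_path(7))                                # ('fulfilled', 7)
print(await_fast_path(ModelPromise("rejected", "err")))  # ('rejected', 'err')
print(await_fast_path(ModelPromise()))                   # None
```

When the function returns a pair, the caller can schedule the continuation directly with that result; a `None` return means the regular PromiseResolve and PerformPromiseThen sequence is still required.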
Andreas Kling
1f4700a0c8 LibJS: Add tests for regex literal parse-time error reporting
Cover the paths exercised by deferred regex compilation:
- Invalid patterns in literals, eval(), and new Function()
- Duplicate and invalid flags in literals
- Valid literals with various flag combinations
- Multiple literals in the same scope
- Literals inside regular and arrow functions
2026-03-06 13:06:05 +01:00
Jelle Raaijmakers
e3faa9b5ad LibJS: Move tests to /Tests/LibJS
2026-01-22 07:46:48 -05:00