eliott/ladybird

mirror of https://github.com/LadybirdBrowser/ladybird synced 2026-04-28 02:27:19 +02:00

Author	SHA1	Message	Date
Andreas Kling	e0de4ef33e	LibRegex: Reject negated /v classes that contain strings Negated unicode-set classes are only valid when every member is single-code-point. We already rejected direct string-valued members such as `\q{ab}` and `\p{RGI_Emoji_Flag_Sequence}` inside `[^...]`, but nested class-set operands could still smuggle them through, so patterns like `[^[[\p{Emoji_Keycap_Sequence}]]]` and the reported fuzzed literal compiled instead of throwing. Validate nested class-set expressions after parsing and reject only the negated `/v` classes whose resulting multi-code-point strings are still non-empty. Track the exact string members contributed by string literals, string properties, and nested classes so intersections and subtractions can eliminate them before the negated-class check runs. Add constructor and literal coverage for the reduced nested-string cases, the original regression, and valid negated set operations that remove every string member.	2026-03-31 15:59:04 +02:00
Andreas Kling	c12647fc37	LibRegex: Clamp braced quantifier bounds to 2^31 - 1 Browsers clamp braced quantifier bounds above 2^31 - 1 before checking whether {min,max} is in order. The parser still kept values up to u32::MAX, so patterns like {2147483648,2147483647} were rejected even though both bounds should collapse to the same limit. Clamp parsed braced quantifier bounds to 2^31 - 1 as they are read. This keeps the existing acceptance of huge exact and open-ended quantifiers and makes the constructor and regex literal paths agree with other engines on the out-of-order edge cases. The RegExp runtime and syntax tests now cover accepted huge quantifiers, clamped order validation, and huge literal forms. The reported constructor and literal cases also match other engines.	2026-03-31 15:59:04 +02:00
Andreas Kling	50b137f527	LibJS: Reject mixed surrogate forms in RegExp names Reject surrogate pairs in named group names unless both halves come from the same raw form. A literal surrogate half was being normalized into \uXXXX before LibRegex parsed the pattern, which let mixed literal and escaped forms sneak through. Validate surrogate handling on the UTF-16 pattern before normalization, but only treat \k<...> as a named backreference when the parser would do that too. Legacy regexes without named groups still use \k as an identity escape, so their literal text must not be rejected by the pre-scan. Add runtime and syntax tests for the mixed forms, the valid literal, fixed-width, and braced escape cases, and the legacy \k literals.	2026-03-31 15:59:04 +02:00
Andreas Kling	1f4700a0c8	LibJS: Add tests for regex literal parse-time error reporting Cover the paths exercised by deferred regex compilation: - Invalid patterns in literals, eval(), and new Function() - Duplicate and invalid flags in literals - Valid literals with various flag combinations - Multiple literals in the same scope - Literals inside regular and arrow functions	2026-03-06 13:06:05 +01:00
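
The string tracking described in e0de4ef33e can be illustrated with a small model. This is a minimal sketch in Python, not LibRegex's C++ code: `ClassSet`, `from_members`, and `check_negated` are hypothetical names, and the rule that any member which is not exactly one code point counts as a string is a simplification made for the example. It follows the commit message's wording: set operations run first, and a negated class is rejected only if string members remain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassSet:
    """Hypothetical model of a /v class: single code points plus string members."""
    code_points: frozenset = frozenset()
    strings: frozenset = frozenset()

    def union(self, other):
        return ClassSet(self.code_points | other.code_points,
                        self.strings | other.strings)

    def intersection(self, other):
        # A string survives only if both operands contribute it.
        return ClassSet(self.code_points & other.code_points,
                        self.strings & other.strings)

    def subtraction(self, other):
        # Strings named by the right operand are eliminated.
        return ClassSet(self.code_points - other.code_points,
                        self.strings - other.strings)


def from_members(*members):
    """Split members into single code points and string members (sketch rule)."""
    singles = frozenset(m for m in members if len(m) == 1)
    strings = frozenset(m for m in members if len(m) != 1)
    return ClassSet(singles, strings)


def check_negated(class_set):
    """Reject a negated class only if string members are still present."""
    if class_set.strings:
        raise ValueError("negated class may not contain strings")
    return class_set


nested = from_members("a", "ab")                          # models [a\q{ab}]
reduced = nested.subtraction(from_members("ab"))          # models [a\q{ab}]--[\q{ab}]
check_negated(reduced)                                    # accepted: no strings remain
```

Calling `check_negated(nested)` directly raises `ValueError`, which corresponds to the nested-operand case the commit says used to slip through.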
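
The ordering that c12647fc37 describes (clamp each bound first, compare afterwards) can be sketched as follows. This is an illustrative Python model under stated assumptions, not the LibRegex parser: `parse_braced_quantifier` and the `_BRACED` pattern are hypothetical, and only the 2^31 - 1 limit and the clamp-before-check order are taken from the commit message.

```python
import re

MAX_BOUND = 2**31 - 1  # limit named in the commit message

# Hypothetical helper pattern: accepts only {n}, {n,} and {n,m}.
_BRACED = re.compile(r"\{([0-9]+)(?:(,)([0-9]*))?\}")


def parse_braced_quantifier(text):
    """Return (min, max); max is None for the open-ended {n,} form."""
    m = _BRACED.fullmatch(text)
    if m is None:
        raise ValueError(f"not a braced quantifier: {text!r}")
    # Clamp each bound as it is read, before any ordering check.
    lo = min(int(m.group(1)), MAX_BOUND)
    if m.group(2) is None:
        hi = lo                                   # {n}
    elif m.group(3) == "":
        hi = None                                 # {n,}
    else:
        hi = min(int(m.group(3)), MAX_BOUND)      # {n,m}
    if hi is not None and lo > hi:
        raise ValueError("quantifier bounds out of order")
    return lo, hi


# Both bounds collapse to the same limit, so the order check passes.
print(parse_braced_quantifier("{2147483648,2147483647}"))
```

Without the clamp, the same input would compare 2147483648 against 2147483647 and be rejected, which is the behavior the commit reports fixing. A genuinely out-of-order pair such as `{5,2}` still raises in this sketch.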