ladybird

mirror of https://github.com/LadybirdBrowser/ladybird synced 2026-05-05 22:52:22 +02:00

Author	SHA1	Message	Date
Ali Mohammad Pur	022cd1adca	LibRegex: Use the right offset when patching jumps through fork-trees Fixes #4474.	2025-04-27 12:16:15 +02:00
Ali Mohammad Pur	fca1d33fec	LibRegex: Correctly calculate the target for Repeat in table alts Fixes a bunch of websites breaking because we now verify jump offsets by trying to remove 0-offset jumps. This has been broken for a good while, it was just rare to see Repeat inside alternatives that lent themselves well to tree alts.	2025-04-24 01:17:27 -06:00
Ali Mohammad Pur	4b9abdb963	LibRegex: Remove useless jumps (Jump* +0) before running opts This leads to some more significant performance increases on the simple /<script|<style|<link/ regex in speedometer (~2x)	2025-04-23 22:57:49 +02:00
Ali Mohammad Pur	ec0836c9ea	LibRegex: Don't blindly treat multi-target tree jumps as a single jump The tree generation was broken, we just didn't notice it because it was very rarely being picked for more complex bytecodes.	2025-04-23 22:57:49 +02:00
Ali Mohammad Pur	09eb28ee1d	LibRegex: Better estimate the cost of laying out alts as a chain Previously we were counting the total number of nodes in the tree for the chain cost, which greatly underestimated its cost when large bytecode entries were present. This commit switches to estimating it using the total bytecode size, which is a closer value to the true cost than the tree node count. This corresponds to a ~4x perf improvement on /<script|<style|<link/ in speedometer.	2025-04-23 22:57:49 +02:00
Ali Mohammad Pur	eea81738cd	AK+Everywhere: Recognise that surrogates in utf16 aren't all that common For the slight cost of counting code points when converting between encodings and a teeny bit of memory, this commit adds a fast path for all-happy utf-16 substrings and code point operations. This seems to be a significant chunk of time spent in many regex benchmarks.	2025-04-23 07:56:02 -06:00
Ali Mohammad Pur	3b4a184f1a	LibRegex: Avoid hashing the state hashes again We already had a really nice hash that had a single issue, this commit fixes that and makes it the hash for the hash table, so we avoid double-hashing and making a long chain. This is an easy 10% perf gain.	2025-04-18 17:09:27 +02:00
Ali Mohammad Pur	446a453719	LibRegex: Pull out the first compare to avoid unnecessary execution This adds a fast-path to drop view indices we know will not match immediately without going through the regex VM.	2025-04-18 17:09:27 +02:00
Ali Mohammad Pur	76f5dce3db	LibRegex: Flatten capture group list in MatchState This makes copying the capture group COWVector significantly cheaper, as we no longer have to run any constructors for it - just memcpy.	2025-04-18 17:09:27 +02:00
Andreas Kling	ca2f0141f6	LibRegex: Remove unused "simple substring search" optimization This is not relevant for LibJS since it only works when the input is UTF-8, and LibJS always provides UTF-16.	2025-04-16 10:04:50 +02:00
Andreas Kling	96f1f15ad6	LibRegex: Remove unused Utf8View/Utf32View support in RegexStringView	2025-04-16 10:04:50 +02:00
Andreas Kling	87ec5b32b0	LibRegex: Use ReadonlySpan to peek into OpCode_Compare LUTs By the time we're executing bytecode, we know the bytecode will be flattened. This means we can use ReadonlySpan to look into it instead of DisjointChunks::spans(), which allocates.	2025-04-14 17:40:13 +02:00
Andreas Kling	c1c3b01a6c	LibRegex: Allow Vector<Match> to use trivial memcpy Now that Match has no more members that need destruction, we can allow Vector to memcpy them around.	2025-04-14 17:40:13 +02:00
Andreas Kling	5308d77600	LibRegex: Don't use Optional<T> inside regex::Match This prevented Match from being trivially copyable, which we want it to be for fast Vector copying.	2025-04-14 17:40:13 +02:00
Andreas Kling	54edf29f1b	LibRegex: Make Match::capture_group_name an index into the string table This removes another Match member that required destruction. The "API" for accessing the strings is definitely a bit awkward. We'll think of something nicer eventually.	2025-04-14 17:40:13 +02:00
Andreas Kling	9d47cc54f8	LibRegex: Remove unused regex::Match::string and unused constructor This shrinks regex::Match by 8 bytes and removes a member that needs destruction.	2025-04-14 17:40:13 +02:00
Ali Mohammad Pur	69050da929	LibRegex: Merge inverse string table mappings separately	2025-04-06 20:21:16 +02:00
Ali Mohammad Pur	299b9ca572	LibRegex: Check backreference index before looking it up If a backref happens after it's cleared, the slot may be cleared already.	2025-04-06 20:21:16 +02:00
Jess	83e46b3728	LibRegex: Fix crash when parse result exceeds max cache size Before, if the cache was empty, we would try to evict non-existent entries and crash. So the fix is to make sure that we don't saturate the cache with a single parse result.	2025-04-04 16:10:25 +02:00
Ali Mohammad Pur	4136d8d13e	LibRegex: Use an interned string table for capture group names This avoids messing around with unsafe string pointers and removes the only non-FlyString-able user of DeprecatedFlyString.	2025-04-02 11:43:13 +02:00
Andreas Kling	e5db913b0d	Revert "LibRegex: Port remaining DeprecatedFlyString to ByteString" This reverts commit `aab3fbe254`. Greatly regressed JavaScript benchmark performance.	2025-04-01 15:40:38 +02:00
Andreas Kling	7c32d1e8a5	Revert "Everywhere: Remove DeprecatedFlyString + any remaining references to it" This reverts commit `3131e6369f`. Greatly regressed JavaScript benchmark performance.	2025-04-01 15:40:27 +02:00
Kenneth Myhra	3131e6369f	Everywhere: Remove DeprecatedFlyString + any remaining references to it	2025-04-01 12:50:00 +02:00
Kenneth Myhra	aab3fbe254	LibRegex: Port remaining DeprecatedFlyString to ByteString	2025-04-01 12:50:00 +02:00
Andreas Kling	6b6d3b32a4	LibRegex: Remove the StringCopyMatches mode This mode made a lot of incorrect assumptions about string lifetimes, and instead of fixing it, let's just remove it and tweak the few unit tests that used it.	2025-03-24 22:27:17 +00:00
Andreas Kling	46a5710238	LibJS: Use FlyString in PropertyKey instead of DeprecatedFlyString This required dealing with substantial fallout.	2025-03-24 22:27:17 +00:00
mikiubo	c85df78c4c	LibRegex: Remove orphaned save points in nested LookAhead	2025-03-17 16:11:02 +01:00
Tim Ledbetter	b9ac99d2eb	Revert "LibRegex: Remove orphaned save points in nested LookAhead" This reverts commit `f2678bfcb8`.	2025-03-14 19:57:33 +00:00
mikiubo	f2678bfcb8	LibRegex: Remove orphaned save points in nested LookAhead	2025-03-14 09:41:41 +01:00
Ali Mohammad Pur	5355710481	LibRegex: Don't treat single-jump blocks as noop in the optimizer	2025-03-09 14:37:57 +01:00
aplefull	389a63d6bf	LibRegex: Allow duplicate named capture groups in separate alternatives	2025-03-05 14:36:09 +01:00
aplefull	61744322ad	LibRegex: Ensure nullable quantifiers backtrack when input remains Makes patterns like `/(a?b??)*/` correctly match the string	2025-03-02 15:19:04 +01:00
Ali Mohammad Pur	ea3b7efd91	LibRegex: Treat the UnicodeSets flag as Unicode Fixes /.../v not being interpreted as a unicode pattern.	2025-02-28 14:31:45 -05:00
mikiubo	8a6f7b787e	LibRegex: Use depth-first search in regex optimizer Use depth-first search in the optimizer code because breadth-first search triggered a bug. Add a test example to the test lib.	2025-02-25 00:09:20 +01:00
Ali Mohammad Pur	08ebfaff17	LibRegex: Take trailing inversion state into account in block comparison Fixes #3421.	2025-02-01 11:30:02 +01:00
Timothy Flynn	85b424464a	AK+Everywhere: Rename `verify_cast` to `as` Follow-up to `fc20e61e72`.	2025-01-21 11:34:06 -05:00
Ali Mohammad Pur	cce000d57c	LibRegex: Don't repeat the same fork again If some state has already been tried, skip over it as it would never lead to a match regardless. This fixes performance/memory issues in cases like /(a+)+b/.exec("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa") or /(a|a?)+b/... Fixes #2622.	2025-01-17 10:13:51 +01:00
Ali Mohammad Pur	7ceeb85ba7	LibRegex: Avoid use-after-move of trivial object This is not an actual problem as the object is just an enum, but clion was bugging me.	2025-01-17 10:13:51 +01:00
Ali Mohammad Pur	50733c564c	LibRegex: Use the actually correct repeat start offset for Repeat Fixes #2931 and various frequent crashes.	2024-12-23 13:13:52 +01:00
Pavel Shliak	811d5a5c3e	LibRegex: Remove duplicated condition	2024-12-22 12:33:41 +01:00
Pavel Shliak	7dd7f77219	LibRegex: Remove duplicated assignments	2024-12-22 12:33:41 +01:00
Ali Mohammad Pur	eee90f4aa2	LibRegex: Treat checks against nonexistent checkpoints as empty Due to optimiser shenanigans in the tree alternative form, some JumpNonEmpty ops might be moved before their Checkpoint instruction. It is safe to assume the distance between the nonexistent checkpoint and the current op is zero, so just do that.	2024-12-13 10:00:16 +01:00
Ali Mohammad Pur	358378c1c0	LibRegex: Pick the right target for OpCode_Repeat Repeat's 'offset' field is a bit odd in that it is treated as a negative offset, causing a backwards jump when positive; the optimizer didn't correctly model this behaviour, which caused crashes and misopts when dealing with Repeats. This commit fixes that behaviour.	2024-12-13 10:00:16 +01:00
Ali Mohammad Pur	4a8d3e35a3	LibRegex: Add some more debugging info to bytecode block ranges These were getting difficult to differentiate, now they each get a comment on where they came from to aid with future debugging.	2024-12-13 10:00:16 +01:00
Ali Mohammad Pur	f8092455e2	LibRegex: Print OpCode_Repeat's offset as ssize_t	2024-12-13 10:00:16 +01:00
Pavel Shliak	6f81b80114	Everywhere: Include HashMap only where it's actually used	2024-12-09 12:31:16 +01:00
Marc Jessome	efcaf991e6	LibRegex: Ensure nested capture groups have non-conflicting names Take record of the named capture group prior to parsing the group's body. This requires removal of the recorded minimum length of the named capture group directly, and now needs to be looked up via the group minimum lengths table.	2024-11-24 10:26:09 +01:00
Pavel Shliak	cdb54fe504	LibRegex: Clean up #include directives This change aims to improve the speed of incremental builds.	2024-11-21 14:08:33 +01:00
Ali Mohammad Pur	5a4d657a4e	LibRegex: Avoid generating ForkJumps when jumping to the next alt block Fixes #2398.	2024-11-17 20:12:39 +01:00
Ali Mohammad Pur	00bc22c332	LibRegex: Don't immediately ignore TempInverse in optimizer `fe46b2c141` added the reset-temp-inverse flag, but set it up so all tempinverse ops were negated at the start of the next op; this commit makes it so these flags actually persist for one op and not zero. Fixes #2296.	2024-11-17 09:03:29 -05:00

86 Commits