ladybird

mirror of https://github.com/LadybirdBrowser/ladybird synced 2026-04-25 17:25:08 +02:00

Author	SHA1	Message	Date
Andreas Kling	c66cab7e6b	AK: Hide tentative HashTable bucket from iterators across ensure() HashMap<_, GC::Ref<_>>::ensure() crashed under UBSan whenever the initialization callback triggered a GC: lookup_for_writing() stamped the target bucket as used and added it to the ordered list before the callback ran, so the marking visitor walked the map, read the uninitialized slot, and failed the returns_nonnull check in GC::Ref. Split bucket reservation into two phases. lookup_for_writing() now hands back the target in the Free state (not in the ordered list, m_size unchanged); callers placement-new the value and then commit via commit_inserted_bucket(). The Robin Hood displacement loop still stamps the slot internally and un-stamps before returning, so probing is unchanged and the whole operation remains a single hash and a single probe.	2026-04-25 06:21:36 +02:00
Andreas Kling	b23aa38546	AK: Adopt mimalloc v2 as main allocator Use mimalloc for Ladybird-owned allocations without overriding malloc(). Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc, and put the embedded Rust crates on the same allocator via a shared shim in AK/kmalloc.cpp. This also lets us drop kfree_sized(), since it no longer used its size argument. StringData, Utf16StringData, JS object storage, Rust error strings, and the CoreAudio playback helpers can all free their AK-backed storage with plain kfree(). Sanitizer builds still use the system allocator. LeakSanitizer does not reliably trace references stored in mimalloc-managed AK containers, so static caches and other long-lived roots can look leaked. Pass the old size into the Rust realloc shim so aligned fallback reallocations can move posix_memalign-backed blocks safely. Static builds still need a little linker help. macOS app binaries need the Rust allocator entry points forced in from liblagom-ak.a, while static ELF links can pull in identical allocator shim definitions from multiple Rust staticlibs. Keep the Apple -u flags and allow those duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.	2026-04-08 09:57:53 +02:00
Tim Ledbetter	972bcdeebe	AK: Use correct relocation for all HashTable entry types Robin Hood displacement and `delete_bucket()` shift-up used BucketType's implicit move operations, which bitwise-copy the `u8` storage array instead of going through T's move constructor and destructor. This change adds `relocate_bucket()` and `swap_buckets()` helpers that use a fast path for trivially-relocatable types and move-construct + destroy for others.	2026-03-19 14:21:44 +01:00
Jelle Raaijmakers	a18722093d	AK: Use power-of-two growth in HashTable Measurements on my device have shown this to make lookups and inserts faster by 30-40% on average, especially for larger numbers of buckets. The memory cost is significant depending on the exact number of buckets, increasing anywhere between 10% and 100% compared to the old way of growing our capacity. The performance benefits still make this worth it, in my opinion.	2026-02-20 22:47:24 +01:00
Jelle Raaijmakers	2522aad8b7	AK: Reduce HashTable's load factor to 70% Measurements on my device have shown HashTable lookups and inserts to be up to ~20% faster, at the cost of 10-20% more memory usage depending on the exact number of buckets.	2026-02-20 22:47:24 +01:00
Andreas Kling	ca772caee6	AK: Add HashTable::ensure(hash, predicate, init_callback) ...and use it to make HashMap::ensure() do a single hash lookup instead of three. We achieve this by factoring out everything but the bucket construction logic from HashTable::write_value() into a lookup_for_writing() helper so we can use it from more places.	2025-10-05 21:44:06 +02:00
Zaggy1024	744568c912	AK: Destroy non-trivial types by ref in HashTable::clear_with_capacity Buckets being iterated by pointer instead of reference was causing a compilation error when calling clear_with_capacity() on a HashTable containing a non-trivially-destructible type.	2025-09-22 17:28:00 -05:00
Andreas Kling	59a28febc9	AK: Store hash with HashTable entry to avoid expensive equality checks When T in HashTable<T> has a potentially slow equality check, it can be very profitable to check for a matching hash before full equality. This patch adds may_have_slow_equality_check() to AK::Traits and defaults it to true. For trivial types (pointers, integers, etc) we default it to false. This means we skip the hash check when the equality check would be a single-CPU-word compare anyway. This synergizes really well with things like HashMap<String, V> where collisions previously meant we may have to churn through multiple O(n) equality checks.	2025-09-18 22:37:18 +02:00
Idan Horowitz	5097e72174	AK: Implement take_all_matching(predicate) API in HashTable	2025-08-08 13:09:58 -04:00
Timothy Flynn	7280ed6312	Meta: Enforce newlines around namespaces This has come up several times during code review, so let's just enforce it using a new clang-format 20 option.	2025-05-14 02:01:59 -06:00
Andrew Kaster	5e7e6475c6	AK: Annotate [[no_unique_address]] members with NO_UNIQUE_ADDRESS macro	2025-04-15 02:19:06 -06:00
Jelle Raaijmakers	c7773d0312	Meta: Update my email address everywhere	2024-11-01 12:14:53 +01:00
Andreas Kling	cc4b3cbacc	Meta: Update my e-mail address everywhere	2024-10-04 13:19:50 +02:00
Dan Klishch	5ed7cd6e32	Everywhere: Use east const in more places These changes are compatible with clang-format 16 and will be mandatory when we eventually bump clang-format version. So, since there are no real downsides, let's commit them now.	2024-04-19 06:31:19 -04:00
Andreas Kling	6724f840cd	AK: Early return from empty hash table lookups to avoid hashing When calling get() or find() on an empty HashTable or HashMap, we can avoid hashing the sought-after key.	2024-03-16 14:27:59 +01:00
kleines Filmröllchen	9a026fc8d5	AK: Implement SipHash as the default hash algorithm for most use cases SipHash is highly HashDoS-resistant, initialized with a random seed at startup (i.e. non-deterministic) and usable for security-critical use cases with large enough parameters. We just use it because it's reasonably secure with parameters 1-3 while having excellent properties and not being significantly slower than before.	2023-10-01 11:06:36 +03:30
Daniel Bertalan	4d2af7c3d6	AK: Implement reverse iterators for `OrderedHashTable`	2023-09-24 23:36:43 +02:00
Karol Kosek	e575ee4462	AK+Kernel: Unify Traits<T>::equals()'s argument order on different types There was a small mishmash of argument order, as seen on the table: \| Traits<T>::equals(U, T) \| Traits<T>::equals(T, U) ============= \| ======================= \| ======================= uses equals() \| HashMap \| Vector, HashTable defines equals() \| *String[^1] \| ByteBuffer [^1]: String, DeprecatedString, their Fly-type equivalents and KString. This mostly meant that you couldn't use a StringView for finding a value in Vector<String>. I'm changing the order of arguments to make the trait type itself first (`Traits<T>::equals(T, U)`), as I think it's more expected and makes us more consistent with the rest of the functions that put the stored type first (like StringUtils functions and binary_search). I've also renamed the variable name "other" in find functions to "entry" to give more importance to the value. With this change, each of the following lines will now compile successfully: Vector<String>().contains_slow("WHF!"sv); HashTable<String>().contains("WHF!"sv); HashMap<ByteBuffer, int>().contains("WHF!"sv.bytes());	2023-08-23 20:21:09 +02:00
Ben Wiederhake	36ff6187f6	Everywhere: Change spelling of 'behaviour' to 'behavior' "The official project language is American English […]." `5d2e915623/CONTRIBUTING.md (L30)` Here's a short statistic of the occurrences of the word "behavio(u)r": $ git grep -IPioh 'behaviou?r' \| sort \| uniq -c \| sort -n 2 BEHAVIOR 24 Behaviour 32 behaviour 407 Behavior 992 behavior Therefore, it is clear that "behaviour" (56 occurrences) should be regarded as a typo, and "behavior" (1401 occurrences) should be preferred. Note that the occurrences in LibJS are intentionally NOT changed, because they are taken verbatim from the specification. Hence: $ git grep -IPioh 'behaviou?r' \| sort \| uniq -c \| sort -n 2 BEHAVIOR 10 behaviour 24 Behaviour 407 Behavior 1014 behavior	2023-05-07 01:05:09 +02:00
Ben Wiederhake	ee47c0275e	Everywhere: Run spellcheck on all documentation	2023-05-07 01:05:09 +02:00
Aliaksandr Kalenik	4c6564e3c1	AK: Add values() method in HashTable Add HashTable::values() method that returns all values.	2023-04-28 18:11:44 +02:00
Jelle Raaijmakers	954d660094	AK: Clear OrderedHashTable previous/next pointers on removal With Clang, the previous/next pointers in buckets of an `OrderedHashTable` are not cleared when a bucket is being shifted up as a result of a removed bucket. As a result, an unfortunate pointer mixup could lead to an infinite loop in the `HashTable` iterator, which was exposed in `HashMap::keys()`. Co-authored-by: Luke Wilde <lukew@serenityos.org>	2023-03-15 21:43:52 +01:00
Hediadyoin1	fd8c54d720	AK: Add `take_first` to HashTable and rename `pop` to `take_last` This naming scheme matches Vector. This also changes `take_last` to move the value it takes, and delete by known pointer, avoiding a full lookup and potential copies.	2023-02-21 22:13:06 +01:00
Hediadyoin1	93945062a7	AK: Update HashTables head and tail when shifting during deletion Otherwise we end up with invalid pointers to them, breaking iteration.	2023-02-21 22:13:06 +01:00
Jelle Raaijmakers	c08d137fcd	AK: Reimplement `HashTable` with smart linear probing Instead of rehashing on collisions, we use Robin Hood hashing: a simple linear probe where we keep track of the distance between the bucket and its ideal position. On insertion, we allow a new bucket to "steal" the position of "rich" buckets (those near their ideal position) and move them further down. On removal, we shift buckets back up into the freed slot, decrementing their distance while doing so. This behavior automatically optimizes the number of required probes for any value, and removes the need for periodic rehashing (except when expanding the capacity).	2023-02-17 22:29:51 -07:00
Timothy Flynn	4f5353cbb8	AK: Rename double_hash to rehash_for_collision The name is currently quite confusing as it indicates it hashes doubles.	2023-01-21 10:36:14 +01:00
Eli Youngs	a2024cfb69	AK: Support popping an arbitrary element from a HashTable	2022-12-16 10:41:56 -07:00
Moustafa Raafat	b8f1e1bed2	Everywhere: Remove unnecessary AK and Detail namespace scoping	2022-12-09 11:25:30 +00:00
Linus Groh	d26aabff04	Everywhere: Run clang-format	2022-12-03 23:52:23 +00:00
Andreas Kling	ae3ffdd521	AK: Make it possible to not `using` AK classes into the global namespace This patch adds the `USING_AK_GLOBALLY` macro which is enabled by default, but can be overridden by build flags. This is a step towards integrating Jakt and AK types.	2022-11-26 15:51:34 +01:00
Zaggy1024	a1300d3797	AK: Don't crash in HashTable::clear_with_capacity on an empty table When calling clear_with_capacity on an empty HashTable/HashMap, a null deref would occur when trying to memset() m_buckets. Checking that it has capacity before clearing fixes the issue.	2022-11-11 00:44:04 -07:00
Hendiadyoin1	5bf84a5b0e	AK: Zero previous pointer after fixing the insertion list in HashTable	2022-06-23 20:25:12 +03:00
Idan Horowitz	eb02425ef9	AK: Clear the previous and next pointers of deleted HashTable buckets Usually the values of the previous and next pointers of deleted buckets are never used, as they're not part of the main ordered bucket chain, but if an in-place rehashing is done, which results in the bucket being turned into a free bucket, the stale pointers will remain, at which point any item that is inserted into said free-bucket will have either a stale previous pointer if the HashTable was empty on insertion, or a stale next pointer, resulting in undefined behavior. This commit also includes a new HashMap test that reproduces this issue.	2022-06-22 21:53:13 +02:00
Vitaly Dyachkov	a0a4d169f4	AK+LibGUI: Pass predicate to *_matching() methods by const reference	2022-05-08 17:02:00 +02:00
Idan Horowitz	086969277e	Everywhere: Run clang-format	2022-04-01 21:24:45 +01:00
kleines Filmröllchen	09a12247fb	AK: Use bucket states with special bit patterns in HashTable This simplifies some of the bucket state handling code, as there's now an easy way of checking the basic category of bucket state.	2022-03-31 12:06:13 +02:00
kleines Filmröllchen	49d29c8298	AK: Rehash HashTable in-place instead of shrinking As seen on TV, HashTable can get "thrashed", i.e. it has a bunch of deleted buckets that count towards the load factor. This means that hash tables which are large enough for their contents need to be resized. This was fixed in `9d8da16` with a workaround that shrinks the HashTable back down in these cases, as after the resize and re-hash the load factor is very low again. However, that's not a good solution. If you insert and remove repeatedly around a size boundary, you might get frequent resizes, which involve frequent re-allocations. The new solution is an in-place rehashing algorithm that I came up with. (Do complain to me, I'm at fault.) Basically, it iterates the buckets and re-hashes the used buckets while marking the deleted slots empty. The issue arises with collisions in the re-hash. For this reason, there are two kinds of used buckets during the re-hashing: the normal "used" buckets, which are old and are treated as free space, and the "re-hashed" buckets, which are new and treated as used space, i.e. they trigger probing. Therefore, the procedure for relocating a bucket's contents is as follows: - Locate the "real" bucket of the contents with the hash. That bucket is the starting point for the target bucket, and the current (old) bucket is the bucket we want to move. - While we still need to move the bucket: - If we're the target, something strange happened last iteration or we just re-hashed to the same location. We're done. - If the target is empty or deleted, just move the bucket. We're done. - If the target is a re-hashed full bucket, we probe by double-hashing our hash as usual. Henceforth, we move our target for the next iteration. - If the target is an old full bucket, we swap the target and to-move buckets. Therefore, the bucket to move is at the correct location and the former target, which still needs to find a new place, is now in the bucket to move. So we can just continue with the loop; the target is re-obtained from the bucket to move. This happens for each and every bucket, though some buckets are "coincidentally" moved before their point of iteration is reached. Either way, this guarantees full in-place movement (even without stack storage) and therefore space complexity of O(1). Time complexity is amortized O(2n) assuming a good hashing function. This leads to a performance improvement of ~30% on the benchmark introduced with the last commit. Co-authored-by: Hendiadyoin1 <leon.a@serenityos.org>	2022-03-31 12:06:13 +02:00
kleines Filmröllchen	bcb8937898	AK: Merge HashTable bucket state into one enum The hash table buckets had three different state booleans that are in fact exclusive. In preparation for further states, this commit consolidates them into one enum. This has the added benefit of not relying on the compiler's boolean packing anymore; we definitely now only need one byte for the bucket state.	2022-03-31 12:06:13 +02:00
Daniel Bertalan	e3eb68dd58	AK+Kernel: Avoid double memory clearing of HashTable buckets Since the allocated memory is going to be zeroed immediately anyway, let's avoid redundantly scrubbing it with MALLOC_SCRUB_BYTE just before that. The latest versions of gcc and Clang can automatically do this malloc + memset -> calloc optimization, but I've seen a couple of places where it failed to be done. This commit also adds a naive kcalloc function to the kernel that doesn't (yet) eliminate the redundancy like the userland does.	2022-03-15 11:56:46 +01:00
Andreas Kling	9d8da1697e	AK: Automatically shrink HashTable when removing entries If the utilization of a HashTable (size vs capacity) goes below 20%, we'll now shrink the table down to capacity = (size * 2). This fixes an issue where tables would grow infinitely when inserting and removing keys repeatedly. Basically, we would accumulate deleted buckets with nothing reclaiming them, and eventually decide that we needed to grow the table (because we grow if used+deleted > limit!) I found this because HashTable iteration was taking a suspicious amount of time in Core::EventLoop::get_next_timer_expiration(). Turns out the timer table kept growing in capacity over time. That made iteration slower and slower since HashTable iterators visit every bucket.	2022-03-07 00:08:22 +01:00
Andreas Kling	eb829924da	AK: Remove return value from HashTable::remove() and HashMap::remove() This was only used by remove_all_matching(), where it's no longer used.	2022-03-07 00:08:22 +01:00
Andreas Kling	623bdd8b6a	AK: Simplify HashTable::remove_all_matching() Just walk the table from start to finish, deleting buckets as we go. This removes the need for remove() to return an iterator, which is preventing me from implementing hash table auto-shrinking.	2022-03-07 00:08:22 +01:00
Idan Horowitz	9b0d90a71d	AK: Support using custom comparison operations for hash compatible keys	2022-01-29 23:01:23 +02:00
James Puleo	10b25d2a57	AK: Implement `HashTable::try_ensure_capacity`, as used in `HashMap` This was used in `HashMap::try_ensure_capacity`, but was missing from `HashTable`s implementation. No one had used `HashMap::try_ensure_capacity` before so it went unnoticed!	2022-01-25 09:17:22 +01:00
Andreas Kling	5279a04c78	AK: Make Hash{Map,Table}::remove_all_matching() return removal success These functions now return whether one or more entries were removed.	2022-01-05 18:57:14 +01:00
Andreas Kling	54cf42fac1	AK: Add HashTable::remove_all_matching(predicate) This removes all matching entries from a table in a single pass.	2022-01-05 18:57:14 +01:00
Hendiadyoin1	c673b7220a	AK: Enable fast path for removal by hash-compatible key in HashMap/Table	2021-12-15 23:35:14 -08:00
Hendiadyoin1	d50360f5dd	AK: Allow hash-compatible key types in Hash[Table\|Map] lookup This will allow us to avoid some potentially expensive type conversion during lookup, like from String to StringView, which would allocate memory otherwise.	2021-12-15 13:09:49 +03:30
Andrew Kaster	762b92c650	AK: Resolve clang-tidy readability-qualified-auto warnings ... In files included by Kernel/Process.cpp and Kernel/Thread.cpp	2021-11-14 22:52:35 +01:00
Andrew Kaster	22feb9d47b	AK: Resolve clang-tidy readability-bool-conversion warnings ... In files included by Kernel/Process.cpp and Kernel/Thread.cpp	2021-11-14 22:52:35 +01:00
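
The Robin Hood scheme described in commit c08d137fcd (linear probing with a tracked probe distance, displacing "rich" buckets on insertion, and shifting buckets back up on removal) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code in AK/HashTable.h: the class name RobinHoodSet, its methods, the fixed capacity, and the use of `key % capacity` in place of a real hash function are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy set for illustration only; this is not AK::HashTable.
// Assumptions: fixed capacity (no growth), and key % capacity as the "hash".
class RobinHoodSet {
public:
    explicit RobinHoodSet(std::size_t capacity)
        : m_slots(capacity)
    {
    }

    std::size_t size() const { return m_size; }

    bool contains(std::uint32_t key) const { return find_index(key).has_value(); }

    // Probe distance of a stored key, or nullopt if the key is absent.
    std::optional<std::size_t> distance_of(std::uint32_t key) const
    {
        auto index = find_index(key);
        if (!index.has_value())
            return std::nullopt;
        return m_slots[*index].distance;
    }

    // Returns false if the key is already present or the table is full.
    bool insert(std::uint32_t key)
    {
        if (contains(key) || m_size == m_slots.size())
            return false;
        std::size_t index = ideal_index(key);
        std::size_t distance = 0;
        for (;;) {
            auto& slot = m_slots[index];
            if (!slot.used) {
                slot.key = key;
                slot.distance = distance;
                slot.used = true;
                ++m_size;
                return true;
            }
            // A resident closer to its ideal position than we are is "rich":
            // take its slot and keep probing with the displaced entry.
            if (slot.distance < distance) {
                std::swap(key, slot.key);
                std::swap(distance, slot.distance);
            }
            index = (index + 1) % m_slots.size();
            ++distance;
        }
    }

    // Removes the key and shifts the following buckets back up by one.
    bool remove(std::uint32_t key)
    {
        auto found = find_index(key);
        if (!found.has_value())
            return false;
        std::size_t index = *found;
        for (;;) {
            std::size_t next = (index + 1) % m_slots.size();
            // Stop at a free slot or at an entry already in its ideal position.
            if (!m_slots[next].used || m_slots[next].distance == 0)
                break;
            m_slots[index] = m_slots[next];
            --m_slots[index].distance;
            index = next;
        }
        m_slots[index].used = false;
        --m_size;
        return true;
    }

private:
    struct Slot {
        std::uint32_t key { 0 };
        std::size_t distance { 0 };
        bool used { false };
    };

    std::size_t ideal_index(std::uint32_t key) const
    {
        assert(!m_slots.empty());
        return key % m_slots.size();
    }

    std::optional<std::size_t> find_index(std::uint32_t key) const
    {
        if (m_size == 0)
            return std::nullopt;
        std::size_t index = ideal_index(key);
        for (std::size_t distance = 0; distance < m_slots.size(); ++distance) {
            auto const& slot = m_slots[index];
            // A free slot, or a resident closer to home than we would be here,
            // means the key cannot be stored any further along the probe.
            if (!slot.used || slot.distance < distance)
                return std::nullopt;
            if (slot.key == key)
                return index;
            index = (index + 1) % m_slots.size();
        }
        return std::nullopt;
    }

    std::vector<Slot> m_slots;
    std::size_t m_size { 0 };
};
```

In this sketch, removal leaves no "deleted" marker behind, which is consistent with the commit's remark that periodic rehashing is no longer needed except when expanding the capacity. The sketch omits growth, ordered iteration, and stored hashes, all of which the real table handles in other commits listed above.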

112 Commits