HashMap<_, GC::Ref<_>>::ensure() crashed under UBSan whenever the
initialization callback triggered a GC: lookup_for_writing() stamped
the target bucket as used and added it to the ordered list before the
callback ran, so the marking visitor walked the map, read the
uninitialized slot, and failed the returns_nonnull check in GC::Ref.
Split bucket reservation into two phases. lookup_for_writing() now
hands back the target in the Free state (not in the ordered list,
m_size unchanged); callers placement-new the value and then commit via
commit_inserted_bucket(). The Robin Hood displacement loop still
stamps the slot internally and un-stamps before returning, so probing
is unchanged and the whole operation remains a single hash and a
single probe.
Use mimalloc for Ladybird-owned allocations without overriding malloc().
Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc,
and put the embedded Rust crates on the same allocator via a shared
shim in AK/kmalloc.cpp.
This also lets us drop kfree_sized(), since it no longer used its size
argument. StringData, Utf16StringData, JS object storage, Rust error
strings, and the CoreAudio playback helpers can all free their AK-backed
storage with plain kfree().
Sanitizer builds still use the system allocator. LeakSanitizer does not
reliably trace references stored in mimalloc-managed AK containers, so
static caches and other long-lived roots can look leaked. Pass the old
size into the Rust realloc shim so aligned fallback reallocations can
move posix_memalign-backed blocks safely.
Static builds still need a little linker help. macOS app binaries need
the Rust allocator entry points forced in from liblagom-ak.a, while
static ELF links can pull in identical allocator shim definitions from
multiple Rust staticlibs. Keep the Apple -u flags and allow those
duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.
Robin Hood displacement and `delete_bucket()` shift-up used BucketType's
implicit move operations, which bitwise-copy the `u8` storage array
instead of going through T's move constructor and destructor. This
change adds `relocate_bucket()` and `swap_buckets()` helpers that use a
fast path for trivially-relocatable types and move-construct + destroy
for others.
Measurements on my device have shown this to make lookups and inserts
faster by 30-40% on average, especially for larger number of buckets.
The memory cost is significant depending on the exact number of buckets,
increasing anywhere between 10% and 100% compared to the old way of
growing our capacity. The performance benefits still make this worth it,
in my opinion.
Measurements on my device have shown HashTable lookups and inserts to be
faster up to ~20%, at the cost of 10-20% more memory usage depending on
the exact amount of buckets.
...and use it to make HashMap::ensure() do a single hash lookup instead
of three.
We achieve this by factoring out everything but the bucket construction
logic from HashTable::write_value() into a lookup_for_writing() helper
so we can use it from more places.
Buckets being iterated by pointer instead of reference was causing a
compilation error when calling clear_with_capacity() on a HashTable
containing a non-trivially-destructible type.
When T in HashTable<T> has a potentially slow equality check, it can be
very profitable to check for a matching hash before full equality.
This patch adds may_have_slow_equality_check() to AK::Traits and
defaults it to true. For trivial types (pointers, integers, etc) we
default it to false. This means we skip the hash check when the equality
check would be a single-CPU-word compare anyway.
This synergizes really well with things like HashMap<String, V> where
collisions previously meant we may have to churn through multiple O(n)
equality checks.
These changes are compatible with clang-format 16 and will be mandatory
when we eventually bump clang-format version. So, since there are no
real downsides, let's commit them now.
SipHash is highly HashDoS-resistent, initialized with a random seed at
startup (i.e. non-deterministic) and usable for security-critical use
cases with large enough parameters. We just use it because it's
reasonably secure with parameters 1-3 while having excellent properties
and not being significantly slower than before.
There was a small mishmash of argument order, as seen on the table:
| Traits<T>::equals(U, T) | Traits<T>::equals(T, U)
============= | ======================= | =======================
uses equals() | HashMap | Vector, HashTable
defines equals() | *String[^1] | ByteBuffer
[^1]: String, DeprecatedString, their Fly-type equivalents and KString.
This mostly meant that you couldn't use a StringView for finding a value
in Vector<String>.
I'm changing the order of arguments to make the trait type itself first
(`Traits<T>::equals(T, U)`), as I think it's more expected and makes us
more consistent with the rest of the functions that put the stored type
first (like StringUtils functions and binary_serach). I've also renamed
the variable name "other" in find functions to "entry" to give more
importance to the value.
With this change, each of the following lines will now compile
successfully:
Vector<String>().contains_slow("WHF!"sv);
HashTable<String>().contains("WHF!"sv);
HashMap<ByteBuffer, int>().contains("WHF!"sv.bytes());
"The official project language is American English […]."
5d2e915623/CONTRIBUTING.md (L30)
Here's a short statistic of the occurrences of the word "behavio(u)r":
$ git grep -IPioh 'behaviou?r' | sort | uniq -c | sort -n
2 BEHAVIOR
24 Behaviour
32 behaviour
407 Behavior
992 behavior
Therefore, it is clear that "behaviour" (56 occurrences) should be
regarded a typo, and "behavior" (1401 occurrences) should be preferred.
Note that The occurrences in LibJS are intentionally NOT changed,
because there are taken verbatim from the specification. Hence:
$ git grep -IPioh 'behaviou?r' | sort | uniq -c | sort -n
2 BEHAVIOR
10 behaviour
24 Behaviour
407 Behavior
1014 behavior
With Clang, the previous/next pointers in buckets of an
`OrderedHashTable` are not cleared when a bucket is being shifted up as
a result of a removed bucket. As a result, an unfortunate pointer mixup
could lead to an infinite loop in the `HashTable` iterator, which was
exposed in `HashMap::keys()`.
Co-authored-by: Luke Wilde <lukew@serenityos.org>
This naming scheme matches Vector.
This also changes `take_last` to move the value it takes, and delete by
known pointer, avoiding a full lookup and potential copies.
Instead of rehashing on collisions, we use Robin Hood hashing: a simple
linear probe where we keep track of the distance between the bucket and
its ideal position. On insertion, we allow a new bucket to "steal" the
position of "rich" buckets (those near their ideal position) and move
them further down.
On removal, we shift buckets back up into the freed slot, decrementing
their distance while doing so.
This behavior automatically optimizes the number of required probes for
any value, and removes the need for periodic rehashing (except when
expanding the capacity).
This patch adds the `USING_AK_GLOBALLY` macro which is enabled by
default, but can be overridden by build flags.
This is a step towards integrating Jakt and AK types.
When calling clear_with_capacity on an empty HashTable/HashMap, a null
deref would occur when trying to memset() m_buckets. Checking that it
has capacity before clearing fixes the issue.
Usually the values of the previous and next pointers of deleted buckets
are never used, as they're not part of the main ordered bucket chain,
but if an in-place rehashing is done, which results in the bucket being
turned into a free bucket, the stale pointers will remain, at which
point any item that is inserted into said free-bucket will have either
a stale previous pointer if the HashTable was empty on insertion, or a
stale next pointer, resulting in undefined behaviour.
This commit also includes a new HashMap test that reproduces this issue
As seen on TV, HashTable can get "thrashed", i.e. it has a bunch of
deleted buckets that count towards the load factor. This means that hash
tables which are large enough for their contents need to be resized.
This was fixed in 9d8da16 with a workaround that shrinks the HashTable
back down in these cases, as after the resize and re-hash the load
factor is very low again. However, that's not a good solution. If you
insert and remove repeatedly around a size boundary, you might get
frequent resizes, which involve frequent re-allocations.
The new solution is an in-place rehashing algorithm that I came up with.
(Do complain to me, I'm at fault.) Basically, it iterates the buckets
and re-hashes the used buckets while marking the deleted slots empty.
The issue arises with collisions in the re-hash. For this reason, there
are two kinds of used buckets during the re-hashing: the normal "used"
buckets, which are old and are treated as free space, and the
"re-hashed" buckets, which are new and treated as used space, i.e. they
trigger probing. Therefore, the procedure for relocating a bucket's
contents is as follows:
- Locate the "real" bucket of the contents with the hash. That bucket is
the starting point for the target bucket, and the current (old) bucket
is the bucket we want to move.
- While we still need to move the bucket:
- If we're the target, something strange happened last iteration or we
just re-hashed to the same location. We're done.
- If the target is empty or deleted, just move the bucket. We're done.
- If the target is a re-hashed full bucket, we probe by double-hashing
our hash as usual. Henceforth, we move our target for the next
iteration.
- If the target is an old full bucket, we swap the target and to-move
buckets. Therefore, the bucket to move is a the correct location and the
former target, which still needs to find a new place, is now in the
bucket to move. So we can just continue with the loop; the target is
re-obtained from the bucket to move. This happens for each and every
bucket, though some buckets are "coincidentally" moved before their
point of iteration is reached. Either way, this guarantees full in-place
movement (even without stack storage) and therefore space complexity of
O(1). Time complexity is amortized O(2n) asssuming a good hashing
function.
This leads to a performance improvement of ~30% on the benchmark
introduced with the last commit.
Co-authored-by: Hendiadyoin1 <leon.a@serenityos.org>
The hash table buckets had three different state booleans that are in
fact exclusive. In preparation for further states, this commit
consolidates them into one enum. This has the added benefit on not
relying on the compiler's boolean packing anymore; we definitely now
only need one byte for the bucket state.
Since the allocated memory is going to be zeroed immediately anyway,
let's avoid redundantly scrubbing it with MALLOC_SCRUB_BYTE just before
that.
The latest versions of gcc and Clang can automatically do this malloc +
memset -> calloc optimization, but I've seen a couple of places where it
failed to be done.
This commit also adds a naive kcalloc function to the kernel that
doesn't (yet) eliminate the redundancy like the userland does.
If the utilization of a HashTable (size vs capacity) goes below 20%,
we'll now shrink the table down to capacity = (size * 2).
This fixes an issue where tables would grow infinitely when inserting
and removing keys repeatedly. Basically, we would accumulate deleted
buckets with nothing reclaiming them, and eventually deciding that we
needed to grow the table (because we grow if used+deleted > limit!)
I found this because HashTable iteration was taking a suspicious amount
of time in Core::EventLoop::get_next_timer_expiration(). Turns out the
timer table kept growing in capacity over time. That made iteration
slower and slower since HashTable iterators visit every bucket.
Just walk the table from start to finish, deleting buckets as we go.
This removes the need for remove() to return an iterator, which is
preventing me from implementing hash table auto-shrinking.
This was used in `HashMap::try_ensure_capacity`, but was missing from
`HashTable`s implementation. No one had used
`HashMap::try_ensure_capacity` before so it went unnoticed!
This will allow us to avoid some potentially expensive type conversion
during lookup, like form String to StringView, which would allocate
memory otherwise.