Use mimalloc for Ladybird-owned allocations without overriding malloc().
Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc,
and put the embedded Rust crates on the same allocator via a shared
shim in AK/kmalloc.cpp.
This also lets us drop kfree_sized(), since it no longer used its size
argument. StringData, Utf16StringData, JS object storage, Rust error
strings, and the CoreAudio playback helpers can all free their AK-backed
storage with plain kfree().
Sanitizer builds still use the system allocator. LeakSanitizer does not
reliably trace references stored in mimalloc-managed AK containers, so
static caches and other long-lived roots can look leaked. Pass the old
size into the Rust realloc shim so aligned fallback reallocations can
move posix_memalign-backed blocks safely.
Static builds still need a little linker help. macOS app binaries need
the Rust allocator entry points forced in from liblagom-ak.a, while
static ELF links can pull in identical allocator shim definitions from
multiple Rust staticlibs. Keep the Apple -u flags and allow those
duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.
This first pass only applies to the following two cases:
- Public functions returning a view type into an object they own
- Public ctors storing a view type
This catches a grand total of one (1) issue, which is fixed in
the previous commit.
The underlying storage used during string formatting is StringBuilder.
To support UTF-16 strings, this patch allows callers to specify a mode
during StringBuilder construction. The default mode is UTF-8, for which
StringBuilder remains unchanged.
In UTF-16 mode, we treat the StringBuilder's internal ByteBuffer as a
series of u16 code units. Appending a single character will append 2
bytes for that character (cast to a char16_t). Appending a StringView
will transcode the string to UTF-16.
Utf16String also gains the same memory optimization that we added for
String, where we hand-off the underlying buffer to Utf16String to avoid
having to re-allocate.
In the future, we may want to further optimize for ASCII strings. For
example, we could defer committing to the u16-esque storage until we
see a non-ASCII code point.
This was added to be used with `kfree_sized` when we construct a String
from a StringBuilder. As of 53cac71cec, it
is unused, causing some compilers to raise a warning.
This reduces the size of StringData from 24 to 16 bytes.
More work on recovering the performance regression from
DeprecatedFlyString removal.
Local measurements on my MBP:
- 2.5% speedup on Octane/zlib.js
- 2% speedup on Octane/typescript.js
Currently, invoking StringBuilder::to_string will re-allocate the string
data to construct the String. This is wasteful both in terms of memory
and speed.
The goal here is to simply hand the string buffer over to String, and
let String take ownership of that buffer. To do this, StringBuilder must
have the same memory layout as Detail::StringData. This layout is just
the members of the StringData class followed by the string itself.
So when a StringBuilder is created, we reserve sizeof(StringData) bytes
at the front of the buffer. StringData can then construct itself into
the buffer with placement new.
Things to note:
* StringData must now be aware of the actual capacity of its buffer, as
that can be larger than the string size.
* We must take care not to pass ownership of inlined string buffers, as
these live on the stack.
Before this change, the global FlyString table looked like this:
HashMap<StringView, Detail::StringBase>
After this change, we have:
HashTable<Detail::StringData const*, FlyStringTableHashTraits>
The custom hash traits are used to extract the stored hash from
StringData which avoids having to rehash the StringView repeatedly like
we did before.
This necessitated a handful of smaller changes to make it work.