ladybird

mirror of https://github.com/LadybirdBrowser/ladybird synced 2026-04-25 17:25:08 +02:00

Author	SHA1	Message	Date
Andreas Kling	b23aa38546	AK: Adopt mimalloc v2 as main allocator Use mimalloc for Ladybird-owned allocations without overriding malloc(). Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc, and put the embedded Rust crates on the same allocator via a shared shim in AK/kmalloc.cpp. This also lets us drop kfree_sized(), since it no longer used its size argument. StringData, Utf16StringData, JS object storage, Rust error strings, and the CoreAudio playback helpers can all free their AK-backed storage with plain kfree(). Sanitizer builds still use the system allocator. LeakSanitizer does not reliably trace references stored in mimalloc-managed AK containers, so static caches and other long-lived roots can look leaked. Pass the old size into the Rust realloc shim so aligned fallback reallocations can move posix_memalign-backed blocks safely. Static builds still need a little linker help. macOS app binaries need the Rust allocator entry points forced in from liblagom-ak.a, while static ELF links can pull in identical allocator shim definitions from multiple Rust staticlibs. Keep the Apple -u flags and allow those duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.	2026-04-08 09:57:53 +02:00
Ben Wiederhake	dab83d35d1	AK: Remove unused include from ByteString	2026-02-17 12:38:51 +00:00
Andreas Kling	b50ff02da4	AK: Skip ASCII validation in {Utf16String,String}::number()	2025-10-05 11:24:46 +02:00
Timothy Flynn	8472e469f4	AK+LibJS+LibWeb: Recognize that our UTF-16 string is actually WTF-16 For the web, we allow a wobbly UTF-16 encoding (i.e. lonely surrogates are permitted). Only in a few exceptional cases do we strictly require valid UTF-16. As such, our `validate(AllowLonelySurrogates::Yes)` calls will always succeed. It's a wasted effort to ever make such a check. This patch eliminates such invocations. The validation methods will now only check for strict UTF-16, and are only invoked when needed.	2025-08-13 09:56:13 -04:00
Timothy Flynn	36c7302178	AK: Optimize the UTF-16 StringBuilder for ASCII storage When we build a UTF-16 string, we currently always switch to the UTF-16 storage mode inside StringBuilder. Then when it comes time to create the string, we switch the storage to ASCII if possible (by shifting the underlying bytes up). Instead, let's start out with ASCII storage and then switch to UTF-16 storage once we see a non-ASCII code point. For most strings, this will avoid allocating 2x the memory, and avoids many ASCII validation calls.	2025-08-13 09:56:13 -04:00
Timothy Flynn	13ed6aba71	AK+LibIPC: Implement an encoder/decoder for UTF-16 strings	2025-08-02 10:10:14 -07:00
Timothy Flynn	1375e6bf39	AK+LibJS+LibWeb: Use simdutf to create well-formed strings	2025-07-26 00:40:06 +02:00
Timothy Flynn	2803d66d87	AK: Support UTF-16 string formatting The underlying storage used during string formatting is StringBuilder. To support UTF-16 strings, this patch allows callers to specify a mode during StringBuilder construction. The default mode is UTF-8, for which StringBuilder remains unchanged. In UTF-16 mode, we treat the StringBuilder's internal ByteBuffer as a series of u16 code units. Appending a single character will append 2 bytes for that character (cast to a char16_t). Appending a StringView will transcode the string to UTF-16. Utf16String also gains the same memory optimization that we added for String, where we hand-off the underlying buffer to Utf16String to avoid having to re-allocate. In the future, we may want to further optimize for ASCII strings. For example, we could defer committing to the u16-esque storage until we see a non-ASCII code point.	2025-07-18 12:45:38 -04:00
Timothy Flynn	fe676585f5	AK: Add a UTF-16 string with optimized short- and ASCII-string storage This is a strictly UTF-16 string with some optimizations for ASCII. * If created from a short UTF-8 or UTF-16 string that is also ASCII, then the string is stored in an inlined byte buffer. * If created with a long UTF-8 or UTF-16 string that is also ASCII, then the string is stored in an outlined char buffer. * If created with a short or long UTF-8 or UTF-16 string that is not ASCII, then the string is stored in an outlined char16 buffer. We do not store short non-ASCII text in the inlined buffer to avoid confusion with operations such as `length_in_code_units` and `code_unit_at`. For example, "😀" would be stored as 4 UTF-8 bytes in short string form. But we still want `length_in_code_units` to be 2, and `code_unit_at(0)` to be 0xD83D.	2025-07-18 12:45:38 -04:00

9 Commits