Use mimalloc for Ladybird-owned allocations without overriding malloc().
Route kmalloc(), kcalloc(), krealloc(), and kfree() through mimalloc,
and put the embedded Rust crates on the same allocator via a shared
shim in AK/kmalloc.cpp.
This also lets us drop kfree_sized(), since it no longer uses its size
argument. StringData, Utf16StringData, JS object storage, Rust error
strings, and the CoreAudio playback helpers can all free their AK-backed
storage with plain kfree().
Sanitizer builds still use the system allocator. LeakSanitizer does not
reliably trace references stored in mimalloc-managed AK containers, so
static caches and other long-lived roots can look leaked. Pass the old
size into the Rust realloc shim so aligned fallback reallocations can
move posix_memalign-backed blocks safely.
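A minimal sketch of the two layers, assuming mimalloc's public mi_* API;
the shim name, its signature, and the fallback details are illustrative
rather than the exact AK/kmalloc.cpp code:

    #include <mimalloc.h>
    #include <cstddef>
    #include <cstdlib>
    #include <cstring>

    // Ladybird-owned entry points route to mimalloc without overriding the
    // global malloc()/free() symbols.
    extern "C" void* kmalloc(size_t size) { return mi_malloc(size); }
    extern "C" void* kcalloc(size_t count, size_t size) { return mi_calloc(count, size); }
    extern "C" void* krealloc(void* ptr, size_t size) { return mi_realloc(ptr, size); }
    extern "C" void kfree(void* ptr) { mi_free(ptr); }

    // Illustrative Rust-facing realloc shim for the system-allocator
    // fallback: an over-aligned block comes from posix_memalign() and cannot
    // be resized in place, so the caller-supplied old size makes the
    // relocation copy safe.
    extern "C" void* rust_realloc_shim(void* ptr, size_t old_size, size_t align, size_t new_size)
    {
        void* fresh = nullptr;
        if (posix_memalign(&fresh, align, new_size) != 0)
            return nullptr;
        if (ptr) {
            memcpy(fresh, ptr, old_size < new_size ? old_size : new_size);
            free(ptr);
        }
        return fresh;
    }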
Static builds still need a little linker help. macOS app binaries need
the Rust allocator entry points forced in from liblagom-ak.a, while
static ELF links can pull in identical allocator shim definitions from
multiple Rust staticlibs. Keep the Apple -u flags and allow those
duplicate shim symbols for LibJS and LibRegex links on Linux and BSD.
The proposal does not seem to have progressed for a while, and there is
an open issue about module imports that breaks HTML integration.
While we could probably make an ad-hoc change to fix that issue,
it is deep enough in the JS engine that I am not particularly
keen on making that change.
Until other browsers begin to send positive signals about shipping
ShadowRealms, let's remove our implementation for now.
There is still some cleanup that can be done with regard to the
HTML integration, but there are a few more items that need to be
untangled there.
RegExpBuiltinExec used to snap any Unicode lastIndex that landed on a
low surrogate back to the start of the pair. That kept `/😀/u`
matching, but it skipped valid empty matches when the original
low-surrogate position was itself matchable, such as
`/\p{Script=Cyrillic}?(?<!\D)/v` on `"A😘"` and the longer fuzzed
global case.
Try the snapped position first, then retry the original lastIndex when
the snapped match fails. Only keep that second result when it is empty
at the original low-surrogate position, so consuming /u and /v matches
still cannot split a surrogate pair. In the Rust VM, treat backward
Unicode matches that start between surrogate halves as having no
complete code point to their left, which matches V8's lookbehind
behavior for those positions.
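Condensed, the retry policy looks like this (`Match` and `try_match` are
stand-ins for the real exec machinery, not LibJS API):

    #include <cstddef>
    #include <functional>
    #include <optional>

    struct Match {
        size_t start = 0;
        size_t length = 0;
    };

    // try_match(index) stands in for running the compiled pattern from a
    // given UTF-16 index; snapped_index is lastIndex moved back to the
    // surrogate pair's start.
    std::optional<Match> exec_with_retry(
        std::function<std::optional<Match>(size_t)> const& try_match,
        size_t last_index, size_t snapped_index)
    {
        if (auto match = try_match(snapped_index))
            return match;
        auto retry = try_match(last_index);
        // Keep the retry only when it is an empty match at the original
        // low-surrogate position, so a consuming /u or /v match can never
        // split a surrogate pair.
        if (retry && retry->length == 0 && retry->start == last_index)
            return retry;
        return std::nullopt;
    }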
Add reduced coverage for both low-surrogate exec cases, the original
global match count regression, and the consuming-match retry regression.
Reject surrogate pairs in named group names unless both halves come
from the same raw form. A literal surrogate half was being
normalized into \uXXXX before LibRegex parsed the pattern, which let
mixed literal and escaped forms sneak through.
Validate surrogate handling on the UTF-16 pattern before
normalization, but only treat \k<...> as a named backreference when
the parser would do that too. Legacy regexes without named groups
still use \k as an identity escape, so their literal text must not be
rejected by the pre-scan.
Add runtime and syntax tests for the mixed forms, the valid literal,
fixed-width, and braced escape cases, and the legacy \k literals.
This better describes what the method returns and avoids the possible
confusion caused by the mismatch in behavior between
`Value::is_array()` and `Value::as_array()`.
In b2d9fd3352, the root cause of the crash
was somewhat misdiagnosed, particularly around where in the code an
allocation could occur while the constants data was uninitialized.
But more importantly, we can do better than the solution in that commit.
Instead of initializing constants with default values and then
overwriting them afterwards, simply initialize them with their actual
values directly when constructing the execution context.
This effectively reverts commit b2d9fd3352.
The additional data being passed will be used in an upcoming commit.
This allows splitting the churn of modified function signatures from
the logically meaningful code change.
No behavior change.
When allocating ExecutionContext, we were skipping allocating constants,
because they are filled in shortly after. However, there is still some
code executing between allocating the ExecutionContext and assigning
constants. This code may allocate GC-aware objects before the
assignment. Allocating any GC object may cause a garbage collection to
be triggered. And running garbage collection on uninitialized objects
might have all kinds of unintended effects.
To avoid that, simply initialize constants right away. To be safe, also
initialize arguments, but I haven't checked more closely whether it is
needed for them.
The specific case where this bug manifested was JetStream3's
raytrace-private-class-fields.js, which was triggering a segmentation
fault when trying to visit a function's constants during the GC marking
phase. The offending allocation that triggered the garbage collection
was the creation of the corresponding FunctionEnvironment.
This crash was exposed by a combination of multiple things coming
together. In particular, 1179e40d3f and 61e6dbe4e7 combined
were exposing the problem, but it seems neither commit is at fault. Most
likely, whether the crash happens is sensitive to the exact amount of GC
pressure present or the size of individual execution contexts, so
any change that affects those might make it appear or go away.
To add an additional wrinkle, this could only be observed using the
asm JS interpreter. The C++ interpreter was using a fast path for
function calls that had a different allocation pattern and did not run
into the crash.
No explicit regression test for this change because:
* The problem is very sensitive to implementation details and a
reproduction that stays valid with code changes in the interpreter is
probably impossible to come by.
* The bug was exposed by the JS benchmarks, which are already as good
a regression test as we are realistically going to get here.
Preserve V8's behavior for bare single-astral literals when a unicode
global search starts in the middle of a surrogate pair. We were
snapping that lastIndex back to the pair start unconditionally,
which let /😀/gu and /\u{1F600}/gu match where V8 returns null.
Expose that literal shape from LibRegex to LibJS and add runtime
coverage for the bare literal case alongside a grouped control.
The fast path in RegExp.prototype.test() checked for an own "exec"
property on the instance via storage_has(), but did not detect when
RegExp.prototype.exec itself had been replaced. This meant overriding
exec on the prototype was silently ignored, violating the spec which
requires test() to go through RegExpExec() and thus the overridable
exec method.
Fix this by resolving "exec" via a full prototype chain lookup and
checking whether the result is still the built-in exec, matching the
approach already used in Symbol.replace's fast path.
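Reduced to a toy model, the check looks like this (the types and helpers
are illustrative, not the LibJS API):

    #include <map>
    #include <string>

    struct ToyObject {
        std::map<std::string, void*> own_properties;
        ToyObject const* prototype = nullptr;
    };

    // A full prototype-chain lookup; an own-property check like
    // storage_has() stops at the instance and never sees a replaced
    // RegExp.prototype.exec.
    void* lookup(ToyObject const& object, std::string const& name)
    {
        for (auto const* o = &object; o; o = o->prototype) {
            if (auto it = o->own_properties.find(name); it != o->own_properties.end())
                return it->second;
        }
        return nullptr;
    }

    bool can_use_fast_test(ToyObject const& regexp, void* builtin_exec)
    {
        return lookup(regexp, "exec") == builtin_exec;
    }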
Switch LibJS `RegExp` over to the Rust-backed `ECMAScriptRegex` APIs.
Route `new RegExp()`, regex literals, and the RegExp builtins through
the new compile and exec APIs, and stop re-validating patterns with the
deleted C++ parser on the way in. Preserve the observable error
behavior by carrying structured compile errors and backtracking-limit
failures across the FFI boundary. Cache compiled regex state and named
capture metadata on `RegExpObject` in the new representation.
Use the new API surface to simplify and speed up the builtin paths too:
share `exec_internal`, cache compiled regex pointers, keep the legacy
RegExp statics lazy, run global replace through batch `find_all`, and
optimize replace, test, split, and String helper paths. Add regression
tests for those JavaScript-visible paths.
Add LibRegex's new Rust ECMAScript regular expression engine.
Replace the old parser's direct pattern-to-bytecode pipeline with a
split architecture: parse patterns into a lossless AST first, then
lower that AST into bytecode for a dedicated backtracking VM. Keep the
syntax tree as the place for validation, analysis, and optimization
instead of teaching every transformation to rewrite partially built
bytecode.
Specialize this backend for the job LibJS actually needs. The old C++
engine shared one generic parser and matcher stack across ECMA-262 and
POSIX modes and supported both byte-string and UTF-16 inputs. The new
engine focuses on ECMA-262 semantics on WTF-16 data, which lets it
model lone surrogates and other JavaScript-specific behavior directly
instead of carrying POSIX and multi-encoding constraints through the
whole implementation.
Fill in the ECMAScript features needed to replace the old engine for
real web workloads: Unicode properties and sets, lookahead and
lookbehind, named groups and backreferences, modifier groups, string
properties, large quantifiers, lone surrogates, and the parser and VM
corner cases those features exercise.
Reshape the runtime around compile-time pattern hints and a hotter VM
loop. Pre-resolve Unicode properties, derive first-character,
character-class, and simple-scan filters, extract safe trailing
literals for anchored patterns, add literal and literal-alternation
fast paths, and keep reusable scratch storage for registers,
backtracking state, and modifier stacks. Teach `find_all` to stay
inside one VM so global searches stop paying setup costs on every
match.
Make those shortcuts semantics-aware instead of merely fast. In Unicode
mode, do not use literal fast paths for lone surrogates, since
ECMA-262 must not let `/\ud83d/u` match inside a surrogate pair.
Likewise, only derive end-anchor suffix hints when the suffix lies on
every path to `Match`, so lookarounds and disjunctions cannot skip into
a shared tail and produce false negatives.
This commit lands the Rust crate, the C++ wrapper, the build
integration, and the initial LibJS-side plumbing needed to exercise
the new engine under real RegExp callers before removing the legacy
backend.
The set of all prototype shapes was a process-global static, which
meant that Shape::invalidate_all_prototype_chains_leading_to_this()
had to iterate over every prototype shape from every Realm in the
process.
This was catastrophic for pages that load many SVG-as-img resources,
since each SVG image creates its own Realm with a full set of JS
intrinsics and web prototypes. With N SVG images, each adding ~100
properties to its ObjectPrototype, this became O(N * 100 * M),
where M is the total number of prototype shapes across all Realms.
Since prototype chains never cross Realm boundaries, we can scope
the tracking to each Realm, making the invalidation cost independent
of the number of Realms in the process.
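A toy sketch of the scoping change (the member names are assumptions):

    #include <set>

    struct Shape;

    struct Realm {
        // Previously a process-global static: now each Realm tracks only
        // its own prototype shapes.
        std::set<Shape*> prototype_shapes;
    };

    struct Shape {
        Realm* realm = nullptr;
    };

    void invalidate_all_prototype_chains_leading_to(Shape& target)
    {
        // Scans the shapes of a single Realm, independent of how many other
        // Realms (e.g. SVG-as-img documents) exist in the process.
        for (auto* shape : target.realm->prototype_shapes) {
            // ... walk shape's prototype chain and invalidate it if it
            // reaches `target` ...
            (void)shape;
        }
    }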
Store yield_continuation and yield_is_await directly in
ExecutionContext instead of allocating a GeneratorResult GC cell.
This removes a heap allocation per yield/await and fixes a latent
bug where continuation addresses stored as doubles could lose
precision.
This reduces the size of `Optional<EnvironmentCoordinate>` from 12 to 8
bytes, and by reordering the fields in `Reference` we shrink that type
from 64 to 56 bytes as well.
We specialize `Optional<T>` for value types that inherently support some
kind of "empty" value, or whose value range allows for an
unlikely-to-be-useful sentinel value that can mean "empty", instead of
the boolean flag a regular `Optional<T>` needs to store. Because of
padding, this often means saving 4 to 8 bytes per instance.
By extending the new `SentinelOptional<T, Traits>`, these
specializations are significantly simplified: they only have to define
what the sentinel value is and how to identify it.
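A minimal sketch of the pattern, assuming a simplified interface (the
real AK class carries more of Optional's API):

    #include <cstdint>
    #include <utility>

    // A specialization only states what the sentinel is and how to detect
    // it, instead of storing a separate boolean "has value" flag.
    template<typename T, typename Traits>
    class SentinelOptional {
    public:
        SentinelOptional() : m_value(Traits::sentinel()) {}
        SentinelOptional(T value) : m_value(std::move(value)) {}
        bool has_value() const { return !Traits::is_sentinel(m_value); }
        T const& value() const { return m_value; }
    private:
        T m_value;
    };

    // Example traits for a 4-byte index where UINT32_MAX can never be a
    // useful value:
    struct IndexSentinelTraits {
        static uint32_t sentinel() { return UINT32_MAX; }
        static bool is_sentinel(uint32_t value) { return value == UINT32_MAX; }
    };
    using OptionalIndex = SentinelOptional<uint32_t, IndexSentinelTraits>;
    static_assert(sizeof(OptionalIndex) == sizeof(uint32_t)); // no flag byte, no padding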
Clean up leftover references to the removed C++ pipeline:
- Remove stale forward declarations from Forward.h (ASTNode,
Parser, Program, FunctionNode, ScopeNode, etc.)
- Delete unused FunctionParsingInsights.h
- Remove dead get_builtin(MemberExpression const&) declaration
from Builtins.h
- Update stale comments referencing ASTCodegen.cpp and
generate_bytecode()
Delete Lexer.cpp/h and Token.cpp, replacing all tokenization with a
new rust_tokenize() FFI function that calls back for each token.
Rewrite SyntaxHighlighter.cpp and js.cpp REPL to use the Rust
tokenizer. The token type and category enums in Token.h now mirror
the Rust definitions in token.rs.
Move is_syntax_character/is_whitespace/is_line_terminator helpers
into RegExpConstructor.cpp as static functions, since they were only
used there.
Delete AST.cpp, AST.h, ASTDump.cpp, ScopeRecord.h, and the dead
get_builtin(MemberExpression const&) from Builtins.cpp.
Extract ImportEntry and ExportEntry into a new ModuleEntry.h,
since they are data types used by the module system, not AST
node types.
Inline ModuleRequest's sorting constructor and
SourceRange::filename().
Remove the dead annex_b_function_declarations field from
EvalDeclarationData, which was only populated by the C++ parser.
Remove the constructor that took C++ AST nodes (FunctionParameters,
Statement), along with create_for_function_node() and the
m_formal_parameters / m_ecmascript_code fields. These were only used
by the now-removed C++ compilation pipeline.
Also remove the dead EvalDeclarationData::create(VM&, Program&, bool)
and ECMAScriptFunctionObject::ecmascript_code() accessor.
Remove Bytecode::compile() and the old create() overloads on
ECMAScriptFunctionObject that accepted C++ AST nodes. These
have no remaining callers now that all compilation goes through
the Rust pipeline.
Also remove the if-constexpr Parse Node branch from
async_block_start, since the Statement template instantiation
was already removed.
Fix transitive include dependencies on Generator.h by adding
explicit includes for headers that were previously pulled in
transitively.
Port all remaining users of the C++ Parser/Lexer/Generator to
use the Rust pipeline instead:
- Intrinsics: Remove C++ fallback in parse_builtin_file()
- ECMAScriptFunctionObject: Remove C++ compile() fallback
- NativeJavaScriptBackedFunction: Remove C++ compile() fallback
- EventTarget: Port to compile_dynamic_function
- WebDriver/ExecuteScript: Port to compile_dynamic_function
- LibTest/JavaScriptTestRunner.h: Remove Parser/Lexer includes
- FuzzilliJs: Remove unused Parser/Lexer includes
Also remove the dead Statement-based template instantiation of
async_block_start/async_function_start.
Now that the Rust pipeline is the sole compilation path, remove all
C++ parser/codegen fallback paths from the callers:
- Script::parse() no longer falls back to C++ Parser
- SourceTextModule::parse() no longer falls back to C++ Parser
- perform_eval() no longer falls back to C++ Parser + Generator
- create_dynamic_function() no longer falls back to C++ Parser
- ShadowRealm eval no longer falls back to C++ Parser + Generator
- Interpreter::run(Script&) no longer falls back to Generator
Also remove the now-dead old constructors that took C++ AST nodes,
the module_requests() helper, and AST dump code from js.cpp.
Remove PipelineComparison.cpp/h and all LIBJS_COMPARE_PIPELINES
support from RustIntegration.cpp. This includes:
- The compare_pipelines_enabled() function
- All comparison blocks in compile_script/eval/module/function
- The pair_shared_function_data() helper
- The m_cpp_comparison_sfd field on SharedFunctionInstanceData
The Rust pipeline has been validated extensively through comparison
testing and no longer needs the side-by-side verification harness.
Cache raw data pointers on fixed-length typed array views so asm
GetByValue and PutByValue can use them directly for indexed
element access.
Replace the asm typed-array hot-path
ArrayBuffer/DataBlock/ByteBuffer walk with one cached_data_ptr load.
Remove six unconditional loads, four branches, and the byte_offset
add before the element access, trading them for one
cached_data_ptr null check.
Keep direct C++ typed-array access on IsValidIntegerIndex-based
checks, invalidate cached pointers eagerly when a backing
ArrayBuffer is detached, and add regression coverage for shrink,
regrow, and detach on number and BigInt typed arrays.
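The hot path reduces to a toy like this (fields simplified; the cache
lives on the fixed-length view):

    #include <cstddef>
    #include <cstdint>

    struct ToyTypedArrayView {
        // Set for fixed-length views over an attached buffer; cleared
        // eagerly when the backing ArrayBuffer is detached.
        uint8_t* cached_data_ptr = nullptr;
    };

    uint8_t* element_address(ToyTypedArrayView& view, size_t element_byte_offset)
    {
        if (!view.cached_data_ptr)
            return nullptr; // detached or not cacheable: take the slow path
        // One load plus one null check instead of the
        // ArrayBuffer -> DataBlock -> ByteBuffer walk and byte_offset add.
        return view.cached_data_ptr + element_byte_offset;
    }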
Use dedicated Packed branches in GetByValue and PutByValue so
in-bounds indexed accesses can skip hole checks and slot
reloads.
Keep Holey writes on the guarded arm, and keep append writes on
the C++ slow path so PutByValue still respects non-extensible
indexed objects and arrays with a non-writable length.
Add a bytecode regression that exercises both append failure
cases through the real js binary path.
Arrays created via new Array(N) or by setting .length start as Holey
since their elements are not present. After sequential fill (e.g.
for (i=0; i<N; i++) a[i]=v), all holes are filled but the array
remained Holey, preventing the Packed fast paths in the asm
interpreter from triggering.
Now, whenever indexed_put() writes to the last index of a Holey
array, we scan for remaining holes and promote to Packed if none
are found. Only checking on writes to the last index avoids O(N^2)
scanning on partial fills while still catching the common
sequential fill pattern.
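The heuristic, as a toy over simplified storage (presence bits stand in
for the empty-value hole sentinel):

    #include <cstddef>
    #include <vector>

    enum class StorageKind { Packed, Holey };

    struct ToyArray {
        std::vector<bool> present; // true where an element exists
        StorageKind kind = StorageKind::Holey;
    };

    // Assumes index < array.present.size(). Only a write to the last index
    // triggers the hole scan, so partial fills never pay repeated O(N)
    // scans, while a sequential fill promotes exactly once, on its final
    // write.
    void indexed_put(ToyArray& array, size_t index)
    {
        array.present[index] = true;
        if (array.kind != StorageKind::Holey || index != array.present.size() - 1)
            return;
        for (bool has_element : array.present) {
            if (!has_element)
                return; // still has holes; stay Holey
        }
        array.kind = StorageKind::Packed;
    }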
When the receiver is an Array with packed storage and an intact default
prototype chain, some methods can skip the generic property access
machinery and operate directly on the indexed element storage.
This patch adds fast paths for push(), pop(), concat(), slice() and
splice().
Replace the OwnPtr<IndexedPropertyStorage> indirection with inline
indexed element storage directly on Object. This eliminates virtual
dispatch and reduces indirection for indexed property access.
The new system uses three storage kinds tracked by IndexedStorageKind:
- Packed: Dense array, no holes. Elements stored in a malloced Value*
array with capacity header (same layout as named properties).
- Holey: Dense array with possible holes marked by empty sentinel.
Same physical layout as Packed.
- Dictionary: Sparse storage using GenericIndexedPropertyStorage,
type-punned into the m_indexed_elements pointer.
Transitions: None->Packed->Holey->Dictionary (mostly monotonic).
Dictionary mode triggers on non-default attributes or sparse arrays.
Object keeps the same 48-byte size since m_indexed_elements (8 bytes)
replaces IndexedProperties (8 bytes), and the storage kind + array
size fit in existing padding alongside m_flags.
The asm interpreter benefits from one fewer indirection: it now reads
the element pointer and array size directly from Object fields instead
of chasing through OwnPtr -> IndexedPropertyStorage -> Vector.
Removes: IndexedProperties, SimpleIndexedPropertyStorage,
IndexedPropertyStorage, IndexedPropertyIterator.
Keeps: GenericIndexedPropertyStorage (for Dictionary mode).
Replace the 24-byte Vector<Value> m_storage with an 8-byte raw
Value* m_named_properties pointer, backed by a malloc'd allocation
with an inline capacity header.
Memory layout of the allocation:
[u32 capacity] [u32 padding] [Value 0] [Value 1] ...
m_named_properties points to Value 0.
This shrinks JS::Object from 64 to 48 bytes (on non-Windows
platforms) and removes one level of indirection for property access
in the asm interpreter, since the data pointer is now stored directly
on the object rather than inside a Vector's internal metadata.
Growth policy: max(4, max(needed, old_capacity * 2)).
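A sketch of that layout and policy (with Value reduced to a u64):

    #include <cstdint>
    #include <cstdlib>

    using Value = uint64_t; // stand-in for JS::Value

    // [u32 capacity][u32 padding][Value 0][Value 1]... ; the object stores
    // a pointer to Value 0, so property reads skip Vector metadata entirely.
    Value* allocate_property_storage(uint32_t capacity)
    {
        auto* header = static_cast<uint32_t*>(malloc(sizeof(uint64_t) + capacity * sizeof(Value)));
        header[0] = capacity;
        return reinterpret_cast<Value*>(reinterpret_cast<char*>(header) + sizeof(uint64_t));
    }

    uint32_t capacity_of(Value* storage)
    {
        return *reinterpret_cast<uint32_t*>(reinterpret_cast<char*>(storage) - sizeof(uint64_t));
    }

    // Growth policy from this commit: max(4, max(needed, old_capacity * 2)).
    uint32_t grow_capacity(uint32_t old_capacity, uint32_t needed)
    {
        uint32_t doubled = old_capacity * 2;
        uint32_t target = needed > doubled ? needed : doubled;
        return target < 4 ? 4 : target;
    }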
When an async function is resumed from a microtask and hits another
await with a non-thenable value (primitive or already-settled native
promise), and the microtask queue is empty, we can resolve the await
synchronously without suspending. No other microtask can observe the
difference in execution order, making this optimization safe.
This avoids the overhead of creating a GC::Function for the microtask
job, enqueuing/dequeuing from the microtask queue, and the execution
context push/pop that comes with it.
A new VM host hook, host_promise_job_queue_is_empty, is added so both
the standalone js binary and LibWeb can provide the appropriate check
for their respective job queue implementations.
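Sketched, the wiring looks like this (VM and queue types simplified; the
hook name comes from this commit, its exact signature does not):

    #include <functional>
    #include <queue>

    struct ToyVM {
        std::function<bool()> host_promise_job_queue_is_empty;
    };

    // The standalone js binary can answer from a plain job queue; LibWeb
    // would consult its HTML event-loop microtask queue instead.
    void install_hook(ToyVM& vm, std::queue<std::function<void()>>& jobs)
    {
        vm.host_promise_job_queue_is_empty = [&jobs] { return jobs.empty(); };
    }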
Per spec, every `await` goes through PromiseResolve (which wraps the
value in a new Promise via NewPromiseCapability) and then
PerformPromiseThen (which creates PromiseReaction and JobCallback
objects). This results in 13-16 GC cell allocations per await.
Add a fast path that detects two common cases:
1. Primitive values: These can never have a "then" property, so we
can skip all promise wrapping and directly schedule the async
function's continuation as a microtask.
2. Already-settled native Promises: If the promise has no own
properties and its prototype is the intrinsic %Promise.prototype%,
we can extract the result directly and schedule continuation.
For these cases, we bypass promise_resolve(), new_promise_capability(),
create_resolving_functions(), perform_then(), PromiseReaction creation,
and JobCallback creation -- replacing ~13 GC allocations with 1
(the GC::Function for the microtask job).
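Condensed into a toy dispatch (the two predicate fields stand in for the
real checks described above):

    #include <functional>

    struct ToyAwaitedValue {
        bool is_primitive = false;
        bool is_clean_settled_native_promise = false;
    };

    void await_value(ToyAwaitedValue const& value,
                     std::function<void()> const& schedule_continuation_microtask,
                     std::function<void()> const& full_spec_path)
    {
        if (value.is_primitive || value.is_clean_settled_native_promise)
            schedule_continuation_microtask(); // 1 allocation: the microtask job
        else
            full_spec_path(); // PromiseResolve + PerformPromiseThen (~13-16 cells)
    }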
The fulfilled and rejected closures in AsyncFunctionDriverWrapper::await
were identical in structure, differing only in whether they resumed the
async context with a NormalCompletion or ThrowCompletion.
Since the callback has access to the current promise, we can check its
state at reaction time to determine which completion to use. This allows
us to allocate a single GC-tracked NativeFunction (m_on_settled) and
register it for both the onFulfilled and onRejected slots in
PerformPromiseThen, halving the number of GC allocations on this path.
There will be an extremely similar function that accepts a calendar year
rather than an ISO year in a subsequent commit. Rename the ISO year variant
so that they are more distinguishable.
Replace the 16-byte Variant<Empty, GC::Ref<Script>, GC::Ref<Module>>
with a simple 8-byte GC::Ptr<Cell> that points to either a Script or
Module (or is null for Empty).
A helper function script_or_module_from_cell() converts back to the
full ScriptOrModule variant when needed (e.g. in
VM::get_active_script_or_module).
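The compression, as a toy (a tag enum in place of the GC cell type
checks):

    // One 8-byte pointer replaces the 16-byte three-state variant: null
    // encodes Empty, and the cell's own type distinguishes Script from
    // Module.
    struct Cell {
        enum class Kind { Script, Module } kind;
    };

    enum class ScriptOrModuleKind { Empty, Script, Module };

    ScriptOrModuleKind script_or_module_from_cell(Cell const* cell)
    {
        if (!cell)
            return ScriptOrModuleKind::Empty;
        return cell->kind == Cell::Kind::Script ? ScriptOrModuleKind::Script
                                                : ScriptOrModuleKind::Module;
    }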
This field was written by push_inline_frame but never read anywhere.
The caller's executable is accessible via caller_frame->executable
if ever needed.
Shrinks ExecutionContext from 120 to 112 bytes.
The arguments Span (pointer + size = 16 bytes) was always derivable
from the tail array layout: data = values + (total_count - arg_count).
Replace it with a u32 argument_count and derive the span on demand
via arguments_span() / arguments_data() accessors.
Shrinks ExecutionContext from 136 to 120 bytes.
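A sketch of the derivation (layout per this commit, types simplified):

    #include <cstdint>
    #include <span>

    struct Value {
        uint64_t encoded = 0; // stand-in for JS::Value
    };

    struct ToyExecutionContext {
        Value* values = nullptr;     // tail array of total_count Values
        uint32_t total_count = 0;
        uint32_t argument_count = 0; // replaces the 16-byte Span

        // Arguments occupy the tail of the value array, so the span is
        // derivable on demand from the two counts.
        Value* arguments_data() { return values + (total_count - argument_count); }
        std::span<Value> arguments_span() { return { arguments_data(), argument_count }; }
    };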
CachedSourceRange was a GC-allocated cell stored on the
ExecutionContext, only needed because ExecutionContext must be
trivially destructible.
Move the source range cache to a HashMap<u32, SourceRange> on the
Executable (keyed by program counter), where it belongs. This
eliminates the GC::Cell subclass entirely and removes the
cached_source_range field from ExecutionContext.
StackTraceElement and TracebackFrame now store Optional<SourceRange>
directly instead of GC::Ptr<CachedSourceRange>.
Shrinks ExecutionContext from 144 to 136 bytes.
This field was only used by LibWeb to prevent GC collection of the
EnvironmentSettingsObject while its execution context is on the stack.
This is unnecessary because the ESO is already reachable through the
realm's host_defined pointer: EC -> realm -> host_defined ->
PrincipalHostDefined -> environment_settings_object.
Shrinks ExecutionContext from 152 to 144 bytes.